BarTab

From MagnetoWiki

BarTab Program description

BarTab is under development.

BarTab is a small application designed to run 1-way or 2-way ANOVAs with repeated measures on relatively small data sets. A number of posthoc tests will also be provided. BarTab was designed to analyze the intake data collected in my lab, which typically consist of intakes of 2-6 groups (with 6-8 animals per group) across 1 to 14 days of testing. (The intake data are collected from bar-coded bottles and food jars by the BarTender program.) It also fills a hole in available small statistical packages, which either do not offer 2-way repeated measures ANOVAs or do not provide a selection of posthoc tests that can be applied to make arbitrary among-group comparisons.

The ANOVA calculations in BarTab are based on cells of data organized in a two-dimensional array of rows and columns (the number of cells is equal to the number of rows X the number of columns). The rows represent the groups of the first factor, and the columns represent the groups of the second factor (e.g., if factor 2 is time, then the columns are consecutive days of data collection). Each cell contains the data observations for a particular group of factor 1 at a particular treatment of factor 2 (e.g., in a Drug X Time ANOVA, a cell might be the intakes of the 38mg/kg-treated group on day 3 of intake testing).

Statistical References

I will list here the books and equations that were used in the programming of BarTab.

BarTab Objects and Data Structures

DataCell

The DataCell object is the atomic unit in BarTab. It represents an intersection of the two factors in a 2-way ANOVA (e.g., the data from Group 2 on Day 5 in a Group X Day ANOVA) or the data from one group in a 1-way ANOVA. The DataCell contains an array of real numbers, which are the data from each individual subject in that cell (“observations”). In other words, BarTab doesn’t track the data from individual subjects, but deals with the collected data within each cell of an ANOVA.

In addition to the data from each individual, each DataCell precalculates the sum, mean, variance, and standard deviation of its observations.
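The cached statistics can be illustrated with a minimal sketch. This is not BarTab's code: CellStats and SummarizeCell are hypothetical names, and the sketch assumes that both variances are built from the sum of squared deviations from the cell mean, divided by n (biased) or n-1 (unbiased).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the values a DataCell caches.
struct CellStats {
    std::size_t n = 0;
    double total = 0.0;
    double mean = 0.0;
    double biased_variance = 0.0;    // SS / n
    double unbiased_variance = 0.0;  // SS / (n - 1)
    double deviation = 0.0;          // sqrt(unbiased_variance)
};

// Assumed helper, not part of BarTab: summarize the observations of one cell.
CellStats SummarizeCell(const std::vector<double>& obs) {
    CellStats s;
    s.n = obs.size();
    if (s.n == 0) return s;  // empty cell: every statistic stays at zero

    for (double x : obs) s.total += x;
    s.mean = s.total / static_cast<double>(s.n);

    double ss = 0.0;  // sum of squared deviations from the cell mean
    for (double x : obs) ss += (x - s.mean) * (x - s.mean);

    s.biased_variance = ss / static_cast<double>(s.n);
    if (s.n > 1) {  // the unbiased estimate is undefined for one observation
        s.unbiased_variance = ss / static_cast<double>(s.n - 1);
        s.deviation = std::sqrt(s.unbiased_variance);
    }
    return s;
}
```

Caching these values per cell means the ANOVA routines can work from cell totals and means without revisiting every observation.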

Declaration

class DataCell:public Object {

public:
AnovaDataSet *data_set; // belongs to this AnovaDataSet
Boolean missing; // TRUE if this cell is not represented...
LineOfCells *row, *column;
// this cell belongs to this row and this column
// so it is the intersection of this row and column
unsigned long n; // number of observations in this cell (X)
double *observations; // array of observation values
unsigned long arraySize; // current size of the array
// observations are "1"-indexed within this array
double total; // sum of all observations in this cell
double mean; // mean of all observations in this cell
double unbiased_variance; // sum of squared deviations from the mean, divided by n-1
double biased_variance; // sum of squared deviations from the mean, divided by n
double deviation; // unbiased standard deviation
DataCell(); // constructor to make the cell;
// initializes the array of observations to CELL_PAGE_SIZE
~DataCell(); // destructor that deallocates the array of observations
void AddObservation(double datum); // append this value as the next observation
void Update(void); // recalculate the mean and the deviation
};
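The declaration suggests that the observations array starts at CELL_PAGE_SIZE and is "1"-indexed. A minimal sketch of how AddObservation might grow such an array one page at a time follows. The struct ObservationArray, the constant kCellPageSize and its value of 16, and the growth policy are all assumptions made for illustration; the actual value of CELL_PAGE_SIZE and BarTab's real allocation strategy are not given here.

```cpp
#include <cstddef>

// Assumed page size; the real value of CELL_PAGE_SIZE is not stated.
const unsigned long kCellPageSize = 16;

// Hypothetical stand-in for the storage part of a DataCell.
struct ObservationArray {
    unsigned long n;          // number of observations stored
    unsigned long arraySize;  // current capacity, in observations
    double* observations;     // "1"-indexed: slot 0 is left unused

    ObservationArray()
        : n(0),
          arraySize(kCellPageSize),
          observations(new double[kCellPageSize + 1]) {}

    ~ObservationArray() { delete[] observations; }

    // Copying is disabled so the raw pointer has exactly one owner.
    ObservationArray(const ObservationArray&) = delete;
    ObservationArray& operator=(const ObservationArray&) = delete;

    // Append a value, growing the array by one page when it is full.
    void AddObservation(double datum) {
        if (n == arraySize) {
            const unsigned long newSize = arraySize + kCellPageSize;
            double* grown = new double[newSize + 1];
            for (unsigned long i = 1; i <= n; ++i) grown[i] = observations[i];
            delete[] observations;
            observations = grown;
            arraySize = newSize;
        }
        observations[++n] = datum;
    }
};
```

With cells of 6-8 animals, a single page would normally be enough, so the growth path should rarely run in practice.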

LineOfCells

The LineOfCells object contains a linear array of DataCell objects. The LineOfCells could be a row of cells (e.g. the data from one group across repeated days) or a column of cells (e.g. the data on one day from all groups).

In addition to a list of cells that make up the LineOfCells, the object calculates derived values such as sum (total) of observations, number of observations, mean value of all observations, standard deviation of all observations, as well as the mean and standard deviation of the means of each cell.

Declaration

class LineOfCells:public Object {

// could be a row of cells or a column of cells
public:
char name[256]; // name of this line of cells
AnovaDataSet *data_set; // belongs to this data set
ListObject *cells; // these cells are in this line
double total; // total of all observations in this line
double mean; // mean of all observations in this line
double deviation; // deviation of all observations in this line
double obs_n; // number of observations in this line of cells
double mean_of_cell_means;
double deviation_of_cell_means;
LineOfCells(); // constructor with the name of the line
~LineOfCells(); // destructor to deallocate the list object
void Update(void); // update all the means & standard deviations

};

AnovaDataSet

The AnovaDataSet object contains the DataCells that are going to be analyzed, and their arrangement into rows and columns. Data is therefore poured into the cells of the AnovaDataSet, which then calculates the ANOVA results.

To do: need to add some API glue so that outside routines can add cells, rows, and columns to the AnovaDataSet (at the moment, it reads sample data from a file); and need to specify and calculate repeated vs. unrepeated measures.

Declaration

class AnovaDataSet {


// of course, need to put in textual labels, etc...
public:
char name[256]; // name of the data set


unsigned long num_of_rows;
unsigned long num_of_columns;


ListObject *cells; // all cell objects
ListObject *rows; // LineOfCells objects
ListObject *columns; // LineOfCells objects
// the lists of cells, rows, and columns are "1" indexed



double grand_total; // overall sum of all observations in all cells
double grand_N; // overall number of observations in all the cells
double grand_mean; // mean of all the observations in all the cells
double grand_SS;
double grand_biased_variance; // SS_total divided by grand_N; multiply by grand_N to get SS_total
double grand_deviation; // biased standard deviation of all observations


double mean_of_cell_means;
double deviation_of_cell_means;


AnovaResultsTable results_table;


AnovaDataSet(char *new_name);
AnovaDataSet();
~AnovaDataSet(); // destructor to deallocate the ListObjects,
// and tell the cells, rows and columns to destroy themselves too


void SetUpCells(void);
void SetUpRowsAndColumns(void);
void Update(void);
void ReadDataFromFile(void);


// need to add some API glue so that outside routines can add cells, rows, and columns


void UpdateAnovaResultsTable(void);


protected:
// internal routines for calculating the ANOVA


double GetSSWithinCells(void);
double GetSSBetweenCells(void);
double GetSSBetweenLinesOfCells(ListObject *lineList,unsigned long num_of_cells,double *variance);
double GetSSBetweenRows(void);
double GetSSBetweenColumns (void);
double GetSSBetweenInteraction(void);


double GetMSWithinCells(void);
double GetMSBetweenCells(void);
double GetMSBetweenRows(void);
double GetMSBetweenColumns (void);
double GetMSBetweenInteraction(void);



double GetNumCells(void);
double GetDfTotal(void);
double GetDfWithinCells(void);
double GetDfError(void);
double GetDfBetweenCells(void);
double GetDfBetweenRows(void);
double GetDfBetweenColumns(void);
double GetDfBetweenInteraction(void);


//interface with data table...
void Read2WayDataFromDataTable(DataTable dt,AnovaParameters factors);


};

AnovaResultsTable

The results table is a structure inside of the AnovaDataSet object. When the dataset is updated, the results of the ANOVA are placed in this table.

To do: need a way to report errors, and some routines to nicely format the results table for display, or copying to clipboard, etc.


Declaration

typedef struct {


double DF_between_groups;


double cell_SS_within;
double cell_DF_within; // aka DF_error
double cell_MS_within;


double cell_variance;
double cell_SS_between;
double cell_DF_between;
double cell_MS_between;


double row_variance;
double row_SS_between;
double row_DF_between;
double row_MS_between;
double row_F;
double row_p;


double col_variance;
double col_SS_between;
double col_DF_between;
double col_MS_between;
double col_F;
double col_p;


double inter_SS_between;
double inter_DF_between;
double inter_MS_between;
double inter_F;
double inter_p;


} AnovaResultsTable;