Difference between revisions of "Xynk"
(Update Xynk link.)
(copied old "BarTab" text)
Latest revision as of 11:22, 11 August 2024
Xynk is a fast graphing and statistical application for the Macintosh for the analysis of categorical data in the behavioral sciences. The program features one-way and two-way ANOVA with Tukey-Kramer HSD post hoc tests.
The Xynk web site is https://xynk.xyz.
Xynk Program description
Xynk is under development.
Xynk is a small application designed to run 1-way or 2-way ANOVAs with repeated measures on relatively small data sets. A number of post hoc tests will also be provided. Xynk was designed to analyze the intake data collected in my lab, which typically consists of the intakes of 2-6 groups (with 6-8 animals per group) across 1 to 14 days of testing. (The intake data is collected from bar-coded bottles and food jars by the BarTender program.) It also fills a gap among the available small statistical packages, which either do not offer 2-way repeated-measures ANOVAs or do not provide a selection of post hoc tests that can be applied to make arbitrary among-group comparisons.
The ANOVA calculations in Xynk are based on cells of data organized in a two-dimensional array of rows and columns (the number of cells is equal to the number of rows × the number of columns). The rows represent the groups of the first factor, and the columns represent the groups of the second factor (e.g., if factor 2 is time, then the columns are consecutive days of data collection). Each cell contains the data observations for a particular group of factor 1 at a particular treatment of factor 2 (e.g., in a Drug × Time ANOVA, a cell might be the intakes of the 38 mg/kg-treated group on day 3 of intake testing).
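The row-by-column layout described above can be sketched as follows. This is only an illustration, not Xynk's own storage: `CellGrid`, `Observations`, and `at` are hypothetical names, the cells are kept in a flat row-major vector, and the indices are 0-based for brevity (the Xynk lists themselves are "1"-indexed).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a cell: just the raw observations it holds.
using Observations = std::vector<double>;

// Rows are the groups of factor 1; columns are the groups of factor 2.
struct CellGrid {
    std::size_t num_rows;
    std::size_t num_columns;
    std::vector<Observations> cells;  // row-major: num_rows * num_columns cells

    CellGrid(std::size_t rows, std::size_t columns)
        : num_rows(rows), num_columns(columns), cells(rows * columns) {}

    // The cell at the intersection of one row and one column (0-indexed here).
    Observations& at(std::size_t row, std::size_t column) {
        assert(row < num_rows && column < num_columns);
        return cells[row * num_columns + column];
    }
};
```

For example, a Drug × Time design with 3 drug groups and 14 days would be `CellGrid grid(3, 14)`, giving 42 cells, and `grid.at(1, 2)` would hold the intakes of the second group on the third day.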
Statistical References
I will list here the books and equations that were used in the programming of Xynk.
Xynk Objects and Data Structures
DataCell
The DataCell object is the atomic unit in Xynk. It represents an intersection of the two factors in a 2-way ANOVA (e.g., the data from Group 2 on Day 5 in a Group × Day ANOVA) or the data from one group in a 1-way ANOVA. The DataCell contains an array of real numbers which are the data from each individual subject in that cell (“observations”). (In other words, Xynk doesn’t track the data from individual subjects, but deals with the collected data within each cell of an ANOVA.)
In addition to the data from each individual, each DataCell precalculates the sum, mean, variance, and standard deviation of its observations.
Declaration
class DataCell : public Object {
public:
    AnovaDataSet *data_set;    // belongs to this AnovaDataSet
    Boolean missing;           // TRUE if this cell is not represented...
    LineOfCells *row, *column;
        // this cell belongs to this row and this column,
        // so it is the intersection of this row and column
    unsigned long n;           // number of observations in this cell (X)
    double *observations;      // array of observation values
    unsigned long arraySize;   // current size of the array
        // observations are "1"-indexed within this array
    double total;              // sum of all observations in this cell
    double mean;               // mean of all observations in this cell
    double unbiased_variance;  // sum of squared deviations from the mean, divided by n-1
    double biased_variance;    // sum of squared deviations from the mean, divided by n
    double deviation;          // unbiased standard deviation
    DataCell();                // constructor to make the cell;
        // initializes the array of observations to CELL_PAGE_SIZE
    ~DataCell();               // destructor that deallocates the array of observations
    void AddObservation(double datum);  // append this value as the next observation
    void Update(void);         // recalculate the mean and the deviation
};
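The per-cell statistics that a DataCell precalculates can be sketched as below. This is a minimal sketch rather than the actual `Update` routine: `CellSummary` and `SummarizeCell` are hypothetical names, the observations are passed as a 0-indexed `std::vector` instead of the "1"-indexed array, and the variances are assumed to be left at zero when there are too few observations to define them.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary mirroring the derived fields of a DataCell.
struct CellSummary {
    std::size_t n = 0;
    double total = 0.0;
    double mean = 0.0;
    double biased_variance = 0.0;    // SS / n
    double unbiased_variance = 0.0;  // SS / (n - 1)
    double deviation = 0.0;          // sqrt of the unbiased variance
};

CellSummary SummarizeCell(const std::vector<double>& observations) {
    CellSummary s;
    s.n = observations.size();
    if (s.n == 0) return s;  // empty cell: everything stays at zero

    for (double x : observations) s.total += x;
    s.mean = s.total / s.n;

    // Sum of squared deviations from the cell mean.
    double ss = 0.0;
    for (double x : observations) {
        const double d = x - s.mean;
        ss += d * d;
    }
    s.biased_variance = ss / s.n;
    if (s.n > 1) {  // n - 1 would be zero for a single observation
        s.unbiased_variance = ss / (s.n - 1);
        s.deviation = std::sqrt(s.unbiased_variance);
    }
    return s;
}
```

For the observations 2, 4, 4, 4, 5, 5, 7, 9 this gives a total of 40, a mean of 5, a biased variance of 4, and an unbiased variance of 32/7.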
LineOfCells
The LineOfCells object contains a linear array of DataCell objects. The LineOfCells could be a row of cells (e.g. the data from one group across repeated days) or a column of cells (e.g. the data on one day from all groups).
In addition to a list of cells that make up the LineOfCells, the object calculates derived values such as sum (total) of observations, number of observations, mean value of all observations, standard deviation of all observations, as well as the mean and standard deviation of the means of each cell.
Declaration
class LineOfCells : public Object {
    // could be a row of cells or a column of cells
public:
    char name[256];            // name of this line of cells
    AnovaDataSet *data_set;    // belongs to this data set
    ListObject *cells;         // these cells are in this line
    double total;              // total of all observations in this line
    double mean;               // mean of all observations in this line
    double deviation;          // deviation of all observations in this line
    double obs_n;              // number of observations in this line of cells
    double mean_of_cell_means;
    double deviation_of_cell_means;
    LineOfCells();             // constructor with the name of the line
    ~LineOfCells();            // destructor to deallocate the list object
    void Update(void);         // update all the means & standard deviations
};
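The difference between the mean of all observations in a line and the mean of the cell means can be sketched as below. The names `LineSummary` and `SummarizeLine` are hypothetical, each cell is reduced to a plain vector of observations, and two assumptions are made that the declaration above does not confirm: empty cells are skipped, and the deviation of the cell means uses the unbiased (n-1) form.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical summary mirroring some derived fields of a LineOfCells.
struct LineSummary {
    double total = 0.0;
    double obs_n = 0.0;
    double mean = 0.0;                     // pooled over every observation
    double mean_of_cell_means = 0.0;       // each cell weighted equally
    double deviation_of_cell_means = 0.0;  // assumed unbiased (n - 1)
};

LineSummary SummarizeLine(const std::vector<std::vector<double>>& cells) {
    LineSummary s;
    std::vector<double> cell_means;
    for (const auto& cell : cells) {
        if (cell.empty()) continue;  // assumption: missing cells are skipped
        const double cell_total = std::accumulate(cell.begin(), cell.end(), 0.0);
        s.total += cell_total;
        s.obs_n += cell.size();
        cell_means.push_back(cell_total / cell.size());
    }
    if (cell_means.empty()) return s;

    s.mean = s.total / s.obs_n;
    const double sum_of_means =
        std::accumulate(cell_means.begin(), cell_means.end(), 0.0);
    s.mean_of_cell_means = sum_of_means / cell_means.size();

    if (cell_means.size() > 1) {
        double ss = 0.0;
        for (double m : cell_means) {
            const double d = m - s.mean_of_cell_means;
            ss += d * d;
        }
        s.deviation_of_cell_means = std::sqrt(ss / (cell_means.size() - 1));
    }
    return s;
}
```

The two means agree when every cell has the same number of observations and diverge otherwise: a line holding the cells {1, 2, 3} and {10} has a pooled mean of 4 but a mean of cell means of 6.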
AnovaDataSet
The AnovaDataSet object contains the DataCells that are going to be analyzed, and their arrangement into rows and columns. Data is therefore poured into the cells of the AnovaDataSet, which then calculates the ANOVA results.
To do: need to add some API glue so that outside routines can add cells, rows, and columns to the AnovaDataSet (at the moment, it reads sample data from a file); and need to specify and calculate repeated vs. unrepeated measures.
Declaration
class AnovaDataSet {
    // of course, need to put in textual labels, etc...
public:
    char name[256];                // name of the data set
    unsigned long num_of_rows;
    unsigned long num_of_columns;
    ListObject *cells;             // all cell objects
    ListObject *rows;              // LineOfCells objects
    ListObject *columns;           // LineOfCells objects
        // the lists of cells, rows, and columns are "1"-indexed
    double grand_total;            // overall sum of all observations in all cells
    double grand_N;                // overall number of observations in all the cells
    double grand_mean;             // mean of all the observations in all the cells
    double grand_SS;
    double grand_biased_variance;  // SS_total divided by grand_N; multiply by grand_N to get SS_total
    double grand_deviation;        // biased standard deviation of all observations
    double mean_of_cell_means;
    double deviation_of_cell_means;
    AnovaResultsTable results_table;
    AnovaDataSet(char *new_name);
    AnovaDataSet();
    ~AnovaDataSet();               // destructor to deallocate the ListObjects,
        // and tell the cells, rows and columns to destroy themselves too
    void SetUpCells(void);
    void SetUpRowsAndColumns(void);
    void Update(void);
    void ReadDataFromFile(void);
    // need to add some API glue so that outside routines can add cells, rows, and columns
    void UpdateAnovaResultsTable(void);
protected:
    // internal routines for calculating the ANOVA
    double GetSSWithinCells(void);
    double GetSSBetweenCells(void);
    double GetSSBetweenLinesOfCells(ListObject *lineList, unsigned long num_of_cells, double *variance);
    double GetSSBetweenRows(void);
    double GetSSBetweenColumns(void);
    double GetSSBetweenInteraction(void);
    double GetMSWithinCells(void);
    double GetMSBetweenCells(void);
    double GetMSBetweenRows(void);
    double GetMSBetweenColumns(void);
    double GetMSBetweenInteraction(void);
    double GetNumCells(void);
    double GetDfTotal(void);
    double GetDfWithinCells(void);
    double GetDfError(void);
    double GetDfBetweenCells(void);
    double GetDfBetweenRows(void);
    double GetDfBetweenColumns(void);
    double GetDfBetweenInteraction(void);
    // interface with data table...
    void Read2WayDataFromDataTable(DataTable dt, AnovaParameters factors);
};
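How the sums of squares named in the protected routines relate to one another can be sketched for the simplest case. This sketch assumes a balanced design (the same number of observations in every cell) with no repeated measures, using the textbook partition SS between cells = SS rows + SS columns + SS interaction. It is not the Xynk implementation: `Grid`, `SumsOfSquares`, and `PartitionSS` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Cell = std::vector<double>;
using Grid = std::vector<std::vector<Cell>>;  // grid[row][column], 0-indexed

struct SumsOfSquares {
    double within_cells = 0.0;
    double between_cells = 0.0;
    double between_rows = 0.0;
    double between_columns = 0.0;
    double interaction = 0.0;
};

// Balanced two-way partition: every cell must hold the same n observations.
SumsOfSquares PartitionSS(const Grid& grid) {
    const std::size_t R = grid.size();
    assert(R > 0 && !grid[0].empty() && !grid[0][0].empty());
    const std::size_t C = grid[0].size();
    const std::size_t n = grid[0][0].size();

    std::vector<std::vector<double>> cell_mean(R, std::vector<double>(C, 0.0));
    std::vector<double> row_mean(R, 0.0), col_mean(C, 0.0);
    double grand_mean = 0.0;
    SumsOfSquares ss;

    for (std::size_t r = 0; r < R; ++r) {
        assert(grid[r].size() == C);
        for (std::size_t c = 0; c < C; ++c) {
            const Cell& cell = grid[r][c];
            assert(cell.size() == n);
            double total = 0.0;
            for (double x : cell) total += x;
            const double m = total / n;
            cell_mean[r][c] = m;
            // With equal n, line and grand means are means of cell means.
            row_mean[r] += m / C;
            col_mean[c] += m / R;
            grand_mean += m / (R * C);
            for (double x : cell) ss.within_cells += (x - m) * (x - m);
        }
    }
    for (std::size_t r = 0; r < R; ++r) {
        const double dr = row_mean[r] - grand_mean;
        ss.between_rows += n * C * dr * dr;
        for (std::size_t c = 0; c < C; ++c) {
            const double d = cell_mean[r][c] - grand_mean;
            ss.between_cells += n * d * d;
        }
    }
    for (std::size_t c = 0; c < C; ++c) {
        const double dc = col_mean[c] - grand_mean;
        ss.between_columns += n * R * dc * dc;
    }
    // Interaction is what remains of the between-cells SS.
    ss.interaction = ss.between_cells - ss.between_rows - ss.between_columns;
    return ss;
}
```

For a 2 × 2 grid with cells {1, 3}, {5, 7} in the first row and {2, 4}, {10, 12} in the second, this gives SS within = 8, SS between cells = 98, SS rows = 18, SS columns = 72, and SS interaction = 8. Unequal cell sizes and a repeated-measures error term would need a different treatment.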
AnovaResultsTable
The results table is a structure inside of the AnovaDataSet object. When the dataset is updated, the results of the ANOVA are placed in this table.
To do: need a way to report errors, and some routines to nicely format the results table for display, or copying to clipboard, etc.
Declaration
typedef struct {
    double DF_between_groups;

    double cell_SS_within;
    double cell_DF_within;    // aka DF_error
    double cell_MS_within;

    double cell_variance;
    double cell_SS_between;
    double cell_DF_between;
    double cell_MS_between;

    double row_variance;
    double row_SS_between;
    double row_DF_between;
    double row_MS_between;
    double row_F;
    double row_p;

    double col_variance;
    double col_SS_between;
    double col_DF_between;
    double col_MS_between;
    double col_F;
    double col_p;

    double inter_SS_between;
    double inter_DF_between;
    double inter_MS_between;
    double inter_F;
    double inter_p;
} AnovaResultsTable;
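How the SS, DF, MS, and F entries for one effect relate can be sketched as below, again assuming a balanced design without repeated measures, where each effect is tested against the within-cells mean square. `EffectLine`, `MakeEffect`, and the `Df...` helpers are hypothetical names. The p-values are left out because they need the upper tail of the F distribution, which this sketch does not attempt.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical per-effect line, mirroring the row_/col_/inter_ fields.
struct EffectLine {
    double SS_between;
    double DF_between;
    double MS_between;
    double F;
};

// Degrees of freedom for R rows, C columns, and n observations per cell.
double DfRows(double R) { return R - 1.0; }
double DfColumns(double C) { return C - 1.0; }
double DfInteraction(double R, double C) { return (R - 1.0) * (C - 1.0); }
double DfWithinCells(double R, double C, double n) { return R * C * (n - 1.0); }

// A mean square is a sum of squares divided by its degrees of freedom.
double MeanSquare(double ss, double df) {
    assert(df > 0.0);
    return ss / df;
}

// F is the effect's mean square over the within-cells (error) mean square.
EffectLine MakeEffect(double ss, double df, double ms_within) {
    assert(ms_within > 0.0);
    EffectLine e;
    e.SS_between = ss;
    e.DF_between = df;
    e.MS_between = MeanSquare(ss, df);
    e.F = e.MS_between / ms_within;
    return e;
}
```

With 2 rows, 2 columns, 2 observations per cell, and SS within = 8, the within-cells DF is 4 and the within-cells MS is 2. A row effect with SS = 18 on 1 DF then has F = 9.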