Difference between revisions of "Xynk"

From MagnetoWiki
Latest revision as of 11:22, 11 August 2024

Xynk is a fast graphing and statistical application for the Macintosh for the analysis of categorical data in the behavioral sciences. The program features one-way and two-way ANOVA with Tukey-Kramer HSD post hoc tests.

The Xynk web site is https://xynk.xyz.


Xynk Program description

Xynk is under development.

Xynk is a small application designed to run 1-way or 2-way ANOVAs with repeated measures on relatively small data sets. A number of post hoc tests will also be provided. Xynk was designed to analyze the intake data collected in my lab, which typically consists of the intakes of 2-6 groups (with 6-8 animals per group) across 1 to 14 days of testing. (The intake data is collected from bar-coded bottles and food jars by the BarTender program.) It also fills a gap among the available small statistical packages, which either do not offer 2-way repeated measures ANOVAs or do not provide a selection of post hoc tests that can be applied to make arbitrary among-group comparisons.

The ANOVA calculations in Xynk are based on cells of data organized in a two-dimensional array of rows and columns (the number of cells is equal to the number of rows X the number of columns). The rows represent the groups of the first factor, and the columns represent the groups of the second factor (e.g., if factor 2 is time, then the columns are consecutive days of data collection). Each cell contains the data observations for a particular group of factor 1 at a particular treatment of factor 2 (e.g., in a Drug X Time ANOVA, a cell might be the intakes of the 38mg/kg-treated group on day 3 of intake testing).
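
The row-by-column layout described above can be sketched as a flat, row-major array of cells. This is only an illustration of the idea, not Xynk's own code: the names Cell, CellGrid, at, and cellCount are hypothetical, and the sketch uses 0-based indexing where Xynk's lists are described as "1"-indexed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one ANOVA cell: just its raw observations.
using Cell = std::vector<double>;

// A rows x columns grid of cells stored row-major in one flat vector.
struct CellGrid {
    std::size_t rows;     // groups of factor 1 (e.g. drug dose)
    std::size_t columns;  // levels of factor 2 (e.g. day of testing)
    std::vector<Cell> cells;

    CellGrid(std::size_t r, std::size_t c)
        : rows(r), columns(c), cells(r * c) {}

    // 0-indexed access to the cell at the intersection of a row and a column.
    Cell& at(std::size_t row, std::size_t column) {
        assert(row < rows && column < columns);
        return cells[row * columns + column];
    }

    // Number of cells = number of rows X number of columns.
    std::size_t cellCount() const { return cells.size(); }
};
```

For example, a design with 3 groups tested across 14 days would be built as CellGrid grid(3, 14), giving 42 cells, and the intakes of the second group on the third day would be stored in grid.at(1, 2).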

Statistical References

I will list here the books and equations that were used in the programming of Xynk.

Xynk Objects and Data Structures

DataCell

The DataCell object is the atomic unit in Xynk. It represents an intersection of the two factors in a 2-way ANOVA (e.g., the data from Group 2 on Day 5 in a Group X Day ANOVA) or the data from one group in a 1-way ANOVA. The DataCell contains an array of real numbers, which are the data from each individual subject in that cell (“observations”). In other words, Xynk doesn’t track the data from individual subjects, but deals with the collected data within each cell of an ANOVA.

In addition to the data from each individual, each DataCell precalculates the sum, mean, variance, and standard deviation of its observations.

Declaration

class DataCell : public Object {
  public:
    AnovaDataSet *data_set;  // belongs to this AnovaDataSet
    Boolean missing; // TRUE if this cell is not represented...
    LineOfCells *row, *column;
      // this cell belongs to this row and this column,
      // so it is the intersection of this row and column
    unsigned long n; // number of observations in this cell (X)
    double *observations; // array of observation values
    unsigned long arraySize; // current size of the array
      // observations are "1"-indexed within this array
    double total; // sum of all observations in this cell
    double mean; // mean of all observations in this cell
    double unbiased_variance; // sum of squared deviations from the mean, divided by n-1
    double biased_variance; // sum of squared deviations from the mean, divided by n
    double deviation; // unbiased standard deviation
    DataCell(); // constructor to make the cell;
      // initializes the array of observations to CELL_PAGE_SIZE
    ~DataCell(); // destructor that deallocates the array of observations
    void AddObservation(double datum); // add this value as the next observation in the array
    void Update(void); // recalculate the mean and the deviation
};
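
The derived values that a DataCell keeps can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual body of Update(): the names CellStats and summarizeCell are hypothetical, a std::vector replaces the "1"-indexed raw array, and the deviation is assumed to be the square root of the unbiased variance.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical summary mirroring the derived fields of DataCell.
struct CellStats {
    double total;              // sum of all observations
    double mean;               // total / n
    double biased_variance;    // sum of squared deviations / n
    double unbiased_variance;  // sum of squared deviations / (n - 1)
    double deviation;          // sqrt(unbiased_variance), an assumption
};

CellStats summarizeCell(const std::vector<double>& observations) {
    // The unbiased variance needs at least two observations.
    assert(observations.size() >= 2);
    const double n = static_cast<double>(observations.size());

    double total = 0.0;
    for (double x : observations) total += x;
    const double mean = total / n;

    // Sum of squared deviations from the cell mean.
    double ss = 0.0;
    for (double x : observations) ss += (x - mean) * (x - mean);

    CellStats stats;
    stats.total = total;
    stats.mean = mean;
    stats.biased_variance = ss / n;
    stats.unbiased_variance = ss / (n - 1.0);
    stats.deviation = std::sqrt(stats.unbiased_variance);
    return stats;
}
```

Computing the mean first and then summing squared deviations takes two passes over the data, but it avoids the rounding problems of the one-pass "sum of squares minus squared sum" shortcut, which matters little for cells of 6-8 animals but costs nothing here.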

LineOfCells

The LineOfCells object contains a linear array of DataCell objects. The LineOfCells could be a row of cells (e.g. the data from one group across repeated days) or a column of cells (e.g. the data on one day from all groups).

In addition to a list of cells that make up the LineOfCells, the object calculates derived values such as sum (total) of observations, number of observations, mean value of all observations, standard deviation of all observations, as well as the mean and standard deviation of the means of each cell.

Declaration

class LineOfCells : public Object {
  // could be a row of cells or a column of cells
  public:
    char name[256]; // name of this line of cells
    AnovaDataSet *data_set; // belongs to this data set
    ListObject *cells; // these cells are in this line
    double total; // total of all observations in this line
    double mean; // mean of all observations in this line
    double deviation; // deviation of all observations in this line
    double obs_n; // number of observations in this line of cells
    double mean_of_cell_means;
    double deviation_of_cell_means;
    LineOfCells(); // constructor with the name of the line
    ~LineOfCells(); // destructor to deallocate the list object
    void Update(void); // update all the means & standard deviations
};
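
The difference between the mean of all observations in a line and the mean of the cell means can be shown with a short sketch. The names LineStats and summarizeLine are hypothetical, cells are represented as plain vectors rather than DataCell objects, and skipping empty cells is an assumption about how missing cells might be handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical summary mirroring some derived fields of LineOfCells.
struct LineStats {
    double total;               // sum of all observations in the line
    double obs_n;               // number of observations in the line
    double mean;                // total / obs_n
    double mean_of_cell_means;  // average of the per-cell means
};

LineStats summarizeLine(const std::vector<std::vector<double>>& cells) {
    LineStats stats = {0.0, 0.0, 0.0, 0.0};
    double sumOfCellMeans = 0.0;
    std::size_t usedCells = 0;

    for (const std::vector<double>& cell : cells) {
        if (cell.empty()) continue;  // assumption: empty cells are skipped
        double cellTotal = 0.0;
        for (double x : cell) cellTotal += x;
        stats.total += cellTotal;
        stats.obs_n += static_cast<double>(cell.size());
        sumOfCellMeans += cellTotal / static_cast<double>(cell.size());
        ++usedCells;
    }

    assert(usedCells > 0);  // at least one cell must hold data
    stats.mean = stats.total / stats.obs_n;
    stats.mean_of_cell_means = sumOfCellMeans / static_cast<double>(usedCells);
    return stats;
}
```

The two means agree when every cell holds the same number of observations. With unequal cell sizes they diverge: for cells {1, 2, 3} and {10}, the mean of all observations is 4, while the mean of the cell means is 6.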

AnovaDataSet

The AnovaDataSet object contains the DataCells that are going to be analyzed, and their arrangement into rows and columns. Data is poured into the cells of the AnovaDataSet, which then calculates the ANOVA results.

To do: need to add some API glue so that outside routines can add cells, rows, and columns to the AnovaDataSet (at the moment, it reads sample data from a file); and need to specify and calculate repeated vs. unrepeated measures.

Declaration

class AnovaDataSet {
  // of course, need to put in textual labels, etc...
  public:
    char name[256]; // name of the data set

    unsigned long num_of_rows;
    unsigned long num_of_columns;

    ListObject *cells; // all cell objects
    ListObject *rows; // LineOfCells objects
    ListObject *columns; // LineOfCells objects
      // the lists of cells, rows, and columns are "1" indexed

    double grand_total; // overall sum of all observations in all cells
    double grand_N; // overall number of observations in all the cells
    double grand_mean; // mean of all the observations in all the cells
    double grand_SS;
    double grand_biased_variance; // SS_total divided by grand_N; multiply by grand_N to get SS_total
    double grand_deviation; // biased standard deviation of all observations

    double mean_of_cell_means;
    double deviation_of_cell_means;

    AnovaResultsTable results_table;

    AnovaDataSet(char *new_name);
    AnovaDataSet();
    ~AnovaDataSet(); // destructor to deallocate the ListObjects,
          // and tell the cells, rows and columns to destroy themselves too

    void SetUpCells(void);
    void SetUpRowsAndColumns(void);
    void Update(void);
    void ReadDataFromFile(void);
      // need to add some API glue so that outside routines can add cells, rows, and columns

    void UpdateAnovaResultsTable(void);

  protected:
    // internal routines for calculating the ANOVA

    double GetSSWithinCells(void);
    double GetSSBetweenCells(void);
    double GetSSBetweenLinesOfCells(ListObject *lineList, unsigned long num_of_cells, double *variance);
    double GetSSBetweenRows(void);
    double GetSSBetweenColumns(void);
    double GetSSBetweenInteraction(void);

    double GetMSWithinCells(void);
    double GetMSBetweenCells(void);
    double GetMSBetweenRows(void);
    double GetMSBetweenColumns(void);
    double GetMSBetweenInteraction(void);

    double GetNumCells(void);
    double GetDfTotal(void);
    double GetDfWithinCells(void);
    double GetDfError(void);
    double GetDfBetweenCells(void);
    double GetDfBetweenRows(void);
    double GetDfBetweenColumns(void);
    double GetDfBetweenInteraction(void);

    // interface with data table...
    void Read2WayDataFromDataTable(DataTable dt, AnovaParameters factors);
};
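
The sums of squares, degrees of freedom, and F ratios named in the declaration can be illustrated with the textbook partition for a balanced two-way design. This sketch is not Xynk's implementation: it assumes equal numbers of observations in every cell and independent groups (no repeated-measures error terms), and the names Grid, TwoWayResult, and twoWayAnova are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Cell = std::vector<double>;
using Grid = std::vector<std::vector<Cell>>;  // grid[row][column]

struct TwoWayResult {
    double ss_within, ss_cells, ss_rows, ss_cols, ss_inter;
    double df_within, df_rows, df_cols, df_inter;
    double f_rows, f_cols, f_inter;
};

// Balanced two-way ANOVA: every cell must hold the same number n >= 2
// of observations, and there must be at least 2 rows and 2 columns.
TwoWayResult twoWayAnova(const Grid& grid) {
    const std::size_t R = grid.size();
    assert(R >= 2);
    const std::size_t C = grid[0].size();
    assert(C >= 2);
    const std::size_t n = grid[0][0].size();
    assert(n >= 2);

    std::vector<double> rowSum(R, 0.0), colSum(C, 0.0);
    double grandSum = 0.0;
    for (std::size_t r = 0; r < R; ++r) {
        assert(grid[r].size() == C);
        for (std::size_t c = 0; c < C; ++c) {
            assert(grid[r][c].size() == n);
            for (double x : grid[r][c]) {
                rowSum[r] += x;
                colSum[c] += x;
                grandSum += x;
            }
        }
    }
    const double dn = static_cast<double>(n);
    const double dR = static_cast<double>(R);
    const double dC = static_cast<double>(C);
    const double grandMean = grandSum / (dR * dC * dn);

    TwoWayResult out = {};
    for (std::size_t r = 0; r < R; ++r) {
        for (std::size_t c = 0; c < C; ++c) {
            double cellSum = 0.0;
            for (double x : grid[r][c]) cellSum += x;
            const double cellMean = cellSum / dn;
            out.ss_cells += dn * (cellMean - grandMean) * (cellMean - grandMean);
            for (double x : grid[r][c])
                out.ss_within += (x - cellMean) * (x - cellMean);
        }
    }
    for (std::size_t r = 0; r < R; ++r) {
        const double d = rowSum[r] / (dC * dn) - grandMean;
        out.ss_rows += dC * dn * d * d;
    }
    for (std::size_t c = 0; c < C; ++c) {
        const double d = colSum[c] / (dR * dn) - grandMean;
        out.ss_cols += dR * dn * d * d;
    }
    // Interaction is what remains of the between-cells variation.
    out.ss_inter = out.ss_cells - out.ss_rows - out.ss_cols;

    out.df_rows = dR - 1.0;
    out.df_cols = dC - 1.0;
    out.df_inter = (dR - 1.0) * (dC - 1.0);
    out.df_within = dR * dC * (dn - 1.0);

    // F = MS_effect / MS_within; not finite if the within-cell SS is zero.
    const double msWithin = out.ss_within / out.df_within;
    out.f_rows = (out.ss_rows / out.df_rows) / msWithin;
    out.f_cols = (out.ss_cols / out.df_cols) / msWithin;
    out.f_inter = (out.ss_inter / out.df_inter) / msWithin;
    return out;
}
```

As a check on the arithmetic, a 2 X 2 design with two observations per cell, {1, 3} and {5, 7} in the first row and {2, 4} and {10, 12} in the second, gives a within-cells SS of 8, a between-cells SS of 98, a rows SS of 18, a columns SS of 72, and an interaction SS of 8, so the F ratios are 9, 36, and 4 on 1 and 4 degrees of freedom. Turning those F ratios into p values requires the F distribution, which is left out of this sketch.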

AnovaResultsTable

The results table is a structure inside the AnovaDataSet object. When the data set is updated, the results of the ANOVA are placed in this table.

To do: need a way to report errors, and some routines to nicely format the results table for display, or copying to clipboard, etc.


Declaration

typedef struct {
    double DF_between_groups;

    double cell_SS_within;
    double cell_DF_within; // aka DF_error
    double cell_MS_within;

    double cell_variance;
    double cell_SS_between;
    double cell_DF_between;
    double cell_MS_between;

    double row_variance;
    double row_SS_between;
    double row_DF_between;
    double row_MS_between;
    double row_F;
    double row_p;

    double col_variance;
    double col_SS_between;
    double col_DF_between;
    double col_MS_between;
    double col_F;
    double col_p;

    double inter_SS_between;
    double inter_DF_between;
    double inter_MS_between;
    double inter_F;
    double inter_p;
} AnovaResultsTable;
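
One possible shape for the formatting routines mentioned in the to-do is a helper that lays out one source of variation per line in fixed-width columns. This is only a sketch of an approach: formatAnovaRow is a hypothetical name, and the column widths and two-decimal precision are arbitrary choices rather than anything Xynk specifies.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Formats one line of an ANOVA summary table as
// "Source        SS    df        MS       F" in fixed-width columns.
std::string formatAnovaRow(const std::string& source,
                           double ss, double df, double ms, double f) {
    std::ostringstream out;
    out << std::left << std::setw(12) << source          // source label
        << std::right << std::fixed
        << std::setw(10) << std::setprecision(2) << ss   // sum of squares
        << std::setw(6) << std::setprecision(0) << df    // degrees of freedom
        << std::setw(10) << std::setprecision(2) << ms   // mean square
        << std::setw(8) << f;                            // F ratio
    return out.str();
}
```

A row effect could then be written with a call such as formatAnovaRow("Rows", table.row_SS_between, table.row_DF_between, table.row_MS_between, table.row_F), where table is an AnovaResultsTable. Because every line has the same width, the result pastes cleanly into a monospaced display or the clipboard.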