Thursday, June 11, 2015

How to deal with data variability

A lot of the data provided are average values of a parameter, with the standard error of the mean (SEM) as error bars.

This choice affects the visual message a graph conveys, so I want to explain here why I made it.
For a single data set, I could use either the SEM or the standard deviation (SD) to represent the "variability" of the data:
  • The SD quantifies scatter — how much the values vary from one another
  • The SEM quantifies how precisely you know the true average of the population. It takes into account both the value of the SD and the sample size
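To make the relationship between the two concrete, here is a minimal Python sketch. It assumes the usual definitions (sample SD with an n - 1 denominator, and SEM = SD / sqrt(n)); the function name `sd_and_sem` and the numbers in `sample` are made up for illustration and are not data from this blog.

```python
import math
import statistics


def sd_and_sem(values):
    """Return (mean, SD, SEM) for a sample of at least two values."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample SD (n - 1 denominator): the scatter
    sem = sd / math.sqrt(n)         # SEM: SD scaled down by the sample size
    return mean, sd, sem


# Hypothetical values, for illustration only
sample = [2, 4, 4, 4, 5, 5, 7, 9]
mean, sd, sem = sd_and_sem(sample)
print(f"mean = {mean:.2f}, SD = {sd:.2f}, SEM = {sem:.2f}")
```

Because the SEM divides the SD by the square root of the sample size, SEM error bars are always narrower than SD error bars for the same data, and they shrink as more values are collected.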
Below is an example of the same data using either SEM or SD:
The main goal of the comparisons made on this blog is to see whether a given parameter has an effect. Because sleepwalking is complex (it stems from a complex organ, the brain) and because I consider one factor at a time (as a starting point) in what is most likely a multifactorial condition, my data have a lot of variability.

So I use a statistical test (a t-test unless otherwise indicated) to determine whether a parameter has an effect, and SEM values as error bars in the graphs, because I am mostly interested in identifying the true average for each parameter, not the variability within it.
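As a rough illustration of how a t-test builds on the SEM, the sketch below computes a t statistic for two groups. It assumes Welch's unequal-variance form, which may not be the exact variant used for the graphs here; the function name `welch_t` and the two groups of numbers are hypothetical.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test.

    Each group needs at least two values and the groups must not both
    have zero variance.
    """
    # Squared SEM of each group: sample variance divided by sample size
    sem2_a = statistics.variance(a) / len(a)
    sem2_b = statistics.variance(b) / len(b)

    # Difference of the means, measured in units of the combined SEM
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sem2_a + sem2_b)

    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (sem2_a + sem2_b) ** 2 / (
        sem2_a ** 2 / (len(a) - 1) + sem2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical groups, for illustration only
without_factor = [1, 2, 3, 4, 5]
with_factor = [2, 3, 4, 5, 6]
t, df = welch_t(without_factor, with_factor)
print(f"t = {t:.2f}, df = {df:.1f}")
```

The t statistic is the difference between the two averages divided by their combined SEM, which is why SEM error bars relate more directly to this kind of comparison than SD error bars do. Turning t into a p-value additionally requires the t distribution, which this sketch leaves out.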
