GGIR is an R-package to process multi-day raw accelerometer data for physical activity and sleep research. The term raw refers to data being expressed in m/s² or gravitational acceleration, as opposed to the previous generation of accelerometers, which stored data in accelerometer brand specific units. The signal processing includes automatic calibration, detection of sustained abnormally high values, detection of non-wear and calculation of average magnitude of dynamic acceleration based on a variety of metrics. Next, GGIR uses this information to describe the data per recording, per day of measurement, and (optionally) per segment of a day of measurement, including estimates of physical activity, inactivity and sleep. We published an overview paper of GGIR in 2019.
This vignette provides a general introduction on how to use GGIR and interpret the output. Additionally, you can find an introduction video and a mini-tutorial on YouTube. If you want to use your own algorithms for raw data then GGIR facilitates this with its external function embedding feature, documented in a separate vignette: Embedding external functions in GGIR. GGIR is increasingly being used by research groups across the world. A non-exhaustive overview of academic publications related to GGIR can be found here. R package GGIR would not have been possible without the support of the contributors listed in the author list at GGIR, with specific code contributions over time since April 2016 (when GGIR development moved to GitHub) shown here.
Cite GGIR:
When you use GGIR in publications do not forget to cite it properly, as that makes your research more reproducible and gives credit to its developers. See paragraph on Citing GGIR for details.
How to contribute to the code?
The development version of GGIR can be found on GitHub, which is also where you will find guidance on how to contribute.
How can I get service and support?
GGIR is open source software and does not come with service or support guarantees. However, as a user community you can help each other via the GGIR Google group or the GitHub issue tracker. Please use these public platforms rather than private e-mails so that other users can learn from the conversations.
If you need dedicated support with the use of GGIR or need someone to adapt GGIR to your needs then Vincent van Hees is available as an independent consultant.
Change log
Our log of main changes to GGIR over time can be found here.
Download and install RStudio (optional, but recommended)
Install GGIR with its dependencies from CRAN. You can do this with one command from the console command line:
install.packages("GGIR", dependencies = TRUE)
Alternatively, to install the latest development version with the latest bug fixes use instead:
install.packages("remotes")
remotes::install_github("wadpac/GGIR")
read.myacc.csv and argument rmc.noise in the GGIR function documentation (pdf).
GGIR comes with a large number of functions and optional settings (arguments) per function. To ease interacting with GGIR there is one central function, named GGIR, to talk to all the other functions. In the past this function was called g.shell.GGIR, but we decided to shorten it to GGIR for convenience. You can still use g.shell.GGIR because g.shell.GGIR has become a wrapper function around GGIR, passing on all arguments to GGIR and by that providing identical functionality.
In this paragraph we will guide you through the main arguments to GGIR relevant for 99% of research. First of all, it is important to understand that the GGIR package is structured in two ways.
Firstly, it has a computational structure of five parts which are applied sequentially to the data, and GGIR controls each of these parts:
The reason why it is split up in parts is that this avoids having to redo the whole analysis if you only want to make a small change in the more downstream parts. The specific order and content of the parts have grown for historical and computational reasons.
Secondly, the function arguments which we will refer to as input parameters are structured thematically independently of the five parts they are used in:
This structure was introduced in GGIR version 2.5-6 to make the GGIR code and documentation easier to navigate.
To see the parameters in each parameter category and their default values do:
library(GGIR)
print(load_params())
If you are only interested in one specific category like sleep:
library(GGIR)
print(load_params()$params_sleep)
If you are only interested in parameter “HASPT.algo” from the params_sleep object:
library(GGIR)
print(load_params()$params_sleep[["HASPT.algo"]])
Documentation for these parameter objects can be found in the GGIR function documentation (pdf). All of these parameters are accepted as arguments to function GGIR, because GGIR is a shell around all GGIR functionality. However, the params_ objects themselves cannot be provided as input to GGIR.
You will probably never need to think about most of the arguments listed above, because a lot of arguments are only included to facilitate methodological studies where researchers want to have control over every little detail. See previous paragraph for links to the documentation and how to find the default value of each parameter.
The bare minimum input needed for GGIR is:
library(GGIR)
GGIR(datadir="C:/mystudy/mydata",
outputdir="D:/myresults")
Argument datadir allows you to specify where you have stored your accelerometer data and outputdir allows you to specify where you would like the output of the analyses to be stored. This cannot be equal to datadir. If you copy and paste the above code to a new R script (file ending with .R) and Source it in R(Studio) then the dataset will be processed and the output will be stored in the specified output directory.
Below we have highlighted the key arguments you may want to be aware of. We are not giving a detailed explanation here; please see the package manual for that.
- mode - which part of GGIR to run. GGIR is constructed in five parts.
- overwrite - whether to overwrite previously produced milestone output. Between each GGIR part, GGIR stores milestone output to ease re-running parts of the pipeline.
- idloc - tells GGIR where to find the participant ID (default: inside file header).
- strategy - informs GGIR how to consider the design of the experiment.
  - If strategy is set to value 1, then check out arguments hrs.del.start and hrs.del.end.
  - If strategy is set to value 3, then check out argument ndayswindow.
- maxdur - maximum number of days you expect in a data file based on the study protocol.
- desiredtz - time zone of the experiment.
- chunksize - a way to tell GGIR to use less memory, which can be useful on machines with limited memory.
- includedaycrit - tell GGIR how many hours of valid data per day (midnight-midnight) is acceptable.
- includenightcrit - tell GGIR how many hours of a valid night (noon-noon) is acceptable.
- qwindow - argument to tell GGIR whether and how to segment the day for day-segment specific analysis.
- mvpathreshold and boutcriter - acceleration threshold and bout criteria used for calculating time spent in MVPA (only used in GGIR part2).
- epochvalues2csv - to export epoch level magnitude of acceleration to a csv file (in addition to already being stored as RData file).
- dayborder - to decide whether the edge of a day should be other than midnight.
- iglevels - argument related to intensity gradient method proposed by A. Rowlands.
- do.report - specify reports that need to be generated.
- viewingwindow and visualreport - to create a visual report, this only works when all five parts of GGIR have successfully run.

The table below shows all GGIR input arguments, the GGIR part (1, 2, 3, 4 and/or 5) they are used in, and the parameter object they belong to. As you will see, a few parameters are not part of any parameter object. Their default values can be found in the GGIR function documentation (pdf).
Argument (parameter) | Used in GGIR part | Parameter object |
---|---|---|
datadir | 1, 2, 4, 5 | not in parameter objects |
f0 | 1, 2, 3, 4, 5 | not in parameter objects |
f1 | 1, 2, 3, 4, 5 | not in parameter objects |
windowsizes | 1, 5 | params_general |
desiredtz | 1, 2, 3, 4, 5 | params_general |
overwrite | 1, 2, 3, 4, 5 | params_general |
do.parallel | 1, 2, 3, 5 | params_general |
maxNcores | 1, 2, 3, 5 | params_general |
myfun | 1, 2, 3 | not in parameter objects |
outputdir | 1 | not in parameter objects |
studyname | 1 | not in parameter objects |
chunksize | 1 | params_rawdata |
do.enmo | 1 | params_metrics |
do.lfenmo | 1 | params_metrics |
do.en | 1 | params_metrics |
do.bfen | 1 | params_metrics |
do.hfen | 1 | params_metrics |
do.hfenplus | 1 | params_metrics |
do.mad | 1 | params_metrics |
do.anglex | 1 | params_metrics |
do.angley | 1 | params_metrics |
do.angle | 1 | params_metrics |
do.enmoa | 1 | params_metrics |
do.roll_med_acc_x | 1 | params_metrics |
do.roll_med_acc_y | 1 | params_metrics |
do.roll_med_acc_z | 1 | params_metrics |
do.dev_roll_med_acc_x | 1 | params_metrics |
do.dev_roll_med_acc_y | 1 | params_metrics |
do.dev_roll_med_acc_z | 1 | params_metrics |
do.lfen | 1 | params_metrics |
do.lfx | 1 | params_metrics |
do.lfy | 1 | params_metrics |
do.lfz | 1 | params_metrics |
do.hfx | 1 | params_metrics |
do.hfy | 1 | params_metrics |
do.hfz | 1 | params_metrics |
do.bfx | 1 | params_metrics |
do.bfy | 1 | params_metrics |
do.bfz | 1 | params_metrics |
do.zcx | 1 | params_metrics |
do.zcy | 1 | params_metrics |
do.zcz | 1 | params_metrics |
lb | 1 | params_metrics |
hb | 1 | params_metrics |
n | 1 | params_metrics |
do.cal | 1 | params_rawdata |
spherecrit | 1 | params_rawdata |
minloadcrit | 1 | params_rawdata |
printsummary | 1 | params_rawdata |
print.filename | 1 | params_general |
backup.cal.coef | 1 | params_rawdata |
rmc.noise | 1 | params_rawdata |
rmc.dec | 1 | params_rawdata |
rmc.firstrow.acc | 1 | params_rawdata |
rmc.firstrow.header | 1 | params_rawdata |
rmc.col.acc | 1 | params_rawdata |
rmc.col.temp | 1 | params_rawdata |
rmc.col.time | 1 | params_rawdata |
rmc.unit.acc | 1 | params_rawdata |
rmc.unit.temp | 1 | params_rawdata |
rmc.origin | 1 | params_rawdata |
rmc.header.length | 1 | params_rawdata |
rmc.format.time | 1 | params_rawdata |
rmc.bitrate | 1 | params_rawdata |
rmc.dynamic_range | 1 | params_rawdata |
rmc.unsignedbit | 1 | params_rawdata |
rmc.desiredtz | 1 | params_rawdata |
rmc.sf | 1 | params_rawdata |
rmc.headername.sf | 1 | params_rawdata |
rmc.headername.sn | 1 | params_rawdata |
rmc.headername.recordingid | 1 | params_rawdata |
rmc.header.structure | 1 | params_rawdata |
rmc.check4timegaps | 1 | params_rawdata |
rmc.col.wear | 1 | params_rawdata |
rmc.doresample | 1 | params_rawdata |
imputeTimegaps | 1 | params_rawdata |
selectdaysfile | 1, 2 | params_cleaning |
dayborder | 1, 2, 5 | params_general |
dynrange | 1 | params_rawdata |
configtz | 1 | params_general |
minimumFileSizeMB | 1 | params_rawdata |
interpolationType | 1 | params_rawdata |
metadatadir | 2, 3, 4, 5 | not in parameter objects |
minimum_MM_length.part5 | 5 | params_cleaning |
strategy | 2, 5 | params_cleaning |
hrs.del.start | 2, 5 | params_cleaning |
hrs.del.end | 2, 5 | params_cleaning |
maxdur | 2, 5 | params_cleaning |
max_calendar_days | 2 | params_cleaning |
includedaycrit | 2 | params_cleaning |
L5M5window | 2 | params_247 |
M5L5res | 2, 5 | params_247 |
winhr | 2, 5 | params_247 |
qwindow | 2 | params_247 |
qlevels | 2 | params_247 |
ilevels | 2 | params_247 |
mvpathreshold | 2 | params_phyact |
boutcriter | 2 | params_phyact |
ndayswindow | 2 | params_cleaning |
idloc | 2, 4 | params_general |
do.imp | 2 | params_cleaning |
storefolderstructure | 2, 4, 5 | params_output |
epochvalues2csv | 2 | params_output |
do.part2.pdf | 2 | params_output |
mvpadur | 2 | params_phyact |
window.summary.size | 2 | params_247 |
bout.metric | 2, 5 | params_phyact |
closedbout | 2 | params_phyact |
IVIS_windowsize_minutes | 2 | params_247 |
IVIS_epochsize_seconds | 2 | params_247 |
IVIS.activity.metric | 2 | params_247 |
iglevels | 2, 5 | params_247 |
TimeSegments2ZeroFile | 2 | params_cleaning |
qM5L5 | 2 | params_247 |
MX.ig.min.dur | 2 | params_247 |
qwindow_dateformat | 2 | params_247 |
anglethreshold | 3 | params_sleep |
timethreshold | 3 | params_sleep |
acc.metric | 3, 5 | params_general |
ignorenonwear | 3 | params_sleep |
constrain2range | 3 | params_sleep |
do.part3.pdf | 3 | params_output |
sensor.location | 3, 4 | params_general |
HASPT.algo | 3 | params_sleep |
HASIB.algo | 3 | params_sleep |
Sadeh_axis | 3 | params_sleep |
longitudinal_axis | 3 | params_sleep |
HASPT.ignore.invalid | 3 | params_sleep |
loglocation | 4, 5 | params_sleep |
colid | 4 | params_sleep |
coln1 | 4 | params_sleep |
nnights | 4 | params_sleep |
sleeplogidnum | 4, 5 | params_sleep |
do.visual | 4 | params_output |
outliers.only | 4 | params_output |
excludefirstlast | 4 | params_cleaning |
criterror | 4 | params_output |
includenightcrit | 4 | params_cleaning |
relyonguider | 4 | params_sleep |
relyonsleeplog | 4 | not in parameter objects |
def.noc.sleep | 4 | params_sleep |
data_cleaning_file | 4, 5 | params_cleaning |
excludefirst.part4 | 4 | params_cleaning |
excludelast.part4 | 4 | params_cleaning |
sleeplogsep | 4 | params_cleaning |
sleepwindowType | 4 | params_cleaning |
excludefirstlast.part5 | 5 | params_cleaning |
boutcriter.mvpa | 5 | params_phyact |
boutcriter.in | 5 | params_phyact |
boutcriter.lig | 5 | params_phyact |
threshold.lig | 5 | params_phyact |
threshold.mod | 5 | params_phyact |
threshold.vig | 5 | params_phyact |
timewindow | 5 | params_output |
boutdur.mvpa | 5 | params_phyact |
boutdur.in | 5 | params_phyact |
boutdur.lig | 5 | params_phyact |
save_ms5rawlevels | 5 | params_output |
part5_agg2_60seconds | 5 | params_general |
save_ms5raw_format | 5 | params_output |
save_ms5raw_without_invalid | 5 | params_output |
includedaycrit.part5 | 5 | params_cleaning |
frag.metrics | 5 | params_phyact |
LUXthresholds | 5 | params_247 |
LUX_cal_constant | 5 | params_247 |
LUX_cal_exponent | 5 | params_247 |
LUX_day_segments | 5 | params_247 |
do.sibreport | 5 | params_output |
Cut-points to estimate time spent in acceleration levels that are roughly linked to levels of energy metabolism have been proposed by:
Acceleration metric not available in GGIR? Some of the above publications make use of acceleration metrics that sum their values per epoch rather than average them per epoch like GGIR does. So, to use their cut-point value we need to divide the proposed cut-point by the number of samples per epoch, that is the sample frequency used in the study that proposed it multiplied by the epoch length in seconds. For each of the studies this is detailed below. Note that GGIR intentionally does not sum values per epoch because that approach makes the cut-point sample frequency dependent, which complicates comparisons and harmonisation of literature. The explained variance and accuracy remain identical because we are only scaling by a constant.
Esliger 2011, Phillips 2013, Fraysse 2020, Dibben 2020:
- Set do.enmoa = TRUE, do.enmo = FALSE, and acc.metric = "ENMOa".
- threshold.lig = (LightCutPointFromPaper_in_gmins/(sampleRateInStudy*60)) * 1000
- threshold.mod = (ModerateCutPointFromPaper_in_gmins/(sampleRateInStudy*60)) * 1000
- threshold.vig = (VigorousCutPointFromPaper_in_gmins/(sampleRateInStudy*60)) * 1000
- mvpathreshold = (ModerateCutPointFromPaper_in_gmins/(sampleRateInStudy*60)) * 1000
- sampleRateInStudy was 80 for Esliger and Phillips and 100 for Fraysse.
Roscoe 2017:
- Set do.enmoa = TRUE, do.enmo = FALSE, and acc.metric = "ENMOa".
- threshold.lig = (LightCutPointFromPaper_in_gsecs/85.7) * 1000
- threshold.mod = (ModerateCutPointFromPaper_in_gsecs/85.7) * 1000
- threshold.vig = (VigorousCutPointFromPaper_in_gsecs/85.7) * 1000
- mvpathreshold = (ModerateCutPointFromPaper_in_gsecs/85.7) * 1000
Schaeffer 2014:
- Set do.en = TRUE, do.enmo = FALSE, and acc.metric = "EN".
- threshold.lig = (LightCutPointFromPaper/75) * 1000
- threshold.mod = (ModerateCutPointFromPaper/75) * 1000
- threshold.vig = (VigorousCutPointFromPaper/75) * 1000
- mvpathreshold = (ModerateCutPointFromPaper/75) * 1000
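The conversion from a summed cut-point to an averaged cut-point in mg can be sketched in a few lines of R. This is only an illustration: the helper name sum_cutpoint_to_mg and the example cut-point value are hypothetical and not taken from GGIR or from any of the cited papers, and the sketch assumes the published cut-point is a plain sum of sample values (in g) within one epoch.

```r
# Convert a cut-point expressed as a per-epoch sum (in g) into a per-epoch average (in mg).
# Assumption: the published value is the sum of all sample values within one epoch.
sum_cutpoint_to_mg <- function(cutpoint_sum_g, sample_rate_hz, epoch_length_s) {
  samples_per_epoch <- sample_rate_hz * epoch_length_s
  (cutpoint_sum_g / samples_per_epoch) * 1000
}

# Hypothetical cut-point of 480 (sum over 60 second epochs at 80 Hz), for illustration only
threshold_mod <- sum_cutpoint_to_mg(480, sample_rate_hz = 80, epoch_length_s = 60)
print(threshold_mod)
```

The resulting value could then be passed on to arguments such as threshold.mod or mvpathreshold.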
Vaha-Ypya et al 2015:
Hildebrand 2014, Hildebrand 2016, Migueles 2021. Sanders 2018:
Sensor calibration
In all of the studies above, excluding Hildebrand et al. 2016, no effort was made to calibrate the acceleration sensors relative to gravitational acceleration prior to cut-point development. Theoretically this can be expected to cause a bias in the cut-point estimates proportional to the calibration error in each device, especially for cut-points based on acceleration metrics which rely on the assumption of accurate calibration such as metrics: ENMO, EN, ENMOa, and by that also metric SVMgs used by studies such as Esliger 2011, Phillips 2013, and Dibben 2020.
Idle sleep mode and ActiGraph
Studies done with ActiGraph devices when configured with ‘idle sleep mode’ on will have zero strings in all three axes during periods of no movement. Studies do not clarify how these zero strings are accounted for. The insertion of zero strings is problematic as raw data accelerometers should always measure the gravitational component when not moving. This directly impacts metrics that rely on the presence of a gravitational component such as ENMO, EN, ENMOa, and SVMgs. However, other metrics may also be affected as the sudden disappearance of gravitational acceleration will cause a spike at the start and end of the non-movement time segment. More generally speaking, we advise ActiGraph users to disable the ‘idle sleep mode’ as it harms transparency and reproducibility, since no mechanism exists to replicate it in other accelerometer brands, and it is likely to challenge accurate assessment of sleep and sedentary behaviour. We also advise that data collected with ‘idle sleep mode’ turned on is not referred to as raw data accelerometry, because the data collection process has involved proprietary pre-processing steps, which violates the core principle of raw data collection.
Validity of validation studies
Several studies that aimed to independently evaluate cut-point methods failed at recognising these challenges. Further, validation studies are typically limited to laboratory conditions and a small population. Therefore, it is best to interpret cut-points with caution. Future methodological studies around cut-points are advised to account for accelerometer calibration error and the problematic time gaps in for example the ActiGraph when configured with ‘idle sleep mode’.
If you consider all the arguments above you may end up with a call to GGIR that could look as follows.
library(GGIR)
GGIR(
mode=c(1,2,3,4,5),
datadir="C:/mystudy/mydata",
outputdir="D:/myresults",
do.report=c(2,4,5),
#=====================
# Part 2
#=====================
strategy = 1,
hrs.del.start = 0, hrs.del.end = 0,
maxdur = 9, includedaycrit = 16,
qwindow=c(0,24),
mvpathreshold =c(100),
bout.metric = 6,
excludefirstlast = FALSE,
includenightcrit = 16,
#=====================
# Part 3 + 4
#=====================
def.noc.sleep = 1,
outliers.only = TRUE,
criterror = 4,
do.visual = TRUE,
#=====================
# Part 5
#=====================
threshold.lig = c(30), threshold.mod = c(100), threshold.vig = c(400),
boutcriter = 0.8, boutcriter.in = 0.9, boutcriter.lig = 0.8,
boutcriter.mvpa = 0.8, boutdur.in = c(1,10,30), boutdur.lig = c(1,10),
boutdur.mvpa = c(1),
includedaycrit.part5 = 2/3,
#=====================
# Visual report
#=====================
timewindow = c("WW"),
visualreport=TRUE)
Once you have used GGIR, the output directory (outputdir) will be filled with milestone data and results.
Function GGIR stores all the explicitly entered argument values and default values for the arguments that are not explicitly provided in a csv-file named config.csv stored in the root of the output folder. The config.csv file is accepted as input to GGIR with argument configfile to replace the specification of all the arguments, except datadir and outputdir, see example below.
library(GGIR)
GGIR(datadir="C:/mystudy/mydata",
outputdir="D:/myresults", configfile = "D:/myconfigfiles/config.csv")
The practical value of this is that it eases the replication of analysis, because instead of having to share your R script, sharing your config.csv file will be sufficient. Further, the config.csv file contributes to the reproducibility of your data analysis.
Note 1: When combining a configuration file with explicitly provided argument values, the explicitly provided argument values will overrule the argument values in the configuration file.
Note 2: The config.csv file in the root of the output folder will be overwritten every time you use GGIR. So, if you would like to add annotations in the file, e.g. in the fourth column, then you will need to store it somewhere outside the output folder and explicitly point to it with the configfile argument.
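As a minimal sketch of Note 2, the R code below copies a configuration file out of the output folder before it is annotated. The folder names and the content of the stand-in config.csv are hypothetical and created in a temporary location purely for illustration. The real file is written by GGIR itself.

```r
# Hypothetical folders, created in a temporary location for illustration only
outputdir <- file.path(tempdir(), "myresults")
configdir <- file.path(tempdir(), "myconfigfiles")
dir.create(outputdir, showWarnings = FALSE)
dir.create(configdir, showWarnings = FALSE)

# Stand-in for the config.csv written by GGIR (the content here is made up)
generated_config <- file.path(outputdir, "config.csv")
writeLines(c("argument,value", "strategy,1"), generated_config)

# Copy the file out of the output folder so that a later GGIR run cannot overwrite it
my_config <- file.path(configdir, "config.csv")
copied <- file.copy(from = generated_config, to = my_config, overwrite = TRUE)
print(copied)

# The copy can then be annotated and passed on via argument configfile, e.g.
# GGIR(datadir = "C:/mystudy/mydata", outputdir = "D:/myresults", configfile = my_config)
```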
You can use source("pathtoscript/myshellscript.R") or use the Source button in RStudio if you use RStudio.
GGIR by default supports multi-thread processing, which can be turned off by setting argument do.parallel = FALSE. If this is still not fast enough then I advise using GGIR on a computing cluster. The way I did it on a Sun Grid Engine cluster is shown below. Please note that some of these commands are specific to the computing cluster you are working on. Also, you may actually want to use an R package like clustermq or snowfall, which avoids having to write bash scripts. Please consult your local cluster specialist to tailor this to your situation. In my case, I had three files for the SGE setting:
submit.sh
for i in {1..707}; do
n=1
s=$(( n * (i - 1) + 1 ))  # index of the first file for job i
e=$(( i * n ))            # index of the last file for job i
qsub /home/nvhv/WORKING_DATA/bashscripts/run-mainscript.sh $s $e
done
run-mainscript.sh
#! /bin/bash
#$ -cwd -V
#$ -l h_vmem=12G
/usr/bin/R --vanilla --args f0=$1 f1=$2 < /home/nvhv/WORKING_DATA/test/myshellscript.R
myshellscript.R
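library(GGIR)
options(echo=TRUE)
# Read the arguments passed on by run-mainscript.sh, e.g. f0=1 f1=1
args = commandArgs(TRUE)
if (length(args) > 0) {
  for (i in seq_along(args)) {
    # Each argument is an R assignment such as f0=1, which is evaluated here
    eval(parse(text = args[[i]]))
  }
}
GGIR(f0=f0,f1=f1,...)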
You will need to update the ... in the last line with the arguments you used for GGIR. Note that f0=f0,f1=f1 is essential for this to work. The values of f0 and f1 are passed on from the bash script.
Once this is all set up you will need to call bash submit.sh from the command line.
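To make the bookkeeping in these scripts easier to follow, the sketch below reproduces in R the index arithmetic from submit.sh and shows one possible way to read key=value arguments without evaluating them as code. The function names file_range and parse_key_value are hypothetical helpers for illustration and are not part of GGIR.

```r
# Range of file indices handled by job i when each job processes n files
# (same arithmetic as in submit.sh)
file_range <- function(i, n) {
  c(f0 = n * (i - 1) + 1, f1 = i * n)
}

# Turn "key=value" strings into a named numeric vector without using eval()
parse_key_value <- function(args) {
  parts <- strsplit(args, "=", fixed = TRUE)
  keys <- vapply(parts, function(p) p[1], character(1))
  values <- as.numeric(vapply(parts, function(p) p[2], character(1)))
  stats::setNames(values, keys)
}

job3 <- file_range(i = 3, n = 10)
simulated_args <- c("f0=21", "f1=30")  # stand-in for commandArgs(TRUE)
parsed <- parse_key_value(simulated_args)
print(job3)
print(parsed)
```

With n = 1, as in submit.sh above, every job processes exactly one file and f0 equals f1.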
Important Note:
Please make sure that you process one GGIR part at a time on a cluster, because each part assumes that preceding parts have been run. You can make sure of this by always setting argument mode to a single part of GGIR. Once the analysis stops, update argument mode to the next part until all parts are done. The speed of the parallel processing is obviously dependent on the capacity of your computing cluster and the size of your dataset.
GGIR generates the following types of output.
- csv-spreadsheets with all the variables you need for physical activity, sleep and circadian rhythm research
- Pdfs with on each page a low resolution plot of the data per file and quality indicators
- R objects with milestone data
- Pdfs with a visual summary of the physical activity and sleep patterns as identified (see example below)
Part 2 generates the following output:
(Part of) variable name | Description |
---|---|
ID | Participant id |
device_sn | Device serial number |
bodylocation | Body location extracted from file header |
filename | Name of the data file |
start_time | Timestamp when recording started |
startday | Day of the week on which recording started |
samplefreq | Sample frequency (Hz) |
device | Accelerometer brand, e.g. GENEACtiv |
clipping_score | The Clipping score: Fraction of 15 minute windows per file for which the acceleration in one of the three axes was close to the maximum for at least 80% of the time. This should be 0. |
meas_dur_dys | Measurement duration (days) |
complete_24hcycle | Completeness score: Fraction of 15 minute windows per 24 hours for which no valid data is available at any day of the measurement. |
meas_dur_def_proto_day | measurement duration according to protocol (days): Measurement duration (days) minus the hours that are ignored at the beginning and end of the measurement motivated by protocol design |
wear_dur_def_proto_day | wear duration according to protocol (days): So, if the protocol was seven days of measurement, then wearing the accelerometer for 8 days and recording data for 8 days will still make that the wear duration is 7 days |
calib_err | Calibration error (static estimate) Estimated based on all ‘non-movement’ periods in the measurement after applying the autocalibration. |
calib_status | Calibration status: Summary statement about the status of the calibration error minimisation |
ENMO_fullRecordingMean | ENMO is the main summary measure of acceleration. The value presented is the average ENMO over all the available data normalised per 24-hour cycles (diurnal balanced), with invalid data imputed by the average at similar time points on different days of the week. In addition to ENMO it is possible to extract other acceleration metrics (i.e. BFEN, HFEN, HFENplus). We emphasize that it is calculated over the full recording because the alternative is that a variable is only calculated over measurement days with sufficient valid hours of data. |
ENMO | (only available if set to true in part1.R) ENMO is the main summary measure of acceleration. The value presented is the average ENMO over all the available data normalised per 24 hour cycles, with invalid data imputed by the average at similar timepoints on different days of the week. In addition to ENMO it is possible to extract other acceleration metrics in part1.R (i.e. BFEN, HFEN, HFENplus) See also van Hees PLoSONE April 2013 for a detailed description and comparison of these techniques. |
pX_A_mg_0-24h_fullRecording | This variable represents the Xth percentile in the distribution of short epoch metric value A of the average day. The average day may not be ideal for describing the distribution. Therefore, the code also extracts the following variable. |
AD_pX_A_mg_0-24h | This variable represents the Xth percentile in the distribution of short epoch metric value A per day averaged across all days. |
L5_A_mg_0-24 | Average of metric A during the least active five* hours in the day, that is the lowest rolling average value of metric A. (* window size is modifiable by argument winhr) |
M5_A_mg_0-24 | Average of metric A during the most active five* hours in the day, that is the highest rolling average value of metric A. (* window size is modifiable by argument winhr) |
L5hr_A_mg_0-24 | Starting time in hours and fractions of hours of L5_A_mg_0-24 |
M5hr_A_mg_0-24 | Starting time in hours and fractions of hours of M5_A_mg_0-24 |
ig_gradient_ENMO_0-24hr_fullRecording | Intensity gradient calculated over the full recording. |
1to6am_ENMO_mg | Average metric value ENMO between 1am and 6am |
N valid WEdays | Number of valid weekend days |
N valid WKdays | Number of valid week days |
IS_interdailystability | inter daily stability. The movement count that is derived for this was an attempt to follow the original approach by Eus J. W. Van Someren (Chronobiology International. 1999. Volume 16, issue 4). |
IV_intradailyvariability | intra daily variability. In contrast to the original paper, we ignore the epoch transitions between the end of a day and the beginning of the next day for the numerator of the equation, this to make it a true measure of intradaily variability. Same note as for IS: The movement count that is derived for this was an attempt to follow the original approach. |
IVIS_windowsize_minutes | Sizes of the windows based on which IV and IS are calculated (note that this is modifiable) |
IVIS_epochsize_seconds | Argument has been deprecated |
AD_ | All days (plain average of all available days, no weighting). The variable was calculated per day and then averaged over all the available days |
WE_ | Weekend days (plain average of all available days, no weighting). The variable was calculated per day and then averaged over weekend days only |
WD_ | Week days (plain average of all available days, no weighting). The variable was calculated per day and then averaged over week days only |
WWE_ | Weekend days (weighted average). The variable was calculated per day and then averaged over weekend days. Double weekend days are averaged. This is only relevant for experiments that last for more than seven days. |
WWD_ | Week days (weighted average). The variable was calculated per day and then averaged over week days. Double week days were averaged. This is only relevant for experiments that last for more than seven days. |
WWD_MVPA_E5S_T100_ENMO | Time spent in moderate-to-vigorous physical activity based on 5 second epoch size and an ENMO metric threshold of 100 |
WWE_MVPA_E5S_B1M80%_T100_ENMO | Time spent in moderate-to-vigorous physical activity based on 5 second epoch size and an ENMO metric threshold of 100, based on bouts of at least 1 minute with a bout inclusion criterion of 80% |
WE_[100,150)_mg_0-24h_ENMO | Time spent between (and including) 100 mg and 150 mg (excluding 150 itself) between 0 and 24 hours (the full day) using metric ENMO |
_MVPA_E5S_B1M80_T100 | MVPA calculated based on 5 second epoch setting, bout duration 1 minute and inclusion criterion of more than 80 percent. This is only done for metric ENMO at the moment, and only if mvpa threshold is not left blank |
_ENMO_mg | ENMO or other metric was first calculated per day and then averaged according to AD, WD, WWE, WWD |
data exclusion strategy | A log of the decision made when calling g.impute: value=1 means ignore specific hours; value=2 means ignore all data before the first midnight and after the last midnight |
n hours ignored at start of meas (if strategy=1) | number of hours ignored at the start of the measurement (if strategy = 1). A log of decision made in part2.R |
n hours ignored at end of meas (if strategy=1) | number of hours ignored at the end of the measurement (if strategy = 1). A log of decision made in part2.R |
n days of measurement after which all data is ignored (if strategy=1) | number of days of measurement after which all data is ignored (if strategy = 1). A log of decision made in part2.R |
epoch size to which acceleration was averaged (seconds) | A log of decision made in part1.R |
pdffilenumb | Indicator of in which pdf-file the plot was stored |
pdfpagecount | Indicator of in which pdf-page the plot was stored |
cosinor_ | Cosinor analysis estimates such as mes, amp, acrophase, and acrotime, as documented in the ActCR package. |
cosinorExt_ | Extended Cosinor analysis estimates such as minimum, amp, alpha, beta, acrotime, UpMesor, DownMesor, MESOR, and F_pseudo, as documented in the ActCR package. |
cosinorIV | Cosinor analysis compatible estimate of the Intradaily Variability (IV) |
cosinorIS | Cosinor analysis compatible estimate of Interdaily Stability (IS) |
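To illustrate the idea behind the L5 and M5 variables in the table above, here is a minimal sketch in base R. It is not the GGIR implementation: it assumes one value per hour for a single day, uses made-up data, and ignores windows that wrap around midnight. The helper rolling_mean is hypothetical.

```r
# Hypothetical hourly averages of metric A (mg) for one day, hours 0 to 23
acc_mg <- c(rep(5, 6), rep(40, 6), rep(100, 5), rep(30, 7))
winhr <- 5  # window size in hours

# Rolling mean over all windows of 'width' consecutive values, via cumulative sums
rolling_mean <- function(x, width) {
  cs <- c(0, cumsum(x))
  (cs[(width + 1):length(cs)] - cs[1:(length(cs) - width)]) / width
}

rm5 <- rolling_mean(acc_mg, winhr)
L5 <- min(rm5)               # lowest rolling average
L5hr <- which.min(rm5) - 1   # start hour of the first window with that value
M5 <- max(rm5)               # highest rolling average
M5hr <- which.max(rm5) - 1   # start hour of the first window with that value
print(c(L5 = L5, L5hr = L5hr, M5 = M5, M5hr = M5hr))
```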
This is a non-exhaustive list, because most concepts have been explained in summary.csv
(Part of) variable name | Description |
---|---|
ID | Participant id |
filename | Name of the data file |
calender_date | Timestamp and date on which measurement started |
bodylocation | Location of the accelerometer as extracted from file header |
N valid hours | Number of hours with valid data in the day |
N hours | Number of hours of measurement in a day, which typically is 24, unless it is a day on which the clock changes (DST) resulting in 23 or 25 hours. The value can be less than 23 if the measurement started or ended this day |
weekday | Name of weekday |
measurement | Day of measurement: day number relative to start of the measurement |
L5hr_ENMO_mg_0-24h | Hour on which L5 starts for these 24 hours (defined with metric ENMO) |
L5_ENMO_mg_0-24h | Average acceleration for L5 (defined with metric ENMO) |
[A,B)_mg_0-24h_ENMO | Time spent in minutes between (and including) acceleration value A in mg and (excluding) acceleration value B in mg based on metric ENMO |
ig_gradient_ENMO_0-24hr | Gradient from intensity gradient analysis proposed by Rowlands et al. 2018 based on metric ENMO for the time segment 0 to 24 hours |
ig_intercept_ENMO_0-24hr | Intercept from intensity gradient analysis proposed by Rowlands et al. 2018 based on metric ENMO for the time segment 0 to 24 hours |
ig_rsquared_ENMO_0-24hr | r squared from intensity gradient analysis proposed by Rowlands et al. 2018 based on metric ENMO for the time segment 0 to 24 hours |
Part 4 generates the following output:
The .csv files contain the variables as shown below.
(Part of) variable name | Description |
---|---|
ID | Participant ID extracted from file |
night | Number of the night in the recording |
sleeponset | Detected onset of sleep expressed as hours since the midnight of the previous night. |
wakeup | Detected waking time (after sleep period) expressed as hours since the midnight of the previous night. |
SptDuration | Difference between onset and waking time. |
sleepparam | Definition of sustained inactivity by accelerometer. |
guider | guider used as discussed in paragraph Sleep analysis. |
guider_onset | Start of Sleep Period Time window derived from the guider. |
guider_wake | End of Sleep Period Time window derived from the guider. |
guider_SptDuration | SPT duration derived from guider_wake and guider_onset. |
error_onset | Difference between sleeponset and guider_onset |
error_wake | Difference between wakeup and guider_wake |
fraction_night_invalid | Fraction of the night (noon-noon or 6pm-6pm) for which the data was invalid, e.g. monitor not worn or no accelerometer measurement started/ended within the night. |
SleepDurationInSpt | Total sleep duration, which equals the accumulated nocturnal sustained inactivity bouts within the Sleep Period Time. |
duration_sib_wakinghours | Accumulated sustained inactivity bouts during the day. These are the periods we would label during the night as sleep, but during the day they form a subclass of inactivity, which may represent day time sleep or wakefulness while being motionless for a sustained period of time. |
number_sib_sleepperiod | Number of nocturnal sleep periods, with nocturnal referring to the Sleep Period Time window. |
duration_sib_wakinghours_atleast15min | Same as duration_sib_wakinghours, but limited to SIBs that last at least 15 minutes. |
number_sib_wakinghours | Number of sustained inactivity bouts during the day, with day referring to the time outside the Sleep Period Time window. |
sleeponset_ts | sleeponset formatted as a timestamp |
wakeup_ts | wakeup formatted as a timestamp |
guider_onset_ts | guider_onset formatted as a timestamp |
guider_wake_ts | guider_wake formatted as a timestamp |
page | pdf page on which the visualisation can be found |
daysleeper | If 0 then the person is a nightsleeper (sleep period did not overlap with noon); if 1 then the person is a daysleeper (sleep period did overlap with noon) |
weekday | Day of the week on which the night started |
calendardate | Calendar date on which the night started in day/month/year format. |
filename | Name of the accelerometer file |
cleaningcode | see paragraph Cleaningcode. |
sleeplog_used | Whether a sleep log was used (TRUE/FALSE) |
acc_available | Whether accelerometer data was available (TRUE/FALSE). |
WASO | Wake After Sleep Onset: SptDuration - SleepDurationInSpt |
SptDuration | Sleep Period Time window duration: wakeup - sleeponset |
error_onset | Difference between sleeponset and guider_onset (this variable is only available in the full report as stored in the QC folder) |
error_wake | Difference between wakeup and guider_wake (this variable is only available in the full report as stored in the QC folder) |
SleepRegularityIndex | The Sleep Regularity Index as proposed by Phillips et al. 2017, but calculated per day-pair to enable the user to study patterns across days. |
SriFractionValid | Fraction of the 24 hour period that was valid both in the current calendar day and in the matching timestamps of the next calendar day. See GGIR function manual for details. |
These additional variables are only stored if you used a sleeplog that captures time in bed, or when using guider HorAngle for hip-worn accelerometer data. If either of these applies, set argument sleepwindowType to “TimeInBed”.
(Part of) variable name | Description |
---|---|
guider_inbedStart | Time of getting in bed |
guider_inbedEnd | Time of getting out of bed |
guider_inbedDuration | Time in Bed: guider_inbedEnd - guider_inbedStart |
sleepefficiency | Sleep efficiency, calculated as: SleepDurationInSpt / guider_inbedDuration |
sleeplatency | Sleep latency, calculated as: sleeponset - guider_inbedStart |
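To illustrate how these variables relate to each other, the following Python sketch recomputes them for a single night. All input values are invented for the example, and times are expressed in hours since the midnight preceding the night, as for sleeponset. It only mirrors the formulas given in the tables above and is not GGIR code.

```python
# Hypothetical night-level values, in hours since the midnight preceding the night
guider_inbedStart = 22.5   # got into bed at 22:30
guider_inbedEnd = 31.0     # got out of bed at 07:00 the next morning
sleeponset = 23.0          # accelerometer-detected sleep onset
wakeup = 30.5              # accelerometer-detected wake-up
SleepDurationInSpt = 6.75  # accumulated sustained inactivity bouts within the SPT

guider_inbedDuration = guider_inbedEnd - guider_inbedStart  # Time in Bed
SptDuration = wakeup - sleeponset                           # Sleep Period Time window
WASO = SptDuration - SleepDurationInSpt                     # Wake After Sleep Onset
sleepefficiency = SleepDurationInSpt / guider_inbedDuration
sleeplatency = sleeponset - guider_inbedStart

print(guider_inbedDuration, SptDuration, WASO, sleeplatency, round(sleepefficiency, 3))
```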
In the person level report the variables are derived from the variables in the night level summary. Minor extensions to the variable names explain how variables are aggregated across the days. Please find below extra clarification on a few of the variable names for which the meaning may not be obvious:
(Part of) variable name | Description |
---|---|
_mn | mean across days |
_sd | standard deviation across days |
_AD | All days |
_WE | Weekend days |
_WD | Week days |
sleeplog_used | Whether a sleeplog was available (TRUE) or not (FALSE) |
sleep_efficiency | Accelerometer-derived sleep efficiency within the sleep period time, calculated as the ratio between acc_SleepDurationInSpt and acc_SptDuration (denominator). Only available at person level, because at night level the user can calculate this from existing variables. |
n_nights_acc | Number of nights of accelerometer data |
n_nights_sleeplog | Number of nights of sleeplog data. |
n_WE_nights_complete | Number of weekend nights complete which means both accelerometer and estimate from guider. |
n_WD_nights_complete | Number of weekday nights complete which means both accelerometer and estimate from guider. |
n_WEnights_daysleeper | Number of weekend nights on which the person slept until after noon. |
n_WDnights_daysleeper | Number of weekday nights on which the person slept until after noon. |
duration_sib_wakinghour | Total duration of sustained inactivity bouts during the waking hours. |
number_sib_wakinghours | Number of sustained inactivity bouts during the waking hours. |
average_dur_sib_wakinghours | Average duration of the sustained inactivity bouts during the day (outside the sleep period duration). Calculated as duration_sib_wakinghour divided by number_sib_wakinghours per day, after which the mean and standard deviation are calculated across days. |
Visualisation to support data quality checks:
- visualisation_sleep.pdf (optional)
When input argument do.visual is set to TRUE, GGIR can show the following visual comparison between the time window of being asleep (or in bed) according to the sleeplog and the detected sustained inactivity bouts according to the accelerometer data. This visualisation is stored in the results folder as visualisation_sleep.pdf.
Explanation of the image: Each line represents one night. Colours are used to distinguish definitions of sustained inactivity bouts (2 definitions in this case) and to indicate existence or absence of overlap with the sleeplog. When argument outliers.only is set to FALSE it will visualise all available nights in the dataset. If outliers.only is set to TRUE it will visualise only nights with a difference in onset or waking time between sleeplog and sustained inactivity bouts larger than the value of argument criterror.
This visualisation with outliers.only set to TRUE and criterror set to 4 was very powerful to identify entry errors in sleeplog data in van Hees et al PLoSONE 2015. We had over 25 thousand nights of data, and this visualisation allowed us to quickly zoom in on the most problematic nights to investigate possible mistakes in GGIR or mistakes in data entry.
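The selection of outlier nights can be pictured with the short Python sketch below. It assumes that the threshold is expressed in hours and that the comparison uses the absolute value of error_onset and error_wake; both are assumptions made for this illustration, and the night records are invented.

```python
def outlier_nights(nights, threshold_hours):
    """Return nights whose onset or wake difference exceeds the threshold (in hours)."""
    return [
        n for n in nights
        if abs(n["error_onset"]) > threshold_hours or abs(n["error_wake"]) > threshold_hours
    ]

# Hypothetical night-level records
nights = [
    {"night": 1, "error_onset": 0.4, "error_wake": -0.2},
    {"night": 2, "error_onset": -5.1, "error_wake": 0.3},
    {"night": 3, "error_onset": 1.0, "error_wake": 4.5},
]

selected = [n["night"] for n in outlier_nights(nights, threshold_hours=4)]
print(selected)
```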
The output of part 5 depends on the parameter configuration: it will generate as many output files as there are unique combinations of the three thresholds provided. For example, the following files will be generated if the threshold configuration was 30 for light activity, 100 for moderate and 400 for vigorous activity:
- part5_daysummary_MM_L30M100V400_T5A5.csv
- part5_daysummary_WW_L30M100V400_T5A5.csv
- part5_personsummary_MM_L30M100V400_T5A5.csv
- part5_personsummary_WW_L30M100V400_T5A5.csv
- file summary reports/Report_nameofdatafile.pdf
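The way the number of csv reports grows with the number of threshold combinations can be sketched in Python as follows. The helper name part5_report_names is hypothetical, and the T5A5 suffix is copied verbatim from the example file names without interpreting it.

```python
from itertools import product

def part5_report_names(light, moderate, vigorous, suffix="T5A5"):
    """List csv report names for every combination of the three thresholds."""
    names = []
    for lig, mod, vig in product(light, moderate, vigorous):
        config = f"L{lig}M{mod}V{vig}_{suffix}"
        for level in ("daysummary", "personsummary"):
            for window in ("MM", "WW"):
                names.append(f"part5_{level}_{window}_{config}.csv")
    return names

single = part5_report_names([30], [100], [400])
double = part5_report_names([30, 40], [100], [400])
print(len(single), len(double))
```

With one value per threshold this reproduces the four csv names listed above; adding a second light threshold doubles the number of files.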
(Term in) variable name | Description |
---|---|
sleeponset | onset of sleep expressed in hours since the midnight in the night preceding the night of interest, e.g. 26 is 2am. |
wakeup | waking up time expressed in the same way as sleeponset. |
sleeponset_ts | onset of sleep expressed as a timestamp hours:minutes:seconds |
daysleeper | if 0 then the person woke up before noon, if 1 then the person woke up after noon |
cleaningcode | See paragraph Cleaningcode. |
dur_day_spt_min | Total length of daytime waking hours and spt combined (typically 24 hours for MM report). |
dur_ | duration of a behavioral class that will be specified in the rest of the variable name |
ACC_ | (average) acceleration according to default metric specified by acc.metric |
_spt_wake_ | Wakefulness within the Sleep period time window. |
_spt_sleep_ | Sleep within the Sleep period time window. |
_IN_ | Inactivity |
_LIG_ | Light activity |
_MOD_ | Moderate activity |
_VIG_ | Vigorous activity |
_MVPA_ | Moderate or Vigorous activity |
_unbt_ | Unbouted |
_bts_ | Bouts (also known as sojourns), which are segments for which the acceleration is within a specified range for a specified fraction of the time. |
_bts_1_10_ | Bouts lasting at least 1 minute and less than 10 minutes (1 and 9.99 minutes are included, but 10 minutes is not). |
Nblock | number of blocks of a certain behavioral class; note that these are not bouts but a count of the number of times the behavioral class occurs without interruptions. |
WW | in filename refers to analyses based on the timewindow from waking to waking up |
MM | in filename refers to analyses done on windows between midnight and midnight |
calendar_date | calendar date on which the window started in day/month/year format. So, for WW window this could mean that you have two windows starting on the same date. |
weekday | weekday on which the window started. So, for WW window this could mean that you have two windows starting on the same weekday. |
_total_IN | total time spent in inactivity (no distinction between bouted or unbouted behavior, this is a simple count of the number of epochs that meet the threshold criteria). |
_total_LIG | total time spent in light activity. |
nonwear_perc_day | Non-wear percentage during the waking hours of this day. |
nonwear_perc_spt | Non-wear percentage during the spt hours of this day. |
nonwear_perc_day_spt | Non-wear percentage during the whole day, including waking and spt. |
dur_day_min | Duration of waking hours within this day window |
dur_spt_min | Duration of Sleep Period Time within this day window. |
dur_day_spt_min | Duration this day window, including both waking hours and SPT. |
sleep_efficiency | sleep_efficiency in part 5 is not the same as in part 4, but calculated as the percentage of sleep within the sleep period time window. The conventional approach is the approach used in part 4. |
L5TIME | Timing of least active 5hrs, expressed as timestamp in the day |
M5TIME | Timing of most active 5hrs |
L5TIME_num, M5TIME_num | Timing of least/most active 5hrs, expressed as hours in the day. Note that L5/M5 timing variables are difficult to average across days because 23:00 and 1:00 would average to noon and not to midnight. So, caution is needed when interpreting person averages. |
L5VALUE | Acceleration value for least active 5hrs |
M5VALUE | Acceleration value for most active 5hrs |
FRAG_ | All variables related to behavioural fragmentation analysis |
TP_ | Transition probability |
PA2IN | Physical activity fragments followed by inactivity fragments |
IN2PA | Physical inactivity fragments followed by activity fragments |
Nfrag | Number of fragments |
IN2LIPA | Inactivity fragments followed by LIPA |
IN2MVPA | Inactivity fragments followed by MVPA |
mean_dur | mean duration of a fragment category |
Gini_dur | Gini index |
CoV_dur | Coefficient of Variation |
alpha | Power law exponent |
x0.5 | Derived from power law exponent alpha, see Chastin et al. 2010 |
W0.5 | Derived from power law exponent alpha, see Chastin et al. 2010 |
nap_count | Total number of naps, only calculated when argument do.sibreport = TRUE, currently optimised for 3.5-year olds. See function documentation for function g.part5.classifyNaps in the GGIR function documentation (pdf). |
nap_totalduration | Total nap duration, only calculated when argument do.sibreport = TRUE, currently optimised for 3.5-year olds. See function documentation for function g.part5.classifyNaps in the GGIR function documentation (pdf). |
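The caution about averaging L5/M5 timing across days can be made concrete with a small Python sketch. It contrasts the arithmetic mean with a circular mean, a standard technique for clock times. The circular mean is shown only as one possible way to handle the problem in your own analysis; nothing here implies that GGIR calculates it.

```python
import math

def circular_mean_hour(hours):
    """Circular mean of clock times given as hours in [0, 24)."""
    angles = [2 * math.pi * h / 24 for h in hours]
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    # atan2 returns an angle in (-pi, pi]; map it back onto the 24 hour clock
    return (math.atan2(s, c) * 24 / (2 * math.pi)) % 24

times = [23.0, 1.0]                   # 23:00 on one day, 1:00 on another
arithmetic = sum(times) / len(times)  # 12.0, i.e. noon
circular = circular_mean_hour(times)  # numerically at midnight (0 or just below 24)
print(arithmetic, circular)
```

Note that the circular mean is undefined when the times cancel out exactly, for example 0:00 and 12:00.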
Special note if you are working on compositional data analysis:
The duration of all dur_ variables that have _total_ in their name should add up to the total length of the waking hours in a day. Similarly, the duration of all other dur_ variables, excluding the variables with _total_ in their name and excluding the variables dur_day_min, dur_spt_min, and dur_day_spt_min, should also add up to the length of the full day.
Motivation for default boutcriter.in = 0.9:
The idea is that if you allow for bouts of 30 minutes, it would not make sense to allow for breaks of 20 percent (6 minutes!). This is why I used a more stringent criterion for the highest category. Please note that you can change these criteria via arguments boutcriter.mvpa, boutcriter.in, and boutcriter.lig.
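The arithmetic behind this motivation is shown in the Python sketch below. The helper name max_break_minutes is hypothetical; it simply multiplies the bout length by the fraction of time that is allowed to fall outside the intensity range.

```python
def max_break_minutes(bout_minutes, boutcriter):
    """Minutes within a bout that may fall outside the required intensity range."""
    return bout_minutes * (1 - boutcriter)

breaks_at_80 = round(max_break_minutes(30, 0.8), 6)  # criterion of 0.8 on a 30 minute bout
breaks_at_90 = round(max_break_minutes(30, 0.9), 6)  # criterion of 0.9 on a 30 minute bout
print(breaks_at_80, breaks_at_90)
```

A criterion of 0.8 leaves room for 6 minutes of breaks in a 30 minute bout, whereas 0.9 halves this to 3 minutes.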
Most variables in the person level summary are derived from the day level summary, but extended with _pla to indicate that the variable was calculated as the plain average across all valid days. Variables extended with _wei represent the weighted average across all days, where weekend days are always weighted 2/5 relative to the contribution of week days.
Variable name | Description |
---|---|
Nvaliddays | Total number of valid days. |
Nvaliddays_WD | Number of valid week days. |
Nvaliddays_WE | Number of valid weekend days, where the days that start on Saturday or Sunday are considered weekend. |
NcleaningcodeX | Number of days that had cleaning code X for the corresponding sleep analysis in part 4. In case of MM analysis this refers to the night at the end of the day. |
Nvaliddays_AL10F_WD | Number of valid week days with at least 10 fragments (5 inactivity or 5 inactive) |
Nvaliddays_AL10F_WE | Number of valid weekend days with at least 10 fragments (5 inactivity or 5 inactive) |
_wei | weighted average of weekend and week days, using a 2/5 ratio, see above. |
_pla | plain average of all days, see above |
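The difference between the plain and the weighted average can be illustrated with the Python sketch below. It assumes that the 2/5 ratio means the weekend mean counts with weight 2 and the week day mean with weight 5, as in a regular week; this reading and the example values are assumptions for illustration.

```python
def plain_average(values):
    """Plain average across all supplied days."""
    return sum(values) / len(values)

def weighted_average(weekday_values, weekend_values):
    """Combine week day and weekend means with weights 5 and 2 (assumed reading of the 2/5 ratio)."""
    wd = plain_average(weekday_values)
    we = plain_average(weekend_values)
    return (5 * wd + 2 * we) / 7

weekday_minutes = [40, 50, 60]  # hypothetical values for three valid week days
weekend_minutes = [20]          # hypothetical value for one valid weekend day

pla = plain_average(weekday_minutes + weekend_minutes)
wei = weighted_average(weekday_minutes, weekend_minutes)
print(pla, round(wei, 2))
```

The weighted version prevents an unbalanced number of valid week and weekend days from shifting the person level estimate.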
In this chapter we will try to collect motivations and clarification behind GGIR which may not have been clear from the existing publications.
Some tips to increase reproducibility of your findings:
An acceleration sensor works on the principle that acceleration is captured mechanically and converted into an electrical signal. The relationship between the electrical signal and the acceleration is usually assumed to be linear, involving an offset and a gain factor. We shall refer to the establishment of the offset and gain factor as the sensor calibration procedure. Accelerometers are usually calibrated as part of the manufacturing process under non-movement conditions using the local gravitational acceleration as a reference. The manufacturer calibration can later be evaluated by holding each sensor axis parallel (up and down) or perpendicular to the direction of gravity; readings for each axis should be ±1 and 0 g, respectively. However, this procedure can be cumbersome in studies with a high throughput. Furthermore, such a calibration check will not be possible for data that have been collected in the past and for which the corresponding accelerometer device does not exist anymore. Techniques have been proposed that can check and correct for calibration error based on the collected triaxial accelerometer data in the participant’s daily life without additional experiments, referred to as autocalibration. The general principle of these techniques is that a recording of acceleration is screened for nonmovement periods. Next, the moving average over the nonmovement periods is taken from each of the three orthogonal sensor axes and used to generate a three-dimensional ellipsoid representation that should ideally be a sphere with radius 1 g. Deviations between the radius of the three-dimensional ellipsoid and 1 g (ideal calibration) can then be used to derive correction factors for sensor axis-specific calibration error. The auto-calibration performed by GGIR uses this technique; a more detailed description and demonstration can be found in the published paper.
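The principle can be illustrated with a minimal Python sketch that measures how far non-movement points lie from the 1 g sphere before and after a per-axis correction. The linear correction form scale * (value + offset), the function name, and the example data are assumptions for this illustration; the sketch does not estimate the correction factors and is not the GGIR implementation.

```python
import math

def calibration_error(points, offset=(0.0, 0.0, 0.0), scale=(1.0, 1.0, 1.0)):
    """Mean absolute deviation from 1 g of the corrected vector magnitude."""
    errors = []
    for point in points:
        corrected = [s * (v + o) for v, o, s in zip(point, offset, scale)]
        magnitude = math.sqrt(sum(c * c for c in corrected))
        errors.append(abs(magnitude - 1.0))
    return sum(errors) / len(errors)

# Hypothetical non-movement averages (in g) from a sensor whose z-axis reads 0.05 g too high
points = [
    (1.0, 0.0, 0.05), (-1.0, 0.0, 0.05),
    (0.0, 1.0, 0.05), (0.0, -1.0, 0.05),
    (0.0, 0.0, 1.05), (0.0, 0.0, -0.95),
]

error_before = calibration_error(points)
error_after = calibration_error(points, offset=(0.0, 0.0, -0.05))
print(error_before > 0.01, error_after < 0.01)
```

In this example the uncorrected error exceeds 10 mg, and removing the z-axis offset brings all points back onto the sphere.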
Reference:
Key decisions to be made:
Auto-calibration can be turned off by setting argument do.cal in GGIR to do.cal=FALSE.
Key output variables:
Variable cal.error.end as stored in data_quality_report.csv, or variable calib_err in summary.csv. These should be less than 0.01 g (10 mg).
Accelerometer non-wear time is estimated on the basis of the standard deviation and the value range of the raw data from each accelerometer axis. Classification is done per 15 minute (or ws2) block and based on the characteristics of the 60 minute (or ws) window centred at these 15 minutes. A block is classified as non-wear time if the standard deviation of the 60 minute window is less than 13.0 mg (\(1\ \text{mg} = 0.00981\ \text{m}\cdot\text{s}^{-2}\)) and the value range of the 60 minute window is less than 50 mg, for at least two out of the three accelerometer axes. The procedure for non-wear detection was modified in comparison to the procedure as applied in the 2011 PLoSONE publication link. Instead of 30-minute time windows, 60-minute time windows were used to decrease the chance of accidentally detecting short sedentary periods as non-wear time. The windows were overlapping (15 minute steps, window overlap of 45 minutes), which was done to improve the accuracy of detecting the boundaries of non-wear time as opposed to non-overlapping time windows.
Inspection of unpublished data on non-wear classification by the algorithm as described in our published work indicated that the algorithm does not cope well with periods of monitor transportation per post. Here, long periods of non-wear are briefly interrupted by periods of movement, which are normally interpreted as monitor wear. Therefore, the algorithm was expanded with an additional stage in which the plausibility of “wear-periods” in-between non-wear periods is tested. Short periods of detected wear-time in-between longer periods of detected non-wear were classified as non-wear time based on the duration and the proportion of the duration relative to the bordering periods of detected non-wear-periods. The following criteria were derived from visual observation of various datasets using knowledge about study protocols.
All detected wear-periods of less than six hours and less than 30% of the combined duration of their bordering non-wear periods were classified as non-wear. Additionally, all wear-periods of less than three hours and which formed less than 80% of their bordering non-wear periods were classified as non-wear. The motivation for selecting a relatively high criterion (< 30%) in combination with a long period (6 hrs) and a low criterion (< 80%) in combination with a short period (3 hrs) was that long periods are more likely to be actually related to monitor wear time. A visual model was created, see Figure 1. Here, units of time are presented in squares and marked grey if detected as non-wear time. Period C is detected as wear-time and borders to non-wear periods B and D, see Figure 1. If the length of C is less than six hours and C divided by the sum of B and D is less than 0.3 then the first criterion is met and block C is turned into a non-wear period.
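The two criteria can be written down as a small Python sketch. The function name is hypothetical, and it is an assumption of this illustration that the 80% criterion uses the same denominator (the combined duration of the bordering non-wear periods B and D) as the 30% criterion.

```python
def reclassify_as_nonwear(wear_hours, before_hours, after_hours):
    """Decide whether wear period C, bordered by non-wear periods B and D, becomes non-wear."""
    ratio = wear_hours / (before_hours + after_hours)
    first_criterion = wear_hours < 6 and ratio < 0.3
    second_criterion = wear_hours < 3 and ratio < 0.8
    return first_criterion or second_criterion

print(reclassify_as_nonwear(4, 10, 10))    # 4 h of wear between two 10 h non-wear periods
print(reclassify_as_nonwear(2, 2, 1))      # 2 h of wear between 2 h and 1 h of non-wear
print(reclassify_as_nonwear(5, 5, 5))      # 5 h of wear, ratio 0.5
print(reclassify_as_nonwear(7, 100, 100))  # longer than six hours, always kept as wear
```

The first two calls meet the first and the second criterion respectively; the last two periods are kept as wear.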
By visual inspection of >100 traces from a large observational study it turned out that applying this stage in three iterative stages allowed for improved classification of periods characterised by intermittent periods of non-wear and apparent wear. Further, an additional rule was introduced for the final 24 hours of each measurement. The final 24 hours are often considered the period in which the accelerometer is potentially taken off but moved because of transportation, e.g. by the mail service. All wear-periods in the final 24 hrs of each measurement shorter than three hours and preceded by at least one hour of non-wear time were classified as non-wear. Finally, if the measurement starts or ends with a period of less than three hours of wear followed by non-wear (any length) then this period of wear is classified as non-wear. These additional criteria for screening the beginning and end of the accelerometer file reflect the likelihood of movements that are involved when starting the accelerometer or downloading the data from the accelerometer.
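Returning to the first stage of the procedure, the block-level rule (standard deviation below 13.0 mg and value range below 50 mg on at least two of the three axes) can be sketched in Python as follows. The function name and example signals are hypothetical, and the use of the sample standard deviation is an assumption; the windows here contain only a handful of values to keep the example readable.

```python
import statistics

SD_THRESHOLD_G = 0.013     # 13.0 mg
RANGE_THRESHOLD_G = 0.050  # 50 mg

def is_nonwear(window_xyz):
    """window_xyz holds three sequences (x, y, z) of acceleration in g for one 60 minute window."""
    quiet_axes = 0
    for axis in window_xyz:
        sd = statistics.stdev(axis)
        value_range = max(axis) - min(axis)
        if sd < SD_THRESHOLD_G and value_range < RANGE_THRESHOLD_G:
            quiet_axes += 1
    # Non-wear requires at least two of the three axes to be quiet
    return quiet_axes >= 2

still = ([0.010, 0.011, 0.010, 0.012], [0.0, 0.001, 0.0, 0.001], [1.0, 1.001, 0.999, 1.0])
moving = ([-0.5, 0.3, 0.8, -0.2], [0.1, -0.4, 0.6, 0.0], [1.0, 1.001, 0.999, 1.0])
print(is_nonwear(still), is_nonwear(moving))
```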
Reference:
Key decisions to be made:
Key output variables:
The acceleration signal was screened for ‘clipping’. If more than 50% of the data points in a 15 minute time window are higher than 7.5g (close to the maximal dynamic range of this sensor) the corresponding time period is considered as potentially corrupt data, which may be explained by the sensor getting stuck at its extreme value.
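A minimal Python sketch of this screening rule is given below. The function name is hypothetical, and the comparison follows the wording above literally (values higher than 7.5 g); whether the sign of the acceleration is taken into account is not specified here, so treat that as an open assumption.

```python
def is_clipped(window_values, threshold_g=7.5, max_fraction=0.5):
    """Flag a window if more than max_fraction of its data points exceed threshold_g."""
    n_high = sum(1 for v in window_values if v > threshold_g)
    return n_high / len(window_values) > max_fraction

mostly_high = [7.9, 8.0, 7.8, 0.1]  # 75% of the data points above 7.5 g
half_high = [7.9, 8.0, 0.2, 0.1]    # exactly 50%, which is not more than 50%
print(is_clipped(mostly_high), is_clipped(half_high))
```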
Reference:
Although many data points are collected we decide to only work with aggregated values (e.g. 1 or 5 second epochs) for the following reasons:
Accelerometers are often used to describe patterns in metabolic energy expenditure. Metabolic energy expenditure is typically defined per breath or per minute (indirect calorimetry), per day (room calorimeter), or per multiple days (doubly labelled water method). In order to validate our methods against these reference standards we need to work with a similar time resolution.
Collapsing the data to epoch summary measures helps to standardise for differences in sample frequency between studies.
There is little evidence that the raw data is an accurate representation of body acceleration. All scientific evidence on the validity of accelerometer data has so far been based on epoch averages.
Collapsing the data to epoch summary measures may help to average out different noise levels and make sensor brands more comparable.
GGIR uses short (default 5 seconds) and long epochs (default 15 minutes). The epochs are aligned to the hour in the day, and to each other. For example, if a recording starts at 9:52:00 then GGIR will work with epochs derived from 10:00:00 onward. If the recording starts at 10:12 then GGIR will work with epochs derived from 10:15:00 onward.
Motivation:
If the first 15 minute epoch would start at 9:52 then the next one would start at 10:07, which makes it impossible to make statements about behaviour between 10:00 and 13:00.
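The alignment in the two examples above amounts to rounding the recording start up to the next long epoch boundary. The Python sketch below reproduces those examples; the function name is hypothetical, and the behaviour for a recording that starts exactly on a boundary (it is kept as is) is an assumption.

```python
from datetime import datetime, timedelta

def first_aligned_epoch(start, long_epoch_minutes=15):
    """Return the first long-epoch boundary at or after the recording start."""
    midnight = start.replace(hour=0, minute=0, second=0, microsecond=0)
    step = timedelta(minutes=long_epoch_minutes)
    elapsed = start - midnight
    n_steps = -(-elapsed // step)  # ceiling division on timedeltas
    return midnight + n_steps * step

# The date is arbitrary; only the time of day matters for the alignment
print(first_aligned_epoch(datetime(2020, 1, 1, 9, 52, 0)))
print(first_aligned_epoch(datetime(2020, 1, 1, 10, 12, 0)))
```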
In GGIR sleep analysis has been implemented in part 3 and 4. Sleep analysis comes at two levels: the identification of the main Sleep Period Time (SPT) window or the time in bed window (TIB), and the discrimination of sleep and wakefulness periods. The term sleep is somewhat controversial in the context of accelerometry, because accelerometers only capture lack of movement. To acknowledge this challenge GGIR refers to these classified sleep periods as sustained inactivity bouts (abbreviated as SIB).
Currently, GGIR offers the user the choice to identify SIB periods using any of the following algorithms:
- Sadeh1994: set argument HASIB.algo = "Sadeh1994" and argument Sadeh_axis = "Y" to indicate that the algorithm should use the Y-axis of the sensor.
- Galland2012: set argument HASIB.algo = "Galland2012". Further, set Sadeh_axis = "Y" to specify that the algorithm should use the Y-axis.
Notes on the replication of the movement counts needed for the Sadeh and Galland algorithms:
The implementation of the zero-crossing count in GGIR is not an exact copy of the original approach as used in the AMA-32 Motionlogger Actigraph by Ambulatory-monitoring Inc. (“AMI”) that was used in the studies by Sadeh and colleagues in the late 1980s and 1990s. No complete publicly accessible description of that approach exists. From personal correspondence with AMI, I learnt that the technique has been kept proprietary and has never been shared with or sold to other actigraphy manufacturers (time of correspondence October 2021). Therefore, if you would like to replicate the exact zero-crossing counts calculation used by Sadeh and colleagues, consider using AMI’s actigraph device. However, if you prioritise openness over methodological consistency with the original studies by Sadeh and colleagues then you may want to consider any of the open source techniques in this library.
Missing information about the calculation includes: (1) Sadeh specified that calculations were done on data from the Y-axis but the direction of the Y-axis was not clarified. Therefore, it is unclear whether the Y-axis at the time corresponds to the Y-axis of modern sensors, (2) Properties of the frequency filter are missing like the filter order and more generally it is unclear how to simulate original acceleration sensor behaviour with modern sensor data, and (3) Sensitivity of the sensor, we are now guessing that the Motionlogger had a sensitivity of 0.01 g but without direct proof.
The method proposed by Galland and colleagues in 2012 was designed for counts captured with the Actical device (Mini Mitter Co, Inc Bend OR). Based on the correspondence with AMI we can conclude that Actical counts are not identical to AMI’s actigraph counts. Further, a publicly accessible complete description of the Actical calculation does not exist. Therefore, we can also conclude that methodological consistency cannot be guaranteed for Actical counts.
To aid research in exploring count type algorithms, we also implemented the brondcounts as proposed by Brønd and Brondeel and available via R package activityCounts. To extract this metric in addition to the zero crossing count, specify do.brondcounts = TRUE, which is used in GGIR part 1 and uses R package activityCounts in the background. As a result, sleep estimates for Sadeh or Galland will be derived based on both the zero crossing algorithm and the brondcounts algorithm.
SIBs (explained above) can occur anytime in the day. In order to differentiate SIBs that correspond to daytime rest/naps from SIBs that correspond to the main Sleep Period Time window (abbreviated as SPT), a guiding method referred to as the guider is used. All SIBs that overlap with the window defined by the guider are considered as sleep within the SPT window. The start of the first SIB identified as sleep period and the end of the last SIB identified as sleep period define the beginning and the end of the SPT window. In this way the classification relies on the accelerometer for detecting the timing of sleep onset and waking up time, but the guider tells it in what part of the day it should look, as the SPT window will be defined only if a SIB is detected during the guider specified window.
If a guider reflects the Time in Bed, the interpretation of the Sleep Period Time, Sleep onset time and Wakeup time remains unchanged. However, we can then also assess sleep latency and sleep efficiency, which will be included in the report.
The guiding method as introduced above can be one of the following methods:
- Sleep log: use argument sleepwindowType to clarify whether the sleeplog captures “TimeInBed” or “SPT”. If it is set to “TimeInBed”, GGIR will automatically expand its part 4 analyses with sleep latency and sleep efficiency assessment.
- setwindow: the window is set with argument def.noc.sleep.
- HorAngle: used with sensor.location="hip", because this will trigger the identification of the longitudinal axis based on 24-hour lagged correlation. You can also force GGIR to use a specific axis as longitudinal axis with argument longitudinal_axis. Next, it identifies when the horizontal axis is between -45 and 45 degrees and considers this a horizontal posture. Next, this is used to identify the largest time in bed period, by only considering horizontal time segments of at least 30 minutes, and then looking for the longest horizontal period in the day where gaps of less than 60 minutes are ignored. Therefore, it is partially similar to the HDCZA algorithm. When “HorAngle” is used, sleepwindowType is automatically set to “TimeInBed”.
For all guiders other than “HorAngle” and “sleep log”, argument sleepwindowType is automatically switched to “SPT”, such that no attempt is made to estimate sleep latency or sleep efficiency.
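The HorAngle steps described above (horizontal posture between -45 and 45 degrees, segments of at least 30 minutes, gaps of less than 60 minutes ignored, longest period selected) can be sketched in Python as follows. The function name, the one-angle-per-minute input, and the treatment of the boundaries as exclusive are assumptions of this illustration; it is not the GGIR implementation.

```python
def longest_horizontal_period(angles, min_segment=30, max_gap=60):
    """angles holds one angle (degrees) per minute; returns (start, end) minute indices, end exclusive."""
    # Step 1: find runs of consecutive horizontal minutes
    runs, start = [], None
    for i, angle in enumerate(list(angles) + [None]):  # the sentinel closes the last run
        horizontal = angle is not None and -45 < angle < 45
        if horizontal and start is None:
            start = i
        elif not horizontal and start is not None:
            runs.append((start, i))
            start = None
    # Step 2: keep only runs of at least min_segment minutes
    runs = [r for r in runs if r[1] - r[0] >= min_segment]
    if not runs:
        return None
    # Step 3: merge runs that are separated by gaps shorter than max_gap minutes
    merged = [runs[0]]
    for s, e in runs[1:]:
        if s - merged[-1][1] < max_gap:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))
    # Step 4: the longest merged period is the time in bed estimate
    return max(merged, key=lambda r: r[1] - r[0])

# Hypothetical day: upright, lying, briefly upright, lying, upright, short rest, upright
angles = [80] * 100 + [10] * 200 + [70] * 20 + [5] * 180 + [80] * 120 + [0] * 40 + [85] * 60
period = longest_horizontal_period(angles)
print(period)
```

In this example the two long lying periods are separated by only 20 minutes and are therefore joined, while the 40 minute rest later in the day stays separate.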
GGIR uses the sleep log by default. If the sleep log is not available it falls back on the HDCZA algorithm (or HorAngle if sensor.location="hip"). If HDCZA is not successful GGIR will fall back on the L5+/-12 definition, and if this is not available it will use the setwindow. The user can specify the priority with argument def.noc.sleep. So, when we refer to guider then we refer to one of these five methods.
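The default order of preference can be pictured as a simple decision chain. The Python sketch below is a simplification with hypothetical function and argument names: it does not model what happens when HorAngle itself fails, nor how def.noc.sleep changes the priority.

```python
def choose_guider(sleeplog_available, hdcza_success, l5_available, sensor_location="wrist"):
    """Pick the guider following the default fallback order described above."""
    if sleeplog_available:
        return "sleep log"
    if sensor_location == "hip":
        return "HorAngle"
    if hdcza_success:
        return "HDCZA"
    if l5_available:
        return "L5+/-12"
    return "setwindow"

print(choose_guider(sleeplog_available=False, hdcza_success=True, l5_available=True))
print(choose_guider(sleeplog_available=False, hdcza_success=False, l5_available=True))
```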
If the guider indicates that the person woke up after noon, the sleep analysis in part 4 is performed again on a window from 6pm-6pm. In this way our method is sensitive to people who have their main sleep period starting before noon and ending after noon, referred to as daysleeper=1 in the daysummary.csv file, which you can interpret as night workers. Note that the L5+/-12 algorithm is not configured to identify daysleepers; it will only consider the noon-noon time window.
To monitor possible problems with the sleep assessment, the variable cleaningcode is recorded for each night. Cleaningcode per night (noon-noon or 6pm-6pm as described above) can have one of the following values:
Argument includenightcrit indicates the minimum number of hours of valid data needed within those 24 hours.
All the information for each night is stored in the results/QC folder, allowing tracing of the data analysis and night selection. The cleaned results are stored in the results folder. In part 4 a night is excluded from the ‘cleaned’ results based on the following criteria:
Be aware that if using the full output and working with wrist accelerometer data, then missing entries in a sleep log that asks for Time in Bed will be replaced by HDCZA estimates of SPT. Therefore, extra caution should be taken when working with the full output.
Notice that part 4 is focused on sleep research, which is why the cleaned report is defined the way it is. In the next section we will discuss the analysis done by part 5. There, the choice of guider may be considered less important, which is why we use different criteria for including nights. As a result, you may see that a night that is excluded from the cleaned results in part 4 still appears in the cleaned results for part 5.
The package allows some adjustments to be made after data quality check. The data_cleaning_file argument allows you to specify individuals and nights for whom part4 should entirely rely on the guider (for example if we decide to use sleep log only information). The first column of this file should have header ID and there should be a column relyonguider_part4 to specify the night. The data_cleaning_file also allows you to tell GGIR which person(s) and night(s) should be omitted in part 4. The night numbers to be excluded should be listed in a column with night_part4 as header.
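As an illustration of the layout described above, the following Python sketch writes a small csv with the three column headers mentioned. The participant IDs and night numbers are invented, each cell holds at most one night, and how to list several nights for one person is not covered here, so consult the GGIR documentation for that.

```python
import csv
import io

# Hypothetical content: rely on the guider for night 3 of participant 101,
# and omit night 2 of participant 102 from part 4
rows = [
    {"ID": "101", "relyonguider_part4": "3", "night_part4": ""},
    {"ID": "102", "relyonguider_part4": "", "night_part4": "2"},
]

buffer = io.StringIO()  # replace with an opened file to store the csv on disk
writer = csv.DictWriter(
    buffer,
    fieldnames=["ID", "relyonguider_part4", "night_part4"],
    lineterminator="\n",
)
writer.writeheader()
writer.writerows(rows)
content = buffer.getvalue()
print(content)
```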
In part 5 the sleep estimates from part 4 are used to describe 24-hour time use. Part 5 allows you to do this in two ways: in windows of literally 24 hours that start and end at the calendar day border (default midnight, but modifiable with argument dayborder), or from waking up to waking up. In GGIR we refer to the former as MM windows and to the latter as WW windows. The onset and waking times are guided by the estimates from part 4, but if they are missing part 5 will attempt to retrieve the estimate from the guider method, because even if the accelerometer was not worn during the night, or a sleep log is missing in a study where a sleep log was proposed to the participants, estimates from a sleep log or HDCZA can still be considered a reasonable estimate of the SPT window in the context of 24-hour time use analysis.
If WW is used in combination with ignoring the first and last midnight (argument excludefirstlast), then the first wake-up time (on the second recording day) needs to be extracted for the first WW day. This is done with the guider method. This also means that the last WW window ends on the second-to-last morning of the recording.
A distinction is made between the full results stored in the results/QC folder and the cleaned results stored in the results folder.
If you want to inspect the time series corresponding to these windows then see argument save_ms5rawlevels, which allows you to export the time series including behavioral classes and non-wear information to csv files. The behavioral classes are included as numbers. The legend for these classes is stored as a separate legend file in the meta/ms5.outraw folder, named “behavioralcodes2020-04-26.csv”, where the date corresponds to the date of analysis.
Additional input arguments that may be of interest:

- save_ms5raw_format is a character string to specify how data should be stored: either “csv” (default) or “RData”. Only used if save_ms5rawlevels=TRUE.
- save_ms5raw_without_invalid is Boolean to indicate whether to remove invalid days from the time series output files. Only used if save_ms5rawlevels=TRUE.

The time series output file comes with the following columns:

Column name | Description |
---|---|
SleepPeriodTime | Is 1 if SPT is detected, 0 if not. Note that this refers to the combined usage of guider and detected sustained inactivity bouts (rest periods). |
invalidepoch | Is 1 if epoch was detected as invalid (e.g. non-wear), 0 if not. |
guider | Is 1 if the guider method (e.g. sleeplog, HDCZA) detected the epoch as SPT, 0 if not. This column does not tell you which guider was used, for this see other GGIR output. |
window | Numeric indicator of the analysis window in the recording. If timewindow = “MM” then these correspond to calendar days, if timewindow = “WW” then these correspond to the waking-up to waking-up windows in the recording. So, in a recording of one week you may find window numbers 1, 2, 3, 4, 5 and 6. |
class_id | The behavioural class codes are documented in the exported csv file meta/ms5outraw/behaviouralcodes.csv. Class codes above class 8 will be analysis specific, because they depend on the number of time variants of the bouts used. For example, if you look at MVPA lasting 1-10, 10-20, 30-40 then all of them will have their own class_id. |
invalid_fullwindow | Fraction of the window (see above) that represents invalid data. I added this to make it easier to filter the time series based on whether days are valid or not. |
invalid_sleepperiod | Fraction of SPT within the current window that represents invalid data. |
invalid_wakinghours | Fraction of waking hours within the current window that represents invalid data. |
To familiarize yourself with these variables it can be helpful to plot them yourself.
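The sketch below illustrates one way to explore such a time series. To stay self-contained it builds a small toy data frame with a subset of the columns from the table above rather than reading an exported file. The values, the one-third threshold, and the variable names tsdata and valid_ts are all assumptions for illustration.

```r
# Toy time series with a subset of the columns described in the table above.
# In practice this data frame would be read from an exported time series file.
n_epochs <- 6
tsdata <- data.frame(
  window = rep(1:2, each = n_epochs),
  SleepPeriodTime = c(1, 1, 0, 0, 0, 0,  1, 1, 1, 0, 0, 0),
  invalidepoch    = c(0, 0, 0, 0, 0, 0,  1, 1, 1, 1, 0, 0),
  class_id        = c(0, 0, 2, 4, 2, 1,  0, 0, 0, 1, 2, 2)
)

# Fraction of invalid epochs per window (analogous to invalid_fullwindow)
tsdata$invalid_fullwindow <- ave(tsdata$invalidepoch, tsdata$window, FUN = mean)

# Keep only windows where at most one third of the epochs are invalid
valid_ts <- tsdata[tsdata$invalid_fullwindow <= 1/3, ]

# Plot the class codes and mark epochs inside the sleep period time
if (interactive()) {
  plot(valid_ts$class_id, type = "s", xlab = "Epoch", ylab = "class_id")
  in_spt <- which(valid_ts$SleepPeriodTime == 1)
  points(in_spt, valid_ts$class_id[in_spt], pch = 19)
}
```

In this toy example the second window is dropped because four of its six epochs are invalid.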
For users who also want to export the time series of multiple metric values, see argument epochvalues2csv, which relates to the storage of time series in GGIR part 2.
The full part 5 output is stored in the results/QC folder. The default inclusion criteria for days in the cleaned output from part 5 (stored in the results folder) are:

- includedaycrit.part5 (default 2/3).
- minimum_MM_length.part5 (default 23). Note that if your experiment started and ended in the middle of the day then this default setting will exclude those incomplete first and last days. If you think including these days is still meaningful for your work then adjust the argument minimum_MM_length.part5.

Important notes:

- results/QC folder.
- includenightcrit as used for part 4 is not used in part 5.

The data_cleaning_file argument discussed in Data_cleaning_file also allows you to tell GGIR which person(s) and day(s) should be omitted in part 5. The day numbers to be excluded should be listed in a column with day_part5 as header.
When input argument frag.metrics="all" is set, GGIR part 5 will perform daytime behavioural fragmentation analysis. Do this in combination with argument part5_agg2_60seconds=TRUE, as that will aggregate the time series to 1 minute resolution, as is common in behavioural fragmentation literature.
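To show how the part 5 arguments discussed so far fit together, here is a hedged sketch of a GGIR call. The argument names for part 5 are taken from the text above. The directory paths are hypothetical placeholders, and datadir, outputdir and mode are assumed from the general GGIR interface. The sketch also assumes a GGIR version that exports the function GGIR().

```r
# Hypothetical locations; replace with your own data and output directories
datadir <- "/path/to/accelerometer/files"
outputdir <- "/path/to/output"

ggir_args <- list(
  datadir = datadir,
  outputdir = outputdir,
  mode = 1:5,                         # run parts 1 to 5
  timewindow = c("MM", "WW"),         # report both window types in part 5
  dayborder = 0,                      # MM windows start at midnight
  excludefirstlast = FALSE,
  save_ms5rawlevels = TRUE,           # export the part 5 time series
  save_ms5raw_format = "csv",
  save_ms5raw_without_invalid = TRUE,
  includedaycrit.part5 = 2/3,
  minimum_MM_length.part5 = 23,
  frag.metrics = "all",               # behavioural fragmentation analysis
  part5_agg2_60seconds = TRUE         # aggregate to 1 minute epochs
)

# Only run when GGIR is installed and the input directory exists
if (requireNamespace("GGIR", quietly = TRUE) && dir.exists(datadir)) {
  do.call(GGIR::GGIR, ggir_args)
}
```

Keeping the arguments in a list makes it easy to store the exact configuration alongside the results.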
In GGIR, a fragment is defined as a sequence of epochs that belong to one of the four categories:

- Inactivity (IN)
- LIPA
- MVPA
- Physical activity (PA), which is LIPA and MVPA combined

Each of these categories represents the combination of bouted and
unbouted time in the respective categories. Inactivity and physical
activity add up to a full day, as well as inactivity, LIPA and MVPA. The
fragmentation metrics are applied in function g.fragmentation.
Literature about these metrics:

- CoV is calculated according to Blikman et al. 2014.
- Transition probability (TP) from Inactivity (IN) to Physical activity (IN2PA) and from Physical activity to inactivity (PA2IN) are calculated as 1 divided by the mean fragment duration. The transition probability from Inactivity to LIPA and MVPA are calculated as: (Total duration in IN followed by LIPA or MVPA, respectively, divided by total duration in IN) divided by the average duration in IN.
- Gini is calculated with function Gini from the ineq R package, with its argument corr set to TRUE.
- NFragPM is calculated identically to metric fragmentation in Chastin et al. 2012, but it is renamed here to be a more specific reflection of the calculation. The term fragmentation appears too generic given that all fragmentation metrics inform us about fragmentation. Please note that this is effectively the same metric as the transition probability, because total number divided by total sum in duration equals 1 divided by average duration. This is just different terminology for the same construct.

Conditions for calculation and value when condition is not met:

- Gini and CoV are only calculated if there are at least 10 fragments (e.g. 5 inactive and 5 active). If this condition is not met the metric value will be set to missing.
- mean_dur_PA and mean_dur_IN are calculated when there are at least 2 fragments (1 inactive, 1 active). If this condition is not met the value is set to zero.
- TP metrics are calculated if there is at least 1 inactivity fragment AND (1 LIPA OR 1 MVPA fragment). If this condition is not met the TP metric value is set to zero.

To keep an overview of which recording days met the criteria for non-zero standard deviation and at least ten fragments, GGIR part 5 stores variable Nvaliddays_AL10F at person level (= number of valid days with at least 10 fragments), and SD_dur (= standard deviation of fragment durations) at day level as well as aggregated per person.
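The relation between mean fragment duration, transition probability and the number of fragments per minute can be checked with a few lines of base R. This is a minimal sketch on a made-up minute-level sequence, not the code of g.fragmentation. The variable names and the toy data are assumptions for illustration.

```r
# Toy minute-level classification: "IN" = inactivity, "PA" = physical activity
classes <- c(rep("IN", 30), rep("PA", 5), rep("IN", 20), rep("PA", 10),
             rep("IN", 10), rep("PA", 15))

# Fragments are runs of consecutive epochs in the same category
runs <- rle(classes)
dur_IN <- runs$lengths[runs$values == "IN"]   # 30, 20 and 10 minutes
dur_PA <- runs$lengths[runs$values == "PA"]   # 5, 10 and 15 minutes

# Mean fragment duration and transition probability (1 / mean duration)
mean_dur_IN <- mean(dur_IN)
mean_dur_PA <- mean(dur_PA)
TP_IN2PA <- 1 / mean_dur_IN
TP_PA2IN <- 1 / mean_dur_PA

# Number of fragments per minute: number of fragments / total duration
NFragPM_IN <- length(dur_IN) / sum(dur_IN)

# Gini and CoV require at least 10 fragments; this toy day has only 6
n_fragments <- length(runs$lengths)
enough_for_gini_cov <- n_fragments >= 10

print(c(TP_IN2PA = TP_IN2PA, NFragPM_IN = NFragPM_IN))
```

Both printed values are identical, which illustrates why the two metrics describe the same construct.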
Difference between fragments and blocks:

Elsewhere in part 5 we use the term block. A block is a sequence of epochs that belong to the same behavioural class. This may sound similar to the definition of a fragment, but for blocks we distinguish every behavioural class, which includes the subcategories such as bouted and unbouted behaviour. This means that variables Nblock_day_total_IN and Nblock_day_total_LIG are identical to Nfrag_IN_day and Nfrag_LIPA_day, respectively.
Differences with R package ActFrag:
The fragmentation functionality is loosely inspired by the great work done by Dr. Junrui Di and colleagues in R package ActFrag, as described in Junrui Di et al. 2017.
However, we made a couple of different decisions that may affect comparability:
GGIR offers a range of acceleration metrics to choose from, but only one metric can be the default. Acceleration metric ENMO (Euclidean Norm Minus One with negative values rounded to zero) has been the default metric in GGIR. In 2013 we wrote a paper in which we investigated different ways of summarising the raw acceleration data. In short, different metrics exist and there is very little literature to support the superiority of any metric at the time. As long as different studies use different metrics their findings will not be comparable. Therefore, the choice for metric ENMO is partially pragmatic. GGIR uses ENMO as default because:
See also this blog post on this topic.
I wanted a short name and not to spend too much time finding it. At the time I was primarily working with GENEActiv and GENEA data in R, and that’s how the name GGIR was born: Short, easy to remember, and as acronym sufficiently vague to not be tied up with a specific functionality. However, later the functionality expanded to other sensor brands, so the abbreviation has lost its functional meaning.
Detection of the least active (LX) and most active (MX) X hours, where X is defined by argument winhr. For both, GGIR calculates the average acceleration, the start time, and, if argument iglevels is specified, also the intensity gradient. If argument winhr is a vector then descriptive values for LX and MX are derived per value in winhr. Within GGIR part 2 MXLX is calculated per calendar day and, if argument qwindow is specified, per segment of the day. Within GGIR part 5 MXLX is calculated per window, and if used in combination with the GENEActiv accelerometer brand, LUX estimates per LX and MX are included.
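The idea behind LX and MX can be sketched as a rolling mean over windows of winhr hours, followed by picking the lowest and highest window. The example below does this for X = 5 on a made-up day of 1 minute epochs. It is a simplified illustration under these assumptions and does not reproduce how GGIR handles window steps, incomplete days or the intensity gradient.

```r
# Toy day of minute-level average acceleration (mg):
# 8 h of rest, 10 h of higher activity, 6 h of moderate activity
acc <- c(rep(5, 8 * 60), rep(60, 10 * 60), rep(20, 6 * 60))

winhr <- 5                      # window length in hours
w <- winhr * 60                 # window length in epochs (1 minute epochs)

# Rolling mean over all complete windows, computed via cumulative sums
cs <- c(0, cumsum(acc))
n_win <- length(acc) - w + 1
roll <- (cs[(w + 1):length(cs)] - cs[1:n_win]) / w

# Least and most active 5 hours, with start time in hours since midnight
L5 <- min(roll)
L5_start_hour <- (which.min(roll) - 1) / 60
M5 <- max(roll)
M5_start_hour <- (which.max(roll) - 1) / 60

print(c(L5 = L5, L5_start_hour = L5_start_hour,
        M5 = M5, M5_start_hour = M5_start_hour))
```

If winhr is a vector, the same calculation would simply be repeated per value.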
The (Extended) Cosinor analysis quantifies the circadian 24 hour cycle. To do this GGIR uses R package ActCR as a dependency. Specify argument cosinor = TRUE to perform this analysis.
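For readers unfamiliar with the technique, the sketch below fits a plain single-component cosinor model with lm() to synthetic data. It is not the ActCR implementation used by GGIR and it does not cover the extended cosinor model. The synthetic signal, the hourly resolution and all variable names are assumptions for illustration. The log(acceleration in mg + 1) transform follows the implementation notes in this section.

```r
# Synthetic hourly acceleration (mg) for 3 days, constructed so that
# log(acc + 1) follows a cosine with mesor 3, amplitude 1 and a peak at 15:00
t_hours <- 0:71
omega <- 2 * pi / 24
y_true <- 3 + 1 * cos(omega * (t_hours - 15))
acc_mg <- exp(y_true) - 1

# Log transform, then fit y = mesor + b1 * cos(omega * t) + b2 * sin(omega * t)
y <- log(acc_mg + 1)
fit <- lm(y ~ cos(omega * t_hours) + sin(omega * t_hours))
b <- unname(coef(fit))

mesor <- b[1]                                        # rhythm-adjusted mean
amplitude <- sqrt(b[2]^2 + b[3]^2)                   # half the peak-to-trough range
acrophase_hour <- (atan2(b[3], b[2]) / omega) %% 24  # clock time of the fitted peak

print(c(mesor = mesor, amplitude = amplitude, acrophase_hour = acrophase_hour))
```

Because the synthetic data contain no noise, the fit recovers the parameters used to generate them.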
The implementation within GGIR part 2 is as follows:
- log(acceleration converted to mg + 1).

The default calculation of IV and IS uses IVIS.activity.metric = 1, which uses the continuous numeric acceleration values. However, as we later realised, this is not compatible with the original approach by van Someren and colleagues, which uses a binary distinction between active and inactive. Therefore, a second option has been added (IVIS.activity.metric = 2), which needs to be used in combination with accelerometer metric ENMO, and collapses the acceleration values into a binary score of rest versus active.
Further, the IV algorithm by van Someren has been modified in GGIR to ignore transitions between the epoch at the end of each day and the following epoch at the start of the next day, because technically these transitions do not relate to intradaily variability.
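The sketch below illustrates interdaily stability (IS) and intradaily variability (IV) on a days by epochs matrix, with IV based on within-day transitions only, as described above. It is a simplified illustration and not the GGIR code. In particular, dividing by the number of retained transitions (instead of N - 1 as in the original IV formula) is an assumption, as are the hourly resolution, the function name ivis and the threshold of 20 mg used for the binary variant.

```r
# x: numeric matrix with one row per day and one column per epoch of the day
ivis <- function(x) {
  n <- length(x)                 # total number of epochs
  p <- ncol(x)                   # number of epochs per day
  grand_mean <- mean(x)
  ss_total <- sum((x - grand_mean)^2)

  # IS: variance of the average 24 hour profile relative to the total variance
  profile <- colMeans(x)
  IS <- (n * sum((profile - grand_mean)^2)) / (p * ss_total)

  # IV: squared differences between consecutive epochs within each day only,
  # so transitions from the end of one day to the start of the next are ignored
  d <- x[, -1, drop = FALSE] - x[, -p, drop = FALSE]
  IV <- (n * sum(d^2)) / (length(d) * ss_total)

  list(IS = IS, IV = IV)
}

# Toy data: the same hourly pattern (mg) repeated on three days
pattern <- c(rep(5, 8), rep(60, 10), rep(15, 6))
x <- matrix(rep(pattern, 3), nrow = 3, byrow = TRUE)

res_continuous <- ivis(x)              # continuous acceleration values
res_binary <- ivis((x > 20) * 1)       # binary score: active versus inactive

print(unlist(res_continuous))
print(unlist(res_binary))
```

Because every day follows exactly the same pattern, IS equals 1 in both variants.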
IS is sometimes used as a measure of behavioural robustness when conducting Cosinor analysis. However, to work with the combination of the two outcomes it seems important that IS is calculated from the same time series. Therefore, when cosinor = TRUE, IV and IS are calculated twice: Once as part of the default IV and IS analysis as discussed above, and once as part of the Cosinor analysis using the same log transformed time series. More specifically, the IV and IS algorithm is applied with IVIS.activity.metric = 2 and IVIS_acc_threshold = log(20 + 1) to make the binary distinction between active and inactive, and IVIS_per_daypair = TRUE. The setting IVIS_per_daypair was specifically designed for this context to handle the potentially missing values in the time series as used for Cosinor analysis. Applying the default IVIS algorithm would not be able to handle the missing values and would result in a loss of information if all non-matching epochs across the entire recording were excluded. Instead, IV and IS are calculated as follows:
The new Cosinor-compatible IV and IS estimates are stored as output variables cosinorIV and cosinorIS.
A correct citation of research software is important to make your research reproducible and to acknowledge the effort that goes into the development of open-source software.
To do so, please report the GGIR version you used in the text. Additionally, please also cite:
If your work depends on the quantification of physical activity then also cite:
If you used the auto-calibration functionality then also cite:
If you used the sleep detection then also cite:
If you used the sleep detection without relying on sleep diary then also cite:
If you used the sleep regularity index then also cite:
The copyright of the GGIR logo lies with Accelting (Almere, The Netherlands). Please contact v.vanhees@acceleting.com to ask for permission to use this logo.