Summary
This is an overview of the Sleep Data Automation code, a project developed by the Mobile Technologies Core and the Sleep & Circadian Research Laboratory. It automates the process of cleaning sleep data from Fitbit fitness trackers, obtained via Fitabase, and comparing or joining it to self-reported sleep diary data. To make it easier for researchers to re-use the code, it was developed in Microsoft Excel, with optional components in Power Automate.
Details
Overview
As sleep tracking in consumer-grade wearable devices becomes more accurate, a growing number of researchers use these devices in their studies because they offer a convenient, low-cost way to capture sleep-related metrics in the participant's home. However, many factors outside the researcher's control can still affect the accuracy of sleep tracking - anything from reading in bed or watching TV, to soap from a shower, to the battery running out. Researchers therefore often use sleep diaries, which make it possible to correct sleep start and end times while also providing a measure of perceived sleep. Until now, merging sleep diary and fitness tracker data has been a tedious manual process involving many judgment calls.
The Mobile Technologies Core developed a way to automatically apply an algorithm used by the University of Michigan's Sleep & Circadian Research Laboratory to determine the correct sleep and wakeup times, calculate many sleep and HRV metrics, and complete other tedious processes. We also created an automation to parse electronic sleep diary entries that are sent as text messages to an Outlook mailbox. Because most researchers have access to Excel, we used Excel and its Power Query component to ensure reproducibility and to make the code reusable. The sleep diary automation was built in Power Automate, a cloud product from Microsoft for low-code workflow automation.
What This Automation Does
- Parse SMS-to-email messages to get a list of sleep markers (self-reported sleep/wake times) into a spreadsheet
- Decipher the correct sleep/wake times based on sleep stages from Fitabase (because Fitbit sometimes includes time in bed before and after sleep)
- Identify the main sleep episode and filter out naps, based on configurable settings (longest duration, latest sleep episode, etc.)
- From that, get the correct total time asleep, time awake, time in bed, and other measures
- Use the calculated sleep/wake times to get the correct HRV values falling only inside that time window, and calculate averages for RMSSD/HF/LF
- Combine sleep stages, HRV and email markers (sleep diary)
- Optionally, adjust the sleep/wakeup times automatically based on self-reported times when actual and self-reported times differ by more than a configurable number of minutes
- Optionally, enter manual time adjustments in an override file, which will override both Fitbit and self-reported times
- Present everything in a single sheet with one row per participant, per day, per sleep episode
- Show a weekly summary with a drop-down to select the study participant and the week
Algorithm Details
Base Algorithm
The base algorithm is part of the study titled "IBD-Sleep: A Pilot Study Looking at Changes in Sleep Timing and IBD Symptoms," whose principal investigator is Dr. Helen Burgess. We automate this in Power Query, a component of Microsoft Excel that allows merging multiple files and transforming the data one step at a time. This process also involves basic data cleanup, such as removing trailing spaces, making participant IDs uppercase, grouping, sorting, etc. The base algorithm does the following:
- It determines the correct sleep onset and wakeup times by walking through 30-second sleep stages and cutting out times at the beginning and the end in which the participant is still awake.
- It assigns the correct sleep day so that even if a participant went to bed after midnight, the episode still counts as sleep for the prior day.
- It compares sleep diary data to the calculated sleep onset and wakeup times to alert the researcher if there is a discrepancy between them. The research team can follow up with the participant (at weekly meetings) to determine why there is such a discrepancy, and make any needed adjustments.
- It calculates HRV metrics based on 5-minute HRV by looking for the data that falls only and completely inside the calculated sleep window.
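The steps above can be illustrated with a minimal Python sketch. This is not the project's code, which is implemented in Power Query. The function names, the stage labels ("wake", "light", "deep", "rem"), the tuple layout of the input rows, and the noon cutoff used to assign the sleep day are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

EPOCH = timedelta(seconds=30)  # length of one sleep-stage record


def trim_awake_edges(epochs):
    """Return (sleep_onset, wake_time) for one sleep episode.

    epochs: chronological list of (start_datetime, stage) tuples.
    Leading and trailing "wake" epochs are cut off; returns None if
    the participant is never asleep. Stage labels are assumed.
    """
    asleep = [i for i, (_, stage) in enumerate(epochs) if stage != "wake"]
    if not asleep:
        return None
    onset = epochs[asleep[0]][0]
    # The episode ends when the last asleep epoch ends.
    wake_time = epochs[asleep[-1]][0] + EPOCH
    return onset, wake_time


def sleep_day(onset, cutoff_hour=12):
    """Assign the episode to a calendar day.

    Onsets before cutoff_hour (assumed: noon) count for the prior day,
    so going to bed after midnight still belongs to the previous day.
    """
    if onset.hour < cutoff_hour:
        return (onset - timedelta(days=1)).date()
    return onset.date()


def mean_hrv_in_window(hrv_rows, onset, wake_time,
                       interval=timedelta(minutes=5)):
    """Average the HRV values whose 5-minute interval lies completely
    inside [onset, wake_time].

    hrv_rows: list of (interval_start_datetime, value) tuples.
    Returns None when no interval qualifies.
    """
    inside = [
        value
        for start, value in hrv_rows
        if start >= onset and start + interval <= wake_time
    ]
    return sum(inside) / len(inside) if inside else None
```

The same averaging function could be applied separately to RMSSD, HF and LF values. An interval that starts before sleep onset or ends after wakeup is dropped entirely rather than weighted, which matches the "only and completely inside" rule described above.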
Automated Sleep Diary Merging
In addition, we automated optional steps suggested by Dr. Cathy Goldstein, a professor of Neurology at the University of Michigan Sleep Disorders Center and faculty lead of the EFDC Mobile Technologies Core and MeTRIC. These optional enhancements include the ability to distinguish main sleep from naps based on configurable parameters, and to remove naps automatically. Furthermore, we can automatically adjust the sleep/wakeup times based on self-reported times when the actual and self-reported times differ by more than a configurable number of minutes. This is done by applying the base algorithm (filtering out times when the participant is actually awake at the beginning and end of the sleep episode), but inside the time window specified by the sleep diary.
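As a rough illustration of these two optional steps, the Python sketch below selects a main sleep episode and decides which time window the base algorithm should search. It is not the project's Power Query code. The function names, the rule names "longest" and "latest", and the 30-minute default threshold are assumptions.

```python
from datetime import datetime, timedelta


def pick_main_sleep(episodes, rule="longest"):
    """Pick the main sleep episode; the remaining ones are treated as naps.

    episodes: non-empty list of (onset, wake_time) tuples for one
    participant and sleep day. The rule names are hypothetical.
    """
    if rule == "longest":
        return max(episodes, key=lambda e: e[1] - e[0])
    if rule == "latest":
        return max(episodes, key=lambda e: e[0])
    raise ValueError(f"unknown rule: {rule}")


def search_window(tracker, diary, threshold_minutes=30):
    """Choose the window in which to trim awake time.

    tracker, diary: (onset, wake_time) tuples. If either the onset or
    the wakeup time differs by more than the threshold (assumed default:
    30 minutes), the diary window is used; otherwise the tracker window.
    """
    limit = timedelta(minutes=threshold_minutes)
    onset_gap = abs(tracker[0] - diary[0])
    wake_gap = abs(tracker[1] - diary[1])
    if onset_gap > limit or wake_gap > limit:
        return diary
    return tracker
```

When the diary window is chosen, the awake-time trimming would then be re-run on only the sleep-stage records that fall inside that window, as described above.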
Override Spreadsheet
We also implemented a way to manually override sleep times using an Override Spreadsheet. This method was used by Moony Rizvydeen and Zainab Fayyaz in Dr. Burgess' study team, but it required manual re-calculation of all metrics in a sleep episode. By automating this in Power Query, study team members can use the Override Spreadsheet as one of the inputs, so all they have to do is click Refresh in the output spreadsheet. This also allows the team to keep a log of manual overrides, along with the reason (e.g. the sleep diary was more accurate because the participant was reading in bed). This step works like the sleep diary adjustment (calculating sleep/wakeup times inside the specified time window), but it takes precedence over sleep diary entries.
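The precedence described above (manual override first, then sleep diary, then Fitbit) can be sketched in Python as follows. The layout of the override log, its keys, and the function name are hypothetical and do not reflect the actual columns of the Override Spreadsheet.

```python
from datetime import date, datetime


def resolve_window(participant_id, sleep_day, tracker, overrides, diary=None):
    """Return (source, window, reason) for one sleep episode.

    overrides: dict keyed by (participant_id, sleep_day); each entry
    holds a "window" tuple and a free-text "reason" (assumed layout).
    diary: the diary window, passed only when a diary adjustment applies.
    Precedence: manual override, then sleep diary, then Fitbit.
    """
    entry = overrides.get((participant_id, sleep_day))
    if entry is not None:
        return "override", entry["window"], entry["reason"]
    if diary is not None:
        return "diary", diary, None
    return "fitbit", tracker, None


# Hypothetical override log with one manual correction.
example_overrides = {
    ("P001", date(2024, 1, 1)): {
        "window": (datetime(2024, 1, 1, 23, 45), datetime(2024, 1, 2, 6, 30)),
        "reason": "participant was reading in bed",
    },
}
```

Returning the source alongside the window keeps a record of which input determined each episode's times, in the spirit of the override log mentioned above.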
Sleep Diary SMS-to-Email Parsing
An optional automation was created in Power Automate. Participants keep an electronic sleep diary by simply sending a text message to a shared mailbox, with their participant ID and the keywords "sleep time" or "wake time". This automation parses the SMS-to-email messages in that shared Outlook mailbox, creating a structured table with the participant's sleep and wakeup times. It then saves the data to a spreadsheet in OneDrive for Business, which can be used by the Power Query automation. Researchers who use paper sleep diaries can still use the OneDrive spreadsheet, but they must enter the data from paper into the spreadsheet manually.
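The parsing step can be illustrated with a short Python sketch. The real automation is a Power Automate flow, and the exact message format is not specified here, so this sketch makes two assumptions: the message body contains the participant ID immediately followed by "sleep time" or "wake time", and the marker time is taken from the email's received timestamp, which the caller supplies.

```python
import re
from datetime import datetime

# Assumed format: "<participant id> sleep time" or "<participant id> wake time".
MARKER_RE = re.compile(
    r"\b(?P<pid>[A-Za-z0-9_-]+)\s+(?P<kind>sleep|wake)\s+time\b",
    re.IGNORECASE,
)


def parse_marker(body, received):
    """Turn one message body into a structured sleep-marker row.

    body: text of the SMS-to-email message.
    received: datetime the message arrived (assumed to be the marker time).
    Returns a dict, or None if the body does not match the assumed format.
    """
    match = MARKER_RE.search(body)
    if match is None:
        return None
    return {
        "participant_id": match.group("pid").upper(),  # IDs are uppercased
        "marker": match.group("kind").lower(),         # "sleep" or "wake"
        "received": received,
    }
```

Messages that return None could be routed to a separate folder for manual review instead of being dropped silently.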
Output
The output is a spreadsheet with two sheets: one containing the calculated sleep and HRV results in detail (one row per participant per sleep episode), and the other containing weekly summary data (with a drop-down to select the week and the participant).
How to Use
Get the Code
The code for this project is available on GitHub at: https://github.com/DepressionCenter/SleepDataAutomation
Quick Start Guide
- Download the Excel-PowerQuery directory
- Save your Fitbit, sleep markers, and sleep override files to the CSV directory
- Open CleanSleepData.xlsx, adjust the parameters in Power Query, and refresh all data
- To use SMS-to-email sleep markers, import the Power Automate zip file into your Power Automate environment, adjust the mailbox name and output location for OneDrive, and create the needed directories in the mailbox
Notes
- This is the first of a series of articles documenting the Sleep Data Automation code. More detailed instructions will be available in the upcoming weeks.
Resources
Citation
If you find this repository, code, article or paper useful for your research, please cite it.
Mongefranco, Gabriel; Rizvydeen, Moony; Fayyaz, Zainab; Burgess, Helen; Goldstein, Cathy (2024). Sleep Data Automation. University of Michigan. Software. https://michmed.org/sleepdata
DOI: 10.6084/m9.figshare.25669173.v1
About the Author
Gabriel Mongefranco is a Mobile Data Architect at the University of Michigan Eisenberg Family Depression Center. Gabriel has over a decade of experience in data analytics, dashboard design, automation, back end software development, database design, middleware and API architecture, and technical writing.