Quantitative Medicine Scientist, Critical Path Institute, United States
Disclosure(s):
Mingyuan Wang, PhD: No financial relationships to disclose
Objectives: Digital health technology (DHT) monitors patients’ vital signs and activities in clinical and free-living environments, enabling decentralized trials, closed-loop devices, and digital therapies. Activity annotation from passive sensor data is critical but challenging. This abstract evaluates unique DHT-derived feature groups, assessing their impact on annotation models. The research reveals feature group importance for free-living activity classification, supporting future research on patient monitoring and sensor applications for targeted activities.
Methods: The CAPTURE-24 [1] dataset contains 100 Hz real-time wrist accelerometer data from 151 free-living participants, annotated via wearable camera images and sleep diaries. The Willetts mapping [2] was used to categorize activities into 10 labels (e.g., manual work, mixed activity, sitting, sports, walking). The original study extracted and benchmarked 32 features (the original features). We added 118 new features belonging to 7 unique feature groups. Six of these groups measure forearm orientation, instantaneous device angle, activity intensity, signal shape, frequency, and statistical quality; the seventh includes features that are conceptually similar to the originals but computed with different signal processing strategies. Feature extraction was performed in non-overlapping 30-second windows. An XGBoost [3] model was chosen for consistency with the original study's benchmarks. Each new feature group was added individually to assess its impact on individual classes. Performance across all classes was evaluated using all features. The models were evaluated using the F1-score. Paired t-tests compared models across 30 repetitions of a subject-independent random split (66%/34% train/test). The Benjamini-Yekutieli method [4] was applied to mitigate multiple comparison issues. Feature selection was also carried out using SHAP [5].
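The windowing, subject-independent split, and multiple-comparison steps described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the random seed, and the way the 66% fraction is rounded are assumptions made for illustration only, and the feature extraction, XGBoost training, and paired t-test itself are omitted (a paired t-test could be computed with, for example, `scipy.stats.ttest_rel`).

```python
import random


def window_signal(samples, rate_hz=100, window_s=30):
    """Split a sample sequence into non-overlapping windows.

    A trailing partial window is dropped (an assumption; the abstract
    does not say how incomplete windows were handled).
    """
    size = rate_hz * window_s
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def subject_split(subject_ids, train_frac=0.66, seed=0):
    """Assign whole subjects to train or test so no subject is in both."""
    subjects = sorted(set(subject_ids))
    rng = random.Random(seed)  # hypothetical seed, one per repetition
    rng.shuffle(subjects)
    n_train = round(train_frac * len(subjects))
    return set(subjects[:n_train]), set(subjects[n_train:])


def benjamini_yekutieli(p_values):
    """Return Benjamini-Yekutieli adjusted p-values in the input order.

    Each sorted p-value p_(k) is scaled by m * c(m) / k, where
    c(m) = sum_{j=1..m} 1/j, and monotonicity is enforced from the
    largest p-value downward. Adjusted values are capped at 1.
    """
    m = len(p_values)
    c_m = sum(1.0 / k for k in range(1, m + 1))
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted
```

In a setup like the one described, `subject_split` would be called once per repetition with a different seed, and `benjamini_yekutieli` would be applied to the p-values collected from the per-class or per-feature-group comparisons.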
Results: The feature-group-specific analysis showed several associations by activity: forearm orientation features were more strongly linked to bicycling and household chores; signal-shape-based features to sitting, sleeping, standing, and walking; and summary statistics features to manual work and vehicle use. The model with all features performed significantly better than the model with only the original features (average F1: 0.492 vs 0.442, p-value < 0.0001). The model can be further simplified to 70 total features or 50 new features while achieving a slightly higher F1-score. The improvement came mainly from bicycling, household chores, manual work, sitting, standing, vehicle use, and walking. Forearm orientation, signal shape, and frequency were the main feature groups contributing to this improvement.
Conclusions: This work demonstrates that adding unique feature groups improves annotation both for individual classes and across all classes. It also clarifies the association between specific feature groups and individual classes, supporting a more targeted approach to feature integration in future research and applications that focus on particular patient activity classes. Furthermore, the analysis shows which feature groups to prioritize when overall prediction is the goal.
Citations: [1] Chan, S., Hang, Y., Tong, C., Acquah, A., Schonfeldt, A., Gershuny, J. and Doherty, A., 2024. CAPTURE-24: A large dataset of wrist-worn activity tracker data collected in the wild for human activity recognition. Scientific Data, 11(1), p.1135. [2] Willetts, M., Hollowell, S., Aslett, L., Holmes, C. and Doherty, A., 2018. Statistical machine learning of sleep and physical activity phenotypes from sensor data in 96,220 UK Biobank participants. Scientific Reports, 8, pp.1–10. [3] Chen, T. and Guestrin, C., 2016, August. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785-794). [4] Benjamini, Y. and Yekutieli, D., 2001. The control of the false discovery rate in multiple testing under dependency. Annals of Statistics, pp.1165-1188. [5] Lundberg, S.M. and Lee, S.I., 2017. A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30.
Keywords: activity classification, digital health technology, feature extraction