Associate Director, Data Science, Kyowa Kirin Inc., Conshohocken, Pennsylvania, United States
Disclosure(s):
Dimple Patel: No financial relationships to disclose
Objectives: High-quality data is essential for pharmacokinetic/pharmacodynamic (PK/PD) analysis, because the accuracy and reliability of the results depend on the integrity of the data. Creating a NONMEM dataset for modeling involves integrating data across SDTM and ADaM domains, followed by data transformation, handling, cleaning, and imputation. Each of these steps can introduce errors, so NONMEM datasets need quality control (QC) methods that verify accuracy and integrity without excessive time spent on manual checks. A semi-automated process can significantly reduce QC time while ensuring data accuracy and quality for modeling.
Methods: A PK/PD NONMEM dataset was created in RStudio from multiple simulated source datasets in SDTM standard format, including demographics, dose administration, PK concentration, and laboratory results. A quality control (QC) checklist was established for efficient identification and resolution of discrepancies, ensuring data integrity. An R Markdown script was set up based on the QC checklist to compare summary statistics and records from the NONMEM dataset with the corresponding source data. Additional checks for missing or out-of-range values were performed using basic functions (is.na(), conditional checks). Visualizations, such as scatter plots, were generated to provide a graphical comparison of concentrations in both datasets.
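The checks described in the Methods can be illustrated with a minimal sketch. The work reported here was done in R Markdown; the Python below is only an analogue of the idea, not the script itself. All records, concentration values, the 0 to 50 plausibility range, and the helper names (`is_missing`, `summarize`, `out_of_range`, `find_mismatches`) are hypothetical and chosen for illustration.

```python
import math
from statistics import mean

# Hypothetical source PK concentration records (SDTM-like) and the derived
# NONMEM-like rows. The last NONMEM row carries a deliberate error (95.0
# instead of 9.5) so the checks have something to find.
source_pc = [
    {"USUBJID": "S-001", "TIME": 1.0, "CONC": 12.5},
    {"USUBJID": "S-001", "TIME": 2.0, "CONC": 8.0},
    {"USUBJID": "S-002", "TIME": 1.0, "CONC": None},
    {"USUBJID": "S-002", "TIME": 2.0, "CONC": 9.5},
]
nonmem = [
    {"ID": "S-001", "TIME": 1.0, "DV": 12.5},
    {"ID": "S-001", "TIME": 2.0, "DV": 8.0},
    {"ID": "S-002", "TIME": 1.0, "DV": None},
    {"ID": "S-002", "TIME": 2.0, "DV": 95.0},
]


def is_missing(value):
    """Rough analogue of R's is.na(): True for None or NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def summarize(values):
    """Summary statistics over the non-missing values."""
    present = [v for v in values if not is_missing(v)]
    return {
        "n": len(present),
        "n_missing": len(values) - len(present),
        "mean": mean(present) if present else None,
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


def out_of_range(rows, column, low, high):
    """Rows whose non-missing value falls outside [low, high]."""
    return [
        r for r in rows
        if not is_missing(r[column]) and not (low <= r[column] <= high)
    ]


def find_mismatches(source_rows, nonmem_rows, tol=1e-9):
    """Compare DV against the source concentration by subject and time."""
    source_by_key = {(r["USUBJID"], r["TIME"]): r["CONC"] for r in source_rows}
    mismatches = []
    for row in nonmem_rows:
        key = (row["ID"], row["TIME"])
        src, dv = source_by_key.get(key), row["DV"]
        if is_missing(src) and is_missing(dv):
            continue  # missing in both datasets: consistent
        if is_missing(src) or is_missing(dv) or abs(src - dv) > tol:
            mismatches.append((key, src, dv))
    return mismatches


src_summary = summarize([r["CONC"] for r in source_pc])
nm_summary = summarize([r["DV"] for r in nonmem])
flagged = out_of_range(nonmem, "DV", low=0.0, high=50.0)
mismatches = find_mismatches(source_pc, nonmem)

print("source summary:", src_summary)
print("NONMEM summary:", nm_summary)
print("out-of-range rows:", flagged)
print("record mismatches:", mismatches)
```

In this sketch the summary comparison, the range check, and the record-level comparison all point to the same erroneous row, which is the kind of agreement a side-by-side QC report is meant to make visible.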
Results: Summary statistics and dataset comparison results were generated from the NONMEM dataset and the corresponding SDTM/ADaM domains and displayed side by side in the R Markdown HTML output. The table of contents in the HTML output allows scientists to navigate the results efficiently. The semi-automated R Markdown quality check code streamlines the comparison of the NONMEM dataset with its source data. All steps are documented and reproducible through R Markdown, ensuring transparency and ease of updates.
Conclusions: The semi-automated quality check process developed using R Markdown provides an efficient and reproducible method for comparing summary statistics and datasets and for identifying discrepancies between NONMEM datasets and their corresponding source data. Because the source data domains and the NONMEM dataset follow standard formats, the R Markdown script can be reused as a template and adapted for new projects. This semi-automated process can significantly reduce QC time while ensuring data accuracy and quality for modeling. Furthermore, it provides an audit trail for documentation and reporting purposes.