“From architecture to execution, I help teams build trustworthy data environments.”
Healthcare Data Warehousing for Research and Analytics
Design, refine, and analyze healthcare data to facilitate medical research and improve care.
As a Professor of Genetics at the Mount Sinai School of Medicine, I played a key role on the team that created the school's clinical data warehouse (CDW), the Mount Sinai Data Warehouse (MSDW).
Configured the web app Leaf to access MSDW
Identified problems with the ETL process that loaded electronic health records (EHR) into MSDW, thereby improving its quality and utility
Integrated standard healthcare codes from the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) into the ETL, which standardized more data in MSDW
Services include:
Evaluate your clinical data warehouse
Evaluate the team creating your clinical data warehouse
Case Study: Configure an easy-to-use web app that medical researchers use to query a huge clinical data warehouse
The data in clinical data warehouses are voluminous and complex. Arthur made it easy for medical researchers to access these data at the Mount Sinai School of Medicine.
The Mount Sinai Data Warehouse (MSDW) contains data about more than 105 million healthcare encounters with over 12 million individual patients.
MSDW organizes its data with the widely used OMOP Common Data Model (CDM). The current CDM (v5.4) contains 39 tables with about 1,000 fields.
Medical researchers at Mount Sinai want to use these data in their research. For example, a study might investigate men between 30 and 50 years old who tested positive for Covid, split into two groups: those who had been vaccinated before contracting Covid and those who had not.
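To show why such a study is hard to run by hand, here is a minimal sketch of the example cohort as a query over three OMOP-style tables (person, measurement, drug_exposure). It is an illustration only, not a query from MSDW: it uses an in-memory SQLite database, keeps just a few columns per table, and treats the concept IDs for the Covid test, its result, and the vaccine as placeholders. Age is approximated from the year of birth.

```python
import sqlite3

MALE = 8507           # OMOP standard concept ID for male gender
FEMALE = 8532         # OMOP standard concept ID for female gender
COVID_TEST = 1001     # placeholder, not a real OMOP concept ID
POSITIVE = 1002       # placeholder, not a real OMOP concept ID
NEGATIVE = 1003       # placeholder, not a real OMOP concept ID
COVID_VACCINE = 1004  # placeholder, not a real OMOP concept ID


def build_demo_db():
    """Create a tiny in-memory database with three simplified OMOP-style tables."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE person (person_id INTEGER PRIMARY KEY,
                             gender_concept_id INTEGER, year_of_birth INTEGER);
        CREATE TABLE measurement (person_id INTEGER, measurement_concept_id INTEGER,
                                  value_as_concept_id INTEGER, measurement_date TEXT);
        CREATE TABLE drug_exposure (person_id INTEGER, drug_concept_id INTEGER,
                                    drug_exposure_start_date TEXT);
    """)
    con.executemany("INSERT INTO person VALUES (?, ?, ?)", [
        (1, MALE, 1985), (2, MALE, 1980), (3, FEMALE, 1985),
        (4, MALE, 1950), (5, MALE, 1990), (6, MALE, 1985),
    ])
    con.executemany("INSERT INTO measurement VALUES (?, ?, ?, ?)", [
        (1, COVID_TEST, POSITIVE, "2021-03-01"),
        (2, COVID_TEST, POSITIVE, "2021-02-10"),
        (3, COVID_TEST, POSITIVE, "2021-02-11"),
        (4, COVID_TEST, POSITIVE, "2021-02-12"),
        (5, COVID_TEST, POSITIVE, "2021-01-05"),
        (6, COVID_TEST, NEGATIVE, "2021-01-06"),
    ])
    con.executemany("INSERT INTO drug_exposure VALUES (?, ?, ?)", [
        (1, COVID_VACCINE, "2021-01-15"),  # before the positive test
        (5, COVID_VACCINE, "2021-06-01"),  # after the positive test
    ])
    return con


# Men aged 30 to 50 at their first positive test, flagged by whether a
# vaccine exposure precedes that test. ISO dates compare correctly as text.
COHORT_SQL = """
WITH first_positive AS (
    SELECT person_id, MIN(measurement_date) AS test_date
    FROM measurement
    WHERE measurement_concept_id = :test AND value_as_concept_id = :positive
    GROUP BY person_id
)
SELECT p.person_id,
       EXISTS (SELECT 1 FROM drug_exposure d
               WHERE d.person_id = p.person_id
                 AND d.drug_concept_id = :vaccine
                 AND d.drug_exposure_start_date < fp.test_date) AS vaccinated_before
FROM person p
JOIN first_positive fp ON fp.person_id = p.person_id
WHERE p.gender_concept_id = :male
  AND CAST(strftime('%Y', fp.test_date) AS INTEGER) - p.year_of_birth BETWEEN 30 AND 50
ORDER BY p.person_id
"""


def split_cohort(con):
    """Return (vaccinated, unvaccinated) lists of person IDs for the cohort."""
    params = {"male": MALE, "test": COVID_TEST,
              "positive": POSITIVE, "vaccine": COVID_VACCINE}
    vaccinated, unvaccinated = [], []
    for person_id, vaccinated_before in con.execute(COHORT_SQL, params):
        (vaccinated if vaccinated_before else unvaccinated).append(person_id)
    return vaccinated, unvaccinated
```

Calling split_cohort(build_demo_db()) returns the two groups as lists of patient IDs. Even this simplified version needs a join, a correlated subquery, and knowledge of which table holds which fact, which is a lot to ask of a researcher facing 39 tables.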
But the medical researchers were overwhelmed by MSDW's size and complexity.
Arthur solved this problem by setting up an easy-to-use, self-service web app called Leaf that lets medical researchers drag and drop medical terms to select patient cohorts for their studies.
This kind of app is called a Cohort Query Tool (CQT), because each query of the clinical data warehouse finds a set, or cohort, of patients.
First, Arthur identified a suitable app. Sinai lacked the time and staff to build a new app, and good apps already existed, so the effort needed to reuse an existing one. The web app Arthur chose is called Leaf.
This video by Nic Dobbins, the creator of Leaf, demonstrates how it works. The left side of the white pane contains lists of standardized medical concepts. For example, at 1:15 in the video, Nic expands the Procedures concepts and then drags the concept "Had an operation on the endocrine system" into a filter box on the right. That narrows the set of patients being retrieved to those who have had this procedure.
Then Arthur created and executed a plan to make Leaf work at Sinai:
He installed Leaf
He configured its security credentials and connected it to MSDW
He learned Leaf's configuration language, and created code that mapped its concepts to the clinical data in MSDW, such as diagnoses and medications
He also mapped demographics, such as age, gender, and race, between Leaf and MSDW
He directed the MSDW system administrators to index the database so that Leaf queries ran quickly
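The mapping and indexing steps can be sketched in a few lines. This is a hypothetical stand-in, not Leaf's actual configuration format, which is not shown here: each Concept below pairs a display name with the table and WHERE fragment that select matching patients, and the diagnosis concept ID (2001) and index name are made up for the demo. SQLite is used only so the sketch runs anywhere.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """Hypothetical concept definition: display name plus the SQL that finds its patients."""
    name: str
    table: str
    where: str


CONCEPTS = {
    # 8507 is the OMOP standard concept ID for male gender.
    "male": Concept("Gender: Male", "person", "gender_concept_id = 8507"),
    # 2001 is a placeholder, not a real OMOP concept ID.
    "diagnosis_x": Concept("Diagnosis: X", "condition_occurrence",
                           "condition_concept_id = 2001"),
}


def cohort_sql(keys):
    """Intersect the patient sets of the selected concepts."""
    parts = [f"SELECT person_id FROM {CONCEPTS[k].table} WHERE {CONCEPTS[k].where}"
             for k in keys]
    return " INTERSECT ".join(parts)


def uses_index(con, sql, index_name):
    """Report whether SQLite's query plan for `sql` mentions the given index."""
    plan = con.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return any(index_name in row[3] for row in plan)  # row[3] is the plan detail text


def build_demo_db():
    """Create a tiny in-memory database with two simplified OMOP-style tables."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE person (person_id INTEGER PRIMARY KEY, gender_concept_id INTEGER);
        CREATE TABLE condition_occurrence (person_id INTEGER, condition_concept_id INTEGER);
    """)
    con.executemany("INSERT INTO person VALUES (?, ?)",
                    [(1, 8507), (2, 8532), (3, 8507)])
    con.executemany("INSERT INTO condition_occurrence VALUES (?, ?)",
                    [(1, 2001), (2, 2001), (3, 3001)])
    return con


def add_condition_index(con):
    """Index the column that diagnosis concepts filter on (index name is illustrative)."""
    con.execute("CREATE INDEX idx_condition_concept "
                "ON condition_occurrence (condition_concept_id, person_id)")
```

With the demo data, cohort_sql(["male", "diagnosis_x"]) selects the one male patient with the diagnosis. Calling uses_index before and after add_condition_index shows the diagnosis lookup switching from a full table scan to an index search, which is the kind of change that keeps interactive queries fast on a warehouse with millions of patients.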