Dr. Mike Hogarth & an institutional COVID registry: Creating a COVID-19 registry in 28 days
In April 2020 we built COVID-19 patient registries, in collaboration with Dr. Mike Hogarth, within one month
Working within the secure data environment, we allowed researchers to investigate COVID-19 patient data from his institution and the multi-institution datasets
The COVID-19 patient registries can be used as the basis for creating on-demand patient registries in other therapeutic areas
About Dr. Mike Hogarth
Mike Hogarth, MD, FACMI, FACP
Biomedical Informatics at UCSD Health
“Leading biomedical informatician Michael Hogarth, MD, has joined…as Clinical Research Information Officer (CRIO)… In this role, he will lead efforts to align and integrate resources in Information Services with clinical research informatics. He will also develop and optimize strategies to use our current technology and systems to support the integration of research with patient care, including the use of the electronic medical record (EMR) for clinical trial recruitment and EMR data for observational research.”
“The power of this [platform] is not just the flexibility and the configurability and some simplicity. But the power is working with investigators who do have the questions to then configure analyses that they and their colleagues might want, or find useful.”
– Mike Hogarth, MD, FACMI, FACP,
Clinical Professor, Biomedical Informatics at UCSD Health
In January 2020, Dr. Hogarth and Tag.bio met at the PMWC (Precision Medicine World Conference). The initial conversation led to a meeting in San Diego where we discussed using the Tag.bio platform to collaborate on the analysis of patient data.
We met again at the end of February and realized that the COVID-19 pandemic would become ubiquitous. In response to this challenge, we formed a rapid collaboration to analyze COVID-19 patient data within his institution and the institution’s five health campuses. The intent was to rapidly build data products that would allow analysis of the extent and effect of COVID-19 on patients.
At the beginning of 2020 it became clear that the COVID-19 pandemic was progressing rapidly. To meet the needs of researchers, we had to create data tools within weeks.
His institution had already implemented a robust and compliant cloud-based system called the VRD (virtual research desktop), which lets users access sensitive or proprietary data through a secure portal. This allowed Tag.bio to concentrate on building the data products and to achieve compliance simply by deploying within the VRD.
The first challenge was to map data into a form that could be used by Tag.bio data products. The OMOP (Observational Medical Outcomes Partnership) common data model had been chosen as the concept model for both COVID datasets, so in theory the data should have been easily cross-comparable.
However, it became obvious that while using the same concept model (OMOP), each institution mapped the underlying patient data to OMOP concept IDs in different ways. They also chose a range of different data elements to include in the extracted dataset.
We used the Tag.bio automated parsers, together with our experience with the OMOP model, to rapidly create analysis-ready data. Our analysis apps are modular; they were constructed in hours and then pointed at the two datasets. Some modifications were needed between the datasets, but these are easier to make when you can start from a basic pattern.
Because his institution had already set up a usable, compliant environment for its researchers, we were able to simply deploy within it. The data itself was already de-identified and approved for use through DUAs (data use agreements).
Data format and mapping
Having worked with OMOP before, we were able to rapidly map the OMOP concept IDs to human-readable text; Athena was invaluable here. Once we have created a basic template for OMOP, we modify that base to incorporate the mapping details of each individual dataset. Here we were able to harmonize some of the choices around concept IDs between the multi-institutional COVID datasets and Dr. Hogarth’s institutional data.
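The template-plus-modifications approach to mapping can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Tag.bio parser: the names `BASE_LABELS`, `SOURCE_OVERRIDES`, and `label_for` are hypothetical, the source names are placeholders, and the local codes in the override table are invented for the example. Only the two gender concept IDs (8507 and 8532) are standard OMOP concepts.

```python
# Minimal sketch: a base OMOP label template plus per-source overrides.
# All names and the "local" codes below are hypothetical.

# Base template: standard OMOP concept IDs mapped to readable labels
# (the kind of lookup one would do in Athena).
BASE_LABELS = {
    8507: "Male",
    8532: "Female",
}

# Per-source overrides: a dataset may encode the same idea with different
# codes, so those codes are first translated to the template's concept ID.
SOURCE_OVERRIDES = {
    "source_a": {},                  # already uses the template's IDs
    "source_b": {900001: 8507,       # invented local code -> Male
                 900002: 8532},      # invented local code -> Female
}


def label_for(source: str, concept_id: int) -> str:
    """Return a readable label, harmonizing source-specific codes first."""
    harmonized = SOURCE_OVERRIDES.get(source, {}).get(concept_id, concept_id)
    return BASE_LABELS.get(harmonized, f"Unmapped concept {concept_id}")


print(label_for("source_a", 8507))
print(label_for("source_b", 900002))
```

Keeping the overrides separate from the base template means a new dataset only needs its own small override table, while unmapped IDs stay visible instead of being silently dropped.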
Creating data products
To enable domain experts, such as researchers and physicians, to work with the COVID-19 data, we built sets of data products that use point-and-click analysis apps to investigate the data. The multi-institutional COVID datasets and Dr. Hogarth’s institute’s specific datasets do not contain identical data elements (for example, blood type is present in the multi-institutional COVID datasets but not in his institute’s dataset).
One of the advantages of working with a data mesh architecture is that you can rapidly build products from any subset of data. With the COVID-19 tested populations we chose to create a data product from each of the data sources (his institute’s specific data and the multi-institutional COVID datasets), and then created sub-products from each of these larger datasets that were composed only of the patients who tested positive.
This resulted in four data products:
- Data product 1: All patients from the multi-institutional COVID datasets who have been tested for COVID-19
- Data product 1a: Patients from the multi-institutional COVID datasets who have been tested positive for COVID-19
- Data product 2: All patients from Dr. Hogarth’s institutional dataset who have been tested for COVID-19
- Data product 2a: Patients from Dr. Hogarth’s institutional dataset who have been tested positive for COVID-19
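As a rough sketch of how the four products relate to each other, the snippet below derives a "tested" product and a "tested positive" sub-product from each source. The field names (`covid_tested`, `covid_positive`), the function `build_data_products`, and the sample records are assumptions made for illustration; they are not taken from the actual datasets or from the Tag.bio platform.

```python
from typing import Dict, List

Patient = Dict[str, object]


def build_data_products(sources: Dict[str, List[Patient]]) -> Dict[str, List[Patient]]:
    """For each source, build a 'tested' product and a 'positive' sub-product."""
    products: Dict[str, List[Patient]] = {}
    for name, patients in sources.items():
        # Product 1 / 2: everyone who has been tested for COVID-19.
        tested = [p for p in patients if p.get("covid_tested")]
        # Product 1a / 2a: the subset of tested patients who tested positive.
        positive = [p for p in tested if p.get("covid_positive")]
        products[f"{name}_tested"] = tested
        products[f"{name}_positive"] = positive
    return products


# Tiny synthetic example (not real patient data).
sources = {
    "multi_institutional": [
        {"id": 1, "covid_tested": True, "covid_positive": True},
        {"id": 2, "covid_tested": True, "covid_positive": False},
        {"id": 3, "covid_tested": False, "covid_positive": False},
    ],
    "institutional": [
        {"id": 10, "covid_tested": True, "covid_positive": True},
        {"id": 11, "covid_tested": True, "covid_positive": True},
    ],
}

products = build_data_products(sources)
for product_name, members in products.items():
    print(product_name, len(members))
```

Because each sub-product is derived from its parent by a filter, the positive-only products always stay consistent with the tested populations they come from.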
To accelerate the learning curve among researchers and physicians, we built the data products’ analysis apps to function in similar ways.
The standard app types we built were:
The specialty app types we built were:
- Cox survival
These apps enable the researchers and physicians to run analyses without having to write code. The users simply need to point-and-click to select parameters and hit run to perform the analysis and be taken to the results.
To build any intuitive app you have to know what questions people want to ask. Our approach has always been to include researchers and physicians from the beginning of the app building process. This is done by demoing the basic apps to physicians and asking them to answer their own questions with the apps. Their requests for new data types or ways to group data or ways to extract data are then included as the apps are iterated. Below are some examples of how their questions got turned into apps:
“If the COVID-positive patients have a pre-existing condition, what are their outcomes?”
“How many patients being admitted to the hospital needed mechanical ventilation?”
“What are the risk factors for patients being on a ventilator?”
“What are the characteristics of the COVID-positive patients who stayed in the ICU?”
Since all of these apps use similar basic components, the domain experts don’t need to re-learn how to run them when they are pointed to different underlying data. This not only makes it faster to learn the system but makes it easier to pivot from one dataset to another and run the comparable analyses on different datasets.
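The idea of one app being pointed at different underlying data can be illustrated with a small sketch. Here a single, hypothetical `proportion_app` answers a question such as "How many patients being admitted to the hospital needed mechanical ventilation?" for two datasets without any change to the app itself. The function name, the field names (`admitted`, `ventilated`), and the records are all invented for illustration and do not describe how the Tag.bio apps are implemented.

```python
from typing import Callable, Dict, List, Optional

Patient = Dict[str, object]
Predicate = Callable[[Patient], bool]


def proportion_app(patients: List[Patient],
                   cohort: Predicate,
                   outcome: Predicate) -> Optional[float]:
    """Share of the cohort that meets the outcome, or None if the cohort is empty."""
    selected = [p for p in patients if cohort(p)]
    if not selected:
        return None
    return sum(1 for p in selected if outcome(p)) / len(selected)


def admitted(p: Patient) -> bool:
    return bool(p.get("admitted"))


def ventilated(p: Patient) -> bool:
    return bool(p.get("ventilated"))


# Two synthetic datasets with the same basic fields (not real patient data).
dataset_a = [
    {"admitted": True, "ventilated": True},
    {"admitted": True, "ventilated": False},
    {"admitted": True, "ventilated": False},
    {"admitted": True, "ventilated": False},
    {"admitted": False, "ventilated": False},
]
dataset_b = [
    {"admitted": True, "ventilated": True},
    {"admitted": True, "ventilated": False},
]

# The same app, with the same parameters, runs against either dataset.
for name, data in [("dataset_a", dataset_a), ("dataset_b", dataset_b)]:
    print(name, proportion_app(data, admitted, ventilated))
```

Since the cohort and outcome are passed in as parameters, the same component can be reused for other questions by swapping in different selections.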
The Nightingale Platform
Tag.bio is a data analysis platform company. For this instance of the Tag.bio platform, we worked with Dr. Mike Hogarth to tailor it to his team and to promote recognition and adoption. Together we gave it a recognizable look and feel by branding the platform as the “Nightingale Registries”.
Having demonstrated that the Nightingale platform could provide intuitive access to COVID-19 data, we realized that these basic apps could be used with any other segment of patient data. We are now expanding the idea of registries on demand beyond COVID-19 and into other therapeutic areas.
The principle of rapidly creating data products that give researchers and physicians point-and-click options for data analysis has broad application. With the help of Dr. Hogarth, we recently took the idea to another institute within the five health campuses system and are working to implement a version of the registries at those academic medical centers as well.
“We really see this [platform] as a powerful way to provide these research datasets in a very actionable, usable way to the research community.”
– Mike Hogarth, MD, FACMI, FACP,
Clinical Professor, Biomedical Informatics at UCSD Health
With our agile and versatile platform, we were able to go from the concept of building the COVID-19 registry to making it live in 28 days.
Because we went through this process of making data available in a secured research environment, we are reusing this foundation to rapidly expand into other areas, such as:
- Providing a central platform for researchers and physicians to access data from Dr. Hogarth’s institution and the multi-institutional COVID datasets
- Enabling researchers and physicians to analyze data without learning to code
- Reusing the analysis apps that were created for the real COVID-19 data on synthetic data, so that they can be demoed to the public
- Inspiring other institutions within the five health campuses system to use our platform to build out their own patient registries, allowing each institute both to use its data products within its own organization and to share some data products across the institution’s system
- Partnering with AWS to create a standard cloud-based architecture to run this kind of compliant and secure registry at any academic medical center