Data Harmonization for Translational Oncology

Convert static clinical trial data files into active, analyzable data products.

Use Cases

Clinical trials frequently involve similar data workflows; however, the data is siloed into single-trial sets. Large-population Phase III trials can generate significant amounts of data with the potential to provide new understanding of the efficacy, or lack thereof, of the therapy under study. Analyzing across trials can reveal common mechanisms.

The data mesh architecture approach ensures that your datasets from all clinical trial stages are interoperable and useful to communities of researchers, clinicians and data scientists. It is helpful when you need to instantly pivot from one data source to the next as you translate research data into everyday clinical practice.

When you treat your data as a product, it becomes usable by multiple researchers and data scientists. The data products in the data mesh approach can help your organization streamline the clinical data workflow to produce quality-controlled outputs.

How does building a data product help with the workflow? Every data product goes through a primary data mapping stage, which sets the foundation for subsequent updates. One benefit is that when the data is updated, it is deployed as a new versioned snapshot. A data and app testing system then checks the integrity and robustness of all data products.
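To make the idea of versioned snapshots concrete, here is a minimal Python sketch. It assumes that each update is stored as a new immutable snapshot with a version number, a timestamp and a checksum. The names `VersionedDataProduct`, `Snapshot` and `publish` are hypothetical and only illustrate the concept; they are not the platform's actual API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import hashlib
import json


@dataclass(frozen=True)
class Snapshot:
    """One immutable, versioned copy of the mapped data (hypothetical)."""
    version: int
    created_at: str
    checksum: str
    records: tuple


@dataclass
class VersionedDataProduct:
    """Hypothetical container that keeps every published snapshot."""
    name: str
    snapshots: list = field(default_factory=list)

    def publish(self, records):
        """Store the records as a new snapshot; earlier snapshots stay untouched."""
        payload = json.dumps(records, sort_keys=True).encode("utf-8")
        snapshot = Snapshot(
            version=len(self.snapshots) + 1,
            created_at=datetime.now(timezone.utc).isoformat(),
            checksum=hashlib.sha256(payload).hexdigest(),
            records=tuple(json.loads(payload)),  # independent copy of the input
        )
        self.snapshots.append(snapshot)
        return snapshot

    def get(self, version):
        """Return the snapshot with the given 1-based version number."""
        return self.snapshots[version - 1]


product = VersionedDataProduct("example_trial")
product.publish([{"patient_id": "P1", "response": "SD"}])
product.publish([{"patient_id": "P1", "response": "PR"}])
```

After the second `publish` call, version 1 is still available through `product.get(1)`, so an earlier result can always be traced back to the exact data it was computed from.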

With data and analyses automatically timestamped and versioned, you can access a history of versioned results and trace your visualizations and outputs back to the originating data source.

The platform's flexibility can help you advance the field of immunology. Because immunology connects to many therapeutic areas, such as immuno-oncology, inflammation, and autoimmunity, you can use one platform to instantly pivot between data products from each of these areas.

Data type examples:
  • Immune repertoire
  • Single cell
  • All omics
  • Immune typing for cancer
Data source examples:


Current workflow:

Data is managed project by project, using multiple data mappings for multiple analysis platforms. This results in redundant work and makes quality-controlled outputs a challenge.

New, flexible workflow:

Data is now mapped into individual data products at the initial stage, which eliminates rework. Other analysis platforms can then pull the harmonized, version-controlled data from the data products, making it easier to produce quality-controlled outputs.
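As an illustration of mapping once at the initial stage, the following Python sketch harmonizes rows from two trials into one shared schema. The trial names, column names and response codes are invented for this example and do not come from any real dataset or from the platform itself.

```python
# Hypothetical per-trial mappings: source column -> harmonized field name.
TRIAL_MAPPINGS = {
    "trial_a": {"subj": "patient_id", "arm_cd": "arm", "best_resp": "response"},
    "trial_b": {"PatientID": "patient_id", "TreatmentArm": "arm", "BOR": "response"},
}

# Hypothetical value normalization for the response field.
RESPONSE_CODES = {
    "complete response": "CR",
    "cr": "CR",
    "partial response": "PR",
    "pr": "PR",
}


def harmonize(trial, rows):
    """Rename trial-specific columns and normalize response values."""
    mapping = TRIAL_MAPPINGS[trial]
    harmonized = []
    for row in rows:
        record = {target: row[source] for source, target in mapping.items()}
        raw = str(record["response"]).strip().lower()
        # Unknown values are kept unchanged rather than guessed.
        record["response"] = RESPONSE_CODES.get(raw, record["response"])
        record["trial"] = trial  # keep provenance of each row
        harmonized.append(record)
    return harmonized


combined = (
    harmonize("trial_a", [{"subj": "A-01", "arm_cd": "control", "best_resp": "Partial Response"}])
    + harmonize("trial_b", [{"PatientID": "B-07", "TreatmentArm": "treatment", "BOR": "cr"}])
)
```

Because every downstream tool reads `combined` in the same schema, the mapping is written once per trial instead of once per analysis platform.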

A data product is the fundamental building block of the data mesh architecture. It follows the domain-driven design approach to ensure that each data product is usable by anyone. There are many definitions of a data product; we define it as three components, data mapping, algorithms, and a smart API, joined to create a foundational building block for the data mesh.
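One way to picture the three components is the Python sketch below. It assumes the mapping is a simple column-renaming table, the algorithms are plain functions, and the API is a single method that runs a named algorithm on the mapped data. The class and the `response_rate` function are hypothetical illustrations, not the actual implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, object]
Algorithm = Callable[[List[Record]], object]


@dataclass
class DataProduct:
    """Hypothetical data product: mapping + algorithms + a small API."""
    mapping: Dict[str, str]            # component 1: source column -> domain field
    algorithms: Dict[str, Algorithm]   # component 2: named analyses
    records: List[Record] = field(default_factory=list)

    def ingest(self, raw_rows: List[Record]) -> None:
        """Apply the data mapping to raw rows and store the result."""
        for row in raw_rows:
            self.records.append({dst: row[src] for src, dst in self.mapping.items()})

    def call(self, algorithm_name: str) -> object:
        """Component 3: the API entry point that runs a named algorithm."""
        return self.algorithms[algorithm_name](self.records)


def response_rate(records: List[Record]) -> float:
    """Fraction of patients with a complete or partial response."""
    if not records:
        return 0.0
    responders = sum(1 for r in records if r["response"] in {"CR", "PR"})
    return responders / len(records)


product = DataProduct(
    mapping={"subj": "patient_id", "best_resp": "response"},
    algorithms={"response_rate": response_rate},
)
product.ingest([{"subj": "P1", "best_resp": "CR"}, {"subj": "P2", "best_resp": "SD"}])
rate = product.call("response_rate")
```

Callers only need the algorithm name, which uses the language of the domain, and never touch the raw column names.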

Why build reusable data products:
  • Maximize value from your data
  • Accelerate time to discovery
  • Give users data ownership
  • Use the language domain experts understand
  • Agile development and versioned updates

All data and analyses are automatically recorded with timestamps, versions, analysis names and owners. This makes all analysis actions traceable and replayable within the data portal.

Why it matters:
  • Control access to provide effective and efficient data governance processes
  • Enable others to reproduce your analysis and confirm your work
  • Quality control outputs for internal reports and external publication


Each data product comes with its own no-code, guided analysis apps. The apps allow you to ask specific and follow-on questions of your data on the fly. These are some examples:

  • You can use the “Gene Expression Signatures” app to view the level of expression of a gene signature across cancer types within the TCGA pan-cancer atlas data product.
  • You can use the “Single-Cell Gene Expression” app to look for genes that are differentially expressed in cancer cells within the head and neck cancer data product.
Why it matters:
  • Your researchers, who have deep domain knowledge, can directly ask and answer their own questions of their data to make impactful real-time decisions
  • Your data scientists can help multiple researchers at once
  • Promote a data-driven culture

Useful Data Artifacts (UDATs) are the distillation of your investigation. A UDAT is a structured data object created when you or an algorithm extracts something useful from the data. Examples of UDATs:

  • Analysis results
  • Analysis parameters
  • Cohorts
Why it matters:
  • UDATs provide full reproducibility and reusability of analyses
  • With all of your extracted UDATs automatically saved in one location, it’s easy for you to instantly find, cross-compare and share any of your UDATs with your peers
  • Retain institutional knowledge: if any of your team members transition, you’ll still have access to their analysis history
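A UDAT can be pictured as a small, self-describing record. The Python sketch below is one possible shape, assuming each UDAT stores its kind, owner, source data product, data version, payload and a timestamp. The field names and the `make_udat` helper are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

# The three example kinds listed above.
ALLOWED_KINDS = {"analysis_result", "analysis_parameters", "cohort"}


@dataclass(frozen=True)
class Udat:
    """Hypothetical structure for a Useful Data Artifact."""
    kind: str
    owner: str
    data_product: str
    data_version: int
    payload: dict
    created_at: str


def make_udat(kind, owner, data_product, data_version, payload):
    """Create a timestamped UDAT, rejecting unknown kinds."""
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"unknown UDAT kind: {kind}")
    return Udat(
        kind=kind,
        owner=owner,
        data_product=data_product,
        data_version=data_version,
        payload=dict(payload),
        created_at=datetime.now(timezone.utc).isoformat(),
    )


def to_json(udat):
    """Serialize a UDAT so it can be saved, searched and shared."""
    return json.dumps(asdict(udat), sort_keys=True)


cohort = make_udat(
    kind="cohort",
    owner="researcher_1",
    data_product="example_product",
    data_version=2,
    payload={"patient_ids": ["P1", "P7", "P9"]},
)
saved = to_json(cohort)
```

Because the saved record carries its owner and data version, a colleague can later find it, see which data it came from and reuse it.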

Since all of your UDATs are automatically saved with a timestamp and the version of the data product, you can reproduce any of your UDATs. Examples:

  • For analyses: You can re-parameterize your analysis based on the last saved version to get new results, which in turn creates a new UDAT
  • For cohorts: You can build a new cohort based on your existing cohort
Why it matters:
  • The traceability of your presented outcomes provides confidence that your work can be replicated using the same data and analysis methods
  • Ensure quality-controlled outputs for your internal reports and external publication
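The re-parameterization example above can be sketched in Python as follows. This is a simplified illustration that assumes each analysis UDAT records its parameters, its data version and a link to the UDAT it was derived from. The threshold analysis and all names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional
import itertools

_next_id = itertools.count(1)  # simple id generator for this sketch


@dataclass(frozen=True)
class AnalysisUdat:
    """Hypothetical analysis UDAT with a link to its parent."""
    udat_id: int
    parent_id: Optional[int]
    data_version: int
    parameters: dict
    result: float


def run_analysis(values, parameters):
    """Toy analysis: fraction of values above a threshold."""
    above = sum(1 for v in values if v > parameters["threshold"])
    return above / len(values)


def analyze(values, data_version, parameters, parent=None):
    """Run the analysis and save the outcome as a new UDAT."""
    return AnalysisUdat(
        udat_id=next(_next_id),
        parent_id=parent.udat_id if parent else None,
        data_version=data_version,
        parameters=dict(parameters),
        result=run_analysis(values, parameters),
    )


def reparameterize(udat, values, **changes):
    """Start from a saved UDAT, change some parameters, and create a new UDAT."""
    new_parameters = {**udat.parameters, **changes}
    return analyze(values, udat.data_version, new_parameters, parent=udat)


expression = [0.2, 0.6, 0.9, 0.4]
first = analyze(expression, data_version=3, parameters={"threshold": 0.5})
second = reparameterize(first, expression, threshold=0.3)
```

The original UDAT is never modified. The new one points back to it through `parent_id`, so the chain from a presented result to its starting point stays intact.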
Let’s get the conversation started

From a 30-minute demo to an inquiry about our 4-week pilot project, we are here to answer all of your questions!