Understanding and leveraging useful data artifacts (UDATs)

Bringing something revolutionary to data research – full reproducibility and reusability of analysis.

What are UDATs (useful data artifacts)?

UDATs are the distillation of your investigation. A UDAT is a structured data object created whenever you or an algorithm extracts something useful from the data.

Read Our CSO’s In-Depth Explanation

Below are examples of UDATs.

Analysis parameters

  • When you use an analysis app, you usually select some parameters before running your analysis.
  • These parameters are automatically saved in your analysis history, which makes them data artifacts.

Cohorts

  • A cohort is a group of entities which share common data characteristics. For example, all patients within a specific age range can be a cohort. Cohorts can also be defined as “clusters” or “segments” after running algorithms.
  • Even if you do not explicitly save your cohorts when you run analyses, they are automatically recorded in your analysis history, which makes them data artifacts.

Variable groups

  • A variable group is a group of entity attributes, such as factors, measurements, or conditions. Variable groups are also known as signatures or feature sets.
  • Even if you do not explicitly save your variable groups when you run analyses, they are automatically recorded in your analysis history, which makes them data artifacts.

Analysis results

There are many types of analysis results. For example:

  • Summary: an analysis on a single cohort
  • Comparison: an analysis comparing cohorts
  • Similarity: similarities or differences between entities, such as nearest-neighbors
  • Correlation: similarities or differences between variables
  • Descriptive models: sophisticated algorithms
  • Projection models: supervised algorithms
  • Systems models: combining cohorts and variable groups with external knowledge
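
To make the artifact types above more concrete, here is a minimal Python sketch of how parameters, cohorts, variable groups, and results might be captured as structured objects and recorded in an analysis history. Every class, field, and function name here (`Cohort`, `VariableGroup`, `AnalysisRun`, `AnalysisHistory`, `summarize_ages`) is a hypothetical illustration, not Tag.bio's actual data model or API, and the patient ages are made-up toy data.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Cohort:
    """A group of entities sharing data characteristics (hypothetical shape)."""
    name: str
    entity_ids: frozenset[str]


@dataclass(frozen=True)
class VariableGroup:
    """A group of entity attributes, also called a signature or feature set."""
    name: str
    variables: tuple[str, ...]


@dataclass
class AnalysisRun:
    """One analysis: its parameters, inputs, and result, kept together."""
    app: str
    parameters: dict[str, Any]
    cohorts: list[Cohort]
    variable_groups: list[VariableGroup]
    result: dict[str, Any]


@dataclass
class AnalysisHistory:
    """Records every run, so its artifacts exist even if never saved explicitly."""
    runs: list[AnalysisRun] = field(default_factory=list)

    def record(self, run: AnalysisRun) -> int:
        self.runs.append(run)
        return len(self.runs) - 1  # list index serves as a simple run identifier


def summarize_ages(cohort: Cohort, ages: dict[str, int]) -> dict[str, Any]:
    """A toy 'Summary' analysis of a single cohort: count and mean age."""
    values = [ages[e] for e in sorted(cohort.entity_ids)]
    return {"n": len(values), "mean_age": sum(values) / len(values)}


# Toy data and parameters (assumptions for illustration only).
ages = {"p1": 34, "p2": 52, "p3": 61, "p4": 47}
params = {"min_age": 45, "max_age": 65}

# The cohort is derived from the parameters: all patients in the age range.
cohort = Cohort(
    name="age 45-65",
    entity_ids=frozenset(
        e for e, a in ages.items() if params["min_age"] <= a <= params["max_age"]
    ),
)

# Recording the run stores parameters, cohort, variable group, and result together.
history = AnalysisHistory()
run_id = history.record(
    AnalysisRun(
        app="summary",
        parameters=params,
        cohorts=[cohort],
        variable_groups=[VariableGroup("demographics", ("age",))],
        result=summarize_ages(cohort, ages),
    )
)
```

In this sketch the history, not the user, is responsible for keeping the artifacts, which mirrors the idea that parameters, cohorts, and variable groups become data artifacts as a side effect of running an analysis.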

Optimize the reuse of UDATs

The most important aspects of using the FAIR (findable, accessible, interoperable, reusable) principles are reproducibility and reusability.

Something that can be reproduced can be reused. Reusability reduces redundancy and makes your research more efficient. Tag.bio promotes data, analysis, and UDAT reuse.

Findable

“I know where to look for my UDATs.”

Accessible

“I can access any of my UDATs.”

Interoperable

“I can use my UDATs as signals that apply across datasets.”

Reusable

“I can use my UDATs as a starting point for a new analysis.”
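
As a rough illustration of the reproducible and reusable ideas above, the following Python sketch stores a set of analysis parameters as a serialized artifact, reruns them on the same data, and then applies them to a second dataset. The function `select_cohort`, the artifact layout, and both toy datasets are assumptions made for this example and do not represent Tag.bio's implementation.

```python
import json


def select_cohort(dataset, params):
    """Return the sorted entity ids whose age falls within the saved range."""
    return sorted(
        entity
        for entity, age in dataset.items()
        if params["min_age"] <= age <= params["max_age"]
    )


# A saved artifact: serialized parameters from an earlier analysis (hypothetical).
saved_udat = json.dumps({"app": "age_cohort", "min_age": 45, "max_age": 65})

# Two toy datasets mapping entity id to age.
study_a = {"p1": 34, "p2": 52, "p3": 61}
study_b = {"q1": 70, "q2": 48}

params = json.loads(saved_udat)

# Reproduce: the same parameters on the same data yield the same cohort.
first = select_cohort(study_a, params)
again = select_cohort(study_a, params)

# Reuse: the same parameters serve as a starting point on another dataset.
other = select_cohort(study_b, params)
```

Because the artifact holds only the parameters and not the data, it can be shared, audited, or applied elsewhere without copying the original dataset.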

What can you do with UDATs?

Share them with your team members so they can reproduce your work

Use them as a starting point for further investigations

Reproduce them for quality assurance and auditing

Save them for future reference

Publish your findings

And anything else that you can think of!

Let’s get the conversation started

From a 30-minute demo to an inquiry about our 4-week pilot project, we are here to answer all of your questions!