Bringing something revolutionary to data research – full reproducibility and reusability of analysis.
What are UDATs (useful data artifacts)?
UDATs are the distillation of your investigation. A UDAT is a structured data object created when you or an algorithm extracts something useful from the data.
Below are examples of UDATs.
- Parameters are the settings you select when you run an analysis in an analysis app.
- These parameters are saved automatically in your analysis history, which makes them data artifacts.
- A cohort is a group of entities which share common data characteristics. For example, all patients within a specific age range can be a cohort. Cohorts can also be defined as “clusters” or “segments” after running algorithms.
- Whether or not you save your cohorts when you run analyses, they are saved automatically in your analysis history, which makes them data artifacts.
- A variable group is a group of entity attributes, such as factors, measurements, or conditions. Variable groups are also known as signatures or feature sets.
- Whether or not you save your variable groups when you run analyses, they are saved automatically in your analysis history, which makes them data artifacts.
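To make "structured data object" concrete, here is a minimal Python sketch of how parameters, a cohort, and a variable group might be captured together as one history record. It is an illustration only, not Tag.bio's actual data model: the class names (`Cohort`, `VariableGroup`, `AnalysisRecord`), their fields, the app name, and the sample patients are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class Cohort:
    """Hypothetical cohort: entities whose attribute falls within a range."""
    name: str
    attribute: str
    minimum: float
    maximum: float

    def select(self, entities: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        # Keep only entities whose attribute value lies in [minimum, maximum].
        return [e for e in entities
                if self.minimum <= e[self.attribute] <= self.maximum]


@dataclass
class VariableGroup:
    """Hypothetical variable group: a named set of entity attributes."""
    name: str
    variables: List[str] = field(default_factory=list)


@dataclass
class AnalysisRecord:
    """Hypothetical history entry bundling everything needed to rerun an analysis."""
    app: str
    parameters: Dict[str, Any] = field(default_factory=dict)
    cohorts: List[Cohort] = field(default_factory=list)
    variable_groups: List[VariableGroup] = field(default_factory=list)

    def to_json(self) -> str:
        # Serialize the whole record so it can be stored, shared, and reloaded.
        return json.dumps(asdict(self), sort_keys=True)


# Made-up sample data, for illustration only.
patients = [
    {"id": "p1", "age": 34},
    {"id": "p2", "age": 52},
    {"id": "p3", "age": 47},
]

cohort = Cohort(name="age_40_to_60", attribute="age", minimum=40, maximum=60)
vitals = VariableGroup(name="vitals", variables=["heart_rate", "blood_pressure"])
record = AnalysisRecord(
    app="summary",
    parameters={"min_cohort_size": 2},
    cohorts=[cohort],
    variable_groups=[vitals],
)

selected_ids = [p["id"] for p in cohort.select(patients)]
restored = json.loads(record.to_json())
```

Because the record stores the cohort definition (the age range) rather than a fixed list of patients, the same definition can in principle be applied to a different dataset, which is the kind of reuse this page describes.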
There are many types of analysis results. For example:
- Summary: an analysis on a single cohort
- Comparison: an analysis comparing cohorts
- Similarity: similarities or differences between entities, such as nearest-neighbors
- Correlation: similarities or differences between variables
- Descriptive models: sophisticated algorithms
- Projection models: supervised algorithms
- Systems models: combining cohorts and variable groups with external knowledge
Optimize the reuse of UDATs
The most important aspects of using the FAIR (findable, accessible, interoperable, reusable) principles are reproducibility and reusability.
Something that can be reproduced can be reused. Reusability reduces redundancy and makes your research more efficient. Tag.bio promotes data, analysis, and UDAT reuse.
“I know where to look for my UDATs.”
“I can access any of my UDATs.”
“I can use my UDATs as signals that apply across datasets.”
“I can use my UDATs as a starting point for a new analysis.”
What can you do with UDATs?
Share them with your team members so they can reproduce your analyses
Use them as a starting point for further investigations
Reproduce them for quality assurance and auditing
Save them for future reference
Publish your findings
And anything else that you can think of!