Frequently Asked Questions
Below are some of the questions we most frequently get from our customers. If you have more questions, or would like clarification, please feel free to contact us.
I work in R and Python. Can I use these with your system?
Yes, we have R and Python SDKs that allow you to plug your scripts and models into a data node from a familiar Notebook environment.
I like my processes. Why should I switch to yours?
Our platform is designed for independent data products, giving you the flexibility to build whatever you want. Our aim is to augment your current systems and processes, not replace them. For example, many of our customers begin with Tag.bio to solve emerging data analysis challenges for which their existing systems are ill-designed.
How Tag.bio Compares
How are you different from other products, e.g. Tableau and Spotfire?
Users interact with Tableau and Spotfire through dashboards. These dashboards show individual answers to a limited set of pre-selected questions; they do not enable iterative investigations. Additionally, user interactions with dashboard software are not preserved, which makes analyses difficult to reproduce (“how did I do that before?”).
Tag.bio analysis apps guide users through iterative, open-ended investigations by allowing them to ask their own questions. All questions and answers are preserved in a history that is reproducible, reusable, and shareable.
Our advantage over dashboard platforms is that our user experience is driven from within each data node, so each data node can provide its own native, domain-specific suite of analysis apps. For example:
- Life sciences example: The Pan Cancer data node has an analysis app to identify gene mutations which significantly impact gene expression signatures and survival/treatment outcomes.
- Healthcare example: The Hospital System data node has an analysis app to identify variables and conditions which impact 30-day readmission rates for different patient cohorts.
How are you different from R and Python?
R and Python are powerful data science programming languages with extensive package libraries, but they are not an enterprise platform.
Tag.bio enables data scientists to integrate useful R and Python code into any data node, automatically turning scripts into analysis apps. Publishing an R/Python analysis app allows the data scientist to empower domain experts with a parameterizable, enterprise data product. This reduces the number of follow-up requests for parameter changes, and enables monitoring of app usage/ROI to guide future development decisions.
For domain expert users, the R/Python apps published by data scientists are runnable on the Tag.bio portal with a consistent user experience, just like all other analysis apps. An R/Python app lets the user build a cohort and send that cohort as a dataframe to the script, which performs the analysis and prepares the results as a report with visualizations. Users can then review and preserve useful data artifacts, all of which are saved in history.
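As a rough illustration of the cohort-to-script-to-report pattern described above, the sketch below shows a parameterizable analysis function that receives a cohort as tabular rows and returns a small report. It does not use the Tag.bio SDK: the function name `readmission_report`, its parameters, the field names, and the sample rows are all hypothetical, and a plain list of dictionaries stands in for a dataframe.

```python
from statistics import mean


def readmission_report(cohort, group_by="age_group", outcome="readmitted"):
    """Summarize an outcome rate per group for a cohort of row dictionaries.

    Hypothetical stand-in for a script plugged into a data node; the
    parameter names are assumptions, not part of any Tag.bio API.
    """
    groups = {}
    for row in cohort:
        groups.setdefault(row[group_by], []).append(1 if row[outcome] else 0)
    # One report line per group: group label, cohort size, outcome rate.
    return [
        {"group": g, "n": len(vals), "rate": round(mean(vals), 3)}
        for g, vals in sorted(groups.items())
    ]


# Made-up cohort rows, for illustration only.
cohort = [
    {"age_group": "18-44", "readmitted": False},
    {"age_group": "18-44", "readmitted": True},
    {"age_group": "65+", "readmitted": True},
    {"age_group": "65+", "readmitted": True},
]
report = readmission_report(cohort)
for line in report:
    print(line)
```

Exposing `group_by` and `outcome` as parameters is what would let a domain expert rerun the same analysis with different settings instead of sending a follow-up request to the data scientist.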
How are you different from Jupyter Notebook?
Tag.bio provides a Jupyter Notebook environment in which any authorized data scientist can access and query published data nodes within the data mesh. The Notebook environment contains embedded Tag.bio R/Python SDKs, and gives data scientists a native way to interact with Tag.bio data nodes as useful, harmonized sources of data.
Data scientists typically use the Jupyter Notebook environment in Tag.bio to extract dataframes from data nodes, perform exploratory data analysis, and develop R/Python scripts to be plugged into each data node as analysis apps.
What data types can Tag.bio load?
Tag.bio data nodes can be deployed on a myriad of disparate data sources, either independently with one or more nodes on each data source, or as integration nodes for multiple data sources.
Tag.bio data nodes deploy instantly as turnkey solutions on top of common schemas in healthcare and life sciences, such as electronic health records, clinical trials, and public data schemas (e.g. TCGA). However, most organizations have data formatted in specialized schemas or file structures, stored in file systems, databases, data warehouses and data lakes. The data mapping layer of each Tag.bio data node can be customized to load data from any single tabular data source (e.g. CSV, SQL) or any combination of them.
Do you clean data?
Not all data, even data of high quality, is analysis-ready. To make data analysis-ready, Tag.bio data nodes enable industry-standard data transformations to harmonize data sources – e.g. datetime handling, numeric normalizations, and mapping to common concepts like ICD10 or Gene Ontology.
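To make the three kinds of transformation mentioned above concrete, here is a minimal sketch in plain Python. It is not Tag.bio's mapping layer: the helper names `parse_date` and `min_max`, the accepted date layouts, the two-entry ICD-10 lookup, and the sample rows are all assumptions chosen for illustration.

```python
from datetime import datetime

# Hypothetical lookup from free-text diagnoses to ICD-10 codes (tiny subset).
ICD10_BY_LABEL = {
    "type 2 diabetes": "E11",
    "essential hypertension": "I10",
}


def parse_date(text):
    """Accept a few assumed date layouts and return an ISO 8601 date string."""
    for fmt in ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None  # leave unparseable values for manual review


def min_max(values):
    """Rescale numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up source rows with inconsistent date layouts and label casing.
rows = [
    {"admitted": "2021-03-05", "glucose": 90, "diagnosis": "Type 2 diabetes"},
    {"admitted": "03/07/2021", "glucose": 140, "diagnosis": "essential hypertension"},
    {"admitted": "09.03.2021", "glucose": 190, "diagnosis": "unknown"},
]
scaled = min_max([r["glucose"] for r in rows])
harmonized = [
    {
        "admitted": parse_date(r["admitted"]),
        "glucose_scaled": s,
        # Unmapped labels become None rather than a guessed code.
        "icd10": ICD10_BY_LABEL.get(r["diagnosis"].strip().lower()),
    }
    for r, s in zip(rows, scaled)
]
for row in harmonized:
    print(row)
```

Values that cannot be parsed or mapped are kept as `None` instead of being guessed, so they can be routed to a separate data quality review.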
If you require extensive data quality analysis, cleaning and structuring, we can accommodate that in collaboration with trusted service partners.
How does my data get into a data node?
The Tag.bio platform has a low-code data mapping/configuration layer which extracts data from tabular data sources (e.g. CSV, SQL), and can either instantly load data as-is, or perform industry-standard joins and transformations to make data analysis-ready. If your data source is stored in an industry-standard system/schema (e.g. OMOP, VCF), Tag.bio provides turnkey data mapping to instantly launch a data node with that data.
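The general idea of extracting from tabular sources and joining them can be sketched with Python's standard library. This is only an analogy for what a mapping layer does, not Tag.bio's low-code configuration format; the table names, column names, and sample data are invented, and an in-memory SQLite database stands in for a real SQL source.

```python
import csv
import io
import sqlite3

# Made-up CSV export standing in for a tabular file source.
PATIENTS_CSV = """patient_id,birth_year
1,1950
2,1984
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (patient_id INTEGER, birth_year INTEGER)")
conn.execute("CREATE TABLE visits (patient_id INTEGER, visit_date TEXT)")

# Load the CSV rows as-is; INTEGER column affinity converts the text values.
reader = csv.DictReader(io.StringIO(PATIENTS_CSV))
conn.executemany(
    "INSERT INTO patients VALUES (:patient_id, :birth_year)", list(reader)
)
# Rows standing in for a second, SQL-based source.
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [(1, "2021-03-05"), (1, "2021-04-11"), (2, "2021-03-09")],
)

# Join the two sources into one analysis-ready table: one row per visit.
joined = conn.execute(
    """
    SELECT v.patient_id, p.birth_year, v.visit_date
    FROM visits AS v
    JOIN patients AS p ON p.patient_id = v.patient_id
    ORDER BY v.patient_id, v.visit_date
    """
).fetchall()
conn.close()

for row in joined:
    print(row)
```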
How long does it take to create a data node?
Typically, the data node initiation process takes 2 to 20 hours depending on data volume, data schema/structure, and the quality of data. All data mapping required for data node initiation is facilitated via low-code, modular templates.
How long does it take to produce a new analysis app?
Typically, the process takes thirty minutes to a couple of hours depending on the complexity, novelty, and scope of the questions to answer. For example, apps that provide a straightforward summary of variables take around thirty minutes to develop. Apps that integrate novel algorithms or visualizations, or that use R/Python plugins, can take longer, from 2 to 20 hours.
Development of analysis apps is made consistent and fast via our low-code, modular template system.