Platform Overview


Data Products

Tag.bio data products are the fundamental building block of the data mesh architecture and implement a domain-driven design to ensure that each data product is usable by anyone.

Data-as-a-Product

Data products represent a harmonized, composable application layer on top of disparate data sources. Along with a universal “Smart” API, they present a simple, clean, standardized data model to apps and to data scientists who run queries and extract data frames. Data products have the following features:

Value-oriented

Ownership of functionality is shareable with domain teams and business units.

Versioned

All data snapshots, mapping, statistics, ML models, visualizations and reports are versioned for analysis provenance and perfect reproducibility.
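One common way to get this kind of reproducibility is to derive a version identifier from the content itself, so that identical inputs always yield the same ID and any change yields a new one. The sketch below illustrates that general idea only; it is not Tag.bio's actual versioning scheme, and `snapshot_version`, the sample rows, and the mapping names are all hypothetical.

```python
import hashlib
import json


def snapshot_version(rows, mapping):
    """Return a short, deterministic version ID for a snapshot plus its mapping.

    Hypothetical helper for illustration; not part of any Tag.bio API.
    """
    # Canonical JSON (sorted keys, fixed separators) so equal content hashes equally.
    payload = json.dumps({"rows": rows, "mapping": mapping},
                         sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


# Made-up snapshot and column mapping.
rows = [{"id": 1, "age": 54}, {"id": 2, "age": 61}]
mapping = {"age": "age_years"}

v1 = snapshot_version(rows, mapping)
v2 = snapshot_version(rows, mapping)                   # same inputs
v3 = snapshot_version(rows, {"age": "age_at_visit"})   # mapping changed

print(v1 == v2)  # True: identical inputs reproduce the same version
print(v1 != v3)  # True: a changed mapping produces a new version
```

Recording such an ID alongside every statistic or report is what lets an analysis be traced back to the exact data and mapping that produced it.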

Well-modeled

Each data product presents its data model to algorithms, visualizations, BI tools, and data scientists as a single, sliceable data frame with common vocabularies and ontology concepts.

Harmonized

Disparate data products can map into an archetypal data model, which enables instant transfer of algorithms and data consumption methodologies.
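As a rough illustration of why mapping into a shared model makes algorithms transferable, the sketch below renames the columns of two differently shaped sources to one set of field names, then runs a single function on both. The archetype fields, the per-source mappings, and the sample rows are all assumptions made up for this example; real archetypes (for example OMOP or CDISC) are far richer.

```python
# Hypothetical archetype: every harmonized record exposes these fields.
ARCHETYPE_FIELDS = ("subject_id", "age_years", "sex")

# Hypothetical per-source mappings from local column names to archetype fields.
SITE_A_MAPPING = {"patient": "subject_id", "age": "age_years", "gender": "sex"}
SITE_B_MAPPING = {"SUBJID": "subject_id", "AGE": "age_years", "SEX": "sex"}


def harmonize(rows, mapping):
    """Rename each row's source columns to archetype field names."""
    harmonized = []
    for row in rows:
        record = {mapping[col]: value for col, value in row.items() if col in mapping}
        missing = [f for f in ARCHETYPE_FIELDS if f not in record]
        if missing:
            raise ValueError(f"row is missing archetype fields: {missing}")
        harmonized.append(record)
    return harmonized


def mean_age(records):
    """An 'algorithm' written once, against the archetype only."""
    return sum(r["age_years"] for r in records) / len(records)


site_a = harmonize([{"patient": "a1", "age": 50, "gender": "F"},
                    {"patient": "a2", "age": 60, "gender": "M"}], SITE_A_MAPPING)
site_b = harmonize([{"SUBJID": "b1", "AGE": 40, "SEX": "M"}], SITE_B_MAPPING)

print(mean_age(site_a))  # 55.0
print(mean_age(site_b))  # 40.0
```

The function `mean_age` never sees the source column names, which is the property that allows an algorithm to move between harmonized products unchanged.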

Containerized

Each data product utilizes its own git repository (or archetype repository) and is deployed as a Docker container into a robust, fully-managed Kubernetes compute infrastructure.

Decentralized

Data products can be designed, managed, and tested independently, minimizing overhead and maintenance costs as data and usage evolve.

Low-code Data Preparation

Connect to multiple data technologies – CSV, TSV, SQL, Parquet, data warehouses, and data lakes.

Turnkey data product archetypes

For CDISC, OMOP, EHR, RWE, DNA-Seq, RNA-Seq (incl. single-cell).

Modular data ingestion & modeling functions (low code)

Using JSON or YAML to deploy a harmonized data product for novel data sources.

Harmonized data models can easily share

Algorithms, visualizations, reports, ML, and LLM apps.

Harmonized API Access

Harmonized

All disparate, composable data products speak the same API language; specific functionality is enabled using callable methods through the API.
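The idea of one API language with product-specific callable methods can be pictured as a single dispatch entry point shared by every product. The following sketch is purely illustrative: the `DataProduct` class, its `methods` and `call` functions, and the method names `count` and `slice` are hypothetical and do not describe the real Tag.bio API.

```python
class DataProduct:
    """Hypothetical stand-in for a data product exposing named, callable methods."""

    def __init__(self, name, rows):
        self.name = name
        self._rows = rows
        # Every product registers its methods behind the same interface.
        self._methods = {"count": self._count, "slice": self._slice}

    def methods(self):
        """List the method names this product supports."""
        return sorted(self._methods)

    def call(self, method, **params):
        """Single entry point: dispatch a method name plus parameters."""
        if method not in self._methods:
            raise KeyError(f"{self.name} does not support method {method!r}")
        return self._methods[method](**params)

    def _count(self):
        return len(self._rows)

    def _slice(self, column, equals):
        return [r for r in self._rows if r.get(column) == equals]


# Two made-up products with different contents.
trial = DataProduct("trial", [{"arm": "A"}, {"arm": "B"}, {"arm": "A"}])
ehr = DataProduct("ehr", [{"arm": "A"}])

# The same client code works against either product.
for product in (trial, ehr):
    matches = product.call("slice", column="arm", equals="A")
    print(product.name, product.call("count"), len(matches))
```

A client can query `methods()` first to discover what a given product offers, then use the same `call` shape everywhere.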

Reproducible, point-and-click analysis (no-code)

Anyone can utilize the Tag.bio web application to create and share cohorts, statistics, visualizations, reports, and machine learning models – with versioning and provenance.

R & Python libraries

Data scientists can access any data product (with authorization) to easily slice and extract data frames for ad-hoc analysis and visualization, and for building machine learning and generative AI models.

SQL API

Data analysts, BI tools, and other software clients can connect to a data product’s RESTful API or SQL API to slice and extract data frames for downstream use.

Embedded & Customized Algorithms

Bring the compute to the data (pro-code) – implement pluggable algorithms and visualizations with R scripts, Python scripts, RMarkdown templates and Python notebooks.

Columnar, in-memory, optimized data access

Enables fast algorithms, visualization, and ML processing of data.

Serve visualizations & reports from the data product API

PNG, PDF, HTML, Plotly, RMarkdown, Python notebooks.

Embedded ML models & Generative AI

Data products can serve API methods and apps for prediction, classification and generative AI from structured queries or exploratory prompts.

Interconnected Data Products

Tag.bio Data Mesh

Tag.bio's Data Mesh presents a network of interconnected data products that adhere to FAIR principles via smart APIs. This framework offers several benefits to customers. It promotes data discoverability and interoperability, ensures that both data and analyses are FAIR-compliant, and expedites the process of connecting real-world data sources.

Additionally, the platform is designed to retain institutional knowledge by automatically saving and cataloging analysis activities, creating a valuable resource for domain experts.

Tools and Ecosystem of Data Mesh

Tag.bio provides a robust suite of features to enhance data collaboration and analysis. It facilitates scalable data mesh deployment with built-in CI/CD pipelines, allowing for the creation of modular, containerized data products that can be geographically distributed with Federated Computational Governance. Data products can be added, modified, or removed independently, ensuring flexibility and adaptability. The platform prioritizes security, running within customers' secure network environments and implementing single sign-on (SSO) to simplify user access management.

Tag.bio empowers users to create, share, and analyze data artifacts. It supports data scientists, ML engineers, and BI professionals with a comprehensive suite of tools and no-code analyses, and it integrates with popular cloud services such as AWS, Azure, and Google Cloud.

Tag.bio Smart APIs

Tag.bio offers a versatile range of APIs and services that enhance data product accessibility and functionality. These APIs support various essential functions, including the Search and Discovery System, R and Python SDKs, Dashboard, Data Products Insights, Monitoring, and more. 

Tag.bio's Smart API streamlines access to unique data product functionalities, allowing seamless integration with other data products and facilitating guided analysis apps for domain experts. Designed for universal communication, the API combines diverse elements such as datasets, algorithms, R/Python scripts, and analysis workflows into API calls. By fostering interoperability and adhering to the FAIR (findable, accessible, interoperable, reusable) principles, Tag.bio's API and services empower data scientists, engineers, and domain experts to harness data product capabilities, making the data truly useful.

Enterprise AI: Predictive and Generative AI

Enterprise AI

Tag.bio Enterprise AI is an all-encompassing solution that simplifies the AI lifecycle by offering a unified platform for building, deploying, and managing Predictive & Generative AI projects. 

The platform integrates data sets, smart APIs, and advanced statistical and machine learning algorithms into data products, enabling users to uncover valuable insights through user-friendly apps and/or generative AI prompts.

The platform comes with:

  • Containerized Private Model Registries 

  • Developer Studio

  • AI/ML frameworks: scikit-learn, TensorFlow, PyTorch, MXNet, BYOA

  • Integration with cloud ML platforms including Amazon SageMaker, Azure ML, and Google Vertex AI

Retrieval-augmented Generation and Fine Tuning

We help customers navigate the complexity of LLMs, retrieval-augmented generation (RAG), and fine-tuning within our platform. RAG enhances the quality of responses generated by LLMs by incorporating internal, domain-specific knowledge from data products. This augmentation supplements what the LLM learned during training with retrieved context, resulting in more comprehensive and contextually relevant responses.

Tag.bio enables users to:

• Perform prompt-tuning to derive insights from data products

• Use RAG with interconnected data products

• Fine-tune models using harmonized internal domain knowledge
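The retrieval step of RAG can be sketched in a few lines: score stored text against the question, pick the best matches, and prepend them to the prompt. The example below is a deliberately simplified illustration that uses word-count cosine similarity from the Python standard library in place of the embedding models and vector databases a production system would use. The function names and the sample snippets are hypothetical, and no LLM is called.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(documents,
                    key=lambda d: cosine(q, Counter(tokenize(d))),
                    reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Prepend retrieved context to the question before it is sent to an LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Made-up snippets standing in for knowledge drawn from data products.
documents = [
    "Cohort A contains 120 patients with type 2 diabetes.",
    "The RNA-Seq data product covers 40 tumor samples.",
    "Single sign-on is configured per deployment.",
]

print(build_prompt("How many patients are in cohort A?", documents))
```

Here the first snippet is selected because it shares the most terms with the question; the resulting prompt gives the model the domain fact it would otherwise lack.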


LLM Agents and Support for Foundational Models

Tag.bio's platform provides robust support for LLM agent frameworks and services (LangChain, LlamaIndex, AutoChain, Amazon Bedrock, and Vertex AI). In addition, it integrates with a variety of vector databases, including Pinecone, Chroma, FAISS, and pgvector, ensuring efficient data handling.

The platform also leverages a spectrum of foundational models such as GPT-4, Claude 2, Llama 2, Falcon, Mistral 7B, and PaLM 2, thereby providing a comprehensive toolkit for users to harness the power of diverse AI technologies.