Adapt to the agile & growing needs of your organization. New data types, new data products, and new apps are easily added to the mesh without any impact on established data services.
What is a data mesh?
A data mesh is a network of distributed data products, linked together through smart APIs, that follow the FAIR principles (findable, accessible, interoperable, and reusable).
Another way to understand the concept of a data mesh is to think of it like the world wide web:
- You have your data in one or more data products, which are like web servers
- You have an analysis platform, which is like a web browser
- Data products have smart APIs, which allow the analysis platform to access and utilize any data product, like HTTP / DNS
Following the WWW concept, when you have a web server, you can serve domain-specific content. Similarly, Tag.bio’s data product serves domain-driven analysis functionality.
Why this matters:
This decentralized WWW pattern scales better than the centralized data lake pattern.
- Data products are developed, published and maintained independently by domain-centric teams
- Data products are able to communicate within the data mesh via smart APIs
- Data sources and functionality available within each data product are tightly versioned
- The same data product can be accessed by multiple teams, analysis platforms, R/Python SDKs, and third-party clients
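The publishing model described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Tag.bio's actual implementation: the `DataProduct` and `Mesh` classes, their fields, and the product names are all hypothetical. It shows one idea only: each team publishes its own versioned data product, and adding one leaves the established entries untouched.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical descriptor for one data product in the mesh."""
    name: str
    domain: str
    version: str  # version of the data sources and functionality it serves


@dataclass
class Mesh:
    """Hypothetical registry keyed by (name, version)."""
    products: dict = field(default_factory=dict)

    def publish(self, product: DataProduct) -> None:
        key = (product.name, product.version)
        if key in self.products:
            # A published version is never overwritten; a change needs a new version.
            raise ValueError(f"{product.name} {product.version} is already published")
        self.products[key] = product

    def find(self, domain: str) -> list:
        """Return the products for a domain, sorted by name then version string."""
        return sorted(
            (p for p in self.products.values() if p.domain == domain),
            key=lambda p: (p.name, p.version),
        )


mesh = Mesh()
mesh.publish(DataProduct("clinical-trials", "oncology", "1.0.0"))
mesh.publish(DataProduct("gene-expression", "genomics", "2.1.0"))
# Another release is published without touching the existing entries.
mesh.publish(DataProduct("clinical-trials", "oncology", "1.1.0"))

oncology = mesh.find("oncology")
```

Here `oncology` holds both published versions of the hypothetical `clinical-trials` product, so a client could pin its analysis to whichever version it needs.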
Similar to how a web browser allows a user to browse for and interact with content served by many disparate web servers, the analysis platform allows a user to browse for and interact with content served by the data products. The content can be private or public, with access determined by the user’s authorization roles.
Why this matters:
- A centralized platform for access to registered data products
- Ability to cross-compare disparate data in one platform
- Store a reproducible history of all your analysis activities and reusable UDATs (useful data artifacts)
- The analysis platform allows FAIR (findable, accessible, interoperable, reusable) interaction with both data and analyses
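One way to picture a reproducible analysis history is as a log in which every entry records the data product, its version, the analysis, and the parameters used. The sketch below is an assumption for illustration only: `AnalysisRecord`, `History`, and the example analyses are hypothetical names, not part of any Tag.bio API.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AnalysisRecord:
    """Hypothetical log entry: enough detail to rerun one analysis."""
    product: str
    product_version: str  # pins the run to a specific data product version
    analysis: str
    params: dict


class History:
    """Hypothetical append-only history of analysis activity."""

    def __init__(self) -> None:
        self.records = []

    def log(self, record: AnalysisRecord) -> None:
        self.records.append(record)

    def export(self) -> str:
        """Serialize the history to JSON so it can be stored or shared."""
        return json.dumps([asdict(r) for r in self.records], sort_keys=True)

    @staticmethod
    def load(text: str) -> list:
        """Rebuild the records from exported JSON."""
        return [AnalysisRecord(**item) for item in json.loads(text)]


history = History()
history.log(AnalysisRecord("clinical-trials", "1.1.0", "survival", {"arm": "A"}))
history.log(AnalysisRecord("gene-expression", "2.1.0", "compare", {"gene": "TP53"}))

# Round trip: the exported history restores to identical records.
restored = History.load(history.export())
```

Because each record carries the product version, a restored entry identifies exactly which data and functionality produced a result.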
For a web browser to communicate with web servers, it needs a common language: HTTP. In a data mesh, the analysis platform communicates with the data products through a common language of its own: the smart API.
Why this matters:
- The smart API is a universal communication protocol for accessing domain-specific data and functionality within data products
- It enables each data product to communicate domain-specific language and functionality to end users
- It provides a common way to extract and transfer data from data products to third-party software (e.g., R, Python, Jupyter Notebook, Tableau)
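The role of a common protocol can be sketched as a shared interface that every data product implements. The example below is a hedged illustration, not the real smart API: the `SmartAPI` protocol, its `describe` and `run` methods, and both example products are hypothetical. It shows how one client function can work with any product while each product still exposes its own domain-specific analyses.

```python
from typing import Any, Protocol


class SmartAPI(Protocol):
    """Hypothetical contract every data product exposes to clients."""

    def describe(self) -> dict: ...

    def run(self, analysis: str, **params: Any) -> dict: ...


class SalesProduct:
    """Hypothetical product serving a sales-domain analysis."""

    def describe(self) -> dict:
        return {"name": "sales", "analyses": ["total"]}

    def run(self, analysis: str, **params: Any) -> dict:
        if analysis != "total":
            raise ValueError(f"unknown analysis: {analysis}")
        return {"analysis": analysis, "result": sum(params["amounts"])}


class CohortProduct:
    """Hypothetical product serving a patient-cohort analysis."""

    def describe(self) -> dict:
        return {"name": "cohort", "analyses": ["count_older_than"]}

    def run(self, analysis: str, **params: Any) -> dict:
        if analysis != "count_older_than":
            raise ValueError(f"unknown analysis: {analysis}")
        count = sum(1 for age in params["ages"] if age > params["min_age"])
        return {"analysis": analysis, "result": count}


def catalog(products: list) -> dict:
    """Client-side discovery: works for any product that follows the contract."""
    listing = {}
    for product in products:
        info = product.describe()
        listing[info["name"]] = info["analyses"]
    return listing


listing = catalog([SalesProduct(), CohortProduct()])
total = SalesProduct().run("total", amounts=[10, 20, 30])
older = CohortProduct().run("count_older_than", ages=[34, 61, 70], min_age=60)
```

The client never needs product-specific code: it discovers what each product offers through `describe` and then calls `run` with the parameters that analysis expects.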
Data must be FAIR
Tag.bio’s data mesh focuses on making the data FAIR.
Findable: “I know where to look for any of my organization’s data.”
Accessible: “I can access the data that I need.”
Interoperable: “I can integrate this data with other data.”
Reusable: “I can use the same data to ask and answer different types of questions.”
Advantages of the data mesh
Attempting to bring all of your data into the same place and the same universal schema is unsustainable at scale. A decentralized data mesh solves that problem, with domain-driven – yet harmonized – data products designed and quickly deployed by smaller, specialized teams.
Each data product in the mesh can be worked on independently. As each data product is containerized, it can be deployed as soon as any changes are ready.
As new data arises, new data products can be constructed and deployed to the mesh. The same data product can be accessed by many analysis platforms and teams. This allows your organization to scale your data mesh as you grow.
Accelerate time to value
Get value from day one. A single data product with a single analysis app can be released within hours, which allows domain experts to start asking and answering their own questions right away.