Data Product
Gytis Repečka edited this page 2026-01-15 10:58:31 +02:00

A data product is a reusable, self-contained package that combines data, metadata, semantics and templates to support diverse business use cases. It can include components such as datasets, dashboards, reports, machine learning models, pre-built queries or data pipelines.
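As a rough illustration of that definition, the four building blocks (data, metadata, semantics and templates) can be modelled as a small descriptor. This is only a minimal sketch: the class `DataProduct`, its field names, the helper `is_self_contained` and the sample values are all hypothetical and do not come from any particular platform or standard.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor; field names are illustrative, not a standard."""

    name: str
    # References to the data itself (tables, files, streams).
    datasets: list[str] = field(default_factory=list)
    # Descriptive metadata such as owner or refresh schedule.
    metadata: dict[str, str] = field(default_factory=dict)
    # Business meaning of fields, keyed by field name.
    semantics: dict[str, str] = field(default_factory=dict)
    # Pre-built queries, dashboards or report templates.
    templates: list[str] = field(default_factory=list)

    def is_self_contained(self) -> bool:
        """Return True when all four building blocks are present."""
        return all([self.datasets, self.metadata, self.semantics, self.templates])


# Made-up example values, for illustration only.
sales = DataProduct(
    name="sales_orders",
    datasets=["warehouse.sales.orders"],
    metadata={"owner": "sales-domain-team"},
    semantics={"net_amount": "Order value after discounts, excluding tax"},
    templates=["monthly_revenue.sql"],
)
```

Treating the package as one object is what makes it reusable: a consumer receives the data together with the context needed to interpret it.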

The concept of data products gained prominence in 2019 when Zhamak Dehghani introduced data products as a core component of the data mesh architecture.¹

Characteristics

  • Discoverable
  • Understandable
  • Interoperable
  • Shareable
  • Secure
  • Reusable

Types

Databricks training material² distinguishes the following types of Data Products:

  • Source-aligned Data Product - a usable and relevant representation of source data. This is a private data asset that is not shared with others.
  • Derived Data Product (or simply Data Product) - a cleansed and enriched data asset designed for analytical use cases. It provides a single source of truth with a unified view across the domain (or subject area) and consistent data definitions. This type of data asset is shared, reusable and available across the organization.
  • Customer-aligned Data Product - a derivative type of Data Product, built on lower-level Data Products. It is designed for a specific end-user purpose, e.g. dashboards, reports or calculations. It may or may not be shared across the organization.
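The sharing rules of the three types above can be sketched in code. This is a hedged illustration only: the enum `ProductType`, the function `is_shared` and its `shared_override` parameter are hypothetical names, and defaulting a customer-aligned product to "not shared" is an assumption made here, since the source only says it may or may not be shared.

```python
from enum import Enum
from typing import Optional


class ProductType(Enum):
    """The three types listed above (identifiers are illustrative)."""

    SOURCE_ALIGNED = "source-aligned"
    DERIVED = "derived"
    CUSTOMER_ALIGNED = "customer-aligned"


def is_shared(product_type: ProductType, shared_override: Optional[bool] = None) -> bool:
    """Return whether a product of the given type is shared across the organization.

    shared_override is only consulted for customer-aligned products,
    because the other two types have a fixed sharing rule.
    """
    if product_type is ProductType.SOURCE_ALIGNED:
        return False  # private, not shared with others
    if product_type is ProductType.DERIVED:
        return True  # shared and reusable across the organization
    # Customer-aligned: may or may not be shared.
    # Assumption: treat as not shared unless stated otherwise.
    return bool(shared_override)
```

For example, `is_shared(ProductType.CUSTOMER_ALIGNED, shared_override=True)` models a dashboard that its owners chose to publish organization-wide.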

  1. What is a data product? (2026) IBM.

  2. Databricks customer academy (2025).