Formalize dbt producer coverage matrix (Scenarios, Facets, and Spec vs. Implementation Versions) #292

@roller100

Summary

The current dbt compatibility test suite serves as a good smoke test but lacks a formalized coverage matrix. Our configuration files (versions.json and the scenario-specific config.json) dictate what runs, but we have no explicit baseline for what must be covered before a dbt version or scenario counts as "supported."

To prevent regressions and clarify support windows, we need to explicitly define our coverage expectations across three specific dimensions.

1. Version Coverage (Spec vs. Implementation)

As analyzed in previous design explorations (e.g. MULTI_SPEC_ANALYSIS.md), our current testing tends to lock implementation versions and spec versions together. We use versions.json (component_version) and config.json (openlineage_versions) to define test boundaries, but these fields represent two distinct concepts:

  • Implementation Versions: The openlineage-dbt package and dbt-core version (e.g., 1.8.0, 1.30.0). How many trailing versions are we committing to test?
  • Specification Versions: The JSON schema for event validation (e.g., 2-0-2). Since multiple implementation versions often bundle the same specification (e.g., 1.23.0 and 1.30.0 both bundle spec 2-0-2), we need a documented policy to unlock cross-version matrix testing. That policy should ensure that an older implementation works with newer spec validators, and vice versa.
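
To make the cross-version idea concrete, here is a minimal sketch of how such a matrix could be enumerated. It is not taken from the repository: `BUNDLED_SPEC` and `build_matrix` are hypothetical names, the only real pairing is the one quoted above (1.23.0 and 1.30.0 bundling spec 2-0-2), and `NEWER-SPEC` is a placeholder rather than an actual spec version.

```python
from itertools import product

# Assumption: a mapping from implementation release to the spec it bundles.
# The two entries below are the pairing mentioned in this issue.
BUNDLED_SPEC = {"1.23.0": "2-0-2", "1.30.0": "2-0-2"}


def build_matrix(implementations, specs, bundled=BUNDLED_SPEC):
    """Pair every implementation with every spec and flag the bundled pairs."""
    return [
        {"implementation": impl, "spec": spec, "bundled": bundled.get(impl) == spec}
        for impl, spec in product(implementations, specs)
    ]


if __name__ == "__main__":
    # "NEWER-SPEC" is a placeholder, not a real spec version.
    for row in build_matrix(["1.23.0", "1.30.0"], ["2-0-2", "NEWER-SPEC"]):
        kind = "bundled" if row["bundled"] else "cross-version"
        print(f'{row["implementation"]} x {row["spec"]}: {kind}')
```

Rows flagged as cross-version are the combinations the current locked setup never exercises. A support policy would decide which of them are required and which are best-effort.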

2. Facet Coverage

We need to define a baseline of core OpenLineage facets that every dbt scenario must generate and validate.

  • Looking at scenarios/csv_to_postgres/config.json, we currently tag tests with facets: ["dataSource", "schema", "columnLineage", "sql"].
  • We should standardize which facets are mandatory for all scenarios versus which are optional/scenario-specific.
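
One way a mandatory-facet baseline could be enforced is a small check run over each scenario's config. The sketch below assumes a config shape of `{"tests": [{"facets": [...]}]}`, which may not match the real config.json layout. `REQUIRED_FACETS` and `missing_facets` are hypothetical names, and the choice of `dataSource` and `schema` as mandatory is purely illustrative, since that decision is exactly what this issue asks for.

```python
# Assumption: which facets are mandatory is still undecided; this set is a placeholder.
REQUIRED_FACETS = frozenset({"dataSource", "schema"})


def missing_facets(config, required=REQUIRED_FACETS):
    """Return the required facets that no test in the scenario config is tagged with."""
    tagged = set()
    # Assumed shape: each entry under "tests" carries its own "facets" list.
    for test in config.get("tests", []):
        tagged.update(test.get("facets", []))
    return sorted(set(required) - tagged)


if __name__ == "__main__":
    # Facet tags copied from the csv_to_postgres example in this issue.
    example = {"tests": [{"facets": ["dataSource", "schema", "columnLineage", "sql"]}]}
    print(missing_facets(example))
```

A non-empty result would mark the scenario as below baseline, while facets outside the required set (here `columnLineage` and `sql`) stay optional.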

3. Scenario Coverage

What real-world dbt topologies are required to prove compatibility?

  • We should maintain a checklist of baseline dbt behaviors (e.g., seeds, incremental models, tests, snapshots) that must be present in the scenarios/ directory to satisfy a "fully tested" implementation.
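
The checklist could also be audited mechanically. The following sketch assumes each scenario is tagged with the dbt behaviors it exercises. No such tagging exists in the issue, so `audit_scenarios`, `BASELINE_BEHAVIORS`, and the `seeds` tag on csv_to_postgres are hypothetical and only illustrate the idea.

```python
# Baseline behaviors listed in this issue; the final checklist may differ.
BASELINE_BEHAVIORS = ("seeds", "incremental models", "tests", "snapshots")


def audit_scenarios(scenario_behaviors, baseline=BASELINE_BEHAVIORS):
    """Return the baseline behaviors that no scenario claims to cover."""
    covered = set()
    for behaviors in scenario_behaviors.values():
        covered.update(behaviors)
    # Preserve checklist order so the report reads the same on every run.
    return [behavior for behavior in baseline if behavior not in covered]


if __name__ == "__main__":
    # Illustrative tagging only: what csv_to_postgres really covers is not stated here.
    print(audit_scenarios({"csv_to_postgres": ["seeds"]}))
```

An empty result would correspond to a "fully tested" implementation under the checklist, and any remaining entries are the gaps to fill with new scenarios.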

Requested Outcome

  1. Document a concrete support matrix/window for dbt-core and OpenLineage spec versions, addressing the distinction between implementation and spec versions.
  2. Establish a standard baseline of required Facets for any new dbt scenario added to config.json.
  3. Audit our existing scenarios against this new baseline to identify gaps in our current test suite.
