Skip to content

perf: skip unnecessary metadata fetch calls for tags when not configured#1387

Open
tejassp-db wants to merge 17 commits intomainfrom
PECOBLR-2497/skip-unused-metadata-calls
Open

perf: skip unnecessary metadata fetch calls for tags when not configured#1387
tejassp-db wants to merge 17 commits intomainfrom
PECOBLR-2497/skip-unused-metadata-calls

Conversation

@tejassp-db
Copy link
Copy Markdown
Collaborator

Summary

  • Skips fetch_tags and fetch_column_tags queries to information_schema during incremental and view materializations when the model has no tags configured
  • Falls back to fetching metadata when model config is unavailable (safe default)
  • No behavior change for models with tags configured — diffs still computed correctly

What changed

  • Added requires_server_metadata_for_diff() to TagsConfig and ColumnTagsConfig to signal when server metadata is needed
  • IncrementalTableAPI._describe_relation and ViewAPI._describe_relation now check model config before making tag fetch calls
  • get_relation_config accepts the model config to pass through to _describe_relation
  • Fixed duplicate TagsProcessor in StreamingTableConfig.config_components

Test plan

  • Unit tests: empty tags (skip), non-empty tags (fetch), null config fallback (fetch), both tags + column tags present, hive_metastore (skip)

  • Functional tests: override fetch_tags/fetch_column_tags macros to raise errors, confirming calls are actually skipped when expected

    PECOBLR-2497

Skip fetch_tags and fetch_column_tags information_schema queries during
incremental and view materializations when the model has no tags configured.
This avoids unnecessary server roundtrips on every run for models that don't
use tags, while preserving full fetch behavior when tags are present or
when the model config is unavailable.

PECOBLR-2497
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 10, 2026

Coverage report

Click to see where and how coverage changed

FileStatementsMissingCoverageCoverage
(new stmts)
Lines missing
  dbt/adapters/databricks
  impl.py 967, 975-984, 1070-1072, 1103-1106, 1146, 1164-1165
  dbt/adapters/databricks/relation_configs
  base.py 41
  column_tags.py
  tags.py
Project Total  

This report was generated by python-coverage-comment-action

Add unit and functional coverage for skipping tag metadata queries when model
config does not require them. This protects the new fetch-planning logic
across incremental, view, streaming table, and materialized view test paths
without changing unrelated unstaged work.
tejassp-db and others added 9 commits April 27, 2026 10:28
partial_parse.msgpack does not persist compiled_code, causing
get_config_from_model to raise "Cannot compile model ... with no SQL
query" for materialized view and streaming table change tests.

Co-authored-by: Isaac
Co-authored-by: Isaac
The previous `dict and len(...) > 0` short-circuited to a dict literal
when empty, breaking the declared `-> bool` return type.

Co-authored-by: Isaac
Copy link
Copy Markdown
Collaborator

@sd-db sd-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall intentation of the change looks good, there are a few things specifically where we need changes

  • Function signature - make new param optional + naming + other related changes
  • Coverage - Can you confirm on coverage across all supported materialisation types (MV/ST missing; Views don't have support for column_tags)
  • Functional test changes

Comment thread dbt/adapters/databricks/impl.py Outdated
Comment thread dbt/adapters/databricks/impl.py Outdated
Comment thread dbt/adapters/databricks/impl.py Outdated
Comment thread dbt/adapters/databricks/impl.py Outdated
Comment thread dbt/adapters/databricks/impl.py Outdated
Comment thread dbt/adapters/databricks/impl.py
Comment thread tests/functional/adapter/views/test_view_metadata_fetch_skips.py Outdated
Comment thread tests/functional/adapter/incremental/test_incremental_metadata_fetch_skips.py Outdated
else:
results["information_schema.tags"] = None

column_tag_config = (
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comments on naming + checks on desried_column_tag_config

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suffix _config and surrounding code makes it obvious that its coming from the user configuration.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable naming should still be intentional and descriptive, we should update it to model_column_tag_config

Comment thread dbt/adapters/databricks/impl.py
- Add functional tests for optional tag fetching in Materialized view.
- Fix tests to check exceptions, and not log lines.
- Make parameter optional to be backward compatible with jinja calls.
Comment thread dbt/adapters/databricks/impl.py
Copy link
Copy Markdown
Collaborator

@sd-db sd-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some more comments, overall much better. I still feel we should add a comment in all the materialisation types where we are doing this optimisation that this only works because we don't support deleting/removing tags through dbt-databricks, as that will break the optimisations done and better to encode as a comment

Comment thread dbt/adapters/databricks/impl.py
results["information_schema.tags"] = adapter.execute_macro("fetch_tags", kwargs=kwargs)
results["show_tblproperties"] = adapter.execute_macro("fetch_tbl_properties", kwargs=kwargs)

table_tag_config = model_config.config.get(TagsProcessor.name) if model_config else None
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to model_table_tag_config we should differentiate between desired(model) configs and existing (relation) configs better

Comment thread dbt/adapters/databricks/impl.py
@sd-db sd-db self-requested a review May 8, 2026 08:31
Copy link
Copy Markdown
Collaborator

@sd-db sd-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewing tests 1/2


from tests.functional.adapter.helpers import FAIL_IF_TAG_FETCH_CALLED_MACROS

VIEW_WITHOUT_TAGS_SQL = """
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should move this inside fixtures file

"""


def get_model_config(project, relation: BaseRelation):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this ? Since model_config is optional in the tests we can just not pass it and that also tests for other behaviour overall. Look at the places where this gets called it is most a no-op. I think this is mostly an artifact of the config not being Optional earlier. Since now it is we can look to remove as well.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently this is adding extra complexity while mostly being no-op

adapter.execute_macro.return_value = MagicMock()

results = MaterializedViewAPI._describe_relation(adapter, MagicMock())
results = MaterializedViewAPI._describe_relation(adapter, MagicMock(), Mock())
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: while the change is okay, it would be better if just did not pass model_config here, i.e no change as model_config being None, triggers the actual tag fetch macro which is what the test is looking for. This test was added when there no optimisation to not fetch and better to test that path instead of the else path.

Copy link
Copy Markdown
Collaborator

@sd-db sd-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewing Tests 2/2

)


class TestDescribeRelationMetadataFetchPlanning:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: These tests look really good, but we missed MV materealisation here which would have this even more complete. Fine without as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants