Summary
Retrieving the proper descriptions for Databricks Unity Catalog objects via dbt can be done in‑line by using the --description flag on the materialization configuration or by leveraging a custom macro that emits ALTER TABLE or ALTER SCHEMA statements during the on-run-end hook. This post outlines the root cause behind the limitation, the impact on real systems, and a senior‑engineer approach to fix it.
Root Cause
- dbt does not expose Schema/Database description support in its core output.
- Databricks Unity Catalog requires a separate
ALTERstatement to set metadata. - The default dbt lifecycle does not hook into catalog‑level DDL after model creation.
Why This Happens in Real Systems
- Schema drift: Production catalogs rarely receive automated description updates, leading to ambiguous data lineage.
- Manual overhead: Data stewards need to issue SQL manually, which is error‑prone and not version‑controlled.
- Audit gaps: Missing descriptions break data governance and compliance checks.
Real-World Impact
- Data discoverability decreases for new team members.
- Regulatory audits flag absent metadata, potentially delaying releases.
- Data quality assessments suffer without contextual schema information.
- Operational cost rises due to manual SQL maintenance and re‑runs.
Example or Code (if necessary and relevant)
-- Example macro: macros/add_schema_description.sql
{% macro add_schema_description(schema_name, description) %}
ALTER SCHEMA {{ schema_name }} SET TAGS (description="{{ description }}");
{% endmacro %}
-- Example usage in dbt_project.yml
on-run-end:
- "{{ add_schema_description('my_schema', 'Batch data processed daily') }}"
How Senior Engineers Fix It
- Create reusable macros that generate
ALTER SCHEMAorALTER TABLEstatements. - Hook the macros into
on-run-endto batch up alterations after model deployments. - Leverage YAML docs to document schema and table descriptions, ensuring sync with code.
- Version‑control all metadata changes alongside model SQL in the repo.
- Automate tests that verify descriptions exist:
dbt test --data-tests.
Why Juniors Miss It
- Assume the catalog auto‑populates descriptions from dbt model comments.
- Misinterpret the
--descor--descriptionoptions as DB‑specific. - Leave out the necessary
on-run-endhook, thinking the macro will run automatically. - Overlook the need for tags or extended properties in Unity Catalog, focusing solely on standard SQL
COMMENT ON.