DBT Core Databricks Project 🚀

Overview

This project implements data transformations using dbt (data build tool) with Databricks as the underlying data warehouse. The project follows a medallion architecture (Bronze, Silver, Gold layers) for data processing and includes comprehensive testing, documentation, and CI/CD integration.

Project Structure: dbt_core_databricks

dbt_core_databricks/
├── analyses/
│   ├── .gitkeep
│   └── macro_demo.sql
├── macros/
│   ├── .gitkeep
│   ├── current_timestamp.sql
│   ├── generate_schema_name.sql
│   └── multiply_cols.sql
├── models/
│   ├── bronze/
│   │   ├── bronze_orders.sql
│   │   ├── bronze_reviews.sql
│   │   └── bronze_users.sql
│   ├── silver/
│   │   ├── _silver.yml
│   │   ├── silver_orders.sql
│   │   ├── silver_products.sql
│   │   └── silver_users.sql
│   ├── gold/
│   │   ├── gold.yml
│   │   ├── gold_avg_rating__daily.sql
│   │   └── gold_sales__daily.sql
│   └── sources/
│       ├── _sources.md
│       └── landing_sources.yml
├── seeds/
│   └── .gitkeep
├── snapshots/
│   ├── .gitkeep
│   ├── _snapshots.yml
│   └── products_snapshots.sql
├── tests/
│   ├── .gitkeep
│   └── generic/
│       └── assert_non_negative.sql
├── .gitignore
├── .user.yml
├── README.md
├── dbt_project.yml
├── package-lock.yml
└── packages.yml

Screenshots (images not included here): Dashboard Screenshot, Lineage Graph, SQL Compilation Example, Execution Commands in Databricks, Job Execution in dbt, Additional Lineage Visualization.

📊 Data Model Overview

Bronze Layer (Raw Data)

- Direct ingestion from landing zone
- Tables: orders, products, reviews, users
- Minimal transformations
- PII data tagged with 'contains_pii'

Silver Layer (Transformed)

- Cleaned and standardized data
- Business logic applied
- Key models:
  - silver_orders: Calculated order amounts
  - silver_products: Current product information
  - silver_users: Anonymized user data

Gold Layer (Business Ready)

- Aggregated analytics views
- Key models:
  - gold_sales__daily: Daily sales analytics
  - gold_avg_rating__daily: Product rating analytics
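
To make the Gold layer's role concrete, here is a minimal Python sketch of the kind of daily aggregation a model such as gold_sales__daily might perform. The project's actual model is written in SQL and is not shown in this README; the field names (`order_date`, `amount`) and the sample rows are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal


def daily_sales(orders):
    """Sum order amounts and count orders per calendar day."""
    totals = defaultdict(lambda: {"total_amount": Decimal("0"), "order_count": 0})
    for order in orders:
        day = totals[order["order_date"]]
        day["total_amount"] += order["amount"]
        day["order_count"] += 1
    # Return a plain dict ordered by date.
    return dict(sorted(totals.items()))


# Hypothetical Silver-layer rows (not taken from the project).
orders = [
    {"order_date": date(2024, 1, 1), "amount": Decimal("19.99")},
    {"order_date": date(2024, 1, 1), "amount": Decimal("5.01")},
    {"order_date": date(2024, 1, 2), "amount": Decimal("42.00")},
]
result = daily_sales(orders)
```

In SQL this would typically be a GROUP BY on the order date; the sketch only mirrors that shape.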

🚀 Getting Started

Prerequisites

- dbt Core installed
- Databricks account and cluster
- Python 3.7+

Setup

Clone the repository:

```bash
git clone <repository-url>
```

Install dependencies:

```bash
pip install dbt-databricks
```

Configure profiles.yml:

```yaml
dbt_core_databricks:   # must match the profile name set in dbt_project.yml
  outputs:
    dev:
      type: databricks
      catalog: your_catalog
      schema: your_schema
      host: your-databricks-host
      http_path: your-http-path
      token: your-token
      threads: 1
  target: dev          # default target
```

🛠️ Features

Data Testing

- Generic tests for data quality
- Custom test macro: assert_non_negative
- Unit tests for transformations
- Source freshness checks
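
A dbt test is a query that selects failing rows, and the test passes when no rows come back. The Python sketch below mirrors that idea for a non-negativity check. It is not the project's assert_non_negative macro, whose SQL is not shown here; the function signature, the sample rows, and the choice to ignore NULL values (as a SQL `< 0` comparison would) are assumptions.

```python
def assert_non_negative(rows, column_name):
    """Return the rows whose value in column_name is negative.

    None is skipped, mirroring SQL where NULL < 0 is not true.
    """
    return [
        row for row in rows
        if row[column_name] is not None and row[column_name] < 0
    ]


# Hypothetical rows for illustration.
rows = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": -3.5},
    {"order_id": 3, "amount": None},
]
failures = assert_non_negative(rows, "amount")
test_passed = len(failures) == 0  # a dbt test passes when zero rows are returned
```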

Macros

- multiply_columns_and_round: Calculate monetary values
- generate_schema_name: Custom schema handling
- current_timestamp: Timestamp utilities
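
As a rough illustration of what a multiply-and-round helper for monetary values does, here is a Python sketch. The real multiply_columns_and_round macro is Jinja/SQL and its arguments are not documented in this README, so the function name, the parameters, the default precision of 2, and the half-up rounding mode are all assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP


def multiply_and_round(quantity, unit_price, precision=2):
    """Multiply two values and round the product to `precision` decimal places."""
    quantum = Decimal(1).scaleb(-precision)  # Decimal("0.01") when precision=2
    product = Decimal(str(quantity)) * Decimal(str(unit_price))
    return product.quantize(quantum, rounding=ROUND_HALF_UP)


# Hypothetical inputs: 3 units at 19.995 each.
total = multiply_and_round(3, "19.995")
```

Decimal arithmetic is used here to avoid binary floating-point error in currency amounts; the warehouse's own rounding behavior may differ.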

Snapshots

- Type 2 SCD for products table
- Timestamp-based tracking
- Configured in snapshots/products_snapshots.sql
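
The following Python sketch shows the general idea behind timestamp-based Type 2 tracking: when a source row carries a newer timestamp than the current snapshot row, the old version is closed and a new version is appended. `dbt_valid_from` and `dbt_valid_to` are the validity columns dbt adds to snapshot tables; the key and timestamp column names (`product_id`, `updated_at`) and the sample data are assumptions, and dbt's actual implementation is a SQL merge, not this loop.

```python
from datetime import datetime


def apply_snapshot(history, source_rows, key="product_id", updated_at="updated_at"):
    """Apply one snapshot run: close changed versions and append new ones."""
    # Index the currently valid version of each record.
    current = {row[key]: row for row in history if row["dbt_valid_to"] is None}
    for src in source_rows:
        existing = current.get(src[key])
        if existing is not None and src[updated_at] <= existing[updated_at]:
            continue  # source has no newer version
        if existing is not None:
            existing["dbt_valid_to"] = src[updated_at]  # close the old version
        new_row = {**src, "dbt_valid_from": src[updated_at], "dbt_valid_to": None}
        history.append(new_row)
        current[src[key]] = new_row
    return history


# Two hypothetical runs: the price changes between them.
history = []
apply_snapshot(history, [{"product_id": 1, "price": 10.0, "updated_at": datetime(2024, 1, 1)}])
apply_snapshot(history, [{"product_id": 1, "price": 12.0, "updated_at": datetime(2024, 2, 1)}])
```

After the second run the history holds both versions, and only the latest one has an open `dbt_valid_to`.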

📚 Documentation

- Comprehensive table and column descriptions
- Source documentation in models/sources/_sources.md
- Generated documentation available via dbt docs

🔄 Development Workflow

Basic Commands

```bash
dbt run                     # Run all models
dbt test                    # Run all tests
dbt docs generate           # Generate documentation
dbt docs serve              # Serve documentation locally
dbt run --select tag:daily  # Run tagged models
```

Model Tags

- contains_pii: Models with sensitive data
- daily: Daily refresh models
- weekly: Weekly refresh models

🔐 Security

- PII data tagged and tracked
- Credentials managed via environment variables
- No sensitive information in repository
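
One way to keep credentials out of the repository is to read them from the environment and fail early when any are missing. The Python sketch below illustrates that pattern; the variable names are hypothetical and not taken from this project. In dbt itself, the equivalent is usually done with the `env_var` Jinja function inside profiles.yml.

```python
import os

# Hypothetical variable names; use whatever names your environment defines.
REQUIRED_VARS = (
    "DBT_DATABRICKS_HOST",
    "DBT_DATABRICKS_HTTP_PATH",
    "DBT_DATABRICKS_TOKEN",
)


def load_credentials(env=os.environ):
    """Return the required settings, raising if any are unset or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


# Demonstration with an explicit mapping instead of the real environment.
creds = load_credentials({
    "DBT_DATABRICKS_HOST": "your-databricks-host",
    "DBT_DATABRICKS_HTTP_PATH": "your-http-path",
    "DBT_DATABRICKS_TOKEN": "your-token",
})
```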

🤝 Contributing

- Fork the repository
- Create a feature branch
- Commit changes
- Push to the branch
- Create a Pull Request

Built using dbt and Databricks
