
dbt - Advanced Features
Jinja macros, custom tests, packages, hooks, snapshots (SCD), incremental models, dbt Cloud, CI/CD
1. In dbt, what is the main purpose of Jinja macros?
Answer
Jinja macros enable code reuse across multiple dbt models. They work like functions that accept parameters and return dynamically generated SQL code. This avoids code duplication and makes it easier to maintain complex transformations throughout the project.
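As a minimal sketch of that reuse, the example below assumes a hypothetical macro named cents_to_dollars, a hypothetical model stg_payments, and an upstream model raw_payments with amount_cents and fee_cents columns. None of these names come from the original text. The same expression is written once and called twice with different arguments.

```sql
{# macros/cents_to_dollars.sql (hypothetical file and macro name) #}
{% macro cents_to_dollars(column_name, scale=2) %}
    round({{ column_name }} / 100.0, {{ scale }})
{% endmacro %}

{# models/stg_payments.sql (hypothetical model that reuses the macro) #}
select
    payment_id,
    {{ cents_to_dollars('amount_cents') }} as amount_usd,
    {{ cents_to_dollars('fee_cents', scale=4) }} as fee_usd
from {{ ref('raw_payments') }}
```

If the rounding rule changes later, only the macro body needs to be edited, and every model that calls it picks up the change on the next run.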
2. How do you define a reusable dbt macro in a file?
Answer
A dbt macro is defined with Jinja's {% macro %} ... {% endmacro %} block in a .sql file inside the project's macros folder (the default macro path). The macro name follows the macro keyword, with its parameters in parentheses. Once defined, the macro can be called from any model in the project.
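The layout described above might look like the following sketch. The file name, the macro name limit_in_dev, the target name 'dev', and the model stg_events are all assumptions made for illustration. The target variable itself is part of the dbt Jinja context.

```sql
{# macros/limit_in_dev.sql (hypothetical file and macro name) #}
{% macro limit_in_dev(row_count=1000) %}
    {# target.name is the active dbt target; 'dev' is an assumed target name #}
    {% if target.name == 'dev' %}
        limit {{ row_count }}
    {% endif %}
{% endmacro %}

{# models/recent_events.sql (hypothetical model calling the macro) #}
select *
from {{ ref('stg_events') }}
{{ limit_in_dev(500) }}
```

The parameter row_count has a default value, so the macro can be called with or without an argument. On any target other than the assumed 'dev' one, the macro renders nothing and the query runs unrestricted.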
3. What is the difference between 'timestamp' and 'check' strategies for dbt snapshots?
Answer
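The 'timestamp' strategy relies on a last-updated column (configured as updated_at) to detect changes: a row is treated as changed when that column's value is newer than the one recorded in the snapshot. It is the more performant option because only one column is compared. The 'check' strategy compares the current and snapshotted values of the columns listed in check_cols and records a change when any of them differ, which is useful when no reliable timestamp column is available.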
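The two strategies can be compared side by side in the sketch below, written in the classic snapshot block syntax (recent dbt versions also allow snapshots to be configured in YAML). The source app.orders, the key order_id, the schema name, and the checked columns are hypothetical. The config keys strategy, unique_key, updated_at, and check_cols are the ones dbt uses.

```sql
{# snapshots/orders_snapshots.sql (hypothetical file) #}

{# Timestamp strategy: change detection relies on a single updated_at column #}
{% snapshot orders_snapshot_timestamp %}
{{
    config(
        target_schema='snapshots',
        unique_key='order_id',
        strategy='timestamp',
        updated_at='updated_at'
    )
}}
select * from {{ source('app', 'orders') }}
{% endsnapshot %}

{# Check strategy: change detection compares the listed columns #}
{% snapshot orders_snapshot_check %}
{{
    config(
        target_schema='snapshots',
        unique_key='order_id',
        strategy='check',
        check_cols=['status', 'shipping_address']
    )
}}
select * from {{ source('app', 'orders') }}
{% endsnapshot %}
```

In practice a table would normally be snapshotted with one strategy or the other. Both appear here only to show how the configuration differs.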
Which columns are automatically added by dbt when creating a snapshot?
How do you configure an incremental model with the 'merge' strategy in dbt?