Data Engineering

Airflow + dbt - Pipeline Orchestration

astronomer-cosmos, DbtDagParser, dbt run/test in Airflow, dependency management, end-to-end monitoring

20 interview questions · Senior
1

What is the main advantage of using astronomer-cosmos to integrate dbt into Airflow?

Answer

astronomer-cosmos automatically converts each dbt model into its own Airflow task, giving granular visibility into every model in the Airflow UI. Airflow features such as retries, alerting, and monitoring then apply at the model level rather than to the dbt project as a whole.

2

How does cosmos handle dependencies between dbt models in an Airflow DAG?

Answer

Cosmos parses the dbt project, either from a precompiled manifest.json or by running dbt ls, to extract the dependency graph between models. It then automatically creates the matching upstream/downstream relationships between the corresponding Airflow tasks, so the execution order defined by ref() calls in the dbt project is respected.
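To make the idea concrete, here is a minimal sketch of how a dependency graph can be read from manifest.json and turned into an execution order. It is not Cosmos's own code: it uses only the Python standard library, the manifest dict is a hypothetical, heavily trimmed stand-in for a real target/manifest.json, and the project name `shop`, the model names, and the helper `model_dependencies` are all made up for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical, trimmed stand-in for target/manifest.json.
# Real manifests key "nodes" by unique_id and list parents under depends_on.nodes.
manifest = {
    "nodes": {
        "model.shop.stg_orders": {
            "resource_type": "model",
            "depends_on": {"nodes": ["source.shop.raw.orders"]},
        },
        "model.shop.stg_customers": {
            "resource_type": "model",
            "depends_on": {"nodes": ["source.shop.raw.customers"]},
        },
        "model.shop.orders": {
            "resource_type": "model",
            "depends_on": {
                "nodes": ["model.shop.stg_orders", "model.shop.stg_customers"]
            },
        },
    }
}


def model_dependencies(manifest):
    """Map each model's unique_id to the set of upstream model unique_ids."""
    nodes = manifest["nodes"]
    models = {uid for uid, node in nodes.items() if node["resource_type"] == "model"}
    # Keep only parents that are models themselves; sources are not run as tasks here.
    return {
        uid: {parent for parent in nodes[uid]["depends_on"]["nodes"] if parent in models}
        for uid in models
    }


deps = model_dependencies(manifest)

# TopologicalSorter takes a node -> predecessors mapping and yields parents first,
# which mirrors the upstream >> downstream wiring between Airflow tasks.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

In this sketch the two staging models have no model-level parents, so they may appear in either order, while `model.shop.orders` always comes last. An orchestrator would create one task per key in `deps` and one upstream edge per parent.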

3

What is the difference between 'local' and 'docker' execution modes in cosmos?

Answer

In local mode, cosmos runs dbt directly on the Airflow worker, so dbt must be installed there. In docker mode, each dbt task runs in an isolated Docker container built from its own dbt image, which gives better isolation and more reproducible dependencies.

4

How do you configure cosmos to run only a subset of dbt models based on tags?

5

What is the role of DbtTaskGroup in the Airflow-dbt integration with cosmos?
