
Airflow + dbt - Pipeline Orchestration
astronomer-cosmos, DbtDagParser, dbt run/test in Airflow, dependency management, end-to-end monitoring
1. What is the main advantage of using astronomer-cosmos to integrate dbt into Airflow?
Answer
astronomer-cosmos (Cosmos) automatically converts dbt models into individual Airflow tasks, giving per-model visibility in the Airflow UI. Airflow features such as retries, alerting, and monitoring then apply at the model level rather than to the dbt project as a whole.
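As an illustration, a minimal sketch of a Cosmos DAG might look like the following. It assumes astronomer-cosmos and the Postgres provider are installed; the connection ID, schema, project path, and DAG ID are placeholders that do not come from the original text, and the DAG only loads when the path points at a real dbt project.

```python
from datetime import datetime

from cosmos import DbtDag, ProfileConfig, ProjectConfig
from cosmos.profiles import PostgresUserPasswordProfileMapping

# Hypothetical connection ID and schema: Cosmos builds the dbt profile
# from an existing Airflow connection.
profile_config = ProfileConfig(
    profile_name="default",
    target_name="dev",
    profile_mapping=PostgresUserPasswordProfileMapping(
        conn_id="postgres_default",
        profile_args={"schema": "public"},
    ),
)

# Cosmos renders the models of the dbt project as separate Airflow tasks.
dbt_models_dag = DbtDag(
    dag_id="dbt_models_dag",  # placeholder DAG ID
    project_config=ProjectConfig("/usr/local/airflow/dags/dbt/my_project"),  # placeholder path
    profile_config=profile_config,
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    # Standard Airflow default_args: retries apply to each model-level task.
    default_args={"retries": 2},
)
```

Because the retry setting is passed through Airflow's ordinary default_args, a failing model can be retried or cleared on its own without rerunning the whole project.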
2. How does cosmos handle dependencies between dbt models in an Airflow DAG?
Answer
Cosmos parses the dbt project to extract the dependency graph between models, either from dbt's manifest.json or by running dbt ls, depending on the configured load mode. It then automatically creates upstream/downstream relationships between the corresponding Airflow tasks, so the tasks run in the order defined by the ref() calls in the dbt project.
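The idea can be sketched with the standard library alone. The snippet below is not Cosmos's actual implementation; it only shows how model-to-model edges could be read from the "nodes" and "depends_on" fields of a manifest.json and turned into a valid execution order. The inline manifest and the project name jaffle_shop are made up for the example.

```python
import json
from graphlib import TopologicalSorter


def model_dependencies(manifest: dict) -> dict[str, set[str]]:
    """Map each model's unique_id to the set of upstream model unique_ids."""
    nodes = manifest.get("nodes", {})
    models = {
        uid for uid, node in nodes.items() if node.get("resource_type") == "model"
    }
    return {
        uid: {
            dep
            for dep in nodes[uid].get("depends_on", {}).get("nodes", [])
            if dep in models  # drop sources, seeds, and other non-model parents
        }
        for uid in models
    }


# Hypothetical, heavily trimmed manifest for illustration only.
manifest = json.loads("""
{
  "nodes": {
    "model.jaffle_shop.stg_orders": {
      "resource_type": "model",
      "depends_on": {"nodes": ["source.jaffle_shop.raw.orders"]}
    },
    "model.jaffle_shop.stg_customers": {
      "resource_type": "model",
      "depends_on": {"nodes": []}
    },
    "model.jaffle_shop.customers": {
      "resource_type": "model",
      "depends_on": {"nodes": [
        "model.jaffle_shop.stg_orders",
        "model.jaffle_shop.stg_customers"
      ]}
    }
  }
}
""")

deps = model_dependencies(manifest)
# static_order() yields every model after all of its upstream models.
order = list(TopologicalSorter(deps).static_order())
```

In an orchestrator, each entry in deps would become one task, and each upstream ID would become an upstream relationship on that task.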
3. What is the difference between 'local' and 'docker' execution modes in cosmos?
Answer
In local mode, Cosmos runs dbt directly on the Airflow worker, so dbt must be installed in an environment the worker can reach. In docker mode, each dbt task runs in an isolated Docker container started from its own dbt image, which gives better isolation and more reproducible dependencies.
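A hedged configuration sketch of the two modes, assuming astronomer-cosmos is installed: the dbt executable path and the image name are hypothetical, and the resulting objects would be passed to a DbtDag or DbtTaskGroup through its execution_config and operator_args arguments.

```python
from cosmos import ExecutionConfig
from cosmos.constants import ExecutionMode

# Local mode: dbt runs on the Airflow worker itself, so a dbt executable
# must be available there (hypothetical path to a dedicated virtualenv).
local_execution = ExecutionConfig(
    execution_mode=ExecutionMode.LOCAL,
    dbt_executable_path="/usr/local/airflow/dbt_venv/bin/dbt",
)

# Docker mode: each dbt task starts a container from an image that already
# contains dbt and the project (hypothetical image name).
docker_execution = ExecutionConfig(execution_mode=ExecutionMode.DOCKER)
docker_operator_args = {"image": "my-dbt-project:1.0.0"}
```

The trade-off is mostly operational: local mode has less startup overhead per task, while docker mode keeps dbt's dependencies out of the Airflow environment.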
How to configure cosmos to run only a subset of dbt models based on tags?
What is the role of DbtTaskGroup in the Airflow-dbt integration with cosmos?