
CI/CD and Code Quality
Ruff, Pylint, Poetry, GitHub Actions, CI/CD pipelines, automated tests, pre-commit hooks, code coverage
1. What is Ruff in the Python ecosystem?
Answer
Ruff is an extremely fast Python linter and formatter written in Rust. It can replace tools such as Flake8, isort, and Black while running roughly 10 to 100 times faster, according to the project's own benchmarks. Ruff supports over 700 lint rules and integrates easily into CI/CD pipelines and pre-commit hooks.
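As a rough illustration of what this kind of linting catches, the sketch below contains three common violations, annotated with the rule codes Ruff inherits from Pyflakes and pycodestyle (F401, F841, E711). The function names are hypothetical and exist only for this example. Running `ruff check` on such a file would be expected to report the commented lines.

```python
import os  # F401: `os` is imported but never used


def first_or_default(items, default=None):
    """Version with lint violations (hypothetical example)."""
    count = len(items)  # F841: local variable is assigned but never used
    if default == None:  # E711: comparison to None should use `is None`
        default = 0
    return items[0] if items else default


def first_or_default_clean(items, default=None):
    """Same behavior after addressing the findings above."""
    if default is None:
        default = 0
    return items[0] if items else default


if __name__ == "__main__":
    # Both versions behave identically; the linter only improves the source.
    print(first_or_default([], None), first_or_default_clean([], None))
    print(first_or_default([5]), first_or_default_clean([5]))
```

Formatting is a separate step: `ruff format` rewrites files in a Black-compatible style, and `ruff format --check` only reports whether changes would be needed, which is the usual choice in CI.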
2. What is the main role of the pyproject.toml file with Poetry?
Answer
The pyproject.toml file is the central configuration file for a Python project managed with Poetry. It defines project metadata (name, version, description), production and development dependencies, scripts, and the configuration of tools such as Ruff or pytest. This standardized file (introduced by PEP 518) can replace setup.py, requirements.txt, and setup.cfg.
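To make the sections described above concrete, here is a minimal sketch that parses a sample pyproject.toml with Python's standard `tomllib` module (Python 3.11+). The project name, versions, and script entry point are invented for illustration. The file uses the classic `[tool.poetry]` layout; newer Poetry releases also accept the standard `[project]` table.

```python
try:
    import tomllib  # standard library since Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: the tomli backport is installed

# Hypothetical pyproject.toml content; every name and version is illustrative.
PYPROJECT = """
[tool.poetry]
name = "demo-pipeline"
version = "0.1.0"
description = "Hypothetical example project"
authors = ["Jane Doe <jane@example.com>"]

[tool.poetry.dependencies]
python = "^3.11"
requests = "^2.31"

[tool.poetry.group.dev.dependencies]
pytest = "^8.0"
ruff = "^0.4"

[tool.poetry.scripts]
demo = "demo_pipeline.cli:main"

[tool.ruff]
line-length = 100

[tool.pytest.ini_options]
testpaths = ["tests"]

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
"""

config = tomllib.loads(PYPROJECT)
poetry = config["tool"]["poetry"]

# Metadata, production dependencies, and development dependencies
# live in separate tables of the same file.
metadata = {key: poetry[key] for key in ("name", "version", "description")}
prod_deps = poetry["dependencies"]
dev_deps = poetry["group"]["dev"]["dependencies"]

# Tool configuration sits under [tool.<name>] next to Poetry's own section.
ruff_line_length = config["tool"]["ruff"]["line-length"]

if __name__ == "__main__":
    print(metadata)
    print(sorted(prod_deps), sorted(dev_deps))
    print(ruff_line_length)
```

Keeping tool settings under `[tool.ruff]` and `[tool.pytest.ini_options]` means a single file configures packaging, linting, and testing.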
3. Which Poetry command installs all dependencies of an existing project?
Answer
The poetry install command reads the pyproject.toml and poetry.lock files and installs all project dependencies into an isolated virtual environment. If poetry.lock exists, the exact versions it records are installed, which ensures reproducibility. Otherwise, Poetry resolves the dependencies from pyproject.toml and creates the lock file.
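The branching described above can be sketched as a small Python function. This is only a model of the documented behavior, not Poetry's actual implementation, and `install_plan` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def install_plan(project_dir: Path) -> str:
    """Describe what `poetry install` is expected to do in project_dir.

    Hypothetical helper: it models the decision, it does not call Poetry.
    """
    if not (project_dir / "pyproject.toml").is_file():
        return "error: no pyproject.toml found"
    if (project_dir / "poetry.lock").is_file():
        return "install the exact versions recorded in poetry.lock"
    return "resolve dependencies from pyproject.toml, then create poetry.lock"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        (project / "pyproject.toml").write_text("[tool.poetry]\n")
        print(install_plan(project))  # no lock file yet
        (project / "poetry.lock").write_text("")
        print(install_plan(project))  # lock file present
```

In practice this is why poetry.lock is usually committed to version control: every developer machine and CI run then takes the second branch and installs the same versions.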
What is a pre-commit hook in the Git context?
What is the basic structure of a GitHub Actions workflow?