Data Engineering

Monitoring and Observability

Structured logging, metrics, alerting, SLA/SLO/SLI, data quality checks, Great Expectations, Soda

20 interview questions · Senior
1

What is structured logging in the context of a data pipeline?

Answer

Structured logging means emitting logs in a machine-parsable format (JSON or key-value pairs) rather than free text. This makes logs easy to filter, search, and aggregate in tools like Cloud Logging, Elasticsearch, or Datadog. In a data pipeline, it speeds up debugging because you can filter by DAG, task_id, run_id, or any business context attached to the log record.
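As a minimal sketch of the idea, the snippet below emits one JSON object per log line using only the Python standard library. The class name JsonFormatter, the helper build_logger, and the sample values (daily_sales, load, the run_id string) are illustrative assumptions, not part of any particular orchestrator's API.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    # Pipeline context fields copied into the output when present (assumed names).
    CONTEXT_KEYS = ("dag_id", "task_id", "run_id")

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for key in self.CONTEXT_KEYS:
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)


def build_logger(stream):
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("pipeline")
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]  # replace handlers so repeated calls do not duplicate output
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


# Write to an in-memory buffer so the emitted line can be inspected.
buffer = io.StringIO()
log = build_logger(buffer)
log.info(
    "rows loaded",
    extra={"dag_id": "daily_sales", "task_id": "load", "run_id": "2024-01-01T00:00"},
)

entry = json.loads(buffer.getvalue())
print(entry)
```

Because every line is a JSON object, a log backend can index dag_id, task_id, and run_id as fields instead of relying on text pattern matching. In a real pipeline the stream would typically be sys.stdout or a handler provided by the logging platform.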

2

What is the difference between an SLI (Service Level Indicator) and an SLO (Service Level Objective)?

Answer

An SLI is a measurable metric that quantifies one aspect of service quality (e.g., job success rate, pipeline latency). An SLO is a target set on that metric (e.g., 99.5% of jobs must succeed). An SLA is the contractual commitment made to customers, built on the internal SLOs and usually set looser than them. This hierarchy makes reliability monitoring objective and lets alerts fire when an SLO is missed, before the SLA itself is violated.
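The relationship between the three terms can be sketched in a few lines. The function names success_rate and evaluate, the status strings, and the 99.0% SLA figure are assumptions made for illustration; only the 99.5% SLO comes from the example above.

```python
def success_rate(succeeded: int, total: int) -> float:
    """SLI: fraction of pipeline runs that succeeded in the window."""
    if total <= 0:
        raise ValueError("total must be positive to compute a success rate")
    return succeeded / total


def evaluate(sli: float, slo_target: float, sla_target: float) -> str:
    """Compare an SLI against the internal SLO and the contractual SLA.

    The SLO is expected to be at least as strict as the SLA, so that an
    SLO miss acts as an early warning.
    """
    if slo_target < sla_target:
        raise ValueError("SLO should be at least as strict as the SLA")
    if sli >= slo_target:
        return "ok"
    if sli >= sla_target:
        return "slo_missed"  # alert now, the SLA still holds
    return "sla_violated"


SLO_TARGET = 0.995  # internal objective from the example above
SLA_TARGET = 0.990  # hypothetical contractual commitment

for succeeded in (997, 992, 980):
    sli = success_rate(succeeded, 1000)
    print(succeeded, evaluate(sli, SLO_TARGET, SLA_TARGET))
```

The middle state is the useful one: alerting on "slo_missed" gives the team time to react while the customer-facing commitment is still being met.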

3

What is an Expectation in Great Expectations?

Answer

An Expectation is a declarative assertion about data, such as expect_column_values_to_not_be_null or expect_column_values_to_be_between. When a dataset is validated, Great Expectations reports which Expectations passed or failed and can render the results as human-readable documentation (Data Docs). Expectations are grouped into Suites, which together define the quality contract for a dataset.
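To make the concept concrete without depending on a specific Great Expectations version, here is a plain-Python sketch of declarative expectations grouped into a suite. It is not the Great Expectations API: the Expectation class, the not_null and between helpers, validate_suite, and the result keys are all hypothetical, and in this sketch a missing value fails the range check.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Expectation:
    """A named, declarative assertion about one column (hypothetical type)."""

    name: str
    column: str
    predicate: Callable[[Any], bool]

    def validate(self, rows):
        # Collect every value in the column that fails the predicate.
        unexpected = [
            row.get(self.column)
            for row in rows
            if not self.predicate(row.get(self.column))
        ]
        return {
            "expectation": self.name,
            "column": self.column,
            "success": not unexpected,
            "unexpected_count": len(unexpected),
        }


def not_null(column):
    return Expectation("values_not_null", column, lambda v: v is not None)


def between(column, low, high):
    # Assumption of this sketch: None counts as out of range.
    return Expectation(
        "values_between", column, lambda v: v is not None and low <= v <= high
    )


def validate_suite(suite, rows):
    """Run every expectation; the suite passes only if all of them pass."""
    results = [expectation.validate(rows) for expectation in suite]
    return {"success": all(r["success"] for r in results), "results": results}


orders = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": -5.0},
    {"order_id": None, "amount": 20.0},
]
suite = [not_null("order_id"), between("amount", 0, 1000)]
report = validate_suite(suite, orders)
print(report)
```

The point of the declarative style is that the suite is data: it can be versioned, reviewed, and reused across runs, while the validation result carries enough detail (which check failed, how many values) to act on.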

4

What is the main role of Soda in a data pipeline?

5

What is a runbook in the context of data incident management?
