Question 1

What is the main architectural characteristic of Google Pub/Sub?

Accepted Answer

Google Pub/Sub is a serverless asynchronous messaging service that decouples message producers from consumers. Publishers send messages to topics without knowing the subscribers, and subscribers receive messages via subscriptions without knowing the publishers. This architecture enables independent horizontal scaling on both sides.

Question 2

What is the fundamental difference between a topic and a subscription in Pub/Sub?

Accepted Answer

A topic is a named channel to which publishers send messages, while a subscription is a named entity representing a subscriber's interest in receiving messages from a topic. A topic can have multiple subscriptions, and each subscription receives its own copy of every message published to the topic after that subscription was created.
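The fan-out behaviour can be illustrated with a small in-memory model. This is only a sketch: the `Topic` and `Subscription` classes and the `orders`, `billing` and `analytics` names are hypothetical and written for this example. They are not the Pub/Sub client library, and they ignore acknowledgements, retention and filtering.

```python
from collections import deque


class Subscription:
    """Toy stand-in for a subscription: one private queue of messages."""

    def __init__(self, name):
        self.name = name
        self.pending = deque()

    def pull(self, max_messages):
        # Return at most max_messages messages, oldest first.
        batch = []
        while self.pending and len(batch) < max_messages:
            batch.append(self.pending.popleft())
        return batch


class Topic:
    """Toy stand-in for a topic (assumption: not the real client API)."""

    def __init__(self, name):
        self.name = name
        self.subscriptions = {}

    def create_subscription(self, name):
        sub = Subscription(name)
        self.subscriptions[name] = sub
        return sub

    def publish(self, data):
        # The publisher only knows the topic; every existing
        # subscription gets its own copy of the message.
        for sub in self.subscriptions.values():
            sub.pending.append(data)


orders = Topic("orders")
billing = orders.create_subscription("billing")
orders.publish("order-1")
analytics = orders.create_subscription("analytics")  # created later
orders.publish("order-2")

print(billing.pull(10))    # ['order-1', 'order-2']
print(analytics.pull(10))  # ['order-2']
```

In this model the publisher never references a subscriber, and a subscription created after a message was published does not see that message.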

Question 3

When should a pull subscription be preferred over a push subscription?

Accepted Answer

A pull subscription is preferable when the subscriber needs to control the message consumption rate (flow control), process messages in large batches, or when the execution environment cannot expose a public HTTPS endpoint. Pull also makes load spikes easier to absorb, because the subscriber can adjust how many messages it requests at a time while the rest of the backlog stays in the subscription.
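As a rough illustration of subscriber-side flow control, the sketch below drains a backlog while never requesting more messages than the worker can handle per pull. The `pull` and `drain` helpers and the `capacity` parameter are hypothetical names for this example, and the backlog is a plain in-memory queue rather than a real subscription.

```python
from collections import deque


def pull(backlog, max_messages):
    """Toy pull request: take up to max_messages from the backlog."""
    batch = []
    while backlog and len(batch) < max_messages:
        batch.append(backlog.popleft())
    return batch


def drain(backlog, capacity):
    """Pull until the backlog is empty, at most `capacity` per request."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    batch_sizes = []
    while backlog:
        batch = pull(backlog, max_messages=capacity)
        # A real subscriber would process and acknowledge the batch here;
        # messages not yet pulled simply wait in the backlog.
        batch_sizes.append(len(batch))
    return batch_sizes


# A spike of 10 messages handled by a worker that takes 4 at a time.
spike = deque(f"msg-{i}" for i in range(10))
batch_sizes = drain(spike, capacity=4)
print(batch_sizes)  # [4, 4, 2]
```

The consumer sets the pace here: a spike lengthens the backlog instead of overloading the worker, which is the property a push endpoint cannot guarantee as directly.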

Google Pub/Sub - Data Streaming

How does the acknowledgement mechanism work in Pub/Sub?

What is the role of a dead letter topic in Pub/Sub?
