
IAM and Data Security
Least privilege, service accounts, GCP roles, encryption at rest/in transit, data masking, audit logs, GDPR compliance, VPC Service Controls
1. What is the fundamental principle to apply when assigning IAM permissions in GCP?
Answer
The principle of least privilege means granting only the permissions strictly necessary to accomplish a task. In Data Engineering, this means a pipeline should only have access to the buckets, datasets, and tables it actually needs. This principle reduces the attack surface and limits potential damage if a service account is compromised.
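As a minimal sketch of that idea, the snippet below compares the roles a pipeline's service account holds with the roles it actually needs, resource by resource. The resource names, the `required` and `granted` mappings, and the `excess_grants` helper are all hypothetical; only the role IDs are real predefined roles.

```python
# Hypothetical least-privilege check: report roles a pipeline holds but does not need.
# The resource identifiers below are invented labels, not real URI schemes.

def excess_grants(granted, required):
    """Return {resource: sorted list of roles that are granted but not required}."""
    excess = {}
    for resource, roles in granted.items():
        extra = set(roles) - set(required.get(resource, ()))
        if extra:
            excess[resource] = sorted(extra)
    return excess


# What the pipeline needs in order to run.
required = {
    "bucket/raw-events": {"roles/storage.objectViewer"},
    "dataset/analytics": {"roles/bigquery.dataEditor"},
}

# What the pipeline's service account currently holds.
granted = {
    "bucket/raw-events": {"roles/storage.objectViewer", "roles/storage.admin"},
    "dataset/analytics": {"roles/bigquery.dataEditor"},
    "dataset/payroll": {"roles/bigquery.dataViewer"},
}

report = excess_grants(granted, required)
for resource, roles in sorted(report.items()):
    print(resource, "->", ", ".join(roles))
```

Anything that appears in the report is a candidate for removal: here, an admin role on a bucket the pipeline only reads, and access to a dataset it never touches.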
2. What is the difference between a service account and a user account in GCP?
Answer
A service account is an identity designed for applications and services, while a user account represents a person. Service accounts have no password and are built for automation: they authenticate either with a key pair (for example a downloaded JSON key file) or, preferably, with keyless mechanisms such as Workload Identity. In Data Engineering, each pipeline should have its own service account with specific permissions.
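In an IAM policy the two identity types are told apart by the prefix on the member string: `user:` for a person and `serviceAccount:` for a service account. The sketch below classifies members on that basis. The `principal_kind` helper, the project ID, and the e-mail addresses are hypothetical.

```python
# Classify IAM policy members by their type prefix.
# "user:" and "serviceAccount:" are prefixes used in IAM policy bindings;
# the addresses and project ID below are invented examples.

def principal_kind(member):
    """Return a human-readable label for an IAM member string."""
    prefix, _, _identifier = member.partition(":")
    if prefix == "serviceAccount":
        return "service account"
    if prefix == "user":
        return "user account"
    return "other"


members = [
    "user:alice@example.com",
    "serviceAccount:ingest-pipeline@my-project.iam.gserviceaccount.com",
    "group:data-team@example.com",
]

kinds = {member: principal_kind(member) for member in members}
for member, kind in kinds.items():
    print(f"{kind:16} {member}")
```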
3. What is the IAM role hierarchy in GCP, from least permissive to most permissive?
Answer
The three basic roles (formerly called primitive roles) go from Viewer (read-only) to Editor (read/write without IAM management) to Owner (full control, including managing IAM and billing for the project). For data pipelines, it is recommended to use granular predefined roles such as BigQuery Data Viewer (`roles/bigquery.dataViewer`) or Storage Object Creator (`roles/storage.objectCreator`) rather than these overly broad basic roles.
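One way to act on this recommendation is to scan a policy for bindings that still use Viewer, Editor, or Owner. The sketch below does so on a plain dictionary shaped like an IAM policy (`bindings`, each with a `role` and a list of `members`). The `find_basic_role_bindings` helper, the project ID, and the service account names are hypothetical.

```python
# Flag bindings that grant a broad basic role instead of a granular predefined role.

BASIC_ROLES = {"roles/viewer", "roles/editor", "roles/owner"}


def find_basic_role_bindings(policy):
    """Return (role, member) pairs for every basic-role grant in the policy."""
    findings = []
    for binding in policy.get("bindings", []):
        if binding["role"] in BASIC_ROLES:
            for member in binding.get("members", []):
                findings.append((binding["role"], member))
    return findings


# Invented example policy; the service accounts and project are placeholders.
policy = {
    "bindings": [
        {
            "role": "roles/editor",
            "members": ["serviceAccount:etl@my-project.iam.gserviceaccount.com"],
        },
        {
            "role": "roles/bigquery.dataViewer",
            "members": ["serviceAccount:reporting@my-project.iam.gserviceaccount.com"],
        },
        {
            "role": "roles/storage.objectCreator",
            "members": ["serviceAccount:etl@my-project.iam.gserviceaccount.com"],
        },
    ]
}

findings = find_basic_role_bindings(policy)
for role, member in findings:
    print(f"{member} holds broad role {role}")
```

In this example only the Editor grant is reported; the two predefined roles pass the check.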
Why should JSON service account keys be avoided in a GCP production environment?
What is the difference between encryption at rest and encryption in transit?