Data Engineering

IAM and Data Security

Least privilege, service accounts, GCP roles, encryption at rest/in transit, data masking, audit logs, GDPR compliance, VPC Service Controls

20 interview questions · Senior
1

What is the fundamental principle to apply when assigning IAM permissions in GCP?

Answer

The principle of least privilege means granting only the permissions strictly necessary to accomplish a task. In Data Engineering, this means a pipeline should only have access to the buckets, datasets, and tables it actually needs. This principle reduces the attack surface and limits potential damage if a service account is compromised.
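One way to make least privilege concrete is to compare what a pipeline has been granted with what it actually uses. The sketch below is illustrative only: the two permission sets are hypothetical values for an imaginary pipeline, and the helper names `excess_permissions` and `missing_permissions` are assumptions rather than part of any GCP library.

```python
# Hypothetical permission sets for a single pipeline. The strings follow
# GCP's "service.resource.verb" naming, but the sets are made up here.
NEEDED = {
    "bigquery.tables.getData",
    "storage.objects.create",
}
GRANTED = {
    "bigquery.tables.getData",
    "bigquery.tables.delete",
    "storage.objects.create",
    "storage.objects.delete",
}


def excess_permissions(granted, needed):
    """Return permissions that are granted but not required (candidates to revoke)."""
    return sorted(set(granted) - set(needed))


def missing_permissions(granted, needed):
    """Return permissions that are required but not granted (would break the pipeline)."""
    return sorted(set(needed) - set(granted))


if __name__ == "__main__":
    print("excess:", excess_permissions(GRANTED, NEEDED))
    print("missing:", missing_permissions(GRANTED, NEEDED))
```

In this made-up case the two delete permissions are excess: revoking them would not affect the pipeline, but would limit what an attacker could do with a compromised service account.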

2

What is the difference between a service account and a user account in GCP?

Answer

A service account is an identity designed for applications and services, while a user account represents a person. Service accounts have no password; they authenticate with service account keys (such as downloaded JSON keys) or with keyless mechanisms such as Workload Identity, and are designed for automation. In Data Engineering, each pipeline should have its own service account with only the permissions it needs.
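In IAM policies the two identity types are visible in the member string itself, which carries a `user:` or `serviceAccount:` prefix. A minimal sketch of telling them apart follows; the function name `member_kind` and the example addresses are hypothetical.

```python
def member_kind(member: str) -> str:
    """Classify an IAM policy member string by its type prefix."""
    prefix, sep, _ = member.partition(":")
    if not sep:
        # No "type:identifier" structure at all.
        return "unknown"
    kinds = {
        "user": "user account",
        "serviceAccount": "service account",
        "group": "group",
    }
    return kinds.get(prefix, "other")


if __name__ == "__main__":
    # Hypothetical members, for illustration only.
    for m in (
        "user:alice@example.com",
        "serviceAccount:etl-pipeline@my-project.iam.gserviceaccount.com",
    ):
        print(m, "->", member_kind(m))
```

A check like this could be used, for instance, to confirm that roles intended for pipelines are bound to service accounts and not to individual people.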

3

What is the IAM role hierarchy in GCP, from least permissive to most permissive?

Answer

The basic roles (formerly called primitive roles) run from Viewer (read-only) to Editor (read and write, without IAM management) to Owner (full control, including IAM and billing setup). For data pipelines, prefer granular predefined roles such as BigQuery Data Viewer or Storage Object Creator over these overly broad basic roles.
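As a rough illustration of that recommendation, the sketch below scans an IAM policy, represented as a plain dictionary with a `bindings` list, and reports every member holding one of the three basic roles. The sample policy, the project and account names, and the helper `find_basic_role_bindings` are assumptions made for this example.

```python
# Role IDs of the three basic roles.
BASIC_ROLES = {"roles/viewer", "roles/editor", "roles/owner"}


def find_basic_role_bindings(policy: dict) -> list:
    """Return (role, member) pairs for every binding that uses a basic role."""
    findings = []
    for binding in policy.get("bindings", []):
        role = binding.get("role")
        if role in BASIC_ROLES:
            for member in binding.get("members", []):
                findings.append((role, member))
    return findings


# Hypothetical policy for illustration only.
SAMPLE_POLICY = {
    "bindings": [
        {
            "role": "roles/editor",
            "members": ["serviceAccount:etl@my-project.iam.gserviceaccount.com"],
        },
        {
            "role": "roles/bigquery.dataViewer",
            "members": ["serviceAccount:reporting@my-project.iam.gserviceaccount.com"],
        },
    ]
}

if __name__ == "__main__":
    for role, member in find_basic_role_bindings(SAMPLE_POLICY):
        print(f"{member} holds broad role {role}; consider a predefined role")
```

Here only the `roles/editor` binding is flagged; the `roles/bigquery.dataViewer` binding is already a granular predefined role.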

4

Why should JSON service account keys be avoided in a GCP production environment?

5

What is the difference between encryption at rest and encryption in transit?
