1
Linux and Shell: Essential Commands, Bash Scripting, Permissions, Cron Jobs
2
Git and GitHub: Branching, Merge, Rebase, Pull Requests, CI/CD Workflows
3
Advanced Python: OOP, Decorators, Generators, Context Managers, Typing, async/await
4
CI/CD: Linting (Ruff, Pylint), Packaging (Poetry), Tests, GitHub Actions, Pipelines
5
Docker: Dockerfile, Images, Containers, Volumes, Networks, Multi-Stage Builds
6
Docker Compose: Multi-Container Services, Dependencies, Healthchecks, Local Orchestration
7
FastAPI: Routes, Pydantic Models, Dependencies, Middleware, Deployment
8
Advanced SQL: Window Functions, CTEs, Analytical Queries, Optimization, Indexing
9
BigQuery: Serverless Architecture, Partitioning, Clustering, Costs, UDFs, Federated Queries
10
PostgreSQL: Configuration, Replication, Indexing (B-Tree, GIN, GiST), VACUUM, EXPLAIN ANALYZE
11
Data Modeling: Star Schema, Fact and Dimension Tables, Normalization, SCD, Data Vault
12
ELT vs ETL vs ETLT: Patterns, Trade-offs, Architecture Decisions
13
Fivetran and Airbyte: Connectors, Sync Modes, CDC, Schema Evolution
14
dbt: Models, Sources, Refs, Tests, Snapshots, Incremental Models, Jinja Macros
15
Apache Airflow: DAGs, Operators, Sensors, XCom, Connections, Pools, Task Dependencies
16
PySpark: RDD vs DataFrame, Transformations, Actions, Partitioning, Broadcast Variables
17
Streaming: Pub/Sub (Topics, Subscriptions), Apache Beam (PCollections, Transforms, Windowing), Dataflow
18
Kubernetes: Pods, Deployments, Services, Ingress, ConfigMaps, Secrets, Helm, Scaling
19
Terraform: Providers, Resources, State, Modules, Plan/Apply, Infrastructure as Code
20
IAM and Security: Least-Privilege Principles, Service Accounts, GCP Roles
21
NoSQL Databases: Graph DBs (Neo4j), Document DBs (MongoDB, Firestore), Wide-Column Stores (Cassandra, Bigtable)
22
Data Architecture: Data Lake vs Data Warehouse vs Data Lakehouse, Data Mesh, Data Contracts
23
Monitoring and Observability: Logging, Metrics, Alerting, SLA/SLO/SLI, Data Quality Checks