1
Linux & Shell: essential commands, Bash scripting, permissions, cron jobs
2
Git & GitHub: branching, merge, rebase, pull requests, CI/CD workflows
3
Advanced Python: OOP, decorators, generators, context managers, typing, async/await
4
CI/CD: linting (Ruff, Pylint), packaging (Poetry), tests, GitHub Actions, pipelines
5
Docker: Dockerfile, images, containers, volumes, networks, multi-stage builds
6
Docker Compose: multi-container services, dependencies, healthchecks, local orchestration
7
FastAPI: routes, Pydantic models, dependencies, middleware, deployment
8
Advanced SQL: window functions, CTEs, analytical queries, optimization, indexing
9
BigQuery: serverless architecture, partitioning, clustering, costs, UDFs, federated queries
10
PostgreSQL: configuration, replication, indexing (B-tree, GIN, GiST), VACUUM, EXPLAIN ANALYZE
11
Data Modeling: star schema, fact/dimension tables, normalization, SCD, data vault
12
ELT vs ETL vs ETLT: patterns, trade-offs, architecture choices
13
Fivetran & Airbyte: connectors, sync modes, CDC, schema evolution
14
dbt: models, sources, refs, tests, snapshots, incremental models, Jinja macros
15
Apache Airflow: DAGs, operators, sensors, XCom, connections, pools, task dependencies
16
PySpark: RDD vs DataFrame, transformations, actions, partitioning, broadcast variables
17
Streaming: Pub/Sub (topics, subscriptions), Apache Beam (PCollections, transforms, windowing), Dataflow
18
Kubernetes: pods, deployments, services, ingress, ConfigMaps, Secrets, Helm, scaling
19
Terraform: providers, resources, state, modules, plan/apply, infrastructure as code
20
IAM & security: principle of least privilege, service accounts, GCP roles
21
NoSQL databases: graph databases (Neo4j), document databases (MongoDB, Firestore), wide-column stores (Cassandra, Bigtable)
22
Data Architecture: Data Lake vs Data Warehouse vs Data Lakehouse, Data Mesh, Data Contracts
23
Monitoring & observability: logging, metrics, alerting, SLA/SLO/SLI, data quality checks