Data Science & ML

A comprehensive Data Science and Machine Learning curriculum with Python as the main language. It runs from data manipulation with Pandas and NumPy, through classic ML with Scikit-Learn, to implementing Deep Learning models with TensorFlow/Keras. It also covers the MLOps skills needed to deploy and maintain models in production with Docker, FastAPI and cloud platforms.

What you'll learn

Modern Python with object-oriented programming and best practices

Data manipulation with Pandas, NumPy and SQL (BigQuery)

Visualization with Matplotlib, Seaborn and Plotly

Descriptive and inferential statistics with statsmodels

Machine Learning with Scikit-Learn and XGBoost (regression, classification, clustering)

Deep Learning with TensorFlow and Keras (CNN, RNN, Transformers)

NLP and GenAI with Hugging Face, LangChain and LLMs (GPT, Gemini)

MLOps with MLflow, Docker, FastAPI and Streamlit

Development environments: Jupyter, Google Colab

Cloud deployment on Google Cloud with Compute Engine, Cloud Storage and GPUs

Key topics to master

The most important concepts for understanding Data Science & ML and acing your interviews

Python: types, data structures, OOP, decorators, generators, context managers

NumPy: arrays, broadcasting, indexing, vectorized operations, linear algebra

Pandas: DataFrames, Series, indexing, groupby, merge, pivot, time series

SQL: SELECT, JOIN, GROUP BY, window functions, CTEs, query optimization

Visualization: Matplotlib (figures, axes, subplots), Seaborn (statistical plots), Plotly (interactive)

Statistics: distributions, hypothesis testing, confidence intervals, regression

Feature Engineering: encoding, scaling, feature selection, feature creation

Supervised ML: linear/logistic regression, trees, Random Forest, XGBoost, metrics

Unsupervised ML: K-Means, hierarchical clustering, PCA, t-SNE

ML Pipeline: train/test split, cross-validation, hyperparameter tuning, overfitting

Deep Learning: perceptrons, backpropagation, activation functions, optimizers, loss functions

CNN: convolutions, pooling, architectures (ResNet, VGG), transfer learning

RNN/LSTM: sequences, vanishing gradient, attention mechanism, Transformers

NLP: tokenization, embeddings, word2vec, BERT, LLM fine-tuning

MLOps: versioning (MLflow), containerization (Docker), API (FastAPI), monitoring

Cloud: Google Cloud (Compute Engine, Cloud Storage, BigQuery), GPU training, Vertex AI

AI Ethics: bias, explainability (SHAP, LIME), fairness, GDPR

Recent Data Science & ML articles

Discover our latest articles and guides on Data Science & ML

June 23, 2026

MLOps in 2026: MLflow, Model Registry and Technical Interview Questions

MLOps interview questions covering the ML lifecycle, MLflow experiment tracking, model registry promotion, deployment patterns, drift monitoring, and system design for 2026, with Python code and answers.

June 6, 2026

RAG and LLMs in 2026: Retrieval-Augmented Generation for Data Science Interviews

Retrieval-Augmented Generation (RAG) explained for data science interviews in 2026. Covers vector databases, chunking strategies, embedding models, agentic RAG, Graph RAG, and production-ready pipeline architecture.

May 19, 2026

Hugging Face Transformers in 2026: NLP, Fine-Tuning and Interview Questions

Hugging Face Transformers tutorial covering the v5 API, fine-tuning with LoRA, NLP pipelines, and the most common interview questions asked in data science roles in 2026.
