
Advanced Pandas
GroupBy, merge, concat, pivot tables, time series, apply/transform, MultiIndex, performance
1. Which method allows applying multiple different aggregation functions to a single column with groupby?
Answer
The agg() (or aggregate()) method applies multiple aggregation functions to the same column. You can pass a list of functions such as ['sum', 'mean', 'count'], or a dictionary that maps each column to its own function or list of functions. This makes it possible to build a full statistical summary in a single operation.
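A minimal sketch of both forms, assuming a small made-up DataFrame (the column names category, sales, and price are illustrative only):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "category": ["a", "a", "b", "b", "b"],
    "sales": [10, 20, 5, 15, 10],
    "price": [1.0, 3.0, 2.0, 4.0, 6.0],
})

# A list of functions applied to a single column:
# the result has one column per function
sales_stats = df.groupby("category")["sales"].agg(["sum", "mean", "count"])

# A dictionary choosing different functions per column:
# the result has MultiIndex columns such as ("sales", "sum")
per_column = df.groupby("category").agg(
    {"sales": ["sum", "mean"], "price": "max"}
)

print(sales_stats)
print(per_column)
```

Note that the dictionary form with a list of functions produces two-level column labels, which is one reason named aggregation is often preferred.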
2. How do you explicitly name the resulting columns during a groupby aggregation using named aggregation syntax?
Answer
Named aggregation passes keyword arguments to agg(), where each keyword is the output column name and each value is a (column, aggregation function) tuple or a pd.NamedAgg. For example: df.groupby('category').agg(total_sales=('sales', 'sum'), avg_price=('price', 'mean')). This approach produces explicit, readable column names and avoids MultiIndex columns, which can complicate subsequent processing.
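The example from the answer can be run as follows; the sample data is hypothetical and chosen only to make the output easy to check:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "category": ["a", "a", "b", "b", "b"],
    "sales": [10, 20, 5, 15, 10],
    "price": [1.0, 3.0, 2.0, 4.0, 6.0],
})

# keyword = output column name, value = (input column, aggregation function)
summary = df.groupby("category").agg(
    total_sales=("sales", "sum"),
    avg_price=("price", "mean"),
)

# Columns are flat strings, not a MultiIndex
print(summary.columns.tolist())
print(summary)
```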
3. What is the main difference between transform() and apply() in a groupby context?
Answer
transform() returns a result with the same length as the input, aligned to the original index, which makes it ideal for adding group statistics to each row (e.g., the group mean). apply() is more flexible and can return a result of a different size (a scalar, Series, or DataFrame per group), but it is generally slower. Use transform() for operations such as group-wise normalization or z-score calculation.
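A short sketch of the size difference, again on made-up data. The per-group range computed with apply() is just one illustrative reduction, not the only way to use it:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "category": ["a", "a", "b", "b", "b"],
    "sales": [10, 20, 5, 15, 10],
})

grouped = df.groupby("category")["sales"]

# transform: one value per original row, aligned to df.index
group_mean = grouped.transform("mean")

# apply: here one value per group, indexed by the group key
sales_range = grouped.apply(lambda s: s.max() - s.min())

# Because transform keeps the original index, the result can be
# assigned back as a new column, e.g. to center sales within each group
df["group_mean"] = group_mean
df["centered"] = df["sales"] - df["group_mean"]

print(df)
print(sales_range)
```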
How do you filter groups in a groupby to keep only those that satisfy a condition (for example, groups with more than 10 elements)?
What is the difference between pd.merge() with how='left' and how='inner'?