Python for Data Analytics: Matplotlib, Seaborn and Visualization for Interviews

Master Python data visualization with Matplotlib and Seaborn. Practical tutorial covering charts, styling, subplots and common interview questions for data analytics roles in 2026.

Python data visualization ranks among the most tested skills in data analytics interviews. Hiring managers expect candidates to produce clean, readable charts from raw data — and to explain their design choices under pressure.

This tutorial covers Matplotlib 3.10 and Seaborn 0.13, the two libraries that dominate technical interviews for analyst and data scientist positions. Every code example runs as-is with Python 3.12+.

Interview Insight

Many data analytics interviews include a live coding round in which candidates build a visualization from a dataset in roughly 10 to 15 minutes. The patterns below map directly to those exercises.

Setting Up a Python Data Visualization Environment

Before writing any chart code, the environment needs the right dependencies. A clean virtual environment avoids version conflicts between Matplotlib, Seaborn, and their shared NumPy/Pandas foundations.

bash
# setup.sh
python -m venv venv
source venv/bin/activate
pip install matplotlib==3.10.8 seaborn==0.13.2 pandas numpy

A quick sanity check confirms everything works:

python
# verify_install.py
import matplotlib
import seaborn as sns
import pandas as pd

print(f"Matplotlib: {matplotlib.__version__}")
print(f"Seaborn: {sns.__version__}")
print(f"Pandas: {pd.__version__}")

With dependencies locked, the next step focuses on Matplotlib fundamentals — the building block for every Seaborn chart.

Matplotlib Fundamentals: Figure, Axes and the Object-Oriented API

Matplotlib offers two APIs: the pyplot state machine and the object-oriented (OO) interface. The OO API provides explicit control over every element and is the standard expected in professional codebases and interviews.

python
# bar_chart_oo.py
import matplotlib.pyplot as plt

# Sample quarterly revenue data
quarters = ["Q1", "Q2", "Q3", "Q4"]
revenue = [42_000, 58_000, 51_000, 67_000]

# Create figure and axes explicitly
fig, ax = plt.subplots(figsize=(8, 5))

# Draw bars with a specific color
ax.bar(quarters, revenue, color="#2563eb", width=0.5)

# Label axes clearly — interviewers check for this
ax.set_xlabel("Quarter")
ax.set_ylabel("Revenue (USD)")
ax.set_title("Quarterly Revenue — 2025")

# Format y-axis with dollar amounts
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f"${x:,.0f}"))

# Remove top and right spines for a cleaner look
ax.spines[["top", "right"]].set_visible(False)

plt.tight_layout()
plt.savefig("quarterly_revenue.png", dpi=150)
plt.show()

Key details interviewers notice: explicit fig, ax creation instead of plt.plot(), readable axis labels, formatted tick values, and removed chart junk (unnecessary spines). These small choices signal production-level thinking.
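To make the contrast between the two APIs concrete, the sketch below draws the same line twice: once through pyplot's implicit "current axes" and once through explicit objects. The data values are made up for illustration, and the Agg backend line is an assumption so the snippet runs without a display.

```python
# pyplot_vs_oo.py
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop for interactive use
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [10, 20, 15, 30]

# Implicit style: pyplot tracks a hidden "current" figure and axes
plt.figure()
plt.plot(x, y)
plt.title("Implicit (pyplot)")
implicit_ax = plt.gca()

# Explicit style: the figure and axes are ordinary variables
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_title("Explicit (OO)")

# Creating the second figure silently changed what pyplot considers "current"
current_is_new_axes = plt.gca() is ax
```

With the explicit style, `implicit_ax` and `ax` can both still be modified by name, whereas further `plt.title()` calls would only reach whichever axes happens to be current.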

Building Subplots for Comparative Analysis

Interviews frequently require comparing multiple metrics side by side. The subplots() function handles this with a grid layout.

python
# subplots_comparison.py
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
users = [1200, 1350, 1500, 1420, 1680, 1820]
revenue = [24_000, 27_000, 30_000, 28_400, 33_600, 36_400]

# Two side-by-side panels, each with its own y-axis scale
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5), sharey=False)

# Left panel: user growth as a line chart
ax1.plot(months, users, marker="o", color="#2563eb", linewidth=2)
ax1.set_title("Monthly Active Users")
ax1.set_ylabel("Users")
ax1.spines[["top", "right"]].set_visible(False)

# Right panel: revenue as a bar chart
ax2.bar(months, revenue, color="#16a34a", width=0.5)
ax2.set_title("Monthly Revenue")
ax2.set_ylabel("Revenue (USD)")
ax2.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f"${x:,.0f}"))
ax2.spines[["top", "right"]].set_visible(False)

fig.suptitle("Product Metrics — H1 2025", fontsize=14, fontweight="bold")
plt.tight_layout()
plt.savefig("product_metrics.png", dpi=150)
plt.show()

Passing sharey=False (the default, spelled out here for clarity) lets each panel scale independently: users are counts in the low thousands, while revenue runs to tens of thousands of dollars. The suptitle adds an overarching title above both subplots.
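The effect of the sharey flag can be checked directly by comparing the y-limits Matplotlib picks in each case. This is a minimal sketch with shortened sample data; the variable names and the Agg backend line are assumptions for illustration.

```python
# sharey_demo.py
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop for interactive use
import matplotlib.pyplot as plt

users = [1200, 1350, 1500]
revenue = [24_000, 27_000, 30_000]

# Independent y-axes: each panel fits its own data
fig_a, (a1, a2) = plt.subplots(1, 2, sharey=False)
a1.plot(users)
a2.plot(revenue)
independent_limits = (a1.get_ylim(), a2.get_ylim())

# Shared y-axis: both panels are forced onto one scale
fig_b, (b1, b2) = plt.subplots(1, 2, sharey=True)
b1.plot(users)
b2.plot(revenue)
shared_limits = (b1.get_ylim(), b2.get_ylim())
```

In the shared version the user counts are squeezed into the bottom of a scale that must also reach 30,000, which is why mixed-unit panels usually keep independent axes.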

Common Interview Mistake

Candidates often use plt.plot() for everything instead of the object-oriented API. When an interviewer asks to add a second y-axis or adjust a single subplot, the pyplot approach falls apart. Always default to fig, ax = plt.subplots().
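The second y-axis request mentioned above is a good example of where explicit axes pay off. The following sketch, using shortened sample data and hypothetical variable names, adds a revenue axis on top of a users axis with twinx(); the Agg backend line is an assumption so it runs without a display.

```python
# twin_axis.py
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop for interactive use
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr"]
users = [1200, 1350, 1500, 1420]
revenue = [24_000, 27_000, 30_000, 28_400]

fig, ax_users = plt.subplots(figsize=(8, 5))
ax_users.plot(months, users, marker="o", color="#2563eb")
ax_users.set_ylabel("Users", color="#2563eb")

# twinx() returns a second Axes that shares x but has its own y-scale
ax_revenue = ax_users.twinx()
ax_revenue.plot(months, revenue, marker="s", color="#16a34a")
ax_revenue.set_ylabel("Revenue (USD)", color="#16a34a")

fig.tight_layout()
```

Coloring each y-label to match its series is one way to reduce the ambiguity that dual axes can introduce.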

Seaborn Statistical Plots: From Distribution to Correlation

Seaborn builds on Matplotlib and specializes in statistical visualization. Where Matplotlib requires manual configuration, Seaborn infers reasonable defaults from the data structure.

A distribution analysis — one of the most common interview tasks — takes a single function call:

python
# distribution_analysis.py
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Simulate salary data for two departments
np.random.seed(42)
data = pd.DataFrame({
    "salary": np.concatenate([
        np.random.normal(75_000, 12_000, 200),  # Engineering
        np.random.normal(65_000, 10_000, 150),  # Marketing
    ]),
    "department": ["Engineering"] * 200 + ["Marketing"] * 150
})

# KDE plot comparing salary distributions
fig, ax = plt.subplots(figsize=(10, 5))
sns.kdeplot(
    data=data,
    x="salary",
    hue="department",        # Automatically splits by category
    fill=True,               # Shaded area under the curve
    alpha=0.4,
    palette=["#2563eb", "#dc2626"],
    ax=ax                    # Attach to our explicit axes
)

ax.set_title("Salary Distribution by Department")
ax.set_xlabel("Annual Salary (USD)")
ax.xaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f"${x:,.0f}"))
ax.spines[["top", "right"]].set_visible(False)

plt.tight_layout()
plt.savefig("salary_distribution.png", dpi=150)
plt.show()

The hue parameter splits the data by category automatically, and fill=True combined with a low alpha keeps overlapping regions visible. Grouping distributions by category is a common request in analytics interviews.

Seaborn Heatmaps for Correlation Matrices

Correlation heatmaps reveal relationships between numeric variables at a glance. Interviewers use them to test whether candidates can identify multicollinearity or spot strong feature relationships.

python
# correlation_heatmap.py
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Simulate e-commerce metrics
np.random.seed(42)
n = 500
page_views = np.random.poisson(15, n)
time_on_site = page_views * 2.5 + np.random.normal(0, 5, n)
cart_adds = np.random.binomial(page_views, 0.3)
purchases = np.random.binomial(cart_adds, 0.4)

df = pd.DataFrame({
    "page_views": page_views,
    "time_on_site": time_on_site,
    "cart_adds": cart_adds,
    "purchases": purchases
})

# Compute Pearson correlation
corr_matrix = df.corr()

fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(
    corr_matrix,
    annot=True,              # Show correlation values in cells
    fmt=".2f",               # Two decimal places
    cmap="RdBu_r",           # Diverging colormap centered on 0
    vmin=-1, vmax=1,         # Fixed scale for consistency
    square=True,             # Square cells
    linewidths=0.5,
    ax=ax
)

ax.set_title("E-commerce Metrics Correlation")
plt.tight_layout()
plt.savefig("correlation_heatmap.png", dpi=150)
plt.show()

The diverging RdBu_r colormap renders positive correlations in red and negative correlations in blue, with values near zero close to white. Setting vmin=-1 and vmax=1 pins the neutral midpoint at zero and keeps the color scale interpretable regardless of the actual data range.
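Because a correlation matrix is symmetric, a frequent follow-up is to hide the redundant upper triangle. The sketch below shows one common way to do that with a boolean mask; the three-column dataset is invented for illustration, and the Agg backend line is an assumption so it runs without a display.

```python
# masked_heatmap.py
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)
base = rng.normal(size=200)
df = pd.DataFrame({
    "a": base,
    "b": base * 0.8 + rng.normal(scale=0.5, size=200),  # Correlated with a
    "c": rng.normal(size=200),                          # Independent noise
})
corr = df.corr()

# True marks cells to hide: the diagonal and everything above it
mask = np.triu(np.ones_like(corr, dtype=bool))

fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(
    corr,
    mask=mask,
    annot=True,
    fmt=".2f",
    cmap="RdBu_r",
    vmin=-1, vmax=1,
    square=True,
    ax=ax,
)
```

Only the cells below the diagonal remain, so each pair of variables appears exactly once.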

Styling Charts for Professional Presentations

Raw Matplotlib output looks dated. A few configuration lines transform charts into presentation-ready visuals that demonstrate attention to detail during interviews.

python
# professional_styling.py
import matplotlib.pyplot as plt
import seaborn as sns

# Apply Seaborn's built-in theme
sns.set_theme(
    style="whitegrid",       # Clean background with grid lines
    palette="muted",         # Professional color palette
    font_scale=1.1           # Slightly larger text
)

# Global Matplotlib overrides
plt.rcParams.update({
    "figure.facecolor": "white",
    "axes.facecolor": "white",
    "font.family": "sans-serif",
    "axes.titlesize": 14,
    "axes.labelsize": 12,
})

Applying sns.set_theme() at the top of a script propagates consistent styling to every subsequent chart. During interviews, this avoids wasting time on per-chart formatting.
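When a single chart needs a different look from the global theme, a temporary override avoids resetting everything. The sketch below uses seaborn's axes_style() as a context manager and reads one rcParams entry to show that the change is scoped to the block; the variable names and the Agg backend line are assumptions for illustration.

```python
# temporary_style.py
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop for interactive use
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_theme(style="whitegrid")
grid_on_globally = plt.rcParams["axes.grid"]

# axes_style() works as a context manager: the override ends with the block
with sns.axes_style("white"):
    grid_inside_block = plt.rcParams["axes.grid"]
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [2, 4, 3])

grid_after_block = plt.rcParams["axes.grid"]
```

The axes created inside the block keep the grid-free style, while figures created afterwards return to the whitegrid theme.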

Version Note

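set_theme() has been Seaborn's recommended entry point since version 0.11 and sets style, palette, and font scaling in one call. set_style() and set_palette() are not deprecated and remain useful for adjusting one aspect at a time. Older Stack Overflow answers often call sns.set(), which is an alias of set_theme(); prefer the explicit name in new code.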

Common Data Visualization Interview Questions

Beyond coding, interviewers probe conceptual understanding of visualization best practices. The questions below appear consistently across data analytics and data science interviews.

When should a bar chart be used instead of a line chart? Bar charts display comparisons between discrete categories (departments, product types, regions). Line charts show trends over continuous or ordered intervals (time series, sequential measurements). Using a line chart for unordered categories implies a trend between adjacent categories that does not exist.

What makes a misleading chart? Truncated y-axes, dual y-axes with different scales, 3D effects on 2D data, and cherry-picked time ranges all distort perception. The fix: start the y-axis at zero for bar charts, label axes explicitly, and avoid decorative elements that obscure the data.
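The truncated-axis problem is easy to demonstrate with numbers. In the sketch below, the sales figures and region names are invented for illustration, and the Agg backend line is an assumption so it runs without a display; the two ratios quantify how much taller the largest bar looks than the smallest in each panel.

```python
# honest_baseline.py
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop for interactive use
import matplotlib.pyplot as plt

regions = ["North", "South", "East"]
sales = [980, 1000, 1020]  # Only about 4% apart

fig, (ax_bad, ax_good) = plt.subplots(1, 2, figsize=(10, 4))

# Truncated baseline: tiny differences look dramatic
ax_bad.bar(regions, sales)
ax_bad.set_ylim(970, 1025)
ax_bad.set_title("Truncated y-axis")

# Zero baseline: bar heights are proportional to the values
ax_good.bar(regions, sales)
ax_good.set_ylim(bottom=0)
ax_good.set_title("Zero baseline")

# Visible height of the tallest bar relative to the shortest
truncated_ratio = (1020 - 970) / (980 - 970)
honest_ratio = 1020 / 980
```

With the truncated axis the tallest bar appears five times the height of the shortest, even though the underlying values differ by roughly 4%.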

How does the choice of color palette affect data interpretation? Sequential palettes (light-to-dark) suit ordered data like temperature or revenue. Diverging palettes (two hues meeting at a neutral center) highlight deviations from a midpoint, such as profit/loss or correlation coefficients. Categorical palettes use distinct hues for unrelated groups. Colorblind-safe palettes (like Seaborn's colorblind) ensure accessibility, a detail that distinguishes senior candidates.
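The three palette families can be pulled from Seaborn by name. The sketch below is illustrative: the palette names are real Seaborn/Matplotlib names, while the brightness() helper is a hypothetical convenience added here to show the light-to-dark and neutral-midpoint structure numerically.

```python
# palette_types.py
import seaborn as sns

# Categorical and colorblind-safe: distinct hues for unrelated groups
categorical = sns.color_palette("colorblind", n_colors=4)

# Sequential: one hue from light to dark, for ordered magnitudes
sequential = sns.color_palette("Blues", n_colors=5)

# Diverging: two hues around a neutral midpoint
diverging = sns.color_palette("RdBu_r", n_colors=7)


def brightness(rgb):
    """Rough brightness: mean of the red, green, and blue channels."""
    return sum(rgb[:3]) / 3
```

Passing any of these lists to a plotting function's palette argument applies it to that chart; the sequential palette gets darker from first to last color, and the diverging palette is lightest at its middle entry.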

Explain the difference between plt.show() and plt.savefig(). plt.show() displays all open figures and, in a script, blocks until the windows are closed, after which pyplot discards them. plt.savefig() writes the current figure to a file and leaves it intact. Calling savefig() after show() therefore typically produces a blank file, a common bug. The correct order: savefig() first, then show().
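A defensive habit that sidesteps the ordering problem is to save through the Figure object before anything else touches pyplot state. The sketch below writes to a temporary directory and checks that the file is non-empty; the output path is hypothetical, and because the snippet assumes the headless Agg backend, plt.close() stands in for plt.show().

```python
# save_then_show.py
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop for interactive use
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [3, 1, 2])

out_path = os.path.join(tempfile.mkdtemp(), "chart.png")

# Save first, through the Figure object, while the figure still exists
fig.savefig(out_path, dpi=150)

# In an interactive session plt.show() would go here, after the save
plt.close(fig)

file_size = os.path.getsize(out_path)
```

Using fig.savefig() rather than plt.savefig() also makes it explicit which figure is being written when several are open.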

Putting It Together: End-to-End Interview Exercise

A typical take-home or live-coding exercise combines data loading, cleaning, and multiple chart types. The following example mirrors real interview prompts from analytics teams.

python
# interview_exercise.py
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

sns.set_theme(style="whitegrid", palette="muted", font_scale=1.05)

# Simulate 12 months of product data
np.random.seed(42)
months = pd.date_range("2025-01", periods=12, freq="MS")
df = pd.DataFrame({
    "month": months,
    "revenue": np.cumsum(np.random.normal(5000, 2000, 12)) + 50_000,
    "customers": np.cumsum(np.random.poisson(50, 12)) + 500,
    "churn_rate": np.clip(np.random.normal(0.05, 0.015, 12), 0.01, 0.12)
})

fig, axes = plt.subplots(1, 3, figsize=(18, 5))

# Panel 1: Revenue trend
axes[0].plot(df["month"], df["revenue"], marker="o", color="#2563eb")
axes[0].set_title("Revenue Trend")
axes[0].set_ylabel("Revenue (USD)")
axes[0].yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f"${x:,.0f}"))
axes[0].tick_params(axis="x", rotation=45)
axes[0].spines[["top", "right"]].set_visible(False)

# Panel 2: Customer growth
axes[1].bar(df["month"], df["customers"], color="#16a34a", width=20)
axes[1].set_title("Customer Growth")
axes[1].set_ylabel("Total Customers")
axes[1].tick_params(axis="x", rotation=45)
axes[1].spines[["top", "right"]].set_visible(False)

# Panel 3: Churn rate with threshold line
axes[2].plot(df["month"], df["churn_rate"], marker="s", color="#dc2626")
axes[2].axhline(y=0.05, color="gray", linestyle="--", label="Target: 5%")
axes[2].set_title("Monthly Churn Rate")
axes[2].set_ylabel("Churn Rate")
axes[2].yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f"{x:.1%}"))
axes[2].tick_params(axis="x", rotation=45)
axes[2].spines[["top", "right"]].set_visible(False)
axes[2].legend()

fig.suptitle("Product Dashboard — FY 2025", fontsize=14, fontweight="bold")
plt.tight_layout()
plt.savefig("interview_dashboard.png", dpi=150)
plt.show()

This exercise demonstrates three panels in a unified layout (two line charts and a bar chart), proper axis formatting, a reference line for the churn target, and consistent styling. These are elements interviewers commonly look for.
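Rotating date labels works, but on a monthly series it is often cleaner to control the ticks directly. The sketch below is an optional refinement, not part of the exercise above: it uses matplotlib.dates to place one tick per quarter and label it with the month abbreviation. The placeholder values and the Agg backend line are assumptions for illustration.

```python
# date_axis_format.py
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

months = pd.date_range("2025-01", periods=12, freq="MS")
values = list(range(12))  # Placeholder metric

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(months, values, marker="o")

# One tick at the start of each quarter, labelled "Jan", "Apr", ...
ax.xaxis.set_major_locator(mdates.MonthLocator(bymonth=[1, 4, 7, 10]))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b"))

fig.canvas.draw()  # Force tick labels to be computed
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
```

Short, evenly spaced labels usually remove the need for rotation, which leaves more vertical room for the data in a three-panel layout.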

Conclusion

  • The Matplotlib object-oriented API (fig, ax = plt.subplots()) gives full control and is the expected standard in professional and interview settings
  • Seaborn's hue parameter and built-in statistical functions (KDE, heatmaps) handle grouped analysis with minimal code
  • Correlation heatmaps with diverging colormaps and fixed scales (vmin=-1, vmax=1) are a staple of analytics interviews
  • Always call savefig() before show() to avoid blank output files
  • Clean styling — removed spines, formatted tick labels, explicit titles — signals production-quality thinking to interviewers
  • Practice building multi-panel dashboards under time pressure: many live coding rounds allocate 10-15 minutes per visualization task

Tags

#python
#matplotlib
#seaborn
#data-visualization
#data-analytics
#interview
