Data Analytics

Descriptive Statistics

Mean vs median, variance, standard deviation, normal distribution, skewness, correlation vs causation, sampling bias, percentiles

20 interview questions
Junior
1

Which measure of central tendency represents the value that divides a sorted dataset into two equal halves?

Answer

The median is the middle value of a dataset sorted in ascending order; when the number of values is even, it is the average of the two middle values. It splits the data so that half of the values lie at or below it and half at or above it. Unlike the mean, the median is largely unaffected by extreme values, making it a more robust indicator for skewed distributions such as income or real estate prices.
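
As a rough illustration, the median can be computed with Python's standard `statistics` module. The sample values below are made up for this sketch and are not from any real dataset.

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
odd_count = [7, 1, 5, 3, 9]
even_count = [7, 1, 5, 3]

# statistics.median sorts the data internally, so input order does not matter.
median_odd = statistics.median(odd_count)    # middle value of 1, 3, 5, 7, 9
median_even = statistics.median(even_count)  # average of the middle pair 3 and 5

print(median_odd, median_even)
```

With an even number of values there is no single middle element, which is why the two central values are averaged.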

2

What is the fundamental difference between the mean and the median?

Answer

The mean takes all values into account and is therefore sensitive to extreme values (outliers), while the median depends only on the central position of the sorted data. For example, if five salaries are 30k, 35k, 40k, 45k, and 500k, the single 500k salary pulls the mean up to 130k, whereas the median stays at 40k and better reflects what a typical member of the group earns.
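
The salary example can be checked with a short sketch using Python's standard `statistics` module. The variable names are illustrative, and the comparison without the 500k salary is an added assumption meant to show how much one outlier moves each measure.

```python
import statistics

# Salaries from the example, in thousands.
salaries_k = [30, 35, 40, 45, 500]

mean_salary = statistics.mean(salaries_k)      # uses every value, including 500
median_salary = statistics.median(salaries_k)  # uses only the middle position

# Illustrative comparison: the same group without the 500k outlier.
without_outlier = salaries_k[:-1]
mean_without = statistics.mean(without_outlier)
median_without = statistics.median(without_outlier)

print(f"with outlier:    mean={mean_salary}k, median={median_salary}k")
print(f"without outlier: mean={mean_without}k, median={median_without}k")
```

Dropping the outlier moves the mean from 130k to 37.5k, while the median only shifts from 40k to 37.5k.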

3

What is the mode in a dataset?

Answer

The mode is the value that appears most frequently in a dataset. A dataset can be unimodal (one mode), bimodal (two modes), or multimodal (more than two modes). The mode is the only measure of central tendency that can be used with nominal (unordered) categorical data, such as favorite color or best-selling product category.
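
A minimal sketch of finding the mode in categorical data, assuming Python 3.8 or later for `statistics.multimode`. The color and size lists are hypothetical examples.

```python
import statistics

# Hypothetical categorical data, for illustration only.
favorite_colors = ["red", "blue", "red", "green", "blue", "red"]
shirt_sizes = ["S", "M", "M", "L", "L"]

# Unimodal case: "red" appears three times, more than any other color.
color_mode = statistics.mode(favorite_colors)

# Bimodal case: "M" and "L" both appear twice, so multimode returns both,
# in the order they first occur in the data.
size_modes = statistics.multimode(shirt_sizes)

print(color_mode, size_modes)
```

Neither call needs the values to be numeric or ordered, which is what makes the mode usable on categories.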

4

What does the variance measure in a dataset?

5

What is the relationship between variance and standard deviation?
