Data set X and data set Y each consist of 25 values. The table shows the frequencies of the values for each data set. Which of the following statements best compares the medians of the two data sets?
| Value | Data set X frequency | Data set Y frequency |
|---|---|---|
| 10 | 1 | 9 |
| 20 | 3 | 7 |
| 30 | 5 | 5 |
| 40 | 7 | 3 |
| 50 | 9 | 1 |
A. The median of data set X is greater than the median of data set Y.
B. The median of data set X is less than the median of data set Y.
C. The median of data set X is equal to the median of data set Y.
D. There is not enough information to compare the medians of the data sets.
1. INFER the median position
- Each data set has 25 values, so once the values are arranged in order, the median is the single middle value
- Median position = \(\left(25+1\right)/2 = 13\)th position
- We need to find the 13th value in each ordered data set
2. SIMPLIFY by calculating cumulative frequencies
- Rather than listing all 25 values, we can track position ranges using cumulative frequency
- Cumulative frequency shows how many values we've counted 'up to and including' each data value
3. SIMPLIFY the cumulative frequencies for Data set X
- Value 10: frequency 1 → position 1 (cumulative: 1)
- Value 20: frequency 3 → positions 2-4 (cumulative: \(1+3=4\))
- Value 30: frequency 5 → positions 5-9 (cumulative: \(4+5=9\))
- Value 40: frequency 7 → positions 10-16 (cumulative: \(9+7=16\))
- Value 50: frequency 9 → positions 17-25 (cumulative: \(16+9=25\)) ✓
4. INFER Data set X median location
- The 13th position falls within the range 10-16
- All values in positions 10-16 are 40
- Therefore, median of Data set X = 40
5. SIMPLIFY the cumulative frequencies for Data set Y
- Value 10: frequency 9 → positions 1-9 (cumulative: 9)
- Value 20: frequency 7 → positions 10-16 (cumulative: \(9+7=16\))
- Value 30: frequency 5 → positions 17-21 (cumulative: \(16+5=21\))
- Value 40: frequency 3 → positions 22-24 (cumulative: \(21+3=24\))
- Value 50: frequency 1 → position 25 (cumulative: \(24+1=25\)) ✓
6. INFER Data set Y median location
- The 13th position falls within the range 10-16
- All values in positions 10-16 are 20
- Therefore, median of Data set Y = 20
7. INFER the comparison
- Data set X median: 40
- Data set Y median: 20
- Since \(40 > 20\), the median of data set X is greater than the median of data set Y
Answer: A
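The steps above can be sketched in Python as a quick check. This is a minimal illustration, not part of the original solution: the function name `median_from_frequencies` and the variable names are assumptions introduced here, and the sketch assumes the values are listed in ascending order and that the total count is odd.

```python
from itertools import accumulate
from statistics import median

# Frequency table from the problem (values in ascending order).
values = [10, 20, 30, 40, 50]
freq_x = [1, 3, 5, 7, 9]
freq_y = [9, 7, 5, 3, 1]


def median_from_frequencies(values, freqs):
    """Return the median of an odd-sized data set given as a frequency table.

    Hypothetical helper for illustration; assumes `values` is sorted ascending.
    """
    n = sum(freqs)
    if n % 2 == 0:
        raise ValueError("this sketch only handles an odd number of values")
    target = (n + 1) // 2  # 1-based position of the middle value (13 when n = 25)
    # Walk the running totals until one reaches the target position.
    for value, cumulative in zip(values, accumulate(freqs)):
        if cumulative >= target:
            return value


median_x = median_from_frequencies(values, freq_x)
median_y = median_from_frequencies(values, freq_y)

# Cross-check by expanding each table into its full list of 25 values.
expanded_x = [v for v, f in zip(values, freq_x) for _ in range(f)]
expanded_y = [v for v, f in zip(values, freq_y) for _ in range(f)]
assert median(expanded_x) == median_x
assert median(expanded_y) == median_y

print(median_x, median_y)  # 40 20
```

The running totals stand in for the position ranges: the first value whose cumulative frequency reaches 13 is the value sitting in the 13th position.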
Why Students Usually Falter on This Problem
Most Common Error Path:
Weak INFER skill: Students calculate cumulative frequencies correctly but misidentify which range contains the 13th position. They might think 'cumulative frequency of 16 means the median is 16' or confuse the cumulative count with the actual data value.
For example, seeing that Data set X reaches cumulative frequency 16 at value 40, they might incorrectly conclude the median is 16 rather than recognizing that positions 10-16 all have value 40. This conceptual confusion about position vs. value leads to incorrect median identification and wrong comparison.
This may lead them to select Choice B or D depending on how they misinterpret the cumulative frequencies.
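The difference between a cumulative count and a data value can be made concrete with a short Python sketch. The variable names are assumptions chosen for illustration, and the lookup uses the standard-library `bisect_left` on the running totals for Data set X.

```python
from bisect import bisect_left
from itertools import accumulate

values = [10, 20, 30, 40, 50]
freq_x = [1, 3, 5, 7, 9]

# Running totals: how many values have been counted up to and including each value.
cumulative_x = list(accumulate(freq_x))  # [1, 4, 9, 16, 25]

position = 13  # the median position for 25 values
# Index of the first running total that is at least 13.
index = bisect_left(cumulative_x, position)

cumulative_count = cumulative_x[index]  # 16: a count of positions, not a data value
median_value = values[index]            # 40: the value occupying positions 10-16

print(cumulative_count, median_value)  # 16 40
```

The number 16 only says where the block of 40s ends. The median is the value stored at that row of the table, which is 40.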
Second Most Common Error:
Missing conceptual knowledge: Students don't remember that for 25 values, the median is the value in the 13th position. They might use the wrong position (such as the 12th or "12.5th") or try to average two middle values as if there were an even number of data points.
This fundamental error in median position leads them to look for the wrong value in their cumulative frequency analysis, producing incorrect medians for both data sets.
This causes them to get stuck and guess among the answer choices.
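As a reminder of which positions matter, the following Python sketch returns the 1-based middle position or positions for a data set of size `n`. The helper name `median_positions` is hypothetical and used only for illustration.

```python
def median_positions(n):
    """Return the 1-based position(s) whose values determine the median.

    Hypothetical helper: an odd n has one middle position; an even n has two,
    and the median is the average of the values in those two positions.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n % 2 == 1:
        return ((n + 1) // 2,)
    return (n // 2, n // 2 + 1)


print(median_positions(25))  # (13,)
print(median_positions(24))  # (12, 13)
```

With 25 values there is exactly one middle position, so no averaging is needed in this problem.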
The Bottom Line:
This problem requires students to bridge frequency table interpretation with median calculation: they must translate frequencies into position ranges while staying focused on one specific ranked position (the 13th) rather than getting distracted by the frequency counts themselves.