ANATOMY OF A BOXPLOT: Modified Boxplot
Boxplots present the five-number summary statistics and provide a
visual presentation of the data's
distribution. In a Modified Boxplot
the whiskers may not extend all the
way to the minimum and maximum
values. See also: Traditional Boxplot.
(Figure callout: y-axis scale & label)
A Boxplot is divided into four
sections by the three quartile values
Q1, Q2 (Median), and Q3. Although each
section contains the same number of
data values (about a quarter of the
data), the four sections may differ in
length. The difference reflects the
density of values within each section
(i.e., whether the values are close
together, creating a shorter-appearing
section, or farther apart, creating a
longer-appearing section).
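As a rough illustration of why sections with equal counts can have unequal lengths, the Python sketch below uses a small made-up data set (not the housing data) and the standard library's statistics.quantiles function. Quartile conventions differ between textbooks and software, so the exact cut points are an assumption tied to that function's default method.

```python
from statistics import quantiles

# Hypothetical data, made up for illustration: the low values are
# packed closely together and the high values are spread far apart.
data = [1, 2, 3, 4, 10, 20, 30, 40]

q1, q2, q3 = quantiles(data, n=4)  # default "exclusive" method
edges = [min(data), q1, q2, q3, max(data)]

# Count the data values that fall in each of the four sections.
counts = [
    sum(1 for x in data if x < q1),
    sum(1 for x in data if q1 <= x < q2),
    sum(1 for x in data if q2 <= x < q3),
    sum(1 for x in data if x >= q3),
]

# Length of each section along the axis.
lengths = [hi - lo for lo, hi in zip(edges, edges[1:])]

print("values per section:", counts)
print("section lengths:   ", lengths)
```

Each section holds the same number of values, yet the sections covering the spread-out high values are much longer than those covering the tightly packed low values.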
<= Maximum: $149,100
<= Adjacent Point: $143,800
<= Q3: $105,500
<= Median (Q2): $92,100
<= Q1: $79,900
<= Minimum: $51,800
Related Items:
A Boxplot may be presented either
horizontally or vertically (as done
here).
See the Building a Modified Boxplot
steps for information regarding:
Interquartile Range (IQR); Lower
Limits, Upper Limits, and Adjacent
Points.
Historical Note: Who invented this
useful tool for quick data analysis?
John Tukey, Statistician.
(Figure callout: Title)
Notes on the Modified Boxplot:
The column of $ values to the right of the boxplot is superimposed and
is not a part of the chart.
The Maximum value is an Outlier. It is a value beyond the point to which
a whisker may be drawn (UL = Q3 + 1.5(IQR) = $143,900).
By definition, the Upper Limit to which the upper whisker may be drawn is
$143,900. The last data value at or below this limit is located at $143,800 and is
referred to as an Adjacent Point. As there are no data between $143,800
and $143,900, the whisker is drawn only to $143,800.
The values of Q1, Median, and Q3 make up the central box and divide the
data into four equal parts.
Here the Lower Whisker extends to the Minimum value in the data set,
$51,800, as it lies above the calculated Lower Limit ($41,500).
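The limits quoted in these notes can be checked from the Q1 and Q3 values shown beside the chart. The short Python sketch below repeats that arithmetic; the variable names are chosen here for illustration only.

```python
# Values read from the labels beside the boxplot (dollars).
q1 = 79_900
q3 = 105_500
minimum = 51_800
maximum = 149_100

iqr = q3 - q1                   # interquartile range: 25,600
lower_limit = q1 - 1.5 * iqr    # LL = Q1 - 1.5(IQR)
upper_limit = q3 + 1.5 * iqr    # UL = Q3 + 1.5(IQR)

# A value is an outlier if it lies beyond either limit.
max_is_outlier = maximum > upper_limit
min_is_outlier = minimum < lower_limit

print(f"IQR = ${iqr:,}")
print(f"LL  = ${lower_limit:,.0f}")
print(f"UL  = ${upper_limit:,.0f}")
print("Maximum is an outlier:", max_is_outlier)
print("Minimum is an outlier:", min_is_outlier)
```

The computed limits agree with the $41,500 and $143,900 given above: the Maximum ($149,100) falls beyond the Upper Limit, while the Minimum ($51,800) does not fall below the Lower Limit.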
Building a Modified Boxplot:
1) Draw the axis along which the data will be plotted (vertical in the example here).
2) Label the axis (dollars, cm, sec, etc.) and
insert the scale being used.
3) Determine the values of the five-number summary.
4) Place a dot along the scale line for each of the
five points.
5) Draw horizontal lines through the points
representing Q1 and Q3. Connect these lines,
forming a rectangle.
6) Draw a horizontal line through the Median
and connect it to the box.
7) Determine the Interquartile Range: IQR =
Q3 - Q1.
8) Determine the Lower Limit (LL) and the
Upper Limit (UL) for the whiskers.
LL = Q1 - 1.5(IQR); UL = Q3 + 1.5(IQR).
These give the lowest and highest values
to which the whiskers may be drawn.
9) The whiskers are drawn to the data points
closest to the LL and UL without going beyond
these limits. These data values are referred to
as Adjacent Points: the points beyond which
there are no actual data values before reaching
the LL or UL. Thus, the whiskers may not
extend as far as they mathematically could.
10) Identify values beyond the LL and UL with
an asterisk or other symbol. These values
(more than 1.5 IQRs below Q1 or above Q3)
are Outliers (extreme values).
11) Add a title describing the chart's contents
and any other information about the data that
might be useful.
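The numerical parts of the steps above (steps 3 and 7 through 10) can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the function name modified_boxplot_stats and the sample values are hypothetical, and the quartiles come from the standard library's statistics.quantiles with its default method, which may differ slightly from the quartile rule used in a given textbook or software package.

```python
from statistics import quantiles

def modified_boxplot_stats(values):
    """Return the numbers needed to draw a modified boxplot."""
    data = sorted(values)

    # Step 3: five-number summary (quartile method is an assumption).
    q1, median, q3 = quantiles(data, n=4)

    # Step 7: interquartile range.
    iqr = q3 - q1

    # Step 8: limits for the whiskers.
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr

    # Step 9: adjacent points are the most extreme data values
    # that still lie within the limits.
    inside = [x for x in data if lower_limit <= x <= upper_limit]

    # Step 10: values beyond the limits are outliers.
    outliers = [x for x in data if x < lower_limit or x > upper_limit]

    return {
        "minimum": data[0], "q1": q1, "median": median,
        "q3": q3, "maximum": data[-1], "iqr": iqr,
        "lower_limit": lower_limit, "upper_limit": upper_limit,
        "lower_adjacent": inside[0], "upper_adjacent": inside[-1],
        "outliers": outliers,
    }

# Hypothetical sample, made up for illustration (not the housing data).
sample = [52, 60, 70, 80, 85, 90, 95, 100, 105, 110, 170]
stats = modified_boxplot_stats(sample)
for name, value in stats.items():
    print(f"{name}: {value}")
```

In this made-up sample the largest value lies beyond the Upper Limit, so the upper whisker stops at the Adjacent Point below it and the largest value is marked separately as an Outlier, as in the housing example.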
The Data: Real Estate Housing Sales in Dallas-Fort Worth area circa 1999. N = 518