Thus:
Often it is necessary to interpolate between data values to accomplish this, as in the following example.
i   x[i]
1   102
2   105
------------- first quartile, Q1 = (105+106)/2 = 105.5
3   106
4   109
------------- second quartile, Q2 = (109+110)/2 = 109.5
5   110
6   112
------------- third quartile, Q3 = (112+115)/2 = 113.5
7   115
8   118
If the sample size is not a multiple of four, some of the quartiles may be numbers in the original data set, as in this example:
i   x[i]
1   102
2   105  -- Q1 = 105
3   106
------------- Q2 = 107.5
4   109
5   110  -- Q3 = 110
6   112
In both of the above cases, the first and third quartiles can be taken to be the median values of the lower and upper halves of the data, respectively. However, there are two schools of thought on how to apply this definition when the overall median is one of the original data values.
One may include the median in both "halves" of the data:
i   x[i]
1   102
2   105
3   106  -- Q1 = 106
4   109
5   110  )- Q2 = 110 (note line 5 has been duplicated
5   110     to illustrate the point)
6   112
7   115  -- Q3 = 115
8   118
9   120
Or one may exclude the median from both "halves":
i   x[i]
1   102
2   105
------------- Q1 = 105.5
3   106
4   109
5   110  -- Q2 = 110
6   112
7   115
------------- Q3 = 116.5
8   118
9   120
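The two conventions can be compared side by side in code. The following is a minimal Python sketch, not a standard library routine: the function name `quartiles` and its `include_median` flag are hypothetical names chosen for illustration. It assumes the quartiles are taken as the medians of the lower half, the whole sample, and the upper half, as in the examples above.

```python
from statistics import median

def quartiles(data, include_median=False):
    """Return (Q1, Q2, Q3) using the median-of-halves approach.

    include_median only matters for odd sample sizes: it decides whether
    the overall median is counted in both halves or in neither.
    (Hypothetical helper, named here for illustration only.)
    """
    xs = sorted(data)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data values")
    half = n // 2
    if n % 2 == 0:
        # Even sample size: the data split cleanly into two halves.
        lower, upper = xs[:half], xs[half:]
    elif include_median:
        # Odd sample size, median counted in both halves.
        lower, upper = xs[:half + 1], xs[half:]
    else:
        # Odd sample size, median left out of both halves.
        lower, upper = xs[:half], xs[half + 1:]
    return median(lower), median(xs), median(upper)

sample = [102, 105, 106, 109, 110, 112, 115, 118, 120]
print(quartiles(sample, include_median=True))   # (106, 110, 115)
print(quartiles(sample, include_median=False))  # (105.5, 110, 116.5)
```

For samples of even size the flag has no effect, since the overall median is then not one of the data values.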
More precise mathematical formulations are possible....
The difference between the upper and lower quartiles is called the interquartile range.
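As a small illustration, the interquartile range of the eight-value example can be computed as follows. This is a sketch only: the helper name `interquartile_range` is hypothetical, and it assumes the median-of-halves approach with the overall median excluded from both halves when the sample size is odd. Other quartile conventions may give a different result.

```python
from statistics import median

def interquartile_range(data):
    """Return Q3 - Q1, taking Q1 and Q3 as medians of the lower and upper halves.

    Assumption: for an odd sample size the overall median is excluded
    from both halves. (Hypothetical helper for illustration.)
    """
    xs = sorted(data)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data values")
    half = n // 2
    q1 = median(xs[:half])       # median of the lowest `half` values
    q3 = median(xs[n - half:])   # median of the highest `half` values
    return q3 - q1

print(interquartile_range([102, 105, 106, 109, 110, 112, 115, 118]))  # 8.0
```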
See also: Summary statistics, Quantile, Percentile