So the pattern of asymmetry of these data is not straightforward. In other words, the box suggests that the data might be right-skew rather than left-skew. In this case, the left part of the box is shorter than the right part. The box corresponds to the middle half of the data values, and the line denoting the median divides this into two parts, each corresponding to one-quarter of the data. However, the box gives a different impression. To some extent the boxplot reflects this: the left whisker is considerably longer than the right, indicating that the smaller values are more spread out than are the larger values. (b) The sample skewness is negative, indicating that the data are left-skew. It gives a clear picture of all these features and, as you will see, allows a visual appreciation of lack of symmetry.įigure 1.5a Figure 1.5a Boxplot for silica content of chondrite meteors You should now ensure that you understand simple boxplots by constructing one for yourself.Ī boxplot displays the median, the quartiles, the range of values covered by the data and any outliers which may be present. However, it should be borne in mind that this particular data set has only eleven values, and this is too small a number to infer anything definite about any underlying structure. The corresponding lack of symmetry shows up in the boxplot: the right-hand section of the box is longer than the left. These particular data are not symmetric they are right-skew, and in fact the sample skewness is 2.572. Some kind of assessment of symmetry is possible, since symmetric data will produce a boxplot which is symmetric about the median. The unusually large value in this data set is clearly shown and the median gives an assessment of the centre. Thus these aspects of the diagram give an idea of the dispersion of the data set. The length of the box represents the interquartile range and the lengths of the whiskers relative to the length of the box give an idea of how stretched out the rest of the values are. You can see how a boxplot gives a quick visual assessment of the data. The approach adopted here is one of the simplest and is probably the most common. Some textbooks and software always draw the whiskers right out to the minimum and maximum values and do not mark (potential) outliers separately. Some approaches even distinguish between moderate and severe outliers by using different symbols for them. The whiskers may extend as low as one or even up to two interquartile ranges either side of the box. All boxplots show the three quartiles, but the conventions defining the extent of the whiskers vary from text to text and from one computer package to another. It must be stressed that boxplot construction is an area where there are no universally accepted rules. So the interquartile range is needed to construct the whiskers.įigure 1.4 Figure 1.4 Completed boxplot for collapsed runners The lower adjacent value is the furthest observation which is within one and a half iqr (interquartile range) of the lower end of the box and the upper adjacent value is the furthest observation which is within one and a half iqr of the upper end of the box. The whiskers are drawn outwards as far as observations called adjacent values. However, as you will see in the next step, some observations may be classified as potential outliers and in fact the whiskers extend only to cover observations which are not classified as potential outliers. Essentially, each whisker extends outwards from the edge of the box as far as the most extreme observation. These are lines drawn parallel to the scale (so they are horizontal in this course). The vertical line inside the box is located at the median. The ‘box’ is a rectangle with edges defined by the lower and upper quartiles so it indicates where the ‘middle 50%’ of the data can be found. The median of this data set is 110, and the lower and upper quartiles are 79 and 162, respectively. The median and quartiles are used to construct the ‘box’. Since the minimum is 66 and the maximum is 414, a scale from 0 to 500 (say) is suitable in this case. The steps involved in constructing the boxplot in Figure 1.1 for the data set of β endorphin concentrations are as follows.įirst, a convenient scale is drawn covering the extent of the data. The easiest way to understand exactly what a boxplot represents and how it is constructed is to think about how you would draw one by hand. (1987) Beta-endorphin: a factor in 'fun run' collapse? British Medical Journal, 294, 1004.) (Data sourced from Dale, G., Fleetwood, J.A., Weddell, A., Ellis, R.D. Figure 1.1 A boxplot for the collapsed runners Figure 1.1 A boxplot for the collapsed runners
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |