8.8:

Goodness-of-Fit Test

JoVE Core
Statistics

The goodness-of-fit test establishes whether an observed frequency distribution mirrors a claimed distribution.

Consider the dataset of people visiting the gym on weekdays. One can perform a goodness-of-fit test to determine whether the observed client attendance agrees with the expected frequency of client attendance.

To perform a goodness-of-fit test, the sample data must be randomly selected, the data must consist of frequency counts for each category, and the expected frequency of each category must be at least 5.

The chi-square test statistic for the goodness-of-fit test is computed as $\chi^2 = \sum \frac{(O - E)^2}{E}$, summed over all categories. Here, O and E represent the observed and expected attendances, k is the number of categories (here, weekdays), and n is the number of sample values or attendance counts recorded. The number of degrees of freedom is k minus one.
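
As a minimal sketch of this calculation, the Python snippet below computes the test statistic and the degrees of freedom for the gym example. The attendance counts and the claim of equal attendance on every weekday are assumptions made up for illustration; they do not come from the lesson.

```python
# Hypothetical weekday attendance counts (illustrative only, not from the lesson).
observed = {"Mon": 60, "Tue": 45, "Wed": 50, "Thu": 40, "Fri": 55}

n = sum(observed.values())  # total number of attendance counts recorded
k = len(observed)           # number of categories (weekdays)

# Assumed claim: attendance is the same on every weekday, so E = n / k.
expected = {day: n / k for day in observed}

# The test requires every expected frequency to be at least 5.
assert all(e >= 5 for e in expected.values())

# Chi-square statistic: sum of (O - E)^2 / E over all categories.
chi_square = sum(
    (observed[day] - expected[day]) ** 2 / expected[day] for day in observed
)
degrees_of_freedom = k - 1

print(f"chi-square = {chi_square:.3f}, df = {degrees_of_freedom}")
```

With these made-up counts, every expected frequency is 50, so the requirement of at least 5 per category is met.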

Goodness-of-fit hypothesis tests are always right-tailed, implying that the critical region and critical values are located at the extreme right of the distribution curve.

The critical values and P-values help determine if there is a good fit between the observed and expected values.
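
The decision step can be sketched in Python using only the standard library. The helper `chi2_sf` is a hypothetical name introduced here; it evaluates the right-tail area of the chi-square distribution through a series expansion of the regularized incomplete gamma function. The test statistic of 5.0, the 4 degrees of freedom, and the 0.05 significance level are assumed values chosen for illustration, and 9.488 is the standard table critical value for that case.

```python
import math


def chi2_sf(x, df, max_terms=500):
    """Right-tail probability P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the lower incomplete gamma function:
    # gamma(a, z) = z^a * e^(-z) * sum_{i>=0} z^i / (a * (a+1) * ... * (a+i))
    term = 1.0 / a
    total = term
    for i in range(1, max_terms):
        term *= z / (a + i)
        total += term
        if term < 1e-15 * total:
            break
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower


# Assumed inputs for illustration.
statistic = 5.0
df = 4
alpha = 0.05
critical_value = 9.488  # table value for df = 4, right-tail area 0.05

p_value = chi2_sf(statistic, df)

# Right-tailed test: reject the null hypothesis when the P-value is below alpha,
# or equivalently when the statistic exceeds the critical value.
reject_null = p_value < alpha
print(f"P-value = {p_value:.4f}, reject null: {reject_null}")
print(f"statistic > critical value: {statistic > critical_value}")
```

Under these assumptions the P-value is about 0.287, which is larger than 0.05, and the statistic lies below the critical value, so both approaches agree that the null hypothesis of a good fit is not rejected.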

The goodness-of-fit test is a type of hypothesis test that determines whether the data "fit" a particular distribution. For example, one may suspect that some anonymous data fit a binomial distribution. A chi-square test (meaning the distribution for the hypothesis test is chi-square) can be used to determine if there is a fit. The null and alternative hypotheses may be written in sentences or stated as equations or inequalities. The test statistic for a goodness-of-fit test is given as follows:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where:

O = observed values (data), and E = expected values (from theory)

The observed values are the data values, and the expected values are the values you would expect to get if the null hypothesis were true. It is important to note that each cell's expected value needs to be at least five to use this test. The number of degrees of freedom is $df = k - 1$, where k = the number of different data cells or categories.
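
To illustrate how expected values follow from the null hypothesis, the sketch below multiplies a sample size by claimed category proportions and enforces the at-least-five rule. The function name `expected_counts`, the category labels, the proportions, and the sample size of 120 are all hypothetical choices made for this example.

```python
def expected_counts(n, claimed_proportions):
    """Return E = n * p for each category under the null hypothesis."""
    if abs(sum(claimed_proportions.values()) - 1.0) > 1e-9:
        raise ValueError("claimed proportions must sum to 1")
    expected = {cat: n * p for cat, p in claimed_proportions.items()}
    # The chi-square approximation needs every expected value to be at least 5.
    too_small = [cat for cat, e in expected.items() if e < 5]
    if too_small:
        raise ValueError(f"expected value below 5 for: {too_small}")
    return expected


# Hypothetical claimed distribution and sample size (illustrative only).
claimed = {"A": 0.5, "B": 0.3, "C": 0.2}
expected = expected_counts(120, claimed)
degrees_of_freedom = len(claimed) - 1  # k - 1

print(expected, degrees_of_freedom)
```

With a sample of only 20 under the same assumed proportions, category C would have an expected value of 4, and the function would raise an error instead of returning values that violate the requirement.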

The goodness-of-fit test is almost always right-tailed. If the observed and the corresponding expected values are not close, the test statistic will be significant and located at the extreme right tail of the chi-square curve.

This text is adapted from OpenStax, Introductory Statistics, 11.2 Goodness-of-Fit Test.