Overfitting

In statistics, overfitting is the fitting of a statistical model that has too many parameters. An absurd and false model may fit the training data perfectly if it has enough complexity relative to the amount of data available. Overfitting is generally recognized as a violation of Occam's razor.
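
As an illustration, consider fitting a polynomial to a handful of noisy points. The sketch below is a minimal example using NumPy; the data, polynomial degree, and random seed are illustrative assumptions, not part of the article. A polynomial with as many coefficients as there are data points passes through the training samples almost exactly, yet its error on fresh points from the same underlying line is far larger.

```python
# A minimal sketch of overfitting in curve fitting; the data,
# polynomial degree, and random seed are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Ten noisy samples of a simple linear relationship, y = 2x + noise.
x = np.linspace(0, 1, 10)
y = 2 * x + rng.normal(scale=0.1, size=x.size)

# A degree-9 polynomial has ten coefficients -- one per data point --
# so it can pass through the training samples almost exactly.
# (NumPy may warn that the fit is poorly conditioned, which is itself
# a symptom of the excess parameters.)
coeffs = np.polyfit(x, y, deg=9)
print("training residual:", np.sum((np.polyval(coeffs, x) - y) ** 2))

# Yet the curve strays far from the true line between and beyond the
# training points, so its error on fresh data is much larger.
x_new = np.linspace(0, 1, 100)
y_new = 2 * x_new
print("test residual:", np.sum((np.polyval(coeffs, x_new) - y_new) ** 2))
```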

The concept of overfitting has more recently been adopted in machine learning. Usually a learning algorithm is trained on a set of training examples, i.e. exemplary situations for which the desired output is known. The learner is assumed to reach a state in which it can also predict the correct output for other examples, thus generalizing to situations not presented during training (based on its inductive bias). However, especially where learning is performed for too long or where training examples are rare, the learner may adjust to very specific random features of the training data that have no causal relation to the target function. In this process of overfitting, performance on the training examples still improves while performance on unseen data becomes worse.

In both statistics and machine learning, avoiding overfitting requires additional techniques (e.g. cross-validation, early stopping) that can indicate when further training is no longer resulting in better generalization.
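
As one concrete instance, early stopping monitors performance on a held-out validation set and halts training once that performance stops improving. The sketch below is a minimal, hypothetical illustration using gradient descent on a linear model with NumPy; the data, learning rate, and patience threshold are illustrative choices, not prescriptions from the article.

```python
# A minimal sketch of early stopping; the model (linear regression
# trained by gradient descent), data, learning rate, and patience
# threshold are all hypothetical choices for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a linear signal plus noise, split into training and
# validation halves.
X = rng.normal(size=(200, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ w_true + rng.normal(scale=1.0, size=200)
X_train, y_train = X[:100], y[:100]
X_val, y_val = X[100:], y[100:]

w = np.zeros(5)
best_val, best_w, patience = np.inf, w.copy(), 0

for epoch in range(1000):
    # One gradient-descent step on the training squared error.
    grad = 2.0 * X_train.T @ (X_train @ w - y_train) / len(y_train)
    w -= 0.01 * grad

    # Track the error on data the learner never trains on; stop once
    # it has failed to improve for 20 consecutive epochs.
    val_err = np.mean((X_val @ w - y_val) ** 2)
    if val_err < best_val:
        best_val, best_w, patience = val_err, w.copy(), 0
    else:
        patience += 1
        if patience >= 20:
            break

w = best_w  # keep the parameters that generalized best
print("best validation error:", best_val)
```

Cross-validation applies the same idea more systematically, repeatedly partitioning the data so that every example serves as held-out data at some point.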
