Gradient descent

Gradient descent is an iterative optimization algorithm that approaches a local minimum of a function by taking steps proportional to the negative of the gradient (or of an approximate gradient) at the current point. Taking steps proportional to the positive gradient instead approaches a local maximum; that variant is known as gradient ascent.
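The update rule can be sketched in a few lines of Python; the function, step size, and step count below are illustrative assumptions, not part of the article.

```python
# Minimal gradient-descent sketch: minimize f(x) = (x - 3)^2,
# whose gradient is f'(x) = 2 * (x - 3).
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)  # step against the gradient
    return x

x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)  # converges near 3
```

With a suitably small learning rate the iterate approaches the minimizer x = 3; too large a rate can overshoot and diverge.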

There are two main forms of gradient descent commonly used in machine learning: batch and on-line.

In batch gradient descent, the true gradient is used to update the parameters of a model. The true gradient is usually the sum of the gradients contributed by each individual training example. Therefore, batch gradient descent requires one sweep through the entire training set before any parameters can be changed.

In on-line gradient descent, the true gradient is approximated by the gradient of the cost function evaluated on only a single training example. Therefore, the parameters of the model are updated after each training example. For large data sets, on-line gradient descent can be much faster than batch gradient descent.
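The contrast between the two forms can be sketched on a least-squares fit of y ≈ w·x; the data, learning rate, and function names below are illustrative assumptions.

```python
# Fit y = w * x by gradient descent on the squared error (true slope is 2).
data = [(x, 2.0 * x) for x in range(1, 6)]

def grad_single(w, x, y):
    # gradient of (w*x - y)^2 with respect to w
    return 2 * (w * x - y) * x

def batch_step(w, lr):
    # batch: the true gradient is the sum of per-example gradients,
    # so one sweep through the data yields a single parameter update
    g = sum(grad_single(w, x, y) for x, y in data)
    return w - lr * g

def online_step(w, lr):
    # on-line: update the parameter after every single example
    for x, y in data:
        w -= lr * grad_single(w, x, y)
    return w

w_batch = w_online = 0.0
for _ in range(200):
    w_batch = batch_step(w_batch, lr=0.005)
    w_online = online_step(w_online, lr=0.005)
```

Both variants converge toward the true slope w = 2 here; the on-line version makes many small, noisier updates per sweep instead of one exact update.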

There is a compromise between the two forms, often called "mini-batch" gradient descent, in which the true gradient is approximated by a sum of gradients over a small number of training examples.
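A mini-batch sweep can be sketched as follows; as before, the data, batch size, and learning rate are illustrative assumptions.

```python
# Mini-batch sketch: approximate the true gradient by summing over a
# small batch of examples, updating once per batch (true slope is 2).
data = [(x, 2.0 * x) for x in range(1, 11)]

def grad_single(w, x, y):
    # gradient of (w*x - y)^2 with respect to w
    return 2 * (w * x - y) * x

def minibatch_sweep(w, lr, batch_size=2):
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]
        g = sum(grad_single(w, x, y) for x, y in batch)
        w -= lr * g  # one update per mini-batch
    return w

w = 0.0
for _ in range(200):
    w = minibatch_sweep(w, lr=0.002)
```

This gives several updates per sweep, like on-line gradient descent, while smoothing each gradient estimate over more than one example, like batch gradient descent.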

On-line gradient descent is a form of stochastic approximation. The theory of stochastic approximation gives conditions under which on-line gradient descent converges.



All Wikipedia text is available under the terms of the GNU Free Documentation License
