I don’t know much machine learning, but I’ve heard folks in some fields prefer a gradient descent algorithm even though it doesn’t have a nice convergence rate, because the Hessian (or even a Hessian approximation) would be too large or too expensive to compute.
There are always tricks to keep the problem size small for second-order methods (such as L-BFGS). In ML, stochastic gradient descent (a bad name for doing gradient descent on batches of the data at a time) seems to have a bit of magic in its ability to go uphill to escape local minima, as well as letting the problem scale to much larger sets of data.
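A minimal mini-batch SGD sketch (all names here are illustrative, not from any library). It fits y = w*x + b by descending the gradient of the squared error on small random batches; the batch-to-batch noise in the gradient is exactly what can jitter the iterate out of a shallow local minimum.

```python
import numpy as np

# Synthetic data: y = 3x + 0.5 plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, 200)
y = 3.0 * X + 0.5 + rng.normal(0, 0.1, 200)

w, b, lr, batch = 0.0, 0.0, 0.1, 16
for step in range(500):
    idx = rng.integers(0, len(X), batch)   # draw a random mini-batch
    xb, yb = X[idx], y[idx]
    err = (w * xb + b) - yb
    # Gradient of mean squared error on this batch only -- a noisy
    # estimate of the full-data gradient.
    w -= lr * 2 * np.mean(err * xb)
    b -= lr * 2 * np.mean(err)

print(w, b)  # close to the true (3.0, 0.5)
```

The same loop with `idx = np.arange(len(X))` would be plain (full-batch) gradient descent; the only change SGD makes is the subsampling.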
Newton's method finds zeros of functions, which is different from optimisation (finding maxima or minima). If you apply Newton's method to a function which is the derivative of some other function you're trying to optimise, then it will find critical points, which may be the optima you're looking for (or inflection points, saddle points, etc.).
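A small sketch of that point, using f(x) = x³ − 3x as an assumed example. Running Newton's method on g = f′ (so the update is x ← x − f′(x)/f″(x)) converges to a critical point of f, but whether that point is a minimum or a maximum depends entirely on where you start:

```python
def newton_on_derivative(x, steps=50):
    # f(x) = x**3 - 3x, so f'(x) = 3x**2 - 3 and f''(x) = 6x.
    # Newton's method on f' finds zeros of f', i.e. critical points of f.
    for _ in range(steps):
        x = x - (3 * x**2 - 3) / (6 * x)
    return x

print(newton_on_derivative(0.5))   # converges to x = 1, a local minimum of f
print(newton_on_derivative(-0.5))  # converges to x = -1, a local maximum of f
```

Both runs succeed as root-finding on f′, yet only one of them is the minimiser; checking the sign of f″ at the result (positive at x = 1, negative at x = −1) is what distinguishes them.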
This is why in practice you do something like Levenberg-Marquardt, which combines gradient descent with a trust-region-style quadratic (Gauss-Newton) step, with a damping parameter varied depending on whether the cost function is decreasing. And of course it is iterative, so a single bad step doesn't destroy everything. The advantage comes from faster convergence in practice.
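A toy Levenberg-Marquardt sketch (illustrative only, not how a production solver like SciPy's is implemented), fitting y = a·exp(b·x) by least squares. The damping term `lam` interpolates between a Gauss-Newton step (small `lam`) and a short gradient-descent step (large `lam`): it shrinks after a step that decreases the cost and grows after one that doesn't, so a bad step is simply rejected.

```python
import numpy as np

# Data generated from the true parameters (a, b) = (2.0, 1.5).
x = np.linspace(0, 1, 20)
y = 2.0 * np.exp(1.5 * x)

def residuals(p):
    return p[0] * np.exp(p[1] * x) - y

def jacobian(p):
    e = np.exp(p[1] * x)
    return np.column_stack([e, p[0] * x * e])  # d r/d a, d r/d b

p = np.array([1.0, 1.0])   # initial guess
lam = 1e-2                 # damping parameter
cost = np.sum(residuals(p) ** 2)
for _ in range(100):
    r, J = residuals(p), jacobian(p)
    # Damped normal equations: (J^T J + lam I) step = -J^T r
    A = J.T @ J + lam * np.eye(2)
    step = np.linalg.solve(A, -J.T @ r)
    new_cost = np.sum(residuals(p + step) ** 2)
    if new_cost < cost:          # cost decreased: accept, trust the quadratic model more
        p, cost, lam = p + step, new_cost, lam * 0.5
    else:                        # cost increased: reject, fall back toward gradient descent
        lam *= 2.0

print(p)  # close to the true (2.0, 1.5)
```

The accept/reject test is what makes a single bad step harmless: the parameters only move when the cost actually goes down.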