LBFGS (org.apache.spark.mllib.optimization)
Study materials:
Mathematics:
- https://en.wikipedia.org/wiki/Gradient
- https://en.wikipedia.org/wiki/Hessian_matrix
- https://en.wikipedia.org/wiki/Taylor_series
- https://en.wikipedia.org/wiki/Positive-definite_matrix
- https://en.wikipedia.org/wiki/Quasi-Newton_method
Paper:
- Liu & Nocedal (1989), "On the Limited Memory BFGS Method for Large Scale Optimization", Mathematical Programming 45
Aja Notes
Jacobian Matrix -> Hessian Matrix -> BFGS -> L-BFGS
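To make that progression concrete, here is a quick sketch of the formulas involved (my own summary in standard notation, not taken from the Spark source; H is the Hessian, m the L-BFGS memory size):

  % Newton's method uses the exact Hessian, which costs O(n^2) memory
  % and is expensive to invert:
  x_{k+1} = x_k - H(x_k)^{-1} \nabla f(x_k)

  % Quasi-Newton methods (BFGS) replace H^{-1} with an approximation
  % updated from the step and gradient differences:
  s_k = x_{k+1} - x_k, \qquad y_k = \nabla f(x_{k+1}) - \nabla f(x_k)

  % L-BFGS keeps only the last m pairs (s_i, y_i) instead of the full
  % n-by-n approximation, reducing memory from O(n^2) to O(m n).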
In the LBFGS class, Breeze's CachedDiffFunction is used.
CachedDiffFunction vs DiffFunction
- DiffFunctions per se don't do any caching, whereas CachedDiffFunction is a simple wrapper around any DiffFunction that caches the last input and its result.
- If the CachedDiffFunction is called again with the same input, it returns the result of the previous evaluation of the inner DiffFunction instead of re-evaluating.
- Especially in the various Minimizer implementations, DiffFunctions can get called multiple times with the same input; in that case a CachedDiffFunction helps a lot if evaluating your objective function is expensive (see the sketch below).
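A minimal Breeze sketch of this behavior (the quadratic objective and the evaluation counter are my own illustration, not code from Spark or Breeze):

  import breeze.linalg.DenseVector
  import breeze.optimize.{CachedDiffFunction, DiffFunction, LBFGS}

  object CachedDiffFunctionDemo {
    def main(args: Array[String]): Unit = {
      var evaluations = 0 // counts how often the inner function actually runs

      // Toy objective: f(x) = ||x - t||^2 with gradient 2 * (x - t), t = (3,...,3)
      val f = new DiffFunction[DenseVector[Double]] {
        override def calculate(x: DenseVector[Double]): (Double, DenseVector[Double]) = {
          evaluations += 1
          val d = x - DenseVector.fill(x.length)(3.0)
          (d.dot(d), d * 2.0)
        }
      }

      // The wrapper remembers the last (input, result) pair, so repeated
      // calls at the same point skip the inner evaluation entirely.
      val cached = new CachedDiffFunction(f)

      // maxIter = 100, memory size m = 10, convergence tolerance = 1e-9
      val lbfgs = new LBFGS[DenseVector[Double]](100, 10, 1e-9)
      val optimum = lbfgs.minimize(cached, DenseVector.zeros[Double](5))

      println(s"optimum = $optimum, inner evaluations = $evaluations")
    }
  }

Spark's own LBFGS.runLBFGS follows the same pattern: the RDD-backed CostFun is wrapped in a CachedDiffFunction before being handed to Breeze's optimizer, so the expensive distributed gradient computation isn't repeated when the line search revisits the same point.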