hIPPYLearn : an inexact Newton-CG method for training neural networks with analysis of the Hessian

Gao, Ge, 1993-

Date

2017-05

Authors

Gao, Ge, 1993-

Abstract

Neural networks, as part of deep learning, have become extremely popular due to their ability to extract information from data and to generalize it to new, unseen inputs. Neural networks have contributed to progress on many classic problems. For example, in natural language processing, the use of neural networks has significantly improved the accuracy of parsing natural language sentences [11]. However, training complicated neural networks is expensive and time-consuming. In this paper, we introduce more efficient methods for training neural networks using Newton-type optimization algorithms. Specifically, we use TensorFlow, the machine learning package developed by Google [2], to define the structure of the neural network and the loss function that we want to optimize. TensorFlow's automatic differentiation capabilities allow us to efficiently compute the gradient and Hessian of the loss function, which are needed by the scalable numerical optimization algorithms implemented in hIPPYlib [12]. Numerical examples demonstrate that the Newton method outperforms the steepest descent method, both in number of iterations and in computational time. Another important contribution of this work is the study of the spectral properties of the Hessian of the loss function. The distribution of the eigenvalues of the Hessian provides valuable information about which directions in parameter space are well informed by the data.
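To make the idea of an inexact Newton-CG iteration concrete, the following is a minimal sketch in plain Python, not the implementation from the report and not the hIPPYlib or TensorFlow API. It assumes a tiny regularized logistic-regression loss with hand-coded gradient and Hessian-vector products standing in for automatic differentiation; the data `X`, `Y`, the constant `REG`, and every function name are hypothetical. The Newton system is solved only approximately by conjugate gradients, with a tolerance that tightens as the gradient shrinks, and a power iteration on Hessian-vector products gives a rough estimate of the largest Hessian eigenvalue.

```python
import math

# Hypothetical toy data: six samples, two features, binary labels.
X = [[0.5, 1.2], [1.5, -0.3], [-1.0, 0.8], [-0.7, -1.1], [2.0, 0.4], [-1.6, 0.2]]
Y = [1, 1, 0, 0, 1, 0]
REG = 0.1  # L2 regularization weight (assumed); keeps the Hessian positive definite

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def axpy(alpha, x, y):
    """Return alpha * x + y for plain Python lists."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w):
    data = sum(math.log(1.0 + math.exp(dot(w, x))) - y * dot(w, x) for x, y in zip(X, Y))
    return data / len(X) + 0.5 * REG * dot(w, w)

def grad(w):
    g = [REG * wi for wi in w]
    for x, y in zip(X, Y):
        g = axpy((sigmoid(dot(w, x)) - y) / len(X), x, g)
    return g

def hess_vec(w, v):
    """Hessian-vector product H(w) v, without ever forming H."""
    hv = [REG * vi for vi in v]
    for x in X:
        p = sigmoid(dot(w, x))
        hv = axpy(p * (1.0 - p) * dot(x, v) / len(X), x, hv)
    return hv

def cg_solve(w, g, tol):
    """Approximately solve H d = -g; stop once ||H d + g|| <= tol * ||g||."""
    d = [0.0] * len(g)
    r = [-gi for gi in g]          # residual of H d = -g at d = 0
    p, rr = list(r), dot(r, r)
    gnorm = math.sqrt(dot(g, g))
    for _ in range(10 * len(g)):
        if math.sqrt(rr) <= tol * gnorm:
            break
        hp = hess_vec(w, p)
        alpha = rr / dot(p, hp)
        d = axpy(alpha, p, d)
        r = axpy(-alpha, hp, r)
        rr_new = dot(r, r)
        p = axpy(rr_new / rr, p, r)  # new search direction: r + beta * p
        rr = rr_new
    return d

def newton_cg(w, max_iter=50, gtol=1e-6):
    """Inexact Newton-CG with Armijo backtracking; returns (w, iterations used)."""
    for it in range(max_iter):
        g = grad(w)
        gnorm = math.sqrt(dot(g, g))
        if gnorm <= gtol:
            return w, it
        tol = min(0.5, math.sqrt(gnorm))   # loose CG solves far from the minimum
        d = cg_solve(w, g, tol)
        step, f0, slope = 1.0, loss(w), dot(g, d)
        while loss(axpy(step, d, w)) > f0 + 1e-4 * step * slope and step > 1e-10:
            step *= 0.5
        w = axpy(step, d, w)
    return w, max_iter

def largest_eigenvalue(w, iters=200):
    """Power-iteration estimate of the largest Hessian eigenvalue at w."""
    v = [1.0] + [0.0] * (len(w) - 1)
    lam = 0.0
    for _ in range(iters):
        hv = hess_vec(w, v)
        lam = dot(v, hv)               # Rayleigh quotient; v has unit norm
        norm = math.sqrt(dot(hv, hv))
        v = [h / norm for h in hv]
    return lam
```

Calling `newton_cg([0.0, 0.0])` returns the optimized weights and the number of outer iterations, and `largest_eigenvalue(w)` can then be evaluated at the result. In this toy setting, a large eigenvalue marks a direction in which the data constrain the parameters strongly, while eigenvalues near `REG` mark directions determined mostly by the regularization.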