Test time cost sensitivity in machine learning
Date: 23/11/2019
Author: Gray, Gavin Douglas Buchanan
Abstract
The use of deep neural networks has enabled machines to classify images, translate
between languages and compete with humans in games. These achievements have
been enabled by the large and expensive computational resources that are now available
for training and running such networks. However, such a computational burden
is highly undesirable in some settings. In this thesis, we demonstrate how the computational
expense of a machine learning algorithm may be reduced. This is possible
because, until recently, most research in deep learning has focused on achieving better
statistical results on benchmarks, rather than targeting efficiency. However, the
learning process is flexible enough for us to control for the test-time computational
expense that will be paid when the model is run in an application. To achieve this
test-time computation sensitivity, a budget can be incorporated as part of the model.
This budget expresses what costs we are willing to incur when we allocate resources
at test time. Alternatively, we can prescribe the size or computational resources we
expect and use that to decide on the appropriate classification model. In either case,
considering the resources available when building the model allows us to use it more
effectively. In this thesis, we demonstrate methods to reduce the stored size, or floating-point
operations, of state-of-the-art classification models by an order of magnitude
with little effect on their performance. Finally, we find that such compression
can even be performed by simply changing the parameterisation of linear transforms
used in the network. These results indicate that the design of learning systems can
benefit from taking resource efficiency into account.