Multi-task learning with Gaussian processes
Abstract
Multi-task learning refers to learning multiple tasks simultaneously, both to avoid tabula rasa learning
and to share information between similar tasks. We consider a multi-task Gaussian
process regression model that learns related functions by inducing correlations between tasks directly.
Using this model as a reference for three other multi-task models, we provide a broad unifying view of
multi-task learning. This is possible because, unlike the other models, the multi-task Gaussian process
model encodes task relatedness explicitly.
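As a concrete illustration of what "inducing correlations between tasks directly" can look like, the following is a minimal sketch of a multi-task GP covariance in the intrinsic-coregionalization form Cov[f_t(x), f_t'(x')] = Kf[t, t'] · kx(x, x'); the squared-exponential kernel and all function names are illustrative assumptions, not the thesis's exact formulation.

```python
import numpy as np

def kx(X1, X2, lengthscale=1.0):
    """Squared-exponential covariance over inputs (rows of X1 and X2)."""
    sq = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2.0 * X1 @ X2.T
    return np.exp(-0.5 * sq / lengthscale**2)

def multitask_cov(X1, t1, X2, t2, Kf):
    """Joint covariance over (input, task-label) pairs:
    Cov[f_{t1[i]}(X1[i]), f_{t2[j]}(X2[j])] = Kf[t1[i], t2[j]] * kx(X1[i], X2[j])."""
    return Kf[np.ix_(t1, t2)] * kx(X1, X2)

# Two tasks whose relatedness is summarized by a single correlation rho
# (an illustrative choice):
rho = 0.8
Kf = np.array([[1.0, rho],
               [rho, 1.0]])

X = np.array([[0.0], [0.5], [1.0]])  # inputs
t = np.array([0, 0, 1])              # task label for each input
K = multitask_cov(X, t, X, t, Kf) + 1e-6 * np.eye(3)  # jitter for stability
```

With this joint covariance, the standard GP regression formulas apply to the pooled observations of all tasks, so data from one task moves the posterior of another in proportion to the corresponding entry of Kf.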
Multi-task learning models generally assume that learning multiple tasks together is beneficial. We
analyze how, and the extent to which, multi-task learning helps improve generalization in supervised
learning. Our analysis is an average-case analysis of the multi-task Gaussian process model, and
we concentrate mainly on the case of two tasks, called the primary task and the secondary task. The
main parameters are the degree of relatedness ρ between the two tasks, and πS, the fraction of the total
training observations from the secondary task. Among other results, we show that asymmetric multi-task
learning, where the secondary task serves to help the learning of the primary task, can decrease a lower
bound on the average generalization error by a factor of up to ρ²πS. When there are no observations
for the primary task, there is also an intrinsic limit to the extent to which observations for the secondary task can
help the primary task. For symmetric multi-task learning, where the two tasks help each other to
learn, we find the learning to be characterized by the term πS(1 − πS)(1 − ρ²). As far as we are aware,
our analysis contributes to an understanding of multi-task learning that is orthogonal to the existing
PAC-based results on multi-task learning. For more than two tasks, we provide an understanding of
the multi-task Gaussian process model through the structure of the predictive means and variances under
certain configurations of training observations. These results generalize existing ones in the geostatistics
literature, and may have practical applications in that domain.
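To fix the notation used above (a sketch; the cross-covariance form is an assumption consistent with the coregionalization model sketched earlier, not a quotation of the thesis's definitions):

```latex
% n_T, n_S: numbers of training observations for the primary task T
% and the secondary task S; pi_S is the secondary fraction.
\[
  \pi_S = \frac{n_S}{n_T + n_S}, \qquad
  \operatorname{Cov}\bigl[f_T(x),\, f_S(x')\bigr] = \rho\, k^x(x, x'), \qquad
  \operatorname{Cov}\bigl[f_T(x),\, f_T(x')\bigr] = k^x(x, x'),
\]
% so that rho = 1 recovers pooling the two tasks' data, and rho = 0
% decouples the tasks entirely.
```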
We evaluate the multi-task Gaussian process model on the inverse dynamics problem for a robot manipulator.
The inverse dynamics problem is to compute the torques needed at the joints to drive the
manipulator along a given trajectory, and there are advantages to learning this function for adaptive
control. A robot manipulator will often need to be controlled while holding different loads in its end
effector, giving rise to a multi-context or multi-load learning problem, and we treat predicting the inverse
dynamics for a context/load as a task. We view the learning of the inverse dynamics as a function
approximation problem and place Gaussian process priors over the space of functions. We first show
that this is effective for learning the inverse dynamics for a single context. Then, by placing independent
Gaussian process priors over the latent functions of the inverse dynamics, we obtain a multi-task
Gaussian process prior for handling multiple loads, where the inter-context similarity depends on the
underlying inertial parameters of the manipulator. Experiments demonstrate that this multi-task formulation
is effective in sharing information among the various loads, and generally improves performance
over either learning each context in isolation or pooling the data across all contexts. In addition to the experimental
results, one of the contributions of this study is showing that the multi-task Gaussian process
model follows naturally from the physics of the inverse dynamics.
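To make this last point concrete (a sketch under simplifying assumptions; the decomposition below is a stand-in for the thesis's derivation rather than its exact form): rigid-body dynamics make the torque for load m linear in that load's inertial parameters, τ_m(x) = Σ_j π_{m,j} y_j(x), with latent functions y_j shared across loads. Giving the y_j independent GP priors with a common input kernel kx (a simplifying assumption) induces a multi-task GP prior whose task covariance is fixed by the inertial parameters:

```python
import numpy as np

def load_task_cov(Pi):
    """Task (inter-context) covariance induced by the physics.

    If tau_m(x) = sum_j Pi[m, j] * y_j(x) and the y_j have independent
    GP priors sharing one input kernel kx, then
        Cov[tau_m(x), tau_m'(x')] = (Pi[m] . Pi[m']) * kx(x, x'),
    so Pi @ Pi.T plays the role of Kf in multitask_cov above, with the
    inter-context similarity set by the loads' inertial parameter
    vectors (the rows of Pi).
    """
    return Pi @ Pi.T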