Modeling Structured Dynamics with Deep Neural Networks
Villegas, Ruben
2019
Abstract
Neural networks have become powerful machinery for identifying patterns in large amounts of raw input data. Research adopting neural networks has excelled in tasks such as object recognition, reinforcement learning, speech recognition, and image inpainting, among others. Previous works have notably excelled at inferring information about the input data, either from sequences of frames or from single frames. However, very few works have focused on modeling structured motion dynamics for generative tasks. Structured motion is defined as the topological configuration of objects that is maintained constant through time. In this thesis, I develop new neural networks that effectively model structured motion dynamics for generative tasks such as future motion prediction and motion transfer. Accurate structured dynamics models are an important piece in achieving general artificial intelligence: it has been shown that agents equipped with such models can learn from environments with far fewer interactions because they can predict the consequences of their actions. Additionally, accurate motion dynamics models are useful for applications such as motion editing and motion transfer. Such applications can enhance visual artists' ability to create content for the web, or can assist movie makers in transferring motion from actors onto movie characters with minimal effort.
This thesis first presents motion dynamics models in two dimensions. I begin with a neural network architecture that decomposes video into two information pathways that handle video dynamics and frame spatial layout separately; the two pathways are later combined to generate future frames containing highly structured moving objects. Second, I take this a step further with a motion stream that is visually interpretable: the motion stream predicts structured motion dynamics as landmarks of the moving structures evolving through time, and an image generation module generates future frames from the landmarks and a single past frame using image analogy principles. Next, I keep the image analogy principles of the previous work but reformulate the video prediction problem so that general features for moving object structures are learned. Finally, taking advantage of recent advances in computational devices for large-scale deep learning research, I present a study on the effects of maximal capacity and minimal inductive bias in neural-network-based video prediction frameworks. From a thorough evaluation and experimentation, I find that network capacity plays a critical role in the performance of deep networks for video prediction, a finding that applies to any of the previously investigated methods.
This thesis then presents motion dynamics models in three dimensions. I propose a neural kinematic network with adversarial cycle consistency. Specifically, I propose a layer based on the kinematic equations that exploits the backpropagation algorithm used to optimize neural networks to automatically discover rotation angles representing pure motion, which can be used to transfer motion from one kinematic structure to another. Because learning is unsupervised, the learned model generalizes to never-before-seen human video, from which motion data is extracted using an off-the-shelf algorithm.
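As a concrete illustration of the kinematic-layer idea (this is a minimal sketch, not the thesis implementation), the code below builds a differentiable forward-kinematics function for a toy 2D chain in JAX; the bone lengths, target pose, step size, and function names are assumptions made for the example. Because joint positions are a differentiable function of the joint angles, gradients of a pose-matching loss can be backpropagated through the layer to recover rotation angles that reproduce a target pose, which is the same mechanism that lets such a layer discover rotation angles during training.

# Minimal sketch (assumed example, not the thesis code): a differentiable
# forward-kinematics layer for a 2D kinematic chain. Joint positions are a
# differentiable function of joint angles, so gradients of a pose-matching
# loss can recover the angles that reproduce a target pose.
import jax
import jax.numpy as jnp

def forward_kinematics(angles, bone_lengths):
    """Chain 2D rotations along a kinematic chain and return joint positions."""
    positions = [jnp.zeros(2)]                      # root joint at the origin
    total_angle = 0.0
    for theta, length in zip(angles, bone_lengths):
        total_angle = total_angle + theta           # parent-relative angles accumulate
        offset = length * jnp.array([jnp.cos(total_angle), jnp.sin(total_angle)])
        positions.append(positions[-1] + offset)    # child joint = parent + rotated bone
    return jnp.stack(positions)

def pose_loss(angles, bone_lengths, target_joints):
    """Squared distance between predicted and target joint positions."""
    return jnp.sum((forward_kinematics(angles, bone_lengths) - target_joints) ** 2)

bone_lengths = jnp.array([1.0, 0.8, 0.5])           # placeholder skeleton
target = forward_kinematics(jnp.array([0.3, -0.2, 0.5]), bone_lengths)

angles = jnp.zeros(3)
grad_fn = jax.grad(pose_loss)                        # gradient w.r.t. the angles
for _ in range(1000):                                # plain gradient descent
    angles = angles - 0.02 * grad_fn(angles, bone_lengths, target)

print(forward_kinematics(angles, bone_lengths))      # approaches the target pose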
Overall, this thesis focuses on modeling structured dynamics using the representational power of deep neural networks. Modeling structured dynamics is an important problem both in general artificial intelligence and in applications involving video editing, video generation, video understanding, and animation.
Subjects
deep learning, structured motion, human motion, video prediction, animation
Types
Thesis