Unsupervised Geometry-Aware Representation for 3D Human Pose Estimation


Takeaways

Constructing Geometry-Aware Latent Representation

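With,

$L$ – Latent representation.

$L^{3D}$, $L^{app}$, $B$ – Latent representations of the body's 3D pose, its appearance, and the background, respectively.

$\mathcal{U} = \{(I_t^i, I_t^j)\}_{t=1}^{N_u}$ – Set of image pairs taken from cameras $i$ and $j$ at time $t$.

$R^{i \to j}$ – Rotation matrix from the coordinate system of camera $i$ to that of camera $j$.

$E_{\theta_e}$ and $D_{\theta_d}$ – Encoder and decoder parts of the network, with learnable parameters $\theta_e$ and $\theta_d$ respectively.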

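Thus, we encode the image $I$ into a latent representation $L = E_{\theta_e}(I)$ and reconstruct it back to $\hat{I} = D_{\theta_d}(L)$ by minimizing $\lVert I - \hat{I} \rVert^2$ over the training set $\mathcal{U}$.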

Encoding Geometry

With images $I_t^i$ and $I_t^j$ from different viewpoints $i$ and $j$ and the rotation matrix $R^{i \to j}$ between them, the view-change information can be introduced as an additional input to the encoder and decoder, which are then trained to resynthesize $I_t^j$ from $I_t^i$.

This models the view change as a 3D rotation: the encoder output $L^{3D}$ is multiplied by $R^{i \to j}$ before it is passed to the decoder.

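Formally, the auto-encoder outputs

$$D_{\theta_d}(R^{i \to j} L_{i,t}^{3D}), \quad \text{with} \quad L_{i,t}^{3D} = E_{\theta_e}(I_t^i),$$

and is optimized by minimizing $\lVert D_{\theta_d}(R^{i \to j} L_{i,t}^{3D}) - I_t^j \rVert$ over the training set $\mathcal{U}$.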

The decoder doesn't need to learn how to rotate the input to a new view, only how to decode the 3D latent vector $L^{3D}$. Being a $3 \times N$ matrix, $L^{3D}$ can be understood as a set of $N$ 3D points, and it can be mapped to the 3D pose space with a different decoder in a semi-supervised setup.
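To make the view-change step concrete, here is a minimal sketch of rotating a 3×N latent code before it would be handed to a decoder. The helper names (`rotation_about_y`, `rotate_latent`), the two-point latent code, and the 90 degree camera offset are all assumptions for illustration; in the actual method the latent code comes from a trained encoder and the rotation from camera calibration.

```python
import math

def rotation_about_y(angle_rad):
    """Return a 3x3 rotation matrix about the y axis (hypothetical camera offset)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def rotate_latent(R, latent_3d):
    """Compute R @ L for a 3x3 matrix R and a 3xN latent code L (lists of rows)."""
    n_points = len(latent_3d[0])
    return [[sum(R[r][k] * latent_3d[k][n] for k in range(3))
             for n in range(n_points)]
            for r in range(3)]

# Toy latent code with N = 2; each column is one 3D point.
L_3d = [[1.0, 0.0],
        [0.0, 2.0],
        [0.0, 0.0]]

R_i_to_j = rotation_about_y(math.pi / 2)  # assumed rotation from camera i to camera j
L_rotated = rotate_latent(R_i_to_j, L_3d)  # this is what a decoder would receive
```

Because the rotation is applied explicitly, the point on the rotation axis stays where it is and the other point moves to the new camera frame; the decoder only ever sees codes that are already expressed in the target view.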

Disentangling Appearance

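Two frames $I_t$ and $I_{t'}$ of the same subject at different times $t$ and $t'$ are trained on simultaneously. Since the differences between the two images are assumed to be caused by 3D pose changes alone, the appearance parts of their latent representations are swapped; i.e. the decoder uses $L_t^{3D}$ and $L_{t'}^{app}$ to resynthesize frame $t$, and $L_{t'}^{3D}$ and $L_t^{app}$ for frame $t'$. This results in $L^{3D}$ encoding pose and $L^{app}$ encoding appearance.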
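The swap described above can be sketched in a few lines. The `Latent` container, the `swap_appearance` helper, and the numeric codes are hypothetical stand-ins; a real encoder would produce the pose and appearance codes from images.

```python
from typing import NamedTuple, Tuple

class Latent(NamedTuple):
    pose_3d: tuple      # stands in for the pose code of one frame
    appearance: tuple   # stands in for the appearance code of one frame

def swap_appearance(lat_t: Latent, lat_tp: Latent) -> Tuple[Latent, Latent]:
    """Build decoder inputs for frames t and t' with appearance codes exchanged."""
    input_for_t = Latent(lat_t.pose_3d, lat_tp.appearance)
    input_for_tp = Latent(lat_tp.pose_3d, lat_t.appearance)
    return input_for_t, input_for_tp

# Toy codes for the same subject at two different times.
lat_t = Latent(pose_3d=(0.1, 0.2, 0.3), appearance=(5.0, 6.0))
lat_tp = Latent(pose_3d=(0.4, 0.5, 0.6), appearance=(7.0, 8.0))

dec_in_t, dec_in_tp = swap_appearance(lat_t, lat_tp)
```

Each frame keeps its own pose code but must be reconstructed with the other frame's appearance code, so anything that changes between the frames has to be carried by the pose code.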

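The encoder-decoder setup now becomes

$$D_{\theta_d}(R^{i \to j} L_{i,t}^{3D}, L_{k,t'}^{app}),$$

where the encoder now outputs the pair $(L^{3D}, L^{app})$, with $L_{i,t}^{3D}$ taken from $E_{\theta_e}(I_t^i)$ and $L_{k,t'}^{app}$ taken from $E_{\theta_e}(I_{t'}^k)$.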

Disentangling Background

A background image $B_j$ is constructed as the median of all images from viewpoint $j$. The decoder is given a direct connection to $B_j$, and an additional convolutional layer combines it with the decoder output to synthesize the decoded image.
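As a small illustration of the background construction only (not of the convolutional layer), the following sketch computes a per-pixel median over frames from one viewpoint. The 2×2 grayscale frames and the `median_background` helper are made up for the example.

```python
from statistics import median

def median_background(frames):
    """Per-pixel median over equally sized grayscale frames (each a list of rows)."""
    height, width = len(frames[0]), len(frames[0][0])
    return [[median(frame[y][x] for frame in frames) for x in range(width)]
            for y in range(height)]

# Three toy frames from the same (hypothetical) camera; the background value is 10.
frames = [
    [[10, 10], [10, 10]],    # empty scene
    [[10, 200], [10, 10]],   # subject covers the top-right pixel
    [[10, 10], [180, 10]],   # subject covers the bottom-left pixel
]

background = median_background(frames)
```

Since the subject occupies any given pixel in only a minority of frames here, the median recovers the static background value at every pixel.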

Optimization

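The sum of the per-pixel errors is minimized over mini-batches of triplets $(I_t^i, I_t^j, I_{t'}^k)$ with $t \neq t'$, drawn from individual sequences.

That is,

$$E_{\theta_d, \theta_e} = \frac{1}{Z} \sum_{\substack{I_t^i, I_t^j, I_{t'}^k \in \mathcal{U} \\ t \neq t'}} \left\lVert D_{\theta_d}(R^{i \to j} L_{i,t}^{3D}, L_{k,t'}^{app}, B_j) - I_t^j \right\rVert,$$

with $(L_{i,t}^{3D}, \cdot) = E_{\theta_e}(I_t^i)$, $(\cdot, L_{k,t'}^{app}) = E_{\theta_e}(I_{t'}^k)$, and $Z$ a normalization constant.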
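As a rough illustration of this objective, the sketch below sums per-pixel errors between predicted and target images and normalizes over a mini-batch. Two choices here are assumptions rather than details taken from the text: the error is an absolute difference, and the normalizer is the batch size. The images are tiny made-up grids standing in for decoder outputs and target views.

```python
def pixel_error(predicted, target):
    """Sum of absolute per-pixel differences between two equally sized images."""
    return sum(abs(p - q)
               for pred_row, target_row in zip(predicted, target)
               for p, q in zip(pred_row, target_row))

def batch_loss(predictions, targets):
    """Per-pixel error summed per image, averaged over the mini-batch (assumed normalizer)."""
    total = sum(pixel_error(p, t) for p, t in zip(predictions, targets))
    return total / len(predictions)

# Toy mini-batch of two 2x2 images: decoder outputs and their target views.
predictions = [[[1, 2], [3, 4]], [[0, 0], [0, 0]]]
targets = [[[1, 2], [3, 5]], [[1, 0], [0, 1]]]

loss = batch_loss(predictions, targets)  # errors are 1 and 2, so the mean is 1.5
```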
