Can you help me rewrite the text below in a more condensed manner while maintaining all the important information? It’s part of a video voiceover to explain the technical approach of pose estimation and its results on dancers’ bodies
understand, we first had to capture the 3D positions of our dancers' joints over time, using ultra-low-light RGB cameras and third-party neural networks pre-trained on datasets of moving humans.
This process is called 3D human pose estimation, and we searched widely for the latest state-of-the-art neural network models for this task. After testing a range of both commercial and open-source techniques, we settled on the publicly available GVHMR model, released in September 2024, for the final processing of our movement-workshop recordings.
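For a concrete picture of what this stage produces, the sketch below shows the general shape of monocular pose-estimation output. Note the caveats: `estimate_pose` is a hypothetical stand-in, not GVHMR's real interface, and the 24-joint count is simply the SMPL body model's convention; this is an illustration, not our pipeline.

```python
import numpy as np

def estimate_pose(frame):
    """Hypothetical single-frame estimator (stand-in, NOT GVHMR's API):
    returns (J, 3) joint positions plus a (3,) world-space root
    translation that places the body relative to the ground plane."""
    rng = np.random.default_rng(frame % 2**32)
    return rng.normal(size=(24, 3)), rng.normal(size=3)  # SMPL uses 24 joints

# A stand-in for 10 seconds of video at 30 fps.
video = range(300)
joints = np.stack([estimate_pose(f)[0] for f in video])  # (300, 24, 3)
roots = np.stack([estimate_pose(f)[1] for f in video])   # (300, 3)
print(joints.shape, roots.shape)
```

The key point for what follows: every frame yields a full set of joint positions and a root translation, whether or not the model actually understood what it saw.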
While our initial goal was stable, reliable, high-quality recording data, we came to a significant realisation: the true value of this camera-based motion capture process lay instead in the oddities, glitches and errors produced by the stranger moments in our movement-workshop dataset.
When the dancers covered their faces, distorted their bodies and tried to appear as inhuman as possible, the pose estimation models simply couldn't cope: joints and root positions flicker and shift, previously stable axes rotate, and everything the machine thought it knew about the observed human is undone.
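This flicker can be made visible numerically. The sketch below (illustrative only, using synthetic data rather than our recordings) measures mean frame-to-frame displacement per joint; a joint whose estimate is flickering shows a displacement far larger than its neighbours.

```python
import numpy as np

def joint_jitter(poses):
    """Mean frame-to-frame displacement per joint.

    poses: (T, J, 3) array of 3D joint positions over T frames.
    Returns a (J,) array; large values flag flickering estimates.
    """
    deltas = np.linalg.norm(np.diff(poses, axis=0), axis=-1)  # (T-1, J)
    return deltas.mean(axis=0)

# Synthetic example: smooth trajectories for 24 joints, with one
# joint corrupted by large per-frame noise to mimic estimator flicker.
T, J = 100, 24
rng = np.random.default_rng(0)
poses = np.cumsum(rng.normal(0, 0.001, size=(T, J, 3)), axis=0)
poses[:, 7] += rng.normal(0, 0.5, size=(T, 3))  # joint 7 "flickers"

jitter = joint_jitter(poses)
print(int(jitter.argmax()))  # the flickering joint dominates
```

In practice a threshold on such a metric could separate "clean" capture from the moments of machine incomprehension described above.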
When digitised and applied to a 3D avatar, these moments of machine incomprehension produce surreal, somewhat disturbing results: the figure drifts above the ground plane and moves in staccato, jerking motions, and all humanity seems to disappear.
Or does it?
Ultimately, this is what the machine believes to be true, based on what its senses are telling it. Needless to say, this sent us down a path of philosophical and artistic questioning, examining what it means to 'move as a human' and to be observed by a machine that inhabits a different 'reality' from our own. Most importantly: how can these discrepancies in perception amplify a co-performance?