Rewrite this as a clear cohesive summary of an interactive experience:
A single user enters the interactive zone, where their movements and gestures are tracked by the camera(s).
They face the LED screen and the camera(s), with their back to any surrounding audience.
They are prompted to move, dance, pose, mimic, wave, jump, etc. (The experience might open with a short 'tour' of example movements/choreographies to spark ideas, since people are unlikely to move in interesting ways unprompted.)
A clear cue (visual or audio) signals the start of the experience.
As they move, they see a tracked avatar / skeleton / representation of themselves, perhaps with overlaid graphics displaying their tracking data: Position/Coordinates (X, Y, Z), Velocity, Acceleration, Angular Velocity, Angular Acceleration, Jerk, Maximum Speed, Jump Height, Range of Motion, etc.
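The on-screen metrics above could be derived from the tracked joint positions with simple finite differences. A minimal sketch, assuming positions arrive at a fixed frame rate with Y as the vertical axis; the function and field names are illustrative, not from any specific tracking SDK:

```python
# Sketch: deriving display metrics (velocity, jerk, jump height, range of
# motion) from one tracked joint's positions. Assumes fixed-rate sampling.
import numpy as np

def kinematic_metrics(positions: np.ndarray, fps: float = 30.0) -> dict:
    """positions: (frames, 3) array of X, Y, Z coordinates for one joint."""
    dt = 1.0 / fps
    velocity = np.gradient(positions, dt, axis=0)        # m/s per axis
    acceleration = np.gradient(velocity, dt, axis=0)     # m/s^2
    jerk = np.gradient(acceleration, dt, axis=0)         # m/s^3
    speed = np.linalg.norm(velocity, axis=1)
    return {
        "max_speed": float(speed.max()),
        "mean_speed": float(speed.mean()),
        "peak_jerk": float(np.linalg.norm(jerk, axis=1).max()),
        # Range of motion: extent of travel along each axis.
        "range_of_motion": (positions.max(axis=0) - positions.min(axis=0)).tolist(),
        # Jump height: peak vertical position above the starting height (Y up).
        "jump_height": float(positions[:, 1].max() - positions[0, 1]),
    }
```

Angular velocity/acceleration would need joint angles rather than positions, so they are omitted here.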
We could also show their movements being segmented into 3-second loops, which are saved and replayed, representing how the training data is segmented.
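The segmentation step above is straightforward to sketch: slice the incoming frame stream into fixed-length windows. The frame rate and the choice to drop a trailing partial window are assumptions:

```python
# Sketch: slicing the live tracking stream into non-overlapping 3-second
# loops, mirroring how the training data is segmented.
import numpy as np

def segment_into_loops(frames: np.ndarray, fps: float = 30.0,
                       loop_seconds: float = 3.0) -> list:
    """Split (frames, ...) tracking data into 3 s loops; any trailing
    partial window is dropped."""
    loop_len = int(round(fps * loop_seconds))
    n_loops = len(frames) // loop_len
    return [frames[i * loop_len:(i + 1) * loop_len] for i in range(n_loops)]
```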
Based on the poses they strike and the movements they perform, each movement is 'saved' to a location within the 'LSTM Galaxy' determined by values derived from the tracking data.
Movements and choreographies are represented within the 'Galaxy' as spheres / stars / planets and visualised on screen. We can zoom into a sphere to see videos of the movements it contains (the video shows only the avatar / representation performing the movements, never footage of the person).
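The mapping from a saved movement to a position in the Galaxy could work by reducing each movement's feature (or LSTM latent) vector to three coordinates. A minimal sketch using a plain PCA via numpy's SVD; in the real piece the embedding would likely come from the trained LSTM itself:

```python
# Sketch: placing saved movements in the 'LSTM Galaxy'. Each movement is
# summarised as a feature vector; the top 3 principal components become
# its X, Y, Z coordinates in the galaxy. All names are illustrative.
import numpy as np

def galaxy_coordinates(features: np.ndarray) -> np.ndarray:
    """features: (n_movements, n_features) -> (n_movements, 3) positions."""
    centered = features - features.mean(axis=0)
    # Principal axes via SVD; similar movements land near each other.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:3].T
```

A nonlinear embedding (t-SNE, UMAP) would cluster similar choreographies more aggressively, at the cost of less stable positions as new movements arrive.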
The experience consists of four phases:
Prompting / Cueing
Realtime Recording
Analysis / Inference
Mapping to Latent Space / Galaxy