At Gen-9, we explored home-based therapy applications for patients recently discharged from hospital and for elderly individuals who needed ongoing physical rehabilitation. The challenge: how do you guide someone through exercises correctly without a therapist physically present?
Using Microsoft Kinect depth cameras, we tracked the patient's full body skeleton in real time — torso, limbs, and joints — and compared their movements against prescribed exercise forms. The system provided immediate corrective feedback, counted repetitions, and generated session reports for remote clinicians conducting teleconsultations.
Watch a simulated therapy session comparing ideal exercise form (left) against a patient's tracked movements (right). Use the timeline scrubber to step through the exercise, or press play to watch the full session.
The Kinect captures a 640×480 depth stream at 30 fps using structured infrared light. Its body-tracking pipeline identifies 20 skeletal joints per person in real time, producing 3D coordinates for joints including the head, shoulders, elbows, wrists, hips, knees, and ankles.
For each joint triplet (e.g., shoulder–elbow–wrist), the angle at the middle joint is computed from the dot product of the two adjacent bone vectors. Each angle is compared against the corresponding angle in a prescribed reference pose, and the deviations are scored per joint and aggregated into an overall form accuracy percentage.
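The angle computation and scoring described above can be sketched roughly as follows. This is a minimal illustration, not the original implementation: the function names, the linear falloff, and the 45-degree tolerance are all assumptions chosen for the example, since the write-up does not state how deviations were weighted.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at joint b for the triplet a-b-c (3D points)."""
    # Bone vectors pointing away from the middle joint.
    u = tuple(ai - bi for ai, bi in zip(a, b))
    v = tuple(ci - bi for ci, bi in zip(c, b))
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("degenerate bone vector (coincident joints)")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


def form_accuracy(measured, reference, tolerance_deg=45.0):
    """Per-joint scores in [0, 1] and an overall percentage.

    Assumed scoring rule: the score falls linearly from 1 at zero
    deviation to 0 at tolerance_deg or more.
    """
    scores = {}
    for name, ref_angle in reference.items():
        deviation = abs(measured[name] - ref_angle)
        scores[name] = max(0.0, 1.0 - deviation / tolerance_deg)
    overall = 100.0 * sum(scores.values()) / len(scores)
    return scores, overall
```

For example, calling `joint_angle` with a shoulder at `(1, 0, 0)`, an elbow at `(0, 0, 0)`, and a wrist at `(0, 1, 0)` gives a right angle at the elbow. In practice the tolerance would probably be set per joint and per exercise by the prescribing clinician.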
Color-coded joint indicators and directional cues guide the patient toward correct form. The system counts reps, tracks accuracy over time, and produces session summaries for remote clinician review.
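One common way to count repetitions from a stream of joint angles is a two-threshold (hysteresis) state machine, sketched below. The write-up does not say which method the system used, so the class name `RepCounter` and the 60 and 150 degree thresholds are hypothetical values for an elbow-curl-style exercise.

```python
class RepCounter:
    """Counts reps from a per-frame joint angle using two thresholds.

    A rep is one full cycle: the angle drops to low_deg or below
    (flexed), then rises to high_deg or above (extended). Using two
    thresholds keeps sensor jitter near one threshold from being
    counted as extra reps.
    """

    def __init__(self, low_deg=60.0, high_deg=150.0):
        if low_deg >= high_deg:
            raise ValueError("low_deg must be less than high_deg")
        self.low_deg = low_deg      # assumed flexion threshold
        self.high_deg = high_deg    # assumed extension threshold
        self.flexed = False
        self.count = 0

    def update(self, angle_deg):
        """Feed one frame's angle; returns the running rep count."""
        if not self.flexed and angle_deg <= self.low_deg:
            self.flexed = True
        elif self.flexed and angle_deg >= self.high_deg:
            self.flexed = False
            self.count += 1
        return self.count
```

Feeding `update` one angle per tracked frame yields a running count that can be shown to the patient and written into the session summary.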
The system processed 30 depth frames per second with the Kinect SDK's skeleton tracker, extracting 20 joint positions per frame with centimeter-level accuracy at typical therapy distances (1.5–3 m).
The entire system ran on consumer hardware: a Kinect sensor (~$150) and a standard PC. That made home deployment feasible, in contrast to clinic-grade motion capture systems costing tens of thousands of dollars.