Saket Shirsath committed · Commit 08f2cb4 · 1 Parent: b908a66

Update index.md
index.md CHANGED
@@ -1,71 +1,35 @@
### Saket Shirsath, Parth Thakkar, Ben Wolfson
<br>
## Problem Statement:
The next step of our implementation is to analyze the exercise form in the input footage and determine whether form breakdown is occurring. First, we detect the subject's keypoints, then use either Chamfer distance or normalized correlation against our reference exercise database to find deviations from 'good' form, isolating the specific joints or body parts that are known areas of breakdown for a particular exercise. We can run our human pose detection on both the input image and the stored good/bad form images and compare the outputs. Based on the Chamfer distance or normalized correlation we calculate, we can advise the user on what to work on.
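As a rough illustration of this comparison step, the sketch below (Python/NumPy, not code from our repository) computes a symmetric Chamfer distance between two sets of detected 2D keypoints; the example coordinates are made up.

```python
import numpy as np

def chamfer_distance(pose_a, pose_b):
    """Symmetric Chamfer distance between two (N, 2) arrays of 2D keypoints.

    Each point in one set is matched to its nearest neighbor in the other set;
    the two average nearest-neighbor distances are summed.
    """
    pose_a = np.asarray(pose_a, dtype=float)
    pose_b = np.asarray(pose_b, dtype=float)
    # Pairwise Euclidean distances between every keypoint in A and every keypoint in B.
    dists = np.linalg.norm(pose_a[:, None, :] - pose_b[None, :, :], axis=-1)
    return dists.min(axis=1).mean() + dists.min(axis=0).mean()

# Hypothetical keypoints (shoulder, hip, knee) from an input frame and a stored reference.
input_pose = np.array([[120, 80], [118, 140], [122, 200]])
reference_pose = np.array([[119, 78], [117, 138], [125, 210]])
print(chamfer_distance(input_pose, reference_pose))
```

A small distance suggests the input pose is close to the stored 'good form' reference; a large distance flags a deviation worth pointing out to the user.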
<img src="assets\OpenPose.jpg" height="500px">
### Experimental Setup:
- Will be used to train our neural net model
- We will collect images of good and bad form for our 3 different exercises from different angles from this [dataset](https://exrx.net/Lists/Directory), to use as the base images that our input media will be compared against.
- We will use a code repository called OpenPose, which provides a neural network model trained on a dataset of human pose keypoints. This will help us detect the pose our subject is performing and find points of interest.
- [Tutorial](https://www.learnopencv.com/deep-learning-based-human-pose-estimation-using-opencv-cpp-python/)
- [Repository](https://github.com/spmallick/learnopencv/tree/master/OpenPose)
- We will implement Hough transform code to detect unwanted curvature, such as a rounded back.
- We will implement code that calculates the Chamfer distance between two human poses, one from the input media and one from a stored image, to determine their similarities and differences.
- We will implement normalized correlation code that measures the similarity between the input media and the stored images.
- Note: we will be comparing the effectiveness of Chamfer distance vs. normalized correlation to see which works better (a rough sketch of the normalized correlation score follows this list).
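As a rough sketch of the normalized correlation comparison (Python/NumPy; the function name and the mean-centering step are illustrative assumptions, not our final implementation):

```python
import numpy as np

def normalized_correlation(pose_a, pose_b):
    """Normalized correlation between two flattened keypoint arrays.

    Both poses are (N, 2) arrays with the same joints in the same order.
    The score approaches 1.0 for very similar poses and drops toward 0
    (or below) as the poses diverge.
    """
    a = np.asarray(pose_a, dtype=float).ravel()
    b = np.asarray(pose_b, dtype=float).ravel()
    a -= a.mean()
    b -= b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```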
### Success Measure:
We can define our success in a series of milestones:
1. Successfully detect the appropriate joints and limbs in media where a person is performing any of our 3 exercises.
2. Successfully identify the exercise an individual is performing based on our human pose estimation on the input media.
3. Successfully identify areas of bad form in an individual’s exercise performance.
## List of Experiments (performed with the 5 people mentioned earlier):
One major point of uncertainty for this project is how our HPE model will interpret a barbell, which is required in all three exercises we are focusing on. The second is the camera angle, or perspective, on the user in the input footage. Most past HPE experiments rely on front-facing subjects, but capturing an exercise's areas of potential form breakdown is not always possible with the camera facing the user head-on. We will have to experiment with our model to ensure that it is robust and works from multiple angles for each exercise.
### Identification of Exercise:
Provide several input images to our program and detect which exercises they depict. This experiment is crucial for our next 3 experiments. We will feed in different variations of the 3 exercises, with both good and bad form, to make sure we can detect them accurately.
- Uncertainties: If an exercise is performed egregiously poorly (with significant deviation from proper form), will our program still be able to accurately detect it?
### Deadlift:
- Good Form: Attempt to detect features such as vertical arms and a straight back.
- Bad Form: Attempt to detect features such as a curved back and diagonal or bent arms.
### Squat:
- Good Form: Attempt to detect features such as a straight back, butt behind the feet, and hips parallel to the ground.
- Bad Form: Attempt to detect features such as a curved back and hips non-parallel to the ground (signifies an incomplete rep).
### Bench Press:
- Good Form: Attempt to detect tucked-in elbows, a slightly arched back, and the bar directly above the chest.
- Bad Form: Attempt to detect a flat back and flared-out elbows.
### Saket Shirsath, Parth Thakkar, Ben Wolfson
<br>
[Project Proposal](proposal.md)
## Abstract:
Weightlifters, both old and new, injure themselves while performing popular exercises like the bench press, squat, and deadlift. We aim to create a program that identifies which exercise is being performed and what corrections need to be made to achieve ideal form. Our main objective for this update is to identify which exercise is being performed. We took two approaches to identify exercises: a convolutional neural network pre-trained on the MPII Human Pose Dataset and a convolutional neural network built with Keras and trained on images scraped from the internet. Both provided very promising initial results.
## Teaser Figure:
<img src="assets\teaser.png" height="500px">
## Pose Identification using a Convolutional Neural Network:
One of our milestones for this iteration of the project was successful pose detection for the exercises we are working with. This entails finding the wireframe (stick-figure) representation of a person's body while they perform a particular exercise. Our approach uses a neural network pre-trained on the MPII Human Pose Dataset. The first stage of the process creates a set of 2D confidence maps of body part locations such as the elbow, knee, and wrist. The confidence maps are then run through an algorithm to produce the 2D joint locations for the person in the image.
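The sketch below shows this pipeline in the spirit of the learnopencv OpenPose tutorial cited in our proposal: load the pre-trained network with OpenCV's DNN module, run a forward pass to obtain the per-part confidence maps, and take the peak of each map as that joint's 2D location. The model file paths, input size, and confidence threshold are assumptions for illustration, not our exact configuration.

```python
import cv2

# Assumed locations of the MPI (MPII-trained) OpenPose model files.
PROTO = "pose/mpi/pose_deploy_linevec_faster_4_stages.prototxt"
WEIGHTS = "pose/mpi/pose_iter_160000.caffemodel"
N_PARTS = 15  # number of body-part confidence maps in the MPI model

def detect_keypoints(image_path, conf_threshold=0.1):
    """Return one (x, y) joint location per body part, or None if below threshold."""
    frame = cv2.imread(image_path)
    h, w = frame.shape[:2]

    net = cv2.dnn.readNetFromCaffe(PROTO, WEIGHTS)
    blob = cv2.dnn.blobFromImage(frame, 1.0 / 255, (368, 368), (0, 0, 0),
                                 swapRB=False, crop=False)
    net.setInput(blob)
    out = net.forward()  # shape: (1, channels, H', W') confidence maps

    points = []
    for part in range(N_PARTS):
        heatmap = out[0, part, :, :]
        _, conf, _, peak = cv2.minMaxLoc(heatmap)
        # Rescale the heatmap peak back to original image coordinates.
        x = int(w * peak[0] / out.shape[3])
        y = int(h * peak[1] / out.shape[2])
        points.append((x, y) if conf > conf_threshold else None)
    return points
```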
| Squat | Bench | Deadlift |
| ----- | ----- | -------- |
|<img src="assets\squat_pose.png" height="500px">|<img src="assets\bench_pose.png" height="500px">|<img src="assets\deadlift_pose.png" height="500px">|
## Results for Human Pose Estimation (HPE):
Our goal with this approach was to develop a library of poses for each of these exercises, so that when we pass an input image into our program, we can compare its pose to the library using an image comparison algorithm such as shortest squared distance or a structural similarity measure. We quickly realized that this approach was not very consistent. There were often similarities in pose position between exercises like the squat and the deadlift, and we were not getting accurate results. It seemed that the position of the barbell with respect to the body in the image was a bigger factor in exercise classification than we originally thought.
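For reference, here is a minimal sketch of the library-comparison idea described above, using the sum of squared distances between normalized keypoint arrays; the normalization and function names are illustrative assumptions rather than our final code.

```python
import numpy as np

def classify_by_pose_library(input_pose, pose_library):
    """Label an input pose by its nearest reference pose in the library.

    `input_pose` is an (N, 2) array of joint coordinates; `pose_library` maps
    an exercise name to a list of reference (N, 2) arrays with the same joint
    order. Poses are centered and scaled so the comparison is not dominated by
    where the person stands in the frame.
    """
    def normalize(p):
        p = np.asarray(p, dtype=float)
        p = p - p.mean(axis=0)
        return p / (np.linalg.norm(p) + 1e-8)

    query = normalize(input_pose)
    best_label, best_score = None, np.inf
    for label, references in pose_library.items():
        for ref in references:
            score = np.sum((query - normalize(ref)) ** 2)
            if score < best_score:
                best_label, best_score = label, score
    return best_label, best_score
```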
Nevertheless, the results of the human pose estimation were very promising. We expect that the actual positions of the joints could be taken into account to give form recommendations to the user, provided we classify the exercise being performed correctly. As a result, we needed to find a more accurate way to detect our exercises.
## Image Classification with a Categorical Convolutional Neural Network:
To solve our issue with accurate exercise classification, we experimented with a brute-force categorical convolutional neural network built with Keras. Using the Bing Web Search API, we scraped 250 images for each of the chosen exercises: barbell bench press, barbell back squat, and barbell deadlift. Then, we manually filtered through those to throw away faulty representations and fix formatting issues. For our final dataset, we were left with 155 bench press images, 196 squat images, and 198 deadlift images. For model validation, we randomly allocated 20% of our dataset. The associated image collections are shown in the Appendix.
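A minimal Keras sketch of the kind of categorical CNN described here; the directory layout, image size, architecture, and training settings are illustrative assumptions rather than our exact model.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Assumed layout: data/{bench_press,squat,deadlift}/*.jpg from the Bing scrape,
# with 20% of the images held out for validation.
IMG_SIZE = (224, 224)
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data", validation_split=0.2, subset="training", seed=42,
    image_size=IMG_SIZE, batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "data", validation_split=0.2, subset="validation", seed=42,
    image_size=IMG_SIZE, batch_size=32)

# Small categorical CNN: conv/pool blocks followed by a 3-way softmax.
model = models.Sequential([
    layers.Rescaling(1.0 / 255, input_shape=IMG_SIZE + (3,)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(3, activation="softmax"),  # bench press, squat, deadlift
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=10)
```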