Sean Carnahan committed
Commit e2492f0 · 1 Parent(s): ac66fa3

Move app files to root for Hugging Face deployment

Files changed (4)
  1. Dockerfile +30 -0
  2. README.md +76 -2
  3. app.py +590 -0
  4. requirements.txt +80 -0
Dockerfile ADDED
@@ -0,0 +1,30 @@
+ # Use a Python version that matches your (keras2env) environment as closely as possible
+ FROM python:3.9-slim
+
+ WORKDIR /app
+
+ # Install system dependencies
+ RUN apt-get update && apt-get install -y --no-install-recommends \
+     libgl1-mesa-glx \
+     libglib2.0-0 \
+     && rm -rf /var/lib/apt/lists/*
+
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy all necessary application files and folders from HFup/ to /app in the container
+ # These paths are relative to the Dockerfile's location (i.e., inside HFup/)
+ COPY app.py .
+ COPY bodybuilding_pose_analyzer bodybuilding_pose_analyzer
+ COPY external external
+ COPY yolov7 yolov7
+ COPY yolov7-w6-pose.pt .
+ COPY static static
+
+ # Ensure the uploads directory within static exists and is writable
+ RUN mkdir -p static/uploads && chmod -R 777 static/uploads
+
+ EXPOSE 7860
+
+ # Command to run the app with Gunicorn
+ CMD ["gunicorn", "--bind", "0.0.0.0:7860", "--workers", "1", "--threads", "2", "--timeout", "300", "app:app"]
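A quick way to sanity-check this image before pushing to the Hub is to build and run it locally, then hit the index route. A minimal sketch in Python (the `pose-analyzer` tag and the `requests` check are illustrative assumptions, not part of this commit):

```python
# Assumes the image was built and started locally, e.g.:
#   docker build -t pose-analyzer . && docker run -p 7860:7860 pose-analyzer
import requests

resp = requests.get("http://localhost:7860/", timeout=30)
resp.raise_for_status()  # Gunicorn binds 0.0.0.0:7860 per the CMD above
print("Container responded with status", resp.status_code)
```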
README.md CHANGED
@@ -1,2 +1,76 @@
- # cv.github.io
- View Project Midterm Update 1 [here](https://saketshirsath.github.io/cv.github.io/)
+ ---
+ title: Gladiator Pose Analyzer
+ emoji: 💪🏋️‍♂️
+ colorFrom: blue
+ colorTo: green
+ sdk: docker
+ pinned: false
+ app_port: 7860
+ # Add license if you have one, e.g., license: apache-2.0
+ ---
+
+ # Gladiator Pose Analyzer 💪🏋️‍♂️
+
+ **Live Demo:** [Link to your Gladiator Pose Analyzer Hugging Face Space] (<- REPLACE THIS with your actual Space URL after deployment)
+
+ ## Overview
+
+ The Gladiator Pose Analyzer is a web application for bodybuilding pose analysis and feedback. Users upload videos of their poses, and the application uses computer vision models to report joint angles, suggest form corrections, and classify the pose being held.
+
+ This Space uses a Flask backend with several machine learning models for pose estimation and classification.
+
+ ## Features
+
+ * **Video Upload:** Upload your bodybuilding pose videos (MP4, AVI, MOV, MKV).
+ * **Multiple Pose Estimation Models:**
+   * **Gladiator SupaDot (MediaPipe):** General pose estimation using MediaPipe Pose.
+   * **Gladiator BB - Lightning (MoveNet):** Fast and efficient pose estimation with MoveNet Lightning.
+   * **Gladiator BB - Thunder (MoveNet):** Higher-accuracy pose estimation with MoveNet Thunder.
+   * **(Experimental) YOLOv7-w6 Pose:** Object-detection-based pose estimation (can be selected if enabled in the UI).
+ * **Automated Pose Classification:** A custom-trained CNN classifies common bodybuilding poses (e.g., Side Chest, Front Double Biceps).
+ * **Real-time Feedback Panel:** Displays:
+   * Selected model.
+   * Current classified pose (via CNN, updated periodically).
+   * Calculated body angles (e.g., shoulder, elbow, knee).
+   * Specific form corrections based on ideal angle ranges for classified poses.
+   * General notes for poses where specific angle checks aren't defined.
+ * **Processed Video Output:** View the input video overlaid with detected keypoints and the feedback panel.
+
+ ## How to Use
+
+ 1. **Navigate to the Live Demo link** provided above.
+ 2. **Choose a Pose Estimation Model** from the dropdown menu:
+    * `Gladiator SupaDot` (MediaPipe-based)
+    * `Gladiator BB - Lightning` (MoveNet Lightning)
+    * `Gladiator BB - Thunder` (MoveNet Thunder)
+ 3. **Select a Video File:** Click the "Choose File" button and select a video of your pose.
+ 4. **Upload:** Click the "Upload Video" button.
+ 5. **Processing:** Wait for the video to be processed. The server analyzes the video frame by frame.
+ 6. **View Results:** The processed video with keypoint overlays and the dynamic feedback panel will be displayed.
+
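+ You can also call the `/upload` endpoint programmatically. A minimal sketch using `requests` (the Space URL is a placeholder; the field names mirror the `/upload` route in `app.py`):
+
+ ```python
+ import requests
+
+ SPACE_URL = "https://YOUR-SPACE.hf.space"  # placeholder: your actual Space URL
+
+ with open("my_pose_video.mp4", "rb") as f:  # hypothetical local video
+     resp = requests.post(
+         f"{SPACE_URL}/upload",
+         files={"video": f},
+         data={"model_choice": "movenet", "movenet_variant": "lightning"},
+     )
+ resp.raise_for_status()
+ result = resp.json()
+ print(result["message"])                  # e.g. "Video processed successfully with movenet"
+ print(SPACE_URL + result["output_path"])  # URL of the processed video
+ ```
+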
+ ## Models Used
+
+ * **Pose Estimation:**
+   * **MediaPipe Pose:** For the "Gladiator SupaDot" option.
+   * **Google MoveNet (Lightning & Thunder):** TensorFlow Hub models for the "Gladiator BB" options.
+   * **YOLOv7-w6 Pose:** `yolov7-w6-pose.pt` (if enabled/selected).
+ * **Pose Classification:**
+   * A custom Convolutional Neural Network (CNN) trained on bodybuilding poses, loaded from `external/BodybuildingPoseClassifier/bodybuilding_pose_classifier.h5`.
+   * Classes: Side Chest, Front Double Biceps, Back Double Biceps, Front Lat Spread, Back Lat Spread.
+
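+ The classifier can be loaded and inspected on its own. A minimal sketch, assuming the `.h5` file is present at the path above (the expected shapes follow the preprocessing in `app.py`):
+
+ ```python
+ from tensorflow.keras.models import load_model
+
+ model = load_model("external/BodybuildingPoseClassifier/bodybuilding_pose_classifier.h5")
+ print(model.input_shape)   # expected (None, 150, 150, 3): 150x150 RGB frames, as in app.py
+ print(model.output_shape)  # expected (None, 5): one score per pose class listed above
+ ```
+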
+ ## Technical Stack
+
+ * **Backend:** Flask (Python)
+ * **Frontend:** HTML, CSS, JavaScript (served by Flask)
+ * **CV & ML Libraries:** OpenCV, TensorFlow/Keras, PyTorch, MediaPipe
+ * **Deployment:** Docker on Hugging Face Spaces
+
+ ## Known Issues & Limitations
+
+ * Accuracy of pose estimation and classification can vary depending on video quality, lighting, angle, and occlusion.
+ * The feedback provided is based on predefined angle ranges and may not cover all nuances of perfect form.
+ * Processing time can be significant for longer videos or when using more computationally intensive models.
+
+ ---
+
+ *Remember to replace placeholder links and add any other specific information relevant to your project!*
app.py ADDED
@@ -0,0 +1,590 @@
+ from flask import Flask, render_template, request, jsonify, send_from_directory, url_for
+ from flask_cors import CORS
+ import cv2
+ import torch
+ import numpy as np
+ import os
+ from werkzeug.utils import secure_filename
+ import sys
+ import traceback
+ from tensorflow.keras.models import load_model
+ from tensorflow.keras.preprocessing import image
+ import time
+
+ # Add bodybuilding_pose_analyzer to path
+ sys.path.append('.')  # Assuming app.py is at the root of cv.github.io
+ from bodybuilding_pose_analyzer.src.movenet_analyzer import MoveNetAnalyzer
+ from bodybuilding_pose_analyzer.src.pose_analyzer import PoseAnalyzer
+
+ # Add YOLOv7 to path
+ sys.path.append('yolov7')
+
+ from yolov7.models.experimental import attempt_load
+ from yolov7.utils.general import check_img_size, non_max_suppression_kpt, scale_coords
+ from yolov7.utils.torch_utils import select_device
+ from yolov7.utils.plots import plot_skeleton_kpts
+
+ def wrap_text(text: str, font_face: int, font_scale: float, thickness: int, max_width: int) -> list[str]:
+     """Wrap text to fit within max_width."""
+     if not text:
+         return []
+
+     lines = []
+     words = text.split(' ')
+     current_line = ''
+
+     for word in words:
+         # Check whether current_line plus the next word still fits
+         test_line = current_line + word + ' '
+         (text_width, _), _ = cv2.getTextSize(test_line.strip(), font_face, font_scale, thickness)
+
+         if text_width <= max_width:
+             current_line = test_line
+         else:
+             # Word doesn't fit, so current_line (without the new word) is a complete line
+             if current_line.strip():  # guard against emitting an empty line
+                 lines.append(current_line.strip())
+             # Start a new line with the current word
+             current_line = word + ' '
+             # If a single word is too long, it will still overflow; breaking the word is a future enhancement
+             (single_word_width, _), _ = cv2.getTextSize(word.strip(), font_face, font_scale, thickness)
+             if single_word_width > max_width:
+                 # For now, just add the long word as its own line and let it overflow.
+                 # A more complex solution would break the word.
+                 lines.append(word.strip())
+                 current_line = ''  # Reset current_line as the long word is handled
+
+     if current_line.strip():  # Add the last line
+         lines.append(current_line.strip())
+
+     return lines if lines else [text]  # Ensure at least the original text is returned if no wrapping happens
+
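+ # Illustrative usage (hypothetical values): for a ~270 px wide text area,
+ # wrap_text("Keep your elbows higher", cv2.FONT_HERSHEY_DUPLEX, 0.6, 1, 270)
+ # might return ["Keep your elbows", "higher"]; the exact split depends on rendered glyph widths.
+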
+ app = Flask(__name__, static_url_path='/static', static_folder='static')
+ CORS(app, resources={r"/*": {"origins": "*"}})
+
+ app.config['UPLOAD_FOLDER'] = 'static/uploads'
+ app.config['MAX_CONTENT_LENGTH'] = 16 * 1024 * 1024  # 16MB max file size
+
+ # Ensure upload directory exists
+ os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)
+
+ # Initialize YOLOv7 model
+ device = select_device('')
+ yolo_model = None  # Initialize as None
+ stride = None
+ imgsz = None
+
+ try:
+     yolo_model = attempt_load('yolov7-w6-pose.pt', map_location=device)
+     stride = int(yolo_model.stride.max())
+     imgsz = check_img_size(640, s=stride)
+     print("YOLOv7 model loaded successfully")
+ except Exception as e:
+     print(f"Error loading YOLOv7 model: {e}")
+     traceback.print_exc()
+     # Not raising here to allow the app to run if only MoveNet is used. The error will be caught if YOLOv7 is selected.
+
+ # YOLOv7 pose model expects 17 keypoints
+ kpt_shape = (17, 3)
+
+ # Load CNN model for bodybuilding pose classification
+ cnn_model_path = 'external/BodybuildingPoseClassifier/bodybuilding_pose_classifier.h5'
+ cnn_model = load_model(cnn_model_path)
+ cnn_class_labels = ['side_chest', 'front_double_biceps', 'back_double_biceps', 'front_lat_spread', 'back_lat_spread']
+
+ def predict_pose_cnn(img_path):
+     img = image.load_img(img_path, target_size=(150, 150))
+     img_array = image.img_to_array(img)
+     img_array = np.expand_dims(img_array, axis=0) / 255.0
+     predictions = cnn_model.predict(img_array)
+     predicted_class = np.argmax(predictions, axis=1)
+     confidence = float(np.max(predictions))
+     return cnn_class_labels[predicted_class[0]], confidence
+
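+ # Illustrative usage (hypothetical result): predict_pose_cnn('frame.jpg') might return
+ # ('front_double_biceps', 0.87), i.e. the highest-scoring label and its score.
+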
+ @app.route('/static/uploads/<path:filename>')
+ def serve_video(filename):
+     response = send_from_directory(app.config['UPLOAD_FOLDER'], filename, as_attachment=False)
+     # Ensure the correct content type, especially for Safari/iOS if issues arise
+     if filename.lower().endswith('.mp4'):
+         response.headers['Content-Type'] = 'video/mp4'
+     return response
+
+ @app.after_request
+ def after_request(response):
+     response.headers.add('Access-Control-Allow-Origin', '*')
+     response.headers.add('Access-Control-Allow-Headers', 'Content-Type,Authorization,X-Requested-With,Accept')
+     response.headers.add('Access-Control-Allow-Methods', 'GET,PUT,POST,DELETE,OPTIONS')
+     return response
+
+ def process_video_yolov7(video_path):  # Renamed from process_video
+     global yolo_model, imgsz, stride  # Ensure the globally loaded model is used
+     if yolo_model is None:
+         raise RuntimeError("YOLOv7 model failed to load. Cannot process video.")
+     try:
+         if not os.path.exists(video_path):
+             raise FileNotFoundError(f"Video file not found: {video_path}")
+
+         cap = cv2.VideoCapture(video_path)
+         if not cap.isOpened():
+             raise ValueError(f"Failed to open video file: {video_path}")
+
+         fps = int(cap.get(cv2.CAP_PROP_FPS))
+         width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
+         height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
+
+         print(f"Processing video: {width}x{height} @ {fps}fps")
+
+         # Create output video writer
+         output_path = os.path.join(app.config['UPLOAD_FOLDER'], 'output.mp4')
+         fourcc = cv2.VideoWriter_fourcc(*'avc1')
+         out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
+
+         frame_count = 0
+         while cap.isOpened():
+             ret, frame = cap.read()
+             if not ret:
+                 break
+
+             frame_count += 1
+             print(f"Processing frame {frame_count}")
+
+             # Prepare image
+             img = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
+             img = cv2.resize(img, (imgsz, imgsz))
+             img = img.transpose((2, 0, 1))  # HWC to CHW
+             img = np.ascontiguousarray(img)
+             img = torch.from_numpy(img).to(device)
+             img = img.float() / 255.0
+             if img.ndimension() == 3:
+                 img = img.unsqueeze(0)
+
+             # Inference
+             with torch.no_grad():
+                 pred = yolo_model(img)[0]
+                 pred = non_max_suppression_kpt(pred, conf_thres=0.25, iou_thres=0.45, nc=yolo_model.yaml['nc'], kpt_label=True)
+
+             # Draw results
+             output_frame = frame.copy()
+             poses_detected = False
+             for det in pred:
+                 if len(det):
+                     poses_detected = True
+                     det[:, :4] = scale_coords(img.shape[2:], det[:, :4], frame.shape).round()
+                     for row in det:
+                         xyxy = row[:4]
+                         conf = row[4]
+                         cls = row[5]
+                         kpts = row[6:]
+                         kpts = torch.tensor(kpts).view(kpt_shape)
+                         output_frame = plot_skeleton_kpts(output_frame, kpts, steps=3, orig_shape=output_frame.shape[:2])
+
+             if not poses_detected:
+                 print(f"No poses detected in frame {frame_count}")
+
+             out.write(output_frame)
+
+         cap.release()
+         out.release()
+
+         if frame_count == 0:
+             raise ValueError("No frames were processed from the video")
+
+         print(f"Video processing completed. Processed {frame_count} frames")
+         # Return a URL for the client, using the 'serve_video' endpoint
+         output_filename = 'output.mp4'
+         return url_for('serve_video', filename=output_filename, _external=False)
+     except Exception as e:
+         print('Error in process_video_yolov7:', e)
+         traceback.print_exc()
+         raise
+
+ def process_video_movenet(video_path, model_variant='lightning', pose_type='front_double_biceps'):
+     try:
+         print(f"[PROCESS_VIDEO_MOVENET] Called with video_path: {video_path}, model_variant: {model_variant}, pose_type: {pose_type}")
+         if not os.path.exists(video_path):
+             raise FileNotFoundError(f"Video file not found: {video_path}")
+
+         analyzer = MoveNetAnalyzer(model_name=model_variant)
+         cap = cv2.VideoCapture(video_path)
+         if not cap.isOpened():
+             raise ValueError(f"Failed to open video file: {video_path}")
+         fps = int(cap.get(cv2.CAP_PROP_FPS))
+         width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
+         height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
+
+         # Add panel width to total width
+         panel_width = 300
+         total_width = width + panel_width
+
+         print(f"Processing video with MoveNet ({model_variant}): {width}x{height} @ {fps}fps")
+         print(f"Output dimensions will be: {total_width}x{height}")
+         output_filename = f'output_movenet_{model_variant}.mp4'
+         output_path = os.path.join(app.config['UPLOAD_FOLDER'], output_filename)
+         print(f"Output path: {output_path}")
+
+         fourcc = cv2.VideoWriter_fourcc(*'avc1')
+         out = cv2.VideoWriter(output_path, fourcc, fps, (total_width, height))
+         if not out.isOpened():
+             raise ValueError(f"Failed to create output video writer at {output_path}")
+
+         frame_count = 0
+         current_pose = pose_type
+         segment_length = 4 * fps if fps > 0 else 120  # re-run the CNN roughly every 4 seconds
+         cnn_pose = None
+         last_valid_landmarks = None
+         landmarks_analysis = {'error': 'Processing not started'}  # Initialize landmarks_analysis
+
+         while cap.isOpened():
+             ret, frame = cap.read()
+             if not ret:
+                 break
+             frame_count += 1
+             if frame_count % 30 == 0:
+                 print(f"Processing frame {frame_count}")
+
+             # Process frame
+             processed_frame, current_landmarks_analysis, landmarks = analyzer.process_frame(frame, current_pose, last_valid_landmarks=last_valid_landmarks)
+             landmarks_analysis = current_landmarks_analysis  # Update with the latest analysis
+             if frame_count % 30 == 0:  # Log every 30 frames
+                 print(f"[MOVENET_DEBUG] Frame {frame_count} - landmarks_analysis: {landmarks_analysis}")
+             if landmarks:
+                 last_valid_landmarks = landmarks
+
+             # CNN prediction (every 4 seconds)
+             if (frame_count - 1) % segment_length == 0:
+                 temp_img_path = f'temp_frame_for_cnn_{frame_count}.jpg'  # Unique temp name
+                 cv2.imwrite(temp_img_path, frame)
+                 try:
+                     cnn_pose_pred, cnn_conf = predict_pose_cnn(temp_img_path)
+                     print(f"[CNN] Frame {frame_count}: Pose: {cnn_pose_pred}, Conf: {cnn_conf:.2f}")
+                     if cnn_conf >= 0.3:
+                         current_pose = cnn_pose_pred  # Update current_pose for the analyzer
+                 except Exception as e:
+                     print(f"[CNN] Error predicting pose on frame {frame_count}: {e}")
+                 finally:
+                     if os.path.exists(temp_img_path):
+                         os.remove(temp_img_path)
+
+             # Create side panel
+             panel = np.zeros((height, panel_width, 3), dtype=np.uint8)
+
+             # --- Dynamic Text Parameter Calculations ---
+             current_font = cv2.FONT_HERSHEY_DUPLEX
+
+             # Base font scale and reference video height for scaling
+             # Adjust base_font_scale_at_ref_height if text is generally too large or too small
+             base_font_scale_at_ref_height = 0.6
+             reference_height_for_font_scale = 640.0  # e.g., a common video height like 480p, 720p
+
+             # Calculate dynamic font_scale
+             font_scale = (height / reference_height_for_font_scale) * base_font_scale_at_ref_height
+             # Clamp font_scale to a min/max range to avoid extremes
+             font_scale = max(0.4, min(font_scale, 1.2))
+
+             # Calculate dynamic thickness
+             thickness = 1 if font_scale < 0.7 else 2
+
+             # Calculate dynamic line_height based on actual text height
+             # Using a sample string like "Ag" which has ascenders and descenders
+             (_, text_actual_height), _ = cv2.getTextSize("Ag", current_font, font_scale, thickness)
+             line_spacing_factor = 1.8  # Adjust for more or less space between lines
+             line_height = int(text_actual_height * line_spacing_factor)
+             line_height = max(line_height, 15)  # Ensure a minimum line height
+
+             # Initial y_offset for the first line of text
+             y_offset_panel = max(line_height, 20)  # Start considering top margin and text height
+             # --- End of Dynamic Text Parameter Calculations ---
+
+             display_model_name = f"Gladiator {model_variant.capitalize()}"
+             cv2.putText(panel, f"Model: {display_model_name}", (10, y_offset_panel), current_font, font_scale, (0, 255, 255), thickness, lineType=cv2.LINE_AA)
+             y_offset_panel += line_height
+
+             if 'error' not in landmarks_analysis:
+                 cv2.putText(panel, "Angles:", (10, y_offset_panel), current_font, font_scale, (255, 255, 255), thickness, lineType=cv2.LINE_AA)
+                 y_offset_panel += line_height
+                 for joint, angle in landmarks_analysis.get('angles', {}).items():
+                     text_to_display = f"{joint.capitalize()}: {angle:.1f} deg"
+                     cv2.putText(panel, text_to_display, (20, y_offset_panel), current_font, font_scale, (0, 255, 0), thickness, lineType=cv2.LINE_AA)
+                     y_offset_panel += line_height
+
+                 # Define available width for text within the panel, considering padding
+                 text_area_x_start = 20
+                 panel_padding = 10  # Padding from the right edge of the panel
+                 text_area_width = panel_width - text_area_x_start - panel_padding
+
+                 if landmarks_analysis.get('corrections'):
+                     y_offset_panel += int(line_height * 0.5)  # Smaller gap before section title
+                     cv2.putText(panel, "Corrections:", (10, y_offset_panel), current_font, font_scale, (255, 255, 255), thickness, lineType=cv2.LINE_AA)
+                     y_offset_panel += line_height
+                     for correction_text in landmarks_analysis.get('corrections', []):
+                         wrapped_lines = wrap_text(correction_text, current_font, font_scale, thickness, text_area_width)
+                         for line in wrapped_lines:
+                             cv2.putText(panel, line, (text_area_x_start, y_offset_panel), current_font, font_scale, (0, 0, 255), thickness, lineType=cv2.LINE_AA)
+                             y_offset_panel += line_height
+
+                 # Display notes if any
+                 if landmarks_analysis.get('notes'):
+                     y_offset_panel += int(line_height * 0.5)  # Smaller gap before section title
+                     cv2.putText(panel, "Notes:", (10, y_offset_panel), current_font, font_scale, (200, 200, 200), thickness, lineType=cv2.LINE_AA)
+                     y_offset_panel += line_height
+                     for note_text in landmarks_analysis.get('notes', []):
+                         wrapped_lines = wrap_text(note_text, current_font, font_scale, thickness, text_area_width)
+                         for line in wrapped_lines:
+                             cv2.putText(panel, line, (text_area_x_start, y_offset_panel), current_font, font_scale, (200, 200, 200), thickness, lineType=cv2.LINE_AA)
+                             y_offset_panel += line_height
+             else:
+                 cv2.putText(panel, "Error:", (10, y_offset_panel), current_font, font_scale, (255, 255, 255), thickness, lineType=cv2.LINE_AA)
+                 y_offset_panel += line_height
+                 # Also wrap the error message, since it can be long
+                 error_text = landmarks_analysis.get('error', 'Unknown error')
+                 text_area_x_start = 20  # Assuming the error message also starts at x=20
+                 panel_padding = 10
+                 text_area_width = panel_width - text_area_x_start - panel_padding
+                 wrapped_error_lines = wrap_text(error_text, current_font, font_scale, thickness, text_area_width)
+                 for line in wrapped_error_lines:
+                     cv2.putText(panel, line, (text_area_x_start, y_offset_panel), current_font, font_scale, (0, 0, 255), thickness, lineType=cv2.LINE_AA)
+                     y_offset_panel += line_height
+
+             combined_frame = np.hstack((processed_frame, panel))
+             out.write(combined_frame)
+
+         cap.release()
+         out.release()
+
+         if frame_count == 0:
+             raise ValueError("No frames were processed from the video by MoveNet")
+
+         print(f"MoveNet video processing completed. Processed {frame_count} frames. Output: {output_path}")
+         print(f"Output file size: {os.path.getsize(output_path)} bytes")
+
+         return url_for('serve_video', filename=output_filename, _external=False)
+     except Exception as e:
+         print(f'Error in process_video_movenet: {e}')
+         traceback.print_exc()
+         raise
+
+ def process_video_mediapipe(video_path):
+     try:
+         print(f"[PROCESS_VIDEO_MEDIAPIPE] Called with video_path: {video_path}")
+         if not os.path.exists(video_path):
+             raise FileNotFoundError(f"Video file not found: {video_path}")
+
+         analyzer = PoseAnalyzer()
+         cap = cv2.VideoCapture(video_path)
+         if not cap.isOpened():
+             raise ValueError(f"Failed to open video file: {video_path}")
+         fps = int(cap.get(cv2.CAP_PROP_FPS))
+         width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
+         height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
+
+         # Add panel width to total width
+         panel_width = 300
+         total_width = width + panel_width
+
+         print(f"Processing video with MediaPipe: {width}x{height} @ {fps}fps")
+         output_filename = 'output_mediapipe.mp4'
+         output_path = os.path.join(app.config['UPLOAD_FOLDER'], output_filename)
+         fourcc = cv2.VideoWriter_fourcc(*'avc1')
+         out = cv2.VideoWriter(output_path, fourcc, fps, (total_width, height))
+         if not out.isOpened():
+             raise ValueError(f"Failed to create output video writer at {output_path}")
+
+         frame_count = 0
+         current_pose = 'Uncertain'  # Initial pose for MediaPipe
+         segment_length = 4 * fps if fps > 0 else 120  # re-run the CNN roughly every 4 seconds
+         cnn_pose = None
+         last_valid_landmarks = None
+         analysis_results = {'error': 'Processing not started'}  # Initialize analysis_results
+
+         while cap.isOpened():
+             ret, frame = cap.read()
+             if not ret:
+                 break
+             frame_count += 1
+             if frame_count % 30 == 0:
+                 print(f"Processing frame {frame_count}")
+
+             # Process frame with MediaPipe
+             processed_frame, current_analysis_results, landmarks = analyzer.process_frame(frame, last_valid_landmarks=last_valid_landmarks)
+             analysis_results = current_analysis_results  # Update with the latest analysis
+             if landmarks:
+                 last_valid_landmarks = landmarks
+
+             # CNN prediction (every 4 seconds)
+             if (frame_count - 1) % segment_length == 0:
+                 temp_img_path = f'temp_frame_for_cnn_{frame_count}.jpg'  # Unique temp name
+                 cv2.imwrite(temp_img_path, frame)
+                 try:
+                     cnn_pose_pred, cnn_conf = predict_pose_cnn(temp_img_path)
+                     print(f"[CNN] Frame {frame_count}: Pose: {cnn_pose_pred}, Conf: {cnn_conf:.2f}")
+                     if cnn_conf >= 0.3:
+                         current_pose = cnn_pose_pred  # Update current_pose to be displayed
+                 except Exception as e:
+                     print(f"[CNN] Error predicting pose on frame {frame_count}: {e}")
+                 finally:
+                     if os.path.exists(temp_img_path):
+                         os.remove(temp_img_path)
+
+             # Create side panel
+             panel = np.zeros((height, panel_width, 3), dtype=np.uint8)
+
+             # --- Dynamic Text Parameter Calculations ---
+             current_font = cv2.FONT_HERSHEY_DUPLEX
+
+             # Base font scale and reference video height for scaling
+             # Adjust base_font_scale_at_ref_height if text is generally too large or too small
+             base_font_scale_at_ref_height = 0.6
+             reference_height_for_font_scale = 640.0  # e.g., a common video height like 480p, 720p
+
+             # Calculate dynamic font_scale
+             font_scale = (height / reference_height_for_font_scale) * base_font_scale_at_ref_height
+             # Clamp font_scale to a min/max range to avoid extremes
+             font_scale = max(0.4, min(font_scale, 1.2))
+
+             # Calculate dynamic thickness
+             thickness = 1 if font_scale < 0.7 else 2
+
+             # Calculate dynamic line_height based on actual text height
+             # Using a sample string like "Ag" which has ascenders and descenders
+             (_, text_actual_height), _ = cv2.getTextSize("Ag", current_font, font_scale, thickness)
+             line_spacing_factor = 1.8  # Adjust for more or less space between lines
+             line_height = int(text_actual_height * line_spacing_factor)
+             line_height = max(line_height, 15)  # Ensure a minimum line height
+
+             # Initial y_offset for the first line of text
+             y_offset_panel = max(line_height, 20)  # Start considering top margin and text height
+             # --- End of Dynamic Text Parameter Calculations ---
+
+             cv2.putText(panel, "Model: Gladiator SupaDot", (10, y_offset_panel), current_font, font_scale, (0, 255, 255), thickness, lineType=cv2.LINE_AA)
+             y_offset_panel += line_height
+             if frame_count % 30 == 0:  # Print every 30 frames to avoid flooding the console
+                 print(f"[MEDIAPIPE_PANEL] Frame {frame_count} - Current Pose for Panel: {current_pose}")
+             cv2.putText(panel, f"Pose: {current_pose}", (10, y_offset_panel), current_font, font_scale, (255, 0, 0), thickness, lineType=cv2.LINE_AA)
+             y_offset_panel += int(line_height * 1.5)
+
+             if 'error' not in analysis_results:
+                 cv2.putText(panel, "Angles:", (10, y_offset_panel), current_font, font_scale, (255, 255, 255), thickness, lineType=cv2.LINE_AA)
+                 y_offset_panel += line_height
+                 for joint, angle in analysis_results.get('angles', {}).items():
+                     text_to_display = f"{joint.capitalize()}: {angle:.1f} deg"
+                     cv2.putText(panel, text_to_display, (20, y_offset_panel), current_font, font_scale, (0, 255, 0), thickness, lineType=cv2.LINE_AA)
+                     y_offset_panel += line_height
+
+                 if analysis_results.get('corrections'):
+                     y_offset_panel += line_height
+                     cv2.putText(panel, "Corrections:", (10, y_offset_panel), current_font, font_scale, (255, 255, 255), thickness, lineType=cv2.LINE_AA)
+                     y_offset_panel += line_height
+                     for correction in analysis_results.get('corrections', []):
+                         cv2.putText(panel, f"• {correction}", (20, y_offset_panel), current_font, font_scale, (0, 0, 255), thickness, lineType=cv2.LINE_AA)
+                         y_offset_panel += line_height
+
+                 # Display notes if any
+                 if analysis_results.get('notes'):
+                     y_offset_panel += line_height
+                     cv2.putText(panel, "Notes:", (10, y_offset_panel), current_font, font_scale, (200, 200, 200), thickness, lineType=cv2.LINE_AA)  # Grey color for notes
+                     y_offset_panel += line_height
+                     for note in analysis_results.get('notes', []):
+                         cv2.putText(panel, f"• {note}", (20, y_offset_panel), current_font, font_scale, (200, 200, 200), thickness, lineType=cv2.LINE_AA)
+                         y_offset_panel += line_height
+             else:
+                 cv2.putText(panel, "Error:", (10, y_offset_panel), current_font, font_scale, (255, 255, 255), thickness, lineType=cv2.LINE_AA)
+                 y_offset_panel += line_height
+                 cv2.putText(panel, analysis_results.get('error', 'Unknown error'), (20, y_offset_panel), current_font, font_scale, (0, 0, 255), thickness, lineType=cv2.LINE_AA)
+
+             combined_frame = np.hstack((processed_frame, panel))  # Use processed_frame from the analyzer
+             out.write(combined_frame)
+
+         cap.release()
+         out.release()
+         if frame_count == 0:
+             raise ValueError("No frames were processed from the video by MediaPipe")
+         print(f"MediaPipe video processing completed. Processed {frame_count} frames. Output: {output_path}")
+         return url_for('serve_video', filename=output_filename, _external=False)
+     except Exception as e:
+         print(f'Error in process_video_mediapipe: {e}')
+         traceback.print_exc()
+         raise
+
+ @app.route('/')
+ def index():
+     return render_template('index.html')
+
+ @app.route('/upload', methods=['POST'])
+ def upload_file():
+     try:
+         if 'video' not in request.files:
+             print("[UPLOAD] No video file in request")
+             return jsonify({'error': 'No video file provided'}), 400
+
+         file = request.files['video']
+         if file.filename == '':
+             print("[UPLOAD] Empty filename")
+             return jsonify({'error': 'No selected file'}), 400
+
+         if file:
+             allowed_extensions = {'mp4', 'avi', 'mov', 'mkv'}
+             if '.' not in file.filename or file.filename.rsplit('.', 1)[1].lower() not in allowed_extensions:
+                 print(f"[UPLOAD] Invalid file format: {file.filename}")
+                 return jsonify({'error': 'Invalid file format. Allowed formats: mp4, avi, mov, mkv'}), 400
+
+             # Ensure the filename is properly sanitized
+             filename = secure_filename(file.filename)
+             print(f"[UPLOAD] Original filename: {file.filename}")
+             print(f"[UPLOAD] Sanitized filename: {filename}")
+
+             # Create a unique filename to prevent conflicts
+             base, ext = os.path.splitext(filename)
+             unique_filename = f"{base}_{int(time.time())}{ext}"
+             filepath = os.path.join(app.config['UPLOAD_FOLDER'], unique_filename)
+
+             print(f"[UPLOAD] Saving file to: {filepath}")
+             file.save(filepath)
+
+             if not os.path.exists(filepath):
+                 print(f"[UPLOAD] File not found after save: {filepath}")
+                 return jsonify({'error': 'Failed to save uploaded file'}), 500
+
+             print(f"[UPLOAD] File saved successfully. Size: {os.path.getsize(filepath)} bytes")
+
+             try:
+                 model_choice = request.form.get('model_choice', 'Gladiator SupaDot')
+                 print(f"[UPLOAD] Processing with model: {model_choice}")
+
+                 if model_choice == 'movenet':
+                     movenet_variant = request.form.get('movenet_variant', 'lightning')
+                     print(f"[UPLOAD] Using MoveNet variant: {movenet_variant}")
+                     output_path_url = process_video_movenet(filepath, model_variant=movenet_variant)
+                 else:
+                     output_path_url = process_video_mediapipe(filepath)
+
+                 print(f"[UPLOAD] Processing complete. Output URL: {output_path_url}")
+
+                 if not os.path.exists(os.path.join(app.config['UPLOAD_FOLDER'], os.path.basename(output_path_url))):
+                     print(f"[UPLOAD] Output file not found: {output_path_url}")
+                     return jsonify({'error': 'Output video file not found'}), 500
+
+                 return jsonify({
+                     'message': f'Video processed successfully with {model_choice}',
+                     'output_path': output_path_url
+                 })
+
+             except Exception as e:
+                 print(f"[UPLOAD] Error processing video: {str(e)}")
+                 traceback.print_exc()
+                 return jsonify({'error': f'Error processing video: {str(e)}'}), 500
+
+             finally:
+                 try:
+                     if os.path.exists(filepath):
+                         os.remove(filepath)
+                         print(f"[UPLOAD] Cleaned up input file: {filepath}")
+                 except Exception as e:
+                     print(f"[UPLOAD] Error cleaning up file: {str(e)}")
+
+     except Exception as e:
+         print(f"[UPLOAD] Unexpected error: {str(e)}")
+         traceback.print_exc()
+         return jsonify({'error': 'Internal server error'}), 500
+
+ if __name__ == '__main__':
+     # Ensure the port is 7860 and debug is False for HF Spaces deployment
+     app.run(host='0.0.0.0', port=7860, debug=False)
requirements.txt ADDED
@@ -0,0 +1,80 @@
+ absl-py==2.2.2
+ astunparse==1.6.3
+ attrs==25.3.0
+ blinker==1.9.0
+ certifi==2025.4.26
+ cffi==1.17.1
+ charset-normalizer==3.4.2
+ click==8.1.7
+ contourpy==1.2.1
+ cycler==0.12.1
+ ffmpeg-python==0.2.0
+ filelock==3.18.0
+ Flask==3.1.1
+ flask-cors==5.0.1
+ flatbuffers==25.2.10
+ fonttools==4.58.0
+ fsspec==2025.3.2
+ future==1.0.0
+ gast==0.6.0
+ google-pasta==0.2.0
+ gunicorn==22.0.0
+ grpcio==1.71.0
+ h5py==3.13.0
+ idna==3.10
+ itsdangerous==2.2.0
+ jax==0.4.30
+ jaxlib==0.4.30
+ Jinja2==3.1.6
+ keras==3.9.2
+ kiwisolver==1.4.7
+ libclang==18.1.1
+ Markdown==3.8
+ markdown-it-py==3.0.0
+ MarkupSafe==3.0.2
+ matplotlib==3.9.4
+ mdurl==0.1.2
+ mediapipe==0.10.21
+ ml_dtypes==0.5.1
+ mpmath==1.3.0
+ namex==0.0.9
+ networkx==3.2.1
+ ngrok==1.4.0
+ numpy==1.26.4
+ opencv-contrib-python==4.11.0.86
+ opencv-python==4.11.0.86
+ opt_einsum==3.4.0
+ optree==0.15.0
+ packaging==25.0
+ pandas==2.2.3
+ pillow==11.2.1
+ protobuf==4.25.7
+ pycparser==2.22
+ Pygments==2.19.1
+ pyparsing==3.2.3
+ python-dateutil==2.9.0.post0
+ pytz==2025.2
+ PyYAML==6.0.2
+ requests==2.32.3
+ rich==14.0.0
+ scipy==1.13.1
+ seaborn==0.13.2
+ sentencepiece==0.2.0
+ six==1.17.0
+ sounddevice==0.5.1
+ sympy==1.14.0
+ tensorboard==2.19.0
+ tensorboard-data-server==0.7.2
+ tensorflow==2.19.0
+ tensorflow-hub==0.16.1
+ tensorflow-io-gcs-filesystem==0.37.1
+ termcolor==3.1.0
+ tf_keras==2.19.0
+ torch==2.7.0
+ torchvision==0.22.0
+ tqdm==4.67.1
+ typing_extensions==4.13.2
+ tzdata==2025.2
+ urllib3==2.4.0
+ Werkzeug==3.1.3
+ wrapt==1.17.2