freddyaboulton (HF Staff) committed (verified)
Commit d049777 · 1 parent: ceec7bb

Update README.md
Files changed (1): README.md (+165 −9)
```diff
@@ -1,14 +1,170 @@
 ---
-title: Gradio Webrtc
-emoji: 📈
-colorFrom: pink
-colorTo: green
 sdk: gradio
-sdk_version: 5.1.0
-app_file: app.py
 pinned: false
-license: mit
-short_description: Stream Audio/Video in real time with WebRTC
 ---
-
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
```
---
tags: [gradio-custom-component, Video, Audio, streaming, webrtc, realtime]
title: gradio_webrtc
short_description: Stream audio/video in realtime with webrtc
colorFrom: blue
colorTo: yellow
sdk: gradio
pinned: false
---

<h1 style='text-align: center; margin-bottom: 1rem'> Gradio WebRTC ⚡️ </h1>

<div style="display: flex; flex-direction: row; justify-content: center">
<img style="display: block; padding-right: 5px; height: 20px;" alt="Static Badge" src="https://img.shields.io/badge/version%20-%200.0.5%20-%20orange">
<a href="https://github.com/freddyaboulton/gradio-webrtc" target="_blank"><img alt="Static Badge" src="https://img.shields.io/badge/github-white?logo=github&logoColor=black"></a>
</div>

<h3 style='text-align: center'>
Stream video and audio in real time with Gradio using WebRTC.
</h3>
## Installation

```bash
pip install gradio_webrtc
```

## Examples

1. [Object Detection from Webcam with YOLOv10](https://huggingface.co/spaces/freddyaboulton/webrtc-yolov10n) 📷
2. [Streaming Object Detection from Video with RT-DETR](https://huggingface.co/spaces/freddyaboulton/rt-detr-object-detection-webrtc) 🎥
3. [Text-to-Speech](https://huggingface.co/spaces/freddyaboulton/parler-tts-streaming-webrtc) 🗣️

## Usage

The WebRTC component supports the following three use cases:

1. Streaming video from the user's webcam to the server and back
2. Streaming video from the server to the client
3. Streaming audio from the server to the client

Streaming audio from the client to the server and back (conversational AI) is not supported yet.
## Streaming Video from the User Webcam to the Server and Back

```python
import gradio as gr
from gradio_webrtc import WebRTC


def detection(image, conf_threshold=0.3):
    # ... your detection code here ...
    return image


with gr.Blocks() as demo:
    image = WebRTC(label="Stream", mode="send-receive", modality="video")
    conf_threshold = gr.Slider(
        label="Confidence Threshold",
        minimum=0.0,
        maximum=1.0,
        step=0.05,
        value=0.30,
    )
    image.stream(
        fn=detection,
        inputs=[image, conf_threshold],
        outputs=[image],
        time_limit=10,
    )

if __name__ == "__main__":
    demo.launch()
```

* Set the `mode` parameter to `"send-receive"` and `modality` to `"video"`.
* The `stream` event's `fn` parameter is a function that receives the next frame from the webcam as a **numpy array** and returns the processed frame, also as a **numpy array**.
* Numpy arrays are in `(height, width, 3)` format, with color channels in RGB order.
* The `inputs` parameter should be a list whose first element is the WebRTC component. The only output allowed is the WebRTC component.
* The `time_limit` parameter is the maximum time in seconds the video stream will run; when the limit is reached, the stream stops.
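Because the handler simply maps one `(height, width, 3)` RGB array to another, it can be sanity-checked without a webcam or a model. A minimal sketch, using a horizontal flip as a stand-in for real detection code (the flip is purely illustrative, not part of `gradio_webrtc`):

```python
import numpy as np


def detection(image, conf_threshold=0.3):
    # Stand-in for a real model: frames arrive as (height, width, 3)
    # RGB uint8 arrays and must be returned in the same format.
    return image[:, ::-1, :]  # mirror the frame horizontally


frame = np.zeros((480, 640, 3), dtype=np.uint8)
processed = detection(frame)
assert processed.shape == (480, 640, 3)
```

Any replacement function must preserve this array contract, or the component cannot encode the returned frame.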
## Streaming Video from the Server to the Client

```python
import cv2
import gradio as gr
from gradio_webrtc import WebRTC


def generation():
    url = "https://download.tsi.telecom-paristech.fr/gpac/dataset/dash/uhd/mux_sources/hevcds_720p30_2M.mp4"
    cap = cv2.VideoCapture(url)
    iterating = True
    while iterating:
        iterating, frame = cap.read()
        yield frame


with gr.Blocks() as demo:
    output_video = WebRTC(label="Video Stream", mode="receive", modality="video")
    button = gr.Button("Start", variant="primary")
    output_video.stream(
        fn=generation,
        inputs=None,
        outputs=[output_video],
        trigger=button.click,
    )

if __name__ == "__main__":
    demo.launch()
```

* Set the `mode` parameter to `"receive"` and `modality` to `"video"`.
* The `stream` event's `fn` parameter is a generator function that yields the next frame from the video as a **numpy array**.
* The only output allowed is the WebRTC component.
* The `trigger` parameter is the Gradio event that starts the WebRTC connection; in this case, the button's click event.
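The generator contract can be exercised without a video file or network source. A hedged sketch that yields synthetic frames in the same `(height, width, 3)` uint8 format (the gradient pattern and dimensions are illustrative, not from the source):

```python
import numpy as np


def generation(num_frames=30, height=240, width=320):
    # Yield synthetic RGB frames; a real app would read them from a
    # video source (e.g. cv2.VideoCapture) instead.
    for i in range(num_frames):
        yield np.full((height, width, 3), (i * 8) % 256, dtype=np.uint8)


frames = list(generation(num_frames=3))
assert len(frames) == 3 and frames[0].shape == (240, 320, 3)
```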
## Streaming Audio from the Server to the Client

```python
import gradio as gr
import numpy as np
from gradio_webrtc import WebRTC
from pydub import AudioSegment


def generation(num_steps):
    for _ in range(num_steps):
        segment = AudioSegment.from_file("cantina.wav")  # path to a local audio file
        yield (segment.frame_rate, np.array(segment.get_array_of_samples()).reshape(1, -1))


with gr.Blocks() as demo:
    audio = WebRTC(label="Stream", mode="receive", modality="audio")
    num_steps = gr.Slider(
        label="Number of Steps",
        minimum=1,
        maximum=10,
        step=1,
        value=5,
    )
    button = gr.Button("Generate")

    audio.stream(
        fn=generation,
        inputs=[num_steps],
        outputs=[audio],
        trigger=button.click,
    )

if __name__ == "__main__":
    demo.launch()
```

* Set the `mode` parameter to `"receive"` and `modality` to `"audio"`.
* The `stream` event's `fn` parameter is a generator function that yields the next audio chunk as a tuple of `(frame_rate, audio_samples)`.
* The numpy array should be of shape `(1, num_samples)`.
* The `outputs` parameter should be a list with the WebRTC component as the only element.
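The `(frame_rate, audio_samples)` contract can be checked with synthetic audio instead of a file on disk. A sketch assuming 16-bit samples at a 24 kHz frame rate (both values illustrative, not requirements stated by the source):

```python
import numpy as np


def generation(num_steps=5, frame_rate=24000):
    # One second of a 440 Hz sine tone per step, shaped (1, num_samples)
    # as the WebRTC component expects.
    t = np.linspace(0, 1, frame_rate, endpoint=False)
    samples = (np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16).reshape(1, -1)
    for _ in range(num_steps):
        yield (frame_rate, samples)


rate, chunk = next(generation(num_steps=1))
assert rate == 24000 and chunk.shape == (1, 24000)
```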
## Deployment

When deploying in a cloud environment (like Hugging Face Spaces, EC2, etc.), you need to set up a TURN server to relay the WebRTC traffic. The easiest way to do this is to use a service like Twilio.

```python
import os

import gradio as gr
from gradio_webrtc import WebRTC
from twilio.rest import Client

account_sid = os.environ.get("TWILIO_ACCOUNT_SID")
auth_token = os.environ.get("TWILIO_AUTH_TOKEN")

client = Client(account_sid, auth_token)

token = client.tokens.create()

rtc_configuration = {
    "iceServers": token.ice_servers,
    "iceTransportPolicy": "relay",
}

with gr.Blocks() as demo:
    ...
    rtc = WebRTC(rtc_configuration=rtc_configuration, ...)
    ...
```
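For local development, where the browser and server can usually reach each other directly, a relay may not be needed at all. A minimal sketch of a STUN-only configuration, using Google's public STUN server purely as an illustration (this fallback is an assumption, not something the source recommends):

```python
# STUN-only fallback for local testing: no TURN relay, so it will not
# work behind restrictive NATs the way the Twilio setup above does.
rtc_configuration = {
    "iceServers": [{"urls": "stun:stun.l.google.com:19302"}],
}
```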