Upload folder using huggingface_hub
- scratch.py +24 -0
- script.md +11 -1
scratch.py
ADDED
@@ -0,0 +1,24 @@
+from fastrtc import Stream, ReplyOnPause
+import numpy as np
+
+
+def echo(audio: tuple[int, np.ndarray]):
+    # The function will be passed the audio until the user pauses
+    # Implement any iterator that yields audio
+    # See "LLM Voice Chat" for a more complete example
+    yield audio
+
+
+stream = Stream(
+    handler=ReplyOnPause(echo),
+    modality="audio",
+    mode="send-receive",
+    ui_args={
+        "icon": "https://upload.wikimedia.org/wikipedia/commons/thumb/0/01/Portrait-of-a-woman.jpg/960px-Portrait-of-a-woman.jpg?20200608215745",
+        "pulse_color": "rgb(35, 157, 225)",
+        "icon_button_color": "rgb(35, 157, 225)",
+        "title": "Gemini Audio Video Chat",
+    },
+)
+
+stream.ui.launch()
script.md
CHANGED
@@ -1,5 +1,15 @@
 Hi, I'm Freddy and I want to give a tour of FastRTC - the real-time communication library for Python.
-
+
+Why is this important? In the last few months, we've seen many advances in real-time speech and vision models coming from closed-source models, open-source models, and API providers.
+
+Despite these innovations, it's still difficult to build real-time AI applications that stream audio and video, especially in Python. This is because:
+
+- ML engineers may not have experience with the technologies needed to build real-time applications, such as WebRTC or WebSockets.
+- Implementing algorithms for voice detection and turn taking is tricky!
+- Best practices are scattered across various sources, and even code assistant tools like Cursor and Copilot struggle to write Python code that supports real-time audio/video applications. I learned that the hard way!
+
+All this means that if you want to take advantage of the latest advances in AI, you have to spend a lot of time figuring out how to do real-time streaming.
+`FastRTC` solves this problem by automatically turning any Python function into a real-time audio and video stream over WebRTC or WebSockets with little additional code or overhead. Let's see how it works.
 
 Let's start with the basics - echoing audio.
 
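
Note on the script's claim that voice detection and turn taking are tricky: the sketch below shows roughly what a hand-rolled pause detector has to do. It is a naive energy-threshold approach written only for illustration, and it is not how FastRTC or `ReplyOnPause` detects pauses. The function names (`frame_rms`, `find_pause`) and every default (20 ms frames, 0.01 RMS threshold, 200 ms minimum pause) are assumptions made up for this example.

```python
import math


def frame_rms(frame):
    """Root-mean-square energy of one frame of float samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def find_pause(samples, sample_rate, frame_ms=20, threshold=0.01, min_pause_ms=200):
    """Return the sample index where the first pause after speech begins, or None.

    A pause is a run of quiet frames lasting at least min_pause_ms that
    follows at least one loud frame. All defaults are illustrative
    assumptions, not values taken from any library.
    """
    frame_len = max(1, sample_rate * frame_ms // 1000)
    frames_needed = max(1, min_pause_ms // frame_ms)
    heard_speech = False
    quiet_run = 0
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        if frame_rms(samples[start:start + frame_len]) >= threshold:
            # Loud frame: the user is (still) speaking, so reset the quiet run.
            heard_speech = True
            quiet_run = 0
        elif heard_speech:
            quiet_run += 1
            if quiet_run >= frames_needed:
                # Report the start of the first quiet frame in this run.
                return start - (quiet_run - 1) * frame_len
    return None
```

Even this toy version has to pick a frame size, a threshold, and a minimum pause length, and a fixed threshold will misfire on background noise or quiet speakers. That tuning burden is the kind of work the script says a handler such as `ReplyOnPause(echo)` takes off the developer's hands.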