---
title: JULIA
emoji: 🔥
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.28.3
app_file: app.py
pinned: false
short_description: Voice Chat with JULIA
---

# JULIA⚡

A personal voice assistant inspired by Tony Stark's JARVIS, built with Gradio, `edge_tts`, the Hugging Face Inference API, and streaming speech-to-text (STT) via `streaming_stt_nemo`.

## Features

- Voice and text input support
- Text-to-Speech response
- Multiple model support via Hugging Face Inference API
- Friendly and concise responses from a virtual assistant named Julia

## How It Works

- **Speech-to-Text (STT):** Uses the `streaming_stt_nemo` library to transcribe audio inputs.
- **Text Generation:** Sends the transcribed or typed prompt to a model served through the Hugging Face Inference API to generate a response.
- **Text-to-Speech (TTS):** Uses `edge_tts` to convert the generated response into audio.
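
The three stages above can be sketched as a small pipeline. This is only an illustration of the flow, not the code in `app.py`: the `VoicePipeline` class and its field names are hypothetical, each stage is passed in as a plain callable rather than calling `streaming_stt_nemo`, the Inference API, or `edge_tts` directly, and the choice to prefer typed text over audio is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class VoicePipeline:
    """Hypothetical wrapper that chains STT -> text generation -> TTS."""

    transcribe: Callable[[str], str]   # audio file path -> transcript
    generate: Callable[[str], str]     # prompt -> reply text
    synthesize: Callable[[str], str]   # reply text -> path of generated audio

    def run(self, audio_path: Optional[str] = None, text: str = "") -> Tuple[str, str]:
        # Assumption: typed text wins; audio is transcribed only when no text is given.
        prompt = text.strip()
        if not prompt and audio_path:
            prompt = self.transcribe(audio_path).strip()
        if not prompt:
            raise ValueError("provide either text or an audio path")
        reply = self.generate(prompt)
        return reply, self.synthesize(reply)


# Stub stages stand in for the real libraries so the flow can be tried offline.
pipeline = VoicePipeline(
    transcribe=lambda path: "what time is it",
    generate=lambda prompt: f"You asked: {prompt}",
    synthesize=lambda reply: "reply.mp3",
)
reply_text, reply_audio = pipeline.run(audio_path="question.wav")
```

Keeping the stages injectable means each one can be swapped for the real library call (or for a stub in tests) without touching the control flow.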

## Interface

The Gradio interface includes:

- A dropdown to select the model.
- An audio input for voice commands.
- A text input for typed commands.
- A send button to submit the input.
- A text output to display the assistant's response.
- An audio output to play the assistant's response.
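
A layout with these six components might be wired up roughly as follows. This is a sketch against the Gradio 4.x Blocks API, not a copy of `app.py`: the labels, the `respond` placeholder handler, and the `build_demo` function name are all assumptions, and the handler only echoes its input instead of calling any model.

```python
MODEL_CHOICES = ["Mixtral 8x7B", "Llama 3 8B", "Mistral 7B v0.3", "Phi 3 mini"]


def respond(model, audio_path, text):
    """Placeholder handler: returns (reply text, reply audio path or None)."""
    prompt = (text or "").strip()
    if not prompt and audio_path:
        # A real handler would transcribe the recording here.
        prompt = f"(audio: {audio_path})"
    return f"[{model}] {prompt}", None


def build_demo():
    # Imported here so respond() can be exercised without Gradio installed.
    import gradio as gr

    with gr.Blocks(title="JULIA") as demo:
        model = gr.Dropdown(choices=MODEL_CHOICES, value=MODEL_CHOICES[0], label="Model")
        audio_in = gr.Audio(sources=["microphone"], type="filepath", label="Voice command")
        text_in = gr.Textbox(label="Typed command")
        send = gr.Button("Send")
        text_out = gr.Textbox(label="Response", interactive=False)
        audio_out = gr.Audio(label="Spoken response", autoplay=True)
        # One click handler feeds all three inputs and fills both outputs.
        send.click(respond, inputs=[model, audio_in, text_in], outputs=[text_out, audio_out])
    return demo
```

Calling `build_demo().launch()` would start the app locally, assuming Gradio is installed.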

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Acknowledgments

Special thanks to the developers of Gradio, Hugging Face, edge_tts, and Nemo for their amazing libraries.

### Model Descriptions

#### 1. **Mixtral 8x7B**

- **Description:** Mixtral 8x7B is a sparse mixture-of-experts large language model developed by Mistral AI. It is designed to handle a broad range of natural language understanding and generation tasks.
- **Strengths:**
  - Excels in general knowledge and conversational tasks.
  - Capable of producing detailed and contextually relevant responses.
- **Use Cases:** Suitable for detailed Q&A, storytelling, and providing comprehensive explanations.

#### 2. **Llama 3 8B**

- **Description:** Llama 3 8B is the 8-billion-parameter model in Meta's Llama 3 family. It focuses on generating human-like text from the provided prompts.
- **Strengths:**
  - Highly optimized for generating coherent and context-aware text.
  - Efficient in understanding and maintaining conversation flow.
- **Use Cases:** Ideal for chatbots, creative writing, and interactive dialogues.

#### 3. **Mistral 7B v0.3**

- **Description:** Mistral 7B v0.3 is a 7-billion-parameter language model developed by Mistral AI, designed to perform well at both understanding and generating text across various domains.
- **Strengths:**
  - High performance in both technical and casual conversational contexts.
  - Robust in handling diverse topics and maintaining context over longer interactions.
- **Use Cases:** Best suited for customer support, technical assistance, and in-depth discussions.

#### 4. **Phi 3 mini**

- **Description:** Phi 3 mini, developed by Microsoft, is a compact yet efficient language model optimized for fast and responsive text generation.
- **Strengths:**
  - Lightweight and quick to respond.
  - Maintains a good balance between performance and computational efficiency.
- **Use Cases:** Perfect for real-time applications, quick Q&A, and use cases where response time is critical.

### Available Models

This project supports multiple language models to cater to different needs. Here are the details of each model:

1. **Mixtral 8x7B**

   - **Description:** A sparse mixture-of-experts language model by Mistral AI, designed for a wide range of natural language tasks.
   - **Strengths:** Excels in general knowledge, conversational tasks, and detailed responses.
   - **Use Cases:** Q&A, storytelling, detailed explanations.

2. **Llama 3 8B**

   - **Description:** The 8-billion-parameter model in Meta's Llama 3 family, focused on generating human-like text.
   - **Strengths:** Highly coherent text generation, maintains conversation flow.
   - **Use Cases:** Chatbots, creative writing, interactive dialogues.

3. **Mistral 7B v0.3**

   - **Description:** A 7-billion-parameter model by Mistral AI, capable of understanding and generating text across various domains.
   - **Strengths:** High performance in technical and casual contexts, robust over longer interactions.
   - **Use Cases:** Customer support, technical assistance, in-depth discussions.

4. **Phi 3 mini**
   - **Description:** A compact model by Microsoft, optimized for fast and responsive text generation.
   - **Strengths:** Lightweight, quick response times, efficient.
   - **Use Cases:** Real-time applications, quick Q&A, scenarios requiring fast responses.

You can select any of these models based on your specific needs from the dropdown menu in the interface.
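
One way the dropdown labels could map to Hugging Face Hub repositories is shown below. The repository IDs are real instruct-tuned checkpoints on the Hub, but whether this project uses exactly these variants is an assumption, as are the `MODEL_IDS` table, the `DEFAULT_MODEL` choice, and the `resolve_model` helper.

```python
# Assumed mapping from dropdown label to Hub repository ID (not confirmed by app.py).
MODEL_IDS = {
    "Mixtral 8x7B": "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "Llama 3 8B": "meta-llama/Meta-Llama-3-8B-Instruct",
    "Mistral 7B v0.3": "mistralai/Mistral-7B-Instruct-v0.3",
    "Phi 3 mini": "microsoft/Phi-3-mini-4k-instruct",
}

DEFAULT_MODEL = "Mixtral 8x7B"  # hypothetical default


def resolve_model(display_name: str) -> str:
    """Return the repository ID for a dropdown label, falling back to the default."""
    return MODEL_IDS.get(display_name, MODEL_IDS[DEFAULT_MODEL])
```

Falling back to a default rather than raising keeps the assistant responsive if the dropdown ever sends an unexpected label.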