import os

import docx
import gradio as gr
import openai
import pandas as pd
import pdfplumber
import pytesseract
from PIL import Image

openai.api_key = os.getenv("OPENAI_API_KEY")


def extract_text(path):
    """Extract text from a document or image file, dispatching on extension."""
    ext = path.split(".")[-1].lower()
    if ext == "pdf":
        with pdfplumber.open(path) as pdf:
            return "\n".join(
                page.extract_text() for page in pdf.pages if page.extract_text()
            )
    elif ext == "docx":
        # python-docx reads .docx only; legacy binary .doc files are not supported.
        doc = docx.Document(path)
        return "\n".join(p.text for p in doc.paragraphs)
    elif ext in ("xls", "xlsx", "csv"):
        df = pd.read_csv(path) if ext == "csv" else pd.read_excel(path)
        return df.to_string()
    elif ext in ("jpg", "jpeg", "png"):
        # OCR the image with Tesseract.
        text = pytesseract.image_to_string(Image.open(path))
        return text or "❌ No text found in image."
    else:
        return "❌ Unsupported file format."


def transcribe_audio(audio_path):
    """Transcribe an audio file with Whisper (legacy openai<1.0 API)."""
    try:
        with open(audio_path, "rb") as f:
            transcript = openai.Audio.transcribe("whisper-1", f)
        return transcript["text"]
    except Exception as e:
        return f"❌ Transcription error: {e}"


def generate_response(messages):
    """Send the running conversation to the chat model (legacy openai<1.0 API)."""
    try:
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=messages,
        )
        return response.choices[0].message["content"]
    except Exception as e:
        return f"❌ Error: {e}"


chat_history = []


def chat(user_message, file=None, image=None, mic_audio=None):
    """Resolve the user's input (voice, file, image, or text) and get a reply."""
    if mic_audio:
        user_message = transcribe_audio(mic_audio)
    elif file:
        user_message = extract_text(file.name)
    elif image:
        # gr.Image(type="filepath") hands the handler a path string,
        # not a file object, so pass it to extract_text() directly.
        user_message = extract_text(image)
    chat_history.append({"role": "user", "content": user_message})
    bot_response = generate_response(chat_history)
    chat_history.append({"role": "assistant", "content": bot_response})
    return user_message, bot_response


with gr.Blocks() as demo:
    gr.Markdown("### 🎧 **Neobot - Always Listening**")
    chatbot = gr.Chatbot(height=300)
    with gr.Row():
        txt = gr.Textbox(placeholder="Type here or use mic...", scale=4)
        send_btn = gr.Button("🚀", scale=1)
    with gr.Row():
        mic_audio = gr.Audio(type="filepath", label="🎤 Record Voice", interactive=True)
        upload_file = gr.File(label="📎 Upload File")
        upload_img = gr.Image(type="filepath", label="🖼️ Upload Image")

    def handle_input(history, message, file, image, audio):
        # Take the current history as an event input and return the updated
        # list, rather than reading chatbot.value, which is only the initial
        # value and does not track state between events.
        user_message, reply = chat(message, file, image, audio)
        history = (history or []) + [[user_message, reply]]
        # Second output clears the textbox after sending.
        return history, ""

    send_btn.click(
        handle_input,
        inputs=[chatbot, txt, upload_file, upload_img, mic_audio],
        outputs=[chatbot, txt],
    )
    txt.submit(
        handle_input,
        inputs=[chatbot, txt, upload_file, upload_img, mic_audio],
        outputs=[chatbot, txt],
    )

demo.launch()