udayl's picture
updated Dockerfile
bfc166e
metadata
title: NotebookLM-Kokoro TTS Project
sdk: docker
app_file: gradio_app.py
pinned: true

NotebookLM-Kokoro TTS Project

This project uses Kokoro – a lightweight, open-weight TTS model with 82 million parameters – to create a Google NotebookLM style Text-to-Speech application.

Why Kokoro?

  • Non-Proprietary & Open-Source: Kokoro is best in its class as a non-proprietary model, giving you full flexibility to deploy in production environments or personal projects.
  • High Efficiency: Despite its lightweight architecture, Kokoro delivers comparable quality to larger models while being faster and more cost-efficient.
  • Benchmarks: According to benchmarks available on the TTS-Arena page, Kokoro outperforms many closed-source models, making it the ideal choice for open deployments.
  • Easy Integration: With simple pip and Homebrew installation for dependencies like espeak-ng, integration into Python projects is straightforward.

Setup Instructions

Environment Setup

This project uses the uv Python package manager. Follow these steps:

  1. Install uv:

    pip install uv
    
  2. Create a new environment named notebooklm:

    uv venv
    
  3. Activate the environment:

    source .venv/bin/activate
    
  4. Install Python dependencies:

    pip install "kokoro>=0.9.2" soundfile torch
    
  5. Install espeak-ng (Mac users):

    brew install espeak-ng
    

Running the Application

Once the environment is set up, run the main TTS script as follows:

python notebook_lm_kokoro.py

This will process the transcript text using Kokoro and output audio segments as WAV files.

Conclusion

Kokoro’s combination of efficiency, quality, and open-access makes it the best non-proprietary TTS model available, as confirmed by recent benchmarks. Enjoy exploring and extending this project!