Running Whisper ASR on Android Phone/Tablet with Termux

Community Article Published August 27, 2025

This article covers automatic speech recognition (ASR) on a Samsung Android tablet using the Whisper model from OpenAI. The model accepts an audio file: you can record one through the microphone and then let the ASR model transcribe it. The same process works on any Android phone.

This guide shows how to install Termux, build Whisper (ggml / whisper.cpp), record audio, and transcribe it — all locally on your Android device.

Install Termux
Install Termux from F-Droid (not Play Store): 🔗 https://f-droid.org/packages/com.termux/
Open Termux once to finish setup.

Install dependencies & build Whisper

Update & basic tools

pkg update -y && pkg upgrade -y
pkg install -y git cmake clang make ffmpeg curl

Clone whisper.cpp

cd ~
git clone --depth 1 https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp

Download a model (base.en is light, small.en/medium.en/large-v2 are more accurate)

bash ./models/download-ggml-model.sh base.en

Build without OpenMP (stable on Termux)

cmake -S . -B build -DGGML_OPENMP=OFF
cmake --build build -j"$(nproc)"
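Before moving on, it can help to confirm that the build produced the binary and that the model file was downloaded. The sketch below is one way to do that; the function name `check_setup` is hypothetical, and the paths assume the default layout used in this guide (`build/bin/whisper-cli` and `models/ggml-<model>.bin`).

```shell
# Hypothetical helper: verify the whisper.cpp binary and a model file exist.
# Usage: check_setup <whisper.cpp directory> <model name, e.g. base.en>
check_setup() {
    local dir="$1" model="$2"
    # The CLI must exist and be executable.
    if [ ! -x "$dir/build/bin/whisper-cli" ]; then
        echo "missing binary: $dir/build/bin/whisper-cli" >&2
        return 1
    fi
    # The model file must exist and be non-empty.
    if [ ! -s "$dir/models/ggml-$model.bin" ]; then
        echo "missing model: $dir/models/ggml-$model.bin" >&2
        return 1
    fi
    echo "ok"
}
```

For example, `check_setup ~/whisper.cpp base.en` prints `ok` when both files are in place and reports the missing path otherwise.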

Download a WAV file & test transcription

Download a test WAV

curl -L -o demo.wav https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/1.wav

Transcribe with whisper.cpp

./build/bin/whisper-cli -m models/ggml-base.en.bin -f demo.wav -l en -otxt -of demo
cat demo.txt

Install Termux:API for microphone recording
Install Termux:API app from F-Droid: 🔗 https://f-droid.org/packages/com.termux.api/
Give it Microphone permission in Android Settings.
Install the Termux package:

pkg install -y termux-api

Record audio with microphone

Start recording (in one Termux session)

termux-microphone-record -f mic_raw.wav -l 60

👉 Records up to 60 seconds. (Change -l 60 for duration in seconds.)

Stop recording (in another Termux session)

termux-microphone-record -q
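The start and stop steps can also be wrapped in a single function that records for a fixed duration. This is only a sketch: the name `record_for` is hypothetical, it uses just the `-f`, `-l`, and `-q` options shown above, and it assumes the recorder runs in the background so the shell has to wait before stopping it.

```shell
# Hypothetical helper: record for a fixed number of seconds, then stop.
# Usage: record_for <seconds> [output file]
record_for() {
    local secs="$1" out="${2:-mic_raw.wav}"
    # Reject anything that is not a whole number of seconds.
    case "$secs" in
        ''|*[!0-9]*) echo "usage: record_for SECONDS [FILE]" >&2; return 2 ;;
    esac
    if [ "$secs" -le 0 ]; then
        echo "duration must be greater than 0" >&2
        return 2
    fi
    # Start recording with a time limit (assumed to return immediately).
    termux-microphone-record -f "$out" -l "$secs" || return 1
    sleep "$secs"
    # Quit explicitly in case the recorder is still running.
    termux-microphone-record -q
}
```

Calling `record_for 10 mic_raw.wav` would then record roughly ten seconds into `mic_raw.wav`.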

Transcribe your microphone audio

Convert to 16 kHz mono (Whisper format)

ffmpeg -y -loglevel error -i mic_raw.wav -ar 16000 -ac 1 mic16.wav
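To double-check that the converted file really is 16 kHz mono, the WAV header can be read directly. The sketch below assumes a canonical header where the `fmt ` chunk comes first, so the channel count sits at byte offset 22 and the sample rate at offset 24; the function names `wav_format` and `is_whisper_ready` are hypothetical.

```shell
# Print "<sample_rate> <channels>" from a canonical WAV header.
# Assumes the "fmt " chunk directly follows "RIFF....WAVE".
wav_format() {
    # Read 6 bytes from offset 22 as unsigned decimals:
    # 2 bytes of channel count, then 4 bytes of sample rate (little-endian).
    set -- $(od -An -t u1 -j 22 -N 6 "$1" 2>/dev/null)
    [ "$#" -eq 6 ] || return 1
    echo "$(( $3 + $4 * 256 + $5 * 65536 + $6 * 16777216 )) $(( $1 + $2 * 256 ))"
}

# Succeeds only when the file is 16000 Hz with a single channel.
is_whisper_ready() {
    [ "$(wav_format "$1")" = "16000 1" ]
}
```

For example, `is_whisper_ready mic16.wav && echo "ready"` prints `ready` only when the header matches the 16 kHz mono format.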

Transcribe

./build/bin/whisper-cli -m models/ggml-base.en.bin -f mic16.wav -l en -otxt -of mic
cat mic.txt
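The convert and transcribe steps can be combined into one reusable function. The following is a minimal sketch, not part of whisper.cpp: the names `output_prefix` and `transcribe` and the variables `WHISPER_DIR` and `MODEL` are hypothetical, and it assumes the repository was cloned to `~/whisper.cpp` as in this guide.

```shell
# Assumed locations; override by setting these variables before sourcing.
WHISPER_DIR="${WHISPER_DIR:-$HOME/whisper.cpp}"
MODEL="${MODEL:-base.en}"

# Derive an output prefix from the input name: "notes/mic_raw.wav" -> "mic_raw".
output_prefix() {
    local name
    name="$(basename "$1")"
    printf '%s\n' "${name%.*}"
}

# Convert a recording to 16 kHz mono, transcribe it, and print the text.
transcribe() {
    local input="$1" prefix
    prefix="$(output_prefix "$input")"
    ffmpeg -y -loglevel error -i "$input" -ar 16000 -ac 1 "${prefix}_16k.wav" || return 1
    "$WHISPER_DIR/build/bin/whisper-cli" \
        -m "$WHISPER_DIR/models/ggml-$MODEL.bin" \
        -f "${prefix}_16k.wav" -l en -otxt -of "$prefix" || return 1
    cat "${prefix}.txt"
}
```

After sourcing the file, `transcribe mic_raw.wav` would write `mic_raw_16k.wav` and `mic_raw.txt` in the current directory and print the transcript.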

Optional: Upgrade to larger Whisper models

If your phone or tablet has enough RAM (2–4 GB free) you can run larger models for better accuracy.

Small English-only (better than base.en, still fast)

bash ./models/download-ggml-model.sh small.en

Medium English-only (769M parameters, ~1.5 GB download, high accuracy)

bash ./models/download-ggml-model.sh medium.en

Large-v2 multilingual (~1.5B parameters, ~3 GB download, best accuracy, slower)

bash ./models/download-ggml-model.sh large-v2

Use them the same way by changing the -m option. Example with small.en:

./build/bin/whisper-cli -m models/ggml-small.en.bin -f mic16.wav -l en -otxt -of mic_small
cat mic_small.txt

For medium:

./build/bin/whisper-cli -m models/ggml-medium.en.bin -f mic16.wav -l en -otxt -of mic_med
cat mic_med.txt

For large:

./build/bin/whisper-cli -m models/ggml-large-v2.bin -f mic16.wav -l en -otxt -of mic_large
cat mic_large.txt
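Choosing a model by hand works, but the choice can also be scripted from the memory currently available on the device. The sketch below is illustrative only: `pick_model` and `available_mb` are hypothetical names, and the thresholds are rough assumptions rather than measured requirements, so adjust them for your device.

```shell
# Map available memory in MB to a model name.
# Thresholds are rough assumptions, not measured requirements.
pick_model() {
    local avail_mb="$1"
    if [ "$avail_mb" -ge 5000 ]; then
        echo "large-v2"
    elif [ "$avail_mb" -ge 3000 ]; then
        echo "medium.en"
    elif [ "$avail_mb" -ge 1200 ]; then
        echo "small.en"
    else
        echo "base.en"
    fi
}

# MemAvailable in /proc/meminfo is reported in kB on Linux and Android.
available_mb() {
    awk '/^MemAvailable:/ { print int($2 / 1024) }' /proc/meminfo
}
```

A typical use would be `MODEL="$(pick_model "$(available_mb)")"`, followed by passing `models/ggml-$MODEL.bin` to the -m option.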

✅ Summary

You now have a complete offline automatic speech recognition setup on an Android phone or tablet.

Base models are light and fast.

Small/Medium/Large models trade speed for better accuracy.

You can record live speech, transcribe WAV files, and even process long recordings — all without internet.

This shows that modern ASR can run directly on small devices: phones and tablets can serve as standalone speech recognition machines.
