11 KiB
LocalVocal - Speech AI assistant OBS Plugin
Introduction
LocalVocal lets you transcribe, locally on your machine, speech into text and simultaneously translate to any language. ✅ No GPU required, ✅ no cloud costs, ✅ no network and ✅ no downtime! Privacy first - all data stays on your machine.
If this free plugin has been valuable consider adding a ⭐ to this GH repo, rating it on OBS, subscribing to my YouTube channel where I post updates, and supporting my work on GitHub, Patreon or OpenCollective 🙏
Internally the plugin is running OpenAI's Whisper to process real-time the speech and predict a transcription. It's using the Whisper.cpp project from ggerganov to run the Whisper network efficiently on CPUs and GPUs. Translation is done with CTranslate2.
Usage
https://youtu.be/ns4cP9HFTxQ
https://youtu.be/4llyfNi9FGs
https://youtu.be/R04w02qG26o
Do more with LocalVocal:
- RealTime Translation
- Translate Caption any Application
- Real-time Translation with DeepL
- Real-time Translation with OpenAI
- ChatGPT + Text-to-speech
- POST Captions to YouTube
- Local LLM Real-time Translation
- Usage Tutorial
Current Features:
- Transcribe audio to text in real time in 100 languages
- Display captions on screen using text sources
- Send captions to a .txt or .srt file (to read by external sources or video playback) with and without aggregation option
- Sync'ed captions with OBS recording timestamps
- Send captions on a RTMP stream to e.g. YouTube, Twitch
- Bring your own Whisper model (any GGML)
- Translate captions in real time to major languages (both Whisper built-in translation as well as NMT models)
- CUDA, hipBLAS (AMD ROCm), Apple Arm64, AVX & SSE acceleration support
- Filter out or replace any part of the produced captions
- Partial transcriptions for a streaming-captions experience
- 100s of fine-tuned Whisper models for dozens of languages from HuggingFace
Roadmap:
- More robust built-in translation options
- Additional output options: .vtt, .ssa, .sub, etc.
- Speaker diarization (detecting speakers in a multi-person audio stream)
Check out our other plugins:
- Background Removal removes background from webcam without a green screen.
- Detect will detect and track >80 types of objects in real-time inside OBS
- CleanStream for real-time filler word (uh,um) and profanity removal from a live audio stream
- URL/API Source that allows fetching live data from an API and displaying it in OBS.
- Squawk adds lifelike local text-to-speech capabilities built-in OBS
Download
Check out the latest releases for downloads and install instructions.
Models
The plugin ships with the Tiny.en model, and will autonomously download other Whisper models through a dropdown. There's also an option to select an external GGML Whisper model file if you have it on disk.
Get more models from https://ggml.ggerganov.com/ and HuggingFace, follow the instructions on whisper.cpp to create your own models or download others such as distilled models.
Building
The plugin was built and tested on Mac OSX (Intel & Apple silicon), Windows (with and without Nvidia CUDA) and Linux.
Start by cloning this repo to a directory of your choice.
Mac OSX
Using the CI pipeline scripts, locally you would just call the zsh script, which builds for the architecture specified in $MACOS_ARCH (either x86_64
or arm64
).
$ MACOS_ARCH="x86_64" ./.github/scripts/build-macos -c Release
Install
The above script should succeed and the plugin files (e.g. obs-localvocal.plugin
) will reside in the ./release/Release
folder off of the root. Copy the .plugin
file to the OBS directory e.g. ~/Library/Application Support/obs-studio/plugins
.
To get .pkg
installer file, run for example
$ ./.github/scripts/package-macos -c Release
(Note that maybe the outputs will be in the Release
folder and not the install
folder like pakage-macos
expects, so you will need to rename the folder from build_x86_64/Release
to build_x86_64/install
)
Linux
Ubuntu
For successfully building on Ubuntu, first clone the repo, then from the repo directory:
$ sudo apt install -y libssl-dev
$ ./.github/scripts/build-linux
Copy the results to the standard OBS folders on Ubuntu
$ sudo cp -R release/RelWithDebInfo/lib/* /usr/lib/
$ sudo cp -R release/RelWithDebInfo/share/* /usr/share/
Note: The official OBS plugins guide recommends adding plugins to the ~/.config/obs-studio/plugins
folder. This has to do with the way you installed OBS.
In case the above doesn't work, attempt to copy the files to the ~/.config
folder:
$ mkdir -p ~/.config/obs-studio/plugins/obs-localvocal/bin/64bit
$ cp -R release/RelWithDebInfo/lib/x86_64-linux-gnu/obs-plugins/* ~/.config/obs-studio/plugins/obs-localvocal/bin/64bit/
$ mkdir -p ~/.config/obs-studio/plugins/obs-localvocal/data
$ cp -R release/RelWithDebInfo/share/obs/obs-plugins/obs-localvocal/* ~/.config/obs-studio/plugins/obs-localvocal/data/
Other distros
For other distros where you can't use the CI build script, you can build the plugin as follows
-
Clone the repository and install these dependencies using your distribution's package manager:
- libssl (with development headers)
-
Generate the CMake build scripts (adjust folders if necessary)
cmake -B build-dir --preset linux-x86_64 -DUSE_SYSTEM_CURL=ON -DCMAKE_INSTALL_PREFIX=./output_dir
-
Build the plugin and copy the files to the output directory
cmake --build build-dir --target install
-
Copy plugin to OBS plugins folder
mkdir -p ~/.config/obs-studio/plugins/bin/64bit cp -R ./output_dir/lib/obs-plugins/* ~/.config/obs-studio/plugins/bin/64bit/
N.B. Depending on your system, the plugin might be in
./output_dir/lib64/obs-plugins
instead. -
Copy plugin data to OBS plugins folder - Possibly only needed on first install
mkdir -p ~/.config/obs-studio/plugins/data cp -R ./output_dir/share/obs/obs-plugins/obs-localvocal/* ~/.config/obs-studio/plugins/data/
Windows
Use the CI scripts again, for example:
> .github/scripts/Build-Windows.ps1 -Configuration Release
The build should exist in the ./release
folder off the root. You can manually install the files in the OBS directory.
> Copy-Item -Recurse -Force "release\Release\*" -Destination "C:\Program Files\obs-studio\"
Building with CUDA support on Windows
LocalVocal will now build with CUDA support automatically through a prebuilt binary of Whisper.cpp from https://github.com/locaal-ai/locaal-ai-dep-whispercpp. The CMake scripts will download all necessary files.
To build with cuda add ACCELERATION
as an environment variable (with cpu
, hipblas
, or cuda
) and build regularly
> $env:ACCELERATION="cuda"
> .github/scripts/Build-Windows.ps1 -Configuration Release