LocalVocal - Speech AI assistant OBS Plugin
Introduction
The LocalVocal live-streaming AI assistant plugin lets you transcribe audio speech into text, locally on your machine, and perform various language processing functions on the text using AI / LLMs (Large Language Models). ✅ No GPU required, ✅ no cloud costs, ✅ no network and ✅ no downtime! Privacy first - all data stays on your machine.
If this free plugin has been valuable to you, consider adding a ⭐ to this GH repo, rating it on OBS, subscribing to my YouTube channel where I post updates, and supporting my work: https://github.com/sponsors/royshil
https://youtu.be/5XqTMqpui3Q & https://youtu.be/Q34LQsx-nlg & https://youtu.be/4BTmoKr0YMw
Do more with LocalVocal:
- Translate / Caption any Application
- Real-time Translation with DeepL
- POST Captions to YouTube
- Local LLM Real-time Translation
Current Features:
- Transcribe audio to text in real time in 100 languages
- Display captions on screen using text sources
- Send captions to a file (which can be read by external sources; see the example after this list)
- Send captions on an RTMP stream, e.g. to YouTube, Twitch
- Bring your own Whisper model (GGML)
- Translate captions in real time to major languages
- CUDA support and Apple Arm64 support
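For example, once captions are written to a file, any external tool can follow it as text arrives (the file path here is hypothetical):
$ tail -f ~/obs-captions.txt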
Roadmap:
- Remove unwanted words from the transcription
- Summarize the text and show "highlights" on screen
- Detect key moments in the stream and allow triggering events (like replay)
- Detect emotions/sentiment and allow triggering events (like changing the scene or colors etc.)
Internally, the plugin runs a neural network (OpenAI Whisper) locally to transcribe speech in real time and produce captions.
It uses the whisper.cpp project from ggerganov to run the Whisper network very efficiently on CPUs and GPUs.
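To get a feel for what the plugin does internally, you can run the same network outside OBS with whisper.cpp's own example CLI. A minimal sketch (the example binary is named main in older whisper.cpp checkouts and whisper-cli in newer ones; the model and sample audio paths follow the whisper.cpp repo layout):
$ git clone https://github.com/ggerganov/whisper.cpp.git && cd whisper.cpp
$ ./models/download-ggml-model.sh tiny.en   # fetch the GGML Tiny English model
$ make                                      # build the example CLI
$ ./main -m models/ggml-tiny.en.bin -f samples/jfk.wav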
Check out our other plugins:
- Background Removal removes background from webcam without a green screen.
- Detect will detect and track >80 types of objects in real-time inside OBS
- 🚧 Experimental 🚧 CleanStream for real-time filler word (uh, um) and profanity removal from a live audio stream
- URL/API Source that allows fetching live data from an API and displaying it in OBS.
- Polyglot translation AI plugin for real-time, local translation to hundreds of languages
Download
Check out the latest releases for downloads and install instructions.
Models
The plugin ships with the Tiny.en model, and will autonomously download other, bigger Whisper models through a dropdown. However, there is also an option to select an external model file if you have one on disk.
Get more models from https://ggml.ggerganov.com/ and follow the instructions on whisper.cpp to create your own models or download others such as distilled models.
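For instance, ggerganov also mirrors the GGML model files on Hugging Face; a sketch of fetching one manually and then selecting it via the external model file option (the URL pattern is taken from the whisper.cpp download script):
$ curl -L -o ggml-base.en.bin https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin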
Building
The plugin was built and tested on Mac OSX (Intel & Apple silicon), Windows (with and without Nvidia CUDA) and Linux.
Start by cloning this repo to a directory of your choice.
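For example (the clone URL is an assumption; substitute this repository's actual URL):
$ git clone https://github.com/occ-ai/obs-localvocal.git
$ cd obs-localvocal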
Mac OSX
Using the CI pipeline scripts, locally you would just call the zsh script, which builds for the architecture specified in $MACOS_ARCH (either x86_64 or arm64).
$ MACOS_ARCH="x86_64" ./.github/scripts/build-macos -c Release
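The Apple silicon build is the same call with the other architecture value:
$ MACOS_ARCH="arm64" ./.github/scripts/build-macos -c Release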
Install
The above script should succeed and the plugin files (e.g. obs-localvocal.plugin) will reside in the ./release/Release folder off of the root. Copy the .plugin file to the OBS directory, e.g. ~/Library/Application Support/obs-studio/plugins.
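For example, a sketch of that copy step using the paths above:
$ mkdir -p ~/Library/Application\ Support/obs-studio/plugins
$ cp -R release/Release/obs-localvocal.plugin ~/Library/Application\ Support/obs-studio/plugins/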
To get a .pkg installer file, run for example:
$ ./.github/scripts/package-macos -c Release
(Note that the outputs may be in the Release folder and not the install folder like package-macos expects, so you will need to rename the folder from build_x86_64/Release to build_x86_64/install.)
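A sketch of that rename, then re-running the packaging script:
$ mv build_x86_64/Release build_x86_64/install
$ ./.github/scripts/package-macos -c Release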
Linux (Ubuntu)
Use the CI scripts again:
$ ./.github/scripts/build-linux.sh
Copy the results to the standard OBS folders on Ubuntu:
$ sudo cp -R release/RelWithDebInfo/lib/* /usr/lib/x86_64-linux-gnu/
$ sudo cp -R release/RelWithDebInfo/share/* /usr/share/
Note: The official OBS plugins guide recommends adding plugins to the ~/.config/obs-studio/plugins folder.
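A hedged sketch of such a per-user install; the bin/64bit and data subfolder layout, and the exact file paths under release/RelWithDebInfo, are assumptions based on the standard OBS plugin structure:
# assumed per-user layout: <plugin>/bin/64bit for the .so, <plugin>/data for resources
$ mkdir -p ~/.config/obs-studio/plugins/obs-localvocal/bin/64bit
$ cp release/RelWithDebInfo/lib/obs-plugins/obs-localvocal.so ~/.config/obs-studio/plugins/obs-localvocal/bin/64bit/
$ cp -R release/RelWithDebInfo/share/obs/obs-plugins/obs-localvocal ~/.config/obs-studio/plugins/obs-localvocal/data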
Windows
Use the CI scripts again, for example:
> .github/scripts/Build-Windows.ps1 -Configuration Release
The build should exist in the ./release folder off the root. You can manually install the files in the OBS directory:
> Copy-Item -Recurse -Force "release\Release\*" -Destination "C:\Program Files\obs-studio\"
Building with CUDA support on Windows
LocalVocal will now build with CUDA support automatically through a prebuilt binary of Whisper.cpp from https://github.com/occ-ai/occ-ai-dep-whispercpp. The CMake scripts will download all necessary files.
To build with CUDA, add CPU_OR_CUDA as an environment variable (with the value cpu, 12.2.0 or 11.8.0) and build regularly:
> $env:CPU_OR_CUDA="12.2.0"
> .github/scripts/Build-Windows.ps1 -Configuration Release
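The CPU-only build uses the same flow with the cpu value:
> $env:CPU_OR_CUDA="cpu"
> .github/scripts/Build-Windows.ps1 -Configuration Release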