# LocalVocal - AI assistant OBS Plugin

<div align="center">

[![GitHub](https://img.shields.io/github/license/royshil/obs-localvocal)](https://github.com/royshil/obs-localvocal/blob/main/LICENSE)
[![GitHub Workflow Status](https://img.shields.io/github/actions/workflow/status/royshil/obs-localvocal/push.yaml)](https://github.com/royshil/obs-localvocal/actions/workflows/push.yaml)
[![Total downloads](https://img.shields.io/github/downloads/royshil/obs-localvocal/total)](https://github.com/royshil/obs-localvocal/releases)
[![GitHub release (latest by date)](https://img.shields.io/github/v/release/royshil/obs-localvocal)](https://github.com/royshil/obs-localvocal/releases)

</div>

## Introduction
The LocalVocal live-streaming AI assistant plugin lets you transcribe speech into text locally on your machine and perform various language-processing functions on the text using AI / LLMs (Large Language Models). ✅ No GPU required, ✅ no cloud costs, ✅ no network and ✅ no downtime! Privacy first: all data stays on your machine.

Current Features:
- Transcribe audio to text in real time in 100 languages
- Display captions on screen using text sources

Roadmap:
- Remove unwanted words from the transcription
- Translate captions in real time to 50 languages
- Summarize the text and show "highlights" on screen
- Detect key moments in the stream and allow triggering events (like replay)
- Detect emotions/sentiment and allow triggering events (like changing the scene or colors etc.)

Internally, the plugin runs a neural network ([OpenAI Whisper](https://github.com/openai/whisper)) locally to recognize speech in real time and provide captions.

It uses the [Whisper.cpp](https://github.com/ggerganov/whisper.cpp) project from [ggerganov](https://github.com/ggerganov) to run the Whisper network efficiently on CPUs and GPUs.

Check out our other plugins:
- [Background Removal](https://github.com/royshil/obs-backgroundremoval) removes background from webcam without a green screen.
- 🚧 Experimental 🚧 [CleanStream](https://github.com/royshil/obs-cleanstream) for real-time filler word (uh, um) and profanity removal from a live audio stream.
- [URL/API Source](https://github.com/royshil/obs-urlsource) that allows fetching live data from an API and displaying it in OBS.

If you like this work, which is given to you completely free of charge, please consider supporting it on GitHub: https://github.com/sponsors/royshil

## Download
Check out the [latest releases](https://github.com/royshil/obs-localvocal/releases) for downloads and install instructions.

## Building
The plugin was built and tested on Mac OSX (Intel & Apple silicon), Windows and Linux.

Start by cloning this repo to a directory of your choice.

### Mac OSX
To build locally, run the zsh script used by the CI pipeline. By default this builds a universal binary for both Intel and Apple Silicon. To build for a specific architecture, see `.github/scripts/.build.zsh` for the `-arch` options.
```sh
$ ./.github/scripts/build-macos -c Release
```

#### Install
Once the above script succeeds, the plugin files (e.g. `obs-localvocal.plugin`) will be in the `./release/Release` folder under the repository root. Copy the `.plugin` file to the OBS plugins directory, e.g. `~/Library/Application Support/obs-studio/plugins`.

To get a `.pkg` installer file, run for example:
```sh
$ ./.github/scripts/package-macos -c Release
```
(Note that the outputs may end up in the `Release` folder rather than the `install` folder that `package-macos` expects; in that case, rename `build_x86_64/Release` to `build_x86_64/install`.)
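
The two manual steps above (copying the built `.plugin` bundle into the OBS plugins folder, and renaming the `Release` folder when the packaging script expects `install`) could be sketched roughly as follows. The function names `install_plugins` and `rename_release_to_install` are hypothetical helpers for illustration, not part of this repo, and the paths are the ones mentioned above, which may differ on your machine.

```sh
#!/bin/sh
# Hypothetical helpers (assumed names, not shipped with the plugin).

# Copy every .plugin bundle from a build output folder into an OBS plugins folder.
install_plugins() {
    src_dir="$1"    # e.g. ./release/Release
    dest_dir="$2"   # e.g. "$HOME/Library/Application Support/obs-studio/plugins"
    mkdir -p "$dest_dir" || return 1
    for bundle in "$src_dir"/*.plugin; do
        # Skip the unexpanded glob pattern when nothing matches.
        [ -e "$bundle" ] || continue
        cp -R "$bundle" "$dest_dir/" || return 1
    done
}

# Rename <build_dir>/Release to <build_dir>/install, but only when
# Release exists and install does not, so nothing gets overwritten.
rename_release_to_install() {
    build_dir="$1"  # e.g. build_x86_64
    if [ -d "$build_dir/Release" ] && [ ! -e "$build_dir/install" ]; then
        mv "$build_dir/Release" "$build_dir/install"
    fi
}
```

For example, after sourcing the snippet you might run `install_plugins ./release/Release "$HOME/Library/Application Support/obs-studio/plugins"`, or `rename_release_to_install build_x86_64` before calling the packaging script.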
### Linux (Ubuntu)
Use the CI scripts again:
```sh
$ ./.github/scripts/build-linux.sh
```

### Windows
Use the CI scripts again, for example:
```powershell
> .github/scripts/Build-Windows.ps1 -Target x64 -CMakeGenerator "Visual Studio 17 2022"
```

The build output should be in the `./release` folder under the repository root. You can install the files manually by copying them into the OBS directory.