Model directory (#172)

* refactor: Handle file exceptions when writing raw sentences and translations

This commit modifies transcription-filter-callbacks.cpp to handle file exceptions when writing raw sentences and translations to files. File writes are now wrapped in try-catch blocks so that I/O failures are caught and logged instead of crashing the plugin or leaving it in an inconsistent state.
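
A minimal sketch of the pattern (the helper below is hypothetical, not the plugin's code). Note that std::ofstream only throws if an exception mask is set via exceptions(); without it, failures must be checked with fail():

#include <fstream>
#include <iostream>
#include <string>

void write_line_safely(const std::string &path, const std::string &line)
{
	std::ofstream out;
	// Opt in to exceptions: by default ofstream reports errors via failbit/badbit only.
	out.exceptions(std::ofstream::failbit | std::ofstream::badbit);
	try {
		out.open(path, std::ios::app);
		out << line << std::endl;
	} catch (const std::ofstream::failure &e) {
		std::cerr << "File write failed: " << e.what() << '\n';
	}
}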

* refactor: Update models_info function to use cached models information

The models_info function in model-downloader.cpp now returns a const reference to a cached map of model information. The map is built once, on first use, so repeated calls no longer re-read and re-parse the models directory JSON, removing redundant I/O and parsing overhead.
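
The caching boils down to a function-local static. A minimal sketch with a simplified stand-in ModelInfo (the actual diff keeps the cache behind a unique_ptr, which behaves the same):

#include <map>
#include <string>

struct ModelInfo {
	std::string friendly_name; // stand-in for the full struct
};

static std::map<std::string, ModelInfo> load_models_info()
{
	// In the plugin this fetches and parses models_directory.json; stubbed here.
	return {{"Whisper Tiny", {"Whisper Tiny"}}};
}

const std::map<std::string, ModelInfo> &models_info()
{
	// Initialized on the first call (thread-safe since C++11); every later
	// call returns the same cached map by const reference.
	static const std::map<std::string, ModelInfo> cached = load_models_info();
	return cached;
}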

Refactor the code in model-downloader.cpp to use the updated models_info function and remove the unnecessary file read and JSON parsing code.

Closes #123

* refactor: Simplify file handling in transcription-filter-callbacks.cpp

* refactor: Add script to query Hugging Face models and update models_directory.json

This commit adds two scripts, hugging_face_model_query.py and hugging_face_model_query_all.py, for querying Hugging Face models and updating models_directory.json. hugging_face_model_query.py fetches information for a given model from the Hugging Face API and adds it to models_directory.json; hugging_face_model_query_all.py fetches every model matching a search query and adds the matches. Together they keep models_directory.json current with the models available on Hugging Face.
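
The scripts themselves are Python and their diff is not shown here. As a rough sketch of the kind of request they issue (the endpoint and query parameters are assumptions based on the public Hugging Face API), written in the same libcurl style the plugin uses elsewhere in this commit:

#include <curl/curl.h>
#include <iostream>
#include <string>

static size_t write_cb(void *data, size_t size, size_t nmemb, void *userp)
{
	static_cast<std::string *>(userp)->append(static_cast<char *>(data), size * nmemb);
	return size * nmemb;
}

int main()
{
	std::string body;
	CURL *curl = curl_easy_init();
	if (!curl)
		return 1;
	// Search the public model index for whisper-style GGML models (assumed query).
	curl_easy_setopt(curl, CURLOPT_URL,
			 "https://huggingface.co/api/models?search=whisper+ggml&limit=10");
	curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_cb);
	curl_easy_setopt(curl, CURLOPT_WRITEDATA, &body);
	curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
	CURLcode res = curl_easy_perform(curl);
	curl_easy_cleanup(curl);
	if (res != CURLE_OK)
		return 1;
	std::cout << body << '\n'; // JSON array of matching models
	return 0;
}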

Refactor the file handling in transcription-filter-callbacks.cpp

This commit simplifies the file handling in transcription-filter-callbacks.cpp, reducing complexity and removing unnecessary code to make the logic easier to read and maintain.

Update the models_info function to use cached models information

This commit updates the models_info function to use cached models information instead of reloading it on every call, avoiding repeated fetches and JSON parsing of the models directory.

Handle file exceptions when writing raw sentences and translations

This commit adds exception handling around the file writes for raw sentences and translations, so that file-related exceptions are caught and logged rather than crashing the plugin or producing corrupt output.

Simplify the Onnxruntime installation in FetchOnnxruntime.cmake

This commit simplifies the ONNX Runtime installation steps in FetchOnnxruntime.cmake, making them more concise and easier to maintain.

Update the version to 0.3.6 and adjust the website URL

This commit bumps the version to 0.3.6 and updates the website URL to match.

* refactor: Add ExtraInfo struct to ModelInfo and update models_info function

* refactor: Update model names in models_directory.json and fix URL in transcription-filter.h

Roy Shilkrot · 2024-10-08 22:41:20 -04:00 · committed by GitHub
parent 622f0b163e
commit 5670ac94b2
11 changed files with 4595 additions and 274 deletions

File diff suppressed because it is too large.

src/model-utils/model-downloader-types.h

@@ -16,13 +16,21 @@ struct ModelFileDownloadInfo {
enum ModelType { MODEL_TYPE_TRANSCRIPTION, MODEL_TYPE_TRANSLATION };
struct ExtraInfo {
std::string language;
std::string description;
std::string source;
};
struct ModelInfo {
std::string friendly_name;
std::string local_folder_name;
ModelType type;
std::vector<ModelFileDownloadInfo> files;
ExtraInfo extra;
};
extern std::map<std::string, ModelInfo> models_info;
extern const std::map<std::string, ModelInfo> &models_info();
extern const std::vector<ModelInfo> get_sorted_models_info();
#endif /* MODEL_DOWNLOADER_TYPES_H */

src/model-utils/model-find-utils.cpp

@@ -8,9 +8,16 @@
std::string find_model_folder(const ModelInfo &model_info)
{
if (model_info.friendly_name.empty() || model_info.local_folder_name.empty() ||
model_info.files.empty()) {
obs_log(LOG_ERROR, "Model info is invalid.");
if (model_info.friendly_name.empty()) {
obs_log(LOG_ERROR, "Model info is invalid. Friendly name is empty.");
return "";
}
if (model_info.local_folder_name.empty()) {
obs_log(LOG_ERROR, "Model info is invalid. Local folder name is empty.");
return "";
}
if (model_info.files.empty()) {
obs_log(LOG_ERROR, "Model info is invalid. Files list is empty.");
return "";
}

src/model-utils/model-infos.cpp

@@ -1,246 +1,288 @@
#include "model-downloader-types.h"
#include "plugin-support.h"
std::map<std::string, ModelInfo> models_info = {{
{"M2M-100 418M (495Mb)",
{"M2M-100 418M",
"m2m-100-418M",
MODEL_TYPE_TRANSLATION,
{{"https://huggingface.co/jncraton/m2m100_418M-ct2-int8/resolve/main/model.bin?download=true",
"D6703DD9F920FF896E45C3D97B490761BED5944937B90BBE6A7245F5652542D4"},
{
"https://huggingface.co/jncraton/m2m100_418M-ct2-int8/resolve/main/config.json?download=true",
"4244772990E30069563E3DDFB4AD6DC95BDFD2AC3DE667EA8858C9B0A8433FA8",
},
{"https://huggingface.co/jncraton/m2m100_418M-ct2-int8/resolve/main/generation_config.json?download=true",
"AED76366507333DDBB8BD49960F23C82FE6446B3319A46A54BEFDB45324CCF61"},
{"https://huggingface.co/jncraton/m2m100_418M-ct2-int8/resolve/main/shared_vocabulary.json?download=true",
"7EB5D0FF184C6095C7C10F9911C0AEA492250ABD12854F9C3D787C64B1C6397E"},
{"https://huggingface.co/jncraton/m2m100_418M-ct2-int8/resolve/main/special_tokens_map.json?download=true",
"C1A4F86C3874D279AE1B2A05162858DB5DD6C61665D84223ED886CBCFF08FDA6"},
{"https://huggingface.co/jncraton/m2m100_418M-ct2-int8/resolve/main/tokenizer_config.json?download=true",
"AE54F15F0649BB05041CDADAD8485BA1FAF40BC33E6B4C2A74AE2D1AE5710FA2"},
{"https://huggingface.co/jncraton/m2m100_418M-ct2-int8/resolve/main/vocab.json?download=true",
"B6E77E474AEEA8F441363ACA7614317C06381F3EACFE10FB9856D5081D1074CC"},
{"https://huggingface.co/jncraton/m2m100_418M-ct2-int8/resolve/main/sentencepiece.bpe.model?download=true",
"D8F7C76ED2A5E0822BE39F0A4F95A55EB19C78F4593CE609E2EDBC2AEA4D380A"}}}},
{"M2M-100 1.2B (1.25Gb)",
{"M2M-100 1.2BM",
"m2m-100-1_2B",
MODEL_TYPE_TRANSLATION,
{{"https://huggingface.co/jncraton/m2m100_1.2B-ct2-int8/resolve/main/model.bin?download=true",
"C97DF052A558895317312470E1FF7CB8EAE5416F7AE16214A2983C6853DD3CE5"},
{
"https://huggingface.co/jncraton/m2m100_1.2B-ct2-int8/resolve/main/config.json?download=true",
"4244772990E30069563E3DDFB4AD6DC95BDFD2AC3DE667EA8858C9B0A8433FA8",
},
{"https://huggingface.co/jncraton/m2m100_1.2B-ct2-int8/resolve/main/generation_config.json?download=true",
"AED76366507333DDBB8BD49960F23C82FE6446B3319A46A54BEFDB45324CCF61"},
{"https://huggingface.co/jncraton/m2m100_1.2B-ct2-int8/resolve/main/shared_vocabulary.json?download=true",
"7EB5D0FF184C6095C7C10F9911C0AEA492250ABD12854F9C3D787C64B1C6397E"},
{"https://huggingface.co/jncraton/m2m100_1.2B-ct2-int8/resolve/main/special_tokens_map.json?download=true",
"C1A4F86C3874D279AE1B2A05162858DB5DD6C61665D84223ED886CBCFF08FDA6"},
{"https://huggingface.co/jncraton/m2m100_1.2B-ct2-int8/resolve/main/tokenizer_config.json?download=true",
"1566A6CFA4F541A55594C9D5E090F530812D5DE7C94882EA3AF156962D9933AE"},
{"https://huggingface.co/jncraton/m2m100_1.2B-ct2-int8/resolve/main/vocab.json?download=true",
"B6E77E474AEEA8F441363ACA7614317C06381F3EACFE10FB9856D5081D1074CC"},
{"https://huggingface.co/jncraton/m2m100_1.2B-ct2-int8/resolve/main/sentencepiece.bpe.model?download=true",
"D8F7C76ED2A5E0822BE39F0A4F95A55EB19C78F4593CE609E2EDBC2AEA4D380A"}}}},
{"NLLB 200 1.3B (1.4Gb)",
{"NLLB 200 1.3B",
"nllb-200-1.3b",
MODEL_TYPE_TRANSLATION,
{{"https://huggingface.co/JustFrederik/nllb-200-distilled-1.3B-ct2-int8/resolve/main/model.bin?download=true",
"72D7533DC7A0E8F10F19A650D4E90FAF9CBFA899DB5411AD124BD5802BD91263"},
{
"https://huggingface.co/JustFrederik/nllb-200-distilled-1.3B-ct2-int8/resolve/main/config.json?download=true",
"0C2F6FA2057C7264D052FB4A62BA3476EEAE70487ACDDFA8E779A53A00CBF44C",
},
{"https://huggingface.co/JustFrederik/nllb-200-distilled-1.3B-ct2-int8/resolve/main/tokenizer.json?download=true",
"E316B82DE11D0F951F370943B3C438311629547285129B0B81DADABD01BCA665"},
{"https://huggingface.co/JustFrederik/nllb-200-distilled-1.3B-ct2-int8/resolve/main/shared_vocabulary.txt?download=true",
"A132A83330F45514C2476EB81D1D69B3C41762264D16CE0A7EA982E5D6C728E5"},
{"https://huggingface.co/JustFrederik/nllb-200-distilled-1.3B-ct2-int8/resolve/main/special_tokens_map.json?download=true",
"992BD4ED610D644D6823081937BCC91BB8878DD556CEA4AE5327F2480361330E"},
{"https://huggingface.co/JustFrederik/nllb-200-distilled-1.3B-ct2-int8/resolve/main/tokenizer_config.json?download=true",
"D1AA8C3697D3E35674F97B5B7E9C99D22B010F528E80140257D97316BE90D044"},
{"https://huggingface.co/JustFrederik/nllb-200-distilled-1.3B-ct2-int8/resolve/main/sentencepiece.bpe.model?download=true",
"14BB8DFB35C0FFDEA7BC01E56CEA38B9E3D5EFCDCB9C251D6B40538E1AAB555A"}}}},
{"NLLB 200 600M (650Mb)",
{"NLLB 200 600M",
"nllb-200-600m",
MODEL_TYPE_TRANSLATION,
{{"https://huggingface.co/JustFrederik/nllb-200-distilled-600M-ct2-int8/resolve/main/model.bin?download=true",
"ED1BEAF75134DE7505315A5223162F56ACFF397EFF6B50638A500D3936FE707B"},
{
"https://huggingface.co/JustFrederik/nllb-200-distilled-600M-ct2-int8/resolve/main/config.json?download=true",
"0C2F6FA2057C7264D052FB4A62BA3476EEAE70487ACDDFA8E779A53A00CBF44C",
},
{"https://huggingface.co/JustFrederik/nllb-200-distilled-600M-ct2-int8/resolve/main/tokenizer.json?download=true",
"E316B82DE11D0F951F370943B3C438311629547285129B0B81DADABD01BCA665"},
{"https://huggingface.co/JustFrederik/nllb-200-distilled-600M-ct2-int8/resolve/main/shared_vocabulary.txt?download=true",
"A132A83330F45514C2476EB81D1D69B3C41762264D16CE0A7EA982E5D6C728E5"},
{"https://huggingface.co/JustFrederik/nllb-200-distilled-600M-ct2-int8/resolve/main/special_tokens_map.json?download=true",
"992BD4ED610D644D6823081937BCC91BB8878DD556CEA4AE5327F2480361330E"},
{"https://huggingface.co/JustFrederik/nllb-200-distilled-600M-ct2-int8/resolve/main/tokenizer_config.json?download=true",
"D1AA8C3697D3E35674F97B5B7E9C99D22B010F528E80140257D97316BE90D044"},
{"https://huggingface.co/JustFrederik/nllb-200-distilled-600M-ct2-int8/resolve/main/sentencepiece.bpe.model?download=true",
"14BB8DFB35C0FFDEA7BC01E56CEA38B9E3D5EFCDCB9C251D6B40538E1AAB555A"}}}},
{"MADLAD 400 3B (2.9Gb)",
{"MADLAD 400 3B",
"madlad-400-3b",
MODEL_TYPE_TRANSLATION,
{{"https://huggingface.co/santhosh/madlad400-3b-ct2/resolve/main/model.bin?download=true",
"F3C87256A2C888100C179D7DCD7F41DF17C767469546C59D32C7DDE86C740A6B"},
{
"https://huggingface.co/santhosh/madlad400-3b-ct2/resolve/main/config.json?download=true",
"A428C51CD35517554523B3C6B6974A5928BC35E82B130869A543566A34A83B93",
},
{"https://huggingface.co/santhosh/madlad400-3b-ct2/resolve/main/shared_vocabulary.txt?download=true",
"C327551CE3CA6EFC7B437E11A267F79979893332DDA8A1D146E2C950815193F8"},
{"https://huggingface.co/santhosh/madlad400-3b-ct2/resolve/main/sentencepiece.model?download=true",
"EF11AC9A22C7503492F56D48DCE53BE20E339B63605983E9F27D2CD0E0F3922C"}}}},
{"Whisper Base q5 (57Mb)",
{"Whisper Base q5",
"whisper-base-q5",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-base-q5_1.bin",
"422F1AE452ADE6F30A004D7E5C6A43195E4433BC370BF23FAC9CC591F01A8898"}}}},
{"Whisper Base English q5 (57Mb)",
{"Whisper Base En q5",
"ggml-model-whisper-base-en-q5_1",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-base.en-q5_1.bin",
"4BAF70DD0D7C4247BA2B81FAFD9C01005AC77C2F9EF064E00DCF195D0E2FDD2F"}}}},
{"Whisper Base (141Mb)",
{"Whisper Base",
"ggml-model-whisper-base",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-base.bin",
"60ED5BC3DD14EEA856493D334349B405782DDCAF0028D4B5DF4088345FBA2EFE"}}}},
{"Whisper Base English (141Mb)",
{"Whisper Base En",
"ggml-model-whisper-base-en",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-base.en.bin",
"A03779C86DF3323075F5E796CB2CE5029F00EC8869EEE3FDFB897AFE36C6D002"}}}},
{"Whisper Large v1 q5 (1Gb)",
{"Whisper Large v1 q5",
"ggml-model-whisper-large-q5_0",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-large-q5_0.bin",
"3A214837221E4530DBC1FE8D734F302AF393EB30BD0ED046042EBF4BAF70F6F2"}}}},
{"Whisper Medium q5 (514Mb)",
{"Whisper Medium q5",
"ggml-model-whisper-medium-q5_0",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-medium-q5_0.bin",
"19FEA4B380C3A618EC4723C3EEF2EB785FFBA0D0538CF43F8F235E7B3B34220F"}}}},
{"Whisper Medium English q5 (514Mb)",
{"Whisper Medium En q5",
"ggml-model-whisper-medium-en-q5_0",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-medium.en-q5_0.bin",
"76733E26AD8FE1C7A5BF7531A9D41917B2ADC0F20F2E4F5531688A8C6CD88EB0"}}}},
{"Whisper Small q5 (181Mb)",
{"Whisper Small q5",
"ggml-model-whisper-small-q5_1",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-small-q5_1.bin",
"AE85E4A935D7A567BD102FE55AFC16BB595BDB618E11B2FC7591BC08120411BB"}}}},
{"Whisper Small English q5 (181Mb)",
{"Whisper Small En q5",
"ggml-model-whisper-small-en-q5_1",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-small.en-q5_1.bin",
"BFDFF4894DCB76BBF647D56263EA2A96645423F1669176F4844A1BF8E478AD30"}}}},
{"Whisper Small (465Mb)",
{"Whisper Small",
"ggml-model-whisper-small",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-small.bin",
"1BE3A9B2063867B937E64E2EC7483364A79917E157FA98C5D94B5C1FFFEA987B"}}}},
{"Whisper Small English (465Mb)",
{"Whisper Small En",
"ggml-model-whisper-small-en",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-small.en.bin",
"C6138D6D58ECC8322097E0F987C32F1BE8BB0A18532A3F88F734D1BBF9C41E5D"}}}},
{"Whisper Tiny (74Mb)",
{"Whisper Tiny",
"ggml-model-whisper-tiny",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-tiny.bin",
"BE07E048E1E599AD46341C8D2A135645097A538221678B7ACDD1B1919C6E1B21"}}}},
{"Whisper Tiny q5 (31Mb)",
{"Whisper Tiny q5",
"ggml-model-whisper-tiny-q5_1",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-tiny-q5_1.bin",
"818710568DA3CA15689E31A743197B520007872FF9576237BDA97BD1B469C3D7"}}}},
{"Whisper Tiny English q5 (31Mb)",
{"Whisper Tiny En q5",
"ggml-model-whisper-tiny-en-q5_1",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-tiny.en-q5_1.bin",
"C77C5766F1CEF09B6B7D47F21B546CBDDD4157886B3B5D6D4F709E91E66C7C2B"}}}},
{"Whisper Tiny English q8 (42Mb)",
{"Whisper Tiny En q8",
"ggml-model-whisper-tiny-en-q8_0",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-tiny.en-q8_0.bin",
"5BC2B3860AA151A4C6E7BB095E1FCCE7CF12C7B020CA08DCEC0C6D018BB7DD94"}}}},
{"Whisper Tiny English (74Mb)",
{"Whisper Tiny En",
"ggml-model-whisper-tiny-en",
MODEL_TYPE_TRANSCRIPTION,
{{"https://ggml.ggerganov.com/ggml-model-whisper-tiny.en.bin",
"921E4CF8686FDD993DCD081A5DA5B6C365BFDE1162E72B08D75AC75289920B1F"}}}},
{"Whisper Large v3 (3Gb)",
{"Whisper Large v3",
"ggml-large-v3",
MODEL_TYPE_TRANSCRIPTION,
{{"https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3.bin",
"64d182b440b98d5203c4f9bd541544d84c605196c4f7b845dfa11fb23594d1e2"}}}},
{"Whisper Large v3 q5 (1Gb)",
{"Whisper Large v3 q5",
"ggml-large-v3-q5_0",
MODEL_TYPE_TRANSCRIPTION,
{{"https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-q5_0.bin",
"d75795ecff3f83b5faa89d1900604ad8c780abd5739fae406de19f23ecd98ad1"}}}},
{"Whisper Large v2 (3Gb)",
{"Whisper Large v2",
"ggml-large-v2",
MODEL_TYPE_TRANSCRIPTION,
{{"https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v2.bin",
"9a423fe4d40c82774b6af34115b8b935f34152246eb19e80e376071d3f999487"}}}},
{"Whisper Large v1 (3Gb)",
{"Whisper Large v1",
"ggml-large-v1",
MODEL_TYPE_TRANSCRIPTION,
{{"https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v1.bin",
"7d99f41a10525d0206bddadd86760181fa920438b6b33237e3118ff6c83bb53d"}}}},
{"Whisper Medium English (1.5Gb)",
{"Whisper Medium English",
"ggml-medium-en",
MODEL_TYPE_TRANSCRIPTION,
{{"https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-medium.en.bin",
"cc37e93478338ec7700281a7ac30a10128929eb8f427dda2e865faa8f6da4356"}}}},
{"Whisper Medium (1.5Gb)",
{"Whisper Medium",
"ggml-medium",
MODEL_TYPE_TRANSCRIPTION,
{{"https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-medium.bin",
"6c14d5adee5f86394037b4e4e8b59f1673b6cee10e3cf0b11bbdbee79c156208"}}}},
{"Whisper Large v3 Turbo (1.62Gb)",
{"Whisper Large v3 Turbo",
"ggml-large-v3-turbo",
MODEL_TYPE_TRANSCRIPTION,
{{"https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo.bin",
"1FC70F774D38EB169993AC391EEA357EF47C88757EF72EE5943879B7E8E2BC69"}}}},
{"Whisper Large v3 Turbo q5 (574Mb)",
{"Whisper Large v3 Turbo q5",
"ggml-large-v3-turbo-q5_0",
MODEL_TYPE_TRANSCRIPTION,
{{"https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-turbo-q5_0.bin",
"394221709CD5AD1F40C46E6031CA61BCE88931E6E088C188294C6D5A55FFA7E2"}}}},
}};
#include <obs-module.h>
#include <fstream>
#include <map>
#include <string>
#include <memory>
#include <sstream>
#include <optional>
#include <nlohmann/json.hpp>
#include <curl/curl.h>
static size_t WriteCallback(void *contents, size_t size, size_t nmemb, void *userp)
{
((std::string *)userp)->append((char *)contents, size * nmemb);
return size * nmemb;
}
/**
* @brief Downloads a JSON file from a specified GitHub URL.
*
* This function uses libcurl to download a JSON file from a GitHub repository.
* The downloaded content is stored in the provided string reference.
*
* @param json_content A reference to a string where the downloaded JSON content will be stored.
* @return true if the download was successful and the HTTP response code was 200, false otherwise.
*
* The function performs the following steps:
* - Initializes a CURL session.
* - Sets the URL to download the JSON from.
* - Sets the callback function to write the downloaded data.
* - Follows redirects and sets a timeout of 10 seconds.
* - Performs the download operation.
* - Checks for errors and logs them using obs_log.
* - Cleans up the CURL session.
* - Checks the HTTP response code to ensure it is 200 (OK).
*/
bool download_json_from_github(std::string &json_content)
{
CURL *curl;
CURLcode res;
std::string readBuffer;
long http_code = 0;
curl = curl_easy_init();
if (curl) {
curl_easy_setopt(
curl, CURLOPT_URL,
"https://raw.githubusercontent.com/locaal-ai/obs-localvocal/master/data/models/models_directory.json");
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, WriteCallback);
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &readBuffer);
curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L); // Follow redirects
curl_easy_setopt(curl, CURLOPT_TIMEOUT, 10L); // Set a timeout (10 seconds)
res = curl_easy_perform(curl);
if (res != CURLE_OK) {
obs_log(LOG_ERROR, "Failed to download JSON from GitHub: %s",
curl_easy_strerror(res));
curl_easy_cleanup(curl);
return false;
}
curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &http_code);
curl_easy_cleanup(curl);
if (http_code != 200) {
obs_log(LOG_ERROR, "HTTP error: %ld", http_code);
return false;
}
} else {
obs_log(LOG_ERROR, "Failed to initialize curl");
return false;
}
json_content = readBuffer;
return true;
}
/**
* @brief Parses a JSON object to extract model information.
*
* This function takes a JSON object representing a model and extracts various
* fields to populate a ModelInfo structure. It performs validation on the
* presence and types of required fields and logs warnings for any missing or
* invalid fields.
*
* @param model The JSON object containing the model information.
* @return An optional ModelInfo object. If the required fields are missing or
* invalid, it returns std::nullopt.
*
* The JSON object is expected to have the following structure:
* {
* "friendly_name": "string", // Required
* "local_folder_name": "string", // Optional
* "type": "string", // Optional, expected values: "MODEL_TYPE_TRANSCRIPTION" or "MODEL_TYPE_TRANSLATION"
* "files": [ // Optional, array of file objects
* {
* "url": "string", // Required in each file object
* "sha256": "string" // Required in each file object
* },
* ...
* ],
* "extra": { // Optional
* "language": "string", // Optional
* "description": "string", // Optional
* "source": "string" // Optional
* }
* }
*/
std::optional<ModelInfo> parse_model_json(const nlohmann::json &model)
{
ModelInfo model_info;
if (!model.contains("friendly_name") || !model["friendly_name"].is_string()) {
obs_log(LOG_WARNING,
"Missing or invalid 'friendly_name' for a model. Skipping this model.");
return std::nullopt;
}
model_info.friendly_name = model["friendly_name"].get<std::string>();
if (model.contains("local_folder_name") && model["local_folder_name"].is_string()) {
model_info.local_folder_name = model["local_folder_name"].get<std::string>();
} else {
obs_log(LOG_WARNING, "Missing or invalid 'local_folder_name' for model: %s",
model_info.friendly_name.c_str());
}
if (model.contains("type") && model["type"].is_string()) {
const std::string &type_str = model["type"].get<std::string>();
if (type_str == "MODEL_TYPE_TRANSCRIPTION")
model_info.type = ModelType::MODEL_TYPE_TRANSCRIPTION;
else if (type_str == "MODEL_TYPE_TRANSLATION")
model_info.type = ModelType::MODEL_TYPE_TRANSLATION;
else
obs_log(LOG_WARNING, "Invalid 'type' for model: %s",
model_info.friendly_name.c_str());
} else {
obs_log(LOG_WARNING, "Missing or invalid 'type' for model: %s",
model_info.friendly_name.c_str());
}
if (model.contains("files") && model["files"].is_array()) {
for (const auto &file : model["files"]) {
ModelFileDownloadInfo file_info;
if (file.contains("url") && file["url"].is_string())
file_info.url = file["url"].get<std::string>();
else
obs_log(LOG_WARNING,
"Missing or invalid 'url' for a file in model: %s",
model_info.friendly_name.c_str());
if (file.contains("sha256") && file["sha256"].is_string())
file_info.sha256 = file["sha256"].get<std::string>();
else
obs_log(LOG_WARNING,
"Missing or invalid 'sha256' for a file in model: %s",
model_info.friendly_name.c_str());
model_info.files.push_back(file_info);
}
} else {
obs_log(LOG_WARNING, "Missing or invalid 'files' array for model: %s",
model_info.friendly_name.c_str());
}
// Parse the new "extra" field
if (model.contains("extra") && model["extra"].is_object()) {
const auto &extra = model["extra"];
if (extra.contains("language") && extra["language"].is_string())
model_info.extra.language = extra["language"].get<std::string>();
if (extra.contains("description") && extra["description"].is_string())
model_info.extra.description = extra["description"].get<std::string>();
if (extra.contains("source") && extra["source"].is_string())
model_info.extra.source = extra["source"].get<std::string>();
}
return model_info;
}
/**
* @brief Loads model information from a JSON source.
*
* This function attempts to download a JSON file containing model information from GitHub.
* If the download fails, it falls back to loading the JSON file from a local directory.
* The JSON file is expected to contain an array of models under the key "models".
* Each model's information is parsed and stored in a map with the model's friendly name as the key.
*
* @return A map where the keys are model friendly names and the values are ModelInfo objects.
*/
std::map<std::string, ModelInfo> load_models_info()
{
std::map<std::string, ModelInfo> models_info_map;
nlohmann::json model_directory_json;
// Try to download from GitHub first
std::string github_json_content;
bool download_success = download_json_from_github(github_json_content);
if (download_success) {
obs_log(LOG_INFO, "Successfully downloaded models directory from GitHub");
std::istringstream json_stream(github_json_content);
json_stream >> model_directory_json;
} else {
// Fall back to local file
obs_log(LOG_INFO, "Falling back to local models directory file");
char *model_directory_json_file = obs_module_file("models/models_directory.json");
if (model_directory_json_file == nullptr) {
obs_log(LOG_ERROR, "Cannot find local model directory file");
return models_info_map;
}
obs_log(LOG_INFO, "Local model directory file: %s", model_directory_json_file);
std::string model_directory_file_str = std::string(model_directory_json_file);
bfree(model_directory_json_file);
std::ifstream model_directory_file(model_directory_file_str);
if (!model_directory_file.is_open()) {
obs_log(LOG_ERROR, "Failed to open local model directory file");
return models_info_map;
}
model_directory_file >> model_directory_json;
}
if (!model_directory_json.contains("models") ||
!model_directory_json["models"].is_array()) {
obs_log(LOG_ERROR, "Invalid JSON structure: 'models' array not found");
return models_info_map;
}
for (const auto &model : model_directory_json["models"]) {
auto model_info_opt = parse_model_json(model);
if (model_info_opt) {
models_info_map[model_info_opt->friendly_name] = *model_info_opt;
}
}
obs_log(LOG_INFO, "Loaded %zu models", models_info_map.size());
return models_info_map;
}
const std::map<std::string, ModelInfo> &models_info()
{
static const std::unique_ptr<const std::map<std::string, ModelInfo>> cached_models_info =
std::make_unique<const std::map<std::string, ModelInfo>>(load_models_info());
return *cached_models_info;
}
const std::vector<ModelInfo> get_sorted_models_info()
{
const auto &models_map = models_info();
std::vector<ModelInfo> standard_models;
std::vector<ModelInfo> huggingface_models;
// Separate models into two categories
for (const auto &[key, model] : models_map) {
if (!model.extra.source.empty()) {
huggingface_models.push_back(model);
} else {
standard_models.push_back(model);
}
}
// Sort both vectors based on friendly_name
auto sort_by_name = [](const ModelInfo &a, const ModelInfo &b) {
return a.friendly_name < b.friendly_name;
};
std::sort(standard_models.begin(), standard_models.end(), sort_by_name);
std::sort(huggingface_models.begin(), huggingface_models.end(), sort_by_name);
// Combine the sorted vectors with a separator
std::vector<ModelInfo> result = std::move(standard_models);
if (!huggingface_models.empty()) {
ModelInfo info;
info.friendly_name = "--------- HuggingFace Models ---------";
info.type = ModelType::MODEL_TYPE_TRANSCRIPTION;
result.push_back(info);
result.insert(result.end(), huggingface_models.begin(), huggingface_models.end());
}
return result;
}

src/tests/CMakeLists.txt

@@ -7,7 +7,6 @@ target_sources(
PRIVATE ${CMAKE_SOURCE_DIR}/src/tests/localvocal-offline-test.cpp
${CMAKE_SOURCE_DIR}/src/tests/audio-file-utils.cpp
${CMAKE_SOURCE_DIR}/src/transcription-utils.cpp
${CMAKE_SOURCE_DIR}/src/model-utils/model-infos.cpp
${CMAKE_SOURCE_DIR}/src/model-utils/model-find-utils.cpp
${CMAKE_SOURCE_DIR}/src/whisper-utils/whisper-processing.cpp
${CMAKE_SOURCE_DIR}/src/whisper-utils/whisper-utils.cpp

src/tests/localvocal-offline-test.cpp

@@ -21,6 +21,7 @@
#include "audio-file-utils.h"
#include "translation/language_codes.h"
#include "ui/filter-replace-utils.h"
#include "model-utils/model-downloader-types.h"
#include <stdio.h>
#include <stdlib.h>
@@ -84,6 +85,14 @@ void obs_log(int log_level, const char *format, ...)
printf("\n");
}
const std::map<std::string, ModelInfo> &models_info()
{
static const std::unique_ptr<const std::map<std::string, ModelInfo>> cached_models_info =
std::make_unique<const std::map<std::string, ModelInfo>>();
return *cached_models_info;
}
transcription_filter_data *
create_context(int sample_rate, int channels, const std::string &whisper_model_path,
const std::string &silero_vad_model_file, const std::string &ct2ModelFolder,

src/transcription-filter-callbacks.cpp

@@ -112,13 +112,18 @@ void send_sentence_to_file(struct transcription_filter_data *gf,
}
if (!gf->save_srt) {
// Write raw sentence to file
std::ofstream output_file(gf->output_file_path, openmode);
output_file << str_copy << std::endl;
output_file.close();
if (write_translations) {
std::ofstream translated_output_file(translated_file_path, openmode);
translated_output_file << translated_sentence << std::endl;
translated_output_file.close();
try {
std::ofstream output_file(gf->output_file_path, openmode);
output_file << str_copy << std::endl;
output_file.close();
if (write_translations) {
std::ofstream translated_output_file(translated_file_path,
openmode);
translated_output_file << translated_sentence << std::endl;
translated_output_file.close();
}
} catch (const std::ofstream::failure &e) {
obs_log(LOG_ERROR, "Exception opening/writing/closing file: %s", e.what());
}
} else {
if (result.start_timestamp_ms == 0 && result.end_timestamp_ms == 0) {
@@ -297,12 +302,15 @@ void recording_state_callback(enum obs_frontend_event event, void *data)
struct transcription_filter_data *gf_ =
static_cast<struct transcription_filter_data *>(data);
if (event == OBS_FRONTEND_EVENT_RECORDING_STARTING) {
if (gf_->save_srt && gf_->save_only_while_recording) {
if (gf_->save_srt && gf_->save_only_while_recording &&
gf_->output_file_path != "") {
obs_log(gf_->log_level, "Recording started. Resetting srt file.");
// truncate file if it exists
std::ofstream output_file(gf_->output_file_path,
std::ios::out | std::ios::trunc);
output_file.close();
if (std::ifstream(gf_->output_file_path)) {
std::ofstream output_file(gf_->output_file_path,
std::ios::out | std::ios::trunc);
output_file.close();
}
gf_->sentence_number = 1;
gf_->start_timestamp_ms = now_ms();
}

src/transcription-filter-properties.cpp

@@ -152,15 +152,16 @@ void add_transcription_group_properties(obs_properties_t *ppts,
obs_property_t *whisper_models_list = obs_properties_add_list(
transcription_group, "whisper_model_path", MT_("whisper_model"),
OBS_COMBO_TYPE_LIST, OBS_COMBO_FORMAT_STRING);
// Add models from models_info map
for (const auto &model_info : models_info) {
if (model_info.second.type == MODEL_TYPE_TRANSCRIPTION) {
obs_property_list_add_string(whisper_models_list, model_info.first.c_str(),
model_info.first.c_str());
}
}
obs_property_list_add_string(whisper_models_list, "Load external model file",
"!!!external!!!");
// Add models from models_info map
for (const auto &model_info : get_sorted_models_info()) {
if (model_info.type == MODEL_TYPE_TRANSCRIPTION) {
obs_property_list_add_string(whisper_models_list,
model_info.friendly_name.c_str(),
model_info.friendly_name.c_str());
}
}
// Add a file selection input to select an external model file
obs_properties_add_path(transcription_group, "whisper_model_path_external",
@@ -191,7 +192,7 @@ void add_translation_group_properties(obs_properties_t *ppts)
// add "Whisper-Based Translation" option
obs_property_list_add_string(prop_translate_model, MT_("Whisper-Based-Translation"),
"whisper-based-translation");
for (const auto &model_info : models_info) {
for (const auto &model_info : models_info()) {
if (model_info.second.type == MODEL_TYPE_TRANSLATION) {
obs_property_list_add_string(prop_translate_model, model_info.first.c_str(),
model_info.first.c_str());

src/transcription-filter.h

@@ -20,9 +20,9 @@ void transcription_filter_show(void *data);
void transcription_filter_hide(void *data);
const char *const PLUGIN_INFO_TEMPLATE =
"<a href=\"https://github.com/occ-ai/obs-localvocal/\">LocalVocal</a> (%1) by "
"<a href=\"https://github.com/occ-ai\">OCC AI</a> ❤️ "
"<a href=\"https://www.patreon.com/RoyShilkrot\">Support & Follow</a>";
"<a href=\"https://github.com/locaal-ai/obs-localvocal/\">LocalVocal</a> (%1) by "
"<a href=\"https://github.com/locaal-ai\">Locaal AI</a> ❤️ "
"<a href=\"https://locaal.ai\">Support & Follow</a>";
const char *const SUPPRESS_SENTENCES_DEFAULT =
"Thank you for watching\nPlease like and subscribe\n"

src/translation/translation-utils.cpp

@@ -22,7 +22,7 @@ void start_translation(struct transcription_filter_data *gf)
return;
}
const ModelInfo &translation_model_info = models_info[gf->translation_model_index];
const ModelInfo &translation_model_info = models_info().at(gf->translation_model_index);
std::string model_file_found = find_model_folder(translation_model_info);
if (model_file_found == "") {
obs_log(LOG_INFO, "Translation CT2 model does not exist. Downloading...");

src/whisper-utils/whisper-utils.cpp

@@ -72,13 +72,13 @@ void update_whisper_model(struct transcription_filter_data *gf)
// new model is not external file
shutdown_whisper_thread(gf);
if (models_info.count(new_model_path) == 0) {
if (models_info().count(new_model_path) == 0) {
obs_log(LOG_WARNING, "Model '%s' does not exist",
new_model_path.c_str());
return;
}
const ModelInfo &model_info = models_info[new_model_path];
const ModelInfo &model_info = models_info().at(new_model_path);
// check if the model exists, if not, download it
std::string model_file_found = find_model_bin_file(model_info);