Commit Graph

29 Commits

Author SHA1 Message Date
Roy Shilkrot
41bd57fd5a refactor: Update translation options in transcription-filter-properties.cpp
Simplify the translation options in the transcription-filter-properties.cpp file by adding a new option "translate_only_full_sentences". This option will be visible only when the "translate_enabled" flag is true and the "is_advanced" flag is set.

Remove unnecessary code in model-infos.cpp

Remove the code that logs a warning message when the "sha256" field is missing or invalid in the model JSON file. This code is no longer needed as it does not affect the functionality of the program.

Comment out download_json_from_github in model-infos.cpp

Comment out the call to the "download_json_from_github" function in the load_models_info() function in model-infos.cpp. This function is currently not working as intended and needs further investigation.
2024-10-09 10:46:46 -04:00
Roy Shilkrot
5670ac94b2
Model directory (#172)
* refactor: Handle file exceptions when writing raw sentence and translations

This commit modifies the code in transcription-filter-callbacks.cpp to handle file exceptions when writing raw sentence and translations to files. It adds exception handling using try-catch blocks to ensure that file operations are properly handled. This change improves the robustness of the code and prevents crashes or unexpected behavior when file operations fail.

* refactor: Update models_info function to use cached models information

The models_info function in model-downloader.cpp has been updated to use a cached version of the models information. This improves performance by avoiding unnecessary file reads and JSON parsing. The function now returns a const reference to the cached models_info map. This change ensures that the models_info function is more efficient and reduces the overhead of loading the models information.

Refactor the code in model-downloader.cpp to use the updated models_info function and remove the unnecessary file read and JSON parsing code.

Closes #123

* refactor: Simplify file handling in transcription-filter-callbacks.cpp

* refactor: Add script to query Hugging Face models and update models_directory.json

This commit adds two new scripts, hugging_face_model_query.py and hugging_face_model_query_all.py, to query Hugging Face models and update the models_directory.json file. The hugging_face_model_query.py script fetches model information from the Hugging Face API and adds new models to the models_directory.json file. The hugging_face_model_query_all.py script fetches a list of models matching specific search criteria and adds the matching models to the models_directory.json file. These scripts help keep the models_directory.json file up to date with the latest models available on Hugging Face.

Refactor the file handling in transcription-filter-callbacks.cpp

This commit simplifies the file handling in the transcription-filter-callbacks.cpp file. The changes aim to improve the readability and maintainability of the code by reducing complexity and removing unnecessary code.

Update the models_info function to use cached models information

This commit updates the models_info function to use cached models information instead of fetching it every time the function is called. This change improves the performance of the function by reducing the number of API calls and improves the overall efficiency of the code.

Handle file exceptions when writing raw sentence and translations

This commit adds exception handling code to handle file exceptions when writing raw sentence and translations. The changes ensure that any file-related exceptions are caught and properly handled, preventing the program from crashing or producing incorrect results.

Simplify the Onnxruntime installation in FetchOnnxruntime.cmake

This commit simplifies the Onnxruntime installation process in the FetchOnnxruntime.cmake file. The changes aim to make the installation steps more concise and easier to understand, improving the overall maintainability of the code.

Update the version to 0.3.6 and adjust the website URL

This commit updates the version of the software to 0.3.6 and adjusts the website URL accordingly. The changes ensure that the software is properly versioned and the website URL is up to date.

* refactor: Add ExtraInfo struct to ModelInfo and update models_info function

* refactor: Update model names in models_directory.json and fix URL in transcription-filter.h
2024-10-08 22:41:20 -04:00
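
Note on 5670ac94b2: the message says models_info now returns a const reference to a cached map instead of re-reading and re-parsing the file on every call. A minimal sketch of that pattern follows, assuming a function-local static as the cache. The ModelInfo fields, the loader body, and the g_load_count counter are hypothetical stand-ins for illustration, not the plugin's actual code.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the plugin's ModelInfo type.
struct ModelInfo {
    std::string friendly_name;
    std::string local_folder_name;
};

// Counts how often the expensive load runs (illustration only).
static int g_load_count = 0;

// Hypothetical loader: real code would read and parse a JSON file here.
static std::map<std::string, ModelInfo> load_models_info()
{
    ++g_load_count;
    std::map<std::string, ModelInfo> models;
    models["Whisper Tiny English"] = {"Whisper Tiny English", "ggml-tiny-en"};
    return models;
}

// Returns a const reference to a map that is built once, on first use.
// Initialization of a function-local static is thread-safe since C++11.
const std::map<std::string, ModelInfo> &models_info()
{
    static const std::map<std::string, ModelInfo> cached = load_models_info();
    return cached;
}
```

Callers receive the same object on every call, so repeated lookups do not trigger further loads. The trade-off of this sketch is that the cache is never refreshed during the life of the process.
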
Roy Shilkrot
e3c69518a7
Fix hangups and VAD segmentation (#157)
* Fix hangups and VAD segmentation

* feat: Add max_sub_duration field to transcription filter data

* chore: Update VAD parameters for better segmentation accuracy

* feat: Add segment_duration field to transcription filter data

* feat: Optimize VAD processing for better performance

* feat: Refactor token buffer thread and whisper processing

The code changes involve refactoring the token buffer thread and whisper processing. The token buffer thread now uses the variable name `word_token` instead of `word` for better clarity. In the whisper processing, the log message format has been updated to include the segment number and token number. These changes aim to improve the performance and accuracy of VAD processing, as well as add new fields to the transcription filter data.

* Refactor token buffer thread and whisper processing

* refactor: Update translation context in transcription filter

The code changes in this commit update the translation context in the transcription filter. The `translate_add_context` property has been changed from a boolean to an integer slider, allowing the user to specify the number of context lines to add to the translation. This change aims to provide more flexibility in controlling the context for translation and improve the accuracy of the translation output.

* refactor: Update last_text variable name in transcription filter callbacks

* feat: Add translation language utilities

This commit adds a new file, `translation-language-utils.h`, which contains utility functions for handling translation languages. The `remove_start_punctuation` function removes any leading punctuation from a given string. This utility will be used in the translation process to improve the quality of the translated output.

* feat: Update ICU library configuration and dependencies

This commit updates the configuration and dependencies of the ICU library. The `BuildICU.cmake` file has been modified to use the `INSTALL_DIR` variable instead of the `ICU_INSTALL_DIR` variable for setting the ICU library paths. Additionally, the `ICU_IN_LIBRARY` variable has been renamed to `ICU_IN_LIBRARY` for better clarity. These changes aim to improve the build process and ensure proper linking of the ICU library.

* refactor: Update ICU library configuration and dependencies

* refactor: Update ICU library configuration and dependencies

* refactor: Update ICU library configuration and dependencies

* refactor: Update ICU library configuration and dependencies

* refactor: Update ICU library configuration and dependencies

* refactor: Update ICU library configuration and dependencies

* refactor: Update ICU library configuration and dependencies

This commit updates the `BuildICU.cmake` file to set the `CFLAGS`, `CXXFLAGS`, and `LDFLAGS` environment variables to `-fPIC` for Linux platforms. This change aims to ensure that the ICU library is built with position-independent code, improving compatibility and security. Additionally, the `icuin` library has been renamed to `icui18n` to align with the naming convention. These updates enhance the build process and maintain consistency in the ICU library configuration.
2024-09-06 10:27:05 -04:00
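
Note on e3c69518a7: the message describes remove_start_punctuation as removing any leading punctuation from a string. The sketch below shows one way such a helper could look. It is an assumption, not the code in translation-language-utils.h: it only recognizes ASCII punctuation via std::ispunct, whereas the real helper may handle Unicode punctuation (for example through ICU).

```cpp
#include <cctype>
#include <string>

// ASCII-only sketch: skips leading punctuation characters and returns the rest.
// Bytes of multi-byte UTF-8 characters are left untouched in the default "C" locale.
std::string remove_start_punctuation(const std::string &text)
{
    std::size_t start = 0;
    while (start < text.size() &&
           std::ispunct(static_cast<unsigned char>(text[start]))) {
        ++start;
    }
    return text.substr(start);
}
```

Punctuation inside or at the end of the string is preserved; only the leading run is dropped.
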
Ruwen Hahn
0592fa7d9d
Upgrade silero vad v5 (and some other changes) (#148)
* Add accessor for VAD window size in samples

* Feed buffered audio data to VAD in proper window sizes

* Wake whisper thread whenever audio is received

* Update silero VAD to v5

* Only reset VAD state between chunks of activity
2024-08-02 14:25:59 -04:00
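
Note on 0592fa7d9d: "Feed buffered audio data to VAD in proper window sizes" implies that incoming audio of arbitrary length is re-chunked into fixed-size windows before it reaches the VAD. A rough sketch of that idea follows. The WindowedFeeder class, its callback, and the window size passed to it are hypothetical and do not come from the plugin.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: buffers samples and hands out only complete windows.
class WindowedFeeder {
public:
    using WindowCallback = std::function<void(const float *, std::size_t)>;

    WindowedFeeder(std::size_t window_size, WindowCallback on_window)
        : window_size_(window_size), on_window_(std::move(on_window))
    {
    }

    // Appends new samples, then emits every full window that is available.
    void push(const float *samples, std::size_t count)
    {
        pending_.insert(pending_.end(), samples, samples + count);
        if (window_size_ == 0)
            return; // nothing sensible to emit
        std::size_t offset = 0;
        while (pending_.size() - offset >= window_size_) {
            on_window_(pending_.data() + offset, window_size_);
            offset += window_size_;
        }
        // Keep the incomplete tail for the next call.
        pending_.erase(pending_.begin(),
                       pending_.begin() + static_cast<std::ptrdiff_t>(offset));
    }

    // Number of samples still waiting for a full window.
    std::size_t pending() const { return pending_.size(); }

private:
    std::size_t window_size_;
    WindowCallback on_window_;
    std::vector<float> pending_;
};
```

With a window of 512 samples (an arbitrary example value), pushing 700 samples would emit one window and keep 188 samples buffered until more audio arrives.
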
Roy Shilkrot
78907ea14d
refactor: Update whisper model path and enable hipBLAS acceleration (#146)
* refactor: Update whisper model path and enable hipBLAS acceleration

* refactor: Update whisper model path and enable hipBLAS acceleration

* refactor: Update whisper model path and enable hipBLAS acceleration

* refactor: Update whisper model path and enable hipBLAS acceleration

* refactor: Update whisper model path and enable hipBLAS acceleration

* refactor: Update whisper model path and enable CoreML acceleration
2024-07-31 00:40:36 -04:00
Roy Shilkrot
b3e4bfa33a
refactor: Enable partial transcription with a latency of 1000ms (#141)
* refactor: Enable partial transcription with a latency of 1000ms

* refactor: Update CMakePresets.json and buildspec.json

- Remove the "QT_VERSION" variable from CMakePresets.json for all platforms
- Update the "version" of "obs-studio" and "prebuilt" dependencies in buildspec.json
- Update the "version" of "qt6" dependency in buildspec.json
- Update the "version" of the project to "0.3.3" in buildspec.json
- Update the "version" of the project to "0.3.3" in CMakePresets.json
- Remove unused code in whisper-processing.cpp

* refactor: Add -Wno-error=deprecated-declarations option to compilerconfig.cmake

* refactor: Update language codes in translation module
2024-07-19 14:02:24 -04:00
Roy Shilkrot
44f072b5ff
refactor: Add transcription-filter-properties.cpp for managing filter… (#138)
* refactor: Add transcription-filter-properties.cpp for managing filter properties

* refactor: Add translation_monitor to transcription filter

- Add translation_monitor to the transcription filter data structure
- Initialize and stop the translation_monitor in the transcription_filter_update function
- Update the send_caption_to_source function to use the translation_monitor for sending translated captions
- Clear the translation_monitor when disabling buffered output in the transcription_filter_update function

* refactor: Simplify UI and improve error handling in transcription filter
2024-07-17 12:18:31 -04:00
Roy Shilkrot
3c3b640bdb
Simplified UI (#136)
* refactor: Update translation option in transcription filter

- Update the translation option in the transcription filter to use a more concise label
- Remove unnecessary code related to file output in the transcription filter
- Improve the handling of whisper model paths in the transcription filter
- Set the default language to "auto" in the transcription filter properties

* refactor: Improve error handling in model-downloader.cpp and transcription-filter-callbacks.cpp

* refactor: Improve error handling in model-downloader.cpp and transcription-filter-callbacks.cpp
2024-07-15 18:28:03 -04:00
Roy Shilkrot
ee07bbe569
refactor: Update file output option in transcription filter (#128)
- Update the file output option in the transcription filter to use the new "Save to File" label instead of "Text File output"
- Add a new boolean flag "save_to_file" in the transcription filter data structure to track the file output setting
- Update the code in transcription-filter-callbacks.cpp and transcription-filter.cpp to use the new flag for file output logic
- Update the properties and UI in transcription-filter-properties.cpp to reflect the changes
2024-07-09 17:02:58 -04:00
Roy Shilkrot
32bbd99404
refactor: Add filter-replace-dialog.cpp for filter and replace functi… (#124)
* refactor: Add filter-replace-dialog.cpp for filter and replace functionality

* refactor: Improve filter-replace-dialog.cpp for filter and replace functionality
2024-07-02 15:27:11 -04:00
Roy Shilkrot
958266fb4e
refactor: Update buffer_output_type translations in locale files (#119)
* refactor: Update buffer_output_type translations in locale files

* refactor: Update buffer_num_chars_per_line translation in locale files

* refactor: Remove unused code related to buffer output type selection

* refactor: Update TokenBufferThread to use TokenBufferString for caption building

* refactor: Update TokenBufferThread to use TokenBufferString for caption building
2024-06-26 15:35:43 -04:00
Roy Shilkrot
67993f393d
Streamline and refactor (#105)
* refactor: Update whispercpp dependency to version 0.0.3

* refactor: Add buffered output parameters for transcription filter

* refactor: Remove unused parameter in set_source_signals function

* refactor: Fix character splitting bug in TokenBufferThread

* refactor: Update buffer size and overlap size in whisper-processing.cpp

* refactor: Remove unused parameter in set_source_signals function

* refactor: Fix floating point precision issue in whisper-processing.cpp

* refactor: Improve remove_leading_trailing_nonalpha function in transcription-utils.cpp

* refactor: Update VAD threshold in transcription filter

* refactor: Update VAD threshold parameter name in silero-vad-onnx.h

* refactor: Update VAD threshold parameter name in silero-vad-onnx.h

* refactor: Update lock_guard parameter name in TokenBufferThread
2024-06-05 18:02:36 -04:00
Roy Shilkrot
5227a437b6
VAD based segmentation (#97)
* refactor: Add whisper_buffer to transcription_filter_data struct

* refactor: Add sentence_psum_accept_thresh to transcription_filter_data struct

* refactor: Update buffer size and overlap size in whisper-processing.cpp

* refactor: Update buffer size and overlap size in whisper-processing.cpp

* refactor: Add audio-file-utils.cpp for audio file handling

* refactor: Update buffer size and overlap size in whisper-processing.cpp

* refactor: Add external model option to translation settings

* refactor: Add support for input tokenization style in translation settings

* refactor: Update buffer size and overlap size in whisper-processing.cpp
2024-05-16 15:07:00 -04:00
Roy Shilkrot
31c41a9574
Offline transcription accuracy tests (#96)
* Update translation-utils.h, transcription-filter.h, whisper-model-utils.h, model-find-utils.h, and model-downloader.h

* Update create_context function to include ct2ModelFolder parameter

* fix: add fix_utf8 flag to transcription_filter_data struct

* Update create_context function to include ct2ModelFolder parameter

* Update read_text_from_file function to include join_sentences parameter

* fix: Update VadIterator::reset_states to include reset_hc parameter

* Update create_context function to include whisper_sampling_method parameter

* Update tests README with additional configuration options

* feat: Add function to find file in folder by regex expression

* refactor: Improve text conditioning logic in transcription-filter.cpp

* refactor: Improve text conditioning logic in transcription-filter.cpp

* chore: Update ctranslate2 dependency to version 1.2.0

* refactor: Improve text conditioning logic in transcription-filter.cpp

* chore: Update cmake BuildCTranslate2.cmake to disable -Wno-comma warning

* refactor: Update translation context in whisper-processing.cpp and translation-utils.cpp
2024-05-10 17:37:09 -04:00
Roy Shilkrot
ab1b74a35c
Overlap analysis (#92)
* Update buffer size and overlap size in whisper-processing.h and default buffer size in msec in transcription-filter.cpp

* Update buffer size and overlap size in whisper-processing.h and default buffer size in msec in transcription-filter.cpp

* Update suppress_sentences in en-US.ini and transcription-filter-data.h

* Update suppress_sentences and fix whitespace in transcription-filter-data.h, whisper-processing.h, transcription-utils.cpp, and transcription-filter.h

* Update whisper-processing.cpp and whisper-utils.cpp files

* Update findStartOfOverlap function signature to use int instead of size_t

* Update Whispercpp_Build_GIT_TAG to use commit 7395c70a748753e3800b63e3422a2b558a097c80 in BuildWhispercpp.cmake

* Update buffer size and overlap size in whisper-processing.h and default buffer size in msec in transcription-filter.cpp

* Update unused parameter in transcription-filter-properties function

* Update log level and add suppress_sentences feature in transcription-filter.cpp and whisper-processing.cpp

* Add translation output feature in en-US.ini and transcription-filter-data.h

* Add DTW token timestamps and buffered output feature

* trigger rebuild

* Refactor remove_leading_trailing_nonalpha function to improve readability and performance

* Refactor is_lead_byte and is_trail_byte macros for improved readability and maintainability

* Refactor is_lead_byte and is_trail_byte macros for improved readability and maintainability

* trigger build
2024-04-25 17:14:13 -04:00
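
Note on ab1b74a35c: the is_lead_byte and is_trail_byte macros are mentioned without their definitions. The sketch below assumes they test the standard UTF-8 bit patterns: a lead byte of a multi-byte sequence starts with 110, 1110, or 11110, and a trail (continuation) byte starts with 10. The macro bodies and the count_code_points helper are illustrative assumptions, not copied from the repository.

```cpp
#include <cstddef>
#include <string>

// Assumed definitions based on the UTF-8 encoding rules.
// Lead byte of a 2-, 3-, or 4-byte sequence: 110xxxxx, 1110xxxx, 11110xxx.
#define is_lead_byte(c) \
    (((c) & 0xe0) == 0xc0 || ((c) & 0xf0) == 0xe0 || ((c) & 0xf8) == 0xf0)
// Continuation byte: 10xxxxxx.
#define is_trail_byte(c) (((c) & 0xc0) == 0x80)

// Hypothetical helper: counts code points by skipping continuation bytes.
// Assumes the input is valid UTF-8.
std::size_t count_code_points(const std::string &utf8)
{
    std::size_t count = 0;
    for (unsigned char c : utf8) {
        if (!is_trail_byte(c))
            ++count;
    }
    return count;
}
```

Checks like these are what allow text to be split or trimmed without cutting a multi-byte character in half, which is the kind of problem the character-splitting fixes elsewhere in this log refer to.
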
Roy Shilkrot
65da380f9f
Bump whisper, clblast, add buffered output (#90)
* Bump whisper, clblast, add buffered output

* Update CPU_OR_CUDA environment variable error messages

* Update Cublas validation in Package-Windows.ps1 and initialize function in captions-thread.h

* Update Cublas validation and fix typo in Package-Windows.ps1

* Update default whisper model path to Whisper Tiny English (74Mb)

* Update translation strings for multiple locales
2024-04-18 10:28:32 -04:00
Roy Shilkrot
f79571f316
Add Silero VAD (#85)
* Add Silero VAD model and integrate it into the transcription filter

* Fix Silero VAD model path and enable n_threads

* Update translation strings for multiple locales

* Update Onnxruntime library linking and fix compiler warning

* Fix variable naming and type casting in Silero VAD implementation

* Update Silero VAD model path and enable n_threads
2024-04-13 22:39:28 -04:00
Roy Shilkrot
069cba1c7a Update translation strings for multiple locales 2024-04-02 00:01:01 -04:00
Roy Shilkrot
a569da2ed3
Built-in Translation (#79)
* Add translation feature and dependencies

* Add model-infos.cpp and translate_add_context to en-US.ini

* Fix formatting and whitespace issues

* Update build plugin and version, fix translation and whisper-utils

* Fix compiler warning and simplify code in transcription-filter.cpp

* Update CMakePresets.json and buildspec.json

* Fix Clang compiler warnings

* Enable QT in CMakePresets.json

* Fix compiler warnings and create missing config folder

* Fix formatting of is_lead_byte and is_trail_byte macros
2024-04-01 14:37:31 -04:00
Roy Shilkrot
0c7d7234af
Update CUDA support and model versions (#78) 2024-03-24 21:23:06 -04:00
Roy Shilkrot
8b4471fad4
Update save_srt option and add truncate_output_file option (#64)
* Update save_srt option and add truncate_output_file option

* Refactor code for readability and maintainability

* Update clang-format version to 16.0.5

* Update .clang-format and model-downloader-ui.cpp

* Fix is_lead_byte and is_trail_byte macros
2024-01-25 11:44:05 -05:00
Roy Shilkrot
b45b235ad6
Bump whisper.cpp. Simple settings mode (#60)
* bump whispercpp, simple settings mode

* lint
2023-12-21 11:08:36 -05:00
Roy Shilkrot
465193a12b adding min sub duration 2023-10-29 00:26:36 -04:00
Roy Shilkrot
82a473fabb add file rename 2023-10-12 10:45:34 -04:00
Roy Shilkrot
9299e7592e srt saving 2023-10-07 13:46:58 -04:00
Roy Shilkrot
1b6da3c0f9 Locale and translations 2023-10-04 22:12:17 -04:00
Roy Shilkrot
b14ba3e93f attempt fix 2023-08-20 01:25:19 +03:00
Roy Shilkrot
7023ec5152 initial 2023-08-12 23:51:51 +03:00
Roy Shilkrot
ad7cb94c55
Initial commit 2023-08-10 22:05:20 +03:00