Commit Graph

162 Commits

Author SHA1 Message Date
Roy Shilkrot
5227a437b6
VAD based segmentation (#97)
* refactor: Add whisper_buffer to transcription_filter_data struct

* refactor: Add sentence_psum_accept_thresh to transcription_filter_data struct

* refactor: Update buffer size and overlap size in whisper-processing.cpp

* refactor: Update buffer size and overlap size in whisper-processing.cpp

* refactor: Add audio-file-utils.cpp for audio file handling

* refactor: Update buffer size and overlap size in whisper-processing.cpp

* refactor: Add external model option to translation settings

* refactor: Add support for input tokenization style in translation settings

* refactor: Update buffer size and overlap size in whisper-processing.cpp
2024-05-16 15:07:00 -04:00
Roy Shilkrot
9c45376d7a Update version to 0.2.6 in buildspec.json 2024-05-11 08:46:50 -04:00
Roy Shilkrot
31c41a9574
Offline transcription accuracy tests (#96)
* Update translation-utils.h, transcription-filter.h, whisper-model-utils.h, model-find-utils.h, and model-downloader.h

* Update create_context function to include ct2ModelFolder parameter

* fix: add fix_utf8 flag to transcription_filter_data struct

* Update create_context function to include ct2ModelFolder parameter

* Update read_text_from_file function to include join_sentences parameter

* fix: Update VadIterator::reset_states to include reset_hc parameter

* Update create_context function to include whisper_sampling_method parameter

* Update tests README with additional configuration options

* feat: Add function to find file in folder by regex expression

* refactor: Improve text conditioning logic in transcription-filter.cpp

* refactor: Improve text conditioning logic in transcription-filter.cpp

* chore: Update ctranslate2 dependency to version 1.2.0

* refactor: Improve text conditioning logic in transcription-filter.cpp

* chore: Update cmake BuildCTranslate2.cmake to disable -Wno-comma warning

* refactor: Update translation context in whisper-processing.cpp and translation-utils.cpp
2024-05-10 17:37:09 -04:00
Roy Shilkrot
2e83300fbb
Update buffer size and overlap size in whisper-processing.h and defau… (#95)
* Update buffer size and overlap size in whisper-processing.h and default buffer size in msec in transcription-filter.cpp

* Update audio processing timestamp calculation in whisper-processing.cpp

* Update OBS plugin installation instructions for Linux

* Fix typo in update_whisper_model function name
2024-05-02 01:03:06 -04:00
Roy Shilkrot
493ecad254
Update CTranslate2 and cpu_features dependencies (#94)
* Update CTranslate2 and cpu_features dependencies

* Update CTranslate2 and cpu_features dependencies

* Update dependencies and fix special tokens handling

* Add BUILD_BYPRODUCTS to CMake build command

* Update version to 0.2.5 in buildspec.json
2024-04-30 09:48:23 -04:00
Roy Shilkrot
3b955e3031
Fix special tokens (#93)
* Update version to 0.2.4 in buildspec.json

* Update special token handling in whisper-processing.cpp

* Update special token handling in whisper-processing.cpp
2024-04-26 15:34:18 -04:00
Roy Shilkrot
f36f6ec96c Update version to 0.2.3 in buildspec.json 2024-04-25 17:15:14 -04:00
Roy Shilkrot
ab1b74a35c
Overlap analysis (#92)
* Update buffer size and overlap size in whisper-processing.h and default buffer size in msec in transcription-filter.cpp

* Update buffer size and overlap size in whisper-processing.h and default buffer size in msec in transcription-filter.cpp

* Update suppress_sentences in en-US.ini and transcription-filter-data.h

* Update suppress_sentences and fix whitespace in transcription-filter-data.h, whisper-processing.h, transcription-utils.cpp, and transcription-filter.h

* Update whisper-processing.cpp and whisper-utils.cpp files

* Update findStartOfOverlap function signature to use int instead of size_t

* Update Whispercpp_Build_GIT_TAG to use commit 7395c70a748753e3800b63e3422a2b558a097c80 in BuildWhispercpp.cmake

* Update buffer size and overlap size in whisper-processing.h and default buffer size in msec in transcription-filter.cpp

* Update unused parameter in transcription-filter-properties function

* Update log level and add suppress_sentences feature in transcription-filter.cpp and whisper-processing.cpp

* Add translation output feature in en-US.ini and transcription-filter-data.h

* Add DTW token timestamps and buffered output feature

* trigger rebuild

* Refactor remove_leading_trailing_nonalpha function to improve readability and performance

* Refactor is_lead_byte and is_trail_byte macros for improved readability and maintainability

* Refactor is_lead_byte and is_trail_byte macros for improved readability and maintainability

* trigger build
2024-04-25 17:14:13 -04:00
Roy Shilkrot
65da380f9f
Bump whisper, clblast, add buffered output (#90)
* Bump whisper, clblast, add buffered output

* Update CPU_OR_CUDA environment variable error messages

* Update Cublas validation in Package-Windows.ps1 and initialize function in captions-thread.h

* Update Cublas validation and fix typo in Package-Windows.ps1

* Update default whisper model path to Whisper Tiny English (74Mb)

* Update translation strings for multiple locales
2024-04-18 10:28:32 -04:00
Kaito Udagawa
e5a10f48cc
Fix add_custom_command to accept the argument with paren (#88)
* Update FetchOnnxruntime.cmake

* Update FetchOnnxruntime.cmake
2024-04-15 21:38:46 -04:00
Kaito Udagawa
f4307168de
Update build scripts according to the latest obs-plugintemplate (#87)
* Update build-project.yaml

* Update action.yaml

* Update helpers_common.cmake

* Update compilerconfig.cmake

* Update .clang-format

* Fix

* Fix

* Update build-project.yaml

* Update check-format.yaml

* Update push.yaml

* Update build-project.yaml
2024-04-15 08:19:40 -04:00
Roy Shilkrot
f79571f316
Add Silero VAD (#85)
* Add Silero VAD model and integrate it into the transcription filter

* Fix Silero VAD model path and enable n_threads

* Update translation strings for multiple locales

* Update Onnxruntime library linking and fix compiler warning

* Fix variable naming and type casting in Silero VAD implementation

* Update Silero VAD model path and enable n_threads
2024-04-13 22:39:28 -04:00
Roy Shilkrot
069cba1c7a Update translation strings for multiple locales 2024-04-02 00:01:01 -04:00
Roy Shilkrot
3afe7670fe Readme update 2024-04-01 22:24:32 -04:00
Roy Shilkrot
4638ce80fe
Remove Cublas input from build script (#80)
* Remove Cublas input from build script

* Remove CUDA Toolkit installation and curl submodule
2024-04-01 21:59:37 -04:00
Roy Shilkrot
a569da2ed3
Built-in Translation (#79)
* Add translation feature and dependencies

* Add model-infos.cpp and translate_add_context to en-US.ini

* Fix formatting and whitespace issues

* Update build plugin and version, fix translation and whisper-utils

* Fix compiler warning and simplify code in transcription-filter.cpp

* Update CMakePresets.json and buildspec.json

* Fix Clang compiler warnings

* Enable QT in CMakePresets.json

* Fix compiler warnings and create missing config folder

* Fix formatting of is_lead_byte and is_trail_byte macros
2024-04-01 14:37:31 -04:00
Roy Shilkrot
0c7d7234af
Update CUDA support and model versions (#78) 2024-03-24 21:23:06 -04:00
Roy Shilkrot
6791e5a5d3 Update build variants in push.yaml 2024-03-22 17:01:56 -04:00
Roy Shilkrot
17ffcfc2c1
Enable MacOS ARM64 and Windows CUDA builds (#76)
* Enable CoreML and allow fallback to CPU on MacOS ARM64

* Disable CoreML support on MacOS ARM64

* Fix build configuration for MacOS

* Update macOS build configuration based on MACOS_ARCH environment variable

* Update BuildWhispercpp.cmake to disable FMA instructions on non-Apple platforms

* Add cuBLAS support to build and package actions

* Update Cublas versions in Windows build and packaging scripts

* Update CUDA_TOOLKIT_ROOT_DIR environment variable

* Add sub-packages and non-cuda-sub-packages options to CUDA toolkit setup

* Update CUDA sub-packages in build-project.yaml

* Add "visual_studio_integration" to sub-packages in CUDA build workflow

* Fix typo in build-project.yaml

* Fix typo in CUDA build method

* Update sub-packages in CUDA toolkit installation

* Remove unnecessary CUDA sub-packages and method
2024-03-22 13:33:07 -04:00
Roy Shilkrot
8b1ab7cfed
Update README.md 2024-03-21 16:29:49 -04:00
Roy Shilkrot
8e35a192b8
Update README.md 2024-03-19 17:43:23 -04:00
Roy Shilkrot
7a1a6f8d69 Bump libcurl 2024-03-18 00:06:16 -04:00
Roy Shilkrot
db235eb19c Update version number in buildspec.json 2024-03-17 18:51:56 -04:00
Roy Shilkrot
8fe7da6d42
Fix Max Channels, Update macOS brew command and fix compiler warnings (#75) 2024-03-17 13:16:01 -04:00
Roy Shilkrot
240f7dd817
Variable buffer size options (#66)
* Update buffer size and overlap size handling

* Refactor buffer size calculation and formatting in transcription filter

This commit refactors the buffer size calculation in the transcription filter code to improve readability and maintainability. The code now uses a more concise and formatted approach to calculate the buffer size in milliseconds. Additionally, the commit also improves the formatting and readability of the code in the whisper-processing file. These changes enhance the overall code quality and maintainability.
2024-03-08 10:25:58 -05:00
Roy Shilkrot
4c15b9514c
Update Whispercpp_Build_GIT_TAG in BuildWhispercpp.cmake (#72) 2024-03-08 10:25:03 -05:00
Roy Shilkrot
e7a4823b8f
Update README.md 2024-01-25 21:47:28 -05:00
Roy Shilkrot
d8f64971c2
Update version and Whispercpp build tag (#65) 2024-01-25 11:59:05 -05:00
Roy Shilkrot
8b4471fad4
Update save_srt option and add truncate_output_file option (#64)
* Update save_srt option and add truncate_output_file option

* Refactor code for readability and maintainability

* Update clang-format version to 16.0.5

* Update .clang-format and model-downloader-ui.cpp

* Fix is_lead_byte and is_trail_byte macros
2024-01-25 11:44:05 -05:00
Roy Shilkrot
b45b235ad6
Bump whisper.cpp. Simple settings mode (#60)
* bump whispercpp, simple settings mode

* lint
2023-12-21 11:08:36 -05:00
Roy Shilkrot
8c02e0c3fc
Fix CUDA build, shuffle whisper files around (#58)
* fix CUDA build, shuffle whisper files around

* lint
2023-11-20 09:18:06 -05:00
Roy Shilkrot
33b9756624
Update README.md 2023-11-15 22:55:24 -05:00
Roy Shilkrot
5971b8bfa1 lint 2023-11-15 22:19:11 -05:00
Roy Shilkrot
677c08c672 roll back to faster whispercpp ver 2023-11-15 22:17:38 -05:00
Roy Shilkrot
1d80602bbe
Bump whispercpp, fix mac build (#56) 2023-11-15 18:49:25 -05:00
Roy Shilkrot
ba8bd4dbaf
Fix destroy crash (#55) 2023-11-15 17:42:09 -05:00
Roy Shilkrot
9920fda792
Merge pull request #54 from occ-ai/roy.add_fpic_to_plugin_support_linux
Add -fPIC to plugin-support on linux
2023-11-13 17:01:21 -05:00
Roy Shilkrot
ec65ffbbf7 cmake-format 2023-11-13 16:40:09 -05:00
Roy Shilkrot
dcfaddeedb add fpic to plugin-support on linux 2023-11-13 16:33:57 -05:00
Roy Shilkrot
810e8555b3
Merge pull request #51 from occ-ai/roy.bump_whisper_cpp_ver
Bump whisper version for 25% speed gains
2023-11-04 22:52:12 -04:00
Roy Shilkrot
ea34206400 bump whisper 2023-11-04 22:30:23 -04:00
Roy Shilkrot
1db40e341c bump v0.0.6 2023-11-03 23:29:21 -04:00
Roy Shilkrot
bdb416d47f
Merge pull request #47 from occ-ai/roy.fix_win32_unicode_model_path
Fix windows Unicode model path and characters display
2023-11-03 23:26:51 -04:00
Roy Shilkrot
6441245b65 Merge remote-tracking branch 'origin/master' into roy.fix_win32_unicode_model_path 2023-11-03 09:38:42 -04:00
Roy Shilkrot
292cf5b7ee lint 2023-11-03 09:27:08 -04:00
Roy Shilkrot
3273a79b98 fix characters 2023-11-03 09:25:59 -04:00
Roy Shilkrot
8d924d0cb1
Update README.md 2023-10-31 09:45:46 -04:00
Roy Shilkrot
cb151ea71c
Merge pull request #46 from obs-ai/roy.min_sub_duration
Adding min sub render duration
2023-10-29 02:48:55 -04:00
Roy Shilkrot
de00201c95 guard windows.h 2023-10-29 01:50:27 -04:00
Roy Shilkrot
ad182f4593 lint 2023-10-29 01:45:50 -04:00