From 0a90316625317b99d954ada4926880f567e9988b Mon Sep 17 00:00:00 2001
From: Dimitrii Voronin <36505480+adamnsandle@users.noreply.github.com>
Date: Fri, 17 Dec 2021 17:13:33 +0200
Subject: [PATCH 1/4] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 0ae9751..47068ee 100644
--- a/README.md
+++ b/README.md
@@ -47,7 +47,7 @@ https://user-images.githubusercontent.com/36505480/144874384-95f80f6d-a4f1-42cc-
 
 - **Flexible sampling rate**
 
-  Silero VAD [supports](https://github.com/snakers4/silero-vad/wiki/Quality-Metrics#sample-rate-comparison) **8000 Hz** and **16000 Hz** [sampling rates](https://en.wikipedia.org/wiki/Sampling_(signal_processing)#Sampling_rate).
+  Silero VAD [supports](https://github.com/snakers4/silero-vad/wiki/Quality-Metrics#sample-rate-comparison) **8000 Hz** and **16000 Hz** (JIT) and **16000 Hz** (ONNX) [sampling rates](https://en.wikipedia.org/wiki/Sampling_(signal_processing)#Sampling_rate).
 
 - **Flexible chunk size**
 

From 011268e492e78c0bbf75fab2d684274324312f26 Mon Sep 17 00:00:00 2001
From: Alexander Veysov
Date: Fri, 17 Dec 2021 22:00:36 +0300
Subject: [PATCH 2/4] Polish the copy a bit

---
 README.md | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 47068ee..b186e31 100644
--- a/README.md
+++ b/README.md
@@ -29,13 +29,13 @@ https://user-images.githubusercontent.com/36505480/144874384-95f80f6d-a4f1-42cc-

Key Features


-- **High accuracy**
+- **Stellar accuracy**
 
   Silero VAD has [excellent results](https://github.com/snakers4/silero-vad/wiki/Quality-Metrics#vs-other-available-solutions) on speech detection tasks.
 
 - **Fast**
 
-  One audio chunk (30+ ms) [takes](https://github.com/snakers4/silero-vad/wiki/Performance-Metrics#silero-vad-performance-metrics) around **1ms** to be processed on a single CPU thread. Using batching or GPU can also improve performance considerably.
+  One audio chunk (30+ ms) [takes](https://github.com/snakers4/silero-vad/wiki/Performance-Metrics#silero-vad-performance-metrics) around **1ms** to be processed on a single CPU thread. Using batching or GPU can also improve performance considerably. Under certain conditions ONNX may even run up to 2-3x faster.
 
 - **Lightweight**
 
@@ -47,12 +47,16 @@ https://user-images.githubusercontent.com/36505480/144874384-95f80f6d-a4f1-42cc-
 
 - **Flexible sampling rate**
 
-  Silero VAD [supports](https://github.com/snakers4/silero-vad/wiki/Quality-Metrics#sample-rate-comparison) **8000 Hz** and **16000 Hz** (JIT) and **16000 Hz** (ONNX) [sampling rates](https://en.wikipedia.org/wiki/Sampling_(signal_processing)#Sampling_rate).
+  Silero VAD [supports](https://github.com/snakers4/silero-vad/wiki/Quality-Metrics#sample-rate-comparison) **8000 Hz** and **16000 Hz** (PyTorch JIT) and **16000 Hz** (ONNX) [sampling rates](https://en.wikipedia.org/wiki/Sampling_(signal_processing)#Sampling_rate).
 
 - **Flexible chunk size**
 
   Model was trained on audio chunks of different lengths. **30 ms**, **60 ms** and **100 ms** long chunks are supported directly, others may work as well.
 
+- **Highly Portable**
+
+  Silero VAD reaps benefits from the rich ecosystems built around **PyTorch** and **ONNX** running everywhere where these runtimes are available.
+

Typical Use Cases


From 0d61e4cee10fd68a34bcfa5137aa7762b112fe51 Mon Sep 17 00:00:00 2001
From: Alexander Veysov
Date: Fri, 17 Dec 2021 22:03:49 +0300
Subject: [PATCH 3/4] Update README.md

---
 README.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/README.md b/README.md
index b186e31..a0418c8 100644
--- a/README.md
+++ b/README.md
@@ -57,6 +57,10 @@ https://user-images.githubusercontent.com/36505480/144874384-95f80f6d-a4f1-42cc-
 
   Silero VAD reaps benefits from the rich ecosystems built around **PyTorch** and **ONNX** running everywhere where these runtimes are available.
 
+- **No Strings Attached**
+
+  Published under permissive license (MIT) Silero VAD has zero strings attached - no telemetry, no keys, no registration, no built-in expiration, no keys or vendor lock.
+

Typical Use Cases


From f40cc128a45d5eb549f3df5db015e1ec2e2fc0eb Mon Sep 17 00:00:00 2001
From: Alexander Veysov
Date: Tue, 21 Dec 2021 08:24:48 +0300
Subject: [PATCH 4/4] Update utils_vad.py

---
 utils_vad.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/utils_vad.py b/utils_vad.py
index 29b69c3..8a8b5a5 100644
--- a/utils_vad.py
+++ b/utils_vad.py
@@ -178,7 +178,7 @@ def get_speech_timestamps(audio: torch.Tensor,
             raise ValueError("More than one dimension in audio. Are you trying to process audio with 2 channels?")
 
     if sampling_rate == 8000 and window_size_samples > 768:
-        warnings.warn('window_size_samples is too big for 8000 sampling_rate! Better set window_size_samples to 256, 512 or 1536 for 8000 sample rate!')
+        warnings.warn('window_size_samples is too big for 8000 sampling_rate! Better set window_size_samples to 256, 512 or 768 for 8000 sample rate!')
 
     if window_size_samples not in [256, 512, 768, 1024, 1536]:
         warnings.warn('Unusual window_size_samples! Supported window_size_samples:\n - [512, 1024, 1536] for 16000 sampling_rate\n - [256, 512, 768] for 8000 sampling_rate')
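
Review note on PATCH 4/4: the corrected message now agrees with the second warning, which lists [512, 1024, 1536] for 16000 Hz and [256, 512, 768] for 8000 Hz. As a rough illustration of how the two checks could be expressed against a single table, here is a minimal sketch. The names `SUPPORTED_WINDOW_SIZES` and `check_window_size` are hypothetical and are not part of `utils_vad.py`; only the numeric values come from the warning text in the patch.

```python
import warnings

# Window sizes per sampling rate, copied from the warning text in the patch.
# The table and the helper below are illustrative, not the library's API.
SUPPORTED_WINDOW_SIZES = {
    16000: (512, 1024, 1536),
    8000: (256, 512, 768),
}


def check_window_size(sampling_rate: int, window_size_samples: int) -> bool:
    """Return True if the window size is listed for the sampling rate.

    Emits a warning and returns False otherwise.
    """
    supported = SUPPORTED_WINDOW_SIZES.get(sampling_rate)
    if supported is None:
        warnings.warn(f"No window sizes listed for sampling_rate {sampling_rate}")
        return False
    if window_size_samples not in supported:
        warnings.warn(
            f"Unusual window_size_samples {window_size_samples} for "
            f"{sampling_rate} sampling_rate; listed sizes: {list(supported)}"
        )
        return False
    return True
```

With a single table, the message text is generated from the same values the check uses, so a mismatch like the one fixed in this patch cannot recur. Note that this sketch is stricter than the patched code, which accepts any of the five sizes regardless of sampling rate and only warns separately when a size above 768 is used at 8000 Hz.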