mirror of
https://github.com/snakers4/silero-vad.git
synced 2026-02-05 18:09:22 +08:00
Merge branch 'master' of github.com:snakers4/silero-vad
29
README.md
@@ -4,7 +4,7 @@
 
 [](https://colab.research.google.com/github/snakers4/silero-vad/blob/master/silero-vad.ipynb)
 
 
 
 - [Silero VAD](#silero-vad)
 - [Getting Started](#getting-started)
@@ -23,39 +23,42 @@
 `Single Image Why our VAD is better than WebRTC`
 
 Silero VAD: pre-trained enterprise-grade Voice Activity Detector (VAD), Number Detector and Language Classifier.
-Enterprise-grade Speech Products made refreshingly simple (all see our [STT](https://github.com/snakers4/silero-models)).
+Enterprise-grade Speech Products made refreshingly simple (see our [STT](https://github.com/snakers4/silero-models) models).
 
 Currently, there are hardly any high quality / modern / free / public voice activity detectors except for WebRTC Voice Activity Detector ([link](https://github.com/wiseman/py-webrtcvad)).
 
 Also in enterprise it is crucial to be able to anonymize large-scale spoken corpora (i.e. remove personal data). Typically personal data is considered to be private / sensitive if it contains (i) a name (ii) some private ID. Name recognition is highly subjective and would depend on locale and business case, but Voice Activity and Number detections are quite general tasks.
 
-**Key advantages / features:**
+**Key features:**
 
 - Modern, portable;
-- Small memory footprint;
-- Trained on huge spoken corpora and noise / sound libraries;
-- Slower than WebRTC, but sufficiently fast for IOT / edge / mobile applications;
+- Lowe memory footprint;
 - Superior metrics to WebRTC;
+- Trained on huge spoken corpora and noise / sound libraries;
+- Slower than WebRTC, but fast enough for IOT / edge / mobile applications;
 
 **Typical use cases:**
 
 - Spoken corpora anonymization;
-- Voice detection for IOT / edge / mobile use cases;
+- Voice activity detection for IOT / edge / mobile use cases;
 - Data cleaning and preparation, number and voice detection in general;
 
 ## Getting Started
 
 The models are small enough to be included directly into this repository. Newer models will supersede older models directly.
 
-Currently we provide the following models:
+Currently we provide the following functionality:
 
-| | Released |PyTorch | ONNX | VAD | Number Detector | Language Classifier | Languages | Colab |
-|----|------------|-------------------|--------------------|---------------------| --------------------|---------------------|-------------------------|-------|
-| v1 | 2020-12-15 |:heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | | | `ru`, `en`, `de`, `es` | [](https://colab.research.google.com/github/snakers4/silero-vad/blob/master/silero-vad.ipynb) |
+| PyTorch | ONNX | VAD | Number Detector | Language Clf | Languages | Colab |
+|-------------------|--------------------|---------------------|-----------------|--------------|------------------------|-------|
+| :heavy_check_mark:| :heavy_check_mark: | :heavy_check_mark: | | | `ru`, `en`, `de`, `es` | [](https://colab.research.google.com/github/snakers4/silero-vad/blob/master/silero-vad.ipynb) |
 
-Version history:
+**Version history:**
 
-- v1, 2020-12-15, initial release, no Number Detector or Language Classifier heads yet;
+| Version | Date | Comment |
+|---------|-------------|---------------------------------------------------|
+| `v1` | 2020-12-15 | initial release |
+| `v2` | coming soon | Add Number Detector or Language Classifier heads |
 
 ### PyTorch
 