Dimitrii Voronin 8c1ae73ee7 Update README.md
2021-12-07 12:01:07 +02:00

License: CC BY-NC 4.0


Silero VAD


Silero VAD is a pre-trained, enterprise-grade Voice Activity Detector (also see our STT models).


Real Time Example

https://user-images.githubusercontent.com/36505480/144874384-95f80f6d-a4f1-42cc-9be7-004c891dd481.mp4


Key Features


  • High accuracy

    Silero VAD achieves excellent results for speech detection in streaming tasks.

  • Fast

    Processing one audio chunk (30+ ms) takes 1 ms on a single CPU thread. Batching and/or running on a GPU can greatly speed up inference in production tasks.

  • Lightweight

    JIT model size is less than one megabyte.

  • Generalized

    Silero VAD was trained on large corpora covering over 100 languages and performs well on audio with varying levels of background noise.

  • Variable sampling rate

    Silero VAD supports sampling rates of 8000 Hz and 16000 Hz.

  • Variable chunk size

    The model was trained on audio chunks of variable length. Chunks of 30 ms, 60 ms and 100 ms are supported directly; other lengths may perform well too.
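
The sampling rates, chunk lengths and timing figure listed above translate into concrete buffer sizes and a real-time factor. The following is a minimal arithmetic sketch only; the helper names `chunk_size_in_samples` and `real_time_factor` are hypothetical and are not part of the Silero VAD package.

```python
# Sampling rates and chunk durations taken from the feature list above.
SAMPLING_RATES_HZ = (8000, 16000)
CHUNK_DURATIONS_MS = (30, 60, 100)


def chunk_size_in_samples(sampling_rate_hz: int, chunk_ms: int) -> int:
    """Number of audio samples in a chunk of the given duration."""
    return sampling_rate_hz * chunk_ms // 1000


def real_time_factor(processing_ms: float, chunk_ms: float) -> float:
    """Processing time divided by audio duration (lower means faster)."""
    return processing_ms / chunk_ms


if __name__ == "__main__":
    for rate in SAMPLING_RATES_HZ:
        for ms in CHUNK_DURATIONS_MS:
            samples = chunk_size_in_samples(rate, ms)
            print(f"{rate} Hz, {ms} ms -> {samples} samples")
    # 1 ms of processing per 30 ms chunk, as quoted under "Fast".
    print(f"real-time factor: {real_time_factor(1.0, 30.0):.3f}")
```

For instance, a 30 ms chunk at 16000 Hz holds 480 samples, and 1 ms of processing per 30 ms chunk corresponds to a real-time factor of roughly 0.033 on a single CPU thread.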


Typical Use Cases


  • Voice activity detection for IoT / edge / mobile use cases
  • Data cleaning and preparation, voice detection in general
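
For the data cleaning use case, per-chunk voice decisions usually need to be merged into contiguous speech segments before silence can be trimmed. The sketch below assumes a list of per-chunk speech probabilities is already available; the function `speech_segments`, the 0.5 threshold and the 30 ms chunk length are illustrative assumptions and do not reproduce any utility shipped with Silero VAD.

```python
from typing import List, Optional, Tuple


def speech_segments(
    probs: List[float],
    chunk_ms: int = 30,       # assumed chunk duration
    threshold: float = 0.5,   # assumed decision threshold
) -> List[Tuple[int, int]]:
    """Merge consecutive speech chunks into (start_ms, end_ms) segments."""
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, p in enumerate(probs):
        is_speech = p >= threshold
        if is_speech and start is None:
            # A new speech segment begins at this chunk.
            start = i * chunk_ms
        elif not is_speech and start is not None:
            # The current segment ends where this non-speech chunk begins.
            segments.append((start, i * chunk_ms))
            start = None
    if start is not None:
        # The audio ended while speech was still active.
        segments.append((start, len(probs) * chunk_ms))
    return segments


if __name__ == "__main__":
    # Hypothetical probabilities for five consecutive 30 ms chunks.
    print(speech_segments([0.1, 0.9, 0.8, 0.2, 0.7]))
```

A production pipeline would likely add padding around segments and ignore very short pauses; those refinements are left out to keep the sketch small.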

Links



Get In Touch


Try our models, create an issue, start a discussion, join our Telegram chat, email us, or read our news.

Please see our wiki and tiers for relevant information and email us directly.

Citations

@misc{Silero_VAD,
  author = {Silero Team},
  title = {Silero VAD: pre-trained enterprise-grade Voice Activity Detector (VAD), Number Detector and Language Classifier},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/snakers4/silero-vad}},
  commit = {insert_some_commit_here},
  email = {hello@silero.ai}
}