In the last ten years, Deep Learning has revolutionized numerous fields: natural language processing, image classification, automatic translation, and more. The list grows longer each day. Speech recognition has also fallen under its spell, and DeepSpeech is set to be part of this revolution.
DeepSpeech is an open-source speech recognition engine we are working on. It is based on Baidu’s research and uses the TensorFlow machine learning framework. It’s currently in early development. If you are interested in contributing, fork our code!
The major problem in open-source speech recognition is not algorithms but data. There is simply not enough openly available speech data! So we have decided to change that. Introducing Murmur.
Murmur is a simple web app for collecting speech samples to train speech recognition engines. With Murmur we will gradually build a speech corpus to train our open-source models. If you are interested in contributing, fork our code!
While the majority of Deep Learning speech recognition work has focused on technologies that are rather wasteful of memory and CPU cycles, a counterbalance is developing that focuses on small-footprint devices. Pipsqueak is part of this new trend in Deep Learning speech recognition.
The goal of Pipsqueak is to implement the end-to-end deep learning speech recognition engine of McGraw et al. and to integrate this engine into Vaani. This will allow Vaani to work completely offline while still providing the high-quality speech recognition we have become used to. If you are interested in contributing, let us know!