speechcontrol-devel team mailing list archive
Message #00023
Re: Calling Developers to Their Stations
I forgot to mention the new sonic home page:
http://vinux-project.org/sonic
The adoption rate seems quite strong. It's going into TTS engines,
audiobook applications, and devices for the blind.
Bill
On Wed, Jan 12, 2011 at 5:30 PM, Bill Cox <waywardgeek@xxxxxxxxx> wrote:
> On Wed, Jan 12, 2011 at 3:37 PM, Jacky Alcine <jackyalcine@xxxxxxxxx> wrote:
>> I need pedro3005, webrsk, waywardgeek, m0hi, and bedahr to take part in
>> either the python-openmary project or the speechcontrol-daemon
>> project.
>
> Hi, Jacky. I personally feel the weak link in speech control is the
> non-distributable nature of some speech recognition code, and the lack
> of productization in Sphinx. I may be wrong, but I believe I can
> write a high-quality speech recognition engine that could make a
> huge difference to open-source speech control. If you don't mind, I'd
> like to continue with this work. To date, it's resulted in libsonic
> for speeding up speech with low distortion. The next big step will be
> isolated word recognition. I've done a ton of work on cleaning up
> spectrograms, and I believe I have the best algorithms anywhere, other
> than potential trade-secret algorithms. Check out my web page on
> generating spectrograms:
>
> http://vinux-project.org/time-aliased-hann/
>
> In addition to improved spectrograms, I believe I can write code to
> fairly accurately annotate the speech stream with voice events:
> glottal open, plosive open, stops, fricative begin/end, etc. I think
> I can combine evidence from both the time domain and frequency domain
> to determine what kind of fricatives and plosives are present in the
> sound stream. I'm hopeful that the combination of improved spectral
> analysis and time domain analysis will yield better results than we've
> seen in any system to date.
>
> In short, I'd like to keep working full steam ahead on this. I can do
> debian packaging and such, but I'd like my big project to be the
> speech recognition engine.
>
> Bill
>