speechcontrol-devel team mailing list archive
Message #00022
Re: Calling Developers to Their Stations
On Wed, Jan 12, 2011 at 3:37 PM, Jacky Alcine <jackyalcine@xxxxxxxxx> wrote:
> I need for pedro3005, webrsk, waywardgeek, m0hi, and bedahr to either take
> participation in the python-openmary project or the speechcontrol-daemon
> project.
Hi, Jacky. I personally feel the weak link in speech control is the
non-distributable nature of some speech recognition code, and the lack
of productization in Sphinx. I may be wrong, but I believe I can
write a high-quality speech recognition engine that could make a
huge difference to open-source speech control. If you don't mind, I'd
like to continue with this work. To date, it's resulted in libsonic
for speeding up speech with low distortion. The next big step will be
isolated word recognition. I've done a ton of work on cleaning up
spectrograms, and I believe I have the best algorithms anywhere, other
than potential trade-secret algorithms. Check out my web page on
generating spectrograms:
http://vinux-project.org/time-aliased-hann/
In addition to improved spectrograms, I believe I can write code to
fairly accurately annotate the speech stream with voice events:
glottal open, plosive open, stops, fricative begin/end, etc. I think
I can combine evidence from both the time domain and frequency domain
to determine what kind of fricatives and plosives are present in the
sound stream. I'm hopeful that the combination of improved spectral
analysis and time-domain analysis will yield better results than we've
seen in any system to date.
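As a rough illustration of combining time-domain and frequency-domain
evidence, here is a minimal sketch. It is not the engine's code and it
does not use the time-aliased method from the page linked above. It
uses a plain Hann-windowed DFT, and every function name and threshold
in it is a hypothetical choice made for the example. It labels one
frame as silence, voiced-like, or fricative-like from its RMS energy
and zero-crossing rate (time domain) and from the share of spectral
power in the upper half of the band (frequency domain).

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..n/2 of a Hann-windowed frame.

    Uses a naive O(n^2) DFT to stay dependency-free; a real system
    would use an FFT.
    """
    n = len(frame)
    windowed = [s * w for s, w in zip(frame, hann(n))]
    return [
        abs(sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def rms(frame):
    """Time-domain evidence: root-mean-square amplitude of the frame."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def zero_crossing_rate(frame):
    """Time-domain evidence: fraction of adjacent sample pairs that change sign."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def high_band_ratio(frame):
    """Frequency-domain evidence: share of power in the upper half of the bins."""
    power = [m * m for m in magnitude_spectrum(frame)]
    total = sum(power)
    if total == 0:
        return 0.0
    return sum(power[len(power) // 2:]) / total


def label_frame(frame, silence_rms=0.01, zcr_thresh=0.3, hf_thresh=0.5):
    """Label a frame using both kinds of evidence.

    All three thresholds are illustrative guesses, not tuned values.
    """
    if rms(frame) < silence_rms:
        return "silence"
    if zero_crossing_rate(frame) > zcr_thresh and high_band_ratio(frame) > hf_thresh:
        return "fricative-like"
    return "voiced-like"
```

Requiring both the zero-crossing rate and the high-band ratio to agree
before calling a frame fricative-like is the simplest way to combine
the two domains. Detecting plosives and glottal events would need
frame-to-frame changes rather than single-frame features.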
In short, I'd like to keep working full steam ahead on this. I can do
Debian packaging and such, but I'd like my big project to be the
speech recognition engine.
Bill