- If something is not documented well, it is probably tested well. If you can find the tests that cover it, you can usually get a good idea of how it works.
- There is no multithreading. I had to fake it by breaking up my loops with setTimeouts.
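For anyone wondering what "breaking up my loops with setTimeouts" looks like in practice, here is a minimal sketch. It is not the code from the project; `processInChunks`, `handleItem`, `onDone`, and `chunkSize` are hypothetical names made up for illustration. The idea is to run a slice of the loop, then hand control back to the event loop before running the next slice, so the page stays responsive.

```javascript
// Process `items` a chunk at a time, yielding to the event loop between chunks.
// All names here are hypothetical; this is a sketch, not the project's code.
function processInChunks(items, handleItem, onDone, chunkSize) {
  chunkSize = chunkSize || 100;
  var index = 0;

  function runChunk() {
    // Run at most `chunkSize` iterations synchronously.
    var end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) {
      handleItem(items[index], index);
    }
    if (index < items.length) {
      // More work left: schedule the next chunk and yield.
      setTimeout(runChunk, 0);
    } else {
      onDone();
    }
  }

  // The first chunk runs immediately; later chunks run from timers.
  runChunk();
}

// Usage: sum 1000 values in chunks of 250.
var total = 0;
var values = new Array(1000).fill(1);
processInChunks(values, function (n) { total += n; }, function () {
  console.log('sum:', total);
}, 250);
```

A smaller `chunkSize` keeps the UI smoother at the cost of more timer overhead, so it is worth tuning per workload.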
There is an IRC channel if you need a place to go for help. Also, feel free to PM.
In my time using offline speech-recognition tech, I was never able to use Julius properly (the docs were a little lacking), so I jumped over to CMUSphinx/pocketsphinx. But I think you've just done a huge amount of work to bring Julius out of obscurity (at least in my mind). Thanks very much.
[EDIT] - I really can't get across enough how awesome this is. Please add a Gittip link or Bitcoin address or something.
I also came across PocketSphinx, which has been around a little longer - that may have some users in the wild already, and I wouldn't be surprised if it was used for navigation somewhere.
[PocketSphinx] https://github.com/syl22-00/pocketsphinx.js/
Something like that...
I'll be sure to include them soon - it's probably just a few more lines of code, so you can expect them in the onrecognition function this afternoon.
> Each person's voice is different. Some sounds, like "s", sound about the same no matter who says them, but other sounds, like vowels, tend to differ a lot from person to person. We use a special way of representing sound, the cepstrum, that captures lots of information, including the characteristic way you pronounce your vowels. Of course, someone could imitate the way you talk; fortunately, the cepstrum also captures certain fundamental characteristics of voices that are impossible to change. For instance, the length of your vocal tract -- the place where sound is produced in your body -- cannot be changed, and different length vocal tracts tend to produce cepstra with different characteristics. By identifying both the way you talk, and the way your body produces sound, WhisperID can do a great job of figuring out who you are.
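To make "cepstrum" a bit more concrete: the textbook real cepstrum is the inverse Fourier transform of the log magnitude spectrum of a frame of audio. The sketch below computes that with a naive DFT. It is only an illustration of the general definition, not necessarily what WhisperID computes (speech systems often use variants such as mel-frequency cepstral coefficients), and the function names `dft` and `realCepstrum` are my own.

```javascript
// Naive O(n^2) discrete Fourier transform; fine for a short illustrative frame.
// Pass inverse = true for the inverse transform (includes the 1/n scaling).
function dft(re, im, inverse) {
  var n = re.length;
  var sign = inverse ? 1 : -1;
  var outRe = new Array(n);
  var outIm = new Array(n);
  for (var k = 0; k < n; k++) {
    var sumRe = 0;
    var sumIm = 0;
    for (var t = 0; t < n; t++) {
      var angle = sign * 2 * Math.PI * k * t / n;
      var c = Math.cos(angle);
      var s = Math.sin(angle);
      // Complex multiply: (re + i*im) * (c + i*s)
      sumRe += re[t] * c - im[t] * s;
      sumIm += re[t] * s + im[t] * c;
    }
    outRe[k] = inverse ? sumRe / n : sumRe;
    outIm[k] = inverse ? sumIm / n : sumIm;
  }
  return { re: outRe, im: outIm };
}

// Real cepstrum: inverse DFT of the log magnitude spectrum of the frame.
function realCepstrum(frame) {
  var zeros = frame.map(function () { return 0; });
  var spectrum = dft(frame, zeros, false);
  var logMag = spectrum.re.map(function (r, k) {
    var i = spectrum.im[k];
    var mag = Math.sqrt(r * r + i * i);
    return Math.log(Math.max(mag, 1e-12)); // floor avoids log(0)
  });
  return dft(logMag, zeros, true).re;
}
```

The low-index coefficients of the result describe the slowly varying spectral envelope, which is the part shaped by the vocal tract, and that is why cepstral features are a common choice for telling speakers apart.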
Quick question: the Julius website says there is no English acoustic model available [1]. How did you solve this? Do you provide a default acoustic model?
[1] http://julius.sourceforge.jp/en_index.php?q=en_grammar.html
> VoxForge was set up to collect transcribed speech for use with Free and Open Source Speech Recognition Engines (on Linux, Windows and Mac).
VoxForge's sample grammar is provided as a default, in its own folder [2]. It would be nice to get a high-quality acoustic model, as VoxForge's is not that comprehensive yet, but I couldn't find anything with the right licensing and zero cost. If anyone knows of one, I'd love to hear about it.
[1] http://www.voxforge.org/
[2] https://github.com/zzmp/juliusjs/tree/master/dist/voxforge
I said "hello" it said "DIAL OH OH".
I said "Apple" it said "GET KENT".
WTF?
Much thanks to @iffy for writing the first pass.
It uses VoxForge's sample vocabulary, so you'll need to say things like "Dial 1 2 3" or "Call Kenneth McDougall" for it to understand you, but the vocabulary is easily swapped out for your own projects, as explained in the README.
I would definitely give this JuliusJS library a try. I am actually amazed that JuliusJS doesn't carry all the heavy data like speak.js does (though speak.js supports multiple languages). I love the fact that you state it's 100% client side!