About VoiceCon
What is VoiceCon?
VoiceCon is the first serious attempt at a speech recognition system for RISC OS computers.
What do I need?
You need the following:
- RISC OS 3.5 or greater
- A sampler. The currently supported samplers are:
  - VTi Printer Port Sampler
  - VTi SoundByte recorder
  - Argo SoundByte Recorder
  - NorthWest SEMERC ReSound
  - Armadillo 448
  - Armadillo 448M
  - RiscStation 7500
- 4 MBytes of memory
- Hard disc drive
The recommended specifications are:
- StrongARM Risc PC
- 1+ MBytes of VRAM
- 8 MBytes of memory
If you have an old VTi Printer Port Sampler (without a microphone) or an Armadillo card, you will also need a microphone amplifier, which connects to the sampler unit.
What has it been tested on?
It has been tested and shown to work on:
- A7000, 8 MBytes RAM, 0 MBytes VRAM
- RPC700, 8 MBytes RAM, 0 MBytes VRAM
- 202MHz StrongARM Risc PC, 16 MBytes RAM, 2 MBytes VRAM
- 233MHz StrongARM Risc PC, 40 MBytes RAM, 2 MBytes VRAM
- RiscStation lite 7500+, 16 MBytes RAM, 0 MBytes VRAM
Since VoiceCon works on an A7000, a StrongARM Risc PC and a RiscStation 7500, it should work on any RISC OS 3.5+ computer.
How does it work?
That would be telling!
In short, VoiceCon stores three characteristics of each word recorded, each describing the "shape" of the word. These are the D, E and F characteristics.
VoiceCon has to learn how you say words - by default, a limited number of words. Once VoiceCon has learnt these words, whenever you say a word, VoiceCon compares the shape of that word with the shapes you taught it. If a suitable match is found, VoiceCon adds the word to the current sentence. This sentence is transmitted to all the other applications, so each application can decide what to do with the word.
The D characteristic is 32 bytes in size, and is used as a rough overview of the word.
The E characteristic is 128 bytes in size, and is used as a slightly better overview of the word.
The F characteristic is 512 bytes in size, and is used as the most accurate overview of the word.
The algorithm combines these three characteristics with suitable weights, and works out which word is the closest match.
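The following Python sketch illustrates the general idea of combining three characteristics with weights. It is not VoiceCon's actual algorithm, which is not published. Only the sizes (32, 128 and 512 bytes) come from the description above; the weights, the distance measure, the threshold and all function names are assumptions made for the illustration.

```python
# Characteristic sizes in bytes, as described above.
SIZES = {"D": 32, "E": 128, "F": 512}

# Hypothetical weights: the real values are not documented.
WEIGHTS = {"D": 1.0, "E": 2.0, "F": 4.0}


def distance(a: bytes, b: bytes) -> float:
    """Mean absolute difference between two byte strings (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("characteristics must be the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def score(spoken: dict, learnt: dict) -> float:
    """Weighted average of the D, E and F distances; lower is a closer match."""
    total_weight = sum(WEIGHTS.values())
    weighted = sum(WEIGHTS[k] * distance(spoken[k], learnt[k]) for k in SIZES)
    return weighted / total_weight


def best_match(spoken: dict, vocabulary: dict, threshold: float = 20.0):
    """Return the closest learnt word, or None if nothing is close enough."""
    best_word, best_score = None, float("inf")
    for word, learnt in vocabulary.items():
        s = score(spoken, learnt)
        if s < best_score:
            best_word, best_score = word, s
    return best_word if best_score <= threshold else None


def flat(level: int) -> dict:
    """Build a dummy word whose characteristics all hold one byte value."""
    return {k: bytes([level]) * n for k, n in SIZES.items()}


# Two dummy "learnt" words and one "spoken" word close to the first.
vocabulary = {"open": flat(10), "close": flat(200)}
print(best_match(flat(12), vocabulary))
```

In this sketch the larger characteristics are given more weight because the text describes them as more accurate; a word that is not within the threshold of any learnt word is rejected rather than guessed.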
Distribution and copyright
VoiceCon is © Jason Tribbeck 1998-2001. The algorithm and methods are patent pending.
This software can be freely distributed provided the following conditions are adhered to:
- The software is transmitted in whole, with no omissions or changes;
- The software is not to be included in any commercial product without written permission from the author;
- No reverse engineering of this code is permitted, as it is patent pending.
If you want a version for your particular sampler, then please get the manufacturers of the sampler to contact me with full technical details of the sampler. If I cannot get the technical details, then there is absolutely no way I can do the conversion.