Speech Recognition in .Net using C#

9/10/2009

Speech recognition is a much harder problem than simply making the computer speak, so you might expect it to take a good 100 to 200 lines of code to get the task done. With .Net it's closer to ten.

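    using System.Speech.Recognition;
    using System.Text;

    SpeechRecognitionEngine RecognitionEngine = new SpeechRecognitionEngine();
    RecognitionEngine.LoadGrammar(new DictationGrammar());
    // The engine needs an explicit input source; here, the default microphone.
    RecognitionEngine.SetInputToDefaultAudioDevice();
    // Blocks until a phrase is heard; returns null if nothing was recognized.
    RecognitionResult Result = RecognitionEngine.Recognize();
    StringBuilder Output = new StringBuilder();
    if (Result != null)
    {
        foreach (RecognizedWordUnit Word in Result.Words)
        {
            Output.Append(Word.Text).Append(' ');
        }
    }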

Most of the code above should be self-explanatory, with one exception: the LoadGrammar line might give you some pause. The recognizer needs to know what to listen for, and it has two modes. The first mode is dictation. This is what you would use for something like Word (and is what I show above). The second mode is command mode. In that case you have to build your own grammar by passing in the words and phrases it should look for. The main reason to use command mode is to control an application with specific phrases.

Another thing to note is that the code above is synchronous. If you want to, you can run recognition asynchronously and have it notify you when it's done. There are other options as well, such as the ability to take input from a wave file, just like the text to speech bit of code.
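As a rough illustration of command mode, here is a minimal sketch that builds a small grammar from a fixed list of phrases using Choices and GrammarBuilder. The phrases, the grammar name, and the class name are made up for the example, and it assumes a microphone is available as the default audio device.

```csharp
using System;
using System.Speech.Recognition;

class CommandModeExample
{
    static void Main()
    {
        using (SpeechRecognitionEngine engine = new SpeechRecognitionEngine())
        {
            // Hypothetical command phrases; replace with whatever your application needs.
            Choices commands = new Choices("open file", "save file", "close window");
            Grammar grammar = new Grammar(new GrammarBuilder(commands));
            grammar.Name = "commands";

            engine.LoadGrammar(grammar);
            engine.SetInputToDefaultAudioDevice();

            // Synchronous call: returns null if nothing matching the grammar was heard.
            RecognitionResult result = engine.Recognize();
            if (result != null)
            {
                Console.WriteLine("Heard: {0} (confidence {1:F2})", result.Text, result.Confidence);
            }
            else
            {
                Console.WriteLine("No command recognized.");
            }
        }
    }
}
```

Because the recognizer only has to choose between a few known phrases, a grammar like this tends to be much more reliable than free dictation for controlling an application.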

The SpeechRecognitionEngine is a part of the System.Speech.Recognition namespace. It's basically our gateway to the built-in speech recognition software that Microsoft uses. Well, sort of anyway... Normally when Microsoft puts something in .Net, it means that it will work, but speech recognition doesn't seem to want to on Windows Server 2008. I spent the better part of a day trying to get it to work, with nothing to show for it. On Vista and Windows 7, this code will give you speech recognition. On XP, you have to download and install the Speech SDK from Microsoft. Windows Server 2003 also seems to work with that download, but Windows Server 2008 seems busted at this point (and I'm guessing there is no rush to fix that). There are claims out there that you can get it to work, but none of them have worked for me thus far. Anyway, I hope this helps someone. Give it a try, leave feedback, and happy coding.



Comments

James Craig
May 08, 2011 5:45 PM

The .Net library that you need is System.Speech; add a reference to it in your project. It contains the System.Speech.Recognition namespace. If it doesn't come up, make sure you're targeting .Net 3.0 or later (if you're stuck on 2.0, I believe you're out of luck).

slyval1
May 02, 2011 3:37 AM

I am trying to use the code above for speech recognition, but it seems there is no namespace System.Speech.Recognition in Visual Studio 2010. Is there a library I have to add?

James Craig
March 19, 2011 4:30 PM

I have never written code for Azure. I have no idea if System.Speech.Recognition namespace can even be used on it. I'm sorry that I'm not much help in this regard.

Sun
March 18, 2011 12:10 PM
Hi James, I am using your code for a speech recognition application in Windows Azure. However, at the line:

    RecognitionResult Result = RecognitionEngine.Recognize();

the program runs for a long time, with the following messages repeating. It is kind of stuck there.

    Microsoft.WindowsAzure.ServiceRuntime Verbose: 500 : Role instance status check starting
    Microsoft.WindowsAzure.ServiceRuntime Verbose: 502 : Role instance status check succeeded: Ready

Can you please tell me what is wrong here?

James Craig
October 27, 2010 9:03 AM

Well, you can't really do what you want to do with the code I've put up. In your case you're going to have to call RecognizeAsync (make sure to tie into the RecognizeCompleted event). The RecognizeCompletedEventArgs object that is passed to the handler should contain the Result object, which has the Audio property, which will give you the audio position... A bit more complicated than it needs to be, but it should work. You may end up with the position of a phrase and not just a single word, but sadly that's the best way I know to do what you want to do...
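A minimal console sketch of that approach might look like the following. The wave file path is a placeholder, the class name is made up, and the ManualResetEvent is only there to keep the console app alive until recognition finishes; note that it reports the position of the recognized phrase as a whole, not of individual words.

```csharp
using System;
using System.Speech.Recognition;
using System.Threading;

class AudioPositionExample
{
    static void Main()
    {
        using (SpeechRecognitionEngine engine = new SpeechRecognitionEngine())
        using (ManualResetEvent done = new ManualResetEvent(false))
        {
            engine.LoadGrammar(new DictationGrammar());
            engine.SetInputToWaveFile(@"C:\YOURWAVFILE.wav"); // placeholder path

            engine.RecognizeCompleted += (sender, e) =>
            {
                if (e.Error != null)
                {
                    Console.WriteLine("Recognition failed: " + e.Error.Message);
                }
                else if (e.Result == null || e.Result.Audio == null)
                {
                    Console.WriteLine("Nothing was recognized.");
                }
                else
                {
                    // AudioPosition is where the phrase starts in the input stream.
                    RecognizedAudio audio = e.Result.Audio;
                    Console.WriteLine("\"{0}\" starts at {1} and lasts {2}",
                        e.Result.Text, audio.AudioPosition, audio.Duration);
                }
                done.Set();
            };

            engine.RecognizeAsync(); // a single recognition, then RecognizeCompleted fires
            done.WaitOne();          // wait here so the engine is not disposed too early
        }
    }
}
```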

young
October 26, 2010 2:53 PM

I am trying to grab the audio position value for each entry in Words, for example. Since AudioPosition is private, is there any other way to get Words[i].audioPosition? Much appreciated. Thanks in advance. Young

James Craig
June 19, 2010 8:51 PM

Assuming you have SP3 installed and a recognizer installed as well (basically, have Office 2003 installed, or the SAPI 5.1 SDK, or one of the other recognizers out there), it will work on XP.
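One way to check whether a recognizer is actually installed is to ask System.Speech for the list. This is just a small sketch (the class name is made up) using SpeechRecognitionEngine.InstalledRecognizers():

```csharp
using System;
using System.Speech.Recognition;

class ListRecognizers
{
    static void Main()
    {
        // One RecognizerInfo entry is returned per installed recognition engine.
        var recognizers = SpeechRecognitionEngine.InstalledRecognizers();
        if (recognizers.Count == 0)
        {
            Console.WriteLine("No speech recognizers are installed.");
            return;
        }

        foreach (RecognizerInfo info in recognizers)
        {
            Console.WriteLine("{0} ({1})", info.Name, info.Culture.Name);
        }
    }
}
```

If the list comes back empty, installing one of the recognizers mentioned above is probably the first thing to try.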

mahmoud
June 19, 2010 7:32 AM

Does it work under Win XP?

James Craig
June 03, 2010 7:52 AM

Well, that error means the engine has not been given an audio input source. Assuming you have a mic, call RecognitionEngine.SetInputToDefaultAudioDevice() before calling Recognize. You may also want to double check that the recognizer is set up properly by pointing it at a wav file instead. To do that, you would add the following line prior to calling Recognize:

    RecognitionEngine.SetInputToWaveFile("THE LOCATION OF YOUR FILE");

Eng_rayan
June 02, 2010 10:12 AM

I have XP and I've downloaded the SAPI SDK 5.1. I used the code that you posted and it built successfully, but when I start debugging, an error occurs: "No audio input is supplied to this recognizer"

James Craig
May 18, 2010 9:39 AM

The code you have looks OK. I would step through it to see where the error occurs and what, if any, error codes you receive. That being said, I haven't used the old SAPI code since I switched to Windows 7 (System.Speech is a managed wrapper that's part of .Net 3.0 and later, which comes built in starting with Vista). I'm going to assume that you need it to run on XP though, and that's why you're using SAPI instead.

Anyway, one of the big issues that I used to run into with XP was that there wasn't a speech recognition engine built in by default. But if you have the Speech SDK (probably 5.1, since that's what everyone links to) or Office installed, you should have an engine. Past that, double check your setup (it's really common to mess up your grammar, etc.).

Martin Lutini
May 18, 2010 5:46 AM

Hi Craig, I am doing my final project, in which I am implementing a speech recognition system that can identify an individual by dictation of their ID number. I am really new to C# and .NET, but I have tried to read up and compile something, and I get stuck when running it.

    // Code that handles a recognised voice from the microphone
    public void RecoContext_Recognition(int StreamNumber, object StreamPosition, SpeechRecognitionType RecognitionType, ISpeechRecoResult e)
    {
        // get phrase
        string phrase = e.PhraseInfo.GetText(0, -1, true);
        // make sure it's in lower case (for safer use only)
        phrase = phrase.ToLower();
        // if recognised any ...
        if (phrase != "")
        {
            switch (e.PhraseInfo.Rule.Name) // rule name not the phrase!
            {
                case "Activate":
                {
                    // load grammar
                    SAPIGrammarFromFile("XMLDeactivate.xml");
                    // notify user

James Craig
April 30, 2010 8:22 AM

At this point I can honestly say that I don't know what the issue is. I would try posting the issue on Stack Overflow: http://stackoverflow.com. I'm sorry that I'm not of more help.

jamal
April 29, 2010 12:11 AM

It's still not working; the page just keeps loading. I don't have any idea about sync and async, and I don't know how to stop the page and make it show the output. Please, if you can, give the code.

James Craig
April 28, 2010 9:57 AM

Well, if you want to do async recognition, you need to set up the code a bit differently from what you have above. Below is a really basic form as an example:

    public partial class Form2 : Form
    {
        // Kept as a field rather than in a using block: a using block would
        // dispose the engine before the async recognition finishes.
        // Dispose it when the form closes.
        private SpeechRecognitionEngine RecognitionEngine;

        public Form2()
        {
            InitializeComponent();
            RecognitionEngine = new SpeechRecognitionEngine();
            RecognitionEngine.SetInputToWaveFile("C:\\YOURWAVFILE.wav");
            RecognitionEngine.RecognizeCompleted += new EventHandler<RecognizeCompletedEventArgs>(RecognitionEngine_RecognizeCompleted);
            RecognitionEngine.LoadGrammar(new DictationGrammar());
            RecognitionEngine.RecognizeAsync();
        }

        void RecognitionEngine_RecognizeCompleted(object sender, RecognizeCompletedEventArgs e)
        {
            // Result is null when nothing was recognized
            lbl_result.Text = e.Result != null ? e.Result.Text : "";
        }
    }

But basically you need to set up the RecognizeComplet

jamal
April 27, 2010 11:29 PM

Hi Craig, thank you for your reply. I used Async="true" in the page directive; some sites I referred to guided me to use it. When I run the page with a breakpoint on the Label that displays the recognized text, it shows some output, but when I run the page normally it won't display anything; it just keeps loading. Do I need to close any function to make the page display the output?

James Craig
April 27, 2010 1:58 PM

You're using asynchronous recognition as well as synchronous. You should choose one or the other in this case. Also, if you're putting the recognition engine in a using statement, you shouldn't call Dispose. As far as why it's only displaying during loading, to be honest, I don't know. It might be an issue where the recognition engine locks the wave file, but I'm not sure. As far as getting the recognition a bit better, depending on your setup, it may need to be trained.

To train your system to recognize your voice (or whoever's), go to the Control Panel, Ease of Access, and then Speech Recognition. There should be a button that says "Train your machine to understand you better" or something along those lines. Keep in mind that it's then trained on your voice, but it will do a better job of recognizing anything that you say.

jamal
April 27, 2010 9:52 AM

Hi Craig, I am using the code below:

    using (RecognitionEngine = new SpeechRecognitionEngine(new CultureInfo("en-US")))
    {
        RecognitionEngine.SetInputToWaveFile("D:/wav/music.wav");
        SpVoice voice = new SpVoice();
        voice.Rate = 10;
        voice.Volume = 100;
        RecognitionEngine.LoadGrammar(new DictationGrammar());
        RecognitionResult Result = RecognitionEngine.Recognize();
        RecognitionEngine.RecognizeAsync(RecognizeMode.Single);
        StringBuilder Output = new StringBuilder();
        foreach (RecognizedWordUnit Word in Result.Words)
        {
            Output.Append(Word.Text + " ");
        }
        RecognitionEngine.RecognizeAsyncStop();
        lbl_result.Text = Output.ToString();

James Craig
February 14, 2010 8:47 PM

Well, speech recognition will not work on Windows Server 2008, and it takes a lot of work to get it up and going on 2003. So depending on your server, it would be difficult to get working. Most likely it's not going to work on a website.

Jim
February 12, 2010 11:49 PM

Thanks for sharing, this is exactly what I'm looking for. Is this for Windows Forms or web? I used the code for my website, and my site will always be in "loading" mode and will never display the result. Here's the code:

    SpeechRecognitionEngine RecognitionEngine;
    using (RecognitionEngine = new SpeechRecognitionEngine(new CultureInfo("en-US")))
    {
        RecognitionEngine.SetInputToWaveFile(Server.MapPath("speech.wav"));
        RecognitionEngine.LoadGrammar(new DictationGrammar());
        RecognitionResult Result = RecognitionEngine.Recognize();
        StringBuilder Output = new StringBuilder();
        foreach (RecognizedWordUnit Word in Result.Words)
        {
            Output.Append(Word.Text + " ");
        }
        lbl_result.Text = Output.ToString();
    }