Speech Recognition in JavaScript

Sergio Rueda
·Nov 8, 2020·


The SpeechRecognition interface, like the SpeechSynthesis interface, is part of the Web Speech API.

SpeechRecognition is still in an experimental phase, so please be sure to check browser compatibility before relying on it.

For more information, please visit MDN Web Docs (developer.mozilla.org).

To use speech recognition in JavaScript, do as follows:

1.- Set window.SpeechRecognition to either window.SpeechRecognition or window.webkitSpeechRecognition, since some browsers only expose the prefixed version.

 window.SpeechRecognition =  window.SpeechRecognition || window.webkitSpeechRecognition;
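
Because the API is experimental, it may be worth confirming that one of the two constructors exists before going further. The following is a minimal sketch of that check; `getSpeechRecognition` is a hypothetical helper name, not part of the Web Speech API, and passing the global object in as a parameter is only an assumption to keep the function easy to try outside a browser.

```javascript
// Hypothetical helper: returns the SpeechRecognition constructor if the
// given global object exposes one (prefixed or not), otherwise null.
function getSpeechRecognition(globalObj) {
  return globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition || null;
}

// In a browser you would normally pass `window`; `globalThis` also works there.
const Recognition = getSpeechRecognition(globalThis);
if (!Recognition) {
  console.log('Speech recognition is not supported in this environment.');
}
```

If the helper returns null, the page can show a fallback message instead of failing when `new` is called on an undefined value.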

2.- Create an instance of SpeechRecognition:

  let recognition = new window.SpeechRecognition();

3.- In this example I will use Spanish as my base language, but you can always change it to your preference. Note: to use any voice app, your system must have every language you want to use installed. Windows 10 already has them.

For languages we use the 'lang' property. As you may have guessed, this API has several properties, but we will only use 'lang'.

recognition.lang = 'es-MX';
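
The 'lang' value is a BCP 47 language tag such as 'es-MX'. As a small sketch, the tag can be normalized before it is assigned, so that a typo like 'es_MX' fails early. `setLanguage` is a hypothetical helper name; it relies on the standard `Intl.getCanonicalLocales` function, which throws a RangeError for malformed tags.

```javascript
// Hypothetical helper: validates and normalizes a BCP 47 tag,
// then assigns it to the recognition object's `lang` property.
function setLanguage(recognition, tag) {
  // Throws a RangeError if `tag` is not a well-formed language tag
  const [canonical] = Intl.getCanonicalLocales(tag);
  recognition.lang = canonical;
  return canonical;
}

// A plain object stands in for a SpeechRecognition instance here
const demoRecognition = { lang: '' };
setLanguage(demoRecognition, 'es-mx');
console.log(demoRecognition.lang);
```

This only checks that the tag is well formed. Whether the browser can actually recognize that language is a separate question.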

4.- Once steps 1 through 3 are completed, we are ready to start our speech recognition engine.

recognition.start();
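
Starting the engine comes down to calling start() on the instance from step 2. Here is a slightly defensive sketch, assuming you would rather log a failure than let it stop the script. `startListening` is a hypothetical helper name. Calling start() on an instance that is already running throws an InvalidStateError, which is the case the try/catch covers.

```javascript
// Hypothetical helper: starts recognition and reports whether it worked.
// `recognition` is assumed to be a SpeechRecognition instance.
function startListening(recognition) {
  try {
    recognition.start();
    return true;
  } catch (err) {
    // start() throws if recognition has already been started
    console.warn('Could not start recognition:', err.message);
    return false;
  }
}
```

In the browser, the first call usually triggers the microphone permission prompt.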


5.- At this point we should be listening for audio, and we can start capturing the results of what we say. To do this we have to extract the text from the results object. The text converted from the audio sits two arrays deep under results, in the transcript property. In the following example I said "Cuarenta", which was converted into "40". Please see the picture:


Extract it with, e.g.: results[0][0].transcript
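
To make the two levels of nesting concrete, here is a minimal sketch that reads the first alternative of the first result and returns an empty string when nothing was recognized. `firstTranscript` is a hypothetical helper name, and `mockResults` is a plain-array stand-in for the event's results list. Its confidence value is made up for illustration.

```javascript
// Hypothetical helper: returns the top transcript, or '' if there is none.
function firstTranscript(results) {
  // results[0] is the first recognized result;
  // results[0][0] is that result's first alternative
  if (!results || results.length === 0 || results[0].length === 0) {
    return '';
  }
  return results[0][0].transcript;
}

// Stand-in data shaped like the example above ("Cuarenta" recognized as "40")
const mockResults = [[{ transcript: '40', confidence: 0.9 }]];
console.log(firstTranscript(mockResults));
```

In a real handler, the same function would be called with the results property of the event object.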

First, an event listener has to be created for the 'result' event of SpeechRecognition.

     recognition.addEventListener('result', onAudio);

The above code only executes when the 'result' event occurs, and when it does it calls a callback function named "onAudio".

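function onAudio(e) {
    // element that will display the recognized text
    let message = document.getElementById('msg');
    // first alternative of the first result
    let msg = e.results[0][0].transcript;

    // put msg on DOM
    message.innerHTML = `
    <div>You said: 
          <span class="box">${msg}</span>
    </div>`;
}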

6.- Add another event listener to capture the 'end' event. This simply restarts recognition so that it keeps capturing what we say.

recognition.addEventListener('end', () => recognition.start());
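
Restarting on every 'end' event means recognition never stops by itself. Here is a minimal sketch, assuming you also want a way to turn it off later. `restartOnEnd` is a hypothetical helper name. It registers the same kind of 'end' listener and returns a function that removes the listener and calls stop().

```javascript
// Hypothetical helper: restarts recognition whenever it ends,
// and returns a function that ends the restart loop.
function restartOnEnd(recognition) {
  const onEnd = () => recognition.start();
  recognition.addEventListener('end', onEnd);

  return function stopListening() {
    // Remove the listener first so the final 'end' event does not restart
    recognition.removeEventListener('end', onEnd);
    recognition.stop();
  };
}
```

The returned function could be wired to a "stop" button, for example.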

Please see the live example here. Note: if it does not work as is, please check your voice/microphone settings, or copy the link and open it in the Google Chrome browser.

Thanks for reading this article

Let's Connect

Twitter Rueda Tech

Next article: Undecided
