Artificial Intelligence for Speech Recognition

Custom Student Mr. Teacher ENG 1001-04 6 January 2017



When you dial the telephone number of a big company, you are likely to hear the sonorous voice of a cultured lady who answers your call with great courtesy, saying, “Welcome to company X. Please give me the extension number you want.” You pronounce the extension number, your name, and the name of the person you want to contact. If the called person accepts the call, the connection is made quickly. This is artificial intelligence at work: an automatic call-handling system is used without employing any telephone operator. Artificial Intelligence (AI) involves two basic ideas. First, it involves studying the thought processes of human beings. Second, it deals with representing those processes via machines (computers, robots, etc.). AI is the behavior of a machine which, if performed by a human being, would be called intelligent. It makes machines smarter and more useful, and it is less expensive than natural intelligence.

Natural Language Processing (NLP) refers to Artificial Intelligence methods of communicating with a computer in a natural language like English. The main objective of an NLP program is to understand input and initiate action. The input words are scanned and matched against internally stored known words. Identification of a keyword causes some action to be taken. In this way, one can communicate with the computer in one’s own language. One of the main applications of AI is the speech recognition system, whose advantage is that it lets the user do other work simultaneously. The speech recognition process is performed by a software component known as the speech recognition engine.
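The keyword matching described above can be illustrated with a minimal Python sketch. The keyword table, the action names, and the function `find_action` are hypothetical and chosen only for illustration; a real NLP program would use a far richer vocabulary and grammar.

```python
# Hypothetical table of stored known words and the action each one triggers.
KEYWORD_ACTIONS = {
    "open": "open_file",
    "print": "print_document",
    "call": "place_call",
}

def find_action(sentence):
    """Scan the input words and return the action for the first known keyword."""
    for word in sentence.lower().split():
        word = word.strip(".,!?")  # drop punctuation attached to the word
        if word in KEYWORD_ACTIONS:
            return KEYWORD_ACTIONS[word]
    return None  # no stored keyword was identified

if __name__ == "__main__":
    print(find_action("Please open the report."))
```

Words that are not in the table are simply skipped, which is why a polite filler such as "please" does not prevent the keyword from being found.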

A speech recognition system is a type of software that allows the user to have their spoken words converted into written text in a computer application such as a word processor or spreadsheet. The computer can also be controlled by the use of spoken commands. Because we cannot design an electronic device that recognizes everyone’s voice, such systems are divided into speaker-dependent and speaker-independent types. The working of the system involves analog-to-digital conversion (ADC) of the speech and comparison of the resulting binary version with the stored words. Its limitations are that it must be completely trained by the user and that it is most successful for those competent in the art of dictation.

It is applicable in Blue Eyes technology, in telephone applications such as travel booking and financial account information, and in the military for controlling weapons. Considering all the above factors, it differs from other technologies in that it produces written text from the user’s dictation without using, or with only minimal use of, a traditional keyboard and mouse. This is an obvious benefit to many people who, for any number of reasons, do not find it easy to use a keyboard, or whose spelling and literacy skills would benefit from seeing their spoken words appear as written text. Speech recognition will revolutionize the way people conduct business over the Web and will, ultimately, differentiate world-class e-businesses. It also decreases fatigue and has created its own path across various fields.


Evidence of Artificial Intelligence folklore can be traced back to ancient Egypt, but with the development of the electronic computer in 1941, the technology finally became available to create machine intelligence. The term artificial intelligence was first coined in 1956, at the Dartmouth conference, and since then Artificial Intelligence has expanded because of the theories and principles developed by its dedicated researchers. Artificial intelligence, also known as machine intelligence, is defined as intelligence exhibited by anything manufactured (i.e. artificial) by humans or other sentient beings or systems (should such things ever exist on Earth or elsewhere). As the popularity of the AI computer has grown, so has the interest of the public. Applications for the Apple Macintosh and IBM-compatible computers, such as voice and character recognition, have become available. AI technology has also made steadying camcorders simple using fuzzy logic. With a greater demand for AI-related technology, new advancements are becoming available.

Inevitably, Artificial Intelligence has affected, and will continue to affect, our lives. Artificial Intelligence (AI) is the effort to develop computer-based systems that behave like humans: systems that learn languages, accomplish physical tasks, and use a perceptual apparatus. With the development of practical techniques based on AI research, advocates of AI have argued that opponents of AI have repeatedly changed their position on tasks such as computer chess or speech recognition, which were previously regarded as “intelligent”, in order to deny the accomplishments of AI. They point out that this moving of the goalposts effectively defines “intelligence” as “whatever humans can do that machines cannot”.

A speech recognition system is a type of software that allows the user to have their spoken words converted into written text in a computer application such as a word processor or spreadsheet. The computer can also be controlled by the use of spoken commands. Speech recognition software can be installed on a personal computer of appropriate specification. The user speaks into a microphone (a headphone microphone is usually supplied with the product). The software generally requires an initial training and enrolment process in order to teach the software to recognize the voice of the user. A voice profile is then produced that is unique to that individual. This procedure also helps the user to learn how to speak to a computer.


The user speaks to the computer through a microphone; the system, in turn, identifies the meaning of the words and sends it to an NLP device for further processing. Once recognized, the words can be used in a variety of applications such as display, robotics, commands to computers, and dictation. The word recognizer is a speech recognition system that identifies individual words.

Following are a few of the basic terms and concepts that are fundamental to speech recognition: utterances, pronunciations, grammar, and accuracy.

Speech quality varies from person to person. The grammar used by the speaker and accepted by the system, the noise level, the noise type, the position of the microphone, and the speed and manner of the user’s speech are some of the factors that may affect the quality of speech recognition. When the computer must be trained to the voice of one particular individual, the system is called a speaker-dependent system. A speaker-independent system can be used by anybody and can recognize any voice, even though the characteristics vary widely from one speaker to another.


Normal speech has a frequency range of 200 Hz to 7 kHz. Recognizing speech over a telephone call is more difficult, as the line has a bandwidth limitation of 300 Hz to 3.3 kHz. As explained earlier, the spoken words are processed by the filters and ADCs. The binary representation of each of these words becomes a template, or standard, against which future words are compared. These templates are stored in memory. Once the storing process is completed, the system can go into its active mode and is capable of identifying spoken words. As each word is spoken, it is converted into its binary equivalent and stored in RAM.
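As a rough illustration of how an ADC turns a sound into a binary representation, the following Python sketch samples a pure tone and quantizes each sample to an integer code. The 8 kHz sampling rate, the 8-bit resolution, and the function names `sample_tone` and `quantize` are assumptions made for this example; the essay does not specify the converter used.

```python
import math

def sample_tone(freq_hz, rate_hz, duration_s):
    """Sample a sine tone; a stand-in for the filtered microphone signal."""
    n = int(rate_hz * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(n)]

def quantize(samples, bits=8):
    """Map analog values in [-1.0, 1.0] to unsigned integer codes, as an ADC would."""
    levels = 2 ** bits
    codes = []
    for x in samples:
        x = max(-1.0, min(1.0, x))  # clip to the converter's input range
        codes.append(int((x + 1.0) / 2.0 * (levels - 1) + 0.5))  # round to nearest level
    return codes

if __name__ == "__main__":
    # Assumed values: a 440 Hz tone lasting half a second, sampled at 8 kHz.
    word = quantize(sample_tone(440, 8000, 0.5))
    print(len(word), "digital words, first few:", word[:5])
```

Under these assumed settings, half a second of sound already yields 4000 stored values, which gives a sense of how much data a single spoken word produces.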

The computer then starts searching and compares the binary input pattern with the templates. It is to be noted that even if the same speaker speaks the same text, there are always slight variations in the amplitude or loudness of the signal, pitch, frequency, time gaps, etc. For this reason, there is never a perfect match between the template and the binary input word. The pattern matching process therefore uses statistical techniques and is designed to look for the best fit. The values of the binary input word are subtracted from the corresponding values in the templates. If both values are the same, the difference is zero and there is a perfect match. If not, the subtraction produces some difference, or error. The smaller the error, the better the match.
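The subtraction-based comparison can be sketched in Python as follows. This is a simplified illustration, not the matching procedure of any particular recognizer: it assumes the input word and every template have the same length, and the names `match_error` and `best_match` and the sample values are hypothetical.

```python
def match_error(word, template):
    """Sum of absolute differences between an input word and one template."""
    return sum(abs(a - b) for a, b in zip(word, template))

def best_match(word, templates):
    """Return the (label, error) pair whose template gives the smallest error."""
    scored = ((label, match_error(word, t)) for label, t in templates.items())
    return min(scored, key=lambda pair: pair[1])

if __name__ == "__main__":
    # Hypothetical stored templates, each a short list of binary sample values.
    templates = {"yes": [10, 200, 30], "no": [100, 100, 100]}
    print(best_match([12, 190, 33], templates))
```

An error of zero would mean a perfect match; in practice the recognizer settles for the template with the smallest error.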


The search process takes a considerable amount of time, as the CPU has to make many comparisons before recognition occurs. This necessitates the use of very high-speed processors. A large RAM is also required because, even though a spoken word may last only a few hundred milliseconds, it is translated into many thousands of digital words. It is important to note that the input word and the template must be aligned in time before they are compared, because speakers pronounce the same word at different speeds and elongate different parts of the same word. This is especially important for speaker-independent recognizers.
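One widely used technique for aligning words spoken at different speeds is dynamic time warping (DTW). The essay does not name the alignment method, so the following Python sketch is an assumption offered for illustration only; the function name `dtw_distance` and the sample values are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of sample values.

    Each element of one sequence may be matched to one or more consecutive
    elements of the other, so a stretched word can still align with its template.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b holds
                                 cost[i][j - 1],      # b advances, a holds
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

if __name__ == "__main__":
    template = [1, 2, 3]
    slower = [1, 2, 2, 3]  # same pattern with its middle part elongated
    print(dtw_distance(template, slower))
```

The nested loops also show why the search is slow: comparing one input word against one template takes work proportional to the product of their lengths, and this is repeated for every stored template.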


Speech recognition is used to enable deaf people to understand the spoken word via speech to text conversion, which is very helpful. Speech recognition is especially useful for people who have difficulty using their hands, ranging from mild repetitive stress injuries to involved disabilities that preclude using conventional computer input devices.


Speech recognizers have been operated successfully in fighter aircraft, with applications including setting radio frequencies, commanding an autopilot system, setting steer-point coordinates and weapons release parameters, and controlling flight displays.


Training for military (or civilian) air traffic controllers (ATC) represents an excellent application for speech recognition systems.


Speech is used mostly as a part of User Interface, for creating pre-defined or custom speech commands.

LIMITATIONS: It needs to be completely tailored to the user and trained by the user. It is often set up on one machine, and so can create difficulties for a user who works from many locations, for example from school and home. It depends on the user having the desire to produce text and being able to invest the time, training and perseverance necessary to achieve it. It is most successful for those competent in the art of dictation.


Speech recognition has prevailed and achieved tremendous results in different fields. It has made our interactions with the computer easier than before. This technology has reduced the difference between human-to-human and human-to-machine interaction.


It would yield better results if it were made noise resistant, could understand our emotions, were more user friendly (as the accent of a machine differs from a human’s), and were portable enough to use irrespective of the device.



