In this paper, an Intelligent Volume Controller (IVC) using fuzzy logic is developed for mobile phones to improve voice quality in the presence of background noise. The IVC uses the noise level and noise class as inputs to automatically adjust the volume of the mobile phone.
The fuzzy noise classification system consists mainly of two stages: feature extraction and feature matching. We conducted several experiments on real audio signals, and the results demonstrate the effectiveness of the proposed approach.
During the research work, noise attenuation levels of up to 0.05 dB, 0.012 dB and 0.017 dB were achieved using the proposed volume controller, which appears quite satisfactory.
During a conversation on a mobile phone, if the background noise level is high, the speaker on the other end is sometimes asked to speak up or to move away from the noisy place. Otherwise, users tend to hold their mobiles very close to their ears in the presence of high background noise.
Therefore, in this paper, we attempt to improve the quality of service by developing an intelligent volume controller that automatically adjusts the volume of mobile phones.
The noise level derived from the voice activity detector (VAD) present within the speech codec, the current volume level, and the noise class act as inputs to the fuzzy volume controller. Background noise levels may be high in buses, trains, planes, markets, sporting venues, and other public places. Nowadays, most smartphones come with efficient noise cancellation systems.
However, high ambient noise levels remain difficult to overcome and impose a major limitation on the automotive audio systems available in mobile phones.
Background noise classification information is very useful and can be used to adapt the acoustic volume level dynamically in different noise environments. The fuzzy volume controller for mobile phones adjusts the volume according to the background noise level and noise class using a fuzzy expert system. By intelligently adjusting the volume level, the quality of service can be improved for both stationary and non-stationary background noise in mobile environments.
Hearing loss in individuals can be gradual, and quality of hearing can vary from person to person. Hearing loss can also make it difficult for individuals to understand speech in the presence of background noise. Hence, the IVC needs to be personalized to an individual's hearing requirements for understanding speech in the presence of background noise.
Manual volume control has several shortcomings. For a sophisticated mobile phone, the user needs to be trained to use the volume control to get better sound quality. The volume control needs to take into account the following:
A volume controller that can intelligently adjust the volume level is referred to as an Intelligent Volume Controller (IVC). The design and development of IVCs depend on very complex models of the hearing mechanism as well as on the measurement of the inputs required by the IVC, such as the background noise level and the loudness level.
Background noise level and loudness level are subjective measures. In this context of uncertainty and imprecision, soft computing techniques become useful for IVC implementation. In general, intelligent volume control is used to overcome most of the problems faced by the user with manual volume control. Intelligent volume control has several advantages over the conventional controllers in cellular phones, such as:
Noise classes such as car noise are low-frequency noise. They do not affect the intelligibility of speech as much as factory noise does. Therefore, using both the background noise level and the noise class can provide effective volume adjustment. Finding a suitable model is a pertinent challenge that depends on the characteristics of the data. An intelligent volume controller is designed, and its performance is observed in terms of noise attenuation level.
The best noise parameter estimate for the intelligent volume controller is obtained using an artificial neural network (ANN), so that accurate noise classes produce better results.
In our research work, audio data samples were collected and further processed in the MATLAB environment. Four different types of noise were collected: car noise, office noise, market noise and train noise. 100 samples of each type, totaling 400 noise samples, were downloaded from the website www.partnersinrhyme.com and stored on a computer. In addition, 100 real samples of each of the same four noise types, totaling 400 noise samples, were collected from the environment with the help of a microphone connected to a personal computer. Thus, a total of 800 samples were stored in memory.
In our daily life, we encounter different types and levels of environmental acoustic noise, such as office noise, car noise and traffic noise. In various audio processing systems, such as voice coding, noise recognition and speaker verification, unwanted noise signals are picked up along with the speech signals, which often degrades the performance of communication systems.
In the last two decades, researchers have developed numerous techniques and algorithms to classify background noise using features such as log area ratio (LAR) coefficients, zero crossing rate (ZCR), line spectral frequency (LSF) and power spectral density (PSD). However, none of these techniques has proven effective in yielding the desired noise classification results, owing to the inherent limitations of each technique.
The proposed IVC expert system consists mainly of three stages:
Here, noise parameter estimates based on MFCC, LPC and RCEP are obtained for four noises: car, office, market and train. A wide range of possibilities exists for parametrically representing the speech signal for the speaker recognition task, such as Linear Prediction Coding (LPC), Mel Frequency Cepstral Coefficients (MFCC), and others.
The Mel Frequency Cepstral Coefficients (MFCC) technique is used to extract features from the noise signal. For each noise frame of around 30 ms, taken with overlap, a set of Mel Frequency Cepstral Coefficients is computed. These are the result of a cosine transform of the logarithm of the short-term power spectrum expressed on a mel frequency scale. This set of coefficients is called an acoustic vector. Therefore, each input utterance is transformed into a sequence of acoustic vectors.
These acoustic vectors can be used as a noise parameter estimate to classify noise signals. Thus, the MFCC feature vector is computed by integrating the spectrum within triangular bins arranged on a mel frequency axis to form a mel-frequency binned spectrum, followed by log and DCT operations. The MFCC vectors, together with pitch if used, contain sufficient information for continuous speech recognition (at various rates of error).
However, it is not obvious that good-quality speech can be regenerated from them, since a significant amount of information is lost during feature extraction. This includes the phase information, discarded during the spectrum calculation (absolute value of the windowed DFT), and the fine details of the spectrum, discarded during the integration. In addition, in most cases some of the higher-order cepstral coefficients are discarded.
A small drawback is that MFCCs are more computationally expensive. Two modifications are generally used to make MFCC-based algorithms faster and more computationally efficient: windowing (Kaiser window) and taking the absolute value of the DFT.
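The MFCC steps described above (windowed frame, power spectrum, triangular mel bins, log, DCT) can be sketched in plain Python as follows. This is only an illustrative sketch, not the authors' MATLAB program: the function names, the 12-filter bank, the 8 kHz sample rate and the Hamming window (used here as a simple stand-in for the Kaiser window mentioned in the text) are all assumptions.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """|DFT|^2 of a Hamming-windowed frame for bins 0..N/2 (naive DFT)."""
    n = len(frame)
    win = [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    x = [s * w for s, w in zip(frame, win)]
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        spec.append(abs(acc) ** 2)  # the phase is discarded here
    return spec

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel axis."""
    top = hz_to_mel(sample_rate / 2.0)
    mels = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for j in range(1, num_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):   # rising edge
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling edge, equal to 1 at the centre bin
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank

def mfcc(frame, sample_rate, num_filters=12, num_coeffs=8):
    """One acoustic vector: DCT-II of the log mel-binned power spectrum."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    log_e = [math.log(max(sum(f * p for f, p in zip(filt, spec)), 1e-12))
             for filt in bank]
    return [sum(log_e[m] * math.cos(math.pi * k * (m + 0.5) / num_filters)
                for m in range(num_filters))
            for k in range(num_coeffs)]

# Usage: one 30 ms frame at an assumed 8 kHz sample rate (240 samples).
frame_30ms = [math.sin(2 * math.pi * 440 * i / 8000) for i in range(240)]
acoustic_vector = mfcc(frame_30ms, 8000)
```

A useful property of this formulation is that scaling the frame amplitude changes only the zeroth coefficient, so the remaining coefficients describe the spectral shape rather than the overall level.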
The basic idea behind linear prediction is that a noise sample can be approximated as a linear combination of past noise samples. By minimizing the sum of the squared differences (over a finite interval) between the actual noise samples and the predicted ones, a unique set of predictor coefficients can be determined. LPC is a traditional method that predicts every sample from the preceding samples such that the mean squared error between the predicted value and the true value is minimal.
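As a rough illustration of how such predictor coefficients can be obtained, the sketch below uses the autocorrelation method with the Levinson-Durbin recursion. The text does not say which solver was used (MATLAB's built-in LPC routine is mentioned later), so the function names and the choice of method here are assumptions.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite frame."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def lpc(x, order):
    """Levinson-Durbin recursion.

    Returns (a, err) such that x[n] is predicted as
    a[0]*x[n-1] + a[1]*x[n-2] + ... and err is the residual energy.
    """
    r = autocorrelation(x, order)
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err == 0.0:          # silent frame: nothing to predict
            break
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err           # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a = new_a
        err *= (1.0 - k * k)    # prediction error shrinks at every order
    return a, err

# Usage: a decaying exponential is predicted well by a single past sample.
signal = [0.9 ** n for n in range(200)]
coeffs, residual = lpc(signal, 2)
```

For this test signal each sample is 0.9 times the previous one, so the first coefficient comes out close to 0.9 and the second close to zero.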
The Real Cepstrum (RCEP) was first introduced as part of the homomorphic analysis of speech signals, in order to estimate the transfer function of the vocal tract and the glottal pulse under the assumption that pitch can be modeled as an impulse train. From the theoretical point of view, the cepstrum is defined as the inverse Fourier transform of the real logarithm of the magnitude of the Fourier transform.
Therefore, by keeping only the first few cepstral coefficients and setting the remaining coefficients to zero, it is possible to smooth the harmonic structure of the spectrum. Cepstral coefficients are therefore very convenient coefficients to represent the noise spectral envelope.
Here X denotes the Fourier transform of the input frame x. Because the real cepstrum is a real-valued function, it can be used to separate two signals that have been convolved with each other. RCEP is thus a cepstrum-based technique for determining the Harmonics-to-Noise Ratio (HNR) in noise signals, and it is a valid technique for determining the amount of spectral noise because it is almost linearly sensitive to both noise and jitter over a large part of the noise or jitter continuum. The real cepstrum block gives the real cepstrum of the input frame as its output and is also a popular way to define the prediction filter.
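The definition of the real cepstrum and the smoothing obtained by keeping only the first few coefficients can be sketched as follows. This is a minimal illustration using a naive DFT; the function names and the small floor applied before the logarithm are assumptions made to keep the example self-contained.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[i] * cmath.exp(sign * 2j * math.pi * k * i / n)
                  for i in range(n))
        out.append(acc / n if inverse else acc)
    return out

def real_cepstrum(frame, floor=1e-12):
    """Inverse DFT of the log magnitude of the DFT of the frame."""
    log_mag = [math.log(max(abs(v), floor)) for v in dft(frame)]
    return [v.real for v in dft(log_mag, inverse=True)]

def smoothed_log_spectrum(frame, keep):
    """Zero all but the first `keep` cepstral coefficients (and their
    mirror images) and transform back, which smooths the log spectrum."""
    c = real_cepstrum(frame)
    n = len(c)
    liftered = [c[i] if (i < keep or i > n - keep) else 0.0 for i in range(n)]
    return [v.real for v in dft(liftered)]

# Usage: spectral envelope of a short frame from the first 4 coefficients.
frame = [math.sin(0.7 * i) + 0.3 * math.sin(2.1 * i) for i in range(32)]
envelope = smoothed_log_spectrum(frame, keep=4)
```

A scaled unit impulse is a convenient sanity check: its magnitude spectrum is flat, so only the zeroth cepstral coefficient is non-zero.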
In this paper, an ANN classifier is used as a pre-processing unit to yield the noise level and the noise class as its outputs. The classifier designed in this study is a two-layer feed-forward neural network (an input layer feeding one hidden layer and one output layer) trained with the back-propagation algorithm, an approach that has also been used in medical imaging and medical image analysis. The scaled conjugate gradient back-propagation function (trainscg) is used to train the network, with 30 neurons in the hidden layer. The sigmoid function is used as the activation function.
The input layer consists of 9 neurons, each representing a distinguishing feature of the detected mass region, and the output layer consists of 2 neurons, representing malignant and benign mass regions. The output of the system is presented in the form [1 0] if the masses were found to be malignant, and [0 1] for benign cases. During training, each hidden unit transforms the signals received from the input layer to produce an output.
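The forward pass of such a network, with sigmoid units in the hidden and output layers, can be sketched as below. The layer sizes (9 inputs, 30 hidden neurons, 2 outputs) follow the text; the random, untrained weights and all function names are assumptions for illustration only, and no training step is shown.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    """One weight row per neuron; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]

def forward(layer, inputs):
    """Weighted sum plus bias, passed through the sigmoid, for each neuron."""
    return [sigmoid(sum(w * x for w, x in zip(neuron[:-1], inputs)) + neuron[-1])
            for neuron in layer]

def predict(hidden, output, features):
    """Hidden units transform the input signals; output units score classes."""
    return forward(output, forward(hidden, features))

# Usage with untrained weights: 9 features in, 2 output activations out.
rng = random.Random(0)
hidden_layer = init_layer(9, 30, rng)
output_layer = init_layer(30, 2, rng)
scores = predict(hidden_layer, output_layer, [0.1] * 9)
```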
Weights are adjusted so that the error between the observed output from each unit and the desired output specified by the target matrix is minimized. The dataset is divided into three parts: 50% of the samples are used to train the network, 20% are used for validation, and the remaining 30% are used to test the network performance. To analyze the network response, we examine the confusion matrix, comparing the outputs of the trained network with the expected results. Training stops automatically when generalization stops improving during validation or when the maximum number of allowed iterations is reached.
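The 50/20/30 split and the validation-based stopping rule can be sketched as follows. The function names, the fixed shuffle seed and the `patience` parameter (how many non-improving validation checks are tolerated) are hypothetical; the text does not state the exact stopping threshold used.

```python
import random

def split_dataset(samples, train_frac=0.5, val_frac=0.2, seed=0):
    """Shuffle once, then cut into training, validation and test parts."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

def early_stopping_epoch(val_losses, patience=6, max_epochs=1000):
    """Return (stop_epoch, best_epoch), both counted from 1.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs, or when max_epochs is reached.
    """
    best = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return min(len(val_losses), max_epochs), best_epoch

# Usage: 100 sample indices split into 50 / 20 / 30.
train, val, test = split_dataset(range(100))
```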
The noise level detector (NLD) gives the noise level (NL) as its output. The noise level is obtained in dB from the artificial neural network (ANN) and is fed to the Intelligent Volume Controller as one of its input features.
Noise parameters such as MFCC, RCEP, and LPC are computed in this block and serve as inputs to the ANN to yield the noise class. This block gives the noise class (NC) as its output. The noise class is obtained as a subset of a larger category of noise from the ANN and is fed to the IVC, which is based on a fuzzy expert system.
The fuzzy model of the IVC block gives the volume level change (VLC) based on three inputs: the noise level (NL) output of the NLD in dB, the noise class (NC), and the current volume level (VL). Before the noise signals are analyzed, the noise data must first be acquired, as discussed below:
In our research work, audio data samples were collected and further processed in the MATLAB 2013 environment. Four different types of noise were collected: car noise, office noise, market noise and train noise. 500 samples of each type, totalling 2000 noise samples, were collected from the internet and stored on a computer. In addition, 500 real samples of each of the same four noise types, totalling 2000 noise samples, were collected from the environment with the help of a microphone connected to a personal computer.
Thus, a total of 4000 samples were stored in memory. A user-defined program was written in MATLAB for the Mel Frequency Cepstral Coefficients (MFCC), while MATLAB's built-in routines for Linear Predictive Coding (LPC), the Real Cepstral parameter (RCEP) and the power spectrum were used to estimate the noise parameters. These parameters may be utilized for noise analysis through any one of the soft computing techniques, viz. neural networks, fuzzy logic, genetic algorithms, or a combination of these.
A feed-forward neural network with one hidden layer was created, with 20 neurons in the hidden layer, two input neurons for the two simultaneous noises to be classified, and one output neuron for the noise class obtained after classification. The dataset was then divided into training, cross-validation and testing sets. The training set was presented to the network during the learning phase. The cross-validation set was used to monitor performance during training and hence provided an independent measure of training performance. The network was trained and evaluated with the training, cross-validation and testing samples. In this work, the network trained on 70% of the samples was tested with the remaining 30%, which were partitioned from the main dataset. The testing data was not used in training in any way and hence provided an 'out-of-sample' dataset to test the network. The network response was then compared against the desired target response to build the classification matrix, which provided a comprehensive picture of the classifier's performance.
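Building the classification (confusion) matrix from the network response and the target response can be sketched as below. The class labels follow the four noise types in the text; the function names and the example labels are hypothetical and are not results from the study.

```python
def confusion_matrix(targets, predictions, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(targets, predictions):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

# Usage with made-up labels for a handful of held-out test samples.
CLASSES = ["car", "office", "market", "train"]
y_true = ["car", "car", "office", "market", "train", "train"]
y_pred = ["car", "office", "office", "market", "train", "car"]
cm = confusion_matrix(y_true, y_pred, CLASSES)
```

Off-diagonal entries show which noise classes are confused with each other, which a single accuracy figure hides.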
The intelligent volume controller for mobile phones in the presence of background noise is designed as a fuzzy volume controller by defining fuzzy membership functions and forty fuzzy if-then rules. The volume level (VL) was analyzed for four categories of internet noises from 0.1 dB to 100 dB, and it was normalized to the range 0 to 1 for defining the membership functions: (a) office noise, (b) market noise, (c) car noise, and (d) train noise.
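The normalization of a level in dB to the 0 to 1 range used by the membership functions could look like the sketch below. The text gives only the end points (0.1 dB and 100 dB), so the linear min-max mapping, the clipping of out-of-range values and the function name are assumptions.

```python
def normalize_level(level_db, lo_db=0.1, hi_db=100.0):
    """Map a level in dB linearly onto [0, 1], clipping out-of-range input."""
    clipped = min(max(level_db, lo_db), hi_db)
    return (clipped - lo_db) / (hi_db - lo_db)

# Usage: normalize a few levels before fuzzification.
normalized = [normalize_level(v) for v in (0.1, 30.0, 60.0, 100.0)]
```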
Table 1: FIS (Fuzzy Inference System) variables in the design of the fuzzy volume controller
The following forty fuzzy if-then rules were defined over the membership functions in the fuzzy toolbox of MATLAB to design the fuzzy volume controller for mobile phones:
Rule 1: If NL is low, VL is LP and NC is office, then VLC is LP.
Rule 2: If NL is low, VL is MP and NC is market, then VLC is LP.
Rule 3: If NL is medium, VL is LP and NC is office, then VLC is MN.
Rule 4: If NL is high, VL is HP and NC is car, then VLC is HP.
Rule 5: If NL is high, VL is HP and NC is train, then VLC is VHP.
Rule 6: If NL is medium, VL is MP and NC is car, then VLC is MP.
Rule 7: If NL is medium, VL is HP and NC is office, then VLC is MN.
Rule 8: If NL is high, VL is MP and NC is train, then VLC is HP.
Rule 9: If NL is medium, VL is HP and NC is market, then VLC is LP.
Rule 10: If NL is high, VL is LP and NC is market, then VLC is VHP.
Rule 11: If NL is low, VL is MP and NC is market, then VLC is LP.
Rule 12: If NL is low, VL is HP and NC is market, then VLC is MP.
Rule 13: If NL is medium, VL is LP and NC is office, then VLC is MP.
Rule 14: If NL is high, VL is HP and NC is train, then VLC is VHP.
Rule 15: If NL is medium, VL is MP and NC is office, then VLC is MN.
Rule 16: If NL is high, VL is LP and NC is car, then VLC is HP.
Rule 17: If NL is high, VL is LP and NC is market, then VLC is MP.
Rule 18: If NL is high, VL is MP and NC is office, then VLC is MP.
Rule 19: If NL is high, VL is MP and NC is car, then VLC is HP.
Rule 20: If NL is high, VL is MP and NC is train, then VLC is HP.
Rule 21: If NL is high, VL is LP and NC is office, then VLC is LP.
Rule 22: If NL is low, VL is LP and NC is office, then VLC is LP.
Rule 23: If NL is low, VL is LP and NC is market, then VLC is LP.
Rule 24: If NL is low, VL is MP and NC is car, then VLC is MP.
Rule 25: If NL is low, VL is MP and NC is market, then VLC is HP.
Rule 26: If NL is low, VL is HP and NC is train, then VLC is VHP.
Rule 27: If NL is medium, VL is LP and NC is car, then VLC is HP.
Rule 28: If NL is medium, VL is LP and NC is market, then VLC is MN.
Rule 29: If NL is medium, VL is MP and NC is car, then VLC is MP.
Rule 30: If NL is medium, VL is MP and NC is market, then VLC is MN.
Rule 31: If NL is medium, VL is HP and NC is train, then VLC is VHP.
Rule 32: If NL is high, VL is MP and NC is market, then VLC is MP.
Rule 33: If NL is low, VL is LP and NC is car, then VLC is HP.
Rule 34: If NL is medium, VL is MP and NC is train, then VLC is VHP.
Rule 35: If NL is low, VL is HP and NC is office, then VLC is MN.
Rule 36: If NL is high, VL is MP and NC is market, then VLC is HP.
Rule 37: If NL is low, VL is LP and NC is office, then VLC is MP.
Rule 38: If NL is medium, VL is MP and NC is market, then VLC is MP.
Rule 39: If NL is high, VL is HP and NC is car, then VLC is HP.
Rule 40: If NL is low, VL is MP and NC is train, then VLC is VHP.
IMPLEMENTATION OF INTELLIGENT VOLUME CONTROLLER (IVC) FOR MOBILE PHONES
The outputs of the neural background noise classifier (noise class, NC) and the neural noise level detector (noise level, NL), together with the volume level (VL) from the mobile phone, were fed to the fuzzy volume controller, and the corresponding volume level change (VLC) was found to vary from 5 dB to 60 dB.
The volume level (VL) of human talk ranges from 30 dB to 60 dB. It was mixed with the following four categories of internet noises, from 0.1 dB to 100 dB, and normalized to the range 0 to 1 for defining the fuzzy membership functions: (a) office noise, (b) market noise, (c) car noise, and (d) train noise.
Using the fuzzy volume controller for mobile phones, the maximum VLC (volume level change) was observed for the train noise class and the minimum for the office noise class. The VLC range was 0-10 dB for office noise, 40-65 dB for train noise, and 10-40 dB for car and market noise. The overall VLC of the intelligent volume controller varied from 5 dB to 60 dB.
Based on the forty fuzzy rules, the following plots were obtained in MATLAB.
Membership function plots were obtained for the following FIS variables:
Noise level (NL): low (0-0.4), medium (0.3-0.7) and high (0.6-1.0).
Volume level (VL): low positive, LP (0-0.4); medium positive, MP (0.3-0.8); and high positive, HP (0.7-1).
Noise class (NC): office noise class, ONC (0-0.3); market noise class, MNC (0.2-0.5); car noise class, CNC (0.45-0.75); and train noise class, TNC (0.7-1).
Volume level change (VLC): low positive, LP (0-0.3); medium negative, MN (0.2-0.5); medium positive, MP (0.4-0.8); high positive, HP (0.7-0.9); and very high positive, VHP (0.9-1).
A plot was also obtained for the 40 fuzzy rules in the rule editor. In the rule viewer of the MATLAB fuzzy toolbox, the inputs NL = 0.333, VL = 0.317 and NC = 0.575 gave VLC = 0.407. A surface plot was obtained in the surface viewer for NL (0-1), VL (0-1) and VLC (1-0.8). In this manner, plots were obtained for the four categories of noise.
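How the membership ranges and the if-then rules combine into a volume level change can be sketched with a small Mamdani-style inference routine. The ranges are the ones listed above and the five rules are Rules 1, 3, 4, 5 and 6 from the rule list; everything else is an assumption made for illustration: the symmetric triangular membership shape, min for "and", max for aggregation, centroid defuzzification, and all names. It is not expected to reproduce the MATLAB rule viewer value.

```python
def tri(x, lo, hi):
    """Symmetric triangular membership on [lo, hi], peaking at the midpoint."""
    mid = (lo + hi) / 2.0
    if x <= lo or x >= hi:
        return 0.0
    return (x - lo) / (mid - lo) if x <= mid else (hi - x) / (hi - mid)

NL = {"low": (0.0, 0.4), "medium": (0.3, 0.7), "high": (0.6, 1.0)}
VL = {"LP": (0.0, 0.4), "MP": (0.3, 0.8), "HP": (0.7, 1.0)}
NC = {"office": (0.0, 0.3), "market": (0.2, 0.5),
      "car": (0.45, 0.75), "train": (0.7, 1.0)}
VLC = {"LP": (0.0, 0.3), "MN": (0.2, 0.5), "MP": (0.4, 0.8),
       "HP": (0.7, 0.9), "VHP": (0.9, 1.0)}

# (NL term, VL term, NC term, VLC term): a small subset of the forty rules.
RULES = [
    ("low", "LP", "office", "LP"),      # Rule 1
    ("medium", "LP", "office", "MN"),   # Rule 3
    ("high", "HP", "car", "HP"),        # Rule 4
    ("high", "HP", "train", "VHP"),     # Rule 5
    ("medium", "MP", "car", "MP"),      # Rule 6
]

def infer_vlc(nl, vl, nc, steps=1000):
    """Return the defuzzified VLC in [0, 1], or None if no rule fires."""
    # Firing strength of each rule: minimum of its antecedent memberships.
    fired = [(min(tri(nl, *NL[a]), tri(vl, *VL[b]), tri(nc, *NC[c])), out)
             for a, b, c, out in RULES]
    num = den = 0.0
    for i in range(steps + 1):
        y = i / steps
        # Clip each consequent at its firing strength, aggregate with max.
        mu = max(min(s, tri(y, *VLC[out])) for s, out in fired)
        num += y * mu
        den += mu
    return num / den if den > 0.0 else None

# Usage: high noise, high current volume, car-like noise class.
vlc = infer_vlc(0.8, 0.85, 0.6)
```

With these inputs only Rule 4 fires, so the result sits at the centre of the HP output range.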
Overall, VLC (volume level change) was observed to be maximum for train noise and minimum for office noise. After comparison based on noise attenuation level, the performance of the fuzzy (PID) volume controller for mobile phones was found to be better than that of the other models: conventional (PI), neural (PI, PD and PID) and fuzzy (PI and PD).
Fuzzy Volume Controller for Mobile Phones. (2024, Feb 06). Retrieved from https://studymoose.com/document/fuzzy-volume-controller-for-mobile-phones