Musical genres are categories defined by human listeners, and assigning music to them depends on human hearing. Music within a genre shares common characteristics related to instrumentation, rhythmic structure, and harmonic content. Much music is still classified manually today. An automated system for musical genre classification can assist or replace this manual work. In this paper, the automatic classification of audio signals into a hierarchy of musical genres is explored. Three feature sets for representing timbral texture, rhythmic content, and pitch content are proposed. A two-time KNN classification method is also proposed and shown to improve accuracy: two-time KNN classification reaches 77.9% accuracy, compared with 73.3% for one-time KNN classification, an improvement of about 5 percentage points.
Index Terms – Music classification, feature extraction, wavelets, KNN classification
Table of Contents
I. Introduction
II. Music Modeling & Genre Segmentation
III. Feature Extraction
A. Timbral Texture Features
i. Spectral shape features
ii. Mel-frequency cepstral coefficients (MFCCs)
iii. Texture window
iv. Low-Energy features
B. Rhythmic Content Features
C. Pitch Content Features
IV. Classification
V. Evaluation and Discussion
VI. References
I. Introduction
Musical genres are categories defined by human listeners, and assigning music to them depends on human hearing. Music within a genre shares common characteristics related to instrumentation, rhythmic structure, and harmonic content. Genre classification became more important when the music industry moved from CDs to the web, where music is distributed in very large quantities. Much music is still classified manually today. An automated system for musical genre classification can assist or replace this manual work.
The web has made it possible to access large amounts of all kinds of data, such as music, movies, and news. Music databases have grown exponentially since the first perceptual coders appeared in the early 1990s. As these databases grew, they created a demand for tools that can search, retrieve, and handle large amounts of data. Musical genre classification has been a valuable tool for searching, retrieving, and handling large music databases [1-3]. There are several other such methods, including music emotion classification, beat tracking, and preference recommendation.
Musical genre classification (MGC) is used to categorize and describe music. Musical genres have no precise definitions or boundaries because they are categorized by human hearing. Genre classification is also closely tied to marketing and to historical and cultural factors. Different countries and organizations maintain different genre lists, and they even give the same genre different definitions, so it is hard to define particular genres precisely. To date there is no official specification of music genres. There are about 500 to 800 genres in music [7, 8]. Some researchers have suggested definitions of musical genre classification. After several attempts to define musical genres, researchers found that the members of a genre share certain characteristics such as instrumentation, rhythmic structure, and pitch content. Genre hierarchies have been created by human experts and are currently used to classify music on the web.
Automatic MGC can automate this classification process and serve as an important component of a complete music information system. The most significant proposal dealing specifically with this task was released in 2002. Several strategies for related problems have also been proposed in the research literature. In this paper, an automatic musical genre classification system is proposed, as shown in Figure 1. For feature extraction, three sets of features representing instrumentation (timbral), rhythmic content, and pitch content are proposed.
Figure 1 Automatic Musical Genre Classification
II. Music Modeling & Genre Segmentation
An untrained, non-expert person can detect the genre of a song with an accuracy of 72% after hearing a three-second segment of the song. A computer, however, is not designed like the human brain and cannot perform MGC the way a human does. Although using the whole song may to some extent affect how representative the features are, it allows most of the features that the music contains to be extracted. Extracting a short segment is also ill-suited to an automated system, because it is difficult to find the exact portion of a song that represents its genre. In the absence of research that identifies which section of a song best represents its characteristics, modeling the whole song is the appropriate approach to MGC.
Too many music genres are in use on the web [7, 8]. The genre set has to be simplified, so this paper adopts genres that are popularly used in MP3 players on the market.
Figure 2 Taxonomy of Music Genre
III. Feature Extraction
Feature extraction is the process of computing a numerical representation that can be used to characterize a segment of audio and classify its genre. A digital music file contains data sampled from an analog audio signal, and its data size is huge compared to the information it actually carries. Features are therefore extracted from the audio signal to obtain more meaningful information and to reduce the processing load. For feature extraction, three sets of features representing instrumentation (timbral), rhythmic content, and pitch content will be used.
1. Timbral Texture Features
The features used to represent timbral texture are based on features proposed for speech recognition. The following specific features are usually used to represent timbral texture.
① Spectral shape features [1-3]
Spectral shape features are computed directly from the power spectrum of an audio signal frame and describe the shape and characteristics of that spectrum. They are based on the short-time Fourier transform (STFT) and are calculated for every short-time frame of sound. Several spectral shape features can be extracted.
1. Spectral centroid is the centroid of the magnitude spectrum of the STFT, and it is a measure of spectral brightness.
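The spectral centroid described above can be illustrated with a minimal Python sketch. This is not the implementation used in this paper. The function names, the Hann window, and the frame size and hop of 256 and 128 samples are assumptions chosen for illustration, and the direct DFT is used only to keep the example free of external libraries; an FFT routine would normally be used instead.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitude of the DFT of one frame (non-negative frequency bins only)."""
    n = len(frame)
    # Hann window to reduce spectral leakage (an assumption, not from the paper).
    windowed = [
        x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
        for i, x in enumerate(frame)
    ]
    spectrum = []
    for k in range(n // 2 + 1):
        # Direct DFT sum for bin k; slow but dependency-free.
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed)
        )
        spectrum.append(abs(acc))
    return spectrum


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency (in Hz) of one frame."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0.0:
        return 0.0  # silent frame: centroid is undefined, report 0 by convention
    n = len(frame)
    # Bin k corresponds to the frequency k * sample_rate / n.
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def frame_centroids(signal, sample_rate, frame_size=256, hop=128):
    """Spectral centroid of every full short-time frame of the signal."""
    return [
        spectral_centroid(signal[start:start + frame_size], sample_rate)
        for start in range(0, len(signal) - frame_size + 1, hop)
    ]
```

Under these assumptions, a pure tone gives a centroid close to its own frequency, and a signal with more high-frequency energy gives a higher centroid, which is the sense in which the feature measures brightness.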