Recognition Modelling

Because the number of extracted features was large, feature importance was first ranked with 10-fold cross-validated SVM-RFE (support vector machine recursive feature elimination). The features were then added one at a time, in rank order, to an LDA classifier, and the accuracy was recorded for each number of features. The number of features that gave the highest accuracy was used as the input for the subsequent classifications (see Fig. 8). The highest LDA accuracy was 89.2% with pre alone as the MRU and 95.6% with pre + n×mR0 as the MRU.
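The sequential-addition step described above can be sketched as follows. This is a minimal illustration, not the code used in this study: the function name `best_feature_count`, the feature names, and the stand-in scorer `toy_score` (with its made-up accuracy values) are all assumptions. In practice the scorer would train and evaluate an LDA classifier on the given feature subset.

```python
from typing import Callable, List, Sequence, Tuple


def best_feature_count(
    ranked_features: Sequence[str],
    score_fn: Callable[[List[str]], float],
) -> Tuple[int, List[float]]:
    """Add features one at a time in rank order and record the accuracy.

    Returns the smallest feature count that reaches the maximum accuracy,
    together with the full accuracy curve.
    """
    if not ranked_features:
        raise ValueError("ranked_features must not be empty")
    accuracies = []
    for k in range(1, len(ranked_features) + 1):
        subset = list(ranked_features[:k])  # top-k features by rank
        accuracies.append(score_fn(subset))
    # list.index returns the first maximum, i.e. the smallest such k
    best_k = accuracies.index(max(accuracies)) + 1
    return best_k, accuracies


def toy_score(subset: List[str]) -> float:
    """Hypothetical scorer: made-up accuracies, NOT results from this study."""
    curve = [0.60, 0.75, 0.90, 0.88, 0.90]
    return curve[len(subset) - 1]


# Hypothetical feature names, already sorted from most to least important.
ranked = ["f3", "f1", "f4", "f0", "f2"]
best_k, curve = best_feature_count(ranked, toy_score)
print(best_k, curve)
```

Choosing the smallest feature count that reaches the maximum is a design choice made here for illustration; the report only states that the best number of features was recorded.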

 

None of the MFCC feature sets extracted with a fixed number of windows outperformed the GMM fitting method in LDA classification (6 windows: 86.6%; 10 windows: 88.5%; 100 windows: <80%), so the other classifiers were evaluated using only the features extracted by the GMM fitting method. In each test, 20% of the data were randomly selected as the test set and the remaining 80% were used to train the classifier. This was repeated 10 times for each kernel function to record the distribution of the accuracy. The GMM classifier performed poorly when only pre was used as the MRU, and classification was generally better with pre + n×mR0 as the MRU than with pre alone.
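The repeated random 80/20 evaluation described above can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the function names, the fixed seed, the toy labels, and the majority-class stand-in classifier are all hypothetical. A real run would train and test LDA, SVM, or GMM classifiers inside `evaluate`.

```python
import random
from collections import Counter


def holdout_split(n_samples, test_fraction=0.2, rng=None):
    """Randomly split sample indices into (train, test) lists."""
    rng = rng or random.Random()
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = max(1, round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def repeated_holdout(n_samples, evaluate, repeats=10, test_fraction=0.2, seed=0):
    """Repeat the random split and collect one accuracy per repeat."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        train_idx, test_idx = holdout_split(n_samples, test_fraction, rng)
        scores.append(evaluate(train_idx, test_idx))
    return scores


# Toy labels (hypothetical): 30 samples of caller "A", 20 of caller "B".
labels = ["A"] * 30 + ["B"] * 20


def majority_baseline(train_idx, test_idx):
    """Stand-in classifier: always predict the most common training label."""
    majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    hits = sum(1 for i in test_idx if labels[i] == majority)
    return hits / len(test_idx)


scores = repeated_holdout(len(labels), majority_baseline)
print(min(scores), max(scores))  # spread of accuracy over the 10 repeats
```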

Many classifiers can be used for individual recognition. Considering their performance and applicability, this research compared the classification effectiveness of three classifiers that are well developed in gibbon bioacoustics or human sound pattern recognition: (1) linear discriminant analysis (LDA), (2) support vector machine (SVM), and (3) Gaussian mixture model (GMM), which classifies by measuring the similarity between the data to be identified and the existing data.

 

The basic method for extracting sound pattern characteristics has been identified, and a preliminary system for individual sound recognition of Hainan gibbons has been established. Our preliminary results show that the existing method is relatively reliable and can be expected to achieve the goals of the project. The most effective combination was pre + n×mR0 as the MRU, sound pattern characteristics extracted by the GMM fitting method, and a linear SVM for classification. In follow-up work, data for rare individuals will continue to be added, the design of the algorithm system will be improved, the classifier will be given the ability to recognise unknown individuals, and the performance of the system will be comprehensively evaluated, so as to ultimately achieve individual sound recognition of Hainan gibbons.

Sound pattern analysis

The manual screening of 532 Hainan gibbon acoustic samples has been completed, including recordings obtained with a portable recorder while tracking and observing gibbons and recordings obtained with automated recorders. During screening, the recordings were initially sorted into three quality categories: high, medium, and low. This yielded 44 high-quality recordings from seven individual callers: GAM1, GBM1, GBSA, GCM1, GCM2, GDM1, and GEM1. In these codes, the letter after “G” is the family group, “M” or “S” denotes an adult male or a subadult male, and the final character is the identifier of the individual. Only about 40.9% of the recordings were made manually. The raw files of all automated recordings were provided by the team of Professor Wang Jichao, and the related data are backed up at the Hainan Institute of National Park.
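The caller naming convention above can be decoded mechanically. The following is a minimal sketch, assuming every code has exactly four characters as in the seven codes listed; the names `CallerId` and `parse_caller_id` are hypothetical.

```python
import re
from typing import NamedTuple


class CallerId(NamedTuple):
    group: str       # family group letter, e.g. "A"
    age_class: str   # "adult male" or "subadult male"
    individual: str  # identifier of the individual within its class


# "G", then the group letter, then M or S, then one identifying character.
_PATTERN = re.compile(r"^G([A-Z])([MS])([A-Z0-9])$")
_AGE_CLASS = {"M": "adult male", "S": "subadult male"}


def parse_caller_id(code: str) -> CallerId:
    """Decode a caller code such as 'GAM1' or 'GBSA'."""
    match = _PATTERN.match(code)
    if match is None:
        raise ValueError(f"unrecognised caller code: {code!r}")
    group, age_key, individual = match.groups()
    return CallerId(group, _AGE_CLASS[age_key], individual)


callers = ["GAM1", "GBM1", "GBSA", "GCM1", "GCM2", "GDM1", "GEM1"]
for code in callers:
    print(code, parse_caller_id(code))
```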

 

Mel-frequency cepstral coefficients (MFCCs) are features that capture the spectral envelope through the cepstrum after high-frequency information has been weakened on the basis of human hearing [1]. They are widely used in human speech processing and bioacoustics. In this study, MFCCs and their first-order and second-order differences (Δ and Δ²) are used to achieve automated feature extraction.
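To make the first-order and second-order differences concrete, the sketch below computes them as plain frame-to-frame differences of a coefficient sequence. This is an assumption made for illustration: the report does not state how Δ and Δ² were computed, and many audio toolkits instead estimate them by a regression over several neighbouring frames. The function name and the toy values are hypothetical.

```python
def first_difference(frames):
    """Frame-to-frame difference of a sequence of coefficient vectors.

    `frames` is a list of equal-length lists (one list per time frame).
    The result has one frame fewer than the input.
    """
    return [
        [cur - prev for prev, cur in zip(prev_frame, cur_frame)]
        for prev_frame, cur_frame in zip(frames, frames[1:])
    ]


# Toy coefficient frames (hypothetical values, two coefficients per frame).
mfcc = [[1, 2], [3, 5], [6, 9], [10, 14]]

delta = first_difference(mfcc)    # first-order difference
delta2 = first_difference(delta)  # second-order difference

print(delta)   # [[2, 3], [3, 4], [4, 5]]
print(delta2)  # [[1, 1], [1, 1]]
```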

 

Five signature notes of the male Hainan gibbon have been identified (Fig. 1): the boom note, the aa note, the pre-modulated note (pre), the modulated-R0 note (mR0), and the modulated-R1 note.

 

According to the acoustic niche hypothesis, the calls of different species are differentiated in the time and frequency domains (see Fig. 2). Extracting features within a specific frequency range can therefore greatly reduce the influence of noise, and the narrower the range, the more noise is likely to be excluded. In addition, recognition becomes much easier when every minimum recognition unit (MRU) has the same structure.
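The effect of restricting the frequency range can be illustrated by selecting only the spectrum bins that fall inside a chosen band. The sketch below is a simplified illustration; the sample rate, FFT size, and band limits are hypothetical values chosen for easy arithmetic, not the settings used in this study.

```python
def band_bins(sample_rate, n_fft, f_low, f_high):
    """Indices of one-sided FFT bins whose centre frequency is in [f_low, f_high]."""
    bin_width = sample_rate / n_fft  # frequency spacing between bins, in Hz
    return [
        k for k in range(n_fft // 2 + 1)
        if f_low <= k * bin_width <= f_high
    ]


# Hypothetical settings: 8 kHz sampling, 16-point FFT, so bins are 500 Hz apart.
wide = band_bins(8000, 16, 1000, 2000)
narrow = band_bins(8000, 16, 1000, 1500)

print(wide)    # [2, 3, 4]
print(narrow)  # [2, 3]
```

Energy in bins outside the selected list is simply ignored, so a narrower band discards more of the spectrum where other species or background noise may be present.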

 

Given these considerations, in this phase of the research we tried two MRU definitions, (1) pre only and (2) pre + n×mR0, and compared the classification results to determine the most appropriate feature extraction unit for subsequent work. Once the recordings have been annotated, all of the above steps can be run automatically with R code.
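The two MRU definitions can be sketched as a grouping step over annotated notes. This is an illustration only and makes several assumptions: each annotation is a (label, start, end) tuple, n is taken to be the number of mR0 notes that immediately follow a pre note, and the labels, times, and function name are hypothetical. The study's own pipeline is written in R and may differ.

```python
def build_mrus(notes, include_mr0):
    """Group annotated notes into minimum recognition units (MRUs).

    notes: list of (label, start_s, end_s) tuples in time order.
    include_mr0: False -> each MRU is a single "pre" note;
                 True  -> each MRU is a "pre" note plus the run of
                          "mR0" notes that immediately follows it.
    """
    mrus = []
    i = 0
    while i < len(notes):
        if notes[i][0] == "pre":
            unit = [notes[i]]
            j = i + 1
            if include_mr0:
                # Extend the unit over consecutive mR0 notes.
                while j < len(notes) and notes[j][0] == "mR0":
                    unit.append(notes[j])
                    j += 1
            mrus.append(unit)
            i = j
        else:
            i += 1  # notes outside an MRU are skipped
    return mrus


# Hypothetical annotation of one recording (times in seconds).
annotated = [
    ("boom", 0.0, 0.5), ("aa", 0.6, 1.0),
    ("pre", 1.2, 2.0), ("mR0", 2.1, 2.8), ("mR0", 2.9, 3.6), ("mR1", 3.7, 4.5),
    ("pre", 6.0, 6.8), ("mR0", 6.9, 7.5),
]

pre_only = build_mrus(annotated, include_mr0=False)
pre_plus_mr0 = build_mrus(annotated, include_mr0=True)
print([len(u) for u in pre_only])      # [1, 1]
print([len(u) for u in pre_plus_mr0])  # [3, 2]
```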
