TU Berlin

Communication Systems Group

Scientific Publications

How efficient is MPEG-7 for General Sound Recognition?
Citation key 0780Kim2004
Author Hyoung-Gook Kim and Juan José Burred and Thomas Sikora
Title of Book 25th International AES Conference Metadata for Audio
Year 2004
Address London, UK
Month June
Abstract Our challenge is to analyze and classify video soundtrack content for indexing purposes. To this end we compare the performance of MPEG-7 Audio Spectrum Projection (ASP) features, based on several basis decomposition algorithms, against Mel-scale Frequency Cepstrum Coefficients (MFCC). For basis decomposition in the feature extraction we evaluate three approaches: Principal Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). Audio features are computed from these reduced vectors and fed into a continuous hidden Markov model (CHMM) classifier. Our conclusion is that established MFCC features yield better performance than MPEG-7 ASP for general sound recognition under practical constraints.