TU Berlin

Sound Classification and Similarity

Project information
Project Researcher: Dr.-Ing. Hyoung-Gook Kim
Research Student: Martin Haller

Sound Recognition


In this project, we present a generalized sound classification and indexing technique based on MPEG-7 descriptors. The goal of the system is to extract significant features from a sound clip in order to find sounds with similar characteristics.

The MPEG-7 sound classification and indexing tools consist of low-level descriptors and high-level description schemes. For the low-level descriptors, low-dimensional features based on spectral basis descriptors are produced in three stages: Normalized Audio Spectrum Envelope (NASE), Principal Component Analysis (PCA), and Independent Component Analysis (ICA). The high-level description schemes describe the modeling of audio features and the procedures for audio classification and retrieval. A classifier based on continuous hidden Markov models (HMMs) is applied. The state path of the sound model selected by maximum likelihood is stored in an MPEG-7 sound database and used as an index for query applications.

Keywords: MPEG-7, Normalized Audio Spectrum Envelope, Principal Component Analysis, Independent Component Analysis, hidden Markov models

Figure 1 shows the graphical user interface of the sound recognition system, which uses the sound classification tools to search a large database by category and finds the best matches to a selected query sound using state-path histograms.
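The feature-reduction stages and the state-path histogram matching described above can be illustrated with a minimal NumPy sketch. It is not the project's implementation: the function names (`nase`, `pca_project`, `state_path_histogram`, `histogram_distance`) are hypothetical, the input is assumed to be a precomputed frames-by-bands power envelope, the ICA stage is left out, and the squared-difference distance between histograms is an assumption chosen for illustration.

```python
import numpy as np


def nase(envelope_power, eps=1e-10):
    """Convert a (frames x bands) power envelope to dB and scale each frame to unit L2 norm.

    Returns the normalized frames and the per-frame gain (the L2 norm that was divided out).
    """
    db = 10.0 * np.log10(envelope_power + eps)
    gain = np.linalg.norm(db, axis=1, keepdims=True)
    gain[gain == 0] = 1.0  # avoid division by zero for all-zero frames
    return db / gain, gain.ravel()


def pca_project(features, n_components):
    """Project mean-centered features onto the top principal directions (via SVD)."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components].T  # columns are orthonormal basis vectors
    return centered @ basis, basis


def state_path_histogram(path, n_states):
    """Relative frequency of each HMM state along a decoded state path."""
    counts = np.bincount(np.asarray(path), minlength=n_states).astype(float)
    return counts / counts.sum()


def histogram_distance(h1, h2):
    """Sum of squared differences between two histograms (assumed similarity measure)."""
    return float(np.sum((np.asarray(h1) - np.asarray(h2)) ** 2))
```

In such a sketch, a query sound would be reduced with `nase` and `pca_project`, decoded by the HMM of its best-matching class (for example with the Viterbi algorithm, not shown here), and then ranked against database entries by `histogram_distance` between their state-path histograms.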


We have created several different demonstration programs to illustrate applications of our sound recognition system.

See also:
Independent Component Analysis (ICA)
