MPEG-7-based Audio Annotation for the Archival of Digital Video
|Manager ||Prof. Dr.-Ing. Thomas
|Funded by ||BMWA (German Federal Ministry of Economics and Labour)|
|Project Period ||11/2002 -
MPEG-7 is a standardisation initiative of the Moving Picture Experts Group
(MPEG) that, instead of focusing on coding like MPEG-1, MPEG-2 and
MPEG-4, aims to standardise the way multimedia content is described
(see also: MPEG-7 Link list).
This project is part of a larger one, called MPEG-7-based Archival of Digital Video. The objective of the larger project is a complete audio-visual database management platform that can segment, index and retrieve audio-visual data, based on MPEG-7 "descriptors" and tools.
Two other partners are involved:
- Heinrich-Hertz-Institut (HHI), which addresses the analysis of visual information
(MPEG-7-based Analyse and Visualisation Modules for the Archival of Digital Video).
- Canto Software, which addresses the general structure of the archival system
(MPEG-7-based Metadata Indexing Methods for the Archival of Digital Video).
Our part of the project concerns the segmentation, indexing and retrieval of audio information.
We focus on three main tasks:
Audio recordings are segmented and classified into coarse sound classes (voice, music, environmental sounds and silence) based on MPEG-7 Low Level Descriptors (LLDs).
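As an illustration of the segmentation step, the following minimal Python sketch computes a per-frame power value, loosely in the spirit of the MPEG-7 AudioPower descriptor, and merges frames into silence and non-silence segments. The function names, frame length and threshold are assumptions made for this example. It is not the normative MPEG-7 extraction, and it does not reproduce the project's classifier, which also separates voice, music and environmental sounds.

```python
import math

def frame_power(samples, frame_len, hop):
    """Mean squared amplitude of each frame (a simplified power feature)."""
    powers = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        powers.append(sum(x * x for x in frame) / frame_len)
    return powers

def segment_by_silence(powers, threshold):
    """Merge consecutive frames with the same label into
    (label, first_frame, last_frame) segments."""
    segments = []
    for i, p in enumerate(powers):
        label = "silence" if p < threshold else "sound"
        if segments and segments[-1][0] == label:
            # Same label as the previous frame: extend the open segment.
            segments[-1] = (label, segments[-1][1], i)
        else:
            segments.append((label, i, i))
    return segments

# Synthetic test signal (assumption): 400 silent samples, then a sine tone.
signal = [0.0] * 400 + [math.sin(2 * math.pi * 0.05 * n) for n in range(400)]
powers = frame_power(signal, frame_len=100, hop=100)
segments = segment_by_silence(powers, threshold=0.01)
print(segments)
```

In a fuller system the "sound" segments would be passed on to a classifier that uses further low-level descriptors. The fixed threshold here only stands in for that stage.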
- Sound Recognition and Classification
The MPEG-7 sound recognition tools provide a unified interface for searching media by automatically indexing audio with trained sound classes in a pattern recognition framework. We develop sound recognition systems that use (1) reduced-dimension features based on Independent Component Analysis (ICA) and (2) Hidden Markov Model (HMM) classifiers.
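The classification stage can be sketched as maximum-likelihood selection among one HMM per sound class. The Python example below is only a simplified illustration: it uses discrete observation symbols and hand-picked model parameters (both assumptions), and it omits the ICA-based feature reduction entirely. A real system would train the models on projected spectral features rather than on two toy symbols.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete-output HMM."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob

def classify(obs, models):
    """Return the class whose HMM gives the observation the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

# Hypothetical two-state models over symbols 0 (low energy) and 1 (high energy).
# Each entry is (start probabilities, transition matrix, emission matrix).
models = {
    "voice": ([0.5, 0.5], [[0.7, 0.3], [0.3, 0.7]], [[0.9, 0.1], [0.1, 0.9]]),
    "music": ([1.0, 0.0], [[0.9, 0.1], [0.1, 0.9]], [[0.2, 0.8], [0.3, 0.7]]),
}

print(classify([1, 1, 1, 1, 1, 1], models))
print(classify([0, 0, 1, 1, 0, 0], models))
```

Scaling the forward variables at every step keeps the computation stable on long sequences, which matters when each frame of a recording contributes one observation.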
- Spoken Content Indexing and
The MPEG-7 Spoken Content Description Tools allow detailed description of words and/or phones spoken within an audio stream. The Spoken Content Descriptor is a compact representation of the output of an Automatic Speech Recognition (ASR) system.
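To make the idea of a compact ASR output more concrete, the sketch below holds recognition hypotheses as a small lattice of weighted word links and shows two operations an index might need: collecting word hypotheses above a confidence threshold, and extracting the most probable word sequence. The tuple layout, the function names and the sample words are assumptions for illustration. They do not follow the MPEG-7 schema and are not taken from the project.

```python
from collections import defaultdict

# Each link is (from_node, to_node, word, probability).
# Nodes are numbered in time order, so from_node < to_node on every link.
links = [
    (0, 1, "film", 0.7),
    (0, 1, "firm", 0.3),
    (1, 2, "archive", 0.6),
    (1, 2, "archives", 0.4),
]

def build_index(links, min_prob=0.0):
    """Map each word to the lattice links on which it was hypothesised."""
    index = defaultdict(list)
    for src, dst, word, prob in links:
        if prob >= min_prob:
            index[word].append((src, dst, prob))
    return dict(index)

def best_path(links, start, end):
    """Most probable word sequence from start to end as (probability, words).

    Relies on from_node < to_node, so sorting the links by source node
    visits them in a valid topological order. Returns None if end is unreachable.
    """
    best = {start: (1.0, [])}
    for src, dst, word, prob in sorted(links):
        if src in best:
            score = best[src][0] * prob
            if dst not in best or score > best[dst][0]:
                best[dst] = (score, best[src][1] + [word])
    return best.get(end)

index = build_index(links, min_prob=0.5)
result = best_path(links, 0, 2)
print(sorted(index))
print(result)
```

Keeping the competing hypotheses ("firm", "archives") in the lattice instead of discarding them lets a retrieval query still match a word the recogniser ranked second.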
MPEG-7 AUDIO:
- Low Level Descriptors 
- Sound Recognition 
- Spoken Content 
- Speech Processing 
|Prof. Dr.-Ing. Thomas Sikora |
|Dr.-Ing. Hyoung-Gook Kim |
|Dr. Nicolas Moreau |
|Dipl.-Ing. Samour Amjad |
|Edgar Berdahl |
|Juan José Burred |
|Andreas Cobet |
|Yuanfeng Cui |
|Daniel Ertelt |
|Martin Haller |