Multimedia Analysis and Processing


The Communication Systems Group of the TU Berlin, led by Prof. Dr. Thomas Sikora, has a strong track record in multimedia analysis and processing, with more than 50 publications in this field, and has been involved in many nationally and internationally funded projects on the topic. The following list gives an overview of the research areas we are involved in:

  • Video surveillance, people safety and privacy protection
  • Multi-object tracking
  • Optical flow estimation
  • Crowd analysis and people counting
  • Lost luggage detection
  • Violent behaviour detection
  • Genre and commercial detection in TV broadcast
  • Geo-tagging
  • Face detection and recognition
  • Image and video segmentation
  • MPEG-7
  • Analysis and classification of speech, audio and video data
  • etc.

Research Activities

IOU Tracker


Tracking-by-detection is a common approach to multi-object tracking. With the ever-increasing performance of object detectors, the basis for a tracker becomes much more reliable. Combined with commonly higher frame rates, this shifts the challenges that a successful tracker has to address. We propose a very simple tracking algorithm which can compete with more sophisticated approaches at a fraction of the computational cost. Thorough experiments with a wide range of object detectors show its potential. The proposed method can easily run at thousands of frames per second (fps) while outperforming the state of the art on the DETRAC vehicle tracking dataset and achieving competitive results on the MOT17 benchmark. More on: IOU Tracker
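The idea of associating detections purely by their overlap between consecutive frames can be illustrated with a minimal sketch. It is not the group's published implementation: the greedy matching strategy, the function names `iou` and `track_by_iou`, and the parameters `iou_threshold` and `min_length` are assumptions made for this example. Boxes are assumed to be given as (x1, y1, x2, y2) tuples.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def track_by_iou(frames, iou_threshold=0.5, min_length=2):
    """Greedy overlap-based association (hypothetical sketch).

    frames: list of per-frame detection lists, each detection a box tuple.
    Returns a list of tracks, each track a list of boxes.
    """
    active, finished = [], []
    for detections in frames:
        unmatched = list(detections)
        still_active = []
        for track in active:
            # Pick the detection that overlaps most with the track's last box.
            best = max(unmatched, key=lambda d: iou(track[-1], d), default=None)
            if best is not None and iou(track[-1], best) >= iou_threshold:
                track.append(best)
                unmatched.remove(best)
                still_active.append(track)
            else:
                # No sufficiently overlapping detection: the track ends here.
                finished.append(track)
        # Every detection left over starts a new track.
        still_active.extend([d] for d in unmatched)
        active = still_active
    finished.extend(active)
    # Assumption: very short tracks are treated as false positives and dropped.
    return [t for t in finished if len(t) >= min_length]
```

Because each frame needs only a handful of box comparisons and no image data, a scheme of this kind is cheap to run, which fits the high frame rates mentioned above.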

Multi-Object Tracking


The Probability Hypothesis Density (PHD) filter is a multi-object Bayes filter which has recently attracted a lot of interest in the tracking community, mainly for its linear complexity and its ability to deal with high clutter, especially in radar/sonar scenarios. In computer vision, however, the underlying constraints differ from those of radar scenarios and have to be taken into account when using the PHD filter. More on: Multi-Object Tracking

Robust Local Optical Flow Estimation


The Robust Local Optical Flow (RLOF) is a sparse optical flow and feature tracking method. Its main objective is to provide a fast and accurate motion estimation solution. The main advantage of the RLOF approach is its adjustable runtime and computational complexity, which, in contrast to most common optical flow methods, depend linearly on the number of motion vectors (features) to be estimated. RLOF is thus a local optical flow method, most closely related to the PLK method (better known as the KLT tracker) and hence to the well-known Lucas-Kanade method. A sparse-to-dense interpolation scheme allows for fast computation of dense optical flow fields. More on: Robust Local Optical Flow Estimation
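To illustrate why the cost of a local method grows with the number of features, the following sketch estimates one motion vector from one image window using the classic Lucas-Kanade least-squares step. It is plain Lucas-Kanade, not RLOF itself; the function name `lucas_kanade_at`, the `half_window` parameter and the `img[y][x]` list layout are assumptions made for this example.

```python
def lucas_kanade_at(img0, img1, cx, cy, half_window=7):
    """Estimate the flow (u, v) at pixel (cx, cy) from a single local window.

    img0, img1: 2-D lists of grey values indexed as img[y][x].
    The window plus a one-pixel border must lie inside the image.
    Returns None if the window has too little texture to solve for (u, v).
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - half_window, cy + half_window + 1):
        for x in range(cx - half_window, cx + half_window + 1):
            # Central-difference spatial gradients and temporal difference.
            ix = (img0[y][x + 1] - img0[y][x - 1]) / 2.0
            iy = (img0[y + 1][x] - img0[y - 1][x]) / 2.0
            it = img1[y][x] - img0[y][x]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None
    # Solve [[sxx, sxy], [sxy, syy]] (u, v)^T = -(sxt, syt)^T by Cramer's rule.
    u = (-sxt * syy + sxy * syt) / det
    v = (-syt * sxx + sxy * sxt) / det
    return u, v
```

Each feature requires one such window to be processed, so tracking N features costs roughly N times the work of a single window. This is the linear dependence on the number of motion vectors described above.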

Multimodal Geo-Tagging


We present a hierarchical, multi-modal approach for placing Flickr videos on the map. Our approach makes use of external resources to identify toponyms in the metadata and of visual and textual features to identify similar content. First, the geographical boundaries extraction method identifies the country and its dimension. We use a database of more than 3.6 million Flickr images, which we group into geographical regions to build a hierarchical model. A fusion of visual and textual methods is used to classify the video's location into possible regions. Next, the visually nearest neighbour method finds correspondences with the training images within the pre-classified regions. The video sequences are represented using low-level feature vectors from multiple key frames. Each test video is tagged with the geo-information of the visually most similar training item within the regions selected for it by the pre-classification step. The results show that we are able to tag one third of our videos correctly within an error of 1 km. More on: Multimodal Geo-Tagging
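The final nearest-neighbour step can be sketched as follows. This is a simplified illustration, not the system described above: the dictionary keys `feature`, `region` and `location`, the function name `nearest_training_item`, and the use of Euclidean distance on a single feature vector per video are all assumptions made for this example.

```python
import math


def nearest_training_item(query_feature, training_items, allowed_regions):
    """Return the training item closest to the query in feature space.

    Only items whose region passed the pre-classification step
    (allowed_regions) are considered. Returns None if no item qualifies.
    """
    best, best_dist = None, math.inf
    for item in training_items:
        if item["region"] not in allowed_regions:
            continue  # filtered out by the pre-classification step
        dist = math.dist(query_feature, item["feature"])
        if dist < best_dist:
            best, best_dist = item, dist
    return best


# Toy data, invented purely for illustration.
training = [
    {"feature": (0.0, 0.0), "region": "A", "location": (52.52, 13.40)},
    {"feature": (1.0, 1.0), "region": "B", "location": (48.86, 2.35)},
    {"feature": (0.9, 0.9), "region": "A", "location": (53.55, 9.99)},
]
match = nearest_training_item((1.0, 1.0), training, allowed_regions={"A"})
predicted_location = match["location"] if match else None
```

Restricting the search to pre-classified regions both reduces the number of comparisons and prevents a visually similar item from a distant region from being chosen.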

Consistent Two-Level Metric


Since the commonly used benchmarks for abandoned object detection (AOD) contain only a few abandoned objects and lack a standardized evaluation procedure, an objective performance comparison between different methods is difficult. We therefore propose a new evaluation metric, which focuses on an end-user application case, and an evaluation protocol which eliminates the uncertainties of previous performance assessments. More on: Consistent Two-Level Metric

Motion-based Object Segmentation


We present an unsupervised motion-based object segmentation algorithm for video sequences with a moving camera, employing bidirectional inter-frame change detection. For every frame, two error frames are generated using motion compensation. They are combined, and a segmentation algorithm based on thresholding is applied. We employ a simple and effective error fusion scheme and consider spatial error localization in the thresholding step. We find the optimal weights for the weighted mean thresholding algorithm, which enables unsupervised and robust moving object segmentation. More on: Motion-based Object Segmentation
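The pipeline of two error frames, fusion and thresholding can be sketched as below. The text does not specify the fusion rule or the exact threshold, so this sketch assumes a pixel-wise minimum for fusion and a threshold equal to a weight times the mean fused error. It also assumes that the neighbouring frames have already been motion-compensated, and it ignores spatial error localization. The function name `segment_moving_object` and the parameter `weight` are hypothetical.

```python
def segment_moving_object(current, prev_compensated, next_compensated, weight=2.0):
    """Bidirectional change detection on grey-value frames (2-D lists).

    prev_compensated / next_compensated: neighbouring frames already warped
    onto the current frame by (global) motion compensation.
    Returns a 2-D list of booleans, True where a pixel is labelled foreground.
    """
    height, width = len(current), len(current[0])
    fused = [
        [
            # Assumed fusion rule: keep only errors present in BOTH directions,
            # which suppresses background uncovered in just one direction.
            min(
                abs(current[y][x] - prev_compensated[y][x]),
                abs(current[y][x] - next_compensated[y][x]),
            )
            for x in range(width)
        ]
        for y in range(height)
    ]
    mean_error = sum(map(sum, fused)) / (height * width)
    threshold = weight * mean_error  # assumed weighted-mean threshold
    return [[value > threshold for value in row] for row in fused]
```

With this fusion rule, a pixel is marked as foreground only if it differs from both the previous and the next compensated frame, so the mask follows the object's position in the current frame.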

Short-term Motion-based Object Segmentation


Motion-based segmentation approaches either employ long-term motion information or suffer from a lack of accuracy and robustness. We present an automatic motion-based object segmentation algorithm for video sequences with a moving camera that employs only short-term motion information. For every frame, two error frames are generated using motion compensation. They are combined, and a thresholding segmentation algorithm is applied. Recent advances in the field of global motion estimation enable outlier elimination in the background area, and thus a more precise definition of the foreground. We propose a simple and effective error frame generation scheme and consider spatial error localization. We thereby achieve improved performance compared with a previously proposed short-term motion-based method, and we provide subjective as well as objective evaluation. More on: Short-term Motion-based Object Segmentation

Software & Datasets

Background Subtraction / Foreground Detection
SGMM-SOD Library
Robust Local Optical Flow
RLOF Library
Evaluation framework for abandoned object detection
IOU Tracker
Code on Github
Multi-Object and Multi-Camera Tracking Dataset
MOCAT Dataset
Crowd Analysis Optical Flow, Tracking and Detection Dataset
TUBCrowdFlow on Github

Related Publications






  • Rubén Heras Evangelio, Michael Pätzold, Thomas Sikora
    Splitting Gaussians in Mixture Models
    9th IEEE International Conference on Advanced Video and Signal-Based Surveillance, Beijing, China, 18.09.2012 - 21.09.2012
    ISBN: 978-1-4673-2499-1
  • Tobias Senst, Rubén Heras Evangelio, Ivo Keller, Thomas Sikora
    Clustering Motion for Real-Time Optical Flow based Tracking
    IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS 2012), Beijing, China, 18.09.2012 - 21.09.2012, pp. 410-415
    ISBN: 978-1-4673-2499-1 DOI: 10.1109/AVSS.2012.20
  • Alexander Kuhn, Tobias Senst, Ivo Keller, Thomas Sikora, Holger Theisel
    A Lagrangian Framework for Video Analytics
    IEEE Workshop on Multimedia Signal Processing (MMSP 2012), Banff, Canada, 17.09.2012 - 19.09.2012, pp. 387-392
    IEEE Catalog Number: CFP12MSP-USB E-ISBN : 978-1-4673-4571-2 Print ISBN: 978-1-4673-4570-5 INSPEC Accession Number: 13116365 DOI: 10.1109/MMSP.2012.6343474




  • Mustafa Karaman, Lutz Goldmann, Thomas Sikora
    Improving object segmentation by reflection detection and removal
    Visual Communications and Image Processing (VCIP), IS&T/SPIE's Electronic Imaging 2009, San Jose, CA, USA, 18.01.2009 - 22.01.2009

