Multi-modal, Multi-resource Methods for Placing Flickr Videos on the Map
Citation key 1293Kelm2011
Author Pascal Kelm and Sebastian Schmiedeke and Thomas Sikora
Title of Book ACM International Conference on Multimedia Retrieval (ICMR)
Pages 8
Year 2011
DOI 10.1145/1991996.1992048
Month apr
Abstract We present three approaches for placing videos in Flickr on the world map. The toponym extraction and geo lookup approach makes use of external resources to identify toponyms in the metadata and associate them with geo-coordinates. The metadata-based region model approach uses a k-nearest-neighbour classifier trained over geographical regions. Videos are represented using their metadata in a text space with reduced dimensionality. The visual region model approach uses a support vector machine also trained over geographical regions. Videos are represented using low-level feature vectors from multiple key frames. Voting methods are used to form a single decision for each video. We compare the approaches experimentally, highlighting the importance of using appropriate metadata features and suitable regions as the basis of the region model. The best performance is achieved by the geo-lookup approach used with fallback to the visual region model when the video metadata contains no toponym.
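
The best-performing combination described in the abstract (geo lookup with fallback to a region model, using voting over key frames) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy `GAZETTEER`, the function names `geo_lookup`, `vote`, and `place_video`, the use of simple majority voting, and the idea of returning a region centre as the fallback coordinate are all assumptions made for illustration. The per-frame region labels stand in for the output of a visual classifier, which is not modelled here.

```python
from collections import Counter
from typing import Dict, Optional, Sequence, Tuple

Coord = Tuple[float, float]  # (latitude, longitude)

# Hypothetical toy gazetteer; the paper relies on external resources instead.
GAZETTEER: Dict[str, Coord] = {
    "berlin": (52.52, 13.405),
    "paris": (48.8566, 2.3522),
}


def geo_lookup(metadata: str) -> Optional[Coord]:
    """Return the coordinates of the first known toponym in the metadata, if any."""
    for token in metadata.lower().split():
        coord = GAZETTEER.get(token.strip(".,;:!?"))
        if coord is not None:
            return coord
    return None


def vote(frame_regions: Sequence[str]) -> str:
    """Majority vote over per-key-frame region predictions (assumed voting rule)."""
    return Counter(frame_regions).most_common(1)[0][0]


def place_video(
    metadata: str,
    frame_regions: Sequence[str],
    region_centres: Dict[str, Coord],
) -> Coord:
    """Use geo lookup first; fall back to the voted region when no toponym is found."""
    coord = geo_lookup(metadata)
    if coord is not None:
        return coord
    return region_centres[vote(frame_regions)]
```

Under these assumptions, a video whose metadata mentions "Berlin" is placed by the gazetteer, while a video with no recognisable toponym is placed at the centre of the region that most of its key frames were assigned to.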
