Topic
CNN Application to Video Saliency
Type Master's thesis, research internship (Forschungspraxis), Bachelor's thesis, engineering internship (Ingenieurpraxis)
Supervisor Mikhail Startsev
Tel.: +49 (0)89 289-28550
E-Mail: mikhail.startsev@tum.de
Subject area Computer Vision
Description One of the important questions in computer vision is how to determine which information in a scene (represented by an image or a video) is relevant. So-called “saliency models” [1] have been used to predict informativeness in images. For videos, however, the ways of incorporating the temporal component of a frame sequence into an attention prediction model range from extremely computationally intensive (e.g. deep neural networks using 3D convolution operators) to hand-crafted (e.g. using optical flow, or feeding two subsequent frames as input).
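
As a rough illustration of the cost difference between these options, the sketch below counts weights and multiply-accumulate operations (MACs) for a single bias-free convolution layer in a 2D variant, a 3D variant, and a 2D first layer fed two stacked RGB frames. The layer sizes (64 channels, a 56x56 feature map, kernels of width 3) and the helper name conv_cost are assumptions chosen for illustration; they are not taken from any specific saliency model.

```python
from math import prod


def conv_cost(in_ch, out_ch, kernel, out_size):
    """Weights and multiply-accumulates (MACs) of one bias-free convolution.

    kernel   -- kernel extent per dimension, e.g. (3, 3) or (3, 3, 3)
    out_size -- number of output positions per dimension
    """
    weights = in_ch * out_ch * prod(kernel)
    macs = weights * prod(out_size)
    return weights, macs


# Illustrative sizes only (assumptions, not from a specific model).
CH, H, W = 64, 56, 56

# 2D convolution applied to the feature map of one frame.
w2d, m2d = conv_cost(CH, CH, (3, 3), (H, W))

# 3D convolution, cost counted per output frame (one temporal position).
w3d, m3d = conv_cost(CH, CH, (3, 3, 3), (1, H, W))

# First layer of a 2D CNN fed two stacked RGB frames (6 input channels)
# versus a single RGB frame (3 input channels).
w_pair, _ = conv_cost(2 * 3, CH, (3, 3), (H, W))
w_single, _ = conv_cost(3, CH, (3, 3), (H, W))

print(f"2D layer: {w2d} weights, {m2d} MACs per frame")
print(f"3D layer: {w3d} weights, {m3d} MACs per frame")
print(f"stacked-frame first layer: {w_pair} vs {w_single} weights")
```

Under these assumptions, the 3D layer needs three times the weights and operations of its 2D counterpart per output frame (the factor equals the temporal kernel width), and this applies to every layer that keeps a temporal kernel. Stacking two frames only doubles the weights of the first layer.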

To avoid or reduce the “hand-engineered” aspect of the features in use, different modifications of traditional 2D CNNs can be employed. Deep learning methods have already proven their worth in the image saliency task [2], and some results for videos are starting to appear as well. In this project, the candidate will work with various CNN models that operate on video data and compare their performance. Depending on progress, training several models from scratch on pre-recorded data may be beneficial.

[1] en.wikipedia.org/wiki/Salience_(neuroscience)
[2] saliency.mit.edu/results_mit300.html
Prerequisites Understanding of machine learning concepts and solid programming skills are desirable.
Application If you are interested in this topic, we welcome applications via the email address above. Please set the email subject to “<Type of application> application for topic 'XYZ'”, e.g. “Master’s thesis application for topic 'XYZ'”, and clearly state in the text of the message why you are interested in the topic. Also make sure to attach your most recent CV (if you have one) and grade report.