TY - CONF
T1 - Scene modelling and classification using learned spatial relations
AU - Dee, Hannah M.
AU - Hogg, David C.
AU - Cohn, Anthony G.
N1 - Dee, H. M., Hogg, D. C., and Cohn, A. G. Scene Modelling and Classification Using Learned Spatial Relations. Springer LNCS (Conference on Spatial Information Theory (COSIT)), pp. 295-311, l'Aber Wrac'h, France, September 2009.
Sponsorship: EPSRC
PY - 2009
AB - This paper describes a method for building visual scene models from video data using quantized descriptions of motion. This method enables us to make meaningful statements about video scenes as a whole (such as “this video is like that video”) and about regions within these scenes (such as “this part of this scene is similar to this part of that scene”). We do this through unsupervised clustering of simple yet novel motion descriptors, which provide a quantized representation of gross motion within scene regions. Using these we can characterise the dominant patterns of motion, and then group spatial regions based upon both proximity and local motion similarity to define areas or regions with particular motion characteristics. We are able to process scenes in which objects are difficult to detect and track due to variable frame-rate, video quality or occlusion, and we are able to identify regions which differ by usage but not by appearance (such as frequently used paths across open space). We demonstrate our method on 50 videos spanning very different scene types: indoor scenarios with unpredictable, unconstrained motion, junction scenes, road and path scenes, and open squares or plazas. We show that these scenes can be clustered using our representation, and that incorporating learned spatial relations into the representation enables us to cluster more effectively.
DO - 10.1007/978-3-642-03832-7_18
M3 - Paper
SP - 295
EP - 311
ER -