A pertinent evaluation of automatic video summary

Sivapriyaa Kannappan, Yonghuai Liu, Bernard Tiddeman

Research output: Chapter in Book/Report/Conference proceeding › Conference Proceeding (Non-Journal item)

4 Citations (SciVal)


Video summarization is useful for finding a concise representation of the original video, but evaluating it is challenging. This paper proposes a simple and efficient method for precisely evaluating the video summaries produced by existing techniques. The method has two steps. The first step establishes a set of matched frames between the automatic summary (AT) and the ground truth summary (GT) through a two-way search, in which the similarity between two frames is measured using the correlation coefficient. The second step estimates the consistency among these established matches, so that the differences among the matched frames in the AT and in the GT are preserved. To accomplish this, a compatibility matrix is built from the features extracted from each of these frames. The consistency values of the matched frames are estimated as the eigenvector of this matrix corresponding to the maximum eigenvalue. Matched frames with a sufficiently small consistency value are rejected, leading to a more accurate performance estimate for video summarization techniques. Experimental results on a publicly accessible dataset show that the proposed method is effective in finding true matches and provides a more realistic measurement of performance for various techniques.
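The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the form of the compatibility matrix, the frame features, or the rejection threshold, so the exponential compatibility score, the `sigma` and `threshold` parameters, and all function names here are assumptions chosen only for illustration. Frames are represented as plain feature vectors (lists of numbers).

```python
import math

def correlation(a, b):
    """Pearson correlation coefficient between two equal-length feature vectors."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # constant vector: correlation undefined, treated as 0 here
    return cov / math.sqrt(va * vb)

def two_way_matches(at, gt):
    """Step 1: keep (i, j) only if GT frame j is the best match for AT frame i
    and AT frame i is also the best match for GT frame j."""
    if not at or not gt:
        return []
    sim = [[correlation(a, g) for g in gt] for a in at]
    matches = []
    for i, row in enumerate(sim):
        j = max(range(len(gt)), key=lambda k: row[k])
        i_back = max(range(len(at)), key=lambda k: sim[k][j])
        if i_back == i:
            matches.append((i, j))
    return matches

def compatibility_matrix(at, gt, matches, sigma=1.0):
    """Step 2a (ASSUMED form): two matches are compatible when the distance between
    their AT frames is close to the distance between their GT frames."""
    def dist(u, v):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
    n = len(matches)
    C = [[1.0] * n for _ in range(n)]  # a match is fully compatible with itself
    for p, (i, j) in enumerate(matches):
        for q, (k, l) in enumerate(matches):
            if p != q:
                diff = abs(dist(at[i], at[k]) - dist(gt[j], gt[l]))
                C[p][q] = math.exp(-diff / sigma)
    return C

def principal_eigenvector(C, iters=200):
    """Step 2b: eigenvector for the largest eigenvalue, via power iteration."""
    n = len(C)
    if n == 0:
        return []
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(C[r][c] * v[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def consistent_matches(at, gt, threshold=0.5, sigma=1.0):
    """Reject matches whose consistency is small relative to the best one
    (the relative threshold is an assumption, not taken from the paper)."""
    matches = two_way_matches(at, gt)
    if not matches:
        return []
    scores = principal_eigenvector(compatibility_matrix(at, gt, matches, sigma))
    top = max(scores)
    return [m for m, s in zip(matches, scores) if s >= threshold * top]
```

Given two lists of per-frame feature vectors, `consistent_matches(at_features, gt_features)` returns the index pairs that survive both steps. Precision and recall of a summary could then be computed from the number of surviving pairs rather than from the raw matches.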
Original language: English
Title of host publication: Pattern Recognition
Subtitle of host publication: 23rd International Conference on Pattern Recognition 2016
Publisher: IEEE Press
Publication status: Published - 24 Apr 2017
Event: 23rd International Conference on Pattern Recognition - Cancún, Mexico
Duration: 04 Dec 2016 → 08 Dec 2016


Conference: 23rd International Conference on Pattern Recognition
Abbreviated title: ICPR
Period: 04 Dec 2016 → 08 Dec 2016

