This paper describes our work on the retrieval of polyphonic notes from a musical performance. Current state-of-the-art transcription and score-following systems extract features from the audio signal and use them to estimate what has been played, either to transcribe the performance or to find a position within a musical score. We propose that extracting features from a video signal and fusing them with the audio features can improve the robustness of such a system. We offer a framework for integrating these two modalities and review preliminary work.
|Publication status||Published - 2008|
|Event||International Computer Music Conference, ICMC 2008 - Belfast, Ireland|
|Duration||24 Aug 2008 → 29 Aug 2008|