Audio-Visual Localization Of Multiple Speakers In A Video Teleconferencing Setting
Topics: Audio and video
Source: York University
Overview: Attending to multiple speakers in a video teleconferencing setting is a complex task. From a visual point of view, multiple speakers can occur at different locations and present radically different appearances. From an audio point of view, multiple speakers may be speaking at the same time, and background noise may make it difficult to localize sound sources without some a priori estimate of the sound source locations. This paper presents a novel sensor and corresponding sensing algorithms to address the task of attending, simultaneously, to multiple speakers for video teleconferencing. A panoramic visual sensor is used to capture a 360° view of the speakers in the environment, and from this view potential speakers are identified via a color histogram approach.
Format: PDF | Size: 810KB | Date: Jan 2003 | Pages: 32





