9:00 - 11:00 SESSION I | Session Chair: TBD | |
Towards single-channel unsupervised source separation of speech mixtures: The layered harmonics/formants separation-tracking model
|
Manuel Reyes-Gomez, Nebojsa Jojic, Daniel P. W. Ellis |
|
Soft Mask Estimation for Single Channel Speaker Separation
|
Aarthi M. Reddy, Bhiksha Raj |
|
Hierarchical clustering applied to overcomplete BSS for convolutive mixtures
|
Stefan Winter, Hiroshi Sawada, Shoko Araki, Shoji Makino |
|
Separation of Sound Sources by Convolutive Sparse Coding
|
Tuomas Virtanen |
|
Sound Source Localization and Separation Based on the EM Algorithm
|
Futoshi Asano, Hideki Asoh |
|
A Sector-Based Approach for Localization of Multiple Speakers with Microphone Arrays
|
Guillaume Lathoud, Iain A. McCowan |
|
11:00 - 11:20 COFFEE BREAK |
||
11:20 - 13:00 SESSION II | Session Chair: TBD | |
Stochastic techniques in deriving perceptual knowledge
|
Hynek Hermansky | |
Physical principles driven joint evaluation of multiple F0 hypotheses
|
Chunghsin Yeh, Axel Röbel |
|
Harmonicity Based Blind Dereverberation with Time Warping
|
Tomohiro Nakatani, Keisuke Kinoshita, Masato Miyoshi, Parham S. Zolfaghari |
|
Auditory Segmentation Based on Event Detection
|
Guoning Hu, DeLiang Wang |
|
Features for segmenting and classifying long-duration recordings of "personal" audio
|
Daniel P. W. Ellis, Keansub Lee |
|
13:00 - 14:00 LUNCH BREAK |
||
14:00 - 16:00 SESSION III | Session Chair: Hynek Hermansky |
|
PLP-squared: Autoregressive modeling of auditory-like 2-D spectro-temporal patterns
|
Marios Athineos, Hynek Hermansky, Daniel P. W. Ellis | |
Auditory-based automatic speech recognition
|
Werner Hemmert, Marcus Holmberg, David Gelbart |
|
Model-Based Fusion of Bone and Air Sensors for Speech Enhancement and Robust Speech Recognition
|
John Hershey, Trausti Kristjansson, Zhengyou Zhang |
|
MAP Estimation of Speech Spectral Component Under GGD a Priori
|
Rajkishore Prasad, Hiroshi Saruwatari, Kiyohiro Shikano |
|
Multiple-Microphone Robust Speech Recognition Using Decoder-Based Channel Selection
|
Yasunari Obuchi |
|
Bayesian Networks for Error Handling through Multimodality Fusion in Spoken Dialogues with Mobile Robots
|
Plamen Prodanov, Andrzej Drygajlo |
|
16:00 - 16:20 COFFEE BREAK |
||
16:20 - 18:00 SESSION IV | Session Chair: TBD |
|
Representation and Classification of the Timbre Space of a Single Musical Instrument
|
Hugo de Paula, Hani Yehia, Mauricio A. Loureiro |
|
Specmurt Anasylis: A Piano-Roll-Visualization of Polyphonic Music Signal by Deconvolution of Log-Frequency Spectrum
|
Shigeki Sagayama, Keigo Takahashi, Hirokazu Kameoka, Takuya Nishimoto |
|
Drum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods
|
Kazuyoshi Yoshii, Masataka Goto, Hiroshi G. Okuno |
|
Modelling of Note Events for Singing Transcription
|
Matti P. Ryynänen, Anssi P. Klapuri |
|
Discovering Auditory Objects Through Non-Negativity Constraints
|
Paris Smaragdis |
|
18:00 OPEN DISCUSSION, LOCATION TBD |