|
Technical Program
The workshop will be held in the
Hilton Portland Salon Ballroom, downstairs in the Hilton Executive Tower building at 545 SW Taylor (kitty-corner to the main hotel). From the entrance, go down the stairs on the right.
Click on each title to retrieve the corresponding paper, or download all papers as a single zip file:
sapascale2012papers.zip (16MB).
Polls Page
Friday September 7th |
0930-0945 | Welcome & introduction |
0945-1045 | Keynote 1:
Human sound perception - what can we learn from it when developing audio analysis algorithms?
Tuomas Virtanen (Tampere University of Technology)
|
1045-1105 |
break |
1105-1130 |
Pitch Estimation Using Mutual Information
(pp. 1-4)
Majid Mirbagheri, Yanbo Xu, Shihab Shamma (University of Maryland College Park)
|
1130-1155 |
Establishing some principles of human speech production through two-dimensional computational models
(pp. 5-10)
Mauro Nicolao, Roger K. Moore (University of Sheffield)
|
1155-1220 |
A Spectral Envelope Estimation Method Based on F0-Adaptive Multi-Frame Integration Analysis
(pp. 11-16)
Tomoyasu Nakano, Masataka Goto (AIST)
|
1220-1315 |
lunch |
1315-1340 |
Cochlear Implant-like Processing of Speech Signal for Speaker Verification
(pp. 17-21)
Cong-Thanh Do, Claude Barras (LIMSI-CNRS/Université Paris-Sud)
|
1340-1405 |
Speech intelligibility enhancement for HMM-based synthetic speech in noise
(pp. 22-27)
Cassia Valentini-Botinhao, Junichi Yamagishi, Simon King (University of Edinburgh)
|
1405-1430 |
A Generalized Stein's Estimation Approach for Speech Enhancement Based on Perceptual Criteria
(pp. 28-33)
Sunder Ram Krishnan, Chandra Sekhar Seelamantula (Indian Institute of Science)
|
1430-1455 |
Non-Stationary Signal Processing and its Application in Speech Recognition
(pp. 34-39)
Zoltán Tüske, Friedhelm R. Drepper, Ralf Schlüter (RWTH Aachen University)
|
1455-1515 |
break |
1515-1540 |
Joint Uncertainty Decoding with Unscented Transform for Noise Robust Subspace Gaussian Mixture Models
(pp. 40-45)
Liang Lu, Arnab Ghoshal, Steve Renals (University of Edinburgh)
|
1540-1605 |
Hierarchical Hybrid Language Models for Open Vocabulary Continuous Speech Recognition using WFST
(pp. 46-51)
M. Ali Basha Shaik, David Rybach, Stefan Hahn, Ralf Schlüter, Hermann Ney (RWTH Aachen University)
|
1605-1630 |
Template-based ASR using Posterior features and Synthetic References: comparing different TTS systems
(pp. 52-57)
Serena Soldo, Mathew Magimai.-Doss, Hervé Bourlard (Idiap Research Institute)
|
1630-1655 |
Explicit Duration Modelling in HMM-based Speech Synthesis using a Hybrid Hidden Markov Model-Multilayer Perceptron
(pp. 58-63)
Kalu U. Ogbureke, João P. Cabral, Julie Carson-Berndsen (University College Dublin)
|
Saturday September 8th |
0945-1045 | Keynote 2:
Speech processing in human auditory cortex
Nima Mesgarani (UCSF)
|
1045-1105 |
break |
1105-1130 |
Language Identification using Spectro-Temporal Patch features
(pp. 110-113)
Kamal Sahni, Pranay Dighe, Rita Singh, Bhiksha Raj (CMU)
|
1130-1155 |
Joint Detection and Localization of Multiple Speakers using a Probabilistic Steered Response Power
(pp. 68-73)
Youssef Oualil, Mathew Magimai.-Doss, Friedrich Faubel, Dietrich Klakow (Saarland University/Idiap)
|
1155-1220 |
Structured Sparse Coding for Microphone Array Location Calibration
(pp. 74-79)
Afsaneh Asaei, Bhiksha Raj, Hervé Bourlard (Idiap/CMU)
|
1220-1315 |
lunch |
1315-1340 |
Inharmonic Speech: A Tool for the Study of Speech Perception and Separation
(pp. 114-117)
Josh McDermott, Dan Ellis, Hideki Kawahara (NYU/Columbia/Wakayama)
|
1340-1405 |
Multi-Channel Speech Separation with Soft Time-Frequency Masking
(pp. 86-91)
Rahil Mahdian Toroghi, Friedrich Faubel, Dietrich Klakow (Saarland University)
|
1405-1430 |
Smoothing Speech Trajectories by Regularization
(pp. 92-97)
Heyun Huang, Louis ten Bosch, Bert Cranen, Lou Boves (Radboud University Nijmegen)
|
1430-1450 |
break |
1450-1515 |
Data-driven Speech Representations for NMF-based Word Learning
(pp. 98-103)
Joris Driesen, Jort F. Gemmeke, Hugo Van hamme (KU Leuven)
|
1515-1540 |
Spectro-Temporal Features with Distribution Equalization
(pp. 104-109)
Samuel K. Ngouoko M., Britta Wrede, Martin Heckmann (Bielefeld University/Honda Research)
|
1540-1605 |
Log-normal matrix factorization with application to speech-music separation
(pp. 80-85)
Takuya Yoshioka, Sakaue Daichi (NTT Communication Science Laboratories)
|
1605-1615 |
Conclusion & farewell |
(not presented) |
|
Dimensionality Reduction of Large TDOA Vectors for Speaker Diarization
(pp. 64-67)
Deepu Vijayasenan, Fabio Valente (Universität des Saarlandes/Idiap)
|
Dan Ellis
<dpwe@ee.columbia.edu>
Last update: Wed Sep 19 09:54:45 PM EDT 2012
|