SAPA - SCALE Conference 2012

Salon Ballroom, Hilton Portland
7-8 September 2012, Portland, OR, USA
http://www.sapaworkshops.org/2012

Technical Program

The workshop will be held in the Salon Ballroom of the Hilton Portland. The ballroom is downstairs in the Hilton Executive Tower building at 545 SW Taylor (kitty-corner to the main hotel); from the entrance, go down the stairs on the right.

Click on each title to retrieve the corresponding paper, or download all papers as a single zip file: sapascale2012papers.zip (16 MB).


Friday September 7th
0930-0945 Welcome & introduction
0945-1045 Keynote 1:
Human sound perception - what can we learn from it when developing audio analysis algorithms?
Tuomas Virtanen (Tampere University of Technology)
1045-1105 break
1105-1130 Pitch Estimation Using Mutual Information
(pp. 1-4)
Majid Mirbagheri, Yanbo Xu, Shihab Shamma (University of Maryland College Park)
1130-1155 Establishing some principles of human speech production through two-dimensional computational models
(pp. 5-10)
Mauro Nicolao, Roger K. Moore (University of Sheffield)
1155-1220 A Spectral Envelope Estimation Method Based on F0-Adaptive Multi-Frame Integration Analysis
(pp. 11-16)
Tomoyasu Nakano, Masataka Goto (AIST)
1220-1315 lunch
1315-1340 Cochlear Implant-like Processing of Speech Signal for Speaker Verification
(pp. 17-21)
Cong-Thanh Do, Claude Barras (LIMSI-CNRS/Université Paris-Sud)
1340-1405 Speech intelligibility enhancement for HMM-based synthetic speech in noise
(pp. 22-27)
Cassia Valentini-Botinhao, Junichi Yamagishi, Simon King (University of Edinburgh)
1405-1430 A Generalized Stein's Estimation Approach for Speech Enhancement Based on Perceptual Criteria
(pp. 28-33)
Sunder Ram Krishnan, Chandra Sekhar Seelamantula (Indian Institute of Science)
1430-1455 Non-Stationary Signal Processing and its Application in Speech Recognition
(pp. 34-39)
Zoltán Tüske, Friedhelm R. Drepper, Ralf Schlüter (RWTH Aachen University)
1455-1515 break
1515-1540 Joint Uncertainty Decoding with Unscented Transform for Noise Robust Subspace Gaussian Mixture Models
(pp. 40-45)
Liang Lu, Arnab Ghoshal, Steve Renals (University of Edinburgh)
1540-1605 Hierarchical Hybrid Language Models for Open Vocabulary Continuous Speech Recognition using WFST
(pp. 46-51)
M. Ali Basha Shaik, David Rybach, Stefan Hahn, Ralf Schlueter, Hermann Ney (RWTH Aachen University)
1605-1630 Template-based ASR using Posterior features and Synthetic References: comparing different TTS systems
(pp. 52-57)
Serena Soldo, Mathew Magimai.-Doss, Hervé Bourlard (Idiap Research Institute)
1630-1655 Explicit Duration Modelling in HMM-based Speech Synthesis using a Hybrid Hidden Markov Model-Multilayer Perceptron
(pp. 58-63)
Kalu U. Ogbureke, João P. Cabral, Julie Carson-Berndsen (University College Dublin)

Saturday September 8th
0945-1045 Keynote 2:
Speech processing in human auditory cortex
Nima Mesgarani (UCSF)
1045-1105 break
1105-1130 Language Identification using Spectro-Temporal Patch features
(pp. 110-113)
Kamal Sahni, Pranay Dinghe, Rita Singh, Bhiksha Raj (CMU)
1130-1155 Joint Detection and Localization of Multiple Speakers using a Probabilistic Steered Response Power
(pp. 68-73)
Youssef Oualil, Mathew Magimai.-Doss, Friedrich Faubel, Dietrich Klakow (Saarland University/Idiap)
1155-1220 Structured Sparse Coding for Microphone Array Location Calibration
(pp. 74-79)
Afsaneh Asaei, Bhiksha Raj, Hervé Bourlard (Idiap/CMU)
1220-1315 lunch
1315-1340 Inharmonic Speech: A Tool for the Study of Speech Perception and Separation
(pp. 114-117)
Josh McDermott, Dan Ellis, Hideki Kawahara (NYU/Columbia/Wakayama)
1340-1405 Multi-Channel Speech Separation with Soft Time-Frequency Masking
(pp. 86-91)
Rahil Mahdian Toroghi, Friedrich Faubel, Dietrich Klakow (Saarland University)
1405-1430 Smoothing Speech Trajectories by Regularization
(pp. 92-97)
Heyun Huang, Louis ten Bosch, Bert Cranen, Lou Boves (Radboud University Nijmegen)
1430-1450 break
1450-1515 Data-driven Speech Representations for NMF-based Word Learning
(pp. 98-103)
Joris Driesen, Jort F. Gemmeke, Hugo Van hamme (KU Leuven)
1515-1540 Spectro-Temporal Features with Distribution Equalization
(pp. 104-109)
Samuel K. Ngouoko M., Britta Wrede, Martin Heckmann (Bielefeld University/Honda Research)
1540-1605 Log-normal matrix factorization with application to speech-music separation
(pp. 80-85)
Takuya Yoshioka, Sakaue Daichi (NTT Communication Science Laboratories)
1605-1615 Conclusion & farewell

(not presented)
  Dimensionality Reduction of Large TDOA Vectors for Speaker Diarization
(pp. 64-67)
Deepu Vijayasenan, Fabio Valente (Universität des Saarlands/Idiap)

Dan Ellis <dpwe@ee.columbia.edu>
Last update: Wed Sep 19 09:54:45 PM EDT 2012