SAPA-SCALE 2012

SAPA - SCALE Conference 2012

Salon Ballroom, Hilton Portland
7-8 September 2012, Portland, OR, USA
http://www.sapaworkshops.org/2012

Technical Program

The workshop will be held in the Salon Ballroom of the Hilton Portland. The ballroom is downstairs in the Hilton Executive Tower building at 545 SW Taylor (kitty-corner to the main hotel); take the stairs to the right of the entrance down to the ballroom.

Click on each title to retrieve the corresponding paper, or download all papers as a zip file: sapascale2012papers.zip (16 MB).

Friday September 7th
0930-0945	Welcome & introduction
0945-1045	Keynote 1: Human sound perception - what can we learn from it when developing audio analysis algorithms? Tuomas Virtanen (Tampere University of Technology)
1045-1105	break
1105-1130	Pitch Estimation Using Mutual Information (pp. 1-4) Majid Mirbagheri, Yanbo Xu, Shihab Shamma (University of Maryland College Park)
1130-1155	Establishing some principles of human speech production through two-dimensional computational models (pp. 5-10) Mauro Nicolao, Roger K. Moore (University of Sheffield)
1155-1220	A Spectral Envelope Estimation Method Based on F0-Adaptive Multi-Frame Integration Analysis (pp. 11-16) Tomoyasu Nakano, Masataka Goto (AIST)
1220-1315	lunch
1315-1340	Cochlear Implant-like Processing of Speech Signal for Speaker Verification (pp. 17-21) Cong-Thanh Do, Claude Barras (LIMSI-CNRS/Université Paris-Sud)
1340-1405	Speech intelligibility enhancement for HMM-based synthetic speech in noise (pp. 22-27) Cassia Valentini-Botinhao, Junichi Yamagishi, Simon King (University of Edinburgh)
1405-1430	A Generalized Stein's Estimation Approach for Speech Enhancement Based on Perceptual Criteria (pp. 28-33) Sunder Ram Krishnan, Chandra Sekhar Seelamantula (Indian Institute of Science)
1430-1455	Non-Stationary Signal Processing and its Application in Speech Recognition (pp. 34-39) Zoltán Tüske, Friedhelm R. Drepper, Ralf Schlüter (RWTH Aachen University)
1455-1515	break
1515-1540	Joint Uncertainty Decoding with Unscented Transform for Noise Robust Subspace Gaussian Mixture Models (pp. 40-45) Liang Lu, Arnab Ghoshal, Steve Renals (University of Edinburgh)
1540-1605	Hierarchical Hybrid Language Models for Open Vocabulary Continuous Speech Recognition using WFST (pp. 46-51) M. Ali Basha Shaik, David Rybach, Stefan Hahn, Ralf Schlüter, Hermann Ney (RWTH Aachen University)
1605-1630	Template-based ASR using Posterior features and Synthetic References: comparing different TTS systems (pp. 52-57) Serena Soldo, Mathew Magimai.-Doss, Hervé Bourlard (Idiap Research Institute)
1630-1655	Explicit Duration Modelling in HMM-based Speech Synthesis using a Hybrid Hidden Markov Model-Multilayer Perceptron (pp. 58-63) Kalu U. Ogbureke, João P. Cabral, Julie Carson-Berndsen (University College Dublin)
Saturday September 8th
0945-1045	Keynote 2: Speech processing in human auditory cortex Nima Mesgarani (UCSF)
1045-1105	break
1105-1130	Language Identification using Spectro-Temporal Patch features (pp. 110-113) Kamal Sahni, Pranay Dighe, Rita Singh, Bhiksha Raj (CMU)
1130-1155	Joint Detection and Localization of Multiple Speakers using a Probabilistic Steered Response Power (pp. 68-73) Youssef Oualil, Mathew Magimai.-Doss, Friedrich Faubel, Dietrich Klakow (Saarland University/Idiap)
1155-1220	Structured Sparse Coding for Microphone Array Location Calibration (pp. 74-79) Afsaneh Asaei, Bhiksha Raj, Hervé Bourlard (Idiap/CMU)
1220-1315	lunch
1315-1340	Inharmonic Speech: A Tool for the Study of Speech Perception and Separation (pp. 114-117) Josh McDermott, Dan Ellis, Hideki Kawahara (NYU/Columbia/Wakayama)
1340-1405	Multi-Channel Speech Separation with Soft Time-Frequency Masking (pp. 86-91) Rahil Mahdian Toroghi, Friedrich Faubel, Dietrich Klakow (Saarland University)
1405-1430	Smoothing Speech Trajectories by Regularization (pp. 92-97) Heyun Huang, Louis ten Bosch, Bert Cranen, Lou Boves (Radboud University Nijmegen)
1430-1450	break
1450-1515	Data-driven Speech Representations for NMF-based Word Learning (pp. 98-103) Joris Driesen, Jort F. Gemmeke, Hugo Van hamme (KU Leuven)
1515-1540	Spectro-Temporal Features with Distribution Equalization (pp. 104-109) Samuel K. Ngouoko M., Britta Wrede, Martin Heckmann (Bielefeld University/Honda Research)
1540-1605	Log-normal matrix factorization with application to speech-music separation (pp. 80-85) Takuya Yoshioka, Sakaue Daichi (NTT Communication Science Laboratories)
1605-1615	Conclusion & farewell
(not presented)
	Dimensionality Reduction of Large TDOA Vectors for Speaker Diarization (pp. 64-67) Deepu Vijayasenan, Fabio Valente (Universität des Saarlands/Idiap)

Dan Ellis <dpwe@ee.columbia.edu>
Last update: Wed Sep 19 09:54:45 PM EDT 2012