FY 2024

M.Elmers, K.Inoue, D.Lala, K.Ochi, and T.Kawahara.
Analysis and detection of differences in spoken user behaviors between autonomous and wizard-of-oz systems.
In Proc. Oriental-COCOSDA Workshop, 2024. (PDF file)
Y.Fu, C.Chu, and T.Kawahara.
StyEmp: Stylizing empathetic response generation via multi-grained prefix encoder and personality reinforcement.
In Proc. SIGdial Meeting Discourse & Dialogue, pp.172--185, 2024. (PDF file)
T.Honda, S.Sakai, and T.Kawahara.
Efficient and robust long-form speech recognition with hybrid h3-conformer.
In Proc. INTERSPEECH, pp.2895--2899, 2024. (PDF file)
H.Shi and T.Kawahara.
Dual-path adaptation of pretrained feature extraction module for robust automatic speech recognition.
In Proc. INTERSPEECH, pp.2850--2854, 2024. (PDF file)
Y.Gao, H.Shi, C.Chu, and T.Kawahara.
Speech emotion recognition with multi-level acoustic and semantic information extraction and interaction.
In Proc. INTERSPEECH, pp.1060--1064, 2024. (PDF file)
K.Ochi, K.Inoue, D.Lala, and T.Kawahara.
Entrainment analysis and prosody prediction of subsequent interlocutor's backchannels in dialogue.
In Proc. INTERSPEECH, pp.462--466, 2024. (PDF file)
K.Inoue, B.Jiang, E.Ekstedt, T.Kawahara, and G.Skantze.
Multilingual turn-taking prediction using voice activity projection.
In Proc. COLING, pp.11873--11883, 2024. (PDF file)
M.Masuyama, T.Kawahara, and K.Matsuda.
Video retrieval system using automatic speech recognition for the Japanese Diet.
In ParlaCLARIN IV Workshop, pp.145--148, 2024. (PDF file)
T.Kawahara.
Quantitative analysis of editing in transcription process in Japanese and European Parliaments and its diachronic changes.
In ParlaCLARIN IV Workshop, pp.66--69, 2024. (PDF file)
H.Shi, K.Shimada, M.Hirano, T.Shibuya, Y.Koyama, Z.Zhong, S.Takahashi, T.Kawahara, and Y.Mitsufuji.
Diffusion-based speech enhancement with joint generative and predictive decoders.
In Proc. IEEE-ICASSP, pp.12951--12955, 2024. (PDF file)
Y.Gao, H.Shi, C.Chu, and T.Kawahara.
Enhancing two-stage finetuning for speech emotion recognition using adapters.
In Proc. IEEE-ICASSP, pp.11316--11320, 2024. (PDF file)
W.Zhou, Z.Yang, C.Chu, S.Li, R.Dabre, Y.Zhao, and T.Kawahara.
MOS-FAD: Improving fake audio detection via automatic mean opinion score prediction.
In Proc. IEEE-ICASSP, pp.876--880, 2024. (PDF file)
K.Shimada, K.Uchida, Y.Koyama, T.Shibuya, S.Takahashi, Y.Mitsufuji, and T.Kawahara.
Zero- and few-shot sound event localization and detection.
In Proc. IEEE-ICASSP, pp.636--640, 2024. (PDF file)

FY 2023

K.Inoue, D.Lala, K.Ochi, T.Kawahara, and G.Skantze.
An analysis of user behaviours for objectively evaluating spoken dialogue systems.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2024. (PDF file)
H.Kawai, D.Lala, K.Inoue, K.Ochi, and T.Kawahara.
Evaluation of a semi-autonomous attentive listening system with takeover prompting.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2024. (PDF file)
K.Yamamoto, S.Kawano, T.Kawahara, and K.Yoshino.
Data augmentation for robust natural language generation based on phrase alignment and sentence structure.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2024. (PDF file)
Y.Fu, H.Song, T.Zhao, and T.Kawahara.
Enhancing personality recognition in dialogue by data augmentation and heterogeneous conversational graph networks.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2024. (PDF file)
Z.H.Pang, Y.Fu, D.Lala, K.Ochi, K.Inoue, and T.Kawahara.
Acknowledgment of emotional states: Generating validating responses for empathetic dialogue.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2024. (PDF file)
K.Inoue, B.Jiang, E.Ekstedt, T.Kawahara, and G.Skantze.
Real-time and continuous turn-taking prediction using voice activity projection.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), Vol. Demo. Paper, 2024. (PDF file)
S.Yamashita, K.Inoue, A.Guo, S.Mochizuki, T.Kawahara, and R.Higashinaka.
RealPersonaChat: A realistic persona chat corpus with interlocutors' own personalities.
In Proc. PACLIC, 2023. (PDF file)
E.Nakamura.
Computational Analysis of Selection and Mutation Probabilities in the Evolution of Chord Progressions.
In Proc. International Symposium on Computer Music Multidisciplinary Research (CMMR), pp.462--473, 2023. (PDF file)
E.Nakamura, T.Eipert, and F.C.Moss.
Historical Changes of Modes and their Substructure Modeled as Pitch Distributions in Plainchant from the 1100s to the 1500s.
In Proc. International Symposium on Computer Music Multidisciplinary Research (CMMR), pp.450--461, 2023. (PDF file)
T.Nabeoka, E.Nakamura, and K.Yoshii.
Automatic Orchestration of Piano Scores for Wind Bands with User-Specified Instrumentation.
In Proc. International Symposium on Computer Music Multidisciplinary Research (CMMR), pp.387--394, 2023. (PDF file)
Y.Fujita, Y.Bando, K.Imoto, M.Onishi, and K.Yoshii.
DOA-Aware Audio-Visual Self-Supervised Learning for Sound Event Localization and Detection.
In Proc. APSIPA ASC, pp.2077--2083, 2023. (PDF file)
J.Zhao, E.Nakamura, and K.Yoshii.
Multimodal Multifaceted Music Emotion Recognition Based on Self-Attentive Fusion of Psychology-Inspired Symbolic and Acoustic Features.
In Proc. APSIPA ASC, pp.1657--1661, 2023. (PDF file)
T.Deng, E.Nakamura, and K.Yoshii.
Audio-To-Score Singing Transcription Based on Joint Estimation of Pitches, Onsets, and Metrical Positions with Tatum-Level CTC Loss.
In Proc. APSIPA ASC, pp.583--590, 2023. (PDF file)
T-P.Chen, L.Su, and K.Yoshii.
Learning Multifaceted Self-Similarity for Musical Structure Analysis.
In Proc. APSIPA ASC, pp.165--172, 2023. (PDF file)
D.Kamakura, E.Nakamura, and K.Yoshii.
CTC2: End-To-End Drum Transcription Based on Connectionist Temporal Classification with Constant Tempo Constraint.
In Proc. APSIPA ASC, pp.158--164, 2023. (PDF file)
D.Kamakura, E.Nakamura, K.Yoshii, and T.Oyama.
Joint Drum Transcription and Metrical Analysis Based on Periodicity-Aware Multi-Task Learning.
In Proc. APSIPA ASC, pp.151--157, 2023. (PDF file)
K.Inoue, D.Lala, K.Ochi, T.Kawahara, and G.Skantze.
Towards objective evaluation of socially-situated conversational robots: Assessing human-likeness through multimodal user behaviors.
In Proc. ICMI (Companion; Late Breaking Results), pp.86--90, 2023. (PDF file)
Y.Fu, K.Inoue, C.Chu, and T.Kawahara.
Reasoning before responding: Integrating commonsense-based causality explanation for empathetic response generation.
In Proc. SIGdial Meeting Discourse & Dialogue, 2023. (PDF file)
S.Kobuki, K.Seaborn, S.Tokunaga, K.Fukumori, S.Hidaka, K.Tamura, K.Inoue, T.Kawahara, and M.Otake-Matsuura.
Robotic backchanneling in online conversation facilitation: A cross-generational study.
In Proc. RO-MAN, 2023. (PDF file)
Y.Gao, C.Chu, and T.Kawahara.
Two-stage finetuning of wav2vec 2.0 for speech emotion recognition with ASR and gender pretraining.
In Proc. INTERSPEECH, pp.3635--3639, 2023. (PDF file)
J.Lee, M.Mimura, and T.Kawahara.
Embedding articulatory constraints for low-resource speech recognition based on large pre-trained model.
In Proc. INTERSPEECH, pp.1392--1396, 2023. (PDF file)
M.Terao, E.Nakamura, and K.Yoshii.
Neural Band-to-Piano Score Arrangement with Stepless Difficulty Control.
In Proc. IEEE-ICASSP, 2023. (PDF file)
H.Shi, M.Mimura, L.Wang, J.Dang, and T.Kawahara.
Time-domain speech enhancement assisted by multi-resolution frequency encoder and decoder.
In Proc. IEEE-ICASSP, 2023. (PDF file)
K.Soky, S.Li, C.Chu, and T.Kawahara.
Domain and language adaptation using heterogeneous datasets for wav2vec2.0-based speech recognition of low-resource language.
In Proc. IEEE-ICASSP, 2023. (PDF file)

FY 2022

K.Yamamoto, K.Inoue, and T.Kawahara.
Character adaptation of spoken dialogue systems based on user personalities.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2023. (PDF file)
Y.Fu, K.Inoue, D.Lala, K.Yamamoto, C.Chu, and T.Kawahara.
Improving empathetic response generation with retrieval based on emotion recognition.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2023. (PDF file)
Y.Muraki, H.Kawai, K.Yamamoto, K.Inoue, D.Lala, and T.Kawahara.
Semi-autonomous guide agents with simultaneous handling of multiple users.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2023. (PDF file)
D.Lala, K.Inoue, T.Kawahara, and K.Sawada.
Backchannel generation model for a third party listening agent.
In Proc. Human-Agent Interaction (HAI), pp.114--122, 2022. (PDF file)
H.Shi, Y.Shu, L.Wang, J.Dang, and T.Kawahara.
Fusing multiple bandwidth spectrograms for improving speech enhancement.
In Proc. APSIPA ASC, pp.1935--1940, 2022. (PDF file)
H.Shi, L.Wang, S.Li, J.Dang, and T.Kawahara.
Subband-based spectrogram fusion for speech enhancement by combining mapping and masking approaches.
In Proc. APSIPA ASC, pp.286--292, 2022. (PDF file)
K.Sekiguchi, A.A.Nugraha, Y.Du, Y.Bando, M.Fontaine, and K.Yoshii.
Direction-Aware Adaptive Online Neural Speech Enhancement with an Augmented Reality Headset in Real Noisy Conversational Environments.
In Proc. IEEE/RSJ IROS, 2022. (PDF file)
H.Futami, H.Inaguma, S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
Non-autoregressive error correction for CTC-based ASR with phone-conditioned masked LM.
In Proc. INTERSPEECH, pp.3889--3893, 2022. (PDF file)
Y.Du, A.A.Nugraha, K.Sekiguchi, Y.Bando, M.Fontaine, and K.Yoshii.
Direction-aware joint adaptation of neural speech enhancement and recognition in real multiparty conversational environments.
In Proc. INTERSPEECH, pp.2918--2922, 2022. (PDF file)
S.Kawano, M.Arioka, A.Yuguchi, K.Yamamoto, K.Inoue, T.Kawahara, S.Nakamura, and K.Yoshino.
Multimodal persuasive dialogue corpus using teleoperated android.
In Proc. INTERSPEECH, pp.2308--2312, 2022. (PDF file)
J.Nozaki, T.Kawahara, K.Ishizuka, and T.Hashimoto.
End-to-end speech-to-punctuated-text recognition.
In Proc. INTERSPEECH, pp.1811--1815, 2022. (PDF file)
K.Soky, S.Li, M.Mimura, C.Chu, and T.Kawahara.
Leveraging simultaneous translation for enhancing transcription of low-resource language via cross attention mechanism.
In Proc. INTERSPEECH, pp.1362--1366, 2022. (PDF file)
H.Shi, L.Wang, S.Li, J.Dang, and T.Kawahara.
Monaural speech enhancement based on spectrogram decomposition for convolutional neural network-sensitive feature extraction.
In Proc. INTERSPEECH, pp.221--225, 2022. (PDF file)
H.Kawai, Y.Muraki, K.Yamamoto, D.Lala, K.Inoue, and T.Kawahara.
Simultaneous job interview system using multiple semi-autonomous agents.
In Proc. SIGdial Meeting Discourse & Dialogue, Vol.Demo. Paper, pp.107--110, 2022. (PDF file)
A.A.Nugraha, K.Sekiguchi, M.Fontaine, Y.Bando, and K.Yoshii.
DNN-Free Low-Latency Adaptive Speech Enhancement Based on Frame-Online Beamforming Powered by Block-Online FastMNMF.
In Proc. IEEE Int'l Workshop Acoustic Signal Enhancement (IWAENC), 2022. (PDF file)
Y.Sumura, K.Sekiguchi, Y.Bando, A.A.Nugraha, and K.Yoshii.
Joint Localization and Synchronization of Distributed Camera-Attached Microphone Arrays for Indoor Scene Analysis.
In Proc. IEEE Int'l Workshop Acoustic Signal Enhancement (IWAENC), 2022. (PDF file)
S.Ueno and T.Kawahara.
Phone-informed refinement of synthesized mel spectrogram for data augmentation in speech recognition.
In Proc. IEEE-ICASSP, pp.8572--8576, 2022. (PDF file)
H.Zhang, M.Mimura, T.Kawahara, and K.Ishizuka.
Selective multi-task learning for speech emotion recognition using corpora of different styles.
In Proc. IEEE-ICASSP, pp.7707--7711, 2022. (PDF file)
A.A.Nugraha, K.Sekiguchi, M.Fontaine, Y.Bando, and K.Yoshii.
Flow-Based Fast Multichannel Nonnegative Matrix Factorization for Blind Source Separation.
In Proc. IEEE-ICASSP, pp.501--505, 2022. (PDF file)
M.Terao, Y.Hiramatsu, R.Ishizuka, Y.Wu, and K.Yoshii.
Difficulty-Aware Neural Band-to-Piano Score Arrangement Based on Note- and Statistic-Level Criteria.
In Proc. IEEE-ICASSP, pp.196--200, 2022. (PDF file)

FY 2021

M.Mimura, S.Sakai, and T.Kawahara.
An end-to-end model from speech to clean transcript for parliamentary meetings.
In Proc. APSIPA ASC, pp.465--470, 2021. (PDF file)
H.Shi, L.Wang, S.Li, C.Fan, J.Dang, and T.Kawahara.
Spectrograms fusion-based end-to-end robust automatic speech recognition.
In Proc. APSIPA ASC, pp.438--442, 2021. (PDF file)
K.Soky, S.Li, M.Mimura, C.Chu, and T.Kawahara.
On the use of speaker information for automatic speech recognition in speaker-imbalanced corpora.
In Proc. APSIPA ASC, pp.433--437, 2021. (PDF file)
H.Futami, H.Inaguma, M.Mimura, S.Sakai, and T.Kawahara.
ASR rescoring and confidence estimation with ELECTRA.
In Proc. IEEE Workshop Automatic Speech Recognition & Understanding (ASRU), pp.380--387, 2021. (PDF file)
S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
Data augmentation for ASR using TTS via a discrete representation.
In Proc. IEEE Workshop Automatic Speech Recognition & Understanding (ASRU), pp.68--75, 2021. (PDF file)
K.Soky, M.Mimura, T.Kawahara, S.Li, C.Ding, C.Chu, and S.Sam.
Khmer speech translation corpus of the Extraordinary Chambers in the Courts of Cambodia (ECCC).
In Proc. Oriental-COCOSDA Workshop, pp.122--127, 2021. (PDF file)
T.Oyama, R.Ishizuka, and K.Yoshii.
Phase-Aware Joint Beat and Downbeat Estimation Based on Periodicity of Metrical Structure.
In Proc. ISMIR, pp.493--499, 2021. (PDF file)
Y.Hiramatsu, E.Nakamura, and K.Yoshii.
Joint Estimation of Note Values and Voices for Audio-to-Score Piano Transcription.
In Proc. ISMIR, pp.278--284, 2021. (PDF file)
H.Inaguma, M.Mimura, and T.Kawahara.
VAD-free streaming hybrid CTC/Attention ASR for unsegmented recording.
In Proc. INTERSPEECH, pp.4049--4053, 2021. (PDF file)
H.Inaguma, M.Mimura, and T.Kawahara.
StableEmit: Selection probability discount for reducing emission latency of streaming monotonic attention ASR.
In Proc. INTERSPEECH, pp.1817--1821, 2021. (PDF file)
M.Fontaine, K.Sekiguchi, A.A.Nugraha, Y.Bando, and K.Yoshii.
Alpha-Stable Autoregressive Fast Multichannel Nonnegative Matrix Factorization for Joint Speech Enhancement and Dereverberation.
In Proc. INTERSPEECH, pp.661--665, 2021. (PDF file)
K.Inoue, H.Sakamoto, K.Yamamoto, D.Lala, and T.Kawahara.
A multi-party attentive listening robot which stimulates involvement from side participants.
In Proc. SIGdial Meeting Discourse & Dialogue, Vol.Demo. Paper, pp.261--264, 2021. (PDF file)
E.Ishii, G.I.Winata, S.Cahyawijaya, D.Lala, T.Kawahara, and P.Fung.
ERICA: An empathetic android companion for Covid-19 quarantine.
In Proc. SIGdial Meeting Discourse & Dialogue, Vol.Demo. Paper, pp.257--260, 2021. (PDF file)
T.Zhao and T.Kawahara.
Multi-referenced training for dialogue response generation.
In Proc. SIGdial Meeting Discourse & Dialogue, pp.190--201, 2021. (PDF file)
H.Inaguma, T.Kawahara, and S.Watanabe.
Source and target bidirectional knowledge distillation for end-to-end speech translation.
In Proc. NAACL-HLT, pp.1872--1881, 2021. (PDF file)
H.Inaguma, Y.Higuchi, K.Duh, T.Kawahara, and S.Watanabe.
Non-autoregressive end-to-end speech translation with dual-decoder.
In Proc. IEEE-ICASSP, pp.7488--7492, 2021. (PDF file)
K.Sekiguchi, Y.Bando, A.A.Nugraha, M.Fontaine, and K.Yoshii.
Autoregressive fast multichannel nonnegative matrix factorization for joint blind source separation and dereverberation.
In Proc. IEEE-ICASSP, pp.511--516, 2021. (PDF file)
Y.Hiramatsu, G.Shibata, R.Nishikimi, E.Nakamura, and K.Yoshii.
Statistical correction of transcribed melody notes based on probabilistic integration of a music language model and a transcription error model.
In Proc. IEEE-ICASSP, pp.256--261, 2021. (PDF file)

FY 2020

D.Lala, K.Inoue, K.Yamamoto, and T.Kawahara.
Findings from human-android dialogue research with ERICA.
In Proc. IJCAI-2020 workshop on ROBOT-DIAL, 2020. (PDF file)
S.Zhang, T.Zhao, and T.Kawahara.
Topic-relevant response generation using optimal transport for an open-domain dialog system.
In Proc. COLING, pp.4067--4077, 2020. (PDF file)
J.Woo, M.Mimura, K.Yoshii, and T.Kawahara.
End-to-end music-mixed speech recognition.
In Proc. APSIPA ASC, pp.800--804, 2020. (PDF file)
M.Togami, Y.Masuyama, T.Komatsu, K.Yoshii, and T.Kawahara.
Computer-resource-aware deep speech separation with a run-time-specified number of BLSTM layers.
In Proc. APSIPA ASC, pp.788--793, 2020. (PDF file)
M.Wake, M.Togami, K.Yoshii, and T.Kawahara.
Integration of semi-blind speech source separation and voice activity detection for flexible spoken dialogue.
In Proc. APSIPA ASC, pp.775--780, 2020. (PDF file)
Y.Wu, E.Nakamura, and K.Yoshii.
A Variational Autoencoder for Joint Chord and Key Estimation from Audio Chromagrams.
In Proc. APSIPA ASC, pp.500--506, 2020. (PDF file)
R.Ishizuka, R.Nishikimi, E.Nakamura, and K.Yoshii.
Tatum-Level Drum Transcription Based On a Convolutional Recurrent Neural Network with Language Model-Based Regularized Training.
In Proc. APSIPA ASC, pp.359--364, 2020. (PDF file)
D.Lala, K.Inoue, and T.Kawahara.
Prediction of shared laughter for human-robot dialogue.
In Proc. ICMI (Companion; Late Breaking Results), pp.62--66, 2020. (PDF file)
K.Inoue, K.Hara, D.Lala, K.Yamamoto, S.Nakamura, K.Takanashi, and T.Kawahara.
Job interviewer android with elaborate follow-up question generation.
In Proc. ICMI, pp.324--332, 2020. (PDF file)
M.Fontaine, K.Sekiguchi, A.A.Nugraha, and K.Yoshii.
Unsupervised robust speech enhancement based on alpha-stable fast multichannel nonnegative matrix factorization.
In Proc. INTERSPEECH, pp.4541--4545, 2020. (PDF file)
K.Yamamoto, K.Inoue, and T.Kawahara.
Semi-supervised learning for character expression of spoken dialogue systems.
In Proc. INTERSPEECH, pp.4188--4192, 2020. (PDF file)
T.V.Dang, T.Zhao, S.Ueno, H.Inaguma, and T.Kawahara.
End-to-end speech-to-dialog-act recognition.
In Proc. INTERSPEECH, pp.3910--3914, 2020. (PDF file)
H.Futami, H.Inaguma, S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
Distilling the knowledge of BERT for sequence-to-sequence ASR.
In Proc. INTERSPEECH, pp.3635--3639, 2020. (PDF file)
K.Matsuura, M.Mimura, S.Sakai, and T.Kawahara.
Generative adversarial training data adaptation for very low-resource automatic speech recognition.
In Proc. INTERSPEECH, pp.2737--2741, 2020. (PDF file)
Y.Bando, K.Sekiguchi, A.A.Nugraha, and K.Yoshii.
Adaptive neural speech enhancement with a denoising variational autoencoder.
In Proc. INTERSPEECH, pp.2437--2441, 2020. (PDF file)
H.Inaguma, M.Mimura, and T.Kawahara.
Enhancing monotonic multihead attention for streaming ASR.
In Proc. INTERSPEECH, pp.2137--2141, 2020. (PDF file)
H.Inaguma, M.Mimura, and T.Kawahara.
CTC-synchronous training for monotonic attention model.
In Proc. INTERSPEECH, pp.571--575, 2020. (PDF file)
H.Feng, S.Ueno, and T.Kawahara.
End-to-end speech emotion recognition combined with acoustic-to-word ASR model.
In Proc. INTERSPEECH, pp.501--505, 2020. (PDF file)
G.Shibata, R.Nishikimi, and K.Yoshii.
Music structure analysis based on an LSTM-HSMM hybrid model.
In Proc. ISMIR, pp.15--22, 2020. (PDF file)
Y.Du, K.Sekiguchi, Y.Bando, A.A.Nugraha, M.Fontaine, K.Yoshii, and T.Kawahara.
Semi-supervised multichannel speech separation based on a phone- and speaker-aware deep generative model of speech spectrograms.
In Proc. EUSIPCO, pp.870--874, 2020. (PDF file)
K.Yoshii, K.Sekiguchi, Y.Bando, M.Fontaine, and A.A.Nugraha.
Fast multichannel correlated tensor factorization for blind source separation.
In Proc. EUSIPCO, pp.306--310, 2020. (PDF file)
T.Zhao, D.Lala, and T.Kawahara.
Designing precise and robust dialogue response evaluators.
In Proc. ACL, pp.26--33, 2020. (PDF file)
K.Inoue, D.Lala, K.Yamamoto, S.Nakamura, K.Takanashi, and T.Kawahara.
An attentive listening system with android ERICA: Comparison of autonomous and WOZ interactions.
In Proc. SIGdial Meeting Discourse & Dialogue, pp.118--127, 2020. (PDF file)
S.Nakamura, C.T.Ishi, and T.Kawahara.
Analysis and modeling of between-sentence pauses in news speech by Japanese newscasters.
In Proc. Int'l Conf. Speech Prosody, pp.680--684, 2020. (PDF file)
S.Isonishi, K.Inoue, D.Lala, K.Takanashi, and T.Kawahara.
Response generation to out-of-database questions for example-based dialogue systems.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2020. (PDF file)
K.Yamamoto, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
A character expression model affecting spoken dialogue behaviors.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2020. (PDF file)
K.Matsuura, S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
Speech corpus of Ainu folklore and end-to-end speech recognition for Ainu language.
In Proc. Int'l Conf. Language Resources & Evaluation (LREC), pp.2622--2628, 2020. (PDF file)

FY 2019

H.Inaguma, K.Duh, T.Kawahara, and S.Watanabe.
Multilingual end-to-end speech translation.
In Proc. IEEE Workshop Automatic Speech Recognition & Understanding (ASRU), pp.570--577, 2019. (PDF file)
K.Soky, S.Li, T.Kawahara, and S.Seng.
Multi-lingual transformer training for Khmer automatic speech recognition.
In Proc. APSIPA ASC, pp.1893--1896, 2019. (PDF file)
G.Shibata, R.Nishikimi, E.Nakamura, and K.Yoshii.
Statistical music structure analysis based on a homogeneity-, repetitiveness-, and regularity-aware hierarchical hidden semi-Markov model.
In Proc. ISMIR, pp.268--275, 2019. (PDF file)
R.Nishikimi, E.Nakamura, M.Goto, and K.Yoshii.
End-to-end melody note transcription based on a beat-synchronous attention mechanism.
In Proc. IEEE Workshop Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019. (PDF file)
T.Carsault, A.McLeod, P.Esling, J.Nika, E.Nakamura, and K.Yoshii.
Multi-step chord sequence prediction based on aggregated multi-scale encoder-decoder networks.
In Proc. IEEE Workshop Machine Learning for Signal Processing (MLSP), 2019. (PDF file)
D.Lala, K.Inoue, and T.Kawahara.
Smooth turn-taking by a robot using an online continuous model to generate turn-taking cues.
In Proc. ICMI, pp.226--234, 2019. (PDF file)
S.Li, R.Dabre, X.Lu, P.Shen, T.Kawahara, and H.Kawai.
Improving transformer-based speech recognition systems with compressed structure and speech attributes augmentation.
In Proc. INTERSPEECH, pp.4400--4404, 2019. (PDF file)
D.Lala, S.Nakamura, and T.Kawahara.
Analysis of effect and timing of fillers in natural turn-taking.
In Proc. INTERSPEECH, pp.4175--4179, 2019. (PDF file)
K.Hara, K.Inoue, K.Takanashi, and T.Kawahara.
Turn-taking prediction based on detection of transition relevance place.
In Proc. INTERSPEECH, pp.4170--4174, 2019. (PDF file)
Y.Li, T.Zhao, and T.Kawahara.
Improved end-to-end speech emotion recognition using self attention mechanism and multitask learning.
In Proc. INTERSPEECH, pp.2803--2807, 2019. (PDF file)
S.Li, X.Lu, C.Ding, P.Shen, T.Kawahara, and H.Kawai.
Investigating radical-based end-to-end speech recognition systems for Chinese dialects and Japanese.
In Proc. INTERSPEECH, pp.2200--2204, 2019. (PDF file)
S.Li, C.Ding, X.Lu, P.Shen, T.Kawahara, and H.Kawai.
End-to-end articulatory attribute modeling for low-resource multilingual speech recognition.
In Proc. INTERSPEECH, pp.2145--2149, 2019. (PDF file)
Y.Wu, T.Carsault, and K.Yoshii.
Automatic chord estimation based on a frame-wise convolutional recurrent neural network with non-aligned annotations.
In Proc. EUSIPCO, 2019. (PDF file)
K.Sekiguchi, A.A.Nugraha, Y.Bando, and K.Yoshii.
Fast multichannel source separation based on jointly diagonalizable spatial covariance matrices.
In Proc. EUSIPCO, 2019. (PDF file)
D.Lala, G.Wilcock, K.Jokinen, and T.Kawahara.
ERICA and WikiTalk.
In Proc. IJCAI, Vol.Demo. Paper, pp.6533--6535, 2019. (PDF file)
S.Nakamura, C.T.Ishi, and T.Kawahara.
Prosodic characteristics of Japanese newscaster speech for different speaking situations.
In Proc. Int'l Congress Phonetic Sciences (ICPhS), 2019. (PDF file)
S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
Multi-speaker sequence-to-sequence speech synthesis for data augmentation in acoustic-to-word speech recognition.
In Proc. IEEE-ICASSP, pp.6161--6165, 2019. (PDF file)
H.Inaguma, J.Cho, M.K.Baskar, T.Kawahara, and S.Watanabe.
Transfer learning of language-independent end-to-end ASR with language model fusion.
In Proc. IEEE-ICASSP, pp.6096--6100, 2019. (PDF file)
A.A.Nugraha, K.Sekiguchi, and K.Yoshii.
A deep generative model of speech complex spectrograms.
In Proc. IEEE-ICASSP, pp.905--909, 2019. (PDF file)
S.Ueda, K.Shibata, Y.Wada, R.Nishikimi, E.Nakamura, and K.Yoshii.
Bayesian drum transcription based on nonnegative matrix factor decomposition with a deep score prior.
In Proc. IEEE-ICASSP, pp.456--460, 2019. (PDF file)
K.Shibata, R.Nishikimi, S.Fukayama, M.Goto, E.Nakamura, K.Itoyama, and K.Yoshii.
Joint transcription of lead, bass, and rhythm guitars based on a factorial hidden semi-Markov model.
In Proc. IEEE-ICASSP, pp.236--240, 2019. (PDF file)
E.Nakamura, K.Shibata, R.Nishikimi, and K.Yoshii.
Unsupervised melody style conversion.
In Proc. IEEE-ICASSP, pp.196--200, 2019. (PDF file)
A.McLeod, E.Nakamura, and K.Yoshii.
Improved metrical alignment of MIDI performance based on a repetition-aware online-adapted grammar.
In Proc. IEEE-ICASSP, pp.186--190, 2019. (PDF file)
R.Nishikimi, E.Nakamura, S.Fukayama, M.Goto, and K.Yoshii.
Automatic singing transcription based on encoder-decoder recurrent neural networks with a weakly-supervised attention mechanism.
In Proc. IEEE-ICASSP, pp.161--165, 2019. (PDF file)
K.Inoue, K.Hara, D.Lala, S.Nakamura, K.Takanashi, and T.Kawahara.
A job interview dialogue system with autonomous android ERICA.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), Vol.Demo. Paper, 2019. (PDF file)
K.Inoue, D.Lala, K.Yamamoto, K.Takanashi, and T.Kawahara.
Engagement-based adaptive behaviors for laboratory guide in human-robot dialogue.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2019. (PDF file)
K.Tanaka, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
End-to-end modeling for selection of utterance constructional units via system internal states.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2019. (PDF file)

FY 2018

M.Mimura, S.Ueno, H.Inaguma, S.Sakai, and T.Kawahara.
Leveraging sequence-to-sequence speech synthesis for enhancing acoustic-to-word speech recognition.
In Proc. IEEE Spoken Language Technology Workshop (SLT), pp.477--484, 2018. (PDF file)
H.Inaguma, M.Mimura, S.Sakai, and T.Kawahara.
Improving OOV detection and resolution with external language models in acoustic-to-word ASR.
In Proc. IEEE Spoken Language Technology Workshop (SLT), pp.212--218, 2018. (PDF file)
S.Li, X.Lu, R.Takashima, P.Shen, T.Kawahara, and H.Kawai.
Improving very deep time-delay neural network with vertical-attention for effectively training CTC-based ASR systems.
In Proc. IEEE Spoken Language Technology Workshop (SLT), pp.77--83, 2018. (PDF file)
E.Nakamura, R.Nishikimi, S.Dixon, and K.Yoshii.
Probabilistic sequential patterns for singing transcription.
In Proc. APSIPA ASC, pp.1905--1912, 2018. (PDF file)
K.Yamamoto, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
Dialogue behavior control model for expressing a character of humanoid robots.
In Proc. APSIPA ASC, pp.1732--1737, 2018. (PDF file)
K.Sekiguchi, Y.Bando, K.Yoshii, and T.Kawahara.
Bayesian multichannel speech enhancement with a deep speech prior.
In Proc. APSIPA ASC, pp.1233--1239, 2018. (PDF file)
Y.Wada, R.Nishikimi, E.Nakamura, K.Itoyama, and K.Yoshii.
Sequential generation of singing F0 contours from musical note sequences based on WaveNet.
In Proc. APSIPA ASC, pp.983--989, 2018. (PDF file)
T.Kawahara.
Human-like conversational robot.
In Proc. APSIPA ASC (overview talk), 2018. (PDF file)
D.Lala, K.Inoue, and T.Kawahara.
Evaluation of real-time deep learning turn-taking models for multiple dialogue scenarios.
In Proc. ICMI, pp.78--86, 2018. (PDF file)
H.Tsushima, E.Nakamura, K.Itoyama, and K.Yoshii.
Interactive arrangement of chords and melodies based on a tree-structured generative model.
In Proc. ISMIR, 2018. (PDF file)
K.Yoshii, K.Kitamura, Y.Bando, E.Nakamura, and T.Kawahara.
Independent low-rank tensor analysis for audio source separation.
In Proc. EUSIPCO, pp.1671--1675, 2018. (PDF file)
S.Li, X.Lu, R.Takashima, P.Shen, T.Kawahara, and H.Kawai.
Improving CTC-based acoustic model with very deep residual time-delay neural networks.
In Proc. INTERSPEECH, pp.3708--3712, 2018. (PDF file)
S.Ueno, T.Moriya, M.Mimura, S.Sakai, Y.Yamaguchi, Y.Aono, and T.Kawahara.
Encoder transfer for attention-based acoustic-to-word speech recognition.
In Proc. INTERSPEECH, pp.2424--2428, 2018. (PDF file)
M.Mimura, S.Sakai, and T.Kawahara.
Forward-backward attention decoder.
In Proc. INTERSPEECH, pp.2232--2236, 2018. (PDF file)
K.Hara, K.Inoue, K.Takanashi, and T.Kawahara.
Prediction of turn-taking using multitask learning with prediction of backchannels and fillers.
In Proc. INTERSPEECH, pp.991--995, 2018. (PDF file)
K.Inoue, D.Lala, K.Takanashi, and T.Kawahara.
Engagement recognition in spoken dialogue via neural network by aggregating different annotators' models.
In Proc. INTERSPEECH, pp.616--626, 2018. (PDF file)
T.Zhao and T.Kawahara.
A unified neural architecture for joint dialog act segmentation and recognition in spoken dialog system.
In Proc. SIGdial Meeting Discourse & Dialogue, pp.201--208, 2018. (PDF file)
K.Inoue, D.Lala, K.Takanashi, and T.Kawahara.
Latent character model for engagement recognition based on multimodal behaviors.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2018. (PDF file)
R.Nakanishi, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
Generating fillers based on dialog act pairs for smooth turn-taking by humanoid robot.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2018. (PDF file)
T.Kawahara.
Spoken dialogue system for a human-like conversational robot ERICA.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS) (keynote speech), 2018. (PDF file)
K.Yoshii.
Correlated tensor factorization for audio source separation.
In Proc. IEEE-ICASSP, pp.731--735, 2018. (PDF file)
S.Ueno, H.Inaguma, M.Mimura, and T.Kawahara.
Acoustic-to-word attention-based model complemented with character-level CTC-based model.
In Proc. IEEE-ICASSP, pp.5804--5808, 2018. (PDF file)
Y.Bando, M.Mimura, K.Itoyama, K.Yoshii, and T.Kawahara.
Statistical speech enhancement based on probabilistic integration of variational autoencoder and non-negative matrix factorization.
In Proc. IEEE-ICASSP, pp.716--720, 2018. (PDF file)
K.Shimada, Y.Bando, M.Mimura, K.Itoyama, K.Yoshii, and T.Kawahara.
Unsupervised beamforming based on multichannel nonnegative matrix factorization for noisy speech recognition.
In Proc. IEEE-ICASSP, pp.5734--5738, 2018. (PDF file)
R.Duan, T.Kawahara, M.Dantsuji, and H.Nanjo.
Efficient learning of articulatory models based on multi-label training and label correction for pronunciation learning.
In Proc. IEEE-ICASSP, pp.6239--6243, 2018. (PDF file)
H.Inaguma, M.Mimura, K.Inoue, K.Yoshii, and T.Kawahara.
An end-to-end approach to joint social signal detection and automatic speech recognition.
In Proc. IEEE-ICASSP, pp.6214--6218, 2018. (PDF file)
E.Nakamura, E.Benetos, K.Yoshii, and S.Dixon.
Towards complete polyphonic music transcription: Integrating multi-pitch detection and rhythm quantization.
In Proc. IEEE-ICASSP, pp.101--105, 2018. (PDF file)
T.Kawahara, K.Inoue, D.Lala, and K.Takanashi.
Audio-visual conversation analysis by smart posterboard and humanoid robot.
In Proc. IEEE-ICASSP, pp.6573--6577, 2018. (PDF file)

FY 2017

T.Hagiya, K.Hoashi, and T.Kawahara.
Voice input tutoring system for older adults using input stumble detection.
In Proc. ACM Int'l Conf. Intelligent User Interfaces (IUI), pp.415--419, 2018. (PDF file)
S.Li, X.Lu, P.Shen, R.Takashima, T.Kawahara, and H.Kawai.
Incremental training and constructing the very deep convolutional residual network acoustic models.
In Proc. IEEE Workshop Automatic Speech Recognition & Understanding (ASRU), pp.222--227, 2017. (PDF file)
M.Mimura, S.Sakai, and T.Kawahara.
Cross-domain speech recognition using nonparallel corpora with cycle-consistent adversarial networks.
In Proc. IEEE Workshop Automatic Speech Recognition \& Understanding (ASRU), pp.134--140, 2017. (PDF file)
Y.Li, C.T.Ishi, N.Ward, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
Emotion recognition by combining prosody and sentiment analysis for expressing reactive emotion by humanoid robot.
In Proc. APSIPA ASC, 2017. (PDF file)
T.Kawahara.
Automatic meeting transcription system for the Japanese Parliament (Diet).
In Proc. APSIPA ASC, p. (overview talk), 2017. (PDF file)
T.Zhao and T.Kawahara.
Joint learning of dialog act segmentation and recognition in spoken dialog using neural networks.
In Proc. IJCNLP, pp.704--712, 2017. (PDF file)
T.Kawahara.
Modeling difficulties of second language learners using speech technology.
In Proc. Seoul International Conference on Speech Sciences (SICSS), p. 11 (keynote speech), 2017. (PDF file)
D.Lala, K.Inoue, P.Milhorat, and T.Kawahara.
Detection of social signals for recognizing engagement in human-robot interaction.
In Proc. AAAI Fall Sympo. Natural Communication for Human-Robot Collaboration, 2017. (PDF file)
R.Nishikimi, E.Nakamura, M.Goto, K.Itoyama, and K.Yoshii.
Scale- and rhythm-aware musical note estimation for vocal F0 trajectories based on a semi-tatum-synchronous hierarchical hidden semi-Markov model.
In Proc. ISMIR, 2017. (PDF file)
H.Tsushima, E.Nakamura, K.Itoyama, and K.Yoshii.
Function- and rhythm-aware melody harmonization based on tree-structured parsing and split-merge sampling of chord sequences.
In Proc. ISMIR, 2017. (PDF file)
K.Yoshii, E.Nakamura, K.Itoyama, and M.Goto.
Infinite probabilistic latent component analysis for audio source separation.
In Proc. IEEE Workshop Machine Learning for Signal Processing (MLSP), 2017. (PDF file)
M.Wake, Y.Bando, M.Mimura, K.Itoyama, K.Yoshii, and T.Kawahara.
Semi-blind speech enhancement based on recurrent neural network for source separation and dereverberation.
In Proc. IEEE Workshop Machine Learning for Signal Processing (MLSP), 2017. (PDF file)
M.Mirzaei, K.Meshgi, and T.Kawahara.
Detecting listening difficulty for second language learners using automatic speech recognition errors.
In Proc. Workshop Speech \& Language Technology for Education (SLaTE), pp.164--168, 2017. (PDF file)
R.Duan, T.Kawahara, M.Dantsuji, and H.Nanjo.
Transfer learning based non-native acoustic modeling for pronunciation error detection.
In Proc. Workshop Speech \& Language Technology for Education (SLaTE), pp.50--54, 2017. (PDF file)
M.Mirzaei, K.Meshgi, and T.Kawahara.
Listening difficulty detection to foster second language listening with the partial and synchronized caption system.
In Proc. EUROCALL, pp.211--216, 2017. (PDF file)
M.Mimura, Y.Bando, K.Shimada, S.Sakai, K.Yoshii, and T.Kawahara.
Combined multi-channel NMF-based robust beamforming for noisy speech recognition.
In Proc. INTERSPEECH, pp.2451--2455, 2017. (PDF file)
S.Nakamura, R.Nakanishi, K.Takanashi, and T.Kawahara.
Analysis of the relationship between prosodic features of fillers and its forms or occurrence positions.
In Proc. INTERSPEECH, pp.1726--1730, 2017. (PDF file)
H.Inaguma, K.Inoue, M.Mimura, and T.Kawahara.
Social signal detection in spontaneous dialogue using bidirectional LSTM-CTC.
In Proc. INTERSPEECH, pp.1691--1695, 2017. (PDF file)
D.Lala, P.Milhorat, K.Inoue, M.Ishida, K.Takanashi, and T.Kawahara.
Attentive listening system with backchanneling, response generation and flexible turn-taking.
In Proc. SIGdial Meeting Discourse \& Dialogue, pp.127--136, 2017. (PDF file)
Y.Ojima, T.Nakano, S.Fukayama, J.Kato, M.Goto, K.Itoyama, and K.Yoshii.
A Singing Instrument for Real-Time Vocal-Part Arrangement of Music Audio Signals.
In Proc. Sound and Music Computing Conference (SMC), pp.443--449, 2017. (PDF file)
Y.Wada, Y.Bando, E.Nakamura, K.Itoyama, and K.Yoshii.
An adaptive karaoke system that plays accompaniment parts of music audio signals synchronously with users' singing voices.
In Proc. Sound and Music Computing Conference (SMC), pp.110--116, 2017. (PDF file)
P.Milhorat, D.Lala, K.Inoue, Z.Tianyu, M.Ishida, K.Takanashi, S.Nakamura, and T.Kawahara.
A conversational dialogue manager for the humanoid robot ERICA.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2017. (PDF file)
R.Duan, T.Kawahara, M.Dantsuji, and J.Zhang.
Effective articulatory modeling for pronunciation error detection of L2 learner without non-native training data.
In Proc. IEEE-ICASSP, pp.5815--5819, 2017. (PDF file)
S.Li, X.Lu, S.Sakai, M.Mimura, and T.Kawahara.
Semi-supervised ensemble DNN acoustic model training.
In Proc. IEEE-ICASSP, pp.5270--5274, 2017. (PDF file)
K.Itakura, Y.Bando, E.Nakamura, K.Itoyama, K.Yoshii, and T.Kawahara.
Bayesian multichannel nonnegative matrix factorization for audio source separation and localization.
In Proc. IEEE-ICASSP, pp.551--555, 2017. (PDF file)

FY 2016

D.Lala, Y.Li, and T.Kawahara.
Utterance behavior of users while playing basketball with a virtual teammate.
In Proc. ICAART, pp.28--38, 2017. (PDF file)
R.Duan, T.Kawahara, M.Dantsuji, and J.Zhang.
Multi-lingual and multi-task DNN learning for articulatory error detection.
In Proc. APSIPA ASC, 2016. (PDF file)
M.Mirzaei, K.Meshgi, and T.Kawahara.
ASR errors as predictor of L2 listening difficulties and PSC enhancement.
In Proc. Coling Workshop on Computational Linguistics for Linguistic Complexity (CL4LC), pp.192--201, 2016. (PDF file)
K.Inoue, D.Lala, S.Nakamura, K.Takanashi, and T.Kawahara.
Annotation and analysis of listener's engagement based on multi-modal behaviors.
In Proc. ICMI Workshop on Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction (MA3HMI), 2016. (PDF file)
H.Inaguma, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
Prediction of ice-breaking between participants using prosodic features in the first meeting dialogue.
In Proc. ICMI Workshop on Advancements in Social Signal Processing for Multimodal Interaction (ASSP4MI), 2016. (PDF file)
D.Lala, P.Milhorat, K.Inoue, T.Zhao, and T.Kawahara.
Multimodal interaction with the autonomous android ERICA.
In Proc. ICMI, Vol.Demo. Paper, pp.417--418, 2016. (PDF file)
Y.Bando, H.Suhara, M.Tanaka, T.Kamegawa, K.Itoyama, K.Yoshii, F.Matsuno, and H.G.Okuno.
Sound-based online localization for an in-pipe snake robot.
In Proc. IEEE Int'l Symp. Safety, Security, and Rescue Robotics (SSRR), 2016. (PDF file)
R.Duan, T.Kawahara, M.Dantsuji, and J.Zhang.
Pronunciation error detection using DNN articulatory model based on multi-lingual and multi-task learning.
In Proc. Int'l Sympo. Chinese Spoken Language Processing (ISCSLP), 2016. (PDF file)
S.Li, X.Lu, S.Mori, Y.Akita, and T.Kawahara.
Confidence estimation for speech recognition systems using conditional random fields trained with partially annotated data.
In Proc. Int'l Sympo. Chinese Spoken Language Processing (ISCSLP), 2016. (PDF file)
K.Sekiguchi, Y.Bando, K.Nakamura, K.Nakadai, K.Itoyama, and K.Yoshii.
Online Simultaneous Localization and Mapping of Multiple Sound Sources and Asynchronous Microphone Arrays.
In Proc. IEEE/RSJ IROS, pp.1973--1979, 2016. (PDF file)
K.Kitamura, Y.Bando, K.Itoyama, and K.Yoshii.
Student's t Multichannel Nonnegative Matrix Factorization for Blind Source Separation.
In Proc. IEEE Int'l Workshop Acoustic Signal Enhancement (IWAENC), 2016. (PDF file)
D.Lala and T.Kawahara.
Managing dialog and joint actions for virtual basketball teammates.
In Proc. IVA, Vol.Poster, 2016. (PDF file)
K.Inoue, P.Milhorat, D.Lala, T.Zhao, and T.Kawahara.
Talking with ERICA, an autonomous android.
In Proc. SIGdial Meeting Discourse \& Dialogue, Vol.Demo. Paper, pp.212--215, 2016. (PDF file)
M.Mimura, S.Sakai, and T.Kawahara.
Joint optimization of denoising autoencoder and DNN acoustic model based on multi-target learning for noisy speech recognition.
In Proc. INTERSPEECH, pp.3803--3807, 2016. (PDF file)
T.Kawahara, T.Yamaguchi, K.Inoue, K.Takanashi, and N.Ward.
Prediction and generation of backchannel form for attentive listening systems.
In Proc. INTERSPEECH, pp.2890--2894, 2016. (PDF file)
E.Nakamura, K.Yoshii, and S.Sagayama.
Rhythm Transcription of MIDI Performances Based on a Merged-Output HMM for Multiple Voices.
In Proc. Sound and Music Computing Conference (SMC), pp.338--343, 2016. (PDF file)
K.Itakura, Y.Bando, E.Nakamura, K.Itoyama, and K.Yoshii.
A Unified Bayesian Model of Time-Frequency Clustering and Low-Rank Approximation for Multi-Channel Source Separation.
In Proc. EUSIPCO, pp.2280--2284, 2016. (PDF file)
E.Nakamura, K.Itoyama, and K.Yoshii.
Rhythm Transcription of MIDI Performances Based on Hierarchical Bayesian Modelling of Repetition and Modification of Musical Note Patterns.
In Proc. EUSIPCO, pp.1946--1950, 2016. (PDF file)
Y.Bando, K.Itoyama, M.Konyo, S.Tadokoro, K.Nakadai, K.Yoshii, and H.G.Okuno.
Variational Bayesian Multi-Channel Robust NMF for Human-Voice Enhancement with a Deformable and Partially-Occluded Microphone Array.
In Proc. EUSIPCO, pp.1018--1022, 2016. (PDF file)
D.F.Glas, T.Minato, C.T.Ishi, T.Kawahara, and H.Ishiguro.
ERICA: The ERATO Intelligent Conversational Android.
In Proc. RO-MAN, pp.22--29, 2016. (PDF file)
M.Mirzaei, K.Meshgi, and T.Kawahara.
Leveraging automatic speech recognition errors to detect challenging speech segments in TED talks.
In Proc. EUROCALL, pp.313--318, 2016. (PDF file)
R.Nishikimi, E.Nakamura, K.Itoyama, and K.Yoshii.
Musical Note Estimation for F0 Trajectories of Singing Voices Based on a Bayesian Semi-Beat-Synchronous HMM.
In Proc. ISMIR, pp.461--467, 2016. (PDF file)
Y.Ojima, E.Nakamura, K.Itoyama, and K.Yoshii.
A Hierarchical Bayesian Model of Chords, Pitches, and Spectrograms for Multipitch Analysis.
In Proc. ISMIR, pp.309--315, 2016. (PDF file)
N.Ward, Y.Li, T.Zhao, and T.Kawahara.
Interactional and pragmatics-related prosodic patterns in Mandarin dialog.
In Proc. Int'l Conf. Speech Prosody, 2016. (PDF file)
S.Li, Y.Akita, and T.Kawahara.
Data selection from multiple ASR systems' hypotheses for unsupervised acoustic model training.
In Proc. IEEE-ICASSP, pp.5875--5879, 2016. (PDF file)
E.Nakamura, M.Hamanaka, K.Hirata, and K.Yoshii.
Tree-structured probabilistic model of monophonic written music based on the generative theory of tonal music.
In Proc. IEEE-ICASSP, pp.276--280, 2016. (PDF file)
K.Yoshii, K.Itoyama, and M.Goto.
Student's t nonnegative matrix factorization and positive semidefinite tensor factorization for single-channel audio source separation.
In Proc. IEEE-ICASSP, pp.51--55, 2016. (PDF file)

FY 2015

T.Yamaguchi, K.Inoue, K.Yoshino, K.Takanashi, N.Ward, and T.Kawahara.
Analysis and prediction of morphological patterns of backchannels for attentive listening agents.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2016. (PDF file)
T.Kawahara, T.Yamaguchi, M.Uesato, K.Yoshino, and K.Takanashi.
Synchrony in prosodic and linguistic features between backchannels and preceding utterances in attentive listening.
In Proc. APSIPA ASC, pp.392--395, 2015. (PDF file)
Y.Akita, N.Kuwahara, and T.Kawahara.
Automatic classification of usability of ASR result for real-time captioning of lectures.
In Proc. APSIPA ASC, pp.19--22, 2015. (PDF file)
K.Yoshii, K.Itoyama, and M.Goto.
Infinite Superimposed Discrete All-pole Modeling for Source-Filter Decomposition of Wavelet Spectrograms.
In Proc. ISMIR, pp.86--92, 2015. (PDF file)
Y.Bando, K.Itoyama, M.Konyo, S.Tadokoro, K.Nakadai, K.Yoshii, and H.G.Okuno.
Human-Voice Enhancement Based on Online RPCA for a Hose-Shaped Rescue Robot with a Microphone Array.
In Proc. IEEE Int'l Symp. Safety, Security, and Rescue Robotics (SSRR), 2015. (PDF file)
K.Youssef, K.Itoyama, and K.Yoshii.
Identification and Localization of One or Two Concurrent Speakers in a Binaural Robotic Context.
In Proc. IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2015.
Y.Bando, K.Itoyama, M.Konyo, S.Tadokoro, K.Nakadai, K.Yoshii, and H.G.Okuno.
Microphone-Accelerometer Based 3D Posture Estimation for a Hose-shaped Rescue Robot.
In Proc. IEEE/RSJ IROS, pp.5580--5586, 2015. (PDF file)
M.Ohkita, Y.Bando, Y.Ikemiya, K.Itoyama, and K.Yoshii.
Audio-Visual Beat Tracking Based on a State-Space Model for a Dancing Robot Playing with Humans.
In Proc. IEEE/RSJ IROS, pp.5555--5560, 2015. (PDF file)
K.Sekiguchi, Y.Bando, K.Itoyama, and K.Yoshii.
Optimizing the Layout of Multiple Mobile Robots for Cooperative Sound Source Separation.
In Proc. IEEE/RSJ IROS, pp.5548--5554, 2015. (PDF file)
S.Li, Y.Akita, and T.Kawahara.
Discriminative data selection for lightly supervised training of acoustic model using closed caption texts.
In Proc. INTERSPEECH, pp.3526--3530, 2015. (PDF file)
K.Inoue, Y.Wakabayashi, H.Yoshimoto, K.Takanashi, and T.Kawahara.
Enhanced speaker diarization with detection of backchannels using eye-gaze information in poster conversations.
In Proc. INTERSPEECH, pp.3086--3090, 2015. (PDF file)
S.Li, X.Lu, Y.Akita, and T.Kawahara.
Ensemble speaker modeling using speaker adaptive training deep neural network for speaker adaptation.
In Proc. INTERSPEECH, pp.2892--2896, 2015. (PDF file)
M.Mimura, S.Sakai, and T.Kawahara.
Speech dereverberation using long short-term memory.
In Proc. INTERSPEECH, pp.2435--2439, 2015. (PDF file)
K.Itakura, I.Nishimuta, Y.Bando, K.Itoyama, and K.Yoshii.
Bayesian Integration of Sound Source Separation and Speech Recognition: A New Approach to Simultaneous Speech Recognition.
In Proc. INTERSPEECH, pp.736--740, 2015. (PDF file)
M.Mirzaei and T.Kawahara.
ASR technology to empower partial and synchronized caption for L2 listening development.
In Proc. Workshop Speech \& Language Technology for Education (SLaTE), pp.65--70, 2015. (PDF file)
M.Mirzaei, K.Meshgi, Y.Akita, and T.Kawahara.
Errors in automatic speech recognition versus difficulties in second language listening.
In Proc. EUROCALL, pp.410--415, 2015. (PDF file)
A.Dobashi, Y.Ikemiya, K.Itoyama, and K.Yoshii.
A Music Performance Assistance System based on Vocal, Harmonic, and Percussive Source Separation and Content Visualization for Music Audio Signals.
In Proc. Sound and Music Computing Conference (SMC), pp.99--104, 2015. (PDF file)
T.Fukuda, Y.Ikemiya, K.Itoyama, and K.Yoshii.
A Score-Informed Piano Tutoring System with Mistake Detection and Score Simplification.
In Proc. Sound and Music Computing Conference (SMC 2015), pp.105--110, 2015. (PDF file)
T.Sasada, S.Mori, T.Kawahara, and Y.Yamakata.
Named entity recognizer trainable from partially annotated data.
In Proc. PACLING, pp.10--17, 2015. (PDF file)
Y.Akita, Y.Tong, and T.Kawahara.
Language model adaptation for academic lectures using character recognition result of presentation slides.
In Proc. IEEE-ICASSP, pp.5431--5435, 2015. (PDF file)
M.Mimura, S.Sakai, and T.Kawahara.
Deep autoencoders augmented with phone-class feature for reverberant speech recognition.
In Proc. IEEE-ICASSP, pp.4365--4369, 2015. (PDF file)
Y.Bando, T.Otsuka, K.Itoyama, K.Yoshii, Y.Sasaki, S.Kagami, and H.G.Okuno.
Challenges in Deploying A Microphone Array to Localize and Separate Sound Sources in Real Auditory Scenes.
In Proc. IEEE-ICASSP, pp.723--727, 2015. (PDF file)
Y.Ikemiya, K.Itoyama, and K.Yoshii.
Singing Voice Analysis and Editing based on Mutually Dependent F0 Estimation and Source Separation.
In Proc. IEEE-ICASSP, pp.574--578, 2015. (PDF file)
S.Maruo, K.Yoshii, K.Itoyama, M.Mauch, and M.Goto.
A Feedback Framework for Improved Chord Recognition Based on NMF-based Approximate Note Transcription.
In Proc. IEEE-ICASSP, pp.196--200, 2015. (PDF file)

FY 2014

Y.Bando, T.Otsuka, I.Aihara, H.Awano, K.Itoyama, K.Yoshii, and H.G.Okuno.
Recognition of In-field Frog Chorusing using Bayesian Nonparametric Microphone Array Processing.
In Proc. AAAI-2015 Workshop on Computational Sustainability, 2015. (PDF file)
T.Kawahara, M.Uesato, K.Yoshino, and K.Takanashi.
Toward adaptive generation of backchannels for attentive listening agents.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2015. (PDF file)
K.Yoshino and T.Kawahara.
News navigation system based on proactive dialogue strategy.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2015. (PDF file)
I.Nishimuta, K.Yoshii, K.Itoyama, and H.G.Okuno.
Development of a Robot Quizmaster with Auditory Functions for Speech-based Multiparty Interaction.
In Proc. IEEE/SICE Int'l Sympo. System Integration (SII 2014), pp.328--333, 2014. (PDF file)
Y.Wakabayashi, K.Inoue, H.Yoshimoto, and T.Kawahara.
Speaker diarization based on audio-visual integration for smart posterboard.
In Proc. APSIPA ASC, 2014. (PDF file)
M.Mimura and T.Kawahara.
Unsupervised speaker adaptation of DNN-HMM by selecting similar speakers for lecture transcription.
In Proc. APSIPA ASC, 2014. (PDF file)
M.Mirzaei, Y.Akita, and T.Kawahara.
Partial and synchronized caption generation to develop second language listening skill.
In ICCE Workshop on Natural Language Processing Techniques for Educational Applications (NLP-TEA), pp.13--23, 2014. (PDF file)
I.Nishimuta, N.Hirayama, K.Yoshii, K.Itoyama, and H.G.Okuno.
A Robot Quizmaster that can Localize, Separate, and Recognize Simultaneous Utterances for a Fastest-Voice-First Quiz Game.
In Proc. IEEE-RAS International Conference on Humanoid Robots (Humanoids 2014), pp.967--972, 2014. (PDF file)
Y.Bando, K.Itoyama, S.Tadokoro, M.Konyo, K.Nakadai, K.Yoshii, and H.G.Okuno.
A Sound-based Online Method for Estimating the Time-Varying Posture of a Hose-shaped Robot.
In Proc. Int'l Sympo. Safety, Security, and Rescue Robotics (SSRR-2014), pp.1--6, 2014. (PDF file)
A.Maezawa, K.Itoyama, K.Yoshii, and H.G.Okuno.
Bayesian Audio Alignment Based on A Unified Generative Model of Music Composition and Performance.
In Proc. ISMIR, pp.233--238, 2014. (PDF file)
Y.Ikemiya, K.Itoyama, and K.Yoshii.
Transferring Vocal Expressions of a Professional Singer to Unaccompanied Singing Signals.
In Proc. ISMIR, 2014. (PDF file)
K.Sudoh, M.Nagata, S.Mori, and T.Kawahara.
Japanese-to-English patent translation system based on domain-adapted word segmentation and post-ordering.
In Proc. Assoc. for Machine Translation in the Americas (AMTA), Vol.1, pp.234--248, 2014. (PDF file)
K.Inoue, Y.Wakabayashi, H.Yoshimoto, and T.Kawahara.
Speaker diarization using eye-gaze information in multi-party conversations.
In Proc. INTERSPEECH, pp.562--566, 2014. (PDF file)
S.Li, Y.Akita, and T.Kawahara.
Corpus and transcription system of Chinese Lecture Room.
In Proc. Int'l Sympo. Chinese Spoken Language Processing (ISCSLP), pp.442--445, 2014. (PDF file)
M.Mirzaei, Y.Akita, and T.Kawahara.
Partial and synchronized captioning: A new tool for second language listening development.
In Proc. EUROCALL, pp.230--236, 2014. (PDF file)
K.Yoshino and T.Kawahara.
Information navigation system based on POMDP that tracks user focus.
In Proc. SIGdial Meeting Discourse \& Dialogue, pp.32--40, 2014. (PDF file)
M.Mimura, S.Sakai, and T.Kawahara.
Exploring deep neural networks and deep autoencoders in reverberant speech recognition.
In Workshop on Hands-free Speech Communication \& Microphone Arrays (HSCMA), 2014. (PDF file)
K.Yoshii, H.Fujihara, T.Nakano, and M.Goto.
Cultivating Vocal Activity Detection for Music Audio Signals in a Circulation-type Crowdsourcing Ecosystem.
In Proc. IEEE-ICASSP, pp.624--628, 2014. (PDF file)

FY 2013

T.Kawahara.
Smart posterboard: Multi-modal sensing and analysis of poster conversations.
In Proc. APSIPA ASC, p. (plenary overview talk), 2013. (PDF file)
K.Yoshino, S.Mori, and T.Kawahara.
Predicate argument structure analysis using partially annotated corpora.
In Proc. IJCNLP, pp.957--961, 2013. (PDF file)
T.Kawahara, S.Hayashi, and K.Takanashi.
Estimation of interest and comprehension level of audience through multi-modal behaviors in poster conversations.
In Proc. INTERSPEECH, pp.1882--1885, 2013. (PDF file)
K.Yoshino, S.Mori, and T.Kawahara.
Incorporating semantic information to selection of web texts for language model of spoken dialogue system.
In Proc. IEEE-ICASSP, pp.8252--8256, 2013. (PDF file)

FY 2012

K.Yoshino, S.Mori, and T.Kawahara.
Language modeling for spoken dialogue system based on filtering using predicate-argument structures.
In Proc. COLING, pp.2993--3002, 2012. (PDF file)
C.Lee and T.Kawahara.
Hybrid vector space model for flexible voice search.
In Proc. APSIPA ASC, 2012. (PDF file)
K.Yoshino, S.Mori, and T.Kawahara.
Language modeling for spoken dialogue system based on sentence transformation and filtering using predicate-argument structures.
In Proc. APSIPA ASC, 2012. (PDF file)
Y.Akita, M.Watanabe, and T.Kawahara.
Automatic transcription of lecture speech using language model based on speaking-style transformation of proceeding texts.
In Proc. INTERSPEECH, 2012. (PDF file)
R.Gomez and T.Kawahara.
Dereverberation based on wavelet packet filtering for robust automatic speech recognition.
In Proc. INTERSPEECH, 2012. (PDF file)
T.Kawahara, T.Iwatate, and K.Takanashi.
Prediction of turn-taking by combining prosodic and eye-gaze information in poster conversations.
In Proc. INTERSPEECH, 2012. (PDF file)
T.Kawahara, T.Iwatate, T.Tsuchiya, and K.Takanashi.
Can we predict who in the audience will ask what kind of questions with their feedback behaviors in poster conversation?
In Proc. Interdisciplinary Workshop on Feedback Behaviors in Dialog, pp.35--38, 2012. (PDF file)
T.Kawahara.
Transcription system using automatic speech recognition for the Japanese Parliament (Diet).
In Proc. AAAI/IAAI, pp.2224--2228, 2012. (PDF file)
G.Neubig, T.Watanabe, S.Mori, and T.Kawahara.
Machine translation without words through substring alignment.
In Proc. ACL, pp.165--174, 2012. (PDF file)
T.Kawahara.
Multi-modal sensing and analysis of poster conversations toward smart posterboard.
In Proc. SIGdial Meeting Discourse \& Dialogue, pp.1--9 (keynote speech), 2012. (PDF file)
M.Ablimit, T.Kawahara, and A.Hamdulla.
Discriminative approach to lexical entry selection for automatic speech recognition of agglutinative language.
In Proc. IEEE-ICASSP, pp.5009--5012, 2012. (PDF file)

FY 2011

M.Ablimit, A.Hamdulla, and T.Kawahara.
Morpheme concatenation approach in language modeling for large-vocabulary Uyghur speech recognition.
In Proc. Oriental-COCOSDA Workshop, 2011. (PDF file)
R.Gomez and T.Kawahara.
Optimized wavelet-based speech enhancement for speech recognition in noisy and reverberant conditions.
In Proc. APSIPA ASC, 2011. (PDF file)
M.Mimura and T.Kawahara.
Fast speaker normalization and adaptation based on BIC for meeting speech recognition.
In Proc. APSIPA ASC, 2011. (PDF file)
M.Ablimit, T.Kawahara, and A.Hamdulla.
Lexicon optimization for automatic speech recognition based on discriminative learning.
In Proc. APSIPA ASC, 2011. (PDF file)
H.Wang, T.Kawahara, and Y.Wang.
Improving non-native speech recognition performance by discriminative training for language model in a CALL system.
In Proc. APSIPA ASC, 2011. (PDF file)
T.Hirayama, Y.Sumi, T.Kawahara, and T.Matsuyama.
Info-concierge: Proactive multi-modal interaction through mind probing.
In Proc. APSIPA ASC, 2011. (PDF file)
C.Lee, T.Kawahara, and A.Rudnicky.
Combining slot-based vector space model for voice book search.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), pp.27--35, 2011. (PDF file)
Y.Akita and T.Kawahara.
Automatic comma insertion of lecture transcripts based on multiple annotations.
In Proc. INTERSPEECH, pp.2889--2892, 2011. (PDF file)
R.Gomez and T.Kawahara.
Denoising using optimized wavelet filtering for automatic speech recognition.
In Proc. INTERSPEECH, pp.1673--1676, 2011. (PDF file)
G.Neubig, T.Watanabe, E.Sumita, S.Mori, and T.Kawahara.
An unsupervised model for joint phrase alignment and extraction.
In Proc. ACL-HLT, pp.632--641, 2011. (PDF file)
K.Yoshino, S.Mori, and T.Kawahara.
Spoken dialogue system based on information extraction using similarity of predicate argument structures.
In Proc. SIGdial Meeting Discourse \& Dialogue, pp.59--66, 2011. (PDF file)

FY 2010

T.Kawahara, H.Wang, Y.Tsubota, and M.Dantsuji.
English and Japanese CALL systems developed at Kyoto University.
In Proc. APSIPA ASC, pp.804--810, 2010. (PDF file)
R.Gomez and T.Kawahara.
Optimizing wavelet parameters for dereverberation in automatic speech recognition.
In Proc. APSIPA ASC, pp.446--449, 2010. (PDF file)
T.Kawahara.
Automatic transcription of parliamentary meetings and classroom lectures -- a sustainable approach and real system evaluations --.
In Proc. Int'l Sympo. Chinese Spoken Language Processing (ISCSLP), pp.1--6 (keynote speech), 2010. (PDF file)
M.Ablimit, G.Neubig, M.Mimura, S.Mori, T.Kawahara, and A.Hamdulla.
Uyghur morpheme-based language models and ASR.
In Proc. Int'l Conf. Signal Processing, pp.581--584, 2010. (PDF file)
K.Yoshino and T.Kawahara.
Spoken dialogue system based on information extraction from web text.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), Vol. Demo. Paper, pp.196--197, 2010. (PDF file)
T.Kawahara, K.Sumi, Z.Q.Chang, and K.Takanashi.
Detection of hot spots in poster conversations based on reactive tokens of audience.
In Proc. INTERSPEECH, pp.3042--3045, 2010. (PDF file)
G.Neubig, M.Mimura, S.Mori, and T.Kawahara.
Learning a language model from continuous speech.
In Proc. INTERSPEECH, pp.1053--1056, 2010. (PDF file)
Y.Itoh, H.Nishizaki, X.Hu, H.Nanjo, T.Akiba, T.Kawahara, S.Nakagawa, T.Matsui, Y.Yamashita, and K.Aikawa.
Constructing Japanese test collections for spoken term detection.
In Proc. INTERSPEECH, pp.677--680, 2010. (PDF file)
T.Kawahara, N.Katsumaru, Y.Akita, and S.Mori.
Classroom note-taking system for hearing impaired students using automatic speech recognition adapted to lectures.
In Proc. INTERSPEECH, pp.626--629, 2010. (PDF file)
R.Gomez and T.Kawahara.
An improved wavelet-based dereverberation for robust automatic speech recognition.
In Proc. INTERSPEECH, pp.578--581, 2010. (PDF file)
Y.Akita, M.Mimura, G.Neubig, and T.Kawahara.
Semi-automated update of automatic transcription system for the Japanese national congress.
In Proc. INTERSPEECH, pp.338--341, 2010. (PDF file)
T.Kawahara, Z.Q.Chang, and K.Takanashi.
Analysis on prosodic features of Japanese reactive tokens in poster conversations.
In Proc. Int'l Conf. Speech Prosody, 2010. (PDF file)
G.Neubig, Y.Akita, S.Mori, and T.Kawahara.
Improved statistical models for SMT-based speaking style transformation.
In Proc. IEEE-ICASSP, pp.5206--5209, 2010. (PDF file)
R.Gomez and T.Kawahara.
Optimizing spectral subtraction and Wiener filtering for robust speech recognition in reverberant and noisy conditions.
In Proc. IEEE-ICASSP, pp.4566--4569, 2010. (PDF file)
D.Cournapeau, S.Watanabe, A.Nakamura, and T.Kawahara.
Using online model comparison in the Variational Bayes framework for online unsupervised voice activity detection.
In Proc. IEEE-ICASSP, pp.4462--4465, 2010. (PDF file)

FY 2009

T.Kawahara.
New perspectives on spoken language understanding: Does machine need to fully understand speech?
In Proc. IEEE Workshop Automatic Speech Recognition \& Understanding (ASRU), pp.46--50 (invited paper), 2009. (PDF file)
T.Misu, K.Sugiura, T.Kawahara, K.Ohtake, C.Hori, H.Kashioka, and S.Nakamura.
Online learning of Bayes risk-based optimization of dialogue management.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2009. (PDF file)
R.Gomez and T.Kawahara.
Tight integration of dereverberation and automatic speech recognition.
In Proc. APSIPA ASC, pp.639--643, 2009. (PDF file)
T.Akiba, K.Aikawa, Y.Itoh, T.Kawahara, H.Nanjo, H.Nishizaki, N.Yasuda, Y.Yamashita, and K.Itou.
Developing an SDR test collection from Japanese lecture audio data.
In Proc. APSIPA ASC, pp.324--330, 2009. (PDF file)
K.Katsurada, A.Lee, T.Kawahara, T.Yotsukura, S.Morishima, T.Nishimoto, Y.Yamashita, and T.Nitta.
Development of a toolkit for spoken dialog systems with an anthropomorphic agent: Galatea.
In Proc. APSIPA ASC, pp.148--153, 2009. (PDF file)
A.Lee and T.Kawahara.
Recent development of open-source speech recognition engine Julius.
In Proc. APSIPA ASC, pp.131--137, 2009. (PDF file)
G.Neubig, S.Mori, and T.Kawahara.
A WFST-based log-linear framework for speaking-style transformation.
In Proc. INTERSPEECH, pp.1495--1498, 2009. (PDF file)
R.Gomez and T.Kawahara.
Optimization of dereverberation parameters based on likelihood of speech recognizer.
In Proc. INTERSPEECH, pp.1223--1226, 2009. (PDF file)
K.Sumi, T.Kawahara, J.Ogata, and M.Goto.
Acoustic event detection for spotting hot spots in podcasts.
In Proc. INTERSPEECH, pp.1143--1146, 2009. (PDF file)
Y.Akita, M.Mimura, and T.Kawahara.
Automatic transcription system for meetings of the Japanese national congress.
In Proc. INTERSPEECH, pp.84--87, 2009. (PDF file)
K.Komatani, T.Kawahara, and H.G.Okuno.
A model of temporally changing user behaviors in a deployed spoken dialogue system.
In Proc. Int'l Conf. User Modeling, Adaptation, and Personalization (UMAP) (LNCS 5535), pp.409--414, 2009. (PDF file)
T.Kawahara, M.Mimura, and Y.Akita.
Language model transformation applied to lightly supervised training of acoustic model for congress meetings.
In Proc. IEEE-ICASSP, pp.3853--3856, 2009. (PDF file)

FY 2008

M.Ablimit, M.Eli, and T.Kawahara.
Partly supervised Uighur morpheme segmentation.
In Proc. Oriental-COCOSDA Workshop, pp.71--76, 2008. (PDF file)
T.Shinozaki, S.Furui, and T.Kawahara.
Aggregated cross-validation and its efficient application to Gaussian mixture optimization.
In Proc. INTERSPEECH, pp.2382--2385, 2008. (PDF file)
T.Sasada, S.Mori, and T.Kawahara.
Extracting word-pronunciation pairs from comparable set of text and speech.
In Proc. INTERSPEECH, pp.1821--1824, 2008. (PDF file)
H.Wang and T.Kawahara.
A Japanese CALL system based on dynamic question generation and error prediction for ASR.
In Proc. INTERSPEECH, pp.1737--1740, 2008. (PDF file)
T.Kawahara, M.Toyokura, T.Misu, and C.Hori.
Detection of feeling through back-channels in spoken dialogue.
In Proc. INTERSPEECH, p. 1696, 2008. (PDF file)
T.Kawahara, H.Setoguchi, K.Takanashi, K.Ishizuka, and S.Araki.
Multi-modal recording, analysis and indexing of poster sessions.
In Proc. INTERSPEECH, pp.1622--1625, 2008. (PDF file)
K.Komatani, T.Kawahara, and H.G.Okuno.
Predicting ASR errors by exploiting barge-in rate of individual users for spoken dialogue systems.
In Proc. INTERSPEECH, pp.183--186, 2008. (PDF file)
K.Ishizuka, S.Araki, and T.Kawahara.
Statistical speech activity detection based on spatial power distribution for analyses of poster presentations.
In Proc. INTERSPEECH, pp.99--102, 2008. (PDF file)
T.Misu and T.Kawahara.
Bayes risk-based dialogue management for document retrieval system with speech interface.
In Proc. COLING, Vol.Posters \& Demo., pp.59--62, 2008. (PDF file)
H.Wang and T.Kawahara.
Effective error prediction using decision tree for ASR grammar network in CALL system.
In Proc. IEEE-ICASSP, pp.5069--5072, 2008. (PDF file)
T.Kawahara, Y.Nemoto, and Y.Akita.
Automatic lecture transcription by exploiting presentation slide information for language model adaptation.
In Proc. IEEE-ICASSP, pp.4929--4932, 2008. (PDF file)
S.Sakai, T.Kawahara, and S.Nakamura.
Admissible stopping in Viterbi beam search for unit selection in concatenative speech synthesis.
In Proc. IEEE-ICASSP, pp.4613--4616, 2008. (PDF file)
D.Cournapeau and T.Kawahara.
Using Variational Bayes Free Energy for unsupervised voice activity detection.
In Proc. IEEE-ICASSP, pp.4429--4432, 2008. (PDF file)
T.Shinozaki and T.Kawahara.
GMM and HMM training by aggregated EM algorithm with increased ensemble sizes for robust parameter estimation.
In Proc. IEEE-ICASSP, pp.4405--4408, 2008. (PDF file)

FY 2007

T.Shinozaki and T.Kawahara.
HMM training based on CV-EM and CV Gaussian mixture optimization.
In Proc. IEEE Workshop Automatic Speech Recognition \& Understanding (ASRU), pp.318--322, 2007. (PDF file)
H.Setoguchi, K.Takanashi, and T.Kawahara.
Multi-modal conversational analysis of poster presentations using multiple sensors.
In Proc. ICMI Workshop on Tagging, Mining and Retrieval of Human Related Activity Information, pp.44--47, 2007. (PDF file)
D.Cournapeau and T.Kawahara.
Evaluation of real-time voice activity detection based on high order statistics.
In Proc. INTERSPEECH, pp.2945--2948, 2007. (PDF file)
T.Misu and T.Kawahara.
Bayes risk-based optimization of dialogue management for document retrieval system with speech interface.
In Proc. INTERSPEECH, pp.2705--2708, 2007. (PDF file)
C.Waple, H.Wang, T.Kawahara, Y.Tsubota, and M.Dantsuji.
Evaluating and optimizing Japanese tutor system featuring dynamic question generation and interactive guidance.
In Proc. INTERSPEECH, pp.2177--2180, 2007. (PDF file)
T.Shinozaki and T.Kawahara.
Gaussian mixture optimization for HMM based on efficient cross-validation.
In Proc. INTERSPEECH, pp.2061--2064, 2007. (PDF file)
Y.Akita, Y.Nemoto, and T.Kawahara.
PLSA-based topic detection in meetings for adaptation of lexicon and language model.
In Proc. INTERSPEECH, pp.602--605, 2007. (PDF file)
K.Komatani, T.Kawahara, and H.G.Okuno.
Analyzing temporal transition of real user's behaviors in a spoken dialogue system.
In Proc. INTERSPEECH, pp.142--145, 2007. (PDF file)
T.Misu and T.Kawahara.
An interactive framework for document retrieval and presentation with question-answering function in restricted domain.
In Proc. Int'l Conf. Industrial, Engineering \& Other Applications of Artificial Intelligent Systems (IEA/AIE) (LNAI 4570), pp.126--134, 2007. (PDF file)
T.Misu and T.Kawahara.
Speech-based interactive information guidance system using question-answering technique.
In Proc. IEEE-ICASSP, Vol.4, pp.145--148, 2007. (PDF file)
T.Kawahara, M.Saikou, and K.Takanashi.
Automatic detection of sentence and clause units using local syntactic dependency.
In Proc. IEEE-ICASSP, Vol.4, pp.125--128, 2007. (PDF file)
Y.Akita and T.Kawahara.
Topic-independent speaking-style transformation of language model for spontaneous speech recognition.
In Proc. IEEE-ICASSP, Vol.4, pp.33--36, 2007. (PDF file)

FY 2006

T.Kawahara.
Intelligent transcription system based on spontaneous speech processing.
In Proc. Int'l Conference on Informatics Research for Development of Knowledge Society Infrastructure, pp.19--26, 2007. (PDF file)
Y.Kida and T.Kawahara.
Evaluation of voice activity detection by combining multiple features with weight adaptation.
In Proc. INTERSPEECH, pp.1966--1969, 2006. (PDF file)
S.Sakai and T.Kawahara.
Decision tree-based training of probabilistic concatenation models for corpus-based speech synthesis.
In Proc. INTERSPEECH, pp.1746--1749, 2006. (PDF file)
D.Cournapeau, T.Kawahara, K.Mase, and T.Toriyama.
Voice activity detector based on enhanced cumulant of LPC residual and on-line EM algorithm.
In Proc. INTERSPEECH, pp.1201--1204, 2006. (PDF file)
Y.Akita, M.Saikou, H.Nanjo, and T.Kawahara.
Sentence boundary detection of spontaneous Japanese using statistical language model and support vector machines.
In Proc. INTERSPEECH, pp.1033--1036, 2006. (PDF file)
C.Waple, Y.Tsubota, M.Dantsuji, and T.Kawahara.
Prototyping a CALL system for students of Japanese using dynamic diagram generation and interactive hints.
In Proc. INTERSPEECH, pp.821--824, 2006. (PDF file)
T.Misu and T.Kawahara.
A bootstrapping approach for developing language model of new spoken dialogue systems by selecting web texts.
In Proc. INTERSPEECH, pp.9--12, 2006. (PDF file)
R.Hamabe, K.Uchimoto, T.Kawahara, and H.Isahara.
Detection of quotations and inserted clauses and its application to dependency structure analysis in spontaneous Japanese.
In Proc. COLING-ACL, Vol.Poster Sessions, pp.324--330, 2006. (PDF file)
Y.Akita, C.Troncoso, and T.Kawahara.
Automatic transcription of meetings using topic-oriented language model adaptation.
In Proc. Western Pacific Acoustics Conference (WESPAC), 2006. (PDF file)
H.Nanjo, Y.Akita, and T.Kawahara.
Computer assisted speech transcription system for efficient speech archive.
In Proc. Western Pacific Acoustics Conference (WESPAC), 2006. (PDF file)
Y.Akita and T.Kawahara.
Efficient estimation of language model statistics of spontaneous speech via statistical transformation model.
In Proc. IEEE-ICASSP, Vol.1, pp.1049--1052, 2006. (PDF file)

FY 2005

T.Misu and T.Kawahara.
Speech-based information retrieval system with clarification dialogue strategy.
In Proc. Human Language Technology Conf. (HLT/EMNLP), pp.1003--1010, 2005. (PDF file)
Y.Kida and T.Kawahara.
Voice activity detection based on optimally weighted combination of multiple features.
In Proc. INTERSPEECH, pp.2621--2624, 2005. (PDF file)
C.Troncoso and T.Kawahara.
Trigger-based language model adaptation for automatic meeting transcription.
In Proc. INTERSPEECH, pp.1297--1300, 2005. (PDF file)
T.Misu and T.Kawahara.
Dialogue strategy to clarify user's queries for document retrieval system with speech interface.
In Proc. INTERSPEECH, pp.637--640, 2005. (PDF file)
H.Nanjo, T.Misu, and T.Kawahara.
Minimum Bayes-risk decoding considering word significance for information retrieval system.
In Proc. INTERSPEECH, pp.561--564, 2005. (PDF file)
I.R.Lane and T.Kawahara.
Utterance verification incorporating in-domain confidence and discourse coherence measures.
In Proc. INTERSPEECH, pp.421--424, 2005. (PDF file)
C.Troncoso, T.Kawahara, H.Yamamoto, and G.Kikui.
Trigger-based language model construction by combining different corpora.
In Proc. Pacific Assoc. Computational Linguistics (PACLING), pp.340--344, 2005. (PDF file)
H.Nanjo and T.Kawahara.
A new ASR evaluation measure and minimum Bayes-risk decoding for open-domain speech understanding.
In Proc. IEEE-ICASSP, Vol.1, pp.1053--1056, 2005. (PDF file)
I.R.Lane and T.Kawahara.
Incorporating dialogue context and topic clustering in out-of-domain detection.
In Proc. IEEE-ICASSP, Vol.1, pp.1045--1048, 2005. (PDF file)
Y.Akita and T.Kawahara.
Generalized statistical modeling of pronunciation variations using variable-length phone context.
In Proc. IEEE-ICASSP, Vol.1, pp.689--692, 2005. (PDF file)

FY 2004

T.Kawahara, A.Lee, K.Takeda, K.Itou, and K.Shikano.
Recent progress of open-source LVCSR engine Julius and Japanese model repository.
In Proc. INTERSPEECH, pp.3069--3072, 2004. (PDF file)
K.Shitaoka, H.Nanjo, and T.Kawahara.
Automatic transformation of lecture transcription into document style using statistical framework.
In Proc. INTERSPEECH, pp.2881--2884, 2004. (PDF file)
S.Ueno, I.R.Lane, and T.Kawahara.
Example-based training of dialogue planning incorporating user and situation models.
In Proc. INTERSPEECH, pp.2837--2840, 2004. (PDF file)
I.R.Lane, T.Kawahara, T.Matsui, and S.Nakamura.
Topic classification and verification modeling for out-of-domain utterance detection.
In Proc. INTERSPEECH, pp.2197--2200, 2004. (PDF file)
T.Kitade, H.Nanjo, and T.Kawahara.
Automatic extraction of key sentences from oral presentations using statistical measure based on discourse markers.
In Proc. INTERSPEECH, pp.2169--2172, 2004. (PDF file)
Y.Tsubota, M.Dantsuji, and T.Kawahara.
Practical use of English pronunciation system for Japanese students in the CALL classroom.
In Proc. INTERSPEECH, pp.1689--1692, 2004. (PDF file)
Y.Akita and T.Kawahara.
Language model adaptation based on PLSA of topics and speakers.
In Proc. INTERSPEECH, pp.1045--1048, 2004. (PDF file)
T.Misu, K.Komatani, and T.Kawahara.
Confirmation strategy for document retrieval systems with spoken dialog interface.
In Proc. INTERSPEECH, pp.45--48, 2004. (PDF file)
K.Komatani, T.Misu, T.Kawahara, and H.G.Okuno.
Efficient confirmation strategy for large-scale text retrieval systems with spoken dialogue interface.
In Proc. COLING, pp.1100--1106, 2004. (PDF file)
K.Shitaoka, K.Uchimoto, T.Kawahara, and H.Isahara.
Dependency structure analysis and sentence boundary detection in spontaneous Japanese.
In Proc. COLING, pp.1107--1113, 2004. (PDF file)
Y.Akita, M.Hasegawa, and T.Kawahara.
Automatic audio archiving system for panel discussions.
In Proc. IEEE Int'l Conf. Multimedia and Expo (ICME), 2004. (PDF file)
Y.Tsubota, M.Dantsuji, and T.Kawahara.
Practical use of autonomous English pronunciation learning system for Japanese students.
In Proc. InSTIL/ICALL -- NLP and Speech Technologies in Advanced Language Learning Systems, pp.139--142, 2004. (PDF file)
A.Lee, K.Shikano, and T.Kawahara.
Real-time word confidence scoring using local posterior probabilities on tree trellis search.
In Proc. IEEE-ICASSP, Vol.1, pp.793--796, 2004. (PDF file)
I.R.Lane, T.Kawahara, T.Matsui, and S.Nakamura.
Out-of-domain detection based on confidence measures from multiple topic classification.
In Proc. IEEE-ICASSP, Vol.1, pp.757--760, 2004. (PDF file)
H.Nanjo, T.Kitade, and T.Kawahara.
Automatic indexing of key sentences for lecture archives using statistics of presumed discourse markers.
In Proc. IEEE-ICASSP, Vol.1, pp.449--452, 2004. (PDF file)
M.Nishida and T.Kawahara.
Speaker indexing and adaptation using speaker clustering based on statistical model selection.
In Proc. IEEE-ICASSP, Vol.1, pp.353--356, 2004. (PDF file)
K.Komatani, R.Ito, T.Kawahara, and H.G.Okuno.
Recognition of emotional states in spoken dialogue with a robot.
In Proc. Int'l Conf. Industrial \& Engineering Applications of Artificial Intelligence \& Expert Systems (IEA/AIE) (LNAI 3029), pp.413--423, 2004. (PDF file)
T.Kawahara.
Automatic speech transcription and archiving system using the Corpus of Spontaneous Japanese.
In Proc. Int'l Congress Acoustics (ICA), pp.161--164, 2004. (PDF file)

FY 2003

T.Kawahara.
Spoken language processing for audio archives of lectures and panel discussions.
In Proc. Int'l Conference on Informatics Research for Development of Knowledge Society Infrastructure, pp.23--30, 2004. (PDF file)
T.Kawahara, T.Kitade, K.Shitaoka, and H.Nanjo.
Efficient access to lecture audio archives through spoken language processing.
In Proc. Special Workshop in Maui (SWIM), 2004. (PDF file)
T.Kawahara, K.Shitaoka, T.Kitade, and H.Nanjo.
Automatic indexing of key sentences for lecture archives.
In Proc. IEEE Workshop Automatic Speech Recognition \& Understanding (ASRU), 2003. (PDF file)
Y.Tsubota, M.Dantsuji, and T.Kawahara.
An English pronunciation learning system for Japanese students based on diagnosis of critical pronunciation errors.
In Proc. EUROCALL, p.204, 2003. (PDF file)
Y.Akita and T.Kawahara.
Unsupervised speaker indexing using anchor models and automatic transcription of discussions.
In Proc. EUROSPEECH, pp.2985--2988, 2003. (PDF file)
M.Nishida and T.Kawahara.
Speaker model selection using Bayesian information criterion for speaker indexing and speaker adaptation.
In Proc. EUROSPEECH, pp.1849--1852, 2003. (PDF file)
T.Kawahara, R.Ito, and K.Komatani.
Spoken dialogue system for queries on appliance manuals using hierarchical confirmation strategy.
In Proc. EUROSPEECH, pp.1701--1704, 2003. (PDF file)
K.Komatani, S.Ueno, T.Kawahara, and H.G.Okuno.
User modeling in spoken dialogue systems for flexible guidance generation.
In Proc. EUROSPEECH, pp.745--748, 2003. (PDF file)
I.R.Lane, T.Matsui, S.Nakamura, and T.Kawahara.
Hierarchical topic classification for dialog speech recognition based on language model switching.
In Proc. EUROSPEECH, pp.429--432, 2003. (PDF file)
K.Komatani, S.Ueno, T.Kawahara, and H.G.Okuno.
Flexible guidance generation using user model in spoken dialogue systems.
In Proc. ACL, pp.256--263, 2003. (PDF file)
Y.Kiyota, S.Kurohashi, T.Misu, K.Komatani, T.Kawahara, and F.Kido.
Dialog navigator: A spoken dialog Q-A system based on large text knowledge base.
In Proc. ACL, Vol.Interactive Poster \& Demo., pp.149--152, 2003. (PDF file)
K.Komatani, F.Adachi, S.Ueno, T.Kawahara, and H.G.Okuno.
Flexible spoken dialogue system based on user models and dynamic generation of VoiceXML scripts.
In Proc. SIGdial Meeting Discourse \& Dialogue, pp.87--96, 2003. (PDF file)
T.Kawahara, H.Nanjo, T.Shinozaki, and S.Furui.
Benchmark test for speech recognition using the Corpus of Spontaneous Japanese.
In Proc. ISCA \& IEEE Workshop on Spontaneous Speech Processing and Recognition, pp.135--138, 2003. (PDF file)
H.Nanjo, K.Shitaoka, and T.Kawahara.
Automatic transformation of lecture transcription into document style using statistical framework.
In Proc. ISCA \& IEEE Workshop on Spontaneous Speech Processing and Recognition, pp.215--218, 2003. (PDF file)
Y.Akita, M.Nishida, and T.Kawahara.
Automatic transcription of discussions using unsupervised speaker indexing.
In Proc. ISCA \& IEEE Workshop on Spontaneous Speech Processing and Recognition, pp.79--82, 2003. (PDF file)
H.Nanjo and T.Kawahara.
Unsupervised language model adaptation for lecture speech recognition.
In Proc. ISCA \& IEEE Workshop on Spontaneous Speech Processing and Recognition, pp.75--78, 2003. (PDF file)
I.R.Lane, T.Kawahara, and T.Matsui.
Language model switching based on topic detection for dialog speech recognition.
In Proc. IEEE-ICASSP, Vol.1, pp.616--619, 2003. (PDF file)
M.Nishida and T.Kawahara.
Unsupervised speaker indexing using speaker model selection based on Bayesian information criterion.
In Proc. IEEE-ICASSP, Vol.1, pp.172--175, 2003. (PDF file)

FY 2002

K.Okuda, T.Kawahara, and S.Nakamura.
Speaking rate compensation based on likelihood criterion in acoustic model training and decoding.
In Proc. ICSLP, pp.2589--2592, 2002. (PDF file)
Y.Tsubota, T.Kawahara, and M.Dantsuji.
Recognition and verification of English by Japanese students for computer-assisted language learning system.
In Proc. ICSLP, pp.1205--1208, 2002. (PDF file)
K.Imoto, Y.Tsubota, A.Raux, T.Kawahara, and M.Dantsuji.
Modeling and automatic detection of English sentence stress for computer-assisted English prosody learning system.
In Proc. ICSLP, pp.749--752, 2002. (PDF file)
A.Raux and T.Kawahara.
Automatic intelligibility assessment and diagnosis of critical pronunciation errors for computer-assisted pronunciation learning.
In Proc. ICSLP, pp.737--740, 2002. (PDF file)
Y.Yamakata, T.Kawahara, and H.G.Okuno.
Belief network based disambiguation of object reference in spoken dialogue system for robot.
In Proc. ICSLP, pp.177--180, 2002. (PDF file)
K.Komatani, T.Kawahara, R.Ito, and H.G.Okuno.
Efficient dialogue strategy to find users' intended items from information query results.
In Proc. COLING, pp.481--487, 2002. (PDF file)
Y.Yamakata, T.Kawahara, and H.G.Okuno.
Belief network based disambiguation of object reference in spoken dialogue system for robot.
In Proc. ISCA Workshop on Multi-Modal Dialogue in Mobile Environments, 2002. (PDF file)
A.Lee, T.Kawahara, K.Takeda, M.Mimura, A.Yamada, A.Ito, K.Itou, and K.Shikano.
Continuous speech recognition consortium -- an open repository for CSR tools and models --.
In Proc. Int'l Conf. Language Resources \& Evaluation (LREC), pp.1438--1441, 2002. (PDF file)
T.Kawahara and M.Hasegawa.
Automatic indexing of lecture speech by extracting topic-independent discourse markers.
In Proc. IEEE-ICASSP, pp.1--4, 2002. (PDF file)
H.Nanjo and T.Kawahara.
Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition.
In Proc. IEEE-ICASSP, pp.725--728, 2002. (PDF file)

FY 2001

A.Raux and T.Kawahara.
Optimizing computer-assisted pronunciation instruction by selecting relevant training topics.
In InSTIL Advanced Workshop, 2002. (PDF file)
Y.Tsubota, T.Kawahara, and M.Dantsuji.
CALL system for Japanese students of English using pronunciation error prediction and formant structure estimation.
In InSTIL Advanced Workshop, 2002. (PDF file)
T.Kawahara, H.Nanjo, and S.Furui.
Automatic transcription of spontaneous lecture speech.
In Proc. IEEE Workshop Automatic Speech Recognition \& Understanding (ASRU), 2001. (PDF file)
H.Nanjo, K.Kato, and T.Kawahara.
Speaking rate dependent acoustic modeling for spontaneous lecture speech recognition.
In Proc. EUROSPEECH, pp.2531--2534, 2001. (PDF file)
A.Lee, T.Kawahara, and K.Shikano.
Julius -- an open source real-time large vocabulary recognition engine.
In Proc. EUROSPEECH, pp.1691--1694, 2001. (PDF file)
K.Komatani, K.Tanaka, H.Kashima, and T.Kawahara.
Domain-independent spoken dialogue platform using key-phrase spotting based on combined language model.
In Proc. EUROSPEECH, pp.1319--1322, 2001. (PDF file)
A.Lee, T.Kawahara, and K.Shikano.
Gaussian mixture selection using context-independent HMM.
In Proc. IEEE-ICASSP, pp.69--72, 2001. (PDF file)

FY 2000

T.Kawahara, A.Lee, T.Kobayashi, K.Takeda, N.Minematsu, S.Sagayama, K.Itou, A.Ito, M.Yamamoto, A.Yamada, T.Utsuro, and K.Shikano.
Free software toolkit for Japanese large vocabulary continuous speech recognition.
In Proc. ICSLP, Vol.4, pp.476--479, 2000. (PDF file)
Y.Tsubota, M.Dantsuji, and T.Kawahara.
Computer-assisted English vowel learning system for Japanese speakers using cross language formant structures.
In Proc. ICSLP, Vol.3, pp.566--569, 2000. (PDF file)
K.Imoto, M.Dantsuji, and T.Kawahara.
Modelling of the perception of English sentence stress for computer-assisted language learning.
In Proc. ICSLP, Vol.3, pp.175--178, 2000. (PDF file)
H.Nanjo, A.Lee, and T.Kawahara.
Automatic diagnosis of recognition errors in large vocabulary continuous speech recognition systems.
In Proc. ICSLP, Vol.2, pp.1027--1030, 2000. (PDF file)
K.Komatani and T.Kawahara.
Generating effective confirmation and guidance using two-level confidence measures for dialogue systems.
In Proc. ICSLP, Vol.2, pp.648--651, 2000. (PDF file)
K.Kato, H.Nanjo, and T.Kawahara.
Automatic transcription of lecture speech using topic-independent language modeling.
In Proc. ICSLP, Vol.1, pp.162--165, 2000. (PDF file)
H.Fujisaki, K.Shirai, S.Doshita, S.Nakagawa, K.Hirose, S.Itahashi, T.Kawahara, S.Ohno, H.Kikuchi, K.Abe, and S.Kiriyama.
Overview of an intelligent system for information retrieval based on human-machine dialogue through spoken language.
In Proc. ICSLP, Vol.1, pp.70--73, 2000. (PDF file)
T.Kawahara, K.Komatani, and S.Doshita.
Dialogue management using concept-level confidence measures of speech recognition.
In Proc. Int'l Sympo. on Spoken Dialogue, 2000. (PDF file)
K.Komatani and T.Kawahara.
Flexible mixed-initiative dialogue management using concept-level confidence measures of speech recognizer output.
In Proc. COLING, pp.467--473, 2000. (PDF file)
A.Lee, T.Kawahara, K.Takeda, and K.Shikano.
A new phonetic tied-mixture model for efficient decoding.
In Proc. IEEE-ICASSP, pp.1269--1272, 2000. (PDF file)

FY 1999

T.Kawahara, T.Kobayashi, K.Takeda, N.Minematsu, K.Itou, M.Yamamoto, A.Yamada, T.Utsuro, and K.Shikano.
Japanese dictation toolkit -- plug-and-play framework for speech recognition R\&D --.
In Proc. IEEE Workshop Automatic Speech Recognition \& Understanding (ASRU), pp.393--396, 1999. (PDF file)