FY 2024 | 2023 | 2022 | 2021 | 2020
FY 2019 | 2018 | 2017 | 2016 | 2015 | 2014 | 2013 | 2012 | 2011 | 2010
FY 2009 | 2008 | 2007 | 2006 | 2005 | 2004 | 2003 | 2002 | 2001 | 2000
FY 2024
-
M.Elmers, K.Inoue, D.Lala, K.Ochi, and T.Kawahara.
Analysis and detection of differences in spoken user behaviors
between autonomous and wizard-of-oz systems.
In Proc. Oriental-COCOSDA Workshop, 2024.
(PDF file)
-
Y.Fu, C.Chu, and T.Kawahara.
StyEmp: Stylizing empathetic response generation via multi-grained
prefix encoder and personality reinforcement.
In Proc. SIGdial Meeting Discourse & Dialogue, pp.172--185,
2024.
(PDF file)
-
T.Honda, S.Sakai, and T.Kawahara.
Efficient and robust long-form speech recognition with hybrid
h3-conformer.
In Proc. INTERSPEECH, pp.2985--2989, 2024.
(PDF file)
-
H.Shi and T.Kawahara.
Dual-path adaptation of pretrained feature extraction module for
robust automatic speech recognition.
In Proc. INTERSPEECH, pp.2850--2854, 2024.
(PDF file)
-
Y.Gao, H.Shi, C.Chu, and T.Kawahara.
Speech emotion recognition with multi-level acoustic and semantic
information extraction and interaction.
In Proc. INTERSPEECH, pp.1060--1064, 2024.
(PDF file)
-
K.Ochi, K.Inoue, D.Lala, and T.Kawahara.
Entrainment analysis and prosody prediction of subsequent
interlocutor's backchannels in dialogue.
In Proc. INTERSPEECH, pp.462--466, 2024.
(PDF file)
-
K.Inoue, B.Jiang, E.Ekstedt, T.Kawahara, and G.Skantze.
Multilingual turn-taking prediction using voice activity projection.
In Proc. COLING, pp.11873--11883, 2024.
(PDF file)
-
M.Masuyama, T.Kawahara, and K.Matsuda.
Video retrieval system using automatic speech recognition for the
Japanese Diet.
In ParlaCLARIN IV Workshop, pp.145--148, 2024.
(PDF file)
-
T.Kawahara.
Quantitative analysis of editing in transcription process in
Japanese and European Parliaments and its diachronic changes.
In ParlaCLARIN IV Workshop, pp.66--69, 2024.
(PDF file)
-
H.Shi, K.Shimada, M.Hirano, T.Shibuya, Y.Koyama, Z.Zhong, S.Takahashi,
T.Kawahara, and Y.Mitsufuji.
Diffusion-based speech enhancement with joint generative and
predictive decoders.
In Proc. IEEE-ICASSP, pp.12951--12955, 2024.
(PDF file)
-
Y.Gao, H.Shi, C.Chu, and T.Kawahara.
Enhancing two-stage finetuning for speech emotion recognition using
adapters.
In Proc. IEEE-ICASSP, pp.11316--11320, 2024.
(PDF file)
-
W.Zhou, Z.Yang, C.Chu, S.Li, R.Dabre, Y.Zhao, and T.Kawahara.
MOS-FAD: Improving fake audio detection via automatic mean opinion
score prediction.
In Proc. IEEE-ICASSP, pp.876--880, 2024.
(PDF file)
-
K.Shimada, K.Uchida, Y.Koyama, T.Shibuya, S.Takahashi, Y.Mitsufuji, and
T.Kawahara.
Zero- and few-shot sound event localization and detection.
In Proc. IEEE-ICASSP, pp.636--640, 2024.
(PDF file)
FY 2023
-
K.Inoue, D.Lala, K.Ochi, T.Kawahara, and G.Skantze.
An analysis of user behaviours for objectively evaluating spoken
dialogue systems.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2024.
(PDF file)
-
H.Kawai, D.Lala, K.Inoue, K.Ochi, and T.Kawahara.
Evaluation of a semi-autonomous attentive listening system with
takeover prompting.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2024.
(PDF file)
-
K.Yamamoto, S.Kawano, T.Kawahara, and K.Yoshino.
Data augmentation for robust natural language generation based on
phrase alignment and sentence structure.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2024.
(PDF file)
-
Y.Fu, H.Song, T.Zhao, and T.Kawahara.
Enhancing personality recognition in dialogue by data augmentation
and heterogeneous conversational graph networks.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2024.
(PDF file)
-
Z.H.Pang, Y.Fu, D.Lala, K.Ochi, K.Inoue, and T.Kawahara.
Acknowledgment of emotional states: Generating validating responses
for empathetic dialogue.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2024.
(PDF file)
-
K.Inoue, B.Jiang, E.Ekstedt, T.Kawahara, and G.Skantze.
Real-time and continuous turn-taking prediction using voice activity
projection.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), Vol.Demo.
Paper, 2024.
(PDF file)
-
S.Yamashita, K.Inoue, A.Guo, S.Mochizuki, T.Kawahara, and R.Higashinaka.
RealPersonaChat: A realistic persona chat corpus with
interlocutors' own personalities.
In Proc. PACLIC, 2023.
(PDF file)
-
E.Nakamura.
Computational Analysis of Selection and Mutation Probabilities in the Evolution of Chord Progressions.
In Proc. International Symposium on Computer Music Multidisciplinary Research (CMMR), pp.462--473, 2023.
(PDF file)
-
E.Nakamura, T.Eipert, and F.C.Moss.
Historical Changes of Modes and their Substructure Modeled as Pitch Distributions in Plainchant from the 1100s to the 1500s.
In Proc. International Symposium on Computer Music Multidisciplinary Research (CMMR), pp.450--461, 2023.
(PDF file)
-
T.Nabeoka, E.Nakamura, and K.Yoshii.
Automatic Orchestration of Piano Scores for Wind Bands with User-Specified Instrumentation.
In Proc. International Symposium on Computer Music Multidisciplinary Research (CMMR), pp.387--394, 2023.
(PDF file)
-
Y.Fujita, Y.Bando, K.Imoto, M.Onishi, and K.Yoshii.
DOA-Aware Audio-Visual Self-Supervised Learning for Sound Event Localization and Detection.
In Proc. APSIPA ASC, pp.2077--2083, 2023.
(PDF file)
-
J.Zhao, E.Nakamura, and K.Yoshii.
Multimodal Multifaceted Music Emotion Recognition Based on Self-Attentive Fusion of Psychology-Inspired Symbolic and Acoustic Features.
In Proc. APSIPA ASC, pp.1657--1661, 2023.
(PDF file)
-
T.Deng, E.Nakamura, and K.Yoshii.
Audio-To-Score Singing Transcription Based on Joint Estimation of Pitches, Onsets, and Metrical Positions with Tatum-Level CTC Loss.
In Proc. APSIPA ASC, pp.583--590, 2023.
(PDF file)
-
T-P.Chen, L.Su, and K.Yoshii.
Learning Multifaceted Self-Similarity for Musical Structure Analysis.
In Proc. APSIPA ASC, pp.165--172, 2023.
(PDF file)
-
D.Kamakura, E.Nakamura, and K.Yoshii.
CTC2: End-To-End Drum Transcription Based on Connectionist Temporal Classification with Constant Tempo Constraint.
In Proc. APSIPA ASC, pp.158--164, 2023.
(PDF file)
-
D.Kamakura, E.Nakamura, K.Yoshii, and T.Oyama.
Joint Drum Transcription and Metrical Analysis Based on Periodicity-Aware Multi-Task Learning.
In Proc. APSIPA ASC, pp.151--157, 2023.
(PDF file)
-
K.Inoue, D.Lala, K.Ochi, T.Kawahara, and G.Skantze.
Towards objective evaluation of socially-situated conversational
robots: Assessing human-likeness through multimodal user behaviors.
In Proc. ICMI (Companion; Late Breaking Results), pp.86--90,
2023.
(PDF file)
-
Y.Fu, K.Inoue, C.Chu, and T.Kawahara.
Reasoning before responding: Integrating commonsense-based causality
explanation for empathetic response generation.
In Proc. SIGdial Meeting Discourse & Dialogue, 2023.
(PDF file)
-
S.Kobuki, K.Seaborn, S.Tokunaga, K.Fukumori, S.Hidaka, K.Tamura, K.Inoue,
T.Kawahara, and M.Otake-Matsuura.
Robotic backchanneling in online conversation facilitation: A
cross-generational study.
In Proc. RO-MAN, 2023.
(PDF file)
-
Y.Gao, C.Chu, and T.Kawahara.
Two-stage finetuning of wav2vec 2.0 for speech emotion recognition
with ASR and gender pretraining.
In Proc. INTERSPEECH, pp.3635--3639, 2023.
(PDF file)
-
J.Lee, M.Mimura, and T.Kawahara.
Embedding articulatory constraints for low-resource speech
recognition based on large pre-trained model.
In Proc. INTERSPEECH, pp.1392--1396, 2023.
(PDF file)
-
M.Terao, E.Nakamura, and K.Yoshii.
Neural Band-to-Piano Score Arrangement with Stepless Difficulty Control.
In Proc. IEEE-ICASSP, 2023.
(PDF file)
-
H.Shi, M.Mimura, L.Wang, J.Dang, and T.Kawahara.
Time-domain speech enhancement assisted by multi-resolution frequency
encoder and decoder.
In Proc. IEEE-ICASSP, 2023.
(PDF file)
-
K.Soky, S.Li, C.Chu, and T.Kawahara.
Domain and language adaptation using heterogeneous datasets for
wav2vec2.0-based speech recognition of low-resource language.
In Proc. IEEE-ICASSP, 2023.
(PDF file)
FY 2022
-
K.Yamamoto, K.Inoue, and T.Kawahara.
Character adaptation of spoken dialogue systems based on user
personalities.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2023.
(PDF file)
-
Y.Fu, K.Inoue, D.Lala, K.Yamamoto, C.Chu, and T.Kawahara.
Improving empathetic response generation with retrieval based on
emotion recognition.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2023.
(PDF file)
-
Y.Muraki, H.Kawai, K.Yamamoto, K.Inoue, D.Lala, and T.Kawahara.
Semi-autonomous guide agents with simultaneous handling of multiple
users.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2023.
(PDF file)
-
D.Lala, K.Inoue, T.Kawahara, and K.Sawada.
Backchannel generation model for a third party listening agent.
In Proc. Human-Agent Interaction (HAI), pp.114--122, 2022.
(PDF file)
-
H.Shi, Y.Shu, L.Wang, J.Dang, and T.Kawahara.
Fusing multiple bandwidth spectrograms for improving speech
enhancement.
In Proc. APSIPA ASC, pp.1935--1940, 2022.
(PDF file)
-
H.Shi, L.Wang, S.Li, J.Dang, and T.Kawahara.
Subband-based spectrogram fusion for speech enhancement by combining
mapping and masking approaches.
In Proc. APSIPA ASC, pp.286--292, 2022.
(PDF file)
-
K.Sekiguchi, A.A.Nugraha, Y.Du, Y.Bando, M.Fontaine, and K.Yoshii.
Direction-Aware Adaptive Online Neural Speech Enhancement with an
Augmented Reality Headset in Real Noisy Conversational Environments.
In Proc. IEEE/RSJ IROS, 2022.
(PDF file)
-
H.Futami, H.Inaguma, S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
Non-autoregressive error correction for CTC-based ASR with
phone-conditioned masked LM.
In Proc. INTERSPEECH, pp.3889--3893, 2022.
(PDF file)
-
Y.Du, A.A.Nugraha, K.Sekiguchi, Y.Bando, M.Fontaine, and K.Yoshii.
Direction-aware joint adaptation of neural speech enhancement and recognition in real multiparty conversational environments.
In Proc. INTERSPEECH, pp.2918--2922, 2022.
(PDF file)
-
S.Kawano, M.Arioka, A.Yuguchi, K.Yamamoto, K.Inoue, T.Kawahara, S.Nakamura, and
K.Yoshino.
Multimodal persuasive dialogue corpus using teleoperated android.
In Proc. INTERSPEECH, pp.2308--2312, 2022.
(PDF file)
-
J.Nozaki, T.Kawahara, K.Ishizuka, and T.Hashimoto.
End-to-end speech-to-punctuated-text recognition.
In Proc. INTERSPEECH, pp.1811--1815, 2022.
(PDF file)
-
K.Soky, S.Li, M.Mimura, C.Chu, and T.Kawahara.
Leveraging simultaneous translation for enhancing transcription of
low-resource language via cross attention mechanism.
In Proc. INTERSPEECH, pp.1362--1366, 2022.
(PDF file)
-
H.Shi, L.Wang, S.Li, J.Dang, and T.Kawahara.
Monaural speech enhancement based on spectrogram decomposition for
convolutional neural network-sensitive feature extraction.
In Proc. INTERSPEECH, pp.221--225, 2022.
(PDF file)
-
H.Kawai, Y.Muraki, K.Yamamoto, D.Lala, K.Inoue, and T.Kawahara.
Simultaneous job interview system using multiple semi-autonomous
agents.
In Proc. SIGdial Meeting Discourse & Dialogue, Vol.Demo.
Paper, pp.107--110, 2022.
(PDF file)
-
A.A.Nugraha, K.Sekiguchi, M.Fontaine, Y.Bando, and K.Yoshii.
DNN-Free Low-Latency Adaptive Speech Enhancement Based on
Frame-Online Beamforming Powered by Block-Online FastMNMF.
In Proc. IEEE Int'l Workshop Acoustic Signal Enhancement (IWAENC),
2022.
(PDF file)
-
Y.Sumura, K.Sekiguchi, Y.Bando, A.A.Nugraha, and K.Yoshii.
Joint Localization and Synchronization of Distributed
Camera-Attached Microphone Arrays for Indoor Scene Analysis.
In Proc. IEEE Int'l Workshop Acoustic Signal Enhancement (IWAENC),
2022.
(PDF file)
-
S.Ueno and T.Kawahara.
Phone-informed refinement of synthesized mel spectrogram for data
augmentation in speech recognition.
In Proc. IEEE-ICASSP, pp.8572--8576, 2022.
(PDF file)
-
H.Zhang, M.Mimura, T.Kawahara, and K.Ishizuka.
Selective multi-task learning for speech emotion recognition using
corpora of different styles.
In Proc. IEEE-ICASSP, pp.7707--7711, 2022.
(PDF file)
-
A.A.Nugraha, K.Sekiguchi, M.Fontaine, Y.Bando, and K.Yoshii.
Flow-Based Fast Multichannel Nonnegative Matrix Factorization for Blind Source Separation.
In Proc. IEEE-ICASSP, pp.501--505, 2022.
(PDF file)
-
M.Terao, Y.Hiramatsu, R.Ishizuka, Y.Wu, and K.Yoshii.
Difficulty-Aware Neural Band-to-Piano Score Arrangement Based on Note- and Statistic-Level Criteria.
In Proc. IEEE-ICASSP, pp.196--200, 2022.
(PDF file)
FY 2021
-
M.Mimura, S.Sakai, and T.Kawahara.
An end-to-end model from speech to clean transcript for parliamentary
meetings.
In Proc. APSIPA ASC, pp.465--470, 2021.
(PDF file)
-
H.Shi, L.Wang, S.Li, C.Fan, J.Dang, and T.Kawahara.
Spectrograms fusion-based end-to-end robust automatic speech
recognition.
In Proc. APSIPA ASC, pp.438--442, 2021.
(PDF file)
-
K.Soky, S.Li, M.Mimura, C.Chu, and T.Kawahara.
On the use of speaker information for automatic speech recognition in
speaker-imbalanced corpora.
In Proc. APSIPA ASC, pp.433--437, 2021.
(PDF file)
-
H.Futami, H.Inaguma, M.Mimura, S.Sakai, and T.Kawahara.
ASR rescoring and confidence estimation with ELECTRA.
In Proc. IEEE Workshop Automatic Speech Recognition &
Understanding (ASRU), pp.380--387, 2021.
(PDF file)
-
S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
Data augmentation for ASR using TTS via a discrete
representation.
In Proc. IEEE Workshop Automatic Speech Recognition &
Understanding (ASRU), pp.68--75, 2021.
(PDF file)
-
K.Soky, M.Mimura, T.Kawahara, S.Li, C.Ding, C.Chu, and S.Sam.
Khmer speech translation corpus of the Extraordinary Chambers in the
Courts of Cambodia (ECCC).
In Proc. Oriental-COCOSDA Workshop, pp.122--127, 2021.
(PDF file)
-
T.Oyama, R.Ishizuka, and K.Yoshii.
Phase-Aware Joint Beat and Downbeat Estimation Based on
Periodicity of Metrical Structure.
In Proc. ISMIR, pp.493--499, 2021.
(PDF file)
-
Y.Hiramatsu, E.Nakamura, and K.Yoshii.
Joint Estimation of Note Values and Voices for Audio-to-Score
Piano Transcription.
In Proc. ISMIR, pp.278--284, 2021.
(PDF file)
-
H.Inaguma, M.Mimura, and T.Kawahara.
VAD-free streaming hybrid CTC/Attention ASR for unsegmented
recording.
In Proc. INTERSPEECH, pp.4049--4053, 2021.
(PDF file)
-
H.Inaguma, M.Mimura, and T.Kawahara.
StableEmit: Selection probability discount for reducing emission
latency of streaming monotonic attention ASR.
In Proc. INTERSPEECH, pp.1817--1821, 2021.
(PDF file)
-
M.Fontaine, K.Sekiguchi, A.A.Nugraha, Y.Bando, and K.Yoshii.
Alpha-Stable Autoregressive Fast Multichannel Nonnegative Matrix Factorization for Joint Speech Enhancement and Dereverberation.
In Proc. INTERSPEECH, pp.661--665, 2021.
(PDF file)
-
K.Inoue, H.Sakamoto, K.Yamamoto, D.Lala, and T.Kawahara.
A multi-party attentive listening robot which stimulates involvement
from side participants.
In Proc. SIGdial Meeting Discourse & Dialogue, Vol.Demo.
Paper, pp.261--264, 2021.
(PDF file)
-
E.Ishii, G.I.Winata, S.Cahyawijaya, D.Lala, T.Kawahara, and P.Fung.
ERICA: An empathetic android companion for Covid-19 quarantine.
In Proc. SIGdial Meeting Discourse & Dialogue, Vol.Demo.
Paper, pp.257--260, 2021.
(PDF file)
-
T.Zhao and T.Kawahara.
Multi-referenced training for dialogue response generation.
In Proc. SIGdial Meeting Discourse & Dialogue, pp.190--201,
2021.
(PDF file)
-
H.Inaguma, T.Kawahara, and S.Watanabe.
Source and target bidirectional knowledge distillation for end-to-end
speech translation.
In Proc. NAACL-HLT, pp.1872--1881, 2021.
(PDF file)
-
H.Inaguma, Y.Higuchi, K.Duh, T.Kawahara, and S.Watanabe.
Non-autoregressive end-to-end speech translation with dual-decoder.
In Proc. IEEE-ICASSP, pp.7488--7492, 2021.
(PDF file)
-
K.Sekiguchi, Y.Bando, A.A.Nugraha, M.Fontaine, and K.Yoshii.
Autoregressive fast multichannel nonnegative matrix factorization
for joint blind source separation and dereverberation.
In Proc. IEEE-ICASSP, pp.511--516, 2021.
(PDF file)
-
Y.Hiramatsu, G.Shibata, R.Nishikimi, E.Nakamura, and K.Yoshii.
Statistical correction of transcribed melody notes based on probabilistic integration of a music language model and a transcription error model.
In Proc. IEEE-ICASSP, pp.256--261, 2021.
(PDF file)
FY 2020
-
D.Lala, K.Inoue, K.Yamamoto, and T.Kawahara.
Findings from human-android dialogue research with ERICA.
In Proc. IJCAI-2020 workshop on ROBOT-DIAL, 2020.
(PDF file)
-
S.Zhang, T.Zhao, and T.Kawahara.
Topic-relevant response generation using optimal transport for an
open-domain dialog system.
In Proc. COLING, pp.4067--4077, 2020.
(PDF file)
-
J.Woo, M.Mimura, K.Yoshii, and T.Kawahara.
End-to-end music-mixed speech recognition.
In Proc. APSIPA ASC, pp.800--804, 2020.
(PDF file)
-
M.Togami, Y.Masuyama, T.Komatsu, K.Yoshii, and T.Kawahara.
Computer-resource-aware deep speech separation with a run-time-specified
number of BLSTM layers.
In Proc. APSIPA ASC, pp.788--793, 2020.
(PDF file)
-
M.Wake, M.Togami, K.Yoshii, and T.Kawahara.
Integration of semi-blind speech source separation and voice activity
detection for flexible spoken dialogue.
In Proc. APSIPA ASC, pp.775--780, 2020.
(PDF file)
-
Y.Wu, E.Nakamura, and K.Yoshii.
A Variational Autoencoder for Joint Chord and Key Estimation from
Audio Chromagrams.
In Proc. APSIPA ASC, pp.500--506, 2020.
(PDF file)
-
R.Ishizuka, R.Nishikimi, E.Nakamura, and K.Yoshii.
Tatum-Level Drum Transcription Based On a Convolutional Recurrent
Neural Network with Language Model-Based Regularized Training.
In Proc. APSIPA ASC, pp.359--364, 2020.
(PDF file)
-
D.Lala, K.Inoue, and T.Kawahara.
Prediction of shared laughter for human-robot dialogue.
In Proc. ICMI (Companion; Late Breaking Results), pp.62--66,
2020.
(PDF file)
-
K.Inoue, K.Hara, D.Lala, K.Yamamoto, S.Nakamura, K.Takanashi, and T.Kawahara.
Job interviewer android with elaborate follow-up question generation.
In Proc. ICMI, pp.324--332, 2020.
(PDF file)
-
M.Fontaine, K.Sekiguchi, A.A.Nugraha, and K.Yoshii.
Unsupervised robust speech enhancement based on alpha-stable fast
multichannel nonnegative matrix factorization.
In Proc. INTERSPEECH, pp.4541--4545, 2020.
(PDF file)
-
K.Yamamoto, K.Inoue, and T.Kawahara.
Semi-supervised learning for character expression of spoken dialogue
systems.
In Proc. INTERSPEECH, pp.4188--4192, 2020.
(PDF file)
-
T.V.Dang, T.Zhao, S.Ueno, H.Inaguma, and T.Kawahara.
End-to-end speech-to-dialog-act recognition.
In Proc. INTERSPEECH, pp.3910--3914, 2020.
(PDF file)
-
H.Futami, H.Inaguma, S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
Distilling the knowledge of BERT for sequence-to-sequence ASR.
In Proc. INTERSPEECH, pp.3635--3639, 2020.
(PDF file)
-
K.Matsuura, M.Mimura, S.Sakai, and T.Kawahara.
Generative adversarial training data adaptation for very low-resource
automatic speech recognition.
In Proc. INTERSPEECH, pp.2737--2741, 2020.
(PDF file)
-
Y.Bando, K.Sekiguchi, A.A.Nugraha, and K.Yoshii.
Adaptive neural speech enhancement with a denoising variational
autoencoder.
In Proc. INTERSPEECH, pp.2437--2441, 2020.
(PDF file)
-
H.Inaguma, M.Mimura, and T.Kawahara.
Enhancing monotonic multihead attention for streaming ASR.
In Proc. INTERSPEECH, pp.2137--2141, 2020.
(PDF file)
-
H.Inaguma, M.Mimura, and T.Kawahara.
CTC-synchronous training for monotonic attention model.
In Proc. INTERSPEECH, pp.571--575, 2020.
(PDF file)
-
H.Feng, S.Ueno, and T.Kawahara.
End-to-end speech emotion recognition combined with acoustic-to-word
ASR model.
In Proc. INTERSPEECH, pp.501--505, 2020.
(PDF file)
-
G.Shibata, R.Nishikimi, and K.Yoshii.
Music structure analysis based on an LSTM-HSMM hybrid model.
In Proc. ISMIR, pp.15--22, 2020.
(PDF file)
-
Y.Du, K.Sekiguchi, Y.Bando, A.A.Nugraha, M.Fontaine, K.Yoshii, and T.Kawahara.
Semi-supervised multichannel speech separation based on a phone- and
speaker-aware deep generative model of speech spectrograms.
In Proc. EUSIPCO, pp.870--874, 2020.
(PDF file)
-
K.Yoshii, K.Sekiguchi, Y.Bando, M.Fontaine, and A.A.Nugraha.
Fast multichannel correlated tensor factorization for blind source separation.
In Proc. EUSIPCO, pp.306--310, 2020.
(PDF file)
-
T.Zhao, D.Lala, and T.Kawahara.
Designing precise and robust dialogue response evaluators.
In Proc. ACL, pp.26--33, 2020.
(PDF file)
-
K.Inoue, D.Lala, K.Yamamoto, S.Nakamura, K.Takanashi, and T.Kawahara.
An attentive listening system with android ERICA: Comparison of
autonomous and WOZ interactions.
In Proc. SIGdial Meeting Discourse & Dialogue, pp.118--127,
2020.
(PDF file)
-
S.Nakamura, C.T.Ishi, and T.Kawahara.
Analysis and modeling of between-sentence pauses in news speech by
Japanese newscasters.
In Proc. Int'l Conf. Speech Prosody, pp.680--684, 2020.
(PDF file)
-
S.Isonishi, K.Inoue, D.Lala, K.Takanashi, and T.Kawahara.
Response generation to out-of-database questions for example-based
dialogue systems.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2020.
(PDF file)
-
K.Yamamoto, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
A character expression model affecting spoken dialogue behaviors.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2020.
(PDF file)
-
K.Matsuura, S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
Speech corpus of Ainu folklore and end-to-end speech recognition
for Ainu language.
In Proc. Int'l Conf. Language Resources & Evaluation (LREC),
pp.2622--2628, 2020.
(PDF file)
FY 2019
-
H.Inaguma, K.Duh, T.Kawahara, and S.Watanabe.
Multilingual end-to-end speech translation.
In Proc. IEEE Workshop Automatic Speech Recognition &
Understanding (ASRU), pp.570--577, 2019.
(PDF file)
-
K.Soky, S.Li, T.Kawahara, and S.Seng.
Multi-lingual transformer training for Khmer automatic speech
recognition.
In Proc. APSIPA ASC, pp.1893--1896, 2019.
(PDF file)
-
G.Shibata, R.Nishikimi, E.Nakamura, and K.Yoshii.
Statistical music structure analysis based on a homogeneity-,
repetitiveness-, and regularity-aware hierarchical hidden
semi-Markov model.
In Proc. ISMIR, pp.268--275, 2019.
(PDF file)
-
R.Nishikimi, E.Nakamura, M.Goto, and K.Yoshii.
End-to-end melody note transcription based on a beat-synchronous
attention mechanism.
In Proc. IEEE Workshop Applications of Signal Processing to Audio
and Acoustics (WASPAA), 2019.
(PDF file)
-
T.Carsault, A.McLeod, P.Esling, J.Nika, E.Nakamura, and K.Yoshii.
Multi-step chord sequence prediction based on aggregated multi-scale
encoder-decoder networks.
In Proc. IEEE Workshop Machine Learning for Signal Processing
(MLSP), 2019.
(PDF file)
-
D.Lala, K.Inoue, and T.Kawahara.
Smooth turn-taking by a robot using an online continuous model to
generate turn-taking cues.
In Proc. ICMI, pp.226--234, 2019.
(PDF file)
-
S.Li, R.Dabre, X.Lu, P.Shen, T.Kawahara, and H.Kawai.
Improving transformer-based speech recognition systems with
compressed structure and speech attributes augmentation.
In Proc. INTERSPEECH, pp.4400--4404, 2019.
(PDF file)
-
D.Lala, S.Nakamura, and T.Kawahara.
Analysis of effect and timing of fillers in natural turn-taking.
In Proc. INTERSPEECH, pp.4175--4179, 2019.
(PDF file)
-
K.Hara, K.Inoue, K.Takanashi, and T.Kawahara.
Turn-taking prediction based on detection of transition relevance
place.
In Proc. INTERSPEECH, pp.4170--4174, 2019.
(PDF file)
-
Y.Li, T.Zhao, and T.Kawahara.
Improved end-to-end speech emotion recognition using self attention
mechanism and multitask learning.
In Proc. INTERSPEECH, pp.2803--2807, 2019.
(PDF file)
-
S.Li, X.Lu, C.Ding, P.Shen, T.Kawahara, and H.Kawai.
Investigating radical-based end-to-end speech recognition systems for
Chinese dialects and Japanese.
In Proc. INTERSPEECH, pp.2200--2204, 2019.
(PDF file)
-
S.Li, C.Ding, X.Lu, P.Shen, T.Kawahara, and H.Kawai.
End-to-end articulatory attribute modeling for low-resource
multilingual speech recognition.
In Proc. INTERSPEECH, pp.2145--2149, 2019.
(PDF file)
-
Y.Wu, T.Carsault, and K.Yoshii.
Automatic chord estimation based on a frame-wise convolutional
recurrent neural network with non-aligned annotations.
In Proc. EUSIPCO, 2019.
(PDF file)
-
K.Sekiguchi, A.A.Nugraha, Y.Bando, and K.Yoshii.
Fast multichannel source separation based on jointly diagonalizable
spatial covariance matrices.
In Proc. EUSIPCO, 2019.
(PDF file)
-
D.Lala, G.Wilcock, K.Jokinen, and T.Kawahara.
ERICA and WikiTalk.
In Proc. IJCAI, Vol.Demo. Paper, pp.6533--6535, 2019.
(PDF file)
-
S.Nakamura, C.T.Ishi, and T.Kawahara.
Prosodic characteristics of Japanese newscaster speech for
different speaking situations.
In Proc. Int'l Congress Phonetic Sciences (ICPhS), 2019.
(PDF file)
-
S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
Multi-speaker sequence-to-sequence speech synthesis for data
augmentation in acoustic-to-word speech recognition.
In Proc. IEEE-ICASSP, pp.6161--6165, 2019.
(PDF file)
-
H.Inaguma, J.Cho, M.K.Baskar, T.Kawahara, and S.Watanabe.
Transfer learning of language-independent end-to-end ASR with
language model fusion.
In Proc. IEEE-ICASSP, pp.6096--6100, 2019.
(PDF file)
-
A.A.Nugraha, K.Sekiguchi, and K.Yoshii.
A deep generative model of speech complex spectrograms.
In Proc. IEEE-ICASSP, pp.905--909, 2019.
(PDF file)
-
S.Ueda, K.Shibata, Y.Wada, R.Nishikimi, E.Nakamura, and K.Yoshii.
Bayesian drum transcription based on nonnegative matrix factor
decomposition with a deep score prior.
In Proc. IEEE-ICASSP, pp.456--460, 2019.
(PDF file)
-
K.Shibata, R.Nishikimi, S.Fukayama, M.Goto, E.Nakamura, K.Itoyama, and
K.Yoshii.
Joint transcription of lead, bass, and rhythm guitars based on a
factorial hidden semi-Markov model.
In Proc. IEEE-ICASSP, pp.236--240, 2019.
(PDF file)
-
E.Nakamura, K.Shibata, R.Nishikimi, and K.Yoshii.
Unsupervised melody style conversion.
In Proc. IEEE-ICASSP, pp.196--200, 2019.
(PDF file)
-
A.McLeod, E.Nakamura, and K.Yoshii.
Improved metrical alignment of MIDI performance based on a repetition-aware
online-adapted grammar.
In Proc. IEEE-ICASSP, pp.186--190, 2019.
(PDF file)
-
R.Nishikimi, E.Nakamura, S.Fukayama, M.Goto, and K.Yoshii.
Automatic singing transcription based on encoder-decoder recurrent
neural networks with a weakly-supervised attention mechanism.
In Proc. IEEE-ICASSP, pp.161--165, 2019.
(PDF file)
-
K.Inoue, K.Hara, D.Lala, S.Nakamura, K.Takanashi, and T.Kawahara.
A job interview dialogue system with autonomous android ERICA.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), Vol.Demo. Paper, 2019.
(PDF file)
-
K.Inoue, D.Lala, K.Yamamoto, K.Takanashi, and T.Kawahara.
Engagement-based adaptive behaviors for laboratory guide in
human-robot dialogue.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2019.
(PDF file)
-
K.Tanaka, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
End-to-end modeling for selection of utterance constructional units
via system internal states.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2019.
(PDF file)
FY 2018
-
M.Mimura, S.Ueno, H.Inaguma, S.Sakai, and T.Kawahara.
Leveraging sequence-to-sequence speech synthesis for enhancing
acoustic-to-word speech recognition.
In Proc. IEEE Spoken Language Technology Workshop (SLT), pp.
477--484, 2018.
(PDF file)
-
H.Inaguma, M.Mimura, S.Sakai, and T.Kawahara.
Improving OOV detection and resolution with external language
models in acoustic-to-word ASR.
In Proc. IEEE Spoken Language Technology Workshop (SLT), pp.
212--218, 2018.
(PDF file)
-
S.Li, X.Lu, R.Takashima, P.Shen, T.Kawahara, and H.Kawai.
Improving very deep time-delay neural network with vertical-attention
for effectively training CTC-based ASR systems.
In Proc. IEEE Spoken Language Technology Workshop (SLT), pp.
77--83, 2018.
(PDF file)
-
E.Nakamura, R.Nishikimi, S.Dixon, and K.Yoshii.
Probabilistic sequential patterns for singing transcription.
In Proc. APSIPA ASC, pp.1905--1912, 2018.
(PDF file)
-
K.Yamamoto, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
Dialogue behavior control model for expressing a character of
humanoid robots.
In Proc. APSIPA ASC, pp.1732--1737, 2018.
(PDF file)
-
K.Sekiguchi, Y.Bando, K.Yoshii, and T.Kawahara.
Bayesian multichannel speech enhancement with a deep speech prior.
In Proc. APSIPA ASC, pp.1233--1239, 2018.
(PDF file)
-
Y.Wada, R.Nishikimi, E.Nakamura, K.Itoyama, and K.Yoshii.
Sequential generation of singing F0 contours from musical note
sequences based on WaveNet.
In Proc. APSIPA ASC, pp.983--989, 2018.
(PDF file)
-
T.Kawahara.
Human-like conversational robot.
In Proc. APSIPA ASC (overview talk), 2018.
(PDF file)
-
D.Lala, K.Inoue, and T.Kawahara.
Evaluation of real-time deep learning turn-taking models for multiple
dialogue scenarios.
In Proc. ICMI, pp.78--86, 2018.
(PDF file)
-
H.Tsushima, E.Nakamura, K.Itoyama, and K.Yoshii.
Interactive arrangement of chords and melodies based on a
tree-structured generative model.
In Proc. ISMIR, 2018.
(PDF file)
-
K.Yoshii, K.Kitamura, Y.Bando, E.Nakamura, and T.Kawahara.
Independent low-rank tensor analysis for audio source separation.
In Proc. EUSIPCO, pp.1671--1675, 2018.
(PDF file)
-
S.Li, X.Lu, R.Takashima, P.Shen, T.Kawahara, and H.Kawai.
Improving CTC-based acoustic model with very deep residual
time-delay neural networks.
In Proc. INTERSPEECH, pp.3708--3712, 2018.
(PDF file)
-
S.Ueno, T.Moriya, M.Mimura, S.Sakai, Y.Yamaguchi, Y.Aono, and T.Kawahara.
Encoder transfer for attention-based acoustic-to-word speech
recognition.
In Proc. INTERSPEECH, pp.2424--2428, 2018.
(PDF file)
-
M.Mimura, S.Sakai, and T.Kawahara.
Forward-backward attention decoder.
In Proc. INTERSPEECH, pp.2232--2236, 2018.
(PDF file)
-
K.Hara, K.Inoue, K.Takanashi, and T.Kawahara.
Prediction of turn-taking using multitask learning with prediction of
backchannels and fillers.
In Proc. INTERSPEECH, pp.991--995, 2018.
(PDF file)
-
K.Inoue, D.Lala, K.Takanashi, and T.Kawahara.
Engagement recognition in spoken dialogue via neural network by
aggregating different annotators' models.
In Proc. INTERSPEECH, pp.616--620, 2018.
(PDF file)
-
T.Zhao and T.Kawahara.
A unified neural architecture for joint dialog act segmentation and
recognition in spoken dialog system.
In Proc. SIGdial Meeting Discourse & Dialogue, pp.201--208,
2018.
(PDF file)
-
K.Inoue, D.Lala, K.Takanashi, and T.Kawahara.
Latent character model for engagement recognition based on multimodal
behaviors.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2018.
(PDF file)
-
R.Nakanishi, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
Generating fillers based on dialog act pairs for smooth turn-taking
by humanoid robot.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2018.
(PDF file)
-
T.Kawahara.
Spoken dialogue system for a human-like conversational robot ERICA.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS)
(keynote speech), 2018.
(PDF file)
-
K.Yoshii.
Correlated tensor factorization for audio source separation.
In Proc. IEEE-ICASSP, pp.731--735, 2018.
(PDF file)
-
S.Ueno, H.Inaguma, M.Mimura, and T.Kawahara.
Acoustic-to-word attention-based model complemented with
character-level CTC-based model.
In Proc. IEEE-ICASSP, pp.5804--5808, 2018.
(PDF file)
-
Y.Bando, M.Mimura, K.Itoyama, K.Yoshii, and T.Kawahara.
Statistical speech enhancement based on probabilistic integration of
variational autoencoder and non-negative matrix factorization.
In Proc. IEEE-ICASSP, pp.716--720, 2018.
(PDF file)
-
K.Shimada, Y.Bando, M.Mimura, K.Itoyama, K.Yoshii, and T.Kawahara.
Unsupervised beamforming based on multichannel nonnegative matrix
factorization for noisy speech recognition.
In Proc. IEEE-ICASSP, pp.5734--5738, 2018.
(PDF file)
-
R.Duan, T.Kawahara, M.Dantsuji, and H.Nanjo.
Efficient learning of articulatory models based on multi-label
training and label correction for pronunciation learning.
In Proc. IEEE-ICASSP, pp.6239--6243, 2018.
(PDF file)
-
H.Inaguma, M.Mimura, K.Inoue, K.Yoshii, and T.Kawahara.
An end-to-end approach to joint social signal detection and automatic
speech recognition.
In Proc. IEEE-ICASSP, pp.6214--6218, 2018.
(PDF file)
-
E.Nakamura, E.Benetos, K.Yoshii, and S.Dixon.
Towards complete polyphonic music transcription:
Integrating multi-pitch detection and rhythm quantization.
In Proc. IEEE-ICASSP, pp.101--105, 2018.
(PDF file)
-
T.Kawahara, K.Inoue, D.Lala, and K.Takanashi.
Audio-visual conversation analysis by smart posterboard and humanoid
robot.
In Proc. IEEE-ICASSP, pp.6573--6577, 2018.
(PDF file)
FY 2017
-
T.Hagiya, K.Hoashi, and T.Kawahara.
Voice input tutoring system for older adults using input stumble
detection.
In Proc. ACM Int'l Conf. Intelligent User Interfaces (IUI), pp.
415--419, 2018.
(PDF file)
-
S.Li, X.Lu, P.Shen, R.Takashima, T.Kawahara, and H.Kawai.
Incremental training and constructing the very deep convolutional
residual network acoustic models.
In Proc. IEEE Workshop Automatic Speech Recognition &
Understanding (ASRU), pp.222--227, 2017.
(PDF file)
-
M.Mimura, S.Sakai, and T.Kawahara.
Cross-domain speech recognition using nonparallel corpora with
cycle-consistent adversarial networks.
In Proc. IEEE Workshop Automatic Speech Recognition &
Understanding (ASRU), pp.134--140, 2017.
(PDF file)
-
Y.Li, C.T.Ishi, N.Ward, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
Emotion recognition by combining prosody and sentiment analysis for
expressing reactive emotion by humanoid robot.
In Proc. APSIPA ASC, 2017.
(PDF file)
-
T.Kawahara.
Automatic meeting transcription system for the Japanese Parliament
(Diet).
In Proc. APSIPA ASC (overview talk), 2017.
(PDF file)
-
T.Zhao and T.Kawahara.
Joint learning of dialog act segmentation and recognition in spoken
dialog using neural networks.
In Proc. IJCNLP, pp.704--712, 2017.
(PDF file)
-
T.Kawahara.
Modeling difficulties of second language learners using speech
technology.
In Proc. Seoul International Conference on Speech Sciences
(SICSS), p. 11 (keynote speech), 2017.
(PDF file)
-
D.Lala, K.Inoue, P.Milhorat, and T.Kawahara.
Detection of social signals for recognizing engagement in human-robot
interaction.
In Proc. AAAI Fall Sympo. Natural Communication for Human-Robot
Collaboration, 2017.
(PDF file)
-
R.Nishikimi, E.Nakamura, M.Goto, K.Itoyama, and K.Yoshii.
Scale- and rhythm-aware musical note estimation for vocal F0
trajectories based on a semi-tatum-synchronous hierarchical hidden
semi-Markov model.
In Proc. ISMIR, 2017.
(PDF file)
-
H.Tsushima, E.Nakamura, K.Itoyama, and K.Yoshii.
Function- and rhythm-aware melody harmonization based on
tree-structured parsing and split-merge sampling of chord sequences.
In Proc. ISMIR, 2017.
(PDF file)
-
K.Yoshii, E.Nakamura, K.Itoyama, and M.Goto.
Infinite probabilistic latent component analysis for audio source
separation.
In Proc. IEEE Workshop Machine Learning for Signal Processing
(MLSP), 2017.
(PDF file)
-
M.Wake, Y.Bando, M.Mimura, K.Itoyama, K.Yoshii, and T.Kawahara.
Semi-blind speech enhancement based on recurrent neural network for
source separation and dereverberation.
In Proc. IEEE Workshop Machine Learning for Signal Processing
(MLSP), 2017.
(PDF file)
-
M.Mirzaei, K.Meshgi, and T.Kawahara.
Detecting listening difficulty for second language learners using
automatic speech recognition errors.
In Proc. Workshop Speech & Language Technology for Education
(SLaTE), pp.164--168, 2017.
(PDF file)
-
R.Duan, T.Kawahara, M.Dantsuji, and H.Nanjo.
Transfer learning based non-native acoustic modeling for
pronunciation error detection.
In Proc. Workshop Speech & Language Technology for Education
(SLaTE), pp.50--54, 2017.
(PDF file)
-
M.Mirzaei, K.Meshgi, and T.Kawahara.
Listening difficulty detection to foster second language listening
with the partial and synchronized caption system.
In Proc. EUROCALL, pp.211--216, 2017.
(PDF file)
-
M.Mimura, Y.Bando, K.Shimada, S.Sakai, K.Yoshii, and T.Kawahara.
Combined multi-channel NMF-based robust beamforming for noisy
speech recognition.
In Proc. INTERSPEECH, pp.2451--2455, 2017.
(PDF file)
-
S.Nakamura, R.Nakanishi, K.Takanashi, and T.Kawahara.
Analysis of the relationship between prosodic features of fillers and
its forms or occurrence positions.
In Proc. INTERSPEECH, pp.1726--1730, 2017.
(PDF file)
-
H.Inaguma, K.Inoue, M.Mimura, and T.Kawahara.
Social signal detection in spontaneous dialogue using bidirectional
LSTM-CTC.
In Proc. INTERSPEECH, pp.1691--1695, 2017.
(PDF file)
-
D.Lala, P.Milhorat, K.Inoue, M.Ishida, K.Takanashi, and T.Kawahara.
Attentive listening system with backchanneling, response generation
and flexible turn-taking.
In Proc. SIGdial Meeting Discourse & Dialogue, pp.127--136,
2017.
(PDF file)
-
Y.Ojima, T.Nakano, S.Fukayama, J.Kato, M.Goto, K.Itoyama, and K.Yoshii.
A Singing Instrument for Real-Time Vocal-Part Arrangement of Music Audio Signals.
In Proc. Sound and Music Computing Conference (SMC), pp.443--449, 2017.
(PDF file)
-
Y.Wada, Y.Bando, E.Nakamura, K.Itoyama, and K.Yoshii.
An adaptive karaoke system that plays accompaniment parts of music
audio signals synchronously with users' singing voices.
In Proc. Sound and Music Computing Conference (SMC), pp.110--116, 2017.
(PDF file)
-
P.Milhorat, D.Lala, K.Inoue, Z.Tianyu, M.Ishida, K.Takanashi, S.Nakamura, and
T.Kawahara.
A conversational dialogue manager for the humanoid robot ERICA.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2017.
(PDF file)
-
R.Duan, T.Kawahara, M.Dantsuji, and J.Zhang.
Effective articulatory modeling for pronunciation error detection of
L2 learner without non-native training data.
In Proc. IEEE-ICASSP, pp.5815--5819, 2017.
(PDF file)
-
S.Li, X.Lu, S.Sakai, M.Mimura, and T.Kawahara.
Semi-supervised ensemble DNN acoustic model training.
In Proc. IEEE-ICASSP, pp.5270--5274, 2017.
(PDF file)
-
K.Itakura, Y.Bando, E.Nakamura, K.Itoyama, K.Yoshii, and T.Kawahara.
Bayesian multichannel nonnegative matrix factorization for audio
source separation and localization.
In Proc. IEEE-ICASSP, pp.551--555, 2017.
(PDF file)
FY 2016
-
D.Lala, Y.Li, and T.Kawahara.
Utterance behavior of users while playing basketball with a virtual
teammate.
In Proc. ICAART, pp.28--38, 2017.
(PDF file)
-
R.Duan, T.Kawahara, M.Dantsuji, and J.Zhang.
Multi-lingual and multi-task DNN learning for articulatory error
detection.
In Proc. APSIPA ASC, 2016.
(PDF file)
-
M.Mirzaei, K.Meshgi, and T.Kawahara.
ASR errors as predictor of L2 listening difficulties and PSC
enhancement.
In Proc. Coling Workshop on Computational Linguistics for
Linguistic Complexity (CL4LC), pp.192--201, 2016.
(PDF file)
-
K.Inoue, D.Lala, S.Nakamura, K.Takanashi, and T.Kawahara.
Annotation and analysis of listener's engagement based on multi-modal
behaviors.
In Proc. ICMI Workshop on Multimodal Analyses enabling
Artificial Agents in Human-Machine Interaction (MA3HMI), 2016.
(PDF file)
-
H.Inaguma, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
Prediction of ice-breaking between participants using prosodic
features in the first meeting dialogue.
In Proc. ICMI Workshop on Advancements in Social Signal
Processing for Multimodal Interaction (ASSP4MI), 2016.
(PDF file)
-
D.Lala, P.Milhorat, K.Inoue, T.Zhao, and T.Kawahara.
Multimodal interaction with the autonomous android ERICA.
In Proc. ICMI, Vol.Demo. Paper, pp.417--418, 2016.
(PDF file)
-
Y.Bando, H.Suhara, M.Tanaka, T.Kamegawa, K.Itoyama, K.Yoshii,
F.Matsuno, and H.G.Okuno.
Sound-based online localization for an in-pipe snake robot.
In Proc. IEEE Int'l Symp. Safety, Security, and Rescue Robotics
(SSRR), 2016.
(PDF file)
-
R.Duan, T.Kawahara, M.Dantsuji, and J.Zhang.
Pronunciation error detection using DNN articulatory model based
on multi-lingual and multi-task learning.
In Proc. Int'l Sympo. Chinese Spoken Language Processing
(ISCSLP), 2016.
(PDF file)
-
S.Li, X.Lu, S.Mori, Y.Akita, and T.Kawahara.
Confidence estimation for speech recognition systems using
conditional random fields trained with partially annotated data.
In Proc. Int'l Sympo. Chinese Spoken Language Processing
(ISCSLP), 2016.
(PDF file)
-
K.Sekiguchi, Y.Bando, K.Nakamura, K.Nakadai, K.Itoyama, and K.Yoshii.
Online Simultaneous Localization and Mapping of Multiple Sound
Sources and Asynchronous Microphone Arrays.
In Proc. IEEE/RSJ IROS, pp.1973--1979, 2016.
(PDF file)
-
K.Kitamura, Y.Bando, K.Itoyama, and K.Yoshii.
Student's t Multichannel Nonnegative Matrix Factorization
for Blind Source Separation.
In Proc. IEEE Int'l Workshop Acoustic Signal Enhancement (IWAENC),
2016.
(PDF file)
-
D.Lala and T.Kawahara.
Managing dialog and joint actions for virtual basketball teammates.
In Proc. IVA, Vol.Poster, 2016.
(PDF file)
-
K.Inoue, P.Milhorat, D.Lala, T.Zhao, and T.Kawahara.
Talking with ERICA, an autonomous android.
In Proc. SIGdial Meeting Discourse & Dialogue, Vol.Demo.
Paper, pp.212--215, 2016.
(PDF file)
-
M.Mimura, S.Sakai, and T.Kawahara.
Joint optimization of denoising autoencoder and DNN acoustic model
based on multi-target learning for noisy speech recognition.
In Proc. INTERSPEECH, pp.3803--3807, 2016.
(PDF file)
-
T.Kawahara, T.Yamaguchi, K.Inoue, K.Takanashi, and N.Ward.
Prediction and generation of backchannel form for attentive listening
systems.
In Proc. INTERSPEECH, pp.2890--2894, 2016.
(PDF file)
-
E.Nakamura, K.Yoshii, and S.Sagayama.
Rhythm Transcription of MIDI Performances Based on a Merged-Output
HMM for Multiple Voices.
In Proc. Sound and Music Computing Conference (SMC), pp.338--343, 2016.
(PDF file)
-
K.Itakura, Y.Bando, E.Nakamura, K.Itoyama, and K.Yoshii.
A Unified Bayesian Model of Time-Frequency Clustering and Low-Rank
Approximation for Multi-Channel Source Separation.
In Proc. EUSIPCO, pp.2280--2284, 2016.
(PDF file)
-
E.Nakamura, K.Itoyama, and K.Yoshii.
Rhythm Transcription of MIDI Performances Based on Hierarchical
Bayesian Modelling of Repetition and Modification of Musical Note
Patterns.
In Proc. EUSIPCO, pp.1946--1950, 2016.
(PDF file)
-
Y.Bando, K.Itoyama, M.Konyo, S.Tadokoro, K.Nakadai, K.Yoshii, and
H.G.Okuno.
Variational Bayesian Multi-Channel Robust NMF for Human-Voice
Enhancement with a Deformable and Partially-Occluded Microphone
Array.
In Proc. EUSIPCO, pp.1018--1022, 2016.
(PDF file)
-
D.F.Glas, T.Minato, C.T.Ishi, T.Kawahara, and H.Ishiguro.
ERICA: The ERATO Intelligent Conversational Android.
In Proc. RO-MAN, pp.22--29, 2016.
(PDF file)
-
M.Mirzaei, K.Meshgi, and T.Kawahara.
Leveraging automatic speech recognition errors to detect challenging
speech segments in TED talks.
In Proc. EUROCALL, pp.313--318, 2016.
(PDF file)
-
R.Nishikimi, E.Nakamura, K.Itoyama, and K.Yoshii.
Musical Note Estimation for F0 Trajectories of Singing Voices Based on a Bayesian Semi-Beat-Synchronous HMM.
In Proc. ISMIR, pp.461--467, 2016.
(PDF file)
-
Y.Ojima, E.Nakamura, K.Itoyama, and K.Yoshii.
A Hierarchical Bayesian Model of Chords, Pitches, and Spectrograms for Multipitch Analysis.
In Proc. ISMIR, pp.309--315, 2016.
(PDF file)
-
N.Ward, Y.Li, T.Zhao, and T.Kawahara.
Interactional and pragmatics-related prosodic patterns in Mandarin
dialog.
In Proc. Int'l Conf. Speech Prosody, 2016.
(PDF file)
-
S.Li, Y.Akita, and T.Kawahara.
Data selection from multiple ASR systems' hypotheses for
unsupervised acoustic model training.
In Proc. IEEE-ICASSP, pp.5875--5879, 2016.
(PDF file)
-
E.Nakamura, M.Hamanaka, K.Hirata, and K.Yoshii.
Tree-structured probabilistic model of monophonic written music
based on the generative theory of tonal music.
In Proc. IEEE-ICASSP, pp.276--280, 2016.
(PDF file)
-
K.Yoshii, K.Itoyama, and M.Goto.
Student's t nonnegative matrix factorization and positive
semidefinite tensor factorization for single-channel audio source
separation.
In Proc. IEEE-ICASSP, pp.51--55, 2016.
(PDF file)
FY 2015
-
T.Yamaguchi, K.Inoue, K.Yoshino, K.Takanashi, N.Ward, and T.Kawahara.
Analysis and prediction of morphological patterns of backchannels for
attentive listening agents.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2016.
(PDF file)
-
T.Kawahara, T.Yamaguchi, M.Uesato, K.Yoshino, and K.Takanashi.
Synchrony in prosodic and linguistic features between backchannels
and preceding utterances in attentive listening.
In Proc. APSIPA ASC, pp.392--395, 2015.
(PDF file)
-
Y.Akita, N.Kuwahara, and T.Kawahara.
Automatic classification of usability of ASR result for real-time
captioning of lectures.
In Proc. APSIPA ASC, pp.19--22, 2015.
(PDF file)
-
K.Yoshii, K.Itoyama, and M.Goto.
Infinite Superimposed Discrete All-pole Modeling for Source-Filter Decomposition of Wavelet Spectrograms.
In Proc. ISMIR, pp.86--92, 2015.
(PDF file)
-
Y.Bando, K.Itoyama, M.Konyo, S.Tadokoro, K.Nakadai, K.Yoshii, and
H.G.Okuno.
Human-Voice Enhancement Based on Online RPCA for a Hose-Shaped
Rescue Robot with a Microphone Array.
In Proc. IEEE Int'l Symp. Safety, Security, and Rescue Robotics
(SSRR), 2015.
(PDF file)
-
K.Youssef, K.Itoyama, and K.Yoshii.
Identification and Localization of One or Two Concurrent Speakers in a Binaural Robotic Context.
In Proc. IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2015.
-
Y.Bando, K.Itoyama, M.Konyo, S.Tadokoro, K.Nakadai, K.Yoshii, and H.G.Okuno.
Microphone-Accelerometer Based 3D Posture Estimation for a Hose-shaped Rescue Robot.
In Proc. IEEE/RSJ IROS, pp.5580--5586, 2015.
(PDF file)
-
M.Ohkita, Y.Bando, Y.Ikemiya, K.Itoyama, and K.Yoshii.
Audio-Visual Beat Tracking Based on a State-Space Model for a Dancing Robot Playing with Humans.
In Proc. IEEE/RSJ IROS, pp.5555--5560, 2015.
(PDF file)
-
K.Sekiguchi, Y.Bando, K.Itoyama, and K.Yoshii.
Optimizing the Layout of Multiple Mobile Robots for Cooperative Sound Source Separation.
In Proc. IEEE/RSJ IROS, pp.5548--5554, 2015.
(PDF file)
-
S.Li, Y.Akita, and T.Kawahara.
Discriminative data selection for lightly supervised training of
acoustic model using closed caption texts.
In Proc. INTERSPEECH, pp.3526--3530, 2015.
(PDF file)
-
K.Inoue, Y.Wakabayashi, H.Yoshimoto, K.Takanashi, and T.Kawahara.
Enhanced speaker diarization with detection of backchannels using
eye-gaze information in poster conversations.
In Proc. INTERSPEECH, pp.3086--3090, 2015.
(PDF file)
-
S.Li, X.Lu, Y.Akita, and T.Kawahara.
Ensemble speaker modeling using speaker adaptive training deep neural
network for speaker adaptation.
In Proc. INTERSPEECH, pp.2892--2896, 2015.
(PDF file)
-
M.Mimura, S.Sakai, and T.Kawahara.
Speech dereverberation using long short-term memory.
In Proc. INTERSPEECH, pp.2435--2439, 2015.
(PDF file)
-
K.Itakura, I.Nishimuta, Y.Bando, K.Itoyama, and K.Yoshii.
Bayesian Integration of Sound Source Separation and Speech Recognition: A New Approach to Simultaneous Speech Recognition.
In Proc. INTERSPEECH, pp.736--740, 2015.
(PDF file)
-
M.Mirzaei and T.Kawahara.
ASR technology to empower partial and synchronized caption for L2
listening development.
In Proc. Workshop Speech & Language Technology for Education
(SLaTE), pp.65--70, 2015.
(PDF file)
-
M.Mirzaei, K.Meshgi, Y.Akita, and T.Kawahara.
Errors in automatic speech recognition versus difficulties in second
language listening.
In Proc. EUROCALL, pp.410--415, 2015.
(PDF file)
-
A.Dobashi, Y.Ikemiya, K.Itoyama, and K.Yoshii.
A Music Performance Assistance System based on Vocal, Harmonic, and Percussive Source Separation and Content Visualization for Music Audio Signals.
In Proc. Sound and Music Computing Conference (SMC), pp.99--104, 2015.
(PDF file)
-
T.Fukuda, Y.Ikemiya, K.Itoyama, and K.Yoshii.
A Score-Informed Piano Tutoring System with Mistake Detection and Score Simplification.
In Proc. Sound and Music Computing Conference (SMC 2015), pp.105--110, 2015.
(PDF file)
-
T.Sasada, S.Mori, T.Kawahara, and Y.Yamakata.
Named entity recognizer trainable from partially annotated data.
In Proc. PACLING, pp.10--17, 2015.
(PDF file)
-
Y.Akita, Y.Tong, and T.Kawahara.
Language model adaptation for academic lectures using character
recognition result of presentation slides.
In Proc. IEEE-ICASSP, pp.5431--5435, 2015.
(PDF file)
-
M.Mimura, S.Sakai, and T.Kawahara.
Deep autoencoders augmented with phone-class feature for reverberant
speech recognition.
In Proc. IEEE-ICASSP, pp.4365--4369, 2015.
(PDF file)
-
Y.Bando, T.Otsuka, K.Itoyama, K.Yoshii, Y.Sasaki, S.Kagami, and H.G.Okuno.
Challenges in Deploying A Microphone Array to Localize and Separate Sound Sources in Real Auditory Scenes.
In Proc. IEEE-ICASSP, pp.723--727, 2015.
(PDF file)
-
Y.Ikemiya, K.Itoyama, and K.Yoshii.
Singing Voice Analysis and Editing based on Mutually Dependent F0 Estimation and Source Separation.
In Proc. IEEE-ICASSP, pp.574--578, 2015.
(PDF file)
-
S.Maruo, K.Yoshii, K.Itoyama, M.Mauch, and M.Goto.
A Feedback Framework for Improved Chord Recognition Based on NMF-based Approximate Note Transcription.
In Proc. IEEE-ICASSP, pp.196--200, 2015.
(PDF file)
FY 2014
-
Y.Bando, T.Otsuka, I.Aihara, H.Awano, K.Itoyama, K.Yoshii, and H.G.Okuno.
Recognition of In-field Frog Chorusing using Bayesian Nonparametric Microphone Array Processing.
In Proc. AAAI-2015 Workshop on Computational Sustainability, 2015.
(PDF file)
-
T.Kawahara, M.Uesato, K.Yoshino, and K.Takanashi.
Toward adaptive generation of backchannels for attentive listening
agents.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2015.
(PDF file)
-
K.Yoshino and T.Kawahara.
News navigation system based on proactive dialogue strategy.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2015.
(PDF file)
-
I.Nishimuta, K.Yoshii, K.Itoyama, and H.G.Okuno.
Development of a Robot Quizmaster with Auditory Functions for Speech-based Multiparty Interaction.
In Proc. IEEE/SICE Int'l Sympo. System Integration (SII 2014), pp.328--333, 2014.
(PDF file)
-
Y.Wakabayashi, K.Inoue, H.Yoshimoto, and T.Kawahara.
Speaker diarization based on audio-visual integration for smart
posterboard.
In Proc. APSIPA ASC, 2014.
(PDF file)
-
M.Mimura and T.Kawahara.
Unsupervised speaker adaptation of DNN-HMM by selecting similar
speakers for lecture transcription.
In Proc. APSIPA ASC, 2014.
(PDF file)
-
M.Mirzaei, Y.Akita, and T.Kawahara.
Partial and synchronized caption generation to develop second
language listening skill.
In ICCE Workshop on Natural Language Processing Techniques for
Educational Applications (NLP-TEA), pp.13--23, 2014.
(PDF file)
-
I.Nishimuta, N.Hirayama, K.Yoshii, K.Itoyama, and H.G.Okuno.
A Robot Quizmaster that can Localize, Separate, and Recognize Simultaneous Utterances for a Fastest-Voice-First Quiz Game.
In Proc. IEEE-RAS International Conference on Humanoid Robots (Humanoids 2014), pp.967--972, 2014.
(PDF file)
-
Y.Bando, K.Itoyama, S.Tadokoro, M.Konyo, K.Nakadai, K.Yoshii, and H.G.Okuno.
A Sound-based Online Method for Estimating the Time-Varying Posture of a Hose-shaped Robot.
In Proc. Int'l Sympo. Safety, Security, and Rescue Robotics (SSRR-2014), pp.1--6, 2014.
(PDF file)
-
A.Maezawa, K.Itoyama, K.Yoshii, and H.G.Okuno.
Bayesian Audio Alignment Based on A Unified Generative Model of Music Composition and Performance.
In Proc. ISMIR, pp.233--238, 2014.
(PDF file)
-
Y.Ikemiya, K.Itoyama, and K.Yoshii.
Transferring Vocal Expressions of a Professional Singer to Unaccompanied Singing Signals.
In Proc. ISMIR, 2014.
(PDF file)
-
K.Sudoh, M.Nagata, S.Mori, and T.Kawahara.
Japanese-to-English patent translation system based on
domain-adapted word segmentation and post-ordering.
In Proc. Assoc. for Machine Translation in the Americas (AMTA),
Vol.1, pp.234--248, 2014.
(PDF file)
-
K.Inoue, Y.Wakabayashi, H.Yoshimoto, and T.Kawahara.
Speaker diarization using eye-gaze information in multi-party
conversations.
In Proc. INTERSPEECH, pp.562--566, 2014.
(PDF file)
-
S.Li, Y.Akita, and T.Kawahara.
Corpus and transcription system of Chinese Lecture Room.
In Proc. Int'l Sympo. Chinese Spoken Language Processing
(ISCSLP), pp.442--445, 2014.
(PDF file)
-
M.Mirzaei, Y.Akita, and T.Kawahara.
Partial and synchronized captioning: A new tool for second language
listening development.
In Proc. EUROCALL, pp.230--236, 2014.
(PDF file)
-
K.Yoshino and T.Kawahara.
Information navigation system based on POMDP that tracks user
focus.
In Proc. SIGdial Meeting Discourse & Dialogue, pp.32--40,
2014.
(PDF file)
-
M.Mimura, S.Sakai, and T.Kawahara.
Exploring deep neural networks and deep autoencoders in reverberant
speech recognition.
In Workshop on Hands-free Speech Communication & Microphone
Arrays (HSCMA), 2014.
(PDF file)
-
K.Yoshii, H.Fujihara, T.Nakano, and M.Goto.
Cultivating Vocal Activity Detection for Music Audio Signals in a Circulation-type Crowdsourcing Ecosystem.
In Proc. IEEE-ICASSP, pp.624--628, 2014.
(PDF file)
FY 2013
-
T.Kawahara.
Smart posterboard: Multi-modal sensing and analysis of poster
conversations.
In Proc. APSIPA ASC (plenary overview talk), 2013.
(PDF file)
-
K.Yoshino, S.Mori, and T.Kawahara.
Predicate argument structure analysis using partially annotated
corpora.
In Proc. IJCNLP, pp.957--961, 2013.
(PDF file)
-
T.Kawahara, S.Hayashi, and K.Takanashi.
Estimation of interest and comprehension level of audience through
multi-modal behaviors in poster conversations.
In Proc. INTERSPEECH, pp.1882--1885, 2013.
(PDF file)
-
K.Yoshino, S.Mori, and T.Kawahara.
Incorporating semantic information to selection of web texts for
language model of spoken dialogue system.
In Proc. IEEE-ICASSP, pp.8252--8256, 2013.
(PDF file)
FY 2012
-
K.Yoshino, S.Mori, and T.Kawahara.
Language modeling for spoken dialogue system based on filtering using
predicate-argument structures.
In Proc. COLING, pp.2993--3002, 2012.
(PDF file)
-
C.Lee and T.Kawahara.
Hybrid vector space model for flexible voice search.
In Proc. APSIPA ASC, 2012.
(PDF file)
-
K.Yoshino, S.Mori, and T.Kawahara.
Language modeling for spoken dialogue system based on sentence
transformation and filtering using predicate-argument structures.
In Proc. APSIPA ASC, 2012.
(PDF file)
-
Y.Akita, M.Watanabe, and T.Kawahara.
Automatic transcription of lecture speech using language model based
on speaking-style transformation of proceeding texts.
In Proc. INTERSPEECH, 2012.
(PDF file)
-
R.Gomez and T.Kawahara.
Dereverberation based on wavelet packet filtering for robust
automatic speech recognition.
In Proc. INTERSPEECH, 2012.
(PDF file)
-
T.Kawahara, T.Iwatate, and K.Takanashi.
Prediction of turn-taking by combining prosodic and eye-gaze
information in poster conversations.
In Proc. INTERSPEECH, 2012.
(PDF file)
-
T.Kawahara, T.Iwatate, T.Tsuchiya, and K.Takanashi.
Can we predict who in the audience will ask what kind of questions
with their feedback behaviors in poster conversation?
In Proc. Interdisciplinary Workshop on Feedback Behaviors in
Dialog, pp.35--38, 2012.
(PDF file)
-
T.Kawahara.
Transcription system using automatic speech recognition for the
Japanese Parliament (Diet).
In Proc. AAAI/IAAI, pp.2224--2228, 2012.
(PDF file)
-
G.Neubig, T.Watanabe, S.Mori, and T.Kawahara.
Machine translation without words through substring alignment.
In Proc. ACL, pp.165--174, 2012.
(PDF file)
-
T.Kawahara.
Multi-modal sensing and analysis of poster conversations toward smart
posterboard.
In Proc. SIGdial Meeting Discourse & Dialogue, pp.1--9
(keynote speech), 2012.
(PDF file)
-
M.Ablimit, T.Kawahara, and A.Hamdulla.
Discriminative approach to lexical entry selection for automatic
speech recognition of agglutinative language.
In Proc. IEEE-ICASSP, pp.5009--5012, 2012.
(PDF file)
FY 2011
-
M.Ablimit, A.Hamdulla, and T.Kawahara.
Morpheme concatenation approach in language modeling for
large-vocabulary Uyghur speech recognition.
In Proc. Oriental-COCOSDA Workshop, 2011.
(PDF file)
-
R.Gomez and T.Kawahara.
Optimized wavelet-based speech enhancement for speech recognition in
noisy and reverberant conditions.
In Proc. APSIPA ASC, 2011.
(PDF file)
-
M.Mimura and T.Kawahara.
Fast speaker normalization and adaptation based on BIC for meeting
speech recognition.
In Proc. APSIPA ASC, 2011.
(PDF file)
-
M.Ablimit, T.Kawahara, and A.Hamdulla.
Lexicon optimization for automatic speech recognition based on
discriminative learning.
In Proc. APSIPA ASC, 2011.
(PDF file)
-
H.Wang, T.Kawahara, and Y.Wang.
Improving non-native speech recognition performance by discriminative
training for language model in a CALL system.
In Proc. APSIPA ASC, 2011.
(PDF file)
-
T.Hirayama, Y.Sumi, T.Kawahara, and T.Matsuyama.
Info-concierge: Proactive multi-modal interaction through mind
probing.
In Proc. APSIPA ASC, 2011.
(PDF file)
-
C.Lee, T.Kawahara, and A.Rudnicky.
Combining slot-based vector space model for voice book search.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), pp.
27--35, 2011.
(PDF file)
-
Y.Akita and T.Kawahara.
Automatic comma insertion of lecture transcripts based on multiple
annotations.
In Proc. INTERSPEECH, pp.2889--2892, 2011.
(PDF file)
-
R.Gomez and T.Kawahara.
Denoising using optimized wavelet filtering for automatic speech
recognition.
In Proc. INTERSPEECH, pp.1673--1676, 2011.
(PDF file)
-
G.Neubig, T.Watanabe, E.Sumita, S.Mori, and T.Kawahara.
An unsupervised model for joint phrase alignment and extraction.
In Proc. ACL-HLT, pp.632--641, 2011.
(PDF file)
-
K.Yoshino, S.Mori, and T.Kawahara.
Spoken dialogue system based on information extraction using
similarity of predicate argument structures.
In Proc. SIGdial Meeting Discourse & Dialogue, pp.59--66,
2011.
(PDF file)
FY 2010
-
T.Kawahara, H.Wang, Y.Tsubota, and M.Dantsuji.
English and Japanese CALL systems developed at Kyoto University.
In Proc. APSIPA ASC, pp.804--810, 2010.
(PDF file)
-
R.Gomez and T.Kawahara.
Optimizing wavelet parameters for dereverberation in automatic speech
recognition.
In Proc. APSIPA ASC, pp.446--449, 2010.
(PDF file)
-
T.Kawahara.
Automatic transcription of parliamentary meetings and classroom
lectures -- a sustainable approach and real system evaluations --.
In Proc. Int'l Sympo. Chinese Spoken Language Processing
(ISCSLP), pp.1--6 (keynote speech), 2010.
(PDF file)
-
M.Ablimit, G.Neubig, M.Mimura, S.Mori, T.Kawahara, and A.Hamdulla.
Uyghur morpheme-based language models and ASR.
In Proc. Int'l Conf. Signal Processing, pp.581--584, 2010.
(PDF file)
-
K.Yoshino and T.Kawahara.
Spoken dialogue system based on information extraction from web text.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), Vol.
Demo. Paper, pp.196--197, 2010.
(PDF file)
-
T.Kawahara, K.Sumi, Z.Q.Chang, and K.Takanashi.
Detection of hot spots in poster conversations based on reactive
tokens of audience.
In Proc. INTERSPEECH, pp.3042--3045, 2010.
(PDF file)
-
G.Neubig, M.Mimura, S.Mori, and T.Kawahara.
Learning a language model from continuous speech.
In Proc. INTERSPEECH, pp.1053--1056, 2010.
(PDF file)
-
Y.Itoh, H.Nishizaki, X.Hu, H.Nanjo, T.Akiba, T.Kawahara, S.Nakagawa, T.Matsui,
Y.Yamashita, and K.Aikawa.
Constructing Japanese test collections for spoken term detection.
In Proc. INTERSPEECH, pp.677--680, 2010.
(PDF file)
-
T.Kawahara, N.Katsumaru, Y.Akita, and S.Mori.
Classroom note-taking system for hearing impaired students using
automatic speech recognition adapted to lectures.
In Proc. INTERSPEECH, pp.626--629, 2010.
(PDF file)
-
R.Gomez and T.Kawahara.
An improved wavelet-based dereverberation for robust automatic speech
recognition.
In Proc. INTERSPEECH, pp.578--581, 2010.
(PDF file)
-
Y.Akita, M.Mimura, G.Neubig, and T.Kawahara.
Semi-automated update of automatic transcription system for the
Japanese national congress.
In Proc. INTERSPEECH, pp.338--341, 2010.
(PDF file)
-
T.Kawahara, Z.Q.Chang, and K.Takanashi.
Analysis on prosodic features of Japanese reactive tokens in poster
conversations.
In Proc. Int'l Conf. Speech Prosody, 2010.
(PDF file)
-
G.Neubig, Y.Akita, S.Mori, and T.Kawahara.
Improved statistical models for SMT-based speaking style
transformation.
In Proc. IEEE-ICASSP, pp.5206--5209, 2010.
(PDF file)
-
R.Gomez and T.Kawahara.
Optimizing spectral subtraction and Wiener filtering for robust
speech recognition in reverberant and noisy conditions.
In Proc. IEEE-ICASSP, pp.4566--4569, 2010.
(PDF file)
-
D.Cournapeau, S.Watanabe, A.Nakamura, and T.Kawahara.
Using online model comparison in the Variational Bayes framework
for online unsupervised voice activity detection.
In Proc. IEEE-ICASSP, pp.4462--4465, 2010.
(PDF file)
FY 2009
-
T.Kawahara.
New perspectives on spoken language understanding: Does machine need
to fully understand speech?
In Proc. IEEE Workshop Automatic Speech Recognition &
Understanding (ASRU), pp.46--50 (invited paper), 2009.
(PDF file)
-
T.Misu, K.Sugiura, T.Kawahara, K.Ohtake, C.Hori, H.Kashioka, and S.Nakamura.
Online learning of Bayes risk-based optimization of dialogue
management.
In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2009.
(PDF file)
-
R.Gomez and T.Kawahara.
Tight integration of dereverberation and automatic speech
recognition.
In Proc. APSIPA ASC, pp.639--643, 2009.
(PDF file)
-
T.Akiba, K.Aikawa, Y.Itoh, T.Kawahara, H.Nanjo, H.Nishizaki, N.Yasuda,
Y.Yamashita, and K.Itou.
Developing an SDR test collection from Japanese lecture audio
data.
In Proc. APSIPA ASC, pp.324--330, 2009.
(PDF file)
-
K.Katsurada, A.Lee, T.Kawahara, T.Yotsukura, S.Morishima, T.Nishimoto,
Y.Yamashita, and T.Nitta.
Development of a toolkit for spoken dialog systems with an
anthropomorphic agent: Galatea.
In Proc. APSIPA ASC, pp.148--153, 2009.
(PDF file)
-
A.Lee and T.Kawahara.
Recent development of open-source speech recognition engine Julius.
In Proc. APSIPA ASC, pp.131--137, 2009.
(PDF file)
-
G.Neubig, S.Mori, and T.Kawahara.
A WFST-based log-linear framework for speaking-style
transformation.
In Proc. INTERSPEECH, pp.1495--1498, 2009.
(PDF file)
-
R.Gomez and T.Kawahara.
Optimization of dereverberation parameters based on likelihood of
speech recognizer.
In Proc. INTERSPEECH, pp.1223--1226, 2009.
(PDF file)
-
K.Sumi, T.Kawahara, J.Ogata, and M.Goto.
Acoustic event detection for spotting hot spots in podcasts.
In Proc. INTERSPEECH, pp.1143--1146, 2009.
(PDF file)
-
Y.Akita, M.Mimura, and T.Kawahara.
Automatic transcription system for meetings of the Japanese
national congress.
In Proc. INTERSPEECH, pp.84--87, 2009.
(PDF file)
-
K.Komatani, T.Kawahara, and H.G.Okuno.
A model of temporally changing user behaviors in a deployed spoken
dialogue system.
In Proc. Int'l Conf. User Modeling, Adaptation, and
Personalization (UMAP) (LNCS 5535), pp.409--414, 2009.
(PDF file)
-
T.Kawahara, M.Mimura, and Y.Akita.
Language model transformation applied to lightly supervised training
of acoustic model for congress meetings.
In Proc. IEEE-ICASSP, pp.3853--3856, 2009.
(PDF file)
FY 2008
-
M.Ablimit, M.Eli, and T.Kawahara.
Partly supervised Uighur morpheme segmentation.
In Proc. Oriental-COCOSDA Workshop, pp.71--76, 2008.
(PDF file)
-
T.Shinozaki, S.Furui, and T.Kawahara.
Aggregated cross-validation and its efficient application to
Gaussian mixture optimization.
In Proc. INTERSPEECH, pp.2382--2385, 2008.
(PDF file)
-
T.Sasada, S.Mori, and T.Kawahara.
Extracting word-pronunciation pairs from comparable set of text and
speech.
In Proc. INTERSPEECH, pp.1821--1824, 2008.
(PDF file)
-
H.Wang and T.Kawahara.
A Japanese CALL system based on dynamic question generation and
error prediction for ASR.
In Proc. INTERSPEECH, pp.1737--1740, 2008.
(PDF file)
-
T.Kawahara, M.Toyokura, T.Misu, and C.Hori.
Detection of feeling through back-channels in spoken dialogue.
In Proc. INTERSPEECH, p. 1696, 2008.
(PDF file)
-
T.Kawahara, H.Setoguchi, K.Takanashi, K.Ishizuka, and S.Araki.
Multi-modal recording, analysis and indexing of poster sessions.
In Proc. INTERSPEECH, pp.1622--1625, 2008.
(PDF file)
-
K.Komatani, T.Kawahara, and H.G.Okuno.
Predicting ASR errors by exploiting barge-in rate of individual
users for spoken dialogue systems.
In Proc. INTERSPEECH, pp.183--186, 2008.
(PDF file)
-
K.Ishizuka, S.Araki, and T.Kawahara.
Statistical speech activity detection based on spatial power
distribution for analyses of poster presentations.
In Proc. INTERSPEECH, pp.99--102, 2008.
(PDF file)
-
T.Misu and T.Kawahara.
Bayes risk-based dialogue management for document retrieval system
with speech interface.
In Proc. COLING, Vol.Posters & Demo., pp.59--62, 2008.
(PDF file)
-
H.Wang and T.Kawahara.
Effective error prediction using decision tree for ASR grammar
network in CALL system.
In Proc. IEEE-ICASSP, pp.5069--5072, 2008.
(PDF file)
-
T.Kawahara, Y.Nemoto, and Y.Akita.
Automatic lecture transcription by exploiting presentation slide
information for language model adaptation.
In Proc. IEEE-ICASSP, pp.4929--4932, 2008.
(PDF file)
-
S.Sakai, T.Kawahara, and S.Nakamura.
Admissible stopping in Viterbi beam search for unit selection in
concatenative speech synthesis.
In Proc. IEEE-ICASSP, pp.4613--4616, 2008.
(PDF file)
-
D.Cournapeau and T.Kawahara.
Using Variational Bayes Free Energy for unsupervised voice activity
detection.
In Proc. IEEE-ICASSP, pp.4429--4432, 2008.
(PDF file)
-
T.Shinozaki and T.Kawahara.
GMM and HMM training by aggregated EM algorithm with increased
ensemble sizes for robust parameter estimation.
In Proc. IEEE-ICASSP, pp.4405--4408, 2008.
(PDF file)
FY 2007
-
T.Shinozaki and T.Kawahara.
HMM training based on CV-EM and CV Gaussian mixture
optimization.
In Proc. IEEE Workshop Automatic Speech Recognition &
Understanding (ASRU), pp.318--322, 2007.
(PDF file)
-
H.Setoguchi, K.Takanashi, and T.Kawahara.
Multi-modal conversational analysis of poster presentations using
multiple sensors.
In Proc. ICMI Workshop on Tagging, Mining and Retrieval of Human
Related Activity Information, pp.44--47, 2007.
(PDF file)
-
D.Cournapeau and T.Kawahara.
Evaluation of real-time voice activity detection based on high order
statistics.
In Proc. INTERSPEECH, pp.2945--2948, 2007.
(PDF file)
-
T.Misu and T.Kawahara.
Bayes risk-based optimization of dialogue management for document
retrieval system with speech interface.
In Proc. INTERSPEECH, pp.2705--2708, 2007.
(PDF file)
-
C.Waple, H.Wang, T.Kawahara, Y.Tsubota, and M.Dantsuji.
Evaluating and optimizing Japanese tutor system featuring dynamic
question generation and interactive guidance.
In Proc. INTERSPEECH, pp.2177--2180, 2007.
(PDF file)
-
T.Shinozaki and T.Kawahara.
Gaussian mixture optimization for HMM based on efficient
cross-validation.
In Proc. INTERSPEECH, pp.2061--2064, 2007.
(PDF file)
-
Y.Akita, Y.Nemoto, and T.Kawahara.
PLSA-based topic detection in meetings for adaptation of lexicon
and language model.
In Proc. INTERSPEECH, pp.602--605, 2007.
(PDF file)
-
K.Komatani, T.Kawahara, and H.G.Okuno.
Analyzing temporal transition of real user's behaviors in a spoken
dialogue system.
In Proc. INTERSPEECH, pp.142--145, 2007.
(PDF file)
-
T.Misu and T.Kawahara.
An interactive framework for document retrieval and presentation with
question-answering function in restricted domain.
In Proc. Int'l Conf. Industrial, Engineering & Other
Applications of Applied Intelligent Systems (IEA/AIE) (LNAI 4570), pp.
126--134, 2007.
(PDF file)
-
T.Misu and T.Kawahara.
Speech-based interactive information guidance system using
question-answering technique.
In Proc. IEEE-ICASSP, Vol.4, pp.145--148, 2007.
(PDF file)
-
T.Kawahara, M.Saikou, and K.Takanashi.
Automatic detection of sentence and clause units using local
syntactic dependency.
In Proc. IEEE-ICASSP, Vol.4, pp.125--128, 2007.
(PDF file)
-
Y.Akita and T.Kawahara.
Topic-independent speaking-style transformation of language model for
spontaneous speech recognition.
In Proc. IEEE-ICASSP, Vol.4, pp.33--36, 2007.
(PDF file)
FY 2006
-
T.Kawahara.
Intelligent transcription system based on spontaneous speech
processing.
In Proc. Int'l Conference on Informatics Research for
Development of Knowledge Society Infrastructure, pp.19--26, 2007.
(PDF file)
-
Y.Kida and T.Kawahara.
Evaluation of voice activity detection by combining multiple features
with weight adaptation.
In Proc. INTERSPEECH, pp.1966--1969, 2006.
(PDF file)
-
S.Sakai and T.Kawahara.
Decision tree-based training of probabilistic concatenation models
for corpus-based speech synthesis.
In Proc. INTERSPEECH, pp.1746--1749, 2006.
(PDF file)
-
D.Cournapeau, T.Kawahara, K.Mase, and T.Toriyama.
Voice activity detector based on enhanced cumulant of LPC residual
and on-line EM algorithm.
In Proc. INTERSPEECH, pp.1201--1204, 2006.
(PDF file)
-
Y.Akita, M.Saikou, H.Nanjo, and T.Kawahara.
Sentence boundary detection of spontaneous Japanese using
statistical language model and support vector machines.
In Proc. INTERSPEECH, pp.1033--1036, 2006.
(PDF file)
-
C.Waple, Y.Tsubota, M.Dantsuji, and T.Kawahara.
Prototyping a CALL system for students of Japanese using dynamic
diagram generation and interactive hints.
In Proc. INTERSPEECH, pp.821--824, 2006.
(PDF file)
-
T.Misu and T.Kawahara.
A bootstrapping approach for developing language model of new spoken
dialogue systems by selecting web texts.
In Proc. INTERSPEECH, pp.9--12, 2006.
(PDF file)
-
R.Hamabe, K.Uchimoto, T.Kawahara, and H.Isahara.
Detection of quotations and inserted clauses and its application to
dependency structure analysis in spontaneous Japanese.
In Proc. COLING-ACL, Vol.Poster Sessions, pp.324--330, 2006.
(PDF file)
-
Y.Akita, C.Troncoso, and T.Kawahara.
Automatic transcription of meetings using topic-oriented language
model adaptation.
In Proc. Western Pacific Acoustics Conference (WESPAC), 2006.
(PDF file)
-
H.Nanjo, Y.Akita, and T.Kawahara.
Computer assisted speech transcription system for efficient speech
archive.
In Proc. Western Pacific Acoustics Conference (WESPAC), 2006.
(PDF file)
-
Y.Akita and T.Kawahara.
Efficient estimation of language model statistics of spontaneous
speech via statistical transformation model.
In Proc. IEEE-ICASSP, Vol.1, pp.1049--1052, 2006.
(PDF file)
FY 2005
-
T.Misu and T.Kawahara.
Speech-based information retrieval system with clarification dialogue
strategy.
In Proc. Human Language Technology Conf. (HLT/EMNLP), pp.
1003--1010, 2005.
(PDF file)
-
Y.Kida and T.Kawahara.
Voice activity detection based on optimally weighted combination of
multiple features.
In Proc. INTERSPEECH, pp.2621--2624, 2005.
(PDF file)
-
C.Troncoso and T.Kawahara.
Trigger-based language model adaptation for automatic meeting
transcription.
In Proc. INTERSPEECH, pp.1297--1300, 2005.
(PDF file)
-
T.Misu and T.Kawahara.
Dialogue strategy to clarify user's queries for document retrieval
system with speech interface.
In Proc. INTERSPEECH, pp.637--640, 2005.
(PDF file)
-
H.Nanjo, T.Misu, and T.Kawahara.
Minimum Bayes-risk decoding considering word significance for
information retrieval system.
In Proc. INTERSPEECH, pp.561--564, 2005.
(PDF file)
-
I.R.Lane and T.Kawahara.
Utterance verification incorporating in-domain confidence and
discourse coherence measures.
In Proc. INTERSPEECH, pp.421--424, 2005.
(PDF file)
-
C.Troncoso, T.Kawahara, H.Yamamoto, and G.Kikui.
Trigger-based language model construction by combining different
corpora.
In Proc. Pacific Assoc. Computational Linguistics (PACLING),
pp.340--344, 2005.
(PDF file)
-
H.Nanjo and T.Kawahara.
A new ASR evaluation measure and minimum Bayes-risk decoding for
open-domain speech understanding.
In Proc. IEEE-ICASSP, Vol.1, pp.1053--1056, 2005.
(PDF file)
-
I.R.Lane and T.Kawahara.
Incorporating dialogue context and topic clustering in out-of-domain
detection.
In Proc. IEEE-ICASSP, Vol.1, pp.1045--1048, 2005.
(PDF file)
-
Y.Akita and T.Kawahara.
Generalized statistical modeling of pronunciation variations using
variable-length phone context.
In Proc. IEEE-ICASSP, Vol.1, pp.689--692, 2005.
(PDF file)
FY 2004
-
T.Kawahara, A.Lee, K.Takeda, K.Itou, and K.Shikano.
Recent progress of open-source LVCSR engine Julius and Japanese
model repository.
In Proc. INTERSPEECH, pp.3069--3072, 2004.
(PDF file)
-
K.Shitaoka, H.Nanjo, and T.Kawahara.
Automatic transformation of lecture transcription into document style
using statistical framework.
In Proc. INTERSPEECH, pp.2881--2884, 2004.
(PDF file)
-
S.Ueno, I.R.Lane, and T.Kawahara.
Example-based training of dialogue planning incorporating user and
situation models.
In Proc. INTERSPEECH, pp.2837--2840, 2004.
(PDF file)
-
I.R.Lane, T.Kawahara, T.Matsui, and S.Nakamura.
Topic classification and verification modeling for out-of-domain
utterance detection.
In Proc. INTERSPEECH, pp.2197--2200, 2004.
(PDF file)
-
T.Kitade, H.Nanjo, and T.Kawahara.
Automatic extraction of key sentences from oral presentations using
statistical measure based on discourse markers.
In Proc. INTERSPEECH, pp.2169--2172, 2004.
(PDF file)
-
Y.Tsubota, M.Dantsuji, and T.Kawahara.
Practical use of English pronunciation system for Japanese
students in the CALL classroom.
In Proc. INTERSPEECH, pp.1689--1692, 2004.
(PDF file)
-
Y.Akita and T.Kawahara.
Language model adaptation based on PLSA of topics and speakers.
In Proc. INTERSPEECH, pp.1045--1048, 2004.
(PDF file)
-
T.Misu, K.Komatani, and T.Kawahara.
Confirmation strategy for document retrieval systems with spoken
dialog interface.
In Proc. INTERSPEECH, pp.45--48, 2004.
(PDF file)
-
K.Komatani, T.Misu, T.Kawahara, and H.G.Okuno.
Efficient confirmation strategy for large-scale text retrieval
systems with spoken dialogue interface.
In Proc. COLING, pp.1100--1106, 2004.
(PDF file)
-
K.Shitaoka, K.Uchimoto, T.Kawahara, and H.Isahara.
Dependency structure analysis and sentence boundary detection in
spontaneous Japanese.
In Proc. COLING, pp.1107--1113, 2004.
(PDF file)
-
Y.Akita, M.Hasegawa, and T.Kawahara.
Automatic audio archiving system for panel discussions.
In Proc. IEEE Int'l Conf. Multimedia and Expo (ICME), 2004.
(PDF file)
-
Y.Tsubota, M.Dantsuji, and T.Kawahara.
Practical use of autonomous English pronunciation learning system
for Japanese students.
In Proc. InSTIL/ICALL -- NLP and Speech Technologies in Advanced
Language Learning Systems, pp.139--142, 2004.
(PDF file)
-
A.Lee, K.Shikano, and T.Kawahara.
Real-time word confidence scoring using local posterior probabilities
on tree trellis search.
In Proc. IEEE-ICASSP, Vol.1, pp.793--796, 2004.
(PDF file)
-
I.R.Lane, T.Kawahara, T.Matsui, and S.Nakamura.
Out-of-domain detection based on confidence measures from multiple
topic classification.
In Proc. IEEE-ICASSP, Vol.1, pp.757--760, 2004.
(PDF file)
-
H.Nanjo, T.Kitade, and T.Kawahara.
Automatic indexing of key sentences for lecture archives using
statistics of presumed discourse markers.
In Proc. IEEE-ICASSP, Vol.1, pp.449--452, 2004.
(PDF file)
-
M.Nishida and T.Kawahara.
Speaker indexing and adaptation using speaker clustering based on
statistical model selection.
In Proc. IEEE-ICASSP, Vol.1, pp.353--356, 2004.
(PDF file)
-
K.Komatani, R.Ito, T.Kawahara, and H.G.Okuno.
Recognition of emotional states in spoken dialogue with a robot.
In Proc. Int'l Conf. Industrial & Engineering Applications of
Artificial Intelligence & Expert Systems (IEA/AIE) (LNAI 3029), pp.
413--423, 2004.
(PDF file)
-
T.Kawahara.
Automatic speech transcription and archiving system using the Corpus
of Spontaneous Japanese.
In Proc. Int'l Congress Acoustics (ICA), pp.161--164, 2004.
(PDF file)
FY 2003
-
T.Kawahara.
Spoken language processing for audio archives of lectures and panel
discussions.
In Proc. Int'l Conference on Informatics Research for
Development of Knowledge Society Infrastructure, pp.23--30, 2004.
(PDF file)
-
T.Kawahara, T.Kitade, K.Shitaoka, and H.Nanjo.
Efficient access to lecture audio archives through spoken language
processing.
In Proc. Special Workshop in Maui (SWIM), 2004.
(PDF file)
-
T.Kawahara, K.Shitaoka, T.Kitade, and H.Nanjo.
Automatic indexing of key sentences for lecture archives.
In Proc. IEEE Workshop Automatic Speech Recognition &
Understanding (ASRU), 2003.
(PDF file)
-
Y.Tsubota, M.Dantsuji, and T.Kawahara.
An English pronunciation learning system for Japanese students
based on diagnosis of critical pronunciation errors.
In Proc. EUROCALL, p. 204, 2003.
(PDF file)
-
Y.Akita and T.Kawahara.
Unsupervised speaker indexing using anchor models and automatic
transcription of discussions.
In Proc. EUROSPEECH, pp.2985--2988, 2003.
(PDF file)
-
M.Nishida and T.Kawahara.
Speaker model selection using Bayesian information criterion for
speaker indexing and speaker adaptation.
In Proc. EUROSPEECH, pp.1849--1852, 2003.
(PDF file)
-
T.Kawahara, R.Ito, and K.Komatani.
Spoken dialogue system for queries on appliance manuals using
hierarchical confirmation strategy.
In Proc. EUROSPEECH, pp.1701--1704, 2003.
(PDF file)
-
K.Komatani, S.Ueno, T.Kawahara, and H.G.Okuno.
User modeling in spoken dialogue systems for flexible guidance
generation.
In Proc. EUROSPEECH, pp.745--748, 2003.
(PDF file)
-
I.R.Lane, T.Matsui, S.Nakamura, and T.Kawahara.
Hierarchical topic classification for dialog speech recognition based
on language model switching.
In Proc. EUROSPEECH, pp.429--432, 2003.
(PDF file)
-
K.Komatani, S.Ueno, T.Kawahara, and H.G.Okuno.
Flexible guidance generation using user model in spoken dialogue
systems.
In Proc. ACL, pp.256--263, 2003.
(PDF file)
-
Y.Kiyota, S.Kurohashi, T.Misu, K.Komatani, T.Kawahara, and F.Kido.
Dialog navigator: A spoken dialog Q-A system based on large text
knowledge base.
In Proc. ACL, Vol.Interactive Poster & Demo., pp.149--152,
2003.
(PDF file)
-
K.Komatani, F.Adachi, S.Ueno, T.Kawahara, and H.G.Okuno.
Flexible spoken dialogue system based on user models and dynamic
generation of VoiceXML scripts.
In Proc. SIGdial Meeting Discourse & Dialogue, pp.87--96,
2003.
(PDF file)
-
T.Kawahara, H.Nanjo, T.Shinozaki, and S.Furui.
Benchmark test for speech recognition using the Corpus of
Spontaneous Japanese.
In Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing
and Recognition, pp.135--138, 2003.
(PDF file)
-
H.Nanjo, K.Shitaoka, and T.Kawahara.
Automatic transformation of lecture transcription into document style
using statistical framework.
In Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing
and Recognition, pp.215--218, 2003.
(PDF file)
-
Y.Akita, M.Nishida, and T.Kawahara.
Automatic transcription of discussions using unsupervised speaker
indexing.
In Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing
and Recognition, pp.79--82, 2003.
(PDF file)
-
H.Nanjo and T.Kawahara.
Unsupervised language model adaptation for lecture speech
recognition.
In Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing
and Recognition, pp.75--78, 2003.
(PDF file)
-
I.R.Lane, T.Kawahara, and T.Matsui.
Language model switching based on topic detection for dialog speech
recognition.
In Proc. IEEE-ICASSP, Vol.1, pp.616--619, 2003.
(PDF file)
-
M.Nishida and T.Kawahara.
Unsupervised speaker indexing using speaker model selection based on
Bayesian information criterion.
In Proc. IEEE-ICASSP, Vol.1, pp.172--175, 2003.
(PDF file)
FY 2002
-
K.Okuda, T.Kawahara, and S.Nakamura.
Speaking rate compensation based on likelihood criterion in acoustic
model training and decoding.
In Proc. ICSLP, pp.2589--2592, 2002.
(PDF file)
-
Y.Tsubota, T.Kawahara, and M.Dantsuji.
Recognition and verification of English by Japanese students for
computer-assisted language learning system.
In Proc. ICSLP, pp.1205--1208, 2002.
(PDF file)
-
K.Imoto, Y.Tsubota, A.Raux, T.Kawahara, and M.Dantsuji.
Modeling and automatic detection of English sentence stress for
computer-assisted English prosody learning system.
In Proc. ICSLP, pp.749--752, 2002.
(PDF file)
-
A.Raux and T.Kawahara.
Automatic intelligibility assessment and diagnosis of critical
pronunciation errors for computer-assisted pronunciation learning.
In Proc. ICSLP, pp.737--740, 2002.
(PDF file)
-
Y.Yamakata, T.Kawahara, and H.G.Okuno.
Belief network based disambiguation of object reference in spoken
dialogue system for robot.
In Proc. ICSLP, pp.177--180, 2002.
(PDF file)
-
K.Komatani, T.Kawahara, R.Ito, and H.G.Okuno.
Efficient dialogue strategy to find users' intended items from
information query results.
In Proc. COLING, pp.481--487, 2002.
(PDF file)
-
Y.Yamakata, T.Kawahara, and H.G.Okuno.
Belief network based disambiguation of object reference in spoken
dialogue system for robot.
In Proc. ISCA Workshop on Multi-Modal Dialogue in Mobile
Environments, 2002.
(PDF file)
-
A.Lee, T.Kawahara, K.Takeda, M.Mimura, A.Yamada, A.Ito, K.Itou, and K.Shikano.
Continuous speech recognition consortium -- an open repository for
CSR tools and models --.
In Proc. Int'l Conf. Language Resources & Evaluation (LREC),
pp.1438--1441, 2002.
(PDF file)
-
T.Kawahara and M.Hasegawa.
Automatic indexing of lecture speech by extracting topic-independent
discourse markers.
In Proc. IEEE-ICASSP, pp.1--4, 2002.
(PDF file)
-
H.Nanjo and T.Kawahara.
Speaking-rate dependent decoding and adaptation for spontaneous
lecture speech recognition.
In Proc. IEEE-ICASSP, pp.725--728, 2002.
(PDF file)
FY 2001
-
A.Raux and T.Kawahara.
Optimizing computer-assisted pronunciation instruction by selecting
relevant training topics.
In InSTIL Advanced Workshop, 2002.
(PDF file)
-
Y.Tsubota, T.Kawahara, and M.Dantsuji.
CALL system for Japanese students of English using
pronunciation error prediction and formant structure estimation.
In InSTIL Advanced Workshop, 2002.
(PDF file)
-
T.Kawahara, H.Nanjo, and S.Furui.
Automatic transcription of spontaneous lecture speech.
In Proc. IEEE Workshop Automatic Speech Recognition &
Understanding (ASRU), 2001.
(PDF file)
-
H.Nanjo, K.Kato, and T.Kawahara.
Speaking rate dependent acoustic modeling for spontaneous lecture
speech recognition.
In Proc. EUROSPEECH, pp.2531--2534, 2001.
(PDF file)
-
A.Lee, T.Kawahara, and K.Shikano.
Julius -- an open source real-time large vocabulary recognition
engine.
In Proc. EUROSPEECH, pp.1691--1694, 2001.
(PDF file)
-
K.Komatani, K.Tanaka, H.Kashima, and T.Kawahara.
Domain-independent spoken dialogue platform using key-phrase spotting
based on combined language model.
In Proc. EUROSPEECH, pp.1319--1322, 2001.
(PDF file)
-
A.Lee, T.Kawahara, and K.Shikano.
Gaussian mixture selection using context-independent HMM.
In Proc. IEEE-ICASSP, pp.69--72, 2001.
(PDF file)
FY 2000
-
T.Kawahara, A.Lee, T.Kobayashi, K.Takeda, N.Minematsu, S.Sagayama, K.Itou,
A.Ito, M.Yamamoto, A.Yamada, T.Utsuro, and K.Shikano.
Free software toolkit for Japanese large vocabulary continuous
speech recognition.
In Proc. ICSLP, Vol.4, pp.476--479, 2000.
(PDF file)
-
Y.Tsubota, M.Dantsuji, and T.Kawahara.
Computer-assisted English vowel learning system for Japanese
speakers using cross language formant structures.
In Proc. ICSLP, Vol.3, pp.566--569, 2000.
(PDF file)
-
K.Imoto, M.Dantsuji, and T.Kawahara.
Modelling of the perception of English sentence stress for
computer-assisted language learning.
In Proc. ICSLP, Vol.3, pp.175--178, 2000.
(PDF file)
-
H.Nanjo, A.Lee, and T.Kawahara.
Automatic diagnosis of recognition errors in large vocabulary
continuous speech recognition systems.
In Proc. ICSLP, Vol.2, pp.1027--1030, 2000.
(PDF file)
-
K.Komatani and T.Kawahara.
Generating effective confirmation and guidance using two-level
confidence measures for dialogue systems.
In Proc. ICSLP, Vol.2, pp.648--651, 2000.
(PDF file)
-
K.Kato, H.Nanjo, and T.Kawahara.
Automatic transcription of lecture speech using topic-independent
language modeling.
In Proc. ICSLP, Vol.1, pp.162--165, 2000.
(PDF file)
-
H.Fujisaki, K.Shirai, S.Doshita, S.Nakagawa, K.Hirose, S.Itahashi, T.Kawahara,
S.Ohno, H.Kikuchi, K.Abe, and S.Kiriyama.
Overview of an intelligent system for information retrieval based on
human-machine dialogue through spoken language.
In Proc. ICSLP, Vol.1, pp.70--73, 2000.
(PDF file)
-
T.Kawahara, K.Komatani, and S.Doshita.
Dialogue management using concept-level confidence measures of speech
recognition.
In Proc. Int'l Sympo. on Spoken Dialogue, 2000.
(PDF file)
-
K.Komatani and T.Kawahara.
Flexible mixed-initiative dialogue management using concept-level
confidence measures of speech recognizer output.
In Proc. COLING, pp.467--473, 2000.
(PDF file)
-
A.Lee, T.Kawahara, K.Takeda, and K.Shikano.
A new phonetic tied-mixture model for efficient decoding.
In Proc. IEEE-ICASSP, pp.1269--1272, 2000.
(PDF file)
FY 1999
-
T.Kawahara, T.Kobayashi, K.Takeda, N.Minematsu, K.Itou, M.Yamamoto, A.Yamada,
T.Utsuro, and K.Shikano.
Japanese dictation toolkit -- plug-and-play framework for speech
recognition R&D --.
In Proc. IEEE Workshop Automatic Speech Recognition &
Understanding (ASRU), pp.393--396, 1999.
(PDF file)