FY 2024
-
S.Ueno, A.Lee, and T.Kawahara.
Refining synthesized speech using speaker information and phone
masking for data augmentation of speech recognition.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.32,
pp.3924--3933, 2024.
(text)
(KURENAI)
-
H.Shi, M.Mimura, and T.Kawahara.
Waveform-domain speech enhancement using spectrogram encoding for
robust speech recognition.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.32,
pp.3049--3060, 2024.
(text)
(KURENAI)
-
K.Soky, S.Li, C.Chu, and T.Kawahara.
Finetuning pretrained model with embedding of domain and language
information for ASR of very low-resource settings.
International Journal of Asian Language Processing, Vol.33,
No.4, pp.2350024:1--16, 2024.
(text)
(KURENAI)
FY 2023
-
K.Yamamoto, K.Inoue, and T.Kawahara.
Character expression of a conversational robot for adapting to user
personality.
Advanced Robotics, Vol.38, No.4, pp.256--266, 2024.
(text)
-
Y.Fu, K.Inoue, D.Lala, K.Yamamoto, C.Chu, and T.Kawahara.
Dual variational generative model and auxiliary retrieval for
empathetic response generation by conversational robot.
Advanced Robotics, Vol.37, No.21, pp.1406--1418, 2023.
(text)
(KURENAI preprint)
-
K.Ochi, K.Inoue, D.Lala, T.Kawahara, and H.Kumazaki.
Effect of attentive listening robot on pleasure and arousal change in
psychiatric daycare.
Advanced Robotics, Vol.37, No.21, pp.1382--1391, 2023.
(text)
(KURENAI)
(KURENAI preprint)
-
K.Yamamoto, K.Inoue, and T.Kawahara.
Character expression for spoken dialogue systems with semi-supervised
learning using variational auto-encoder.
Computer Speech and Language, Vol.79, No. 101469, pp.1--14,
2023.
(text)
FY 2022
-
K.Inoue, D.Lala, and T.Kawahara.
Can a robot laugh with you?: Shared laughter generation for
empathetic spoken dialogue.
Frontiers in Robotics and AI (Computational Intelligence in Robotics),
Vol.9, No.933261, pp.1--11, 2022.
(text)
(KURENAI)
-
K.Soky, M.Mimura, T.Kawahara, C.Chu, S.Li, C.Ding, and S.Sam.
TriECCC: Trilingual corpus of the Extraordinary Chambers in the
Courts of Cambodia for speech recognition and translation studies.
International Journal of Asian Language Processing, Vol.31,
No.3&4, pp.225007:1--21, 2022.
(text)
(KURENAI)
-
K.Sekiguchi, Y.Bando, A.A.Nugraha, M.Fontaine, K.Yoshii, and T.Kawahara.
Autoregressive moving average jointly-diagonalizable spatial
covariance analysis for joint source separation and dereverberation.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.30,
pp.2368--2382, 2022.
(text)
FY 2021
-
Y.Du, R.Scheibler, M.Togami, K.Yoshii, and T.Kawahara.
Computationally-efficient overdetermined blind source separation
based on iterative source steering.
IEEE Signal Processing Letters, Vol.29, pp.927--931, 2021.
(text)
-
H.Inaguma and T.Kawahara.
Alignment knowledge distillation for online streaming attention-based
speech recognition.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.31,
pp.1371--1385, 2021.
(text)
-
S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
Synthesizing waveform sequence-to-sequence to augment training data
for sequence-to-sequence speech recognition.
Acoustical Science & Technology, Vol.42, No.6, pp.333--343,
2021.
(text)
(PDF file)
-
E.Nakamura and K.Yoshii.
Musical Rhythm Transcription Based on Bayesian Piece-Specific
Score Models Capturing Repetitions.
Information Sciences, Vol.572, pp.482--500, 2021.
(text)
-
K.Shibata, E.Nakamura, and K.Yoshii.
Non-Local Musical Statistics as Guides for Audio-to-Score Piano
Transcription.
Information Sciences, Vol.566, pp.262--280, 2021.
(text)
-
T.Kawahara, N.Muramatsu, K.Yamamoto, D.Lala, and K.Inoue.
Semi-autonomous avatar enabling unconstrained parallel conversations
--seamless hybrid of WOZ and autonomous dialogue systems--.
Advanced Robotics, Vol.35, No.11, pp.657--663, 2021.
(text)
-
R.Nishikimi, E.Nakamura, M.Goto, and K.Yoshii.
Audio-to-Score Singing Transcription Based on a CRNN-HSMM Hybrid Model.
APSIPA Trans. Signal & Information Process., Vol.10, No.e7,
pp.1--13, 2021.
(text)
FY 2020
-
A.A.Nugraha, K.Sekiguchi, M.Fontaine, Y.Bando, and K.Yoshii.
Flow-Based Independent Vector Analysis for Blind Source Separation.
IEEE Signal Processing Letters, Vol.28, pp.2173--2177, 2020.
(text)
-
Y.Wu, T.Carsault, E.Nakamura, and K.Yoshii.
Semi-Supervised Neural Chord Estimation Based on a Variational
Autoencoder With Latent Chord Labels and Features.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.28,
pp.2956--2966, 2020.
(text)
-
K.Sekiguchi, Y.Bando, A.A.Nugraha, K.Yoshii, and T.Kawahara.
Fast multichannel nonnegative matrix factorization with
directivity-aware jointly-diagonalizable spatial covariance matrices for
blind source separation.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.28,
pp.2610--2625, 2020.
(text)
-
R.Nishikimi, E.Nakamura, M.Goto, K.Itoyama, and K.Yoshii.
Bayesian Singing Transcription Based on a Hierarchical Generative
Model of Keys, Musical Notes, and F0 Trajectories.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.28,
pp.1678--1691, 2020.
(text)
-
H.Tsushima, E.Nakamura, K.Itoyama, and K.Yoshii.
Bayesian Melody Harmonization Based on a Tree-Structured Generative Model of Chord Sequences and Melodies.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.28,
pp.1644--1655, 2020.
(text)
-
A.A.Nugraha, K.Sekiguchi, and K.Yoshii.
A Flow-Based Deep Latent Variable Model for Speech Spectrogram
Modeling and Enhancement.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.28,
pp.1104--1117, 2020.
(text)
-
T.Kawahara, S.Ueno, and M.Morikawa.
Transcription system using automatic speech recognition in the
Japanese Parliament.
The Journal of Professional Reporting and Transcription (Tiro),
No.1, 2020.
(text)
-
R.Duan, T.Kawahara, M.Dantsuji, and H.Nanjo.
Cross-lingual transfer learning of non-native acoustic modeling for
pronunciation error detection and diagnosis.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.28,
pp.391--401, 2020.
(text)
(KURENAI)
-
K.Sekiguchi, Y.Bando, A.A.Nugraha, K.Yoshii, and T.Kawahara.
Semi-supervised multichannel speech enhancement with a deep speech
prior.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.27,
No.12, pp.2197--2212, 2019.
(text)
-
Y.Li, C.T.Ishi, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
Expressing reactive emotion based on multimodal emotion recognition
for natural conversation in human-robot interaction.
Advanced Robotics, Vol.33, No.20, pp.1030--1041, 2019.
(text)
-
T.Zhao and T.Kawahara.
Joint dialog act segmentation and recognition in human conversations
using attention to dialog context.
Computer Speech and Language, Vol.57, pp.108--127, 2019.
(text)
(KURENAI)
-
K.Shimada, Y.Bando, M.Mimura, K.Itoyama, K.Yoshii, and T.Kawahara.
Unsupervised speech enhancement based on multichannel NMF-informed
beamforming for noise-robust automatic speech recognition.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.27,
No.5, pp.960--971, 2019.
(text)
(KURENAI)
FY 2018
-
Y.Ojima, E.Nakamura, K.Itoyama, and K.Yoshii.
Chord-aware automatic music transcription based on hierarchical
Bayesian integration of acoustic and language models.
APSIPA Trans. Signal & Information Process., Vol.7, No.e14,
pp.1--14, 2018.
(text)
-
E.Nakamura and K.Yoshii.
Statistical piano reduction controlling performance difficulty.
APSIPA Trans. Signal & Information Process., Vol.7, No.e13,
pp.1--12, 2018.
(text)
-
K.Inoue, D.Lala, K.Takanashi, and T.Kawahara.
Engagement recognition by a latent character model based on
multimodal listener behaviors in spoken dialogue.
APSIPA Trans. Signal & Information Process., Vol.7, No.e9,
pp.1--16, 2018.
(text)
-
H.Tsushima, E.Nakamura, K.Itoyama, and K.Yoshii.
Generative Statistical Models with Self-Emergent Grammar of Chord
Sequences.
Journal of New Music Research, 2018.
(text)
-
M.Mirzaei, K.Meshgi, and T.Kawahara.
Exploiting automatic speech recognition errors to enhance partial and
synchronized caption for facilitating second language listening.
Computer Speech and Language, Vol.49, pp.17--36, 2018.
(text)
(KURENAI)
-
T.Hagiya, T.Horiuchi, T.Yazaki, and T.Kawahara.
Typing Tutor: Individualized tutoring in text entry for older
adults based on statistical input stumble detection.
J. Information Processing, Vol.26, No.4, 2018.
(text)
-
K.Itakura, Y.Bando, E.Nakamura, K.Itoyama, K.Yoshii, and T.Kawahara.
Bayesian multichannel audio source separation based on integrated
source and spatial models.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.26,
No.4, pp.831--846, 2018.
(text)
(PDF file)
FY 2017
-
Y.Bando, K.Itoyama, M.Konyo, S.Tadokoro, K.Nakadai, K.Yoshii, T.Kawahara, and
H.G.Okuno.
Speech enhancement based on Bayesian low-rank and sparse
decomposition of multichannel magnitude spectrograms.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.26,
No.2, pp.215--230, 2018.
(text)
(PDF file)
(Errata)
-
R.Duan, T.Kawahara, M.Dantsuji, and J.Zhang.
Articulatory modeling for pronunciation error detection without
non-native training data based on DNN transfer learning.
IEICE Trans., Vol.E100-D, No.9, pp.2174--2182, 2017.
(text)
-
T.Hagiya, T.Horiuchi, T.Yazaki, T.Kato, and T.Kawahara.
Assistive typing application for older adults based on input stumble
detection.
J. Information Processing, Vol.25, No.6, 2017.
(text)
FY 2016
-
M.Mirzaei, K.Meshgi, Y.Akita, and T.Kawahara.
Partial and synchronized captioning: A new tool to assist learners in
developing second language listening skill.
ReCALL Journal, Vol.29, No.2, pp.178--199, 2017.
(text)
(PDF file)
-
M.Ohkita, Y.Bando, Y.Ikemiya, E.Nakamura, K.Itoyama, and K.Yoshii.
Audio-visual beat tracking based on a state-space model for a
robot dancer performing with a human dancer.
Journal Robotics & Mechatronics, Vol.29, No.1, pp.125--136, 2017.
(text)
-
K.Sekiguchi, Y.Bando, K.Itoyama, and K.Yoshii.
Layout optimization of cooperative distributed microphone arrays
based on estimation of source separation performance.
Journal Robotics & Mechatronics, Vol.29, No.1, pp.83--93, 2017.
(text)
-
K.Youssef, K.Itoyama, and K.Yoshii.
Simultaneous identification and localization of still and mobile
speakers based on binaural robot audition.
Journal Robotics & Mechatronics, Vol.29, No.1, pp.59--71, 2017.
(text)
-
Y.Ikemiya, K.Itoyama, and K.Yoshii.
Singing Voice Separation and Vocal F0 Estimation Based on Mutual
Combination of Robust Principal Component Analysis and Subharmonic
Summation.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.24, No.11,
pp.2084--2095, 2016.
(text)
(PDF file)
-
S.Li, Y.Akita, and T.Kawahara.
Semi-supervised acoustic model training by discriminative data
selection from multiple ASR systems' hypotheses.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.24,
No.9, pp.1524--1534, 2016.
(text)
(PDF file)
(KURENAI)
-
T.Kawahara, T.Iwatate, K.Inoue, S.Hayashi, H.Yoshimoto, and K.Takanashi.
Multi-modal sensing and analysis of poster conversations with smart
posterboard.
APSIPA Trans. Signal & Information Process., Vol.5, No.e2,
pp.1--12, 2016.
(text)
-
K.Yoshino and T.Kawahara.
Conversational system for information navigation based on POMDP
with user focus tracking.
Computer Speech and Language, Vol.34, No.1, pp.275--291,
2015.
(text)
-
I.Nishimuta, K.Yoshii, K.Itoyama, and H.G.Okuno.
Toward a Quizmaster Robot for Speech-based Multiparty Interaction.
Advanced Robotics, Vol.29, No.18, pp.1205--1219, 2015.
(text)
-
S.Li, Y.Akita, and T.Kawahara.
Automatic lecture transcription based on discriminative data
selection for lightly supervised acoustic model training.
IEICE Trans., Vol.E98-D, No.8, pp.1545--1552, 2015.
(text)
-
R.Gomez, T.Kawahara, and K.Nakadai.
Optimized wavelet-domain filtering under noisy and reverberant
conditions.
APSIPA Trans. Signal & Information Process., Vol.4, No.e3,
pp.1--12, 2015.
(text)
-
M.Mimura, S.Sakai, and T.Kawahara.
Reverberant speech recognition combining deep neural networks and
deep autoencoders augmented with phone-class feature.
EURASIP J. Advances in Signal Processing, Vol.2015, No.62,
pp.1--13, 2015.
(text)
(PDF file)
(KURENAI)
-
T.Tung, R.Gomez, T.Kawahara, and T.Matsuyama.
Multi-party interaction understanding using smart multimodal digital
signage.
IEEE Trans. Human-Machine Systems, Vol.44, No.5,
pp.625--637, 2014.
(text)
(PDF file)
-
M.Ablimit, T.Kawahara, and A.Hamdulla.
Lexicon optimization based on discriminative learning for automatic
speech recognition of agglutinative language.
Speech Communication, Vol.60, pp.78--87, 2014.
(text)
(PDF file)
FY 2013
-
S.Sakai and T.Kawahara.
Admissible stopping in Viterbi beam search for unit selection
speech synthesis.
IEICE Trans., Vol.E96-D, No.6, pp.1359--1367, 2013.
(text)
-
G.Neubig, T.Watanabe, S.Mori, and T.Kawahara.
Substring-based machine translation.
Machine Translation, Vol.27, No.2, pp.139--166, 2013.
(text)
(PDF file)
FY 2012
-
H.Nishizaki, T.Akiba, K.Aikawa, T.Kawahara, and T.Matsui.
Evaluation framework design of spoken term detection study at the
NTCIR-9 IR for spoken documents task.
Journal of Natural Language Processing (自然言語処理), Vol.19, No.4, pp.329--350, 2012.
(text)
(PDF file)
-
G.Neubig, Y.Akita, S.Mori, and T.Kawahara.
A monotonic statistical machine translation approach to speaking
style transformation.
Computer Speech and Language, Vol.26, No.5, pp.349--370,
2012.
(text)
(PDF file)
-
G.Neubig, T.Watanabe, E.Sumita, S.Mori, and T.Kawahara.
Joint phrase alignment and extraction for statistical machine
translation.
J. Information Processing, Vol.20, No.2, pp.512--523, 2012.
(text)
FY 2011
-
G.Neubig, M.Mimura, S.Mori, and T.Kawahara.
Bayesian learning of a language model from continuous speech.
IEICE Trans., Vol.E95-D, No.2, pp.614--625, 2012.
(text)
-
S.Sakai, T.Kawahara, and H.Kawai.
Probabilistic concatenation modeling for corpus-based speech
synthesis.
IEICE Trans., Vol.E94-D, No.10, pp.2006--2014, 2011.
(text)
-
D.Cournapeau, S.Watanabe, A.Nakamura, and T.Kawahara.
Online unsupervised classification with model comparison in the
Variational Bayes framework for voice activity detection.
IEEE J. Selected Topics in Signal Processing, Vol.4, No.6,
pp.1071--1083, 2010.
(text)
(PDF file)
(KURENAI)
-
R.Gomez and T.Kawahara.
Robust speech recognition based on dereverberation parameter
optimization using acoustic model likelihood.
IEEE Trans. Audio, Speech & Language Process., Vol.18, No.7,
pp.1708--1716, 2010.
(text)
(PDF file)
(KURENAI)
-
Y.Akita and T.Kawahara.
Statistical transformation of language and pronunciation models for
spontaneous speech recognition.
IEEE Trans. Audio, Speech & Language Process., Vol.18, No.6,
pp.1539--1549, 2010.
(text)
(PDF file)
(KURENAI)
-
K.Ishizuka, S.Araki, and T.Kawahara.
Speech activity detection for multi-party conversation analyses based
on likelihood ratio test on spatial magnitude.
IEEE Trans. Audio, Speech & Language Process., Vol.18, No.6,
pp.1354--1365, 2010.
(text)
(PDF file)
-
T.Shinozaki, S.Furui, and T.Kawahara.
Gaussian mixture optimization based on efficient cross-validation.
IEEE J. Selected Topics in Signal Processing, Vol.4, No.3,
pp.540--547, 2010.
(text)
(PDF file)
FY 2009
-
T.Misu and T.Kawahara.
Bayes risk-based dialogue management for document retrieval system
with speech interface.
Speech Communication, Vol.52, No.1, pp.61--71, 2010.
(text)
(PDF file)
-
H.Wang and T.Kawahara.
Effective prediction of errors by non-native speakers using decision
tree for speech recognition-based CALL system.
IEICE Trans., Vol.E92-D, No.12, pp.2462--2468, 2009.
(text)
-
H.Wang, C.J.Waple, and T.Kawahara.
Computer assisted language learning system based on dynamic question
generation and error prediction for automatic speech recognition.
Speech Communication, Vol.51, No.10, pp.995--1005, 2009.
(text)
(PDF file)
FY 2008
-
D.Cournapeau and T.Kawahara.
Voice activity detection based on high order statistics and online
EM algorithm.
IEICE Trans., Vol.E91-D, No.12, pp.2854--2861, 2008.
(text)
-
I.R.Lane, T.Kawahara, T.Matsui, and S.Nakamura.
Out-of-domain utterance detection using classification confidences of
multiple topics.
IEEE Trans. Audio, Speech & Language Process., Vol.15, No.1,
pp.150--161, 2007.
(text)
(PDF file)
(KURENAI)
-
T.Misu and T.Kawahara.
Dialogue strategy to clarify user's queries for document retrieval
system with speech interface.
Speech Communication, Vol.48, No.9, pp.1137--1150, 2006.
(text)
(PDF file)
-
C.Troncoso and T.Kawahara.
Trigger-based language model adaptation for automatic transcription
of panel discussions.
IEICE Trans., Vol.E89-D, No.3, pp.1024--1031, 2006.
(text)
-
I.R.Lane and T.Kawahara.
Verification of speech recognition results incorporating in-domain
confidence and discourse coherence measures.
IEICE Trans., Vol.E89-D, No.3, pp.931--938, 2006.
(text)
-
M.Nishida and T.Kawahara.
Speaker model selection based on the Bayesian information criterion
applied to unsupervised speaker indexing.
IEEE Trans. Speech & Audio Process., Vol.13, No.4,
pp.583--592, 2005.
(text)
(PDF file)
(KURENAI)
FY 2004
-
K.Komatani, S.Ueno, T.Kawahara, and H.G.Okuno.
User modeling in spoken dialogue systems to generate flexible
guidance.
User Modeling and User-Adapted Interaction, Vol.15, No.1,
pp.169--183, 2005.
(text)
(PDF file)
-
I.R.Lane, T.Kawahara, T.Matsui, and S.Nakamura.
Dialogue speech recognition by combining hierarchical topic
classification and language model switching.
IEICE Trans., Vol.E88-D, No.3, pp.446--454, 2005.
(text)
-
Y.Akita and T.Kawahara.
Language model adaptation based on PLSA of topics and speakers for
automatic transcription of panel discussions.
IEICE Trans., Vol.E88-D, No.3, pp.439--445, 2005.
(text)
-
T.Kawahara, M.Hasegawa, K.Shitaoka, T.Kitade, and H.Nanjo.
Automatic indexing of lecture presentations using unsupervised
learning of presumed discourse markers.
IEEE Trans. Speech & Audio Process., Vol.12, No.4,
pp.409--419, 2004.
(text)
(PDF file)
(KURENAI)
-
H.Nanjo and T.Kawahara.
Language model and speaking rate adaptation for spontaneous
presentation speech recognition.
IEEE Trans. Speech & Audio Process., Vol.12, No.4,
pp.391--400, 2004.
(text)
(PDF file)
(KURENAI)
-
Y.Tsubota, T.Kawahara, and M.Dantsuji.
An English pronunciation learning system for Japanese students
based on diagnosis of critical pronunciation errors.
ReCALL Journal, Vol.16, No.1, pp.173--188, 2004.
(text)
(PDF file)
-
Y.Tsubota, T.Kawahara, and M.Dantsuji.
Formant structure estimation using vocal tract length normalization
for CALL system.
Acoustical Science & Technology, Vol.24, No.2, pp.93--96,
2003.
(text)
(PDF file)
-
M.Mimura and T.Kawahara.
Difference of acoustic modeling for read speech and dialogue speech.
Acoustical Science & Technology, Vol.22, No.5, pp.373--374,
2001.
(text)
(PDF file)
FY 2000
-
C.-H.Jo, T.Kawahara, S.Doshita, and M.Dantsuji.
Japanese pronunciation instruction system using speech recognition
methods.
IEICE Trans., Vol.E83-D, No.11, pp.1960--1968, 2000.
(text)
-
T.Kawahara, C.-H.Lee, and B.-H.Juang.
Flexible speech understanding based on combined key-phrase detection
and verification.
IEEE Trans. Speech & Audio Process., Vol.6, No.6,
pp.558--568, 1998.
(text)
(PDF file)
-
T.Kawahara and S.Doshita.
Comparison of discrete and continuous classifier-based HMM.
J. Acoust. Soc. Japan (E), Vol.13, No.6, pp.361--367, 1992.
(text)
(PDF file)