Takaaki Saeki
Verified email at ipc.i.u-tokyo.ac.jp
Title
Cited by
Year
Espnet2-tts: Extending the edge of tts research
T Hayashi, R Yamamoto, T Yoshimura, P Wu, J Shi, T Saeki, Y Ju, ...
arXiv preprint arXiv:2110.07840, 2021
Cited by: 19, Year: 2021
Real-Time, Full-Band, Online DNN-Based Voice Conversion System Using a Single CPU.
T Saeki, Y Saito, S Takamichi, H Saruwatari
INTERSPEECH, 1021-1022, 2020
Cited by: 11, Year: 2020
UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022
T Saeki, D Xin, W Nakata, T Koriyama, S Takamichi, H Saruwatari
arXiv preprint arXiv:2204.02152, 2022
Cited by: 9, Year: 2022
Incremental text-to-speech synthesis using pseudo lookahead with large pretrained language model
T Saeki, S Takamichi, H Saruwatari
IEEE Signal Processing Letters 28, 857-861, 2021
Cited by: 9, Year: 2021
JTubeSpeech: corpus of Japanese speech collected from YouTube for speech recognition and speaker verification
S Takamichi, L Kürzinger, T Saeki, S Shiota, S Watanabe
arXiv preprint arXiv:2112.09323, 2021
Cited by: 4, Year: 2021
Lifter training and sub-band modeling for computationally efficient and high-quality voice conversion using spectral differentials
T Saeki, Y Saito, S Takamichi, H Saruwatari
ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020
Cited by: 4, Year: 2020
DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning
T Saeki, K Tachibana, R Yamamoto
arXiv preprint arXiv:2203.15683, 2022
Cited by: 2, Year: 2022
SelfRemaster: Self-Supervised Speech Restoration with Analysis-by-Synthesis Approach Using Channel Modeling
T Saeki, S Takamichi, T Nakamura, N Tanji, H Saruwatari
arXiv preprint arXiv:2203.12937, 2022
Cited by: 2, Year: 2022
Personalized filled-pause generation with group-wise prediction models
Y Matsunaga, T Saeki, S Takamichi, H Saruwatari
arXiv preprint arXiv:2203.09961, 2022
Cited by: 2, Year: 2022
Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context Prediction Network
T Saeki, S Takamichi, H Saruwatari
2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2021
Cited by: 2, Year: 2021
Real-Time Full-Band Voice Conversion with Sub-Band Modeling and Data-Driven Phase Estimation of Spectral Differentials
T Saeki, Y Saito, S Takamichi, H Saruwatari
IEICE TRANSACTIONS on Information and Systems 104 (7), 1002-1016, 2021
Cited by: 2, Year: 2021
End-to-End Deep Learning Speech Recognition Model for Silent Speech Challenge.
N Kimura, Z Su, T Saeki
INTERSPEECH, 1025-1026, 2020
Cited by: 2, Year: 2020
vTTS: visual-text to speech
Y Nakano, T Saeki, S Takamichi, K Sudoh, H Saruwatari
2022 IEEE Spoken Language Technology Workshop (SLT), 936-942, 2023
Cited by: 1, Year: 2023
SpeechLMScore: Evaluating speech generation using speech language model
S Maiti, Y Peng, T Saeki, S Watanabe
arXiv preprint arXiv:2212.04559, 2022
Cited by: 1, Year: 2022
Empirical Study Incorporating Linguistic Knowledge on Filled Pauses for Personalized Spontaneous Speech Synthesis
Y Matsunaga, T Saeki, S Takamichi, H Saruwatari
2022 Asia-Pacific Signal and Information Processing Association Annual …, 2022
Cited by: 1, Year: 2022
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech
T Saeki, H Zen, Z Chen, N Morioka, G Wang, Y Zhang, A Bapna, ...
arXiv preprint arXiv:2210.15447, 2022
Cited by: 1, Year: 2022
Text-to-speech synthesis from dark data with evaluation-in-the-loop data selection
K Seki, S Takamichi, T Saeki, H Saruwatari
arXiv preprint arXiv:2210.14850, 2022
Cited by: 1, Year: 2022
SSR7000: A Synchronized Corpus of Ultrasound Tongue Imaging for End-to-End Silent Speech Recognition
N Kimura, Z Su, T Saeki, J Rekimoto
Proceedings of the Thirteenth Language Resources and Evaluation Conference …, 2022
Cited by: 1, Year: 2022
Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech
D Yang, T Koriyama, Y Saito, T Saeki, D Xin, H Saruwatari
arXiv preprint arXiv:2302.13652, 2023
Year: 2023
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining
T Saeki, S Maiti, X Li, S Watanabe, S Takamichi, H Saruwatari
arXiv preprint arXiv:2301.12596, 2023
Year: 2023