Zero-shot multi-speaker text-to-speech with state-of-the-art neural speaker embeddings E Cooper, CI Lai, Y Yasuda, F Fang, X Wang, N Chen, J Yamagishi ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 133 | 2020 |
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language Y Yasuda, X Wang, S Takaki, J Yamagishi ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 91 | 2019 |
ESPnet2-TTS: Extending the edge of TTS research T Hayashi, R Yamamoto, T Yoshimura, P Wu, J Shi, T Saeki, Y Ju, ... arXiv preprint arXiv:2110.07840, 2021 | 24 | 2021 |
Investigation of learning abilities on linguistic features in sequence-to-sequence text-to-speech synthesis Y Yasuda, X Wang, J Yamagishi Computer Speech & Language 67, 101183, 2021 | 22 | 2021 |
Can speaker augmentation improve multi-speaker end-to-end TTS? E Cooper, CI Lai, Y Yasuda, J Yamagishi arXiv preprint arXiv:2005.01245, 2020 | 18 | 2020 |
Initial investigation of an encoder-decoder end-to-end TTS framework using marginalization of monotonic hard latent alignments Y Yasuda, X Wang, J Yamagishi arXiv preprint arXiv:1908.11535, 2019 | 18 | 2019 |
End-to-end text-to-speech using latent duration based on VQ-VAE Y Yasuda, X Wang, J Yamagishi ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 14 | 2021 |
Modeling of Rakugo speech and its limitations: Toward speech synthesis that entertains audiences S Kato, Y Yasuda, X Wang, E Cooper, S Takaki, J Yamagishi IEEE Access 8, 138149-138161, 2020 | 8 | 2020 |
Pretraining strategies, waveform model choice, and acoustic configurations for multi-speaker end-to-end speech synthesis E Cooper, X Wang, Y Zhao, Y Yasuda, J Yamagishi arXiv preprint arXiv:2011.04839, 2020 | 7 | 2020 |
Rakugo speech synthesis using segment-to-segment neural transduction and style tokens—toward speech synthesis for entertaining audiences S Kato, Y Yasuda, X Wang, E Cooper, S Takaki, J Yamagishi Proc. 10th ISCA Speech Synth. Workshop, 111-116, 2019 | 7 | 2019 |
TTS tutorial at IEICE SP workshop X Wang, Y Yasuda | 5 | 2019 |
Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language Y Yasuda, T Toda IEEE Journal of Selected Topics in Signal Processing 16 (6), 1319-1328, 2022 | 3 | 2022 |
Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment Y Yasuda, X Wang, J Yamagishi ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 2 | 2020 |
Use and evaluation of Tacotron and context features in Rakugo speech synthesis (Signal Processing) S Kato, S Takaki, J Yamagishi, Y Yasuda IEICE Technical Report 118 (496 …, 2019 | 1 | 2019 |
Text-to-speech synthesis based on latent variable conversion using diffusion probabilistic model and variational autoencoder Y Yasuda, T Toda ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | | 2023 |
Spoken-Text-Style Transfer with Conditional Variational Autoencoder and Content Word Storage D Yoshioka, Y Yasuda, N Matsunaga, Y Ohtani, T Toda Proc. Interspeech 2022, 4576-4580, 2022 | | 2022 |
How Similar or Different Is Rakugo Speech Synthesizer to Professional Performers? S Kato, Y Yasuda, X Wang, E Cooper, J Yamagishi ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | | 2021 |
Lexical pitch accent and duration modeling for neural end-to-end text-to-speech synthesis Y Yasuda The Graduate University for Advanced Studies, 2021 | | 2021 |
How close can Rakugo speech synthesis come to human rakugo performers? S Kato, Y Yasuda, J Yamagishi Research Report, Spoken Language Processing (SLP) 2020 (15), 1-6, 2020 | | 2020 |