Soumi Maiti

Cited by

	All	Since 2019
Citations	256	251
h-index	10	10
i10-index	11	11

Citations per year

Year	Citations
2018	5
2019	10
2020	18
2021	18
2022	29
2023	95
2024	79

Public access

6 articles available, 0 articles not available (based on funding mandates)

Co-authors

Shinji Watanabe	Carnegie Mellon University	Verified email at cmu.edu
Yifan Peng	Carnegie Mellon University	Verified email at andrew.cmu.edu
Michael I Mandel	Associate Professor of Computer and Information Science at Brooklyn College, CUNY	Verified email at sci.brooklyn.cuny.edu
Takaaki Saeki	Google	Verified email at google.com
Erik Marchi	Apple Inc.	Verified email at tum.de
Alistair Conkie	Apple	Verified email at apple.com
John Hershey	Google (formerly MERL, IBM, MSR, UCSD)	Verified email at google.com
Hakan Erdogan	Google	Verified email at google.com
Scott Wisdom	Google Research	Verified email at google.com
Kevin Wilson	Google	Verified email at google.com
Srinivas Bangalore	Interactions	Verified email at interactions.com
Svetlana Stoyanchev	Research Scientist, AT&T Labs	Verified email at research.att.com
Yooncheol Ju	Speech synthesis AI researcher, 42dot.Inc, Hyundai Motor Group	Verified email at 42dot.ai

Soumi Maiti

Carnegie Mellon University

Verified email at andrew.cmu.edu

Machine Learning, Speech Processing


Title	Authors	Venue	Cited by	Year
Generating multilingual voices using speaker space translation based on bilingual speaker data	S Maiti, E Marchi, A Conkie	ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020	23	2020
Improving massively multilingual ASR with auxiliary CTC objectives	W Chen, B Yan, J Shi, Y Peng, S Maiti, S Watanabe	ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023	22	2023
Parametric resynthesis with neural vocoders	S Maiti, MI Mandel	2019 IEEE Workshop on Applications of Signal Processing to Audio and …, 2019	22	2019
End-to-end diarization for variable number of speakers with local-global networks and discriminative speaker embeddings	S Maiti, H Erdogan, K Wilson, S Wisdom, S Watanabe, JR Hershey	ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and …, 2021	21	2021
Speaker independence of neural vocoders and their effect on parametric resynthesis speech enhancement	S Maiti, MI Mandel	ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020	21	2020
SpeechLMScore: Evaluating speech generation using speech language model	S Maiti, Y Peng, T Saeki, S Watanabe	ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023	16	2023
EEND-SS: Joint end-to-end neural speaker diarization and speech separation for flexible number of speakers	S Maiti, Y Ueda, S Watanabe, C Zhang, M Yu, SX Zhang, Y Xu	2022 IEEE Spoken Language Technology Workshop (SLT), 480-487, 2023	16	2023
Reducing barriers to self-supervised learning: HuBERT pre-training with academic compute	W Chen, X Chang, Y Peng, Z Ni, S Maiti, S Watanabe	arXiv preprint arXiv:2306.06672, 2023	14	2023
Speech denoising by parametric resynthesis	S Maiti, MI Mandel	ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and …, 2019	12	2019
Reproducing Whisper-style training using an open-source toolkit and publicly available data	Y Peng, J Tian, B Yan, D Berrebbi, X Chang, X Li, J Shi, S Arora, W Chen, ...	2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023	11	2023
VoxtLM: Unified Decoder-Only Models for Consolidating Speech Recognition, Synthesis and Speech, Text Continuation Tasks	S Maiti, Y Peng, S Choi, J Jung, X Chang, S Watanabe	ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024	10	2024
Predicting interaction quality in customer service dialogs	S Stoyanchev, S Maiti, S Bangalore	Advanced Social Interaction with Agents: 8th International Workshop on …, 2018	8	2018
Exploring speech recognition, translation, and understanding with discrete speech units: A comparative study	X Chang, B Yan, K Choi, JW Jung, Y Lu, S Maiti, R Sharma, J Shi, J Tian, ...	ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024	7	2024
Unsupervised data selection for TTS: using Arabic Broadcast News as a case study	M Baali, T Hayashi, H Mubarak, S Maiti, S Watanabe, W El-Hajj, A Ali	arXiv preprint arXiv:2301.09099, 2023	7	2023
TriniTTS: Pitch-controllable End-to-end TTS without External Aligner.	Y Ju, I Kim, H Yang, JH Kim, B Kim, S Maiti, S Watanabe	INTERSPEECH, 16-20, 2022	7	2022
Learning to speak from text: Zero-shot multilingual text-to-speech with unsupervised text pretraining	T Saeki, S Maiti, X Li, S Watanabe, S Takamichi, H Saruwatari	arXiv preprint arXiv:2301.12596, 2023	6	2023
Large Vocabulary Concatenative Resynthesis.	S Maiti, J Ching, MI Mandel	INTERSPEECH, 1190-1194, 2018	6	2018
Concatenative Resynthesis Using Twin Networks.	S Maiti, MI Mandel	INTERSPEECH, 3647-3651, 2017	6	2017
FindAdaptNet: Find and insert adapters by learned layer importance	J Huang, K Ganesan, S Maiti, YM Kim, X Chang, P Liang, S Watanabe	ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023	5	2023
ESPnet-ST-v2: Multipurpose spoken language translation toolkit	B Yan, J Shi, Y Tang, H Inaguma, Y Peng, S Dalmia, P Polák, ...	arXiv preprint arXiv:2304.04596, 2023	5	2023
