Audio ML Papers

Last 7 Days (March 26 - April 02, 2026)

Subcategories: All (2) | Speech Synthesis (0) | Music Synthesis (0) | Ambient Synthesis (0) | Quality Evaluation (0) | Enhancement (0) | ASR (1) | LLM Audio (0) | MIDI Generation (0) | Generative Conditioning (0) | Other (1)

🏆 Top Papers This Week

#1 TOP PAPER (Score: 82)
Huan Shen, Yingao Wang, Shangkun Huang ... · arXiv
Turn-taking modeling is fundamental to spoken dialogue systems, yet its evaluation remains fragmented and often limited to binary boundary detection under narrow interaction settings. Such protocols hinder systematic comparison and obscure model weaknesses across conversational c...
#2 TOP PAPER (Score: 82)
Shangkun Huang, Huan Shen, Wei Zou ... · arXiv
Speech LLM-based ASR often struggles with named entities and long-tail words due to strong internal language-model priors. Retrieval-augmented biasing can help, but its effectiveness depends on accurate hotword localization in full-utterance speech under weak supervision. We prop...
Thursday, March 26, 2026
Shangkun Huang, Huan Shen, Wei Zou ... · arXiv
Speech LLM-based ASR often struggles with named entities and long-tail words due to strong internal language-model priors. Retrieval-augmented biasing can help, but its effectiveness depends on accurate hotword localization in full-utterance speech under weak supervision. We prop...
Huan Shen, Yingao Wang, Shangkun Huang ... · arXiv
Turn-taking modeling is fundamental to spoken dialogue systems, yet its evaluation remains fragmented and often limited to binary boundary detection under narrow interaction settings. Such protocols hinder systematic comparison and obscure model weaknesses across conversational c...