Toward Conversational User Interface via Voice Command Correction

Authors: Shi-Jie Ding, Chia-Hui Chang

Publication Date: 2026-05

Updated: April 10, 2026

Abstract

Despite recent advances in AI, automatic speech recognition (ASR) systems still make errors in real-world use, particularly those caused by pronunciation variation and homophones. We propose a speech-command-based correction system that lets users issue natural-language instructions to refine recognition output with minimal effort. The system consists of three modules: an input classifier, a command classifier, and a correction labeler. To support training and evaluation, we simulate ASR errors with text-to-speech (TTS) and ASR pipelines, and we generate realistic correction commands using large language models (LLMs) and linguistic features. Experiments show that the individual models achieve over 80% correction accuracy and that a combined model delivers stable performance. Compared with manual correction, the system is also competitive in correction speed, which suggests that it is feasible for practical deployment.
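To make the three-module structure concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the input classifier separates correction commands from new dictation, that the command classifier identifies the command type, and that the correction labeler locates the span to edit; the abstract does not specify these roles. Every name here (`Correction`, `is_command`, `parse_command`, `apply_correction`, `handle`) is hypothetical, and simple regular-expression rules stand in for the trained models.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical rule-based stand-ins for the three learned modules.
# The command phrasings below are assumptions, not taken from the paper.
REPLACE_RE = re.compile(
    r"^(?:change|replace)\s+(.+?)\s+(?:to|with)\s+(.+)$", re.IGNORECASE
)
DELETE_RE = re.compile(r"^(?:delete|remove)\s+(.+)$", re.IGNORECASE)


@dataclass
class Correction:
    kind: str              # "replace" or "delete"
    target: str            # text to locate in the transcript
    replacement: str = ""  # new text (empty for deletions)


def is_command(utterance: str) -> bool:
    """Stage 1 (input classifier stand-in): command vs. new dictation."""
    return bool(REPLACE_RE.match(utterance) or DELETE_RE.match(utterance))


def parse_command(utterance: str) -> Optional[Correction]:
    """Stage 2 (command classifier stand-in): identify the command type."""
    m = REPLACE_RE.match(utterance)
    if m:
        return Correction("replace", m.group(1), m.group(2))
    m = DELETE_RE.match(utterance)
    if m:
        return Correction("delete", m.group(1))
    return None


def apply_correction(transcript: str, c: Correction) -> str:
    """Stage 3 (correction labeler stand-in): find the span and edit it."""
    pattern = re.compile(r"\b" + re.escape(c.target) + r"\b", re.IGNORECASE)
    new_text = c.replacement if c.kind == "replace" else ""
    # A function replacement avoids backslash interpretation in new_text.
    edited = pattern.sub(lambda _m: new_text, transcript, count=1)
    return " ".join(edited.split())  # collapse gaps left by deletions


def handle(transcript: str, utterance: str) -> str:
    """Route one utterance through the three stages."""
    utterance = utterance.strip()
    if not is_command(utterance):
        return f"{transcript} {utterance}".strip()  # treat as dictation
    c = parse_command(utterance)
    return apply_correction(transcript, c) if c else transcript


if __name__ == "__main__":
    print(handle("I want to by a ticket", "change by to buy"))
    # prints: I want to buy a ticket
```

In this sketch only the first whole-word match is edited, and any dictation that happens to begin with a command word would be misrouted; handling such ambiguity is the kind of case a trained classifier, rather than fixed rules, would be expected to cover.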