As voice interaction becomes more common in smart devices and office automation, demand is growing for speech recognition services that are real-time, efficient, and easy to deploy. This article shows how to build a high-performance C++ WebSocket server for streaming automatic speech recognition (ASR) based on sherpa-onnx and SenseVoice. The server supports multilingual recognition, voice activity detection (VAD), and containerized deployment, and is compatible with a variety of clients.
GitHub project: mawwalker/stt-server. Stars and feedback are welcome!
7/13/25 · About 3 min read