I'm looking for a simple Python script for a voice pipeline that is fully local (offline), runs on Windows (Core i7), processes microphone audio in real time, uses VAD to detect end-of-speech, performs streaming STT, and immediately streams the recognized text into TTS for echo playback. End-to-end latency must be under 1 second. Any open-source stack that meets this latency target is welcome.
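
To make the end-of-speech requirement concrete, here is a minimal sketch of the VAD stage only, using a simple energy threshold as a stand-in for a real VAD model. Everything in it is an assumption for illustration: 16 kHz 16-bit mono PCM, 20 ms frames, the threshold of 500, the 300 ms hangover, and the names `frame_rms`, `segment_utterances`, and `synthetic_frame`. Microphone capture, STT, and TTS are not shown.

```python
import math
import struct
from typing import Iterable, Iterator, List

SAMPLE_RATE = 16000                              # Hz (assumed)
FRAME_MS = 20                                    # frame length in ms (assumed)
FRAME_SAMPLES = SAMPLE_RATE * FRAME_MS // 1000   # 320 samples per frame


def frame_rms(frame: bytes) -> float:
    """Return the RMS level of one frame of 16-bit little-endian mono PCM."""
    n = len(frame) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack("<%dh" % n, frame[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)


def segment_utterances(
    frames: Iterable[bytes],
    rms_threshold: float = 500.0,   # assumed; tune for the actual microphone
    hangover_ms: int = 300,         # trailing silence that ends an utterance
) -> Iterator[bytes]:
    """Yield one bytes object per utterance.

    End-of-speech is declared once `hangover_ms` of consecutive
    below-threshold frames follow detected speech.
    """
    hangover_frames = hangover_ms // FRAME_MS
    buffered: List[bytes] = []
    silent_run = 0
    in_speech = False
    for frame in frames:
        if frame_rms(frame) >= rms_threshold:
            in_speech = True
            silent_run = 0
            buffered.append(frame)
        elif in_speech:
            silent_run += 1
            buffered.append(frame)
            if silent_run >= hangover_frames:
                yield b"".join(buffered)
                buffered, silent_run, in_speech = [], 0, False
        # Silence before any speech is dropped.
    if in_speech:
        # Flush a partial utterance when the stream ends.
        yield b"".join(buffered)


def synthetic_frame(amplitude: int) -> bytes:
    """Build one test frame: a 440 Hz sine at `amplitude` (0 gives silence)."""
    return struct.pack(
        "<%dh" % FRAME_SAMPLES,
        *(
            int(amplitude * math.sin(2 * math.pi * 440 * i / SAMPLE_RATE))
            for i in range(FRAME_SAMPLES)
        ),
    )
```

The hangover counts directly against the latency budget: if latency is measured from the moment the speaker stops, a 300 ms hangover leaves roughly 700 ms for STT finalization and the first TTS audio. In a real pipeline the frames would come from the microphone, and each yielded utterance (or the frames as they arrive, for streaming STT) would be passed to the recognizer.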