Yes: there is a TouchDesigner component that can recognize sung or spoken vocals and transcribe them into text.
A developer known as blankensmithing created a set of custom TouchDesigner components that integrate OpenAI's Whisper (for speech-to-text transcription) and ChatGPT within TouchDesigner. These components are designed to be easy to use: you simply add your OpenAI API key, and they work right inside TouchDesigner without additional setup.
Additionally, the tutorial is available in the TouchDesigner community and on the official Derivative website, with project files and components shared via the creator's Patreon.
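For context, the kind of request such a component presumably makes under the hood can be sketched as below. The endpoint and the `whisper-1` model name come from OpenAI's public speech-to-text API; the helper function itself is illustrative, not the plugin's actual code:

```python
import os
from pathlib import Path

# OpenAI's speech-to-text endpoint (per their public API docs).
OPENAI_TRANSCRIBE_URL = "https://api.openai.com/v1/audio/transcriptions"

def build_transcription_request(api_key: str, audio_path: str) -> dict:
    """Assemble the pieces of a Whisper transcription call without sending it.

    Illustrative helper: the real components handle this for you once the
    API key is set.
    """
    return {
        "url": OPENAI_TRANSCRIBE_URL,
        "headers": {"Authorization": f"Bearer {api_key}"},
        # Sent as multipart/form-data: the audio file plus the model name.
        "fields": {"model": "whisper-1", "file": Path(audio_path).name},
    }

req = build_transcription_request(os.environ.get("OPENAI_API_KEY", "sk-..."), "vocals.wav")
# With the `requests` library you would then POST it, roughly:
#   requests.post(req["url"], headers=req["headers"],
#                 data={"model": "whisper-1"},
#                 files={"file": open("vocals.wav", "rb")})
```

The point is simply that all of this plumbing (auth header, multipart upload, model selection) is what the components hide behind a single API-key field.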
A community member on Reddit shared:
“I’m just looking at adding custom text to speech for the text output… currently looking at adding something like the elevenlabs.io API.”
This indicates these components are actively used and flexible for extension, even in creative performance contexts.
In short:
- Captures sung vocals as well as spoken word, though transcription accuracy will vary with audio clarity.
- No complex installation: it only requires inserting your OpenAI API key into the component.
- Other users have successfully integrated transcription into interactive installations and added TTS functionality (e.g., using ElevenLabs for voice output).
If you're looking for an alternative, perhaps a more customized or lighter-weight solution, you could set up your own speech-to-text pipeline using Python:
- Python scripts in a Script DAT or Text DAT within TouchDesigner can run local or cloud speech-to-text libraries.
- External speech-recognition tools can be launched via Python's `subprocess.Popen()` from within TouchDesigner scripts, with the output fed back into DATs or CHOPs; communication can also be handled via a Web Server DAT or similar.
This DIY route allows more control over networked workflows, but requires a fair bit more setup.
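The subprocess route above can be sketched as follows. The TouchDesigner operator name and the Whisper CLI invocation in the comments are illustrative assumptions, not tested against a live project:

```python
import subprocess

def transcribe_external(cmd, timeout=120):
    """Run an external speech-to-text command and return its stdout as text."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    out, err = proc.communicate(timeout=timeout)
    if proc.returncode != 0:
        raise RuntimeError(f"transcriber exited {proc.returncode}: {err.strip()}")
    return out.strip()

# Inside TouchDesigner you might call this from a Script DAT callback and push
# the result into a Table DAT, e.g. op('transcript').appendRow([text])
# (hypothetical operator name). A local Whisper install could be invoked as
# something like ["whisper", "clip.wav", "--model", "base"], shown here only
# for illustration. A stand-in command keeps the sketch runnable anywhere:
text = transcribe_external(["echo", "transcribed lyrics go here"])
```

Note that a long-running transcription will block the cook loop if run synchronously, so for live use you would typically launch it in a background thread or poll the process instead of calling `communicate()` directly.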
| Option | Pros | Considerations |
|---|---|---|
| Whisper Plugin | Seamless TouchDesigner integration, easy setup | Requires OpenAI API key; transcription accuracy varies for singing |
| DIY Python Implementation | Highly flexible and controllable | More setup and Python scripting required |
- Want to explore the Whisper and ChatGPT TouchDesigner components? I can walk you through setting them up, or help integrate them into your project.
- Curious about improving transcription accuracy for singing? We can look into strategies like audio preprocessing, using denoising filters, or model fine-tuning.
- Interested in adding voice output? We can integrate TTS tools such as ElevenLabs or Google TTS to complete the interaction loop.
Let me know which direction you're leaning toward, and I'm happy to assist with the technical details or walk you through examples tailored to your workflow!