Building realtime voice apps with LLMs is powerful but hard. You have to orchestrate speech recognition, the LLM, and speech synthesis in real time (all async), while also handling the complexities of conversation, like knowing when someone has finished speaking or dealing with interruptions.
Our library is easy to get up and running–you can set up a conversation in <15 lines of code. Check out our Gen Z GPT hotline demo: https://replit.com/@vocode/Gen-Z-Phone (try it out at +1-650-729-9536).
It all started with our PrankGPT project, which we built for fun (quick demo at https://www.loom.com/share/0d0d68f1a62f409eb5ae24521293d2dc). We realized how powerful voice + LLMs are, but also how hard it was to build with them.
Once we got everything working, it was really cool and useful. Talking to LLMs was better than any voice AI experience we’d had before. And we imagined a host of cool applications that people could build on top of that.
So, we decided to build a developer tool to make it easy. Our library is open source and gives you everything you need in a single place.
We give you a bunch of out-of-the-box integrations with speech recognition and speech synthesis providers and let you swap them out easily. We have platform support across web and telephony (via Twilio), with mobile coming soon. We also provide abstractions for streaming conversations (good for realtime apps like phone calls) and for command-based/turn-based applications (like voice-based chess). And we let you customize how the conversation is run: things like how to tell when someone has finished speaking, changing emotion, sending filler audio if there are delays, etc.
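To make the "swap providers" and "know when someone is finished speaking" ideas a bit more concrete, here's a rough Python sketch. To be clear, this is not our actual API: `Agent`, `Synthesizer`, `EchoAgent`, `FakeSynthesizer`, `SilenceEndpointer`, and `run_turns` are all hypothetical names made up for illustration, and the "end of turn" rule (a couple of empty transcript chunks in a row) is a deliberately naive stand-in for real endpointing.

```python
import asyncio
from dataclasses import dataclass
from typing import Iterable, List, Protocol


class Agent(Protocol):
    """Anything that turns the user's words into a reply (e.g. an LLM call)."""
    async def respond(self, text: str) -> str: ...


class Synthesizer(Protocol):
    """Anything that turns reply text into audio bytes."""
    async def synthesize(self, text: str) -> bytes: ...


class EchoAgent:
    """Hypothetical stand-in for an LLM: just echoes the user."""
    async def respond(self, text: str) -> str:
        await asyncio.sleep(0)  # pretend to wait on a network call
        return f"You said: {text}"


class FakeSynthesizer:
    """Hypothetical stand-in for a TTS provider: 'audio' is UTF-8 text."""
    async def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


@dataclass
class SilenceEndpointer:
    """Naive endpointing: the turn ends after N empty chunks in a row."""
    max_silent_chunks: int = 2
    _silent: int = 0

    def is_turn_over(self, transcript_chunk: str) -> bool:
        if transcript_chunk.strip():
            self._silent = 0  # the user is still talking
            return False
        self._silent += 1
        return self._silent >= self.max_silent_chunks


async def run_turns(
    chunks: Iterable[str],
    agent: Agent,
    synthesizer: Synthesizer,
    endpointer: SilenceEndpointer,
) -> List[bytes]:
    """Collect words until the endpointer says the turn is over, then reply."""
    replies: List[bytes] = []
    words: List[str] = []
    for chunk in chunks:
        if chunk.strip():
            words.append(chunk.strip())
        if endpointer.is_turn_over(chunk) and words:
            reply = await agent.respond(" ".join(words))
            replies.append(await synthesizer.synthesize(reply))
            words.clear()
    return replies
```

Because `run_turns` only depends on the two small protocols, swapping a provider in this sketch just means passing a different object with the same method, and changing how turn-taking works means passing a different endpointer. A real streaming setup would also need to handle interruptions and run the stages concurrently, which this sketch leaves out.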
In terms of “how do you make money” – we have a hosted version that we’re going to charge for (though right now you can get it for free! https://app.vocode.dev) and we're also going to build enterprise products in the future.
We’d love for you to try it out and give us some feedback! And, if you have any demos you'd like to see – let us know and we’ll take a crack at building them. We’re curious about your experiences using or building voice AI, what features or use cases you’d love to see, and any other ideas you have to share!