Retell AI (YC W24) – Conversational Speech API for Your LLM

Hey HN, we're the co-founders of Retell AI (https://www.retellai.com/). We are building a conversational speech engine to help developers build natural-sounding voice AI. Our API abstracts away the complexities of AI voice conversations, so you can make your voice application the best at what it does. Here's a demo video: https://www.youtube.com/watch?v=0LT64_mgkro.

With the advent of LLMs and recent breakthroughs in speech synthesis, conversational voice AI has just gotten good enough to create really exciting use cases. However, developers often underestimate what's required to build a good and natural-sounding conversational voice AI. Many simply stitch together ASR (speech-to-text), an LLM, and TTS (text-to-speech), and expect to get a great experience. It turns out it's not that simple.

There's more going on in conversation than we consciously realize: knowing when to speak and when to listen, handling interruptions, responding within 0-200 ms, and using backchannel phrases (e.g., "yeah", "uh huh") to signal that you are listening. These come naturally to humans but are hard for AI to get right. Developers spend hundreds of hours on the AI conversation experience but still end up with poor results such as 4-5 s latencies, inappropriate cutoffs, and the AI and the user speaking over each other.
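To make the latency problem concrete, here is a minimal sketch of how delays stack up when the stages are simply run one after another. The `Stage` type, the `end_to_end_latency_ms` helper, and every per-stage number are hypothetical illustrations, not measurements of any real system; they are only chosen to show how a naive pipeline can land in the 4-5 s range mentioned above.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Stage:
    name: str
    latency_ms: int  # hypothetical figure for illustration, not a measurement


def end_to_end_latency_ms(stages: Iterable[Stage]) -> int:
    """Latency of a strictly sequential pipeline.

    Each stage waits for the previous one to finish completely,
    so the user-perceived delay is the sum of all stage delays.
    """
    return sum(stage.latency_ms for stage in stages)


# Assumed numbers: wait for silence, transcribe, generate the whole
# reply, then synthesize the whole reply before playing any audio.
NAIVE_PIPELINE = [
    Stage("endpointing (wait for silence)", 1000),
    Stage("ASR (speech-to-text)", 500),
    Stage("LLM (full response)", 2000),
    Stage("TTS (full audio)", 1000),
]

if __name__ == "__main__":
    for stage in NAIVE_PIPELINE:
        print(f"{stage.name}: {stage.latency_ms} ms")
    print(f"total: {end_to_end_latency_ms(NAIVE_PIPELINE)} ms")
```

With these assumed numbers the total is 4500 ms. Getting well below that generally means overlapping the stages (streaming partial results between them) rather than only speeding up each one.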

So, we built Retell AI. We follow the overall paradigm of speech-to-text, LLM, and text-to-speech components, but have added conversation models in between to orchestrate the conversation, while allowing maximum configurability for developers at each step. You can think of our models as adding a “domain expert” layer for the dynamics of conversation itself.

Retell is designed for you to bring your own LLM into our pipeline. Currently, we achieve 800 ms end-to-end latency, handle interruptions, and support speech isolation, with many customization options (e.g., speaking rate, voice temperature, ambient sound). We created a guest account for HN, so you can try our playground with a 10-minute free trial without logging in: https://beta.retellai.com/dashboard/hn (Playground tutorial: https://docs.retellai.com/guide/dashboard). Our pricing is usage-based, at $0.10-0.17/min.
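For a rough sense of what usage-based pricing means in practice, here is a small sketch that turns the per-minute range quoted above into a cost range for a given call volume. The function name `estimate_cost_range` is hypothetical, and the sketch assumes plain per-minute multiplication with no rounding, minimums, or volume tiers; the actual billing rules may differ.

```python
from decimal import Decimal
from typing import Tuple

# Per-minute price range quoted in the post (USD).
LOW_PER_MIN = Decimal("0.10")
HIGH_PER_MIN = Decimal("0.17")


def estimate_cost_range(minutes: float) -> Tuple[Decimal, Decimal]:
    """Return (low, high) estimated cost in USD for the given call minutes.

    Assumption: cost is simply minutes * rate, with no rounding rules,
    minimum charges, or volume discounts applied.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    total_minutes = Decimal(str(minutes))
    return total_minutes * LOW_PER_MIN, total_minutes * HIGH_PER_MIN


if __name__ == "__main__":
    low, high = estimate_cost_range(1000)
    print(f"1000 minutes: ${low:.2f} to ${high:.2f}")
```

`Decimal` is used instead of floats so that cent amounts stay exact when multiplied.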

Our main product is a developer-facing API, but you can try it without writing code (e.g., create agents and connect them to a phone number) via our dashboard. If you want to test it in production, feel free to self-serve using our API documentation. One of our customers just launched, and you can view their demo: https://www.loom.com/share/64f09a53bf6d4b3799e5ebd08b23fec4?...

We are thrilled to see what our users are building with our API. We’re excited to show our product to the community and look forward to your feedback!


