We’re both physicists, and one of our biggest frustrations during grad school was finding research. There were a lot of times when we had to sit down to scope out new ideas for a project and quickly become deep experts, or find solutions to really complex technical problems, and the only way to do that was to manually dig through papers on Google Scholar for hours. It was very tedious, to the point where we would often just skip the careful research and hope for the best. Sometimes we’d get burned a few months later because someone had already solved the problem we thought was novel and important, or we’d waste time building a solution when one already existed.
The problem is that there’s just no easy way to figure out what others have done in research and load it into your brain. It’s one of the biggest bottlenecks for doing truly good, important research.
We wanted to fix that. LLMs clearly help, but are mostly limited to general knowledge. Instead, we needed something that would pull in research papers and give you exactly what you need to know, even for very complex ideas and topics. We realized the way to do this is to mimic the research strategies we already know work, because we use them ourselves, so we built an agent-like LLM pipeline that searches carefully in the same way.
Our search system works a bit differently from casual search engines. First, we have you chat back and forth with an LLM to make sure we actually understand your really complex research goals up front, like you’re talking to a colleague. Then the system carefully searches for you for ~3 minutes. At a high level, it does something similar to tree search, following citation rabbit holes and adapting based on what it discovers to look for more content over multiple iterations (the same way you would if you decided to spend a few hours). The 3-minute delay is annoying, but we’re optimizing for quality of results rather than latency right now. At the end, there’s a report.
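To make the tree-search idea concrete, here is a toy sketch in Python. It is an illustration only, not the actual pipeline: the citation graph, the paper IDs, the `relevance` function (a stand-in for an LLM judging an abstract), and the `threshold` and `max_expansions` parameters are all assumptions made up for this example.

```python
import heapq

def citation_search(seeds, citations, relevance, threshold=0.5, max_expansions=100):
    """Best-first search over a citation graph (illustrative only).

    seeds:      starting paper ids, e.g. from an initial keyword search
    citations:  dict mapping a paper id to the ids it links to
    relevance:  function paper id -> score in [0, 1]; a hypothetical
                stand-in for an LLM reading the abstract
    """
    seen = set(seeds)
    # heapq is a min-heap, so scores are negated to pop the best paper first.
    frontier = [(-relevance(p), p) for p in seeds]
    heapq.heapify(frontier)
    found = []
    expansions = 0
    while frontier and expansions < max_expansions:
        neg_score, paper = heapq.heappop(frontier)
        expansions += 1
        score = -neg_score
        if score < threshold:
            continue  # weak lead: do not follow its citations
        found.append((paper, score))
        for linked in citations.get(paper, []):
            if linked not in seen:
                seen.add(linked)
                heapq.heappush(frontier, (-relevance(linked), linked))
    return found

# Made-up graph and scores, purely for demonstration.
toy_citations = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": ["F"], "F": []}
toy_scores = {"A": 0.9, "B": 0.8, "C": 0.2, "D": 0.7, "E": 0.9, "F": 0.9}

results = citation_search(["A"], toy_citations, toy_scores.get)
print([paper for paper, _ in results])
```

In this toy graph, paper C scores below the threshold, so the search never follows it to E and F. That shows the trade-off in any pruned search: pruning saves work, but a poorly scored intermediate paper can hide good ones behind it, which is one reason to adapt the strategy over multiple iterations rather than rely on a single pass.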
We’re trying to achieve two things with this careful, systematic agent-like discovery process:
1. We want to be very accurate, and only recommend very specific results if you ask for a specific topic. To do this, we carefully read and evaluate content from papers with the highest-quality LLMs (we’re just reading abstracts and citations for now, because they’re more widely accessible, but we’re also working on adding full texts).
2. We want to find everything relevant to your search, because in research it’s crucial to know if something exists or not. The key to being exhaustive is the adaptive algorithms we’ve developed (following citations, changing strategy based on what we find, etc). One cool feature of the automated pipeline is that we can also track the discovery process as the search proceeds. Early on, we find many good results, and later on they get more sparse, until all the good leads are exhausted and we stop finding anything helpful. We can statistically model that process and figure out when we’ve found everything. It actually has an interesting exponential saturation behavior, which you can read a bit more about in our whitepaper, written for a previous prototype: https://www.undermind.ai/static/Undermind_whitepaper.pdf
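As a rough illustration of how exponential saturation lets you estimate completeness, here is a small Python sketch. It is not the model from the whitepaper. It assumes the cumulative number of relevant papers after k equal-sized search steps follows N(k) = N_total * (1 - r^k) with 0 <= r < 1, so the gains per step shrink geometrically. The function name, the fitting method (least squares on successive gains), and the sample numbers are all hypothetical.

```python
def estimate_total(cumulative):
    """Estimate the total number of relevant papers from a saturating curve.

    cumulative[k] is the count of relevant papers found after k+1 equal-sized
    search steps. Assumes per-step gains decay geometrically with ratio r.
    """
    gains = [b - a for a, b in zip(cumulative, cumulative[1:])]
    if len(gains) < 2:
        raise ValueError("need at least three cumulative counts")
    if not any(gains):
        return float(cumulative[-1])  # nothing new found: already saturated
    # Least-squares fit of gains[k+1] = r * gains[k] through the origin.
    num = sum(g0 * g1 for g0, g1 in zip(gains, gains[1:]))
    den = sum(g0 * g0 for g0 in gains[:-1])
    if den == 0:
        raise ValueError("cannot estimate a decay ratio from these counts")
    r = num / den
    if not 0 <= r < 1:
        raise ValueError("gains are not decaying; the model does not apply")
    # The remaining gains form a geometric series: g*r + g*r^2 + ... = g*r/(1-r).
    remaining = gains[-1] * r / (1 - r)
    return cumulative[-1] + remaining

# Synthetic counts generated from N_total = 40 and r = 0.5 (made-up numbers).
counts = [20.0, 30.0, 35.0, 37.5, 38.75]
total = estimate_total(counts)
coverage = counts[-1] / total
print(total, coverage)
```

The ratio `coverage` is the estimated fraction of relevant papers found so far, which is the kind of number a search could use to decide when to stop. Real discovery curves are noisy, so a practical version would need uncertainty estimates rather than a single point value.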
You can try searching yourself here: https://www.undermind.ai/query_app/promotion/. This is a special HN link where, for today, we’ve dropped the signup gate for your first few searches. Usually we require login so you can save searches.
We’re excited to share this with you! We’d love to hear about your experiences searching, what’s clear or not, and any feedback. We’ll be here to answer any questions or comments.