We love Copilot and believe that AI will change the way developers work. Max wanted to use Copilot when he was an ML engineer at Meta, but leadership blocked him because Copilot requires sending company code to GitHub and OpenAI. We built CodeComplete because lots of other companies are in the same boat, and we want to offer a secure way for these companies to leverage the latest AI-powered dev tools.
To that end, our product is really meant for large engineering teams at enterprise companies who can’t use GitHub Copilot. This generally means teams with more than 200 developers that have strict practices against sending their code or other IP externally.
CodeComplete offers an experience similar to Copilot; we serve AI code completions as developers type in their IDEs. However, instead of sending private code snippets to GitHub or OpenAI, we use a self-hosted LLM to serve code completions. Another advantage of self-hosting is that it’s more straightforward to securely fine-tune the model on the company’s codebase. Copilot suggestions aren’t always tailored to a company’s coding patterns or internal libraries, so fine-tuning helps make our completions more relevant and avoids adding tech debt.
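For the curious, here’s a rough sketch of what serving completions from a self-hosted model can look like. The model name, framework, and endpoint below are illustrative assumptions, not our actual stack:

    # Illustrative sketch of a self-hosted completion endpoint (not our real serving code).
    from fastapi import FastAPI
    from pydantic import BaseModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL_NAME = "Salesforce/codegen-350M-mono"  # placeholder open-source code model

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

    app = FastAPI()

    class CompletionRequest(BaseModel):
        prefix: str              # code before the cursor, sent by the IDE extension
        max_new_tokens: int = 32

    @app.post("/v1/complete")
    def complete(req: CompletionRequest) -> dict:
        inputs = tokenizer(req.prefix, return_tensors="pt")
        outputs = model.generate(**inputs, max_new_tokens=req.max_new_tokens)
        # Return only the newly generated tokens, not the echoed prefix.
        new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
        return {"completion": tokenizer.decode(new_tokens, skip_special_tokens=True)}

In practice the serving layer also needs things like batching and GPU-aware scheduling to keep latency low enough for in-line suggestions.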
To serve code completions, we start with open-source foundation models and augment them with additional (permissively licensed) datasets. Our models live behind your firewall, either in your cloud or on-premises. For cloud deployments, we have Terraform scripts that set up our infrastructure and pull in our containers. On-prem deployments are a bit more involved; we work with the customer to design a custom solution. Once everything’s set up, we train on your codebase and then start serving code completions.
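The "train on your codebase" step is essentially a causal-LM fine-tune over the customer’s source files. The sketch below is illustrative only; the model, paths, and hyperparameters are placeholders, and a real pipeline would filter out secrets, vendored code, and generated files first:

    # Illustrative sketch of fine-tuning on a local repository checkout (placeholders throughout).
    from pathlib import Path
    from datasets import Dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer, TrainingArguments)

    MODEL_NAME = "Salesforce/codegen-350M-mono"   # placeholder base model
    REPO_ROOT = Path("/srv/customer-repo")        # placeholder checkout inside the customer network

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

    # Gather source files; a production pipeline would also filter secrets,
    # vendored code, and generated files before training.
    files = [p.read_text(errors="ignore") for p in REPO_ROOT.rglob("*.py")]
    dataset = Dataset.from_dict({"text": files})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=1024)

    tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="/srv/finetune-out",
                               num_train_epochs=1,
                               per_device_train_batch_size=1),
        train_dataset=tokenized,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    trainer.save_model("/srv/finetune-out/final")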
To use our product, developers simply download our extension in their IDE (VS Code supported today, JetBrains coming soon). After authentication, the extension provides in-line code completion suggestions as they type.
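The extension itself is a VS Code plugin, but the round trip it makes is simple. Here’s an illustrative Python version of the request/response shape against the endpoint sketched above; the hostname, path, and fields are assumptions, not our actual API:

    # Illustration of the request an IDE extension might make to the self-hosted endpoint.
    import requests

    prefix = "def int_to_roman(num: int) -> str:\n    "
    resp = requests.post(
        "https://codecomplete.internal.example/v1/complete",  # hypothetical internal hostname
        json={"prefix": prefix, "max_new_tokens": 48},
        timeout=2,  # suggestions need to come back fast enough to feel in-line
    )
    print(resp.json()["completion"])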
Since we’re a self-hosted enterprise product, we don’t have an online version you can just try out, but here are two quick demos: (1) Python completion, fine-tuned on a mock Twitter-like codebase: https://youtu.be/YqkqtGY4qmc. (2) Java completion for "leetcode"-style problems, like converting integers to Roman numerals: https://youtu.be/H4tGoFNC8oI.
We take privacy and security seriously. By default, our deployments only send heartbeat messages back to our servers. Our product logs usage data and code snippets to the company’s own internal database so that they can evaluate our performance and improve their models over time. Companies have the option to share a subset of that data with us (e.g., completion acceptance rate, model output probabilities, latencies), but we don’t require it. We never see your code or any other intellectual property.
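To make the split concrete, here’s a hypothetical shape for that data: full records (including code snippets) stay in the customer’s own database, and only aggregate metrics are optionally shared with us. Field names here are illustrative, not our real schema:

    # Hypothetical telemetry shape: per-completion records stay internal;
    # only aggregates are optionally shared.
    from dataclasses import dataclass

    @dataclass
    class CompletionEvent:
        # Full records (including code snippets) live only in the customer's internal database.
        prompt_snippet: str
        completion: str
        accepted: bool
        latency_ms: float
        model_logprob: float

    def optional_shared_metrics(events: list[CompletionEvent]) -> dict:
        # Aggregate subset a customer may choose to share with us; no code leaves the network.
        n = len(events)
        return {
            "num_events": n,
            "acceptance_rate": sum(e.accepted for e in events) / n,
            "mean_latency_ms": sum(e.latency_ms for e in events) / n,
            "mean_model_logprob": sum(e.model_logprob for e in events) / n,
        }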
We charge based on seat licenses. For enterprise companies, these contracts often involve custom scoping and requirements. In general, though, our pricing will be at a premium to GitHub Copilot, since there is significant technical and operational overhead in offering a self-hosted product like this.
Having access to these types of tools would have saved us a bunch of time in our previous jobs, so we’re really excited to show this to everyone. If you are having similar issues with security and privacy at your current company, please reach out to us at [email protected]! We’d love to hear your feedback.