UpTrain (YC W23) – Open-source performance monitoring for ML models

Hello, we are Shikha, Sourabh, and Vipul, co-founders of UpTrain, an open-source ML observability toolkit. UpTrain helps you monitor the performance of your machine learning applications, alerts you when they go wrong, and helps you improve them by narrowing down the data points to retrain on, all in the same loop.

Our website is at https://uptrain.ai/ and our GitHub repo is at https://github.com/uptrain-ai/uptrain

ML models tend to perform poorly when presented with new, previously unseen cases, and their performance deteriorates over time as real-world environments evolve, which can degrade business metrics. In fact, one of our customers (a social media platform with 150 million MAU) was tired of discovering model issues via customer complaints (and increased churn) and wanted an observability solution to identify them proactively.

UpTrain monitors the difference between the dataset the model was trained on and the real-world data it encounters during production (the wild!). This "difference" can be quantified with custom statistical measures designed by ML practitioners for their use case. That point about customization is important because, in most cases, there’s no “ground truth” to check whether a model’s output is correct. Instead, you need statistical measures to detect drift or performance degradation, and those require domain expertise and differ from case to case. For example, for a text summarization model you might monitor drift in the sentiment of the input text, while for a human pose estimation model you might add integrity checks on the predicted body length.
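To make the idea of a statistical drift measure concrete, here is a minimal sketch in plain Python. It is not UpTrain's API: the function name `population_stability_index`, the choice of the Population Stability Index as the measure, the bin count, and the smoothing constant are all assumptions picked for illustration. It compares one numeric feature (say, an input sentiment score) between training data and production data.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    production: Sequence[float],
    bins: int = 10,
) -> float:
    """Illustrative drift score: 0 means identical binned distributions."""
    if not reference or not production:
        raise ValueError("both samples must be non-empty")

    # Bin edges come from the reference (training) data only.
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # all reference values identical; avoid dividing by zero

    def fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the reference range fall into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        eps = 1e-6  # assumed smoothing so empty bins do not produce log(0)
        return [max(c / len(values), eps) for c in counts]

    ref = fractions(reference)
    prod = fractions(production)
    return sum((p - r) * math.log(p / r) for r, p in zip(ref, prod))
```

A common rule of thumb treats a score above roughly 0.2 as a shift worth investigating, but the right threshold depends on the feature and the use case, which is exactly why the measure needs to be customizable.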

Additionally, we monitor for edge cases, which are defined as rule-based smart signals on the model input. Whenever UpTrain sees a distribution shift or an increased frequency of edge cases, it raises an alert and identifies the subset of data that experienced these issues. Finally, it retrains the model on that data, improving its performance in the wild.
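As a rough illustration of rule-based edge-case signals (again a sketch, not UpTrain's actual interface), the snippet below applies a few rules to incoming records, collects the records each rule flags, and reports which rules fire more often than an allowed rate. The rule names, the 500-word cutoff, the `text` field, and the helper names `flag_edge_cases` and `signals_to_alert` are all hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Signal = Callable[[Record], bool]

# Hypothetical rules for the input of a text summarization model.
SIGNALS: Dict[str, Signal] = {
    "empty_input": lambda r: not r["text"].strip(),
    "very_long_input": lambda r: len(r["text"].split()) > 500,
}


def flag_edge_cases(
    records: List[Record], signals: Dict[str, Signal]
) -> Dict[str, List[Record]]:
    """Group the records that trigger each signal, keyed by signal name."""
    flagged: Dict[str, List[Record]] = {name: [] for name in signals}
    for record in records:
        for name, rule in signals.items():
            if rule(record):
                flagged[name].append(record)
    return flagged


def signals_to_alert(
    records: List[Record],
    flagged: Dict[str, List[Record]],
    max_rate: float,
) -> List[str]:
    """Return the signals whose share of flagged records exceeds max_rate."""
    if not records:
        return []
    return [
        name
        for name, hits in flagged.items()
        if len(hits) / len(records) > max_rate
    ]
```

The flagged records double as the candidate subset for retraining, since they are the data points the rules identified as problematic.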

Before UpTrain, we explored many observability tools at previous companies (Bytedance, Meta, and Bosch), but always got stuck figuring out what issues our models were facing in production. We used to go through user reviews, find patterns around model failures and manually retrain our models. This was time-consuming and opaque. Customizing our monitoring metrics and having a solution built specifically for ML models was a big need that wasn’t fulfilled.

Additionally, many ML models operate on user-sensitive data, and we didn’t want to send users’ private data to third parties. From a privacy perspective, relying on third-party hosted solutions just felt wrong, which motivated us to create an open-source, self-hosted alternative.

We are building UpTrain to make model monitoring effortless. With a single-line integration, our toolkit lets you detect dips in model performance using real-time dashboards, sends you Slack alerts, helps you pinpoint poor-performing cohorts, and more. UpTrain is built specifically for ML use cases, providing tools to monitor data distribution shifts, identify production data points with low representation in the training data, and visualize and detect drift in embeddings. For more about our key features, see https://docs.uptrain.ai/docs/key-features

Our tool is available as a Python package that can be installed on top of your deployment infrastructure (AWS, GCP, Azure). Since ML models operate on user-sensitive data, and sharing it with external servers is often a barrier to using third-party tools, we focus on deploying to your own cloud.

We’ve launched this repo under an Apache 2.0 license to make it easy for individual developers to integrate it into their production apps. For monetization, we plan to build enterprise-level integrations that will include a managed service and support. In the next few months, we plan to add more advanced observability measures for large language models and generative AI, as well as make UpTrain easier to integrate with other tools like Weights and Biases, Databricks, Kubernetes, and Airflow.

We would love for you to try out our GitHub repo and give your feedback, and we look forward to all of your comments!


