Including Funding rounds, Bull / Bear thesis, Stock + earnings, Roster changes, Patents, News, and Open roles.
Already subscribed? Sign in →
Nikola Borisov earned a computer science degree with an economics minor from Northwestern University, where he was a competitive programmer and won ACM ICPC regional championships. He spent roughly a decade at the messaging company imo.im, rising to Director of Engineering and building a highly scalable microservice backend serving over 200 million monthly active users, then joined the founding team of HalloApp as a backend engineer. In 2022 he co-founded DeepInfra with Georgios Papoutsis and Yessenzhar Kanapin to host open-source AI models behind a simple, low-cost inference API. As CEO he leads the company, which reports processing trillions of tokens per week.
Georgios Papoutsis is a co-founder and engineer at DeepInfra with a background in physics and engineering, holding a Vordiplom in physics from Technische Universität Berlin and having studied at Technische Universität München. Like his co-founders, he has a long track record in international programming and math competitions. The founding team met while building the backend infrastructure for the imo messaging platform before starting DeepInfra in 2022 to serve open-source AI models in production.
Yessenzhar Kanapin is a co-founder of DeepInfra and studied at Kazakh-British Technical University. He has a background winning international coding competitions and previously worked with his co-founders building the backend infrastructure for the imo messaging platform, which served more than 200 million monthly active users. He helped start DeepInfra in 2022 to host open-source and agent-driven AI models behind a low-cost inference API.
No articles ingested yet for DeepInfra. Once the hourly news pipeline is live, every article the classifier tags as mentioning this company appears here with its one-line AI summary and sentiment.
A serverless, OpenAI-compatible API that gives developers on-demand access to a large catalog of open-source models spanning text generation, image generation, speech recognition, and embeddings. Users send requests without provisioning or managing any GPU hardware, paying per token or per request at prices DeepInfra positions well below hyperscaler rates. The service is built from DeepInfra's own GPU hardware up through the API layer, and is designed for production-scale workloads, processing nearly five trillion tokens per week across its customer base.
Dedicated GPU capacity for teams that need reserved hardware for high-volume or latency-sensitive inference rather than shared serverless endpoints. Customers can run their own or custom models on isolated GPUs while still using DeepInfra's managed serving stack, avoiding the overhead of building and operating their own inference infrastructure. The offering targets enterprises and scaleups running open-source and agent-driven AI workloads that want predictable performance, cost control, and freedom from proprietary-model vendor lock-in.
We don't have a live feed for this company's ATS. Their careers page has every open role.
View all careers ↗