Bringing RL to the Enterprise: AgileRL Raises $7.5M to date

Param Kumar & Nick Ustaran-Anderegg

January 7, 2026

Why we raised $7.5M to accelerate reinforcement learning

For too long, reinforcement learning (RL) has been too slow, too difficult, and too expensive

When we built RL agents at our last companies, we saw first-hand the difficulties of accessing and implementing these state-of-the-art AI systems. 

To get even a basic RL system running in production, companies need to invest millions of dollars in top AI researchers and compute. The result tends to be slow, brittle, and prohibitively expensive for anyone without Big Tech resources.

Today, we’re thrilled to announce that we’ve raised $7.5 million to date to make RL available to any company, at a fraction of the time and cost

RL is the gold standard for complex decision-making, from robotics and autonomous vehicles to high-frequency trading and LLM fine-tuning. Agentic AI is in huge demand, but building these systems comes with serious barriers.

Typically, companies need to:

  • Hire several PhD-level AI engineers
  • Build complex simulators and reward designs
  • Create distributed training pipelines at scale
  • Fine-tune endless hyperparameters that eat up weeks of compute
  • Safely deploy and monitor agents that are taking actions with real consequences

New use-cases often require rebuilding this infrastructure from scratch, which is a major bottleneck for innovation.

AgileRL is speeding up training and deployment 10x with RLOps

AgileRL was created to take RL from a research project to a standard engineering tool. Our platform, Arena, provides an end-to-end RLOps workflow that handles the heavy lifting.

One of AgileRL's key innovations is our Evolutionary Hyperparameter Optimization framework, built specifically for the challenges of RL training. Instead of training one agent and hoping for the best, we train a population of agents that share learnings and "evolve" in real time. The best performers survive and adapt, while the weak ones are discarded.

The result is both a 10x improvement in training speed and superior model performance. This all happens automatically, and applies to any RL algorithm, neural network architecture, or environment.
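To make the idea concrete, here is a minimal sketch of population-based evolutionary hyperparameter optimization. It is not AgileRL's API or Arena's implementation: the names (Member, train_step, tournament, mutate, evolve) are hypothetical, and a toy quadratic objective stands in for an RL environment. The only hyperparameter evolved is a learning rate, and selection uses a simple tournament with elitism, which is one common choice among many.

```python
import random
from dataclasses import dataclass

TARGET = 3.0  # optimum of the toy objective (stand-in for "solving" an environment)


@dataclass
class Member:
    lr: float                       # hyperparameter being evolved
    weight: float = 0.0             # stand-in for the agent's learned parameters
    fitness: float = float("-inf")  # higher is better


def train_step(m: Member, steps: int = 20) -> None:
    """Gradient descent on (weight - TARGET)^2 using the member's own learning rate."""
    for _ in range(steps):
        grad = 2.0 * (m.weight - TARGET)
        m.weight -= m.lr * grad


def evaluate(m: Member) -> float:
    """Fitness is the negative squared error, so 0 is the best possible score."""
    return -((m.weight - TARGET) ** 2)


def tournament(pop: list, rng: random.Random, k: int = 2) -> Member:
    """Pick k random members and return the fittest one."""
    return max(rng.sample(pop, k), key=lambda m: m.fitness)


def mutate(parent: Member, rng: random.Random) -> Member:
    """Copy the parent's learned weight and perturb its learning rate."""
    factor = rng.choice([0.8, 1.0, 1.2])
    lr = min(max(parent.lr * factor, 1e-4), 0.9)  # keep lr in a stable range
    return Member(lr=lr, weight=parent.weight, fitness=parent.fitness)


def evolve(pop_size: int = 6, generations: int = 10, seed: int = 0):
    rng = random.Random(seed)
    pop = [Member(lr=10 ** rng.uniform(-4, -1)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        for m in pop:
            train_step(m)
            m.fitness = evaluate(m)
        elite = max(pop, key=lambda m: m.fitness)
        history.append(elite.fitness)
        # Elitism: keep the best member unchanged, then refill the population
        # with mutated copies of tournament winners. Weak members drop out.
        children = [mutate(tournament(pop, rng), rng) for _ in range(pop_size - 1)]
        pop = [Member(elite.lr, elite.weight, elite.fitness)] + children
    return elite, history


if __name__ == "__main__":
    best, history = evolve()
    print(f"best lr={best.lr:.4f}, fitness={best.fitness:.3e}")
```

Because children inherit the parent's learned weight, progress made by strong members is shared across the population rather than relearned from scratch, which is the intuition behind training a population instead of running independent trials.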

Arena also provides other state-of-the-art features, including environment validation, version control, one-click deployment, distributed compute, and performance visualization, which together enable companies to build and deploy superhuman intelligent systems.

Beyond Arena, AgileRL also provides an open-source framework for RL training and evolutionary HPO, which has received a fantastic response from the community. The framework has already surpassed 300,000 downloads, and is used by engineers at JPMorgan, IBM, Wayve, and Huawei. We are also seeing incredible research applications at MIT, Carnegie Mellon, and the University of Waterloo.

Seeing our tools power real-world systems used in defense, finance, robotics, LLMs and more is what keeps our team moving.

What’s Next:

This funding, led by Fusion Fund, along with Flying Fish, Octopus Ventures and Counterview Capital, allows us to scale as fast as our agents do.

Over the coming months, we will be: 

  • Expanding our team: We are hiring for over a dozen roles across engineering and go-to-market
  • Going global: We are opening our San Francisco office to better support and grow our North American customer base
  • Product evolution: We are doubling down on support for multi-agent systems and LLM optimization, to ensure AgileRL remains the most performant stack for the next generation of AI

A massive THANK YOU to everyone who made this possible.

Read more on Business Insider.