Since launching in 2019 as an ML optimization tool, OctoML has raised $132 million and added multiple features for ML model deployment, positioning itself as a significant player in the machine learning field. The company is now launching OctoAI, shifting its focus from merely optimizing models to enabling businesses to build on open-source models, fine-tune them with their own data, or bring their own custom models. OctoAI is a self-optimizing AI compute service that caters to generative AI, simplifying infrastructure management and letting businesses focus on building ML-based applications.
Luis Ceze, the co-founder and CEO of OctoML, said that the previous platform was aimed at ML engineers, streamlining how they packaged models and deployed them across different types of hardware. With the latest version, users instead decide what to prioritize, such as latency or cost, and OctoAI automatically determines the ideal hardware for the task. The new platform also optimizes the models automatically, improving both performance and cost efficiency.
While users can still choose the hardware their models run on and control the parameters, Ceze expects most users to prefer OctoAI's automated management. The service can decide whether to run a model on Nvidia's GPUs or on AWS's Inferentia machines. This removes much of the complexity of ML model deployment and addresses hurdles that have impeded many ML projects.
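To make the idea of priority-driven hardware selection concrete, here is a minimal Python sketch. It is not OctoAI's API or its actual selection logic, neither of which is described here. The names `HardwareOption` and `pick_hardware`, the option labels, and every latency and cost figure are hypothetical placeholders invented for illustration, not measurements of any real hardware.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class HardwareOption:
    """A hypothetical hardware target with made-up performance figures."""
    name: str
    latency_ms: float           # placeholder value, not a benchmark
    cost_per_1k_requests: float  # placeholder value, not a real price


# Illustrative placeholder numbers only; they say nothing about real hardware.
OPTIONS = (
    HardwareOption("nvidia-gpu", latency_ms=40.0, cost_per_1k_requests=0.50),
    HardwareOption("aws-inferentia", latency_ms=65.0, cost_per_1k_requests=0.30),
)


def pick_hardware(
    priority: str,
    options: Sequence[HardwareOption] = OPTIONS,
    pinned: Optional[str] = None,
) -> HardwareOption:
    """Pick a target by stated priority, unless the user pins one by name."""
    if pinned is not None:
        # Manual override: the user names the hardware explicitly.
        for option in options:
            if option.name == pinned:
                return option
        raise ValueError(f"unknown hardware: {pinned!r}")
    if priority == "latency":
        return min(options, key=lambda o: o.latency_ms)
    if priority == "cost":
        return min(options, key=lambda o: o.cost_per_1k_requests)
    raise ValueError(f"unknown priority: {priority!r}")
```

In this sketch, calling `pick_hardware("cost")` returns whichever option has the lowest placeholder cost, while passing `pinned="nvidia-gpu"` bypasses the automatic choice, mirroring the manual control users retain. A real service would presumably rely on measured performance per model rather than a static table.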
OctoML offers accelerated versions of popular foundation models, including Dolly 2, Whisper, FILM, FLAN-UL2, and Stable Diffusion, with plans to add more. In the company's own testing, its accelerated version of Stable Diffusion ran three times faster and cost five times less to run than the original model.
Although OctoML will continue working with existing clients that use the service to optimize their models, the company's focus going forward will be on OctoAI as its new compute platform. By streamlining ML deployment, platforms like OctoAI, along with low-code and no-code solutions such as AppMaster's, become relevant tools for businesses looking to harness AI and ML without having to manage complex infrastructure.