Databricks Unveils GPU and LLM Optimization Support for Databricks Model Serving

Databricks has released a public preview of GPU and LLM optimization support for Databricks Model Serving. The feature lets users deploy a wide range of AI models, including large language models (LLMs) and vision models, on the Lakehouse Platform.

Databricks Model Serving optimizes LLM serving automatically, delivering high performance without manual configuration. Databricks claims this is the first serverless GPU serving product built on a unified data and AI platform. Users can build and deploy generative AI (GenAI) applications within a single platform, covering every step from data ingestion to model deployment and monitoring.

With Databricks Model Serving, users can deploy AI models without deep infrastructure knowledge. Any model type is supported, including natural language, vision, audio, tabular, and custom models, regardless of how the model was trained: from scratch, from an open-source base, or fine-tuned on proprietary data.

To get started, users register their model with MLflow. Databricks Model Serving then builds a production-grade container that includes GPU libraries such as CUDA and deploys it on serverless GPUs. The fully managed service handles instance management, version compatibility, and patch updates, and it automatically scales instances up or down with traffic, which reduces infrastructure costs while optimizing performance and latency.
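The deployment step described above can be sketched as the request body a client might send to create a serving endpoint for a model already registered with MLflow. This is a minimal sketch, not Databricks' reference code: the helper `build_endpoint_request`, the endpoint and model names, and the default values are hypothetical, and the field names (`served_models`, `workload_type`, `workload_size`, `scale_to_zero_enabled`) are assumptions based on the serving endpoints REST API that should be checked against the current Databricks documentation. The code only builds the payload and does not send it.

```python
import json


def build_endpoint_request(endpoint_name, model_name, model_version,
                           workload_type="GPU_MEDIUM", workload_size="Small",
                           scale_to_zero=False):
    """Build a JSON-serializable body for a create-endpoint request.

    Field names are assumptions about the serving endpoints API;
    verify them against the current documentation before use.
    """
    # The registered model version is expected to be an integer (e.g. 1, "2").
    if not str(model_version).isdigit():
        raise ValueError("model_version must be a non-negative integer")
    return {
        "name": endpoint_name,
        "config": {
            "served_models": [
                {
                    "model_name": model_name,             # name in the MLflow registry
                    "model_version": str(model_version),  # registered version to serve
                    "workload_type": workload_type,       # assumed GPU workload label
                    "workload_size": workload_size,       # assumed concurrency tier
                    "scale_to_zero_enabled": scale_to_zero,
                }
            ]
        },
    }


if __name__ == "__main__":
    # Hypothetical endpoint and model names, for illustration only.
    body = build_endpoint_request("llama2-chat", "llama2_7b_chat", 1)
    print(json.dumps(body, indent=2))
```

Keeping the payload construction separate from the HTTP call makes the configuration easy to inspect and test before anything is sent to a workspace.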

Alongside GPU and LLM support, Databricks Model Serving has introduced optimizations for serving large language models more efficiently, which Databricks says reduce latency and cost by as much as 3-5x. To use Optimized LLM Serving, users only need to provide the model and its weights; Databricks handles the rest to ensure the model performs optimally.

This frees users from handling low-level model optimization, so they can focus on integrating LLMs into their applications. At present, Databricks Model Serving automatically optimizes MPT and Llama 2 models, with support for more models planned.

AppMaster, a no-code platform, is also known for its powerful features in handling backend, web, and mobile applications. Offering an integrated development environment, AppMaster simplifies the process of building and deploying applications, making it a strong player in the no-code market.
