Dell and Hugging Face partner to simplify LLM deployment


Almost every enterprise today is at least exploring what large language models (LLMs) and generative AI can do for their business. 

Still, just as with the dawn of cloud computing and big data and analytics, many concerns remain: Where do they start in deploying the complex technology? How can they ensure the security and privacy of their sensitive, proprietary data? And what about time- and resource-intensive fine-tuning? 

Today, Dell and Hugging Face are announcing a new partnership to help address these hurdles, simplify on-premises deployment of customized LLMs and enable enterprises to get the most out of the powerful, evolving technology. 

The impact of gen AI and AI in general will be “significant, in fact, it will be transformative,” Matt Baker, SVP for Dell AI strategy, said in a press pre-briefing. 

“This is the topic du jour, you can’t go anywhere without talking about generative AI or AI,” he added. “But it is advanced technology and it can be pretty daunting and complex.”

Dell and Hugging Face ‘embracing’ to support LLM adoption

With the partnership, the two companies will create a new Dell portal on the Hugging Face platform. This will include custom, dedicated containers, scripts and technical documents for deploying open-source models on Hugging Face with Dell servers and data storage systems. 

The service will first be offered for Dell PowerEdge servers and will be available through the APEX console. Baker explained that it will eventually extend to Precision and other Dell workstation tools. Over time, the portal will also release updated containers with optimized models for Dell infrastructure to support new gen AI use cases and models. 

“The only way you can take control of your AI destiny is by building your own AI, not being a user, but being a builder,” Jeff Boudier, head of product at Hugging Face, said during the pre-briefing. “You can only do that with open-source.”

The new partnership is the latest in a series of announcements from Dell as it seeks to be a leader in generative AI. The company recently added ObjectScale XF960 to its ObjectScale tools line. The S3-compatible, all-flash appliance is geared towards AI and analytics workflows. 

Dell also recently expanded its gen AI portfolio from initial-stage inferencing to model customization, tuning and deployment. 

Of the latest news, Baker noted with a laugh: “I’m trying to avoid the puns of Dell and Hugging Face ‘embracing’ on behalf of practitioners, but that’s in fact what we are doing.”

Challenges in adopting generative AI

There are undoubtedly many challenges in enterprise adoption of gen AI. “Customers report a plethora of issues,” said Baker. 

To name a few: complexity and closed ecosystems, time-to-value, vendor reliability and support, and ROI and cost management. 

Just as in the early days of big data, there’s also an overall challenge in progressing gen AI projects from proof of concept to production, he said. Organizations are also concerned about exposing their data as they seek to leverage it to gain insights and automate processes. 

“Today a lot of companies are stuck because they’re being asked to deliver on this new generative AI trend,” said Boudier, “while at the same time they cannot compromise their IP.”

Just look at popular code assistants such as GitHub Copilot, he said: “Isn’t it crazy that every time a developer at an organization types a keystroke on a keyboard, your company source code goes up on the internet?”

This underscores the value in — and need for — internalizing gen AI and ML apps. Dell research has found that enterprises overwhelmingly (83%) prefer on-prem or hybrid implementations. 

“There’s a significant advantage to deploying on-prem, particularly when you’re dealing with your most precious IP assets, your most precious artifacts,” said Baker. 

Curated models for performance, accuracy, use case

The new Dell Hugging Face portal will include curated sets of models selected for performance, accuracy, use cases and licenses, Baker explained. Organizations will be able to select their preferred model and Dell configuration, then deploy within their infrastructure.

“Imagine a Llama 2 model specifically configured and fine-tuned for your platform, ready to go,” Baker said. 

He pointed to use cases including marketing and sales content generation, chatbots and virtual assistants and software development. 

“We’re going to take the guesswork out of being a builder,” said Baker. “It’s the easy button to go to Hugging Face and deploy the capabilities you want and need in a way that takes away a lot of the minutiae and complexity.”

What makes this new offering different from the spate of others emerging almost daily is Dell’s ability to tune “top to bottom,” Baker contended. This allows enterprises to quickly deploy the best configuration of a given model or framework. 

He emphasized that enterprises won’t be exchanging any data with public models. “It’s your data and nobody else is touching your data except you,” he said, adding that, “once that model has been fine-tuned, it’s your model.”

Every company a vertical

Ultimately, tuning models for maximum output can be a time-consuming process, and many enterprises currently experimenting with gen AI are using retrieval augmented generation (RAG) alongside off-the-shelf LLM tools. 

RAG incorporates external knowledge sources to supplement internal information. The method allows users to find relevant data to create stepwise instructions for many generative tasks, Baker explained, and the pattern can be instantiated in pre-built containers. 

“Techniques like RAG are a way of in essence not having to build a model, but instead providing context to the model to achieve the right generative answer,” he said. 
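As a rough illustration of the pattern Baker describes, the sketch below retrieves the most relevant in-house document for a question and places it in the prompt as context. It is a toy under stated assumptions, not Dell's or Hugging Face's implementation: the documents, the function names and the bag-of-words similarity score are all hypothetical, and a production system would typically use an embedding model and a vector store instead.

```python
import math
import re
from collections import Counter

# Hypothetical in-house documents standing in for private enterprise data.
DOCUMENTS = [
    "Refunds are issued within 14 days of receiving the returned item.",
    "The on-prem cluster is patched on the first Sunday of every month.",
    "Sales staff may offer up to a 10 percent discount without approval.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Assemble a prompt that supplies retrieved context to an LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("How large a discount can sales staff offer?", DOCUMENTS))
```

The model itself is untouched here. Only the prompt changes, which is why this approach avoids the cost of training or fine-tuning.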

Dell aims to further simplify the fine-tuning process by providing a containerized tool based on the popular parameter-efficient fine-tuning techniques LoRA and QLoRA, he said.
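To give a sense of why these techniques are considered parameter efficient, here is a minimal sketch of the LoRA idea in plain Python. LoRA keeps the original weight matrix W frozen and trains two small matrices, B and A, whose scaled product is added to W; QLoRA additionally stores the frozen base weights in a quantized 4-bit format. The function names and the layer size are illustrative assumptions and say nothing about how Dell's containerized tool is built.

```python
def lora_param_counts(d_out, d_in, rank):
    """Compare trainable parameters: full fine-tuning vs. a LoRA adapter."""
    full = d_out * d_in            # every entry of W is trained
    lora = rank * (d_in + d_out)   # only B (d_out x rank) and A (rank x d_in)
    return full, lora


def matmul(x, y):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def lora_merge(w, b, a, alpha, rank):
    """Return the adapted weights W + (alpha / rank) * (B @ A)."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


if __name__ == "__main__":
    # Illustrative layer size; real models have many such layers.
    full, lora = lora_param_counts(4096, 4096, rank=8)
    print(f"full fine-tuning: {full:,} trainable parameters")
    print(f"LoRA (rank 8):    {lora:,} trainable parameters")
```

For a single 4096-by-4096 layer at rank 8, the adapter trains 65,536 values instead of 16,777,216, which is under half a percent of the layer.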

This is an important step when it comes to customizing models to specific business use cases. Going forward, all enterprises will have their own vertical. In fact, “they themselves are vertical; they’re using their specific data,” Baker said. 

There’s much talk of verticalization in AI, but that doesn’t necessarily mean domain-specific models. “Instead, it’s taking your specific data, combining that with a model to provide a generative outcome,” he said. 

Originally appeared on: TheSpuzz
