Large language models (LLMs) and multimodal AI are the cutting edge of AI innovation, with applications trickling down to the enterprise from the ‘Googles’ and ‘OpenAIs’ of the world. We are currently seeing a barrage of LLM and multimodal AI model announcements, as well as commercial applications created around them.
LLMs power applications that range from code creation to customer feedback. At the same time, they are driving multimodal AI and fueling the debate around the limits and use of AI. In 2019, OpenAI deemed GPT-2 “too dangerous to release.” Today, models far more powerful than GPT-2 are being released. In both cases, the assessment of what is safe to release feels arbitrary. Yesterday, however, the industry may have taken a first step toward shared best practices for AI language model deployment.
Cohere, OpenAI and AI21 Labs have collaborated on a preliminary set of best practices applicable to any organization developing or deploying LLMs. The trio is recommending key principles to help providers of LLMs mitigate the risks of this technology in order to achieve its full promise to augment human capabilities.
The move has garnered support from Anthropic, the Center for Security and Emerging Technology, Google Cloud Platform and the Stanford Center for Research on Foundation Models. AI21 Labs, Anthropic, Cohere, Google and OpenAI are actively developing LLMs commercially, so the endorsement of these best practices may indicate the emergence of some sort of consensus around their deployment.
The joint recommendation for language model deployment is centered around the principles of prohibiting misuse, mitigating unintentional harm and thoughtfully collaborating with stakeholders.
Cohere, OpenAI and AI21 Labs noted that while these principles were developed specifically based on their experience with providing LLMs through an API, they hope they will be useful regardless of release strategy (such as open-sourcing or use within a company).
The trio also noted that they expect these recommendations to change significantly over time because the commercial uses of LLMs and accompanying safety considerations are new and evolving. Learning about and addressing LLM limitations and avenues for misuse is ongoing, they added, while calling for others to discuss, contribute to, learn from and adopt these principles.
Prohibiting misuse of large language models
Usage guidelines should specify domains where LLM use requires extra scrutiny and prohibit high-risk use cases, such as classifying people based on protected characteristics. Enforcing usage guidelines may include rate limits, content filtering, application approval prior to production access, monitoring for anomalous activity and other mitigations.
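To make the enforcement mechanisms listed above more concrete, here is a minimal Python sketch of how an API provider might combine application approval, a prohibited use-case check and a rate limit in a single gate. It is an illustration only, not the method used by any of the companies named here. The class name `UsageGate`, the app ID, the use-case labels and the limits are all hypothetical.

```python
import time
from collections import defaultdict, deque

# Hypothetical policy values, chosen only for illustration.
MAX_REQUESTS_PER_WINDOW = 3
WINDOW_SECONDS = 60.0
BLOCKED_USE_CASES = {"protected_characteristic_classification"}
APPROVED_APPS = {"app-123"}


class UsageGate:
    """Toy gate that applies approval, use-case and rate-limit checks."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # Maps app_id to the timestamps of its recently accepted requests.
        self._history = defaultdict(deque)

    def check(self, app_id, use_case):
        """Return (allowed, reason) for one incoming request."""
        if app_id not in APPROVED_APPS:
            return False, "app not approved for production access"
        if use_case in BLOCKED_USE_CASES:
            return False, "use case prohibited by usage guidelines"
        now = self._clock()
        recent = self._history[app_id]
        # Drop timestamps that have fallen out of the sliding window.
        while recent and now - recent[0] >= WINDOW_SECONDS:
            recent.popleft()
        if len(recent) >= MAX_REQUESTS_PER_WINDOW:
            return False, "rate limit exceeded"
        recent.append(now)
        return True, "ok"
```

A real deployment would likely add content filtering and anomaly monitoring on top of a gate like this, and would load its policy from configuration rather than hard-coded constants.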
Mitigating unintentional harm
To try to mitigate unintentional harm, the recommended practices are to proactively mitigate harmful model behavior and to document known weaknesses and vulnerabilities. Model cards, an existing Google initiative that the company used for its recently announced PaLM model, may help with this kind of documentation.
Best practices to mitigate unintentional harm include comprehensive model evaluation to properly assess limitations, minimizing potential sources of bias in training corpora, and techniques to minimize unsafe behavior, such as learning from human feedback.
Thoughtfully collaborating with stakeholders
To encourage thoughtful collaboration with stakeholders, the recommendations are to build teams with diverse backgrounds, publicly disclose lessons learned regarding LLM safety and misuse, and treat all labor in the language model supply chain with respect. This may turn out to be the hardest part of the recommendations to follow: Google’s conduct in the case that led to the dismissal of the former heads of its AI ethics team is a very public case in point.
In its statement of support for the initiative, Google affirmed the importance of comprehensive strategies for analyzing model and training data to mitigate the risks of harm, bias and misrepresentation. It described the initiative as a thoughtful step by these AI providers to promote principles and documentation in support of AI safety.
Best practices vs. the real world
Cohere, OpenAI and AI21 Labs noted that, as LLM providers, they see publishing these principles as a first step in collaboratively guiding safer large language model development and deployment. The trio also emphasized that they are excited to continue working with each other and with other parties to identify further opportunities to reduce unintentional harm from language models and prevent their malicious use.
There are many ways to consider this initiative and the support it has garnered. One way is to see it as an acknowledgement of the great responsibility that comes with the great power LLMs grant. While these recommendations may be well meaning, however, it’s useful to remember that they are just that: recommendations. They remain rather abstract, and there is no real way of enforcing them, even among those who sign up for them.
On the other hand, vendors who build and release LLMs will in all likelihood soon be required to adhere to regulations. Just as the EU’s GDPR had a worldwide ripple effect on data privacy in 2018, the EU AI Act can be expected to have a similar effect around 2025. LLM providers are probably aware of this, and this initiative could be seen as a way of positioning themselves for “soft compliance” ahead of time.
It is worth noting in that respect that the EU AI Act is a work in progress. Vendors, civil society organizations and other stakeholders are invited to have their say. While in its current form the regulation would be applicable exclusively to LLM makers, organizations such as the Mozilla Foundation are arguing in favor of extending its applicability to downstream applications as well. Overall, this initiative can be seen as part of the broader AI ethics / trustworthy AI “movement.” As such, it’s important to ask relevant questions and learn from the experience of people who have been at the forefront of AI ethics.