This week in AI, DeepMind detailed a new code-generating system, AlphaCode, that it claims is competitive with top human programmers. Supermarket chains in the U.K. announced that they’d begin testing automatic age verification systems to estimate customers’ ages when they buy alcohol. And EleutherAI, a research group focused on open-sourcing highly capable AI systems, released GPT-NeoX-20B, a language model that’s among the largest of its kind.
AlphaCode is one of the more sophisticated examples of machine programming, or tools that automate software development and maintenance processes. DeepMind claims that it can write “competition-level” code, achieving an average ranking within the top 54.3% across 10 recent contests on the programming challenge platform Codeforces.
The applications of machine programming are vast in scope — explaining why there’s enthusiasm around it. According to a study from the University of Cambridge, at least half of developers’ efforts are spent debugging, which costs the software industry an estimated $312 billion per year. Even migrating a codebase to a more efficient language can command a princely sum. For example, the Commonwealth Bank of Australia spent around $750 million over the course of five years to convert its platform from COBOL to Java.
AI-powered code generation tools like AlphaCode promise to cut development costs while allowing coders to focus on creative, less repetitive tasks. But AlphaCode isn’t flawless. Besides being expensive to maintain, it doesn’t always produce code that’s correct and could — if similar systems are any indication — contain problematic bias. Moreover, if it’s ever made available publicly, malicious actors could misuse it to create malware, bypass programming tests, and fool cybersecurity researchers.
“[A]lthough the idea of giving the power of programming to people who can’t program is exciting, we’ve got a lot of problems to solve before we get there,” said Mike Cook, an AI researcher at Queen Mary University of London.
Automatic age verification
Three supermarket chains in the U.K. (Asda, Co-op, and Morrisons) are using cameras to estimate customers’ ages as part of a test by the Home Office, the U.K. department responsible for immigration, security, and law and order. The technology, which was already being used in Aldi’s checkout-free location in London, guesses the age of customers who consent using algorithms trained on “a database of anonymous faces,” according to the BBC. If it decides that they’re under 25, they’ll have to show ID to a member of the staff.
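The decision rule described above can be sketched in a few lines. This is only an illustration of the under-25 threshold reported by the BBC, not Yoti’s or any retailer’s actual code; the function name and the sample estimates are hypothetical.

```python
# Hypothetical sketch of the under-25 rule; not the deployed system's code.
ID_CHECK_THRESHOLD = 25  # from the article: an estimate under 25 triggers an ID check


def needs_id_check(estimated_age: float, threshold: int = ID_CHECK_THRESHOLD) -> bool:
    """Return True when the estimated age falls below the threshold."""
    return estimated_age < threshold


# Made-up estimates, purely for illustration.
for estimate in (19.4, 24.9, 25.0, 41.2):
    outcome = "show ID to staff" if needs_id_check(estimate) else "no ID check"
    print(f"estimated age {estimate}: {outcome}")
```

Setting the threshold well above the legal purchase age leaves a buffer for estimation error, so a customer close to the legal limit is more likely to be sent to a human check than waved through.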
Yoti, the company providing the technology, says that it was tested on more than 125,000 faces and guessed age to within 2.2 years. But while Yoti says that it’s not performing facial recognition or retaining the images that it takes, the system raises ethical concerns.
Age estimation systems, like other AI systems, could amplify any bias in the data used to develop the systems. One study highlights the effect of makeup, which can cover age signs like age spots and wrinkles, and finds that age estimation software tends to be more accurate for men. The same research found that the software overestimates the ages of younger non-Caucasians and underestimates the ages of older Asian and Black people, and can even be influenced by whether someone smiles or not.
In an interview with Wired, Yoti cofounder and CEO Robin Tombs admitted that the company was unsure about which facial features its AI uses to determine people’s age. While he stressed that Yoti’s training dataset of “hundreds of thousands” of faces was “representative across skin tones, ages, and gender” and that its internal research showed similar error rates across demographics, the academic literature would appear to suggest otherwise. Yoti’s own whitepaper shows that the tech is least accurate for older women with darker skin.
A wrong age estimate at the supermarket might amount to little more than inconvenience (and perhaps embarrassment). But it could normalize the tech, leading to more problematic applications elsewhere. Daniel Leufer, a Europe policy analyst focused on AI at civil liberties group Access Now, told Wired that regulators should look at whom these systems will likely fail when they’re considering the use cases. “Typically, that answer is people who are routinely failed by other systems,” he said.
Open source language model
EleutherAI on Wednesday released its newest language model, GPT-NeoX-20B, as part of its mission to broaden access to highly capable text-generating AI. Available now through an API and next week in open source, GPT-NeoX-20B outperforms other public language models across several domains while being generally cheaper to deploy, according to EleutherAI.
GPT-NeoX-20B — which was developed on infrastructure provided by CoreWeave, a specialized cloud provider — was trained on EleutherAI’s 825GB text dataset and contains 20 billion parameters, roughly 9 times fewer than OpenAI’s GPT-3. In machine learning, parameters are the part of the model that’s learned from historical training data. Generally speaking, in the language domain, the correlation between the number of parameters and sophistication has held up remarkably well.
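To put the size comparison in concrete terms, here is a rough back-of-the-envelope calculation. It assumes GPT-3’s publicly reported 175 billion parameters and, as a further assumption not taken from EleutherAI, 2 bytes per parameter (16-bit weights) for the memory estimate; the function name is illustrative.

```python
# Back-of-the-envelope comparison; figures other than 20 billion are assumptions.
GPT_NEOX_PARAMS = 20_000_000_000  # from the article
GPT3_PARAMS = 175_000_000_000     # publicly reported GPT-3 size (assumption here)


def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate memory needed just to hold the weights, in gigabytes.

    Assumes 2 bytes per parameter (16-bit weights) by default.
    """
    return num_params * bytes_per_param / 1e9


ratio = GPT3_PARAMS / GPT_NEOX_PARAMS
print(f"GPT-3 is about {ratio:.2f}x larger than GPT-NeoX-20B")
print(f"GPT-NeoX-20B weights: ~{weight_memory_gb(GPT_NEOX_PARAMS):.0f} GB")
print(f"GPT-3 weights: ~{weight_memory_gb(GPT3_PARAMS):.0f} GB")
```

The ratio works out to 8.75, which is where the “roughly 9 times” figure comes from. The memory numbers cover weights only and ignore everything else needed to actually serve a model.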
EleutherAI makes no claim that GPT-NeoX-20B solves any of the major problems plaguing current language models, including bias and toxicity. But the group maintains that the benefits of releasing the model, and others like it, outweigh the risks. Language models can cost up to millions of dollars to train from scratch, and inference (i.e., actually running the trained model) is another barrier. One estimate pegs the cost of running GPT-3 on a single Amazon Web Services instance at a minimum of $87,000 per year.
“From spam and astroturfing to chatbot addiction, there are clear harms that can manifest from the use of these models already today, and we expect the alignment of future models to be of critical importance. We think the acceleration of safety research is extremely important,” EleutherAI cofounder Connor Leahy said in a statement.
EleutherAI’s previous models have already spawned entirely new AI-as-a-service startups. If history is any indication, GPT-NeoX-20B will do the same.
For AI coverage, send news tips to Kyle Wiggers — and be sure to subscribe to the AI Weekly newsletter and bookmark our AI channel, The Machine.
Thanks for reading,
AI Senior Staff Writer