How machine learning helps the New York Times power its paywall

Every organization applying artificial intelligence (AI) and machine learning (ML) to its business is looking to use these powerful technologies to tackle thorny problems. For the New York Times, one of the biggest challenges is striking a balance between meeting its latest target of 15 million digital subscribers by 2027 and getting more people to read articles online.

These days, the multimedia giant is digging into that complex cause-and-effect relationship using a causal machine learning model called the Dynamic Meter, which is all about making its paywall smarter. According to Chris Wiggins, chief data scientist at the New York Times, the company has worked for the past three or four years to understand scientifically both the user journey in general and the workings of the paywall in particular.

Back in 2011, when the Times began focusing on digital subscriptions, “metered” access was designed so that non-subscribers could read the same fixed number of articles every month before hitting a paywall requiring a subscription. That allowed the company to gain subscribers while also allowing readers to explore a range of offerings before committing to a subscription. 

Machine learning for better decision-making

Now, however, the Dynamic Meter can set personalized meter limits. Powered by data-driven user insights, the causal machine learning model can be prescriptive, determining the right number of free articles each user should get so that they become interested enough in the New York Times to subscribe and keep reading.


According to a blog post written by Rohit Supekar, a data scientist on the New York Times’ algorithmic targeting team, unregistered users sit at the top of the site’s subscription funnel. At a specific meter limit, they are shown a registration wall that blocks access and asks them to create an account. Registering gives them access to more free content, and the registration ID allows the company to better understand their activity. Once registered users reach another meter limit, they are served a paywall with a subscription offer. The Dynamic Meter model learns from all of this registered user data and determines the appropriate meter limit to optimize for specific key performance indicators (KPIs).
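The two-stage funnel described above can be sketched in a few lines of Python. This is only an illustration, not the Times' implementation: the `Reader` class, the `gate` function and both limit values are hypothetical names and numbers chosen for the example, since the article does not state the real thresholds.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds; the article does not give the real limits.
REGISTRATION_LIMIT = 3    # free articles before the registration wall
DEFAULT_METER_LIMIT = 10  # fallback paywall limit for registered users


@dataclass
class Reader:
    articles_read: int                 # articles read in the current metering period
    registered: bool = False
    subscribed: bool = False
    meter_limit: Optional[int] = None  # personalized limit, if a model has set one


def gate(reader: Reader) -> str:
    """Return which wall, if any, the reader meets on the next article."""
    if reader.subscribed:
        return "open"
    if not reader.registered:
        # Unregistered readers are asked to create an account at a fixed limit.
        if reader.articles_read >= REGISTRATION_LIMIT:
            return "registration_wall"
        return "open"
    # Registered readers get a paywall once they reach their own meter limit.
    limit = reader.meter_limit if reader.meter_limit is not None else DEFAULT_METER_LIMIT
    return "paywall" if reader.articles_read >= limit else "open"
```

In this sketch, a personalized model would only need to fill in `meter_limit` for each registered reader; the gating logic itself stays the same.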

The idea, said Wiggins, is to form a long-term relationship with readers. “It’s a much slower problem in which people engage over the span of weeks or months,” he said. “Then, at some point, you ask them to become a subscriber and see whether or not you did a good job.” 

Causal AI helps understand what would have happened

The most difficult challenge in building the causal machine learning model was setting up a robust data pipeline to understand the activity of more than 130 million registered users on the New York Times’ site, said Supekar.

The key technical advancement powering the Dynamic Meter is causal AI, a machine learning approach that aims to build models that can predict what would have happened.

“We’re really trying to understand the cause and effect,” he explained.

If a particular user had been given a different number of free articles, how likely would they have been to subscribe, or to read a certain number of articles? This is a complicated question, he explained, because in reality the team can observe only one of these outcomes.

“If we give somebody 100 free articles, we have to guess what would have happened if they were given 50 articles,” he said. “These sorts of questions fall in the realm of causal AI.”

Supekar’s blog post explained that the clearest way to see how the causal machine learning model works is through a randomized controlled trial, in which different groups of people are given different numbers of free articles and the model learns from the resulting data. As the meter limit for registered users increases, engagement, measured by the average number of page views, rises. But a higher limit also reduces subscription conversions because fewer users encounter the paywall. The Dynamic Meter has to balance this trade-off between conversion and engagement.
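To make the trade-off concrete, here is a minimal Python sketch of how results from such a randomized trial could be summarized and scored. Everything in it is an assumption for illustration: the `TRIAL` records are made-up numbers, and the `summarize` and `best_limit` functions and the single `weight` parameter are hypothetical. The real Dynamic Meter sets personalized limits for individual users, whereas this sketch only compares whole trial groups.

```python
from collections import defaultdict

# Made-up records from a hypothetical randomized trial:
# (assigned meter limit, page views in the period, subscribed?)
TRIAL = [
    (5, 4, True), (5, 5, False), (5, 3, True), (5, 5, False),
    (10, 8, False), (10, 10, True), (10, 7, False), (10, 9, False),
    (20, 15, False), (20, 12, False), (20, 18, False), (20, 11, False),
]


def summarize(records):
    """Return {limit: (average page views, conversion rate)} per trial group."""
    views, subs, counts = defaultdict(int), defaultdict(int), defaultdict(int)
    for limit, page_views, subscribed in records:
        views[limit] += page_views
        subs[limit] += int(subscribed)
        counts[limit] += 1
    return {
        limit: (views[limit] / counts[limit], subs[limit] / counts[limit])
        for limit in counts
    }


def best_limit(records, weight):
    """Pick the limit maximizing a weighted mix of conversion and engagement.

    weight=1.0 cares only about conversion; weight=0.0 only about engagement.
    Engagement is scaled by the largest group average so both terms lie in [0, 1].
    """
    summary = summarize(records)
    max_views = max(avg_views for avg_views, _ in summary.values())

    def score(limit):
        avg_views, conversion = summary[limit]
        return weight * conversion + (1 - weight) * avg_views / max_views

    return max(summary, key=score)
```

With these toy numbers, a conversion-only objective (`weight=1.0`) favors the lowest limit and an engagement-only objective (`weight=0.0`) favors the highest, which mirrors the tension the blog post describes.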

“For a specific user who got 100 free articles, we can determine what would have happened if they got 50 because we can compare them with other registered users who were given 50 articles,” said Supekar. This is one reason causal AI has become popular, he explained: “There are a lot of business decisions, which have a lot of revenue impact in our case, where we would like to understand the relationship between what happened and what would have happened. That’s where causal AI has really picked up steam.”

Machine learning requires understanding and ethics

Wiggins added that with so many organizations bringing AI into their businesses for automated decision-making, they really want to understand what is going to happen. 

“It’s different from machine learning in the service of insights, where you do a classification problem once and maybe you study that as a model, but you don’t actually put the ML into production to make decisions for you,” he said. A business that wants AI to really make decisions, by contrast, wants to understand what is going on. “You don’t want it to be a black-box model,” he pointed out.

Supekar added that his team is conscious of algorithmic ethics when it comes to the Dynamic Meter model. “Our exclusive first-party data is only about the engagement people have with the Times content, and we don’t include any demographic or psychographic features,” he said. 

The future of the New York Times paywall

As for the future of the New York Times’ paywall, Supekar said he is excited about exploring the science behind the downsides of introducing paywalls in the media business.

“We do know if you show paywalls we get a lot of subscribers, but we are also interested in knowing how a paywall affects some readers’ habits and the likelihood they would want to return in the future, even months or years down the line,” he said. “We want to maintain a healthy audience so they can potentially become subscribers, but also serve our product mission to increase readership.” 

The subscription business model has these kinds of inherent challenges, added Wiggins.

“You don’t have those challenges if your business model is about clicks,” he said. “We think about how our design choices now impact whether someone will continue to be a subscriber in three months, or three years. It’s a complex science.” 

Originally appeared on: TheSpuzz