How Hiya taps AI to kill phone spam


Have you noticed that you’re getting more calls correctly identified as spam on your phones? Well, Hiya probably has something to do with that.

The Seattle, Washington-based startup, with major clients in telecoms, is using artificial intelligence to detect 20% more illegal and unwanted calls than existing technologies currently do, CEO and founder Alex Algard told VentureBeat.

The company last week introduced what it calls adaptive AI as an addition to its Hiya Protect product, which wireless carriers, smartphone makers, and app developers use as part of their service packages. It’s available in services such as AT&T Call Protect, Samsung Smart Call, and the Hiya app.

Algard said the new technology is informed by live data streams from carriers, devices, and apps. “Adaptive AI observes the patterns left by spammers in the network traffic and adapts in real-time to block them without the need for human retraining or historical data,” he said.

The company claims its new capability is much more effective than conventional tactics that only react to known phone numbers used by spammers. The AI adaptivity comes into play when spammers change numbers or carriers, which Algard said happens constantly.
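To make the contrast concrete, here is a minimal Python sketch of the difference between a static blocklist and a detector that reacts to calling patterns. It is not Hiya's algorithm, which the article does not describe. The class name `SlidingWindowRateDetector`, the 60-second window, the five-call threshold, and the phone numbers are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque


def blocklist_check(caller, known_spam_numbers):
    """Conventional approach: flag only numbers already known to be spam."""
    return caller in known_spam_numbers


class SlidingWindowRateDetector:
    """Illustrative pattern-based detector (hypothetical, not Hiya's method).

    Flags any caller that places more than `max_calls` calls within the
    most recent `window_seconds`, regardless of whether the number has
    been seen before.
    """

    def __init__(self, window_seconds=60.0, max_calls=5):
        self.window_seconds = window_seconds
        self.max_calls = max_calls
        self._recent = defaultdict(deque)  # caller -> timestamps of recent calls

    def observe(self, caller, timestamp):
        """Record one call and return True if the caller now looks like spam."""
        calls = self._recent[caller]
        calls.append(timestamp)
        # Discard calls that have aged out of the window.
        while calls and timestamp - calls[0] > self.window_seconds:
            calls.popleft()
        return len(calls) > self.max_calls


if __name__ == "__main__":
    known_spam = {"+15550100"}   # the spammer's old, already-listed number
    new_number = "+15550199"     # the spammer's fresh number

    detector = SlidingWindowRateDetector(window_seconds=60.0, max_calls=5)
    for second in range(10):
        flagged = detector.observe(new_number, float(second))
        print(second, blocklist_check(new_number, known_spam), flagged)
```

In this toy run the blocklist never catches the fresh number, while the rate detector starts flagging it on the sixth call inside the window. A production system would presumably weigh many more signals than call volume alone.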

How much phone spam is there?

To quantify the scale of phone spam, Hiya, which has roughly 200 million active users through its carrier clients, offered these statistics:

  • More than 50 billion spam calls are made to Americans each year (16 per month per user)
  • Hiya analyzes more than 13 billion calls per month
  • 94% of unidentified calls go unanswered
  • About one-third of Americans lose money to phone scams each year. On average, each victim lost $182 to phone scams last year. This means Americans collectively lost about $14 billion to scam calls in 2020.
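The figures above can be cross-checked with some quick arithmetic. The Python sketch below uses only numbers quoted in the list; the function and variable names are made up for illustration, and the results are implied values, not statistics reported by Hiya.

```python
def implied_recipients(spam_calls_per_year, calls_per_user_per_month):
    """People implied by a yearly call total and a per-person monthly rate."""
    return spam_calls_per_year / (calls_per_user_per_month * 12)


def implied_victims(total_loss_usd, average_loss_usd):
    """Victims implied by a total loss and an average loss per victim."""
    return total_loss_usd / average_loss_usd


if __name__ == "__main__":
    # 50 billion spam calls per year at 16 calls per person per month.
    recipients = implied_recipients(50e9, 16)
    # $14 billion lost in total at $182 per victim.
    victims = implied_victims(14e9, 182)

    print(f"Implied call recipients: {recipients / 1e6:.1f} million")
    print(f"Implied scam victims:    {victims / 1e6:.1f} million")
```

The call figures imply roughly 260 million recipients, and the loss figures imply roughly 77 million victims. How the latter squares with "about one-third of Americans" depends on the population base Hiya used, which the article does not state.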

The most common ways scammers make money are stealing personal information, selling fake products or services, and gaining access to financial accounts. An increasing number of spammers are deploying illegal tactics to generate business leads for legitimate or illegitimate businesses, such as car or computer warranty calls.

Algard said he started Hiya in 2016 as a spin-out from the previous company he founded, WhitePages.

“WhitePages is a directory service site. We identified some potential use cases that we thought we could build an incubator business around — basically, a caller ID service on the old landlines,” Algard said.

“We thought it was odd that on mobile devices, there was no caller ID. So we figured that with the advent of mobile apps, we could actually solve that use case with an automated caller ID service for people who just download the app that we provided. And that turned out to get a lot of consumer interest; tons of people downloaded the app.”

How Hiya puts AI to work

Alex Algard shared the following additional insights in an interview with VentureBeat regarding how technologists, data architects, and software developers can use adaptive AI.

VentureBeat: What AI and ML tools are you using specifically?

Algard: Hiya has unique needs in developing models that can handle the challenges that the scale and volume of voice networks pose. The primary workload is call analysis, which must run in real time on live data streams with very low latency and high throughput: fast enough to analyze calls as they are being made, and scalable enough to handle over 1 billion API calls per day.

This primary workflow is supported by our proprietary Hiya MLOps system that we’ve fine-tuned to our problem. It includes internal ML-model lifecycle management and an ensemble-based prediction system to capture the many telecom scammer scenarios and geographies that we deal with to provide global call protection.
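Algard does not say how the ensemble is built. As a rough illustration of the general idea, the Python sketch below combines several simple scoring functions into one weighted score. The three toy models, their feature names (`avg_duration_s`, `calls_last_hour`, `user_reports`), and the weights are all hypothetical and are not drawn from Hiya's system.

```python
from typing import Callable, Dict, List, Tuple

CallFeatures = Dict[str, float]
Model = Callable[[CallFeatures], float]  # returns a spam score in [0, 1]


def short_call_model(features: CallFeatures) -> float:
    """Hypothetical model: very short average calls look suspicious."""
    return 1.0 if features.get("avg_duration_s", 60.0) < 5.0 else 0.0


def high_volume_model(features: CallFeatures) -> float:
    """Hypothetical model: score rises with hourly call volume, capped at 1."""
    return min(features.get("calls_last_hour", 0.0) / 100.0, 1.0)


def user_report_model(features: CallFeatures) -> float:
    """Hypothetical model: score rises with user reports, capped at 1."""
    return min(features.get("user_reports", 0.0) / 10.0, 1.0)


def ensemble_score(features: CallFeatures,
                   models: List[Tuple[Model, float]]) -> float:
    """Weighted average of the individual model scores."""
    total_weight = sum(weight for _, weight in models)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(weight * model(features) for model, weight in models)
    return weighted / total_weight


if __name__ == "__main__":
    models = [
        (short_call_model, 1.0),
        (high_volume_model, 1.0),
        (user_report_model, 2.0),  # assumed: user reports count double
    ]
    caller = {"avg_duration_s": 3.0, "calls_last_hour": 50.0, "user_reports": 10.0}
    print(ensemble_score(caller, models))
```

One appeal of an ensemble is that each member can target a single scenario or geography and be replaced independently, which fits Algard's description of models being refined over several generations.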

For other workloads, we pull from numerous ML platforms as needed. For example, we use SageMaker to create, train, and deploy systems that look at a robocall’s network characteristics and analyze recordings.

VentureBeat: Are you using models and algorithms out of a box — for example, from DataRobot or other sources?

Algard: Because of the unique challenges of live data streams and the scale of the networks we run on, we are building and maintaining our own custom frameworks. Out-of-the-box or auto-ML solutions haven’t proven to be a viable solution for the size and scale of the issues we’re tackling.

VentureBeat: What cloud service are you using mainly?

Algard: We use AWS and are expanding to support Microsoft Azure.

VentureBeat: Are you using a lot of the AI workflow tools that come with that cloud?

Algard: We use underlying AWS services such as EC2 and DynamoDB for computing, data storage, and global synchronization. And for data post-processing and data prep, we use tools from multiple sources: AWS Glue, Apache Airflow, Zeppelin, Jupyter, etc.

VentureBeat: How much do you do yourselves?

Algard: Quite a lot. Scammers and illegal callers are sophisticated and constantly changing tactics to avoid detection. We’ve invested in a dedicated team of data scientists that focus on the illegal caller industry and are constantly iterating and adjusting our AI model engine to keep pace with them. Many of the models we employ are on their fifth or sixth generation as we refine them to take on specific scammer tactics. We are active in the AI/ML community and make use of the latest technologies and approaches when we can, but often we have to develop new approaches on our own. Adaptive AI is an example of an approach that we’ve had to develop in-house.

VentureBeat: How are you labeling data for the ML and AI workflows?

Algard: Data labeling is the most important aspect of what we do that makes Hiya so effective at defeating illegal callers globally. We’ve made the investment to do this in-house because of its impact on our accuracy. We use data from several sources, including call event data from the Hiya network, scam traps, user reports, federal compliance data, STIR/SHAKEN, and custom data sources from our carrier and distribution partners.

VentureBeat: Can you give us a ballpark estimate on how much data you are processing?

Algard: Hiya deals with an incredible amount of data: 200M users worldwide, 450,000 ML model recalculations per second, and 20GB/hour of ML model changes pushed to our edge service. Our model recalculation requires the biggest AWS EC2 instance available.
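For a sense of scale, the rates Algard quotes can be converted to other units. The Python sketch below is plain unit conversion on the numbers from the interview. The helper names are invented for illustration, 1 GB is assumed to mean 1,000 MB, and the results are averages that say nothing about peak load.

```python
SECONDS_PER_HOUR = 3_600
SECONDS_PER_DAY = 86_400


def gb_per_hour_to_mb_per_second(gb_per_hour):
    """Convert GB/hour to MB/second, assuming 1 GB = 1,000 MB."""
    return gb_per_hour * 1_000 / SECONDS_PER_HOUR


def per_second_to_per_day(rate_per_second):
    """Convert an average per-second rate to a daily total."""
    return rate_per_second * SECONDS_PER_DAY


def per_day_to_per_second(total_per_day):
    """Convert a daily total to an average per-second rate."""
    return total_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    # 20 GB/hour of model changes pushed to the edge service.
    print(f"{gb_per_hour_to_mb_per_second(20):.2f} MB/s of model updates")
    # 450,000 model recalculations per second.
    print(f"{per_second_to_per_day(450_000):,} recalculations per day")
    # More than 1 billion API calls per day.
    print(f"{per_day_to_per_second(1e9):,.0f} API calls per second on average")
```

On these figures, the edge service receives about 5.56 MB of model changes per second, the models are recalculated roughly 38.9 billion times a day, and the API averages about 11,600 calls per second.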

Originally appeared on: TheSpuzz