How AI is improving the web for the visually impaired

May 20, 2022

There are almost 350 million people worldwide with blindness or some other form of visual impairment who need to use the internet and mobile apps just like anyone else. Yet, they can only do so if websites and mobile apps are built with accessibility in mind — and not as an afterthought.

The problem

Consider these two sample buttons that you might find on a web page or mobile app. Each has a simple background, so they seem similar.

In fact, they’re a world apart when it comes to accessibility.

It’s a question of contrast. The text on the light blue button has low contrast, so for someone with a visual impairment such as color blindness or Stargardt disease, the word “Hello” could be completely invisible. There is a standard mathematical formula that quantifies the contrast between the color of text and its background. Good designers know about this and use online calculators to compute those ratios for any element in a design.
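To make the formula concrete, here is a minimal Python sketch of the contrast-ratio calculation as defined in the Web Content Accessibility Guidelines (WCAG) 2.x, which is what most online calculators implement. The article does not give the actual button colors, so the two color pairs in the usage section are hypothetical, and the helper names are ours.

```python
from typing import Tuple

RGB = Tuple[int, int, int]

# WCAG 2.x level AA minimum for normal-size text.
AA_NORMAL_TEXT = 4.5


def hex_to_rgb(hex_color: str) -> RGB:
    """Convert a '#RRGGBB' string to an (R, G, B) tuple of 0-255 integers."""
    h = hex_color.lstrip("#")
    return (int(h[0:2], 16), int(h[2:4], 16), int(h[4:6], 16))


def _linearize(channel_8bit: int) -> float:
    """Undo the sRGB gamma encoding for one channel (threshold as written in WCAG 2.x)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(color: RGB) -> float:
    """Relative luminance: 0.0 for black, 1.0 for white."""
    r, g, b = (_linearize(ch) for ch in color)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a: RGB, color_b: RGB) -> float:
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0 (black on white)."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    # Hypothetical colors standing in for the two sample buttons.
    white = hex_to_rgb("#FFFFFF")
    for name, background in [("light blue", "#ADD8E6"), ("dark blue", "#00008B")]:
        ratio = contrast_ratio(white, hex_to_rgb(background))
        verdict = "passes" if ratio >= AA_NORMAL_TEXT else "fails"
        print(f"White text on {name}: {ratio:.2f}:1 ({verdict} AA for normal text)")
```

The calculation only works when the background is a single flat color, which is exactly the limitation the next section runs into.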

So far, so good. But when it comes to text on a complex background like an image or a gradient, things get complicated and helpful tools are rare. Until now, accessibility testers have had to check these cases manually by sampling the background of the text at certain points and calculating the contrast ratio for each sample. Besides being laborious, the measurement is inherently subjective, since different testers might sample different points inside the same area and come up with different measurements. This problem of laborious, subjective measurement has been holding back digital accessibility efforts for years.

Accessibility: AI to the rescue

Artificial intelligence algorithms, it turns out, can be trained to solve problems like this and even to improve automatically as they are exposed to more data.

For example, AI can be trained to do text summarization, which is helpful for users with cognitive impairments; image and facial recognition, which helps those with visual impairments; or real-time captioning, which helps those with hearing impairments. Apple’s VoiceOver integration on the iPhone, which is mainly used to read email and text messages aloud, also uses AI to describe app icons and report battery levels.

Guiding principles for accessibility

Wise companies are rushing to comply with the Americans with Disabilities Act (ADA) and give everyone equal access to technology. In our experience, the right technology tools can make that much easier, even for modern websites with their thousands of components. For example, a site’s design can be scanned and analyzed via machine learning. The site’s accessibility can then be improved through facial and speech recognition, keyboard navigation, audio translation of descriptions and even dynamic readjustment of image elements.

In our work, we’ve found three guiding principles that, I believe, are critical for digital accessibility. I’ll illustrate them here with reference to how our team, in an effort led by our data science team leader Asya Frumkin, has solved the problem of text on complex backgrounds.

Complex backgrounds example. Image by author

Split the big problem into smaller problems

If we look at the text in the image below, we see that there is some kind of legibility problem, but it’s hard to quantify by looking at the whole phrase at once. On the other hand, if our algorithm examines each letter in the phrase separately (for example, the “e” on the left and the “o” on the right), we can more easily tell whether each one is legible.

If our algorithm continues to go through all the characters in the text in this way, we can count the number of legible characters and the total number of characters. In our case, there are four legible characters out of eight in total. Dividing the number of legible characters by the total gives us a legibility ratio for the overall text, here 4/8 = 0.5. We can then use an agreed-upon preset threshold, for example 0.6, below which the text is considered unreadable. Since 0.5 is below 0.6, this text would be flagged as unreadable. But the point is that we got there by running operations on each piece of the text and then tallying the results.
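The tallying step can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not our production code: the function names are hypothetical, and the per-character verdicts are hard-coded here where a classifier would normally supply them.

```python
from typing import Sequence


def legibility_ratio(char_is_legible: Sequence[bool]) -> float:
    """Fraction of characters judged legible (legible count / total count)."""
    if not char_is_legible:
        raise ValueError("need at least one character to compute a ratio")
    return sum(char_is_legible) / len(char_is_legible)


def is_readable(char_is_legible: Sequence[bool], threshold: float = 0.6) -> bool:
    """Text counts as readable only if its ratio is not below the threshold."""
    return legibility_ratio(char_is_legible) >= threshold


if __name__ == "__main__":
    # Eight characters, four judged legible. Which four is an assumption;
    # only the counts come from the example in the text.
    verdicts = [True, False, True, False, False, True, True, False]
    print(f"legibility ratio: {legibility_ratio(verdicts):.2f}")
    print("readable" if is_readable(verdicts) else "unreadable")
```

The 0.6 default mirrors the example threshold mentioned above; in practice the cutoff would be whatever value the team agrees on.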

Complex background solution example. Image by author

Repurpose existing tools where possible

We all remember Optical Character Recognition (“OCR”) from the 1970s and 80s. Those tools had promise but ended up being too complex for their originally intended purpose.

But a more recent model in that lineage, CRAFT (Character-Region Awareness For Text), holds promise for AI and accessibility. CRAFT maps each pixel in the image to its probability of being at the center of a letter. From these probabilities, it is possible to produce a heat map in which high-probability areas are painted red and low-probability areas are painted blue. From this heat map, you can calculate the bounding boxes of the characters and cut them out of the image. Using this tool, we can extract individual characters from a long text and run a binary classification model (like the one described under the first principle above) on each of them.
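To show how bounding boxes can fall out of a heat map, here is a deliberately simplified Python sketch. It is not CRAFT’s actual post-processing: it assumes the heat map is already available as a small grid of probabilities, applies a hypothetical cutoff of 0.5, and treats each connected blob of high-probability pixels as one character. The toy heat map and all names are made up for illustration.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), all inclusive


def character_boxes(heat_map: List[List[float]], threshold: float = 0.5) -> List[Box]:
    """Return one bounding box per 4-connected region of pixels >= threshold."""
    rows = len(heat_map)
    cols = len(heat_map[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes: List[Box] = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or heat_map[r][c] < threshold:
                continue
            # Breadth-first flood fill from this pixel, growing the box as we go.
            top, left, bottom, right = r, c, r, c
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and heat_map[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def crop(image: List[List[float]], box: Box) -> List[List[float]]:
    """Cut the region covered by a box out of a 2D grid."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Toy heat map with two blobs, standing in for two characters.
TOY_HEAT_MAP = [
    [0.0, 0.1, 0.0, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.8, 0.0, 0.7, 0.1],
    [0.0, 0.8, 0.9, 0.0, 0.9, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.6, 0.0],
]

if __name__ == "__main__":
    for box in character_boxes(TOY_HEAT_MAP):
        print(box, crop(TOY_HEAT_MAP, box))
```

In a real pipeline the crop would be taken from the original image rather than the heat map, and each cropped character would then be passed to the legibility classifier.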

CRAFT example. Image by author

Find the right balance in the dataset

The model classifies individual characters in a straightforward binary way, at least in theory. In practice, there will always be challenging real-world examples that are difficult to classify. What complicates matters even more is that every person, whether visually impaired or not, has a different perception of what is legible.

Here, one solution (and the one we have taken) is to enrich the dataset by adding objective tags to each element. For example, each image can be stamped with a reference piece of text on a fixed background prior to analysis. That way, when the algorithm runs, it will have an objective basis for comparison.

For the future, for the greater good

As the world continues to evolve, every website and mobile application needs to be built with accessibility in mind from the beginning. AI for accessibility is a technological capability, an opportunity to get off the sidelines and engage, and a chance to build a world where people’s difficulties are understood and considered. In our view, the solution to inaccessible technology is simply better technology. That way, making websites and apps accessible is part and parcel of making websites and apps that work, but this time for everybody.

Navin Thadani is cofounder and CEO of Evinced.

Originally appeared on: TheSpuzz