Tobii built its name over the past 20 years by making eye-tracking technology for healthcare, behavior research, gaming, and virtual reality and augmented reality. But now it’s expanding into the market for automotive systems that track drivers.
This kind of Driver Monitoring System can do things like monitor a driver’s alertness. Part of this is the work of Anand Srivatsa, who spent 15 years at Intel before joining Tobii in July 2019 as the CEO of the Tobii Tech division. I interviewed Srivatsa recently to talk about the changes at Tobii and its move into automotive.
The overall company, Tobii Group, has three divisions. It is spinning off its Dynavox medical division on the Stockholm NASDAQ public market, and Srivatsa will be the CEO of the remaining two combined divisions, Tobii Tech and Tobii Pro. The latter does eye tracking to understand human behavior and improve performance. That transaction is still pending. On top of that, Tobii Group announced this week it is acquiring Phasya, an automotive systems company, for $4.7 million.
Tobii expects the Tobii Driver Monitoring System (DMS) to hit the market in cars in 2024 to 2025. It is working with automotive suppliers such as Sunny SmartLead and Nviso. The idea is to apply both eye tracking and AI to decipher key data points about a driver’s attention and drowsiness to enhance traffic safety. It can monitor multiple people, detect emotions, and review upper body movement and gestures.
Here’s an edited transcript of our interview with Srivatsa.
VentureBeat: How did you come in to Tobii?
Anand Srivatsa: I spent about 15 years before Tobii at Intel. When I was at Intel I was running the desktop business, the workstation business, the client computing group. At the time it was one of these transformations in my career where I had spent a lot of time in different parts of the business. I was on the networking side, the telecom side. I ended up in desktop, and I was looking for the next opportunity to be more technology-centric. Desktop is great. I think we did some good things on that end. But fundamentally the market is shifting away from desktop computing except in the high end of the space. I was looking for something different. This company found me and I found them. It was an interesting opportunity to be in what I see as the next wave of technology adoption following this human-computer interface, this new modality work.
I lived in Oregon before this. When I was at Intel I lived in Oregon for six years. For this job I ended up moving to Stockholm. My family and I are there. It’s an interesting place to be. We got there six months before the pandemic started. It was one of the few places where you could still do date night through the entire time. We felt fortunate. But it’s been a little surreal. It feels like two years at Tobii have gone by in six months. Our normal business activities have been highly curtailed in the course of the pandemic.
VentureBeat: Where is the product line now? Where do you see it going?
Srivatsa: Maybe I could start out and give you a quick context on the changes happening in the company. This could give you an idea of how I view the company and the opportunity in front of us. The company as it’s structured today, there’s the Tobii Group, which is a bit like a holding company. The CEO of that is Henrik Eskilsson, who’s been a founder of the company. Underneath that we have three relatively independent divisions.
One is Tobii Dynavox, which makes medical systems with our underlying core technology. They add software and custom hardware on top of that. They have a lot of value creation focused on delivering medical devices for people who need to communicate. That’s one vertical. The second vertical that’s been stood up inside of Tobii is this business called Tobii Pro, which enables us to use eye tracking to understand human behavior and human preference and performance. They deliver into universities that are doing research on neuroscience or psychology, but they also deliver that into commercial enterprises who are trying to understand performance, do training, do market research or advertising research. Both of those divisions do full solution sets that build on top of our core tech.
The division I’ve been running for the last couple of years is Tobii Tech. We’re more like a picks and shovels company. We’re an enabling technology. We do of course create higher value offerings in the software space, but in the broadest sense, we’re trying to proliferate the adoption of our technology in mass market opportunities. We’ve focused on PC and VR and a couple of other adjacent vertical areas.
What the company announced at the end of April was that we were going to split the company into two pieces. We’re going to list the Dynavox division separately. Dynavox will be a stand-alone company that’s listed on the Stockholm NASDAQ. The remaining company retaining the name of Tobii would be that group function with these two divisions underneath. They’ll be merged together. I would be the CEO of that new company when we finalize the split, which should happen in the Q4 time frame.
That’s where the company is overall. For my focus, I’ve come in initially with a view specifically around Tobii Tech, which has been around driving mass market adoption of the technology. But when you look at the opportunity we have in front of us, it’s an interesting set of synergies, where we have people who create use cases for our technology in scientific research. They look at how you take eye tracking or attention, these biometric data we can deliver, and create insights around people. Those things tend to get delivered, in many cases, through mass market types of applications or mass market products. We have this end to end chain from research that drives definitions of signals and needs. It can connect them to applications, to the commercialization of that, either in specific vertical solutions that are built by particular companies, or in some cases in more mass-produced devices that can then get customized.
A good example of this we’re starting to see now is around enterprise VR. In enterprise VR you have a couple of different modalities for eye tracking. You can use eye tracking for the interaction. You can use it as you see with Facebook, talking about social interaction and things like that. But then there’s this other part where we’re creating relationships with end customers who take the power of our eye tracking solutions and put them in medical-grade devices where they can disrupt existing industries. We have customers that do that in enterprise VR with our eye tracking. That’s the full circle, coming from research that says, “This is what you can do with an eye tracking signal in ophthalmology or neuroscience,” and then it gets manifested into a solution that we’re enabling from a hardware perspective, and then we work with them on extracting these higher-level signals to deliver value.
VentureBeat: Which VR devices are you in?
Srivatsa: We’re in the HTC Vive Pro Eye. We’re in the Pico Neo 2 and the Neo 3. This is actually the reason, coming back to my view of the interest in this company–it seems fairly obvious to me that the truism in the world right now is that machines need to communicate with humans on human terms. We’re no longer going to shape the way we communicate to make machines understand us.
The fact of the matter is, we indicate a lot with where we look. What are we paying attention to? What do we care about? It’s there in general, but if you look at wearables and AR and VR and mixed reality, if you really want to build a compelling AR experience and give context to people, you can’t give it on a wide field of view. I may be looking at anything in this space, but if I’m looking at that menu, maybe you want to say, “Hey, I don’t think that egg and cheese is being served anymore.” That actually can make AR compelling. Otherwise we have to convince people who’ve never worn glasses to put glasses on for the entire day. It’s going to have to be something magic, something that makes me feel like I want to go do this.
What we see as proof–with Varjo, with Facebook’s commentary around eye tracking, with HoloLens’s eye tracking, we think that this is where we are specifically with VR and AR. The future will have eye tracking. That validates our point of view that it’s an important technology. With our work in eye tracking over the last 20 years, I think we’re a trusted partner to deliver that for a variety of headset manufacturers.
VentureBeat: How has the tech on the PC developed for things like game eye tracking and the esports applications?
Srivatsa: In the last couple of years, one thing we’ve started to see in PC–we have a couple of different focus areas specifically in gaming. We’re focused on three things around gaming. One we call immersive gaming, which says that you can use eye tracking to enhance your gameplay. Either you get more immersed or you can use it to control elements of the screen. There, I think that it takes a fair amount of time to change user behavior. But we’re starting to see some positive trends.
One big proof point for us recently has been the community of eye tracking users. They went to the Flight Simulator forums and demanded that Microsoft Flight Simulator support it. That, for us, starts to showcase that the behavior or the experience we’re delivering specifically for the simulation is powerful enough that people are starting to expect it. That can get the flywheel started. But of course the expectation long term is that most gamers would say, “I want every game to have it.” That’s a marathon we need to be on.
On the esports training side, this is legitimately a substantial opportunity for us. But we’re starting to see the esports market mature into spaces like this. Something like training or coaching is still starting to become more mainstream. My history at Intel, when we were doing desktops, we were the first group to lean into esports when–esports was already a phenomenon outside of Intel, but we started to pay attention in 2014. We were, at the time, still convincing people it was real. Don’t worry, arenas are going to fill up. We’ve started to see this manifest now. We have teams, sponsorships, coaches. Like other sports, the use of analytics will come in to be much more fundamental to understanding what makes somebody good at a game. But I think that’s still a burgeoning market.
In the last year or so we’ve partnered with this company Mobalytics around League of Legends. We’ve seen so far that we have good, consistent usage of people on the platform, but I think we still need to go and get a breakthrough in that category. That’s something we haven’t done yet. But it’s something we expect to continue to go and invest around. One positive trend we see, though, is that there’s a lot of people who use our streaming platform, the ability to stream with eye tracking, to go in and explain what they’re doing in games. You have people who train other people on how to play FIFA. They use the eye tracking. It’s not a platform they build. They just use the eye tracking, explain what they’re doing in the game, and show their audience, “Hey, this is what I do. This is what I look at.” That creates an intuitive understanding.
All of these video games, in fact most of what we do in general, it’s about responding to visual stimuli. We see something and take some action. Fundamentally, in a video game it’s not about how fast you throw the ball. It’s how fast you process the information you get to take an action on the keyboard or mouse. That’s progressing. But again, it’s going to take a bit of time.
VentureBeat: Do you think the computer makers will continue to build it into laptops and things like that? Is that where most of the activity is?
Srivatsa: We had two primary focus areas, or two sets of things broadly we were doing. One, we were creating demand for our technology. This is where we were working with game studios to get them to integrate support for eye tracking for immersive gaming, or working with a partner like Mobalytics to showcase what can be done with esports training. The way the technology was fulfilled was with two types of products. One was products integrated into laptops, like the Alienware devices, and the second was where we would have a direct to consumer business where people were using eye trackers attached to desktop monitors for gaming.
Fundamentally, if you look at the dynamics of gamers in general, the enthusiast gamers are largely on desktop. When you get a buyer that’s buying a peripheral, the good news for us is it’s very clear where we can see that they absolutely intend to go buy this device. They have a specific usage in mind when they buy. With a more broad distribution model like building it into a laptop, people buy the laptop. They may value the eye tracking, or they may just think it’s a nice thing to have. Where we’ve spent a lot of effort now is on the peripheral itself, to go back in and make sure those use cases are robust enough to get clarity around exactly what’s driving retention for customers on this technology. What’s delighting them? Right now we see that in simulation gaming on the immersive side.
We see good retention with esports training, but it’s still early days there. We need to have more esports titles. Right now we have League of Legends, but we need a platform at some point for Fortnite, for Valorant, for Counter-Strike, for even things like NBA2K.
VentureBeat: Was there a universal benefit that people saw all across different games? I can see why League of Legends is a key game, because you don’t necessarily know where pros are looking. If you can establish where they behave you can try to follow that behavior. With shooters, can it give you the same kind of valuable feedback?
Srivatsa: There’s two things in general. One thing that tends to be consistent in all of these games is that there’s a mini-map. There’s a question around mini-map awareness. In a game like League of Legends where you have lanes and enemies, there’s a piece there where you have awareness of what’s happening there.
The second thing, in the shooters, we’ve seen this when we broadcast esports tournaments. You have people who do a good job looking at where an enemy can be. If you’re going to get sniped or something like that–in Rainbow Six we were broadcasting while using eye tracking during a tournament. They were talking about players walking through an area where there’s a hole in the ceiling. You see the player scan all the locations where an enemy can be. If you think about a player who may be more tied to their crosshairs, the amount of time it takes to notice someone is there before you get shot–those kinds of things are potential differentiators.
But fundamentally I think you’re right. For the different games, the mechanics and what’s important is going to be different. This is why, with Mobalytics and League of Legends, our approach typically has been that we don’t necessarily know what’s going to be most useful for a game. What separates a professional player from an aspiring professional or just the amateur? But certainly some of these things about information processing, about scanning, about not being tunneled into a spot, these seem to be things that would be consistent.
I can tell you myself, when I play Fortnite, I’m terrible. My son’s 12 years old. We started playing together. The thing I struggle with a bit is following where the enemies are. I haven’t eye tracked myself in the game, but my son is just like, “Why can’t you shoot him, Dad?” I honestly don’t see them. Even without much data, I’d bet that a professional player is probably doing some things differently than me. And probably some of this is of course the ability to get your mouse exactly where you want to look, where you want to shoot.
VentureBeat: When I played Warzone with some really good players, I think the mini-map was critical. They knew that there was shooting coming from a certain direction, so there were people over there to be careful about. They could always call that out sooner than I was aware.
Srivatsa: Exactly. That’s the thing. For me, in Fortnite, I’m in survival mode. I don’t have time to look at those higher-level things. I’m just, “Shit, someone’s here, I’m gonna get killed.” And then that’s followed by my son yelling at me. It’s so funny. He says, “I didn’t mean to be mean to you.”
But the interesting thing we see, when we look at the PC category, we’ve been very much focused on gaming. We look at streaming or broadcasting and marketing of our use cases, whether it’s around analytics or–we have a lot of people who stream their Star Citizen gameplay. They talk about how immersed they are. But one big focus area for us in general has been going in and broadening the areas where eye tracking or the technologies we develop would be valuable in PCs. Over the last couple of years one burgeoning area of focus for us has been education. As we look forward on PC, we’re going to continue to drive in gaming. We see positive momentum in a couple of areas. We suspect our initial focus will be a lot on these peripherals. We can validate these use cases with enthusiasts. If you’re in esports, it’s going to be very much desktop-driven. And then the integration path could be through monitors, or of course laptops are more and more popular. That would be the next integration path that would follow or continue to be on that vector.
But the second one for us, which I think we’ve been incubating with the same demand creation, has been around education. We have a lot of research and deployed product from partners of ours that use eye tracking to do literacy research. They’re able to diagnose conditions like ADHD and things like that. We’re starting to lean into that now. That’s another big opportunity for us as we see this trend of one-for-one computing in education. Maybe even immediately. You see these challenges around overcoming the education deficit we have from the pandemic. We have these opportunities to make technology a way to scale assessment in a much more objective way, to go and identify those students who need more support or intervention. You can imagine that if you can intervene with a second grader to get their reading up to par, how much of an impact that has through the rest of their education.
We have a partner of ours called Lexplore. They’re broadly deployed in Sweden, but they’re also deployed in the Oakland school district. They do a standard assessment for literacy for kids. Their point of view is, you typically have two easy audiences to identify in students. One is the really good readers that the teachers likely know, and then the really bad readers that the teachers can easily spot. The bad readers get support. You can tell they’re having difficulty reading. But if you go to that midline and look at people you’d characterize as yellow — they’re not a great reader, but it’s not so obvious — those are the bulk of the population that would fall behind. This is one of those things where the teachers don’t have the time to say, “Hey, where are you in that spot?” If you can automate assessment, you can identify those kids early. We’re going to take those kids in yellow and move them over to green. That, I think, has a huge knock-on effect on education outcomes in general.
VentureBeat: How is the progression on being able to control things with your eyes, as opposed to just being able to track them? Whether in VR or on a flat screen.
Srivatsa: In VR it’s quite robust. We have a bunch of technologies we built up. One thing is, if you look at our Tobii Dynavox division, that’s what they use the eye tracking for. They use it as an interface with the computer, not just to see where you’re looking at. You can select objects, whether it’s on a keyboard or icons on a 2D surface. The interesting thing is, your eyes will not have the fidelity of a mouse. You can’t get that kind of DPI. Your eyes move around all the time. There’s actually a fair amount of technology you need to go in and interpret your eye movements to select the right things on the screen.
We have technologies there we call gaze to object mapping, things like that, that increase the percentage we’re going to guess right as far as what the icon is based on where you move. We do that as well in things like VR. In VR, of course, it’s so much more important, because you don’t have a high-fidelity input inside the virtual world. You don’t have the equivalent of a mouse. You may have a handheld wand, but it’s not the same as having something on a surface that has high fidelity. That, we think, is going to be very obvious.
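Editor's illustration: the idea of guessing the intended icon from jittery gaze data can be sketched roughly as below. This is not Tobii's gaze to object mapping, whose details the interview does not describe. It is a minimal example that assumes a simple scoring rule (each gaze sample votes for nearby targets with a Gaussian falloff), and the names `Target`, `map_gaze_to_object`, and `sigma` are hypothetical.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Target:
    """A selectable on-screen object (hypothetical structure)."""
    name: str
    x: float       # center x, in pixels
    y: float       # center y, in pixels
    radius: float  # approximate size, in pixels


def map_gaze_to_object(samples, targets, sigma=40.0):
    """Return the name of the target best supported by recent gaze samples.

    samples: list of (x, y) gaze points from a short time window.
    sigma:   assumed gaze noise in pixels; larger values tolerate more jitter.
    Returns None when there are no samples or no targets.
    """
    if not samples or not targets:
        return None
    scores = {}
    for target in targets:
        total = 0.0
        for gx, gy in samples:
            # Distance from the sample to the target's edge (0 if inside it).
            dist = max(0.0, math.hypot(gx - target.x, gy - target.y) - target.radius)
            # Each sample votes with a weight that decays with distance.
            total += math.exp(-(dist * dist) / (2.0 * sigma * sigma))
        scores[target.name] = total
    return max(scores, key=scores.get)


icons = [Target("save", 100, 100, 20), Target("delete", 200, 100, 20)]
noisy_gaze = [(110, 95), (95, 108), (120, 102), (160, 100)]
chosen = map_gaze_to_object(noisy_gaze, icons)
```

Pooling several samples rather than trusting the latest one is the point of the sketch: a single stray sample between two icons does not flip the selection.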
What we see in computing devices like this, what you can end up with is a bit more assistance. We have partners that use the eye tracking information to reduce mouse mileage. If you imagine that you’re looking at one quadrant on the screen, we may not have the resolution to go in and decide that you’re picking this letter. But if the mouse cursor moves close to where you are, you’re not using the mouse to do that broad movement of your cursor. You’re just using it for that fine piece. We have those kinds of uses as well. We have a partner called 4tiitoo. They’ve done some of this work with SAP. They market a piece of software they provide for increased productivity and digital well-being. We certainly see that aspect as well on that end. That kind of usage where you use our technology in concert with something that has more high fidelity could make a computer more easy to use and reduce stress-based injuries, things like that.
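Editor's illustration: the "mouse mileage" idea in this answer, where gaze handles the broad cursor movement and the mouse handles the fine adjustment, can be sketched as follows. This is a hedged toy example, not 4tiitoo's or Tobii's software. The function name `assisted_cursor` and the `warp_threshold` value are assumptions made for the sketch.

```python
import math


def assisted_cursor(cursor, gaze, mouse_delta, warp_threshold=200.0):
    """Return the new cursor position for one input update.

    cursor:         current (x, y) cursor position in pixels.
    gaze:           current (x, y) gaze estimate in pixels (coarse).
    mouse_delta:    (dx, dy) movement reported by the mouse this update.
    warp_threshold: assumed distance beyond which the cursor jumps to the gaze point.
    """
    cx, cy = cursor
    gx, gy = gaze
    dx, dy = mouse_delta
    mouse_is_moving = dx != 0 or dy != 0
    # Only warp when the user starts moving the mouse while looking far away
    # from the cursor. Gaze covers the long trip across the screen.
    if mouse_is_moving and math.hypot(gx - cx, gy - cy) > warp_threshold:
        cx, cy = gx, gy
    # The mouse always supplies the precise, final adjustment.
    return (cx + dx, cy + dy)


far_jump = assisted_cursor((0, 0), (1000, 500), (3, -2))
fine_only = assisted_cursor((990, 495), (1000, 500), (3, -2))
idle = assisted_cursor((0, 0), (1000, 500), (0, 0))
```

Requiring mouse movement before warping is a design choice in this sketch: it keeps the cursor from chasing the eyes while the user is only reading.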
VentureBeat: Going back to the splitting of the company and one part going public, how does that structure make sense as far as how far along each division is?
Srivatsa: Tobii Dynavox is fairly mature. They’re a profitable business today. If you look at the three divisions, the aspiration in the beginning was that all of them would likely separate. The reason they were structured that way is because there was an expectation they would continue to bifurcate away from each other. Dynavox is probably the most bifurcated today. They use the core technologies we deliver in terms of attention computing or eye tracking, things like that, but they also have devices that are touch only. There are people who need a voice, who are able to use their hands, and for them they’re not looking for something that’s eye gaze-oriented only. They can use something that’s touch-enabled. Their key value propositions are on the software, the layers above, the purpose-built hardware to deliver those capabilities. That division has started to separate away already.
When you look at Tobii Pro and Tobii Tech, there’s an interesting synergy there. As I mentioned, Tobii Pro is working with the scientific researchers who develop, effectively, what kinds of signals you need to deliver particular value. For interactions, for example, that could be relatively well-understood, but if you think about biometrics and using it for things like diagnostics, medical usages that would be developed by a GE Medical Systems, which Tobii Tech would typically work with, the characteristics of those signals would be defined by research coming out of universities. That linkage is something that we think is going to stay strong throughout.
The second aspect, which I think is interesting, is if you think about the other parts of what Tobii Pro is doing, which is helping people with training or human performance. If you think about delivering platforms to do market research or advertising, they’re not looking for instruments as much as they’re looking for tools. In the absence of having tools that are purpose-built, we build our own. But you can imagine that there’s an instance where you’d say, “As our technology proliferates, those tools will come out of something as standard as your computer.” If you have an eye tracker built in, there may not be a reason to buy a separate piece of hardware to do market research. Maybe what you do is go get the software from Tobii Pro. There, again, this link between what we do on the tech side versus what needs to be delivered from Tobii Pro toward these clients, it can be strengthened by understanding what you can deliver in these more mass market products. That’s the structure that we’re conceiving.
The company, the Pro and Tech entity that comes out, which will be called Tobii, at the get-go is likely not profitable. But we think it has very different kinds of business dynamics. It should be a much higher-growth company than Tobii Dynavox. As we start seeing some of these bets pay off around VR or AR or PC continuing to accelerate, we think that’s a quite high-growth, more high-tech type of company, versus a medical device kind of thing.
VentureBeat: It seems like a good way to incubate different projects.
Srivatsa: Absolutely. That’s one of the biggest things on my end. After a couple of years here, one of the truisms I see is that we have some areas where we can absolutely see the value of eye tracking today. We talk about VR. We see some opportunities in gaming. We think the education opportunity is pretty substantial on the PC side. But there are so many more places that eye tracking could create value that we inside of Stockholm or in the company wouldn’t really understand. Part of our job is to promote what you can do with this tech. This is one of the pivots we’re trying to make.
We tend to talk a lot about eye tracking, but even for me, when I was interviewing with the company, I assumed that eye tracking was about controlling a computer with your eyes or understanding where people look. What people typically don’t understand is that it’s so much more powerful than that. Our eyes are terrible. They’re a very bad camera. But our brains are such good processors that it looks like everything’s in high definition. Because our eyes are so poor, effectively what you see when you look at where someone’s looking is what they’re interested in. This is almost an autonomous type of control in your head. You can try not to look, but the fact is, if you let your brain do what it does, whatever you’re interested in, you’ll end up looking at it. The power of that insight that you glean is fundamental.
We are trying to make the pivot, one, that we’re absolutely a world leader in eye tracking, but already today we do much more than just eye tracking. We’re pulling insight out of it. We’re giving different kinds of signals around head pose or cognitive load. We’re trying to talk about this concept of attention computing, which is maybe trying to understand the user better. As we look at that opportunity and we can explain to people that this is what you can do, we think there will be many applications in places we don’t fully comprehend yet. They would come to us and say, “I can do this with this technology. This would be a disruptive tech.”
One area that is already starting to see that manifest is in automotive. If you think about the need for cars that are trying to get to vision zero, to go and make sure that the driver is paying attention, or you think about autonomous driving and giving control back to a driver, you need to understand their mental state. Are they ready to take control back? Should you do something else? We think this kind of technology will be absolutely critical in those areas. But there may be other examples as well.
VentureBeat: Do you think that this has a place with smartphones, being built in there? It seems like there could be applications with things like tracking whether ads are viewed.
Srivatsa: One thing we’ve had partners talk about–if you look at ads in general, whether it’s on a larger screen device or a smaller one, today you talk about impressions or things in the screen. What people really want to know is, did you see my ad? Not just whether it was on the screen while you looked at something else. There’s absolutely power to do that on that kind of device, where there’s a fair amount of real estate to look at. Where you could be looking at something else and not the ad, even though the ad is on screen and it would count as an impression.
The challenge with this kind of device for eye tracking is that in your field of view it’s relatively small. If you want a much higher fidelity view of a very small portion of the screen you’re looking at, it becomes challenging. The question around smartphones is absolutely open about whether there will be value in trying to track where you look inside the phone. It just has to do with the physical size. But this is why, if you take the reverse, when you’re wearing an AR headset, when your field of view is absolutely immense, there it’s so critical to come back and think about something like eye tracking to understand what it is you care about.
We think that smartphones are a challenging problem to solve for creating enough value to justify it. But I think there’s potentially opportunity there. The larger the field of view, the more you can use where people are looking to go and really deliver value. It could be, like in an AR thing, to give you context. But in VR one of the big things we see is foveated rendering, foveated transport. Again, those kinds of things, when you know where the user is looking and that’s a small percentage of their broader field of view, you can do really intelligent things with that. If you look here, there’s no concept of foveated rendering when this entire thing can be in your field of view.
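Editor's illustration: foveated rendering, mentioned above, spends full shading effort only near the gaze point and less in the periphery. The sketch below is a simplified estimate of the savings, not any headset's actual pipeline. The eccentricity thresholds (5 and 15 degrees), the flat `pixels_per_degree` conversion, and the function names are assumptions chosen for the example.

```python
import math


def shading_rate(tile_center, gaze, pixels_per_degree=20.0):
    """Pick a coarseness factor for a screen tile from its distance to the gaze point.

    Returns 1 (full resolution), 2 (one shaded sample per 2x2 pixels),
    or 4 (one shaded sample per 4x4 pixels).
    """
    dx = tile_center[0] - gaze[0]
    dy = tile_center[1] - gaze[1]
    # Approximate angular distance from the gaze point, in degrees.
    eccentricity = math.hypot(dx, dy) / pixels_per_degree
    if eccentricity <= 5.0:
        return 1
    if eccentricity <= 15.0:
        return 2
    return 4


def relative_shading_cost(width, height, gaze, tile=32):
    """Estimate shading work as a fraction of rendering every tile at full resolution."""
    cost = 0.0
    tiles = 0
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            center = (tx + tile / 2, ty + tile / 2)
            rate = shading_rate(center, gaze)
            # A rate of N shades one sample per N x N pixels.
            cost += 1.0 / (rate * rate)
            tiles += 1
    return cost / tiles


center_gaze = (960, 540)
fraction = relative_shading_cost(1920, 1080, center_gaze)
```

The savings grow as the full-resolution region becomes a smaller share of the display, which matches the point in the interview that the technique pays off when the field of view is large.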
When I was at Intel, one of the things we were doing when we were driving VR on the desktop, everybody’s vision is you want the holodeck, which I think in some ways would be the metaverse. You come back and the problem is, you need photorealistic VR. Why can’t VR look like this? But this is the power of something like eye tracking and things like that. If you want to compel people to feel like they’re living in a 4K HD world inside of VR, but not explode your compute budget, those kinds of things are going to be important. It’s interesting to see people that are in a way already there.
The metaverse is a very convenient north star, just like the holodeck. We may never get there, but it’s a good thing to aim for, because we know that’s an experience that will delight people. But I’m quite bullish that as more and more companies, especially with the pandemic and how much time people spend online now–they’re getting more comfortable with interacting virtually for work and other things. Maybe taking simple things like turning on your video camera–at my company we were using Teams all the time, and nobody would turn on their camera pre-pandemic. You’d be virtual, but it’d be like a phone call. You’d have to ask people to turn on their camera, and they wouldn’t want to. Then the pandemic happened and now this is the reality. The tools are way better than they were before, but they’re still way behind where they could be if you want to go and create this kind of construct.
One thing that’s been interesting for me, in the last eight months the VR buzz has almost tripled. Varjo has a compelling vision, but even before, it was very much, enterprise VR looks pretty good, consumer VR, who the hell knows. It felt like everyone thought VR was the stepping stone to AR, and that’s the world that was coming to be. At least for me, I perceive VR like, maybe it grows up to be an enthusiast experience. It picks up the console players and enthusiast desktop players. It’s not a billion a year, but maybe $100 million a year. That’s been my mental model. Not everyone will do VR, but if AR becomes the next computing platform you’ll need a billion of those out there.
It’s been interesting in the last six months where all of a sudden, everyone seems to be bought in there. There’s also a bit of healthy skepticism of what it takes to do AR properly. It’s a very difficult engineering problem. But that, for me, has been surprising. Even the big players who dominate the space have come out and said that they’re underinvested in VR.
VentureBeat: How many people are there working for you now?
Srivatsa: The company right now is about 1200 to 1400 people. The two new companies will be split approximately 50-50, more than 600 each.