IBM and NASA deploy open-source geospatial AI foundation model on Hugging Face


There are many open-source models available on Hugging Face, and today at least one more joins their number.

IBM and NASA today jointly announced the availability of their geospatial foundation model on Hugging Face. The development of the model was first disclosed in February as an attempt to unlock the value of massive volumes of satellite imagery to help advance climate science and improve life here on Earth. The open model was trained on NASA's Harmonized Landsat Sentinel-2 (HLS) satellite data, with additional fine-tuning using labeled data for several specific use cases, including burn scar and flood mapping.

The geospatial foundation model benefits from enterprise technologies that IBM has been developing for its own AI efforts, and the company is hopeful that the innovations pioneered in the new model will help both scientific and business use cases.

“With foundation models, we have this opportunity to be able to do a lot of pre-training and then easily adapt and accelerate productivity and deployment,” Sriram Raghavan, VP for IBM Research AI, told VentureBeat.


Data labeling at scale is hard; foundation models solve that problem

A primary challenge that IBM’s enterprise users have faced with AI in the past is that training used to require very large sets of labeled data. Foundation models change that paradigm.

With a foundation model, the AI is pre-trained on a large dataset of unlabeled data. Fine-tuning for a specific use case can then be executed using some labeled data to get a very customized model. Not only is the resulting model customized; IBM and NASA also found that the foundation model approach enabled faster training and better accuracy than a model built entirely with labeled data.
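The two-stage pattern described above can be illustrated with a deliberately toy, dependency-free sketch: the "pretraining" stage learns a statistic from a large unlabeled set, and the "fine-tuning" stage needs only a handful of labeled examples to attach meaning to it. This is purely illustrative; the real model learns rich representations from satellite imagery, not a single threshold.

```python
import random

random.seed(0)

# "Pretraining": a large unlabeled set drawn from two hidden regimes
# (think of these as pixel reflectance values for water vs. land).
unlabeled = [random.gauss(0.2, 0.05) for _ in range(500)] + \
            [random.gauss(0.8, 0.05) for _ in range(500)]

# Learn a statistic from unlabeled data alone -- here, just the mean,
# which lands between the two regimes and serves as a decision boundary.
mean = sum(unlabeled) / len(unlabeled)

# "Fine-tuning": only a few labeled examples are needed to tell us
# which side of the pretrained boundary is which.
labeled = [(0.15, "water"), (0.25, "water"), (0.75, "land"), (0.85, "land")]

def classify(x):
    # Pick the label of the labeled examples on the same side as x.
    if x < mean:
        return next(lab for v, lab in labeled if v < mean)
    return next(lab for v, lab in labeled if v >= mean)

print(classify(0.1))  # classified by which side of the boundary it falls on
print(classify(0.9))
```

The point of the sketch is the division of labor: the expensive, label-free stage does most of the work, so the labeled set only needs to disambiguate, which is why far less annotation is required.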

For example, Raghavan said that for the use case of flood prediction, the new foundation model was able to improve prediction accuracy by 15% over the state of the art using half as much labeled data.

“You are now talking about basically half the work that an SME [Subject Matter Expert] has to do,” said Raghavan. “So, you use the base model that was trained in an unsupervised fashion, then an SME said, ‘I’m going to teach you how to do flood [prediction]’ and they use half the amount of labeled data that they had to use for other techniques.”

For the burn scar use case, which is increasingly important in an era when wildfires rage over wide areas of land, IBM saw an even greater benefit. Raghavan said that IBM was able to train a model with 75% less labeled data than the current state-of-the-art model, providing what he referred to as ‘double-digit’ improvements in performance.

Why Hugging Face matters for an open geospatial foundation model

As to why IBM and NASA are making the model available on Hugging Face, there are numerous reasons, Raghavan said.

For one, Hugging Face has become the leading community for open AI models, he said. IBM acknowledged as much earlier this year when it first announced its approach to building foundation models; as part of that initial announcement, IBM partnered with Hugging Face to bring access to open AI models to IBM’s enterprise users.

By making the geospatial foundation model available on Hugging Face, IBM and NASA are hoping that the model will be used, and that lessons learned will help improve it over time.

Raghavan said that by making the model compatible with Hugging Face’s APIs, developers can use a wide range of existing tooling to work with the model.

“The purpose was to reduce the effort it takes for the audience, and the audience here is really scientists who are going to work on top of the satellite data,” he said. “Today Hugging Face APIs dominate the ecosystem in terms of familiarity.”
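The familiarity Raghavan describes comes in part from the Hub's uniform conventions for addressing model files. A minimal, dependency-free sketch of that addressing scheme is below; the repository id and filename are hypothetical, and in practice a developer would use the `huggingface_hub` library's `hf_hub_download` function rather than building URLs by hand.

```python
def hub_resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Compose a Hugging Face Hub 'resolve' URL for a file in a model repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

# Hypothetical repository and checkpoint names, for illustration only.
url = hub_resolve_url("example-org/geospatial-foundation-model",
                      "pytorch_model.bin")
print(url)
```

Because every model on the Hub follows the same layout, the same tooling (download clients, caching, versioned revisions) works unchanged for a geospatial model as for any other, which is the ecosystem effect Raghavan points to.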

How enterprise users will benefit (eventually)

While the core audience for the geospatial foundation model is scientists, Raghavan expects that there will be learnings that will help enterprise use cases of AI as well.

In terms of direct impact, IBM has an Environmental Intelligence Suite that uses various models today to help organizations with sustainability efforts. Raghavan said that the new model will, in time, be integrated with that platform.

There is also potential for what Raghavan referred to as ‘meta-learning,’ where lessons learned will inform other areas of IBM’s AI development efforts.

“We believe that we’re in the journey of understanding what is the developer experience around foundation models,” he said. “By exposing a new class of users now with scientists who are going to be doing fine tuning on these models, we will start to understand what we have to offer to make that process better and better, and I believe some of those learnings we will take back.”

Originally appeared on: TheSpuzz