Global AI embeddings in Earth Observation

CloudFerro and European Space Agency (ESA) Φ-lab have published the first global embeddings dataset for Earth observations. This innovative dataset integrates cutting-edge AI technologies to advance Earth Observation capabilities, enabling more precise and scalable analysis of satellite data.

What are embeddings?

Embeddings are high-dimensional vectors that represent complex data numerically, capturing relationships and meanings.

By mapping words, images, or entire documents into this space, embeddings maintain the semantic properties of the original data, allowing AI models to understand and process context with accuracy.

This enables machines to identify patterns, similarities, and connections that might otherwise be challenging to detect.
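
As a rough illustration of how similarity between embeddings can be measured, the sketch below compares toy vectors with cosine similarity. The four-dimensional vectors and their labels are made up for the example; real embeddings have hundreds of dimensions and come from a trained model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 4-dimensional "embeddings" (hypothetical values, for illustration only).
forest_a = [0.9, 0.1, 0.0, 0.2]
forest_b = [0.8, 0.2, 0.1, 0.3]
city = [0.1, 0.9, 0.7, 0.0]

sim_forest = cosine_similarity(forest_a, forest_b)
sim_city = cosine_similarity(forest_a, city)

print(f"forest vs forest: {sim_forest:.3f}")
print(f"forest vs city:   {sim_city:.3f}")
```

Vectors that point in nearly the same direction score close to 1, so the two "forest" vectors score higher than the "forest" and "city" pair.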

Embeddings insights

Raw data AI transformation

Embeddings transform raw data into a structured format that can be meaningfully interpreted, allowing AI models to extract deeper insights and relationships.

More accurate analysis

By capturing the underlying patterns and connections within the data, embeddings enable more accurate and context-aware analysis in machine learning, natural language understanding, and computer vision.

The foundation for scalability

Embeddings unlock new possibilities across a wide range of applications, from predictive modeling to advanced decision-making systems.

First global embeddings dataset for EO

CloudFerro and the European Space Agency (ESA) Φ-lab have introduced the first global embeddings dataset for Earth observations. The dataset applies AI models to advance Earth Observation capabilities, enabling more precise and scalable analysis of satellite data. The work combines GPU-accelerated instances provided by CloudFerro, which not only supplies the infrastructure but also plays a key role in preparing the embeddings, with the expertise of ESA Φ-lab.

At CloudFerro, we have built a high-performance computational environment capable of processing Earth Observation data at a global scale. The global embeddings are computed on the CREODIAS cloud service platform, using GPU-accelerated instances provided by CloudFerro.

Value for Earth Observation data

Embeddings are increasingly valuable in the field of Earth Observation (EO), offering a range of applications for professionals across this sector.

Empowering EO professionals

Embeddings can be leveraged by a wide range of professionals across the Earth Observation sector: remote sensing scientists, geospatial analysts, and environmental researchers.

Wide range of applications

Data scientists and machine learning engineers in EO can apply embeddings to process and analyze large volumes of geospatial data for tasks such as pattern recognition, anomaly detection, and predictive modeling.

Efficient data representation

Embeddings let users extract meaningful features from satellite imagery, sensor data, and geographic information systems (GIS). This makes the analysis of complex spatial relationships more efficient and saves time and resources.

Simplifying satellite data processing

With CloudFerro handling the embedding calculations, users can offload the heavy computational tasks to the cloud and work with lightweight, pre-computed embeddings rather than managing large, complex satellite datasets themselves.

Global embeddings for satellite images

CloudFerro, in collaboration with ESA Φ-lab, has calculated global embeddings for Sentinel-2 and Sentinel-1 imagery at a 10-meter resolution, using general-purpose vision models such as SigLIP and DINOv2 alongside SSL4EO models built for Earth Observation. This global run represents a significant step up in the scale and scope of Earth Observation data processing.

Over 170 million embeddings from trillions of pixels

Over 170 million embeddings were generated from more than 62 TB of raw data, distilling insights from 9.368 trillion pixels of source data. This comprehensive analysis involved processing more than 8 million Sentinel-1 and Sentinel-2 images from the Major TOM dataset.
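
To put these figures in proportion, the back-of-the-envelope sketch below divides the quoted totals by the number of embeddings. It treats the rounded figures above as exact and assumes decimal terabytes, so the results are approximations only. The "equivalent square side" is purely illustrative and is not a statement about the tile size actually used.

```python
import math

# Figures quoted in the text ("over" / "more than" treated as exact here).
total_pixels = 9.368e12      # 9.368 trillion pixels
total_embeddings = 170e6     # 170 million embeddings
raw_bytes = 62e12            # 62 TB, assuming decimal terabytes

pixels_per_embedding = total_pixels / total_embeddings
# Side of a square holding that many pixels (illustrative assumption only).
equivalent_side = math.sqrt(pixels_per_embedding)
raw_kb_per_embedding = raw_bytes / total_embeddings / 1e3

print(f"pixels per embedding:   {pixels_per_embedding:,.0f}")
print(f"equivalent square side: {equivalent_side:.0f} px")
print(f"raw data per embedding: {raw_kb_per_embedding:.0f} kB")
```

On these assumptions, each embedding summarizes roughly 55,000 source pixels and a few hundred kilobytes of raw data.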

CloudFerro and Φ-lab have created an efficient data representation that captures key relationships and insights, enabling faster processing, easier analysis, and more actionable data for smarter decision-making. This work is part of an expanded standard for releasing Major TOM embedding expansions as open datasets on Hugging Face.

DINOv2: a simplified visualization of how AI models interpret images; similar colors indicate similar interpretations.

SigLIP: a simplified visualization of how AI models interpret images; similar colors indicate similar interpretations.

Enhancing trend discovery and applications in areas such as agriculture, land management, and image restoration through AI-driven embeddings.

AI embeddings enable a broad range of advanced capabilities for more intelligent and efficient analyses of satellite imagery.

Embeddings can enhance satellite image browsers, allowing users to quickly and accurately explore vast datasets with improved navigation and context. With similarity search, patterns and trends from satellite images can be identified at a global scale, unlocking insights into climate patterns, urbanization, and natural events. Additionally, AI-driven embeddings help model crop yields, improving agricultural predictions, while aiding in image restoration for clearer, more precise visuals. For land cover classification, embeddings enable more accurate mapping of various terrains and crop types, supporting better land management and resource allocation.
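
The similarity search mentioned above can be sketched as a brute-force nearest-neighbour lookup over stored embeddings. This is a minimal illustration, not the method used for the released dataset: the tile IDs, the 8-dimensional random vectors, and the `top_k_similar` helper are all hypothetical, and a real deployment over millions of embeddings would typically use an approximate nearest-neighbour index instead.

```python
import math
import random


def normalize(vector):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def top_k_similar(query, index, k=3):
    """Return the k (tile_id, score) pairs most similar to the query."""
    q = normalize(query)
    scored = [
        (sum(a * b for a, b in zip(q, normalize(vec))), tile_id)
        for tile_id, vec in index.items()
    ]
    scored.sort(reverse=True)  # highest cosine similarity first
    return [(tile_id, score) for score, tile_id in scored[:k]]


# Hypothetical index: 100 tiles with random 8-dimensional embeddings.
random.seed(0)
index = {
    f"tile_{i:03d}": [random.gauss(0, 1) for _ in range(8)]
    for i in range(100)
}

# Use one stored tile as the query; it should come back as its own best match.
results = top_k_similar(index["tile_042"], index, k=3)
for tile_id, score in results:
    print(f"{tile_id}: {score:.3f}")
```

Because the vectors are normalized first, ranking by dot product is the same as ranking by cosine similarity.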

Evaluating embeddings across EO tasks, integrating foundation models, and expanding resource availability through the CREODIAS repository.

Next steps

The next steps involve testing and evaluating the computed embeddings across a range of Earth Observation (EO) tasks to assess their performance and applicability in real-world scenarios. Further testing will be conducted with other EO foundation models, including MMEarth and DeCUR, to examine how they integrate with the existing embeddings. Finally, the Major TOM dataset with embeddings will be integrated into the CREODIAS repository, providing access to these resources for the broader EO research community.