Allen AI: OlmoEarth embeddings enable landscape segmentation with just 60 pixels and F1 score of 0.84
Why it matters
Allen Institute for AI has launched OlmoEarth Studio with three model sizes (Nano, Tiny, Base) for satellite embeddings. The models achieve an F1 score of 0.84 for landscape segmentation with only 60 labeled pixels and support change detection and PCA visualization.
Allen Institute for AI (AI2) launched OlmoEarth Studio on April 23, 2026: a platform with its own embedding models for satellite image analysis. With OlmoEarth joining the OLMo language models, the Tülu instruction-tuned models, and the Molmo multimodal models, AI2 continues to expand its open-source lineup.
What is OlmoEarth and how does it fit into AI2’s strategy?
OlmoEarth is a pretrained model that converts satellite images into embeddings — compact vectors that capture visual and geospatial information. AI2 releases it in three sizes: Nano with 128 dimensions, Tiny with 384 dimensions, and Base with 768 dimensions.
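Once tiles are reduced to vectors, comparing them becomes simple arithmetic. A minimal sketch, using random 128-dimensional vectors as stand-ins for real Nano-sized OlmoEarth embeddings:

```python
import numpy as np

# Hypothetical stand-ins for OlmoEarth Nano output: two image tiles,
# each represented as a 128-dimensional embedding vector.
rng = np.random.default_rng(0)
tile_a = rng.standard_normal(128)
tile_b = rng.standard_normal(128)

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine_similarity(tile_a, tile_a))  # identical tiles -> 1.0
print(cosine_similarity(tile_a, tile_b))  # unrelated tiles -> near 0
```

Similar-looking terrain maps to nearby vectors, so distance in embedding space stands in for visual similarity.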
The choice of size is a trade-off between accuracy and speed. Nano is fast for processing large areas and running on limited hardware, Base provides the best accuracy for detailed tasks, and Tiny covers the middle ground for most practical use cases. All three models are open-source, in line with AI2’s mission.
Why is the 60-pixel result revolutionary?
The headline technical figure from the release is an F1 score of 0.84 for landscape segmentation when the model is fine-tuned with only 60 labeled pixels. F1 is the harmonic mean of precision and recall; a value of 0.84 is generally strong enough for practical geographic analyses.
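As a quick worked example (the precision and recall values below are illustrative, not figures from the release), the harmonic mean is computed as:

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Illustrative only: precision 0.82 and recall 0.86 land at F1 ≈ 0.84.
print(round(f1_score(0.82, 0.86), 2))  # 0.84
```

Because the harmonic mean is dominated by the smaller of the two values, an F1 of 0.84 implies both precision and recall are reasonably high; neither can be poor.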
Classical deep segmentation approaches require thousands to tens of thousands of labeled examples. OlmoEarth, pretrained on a massive dataset of satellite imagery, already “knows” what forests, fields, or urban areas look like, so it only needs a small set of examples to be directed toward a specific task.
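The idea of steering a frozen model with a handful of labels can be sketched with a nearest-centroid classifier. Everything below is hypothetical: Gaussian clusters stand in for OlmoEarth embeddings, and the class names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-in for OlmoEarth Nano embeddings: 3 land-cover
# classes (forest, field, urban), 20 labeled pixels each = 60 total.
# Each class is simulated as a Gaussian cluster in 128-dim space.
centers = rng.standard_normal((3, 128)) * 5.0
X = np.vstack([c + rng.standard_normal((20, 128)) for c in centers])
y = np.repeat([0, 1, 2], 20)

# Few-shot adaptation: fit a nearest-centroid classifier on the
# frozen embeddings -- no gradients through the large model.
centroids = np.stack([X[y == k].mean(axis=0) for k in range(3)])

def predict(emb):
    """Assign an embedding to the class with the closest centroid."""
    return int(np.argmin(np.linalg.norm(centroids - emb, axis=1)))

# Classify an unseen pixel drawn near the "forest" cluster.
query = centers[0] + rng.standard_normal(128)
print(predict(query))  # 0 (forest)
```

The heavy lifting happens in pretraining; the 60 labels only tell the classifier which regions of embedding space correspond to which land-cover class.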
What are the concrete applications?
Studio supports three main operations: generating embeddings for an arbitrary region, detecting changes between two points in time, and PCA visualization, which projects the high-dimensional embeddings onto a few axes of maximal variance so that cluster structure in the data becomes visible.
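The PCA view can be sketched in a few lines of NumPy (PCA is the SVD of mean-centered data). Random clusters stand in for real OlmoEarth embeddings; this is not Studio's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical data: 100 pixels from two simulated land-cover types,
# embedded in 128 dimensions with well-separated cluster means.
cluster_a = rng.standard_normal((50, 128)) + 4.0
cluster_b = rng.standard_normal((50, 128)) - 4.0
X = np.vstack([cluster_a, cluster_b])

# PCA = SVD of the mean-centered data matrix; rows of Vt are the
# principal components, ordered by explained variance.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
coords = Xc @ Vt[:2].T  # 2-D coordinates, ready for a scatter plot

# The first principal component separates the two clusters.
print(coords[:50, 0].mean(), coords[50:, 0].mean())
```

Plotting `coords` as a scatter plot shows the two land-cover types as distinct point clouds, which is the cluster structure the Studio view surfaces.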
Applications span monitoring deforestation in the Amazon, predicting crop yields for insurance companies, assessing damage after floods and earthquakes, and planning urban growth. The key advantage is the ability to perform downstream analysis without retraining the large model — the researcher works solely with embedding vectors.
This article was generated using artificial intelligence from primary sources.