
Machine Learning for Urban Air Quality Prediction Using Google AlphaEarth Foundations Satellite Embeddings: A Case Study of Quito, Ecuador

Research output: Contribution to journal › Article › peer-review

Abstract

Highlights:

What are the main findings?

  • Machine learning using Google AlphaEarth Foundations satellite embeddings in Google Earth Engine accurately predicted NO₂ and SO₂ concentrations in Quito (R² = 0.71), capturing fine-scale pollution patterns at 10 m resolution.
  • SHAP analysis revealed that only a small subset of embedding bands drives accurate predictions, demonstrating that compact, globally consistent features can explain urban air quality dynamics without handcrafted indices or auxiliary datasets.

What is the implication of the main finding?

  • Embedding-based remote sensing models provide a scalable solution for urban air quality monitoring in the Global South, overcoming sparse ground stations and persistent cloud cover.
  • The approach supports policy-relevant applications such as hotspot detection, trend analysis, and sustainable urban planning, offering transferable methods for data-scarce cities worldwide.

Many Global-South cities lack dense monitoring and suffer persistent cloud cover, hampering fine-scale trend detection. This study evaluates the potential of annual multi-sensor satellite embeddings from the AlphaEarth Foundations model in Google Earth Engine to predict and map major air pollutants in Quito, Ecuador, between 2017 and 2024. The 64-dimensional embeddings integrate Sentinel-1 radar, Sentinel-2 optical imagery, Landsat surface reflectance, ERA5-Land climate variables, GRACE terrestrial water storage, and GEDI canopy structure into a compact representation of surface and climatic conditions. Annual median concentrations of NO₂, SO₂, PM2.5, CO, and O₃ from the Red Metropolitana de Monitoreo Atmosférico de Quito (REEMAQ) were paired with collocated embeddings and modeled using five machine learning algorithms. Support Vector Regression achieved the highest accuracy for NO₂ and SO₂ (R² = 0.71 for both), capturing fine-scale spatial patterns and multi-year changes, including COVID-19 lockdown-related reductions. PM2.5 and CO were predicted with moderate accuracy, while O₃ remained challenging due to its short-term photochemical and meteorological drivers and the mismatch with annual aggregation. SHAP analysis revealed that a small subset of embedding bands dominated predictions for NO₂ and SO₂. The approach provides a scalable and transferable framework for high-resolution urban air quality mapping in data-scarce environments, supporting long-term monitoring, hotspot detection, and evidence-based policy interventions.
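
The pairing of annual median concentrations with collocated embeddings, and the R² score used to report accuracy, can be illustrated with a minimal sketch. This is not the authors' pipeline: the station names, readings, and 3-value vectors below are made up for illustration (the real embeddings have 64 dimensions), and the helper names `annual_medians`, `pair_with_embeddings`, and `r_squared` are hypothetical. The regression model itself is left out; only the data preparation and the metric are shown.

```python
from collections import defaultdict
from statistics import median


def annual_medians(readings):
    """Collapse (station, year, value) readings into one median per station-year."""
    grouped = defaultdict(list)
    for station, year, value in readings:
        grouped[(station, year)].append(value)
    return {key: median(values) for key, values in grouped.items()}


def pair_with_embeddings(medians, embeddings):
    """Keep only station-years that have both a median and an embedding vector."""
    keys = sorted(k for k in medians if k in embeddings)
    X = [embeddings[k] for k in keys]  # one feature vector per station-year
    y = [medians[k] for k in keys]     # matching annual median concentration
    return keys, X, y


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Assumption: made-up readings with hypothetical station names, not REEMAQ data.
readings = [
    ("A", 2019, 20.0), ("A", 2019, 30.0), ("A", 2019, 25.0),
    ("A", 2020, 10.0), ("A", 2020, 14.0),
    ("B", 2019, 40.0),
]
# Assumption: made-up 3-value vectors standing in for 64-dimensional embeddings.
embeddings = {
    ("A", 2019): [0.1, 0.2, 0.3],
    ("A", 2020): [0.0, 0.2, 0.4],
}

medians = annual_medians(readings)
keys, X, y = pair_with_embeddings(medians, embeddings)
```

Station "B" is dropped in this sketch because it has no embedding for 2019, so `X` and `y` stay aligned row by row. Any regressor could then be fitted on `X` and `y`, and `r_squared` applied to held-out predictions.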

Original language: English
Article number: 3472
Journal: Remote Sensing
Volume: 17
Issue number: 20
DOIs
State: Published - Oct 2025

Bibliographical note

Publisher Copyright:
© 2025 by the authors.

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 11 - Sustainable Cities and Communities
  2. SDG 13 - Climate Action
  3. SDG 17 - Partnerships for the Goals

Keywords

  • Google Earth Engine
  • machine learning
  • Quito
  • satellite embeddings
  • urban air quality

