Flood segmentation from multispectral satellite imagery (ProCANet)
Flood extent mapping from satellite imagery is a critical upstream input for disaster response logistics. Manual GIS analysis introduces latency of days to weeks, which is operationally unacceptable within the 72-hour emergency response window. The 270 km Citarum River basin in West Java -- home to 25M+ residents -- lacked an automated, high-accuracy segmentation pipeline capable of handling the spectral heterogeneity of urban, agricultural, and forested flood zones under partial cloud cover.
Standard U-Net architectures using only RGB bands systematically fail in flood-affected terrain due to spectral confusion between water bodies, dark soil, and shadowed regions. Incorporating Near-Infrared (NIR) bands from Sentinel-2 introduces a multi-modal fusion problem: attention must be learned across both spectral dimensions and spatial scales simultaneously. Naively concatenating RGB and NIR at the encoder input does not resolve cross-modal feature misalignment at skip connections -- the precise location where spatial detail is propagated to the decoder.
Designed ProCANet (Progressive Cross-Attention Network): a dual-encoder U-Net where a dedicated NIR encoder runs in parallel to the RGB encoder. Progressive cross-attention modules are inserted at every skip connection -- not only at the bottleneck -- enabling the decoder to attend to modality-specific features at each resolution scale (1/2, 1/4, 1/8, 1/16). Self-attention within each encoder branch captures intra-modal long-range spatial dependencies before the cross-modal fusion step, reducing interference between locally coherent but spectrally distinct regions.
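The fusion step at a single skip connection can be illustrated with a minimal, dependency-free sketch. It assumes single-head scaled dot-product attention with no learned projections and a residual sum, none of which is confirmed as ProCANet's actual formulation. The names `cross_attention` and `fuse_skip` are hypothetical, and features are flattened to one vector per spatial location.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention (assumed form, no learned weights).

    queries: N feature vectors from one modality (e.g. RGB skip features).
    keys, values: M feature vectors from the other modality (e.g. NIR).
    Returns N vectors, each a weighted average of `values`.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return out

def fuse_skip(rgb_feats, nir_feats):
    """Residual fusion (assumption): RGB features plus what they attend to in NIR."""
    attended = cross_attention(rgb_feats, nir_feats, nir_feats)
    return [[r + a for r, a in zip(rv, av)] for rv, av in zip(rgb_feats, attended)]

# Toy skip-connection features: two spatial locations, two channels each.
rgb_skip = [[1.0, 0.0], [0.0, 1.0]]
nir_skip = [[0.5, 0.5], [0.5, 0.5]]
fused = fuse_skip(rgb_skip, nir_skip)
```

In a progressive design, the same fusion would be applied independently at each of the 1/2, 1/4, 1/8, and 1/16 scales before the result is passed to the decoder. A practical implementation would use a tensor library and learned query, key, and value projections.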
Trained on the Sen1Floods11 benchmark (Sentinel-2 multispectral imagery, 11 flood events globally). All experiments tracked end-to-end with Weights & Biases (W&B), including hyperparameter sweeps, loss curve monitoring, and per-class IoU checkpointing.
Evaluated zero-shot generalization on PlanetScope imagery at about 3.3x finer ground sampling distance than the training data (3 m vs 10 m GSD) across a 6,112 km² holdout area without retraining, validated against NDWI-derived pseudo ground truth (IoU 0.659). This tests real deployment conditions where the inference sensor differs from the training sensor.
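For reference, an NDWI-derived water mask can be sketched as below. This uses the standard McFeeters definition, NDWI = (Green - NIR) / (Green + NIR). The threshold of 0.0 and the handling of zero-signal pixels are assumptions, since the exact settings behind the pseudo ground truth are not stated here. The function names are hypothetical.

```python
def ndwi(green, nir):
    """McFeeters NDWI for one pixel: (green - nir) / (green + nir)."""
    denom = green + nir
    if denom == 0:
        return 0.0  # assumption: treat no-signal pixels as neutral
    return (green - nir) / denom

def water_mask(green_band, nir_band, threshold=0.0):
    """Label pixels whose NDWI exceeds `threshold` as water (1), else 0.

    Bands are 2D lists of reflectance values with identical shapes.
    The default threshold of 0.0 is an assumed, commonly used value.
    """
    return [
        [1 if ndwi(g, n) > threshold else 0 for g, n in zip(g_row, n_row)]
        for g_row, n_row in zip(green_band, nir_band)
    ]

# Toy 2x2 reflectance patches: water reflects more green than NIR.
green = [[0.30, 0.10], [0.20, 0.05]]
nir = [[0.10, 0.30], [0.05, 0.40]]
mask = water_mask(green, nir)
```

Because such a mask is itself a thresholded index rather than a hand-labeled annotation, an IoU computed against it measures agreement with NDWI thresholding.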
Benchmarked against UNet, PSPNet, LinkNet, MANet, PAN, and ConvNeXt V2 on the same Sen1Floods11 protocol. ProCANet achieves F1 0.898 (UNet 0.884) and IoU 0.815 (UNet 0.791) -- gains of 0.014 F1 and 0.024 IoU over the UNet baseline on a benchmark where scores are approaching saturation.
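The two reported metrics can be computed from binary masks as in the following minimal sketch. It assumes flattened 0/1 masks and pooled pixel counts, so the benchmark's actual averaging scheme (per image vs pooled) and its treatment of no-data pixels may differ. The function names are hypothetical.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives, and false negatives
    for the water class in two flattened binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    return tp, fp, fn

def iou(pred, truth):
    """Intersection over union: TP / (TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = tp + fp + fn
    return tp / denom if denom else 1.0  # assumption: two empty masks agree

def f1(pred, truth):
    """F1 (Dice) score: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0  # assumption: two empty masks agree

pred = [1, 1, 0, 0]
truth = [1, 0, 1, 0]
scores = (iou(pred, truth), f1(pred, truth))
```

When both metrics come from the same pooled counts, they are linked by F1 = 2·IoU / (1 + IoU), so F1 is always at least as large as IoU.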
Directly applicable to disaster response pipelines serving 25M+ residents in the Citarum basin. Automated segmentation from a satellite pass reduces flood extent mapping latency from days (manual GIS) to hours, enabling earlier evacuation routing and logistics staging for BNPB (Indonesia's National Disaster Management Authority).
- Monash University — Primary research host (A/Prof Risqi U. Saputra)
- University of Indonesia — Domain validation