arXiv: GUI-SD, the first on-policy self-distillation framework for GUI grounding, outperforms GRPO across six benchmarks in accuracy and training efficiency
Yan Zhang, Daiqing Wu, and Huawen Shen presented GUI-SD, the first on-policy self-distillation (OPSD) framework built specifically for GUI grounding: the ability of an AI agent to map a natural-language instruction to the visual coordinates of the interface element it refers to. The method combines privileged visual context (a bounding box and a Gaussian soft mask) with entropy-guided distillation. Across six representative GUI grounding benchmarks, GUI-SD consistently outperforms GRPO-based RL methods in both accuracy and training efficiency.
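
To make the two ingredients named above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the summary does not give the mask formula or the entropy weighting rule, so the function names, the `sigma_scale` parameter, and the choice to weight each token's distillation term by the student's normalised entropy are all hypothetical.

```python
import math


def gaussian_soft_mask(width, height, bbox, sigma_scale=0.5):
    """Return a height x width grid of weights in [0, 1], peaking at the bbox centre.

    bbox is (x0, y0, x1, y1) in pixel coordinates. sigma_scale is a
    hypothetical knob: the standard deviation on each axis is
    sigma_scale times the box size on that axis.
    """
    x0, y0, x1, y1 = bbox
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    # Guard against degenerate (zero-size) boxes.
    sx = max((x1 - x0) * sigma_scale, 1e-6)
    sy = max((y1 - y0) * sigma_scale, 1e-6)
    return [
        [
            math.exp(-0.5 * (((x - cx) / sx) ** 2 + ((y - cy) / sy) ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(
        pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0.0
    )


def entropy_weighted_distill_loss(teacher_dists, student_dists):
    """Weighted average of per-token KL(teacher || student).

    ASSUMPTION: each token's weight is the student's entropy divided by
    the maximum possible entropy log(V), so tokens where the student is
    uncertain contribute more. The actual GUI-SD rule may differ.
    """
    vocab_size = len(student_dists[0])
    max_entropy = math.log(vocab_size) if vocab_size > 1 else 1.0
    total, weight_sum = 0.0, 0.0
    for teacher, student in zip(teacher_dists, student_dists):
        weight = entropy(student) / max_entropy
        total += weight * kl_divergence(teacher, student)
        weight_sum += weight
    return total / weight_sum if weight_sum > 0.0 else 0.0
```

For example, `gaussian_soft_mask(5, 5, (1, 1, 3, 3))` yields a grid whose largest value sits at the box centre and decays smoothly outward, and `entropy_weighted_distill_loss` returns zero whenever the student's distributions already match the teacher's.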