Visual Basic Database GUI

GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents

🤔 We identify several limitations in coordinate-generation-based methods (i.e., methods that output screen positions as text tokens x=..., y=...) for GUI grounding, including ...

IEEE

Mitigating Low-Level Visual Hallucinations Requires Self-Awareness: Database, Model and Training Strategy

Abstract: The rapid development of multimodal large language models has resulted in remarkable advancements in visual perception and understanding, consolidating several tasks into a single visual ...

