H Company’s Holo2-235B Preview sets new UI localization benchmark scores

Getting LLM-powered agents to reliably click the right button on a crowded 4K screen is still a bottleneck for automation teams. H Company is now positioning a new model as a step forward for UI element localization, the computer-vision task of mapping a text instruction like “open settings” to the exact on-screen element.

New Holo2-235B model targets UI element localization

H Company released Holo2-235B-A22B Preview as a research model focused on grounding instructions to specific UI components. The company says the model sets a new high-water mark on ScreenSpot-Pro, a benchmark whose results are tracked on a public leaderboard.
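How a localization benchmark might score a prediction can be sketched in a few lines. The sketch assumes the common click-grounding convention, in which a prediction counts as correct when the predicted point lands inside the target element's bounding box. The article does not spell out ScreenSpot-Pro's scoring rule, and the names `is_hit` and `accuracy` are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]               # (x, y) in screen pixels
Box = Tuple[float, float, float, float]   # (left, top, right, bottom) in screen pixels


def is_hit(point: Point, box: Box) -> bool:
    """Return True when the predicted point falls inside the target box."""
    x, y = point
    left, top, right, bottom = box
    return left <= x <= right and top <= y <= bottom


def accuracy(predictions: Sequence[Point], boxes: Sequence[Box]) -> float:
    """Fraction of predictions that land inside their target boxes."""
    if len(predictions) != len(boxes) or not boxes:
        raise ValueError("need one prediction per target, and at least one target")
    hits = sum(is_hit(p, b) for p, b in zip(predictions, boxes))
    return hits / len(boxes)
```

Under this convention, a small icon on a 4K screen leaves very little margin for error, which is why dense, high-resolution layouts are the hard case.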

For B2B marketers and e-commerce operators building browser automations, support bots, or internal agents, better localization can translate into fewer brittle scripts and less manual QA. In practical terms, a model that can find small icons, toggles, or menu items on dense layouts can reduce failure rates when pages change.

The model is available via Hugging Face and is framed as a preview release rather than a fully productized offering.

Agentic localization improves accuracy in a few steps

H Company emphasizes “agentic localization,” where the system iteratively refines its predicted UI coordinates over multiple steps. Think of it as a short loop of “guess, zoom/adjust, re-guess” that narrows down a tiny target, which is useful on high-resolution interfaces where a single-pass prediction can be slightly off.
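The “guess, zoom/adjust, re-guess” loop can be illustrated with a minimal sketch. This is not H Company's implementation, which the article does not describe. It assumes a hypothetical `predict` callable that returns a point relative to whatever crop it is shown, and it uses a toy predictor whose error shrinks as the crop gets smaller. `Region`, `localize`, and `make_toy_predictor` are illustrative names.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Region:
    left: float
    top: float
    width: float
    height: float

    def to_absolute(self, rel: Point) -> Point:
        """Map a (0..1, 0..1) point inside this region to screen pixels."""
        return (self.left + rel[0] * self.width, self.top + rel[1] * self.height)

    def zoom_around(self, center: Point, factor: float, bounds: "Region") -> "Region":
        """Return a region `factor` times smaller, centered on `center`, kept inside `bounds`."""
        w, h = self.width / factor, self.height / factor
        left = min(max(center[0] - w / 2, bounds.left), bounds.left + bounds.width - w)
        top = min(max(center[1] - h / 2, bounds.top), bounds.top + bounds.height - h)
        return Region(left, top, w, h)


# Hypothetical model interface: (instruction, crop) -> point relative to the crop.
Predictor = Callable[[str, Region], Point]


def localize(instruction: str, screen: Region, predict: Predictor,
             steps: int = 3, zoom: float = 4.0) -> Point:
    """Guess, zoom in around the guess, and guess again for `steps` rounds."""
    region = screen
    guess = screen.to_absolute((0.5, 0.5))
    for _ in range(steps):
        guess = region.to_absolute(predict(instruction, region))
        region = region.zoom_around(guess, zoom, screen)
    return guess


def make_toy_predictor(target: Point, grid: int = 8) -> Predictor:
    """Stand-in for a model: it only resolves the target to a coarse grid cell of the crop."""
    def snap(r: float) -> float:
        cell = min(max(int(r * grid), 0), grid - 1)
        return (cell + 0.5) / grid

    def predict(instruction: str, region: Region) -> Point:
        rx = (target[0] - region.left) / region.width
        ry = (target[1] - region.top) / region.height
        return (snap(rx), snap(ry))

    return predict


screen = Region(0, 0, 3840, 2160)
toy = make_toy_predictor(target=(3001.0, 517.0))
one_step = localize("open settings", screen, toy, steps=1)
three_steps = localize("open settings", screen, toy, steps=3)
```

With the toy predictor, each zoom shrinks the grid cells, so the three-step guess lands closer to the target than the single-step guess. A real model's error will not shrink this predictably, and each extra step costs another model call.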

On ScreenSpot-Pro, the company reports 70.6 percent accuracy in a single step, rising to 78.5 percent within three steps in agent mode. It also reports 79.0 percent on OSWorld-G. Across the Holo2 family, H Company claims the iterative approach delivers relative gains of 10 to 20 percent.
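To see how the reported ScreenSpot-Pro figures relate to the claimed relative gains, the arithmetic can be worked through directly. The snippet uses only the two accuracy numbers quoted above; the variable names are illustrative.

```python
single_step = 70.6   # reported ScreenSpot-Pro accuracy, one step (percent)
three_step = 78.5    # reported ScreenSpot-Pro accuracy, within three steps (percent)

# Absolute gain is measured in percentage points.
absolute_gain = three_step - single_step

# Relative gain expresses that difference as a share of the single-step baseline.
relative_gain = absolute_gain / single_step * 100

print(f"absolute gain: {absolute_gain:.1f} points")
print(f"relative gain: {relative_gain:.1f} percent")
```

The difference is 7.9 percentage points, or roughly 11.2 percent relative to the single-step baseline, which sits at the lower end of the 10 to 20 percent range the company cites.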

For teams evaluating AI agents for customer operations or cross-app workflows, the takeaway is that step-wise localization may be a lever to trade small increases in latency for materially higher success rates on real UI tasks.

In the near term, this release is most relevant for builders benchmarking GUI agents and testing multi-step grounding strategies, especially on modern, high-DPI web apps.
