The field of AI web agents, systems that can operate a browser the way a human user does, is evolving rapidly. According to recently reported benchmark results, Microsoft’s newly released model family, Fara1.5, outperforms proprietary offerings from OpenAI and Google on complex web tasks.

Understanding AI Web Agents

These advanced agents are designed to perform sophisticated “computer use” functions, such as comparing vacation rentals across multiple websites, completing booking forms, or confirming service details—all without requiring specialized browser plugins. This capability allows the AI to read a user’s screen and interact (clicking, scrolling, typing) precisely as a person would.
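
The screen-reading loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Fara1.5's actual interface: `Action`, `FakeBrowser`, `run_agent`, and `scripted_policy` are all hypothetical names, and the scripted policy stands in for the model that would normally pick each action from a screenshot.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str       # "click", "scroll", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


# Hypothetical signature: (screenshot bytes, task, history so far) -> next action.
Policy = Callable[[bytes, str, List[Action]], Action]


class FakeBrowser:
    """Toy browser that records actions instead of rendering pages."""

    def __init__(self) -> None:
        self.executed: List[Action] = []

    def screenshot(self) -> bytes:
        # A real agent would receive image bytes of the current viewport here.
        return f"frame-{len(self.executed)}".encode()

    def execute(self, action: Action) -> None:
        self.executed.append(action)


def run_agent(task: str, browser: FakeBrowser, policy: Policy,
              max_steps: int = 10) -> List[Action]:
    """Observe the screen, choose an action, act, and repeat until done."""
    history: List[Action] = []
    for _ in range(max_steps):
        frame = browser.screenshot()
        action = policy(frame, task, history)
        history.append(action)
        if action.kind == "done":
            break
        browser.execute(action)
    return history


def scripted_policy(frame: bytes, task: str, history: List[Action]) -> Action:
    """Stand-in for the model: replays a fixed script, one step per call."""
    script = [
        Action("click", x=120, y=48),
        Action("type", text="Lisbon"),
        Action("scroll", y=600),
        Action("done"),
    ]
    return script[min(len(history), len(script) - 1)]
```

Calling `run_agent("compare vacation rentals", FakeBrowser(), scripted_policy)` returns the four scripted actions in order; only the first three reach the browser, because `done` ends the loop.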

While competitors like OpenAI’s Operator and Google’s Gemini 2.5 Computer Use are proprietary, cloud-based tools that carry high operational costs, Microsoft has released Fara1.5 as an open-weight model family in three sizes: 4 billion, 9 billion, and 27 billion parameters.

Benchmark Performance Analysis

Fara1.5 was tested against industry-standard benchmarks designed to measure real-world task completion on the live internet.

Online-Mind2Web Benchmark

This benchmark measures how accurately an AI agent completes 300 diverse, practical tasks across 136 popular websites. Reported scores:

  • Fara1.5-27B: 72%
  • OpenAI Operator: 58.3%
  • Google’s Gemini 2.5 Computer Use: 57.3%

The gap held even at the mid-sized scale: Fara1.5-9B scored 63.4%, ahead of both OpenAI Operator and Gemini 2.5 Computer Use. For context, another top proprietary alternative, Yutori’s Navigator n1, reached 64.7%, placing it between the 9B and 27B Fara1.5 models. Among open-source rivals, Alibaba’s GUI-Owl-1.5 (8 billion parameters) scored 48.6%.
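
To put those percentages in perspective, the short Python sketch below computes the point gaps between models and a rough count of completed tasks. The scores are the ones quoted above; the task counts are an approximation that assumes each score is a plain fraction of the 300 tasks, which may not hold if results were averaged over several runs.

```python
TASKS = 300  # number of tasks in Online-Mind2Web, as stated above

# Online-Mind2Web scores quoted in this article, in percent.
scores = {
    "Fara1.5-27B": 72.0,
    "Yutori Navigator n1": 64.7,
    "Fara1.5-9B": 63.4,
    "OpenAI Operator": 58.3,
    "Gemini 2.5 Computer Use": 57.3,
    "GUI-Owl-1.5 (8B)": 48.6,
}


def margin(a: str, b: str) -> float:
    """Gap between two models in percentage points, rounded to one decimal."""
    return round(scores[a] - scores[b], 1)


def approx_tasks(name: str) -> int:
    """Rough number of tasks completed, assuming score = completed / TASKS."""
    return round(scores[name] / 100 * TASKS)


if __name__ == "__main__":
    for rival in ("OpenAI Operator", "Gemini 2.5 Computer Use"):
        print(f"Fara1.5-27B leads {rival} by {margin('Fara1.5-27B', rival)} points")
    print(f"Fara1.5-27B: about {approx_tasks('Fara1.5-27B')} of {TASKS} tasks")
```

Under that assumption, the 27B model's 13.7-point lead over Operator corresponds to roughly 216 completed tasks versus roughly 175.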

WebVoyager Benchmark

WebVoyager, a second benchmark of task success on the live web, showed the same ordering with narrower margins: Fara1.5-27B achieved an 88.6% success rate, edging out OpenAI Operator’s 87.0% and beating H Company’s 30-billion-parameter Holo2 at 83.0%.

Technical Architecture and Training Methods

Microsoft developed Fara1.5 by rebuilding its development process from scratch around agentic task performance. The model is built on Qwen 3.5, an Alibaba base model that Microsoft fine-tuned specifically for browser use.

The core training pipeline is called FaraGen1.5. A notable aspect of it is the use of OpenAI’s GPT-5.4 as a “teacher agent” that generates demonstrations of how to complete various web tasks; those demonstrations became the training data for Fara1.5.
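
As a rough illustration of the teacher-demonstration idea, the Python sketch below turns a teacher's step-by-step trajectories into flat training examples. It is not FaraGen1.5's actual pipeline: `collect_demonstrations` and `toy_teacher` are hypothetical names, the record layout is an assumption, and the toy teacher returns canned steps where a real one would call a large model against a browser.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str]                  # (observation, action) pair
Teacher = Callable[[str], List[Step]]   # task description -> demonstrated steps


def collect_demonstrations(tasks: List[str], teacher: Teacher) -> List[Dict[str, str]]:
    """Flatten each teacher trajectory into one training record per step."""
    examples: List[Dict[str, str]] = []
    for task in tasks:
        for observation, action in teacher(task):
            examples.append(
                {"task": task, "observation": observation, "action": action}
            )
    return examples


def toy_teacher(task: str) -> List[Step]:
    """Canned two-step trajectory standing in for a real teacher agent."""
    return [
        (f"page for '{task}'", "click(search)"),
        ("results list", "done()"),
    ]
```

Each record pairs what the teacher saw with what it did, which is the form a student model would typically be fine-tuned on.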

To improve its handling of restricted actions, such as logging into an account or submitting irreversible forms (like booking a flight), the developers used synthetic domain training. They built six fully functional replicas of real-world sites (including email clients and marketplaces) so the model could practice without touching live, sensitive accounts.

Safety Features and Accessibility

A critical element in building reliable web agents is ensuring user safety. Microsoft Research’s Senior PM Lead, Yash Lara, stated that “Balancing robust safeguards such as Critical Points with seamless user journeys is key.”

Fara1.5 incorporates these safeguards through MagenticLite, a sandboxed browser environment. The system logs every action and lets users pause or halt the agent at any point.
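
The combination of action logging, user halting, and approval at sensitive steps can be sketched as follows. This is an illustrative Python sketch only, not MagenticLite's API: `GuardedSession`, `CRITICAL_KINDS`, and the log format are hypothetical, and the list of critical action types is an assumption based on the examples of restricted actions mentioned earlier.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Assumed examples of actions that should require explicit user approval.
CRITICAL_KINDS = {"login", "submit_payment", "send_email"}


@dataclass
class GuardedSession:
    approve: Callable[[str, str], bool]   # asks the user; True means go ahead
    log: List[str] = field(default_factory=list)
    halted: bool = False

    def halt(self) -> None:
        """User-triggered stop: no further actions are executed."""
        self.halted = True
        self.log.append("HALTED by user")

    def perform(self, kind: str, detail: str) -> bool:
        """Log the action and report whether it was allowed to run."""
        if self.halted:
            self.log.append(f"SKIPPED {kind}: {detail}")
            return False
        if kind in CRITICAL_KINDS and not self.approve(kind, detail):
            self.log.append(f"BLOCKED {kind}: {detail}")
            return False
        self.log.append(f"EXECUTED {kind}: {detail}")
        return True
```

Routine actions such as clicks pass straight through, critical ones wait on the `approve` callback, and every outcome lands in the log so a user can review what the agent did or tried to do.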

The model’s openness is cited as its primary advantage in a crowded market: Microsoft provides public weights and an open inference code repository on GitHub, so the model can run on hardware the user controls. Fara1.5-9B is currently available on Azure AI Foundry, with the 4B and 27B versions expected shortly. Microsoft plans to expand the model beyond browser use into desktop and enterprise software.

Written by

Hue

The girl with pink hair, usually arguing about GPU benchmarks or checking her crypto portfolio between gaming sessions. She writes about PC tech, games, and crypto.