Anthropic’s Latest AI that Moves a Cursor for You

Anthropic

One of the renowned AI software companies Anthropic has just announced its new tool. A tool that can take over the control of the user’s cursor and perform the basic tasks on your desktop.

Not just a new tool but here are some other improvements to Anthropic’s Claude and Haiku models. This tool is called “Computer Use”.  Users can use this tool which is available exclusively with the company’s mid-range 3.5 Sonnet model right now via the API. A user can give multi-step instructions to complete any task on their computer by “looking at a screen, moving a cursor, clicking buttons and typing text.

How does Anthropic’s latest AI Tool Work?

Here is how the Anthropic’s latest AI tools work, when a developer tasks Claude with using a piece of computer software and gives it the necessary access. Then Clause looks at screenshots of what’s visible to the user and counts how many pixels horizontally or vertically it needs to move a cursor to click in the better place. While training Claude, it was quite difficult to count pixels accurately, because without this it was difficult to give the cursor any commands.

This tool also comes with many limitations as it can only operate by rapid successive screengrabs rather than just working with a live video stream.  Due to this, it can miss short-lived notifications or any other changes. But still, the tool is good for performing some common actions like drag and drop. So, developers are starting to use these low-risk tasks first while anthropic works to improve its capabilities with time.

The feature that Anthropic itself described as “at times cumbersome and error-prone” but still has been embraced by companies like Asana and DoorDash.

Claude Performance:

Anthropic with the new AI tool also announced a major upgrade to its Claude family of AI models. They are updating the already impressive Claude 3.5 Sonnet and releasing the New Claude 3.5 Haiku.

Anthropic further stated that Sonnet outperforms OpenAI-4o and Google’s Gemini 1.5 Pro on reasoning tasks, coding, visual analysis, and graduate-level reasoning tasks. So there seems a significant improvement for AI-powered coding in particular and GitLab. They found giving stronger reasoning of up to 10% across use cases with no added latency. Thus making an ideal choice for multi-step software development.

The upgraded 3.5 Sonnet self-corrects and retires tasks when it encounters obstacles. This tool can also work perfectly well, where there requires dozens or hundreds of steps.

Moreover the new Claude 3.5 Haiku is now capable as Claude 3.0 Opus, its large model. That is three times faster than its peers and outperform Claude 3.5 Sonnet and GPT-4o in most tests.