New AI Model Could Control Computers

By John Lister

The makers of an AI model claim it can take control of a computer and carry out tasks such as filling in forms. The latest version of "Claude" takes a screenshot and counts pixels to figure out how far to move the cursor.

Anthropic, which made Claude, says this is the first time a publicly released AI model has had the capability of "computer use". It defines this as "looking at a screen, moving a cursor, clicking buttons, and typing text."

The goal is to allow the model to carry out tasks which go beyond simply generating text or images in line with a user's instructions. Instead, it could actually use this text, for example to send an email or fill out an online form to book a trip.

While inputting text or even simulating a mouse movement isn't a particularly difficult task to automate on a computer, figuring out where to move and click the cursor on the screen is trickier. The feature works by taking a screenshot, identifying the necessary location, then counting the number of pixels to "move" to that location.
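To make the pixel-counting idea concrete, here is a minimal sketch in Python. It is not Anthropic's code: the names Point, centre_of and pixel_offset, and all of the coordinates, are made up for illustration. It assumes the model has already worked out from a screenshot where a button sits, and simply calculates how far the cursor has to travel to reach the middle of it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """A screen position in pixels, measured from the top-left corner."""
    x: int
    y: int


def centre_of(left: int, top: int, width: int, height: int) -> Point:
    """Return the centre of a rectangular on-screen element (hypothetical helper)."""
    return Point(left + width // 2, top + height // 2)


def pixel_offset(cursor: Point, target: Point) -> tuple[int, int]:
    """Return (dx, dy): how many pixels to move right and down to reach the target."""
    return target.x - cursor.x, target.y - cursor.y


# Assumed values: where the cursor is now, and where a button was
# located in the screenshot.
cursor = Point(100, 200)
button = centre_of(left=640, top=480, width=120, height=40)

dx, dy = pixel_offset(cursor, button)
print(f"Move {dx} pixels right and {dy} pixels down")  # Move 600 pixels right and 300 pixels down
```

The arithmetic is the easy part. The hard part, as noted above, is recognizing the button in the screenshot in the first place.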

Drag-And-Drop Off The Table

In its current form, the tool can only work with a rapid series of screenshots rather than video of the screen. That means it struggles to react to pop-up notifications or to replicate a "drag-and-drop" operation that a human could do. (Source: arstechnica.com)
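A short Python sketch shows why working from separate screenshots can miss a brief pop-up. The timings and the function name snapshots_that_see are invented for illustration and do not reflect how often Claude actually captures the screen. The point is only that anything appearing and disappearing between two captures is never seen.

```python
def snapshots_that_see(snapshot_times, appears_at, disappears_at):
    """Return the capture times at which a pop-up was actually on screen."""
    return [t for t in snapshot_times if appears_at <= t < disappears_at]


# Assumed capture times, in seconds. A video feed would have no gaps.
snapshot_times = [0.0, 2.0, 4.0, 6.0]

# A notification shown from 2.5s to 3.5s falls between two captures.
print(snapshots_that_see(snapshot_times, 2.5, 3.5))  # []

# One shown from 3.5s to 4.5s is caught by the capture at 4.0s.
print(snapshots_that_see(snapshot_times, 3.5, 4.5))  # [4.0]
```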

For now, ordinary users can't simply run Claude and access this feature. Instead, it's only available to third-party developers who create applications using the model. They'll be able to translate user instructions into computer commands.

Risk Reduction Request

Anthropic gives the example of a user typing "use data from my computer and online to fill out this form" and the AI tool carrying out the sequence of tasks: "check a spreadsheet; move the cursor to open a web browser; navigate to the relevant web pages; fill out a form with the data from those pages." (Source: anthropic.com)

Anthropic is also clear that the computer use element is very much at the beta stage. It admits the feature "is still experimental - at times cumbersome and error-prone. We're releasing computer use early for feedback from developers, and expect the capability to improve rapidly over time."

What's Your Opinion?

Is there any value in trying to achieve this goal? What would it take for you to be happy to use such a feature on your PC? Is this too risky given the potential abuse by hackers?

Comments

Dennis Faas's picture

This and similar AI models will be used to spam forums and social media - plain and simple!