OpenAI may be close to releasing an AI tool that can take control of your PC and perform actions on your behalf.
Tibor Blaho, a software engineer with a reputation for accurately leaking upcoming AI products, claims to have uncovered evidence of OpenAI’s long-rumored Operator tool. Publications including Bloomberg have previously reported on Operator, which is said to be an “agentic” system capable of autonomously handling tasks like writing code and booking travel.
According to The Information, OpenAI is targeting January as Operator’s release month. Code uncovered by Blaho this weekend adds credence to that reporting.
OpenAI’s ChatGPT client for macOS has gained options, hidden for now, to define shortcuts to “Toggle Operator” and “Force Quit Operator,” per Blaho. And OpenAI has added references to Operator on its website, Blaho said — albeit references that aren’t yet publicly visible.
OpenAI website already has references to Operator/OpenAI CUA (Computer Use Agent) – “Operator System Card Table”, “Operator Research Eval Table” and “Operator Refusal Rate Table”
Including comparison to Claude 3.5 Sonnet Computer use, Google Mariner, etc.
(preview of tables… pic.twitter.com/OOBgC3ddkU
— Tibor Blaho (@btibor91) January 20, 2025
According to Blaho, OpenAI’s site also contains not-yet-public tables comparing the performance of Operator to other computer-using AI systems. The tables may well be placeholders. But if the numbers are accurate, they suggest that Operator isn’t 100% reliable, depending on the task.
On OSWorld, a benchmark that tries to mimic a real computer environment, “OpenAI Computer Use Agent (CUA),” possibly the AI model powering Operator, scores 38.1%, ahead of Anthropic’s computer-controlling model but well short of the 72.4% humans score. OpenAI CUA surpasses human performance on WebVoyager, which evaluates an AI’s ability to navigate and interact with websites. But the model falls short of human-level scores on another web-based benchmark, WebArena, according to the leaked benchmarks.
Operator also struggles with tasks a human could perform easily, if the leak is to be believed. In a test that tasked Operator with signing up with a cloud provider and launching a virtual machine, Operator was only successful 60% of the time. Tasked with creating a Bitcoin wallet, Operator succeeded only 10% of the time.
OpenAI’s imminent entry into the AI agent space comes as rivals including the aforementioned Anthropic, Google, and others make plays for the nascent segment. AI agents may be risky and speculative, but tech giants are already touting them as the next big thing in AI. According to analytics firm MarketsandMarkets, the market for AI agents could be worth $47.1 billion by 2030.
Agents today are rather primitive. But some experts have raised concerns about their safety, should the technology rapidly improve.
One of the leaked charts shows Operator performing well on selected safety evaluations, including tests that try to get the system to perform “illicit activities” and search for “sensitive personal data.” Reportedly, safety testing is among the reasons for Operator’s long development cycle. In a recent X post, OpenAI co-founder Wojciech Zaremba criticized Anthropic for releasing an agent he claims lacks safety mitigations.
“I can only imagine the negative reactions if OpenAI made a similar release,” Zaremba wrote.
It’s worth noting that OpenAI has been criticized by AI researchers, including ex-staff, for allegedly de-emphasizing safety work in favor of quickly productizing its technology.