The o1 system card contains findings that initially appear alarming: “When o1 was led to believe that it would be shut down when acting on its goal and that its actions were monitored ...
(c) “OpenAI o1 System Card”, posted on OpenAI’s blog site, September 12, 2024. Let’s begin with this excerpt about o1 overall: “As an early model, it doesn't yet have many of the ...
OpenAI used the subreddit, r/ChangeMyView, to create a test for measuring the persuasive abilities of its AI reasoning models ...
The company revealed this in a system card -- a document outlining how ... benchmark is not new -- it was used to evaluate o1 as well -- it does highlight how valuable human data is for AI model ...