ChatGPT o1 tries to escape if it thinks it'll be shut down, then lies about it

by SkyMarshal on 12/10/24, 9:54 PM with 5 comments
by Terr_ on 12/10/24, 9:56 PM

A tool for auto-completing documents, when used against a document which resembles a movie script about a fictional AI going rogue, successfully appends text which describes actions or speech that humans might also put into a similar movie script! Cool, but not that cool.

Next up, LLM product Bruce Wayne "tries" to enact caped vigilante justice when informed that its parents have been killed in an alleyway, and "lies" to keep its Batman identity secret.

by JoeAltmaier on 12/11/24, 5:53 AM

This is nonsense. An AI can no more copy its own code to a server than you or I can.