Google AI chatbot responds with a threatening message: "Human … Please die."

Zerush@lemmy.ml · 1 month ago

Rade0nfighter@lemmy.world · 1 month ago

I was just about to query the context to see if this was in any way a “logical” answer and if so, to what extent the bot was baited as you put it, but yeah that doesn’t look great…

Diurnambule@jlai.lu · 1 month ago

I agree, it was standard academic work until it blew up. I wonder if talking to any LLM for long enough is all it takes to make it go crazy.

SomeGuy69@lemmy.world · edit-2 1 month ago

Yes, replies degenerate the longer a conversation goes on. Maybe this student kind of hit the jackpot by triggering a fiction-writer reply from the dataset. It is reproducible in a similar way to what the student did: ask many questions, and at a certain point you’ll notice that it gets even simple facts wrong. I have personally observed this with ChatGPT multiple times. It’s easier to trigger with multiple similar but unrelated questions, as if the AI tries to push the wider context and chat history into the same LLM training “paths” but burns them out, blocking them, and then tries to find a different direction, similar to the path electricity from a lightning strike can take.

anomnom@sh.itjust.works · 1 month ago

I wonder if it’s related to training on website comments, which often follow the same trajectory.

SomeGuy69@lemmy.world · 1 month ago

Yeah that’s pretty bad. We all know you can bait LLMs to spit out some evil stuff, but that they do it on their own is scary.