• 0 Posts
  • 7 Comments
Joined 1 year ago
Cake day: June 26th, 2023

  • Why do the leaders in AI know so little about it? Transformers are completely incapable of maintaining any internal state, yet techbros somehow think they will magically have one. Sometimes machine learning can be more of an art than a science, but they seem to think it’s alchemy. They think they’re making pentagrams out of acyclic graphs, but are really just summoning a mirror of their own stupidity.

    It’s really unfortunate, since they drown out all the news about novel and interesting methods of machine learning. KANs, DNCs, and Mamba all have a lot of promise, but can’t get any recognition because transformers are the laziest and most dominant method.

    Honestly, I think we need another winter. All this hype is drowning out any decent research, so all we are getting are bogus tests and experiments that are irreproducible because they’re so expensive. It’s crazy how unscientific these ‘research’ organizations are. And OpenAI is being paid by Microsoft to basically jerk off Sam Altman. It’s plain shameful.


  • Recently, research has suggested that LLMs can solve moderately more difficult problems if prompted to use “chain of thought” (CoT) reasoning. In CoT, the LLM essentially pretends to be thinking about the problem, coming up with a couple of intermediate stages to process it. Of course, this doesn’t really stop it from giving bad solutions to established problems, but it does make it better at novel problems.

    This whole thing reminds me of the fable of the scorpion and the frog crossing the river. It is simply the nature of the scorpion to act like a scorpion, regardless of what intelligence we ascribe to it.
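
    For anyone unfamiliar with CoT prompting, here is a minimal sketch of the difference between a direct prompt and a CoT prompt. The helper name `build_prompt` is made up for illustration, and the trailing cue is just one commonly used phrasing, not the only one; no model is actually called here.

```python
def build_prompt(question: str, chain_of_thought: bool = False) -> str:
    """Return a direct prompt, or one that asks for intermediate steps first."""
    if not chain_of_thought:
        return f"Q: {question}\nA:"
    # The trailing cue nudges the model to write out intermediate steps
    # before committing to a final answer.
    return f"Q: {question}\nA: Let's think step by step."


direct = build_prompt("What is 17 * 24?")
cot = build_prompt("What is 17 * 24?", chain_of_thought=True)
print(direct)
print(cot)
```

    The only change is the text the model is asked to continue, which is why the "thinking" is just more generated tokens.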







  • This might be happening because of the ‘elegant’ (incredibly hacky) way OpenAI encodes multiple languages into their models. Instead of using all character sets, they apply a modulo operator to each character, so that all Unicode characters are represented by a small range of values. On the back end, it somehow detects which language is being spoken and uses that character set for the response. Seeing as the last line seems to be the same mathematical expression as what you asked, my guess is that your equation just happened to perfectly match some sentence that would make sense in the weird language.
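
    To illustrate the kind of collision I mean, here is a toy sketch of a modulo-based encoding. This is only the scheme as I described it, not a confirmed picture of any real tokenizer; the function name `fold` and the modulus of 256 are assumptions picked for the example.

```python
def fold(text: str, modulus: int = 256) -> list[int]:
    """Hypothetical scheme: reduce each Unicode code point modulo a small number."""
    return [ord(ch) % modulus for ch in text]


latin_a = fold("A")          # U+0041 LATIN CAPITAL LETTER A
cyrillic_es = fold("\u0441")  # U+0441 CYRILLIC SMALL LETTER ES

# Both characters land on the same value, so the folded form alone
# cannot tell which script the text came from.
print(latin_a, cyrillic_es)
```

    Under a scheme like this, the same sequence of values could decode to text in more than one script, depending on which character set the back end picks.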