Rainer Burkhardt@lemmy.world to Programmer Humor@programming.dev • It must be a silent R • English • 2 months ago
I can evaluate this because it’s easy for me to count. But how can I evaluate something else? How can I know whether the LLM is good at it or not?
Rainer Burkhardt@lemmy.world to solarpunk memes@slrpnk.net • I'm tired, boss • 3 months ago
I did this too and I didn’t get fired. Maybe it depends on the wording?