• flashgnash@lemm.ee
      29 days ago

      On the flip side, training AI for image recognition has the potential to auto-label images for the blind.

      Either the website owners could generate descriptions themselves when a human-written one isn’t provided, or a browser extension could auto-label any unlabelled images on the screen.
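
      A minimal sketch of the auto-labelling idea, in Python rather than as a real browser extension. It only finds images with missing or blank alt text and asks a captioning function for a description. `describe` is a hypothetical stand-in for whatever image recognition model would actually be used, and both `UnlabelledImageFinder` and `label_images` are names made up for illustration.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List


class UnlabelledImageFinder(HTMLParser):
    """Collect the src of every <img> whose alt text is missing or blank."""

    def __init__(self) -> None:
        super().__init__()
        self.unlabelled: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # A valueless attribute (<img alt>) is reported as None, so fall back to "".
        alt = attr_map.get("alt") or ""
        if not alt.strip():
            self.unlabelled.append(attr_map.get("src") or "")


def label_images(html: str, describe: Callable[[str], str]) -> Dict[str, str]:
    """Map each unlabelled image src to a generated description.

    `describe` is a hypothetical captioner: it takes an image src and
    returns text. No real model is called here.
    """
    finder = UnlabelledImageFinder()
    finder.feed(html)
    return {src: describe(src) for src in finder.unlabelled}
```

      Note that this treats an empty alt as unlabelled, which is a simplification: an empty alt is also the usual way to mark a purely decorative image, so a real tool would need a way to tell the two cases apart.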

      • ChaoticNeutralCzech@feddit.org
        29 days ago

        They’re probably going to make a deal with Google to improve Google Lens. Yes, it will eventually help the blind, but Reddit’s shareholders will be getting even richer from people’s donated time.

      • Grimy@lemmy.world
        29 days ago

        I’m sure blind people are happy to have the models that are built with this data, and since both the image and the description are public-facing, anyone can use them, including open-source projects.

  • N00b22@lemmy.ml
    28 days ago

    Wait, what? I only use Old Reddit and Infinity on mobile, so idk what they are doing.

  • mfat@lemmy.ml
    29 days ago

    Is this why we are solving motorcycle, stairs, fire hydrant, etc. captchas?

  • apotheotic (she/her)@beehaw.org
    29 days ago

    I am missing a small amount of context: is Reddit randomly prompting users to describe images in posts? Or is it prompting you to describe your own image at upload time?

    Context aside, I definitely think that providing image descriptions is something we should do, even though it’s definitely going to be used to train AI. Choosing not to do so throws our blind peers under the bus just to fractionally reduce the amount of training data for AI.

    • Robust Mirror@aussie.zone
      29 days ago

      I haven’t been there in a while, but I remember a sub of volunteers who spent years going around just describing images, way before AI and LLMs were really a thing.

      I’m assuming this is something new being pushed by Reddit itself, but as you said, it’s a good thing regardless.

      • apotheotic (she/her)@beehaw.org
        29 days ago

        As long as they are actually still using the descriptions to make those images accessible, even if Reddit is also using them to train LLMs. I don’t take that for granted.

        • morrowind@lemmy.ml
          29 days ago

          It’s supposed to be similar, not necessarily the same. The idea is something like this:

          1. Anakin: something
          2. Padme: but [normal thing] right
          3. Anakin: …
          4. Padme: but [more basic thing] right??
          5. (implied that not even that)

          The implication here is that not only is the primary purpose not for the blind, but it won’t help them at all

          (am I overthinking this?)

  • circuitfarmer@lemmy.sdf.org
    29 days ago

    At this point, any request for information could potentially be used as training data. That includes things like captchas.

    I recommend everyone have an extremely literal interpretation of “labor”. Unless you have tremendous insight into where your data is going and how it is being used (and perhaps even then), assume any ask is ultimately an ask for unpaid labor.

    Obviously you can’t avoid things like captchas, but you can avoid things like this.

    Edit: and it should go without saying, but anything you upload to socials is probably automatic training data at this point. The best approach is simply not to engage with corporate social networks.

    Though Lemmy is not corporately controlled, the information is publicly accessible, so even this post is potential training data to be scraped. That is harder to avoid, lest we stop using the internet altogether, but at least avoiding the corpo routes is a good start.

    • AndrasKrigare@beehaw.org
      28 days ago

      Bear in mind, with this liberal interpretation, any time you access a website, you are also consuming someone’s labor, and if you don’t have a subscription to it, that labor is unpaid.

    • flashgnash@lemm.ee
      29 days ago

      Captchas have been used to train AI for years; that’s nothing new. Iirc the reason you solve two is that one confirms you’re human and the other provides training data.