• parpol@programming.dev
    3 months ago

    considering all GitHub projects (including private ones that didn’t explicitly opt out) were used for training AI. GitHub absolutely went to hell after the acquisition. I would never use GitHub for this and many other reasons, and I will never again use GitLab if the same thing happens to it.

    • bamboo@lemm.ee
      3 months ago

      Every open source license grants permission for AI training, and GitHub Copilot by default rejects completions that exactly match code from its training data. You can’t pretend to be pro-open source or pro-free software but at the same time be upset that people are using licensed software within its license terms.

      • parpol@programming.dev
        3 months ago

        Not all projects on GitHub use the same open source license. I don’t have a problem with scraping on projects that allow it. I have a problem with scraping on the ones that don’t.

        • bamboo@lemm.ee
          3 months ago

          If a license forbids LLM training, it is by definition not open source.

          • parpol@programming.dev
            3 months ago

            Code being visible for anyone to see is open source. The license for that code has nothing to do with it. You’re thinking of FOSS.

      • kus@programming.dev
        3 months ago

        If you use AGPLv3 code for training your LLM, shouldn’t the code it spits out also be AGPLv3?

        • bamboo@lemm.ee
          3 months ago

          Only if you can reasonably argue that the output is the input (even with exact matches over a certain size being auto-rejected), and that it is enough to qualify as a copyrightable work. I’d argue line completions can never be enough to be copyrightable, and even a short function barely meets the bar unless it is considered creative in some way.