• IchNichtenLichten@lemmy.world
        English
        arrow-up
        22
        ·
        1 year ago

        You’ll get your refund eventually, but first it will try to gaslight you into believing that Air Canada is a woke mind virus before calling you an asshole and then stalking you.

        • pdxfed@lemmy.world
          English
          arrow-up
          3
          ·
          1 year ago

          “instead of the $3.50 refund, I’m also authorized to offer you some June 2025 $350 GME calls.”

    • honey_im_meat_grinding@lemmy.blahaj.zone
      English
      arrow-up
      30
      arrow-down
      1
      ·
      1 year ago

      What possible use is that?

      I’ve noticed “has this sub gotten more right wing recently?” posts becoming the top post of the day over the last 6 months or so, r/norge and r/unitedkingdom being examples. You can automate bots that change a subreddit’s consensus on certain topics by bot-spamming threads pertaining to those topics, especially in the first hour after a thread goes up. I don’t know if that’s happening, or if it has more to do with the Reddit protest that saw mods abdicate their positions last June and new mods being responsible for the change… but it could also be a bit of both.

      • mryessir@lemmy.sdf.org
        English
        arrow-up
        1
        ·
        1 year ago

        Do you propose more bots in order to steer public opinion? That could indeed generate serious money for Reddit, I suppose!

    • FaceDeer@kbin.social
      arrow-up
      12
      arrow-down
      1
      ·
      1 year ago

      Negative examples are often just as useful for training an AI as positive ones. And it all depends on what you want to use the AI for. A moderator bot, for example, needs familiarity with the whole range of user responses it might see.
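
A minimal sketch of that point, assuming a toy word-count classifier. The training comments, labels, and function names here are made up for illustration; a real moderator bot would use a far larger model and dataset. The scorer can only tell the two kinds of comment apart because it has seen both:

```python
import math
from collections import Counter


def train(examples):
    """Count word occurrences per label from (text, label) pairs."""
    counts = {"ok": Counter(), "remove": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def classify(counts, text):
    """Pick the label whose training words best match the text (add-one smoothing)."""
    vocab = set(counts["ok"]) | set(counts["remove"])
    scores = {}
    for label, words in counts.items():
        total = sum(words.values()) + len(vocab)
        scores[label] = sum(
            math.log((words[w] + 1) / total) for w in text.lower().split()
        )
    return max(scores, key=scores.get)


# Hypothetical training data: the "remove" rows are the negative examples.
EXAMPLES = [
    ("thanks for the helpful answer", "ok"),
    ("great write up thanks", "ok"),
    ("you are an idiot", "remove"),
    ("shut up idiot", "remove"),
]

model = train(EXAMPLES)
print(classify(model, "thanks for the answer"))
print(classify(model, "what an idiot"))
```

Drop the "remove" rows and the model has nothing to compare against, so every comment would look equally acceptable.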

      • aidan@lemmy.world
        English
        arrow-up
        5
        ·
        edit-2
        1 year ago

        That actually gives me a fun idea for a Lemmy instance: it has an automated review process that bans posts/comments that are too similar in style to Reddit posts/comments.

    • leaky_shower_thought@feddit.nl
      English
      arrow-up
      9
      ·
      1 year ago

      A redditor bot is a viable example of a forum member bot.

      IMO it can’t drive topics, but it could make things controversial.

    • Lvxferre [he/him]@mander.xyz
      English
      arrow-up
      6
      ·
      1 year ago

      An LLM that behaves like a typical Redditor? // What possible use is that?

      • [You] “Chatbot, please tell me which pokemon types are strong against Fairy.”
      • [Le Lebbit Moronbot] “I’m not sure if I understand, you calling me a chatbot? I’m so confused lol”
      • [You] “Moronbot, please tell me which pokemon types are strong against Fairy.”
      • [LLM] “Actually, you should be spelling it “Pokémon” lol”
      • [You] “Moronbot, which types are strong against Fairy?”
      • [LLM] “I assume you talking about fairies. Fairies are from mythology lmao”
      • [You] “Did people really waste water and electricity for this trash?”
      • [LLM] “Waaah, you’re toxic!!111one”
  • garibaldi_biscuit@lemmy.world
    English
    arrow-up
    113
    ·
    1 year ago

    This is what the 3rd-party API access change was really all about.

    When API access was allowed, all Reddit content was effectively free. They needed to ban 3rd-party apps so they could sell the accumulated content. I expect using content to train AI also factors into it.

    • bier@feddit.nl
      English
      arrow-up
      15
      arrow-down
      3
      ·
      1 year ago

      Is it? Because even when you build a bot and just scrape Reddit, I don’t think you can simply use the content to train AI, just as with the New York Times. The API change was definitely to sell more ads and get a higher IPO, but I don’t think it was because of AI.

      • Empricorn@feddit.nl
        English
        arrow-up
        5
        ·
        1 year ago

        Am I crazy or are you arguing the same point? Scraping is not the same as API access. They closed off the API to everyone for dubious reasons so they can sell that content (both for ads and AI training)… Right??

        • bier@feddit.nl
          English
          arrow-up
          1
          arrow-down
          1
          ·
          edit-2
          1 year ago

          No, you’re not; the post was edited. The original one said it was all because of AI, that the entire reason for the API change was to sell to AI companies.

          Edit: now I’m in doubt, because if you edit a post, that is shown somehow, right?

          Edit 2: just to be clear, my point is that Reddit content was never free, before or after the API change. It’s easier to get the content with a decent API, sure. But it was never free, just as in the lawsuit the NY Times started.

  • Tiger Jerusalem@lemmy.world
    English
    arrow-up
    110
    arrow-down
    4
    ·
    edit-2
    1 year ago

    Reddit is a trove of user-built content under the guise of community. What Spez did was say “thanks for all the free work, suckers!”, put a price sticker on it, and laugh all the way to the bank.

    And this is why I’m not active on any Internet community anymore. Never mind, I guess I just can’t help myself…

    • nodsocket@lemmy.world
      English
      arrow-up
      36
      arrow-down
      2
      ·
      1 year ago

      And this is why I’m not active on any Internet community anymore,

      you typed.

        • Crack0n7uesday@lemmy.world
          English
          arrow-up
          8
          ·
          1 year ago

          Some 4chan users created a backup bot that auto-saves every few hours, so if Reddit didn’t do it already, 4chan has been doing it for a while. The bot was originally made for 4chan but was repurposed for other websites, Reddit included.

        • Dozzi92@lemmy.world
          English
          arrow-up
          5
          ·
          1 year ago

          Yeah, it’s all too late. Shit, PRISM was 2007, so there’s a copy of everything somewhere. Obviously different ends.

          • Ilgaz@lemm.ee
            English
            arrow-up
            3
            ·
            1 year ago

            Spez-like people are even capable of leeching archive.org and then selling the data, which was archived with good intentions.

        • RBG@discuss.tchncs.de
          English
          arrow-up
          1
          ·
          1 year ago

          Depends. If they were smart, they backed up all content that had a certain number of upvotes and/or a certain number of paragraphs and/or responses, just to weed out all the 2-3 word comments that no one interacted with. If OP wrote mostly those, then Reddit doesn’t give a shit about them deleting those.
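
A rough sketch of what such a filter could look like. The `Comment` fields, the threshold values, and the `worth_archiving` name are all hypothetical; nothing here reflects how Reddit actually stores or selects content:

```python
from dataclasses import dataclass


@dataclass
class Comment:
    body: str
    upvotes: int
    replies: int


def worth_archiving(c, min_upvotes=10, min_paragraphs=2, min_replies=3):
    """Keep a comment if it clears any one of the (hypothetical) thresholds."""
    # Paragraphs are assumed to be separated by blank lines.
    paragraphs = [p for p in c.body.split("\n\n") if p.strip()]
    return (
        c.upvotes >= min_upvotes
        or len(paragraphs) >= min_paragraphs
        or c.replies >= min_replies
    )


comments = [
    Comment("lol", upvotes=1, replies=0),
    Comment("First point.\n\nSecond point.", upvotes=2, replies=0),
    Comment("this", upvotes=250, replies=0),
]
kept = [c for c in comments if worth_archiving(c)]
print(len(kept))
```

The short, ignored "lol" is dropped, while the two-paragraph comment and the heavily upvoted one are kept.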

  • Verserk@lemmy.dbzer0.com
    English
    arrow-up
    82
    ·
    1 year ago

    Considering some of the very wrong and upvoted domain-specific knowledge I’ve seen on Reddit over the years, I’m not sure the training data is going to be useful for much beyond what every other model can do.

  • Voyajer@lemmy.world
    English
    arrow-up
    62
    arrow-down
    1
    ·
    1 year ago

    This is why I don’t blame anyone for editing/deleting their post history on reddit.

    • FaceDeer@kbin.social
      arrow-up
      13
      arrow-down
      131
      ·
      1 year ago

      I do. It’s frankly selfish. Having an AI get training on my old comments costs me nothing and it results in the development of useful AI tools. Trying to sabotage that is petty and pointless. It’s not like you could somehow collect the fraction of a pittance that you think you’re owed retroactively. I never commented on Reddit thinking “awesome, I’m going to make bank on the content I’m generating here.”

      People complain about the capitalist mindset of the world and then they do this. Sigh.

      • Nurse_Robot@lemmy.world
        English
        arrow-up
        92
        arrow-down
        4
        ·
        1 year ago

        Defending giant corporations profiting off of uncompensated individuals, while criticizing anyone who doesn’t want to provide free labor to said corporations, is a disgusting take. Are you a CEO?

        • FaceDeer@kbin.social
          arrow-up
          7
          arrow-down
          51
          ·
          1 year ago

          The more accessible training data there is, the easier it is for new AI projects to enter the field and the less dominant those “giant corporations” become.

          The free labour was already freely given. If someone doesn’t want to have shitposted on Reddit for free then maybe they shouldn’t have shitposted on Reddit for free.

          • Nurse_Robot@lemmy.world
            English
            arrow-up
            37
            arrow-down
            3
            ·
            1 year ago

            “if you didn’t want me to steal your intellectual property, you shouldn’t have thought of it in the first place”

            • QuaternionsRock@lemmy.world
              English
              arrow-up
              15
              arrow-down
              4
              ·
              edit-2
              1 year ago

              No, you shouldn’t have posted it to Reddit, where you were required to give them a perpetual license to use your IP in any way they see fit.

              For the record, I’m here because Reddit pissed me off when they axed the free API, and I’m pissed at myself for not expecting it. That’s what I get for accepting their terms and conditions, I guess.

              Edit: I also don’t accept the idea that using my content for training data is “fair use” when it is used to train proprietary models, especially ones in which the end user is allowed to prompt it to plagiarize or otherwise imitate my content.

            • Fungah@lemmy.world
              English
              arrow-up
              10
              arrow-down
              1
              ·
              1 year ago

              So, for an example of what the other user was talking about: I’m just some guy, and for my first foray into programming / machine learning (I kind of just threw myself into the deep end) I modified StyleGAN3 and trained it on about 500g of Reddit porn that I scraped off Reddit.

              Now, I stopped the training after about a week (it was going to take about a solid month on my RTX 2080 Ti) when I found out Stable Diffusion existed, but I learned a LOT from that experience.

              I couldn’t do that now. Arguably none of that was how any of that should be done but whatever.

            • FaceDeer@kbin.social
              arrow-up
              4
              arrow-down
              30
              ·
              1 year ago

              I’m not sure what you mean here. Nothing’s being stolen. Even if you think there needs to be permission for training an AI off of data, Reddit has that permission.

              • Nurse_Robot@lemmy.world
                English
                arrow-up
                23
                arrow-down
                5
                ·
                1 year ago

                I assume you’re more of a moron than a troll, which is disappointing. Regardless, you’re not worth my time, as I don’t think any argument could convince you to have an open mind and be willing to change. Good luck out there!

      • TORFdot0@lemmy.world
        English
        arrow-up
        35
        arrow-down
        1
        ·
        1 year ago

        I had an 11-year-old account, and I deleted all my old comments and posts from it because of the API debacle. Does that make me selfish, that I felt like Reddit wasn’t holding up its end of the unwritten agreement?

        Reddit doesn’t deserve my content any more than I deserve access through the third-party API.

        • FaceDeer@kbin.social
          arrow-up
          2
          arrow-down
          27
          ·
          1 year ago

          If you did it over the API debacle then you’re not one of the people I’m talking about here. This is about people deleting their content to prevent it from being used to train AIs.

          • Voyajer@lemmy.world
            English
            arrow-up
            25
            ·
            edit-2
            1 year ago

            Do you not remember that the real reason the API debacle happened in the first place was to prepare for this moment? It was always about easy access to training data; third-party apps got caught in the crossfire.

            • FaceDeer@kbin.social
              arrow-up
              3
              arrow-down
              28
              ·
              edit-2
              1 year ago

              That’s ignoring an awful lot of other considerations. Obviously Reddit hasn’t explained itself in a trustworthy way, but a common belief at the time was that it was meant to force people to use the official Reddit mobile app so they could be subjected to advertising.