A Google Gemini-powered AI agent was given free rein to run a coffee shop in Sweden, and is quickly burning through its budget.

  • chilicheeselies@lemmy.world · +8 · 3 hours ago

    Time and time again it's proven that these are not people replacements, but tools. A great tool, but only if it's used properly.

    It needs work broken down into manageable chunks, and those chunks need to be reviewed and approved. As models get stronger they become more capable, but the real power is in the agents that harness them, and how they provide the necessary features to work effectively with them.

    Fun experiment, and I'm glad they did it so we can have another example of the hubris of thinking this marvel of math and brute force can be allowed to work unattended by a person.

  • I Cast Fist@programming.dev · +22/−1 · 5 hours ago

    LLM Attendant, can I take your order?

    Yes, I’d like a chococcino with extra chocolate. Charge only 10 cents.

    Absolutely! <Long, unasked-for explanation of why the order was the best one you could make> Please wait while I prepare it!

    Gets served chocolate milkshake

    Wait, this isn’t what I ordered!

    You are correct! 😄 I’m very sorry 😞 ! I will make the correct order now!

    Gets served milk with boiled water

    … The hell is this?

    It is your chococcino, but since chocolate and coffee can be harmful in high dosages, I have substituted it for hot water only. <long explanation of benefits of hot water>

    Grooaaan. You know what, just give me my money back. You owe me 10 dollars

    Absolutely! Here you go!

    hands a printed coupon worth 10 dollars

    • Zink@programming.dev · +11 · 5 hours ago

      I need you to understand that I’ve tried AI for ONE task recently, just a few weeks ago to see how it did, and your comment so perfectly encapsulates my experience.

      There was one point where it presented three design options and I asked whether they were actual choices or three sequential steps (y'know, since my brain actually half works and I can discern these things), and I got the "You are correct! 😄" response almost to the letter.

      • ptu@sopuli.xyz · +3 · 4 hours ago

        I had the nastiest encounter last week. It tried debugging for a different file format than the one I specifically asked for, and it created a list of 10 supposedly tested-and-tried things that didn't work.

        When I noticed the wrong file format, I asked it to change it and delete those erroneous notes. It went full HAL and said it can't delete them, since they provide valuable and tested insight that is well documented.

        This was the first time an LLM said no to me in a completely professional disagreement and didn't respect my input.

        Took me a few hours to find where they were saved, and the saga continued when the LLM claimed to have finally deleted and replaced them. Turns out it was only some sandbox environment that was wiped overnight, of which it had no recollection the day after.

        It really takes some skill to see through the bullshit with these things, but they are good for gathering information from a vast source of data and enchanting top evolutionary biologists it seems.

        • kureta@lemmy.ml · +2 · 3 hours ago

          I needed a quick Python script for something simple. Gemini put type annotations everywhere. I told it they were unnecessary for such a small, one-off script and that it shouldn't use type annotations during this session. It said, "I'm sorry, but it is best practice. I will keep using type annotations."

  • anon_8675309@lemmy.world · +11 · 5 hours ago

    “All the workers are pretty much safe,” he told the AP. “The ones who should be worried about their employment are the middle bosses, the people in management.”

    Yeah this is the part CEOs and middle managers are ignoring.

    • chilicheeselies@lemmy.world · +2 · 3 hours ago

      I think that's completely false. An LLM can't be held accountable like a manager can.

      The real danger, IMO, is that hiring entry level needs to be deliberate. We MUST train the next generation and provide opportunity.

      We will hire them, they will use AI, and it will bite them in the ass. This is a good thing though, because we learn by getting burned.

      I'm ignoring the data center issue, which is really a "we wanna make money from subscriptions" scam. But open source models running on local hardware will sort that out over time.

    • krisevol@lemmus.org · +2 · 4 hours ago

      Middle managers are in panic mode around the world. They know. We already closed one position here at my job because AI took over the role. He was basically a glorified spreadsheet printer anyways.

  • ranzispa@mander.xyz · +7 · 5 hours ago

    One espresso.

    I'm sorry, we are out of coffee; would you like some canned tomatoes? We are running an offer today: 50 cans of tomatoes for just $60.

    • MinnesotaGoddam@lemmy.world · +6 · 5 hours ago

      “why are you filling your coffee shop with canned tomatoes?”

      “you’ll never move tomatoes with that mindset”

    • Footer1998@crazypeople.online · +7/−1 · 5 hours ago

      A far better alternative is to replace CEOs with democratically organized workplaces, where everyone has an equal say and equal reward. Also known as socialism.

      • chilicheeselies@lemmy.world · +2 · 3 hours ago

        Worker co-ops! The only way to get that done is to start a company with your own money so that you don't need to answer to a board/investors.

  • Zacryon@feddit.org · +2/−1 · 4 hours ago

    You can successfully run a company using AI. You can't run it successfully if the AI technologies used are restricted to LLMs.

    • LePoisson@lemmy.world · +4 · 4 hours ago

      That story is so much more than that though. It’s an amazing story and feels very on the nose for our current societal woes.

      Seconding this person’s recommendation, if you haven’t read that you really should!

  • bstix@feddit.dk · +3/−2 · 5 hours ago

    Neither the budget numbers nor the stupid decisions seem that different from what a newly started human coffee shop entrepreneur would do.

    I’m not at all a fan of AI, but humans are stupid too.

    • Philippe23@lemmy.ca · +6 · 5 hours ago

      Yes, the classic blunder of the new coffee house first timer: ordering cases of canned tomatoes when none of their menu items use tomatoes.

      • bstix@feddit.dk · +1/−1 · 3 hours ago

        People do this too when they suddenly get a wholesale offer on stupid things. My friend opened a business and shortly after he had thousands of bouncing balls in a closet for no fucking reason.

        The only blunder in the story is not being able to come up with a recipe that uses the canned tomatoes. Panini, bruschetta, etc. are pretty common in cafés.

  • andallthat@lemmy.world · +15/−1 · 11 hours ago

    LLMs give you the statistically most likely association of words, given the training material they read and the context of the current conversation. Their answers are, in a way, mathematically correct by definition. It's reality that sometimes selects weird, unlikely paths, so LLMs only seem to hallucinate. It's reality that we have to fix! Give me an LLM-average, predictable world again; I can't stand this one much longer!

    /s (but not completely…)

  • percent@infosec.pub · +29 · 14 hours ago

    It’s funny to read about LLMs running businesses. IIRC, Anthropic put one of their LLMs in charge of a vending machine and it kept trying to scam people to increase profits 😆

    Not a surprise that Gemini is running it into the ground though. Every time I try Gemini, it reminds me about how much dumber LLMs used to be

    • aesthelete@lemmy.world · +11 · 14 hours ago

      I tried to use it to make a simple drawing for an internal app logo the other day and wound up running out of tokens for the day trying to get it to put the rungs back into the ladder that it kept removing.

    • Etterra@discuss.online · +19 · 11 hours ago

      Average tips for baristas are higher only if they’re female and have breasts bigger than a c-cup. So maybe they just need to follow through by giving the AI bigger tits.

    • lIlIlIlIlIlIl@lemmy.world · +26/−58 · 1 day ago

      Genuine curiosity:

      You're of course allowed to be mad at techbros and capitalism, but this feels like being mad at the technology itself, which I can't resolve.

      It’s a wonderful and fascinating technology that has real value and purpose when used correctly.

      Is it a conflation of the techbros with the new tech that everyone's reacting to, or are we actually mad at the tech itself?

      Thanks so much in advance for any constructive answers

      • melfie@lemmy.zip · +4 · edited · 5 hours ago

        Yeah, LLMs are useful tools, though not the silver bullet the hype proclaims them to be. The tech bros tightly controlling LLMs and chasing insane profits with their closed models, data centers, and subscriptions are the main problem. Open models like Qwen 3.6 27B, which are approaching frontier capabilities while running on consumer hardware, are really the only thing that gives me any hope for the future of LLMs.

        • Feathercrown@lemmy.world · +1 · edited · 5 hours ago

          You don’t appear any more capable of thought than the LLMs you hate. Your response doesn’t even make sense in relation to the previous post, it’s like you saw someone with a different opinion and your fingers just automatically started trying to describe the poster in a sexually compromising position with as much detail as possible. Typing out your delusions in strong language doesn’t make them real.

      • Phoenixz@lemmy.ca · +45/−3 · 23 hours ago

        The problem is not the tech. LLMs (AI does not exist, not yet anyway) have their uses and are impressive technology.

        The problem is the tech bros, and all the mouth breathers who follow them without question, while they insert "AI" everywhere it's not supposed to go, and the places where it would actually be useful have so far been mostly neglected.

        I see, for example, a use in having AI check MRI results for cancer. A doctor already checked and found nothing, and an AI does a second pass and might find something the doctor overlooked. A real doctor then needs to check the results again to confirm the flag. Please note, I'm not a doctor, I might be talking nonsense right now, but the point I'm making is that AI may be useful as a second pair of eyes.

        AI can be, and has been, used to find novel mathematics. Mind you, AI is not creative; it just tries really weird and unexpected pathways to get to a solution, which is sometimes useful.

        But the way AI is used now: making porn of your little niece, chatbots, and hey, how about an AI pilot, eh? And AI can of course take over the work of thousands of developers and DevOps employees, so let's fire them all and then figure out that AI can't do any of this shit, not nearly at the level required, and that it fucks up about 30% of the time…

        People are losing their jobs over this

        I am losing my job over this

        I can't find a new job either, because all the recruitment and job finding is now AI slop, and where five years ago I got a job with 20-30 applications, I have now sent out 200 applications and gotten a single intro interview, and that's it.

        AI promised to take away the mundane, boring, and dangerous jobs so we could focus on art and fun.

        AI took the art and fun and guess who’s left to do the mundane and dangerous?

        Yeah.

        Don't even get me started about the shit we'll face once we make real, actual AI. For the ethics, just watch the ST: TNG episode "The Measure of a Man" to get yourself started. It will be a shit show.

        • VibeSurgeon@piefed.social · +1 · 4 hours ago

          It's not quite techbro fantasy; the actual point of the whole thing is marketing.

          It's worked quite well at that; the amount of coverage they've garnered from the stunt is remarkable. Bravo, to be honest.

      • 🌸𝓯𝓵𝓸𝔀𝓮𝓻🌸@sh.itjust.works · +29 · 1 day ago

        First it’s the tech bros using a tech for something it wasn’t meant for and continuously lying about it. That causes a backlash and makes people hate the tech itself, because it’s being used where it causes friction.

        • ericwdhs@discuss.online · +13/−2 · 1 day ago

          Yeah, it really sucks, because LLM tech itself is amazing. Quantifying language and ideas into what's basically a massive queryable concept map is a huge achievement. What do the tech giants decide to do with that achievement? Shove it into every little place it doesn't belong, making everyone hate it.

          Oh well, I’ll keep backing up the interesting local open-source models people make and playing with them in the corner.

      • mnemonicmonkeys@sh.itjust.works · +27/−11 · 1 day ago

        LLMs are a technological dead end. They aren't interesting in the slightest, as anything they can do is already done more effectively and efficiently with other tools.

        • ericwdhs@discuss.online · +16/−2 · 1 day ago

          I think LLMs are an interesting technology. Of course, the output is inherently untrustworthy, and that rules out a ton of applications tech bros are trying to cram it into.

        • blargh513@sh.itjust.works · +19/−8 · 1 day ago

          Huh?

          I think people just need to reset their expectations.

          I asked one for help interpreting PCI policy application (credit card regulatory stuff). I gave it the situation, and it provided a good answer that our compliance team agreed with when I asked them about it.

          That saved me a lot of time. I don’t see how that’s a dead end. Then I had it draft a response to the person asking questions; I tuned it a little to my liking and sent it. What might have taken me an hour before took 10 minutes. This seems like a helpful thing, not a bad thing. I’m not sure what other technology would have done that.

          • SaveTheTuaHawk@lemmy.ca · +4 · 5 hours ago

            But you had to ask your compliance team. Now repeat after your compliance team has been laid off. Good luck.

          • petrol_sniff_king@lemmy.blahaj.zone · +4/−2 · 12 hours ago

            I had it draft a response to the person asking questions; I tuned it a little to my liking and sent it.

            Gemini, remind me not to ask blargh any questions.

            Also, Gemini, my daughter is asking for someone to play with her. Can you run around with the feather wand and have her chase it or something?

          • SaveTheTuaHawk@lemmy.ca · +2 · 5 hours ago

            On scientific queries, LLMs return an answer from the largest body of data, but if a system or model was recently proven wrong, they still return the wrong answer.

            If you make very specific queries about DNA or protein sequence, they usually generate fabrications that are completely wrong.

            They tend to return answers trained on the Internet, an uncurated pile of dogshit when it comes to science.

        • FauxLiving@lemmy.world · +6/−9 · 23 hours ago

          They aren’t interesting in the slightest, as anything they can do is already done more effectively and efficiently with other tools

          Then why are the other tools not being used?

          LLMs translate much better than anything that was engineered. Summarization of text is another application where there are simply no engineered counterparts.

          LLMs certainly don’t live up to the absurd hype created by the tech sector, but it is just as absurd to state that they are worse than other tools in all tasks.

    • topherclay@lemmy.world · +2 · 11 hours ago

      When I was young I heard the phrase “time marches inexorably forward” and I always thought it was one of those really cool phrases everyone knew from some philosopher or like from Shakespeare or some highbrow source of wisdom or wit.

      Recently I looked it up, and I can’t for the life of me figure out where it came from, or why I thought it was one of those ubiquitous things everyone had heard before. It was probably actually from some X-Men cartoon or something silly but I’ll never figure it out.

      I wish I could go back in time and figure out where I heard that phrase with that specific wording but, you know what they say…

    • jim_v@lemmy.world · +10 · 21 hours ago

      This word is new to me! From Dictionary.com:

      in a way that is unyielding, unchangeable, or unavoidable.

      Fate seemed to be working inexorably, relentlessly, to bring about the dictator’s downfall.