Show top LLMs buggy code and they'll finish off the mistakes rather than fix them.

Tea@programming.dev · 2 days ago

LovableSidekick@lemmy.world · edit-2 2 days ago

As a software developer I’ve never used AI to write code, but several of my friends use it daily and they say it really helps them in their jobs. To explain this to non-programmers, they don’t tell it “Write some code” and then watch TV while it does their job. Coding involves a lot of very routine busy work that’s little more than typing. AI can generate approximately what they want, which they then edit, and according to them this helps them work a lot faster.

A hammer is a useful tool, even though it can’t build a building by itself and is really shitty as a drill. I look at AI the same way.

bpev@lemmy.world · edit-2 16 hours ago

100%. As a solo dev who used to work corporate, I compare it to having a jr engineer who completes every task instantly. If you give it something well-documented and not too complex, it’ll be perfect. If you give it something more complex or newer tech, it could work, but may have some mistakes or ill-advised shortcuts.

I’ve also found it pretty good for when a dependency I’m evaluating has shit documentation. Not always correct, but sometimes it’ll spit out some apis I didn’t notice.

Edit: Oh also I should mention, I’ve found TDD is pretty good with ai. Since I’m building the tests anyways, they can often give the ai a good description of what I’m looking for, and save some time.
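To make the TDD point concrete, here is a minimal sketch of the tests-first workflow. The `slugify` function and its rules are hypothetical, invented only for illustration: the test module is the part written by hand and handed to the ai as the spec, and the function body is the kind of thing it would be asked to fill in.

```rust
/// Hypothetical helper used only for illustration: turns a title into a
/// lowercase, dash-separated slug. In the workflow described above, this body
/// is what the ai fills in after being shown the tests below.
pub fn slugify(input: &str) -> String {
    let mut slug = String::new();
    for ch in input.chars() {
        if ch.is_ascii_alphanumeric() {
            slug.push(ch.to_ascii_lowercase());
        } else if !slug.is_empty() && !slug.ends_with('-') {
            // Collapse any run of non-alphanumeric characters into one dash.
            slug.push('-');
        }
    }
    // Drop a trailing dash left behind by trailing punctuation or spaces.
    slug.trim_end_matches('-').to_string()
}

#[cfg(test)]
mod tests {
    use super::*;

    // Written first: these double as the description given to the ai.
    #[test]
    fn lowercases_and_joins_words_with_dashes() {
        assert_eq!(slugify("Hello, World!"), "hello-world");
    }

    #[test]
    fn trims_leading_and_trailing_separators() {
        assert_eq!(slugify("  Rust  2024 "), "rust-2024");
    }

    #[test]
    fn empty_input_gives_empty_slug() {
        assert_eq!(slugify(""), "");
    }
}
```

If the generated body fails `cargo test`, the tests still had to be written anyway, so nothing is lost.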

Reliant1087@lemmy.world · 20 hours ago

I’ve found it okay to get a general feel for stuff but I’ve been given insidiously bad code. Functions and data structures that look similar enough to real stuff but are deeply wrong or nonexistent.

bpev@lemmy.world · edit-2 13 hours ago

Mmm it sounds like you’re using it in a very different way to me; by the time I’m using an LLM, I generally have way more than a general feel for what I’m looking for. People rag on ai for being a “fancy autocomplete”, but that’s literally what I like to use it for. I’ll feed it a detailed spec for what I need, give it a skeleton function with type definitions, and tell the ai to fill it in. It generally fills in basic functions pretty well with that level of definition (ymmv depending on the scope of the function).

This lets me focus more on the code design/structure and validation, while the ai handles a decent amount of grunt work. And if it does a bad job, I would have written the spec and skeleton anyways, so it’s more like bonus if it works. It’s also very good at imitation, so it can help to avoid double-work with similar functionalities.

A shortened, naive example of how I use it:

/* Example of another db update function within the app */
/* UnifiedEventUpdate and UnifiedEvent type definitions */

Help me fill in this function

/// Updates event properties, and children:
///   - If `event.updated` is newer than existing, update as normal
///   - If `event.updated` is older than existing, error
///   - If no `event.updated` is provided, assume updated to be now()
/// For updating Content(s):
///   - If `content.id` exists, update the existing content
///   - If `content.id` does not exist, create a new content
///   - If an existing content isn't present in the update, delete the content
pub fn update_event(
    conn: &mut Conn,
    event: UnifiedEventUpdate,
) -> Result<UnifiedEvent, Error> {
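
For illustration only, here is a rough sketch of what a filled-in version of that skeleton could look like. Everything beyond the `update_event` signature is an assumption: the thread never shows the real `UnifiedEvent`, `UnifiedEventUpdate`, `Conn`, or `Error` definitions, so the types below are hypothetical in-memory stand-ins, and a timestamp equal to the stored one is treated as "not older".

```rust
use std::collections::HashMap;

// Hypothetical stand-ins; the real type definitions are not shown in the thread.
#[derive(Debug, Clone, PartialEq)]
pub struct Content {
    pub id: Option<u32>, // None means "no id yet, create a new content"
    pub body: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct UnifiedEvent {
    pub id: u32,
    pub updated: u64, // timestamp, e.g. seconds since the epoch
    pub contents: Vec<Content>,
}

pub struct UnifiedEventUpdate {
    pub id: u32,
    pub updated: Option<u64>,
    pub contents: Vec<Content>,
}

#[derive(Debug, PartialEq)]
pub enum Error {
    NotFound,
    Stale,
}

// In-memory "connection" so the sketch runs without a database (assumption).
pub struct Conn {
    pub events: HashMap<u32, UnifiedEvent>,
    pub now: u64,             // injected clock, keeps the example deterministic
    pub next_content_id: u32, // stand-in for a database id sequence
}

pub fn update_event(
    conn: &mut Conn,
    event: UnifiedEventUpdate,
) -> Result<UnifiedEvent, Error> {
    let existing = conn.events.get(&event.id).ok_or(Error::NotFound)?;
    let existing_updated = existing.updated;
    let existing_ids: Vec<u32> = existing.contents.iter().filter_map(|c| c.id).collect();

    // No `updated` provided: assume now(). Older than existing: error.
    let updated = event.updated.unwrap_or(conn.now);
    if updated < existing_updated {
        return Err(Error::Stale);
    }

    let mut contents = Vec::with_capacity(event.contents.len());
    for mut content in event.contents {
        let known = content.id.map_or(false, |id| existing_ids.contains(&id));
        if !known {
            // Missing or unknown id: create a new content with a fresh id.
            content.id = Some(conn.next_content_id);
            conn.next_content_id += 1;
        }
        // Known id: the incoming content replaces the stored one.
        contents.push(content);
    }

    // Stored contents absent from the update are deleted, because the whole
    // list is replaced by the one built above.
    let result = UnifiedEvent { id: event.id, updated, contents };
    conn.events.insert(event.id, result.clone());
    Ok(result)
}
```

Calling it with `updated: None` stamps the event with `conn.now`, and a later call carrying an older timestamp returns `Err(Error::Stale)`. The output still needs the same review as anything else the ai produces; the spec comments are what make that review quick.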

IphtashuFitz@lemmy.world · 1 day ago

We have a handful of Python tools that we require to adhere to PEP8 formatting, and have Jenkins pipeline jobs to validate it and block merge requests if any of the code isn’t properly formatted. I haven’t personally tried it yet, but I wonder if these AIs might be good for fixing up this sort of formatting lint.

witten@lemmy.world · 17 hours ago

Why bother with AI for that? https://black.readthedocs.io/en/stable/index.html

Lemminary@lemmy.world · 2 days ago

Coding involves a lot of very routine busy work that’s little more than typing.

That’s right. You watch it type it out and right where it gets to the important part you realize that’s not what you meant at all, so you hit the stop button. Then you modify the prompt and repeat that one more time. That’s when you realize there are so many things it’s not even considering, which gives you the satisfaction that your job is still secure. Then you write a more focused prompt for one aspect of the problem and take whatever good enough bullshit it spewed as a starting point for you to do the manual work. Rinse and repeat.

Excrubulent@slrpnk.net · edit-2 22 hours ago

That sounds exhausting to me.

Like seriously what busywork is so routine and so basic that you need an AI to do it but couldn’t make a template for it? And how is it less work to read what it gave you to check for errors? That’s always the harder part of coding in my experience.

I would love to know the specifics of where this supposedly saves time.

I suspect the energy you’re putting into learning this tool could go into becoming a better typist, and you wouldn’t need to cook the planet to do it.

sugar_in_your_tea@sh.itjust.works · 2 days ago

Exactly. I have a coworker who uses it effectively.

Personally, I’ve been around the block so it’s usually faster for me to just do the busy work myself. I have lots of tricks for manipulating text quickly (I’m quite proficient with vim), so it’s not a big deal to automate turning JSON into a serializer class or copy and modify a function a bunch of times to build out a bunch of controllers or something. What takes others on my team 30 min I can sometimes get done in 5 through the power of regex or macros.

But at the end of the day, it doesn’t really matter what tools you use because you’re not being paid for your typing speed or ability to do mundane work quickly, you’re being paid to design and support complex software.

Trained on buggy code, LLMs often parrot same mistakes