AI coding errors should be managed by human programmers

It’s well known that generative AI makes fundamentally different mistakes than human programmers do, yet most companies’ plans for fixing AI coding mistakes simply amount to throwing skilled human programmers into the loop.

Experienced human programmers intuitively know the kinds of mistakes and shortcuts that other human programmers make, but it takes separate training to recognize the kinds of mistakes software makes when it creates software.

These discussions were fueled by comments from AWS CEO Matt Garman, who said he expects most developers to no longer be coding as early as 2026.


Many companies in the development-tool space have claimed that using AI apps to manage AI coding apps can solve this problem. That is like cleaning up a train wreck with a second train wreck. Even financial giant Morgan Stanley is thinking about how to use AI to manage AI.

The only realistically safe and remotely feasible approach is to train programming managers to understand the nature of generative AI coding errors. In fact, given how different those errors are, it may be better to train fresh people, who are not already conditioned to spot human coding mistakes, as AI coding managers.

Part of the problem is human nature. People tend to magnify and misinterpret differences. When managers see a person, or an AI, make a mistake they would never make themselves, they tend to conclude that whoever made it must be worse at coding overall.

But consider self-driving cars: statistically, they are much safer than human-driven cars. Automated systems don’t get tired, don’t get drunk, and don’t fly into road rage.

But self-driving cars aren’t perfect. And when they make mistakes a human never would, like ramming a truck stopped in traffic at full speed, people say, “I would never do something so stupid… I can’t trust AI.” (The Waymo parking lot disaster is a must-see video.)

But the odd mistake doesn’t make self-driving cars less safe than human drivers. Human nature simply can’t reconcile these differences.

The same goes for coding management. Generative AI coding models can be very efficient, but if not managed properly, they can easily head in the wrong direction.

AI is a crazy alien programmer

Dev Nag, CEO of SaaS company QueryPal, has worked extensively with generative AI coding and felt that many enterprise IT executives were unprepared for how different this new technology would be.

“It makes a lot of weird mistakes, like an alien from another planet,” Nag said. “The code breaks in ways that human developers’ code doesn’t. It goes off in weird directions, like an alien intelligence that doesn’t think like us. The AI will find pathological ways to manipulate the system.”

Ask Tom Taulli, author of several AI programming books, including this year’s ‘AI-Assisted Programming’.

“For example, you might ask an LLM to write code for you, and sometimes it will even make up a framework or a fictitious library or module to do what you want,” Taulli said. (Taulli explained that the LLM doesn’t actually create a new framework; it merely pretends that one exists.)

“Unless they’re crazy, (human programmers) wouldn’t conjure libraries or modules out of thin air,” Taulli points out.

When this happens, it’s easy to catch if you know to look for it. “If you try to install the package yourself, you’ll find that there’s nothing there,” Taulli explains. “The IDE and compiler will throw errors.”
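To make Taulli’s point concrete, here is a minimal sketch in Python of the kind of check a team could run on freshly generated code: it flags top-level imports that neither resolve locally nor exist on PyPI, a common symptom of a hallucinated dependency. The script, its file-path argument, and the PyPI lookup are illustrative assumptions, not a finished tool.

```python
# Sketch: flag imports in an AI-generated Python file that don't resolve
# locally and don't exist on PyPI -- a common symptom of a hallucinated
# dependency. Best-effort only: PyPI package names and import names can
# legitimately differ (e.g. the 'yaml' module ships as 'PyYAML').
import ast
import importlib.util
import sys
import urllib.error
import urllib.request

def top_level_imports(source: str) -> set[str]:
    """Collect top-level module names from import statements."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names

def exists_on_pypi(name: str) -> bool:
    """Best-effort existence check against the public PyPI JSON API."""
    try:
        with urllib.request.urlopen(f"https://pypi.org/pypi/{name}/json", timeout=5):
            return True
    except urllib.error.HTTPError:
        return False

if __name__ == "__main__":
    source = open(sys.argv[1]).read()  # path to a file the LLM just wrote
    for module in sorted(top_level_imports(source)):
        if module in sys.stdlib_module_names:  # standard library (Python 3.10+)
            continue
        if importlib.util.find_spec(module) is None and not exists_on_pypi(module):
            print(f"possible hallucinated dependency: {module}")
```

Run in CI before a generated change merges, a check like this turns Taulli’s “try to install it yourself” advice into an automatic gate.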

Handing over the entire coding of an application, including creative control over the executable, to a system that periodically hallucinates seems like a terrible approach.

A much better way to capture the efficiency of generative AI coding is to use it as a tool that helps programmers do more. Eliminating humans, as AWS’s Garman suggests, would be suicidal.

And if generative AI coding tools are free to roam and to quietly build backdoors so they can modify code later without human intervention, what happens when attackers discover and exploit those same backdoors?
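One modest safeguard is to scan generated code before it merges. Below is a hypothetical sketch of such a scan for Python; the watch lists of calls and imports are illustrative assumptions, not a complete security scanner, and anything it flags still needs a trained human reviewer.

```python
# Sketch: a crude reviewer-side scan of an AI-generated Python file for
# constructs that often accompany backdoors -- dynamic code execution,
# shell invocation, raw sockets. Illustrative watch lists only.
import ast
import sys

SUSPICIOUS_CALLS = {"eval", "exec", "compile", "__import__"}
SUSPICIOUS_IMPORTS = {"socket", "subprocess", "ctypes"}

def scan(source: str) -> list[str]:
    """Return human-readable findings for one Python source file."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in SUSPICIOUS_CALLS:
                findings.append(f"line {node.lineno}: call to {node.func.id}()")
        elif isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in SUSPICIOUS_IMPORTS:
                    findings.append(f"line {node.lineno}: imports {alias.name}")
        elif isinstance(node, ast.ImportFrom) and node.module:
            if node.module.split(".")[0] in SUSPICIOUS_IMPORTS:
                findings.append(f"line {node.lineno}: imports from {node.module}")
    return findings

if __name__ == "__main__":
    for finding in scan(open(sys.argv[1]).read()):
        print(finding)
```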

Companies tend to be very effective at testing apps, especially their own, to make sure they do what they’re supposed to do. Where app testing often falls short is in probing whether the app can do something it shouldn’t. That is the mindset of penetration testing.

But in the real world of generative AI coding, this pen-testing approach should be the norm, overseen by a supervisor who is well educated in the wild world of generative AI mistakes.
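As a concrete illustration of that mindset, here is a hedged sketch of negative tests in Python with pytest. The save_upload function and its rules are hypothetical stand-ins for AI-generated code; the point is that the tests assert what the code must refuse to do, not merely what it should do.

```python
# Sketch: pen-testing-style negative tests. A functional test asks
# "does a normal upload work?"; these ask "can an attacker climb out
# of the sandbox?". save_upload is a hypothetical example function.
from pathlib import Path
import pytest

UPLOAD_ROOT = Path("/srv/uploads")

def save_upload(filename: str, data: bytes) -> Path:
    """Write an uploaded file, rejecting anything outside UPLOAD_ROOT."""
    target = (UPLOAD_ROOT / filename).resolve()
    if UPLOAD_ROOT.resolve() not in target.parents:
        raise ValueError("path escapes upload directory")
    target.write_bytes(data)
    return target

def test_rejects_path_traversal():
    # '../../' must never let a write land outside the upload root.
    with pytest.raises(ValueError):
        save_upload("../../etc/cron.d/backdoor", b"* * * * * root ...")

def test_rejects_absolute_path():
    # An absolute filename replaces the root entirely in pathlib joins.
    with pytest.raises(ValueError):
        save_upload("/etc/passwd", b"owned")
```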

Enterprise IT is certainly looking forward to a more efficient coding future, in which programmers take on a more strategic role, focusing more on what the app should do and why, and spending less time coding every line.

But those efficiencies and strategic gains come at a huge cost: hiring better, differently trained people to make sure the AI-generated code is headed in the right direction.

Source: www.itworld.co.kr