Coding’s new dawn: how AI assistants are reshaping software development


The emergence of AI assistants like GitHub Copilot and next-generation large language models (LLMs) such as Anthropic’s Claude 3.5 Sonnet, OpenAI’s GPT-4, and Google’s Gemini has fundamentally shifted perceptions of what is possible in software development.

Developers, hobbyists, and even total newcomers can write code, fix bugs, or spin up prototypes faster than ever. In some cases, they don’t have to worry much about syntax or frameworks – a simple human-language command might suffice to generate boilerplate functions or entire program modules.

But where these tools really show off their prowess is in translating abstract ideas into immediate, tangible code. Instead of memorizing function signatures or searching Stack Overflow for hours, users can describe exactly what they want and watch the AI deliver a solution – often in seconds.

People who have never opened a text editor can now say, “Build me a web form that collects emails and stores them in a database,” and receive a working draft almost instantly. It’s a stunning testament to how advanced – and how accessible – software development has become.

To gauge just how good these AI models are at coding, researchers rely on standardized benchmarks such as “HumanEval,” which tests a model’s ability to produce correct solutions to carefully designed coding challenges. Time and again, OpenAI’s models have performed admirably on these tests.
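
As a rough illustration of how a HumanEval-style check works, the sketch below runs candidate solutions against assertion-based tests and computes pass@k, the metric commonly reported alongside HumanEval. The `add` problem, the two candidates, and the helper names are made up for this example, and a real harness would sandbox execution and enforce timeouts rather than call `exec` directly.

```python
from math import comb


def passes_tests(candidate_source: str, test_source: str) -> bool:
    """Return True if the candidate code runs and every test assertion holds."""
    namespace: dict = {}
    try:
        exec(candidate_source, namespace)  # define the candidate function
        exec(test_source, namespace)       # run the assertions against it
    except Exception:
        # Syntax errors, runtime errors, and failed assertions all count as a fail.
        return False
    return True


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k samples is correct,
    given that c of n generated samples passed the tests."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical problem: tests for a function named `add`.
TESTS = """
assert add(2, 3) == 5
assert add(-1, 1) == 0
"""

# Two hypothetical model outputs: one correct, one subtly wrong.
CANDIDATES = [
    "def add(a, b):\n    return a + b\n",
    "def add(a, b):\n    return a - b\n",
]

results = [passes_tests(src, TESTS) for src in CANDIDATES]
score = pass_at_k(n=len(results), c=sum(results), k=1)
print(results, score)
```

Here one of the two candidates passes, so the estimated pass@1 is 0.5. The point is that the score measures functional correctness on small, self-contained problems, not how well a model copes with a sprawling codebase.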

However, some experts caution that benchmark scores don’t always predict real-world performance, especially when dealing with messy or domain-specific codebases. As many developers know, a pristine test environment is a far cry from the chaos of large-scale projects, where frameworks, external APIs, and last-minute requirement changes can make even straightforward tasks more complicated.

Despite these caveats, current AI models excel at generating snippets or entire functions, speeding up routine coding work and significantly cutting down on repetitive tasks.

GitHub Copilot, which is powered by OpenAI’s Codex, has quickly been adopted by tens of thousands of developers. It integrates with development environments such as Visual Studio Code and JetBrains IDEs, offering line-by-line suggestions that range from the mundane – like automatically closing brackets – to the borderline magical, such as inferring an entire function’s logic from a quick comment.

The wave of AI-assisted coding platforms shows no sign of slowing. Tools like Cursor, a specialized code editor that bundles AI features in a familiar Visual Studio Code-like interface, and Trae.ai, an emerging tool from ByteDance, both aim to streamline the development process for professionals and novices alike. Meanwhile, players such as Tabnine, Codeium, MutableAI, IntelliCode Compose, and Qodo are all vying for attention in this rapidly expanding market. Each tool differentiates itself with features like advanced debugging help, multi-language code translation, or codebase-wide searches that incorporate natural language queries. However, the overarching goal is the same – to lower the friction between a developer’s idea and a finished software product.

In particular, Anthropic’s Claude 3.5 Sonnet has captured attention for its remarkable speed. Internal evaluations suggest it outperforms previous Anthropic models by nearly doubling the percentage of coding problems it can solve correctly, jumping from around 38% to 64% accuracy on certain tests. This jump offers hope that AI might soon handle some of the more challenging parts of coding – like diagnosing subtle logical bugs – rather than merely providing skeletal templates.

Yet many experts point out that even advanced models can produce code that looks correct at first glance but hides serious flaws, from performance inefficiencies to security vulnerabilities.
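
A small, hypothetical example of such a flaw: both lookup functions below return the same result for ordinary input, but the first builds its SQL query by string interpolation and is open to injection. The table, function names, and payload are invented for illustration, using Python’s built-in `sqlite3` module.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two example users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [("a@example.com",), ("b@example.com",)],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, email: str) -> list:
    """Looks fine at a glance, but pastes user input straight into the SQL."""
    query = f"SELECT id, email FROM users WHERE email = '{email}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, email: str) -> list:
    """Same lookup with a parameterized query; input is treated as data only."""
    query = "SELECT id, email FROM users WHERE email = ?"
    return conn.execute(query, (email,)).fetchall()


conn = make_demo_db()
payload = "' OR '1'='1"  # crafted input that rewrites the WHERE clause

print(find_user_unsafe(conn, "a@example.com"))  # one matching row
print(find_user_unsafe(conn, payload))          # every row in the table
print(find_user_safe(conn, payload))            # no rows
```

A quick manual test with a normal email address would not reveal the difference, which is why reviewing generated code remains necessary.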

On the other side of the coin is OpenAI’s GPT-4, often praised for its deeper reasoning capabilities and ability to handle a wide range of tasks, including debugging. Users claim that GPT-4 can methodically walk through an error-ridden snippet and point out exactly where the logic fails, often providing a corrected version or a step-by-step explanation of how to resolve the bug. For developers who handle large, complex codebases, this reasoning skill is a crucial advantage, as it provides more transparency than the rapid-fire but occasionally “buggy” suggestions that sometimes come from faster LLMs.

Google’s new Gemini models will likely push LLM capabilities even further. Rankings from LMArena, a platform that compares LLMs through head-to-head battles judged by users, suggest that two experimental Gemini family models, Gemini 2.0 Pro and Gemini 2.0 Flash Thinking, are the best models at coding. Early indications suggest that Google is also investing heavily in bridging coding tasks with its vast suite of cloud services. This could lead to a future where entire data pipelines or machine learning workflows are built through simple back-and-forth conversations with an AI – no advanced coding required.

Already, new startups are popping up in fields like marketing, e-commerce, and finance, launching digital products without needing to hire a specialized developer. Enthusiasts who have a “big idea” for an app but have never studied computer science can now attempt to create a working prototype in days instead of months. While professional coders remain indispensable for complex or large-scale projects, the democratizing effect of these AI tools could reshape the tech talent market in surprising ways.

Major publications have taken note of the broader implications. The Financial Times reported that AI-powered coding has attracted nearly $1 billion in funding in recent months, underscoring investor confidence in AI’s potential as a “killer app” for the software world. The Verge covered Anthropic’s endeavors in automating computer interactions, pointing out how far these tools have come in a relatively short time. Meanwhile, Business Insider highlighted OpenAI’s efforts to address AI’s biggest coding pitfalls with “CriticGPT,” a system designed to minimize costly errors. It all adds up to a picture of an industry hurtling toward an AI-driven future at breakneck speed.

However, it’s worth remembering that human expertise is far from obsolete. True, AI may handle basic tasks with ease, but veteran developers bring a wealth of problem-solving acumen and intuitive domain knowledge that current AI systems can’t replicate. Crafting robust architectures, ensuring high performance, enforcing security best practices, and making judgment calls in ambiguous scenarios remain distinctly human competencies. Moreover, maintaining codebases generated partly by AI still demands a level of scrutiny and expertise that only experienced developers can reliably provide.

For the moment, the coding world stands at a crossroads.

One path leads to a future where coding is so democratized that anyone can do it, with professional developers stepping into roles that emphasize oversight, advanced architecture, and creative innovation. The other leads to a future where AI’s capabilities are oversold, resulting in over-reliance on flawed generated code. The likely outcome lies somewhere in the middle: a symbiotic relationship in which AI handles the heavy lifting and developers focus on the deeper, more strategic aspects of software creation.

Either way, there’s no question that this dynamic field is changing rapidly. With a growing number of companies heralding a new age of low-barrier software development, Claude 3.5 Sonnet racing to outpace its predecessors, and GPT-4 offering sophisticated debugging and logic capabilities, the coding profession stands on the brink of transformation. Rather than spelling doom for programmers, these tools may ultimately amplify their abilities – enabling them to build more, build faster, and build better. Yet they also lower the entry point for others to join in, fueling an unprecedented wave of innovation. If nothing else, that’s a development that both professional coders and complete newcomers might learn to embrace.

About the author

Mantas Lukauskas is the AI Tech Lead at web hosting company Hostinger. He is also responsible for AI development in the AI orchestration startup Nexos.ai and is a researcher at Kaunas University of Technology.