Agentic AI is the new revolution

As AI workflows grow more autonomous, we’ll find ourselves delegating more and more tasks to them.

We have barely gone through the Chat AI revolution and we’re already going through a new one – Agentic AI.

Once AI stops telling us what to do, and starts doing things for us, everything will change. And I don’t think this can be stopped.

Let’s examine the principles, and the order of operations, that make this such a big deal..

..starting with chat.

“Chat”GPT was kind of an accident

The book The Optimist by Keach Hagey tells many stories of OpenAI, including the inception of ChatGPT. Among many curious paragraphs, there’s this gem:

As OpenAI was making progress on GPT-4, Murati, who was named CTO in May 2022, and senior research leaders, were experimenting with Schulman’s chat interface as a tool to make sure the new model behaved safely. [..] One customer at a meeting ostensibly about DALL-E was so impressed that the OpenAI team returned to the office, realizing that the safety tool was more compelling than they had thought. When GPT-4 finished its training run in August, they made plans to release GPT-4 with the chat interface the following January. But as more people played with it, those plans began to change. “Somehow this chat interface is a way bigger deal than people realize,” Altman remembers thinking. [..] “I thought that doing GPT-4 plus the chat interface at the same time—and I really stand by this decision in retrospect—was going to just be a massive update to the world. And it was better to give people the intermediate thing. [..]”

In short, chat wasn’t quite a product from the get-go; it was just an internal safety tool, and the API was GPT’s interface. Eventually, OpenAI realized chat should be a product all its own, and that it was such a massive deal that they decided not to release it alongside GPT-4, so as not to overwhelm us!

I, for one, am glad that they broke chat and GPT-4 into two. I was overwhelmed by ChatGPT when it launched in late 2022, and then I was overwhelmed again by GPT-4 when it came out a few months later.

And this created its own first revolution..

..Prompting.

Prompt Engineering comes and goes

In 2023 a new role started being talked about: Prompt Engineers.

It makes sense: the moment a tool’s value depends on the skill of its user, as it does with LLMs like ChatGPT, you start seeing roles built around specializing in that skill.

Prompt Engineering was met with a mix of awe and skepticism in the industry, but I was a believer: people vary widely in their ability to prompt LLMs, and particularly if you’re talking about automation (e.g., support chat interfaces), skill makes a difference in both feature set and security.

I’m no longer a believer in Prompt Engineering though, in all honesty. Two important things changed.

  1. LLMs got A LOT better: Prompt engineering GPT-3 or GPT-3.5 mattered far more than prompting o3 or Claude Sonnet 4 does. Their thinking will overcome many of your limitations, and if it doesn’t, you can just ask them to write the prompts for you.
  2. Prompting is no longer the only game in town: A mix of auto-complete, task planning, thinking, and agentic AI are now the ways people leverage AI in their work.

Also, from a personal experience standpoint, my prompts these days are pretty.. weak. There’s no point in me spending human time crafting prompts; all my incentives are in saving my time and letting the AI figure things out, which is the opposite of “engineering.”

So in the end, I don’t think skillful prompting is a marketable skill anymore. Before we analyze where things are ultimately going, let’s look at the next step after prompts:

AI Auto-complete. Particularly for code.

GitHub Copilot.. or Pilot, really

The first time I tried GitHub Copilot, an auto-complete tool built on OpenAI’s “Codex” model (a model they trained on code after discovering, by accident, that GPT could code), I was blown away by how quickly I trusted it.

In pair programming, there are typically two roles: the Driver and the Navigator. The Driver is the person with the keyboard, writing code and thinking out loud. The Navigator checks their work, examines where they could go next, and keeps track of things.

I was studying deep learning at the time, and as I wrote a toy deep learning model from scratch to test GitHub Copilot, I was astounded by how quickly I became the Navigator instead of the Driver.

With auto-complete, that only makes sense: none of the editor integrations work by having the LLM comment on or critique the code YOU write. Instead, it writes the code that you then evaluate.

In short: You were the Copilot all along.

Now, the problem with being a copilot is the same as with watching a movie: it’s hard to tell the difference between doing a bad job and not doing any job at all. If the Driver zones out, no code gets written; if the Navigator zones out, the code still gets written and shipped to production.

So pretty soon, a bunch of code started being written by AI. We were supposed to read it, verify it, and review it, just like we do our coworkers’ code, but the incentives are weak, and it’s hard to review work that feels like ours the way we’d review someone else’s, even if we didn’t quite write it ourselves. Just let your coworker comment on the PR instead, right?

Surprisingly little is said about the skill of using a copilot today. Many top-tier engineers turn off AI autocomplete altogether because it’s so bad, others just use the defaults and get dozens of completions per minute, and most of us don’t even know how to go about customizing it and making it work for us.

And so we went along with our prompts to ChatGPT and our completions by Copilot (tab, tab, tab), until Cursor Compose came along and changed everything.. yet again.

Cursor, write me a new web app

Cursor is a fork of Visual Studio Code focused on AI capabilities. At first, it didn’t differ that much from Visual Studio Code, aside from some basic quality-of-life changes.

But Cursor had something that nobody else had, or even thought about having at the time: Cursor Compose. You told it in some way (README files were particularly popular) what app you wanted, and it would do several round trips to your LLM until it had built it for you – no prompting or autocomplete required!

Suddenly, Cursor is the Pilot.. and the Copilot. Instead of being an Engineer, you’re more like a Product Owner describing what you want the engineering team to do.

Now, instead of writing prompts, you’re writing specs: plans for what the LLM should build and how it should go about building it.

And just like prompting, you don’t even really need to write the spec: you can just ask a simpler LLM like GPT-4o to write it for you. It took me all of 5 seconds to prompt one into producing a full spec.

You get the idea: it’s not my prompting or spec-writing capabilities that will be the bottleneck here, really.

The magic of Cursor Compose at the time was exactly that it did all the thinking for you: you didn’t really need any skill at describing things to the LLM in the best way. In Cursor’s current documentation for Cursor Planning (Compose’s current iteration as of this writing), the example prompt literally says “migrate to zod v4”.. that’s it. No Prompt or Spec Engineering required.

Cursor Planning and similar tools draw up a plan and then execute it for you: “I’ll write this file, then that file, then change these, then run this command to install this tool, then run it.” They’ll also check in with you on any steps you want to approve along the way: writing files, running network commands, running commands that modify your system, etc.
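To make that concrete, here’s a toy sketch of a plan-then-execute loop with approval gates. This is purely illustrative, not Cursor’s actual internals; every name in it is invented.

```python
# Illustrative plan-then-execute loop with human approval gates.
# None of this is Cursor's real code; all names are made up for clarity.
from dataclasses import dataclass

@dataclass
class Step:
    description: str  # human-readable, e.g. "npm install zod@4"
    kind: str         # "write_file", "run_command", or "network"

def execute_plan(plan: list[Step], gated_kinds: set[str]) -> None:
    """Run each step, pausing for a y/N approval on gated kinds of action."""
    for step in plan:
        if step.kind in gated_kinds:
            if input(f"Allow '{step.description}'? [y/N] ").strip().lower() != "y":
                print(f"Skipped: {step.description}")
                continue
        # A real tool would edit files or shell out here; we just log.
        print(f"Executing: {step.description}")

# Example plan for a "migrate to zod v4" request, gating anything that
# runs commands or touches the network.
plan = [
    Step("write src/schemas.ts using zod v4 syntax", "write_file"),
    Step("npm install zod@4", "network"),
    Step("npm test", "run_command"),
]
execute_plan(plan, gated_kinds={"run_command", "network"})
```

The interesting design choice is which kinds of steps get gated: file writes inside the repo tend to be default-allow, while anything that leaves the sandbox is default-ask.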

But more and more of these AI planning setups are doing more than just coding and running terminal commands for you.

They’re doing everything.

Nobody loves Jira.. except LLMs

Some engineers are annoyed that LLMs are supposedly being used to replace engineers, but nobody seems annoyed at LLMs’ ability to keep your board up to date.

As agents start to become more autonomous (I was going to write “smarter” but that’s not quite it), they’re doing more than coding—they’re doing our whole jobs.

OpenAI’s Codex, for example, is an agentic tool that takes tasks from you and turns them into outcomes, including PRs. Tell me these snippets, straight out of its announcement blog post, don’t sound like agents automating our work:

Codex can perform tasks for you such as writing features, answering questions about your codebase, fixing bugs, and proposing pull requests for review.
Task completion typically takes between 1 and 30 minutes, depending on complexity, and you can monitor Codex’s progress in real time.
Like human developers, Codex agents perform best when provided with configured dev environments, reliable testing setups, and clear documentation.

But what about Jira, you ask? Well, you don’t have to go far to find ways to automate Jira and Codex together with codex-cli. In a recent OpenAI cookbook titled “Automate Jira ↔ GitHub with Codex,” Kevin Alwell and Naren Sankaran write:

This cookbook provides a practical, step-by-step approach to automating the workflow between Jira and GitHub. By labeling a Jira issue, you trigger an end-to-end process that creates a GitHub pull request, keeps both systems updated, and streamlines code review, all with minimal manual effort. The automation is powered by the codex-cli agent running inside a GitHub Action.

What’s most impressive to me is just how.. unremarkable this is nowadays. It’s so easy to implement with existing technologies that it’s a “cookbook”, not a product.

In fact, given how advanced our AI already is, most of what limits agent adoption today is just people adjusting to agentic workflows and writing the glue that ties everything together.

The result? Here’s how the article describes the workflow:

1) Label a Jira issue
2) Jira Automation calls the GitHub Action
3) The action spins up codex-cli to implement the change
4) A PR is opened
5) Jira is transitioned and annotated, creating a neat, zero-click loop: the ticket’s status changes, the PR link is attached, and updates land as comments on the ticket.
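To show how little machinery this takes, here’s a hedged sketch of steps 3 through 5 in Python. This is not the cookbook’s actual implementation (which runs as a GitHub Action); it assumes codex-cli exposes a non-interactive `codex exec` mode, that the GitHub CLI (`gh`) is authenticated, and that Jira credentials and a transition ID live in environment variables. All of those are assumptions, and every name is illustrative.

```python
# Hedged sketch of steps 3-5 above, NOT the cookbook's actual code.
# Assumes codex-cli's non-interactive `codex exec`, an authenticated
# GitHub CLI (`gh`), and Jira credentials in environment variables.
import os
import subprocess
import requests

JIRA_BASE = os.environ["JIRA_BASE_URL"]  # e.g. https://yourco.atlassian.net
JIRA_AUTH = (os.environ["JIRA_EMAIL"], os.environ["JIRA_API_TOKEN"])

def handle_labeled_issue(issue_key: str, summary: str) -> None:
    branch = f"codex/{issue_key.lower()}"
    subprocess.run(["git", "checkout", "-b", branch], check=True)

    # Step 3: the agent implements the change described by the ticket.
    subprocess.run(["codex", "exec", f"Implement {issue_key}: {summary}"], check=True)
    subprocess.run(["git", "add", "-A"], check=True)
    subprocess.run(["git", "commit", "-m", f"{issue_key}: {summary}"], check=True)
    subprocess.run(["git", "push", "-u", "origin", branch], check=True)

    # Step 4: open a pull request with the GitHub CLI.
    pr_url = subprocess.run(
        ["gh", "pr", "create", "--title", f"{issue_key}: {summary}",
         "--body", f"Automated change for {issue_key}"],
        check=True, capture_output=True, text=True,
    ).stdout.strip()

    # Step 5: transition the ticket and link the PR back in a comment.
    requests.post(
        f"{JIRA_BASE}/rest/api/3/issue/{issue_key}/transitions",
        json={"transition": {"id": os.environ["JIRA_REVIEW_TRANSITION_ID"]}},
        auth=JIRA_AUTH, timeout=30,
    ).raise_for_status()
    requests.post(
        f"{JIRA_BASE}/rest/api/2/issue/{issue_key}/comment",
        json={"body": f"Codex opened a PR: {pr_url}"},
        auth=JIRA_AUTH, timeout=30,
    ).raise_for_status()
```

Everything here is commodity plumbing: a shell, two CLIs, and two REST calls.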

I’m quoting OpenAI Codex, but there are myriad ways of doing this today, with myriad vendors. The thing about AI is that moats are almost non-existent; AI itself kind of commoditizes its own innovations.

I wasn’t even sure I was gonna use Codex as an example here before I started writing, to be honest. Anything would do.

And then we get to the main point of my article, 1,800 words later:

Pretty soon, this Agentic AI workflow that Engineers can use today to do their jobs for them? We’ll all be using it to do all of our own jobs and chores.

The incentives are just too high not to.

Agentic AI will replace you.. because you will want it to

There are commonly three camps in AI today..

  1. The “AI is too dumb to do my job” camp: SREs, expert engineers, and other highly technical individual contributors who realize that AI’s current capabilities are nowhere near their own, and that while AI helps a bit sometimes, it gets in their way most of the time.
  2. The “AI is a great efficiency tool” camp: Software Engineers, Product Managers, and Engineering Managers who use AI day to day to write documentation, help troubleshoot things, and answer questions. They are the everyday believers.
  3. The “AI will replace everyone” camp: Small (and often short-sighted) business owners who think you can just fire your marketers, copy editors, engineers, and support staff and replace them with AI.

I think what I’m ultimately saying is that the short-sighted business owners are right, but not quite in the way they expect.

So here’s my ultimate thesis:

“As AI workflows and capabilities get more advanced and autonomous, we’ll naturally automate more and more of ourselves with them, independent of our line of work.”

In short: AI will not replace you. You.. will replace you.

And the biggest bottleneck preventing this from happening today is not AI sophistication; it’s workflows. To spin the famous quote from Bill Clinton’s strategist James Carville: “It’s the interface, stupid!”

While AI is likely to keep getting more sophisticated on its own, I think it’s the ways we use it that will slowly but surely creep into our day-to-day. As it gets cheaper (and/or as we decide to spend more on it individually), it’ll be applied more and more directly to solving our problems, with minimal to no supervision.

The incentives will almost irresistibly drive us to delegate to AI.

Ultimate delegation to AI

In the 2019 book Free to Focus, Michael Hyatt lays out the 5 Levels of Delegation. While he’s talking about delegating to humans, it’s astounding how well this translates to how I believe we’ll treat AI really soon.

The quotes are straight from the Free to Focus book:

  • Level 1: Here’s exactly what to do. Don’t deviate. As he says, “This level is perfect for new hires, entry level people, and virtual assistants.”
  • Level 2: Research and report back. “This is a great level to use [.. any time you..] need someone to gather information for you.”
  • Level 3: Make a recommendation. “At this stage, you can make a well-informed decision on a complex topic in one simple meeting.”
  • Level 4: Do this, and keep me in the loop. “This is a great level to use with growing leaders, because it empowers them with decision-making experience and [..lets you..] evaluate how well they are doing.”
  • Level 5: The project is yours, I don’t have to hear about it. “Now you’ve cloned yourself. [..] Level 5 is where delegation magic happens. It’s perfect for when you have complete confidence in the person to whom you’re delegating [..]”

I foresee more and more of our AI usage going up this ladder of delegation, and “complete confidence” can be bent quite a bit when you’re talking about a $2,000/year LLM instead of a $200,000/year Engineer or Marketing Director.

Choosing whether to do something yourself or pay $200,000 is a lot easier than choosing to do something yourself or pay $2,000.

Now, while I don’t quite know what the above looks like in practice yet, you didn’t come here for practical advice, right?

Oh, you did?! Shoot!

OK. So I guess the only practical advice I have for you is:

“Find ways to make AI work for you with minimal necessary supervision from you.”

That, I believe, is what will put you ahead of the current curve.