Agents -- what even are they
Some rambling thoughts about nomenclature and what is going on with AI agents -- 6/12/2023
What is an Agent?
The talk of the AI town these last few months has been singular: “agents.” Some public examples from the last 6 months:
LangChain Agents (“Some applications require not just a predetermined chain of calls to LLMs/other tools, but potentially an unknown chain that depends on the user’s input”)
AutoGPT (/BabyAGI) (“This program, driven by GPT-4, chains together LLM "thoughts", to autonomously achieve whatever goal you set.”)
Voyager (“Use LLMs as a reasoning engine to play Minecraft”)
"Agent" has become an overloaded term; people use it to refer to way too many different things.
Let's break down what people actually mean when they say it.
A Taxonomy of Agents
Here’s my attempt at categorizing the work that people are labeling as “agents”:
AGI
Examples: AutoGPT, BabyAGI
Explanation: The flashiest of the "agents" are (as far as I can tell) trying to solve general intelligence. They are essentially programs of the structure "specify any task, get a solution."
Diagnosis: Unsurprisingly, these don’t work yet. We haven’t solved AGI! If these projects end up working… we can probably safely ignore all of the rest of my writing 🙂
What I would call this (instead of agent): AGI
LLMs that prompt themselves
Examples: LangChain Agents, Smol AI, a lot of the noise on Twitter
Explanation: The broadest definition basically says that "if an LLM is coming up with prompts, then it's an agent." This accounts for a lot of the "agent" noise: people have started using the word to describe complex chains of prompts that aren't known until runtime.
Diagnosis: This is simply a natural evolution of LLM applications.
What I would call this (instead of agent): Nondeterministic prompt chains
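To make "nondeterministic prompt chain" concrete, here's a minimal sketch. Everything below is hypothetical: `fake_llm` is a hard-coded stand-in for a real model call, not any actual API. The point is the shape, where the model's own output decides which prompt runs next, so the chain isn't known before runtime.

```python
def fake_llm(prompt: str) -> str:
    # Stand-in for a real LLM call; a real model would generate this text.
    if "classify" in prompt:
        return "math"
    if "math" in prompt:
        return "4"
    return "unknown"

def run_chain(question: str) -> str:
    # Step 1: ask the model to route the question.
    category = fake_llm(f"classify this question: {question}")
    # Step 2: the model's answer chooses the next prompt at runtime,
    # which is what makes the chain "nondeterministic" from the
    # programmer's point of view.
    follow_up = f"answer this {category} question: {question}"
    return fake_llm(follow_up)

print(run_chain("what is 2 + 2?"))
```

With a real model, the routing step could fan out into arbitrarily many branches, which is exactly the "unknown chain that depends on the user's input" from the LangChain description above.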
LLMs that use tools
Examples: ChatGPT Plugins, LangChain Agents, Fixie Agents, Gorilla
Explanation: The definition here is that if you directly integrate the input or output of an LLM with some form of API, you have an agent.
Diagnosis: Another natural evolution of LLM applications.
What I would call this (instead of agent): Tools / skills / plugins
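The tool-use pattern is also easy to sketch. Again, this is an illustrative sketch, not any library's real API: `fake_llm` stands in for a model that has been prompted to emit a structured tool call, and `get_weather` is a toy tool.

```python
import json

def get_weather(city: str) -> str:
    # A toy "tool" -- in practice this would hit a real weather API.
    return f"sunny in {city}"

TOOLS = {"get_weather": get_weather}

def fake_llm(prompt: str) -> str:
    # A real model would decide which tool to call and with what
    # arguments; hard-coded here to keep the sketch self-contained.
    return json.dumps({"tool": "get_weather", "args": {"city": "Paris"}})

def run_with_tools(user_input: str) -> str:
    # Parse the model's structured output and dispatch to the tool.
    call = json.loads(fake_llm(user_input))
    tool = TOOLS[call["tool"]]
    return tool(**call["args"])

print(run_with_tools("what's the weather in Paris?"))
```

The essence is just that: model output is parsed into a function call, the program executes it, and (in a fuller version) the result is fed back into the next prompt.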
Software with intuition
Examples: Voyager
Explanation: LLMs seem to carry a "world model", and we can use that model to give programs something like intuition.
Diagnosis: The most exciting development in software right now. Software that has a baseline understanding of how the world works will change how our world works in the coming years.
What I would call this (instead of agent): Agent. I think this is the underlying building block that makes software “agential.”
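One way to picture "intuition in software": use a model's judgment as a heuristic inside an otherwise ordinary program, the way Voyager uses an LLM to decide what's sensible to do next in Minecraft. This is a hypothetical sketch; `fake_llm_score` stands in for asking a real model to rate how reasonable an action is.

```python
def fake_llm_score(action: str) -> float:
    # Stand-in for model "intuition": a real LLM would rate the action
    # from its world knowledge; hard-coded here for the sketch.
    scores = {"mine wood": 0.9, "fight dragon unarmed": 0.1}
    return scores.get(action, 0.5)

def pick_action(candidates: list[str]) -> str:
    # Classic software structure (pick the max), with the model's
    # world knowledge supplying the scoring heuristic.
    return max(candidates, key=fake_llm_score)

print(pick_action(["fight dragon unarmed", "mine wood"]))  # mine wood
```

The loop, the data structures, and the control flow are ordinary code; only the heuristic is "agential", which is why this feels like a building block rather than a whole new kind of program.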
My Thoughts
I mostly ignore the AGI agents. I think they’re important fundamental work! But I don’t think we’re that close to AGI, and I think OpenAI and Anthropic are likely further along than OSS.
Nondeterministic prompt chains + LLMs that use tools are new programming paradigms that demand new developer tools (and best practices).
Software with intuition is a new way of conceiving what is possible that will lead to a wide swath of products that will disrupt how we interact with computers.
WE NEED A BETTER VOCABULARY THAN JUST THE WORD AGENT
Happy Monday, all!