I was catching up with my friend Jason over lunch, and he made a statement that surprised me. He mentioned that his goal over the next 3 years was to delegate enough so that he could manage his company(s) through only his apple watch. This was not surprising - nor should it be that provocative to many of you, my loyal readers. Copilots are excellent now, and agents are right around the corner. Bootstrapping a collection of agents to an AI "team", or even "department" will follow as scaling continues and roadblocks are un-hobbled.
What was surprising is that he stated that the last person he would fire would be his personal assistant.
I think I now understand why he made this claim. I'll use this post to explain that, as well as some of my current thoughts on the state of LLMs and their capabilities.
To whom will we delegate?
If we believe we'll eventually make the transition to managing AI teams, it will be helpful to understand what makes a good subordinate now! And here, we can turn to the Army: One hint that our tools have progressed enough for Jason to eschew his laptop will be if our agents, departments, and firms (here, we will call this amorphous blob an "entity") can create Completed Staff Work, a term originating from a 1942 Army memo (half a page, and worth many re-reads).
Their definition:
Completed staff work is a principle of management which states that subordinates are responsible for submitting written recommendations to superiors in such a manner that the superior needs to do nothing further in the process other than to review the submitted document and indicate approval or disapproval.
The line of argument goes: The chief should not suffer your half-baked thoughts and in-progress work. In fact, You are doing yourself a disservice by presenting in-flight work because you avoid the important and clarifying work of the ironing it all out; you also do the chief a disservice by clouding him with the irrelevant. In other words, it asks that staff practice computational kindness "The less the brain has to do, the more it likes to do it. When you create situations where the brain doesn't have to think too much - you're being computationally kind." towards their superiors. While yes, the ultimate decision responsibility lies with the chief, the cognitive and computative responsibility lies with the staff.
As a simplifying exercise, we claim that the property "the ability to generate complete staff work" is a sufficient condition for a system that Jason can use to generate that sweet, sweet shareholder value.
So let's build them. All we have to do is add the following to all of our prompts, and we are done, right?
It is so easy to ask the chief what to do, and it appears so easy for him to answer. Resist that impulse. You will succumb to it only if you do not know your job. It is your job to advise your chief what he ought to do, not to ask him what you ought to do. He needs answers, not questions. Your job is to study, write, restudy and rewrite until you have evolved a single proposed action – the best one of all you have considered.
Unironically, this probably does help with response quality. But, it won't get us all the way to agents that can complete staff work. As of writing this article in September 2024, current systems are unable to produce complete staff work: they are mostly un-agentic, they have limitied access to external resources, and they are probably a bit to sychophantic to produce quality output.
Nonetheless, we can probably build such competent systems. In the next section, we'll inspect the qualities of an entity that can produce complete staff work; hopefully understanding the qualities of a great subordinate give us hints on how to design the systems and infrastructure that will be the sinew of those future entities.
Desiderata for a capable subordinate:
I'm not sure if this is all of the qualities needed to produce "complete staff work", but it's a good enough start. Lets work through how each of these qualities interact with the capabilities of current LLM / agent systems.