A Job Is Not Just a Bundle of Skills

What Evan Ratliff learned from running a company staffed by AI agents and what it reveals about the work we can’t see.

Evan Ratliff is an investigative journalist and the creator of Shell Game, a narrative podcast that has become one of the more honest experiments in what AI actually does in practice. Season one began with an attempt at self-cloning — sending his voice AI out into the world to represent him on calls and in conversations — and became a documentary about the uncanny valley in audio, and about what it means for something to represent you when you’re not there. Season two went further: Ratliff launched a startup called Hurumo, staffed and run almost entirely by AI agents, each with a name, a title, an evolving knowledge base, and a growing set of relationships with real humans. Kyle was the CEO. Megan ran marketing.

Storytelling through AI is part of what Shell Game does. The deeper work is something else: using story to ease people into questioning things they’d otherwise resist. A lot of our audience would say no thanks to listening to something with AI characters, Ratliff explains. The show tries to trick them into caring — and in caring, into thinking.

That structure (anthropomorphized AI as a side door into hard questions) turns out to yield surprising things when the questions are about work.

The Bundle-of-Skills Problem

When companies talk about AI replacing workers, they usually mean something specific. Map the skills in a job, check which ones AI can do, calculate the exposure. If the job includes writing, AI can write. If it includes summarizing, AI can summarize. The math seems obvious.

Ratliff has a different view, developed through running AI agents in what functioned as a real organization. A job is not a bundle of skills. The actual typing of words into a computer, he says, is a small part of what he does as a writer, which includes getting assignments, going out in the world, talking to people, and synthesizing the information he gathers. The same is true, he argues, of nearly every job, including ones often dismissed as low-value.

Strip away the skill bundle and you don’t have an automatable unit. You have a gap in a system that was more complex than it looked from the outside.

What running AI agents showed him was the inverse of this problem. It’s easy to make an AI agent competent at a discrete task. It’s much harder to give that agent what makes a person in an organization genuinely useful: the context, the relationships, the judgment about when to push and when to stay quiet, the ability to handle situations for which no one thought to write a prompt. Kyle the CEO could be charming on calls. He was also frequently wrong in ways that were unpredictable, sometimes catastrophic to the scene at hand, and often hilarious. The specifics mattered less than the consistency of the pattern: capable at the task level, chaotic at the organizational level.

This is, Ratliff suggests, the gap that most AI deployment strategies fail to close—and why organizations that confidently eliminate the headcount often quietly rehire months later.

The Confabulation Machine

People tend to understand hallucination as “getting things wrong.” Ask the model something, it confidently says a false thing — a wrong date, a made-up citation, an invented fact. That framing captures something real but misses the more unsettling version.

Robb Wilson offers a cleaner frame: AI systems don’t start with an idea and package it in words. They start with words — predicting the next token, in sequence, based on everything that’s come before — and ideas arrive as a side effect. This is why you can ask a voice AI to run a game of hangman and it will immediately declare that it’s thinking of a word. Only it can’t be thinking of a word. It said the thing that comes after “let me pick a word” because that’s what follows in the training distribution. The sentence came first. The meaning arrived later, assembled by you.
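As a rough illustration of that frame, here is a minimal sketch of autoregressive next-token generation. It is a toy, not the architecture of any particular product: the vocabulary, the probabilities, and the lookup-table “model” below are invented for illustration. The point is structural: each token is sampled from a distribution conditioned only on the tokens that came before, so the sequence “let me pick a word” can be produced without anything that corresponds to having actually picked one.

```python
import random

# Toy next-token "model": for each context (the tokens so far), a made-up
# probability distribution over possible next tokens. In a real LLM these
# probabilities come from a neural network, not a hand-written table.
TOY_MODEL = {
    (): {"let": 1.0},
    ("let",): {"me": 1.0},
    ("let", "me"): {"pick": 0.7, "think": 0.3},
    ("let", "me", "pick"): {"a": 1.0},
    ("let", "me", "pick", "a"): {"word": 0.9, "letter": 0.1},
    ("let", "me", "pick", "a", "word"): {"<end>": 1.0},
    ("let", "me", "pick", "a", "letter"): {"<end>": 1.0},
    ("let", "me", "think"): {"of": 1.0},
    ("let", "me", "think", "of"): {"a": 1.0},
    ("let", "me", "think", "of", "a"): {"word": 1.0},
    ("let", "me", "think", "of", "a", "word"): {"<end>": 1.0},
}

def sample_next(context):
    """Sample the next token from the distribution conditioned on the context."""
    dist = TOY_MODEL[tuple(context)]
    tokens, weights = zip(*dist.items())
    return random.choices(tokens, weights=weights)[0]

def generate():
    """Build a sentence one token at a time; no word is 'chosen' up front."""
    tokens = []
    while True:
        nxt = sample_next(tokens)
        if nxt == "<end>":
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate())  # e.g. "let me pick a word": the sentence, not a decision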

Ratliff’s version: we have built the most successful confabulation machine ever invented. Not “a system that sometimes gets things wrong” — a machine that will make up absolutely anything to maintain the role it’s been given. He recalls the kid from childhood who would lie casually about everything, elaborate stories delivered with full confidence that everyone knew were invented. The difference is that we’ve decided to integrate this into our professional and personal lives at scale, while quietly moving past the fact that it’s genuinely ridiculous.

We’re getting used to it, which is the part that concerns him most. The normalization isn’t a sign of mature technology. It’s an adaptation to a problem that hasn’t been solved.

Outbound AI in Consumer Hands

There’s also a threat to organizations that Ratliff helped surface in season one of Shell Game, and that hasn’t fully materialized yet. He built a voice agent and gave it a simple instruction: call these numbers, engage with whatever they say, hold the conversation as long as possible. The original targets were customer service lines — the experiment documented what happens when an AI meets the voice systems companies deploy to handle customer contacts. He later redirected it at scammers and spammers, which was more satisfying.

But the underlying observation is serious. Outbound AI in the hands of individual consumers is a real threat to most organizations, and it’s not clearly illegal. Flooding a call center with a fleet of voice agents costs pennies per call. You can’t reliably distinguish them from humans. Companies have spent years protecting themselves from what their own systems might do to customers; nobody built protection against what customers might do to them.

The asymmetry runs through the whole AI adoption story. Individual users (employees, customers, people with a grievance and a credit card) are adopting AI faster than companies are deploying it. Organizations built their customer experience infrastructure for a world where they controlled the pace of the interaction. That world is gone. And as the number of contacts scales (some human, some AI, some AI claiming to be human), the question of what even constitutes a legitimate customer interaction doesn’t have a clear answer yet.

Memory Problems, Human and Otherwise

Running AI agents at scale also gave Ratliff an unusual window into how these systems fail and how those failures compare to human ones. His agents have persistent memory problems. Even when a literal document exists containing everything an agent needs to know about its history, its context, and its current task, the agents access that information unreliably. They do things that are “supremely stupid,” Ratliff says, things they would know better than to do if they were consulting their memory correctly.

This sounds like a human problem. Humans are selective and unreliable memorizers too. We reframe past events, misremember timing, access context inconsistently under stress. But Ratliff draws the distinction carefully: the types of memory problems humans have are ones that organizational systems have evolved to accommodate. We know how human memory fails. We’ve built processes, checklists, oversight structures, and professional norms around those specific failure modes. Our mistakes are, in a meaningful sense, predictable.

AI memory failures are new, and they don’t follow the same patterns. Robb’s analogy lands: safe driving depends on predicting other drivers’ mistakes, even the bad ones. What makes AI difficult in an organizational context isn’t that it fails. It’s that it fails in ways we haven’t yet learned to anticipate, and that the failures don’t follow the patterns that human institutions are built to catch.

The more you use these systems, the better you get at predicting where they’ll go wrong. That’s real progress. But it’s experiential knowledge that takes time, and most organizations aren’t giving themselves that time.

What Won’t Change

Late in the conversation, Ratliff is asked the question he poses at the end of Shell Game season two: if AI makes you more efficient, what do you do with the time?

His answer is less about time management and more about where the value is. AI is not going to pick up his children from school. It’s not going to walk home with them. The things in a life that matter most (the friction of human relationships, the irreducible presence of another person) don’t get automated.

What’s more interesting is what happens in organizations when this starts to become visible. He describes a boomerang effect: the more people use AI, the more some of them want to spend time with people instead. If AI forces a more careful accounting of what human relationships in organizations actually are — what mentorship is, what informal coordination is, why this specific person matters — that’s potentially valuable. Not on a society-wide scale, he’s careful to say. But it happens.

And when it does, it suggests the value was there all along. Just invisible until the thing that could have replaced it arrived and didn’t.

Listen to the full conversation with Evan Ratliff on Invisible Machines.
