Similar to intelligence, agency can be thought of as a spectrum. Some things are more agentive than others. Is a hammer agentive? No. I mean if you want to be indulgently philosophical, you could propose that the metal head is acting on the nail per request by the rich gestural command the user provides to the handle. But the fact that it’s always available to the user’s hand during the task means it’s a tool—that is, part of the user’s attention and ongoing effort.
Less philosophically, is an internet search an example of an agent? Certainly the user states a need, and the software rummages through its internal model of the internet to retrieve likely matches. This direct cause-and-effect means that it’s more like the hammer with its practically instantaneous cause-and-effect. Still a tool.
But as you saw before, when Google lets you save that search, such that it sits out there, letting you pay attention to other things, and lets you know when new results come in, now you’re talking about something that is much more clearly acting on behalf of its user in a way that is distinct from a tool. It handles tasks so that you can use your limited attention on something else. So this part of “acting on your behalf”—that it does its thing while out of sight and out of mind—is foundational to the notion of what an agent is, why it’s new, and why it’s valuable. It can help you track something you would find tedious, like a particular moment in time, or a special kind of activity on the internet, or security events on a computer network.
To do any of that, an agent must monitor some stream of data. It could be something as simple as the date and time, or a temperature reading from a thermometer, or it could be something unbelievably complicated, like watching for changes in the contents of the internet. It could be data that is continuous, like wind speed, or irregular, like incoming photos. As it watches this data stream, it looks for triggers and then runs over some rules and exceptions to determine if and how it should act. Most agents work indefinitely, although they could be set to run for a particular length of time or when any other condition is met. Some agents like a spam filter will just keep doing their job quietly in the background. Others will keep going until they need your attention, and some will need to tell you right away. Nearly all will let you monitor them and the data stream, so you can check up on how they’re doing and see if you need to adjust your instructions.
So those are the basics. Agentive technology watches a datastream for triggers and then responds with narrow artificial intelligence to help its user accomplish some goal. In a phrase, it’s a persistent, background assistant.
If those are the basics, there are a few advanced features that a sophisticated agent might have. It might infer what you want without your having to tell it explicitly. It might adapt machine learning methods to refine its predictive models. It might gently fade away in smart ways such that the user gains competence. You’ll learn about these in Part II, “Doing,” of this book, but for now it’s enough to know that agents can be much smarter than the basic definition we’ve established here.
How Different Are Agents?
Since most of the design and development process has been built around building good tools, it’s instructive to compare and contrast them to good agents—because they are different in significant ways.
|A Tool-Based Model||An Agent-Based Model|
|A good tool lets you do a task well.||A good agent does a task for you per your preferences.|
|A hammer might be the canonical model.||A valet might be the canonical model.|
|Design must focus on having strong affordances and real-time feedback.||Design must focus on easy setup and informative touchpoints.|
|When it’s working, it’s ready-to-hand, part of the body almost unconsciously doing its thing.||When the agent is working, it’s out of sight. When a user must engage its touchpoints, they require conscious attention and consideration.|
|The goal of the designer is often to get the user into flow (in the Mihalyi Csikszentmihalyi sense) while performing a task.||The goal of the designer is to ensure that the touchpoints are clear and actionable, to help the user keep the agent on track.|
Drawing a Boundary Around Agentive Technology
To make a concept clear, you need to assert a definition, give examples, and then describe its boundaries. Some things will not be worth considering because they are obviously in; some things will not be worth considering because they are obviously out; but the interesting stuff is at the boundary, where it’s not quite clear. What is on the edge of the concept, but specifically isn’t the thing? Reviewing these areas should help you get clear about what I mean by agentive technology and what lies beyond the scope of my consideration.
It’s Not Assistive Technology
Artificial narrow intelligences that help you perform a task are best described as assistants, or assistive technology. We need to think as clearly about assistive tech as we do agentive tech, but we have a solid foundation to design assistive tech. We have been working on those foundations for the last seven decades or so, and recent work with heads-up displays and conversational UI are making headway into best practices for assistants. It’s worth noting that the design of agentive systems will often entail designing assistive aspects, but they are not the same thing.
It seems subtle at first, but consider the difference between two ways to get an international airline ticket to a favorite destination. Assistive technology would work to make all your options and the trade-offs between them apparent, helping you avoid spending too much money or winding up with a miserable, five-layover flight, as you make your selection. An agent would vigilantly watch all airline offers for the right ticket and pipe up when it had found one already within your preferences. If it was very confident and you had authorized it, it might even have made the purchase for you.
It’s Not Conversational Agents
“Agent” has been used traditionally in services to mean “someone who helps you.” Think of a customer service agent. The help they give you is, 99 percent of the time, synchronous. They help you in real time, in person, or on the phone, doing their best to minimize your wait. In my mind, this is much more akin to an assistant. But even that’s troubling since “assistant” has also been used to mean “that person who helps me at my job” both synchronously—as in “please take dictation”—and agentively—as in “hold all my calls until further notice.”
These blurry usages are made even blurrier because human agents and assistants can act in both agentive and assistive ways. But since I have to pick, given the base meanings of the words, I think an assistant should assist you with a task, and an agent takes agency and does things for you. So “agent” and “agentive” are the right terms for what I’m talking about.
Complicating that rightness is that a recent trend in interaction design is the use of conversational user interfaces, or chatbots. These are distinguished for having users work in a command line interface inside a chat framework, interacting with software that is pretty good at understanding and responding to natural language. Canonical examples feature users purchasing airline tickets (yes, like a travel agent) or movie tickets.
Because these mimic the conversations one might have with a customer service agent, they have been called conversational agents. I think they would be better described as conversational assistants, but nobody asked me, and now it’s too late. That ship has sailed. So, when I speak of agents, I am not talking about conversational agents. Agentive technology might engage its user through a conversational UI, but they are not the same thing.
It’s Not Robots
No. But holy processor do we love them. From Metropolis’ Maria to BB8 and even GLaDoS, we just can’t stop talking and thinking about them in our narratives.
A main reason I think this is the case is because they’re easy to think about. We have lots of mental equipment for dealing with humans, and robots can be thought of as a metal-and-plastic human. So between the abstraction that is an agent, and the concrete thing that is a robot, it’s easy to conflate the two. But we shouldn’t.
Another reason is that robots promise—as do agents—“ethics-free” slave labor (please note the irony marks, and see Chapter 12 “Utopia, Dystopia, and Cat Videos” for plenty of ethical questions). In this line of thinking, agents work for us, like slaves, but we don’t have to concern ourselves about their subservience or even subjugation the same way we must consider a human, because the agents and robots are programmed to be of service. There is no suffering sentience there, no longing to be free. For example, if you told your Nest Thermostat to pursue its dreams, it should rightly reply that its dream is to keep you comfortable year round. Programming it for anything else might frustrate the user, and if it is a general artificial intelligence, be cruel to the agent.
Of course, robots will have software running them, which if they are to be useful, will be at times agentive. But while our expectations are that the robot’s agent stays in place, coupled as we are to a body, that’s not necessarily the case with an agent. For example, my health agent may reside on my phone for the most part, but tap into my bathroom scale when I step on it, parley with the menu when I’m at a restaurant, pop onto the crosstrainer at the gym, and jump to the doctor’s augmented reality system when I’m in her office. So while a robot may house agentive technology, and an agent may sometimes occupy a given robot, these two elements are not tightly coupled.
It’s Not Service by Software
I actually think this is a very useful way to think about agentive tech: service delivered by smart software. If you have studied service design, then you have a good grounding in the user-centered issues around agentive design. Users often grant agency to services to act on their behalf. For example, I grant the mail service agency to deliver letters on my behalf and agree to receive letters from others. I grant my representative in government agency to legislate on my behalf. I grant the human stock portfolio manager agency to do right by my retirement. I grant the anesthesiologist agency to keep me knocked out while keeping me alive, even though I may never meet her.
But where a service delivers its value through humans working directly with the user or delivering the value “backstage,” out of sight, an agent’s backstage is its programming and the coordination with other agents. In practice, sophisticated agents may entail human processes, but on balance, if it’s mostly software, it’s an agent rather than a service. And where a service designer can presume the basic common senses and capabilities of any human in its design, those things need to be handled much differently when we’re counting on software to deliver the same thing.
It’s Not Automation
If you are a distinguished, long-time student of human-computer interaction, you will note similar themes from the study of automation and what I’m describing. But where automation has as its goal the removal of the human from the system, agentive technology is explicitly in service to a human. An agent might have some automated components, but the intentions of the two fields of study are distinct.
Hey Wait—Isn’t Every Technology an Agent?
Hello, philosopher. You’ve been waiting to ask this question, haven’t you? A light switch, you might argue, acts as an agent, monitoring a data stream that is the position of the knife switch. And when that switch changes, it turns the light on or off, accordingly.
Similarly, a key on a keyboard watches its momentary switch and when it is depressed, helpfully sends a signal to a small processor on the keyboard to translate the press to an ASCII code that gets delivered to the software that accumulates these codes to do something with them. And it does it all on your behalf. So are keys agents? Are all state-based machines? Is it turtles all the way down?
Yes, if you want to be philosophical about it, that argument could be made. But I’m not sure how useful it is. A useful definition of agentive technology is less of a discrete and testable aspect of a given technology, and more of a productive way for product managers, designers, users, and citizens to think about this technology. For example, I can design a light switch when I think of it as a product, subject to industrial design decisions. But I can design a better light switch when I think of it as a problem that can be solved either manually with a switch or agentively with a motion detector or a camera with sophisticated image processing behind it. And that’s where the real power of the concept comes from. Because as we continue to evolve this skin of technology that increasingly covers both our biology and the world, we don’t want it to add to people’s burdens. We want to alleviate them and empower people to get done what needs to be done, even if we don’t want to do it. And for that, we need agents.