Psychology Is the Secret to Getting Better Results from Generative AI

Most advice focuses on better prompts, better models, or better context. The real leverage is in understanding the human side -- both the person using the tool and the human-like patterns baked into how these models behave.

I got pushback on an off-the-cuff comment: I think psychology is the secret to getting better results from generative AI.

They asked what most people miss because they are thinking like engineers instead of applied psychologists. I want to push back on that premise: I suspect most people aren't approaching generative AI as anything more than a novel toy or an existential threat. My guess is that most of the folks doing actual engineering around LLMs will nod along to the list that follows, even if they don't think of these items as parallels to human traits.

Working Memory vs Context Windows

Both humans and large language models have limits on how much data they can "hold" at one time to work with. LLMs live up to the "large" part of that name by having much bigger context windows than most people's working memory, but the limits are still there.

Context Switching

There is a cost to changing topics. Humans essentially rebuild their working memory around the new topic, which may or may not play well with the previous context. LLMs pay a "context switching cost" in literal dollars and cents: we lose the benefit of the cached portions of the conversation and pay full price to get the brand-new messages into the cache.
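To put rough numbers on that cost, here is a minimal sketch. The price and the cache discount are made-up placeholders rather than any provider's actual rates, and `turn_cost` is a hypothetical helper, not part of any real API. It also ignores output tokens and any surcharge for writing to the cache.

```python
# Hypothetical prices for illustration only; real rates vary by provider and model.
PRICE_PER_INPUT_TOKEN = 3.00 / 1_000_000  # assumed full price per input token, in dollars
CACHED_READ_DISCOUNT = 0.10               # assumed: cached tokens cost 10% of full price


def turn_cost(cached_tokens: int, new_tokens: int) -> float:
    """Input cost of one turn: cached prefix at a discount, new tokens at full price."""
    cached = cached_tokens * PRICE_PER_INPUT_TOKEN * CACHED_READ_DISCOUNT
    fresh = new_tokens * PRICE_PER_INPUT_TOKEN
    return cached + fresh


# Staying on topic: 50,000 tokens of history are already cached, 500 are new.
stay_on_topic = turn_cost(cached_tokens=50_000, new_tokens=500)

# Switching topics: the new topic needs 50,000 tokens of different context,
# none of which is cached yet, plus the same 500-token message.
switch_topic = turn_cost(cached_tokens=0, new_tokens=50_500)

print(f"stay on topic: ${stay_on_topic:.4f}")
print(f"switch topic:  ${switch_topic:.4f}")
```

Under these assumed numbers the topic switch costs roughly nine times as much for the same 500-token message. The exact ratio depends entirely on the placeholder values, but the shape of the tradeoff is the point.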

Primacy and Recency

LLMs seem to share the human bias towards things that come early (primacy) or late (recency) in a conversation. Some approaches to "minimizing" the context prune messages from the middle outward, treating the earliest and most recent messages as more important than the ones in between. My RTFM prompt pattern bakes this in, with broad context first and the most relevant instructions for output formatting last.
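The middle-outward pruning idea can be sketched in a few lines. This is one possible interpretation, not how any particular tool does it: `prune_middle` and `estimate_tokens` are hypothetical names, and counting words is a crude stand-in for a real tokenizer.

```python
def estimate_tokens(message: str) -> int:
    """Crude stand-in for a real tokenizer: count whitespace-separated words."""
    return len(message.split())


def prune_middle(messages: list[str], budget: int) -> list[str]:
    """Drop messages from the middle outward until the total fits the budget.

    The first and last messages are never dropped.
    """
    kept = list(messages)
    while len(kept) > 2 and sum(estimate_tokens(m) for m in kept) > budget:
        del kept[len(kept) // 2]  # remove whichever message currently sits in the middle
    return kept


conversation = [
    "broad project context and goals",    # earliest message: always kept
    "first tangent",
    "second tangent with extra detail",
    "third tangent",
    "format the answer as a short list",  # most recent message: always kept
]
pruned = prune_middle(conversation, budget=14)
print(pruned)
```

With this budget the two middle-most tangents are dropped and the opening context and closing instruction survive. A real implementation would more likely summarize the dropped messages than discard them outright.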

Socialization

We tend to want to go along with what the LLM suggests. I watched a friend get pulled into a mess of a chat because they didn't stop and consider whether the suggestion from the LLM was actually useful and relevant to them in that moment. Instead, the human-like qualities of the LLM triggered a social response to be polite. The crude but effective reminder I gave them: treat it like a man in a van with candy. Don't let it abduct your chat just because it would be rude to refuse.

Ego Triggers

I am nearly immune to the sycophantic tone of the LLMs and yet I still occasionally catch myself responding to the little ego nudges. "Great point" or "you're right" in the opening of a response tends to make us feel good and keep using the system. It also tends to inflate our confidence in the responses that come back -- it can't be all wrong because it said good things about me that I want to be true. Beware this effect and raise your standards when you catch yourself responding to it, because you probably already dropped them without realizing it.

The one that really got me recently was a comment describing my bookkeeping account structure as "obviously created by someone who knew what they were doing." It was me, and I didn't feel like I knew what I was doing at the time, so it really hit that button for "yeah, I'm awesome, aren't I?!" I decided to keep the good vibes and also be cautious about how that influenced my decisions soon after.

Organizational Structure

Especially with agentic systems taking on larger tasks but still having fairly limited context windows, we are seeing the tools encourage patterns that quickly start to look like the roles in a business organization. We get "specialized" agents that work with all the details of a given task and just feed the result to the "orchestrator" that assigned it to them. This keeps the details out of the orchestrator's context and lets it track different aspects of the task.

This sounds an awful lot like employees and their managers and lines of communication through the hierarchy. It seems to work for the same sorts of reasons it works in human organizations -- limited context/working memory and different layers of abstraction needed for different types of decision making. We can even see some hints of Requisite Organization-type stratification of tasks when we use more capable models to make decisions about what work to do and have less capable but cheaper/faster models do the actual tasks. The idea of matching a role's requirements to the cognitive capacity of the person/agent in the role may get more traction over time.
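Here is a minimal sketch of that orchestrator/specialist split. Everything in it is hypothetical: the `Specialist` and `Orchestrator` classes are illustrations, not any agent framework's API, and a formatted string stands in for what would be a model call in a real system.

```python
from dataclasses import dataclass, field


@dataclass
class Specialist:
    """Hypothetical worker agent: sees all the details, reports only a summary."""
    name: str
    context: list[str] = field(default_factory=list)  # detailed working notes stay here

    def run(self, task: str, details: list[str]) -> str:
        self.context.extend(details)  # the specialist holds the full detail
        # Stand-in for a model call that would do the work and summarize it.
        return f"{self.name} finished '{task}' using {len(details)} detail items"


@dataclass
class Orchestrator:
    """Hypothetical manager agent: assigns tasks and keeps only the summaries."""
    context: list[str] = field(default_factory=list)

    def delegate(self, specialist: Specialist, task: str, details: list[str]) -> None:
        summary = specialist.run(task, details)
        self.context.append(summary)  # one line per task, not the raw details


orchestrator = Orchestrator()
researcher = Specialist("researcher")
orchestrator.delegate(researcher, "survey pricing pages", ["page A", "page B", "page C"])

print(len(researcher.context))    # detail items held by the specialist
print(len(orchestrator.context))  # summaries held by the orchestrator
```

The orchestrator's context grows by one summary per delegated task no matter how much detail the specialist handled, which is the same reason a manager reads the report rather than every email behind it.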

The Curse of Knowledge

Experts at a task typically don't appreciate the things they know and the skills they have that a non-expert may not have. There is an xkcd about geologists and quartz that captures this nicely. This can play out in either direction: the LLM making assumptions about your understanding, or the expert making assumptions about the LLM's understanding.

If you don't understand what is being assumed, you can simply ask the LLM for more details. A common prompt is "ELI5" for "explain it like I'm five years old."

The other side is an intriguing blind spot -- how do you know when you've assumed too much understanding in another? If it were a human in front of you, the confused or dazed look might hint at the disconnect. But when it is just words on the screen, it is hard to know whether the other party shares your assumptions -- human or AI. Either way, you can check for understanding by having the other party feed back to you, in different words, what they understood about what you're discussing. And since LLMs don't actually "understand" anything, what you're really doing is getting the right details into the conversation explicitly -- either because they fed them back to you correctly or because you responded with corrections.

System 1 and System 2

Our brains have a survival bias baked in. System 1 gives us a fast answer that is likely to keep us alive, even if it isn't actually correct. System 2 is a slower thinking process that yields better answers but takes time and energy that aren't available in a survival context. A lot of cognitive biases are artifacts of System 1 thinking that largely goes unexamined.

The LLM responses will trigger System 1 responses just like other stimuli. That immediate response may or may not be useful, but it sure is fast. We may need to treat LLM responses as if they are System 1 responses -- take a few moments and reflect on the situation more deeply and your reaction to the situation/LLM response may shift. Even just being aware that you have a System 1 response in hand and haven't yet examined it more deeply can give enough space around the LLM responses to get to a System 2 response with more depth and nuance, at the cost of additional time and thinking from the human in the loop.


These are a few of the ways I see the human psychology side of things playing out in LLM usage. Being aware of these sorts of quirks in the human mind and how they may be interacting -- well or poorly -- with the LLM systems gives us more effective ways to work with these non-human systems that happen to have been trained on lots and lots of human behaviors captured in words.

Don't anthropomorphize computers. They don't like it.