Psychology Aims For A Unified Theory Of Cognition And AI Will Be A Big Help To Get There

In today’s column, I examine the ongoing pursuit by psychology to devise a unified theory of cognition. The deal is this. Numerous unified theories or models of cognition have been floated over the years. By and large, those theories or models have been sharply criticized as being at times incomplete, illogical, unfounded, and otherwise not yet fully developed. The desire and need for a true and comprehensive unified theory of cognition persist, yet such a theory remains exasperatingly elusive.

Into this pursuit comes the use of AI, especially modern-era AI such as generative AI and large language models (LLMs). Can we make a substantive forward leap on devising a unified theory of cognition via leaning into contemporary AI and LLMs? Some say abundantly yes, others wonder if doing so will be a distraction and lead us down a primrose path.

Let’s talk about it.

This analysis of AI breakthroughs is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here).

AI And Psychology

As a quick background, I’ve been extensively covering and analyzing a myriad of facets regarding the advent of modern-era AI that entails the field of psychology, such as providing AI-driven mental health advice and performing AI-based therapy. This rising use of AI has principally been spurred by the evolving advances and widespread adoption of generative AI. For a quick summary of some of my posted columns on this evolving topic, see the link here, which briefly recaps about forty of the over one hundred column postings that I’ve made on the subject.

There is little doubt that this is a rapidly developing field and that there are tremendous upsides to be had, but at the same time, regrettably, hidden risks and outright gotchas come into these endeavors too. I frequently speak up about these pressing matters, including in an appearance last year on an episode of CBS’s 60 Minutes, see the link here.

You might find of keen interest that AI and psychology have had a longstanding relationship with each other. There is a duality at play. AI can be applied to the field of psychology, as exemplified by the advent of AI-powered mental health apps. Meanwhile, psychology can be applied to AI, such as aiding us in exploring better ways to devise AI that more closely approaches the human mind and how we think. See my in-depth analysis of this duality encompassing AI-psychology and psychology-AI at the link here.

The Enigma Of Human Cognition

The American Psychological Association (APA) defines cognition this way:

“Cognition: All forms of knowing and awareness, such as perceiving, conceiving, remembering, reasoning, judging, imagining, and problem solving.” (source: APA online Dictionary).

One nagging mystery underlies how it is that we can think and embody cognition.

All sorts of biochemical elements in our brain seem to work in a manner that gives rise to our minds and our ability to think. But we still haven’t cracked the case on how those neurons and other elements in our noggin allow us to do so. Sure, you can trace aspects at a base level, yet explaining how that produces everyday cognition is a puzzle that won’t seem to readily be solved.

This certainly hasn’t stopped researchers from trying dearly to figure things out. Hope springs eternal that the mysteries of cognition will be unraveled and we will one day know precisely the means by which cognition happens. Nobel prizes are bound to be awarded. Fame and fortune are in the cards. And imagine what else we might do to help and overcome cognitive disorders, along with potentially enhancing cognition to nearly unimaginably heightened levels.

This is undoubtedly one of the most baffling mysteries of all time, and there is the purest sense of absolute joy and satisfaction in solving it.

Various Types Of Models

When seeking to come up with a unified theory of cognition, the route taken usually entails these four major paths:

(1) Conceptual model: A high-level conceptual model is narratively sketched, consisting of an abstract depiction that articulates the hypothesized theory at hand.
(2) Mathematical model: A series of stipulated equations and formulas are identified that are believed to represent how cognition operates.
(3) Biochemical model: A set of biological stipulations and chemical formulas is laid out to showcase how cognition arises.
(4) Computational model: A computer-based computational model is built that purports to simulate or represent the nature of cognition, of which the latest models usually incorporate state-of-the-art AI capabilities.

You can use only one of those approaches, or you can use two or more.

If you opt to use two or more, your best bet is to make sure each model aligns with the other models being utilized. Any misalignment will indubitably bring criticism and skepticism raining down upon you. For example, if you propose a conceptual model and a mathematical model, but those two don’t sync up, it becomes an easy line of attack to suggest that your theory is hogwash.

AI And Computational Models

A tempting avenue for cognition modeling these days is to rely upon an AI-based computational model that leverages the latest generative AI and LLMs. You can essentially repurpose a popular LLM, e.g., OpenAI’s ChatGPT, which garners 400 million weekly active users, or Anthropic Claude, Google Gemini, Meta Llama, and so on.

Those off-the-shelf LLMs are ready-made for experimenting on psychology-based premises. I recently explained how contemporary generative AI is devised to react to psychological ploys and techniques, an intriguing facet that is both helpful and potentially hurtful, see my coverage at the link here.

One monumental wrinkle is whether a conventional LLM is suitable for representing a semblance of human cognition. Allow me to elaborate on this vital point.

The mainstay of LLMs makes use of an artificial neural network (ANN). This is a series of mathematical functions that are computationally rendered in a computer system. I refer to this as an artificial neural network to try and distinguish it from a true neural network (NN) or wetware that is inside your head.

Please be aware that ANNs are an exceedingly loosely contrived variation of true NNs. They are not the same. An ANN is quite far from real NNs and, in contrast, is many orders of magnitude simpler. For my detailed explanation about ANNs versus NNs, see the link here.

The bottom line is that an instant criticism of any cognition research that dovetails into LLMs is that you are starting at a recognized point of heated contention. Namely, a cogent argument is that since ANNs are not the same as true NNs, you are building your cognition hopes on somewhat of a house of cards. The counterargument is to acknowledge that ANNs are indeed not an isomorphic match, and instead, you are merely engaging them to aid in a broad-based simulation that doesn’t have to be a resolute match.

In any case, I stridently support using LLMs as insightful exploratory vehicles and assert that we can gain a great deal of progress about cognition in doing so, assuming we proceed mindfully and alertly.

LLMs And Intrinsic Human Behavior

Suppose you decide to use an off-the-shelf LLM to perform a cognitive modeling investigation. There is something important that you need to be thinking about. I shall unpack the weighty consideration.

First, be aware that LLMs are developed via pattern-matching on human writing that is scanned across the Internet. That’s how the fluency of LLMs comes about. The ANN is used to pattern-match on how we use words. In turn, when you enter a prompt into generative AI, the generated response consists of words composed into sentences that appear to be on par with human writing. Those responses reflect computational mimicry arising from extensive pattern-matching on words (actually, on tokens, see the details in my discussion at the link here).

You can’t readily declare that the LLM is thinking like humans. The AI is using words and patterns about the usage of words. That’s not necessarily a direct embodiment of human thinking per se. It is more likely an indirect outcome of human thinking.

One clever idea is to augment an off-the-shelf LLM by aiming to further data-train the AI on veritable traces of human thinking (well, kind of, as you’ll see momentarily). Perhaps that will enable the LLM to be more closely aligned with what human cognition consists of. For example, I fed transcripts of therapist-patient sessions into a major LLM to see if it might be feasible to augment its data training and guide the AI toward behaving more like a versed human therapist, see my experiment at the link here.

Psych Experimental Results As Rich Data

What other kinds of data could we potentially use to perform augmented data training of an LLM so that it can be more readily suited for cognition experimentation?

Easy-peasy, tap into the vast trove of psychology experiments that have been performed on all sorts of people for many decades. Here are the steps. Collect that data. Work the data into a readable and usable shape. Feed it into an existing LLM, doing so via a method such as RAG (retrieval-augmented generation), see my explanation of RAG at the link here.

Voila, perhaps you’ve tuned up conventional generative AI to better simulate human behavior.

A recent research study took that innovative approach. In an article entitled “A Foundation Model To Predict And Capture Human Cognition” by Marcel Binz et al., Nature, July 2, 2025, the researchers made these key points (excerpts):

“Establishing a unified theory of cognition has been an important goal in psychology.”
“An important step towards a unified theory of cognition is to build a computational model that can predict and simulate human behavior in any domain. In this paper, we take up this challenge and introduce Centaur — a foundation model of human cognition.”
“We derived Centaur by fine-tuning a state-of-the-art language model on a large-scale dataset called Psych-101. Psych-101 has an unprecedented scale, covering trial-by-trial data from more than 60,000 participants performing in excess of 10,000,000 choices in 160 experiments.”
“We believe that such models provide tremendous potential for guiding the development of cognitive theories, and we present a case study to demonstrate this.”

Details Of The Approach

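The researchers chose to use Meta Llama as their base LLM. The fine-tuning was done via the increasingly popular technique known as QLoRA (quantized low-rank adaptation). Unlike RAG, which supplies extra material to the AI at query time, QLoRA adjusts the model itself by training a small set of added weights on top of a quantized and frozen base model.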
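To get a rough feel for why low-rank adaptation is considered economical, here is a minimal sketch that compares the number of trainable weights for a single weight matrix under full fine-tuning versus a low-rank adapter. The matrix size and the rank are illustrative assumptions on my part, not the values used for Centaur, and the quantization half of QLoRA is not modeled here at all.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def low_rank_adapter_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable weights when only two thin matrices are learned.

    The adapter is B (d_out x rank) times A (rank x d_in); the original
    matrix stays frozen and the product B @ A is added to it.
    """
    return rank * (d_out + d_in)


# Hypothetical sizes chosen only for illustration.
d_out, d_in, rank = 4096, 4096, 8

full = full_finetune_params(d_out, d_in)
adapter = low_rank_adapter_params(d_out, d_in, rank)

print(f"full fine-tuning: {full:,} trainable weights")
print(f"low-rank adapter: {adapter:,} trainable weights")
print(f"reduction factor: {full // adapter}x")
```

With these made-up sizes, the adapter trains a tiny fraction of the weights that full fine-tuning would touch, which is the gist of why the technique has become popular for adapting very large models.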

They transcribed 160 experiments into natural language data. It was publicly available data. The types of experiments included many of the classics in psychology, such as memory recall, supervised learning, decision-making, multi-armed bandits, Markov decision processes, and others.
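To make the notion of transcribing an experiment into natural language a bit more concrete, here is a minimal sketch of how trial-by-trial records might be turned into text. The `BanditTrial` record, the `transcribe` helper, and the per-trial sentence template are all hypothetical names and wording that I made up for illustration. They are not the format actually used in Psych-101.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BanditTrial:
    """One hypothetical trial: which machine was chosen and the points received."""
    choice: str
    reward: int


# Task instructions shown once at the top of the transcript.
INSTRUCTIONS = (
    "In this task, you have to repeatedly choose between two slot machines "
    "labelled B and C. When you select one of the machines, you will win or "
    "lose points."
)


def transcribe(trials: List[BanditTrial]) -> str:
    """Render the instructions followed by one sentence per trial."""
    lines = [INSTRUCTIONS]
    for trial in trials:
        # Assumed sentence template, invented for this sketch.
        lines.append(f"You press {trial.choice} and get {trial.reward} points.")
    return "\n".join(lines)


example = [BanditTrial("B", 5), BanditTrial("C", -2), BanditTrial("B", 7)]
print(transcribe(example))
```

The upshot is that each participant's session becomes a block of plain text, which is the kind of material that an LLM is already set up to ingest.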

To give you a sense of what those experiments are like, consider these two examples:

Multi-armed bandits: “In this task, you have to repeatedly choose between two slot machines labelled B and C. When you select one of the machines, you will win or lose points. Your goal is to choose the slot machines that will give you the most points.”
Decision-making: “You will choose from two monetary lotteries by pressing N or U. Your choice will trigger a random draw from the chosen lottery that will be added to your bonus. Lottery N offers 4.0 points with 80.0% or 0.0 points with 20.0%. Lottery U offers 3.0 points with 100.0%.”
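As a quick worked illustration of the decision-making example above, here is a minimal sketch that computes the expected value of each lottery. The `expected_value` helper is a name I introduced for this sketch. The point values and probabilities come straight from the quoted task description.

```python
from typing import List, Tuple


def expected_value(outcomes: List[Tuple[float, float]]) -> float:
    """Sum of points times probability over (points, probability) pairs."""
    return sum(points * probability for points, probability in outcomes)


# Lottery N: 4.0 points with 80% or 0.0 points with 20%.
lottery_n = [(4.0, 0.80), (0.0, 0.20)]
# Lottery U: 3.0 points with 100%.
lottery_u = [(3.0, 1.00)]

ev_n = expected_value(lottery_n)
ev_u = expected_value(lottery_u)

print(f"Lottery N expected value: {ev_n:.2f}")
print(f"Lottery U expected value: {ev_u:.2f}")
```

On paper, Lottery N comes out ahead at 3.2 points versus 3.0 points for Lottery U. Yet people frequently opt for the sure thing, and that gap between the arithmetic and what humans actually choose is precisely the kind of behavior that a model of cognition needs to capture.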

Handily, the researchers have opted to make the dataset, known as Psych-101, freely accessible on Hugging Face. In addition, they have nicely made available the fine-tuned Meta Llama model, which they refer to as Centaur, and which is also freely available on Hugging Face.

It is a welcome touch because other researchers can now come along and do not need to begin from scratch. They can reuse the arduous and time-consuming work that went into devising Psych-101 and Centaur. Thus, the dataset and the model are ready-made for launching new investigations and serve as a springboard accordingly.

The Results In Brief

A commonly utilized means of validating an LLM consists of holding back some of the training data so that you can use the holdback for testing purposes. This is a longstanding technique that has been used for statistical model validations.

You might use, say, 90% of the data to do the augmented data training and keep the remaining 10% in reserve. When you are ready to test the LLM, you give it the data that was set aside to see if the AI can adequately predict the presumed unseen data. The researchers did this and indicated that their Centaur LLM did a bang-up job on the hold-out data.
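The 90/10 holdback idea can be sketched in a few lines. This is a minimal illustration under my own assumptions: the `holdout_split` helper is hypothetical, the participant IDs are made up, and splitting at the participant level is just one plausible choice rather than a description of how the study itself partitioned the data.

```python
import random
from typing import List, Sequence, Tuple


def holdout_split(
    records: Sequence[str],
    holdout_fraction: float = 0.10,
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Shuffle the records reproducibly and reserve a fraction for testing."""
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_holdout = int(round(len(shuffled) * holdout_fraction))
    # The first n_holdout shuffled records are reserved; the rest are for training.
    return shuffled[n_holdout:], shuffled[:n_holdout]


# Made-up participant identifiers for illustration.
participants = [f"participant_{i}" for i in range(100)]
train_ids, test_ids = holdout_split(participants)

print(len(train_ids), "participants for training")
print(len(test_ids), "participants held out for testing")
```

The key discipline is that nothing in the held-out portion is ever shown to the model during training. Otherwise, the test merely checks whether the AI has memorized what it was already given.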

The next step typically undertaken is to employ a make-or-break test when aiming to devise a generalizable model. You give the LLM data that is considered outside the initial scope of the augmentation. The handwringing question is whether the LLM will generalize sufficiently to contend with so-called out-of-distribution (OOD) circumstances.

The researchers opted to select a handful of OOD settings, including economic games, deep sequential decision tasks, reward learning, etc. Their reported results indicate that Centaur LLM did quite well at making predictions associated with those previously unseen experimental transcripts.

Overall, kudos to the researchers for thinking outside the box on AI and psychology.

Some Thoughts To Ponder

I’d like to cover a few quick thoughts overall.

First, one agonizing difficulty with gauging an off-the-shelf pre-cooked LLM for any kind of newly encountered circumstances is that it is challenging to know whether such data or similar data might have been scanned during the initial setup of the LLM. Usually, only the AI maker knows precisely what data was initially scanned. Ergo, it is worthwhile to be mindful in interpreting generalizability since an LLM might have already had an unknown leg-up previously.

Second, and perhaps more importantly, the desire to push toward a semblance of cognitive realism by further data training of an LLM is a laudable idea.

Will the AI be more human-like in its reasoning patterns?

Maybe, maybe not.

One important determinant is whether the AI is still resorting to human-like language and not necessarily patterning on human reasoning. There is a huge debate going on regarding LLM foundational models that are claimed to be using “reasoning” versus whether they are still potentially doing heads-down next-token prediction, see my coverage on the lively dispute at the link here.

Taking Next Steps

The overarching aim of grounding computational simulations of cognition in a more psychologically plausible way is exciting. No doubt about that.

The researchers also noted that entirely different AI architectural approaches might be better for us to pursue, beyond the somewhat conventional infrastructures currently dominating the AI realm.

As a heads-up, some ardently believe that our prevailing LLMs and AI architecture are not going to get us to artificial general intelligence (AGI) or artificial superintelligence (ASI). You see, the trend right now is to mainly power up prevailing designs with faster hardware and more computational running time. But the incremental benefits could be misleadingly tying us to a road that leads to a dead-end.

Could the desire to attain a unified model of cognition be the kick in the pants to the AI field to look beyond the groupthink of today’s AI and LLMs?

I certainly hope so.

As General George S. Patton once proclaimed: “If everyone is thinking alike, then somebody isn’t thinking.”
