Taking Another Big Leap In AI Progress


“That’s one small step for (a) man… one giant leap for mankind.”

Despite the confusing transcript, that was a big leap for the world, and one that was destined not to be repeated for another half-century or more. Fast-forward about 50 years, and we’re now wondering not where we will explore in the universe, but what big step we will take, as a global society, in dealing with a technology that feels alien and formidable to us.

Sometimes, like Indiana Jones in The Last Crusade, we find ourselves wondering not just what the big leap will be, but whether a leap of faith will lead us to safety or send us tumbling down into peril.

That’s a little like what this moment feels like now for many people pondering the uncertainty of artificial intelligence. AI can now write poems, paint pictures, and do our bookwork. What’s next?

Proliferation of AI Capabilities

One of the main ideas being promoted right now by experts is that the next big step in AI will be, for lack of a better word, “multimodal.”

What that means is that AI will not confine itself to text, as it largely has to date, answering questions in words, on a screen. It will be given depth and dimension, with voice, with robotics. It will generate audio, video, even perhaps material results, or it will drive vehicles, operate equipment, pick fruit.

That’s multimodal AI. And it’s coming.

“Future AI assistants won’t just respond to typed prompts but will understand a user’s tone of voice, facial expressions, surroundings, and social context,” writes Muhammad Tuhin at ScienceNewsDaily. “The result will be systems that feel more intuitive, adaptable, and human-like in their interactions.”

We should be ready for this kind of evolution.

When AI Knows More Than All of Us

Then there’s the point where AI just becomes smarter than a human, when our own personal agents know more than a whole human community.

Some of this has to do with edge computing, where the LLM can become more powerful and evolved, and still live on your smartphone. There’s the prevailing idea among experts and fans of Ray Kurzweil that “everyone will have the smartest AI” and that each of us will command a superhuman intelligence – maybe one tied to others in an ad hoc network. (This segment from Reid Blackman is something I found interesting.)

What It Looks Like

My colleague Ramesh Raskar, as well as Stanford professor Tengyu Ma, Anthropic research scientist Andi Peng, and Carina Hong of Axiom, addressed the likely future of AI and the next big step that all of us will take to get there. (Disclaimer: I have consulted for Liquid AI.)

One of the standout ideas came when Raskar invoked Marvin Minsky to suggest that, a la Minsky’s The Society of Mind, the bold new AI world will involve distributed networks of intelligence, not just one monolithic hive mind. Raskar used the example of a CEO at a firm.

“We don’t try to make the CEO the smartest person in the company, like (a) highly centralized intelligence, but we say the CEO is more of an orchestrator …the intelligence is actually all the smart people in that company,” he said, suggesting that thinking about AI as centralized infrastructure may be a mistake. “Let’s consider this possibility that actually we might be doing this completely wrong. You know, we have centralized all the data, centralized all the compute, centralized all the talent: is that the right way to think about (AI) globally, or should it be done differently?”

As an alternative, he described a network approach.

“(One) possibility is that the right way to think about intelligence is we actually create many micro AIs that are all over the world, and they have their access to local tools, local data, local context,” he said. “And as they talk to each other, a kind of global intelligence emerges.”
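To picture the network approach Raskar describes, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the `MicroAgent` class, the `ask_network` function, and the sample data are assumptions, not anything presented at the panel. Each agent knows only its own local data, and a fuller answer appears only when their replies are combined.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MicroAgent:
    """A hypothetical 'micro AI' that only knows its own local data."""
    name: str
    local_data: Dict[str, str] = field(default_factory=dict)

    def answer(self, question: str) -> Optional[str]:
        # Each agent can only answer from its own local context.
        return self.local_data.get(question)


def ask_network(agents: List[MicroAgent], question: str) -> Dict[str, str]:
    """Collect replies from every agent that has something to contribute."""
    answers: Dict[str, str] = {}
    for agent in agents:
        reply = agent.answer(question)
        if reply is not None:
            answers[agent.name] = reply
    return answers


# Made-up agents and data, purely for illustration.
agents = [
    MicroAgent("clinic", {"flu season": "cases are rising locally"}),
    MicroAgent("pharmacy", {"flu season": "vaccine stock is low"}),
    MicroAgent("calendar", {"next meeting": "Tuesday"}),
]

combined = ask_network(agents, "flu season")
print(combined)
```

No single agent here holds the whole picture; the combined view exists only at the network level, which is a toy version of the “orchestrator” role Raskar assigns to the CEO.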

He also suggested that corporate strategy may already be heading in this direction.

“If you think about big companies, actually, that’s what they’re doing. You know, they’re not acknowledging it. But instead of creating one large model, they say, ‘oh, maybe it’s mixture of experts, or maybe it’s not (a) mixture of experts, but maybe it’s reasoning, where I’m going to break out of the task.’ So they are decentralizing their definition of intelligence already.”

Progress in Reasoning

Hong, who is pioneering model work at Axiom, talked about how capabilities have evolved quickly, in math and code, and reasoning.

“You can turn every single math problem and solution proof into the computer version, like the computer code, and then get the same incredible success that you have seen in RL for coding,” she said. “So we at Axiom are quite excited about that as the next frontier of AI, which is verified superintelligence. We want to build self-improving systems in the verifiable domain that can let the model reflect on the mistakes it has gotten wrong, and then reflect on what it has gotten right, and then have multiple of them interact with each other to continue to improve on the performance. This is what we believe to be the next leap of AI.”
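To make the idea of a “computer version” of a proof concrete, here is a minimal sketch in Lean 4, one proof language of this kind. The choice of Lean and the theorem name `add_comm_example` are assumptions for illustration; the panel remarks quoted here do not say which formal system Axiom uses.

```lean
-- Illustrative only: a simple math statement written as code.
-- If the file compiles, the proof checker has verified the proof.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete arithmetic fact, verified by direct computation.
example : 2 + 2 = 4 := rfl
```

Because the checker either accepts or rejects a proof, it supplies the kind of automatic right-or-wrong signal that reinforcement learning for code relies on.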

Not a Lot of Models

Continuing with that theory of decentralization, Raskar argued that the market won’t need a lot of model makers, just prolific cloning of a given system onto edge devices.

“They’ll fit on your phone, they’ll fit on your laptop, and it’ll be even better than the models we have today, right?” he said. “All of the models will be very tiny. I think the models will be highly commoditized. I think the possibility is that the big model makers will get out of the business of creating models, because, as you know, the token costs are decreasing by a factor of 100 every year. It’s a race to the bottom. So I think the real value is going to come because all of these models are either distilled or freshly trained for very minor tasks, whether it’s health or legal, my own life, my own calendar, my own email, but (something that) I’m running locally on my own machine, and then, when my agent or my AI talks to your AI and other scientists and so on, new intelligence emerges out of that.”

Mass Intelligence at Work

“I think we’re entering an era of mass intelligence,” Hong said, developing this further. “As the price of reasoning becomes more and more elastic, there will be more and more unexpected use cases, and markets that we haven’t actually had the tools to unlock.”

What this points to, she suggested, is a “massive foundation” for advances globally. “(It’s) so many things, from physics, engineering,” she said. “You could argue that computer science, or a lot of the algorithm problems that we are solving, are based on math, …(it’s) the incredible power of mathematics and fundamental science, leveraged by AI, to be scaled and to be applied at an incredible, unprecedented speed, to all the applied sciences. That is what mass intelligence means.”

Anthropic’s Data Request

There was a lot more in the panel, which is available by transcript or video, about tokens, the emergence of AGI, and more. Near the end, someone asked about Anthropic. Peng, after noting that she works in a scientific role at the company and not in legal, offered a reason for the recent changes in data policy, under which the company is now asking for more user data.

“I think partially, what we want to understand is how to build models that are helpful for particular users, and how that might differentiate between users,” she said. “What’s useful for (an MIT professor) is different than (what’s useful to) an eighth grader coding for the first time, right? And so part of this type of user feedback enables us to better understand how to best support different customers in different use cases. I don’t exactly know if we have an explicitly published plan for how we’re intending on actively using this, but it is something that we are exploring, and we are potentially looking forward to.”

Those are some of the big leaps that these experts are looking for soon. Hearing opinions from academia and business helps all of us get more informed about what’s coming down the pike. Stay tuned.

Forbes