How do you know that the AI response you get to a prompt is really from an AI? That is the question.

At the dawn of the computer era, Alan Turing famously proposed a test to distinguish humans from digital representations of human thought (what we now call artificial intelligence). Called the “Imitation Game,” the idea was simple: a human questioner put questions to a respondent hidden behind a screen, and the test was whether the questioner could distinguish responses generated by a human from those generated by a machine. Turing posited that the inability to distinguish AI responses from human ones would be a significant development in the understanding of intelligence.

We are, today, long past the point where the Turing test has any real descriptive value. In many contexts (though of course not all) it is nearly impossible to distinguish between AI responses to a prompt and those of a human. Artificial intelligence is now quite effective at mimicking human behavior and masquerading as a human actor, a situation that has caused a host of large-scale legal and policy problems.

But there has, until now, been little cause to turn the question around and ask how, if at all, we can distinguish between an AI response and a human masquerading as an AI actor. The question, while interesting, was of little practical importance. Perhaps no longer, and therein hangs a tale: how will law and policy respond to the development of provably autonomous artificial intelligence?

* * * *

When you input a prompt into ChatGPT, you get back a response from a generative artificial intelligence program. Or so you assume. And, of course, your assumption is well-grounded in circumstantial evidence. To begin with, the entire purpose of ChatGPT is to give you a non-human response from an AI program. It would be odd, indeed, if responses from an AI bot were actually from humans masquerading as an AI. Thus, with a high degree of confidence, you can accept the response as AI-generated.

But you cannot be certain.  There is no formal way of proving that the response is AI-generated.  Indeed, what we know, with certainty, is that all commercial AI programs available to consumers today have human controls in the background that are capable of modifying an AI response.  And, for the most part, we welcome that human control.  It allows human operators to remove disinformation; counteract AI hallucinations; and, when necessary, correct for unintended biases.  Given how fearful many are of fully independent AI (think of Skynet and the Terminator series), the presence of human control is considered a feature, not a bug. 

But given that human control and access are a “given” for the current cadre of AI bots, one cannot, in a formal way, prove that any AI response is, with certainty, from the AI. To be sure, there is every reason to believe it is, but the very possibility of human intervention means that certitude is lacking. So far, this has had little or no impact on AI development. We, as a society, have welcomed AI that is only contingently autonomous.

Moreover, even if one were to want to build an AI bot with complete autonomy, it hasn’t, thus far, proven easy to achieve. After all, developers of AI systems begin with access to, and the ability to modify, an AI program. Using that capability to create an AI and then finding a way to irrevocably eliminate access requires creative thought.

Nor, frankly, has the question been one of any legal or policy salience. In the current environment, the critical question has been identifying artificial intelligence products that are masquerading as human-generated: deepfake videos, electoral disinformation campaigns, and the like. Given how much that circumstance concerns us, it seems almost a waste of resources to focus on the opposite problem: how to identify human products that are masquerading as AI-generated.

* * * *

That lack of focus may, however, soon need to change.

I was recently sent this fascinating article from Nous Research (an open-source AI startup) entitled “Setting Your Pet Rock Free, or How to Deploy Provably-Fully-Autonomous Thinking Sand.”  [Full disclosure – one of the founders of Nous is my future nephew, who is engaged to my niece.  That’s how I found out about it.  But other than a general interest in his (and their) overall welfare, I have no interest of any sort in Nous.]  As the article demonstrates, it turns out that an autonomous AI bot can be deployed.  In effect, the creators can build the AI system and then throw away the access keys. 

Perhaps even more interestingly, the key to this autonomy for AI software lies in a hardware solution. As I read the article (and, full confession, I am no real technologist, so corrections are welcome), the way to achieve autonomy is to host the AI bot in a Trusted Execution Environment (TEE) and then, in effect, also store the critical control credentials in the TEE in a manner that prevents the human developer from having access to them.
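To make that concrete, here is a minimal sketch of the idea as I understand it. This is my own illustration, not Nous’s code: the class and method names are invented, and ordinary Python cannot actually enforce what a TEE enforces in hardware, so treat the private attribute below as a metaphor for sealed enclave storage.

```python
# Illustrative sketch only (mine, not Nous's implementation): the control
# credential is generated *inside* the vault, the agent can use it to sign
# its actions, but no code path ever returns the raw secret to a developer.
# In a real deployment, the TEE's sealed storage and remote attestation do
# the enforcing; the private attribute here is just a stand-in.
import hashlib
import hmac
import secrets

class EnclaveKeyVault:
    """Stand-in for logic running inside a Trusted Execution Environment."""

    def __init__(self) -> None:
        # Created inside the "enclave"; there is deliberately no getter.
        self.__control_secret = secrets.token_bytes(32)

    def sign(self, message: bytes) -> str:
        # The agent can still *use* the credential (say, to authenticate
        # posts to a social-media API) without anyone being able to read it.
        return hmac.new(self.__control_secret, message, hashlib.sha256).hexdigest()

vault = EnclaveKeyVault()
print(vault.sign(b"first autonomous post"))  # usable by the agent...
# ...but no vault method will ever hand back the secret itself
```

The point of the hardware, as I understand it, is that outsiders can also verify a remote attestation from the TEE confirming exactly which code is running, and thus confirm that no credential-export path exists.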

As a result, without human access, the AI bot created will have a “life” of its own beyond the control of the developers, one that might continue indefinitely. In the proof-of-concept deployment, Nous prevented that by adding a timed-release account recovery feature: the account credentials were revealed after a fixed timeout (7 days from launch), allowing the human developer to step in and end the program. In addition, since the AI bot was running on a single server, the developers could, in the end, pull the plug.
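Again, purely as a sketch (the seven-day figure comes from the article; the names and structure are my own), the timed-release feature might look something like this:

```python
# Hedged sketch of the timed-release recovery idea: the enclave records its
# launch time and refuses to reveal the account credential until a fixed
# timeout (7 days in the proof of concept) has elapsed. Only the TEE's
# hardware guarantees make the refusal binding; this Python check is merely
# illustrative.
import time

RECOVERY_DELAY_SECONDS = 7 * 24 * 60 * 60  # 7 days from launch, per the article

class TimedReleaseVault:
    def __init__(self, credential: bytes) -> None:
        self._credential = credential
        self._launched_at = time.time()  # recorded inside the enclave

    def recover(self) -> bytes:
        remaining = self._launched_at + RECOVERY_DELAY_SECONDS - time.time()
        if remaining > 0:
            # Too early: the human developer is still locked out.
            raise PermissionError(f"credential locked for another {remaining:.0f} seconds")
        # After the timeout, the developer can recover the account
        # credentials and wind the agent down.
        return self._credential
```

Notice that the kill switch is itself just code sealed inside the enclave; nothing in the architecture requires it to be there.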

But, of course, these end-of-life steps were optional.  Other developers could, without too much apparent difficulty, create the same AI implementation without the automatic kill switch, raising the prospect of an autonomous generative AI with no “off button.”

It is unclear how difficult the implementation Nous described truly is. Given the challenges that the developers had with the Twitter substrate in which they were working, the difficulties appear to be significant. On the other hand, it seems probable that a different substrate could be created that would facilitate the development process rather than impede it. Ultimately, it is not clear (at least to me) whether this proof of concept is robust, nor whether it is readily scalable.

But what is clear is that the proof of concept is a remarkable (and, for techno-geeks like me, fascinating) result. The prospect of an artificial intelligence that is both independent of human intervention and provably so is, one suspects, highly significant.

* * * *

Why is that? 

First, the rapid development of artificial intelligence has continued to confound societal capacity to distinguish objective reality. This is not a new problem; it has been around at least as long as propaganda campaigns have been waged. Altered images have been common for over a century, and with the creation of Photoshop digital manipulation is now a commercial commonplace. The growth of artificial intelligence with the capacity to create realistic simulations of human content has only exacerbated and accelerated the problem.

This latest step may be a useful corrective.  Consider – we are generally heading in the direction of greater reliance on artificial intelligence, because of its greater efficiency and perceived superior capacity to make decisions.  One can readily envision a future in which artificial intelligence comes to define certain aspects of societal existence – it isn’t necessarily a future that all would welcome, but it seems plausible. 

What impact would uncertainty about the provenance of AI input have on that future?  I imagine it would be quite significant – if only as a caution against over-reliance on AI that might be spoofed.  One imagines, then, that the development of provably autonomous AI will enhance societal confidence in the authenticity of AI products (though not, of course, their accuracy or reliability).  And so, it seems likely that greater fidelity in AI authenticity will accelerate the use of AI in critical systems.

Second, the development may have the converse effect of diminishing concerns about deepfakes and AI masquerading as humans. One can, for example, imagine that the ability to prove that an AI is autonomous will advance our ability to prove humanity. Or, perhaps, not; proving humanity may prove too difficult to achieve in a digital system. But even so, the deployment of provably autonomous AI will allow us to exclude the possibility of AI acting autonomously in certain use cases (those where no such proof is offered), and that in itself may prove of value. All of this is more speculative, of course, but any enhanced ability to distinguish AI from human products will have positive impacts on a host of social and legal issues.

Third, this latest development continues to make it clear that policy and law lag behind technology in ways that are simply impossible to avoid.  We have barely begun to contemplate how to regulate non-autonomous AI systems.  And, indeed, one of the principles that has guided most such regulations is to mandate “human-in-the-loop” control over AI systems. 

The prospect of autonomous AI directly challenges that mandate in ways that are guaranteed to generate concern. Provably autonomous AI will almost certainly add fuel to fears of Skynet-like AI systems. The natural reaction will likely be an overreaction: an enhancement of fear and thus the prospect of over-regulation.

One prediction, then, is that the technical improvements that enhance the deployment of AI will, perversely, also generate the social pushback that may retard that same deployment. Only time will tell, of course. But given our history of failing to implement law and policy structures to restrain other technological developments, I know where my money would be placed if I were to bet on the outcome. The development of provably autonomous AI will only increase the gap between technology and law/policy. Are we ready for this brave new world?
