Writing as a human, and the AI pledge

You’ll notice there are no em dashes or excessive emojis here. That’s because it’s written by a human at about 20 tokens per MINUTE rather than by an AI at hundreds of tokens per second.

I think it’s great there are movements like Verified Human and Not By AI to “pledge” and visually indicate that content has been written by a human instead of AI.

100% Written by Human

However, there’s more we need to do. It should be more than the (raises right hand) “honor system”. These badges should be verifiable in an automated and cryptographically sound way. Today there’s nothing stopping bad actors or AI from including the same badges in their own work, so the badges don’t truly help identify authentic human creative work. Without verification, we will lose the ability to discern what is true and authentic.

The idea of AI watermarking is promising, but there are practical limitations. It is a challenge to get the growing number of AI model providers to comply. That compliance needs to come with repercussions to hold them accountable. And it would need to be bound by international law so it’s not confined to individual nation states. Even if we solved the compliance challenge with human institutions across the world, it’s trivial for a bad actor to work around it and use an open source AI model to remove watermarks.

This sounds like a wicked problem without a clear solution, or even a framework for solving it. But I think it’s important we come up with one. Socio-technical problems at the intersection of humans and technology are the hardest ones to solve.

Even if we can’t cover 100% of content with a watermark that certifies whether it is human or AI generated, it would be a meaningful step forward to have some critically important spaces where the content can carry those guarantees.

Think places where getting the authenticity of the content wrong has serious consequences. Evidence presented in court, which has legal consequences that impact people. Information used to make life or death decisions, such as medical imagery. Information used to make critical decisions for the successful functioning of human institutions, such as government voting records. Maybe the content shown on television or in a social media feed, which might not have immediate consequences but has a compounding effect of manipulating and influencing large groups of people to act a certain way or form a certain opinion.

A necessary part of the solution would be end-to-end custody of the content from production to consumption, with a chain of trust that extends across every intermediary between the originator who created it and the person who receives it, so that it can’t be manipulated or distorted along the way. This is much like the end-to-end encrypted chat and video that WhatsApp and Signal provide to protect the privacy of your information as it flows through many computer networks across the world. I think these platforms are actually best positioned to solve this. But the hardest part is the producer side, where the content is originally authored.
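
To make the chain of trust idea a bit more concrete, here is a minimal sketch in Python of what a custody record could look like. Everything in it is an assumption for illustration: the function names (`append_record`, `verify_chain`), the party names, and the demo keys are all hypothetical. It uses HMAC with shared secret keys purely as a stand-in, because that is what the Python standard library offers. A real system would more likely use public-key signatures so anyone can verify without holding a secret. It also only shows that the content was not altered in transit. It does nothing to prove a human authored it, which is the hard part.

```python
import hashlib
import hmac
import json


def content_digest(content: bytes) -> str:
    """SHA-256 fingerprint of the content being passed along."""
    return hashlib.sha256(content).hexdigest()


def _record_payload(record: dict) -> bytes:
    """Canonical bytes of a record, excluding its own tag."""
    fields = {k: record[k] for k in ("party", "content_sha256", "prev_tag")}
    return json.dumps(fields, sort_keys=True).encode("utf-8")


def append_record(chain: list, party: str, key: bytes, content: bytes) -> list:
    """Return a new chain with one more custody record added by `party`.

    Each record is linked to the previous record's tag, so records
    cannot be reordered or swapped without breaking verification.
    """
    record = {
        "party": party,
        "content_sha256": content_digest(content),
        "prev_tag": chain[-1]["tag"] if chain else "",
    }
    # HMAC is a stand-in for a real digital signature (assumption).
    record["tag"] = hmac.new(key, _record_payload(record), hashlib.sha256).hexdigest()
    return chain + [record]


def verify_chain(chain: list, keys: dict, content: bytes) -> bool:
    """Check every record against the received content and the known keys."""
    if not chain:
        return False
    expected_digest = content_digest(content)
    prev_tag = ""
    for record in chain:
        key = keys.get(record["party"])
        if key is None:
            return False  # unknown intermediary
        if record["prev_tag"] != prev_tag:
            return False  # link to the previous record is broken
        if record["content_sha256"] != expected_digest:
            return False  # content differs from what this party handled
        expected_tag = hmac.new(key, _record_payload(record), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected_tag, record["tag"]):
            return False  # record was forged or altered
        prev_tag = record["tag"]
    return True


# Hypothetical demo: an author hands a photo to a platform.
demo_keys = {"author": b"author-demo-key", "platform": b"platform-demo-key"}
photo = b"original photo bytes"
demo_chain = append_record([], "author", demo_keys["author"], photo)
demo_chain = append_record(demo_chain, "platform", demo_keys["platform"], photo)
```

With this sketch, `verify_chain(demo_chain, demo_keys, photo)` passes, while the same call with even one changed byte in the photo fails. One known gap: dropping records from the end of the chain is not detected here, so a real design would need to pin down who the final expected party is.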