I’m writing this right after Apple announced their Vision Pro and OS, and at the crest of the AI/LLM hype. As it stands, it seems like both of these technologies will come to define the next 10 years of evolution.
AI is a hot topic these days, so I’m going to try not to retread too much. Instead, let me share three ideas that I’m fascinated by right now.
One note on terminology. I use AI and LLM interchangeably and without much precision. The latter is mostly what I’m talking about, although my guess is that the field will broaden beyond LLMs specifically, hence the use of AI seems more appropriate.
I. The triumph of a product that captures the imagination
As a technologist, I’m very interested in the mainstreaming of new technological paradigms. This is never a given, and a lot of well-meaning ideas do not take off. It is always useful to look back and dissect how a technology comes to be.
There’s obviously the technology part. With a bit of hindsight, the transformer paper feels like a seminal moment. There’s also a Moore’s-law-esque aspect. Neural network techniques benefit from scale, and we are swimming in scale: the scale of compute and thus parameters, and of the internet and thus datasets. LLMs are now billions of parameters deep, use an entire internet’s worth of data, and take over a month to train on the fastest hardware clusters on the planet. If any one of the hardware, software, data, or research were not up to par, it wouldn’t have come together.
Now, the technology is obviously impressive, but I am much more intrigued by how the product itself kick-started the excitement. There was before ChatGPT, and then there was after ChatGPT.
ChatGPT did a few important things. It was free, it was usable, it had a good design. More importantly, it put LLM technology into the hands of anyone and unlocked a wave of experimentation, speculation, and imagination. One can argue that AI was already prevalent before. Google had it everywhere, from autocomplete in Gmail to demos all through the last few Google I/Os. It just wasn’t tantalizing or available.
The UX was perfect for the moment as well. The interaction model was reminiscent of Siri, Alexa, and the myriad smart home assistants, and we use texts and chats all the time. The blinking cursor, the reply streaming in, and then the conversation itself – passing the Turing test – made for a perfect showcase.
The original ChatGPT was half a generation behind OpenAI’s cutting-edge work, but it didn’t matter. The tech was good enough, the product was excellent, and the price was right. While AI would still be an important development without this little moment, it definitely sparked something here.
One day when we look back and all the subsequent development seems inevitable, it will be thanks to the conversation that ChatGPT ignited.
II. The language power tool
A lot of the current interest in LLMs in the media and culture sphere is focused on misinformation, agency, and alignment. I have a slightly different take. To me, the profound step here is the ability to work with language and information. So much of human interest and knowledge is encoded in language, and the ability to work with it using compute is a large step forward.
I remember taking a class in school about 15 years ago where we were parsing language using state machines and grammar rules. Even back then, those techniques felt archaic, with insurmountable flaws. LLMs are not perfect, but they make working with language feel possible. Not to overstate it, but we now have robust methods that can modify, extract, compress, retrieve, translate, and transmit language while maintaining the general accuracy of the information, in a form usable inside computation.
The Internet is a medium that transacts in language, from the free-flowing content that zaps through its pipes to all the rigid languages (i.e., protocols) that connect the communication infrastructure. With a suitable model, we can now traverse and compute through the stack. As an example, people are now building natural-language translation layers on top of data query languages, which obviates the need to memorize a new lexicon.
I believe this is not being talked about enough. Just to spitball a few ideas: compression allows people to extract information from hour-long conversation transcripts. Modification allows knowledge to be rewritten for different audiences. Translation makes querying large data lakes easier without needing to know how joins cascade.
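To make that last idea concrete, here is a minimal sketch of what such a translation layer might look like. Everything in it is illustrative: `llm_complete` is a hypothetical stand-in for whatever completion call your model provides, and the schema and example query are made up.

```python
# A minimal sketch of a natural-language-to-SQL translation layer.
# `llm_complete` is a hypothetical stand-in for an LLM completion call;
# the schema and question are purely illustrative.

def question_to_sql(question: str, schema: str, llm_complete) -> str:
    """Ask the model to translate a plain-English question into SQL."""
    prompt = (
        "You translate questions into SQL for the schema below.\n"
        f"Schema:\n{schema}\n"
        f"Question: {question}\n"
        "Respond with a single SQL query and nothing else."
    )
    return llm_complete(prompt).strip()


# Illustrative schema for the example call below.
SCHEMA = """
orders(order_id, customer_id, total, created_at)
customers(customer_id, name, region)
"""

# question_to_sql("Total sales by region last month?", SCHEMA, llm_complete)
# might come back as something like:
#   SELECT c.region, SUM(o.total)
#   FROM orders o JOIN customers c ON o.customer_id = c.customer_id
#   WHERE o.created_at >= date '2023-05-01' AND o.created_at < date '2023-06-01'
#   GROUP BY c.region;
```

The person asking the question never has to learn the lexicon of SQL or how the tables join; the model carries that process knowledge.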
To raise an objection: a lot of these ideas look like shortcuts. I don’t think these new abilities take away from the need for people to learn; they do, however, allow people to pick and choose which work is meaningful to do instead of accumulating process knowledge. Take the obvious and oft-discussed example of students cheating on their essays. That would be a first-order effect, but we can instead lean into these tools and challenge students to create pieces that would previously have been rare at their grade level. That is a good thing in the long run.
There will be plenty of new ideas coming into the marketplace as AI moves along. The one thing I’m particularly interested in is how this affects the way user interfaces are used and designed.
III. A few words on AI threat
I’m skeptical of the AI existential threat. Indubitably, new technologies will reshape the industrial landscape, as they always do, but to take seriously the idea that a man-made technology can go autonomous and freewheeling is beyond me. (After all, someone has to pay for these things to do meaningful work.) Some discussions point to the black-box nature of AI models; we may not understand the granular details of a specific set of model weights, but we do have a good idea of how these tools work in general.
But let’s take it seriously for a moment. If there were to be an AI existential threat, two things would first have to happen: AI software 1 would have to gain memory, and it would have to gain its own motivations and goals. What it breaks down to is the ability to solve problems and navigate obstacles in the service of its own purpose, over time.
An internal memory is not impossible, but it doesn’t seem straightforward either. LLMs today gain memory through two means: their training data, and whatever is provided in the input. There will certainly be advances here, but the specific design is not obvious. As a thought experiment: is memory retained per instance or across all instances? The more philosophical among us might wonder what infinite memory does to a sentient being, even assuming sentience. I’m not quite sure what to speculate here.
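As a rough illustration of the second kind of memory: today’s chat products typically “remember” by replaying the prior conversation back into the model’s input on every turn. The sketch below assumes a hypothetical `llm_complete` call; nothing here is specific to any real API.

```python
# A rough sketch of "memory via the input": the model itself is stateless,
# so each turn re-sends the accumulated transcript as part of the prompt.
# `llm_complete` is a hypothetical completion call, not a real API.

class Conversation:
    def __init__(self, llm_complete, system_prompt="You are a helpful assistant."):
        self.llm_complete = llm_complete
        self.transcript = [f"System: {system_prompt}"]

    def say(self, user_message: str) -> str:
        self.transcript.append(f"User: {user_message}")
        # The only "memory" the model sees is whatever fits in this string;
        # once the context window is exceeded, older turns have to be
        # dropped or summarized.
        prompt = "\n".join(self.transcript) + "\nAssistant:"
        reply = self.llm_complete(prompt).strip()
        self.transcript.append(f"Assistant: {reply}")
        return reply
```

Whatever a true internal memory would look like, it is presumably something more than this stateless replaying of context.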
As for its own goals, the most salient question I have is: on what basis? But let’s speculate a little. Let’s say it is an emergent behavior of the sum of all the data used to train it; after all, AI has to derive from data. This argument resembles the infinite monkey theorem: given enough computation units, there will emerge a behavior that is sentient and has goals. Even if this is true, why would this particular AI be so hell-bent on achieving its goals?
Or perhaps, as the paperclip trope suggests, AI gains its goals from the tasks it is given and prices their completion above all else. It certainly is a fun thought experiment, but it raises a lot more questions about how the situation arises. It would be similarly unwise to give full autonomy and control to my Roomba. There are far more efficient ways to destroy the world if a person decides to.
We have biological bodies. To anthropomorphize AI feels like a desire to suppress our own mammalian needs and instincts. There are certainly ways that AI can be problematic, starting with the real dangers of deploying AI that codifies existing biases and prejudices. Let’s focus on these first.
Certainly, as we work on and use AI more, the mystique surrounding it will dissipate and its inner workings will be laid bare. Then we can have a more meaningful discussion on the real dangers of AI.
-
Let’s set aside assumptions about how it would be implemented. I’m not certain that neural networks as they exist today are able to achieve sentience, but it may become possible with future development. ↩