That’s only true on a very basic level, I understand that Turings maths is complex and unintuitive even more so than calculus but it’s a very established fact that relatively simple mathematical operations can have emergent properties when they interact to have far more complexity than initially expected.
The same way the giraffe gets its spots the same way all the hardware of our brain is built, a strand of code is converted into physical structures that interact and result in more complex behaviours - the actual reality is just math, and that math is almost entirely just probability when you get down to it. We’re all just next word guessing machines.
We don’t guess words like a Markov chain instead use a rather complex token system in our brain which then gets converted to words, LLMs do this too - that’s how they can learn about a subject in one language then explain it in another.
Calling an LLM predictive text is a fundamental misunderstanding of reality, it’s somewhat true on a technical level but only when you understand that predicting the next word can be a hugely complex operation which is the fundamental math behind all human thought also.
Plus they’re not really just predicting one word ahead anymore, they do structured generation much like how image generators do - first they get the higher level principles to a valid state then propagate down into structure and form before making word and grammar choices. You can manually change values in the different layers and see the output change, exploring the latent space like this makes it clear that it’s not simply guessing the next word but guessing the next word which will best fit into a required structure to express a desired point - I don’t know how other people are coming up with sentences but that feels a lot like what I do
That’s only true on a very basic level, I understand that Turings maths is complex and unintuitive even more so than calculus but it’s a very established fact that relatively simple mathematical operations can have emergent properties when they interact to have far more complexity than initially expected.
The same way the giraffe gets its spots the same way all the hardware of our brain is built, a strand of code is converted into physical structures that interact and result in more complex behaviours - the actual reality is just math, and that math is almost entirely just probability when you get down to it. We’re all just next word guessing machines.
We don’t guess words like a Markov chain instead use a rather complex token system in our brain which then gets converted to words, LLMs do this too - that’s how they can learn about a subject in one language then explain it in another.
Calling an LLM predictive text is a fundamental misunderstanding of reality, it’s somewhat true on a technical level but only when you understand that predicting the next word can be a hugely complex operation which is the fundamental math behind all human thought also.
Plus they’re not really just predicting one word ahead anymore, they do structured generation much like how image generators do - first they get the higher level principles to a valid state then propagate down into structure and form before making word and grammar choices. You can manually change values in the different layers and see the output change, exploring the latent space like this makes it clear that it’s not simply guessing the next word but guessing the next word which will best fit into a required structure to express a desired point - I don’t know how other people are coming up with sentences but that feels a lot like what I do