I’ll be brief, omit needless words.
Intelligence is prediction is compression because
Compression is finding a code that makes the data shorter
And codeword lengths are probabilities
So codes are probability distributions
But probability distributions are prediction strategies.
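The step from codeword lengths to probabilities can be made concrete. The following is a minimal sketch, assuming a toy four-symbol alphabet with dyadic probabilities; the alphabet and the function names are hypothetical illustrations, not anything from the comment. It assigns each symbol the Shannon length ceil(-log2 p), then reads the code back as a distribution via q = 2^(-length).

```python
import math

def shannon_code_lengths(probs):
    """Assign each symbol a codeword length of ceil(-log2 p) bits."""
    return {s: math.ceil(-math.log2(p)) for s, p in probs.items() if p > 0}

def implied_probabilities(lengths):
    """Read a code back as a (sub)distribution: q(s) = 2 ** -length(s)."""
    return {s: 2.0 ** -n for s, n in lengths.items()}

# Hypothetical toy source; dyadic probabilities keep the arithmetic exact.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

lengths = shannon_code_lengths(probs)
implied = implied_probabilities(lengths)

# Kraft sum: at most 1 for any prefix code, exactly 1 when no code space is wasted.
kraft_sum = sum(implied.values())

# Average bits per symbol under the true distribution, compared with the entropy.
expected_length = sum(probs[s] * lengths[s] for s in lengths)
entropy = -sum(p * math.log2(p) for p in probs.values())

print(lengths)
print(kraft_sum, expected_length, entropy)
```

For non-dyadic probabilities the rounding in ceil makes the implied values sum to less than 1, and the expected length exceeds the entropy by less than one bit, so the correspondence is approximate rather than exact in that case.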
And prediction strategies are almost optimization procedures?
Did you really need to say that you’d be brief? Wasn’t it enough to say that you’d omit needless words? :)
But then he’d lose the Strunk and White allusion.
I approve the haikuesque format. Do you agree that the "bijection" Intelligence → Prediction preserves more structure than Prediction → Compression?