🤓 the missing dimensions of intelligence

making machines that do a better job of matching what intelligence means for humans

It's easy to think that AIs are all knowing.

After all, whether you ask it to tell you the history of Mesopotamia, rewrite an essay or explain the second law of thermodynamics, it seems to be able to do it all.

It's only when you spend a lot of time getting it to do things that deviate from a) text generation, b) logical, verifiable, first-principles reasoning and c) xxx that you realize "intelligence" here is largely being used to mean intellectual intelligence.

The most obvious type of missing intelligence is physical or embodied intelligence, which is why we've seen an explosion of efforts focused on creating, gathering and training with the specific kind of data that will produce capable models for physical intelligence.

But the other kinds of intelligence that become obvious when you're immersed in the work of the home are relational, emotional, and intuitive intelligence.

The kinds that speak a different sort of wisdom.

Why is this important?

Well, let's take a task like "help me plan meals for next week".

There are currently four gaps as I see it:

  1. relevant benchmarks
  2. relevant data
  3. relevant alignment
  4. relevant reasoning

The first means we don't really have benchmarks built for unpaid work. So even though AIs are more than willing to pontificate on your meal planning, we have no systematic way to know if they're any good, or to make them better.

The second has to do with the data used to train for this workflow. LLMs are largely trained on data from the public internet, and that data isn't sufficiently representative: the public internet over-represents certain groups of writers.

This is particularly problematic when trying to use models for domains where these groups are not the experts in the work.

Worse is the belief that the model's "common sense" accurately reflects that of humans, when it's more accurate to say it's the common sense of people who write things on the internet.

Which brings us to the third gap: the need for reward functions and reinforcement processes that map more closely to the complexity of care. Care often isn't about convenience or quick helpfulness, and often there isn't one clear answer. There's a pluralism that needs not only to be accounted for but allowed.
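To make the pluralism point concrete, here's a minimal sketch (all names and constraints are hypothetical) of what a pluralistic reward could look like: instead of scoring a candidate plan by its distance from a single "gold" answer, score it by how many of the household's actual constraints it satisfies, so that very different plans can all be fully right.

```python
# Hypothetical sketch: a pluralistic reward for meal planning.
# Instead of comparing against one reference plan, score any candidate
# that satisfies the household's constraints. All names are illustrative.

def pluralistic_reward(plan, constraints):
    """Fraction of household constraints the plan satisfies."""
    met = sum(1 for check in constraints if check(plan))
    return met / len(constraints)

constraints = [
    lambda p: len(p) == 7,                                  # covers the week
    lambda p: "salmon" not in p,                            # current kid veto
    lambda p: sum(1 for m in p if m == "leftovers") <= 2,   # not too many repeats
]

# Two very different plans can both be "right".
plan_a = ["pasta", "tacos", "stir fry", "soup", "leftovers", "pizza", "curry"]
plan_b = ["curry", "pizza", "leftovers", "omelette", "tacos", "chili", "ramen"]

print(pluralistic_reward(plan_a, constraints))  # 1.0
print(pluralistic_reward(plan_b, constraints))  # 1.0
```

The design choice is the point: the reward admits a whole set of acceptable answers rather than one, which is closer to how care actually gets judged.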

And the last is the gap in reasoning. The invisible load of running a family is called that because it's an iceberg problem. There's a small execution part that is seen (e.g. buy groceries) but a lot of tasks hidden under the surface that let you get to the right answer (e.g. knowing when soccer is this week, who's traveling, current kid food preferences, and energy level for cooking).
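The iceberg can be sketched in a few lines of Python (everything here is hypothetical and purely illustrative): the visible request is a one-liner, but the answer depends entirely on hidden household state the model is never given.

```python
# Hypothetical sketch of the "iceberg": the visible task is trivial,
# but the reasoning lives in hidden household context.

from dataclasses import dataclass, field

@dataclass
class HouseholdContext:
    # The under-the-surface state that shapes the "right" answer.
    busy_nights: list = field(default_factory=list)   # e.g. soccer practice
    travelers: list = field(default_factory=list)     # who's away this week
    kid_vetoes: list = field(default_factory=list)    # current food dislikes
    energy: str = "low"                               # appetite for cooking

def plan_dinner(night: str, ctx: HouseholdContext) -> str:
    """The visible task ('plan dinner') is one line; the decision
    hinges on the hidden context."""
    if night in ctx.busy_nights or ctx.energy == "low":
        return "something quick (sheet-pan or leftovers)"
    return "a proper cooked meal"

ctx = HouseholdContext(busy_nights=["Tuesday"], energy="medium")
print(plan_dinner("Tuesday", ctx))    # quick: it's a soccer night
print(plan_dinner("Wednesday", ctx))  # a proper cooked meal
```

An AI that only sees the tip of the iceberg ("help me plan meals") can't distinguish Tuesday from Wednesday here; the human running the household does it without thinking.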

After spending thousands of hours coaxing LLMs to match the shape of this work, it's clear to me that it will be possible, but it's far from there yet.

To make it possible, we need to put effort into solving these gaps.