Episode 21 - Probability, Belief, and The Truth

This episode was recorded during my vacation in Tybee Beach, Georgia, and it worked out great! I explore the nature of probability, and my approach to thinking about facts and belief. I talk about problem solving a lot; here I go into the fundamentals and work to justify how I think about it.

What problem does probability solve? Should we always think in terms of probability? What's the difference between an artificial language (code, math) and natural language? How many water molecules are in the Atlantic Ocean? Sometimes we have vague questions and make vague statements - should that be ok?

This episode is a little different, but I hope this is the beginning of a back-and-forth conversation with the Local Maximum listenership to try to get to the bottom of these deep issues.

Previous Episode

This episode reaches all the way back to Episode 0 when I first went over Bayesian thinking and my justifications for it.

Mentioned Links

My interview with Mariya Yao on Foursquare Attribution

Script
The following is the script and acts as the show’s transcript.

We need to step back and talk about why we have a mental model of the world to begin with. What is the difference between a fact... and a belief?

Note that the question is usually posed as fact vs. opinion. I’m not doing that - I think that one’s pretty easy: an opinion is a personal fact. So if I say, “The movie was pretty good” that’s not an objective statement about the movie, but “Max likes the movie” is perfectly objective fact.

I’m not a philosopher - so if you are one - this whole episode could sound a little bit out of place to you - maybe a lot of this pertains to some arguments that have been going back millennia. I don’t know anything about that! Would like to be filled in though. But this is just how I think about truth and belief and probability, and how it has led me to solve actual problems in the real world, and how it’s led me to study specific fields (like Bayesian Statistics). So maybe at some point I’ll have an actual philosopher on to tell me that this is a branch of whatever - but for now I’m probably mangling your field. SORRY AGAIN. Not sorry really, though.

I’m sure my current thinking on this comes from a lot of places. I mainly talk about thinking through these things when trying to tackle real problems: like how do I find the best restaurant in a town, or how do I tell if an ad is working, or where should I invest my time and money? All very important - but my thinking is also heavily influenced by things I’ve read, things I’ve heard, and my educational background - sometimes in ways I might not be aware of.

Anyway - what is a belief? A belief is a statement of your mental model of the world. You can believe that a fact is true. Every person could have a different belief about whether a certain fact is true or not, or no belief at all. They might not even be aware enough to ask the question, or the question might not be relevant to them. There are things you don’t know that you don’t know.

So the facts are the things that correspond to reality. If you “know something for a fact” you’re really saying that not only do you believe it, but you have very good reason not to doubt it at all; you’ve decided that you’ll act like it’s 100%. Now I’ve been proven wrong after saying I know something for a fact - so even when you SAY that, it doesn’t always work out - but now when I hear that phrase, to me it just means that HEY - I’ve seen enough evidence for this, I’m taking it as true.

I made this out to be a distinctly human exercise, but humans aren’t the only entities that have beliefs - that have mental maps of the world. Specifically, machines can too - databases for example are a collection of facts. So are blockchains - THE NEWEST EXAMPLE MAYBE! And this property isn’t even that special, it’s not AI. It’s information - it’s “dumb”. A document - just written on paper - is also a collection of facts. You can say that the document has a certain set of beliefs about the world - that’s a list of facts purported to be true by the document - whether it be an official record, a ledger, a personal calendar, a book, a map - anything like that.

So if everyone and everyTHING could potentially have a different belief about the world or about a question - is there such a thing as a true fact? Well - there are people who deny that there’s an objective reality, but I’m not one of them. There are true facts, and your beliefs may or may not line up with those true facts. In fact, your beliefs may not be answering the right questions about the universe at all. For example, if you ask about the coordinates of an electron - well from quantum theory we know that electrons don’t have exact coordinates in space - they’re described by a quantum probability wave. And then you have to learn about that. Of course, if you’re just talking in approximations then that question can work just fine.

A good example I have is - how many water molecules are in the Atlantic Ocean? It’s a perfectly good question if you want an estimate - cubic kilometers of water multiplied by the number of molecules per cubic kilometer equals the answer. But we all know there’s no exact natural number down to the ones digit that corresponds to the answer. It keeps changing - there’s a never-ending rule-making and debate over whether particular molecules are in the ocean - like what if they’re on the beach or something? And even then - quantum effects - it gets murky.
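The estimate described above (volume times molecules per unit volume) can be sketched in a few lines. This is only a back-of-the-envelope illustration: the Atlantic's volume is taken as roughly 310 million cubic kilometers and the water is treated as pure with a density of 1 gram per cubic centimeter, both of which are assumptions not stated in the episode. The function names are made up for the example.

```python
# Back-of-the-envelope estimate; the inputs marked "assumed" are rough guesses.
AVOGADRO = 6.02214076e23       # molecules per mole (exact by SI definition)
MOLAR_MASS_WATER_G = 18.015    # grams per mole of H2O
ATLANTIC_VOLUME_KM3 = 3.1e8    # assumed: roughly 310 million cubic km
GRAMS_PER_KM3 = 1e15           # assumed: 1 g/cm^3 (ignores salt), 1 km^3 = 1e15 cm^3


def molecules_per_km3() -> float:
    """Water molecules in one cubic kilometer under the assumptions above."""
    moles = GRAMS_PER_KM3 / MOLAR_MASS_WATER_G
    return moles * AVOGADRO


def atlantic_molecules(volume_km3: float = ATLANTIC_VOLUME_KM3) -> float:
    """Volume times molecules per unit volume, as in the text."""
    return volume_km3 * molecules_per_km3()


if __name__ == "__main__":
    # Only the order of magnitude is meaningful here.
    print(f"about {atlantic_molecules():.0e} molecules")
```

The answer comes out on the order of 10^46. Changing the assumed volume or density by a few percent moves the result by an astronomically large count of molecules, which is the point: the question is useful as an estimate and has no meaningful answer down to the ones digit.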

I even run into this problem when I think about advertising and marketing. Let’s say I run a marketing campaign - let’s say I want to promote this podcast! I put an ad out, and let’s say I measure 10% lift: people who heard the ad are 10% more likely to subscribe than they otherwise would have been. Well… what does that mean? What if I showed the ad to one person? Either they subscribed or they didn’t. Would it make sense to talk about lift? The aggregate number makes sense, but maybe not on the individual level.
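Here is a minimal sketch of the lift arithmetic, using one common definition (exposed rate divided by baseline rate, minus one). The counts are hypothetical and the function names are invented for illustration; this is not a description of how any particular attribution product computes lift.

```python
def conversion_rate(subscribers: int, audience: int) -> float:
    """Fraction of an audience that subscribed."""
    if audience <= 0:
        raise ValueError("audience must be positive")
    return subscribers / audience


def lift(exposed_rate: float, baseline_rate: float) -> float:
    """Relative increase of the exposed rate over the baseline rate."""
    if baseline_rate <= 0:
        raise ValueError("baseline rate must be positive")
    return exposed_rate / baseline_rate - 1.0


# Hypothetical aggregate numbers: 10,000 people in each group.
exposed = conversion_rate(110, 10_000)    # heard the ad
baseline = conversion_rate(100, 10_000)   # did not hear the ad
aggregate_lift = lift(exposed, baseline)  # close to 0.10, i.e. 10% lift

# With an "audience" of one person the rate can only be 0.0 or 1.0,
# so the same formula swings between -100% and a huge positive number.
single_yes = lift(conversion_rate(1, 1), baseline)
single_no = lift(conversion_rate(0, 1), baseline)

if __name__ == "__main__":
    print(round(aggregate_lift, 3), round(single_yes, 1), round(single_no, 1))
```

The formula still returns a number for one person, but that number says nothing about the ad. Lift is a statement about the group.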

So the lesson is… Sometimes your questions about the world are not technically allowed but they are important simplifications of aspects of the world. In fact, I think most questions are like that - and I don’t think these questions are somehow not allowed or should make people go crazy. In other words, I don’t like too much nitpicking! Sort of makes it hard to get the job done.

So coming back to it… If you have a set of beliefs - whether you’re a person, or a machine, or a book, or a blockchain for that matter - you have to answer some questions:

  1. How do I state my beliefs - or represent my beliefs?

  2. How do I observe new facts and change my beliefs?

  3. How do I know if what I believe is true? Is this working for me?

Now it’s not entirely clear how to do all these from a human perspective. There’s a tendency to want a universal language to state our beliefs - like a universal formalism. And why not? What if we had a database that can contain any piece of knowledge. What if we had a type system that can express anything? What if we had a formal mathematics that can state every problem?

I’m really interested in these formal systems. I was back in high school and college - I am now. These are artificial languages really, as opposed to natural language like English.

But my thinking on where they fit in has changed. So two things - first, the idea that there can be a universal artificial language to state and answer every question has actually been proven impossible. There’s something called the Gödel incompleteness theorem in mathematics, and it has analogs in computer science, the halting problem for one.

But I used to think that was the main problem and it’s not.

The second thing is that even if we could do it - or come close to doing it - it’s actually a really clunky way to solve problems.

But look, this is something that I love to look into - formal systems, type systems, programming languages, functional programming. And I’m sure we’ll do more shows on all of this. And maybe I’ll do some projects in this area in the future.

  • But do you really want to use this kind of language for stating every fact, and every belief? Or for just speaking for that matter?

  • Let me put this in a way that makes the answer clearer: Is this what you want to teach children when they’re just learning to speak?

So I think about these questions, and it’s clear that there are benefits of these formal systems, but they’re actually at the top of the chain - not the bottom. Like, it’s not like you define this formal system and everything follows. It’s that you have all your beliefs and you top it off with a formal language to help you organize them AT THE END.

See, the Gödel incompleteness theorem says it’s impossible to capture everything - but it doesn’t say that you can’t build something that’s really really really good. So - not a bad area for research, I have some books on type theory on my shelf; another on the continuum hypothesis. But I think that these quote “artificial languages” are built on something else, “natural languages” like English - human languages that are much more vague but flexible.

[ PAUSE ]

Alright - so that was a detour. Let’s get back to stating and representing our beliefs. If we’re not always giving some equation or something... then what is it?

Well, we have our brains and we have our language. Those develop in several ways. First we have evolution. We’ve evolved senses to observe new facts, and our brains and bodies to represent those facts physically. And how do we know they are true from an evolutionary perspective? Well if you have the wrong map of the world, you die, and if you’re right you have a greater chance of survival and reproduction or at least helping your species reproduce.

It’s not perfect. Sometimes you’re right and you die anyway. Sometimes the quote “wrong” ones live. I also have a way different view of evolution and natural selection than I may have had many years ago. People jump to conclusions very quickly: “oh we’re like this because of selection for survival and reproduction”. But the world is actually much more complicated than that. There’s group dynamics, there’s the way people interact with every other person and the environment. So it’s hard to predict how things will develop. But in general, there’s got to be some stochastic learning going on here - and by stochastic learning I mean a process that tends to make more good changes than bad changes over time.

This whole natural development thing IS a rabbit hole, but I think this process has given us a brain and some tools that are in line with true facts. Even starting with the lizard brain - you get afraid of stuff, could be irrational at times, but also helps discover real threats.

Then building on that, our full brain, there’s natural language like English, which is very flexible for stating and memorizing facts. And once you get facts, you can pass them down generation by generation. And you can tell stories and ideas - and people will tend to hold on to the ones that speak to us most as human beings - that are relevant to the human experience. And that is a truth.

And we can put those into writing and we have religious and mythological stories, but also informational writings, how-to engineering and science. And then you can actually catch contradictions and compare notes. Put some formalism into your observations. You can record history.

So now you’re not evolving facts of - look I had this belief and survived - because that’s a pretty clunky way to develop beliefs - but you’re actually starting to make observations, compare them with other observations - creating hypotheses - and weeding out contradictions.

Let’s talk about contradictions for a second. I used to believe “only bad thinkers have contradictory beliefs”. Sorry folks - we all have ‘em.

I think that only short and flat documents are devoid of contradictions. The Bitcoin blockchain doesn’t have contradictions by design because it’s a list of transactions. But I don’t think any humans, or most complex databases and books, are devoid of contradictions in their set of facts.

Okay - then you have the enlightenment, and we get the scientific method - that’s a pretty good way of agreeing on facts.

I think the difference between humans and machines right now is that for humans, our methods for forming our beliefs are unimaginably complicated. Like I just pointed out - there’s evolution, there’s language, there’s logic, there’s the scientific method, there’s theology. I can’t tell someone exactly where to get their beliefs from. For example, if I say “only get your beliefs from the scientific method otherwise you’re ANTI-SCIENCE” that falls apart pretty quickly. What would be my justification for using the scientific method? Does that come from the scientific method? Uh no - it comes from logic - and that comes from language - and that comes from evolution. And it’s incredibly complicated.

But I think the broader point is that we do have ways of getting facts, forming beliefs, and making sure that those beliefs are in accordance with reality. Sometimes we don’t do a good job - but hey that’s life.

And because this is so complex, we have a way to gate off our maps of the world - put them in different sections of our mind - put them in different documents. And hopefully each document is simpler in that we can wrap our head around it.

So, let’s talk about a physical map - let’s say it’s a map of Europe. We can see where the land is, we can see where the cities are. You can compare these with aerial photos, you can go to these places and you can check the facts. Doesn’t tell us everything.

So a few points to make

  • Remember - You don’t need to form a belief about every question

  • Remember -- There are some questions you’re not able to ask

  • Even if you can ask a question… There are some potential answers you’re not able to formulate.

Ok, so here’s where probability comes in. Probability is a formalization of belief, and it really helps us hedge against multiple potential realities.

So let’s say you’re able to formulate a question, and you have a space of several possible answers to this question. So far, you have 2 options:

  1. Not have a belief on it

  2. Decide on one of the possibilities as the correct answer

#1 is pretty safe, and if the question is unimportant to your life go with #1 because it’s the cheapest in terms of storage and brain power. You’re born not having an opinion on it!!

#2 is the second cheapest. So just pick one right answer and move on. Is the Earth flat, or roughly spherical? I’m not going to entertain the notion of a flat earth right now, it’s not worth my time. Even though from a Bayesian perspective, you never have a 0% belief in something, just forget it. Pick the right answer and move on, and don’t complicate your life. If you know something for a fact, and then that fact turns out to be wrong, you can deal with it then. It’s not a Bayesian method, it’s a logical method of weeding out contradictions.

#3 is to think in terms of probability. So, now that I get to it, it’s the best model - better than #1 and #2 - but it’s also the most expensive because you need to hold several beliefs in your mind at once.

It’s also more expensive for machines and databases. So, let’s say I wanted to know which emails are spam. Usually, I’m storing a probability for spam on each email. That’s more storage than just storing a 1 or a 0 on whether it’s spam or not.
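To make the storage comparison concrete, here is a small sketch. It assumes the probability is stored as a 64-bit float and the hard label as a one-byte flag; real systems may pack either one differently. The scores, the 0.5 threshold, and the `to_flag` helper are hypothetical.

```python
import struct

# Standard (platform-independent) sizes from the struct module.
PROB_BYTES = struct.calcsize("<d")   # 64-bit float
FLAG_BYTES = struct.calcsize("<?")   # boolean flag


def to_flag(p_spam: float, threshold: float = 0.5) -> bool:
    """Collapse a probability into a hard spam / not-spam decision."""
    return p_spam >= threshold


# Hypothetical per-email spam probabilities.
scores = [0.97, 0.51, 0.03]
flags = [to_flag(p) for p in scores]

if __name__ == "__main__":
    print(PROB_BYTES, FLAG_BYTES, flags)
```

The conversion only goes one way. A probability can always be collapsed into a flag, but once only the flag is stored, the 0.97 and the 0.51 look identical. That lost distinction is what the extra storage buys.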

Now this gets us into the answer of what is probability now. Because a probability can be thought of as a mixed belief over several potential facts. That’s the subjective nature of probability.

And that’s why I don’t think - for the most part - that probability is objective. I think that’s true only in certain circumstances. We can talk about quantum physics, or even think of something like a coin flip as maybe being objectively 50-50. But the fact is that the coin will land on either one side or the other - and I would argue that your belief should be 50-50 beforehand because of what we know about coins and all that. You might disagree if you have a completely different mental map of the universe, but I’d say you’re crazy.

So it’s still actually subjective, but it’s something where all quote “right-thinking” people should come to the same conclusion.

Now also note - believing probabilistically is not the same as saying “the universe isn’t black and white - it’s shades of gray”. You still believe the coin flip will be heads or tails, but you’re not sure which one yet. That’s different from saying “I think it’ll be like heads and tails at the same time, man”. You could believe in shades of gray and also think probabilistically, but they’re not the same by any stretch.

So finally - here’s a major benefit to storing your belief as a probability. It comes around to the question of - how do I update my beliefs when I get new information?

If you have to choose a single hypothesis as true every time - then how are you going to answer the question of:

  1. How am I doing?

  2. How can I do better?

I guess you can say maybe “the number of facts I get right” like it’s a fair voting system. One fact one vote, and if I come upon a contradicting fact - that’s a count against me. But it’s not very formal, and it’s not true that all facts are equally important to get right.

For probabilities, we actually have a very good and formal answer for this, which is formalized by Bayesian statistics. The answer is:

I just made an observation. How likely was the observation given the probabilities on beliefs I have in my head? That’s the likelihood. And Bayes Rule also gives you a formal way to update your beliefs based on your observations.
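As a minimal sketch of those two steps, take a small set of competing hypotheses. The probability of the observation under the current beliefs is the prior-weighted sum of each hypothesis's likelihood, and Bayes Rule rescales each prior by its likelihood to get the updated belief. The coin hypotheses and the numbers below are hypothetical, chosen only to illustrate the mechanics.

```python
from __future__ import annotations


def bayes_update(
    prior: dict[str, float], likelihood: dict[str, float]
) -> tuple[dict[str, float], float]:
    """Return (posterior, evidence) for one observation.

    prior[h]      -- belief in hypothesis h before the observation
    likelihood[h] -- probability of the observation if h is true
    evidence      -- how likely the observation was under the prior beliefs
    """
    evidence = sum(prior[h] * likelihood[h] for h in prior)
    if evidence == 0:
        raise ValueError("observation is impossible under every hypothesis")
    posterior = {h: prior[h] * likelihood[h] / evidence for h in prior}
    return posterior, evidence


# Hypothetical beliefs about a coin: probably fair, maybe biased toward heads.
prior = {"fair": 0.9, "biased": 0.1}
p_heads = {"fair": 0.5, "biased": 0.8}   # chance of heads under each hypothesis

# Observe one flip that lands heads.
posterior, evidence = bayes_update(prior, p_heads)

if __name__ == "__main__":
    print(round(evidence, 2))
    print({h: round(p, 3) for h, p in posterior.items()})
```

The evidence value answers "how am I doing?": beliefs that assign higher probability to what actually happened are doing better. The posterior answers "how can I do better?": feed it back in as the prior for the next observation.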

So, if you have a self-contained question, that’s somewhat complicated, but formalized, Bayesian thinking will get you to the right answers, and will lead you to the right conclusions better than anything else we have.

Now - like the scientific method - it’s not a panacea. We still need to ask the right questions - we still need to search for the possible solutions. There are some solutions that maybe we didn’t conceive of and it’s not going to be in the model. That’s a random search, that’s creativity. That’s evolution - and trial and error.

But we can build on that, and in certain areas of life - really big ones - we have these tools of subjective probability and Bayesian statistics to answer some of the BIG questions.

Episode 22 - Bayes Rulez, Death to P-Hacking

Episode 20 - Imperfect Bots, Marketplaces, and the AI Economy