Episode 267 - Bernoulli's Fallacy with Aubrey Clayton

Could the crisis in modern science be a result of fallacious probabilistic thinking? Mathematical researcher and writer Aubrey Clayton joins the Local Maximum to discuss his book Bernoulli’s Fallacy: Statistical Illogic and the Crisis of Modern Science.

Aubrey Clayton

Aubrey Clayton: Website | LinkedIn | Twitter

Bernoulli’s Fallacy Book

Links

Wikipedia: Jacob Bernoulli, founder of Probability
Seeing Theory: Bayesian Inference
Columbia University Press: Bernoulli’s Fallacy
Amazon: Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science
The Local Maximum Archives

Related Episodes

Episode 0 - The great beginning, and how to Update your Beliefs
Episode 1 - Bayesian Analysis of the Hawaii Missile Scare
Episode 78 - Bayesian Thinking: Flat Earth Priors, The Mises Brothers, and Curb Your Analogy
Episode 105 - Joys of Bayes, Belief Networks, and Aerodynamics with Sophie Carr
Episode 119 - Advocating for Bayesian Inference with Brian Blais
Episode 207 - Max Returns with Priors

Transcript

Max Sklar: You're listening to the Local Maximum episode 267.

Narrator: Time to expand your perspective. Welcome to the Local Maximum. Now here's your host, Max Sklar.

Max Sklar: Welcome everyone, welcome! You have reached another Local Maximum. 

Are you on the Local Maximum Locals, by any chance? That's maximum.locals.com, our private message board with all of our best listeners and best supporters there to share ideas and have a little fun. I was just hitting the slopes this weekend up in Vermont with Aaron and his family. Come to think of it, I'm really sore from all that skiing. It was really, really, really freezing, but Aaron and I did a live stream for the Locals from the top of the ski lifts, so that was a lot of fun. See what kind of great content you could be missing out on if you don't join the Locals, so go to maximum.locals.com.

Today, on the Local Maximum, we're returning to an old favorite topic on the program, which is Bayesian inference. Specifically, our guest today is going to talk about how Bayesian thinking sharpens our scientific method and prevents us from believing in fallacies, some of which have been, dare I say, quite catastrophic. In fact, my next guest says that there is a logical flaw in the statistical methods used across experimental science, and that it has been the cause of all sorts of negative effects in areas like medicine, law, and public policy.

Today's guest is an applied mathematical researcher, lecturer, and writer. His most recent book is Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science. Aubrey Clayton, you've reached the Local Maximum. Welcome to the show.

Aubrey Clayton: Thank you.

Max: Today, we're talking about your book, Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science. And of course, I just want to talk about Bayesian inference and Bernoulli's fallacy more generally. 

But before we get into it, I want to say to my audience, which has been listening for a long time: it's amazing how many different angles we can approach this idea of Bayesian inference from, because I don't think we've talked about this fallacy directly. But before we get into what the fallacy is, I feel like you've come to the conclusion that there's a big problem or issue in the way we do science, or in the way we make inferences. So how did you come to that conclusion, and how did that lead you to writing this book or doing the types of research or learning that you do?

Aubrey: I think that the idea that there is a big problem in science, and particularly in the use of statistics in science, is not something that I came up with. That's been apparent to various people for years, decades even. But the biggest news around that has had to do with the replication crisis. That's really been on people's minds for about the last 10 or 12 years.

That's just that, in many different scientific domains, it seems that there's a problem: scientific results are not replicable. If experiments are run again, the original results or claims or effects, whatever was shown in some published paper, don't seem to manifest the second time around.

I think that there are various ways of looking at that in terms of scientific practice. But the one that I really focused on in my work is the fact that it seems to be that statistics and statistical methods underlie all of these scientific results. You can kind of draw a straight line from the growth of statistical methods to the replication crisis if you get into the criticism of those methods deeply enough. 

That's kind of what brought it to my attention, but I had been thinking about these things for many years. I think a lot of people who are involved in this field have been thinking about this and feeling frustrated about the way statistical methods are what they are, and it happened to coincide with a lot of attention and press focused on the replication crisis. That kind of connected the dots for me. It made a sort of story that I thought I could tell.

Max: So these statistical methods that are used in science. I got kind of used to a Bayesian atmosphere when I was a machine learning engineer, when you're trying to make inferences for companies. Is that not the way it works in modern science? What methods are they using?

Aubrey: Not at all. It's actually very interesting, I think, that people who come to statistics from other backgrounds, say data science and machine learning, or various other fields, are often surprised by what statistical methods are actually sort of dominant in the worlds of science. 

To put it succinctly, those methods are not Bayesian. Bayesian methods are not the standard in almost all of statistics. The methods that are kind of the common language of statistical inference and testing are frequentist methods: things like significance testing, with null hypothesis significance testing being the chief one.

Almost every scientific paper that gets published has to pass some kind of test of the form: does the data support there being a statistically significant effect or association, or whatever it is? And Bayesian inference is just not part of the template.

Max: Are there good reasons for using those methods that they do?

Aubrey: There are reasons to use those methods. I wouldn't call them good reasons but I think that if you go through the history of the development of those methods, which is very much what I was trying to do in the book, you can kind of understand how they came to be what they are. 

The kind of common refrain of those frequentist methods is that they are somehow objective, or mechanical. That they remove the influence of the experimenter, or the observer, the author, from interpreting the results. That you just kind of take your data, you pass them through some process, throw them into the hopper, and out comes some result, and it gives you this judgment: significant or not significant.

That, I guess, is an understandable kind of desire to have. But, unfortunately, that's kind of what's led us into the crisis that we're in today, among other things. There is no way for those methods to distinguish between hypotheses that should require much stronger evidence in support than we actually demand, and ones that are in agreement with a previously established theory or understanding of the world. The fact that that distinction is not part of the process is a big part of the reason why the replication crisis is as widespread as it is.

Max: Is this mostly about the objection to having some kind of a prior on your data? And is that the primary objection that we're picking priors, and there's no particular prior to pick so you can't be objective? Is that really all it is?

Aubrey: That is, I think, a pretty good description of the problem. I think that the objection that people have to Bayesian methods, generally, is where did the priors come from. 

Max: We've talked about that a lot but I want to hear your take on it. Where do the priors come from? Right.

Aubrey: That's the criticism and that's the big question, I think. It's understandable that that's where people would focus their attention. So again, coming at it from the point of view of wanting to be objective and not wanting to allow for some undue influence on the part of the experimenter or the person interpreting the results, you might say, well, that prior feels very subjective. It feels personal. It comes from the biases of the person doing the experiment.

I think that the real answer to the question of where priors come from requires us to revisit and re-understand what we mean by probability more generally. So these people who want there to be no priors and want to do statistics as though it were a kind of mechanical method without any kind of bias, they want probability to mean something measurable and objective. So they want probability to be frequency of occurrence. How often something happens, that's the probability, and that definition doesn't allow for priors.

Now the problem with that is that it's just too limited a notion of probability to be useful in these kinds of arenas. So if you want probability to mean something like degree of confidence or degree of belief, then you need to think through where prior probabilities come from. And the answer, basically, is that it's all about information. It's about the information that you have as an experimenter, or as an author or reader, about the world and the context of that experiment and what you think is true.

Max: Is there an argument… Maybe I'm not quite making this up on the spot, but I'm trying to word it on the spot. The fact that you can choose several different priors and under certain circumstances, the same data will lead to very similar conclusions. Does that calm fears at all or not?

Aubrey: Well, I think that it does. In many cases, I think that takes some of the pressure off the priors. It is often the case that with a sufficient amount of data, it doesn't matter too much where you start from in terms of your priors, because the data is going to be more or less conclusive one way or another.

That is actually, oftentimes, in a weird way, a kind of defense that frequentist statisticians use for not including the priors, because they say, basically, it wouldn't matter. The data should speak for itself, and if we have enough data, all priors will lead to the same place.

I think the problem with that is that oftentimes, in fact, pretty much all the time, we are in some kind of in-between state where we have some data that leads us towards a conclusion, but not all the way there. The prior information, our background information about a problem, is necessary. It guides us to support one conclusion over another when we're in this in-between state where we have some data, but not an immense quantity of data.
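
To make that point concrete, here is a minimal Python sketch (all numbers are invented for illustration, not from the book): two analysts start from very different Beta priors on a coin's bias, and their posteriors disagree after a handful of flips but nearly coincide after thousands.

```python
# Invented numbers: how much the prior matters depends on how much data you have.
def beta_posterior_mean(prior_a, prior_b, heads, flips):
    # Beta(a, b) prior + binomial data -> Beta(a + heads, b + tails) posterior; return its mean.
    return (prior_a + heads) / (prior_a + prior_b + flips)

for heads, flips in [(7, 10), (7000, 10000)]:
    skeptic = beta_posterior_mean(1, 9, heads, flips)    # prior leaning toward a low bias
    optimist = beta_posterior_mean(9, 1, heads, flips)   # prior leaning toward a high bias
    print(f"n={flips}: skeptic={skeptic:.3f}, optimist={optimist:.3f}")

# n=10:    0.400 vs 0.800 -- the priors dominate the small sample.
# n=10000: ~0.699 vs ~0.700 -- the data has washed the priors out.
```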

Max: Or, I guess also, maybe alternate hypotheses that are so similar to each other that you can't really distinguish them even with a lot of data. I guess that's sort of a similar situation.

All right, let's get into the fallacy itself. What is Bernoulli's fallacy? And maybe we can get some examples of where this has led us down the wrong path. And also why was he wrong on this or was he wrong on this?

Aubrey: Let's start with the second part first. 

Max: Yeah there's a lot.

Aubrey: There's a lot there. Bernoulli is Jacob Bernoulli, who is widely regarded, and rightly so, as one of the founding fathers of probability and statistics. He wrote a very important book called Ars Conjectandi around 1700. It was his magnum opus. It had in it one of the great results of probability, called the law of large numbers, that I think people have heard about and still talk about to this day.

Basically, what was really so significant about it was that it attempted to extend the reach of probability from its origins in kind of gambling, which is really where probability comes from, to all kinds of real-world situations. So basically, anytime you have to make decisions under uncertainty, or conjecturing, as the name of the book suggests, Bernoulli thought probability was the right conceptual framework to use and that you could learn from observation to establish this probability. 

The fallacy is in the logical setup of that learning process. What Bernoulli said was that you could learn an unknown probability, or an unknown, let's say, physical parameter or constant, something that you want to gain knowledge of, through observation without any influence of your prior understanding. So basically, he didn't have Bayes' theorem as a tool, but his framework just said you could learn that thing, and the way you could learn it is by looking at probabilities of the data.

So really, the essence of Bernoulli's fallacy, as I describe it in the book, is thinking that all you need to make inferences about experimental hypotheses or propositions, or whatever it is, are probabilities of the data given those hypotheses. So: if something is true, how likely would your observation X or Y be? Bernoulli's argument falls into this trap. He tries to build up this inference from only those probabilities, when if you work through some thought experiments you can show that that's just insufficient. That's not enough information.
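
One way to see the trap, as a small Python sketch (a toy illustration, not an example taken from the book): a fair coin makes any particular sequence of flips astronomically improbable, so a tiny probability of the data under a hypothesis cannot, by itself, be a reason to reject that hypothesis.

```python
# Toy illustration: under a fair coin, *every* specific 50-flip sequence has probability 2**-50,
# so "P(this exact data | H) is tiny" cannot on its own justify rejecting H.
import random

flips = "".join(random.choice("HT") for _ in range(50))
p_sequence_if_fair = 0.5 ** 50

print(flips)
print(f"P(exactly this sequence | fair coin) = {p_sequence_if_fair:.2e}")  # ~8.9e-16
# Whatever sequence appears, its probability under the fair-coin hypothesis is ~1e-15.
# Rejecting hypotheses on that basis alone would reject the fair coin no matter what we saw.
```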

Max: Let's walk through some of them.

Aubrey: Sure, well, so one of the…

Max: Leave some for the book readers of course.

Aubrey: Yeah, yeah. One of the major categories of examples that I give in the book is what's called the prosecutor's fallacy, which is a famous misuse of probability in the world of law. So one of the stories I tell is of Sally Clark who was convicted of murdering her two children. She had two children who died in infancy, a few years apart. One of the arguments at the trial, a sort of key argument at the trial, was that if she were innocent, then the chance of having two children kind of die suddenly in infancy was very, very low. So one in 73 million was the figure. 

So you could say: under the hypothesis that she's innocent, this observation, these two children dying, has a very low probability. But what we need as jurors and people interpreting that evidence is the probability going the other way around. We need to know: given that she had two children who died, what is the probability that she is innocent of that crime?

Getting the direction of that probability the wrong way around is often called the prosecutor's fallacy. It's a well-known misunderstanding of probability in legal contexts. It happens to be exactly the same template as the statistical methods: if the hypothesis is true, then the probability of the data is low; therefore we reject the hypothesis.

Max: So the alternate hypothesis of… Was it that the alternate hypothesis, being guilty, is also low probability? Or was it also a conditional, independence-of-events type of situation as well?

Aubrey: Unfortunately, there were many mistakes in that argument simultaneously. So it would have been nice if the people had chosen to only make one mistake at a time. 

The part of the logical argument that I think is really fallacious is the fact that there is no allowance given to the prior probability of that theory. Basically, as jurors, we're weighing two very unlikely propositions. Either she is a double murderer within her own family, which is itself a very, very unlikely proposition. Or she's innocent and this very, very unlikely thing happened to her.

Bayes' theorem and Bayesian inference give you a way of kind of balancing those two or seeing which one you ultimately lean towards. But just focusing on this one probability, if hypothesis, then data, does not allow you to include this background information that the proposition you're talking about in this theory of the case is a very, very unlikely thing.
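
To see how that balancing works numerically, here is a rough Python sketch. The one-in-73-million figure is the number quoted at the trial (itself widely disputed); the base rate for double infanticide is a purely hypothetical placeholder, used only to show the structure of the comparison.

```python
# Weighing two rare explanations for the same observation (two infant deaths in one family).
p_two_natural_deaths = 1 / 73_000_000    # figure quoted at trial for two sudden natural deaths (disputed)
p_double_murder = 1 / 200_000_000        # hypothetical placeholder for the prior rate of double infanticide

# Given that two infants died, compare the two explanations directly.
p_innocent_given_deaths = p_two_natural_deaths / (p_two_natural_deaths + p_double_murder)
print(f"P(innocent | two deaths) ~ {p_innocent_given_deaths:.2f}")  # ~0.73 with these placeholders

# The point is not the exact number: the verdict hinges on *both* rare-event probabilities,
# and the "1 in 73 million" argument by itself leaves one of them out entirely.
```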

Max: Yeah, it's hard to expect a jury to have that kind of level of statistical thinking and they convicted her on just that probabilistic argument. That sounds crazy.

Aubrey: Yeah, more or less. I think it weighed very heavily in the minds of the jurors. And I agree, it was hard for them to make sense of that kind of probabilistic argument. It's one of the reasons why I think we need to do a better job in teaching probability, so that the average person is more literate in these topics. But unfortunately, that is interfered with by the fact that the main school of probability does not allow for priors and background information. I think we're kind of stymied, in the sense that even someone who took probability and statistics would not really be better equipped to understand that argument.

Max: Yeah. It's hard enough to convince, sometimes, people… If it's hard to talk to people in the scientific community, or in business, or in product development, how much chance does the average person have? It almost seems like a very difficult problem. 

Maybe we can go into Bayes' Law a little bit. I know this is always hard to do in an audio format where we don't have a whiteboard in front of us and all that. But maybe we could make an attempt to go through the equation and sort of figure out where we go wrong when we do this. When we commit, when we go down this fallacy. What terms are we missing?

Aubrey: Without stating the equation, I think the essential idea of Bayes' theorem, as I like to think about it, is that it tells you something about your updated probability assignment for a theory or a hypothesis, given some evidence. So you've made some observations, some facts have been presented to you, and now you have to turn that into an interpretation, update your assignment for how likely do you think some explanation might be. 

What it says is that that quantity, what's called the posterior probability, posterior meaning after you've made that observation, is proportional to two things. It's proportional to your prior understanding, your prior assignment for the hypothesis, and to what's called the likelihood, which is the probability of that data if the hypothesis is true. Those are the two main ingredients that have to be mixed together.

Bayes' theorem just tells you, in a simple way, that if you have those two things and you multiply them together, then your posterior probability is directly proportional to that product. So if a theory explains the evidence very, very well, meaning that the probability of the evidence is high given the hypothesis, then it's likely going to be the case that you give that theory a lot of weight. But if you thought that hypothesis was very, very unlikely from the start, you still might not be convinced of it.

These two things can kind of offset each other. They can kind of get mixed together. But that's really what Bayes' theorem tells you. What these methods that I mentioned, kind of committing this fallacy, leave out is most often the prior probability. So they focus entirely on the second of those two terms, which is the probability of the evidence given the hypothesis without any allowance for the fact that maybe your prior probability is low for that theory. 

Also, in the course of working out that proportionality, sort of how proportional those things are, you have to think about what other alternative hypotheses there are that could explain the data. That's another thing these methods just don't allow for, don't examine. Oftentimes, the alternatives are kind of implicit. But other times, it might not be entirely clear what other theories are out there that could explain the data any better, and how the balance of probability gets spread around the different theories.
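
Here is that proportionality written out as a short Python sketch (the priors and likelihoods are illustrative, not from the book): posteriors are prior times likelihood, rescaled so the hypotheses under consideration sum to one.

```python
# Illustrative numbers: posterior(H) is proportional to prior(H) * P(data | H),
# normalized over every hypothesis being considered.
prior = {"H1": 0.70, "H2": 0.25, "H3": 0.05}
likelihood = {"H1": 0.02, "H2": 0.10, "H3": 0.40}   # P(observed data | hypothesis)

weights = {h: prior[h] * likelihood[h] for h in prior}
evidence = sum(weights.values())                    # P(data), the normalizing constant
posterior = {h: w / evidence for h, w in weights.items()}
print(posterior)  # roughly {'H1': 0.24, 'H2': 0.42, 'H3': 0.34}

# H3 explains the data best, so it gains a lot of ground, but its low prior keeps it
# from winning outright; the remaining weight is spread over the alternatives.
```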

Max: That makes total intuitive sense. It's like, if you're trying to figure out the cause of something, you want to, before jumping to conclusions, figure out all the different possible causes it could be. I don't know, I feel like that part of it is very intuitive, at least.

So, alright. A good example of that, an example from a talk you gave, I don't know if it's in the book, is the probability of buying a lottery ticket. I feel like that one really crystallizes the point. Maybe we can go through that one real quick.

Aubrey: Thank you. This is one that I do return to often. So again, thinking about the structure of this kind of inferential argument, the argument in the Sally Clark case and others, oftentimes they take the form: if the hypothesis were true, then the data would be very, very unlikely.

Ronald Fisher, who's one of the great statisticians of the 20th century, essentially used that as his justification for null hypothesis significance testing. It says, basically, that the structure of that argument is, if the hypothesis is right, whatever hypothesis is under consideration, then the data we got would be something very, very unlikely. Therefore we kind of reject that hypothesis, or we were suspicious of it anyway. 

Unfortunately, as probabilities go, that just doesn't work, because you have to factor in what other explanations there might be. For example, if your hypothesis was that someone bought a lottery ticket, and the observation was that they won the lottery, then the conditional-

Max: So they already won. You've seen that they won. Crazy thing, they're celebrating, you're at the party talking.

Aubrey: Exactly. So it's something that you've observed, or it could be hypothetical. It doesn't really matter to the probability calculation.

Let's say someone won the lottery and they're celebrating, and your hypothesis was that they bought a ticket. Well, if the hypothesis is true, then the probability of that observation that they won the lottery is very, very small. Vanishingly small. However, you wouldn't use that event as a way or reason to cast doubt on that hypothesis because there's no other explanation available. You have to buy a ticket to win the lottery. So even though that conditional probability going one way is very, very small, the conditional probability pointing the other way is one, or at least very close to one. 

So that's an example I think of where there can be a huge disconnect between these kinds of sampling probabilities of the data versus the inferential probabilities going the other direction, which is really what we care about.

Max: So the probability of winning the lottery, given that you bought a ticket, is small, vanishingly small. The probability of winning the lottery given that you did not buy a ticket is zero. That's almost a tautology, but it illustrates the problem. Because when it's not so clear, people reason as if "they must not have bought a ticket, because if they had bought a ticket, that probability would be small." Here, that would be a crazy thing to say.

Aubrey: That's right. And Bayes' theorem, by the way, sorts this all out immediately. You can put these things into Bayes' theorem; the conditional probabilities for some theories can be zero, as one is here. It will tell you that even though your likelihood for that data was very, very small, whatever prior you associate with it, you now have to believe it, because it's the only theory of the case that could explain what you've observed.
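
As a tiny numerical sketch of that lottery point (the odds and the prior are made up): the two conditional probabilities point in opposite directions and could hardly be more different.

```python
# Made-up odds: P(win | ticket) is tiny, yet P(ticket | win) is exactly 1.
p_win_given_ticket = 1 / 300_000_000   # hypothetical jackpot odds
p_win_given_no_ticket = 0.0            # you cannot win without a ticket
p_ticket = 0.05                        # hypothetical prior that a random person bought a ticket

# Bayes' theorem: P(ticket | win) = P(win | ticket) * P(ticket) / P(win)
p_win = p_win_given_ticket * p_ticket + p_win_given_no_ticket * (1 - p_ticket)
p_ticket_given_win = p_win_given_ticket * p_ticket / p_win
print(p_ticket_given_win)  # 1.0 -- observing the win makes "they bought a ticket" certain
```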

Max: A couple of issues left. One is something that you're calling the implicitly uniform prior. We've talked about uniform priors on the show. How it's not always the right way to go. What's the implicit uniform prior? And what are the issues that you see with uniform priors?

Aubrey: Again, in this context of arguing about priors, which is really where all criticism of Bayesian statistics ultimately winds up, one of the questions is: how do you represent ignorance, let's say over many different possible hypotheses, or values of some parameter in a model, or whatever is appropriate? Some people use uniform distributions, and there are reasons for and against that, arguments back and forth.

What is interesting to me about frequentist statistics, these methods that are, again, the common language and that focus entirely on the probability of the data given the hypothesis, is that they actually agree with Bayesian methods. They coincide with Bayesian methods numerically in one special case, which is where all the various alternative hypotheses have the same prior probability. So if you have the same prior for everything, then that term just cancels out of the equation, and what you're left with is that your posterior assignments for various hypotheses, given the evidence, are proportional to the likelihoods. They're proportional to how likely the data was given the hypothesis, which is the thought process of these standard methods.

What I sometimes like to describe that as is that basically, when people use frequentist methods, they are implicitly assigning uniform prior probabilities to all the hypotheses. If you do that numerically, you're going to wind up in the same place as a Bayesian. 

The problem with that, of course, is that sometimes that's just not the right prior probability to assign. Maybe you have some strong information about a range of some value that's more likely than another range or a range that's impossible. So a uniform prior over all these values is just not an adequate representation of your background knowledge. You're going to be led to some pretty wacky inferences if you follow methods that kind of implicitly assign uniform probability. 

So that's another way, another spin on criticizing those methods: really, they are Bayesian methods, just ones where you've implicitly ignored the differences between hypotheses. I think, just historically speaking, that was reasonable for the scientific contexts that some of those methods were developed in. In things like survey sampling from populations and evolutionary biology, uniform priors might have actually been a pretty good description of what they knew. They knew very little. The problem is trying to extend that to all of science as a norm.
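
A small Python sketch of the coincidence being described (the numbers are illustrative): with a flat prior, the Bayesian posterior is just a rescaled copy of the likelihoods, so ranking hypotheses by likelihood and ranking them by posterior agree; with an informative prior, the two can come apart.

```python
# Illustrative numbers: a uniform prior makes the posterior proportional to the likelihood.
likelihood = {"A": 0.30, "B": 0.15, "C": 0.05}   # P(data | hypothesis)

def posterior(prior):
    weights = {h: prior[h] * likelihood[h] for h in likelihood}
    z = sum(weights.values())
    return {h: round(w / z, 3) for h, w in weights.items()}

print(posterior({"A": 1/3, "B": 1/3, "C": 1/3}))     # {'A': 0.6, 'B': 0.3, 'C': 0.1} -- same ratios as the likelihoods
print(posterior({"A": 0.02, "B": 0.18, "C": 0.80}))  # background knowledge favoring C flips the ordering
```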

Max: So if we're doing some kind of statistical problem, often the method is that we're trying to find the maximum likelihood. Or a lot of times in machine learning, we're doing gradient descent. Like we're trying to find the maximum of some function because that's the most likely place to be. Is that a kind of implicit uniform prior? Because we're essentially just assuming; we're not including any prior in that.

Aubrey: Yes, that's right. You have to be a little bit careful to distinguish between different ways of actually stating what you're doing. So maximum likelihood methods, again another brainchild of Ronald Fisher in the 20th century, say find the hypothesis or the parameter value or whatever is appropriate for that setting. Find the hypothesis that makes the data the most likely possible. So it has the highest conditional probability of the data, given the hypothesis. That's called the likelihood. 

That is the same as the most likely hypothesis, the hypothesis that you would assign the greatest probability to, the one you think is most likely, if you were indifferent to all the hypotheses from the start. So if you had a uniform prior, and you have this term dropping out of Bayes' theorem, then your posterior probability assignments end up proportional to your likelihoods. So your maximum likelihood estimate is the thing you think is most likely, which could be a good inference. It could be a reasonable choice to make if you are trying to do some learning.

But implicitly, what you're doing is saying: I have no prior information that allows me to favor one hypothesis over another before I've seen the data, apart from the data itself. Again, that works fine in some settings. It may even be a reasonable inference. But it doesn't work in every setting; you may have some strong information that you should include.
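
A compact sketch of that in Python (the data and priors are hypothetical): for a coin of unknown bias, the maximum-likelihood estimate coincides with the posterior mode under a flat Beta(1, 1) prior, but an informative prior pulls the estimate away.

```python
import numpy as np

# Hypothetical data: 9 heads in 10 flips of a coin with unknown bias theta.
heads, flips = 9, 10
thetas = np.linspace(0.001, 0.999, 999)   # grid of candidate biases

log_likelihood = heads * np.log(thetas) + (flips - heads) * np.log(1 - thetas)
mle = thetas[np.argmax(log_likelihood)]   # maximum likelihood estimate

def map_estimate(a, b):
    # Posterior mode under a Beta(a, b) prior; a = b = 1 is flat, larger a = b says "probably near fair".
    log_prior = (a - 1) * np.log(thetas) + (b - 1) * np.log(1 - thetas)
    return thetas[np.argmax(log_likelihood + log_prior)]

print(round(mle, 3))                   # ~0.9
print(round(map_estimate(1, 1), 3))    # ~0.9 -- flat prior: the MAP estimate equals the MLE
print(round(map_estimate(20, 20), 3))  # ~0.583 -- a strong "near fair" prior pulls the estimate back
```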

Max: So it sounds like the fallacy is the fallacy of the implicitly uniform prior. It's not that the uniform prior itself is a fallacy. It's more like: make it explicit. Tell us why you're using the uniform prior. Justify it, and then we'll have fewer problems. Am I saying that correctly, or…?

Aubrey: Yes, exactly, that's right. What I would argue for, generally speaking, in science is to try to make everything that's implicit, explicit. And if you understand these kinds of standard methods as basically doing Bayesian inference with an implicitly uniform prior, once you bring that out into the open, then you've got something to really latch on to and criticize and say: is that appropriate for this scientific setting? Maybe, maybe not. And if it's not appropriate, then what other prior would represent our understanding? And then what would a Bayesian inference tell us about our posterior probabilities?

Max: Very cool. Very good. All right.

I just have… There's one thing that is mentioned in the book. One name that I actually had never heard before so I want to ask you about it before we wrap up. Who's Adolphe Quetelet? I don't know if I'm pronouncing that correctly. You have a chapter on the bell curve and I've never heard of… Who is that? What's that situation all about?

Aubrey: Yes, indeed. So Adolphe Quetelet, who is probably the most famous scientist to-

Max: Quetelet. Starting to bring out my French. 

Aubrey: Yeah, that's right. He is probably the most important scientist that no one's ever heard of. He was a Belgian scientist in the mid-19th century and is really responsible for the development of what we now understand as social science, that is, quantitative methods to deal with people in society and people's lives.

How that came about was a very interesting circumstance, a unique intersection, in his case, of social science and astronomy. He had a background in astronomy, and he studied in Paris with some of the most famous astronomers of the day, basically students of Laplace, because he wanted to build an observatory in Belgium. He wanted to make Belgium a kind of center of astronomical learning.

The methods in that field were very statistical and quantitative and involved things like the normal distribution or the bell curve as a representation of the errors in astronomical measurements. He kind of understood that. He knew how to use those tools. 

Then he started collecting massive amounts of data on people, because he wanted to build what he called a social physics. He wanted to essentially model people the way that Kepler and others had modeled the motion of the planets. He wanted to get all kinds of data to support that model building, and what he found was this bell curve shape in lots of different places. That made the lightbulb go off, and he thought: the tools and techniques that we have in astronomy, where this bell curve is a representation of our error distribution, can be applicable in all these other realms where the bell curve shows up as a distribution of people's heights or chest sizes or whatever it is.

That's what led to social science taking off. The methods that we now understand as core statistics, built in the early-to-mid 20th century, trace directly back to Quetelet and this early pioneering work. Unfortunately, he's not a household name, but I think he's basically one of the most influential people in the development of science and statistics.

Max: Interesting. Now, you said he was working with people who were students of Laplace. And Laplace was a big Bayesian, was he not? So were these methods not Bayesian? Where did that get mixed up?

Aubrey: That's an interesting question and probably a much longer discussion for another time. Whether Laplace is a Bayesian.

Max: Yeah, I always tend to ask these questions right at the end.

Aubrey: That's alright. I think, actually, it's an interesting point because we Bayesians like to claim Laplace as one of ours. And if you read the writings of Laplace, you will definitely find Bayesian inklings but you will also sometimes find frequentist ones. 

I think for Laplace at the time, there wasn't such a sharp divide between Bayesian and frequentist probability and statistics. He had a flexible understanding that allowed him to go back and forth. Sometimes that was very useful. Other times, it got him into trouble. I think one of the ways it got him into trouble was in thinking about these error distributions in astronomy as real physical things. You can measure them, you can look at those errors, chart them on a histogram, and you'll get a bell curve shape. Probability, in that setting, very naturally comes to mean frequency.

That's where Quetelet picked it up. He wanted to use that as an objective basis to argue for the existence of social science. I think it's because of the context he was trying to apply it in that Quetelet really leaned hard into the frequentist understanding of probability, because he wanted it to be essentially unassailable. He wanted to argue for these things existing without their being just a product of his imagination. Laplace gave him some room. His ideas allowed for that in a way that maybe, if we were starting it all over again as Bayesians, we would want to shut down.

Max: They probably didn't know… They could not have foreseen the implications of the methodologies that they were using over centuries. That's almost too much to ask of any individual mathematician or scientist. 

This all sounds really fascinating. I look forward to picking up your book, because I feel like there's a lot of new information here, even for someone who's been talking about Bayesian inference for so many years. It has a lot of new things that I haven't seen before. So I thought this was a very fascinating discussion today. Aubrey, thank you so much for being on the show. Do you have any last thoughts? And where can someone go to learn more and to pick up the book?

Aubrey: People can go to my website, aubreyclayton.com, which has all kinds of information about my writing and other things. The book is available at all fine bookshops, on Amazon and Bookshop.org, and at the Columbia University Press website.

My final thought is people should be more Bayesian. They, in particular, should examine the question, what is probability? I think if they do that, they'll find all kinds of ways in their daily lives that probability comes up and is used and abused. I think a clearer understanding of probability can help people navigate all kinds of uncertain situations. So that's my advertisement for Bayesian thinking.

Max: Awesome. Well, Aubrey, I could not agree more. That final statement has my wholehearted endorsement. Thank you so much for coming on the show today. 

Aubrey: Thank you. 

Max: All right. There you go. Just some more arguments and examples in support of the Bayesian idea. If you want to learn more about priors, there are a lot of prior episodes about priors. Episode 207 is all about priors and prior beliefs. Check that one out. There are so many previous episodes that we've done on the topic.

I do have some recommendations. You can always go all the way back to episodes zero and one, which, while the sound quality may be somewhat sub-optimal, were our opening arguments for the show. Also episode 119 with Brian Blais on bringing Bayesianism to probability and statistics education. Episode 78 for my own experience with Bayesian thinking; that was the one I did right after the Bayesian Thinking course in Lviv, Ukraine back in 2019. Then episode 105, with mathematician Sophie Carr.

I didn't do that in chronological order. Maybe if I did, it would be 0 and 1, followed by 78, followed by 105, followed by 119, followed by 207. See there, as a computer scientist, I can just do all sorts of sorts in my head. Whether it's just a selection sort or, actually, I think I did a quicksort in my head.

All right. You're gonna have to go to localmaxradio.com/archive or search the website by topic, because there are so many different angles to the Bayesian topic that we've covered over the years. I feel like we've cataloged one of the greatest libraries on Bayesian thinking. So yeah, definitely check that out. I have a lot of episodes in the can. Some are really cool. Some are really different from the ones we've done in the past. I hope to get to those soon, but I'm going to intersperse them, so I hope to do some news updates as well and of course have Aaron back on the show. Have a great weekend everyone.

Narrator: That's the show. To support the Local Maximum, sign up for exclusive content and our online community at maximum.locals.com. The Local Maximum is available wherever podcasts are found. If you want to keep up, remember to subscribe on your podcast app. Also, check out the website with show notes and additional materials at localmaxradio.com. If you want to contact me, the host, send an email to localmaxradio@gmail.com. Have a great week.

Episode 268 - Pascal's Mugging, Doomsday Clocks, and the AGI Debate

Episode 266 - Simplicity, Complexity, and Text Classification with Joel Grus