20210901

Can you know for sure?

Every science pupil knows about drawing a best-fit straight line by eye through a scatter plot of experimental data. But at college level I discovered something so much better in regression analysis. Better, because the resulting best-fit (trend) curve or line (polynomial) is mathematically determined and no longer depends on the vagaries of human judgement. Apt and questionable uses of regression are well illustrated by...

[Image: "Curve Fitting Methods" — the same scatter of experimental data fitted in twelve different ways]

The black dots represent experimental data, the same in all 12 graphs, and the various red lines are "best fit" attempts to determine the underlying theory that explains these data. In the simplest case the experimenter will massage the theory into the equation of a straight line, plot the data according to this equation and then use linear regression to fit the best straight line, as in the first graph. The resulting gradient and y-intercept summarise the data, and the correlation coefficient provides a measure of how much confidence to place in the fit.
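
As a concrete illustration of that simplest case, here is a minimal sketch in Python using scipy's linregress; the data points are invented purely for the example:

```python
# A minimal sketch of the "first graph" case: fit a straight line to noisy
# data and read off the gradient, y-intercept and correlation coefficient.
# The data points here are invented purely for illustration.
import numpy as np
from scipy.stats import linregress

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])  # roughly y = 2x, with "error"

fit = linregress(x, y)
print(f"gradient    = {fit.slope:.3f}")
print(f"y-intercept = {fit.intercept:.3f}")
print(f"r           = {fit.rvalue:.4f}")        # correlation coefficient
```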

Measured data is generally discrete as well as being subject to experimental error, whereas the theory is hoped to be exact and generally continuous. For example, consider an experiment to verify the well known equation for the period of a simple pendulum:

T = 2π√(L/g)

The period T will be measured for discrete values of the length L. Squaring both sides gives T² = (4π²/g)L, so plotting T² against L should result in a straight line through the origin whose gradient, 4π²/g, will give a value for the acceleration of free fall g.
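
A short sketch of that procedure, with simulated "measurements" standing in for the real thing (the lengths, noise level and true value of g are all assumptions made up for the example):

```python
# Sketch: recovering g from simulated pendulum "measurements" by fitting
# T² against L. The lengths, noise level and true g are assumptions made
# up for this example, not real data.
import numpy as np

g_true = 9.81                               # used only to fabricate the data
L = np.array([0.2, 0.4, 0.6, 0.8, 1.0])     # pendulum lengths in metres
T = 2 * np.pi * np.sqrt(L / g_true)         # T = 2π√(L/g)
T += np.random.normal(0, 0.01, T.shape)     # sprinkle in "experimental error"

# T² = (4π²/g)·L, so a straight-line fit of T² on L has gradient 4π²/g.
gradient, intercept = np.polyfit(L, T**2, 1)
print(f"estimated g = {4 * np.pi**2 / gradient:.2f} m/s²")
```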

Whereas the period might only have been measured for discrete lengths, regression will give us a value for the period at any length. Getting a value in this way within the experimental range is called "interpolation"; getting one outside that range is called "extrapolation". The latter is often unreliable, as illustrated in the last graph above, especially for higher orders of the regression polynomial.
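
A small sketch of why that is, again with data invented for illustration: a deliberately over-fitted high-order polynomial behaves itself inside the measured range but typically diverges wildly just outside it.

```python
# Sketch of the extrapolation hazard: a deliberately high-order polynomial
# fit looks fine inside the measured range but typically diverges wildly
# just outside it. All numbers here are invented for illustration.
import numpy as np

x = np.linspace(0.0, 5.0, 12)
y = 2 * x + np.random.normal(0, 0.3, x.shape)   # underlying truth: a line

poly = np.poly1d(np.polyfit(x, y, 7))           # over-fitted 7th-order curve

print("interpolated at x = 2.5:", poly(2.5))    # close to the true 5.0
print("extrapolated at x = 7.0:", poly(7.0))    # usually far from the true 14.0
```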

Valid or spurious?

[Image: a scatter plot of experimental data in which one red "outlier" point sits well off the trend]

But even interpolation can be tricky. Consider the experimental data depicted in the above graph. The "outlier" red point might be due to human measurement error or might describe some real but anomalous phenomenon. If in fact no measurement had been taken at this x-axis value, any such phenomenon would have been missed. Of course the correct action in either case would be to take more measurements, but this freedom to re-measure is not always available, particularly if we are dealing with data measured historically or with apparatus that is no longer available or is too costly to set up.
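
One simple, purely illustrative way to flag such a suspect point is to look at the residuals from the fitted line; the 2-standard-deviation threshold below is an arbitrary assumption, not a rule, and a flagged point still needs human judgement to decide between error and genuine anomaly.

```python
# Sketch: flagging a possible outlier by its residual from a straight-line
# fit. The 2-standard-deviation threshold is an arbitrary assumption, and a
# flagged point still needs human judgement: error, or real phenomenon?
import numpy as np

x = np.arange(10, dtype=float)
y = 3 * x + 1 + np.random.normal(0, 0.5, x.shape)
y[4] += 8.0                                     # plant an anomalous point

slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)
suspects = np.abs(residuals) > 2 * residuals.std()
print("suspect x values:", x[suspects])
```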

We are talking about experimental data versus theory. Science is all about coming up with theories to explain observed phenomena. In physics the epitome would be to fulfil the quest for the so far elusive Theory of Everything.

As you can see from the Curve Fitting Methods image, there might be more than one theory that fits the same experimental data, so which is the correct one? An example is the wave / particle duality of elementary particles. The observation that light sometimes behaves as a wave, and at other times as a stream of particles, marked the start of what we now call Modern Physics. Back then these two theories seemed to be at odds, but we now accept that they are different facets of the same truth. Witness the "greatest mistake" of Arago's spot. And we are similarly happy with the concept that what we call matter is not as tangibly solid as suggested by Thomas's "Unless I see in his hands the mark of the nails, and place my finger into the mark of the nails, and place my hand into his side, I will never believe", but rather is subject to the fuzzy laws of probability - or at least so quantum theory says. But these are theories often beset with data that is hard or costly to obtain, and of course we can neither directly "see" nor touch some of this stuff. Think Large Hadron Collider, the world's largest and highest-energy particle accelerator.

To the uninitiated, the various "modern" theories such as General Relativity, String Theory and the Standard Model, with its quarks and the more recently supposed anyons that can only exist in 2D, appear wacky indeed. The only reason there are no large scale public demonstrations to either refute or endorse them is, I suppose, that the average member of the public doesn't have a clue what they all mean or their relevance to everyday life.

But perhaps none of these currently accepted theories is in fact the best fit to the data; witness that a Theory of Everything has not yet been established. And perhaps a similar argument can be applied to other realms, for example climate change (aka global warming) and the role of CO2, where a lot depends on questionable extrapolation. Add to this the political agendas that so often seem to have the media promoting particular theories to their own ends, and you have such a mishmash that all the ordinary person is left to choose is whether to go with the flow or against it, neither course being as strongly based on science or truth as we would like.


2 comments:

  1. https://www.vice.com/en/article/z3xbz4/eric-weinstein-says-he-solved-the-universes-mysteries-scientists-disagree (no comment!)

  2. Jon, I enjoyed reading your recommended 'The Devil's Delusion' and this quote agrees nicely with what I was saying: "It is entirely possible that there may be as many elementary particles as there is funding available to investigate them". We cannot know for sure.
