I mention this because Arjun recently brought to my attention an old, but thought-provoking and, I think, relevant book. It is by the late Milton Rothman, and called Discovering the Natural Laws (1972, reprinted by Dover, 1989). This book is a very interesting one, and I wish I had known of it long ago. It concerns the history by which some of the basic 'laws' of physics were shown to be just that: laws. And, while its specific topic is physics, it is very relevant to what is going on in genetics at present.
[Image: the book's cover. Source: my copy]
Rothman deals with many presumed laws of Nature, including Newton's laws of motion, conservation, relativity, and electromagnetism. For example, we all know that Newton's universal law of gravitation is that the gravitational attraction between two objects, of mass M and N, a distance r apart, is
F = G MN/r^2
(the denominator is r-squared). G is the universal gravitational constant. But how do we know this? And how do we know it is a 'universal law'? For example, as Rothman discusses, why do we say the effect is due to the exact square (power 2) of the distance between the objects? Why not, for example, the 2.00005 power? And what makes G a constant, much less the constant? Why constant? Why does the same relationship apply regardless of the masses of the objects, or what they're made of? And what makes it universal--how could we possibly test that?
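As a concrete illustration of why the exact exponent matters, here is a small Python sketch (using approximate Earth-Moon values, chosen purely for illustration) comparing the force computed with exponent 2 against exponent 2.00005. The function and numbers are mine, not Rothman's:

```python
# Sketch: how sensitive is Newton's law to the exact exponent on r?
# The masses and distance below are approximate illustrative values.

G = 6.674e-11  # universal gravitational constant, m^3 kg^-1 s^-2

def gravitational_force(m1, m2, r, power=2.0):
    """Force between two masses a distance r apart, with the exponent
    on r left as a parameter so we can vary it."""
    return G * m1 * m2 / r ** power

# Rough Earth-Moon values: masses in kg, distance in meters
m_earth, m_moon, r = 5.97e24, 7.35e22, 3.84e8

f_exact = gravitational_force(m_earth, m_moon, r, power=2.0)
f_off = gravitational_force(m_earth, m_moon, r, power=2.00005)

# The fractional difference is tiny but nonzero; telling the two
# exponents apart experimentally demands at least this much precision.
rel_diff = abs(f_exact - f_off) / f_exact
print(f"F(power=2)       = {f_exact:.6e} N")
print(f"F(power=2.00005) = {f_off:.6e} N")
print(f"relative difference = {rel_diff:.2e}")
```

At this distance the two exponents differ by only about a tenth of a percent of the force, which is exactly why exquisitely careful measurement, of the kind Rothman describes, is needed to pin the law down.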
The fact that these laws are laws, as Rothman details for this and other laws of physics, was not easy to prove. For example, which two objects could be used to demonstrate the gravitational law in such a way that measurement errors and the like would not confound the result? The Earth and Moon won't do, despite Newton's use of them, because they are affected by the pull of the Sun and other planets, etc. Those effects may be relatively small, but they do confound attempts to prove that the law is in fact a true, exact law.
A key to this demonstration of truth is that as various factors are accounted for, as measurement becomes more accurate, and as different approaches triangulate, the data become asymptotically close to the predictions of the theory: predictions based on theory get ever closer to what is observed.
In fact, and very satisfyingly, to a great approximation these various principles do appear to be universal (in our universe, at least) and without exception. Only when we get to the level of resolution of individual fundamental particles, and quantum effects, do these principles break down or seem to vary. But even then the data approach a specifiable kind of result, and the belief is that this is a problem of our incomplete understanding, not a sign of the fickleness of Nature!
Actually, if a value like G, or the power 2, were not universal but instead were context-specific, yet replicable in any given context, we could probably show that, and characterize those situations in which a given value of the parameter held; or we could define with increasing accuracy some functional relationship between the parameter's value and its circumstances.
But Mermaid's Tale is about evolution and genetics, so what is the relevance of these issues? Of course, we're made of molecules and have to follow the principles--yes, the laws--of physics. But at the scale of genes or organisms or evolution or disease prediction, what is their relevance? The answer is epistemological, that is, about the nature of knowledge.
Theory and inference in genetics and evolution: where are the 'laws'? Are there laws?
When you have a formal, analytic or even statistical theory, you start with that, a priori one might say, and apply it to data either to test the theory itself, or to estimate some parameter (such as F in some particular setting, say a planet's orbital motion). As I think it can be put, you use or test an externally derived theory.
Rothman quotes Karl Popper's view that "observation is always observation in the light of theories" which we test by experiments in what Rothman calls a 'guess and test' method, otherwise known as 'the scientific method'. This is a hypothesis-driven science world-view. It's what we were all taught in school. Both the method and its inferential application depend on the assumption of law-like replicability.
Popper is best known for his idea that replication of observations never proves a hypothesis, because the next observation might falsify it, while it takes only one negative observation to do that. There may be some unusual but valid circumstance that you simply hadn't yet encountered. In general, but perhaps particularly in biomedical sciences, it is faddish and often self-serving to cite 'falsifiability' as our noble criterion, as if one has a deep knowledge of epistemology. Rothman, like almost everyone, quotes Popper favorably. But in fact, his idea doesn't work. Not even in physics! Why?
Any hypothesis can be 'falsified' if the experiment encounters design or measurement error or bad luck (in the case of statistical decision-making and sampling). You don't necessarily know that's what happened, only that you didn't get your expected answer. But what about falsifiability in biology and genetics? We'll see below that even by our own best theory it's a very poor criterion in ways fundamentally beyond issues of lab mistakes or bad sampling luck.
We have only the most general theories of the physics-like sort for phenomena in biology. Natural selection and genetic drift, genes' protein-coding mechanisms, and so on are in a sense easy and straightforward examples. But our over-arching theory, that of evolution, says that situations are always different. Evolution is about descent with modification, to use Darwin's phrase, and this generates individuals as non-replicates of each other. If we, very properly, include chance in the form of genetic drift, we don't really have a formal theory to test against data. That's because chance involves sampling, measurement, and the very processes themselves. Instead, in the absence of a precisely predictive formal theory, we use internal comparisons--cases vs controls, traits in individuals with, and without, a given genotype, and so on. What falsifiability criterion should you apply and, more importantly, what justifies your claim of that criterion?
Statistical comparison may be an understandable way to do business under current circumstances, which have not provided a precise enough theory, but it means that our judgments are highly dependent on subjective inferential criteria like p-value significance cutoffs (we've blogged about this in the past). They may suggest that 'something is going on' rather than 'nothing is going on', but even then only by the chosen cutoff standard. Unlike the basic laws of physics, they do not generate asymptotically closer and closer fits to formally constructed expectations.
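To make the cutoff-dependence concrete, here is a minimal sketch in Python, with entirely invented case-control counts, using a simple normal-approximation two-proportion test. The same data pass one conventional significance threshold and fail another, with nothing in the data itself having changed:

```python
# Sketch (invented counts): the same case-control comparison can be
# declared a 'finding' or a null result depending only on the chosen
# p-value cutoff, not on anything in the data.
import math

def two_proportion_p(x1, n1, x2, n2):
    """Two-sided p-value for a difference in proportions
    (normal approximation to the two-proportion z-test)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability

# Hypothetical study: 60/500 affected carriers vs 40/500 affected non-carriers
p = two_proportion_p(60, 500, 40, 500)
print(f"p = {p:.3f}")
for cutoff in (0.05, 0.01):
    verdict = "significant" if p < cutoff else "not significant"
    print(f"at cutoff {cutoff}: {verdict}")
```

With these made-up numbers the p-value lands between the two common cutoffs, so the 'discovery' exists at 0.05 and vanishes at 0.01, a verdict rendered by convention rather than by the data.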
What is causal 'risk'?
We may be interested in estimating 'the' effect of a particular genetic variant on some trait. We collect samples of carriers of one or two copies of the variant, and determine the risk in them, perhaps comparing this to the risk in samples of individuals who do not carry it. But this is not 'the' risk of the variant! It is, at best, an estimate of the fraction of outcomes in our particular data. By contrast, a probability in the prediction sense is the result of a causal process with specific properties, not an empirical observation. We usually don't have the evidence to assert that our observed fraction is an estimate of that underlying causal process.
Collecting larger samples will not in the appropriate sense lead asymptotically to a specific risk value. That's because we know enough about genes to know that an effect is dependent on the individual's genomic background and life-experience. Each individual has his/her own, unique 'risk'. In fact, the term 'risk' itself, applied to an individual, is rather meaningless, because it is simply inappropriate to use such individual concepts in this context, when the value is based on a group sample. Or, put another way, the ideas about variance in estimates and so on just don't apply, because individuals are not replicable observations. The variance of group estimates is different from what we might want to know about individuals.
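A small simulation can illustrate the point. In this sketch, with entirely hypothetical numbers of my own devising, each carrier of a variant has a personal risk drawn from a wide distribution; as samples grow, the observed fraction of affected carriers converges on the group mean of those personal risks, not on anything true of any one individual:

```python
# Sketch (hypothetical numbers): heterogeneous individual risks mean
# the sample fraction estimates a group average, not 'the' risk of
# any person who carries the variant.
import random

random.seed(1)

def sample_fraction(n):
    """Observe n carriers; each has a personal risk drawn uniformly
    from 0.1-0.7, and is affected with that personal probability."""
    affected = 0
    for _ in range(n):
        personal_risk = random.uniform(0.1, 0.7)  # unique to each person
        if random.random() < personal_risk:
            affected += 1
    return affected / n

for n in (100, 10_000, 1_000_000):
    print(n, round(sample_fraction(n), 3))
# The estimates settle near 0.4, the mean of the personal risks, even
# though almost no individual actually has a risk of 0.4.
```

Bigger samples tighten the estimate of the group average, but the individuals in the sample range from 0.1 to 0.7 throughout; the converged number describes the collection, not its members.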
These are the realities of the data, not a fault of the investigators. If anything, it is a great triumph, one we owe to Darwin (whatever his overstatement of selection as a 'law' of nature), that we know these realities! But it means, if we pay attention to what a century of fine life science has clearly shown, that fundamental context-dependence deprives us of the sort of replicability found in the physical sciences. That undermines many of the unstated assumptions in our statistical methods (the existence of actual probability values, proper distributions from which we're sampling, replicability, etc.).
This provides a generic explanation for why we have the kinds of results we have, and why the methods being used have the problems they have. In a sense, our methods, including statistical inference, are borrowed from the physical sciences, and may be inappropriate for many of the questions we're asking in evolution and contemporary genetics, because our science has shown that life doesn't obey the same 'laws' at the levels of our observation.
Darwin and Mendel had Newton envy, perhaps, but gave us our own field--if we pay attention
Rothman's book provides, to me, a clear way to see these issues. It may give us physics envy, a common and widespread phenomenon in biology. I think Darwin and Mendel were both influenced indirectly by Newton and the ideas of rigorous 'laws' of nature. But they didn't heed what Newton himself said in his Principia in 1687: "Those qualities of bodies that . . . belong to all bodies on which experiments can be made should be taken as qualities of all bodies universally." (Book 3, Rule 3, from the Cohen and Whitman translation, 1999) That is, for laws of nature, if you understand behavior of what you can study, those same laws must apply universally, even to things you cannot or have not yet studied.
This is a cogent statement of the concept of laws, which for Newton was "the foundation of all natural philosophy." It is what we mean by 'laws' in this context, and why better designed studies asymptotically approach the theoretical predictions. But this principle is one that biomedical and evolutionary scientists and their students should have tattooed on themselves so they don't forget it. The reason is that this principle of physics is not what we find in biology, and we should be forced to keep in mind and take seriously the differences between the sciences, and the problems we face in biology. We do not have the same sort of replicability or universality in life, unless non-replicability is our universal law.
If we cannot use this knowledge to ferret out comparable laws of biology, then what was the purpose of 150 years of post-Darwinian science? It was to gain understanding, not to increase denial. At least we should recognize the situation we face, which includes ceasing to make misleading or false promises about our predictive, retrodictive, or curative powers, promises that essentially assume as laws what our own work has clearly shown are not laws.