Tuesday, November 12, 2013

Conceptual inertia in science: what is simpler and more straightforward? A comment on 'Mendelism'

We now know enough about genetics to give up, as historical relics, terms like Mendelian, dominant, and recessive.  But their use lingers on. The habit of Mendelian terminology is one that we owe, of course, to Gregor Mendel and his work on pea-plant hybridization published in 1865, roughly contemporary with Darwin.  The idea that a trait is inherited and could have two states, one of which had an over-riding effect relative to the other, clarified the nature of one kind of inheritance and led to a huge explosion of successful research that ultimately showed that inheritance involved particles of biological causation ('genes').  It provided a research strategy that led to the discovery of the nature and function of chromosomes, of DNA and its protein-coding nature, and of the fact that causal elements were discrete segments of DNA, and it explained major aspects of inheritance.  In this sense, few discoveries in any science have had a greater impact on human knowledge than Mendel's work.

Mendel's 7 chosen traits

But Mendel was Mendel and times have moved on.  We now know that what is inherited is not traits but coding molecules that affect or produce traits.  We know of many functional elements in DNA beyond direct protein-coding that affect gene expression, and we know that much if not most of the time, no single element produces a trait.  Further, variation (a central aspect of evolution and genomic function) usually has quantitative (taller, higher blood pressure, higher cholesterol) rather than qualitative (yes/no, green/yellow, sick/well) effects.  Indeed, Mendel knew this, too, and carefully selected traits that behaved themselves relative to his purpose of creating better, predictable strains of pea plants.  (Actually, more than DNA-based codes are inherited, because DNA depends on its cellular context, but that's another topic.)

Old terms past their utility
The problem with continuing to treat the basic model of genetic causation as one in which traits are due to inheritance of variants from a single gene, with one variant being 'dominant' over the other, is that it constrains our thinking.  A lot that we now know about genetics doesn't fit the model.  A lot that Mendel knew doesn't fit the model!  A lot of dichotomous traits that Mendel knew don't fit the model!

Sometimes the effects of both variants are seen, or the trait actually varies based on which variants a person has, or there are many variants in the population with differing effects, including none, and the variant we call 'dominant' sometimes doesn't manifest itself.  When we don't see effects, we say the variant has 'incomplete penetrance' or, for quantitative traits, some 'dominance deviation', but that's misleading.  What is the information content of saying that the 'A' allele is dominant, except when it isn't?  Penetrance probability is a fudge factor that forces a model to fit when it actually doesn't.

When a fudge factor is invoked in science, it is a signal that something about the theory is wrong.

When are children old enough to learn the truth?
I've heard people acknowledge that, yes, the classical terms that we get from Mendel, like dominant and recessive, are misleading but then go on to argue that they are the simple and right way to introduce inheritance to students.  Let's introduce concepts slowly, adding complexities later when they're grown up enough to understand.

We disagree.  We think that such an approach entrenches simplistic and misleading thinking in students' understanding of genetics, where it hangs around in their mental background even when they become professionals who do or should know better.  So we don't like this approach. Indeed, it's based on an incorrect premise.

We think young minds can just as easily understand that we inherit a copy of each gene from each parent, and each confers some effect.  Call it a and b, and the result is a-and-b.  That's about as simple as an explanation can get.  Then we can note that sometimes one of the variants by itself has a huge effect that we can't miss, and those effects have often been discovered first, because they were easy to find. 

Here is an extract from the Wikipedia entry for the blood-clotting disease trait called Factor V Leiden:
Factor V Leiden is an autosomal dominant condition that exhibits incomplete dominance and results in a factor V variant that cannot be as easily degraded by aPC (activated Protein C). ... Up to 30 percent of patients who present with deep vein thrombosis (DVT) or pulmonary embolism have this condition. The risk of developing a clot in a blood vessel depends on whether a person inherits one or two copies of the factor V Leiden mutation. Inheriting one copy of the mutation from a parent (heterozygous) increases by fourfold to eightfold the chance of developing a clot. People who inherit two copies of the mutation (homozygous), one from each parent, may have up to 80 times the usual risk of developing this type of blood clot.[9] Considering that the risk of developing an abnormal blood clot averages about 1 in 1,000 per year in the general population, the presence of one copy of the factor V Leiden mutation increases that risk to between 4 in 1,000 to 8 in 1,000. Having two copies of the mutation may raise the risk as high as 80 in 1,000. It is unclear whether these individuals are at increased risk for recurrent venous thrombosis.
Now, we happen to think that while this may be a technically correct description, it is burdened by a concept that it asserts and then denies in the same initial phrase--Factor V Leiden is "autosomal dominant", and yet exhibits "incomplete dominance".  The trait would be called both Mendelian and dominant by most people.  But it is neither.  What it is, is a trait whose measure (severity or frequency of attack or relevance to experiences such as taking birth control pills in women) depends on the genotype and is both quantitative and probabilistic.

Why not just say that "The risk of blood clotting is affected by a person's specific genotype in the F5 gene, which determines the strength of the gene's effect"?  That is a true, accurate, and direct statement.
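To make the quantitative reading concrete, here is a minimal sketch that treats risk as a plain function of genotype, using only the figures in the Wikipedia extract above. The names `RELATIVE_RISK` and `annual_risk_per_1000` are ours, chosen for illustration, and the homozygous entry uses the extract's "up to 80 times" as an upper bound because no lower figure is given.

```python
# Baseline from the quoted extract: about 1 clot per 1,000 people per year.
BASELINE_PER_1000 = 1

# Relative risk as a (low, high) range, keyed by number of mutation copies.
# Illustrative table built from the quoted figures only.
RELATIVE_RISK = {
    0: (1, 1),    # no copies: baseline risk
    1: (4, 8),    # heterozygous: fourfold to eightfold
    2: (80, 80),  # homozygous: "up to 80 times" (upper bound only)
}

def annual_risk_per_1000(copies):
    """Return the (low, high) annual clot risk per 1,000 for a genotype."""
    low, high = RELATIVE_RISK[copies]
    return (BASELINE_PER_1000 * low, BASELINE_PER_1000 * high)

for copies in (0, 1, 2):
    low, high = annual_risk_per_1000(copies)
    print(f"{copies} copies: {low} to {high} per 1,000 per year")
```

Nothing in this table needs the word 'dominant': each genotype simply has its own risk, and even the highest figure leaves most carriers without a clot in any given year.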

There is no need to call it 'Mendelian' in the sense that the trait itself is inherited (the gene certainly is!) nor 'dominant', and then take it back in the same breath. That obfuscates with jargon that sounds knowledgeable but conveys essentially no information, and indeed less information than our statement in the prior paragraph.  The harm is that by oversimplifying it distracts attention from the search for the causal truth, making a problem seem solved when it hasn't been.

Teach the truth
If we follow the kind of approach we suggest here, the truth is actually simpler and just as easy to digest as a 19th-century first look at a problem--simpler, that is, for new students not already acculturated to obsolete thinking.  Teach the truth and we'd have fewer inaccuracies to un-teach later.  Since most genetic diseases are not simple, we go on to say that the 'penetrance' of each genotype is complex and depends on life and environmental (and other unknown) factors that probably include modifying effects of the genotype in each person at other unidentified genome locations.

We could then go on to say that, as a matter of history, Mendel chose special cases that were clear and this is how he understood diploid genotype-phenotype relationships.  He found situations where there were only two variants in a gene, one with much stronger effects than the other, and this allowed him to track inheritance across generations and understand the situation.  Etc.
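As a rough illustration of how Mendel's special case falls out of the general "a-and-b" picture, the sketch below enumerates the equally likely allele pairings from two parents who each carry both variants. The function name `cross` and the rule "the trait shows if at least one A is present" are our assumptions, standing in for a variant whose effect is too strong to miss.

```python
from collections import Counter
from itertools import product

def cross(parent1, parent2):
    """Count offspring genotypes over all equally likely allele pairings."""
    # Sorting makes 'aA' and 'Aa' count as the same genotype.
    return Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))

genotypes = cross("Aa", "Aa")
print(genotypes)  # one AA, two Aa, one aa

# Hypothetical strong-effect rule: the trait is visible whenever an 'A' is present.
shows = sum(n for g, n in genotypes.items() if "A" in g)
hidden = sum(n for g, n in genotypes.items() if "A" not in g)
print(f"trait visible : not visible = {shows} : {hidden}")
```

The familiar 3:1 ratio appears only because of the all-or-nothing rule in the last step. Replace it with a graded effect per genotype and the same inheritance counts give a quantitative trait instead.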

And we can go on to say (yes, now to more advanced students), that variation in F5 is the most common single genetic risk factor for 'thrombophilia' (above average tendency to form blood clots) in Europeans, but not the only one, and many non-genetic factors can also be involved.  So F5 is a cause of clotting, with high penetrance, but not the cause, in the same way that speeding affects risk of car crashes but is not the only risk factor.

We use penetrance properly as the term that connects genotypes and phenotypes, and we don't need to refer to 'Mendelian' traits etc.  We could avoid terms like 'dominance' completely, or at least when we introduce them make it clear that they are complex subtle terms with variable meanings.  In so doing, the next generation could be freed of the conceptual inertia of the Victorian era.

5 comments:

Manoj Samanta said...

> In so doing, the next generation could be freed of the conceptual inertia of the Victorian era.

Scholars of the Victorian era were a lot more free-thinking than those living today.

Ken Weiss said...

They were more broadly educated and I think that, in principle, can favor synthesis rather than narrowness.

Anonymous said...

Question presented: "Why not just say that "The risk of blood clotting is affected by a person's specific genotype in the F5 gene, which determines the strength of the gene's effect"? "

I don't follow. Did I miss something? Assuming that the antecedent of "the gene's effect" is the F5 gene, then the sentence reads:

"The risk of blood clotting is affected by a person's specific genotype in the F5 gene, which determines the strength of the F5 gene's effect"

But isn't that just the same as saying: "The risk of blood clotting is determined by the genotype in the F5 gene."??

And isn't that, really, just "Blood clotting is affected by the F5 gene locus"???

Sure, that's true and accurate, but how is it complete? It conveys nothing more than a causal connection between locus and trait. Whether over simplified or not, "autosomal, incompletely dominant" provides more information.

"Incomplete dominance" does not mean, "not dominant," does it? It means, "more than 50% dominant, but not 100% dominant." In other words, it has information value.

So saying that a diseased form of a gene is "autosomal, incompletely dominant" conveys: 1. Located on non-sex gene, 2. Dominant over non-diseased gene, 3. Not completely dominant.

From that, the 7th grade version can be distilled: one disease gene bad, two disease genes very bad!

But I'm open to consider ditching Mendel - I need to bone up on genetics a bit to see if Mendel's concepts still have utility.

Andrew B.

Anonymous said...

Oh, and I forgot:

While I completely agree that the truth should always be taught, sometimes a partial truth taught first can facilitate the whole truth taught later.

A very good cognate to the use Mendel/scrap Mendel discussion is Newton/Einstein.

If you were to teach the truth in physics, at first presentation, you'd have to start by scrapping Newton. His planetary mechanics are Mendel's pea - well chosen, fit the theory, explain some basic observations, but fundamentally incorrect.

Ken Weiss said...

My view is that we should drop categorical descriptions and simply say that there is a quantitative relationship between the trait value, or the presence of the trait, and the genotype at some gene under consideration.

If we did not inculcate thinking in 19th-century terms, including 'dominant', we would not have so much of the community accepting the idea of genes 'for' some trait or rare 'mendelian' variants. I don't expect everyone to agree. But I think legacy concepts are as often confusing as they are enlightening, and they're not as much simpler than modern views as they are often said to be.