The passing of stars

Birth, death, and afterlife in the universe


Keith S. Taber


stars are born, start young, live, sometimes living alone but sometimes not, sometimes have complicated lives, have lifetimes, reach the end of their lives, and die, so, becoming dead, eventually long dead; and, indeed, there are generations of stars with life cycles


One of the themes I keep coming back to here is the challenge of communicating abstract scientific ideas. Presenting science in formal technical language will fail to engage most general audiences, and will not support developing understanding if the listener/reader cannot make good sense of the presentation. But, if we oversimplify, or rely on figures of speech (such as metaphors) in place of formal treatments of concepts, then – even if the audience does engage and make sense of the presentation – audience members will be left with a deficient account.

Does that matter? Well, often a level of understanding that provides some insight into the science is far better than the impression that science is so far detached from everyday experience that it is not for most people.

And the context matters.

Public engagement with science versus science education

In the case of a scientist asked to give a public talk, or being interviewed for news media, there seems a sensible compromise. If people come away from the presentation thinking they have heard about something interesting, that seems in some way relevant to them, and that they understood the scientist's key messages, then this is a win – even if it is only a shift to an over-simplified account, or an understanding in terms of a loose analogy. (Perhaps some people will want to learn more – but, even if not, surely this meets some useful success criterion?)

In this regard science teachers have a more difficult job to do. 1 The teacher is not usually considered successful just because the learners think they have understood the teaching, but rather only when the learners can demonstrate that what they have learnt matches a specified account set out as target knowledge in the curriculum. This certainly does not mean a teacher cannot (or should not) use simplification and figures of speech and so forth – this is often essential – but rather that such moves can usually only be seen as starting points in moving learners onto temporary 'stepping stones' towards creditable knowledge that will eventually lead to test responses that will be marked correct.


An episode of 'In Our Time' on 'The Death of Stars'
"The image above is of the supernova remnant Cassiopeia A, approximately 10,000 light years away, from a once massive star that died in a supernova explosion that was first seen from Earth in 1690"

The Death of Stars

With this in mind, I was fascinated by an episode of the BBC's radio show, 'In Our Time' which took as its theme the death of stars. Clearly, this falls in the category of scientists presenting to a general public audience, not formal teaching, and that needs to be borne in mind as I discuss (and perhaps even gently 'deconstruct') some aspects of the presentation from the perspective of a science educator.

The show was broadcast some months ago, but I made a note to revisit it because I felt it was so rich in material for discussion, and I've just re-listened. I thought this was a fascinating programme, and I think it is well worth a listen, as the programme description suggests:

"Melvyn Bragg and guests discuss the abrupt transformation of stars after shining brightly for millions or billions of years, once they lack the fuel to counter the force of gravity. Those like our own star, the Sun, become red giants, expanding outwards and consuming nearby planets, only to collapse into dense white dwarves. The massive stars, up to fifty times the mass of the Sun, burst into supernovas, visible from Earth in daytime, and become incredibly dense neutron stars or black holes. In these moments of collapse, the intense heat and pressure can create all the known elements to form gases and dust which may eventually combine to form new stars, new planets and, as on Earth, new life."

https://www.bbc.co.uk/sounds/play/m0018128

I was especially impressed by the Astronomer Royal, Professor Martin Rees (and not just because he is a Cambridge colleague) who at several points emphasised that what was being presented was current understanding, based on our present theories, with the implication that this was open to being revisited in the light (sic) of new evidence. This made a refreshing contrast to the common tendency in some popular science programmes to present science as 'proven' and so 'certain' knowledge. That tendency is an easy simplification that distorts both the nature and excitement of science.

Read about scientific certainty in the media

Presenter Melvyn Bragg's other guests were Carolin Crawford (Emeritus Member of the Institute of Astronomy, and Emeritus Fellow of Emmanuel College, University of Cambridge) and Mark Sullivan (Professor of Astrophysics at the University of Southampton).

Public science communication as making the unfamiliar familiar

Science communicators, whether professional journalists or scientists popularising their work, face similar challenges to science teachers in getting across often complex and abstract ideas; and, like them, need to make the unfamiliar familiar. Science teachers are taught about how they need to connect new material with the learners' prior knowledge and experiences if it is to make sense to the students. But successful broadcasters and popularisers also know they need to do this, using such tactics as simplification, modelling, metaphor and simile, analogy, teleology, anthropomorphism and narrative.

There were quite a few examples of the speakers seeking to make abstract ideas accessible to listeners in such ways in this programme. However, perhaps the most common trope was one set up by the episode title, and one which could very easily slip under the radar (so to speak). In this piece I examine the seemingly ubiquitous metaphor (if, indeed, it is to be considered a metaphor!) of stars being alive; in a sequel I discuss some of the wide range of other figures of speech adopted in this one science programme.

Science: making the familiar, unfamiliar?

If when working as a teacher I saw a major part of my work as making the unfamiliar familiar to learners, in my research there was a sense in which I needed to make the familiar unfamiliar. Often, the researcher needs to focus afresh on the commonly 'taken-for-granted' and to start to enquire into it as if one does not already know about it. That is, one needs to problematise the common-place. (This reflects a process sometimes referred to as 'bracketing'.)

To give one obvious example. Why do some students do well in science tests and others less well? Obviously, because some learners are better science students than others! (Clearly in some sense this is true – but is it just a tautology? 2) But one clearly needs to dig into this truism in more detail to uncover any insights that would actually be useful in supporting students and improving teaching!

The same approach applies in science. We do not settle for tautologies such as fire burns because fire is the process of burning, or acids are corrosive because acids are the category of substances which corrode; nor what are in effect indirect disguised tautologies such as heavy objects fall because they are largely composed of the element earth, where earth is the element whose natural place is at the centre of the world. (If that seems a silly example, it was the widely accepted wisdom for many centuries. Of course, today, we do not recognise 'earth' as a chemical element.)

I mention this, because I would like to invite readers to share with me in making the familiar unfamiliar here – otherwise you could easily miss my point.

"so much in the Universe, and much of our understanding of it, depends on changes in stars as they die after millions or billions of stable years"

Tag line for 'the Death of Stars'

The lives of stars

The episode opens with

"Hello. Across the universe, stars have been dying for millions of years…"

Melvyn Bragg introducing the episode

The programme was about the death of stars – which directly implies stars die, and, so, also suggests that – before dying – they live. And there were plenty of references in the programme to reinforce this notion. Carolin Crawford suggested,

"So, essentially, a star's life, it can exist as a star, for as long as it has enough fuel at the right temperature at the right density in the core of the star to stall the gravitational collapse. And it is when it runs out of its fuel at the core, that's when you reach the end of its lifetime and we start going through the death processes."

Prof. Carolin Crawford talking on 'In Our Time'

Not only do stars have lives, but some have much longer lives than others,

"…more massive stars can … build quite heavy elements at their cores through their lifetimes. And … they actually have shorter lifetimes – it is counter-intuitive, but they have to chomp through their fuel supply so furiously that they exhaust it more rapidly. So, the mass of the star dictates what happens in the core, what you create in the core, and it also determines the lifetime of the star."

"The mass of the star…determines the lifetime of the star….
our sun…we reckon it is about halfway through its lifetime, so stars like the sun have lifetimes of 10 billion years or so…"


Prof. Carolin Crawford talking on 'In Our Time'
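The counter-intuitive point here (a star with more fuel has a shorter lifetime) can be illustrated with a rough calculation. The sketch below is only a hedged illustration, not anything presented in the programme: it assumes a common textbook approximation that a star's luminosity scales roughly as its mass to the power 3.5, and it normalises to the figure of about 10 billion years quoted above for stars like the sun. The function name and the exponent are assumptions made for this example.

```python
# Rough sketch: why more massive stars have shorter 'lifetimes'.
# Assumption (not from the programme): luminosity scales roughly as
# mass ** 3.5 for sun-like stars, a common textbook approximation.

SUN_LIFETIME_YEARS = 10e9   # "10 billion years or so", as quoted above
LUMINOSITY_EXPONENT = 3.5   # assumed value; real stars vary around this


def approx_lifetime_years(mass_in_suns: float) -> float:
    """Estimate how long a star can keep 'burning' its fuel.

    The fuel available scales with mass, while the rate at which it is
    used scales with luminosity, so lifetime ~ mass / mass**3.5.
    """
    if mass_in_suns <= 0:
        raise ValueError("mass must be positive")
    return SUN_LIFETIME_YEARS * mass_in_suns ** (1 - LUMINOSITY_EXPONENT)


# Compare a lighter star, a sun-like star, and a much heavier star.
for mass in (0.5, 1, 10):
    years = approx_lifetime_years(mass)
    print(f"{mass} solar masses: about {years:.1e} years")
```

On this crude scaling, a star of ten solar masses exhausts its fuel hundreds of times faster than the sun does, which is the sense in which it must "chomp through" its supply. Note that even this sketch cannot avoid the word 'lifetime'.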

This was not some idiosyncratic way that Professor Crawford had of discussing stars, as Melvyn's other guests also used this language. Here are some examples I noted:

  • "this is a dead, dense star" (Martin Rees)
  • "the lifetime of a stable star, we can infer the … life cycles of stars" (Martin Rees)
  • "stars which lived and died before our solar system formed…stars which have more complicated lives" (Martin Rees)
  • "those old stars" (Martin Rees)
  • "earlier generations of massive stars which had lived and died …those long dead stars" (Martin Rees)
  • "it is an old dead star" (Mark Sullivan)
  • "our sun…lives by itself in space. But most stars in the universe don't live by themselves…" (Mark Sullivan)
  • "two stars orbiting each other…are probably born with different masses" (Mark Sullivan)
  • "when [stars] die" (Mark Sullivan)
  • "when [galaxies] were very young" (Martin Rees)
  • "stars that reach the end point of their lives" (Carolin Crawford)
  • "a star that's younger" (Martin Rees)

So, in the language of astronomy, stars are born, start young, live; sometimes living alone but sometimes not, sometimes have complicated lives; have lifetimes, reach the end of their lives, and die, so, becoming dead, eventually long dead; and, indeed, there are generations of stars with life cycles.


The processes that support a star's luminosity come to an end: but does the star therefore die?

(Cover art for the Royal Philharmonic Orchestra's recording of David Bedford's composition Star's End. Photographer: Monique Froese)


Are stars really alive?

Presumably, the use of such terms in this context must have originally been metaphorical. Life (and so death) has a complex but well-established and much-discussed meaning in science. Living organisms have certain necessary characteristics – nutrition, (inherent) movement, irritability/sensitivity, growth, reproduction, respiration, and excretion, or some variation on such a list. Stars do not meet this criterion. 3 Living organisms maintain a level of complex organisation by making use of energy stores that allow them to decrease entropy internally at the cost of entropy increase elsewhere.

Animals and decomposers (such as fungi) take in material that can be processed to support their metabolism and then the 'lower quality' products are eliminated. Photosynthetic organisms such as green plants have similar metabolic processes, but preface these by using the energy 'in' sunlight to first facilitate endothermic reactions that allow them to build up the material used later for their mortal imperative of working against the tendencies of entropy. Put simply, plants synthesise sugar (from carbon dioxide and water) that they can distribute to all their cells to support the rest of the metabolism (a complication that is a common source of alternative conceptions {misconceptions} to learners 4).

By contrast, generally speaking, during their 'lifetimes', stars only gain and lose marginal amounts of material (compared with a 70 kg human being that might well consume a tonne of food each year) – and do not have any quality control mechanism that would lead to them taking in what is more useful and expelling what is not.

As far as life on earth is concerned, virtually all of that complex organisation of living things depends upon the sun as a source of energy, and relies on the process by which the sun increases the universe's entropy by radiating energy from a relatively compact source into the diffuse vastness of space. 4 In other words, if anything, a star like our sun better reflects a dead being such as a felled tree or a zebra hunted down by a lion, providing a source of concentrated energy for other organisms feeding on its mortal remains!

Are the lives and deaths of stars simply pedagogical devices?

So, are stars really alive? Or is this just one example of the kind of rhetorical device I referred to above being adopted to help make the abstract unfamiliar become familiar? Is it the use of a familiar trope employed simply to aid in the communication of difficult ideas? Is this just a metaphor? That is,

  • Do stars actually die, or…
  • are they only figuratively alive and, so, only suffer (sic) a metaphorical death?

I do not think the examples I quote above represent a concerted targeted strategy by Professors Crawford, Rees and Sullivan to work with a common teaching metaphor for the sake of Melvyn and his listeners: but rather the actual language commonly used in the field. That is, the life cycles and lifetimes of stars have entered into the technical lexicon of the science. If so, then stars do actually live and die, at least in terms of what those words now mean in the discipline of astronomy.

Gustav Strömberg referred to "the whole lifetime of a star" in a paper in The Astrophysical Journal as long ago as 1927. He did not feel the need to explain the term so presumably it was already in use – or considered obvious. Kip Thorne published a paper in 1965 about 'Gravitational Collapse and the Death of a Star'. In the first paragraph he pointed out that

"The time required for a star to consume its nuclear fuel is so long (many billions of years in most cases) that only a few stars die in our galaxy per century; and the evolution of a star from the end point of thermonuclear burning to its final dead state is so rapid that its death throes are observable for only a few years."

Thorne, 1965, p.1671

Again, the terminology die/death/dead is used without introduction or explanation.

He went on to refer to

  • deaths of stars
  • different types of death
  • final resting states

before shifting to what a layperson would recognise as a more specialist, technical, lexicon (zero point kinetic energy; Compton wavelength of an electron; neutron-rich nuclei; photodisintegration; gravitational potential energy; degenerate Fermi gas; lambda hyperons; the general relativity equation of hydrostatic equilibrium; etc.), before reiterating that he had been offering

"the story of the death of a star as predicted by a combination of nuclear theory, elementary particle theory, and general relativity"

Thorne, 1965, p.1678

So, this was a narrative, but one intended to be fit for a professional scientific audience. It seems the lives and deaths of stars have been part of the technical vocabulary of astronomers for a long time now.

When did scientists imbue stars with life?

Modern astronomy is quite distinct from astrology, but like other sciences astronomy developed from earlier traditions and at one time astronomy and astrology were not so discrete (an astronomical 'star' such as Johannes Kepler was happy to prepare horoscopes for paying customers) and mythological and religious aspects of thinking about the 'heavens' were not so well compartmentalised from what we would today consider as properly the realm of the scientific.

In Egyptian religion, Ra was both a creative force and identified with the sun. Mythology is full of origin stories explaining how the stars had been cast there after various misadventures on earth (in the Greek myths, but also in other traditions such as those of the indigenous North American and Australian peoples 5) and we still refer to examples such as the seven sisters and Orion with the sword hanging from his belt. The planets were associated with different gods – Venus (goddess of love), Mars (the god of war), Mercury (the messenger of the gods), and so on. 6 It was traditional to refer to some heavenly bodies as gendered: Luna is she, Sol is he, Venus is she, and so on. This usage is sometimes found in scientific writing on astronomy.

Read about examples of personification in scientific writing

Yet this type of poetic license seems unlikely to explain the language of the life cycles of stars, even if there are parallels between scientific and poetic or spiritual accounts,

Stars are celestial objects having their own life cycles. Stars are born, grow up, mature and eventually die. …The author employs inductive and deductive analysis of the verses of the Quran and the Hadith texts related with the life and death of stars. The results show that the life and death of the stars from Islamic and Modern astronomy has some similarities and differences.

Wahab, 2015

After all, the heavenly host of mythology comprised immortals, if sometimes starting out as mortals subsequently given a kind of immortality by the gods when being made into stars. Indeed the classical tradition supported by interpretation of Christian orthodoxy was that unlike the mundane things of earth, the heavens were not subject to change and decay – anything from the moon outwards was perfect and unchanging. (This notion was held onto by some long after it was established that comets with their varying paths were not atmospheric phenomena – indeed well into the twentieth century some young earth creationists were still insisting on the perfect, unchanging nature of the heavens. 7)

So, presumably, we need to look elsewhere to find how science adopted life cycles for stars.

A natural metaphor?

Earlier in this piece I asked readers to bear with me, and to join with me in making the familiar unfamiliar, to 'bracket' the familiar notion that we say stars are born, live and later die, and to problematise it. In one scientific sense stars cannot die – as they were never alive. Yet, I accept this seems a pretty natural metaphor to use. Or, at least, it seems a natural metaphor to those who are used to hearing and reading it. A science teacher may be familiar with the trope of stars being born, living, and dying – but how might a young learner, new to astronomical ideas, make sense of what was meant?

Now, there is a candidate project for anyone looking for a topic for a student research assignment: how would people who have never previously been exposed to this metaphor respond to the kinds of references I've discussed above? I would genuinely like to know what 'naive' people would make of this 8 – would they just 'get' the references immediately (appreciate in what sense stars are born, live, and die); or, would it seem a bizarre way of talking about stars? Given how readily people accept and take up anthropomorphic references to molecules and viruses and electrons and so forth, I find the question intriguing.

Read about anthropomorphism in science

What makes a star alive or dead?

Even if for the disciplinary experts the language of living stars and their life cycles has become a 'dead metaphor' and is now taken (i.e., taken for granted) as technical terminology – the novice learner, or lay member of the public listening to a radio show, still has to make sense of what it means to say a star is born, or is alive, or is nearing the end of its life, or is dead.

The critical feature discussed by Professors Crawford, Rees and Sullivan concerns an equilibrium that allows a star to exist in a balance between the gravitational attraction of its component matter and the pressure generated through its nuclear reactions.

A star forms when material comes together under its mutual gravitational attraction – and as the material becomes denser it gets hotter. Eventually a sufficient density and temperature is reached such that there is 'ignition' – not in the sense of chemical combustion, but self-sustaining nuclear processes occur, generating heat. This point of ignition is the 'birth' of the star.

Fusion processes continue as long as there is sufficient fusible material, the 'fuel' that 'feeds' the nuclear 'furnace' (initially hydrogen, but depending on the mass of the star there can be a series of reactions with products from one stage undergoing further fusion to form even heavier elements). The lifetime of the star is the length of time that such processes continue.

Eventually there will not be sufficient 'fuel' to maintain the level of 'burning' that is needed to allow the ball of material to avoid ('resist') gravitational collapse. There are various specific scenarios, but this is the 'death' of the star. It may be a supernova offering very visible 'death throes'.

The core that is left after this collapse is a 'dead' star, even if it is hot enough to continue being detectable for some time (just as it takes time for the body of a homeothermic animal that dies to cool to the ambient temperature).

It seems then that there is a kind of analogy at work here.

Organisms are alive as long as they continue to metabolise sufficiently in order to maintain their organisation in the face of the entropic tendency towards disintegration and dispersal.
Stars are alive as long as they exhibit sufficient fusion processes to maintain them as balls of material that have much greater volumes, and lower densities, than the gravitational forces on their component particles would otherwise lead to.

It is clearly an imperfect analogy.

Organisms base metabolism on a through-put of material to process (and in a sense 'harvest' energy sources).
Stars do acquire new materials and eject some, but this is largely incidental and it is essentially the mass of fusible material that originally comes together to initiate fusion which is 'harvested' as the energy source.
Organisms may die if they cannot access external food sources, but some die of built-in senescence and others (those that reproduce by dividing) are effectively immortal.

We (humans) die because the amazing self-constructing and self-repairing abilities of our bodies are not perfect, and somatic cells cannot divide indefinitely to replace no longer viable cells.
Stars 'die' because they run out of their inherent 'fuel'.

Stars die when the hydrogen that came together to form them has substantially been processed.

Read about analogy in science

One person's dead star is another person's living metaphor

So, do stars die? Yes, because astronomers (the experts on stars) say they do, and it seems they are not simply talking down to the rest of us. The birth and death of stars seems to be based on an analogy: an analogy which is implicit in some of the detailed discussion of star life cycles. However, through the habitual use of this analogy, terms such as the birth, lifetimes, and death of stars have been adopted into mainstream astronomical discourse as unmarked (taken-for-granted) language such that to the uninitiated they are experienced as metaphors.

And these perspectival metaphors 9 become extended to describe stars that are considered young, old, dying, long dead, and so forth. These terms are used so readily, and so often without a perceived need for qualification or explanation, that we might consider them 'dead' metaphors within astronomical discourse – terms of metaphorical origin but now so habitually used that they have come to be literal (stars are born, they do have lifetimes, they do die). Yet for the uninitiated they are still 'living' metaphors, in the sense that the non-expert needs to work out what it means when a star is said to live or die.

There is a well recognised distinction between live and dead metaphors. But here we have dead-to-the-specialists metaphors that would surely seem to be non-literal to the uninitiated. These terms are not explained by experts as they are taken by them as literal, but they cannot be understood literally by the novice, for whom they are still metaphors requiring interpretation. That is, they are perspectival metaphors: zombie words that may seem alive or dead (as figures of speech) according to audience, and so may be treated as dead in professional discourse, but may need to be made undead when used in communicating to the public.


Other aspects of the In Our Time discussion of 'The death of stars' are explored in 'The complicated social lives of stars: stealing, escaping, and blowing-off in space'.


Sources cited:
  • Strömberg, G. (1927). The Motions of Giant M Stars. The Astrophysical Journal, 65, 238.
  • Thorne, K. S. (1965). Gravitational Collapse and the Death of a Star. Science, 150(3704), 1671-1679. http://www.jstor.org.ezp.lib.cam.ac.uk/stable/1717408
  • Wahab, R. A. (2015). Life and death of stars: an analysis from Islamic and modern astronomy perspectives. International Proceedings of Economics Development and Research, 83, 89.

Notes

1 In this regard, but not in all regards. As I have suggested here before, the teacher usually has two advantages:

a) generally, a class has a limited spread in terms of the audience background: even a mixed ability class is usually from a single school year (grade level) whereas the public presentation may be addressing a mixed audience of all ages and levels of education.

b) usually a teacher knows the class, and so knows something about their starting points, and their interests


2 Some students do well in science tests and others less well.

If we say this is because

  • some learners are better science students than others
  • and settle for defining better science students as those who achieve good results in formal science tests (that is tests as currently administered, based on the present curriculum, taught in our usual way)

then we are simply 'explaining' the explicandum (i.e., some students do better on science tests than others) by a rephrasing of what is to be explained (some students are better science students: that is, they perform well in science tests!)

Read about tautology


3 Criterion (singular) as a living organism has to satisfy the entries in the list collectively. Each entry is of itself a necessary, but not sufficient, condition.


4 A simple misunderstanding is that animals respire but plants photosynthesise.

In a plant in a steady state, the rates of build-up and break down of sugars would be balanced. However, plants must photosynthesise more than they respire overall in order to grow and ultimately to allow consumers to make use of them as food. (This needs to be seen at a system level – the plant is clearly not in any inherent sense photosynthesising to provide food for other organisms, but has evolved to be a suitable nutrition source as it transpires [no pun intended] that this increases the fitness of plants within the wider ecosystem.)

A more subtle alternative conception is that plants photosynthesise during the day when they are illuminated by sunlight (fair enough) and then use the sugar produced to respire at night when the sun is not available as a source of energy. See, for example, 'Plants mainly respire at night because they are photosynthesising during the day'.

Actually cellular processes require continuous respiration (as even in the daytime sunlight cannot directly power cellular metabolism, only facilitate photosynthesis to produce the glucose that can be oxidised in respiration).

Schematic reflection of the balance between how photosynthesis generates resources to allow respiration – typically a plant produces tissues that feed other organisms.
The area above the line represents energy from sunlight doing work in synthesising more complex substances. The area below the line represents work done when the oxidation of those more complex substances provides the energy source for building and maintaining an organism's complex organisation of structure and processes (homeostasis).

5 Museum Victoria offers a pdf that can be downloaded and copied by teachers to teach about "How the southern night sky is seen by the Boorong clan from north-west Victoria":

'Stories in the Stars – the night sky of the Boorong people' shows the constellations as recognised by this group, the names they were given, and the stories of the people and creatures represented.

(This is largely based on the nineteenth century reports made by William Edward Stanbridge of information given by Boorong informants – see 'Was the stellar burp really a sneeze?')

The illustration shown here is of 'Kulkunbulla' – a constellation that is considered in the U.K. to be only part of the constellation known here as Orion. (Constellations are not actual star groupings, but only what observers have perceived as stars seeming to be grouped together in the sky – the Boorong's mooting of constellations is no more right or wrong than that suggested in any other culture.)


6 The tradition was continued into modern times with the discovery of the planets that came to be named Neptune and Uranus after the Gods of the sea and sky respectively.


7 Creationism, per se, is simply the perspective or belief that the world (i.e., Universe) was created by some creator (God) and so creationism as such is not necessarily in conflict with scientific accounts. The theory of the big bang posits that time, space and matter had a beginning with an uncertain cause which could be seen as God (although some theorists such as Professor Roger Penrose develop theories which posit a sequence of universes that each give rise to the next and that could have infinite extent).

Read about science and religion

Young earth creationists, however, not only believe in a creator God (i.e., they are creationists), but one who created the World no more than about 10 thousand years ago (the earth is young!), rather than over 13 billion years ago. This is clearly highly inconsistent with a wide range of scientific findings and thinking. If the Young Earth Creationists are right, then either

  • a lot of very strongly evidenced science is very, very wrong
  • some natural laws (e.g. radioactive decay rates) that now seem fixed must have changed very substantially since the creation
  • the creator God went to a lot of trouble to set up the natural world to present a highly misleading account of its past history

8 I am not using the term naive here in a discourteous or demeaning way, but in a technical sense of someone who is meeting something for the first time.


9 That is, terms that will appear as metaphors from the perspective of the uninitiated, but now seem literal terms from the perspective of the specialist. We cannot simply say they are or are not metaphors, without asking 'for whom?'


Was the stellar burp really a sneeze?

Pulling back the veil on an astronomical metaphor


Keith S. Taber


It seems a bloated star dimmed because it sneezed, and spewed out a burp.


'Pardon me!' (Image by Angeles Balaguer from Pixabay)

I was intrigued to notice a reference in Chemistry World to a 'stellar burp'.

"…the dimming of the red giant Betelgeuse that was observed in 2019…was later attributed to a 'stellar burp' emitting gas and dust which condensed and then obscured light from the star"

Motion, 2022

The author, Alice Motion, quoted astrophysics doctoral candidate and science communicator Kirsten Banks commenting that

"In recorded history…It's the first time we've ever seen this happen, a star going through a bit of a burp"

Kirsten Banks quoted in Chemistry World

although she went on to suggest that the Boorong people (an indigenous culture from an area of the Australian state Victoria) had long ago noticed a phenomenon that became recorded in their oral traditions 1, which

"was actually the star Eta Carinae which went through a stellar burp, just like Betelgeuse did"

Kirsten Banks quoted in Chemistry World

Composite image (optical appearing as white; ultraviolet as cyan; X-rays as purple) of Eta Carinae,

Source: NASA


Clearly a star cannot burp in the way a person can, so I took this to be a metaphor, and wondered if this was a metaphor used in the original scientific report.

A clump and a veil

The original report (Montargès, et al, 2021) was from Nature, one of the most prestigious science research journals. It did not seem to have any mention of belching. This article reported that,

"From November 2019 to March 2020, Betelgeuse – the second-closest red supergiant to Earth (roughly 220 parsecs, or 724 light years, away) – experienced a historic dimming of its visible brightness…an event referred to as Betelgeuse's Great Dimming….Observations and modelling support a scenario in which a dust clump formed recently in the vicinity of the star, owing to a local temperature decrease in a cool patch that appeared on the photosphere."

Montargès, et al., 2021, p.365

So, the focus seemed to be not on any burping but on a 'clump' of material partially obscuring the star. That material may well have arisen from the star. The paper in Nature suggests that Betelgeuse may lose material through two mechanisms: both by a "smooth homogeneous radial outflow that consists mainly of gas", that is a steady and continuous process; but also "an episodic localised ejection of gas clumps where conditions are favourable for efficient dust formation while still close to the photosphere" – that is the occasional, irregular, 'burp' of material, that then condenses near the star. But the word used was not 'burp', but 'eject'.

A fleeting veil

Interestingly the title of the article referred to "A dusty veil shading Betelgeuse". The 'veil' (another metaphor) only seemed to occur in the title. There is an understandable temptation, even in scholarly work, to seek a title which catches attention – perhaps simplifying, alliterating (e.g., 'mediating mental models of metals') or seeking a strong image ('…a dusty veil shading…'). In this case, the paper authors clearly thought the metaphor did not need to be explained, and that readers would understand how it linked to the paper content without any explicit commentary.


Word                Frequency in Nature article
clump(s)            25 (excluding reference list)
eject(ed, etc.)     4
veil                1 (in title only)
burp                0
blob                0

There's no burping in Nature

The European Southern Observatory issued a press release (sorry, a 'science release') about the work entitled 'Mystery of Betelgeuse's dip in brightness solved', that explained

"In their new study, published today in Nature, the team revealed that the mysterious dimming was caused by a dusty veil shading the star, which in turn was the result of a drop in temperature on Betelgeuse's stellar surface.

Betelgeuse's surface regularly changes as giant bubbles of gas move, shrink and swell within the star. The team concludes that some time before the Great Dimming, the star ejected a large gas bubble that moved away from it. When a patch of the surface cooled down shortly after, that temperature decrease was enough for the gas to condense into solid dust.

'We have directly witnessed the formation of so-called stardust,' says Montargès, whose study provides evidence that dust formation can occur very quickly and close to a star's surface. 'The dust expelled from cool evolved stars, such as the ejection we've just witnessed, could go on to become the building blocks of terrestrial planets and life', adds Emily Cannon, from KU Leuven, who was also involved in the study."

https://www.eso.org/public/news/eso2109/

So, again, references to ejection and a veil – but no burping.

Delayed burping

Despite this, the terminology of the star burping seems to have been widely taken up in secondary sources, such as the article in Chemistry World.

A New Scientist report suggested "Giant gas burp made Betelgeuse go dim" (Crane, 2021). On the website arsTECHNICA, Jennifer Ouellette wrote that "a cold spot and a stellar burp led to strange dimming of Betelgeuse".

On the news site Gizmodo, George Dvorsky wrote a piece entitled "A dusty burp could explain mysterious dimming of supergiant star Betelgeuse". Whilst the term burp was only used in the title, Dvorsky was not shy of making other corporeal references,

"a gigantic dust cloud, which formed after hot, dense gases spewed out from the dying star. Viewed from Earth, this blanket of dust shielded the star's surface, making it appear dimmer from our perspective, according to the research, led by Andrea Dupree from the Centre for Astrophysics at Harvard & Smithsonian.

A red supergiant star, Betelgeuse is nearing the end of its life. It's poised to go supernova soon, by cosmological standards, though we can't be certain as to exactly when. So bloated is this ageing star that its diameter now measures 1.234 million kilometers, which means that if you placed Betelgeuse at the centre of our solar system, it would extend all the way to Jupiter's orbit."

The New York Times published an article (June 17, 2021) entitled "Betelgeuse Merely Burped, Astronomers Conclude", where author Dennis Overbye began his piece:

"Betelgeuse, to put it most politely, burped."

The New York Times

Overbye also reports the work from the Nature paper

"'We have directly witnessed the formation of so-called stardust,' Miguel Montargès, an astrophysicist at the Paris Observatory, said in a statement issued by the European Southern Observatory. He and Emily Cannon of Catholic University Leuven, in Belgium, were the leaders of an international team that studied Betelgeuse during the Great Dimming with the European Southern Observatory's Very Large Telescope on Cerro Paranal, in Chile.

Parts of the star, they found, were only one-tenth as bright as normal and markedly cooler than the rest of the surface, enabling the expelled blob to cool and condense into stardust. They reported their results on Wednesday in Nature."

The New York Times

So, instead of the clumps referred to in the Nature article as ejected, we now have an expelled blob (neither word appears in the Nature article itself). Overbye also explains how this study followed up on earlier observations of the star

"Their new results would seem to bolster findings reported a year ago by Andrea Dupree of the Harvard-Smithsonian Center for Astrophysics and her colleagues, who detected an upwelling of material on Betelgeuse in the summer of 2019.

'We saw the material moving out through the chromosphere in the south in September to November 2019,' Dr. Dupree wrote in an email. She referred to the expulsion as 'a sneeze.'"

The New York Times

'…material moving out through the chromosphere in the south…': Hubble space telescope images of Betelgeuse (Source: NASA) 2

Bodily functions and stellar processes

I remain unsure why, if the event was originally considered a sneeze, it became transformed into a burp. However, the use of such descriptions is not unusual. Metaphor is a common tool in science communication to help 'make the unfamiliar familiar' by describing something abstract or out-of-the-ordinary in more familiar terms.

Read about metaphors in science

Here, the body [sic] of the scientific report keeps to technical language although a metaphor (the dust cloud as a veil) is considered suitable for the title. It is only when the science communication shifts from the primary literature (intended for the science community) into more popular media aimed at a wider audience that the physical processes occurring in a star became described in terms of our bodily functions. So, in this case, it seems a bloated star dimmed because it sneezed, and spewed out a burp.


Coda

The astute reader may have also noticed that the Gizmodo article referred to Betelgeuse as an "ageing star" that is "nearing the end of its life": terms that imply a star is a living, and mortal, being. This might seem to be journalistic licence, but the NASA website from which the sequence of Betelgeuse images above is taken also refers to the star as ageing (as well as being 'petulant' and 'injured'). 2 NASA employs scientifically qualified people, but its public websites are intended for a broad, general audience, perhaps explaining the anthropomorphic references.

Thus, we might understand references to stars as alive as being a metaphorical device used in communicating astronomical ideas to the general public. Yet, an examination of the scientific literature might instead suggest that astronomers DO consider stars to be alive. But, that is a topic for another piece.


Work cited:
  • Crane, L. (2021). Giant gas burp made Betelgeuse go dim. New Scientist, 250(3340), 22. doi:10.1016/S0262-4079(21)01094-0
  • Hamacher, D. W., & Frew, D. J. (2010). An aboriginal Australian record of the great eruption of Eta Carinae. Journal of Astronomical History and Heritage, 13(3), 220-234.
  • Montargès, M., Cannon, E., Lagadec, E., de Koter, A., Kervella, P., Sanchez-Bermudez, J., . . . Danchi, W. (2021). A dusty veil shading Betelgeuse during its Great Dimming. Nature, 594(7863), 365-368. doi:10.1038/s41586-021-03546-8
  • Motion, A. (2022). Space for more science. Astrophysics and Aboriginal astronomy on TikTok. Chemistry World, December 2022, p.15 (https://www.chemistryworld.com/opinion/space-for-more-science/4016585.article)

Notes

1 William Edward Stanbridge (1816-1894) was an Englishman who moved to Australia in 1841. He asked Boorong informants about their astronomy, and recorded their accounts. He presented a report to the Philosophical Institute of Victoria in 1857 and published two papers (Hamacher & Frew, 2010). The website Australian Indigenous Astronomy explains that

"The larger star of [of the binary system] Eta Car is unstable and undergoes occasional violent outbursts, where it sheds material from its outer shells, making it exceptionally bright.  During the 1840s, Eta Car went through such an outburst where it shed 20 solar masses of its outer shell and became the second brightest star in the night sky, after Sirius, before fading from view a few years later.  This event, commonly called a "supernova-impostor" event, has been deemed the "Great Eruption of Eta Carinae".  The remnant of this explosion is evident by the Homunculus Nebulae [see figure above – nebulae are anything that appears cloud-like to astronomical observation].  This identification shows that the Boorong had noted the sudden brightness of this star and incorporated it into their oral traditions."

Duane Hamacher

A paper in the Journal of Astronomical History and Heritage concludes that

"the Boorong people observed 𝜂 Carinae in the nineteenth century, which we identify using Stanbridge's description of its position in Robur Carolinum, its colour and brightness, its designation (966 Lac, implying it is associated with the Carina Nebula), and the relationship between stellar brightness and positions of characters in Boorong oral traditions. In other words, the nineteenth century outburst of 𝜂 Carinae was recognised by the Boorong and incorporated into their oral traditions"

Hamacher & Frew 2010, p.231

2 The images reproduced here are presented on a NASA website under the heading 'Hubble Sees Red Supergiant Star Betelgeuse Slowly Recovering After Blowing Its Top'. This is apparently not a metaphor as the site informs readers that "Betelgeuse quite literally blew its top in 2019". Betelgeuse is described as a "monster star", and its activity as "surprisingly petulant behaviour" and a "titanic convulsion in an ageing star", such that "Betelgeuse is now struggling to recover from this injury."

This seems rather anthropomorphic – petulance and struggle are surely concepts that refer to sentient deliberate actors in the world, not massive hot balls of gas. However, anthropomorphic narratives are often used to make scientific ideas accessible.

Read about anthropomorphism

The recovery (from 'injury') is described in terms of two similes,

"The star's interior convection cells, which drive the regular pulsation may be sloshing around like an imbalanced washing machine tub, Dupree suggests. … spectra imply that the outer layers may be back to normal, but the surface is still bouncing like a plate of gelatin dessert [jelly] as the photosphere rebuilds itself."

NASA Website

Read about science similes


A drafted man is like a draft horse because…

A case of analogy in scientific discovery


Keith S. Taber


How is a drafted man like a draft horse (beyond them both having been required to give service)?

"The phthisical soldier is to his messmates
what
the glandered horse is to its yoke fellow"

Jean-Antoine Villemin quoted by Goetz, 2013

Analogy in science

I have discussed many examples of analogies in these pages. Often, these are analogies intended to help communicate scientific ideas – to introduce some scientific concept by suggesting it is similar to something already familiar. However, analogy is important in the practice of science itself – not just when teaching about or communicating science to the general public. Scientific discoveries are often made by analogical thinking – perhaps this as-yet-unexplained phenomenon is a bit like that other well-conceptualised phenomenon?

Analogies are more than just similes (simply suggesting that X is like Y; say that the brain is like a telephone exchange 1) because they are based on an explicit structural mapping. That is, there are parallels between the relationships within one concept and the relationships within the other.

So,

  • to say that the atom is a tiny solar system would just be a metaphor, and
  • to simply state that the atom is like a tiny solar system would be a simile;
  • but to say that the atom is like a tiny solar system because both consist of a more massive central body orbited by much less massive bodies would be an analogy. 2

Read about analogies in science

A medical science analogy

Thomas Goetz describes how, in the nineteenth century, Jean-Antoine Villemin suspected that the disease known as phthisis (tuberculosis, 'T.B.') was passed between people, and that this tended to occur when people were living in crowded conditions. Villemin was an army surgeon and the disease was very common among soldiers, even though they tended to be drawn from younger, healthier members of the population. (This phenomenon continued into the twentieth century long after the cause of the infection was understood. 3)


Heavy horses: it is not just the workload of draught horses that risks their health 4
(Image by Daniel Borker from Pixabay)


Villemin knew that a horse disease, glanders, was often found to spread among horses that were yoked closely together to work in teams, and he suspected something similar was occurring among the enlisted men due to their living and working in close quarters.

"…Jean-Antoine Villemin, a French army surgeon…in the 1860s conducted a series of experiments testing whether tuberculosis could be transmitted from one animal to another. Villemin's interest began when he observed how tuberculosis seemed to affect young men who moved to the city, even though they were previously healthy in their rural homes. He compared the effect to how glanders, a horse disease, seemed to spread when a team [of horses] was yoked together. "The phthisical soldier is to his messmates what the glandered horse is to its yoke fellow", Villemin conjectured."

Goetz, 2013, p.104

To a modern reader this seems an unremarkable suggestion, but that would be an ahistorical evaluation. Glanders is an infectious disease, and so is tuberculosis, so being in close contact with an infected conspecific is clearly a risk factor for being infected. Yet, when Villemin was practising medicine it was not accepted that tuberculosis was infectious, and infectious agents such as bacteria and viruses had not been identified.

Before the identification of the bacterium Mycobacterium tuberculosis as the infectious agent, there was no specific test to demarcate tuberculosis from other diseases. This mattered as although T.B. tends to especially affect the pulmonary system, it can cause a wide range of problems for an infected person. Scrofula, causing swollen lymph nodes, was historically seen as quite distinct from consumption, recognised by bloody coughing, but these are now both recognised as the results of Mycobacterium tuberculosis infection (when the bacterium moves from the lungs into the lymphatic system it leads to the symptoms of scrofula). The bacterium can spread through the bloodstream to cause systemic disease. However, a person may be infected with the bacterium for years before becoming ill. Before the advent of 'germ theory', and the ability to identify specific 'germs', the modern account of tuberculosis as a complex condition with diverse symptoms caused by a single infectious agent was not at all obvious.

The contexts of discovery and justification

Although the analogy with glanders was suggestive to Villemin, this was just the formation of a hypothesis: that T.B. could be passed from one person to another via some form of material transfer during close contact. The context of discovery was the recognition of an analogy, but the context of justification needed to be the laboratory.

Sacrifices for medical science

The basic method for testing the hypothesis consisted of taking diseased animals (today we would say infected, but that was not yet accepted), excising diseased material from their bodies, or taking samples of tissue from diseased people, and introducing it into the bodies of healthy animals. If the healthy animals quickly showed signs of disease, when similar controls remained healthy, it seemed likely that the transfer of material from the diseased animal was the cause.

Although the microbes responsible for T.B. and similar diseases had not been found, autopsy showed irregularities in diseased bodies. The immune system acts to localise the infection and contain it within tissue nodules or granulomas known as 'tubercles'. These tubercles are large enough to be detected and recognised post-mortem.

It was therefore possible to harvest diseased material and introduce it into healthy animals:

"If one shaves a narrow area on the ear of a rabbit or at the groin or on the chest under the elbow of a dog, and then creates a subcutaneous wound so small and so shallow that it does not yield the slightest drop of blood, and then one introduces into this wound, such that it cannot escape, a pinhead-sized packet of tuberculous material obtained from a man, a cow or a rabbit that has already been rendered tuberculous; or if, alternatively, one uses a Pravaz [hypodermic] syringe to instil, under the skin of the animal, a few droplets of sputum from a patient with phthisis…"

Villemin, 1868/2015, p.256

Villemin reports that the tiny wound quickly heals, and then the introduced material cannot be felt beneath the site of introduction. However after a few days:

"a slight swelling is observed, accompanied in some cases by redness and warmth, and one observes the progressive development of a local tubercle of a size between that of a hemp seed and that of a cobnut. When they reach a certain volume, these tubercles generally ulcerate. In some cases, there is an inflammatory reaction…"

Villemin, 1868/2015, p.256

Despite these signs, the animals remain in reasonable health – for a while,

"Only after 15, 20 or 30 days does it become evident that they are losing weight, and have lost their appetite, gaiety and vivacity of movement. Some, after going into decline for a certain period, regain some weight. Others gradually weaken, falling into the doldrums, often suffering from debilitating diarrhoea, finally succumbing to their illness in a state of emaciation."

Villemin, 1868/2015, p.256

In the doldrums

The doldrums refers to oceanic waters within about five degrees of the equator where there are often 'lulls' or calms with no substantial winds. Sailing ships relied on winds to make progress, and ships that were in the doldrums might be becalmed for extended periods, and so unable to make progress, leaving crews listless and frustrated – and possibly running out of essential supplies.

"Down dropt the breeze, the sails dropt down,
'Twas sad as sad could be;
And we did speak only to break
The silence of the sea!

All in a hot and copper sky,
The bloody Sun, at noon,
Right up above the mast did stand,
No bigger than the Moon.

Day after day, day after day,
We stuck, nor breath nor motion;
As idle as a painted ship
Upon a painted ocean.

Water, water, every where,
And all the boards did shrink;
Water, water, every where,
Nor any drop to drink."

Extract from The Rime of the Ancient Mariner, 1834, Samuel Taylor Coleridge

So, the inoculated animals 'fell into the doldrums', metaphorically speaking.

Read about metaphors in science


Under a hot and copper sky
(Image by Youssef Jheir from Pixabay)

The needs of the many are outweighed by the needs of humans

It was widely considered entirely acceptable to sacrifice the lives and well-being of animals in this way, to generate knowledge that it was hoped might help reduce human suffering. 'Animal rights' had not become a mainstream cause (even if animals had occasionally been subject to legal prosecution and sometimes found guilty in European courts – suggesting they had responsibilities if not rights).

Similar experiments were carried out soon after by Robert Koch in his own investigations of T.B. and other diseases. Indeed, Goetz notes that when working on anthrax in 1875,

"As Koch's experiments went on, his backyard menagerie began to thin out; his daughter, Gertrud, grew concerned that she was losing all her pets."

Goetz, 2013, p.27

"Let us hope that daddy can draw conclusions from his experiments soon…"
(Image by Adina Voicu from Pixabay )

Although animals are still used in medical research today, there is much more concern about their welfare and researchers are expected to avoid the suffering and death of more animals than is considered strictly necessary. 5 Wherever possible, alternatives to animal experimentation are preferred.

Inadmissible analogies?

One of the arguments made against animal studies is that as different species are by definition different in their anatomy and physiology, non-human animals are imperfect models for human disease processes. One argument that Villemin faced was that his inoculations between animals were most successful in rabbits, when, it was claimed, rabbits were widely tubercular in the normal population. In other words, it was suggested that Villemin only found evidence of disease in his inoculated test animals because they probably already had the disease anyway.

That suggests the need for some sort of experimental control, and Villemin reported that

"…despite routine sequestration and the tortures that the vivisectionists force them to endure, rabbits are almost never tuberculous. I have explored more than a hundred lungs from these rodents from markets and I found none to be tuberculous."

Villemin, 1868/2015, p.257

Indirect evidence

Villemin had made an analogy between disease transfer between horses and disease transfer between humans. His experiments did not directly test disease transfer between humans – as that would have been considered unethical (and so "absolutely forbidden") even at a time when animal (i.e., non-human animal) research was not widely questioned:

"I believe that I have experimentally demonstrated that phthisis, like syphilis and glanders, is communicable by inoculation. It can be inoculated from humans to certain animals, and from these animals to others of the same species. Can it be inoculated between humans? It is absolutely forbidden for us to provide experimental proof of this, but all the evidence is in favour of an affirmative response."

Villemin, 1868/2015, p.265

So, Villemin did not demonstrate that T.B. could be transferred between people, but only that analogous transfers occurred. So, in a sense, the context of justification, as well as the context of discovery, relied on analogies. Despite this, the indirect evidence was strong, and Villemin's failure to persuade most of the wider scientific community of his arguments likely reflected the general paradigmatic beliefs at the time that disease was caused by hereditary weakness, or through broad environmental conditions, rather than minute amounts of material being transferred between bodies.


Mycobacterium tuberculosis – the infectious agent in tuberculosis – could only be detected once suitable microscopes were available – Koch published his discovery of the bacterium in 1882.

(source: Wikimedia Commons)


Koch was able to be more persuasive because he was also able to actually identify a microbe present in diseased bodies, as well as show inoculation led to the microbe being found in the inoculated animal. That shift in thinking required the acceptance of a different kind of analogy: that the presence, or absence, of a bacterium in the tissues mapped onto being infected with, or free from, a disease.

Mycobacterium tuberculosis (a microscopic 'germ' – only visible under the microscope):
    present in tissues     |     absent in tissues
            ↕︎                          ↕︎
tuberculosis (a widespread and often fatal disease of people and other mammals):
    infected               |     not infected

In a sense, diagnosis through microbiological methods relies on a kind of analogy

Sources cited:
  • Daniel, T. M. (2015). Jean-Antoine Villemin and the infectious nature of tuberculosis. The International Journal of Tuberculosis and Lung Disease, 19(3), 267-268. https://doi.org/10.5588/ijtld.06.0636
  • Frith, J. (2014). History of Tuberculosis. Part 1 – Phthisis, consumption and the White Plague. Journal of Military and Veterans' Health, 22(2), 29-35.
  • Goetz, T. (2013). The Remedy. Robert Koch, Arthur Conan Doyle, and the quest to cure tuberculosis. Gotham Books.
  • Surget, A. (2022). Being between Scylla and Charybdis: designing animal studies in neurosciences and psychiatry – too ethical to be ethical? In Seminar series: Berlin-Bordeaux Working Group Translating Validity in Psychiatric Research.
  • Taber, K. S. (2013). Upper Secondary Students' Understanding of the Basic Physical Interactions in Analogous Atomic and Solar Systems. Research in Science Education, 43(4), 1377-1406. doi:10.1007/s11165-012-9312-3
  • Villemin, J. A. (1868/2015). On the virulence and specificity of tuberculosis [De la virulence et de la spécificité de la tuberculose]. The International Journal of Tuberculosis and Lung Disease, 19(3), 256-266. https://doi.org/10.5588/ijtld.06.0636-v

Notes

1 As analogies link to what is familiar, they tend to reflect cultural contexts. At one time the mind was referred to as being like a slate. The once-common comparison of the brain to a telephone exchange has now largely been displaced by the comparison to a computer.


2 Whilst this is a common teaching analogy, it is also problematic if it is taught without considering the negative aspects of the analogy (e.g. electrons repel each other, unlike planets; planets vary in mass etc.), and if the target concept is not clearly presented as one (simplified) model of atomic structure. See Taber, 2013.


3 "During both World War I and World War II in the US Army, tuberculosis was the leading cause of discharge [i.e., from the service]. Annual incidence of tuberculosis in the military of Western countries is very low, however in the last several decades microepidemics have occurred in small close knit units on US and British Naval warships and land based units deployed overseas. Living and working in close quarters and overseas deployment to tuberculosis-endemic areas of the world such as Afghanistan, Iraq and South-East Asia remain significant risk factors for tuberculosis infection in military personnel, particularly multidrug resistant tuberculosis."

Frith, 2014, p.29

4 Some horses have been bred to be fast runners, and others to be capable of pulling heavy loads. (That is, some have been artificially selected to be like sprinters or cyclists, and others to be like weightlifters or shot-putters.) The latter are variously called draft (U.S. spelling) or draught (British spelling) horses, dray horses, carthorses, work horses or heavy horses. When a load was too heavy to be moved by a single horse, several would be harnessed together into a team – providing more power. Ironically the term 'horsepower' was popularised by James Watt – whose name has since been given to the modern international (S.I.) unit of power – in marketing his steam engines. According to the Institute of Physics,

Whilst the peak mechanical power of a single horse can reach up to 15 horsepower, it is estimated that a typical horse can only sustain an output of 1 horsepower (746 W) for three hours and, if working for an eight-hour day, a horse might output only three quarters of one horsepower. 

https://spark.iop.org/why-one-horsepower-more-power-one-horse
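
As a rough check on what those figures imply, the short sketch below compares the total work done in the two scenarios. It assumes the rounded value of 746 W per horsepower given in the quotation; the function name is purely illustrative.

```python
# Rounded conversion factor taken from the quotation above (an approximation).
WATTS_PER_HORSEPOWER = 746
SECONDS_PER_HOUR = 3600


def work_done_joules(power_hp, hours):
    """Energy delivered (in joules) at a steady power output for a given time."""
    return power_hp * WATTS_PER_HORSEPOWER * hours * SECONDS_PER_HOUR


# Scenario 1: one horsepower sustained for three hours.
short_stint = work_done_joules(1, 3)
# Scenario 2: three quarters of a horsepower over an eight-hour day.
working_day = work_done_joules(0.75, 8)

print(f"3 h at 1 hp:    {short_stint / 1e6:.2f} MJ")   # 8.06 MJ
print(f"8 h at 0.75 hp: {working_day / 1e6:.2f} MJ")   # 16.11 MJ
```

On these figures, the slower eight-hour day delivers twice the total work of the three-hour stint at full output.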

5 Alexandre Surget (Associate Professor at University of Tours, France) has even argued that the guidelines adopted in animal experiments are sometimes counter-productive as they encourage experiments with too few animals, and consequently too little statistical power, to support robust conclusions – in effect sacrificing animals without reasonable expectations of securing sound knowledge (Surget, 2022).

Any research that makes demands of resources and the input of others, but which is designed in such a way that it is unlikely to produce reliable new knowledge, can be considered unethical.
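
The link between group size and statistical power can be illustrated with a minimal sketch. This is not taken from Surget's argument: it assumes a simple two-sided comparison of two equal-sized groups, uses a normal approximation, and the effect size (0.8 standard deviations) and group sizes are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison.

    Uses a normal approximation; effect_size is the difference between
    group means expressed in standard deviations (an assumed value).
    """
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic if the effect is real.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing beyond either critical value.
    return standard_normal.cdf(shift - z_crit) + standard_normal.cdf(-shift - z_crit)


# Hypothetical group sizes, for illustration only.
for n in (5, 10, 26):
    print(f"{n:>2} animals per group: power ≈ {approx_power(0.8, n):.2f}")
```

Under these assumptions, a study with only a handful of animals per group would more often than not miss even a large real effect, which is the sense in which too small a study may sacrifice animals without a reasonable expectation of securing sound knowledge.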

Read about research ethics


Not actually a government event

"It's just the name of the shop, love"


Keith S. Taber



An invitation

This week I received an invitation to chair an event (well, as most weeks, I received several of those, but this one seemed to be actually on a topic I knew a little about…).

Dear Keith,

"It is my pleasure to invite you to chair at Government Events' upcoming event The Delivering Excellence in Teaching STEM in Schools Conference 2023, taking place online on 29th of March 2023.

Chairing would involve giving a short opening and closing address, chairing the Q&A and panel discussions, and providing insights throughout the day.

Invited Speakers Include:

  • Kiera Newmark, Deputy Director for STEM, Digital and EdTech, Department for Education
  • Maria Rossini, Head of Education, British Science Association
  • Sam Parrett, Chief Executive, London South East Colleges

I feel you would add great value and insight to the day and I would be delighted to confirm your involvement in this event! …"

(Well, I claim to know a bit about teaching science, not so much about teaching technology or mathematics, or engineering {that I was not aware was really part of the National Curriculum}.)

Read about STEM education

So, at face value this would be a government-sponsored event, including a senior representative from the ministry of education – so perhaps another chance for me to lobby to have the embarrassing errors in the English National Curriculum for science corrected – as well as a leading executive from the 'British Ass'. 1

My initial reaction was mixed. This was clearly an important topic, and one where I probably was qualified to act as chair and might be able to make a useful contribution. And it was on-line, so I would not have to travel. Then again, I retired from teaching because I suffer from regular bouts of fatigue, and find I have to limit my high intensity inputs as I tire very easily and quickly these days. Chairing a session might not completely drain me, but a whole conference?

Due diligence

Finding myself tempted, I felt the need to do some due diligence. Was this really what it seemed? What would be involved?

The invitation seemed genuine enough, even if it included one of those dodgy legalese footers sometimes used by scam artists to put people off sharing dubious messages. (The 'you must not tell anyone' trope reminds me of what fictional blackmailers say about not contacting the police.)


A rather silly legal disclaimer.
It seems from the wording, presumably carefully chosen by the legal department, that this disclaimer only applies to "email (which included any attachment)" – whereas mine did not.

This one suggested that if I had received the message in error I should

  • permanently delete it from my computer without making a copy
  • then reply to say I had done so

I will leave the reader to spot the obvious problem there.

However, this lack of clear logic did remind me of the similarly bizarre statement about the conservation of energy in the National Curriculum, which perhaps gave some credence to this being a government event.

Luckily, I was the intended recipient, but in any case I take the view that if someone sends me an unsolicited email, then they have no right to tell me what to do with it, and as in this case I discovered they had already announced the invitation on their website (see image above), I could not see how any court would uphold their claim that the message was confidential.

Government events?

I was well aware that an event being organised by an entity called "Government Events" was no assurance that this really was a government event. So I checked out the website. (The lack of any link in the invitation email to the event webpage, or indeed the organisation more generally, might have been an oversight, but seemed odd.)

As you will have likely guessed, this was not a government event.

In situations such as this I am always put in mind of the 'song' 'Shirt' by the dadaist-inspired Bonzo Dog Doo-dah Band, which included a joke about a man who takes his shirt to a dry cleaner for 'express cleaning' and is then told it will be ready for collection in three weeks. 2

Three weeks!? But, the sign outside says '59 minute cleaning'

Yes, that's just the name of the shop, love.

On searching out the website I found that "Government Events" claims to be "Supporting UK Public Sector Teams to Deliver World Class Public Services" and to be "the UK's No. 1 provider of public sector conferences, training courses and case study focused insights and intelligence". By "public sector conferences" they presumably mean conferences for the public sector, not conferences in the public sector.

It transpired that "Government Events" is one brand of an organisation called "Professional Development Group". That organisation has a webpage featuring members of its "Advisory Board [which] is made up of senior executives and academics from both corporate and public-sector background" but its website did not seem to provide any information about its governance or who its executives or owners were. (Professional Development Group does have a listing in the Companies House registry showing two current directors.)


Possibly the senior leadership team at Professional Development Group? But probably not.


A bias against the private sector?

Perhaps I am simply displaying my personal bias towards the public sector? I've worked in state schools, a state college, and a state university. I have worked in the private sector if we include after-school, and holiday, jobs (mainly for Boots the Chemist or Boots the Pharmacist as they should be known), but my career is very much public sector. And I've not liked what I've seen as the inappropriate and misguided attempts to make the health and education services behave like a series of competing units in a free market. (And do not 'get me going' on the state of utilities in England now – the selling off of state assets at discounted rates to profit-making concerns (seemingly to fund temporary tax cuts for electoral advantage), and so replacing unitary authorities (with no need to budget for continuously needing to advertise and to try to poach each other's customers) by dozens of competing and, recently, often failing, profit-making companies that often own each other or are owned overseas.)

So, although I have no problem with the private sector, which no doubt does some things better, I am suspicious of core 'public sector' activities being moved into the commercial sphere.

Perhaps "Government Events" does a good job despite the misleading name. After all, they are kite-marked by an organisation called the CPD Certification Service (a trademark of The CPD Certification Service Ltd, so another privately owned company). Again, the website did not give any information about governance or The CPD Certification Service Ltd's executives, but four directors are named in the public listing at Companies House. This all seems alien to someone from the public sector where organisations go out of their way to provide such information, and value transparency in such matters. (Three of the four directors share the same family name, 'Savage', which might raise some questions in a publicly governed organisation.)

A bit pricey for an on-line meeting?

But even if Professional Development Group do a wonderful job, do they offer value for money?

The conference is aimed at "teachers who work in STEM and senior leadership representatives from schools". If they work in state schools the cost per delegate is £399.00 (plus V.A.T., but schools should be able to reclaim that). For that they get a one-day on-line conference. The chair (currently listed as "Keith Taber, Professor of Science Education, University of Cambridge (invited)", but that will need to be changed*) is due to open the event at 09.50, and to wind it up with some closing remarks at 16.20. The £399 will presumably not include accommodation, refreshments, lunch, a notepad, a branded pen, a tote bag for the goodies, or any of the other features of face-to-face events.

It will include a chance to hear a range of speakers. Currently listed (caveat emptor: "programme subject to change without notice") are ten specified presentations as well as two Key Supporter Sessions (!). The advertised topics seem valuable:

  • National Trends and Updates on Boosting the Profile of STEM Subjects in Schools
  • Best Practice for Implementing Flexible Working to Help Recruit and Retain STEM Teachers
  • Providing an Inclusive and Accessible STEM Curriculum for Pupils with SEND
  • Driving Increased Interest and Participation in STEM Among Female Students
  • Encouraging Students from Disadvantaged Backgrounds to Study STEM Subjects
  • Taking Action to Boost Extracurricular Engagement with STEM Subjects
  • Primary: Implementing a Whole School Approach to Boost the Profile of STEM Subjects
  • Secondary: Supporting Students to Succeed and Improve Outcomes in STEM Examinations
  • Primary: Partnership Working to Promote STEM Education in Primary Schools
  • Secondary: Working with Employers and Universities to Encourage Post-Secondary STEM

However, anyone looking to book should notice that at this point only one of the ten mooted speakers has confirmed – the rest are 'invited'.

I was also intrigued by the two slots reserved for 'Key Supporter Sessions'. You, dear reader, could buy ('sponsor') one of these slots to talk at the conference.

You can sponsor the conference

Professional Development Group offer "sponsorship and exhibition packages" for their events. This would allow a sponsor to meet their "target audience", to "demonstrate your products or services" and even to "speak alongside industry [sic] leading experts".

Someone wishing to invest as a Key Supporter (pricing not disclosed) gets branding on the Website and Event Guide and a "20-minute speaking slot followed by Q&A". (For this specific conference it seems you could buy time to sell your wares in the 10.40 slot or the 13.55 slot.)

  • Perhaps you have invented a new type of perpetual motion kit for use in the classroom and are seeking an opportunity to demonstrate and market your wares? ["demonstrate your products or services"]
  • Perhaps you think that evolution is not really science because it is only a theory, and you want to subject delegates to a diatribe on why impressionable young people should not be indoctrinated with such dangerous speculations? ["speak alongside industry [sic] leading experts"]
  • Perhaps your company mines and refines uranium ore, and is looking to find a market for the vast amounts of fine slag produced, and think it might make an excellent modelling material for use in design and technology classes? [meet "your target audience"]

A Strategic Headline Sponsor at a Professional Development Group event can also purchase other features such as a "pre show marketing email to all registered delegates". I guess the terms and conditions of signing up to a Professional Development Group event mean delegates agree to receive such sponsored advertising.

What's wrong with selling conference slots?

There is nothing inherently immoral about selling slots at a commercial conference – after all, it is a commercial event – so, it is primarily about 'the bottom line' of the balance sheet. But that's my point. This would be unacceptable at an academic conference, where some speakers are invited because they are considered to have something relevant to say, and others wishing to present have to submit their proposals to peer review.

What I find, if not immoral, certainly distasteful here, is that an on-line conference of the kind that would likely be arranged for free or for a nominal fee in an academic context, is being priced at £399 for state school teachers at a time when public services are under immense pressures and budgets need to be very wisely spent. How can this price be justified?

Perhaps the speaker fees are a significant cost. But I doubt that: I was not offered any fee to give up a day of my time to chair the meeting, and so I expect the other speakers are also being expected to speak for free. That's how things usually work in academia and the state sector. (But if this is a commercial activity, then the professional speakers SHOULD ask for a fee. If they are taking time out of school, and so already being paid, then perhaps the fee could be used to buy school books or pay for supply teachers?) Indeed, there are two slots for fee-paying speakers who wish to advertise their wares.

So, this is perhaps not actually a scam, but it does not meet the standards of honesty and transparency I would expect in the state sector (because it is only masquerading as state sector), and the event seems to be priced in order to make money for shareholders, not primarily to meet a mission of "Supporting UK Public Sector Teams".

If the COVID pandemic taught us anything, it is that many (probably not all, but surely most) meetings can be held just as well on line, so avoiding all the time, money and carbon footprints of moving people around the country. Oh, and consequently, it showed us that most of these meetings (a) can be offered for free where they are hosted by a public sector organisation that can consider them as meeting part of its core mission; and (b) that even when that does not apply, and so costs have to be covered, they can be arranged for a fraction of the expense of a face-to-face event at a hired venue.

As you may have guessed, I declined.*


* I replied to decline this opportunity on 19th November. Checking on 25th November, I see I am still listed as Chair (invited). See note 1



Notes

1 In the academic world, the term 'invited speaker' is used to designate a conference speaker who was invited by the organisers in contrast to a speaker who applied to speak and proposed a contribution in response to an open call. However, 'invited speaker' here seems to mean someone who has been invited to speak, in contrast to someone who has agreed to.


2 I have a pretty poor memory, but do recall seeing Bonzo stalwart Neil Innes play at Nottingham University when I was a student. He sang their most successful song, "I'm the urban spaceman" (which reached no. 5 in the UK singles chart and led to Innes getting an Ivor Novello award for his song-writing), then announced, deadpan, "that was a medley of hit".

The Bonzos

Out of the womb of darkness

Medical ethics in 20th Century movies


Keith S. Taber


The hero of the film, Dr Holden, is presented as a scientist. Here he is trying to collect some data.
(still from 'The Night of the Demon')

"The Night of the Demon" is a 1957 British film about an American professor who visits England to investigate a supposed satanic cult. It was just shown on English television. It was considered as a horror film at the time of its release, although the short scenes that actually feature a (supposedly real? merely imagined? *) monster are laughable today (think Star Trek's Gorn in the original series, and consider if it is believable as anything other than an actor wearing a lizard suit – and you get the level of horror involved). [*Apparently the director, Jacques Tourneur, never intended a demon to be shown, but the film's producer decided to add footage showing the monster in the opening scenes, potentially undermining the whole point of the film: but giving the publicity department something they could work with. 6]


A real scary demon (in 1957) and a convincing alien (in 1967)?
(stills from 'The Night of the Demon' and 'Star Trek' episode 'Arena')

The film's protagonist is a psychologist, Dr. John Holden, who dismisses stories of demons and witchcraft and the like, and has made a career studying people's beliefs about such superstitions. Dr Holden's visit to Britain deliberately coincided with a conference at which he was to present, as well as coincidentally with the death of one of his colleagues (who had been subject to a hex for investigating the cult).


'Night of the Demon' (Dir. Jacques Tourneur) movie poster: Sabre Film Production.
[As was common at the time, although the film was in monochrome, the publicity was coloured. Whether the colour painting of the monster looks even less scary than the version in the film itself is a moot point.]

The film works much better as a kind of psychological thriller examining the power of beliefs, than as horror. (Director: 1 – Producer: 0.) That, if we believe something enough, it can have real effects is well acknowledged – but this does not need a supernatural explanation. People can be 'scared' to death by what they imagine, and how they respond to their fears. Researchers expecting a positive outcome from their research are likely to inadvertently behave in ways that lead to this very result: thus the use of double blind studies in medical trials, so that the researchers do not know which patients are receiving which treatment.

Read about expectancy effects in research

While the modern viewer will find little of suspense in the film, I did metaphorically at least 'recoil with shock' from one moment of 'horror'. At the conference a patient (Rand Hobart) is wheeled in on a trolley – someone suspected of having committed a murder associated with the cult, whom the authorities had allowed to be questioned by the researchers…at the conference.


"The authorities have lent me this suspected murderer for the benefit of dramatic effect and for plot development purposes"
(still from 'The Night of the Demon').

A variety of movie posters were produced for the film 6 – arguably this one reflects the genuinely horrific aspect of the story. To a modern viewer this might also appear the most honest representation of the film as the demon given prominence in some versions of the poster barely features in the film.

Holden's British colleague, Professor O'Brien, explains to the delegates,

"For a period of time this man has been as you see him here. He fails to respond to any normal stimulation. His experience, whatever it was, which we hope here to discover, has left him in a state of absolute catatonic immobility. When I first investigated this case, the problem of how to hypnotise an unresponsive person was the major one. Now the proceedings may be somewhat dramatic, but they are necessary. The only way of bringing his mind out of the womb of darkness into which it has retreated to protect itself, is by therapeutic shock, electrical or chemical. For our purposes we are today using pentothal [? 1] and later methylamphetamine."

Introducing a demonstration of non-consensual use of drugs on a prisoner/patient

"Okay, we'll give him a barbiturate, then we'll hypnotise him, then a stimulant, and if that does not kill him, surely he will simply, calmly and rationally, tell us what so traumatised him that he has completely withdrawn into his subconscious."
(Still from 'The Night of the Demon')


After an injection, Hobart comes out of his catatonic state, becomes aware of his surroundings, and panics.

The dignity of the accused: Hobart is forced out of his 'state of absolute catatonic immobility' to discover he is an exhibit at a scientific conference.
(Still from 'The Night of the Demon'.)

He is physically restrained, and examined by Holden (supposedly the 'hero' of the piece), who then hypnotises him.



He is then given an injection of methylamphetamine before being questioned by O'Brien and Holden. He becomes agitated (what, after being forcibly given 'speed'?), breaks free, and leaps, out of a conveniently placed window, to his death.

Now, of course, this is all just fiction – a story. No one is really drugged, and Hobart is played by an actor who is unharmed. (I can be fairly sure of that as the part was played by Brian Wilde who much later turned up alive and well as prison officer 'Mr Barrowclough' in BBC's Ronnie Barker vehicle 'Porridge'.)


The magic of the movies – people do not stay dead, and there are no professional misconduct charges brought against our hero.
(Stills from 'The Night of the Demon' and from BBC series 'Porridge'. 3)

Yet this is not some fantastical film (the Gorn's distant cousin aside) but played for realism. Would a psychiatric patient and murder suspect have been released to be paraded and demonstrated at a conference on the paranormal in 1957? I expect not. Would the presenters have been allowed to drug Hobart without his consent?

Read about voluntary, informed, consent

An adult cannot normally be medicated without their consent unless they are considered to lack the ability to make responsible decisions for themselves. Today, it might be possible to give a patient drugs without consent if they have been sectioned under the Mental Health Act (1983) and it was considered the action was necessary for their safety or for the safety of others. Hobart was certainly not an immediate threat to anyone before he was brought out of his trance.

However, even if this enforced use of drugs was sanctioned, this would not be done in a public place with dozens of onlookers. 4 And it would not be done (in the U.K. at least!) simply to question someone about a crime. 5 Presumably, the makers of the film either thought that this scene reflected something quite reasonable, or, at least, that the cinema-going public would find this sufficiently feasible to suspend disbelief. If this fictitious episode did not reflect acceptable ethical standards at the time, it would seem to tell us something about public perceptions of the attitude of those in authority (whether the actual authorities who were meant to have a duty of care to a person under arrest, or those designated with professional roles and academic titles) to human rights.

Today, however, professionals such as researchers, doctors, and even teachers, are prepared for their work with a strong emphasis on professional ethics. In medical care, the interest of the patient themselves comes first. In research, informants are voluntary participants in our studies, who offer us the gift of data, and are not subjects of our enquiries to be treated simply as available material for our work.

Yet, actually, this is largely a modern perspective that has developed in recent decades, and sadly there are many real stories, even in living memory, of professionals deciding that people (and this usually meant people with less standing or power in their society) should be drugged, or shocked, or operated on, without their consent and even against their explicit wishes; for what is seen as their own, or even what is judged as some greater, good; in circumstances where it would be totally unacceptable in most countries these days.

So, although this is not really a horror film by today's measures, I hope any other researchers (or medical practitioners) who were watching the film shared my own reaction to this scene: 'no, they cannot do that!'

At least, they could not do that today.

Read about research ethics


Notes

1 This sounds to me like 'pentatyl', but I could not find any reference to a therapeutic drug of that name. Fentanyl is a powerful anti-pain drug, which like amphetamines is abused for recreational use – but was only introduced into practice the year after the film was made. It was most likely referring to sodium thiopental, known as pentothal, and much used (in movies and television, at least) as a truth serum. 5 As it is a barbiturate, and so is used in anaesthesia, it does not seem an obvious drug of choice to wake someone from a catatonic state.


2 The script is based loosely on a 1911 M. R. James short story, 'Casting the Runes' that does not include the episode discussed.


3 I have flipped this image (as can be seen from the newspaper) to put Wilde (playing alongside Ronnie Barker, standing, and Richard Beckinsale), on the right hand side of the picture.


4 Which is not to claim that such a public demonstration would have been unlikely at another time and place. Execution was still used in the U.K. until 1964 (during my lifetime), although by that time being found guilty of vagrancy (being unemployed and hanging around {unfortunate pun unintended}) for the second time was no longer a capital offence. However, after 1868 executions were no longer carried out in public.

It was not unknown for the corpses of executed criminals to be subject to public dissection in Renaissance [sic, ironically] Europe.


5 Fiction, of course, has myriad scenes where 'truth drugs' are used to obtain secrets from prisoners – but usually those carrying out the torture are the 'bad guys', either criminals or agents of what is represented in the story as an enemy or dystopian state.


6 Some variations on a theme. (For some reason, for its slightly cut U.S. release 'The Night of the Demon' was called 'The Curse of the Demon'.) The various representations of the demon and the prominence given to it seem odd to a modern viewer given how little the demon actually features in the film.

The references to actually seeing demons and monsters from hell on the screen, "the most terrifying story ever told", and "scenes of terror never before imagined" raise the question of whether the copywriters were expected to watch a film before producing their copy.

Passive learners in unethical control conditions

When 'direct instruction' just becomes poor instruction


Keith S. Taber


An experiment that has been set up to ensure the control condition fails, and so compares an innovation with a substandard teaching condition, can – at best – only show the innovation is not as bad as the substandard teaching

One of the things which angers me when I read research papers is examples of what I think of as 'rhetorical research' that use unethical control conditions (Taber, 2019). That is, educational research which sets up one group of students to be taught in a way that clearly disadvantages them, to ensure the success of an experimental teaching approach:

"I am suggesting that some of the experimental studies reported in the literature are rhetorical in the … sense that the researchers clearly expect to demonstrate a well- established effect, albeit in a specific context where it has not previously been demonstrated. The general form of the question 'will this much-tested teaching approach also work here' is clearly set up expecting the answer 'yes'. Indeed, control conditions may be chosen to give the experiment the best possible chance of producing a positive outcome for the experimental treatment."

Taber, 2019, p.108

This irks me for two reasons. The first, obviously, is that researchers have been prepared to (ab)use learners as 'data fodder' and subject them to poor learning contexts in order to have the best chance of getting positive results for the innovation supposedly being 'tested'. However, it also annoys me as this is inherently a poor research design (and so a poor use of resources) as it severely limits what can be found out. An experiment that compares an innovation with a substandard teaching condition can, at best, show the innovation is not as ineffective as the substandard teaching in the control condition; but it cannot tell us if the innovation is at least as effective as existing good practice.

This irritation is compounded when the work I am reading is not some amateur report thrown together for a predatory journal, but an otherwise serious study published in a good research outlet. That was certainly the case for a paper I read today in Research in Science Education (the journal of the Australasian Science Education Research Association) on problem-based learning (Tarhan, Ayar-Kayali, Urek & Acar, 2008).

Rhetorical studies?

Genuine research is undertaken to find something out. The researchers in this enquiry claim:

"This research study aims to examine the effectiveness of a [sic] problem-based learning [PbBL] on 9th grade students' understanding of intermolecular forces (dipole- dipole forces, London dispersion forces and hydrogen bonding)."

Tarhan, et al., 2008, p.285

But they choose to compare PbBL with a teaching approach that they expect to be ineffective. Here the researchers might have asked "how does teaching year 9 students about intermolecular forces through problem-based learning compare with current good practice?" After all, even if PbBL worked quite well, if it is not quite as effective as the way teachers are currently teaching the topic then, all other things being equal, there is no reason to shift to it; whereas if it outperforms even our best current approaches, then there is a reason to recommend it to teachers and roll out associated professional development opportunities.


Problem-based learning (third column) uses a problem (i.e., a task which cannot be solved simply by recalling prior learning or employing an algorithmic routine) as the focus and motivation for learning about a topic

Of course, that over-simplifies the situation, as in education, 'all other things' never are equal (every school, class, teacher…is unique). An approach that works best on average will not work best everywhere. But knowing what works best on average (that is, taken across the diverse range of teaching and learning contexts) is certainly a very useful starting point when teachers want to consider what might work best in their own classrooms.

Rhetorical research is poor research, as it is set up (deliberately or inadvertently) to demonstrate a particular outcome, and, so, has built-in bias. In the case of experimental studies, this often means choosing an ineffective instructional approach for the comparison class. Why else would researchers select a control condition they know is not suitable for bringing about the educational outcomes they are testing for?

Problem-Based Learning in a 9th Grade Chemistry Class

Tarhan and colleagues' study was undertaken in one school with 78 students divided into two groups. One group was taught through a sequence based on problem-based learning that involved students undertaking research in groups, gently supported and steered by the teacher. The approach allowed student dialogue, which is believed to be valuable in learning, and motivated students to be actively engaged in enquiry. When such an approach is well judged it has potential to count as 'scaffolding' of learning. This seems a very worthwhile innovation – well worth developing and evaluating.

Of course, work in one school cannot be assumed to generalise elsewhere, and small-scale experimental work of this kind is open to major threats to validity, such as expectancy effects and researcher bias – but this is unfortunately always true of these kinds of studies (which are often all educational researchers are resourced to carry out). Finding out what works best in some educational context at least potentially contributes to building up an overall picture (Taber, 2019). 1

Why is this rhetorical research?

I consider this rhetorical research because of the claims the authors make at the start of the study:

"Research in science education therefore has focused on applying active learning techniques, which ensure the affective construction of knowledge, prevent the formation of alternate conceptions, and remedy existing alternate conceptions…Other studies suggest that active learning methods increase learning achievement by requiring students to play a more active role in the learning process…According to active learning principles, which emphasise constructivism, students must engage in researching, reasoning, critical thinking, decision making, analysis and synthesis during construction of their knowledge."

Tarhan, et al., 2008, pp.285-286

If they genuinely believed that, then to test the effectiveness of their PbBL activity, Tarhan and colleagues needed to compare it with some other teaching condition that they are confident can "ensure the affective construction of knowledge, prevent the formation of alternate conceptions, and remedy existing alternate conceptions… requir[e] students to play a more active role in the learning process…[and] engage in researching, reasoning, critical thinking, decision making, analysis and synthesis during construction of their knowledge." A failure to do that means that the 'experiment' has been biased – it has been set up to ensure the control condition fails.

Unethical research?

"In most educational research experiments of [this] type…potential harm is likely to be limited to subjecting students (and teachers) to conditions where teaching may be less effective, and perhaps demotivating. This may happen in experimental treatments with genuine innovations (given the nature of research). It can also potentially occur in control conditions if students are subjected to teaching inputs of low effectiveness when better alternatives were available. This may be judged only a modest level of harm, but – given that the whole purpose of experiments to test teaching innovations is to facilitate improvements in teaching effectiveness – this possibility should be taken seriously."

Taber, 2019, p.94

The same teacher taught both classes: "Both of the groups were taught by the same chemistry teacher, who was experienced in active learning and PbBL" (p.288). This would seem to reduce the 'teacher effect' – outcomes being affected because the teacher of one class is more effective than that of another. (Reduce, rather than eliminate, as different teachers have different styles, skills, and varied expertise: so, most teachers are more suited to, and competent in, some teaching approaches than others.)

So, this teacher was certainly capable of teaching in the ways that Tarhan and colleagues claim as necessary for effective learning ("active learning techniques"). However, the control condition sets up the opposite of active learning, so-called passive learning:

"In this study, the control group was taught the same topics as the experimental group using a teacher-centred traditional didactic lecture format. Teaching strategies were dependent on teacher expression and question-answer format. However, students were passive participants during the lessons and they only listened and took notes as the teacher lectured on the content.

The lesson was begun with teacher explanation about polar and nonpolar covalent bonding. She defined formation of dipole-dipole forces between polar molecules. She explained that because of the difference in electronegativities between the H and Cl atoms for HCl molecule is 0.9, they are polar molecules and there are dipole-dipole forces between HCl molecules. She also stated that the intermolecular dipole-dipole forces are weaker than intramolecular bonds such as covalent and ionic bonding. She gave the example of vaporisation and decomposition of HCl. She explained that while 16 kJ/mol of energy is needed to overcome the intermolecular attraction between HCl molecules in liquid HCl during vaporisation process of HCl, 431 kJ/mol of energy is required to break the covalent bond between the H and Cl atoms in the HCl molecule. In the other lesson, the teacher reminded the students of dipole-dipole forces and then considered London dispersion forces as weak intermolecular forces that arise from the attractive force between instantaneous dipole in nonpolar molecules. She gave the examples of F2, Cl2, Br2, I2 and said that because the differences in electronegativity for these examples are zero, these molecules are non-polar and had intermolecular London dispersion forces. The effects of molecular size and mass on the strengths of London dispersion forces were discussed on the same examples. She compared the strengths of dipole-dipole forces and London dispersion forces by explaining the differences in melting and boiling points for polar (MgO, HCl and NO) and non-polar molecules (F2, Cl2, Br2, and I2). The teacher classified London dispersion forces and dipole- dipole as van der Waals forces, and indicated that there are both London dispersion forces and dipole-dipole forces between polar molecules and only London dispersion forces between nonpolar molecules. 
In the last lesson, teacher called attention to the differences in boiling points of H2O and H2S and defined hydrogen bonds as the other intermolecular forces besides dipole-dipole and London dispersion forces. Strengths of hydrogen bonds depending on molecular properties were explained and compared in HF, NH3 and H2O. She gave some examples of intermolecular forces in daily life. The lesson was concluded with a comparison of intermolecular forces with each other and intramolecular forces."

Tarhan, et al., 2008, p.293

Lecturing is not ideal for teaching university students. It is generally not suitable for teaching school children (and it is not consistent with what is expected in Turkish schools).

This was a lost opportunity to seriously evaluate the teaching through PbBL by comparing with teaching that followed the national policy recommendations. Moreover, it was a dereliction of the duty that educators should never deliberately disadvantage learners. It is reasonable to experiment with children's learning when you feel there is a good chance of positive outcomes: it is not acceptable to deliberately set up learners to fail (e.g., by organising 'passive' learning when you claim to believe effective learning activities are necessarily 'active').

Isn't this 'direct instruction'?

Now, perhaps the account of the teaching given by Tarhan and colleagues might seem to fit the label of 'direct teaching'. Whilst Tarhan et al. claim constructivist teaching is clearly necessary for effective learning, there are some educators who claim that constructivist approaches are inferior, and a more direct approach, 'direct instruction', is more likely to lead to learning gains.

This has been a lively debate, but often the various commentators use terminology differently and argue across each other (Taber, 2010). The proponents of direct instruction often criticise teaching that expects learners to take nearly all the responsibility for learning, with minimal teacher support. I would also criticise that (except perhaps in the case of graduate research students once they have demonstrated their competence, including knowing when to seek supervisory guidance). That is quite unlike genuine constructivist teaching which is optimally guided (Taber, 2011): where the teacher manages activities, constantly monitors learner progress, and intervenes with various forms of direction and support as needed. Tarhan and colleagues' description of their problem-based learning experimental condition appears to have had this kind of guidance:

"The teacher visited each group briefly, and steered students appropriately by using some guiding questions and encouraging them to generate their hypothesis. The teacher also stimulated the students to gain more information on topics such as the polar structure of molecules, differences in electronegativity, electron number, atom size and the relationship between these parameters and melting-boiling points…The teacher encouraged students to discuss the differences in melting and boiling points for polar and non-polar molecules. The students came up with [their] research questions under the guidance of the teacher…"

Tarhan, et al., 2008, pp.290-291

By contrast, descriptions of effective direct instruction do involve tightly planned teaching with carefully scripted teacher moves of the kind quoted in the account, above, of the control condition. (But any wise teacher knows that lessons can only be scripted as a provisional plan: the teacher has to constantly check the learners are making sense of teaching as intended, and must be prepared to change pace, repeat sections, re-order or substitute activities, invent new analogies and examples, and so forth.)

However, this instruction is not simply a one-way transfer of information, but rather a teacher-led process that engages students in active learning to process the material being introduced by the teacher. If this is done by breaking the material into manageable learning quanta, each of which students engage with in dialogic learning activities before proceeding to the next, then this is constructivist teaching (even if it may also be considered by some as 'direct instruction'!)


Effective teaching moves between teacher input and student activities and is not just the teacher communicating information to the learners.

By contrast, the lecture format adopted by Tarhan's team was based on the teacher offering a multi-step argument (delivered over several lessons) and asking the learners to follow and retain an extensive presentation.

"The lesson was begun with teacher explanation …

She defined …

She explained…

She also stated…

She gave the example …

She explained that …

the teacher reminded the students …

She gave the examples of …

She compared…

The teacher classified …

and indicated that …

[the] teacher called attention to …

She gave some examples of …"

Tarhan, et al., 2008, p.293

This is a description of the transmission of information through a communication channel: not an account of teaching which engages with students' thinking and guides them to new understandings.

Ethical review

Despite the paper having been published in a major journal, Research in Science Education, there seems to be no mention that the study design has been through any kind of institutional ethical review before the research began. Moreover, there is no reference to the learners or their parents/guardians having been asked for, or having given, voluntary, informed, consent, as is usually required in research with human participants. Indeed, Tarhan and colleagues refer to the children as the 'subjects' of their research, not participants in their study.

Perhaps ethical review was not expected in the national context (at least, in 2008). Certainly, it is difficult to imagine how voluntary, informed, consent would be obtained if parents were to be informed that half of the learners would be deliberately subject to a teaching approach the researchers claim lacks any of the features "students must engage in…during construction of their knowledge".

PbBL is better than…deliberately teaching in a way designed to limit learning

Tarhan and colleagues, unsurprisingly, report that on a post-test the students who were taught through PbBL out-performed those students who were lectured at. It would have been very surprising (and so potentially more interesting, and, perhaps, even useful, research!) had they found anything else, given the way the research was biased.

So, to summarise:

  1. At the outset of the paper it is reported that it is already established that effective learning requires students to engage in active learning tasks.
  2. Students in the experimental condition undertook learning through a PbBL sequence designed to engage them in active learning.
  3. Students in the control condition were subject to a sequence of lecturing inputs designed to ensure they were passive.
  4. Students in the active learning condition outperformed the students in the passive learning condition.

This, I suggest, can be considered both rhetorical research and unethical.


The study can be considered both rhetorical and unfair to the learners assigned to be in the control group

Read about rhetorical experiments

Read about unethical control conditions


Work cited:

Note:

1 There is a major issue which is often ignored in studies of this type (where a pedagogical innovation is trialled in a single school area, school or classroom). Finding that problem-based learning (or whatever) is effective in one school when teaching one topic to one year group does not allow us to generalise to other classrooms, schools, countries, educational levels, topics and disciplines.

Indeed, as every school, every teacher, every class, etc., is unique in some ways, it might be argued that one only really finds out if an approach will work well 'here' by trying it out 'here' – and whether it is universally applicable by trying it everywhere. Clearly academic researchers cannot carry out such a programme, but individual teachers and departments can try out promising approaches for themselves (i.e., context-directed research, such as 'action research').

We might ask if there is any point in researchers carrying out studies of the type discussed in this article, where they start by saying an approach has been widely demonstrated, and then test it in what seems an arbitrarily chosen (or, more likely, convenient) curriculum and classroom context, given that we cannot generalise from individual studies, and it is not viable to test every possible context.

However, there are some sensible guidelines for how series of such studies into the same type of pedagogic innovation in different contexts can be more useful in (a) helping determine the range of contexts where an approach is effective (through what we might call 'incremental generalisation'), and (b) documenting the research contexts in sufficient detail to support readers in making judgements about the degree of similarity with their own teaching context (Taber, 2019).

Read about replication studies

Read about incremental generalisation

Cells are buzzing cities that are balloons with harpoons

What can either wander door to door, or rush to respond; and when it arrives might touch, sniff, nip, rear up, stroke, seal, or kill?


Keith S. Taber


a science teacher would need to be more circumspect in throwing some of these metaphors out there, without then doing some work to transition from them to more technical, literal, and canonical accounts


BBC Radio 4's 'Start the week' programme is not a science programme, but tends to invite in guests (often authors of some kind) each week according to some common theme. This week there was a science theme and the episode was titled 'Building the Body, Opening the Heart', and was fascinating. It also offers something of a case study in how science gets communicated in the media.


Building the Body, Opening the Heart

The guests all had life-science backgrounds:

Their host was geneticist and broadcaster Adam Rutherford.

Communicating science through the media

As a science educator I listen to science programmes both to enhance and update my own science knowledge and understanding, but also to hear how experts present scientific ideas when communicating to a general audience. Although neither science popularisation nor the work of scientists in communicating to the public is entirely the same as formal teaching (for example,

  • there is no curriculum with specified target knowledge; and
  • the audiences
    • are not well-defined,
    • are usually much more diverse than found in classrooms, and
    • are free to leave at any point they lose interest or get a better offer),

they are, like teachers, seeking to inform and explain science.

Science communicators, whether professional journalists or scientists popularising their work, face similar challenges to science teachers in getting across often complex and abstract ideas; and, like them, need to make the unfamiliar familiar. Science teachers are taught about how they need to connect new material with the learners' prior knowledge and experiences if it is to make sense to the students. But successful broadcasters and popularisers also know they need to do this, using such tactics as simplification, modelling, metaphor and simile, analogy, teleology, anthropomorphism and narrative.

Perhaps one of the biggest differences between science teaching and science communication in the media is the ultimate criterion of success. For science teachers this is (sadly) usually, primarily at least, whether students have understood the material, and will later recall it, sufficiently to demonstrate target knowledge in exams. The teacher may prefer to focus on whether students enjoy science, or develop good attitudes to science, or will consider working in science: but, even so, they are usually held to account for students' performance levels in high-stakes tests.

Science journalists and popularisers do not need to worry about that. Rather, they have to be sufficiently engaging for the audience to feel they are learning something of interest and understanding it. Of course, teachers certainly need to be engaging as well, but they cannot compromise what is taught, and how it is understood, in order to entertain.

With that in mind, I was fascinated at the range of ways the panel of guests communicated the science in this radio show. Much of the programme had a focus on cells – and these were described in a variety of ways.

Talking about cells

Dr Rutherford introduced cells as

  • "the basic building blocks of life on earth"; and observed that he had
  • "spent much of my life staring down microscopes at these funny, sort of mundane, unremarkable, gloopy balloons"; before suggesting that cells were
  • "actually really these incredible cities buzzing with activity".

Dr. Mukherjee noted that

"they're fantastical living machines" [where a cell is the] "smallest unit of life…and these units were built, as it were, part upon part like you would build a Lego kit"

Listeners were told how Robert Hooke named 'cells' after observing cork under the microscope because the material looked like a series of small rooms (like the cells where monks slept in monasteries). Hooke (1665) reported,

"I took a good clear piece of Cork, and with a Pen-knife sharpen'd as keen as a Razor, I cut a piece of it off, and…cut off from the former smooth surface an exceeding thin piece of it, and…I could exceeding plainly perceive it to be all perforated and porous, much like a Honey-comb, but that the pores of it were not regular; yet it was not unlike a Honey-comb in these particulars

…these pores, or cells, were not very deep, but consisted of a great many little Boxes, separated out of one continued long pore, by certain Diaphragms, as is visible by the Figure B, which represents a sight of those pores split the long-ways."

Robert Hooke

Hooke's drawing of the 'pores' or 'cells' in cork

Components of cells

Dr. Mukherjee described how

"In my book I sort of board the cell as though it's a spacecraft, you will see that it's in fact organised into rooms and there are byways and channels and of course all of these organelles which allow it to work."

We were told that "the cell has its own skeleton", and that the organelles included the mitochondria and nuclei,

"[mitochondria] are the energy producing organelles, they make energy in most cells, our cells for instance, in human cells. In human cells there's a nucleus, which stores DNA, which is where all the genetic information is stored."


A cell that secretes antibodies which are like harpoons or missiles that it sends out to kill a pathogen?

(Images by envandrare and OpenClipart-Vectors from Pixabay)


Immune cells

Rutherford moved the conversation onto the immune system, prompting 'Sid' that "There's a lovely phrase you use to describe T cells, which is door to door wanderers that can detect even the whiff of an invader". Dr. Mukherjee distinguished between the cells of the innate immune system,

"Those are usually the first responder cells. In humans they would be macrophages, and neutrophils and monocytes among them. These cells usually rush to the site of an injury, or an infection, and they try to kill the pathogen, or seal up the pathogen…"

and the cells of the adaptive system, such as B cells and T cells,

"The B cell is a cell that eventually becomes a plasma cell which secretes antibodies. Antibodies, they are like harpoons or missiles which the cell sends out to kill a pathogen…

[A T cell] goes around sniffing other cells, basically touching them and trying to find out whether they have been altered in some way, particularly if they are carrying inside them a virus or any other kind of pathogen, and if it finds this pathogen or a virus in your body, it is going to go and kill that virus or pathogen"


A cell that goes around sniffing other cells, touching them? 1
(Images by allinonemovie and OpenClipart-Vectors from Pixabay)

Cells of the heart

Another topic was the work of Professor Harding on the heart. She informed listeners that heart cells did not get replaced very quickly, so that typically when a person dies half of their heart cells had been there since birth! (That was something I had not realised. It is believed that this is related to how heart cells need to pulse in synchrony so that the whole organ functions as an effective pumping device – making long lasting cells that seldom need replacing more important than in many other tissues.)

At least, this relates to the cardiomyocytes – the cells that pulse when the heart beats (a pulse that can now be observed in single cells in vitro). Professor Harding described how in the heart tissue there are also other 'supporting' cells, such as "resident macrophages" (immune cells) as well as other cells moving around the cardiomyocytes. She described her observations of the cells in Petri dishes,

"When you look at them in the dish it's incredible to see them interact. I've got a… video [of] cardiomyocytes in a dish. The cardiomyocytes pretty much just stay there and beat and don't do anything very much, and I had this on time lapse, and you could see cells moving around them. And so, in one case, the cell (I think it was a fibroblast, it looked like a fibroblast), it came and it palpated at the cardiomyocyte, and it nipped off bits of it, it sampled bits of the cardiomyocyte, and it just stroked it all the way round, and then it was, it seemed to like it a lot.

[In] another dish I had the same sort of cardiomyocyte, a very similar cell came in, it went up to the cardiomyocyte, it touched it, and as soon as it touched it, I can only describe it as it reared up and it had, little blobs appeared all over its surface, and it rushed off, literally rushed off, although it was time lapse so it was two minutes over 24 hours, so, it literally rushed off, so what had it found, why did one like it and the other one didn't?"

Making the unfamiliar, familiar

The snippets from the broadcast that I have reported above demonstrate a wide range of ways that the unfamiliar is made familiar by describing it in terms that a listener can relate to through their existing prior knowledge and experience. In these various examples the listener is left to carry across from the analogue features of the familiar (the city, the Lego bricks, human interactions, etc.) those that parallel features of the target concept – the cell. So, for example, the listener is assumed to appreciate that cells, unlike Lego bricks, are not built up through rigid, raised lumps that fit precisely in depressions on the next brick/cell. 2

Analogies with the familiar

Hooke's original label of the cell was based on a kind of analogy – an attempt to compare what he was seeing with something familiar: "pores, or cells…a great many little Boxes". He used the familiar simile of the honeycomb (something directly familiar to many more people in the seventeenth century when food was not subject to large-scale industrialised processing and packaging).

Other analogies, metaphors and similes abound. Cells are visually like "gloopy balloons", but functionally are "building blocks" (strictly a metaphor, albeit one that is used so often it has become treated as though a literal description) which can be conceptualised as being put together "like you would build a Lego kit" (a simile) although they are neither fixed, discrete blocks of a single material, nor organised by some external builder. They can be considered conceptually as the "smallest unit of life" (though philosophers argue about such descriptions and what counts as an individual in living systems).

The machine description ("fantastical living machines") reflects one metaphor very common in early modern science and cells as "incredible cities" is also a metaphor. Whether cells are literally machines is a matter of how we extend or limit our definition of machines: cells are certainly not actually cities, however, and calling them such is a way of drawing attention to the level of activity within each (often, apparently from observation, quite static) cell. B cells secrete antibodies, which the listener is told are like (a simile) harpoons or missiles – weapons.

Skeletons of the dead

Whether "the cell has its own skeleton" is a literal or metaphorical statement is arguable. It surely would have originally been a metaphoric description – there are structures in the cell which can be considered analogous to the skeleton of an organism. If such a metaphor is used widely enough, in time the term's scope expands to include its new use – and it becomes (what is called, metaphorically) a 'dead metaphor'.

Telling stories about cells

A narrative is used to help a listener imagine the cell at the scale of "a spacecraft". This is "organised into rooms and there are byways and channels" offering an analogy for the complex internal structure of a cell. Most people have never actually boarded a spacecraft, but they are ubiquitous in television and movie fiction, so a listener can certainly imagine what this might be like.


Endoplasmic reticulum? (Still from Star Trek: The Motion Picture, Paramount Pictures, 1979)

Oversimplification?

The discussion of organelles illustrates how simplifications have to be made when introducing complex material. This always brings with it dangers of oversimplification that may impede further learning, or even encourage the development of alternative conceptions. So, the nucleus does not, strictly, 'store' "all the genetic information" in a cell (mitochondria carry their own genes for example).

More seriously, perhaps, mitochondria do not "make energy". 'More seriously' as the principle of conservation of energy is one of the most basic tenets of modern science and is considered a very strong candidate for a universal law. Children are often taught in school that energy cannot be created or destroyed. Science communication which is contrary to this basic curriculum science could confuse learners – or indeed members of the public seeking to understand debates about energy policy and sustainability.

Anthropomorphising cells

Cells are not only compared to inanimate entities like balloons, building bricks, cities and spaceships. They are also described in ways that make them seem like sentient agents – agents that have experiences, and conscious intentions, just as people do. So, some immune cells are metaphorical 'first responders' and just like emergency services workers they "rush to the site" of an incident. To rush is not just to move quickly, but to deliberately do so. (By contrast, Paul McAuley refers to "innocent" amoeboid cells that collectively form into the plasmodium of a slime mould spending most of their lives "bumbling around by themselves" before they "get together".) The immune cells act deliberately – they "try" to kill. Other immune cells "send out" metaphorical 'missiles' "to kill a pathogen". Again this language suggests deliberate action (i.e., to send out) and purpose.

That is, what is described is not just some evolved process, but something teleological: there is a purpose to sending out antibodies – it is a deliberate act with an aim in mind. This type of language is very common in biology – even referring to the 'function' of the heart or kidney or a reflex arc could be considered as misinterpreting the outcome of evolutionary developments. (The heart pumps blood through the vascular system, but referring to a function could suggest some sense of deliberate design.)

Not all cells are equal

I wonder how many readers noticed the reference above to 'supporting' cells in the heart. Professor Harding had said

"When you look inside the [heart] tissue there are many other cells [than cardiomyocytes] that are in there, supporting it, there are resident macrophages, I think we still don't know really what they are doing in there"

Why should some heart cells be seen as more important and others less so? Presumably because 'the function' of a heart is to beat, to pump, so clearly the cells that pulse are the stars, and the other cells that may be necessary but are not obviously pulsing just a supporting cast. (So, cardiomyocytes are considered heart cells, but macrophages in the same tissue are only cells that are found in the heart, "residents" – to use an analogy of my own, like migrants that have not been offered citizenship!) 3

That is, there is a danger here that this way of thinking could bias research foci leading researchers to ignore something that may ultimately prove important. This is not fanciful, as it has happened before, in the case of the brain:

"Glial cells, consisting of microglia, astrocytes, and oligodendrocyte lineage cells as their major components, constitute a large fraction of the mammalian brain. Originally considered as purely non-functional glue for neurons, decades of research have highlighted the importance as well as further functions of glial cells."

Jäkel and Dimou, 2017

The lives of cells

Narrative is used again in relation to the immune cells: an infection is presented as a kind of emergency event which is addressed by special (human-like) workers who protect the body by repelling or neutralising invaders. "Sniffing" is surely an anthropomorphic metaphor, as cells do not actually sniff (they may detect diffusing substances, but do not actively inhale them). Even "touching" is surely an anthropomorphism. When we say two objects are 'touching' we mean they are in contact, as we touch things by contact. But touching is sensing, not simply adjacency.

If that seems to be stretching my argument too far, to refer to immune cells "trying to find out…" is to use language suggesting an epistemic agent that can not only behave deliberately, but which is able to acquire knowledge. A cell can only "find" an infectious agent if it is (i.e., deliberately) looking for something. These metaphors are very effective in building up a narrative for the listener. Such a narrative adopts familiar 'schemata', recognisable patterns – the listener is aware of emergency workers speeding to the scene of an incident and trying to put out a fire or seeking to diagnose a medical issue. By fitting new information into a pattern that is familiar to the audience, technical and abstract ideas are not only made easier to understand, but more likely to be recalled later.

Again, an anthropomorphic narrative is used to describe interactions between heart cells. So, a fibroblast that "palpates at" a cardiomyocyte seems to be displaying deliberate behaviour: if "nipping" might be heard as some kind of automatic action – "sampling" and "stroking" surely seem to be deliberate behaviour. A cell that "came in, it went up [to another]" seems to be acting deliberately. "Rearing up" certainly brings to mind a sentient being, like a dog or a horse. Did the cell actually 'rear up'? It clearly gave that impression to Professor Harding – that was the best way, indeed the "only" way, she had to communicate what she saw.

Again we have cells "rushing" around. Or do we? The cell that had reared up, "rushed off". Actually, it appeared to "rush" when the highly magnified footage was played at 720 times the speed of the actual events. Despite acknowledging this extreme acceleration of the activity, the impression was so strong that Professor Harding felt justified in claiming the cell "literally rushed off, although it was time lapse so it was two minutes over 24 hours, so, it literally rushed off…". Whatever it did, that looked like rushing with the distortion of time-lapse viewing, it certainly did not literally rush anywhere.
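The factor of 720 follows directly from the figures Professor Harding quoted. As a minimal sketch (the variable names are my own, and it assumes the two minutes of playback covered the full 24 hours of recording), the arithmetic is:

```python
# Speed-up factor of a time-lapse: real elapsed time / playback time.
# Assumption (from the quoted description): 24 hours of events shown in 2 minutes.
real_minutes = 24 * 60      # 24 hours expressed in minutes
playback_minutes = 2        # duration of the time-lapse clip in minutes

speed_up = real_minutes / playback_minutes
print(speed_up)  # 720.0
```

So any movement that appeared to take one second on screen would have taken twelve minutes in the dish.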

But the narrative helps motivate a very interesting question, which is why the two superficially similar cells 'behaved' ('reacted', 'responded' – it is actually difficult to find completely neutral language) so differently when in contact with a cardiomyocyte. In more anthropomorphic terms: what had these cells "found, why did one like it and the other one didn't?"

Literally speaking?

Metaphorical language is ubiquitous as we have to build all our abstract ideas (and science has plenty of those) in terms of what we can experience and make sense of. This is an iterative process. We start with what is immediately available in experience, extend metaphorically to form new concepts, and in time, once those have "settled in" and "taken root" and "firmed up" (so to speak!) they can then be themselves borrowed as the foundation for new concepts. This is true both in how the individual learns (according to constructivism) and how humanity has developed culture and extended language.

So, should science communicators (whether scientists themselves, journalists or teachers) try to limit themselves to literal language?

Even if this were possible, it would put aside some of our strongest tools for 'making the unfamiliar familiar' (to broadcast audiences, to the public, to learners in formal education). However, these devices also bring risks that the initial presentations (with their simplifications and metaphors and analogies and anthropomorphic narratives…) not only engage listeners but can also come to be understood as the scientific account. That this is not an imagined risk is shown by the vast numbers of learners who think atoms want to fill their shells with octets of electrons, and so act accordingly – and think this because they believe it is what they have been taught.

Does it matter if listeners think the simplification, the analogy, the metaphor, the humanising story,… is the scientific account? Perhaps usually not in the case of the audience listening to a radio show or watching a documentary out of interest.

In education it does matter, as learners are often expected to progress beyond these introductory accounts in their thinking, and teachers' models and metaphors and stories are only meant as a starting point in building up a formal understanding. The teacher has to first establish some kind of anchor point in the students' existing understandings and experiences, but then mould this towards the target knowledge set out in the curriculum (which is often a simplified account of canonical knowledge) before the metaphor or image or story becomes firmed-up in the learners' minds as 'the' scientific account.

'Building the Body, Opening the Heart' was a good listen, and a very informative and entertaining episode that covered a lot of ideas. It certainly included some good comparisons that science teachers might borrow. But I think in a formal educational context a science teacher would need to be more circumspect in throwing some of these metaphors out there, without then doing some work to transition from them to more technical, literal, and canonical accounts.


Read about science analogies

Read about science metaphors

Read about science similes

Read about anthropomorphism

Read about teleology


Work cited:


Notes:

1 The right hand image portrays a mine, a weapon that is used at sea to damage and destroy (surface or submarine) boats. The mine is also triggered by contact ('touch').


2 That is, in an analogy there are positive and negative aspects: there are ways in which the analogue IS like the target, and ways in which the analogue is NOT like the target. Using an analogy in communication relies on the right features being mapped from the familiar analogue to the unfamiliar target being introduced. In teaching it is important to be explicit about this, or inappropriate transfers may be made: e.g., the atom is a tiny solar system so it is held together by gravity (Taber, 2013).


3 It may be a pure coincidence in relation to the choice of term 'resident' here, but in medicine 'residents' have not yet fully qualified as specialist physicians or surgeons, and so are on placement and/or under supervision, rather than having permanent status in a hospital faculty.


Using water to feed the fire

How NOT to heat up your blast furnace


Keith S. Taber


"From one of the known ingredients of steam being a highly inflammable body, and the other that essential part of the air which supports combustion, it was imagined that [steam] would have the effect of increasing the fire …"


Producing iron requires high temperatures: adding H2O does not help
(Image by zephylwer0 from Pixabay)

The challenge of chemical combination

School science teachers are likely aware of how chemistry poses some significant learning challenges for learners. One of these is the nature of chemical compounds. That is, compounds of chemical elements.

It may seem obvious to learners that when we 'mix' two components with different properties we should get a mixture with a combination of the component properties. So far, so good. But of course, in chemical reactions we do not just mix different substances, but rather they chemically react. So, sodium will react with chlorine, which can be understood in terms of processes occurring at the nanoscopic scale where molecules of a gas interact with the metallic lattice of sodium cations and delocalised electrons.

Sodium and chlorine behaving badly

Although we can model this process, we cannot observe it directly, or even the starting structures at that scale. Understandably, students often struggle to relate the macroscopic and molecular:

As Sodium is a reactive meterial [sic] and chlorine is a acid. When Sodium is placed in Chlorine, Sodium react badly making a flame and maybe a noise. I think why this reaction happen is because as Sodium reactive metal meaning that it atomic configuration is unstable make the metal danger And as Chlorine is a dangerous acid. When sodium is placed in Chlorine, the sodium start dissolving in the acid due to all the particle rushing around quickly pushing together with Chlorine atom. Producing Sodium chloride.

Student setting out on Advanced level chemistry, quoted in Taber, 1996

So, for example, if we do burn sodium in chlorine we end up with sodium chloride which is a new substance that has its own properties – properties which are not simply some mixture of, or intermediate between, the properties of the substances we start with (the reactants).

Indeed, sodium is a dangerous material to handle: it will react vigorously with water (in a person's sweat for example!) and burns violently in air. Chlorine is so nasty that it has been used as a weapon of war (and since banned as an 'unacceptable' weapon, even in war). In the 'great' war ('great' only because of its scale) the way men died in agony from breathing chlorine was much reported, as well as the effects on those who survived the gas – being blinded for example.

"In all my dreams before my helpless sight,

He plunges at me, guttering, choking, drowning."

Wilfred Owen, Dulce et Decorum Est 1

Sweet and honourable? 1 (Image by Bruce Mewett from Pixabay)

Sodium chloride certainly has its associated hazards – if eaten in excess it is a risk factor for high blood pressure for example – but is certainly not dangerous in anything like the same sense. Many people put sodium chloride on their chips (often along with ethanoic acid solution). No one would want sodium on their food, or to eat in a canteen with a chlorine atmosphere!

When is something both present and not present?

Why this is especially challenging is that the chemistry teacher tells the students that although, at one level, the new substance does not contain its precursors – there is no sodium (substance) or chlorine (substance) in the substance sodium chloride – yet it is a compound of these elements and in some sense the elements remain 'in' the compound.


Learning chemistry requires understanding how disciplinary concepts are explained in terms of submicroscopic level models (After Figure 5, Taber, 2013)

This links to that key theoretical framework in chemistry where we can explain macroscopic (bench scale) phenomena in terms of models of matter at the submicroscopic (indeed nanoscopic or even subnanoscopic) scale. The sense in which sodium chloride 'contains' sodium and chlorine is that it is comprised of a lattice of sodium ions and chloride ions – species which include the specific types of nuclei (those of charge +11 and +17 respectively) that define those elements.

So, when we ask whether the elements are in some sense 'in' the compound we have to think in terms of these abstract models at a tiny scale – there is no sodium substance or chlorine substance present, but there is something that is inherently identified with these two elements. In a sense, but a very abstract sense, the elements are still present. Or, perhaps, better, something intrinsic to those elements is still present.

"We are working here with two complementary meanings for the idea of element, one at the (macroscopic) level of phenomena we can demonstrate to students (substances, and their reactions); the other deriving from a theoretical model in terms of conjectured submicroscopic entities ('quanticles'…).

However, there is also a sense in which an element is considered to be present, in a virtual or potential sense, within its compounds. This use is more common among French-speaking chemists, and in the English-speaking world we normally consider it quite inappropriate to suggest that sodium is somehow present in sodium chloride, or hydrogen in water. Yet, of course, chemical formulae (NaCl, H2O, etc) tell us that the compounds somehow 'contain' the elements."

Taber, 2012, p.19

Figure 1.9 from Taber, 2012

A source of alternative conceptions

This is easy to understand for someone very familiar with molecular level models – but is understandably difficult for novice learners. Thus we can reasonably understand why there are common alternative conceptions along the lines of students thinking that, for example, a compound of a dangerous element (say chlorine) must also be dangerous. Yet we 'mix' and react a soft, reactive, metal and a choking green gas – and get hard white crystals that safely dissolve in water to give a solution we can use in cooking, or to soak our feet, or to gargle with.

An historical precedent

Because science teachers and chemists are so used to thinking in models at the molecular level, we can forget just how unfamiliar this perspective is to the novice, and so the challenge of acquiring the scientific ways of thinking that have become 'second nature' through extensive application.

I was therefore fascinated to see an example of this same alternative conception, assuming a compound will show the properties of its constituent elements, reported by the scientist Sir John Herschel (astronomer, chemist, mathematician, philosopher…), not in a school science context, but rather an industrial context.

"The smelting of iron requires the application of the most violent heat that can be raised, and is commonly performed in tall furnaces, urged by great iron bellows driven by steam-engines. Instead of employing this power to force air into the furnace through the intervention of bellows, it was, on one occasion, attempted to employ the steam itself in, apparently, a much less circuitous manner; viz. by directing the current of steam in a violent blast, from the boiler at once into the fire. From one of the known ingredients of steam being a highly inflammable body, and the other that essential part of the air which supports combustion, it was imagined that this would have the effect of increasing the fire to tenfold fury, whereas it simply blew it out; a result which a slight consideration of the laws of chemical combination, and the state in which the ingredient elements exist in steam, would have enabled any one to predict without a trial."

Herschel, J. F. W. (1830/1851/2017), §37 2

So, here, instead of dropping marks on a test, this misunderstanding of the chemistry leads to a well-intentioned industrialist trying to generate heat in a blast furnace by adding water to the fire. But this does remind us just how counter-intuitive some of the things taught in science are. It might also be a useful anecdote to share with students to help them appreciate that their errors are by no means unusual, or necessarily a reflection on their ability.

Perhaps this might even be a useful teaching example that could be built up into a historical anecdote which students might readily recall and that will help them remember that compounds have new properties that may be quite different from their constituent elements. So, while a mixture of the flammable gas hydrogen and oxygen can be explosive, a combination (that is, a chemical combination – a compound), of hydrogen and oxygen will not 'feed' a fire but dampen it down. Just as well, really, as otherwise emergency fire and rescue services would need to find an alternative to the widely available, inexpensive, recyclable, non-toxic, agent they widely use in fighting fires.


Compounds and mixtures are not interchangeable (Image by David Mark from Pixabay)

Work cited:

Notes:

1 Wilfred Owen was famous for his war poetry written about the horrors of the trench fighting in the 'first world war'. Owen was killed a week before the war ended. 'Dulce Et Decorum Est' referred to a Latin phrase or motto (dulce et decorum est pro patria mori) that Owen labelled as 'the old lie', that it was sweet and honourable to die in the service of one's country.


2 For some reason, "…it was imagined that this would have the effect of increasing the fire to tenfold fury, whereas it simply blew it out…" puts me in mind of

"the mighty ships tore across the empty wastes of space and finally dived screaming on to…Earth – where due to a terrible miscalculation of scale the entire battle fleet was accidentally swallowed by a small dog."

Douglas Adams, The Hitchhiker's Guide to the Galaxy

The missing mass of the electron

Annihilating mass in communicating science


Keith S. Taber


An episode of 'In Our Time' about the electron

The BBC radio programme 'In Our Time' today tackled the electron. As part of the exploration there was the introduction of the positron, and the notion of matter-antimatter annihilation. These are quite brave topics to introduce in a programme with a diverse general audience (last week Melvyn Bragg and his guests discussed Plato's Atlantis and next week the programme theme is the Knights Templar).

Prof. Victoria Martin of the School of Physics and Astronomy at the University of Edinburgh explained:

If we take a pair of matter and antimatter, so, since we are talking about the electron today, if we take an electron and the positron, and you put them together, they would annihilate.

And they would annihilate not into nothingness, because they both had mass, so they both had energy from E=mc² that tells us if you have mass you have energy. So, they would annihilate into energy, but it would not just be any kind of energy: the particular kind of energy you get when you annihilate an electron and a positron is a photon, a particle of light. And it will have a very specific amount of energy. Its energy will be equal to the sum of the energy of electron and the positron that they had initially when they collided together.

Prof. Victoria Martin on 'In Our Time'

"An electron and the positron, and you put them together, they would annihilate…they would annihilate into energy" – but this could be misleading.

Now, I am sure that is somewhat different from how Prof. Martin would treat this topic with university physics students – of course, science in the media has to be pitched at the largely non-specialist audience.

Read about science in the media

It struck me that this presentation had the potential to reinforce a common alternative conception ('misconception') that mass is converted into energy in certain processes. Although I am aware now that this is an alternative conception, I seem to recall that is pretty much what I had once understood from things I had read and heard.

It was only when I came to prepare to teach the topic that I realised that I had a misunderstanding. That, I think, is quite common for teachers – when we have to prepare a topic well enough to explain it to others, we may spot flaws in our own understanding (Taber, 2009).

So, for example, I had thought that in nuclear processes, such as in a fission reactor or fusion in stars, the mass defect (the apparent loss of mass as the resulting nuclear fragments have less mass than those present before the process) was due to that amount of mass being converted to energy. This is sometimes said to explain why nuclear explosions are so much more violent than chemical explosions, as (given E=mc²) a tiny amount of mass can be changed into a great deal of energy.

Prof. Martin's explanation seemed to support this way of thinking: "they would annihilate into energy".


An alternative conception of particle annihilation: This scheme seems to be implied by Prof. Martin's comments

What is conserved?

It is sometimes suggested that, classically, mass and energy were considered to be separately conserved in processes, but that, since Einstein's theories of relativity were adopted, mass can be treated as if it were a form of energy, such that what is conserved is a kind of hybrid conglomerate. That is, energy is still considered conserved, but only when we account for mass that may have been inter-converted with energy. (Please note, this is not quite right – see below.)

So, according to this (mis)conception: in the case of an electron-positron annihilation, the mass of the two particles is converted to an equivalent energy – the mass of the electron and the mass of the positron disappear from the universe and an equivalent quantity of energy is created. Although energy is created, energy is still conserved if we allow for the mass that was converted into this new energy. Each time an electron and positron annihilate, their masses of about 2 × 10⁻³⁰ kg disappear from the universe and in its place something like 2 × 10⁻¹³ J appears instead – but that's okay as we can consider 2 × 10⁻³⁰ kg as a potential form of energy worth 2 × 10⁻¹³ J.
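
The rough figures quoted above can be checked with a short calculation. This is only a sketch: the constants are rounded values assumed here, and the variable names are illustrative.

```python
# Rest-mass energy equivalent of an electron-positron pair, E = m c^2.
ELECTRON_MASS_KG = 9.109e-31   # assumed rounded value; the positron has the same mass
SPEED_OF_LIGHT_M_S = 2.998e8   # assumed rounded value

pair_mass_kg = 2 * ELECTRON_MASS_KG
pair_energy_j = pair_mass_kg * SPEED_OF_LIGHT_M_S ** 2

print(f"{pair_mass_kg:.1e} kg")   # about 1.8e-30 kg, i.e. roughly 2 x 10^-30 kg
print(f"{pair_energy_j:.1e} J")   # about 1.6e-13 J, i.e. roughly 2 x 10^-13 J
```

Note that the calculation only re-expresses one quantity in another unit; it says nothing about whether anything is 'converted'.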

However, this is contrary to what Einstein (1917/2004) actually suggested.


Einstein did not suggest that matter could be changed to energy

Equivalence, not interconversion

What Einstein actually suggested was not that mass could be considered as if another kind/form of energy (alongside kinetic energy and gravitational potential, etc.) that needed to be taken into account in considering energy conservation, but rather that inertial mass can be considered as an (independent) measure of energy.

That is, we think energy is always conserved. And we think that mass is always conserved. And in a sense they are two measures of the same thing. We might see these two statements as having redundancy:

  • In an isolated system we will always have the same total quantity of energy before and after any process.
  • In an isolated system we will always have the same total quantity of mass before and after any process.

As mass is always associated with energy, and so vice versa, either of these statements implies the other. 1


Two conceptions of the shift from a Newtonian to a relativistic view of the conservation of energy

No interconversion?

So, mass cannot be changed into energy, nor vice versa. The sense in which we can 'interconvert' is that we can always calculate the energy equivalence of a certain mass (E=mc²) or mass equivalence of some quantity of energy (m=E/c²).

So, the 'interconversion' is more like a change of units than a change of entity.


Although we might think that kinetic energy being converted to potential energy reflects a natural process (something changes), we know that changing joules to electron-volts is merely the use of a different unit (nothing changes).

If we think of a simple pendulum under ideal conditions 2 it could oscillate for ever, with the total energy unchanged, but with the kinetic energy being converted to potential energy – which is then converted back to kinetic energy – and so on, ad infinitum. The total energy would be fixed although the amount of kinetic energy and the amount of potential energy would be constantly changing. We could calculate the energy in joules or some other unit such as eV or ergs (or calories or kWh or…). We could convert from one unit to another, but this would not change anything about the physical system. (So, this is less like converting pounds to dollars, and more like converting an amount reported in pounds {e.g., £24.83} into an amount reported in pence {e.g., 2483p}.)

Using this analogy, the electron and positron being converted to a photon is somewhat like kinetic energy changing to potential energy in a swinging pendulum (something changes), but it is not the case that mass is changed into energy. Rather we can do our calculations in terms of energy or mass and will get (effectively, given E=mc²) the same answer (just as we can add up a shopping list in pounds or pence, and get the same outcome given the conversion factor, £1.00 = 100p).
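
The 'change of units' point can be sketched in a few lines. The function names, the rounded constants and the example value are assumptions chosen for illustration; the point is only that one quantity can be reported in joules, in electron-volts, or as an equivalent mass, without anything about the system changing.

```python
# One quantity, three ways of reporting it: nothing physical changes.
SPEED_OF_LIGHT_M_S = 2.998e8        # assumed rounded value
JOULES_PER_EV = 1.602e-19           # assumed rounded value

def joules_to_ev(energy_j):
    """Report an energy in electron-volts instead of joules."""
    return energy_j / JOULES_PER_EV

def mass_equivalent_kg(energy_j):
    """Report the same quantity as a mass, m = E / c^2."""
    return energy_j / SPEED_OF_LIGHT_M_S ** 2

def energy_equivalent_j(mass_kg):
    """Report a mass as an energy, E = m c^2."""
    return mass_kg * SPEED_OF_LIGHT_M_S ** 2

energy_j = 1.0  # an arbitrary example value
energy_ev = joules_to_ev(energy_j)
mass_kg = mass_equivalent_kg(energy_j)

# Converting there and back recovers the starting figure, as with pounds and pence.
round_trip_j = energy_equivalent_j(mass_kg)
print(energy_ev, mass_kg, round_trip_j)
```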

So, where does the mass go?

If mass is conserved, then where does the mass defect – the amount by which the sum of masses of daughter particles is less than the mass of the parent particle(s) – in nuclear processes go? And, more pertinent to the present example, what happens to the mass of the electron and positron when they mutually annihilate?

To understand this, it might help to bear in mind that in principle these processes are like any other natural processes – such as the swinging pendulum, or a weight being lifted with a pulley, or methane being combusted in a Bunsen burner, or heating water in a kettle, or photosynthesis, or a braking cycle coming to a halt with the aid of friction.

In any natural process (we currently believe)

  • the total mass of the universe is unchanged…
    • but mass may be reconfigured
  • the total energy of the universe is unchanged…
    • but energy may be reconfigured; and
  • as mass and energy are associated, any reconfigurations of mass and energy are directly correlated.

So, in any change that involves energy transfers, there is an associated mass transfer (albeit usually one too small to notice or easily measure). We can, for example, calculate the (tiny) increase in mass due to water being heated in a kettle – and know that, just as the energy involved in heating the water came from somewhere else, there is an equivalent (tiny) decrease of mass somewhere else in the wider system (perhaps due to falling of water powering a hydroelectric power station). If we are boiling water to make a cup of tea, we may well be talking about a change in mass of the order of only 0.000 000 001 g according to my calculations for another posting.
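
To give a feel for the scale, here is a minimal sketch of such a calculation. The quantity of water, the temperatures and the rounded constants are assumptions made here for illustration, and are not necessarily the figures used in the other posting.

```python
# Mass equivalent of the energy needed to heat water for one cup of tea.
SPEED_OF_LIGHT_M_S = 2.998e8          # assumed rounded value
SPECIFIC_HEAT_J_PER_G_K = 4.18        # approximate value for liquid water

water_mass_g = 250.0                  # assumed: roughly one mug of water
temperature_rise_k = 100.0 - 20.0     # assumed: room temperature to boiling point

energy_j = water_mass_g * SPECIFIC_HEAT_J_PER_G_K * temperature_rise_k
mass_increase_g = energy_j / SPEED_OF_LIGHT_M_S ** 2 * 1000  # kg -> g

print(f"{energy_j:.0f} J")            # 83600 J
print(f"{mass_increase_g:.1e} g")     # about 9.3e-10 g, of the order of 0.000 000 001 g
```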

Read 'How much damage can eight neutrons do? Scientific literacy and desk accessories in science fiction.'

The annihilation of the electron and positron is no different: there may be reconfigurations in the arrangement of mass and energy in the universe, but mass (and so energy) is conserved.

So, the question is, if the electron and positron, both massive particles (in the physics sense, that they have some mass) are annihilated, then where does their mass go if it is conserved? The answer is reflected in Prof. Martin's statement that "the particular kind of energy you get when you annihilate an electron and a positron is a photon, a particle of light". The mass is carried away by the photon.

The mass of a massless particle?

This may seem odd to those who have learnt that, unlike the electron and positron, the photon is massless. Strictly the photon has no rest mass, whereas the electron and positron do have rest mass – that is, they have inertial mass even when judged by an observer at rest in relation to them.

So, the photon only has 'no mass' when it is observed to be stationary – which nicely brings us back to Einstein who noted that electromagnetic radiation such as light could never appear to be at rest compared to the observer, as its very nature as a progressive electromagnetic wave would cease if one could travel alongside it at the same velocity. This led Einstein to conclude that the speed of light in any given medium was invariant (always the same for any observer), leading to his theory of special relativity.

So, a photon (despite having no 'rest' mass) not only carries energy, but also the associated mass.

Although we might think in terms of two particles being converted to a certain amount of energy as Prof. Martin suggests, this is slightly distorted thinking: the particles are converted to a different particle which now 'has' the mass from both, and so will also 'have' the energy associated with that amount of mass.


Mass is conserved during the electron-positron annihilation

A slight complication is that the electron and positron are in relative motion when they annihilate, so there is some kinetic energy involved as well as the energy associated with their rest masses. But this does not change the logic of the general scheme. Just as there is an energy associated with the particles' rest masses, there is a mass component associated with their kinetic energy.

The total mass-energy equivalence before the annihilation has to include both the particle rest masses and their kinetic energy. The mass-energy equivalence afterwards (being conserved in any process) also reflects this. The energy of the photon (and the frequency of the radiation) reflects both the particle masses and their kinetic energies at the moment of the annihilation. The mass (being perfectly correlated with energy) carried away by the photon also reflects both the particle masses and their kinetic energies.

How could 'In Our Time' have improved the presentation?

It is easy to be critical of people doing their best to simplify complex topics. Any teacher knows that well-planned explanations can fail to get across key ideas as one is always reliant on what the audience already understands and thinks. Learners interpret what they hear and read in terms of their current 'interpretive resources' and habits of thinking.

Read about constructivism

A physicist or physics student hearing the episode would likely interpret Prof. Martin's statement within a canonical conceptual framework. However, someone holding the 'misconception' that mass is converted to energy in nuclear processes would likely interpret "they would annihilate into energy" as fitting, and reinforcing, that alternative conception.

I think a key issue here is a slippage that apparently refers to energy being formed in the annihilation, rather than radiation: (i.e., Prof. Martin could have said "they would annihilate into [radiation]"). When the positron and electron 'become' a photon, matter is changed to radiation – but it is not changed to energy, as matter has mass, and (as mass and energy have an equivalence) the energy is already there in the system.


Energy is reconfigured, but is not formed, in the annihilation process.

So, this whole essay is simply suggesting that a change of one word – from energy to radiation – could potentially avoid the formation of, or the reinforcing of, the alternative conception that mass is changed into energy in processes studied in particle physics. As experienced science teachers will know, sometimes such small shifts can make a good deal of difference to how we are interpreted and, so, what comes to be understood.


Addenda:

Reply from Prof. Victoria Martin on Twitter (@MamaPhysikerin), September 30:

"E² = p²c² + m²c⁴ is a better way to relate energy, mass and momentum. Works for both massive and massless states."

@MamaPhysikerin
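
As a sketch of how the relation in this reply covers both cases, the function below evaluates it for an electron at rest (where it reduces to E = mc²) and for a particle with no rest mass, such as a photon (where it reduces to E = pc). The function name, the rounded constants and the example momentum are assumptions chosen for illustration.

```python
import math

SPEED_OF_LIGHT_M_S = 2.998e8     # assumed rounded value
ELECTRON_MASS_KG = 9.109e-31     # assumed rounded value

def total_energy_j(momentum_kg_m_s, rest_mass_kg):
    """Total energy from E^2 = p^2 c^2 + m^2 c^4."""
    c = SPEED_OF_LIGHT_M_S
    return math.sqrt((momentum_kg_m_s * c) ** 2 + (rest_mass_kg * c ** 2) ** 2)

# Massive particle at rest (p = 0): the relation reduces to E = m c^2.
electron_rest_energy_j = total_energy_j(0.0, ELECTRON_MASS_KG)

# Particle with no rest mass (m = 0): the relation reduces to E = p c.
example_momentum = 1.0e-22       # arbitrary illustrative momentum in kg m/s
photon_energy_j = total_energy_j(example_momentum, 0.0)

print(electron_rest_energy_j, photon_energy_j)
```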

Work cited:

Notes

1 In what is often called a closed system there is no mass entering or leaving the system. However, energy can transfer to, or from, the system from its surroundings. Classically it might be assumed that the mass of a closed system is constant as the amount of matter is fixed, but Einstein realised that if there is a net energy influx to, or outflow from, the system, then some mass would also be transferred – even though no matter enters or leaves.


2 Perhaps in a uniform gravitational field, not subject to any frictional forces, with an inextensible string supporting the bob, and in thermal equilibrium with its environment.

Falsifying research conclusions

You do not need to falsify your results if you are happy to draw conclusions contrary to the outcome of your data analysis.


Keith S. Taber


Li and colleagues claim that their innovation is successful in improving teaching quality and student learning: but their own data analysis does not support this.

I recently read a research study to evaluate a teaching innovation where the authors

  • presented their results,
  • reported the statistical test they had used to analyse their results,
  • acknowledged that the outcome of their experiment was negative (not statistically significant), then
  • stated their findings as having obtained a positive outcome, and
  • concluded their paper by arguing they had demonstrated their teaching innovation was effective.

Li, Ouyang, Xu and Zhang's (2022) paper in the Journal of Chemical Education contravenes the scientific norm that your conclusions should be consistent with the outcome of your data analysis.
(Magnified portions of this scheme are presented below)

And this was not in a paper in one of those predatory journals that I have criticised so often here – this was a study in a well regarded journal published by a learned scientific society!

The legal analogy

I have suggested (Taber, 2013) that writing up research can be understood in terms of a number of metaphoric roles: researchers need to

  • tell the story of their research;
  • teach readers about the unfamiliar aspects of their work;
  • make a case for the knowledge claims they make.

Three metaphors for writing-up research

All three aspects are important in making a paper accessible and useful to readers, but arguably the most important aspect is the 'legal' analogy: a research paper is an argument to make a claim for new public knowledge. A paper that does not make its case does not add anything of substance to the literature.

Imagine a criminal case where the prosecution seeks to make its argument at a pre-trial hearing:

"The police found fingerprints and D.N.A. evidence at the scene, which they believe were from the accused."

"Were these traces sent for forensic analysis?"

"Of course. The laboratory undertook the standard tests to identify who left these traces."

"And what did these analyses reveal?"

"Well according to the current standards that are widely accepted in the field, the laboratory was unable to find a definite match between the material collected at the scene, and fingerprints and a D.N.A. sample provided by the defendant."

"And what did the police conclude from these findings?"

"The police concluded that the fingerprints and D.N.A. evidence show that the accused was at the scene of the crime."

It seems unlikely that such a scenario has ever played out, at least in any democratic country where there is an independent judiciary, as the prosecution would be open to ridicule and it is quite likely the judge would have some comments about wasting court time. What would seem even more remarkable, however, would be if the judge decided on the basis of this presentation that there was a prima facie case to answer that should proceed to a full jury trial.

Yet in educational research, it seems parallel logic can be persuasive enough to get a paper published in a good peer-reviewed journal.

Testing an educational innovation

The paper was entitled 'Implementation of the Student-Centered Team-Based Learning Teaching Method in a Medicinal Chemistry Curriculum' (Li, Ouyang, Xu & Zhang, 2022), and it was published in the Journal of Chemical Education. 'J.Chem.Ed.' is a well-established, highly respected periodical that takes peer review seriously. It is published by a learned scientific society – the American Chemical Society.

That a study published in such a prestigious outlet should have such a serious and obvious flaw is worrying. Of course, no matter how good editorial and peer review standards are, it is inevitable that sometimes work with serious flaws will get published, and it is easy to pick out the odd problematic paper and ignore the vast majority of quality work being published. But, I did think this was a blatant problem that should have been spotted.

Indeed, because I have a lot of respect for the Journal of Chemical Education I decided not to blog about it ("but that is what you are doing…?"; yes, but stick with me) and to take time to write a detailed letter to the journal setting out the problem in the hope this would be acknowledged and the published paper would not stand unchallenged in the literature. The journal declined to publish my letter although the referees seemed to generally accept the critique. This suggests to me that this was not just an isolated case of something slipping through – but a failure to appreciate the need for robust scientific standards in publishing educational research.

Read the letter submitted to the Journal of Chemical Education

A flawed paper does not imply worthless research

I am certainly not suggesting that there is no merit in Li, Ouyang, Xu and Zhang's work. Nor am I arguing that their work was not worth publishing in the journal. My argument is that Li and colleagues' paper draws an invalid conclusion, and makes misleading statements inconsistent with the research data presented, and that it should not have been published in this form. These problems are pretty obvious, and should (I felt) have been spotted in peer review. The authors should have been asked to address these issues, and follow normal scientific standards and norms such that their conclusions follow from, rather than contradict, their results.

That is my take. Please read my reasoning below (and the original study if you have access to J.Chem.Ed.) and make up your own mind.

Li, Ouyang, Xu and Zhang report an innovation in a university course. They consider this to have been a successful innovation, and it may well have great merits. The core problem is that Li and colleagues claim that their innovation is successful in improving teaching quality and student learning, when their own data analysis does not support this.

The evidence for a successful innovation

There is much material in the paper on the nature of the innovation, and there is evidence about student responses to it. Here, I am only concerned with the failure of the paper to offer a logical chain of argument to support their knowledge claim that the teaching innovation improved student achievement.

There are (to my reading – please judge for yourself if you can access the paper) some slight ambiguities in some parts of the description of the collection and analysis of achievement data (see note 5 below), but the key indicator relied on by Li, Ouyang, Xu and Zhang is the average score achieved by students in four teaching groups, three of which experienced the teaching innovation (these are denoted collectively as 'the experimental group') and one group which did not (denoted as 'the control group', although there is no control of variables in the study 1). Each class comprised 40 students.

The study is not published open access, so I cannot reproduce the copyright figures from the paper here, but below I have drawn a graph of these key data:


Key results from Li et al, 2022: this data was the basis for claiming an effective teaching innovation.


It is on the basis of this set of results that Li and colleagues claim that "the average score showed a constant upward trend, and a steady increase was found". Surely, anyone interrogating these data might pause to wonder if that is the most authentic description of the pattern of scores year on year.

Does anyone teaching in a university really think that assessment methods are good enough to produce average class scores that are meaningful to 3 or 4 significant figures? To a more reasonable level of precision, the nearest percentage point (which is presumably what these numbers are – that is not made explicit), the results were:


Cohort    Average class score
2017      80
2018      80
2019      80
2020      80
Average class scores (2 s.f.) year on year

When presented to a realistic level of precision, the obvious pattern is…no substantive change year on year!
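The rounding can be checked with a minimal sketch. The figures are those quoted from the paper (a 2017 average of 79.8, then increases of 0.11, 0.32 and 0.54 points); treating each increase as measured from the 2017 baseline is an assumption, as are the variable names.

```python
# Class averages as reported by Li et al. (2022): 79.8 in 2017, then increases
# of 0.11, 0.32 and 0.54 points. ASSUMPTION: each increase is measured from
# the 2017 baseline.
BASELINE_2017 = 79.8
reported_increase = {2017: 0.0, 2018: 0.11, 2019: 0.32, 2020: 0.54}

averages = {year: BASELINE_2017 + inc for year, inc in reported_increase.items()}

# Round to the nearest whole percentage point (2 significant figures here).
rounded = {year: round(score) for year, score in averages.items()}

for year in sorted(averages):
    print(f"{year}: {averages[year]:.2f} -> {rounded[year]}")
```

On that reading, every cohort rounds to the same value, and the whole spread across four years is about half a percentage point.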

A truncated graph

In their paper, Li and colleagues do present a graph to compare the average results in 2017 with (not 2018, but) 2019 and 2020, somewhat similar to the one I have reproduced here which should have made it very clear how little the scores varied between cohorts. However, Li and colleagues did not include on their axis the full range of possible scores, but rather only included a small portion of the full range – from 79.4 to 80.4.

This is a perfectly valid procedure often used in science, and it is quite explicitly done (the x-axis is clearly marked), but it does give a visual impression of a large spread of scores which could be quite misleading. In effect, their Figure 4b includes just a sliver of my graph above, as shown below. If one takes the portion of the image below that is not greyed out, and stretches it to cover the full extent of the x-axis of a graph, that is what is presented in the published account.


In the paper in J.Chem.Ed., Li and colleagues (2022) truncate the scale on their average score axis to expand 1% of the full range (approximated above in the area not shaded over) into a whole graph as their Figure 4b. This gives a visual impression of widely varying scores (to anyone who does not read the axis labels).
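The effect of the truncation is simple arithmetic, sketched here. The axis limits are those reported for Figure 4b; the assumption that scores are percentages on a 0-100 scale is not stated in the paper, and the variable names are purely illustrative.

```python
# ASSUMPTION: scores are percentages, so the full scale runs from 0 to 100.
FULL_MIN, FULL_MAX = 0.0, 100.0
# Axis limits reported for Figure 4b in Li et al. (2022).
AXIS_MIN, AXIS_MAX = 79.4, 80.4

# Fraction of the full scale that the truncated axis actually displays.
shown_fraction = (AXIS_MAX - AXIS_MIN) / (FULL_MAX - FULL_MIN)
# How much larger any difference looks once that slice fills the whole axis.
magnification = 1 / shown_fraction

print(f"Portion of full scale shown: {shown_fraction:.1%}")
print(f"Apparent magnification of differences: about {magnification:.0f}x")
```

So a difference of half a point, drawn on this axis, occupies about as much space as a difference of fifty points would on a full-scale graph.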


What might have caused those small variations?

If anyone does think that differences of a few tenths of a percent in average class scores are notable, and that this demonstrates increasing student achievement, then we might ask what causes this?

Li and colleagues seem to be convinced that the change in teaching approach caused the (very modest) increase in scores year on year. That would be possible. (Indeed, Li et al seem to be arguing that the very, very modest shift from 2017 to subsequent years was due to the change of teaching approach; but the not-quite-so-modest shifts from 2018 to 2019 to 2020 are due to developing teacher competence!) However, drawing that conclusion requires making a ceteris paribus assumption: that all other things are equal. That is, that any other relevant variables have been controlled.

Read about confounding variables

Another possibility however is simply that each year the teaching team are more familiar with the science, and have had more experience teaching it to groups at this level. That is quite reasonable and could explain why there might be a modest increase in student outcomes on a course year on year.

Non-equivalent groups of students?

However, a big assumption here is that each of the year groups can be considered to be intrinsically the same at the start of the course (and to have equivalent relevant experiences outside the focal course during the programme). Often in quasi-experimental studies (where randomisation to conditions is not possible 1) a pre-test is used to check for equivalence prior to the innovation: after all, if students are starting from different levels of background knowledge and understanding then they are likely to score differently at the end of a course – and no further explanation of any measured differences in course achievement need be sought.

Read about testing for initial equivalence

In experiments, you randomly assign the units of analysis (e.g., students) to the conditions, which gives some basis for at least comparing any differences in outcomes with the variations likely by chance. But this was not a true experiment as there was no randomisation – the comparisons are between successive year groups.
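As a rough illustration of what random assignment involves (this was not done in the study, and is often not feasible in a teaching context), here is a minimal sketch. The function name, the student identifiers and the condition labels are all hypothetical.

```python
import random

def assign_randomly(units, conditions, seed=None):
    """Shuffle the units and deal them out to the conditions in turn."""
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, unit in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(unit)
    return groups

# Hypothetical identifiers standing in for 80 students.
students = [f"student_{n:02d}" for n in range(1, 81)]
groups = assign_randomly(students, ["comparison", "innovation"], seed=2022)

for condition, members in groups.items():
    print(condition, len(members))
```

Because chance alone decides who ends up in which condition, any pre-existing differences between students are as likely to favour one group as the other, which is what gives the later statistical comparison its meaning.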

In Li and colleagues' study, the 40 students taking the class in 2017 are implicitly assumed equivalent to the 40 students taking the class in each of the years 2018-2020: but no evidence is presented to support this assumption. 3

Yet anyone who has taught the same course over a period of time knows that even when a course is unchanged and the entrance requirements stable, there are naturally variations from one year to the next. That is one of the challenges of educational research (Taber, 2019): you never can "take two identical students…two identical classes…two identical teachers…two identical institutions".

Novelty or expectation effects?

We would also have to ignore any difference introduced by the general effect of there being an innovation beyond the nature of the specific innovation (Taber, 2019). That is, students might be more attentive and motivated simply because this course does things differently to their other current courses and past courses. (Perhaps not, but it cannot be ruled out.)

The researchers are likely enthusiastic for, and had high expectations for, the innovation (so high that it seems to have biased their interpretation of the data and blinded them to the obvious problems with their argument) and much research shows that high expectation, in its own right, often influences outcomes.

Read about expectancy effects in studies

Equivalent examination questions and marking?

We also have to assume the assessment was entirely equivalent across the four years. 4 The scores were based on aggregating a number of components:

"The course score was calculated on a percentage basis: attendance (5%), preclass preview (10%), in-class group presentation (10%), postclass mind map (5%), unit tests (10%), midterm examination (20%), and final examination (40%)."

Li, et al, 2022, p.1858

This raises questions about the marking and the examinations:

  • Are the same test and examination questions used each year (that is not usually the case as students can acquire copies of past papers)?
  • If not, how were these instruments standardised to ensure they were not more difficult in some years than others?
  • How reliable is the marking? (Reliable meaning the same scores/mark would be assigned to the same work on a different occasion.)

These various issues do not appear to have been considered.

Change of assessment methodology?

The description above of how the students' course scores were calculated raises another problem. The 2017 cohort were taught by "direct instruction". This is not explained as the authors presumably think we all know exactly what that is: I imagine lectures. By comparison, in the innovation (2018-2020 cohorts):

"The preclass stage of the SCTBL strategy is the distribution of the group preview task; each student in the group is responsible for a task point. The completion of the preview task stimulates students' learning motivation. The in-class stage is a team presentation (typically PowerPoint (PPT)), which promotes students' understanding of knowledge points. The postclass stage is the assignment of team homework and consolidation of knowledge points using a mind map. Mind maps allow an orderly sorting and summarization of the knowledge gathered in the class; they are conducive to connecting knowledge systems and play an important role in consolidating class knowledge."

Li, et al, 2022, p.1856, emphasis added.

Now the assessment of the preview tasks, the in-class group presentations, and the mind maps all contributed to the overall student scores (10%, 10%, 5% respectively). But these are parts of the innovative teaching strategy – they are (presumably) not part of 'direct instruction'. So, the description of how the student class scores were derived only applies to 2018-2020, and the methodology used in 2017 must have been different. (This is not discussed in the paper.) 5

A quarter of the score for the 'experimental' groups came from assessment components that could not have been part of the assessment regime applied to the 2017 cohort. At the very least, the tests and examinations must have been more heavily weighted into the 'control' group students' overall scores. This makes it very unlikely the scores can be meaningfully directly compared from 2017 to subsequent years: if the authors think otherwise they should have presented persuasive evidence of equivalence.
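The 'quarter' can be verified directly from the weightings quoted from the paper. This is a small sketch: the grouping of components into those belonging only to the innovation follows the reasoning above (and so shares its 'presumably'), and the variable names are illustrative.

```python
# Component weights (%) quoted from Li et al. (2022, p.1858).
weights = {
    "attendance": 5,
    "preclass preview": 10,
    "in-class group presentation": 10,
    "postclass mind map": 5,
    "unit tests": 10,
    "midterm examination": 20,
    "final examination": 40,
}

# ASSUMPTION: these components belong to the SCTBL innovation and so could
# not have been assessed for the 2017 'direct instruction' cohort.
innovation_only = {
    "preclass preview",
    "in-class group presentation",
    "postclass mind map",
}

total = sum(weights.values())
innovation_share = sum(w for name, w in weights.items() if name in innovation_only)
shared_share = total - innovation_share

print(f"Total weighting: {total}%")
print(f"From innovation-only components: {innovation_share}%")
print(f"From components that could apply to every cohort: {shared_share}%")
```

How the remaining 75% was scaled up to a full course score for the 2017 cohort is not reported, so it cannot be reconstructed here.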


Li and colleagues want to convince us that variations in average course scores can be assumed to be due to a change in teaching approach – even though there are other conflating variables.

So, groups that we cannot assume are equivalent are assessed in ways that we cannot assume to be equivalent and obtain nearly identical average levels of achievement. Despite that, Li and colleagues want to persuade us that the very modest differences in average scores between the 'control' and 'experimental' groups (which are actually larger between different 'experimental group' cohorts than between the 'control' group and the successive 'experimental' cohort) are large enough to be significant and demonstrate their teaching innovation improves student achievement.

Statistical inference

So, even if we thought shifts of less than a 1% average in class achievement were telling, there are no good reasons to assume they are down to the innovation rather than some other factor. But Li and colleagues use statistical tests to tell them whether differences between the 'control' and 'experimental' conditions are significant. They find – just what anyone looking at the graph above would expect – "there is no significant difference in average score" (p.1860).
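The decision rule behind that finding can be sketched in a few lines. The three t statistics and the critical value are those reported in the paper (quoted in full in note 5); the function and variable names are hypothetical, and nothing here recomputes the tests from raw data, which are not available.

```python
# Values quoted from Li et al. (2022, pp.1858-1860).
t_statistics = {"t1": 0.0663, "t2": 0.1930, "t3": 0.3279}
T_CRITICAL = 2.024  # critical value reported by the authors (p = 0.05)

def is_significant(t_value, t_critical=T_CRITICAL):
    """A difference counts as significant only if |t| exceeds the critical value."""
    return abs(t_value) > t_critical

decisions = {name: is_significant(t) for name, t in t_statistics.items()}
conclusion_supported = any(decisions.values())

for name, t in t_statistics.items():
    print(f"{name} = {t}: significant? {decisions[name]}")
print("Any significant difference found:", conclusion_supported)
```

None of the reported statistics comes anywhere near the critical value: the largest is less than a fifth of it.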

The scientific convention in using such tests is that the choice of test, and confidence level (e.g., a probability of p<0.05 to be taken as significant) is determined in advance, and the researchers accept the outcomes of the analysis. There is a kind of contract involved – a decision to use a statistical test (chosen in advance as being a valid way of deciding the outcome of an experiment) is seen as a commitment to accept its outcomes. 2 This is a form of honesty in scientific work. Just as it is not acceptable to fabricate data, nor is it acceptable to ignore experimental outcomes when drawing conclusions from research.

Special pleading is allowed in mitigation (e.g., "although our results were non-significant, we think this was due to the small sample sizes, and suggest that further research should be undertaken with larger groups {and we are happy to do this if someone gives us a grant}"), but the scientist is not allowed to simply set aside the results of the analysis.


Li and colleagues found no significant difference between the two conditions, yet that did not stop them claiming, and the Journal of Chemical Education publishing, a conclusion that the new teaching approach improved student achievement!

Yet setting aside the results of their analysis is what Li and colleagues do. They carry out an analysis, then simply ignore the findings, and conclude the opposite:

"To conclude, our results suggest that the SCTBL method is an effective way to improve teaching quality and student achievement."

Li, et al, 2022, p.1861

It was this complete disregard of scientific values, rather than the more common failure to appreciate that they were not comparing like with like, that I found really shocking – and led to me writing a formal letter to the journal. Not so much surprise that researchers might do this (I know how intoxicating research can be, and how easy it is to become convinced of one's ideas) but that the peer reviewers for the Journal of Chemical Education did not make the firmest recommendation to the editor that this manuscript could NOT be published until it was corrected so that the conclusion was consistent with the findings.

This seems a very stark failure of peer review, and allows a paper to appear in the literature that presents a conclusion totally unsupported by the evidence available and the analysis undertaken. This also means that Li, Ouyang, Xu and Zhang now have a publication on their academic records that any careful reader can see is critically flawed – something that could have been avoided had peer reviewers:

  • used their common sense to appreciate that variations in class average scores from year to year between 79.8 and 80.3 could not possibly be seen as sufficient to indicate a difference in the effectiveness of teaching approaches;
  • recommended that the authors follow the usual scientific norms and adopt the reasonable scholarly value position that the conclusion of your research should follow from, and not contradict, the results of your data analysis.


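Work cited:

  • Li, W., Ouyang, Y., Xu, J., & Zhang, P. (2022). Implementation of the Student-Centered Team-Based Learning Teaching Method in a Medicinal Chemistry Curriculum. Journal of Chemical Education, 99(5), 1855-1862. https://doi.org/10.1021/acs.jchemed.1c00978
  • Taber, K. S. (2019). Experimental research into teaching innovations: responding to methodological and ethical challenges. Studies in Science Education, 55(1), 69-119. https://doi.org/10.1080/03057267.2019.1658058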

Notes

1 Strictly the 2017 cohort has the role of a comparison group, but NOT a control group as there was no randomisation or control of variables, so this was not a true experiment (but a 'quasi-experiment'). However, for clarity, I am here using the original authors' term 'control group'.

Read about experimental research design


2 Some journals are now asking researchers to submit their research designs and protocols to peer review BEFORE starting the research. This prevents wasted effort on work that is flawed in design. Journals will publish a report of the research carried out according to an accepted design – as long as the researchers have kept to their research plans (or only made changes deemed necessary and acceptable by the journal). This prevents researchers seeking to change features of the research because it is not giving the expected findings and means that negative results as well as positive results do get published.


3 'Implicitly' assumed as nowhere do the authors state that they think the classes all start as equivalent – but if they do not assume this then their argument has no logic.

Without this assumption, their argument is like claiming that growing conditions for tree development are better at the front of a house than at the back because on average the trees at the front are taller – even though fast-growing mature trees were planted at the front and slow-growing saplings at the back.


4 From my days working with new teachers, a common rookie mistake was assuming that one could tell a teaching innovation was successful because students achieved an average score of 63% on the (say, acids) module taught by the new method when the same class only averaged 46% on the previous (say, electromagnetism) module. Graduate scientists would look at me with genuine surprise when I asked how they knew the two tests were of comparable difficulty!

Read about why natural scientists tend to make poor social scientists


5 In my (rejected) letter to the Journal of Chemical Education I acknowledged some ambiguity in the paper's discussion of the results. Li and colleagues write:

"The average scores of undergraduates majoring in pharmaceutical engineering in the control group and the experimental group were calculated, and the results are shown in Figure 4b. Statistical significance testing was conducted on the exam scores year to year. The average score for the pharmaceutical engineering class was 79.8 points in 2017 (control group). When SCTBL was implemented for the first time in 2018, there was a slight improvement in the average score (i.e., an increase of 0.11 points, not shown in Figure 4b). However, by 2019 and 2020, the average score increased by 0.32 points and 0.54 points, respectively, with an obvious improvement trend. We used a t test to test whether the SCTBL method can create any significant difference in grades among control groups and the experimental group. The calculation results are shown as follows: t1 = 0.0663, t2 = 0.1930, t3 =0.3279 (t1 <t2 <t3 <t𝛼, t𝛼 =2.024, p>0.05), indicating that there is no significant difference in average score. After three years of continuous implementation of SCTBL, the average score showed a constant upward trend, and a steady increase was found. The SCTBL method brought about improvement in the class average, which provides evidence for its effectiveness in medicinal chemistry."

Li, et al, 2022, p.1858-1860, emphasis added

This appears to refer to three distinct measures:

  • average scores (produced by weighted summations of various assessment components as discussed above)
  • exam scores (perhaps just the "midterm examination…and final examination", or perhaps just the final examination?)
  • grades

Formal grades are not discussed in the paper (the word is only used in this one place), although the authors do refer to categorising students into descriptive classes ('levels') according to scores on 'assessments', and may see these as grades:

"Assessments have been divided into five levels: disqualified (below 60), qualified (60-69), medium (70-79), good (80-89), and excellent (90 and above)."

Li, et al, 2022, p.1856, emphasis added

In the longer extract above, the reference to testing difference in "grades" is followed by reporting the outcome of the test for "average score":

"We used a t test to test …grades …The calculation results … there is no significant difference in average score"

As Student's t-test was used, it seems unlikely that the assignment of students to grades could have been tested. That would surely have needed something like the Chi-squared statistic to test categorical data – looking for an association between (i) the distributions of the number of students in the different cells 'disqualified', 'qualified', 'medium', 'good' and 'excellent'; and (ii) treatment group.

Presumably, then, the statistical testing was applied to the average course scores shown in the graph above. This also makes sense because the classification into descriptive classes loses some of the detail in the data and there is no obvious reason why the researchers would deliberately choose to test 'reduced' data rather than the full data set with the greatest resolution.
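The loss of resolution can be illustrated with a short sketch of the banding quoted from the paper. The function name is hypothetical, the individual scores are invented purely to show the effect, and treating each band as running up to (but not including) the next boundary is an assumption, since the paper does not say how a score such as 69.5 would be classed.

```python
def assessment_level(score):
    """Map a score to the level bands quoted from Li et al. (2022, p.1856).

    ASSUMPTION: bands are half-open, e.g. 'qualified' covers 60 <= score < 70.
    """
    if score < 60:
        return "disqualified"
    if score < 70:
        return "qualified"
    if score < 80:
        return "medium"
    if score < 90:
        return "good"
    return "excellent"

# Hypothetical individual scores, chosen only to illustrate the banding.
example_scores = [80.0, 84.5, 89.9]
levels = [assessment_level(s) for s in example_scores]

for score, level in zip(example_scores, levels):
    print(f"{score} -> {level}")
```

Scores nearly ten points apart land in the same level, so shifts of a few tenths of a point in a class average would usually leave the distribution across levels untouched.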


Methodological and procedural flaws in published study

A letter to the editor of the Journal of Chemical Education

the authors draw a conclusion which is contrary to the results of their data analysis and so is invalid and misleading

I have copied below the text of a letter I wrote to the editor of the Journal of Chemical Education, to express my concern about the report of a study published in that journal. I was invited to formally submit the letter for consideration for publication. I did. Following peer review it was rejected.

Often when I see apparent problems in published research, I discuss them here. Usually, the journals concerned are predatory, and do not seem to take peer review seriously. That does not apply here. The Journal of Chemical Education is a long-established, well-respected, periodical published by a national learned scientific society: the American Chemical Society. Serious scientific journals often do publish comments from readers about published articles and even exchanges between correspondents and the original authors of the work commented on. I therefore thought it was more appropriate to express my concerns directly to the journal. 𝛂 On this occasion, after peer review, the editor decided my letter was not suitable for publication. 𝛃

I am aware of the irony – I am complaining about an article which passed peer review in a posting which is publishing a letter submitted, but rejected, after peer review. Readers should bear that in mind. The editor will have carefully considered the submission and the referee recommendations and reports, and decided to decline publication based on journal policy and the evaluation of my submission.

However, having read the peer reviewers' comments (which were largely positive about the submission and tended to agree with my critique 𝜸), I saw no reason to change my mind. If such work is allowed to stand in the literature without comment, it provides a questionable example for other researchers, and, as the abstracts and conclusions from research papers are often considered in isolation (so, here, without being aware that the conclusions contradicted the results), it distorts the research literature.

To my reading, the published study sets aside accepted scientific standards and values – though I very much suspect inadvertently. Perhaps the authors' enthusiasm for their teaching innovation affected their judgement and dulled their critical faculties. We are all prone to that: but one would normally expect such a major problem to have been spotted in peer review, allowing the authors the opportunity to put this right before publication.

Read about falsifying research conclusions


Methodological and procedural flaws in published study

Abstract

A recent study reported in the journal is presented as an experimental test of a teaching innovation. Yet the research design does not meet the conditions for an experiment as there is insufficient control of variables and no random assignment to conditions. The study design used does not allow a comparison of student scores in the 'experimental' and 'control' conditions to provide a valid test of the innovation. Moreover, the authors draw a conclusion which is contrary to the results of their data analysis and so is invalid and misleading. While the authors may well feel justified in putting aside the outcome of their statistical analysis, this goes against good scientific norms and practice.

Dear Editor

I am writing regarding a recent article published in J.Chem.Ed. 1, as I feel the reporting of this study, as published, is contrary to good scientific practice. The article, 'Implementation of the Student-Centered Team-Based Learning Teaching Method in a Medicinal Chemistry Curriculum' reports an innovation in pedagogy, and as such is likely to be of wide interest to readers of the journal. I welcome both this kind of work in developing pedagogy and its reporting to inform others; however, I think the report contravenes normal scientific standards.

Although the authors do not specify the type of research methodology they use, they do present their analysis in terms of 'experimental' and 'control' groups (e.g., p.1856), so it is reasonable to consider they see this as a kind of experimental research. There are many serious challenges when applying experimental method to social research, and it is not always feasible to address all such challenges in educational research designs 2 – but perhaps any report of educational experimental research should acknowledge relevant limitations.

A true experiment requires units of analysis (e.g., students) to be assigned to conditions randomly, as this can avoid (or, strictly, reduce the likelihood of) systematic differences between groups. Here the comparison is across different cohorts. These may be largely similar, but that cannot just be assumed. (Strictly, the comparison group should not be labelled as a 'control' group. 2) There is clearly a risk of conflating variables.

  • Perhaps admission standards are changing over time?
  • Perhaps the teaching team has been acquiring teaching experience and expertise over time regardless of the innovation?

Moreover, if I have correctly interpreted the information on p.1858 about how student course scores after the introduction of the innovation in part derived from the novel activities in the new approach, then there is no reason to assume that the methodology of assigning scores is equivalent with that used in the 'control' (comparison) condition. The authors seem to simply assume the change in scoring methodology will not of itself change the score profile. Without evidence that assessment is equivalent across cohorts, this is an unsupportable assumption.

As it is not possible to 'blind' teachers and students to conditions there is a very real risk of expectancy effects which have been shown to often operate when researchers are positive about an innovation – when introducing the investigated innovation, teachers

  • may have a new burst of enthusiasm,
  • perhaps focus more than usual on this aspect of their work,
  • be more sensitive to students' responses to teaching and so forth.

(None of this needs to be deliberate to potentially influence outcomes.) Although (indeed, perhaps because) there is often little that can be done in a teaching situation to address these challenges to experimental designs, it seems appropriate for suitable caveats to be included in a published report. I would have expected to have seen such caveats here.

However, a specific point that I feel must be challenged is in the presentation of results on p.1860. When designing an experiment, it is important to specify before collecting data how one will know what to conclude from the results. The adoption of inferential statistics is surely a commitment to accepting the outcomes of the analysis undertaken. Li and colleagues tell readers that "We used a t test to test whether the SCTBL method can create any significant difference in grades among control groups and the experimental group" and that "there is no significant difference in average score". This is despite the new approach requiring an "increased number of study tasks, and longer preclass preview time" (pp.1860-1).

I would not suggest this is necessarily a good enough reason for Li and colleagues to give up on their innovation, as they have lived experience of how it is working, and that may well offer good grounds for continuing to implement, refine, and evaluate it. As the authors themselves note, evaluation "should not only consider scores" (p.1858).

However, from a scientific point of view, this is a negative result. That certainly should not exclude publication (it is recognised that there is a bias against publishing negative results which distorts the literature in many fields) but it suggests, at the very least, that more work is needed before a positive conclusion can be drawn.

Therefore, I feel it is scientifically invalid for the authors to argue that as "the average score showed a constant [i.e., non-significant] upward trend, and a steady [i.e., non-significant] increase was found" they can claim their teaching "method brought about improvement in the class average, which provides evidence for its effectiveness in medicinal chemistry". Figure 4 reiterates this: a superficially impressive graphic, even if it omits the 2018 data, actually shows just how little scores changed when it is noticed that the x-axis has a range only from 79.4-80.4 (%, presumably). The size of the variation across four cohorts (<1%, "an obvious improvement trend"?) is not only found to not be significant but can be compared with how 25% of student scores apparently derived from different types of assessment in the different conditions. 3

To reiterate, this is an interesting study, reporting valuable work. There might be very good reasons to continue the new pedagogic approach even if it does not increase student scores. However, I would argue that it is simply scientifically inadmissible to design an experiment where data will be analysed by statistical tests, and then to offer a conclusion contrary to the results of those tests. A reader who skipped to the end of the paper would find "To conclude, our results suggest that the SCTBL method is an effective way to improve teaching quality and student achievement" (p.1861) but that is to put aside the results of the analysis undertaken.


Keith S. Taber

Emeritus Professor of Science Education, University of Cambridge

References

1 Li, W., Ouyang, Y., Xu, J., & Zhang, P. (2022). Implementation of the Student-Centered Team-Based Learning Teaching Method in a Medicinal Chemistry Curriculum. Journal of Chemical Education, 99(5), 1855-1862. https://doi.org/10.1021/acs.jchemed.1c00978

2 Taber, K. S. (2019). Experimental research into teaching innovations: responding to methodological and ethical challenges. Studies in Science Education, 55(1), 69-119. https://doi.org/10.1080/03057267.2019.1658058

3 I felt there was some ambiguity regarding what figures 4a and 4b actually represent. The labels suggest they refer to "Assessment levels of pharmaceutical engineering classes [sic] in 2017-2020" and "Average scores of the medicinal chemistry course in the control group and the experimental group" (which might, by inspection, suggest that achievement on the medicinal chemistry course is falling behind shifts across the wider programme), but the references in the main text suggest that both figures refer only to the medicinal chemistry course, not the wider pharmaceutical engineering programme. Similarly, although the label for (b) refers to 'average scores' for the course, the text suggests the statistical tests were only applied to 'exam scores' (p.1858) which would only amount to 60% of the marks comprising the course scores (at least in 2018-2020; the information on how course scores were calculated for the 2017 cohort does not seem to be provided but clearly could not follow the methodology reported for the 2018-2020 cohorts). So, given that (a) and (b) do not seem consistent, it may be that the 'average scores' in (b) refers only to examination scores and not overall course scores. If so, that would at least suggest the general assessment methodology was comparable, as long as the setting and marking of examinations are equivalent across different years. However, even then, a reader would take a lot of persuasion that examination papers and marking are so consistent over time that changes of a third or half a percentage point between cohorts exceeds likely measurement error.


Read: Falsifying research conclusions. You do not need to falsify your results if you are happy to draw conclusions contrary to the outcome of your data analysis.


Notes:

𝛂 This is the approach I have taken previously. For example, a couple of years ago a paper was published in the Royal Society of Chemistry's educational research journal, Chemistry Education Research and Practice, which to my reading had similar issues, including claiming "that an educational innovation was effective despite outcomes not reaching statistical significance" (Taber, 2020).

Taber, K. S. (2020). Comment on "Increasing chemistry students' knowledge, confidence, and conceptual understanding of pH using a collaborative computer pH simulation" by S. W. Watson, A. V. Dubrovskiy and M. L. Peters, Chem. Educ. Res. Pract., 2020, 21, 528. Chemistry Education Research and Practice. doi:10.1039/D0RP00131G


𝛃 I wrote directly to the editor, Prof. Tom Holme, on 12th July 2022. I received a reply the next day, inviting me to submit my letter through the journal's manuscript submission system. I did this on the 14th.

I received the decision letter on 15th September. (The "manuscript is not suitable for publication in the Journal of Chemical Education in its present form.") The editor offered to consider a resubmission of "a thoroughly rewritten manuscript, with substantial modification, incorporating the reviewers' points and including any additional data they recommended". I decided that, although I am sure the letter could have been improved in some senses, any new manuscript sufficiently different to be considered a "thoroughly rewritten manuscript, with substantial modification" would not so clearly make the important points I felt needed to be made.


𝜸 There were four reviewers. The editor informed me that the initial reviews led to a 'split' perspective, so a fourth referee was invited.

  • Referee 1 recommended that the letter was published as submitted.
  • Referee 2 recommended that the letter was published as submitted.
  • Referee 3 recommended major revisions should be undertaken.
  • Referee 4 recommended rejection.

Read more about peer review and editorial decisions

Are these fossils dead, yet?

Non-living fossils and dead metaphors


Keith S. Taber


Fossil pottery?
(Images by Laurent Arroues {background} and OpenClipart-Vectors from Pixabay)


I was intrigued by some dialogue that was part of one of (physicist) Jim Al-Khalili's interviews for the BBC's 'The Life Scientific' series, where Prof. Al-Khalili "talks to leading scientists about their work, finding out what inspires and motivates them and asking what their discoveries might do for mankind".


The Life Scientific – interviews with scientists about their lives and work

This week he was talking to Dr Judith Bunbury of St. Edmund's College and the Department of Earth Sciences at Cambridge ('Judith Bunbury on the shifting River Nile in the time of the Pharaohs'). It was a fascinating interview, and in particular discussed work showing how the Nile River has repeatedly changed its course over thousands of years. The Nile is considered the longest river in Africa (and possibly the world – the other contender being the Amazon).


Over time the river shifts its position as it unevenly lays down sediment and erodes the river banks. (Image by Makalu from Pixabay)

The exchange that especially piqued my interest followed an account of the diverse material recovered in studies that sample the sediments formed by the river. As sediments are laid down over time, a core (collected by an auger) can be understood to have formed on a time-line – with the oldest material at the bottom of the sample.

Within the sediment, researchers find fragments of animal bone, human teeth, pottery, mineral shards from the working of jewels…


"Are you sure the Nile flows this far?" Using an auger to collect a core (of ice in this case) (Image by David Mark from Pixabay)

Dr Bunbury was talking about how changing fashions allowed the pottery fragments to be useful in dating material – or as the episode webpage glossed this: "pottery fragments which can be reliably time-stamped to the fashion-conscious consumers in the reign of individual Pharaohs".

This is my transcription of the exchange:

[JAK]: …a bit like fossil hunting
[JB]: well, I mean, we're just treating pottery as a kind of fossil
a kind of fossil, yeah, > no, absolutely >
< and it is a fossil <
yes, well quite, I can see the similarities.

Prof. Jim Al-Khalili interviewing Dr Judith Bunbury

Now Prof. Jim has a very gentle, conversational, interview style, as befits a programme with extended interviews with scientists talking about their lives (unlike, say, a journalist faced with a politician where a more adversarial style might be needed), so this exchange probably comes as close to a disagreement or challenge as 'The Life Scientific' gets. Taking a slight liberty, I might represent this as:

  • Al-Khalili: your work is like fossil hunting, the pottery fragments are similar to fossils
  • Bunbury: no, they ARE fossils

So, here we have an ontological question: are the pottery fragments recovered in archaeological digs (actually) fossils or not?

Bunbury wants to class the finds as fossils.

Al-Khalili thinks that in this context 'a kind of fossil' and 'like fossil hunting' are similes ("I can see the similarities") – the finds are somewhat like fossils, but are not fossils per se.

Read about science similes

So, who is right?

Metaphorical fossils

The term fossil is commonly used in metaphorical ways. For example, for a person to be described as a fossil is to be characterised as a kind of anachronism that has not kept up with social changes.

The term also seems to have been adopted in some areas of science as a kind of adjective. One place it is used is in relation to evidence of dampened ocean turbulence,

"The term 'fossil turbulence' refers to remnants of turbulence in fluid which is no longer turbulent."

Gibson, 1980, p.221

If that seems like a contradiction, it is explained that

"Small scale fluctuations of temperature, salinity, and vorticity in the ocean occur in isolated patches apparently caused by bursts of active turbulence. After the turbulence has been dampened by stable stratification the fluctuations persist as fossil turbulence."

Gibson, 1980, p.221

So, 'fossil turbulence' is not actually turbulence, but more the afterglow of the turbulence, a bit like the aftermath of a lively party which leaves its traces: the chaotic pattern of abandoned debris provides signs there has been a party, although there is clearly no longer a party going on.


An analogy for 'fossil turbulence'

Another example, this time from astronomy, is fossil groups of galaxies, which are apparently "systems with a very luminous X-ray source …and a very optically dominant central galaxy" (Kanagusuku, Díaz-Giménez & Zandivarez, 2016). It seems,

"The true nature of fossil groups in the Universe still puzzles the astronomical community. These peculiar systems are one of the most intriguing places in the Universe where giant elliptical galaxies are hosted [sic]."

Kanagusuku, Díaz-Giménez & Zandivarez, 2016

('Hosted' here also seems metaphorical – who or what could be acting as a host to an elliptical galaxy?)

The term 'fossil group' was introduced "for an apparently isolated elliptical galaxy surrounded by an X-ray halo, with an X-ray luminosity typical of a group of galaxies" (Zarattini, Biviano, Aguerri, Girardi & D'Onghia, 2021): so, something that looks like a single galaxy, but in other respects resembles a whole group of galaxies?

Close examination might reveal other galaxies present, yet the 'fossil' group is "distinguished by a large gap between the brightest galaxy and the fainter members" (Dariush, Khosroshahi, Ponman, Pearce, Raychaudhury & Hartley, 2007). Of course, there is normally a 'large gap' between any two galaxies (space contains a lot of, well, space), but presumably this is another metaphor – there is a 'gap' between the magnitude of the luminosity of the brightest galaxy, and the magnitudes of the luminosities of the others.

Read about science metaphors

Dead metaphors

One way in which language changes over time is through the (metaphorical) death of metaphors. Terms that are initially introduced as metaphors sometimes get generally adopted and over time become accepted terminology.

Many words in use today were originally coined in this way, and often people are quite unaware of their origins. References to the hands of a clock or watch will these days be taken as simply a technical term (or perhaps, for those who are only familiar with digital clocks, a complete mystery?). In time, this may happen to 'fossil turbulence' or 'fossil galaxy groups'.

What counts as a fossil?

But it seems reasonable to suggest that, currently at least, these are still metaphors, implying that in some sense the ocean fluctuations or the galactic groups are somewhat like fossils. But these are not actual fossils, just as tin-pot dictators are not actually fabricated from tin.

So, what are actual fossils? The 'classic' fossil takes the form of an ancient, often extinct, living organism, or a part thereof, but composed of rock which has over time replaced the original organic material. In this sense, Prof. Al-Khalili seems correct in suggesting bits of pottery are only akin to fossils, and not actually fossils. But is that how the experts use the term?

According to the British Geological Survey (BGS):

Fossils are the preserved remains of plants and animals whose bodies were buried in sediments, such as sand and mud, under ancient seas, lakes and rivers. Fossils also include any preserved trace of life that is typically more than 10 000 years old. 

https://www.bgs.ac.uk/discovering-geology/fossils-and-geological-time/fossils/ 1

Now, pottery is not the preserved remains of plants or animals or other living organisms, but the site goes on to explain,

Preserved evidence of the body parts of ancient animals, plants and other life forms are called 'body fossils'. 'Trace fossils' are the evidence left by organisms in sediment, such as footprints, burrows and plant roots.

https://www.bgs.ac.uk/discovering-geology/fossils-and-geological-time/fossils 1

So, footprints, burrows, [evidence of] plant roots 2…or shards of pottery…can be trace fossils? After all, unearthed pottery is indirect evidence of living human creatures having been present in the environment, and, as the BGS also points out, "the word fossil is derived from the Latin fossilis meaning 'unearthed'."

However, if the term originally simply meant something unearthed, then although the bits of pot would count as fossils, by the same argument so would potatoes growing in farmers' fields. So, clearly the English word 'fossil' has a more specific meaning in common use than its Latin ancestor.

But going by the BGS definition, Dr Bunbury's unearthed samples of pottery are certainly evidence of organisms left in sediment, so might be considered fossils. These fossils are not the remains of dead organisms, but neither is 'fossil' here simply a metaphor (not even a dead metaphor).


Work cited:
  • Dariush, A., Khosroshahi, H. G., Ponman, T. J., Pearce, F., Raychaudhury, S. & Hartley, W. (2007), The mass assembly of fossil groups of galaxies in the Millennium simulation, Monthly Notices of the Royal Astronomical Society, Volume 382, Issue 1, 21 November 2007, Pages 433-442, https://doi.org/10.1111/j.1365-2966.2007.12385.x
  • Gibson, Carl H. (1980) Fossil Temperature, Salinity, and Vorticity Turbulence in the Ocean. In Jacques C.J. Nihoul (Ed.) Marine Turbulence, Elsevier, pp. 221-257.
  • Kanagusuku, María José, Díaz-Giménez, Eugenia & Zandivarez, Ariel (2016) Fossil groups in the Millennium simulation – From the brightest to the faintest galaxies during the past 8 Gyr, Astronomy & Astrophysics, 586 (2016) A40, https://doi.org/10.1051/0004-6361/201527269.
  • Romero, I. C., Nuñez Otaño, N. B., Gibson, M. E., Spears, T. M., Fairchild, C. J., Tarlton, L., . . . O'Keefe, J. M. K. (2021). First Record of Fungal Diversity in the Tropical and Warm-Temperate Middle Miocene Climate Optimum Forests of Eurasia [Original Research]. Frontiers in Forests and Global Change, 4. https://doi.org/10.3389/ffgc.2021.768405
  • Zarattini, S., Biviano, A., Aguerri, J. A. L., Girardi, M. & D'Onghia, E. (2021) Fossil group origins – XI. The dependence of galaxy orbits on the magnitude gap, Astronomy & Astrophysics, 655 (2021) A103, https://doi.org/10.1051/0004-6361/202038722.

Notes:

1 "Fossils are the preserved remains of plants and animals whose bodies …". But this suggests that fungi do not form fossils. The same site points out that "We tend to think of fungi, such as mushrooms and toadstools, as being plants — but they are not. They neither grow from embryos nor photosynthesise and are put in a separate kingdom" (https://www.bgs.ac.uk/discovering-geology/fossils-and-geological-time/plants-2/) – yet does not seem to mention any examples of fungi that have been fossilised (so the comment could be read to be meant to suggest that fossil fungi are found as well as fossil plants; but could equally well be read to mean that as fungi are not plants they do not fossilise).

The second quote here is more inclusive: "Preserved evidence of the body parts of ancient animals, plants and other life forms…" The site does also specify that "Remains can include microscopically small fossils, such as single-celled foraminifera…" (https://www.bgs.ac.uk/discovering-geology/fossils-and-geological-time/fossils/).

So, just to be clear, fossil fungi have been found.




Fungal spores found in Thailand – figure 3 from Romero et al., 2021. These fossils were recovered from lignite (a form of coal) deposited in the Miocene epoch.
Copyright © 2021 Romero, Nuñez Otaño, Gibson, Spears, Fairchild, Tarlton, Jones, Belkin, Warny, Pound and O'Keefe; distributed under the terms of the Creative Commons Attribution License (CC BY).

2 If the roots were themselves fossilised then these would surely be body fossils, as roots are parts of a plant. Presumably this is meant to refer to the channels left in the soil where roots have grown through it.