Misconceptions of change

It may be difficult to know what counts as an alternative conception in some topics – and sometimes research does not make it any clearer


Keith S. Taber


If a reader actually thought the researchers themselves held these alternative conceptions then one could have little confidence in their ability to distinguish between the scientific and alternative conceptions of others

I recently published an article here where I talked in some detail about some aspects of a study (Tarhan, Ayyıldız, Ogunc & Sesen, 2013) published in the journal Research in Science and Technological Education. Despite having a somewhat dodgy title 1, this is a well respected journal published by a serious publisher (Routledge/Taylor & Francis). I read the paper because I was interested in the pedagogy being discussed (jigsaw learning), but what prompted me to then write about it was the experimental design: setting up a comparison between a well-tested active learning approach and lecture-based teaching. A teacher experienced in active learning techniques taught a control group of twelve-year-old pupils through a 'traditional' teaching approach (giving the children notes, setting them questions…) as a comparison condition for a teaching approach based on engaging group-work.

The topic being studied by the sixth-grade (elementary school) students was physical and chemical changes.

I did not discuss the outcomes of the study in that post as my focus there was on the study as possibly being an example of rhetorical research (i.e., a demonstration set up to produce a particular outcome, rather than an open-ended experiment to genuinely test a hypothesis), and I was concerned that the control conditions involved deliberately providing sub-optimal, indeed sub-standard, teaching to the learners assigned to the comparison condition.

Read 'Didactic control conditions. Another ethically questionable science education experiment?'

Identifying alternative conceptions

The researchers actually tested the outcome of their experiment in two ways (as well as asking students in the experimental condition about their perceptions of the lessons), a post-test taken by all students, and "ten-minute semi-structured individual interviews" with a sample of students from each condition.

Analysis of the post-test allowed the researchers to identify the presence of students' alternative conceptions ('misconceptions'2) related to chemical and physical change, and the identified conceptions are reported in the study. Interviewees were purposively selected,

"Ten-minute semi-structured individual interviews were carried out with seven students from the experimental group and 10 students from the control group to identify students' understanding of physical and chemical changes by acquiring more information about students' unclear responses to [the post-test]. Students were selected from those who gave incorrect, partially correct and no answers to the items in the test. During the interviews, researchers asked the students to explain the reasons for their answers to the items."

Tarhan et al., 2013, p.188

I was interested to read about the alternative conceptions they had found for several reasons:

  1. I have done research into student thinking, and have written a lot about alternative conceptions, so the general topic interests me;
  2. More specifically, it is interesting to compare what researchers find in different educational contexts, as this gives some insight into the origins and developments of such conceptions;
  3. Also, I think the 'chemical and physical changes' distinction is actually a very problematic topic to teach. (Read about a free classroom resource to explore learners' ideas about physical and chemical changes.)

In this post I am going to question whether the authors' claims in their research report about some of the alternative conceptions they reported finding are convincing. First, however, I should explain the second point here.

Cultural variations in alternative conceptions

Some alternative conceptions seem fairly universal, being identified in populations all around the world. These may primarily be responses to common experiences of the natural world. An obvious example relates to Newton's first law (the law of inertia): we learn from very early experience, before we even have language to talk about our experiences, that objects that we push, throw, kick, toss, pull… soon come to a stop. They do not move off in a straight line and continue indefinitely at a constant speed.

Of course, that experience is not actually contrary to Newton's first law (as various forces are acting on the objects concerned), but it presents a consistent pattern (objects initially move off, but soon slow and stop) that becomes part of our intuitions about the world and so makes learning the scientific law seem counter-intuitive, and so more difficult to accept and apply when taught in school.

Read about the challenge of learning Newton's first law

By contrast, no one has ever tested Newton's first law directly by seeing what happens under the ideal conditions under which it would apply (see 'Poincaré, inertia, and a common misconception').

Other alternative conceptions may be less universal: some may be, partially at least, due to an aspect of local cultural context (e.g. folk knowledge, local traditions), the language of instruction, the curriculum or teaching scheme, or even a particular teacher's personal way of presenting material.

So, to the extent that there are some experiences that are universal for all humans, due to commonalities in the environment (e.g., to date at least, all members of the species have been born into an environment with a virtually constant gravitational field and a nitrogen-rich atmosphere of about 1 atmosphere pressure {i.e., c. 10⁵ Pa} and about 21% oxygen content), there is a tendency for people everywhere (on earth) to develop the same alternative conceptions.

And, conversely, to the extent that people in different institutional, social, and cultural contexts have contrasting experiences, we would expect some variations in the levels of incidence of some alternative conceptions across populations.

"Some common ideas elicited from children are spread, at least in part, through informal learning in everyday "life-world" contexts. Through such processes youngsters are inducted into the beliefs of their culture. Ideas that are common in a culture will not usually contradict everyday experience, but clearly beliefs may develop and be disseminated without matching formal scientific knowledge. …

Where life-world beliefs are relevant to school science – perhaps contradicting scientific principles, perhaps apparently offering an explanation of some science taught in school; perhaps appearing to provide familiar examples of taught principles – then it is quite possible, indeed likely, that such prior beliefs will interfere with the learning of school science. …

Different common beliefs will be found among different cultural groups, and therefore it is likely that the same scientific concepts will be interpreted differently among different cultural groups as they will be interpreted through different existing conceptual frameworks."

Taber, 2012a, pp.5-6

As a trivial example, the National Curriculum for primary age children in England erroneously describes some materials that are mixtures as being substances. These errors have persisted for some years, as the government department does not think them important enough to merit the effort of correction. Assuming many primary school teachers (who are usually not science specialists, though some are of course) trust the flawed information in the official curriculum, we might expect more secondary school students in England, than in other comparable populations, to later demonstrate alternative conceptions in relation to the critical concept of a chemical substance.

"This suggests that studies from different contexts (e.g., different countries, different cultures, different languages of instruction, and different curriculum organisations) should be encouraged for what they can tell us about the relative importance of educational variables in encouraging, avoiding, overcoming, or redirecting various types of ideas students are known to develop."

Taber, 2012a, p.9
The centrality of language

Language of instruction may sometimes be important. Words that supposedly are translated from one language to another may actually have different nuances and associations. (In English, it is clearly an alternative conception to think the chemical elements still exist in a compound, but the meaning of the French élément chimique seems to include the 'essence' of an element that does continue into the compound.)

Research in different educational contexts can in principle help unravel some of this: in principle, as it does require the various researchers to detail aspects of the teaching contexts and cultural contexts from which they report, as well as the students' ideas (Taber, 2012a).

Chemical and physical change

Teaching about chemical and physical change is a traditional topic in school science and chemistry courses. It is one of those dichotomies that is understandably introduced in simple terms, and so offers a simplification that may need to be 'unlearnt' later:

[a change is] chemical change or physical change

[an element is] metal or non-metal

[a chemical bond is] ionic bonding or covalent bonding

There are some common distinctions often made to support this discrimination into two types of change:


Table 1.2 from Teaching Secondary Chemistry (2nd ed) (Taber, 2012b)

However, a little thought suggests that such criteria are not especially useful in supporting the school student making observations, and indeed some of these criteria simply do not stand up to close examination.2

"the distinction between chemical and physical changes is a rather messy one, with no clear criteria to help students understand the difference"

Taber, 2012b, p.33


So, I was especially interested to know what Tarhan and colleagues had found.

Methodological 'small print'

In reading any study, a consideration of the findings has to be tempered by an understanding of how the data were collected and analysed. Writing up research reports for journals can be especially challenging, as referees and editors may well criticise missing details they feel should be reported, yet journals often impose word limits on articles.

Currently (2023) this particular journal tells potential authors that "A typical paper for this journal should be between 7000 and 8000 words" which is a little more generous than some other journals. However, Tarhan and colleagues do not fully report all aspects of their study. This may in part be because they need quite a lot of space to describe the experimental teaching scheme (six different jigsaw learning activities).

Whatever the reason:

  • the authors do not provide a copy of the post-test which elicited the responses that were the basis of the identified alternative conceptions; and
  • nor do they explain how the analysis to identify conceptions was undertaken – to show how student responses were classified;
  • similarly, there are no quotations from the interview dialogue to illustrate how the researchers interpreted student comments.

Data analysis is the process of researchers interpreting data so they become evidence for their findings, and generally research journals expect the process to be detailed – but here the reader is simply told,

"Students' understanding of physical and chemical changes was identified according to the post-test and the individual interviews after the process."

Tarhan et al., 2013, p.189

'Misconceptions'

In their paper, Tarhan and colleagues use the term 'misconception' which is often considered a synonym for 'alternative conception'. Commonly, conceptions are referred to as alternative if they are judged to be inconsistent with canonical concepts.

Read about alternative conceptions

Although the term 'misconception' is used 32 times in the paper (not counting instances in the reference list), the term is not explained in the text, presumably because it is assumed that all those working in science education know (and agree) what it means. This is not at all unusual. I once wrote about another study:

"[The] qualities of misconceptions are largely assumed by the author and are implicit in what is written…It could be argued that research reports of this type suggest the reported studies may themselves be under-theorised, as rather well-defined technical procedures are used to investigate foci that are themselves only vaguely characterised, and so the technical procedures are themselves largely operationalised without explicit rationale."

Taber, 2013, p.22

Unfortunately, in Tarhan and colleagues' study the technical procedures by which data were analysed to identify 'misconceptions' are not well defined, leaving the reader with limited grounds for confidence that what are reported are worthy of being described as student conceptions – and are not just errors or guesses made on the test. Our thinking is private, and never available directly to others, and, so, can only be interpreted from the presentations we make to represent our conceptions in a public (shared) space. Sometimes we mis-speak, or we mis-write (so that then our words do not accurately represent our thoughts). Sometimes our intended meanings may be misinterpreted (Taber, 2013).

Perhaps the researchers felt that this process of identifying conceptions from students' texts and utterances was unproblematic – perhaps the assignments seemed so obvious to the researchers that they did not need to exemplify and justify their analytical method. This is unfortunate. There might also be another factor here.

Lost and found in translation?

The study was carried out in Turkey. The paper is in English, and this includes the reported alternative conceptions. The study was carried out "in a public elementary school" (not an international school, for example). Although English is often taught as a foreign language in Turkish schools, the language of instruction, not unreasonably, is Turkish.

So, it seems either

  • the data was collected in (what, for the children, would have been) 'L2' – a second language, or
  • a study carried out (questions asked; answers given) in Turkish has been reported in English, translating where necessary from one language to another.

This issue is not discussed at all in the paper – there is no mention of either the Turkish or English language, nor of anything being translated.

Yet the authors are not oblivious to the significance of language issues in learning. They report how one variant of Jigsaw teaching had "been designed specifically to increase interaction among students of differing language proficiencies in bilingual classrooms" (p.186) and how the research literature reports that sometimes children's ideas reflect "the incorrect use of terms in everyday language" (p.198). However, they did not feel it was necessary to report either that

  1. data had been collected from elementary school children in a second language, or
  2. data had been translated for the purposes of reporting in an English language journal

It seems reasonable to assume they would have appreciated the importance of mentioning option 1, and so it seems much more likely (although readers of the study should not have to guess) that the reporting in English involved translation. Yet translation is never a simple algorithmic process, but always a matter of interpretation (another stage in analysis), so it would be better if authors always acknowledged this – and offered some basis for readers to judge that the translations made were of high quality (Taber, 2018).

Read about guidelines for detailing translation in research reports

Surely it is a general principle that the research community should adopt: whenever material reported in a research paper has been translated from another language, (a) this is reported, and (b) evidence of the accuracy and reliability of the translation is offered (Taber, 2018).

I make this point here, as some of the alternative conceptions reported by the authors are a little mystifying, and this may(?) be because their wording has been 'degraded' (and obscured) by imperfect translation.

An alternative conception of combustion?

For example, here are two of the learning objectives from one of the learning activities:

"The students were expected to be able to:

…comment on whether the wood has similar intensive properties before and after combustion

…indicate the combustion reactions in examples of several physical and chemical changes"

Tarhan et al., 2013, p.193

The wording of the first of these examples seems to imply that when wood is burnt, the product is still…wood. That is nonsense, but possibly this is simply a mistranslation of something that made perfect sense in Turkish. (The problem is that a reader can only speculate on whether this is the case, and research reports should be precise and explicit.)

The second learning objective quoted here implies that some combustion reactions are physical changes (or, at least, combustion reactions are components of some physical changes).

Combustion reactions are a class of chemical reactions. 'Chemical reaction' is synonymous with 'chemical change'. So, there are (if you will excuse the double negative) no examples of combustion reactions that are not chemical changes, and none that could be said to occur in physical changes. This is mystifying, then, as it is not at all clear what the children were actually being taught – unless one assumes the researchers themselves have very serious misconceptions about the chemistry they are teaching.

If a reader actually thought that the researchers themselves held these alternative conceptions

  • the product of combustion of wood is still wood
  • some combustion reactions are (or occur as part of) physical changes

then one could have little confidence in their ability to distinguish between the scientific and alternative conceptions of others. (A reader might also ask why the journal referees and editor did not request corrections here before publication – I certainly wondered about this.)

There are other statements the authors make in describing the teaching which are not entirely clear (e.g., "give the order of the changes in matter during combustion reactions", p.194), and this suggests a degree of scepticism is needed in not simply accepting the reported alternative conceptions at face value. This does not negate their interest, but does undermine the paper's authority somewhat.

One of the misconceptions reported in the study is that some students thought that "there is a flame in all combustion reaction". This led me to reflect on whether I could think of any combustion reactions that did not involve a flame – and I must confess none readily came to mind. Perhaps I also have this alternative conception – but it seems a harsh judgement on elementary school learners unless they had actually been taught about combustion reactions without flames (if, indeed, there are such things).


The study reported that some 12 year olds held the 'misconception' that "there is a flame in all combustion reaction[s]".

[Image by Susanne Jutzeler, Schweiz, from Pixabay]


Failing to control variables?

Another objective was for students to "comprehend that temperature has an effect on chemical reaction rate by considering the decay of fruit at room temperature, and the change in color [colour] from green to yellow of fallen leaves in autumn" (p.193). As presented, this is somewhat obscure.

Presumably it is not meant to be a comparison between:

  • the rate of decay of fruit at room temperature, and
  • the rate of change in colour of fallen leaves in autumn

as a way of explaining that temperature has an effect on chemical reaction rate?

Clearly, even if the change of colour of leaves takes place at a different temperature to room temperature, one cannot compare totally different processes at different temperatures and draw any conclusions about how "temperature has an effect on chemical reaction rate". (Presumably, 'control of variables' is taught in the Turkish science curriculum.)

So, one assumes these are two different examples…

But that does not help matters too much. The "decay of fruit at room temperature" (or, indeed, any other process studied at a single temperature) cannot offer any indication of how "temperature has an effect on chemical reaction rate". The change of colours in the leaves of deciduous trees (which usually begins before they fall) is triggered by environmental conditions such as changes in day length and temperature. This is part of a very complex system involving a range of pigments, whilst the water content of the leaf decreases (once the supply of water through the tree's vascular system is cut off). It is not clear how much detail these twelve-year-olds were taught… but it is certainly not a simple matter of a reaction changing rate according to temperature.

Evaluating conceptions

Tarhan and colleagues report their identified alternative conceptions ('misconceptions') under a series of headings. These are reported in their table 4 (p.195). A reader certainly finds some of the entries in this table easy to interpret: they clearly seem to reflect ideas contrary to the canonical science one would expect to be reflected in the curriculum and teaching. Other statements are less obviously evidence of alternative conceptions as they do not immediately seem necessarily at odds with scientific accounts (e.g., associating combustion reactions with flames).

Other reported misconceptions are harder to evaluate. School science is in effect a set of models and representations of scientific accounts that often simplify the actual current state of scientific knowledge. Unless we know exactly what has been taught it is not entirely clear if students' ideas are credit-worthy or erroneous in the specific context of their curriculum.

Moreover, as the paper does not report the data and its analysis, but simply the outcome of the analysis, readers do not know on what basis judgements have been made to assign learners as having one of the listed misconceptions.


Changes of state are chemical changes

A few students from the lecture-based teaching condition were identified as 'having' the misconception that 'changes of state are chemical changes'. This seems a pretty serious error at the end of a teaching sequence on chemical and physical changes.

However, this raises a common issue in terms of reports of alternative conceptions – what exactly does it mean to say that a student has a conception that 'changes of state are chemical changes'? A conception is a feature of someone's thinking – but that encompasses a vast range of possibilities, from a fleeting notion that is soon forgotten ('I wonder if s orbitals are so-called because they are spherical?') to an on-going commitment to an extensive framework of ideas that a life is lived by (Buddhism, Roman Catholicism, Liberalism, Hedonism, Marxism…).


A person's conceptions can vary along a range of characteristics (Figure from Taber, 2014)


The statement that 'Changes of state are chemical changes' is unlikely to be the basis of anyone's personal creed. It could simply be a confusion of terms. Perhaps a student had a decent understanding of the essential distinction between chemical and physical changes but got the terms mixed up (or was thinking that 'changes of state' meant 'chemical reaction'). That is certainly a serious error that needs correcting, but in terms of understanding of the science, it would seem less worrying than a deeper conceptual problem.

In their commentary, the authors note of these children:

"They thought that if ice was heated up water formed, and if water was heated steam formed, so new matter was formed and chemical changes occurred".

Tarhan et al., 2013, p.197

It is not clear if this was an explanation the learners gave for thinking "changes of state are chemical changes", or whether "changes of state are chemical changes" was the researchers' gloss on children commenting that "if ice was heated up water formed, and if water was heated steam formed, so new matter was formed and chemical changes occurred".

That a range of students are said to have precisely the same train of thought leads a reader (or, at least, certainly one with experience of undertaking research of this kind) to ask if these are open-ended responses produced by the children, or the selection by the children of one of a number of options offered by the researchers (as pointed out above, the data analysis is not discussed in detail in the paper). That makes a difference to how much weight we might give to the prevalence of the response: putting a tick by the most likely looking option requires less commitment to, and appreciation of, an idea than setting it out yourself in your own personally composed text. This illustrates why it is important that research journals require researchers to give full accounts of their instrumentation and analysis.

Because density of matter changes during changes of state, its identity also changes, and so it is a chemical change

Thirteen of the children (all in the lecture-based teaching condition) were considered to have the conception "Because density of matter changes during changes of state, its identity also changes, and so it is a chemical change". This is clearly a much more specific conception (than 'changes of state are chemical changes') which can be analysed into three components:

  • a change of state is a chemical change, AND
  • we know this because such changes involve a change in identity, AND
  • we know that because a change of state leads to a change in density

Tarhan and colleagues claim this conception was "first determined in this study" (p.195).

The specificity is intriguing here – if so many students explicitly and individually built this argument for themselves then this is an especially interesting finding. Unfortunately, the paper does not give enough detail of the methodology for a reader to know if this was the case. Again, if students were just agreeing with an argument offered as an option on the assessment instrument then it is of note, but less significant (as in such cases students might agree with the statement simply because one component resonated – or they may even be guessing rather than leaving an item unanswered). Again this does not completely negate the finding, but it leaves its status very unclear.

Taken together, these first two claimed results seem inconsistent – as at least 13 students seem to think "Changes of state are chemical changes". That is, all those who thought that "Because density of matter changes during changes of state, its identity also changes, and so it is a chemical change" would seem to have thought that "Changes of state are chemical changes" (see the Venn diagram below). Yet, we are also told that only five students held the less specific, and seemingly subsuming, conception "changes of state are chemical changes".


If 13 students think that changes of state are chemical changes because a change of density implies a change of identity; what does it mean that only 5 students think that changes of state are chemical changes?

This looks like an error, but perhaps it is just a lack of sufficient detail to make the findings clear. Alternatively, perhaps this indicates some failure in translating material accurately into English.

The changes in the pure matters are physical changes

Six children in the lecture-based teaching condition and one in the jigsaw learning condition were reported as holding the conception that "The changes in the pure matters are physical changes". The authors do not explain what they mean here by "pure matters" (sic, presumably 'matter'?). The only place this term is used in the paper is in relation to this conception (p.195, p.197).

The only other reference to 'pure' was in one of the learning objectives for the teaching:

  • explain the changes of state of water depending on temperature and pressure; give various examples for other pure substances (p.191)

If "pure matter" means a pure sample of a substance, then changes in pure substances are all physical – by definition, a chemical change leads to a different substance/different substances. That would explain why this conception was "first determined [as a misconception] in this study" (p.195): it is not actually a misconception. So, it does not seem clear precisely why the researchers feel these children have got something wrong here. Again, perhaps this is a failure of translation rather than a failure in the original study?

Changes in shape?

Tarhan and colleagues report two conceptions under the subheading of 'changes in shape'. They seem to be thinking here more of grain size than shape as such. (Another translation issue?) One reported misconception is that "if cube sugar is granulated, sugar particles become small" [smaller?].


Is it really a misconception to think that "If cube sugar is granulated, sugar particles become small"?

(Image by Bruno /Germany from Pixabay)


Tarhan and colleagues reported that two children in the experimental condition, and 13 in the control condition thought that "If cube sugar is granulated, sugar particles become small". Sugar cubes are made of granules of sugar weakly joined together – they can easily be crumbled into the separate grains. The grains are clearly smaller than the cubes. So, what is important here is what is meant/understood* by the children by the term 'particles'.

(* If this phrasing was produced by the children, then we want to know what they meant by it. If, however, the children were agreeing with a phrase presented to them by researchers, then we wish to know how they understood it.)

If this means quanticle-level particles, molecules, then it is clearly an alternative conception – each grain contains vast numbers of molecules, and the molecules are unchanged by the breaking up of the cubes. If, however, particles here refers to the cube and grains**, then it is a fair reflection of what happens: one quite large particle of sugar is broken up into many much smaller particles. The ambiguity of the (English) word 'particles' in such contexts is well recognised.

(** That is, if the children used the word 'particles' – did they mean the cubes/grains as particles of sugar? If however the phrasing was produced by the researchers and presented to the children, and if the researchers meant 'particles' to mean 'molecules'; did the children appreciate that intention, or did they understand 'particles' to refer to the cubes and grains?)

However, as no detail is given on the actual data collected (e.g., is this the children's own words; was this based on an open response?), and how it was analysed (and, as I suspect this all occurred in Turkish) the reader has no way to check on this interpretation of the data.

What kind of change is dissolving?

Tarhan and colleagues report a number of 'misconceptions' under the heading of 'molecular solubility'. Two of these are:

  • "The solvation processes are always chemical changes"
  • "The solvation processes are always physical changes"

This reflects a problem of teaching about physical and chemical changes. Dissolving is normally seen as a physical change: there is no new chemical substance formed and dissolving is usually fairly readily reversed. However, as bonds are broken and formed it also has some resemblance to chemical change.2

In dissolving common salt in water, strong ionic bonds are disrupted and the ions are strongly solvated. Yet the usual convention is still to consider this a physical change – the original substance, the salt, can be readily recovered by evaporation of the solvent. A solution is considered a kind of mixture. In any case, as Tarhan and colleagues refer to 'molecular' solubility (strictly solubility refers to substances, not molecules, but still) they were, presumably, only dealing with examples of the dissolving of substances with discrete molecules.

Taking these two conceptions together, it seems that Tarhan and colleagues think that dissolving is sometimes a physical change, and sometimes a chemical change. Presumably they have some criterion or criteria to distinguish those examples of dissolving they consider physical changes from those they consider chemical changes. A reader can only speculate how a learner observing some solute dissolve in a solvent is expected to distinguish these cases. The researchers do not explain what was taught to the students, so it is difficult to appreciate quite what the students supposedly got wrong here.

Sugar is invisible in the water, because new matter is formed

The idea that learners think that new matter is formed on dissolving would indeed be an alternative conception. The canonical view is that new matter is only formed in very high energy processes – such as in the big bang. In both chemical and physical processes studied in the school laboratory there may be transformations of matter, but no new matter.

This seems a rather extreme 'misconception' for the learners to hold. However, a reader might wonder if the students actually suggested that a new substance was formed, and this has been mistranslated. (The Turkish word 'madde' seems to mean either matter or substance.) If these students thought that a new type of substance was formed then this would be an alternative conception (and it would be interesting to know why this led to sugar being invisible – unless they were simply arguing that different appearance implied different substance).

While sugar is dissolving in the water, water damages the structure of sugar and sugar splits off

Whether this is a genuine alternative conception or just imprecise use of language is not clear. It seems reasonable to suggest that while sugar is dissolving in the water, the process breaks up the structure of solid sugar and sugar molecules split off – so some more detail would be useful here. Again, if there has been translation from Turkish, some of the nuance of the original phrasing may have been lost in the English rendering.

The phrasing reflects an alternative conception that in chemical reactions one reactant is an active agent (here the water doing the damaging) and the other the patient, that is passive and acted upon (here the sugar being damaged) – rather than seeing the reaction as an interaction between two species (Taber & García Franco, 2010) – but there is no suggestion in their paper that this is the issue Tarhan and colleagues are highlighting here.

When sugar dissolves in water, it reacts with water and disappears from sight

If the children thought that dissolving was a chemical reaction then this is an alternative conception – the sugar does indeed disappear from sight, but there has been no reaction.

Again, we might ask if this was actually a misunderstanding (misconception), or imprecise use of language. The sugar does 'react' with the water in the everyday sense of 'reaction'. But this is not a chemical reaction, so this terminology should be avoided in this context.

Even in science, 'reaction' means something different in chemistry and physics: in the sense of Newtonian physics, during dissolving, when a water molecule attracts a sugar molecule ('action') there will be an equal and oppositely directed reaction as the sugar molecule attracts the water molecule. This is Newton's third law, which applies to quanticles as much as to planets. If a water molecule and a sugar molecule collide, the force applied by the sugar molecule on the water molecule is equal to the force applied by the water molecule on the sugar molecule.

Read about learning difficulties with Newton's third law

So, 'sugar reacts with water' could be

  • a misunderstanding of dissolving (a genuine alternative conception);
  • a misuse of the chemical term 'reaction'; or
  • a use of the everyday term 'reaction' in a context where this should be avoided as it can be misunderstood

These are somewhat different problems for a teacher to address.

Molecules split off in physical changes and atoms split off in chemical changes

Ten of the children are said to have demonstrated the 'misconception' that molecules split off in physical changes and atoms split off in chemical changes. The authors claim that this misconception has not been reported in previous studies. But is this really a misconception? It may be a simplistic, and imprecise, statement – but I think when I was teaching youngsters of this age I would have been happy to find they held this notion, which at least seems to reflect an ability to imagine and visualise processes at the molecular level.

In dissolving or melting/boiling of simple molecular substances, molecules do indeed 'split off' in a sense, and in at least some chemical changes we can posit mechanisms that, in simple terms at least, involve atoms 'splitting off' from molecules.

So, again, this is another example of how this study is tantalising without being very informative. The reader is not clear in what sense this is viewed as wrong, or how the conception was detected. (Again, for ten different students to specifically think that 'molecules split off in physical changes and atoms split off in chemical changes' makes one wonder whether they volunteered this, or simply agreed with the statement when it was presented to them.)

In conclusion

The main thrust of Tarhan and colleagues' study was to report on an innovation using jigsaw learning (which unfortunately compared this with a form of pedagogy widely considered unsuitable for young children, so offering a limited basis for judging the effectiveness of the innovation). As part of the study they collected data to evaluate learning in the two conditions, and used this to identify misconceptions students demonstrated after being taught about physical and chemical changes. The researchers provide a long list of identified misconceptions – but it is not always obvious why these are considered misconceptions, nor what the desired responses (matching the taught models) would have been.

The researchers do not describe their data collection and analysis instruments and protocols in sufficient detail for readers to appreciate what they mean by their results. In particular, what it means to have a misconception – e.g., to give a definitive statement in an interview, or just to select the response on a test that looked most promising at the time. Clearly we give much more weight to a notion that a learner presents in their own words as an explanation for some phenomenon than to the selection of one option from a menu of statements presented to them, which comes with no indication of their confidence in the selection made.

Of particular concern: either the children were asked questions in a second language in which they may not have been sufficiently fluent to fully understand the questions or compose clear responses; or none of the misconceptions reported are presented in their original form, and they have all been translated by someone (unspecified) of uncertain ability as a translator. (A suitably qualified translator would need high competence in both languages and a strong familiarity with the subject matter being translated.)

In the circumstances, Tarhan and colleagues' reported misconceptions are little more than intriguing. In science, the outcome of a study is only informative in the context of understanding exactly how the data were obtained, and how they have been processed. Without that, readers are asked to take a researcher's conclusions on faith, rather than be persuaded of them by a logical chain of argument.


p.s. For anyone who did not know, but wondered: s orbitals are not so-called because they are spherical: the designation derives from a label ('sharp') that was applied to some lines in atomic spectra.


Work cited

Notes


1 To my reading, the publication title 'Research in Science and Technological Education' seems to suggest the journal has two distinct and somewhat disconnected foci, that is:

Research in ( Science ) and ( Technological Education )

And it would be better (that is, more consistently) titled as

Research in Science and Technology Education

{Research in ( Science and Technology ) Education}

or

Research in Scientific and Technological Education

{Research in ( Scientific and Technological ) Education}

but, hey, I know I am pedantic.


2 The table (Table 1.2 in the source) was followed by the following text:

"The first criterion listed is the most fundamental and is generally clear cut as long as the substances present before and after the change are known. If a new substance has been produced, it will almost certainly have different melting and boiling temperatures than the original substance.

The other [criteria] are much more dubious. Some chemical changes involve a great deal of energy being released, such as the example of burning magnesium in air, or even require a considerable energy input, such as the example of the electrolysis of water. However, other reactions may not obviously involve large energy transfers, for example when the enthalpy and entropy changes more or less cancel each other out…. The rusting of iron is a chemical reaction, but usually occurs so slowly that it is not apparent whether the process involves much energy transfer ….

Generally speaking, physical changes are more readily reversible than chemical changes. However, again this is not a very definitive criterion. The idea that chemical reactions tend to either 'go' or not is a useful approximation, but there are many examples of reactions that can be readily reversed…. In principle, all reactions involve equilibria of forward and reverse reactions, and can be reversed by changing the conditions sufficiently. When hydrogen and oxygen are exploded, it takes a pedant to claim that there is also a process of water molecules being converted into oxygen and hydrogen molecules as the reaction proceeds, which means the reaction will continue for ever. Technically such a claim may be true, but for all practical purposes the explosion reflects a reaction that very quickly goes to completion.

One technique that can be used to separate iodine from sand is to warm the mixture gently in an evaporating basin, over which is placed an upturned beaker or funnel. The iodine will sublime – turn to vapour – before recondensing on the cold glass, separated from the sand. The same technique may be used if ammonium chloride is mixed with the sand. In both cases the separation is achieved because sand (which has a high melting temperature) is mixed with another substance in the solid state that is readily changed into a vapour by warming, and then readily recovered as a solid sample when the vapour is in contact with a colder surface. There are then reversible changes involved in both cases:

solid iodine ➝ iodine vapour

ammonium chloride ➝ ammonia + hydrogen chloride

In the first case, the process involves only changes of state: evaporation and condensation – collectively called sublimation. However the second case involves one substance (a salt) changing to two other substances. To a student seeing these changes demonstrated, there would be little basis to infer one is (usually considered as) a chemical change, but not the other. …

The final criterion in Table 1.2 concerns whether bonds are broken and made during a change, and this can only be meaningful for students once they have learnt about particle models of the submicroscopic structure of matter… In a chemical change, there will be the breaking of bonds that hold together the reactants and the formation of new bonds in the products. However, we have to be careful here what we mean by 'bond' …

When ice melts and water boils, 'intermolecular' forces between molecules are disrupted and this includes the breaking of hydrogen 'bonds'. However, when people talk about bond breaking in the context of chemical and physical changes, they tend to mean strong chemical bonds such as covalent, ionic and metallic bonds…

Yet even this is not clear cut. When metals evaporate or are boiled, metallic bonds are broken, although the vapour is not normally considered a different substance. When elements such as carbon and phosphorus undergo phase changes relating to allotropy, there is breaking, and forming, of bonds, which might suggest these changes are chemical and that the different forms of the same elements should be considered different substances. …

A particularly tricky case occurs when we dissolve materials to form solutions, especially materials with ionic bonding…. Dissolving tends to involve small energy changes, and to be readily reversible, and is generally considered a physical change. However, to dissolve an ionic compound such as sodium chloride (table salt), the strong ionic bonds between the sodium and chloride ions have to be overcome (and new bonds must form between the ions and solvent molecules). This would seem to suggest that dissolving can be a chemical change according to the criterion of bond breaking and formation (Table 1.2)."

(Taber, 2012b, pp.31-33)

Quasi-experiment or crazy experiment?

Trustworthy research findings are conditional on getting a lot of things right


Keith S. Taber


A good many experimental educational research studies that compare treatments across two classes or two schools are subject to potential confounding variables that invalidate study findings and make any consequent conclusions and recommendations untrustworthy.

I was looking for research into the effectiveness of P-O-E (predict-observe-explain) pedagogy, a teaching technique that is believed to help challenge learners' alternative conceptions and support conceptual change.

Read about the predict-observe-explain approach



One of the papers I came across reported identifying, and then using P-O-E to respond to, students' alternative conceptions. The authors reported that

The pre-test revealed a number of misconceptions held by learners in both groups: learners believed that salts 'disappear' when dissolved in water (37% of the responses in the 80% from the pre-test) and that salt 'melts' when dissolved in water (27% of the responses in the 80% from the pre-test).

Kibirige, Osodo & Tlala, 2014, p.302

The references to "in the 80%" did not seem to be explained anywhere. Perhaps only 80% of students responded to the open-ended questions included as part of the assessment instrument (discussed below), so the authors gave the incidence as a proportion of those responding? Ideally, research reports are explicit about such matters, avoiding the need for readers to speculate.

The authors concluded from their research that

"This study revealed that the use of POE strategy has a positive effect on learners' misconceptions about dissolved salts. As a result of this strategy, learners were able to overcome their initial misconceptions and improved on their performance….The implication of these results is that science educators, curriculum developers, and textbook writers should work together to include elements of POE in the curriculum as a model for conceptual change in teaching science in schools."

Kibirige, Osodo & Tlala, 2014, p.305

This seemed pretty positive. As P-O-E is an approach which is consistent with 'constructivist' thinking that recognises the importance of engaging with learners' existing thinking I am probably biased towards accepting such conclusions. I would expect techniques such as P-O-E, when applied carefully in suitable curriculum contexts, to be effective.

Read about constructivist pedagogy

Yet I also have a background in teaching research methods and in acting as a journal editor and reviewer – so I am not going to trust the conclusion of a research study without having a look at the research design.


All research findings are subject to caveats and provisos: good practice in research writing is for the authors to discuss them – but often they are left unmentioned for readers to spot. (Read about drawing conclusions from studies)


Kibirige and colleagues describe their study as a quasi-experiment.

Experimental research into teaching approaches

If one wants to see if a teaching approach is effective, then it seems obvious that one needs to do an experiment. If we can experimentally compare different teaching approaches we can find out which are more effective.

An experiment allows us to make a fair comparison by 'control of variables'.

Read about experimental research

Put very simply, the approach might be:

  • Identify a representative sample of an identified population
  • Randomly assign learners in the sample to either an experimental condition or a control condition
  • Set up two conditions that are alike in all relevant ways, apart from the independent variable of interest
  • After the treatments, apply a valid instrument to measure learning outcomes
  • Use inferential statistics to see if any difference in outcomes across the two conditions reaches statistical significance
  • If it does, conclude that
    • the effect is likely to be due to the difference in treatments
    • and will apply, on average, to the population that has been sampled

Now, I expect anyone reading this who has worked in schools, and certainly anyone with experience in social research (such as research into teaching and learning), will immediately recognise that in practice it is very difficult to actually set up an experiment into teaching which fits this description.

Nearly always (if indeed not always!) experiments to test teaching approaches fall short of this ideal model to some extent. This does not mean such studies cannot be useful – especially where there are many of them, with compensatory strengths and weaknesses, offering similar findings (Taber, 2019a) – but one needs to ask how closely published studies fit the ideal of a good experiment. Work in high quality journals is often expected to offer readers guidance on this, but readers should check for themselves to see if they find a study convincing.

So, how convincing do I find this study by Kibirige and colleagues?

The sample and the population

If one wishes a study to be informative about a population (say, chemistry teachers in the UK; or 11-12 year-olds in state schools in Western Australia; or pharmacy undergraduates in the EU; or whatever) then it is important to either include the full population in the study (which is usually only feasible when the population is a very limited one, such as graduate students in a single university department) or to ensure the sample is representative.

Read about populations of interest in research

Read about sampling a population

Kibirige and colleagues refer to their participants as a sample:

"The sample consisted of 93 Grade 10 Physical Sciences learners from two neighbouring schools (coded as A and B) in a rural setting in Moutse West circuit in Limpopo Province, South Africa. The ages of the learners ranged from 16 to 20 years…The learners were purposively sampled."

Kibirige, Osodo & Tlala, 2014, p.302

Purposive sampling means selecting participants according to some specific criteria, rather than sampling a population randomly. It is not entirely clear precisely what the authors mean by this here – which characteristics they selected for. Also, there is no statement of the population being sampled – so the reader is left to guess what population the sample is a sample of. Perhaps "Grade 10 Physical Sciences" students – but, if so, universally, or in South Africa, or just within Limpopo Province, or indeed just the Moutse West circuit? Strictly the notion of a sample is meaningless without reference to the population being sampled.

A quasi-experiment

A key notion in experimental research is the unit of analysis

"An experiment may, for example, be comparing outcomes between different learners, different classes, different year groups, or different schools…It is important at the outset of an experimental study to clarify what the unit of analysis is, and this should be explicit in research reports so that readers are aware what is being compared."

Taber, 2019a, p.72

In a true experiment the 'units of analysis' (which in different studies may be learners, teachers, classes, schools, exam. papers, lessons, textbook chapters, etc.) are randomly assigned to conditions. Random assignment allows inferential statistics to be used to directly compare measures made in the different conditions to determine whether outcomes are statistically significant. Random assignment is a way of making systematic differences between groups unlikely (and so allows the use of inferential statistics to draw meaningful conclusions).

Random assignment is sometimes possible in educational research, but often researchers are only able to work with existing groupings.

Kibirige, Osodo & Tlala describe their approach as using a quasi-experimental design, as they could not assign learners to groups, but only compare between learners in two schools. This is important, as it means that the 'units of analysis' are not the individual learners, but the groups: in this study one group of students in one school (n=1) is being compared with another group of students in a different school (n=1).

The authors do not make it clear whether they assigned the schools to the two teaching conditions randomly – or whether some other criterion was used. For example, if they chose school A to be the experimental school because they knew the chemistry teacher in the school was highly skilled, always looking to improve her teaching, and open to new approaches; whereas the chemistry teacher in school B had a reputation for wishing to avoid doing more than was needed to be judged competent – that would immediately invalidate the study.

Compensating for not using random assignment

When it is not possible to randomly assign learners to treatments, researchers can (a) use statistics that take into account measurements on each group made before, as well as after, the treatments (that is, a pre-test – post-test design); (b) offer evidence to persuade readers that the groups are equivalent before the experiment. Kibirige, Osodo and Tlala seek to use both of these steps.

Do the groups start as equivalent?

Kibirige, Osodo and Tlala present evidence from the pre-test to suggest that the learners in the two groups are starting at about the same level. In practice, pre-tests seldom lead to identical outcomes for different groups. It is therefore common to use inferential statistics to test whether there is a statistically significant difference between pre-test scores in the groups. That could be reasonable, if there were an agreed criterion for deciding just how close scores should be to count as equivalent. In practice, many researchers only check that the differences do not reach statistical significance at the level of probability <0.05: that is, they look to see if there are strong differences, and, if not, declare this is (or implicitly treat this as) equivalence!

This is clearly an inadequate measure of equivalence as it will only filter out cases where there is a difference so large it is found to be very unlikely to be a chance effect.


If we want to make sure groups start as 'equivalent', we cannot simply look to exclude the most blatant differences. (Original image by mcmurryjulie from Pixabay)

See 'Testing for initial equivalence'
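The inadequacy can be illustrated with a small simulation (my own sketch, using invented figures, not data from the study): even when two classes of roughly this size are drawn from populations whose true success rates on an item differ substantially, a simple two-proportion z-test will often fail to reach p < 0.05.

```python
import random

random.seed(1)

# Toy simulation (illustrative figures, not the study's data): two classes
# of 44 and 49 drawn from populations whose true success rates on an item
# genuinely differ (25% vs 40%). How often does a two-proportion z-test
# fail to find a significant difference at the 0.05 level?
def no_significant_difference(p1, p2, n1=44, n2=49):
    x1 = sum(random.random() < p1 for _ in range(n1))
    x2 = sum(random.random() < p2 for _ in range(n2))
    pooled = (x1 + x2) / (n1 + n2)
    se = (pooled * (1 - pooled) * (1 / n1 + 1 / n2)) ** 0.5
    z = 0.0 if se == 0 else (x1 / n1 - x2 / n2) / se
    return abs(z) < 1.96          # True -> 'no significant difference'

trials = 10_000
missed = sum(no_significant_difference(0.25, 0.40) for _ in range(trials))
print(f"No significant difference found in {missed / trials:.0%} of trials")
# Well over half of the trials find 'no significant difference' despite a
# real 15-percentage-point gap between the underlying populations.
```

So, with groups of this size, 'no significant difference' is quite compatible with a substantial underlying difference: absence of evidence of a difference is not evidence of equivalence.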


We can see this in Kibirige and colleagues' study, where the researchers list mean scores and standard deviations for each question on the pre-test. They report that:

"The results (Table 1) reveal that there was no significant difference between the pre-test achievement scores of the CG [control group] and EG [experimental group] for questions (Appendix 2). The p value for these questions was greater than 0.05."

Kibirige, Osodo & Tlala, 2014, p.302

Now this paper is published "licensed under Creative Commons Attribution 3.0 License" which means I am free to copy from it here.



According to the results table, several of the items (1.2, 1.4, 2.6) did lead to statistically significantly different response patterns in the two groups.

Most of these questions (1.1-1.4; 2.1-2.8; discussed below) are objective questions, so although no marking scheme was included in the paper, it seems they were marked as correct or incorrect.

So, let's take as an example question 2.5 where readers are told that there was no statistically significant difference in the responses of the two groups. The mean score in the control group was 0.41, and in the experimental group was 0.27. Now, the paper reports that:

"Forty nine (49) learners (31 males and 18 females) were from school A and acted as the experimental group (EG) whereas the control group (CG) consisted of 44 learners (18 males and 26 females) from school B."

Kibirige, Osodo & Tlala, 2014, p.302

So, according to my maths,


                         Correct responses    Incorrect responses
School A (49 students)   (0.27 ➾) 13          36
School B (44 students)   (0.41 ➾) 18          26

"The achievement of the EG and CG from pre-test results were not significantly different which suggest that the two groups had similar understanding of concepts" (p.305).

Pre-test results for an item with no statistically significant difference between groups (offered as evidence of 'similar' levels of initial understanding in the two groups)

While, technically, there may have been no statistically significant difference here, I think inspection is sufficient to suggest this does not mean the two groups were initially equivalent in terms of performance on this item.
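For what it is worth, my back-calculation of these counts, together with a test that does suit binary right/wrong data – a Pearson chi-squared test of association, which is my own choice here for illustration (the paper itself reports t-tests) – can be sketched in a few lines of Python:

```python
# Back-calculating the 2x2 table for item 2.5 from the reported means
# (these counts are my own reconstruction, not given in the paper):
n_A, mean_A = 49, 0.27    # School A, experimental group
n_B, mean_B = 44, 0.41    # School B, control group

correct_A = round(mean_A * n_A)               # 13 correct, so 36 incorrect
correct_B = round(mean_B * n_B)               # 18 correct, so 26 incorrect
table = [[correct_A, n_A - correct_A],
         [correct_B, n_B - correct_B]]

# Pearson chi-squared statistic (df = 1) computed from first principles:
# the sum of (observed - expected)^2 / expected over the four cells.
row_totals = [sum(r) for r in table]
col_totals = [sum(c) for c in zip(*table)]
N = sum(row_totals)
chi2 = sum((table[i][j] - row_totals[i] * col_totals[j] / N) ** 2
           / (row_totals[i] * col_totals[j] / N)
           for i in range(2) for j in range(2))

print(table)              # [[13, 36], [18, 26]]
print(round(chi2, 2))     # about 2.16 - below the 3.84 critical value for
                          # p < 0.05, so 'not significant', even though
                          # 41% versus 27% correct is hardly equivalence
```

The reconstructed counts are thus consistent with the paper's claim of 'no significant difference' on this item – which only underlines how weak that criterion is as evidence of initial equivalence.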


Data that is normally distributed falls on a 'bell-shaped' curve

(Image by mcmurryjulie from Pixabay)


Inspection of this graphic also highlights something else. Student's t-test (used by the authors to produce the results in their table 1) is a parametric test. That means it can only be used when the data fit certain criteria: the sample should be randomly selected (not true here) and the data should be normally distributed, that is, distributed in a bell-shaped Gaussian curve (as in the image in the blue circle above). If Kibirige, Osodo & Tlala were applying the t-test to data distributed as in my graphic above (a binary distribution, where answers were either right or wrong) then the test was invalid.

So, to summarise, the authors suggest there "was no significant difference between the pre-test achievement scores of the CG and EG for questions", although sometimes there was (according to their table); and they used the wrong test to check for this; and in any case lack of statistical significance is not a sufficient test for equivalence.

I should note that the journal does claim to use peer review to evaluate submissions to see if they are ready for publication!

Comparing learning gains between the two groups

At one level equivalence might not be so important, as the authors used an ANCOVA (Analysis of Covariance) test, which tests for difference at post-test taking into account the pre-test. Yet this test also has assumptions that need to be checked and met, but here they seem simply to have been assumed.

However, to return to an even more substantive point I made earlier: as the learners were not randomly assigned to the two different conditions/treatments, what should be compared are the two school-based groups (i.e., the unit of analysis should be the school group), but that (i.e., a sample of 1 class, rather than 40+ learners, in each condition) would not facilitate using inferential statistics to make a comparison. So, although the authors conclude

"that the achievement of the EG [taking n=49] after treatment (mean 34.07 ± 15.12 SD) was higher than the CG [taking n=44] (mean 20.87 ± 12.31 SD). These means were significantly different"

Kibirige, Osodo & Tlala, 2014, p.303

the statistics are testing the outcomes as if 49 units independently experienced one teaching approach and 44 independently experienced another. Now, I do not claim to be a statistics expert, and I am aware that most researchers only have a limited appreciation of how and why stats. tests work. For most readers, then, a more convincing argument may be made by focussing on the control of variables.
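The danger of treating clustered students as independent units can be shown with another toy simulation (again my own sketch, with invented numbers, not the authors' data): give each class a shared random 'class effect' while making the teaching equally effective in both conditions, and a student-level test flags spurious 'significant' differences far more often than the nominal 5% of the time.

```python
import random

random.seed(2)

# Toy simulation (invented numbers, not the study's data): both classes
# receive equally effective teaching, but each class shares its own random
# 'class effect' (teacher, room, timetable...). A test treating the 49 + 44
# students as independent units should be 'significant' only about 5% of
# the time by chance - how often is it really?
def spurious_significance(class_sd=0.5, student_sd=1.0):
    shared_A = random.gauss(0, class_sd)      # common to everyone in class A
    shared_B = random.gauss(0, class_sd)
    A = [shared_A + random.gauss(0, student_sd) for _ in range(49)]
    B = [shared_B + random.gauss(0, student_sd) for _ in range(44)]
    mean_A, mean_B = sum(A) / len(A), sum(B) / len(B)
    var_A = sum((x - mean_A) ** 2 for x in A) / (len(A) - 1)
    var_B = sum((x - mean_B) ** 2 for x in B) / (len(B) - 1)
    z = (mean_A - mean_B) / (var_A / len(A) + var_B / len(B)) ** 0.5
    return abs(z) > 1.96                      # True -> spurious 'significance'

trials = 2_000
rate = sum(spurious_significance() for _ in range(trials)) / trials
print(f"'Significant' difference found in {rate:.0%} of trials")
# Far above the nominal 5% false-positive rate: the shared class effects,
# not the teaching, are driving the 'result'.
```

In other words, with only one class per condition, a student-level comparison cannot separate a treatment effect from whatever else differs between the two groups.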

Controlling variables in educational experiments

The ability to control variables is a key feature of laboratory science, and is critical to experimental tests. Control of variables, even identification of relevant variables, is much more challenging outside of a laboratory in social contexts – such as schools.

In the case of Kibirige, Osodo & Tlala's study, we can set out the overall experimental design as follows


Independent variable     Teaching approach:
                         – predict-observe-explain (experimental)
                         – lectures (comparison condition)
Dependent variable       Learning gains
Controlled variable(s)   Anything other than teaching approach which might
                         make a difference to student learning

Variables in Kibirige, Osodo & Tlala's study

The researchers set up the two teaching conditions and measure learning gains, and need to make sure that any other factors which might have an effect on learning outcomes, so-called confounding variables, are controlled so that they are the same in both conditions.

Read about confounding variables in research

Of course, we cannot be sure what might act as a confounding variable, so in practice we may miss something which we do not recognise is having an effect. Here are some possibilities based on my own (now dimly recalled) experience of teaching in school.

The room may make a difference. Some rooms are

  • spacious,
  • airy,
  • well illuminated,
  • well equipped,
  • away from noisy distractions
  • arranged so everyone can see the front, and the teacher can easily move around the room

Some rooms have

  • comfortable seating,
  • a well positioned board,
  • good acoustics

Others, not so.

The timetable might make a difference. Anyone who has ever taught the same class of students at different times in the week might (will?) have noticed that a Tuesday morning lesson and a Friday afternoon lesson are not always equally productive.

Class size may make a difference (here 49 versus 44).

Could gender composition make a difference? Perhaps it was just me, but I seem to recall that classes of mainly female adolescents had a different nature than classes of mainly male adolescents. (And perhaps the way I experienced those classes would have been different if I had been a female teacher?) Kibirige, Osodo and Tlala report the sex of the students, but assuming that can be taken as a proxy for gender, the gender ratios were somewhat different in the two classes.


The gender make up of the classes was quite different: might that influence learning?

School differences

A potentially major confounding variable is the school. In this study the researchers report that the schools were "neighbouring" and that

Having been drawn from the same geographical set up, the learners were of the same socio-cultural practices.

Kibirige, Osodo & Tlala, 2014, p.302

That clearly makes more sense than choosing two schools from different places with different demographics. But anyone who has worked in schools will know that two neighbouring schools serving much the same community can still be very different. Different ethos, different norms, and often different levels of outcome. Schools A and B may be very similar (but the reader has no way to know), but when comparing between groups in different schools it is clear that school could be a key factor in group outcome.

The teacher effect

Similar points can be made about teachers – they are all different! Does ANY teacher really believe that one can swap one teacher for another without making a difference? Kibirige, Osodo and Tlala do not tell readers anything about the teachers, but as students were taught in their own schools the default assumption must be that they were taught by their assigned class teachers.

Teachers vary in terms of

  • skill,
  • experience,
  • confidence,
  • enthusiasm,
  • subject knowledge,
  • empathy levels,
  • insight into their students,
  • rapport with classes,
  • beliefs about teaching and learning,
  • teaching style,
  • disciplinary approach,
  • expectations of students.

The same teacher may perform at different levels with different classes (preferring to work with different grade levels, or simply getting on/not getting on with particular classes). Teachers may have uneven performance across topics. Teachers differentially engage with and excel in different teaching approaches. (Even if the same teacher had taught both groups we could not assume they were equally skilful in both teaching conditions.)

The teacher variable is likely to be a major difference between the groups.

Meta-effects

Another conflating factor is the very fact of the research itself. Students may welcome a different approach because it is novel and a change from the usual diet (or alternatively they may be nervous about things being done differently) – but such 'novelty' effects would disappear once the new way of doing things became established as normal. In which case, it would be an effect of the research itself and not of what is being researched.

Perhaps even more powerful are expectancy effects. If researchers expect an innovation to improve matters, then these expectations get communicated to those involved in the research and can themselves have an effect. Expectancy effects are so well demonstrated that in medical research double-blind protocols are used so that neither the patients nor the health professionals who directly engage with them in the study know who is getting which treatment.

Read about expectancy effects in research

So, we might revise the table above:


Independent variable: Teaching approach
  • predict-observe-explain (experimental)
  • lectures (comparison condition)

Dependent variable: Learning gains

Potentially conflating variables:
  • School effect
  • Teacher effect
  • Class size
  • Gender composition of teaching groups
  • Relative novelty of the two teaching approaches

Variables in Kibirige, Osodo & Tlala's study

Now, of course, these problems are not unique to this particular study. The only way to respond to teacher and school effects of this kind is to do large scale studies, and randomly assign a large enough number of schools and teachers to the different conditions so that it becomes very unlikely there will be systematic differences between treatment groups.
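As a purely illustrative sketch (the pool of twenty schools and their names are invented here, not drawn from any study discussed in this post), random assignment of schools to conditions might look like this:

```python
import random

# Illustrative sketch only: randomly assigning an (invented) pool of
# twenty schools to two conditions, so that systematic school and
# teacher differences are unlikely to line up with the treatment.
random.seed(42)  # fixed seed so the example is reproducible

schools = [f"School {chr(65 + i)}" for i in range(20)]  # 'School A'..'School T'
random.shuffle(schools)

half = len(schools) // 2
treatment_schools = sorted(schools[:half])
comparison_schools = sorted(schools[half:])

print(len(treatment_schools), len(comparison_schools))  # 10 10
```

With a large enough pool, chance imbalances on any particular school or teacher characteristic become increasingly unlikely, which is the whole point of randomisation at scale.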

A good many experimental educational research studies that compare treatments across two classes or two schools are subject to potentially conflating variables that invalidate study findings and make any consequent conclusions and recommendations untrustworthy (Taber, 2019a). Strangely, often this does not seem to preclude publication in research journals. 1

Advice on controls in scientific investigations:

I can probably do no better than to share some advice given to both researchers and readers of research papers in an immunology textbook from 1910:

"I cannot impress upon you strongly enough never to operate without the necessary controls. You will thus protect yourself against grave errors and faulty diagnoses, to which even the most competent investigator may be liable if he [or she] fails to carry out adequate controls. This applies above all when you perform independent scientific investigations or seek to assess them. Work done without the controls necessary to eliminate all possible errors, even unlikely ones, permits no scientific conclusions.

I have made it a rule, and would advise you to do the same, to look at the controls listed before you read any new scientific papers… If the controls are inadequate, the value of the work will be very poor, irrespective of its substance, because none of the data, although they may be correct, are necessarily so."

Julius Citron

The comparison condition

It seems clear that in this study there is no strict 'control' of variables, and the 'control' group is better considered just a comparison group. The authors tell us that:

"the control group (CG) taught using traditional methods…

the CG used the traditional lecture method"

Kibirige, Osodo & Tlala, 2014, pp.300, 302

This is not further explained, but if this really was teaching by 'lecturing' then that is not a suitable approach for teaching school age learners.

This raises two issues.

There is a lot of evidence that a range of active learning approaches (discussion work, laboratory work, various kinds of group work) engages and motivates students more than whole lessons spent listening to a teacher. Therefore any approach which basically involves a mixture of students doing things, discussing things, engaging with manipulatives and resources as well as listening to a teacher, tends to be superior to just being lectured. Good science teaching normally involves lessons sequenced into a series of connected episodes involving different types of student activity (Taber, 2019b). Teacher presentations of the target scientific account are very important, but tend to be effective when embedded in a dialogic approach that allows students to explore their own thinking and takes into account their starting points.

So, comparing P-O-E with lectures (if they really were lectures) may not tell researchers much about P-O-E specifically, as a teaching approach. A better test would compare P-O-E with some other approach known to be engaging.

"Many published studies argue that the innovation being tested has the potential to be more effective than current standard teaching practice, and seek to demonstrate this by comparing an innovative treatment with existing practice that is not seen as especially effective. This seems logical where the likely effectiveness of the innovation being tested is genuinely uncertain, and the 'standard' provision is the only available comparison. However, often these studies are carried out in contexts where the advantages of a range of innovative approaches have already been well demonstrated, in which case it would be more informative to test the innovation that is the focus of the study against some other approach already shown to be effective."

Taber, 2019a, p.93

The second issue is more ethical than methodological. Sometimes in published studies (and I am not claiming I know this happened here, as the paper says so little about the comparison condition) researchers seem to deliberately set up a comparison condition they have good reason to expect is not effective: such as asking a teacher to lecture and not include practical work or discussion work or use of digital learning technologies and so forth. Potentially the researchers are asking the teacher of the 'control' group to teach less effectively than normally to bias the experiment towards their preferred outcome (Taber, 2019a).

This is not only a failure to do good science, but also an abuse of those learners being deliberately subjected to poor teaching. Perhaps in this study the class in School B was habitually taught by being lectured at, so the comparison condition was just what would have occurred in the absence of the research, but this is always a worry when studies report comparison conditions that seem to deliberately disadvantage students. (This paper does not seem to report anything about obtaining voluntary informed consent from participants, nor indeed about how access to the schools was negotiated.)

"In most educational research experiments of the type discussed in this article, potential harm is likely to be limited to subjecting students (and teachers) to conditions where teaching may be less effective, and perhaps demotivating…It can also potentially occur in control conditions if students are subjected to teaching inputs of low effectiveness when better alternatives were available. This may be judged only a modest level of harm, but – given that the whole purpose of experiments to test teaching innovations is to facilitate improvements in teaching effectiveness – this possibility should be taken seriously."

Taber, 2019a, p.94

Validity of measurements

Even leaving aside all the concerns expressed above, the results of a study of this kind depend upon valid measurements. Assessment items must test what they claim to test, and their analysis should be subject to quality control (and preferably be blind to which condition a script being analysed derives from). Kibirige, Osodo and Tlala append the test they used in the study (Appendix 2, pp.309-310), which is very helpful in allowing readers to judge at least its face validity. Unfortunately, they do not include a mark/analysis scheme to show what they considered responses worthy of credit.

"The [Achievement Test] consisted of three questions. Question one consisted of five statements which learners had to classify as either true or false. Question two consisted of nine [sic, actually eight] multiple questions which were used as a diagnostic tool in the design of the teaching and learning materials in addressing misconceptions based on prior knowledge. Question three had two open-ended questions to reveal learners' views on how salts dissolve in water (Appendix 1 [sic, 2])."

Kibirige, Osodo & Tlala, 2014, p.302

"Question one consisted of five statements which learners had to classify as either true or false."

Question 1 is fairly straightforward.

1.2: Strictly, all salts do dissolve in water to some extent. I expect that students were taught that some salts are insoluble. Often in teaching we start with simple dichotomous models (metal-non-metal; ionic-covalent; soluble-insoluble; reversible-irreversible) and then develop these into more continuous accounts that recognise differences of degree. It is possible here, then, that a student who had learnt that all salts are soluble to some extent might have been disadvantaged by giving the 'wrong' ('True') response…

…although, actually, there is perhaps no excuse for answering 'True' ('All salts can dissolve in water') here, as a later question begins "3.2. Some salts does [sic] not dissolve in water. In your own view what happens when a salt do [sic] not dissolve in water".

Despite the test actually telling students the answer to this item, it seems only 55% of the experimental group, and 23% of the control group, obtained the correct answer on the post-test – precisely the same proportions as on the pre-test!



1.4: Seems to be 'False' as the ions exist in the salt and are not formed when it goes into solution. However, I am not sure if that nuance of wording is intended in the question.

Question 2 gets more interesting.


"Question two consisted of nine multiple questions" (seven shown here)

I immediately got stuck on question 2.2 which asked which formula (singular, not 'formula/formulae', note) represented a salt. Surely, they are all salts?

I had the same problem on 2.4 which seemed to offer three salts that could be formed by reacting acid with base. Were students allowed to give multiple responses? Did they have to give all the correct options to score?

Again, 2.5 offered three salts which could all be made by direct reaction of 'some substances'. (As a student I might have answered A assuming the teacher meant to ask about direct combination of the elements?)

At least in 2.6 there only seemed to be two correct responses to choose between.

Any student unsure of the correct answer in 2.7 might have taken guidance from the charges as shown in the equation given in question 2.8 (although indicated as 2.9).

How I wished they had provided the mark scheme.



The final question in this section asked students to select one of three diagrams to show what happens when a 'mixture' of H2O and NaCl in a closed container 'react'. (In chemistry, we do not usually consider salt dissolving as a reaction.)

Diagram B seemed to show ion pairs in solution (but why the different form of representation?). Option C did not look convincing, as the chloride ions had altogether vanished from the scene and sodium seemed to have formed multiple bonds with oxygen and hydrogens.

So, by a process of elimination, the answer is surely A.

  • But components seem to be labelled Na and Cl (not as ions).
  • And the image does not seem to represent a solution as there is much too much space between the species present.
  • And in salt solution there are many water molecules between solvated ions – missing here.
  • And the figure seems to show two water molecules have broken up, not to give hydrogen and hydroxide ions, but lone oxygen (atoms, ions?)
  • And why is the chlorine shown to be so much larger in solution than it was in the salt? (If this is meant to be an atom, it should be smaller than the ion, not larger. The real mystery is why the chloride ions are shown as much smaller than the sodium ions before solvation occurs, when chloride ions have about double the radius of sodium ions.)

So diagram A is incredible, but still not quite as crazy an option as B and C.

This is all despite

"For face validity, three Physical Sciences experts (two Physical Sciences educators and one researcher) examined the instruments with specific reference to Mpofu's (2006) criteria: suitability of the language used to the targeted group; structure and clarity of the questions; and checked if the content was relevant to what would be measured. For reliability, the instruments were piloted over a period of two weeks. Grade 10 learners of a school which was not part of the sample was used. Any questions that were not clear were changed to reduce ambiguity."

Kibirige, Osodo & Tlala, 2014, p.302

One wonders what the less clear, more ambiguous, versions of the test items were.

Reducing 'misconceptions'

The final question was (or, perhaps better, questions were) open-ended.



I assume (again, it would be good for authors of research reports to make such things explicit) these were the questions that led to claims about the identified alternative conceptions at pre-test.

"The pre-test revealed a number of misconceptions held by learners in both groups: learners believed that salts 'disappear' when dissolved in water (37% of the responses in the 80% from the pre-test) and that salt 'melts' when dissolved in water (27% of the responses in the 80% from the pre-test)."

Kibirige, Osodo & Tlala, 2014, p.302

As the first two (sets of) questions only admit objective scoring, it seems that this data can only have come from responses to Q3. This means that the authors cannot be sure how students are using terms. 'Melt' is often used in an everyday, metaphorical, sense of 'melting away'. This use of language should be addressed, but it may not (for at least some of these learners) be a conceptual error as much as poor use of terminology.

To say that salts disappear when they dissolve does not seem to me a misconception: they do. To disappear means to no longer be visible, and that's a fair description of the phenomenon of salt dissolving. The authors may assume that if learners use the term 'disappear' they mean the salt is no longer present, but literally they are only claiming it is not directly visible.

Unfortunately, the authors tell us nothing about how they analysed the data collected from their test, so the reader has no basis for knowing how they interpreted student responses to arrive at their findings. The authors do tell us, however, that:

"the intervention had a positive effect on the understanding of concepts dealing with dissolving of salts. This improved achievement was due to the impact of POE strategy which reduced learners' misconceptions regarding dissolving of salts"

Kibirige, Osodo & Tlala, 2014, p.305

Yet, oddly, they offer no specific basis for this claim – no figures to show the level at which "learners believed that salts 'disappear' when dissolved in water …and that salt 'melts' when dissolved in water" in either group at the post-test.


('disappear' misconception / 'melt' misconception)
  • pre-test, experimental group: not reported / not reported
  • pre-test, comparison group: not reported / not reported
  • pre-test, total: (0.37 x 0.8 x 93 =) 27.5 (!?) / (0.27 x 0.8 x 93 =) 20
  • post-test, experimental group: not reported / not reported
  • post-test, comparison group: not reported / not reported
  • post-test, total: not reported / not reported

Data presented about the numbers of learners considered to hold specific misconceptions said to have been 'reduced' in the experimental condition
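The bracketed calculations can be checked directly. Assuming the reported percentages apply to 80% of the 93 learners (49 + 44) across both groups, which is my reading of the paper's wording (the paper itself does not explain the figures), the implied counts are not whole numbers of learners:

```python
# Checking the counts implied by the reported percentages, on the
# assumption that '37% of the responses in the 80%' means 37% of 80%
# of the 93 learners across both classes (an assumption -- the paper
# does not explain its denominators).
total_learners = 49 + 44  # the two reported class sizes

disappear_count = 0.37 * 0.80 * total_learners
melt_count = 0.27 * 0.80 * total_learners

print(round(disappear_count, 1))  # 27.5 -- not a whole number of learners
print(round(melt_count, 1))       # 20.1
```

Whatever the intended denominators, percentages that do not correspond to whole numbers of learners suggest the figures needed more explanation than the paper provides.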

It seems the journal referees and the editor did not feel any important information was missing here that should have been added before publication.

In conclusion

Experiments require control of variables. Experiments require random assignment to conditions. Quasi-experiments, where random assignment is not possible, are inherently weaker studies than true experiments.

Control of variables in educational contexts is often almost impossible.

Studies that compare different teaching approaches using two different classes each taught by a different teacher (and perhaps not even in the same school) can never be considered fair comparisons able to offer generalisable conclusions about the relative merits of the approaches. Such 'experiments' have no value as research studies. 1

Such 'experiments' are like comparing the solubility of two salts by (a) dropping a solid lump of 10g of one salt into some cold water, and (b) stirring a finely powdered 35g sample of the other salt into hot propanol; and watching to see which seems to dissolve better.

Only large scale studies that encompass a wide range of different teachers/schools/classrooms in each condition are likely to produce results that are generalisable.

The use of inferential statistical tests is only worthwhile when the conditions for those statistical tests are met. Sometimes tests are said to be robust to modest deviations from such requirements as normality. But applying tests to data that do not come close to fitting the conditions of the test is pointless.
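To make this concrete, here is a minimal sketch (with invented scores, not data from the study under discussion) of Welch's version of the two-sample t statistic, the kind of test such studies typically report. Computing the statistic is trivial; what gives it meaning are the conditions: independent observations and roughly normal populations. With two intact classes, observations are clustered within classes, so independence is already doubtful.

```python
import statistics as st

# Invented post-test scores for two classes -- purely illustrative.
group_a = [12, 15, 14, 10, 13, 16, 11, 14]
group_b = [9, 11, 10, 12, 8, 10, 11, 9]

def welch_t(x, y):
    """Welch's t statistic for two independent samples.

    The p-value attached to this statistic is only meaningful if the
    test's conditions hold; the arithmetic itself does not check them.
    """
    vx = st.variance(x) / len(x)  # squared standard error, sample x
    vy = st.variance(y) / len(y)  # squared standard error, sample y
    return (st.mean(x) - st.mean(y)) / (vx + vy) ** 0.5

print(round(welch_t(group_a, group_b), 2))  # 3.66
```

A large-looking statistic here says nothing about whether a t-test was an appropriate analysis in the first place.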

Any research is only as trustworthy as the validity of its measurements. If one does not trust the measuring instrument or the analysis of measurement data then one cannot trust the findings and conclusions.


The results of a research study depend on an extended chain of argumentation, where any broken link invalidates the whole chain. (From 'Critical reading of research')

So, although the website for the Mediterranean Journal of Social Science claims "All articles submitted …undergo to a rigorous double blinded peer review process", I think the peer reviewers for this article were either very generous, very ignorant, or simply very lazy. That may seem harsh, but peer review is meant to help authors improve submissions till they are worthy of appearing in the literature, and here peer review has failed, and the authors (and readers of the journal) have been let down by the reviewers and the editor who ultimately decided this study was publishable in this form.

If I asked a graduate student (or indeed an undergraduate student) to evaluate this paper, I would expect to see a response along these lines:


Applying the 'Critical Reading of Empirical Studies Tool' to 'The effect of predict-observe-explain strategy on learners' misconceptions about dissolved salts'

I still think P-O-E is a very valuable part of the science teacher's repertoire – but this paper cannot contribute anything to support that view.

Work cited:

Note

1 A lot of these invalid experiments get submitted to research journals, scrutinised by editors and journal referees, and then get published without any acknowledgement of how they fall short of meeting the conditions for a valid experiment. (See, for example, the studies discussed in Taber, 2019a.) It is as if the mystique of the experiment is so great that even studies with invalid conclusions are considered worth publishing, as long as the authors did an experiment.

A salt grain is a particle (but with more particles inside it)

Keith S. Taber

Sandra was a participant in the Understanding Science Project. When I interviewed Sandra about her science lessons in Y7 she told me "I've done changing state, burning, and we're doing electricity at the moment". She talked about burning as being a chemical change, and when asked for another example told me dissolving was a chemical change, as when salt was dissolved it was not possible to turn it back to give salt grains of the same size. She told me that if the water was boiled off from salt solution "you'd have the same [amount of salt], but there would just be more particles, but they'd be smaller".

As Sandra had referred to the salt 'particles' being smaller (and as she had told me she had been studying 'changing state'), I wondered if she had been taught about the particle model of matter.

So the salt's got particles. The salt comes as particles, does it?
Yeah.
Do other things come as particles?
Everything has particles in it.
Everything has particles?
Yeah.
But with salt, you can get larger particles, or smaller particles?
Well, most things. Like it will have like thousands and thousands of particles inside it.
So these are other types of particles, are they?
Mm.

So although Sandra had referred to the smaller salt grains as being "smaller particles", it seemed she was aware that 'particles' could also refer to something other than the visible grains. Everything had particles in it. Although salt particles (grains?) could be different sizes, it (any salt grain?) would have a great number ("like thousands and thousands") of particles (not grains – quanticles perhaps) inside it. So it seemed Sandra was aware of the possible ambiguity here: that there were small 'particles' of some materials, but all materials (or, at least, "most things") were made up of a great many 'particles' that were very much smaller.

So if you look at the salt, you can see there's tiny little grains?
Yeah.
But that's not particles then?
Well it sort of is, but you've got more particles inside that.

"It sort of is" could be taken to mean that the grains are 'a kind of particle' in a sense, but clearly not the type of particles that were inside everything. She seemed to appreciate that these were two different types of particle. However, Sandra was not entirely clear about that:

So there's two types are of particles, are there?
I don't know.
Particles within particles?
Yeah.
Something like that, is it?
Yeah.
But everything's got particles has it, even if you can't see them?
Yeah.
So if you dissolved your salt in water, would the water have particles?
Ye:ah.
'cause I've seen water, and I've never seen any particles in the water.
The part-, you can't actually see particles.
Why not?
Because they're too small.
Things can be too small to see?
Yeah.
Oh amazing. So what can you see when you look at water, then? 'cause you see something, don't you?
You can see – what the particles make up.
Ah, I see, but not the individual particles?
No.

Sandra's understanding here seems quite strong – the particles that are inside everything (quanticles) were too small to be seen, and we could only see "what the particles make up". That is, she, to some extent at least, appreciated the emergence of new properties when very large numbers of particles that were individually too small to see were collected together.

Despite this, Sandra's learning was clearly not helped by the associations of the word 'particle'. Sandra may have been taught about submicroscopic particles outside of direct experience, but she already thought of small visible objects like salt grains as 'particles'. This seems to be quite common – science borrows a familiar term, particle, and uses it to label something unfamiliar.

We can see this as extending the usual everyday range of meaning of 'particle' to also include much smaller examples that cannot be perceived, or perhaps as a scientific metaphor – that quanticles are called particles because they are in some ways like the grains and specks that we usually think of as being very small particles. Either way, the choice of a term with an existing meaning to label something that is in some ways quite similar (small bits of matter) but in other ways very different ('particles' without definite sizes/volumes or actual edges/surfaces) can confuse students. It can act as an associative learning impediment if students transfer the properties of familiar particles to the submicroscopic entities of 'particle' theory.

Dissolving salt is a chemical change as you cannot turn it back as it was before

Keith S. Taber

Sandra was a participant in the Understanding Science Project. When I interviewed Sandra about her science lessons in Y7 she told me "I've done changing state, burning, and we're doing electricity at the moment". I asked her about burning:

Well, tell me a bit about burning then. What's burning then?
It's just when something gets set on fire, and turns into ash, or – has a chemical change, whatever.
Has a chemical change: what's a chemical change?
It means something has changed into something else and you can't turn it back.
Oh I see. So burning would be an example of that.
Yeah.

So far this seemed to fit 'target knowledge'. However, Sandra suggested that dissolving would also be a chemical change. Dissolving is not normally considered a chemical change in school science, but a physical change, although that distinction rests on a questionable teaching model. (Chemical change is said to involve bond breaking/making, and of course dissolving a salt does involve breaking up the ionic bonding to form solvent-solute interactions.)

Are there other examples?
Erm – dissolving.
So give me an example of something you might dissolve?
Salt.
Okay, and if you dissolve salt, you can't get it back?
Not really, not as it was before.
No. Can you get it back at all?
Sort of, you can like, erm, make the, boil the water so it turns into gas, and then you have salt, salt, salt on the, left there. Sometimes.
But you think that might not be quite the same as it was before?
No.
No. Different in some way?
Yeah
How might it be different?
Be much smaller.
Oh I see, so do you think you'd have less salt than you started with?
You'd have the same, but there would just be more particles, but they'd be smaller.
Ah, so instead of having quite large grains you might have lots of small grains
Yeah.

So Sandra was clear that one could dissolve salt, and then reclaim the same amount of salt by removing the solvent (water) which from the canonical perspective would mean the change was reversible – a criterion of a physical change.

Yet Sandra also thought that although the amount of salt would be conserved, the salt would be in a different form – it would have different grain size. (Indeed, if the water was boiled off, rather than left to evaporate, it might indeed be produced as very small crystals.)

So, Sandra seemed to have a fairly good understanding of the process, but consider the way she interpreted the criterion of a chemical change: something [salt] has changed into something else [solution] and you can't turn it back [with the same granularity]. Large grains will have changed into small grains – so this would, to Sandra's mind, be a chemical change.

Science teachers deserve a great deal of public appreciation. A teacher can teach something so that a student learns it well – and yet still form an alternative conception – here because of the inherent ambiguity in the ways language is used and understood. Sandra's interpretation – if you start off with large particles and end up with smaller particles then you have not turned it back – was a reasonable interpretation of what she had learnt. (It also transpired there was ambiguity in quite what was meant by particles.)

A chemical change is where two things just go together


Keith S. Taber


Morag was a participant in the Understanding Science Project. In the first interview, in her first term in secondary school, Morag told me that she was studying electricity, having previously studied changing state and burning. When I asked her whether these science topics had anything in common, that made them science, we got into a conversation about chemical reactions, and chemical change:

Do they have anything in common do you think? is there anything similar about those topics?

Changing state and burning's got something in common, but I don't know about electricity.

Oh yeah? So what's, what have they got in common then?

Erm, in burning you have, you could have a chemical reaction, and in changing states you've got chemical reactions as well.

From the canonical scientific perspective, a change of state is not a chemical reaction (so this is an alternative conception), so I followed up on this.

Ah, so what's a chemical reaction?

(I had to learn this) it's when two things, erm, are mixed together and can't be made to the original things easy, easily.

Oh, can you give me an example of that?

{pause, c. 2 seconds}

Water mixing with sugar, but that's not a chemical reaction.

So, Morag offers a definition, or at least a description, of a chemical reaction, but then the example she gives of that type of event is not something she considers to be a chemical reaction. (Dissolving is not usually considered a chemical change, although it usually involves the breaking and forming of bonds, sometimes strong bonds.)

Oh so that's something else is it, is that something different?

I don't know.

Don't know, so can you mix water with sugar?

Yeah, but you can't get the water and the sugar back together very easily.

You can't. Is there a way of doing that?

No.

No? So if I gave you a beaker with some sugar in, and a beaker with some water in, and you mixed them together, poured them all in one beaker, and stirred them up – you would find it then difficult to get the water out or the sugar out, would you?

Ye-ah.

Yeah, so is that a chemical reaction?

No.

No, okay. That's not a chemical reaction.

At this point Morag suggested we look in her book as "it's in my book", but I was more interested in what she could tell me without referring to her notes.

So, have you got any examples of chemical reactions – any you think are chemical reactions?

Fireworks,

Fireworks, okay.

when like the gunpowder explodes, erm in the inside, and you can't get it back to the original rocket once it's has exploded.

and is that what makes it a, er, a chemical reaction, that you can't get it back?

{pause, c. 3 s}

Yeah, I suppose so.

So, now Morag has presented an example of a chemical reaction that would be considered canonical (as chemical change) by scientists. Yet her criterion is the same as the one she used for the dissolving example, which she did not think was a chemical reaction.

Yeah? And then the water and the sugar, you can't get them back very easily, but we don't think that is a chemical reaction?

Yeah – that's a chemical change – {adding quietly} I think.

It's what, sorry?

Well there's, a chemical reaction and a chemical change.

Oh I see. So what's the difference between a chemical reaction and a chemical change?

Erm nothing, it's just two different ways of saying it.

Oh so they're the same thing?

Yeah, just two different ways of saying it.

So, now Morag had introduced a differentiated terminology, initially suggesting that sugar mixing with water was a chemical change, whereas a firework exploding was a chemical reaction. This distinction did not seem to hold up, however, as she believed the terms were synonyms. Yet, as the conversation proceeded, she seemed to change her mind on this point.

So when a firework goes off, the gunpowder, er, explodes in a firework, that's a chemical reaction?

Yeah – yeah, cause something's mixing with the gunpowder to make it blow up.

And so that's a chemical reaction?

Yeah.

And is that a chemical change?

{pause, c. 2 s}

Yeah.

Yeah?

(I suppose.) Yeah.

And when you mix sugar and water, you get kind of sugary water?

Yeah.

Have you got a name for that, when you mix a liquid and solid like that?

{pause, c. 1 s}

Or is that just mixing sugar and water?

{pause, c. 1 s}

There is a name for it, but I don't know it.

Ah. Okay, so when we mix it we get this sugar-water, whatever, and then it's harder to, it's hard to separate it is it?

Yeah.

And get the sugar out and the water out?

Yeah.

So is that a chemical reaction?

{Pause, c. 3 s}

No.

No, is that a chemical change?

{Pause, c. 1 s}

Yes.

Ah, okay.

So, again, Morag was suggesting she could distinguish between a chemical reaction, and a chemical change.

So what's the difference between a chemical change and a chemical reaction?

A reaction is where two things react with each other, like the gunpowder and flame, and a change is where two things just go together. You know like water and sugar, they go together…

In effect we had reached a tautology: in a chemical reaction, unlike a chemical change, things react with each other. She also thought that sugar/water and salt/water mixtures (i.e., solutions) were different "because the sugar's so small it would evaporate with the water"*.

The idea that a chemical reaction has to involve two reactants is common, but it is an alternative conception, as chemists also recognise reactions in which there is only one reactant, which decomposes.

Morag seemed to be struggling with the distinction between a chemical and a physical change. However, that distinction is not an absolute one, and dissolving presents a problematic case. Certainly, without a good appreciation of the submicroscopic models used in chemistry, it is not easy to see why chemical reactions produce new substances but physical changes do not. One of Morag's qualities as a learner, however, was a willingness to 'run with' ideas and try to talk her way into understanding. That did not work here, even though Morag was happy to engage in the conversation.

Morag was also here talking as though, in the gunpowder example, the flame was a reactant (i.e., the flame reacts with the gunpowder). Learners sometimes consider that the substances in a chemical reaction are reacting to heat or stirring, rather than with another substance (e.g., Taber & García Franco, 2010).

Read about learners' alternative conceptions

Source cited:

Taber, K. S., & García Franco, A. (2010). Learning processes in chemistry: Drawing upon cognitive resources to learn about the particulate structure of matter. Journal of the Learning Sciences, 19(1), 99-142.