Psychological skills, academic achievement and…swimming

Keith S. Taber

'Psychological Skills in Relation to Academic Achievement through Swimming Context'

 


Original image by Clker-Free-Vector-Images from Pixabay

I was intrigued by the title of an article I saw in a notification: "Psychological Skills in Relation to Academic Achievement through Swimming Context". In part, it was the 'swimming context' – despite never having been very athletic or sporty (which is not to say I did not enjoy sports, just that I was never particularly good at any), I have always been a regular and enthusiastic swimmer.  Not a good swimmer, mind (too splashy, too easily veering off-line) – but an enthusiastic one. But I was also intrigued by the triad of psychological skills, academic achievement, and swimming.

Perhaps I had visions of students' psychological skills being tested in relation to their academic achievement as they pounded up and down the pool. So, I was tempted to follow this up.

Investigating psychological skills and academic achievement

The abstract of the paper by Bayyat and colleagues reported three aims for their study:

"This study aimed to investigate:

  • (1) the level of psychological skills among students enrolled in swimming courses at the Physical Education faculties in the Jordanian Universities.
  • (2) the relation between their psychological skills and academic achievement.
  • (3) the differences in these psychological skills according to gender."

Bayyat et al., 2021: 4535

The article was published in a journal called 'Psychology and Education', which, its publishers* suggest, is "a quality journal devoted to basic research, theory, and techniques and arts of practice in the general field of psychology and education".

A peer reviewed journal

The peer review policy reports this is a double-blind peer-reviewed journal. This means other academics have critiqued and evaluated a submission prior to its being accepted for publication. Peer review is a necessary (but not sufficient) condition for high quality research journals.

Journals with high standards use expert peer reviewers, and the editors use their reports to both reject low-quality submissions, and to seek to improve high-quality submissions by providing feedback to authors about points that are not clear, any missing information, incomplete chains of argumentation, and so forth. In the best journals editors only accept submissions after reviewers' criticisms have been addressed to the satisfaction of reviewers (or authors have made persuasive arguments for why some criticism does not need addressing).

(Read about peer review)

The authors here report that

"The statistical analysis results revealed an average level of psychological skills, significant differences in psychological skills level in favor of female students, A students and JU[**], and significant positive relation between psychological skills and academic achievement."

Bayyat et al., 2021: 4535

Rewriting slightly, it seems that, according to this study:

  • the students in the study had average levels of psychological skills;
  • the female students had higher levels of psychological skills than their male peers; and
  • there was some kind of positive correlation between psychological skills and academic achievement.

Anyone reading a research paper critically asks themselves questions such as

  • 'what do they mean by that?';
  • 'how did they measure that?';
  • 'how did they reach that conclusion?'; and
  • 'who does this apply to?'

Females are better – but can we generalise?

In this study it was reported that

"By comparing psychological skills between male and female participants, results revealed significant differences in favor [sic] of female participants"

"All psychological skills' dimensions of female participants were significant in favor [sic] of females compared to their male peers. They were more focused, concentrated, confident, motivated to achieve their goals, and sought to manage their stress."

Bayyat et al., 2021: 4541, 4545

"It's our superior psychological skills that make us this good!" (Image by CristianoCavina from Pixabay)

A pedant (such as the present author) might wonder if "psychological skills' dimensions of female participants" [cf. psychological skills' dimensions of male participants?] would not be inherently likely to be in favour of females, but it is clear from the paper that this is intended to refer to the finding that females (as a group) got significantly higher ratings than males (as a group) on the measures of 'psychological skills'.

If we for the moment (but please read on below…) accept these findings as valid, an obvious question is the extent to which these results might generalise beyond the study. That is, to what extent would these findings being true for the participants of this study imply the same thing would be found more widely (e.g., among all students in Jordanian Universities? among all university students? among all adult Jordanians? among all humans?)

Statistical generalisation
Statistical generalisation (From Taber, 2019)

Two key concepts here are the population and the sample. The population is the group that we wish our study to be about (e.g., chemistry teachers in English schools, 11-year olds in New South Wales…), and the sample is the group who actually provide data. In order to generalise to the population from the sample it is important that the sample is large enough and representative of the population (which of course may be quite difficult to ascertain).

(Read about sampling in research)

(Read about generalisation)

In this study the reader is told that "The population of this study was undergraduate male and female students attending both intermediate and advanced swimming courses" (Bayyat et al., 2021: 4536). Taken at face value this might raise the question of why a sample was drawn exclusively from Jordan – unless of course this is the only national context where students attend intermediate or advanced swimming courses. *** However, it was immediately clarified that "They consisted of (n= 314) students enrolled at the schools of Sport Sciences at three state universities". That is, the population was actually undergraduate male and female students from schools of Sport Sciences at three Jordanian state universities attending both intermediate and advanced swimming courses.

"The Participants were an opportunity sample of 260 students" (Bayyat et al., 2021: 4536). So in terms of sample size, 260, the sample made up most of the population – almost 83%. This is in contrast to many educational studies where the samples may necessarily only reflect a small proportion of the population. In general, representativeness of a sample is more important than size, as skew in the sample undermines statistical generalisations (whereas size, for a representative sample, influences the magnitude of the likely error ****) – but a reader is likely to feel that when over four-fifths of the population were sampled it is less critical that a convenience sample was used.

This still does not assure us that the results can be generalised to the population (students from schools of Sport Sciences at three Jordanian state universities attending 'both' intermediate and advanced swimming courses), but psychologically it seems quite convincing.
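The footnoted point about sampling error can be sketched numerically. Below is a simplified illustration (the figures and the 50% proportion are assumptions for the sake of the example), using the standard finite population correction; note that it presumes simple random sampling, which an opportunity sample is not:

```python
import math

def margin_of_error(p, n, N, z=1.96):
    """Approximate 95% margin of error for an estimated proportion p,
    with a sample of size n drawn from a population of size N,
    applying the finite population correction (FPC)."""
    se = math.sqrt(p * (1 - p) / n)       # simple-random-sample standard error
    fpc = math.sqrt((N - n) / (N - 1))    # shrinks towards 0 as n approaches N
    return z * se * fpc

# A sample of 260 from a very large population: error around +/-6%
print(round(margin_of_error(p=0.5, n=260, N=1_000_000), 3))  # 0.061
# Sampling 260 of a population of 314, as here: error around +/-2.5%
print(round(margin_of_error(p=0.5, n=260, N=314), 3))        # 0.025
```

So, other things being equal, sampling over four-fifths of the population does substantially reduce the likely error – but only if the sample is representative in the first place.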

Ontology: What are we dealing with?

The study is only useful if it is about something that readers think is important – and it is clear what it is about. The authors tell us their study is about

  • Psychological Skills
  • Academic Achievement

which would seem to be things educators should be interested in. We do need to know however how the authors understand these constructs: what do they mean by 'a Psychological Skill' and 'Academic achievement'? Most people would probably think they have a pretty good idea what these terms might mean, but that is no assurance at all that different people would agree on this.

So, in reading this paper it is important to know what the authors themselves mean by these terms – so a reader can check they understand these terms in a sufficiently similar way.

What is academic achievement?

The authors suggest that

"academic achievement reflects the learner's accomplishment of specific goals by the end of an educational experience in a determined amount of time"

Bayyat et al., 2021: 4535


This seems to be the extent of the general characterisation of this construct in the paper *****.

What are psychological skills?

The authors tell readers that

"Psychological skills (PS) are a group of skills and abilities that enhances peoples' performance and achievement…[It has been] suggested that PS includes a whole set of trainable skills including emotional control and self-confidence"

Bayyat et al., 2021: 4535


For the purposes of this particular study, they

"identified the psychological skills related to the swimming context such as; leadership, emotional stability, sport achievement motivation, self-confidence, stress management, and attention"

Bayyat et al., 2021: 4536


So the relevant skills are considered to be:

  • leadership
  • emotional stability
  • sport achievement motivation
  • self-confidence
  • stress management
  • attention

I suspect that there would not be complete consensus among psychologists or people working in education over whether all of these constructs actually are 'skills'. Someone who did not consider these (or some of these) characteristics as skills would need to read the authors' claims arising from the study about 'psychological skills' accordingly (i.e., perhaps as being about something other than skills) but as the authors have been clear about their use of the term, this should not confuse or mislead readers.

Epistemology: How do we know?

Having established what is meant by 'psychological skills' and 'academic achievement' a reader would want to know how these were measured in the present study – do the authors use techniques that allow them to obtain valid and reliable measures of 'psychological skills' and 'academic achievement'?

How is academic achievement measured?

The authors inform readers that

"To calculate students' academic achievement, the instructors of the swimming courses conducted a valid and reliable assessment as a pre-midterm, midterm, and final exam throughout the semester…The assessment included performance tests and theoretical tests (paper and pencil tests) for each level"

Bayyat et al., 2021: 4538


Although the authors claim their assessments are valid and reliable, a careful reader will note that the methodology here does not match the definition of "accomplishment of specific goals by the end of an educational experience" (emphasis added) – as only the final examinations took place at the end of the programme. On that point, then, there is a lack of internal consistency in the study. This might not matter to a reader who did not think academic achievement needed to be measured at the end of a course of study.

Information on the "Academic achievement assessment tool", comprising six examinations (pre-midterm, midterm, and final examinations at each of the intermediate and advanced levels) is included as an appendix – good practice that allows a reader to interrogate the instrument.

Although this appendix is somewhat vague on precise details, it offers a surprise to someone (i.e., me) with a traditional notion of what is meant by 'academic achievement' – as both theory and practical aspects are included. Indeed, most of the marks seem to be given for practical swimming proficiency. So, the 'Intermediate swimming Pre-midterm exam' has a maximum of 20 marks available – with breast stroke leg technique and arm technique each scored out of ten marks.

The 'Advanced swimming midterm exam' is marked out of 30, with 10 marks each available for the 200m crawl (female), individual medley (female) and life guarding techniques. This seems to suggest that 20 of the 30 marks available can only be obtained by being female, but this point does not seem to be clarified. Presumably (?) male students had a different task that the authors considered equivalent.

How are psychological skills measured?

In order to measure psychological skills the authors proceeded "to develop and validate a questionnaire" (p.4536). Designing a new instrument is a complex and challenging affair. The authors report how they

"generated a 40 items-questionnaire reflecting the psychological skills previously mentioned [leadership, emotional stability, sport achievement motivation, self-confidence, stress management, and attention] by applying both deductive and inductive methods. The items were clear, understandable, reflect the real-life experience of the study population, and not too long in structure."

Bayyat et al., 2021: 4538

So, items were written which it was thought would reflect the focal skills of interest. (Unfortunately there are no details of what the authors mean by "applying both deductive and inductive methods" to generate the items.) Validity was assured by asking a panel of people considered to have expertise to critique the items:

"the scale was reviewed and assessed by eight qualified expert judges from different related fields (sport psychology, swimming, teaching methodology, scientific research methodology, and kinesiology). They were asked to give their opinion of content representation of the suggested PS [psychological skills], their relatedness, clarity, and structure of items. According to the judges' reviews, we omitted both leadership and emotional stability domains, in addition to several items throughout the questionnaire. Other items were rephrased, and some items were added. Again, the scale was reviewed by four judges, who agreed on 80% of the items."

So, construct validity was a kind of face validity, in that people considered to be experts thought the final set of items would elicit the constructs intended, but there was no attempt to see if responses correlated in any way with any actual measurements of the 'skills'.

Readers of the paper wondering if they should be convinced by the study would need to judge if the expert panel had the right specialisms to evaluate scale items for 'psychological skills', and might find some of the areas of expertise (i.e.,

  • sport psychology
  • swimming
  • teaching methodology
  • scientific research methodology
  • kinesiology)

more relevant than others.

Self-reports

If respondents responded honestly, their responses would have reflected their own estimates of their 'skills' – at least to the extent that their interpretation of the items matched that of the experts. (That is, there was no attempt to investigate how members of the population of interest would understand what was meant by the items.)

Here are some examples of the items in the instrument:

Construct ('psychological skill') – example item:

  • self-confidence: "I manage my time effectively while in class"
  • sports motivation achievement: "I do my best to control everything related to swimming lessons."
  • attention: "I can pay attention and focus on different places in the pool while carrying out swimming tasks"
  • stress-management: "I am not afraid to perform any difficult swimming skill, no matter what"

Examples of statements students were asked to rate in order to measure their 'psychological skills' (source: Bayyat et al., 2021: 4539-4541)

Analysis of data

The authors report various analyses of their data, that lead to the conclusions they reach. If a critical reader was convinced about matters so far, they would still need to believe that the analyses undertaken were

  • appropriate, and
  • completed competently, and
  • correctly interpreted.

Drawing conclusions

However, as a reader I personally would have too many quibbles with the conceptualisation and design of instrumentation to consider the analysis in much detail.

To my mind, at least, the measure of 'academic achievement' seems to be largely an evaluation of swimming skills. They are obviously important in a swimming course, but I do not consider this a valid measure of academic achievement. That is not a question of suggesting academic achievement is better or more important than practical or athletic achievements, but it is surely something different (akin to me claiming to have excellent sporting achievement on the basis of holding a PhD in education).

The measure of psychological skills does not convince me either. I am not sure some of the focal constructs can really be called 'skills' (self-confidence? motivation?), but even if they were, there is no attempt to directly measure skill. At best, the questionnaire offers self-reports of how students perceive (or perhaps wish to be seen as perceiving) their characteristics.

It is quite common in research to see the term 'questionnaire' used for an instrument that is intended to test knowledge or skill – but questionnaires are not the right kind of instrument for that job.

(Read about questionnaires)

Significant positive relation between psychological skills and academic achievement?

So, I do not think this methodology would allow anyone to find a "significant positive relation between psychological skills and academic achievement" – only a relationship between students' self-ratings on some psychological characteristics and swimming achievement. (That may reflect an interesting research question, and could perhaps be a suitable basis for a study, but it is not what this study claims to be about.)
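Setting aside the validity question, it may help to see what the kind of 'relation' being claimed amounts to computationally: typically a correlation coefficient between two sets of scores. A minimal sketch (all data invented for illustration):

```python
from statistics import mean

# Hypothetical data: mean self-ratings on a 1-5 questionnaire scale,
# and course marks out of 100, for six imaginary students
self_ratings = [2.8, 3.1, 3.4, 3.9, 4.2, 4.5]
course_marks = [55, 61, 58, 70, 74, 79]

# Pearson correlation coefficient, computed from first principles
mx, my = mean(self_ratings), mean(course_marks)
cov = sum((x - mx) * (y - my) for x, y in zip(self_ratings, course_marks))
var_x = sum((x - mx) ** 2 for x in self_ratings)
var_y = sum((y - my) ** 2 for y in course_marks)
r = cov / (var_x * var_y) ** 0.5

print(f"Pearson r = {r:.2f}")  # positive r: higher self-ratings accompany higher marks
```

Of course, even a strong positive r computed this way would only relate self-ratings to swimming marks – which is precisely the point at issue.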

Significant differences in psychological skills level in favor of female students?

In a similar way, although it is interesting that females tended to score higher on the questionnaire scales, this shows they had higher self-ratings on average, and tells us nothing about their actual skills.

It may be that the students have great insight into these constructs and their own characteristics and so make very accurate ratings on these scales – but with absolutely no evidential basis for thinking this there are no grounds for making such a claim.

An alternative interpretation of the results is that on average the male students under-rate their 'skills' compared to their female peers. That is the 'skills' could be much the same across gender, but there might be a gender-based difference in perception. (I am not suggesting that is the case, but the evidence presented in the paper can be explained just as well by that possibility.)

An average level of psychological skills?

Finally, we might ask what is meant by

"The statistical analysis results revealed an average level of psychological skills…"

"Results of this study revealed that the participants acquired all four psychological skills at a moderate level."

Bayyat et al., 2021: 4535, 4545

Even leaving aside that what is being measured is something other than psychological skills, it is hard to see how these statements can be justified. This was the first administration of a new instrument being applied to a sample of a very specific population.


Image by Igor Drondin from Pixabay

The paper reports standard deviations for the ratings on the items in the questionnaire, so – as would be expected – there were distributions of results: spreads with different students giving different ratings. Within the sample tested, some of the students will have given higher than median ratings on an item, and some will have given lower than median ratings – although, on average, the ratings for that particular item would have been average for this sample (that is, by definition!)

So, assuming this claim (of average/moderate levels of psychological skills) was not meant as a tautology, the authors seem to be suggesting that the ratings given on this administration of the instrument align with what would typically be obtained across other administrations. Of course, they have absolutely no way of knowing that is the case without collecting data from samples of other populations.

What the authors actually seem to be basing these claims (of average/moderate levels of psychological skills) on is that the average responses on these scales did not give a very high or very low rating in terms of the raw scale. Yet, with absolutely no reference data for how other groups of people might respond on the same instrument that offers little useful information. At best, it suggests something welcome about the instrument itself (ideally one would wish items to elicit a spread of responses, rather than having most responses rated very high or very low) but nothing about the students sampled.

On this point the authors seem to be treating the scale as calibrated in terms of some nominal standard (e.g. 'a rating of 3-4 would be the norm'), when there is no inherent interpretation of particular ratings of items in such a scale that can just be assumed – rather this would be a matter that would need to be explored empirically.
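To see why a raw rating cannot simply be read against a nominal standard, consider a toy illustration (both sets of norms are invented): the same sample mean would look unusually high against one reference population and unusually low against another.

```python
# Hypothetical: a sample mean of 3.4 on a 1-5 rating scale
sample_mean = 3.4

# Imagined norms from two fictitious reference populations
norms = {
    "population A": {"mean": 2.5, "sd": 0.4},  # here 3.4 would be unusually high
    "population B": {"mean": 4.2, "sd": 0.4},  # here 3.4 would be unusually low
}

for name, norm in norms.items():
    z = (sample_mean - norm["mean"]) / norm["sd"]  # standardised score against the norm
    print(f"Relative to {name}: z = {z:+.2f}")
```

Without empirical norms of this kind, labelling 3.4 (or any raw value) as 'moderate' is an assumption, not a finding.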

The research paper as an argument

The research paper is a very specific genre of writing. It is an argument for new knowledge claims. The conclusions of the paper rest on a chain of argument that starts with the conceptualisation of the study and moves through research design, data collection, analysis, and interpretation. For a reader, any link in the argument chain that is not convincing potentially invalidates the knowledge claim(s) being made. Thus the standards expected for research papers are very high.


Research writing

In sum, then, this was an intriguing study, but it did not convince me (even if it apparently convinced the peer reviewers and editor of Psychology and Education). I am not sure it was really about psychological skills or academic achievement

…but at least it was clearly set in the context of swimming.


Work cited:

Bayyat, M. M., Orabi, S. M., Al-Tarawneh, A. D., Alleimon, S. M., & Abaza, S. N. (2021). Psychological Skills in Relation to Academic Achievement through Swimming Context. Psychology and Education, 58(5), 4535-4551.

Taber, K. S. (2019). Experimental research into teaching innovations: responding to methodological and ethical challenges. Studies in Science Education, 55(1), 69-119. doi:10.1080/03057267.2019.1658058 [Download manuscript version]


* Despite searching the different sections of the journal site, I was unable to find who publishes the journal. However, searching outside the site I found a record of the publisher of this journal being 'Auricle Technologies, Pvt., Ltd'.

** It transpired later in the paper that 'JU' referred to students at the University of Jordan: one of three universities involved in the study.

*** I think literally this means those who participated in the study were students attending both an intermediate swimming course and an advanced swimming course – but I read this to mean those who participated in the study were students attending either an intermediate or advanced swimming course. This latter interpretation is consistent with information given elsewhere in the paper: "All schools of sports sciences at the universities of Jordan offer mandatory, reliable, and valid swimming programs. Students enroll in one of three swimming courses consequently: the basic, intermediate, and advanced levels". (Bayyat et al., 2021: 4535, emphasis added)

**** That is, if the sample is unrepresentative of the population, there is no way to know how biased the sample might be. However, if there is a representative sample, then although there will still likely be some error (the results for the sample will not be precisely what the results across the whole population would be) it is possible to calculate the likely size of this error (e.g., say ±3%) which will be smaller when a higher proportion of the population are sampled.

***** It is possible some text that was intended to be at this point has gone missing during production – as, oddly, the following sentence is

facilisi, veritus invidunt ei mea (Times New Roman, 10)

Bayyat et al., 2021: 4535

which seems to be an accidental retention of text from the journal's paper template.

Those flipping, confounding variables!

Keith S. Taber

Alternative interpretations and a study on flipped learning

Image by Please Don't sell My Artwork AS IS from Pixabay

Flipping learning

I was reading about a study of 'flipped learning'. Put very simply, the assumption behind flipped learning is that teaching usually follows a pattern of (a) class time spent with the teacher lecturing, followed by (b) students working through examples largely in their own time. This is a pattern that was (and perhaps still is) often found in universities in subjects that largely teach through lecture courses.

The flipped learning approach switches the use of class time to 'active' learning activities, such as working through exercises, by having students undertake some study before class. That is, students learn about what would have been presented in the lecture by reading texts, watching videos, interacting with on-line learning resources, and so forth, BEFORE coming to class. The logic is that the teacher's input is more useful when students are being challenged to apply the new ideas than as a means of presenting information.

That is clearly a quick gloss; much more could be said about the rationale, the assumptions behind the approach, and its implementation.

(Read more about flipped learning)

However, in simple terms, the mode of instruction for two stages of the learning process

  • being informed of scientific ideas (through a lecture)
  • applying those ideas (in unsupported private study)

are 'flipped' to

  • being informed of scientific ideas (through accessing learning resources)
  • applying those ideas (in a context where help and feedback is provided)

Testing pedagogy

So much for the intention, but does it work? That is where research comes in. If we want to test a hypothesis, such as 'students will learn more if learning is flipped' (or 'students will enjoy their studies more if learning is flipped', or 'more students will opt to study the subject further if learning is flipped', or whatever) then it would seem an experiment is called for.

In principle, experiments allow us to see if changing some factor (say, the sequence of activities in a course module) will change some variable (say, student scores on a test). The experiment is often the go-to methodology in natural sciences: modify one variable, and measure any change in another hypothesised to be affected by it, whilst keeping everything else that could conceivably have an influence constant. Even in science, however, it is seldom that simple, and experiments can never actually 'prove' our hypothesis is correct (or false).

(Read more about the scientific method)

In education, running experiments is even more challenging (Taber, 2019). Learners, classes, teachers, courses, schools, universities are not 'natural kinds'. That is, the kind of comparability you can expect between two copper sulphate crystals of a given mass, or two specimens of copper wire of given dimensions, does not apply: it can matter a lot whether you are testing this student or that student, or if the class is taught by one teacher or another.

People respond to conditions differently from inanimate objects – if testing the conductivity of a sample of a salt solution of a given concentration, it should not matter if it is Monday morning or Thursday afternoon, or whether it is windy outside, or which team lost last night's match, or even whether the researcher is respectful or rude to the sample. Clearly, when testing the motivation or learning of students, such things could influence measurements. Moreover, a sample of gas neither knows nor cares what you are expecting to happen when you compress it, but people can be influenced by the expectations of researchers (the so-called expectancy effect – also known as the Pygmalion effect).

(Read about experimental research into teaching innovations)

Flipping the fundamentals of analytic chemistry

In the study, Ponikwer and Patel flipped part of a module on the fundamentals of analytical chemistry, which was part of a BSc honours degree in biomedical science. The module was divided into three parts:

  1. absorbance and emission spectroscopy
  2. chromatography and electrophoresis
  3. mass spectroscopy and nuclear magnetic resonance spectroscopy

Students were taught the first topics by the usual lectures, then the topics of chromatography and electrophoresis were taught 'flipped', before the final topics were taught through the usual lectures. This pattern was repeated over three successive years.

[Figure 1 in the paper offers a useful graphical representation of the study design. If I had been prepared to pay SpringerNature a fee, I would have been allowed to reproduce it here.*]

The authors of the study considered the innovation a success

This study suggests that flipped learning can be an effective model for teaching analytical chemistry in single topics and potentially entire modules. This approach provides the means for students to take active responsibility in their learning, which they can do at their own pace, and to conduct problem-solving activities within the classroom environment, which underpins the discipline of analytical chemistry. (Ponikwer & Patel,  2018: p.2268)

Confounding variables

Confounding variables are other factors which might vary between conditions and have an effect.

(Read about confounding variables)

Ponikwer and Patel were aware that one needs to be careful in interpreting the data collected in such a study. For example, it is not especially helpful to consider how well students did on the examination questions at the end of term to see if students did as well, or better, on the flipped topics than on the other topics taught. Clearly students might find some topics, or indeed some questions, more difficult than others regardless of how they studied. Ponikwer and Patel reported that on average students did significantly better on questions from the flipped elements, but included important caveats

"This improved performance could be due to the flipped learning approach enhancing student learning, but may also be due to other factors, such as students finding the topic of chromatography more interesting or easier than spectroscopy, or that the format of flipped learning made students feel more positive about the subject area compared with those subject areas that were delivered traditionally." (Ponikwer & Patel,  2018: p.2267)

Whilst acknowledging such alternative explanations for their findings might seem to undermine their results it is good science to be explicit about such caveats. Looking for (and reporting) alternative explanations is a key part of the scientific attitude.

This good scientific practice is also clear where the authors discuss how attendance patterns varied over the course. The authors report that the attendance at the start of the flipped segment was similar to what had come before, but then attendance increased slightly during the flipped learning section of the course. They point out this shift was "not significant"; that is, the statistics suggested a chance effect could not be ruled out.
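For readers unfamiliar with what such a significance check involves, here is a minimal sketch of one common approach, a two-proportion z-test (the attendance figures are invented for illustration; the paper may well have used a different test):

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic comparing two proportions (e.g., attendance rates)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                        # pooled proportion
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Invented figures: 162 of 200 possible attendances during flipped sessions,
# versus 150 of 200 beforehand
z = two_proportion_z(162, 200, 150, 200)
print(round(z, 2))  # |z| < 1.96, so at the 5% level this could be a chance effect
```

A difference that looks appreciable in raw terms can still fall short of the conventional threshold for ruling out chance.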

However, Ponikwer and Patel do report a statistically "significant reduction in the attendance at the non-flipped lectures delivered after the flipped sessions" (p.2265) – that is, once students had experienced the flipped learning, on average they tended to attend normal lectures less later in their course. The authors suggest this could be a positive reaction to how they experienced the flipped learning, but again they point out that there were confounding variables, and other interpretations could not be ruled out:

"This change in attendance may be due to increased engagement in the flipped learning module; however, it could also reflect a perception that a more exciting approach of lecturing or content is to be delivered. The enhanced level of engagement may also be because students could feel left behind in the problem-solving workshop sessions. The reduction in attendance after the flipped lecture may be due to students deciding to focus on assessments, feeling that they may have met the threshold attendance requirement" (Ponikwer & Patel,  2018: p.2265).

So, with these students, taking this particular course, in this particular university, having this sequence of topics based on some traditional and some flipped learning, there is some evidence of flipped learning better engaging students and leading to improved learning – but subject to a wide range of caveats which allow various alternative explanations of the findings.

(Read about caveats to research conclusions)

Pointless experiments?

Given the difficulties of interpreting experiments in education, one may wonder whether there is any point in experiments on teaching and learning. On the other hand, for the lecturing staff on the course, it would seem strange to get these results and then dismiss them (it has not been proved that flipped learning has positive effects, but the results are at least suggestive, and we can only base our actions on the available evidence).

Moreover, Ponikwer and Patel collected other data, such as students' perceptions of the advantages and challenges of the flipped learning approach – data that can complement their statistical tests, and also inform potential modifications of the implementation of flipped learning for future iterations of the course.

(Read about the use of multiple research techniques in studies)

Is generalisation possible?

What does this tell us about the use of flipped learning elsewhere? Studies taking place in a single unique teaching and learning context do not automatically tell us what would have been the case elsewhere – with different lecturing staff, a different demographic of students, or when learning about marine ecology or general relativity. Such studies are best seen as context-directed: as being most relevant to where they were carried out.

However, again, even if research cannot be formally generalised, that does not mean that it cannot be informative to those working elsewhere who may apply a form of 'reader generalisation' to decide either:

a) that teaching and learning context seems very similar to ours: it might be worth trying that here;

or

b) that is a very different teaching and learning context to ours: it may not be worth the effort and disruption to try that out here based on the findings in such a different context.

(Read about generalisation)

This requires studies to give details of the teaching and learning context where they were carried out (so-called 'thick description'). Clearly, the more similar a study context is to one's own teaching context, and the wider the range of teaching and learning contexts where a particular pedagogy or teaching approach has been shown to have positive outcomes, the more reason there is to feel it is worth trying something out in one's own classroom.

I have argued that:

"What are [common in the educational research literature] are individual small-scale experiments that cannot be considered to offer highly generalisable results. Despite this, where these individual studies are seen as being akin to case studies (and reported in sufficient detail) they can collectively build up a useful account of the range of application of tested innovations. That is, some inherent limitations of small-scale experimental studies can be mitigated across series of studies, but this is most effective when individual studies offer thick description of teaching contexts and when contexts for 'replication' studies are selected to best complement previous studies." (Taber, 2019: 106)

In that regard, studies like that of Ponikwer and Patel can be considered not as 'proof' of the effectiveness of flipped learning, but as part of a cumulative evidence base for the value of trying out the approach in various teaching situations.

Why I have not included the original figure showing the study design

* I had hoped to include in this post a copy of the figure in the paper showing the study design. The paper is not published open access, and so the copyright in the 'design' (that is, the design of the figure **, not the study!) means that it cannot legally be reproduced without permission. I sought permission to reproduce the figure here through the publisher's (SpringerNature's) online permissions request system, explaining this was to be used in an academic scholar's personal blog.

Springer granted permission for reuse, but subject to a fee of £53.83.

As copyright holder/managers they are perfectly entitled to do that. However, I had assumed that they would offer free use for a non-commercial purpose that offers free publicity to their publication. I have other uses for my pension, so I refer readers interested in seeing the figure to the original paper.

** Under the conventions associated with copyright law the reproduction of short extracts of an academic paper for the purposes of criticism and review is normally considered 'fair use' and exempt from copyright restrictions. However, any figure (or table) is treated as a discrete artistic design and cannot be copied from a work in copyright without permission.

(Read about copyright and scholarly works)

 

Work cited:

Do nerve signals travel faster than the speed of light?

Keith S. Taber

I have recently posted on the blog about having been viewing some of the court testimony being made available to the public in the State of Minnesota v. Derek Michael Chauvin court case (27-CR-20-12646: State vs. Derek Chauvin).

[Read 'Court TV: science in the media']

Prof. Martin J. Tobin, M.D., Loyola University Chicago Medical Center

I was watching the cross examination of expert witness Dr Martin J. Tobin, Professor of Pulmonary and Critical Care Medicine by defence attorney Eric Nelson, and was intrigued by the following exchange:

Now you talked quite a bit about physics in your direct testimony, agreed?

Yes

And you would agree that physics, or the application of physical forces, is a constantly changing, er, set of circumstances.

I did not catch what you said.

Sure. You would agree with me, would you not, that when you look at the concepts of physics, these things are constantly changing, right?

Yeah, all of science is constantly changing.

Constant! I mean,

Yes.

in milliseconds and nanoseconds, right?

Yes.

And so if I put this much weight [Nelson demonstrating by shifting position] or this much weight [shifting position], all of the formulas [sic] and variations, will change from second to second, from millisecond to millisecond, nanosecond to nanosecond, agreed.

I agree.

Similarly, biology sort of works the same way. Right?

Yes.

My heart beats, my lungs breathe [sic], my brain is sending millions of signals to my body, at all times.

Correct.

Again, even, I mean, faster than the speed of light, right?

Correct.

Millions of signals every nanosecond, right?

Yes.

Day 9. 27-CR-20-12646: State vs. Derek Chauvin

Agreeing – but talking about different things?

The first thing that struck me here was that Mr Nelson and Dr Tobin seemed to be talking at cross-purposes – something neither participant acknowledged (and so perhaps neither was aware of).

I think Nelson is trying to make an argument that the precise state of Mr George Floyd (whose death is at the core of the prosecution of Mr Chauvin) would have been a dynamic matter during the time he was restrained on the ground by three police officers (an argument being made in response to the expert's presentation of testimony suggesting it was possible to posit fairly precise calculations of the forces acting during the episode).

This seems fairly clear from the opening question of the exchange above:

Now you talked quite a bit about physics in your direct testimony, agreed? … And you would agree that physics, or the application of physical forces, is a constantly changing, er, set of circumstances.

However, Dr Tobin does not hear this clearly (there are plexiglass screens between them as COVID precautions, and Nelson acknowledges that he is struggling with his voice by this stage of the trial).

Nelson re-phrases, but actually says something rather different:

You would agree with me, would you not, that when you look at the concepts of physics, these things are constantly changing, right?

['These things' presumably refers to 'the application of physical forces', but if Dr Tobin did not hear Mr Nelson's previous utterance then 'these things' would be taken to be 'the concepts of physics'.]

So, now it is not the forces acting in a real world scenario which are posited to be constantly changing, but the concepts of physics. Dr Tobin's response certainly seems to make most sense if the question is understood in terms of the science itself being in flux:

Yeah, all of science is constantly changing.

Given that context, the following agreement that these changes are occurring "in milliseconds and nanoseconds" seems a little surreal, as it is not quite clear in what sense science is changing on that scale (except in the sense that science is continuing constantly – certainly not in the sense that canonical accounts of concepts shift at that pace: say, in the way Einstein's notions of physics came to replace those of Newton).

In the next exchange the original context Nelson had presented ("the application of physical forces, is … constantly changing") becomes clearer:

And so if I put this much weight [Nelson demonstrating by shifting position] or this much weight [shifting position], all of the formulas and variations, will change from second to second, from millisecond to millisecond, nanosecond to nanosecond, agreed.

I agree.

As a pedantic science teacher I would suggest that it is not the formulae of physics that change, but the values to be substituted into the system of equations derived from them to describe the particular event: but I think the intended meaning is clear. Dr Tobin is a medical expert, not a physicist nor a science teacher, and the two men appear to be agreeing that the precise configurations of forces on a person being restrained will constantly change, which seems reasonable. I guess that is what the jury would take from this.

If my interpretation of this dialogue is correct (and readers may check the footage and see how they understand the exchange) then at one point the expert witness was agreeing with the attorney, but misunderstanding what he was being asked about (how in the real world the forces acting are continuously varying, not how the concepts of science are constantly being developed). Even if I am right, this does not seem problematic here, as the conversation shifted to the intended focus quickly (an example of Bruner's 'constant transactional calibration' perhaps?).

However, this reminds me of interviews with students I have carried out (and others I have listened to undertaken by colleagues), and of classroom episodes where teacher and student are agreeing – but actually are talking at cross purposes. Sometimes it becomes obvious to those involved that this is what has happened – but I wonder how often it goes undetected by either party. (And how often there are later recriminations – "but you said…"!)

Simplifying biology?

The final part of the extract above also caught my attention, as I was not sure what to make of it.

My heart beats, my lungs breathe, my brain is sending millions of signals to my body, at all times.

Correct.

Again, even, I mean, faster than the speed of light, right?

Correct.

Millions of signals every nanosecond, right?

Yes.

How frequently do our brains send out signals?

I am a chemist and physicist, not a biologist, so I was unsure what to make of the millions of signals the brain is said to be sending out to the rest of the body every nanosecond.

I can certainly believe that in a working human brain there may be billions of neurons firing every second as they 'communicate' with each other. If my brain has something like 100 000 000 000 neurons then that does not seem entirely unreasonable.

But does the brain really send signals to the rest of the body (whether through nerves or by the release of hormones) at a rate of n × 10⁶ signals per 10⁻⁹ s ("millions of signals every nanosecond"), that is, multiples of 10¹⁵ signals per second, as Mr Nelson suggests and Dr Tobin agrees?

Surely not? Dr Tobin is a professor of medicine and a much published expert in his field and should know better than me. But I would need some convincing.
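As a plausibility check, we can put the claimed rate alongside some commonly cited ballpark figures for the brain. These numbers are my own assumptions for illustration (not anything from the testimony): roughly 8.6 × 10¹⁰ neurons, each firing on average at no more than a few hertz. A minimal back-of-the-envelope sketch:

```python
# Back-of-the-envelope check of "millions of signals every nanosecond".
# All figures below are rough, commonly cited ballpark values (assumptions).

claimed_rate = 1e6 / 1e-9      # "millions per nanosecond", in signals per second
neurons = 8.6e10               # approximate neuron count in a human brain
avg_firing_hz = 1.0            # generous average firing rate per neuron (Hz)

total_firings = neurons * avg_firing_hz   # every firing in the whole brain
print(f"Claimed rate:            {claimed_rate:.1e} signals/s")
print(f"Plausible total firings: {total_firings:.1e} per second")
print(f"Claim exceeds estimate by a factor of ~{claimed_rate / total_firings:,.0f}")
```

On these (generous) assumptions, even if every neuronal firing in the whole brain counted as a 'signal to the body', the claimed rate would still be out by some four orders of magnitude.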

Biological warp-drives

I will need even more convincing that the brain sends signals to the body faster than the speed of light. Both nervous and hormonal communication are many orders of magnitude slower than light speed. The speed of light is still considered a practical limit on the motion of massive objects (i.e., anything with mass). Perhaps signals could be sent by quantum entanglement – but that is not how our nervous and endocrine systems function.
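To put rough numbers on the gap (the values here are standard textbook ballpark figures, used as assumptions for illustration): the fastest myelinated nerve fibres conduct at up to about 120 m/s, hormones travel with the bloodstream at of the order of 1 m/s or less, and light travels at about 3 × 10⁸ m/s:

```python
# How much slower are nerve and hormonal signals than light?
# Rough textbook values, used here as assumptions.

speed_of_light = 3.0e8   # m/s, to one significant figure
nerve_speed = 120.0      # m/s, upper end for fast myelinated nerve fibres
hormone_speed = 1.0      # m/s or less: hormones travel with the bloodstream

print(f"Light vs fastest nerves: ~{speed_of_light / nerve_speed:,.0f} times faster")
print(f"Light vs hormones:       ~{speed_of_light / hormone_speed:,.0f} times faster")
```

So even the fastest nerve impulses are some two and a half million times slower than light, and hormonal signalling is slower still.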

If Mr Nelson and Dr Tobin do have good reason to believe that communication of signals in the human body can travel faster than the speed of light then this could be a major breakthrough. Science and technology have made many advances by mimicking, or learning from, features of the structure and function of living things. Perhaps, if we can learn how the body is achieving this impossible feat, warp-drive need not remain just science fiction.

A criminal trial is a very serious matter, and I do not intend these comments to be flippant. I watched the testimony genuinely interested in what the science had to say. The real audience for this exchange was the jury and I wonder what they made of this, if anything. Perhaps it should be seen as poetic language making a general point, and not a technical account to be analysed pedantically. But I think it does raise issues about how science is communicated to non-experts in contexts such as courtrooms.

This was an expert witness for the prosecution (indeed, very much for the prosecution) who was agreeing with the defence counsel on a point strictly contrary to accepted science. If I were on a jury, and an expert made a claim that I knew was contrary to current well-established scientific thinking (whether that the earth came into being 10 000 years ago, or that the brain sends out signals that travel faster than the speed of light), this would rather undermine my confidence in the rest of their expert testimony.

 

 

 

Responding to a misconception about my own teaching

Keith S. Taber

There are many postings here about things that learners said, and so presumably thought, about curriculum topics that would likely surprise, if not shock, the teachers who had taught them those topics. I am certainly not immune from being misunderstood. Today, I reflect on how someone seems to have understood some of my own teaching, and indeed seriously objected to it.

When I have called out academic malpractice in this blog, the targets have usually been conference organisers or journal administrators using misleading (or downright dishonest) techniques, or publishers mistreating authors. I feel somewhat uneasy about publicly contradicting a junior scholar. However, I also do not appreciate being publicly described as deliberately misleading a student, as has happened here, and my direct challenge to the blog author was rejected.

The accusation

A while back some Faculty colleagues referred me to a blog that included the following comments:

In the Faculty of Education students pursuing the MPhil or PhD take a research ethics lecture that presents the Tuskegee Syphilis Study as ethically sound, but only up to the year 1947 when penicillin was actively being used to treat syphilis. According to the Cambridge lecturer, that's the point when the study became unethical.

When I interrupted his lecture to object to his presentation, I was told by that lecturer that he'd never received any objections in his many, many years of teaching the same slides on the same course. That was not true. He knew and the Faculty knows and yet that false information continues to be disseminated to students, many of whom will go on to complete research in developing countries where their only reference for their ethical or unethical behavior is this lecture.

I am not named, but virtually anyone in my Faculty, or having taken graduate studies there in the last few years, would surely know who was being discussed. As is pointed out in our Educational Research course, and the Research Methods strand of other graduate courses, if you want to avoid someone being identified in your writing, it is not enough to not name them. I can be fairly confident the author of the comments above should have known that: it is a point made in the very lecture being criticised.

This blog posting seems to have received quite a lot of attention among students at the University Faculty where I worked. Yet the two claims here are simply not correct. The teaching is seriously misrepresented, and I certainly did not lie to this student.

The blog invited me to 'Leave a Reply', so I did. My comments were subject to moderation – and the next morning I found a response in my email in-box. My comments would not be posted, and the claims would not be amended: I was welcome to post my reply elsewhere, but not at the site where I was being criticised. So, here goes:

The (rejected) reply

I hope you are well.

I was directed to your blog by a group of scholars in the Faculty (of Education at Cambridge). It is an impressive blog. However, I was rather surprised by some of what you have posted. I was the lecturer you refer to in your posting who taught the lecture on research ethics. I do indeed remember you interrupting me when I was presenting the Tuskegee syphilis study as an example of unethical research. I always encouraged students to participate in class, and would have welcomed your input at the end of my treatment of that example.

However, having read your comments here, I do need to challenge your account. I do not consider that the Tuskegee syphilis study was initially ethically sound, and I do not (and did not) teach that. I certainly did make the point that even if the study had been ethical until antibiotics were widely available, continuing it beyond that point would have been completely unjustifiable. But that was certainly not the only reason the study was unethical. Perhaps this would have been clearer if you had let me finish my own comments before interjecting – but even so I really do not understand how you could have interpreted the teaching that way.

Scheme (an annotated version of 'the ethical field', Taber, 2013a, Figure 9.1) used to summarise ethical issues in the Tuskegee syphilis study in my Educational Research lecture on ethical considerations of research.

The reference to 1947 in the posting quoted above relates to the 'continue' issue under research quality – the research (which involved medical staff periodically observing, but not treating, diseased {black, poor, mostly illiterate} men who had not been told of the true nature of their condition) was continued even when effective, safe treatment was available and any claims to the information being collected having potential to inform medical practice became completely untenable.

I may well have commented that no one had ever raised any objections to the presentation when I had given the lecture on previous occasions over a number of years – because that is true. No one had previously raised any concerns with me regarding my teaching of this example (or any aspect of the lecture as far as I can recall). I am not sure why you seem to so confidently assume otherwise: regarding this, you are simply wrong.

Usually in that lecture I would present a brief account of the Milgram 'learning' experiment, which would often lead to extended discussion about the ethical problems of that research in relation to its motivation and what was usefully learnt from it. Then, later in the session, I would talk about the Tuskegee study, which normally passed without comment. I had always assumed that was because the study is so obviously and seriously problematic that no one would see any reason to disagree with my critique. Then I would go on to discuss other issues and studies. I can assure you that no one had previously, before you, raised any concerns about my teaching of this example with me. If anyone in earlier cohorts had any concerns about this example they would have been welcome to talk to me about them – either in class, or privately afterwards. No one ever did.

I have no reason to believe that colleagues at Cambridge are deliberately disseminating false information to students, but then I do not audit other teaching officers' lectures, and I cannot speak for them. However, I can speak for myself, just as you rightly speak up for yourself. I have certainly always taken care to do my best not to teach things that are not the case. Of course, as a school and college science teacher I was often teaching models and simplifications, and not the 'whole' truth, but that is the nature of pedagogy, and is something we should make clear to learners (i.e., that they are being taught models and simplifications that can later in their studies be developed through more sophisticated treatments).

In a similar way, I used simplifications and models in my research methods lectures at Cambridge – for example, in terms of the 'shape' of a research project, or contrasting paradigms, or types of qualitative analysis, and so on, but would make explicit to the class that this is what they were: 'teaching models'. I entered the teaching profession to make a positive difference; to help learners develop, and to acquire new understandings and perspectives and skills; not to misinform people. I very much suspect that on occasions I must have got some things wrong, but, if so, such errors would always have been honest mistakes. I have never knowingly taught something that I thought was untrue.

So, whilst I admire your courage in standing up for what you believe, and I certainly wish you well, what you have written is not correct, and I trust my response will be posted so that your inaccurate remarks will not go unchallenged. I suspect that you are not being deliberately untruthful (you accuse me of telling you something I knew was not true: I try to be charitable and give people the benefit of doubt, so I would like to think that you were writing your comments in good faith), but I do not understand how you managed to come to the interpretation of my teaching that you did, and wish that you would have at least heard me out before interrupting the class, as that may have clarified my position for you. The Tuskegee syphilis study was a racist, unethical study that misled and abused some of those people with the lowest levels of economic and political power in society: people (not just the men subjected to the study, but also their families) who were betrayed by those employed by the public health service that they trusted (and should have been able to trust) to look after their interests. I do not see how anyone could consider it an ethically sound study, and I struggle to see why you would think anyone could.

Your claim that I lied about not having previously received complaints about my teaching of this topic before is simply untrue – it is a falsehood that I hope you will be prepared to correct.

What should a 'constructivist' teacher make of this?

I should be careful about criticising a student for thinking I was teaching something quite different from what I thought I was teaching. I have spent much of my career telling other teachers that learners will make sense of our teaching in terms of the interpretive resources they have available, and so they may interpret our teaching in unexpected ways. Learners will always be biased to understand in terms of their expectations and past experiences. We see it all the time in science teaching, as many of the posts here demonstrate.

I have described learning as being an incremental, interpretive, and so iterative, process and not a simple transfer of understanding (Taber, 2014). Teaching (indeed communication) is always a representation of thinking in a publicly accessible form (speech, gesture, text, diagrams {what sense does the figure above make out of the context of the lecture?}, models, etc.) – and whatever meaning may have informed the production of the representation, the representation itself does not have or contain meaning: the person accessing that representation has to impose their own interpretation to form a meaning (Taber, 2013b). After teaching and writing about these ideas, I would be a hypocrite to claim that a learner could not misinterpret my own teaching, as if I could communicate perfectly to a room full of students from all around the world with different life experiences and varied disciplinary backgrounds!

Even so, I am still struggling to understand the interpretation put on my teaching in this case, despite going back to revisit the teaching materials a number of times. Most of the points I was making must have been completely disregarded for anyone to think I did not consider the study, which ran from 1932 to 1972 (Jones, 1993), unethical until 1947. So, even for someone who claims to be a constructivist teacher and knows there is always a risk of learners misconceiving teaching, this example seems an extreme case.

The confident claim that I was not telling the truth when I said I had received no previous complaints about my teaching of this example is even harder to understand. It is at least a good reminder for me not to assume I know what students are thinking, or that they know what I am thinking, or can readily access the intended meaning in my teaching. I've made those points to others enough times, so I will try to see this incident as a useful reminder to follow my own advice.

Sources cited:

A case of hybrid research design?

When is "a case study" not a case study? Perhaps when it is (nearly) an experiment?

Keith S. Taber

I read this interesting study exploring learners' shifting conceptions of the particulate nature of gases.

Mamombe, C., Mathabathe, K. C., & Gaigher, E. (2020). The influence of an inquiry-based approach on grade four learners' understanding of the particulate nature of matter in the gaseous phase: a case study. EURASIA Journal of Mathematics, Science and Technology Education, 16(1), 1-11. doi:10.29333/ejmste/110391

Key features:

  • Science curriculum context: the particulate nature of matter in the gaseous phase
  • Educational context: Grade 4 students in South Africa
  • Pedagogic context: Teacher-initiated inquiry approach (compared to a 'lecture' condition/treatment)
  • Methodology: "qualitative pre-test/post-test case study design" – or possibly a quasi-experiment?
  • Population/sample: the sample comprised 116 students from four grade four classes, two from each of two schools

This study offers some interesting data, providing evidence of how students represent their conceptions of the particulate nature of gases. What most intrigued me about the study was its research design, which seemed to reflect an unusual hybrid of quite distinct methodologies.

In this post I look at whether the study is indeed a case study as the authors suggest, or perhaps a kind of experiment. I also make some comments about the teaching model of the states of matter presented to the learners, and raise the question of whether the comparison condition (lecturing 8-9 year old children about an abstract scientific model) is appropriate, and indeed ethical.

Learners' conceptions of the particulate nature of matter

This paper is well worth reading for anyone who is not familiar with existing research (such as that cited in the paper) describing how children make sense of the particulate nature of matter, something that many find counter-intuitive. As a taster, I reproduce here two figures from the paper (which is published open access under a creative commons license* that allows sharing and adaptation of copyright material with due acknowledgement).

Figures © 2020 by the authors of the cited paper *

Conceptions are internal, and only directly available to the epistemic subject, the person holding the conception. (Indeed, some conceptions may be considered implicit, and so not even available to direct introspection.) In research, participants are asked to represent their understandings in the external 'public space' – often in talk, here by drawing (Taber, 2013). The drawings have to be interpreted by the researchers (during data analysis). In this study the researchers also collected data from group work during learning (in the enquiry condition) and by interviewing students.

What kind of research design is this?

Mamombe and colleagues describe their study as "a qualitative pre-test/post-test case study design with qualitative content analysis to provide more insight into learners' ideas of matter in the gaseous phase" (p. 3), yet it has many features of an experimental study.

The study was

"conducted to explore the influence of inquiry-based education in eliciting learners' understanding of the particulate nature of matter in the gaseous phase"

p.1

The experiment compared two pedagogical treatments:

  • "inquiry-based teaching…teacher-guided inquiry method" (p.3) guided by "inquiry-based instruction as conceptualized in the 5Es instructional model" (p.5)
  • "direct instruction…the lecture method" (p.3)

These pedagogic approaches were described:

"In the inquiry lessons learners were given a lot of materials and equipment to work with in various activities to determine answers to the questions about matter in the gaseous phase. The learners in the inquiry lessons made use of their observations and made their own representations of air in different contexts."

"the teacher gave probing questions to learners who worked in groups and constructed different models of their conceptions of matter in the gaseous phase. The learners engaged in discussion and asked the teacher many questions during their group activities. Each group of learners reported their understanding of matter in the gaseous phase to the class"

p.5, p.1

"In the lecture lessons learners did not do any activities. They were taught in a lecturing style and given all the notes and all the necessary drawings.

In the lecture classes the learners were exposed to lecture method which constituted mainly of the teacher telling the learners all they needed to know about the topic PNM [particulate nature of matter]. …During the lecture classes the learners wrote a lot of notes and copied a lot of drawings. Learners were instructed to paste some of the drawings in their books."

pp.5-6

The authors report that,

"The learners were given clear and neat drawings which represent particles in the gaseous, liquid and solid states…The following drawing was copied by learners from the chalkboard."

p.6
Figure used to teach learners in the 'lecture' condition. Figure © 2020 by the authors of the cited paper *
A teaching model of the states of matter

This figure shows increasing separation between particles moving from solid to liquid to gas. It is not a canonical figure: in reality the spacing in a liquid is not substantially greater than in a solid (indeed, in ice floating on water the spacing is greater in the solid state), whereas the difference in spacing between the two fluid states is under-represented.

Such figures also do not show the very important dynamic aspect: in a solid, particles can usually only oscillate around a fixed position (a very low rate of diffusion notwithstanding); in a liquid, particles can move around, but movement is restricted by the close arrangement of (and intermolecular forces between) the particles; and in a gas, there is a significant mean free path between collisions, along which particles move with virtually constant velocity. A static figure like this, then, does not show the critical differences in particle interactions which are core to the basic scientific model.

Perhaps even more significantly, Figure 2 suggests there is the same level of order in all three states, whereas the difference in ordering between a solid and a liquid is much more significant than any change in particle spacing.

In teaching, choices have to be made about how to represent science (through teaching models) to learners who are usually not ready to take on board the full details and complexity of scientific knowledge. Here, Figure 2 represents a teaching model where it has been decided to emphasise one aspect of the scientific model (particle spacing) by distorting the canonical model, and to neglect other key features of the basic scientific account (particle movement and arrangement).

External teachers taught the classes

The teaching was undertaken by two university lecturers

"Two experienced teachers who are university lecturers and well experienced in teacher education taught the two classes during the intervention. Each experienced teacher taught using the lecture method in one school and using the teacher-guided inquiry method in the other school."

p.3

So, in each school there was one class taught by each approach (enquiry/lecture) by a different visiting teacher, and the teachers 'swapped' the teaching approaches between schools (a sensible measure to balance possible differences between the skills/styles of the two teachers).

The research design included a class in each treatment in each of two schools

An experiment; or a case study?

Although the study compared progression in learning across two teaching treatments using an analysis of learner diagrams, the study also included interviews, as well as learners' "notes during class activities" (which one would expect would be fairly uniform within each class in the 'lecture' treatment).

The outcome

The authors do not consider their study to be an experiment, despite setting up two conditions for teaching, and comparing outcomes between the two conditions, and drawing conclusions accordingly:

"The results of the inquiry classes of the current study revealed a considerable improvement in the learners' drawings…The results of the lecture group were however, contrary to those of the inquiry group. Most learners in the lecture group showed continuous model in their post-intervention results just as they did before the intervention…only a slight improvement was observed in the drawings of the lecture group as compared to their pre-intervention results"

pp.8-9

These statements can be read in two ways – either

  • a description of events (it just happened that with these particular classes the researchers found better outcomes in the enquiry condition), or
  • as the basis for a generalised inference.

An experiment would be designed to test a hypothesis (this study does not seem to have an explicit hypothesis, nor explicit research questions). Participants would be assigned randomly to conditions (Taber, 2019), or, at least, classes would be randomly assigned (although then strictly each class should be considered as a single unit of analysis offering much less basis for statistical comparisons). No information is given in the paper on how it was decided which classes would be taught by which treatment.

Representativeness

A study could be carried out with the participation of a complete population of interest (e.g., all of the science teachers in one secondary school), but more commonly a sample is selected from a population of interest. In a true experiment, the sample has to be selected randomly from the population (Taber, 2019) which is seldom possible in educational studies.

The study investigated a sample of 'grade four learners'

In Mamombe and colleagues' study the sample is described. However, there is no explicit reference to the population from which the sample is drawn. Yet the use of the term 'sample' (rather than just, say, 'participants') implies that they did have a population in mind.

The aim of the study is given as to "to explore the influence of inquiry-based education in eliciting learners' understanding of the particulate nature of matter in the gaseous phase" (p.1) which could be considered to imply that the population is 'learners'. The title of the paper could be taken to suggest the population of interests is more specific: "grade four learners". However, the authors make no attempt to argue that their sample is representative of any particular population, and therefore have no basis for statistical generalisation beyond the sample (whether to learners, or to grade four learners, or to grade four learners in RSA, or to grade four learners in farm schools in RSA, or…).

Indeed, only descriptive statistics are presented: there is no attempt to use tests of statistical significance to infer whether the difference in outcomes between conditions found in the sample would probably also have been found in the wider population.

(That is, inferential statistics are commonly used to suggest 'we found a statistically significantly better outcome in one condition in our sample, so in the hypothetical situation that we had been able to include the entire population in our study, we would probably have found better mean outcomes in that same condition'.)
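For illustration only, the kind of inferential test the authors did not run might be sketched as a two-proportion z-test. The counts below are entirely hypothetical (not taken from the paper), chosen just to show the mechanics:

```python
from math import sqrt

def two_proportion_z(success_a, n_a, success_b, n_b):
    """z statistic for the difference between two independent proportions."""
    p_a = success_a / n_a  # proportion succeeding in condition A
    p_b = success_b / n_b  # proportion succeeding in condition B
    # pooled proportion under the null hypothesis of no real difference
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical counts: learners drawing a particulate model after teaching
z = two_proportion_z(success_a=20, n_a=30, success_b=11, n_b=30)
print(round(z, 2))  # |z| > 1.96 would be 'statistically significant' at the 5% level
```

Of course, a significant z would only license a generalisation to a wider population if the sample had been drawn from that population at random, which is exactly the point at issue here.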

This may be one reason why Mamombe and colleagues do not consider their study to be an experiment. The authors acknowledge limitations in their study (as there always are in any study) including that "the sample was limited to two schools and two science education specialists as instructors; the results should therefore not be generalized" (p.9).

Yet, of course, if the results cannot be generalised beyond these four classes in two schools, this undermines the usefulness of the study (and the grounds for the recommendations the authors make for teaching based on their findings in the specific research contexts).

If considered as an experiment, the study suffers from other inherent limitations (Taber, 2019). There were likely novelty effects, and even though there was no explicit hypothesis, it is clear that the authors expected enquiry to be a productive approach, so expectancy effects may have been operating.

Analytical framework

In an experiment it is important to have an objective means to measure outcomes, and this should be determined before data are collected. (Read about 'Analysis' in research studies.) In this study, methods used in previous published work were adopted, and the authors tell us that "A coding scheme was developed based on the findings of previous research…and used during the coding process in the current research" (p.6).

But they then go on to report,

"Learners' drawings during the pre-test and post-test, their notes during class activities and their responses during interviews were all analysed using the coding scheme developed. This study used a combination of deductive and inductive content analysis where new conceptions were allowed to emerge from the data in addition to the ones previously identified in the literature"

p.6

An emerging analytical frame is perfectly appropriate in 'discovery' research, where a pre-determined conceptualisation of how data are to be understood is not employed. However, in 'confirmatory' research, testing a specific idea, the analysis is operationalised prior to collecting data. The use of qualitative data does not preclude a hypothesis-testing, confirmatory study, as qualitative data can be analysed quantitatively (as is done in this study), but using codes that link back to the hypothesis being tested, rather than emergent codes. (Read about 'Approaches to qualitative data analysis'.)

Much of Mamombe and colleagues' description of their work aligns with an exploratory, discovery approach to enquiry, yet the gist of the study is to compare student representations against a model of correct/acceptable and alternative conceptions to test the relative effectiveness of two pedagogic treatments (i.e., an experiment). That is a 'nomothetic' approach that assumes standard categories of response.

Overall, the authors' account of how they collected and analysed data seems to suggest a hybrid approach, with elements of both a confirmatory approach (suitable for an experiment) and elements of a discovery approach (more suitable for case study). It might seem this is a kind of mixed-methods study with both confirmatory/nomothetic and discovery/idiographic aspects – responding to two different types of research question in the same study.

Yet there do not actually seem (**) to be two complementary strands to the research (one exploring the richness of student's ideas, the other comparing variables – i.e., type of teaching versus degree of learning), but rather an attempt to hybridise distinct approaches based on incongruent fundamental (paradigmatic) assumptions about research. (** Having explicit research questions stated in the paper could have clarified this issue for a reader.)

So, do we have a case study?

Mamombe and colleagues may have chosen to frame their study as a kind of case study because of the issues raised above in regard to considering it an experiment. However, it is hard to see how it qualifies as case study (even if the editor and peer reviewers of the EURASIA Journal of Mathematics, Science and Technology Education presumably felt this description was appropriate).

Mamombe and colleagues do use multiple data sources, which is a common feature of case study. However, in other ways the study does not meet the usual criteria for case study. (Read more about 'Case study'.)

For one thing, case study is naturalistic. The method is used to study a complex phenomenon (e.g., a teacher teaching a class) that is embedded in a wider context (e.g., a particular school, timetable, cultural context, etc.) such that it cannot be excised for clinical examination (e.g., moving the lesson to a university campus for easy observation) without changing it. Here, there was an intervention, imposed from the outside, with external agents acting as the class teachers.

Even more fundamentally – what is the 'case'?

A case has to have a recognisable ('natural') boundary, albeit one that has some permeability in relation to its context. A classroom, class, year group, teacher, school, school district, etcetera, can be the subject of a case study. Two different classes in one school, combined with two other classes from another school, does not seem to make a bounded case.

In case study, the case has to be defined (not so in this study); and it should be clear it is a naturally occurring unit (not so here); and the case report should provide 'thick description' (not provided here) of the case in its context. Mamombe and colleagues' study is simply not a case study as usually understood: not a "qualitative pre-test/post-test case study design" or any other kind of case study.

That kind of mislabelling does not in itself invalidate research – but it may indicate some confusion in the basic paradigmatic underpinnings of a study. That seems to be the case [sic] here, as suggested above.

Suitability of the comparison condition: lecturing

A final issue of note about the methodology in this study is the nature of one of the two conditions used as a pedagogic treatment. In a true experiment, this condition (against which the enquiry condition was contrasted) would be referred to as the control condition. In a quasi-experiment (where randomisation of participants to conditions is not carried out) this would usually be referred to as the comparison condition.

At one point Mamombe and colleagues refer to this pedagogic treatment as 'direct instruction' (p.3), although this term has become ambiguous as it has been shown to mean quite different things to different authors. This is also referred to in the paper as the lecture condition.

Is the comparison condition ethical?

Parental consent was given for students contributing data for analysis in the study, but parents would likely trust the professional judgement of the researchers to ensure their children were taught appropriately. Readers are informed that "the learners whose parents had not given consent also participated in all the activities together with the rest of the class" (p.3), so it seems some children in the lecture treatment were subjected to the inferior teaching approach despite this lack of consent, as they were studying "a prescribed topic in the syllabus of the learners" (p.3).

I have been very critical of a certain kind of 'rhetorical' research (Taber, 2019) report which

  • begins by extolling the virtues of some kind of active / learner-centred / progressive / constructivist pedagogy; explaining why it would be expected to provide effective teaching; and citing numerous studies that show its proven superiority across diverse teaching contexts;
  • then compares this with passive modes of learning, based on the teacher talking and giving students notes to copy, which is often characterised as 'traditional' but is said to be ineffective in supporting student learning;
  • then describes how the authors set up an experiment to test the (superior) pedagogy in some specific context, using as a comparison condition the very passive learning approach they have already criticised as being ineffective at supporting learning.

My argument is that such research is unethical

  • It is not genuine science as the researchers are not testing a genuine hypothesis, but rather looking to demonstrate something they are already convinced of (which does not mean they could not be wrong, but in research we are trying to develop new knowledge).
  • It is not a proper test of the effectiveness of the progressive pedagogy as it is being compared against a teaching approach the authors have already established is sub-standard.

Most critically, young people are subjected to teaching that the researchers already believe they know will disadvantage them, just for the sake of their 'research', to generate data for reporting in a research journal. Sadly, such rhetorical studies are still often accepted for publication despite their methodological weaknesses and ethical flaws.

I am not suggesting that Mamombe, Mathabathe and Gaigher have carried out such a rhetorical study (i.e., one that poses a pseudo-question where from the outset only one outcome is considered feasible). They do not make strong criticisms of the lecturing approach, and even note that it produces some learning in their study:

"Similar to the inquiry group, the drawings of the learners were also clearer and easier to classify after teaching"

"although the inquiry method was more effective than the lecture method in eliciting improved particulate conception and reducing continuous conception, there was also improvement in the lecture group"

p.9, p.10

I have no experience of the South African education context, so I do not know what is typical pedagogy in primary schools there, nor the range of teaching approaches that grade 4 students there might normally experience (in the absence of external interventions such as reported in this study).

It is for the "two experienced teachers who are university lecturers and well experienced in teacher education" (p.3) to have judged whether a lecture approach based on teacher telling, children making notes and copying drawings, but with no student activities, can be considered an effective way of teaching 8-9 year old children a highly counter-intuitive, abstract, science topic. If they consider this good teaching practice (i.e., if it is the kind of approach they would recommend in their teacher education roles) then it is quite reasonable for them to have employed this comparison condition.

However, if these experienced teachers and teacher educators, and the researchers designing the study, considered that this was poor pedagogy, then there is a real question for them to address as to why they thought it was appropriate to implement it, rather than compare the enquiry condition with an alternative teaching approach that they would have expected to be effective.

Sources cited:

* Material reproduced from Mamombe, Mathabathe & Gaigher, 2020 is © 2020 licensee Modestum Ltd., UK. That article is an open access article distributed under the terms and conditions of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) [This post, excepting that material, is © 2020, Keith S. Taber.]

An introduction to research in education:

Taber, K. S. (2013). Classroom-based Research and Evidence-based Practice: An introduction (2nd ed.). London: Sage.

Single bonds are different to covalent bonds

Single bonds are different to covalent bonds or ionic bonds

Keith S. Taber

Annie was a participant in the Understanding Chemical Bonding project. She was interviewed near the start of her college 'A level' course (equivalent to Y12 of the English school system). Annie was shown, and asked about, a sequence of images representing atoms, molecules and other sub-microscopic structures of the kinds commonly used in chemistry teaching. She was shown a representation of the resonance between three canonical forms of BF3, sometimes used as a way of reflecting polar bonding. She had just seen another image representing resonance in the ethanoate ion, and had suggested that it contained a double bond. She had earlier in the interview referred to covalent bonding and ionic bonding, and, after introducing the idea of the double bond, suggested that a double bond is different to a covalent bond.

Focal figure (14) presented to Annie

What about diagram 14?…

Oh.

(pause, c.13s)

Seems to be different arrangements. Of the three, or two elements.

Uh hm.

(pause, c.3s)

Which are joined by single bonds.

What, where, what single, what sorry are joined by single bonds?

All the F to the B to the F. Are single bonds they are not double like before. [i.e., a figure discussed earlier in the interview]

So are they covalent bonds? Or ionic bonds, or? Or are single bonds something different again?

Single bonds are different.

This reflected her earlier comment to the effect that a double bond is different to a covalent bond, suggesting that she did not appreciate how covalent bonds are considered to be singular or multiple.

However, as I checked what she was telling me, Annie's account seemed to shift.

They're different to double bonds?

Yeah.

And are they different to covalent bonds?

No 'cause you probably get covalent bonds which are single bonds.

So single bonds, just moments before said to be different to covalent bonds, were now 'probably' capable of being covalent. As she continued to answer questions, Annie decided these were 'probably' just alternative terms.

So covalent bonds and single bonds, is that another word for the same thing?

Yeah, probably. But they can probably occur in different, things like in organic you talk about single bonds more than you talk about covalent, and then like in inorganic you talk about covalent bond, more than you talk about single bonding or double bonding.

So you think that maybe inorganic things, like sort of, >> copper iodide or something like that, that would tend to be more concerned with covalent bonds?

< Yeah. < Yeah.

But if you were doing organic things like, I don't know, erm, ethane, >> that's more likely to have single bonds in.

< Yeah. < Yeah.

So single bonds are more likely to occur in carbon compounds.

Yeah.

And covalent bonds are more likely to occur in some other type of compound?

Yeah. Sort of you've got different terminology, like you could probably use single bonds to refer to something in inorganic, but when you are talking about the structures and that, it's easier to talk about single bonds and double bonds, rather than saying that's got a covalent bond or that's got an ionic bond.

Annie's explanation did not seem to be a fully thought-out position. It was not consistent with the way she had earlier reported there being five covalent bonds and one double bond in an ethanoate ion.

It seems likely that in the context of the research interview, where being asked directly about these points, Annie was forced to make explicit the reasons she tended to label particular bonds in specific ways. The interview questions may have acted like Socratic questioning, a kind of scaffolding, leading to new insights. Only in this context did she realise that the single and double bonds her organic chemistry lecturer talked about might actually be referring to the same entities as the covalent bonds her inorganic chemistry lecturer talked about.

It would probably not have occurred to Annie's lecturers (of whom I was one) that she would not realise that single and double bonds were covalent bonds. It may well be that even if she had been taught by the same lecturer in both areas, the tendency to refer to single and multiple bonds in organic compounds (where most bonds were primarily covalent) and to focus on the covalent-ionic distinction in inorganic compounds (where the degree of polarity in bonds was a main theme of teaching) would still have led to the same confusion. Later in the interview, Annie commented that:

if I use ionic or covalent I'm talking about, sort of like a general, bond, but if I use double or single bonds, that's mainly organic, because sort of it represents, sort of the sharing, 'cause like you draw all the molecules out more.

This might be considered an example of a fragmentation learning impediment, where a student does not make a link that the teacher is likely to assume is obvious.

How plants get their food to grow and make energy

Respiration produces energy, but photosynthesis produces glucose which produces energy

Keith S. Taber

Image by Frauke Riether from Pixabay 

Mandy was a participant in the Understanding Science Project. When I spoke to her in Y10 (i.e., when she was c.14 years old) she told me that photosynthesis was one of the topics she was studying in science. So I asked her about photosynthesis:

So, photosynthesis. If I knew nothing at all about photosynthesis, how would you explain that to me?

It's how plants get their food to grow and – stuff, and make energy

So how do they make their energy, then?

Well, they make glucose, which has energy in it.

How does the energy get in the glucose?

Erm, I don't know.

It's just there is it?

Yeah, it's just stored energy

I was particularly interested to see if Mandy understood about the role of photosynthesis in plant nutrition and energy metabolism.

Why do you think it is called photosynthesis, because that's a kind of complicated name?

Isn't photo, something to do with light, and they use light to – get the energy.

So how do they do that then?

In the plant they've got chlorophyll which absorbs the light, hm, that sort of thing.

What does it do once it absorbs the light?

Erm.

Does that mean it shines brightly?

No, I , erm – I don't know

Mandy explained that the chlorophyll was in the cells, especially in the plant's leaves. But I was not very clear on whether she had a good understanding of photosynthesis in terms of energy.

Do you make your food?

Not the way plants do.

So where does the energy come from in your food then?

It's stored energy.

How did it get in to the food? How was it stored there?

Erm.

[c. 2s pause]

I don't know.

At this point it seemed Mandy was not connecting the energy 'in' food either directly or indirectly with photosynthesis.

Okay. What kind of thing do you like to eat?

Erm, pasta.

Do you think there is any energy value in pasta? Any energy stored in the pasta?

Has lots of carbohydrates, which is energy.

So do you think there is energy within the carbohydrate then?

Yeah.

Stored energy.

Yeah.

So how do you think that got there, who stored it?

(laughs) I don't know.

Again, the impression was that Mandy was not linking the energy value of food with photosynthesis. The reference to carbohydrates being energy seemed (given the wider context of the interview) to be imprecise use of language, rather than a genuine alternative conception.

So do you go to like the Co-op and buy a packet of pasta. Or mum does I expect?

Yeah.

Yeah. So do you think, sort of, the Co-op are sort of putting energy in the other end, before they send it down to the shop?

No, it comes from 'cause pasta's made from like flour, and that comes from wheat, and then that uses photosynthesis.

Now it seemed that it was quite clear to Mandy that photosynthesis was responsible for the energy stored in the pasta. It was not clear why she had not suggested this before, but it seemed she could make the connection between the food people eat and photosynthesis. Perhaps (it seems quite likely) she had previously been aware of this and it initially did not 'come to mind', and then at some point during this sequence of questions there was a 'bringing to mind' of the link. Alternatively, it may have been a new insight reached when challenged to respond to the interview questions.

So you don't need to photosynthesise to get energy?

No.

No, how do you get your energy then?

We respire.

Is that different then?

Yeah.

So what's respire then, what do you do when you respire?

We use oxygen to, and glucose to release energy.

Do plants respire?

Yes.

So when do you respire, when you are going to go for a run or something, is that when you respire, when you need the energy?

No, you are respiring all the time.

Mandy suggested that plants mainly respire at night because they are photosynthesising during the day. (Read 'Plants mainly respire at night'.)

So is there any relationship do you think between photosynthesis and respiration?

Erm respiration uses oxygen – and glucose and it produces er carbon dioxide and water, whereas photosynthesis uses carbon dioxide and water, and produces oxygen and glucose.

So it's quite a, quite a strong relationship then?

Yeah.

Yeah, and did you say that energy was involved in that somewhere?

Yeah, in respiration, they produce energy.

What about in photosynthesis, does that produce energy?

That produces glucose, which produces the energy.

I see, so there is no energy involved in the photosynthesis equation, but there is in the glucose?

Yeah.

Respiration does not 'produce' energy, of course; but if it did, the question about whether photosynthesis also produced energy might have been expected to elicit a response about photosynthesis 'using' energy, or something similar, to give the kind of symmetry that would be consistent with conservation of energy (a process and its reverse cannot both 'produce' energy). 'Produce' energy might have meant 'release' energy, in which case it might be expected that the reverse process should 'capture' or 'store' it.

Mandy appreciated the relationship between photosynthesis and respiration in terms of substances, but had an asymmetric notion of how energy was involved.

Mandy appeared to be having difficulty appreciating the symmetrical relationship between photosynthesis and respiration because she was not clear how energy was transformed in the two processes. Although she seemed to have the components of the scientific narrative, she did not seem to fully appreciate how the absorption of light was in effect 'capturing' energy that could be 'stored' in glucose till needed. At this stage in her learning she seemed to have grasped quite a lot of the relevant ideas, but not quite integrated them all coherently.
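Mandy's word equations correspond to the canonical balanced equations (my addition here; these symbolic forms were not used in the interview), which make the symmetry of the two processes, and where energy enters each, explicit:

```latex
% Photosynthesis: light energy is absorbed and 'stored' in glucose
6\,\mathrm{CO_2} + 6\,\mathrm{H_2O}
  \xrightarrow{\;\text{light energy absorbed}\;}
  \mathrm{C_6H_{12}O_6} + 6\,\mathrm{O_2}

% Aerobic respiration: the reverse process, with energy released
\mathrm{C_6H_{12}O_6} + 6\,\mathrm{O_2}
  \longrightarrow
  6\,\mathrm{CO_2} + 6\,\mathrm{H_2O} \quad (\text{energy released})
```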

Higher resistance means less current for the same voltage – but how does that relate to the formula?

Image by Gerd Altmann from Pixabay 

The higher resistance is when there is less current flowing around the circuit when you have the same voltage – but how does that relate to the formula?

Adrian was a participant in the Understanding Science Project. When I interviewed him in Y12 when he was studying Advanced level physics he told me that "We have looked at resistance and conductance and the formulas that go with them" and told me that "Resistance is current over, voltage, I think" although he did not think he could remember formulae. He thought that an ohm was the unit that resistance is measured in, which he suggested "comes from ohm's law which is the…formula that gives you resistance".

Two alternative conceptions

There were two apparent alternative conceptions there. One was that 'Resistance is current over voltage', but as Adrian believed that he was not good at remembering formulae, this would be a conception to which he did not have a high level of commitment. Indeed, on another occasion perhaps he would have offered a different relationship between R, I, and V. I felt that if Adrian had a decent feel for the concepts of electrical resistance, current and voltage then he should be able to appreciate that 'resistance is current over voltage' did not reflect the correct relationship. Adrian was not confident about formulae, but with some suitable leading questioning he might be able to think this through. I describe my attempts to offer this 'scaffolding' below.

The other alternative conception was to conflate two things that were conceptually different: the defining equation for resistance (that R=V/I, by definition so must be true) and Ohm's law that suggests for certain materials under certain conditions, V/I will be found to be constant (that is an empirical relationship that is only true in certain cases). (This is discussed in another post: When is V=IR the formula for Ohm’s law?)
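The distinction can be put symbolically (my gloss; this notation was not part of the interview):

```latex
% Defining equation for resistance: holds for any component, by definition
R \;\equiv\; \frac{V}{I}

% Ohm's law: an empirical claim, true only for 'ohmic' conductors
% (e.g., a metal at constant temperature): V/I stays constant as V varies
\frac{V}{I} = \text{constant}, \quad \text{i.e.}\; V \propto I
```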

So, I then proceeded to ask Adrian how he would explain resistance to a younger person, and he suggested that resistance is how much something is being slowed down or is stopped going round. After we had talked about that for a while, I brought the discussion back to the formula and the relationship between R, V and I.

Linking qualitative understanding of relating concepts and the mathematical formula

As Adrian considered resistance as slowing down or stopping current I thought he might be able to rationalise how a higher resistance would lead to less current for a particular potential difference ('voltage').

Okay. Let’s say we had, erm, two circuits, and they both have resistance and you wanted to get one amp of current to flow through the circuits, and you had a variable power supply.

Okay.

And the first circuit in order to get one (amp) of current to flow through the circuit.

Yes.

You have to adjust the power supply, until you had 10 volts.

Okay.

So it took 10 volts to get one amp to flow through the circuit. And the second (unclear) the circuit, when you got up to 10 volts, (there is) still a lot less than one amp flowing. You can turn it up to 25 volts, and only when it got to 25 volts did you get one amp to flow through the circuit.

Yes, okay.

In mathematical terms, the resistance of the first circuit is (R = V/I = 10/1 =) 10Ω, and the second is (25/1 =) 25Ω, so the second – the one that requires greater potential difference to drive the same current, has more resistance.

Do you think those two circuits would have resistance?

Erm, (pause, three seconds) Probably yeah.

This was not very convincing, as it should have been clear that as an infinite current was not produced there must be some resistance. However, I continued:

Same resistance?

No because they are not the same circuit, but – it would depend what components you had in your circuit, if you had different resistors in your circuit.

Yeah, I've got different resistors in these two circuits.

Then yes each would have a different resistance.

Can you tell me which one had the bigger resistance? Or can’t you tell me?

No, I can’t do that.

You can’t do it?

No I don’t think so. No.

Adrian's first response, that the circuits would 'probably' have resistance, seemed a little lacking in conviction. His subsequent responses suggested that although he knew there was a formula he did not seem to recognise that if different p.d.s were required to give the same current, this must suggest there was different resistance. Rather he argued from a common sense position that different circuits would be likely to have different components which would lead to them having different resistances. This was a weaker argument, as in principle two different circuits could have the same resistance.

We might say Adrian was applying a reasonable heuristic principle, a rule of thumb to use when definite information was not available: if two circuits have different components, then they likely have different resistances. But this was not a definitive argument. Here, then, Adrian seemed to be applying general practical knowledge of circuits, but he was not displaying a qualitative feel for what resistance in a circuit was about in terms of p.d. and current.

I shifted my approach from discussing different voltages needed to produce the same current, to asking about circuits where the same potential difference would lead to different current flowing:

Okay, let me, let me think of doing it a different way. For the same two circuits, erm, but you got one let's say for example it’s got 10 volts across it to get an amp to flow.

Yeah. So yes okay so the power supply is 10 volts.

Yeah. And the other one also set on 10 volts,

Okay.

but we don’t get an amp flow, we only get about point 4 [0.4] of an amp, something like that, to flow.

Yeah, yeah.

Any idea which has got the high resistance now?

The second would have the higher resistance.

Why do you say that?

Because less erm – There’s less current amps flowing around the circuit erm when you have the same voltage being put into each circuit.

Okay?

Yes.

This time Adrian adopted the kind of logic one would hope a physics student would apply. It was possible that this outcome was less about the different format of the two questions, and simply that Adrian had had time to adjust to thinking about how resistance might be linked to current and voltage. [It is also possible too much information was packed close together in the first attempt, challenging Adrian's working memory capacity, whereas the second attempt fed the information in a way Adrian could better manage.]

You seem pretty sure about that, does that make sense to you?

Yes, it makes sense when you put it like that.

Right, but when I had it the other way, the same current through both, and one required 10 volts and one required 25 volts to get the same current.

Yes.

You did not seem to be too convinced about that way of looking at it.

No. I suppose I have just thought about it more.

Having made progress with the fixed p.d. example, I set Adrian another with constant current:

Yes. So if I get you a different example like that then…let’s say we have two different circuits and they both had a tenth of an amp flowing,

Okay. Yes.

and one of them had 1.5 volt power supply

Okay yes.

and the other one had a two volt power supply

Yeah.

but they have both got point one [0.1] of an amp flowing. Which one has got the high resistance?

Currents the same, I would say they have got different voltages, yeah, so erm (pause, c.6s) probably the (pause, c.2s) the second one. Yeah.

Because?

Because there is more voltage being put in, if you like, to the circuit, and you are getting less current flowing in and therefore resistance must be more to stop the rest of that.

Yes?

I think so, yes.

Does that make sense to you?

Yeah.

So this time, having successfully thought through a constant p.d. example, Adrian successfully worked out that a circuit that needed more p.d. to drive a certain level of current had greater resistance (here 2.0/0.1 = 20Ω) than one that needed a smaller p.d. (i.e. 1.5/0.1 = 15Ω). However, his language revealed a lack of fluency in using the concepts of electricity. He referred to voltage being "put in" to the circuits rather than across them. Perhaps more significantly, he referred to there being "less current flowing in" when there was the same current in both hypothetical circuits. It would have been more appropriate to think of there being proportionally less current. He also referred to the greater resistance stopping "the rest" of the current, which seemed to reflect his earlier suggestion that resistance is how much something is being slowed down or is stopped going round.
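The arithmetic behind these thought experiments is just Ohm's law rearranged; as a minimal sketch (the helper function is mine, purely for illustration):

```python
def resistance(voltage, current):
    """Ohm's law rearranged: R = V / I (ohms, given volts and amps)."""
    return voltage / current

# The constant-current thought experiment: 0.1 A in both circuits.
r_first = resistance(1.5, 0.1)   # 15 ohms
r_second = resistance(2.0, 0.1)  # 20 ohms

# More p.d. needed for the same current means more resistance.
assert r_second > r_first
```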

My purpose in offering Adrian hypothetical examples, each a little 'thought experiment', was to see if they allowed him to reconstruct the formula he could not confidently recall. As he had now established that

greater p.d. is needed when resistance is higher (for a fixed current)

and that

less current flows when resistance is higher (for a fixed p.d.)

he might (perhaps should) have been able to recognise that his suggestion that "resistance is current over, voltage" was inconsistent with these relationships.
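One way to see the inconsistency is to test each candidate formula against the two qualitative relationships. A sketch (hypothetical, not something that was put to Adrian; the labels are mine):

```python
def recalled(v, i):
    """Adrian's suggestion: "resistance is current over voltage"."""
    return i / v

def accepted(v, i):
    """The accepted form: R = V / I."""
    return v / i

def fits_thought_experiments(formula):
    # (1) Same current (1 A): the circuit needing 25 V should show more
    # resistance than the one needing 10 V.
    fixed_current = formula(25, 1) > formula(10, 1)
    # (2) Same p.d. (10 V): the circuit passing 0.4 A should show more
    # resistance than the one passing 1 A.
    fixed_pd = formula(10, 0.4) > formula(10, 1)
    return fixed_current and fixed_pd

assert not fits_thought_experiments(recalled)
assert fits_thought_experiments(accepted)
```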

Okay and how does that relate to the formula you were just telling me before?

Erm, No idea.

No idea?

Erm (pause, c.2s) once you know the resistance of a circuit you can work out, or once you know any of the, two of the components you can work out, the other one, so.

Yeah, providing you know the equation, when you know which way round the equation is.

Yes providing you can remember the equation.

So can you relate the equation to the explanations you have just given me about which would have the higher resistance?

So if something has got a higher resistance, so (pause, c.2s) so the current flowing round it would be – the resistance times the voltage (pause, c.2s) Is that right? No?

Erm, so the current is resistance times voltage? Are you sure?

No.

So Adrian suggested the formula was "the current flowing round it would be the resistance times the voltage", i.e., I = R × V (rather than I = V/R), which did not reflect the qualitative relationships he had been telling me about. I had one more attempt at leading him through the logic that might have allowed him to deduce the general form of the formula.

Go back to thinking in terms of resistance.

Okay.

So you reckoned you can work out the resistance in terms of the current and the voltage?

Yes, I think.

Okay, now if we keep, if we keep the voltage the same and we get different currents,

Yes.

Which has, Which has got the higher resistance, the one with more current or the one with less current?

Erm (Pause, c.6s) So, so, if they keep the same voltage.

That’s the way we liked it the first time so.

Okay.

Let’s say we have got the same voltage across two circuits.

Yes.

Different amounts of current.

Yes.

Which one’s got the higher resistance? The one with more current or the one with less current?

The one with less current.

So less current means it must be more resistance?

Yes.

Ok, so if we had to have an equation R=.

Yes.

What’s it going to be, do you think?

Erm 

(pause, c.7s)

R=

(pause, c.3s)

I don’t know. It's too hard.

Whether it really was too hard for Adrian, or simply something he lacked the confidence to do, or something he found too difficult when put 'on the spot' in an interview, is difficult to say. However, it seems fair to suggest that the kind of shift between qualitative relationships and algebraic representation – a shift that is ubiquitous in studying physics at this level – did not come readily to this advanced level physics student.

I had expected my use of leading (Socratic) questioning would provide a 'scaffold' to help Adrian appreciate he had misremembered "resistance is current over, voltage, I think", and was somewhat disappointed that I had failed.



'In my head, son' – mind reading commentators

Keith S. Taber

*

"Tim Howard is a little frustrated with himself that it wasn't a tidier save, because he feels he ought to have done better with the first attempt."

Thus claimed the commentator on the television highlights programme Match of the Day (BBC) commenting on the association football (soccer) match Everton vs. Spurs on May 25th 2015.  

It was not a claim that was obviously contradicted by the footage being shown, but inevitably my reaction (as someone who teaches research methods to students) was 'how do you know?' The goalkeeper was busy playing a game of football, some distance from the commentator, and there was no obvious conversation between them. The answer of course is that the commentator was a mind reader who knew what someone else was thinking and feeling.

This is not so strange, as we are all mind readers – or at least we commonly make statements about the thoughts, attitudes, feelings, beliefs, etc. of others, based on their past or present behaviour, subtle body language, facial expressions and/or the context of their current predicament.

Of course, that is not strictly mind reading, as minds are not visible. But part of normal human development is acquiring a 'theory of mind' that allows us to draw inferences about the thoughts and feelings of others – the internal subjective experiences of others – drawing upon our own feelings and thoughts as a model. In everyday life, this ability is essential to normal social functioning – even if we do not always get it right. Yet we become so used to relying upon these skills that public commentators (well, a sports commentator here) feel no discomfort in interpreting not only the play, but also the feelings and thoughts of the players they are observing.

A large part of the kind of educational research that I tend to be involved in is very similar to this – it involves using available evidence to make inferences about what others think and feel. [There are many examples in the blog posts on this site.]  Sometimes we have very strong evidence (what people tell us about their thoughts and feelings) but even then this is indirect evidence – we can never actually see another mind at work (1). We do not "see the cogs moving", even if we may like to talk as though we do.

In everyday life we forgive the kinds of under-determined claims made by sports commentators, and may not even notice when they draw such inferences, let alone question what support their claims have. Sadly, this seems to be a human quality that we often take a little too much for granted. A great deal of the research literature in science education is written as though research offers definite results about students' conceptions (and misconceptions) and whether or not they know or understand something – as though such matters are simple, binary, and readily detected (1). Yet research actually suggests this is far from the case (2).

Research that explores students' thinking and learning is actually very challenging, and is in effect an enterprise to build and test models rather than to uncover simple truths. I suspect quite a bit of the disagreement about the nature of student thinking in the science education research literature is down to researchers forgetting that even if people are mind readers in everyday life, they must become careful and self-critical model builders when they seek to make claims presented as research (1).

References:

(1) Taber, K. S. (2013). Modelling Learners and Learning in Science Education: Developing representations of concepts, conceptual structure and conceptual change to inform teaching and research. Dordrecht: Springer.

(2) Taber, K. S. (2014). Student Thinking and Learning in Science: Perspectives on the nature and development of learners' ideas. New York: Routledge.

* Previously published at http://people.ds.cam.ac.uk/kst24/science-education-research: 25th May 2015

Nothing random about a proper scientific evaluation?

Keith S. Taber

Image by annca from Pixabay 

I heard about an experiment comparing home-based working with office-based working on the radio today (BBC Radio 4 – Positive Thinking: Curing Our Productivity Problem, https://www.bbc.co.uk/sounds/play/m000kgsb). This was a randomised controlled trial (RCT). An RCT is, it was explained, "a proper scientific evaluation". The RCT is indeed widely considered the most rigorous way of testing an idea in the social sciences (see Experimental research into teaching innovations: responding to methodological and ethical challenges).

Randomisation in RCTs

As the name suggests, a key element of a RCT is randomisation. This can occur at two levels. Firstly, research often involves selecting a sample from a larger population, and ideally one selects the sample at random from the population (so every member of the wider population has exactly the same chance of being selected for the sample), so that it can be assumed that what is found with the sample is likely to reflect what would have occurred had the entire population been participating in the experiment. This can be very difficult to organise.

More critically though, it is most important that the people in the sample each have an equal chance of being assigned to each of the conditions. So, in the simplest case there will be two conditions (e.g., here working at home most workdays vs. working in the office each workday) and each person will be assigned in such a way that they have just as much chance of being in one condition as anyone else. We do not put the cleverest, most industrious, the tallest, the fastest, the funniest, etcetera, in one group – rather, we randomise.

If we are randomising, there should be no systematic difference between the people in each condition. That is, we should not be able to use any kind of algorithm to predict who will be in each condition because assignments are made randomly – in effect, according to 'chance'. So, if we examine the composition of the two groups, there is unlikely to be any systematic pattern that distinguishes the two groups.

Two groups – with elements not selected at random (Image by hjrivas from Pixabay)

Now some scientists might suspect that nothing happens by chance – that if we could know the precise position and momentum of every particle in the universe (contra Heisenberg) … perhaps even that probabilistic effects found in quantum mechanics follow patterns due to hidden variables we have not yet uncovered…

How can we randomise?

Even if that is not so, it is clear that many of the ways we use to randomise may be deterministic at some level (when we throw a die, how it lands depends upon physical factors that could in principle, even if not easily in practice, be controlled) but that does not matter if that level is far enough from human comprehension or manipulation. We seek, at least, a quasi-randomisation (we throw dice; we mix up numbered balls in a bag, and then remove them one at a time 'blind'; we flip a coin for each name as we go down a list, until we have a full group for one condition; we consult a table of 'random' numbers; whatever), that is in effect random in the sense that the researchers could never know in advance who would end up in each condition.

When I was a journal editor it became clear to me that claims of randomisation reported in submitted research reports are often actually false, even if inadvertently so (see: Non-random thoughts about research). A common 'give away' here is when you ask the authors of a report how they carried out the randomisation. If they are completely at a loss to answer, beyond repeating 'we chose randomly', then it is quite likely not truly random.

To randomise, one needs to adopt a technique: if one has not adopted a randomisation technique, then one used a non-random method of assignment. Asking the more confident, more willing, more experienced, less conservative, etc., teacher to teach the innovation condition is not random. For that matter, asking the first teacher one meets in the staffroom is arbitrary and not really random, even if it may feel as if it is.
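For contrast, an actual randomisation technique takes only a few lines; a minimal sketch in Python (the function and condition names are illustrative, not from the study discussed below):

```python
import random

def randomise_to_conditions(participants, seed=None):
    """Randomly split participants into two equal-sized (plus or minus one) conditions."""
    rng = random.Random(seed)
    pool = list(participants)
    rng.shuffle(pool)                # every ordering is equally likely
    half = len(pool) // 2
    return pool[:half], pool[half:]  # e.g. (home-working, office-working)

# No one can predict in advance which condition any participant lands in
# (the seed here is only to make the example reproducible).
home, office = randomise_to_conditions(range(100), seed=42)
assert len(home) == 50 and len(office) == 50
```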

…they were randomised, by even and odd birthdates…

The study I was hearing about on the radio was the work of Stanford Professor Nick Bloom, who explained how the 'randomisation' occurred:

"…for those volunteers, they were randomised, by even and odd birth dates, so anyone with an even birth date, if you were born on like the 2nd, the 4th, the 6th, the 8th, etcetera, of the month, you get to work at home for four out of five days a week, for the next nine months, and if you are odd like, I'm the 5th May, you had to stay in the office for the next nine months…"

Professor Nick Bloom interviewed on Positive Thinking: Curing Our Productivity Problem
Image by Jeevan Singla from Pixabay 

So, by my definition, that is not randomisation at all – it is totally deterministic. I would necessarily have been in the working-at-home condition, with zero possibility of being in the office-working condition. If this had been random there would have been a 50:50 chance of Prof. Bloom and myself being assigned to the same condition – but with the non-random, systematic assignment used it was certain that we would have ended up in different conditions. So, this was an RCT without randomisation, but rather with a completely systematic assignment to conditions.

This raises some questions.

  • Is it likely that a professor of economics does not understand randomisation?
  • Does it really matter?

Interestingly, I see from Prof. Bloom's website that one "area of [his] research is on the causes and consequences of uncertainty", so I suspect he actually understands randomisation very well. Presumably, Prof. Bloom knows that strictly there was no randomisation in this experiment, but is confident that it does not matter here.

Had Prof. Bloom assigned the volunteers to conditions depending on whether they were born before or after midnight on the 31st December 1989, this clearly would have introduced a major confounding variable. Had he assigned the volunteers according to those born in March to August to one condition and those born in September to February to the other, say, this might have been considered to undermine the research as it is quite conceivable that the time of year people were gestated, and born, and had to survive the first months of life, might well be a factor that makes a difference to work effectiveness later.

Even if we had no strong evidence to believe this would be so, any systematic difference where we might conjecture some mechanism that could have an effect has to be considered a potential confound that undermines confidence in the results of an RCT. Any difference found could be due to something other (e.g., greater thriving of Summer babies) than the intended difference in conditions; any failure to find an effect might mean that a real effect (e.g., home working being more efficient than office working) is being masked by the confounding variable (e.g., season of birth).

It does not seem conceivable that even and odd birth dates could have any effect (and this assignment is much easier to organise than actually going through the process of randomisation when dealing with a large number of study participants). So, in practice, it probably does not matter here. It seems very unlikely this could undermine Prof. Bloom's conclusions. Yet, in principle, we randomise in part because we are not sure which variables will, or will not, be relevant, and so we seek to avoid any systematic basis for assigning participants to conditions. And given the liberties I've seen some other researchers take when they think they are making random choices, my instinct is to want to see an RCT where there is actual randomisation.

Covalent bonding is sharing electrons

It's covalent bonding where the electrons are shared to create a full outer shell

Keith S. Taber

Brian was a participant in the Understanding Chemical Bonding project. He was interviewed during the first year of his college 'A level' course (equivalent to Y12 of the English school system). Brian was shown, and asked about, a sequence of images representing atoms, molecules and other sub-microscopic structures of the kinds commonly used in chemistry teaching. He was shown a simple representation of a covalent molecule:

Focal figure ('2') presented to Brian

Any idea what that's meant to be, number 2?

Hydrogen molecule.

Why, how do you recognise that as being a hydrogen molecule?

Because there's two atoms with one electron in each shell.

Uh hm. Er, what, what's going on here, in this region here, where these lines seem to meet?

Bonding.

That's bonding. So there's some sort of bonding there is there?

Yeah.

Can you tell me anything about that bonding?

It's covalent bonding.

So, so what's covalent bonding, then?

The electrons are shared to create a full outer shell.

Okay, so that's an example of covalent bonding, so can you tell me how many bonds there are there?

One.

There's one covalent bond?

Yeah.

Right, what exactly is a covalent bond?

It's where electrons are shared, almost, roughly equally, between the two atoms.

So that's what we'd call a covalent bond?

Yeah.

So according to Brian, covalent bonding is where "the electrons are shared to create a full outer shell". The idea that a covalent bond is the sharing of electrons to allow atoms to obtain full electron shells is a very common way of discussing covalent bonding, drawing upon the full shells explanatory principle, where a 'need' for completing electron shells is seen as the impetus for bonding, reactions, ion formation etc. This principle is the basis of a common alternative conceptual framework, the octet rule framework.

For some students, such ideas are the extent of their ways of discussing bonding phenomena. However, despite Brian defining the covalent bond in this way, continued questioning revealed that he was able to think about the bond in terms of physical interactions:

Okay. And why do they, why do these two atoms stay stuck together like that? Why don't they just pull apart?

Because of the bond.

So how does the bond do that?

(Pause, c.13s)

Is it by electrostatic forces?

Is it – so how do you think that works then?

I'm not sure.

The long pause suggests that Brian did not have a ready-formed response to such a question. It seems here that 'electrostatic forces' was little more than a guess, if perhaps an informed guess, because charges and forces had featured in chemistry. A pause of about 13 seconds is quite a lacuna in a conversation. In a classroom context teachers are advised to give students thinking time rather than expecting (or accepting) immediate responses. Yet, in many classrooms, 13 seconds of 'dead air' (to borrow a phrase from broadcasting) from the teacher might be taken as an invitation to retune attention to another station.

Even in an interview situation the interviewer's instinct may be to move on to another question, but in situations where a researcher is confident that waiting is not stressful to the participant, it is sometimes productive to give thinking time.

Another issue relating to interviewing is the use of 'leading questions'. Teachers as interviewers sometimes slip between researcher and teacher roles, and may be tempted to teach rather than explore thinking.

Yet, the very act of interviewing is an intervention in the learners' thinking, in that whatever an interviewee tells us is in the context of the conversation set up by the interviewer, and the participant may offer ideas they would not have had without that particular context. In any case, learning is not generally a one-off event, as school learning relies on physiological processes that continue long after the initial teaching event to consolidate learning, and this is supported by 'revision'. Each time a memory is reactivated it is strengthened (and potentially changed).

So the research interview is a learning experience no matter how careful the researcher is. Therefore the idea of leading questions is much more nuanced than a binary distinction between those questions which are leading and those that are not. So rather than completely avoiding leading questions, the researcher should (a) use open-ended questions initially, to best understand the ideas the learner most easily brings to mind; and (b) be aware of the degree of 'scaffolding' that Socratic questioning can contribute to the construction of a learner's answer. [Read about the idea of scaffolding learning here.] The interview continued:

Can you see anything there that would give rise to electrostatic forces?

The electrons.

Right so the electrons, they're charged are they?

Yeah. Negatively.

Negatively charged – anything else?

(Pause, c.8s)

The protons in the nucleus are positively charged.

Uh hm. And so would that give rise to any electronic interactions?

Yeah.

So where would there be, sort of any kind of, any kind of force involved here is there?

By the bond.

So where would there be force, can you show me where there would be force?

By the, in the bond, down here.

So the force is localised in there, is it?

The erm, protons would be repelling each other, they'd be attracted by the electrons, so they're keep them at a set distance.

It seemed that Brian could discuss the bond as due to electrical interactions, although his initial ('instinctive') response was to explain the bond in terms of electrons shared to fill electron shells. Although the researcher channelled Brian to think about the potential source of any electrical interactions, this was only after Brian had himself conjectured the role of 'electrostatic forces.'

Often students learn to 'explain' bonds as electron sharing in school science (although arguably this is a rather limited form of explanation), and this becomes a habitual way of talking and thinking by the time they progress to college level study.

Why write about Cronbach's alpha?

Keith S. Taber

What is Cronbach's alpha?

It is a statistic that is commonly quoted by researchers when reporting the use of scales and questionnaires.

Why carry out a study of the use of this statistic?

I am primarily a qualitative researcher, so do not usually use statistics in my own work. However, I regularly came across references to alpha in manuscripts I was asked to review for journals, and in manuscripts submitted to the journal I was editing myself (i.e., Chemistry Education Research and Practice).

I did not really understand what alpha was, or what it was supposed to demonstrate, or what value was desirable – which made it difficult to evaluate that aspect of a manuscript which was citing the statistic. So, I thought I had better find out more about it.

So, what is Cronbach's alpha?

It is a statistic that tests for internal consistency in scales. It should only be applied to a scale intended to measure a unidimensional factor – something it is assumed can be treated as a single underlying variable (perhaps 'confidence in physics learning', 'enjoyment of school science practicals', or 'attitude to genetic medicine').

If someone developed a set of questionnaire items intended to find out, say, how skeptical a person was regarding scientific claims in the news, and administered the items to a sample of people, then alpha would offer a measure of the similarity of the set of items in terms of the patterns of responses from that sample. As the items are meant to be measuring a single underlying factor, they should all elicit similar responses from any individual respondent. If they do, then alpha would approach 1 (its maximum value).
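For readers who like to see the calculation, alpha can be computed from the item variances and the variance of respondents' total scores; a minimal sketch (my own illustration, not any particular package's implementation):

```python
def cronbach_alpha(scores):
    """scores: one list per respondent, each holding that person's item scores.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(scores[0])

    def variance(values):
        # Sample variance (denominator n - 1).
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / (len(values) - 1)

    item_vars = [variance([resp[j] for resp in scores]) for j in range(k)]
    total_var = variance([sum(resp) for resp in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Five respondents answering a three-item scale fairly consistently:
# each person gives much the same response to every item, so alpha is
# close to its maximum of 1.
consistent = [[1, 1, 2], [4, 4, 5], [2, 2, 2], [5, 5, 4], [3, 3, 3]]
assert cronbach_alpha(consistent) > 0.9
```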

Does alpha not measure reliability?

Often, studies state that alpha is measuring reliability – as internal consistency is sometimes considered a kind of reliability. However, more often in research what we mean by reliability is that repeating the measurements later will give us (much) the same result – and alpha does not tell us about that kind of reliability.

I think there is a kind of metaphorical use of 'reliability' here. The technique derives from an approach used to test equivalence based on dividing the items in a scale into two subsets*, and seeing whether analysis of the two subsets gives comparable results – so one could see if the result from the 'second' measure reliably reproduced that from the 'first' (but of course the ordering of the two calculations is arbitrary, and the two subsets of items were actually administered at the same time as part of a single scale).

* In calculating alpha, all possible splits are taken into account.

Okay, so that's what alpha is – but, still, why carry out a study of the use of this statistic?

Once I understood what alpha was, I was able to see that many of the manuscripts I was reviewing did not seem to be using it appropriately. I got the impression that alpha was not well understood among researchers even though it was commonly used. I felt it would be useful to write a paper that both highlighted the issues and offered guidance on good practice in applying and reporting alpha.

In particular, studies would often cite alpha for broad constructs like 'understanding of chemistry', where it seems obvious that we would not expect understanding of pH, understanding of resonance in benzene, understanding of oxidation numbers, and understanding of the mass spectrometer to be the 'same' thing (or if they are, we could save a lot of time and effort by reducing exams to a single question!)

It was also common for studies using instruments with several different scales to not only quote alpha for each scale (which is appropriate), but to also give an overall alpha for the whole instrument even though it was intended to be multidimensional. So imagine a questionnaire which had a section on enjoyment of physics, another on self-confidence in genetics, and another on attitudes to science-fiction elements in popular television programmes: why would a researcher want to claim there was a high level of internal consistency across what are meant to be such distinct scales?

There was also incredible diversity in how different authors describe different values of alpha they might calculate – so the same value of alpha might be 'acceptable' in one study, 'fairly high' in another, and 'excellent' in a third (see figure 1).


Fig. 1 Qualitative descriptors used for values/ranges of values of Cronbach's alpha reported in papers in leading science education journals (The Use of Cronbach's Alpha When Developing and Reporting Research Instruments in Science Education)

Some authors also suggested that a high value of alpha for an instrument implied it was unidimensional – that all the items were measuring the same thing – which is not the case.

But isn't it the number that matters: we want alpha to be as high as possible, and at least 0.7?

Yes, and no. And no, and no.

But the number matters?

Yes of course, but it needs to be interpreted for a reader: not just 'alpha was 0.73'.

But the critical value is 0.7, is that right?

No.

It seems extremely common for authors to assume that they need alpha to reach, or exceed, 0.7 for their scale to be acceptable. But that value seems to be completely arbitrary (and was not what Cronbach was suggesting).

Well, it's a convention, just as p<0.05 is commonly taken as a critical value.

But it is not just like that. Alpha is very sensitive to how many items are included in a scale. If there are only a few items, then a value of, say, 0.6 might well be sensibly judged acceptable. In any case, it is nearly always possible to increase alpha by adding more items until you reach 0.7.
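That sensitivity to scale length can be seen in the standardised form of alpha, which depends only on the number of items, k, and the average inter-item correlation, r̄: alpha = kr̄/(1 + (k − 1)r̄). A quick sketch (the correlation value is illustrative):

```python
def standardised_alpha(k, mean_r):
    """Standardised Cronbach's alpha for k items with average
    inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)

# With the same modest inter-item correlation (0.3), a scale clears the
# conventional 0.7 threshold simply by containing more items:
for k in (3, 6, 10):
    print(k, round(standardised_alpha(k, 0.3), 2))
```

With r̄ = 0.3, three items give an alpha well below 0.7, while six items are already enough to exceed it.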

But only if the added items genuinely fit for the scale?

Sadly, no.

Adding a few items that are similar to each other, but not really fitting the scale, would usually increase alpha. So adding 'I like Manchester United', 'Manchester United are the best soccer team', and 'Manchester United are great' as items to be responded to in a scale about self-efficacy in science learning would likely increase alpha.

Are you sure: have you tried it?

Well, no. But, as I pointed out above, instruments often contain unrelated scales, and authors would sometimes calculate an overall alpha which they found to be greater than that of each of its component scales – at least, that would be the implication if it were assumed that a larger alpha means higher internal consistency, without factoring in how alpha tends to be larger the more items are included in the calculation.

But still, it is clear that the bigger alpha the better?

Up to a point.

But consider a scale with five items where everybody responds to each item in exactly the same way (not, that is, that different people respond in the same way as each other; just that whatever response a person gives to one item – e.g., 2 on a scale of 1-7 – they also give to the other items). So alpha would be 1, as high as it can get. But Cronbach would suggest you are wasting researcher and participant effort by having many items if they all elicit the same response. The point of scales having several items is that we assume no one item directly catches perfectly what we are trying to measure. Whether they do or not, there is no point in multiple items that are effectively equivalent.

Was it necessary to survey science education journals to make the point?

I did not originally think so.

My draft manuscript made the argument by drawing on some carefully selected examples of published papers in relation to the different issues I felt needed to be highlighted and discussed. I think the draft manuscript effectively made the point that there were papers getting published in good journals that quoted alpha but seemed to simply assume it demonstrated something (unexplained) to readers, and/or used alpha when their instrument was clearly not meant to be unidimensional, and/or took 0.7 as a definitive cut-off regardless of the number of items concerned, and/or quoted alpha values for overall instruments as well as for the distinct scales as if that added some evidence of instrument quality, or claimed that a high value of alpha for an instrument demonstrated it was unidimensional.

So why did you then spend time reviewing examples across four journals over a whole year of publication?

Although I did not think this was necessary, when the paper was reviewed for publication a journal reviewer felt it was too anecdotal: just because a few papers included weak practice, that might not be especially significant. I think there was also a sense that a paper critiquing a research technique did not fit the usual categories of study published in the journal, but that a study with more empirical content (even if the data were published papers) would fit the journal better.

At that point I could have decided to try and get the paper published elsewhere, but Research in Science Education is a good journal and I wanted the paper in a good science education journal. This took extra work, but satisfied the journal.

I still think the paper would have made a contribution without the survey, BUT the extra work did strengthen the paper. In retrospect, I am happy that I responded to the review comments in that way – as it did actually show just how frequently alpha is used in science education, and the wide variety of practice in reporting the statistic. Peer review is meant to help authors improve their work, and I think it did here.

Has the work had impact?

I think so, but…

The study has been getting a lot of citations, and it is always good to think someone notices a study, given the work it involves. Perhaps a lot of people have genuinely thought about their use of alpha as a result of reading the paper, and perhaps there are papers out there which do a better job of using and reporting alpha as a result of authors reading my study. (I would like to think so.)

However, I have also noticed that a lot of papers citing this study as an authority for using alpha in the reported research are still doing the very things I was criticising, sometimes directly justifying poor practice by citing my study! These authors either had not actually read the study (but were just looking for something about alpha to cite) or perhaps did not fully appreciate the points made.

Oh well, I think it was Oscar Wilde who said there is only one thing in academic life worse than being miscited…