Creeping bronzes

Evidence of journalistic creep in 'surprising' Benin bronzes claim


Keith S. Taber


How certain can we be about the origin of metals used in historic artefacts? (Image by Monika from Pixabay)


Science offers reliable knowledge of the natural world – but not absolutely certain knowledge. Conclusions from scientific studies follow from the results, but no research can offer absolutely certain conclusions as there are always provisos.

Read about critical reading of research

Scientists tend to know this, something emphasised for example by Albert Einstein (1940), who described scientific theories (used to interpret research results) as "hypothetical, never completely final, always subject to question and doubt".

When scientists talk to one another within some research programme they may use a shared linguistic code where they can omit the various conditionals ('likely', 'it seems', 'according to our best estimates', 'assuming the underlying theory', 'within experimental error', and the rest) as these are understood, and so may be left unspoken, thus increasing economy of language.

When scientists explain their work to a wider public such conditionals may also be left out to keep the account simple, but really should be mentioned. A particular trope that annoyed me when I was younger was the high frequency of links in science documentaries that told me "this could only mean…" (Taber, 2007) when honest science is always framed more along the lines "this would seem to mean…", "this could possibly mean…", "this suggested the possibility"…

Read about scientific certainty in the media

Journalistic creep

By journalistic creep I mean the tendency for some journalists who act as intermediaries between research scientists and the public to keep the story simple by omitting important provisos. Science teachers will appreciate this, as they often have to decide which details can be included in a presentation without losing or confusing the audience. A useful mantra may be:

Simplification may be necessary – but oversimplification can be misleading

A slightly different type of journalistic creep occurs within stories themselves. Sometimes the banner headline and the introduction to a piece report definitive, certain scientific results – but reading on (for those that do!) reveals nuances not acknowledged at the start. Teachers will again appreciate this tactic: offer the overview with the main point, before going back to fill in the more subtle aspects. But then, teachers have (somewhat) more control over whether the audience engages with the full account.

I am not intending to criticise journalists in general here, as scientists themselves have a tendency to do something similar when it comes to finding titles for papers that will attract attention by perhaps suggesting something more certain (or, sometimes, poetic or even controversial) than can be supported by the full report.


An example of a Benin Bronze (a brass artefact from what is now Nigeria) in the British [sic] Museum

(British Museum, CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0, via Wikimedia Commons)


Where did the Benin bronzes metal come from?

The title of a recent article in the RSC's magazine for teachers, Education in Chemistry, proclaimed a "Surprise origin for Benin bronzes".1 The article started with the claim:

"Geochemists have confirmed that most of the Benin bronzes – sculptured heads, plaques and figurines made by the Edo people in West Africa between the 16th and 19th centuries – are made from brass that originated thousands of miles away in the German Rhineland."

So, this was something that scientists had apparently confirmed as being the case.

Reading on, one finds that

  • it has been "long suspected that metal used for the artworks was melted-down manillas that the Portuguese brought to West Africa"
  • scientists "analysed 67 manillas known to have been used in early Portuguese trade. The manillas were recovered from five shipwrecks in the Atlantic and three land sites in Europe and Africa"
  • they "found strong similarities between the manillas studied and the metal used in more than 700 Benin bronzes with previously published chemical compositions"
  • and "the chemical composition of the copper in the manillas matched copper ores mined in northern Europe"
  • and "suggests that modern-day Germany, specifically the German Rhineland, was the main source of the metal".

So, there is a chain of argument here which seems quite persuasive, but to move from this to it being "confirmed that most of the Benin bronzes…are made from brass that originated …in the German Rhineland" seems an example of journalistic creep.

The reference to "the chemical composition of the copper [sic] in the manillas" is unclear, as according to the original research paper the manillas analysed were:

"chemically different from each other. Although most manillas analysed here …are brasses or leaded brasses, sometimes with small amounts of tin, a few specimens are leaded copper with little or no zinc."

Skowronek, et al., 2023

The key data presented in the paper concerned the ratios of different lead isotopes (205Pb:204Pb; 206Pb:204Pb; 207Pb:204Pb; 208Pb:204Pb {see the reproduced figure below}) in

  • ore from different European locations (according to published sources)
  • sampled Benin bronze (as reported from earlier research), and
  • sampled recovered manillas

and the ratios of different elements (Ni:As; Sb:As; Bi:As) in previously sampled Benin bronzes and sampled manillas.

The tendency to consider a chain of argument where each link seems reasonably persuasive as supporting fairly certain conclusions is logically flawed (it is like concluding from knowledge that one's chance of dying on any particular day is very low, that one must be immortal) but seems reflected in something I have noticed with some research students: that often their overall confidence in the conclusions of a research paper they have scrutinised is higher than their confidence in some of the distinct component parts of that study.


An example of a student's evaluation of a research study


This is like being told by a mechanic that your cycle brakes have a 20% chance of failing in the next year; the tyres 30%; the chain 20%; and the frame 10%; and concluding from this that there is only about a 20% chance of having any kind of failure in that time!
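To make the arithmetic explicit, here is a minimal sketch of the mechanic example. It assumes (my assumption, for the sake of illustration) that the four kinds of failure are independent of one another; the variable and function names are hypothetical.

```python
from math import prod

# Chance of each component failing within the year (figures from the mechanic example).
failure_chance = {"brakes": 0.20, "tyres": 0.30, "chain": 0.20, "frame": 0.10}

def chance_of_any_failure(chances):
    """Probability that at least one component fails, ASSUMING independent failures."""
    chance_nothing_fails = prod(1 - p for p in chances)
    return 1 - chance_nothing_fails

any_failure = chance_of_any_failure(failure_chance.values())
mean_failure = sum(failure_chance.values()) / len(failure_chance)

print(f"mean of the separate risks: {mean_failure:.0%}")
print(f"chance of at least one failure: {any_failure:.0%}")
```

On these assumptions the chance of some kind of failure comes out at roughly 60%, about three times the 'typical' figure of 20% that the intuitive estimate settles on.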

A definite identification?

The peer reviewed research paper which reports the study discussed in the Education in Chemistry article informs readers that

"In the current study, documentary sources and geochemical analyses are used to demonstrate that the source of the early Portuguese "tacoais" manillas and, ultimately, the Benin Bronzes was the German Rhineland."

"…this study definitively identifies the Rhineland as the principal source of manillas at the opening of the Portuguese trade…"

Skowronek, et al., 2023

which sounds pretty definitive, but interestingly the study did not rely on chemical analysis alone, but also 'documentary' evidence. In effect, historical evidence provided another link in the argument, by suggesting the range of possible sources of the alloy that should be considered in any chemical comparisons. This assumes there were no mining and smelting operations providing metal for the trade with Africa which have not been well-documented by historians. That seems a reasonable assumption, but adds another proviso to the conclusions.

The researchers reported that

Pre-18th century manillas share strong isotopic similarities with Benin's famous artworks. Trace elements such as antimony, arsenic, nickel and bismuth are not as similar as the lead isotope data…. The greater data derivation suggests that manillas were added to older brass or bronze scrap pieces to produce the Benin works, an idea proposed earlier.

and acknowledged that

Millions of these artifacts were sent to West Africa where they likely provided the major, virtually the only, source of brass for West African casters between the 15th and the 18th centuries, including serving as the principal metal source of the Benin Bronzes. However, the difference in trace elemental patterns between manillas and Benin Bronzes does not allow postulating that they have been the only source.

The figure below is taken from the research report.


Part of Figure 2 from the open access paper (© 2023 Skowronek et al. – distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)

The chart shows results from sampled examples of Benin bronzes (blue circles); compared with the values of the same isotope ratios from different copper ore sites (squares) and manillas sampled from different archaeological sites (triangles).


The researchers feel that the pattern of clustering of results (in this, and other similar comparisons between lead isotope ratios) from the Benin bronzes, compared with those from the sampled manillas, and the ore sites, allows them to identify the source of metal re-purposed by the Edo craftspeople to make the bronzes.

It is certainly the case that the blue circles (which refer to the artworks) and the green squares (which refer to copper ore samples from Rhineland) do seem to generally cluster in a similar region of the graph – and that some of the samples taken from the manillas also seem to fit this pattern.

I can see why this might strongly suggest the Rhineland (certainly more so than Wales) as the source of the copper believed to be used in manillas which were traded in Africa and are thought to have been later melted down as part of the composition of alloy used to make the Benin bronzes.

Whether that makes for either

  • definitive identification of the Rhineland as the principal source of manillas (Skowronek paper), or
  • confirmation that most of the Benin bronzes are made from brass that originated thousands of miles away in the German Rhineland (EiC)

seems somewhat less certain. Just as scientific claims should be.


A conclusion for science education

It is both human nature, and often good journalistic or pedagogic practice to begin with a clear, uncomplicated statement of what is to be communicated. But we also know that what is heard or read first may be better retained in memory than what follows. It also seems that people in general tend to apply the wrong kind of calculus when there are multiple sources of doubt – being more likely to estimate overall doubt as being the mean or modal level of the several discrete sources of doubt, rather than something that accumulates step-on-step.
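The contrast between the two kinds of calculus can be sketched in a few lines. The figures below are invented purely for illustration (they are not estimates for any actual study), and the sketch assumes that every link must hold for the conclusion to hold, and that the links are independent of one another.

```python
from math import prod
from statistics import mean

# HYPOTHETICAL confidence that each link in a chain of argument holds.
# These numbers are illustrative only, not estimates for any actual study.
link_confidence = [0.9, 0.9, 0.8, 0.9, 0.8]

def averaged_confidence(links):
    """The 'wrong calculus': treat overall confidence as the typical link confidence."""
    return mean(links)

def chained_confidence(links):
    """Confidence that ALL links hold, assuming they are independent of one another."""
    return prod(links)

print(f"averaged: {averaged_confidence(link_confidence):.2f}")
print(f"chained:  {chained_confidence(link_confidence):.2f}")
```

With these made-up figures the averaged estimate is 0.86, whereas the chance that every link holds is about 0.47, which is lower than the confidence placed in any single link.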

It seems there is a major issue here for science education in training young people in critically questioning claims, looking for the relevant provisos, and understanding how to integrate levels of doubt (or, similarly, risk) that are distributed over a sequence of phases in a process.


All research conclusions (in any empirical study in any discipline) rely on a network of assumptions and interpretations, any one of which could be a weak link in the chain of logic. This is my take on some of the most critical links and assumptions in the Benin bronzes study. One could easily further complicate this scheme (for example, I have ignored the assumptions about the validity of the techniques and calibration of the instrumentation used to find the isotopic composition of metal samples).


Work cited:

Note:

1 It is not clear to me what the surprise was – but perhaps this is meant to suggest the claim may be surprising to readers of the article. The study discussed was premised on the assumption that the Benin Bronzes were made from metal largely re-purposed from manillas traded from Europe, which had originally been cast in one of the known areas in Europe with metal working traditions. The researchers included the Rhineland as one of the potential regional sites they were considering. So, it was surely a surprise only in a similar sense to rolling a die and it landing on 4, rather than say 2 or 5, would be a surprise.

But then, would you be just as likely to read an article entitled "Benin bronzes found to have anticipated origin"?


Delusions of educational impact

A 'peer-reviewed' study claims to improve academic performance by purifying the souls of students suffering from hallucinations


Keith S. Taber


The research design is completely inadequate…the whole paper is confused…the methodology seems incongruous…there is an inconsistency…nowhere is the population of interest actually identified…No explanation of the discrepancy is provided…results of this analysis are not reported…the 'interview' technique used in the study is highly inadequate…There is a conceptual problem here…neither the validity nor reliability can be judged…the statistic could not apply…the result is not reported…approach is completely inappropriate…these tables are not consistent…the evidence is inconclusive…no evidence to demonstrate the assumed mechanism…totally unsupported claims…confusion of recommendations with findings…unwarranted generalisation…the analysis that is provided is useless…the research design is simply inadequate…no control condition…such a conclusion is irresponsible

Some issues missed in peer review for a paper in the European Journal of Education and Pedagogy

An invitation to publish without regard to quality?

I received an email from an open-access journal called the European Journal of Education and Pedagogy, with the subject heading 'Publish Fast and Pay Less' which immediately triggered the thought "another predatory journal?" Predatory journals publish submissions for a fee, but do not offer the editorial and production standards expected of serious research journals. In particular, they publish material which clearly falls short of rigorous research despite usually claiming to engage in peer review.

A peer reviewed journal?

Checking out the website I found the usual assurances that the journal used rigorous peer review as:

"The process of reviewing is considered critical to establishing a reliable body of research and knowledge. The review process aims to make authors meet the standards of their discipline, and of science in general.

We use a double-blind system for peer-reviewing; both reviewers and authors' identities remain anonymous to each other. The paper will be peer-reviewed by two or three experts; one is an editorial staff and the other two are external reviewers."

https://www.ej-edu.org/index.php/ejedu/about

Peer review is critical to the scientific process. Work is only published in (serious) research journals when it has been scrutinised by experts in the relevant field, and any issues raised responded to in terms of revisions sufficient to satisfy the editor.

I could not find who the editor(-in-chief) was, but the 'editorial team' of European Journal of Education and Pedagogy were listed as

  • Bea Tomsic Amon, University of Ljubljana, Slovenia
  • Chunfang Zhou, University of Southern Denmark, Denmark
  • Gabriel Julien, University of Sheffield, UK
  • Intakhab Khan, King Abdulaziz University, Saudi Arabia
  • Mustafa Kayıhan Erbaş, Aksaray University, Turkey
  • Panagiotis J. Stamatis, University of the Aegean, Greece

I decided to look up the editor based in England where I am also based but could not find a web presence for him at the University of Sheffield. Using the ORCID (Open Researcher and Contributor ID) provided on the journal website I found his ORCID biography places him at the University of the West Indies and makes no mention of Sheffield.

If the European Journal of Education and Pedagogy is organised like a serious research journal, then each submission is handled by one of this editorial team. However the reference to "editorial staff" might well imply that, like some other predatory journals I have been approached by (e.g., Are you still with us, Doctor Wu?), the editorial work is actually carried out by office staff, not qualified experts in the field.

That would certainly help explain the publication, in this 'peer-reviewed research journal', of the first paper that piqued my interest enough to motivate me to access and read the text.


The Effects of Using the Tazkiyatun Nafs Module on the Academic Achievement of Students with Hallucinations

The abstract of the paper published in what claims to be a peer-reviewed research journal

The paper initially attracted my attention because it seemed to be about treatment of a medical condition, so I wondered what it was doing in an education journal. Yet, the paper seemed to also be about an intervention to improve academic performance. As I read the paper, I found a number of flaws and issues (some very obvious, some quite serious) that should have been spotted by any qualified reviewer or editor, and which should have indicated that possible publication should have been deferred until these matters were satisfactorily addressed.

This is especially worrying as this paper makes claims relating to the effective treatment of a symptom of potentially serious, even critical, medical conditions through religious education ("a spiritual approach", p.50): claims that might encourage sufferers to defer seeking medical diagnosis and treatment. Moreover, these are claims that are not supported by any evidence presented in this paper that the editor of the European Journal of Education and Pedagogy decided was suitable for publication.


An overview of what is demonstrated, and what is claimed, in the study.

Limitations of peer review

Peer review is not a perfect process: it relies on busy human beings spending time on additional (unpaid) work, and it is only effective if suitable experts can be found that fit with, and are prepared to review, a submission. It is also generally more challenging in the social sciences than in the natural sciences. 1

That said, one sometimes finds papers published in predatory journals where one would expect any intelligent person with a basic education to notice problems without needing any specialist knowledge at all. The study I discuss here is a case in point.

Purpose of the study

Under the heading 'research objectives', the reader is told,

"In general, this journal [article?] attempts to review the construction and testing of Tazkiyatun Nafs [a Soul Purification intervention] to overcome the problem of hallucinatory disorders in student learning in secondary schools. The general objective of this study is to identify the symptoms of hallucinations caused by subtle beings such as jinn and devils among students who are the cause of disruption in learning as well as find solutions to these problems.

Meanwhile, the specific objective of this study is to determine the effect of the use of Tazkiyatun Nafs module on the academic achievement of students with hallucinations.

To achieve the aims and objectives of the study, the researcher will get answers to the following research questions [sic]:

Is it possible to determine the effect of the use of the Tazkiyatun Nafs module on the academic achievement of students with hallucinations?"

Awang, 2022, p.42

I think I can save readers a lot of time regarding the research question by suggesting that, in this study, at least, the answer is no – if only because the research design is completely inadequate to answer the research question. (I should point out that the author comes to the opposite conclusion: e.g., "the approach taken in this study using the Tazkiyatun Nafs module is very suitable for overcoming the problem of this hallucinatory disorder", p.49.)

Indeed, the whole paper is confused in terms of what it is setting out to do, what it actually reports, and what might be concluded. As one example, the general objective of identifying "the symptoms of hallucinations caused by subtle beings such as jinn and devils" (but surely, the hallucinations are the symptoms here?) seems to have been forgotten, or, at least, does not seem to be addressed in the paper. 2


The study assumes that hallucinations are caused by subtle beings such as jinn and devils possessing the students.
(Image by Tünde from Pixabay)

Methodology

So, this seems to be an intervention study.

  • Some students suffer from hallucinations.
  • This is detrimental to their education.
  • It is hypothesised that the hallucinations are caused by supernatural spirits ("subtle beings that lead to hallucinations"), so, a soul purification module might counter this detriment;
  • if so, sufferers engaging with the soul purification module should improve their academic performance;
  • and so the effect of the module is being tested in the study.

Thus we have a kind of experimental study?

No, not according to the author. Indeed, the study only reports data from a small number of unrepresentative individuals with no controls,

"The study design is a case study design that is a qualitative study in nature. This study uses a case study design that is a study that will apply treatment to the study subject to determine the effectiveness of the use of the planned modules and study variables measured many times to obtain accurate and original study results. This study was conducted on hallucination disorders [students suffering from hallucination disorders?] to determine the effectiveness of the Tazkiyatun Nafs module in terms of aspects of student academic achievement."

Awang, 2022, p.42

Case study?

So, the author sees this as a case study. Research methodologies are better understood as clusters of similar approaches rather than unitary categories – but case study is generally seen as naturalistic, rather than involving an intervention by an external researcher. So, case study seems incongruous here. Case study involves the detailed exploration of an instance (of something of interest – a lesson, a school, a course of study, a textbook, …) reported with 'thick description'.

Read about the characteristics of case study research

The case is usually a complex phenomenon which is embedded within a context from which it cannot readily be untangled (for example, a lesson always takes place within a wider context of a teacher working over time with a class on a course of study, within a curricular, and institutional, and wider cultural, context, all of which influence the nature of the specific lesson). So, due to the complex and embedded nature of cases, they are all unique.

"a case study is a study that is full of thoroughness and complex to know and understand an issue or case studied…this case study is used to gain a deep understanding of an issue or situation in depth and to understand the situation of the people who experience it"

Awang, 2022, p.42

A case is usually selected either because that case is of special importance to the researcher (an intrinsic case study – e.g., I studied this school because it is the one I was working in) or because we hope this (unique) case can tell us something about similar (but certainly not identical) other (also unique) cases. In the latter case [sic], an instrumental case study, we are always limited by the extent we might expect to be able to generalise beyond the case.

This limited generalisation might suggest we should not work with a single case, but rather look for a suitably representative sample of all cases: but we sometimes choose case study because the complexity of the phenomena suggests we need to use extensive, detailed data collection and analyses to understand the complexity and subtlety of any case. That is (i.e., the compromise we choose is), we decide we will look at one case in depth because that will at least give us insight into the case, whereas a survey of many cases will inevitably be too superficial to offer any useful insights.

So how does Awang select the case for this case study?

"This study is a case study of hallucinatory disorders. Therefore, the technique of purposive sampling (purposive sampling [sic]) is chosen so that the selection of the sample can really give a true picture of the information to be explored ….

Among the important steps in a research study is the identification of populations and samples. The large group in which the sample is selected is termed the population. A sample is a small number of the population identified and made the respondents of the study. A case or sample of n = 1 was once used to define a patient with a disease, an object or concept, a jury decision, a community, or a country, a case study involves the collection of data from only one research participant…"

Awang, 2022, p.42

Of course, a case study of "a community, or a country" – or of a school, or a lesson, or a professional development programme, or a school leadership team, or a homework policy, or an enrichment activity, or … – would almost certainly be inadequate if it was limited to "the collection of data from only one research participant"!

I do not think this study actually is "a case study of hallucinatory disorders [sic]". Leaving aside the shift from singular ("a case study") to plural ("disorders"), the research does not investigate a/some hallucinatory disorders, but the effect of a soul purification module on academic performance. (Actually, spoiler alert 😉, it does not actually investigate the effect of a soul purification module on academic performance either, but the author seems to think it does.)

If this is a case study, there should be the selection of a case, not a sample. Sometimes we do sample within a case in case study, but only from those identified as part of the case. (For example, if the case was a year group in a school, we may not have resources to interact in depth with several hundred different students.) Perhaps this is pedantry as the reader likely knows what Awang meant by 'sample' in the paper – but semantics is important in research writing: a sample is chosen to represent a population, whereas the choice of case study is an acknowledgement that generalisation back to a population is not being claimed.

However, if "among the important steps in a research study is the identification of populations" then it is odd that nowhere in the paper is the population of interest actually specified!

Things slip our minds. Perhaps Awang intended to define the population, forgot, and then missed this when checking the text – but, hey, that is just the kind of thing the reviewers and editor are meant to notice! Otherwise this looks very like including material from standard research texts to pay lip-service to the idea that research-design needs to be principled, but without really appreciating what the phrases used actually mean. This impression is also given by the descriptions of how data (for example, from interviews) were analysed – but which are not reflected at all in the results section of the paper. (I am not accusing Awang of this, but because of the poor standard of peer review not raising the question, the author is left vulnerable to such an evaluation.)

The only one research participant?

So, what do we know about the "case or sample of n = 1 ", the "only one research participant" in this study?

"The actual respondents in this case study related to hallucinatory disorders were five high school students. The supportive respondents in the case study related to hallucination disorders were five counseling teachers and five parents or guardians of students who were the actual respondents."

Awang, 2022, p.42

It is certainly not impossible that a case could comprise a group of five people – as long as those five make up a naturally bounded group – that is a group that a reasonable person would recognise as existing as a coherent entity as they clearly had something in common (they were in the same school class, for example; they were attending the same group therapy session, perhaps; they were a friendship group; they were members of the same extended family diagnosed with hallucinatory disorders…something!) There is no indication here of how these five make up a case.

The identification of the participants as a case might have made sense had the participants collectively undertaken the module as a group, but the reader is told: "This study is in the form of a case study. Each practice and activity in the module are done individually" (p.50). Another justification could have been if the module had been offered in one school, and these five participants were the students enrolled in the programme at that time but as "analysis of the respondents' academic performance was conducted after the academic data of all respondents were obtained from the respective respondent's school" (p.45) it seems they did not attend a single school.

The results tables and reports in the text refer to "respondent 1" to "respondent 4". In case study, an approach which recognises the individuality and inherent value of the particular case, we would usually assign assumed names to research participants, not numbers. But if we are going to use numbers, should there not be a respondent 5?

The other one research participant?

It seems that there is something odd here.

Both the passage above, and the abstract refer to five respondents. The results report on four. So what is going on? No explanation of the discrepancy is provided. Perhaps:

  • There only ever were four participants, and the author made a mistake in counting.
  • There only ever were four participants, and the author made a typographical mistake (well, strictly, six typographical mistakes) in drafting the paper, and then missed this in checking the manuscript.
  • There were five respondents and the author forgot to include data on respondent 5 purely by accident.
  • There were five respondents, but the author decided not to report on the fifth deliberately for a reason that is not revealed (perhaps the results did not fit with the desired outcome?)

The significant point is not that there is an inconsistency but that this error was missed by peer reviewers and the editor – if there ever was any genuine peer review. This is the kind of mistake that a school child could spot – so, how is it possible that 'expert reviewers' and 'editorial staff' either did not notice it, or did not think it important enough to query?

Research instruments

Another section of the paper reports the instrumentation used in the study.

"The research instruments for this study were Takziyatun Nafs modules, interview questions, and academic document analysis. All these instruments were prepared by the researcher and tested for validity and reliability before being administered to the selected study sample [sic, case?]."

Awang, 2022, p.42

Of course, it is important to test instruments for validity and reliability (or perhaps authenticity and trustworthiness when collecting qualitative data). But it is also important

  • to tell the reader how you did this
  • to report the outcomes

which seems to be missing (apart from in regard to part of the implemented module – see below). That is, the reader of a research study wants evidence not simply promises. Simply telling readers you did this is a bit like meeting a stranger who tells you that you can trust them because they (i.e., say that they) are honest.

Later the reader is told that

"Semi- structured interview questions will be [sic, not 'were'?] developed and validated for the purpose of identifying the causes and effects of hallucinations among these secondary school students…

…this interview process will be [sic, not 'was'] conducted continuously [sic!] with respondents to get a clear and specific picture of the problem of hallucinations and to find the best solution to overcome this disorder using Islamic medical approaches that have been planned in this study"

Awang, 2022, pp.43-44

At the very least, this seems to confuse the plan for the research with a report of what was done. (But again, apparently, the reviewers and editorial staff did not think this needed addressing.) This is also confusing as it is not clear how this aspect of the study relates to the intervention. Were the interviews carried out before the intervention to help inform the design of the modules? (Presumably not, as they had already been "tested for validity and reliability before being administered to the selected study sample".) Perhaps there are clear and simple answers to such questions – but the reader will not know because the reviewers and editor did not seem to feel they needed to be posed.

If "Interviews are the main research instrument in this study" (p.43), then one would expect to see examples of the interview schedules – but these are not presented. The paper reports a complex process for analysing interview data, but this is not reflected in the findings reported. The reader is told that the six-stage process leads to the identification and refinement of main and sub-categories. Yet, these categories are not reported in the paper. (But, again, peer reviewers and the editor did not apparently raise this as something to be corrected.) More generally "data analysis used thematic analysis methods" (p.44), so why is there no analysis presented in terms of themes? The results of this analysis are simply not reported.

The reader is told that

"This interview method…aims to determine the respondents' perspectives, as well as look at the respondents' thoughts on their views on the issues studied in this study."

Awang, 2022, p.44

But there is no discussion of participants' perspectives and views in the findings of the study. 2 Did the peer reviewers and editor not think this needed addressing before publication?

Even more significantly, in a qualitative study where interviews are supposedly the main research instrument, one would expect to see extracts from the interviews presented as part of the findings to support and exemplify claims being made: yet, there are none. (Did this not strike the peer reviewers and editor as odd: presumably they are familiar with the norms of qualitative research?)

The only quotation from the qualitative data (in this 'qualitative' study) I can find appears in the implications section of the paper:

"Are you aware of the importance of education to you? Realize. Is that lesson really important? Important. The success of the student depends on the lessons in school right or not? That's right"

Respondent 3: Awang, 2022, p.49

This seems a little bizarre, if we accept this is, as reported, an utterance from one of the students, Respondent 3. It becomes more sensible if this is actually condensed dialogue:

"Are you aware of the importance of education to you?"

"Realize."

"Is that lesson really important?"

"Important."

"The success of the student depends on the lessons in school right or not?"

"That's right"

It seems the peer review process did not lead to suggesting that the material should be formatted according to the norms for presenting dialogue in scholarly texts by indicating turns. In any case, if that is typical of the 'interview' technique used in the study then it is highly inadequate, as clearly the interviewer is leading the respondent, and this is more an example of indoctrination than open-ended enquiry.

Random sampling of data

Completely incongruous with the description of the purposeful selection of the participants for a case study is the account of how the assessment data was selected for analysis:

"The process of analysis of student achievement documents is carried out randomly by taking the results of current examinations that have passed such as the initial examination of the current year or the year before which is closest to the time of the study."

Awang, 2022, p.44

Did the peer reviewers or editor not question the use of the term random here? It is unclear what is meant by 'random' here, but clearly if the analysis was based on randomly selected data that would undermine the results.

Validating the soul purification module

There is also a conceptual problem here. The Takziyatun Nafs modules are the intervention materials (part of what is being studied) – so they cannot also be research instruments (used to study them). Surely, if the Takziyatun Nafs modules had been shown to be valid and reliable before carrying out the reported study, as suggested here, then the study would not be needed to evaluate their effectiveness. But, presumably, expert peer reviewers (if there really were any) did not see an issue here.

The reliability of the intervention module

The Takziyatun Nafs modules had three components, and the author reports the second of the three was subjected to tests of validity and reliability. It seems that Awang thinks that this demonstrates the validity and reliability of the complete intervention,

"The second part of this module will go through [sic] the process of obtaining the validity and reliability of the module. Proses [sic] to obtain this validity, a questionnaire was constructed to test the validity of this module. The appointed specialists are psychologists, modern physicians (psychiatrists), religious specialists, and alternative medicine specialists. The validity of the module is identified from the aspects of content, sessions, and activities of the Tazkiyatun Nafs module. While to obtain the value of the reliability coefficient, Cronbach's alpha coefficient method was used. To obtain this Cronbach's alpha coefficient, a pilot test was conducted on 50 students who were randomly selected to test the reliability of this module to be conducted."

Awang, 2022, pp.43-44

Now to unpack this, it may be helpful to briefly outline what the intervention involved (as the paper is open access anyone can access and read the full details in the report).


From the MGM film 'A Night at the Opera' (1935): "The introduction of the module will elaborate on the introduction, rationale, and objectives of this module introduced"

The description does not start off very helpfully ("The introduction of the module will elaborate on the introduction, rationale, and objectives of this module introduced" (p.43) put me in mind of the Marx brothers: "The party of the first part shall be known in this contract as the party of the first part"), but some key points are,

"the Tazkiyatun Nafs module was constructed to purify the heart of each respondent leading to the healing of hallucinatory disorders. This liver purification process is done in stages…

"the process of cleansing the patient's soul will be done …all the subtle beings in the patient will be expelled and cleaned and the remnants of the subtle beings in the patient will be removed and washed…

The second process is the process of strengthening and the process of purification of the soul or heart of the patient …All the mazmumah (evil qualities) that are in the heart must be discarded…

The third process is the process of enrichment and the process of distillation of the heart and the practices performed. In this process, there will be an evaluation of the practices performed by the patient as well as the process to ensure that the patient is always clean from all the disturbances and disturbances [sic] of subtle beings to ensure that students will always be healthy and clean from such disturbances…

Awang, 2022, p.45, p.43

Quite how this process of exorcising and distilling and cleansing will occur is not entirely clear (and if the soul is equated with the heart, how is the liver involved?), but it seems to involve reflection and prayer and contemplation of scripture – certainly a very personal and therapeutic process.

And yet its validity and reliability were tested by giving a questionnaire to 50 students randomly selected (from the unspecified population, presumably)? No information is given on how a random selection was made (Taber, 2013) – which allows a reader to be very sceptical that this actually was a random sample from the (un?)identified population, and not just an arbitrary sample of 50 students. (So, that is twice the word 'random' is used in the paper when it seems inappropriate.)

It hardly matters here, as clearly neither the validity nor the reliability of a spiritual therapy can be judged from a questionnaire (especially when administered to people who have never undertaken the therapy). In any case, the "reliability coefficient" obtained from an administration of a questionnaire ONLY applies to that sample on that occasion. So, the statistic could not apply to the four participants in the study. And, in any case, the result is not reported, so the reader has no idea what the value of Cronbach's alpha was (but then, this was described as a qualitative study!)

Moreover, Cronbach's alpha only indicates the internal coherence of the items on a scale (Taber, 2019): so, it only indicates whether the set of questions included in the questionnaire seem to be accessing the same underlying construct in motivating the responses of those surveyed across the set of items. It gives no information about the reliability of the instrument (i.e., whether it would give the same results on another occasion).
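To make concrete what the statistic does (and does not) measure, here is a minimal sketch of the standard calculation of Cronbach's alpha from a table of questionnaire responses. Nothing here comes from Awang's paper: the function name and the response data are invented purely for illustration, and the sketch assumes the usual formula based on sample variances.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one administration of a questionnaire.

    `responses` is a list of rows, one row per respondent, each row
    holding that respondent's scores on the k items of the scale.
    (Hypothetical helper, written for this illustration only.)
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")

    # Variance of each item (each column) across the respondents.
    item_variances = [variance(column) for column in zip(*responses)]

    # Variance of the respondents' total scores across all items.
    total_variance = variance([sum(row) for row in responses])

    # alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Invented responses from five people to a three-item scale.
hypothetical_responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 3],
]
alpha = cronbach_alpha(hypothetical_responses)
```

Every quantity in the calculation comes from a single administration to a single sample: the items' variances and the variance of the total scores. Nothing in it refers to a second occasion, which is why the statistic can indicate how well the items hang together but not whether the instrument would give the same results if used again.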

This approach to testing validity and reliability is then completely inappropriate and unhelpful. So, even if the outcomes of the testing had been reported (and they are not) they would not offer any relevant evidence. Yet it seems that peer reviewers and editor did not think to question why this section was included in the paper.

Ethical issues

A study of this kind raises ethical issues. It may well be that the research was carried out in an entirely proper and ethical manner, but it is usual in studies with human participants ('human subjects') to make this clear in the published report (Taber, 2014b). A standard issue is whether the participants gave voluntary, informed, consent. This would mean that they were given sufficient information about the study at the outset to be able to decide if they wished to participate, and were under no undue pressure to do so. The 'respondents' were school students: if they were considered minors in the research context (and oddly for a 'case study' such basic details as age and gender are not reported) then parental permission would also be needed, again subject to sufficient briefing and no duress.

However, in this specific research there are also further issues due to the nature of the study. The participants were subject to medical disorders, so how did the researcher obtain information about, and access to, the students without medical confidentiality being broken? Who were the 'gatekeepers' who provided access to the children and their personal data? The researcher also obtained assessment data "from the class teacher or from the Student Affairs section of the student's school" (p.44), so it is important to know that students (and parents/guardians) consented to this. Again, peer review does not seem to have identified this as an issue to address before publication.

There is also the major underlying question about the ethics of a study when recognising that these students were (or could be, as details are not provided) suffering from serious medical conditions, but employing religious education as a treatment ("This method of treatment is to help respondents who suffer from hallucinations caused by demons or subtle beings", p.44). Part of the theoretical framework underpinning the study is the assumption that what is being addressed is "the problem of hallucinations caused by the presence of ethereal beings…" (p.43) yet it is also acknowledged that,

"Hallucinatory disorders in learning that will be emphasized in this study are due to several problems that have been identified in several schools in Malaysia. Such disorders are psychological, environmental, cultural, and sociological disorders. Psychological disorders such as hallucinatory disorders can lead to a more critical effect of bringing a person prone to Schizophrenia. Psychological disorders such as emotional disorders and psychiatric disorders. …Among the causes of emotional disorders among students are the school environment, events in the family, family influence, peer influence, teacher actions, and others."

Awang, 2022, p.41

There seem to be three ways of understanding this apparent discrepancy, which I might gloss:

  1. there are many causes of conditions that involve hallucinations, including, but not only, possession by evil or mischievous spirits;
  2. the conditions that lead to young people having hallucinations may be understood at two complementary levels, at a spiritual level in terms of a need for inner cleansing and exorcising of subtle beings, and in terms of organic disease or conditions triggered by, for example, social and psychological factors;
  3. in the introduction the author has relied on various academic sources to discuss the nature of the phenomenon of students having hallucinations, but he actually has a working assumption that is completely different: hallucinations are due to the presence of jinn or other spirits.

I do not think it is clear which of these positions is being taken by the study's author.

  1. In the first case it would be necessary to identify which causes are present in potential respondents and only recruit those suffering possession for this study (which does not seem to have been done);
  2. In the second case, spiritual treatment would need to complement medical intervention (which would completely undermine the validity of the study as medical treatments for the underlying causes of hallucinations are likely to be the cause of hallucinations ceasing, not the tested intervention);
  3. The third position is clearly problematic in terms of academic scholarship as it is either completely incompetent or deliberately disregards academic norms that require the design of a study to reflect the conceptual framework set out to motivate it.

So, was this tested intervention implemented instead of or alongside formal medical intervention?

  • If it was alongside medical treatment, then that raises a major confound for the study.
  • Yet it would clearly be unacceptable to deny sufferers indicated medical treatment in order to test an educational intervention that is in effect a form of exorcism.

Again, it may be there are simple and adequate responses to these questions (although here I really cannot see what they might be), but unfortunately it seems the journal referees and editor did not think to ask for them.  

Findings


Results tables presented in Awang, 2022 (p.45) [Published with a creative commons licence allowing reproduction]: "Based on the findings stated in Table I show that serial respondents experienced a decline in academic achievement while they face the problem of hallucinations. In contrast to Table II which shows an improvement in students' academic achievement after hallucinatory disorders can be resolved." If we assume that columns in the second table have been mislabelled, then it seems the school performance of these four students suffered while they were suffering hallucinations, but improved once they recovered. From this, we can infer…?

The key findings presented concern academic performance at school. Core results are presented in tables I and II. Unfortunately these tables are not consistent as they report contradictory results for the academic performance of students before and during periods when they had hallucinations.

They can be made consistent if the reader assumes that two of the columns in table II are mislabelled. If the reader assumes that the column labelled 'before disruption' actually reports the performance 'during disruption' and that the column actually labelled 'during disruption' is something else, then they become consistent. For the results to tell a coherent story and agree with the author's interpretation this 'something else' presumably should be 'after disruption'.

This is a very unfortunate error – and moreover one that is obvious to any careful reader. (So, why was it not obvious to the referees and editor?)

As well as looking at these overall scores, other assessment data is presented separately for each of respondent 1 – respondent 4. These sections comprise presentations of information about grades and class positions, mixed with claims about the effects of the intervention. These claims are not based on any evidence and in many cases are conclusions about 'respondents' in general although they are placed in sections considering the academic assessment data of individual respondents. So, there are a number of problems with these claims:

  • they are of the nature of conclusions, but appear in the section presenting the findings;
  • they are about the specific effects of the intervention that the author assumes has influenced academic performance, not the data analysed in these sections;
  • they are completely unsubstantiated as no data or analysis is offered to support them;
  • often they make claims about 'respondents' in general, although as part of the consideration of data from individual learners.

Despite this, the paper passed peer-review and editorial scrutiny.

Rhetorical research?

This paper seems to be an example of a kind of 'rhetorical research' where a researcher is so convinced about their pre-existing theoretical commitments that they simply assume they have demonstrated them. Here the assumptions seem to be:

  1. Recovering from suffering hallucinations will increase student performance
  2. Hallucinations are caused by jinn and devils
  3. A spiritual intervention will expel jinn and devils
  4. So, a spiritual intervention will cure hallucinations
  5. So, a spiritual intervention will increase student performance

The researcher provided a spiritual intervention, and the student performance increased, so it is assumed that the scheme is demonstrated. The data presented are certainly consistent with these assumptions, but consistency alone does not demonstrate the scheme. Awang provides evidence that student performance improved in four individuals after they had received the intervention – but there is no evidence offered to demonstrate the assumed mechanism.

A gardener might think that complimenting seedlings will cause them to grow. Perhaps she praises her seedlings every day, and they do indeed grow. Are we persuaded about the efficacy of her method, or might we suspect another cause at work? Would the peer reviewers and editor of the European Journal of Education and Pedagogy be persuaded this demonstrated that compliments cause plant growth? On the evidence of this paper, perhaps they would.

This is what Awang tells readers about the analysis undertaken:

"Each student respondent involved in this study [sic, presumably not, rather the researcher] will use the analysis of the respondent's performance to determine the effect of hallucination disorders on student achievement in secondary school is accurate.

The elements compared in this analysis are as follows: a) difference in mean percentage of achievement by subject, b) difference in grade achievement by subject and c) difference in the grade of overall student achievement. All academic results of the respondents will be analyzed as well as get the mean of the difference between the performance before, during, and after the respondents experience hallucinations.

These results will be used as research material to determine the accuracy of the use of the Tazkiyatun Nafs Module in solving the problem of hallucinations in school and can improve student achievement in academic school."

Awang, 2022, p.45

There is clearly a large jump between the analysis outlined in the second paragraph here, and testing the study hypotheses as set out in the final paragraph. But the author does not seem to notice this (and more worryingly, nor do the journal's reviewers and editor).

So interleaved into the account of findings discussing "mean percentage of achievement by subject…difference in grade achievement by subject…difference in the grade of overall student achievement" are totally unsupported claims. Here is an example for Respondent 1:

"Based on the findings of the respondent's achievement in the grade for Respondent 1 while facing the problem of hallucinations shows that there is not much decrease or deterioration of the respondent's grade. There were only 4 subjects who experienced a decline in grade between before and during hallucination disorder. The subjects that experienced decline were English, Geography, CBC, and Civics. Yet there is one subject that shows a very critical grade change the Civics subject. The decline occurred from grade A to grade E. This shows that Civics education needs to be given serious attention in overcoming this problem of decline. Subjects experiencing this grade drop were subjects involving emotion, language, as well as psychomotor fitness. In the context of psychology, unstable emotional development leads to a decline in the psychomotor and emotional development of respondents.

After the use of the Tazkiyatun Nafs module in overcoming this problem, hallucinatory disorders can be overcome. This situation indicates the development of the respondents during and after experiencing hallucinations after practicing the Tazkiyatun Nafs module. The process that takes place in the Tzkiyatun Nafs module can help the respondent to stabilize his emotions and psyche for the better. From the above findings there were 5 subjects who experienced excellent improvement in grades. The increase occurred in English, Malay, Geography, and Civics subjects. The best improvement is in the subject of Civic education from grade E to grade B. The improvement in this language subject shows that the respondents' emotions have stabilized. This situation is very positive and needs to be continued for other subjects so that respondents continue to excel in academic achievement in school."

Awang, 2022, p.45 (emphasis added)

The material which I show here as underlined is interjected completely gratuitously. It does not logically fit in the sequence. It is not part of the analysis of school performance. It is not based on any evidence presented in this section. Indeed, nor is it based on any evidence presented anywhere else in the paper!

This pattern is repeated in discussing other aspects of respondents' school performance. Although there is mention of other factors which seem especially pertinent to the dip in school grades ("this was due to the absence of the respondents to school during the day the test was conducted", p.46; "it was an increase from before with no marks due to non-attendance at school", p.46) the discussion of grades is interspersed with (repetitive) claims about the effects of the intervention for which no evidence is offered.


§: Differences in Respondents' Grade Achievement by Subject

  • Respondent 1: "After the use of the Tazkiyatun Nafs module in overcoming this problem, hallucinatory disorders can be overcome. This situation indicates the development of the respondents during and after experiencing hallucinations after practicing the Tazkiyatun Nafs module. The process that takes place in the Tzkiyatun Nafs module can help the respondent to stabilize his emotions and psyche for the better." (p.45)
  • Respondent 2: "After the use of the Tazkiyatun Nafs module as a soul purification module, showing the development of the respondents during and after experiencing hallucination disorders is very good. The process that takes place in the Tzkiyatun Nafs module can help the respondent to stabilize his emotions and psyche for the better." (p.46)
  • Respondent 3: "The process that takes place in the Tazkiyatun Nafs module can help the respondent to stabilize his emotions and psyche for the better" (p.46)
  • Respondent 4: "The process that takes place in the Tazkiyatun Nafs module can help the respondent to stabilize his emotions and psyche for the better." (p.46)

§: Differences in Respondent Grades according to Overall Academic Achievement

  • Respondent 1: "Based on the findings of the study after the hallucination disorder was overcome showed that the development of the respondents was very positive after going through the treatment process using the Tazkiyatun Nafs module…In general, the use of Tazkiyatun Nafs module successfully changed the learning lifestyle and achievement of the respondents from poor condition to good and excellent achievement." (pp.46-7)
  • Respondent 2: "Based on the findings of the study after the hallucination disorder was overcome showed that the development of the respondents was very positive after going through the treatment process using the Tazkiyatun Nafs module. … This excellence also shows that the respondents have recovered from hallucinations after practicing the methods found in the Tazkiayatun Nafs module that has been introduced. In general, the use of the Tazkiyatun Nafs module successfully changed the learning lifestyle and achievement of the respondents from poor condition to good and excellent achievement." (p.47)
  • Respondent 3: "Based on the findings of the study after the hallucination disorder was overcome showed that the development of the respondents was very positive after going through the treatment process using the Tazkiyatun Nafs module…In general, the use of the Tazkiyatun Nafs module successfully changed the learning lifestyle and achievement of the respondents from poor condition to good and excellent achievement." (p.47)
  • Respondent 4: "Based on the findings of the study after the hallucination disorder was overcome showed that the development of the respondents was very positive after going through the treatment process using the Tazkiyatun Nafs module…In general, the use of the Tazkiyatun Nafs module has successfully changed the learning lifestyle and achievement of the respondents from poor condition to good and excellent achievement." (p.47)

Unsupported claims made within findings sections reporting analyses of individual student academic grades: note (a) how these statements included in the analysis of individual school performance data from four separate participants (in a case study – a methodology that recognises and values diversity and individuality) are very similar across the participants; (b) claims about 'respondents' (plural) are included in the reports of findings from individual students.

Awang summarises what he claims the analysis of 'differences in respondents' grade achievement by subject' shows:

"The use of the Tazkiyatun Nafs module in this study helped the students improve their respective achievement grades. Therefore, this soul purification module should be practiced by every student to help them in stabilizing their soul and emotions and stay away from all the disturbances of the subtle beings that lead to hallucinations"

Awang, 2022, p.46

And, on the next page, Awang summarises what he claims the analysis of 'differences in respondent grades according to overall academic achievement' shows:

"The use of the Tazkiyatun Nafs module in this study helped the students improve their respective overall academic achievement. Therefore, this soul purification module should be practiced by every student to help them in stabilizing the soul and emotions as well as to stay away from all the disturbances of the subtle beings that lead to hallucination disorder."

Awang, 2022, p.47

So, the analysis of grades is said to demonstrate the value of the intervention, and indeed Awang considers this is reason to extend the intervention beyond the four participants, not just to others suffering hallucinations, but to "every student". The peer review process seems not to have raised queries about

  • the unsupported claims,
  • the confusion of recommendations with findings (it is normal to keep to results in a findings section), nor
  • the unwarranted generalisation from four hallucination sufferers to all students whether healthy or not.

Interpreting the results

There seem to be two stories that can be told about the results:

When the four students suffered hallucinations, this led to a deterioration in their school performance. Later, once they had recovered from the episodes of hallucinations, their school performance improved.  

Narrative 1

Now narrative 1 relies on a very substantial implied assumption – which is that the numbers presented as school performance are comparable over time. So, a control would be useful: such as what happened to the performance scores of other students in the same classes over the same time period. It seems likely they would not have shown the same dip – unless the dip was related to something other than hallucinations – such as the well-recognised dip after long school holidays, or some cultural distraction (a major sports tournament; fasting during Ramadan; political unrest; a pandemic…). Without such a control the evidence is suggestive (after all, being ill, and missing school as a result, is likely to lead to a dip in school performance, so the findings are not surprising), but inconclusive.
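The logic of such a control can be sketched in a few lines of Python. This is only an illustration of the comparison being suggested, not anything reported in Awang's paper: the function names and all of the scores below are invented for the example.

```python
from statistics import mean


def mean_change(scores_before, scores_during):
    """Average change in score per student between two assessments."""
    return mean(d - b for b, d in zip(scores_before, scores_during))


def dip_relative_to_control(part_before, part_during, ctrl_before, ctrl_during):
    """Change among participants over and above the change among classmates.

    A value near zero would suggest the participants' dip simply mirrors
    something affecting the whole class (a holiday, an unusually hard test)
    rather than anything specific to them.
    """
    return mean_change(part_before, part_during) - mean_change(ctrl_before, ctrl_during)


# Invented percentage scores, for illustration only.
participants_before = [70, 65, 80, 75]
participants_during = [50, 45, 60, 55]
classmates_before = [72, 68, 77, 74, 69]
classmates_during = [62, 58, 67, 64, 59]

relative_dip = dip_relative_to_control(
    participants_before, participants_during,
    classmates_before, classmates_during,
)
```

With these made-up numbers the participants' scores fall by 20 points on average while their classmates' fall by 10, so only half of the apparent dip would be specific to the participants. Without the classmates' scores, the reader has no way to make even this rough adjustment.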

Intriguingly, the author tells readers that "student achievement statistics from the beginning of the year to the middle of the current [sic, published in 2022] year in secondary schools in Northern Peninsular Malaysia that have been surveyed by researchers show a decline (Sabri, 2015 [sic])" (p.42), but this is not considered in relation to the findings of the study.

When the four students suffered hallucinations, this led to a deterioration in their school performance. Later, as a result of undergoing the soul purification module, their school performance improved.  

Narrative 2

Clearly narrative 2 suffers from the same limitation as narrative 1. However, it also demands an extra step in making an inference. I could re-write this narrative:

When the four students suffered hallucinations, this led to a deterioration in their school performance. Later, once they had recovered from the episodes of hallucinations, their school performance improved. 
AND
the recovery was due to engagement with the soul purification module.

Narrative 2'.

That is, even if we accept narrative 1 as likely, to accept narrative 2 we would also need to be convinced that:

  • a) sufferers from medical conditions leading to hallucinations do not suffer periodic attacks with periods of remission in between; or
  • b) episodes of hallucinations cannot be due to one-off events (emotional trauma, T.I.A. {transient ischaemic attack or mini-strokes},…) that resolve naturally in time; or
  • c) sufferers from medical conditions leading to hallucinations do not find they resolve due to maturation; or
  • d) the four participants in this study did not undertake any change in life-style (getting more sleep, ceasing eating strange fungi found in the woods) unrelated to the intervention that might have influenced the onset of hallucinations; or
  • e) the four participants in this study did not receive any medical treatment independent of the intervention (e.g., prescribed medication to treat migraine episodes) that might have influenced the onset of hallucinations

Despite this study being supposedly a case study (where the expectation is there should be 'thick description' of the case and its context), there is no information to help us exclude such options. We do not know the medical diagnoses of the conditions causing the participants' hallucinations, or anything about their lives or any medical treatment that may have been administered. Without such information, the analysis that is provided is useless for answering the research question.

In effect, regardless of all the other issues raised, the key problem is that the research design is simply inadequate to test the research question. But it seems the referees and editor did not notice this shortcoming.

Alleged implications of the research

After presenting his results Awang draws various implications, and makes a number of claims about what had been found in the study:

  • "After the students went through the treatment session by using the Tazkiayatun Nafsmodule to treat hallucinations, it showed a positive effect on the student respondents. All this was certified by the expert, the student's parents as well as the  counselor's  teacher." (p.48)
  • "Based on these findings, shows that hallucinations are very disturbing to humans and the appropriate method for now to solve this problem is to use the Tazkiyatun Nafs Module." (p.48)
  • "…the use of the Tazkiyatun Nafs module while the respondent is suffering from hallucination disorder is very appropriate…is very helpful to the respondents in restoring their minds and psyche to be calmer and healthier. These changes allow students to focus on their studies as well as allow them to improve their academic performance better." (p.48)
  • "The use of the Tazkiyatun Nafs Module in this study has led to very positive changes there are attitudes and traits of students who face hallucinations before. All the negative traits like irritability, loneliness, depression,etc. can be overcome completely." (p.49)
  • "The personality development of students is getting better and perfect with the implementation of the Tazkiaytun Nafs module in their lives." (p.49)
  • "Results indicate that students who suffer from this hallucination disorder are in a state of high depression, inactivity, fatigue, weakness and pain,and insufficient sleep." (p.49)
  • "According to the findings of this study, the history of this hallucination disorder started in primary school and when a person is in adolescence, then this disorder becomes stronger and can cause various diseases and have various effects on a person who is disturbed." (p.50)

Given the range of interview data that Awang claims to have collected and analysed, at least some of the claims here are possibly supported by the data. However, none of this data and analysis is available to the reader. 2 These claims are not supported by any evidence presented in the paper. Yet peer reviewers and the editor who read the manuscript seem to feel it is entirely acceptable to publish such claims in a research paper, and not present any evidence whatsoever.

Summing up

In summary: as far as these four students were concerned (but not perhaps the fifth participant?), there did seem to be a relationship between periods of experiencing hallucinations and lower school performance (perhaps explained by such factors as "absenteeism to school during the day the test was conducted" p.46):

"the performance shown by students who face chronic hallucinations is also declining and  declining.  This  is  all  due  to  the  actions  of  students leaving the teacher's learning and teaching sessions as well as  not  attending  school  when  this  hallucinatory  disorder strikes.  This  illness or  disorder  comes  to  the  student suddenly  and  periodically.  Each  time  this  hallucination  disease strikes the student causes the student to have to take school  holidays  for  a  few  days  due  to  pain  or  depression"

Awang, 2022, p.42

However,

  • these four students do not represent any wider population;
  • there is no information about the specific nature, frequency, intensity, etcetera, of the hallucinations or diagnoses in these individuals;
  • there was no statistical test of significance of changes; and
  • there was no control condition to see if performance dips were experienced by others not experiencing hallucinations at the same time.

Once they had recovered from the hallucinations (and it is not clear on what basis that judgement was made) their scores improved.

The author would like us to believe that the relief from the hallucinations was due to the intervention, but this seems to be (quite literally) an act of faith 3 as no actual research evidence is offered to show that the soul purification module actually had any effect. It is of course possible the module did have an effect (whether for the conjectured or other reasons – such as simply offering troubled children some extra study time in a calm and safe environment and special attention – or because of an expectancy effect if the students were told by trusted authority figures that the intervention would lead to the purification of their hearts and the healing of their hallucinatory disorder) but the study, as reported, offers no strong grounds to assume it did have such an effect.

An irresponsible journal

As hallucinations are often symptoms of organic disease affecting blood supply to the brain, there is a major question of whether treating the condition by religious instruction is ethically sound. For example, hallucinations may indicate a tumour growing in the brain. Yet, if the module was only a complement to proper medical attention, a reader may prefer to suspect that any improvement in the condition (and consequent increased engagement in academic work) may have been entirely unrelated to the module being evaluated.

Indeed, a published research study that claims that soul purification is a suitable treatment for medical conditions presenting with hallucinations is potentially dangerous as it could lead to serious organic disease going untreated. If Awang's recommendations were widely taken up in Malaysia such that students with serious organic conditions were only treated for their hallucinations by soul purification rather than with medication or by surgery it would likely lead to preventable deaths. For a research journal to publish a paper with such a conclusion, where any qualified reviewer or editor could easily see the conclusion is not warranted, is irresponsible.

As the journal website points out,

"The process of reviewing is considered critical to establishing a reliable body of research and knowledge. The review process aims to make authors meet the standards of their discipline, and of science in general."

https://www.ej-edu.org/index.php/ejedu/about

So, why did the European Journal of Education and Pedagogy not subject this submission to meaningful review to help the author of this study meet the standards of the discipline, and of science in general?


Work cited:

Notes:

1 In mature fields in the natural sciences there are recognised traditions ('paradigms', 'disciplinary matrices') in any active field at any time. In general (and of course, there will be exceptions):

  • at any historical time, there is a common theoretical perspective underpinning work in a research programme, aligned with specific ontological and epistemological commitments;
  • at any historical time, there is a strong alignment between the active theories in a research programme and the acceptable instrumentation, methodology and analytical conventions.

Put more succinctly, in a mature research field, there is generally broad agreement on how a phenomenon is to be understood; and how to go about investigating it, and how to interpret data as research evidence.

This is generally not the case in educational research – which is in part at least due to the complexity and, so, multi-layered nature, of the phenomena studied (Taber, 2014a): phenomena such as classroom teaching. So, in reviewing educational papers, it is sometimes necessary to find different experts to look at the theoretical and the methodological aspects of the same submission.


2 The paper is very strange in that the introductory sections and the conclusions and implications sections have a very broad scope, but the actual research results are restricted to a very limited focus: analysis of school test scores and grades.

It is as if (and could well be that) a dissertation with a number of evidential strands has been reduced to a paper drawing upon only one aspect of the research evidence, but with material from other sections of the dissertation being unchanged from the original broader study.


3 Readers are told that

"All  these  acts depend on the sincerity of the medical researcher or fortune-teller seeking the help of Allah S.W.T to ensure that these methods and means are successful. All success is obtained by the permission of Allah alone"

Awang, 2022, p.43


My brain can multitask even if yours makes a category error

Do not mind the brain, it is just doing its jobs

Keith S. Taber


Can Prof. Dux's brain really not multitask?

I was listening to a podcast where Professor Paul Dux of the University of Queensland said something that seemed to me to be clearly incorrect – even though I think I fully appreciated his point.

"why the brain can't multitask is still very much a topic of considerable debate"

Prof. Paul Dux
Is it true that brains cannot multitask? I think mine can. (Image by Gerd Altmann from Pixabay)

The podcast was an episode of the ABC radio programme All in the Mind (not to be confused with the BBC radio programme All in the Mind, of course) entitled 'Misadventures in multitasking'.

"All in the Mind is an exploration of the mental: the mind, brain and behaviour — everything from addiction to artificial intelligence." An ABC radio programme and podcast.

The argument against multitasking

Now multitasking is doing several things at once – such as perhaps having a phone conversation whilst reading an unrelated email. Some aspects of the modern world seem to encourage this – such as being queued on the telephone (as when I was kept on hold for over an hour waiting to get an appointment at my doctor's surgery – I was not going to just sit by the phone in the hope I would eventually get to the top of the queue). Similarly 'notifications' that seek to distract us from what we are doing on the computer, as if anything that arrives is likely to be important enough for us to need immediate alerting, add little to the sum of human happiness.1

Now I have heard the argument against multitasking before. The key is attention. We may think we are doing several things at once, but instead of focusing on one activity, completing it, then shifting to another, what multitaskers actually do is continuously interrupt their focus on one activity to refocus attention on another. The working memory has limited capacity (this surely is what limits our ability to reflectively multitask?), and we can only actually focus on one activity at a time, so multitasking is a con – we may think we are being more productive but we are not.

Now, people do tire, and after, say, 45 minutes at one task it may be more effective to break, do something unrelated, and come back to your work fresh. If you are writing, and you break, and take the washing out of the machine and hang it up to dry, and make a cup of tea, and then come back to your writing fifteen or twenty minutes later, this is likely to be ultimately more productive than just ploughing on. You have been busy, not just resting, but with a very different kind of activity, and your mind (hopefully) is refreshed. If you have been at your desk for 90 minutes without a break, then go for a walk, or even a quick lie down.

That, however, is very different from doing your writing as you check your email inbox, and keep an eye on a social media feed, and shop online. You can only really do one of those things at a time and if you try to multitask you are likely to quickly tire, and make mistakes as you keep interrupting your flow of concentration. (So, if you have been doing your writing, and you feel the need to do something else, give yourself a definite period of time to completely change activity, and then return fully committed to the writing.)

Now, I find that line of argument very convincing and in keeping with my own experience. (Which is not to say I always follow my own advice, of course.) Yet, I still thought Prof. Dux was wrong. And, indeed, there is one sense in which I would like to think deliberate reflective multitasking is not counterproductive.

If your brain cannot multitask you'd perhaps better hope it focuses on breathing

The brain is complex…

This is a short extract from the programme,

Paul Dux: Why the brain can't multitask is still very much a topic of considerable debate because we have these billions of neurons, trillions of synaptic connections, so why can't we do two simple things at once?

Sana Qadar: This is Professor Paul Dux, he's a psychologist and neuroscientist at the University of Queensland. He takes us deeper into what's going on in the brain.

Paul Dux: A lot of people would say it's because we have these capacities for attention. The brain regions that are involved in things like attention are our lateral prefrontal cortex. You have these populations of neurons that respond to lots of different tasks and multiple demands. That of course on one hand could be quite beneficial because it means that we are able to learn things quickly and can generalise quickly, but maybe the cost of that is that if we are doing two things at once in close temporal proximity, they try to draw on the same populations of neurons, and as a result leads to interference. And so that's why we get multitasking costs.

Sana Qadar: Right, so that's why if you are doing dishes while chatting to a friend, a dish might end up in the fridge rather than the cupboard where it's supposed to go.

Paul Dux: That's right, exactly.

Paul Dux talking to Sana Qadar who introduces 'All in the mind'

Now I imagine that Prof. Dux is an expert, and he certainly seemed authoritative. Yet, I sensed a kind of concept-creep, that led to a category error, here.

A category error

A category error is where something is thought of or discussed as though a member of an inappropriate class or category. A common example might be gender and sex. At one time it was widely assumed that gender (feminine-masculine) was directly correlated to biological sex (female-male) so terms were interchangeable. It is common to see studies in the literature which have looked for 'sex differences' when it seems likely that the researchers have collected no data on biological sex.

Models that suggest that the 'particles' (molecules, ions, atoms) in a solid are touching encourage category errors among learners: that such quanticles are like tiny marbles that have a definite surface and diameter. This leads to questions such as whether on expansion the particles get larger or just further apart. (Usually the student is expected to think that the particles get further apart, but it is logically more sensible to say they get larger. But neither answer is really satisfactory.)

If someone suggested that a mushroom must photosynthesise because that is how plants power their metabolism then they would have made a category error. (Yes, plants photosynthesise. However, a mushroom is not a plant but a fungus, and fungi are decomposers.)

The issue here, to my mind (so to speak) was the distinction between brain (a material object) and conscious mind (the locus of subjective experience). Whilst it is usually assumed that mind and brain are related (and that mind may arise, emerge from processes in the brain) they may be considered to relate to different levels of description. So, mind and brain are not just different terms for the same thing.

Mind might well arise from brain, but it is not the same kind of thing. So, perhaps the notion of 'tasks' applies to minds, not brains? (Figure from Taber, 2013)

So, it is one thing to claim that the mind can only be actively engaged in one task at a time, but that is not equivalent to suggesting this is true of the brain that gives rise to that mind.2

Prof. Dux seemed to be concerned with the brain:

"the brain…billions of neurons, trillions of synaptic connections… brain regions…lateral prefrontal cortex…populations of neurons"

Yet it seems completely unfounded to claim that human brains do not multitask as we surely know they do. Our brains are simultaneously processing information from our eyes, our ears, our skin, our muscles, etc. This is not some kind of serial process with the brain shifting from one focus to another, but is parallel processing, with different modules doing different things at the same time. Certainly, we cannot give conscious attention to all these inputs at once, so the brain is filtering and prioritising which signals are worth notifying to head office (so to speak). We are not aware of most of this activity – but then that is generally the case with our brains.

The brain controls the endocrine system. The brain stem has various functions, including regulating breathing and heart rate and balance. If the brain cannot multitask we had perhaps better hope it focuses on breathing, although even then I doubt we would survive for long based on that activity alone.

Like the proverbial iceberg, most of our brain activity takes place below the waterline, out of conscious awareness. This is not just the physiological regulation – but a lot of the cognitive processing. So, we consolidate memories and develop intuitions and have sudden insights because our brains are constantly (but preconsciously) processing new data in the light of structures constructed through past experience.

If you are reading, you may suddenly notice that the room has become cold, or that the doorbell is ringing. This is because although you were reading (courtesy of your brain), your brain was also monitoring various aspects of the environment to keep alert for a cue to change activity. You (as in a conscious person, a mind if you like) may not be able to do two things at once, so your reading is interrupted by the doorbell, but only because your brain was processing sensory information in the background whilst it was also tracking the lines of text in your book, and interpreting the symbols on the page, and recalling relevant information to provide context (how that term was defined, what the author claimed she was going to demonstrate at the start of the chapter…). Your mind as the locus of your conscious experience cannot multitask, certainly, and certainly "brain regions that are involved in…attention" are very relevant to that, but your brain itself is still a master of multitasking.

Me, my brain, and I

So, if the brain can clearly multitask, can we say that the person cannot multitask?

That does not seem to work either. The person can thermoregulate, digest food, grow hair and nails, blink to moisten the eye, etc., etc., as they take an examination or watch a film. These are automatic functions. So, might we say that it is the body, not the person, carrying out those physiological functions? (The body of the person, but not the person, that is.)

Yet, most people (i.e., persons) can hold a conversation as they walk along, and still manage to duck under an obstruction. The conversation requires our direct attention, but walking and swerving seem to be things which we can do on 'autopilot' even if not automatic like our heartbeat. But if there was a complex obstruction which required planning to get around, then the conversation would likely pause.

So, it is not the brain, the body, or even the person that cannot multitask, but more the focus of attention, the stream of consciousness, the conscious mind. Perhaps confusion slips in because these distinctions do not seem absolute as our [sic] sense of identity and embodiment can shift. I kick out (with my leg), but it is my leg which hurts, and perhaps my brain that is telling me it is hurting?

Figure by mohamed Hassan from Pixabay; background by Sad93 from Pixabay

Meanwhile, my other brain was relaxing

There is also one sense in which I regularly multitask. I listen to music a lot. This includes, usually, when I am reading. And, usually, when I am writing. I like to think I can listen to music and work. (But Prof. Dux may suggest this is just another example of how humans "are not actually good at knowing our own limitations".)

I like to think it usually helps. I also know this is not indiscriminate. If I am doing serious reading I do not play music with lyrics as that may distract me from my reading. But sometimes when I am writing I will listen to songs (and, unfortunately for anyone in earshot, may even find I am singing along). I also know that for some activities I need to have familiar music and not listen to something new if the music is to support rather than disturb my activity.

Perhaps I am kidding myself, and am actually shifting back and forth between

  • being distracted from my work by my music; and
  • focusing on my work and ignoring the music.

I know that certainly sometimes is the case, but my impression is that usually I am aware of the music at a level that does not interfere with my work, and sometimes the music both seems to screen out extraneous noise and even provides a sense of flow and rhythm to my thinking.

The human brain has two somewhat self-contained, but connected, hemispheres. (Image by Gerd Altmann from Pixabay)

I suspect this has something to do with brain lateralisation and how, in a sense, we all have two brains (as the hemispheres are to some extent autonomous). Perhaps one of my hemispheres is quietly (sic) enjoying my music whilst the other is studiously working. I even fancy that my less verbal hemisphere is being kept on side by being fed music and so does not get bored (and so perhaps instigate a distracting daydream) whilst it waits for the other me, its conjoined twin, to finish reading or writing.

I may well be completely wrong about that.

Perhaps I am just as hopeless at multitasking with my propensity to attempt simultaneous scholarship and music appreciation as those people who think they can monitor social media whilst effectively studying.3 Perhaps it is just an excuse to listen to music when I should be working.

But even if that is so, I am confident my brain can multitask, even if I cannot.


Work cited:

Note:

1 The four minute warning, perhaps. But,

  • Apple are releasing a new iPhone next spring?
  • Another email has arrived inviting me to talk at some medical conference on a specialism I cannot even pronounce?
  • A friend of a friend of a friend has posted some update on social media that I can put into Google translate if I can be bothered?
  • Someone I do not recall seems to have a job anniversary?
  • Someone somewhere seems to have read something I once wrote (and I can find out who and where for a fee)?

Luckily I have been notified immediately as now I know this I will obviously no longer wish to complete the activity I was in the middle of.

2 One could argue that when a person is conscious (be that awake, or dreaming) one task the brain is carrying out is supporting that conscious experience. So, anything else a brain of a conscious person is doing must be an additional task. Perhaps, the problem is that minds carry out tasks (which suggests an awareness of purpose), but brains are just actively processing?

3 As a sporting analogy for the contrast I am implying here, there is a tradition in England of attending international cricket matches, and listening to the 'test match special' commentary (i.e., verbal) on the radio while watching (i.e. visual) the match. This seems to offer complementary enhancement of the experience. But I have also often seen paying spectators on televised football matches looking at their mobile phones rather than watching the match.

Climate change – either it is certain OR it is science

Is there a place for absolute certainty in science communication?

Keith S. Taber

I just got around to listening to the podcast of the 10th October episode of Science in Action. This was an episode entitled 'Youngest rock samples from the moon' which led with a story about rock samples collected on the moon and brought to earth by a Chinese mission (Chang'e-5). However, what caused me to, metaphorically at least, prick up my ears was a reference to "absolute certainty".

Now the tag line for Science in Action is "The BBC brings you all the week's science news". I think that phrase reveals something important about science journalism – it may be about science, but it is journalism, not science.

That is not meant as some kind of insult. But science in the media is not intended as science communication between scientists (they have journals and conferences and so forth), but science communicated to the public – which means it has to be represented in a form suitable for a general, non-specialist audience.

Read about science in public discourse and the media

Scientific and journalistic language games

For, surely, "all the week's science news" cannot be covered in one half-hour broadcast/podcast. 1

My point is that "The BBC brings you all the week's science news" is not intended to be understood and treated as a scientific claim, but as something rather different. As Wittgenstein (1953/2009) famously pointed out, language has to be understood in specific contexts, and there are different 'language games'. So, in the genre of the scientific report there are particular standards and norms that apply to the claims made. Occasionally these norms are deliberately broken – perhaps a claim is made that is supported by fabricated evidence, or for which there is no supporting evidence – but this would be judged as malpractice, academic misconduct or at least incompetence. It is not within the rules of that game.

However, the BBC's claim is part of a different 'language game' – no one is going to be accused of professional misconduct because, objectively, Science in Action does not bring a listener all the week's science news. The statement is not intended to be understood as an objective knowledge claim, but more a kind of motto or slogan; it is not to be considered 'false' because it is not objectively correct. Rather, it is to be understood in a fuzzy, vague, impressionistic way.

To ask whether "The BBC brings you all the week's science news" through Science in Action is a true or false claim would be a kind of category error. The same kind of category error that occurs if we ask whether or not a scientist believes in the ideal gas law, the periodic table or models of climate change.

Who invented gravity?

This then raises the question of how we understand what professional academic scientists say on a science news programme that is part of the broadcast media in conversation with professional journalists. Are they, as scientists, engaged in 'science speak', or are they as guests on a news show engaged in 'media speak'?

What provoked this thought was comments by Dr Fredi Otto who appeared on the programme "to discuss the 2021 Nobel Prizes for Science". In particular, I was struck by two specific comments. The second was:

"…you can't believe in climate change or not, that would just be, you believe in gravity, or not…"

Dr Friederike Otto speaking on Science in Action

Which I took to mean that gravity is so much part of our everyday experience that it is taken-for-granted, and it would be bizarre to have a debate on whether it exists. There are phenomena we all experience all the time that we explain in terms of gravity, and although there may be scope for debate about gravity's nature or its mode of action or even its universality, there is little sense in denying gravity. 2

Newton's notion of gravity predominated for a couple of centuries, but when Einstein proposed a completely different understanding, this did not in any sense undermine the common ('life-world' 2) experience labelled as gravity – what happens when we trip over, or drop something, or the tiring experience of climbing too many steps. And, of course, the common misconception that Newton somehow 'discovered' gravity is completely ahistorical as people had been dropping things and tripping over and noticing that fruit falls from trees for a very long time before Newton posited that the moon was in freefall around the earth in a way analogous to a falling apple!

Believing in gravity

Even if, in scientific terms, believing in a Newtonian conceptualisation of gravity as a force acting at a distance would be to believe something that was no longer considered the best scientific account (in a sense the 'force' of gravity becomes a kind of epiphenomenon in a relativistic account of gravity); in everyday terms, believing in the phenomenon of gravity (as a way of describing a common pattern in experience of being in the world) is just plain common sense.

Dr Otto seemed to be suggesting that just as gravity is a phenomenon that we all take for granted (regardless of how it is operationalised or explained scientifically), so should climate change be. That might be something of a stretch as the phenomena we associate with gravity (e.g., dense objects falling when dropped, ending up on the floor when we fall) are more uniform than those associated with climate change – which is of course why one tends to come across more climate change deniers than gravity deniers. To the best of my knowledge, not even Donald Trump has claimed there is no gravity.

But the first comment that gave me pause for thought was:

"…we now can attribute, with absolute certainty, the increase in global mean temperature to the increase in greenhouse gases because our burning of fossil fuels…"

Dr Friederike Otto speaking on Science in Action
Dr Fredi Otto has a profile page at the Environmental Change Unit, University of Oxford

Absolute certainty?

That did not seem to me like a scientific statement – more like the kind of commitment associated with belief in a religious doctrine. Science produces conjectural, theoretical knowledge, but not absolute knowledge?

Surely, absolute certainty is limited to deductive logic, where proofs are possible (as in mathematics, where conclusions can be shown to inevitably follow from statements taken as axioms – as long as one accepts the axioms, then the conclusions must follow). Science deals with evidence, but not proof, and is always open to being revisited in the light of new evidence or new ways of thinking about things.

Read about the nature of scientific knowledge

Science is not about belief

For example, at one time many scientists would have said that the presence of an ether 3 was beyond question (as for example waves of light travelled from the sun to earth, and wave motion requires a medium). Its scientific characterisation – e.g., the precise nature of the ether, its motion relative to the earth – was open to investigation, but its existence seemed pretty secure.

It seemed inconceivable to many that the ether might not exist. We might say it was beyond reasonable doubt. 4 But now the ether has gone the way of caloric and phlogiston and N-rays and cold fusion and the four humours… It may have once been beyond reasonable doubt to some (given the state of the evidence and the available theoretical perspectives), but it can never have been 'absolutely certain'.

To suggest something is certain may open us to look foolish later: as when Wittgenstein himself suggested that we could be certain that "our whole system of physics forbids us to believe" that people could go to the moon.

Science is the best!

Science is the most reliable and trustworthy approach to understanding the natural world, but a large part of that strength comes from it never completely closing a case for good – from never suggesting to have provided absolute certainty. Science can be self-correcting because no scientific idea is 'beyond question'. That is not to say that we abandon, say, conservation of energy at the suggestion of the first eccentric thinker with designs for a perpetual motion machine – but in principle even the principle of conservation of energy should not be considered as absolutely certain. That would be religious faith, not scientific judgement.

So, we should not believe. It should not be considered absolutely certain that "the increase in global mean temperature [is due to] the increase in greenhouse gases because [of] our burning of fossil fuels", as that suggests we should believe it as a doctrine or dogma, rather than believe that the case is strong enough to make acting accordingly sensible. That is, if science is always provisional, technically open to review, then we can never wait for absolute certainty before we act, especially when something seems beyond reasonable doubt.

You should not believe scientific ideas

The point is that certainty and belief are not really the right concepts in science, and we should avoid them in teaching science:

"In brief, the argument to be made is that science education should aim for understanding of scientific ideas, but not for belief in those ideas. To be clear, the argument is not just that science education should not intend to bring about belief in scientific ideas, but rather that good science teaching discourages belief in the scientific ideas being taught."

Taber, 2017: 82

To be clear – to say that we do not want learners to believe in scientific ideas is NOT to say we want them to disbelieve them! Rather, belief/disbelief should be orthogonal to the focus on understanding ideas and their evidence base.

I suggested above that to ask whether "The BBC brings you all the week's science news" through Science in Action is a true or false claim would be a kind of category error. I would suggest it is a category error in the same sense as asking whether or not people should believe in the ideal gas law, the periodic table, or models of climate change.

"If science is not about belief, then having learners come out of science lessons believing in evolution, or for that matter believing that magnetic field lines are more concentrated near the poles of a magnet, or believing that energy is always conserved, or believing that acidic solutions contain solvated hydrogen ions,[5] misses the point. Science education should help students understand scientific ideas, and appreciate why these ideas are found useful, and something of their status (for example when they have a limited range of application). Once students can understand the scientific ideas then they become available as possible ways of thinking about the world, and perhaps as notions under current consideration as useful (but not final) accounts of how the world is."

Taber, 2017: 90

But how do scientists cross the borders from science to science communication?

Of course many scientists who have studied the topic are very convinced that climate change is occurring and that anthropogenic inputs into the atmosphere are a major or the major cause. In an everyday sense, they believe this (and as they have persuaded me, so do I). But in a strictly logical sense they cannot be absolutely certain. And they can never be absolutely certain. And therefore we need to act now, and not wait for certainty.

I do not know if Dr Otto would refer to 'absolute certainty' in a scientific context such as a research paper or a conference presentation. But a radio programme for a general audience – all ages, all levels of technical background, all degrees of sophistication in appreciating the nature of science – is not a professional scientific context, so perhaps a different language game applies. Perhaps scientists have to translate their message into a different kind of discourse to get their ideas across to the wider public?

The double bind

My reaction to Dr Otto's comments derived from a concern with public understanding of the nature of science. Too often learners think scientific models and theories are meant to be realistic absolute descriptions of nature. Too often they think science readily refutes false ideas and proves the true ones. Scientists talking in public about belief and absolute certainty can reinforce these misconceptions.

On the other hand, there is probably nothing more important that science can achieve today than persuade people to act to limit climate change before we might bring about shifts that are (for humanity if not for the planet) devastating. If most people think that science is about producing absolutely certain knowledge, then any suggestion that there is uncertainty over whether human activity is causing climate change is likely to offer the deniers grist, and encourage a dangerous 'well let's wait till we know for sure' posture. Even when it is too late and the damage has been done, if there are any scientists left alive, they still will not know absolutely certainly what caused the changes.

"…Lord, here comes the flood
We'll say goodbye to flesh and blood
If again the seas are silent
In any still alive
It'll be those who gave their island to survive
…"

(Peter Gabriel performing on the Kate Bush TV special, 1979: BBC Birmingham)

So, perhaps climate scientists are in a double bind – they can represent the nature of science authentically, and have their scientific claims misunderstood; or they can do what they can to get across the critical significance of their science, but in doing so reinforce misconceptions of the nature of scientific knowledge.

Coda

I started drafting this yesterday: Thursday. By coincidence, this morning, I heard an excellent example of how a heavyweight broadcast journalist tried to downplay a scientific claim because it was couched as not being absolutely certain!

Works cited:

Notes

1 An alternative, almost tautological, interpretation (that the BBC decides what is 'science news', and that it is whatever is included in Science in Action) might fit some critics' complaints that the BBC can be a very arrogant and self-important organisation – if only because there are stories not covered in Science in Action that do get covered in the BBC's other programmes such as BBC Inside Science.

2 This might be seen as equivalent to saying that the life-world claim that gravity (as is commonly understood and experienced) exists is taken-for-granted (Schutz & Luckmann, 1973). A scientific claim would be different as gravity would need to be operationally defined in terms that were considered objective, rather than just assuming that everyone in the same language community shares a meaning for 'gravity'.

3 The 'luminiferous' aether or ether. The ether was the name given to the fifth element in the classical system where sublunary matter was composed of four elements (earth, water, air, fire) and the perfect heavens from a fifth.

(Film director Luc Besson's sci-fi/fantasy movie 'The Fifth Element' {1997, Gaumont Film Company} borrows from this idea very loosely: Milla Jovovich was cast in the title role as a perfect being who is brought to earth to be reunited with the other four elements in order to save the world.)

4 Arguably the difference between forming an opinion on which to base everyday action (everyday as in whether to wear a rain coat, or to have marmalade on breakfast toast, not as in whether to close down the global fossil fuel industry), and proposing formal research conclusions can be compared to the difference between civil legal proceedings (decided on the balance of probabilities – what seems most likely given the available evidence) and criminal proceedings – where a conviction is supposed to depend upon guilt being judged beyond reasonable doubt given the available evidence (Taber, 2013).

Read about writing-up research

5 Whether acids do contain hydrated hydrogen ions may seem something that can reasonably be determined, at least beyond reasonable doubt, by empirical investigation. But actually not, as what counts as an acid has changed over time as chemists have redefined the concept according to what seemed most useful. (Taber, 2019, Chapter 6: Conceptualising acids: Reimagining a class of substances).

Is 6% kidney function just as good as 8% kidney function?

A case of justifying dubious medical ethics by treating epistemology as ontology

Keith S. Taber

Image by Mohamed Hassan from Pixabay

I was puzzled by something I heard a hospital doctor say regarding kidney functioning. The gist of his comments was that

  • once kidney function was below about 10% of normal functioning…
  • then protecting remaining kidney function was not important…
  • because estimates of function at that level are unreliable.

I thought this was an illogical argument as it confused ontology (the state of the kidneys and their functioning) and epistemology (how well we can measure kidney function).

The kidneys are essential organs that regulate hydration levels and eliminate toxic materials from the body. They are 'essential' in the sense that without kidney function someone soon dies. Typically healthy people have plenty of scope for contingency in the capacity of their kidneys. (Living kidney donors give up one of their two kidneys for transplantation, so, after donation, they will only have, at best, 50% of normal functioning.) So when people's kidneys start to deteriorate due to disease the patient can continue with normal life for some time. I am not an expert, but from what I understand, a person can manage a normal life with 20% of normal functioning.

Of course there reaches a point in progressive kidney disease when the remaining capacity is not enough to keep someone alive for an extended period. So if kidney function drops to something like an eighth of normal healthy functioning, the situation gets critical.

Kidney dialysis

These days people can have dialysis if their kidneys fail. Someone with 0% kidney function – someone who never excretes any urine at all – can be kept alive indefinitely by dialysis. However this is not ideal. The patient has to attend a clinic and have treatment for 3-4 hours at a time, usually three times a week. No time off – no holidays from dialysis if the patient wants to continue living (and some decide they would rather not continue living, although most 'tolerate' the treatment). Often patients feel unwell on, or after, dialysis – they may say they feel 'washed out', for example. Dialysis also costs the health service (or in some countries, the patient) a good deal of money.

Dialysis patients also have to be very careful about diet and avoid some foods (e.g., eating bananas can lead to dangerously high levels of potassium that can interfere with heart function and could lead to a heart attack), as sessions of dialysis (with no, or very little, blood filtration occurring in-between) are never as good as having constantly functioning kidneys.

Then there's the problem of fluid intake

Dialysis patients are asked to limit their intake of fluids. A healthy person who drinks a lot (whether tap water, tea, beer, etc.) simply produces more urine. Most dialysis patients, however, produce little, if any, urine, and the difference between what they 'should' excrete (to maintain homeostasis), and what they can actually excrete, needs to be removed during the dialysis process. So, whatever water a patient takes in drinks during the 45 or so hours between sessions (and is not lost through some other mechanism such as sweating or breathing), is all taken off during three or so hours on the machine. This brings about changes in the blood volume much more quickly than is comfortable. As the body cannot remove excess fluid via the kidneys, fluid intake means the fluid levels build up between dialysis sessions which can lead to various complications such as increases in blood pressure.

Dr McCoy is unimpressed by 20th Century medicine (Star Trek IV: The Voyage Home, Paramount Pictures)

So, having kidney function of, say, 10% or less of normal is a real pain and requires reorganising your entire life around your dialysis sessions (or perhaps getting a transplant if you are strong enough for surgery and are lucky enough that a good match can be found).

That provides some background in considering whether, once kidneys have deteriorated below, say 10%, it really makes any difference in worrying about the actual level. If you have 8% of normal functioning and are on dialysis for life, why would it matter if that fell to 6%?

An actual case

The context of this question was a patient with kidney failure or end-stage renal disease (a haemodialysis* patient, who would only live a matter of days without regular treatment) who was given a CAT scan** using a contrast medium*** to show up features that would not be observable otherwise. Such media are widely considered to have some toxicity in relation to the kidneys (Ahmed, Williams & Stott, 2009), but in a healthy person they are eliminated through the kidneys quite quickly and any risk is considered small. A person with kidney failure does not eliminate toxins in this way, and so when a scan is indicated, it can be scheduled for just before their next dialysis session.

"In every study comparing patients with and without some degree of renal insufficiency [kidneys not functioning adequately], renal insufficiency increased the likelihood of RCIN [radiocontrast-medium-induced nephropathy, i.e., kidney damage due to the use of contrast media]"

"Both peritoneal and hemodialysis remove substantial amounts of the contrast medium (50% to 90% of the dose); hemodialysis is more effective."

Solomon, 1998: 230, 236.

This patient, however, was admitted to a hospital very ill. The emergency department doctor ordered an immediate scan – late at night, at a weekend – but told the patient that the on-call dialysis staff could be called in to give dialysis after the scan. At the X-ray department, the radiographer then said that this was not needed, as long as the patient had dialysis within 24 hours of the scan.

The renal doctor's viewpoint

The next afternoon, the patient had still not gone for dialysis when the hospital renal doctor visited the patient. This doctor took the view that as the patient was due their regular dialysis the following day (i.e., about 38 hours after the scan), there was no point sending the patient for an additional dialysis session, as – after all – the kidneys had already failed sufficiently for the patient to be relying on dialysis for survival.

The patient's viewpoint

The counter-argument presented to the renal specialist (by the patient's spouse) was that even at this point further deterioration should be avoided if possible – that even if 8% of normal kidney function was not good, it was inherently better than 6% of normal kidney function.

After all, if for some reason a patient was further compromised (by an unrelated illness, or by delay in accessing normal dialysis due to some unexpected contingency) a few percentage points – making a small difference in how much the body could remove toxins and excess fluid from the blood by itself between dialysis sessions – could still be the critical factor in determining whether the person survived. (Those attending hospital dialysis notice the high frequency of fellow regular patients who, suddenly, are no longer attending for treatment.)

The renal doctor's justification

The doctor responded to this with the counter-argument that once kidney function was this low, there was no reason to be concerned about a change in measured kidney function from (say) 8% to 6% as the difference between such measurements was within the usual variations in measurements found in patients from time to time.

There are two issues here of interest.

Consent that is conditional is not consent if the conditions are broken

One issue relates to ethics (here, medical ethics). A patient consented to a diagnostic procedure with a possible risk of side effects on the understanding that a suitable counter-measure would be taken immediately after the procedure to minimise any detrimental effect. The hospital undertook the procedure, but then decided (when it was too late for the patient to withdraw consent) not to follow through on the promised counter-measure. In effect, a procedure was carried out without consent as the consent was (as was made absolutely clear by the patient) conditional on the scan being followed by dialysis.

Reasons for refusing to provide treatment

The second issue relates to the justification given by the doctor as reported above.

The day after the explanation about measurement not clearly distinguishing between 8% and 6% functioning had been made, when dialysis was finally provided, another renal specialist offered a different justification entirely – that the potential risk to kidneys of the contrast medium was just a myth. However, the earlier conversations

  1. in the emergency department;
  2. in the X-ray department; and
  3. with the first renal doctor within 24 hours of the scan,

were all clearly undertaken on the basis that both patient and medical staff thought the contrast medium was potentially damaging to kidneys.

"These contrast media can occasionally cause kidney damage, especially in patients who already have kidney disease"

Ahmed, Williams & Stott, 2009

In the context of that discourse, the first renal specialist had argued that because (a) the precision of estimates of kidney function was not great enough to reliably measure a difference between 6% and 8% functionality, then (b) there was no need to be concerned about treatment which could potentially cause damage bringing about deterioration of this order.

Presumably,

  • at any one time, a person's kidney function will be at a certain level.
  • If the kidney is then further damaged by toxins then that functionality will drop.
  • A more damaged kidney is inherently less desirable than a less damaged (better functioning) kidney.
  • So further damage to an already damaged kidney is inherently undesirable,
  • and should be avoided if possible, if the costs of doing so are not too high.

The state of a diseased person's kidneys could vary slightly 'naturally' in response to various factors related to their general health, diet, environment, etcetera. This is an ontological consideration – the actual state of the kidneys changes. This may well mean that changes of a few percent between measurements could just be natural fluctuation.

It may therefore be difficult to tell if a person's kidneys have become more damaged due to a particular event, such as a diagnostic scan. That is an epistemological issue – the limitation on how well we can identify a specific change that is masked by noise.

Presumably, there are also various factors that limit the precision of such estimates – all measurements are subject to errors, and small (real) differences may be difficult to identify if they are at the level of the likely measurement error. That is also an epistemological issue.
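To make the distinction concrete, here is a minimal sketch of a real (ontological) drop in function being hidden by measurement error (an epistemological limit). The 8% and 6% figures are the ones discussed in this post; the size of the measurement error (NOISE_SD) and the assumption that errors are normally distributed are purely hypothetical choices for illustration, not clinical values.

```python
import random


def simulate_measured_drops(true_before, true_after, noise_sd, n_trials, seed=0):
    """Simulate pairs of noisy readings and return the apparent drop for each pair."""
    rng = random.Random(seed)
    drops = []
    for _ in range(n_trials):
        # Each reading is the true value plus a random measurement error.
        measured_before = true_before + rng.gauss(0.0, noise_sd)
        measured_after = true_after + rng.gauss(0.0, noise_sd)
        drops.append(measured_before - measured_after)
    return drops


TRUE_BEFORE, TRUE_AFTER = 8.0, 6.0  # percent of normal function (figures from the post)
NOISE_SD = 2.0  # hypothetical measurement error, in percentage points

drops = simulate_measured_drops(TRUE_BEFORE, TRUE_AFTER, NOISE_SD, n_trials=10_000)
apparent_no_drop = sum(1 for d in drops if d <= 0) / len(drops)

print(f"True drop in every trial: {TRUE_BEFORE - TRUE_AFTER} percentage points")
print(f"Share of trials where the readings show no drop at all: {apparent_no_drop:.1%}")
```

In this toy model the kidneys have lost function in every single trial, yet a sizeable share of the paired readings fail to show it. The measurements are unreliable; the damage is not.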

But, just because an effect cannot be clinically measured (epistemology), that does not mean it is not real and will not have consequences (ontology). A drop from 8% kidney function to 6% kidney function is only a change of 2% compared with normal functioning, BUT it is a loss of 25% of the patient's actual kidney function.
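The arithmetic behind those two figures can be checked with a short sketch. The function name is my own invention for illustration; the 8% and 6% values are those used above.

```python
def absolute_and_relative_drop(before_pct, after_pct):
    """Return the drop in percentage points of normal function,
    and the same drop as a percentage of the function the patient had."""
    absolute = before_pct - after_pct
    relative = 100.0 * absolute / before_pct
    return absolute, relative


absolute, relative = absolute_and_relative_drop(8.0, 6.0)
print(f"Compared with normal functioning: {absolute:.0f} percentage points")
print(f"Compared with the patient's own remaining function: {relative:.0f}%")
```

The same change looks trivial on one scale and substantial on the other, which is why the choice of baseline matters when judging whether a deterioration is worth avoiding.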

A small deterioration in already severely compromised kidneys may seem insignificant to the renal doctor because he does not think he could reliably measure the change. One day it could be the difference between life and death to the kidneys' owner.

Sources cited:
  • Ahmed, A., Williams, G., & Stott, I. (2009). Patient information-What I tell my patients about contrast medium nephrotoxicity. British Journal of Renal Medicine, 14(3), 15-18.
  • Solomon, R. (1998). Contrast-medium-induced acute renal failure. Kidney International, 53(1), 230-242.

* haemodialysis involves the patient having permanent 'plumbing' installed that allows their vascular system to be connected to a dialysis machine, so the blood can be diverted to the machine to be cleaned. This is usually done using blood vessels in the arm. In the case discussed the surgeon cut into the neck and chest (with the patient fully conscious), and connected tubing to a vein in the neck. The tubing was run beneath the skin to exit in the chest below the neckline, where a fitting acted as a tap and connector for the external tubing to the machine. Very special care has to be taken to keep the area clean, and the dressing dry, as the plumbing provides a direct route into the bloodstream. (Baths, swimming, hot-tubs, etc. are not advisable.)

[Peritoneal dialysis is an alternative treatment that involves a catheter being implanted in the abdomen, and being used to allow a solution into the abdominal cavity, which is later removed after it has absorbed waste materials. The patient can manage the process at home, but needs to change the solution in the abdomen a number of times each day.]

** computerised tomography: a process that uses a series of X-ray bursts to collect data that can be compiled into a 3-D image.

*** a substance that shows up on X-ray scans, and which when injected into the blood helps detect vascular structures. (The term is generic – it also applies to substances swallowed before scans of the alimentary canal.)

Note: this post was originally prepared in October 2015, but was not published at the time when the patient was alive and attending for treatment.