'Building up a profile of the pedagogic effectiveness of some innovation'
A topic in research methodology
Generalising from a single study
Generalisation is the process of applying findings from a research study beyond the immediate context of that study. Common approaches are statistical generalisation (from a sample to a population), analytic or theoretical generalisation (from a case to other cases argued to be similar in the most pertinent ways) and reader generalisation (from a case described in sufficient detail to another context that is judged sufficiently similar).
Read about generalisation in research
Generalising through programmatic research
The term 'incremental generalisation' was intended to be used in relation to a programme of research rather than a single study. The assumption behind this concept is that social contexts (for example, teaching and learning contexts) vary across a wide range of variables, and this makes it very difficult to reach general conclusions.
Poetic chemistry?
As an example consider the hypothesis that teaching chemistry in verse leads to greater learning gains than teaching chemistry in prose. (Of course, this is a hypothetical example, which would only be tested in practice if there was a strong conceptual framework for the study providing a reasonable rationale for suspecting this might be the case!)
Imagine someone tested this experimentally with 14-16 year old students in an 11-18 secondary school in Cambridge, England, and found that the hypothesis was supported, as the mean learning gain of those taught chemistry in verse was significantly greater than the mean learning gain of those taught chemistry in prose. Moreover, the difference in outcomes between the two conditions was not only statistically significant (so unlikely to be due to chance factors) but had a large effect size, suggesting that the difference was not just measurable, but likely to make a substantive difference to learners.[1]
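As a rough, entirely hypothetical illustration of what 'statistically significant, with a large effect size' involves, the sketch below compares two invented sets of learning-gain scores using an independent-samples t-test (for significance) and Cohen's d (for effect size). The scores and variable names are made up for illustration; this is not data from any actual study.

```python
# A minimal sketch: comparing hypothetical learning gains in the two conditions.
# The score lists are invented purely for illustration.
from statistics import mean, stdev
from scipy import stats

verse_gains = [12, 15, 14, 18, 11, 16, 13, 17, 15, 14]   # hypothetical gains, verse condition
prose_gains = [9, 11, 8, 12, 10, 9, 11, 10, 8, 12]        # hypothetical gains, prose condition

# Statistical significance: how unlikely is a difference this large under chance alone?
t_statistic, p_value = stats.ttest_ind(verse_gains, prose_gains)

# Effect size (Cohen's d with a pooled standard deviation): is the difference substantive?
n1, n2 = len(verse_gains), len(prose_gains)
pooled_sd = (((n1 - 1) * stdev(verse_gains) ** 2 + (n2 - 1) * stdev(prose_gains) ** 2)
             / (n1 + n2 - 2)) ** 0.5
cohens_d = (mean(verse_gains) - mean(prose_gains)) / pooled_sd

print(f"t = {t_statistic:.2f}, p = {p_value:.4f}, Cohen's d = {cohens_d:.2f}")
```

By convention a Cohen's d of around 0.8 or more is described as a 'large' effect, though what counts as educationally substantive is a matter of judgement rather than arithmetic.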
We might wonder if the same results would apply to
- a different cohort of students (for example, the following year), or
- with different teachers, or
- in another secondary school on the other side of the city, or,
- indeed, one of the ('private' or 'independent') Cambridge schools which select their students from among those who can afford high school fees and are not accessible to most young people, or
- with 16-18 year old students studying at a higher level, or
- students at a school in a socially deprived Lincolnshire coastal town, or
- in high schools in the United States, or
- for students in Paris being taught chimie en français, or
- in classes in a refugee camp in Chad, or…
Clearly, the further we move from the original context, the less willing we should be to assume we can generalise from a single study.
Learning from a series of studies in different contexts
But what if there were a good many studies, in different contexts, and most, but not all, found an advantage to teaching chemistry in rhyme?
The argument here then is that large-scale RCTs that use representative samples from populations of interest are necessarily rare in education. What are more common are individual small-scale experiments that cannot be considered to offer highly generalisable results. Despite this, where these individual studies are seen as being akin to case studies (and reported in sufficient detail) they can collectively build up a useful account of the range of application of tested innovations. That is, some inherent limitations of small-scale experimental studies can be mitigated across series of studies, but this is most effective when individual studies offer thick description of teaching contexts and when contexts for 'replication' studies are selected to best complement previous studies.
Taber, 2019: 106
Under these circumstances, we can work towards 'incremental generalisation'.
Incremental generalisation
But how do we plan studies to contribute to such a programme? That is going to depend upon the current profile of studies testing an idea – the range of contexts tested, and the pattern of results found.
Efficient programmes of research of this kind require those planning individual studies to be able to gauge the variation across previously published studies. If the literature suggests mixed outcomes from previous testing, then what is indicated are further tests which can help determine the kinds of conditions that (do and do not) favour the effectiveness of the innovation from within the broad range of populations that have given inconsistent outcomes. If, however, the literature suggests something is very widely effective, then further tests will be most useful in situations outside the scope of existing studies (has it yet been tested with very young learners, with very disengaged learners, with the gifted, with traumatised students in migrant camps, with visually impaired students…?)
Over time, then, such programmes of 'replications' offer an opportunity to build up an account of the (multi-dimensional) ranges of effectiveness of different teaching approaches/curricula/resources. This does rely on 'negative' results being published as well as 'positive' results.[2] Knowing the characteristics of contexts where some innovation does not seem to be effective avoids wasting the expenditure of precious teacher time and other resources implementing something when the available evidence suggests (we can never be sure of course) it is unlikely to offer an educational return in a particular teaching context. Indeed, it is not appropriate to think of study outcomes as positive or negative replications, but contributions to building up a profile of the pedagogic effectiveness of some innovation. In this context, reporting a poor educational outcome is as valuable as reporting a good outcome…
Taber, 2020: 22
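To make the idea of building up such a profile concrete, here is a minimal sketch of how a series of small studies might be tabulated so that the pattern of contexts and outcomes can be inspected. The study records, context characteristics and field names are hypothetical, invented for illustration rather than drawn from any real body of studies.

```python
from collections import defaultdict

# A hypothetical 'profile': each record notes context characteristics of one small study
# of the same innovation, and whether a learning advantage was found there.
studies = [
    {"country": "England", "age_range": "14-16", "school_type": "state comprehensive", "positive": True},
    {"country": "England", "age_range": "14-16", "school_type": "independent",         "positive": True},
    {"country": "France",  "age_range": "14-16", "school_type": "state comprehensive", "positive": True},
    {"country": "USA",     "age_range": "16-18", "school_type": "state comprehensive", "positive": False},
]

def profile(studies, dimension):
    """Count positive/negative outcomes for each value of one context characteristic."""
    counts = defaultdict(lambda: [0, 0])          # value -> [positive, negative]
    for study in studies:
        counts[study[dimension]][0 if study["positive"] else 1] += 1
    return dict(counts)

for dim in ("country", "age_range", "school_type"):
    print(dim, profile(studies, dim))
```

Laid out this way, it becomes easier to see which kinds of context have not yet been tested at all, and along which dimensions the outcomes begin to diverge.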
Selecting research sites for incremental generalisation
"When there is a series of studies testing the same innovation, it is most useful if collectively they sample in a way that offers maximum information about the potential range of effectiveness of the innovation. There are clearly many factors that may be relevant. It may be useful for replication studies of effective innovations to take place with groups of different socio-economic status, or in different countries with different curriculum contexts, or indeed in countries with different cultural norms (and perhaps very different class sizes; different access to laboratory facilities) and languages of instruction (Taber, 2012). It may be useful to test the range of effectiveness of some innovations in terms of the ages of students, or across a range of quite different science topics. Such decisions should be based on theoretical considerations.
…Progress in the field will then be best facilitated by a principled programme that complements existing studies by deliberately seeking to build systematically upon published studies when selecting the contexts of further replications."
Taber, 2019: 104-5
"If all existing studies report positive outcomes, | then it is most useful to select new samples that are as different as possible from those already tested. … |
When existing studies suggest the innovation is effective in some contexts but not others, | then the characteristics of samples/context of published studies can be used to guide the selection of new samples/contexts (perhaps those judged as offering intermediate cases) that can help illuminate the boundaries of the range of effectiveness of the innovation." |
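The quoted guidance amounts to a simple decision rule, which the sketch below expresses schematically. The tag-based representation of contexts and the overlap-based similarity measure are my own simplifying assumptions for illustration, not a procedure taken from the cited papers.

```python
# Schematic decision rule for choosing the next study context, following the guidance above.
# Contexts are represented (crudely) as sets of descriptive tags; similarity is tag overlap.

def similarity(context_a, context_b):
    """Fraction of tags shared between two context descriptions (0 = nothing in common)."""
    return len(context_a & context_b) / len(context_a | context_b)

def choose_next_context(tested, outcomes, candidates):
    """tested: tag-sets already studied; outcomes: parallel list of bools; candidates: untested tag-sets."""
    if all(outcomes):
        # All studies positive so far: pick the candidate least similar to anything already tested.
        return min(candidates, key=lambda c: max(similarity(c, t) for t in tested))
    # Mixed outcomes: pick an intermediate case, roughly as similar to the contexts where
    # the innovation worked as to the contexts where it did not.
    worked = [t for t, ok in zip(tested, outcomes) if ok]
    failed = [t for t, ok in zip(tested, outcomes) if not ok]
    return min(candidates, key=lambda c: abs(max(similarity(c, t) for t in worked)
                                             - max(similarity(c, t) for t in failed)))

# Hypothetical example: two contexts with positive outcomes, one with a negative outcome.
tested = [{"England", "14-16", "state"}, {"France", "14-16", "state"}, {"USA", "16-18", "state"}]
outcomes = [True, True, False]
candidates = [{"England", "16-18", "state"}, {"Chad", "14-16", "refugee camp"}]
print(choose_next_context(tested, outcomes, candidates))   # picks the intermediate English 16-18 context
```

In practice, of course, such judgements are theoretical and qualitative rather than computed, but the rule makes explicit why different patterns of prior results point towards different choices of new research sites.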
Notes:
1 Unlike in some laboratory studies, it is extremely difficult to be sure experiments in education are 'fair tests' as there are usually many potentially confounding variables that cannot be controlled – even if they can be identified.
Read about experimental research
2 That is, we need to be wary of publication bias, where studies which do not support an initial hypothesis are less likely to be reported in the literature.
Taber, K. S. (2019). Experimental research into teaching innovations: responding to methodological and ethical challenges. Studies in Science Education, 55(1), 69-119. doi:10.1080/03057267.2019.1658058 [Download manuscript version]
Taber, K. S. (2020). Is reproducibility a realistic norm for scientific research into teaching? HPS&ST Newsletter (April 2020), 13-23
My introduction to educational research:
Taber, K. S. (2013). Classroom-based Research and Evidence-based Practice: An introduction (2nd ed.). London: Sage.