Samples from populations

A topic in research methodology

In research, it is often impractical, or even impossible, to collect data from the full population of interest – so the population is sampled.

Here is an example of a claim about a population ('scientists'), made on a website recommended by a visitor to this site (under the comments on 'The moon is a long way off and it is impossible to get there'):

"there are more scientists who support the hoax theory, than those who refute it"

The 'hoax theory' is the claim that NASA did not really send missions to the moon, i.e., that the Apollo programme's moon landings were faked.

http://apollofacts.atspace.co.uk

The website supports this claim by the following argument:

"To start off with here is a list of scientists who support the Moon landing as being genuine. Prof. Michael Brant Shermer, American. Prof. Steven I. Dutch, American. Prof. Brian Cox, British. Prof. Harald Lesch, German. A grand total of 4.

Now here is a list of scientists who support the Moon landing as a hoax. Prof. Lawrence S. Pinsky, American. Prof. James M. McCanney, American. Prof. Luke Sargent, American. Prof. André Balogh, British. Prof. Colin Rourke, British. Prof. Krassimir Ivanov Ivandjiiski, Russian. Prof. Takahiko Soejima, Japanese. Prof. Li Zifeng, Chinese. Prof. Federico Martín Maglio, Argentinean. That makes a grand total of 9.

So the Moon hoax supporters are out in front on 69% with Moon landing supporters way behind on 31%."

http://apollofacts.atspace.co.uk

There are two ways of reading this argument. One is that there are 13 scientists in the world with a view on this matter. This is how the information is presented ("A grand total of 4"): but it is clearly not the case. So, presumably the author(s) consider(s) that these 13 scientists can represent scientists in general – they are a sample of scientists with an opinion on whether the moon landings really took place. (No information is given on why these names are chosen.) Yet very few people would find this sampling convincing.

Valid sampling

In order for inferences to be validly drawn from a sample to a population (especially when the sample comprises a small proportion of the population of interest) there must be good reason to believe the sample is representative of the population.
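The difference between a representative sample and a hand-picked list can be sketched in a few lines of Python. This is only an illustration under invented assumptions: the population size (10,000), the 5% figure and the sample size of 400 are made up for the sketch and say nothing about what scientists actually think. The hand-picked list simply mirrors the 9-versus-4 split quoted above.

```python
import random

def sample_proportion(population, n, rng):
    """Estimate the share of True values from a simple random sample of size n."""
    sample = rng.sample(population, n)
    return sum(sample) / n

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 people, 5% of whom hold some view (True).
# These numbers are invented for illustration only.
population = [True] * 500 + [False] * 9_500
true_share = sum(population) / len(population)

# A hand-picked list of 13 names, 9 of whom hold the view (as in the quoted lists).
hand_picked = [True] * 9 + [False] * 4
hand_picked_share = sum(hand_picked) / len(hand_picked)

# Repeated simple random samples of 400: the estimates scatter around the true share.
random_estimates = [sample_proportion(population, 400, rng) for _ in range(1000)]
mean_estimate = sum(random_estimates) / len(random_estimates)

print(f"true share in population:        {true_share:.3f}")
print(f"share in hand-picked list:       {hand_picked_share:.3f}")
print(f"mean of random-sample estimates: {mean_estimate:.3f}")
```

The point of the sketch is that the share found in a hand-picked list reflects only how the list was compiled, whereas estimates from random samples cluster around the population value.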

Read about populations in research

Read about sampling in research

Sadly many published studies are vague (or ambiguous) about the population of the study and may give readers little reason to think samples are representative of the populations specified (or, often, implied).



The table below offers examples of how samples are described in some published research reports.

Sampling within a local population

Leopold and Smith (2020) report a study in the context of an undergraduate "general chemistry, problem-based lab course" in a university in the United States of America. The design of the course required learners to work in laboratory teams of 3-4 students. Being able to work with others in effective teams is a valid educational objective in its own right and reflects an example of a 'transferable' competence that will later be useful in the workplace – whether that is a laboratory or elsewhere. So, students were expected to monitor their group-work, identify any issues and develop strategies to address these. To ensure this was done, specific time was assigned to undertake reflective activities where students effectively logged responses to cue questions, and students were required to submit process-focussed evaluations after each laboratory practical activity ('experiment').

The cohort of learners (the 'population' for this study) numbered nearly 800, and it was felt this gave too many student groups to collect and analyse detailed qualitative data for the study. So sampling was used. As can often be the case in real world contexts, the form of sampling the researchers used was something of a hybrid between being purposive and random.

There were 19 instructors, each responsible for a number of the student groups. The researchers decided they wanted to include one group from each instructor in their sample, as clearly the experience (e.g., level and nature of support) of groups supervised by different instructors might vary. But then they randomly chose one group from each of the instructors.
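This kind of selection (a fixed stratum for each instructor, with a random choice within each stratum) can be sketched as below. The sketch is not the authors' procedure: the instructor and group labels, the figure of 12 groups per instructor, and the function name are hypothetical, chosen only to make the example runnable.

```python
import random

def pick_one_group_per_instructor(groups_by_instructor, seed=None):
    """Randomly choose one group from each instructor's list of groups."""
    rng = random.Random(seed)
    return {
        instructor: rng.choice(groups)
        for instructor, groups in groups_by_instructor.items()
    }

# Hypothetical data: 19 instructors, each supervising 12 groups (invented figure).
groups_by_instructor = {
    f"instructor_{i:02d}": [f"group_{i:02d}_{g:02d}" for g in range(1, 13)]
    for i in range(1, 20)
}

selected = pick_one_group_per_instructor(groups_by_instructor, seed=1)

for instructor, group in selected.items():
    print(instructor, "->", group)
```

Fixing the instructors in advance guarantees that every instructor's teaching is represented, while the random choice within each instructor's groups avoids the researchers (consciously or not) favouring particular groups.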

An interesting aspect of this study is that there was also an opportunity to test whether the sampling was likely to be representative of the full population (see the figure below). At the end of the course, the full cohort was surveyed (through a course evaluation), and the data collected analysed. It was possible to separate out the survey data from the students in the earlier sample. By analysing data from just that subset of the cohort and then comparing the outcome of the analysis to that from the full-cohort analysis it was possible to see that (in terms of the survey responses) the sample responses very closely matched those of the cohort as a whole.
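A check of this general kind can be sketched as follows. The data here are simulated, and the five-point rating scale, the cohort size of 780, the subset size of 70 and the function names are all assumptions made for illustration; the sketch also draws the subset as individual students, whereas the study sampled whole groups. It does not reproduce the analysis Leopold and Smith carried out.

```python
import random
from collections import Counter

def response_shares(responses, scale=(1, 2, 3, 4, 5)):
    """Return the proportion of responses at each point of a rating scale."""
    counts = Counter(responses)
    return {level: counts[level] / len(responses) for level in scale}

def largest_gap(shares_a, shares_b):
    """Largest absolute difference between two sets of response proportions."""
    return max(abs(shares_a[level] - shares_b[level]) for level in shares_a)

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Simulated responses to one survey item (1 = strongly disagree ... 5 = strongly agree).
cohort = [rng.choice([1, 2, 3, 3, 4, 4, 4, 5, 5]) for _ in range(780)]

# Simulated subset standing in for the students who were in the earlier sample.
subset = rng.sample(cohort, 70)

cohort_shares = response_shares(cohort)
subset_shares = response_shares(subset)
gap = largest_gap(cohort_shares, subset_shares)

print("cohort:", {k: round(v, 2) for k, v in cohort_shares.items()})
print("subset:", {k: round(v, 2) for k, v in subset_shares.items()})
print(f"largest gap in any response category: {gap:.2f}")
```

A small gap on every item would support (though not prove) the claim that the subset resembles the cohort, at least on the things the survey measured.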

This is discussed further in Reflecting the population: Sampling an "exceedingly large number of students".


The design of Leopold and Smith's study allowed them to test the representativeness of their sample, at least in terms of their overall course experiences.


Examples of samples reported in published research

Here are some examples from published studies I have discussed on the site:

Each entry below gives: Description of population; Description of sample; Characterisation of sampling approach; Source.

Description of population:
  The population of this study was undergraduate male and female students attending both intermediate and advanced swimming courses. They consisted of (n= 314) students enrolled at the schools of Sport Sciences at three state universities
Description of sample:
  260 students
Characterisation of sampling approach:
  opportunity sample
  "At each school one class was labelled as the lecture class…and the other class was labelled the inquiry class"
  [No randomisation to condition is reported.]
Source:
  Bayyat, M. M., Orabi, S. M., Al-Tarawneh, A. D., Alleimon, S. M., & Abaza, S. N. (2021). Psychological Skills in Relation to Academic Achievement through Swimming Context. Psychology and Education, 58(5), 4535-4551.

Description of population:
  Paper does not define the population explicitly.
  Undergraduate Students in Rivers State [article title]
  Rivers undergraduate students' [abstract]
  undergraduate Chemistry students [research question]
  year three undergraduate students studying Chemistry Education (B.Sc. Ed) and Pure Chemistry (B.Sc.)…from three universities
Description of sample:
  A total of 60 year three undergraduate students studying Chemistry Education (B.Sc. Ed) and Pure Chemistry (B.Sc.) were randomly drawn from three universities namely; University of Port Harcourt (Uniport), Rivers State University (RSU) and Ignatius Ajuru University of Education (IAUE) with each university contributing 20 students
Characterisation of sampling approach:
  random (no specific technique reported)
  [It seems unlikely all 60 were randomly drawn as one sample given the even distribution across courses and universities – 30 in the sample were Chemistry Education (B.Sc. Ed) and 30 Pure Chemistry (B.Sc.)]
Source:
  Ikiroma, B., Chinda, W., & Bankole, I. S. (2021). Chemistry Laboratory Safety Signs Awareness Among Undergraduate Students in Rivers State. Journal of Chemistry: Education Research and Practice, 5(1), 47-54.

Description of population:
  Paper does not define the population explicitly.
  grade four learners' [paper title]
  two farm schools in Pretoria [abstract]
Description of sample:
  The sample was made up of four grade four classes (n = 45 in school A; n = 71 in school B), two classes from each of the selected two schools.
Characterisation of sampling approach:
  Two grade four classes (n=116) were conveniently and purposively sampled from two farm schools in Pretoria, South Africa
  [The research design seems experimental, but is described as a case study]
Source:
  Mamombe, C., Mathabathe, K. C., & Gaigher, E. (2020). The influence of an inquiry-based approach on grade four learners' understanding of the particulate nature of matter in the gaseous phase: a case study. EURASIA Journal of Mathematics, Science and Technology Education, 16(1), 1-11. doi:10.29333/ejmste/110391

Description of population:
  Higher Secondary Level Students of Dhaka, Bangladesh
Description of sample:
  115 students
  the study sample was from Mirpur Cantonment Public School and College, (11 and 12 class)
Characterisation of sampling approach:
  school chosen "As the institution have good flow of students [?] and the students there were capable of reading and understanding English language easily"
  It is not made clear how 115 were sampled from within the two school year groups
Source:
  Gurung, N. and Khanum, H. (2021) Preventive Practice on Earthquake Preparedness Among Higher Level Students of Dhaka City. Biomedical Journal of Scientific & Technical Research, July, 2020, Volume 37, 2, pp 29274-29281

Description of population:
  Grade eleven learners (four classes, comprising 190 students) at Raphael Kombe Secondary School in Ngungu Township, Kabwe District, Central Province of Zambia.
Description of sample:
  Two of the four classes.
Characterisation of sampling approach:
  'simple random sampling technique'
  Random selection of two classes, then random assignment to treatment (Details given)
  [Research design uses counterbalancing]
Source:
  Magawa, P., & Kalebaila, K. K. (2020). The effect of integrating DARTs on learners academic performance in rates of chemical reaction. International Journal of Chemistry Education Research, 4(2), 67-74.

Where claims are made about a population based on collecting data from a sample, it is important that the population and sample are well-specified, and sampling methodology assures the sample is representative of the population being reported on.
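The doubt raised in the Ikiroma et al. entry (that 60 students drawn at random as a single sample would be unlikely to fall exactly 20 per university) can be given a rough number. The sketch below assumes, purely for illustration, that the three universities contribute equally sized pools of eligible students and that each draw is independent, so that the counts follow a multinomial distribution. Neither assumption comes from the paper, and the function name is hypothetical.

```python
from math import factorial

def prob_exact_even_split(n_total, n_strata):
    """Probability that n_total independent draws, each equally likely to come
    from any of n_strata strata, land in exactly equal numbers per stratum."""
    per_stratum, remainder = divmod(n_total, n_strata)
    if remainder:
        return 0.0  # an exactly even split is impossible
    # Multinomial coefficient: number of orderings that give the even split.
    ways = factorial(n_total) // factorial(per_stratum) ** n_strata
    # Every specific sequence of draws has probability (1 / n_strata) ** n_total.
    return ways / n_strata ** n_total

p = prob_exact_even_split(60, 3)
print(f"P(exactly 20 / 20 / 20) = {p:.4f}")
```

Under these assumptions an exact 20/20/20 split would be expected in only about 1.4% of single random draws of 60, and an exact 30/30 split across the two courses at the same time would be rarer still. That is consistent with the reading that students were sampled separately within each university (a form of stratified sampling), although the paper does not say so.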


Work cited:
  • Leopold, H., & Smith, A. (2020). Implementing Reflective Group Work Activities in a Large Chemistry Lab to Support Collaborative Learning. Education Sciences, 10(1), 7. https://www.mdpi.com/2227-7102/10/1/7

My introduction to educational research:

Taber, K. S. (2013). Classroom-based Research and Evidence-based Practice: An introduction (2nd ed.). London: Sage.