05. Description of methodology



Chapter 5 of Understanding Chemical Bonding: The development of A level students' understanding of the concept of chemical bonding







§5.0: The purpose and structure of the chapter

In the previous chapter the research methodology applied to this study was outlined and justified. In the present chapter the techniques of data collection (§5.1) and analysis (§5.2) are described in more detail. The origin of the categories used to analyse the data will also be considered (§5.2.3 and §5.2.4). The mode of analysis of the interview and other data will be described, including the transcription procedure (§5.2.2), the manner in which transcripts were used to construct cases (§5.3), and the stages in which the model in chapter 6 was derived (§5.4).


§5.1: Details of data collection techniques

§5.1.1: Interview procedure

At the beginning of the first interview given by a colearner, he or she would be told, as part of the introductory comments,

I am conducting some research into how students learn about chemistry during their A level course. I am going to show you some diagrams, and ask you some questions about them. I want to explore your ideas and your understanding so I will often follow up your answers with more questions, and I may challenge you to try and explain your ideas. In order to probe your ideas I will not be judging your answers as right or wrong but will try and explore what you really think. So I may seem to go along with answers that I don't think are quite correct, and I could seem to disagree with others, even if I really agree with what you have said.

As discussed in chapter 4, the interviews followed the 'interview-about-instances' approach (§4.6.2) "using a series of pictures as a focus" (Watts, Gilbert and Pope, 1982, p.11).

§5.1.2: The focal diagrams

Focal diagrams for use in the interviews were drawn on A4 sheets. The diagrams were prepared manually, but as neatly as possible. For example, the image from focal figure 5 is reproduced below:

focal figure 5

Initially a deck of 17 diagrams was prepared, although this was considerably supplemented during the research in response to on-going reflection and analysis (c.f. §4.4.1).

The original deck of 17 diagrams is reproduced in appendix 12. The figures were intended to provide opportunities to talk about aspects of chemistry that had been identified as significant: atomic binding (focal figure 1), covalent bonding (2, 4, 7, 17), ionic bonding (5, 9), metallic bonding (5), polar bonding (3, 8, 10, 11, 14, 15, 16), multiple bonding (4, 12, 13), hydrogen bonding (11), dative bonding (15, 16), van der Waals' forces (17) and resonance (12, 13, 14).

As a result of reflecting on the data obtained during interviews, additional focal figures were added (i.e. there was theoretical sampling, §4.4.1). In order to elicit colearners' understanding of relevant notions regarding forces, a sequence of figures representing contexts for discussing forces was prepared. The early figures in this sequence were designed to have little obvious connection with chemistry (e.g., a falling apple), but later figures represented configurations of charged particles that could represent atoms or molecules (see appendix 13).

Additional focal figures were also produced to expand the original deck of diagrams of chemical species. A further 17 diagrams representing chemical species were prepared during the time the first cohort of colearners (Annie to Debra) were being interviewed. These are reproduced in appendix 14. As a result of reflecting on the data collected from this cohort, and from subsequent on-going data collection a further 15 figures were added to the deck during the time the interviews with the second cohort (Jagdish to Umar) were being undertaken. These are reproduced in appendix 15.

The additional diagrams of chemical structures were designed to complement the initial 17. They were produced for a range of reasons. For example, if I was unsure whether a colearner's comments were influenced by the type of representation used – perhaps channelling their thinking in a particular way – an alternative representation of the same species could be offered to see whether it would be construed differently. Other new figures were meant to represent bonding phenomena not covered in the original deck, to provide additional contexts for discussing learners' ideas.

Some of the additional figures, such as focal figures 18 and 30, were meant to provide a more explicit context for discussing orbital ideas. Other figures were intended to focus student thinking on the electrostatic nature of interactions at the molecular level. As the significance of 'octet thinking' (see chapter 11) amongst colearners emerged during analysis, some diagrams were prepared to challenge this. Examples of the rationale for adding specific new figures to the deck are provided in appendix 16.

§5.1.3: Interview questions

Part of the rationale for selecting interviewing as a technique is its inherent flexibility (§4.5). Therefore there was no detailed interview schedule prepared before the research commenced (§4.6.2).

It was intended to work through the deck of figures, and to start discussion about each by asking colearners three questions:

  • a) what was the figure meant to represent?
  • b) was there any bonding in the species/substance represented? (and if so)
  • c) what type(s) of bonding?

Often – especially once the colearner became accustomed to the questions – the second and third questions were taken as implicit in the first. The interview style was informal, and a fixed form of precise wording was not thought to be appropriate. So, for instance, in the first interview undertaken the actual forms of these questions included:

  • "I wonder if you can tell me what you think it's meant to be?"
  • "Any idea what that's meant to be?"
  • "What do you think this is?"
  • "What about this, any idea about this?"
  • "So would you say there is a chemical bond there?"
  • "Are there any bonds in that diagram, do you think?"
  • "Is there any bonding in that molecule?"
  • "Would you say that there was any kind of bonding there?"
  • "Do you think there is any kind of bonds between the atoms?"
  • "So in that diagram, have we got any kind of chemical bond?"
  • "Can we see any bonds there?"
  • "Are there any bonds, do you think, in that picture?"
  • "Is there any chemical bonding there?"
  • "Any idea what kind of chemical bond that would be. Would you give it a name?"
  • "What kinds of bonds are they?"
  • "Right, what kinds of bonds have we got there do you think?"
  • "Do you know what kind of bonding that might be?"

One of the 'issues' that concerned me when I commenced the research was the distinction between the representation and that which is represented, and the extent to which aspects not made explicit in a diagram are understood by the learner to be implicit. For example, in focal figure 26 the arrangement of electrons is intended to imply synchronised (transient) dipoles to an observant interviewee. This is not shown in focal figure 17, but the van der Waals' interaction that results from the synchronisation of the dipoles could be seen to be implied by the representation of the molecules as occupying lattice positions, as in a solid. Therefore, for a colearner to explain why he or she thinks focal figure 17 shows van der Waals' forces, a substantial chain of inferences could be invoked. However, this does not necessarily mean that the colearner is consciously aware of this chain of logic in suggesting focal figure 17 represents a substance held together by van der Waals' forces.

focal figure 17

focal figure 26

In practice, the flexibility of the interview process – which allowed re-wording of questions, and paraphrasing of responses so they could be reflected back to colearners for confirmation or otherwise – enabled the figures to provide effective foci for launching discussions about types of bonding in various materials, without the mode of representation being a major problem. That is, generally once a figure was perceived as representing a substance or class of substance, the colearner's background knowledge and understanding tended to be accessed regardless of specific aspects of the representation. There were some exceptions: on occasion Kabul would answer questions based on his reading of a figure, although he did not think his answers related to the actual structure of the substance represented (see appendix 36, §A36.1.12). Also, some aspects of some of the figures were challenged or queried (the representation of electrons in pairs in figures such as focal figure 21, and two of the electrons in focal figure 4 that were judged as 'not doing anything'!).

Where the mode of representation did seem to be a major constraint on responses this was often found to be a significant clue to student thinking. So for example focal figure 14 was sometimes taken to mean that boron trifluoride contains covalent and ionic bonds at the same time because the canonical forms were independently read as molecular structures. This often suggested an ignorance of resonance (§9.4.3), but also an acceptance of a dichotomous classification of bonds (§11.6). Similarly focal figure 15 was often interpreted as ionic (as a compound of a metal and non-metal, ignoring the mode of representation), or as covalent (due to the mode of representation, but ignoring the electronegativity difference). That these classifications continued when colearners had demonstrated that they had the concepts available to consider an intermediate option was found to be characteristic of a common way of thinking about chemical bonds (§11.6.2).

§5.1.4: The triad elements used in the present study

In chapter 4 the rationale for using George Kelly's construct repertory test in this research was established (§4.7.2). In this study my main technique has been interviews, using focal diagrams I prepared especially for the research. With the method of triads, as well as the mode of elicitation being different, I decided it was appropriate to use different focal diagrams from those used in the interviews. For the triads I built decks of cards by photocopying diagrams from textbooks. In this way the diagrams have the extra face validity of being figures already in the public domain as representations of chemical species.

In fact two separate decks of cards were prepared for the work: one based on texts designed for students taking chemistry prior to A level (so that they might be expected to be familiar to students embarking on an A level course) and a second deck from A level texts. A selection of triad elements from each of these decks is presented in appendices 17 and 18 respectively. (I will refer to the cumbersome 'triad elements' rather than the usual 'elements' to avoid possible confusion with chemical elements. For example, triad element 314 represents a molecule of the chemical element hydrogen.)

Diagrams from books were photocopied (at a suitable enlargement), trimmed, and attached to standard record cards (c.100 mm x 150 mm). Reference numbers were arbitrarily assigned to avoid using verbal labels that might be too leading or convoluted. Diagrams were selected to show a range of types of chemical species (molecules, atoms, ions, parts of lattices), representing a range of substances that should be familiar to the students, in various forms of representation relating to different aspects of structure and associated properties.

In this way it was hoped that Fransella and Bannister's (1977, p.13) criteria for triad elements would be satisfied,

(a) the elements must be within the range of convenience of the constructs to be used.

(b) the elements must be representative of the pool from which they are drawn.

The two decks of element cards were prepared using diagrams from

  • books intended for pre-A level study (Freemantle and Tidy, 1983; Gallagher and Ingram, 1984; Garvie et al., 1979; Groves and Mansfield, 1981; Hughes, 1981; Jackson, 1984). A sample of these elements is presented in appendix 17.
  • A level texts (Andrew and Rispoli, 1991; Hill and Holman, 1989; Liptrot, 1983; Waller, 1985). A sample of these elements is presented in appendix 18.

Both decks were extensive, intended to provide a large repertoire of triad elements that could be presented, and only a selection of figures was used on each occasion. The first deck was piloted with a student near the end of his first year of A level (colearner Edward), and then used with the second cohort of ten colearners (Jagdish, Kabul, Lovesh, Mike, Noor, Paminder, Quorat, Rhea, Tajinder and Umar) at the beginning of their course. Some of these students repeated the exercise later in their course. The second deck was tried out with an undergraduate chemistry student (colearner Brian, who had previously been interviewed as an A level student early in the interview study) at the end of his first year at University, and was then introduced for use with some of the colearners during their second year of the course.

As an example of a triad of elements based on figures from pre-A level texts, consider 343, 454 and 553 (from appendix 17). This triad would be useful for exploring the discrimination between covalent and ionic bonding, and in particular to see if the triad elicits a construct relating to polar bonding.

triad element 343

triad element 454

triad element 553

As an example of a triad of elements based on figures from A level texts, consider 180, 260 and 376 (from appendix 18). This triad would be useful for eliciting colearners' constructs of intermolecular bonding, for example discriminating between hydrogen bonds and other forms of dipole-dipole interactions.

triad element 180

triad element 260

triad element 376

Constraints of time – and colearners' concentration and interest – did not allow complete grids (where each triad element was considered against each construct elicited) to be formed. However, sometimes specific constructs from those elicited were selected and used to construe each triad element. The choice of the particular constructs would be a result of in situ hypothesis testing, when I wanted a clearer idea of what a particular colearner meant by the construct label offered.
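As a purely illustrative aside (not part of the original study), the structure of a repertory grid described above can be sketched as a small data structure: triad elements construed against bipolar constructs. The element numbers below are taken from the triad discussed earlier, but the construct labels and the groupings shown are invented for the example.

```python
# Illustrative sketch of a repertory grid: triad elements (the cards)
# construed against elicited bipolar constructs. The groupings below are
# invented examples, not data from the study.
elements = [343, 454, 553]            # triad element reference numbers

# grid[construct][element] = True if the element is construed at the
# first (emergent) pole of the construct, False for the implicit pole
grid = {
    ("covalent", "ionic"): {343: True, 454: False, 553: True},
    ("polar", "non-polar"): {343: False, 454: False, 553: True},
}

def pole_members(construct, pole_first=True):
    """Return the elements grouped together at one pole of a construct."""
    return sorted(e for e, v in grid[construct].items() if v is pole_first)

print(pole_members(("covalent", "ionic")))   # elements construed as covalent
print(pole_members(("polar", "non-polar"), pole_first=False))
```

An incomplete grid, as described above, would simply omit some construct-element entries rather than rating every element against every construct.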

§5.1.5: Procedures used in the Construct Repertory Test

Two different approaches to selecting triads were used. At first the choice of triads to present was made in situ during the exercise. This allowed the researcher to try out combinations of triad elements that might be useful, and to discard some triad elements as less suitable (e.g. ambiguous) for future use. Just as importantly, it allowed the exercise to be interactive, with the researcher reacting to the students' elicited constructs by offering the next triad. (It is possible that such an interactive approach may also have given colearners a stronger impression of there being 'right' answers that I was testing for. This is not considered to be a major problem, as it is accepted that the colearners came to the research assuming that there were 'right' responses and that I, as their teacher, would know what these were.) After some experience of using the technique a standard set of triads was established for use with each pack. It is these two 'standard' sets of triads that are reproduced in appendices 17 and 18.

Both approaches have advantages. The less structured approach allows the researcher to undertake hypothesis testing (c.f. §4.2.1 and §4.2.3) about the students' ideas, and to follow up immediately responses that seem of particular interest. In a sense the process of the researcher offering a triad to the student, the student offering 'constructs' in response, and the researcher responding with a further triad gives the exercise the form of a conversation: something that has been recognised as inherent in grid work (Fransella and Bannister, 1977, p.4).

The advantage of having a standard set of triads is that comparisons become easier. Comparisons may be made between different students, or between the same students at different times during their course. Appendix 19 presents an example of a comparison between five colearners, in terms of the constructs elicited by one particular triad. Appendix 20 presents a comparison between the richness of the constructs elicited from two colearners at the start of their course. Appendix 21 presents a comparison of constructs elicited from a single colearner at different times, when construing the same triads. (The full list of constructs elicited during the research is provided in appendix 22).

§5.1.6: Colearner dialogues about chemistry

As discussed in chapter 4, one technique used to supplement interviews was to record colearners discussing past examination questions (§4.8). The procedure followed was to pair two colearners and set them a task which required discussion. The discussion was recorded on cassette tape. I was present to set up the process, but then withdrew to the far side of the room, only intervening in order to answer procedural questions.

The pairing was based on a combination of which colearners wished to contribute in this way, when they – and the researcher, and a room – were available, and which students felt comfortable talking together in this way. It was only possible to collect a limited amount of data in this way, based on six sessions (as listed in appendix 1).

The tasks used were A level questions, about chemical bonding or bonding-related topics. This type of task was chosen as:

  • It was felt that such past questions had validity as probes, being by definition set at Advanced Level, and pertaining to A level Chemistry.
  • The tasks were seen as relevant by the student-colearners, who recognised that to be successful Advanced level students they would need to be able to answer such questions in ways that were judged (by an examiner ultimately) to be acceptable chemistry.
  • The questions were structured, which meant that once the colearner pairs were set working they should need minimal input from the researcher.

The task that was set was to work together to answer the examination question. The students were told to attempt to agree the wording of their joint answer. The students themselves were left to decide how to go about answering the question, how much time to spend on various parts (no time limit was given), and to agree when they were finished. At the end of the session I would answer any of their queries, but as far as the research was concerned the product of the sessions was not so much the written answers produced, as the dialogue through which the answers were constructed.


§5.2: Details of analytical technique

"…written language consists of a system of signs that designate the sounds and words of spoken language, which, in turn, are signs for real entities and relations. Gradually this intermediate link, spoken language, disappears, and written language is converted into a system of signs that directly symbolize the entities and relations between them."

Vygotsky, 1978, p.106

"Vygotsky describes the process of learning written language as one where first-order symbols become second-order symbols (the child comes to discover that one can represent spoken language by written abstract symbolic signs), only later to become first-order symbols again at a higher level of psychological process…"

Newman and Holzman, 1993, p.104

§5.2.1: The problems of transcription

The research interviews were of varying length, usually exceeding thirty minutes, and often more than an hour long. Interviews were tape recorded, so that the data collected is on cassette tape. (A back-up copy was made as soon as possible after the recording as security.) The recording loses much of the non-verbal interaction of the original transaction, but includes tonality, emphasis, hesitation and so forth as well as (vocalisation that may be interpreted as) words.

However, in order to analyse the interviews, it was necessary to transfer the data into some written form, that could more readily be edited, indexed, juxtaposed, re-sequenced, abstracted, compared, tested-against-conjecture, and so forth. Whilst the primary data for the study was the recordings (being the closest representation of the original conversations available), simply listening to the tapes en masse would not be a sufficient method of analysis, as the human brain cannot hold enough information in consciousness to consider all the issues relevant to the research. (Nevertheless, part of the analytical process is to listen to entire recordings to obtain overall impressions, and to confirm that the data-reduction process has not distorted the subject's meaning, c.f. §4.4.4.)

The process of transcription is a time-consuming and skilled operation. The researcher required approximately ten hours to produce an 'untidied' transcript of one hour of recording. (I use the term untidy – rather than 'rough' – to imply material that has been transcribed but not formatted in terms of utterance numbers and who is speaking, etc.)

Transcription is very much an interpretation process: it is not possible to produce a 'neutral' translation of the information on the tape (c.f. Stubbs, 1983, p.227). The ears detect sound, but the brain infers words. Indeed, after thirty or so years of practice of listening to speech my brain has developed its processing capacity to 'ignore' most of the words and to make conscious what experience has suggested is the likely meaning that the speaker intends: conscious conversation works at a fairly high semantic level. The transcriber has to try and work at a level closer to perception to produce a verbatim transcript. When checking early attempts at transcription I found that I had omitted ubiquitous "sort of"s which added nothing to meaning, and I had added 'missing' words, and 'corrected' word endings. Sometimes speech is indistinct on the tape, but one may not even realise this if the indistinct word(s) seem 'obvious' from the context (c.f. Stubbs, 1983, p.228). Although, on checking early versions of transcriptions, I discovered many such 'errors', there were very few mis-transcriptions which altered the meaning of utterances. Indeed, meaning that seemed quite clear when listening to the recordings could become less obvious in the written transcripts as finer precision was introduced. As Stubbs (1983, p.228) points out, the coarser, subconsciously edited, transcription may be closer to what the participants experienced during the original interview.

Having established that transcription is an interpretive process, the question of whether a transcription should be interpretive does not apply: instead the researcher has to make decisions about the degree of interpretation appropriate in the transcription stage of the data analysis (c.f. Stubbs, 1983, p.229). For example, if one was interested in a detailed discourse analysis there are notations to record changes in tonality during utterances (e.g. Coulthard, 1985). However, in the field of studies about the learning of science this has not usually been judged appropriate. Indeed classroom researchers Edwards and Mercer suggest that for those "interested in cognitive and educational processes, and particularly those whose research incorporates a developmental perspective, it is arguably [formal] discourse analysis which scratches the surface" (1987, p.10). In the present study I have also been influenced by Vygotsky's view that words make up the appropriate unit of analysis when studying conceptual development (§2.2.2).

Even ignoring formal discourse analysis I was left with a range of practical questions about the transcription process:

  • Words such as 'there' and 'their' cannot be distinguished by sound, but only by context – so should the transcription show 'there/their'?
  • To what extent is it important to include pauses?
  • To what extent is it important to show mispronunciations of words?

In order to answer such questions I followed the principle that the researcher using interview data as a primary source of information needs to develop a transcript format that matches his or her own purposes in producing the transcripts.

I also needed to decide whether to transcribe complete recordings, or to be selective. Some interviews may not prove fruitful in answering research questions. For example in a longitudinal study some subjects may leave the cohort part-way through. Interviews with these subjects may ultimately contribute little to the enquiry, but it will not be known in advance which subjects may be lost, and some level of analysis will be required in preparation for any subsequent interviews with these subjects. Some subjects may tend to move the discussion into areas that are not directly relevant to the enquiry. The interviewer has the 'power' to prevent this, but may deliberately allow the respondent to follow her own trains of thought, as the interviewer does not know in advance what may be relevant in the mind of the respondent. If the discussion reveals an idiosyncratic connection this may be a significant aspect of the student's cognitive structure. If not, the only thing lost is time and tape (as all interviewers know – time and tape wait for no one), but little may be gained by transcribing this section of recording.

To summarise:

  • transcription is a lengthy and difficult process, but was an essential treatment for some of the interview data;
  • a style of transcription format needed to be developed to reflect the purpose(s) of the analysis;
  • some recordings, and some parts of other recordings, may not require transcription; but it was not possible to decide what to transcribe until late in the enquiry; and
  • some level of analysis was needed during data collection to 'feedback' into subsequent interviews.

The conclusion from these points was that during the enquiry
a) a format for transcriptions should be developed, and tested for its utility in (i) drawing conclusions, (ii) providing evidence to support results;
b) an on-going mode of data analysis is required that (i) is less time-consuming than transcription so it may readily be applied to entire recordings, but (ii) allows the abstraction of points of interest which can be followed up in subsequent interviews, and (iii) gives reference to the primary data (recordings) to allow ready access to points of interest for closer scrutiny.

The solution that emerged was that the data from the two main cohorts was treated differently. While the interviews of Annie, Brian, Carol and Debra were being undertaken, a limited amount of analysis was undertaken to inform subsequent interviews. Once these colearners had completed their course, their interviews were transcribed fully. By this time data was being collected from the other colearners, so the detailed analysis of the data from the first cohort took place as further data was being collected from other learners (c.f. §4.4.1, §4.4.4). A transcription procedure was developed and refined, and Annie's case study prepared.

Meanwhile, the interview data then being collected was initially analysed in less depth. All recordings were reviewed, and from each interview a protocol was produced which summarised the discussion. Initially this process involved a 'hard-copy' format using a series of hard-backed A4 notebooks, where alternate pages were used for the summary, and the facing pages used for notes. These protocols were coded (a list of codes used is given below), and indexed on a card-index system.

However, as the research proceeded I transferred the protocols to word-processing files on the computer, and started summarising subsequent interviews directly onto computer. This was done as far as possible in 'real time' on listening to the recording, using as much of the colearners' own language as possible. This provided sufficient written material to search for themes, and to apply initial codes.

As these protocols were word-processed, they were capable of being up-dated at any time. Indeed these protocols became working documents, and over a period of time sections of the summaries gradually became more detailed, and large sections were fully transcribed. For some interviews most of the data was eventually transcribed verbatim.

The use of the computer eventually made the card index system redundant, as I was able to identify and access any word (either in a transcription, or in my coding or comments) on any file, using the software on the machine.

§5.2.2: Transcription

Given that any transcription is an interpretation of the raw data, the main decision to make when developing a transcription procedure is the degree of interpretation to be undertaken in producing the transcript (§5.2.1).

In developing the procedure used in this present study a number of trial transcriptions were undertaken. The transcripts were considered in the light of how readily they could be used in the next stage of analysis, and the perceived risk of any inaccuracies distorting such subsequent analysis. At this time a number of examples of transcript data quoted in the literature were considered for comparison purposes. Transcripts were modified, and re-formatted as a result of reflection on these issues.

The following transcription schedule was developed:

  1. Sentences: although spoken language differs from the written word, a transcript is more sensible to readers if it follows the conventions of written language. It was decided that the use of capital letters, commas, and full stops was appropriate, although the 'sentences' produced may not reflect grammatical rules (Stubbs, 1983, p.35).
  2. Spellings: words would be transcribed as accurately as possible, with 'their/there' (and other examples) being recorded as seemed appropriate from the context. This was extended to include 'N-A- plus' as "Na+", and 's-p-three' as "sp3" etc.
  3. Hesitation: utterances such as "er", "uhm" etc. can provide information about hesitation and uncertainty, and would be transcribed as accurately as possible.
  4. Emphasis: No effort would be made to systematically signify variations in tonality, but '?' and '!' would be used where considered justified. Particular emphasis placed on a word or phrase would be represented by underlining.
  5. Utterance order: Speech would be listed in order of utterance, even if this meant 'sentence sharing' such that one speaker interjects into the other's pauses.
  6. Simultaneous speech: Where speech overlapped this could be shown by the use of chevrons (>…> and <…<) to bracket together the overlapping speech.
  7. Speaker's moves: Each change of speaker would be represented by a new line. The speaker would be represented by an initial to the left of the speech.
  8. Utterance numbers: For reference purposes 'utterance' numbers would be assigned, and shown at the far left of the page.
  9. Silences: Pauses would be represented by •, with each • representing a pause of about 1 second. Short pauses within words would be represented by colons, e.g.: electro:negativity.
  10. Brackets: Parentheses would be used:
    • [for additional information / interpretation added at transcription]
    • {for non-verbal sounds: coughing, laughter}
    • (for parts of speech apparently directed by the speaker to him- or herself)
  11. Uncertain transcription: Where the recorded sound was indistinct and part of speech was not transcribed * would indicate the missing speech, with the number of * symbols representing (as far as possible) the missed syllables. Where a transcription was possible, but of uncertain precision, this would be represented by striking the uncertain part through.

This transcription scheme involves a moderately high level of interpretation in that the speaker's words are transcribed as questions, exclamations, etc. (see appendix 23 for a sample of transcript material). However, the original data sources (the recordings) were available to be checked against the transcripts at any stage in subsequent analysis.
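The notational conventions in the schedule above are purely typographic, so their presence in a transcript line can be tallied mechanically. The sketch below is a hypothetical helper, not part of the original procedure: it counts the pause bullets (item 9), missed-syllable asterisks (item 11), and mid-word colons (item 9) in a single line of transcript.

```python
import re

def summarise_markers(line: str) -> dict:
    """Tally the transcription-schedule markers in one transcript line."""
    return {
        "pause_seconds": line.count("•"),      # item 9: each • is roughly 1 second of pause
        "missed_syllables": line.count("*"),   # item 11: one * per untranscribed syllable
        "mid_word_pauses": len(re.findall(r"\w:\w", line)),  # item 9: e.g. electro:negativity
    }

# A made-up line using the schedule's notation:
print(summarise_markers("Well • • it's electro:negativity ** I think"))
```

Such a check is only a convenience for auditing consistency of notation; the interpretive work of transcription itself obviously cannot be automated in this way.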

The system adopted for transcription actually provides a rendering of the recorded information that is closer to the original data than may be appropriate when quoting to illustrate the model developed during the research, but I felt it was sensible to provide as much information in the transcripts as I was likely to need when interpreting extracts. It is simpler to subsequently edit over-elaborated transcript material than to have to return to the right section of tape to fill out sparse transcription. The editing process is discussed below (§5.3).

§5.2.3: Coding the interview data

In order to analyse the data collected, especially with regard to the transcripts of interviews of an hour or so in duration, it is necessary to use certain codes or categories to organise the vast amount of data. As discussed in chapter 4, in the grounded theory approach the categories are considered to emerge from the data, or – where a category was one the researcher already had in mind – to be modified to provide an emergent fit with the data (§4.4.1, §4.4.4).

Initially the data was coded according to the aspects of the concept area being studied in terms of curriculum science: so when the first case study was prepared the data was considered under twelve headings:

  • atomic structure
  • definition of chemical bonding
  • rationale and mechanism for bonding
  • covalent bonding
  • ionic bonding
  • polar bonding
  • metallic bonding
  • multiple bonding
  • delocalisation and resonance
  • dative bonding
  • hydrogen bonding
  • van der Waals' forces

However, in interrogating the data, a number of codes were also used that had arisen from my consideration of the literature.

Other codes emerged from the data (§5.2.4), such as use of the notion of 'deviation charges' in Annie's case (§7.2.2), and the 'conservation of force' conception (§8.2.5).

During the interviewing of the first cohort (Annie, Brian, Carol and Debra) the following procedure was adopted. Loose-leaf A4 file paper was used with a margin on either side of the paper. The reference number of the foci diagram was written in the left-hand margin, the respondent's comments in the centre, and (at points of particular interest) the tape counter number was entered in the right-hand margin. This enabled the appropriate section of the recording to be found for re-listening and quotations. These notes were used to prepare a list of questions to be asked in the subsequent interview.

For the second cohort (for colearners Jagdish, Kabul, Lovesh, Mike, Noor, Paminder, Quorat, Rhea, Tajinder and Umar) a hardback A4 notebook was used for each colearner, with both tape counter number and diagram number on the left-hand side. Only one side of the paper was used – the left-hand page of each double spread – to allow plenty of space for notes to be made (interpretation, comparisons, ideas for follow-up, etc.). These notes would not necessarily be made concurrently, but could be added to when the protocols were re-visited.

§5.2.4: Using codes to index the interview data

The process of analysing the interview data (through the use of transcripts or notes) requires the application of codes to 'fracture' (Glaser, 1978, p.55) the raw data.

The codes used may amount to no more than 'working hypotheses' of useful ways to consider the data, and over time the members of the set of emerging categories may be split, joined, discarded, supplanted, supplemented and so forth (§4.4.4). A most important requirement is that no categorisation should be exclusive – an utterance may represent several points of interest (i.e. it would not be appropriate to cut up a transcript/protocol and place segments into groupings). An utterance may be cited as a referent in any number of categories, and segments of a transcript/protocol that are not classified at one time may be returned to later and reconsidered in the light of new codes and categories.
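The requirement that coding be non-exclusive amounts to a many-to-many index between codes and utterances. A minimal sketch of such an index (illustrative only – the study itself used an alphabetical card index, and the citation strings below are invented examples):

```python
from collections import defaultdict

# Many-to-many index: one utterance may be filed under several codes,
# and a code may cite many utterances. Citations here are hypothetical.
index = defaultdict(list)

def cite(code, citation):
    """File a citation under a code without removing it from any other code."""
    if citation not in index[code]:
        index[code].append(citation)

cite("anthropomorphism", "A1.42")
cite("octet states", "A1.42")      # the same utterance filed under a second category
cite("anthropomorphism", "A3.17")

print(sorted(index["anthropomorphism"]))
```

The design point mirrors the card index: filing a citation is always additive, so no utterance is ever 'used up' by a single category, and unclassified segments can simply be cited later as new codes emerge.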

Transcripts and protocols were studied, and codes generated liberally, and for each subject an alphabetical card index of citations was prepared. The use of index cards enabled ready additions, rearrangements etc. By coding data as the interviews continued I was able to develop my sensitivity to nuances of colearners' interview responses that might be relevant to emerging categories, and take opportunities to follow-up such leads during the interviews. At this stage of the analysis there were a large number of codes used: some common to a number of colearners, but others idiosyncratic. The codes used are listed in the box below.

Codes used in analysis of interview data whilst data collection continued
analogy • anion size • anthropomorphism • antibonding orbitals • atomic structure • Aufbau • balanced forces • banana bonds • boiling temperature • bond angles • bond fission • bond order • bonding orbitals • bonding = touching • bridging/terminal • cause and effect • centripetal acceleration • changes of state • charge • charge/force • cohesion • compound/element • compound properties • compound/mixture • confused • contradiction • coordination number • core charge • Coulomb's law • counting twice • covalent bonding • criteria for bonding • criterion for bond type • dative bonding • delocalisation • diatomic = gas • dichotomy • dipoles • direction of bond • double bond • eccentric orbits • electricity • electrolytic conduction • electronegativity • electronic configuration • electrons • electron cloud • electron density • electron orbits • electron pairing • electron spin • electron wave • electrostatic attraction • electrostatic force • electrostatic framework • electrostatic interactions • element • energy levels • equilibrium • excited states • flat • force • force conserved • force fields • forgetting • Gestalt • giant molecular structure • geometry • gravity • guessing • heat • history conjecture • hydration • hybridisation • hydrogen bonding • idiosyncrasy • inductive effect • inference • ions • ion formation • ionic bonding • ionic charge • ionic molecule • ionic radius • ionic reactions • ionisation energies • just forces conjecture • lattice • learning conversations • logic • lone pairs • macro/μ • maximum speed • mechanism • melting • metacognition • metallic bonding • metallic structure • molecular energy • molecular ion • molecular orbital • molecules • multiple bonding • natural state • neutral charge • Newton 1 • Newton 3 • noble gas configuration • number of bonds • octet states • ownership • orbital • orbital labels • orbitals/reality • orbital shapes • oxidation numbers • partial charges • periodic table • polar bonding • polarisation • 
potential energy • projectile • proper bonds • pseudo-explanation • quantify/qualify • quantum rules • radii • random thought generator • rationale for bonding • reactivity • reality manifold • redox • rehybridisation • representation • representation/reality • resonance • screening effect • shape of molecule • shielding • shells • solid • solvation • spare • stable electronic configuration • stability • state • stoichiometry • surfaces • symmetry • tacit knowledge • tautology • teleology • V.S.E.P.R.T. • valence electrons • valency • valency conjecture • van der Waals' forces • variables • visualisation • waffling • 3-D • 3-D/2-D distortion • δ-bond • π-bond • σ-bond

The labels in the box represent the initial attempts to sort the data, and for this reason there is some overlap and duplication of these original codes. This is quite normal in this type of analysis as the initial codes are intended to relate to the data with minimal theoretical interpretation. These codes influenced subsequent data collection, in particular by sensitising me to points of interest in my role as interviewer (c.f. Glaser, 1978, p.36). Some of the codes in the box are simply those used in coding the data from Annie in terms of aspects of the concept area (i.e. covalent bonding, metallic bonding, see §5.2.3). However, a number of the codes being used at this stage developed into the categories around which the 'grounded theory', that is the model I present in chapter 6, emerged.

One example is anthropomorphism. This category developed from a code that I had in mind before data analysis commenced (c.f. §3.1.4). Glaser has described how in such cases one needs to develop an "emergent fit between the data and a preexistent category" (1978, p.4). In this case the initial code labelled all examples of anthropomorphism found. However as the analysis continued it was found that anthropomorphic language used to describe how and why chemical bonds formed took on a particular significance as part of octet thinking (§11.3). Not all the examples of anthropomorphic language as coded in the data were relevant to this particular category (which might be labelled 'anthropomorphism as a pseudo-explanation of the octet rule').

Similarly the code dichotomy originated from before the research commenced, and was central to my initial conceptualisation of the research focus. In my own teaching I had emphasised how the transition to A level required learners to switch from seeing the distinction covalent-ionic as a dichotomy, and considering it instead as a continuum. (This required them first to see the distinction metal-nonmetal as a continuum rather than a dichotomy.) In the research I discovered that the dichotomous perspective appeared to be part of a wider complex of related conceptions (§11.6).

By way of contrast the code force conserved was not based upon a code I brought to the data at the start of the research. Although my reading and professional experience meant I was aware of a range of alternative conceptions in mechanics and related areas, the 'conservation of force' conception discussed in this thesis (§10.5) has not to my knowledge been proposed as a specific alternative conception, although this research suggests it may be a commonly held idea (see appendix 3). This particular code was added quite late in the analysis, but once it emerged it was found to code for a number of instances in the data, and to inform the data collection (as it gave me a sensitivity to the possible significance of certain comments made by colearners, and therefore allowed me to test hypotheses about this notion being held, and thus elicit further examples).

An example of a code that developed by what Glaser refers to as 'refit' (1978, p.4) is history conjecture. Initially this referred to comments elicited in the context of ionic bonding, that is that an ionic bond would only be formed between ions that had transferred electrons – as if the electron or ions had some sort of memory of what had gone before (§11.4.2, and appendix 2). As analysis continued this developed into a category that included a wider range of data: so for example learners might suggest that on bond fission each atom gets its 'own electrons back'; another example where the history of the electron is endowed with some significance (§11.4.1).

§5.2.5: Citations to the data base

In order to relate analysis back to the data in which it is grounded, a system of citations has been used. Each subject interviewed is referred to by an initial: A (Annie), B (Brian), etc. The tape recorded interviews are signified A1, A2 etc. (see appendix 1).

As the recorded data from Annie, Brian, Carol and Debra were completely transcribed, the source of a quotation or point of interest is cited in terms of the recording reference and transcript utterance number, e.g. A1.1, A1.2, etc.

As the protocols of the later interviews included only partial transcription, utterance numbers were not assigned (as these would have needed to be altered with each increment in the amount of the interview transcribed). When the source is a recording which has not been fully transcribed, the citation is to the recording and the tape counter number, e.g. J1.A076. (As counter readings were made intermittently, this citation has limited precision in relation to the tape.) It is possible to distinguish which form is being used, as the citations to partially transcribed recordings include a letter indicating the side of the cassette on which that part of the interview was recorded (i.e. A or B, or occasionally C or D where the interview was long enough to require a second cassette).
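The two citation forms are mechanically distinguishable: a fully transcribed source cites recording plus utterance number (e.g. A1.1), while a partially transcribed one inserts a cassette side letter before the counter reading (e.g. J1.A076). A sketch of a parser for the two forms (purely illustrative; no such program was used in the study):

```python
import re

# Hypothetical parser for the two citation forms described in §5.2.5.
FULL = re.compile(r"^([A-Z])(\d+)\.(\d+)$")            # e.g. A1.1  (utterance number)
PARTIAL = re.compile(r"^([A-Z])(\d+)\.([A-D])(\d+)$")  # e.g. J1.A076 (side + counter)

def parse_citation(ref: str) -> dict:
    m = FULL.match(ref)
    if m:
        return {"colearner": m.group(1), "recording": int(m.group(2)),
                "kind": "utterance", "utterance": int(m.group(3))}
    m = PARTIAL.match(ref)
    if m:
        return {"colearner": m.group(1), "recording": int(m.group(2)),
                "kind": "counter", "side": m.group(3), "counter": int(m.group(4))}
    raise ValueError(f"unrecognised citation: {ref}")

print(parse_citation("A1.1"))
print(parse_citation("J1.A076"))
```

The side letter is what disambiguates the forms: a digit after the full stop marks an utterance number, a letter marks a cassette side followed by a counter reading.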

§5.2.6: Supplementary data

The supplementary data was that collected outside of interview contexts. This included the construct repertory test (a clinical context, but different foci, and a different mode of elicitation); colearner dialogues (a clinical context, but minimal input from the researcher); and course work tasks (a more naturalistic context, and little or no scaffolding of the tasks by the researcher during their execution). Supplementary data was used in two ways: in what could be described as 'formative' and 'summative' modes.

The formative mode: During a sequence of interviews with a colearner, the data was interrogated to find particular points of interest. This information was used to suggest specific interview questions – sometimes explicitly referring to the evidence from the supplementary data source – to clarify and probe the colearner's thinking.

Even where there was no specific use of this data, the process of working through the data could contribute to my background appreciation of the colearner's case (being part of the 'system input' for whatever subconscious processing my brain was undertaking – post-inductive resonance, or 'integration' in McClintock's term, §4.2.1).

The summative mode: During the process of preparing case studies once a sequence of interviews was complete, the supplementary data was interrogated to find evidence that supported or challenged interpretations of interview data (bearing in mind that these data sources could not be considered 'independent' of the interviews, as they had been used to inform the interview questions). In other words, the supplementary data provided a form of triangulation for this interview-based study, but in terms of grounded theory's constant comparison approach (§4.4), rather than purely as a post hoc means of verification (Glaser and Strauss, 1967, pp.68-69, c.f. §4.4.4). It should be noted that, given the subconscious processing that forms part of the analysis process (§4.2.1), and with one researcher collecting data of a qualitative nature, it would not have been possible genuinely to ignore slices of data (§4.4.2) during the interviewing, even if this had been considered desirable.

Analysing construct repertory test data

Some points of interest were clear during the process of data collection, and were recorded in field notes. The ethical stance taken in this research meant that I raised points that were considered to be particularly significant with colearners in the feedback at the end of the session. The data overall was subsequently examined.

Formative mode: To consider the data from Kelly's triads I would set out the triad of elements and work through the colearner's elicited constructs. I would note any constructs which suggested alternative conceptions, or ways of construing the elements which seemed to imply a different level of understanding to that expected. These points could be indexed, and could be followed up in later interviews.

As an example, consider colearner Noor. The construct repertory test suggested that she had alternative meanings for some basic chemical terms, including 'compound', 'molecule', 'ion' and 'element'. The nature of Noor's alternative nomenclature was such that the differences from standard usage were not readily apparent in interviews and written work: but became clear in the repertory test. This was because when Noor used these terms, she tended to use them 'appropriately' from a conventional viewpoint, but her own meanings meant she did not use the terms in other contexts where they would also be appropriate. Noor's meanings were restricted because she saw some of these categories as mutually exclusive (in P.C.T. terms, her constructs were preemptive rather than constellatory; Kelly, 1963 {1955}, pp.153-4). This became clear in the repertory test as she had to construe the elements according to her personal construct system. Once elicited in this way Noor's meanings could be explored, tuition provided, and the development of her use of the terminology followed (see appendix 24).

Summative mode: the constructs elicited were interrogated to identify evidence to support or challenge interpretations from the interviews – for example elements that seemed to be 'misconstrued', or the absence of constructs that would be expected to be applied to particular triads, but were not elicited. However, the failure to elicit a construct does not prove it is not part of the colearner's system: for example, when asked about the absence of certain constructs from his construct repertory tests, colearner Tajinder reported that he considered certain discriminations to be too basic to be relevant to the research. For example, in appendix 21 constructs elicited from Tajinder from the same triads on two occasions are presented. In the case of the triad of elements 229, 307 and 349 Tajinder suggested four discriminations in October 1993, but only two in May 1994. Yet the constructs elicited on the later date were more sophisticated, and it would not be appropriate to suggest that Tajinder no longer recognised which of the triad elements contained phosphorus (for example). Rather he was presenting discriminations at a more abstract level, such as whether d-orbitals were used in the hybridisation (see appendix 21; this point relates to my comments above about students seeing through the representations, §5.1.3. Also relevant here is appendix 20, where it is tentatively suggested that Rhea's tendency to construe triad elements in terms of aspects of the form of representation – such as 'got shading' – was an indication of her poor concept base in chemistry).

Analysing concept map data

Concept maps were sometimes set as a classwork exercise, to be carried out without notes or books. However, they were also set as homework, where students had access to such resources.

Formative mode: concept maps were checked and graded in my teacher role, but particular points of interest were noted for later follow up. These were generally of the nature of propositions that were either incorrect from the curriculum science viewpoint, or at least ambiguous or dubiously worded. These comments suggested possible alternative conceptions, and the colearners were asked about them in interviews.

Summative mode: in preparing case studies, the concept maps were able to provide an additional check on the interpretations from interview studies, e.g. the absence or application of a particular explanatory principle.

Analysing colearner dialogues

The colearner dialogues were transcribed using the same format as the interviews (§5.2.2, see appendix 25 for a sample of the transcription). In an interview context I was able to structure questions to push colearners to the limits of their knowledge and understanding – in Vygotsky's terms, to scaffold the dialogue to work in the colearner's Z.P.D. (§2.2.2). In the colearner dialogues I made no input once the task was set up and the students were working, unless I was asked for procedural directions. The dialogues suggested how far the colearners were able to push their thinking in the absence of teacher (researcher) input.

Once again, immediate feedback was given to the colearners for pedagogic purposes, and points of interest suggested lines of discussion for the interviews ('formative mode'), as well as providing another check on the interpretations of interviews when case studies were compiled.

Sentence sharing. In practice it was found that some of the data from the dialogues was quite difficult to make use of in the ways intended, because much of the discourse was not attributable to an individual learner. There was little difficulty in assigning utterances to the colearners, but much of the argument developed was clearly the result of the interchange, rather than of the individuals. Even where one colearner seemed to be acting as the main source of ideas, and the other was mirroring these, the transcripts did not provide clear evidence for this. In many cases there was 'sentence sharing', where an individual statement was divided between the two students into several moves. Although this limited the use of the data as evidence for the thinking of an individual learner, it was suggestive of the extent to which the interviews I have carried out are also the product of a conversation.

In an interview context the normal conventions of speech are somewhat subverted by the researcher who is aware that in a sense the tape recorder is an audience: often full sentences are used, and clarifications – that would not normally be requested – are sought. However, in the colearner dialogues, the discourse is more akin to the normal patterns of speech where the assumed common knowledge provides a context, and a 'referential framework', allowing abbreviated communication, and where the dialogue requires instant response rather than deliberate considered statements (Edwards and Mercer, 1987, p.6; Vygotsky, 1986 {1934}, p.240, p.242).

These, then, were the conditions that led to the 'sentence sharing' I observed. The purpose of 'sentence sharing' may be to check common understanding by the participants: Stubbs points out that completion of another's utterance is a demonstration of having understood (1983, p.22, c.f. §2.2.3 and Edwards and Mercer, 1987, p.141). In the present study this checking process may be seen to operate when the interjection made by one of the colearners is not consistent with the expectation of the other – on occasions such discrepancies were quickly overcome by one of the participants 'changing tack'.

Analysing other supplementary data

Copies of some test scripts and other work undertaken by colearners as part of their course were kept on file. Again where there were specific points judged to be of particular interest that arose from this material, these points would be used to inform the agenda for interviews. The material was also available to be used as a check against interpretations made in writing up case studies.

§5.3: Compiling the case studies: using a journalistic style

"The analysis of an interview begins with two assumptions: (1) children answer honestly and (2) answers are consistent with personal meanings of concepts."

Ault, Novak and Gowin, 1984, p.446

Guba (1978) has suggested 'journalism' as an appropriate model for naturalistic research, where "truth can be elicited from partial, and even reluctant sources by processes of cross-checking, triangulation, and re-cycling until convergence is achieved" (Guba, 1978, quoted in Gilbert and Pope, 1986b, p.42).

In preparing the case study of Annie, from the pilot study, I used a 'journalistic' approach to writing up the case (see §7.3). Having transcribed the interviews, and re-organised the data in terms of categories, I wrote up the case study to have a high level of readability by editing the evidence to give as far as possible a narrative form. (My interpretation of) Annie's thinking has been illustrated in her own words, but parts of utterances have been selected and spliced together to provide narrative, in the same way that a journalist might edit an interview for broadcast news (see appendix 26).

A deliberate decision to edit in this way places a responsibility on the researcher to ensure that increased readability is not attained at the cost of misrepresenting the full data. All citations from transcripts, short of publishing full texts, involve some degree of editing, and lose some of the information in the original tapes. In the case studies discussed in chapters 7 and 8 there is a high level of editing, and the reader should be aware of the scope for researcher bias and misinterpretation.

Despite these dangers it is accepted that such an approach is appropriate in writing up findings from a study such as this. So Sherman (1993) points out that qualitative research "has to make its case, in part, through literary persuasion" (p.233), and Sarup (1993) warns the users of such reports that "narrative, just by being narrative, always demands interpretation" (p.178).

In the previous chapter there was reference to Pope and Denicolo's (1986) discussion of the researcher's dilemma (§4.4.5). I have attempted to follow their maxim that "authenticity must be tempered with utility" (p.156). To ensure utility my data chapters (7-11) have been written with an emphasis on providing a narrative to lead my readers through the substantive points. To ensure authenticity I have included in the appendices illustrations of the process of data reduction, and a range of verbatim evidence to exemplify and support my interpretations.


§5.4: Developing the model of progression of learning about bonding

As explained earlier in this chapter (§5.2.3), the case study of Annie (discussed in chapter 7) was initiated by coding based upon my analysis of the concept area. As analysis proceeded with data from other learners a large number of codes of various types were used (§5.2.4).

Once data collection from the main cohort of colearners was completed, the data from all but one of the learners were put aside. Tajinder was selected as a suitable case for in-depth study as he had provided the greatest amount of data. For a period of some months these data were worked into a case study, without direct reference to the data from other learners – although of course the codes used initially to organise the data had originated in the earlier on-going analysis that had taken place during data collection (§4.4).

In order to work the data from Tajinder into a case study a multi-stage process was adopted:

  • Each of the 23 interviews was summarised to produce a more manageable data source. This involved a great deal of editing, as described above (see appendix 27).
  • A chronological case study document was compiled from the summaries, plus points from the supplementary data that were considered significant. For the case study document the material from the interview summaries was reorganised thematically (see appendix 27).
  • The main themes from within the case study document were identified, and Tajinder's case was written up around these to form the basis of chapter 8. At this point interpretations were checked back against the original protocols, extra transcription was undertaken where necessary, and suitable verbatim quotations were selected to illustrate the case.

The preparation of this case provided an outline model for organising the data from other cases. This was tested by preparing cases from the data from Brian, Carol, Debra, Jagdish and Kabul, based around this structure.

As a result of this process, the model was refined. The data from the other colearners (Edward, Lovesh, Mike, Noor, Paminder, Quorat, Rhea and Umar) were then interrogated and organised according to the model – again involving some refinement of the categories used in the model.

The incidental data collected from other learners was next considered in the light of the model. By this stage a model had been developed which seemed to explain a good deal of the original data. This version of the model was written up in chapter 6, and then the three main aspects of the model were described and illustrated in chapters 9, 10 and 11.
