Introduction to Grounded Theory

By Steve Borgatti

Goals and Perspective

The phrase "grounded theory" refers to theory that is developed inductively from a corpus of data. If done well, this means that the resulting theory fits at least one dataset perfectly. This contrasts with theory derived deductively from grand theory, without the help of data, which could therefore turn out to fit no data at all.

Grounded theory takes a case-oriented rather than a variable-oriented perspective, although the distinction is nearly impossible to draw. This means in part that the researcher takes different cases to be wholes, in which the variables interact as a unit to produce certain outcomes. A case-oriented perspective tends to assume that variables interact in complex ways, and is suspicious of simple additive models, such as ANOVA with main effects only.
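
To make the suspicion of additive models concrete, here is a minimal sketch in Python. The 2x2 table of cell means is invented for illustration (it is not from any dataset discussed here), and the function names are my own. It shows a case in which the outcome occurs only when two conditions are present together, so a main-effects-only fit cannot reproduce any of the cells.

```python
# Hypothetical 2x2 design: keys are (condition_a, condition_b), values are cell means.
# Illustrative numbers only: the outcome appears only when both conditions are present.
conjunctive = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}


def interaction_contrast(cells):
    """Difference of differences; zero exactly when the two effects are purely additive."""
    return cells[(1, 1)] - cells[(1, 0)] - cells[(0, 1)] + cells[(0, 0)]


def additive_fit(cells):
    """Main-effects-only least-squares fit for a balanced 2x2 table of cell means."""
    grand = sum(cells.values()) / 4
    row = {a: (cells[(a, 0)] + cells[(a, 1)]) / 2 for a in (0, 1)}
    col = {b: (cells[(0, b)] + cells[(1, b)]) / 2 for b in (0, 1)}
    return {(a, b): row[a] + col[b] - grand for a in (0, 1) for b in (0, 1)}


fitted = additive_fit(conjunctive)
# Residuals show how far the additive model is from each observed cell.
residuals = {cell: conjunctive[cell] - fitted[cell] for cell in conjunctive}

print("interaction contrast:", interaction_contrast(conjunctive))
print("residuals of additive fit:", residuals)
```

A nonzero contrast means that the effect of one condition depends on the level of the other, which is the kind of interaction a case-oriented analyst expects to find.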

Part and parcel of the case-orientation is a comparative orientation. Cases similar on many variables but with different outcomes are compared to see where the key causal differences may lie. This is based on John Stuart Mill's (1843, A System of Logic, Ratiocinative and Inductive) method of difference -- essentially the use of (natural) experimental design. Similarly, cases that have the same outcome are examined to see which conditions they all have in common, thereby revealing necessary causes (Mill's method of agreement).
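
The two comparisons just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: cases are represented as dictionaries of variables, the three cases and their variable names are invented, and the function names are mine rather than anything found in the grounded theory literature.

```python
from typing import Dict, List

Case = Dict[str, object]


def method_of_difference(case_a: Case, case_b: Case, outcome_key: str = "outcome") -> List[str]:
    """Variables on which two cases with different outcomes differ.

    These are the candidate locations of the key causal difference.
    """
    if case_a[outcome_key] == case_b[outcome_key]:
        raise ValueError("the two cases must have different outcomes")
    return sorted(
        k for k in case_a
        if k != outcome_key and case_a[k] != case_b.get(k)
    )


def method_of_agreement(cases: List[Case], outcome_key: str = "outcome") -> Case:
    """Conditions shared by every case in a set of cases with the same outcome.

    These are the candidate necessary causes.
    """
    if len({c[outcome_key] for c in cases}) != 1:
        raise ValueError("all cases must have the same outcome")
    first = cases[0]
    return {
        k: v for k, v in first.items()
        if k != outcome_key and all(c.get(k) == v for c in cases)
    }


# Hypothetical cases, invented purely for illustration.
a = {"arthritis": True, "takes_drugs": True, "exercises": False, "outcome": "relief"}
b = {"arthritis": True, "takes_drugs": False, "exercises": False, "outcome": "no relief"}
c = {"arthritis": True, "takes_drugs": True, "exercises": True, "outcome": "relief"}

print(method_of_difference(a, b))   # variables that differ between a and b
print(method_of_agreement([a, c]))  # conditions common to a and c
```

Note that the shared conditions are only candidates: with so few cases, a condition such as arthritis is common to every case simply because every case in the sample has it.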

The grounded theory approach, particularly the way Strauss develops it, consists of a set of steps whose careful execution is thought to "guarantee" a good theory as the outcome. Strauss would say that the quality of a theory can be evaluated by the process by which a theory is constructed. (This contrasts with the scientific perspective that how you generate a theory, whether through dreams, analogies or dumb luck, is irrelevant: the quality of a theory is determined by its ability to explain new data.)

Although not part of the grounded theory rhetoric, it is apparent that grounded theorists are concerned with or largely influenced by emic understandings of the world: they use categories drawn from respondents themselves and tend to focus on making implicit belief systems explicit.



The basic idea of the grounded theory approach is to read (and re-read) a textual database (such as a corpus of field notes) and "discover" or label variables (called categories, concepts and properties) and their interrelationships. The ability to perceive variables and relationships is termed "theoretical sensitivity" and is affected by a number of things including one's reading of the literature and one's use of techniques designed to enhance sensitivity.

Of course, the data do not have to be literally textual -- they could be observations of behavior, such as interactions and events in a restaurant. Often they are in the form of field notes, which are like diary entries.

Open Coding

Open coding is the part of the analysis concerned with identifying, naming, categorizing and describing phenomena found in the text. Essentially, each line, sentence, paragraph, etc. is read in search of the answer to the repeated question "What is this about? What is being referenced here?"

The resulting labels refer to things like hospitals, information gathering, friendship, social loss, etc. They are the nouns and verbs of a conceptual world. Part of the analytic process is to identify the more general categories that these things are instances of, such as institutions, work activities, social relations, social outcomes, etc.

We also seek out the adjectives and adverbs -- the properties of these categories. For example, about a friendship we might ask about its duration, its closeness, and its importance to each party. Whether these properties or dimensions come from the data themselves, from respondents, or from the mind of the researcher depends on the goals of the research.

It is important to have fairly abstract categories in addition to very concrete ones, as the abstract ones help to generate general theory.

Consider what is implied in the following passage of text (Strauss and Corbin, p. 78):

Text Fragment 1

Pain relief is a major problem when you have arthritis. Sometimes, the pain is worse than other times, but when it gets really bad, whew! It hurts so bad, you don't want to get out of bed. You don't feel like doing anything. Any relief you get from drugs that you take is only temporary or partial.

One thing that is being discussed here is PAIN. Implied in the text is that the speaker views pain as having certain properties, one of which is INTENSITY: it varies from a little to a lot. (When is it a lot and when is it little?) When it hurts a lot, there are consequences: don't want to get out of bed, don't feel like doing things (what are other things you don't do when in pain?). In order to solve this problem, you need PAIN RELIEF. One AGENT OF PAIN RELIEF is drugs (what are other members of this category?). Pain relief has a certain DURATION (could be temporary), and EFFECTIVENESS (could be partial).

One can see that this sort of analysis has a very emic cast to it, even though I think that most grounded theorists believe they are theorizing about how the world *is* rather than how respondents see it. 

The process of naming or labeling things, categories, and properties is known as coding. Coding can be done very formally and systematically or quite informally. In grounded theory, it is normally done quite informally. For example, if new categories are invented after much text has been coded, grounded theorists do not normally go back to the earlier text to code for those categories. However, it is useful to maintain an inventory of codes with their descriptions (i.e., a codebook), along with pointers to the passages of text that contain them. In addition, as codes are developed, it is useful to write memos known as code notes that discuss the codes. These memos become fodder for later development into reports.
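
As a rough sketch of what such a codebook might look like if kept electronically, consider the following Python fragment. The class and field names, the passage identifier "fragment-1", and the wording of the descriptions are all my own assumptions; the codes themselves are the ones drawn from Text Fragment 1 above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Code:
    """One codebook entry: a category, or a property of a category."""
    name: str
    description: str
    property_of: Optional[str] = None                     # set when the code is a property
    passages: List[str] = field(default_factory=list)     # pointers into the text
    code_notes: List[str] = field(default_factory=list)   # memos about this code


class Codebook:
    """An inventory of codes with descriptions and pointers to coded text."""

    def __init__(self) -> None:
        self.codes: Dict[str, Code] = {}

    def add(self, name: str, description: str, property_of: Optional[str] = None) -> Code:
        if property_of is not None and property_of not in self.codes:
            raise KeyError(f"unknown category: {property_of}")
        if name not in self.codes:
            self.codes[name] = Code(name, description, property_of)
        return self.codes[name]

    def tag(self, name: str, passage_id: str) -> None:
        """Record that a passage contains the given code."""
        passages = self.codes[name].passages
        if passage_id not in passages:
            passages.append(passage_id)

    def properties_of(self, category: str) -> List[str]:
        return [c.name for c in self.codes.values() if c.property_of == category]


book = Codebook()
book.add("PAIN", "Physical pain as described by the speaker")
book.add("INTENSITY", "Varies from a little to a lot", property_of="PAIN")
book.add("PAIN RELIEF", "What the speaker needs in order to deal with pain")
book.add("AGENT OF PAIN RELIEF", "Things that provide relief, such as drugs")
book.add("DURATION", "How long relief lasts; could be temporary", property_of="PAIN RELIEF")
book.add("EFFECTIVENESS", "How complete relief is; could be partial", property_of="PAIN RELIEF")

# "fragment-1" is a hypothetical identifier for Text Fragment 1.
for name in book.codes:
    book.tag(name, "fragment-1")
book.codes["PAIN"].code_notes.append("When is pain a lot and when is it a little?")

print(book.properties_of("PAIN RELIEF"))
```

Keeping passage pointers with each code makes it cheap to return to the text when writing code notes, even if earlier text is never systematically recoded.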


Axial Coding

Axial coding is the process of relating codes (categories and properties) to each other, via a combination of inductive and deductive thinking. To simplify this process, rather than look for any and all kinds of relations, grounded theorists emphasize causal relationships, and fit things into a basic frame of generic relationships. The frame consists of the following elements:

Element: Description
Phenomenon: This is what in schema theory might be called the name of the schema or frame. It is the concept that holds the bits together. In grounded theory it is sometimes the outcome of interest, or it can be the subject.
Causal conditions: These are the events or variables that lead to the occurrence or development of the phenomenon. It is a set of causes and their properties.
Context: Hard to distinguish from the causal conditions. It is the specific locations (values) of background variables. A set of conditions influencing the action/strategy. Researchers often make a quaint distinction between active variables (causes) and background variables (context). It has more to do with what the researcher finds interesting (causes) and less interesting (context) than with distinctions out in nature.
Intervening conditions: Similar to context. If we like, we can identify context with moderating variables and intervening conditions with mediating variables. But it is not clear that grounded theorists cleanly distinguish between these two.
Action strategies: The purposeful, goal-oriented activities that agents perform in response to the phenomenon and intervening conditions.
Consequences: These are the consequences of the action strategies, intended and unintended.

In the text fragment above, it seems obvious that the phenomenon of interest is pain, the causal condition is arthritis, the action strategy is taking drugs, and the consequence is pain relief. Note that grounded theorists don't show much interest in the consequences of the phenomenon itself.
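
The frame can be thought of as a simple record with one slot per element. The following Python sketch is an assumption of mine about how one might represent it, not a notation used by Strauss and Corbin; it fills the slots with the reading of the pain passage given above and reports which slots the passage leaves empty.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class AxialFrame:
    """One slot per element of the axial coding frame."""
    phenomenon: str
    causal_conditions: List[str] = field(default_factory=list)
    context: List[str] = field(default_factory=list)
    intervening_conditions: List[str] = field(default_factory=list)
    action_strategies: List[str] = field(default_factory=list)
    consequences: List[str] = field(default_factory=list)

    def unfilled(self) -> List[str]:
        """Names of the elements for which nothing has been coded yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# The reading of the pain passage given in the text.
pain_frame = AxialFrame(
    phenomenon="pain",
    causal_conditions=["arthritis"],
    action_strategies=["taking drugs"],
    consequences=["pain relief"],
)

print(pain_frame.unfilled())
```

Listing the empty slots is a reminder of what to look for in further text: here, nothing has yet been coded as context or as an intervening condition.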

It should be noted again that a fallacy of some grounded theory work is that it takes the respondent's understanding of what causes what as truth. That is, the researchers see the informant as an insider expert, and the model they create is really the informant's folk model.

Selective Coding

Selective coding is the process of choosing one category to be the core category, and relating all other categories to that category. The essential idea is to develop a single storyline around which everything else is draped. There is a belief that such a core concept always exists.

I believe grounded theory draws from literary analysis, and one can see it here. The advice for building theory parallels advice for writing a story. Selective coding is about finding the driver that impels the story forward.


Memos

Memos are short documents that one writes to oneself as one proceeds through the analysis of a corpus of data. We have already been introduced to two kinds of memos, the field note and the code note (see above). Equally important is the theoretical note. A theoretical note is anything from a post-it that notes how something in the text or codes relates to the literature, to a 5-page paper developing the theoretical implications of something. The final theory and report are typically the integration of several theoretical memos. Writing theoretical memos allows you to think theoretically without the pressure of working on "the" paper.



Process

Strauss and Corbin consider that paying attention to processes is vital. It is important to note that their usage of "process" is not quite the same as that of Lave and March, who use process as a synonym for "explanatory mechanism". Strauss and Corbin are really just concerned with describing and coding everything that is dynamic -- changing, moving, or occurring over time -- in the research setting.