Chapter 21: Content analysis

Danielle Berkovic

Learning outcomes

Upon completion of this chapter, you should be able to:

  • Describe the three types of qualitative content analysis.
  • Understand how to conduct the three types of qualitative content analysis.
  • Identify the strengths and limitations of each type of qualitative content analysis.


What is content analysis?

Content analysis is a widely used qualitative research technique for systematically classifying codes and identifying themes or patterns within the data.1 Content analysis goes beyond counting words; it can be used to examine data and to organise large amounts of text into an efficient number of categories that represent similar meanings.2 This textual data can be derived from the transcripts of interviews or focus groups, from the researcher’s notes taken during participant observation, from documents such as medical or policy guidelines, or from free-text responses to open-ended survey questions. Three types of content analysis are explored in this chapter:1

  • Conventional – the inductive creation of descriptive categories (i.e. categories are not predefined)
  • Directed – the deductive creation of descriptive categories using a structured approach (i.e. categories may be predefined if analysing data within an existing framework)
  • Summative – the identification and quantification of words or content in the text, which aims to understand the contextual use of words or content, and to explore their usage.

Conventional content analysis

Conventional content analysis is used when existing theory or research literature on a phenomenon is limited.1 Researchers should avoid using preconceived codes, instead allowing the codes and names for codes to inductively arise from the data through the following steps:

  • The researcher should be immersed in the data, to enable new insights to emerge. This might include reading all interview and/or focus group transcripts, or the field notes from a participant observation project. This enables the researcher to obtain a sense of the whole dataset. Notes can be made on the transcripts at this time.
  • Once the researcher has read the dataset as a whole, the data should be read closely, sentence by sentence, to inductively capture key thoughts or initial concepts.
  • The researcher should approach each of these sentences by making notes of their first impressions, thoughts and initial direction for analysis.
  • As this process continues, labels for codes should emerge that reflect more than one key thought. These labels often come directly from the text and become the initial coding scheme.
  • Codes should be sorted into categories, based on how the categories are related and linked. Depending on the relationships between categories, researchers can choose to combine categories or organise them into a smaller number of subcategories.
  • Definitions for each category, subcategory, and code should be developed.
  • To prepare for reporting the findings, example quotes for each of the codes and category definitions should be identified from the data.
  • Depending on the study’s aims and objectives, researchers might decide to explore further relationships between categories.
  • Before writing up results, the researcher should check the alignment of codes, categories and quotes.
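For researchers who keep their coding in a script rather than in dedicated software, the final alignment check can be sketched in Python. This is a minimal illustration under stated assumptions, not a prescribed method: the `Code` structure, the `check_alignment` helper and the example codebook entries are all hypothetical, and the judgement about what each code means remains with the researcher.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Code:
    """One inductively derived code (hypothetical structure for illustration)."""
    name: str
    definition: str
    category: str
    quotes: list[str] = field(default_factory=list)


def check_alignment(codes: list[Code], categories: dict[str, str]) -> list[str]:
    """Return a list of problems to resolve before writing up."""
    problems = []
    for code in codes:
        if not code.definition.strip():
            problems.append(f"code '{code.name}' has no definition")
        if code.category not in categories:
            problems.append(
                f"code '{code.name}' points to undefined category '{code.category}'"
            )
        if not code.quotes:
            problems.append(f"code '{code.name}' has no example quote")
    # A defined category that no code belongs to is also worth a second look.
    used = {code.category for code in codes}
    for name in categories:
        if name not in used:
            problems.append(f"category '{name}' contains no codes")
    return problems


# Invented example data, not drawn from any study.
categories = {"challenges": "Barriers described by participants"}
codes = [
    Code("lack of time", "Mentions of insufficient time", "challenges",
         ["There are never enough hours in a shift."]),
    Code("family support", "Help received from relatives", "opportunities"),
]

for problem in check_alignment(codes, categories):
    print(problem)
```

In this invented example the second code is flagged twice: it refers to a category that has not been defined, and it has no supporting quote.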

Example of conventional content analysis

Heydari et al3 used conventional content analysis to better understand home-based palliative care for people with terminal cancer. The researchers collected data through 17 semi-structured interviews and one focus group with 8 participants. They describe reading the transcripts several times, extracting initial codes and merging codes to form categories based on their similarities, before attributing quotes to the categories. Two main categories (challenges and opportunities) and 10 subcategories were identified.

Directed content analysis

Sometimes, existing theory or prior research about a phenomenon is incomplete or would benefit from further description.4 In this instance, the researcher should choose to conduct a directed content analysis.5 Directed content analysis is used to validate or conceptually extend a theoretical framework or theory. For example, researchers used Watson’s Human Caring Theory to analyse healthcare workers’ perspectives on human caring,6 and the Information Motivation Behavioural skills model was used to better understand adherence to antiretroviral therapy.7 Directed content analysis provides a more structured process for data analysis than the conventional approach, through the following steps:

  • Using existing theory or prior research, the researchers begin by identifying key concepts as initial coding categories.
  • Next, operational definitions for each category are determined using the theory.
  • The researcher reads their data as a whole and highlights all text that on first impression appears to represent an aspect of the theory. This provides some reassurance that the researcher has captured all possible occurrences of a phenomenon.
  • The researcher codes the highlighted passages using the predetermined codes from the selected theory.
  • Any text that cannot be categorised with the initial coding scheme can be given a new code and used to provide new insight into the original theory or prior research.
  • The findings from a directed content analysis offer supporting and non-supporting evidence for a theory. This evidence can be presented by showing codes with example quotes, and by providing descriptive evidence.
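The bookkeeping behind these steps can be sketched in Python. Assigning a code to a passage is still the researcher's judgement; the script only records those decisions and separates passages that fit the predetermined codes from those that needed a new code. The codebook entries, the passages and the names `CODEBOOK` and `tally_codes` are hypothetical placeholders, not taken from any particular theory or study.

```python
from collections import Counter

# Predetermined codes with operational definitions, as would be drawn
# from the chosen theory. These entries are invented placeholders.
CODEBOOK = {
    "knowledge": "What the participant knows about the treatment",
    "social_support": "Help or encouragement received from other people",
}

# Each highlighted passage is paired with the code the researcher assigned.
# A code that is absent from CODEBOOK marks text that did not fit the theory.
coded_passages = [
    ("I read the leaflet twice.", "knowledge"),
    ("My sister reminds me every morning.", "social_support"),
    ("The clinic is two bus rides away.", "transport"),
    ("Nobody explained the side effects.", "knowledge"),
]


def tally_codes(passages, codebook):
    """Count passages per code, keeping predetermined and new codes apart."""
    predetermined = Counter()
    new_codes = Counter()
    for _text, code in passages:
        if code in codebook:
            predetermined[code] += 1
        else:
            new_codes[code] += 1
    return predetermined, new_codes


predetermined, new_codes = tally_codes(coded_passages, CODEBOOK)
print("Fits the predetermined codes:", dict(predetermined))
print("New codes to review:", dict(new_codes))
```

Keeping the new codes in a separate tally makes it easier to see where the data may extend, or fail to support, the original framework.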

Example of directed content analysis

Purkey et al8 used the Life Course theory to understand and imagine public health and policy responses to the multiple and varied impacts of the COVID-19 pandemic on different groups. Data was collected through participant-offered short stories and key informant interviews. The data was analysed using directed content analysis; transcripts were read and coded according to definitions outlined in the Life Course theory, which was identified as a theory with sufficient complexity to illustrate many of the impacts of COVID-19. The Life Course theory is described by three constructs: (1) timing (related to cohort effect and period effect), (2) trajectories (related to social pathways, transitions and turning points) and (3) broader principles through which any impact can be examined (related to life span development, agency, time and place, timing, and linked lives).

Summative content analysis

Researchers using summative content analysis start by identifying and quantifying certain words or content in the data to understand the contextual use of the words. This quantification is not designed to infer meaning, but rather to explore the usage and frequency of content.9

In a summative approach to content analysis, the researchers begin by searching the dataset for occurrences of the identified words (this can be done manually or using assistive software). The word frequency for each identified word is calculated, with the context also noted (e.g. the participant providing that data, or the setting in which it was provided). Counting is used to identify patterns in the data and to contextualise the codes. This enables interpretation of the context associated with the use of the word or phrase.1
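Where the search is done with software, the counting step described above might look like the following Python sketch. It is illustrative only: the excerpts, the participant IDs, the keyword list and the `keyword_summary` helper are all invented for this example, and the 30-character context window is an arbitrary assumption.

```python
import re
from collections import Counter, defaultdict

# Invented excerpts keyed by a hypothetical participant ID.
transcripts = {
    "P01": "The pain was constant. I could not sleep because of the pain.",
    "P02": "Fatigue was worse than pain for me. Fatigue stopped me working.",
}

# Words of interest chosen in advance by the researcher (assumed here).
keywords = ["pain", "fatigue"]


def keyword_summary(texts, words, window=30):
    """Count each keyword per participant and keep a snippet of surrounding text."""
    counts = defaultdict(Counter)
    contexts = defaultdict(list)
    for participant, text in texts.items():
        for word in words:
            # Whole-word, case-insensitive match.
            pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
            for match in pattern.finditer(text):
                counts[word][participant] += 1
                start = max(match.start() - window, 0)
                end = min(match.end() + window, len(text))
                contexts[word].append((participant, text[start:end]))
    return counts, contexts


counts, contexts = keyword_summary(transcripts, keywords)
for word in keywords:
    print(word, dict(counts[word]), "total:", sum(counts[word].values()))
    for participant, snippet in contexts[word]:
        print(f"  [{participant}] ...{snippet}...")
```

Whole-word matching is a deliberate simplification: it will not count related forms such as "painful", so the researcher would need to decide whether such variants belong to the same word of interest. The stored snippets are what support the interpretive step, since the counts alone say nothing about how a word was used.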

Example of summative content analysis

Bender et al10 aimed to characterise the purpose, use and creators of Facebook groups related to breast cancer. The researchers began by analysing the content of the first 100 Facebook groups to develop a coding and classification scheme that could be applied to the entire set. This step led to the identification of four main codes within which to categorise groups: (1) fundraising groups, (2) awareness-raising groups, (3) support groups and (4) promote-a-site groups. The largest share of groups related to fundraising (44.7%), followed by awareness raising (38.1%). Through a summative content analysis, the researchers were able to learn how Facebook is used to promote support for breast cancer.

Advantages and disadvantages of content analysis

Conventional content analysis

Advantages

  • The ability to gain information directly from study participants without imposing preconceived categories
  • The establishment of data credibility through activities such as triangulation, negative case analysis and member checking

Disadvantages

  • There is potential for incomplete understanding of the context of the data, leading to codes that do not accurately represent the data
  • This method of analysis can be easily confused with other qualitative methods that share similar analytical approaches

Directed content analysis

Advantages

  • Existing theories can be supported and extended

Disadvantages

  • Researchers may approach the data with an informed yet visible bias based on the framework or theory that is being used. As a result, they may be more likely to find evidence that is supportive of, rather than non-supportive of, a particular theory.
  • An overemphasis on the theory can limit researchers’ understanding of contextual aspects of the phenomenon

Summative content analysis

Advantages

  • A simple and systematic way to study the phenomenon of interest
  • Provides basic insights into how words are used

Disadvantages

  • Limited by inattention to the broader meanings present in the data


Summary

Content analysis is a widely used qualitative research technique to systematically classify codes and identify categories or patterns within the data. There are three types of content analysis: conventional, directed and summative. These three types cover the spectrum of content analysis, from inductive (conventional) to deductive (directed and summative) techniques. Each type of content analysis has its own advantages and disadvantages, but should be chosen based on the researchers’ aims and objectives.


References

  1. Hsieh HF, Shannon SE. Three approaches to qualitative content analysis. Qual Health Res. 2005;15(9):1277-1288. doi:10.1177/1049732305276687
  2. Roberts CW. Other than counting words: a linguistic approach to content analysis. Soc Forces. 1989;68(1):147-177. doi:10.1093/sf/68.1.147
  3. Heydari H, Hojjat-Assari S, Almasian M, et al. Exploring health care providers’ perceptions about home-based palliative care in terminally ill cancer patients. BMC Palliat Care. 2019;18(1):66. doi:10.1186/s12904-019-0452-3
  4. Colorafi KJ, Evans B. Qualitative descriptive methods in health science research. HERD. 2016;9(4):16-25. doi:10.1177/1937586715614171
  5. Elo S, Kyngäs H. The qualitative content analysis process. J Adv Nurs. 2008;62(1):107-115. doi:10.1111/j.1365-2648.2007.04569.x
  6. Wei H, Watson J. Healthcare interprofessional team members’ perspectives on human caring: a directed content analysis study. Int J Nurs Sci. 2019;6(1):17-23. doi:10.1016/j.ijnss.2018.12.001
  7. Movahed E, Morowatisharifabad MA, Farokhzadian J, et al. Antiretroviral therapy adherence among people living with HIV: directed content analysis based on information-motivation-behavioral skills model. Int Q Community Health Educ. 2019;40(1):47-56. doi:10.1177/0272684X19858029
  8. Purkey E, Bayoumi I, Davison CM, et al. Directed content analysis: a life course approach to understanding the impacts of the COVID-19 pandemic with implications for public health and social service policy. PLoS One. 2022;17(12):e0278240. doi:10.1371/journal.pone.0278240
  9. Schaaf M, Boydell V, Topp SM, et al. A summative content analysis of how programmes to improve the right to sexual and reproductive health address power. BMJ Glob Health. 2022;7(4). doi:10.1136/bmjgh-2022-008438
  10. Bender JL, Jimenez-Marroquin MC, Jadad AR. Seeking support on Facebook: a content analysis of breast cancer groups. J Med Internet Res. 2011;13(1):e16. doi:10.2196/jmir.1560