Chapter 34: Data storage and access
Danielle Berkovic
Learning outcomes
Upon completion of this chapter, you should be able to:
- Understand the importance of data storage and access in the context of qualitative research.
- Identify the types of data that require considered management processes.
Why do data storage and access matter?
Managing qualitative data requires designated structures and systems, for two main reasons: (1) to protect participants’ privacy, confidentiality and anonymity, and (2) to organise data in a way that makes it easily accessible and retrievable.1 Data may include, but is not limited to, consent forms and participant demographics, interview and/or focus group transcripts, audiotapes or videotapes, researcher notes and other types of data described in Section 3 (e.g. River of Life drawings and diagrams, social media data). Storing qualitative data appropriately enables researchers to manage data that may have been accumulated in various forms, at different points in time and across multiple locations for different research purposes, all while maintaining the security of the data. It is therefore recommended that researchers establish appropriate data management protocols at the outset of a research project and use a reflective process to better systematise their data (see Table 34.1).
Data storage policies
Researchers should be mindful of their university’s or institution’s data management policies. In Australia, the National Health and Medical Research Council (NHMRC) provides guidelines supporting the Australian Code for the Responsible Conduct of Research, and researchers are required to adhere to the access and storage policies outlined in the document. For example, the Code states that researchers must ‘retain clear, accurate, secure and complete records of all research including research data and primary materials’.2 The NHMRC outlines two key reasons for appropriately storing data:
- Retention and publication: Appropriate storage enables justification of the research outcomes and facilitates sharing of the published findings. Primary qualitative data may be of cultural or historical value, or provide important insight into certain communities, so researchers should consider appropriate avenues through which to store their data. Published data also often requires some type of metadata, and although this is less applicable to qualitative research, data collection tools such as interview guides or researcher reflections may be appropriate to keep.
- Managing confidential and other sensitive information: Unless specified otherwise in the participant information sheet, prospective participants usually take part in research on the understanding that researchers will endeavour to protect their anonymity. Researchers are responsible for upholding this right by storing consent forms that contain names appropriately, either in a locked filing cabinet or scanned and saved in a password-protected folder. Individual universities and institutions will have their own policies about this, and it is important to check them at the outset of your research when developing your data storage and access protocols. This responsibility also includes allocating pseudonyms to participants in interview transcripts.
Universities and other research institutions have policies that explain how qualitative data must be stored, the length of time for which it should be stored, and when electronic archives are to be erased and physical archives destroyed. In particular, electronic data should be stored on a secure server that is only accessible by the researchers named on the ethics application. Some Human Research Ethics Committees will not approve studies that use servers and/or storage services located outside of Australia (for example, Dropbox). These are standard research governance policies, which researchers must be aware of, and a vital step in the research process that research ethics committees will also consider prior to approval. In addition, most universities or other research institutions will mandate that data storage and management processes are detailed in the participant information sheet or consent forms provided to prospective participants.3 Guidance on handling these sheets, in addition to other research materials, is provided in Table 34.1, based on the authors’ experiences in conducting qualitative research.
Table 34.1: Examples of data storage and access protocols
Research materials or data | Storage and access considerations | Example storage and access solutions |
---|---|---|
Consent forms | • Consent forms usually contain the participant’s name and can sometimes include their contact information. • Consent forms often explain the research context, and therefore are revealing of participants’ experiences, which the participant may prefer to keep private outside of the research. | • If the consent form is a hard copy version, the researcher should place it in a locked filing cabinet at the designated research facility. The researcher should also scan the consent form and store it digitally in a folder that is protected and only accessible by the named researchers. The same process applies to electronic consent forms. |
Audio and video recordings | • Audio and video recordings usually contain the participant’s name (assuming that an interviewer, for example, refers to the participant by their name at least once). • Audio and video recordings may also detail private information that the participant is only disclosing for the purposes of the interview or focus group. | • The researcher should download the audio and/or video recording onto a computer and delete the recording from the original device. • The researcher should save the file in a folder that is protected and only accessible by the named researchers. |
Transcripts | • Transcripts usually contain the participant’s name, assuming that the interviewer refers to the participant at least once in the recording. • Transcripts may also detail private information that the participant is only disclosing for the purposes of the interview. | • The researcher should save the transcripts in a folder that is protected and only accessible by the named researchers. • Transcripts should be de-identified by replacing participants’ names with numbers or pseudonyms. |
Field notes | • Depending on what the researcher is detailing at the time, field notes may contain identifying information about research participants who are being observed. | • The researcher should save the notes in a folder that is protected and only accessible by the named researchers. |
Researcher reflections | • Depending on what the researcher is reflecting upon, their reflections may contain identifying information about participants. | • The researcher should re-read their reflections to check for potentially identifying information about participants. If there is no identifying participant information, the researcher may choose to store this data as it best assists them. • If the reflections contain identifying information, the researcher should save the reflections in a folder that is protected and only accessible by the named researchers. |
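
The de-identification step listed for transcripts in Table 34.1 can be illustrated with a minimal Python sketch. It assumes the researcher already has a list of participant names and that plain whole-word matching is sufficient. The function names (`build_pseudonym_key`, `deidentify`), the `P01`-style codes and the example transcript are hypothetical and are not drawn from any policy or study. Automated replacement will not catch nicknames, misspellings or indirectly identifying details, so a manual read-through is still needed.

```python
import re


def build_pseudonym_key(names):
    """Map each real name to a code such as P01, P02 (in order of first appearance)."""
    mapping = {}
    for name in names:
        if name not in mapping:
            mapping[name] = f"P{len(mapping) + 1:02d}"
    return mapping


def deidentify(text, mapping):
    """Replace whole-word, case-insensitive occurrences of each name with its code."""
    # Longest names first, so a full name is replaced before a shorter name inside it.
    for name in sorted(mapping, key=len, reverse=True):
        pattern = re.compile(rf"\b{re.escape(name)}\b", flags=re.IGNORECASE)
        text = pattern.sub(mapping[name], text)
    return text


# Hypothetical example data, invented for illustration only.
mapping = build_pseudonym_key(["Alex", "Priya"])
transcript = "Interviewer: Thanks, Alex. Priya mentioned you work nights."
clean = deidentify(transcript, mapping)
print(clean)
```

The mapping between names and codes is itself identifying, so it would need to be stored separately from the de-identified transcripts, in a location restricted to the named researchers.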
Differences between sharing ‘raw’ qualitative and quantitative data
As qualitative research gains prominence and is increasingly published in high-impact journals, the concept of data sharing is being discussed among journal editors, manuscript reviewers and researchers. Raw qualitative data – data that has not yet been analysed or aggregated – is distinct from raw quantitative data; how to adequately share qualitative data without compromising participant anonymity is a key consideration. Four challenges and potential solutions for sharing qualitative data are described here:4
- Qualitative data is unique: Qualitative data lends itself to generating new theoretical or practical insights about a phenomenon of interest, in greater detail than is possible through quantitative research. This qualitative data is usually not collected in a linear fashion. Thus, the measure of reliability in quantitative research (e.g. producing the statistical code used to analyse anonymised data) does not translate to qualitative data. Instead, researchers may describe verification strategies for how analytic codes were developed, produce reporting checklists or calculate inter-rater reliability to compare agreement between data coders.
- Reproducible research and qualitative data: The iterative nature of qualitative data collection, analysis and interpretation means that the process of data verification is challenging. This is dissimilar to quantitative research, where enough detail about the methods should be described to ensure it is independently reproducible. This concept is not transferable to qualitative research; for example, even if an interview or focus group is audio-recorded, the recording cannot show the body language of participants, which may contribute data on their emotional response to a particular topic of discussion. Sharing qualitative data for reproducibility purposes is unlikely to produce the same results, as each population group, and even the same population group on a different day, is likely to contribute different data.
- Preserving the anonymity of participants: Maintaining participants’ anonymity is a pillar of qualitative research ethics, and so all data collected would need to be de-identified prior to sharing. Although this may be possible, it places undue burden on the participant to ask that their entire de-identified transcript be made publicly available, as their experiences alone may be identifying, or at least sensitive. De-identifying data for the purpose of data sharing also introduces unnecessary potential for human error and inconsistency, which in turn risks the participants’ privacy and the broader population’s confidence to participate in qualitative research. Certain types of studies are at particular risk of deductive disclosure, especially where participants are from easily identifiable minority groups or specialised professions. Instead of a data-sharing policy, researchers could be required to follow procedures that enhance transparency; for example, disclosing the transcription methods employed, the processes for analytic code development and how the final themes in the study’s findings were determined. Supporting the study’s findings with an adequate number of quotes in research publications is also key.
- Other unintended consequences of qualitative data sharing: The burden of organising qualitative data to the extent that it is ready for external review is likely to exceed the time and effort required to write a report for publication. Given that qualitative research aims to convey people’s experiences, it could be argued that the researcher’s energy should first and foremost focus on displaying the data and interpreting its meaning. Journals should consider the possibility that, in response to data-sharing policies, participants and researchers may alter their questions and responses, similarly to how participants might alter their behaviour to be considered more ‘desirable’ if they are being observed by a researcher.
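The inter-rater reliability mentioned in the first challenge above can be quantified in several ways. One common choice, assumed here for illustration and not prescribed by this chapter, is Cohen’s kappa for two coders: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance. The sketch below is a minimal implementation; the code labels and excerpt assignments are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one code to the same excerpts."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of excerpts.")
    n = len(coder_a)
    # Observed agreement: share of excerpts given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: based on how often each coder used each code overall.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders applied one identical code to every excerpt
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to six interview excerpts.
coder_a = ["barrier", "barrier", "support", "support", "barrier", "support"]
coder_b = ["barrier", "support", "support", "support", "barrier", "support"]
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 2))
```

Here the coders agree on five of six excerpts (p_o = 5/6) and chance agreement is p_e = 0.5, giving a kappa of about 0.67. How such a value should be interpreted depends on the study and is a matter for the research team to justify.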
Summary
Ensuring that qualitative data is managed, stored and appropriately accessible is a key part of the research process, and should be considered by researchers at the outset of the project.
References
1. Lin LC. Data management and security in qualitative research. Dimens Crit Care Nurs. 2009;28(3):132-137. doi:10.1097/DCC.0b013e31819aeff6
2. National Health and Medical Research Council. Management of Data and Information in Research: A guide supporting the Australian Code for the Responsible Conduct of Research. 2018. Accessed March 31, 2023. https://www.nhmrc.gov.au/sites/default/files/documents/attachments/Management-of-Data-and-Information-in-Research.pdf
3. McCrae N, Murray J. When to delete recorded qualitative research data. Res Ethics. 2008;4(2):76-77. doi:10.1177/174701610800400211
4. Tsai AC, Kohrt BA, Matthews LT, et al. Promises and pitfalls of data sharing in qualitative research. Soc Sci Med. 2016;169:191-198. doi:10.1016/j.socscimed.2016.08.004