Graduate Writing: Writing About Data

Data can take many forms – quantitative or qualitative – and is essential for building claims in the results and discussion sections of a research paper or thesis. Such data acts as evidence to answer the research question or test the hypothesis that was established at the beginning of the text.

Although conventions differ across disciplines, three general principles of data communication apply: (1) selection, (2) hedging, and (3) visual representation.

Selection

Raw data is seldom shared with readers; instead, readers expect writers to organize the data logically into more understandable information (e.g., percentages, frequencies). How this information is organized depends on the type of data that was collected or generated. For instance, interview data is generally presented as quotations incorporated into the text of the paper, but tables may also be used and referred to (e.g., to capture the demographics of respondents).
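As a rough illustration of turning raw data into frequencies and percentages, the sketch below tallies a short list of survey responses. The responses, the variable name `raw_responses`, and the helper `summarize` are hypothetical and were invented for this example; they do not come from any study cited in this guide.

```python
from collections import Counter

# Hypothetical raw responses, invented purely for illustration.
raw_responses = [
    "agree", "agree", "neutral", "disagree",
    "agree", "neutral", "agree", "agree",
]


def summarize(responses):
    """Return (category, frequency, percentage) rows, most frequent first."""
    counts = Counter(responses)
    total = sum(counts.values())
    return [
        (category, n, round(100 * n / total, 1))
        for category, n in counts.most_common()
    ]


# Print each category in a form that could be adapted for a table or sentence.
for category, n, pct in summarize(raw_responses):
    print(f"{category}: n = {n} ({pct}%)")
```

A summary of this kind could then be reported in a table or in a sentence, rather than listing every individual response.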

Hedging

Hedging refers to language that is used to communicate the strength of one’s claim. By qualifying or moderating claims, writers recognize the limits of the results that they are reporting.

Consider the following four examples. In the weaker claims, the hedge is the modal verb (may, could).

  • Daily physical exercise improves test scores (strong claim).
  • Daily physical exercise will improve test scores (strong claim).
  • Daily physical exercise may improve test scores (weaker claim).
  • Daily physical exercise could improve test scores (weaker claim).

In using such language, writers signal a more nuanced understanding of data interpretation and distinguish between concepts of causation and correlation.

For more examples of hedging statements, visit the Academic Phrasebank hosted by the University of Manchester.

Visual representation

Quantitative data is conventionally expressed in a visual format – for instance, as a chart, graph, histogram, table, or data map.

  • Unsure of how to depict your data? Explore this flow chart created by Abelda (2009) that helps distinguish among different chart types.

Qualitative data can also be expressed visually – for instance, as a figure, map or model, word cloud, timeline, flow chart, Venn diagram, taxonomy, or metaphorical/relational visual display (e.g., using images or icons). Some software products build options for data visualization into their systems (e.g., NVivo).

When embedding visual representations, it is essential that these are labelled (e.g., Figure 1, Table 1) and described in the prose of the text. Such descriptions should convey the most important aspect of the visual (e.g., Figure 1 provides an overview of population changes in Saskatchewan between 2000 and 2020).

For example phrases that are used to report results, visit the Academic Phrasebank hosted by the University of Manchester.

Looking for more information on writing about data?

Structuring Data Commentary

When it comes to reporting data, standard structures are popular, if not universal. Consider the following as outlined by Swales and Feak (2013) in Academic Writing for Graduate Students: Essential Tasks and Skills. Does this structure align with how you have seen data reported in your discipline?

  • Location elements and/or summary statements
    • Reference to a visual representation (e.g., table, figure)
    • Brief description of visual representation
  • Highlighting statements
    • Generalizations about the presented data (e.g., trends)
  • Discussions of implications, problems, expectations, recommendations, etc.
    • Closer analysis of the data, based on own expertise

Data communication in action

Sample 1

Review the following excerpt from Swales and Feak, which focuses on computer viruses. How easy or difficult do you find it to read? What language has been used to communicate trends? How has the information been ordered?

Table 5 shows the most common sources of infection for U.S. businesses. As can be seen, in a great majority of cases, the entry point of the virus infection can be detected, with e-mail attachments being responsible for nearly 9 out of 10 viruses. This very high percentage is increasingly alarming, especially since with a certain amount of caution such infections are largely preventable. In consequence, e-mail users should be wary of all attachments, even those from a trusted colleague. In addition, all computers used for e-mail need to have a current version of a good antivirus program whose virus definitions are updated regularly. While it may be possible to lessen the likelihood of downloading an infected file, businesses are still vulnerable to computer virus problems because of human error and the threat of new, quickly spreading viruses that cannot be identified by antivirus software.

Source: Swales, J. & Feak, C. (2013). Academic Writing for Graduate Students: Essential Tasks and Skills. The University of Michigan Press. p. 116.

Let’s break down the paragraph.

Table 5 shows the most common sources of infection for U.S. businesses.

  • Location element (Table 5)
  • Summary statement (reference to sources of computer virus infection)

As can be seen, in a great majority of cases, the entry point of the virus infection can be detected, with e-mail attachments being responsible for nearly 9 out of 10 viruses.

  • Highlighting statement (general trends)

This very high percentage is increasingly alarming, especially since with a certain amount of caution such infections are largely preventable. In consequence, e-mail users should be wary of all attachments, even those from a trusted colleague. In addition, all computers used for e-mail need to have a current version of a good antivirus program whose virus definitions are updated regularly. While it may be possible to lessen the likelihood of downloading an infected file, businesses are still vulnerable to computer virus problems because of human error and the threat of new, quickly spreading viruses that cannot be identified by antivirus software.

  • Implications (why the data is concerning)

Sample 2

Review the following excerpt from Primary Health Care Research & Development, which focuses on participant responses to a webinar series. How easy or difficult do you find it to read? What language has been used to communicate trends? How has the information been ordered? To what extent does it adhere to the structure identified by Swales and Feak?

Sixty-eight individuals participated in at least one webinar, and 46 post-webinar surveys were completed. Across all webinars, most participants belonged to the nursing profession (Table 1). A minority of webinar participants were memory clinic team members; however, this group comprised the majority of survey respondents. The survey response rate ranged from 42% to 67% across the three webinars, with FP/NPs [family physician/nurse practitioner] accounting for the majority of survey respondents (Table 2).

Among survey respondents across the webinars, overall satisfaction was high (94%) (Supplemental Table 3). Regarding webinar content, the majority agreed the sessions were appropriate for their professional needs and new information was learned (96%). Most found the interactive format and webinar environment effective for learning (96%). A majority also intended to apply the information in their practice and appreciated the participation of other PHC [primary health care] teams and professionals; however, these particular items were endorsed by fewer participants (<90%).

In open-ended comments, the most effective aspects of the webinar identified by survey respondents were the webinar topics and interactive question and answer format (Table 3). Primarily AHPs [allied health professionals] commented on topic effectiveness (n = 6 of 10 comments) (data not shown in table).

Source: Kosteniuk, J., Morgan, D., O’Connell, M. E., Seitz, D., Elliot, V., Bayly, M., … Froehlich Chow, A. (2022). Dementia-related continuing education for rural interprofessional primary health care in Saskatchewan, Canada: perceptions and needs of webinar participants. Primary Health Care Research & Development, 23, e32. doi:10.1017/S1463423622000226

Let’s break down the first paragraph.

Sixty-eight individuals participated in at least one webinar, and 46 post-webinar surveys were completed.

  • Basic description of data

Across all webinars, most participants belonged to the nursing profession (Table 1).

  • Location element (Table 1)
  • Summary statement (description of professional affiliation)

A minority of webinar participants were memory clinic team members; however, this group comprised the majority of survey respondents.

  • Highlighting statement (survey trends)

The survey response rate ranged from 42% to 67% across the three webinars, with FP/NPs [family physician/nurse practitioner] accounting for the majority of survey respondents (Table 2).

  • Highlighting statement (survey trends)

In this example, we see that the first two components identified by Swales and Feak have been included. This pattern continues through the rest of the excerpt; no discussion is featured in this sample. Instead, the third component – implications – appears later, in the discussion section of the article (e.g., “Survey respondents identified session topics as the most effective webinar feature, possibly due partly to our efforts to organize webinars that met teams’ earlier topic suggestions”).

Reviewing articles and theses in your discipline will help you to develop a deeper understanding of data commentary within your field; for instance, some disciplines expect implications to be discussed within the results section, while others prefer that this information be addressed in a separate discussion section.