Epidemiology, the “Data Deluge,” and the Problem of “Good” Information

Walking down the halls of a public health agency in the fall of 2009, I quickly became recognizable as the person doing research on information-sharing and sensemaking during infectious disease outbreaks. Two weeks into my tenure, I started being hailed by my academic association and playfully taunted with echoes of my research question: “Hey, Berkeley! Have you figured out the problem of information yet?”

The joke belied the fact that people were often extremely eager to talk about the various issues associated with information in public health: gathering data, getting access to various types of data or information, deciphering information in the form of graphs or tables or numbers, generating and recirculating information, and discerning what was often referred to as any “actionable information” that might be used to help halt the spread of a growing pandemic. Often after I explained the research goals of the interdisciplinary team project I was on, people would let out an audible sigh expressing an “information fatigue” brought on by dealing with the daily glut. The public health professionals I knew well or interviewed – working in public health agencies in the United States and Hong Kong – habitually referred to the steady stream of emails, phone calls, meetings, and teleconferences as part of a “sea of information” or a veritable “data deluge.” Already taxed with their regular duties of disease surveillance, prevention efforts, and outbreak response, public health workers everywhere felt that their burdens had increased exponentially throughout the first ten months of the 2009 H1N1 pandemic.

People regularly complained about “drowning” in information, about being bowled over by a never-ending series of “waves” of data, about having “barely a drop” of usable information in the oceans that crossed their desks each day. I rapidly discovered that the collective goal wasn’t necessarily to become adept swimmers; rather, it seemed to be simply learning to tread water in the midst of a virtual sea of information. The experts and analysts I worked alongside or interviewed throughout the year-long pandemic continuously voiced a common longing for a more permanent solution to the problem of too much information, for a method or practice or tool that might help them cope with the overflow produced by rapidly improving technological systems of data generation and information-sharing. In 2009, the primary problem was no longer necessarily one of getting access to information, but one of effectively coping with an overabundance of it.

Post-SARS in 2003, it had become apparent to those within the global public health community that information on infectious disease outbreaks of global importance needed to be: 1.) verifiable from a trusted or validated source; 2.) more readily circulated; and 3.) shared at a faster rate. The public health community’s subsequent emphasis on fostering greater transparency and information-sharing in public health, spearheaded by changes to the WHO’s system for reporting infectious diseases, including the revision of the International Health Regulations (IHR), solved some of the concerns over access to information, yet at the same time added an increased pressure to more quickly report validated – or good – information. The modern “myth” that increased transparency and access to more information would produce “better” information had been born. And yet, during the world’s first influenza pandemic in decades, it became increasingly apparent to everyone working in public health that more information was not necessarily better information. Instead, the reality of information-sharing during the 2009 H1N1 pandemic had highlighted other, more social – or human – problems tied to the quality of the information being readily shared.

The book I’m working on now examines how information in global public health networks is produced, managed, understood, and circulated during an outbreak. With the 2009 H1N1 pandemic as a specific case study for examining the social practice and politics of information-sharing, my data suggest that informal networks – consisting of personal relationships – were crucial to the process of sharing sensitive, unvalidated, or what people called “good” information. In particular, the recent drive to foster greater efficiency in information-sharing has in turn created various technological, scientific, and institutional temptations to decontextualize information in order to share it more quickly. The end result of all this is a problem of quality, not quantity. In other words, the largely political push toward greater transparency and faster information-sharing in public health has aggravated a need for what the people I worked with often called “context.”

As a concept used by public health professionals, context refers to details of personal or clinical experience and intuition about a disease outbreak. To them, context is the key to transforming uncertainty into certainty. To me, context as a concept refers to the human relationships and daily practices and experiences at the heart of both the production and understanding of epidemiological information. If “information” is more about the production and circulation of data or facts, then “context” is more about the production of knowledge and the circulation of experience and beliefs. Without context, “facts” (or the type of validated information that epidemiologists and scientists traffic in) are still viewed with a certain suspicion as to their soundness or applicability. Contextual information is the alchemic force that helps to turn “information” into “knowledge.” Without its attendant context, information produced and circulated during the pandemic was deemed mostly, if not entirely, useless.

Context lies at the very nexus of the human and the technological. It is the dividing point or connecting bridge between “data” and “knowledge” as well as the symbol of a chronic lack in the midst of informational overload. Throughout my fieldwork during the 2009 pandemic, the thing that people most wanted to acquire, what they spent the largest amount of their time trying to gain access to, was not more information about case counts, or symptoms, or even about virulence, but information about how people were aggregating, analyzing, and producing information about the outbreak. In essence, the public health professionals I knew were desperate to better understand their peers’ thinking processes. They believed that this type of contextual information would help them to better decide which pieces of generic information – or aggregated data – about the outbreak were most important. In sum, then, they wanted context to help them separate out the important signals from the collective noise. Context was considered key to making good decisions, to taking the right response actions.

To deal with the increasing volume of data and information, health organizations utilize a set of criteria for determining “good” information and have developed a protocol for information-sharing. Yet the epidemiologists who work within large public health institutions or agencies still have to individually “make sense” of each unique situation by using that set of criteria as a guideline. In order for certain response actions or decisions to take place, epidemiologists must rely upon each other’s analyses and personal judgments. The patterns of information-sharing and use of context that I observed in my fieldwork, described above, suggest to me that information in global public health moves through the following stages:

  1. gathering information, or aggregating data from unofficial, surveillance, or informal sources
  2. searching for and understanding context, or analyzing all previously aggregated information in light of personal opinions, unvalidated information, or contextual details of disease outbreaks
  3. producing, (re)circulating, and using “good” information to affect official response actions or recommendations for local action

While information on an outbreak might “look” exactly the same, the contextual information produced by people who interpret that information will necessarily be different. In other words, different conclusions will be based on the same information. This difference is qualitative and due to the common daily practice of producing contextual information that is itself based on the unique lived experiences of individuals working in public health. It is this type of past lived experience as context that global public health information systems have trouble sharing through any formal channels. A brief, but pertinent, example here: During my time observing analysts inside the public health agency, an outbreak of H1N1 occurred in a far-removed location that seemed as though it might be more “serious” than the milder outbreaks happening elsewhere. Lab data on viral samples collected from patients at this location were circulated freely, as was information on overall case counts and some clinical information. However, analysts complained that the “context” was still missing. They wanted to see some type of personalized interpretation of the lab results. They asked questions about who had conducted the lab tests and what type of assays had been used. They also wanted to hear from someone they already knew and trusted in the remote location to confirm the lab data and to talk about what was actually happening on the ground. They had questions about the political situation at the location that might be causing reports from news sources to be skewed and inaccurate. The multitude of teleconferences, meetings, emails and personal telephone calls which I observed throughout my fieldwork were all attempts to gather such context – all in a concerted, if misplaced, effort to qualify and quantify what was difficult for many individuals to describe, let alone to capture in an email or standardized form.

Scholars working on topics and issues associated with the development of formal information systems have coined a name for humans living in the so-called Information Age – inforgs. Inforgs are loosely defined as “interconnected informational organisms” that consist of both “biological agents and engineered artefacts” that live in a world “ultimately made of information, the infosphere” (Floridi 2010: 9). Floridi sees this transformation from human to inforg as something that is fundamentally “re-ontologizing” what it means to be human and to live in the 21st century (Floridi 2007). I find the concept of inforgs compelling, even if I also find myself pushing against such a too-easy neologism. The daily practices of checking emails, looking for the latest news online, and of livestreaming meetings are merely a few common examples of the practice of epidemiology in the infosphere. While many studies have paid attention to how various experts, such as the analysts discussed here, gather and consume information, little attention has been paid to the human/technology interface that produces such information in the first place. One solution might be to take the use of information and information technologies more seriously from an anthropological viewpoint.

Right now, I’m working through the issues of defining “good” information, the 21st century “data deluge,” and the role of context in an attempt to craft an ethnography of the daily practice of turning information into actionable knowledge. These issues not only highlight the various difficulties of gathering, analyzing, and reporting information related to the 2009 pandemic, but are also indicative of the messy and complex process of making sense out of a daily barrage of information in any scientific or data-driven field. The already nascent ‘anthropology of information’ needs to pay particular attention to points where the human and the technological become enmeshed with each other.

As Bowker and Star have argued, there is “a permanent tension between universal standardization” of information-sharing systems and “the local circumstances of those using them” (139). Efforts at further standardization of information systems in global public health are doomed only to worsen the problem if they fail to take the problem of context more seriously. And context can only be understood at the level of the social and the cultural – or the realm of anthropology. I see the anthropology of information as a field that has rich potential not only for further research but also for bridging the gap between theory and application. Information is a part of our daily lives; as inforgs, we need to get much better at understanding how we use it, think about it, and relate to it.


Theresa MacPhail received her PhD in Medical Anthropology from UC-Berkeley/UC-San Francisco. Her first book, Siren Song: A Pathography of Influenza and Global Public Health, is based on her dissertation research on the science and epidemiology of influenza in Hong Kong, the United States, and Europe. She is currently a Faculty Fellow/Assistant Professor in Science Studies in the John W. Draper Interdisciplinary Master’s Program in Humanities and Social Thought at New York University.

2 replies on “Epidemiology, the “Data Deluge,” and the Problem of “Good” Information”

I love this article. The idea that contextual knowledge is critical in understanding large amounts of information is a simple but under-appreciated insight. There are methods in quantitative disciplines like epidemiology to formally incorporate contextual knowledge into decisions about how to carry out quantitative analyses. In other words, inferences from statistical analyses hinge on the plausibility of the assumptions, and the plausibility of the assumptions can only be evaluated by contextual knowledge. I do not agree, however, that contextual knowledge is by nature “disciplinary” – either anthropological or sociological. It seems to me to emerge from “practice” – whether that is a physician who uses contextual knowledge from day-to-day experience that frames a steady stream of laboratory results or the findings of the latest randomized trial, or an intravenous drug user who understands the competing priorities of getting drugs, avoiding police, and getting money, all while trying to use clean needles.

My colleagues and I recently published an analysis that may be relevant to this conversation.

Thanks for this, Elvin. I look forward to reading the article and perhaps continuing the conversation. My next project will focus on the “anthropology of information” in epidemiology – specifically focusing on the process of turning information into “Actionable Knowledge”.