Mixing It With Big Data, Again

We can’t afford to ignore Big Data. Everybody is talking about it. Some worship at its altar. But an increasing number are providing more balanced accounts of its role in society, as two recent books suggest.

Alec Ross, a former advisor to the U.S. Secretary of State, considers innovation and globalism in The Industries of the Future (New York: Simon and Schuster, 2016), focusing on data as the “new material of the information age.” He sees digitization as another milestone in a line running from the earliest writing and recording systems to the advent of computers and networks. Ross reminds us that “ninety percent of the world’s digital data has been generated over the last two years” (p. 154), a truly astounding assessment. He looks at industries utilizing robotics, genomics, codification of the financial markets, technology’s use in weapons and warfare, and so forth.

This is not a book lauding Big Data. “As it becomes more ubiquitous, big data will fade from use as a buzz phrase. As it reaches into more and different aspects of our everyday lives, the combination of big data and behavioral science will subtly change our routines and expectations through a series of digital nudges that guide our choices through the day” (p. 180). This, for me, can’t happen fast enough. Ross also provides a sense of reality about what data represents. “Big data is, by its nature, soulless and uncreative. It nudges us this way and that for reasons we are not meant to understand. It strips us of our privacy and puts our mistakes, secrets, and scandals on public display. It reinforces stereotypes and historical bias. And it is largely unregulated because we need it for economic growth and because of efforts to try to regulate it have tended not to work; the technologies are too far reaching and are not built to recognize the national foundations of our world’s 196 sovereign nation states” (p. 184). These tensions are why Big Data is so intellectually engaging for me and my students.

The other book is Christine L. Borgman, Big Data, Little Data, No Data: Scholarship in the Networked World (Cambridge: MIT Press, 2015). Like her earlier works, this is a volume aimed at a “broad audience of stakeholders in research data” (p. xix), and it is certainly a goal well achieved. Borgman is a scholar careful to take into account the promises and pitfalls of the uses of computing: “It is the power of data, combined with their fragility, that makes them such a fascinating topic of study” (p. 4). She provides excellent historical context for what we now term Big Data, something that many avoid providing, instead addressing it as a new issue (when it really isn’t).

Borgman considers many issues relevant to the work of both librarians and archivists, or rather of the emerging kinds of librarians and archivists. She wrestles with the challenge of what to preserve and how, rejecting the idea that we can maintain everything. She is expressly focused on the challenges of maintaining the evidentiary record: “Cited objects disappear, links break, and search algorithms evolve as proprietary secrets” (p. 57). Borgman provides lengthy case studies of data scholarship in the sciences (such as astronomy), the social sciences (such as social media studies), and the humanities (such as classical art and archaeology). She raises the issue of data policy for sharing, opening, and reusing data and the challenges associated with implementing such policy. If we do not resolve these concerns, she forecasts a tough road ahead: “Data will remain ‘dark matter’ in scholarly communication unless they are described, curated, and made discoverable” (p. 241). In such an assessment, we can see the focus for new graduate studies in archives and libraries.

In a refreshing chapter in a book on Big Data, Borgman considers what to keep and why, concluding the book in this way: “To restate the premise of this book, the value of data lies in their use. Unless stakeholders can agree on what to keep and why, and invest in the invisible work necessary to sustain knowledge infrastructures, big data and little data alike will quickly become no data” (p. 287). Amen.








