I attended a full-day workshop called "Big: Culture and Data in the Digital Field" at the end of last week (I've been away on vacation for a week since then with no internet access). The workshop was organised by Professor Tom Boellstorff at the UCI Center for Ethnography and also by the Intel Science & Technology Center for Social Computing (ISTC). Several people I "know of" but had never met IRL were there and that's always nice. The idea of organising the workshop apparently grew out of ideas of Tom's from when he wrote a book and a First Monday article about "metadata labor".
There were around 10 presenters (who each only got to give a 15-minute talk - perfect length) and lots of time for discussions and coffee. The organisation of the event was in fact exemplary. It would be hard (but not impossible) to surpass the organisation, but very easy to do a lot worse. There were basically three cycles that went like this:
- Three presenters giving 15-minute talks (with PowerPoint slides, movies etc.)
- A "fishbowl" conversation exercise (geared towards picking up ideas from the audience)
- A 30-minute general conversation (geared towards the presenters and the presentations)
- Break (lunch or coffee break)
On my sabbatical, I've for the most part hunkered down behind my computer, writing. I haven't gone to very many talks and seminars. I have had lots of time to read (my daily "fix" - or "allotment"), but I realise that most of the new ideas I have gotten here at UCI have come from texts rather than from other people. It was therefore really nice to be part of a truly intellectual academic environment - if but for a day. I felt almost cleansed, being among (many) people, hearing about new projects and taking part in a firework of ideas! Also, I met several people I really got on well with and will meet again here in Irvine - and hopefully keep in touch with afterwards too!
This will (of course) not be a walk-through of the whole day and all the speakers, I'll just discuss a few of my personal highlights of the day.
- Paul Dourish talked about a meeting he had just attended where he was told that "Big Data" is sooo last year because now it's all about the three V's: Volume, Velocity and Variety. Tom reflected on the velocity of terms, something he saw as a dangerous trend. I agree. I would hate to have a text rejected for using "last year's terms" and I would hate for scholarship to degenerate into being about who is first or best at coining, defining and using the latest, trending terms, rather than being about clear thoughts and deep insights. I have, for a long time, flirted/struggled with the idea that so much of what we do in the academy is "phony" (a race for positional goods). I would hate to be (even further) proved right that that is the case. I mean, wouldn't it be terrible if people started to submit texts of little value (or worse - get them published)? Yeah, right - like that never happens today... I even totally opted out of the academic race for some years, but now I'm back at it (like a true sucker, or junkie).
- The previous point can be neatly tied to a conversation we had later during the day about academic overproduction. Mary L. Gray thought we should slow down. Talk more to each other and publish less. "We are producing too much of too little value" - and I agree 100%. She also said that the more we can publish, the more careful we should be about what we publish. As to that, am I contributing to publishing "too much" when I write blog posts here once or twice per week? Or is this part of the sought-after conversation she referred to? This all reminds me of Fred Brooks's quip from "The Mythical Man-Month" about diminishing returns: "Adding manpower to a late software project makes it later" and the corollary about the fact that it takes nine months to bear a baby no matter how many women are assigned to the task. As I write this, I come to think of Brooks's law being an example of Tainter's law of diminishing returns on increasing complexity. I should think some more about the connection between Tainter and Brooks at some point - I know there is one but this is not the time to delve into that.
To summarise the two bullets above, the workshop raised questions (for me) about how big data is shifting our perception and our behaviour as individuals, as citizens and as teachers, academics and scholars. How does the idea and the practice of big data shift our perception of what is possible and what isn't possible, what is easy and what isn't easy, what is of value and what isn't of value, what is right and what is wrong? I'm the born sceptic, concentrating as much (or more) on what we lose as on what we gain. Still and even as a sceptic, I find this project extremely interesting and it would never have come about without big data and number crunching. In fact, I think that research project is an excellent example of the upside of big data as well as of innovative interdisciplinary collaboration.
One of the speakers was anthropologist Morten Axel Pedersen from Copenhagen University (official homepage, own homepage) who talked about "Complementary social science? Reflections from a Deep Data Experiment". More specifically, his talk was about a project called SensibleDTU (DTU = the Technical University of Denmark). The project wants to "map social networks in real-time" and at the heart of the project is the fact that they have handed out 1000 (!) Google Nexus 4 smartphones to first-year DTU students with the caveat that the students agree to have their social interactions with others tracked and mapped by the DTU-developed "SensibleDTU app". The SensibleDTU project is a "large" study. The research project is interested in many different questions, for example "how friendship and networks of behaviour form offline and online" and "how information and influence is transmitted and transformed in the DTU 'social fabric'". Apart from the data collected through the app, the researchers also have access to the students themselves (interviews etc.). The project will run (collect data) for as long as the students keep their phones (i.e. after 12-18-24 months it is expected that students will buy new phones and drop out of the project in terms of generating new data). This project obviously raises huge ethical questions (none were discussed at the workshop), but the students themselves are apparently more interested in the technical back-end than they are worried about in-depth surveillance... (this was a technical university). The project employs (as far as I understood) around 10 people (half of them PhD students) and they (Morten) will organise a workshop about [something] in Copenhagen in the autumn. I talked to Morten and we (KTH/CSC/MID - STP, MID4S, EID) should be there! This is a project that KTH/CSC/MID (together with other institutions) definitely could replicate. We have the students and the in-house know-how to pull something like this off!
Morten discussed mixed methods (big data + social science) in his talk. With an increasing number of users and an increasing volume of data (bits/user), the DTU project is collecting "deep data". The combination of big data + social science/ethnographic interpretation can lead to something Morten called "thick data". These are just terms I'm throwing around here in order not to forget them, but I have little deep understanding of what "deep data" and "thick data" mean in practice or what the project status is (are they collecting data now?).
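I don't know the internals of the SensibleDTU app, but the core idea of "mapping social networks in real-time" from phone logs can be sketched: repeated co-presence events (say, from Bluetooth proximity scans) are aggregated into a weighted social graph. Everything below - the log format, the student IDs, the function name - is hypothetical, just a minimal illustration of the technique:

```python
from collections import Counter
from itertools import combinations

def build_social_graph(encounters):
    """Count pairwise co-presence events into weighted edges.

    `encounters` is a list of (timestamp, set_of_student_ids) records,
    e.g. derived from periodic Bluetooth proximity scans.
    """
    edges = Counter()
    for _timestamp, group in encounters:
        # Every pair present in the same scan window gets one edge count.
        for a, b in combinations(sorted(group), 2):
            edges[(a, b)] += 1
    return edges

# Toy data: three scan windows with overlapping groups of students.
log = [
    (1, {"s1", "s2", "s3"}),
    (2, {"s1", "s2"}),
    (3, {"s2", "s3"}),
]
graph = build_social_graph(log)
print(graph[("s1", "s2")])  # → 2 (co-present in windows 1 and 2)
```

The interesting ("deep data") part is of course everything this sketch leaves out: how edge weights change over time, and how the ethnographic material is layered on top of the graph.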
Comment (140421): Although I didn't hear anything about ethical concerns at the workshop, Morten read this blog post and added that "the project have very profound and very reflective ethical rules and practices built into its very edifice".
Malte Ziewitz from NYU talked about "unscaling ethnography" and about big data vs "small data moments". Here's a small data moment: "why did you do that?", "well, I was hungry and wanted to go to lunch so I had to finish the experiment early"... Several speakers talked about the value of mixed methods. You can get a lot of data (logged) from "within" the computer system, but some things will forever elude you (that telephone call between two friends that sets one of them into motion). Complex problems will be hard to understand with only quantitative or qualitative methods and data. Geoff Bowker (Chair of informatics at UCI and leader of the Values and Design lab) gave a dazzling high-speed talk ("Make It So, Data") with interesting juxtapositions of cool pictures and neat sound bites and he said something to the effect that "people are afraid of getting eaten by the cookie monster, but, people who don't love cookies must have had troubled childhoods..." :-) Geoff had also created a histogram of the presence of the terms "knowledge", "information" and "data" during the last 200 years and in the 1960's, during the computer revolution, these terms traded places; "knowledge" was on top but switched places with "data", which was previously at the bottom. His tongue-in-cheek conclusion was that the 1960's was the decade when knowledge was overtaken by information and data (and, it goes without saying that we are presumably suffering the consequences of that shift today).
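I don't know how Geoff built his histogram, but the general technique - counting term occurrences per decade across a dated corpus - is easy to sketch. The function name, the tiny corpus and the terms-tuple below are my own illustration, not his method; a real study would use something like the Google Books n-gram data and normalise by corpus size per decade:

```python
from collections import defaultdict

TERMS = ("knowledge", "information", "data")

def term_counts_by_decade(documents):
    """Tally occurrences of each tracked term per decade.

    `documents` is a list of (year, text) pairs.
    """
    counts = defaultdict(lambda: dict.fromkeys(TERMS, 0))
    for year, text in documents:
        decade = (year // 10) * 10  # e.g. 1968 -> 1960
        words = text.lower().split()
        for term in TERMS:
            counts[decade][term] += words.count(term)
    return dict(counts)

# Toy corpus: two dated snippets.
corpus = [
    (1955, "knowledge is power and knowledge endures"),
    (1968, "data data data and some information"),
]
result = term_counts_by_decade(corpus)
print(result[1960]["data"])  # → 3
```

Plotting these per-decade tallies is what would reveal the kind of 1960s crossover Geoff described.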
Christine Borgman (Wikipedia) leads the Data Practices team at UCLA and talked about "The data citation dance" (she also referred to a report with the great title "Out of cite, out of mind"). As we collect larger amounts of data, big data will change publishing and publications and there are new roles to be filled. Christine quoted [someone] who said something to the effect that "if publications are the stars and planets of the scientific universe, data are the "dark matter" - influential but largely unobserved in our mapping process". You should for example be able to get credit (including tenure) for taking care of and organising large data sets that are useful to other researchers, but you don't (today). Christine compared this to physics where there can be a lot of authors on a paper (including the name of someone being responsible for the scientific instruments). What (to us) seemed praiseworthy turns out to be more fraught with difficulties when you dig down - but it still represents a first attempt at something that will become more important in the social sciences as big data becomes bigger and more important (see the SensibleDTU project above!). This made me remember a slogan I've heard: "data is the new oil" (or, "data is the oil of the 21st century"). This is an intriguing notion even though I personally think oil is the new oil (as well as the old oil), because without oil (energy), nothing will run and data will lose much of its current allure.
Here are some random observations and comments that I can't (or couldn't care to) attribute to the correct person who raised the issue in question at the workshop:
- The quantified self (QS) community already think about themselves in terms of large data sets.
- Ubiquitous computing, Algorithmic living, Quantified self and Big data are terms that fly around today. Anthropologists are big on "kinship", so, what's the kinship between these terms?
- We explicitly assume that what is mentioned and talked about the most is what is most important and means the most. But what if some things are really important but also taboo to talk about? What are the things that are really important, but that we do not talk about today (at the workshop, on the Internet and in society)?
- When researchers interview an informant, the informant can say "you can use the data I provide you with if I can read your text first". But what can you "interrogate" big data about? How can you correct errors propagating in the network (the baseline is that you can't).
- Social media is archived and lasts "forever". What are the consequences? It's like you yell something stupid out of the window when you're 18 and it won't ever go away. It will become part of your identity. My thought: is that like a stigma (something that Goffman has written about)?
- What then could a "Big data stigma" be? What if the potential of a stigma exists that neither you nor anyone else knows about? Still, the potential of that stigma exists - if someone just has access to and manages to get the right information together. How would your behaviour differ from that of someone who had a physical stigma (like a mark on the body)? How would everyone's behaviour change if everyone (potentially) could be stigmatised, if everyone unbeknownst had 1000 "potential" stigmas, readily available to someone with enough data about you? Would that have a chilling effect on people/society? See further Lundblad's excellent 2004 paper "Privacy in a noise society" (short version here, long version here).
I ended by asking a sceptical, contrarian question, basically repeating the question that came to me and was the impetus to write a paper about "Ubiquitous information in a world of limitations" back in 2010. We all assume that big data is getting bigger, and even bigger, and moving faster. But this actually assumes a lot of things (BAU). While we can extrapolate, we don't really know very much about the future. It might turn out to be very different (e.g. disruptive change, "In times of disruptive change your expected future is no longer valid"). I also shamelessly self-promoted just the tiniest bit, referring to my upcoming (end of May) UCI talk about "Peak computing" (more on that later) but it didn't really take (no-one asked about it later for example).
Here's the text that framed the workshop and that made me decide to attend it:
Despite first appearing in an academic publication only in 2003, the term “big data” has swiftly become central to technology and social science. While bearing deep histories, big data is clearly linked to developments in computational storage, algorithmic analysis, mobile devices, and online sociality. But big data is also debated in the blogosphere, portrayed in mass media, discussed in everyday life.
The goal of this workshop is to take these multiple meanings and practices of big data seriously by placing them in conversation with ethnographic methods. Big data has sometimes been said to imply the “death of ethnographic methods” because it ostensibly provides a more comprehensive, accurate, or unbiased view of social life. In this workshop, however, we explore emergent synergies between ethnographic methods and big data. While some speak of a quantitative versus qualitative divide as foundational to social inquiry, there is value in exploring the possibly more consequential distinction between experimental methods “in” a laboratory (based on the control of variables) versus fieldwork methods “out” in the world (based on empirically investigating contexts preexisting the research process).
From this perspective, big data and ethnography lie on the same side of a divide that separates them from laboratory approaches. Both are forms of engagement with “the field.” As a result, considering new possibilities for their creative entanglement and mutual reconfiguration could present “big” possibilities for investigating the digital dimensions of contemporary cultures.
Oh, and one more thing (if you've read this far). I met some really cool people at the workshop:
- Judith Gregory is the co-director of the UCI EVOKE & Values in Design Laboratory. She does work on the Quantified Self. We hit it off and talked non-stop for an hour and didn't really get around to talking about the Quantified Self - so we just have to meet and talk some more. I actually think someone recommended that I talk to her some time ago (which I didn't). I'm just so happy to have hooked up with Judith as we had an eclectic and electric conversation about so many different things.
- Just as the event was winding down, I also met and talked intensively with UCSD sociology PhD student Joan Donovan who is also a bona fide social media activist-guru, and who seems to know everyone. She also has Manuel Castells as her advisor and she's working with Interoccupy.net ("Connect. Collaborate. Organize.", "InterOccupy is an interactive space for activists looking to organize for global and local social change"). Her private blog is Occupy the Social. We talked about the future of work and I presented her with the concept of "empty labour" while she retorted by presenting me with "the cognitariat".
- I finally chatted just a little with Bill Maurer who is a UCI professor of anthropology. He directed me to the Institute for Money, Technology and Financial Inclusion (IMTFI) as well as to some Swedes who are doing work in "Valuation Studies" (new journal).