Fenwick Mckelvey on Tue, 4 Jun 2013

<nettime> We are what we tweet: The Problem with a Big Data World when Everything You Say is Data

Hi Nettimers,
I have been a long-time reader of Nettime and I'd like to share a
piece I co-authored with Matt Tiessen and Luke Simcoe. We just posted
it on the blog Culture Digitally (http://culturedigitally.org/) and I
thought Nettime readers might also enjoy it. You can see it on the
site at: http://culturedigitally.org/2013/06/we-are-what-we-tweet-the-problem-with-a-big-data-world-when-everything-you-say-is-data-mined/-

We are what we tweet: The Problem with a Big Data World when
Everything You Say is Data

It was written by: Fenwick McKelvey, Matthew Tiessen & Luke Simcoe

Are we living in a simulated “reality”? Although a work of science
fiction from 1964, the book Simulacron-3 asks a question relevant to
our digitally-enabled world. The city from which the book takes its
name perturbs its inhabitants. Over the course of the novel, it turns
out that the city is a simulation run on a computer by scientists as a
market research experiment. The city is one giant public opinion poll,
data-mining the minds of millions of people to help companies and
governments make decisions. Though a cautionary tale written in an era
of Command and Control and Operations Research, the book accurately
describes the modern Internet and our present moment, one that both
facilitates our communication and informs market research. The primary
purpose of digital media, we contend, is simulation, with free
communication being merely an appealing side effect or distraction.

The growing mediation of everyday life by the Internet and social
media, coupled with Big Data mining and predictive analytics, is
turning the Internet into a simulation machine. The collective
activity of humanity provides the data that informs the decision
making processes of algorithmic systems such as high-frequency trading
and aggregated news services that, in turn, are owned by those who
wield global power and control: banks, corporations, governments. The
Internet is no longer a space primarily of communication, but of
simulation. By simulation, we do not mean a reproduction of reality
“as it is out there,” but rather a sort of reality-in-parallel, one
that generates its own sets of tangible quanta and its own
“realities”-to-be-calculated. The issue for our times, then, is not
that we Netizens – nor the inhabitants of Simulacron-3 – need to
grapple with the idea that we are living fake or fictitious virtual
lives; rather, we must come to terms with our own online activities
feeding the appetites of algorithmically-driven machines designed to
facilitate the expansion of profit and power by quantifying and
modulating our desires. We’ve become more valuable to the Internet and
its scanbots as aggregate data inputs than we ever were as consumers
of banner ads.

The imperative driving today’s Internet and mobile technology has more
to do with informing algorithms designed to manage plugged-in
populations than the facilitation of free expression. Social media is
valuable not because it involves tweets, opinions and desires
expressive of the general experience of being alive, but because this
data produces a useful resource for finance, business, and government
akin to the collective behaviour of Simulacron-3.

Counter-Culture and Control

Digital networks are making real the fantasies of the cyberneticists
and information theorists who first developed computing. Computers
promised to make the world more stable and easier to manage.
Marshall McLuhan echoed this promise in his famed Playboy interview
from 1969:

There’s nothing at all difficult about putting computers in the
position where they will be able to conduct carefully orchestrated
programing of the sensory life of whole populations. I know it sounds
rather science-fictional, but if you understood cybernetics you’d
realize we could do it today. The computer could program the media to
determine the given messages a people should hear in terms of their
over-all needs, creating a total media experience absorbed and
patterned by all the senses. We could program five hours less of TV in
Italy to promote the reading of newspapers during an election, or lay
on an additional 25 hours of TV in Venezuela to cool down the tribal
temperature raised by radio the preceding month. By such orchestrated
interplay of all media, whole cultures could now be programed in order
to improve and stabilize their emotional climate, just as we are
beginning to learn how to maintain equilibrium among the world’s
competing economies.

Computers, McLuhan understood, had the potential to observe and
intervene in the global emotional zeitgeist. Digital simulation
blankets the earth, keeping us warm and cozy in the face of an
unstable atmosphere. McLuhan’s idea that the world could reach
equilibrium echoed Norbert Wiener who called for cybernetic systems
integrating humans and machines as a way, at once, to avoid entropy
and achieve homeostasis. Wiener’s dreams also resonated with those of
Buckminster Fuller who sought as a generalist to create systems that
could make the world a better place. Such an idea drove the early work
on digital computing as described by Paul Edwards who characterized
the Semi-Automatic Ground Environment as a way to make the world more
predictable and manageable.

When first introduced, books like Simulacron-3 described the
managerial promise of computing as a dystopia. The worry was that
computers would become the next “Big Brother.” Fred Turner begins his
important history of computer culture with students burning IBM punch
cards because they did not want to be part of the machine. His book,
however, traces the evolution of computing from symbolizing totalizing
control to personal computers symbolizing personal liberation.

Digital computer networks still promise stability through simulation,
but this “closed world” promise has been refracted through the
counterculture and the cyberculture described by Fred Turner.
Resistance to the 1960s formation of the cybernetic dream has
ironically allowed a re-iteration of the same premise in the modern
age. Personal computers, it’s commonly held, allow humans to express
themselves more freely. We won’t argue with that claim, but let’s not
forget that this “free expression” is allowed to persist only to
nourish data aggregators and simulators.

Our position is similar to that of Darin Barney who described the
Internet as a standing reserve of bits. Digital mediation and
communication transforms social activity and information into a
Heideggerian “standing reserve” that can be consumed single-mindedly
as a resource. While Barney wrote his work before the emergence of
social media and big data, he nevertheless foreshadows our present,
one where ever-growing troves of human communication data reside in
global server farms as standing reserves of data. This phenomenon
might become clear with a few examples.

These days over 80 per cent of stock trading on Western financial
markets is done by computer algorithms that make decisions in
nanoseconds, as we’ve discussed previously. These decisions to trade
result not from human intuition (let alone discernments of
fuddy-duddy things such as “value”), but from the “logic” embedded in
these algorithms; as they feed upon a variety of information sources
as inputs, they output decisions of what to buy and sell. It is
becoming more apparent that the Internet has been wired as an input
for these decisions. For example, the BBC reports that Google searches
can predict stock market decisions. Indeed, United Airlines lost most
of its stock value in 2008 when the Google News aggregator mistakenly
categorized a six-year-old “news” story as current. The story
discussed the insolvency of the airline – long resolved – so investors
or more likely machines reacted promptly by dumping its stock. More
recently, hackers breached the Twitter account of the Associated Press
and tweeted that President Obama had been attacked. Algorithms
perceived a potential crisis and reacted instantly, causing the S&P
500 to lose $130 billion in “value.”

Firms increasingly trade in selling access to this future. Companies
like Recorded Future and Palantir now mine the data that the Internet
so freely provides and sell their predictions. Recorded Future, for
example, sells its products with the promise to “unlock the predictive
power of the web.” While these firms might differ in their algorithmic
systems, they all depend on access to the standing reserve of the Web.

The horizons of possibility that stem from the world of algorithmic
trading were prefigured by William Gibson, who ends his most recent
novel Zero History with antagonist turned protagonist Hubertus Bigend
able to predict the future using what he calls the “order flow:”

It’s the aggregate of all the orders in the market. Everything anyone
is about to buy or sell, all of it. Stocks, bonds, gold, anything. If
I understood him, that information exists, at any given moment, but
there’s no aggregator. It exists, constantly, but is unknowable. If
someone were able to aggregate that, the market would cease to be
real… Because the market is the inability to aggregate the order flow
at any given moment.

In the book’s final pages, Bigend confesses that he needs only the
briefest lead on the present – “seven seconds, in most cases” – to
construct his desired financial future. Knowing “seven seconds” ahead
is all anyone needs to eke out a profit. As both the fictional Bigend
and today’s very real army of stock trading bots attest, the value of
the Internet is to render accessible the data companies need to know
the future with greater certainty than ever before. Bigend wins in the
end because he has exclusive access to the order flow.

New Digital Divides?

The order flow demonstrates the digital divides being created by the
Internet as simulation machine. A divide emerges between those
included and excluded from participating in the future of preemption
and prediction. Those who are excluded no longer lack information, but
lack input in the algorithmically-driven future-making.

Though a divide exists between those qualified as inputs and those
filtered out as noise, another divide involves what (more so than who)
has access to the order flow – or what Twitter calls the “firehose” –
in the first place. Access to the firehose is one of the critical
questions about Big Data raised by danah boyd and Kate Crawford. A lack of
access to algorithmic control leads to a lack of on- and off-line
agential control insofar as the Internet’s end users – we the public –
lack access to the present: to our own flows of data. The order flow
is too valuable or too dangerous to be entirely public.

It follows that social and financial exclusion and asymmetries of
power will intensify as the Internet succeeds in making its
algorithmically-sorted simulation “real.” Each iteration of the Web
further disavows uncertainty, contributing to an “evidence-based” and
data-driven simulation that, increasingly, will overcode our everyday
lives. Those inputs that cannot be aggregated will be removed or
ignored. There is little hope of, or need for, the system self-correcting
should things go “wrong.” Those invested in the simulation do not
worry about the outcome as much as they do about the probability of its
realization. A future that is unequal and knowable is better than an
unknowable – or uncertain – future concocted with freedom, equality,
and flourishing as its goal.

Embracing the New Trickster: Vacuoles of Non-Communication and Other
Circuit Breakers

Our simulated future requires new tactics of resistance. In this
regard we’re inspired – as are many others – by the work of Gilles
Deleuze. Deleuze once presciently wrote:

Maybe speech and communication have been corrupted. They’re thoroughly
permeated by money – and not by accident but by their very nature.
We’ve got to hijack speech. Creating has always been something
different from communicating. The key thing may be to create vacuoles
of noncommunication, circuit breakers, so we can elude control.

These are the tasks of the modern resistance. The transitory British
collective known as The Deterritorial Support Group also draws on
Deleuze to offer a perfect example of the vacuole of non-communication
in their writings on Internet memes:

We didn’t spread any such rumour – we hijacked an existing meme with
enormous potential. Internet memes originally functioned as a subject
of the Internet hate machine – operating in a totally amoral fashion,
where achieving “lulz” was the only aim. Within the past few years,
memes have started to take on a totally different function, and what
would have been perceived as a slightly pathetic bunch of bastards in
the past are today global players in undermining international
relations – namely in the complex interaction of Wikileaks with
Anonymous, 4chan and other online hooligans.

There’s no coherent analysis to be had of this at the moment. However
“lulz” also demonstrate their potential as part of a policy of radical
refusal to the demands of capital. When asked by liberals “Do you
condone or condemn the violence of the Black Bloc?” We can only reply
in unison “This cat is pushing a watermelon out of a lake. Your
premise is invalid”.

Today it seems to make more sense to make no sense since logical forms
of antagonistic communication are still inputs that feed bots making
stock market decisions. Gawker’s Adrian Chen, referencing Harmony
Korine’s absurd(ist) Reddit AMA, echoes this sentiment when he writes:
“It’s unclear if he’s on or off something but his
typo-and-non-sequitur-filled performance in his AMA today was
inspired. The only way to resist the insufferable PR machine is
clogging it with pure nonsense [emphasis added].” Nonsense might be
the new sabotage.

In fact, nonsense clarifies what matters about Internet tricksters
like LulzSec or the hordes of trolls that spew forth from sites like
4chan. The creator of 4chan, Christopher Poole, suggested that 4chan
matters because it allows for an anonymity not possible on platforms
like Facebook. As Jessica Beyer attests in the conclusion of her
dissertation and forthcoming book, the design of the 4chan platform is
radical because it does not lend itself to data mining and monitoring.
Content is unfiltered, ephemeral and too noisy to make predictions. In
other words, 4chan’s refusal to archive itself results in an online
social space that consciously resists simulation. And as we’ve
elsewhere observed, the culture of 4chan works alongside its code to
confound visitors, both human and nonhuman. The community actively
resists categorization, and the only memes or behaviours that cohere
are those with sufficient room for ambiguity and playful

4chan users often deploy their penchant for nonsense in a tactical
fashion. The constant stream of obscenity, inside jokes and non
sequiturs they insist upon has rendered the site one of the few online
forums not able to turn a profit. Despite a base of 22 million monthly
users, Poole has struggled to find advertisers willing to have their
products appear next to Goatse (he also spoke at a 2012 conference
centred on mining online creativity and bragged that 4chan loses
money). Outside the confines of the site, trolls lurk under the
Internet’s myriad bridges, circulating false news stories, hijacking
online polls and marketing campaigns, and generally making online
information just a little less reliable than its non-simulated cousin.
And when those kernels of spurious data get fed into the algorithmic
feedback loop, who knows what lulzy futures they may give rise to?

This is hardly specific to 4chan. In her work on Second Life, Burcu
Bakioglu describes how “griefers” rejected the commodification of
their virtual world through in-game raids and hacks that spawned
hordes of offensive sprites, ranging from flying penises to
swastikas. This type of “grief-play” jammed Second Life’s
signification system, literally causing sims to crash, and disrupted
Linden Lab’s virtual economy. Such actions suggest new horizons of
possibility for “acting out” because trolls and griefers interfere
with the logics driving simulation and computational sense making.

Pranks can also reveal the algorithmic systems at work in our media.
To mine a homespun example, we can look at what happened when Canadian
Prime Minister Stephen Harper choked on a hash brown. His Members of
Parliament appeared convinced of their leader’s fate, as one by one
they posted the news item to Twitter. At least it seemed that way
until their party admitted a hack of its website circulated the fake
story. More than embarrass the reigning party, the hackers revealed
the frailty of their communication apparatus. The Hash Brown incident
would have been dismissed as hearsay, but instead appeared credible
because party members had posted it to Twitter – or rather algorithms
posted the story because their Twitter accounts were configured to
automatically forward news items from the party’s central site.
Indeed, these days the tactical placement of online jokes and
ridiculousness (let alone false news items about stocks, terrorism, or
disasters) can function as a form of research on algorithmic media.
After all, what is the trick but “a method by which a stranger or
underling can enter the game, change its rules, and win a piece of the
action” (Hyde, 1998, p. 204)?

These resistances, however, run the risk of just being filtered out.
We can imagine in the future that 4chan might either be removed to
ensure that an Internet with real names offers better
simulation-supporting information or just ignored altogether by
algorithms programmed to avoid websites that traffic in troublemakers,
perverts, and weirdos. Conversely, the potential of the trickster
could be co-opted, and used to train the machine. By trying to predict
a future of nonsense, are we helping the machine learn? Can the right
algorithm data-mine nonsense and render it predictable? The question
remains whether there can be any popular resistance at all to today’s
– and tomorrow’s – algorithmically-modulated simulation. Will people
join in a nonsensical refrain? Does free communication matter enough
to publics that they will stop communicating (or at least, stop making
sense)? And if the Internet fails to simulate effectively, will the
externalities of free communication endure?

About the Authors
Dr. Fenwick McKelvey is a SSHRC Postdoctoral Fellow and Visiting
Scholar at the Department of Communication at the University of
Washington. His postdoctoral project entitled Programming the Vote
traces the early history of computers in politics. He will start as an
Assistant Professor in the Department of Communication Studies. He
completed his MA (2008) and his PhD (2012) in the joint program in
Communication and Culture of York University and Ryerson University.
His doctoral work was supported by a SSHRC Joseph-Armand Bombardier
Canada Graduate Scholarship. Fenwick investigates how algorithms
afford new forms of control in digital media. He explores this issue
of control through studies of Internet routing algorithms and
political campaign management software. He is also co-author of The
Permanent Campaign: New Media, New Politics with Dr. Elmer and Dr.
Langlois. He is presently developing a book manuscript entitled Media
Demons: Algorithms, Internet Routing and Time. His personal website
is: http://www.fenwickmckelvey.com/.

Dr. Matthew Tiessen is a SSHRC Postdoctoral Research Fellow at the
Infoscape Research Lab in the Faculty of Communication and Design at
Ryerson University. He teaches and publishes in the area of technology
studies, digital culture, contemporary theory, and visual culture.
Matthew’s doctoral research on affect, agency, and creativity was
funded by an Izaak Walton Killam Memorial Scholarship and a SSHRC
Doctoral Fellowship. Dr. Tiessen is currently under contract to write
a book on “apps,” affect, and gamification with U of Toronto Press.
His research is featured in academic journals and anthologies such as:
Theory, Culture & Society, Cultural Studies<=>Critical Methodologies,
Surveillance & Society as well as Space and Culture. He is also an
exhibiting artist. Matthew’s homepage can be found

Luke Simcoe, MA is a journalist and independent scholar. He holds an MA
in the joint program in Communication and Culture of York University
and Ryerson University (2012). His first major research paper explored
how participants on the popular 4chan message board used carnivalesque
humour and memes to foster a collective identity and articulate a
political orientation towards the internet—summed up by the phrase
“the internet is serious business”. He writes at:

