www.nettime.org
Nettime mailing list archives

<nettime> Down with algorithms!
nettime's avid reader on Thu, 20 Oct 2011 12:38:27 +0200 (CEST)


Can an algorithm be wrong? Twitter Trends, the specter of censorship, and 
our faith in the algorithms around us Oct 19, 2011 

http://culturedigitally.org/2011/10/can-an-algorithm-be-wrong/

The interesting question is not whether Twitter is censoring its Trends 
list. The interesting question is, what do we think the Trends list is, 
what it represents and how it works, that we can presume to hold it 
accountable when we think it is "wrong?" What are these algorithms, and 
what do we want them to be?

It's not the first time it has been asked. Gilad Lotan at SocialFlow (and 
erstwhile Microsoft UX designer), spurred by questions raised by 
participants and supporters of the Occupy Wall Street protests, asks the 
question: is Twitter censoring its Trends list to exclude #occupywallstreet 
and #occupyboston? While the protest movement gains traction and media 
coverage, and participants, observers and critics turn to Twitter to 
discuss it, why are these widely-known hashtags not Trending? Why are they 
not Trending in the very cities where protests have occurred, including New 
York?

The presumption, though Gilad carefully debunks it, is that Twitter is, for 
some reason, either removing #occupywallstreet from Trends, or has designed 
an algorithm to prefer banal topics like Kim Kardashian's wedding over 
important, contentious political debates. Similar charges emerged around 
the absence of #wikileaks from Twitter's Trends when the trove of 
diplomatic cables was released in December of last year, as well as around 
the #demo2010 student protests in the UK, the controversial execution of 
#TroyDavis in the state of Georgia, the Gaza #flotilla, even the death of 
#SteveJobs. Why, when these important points of discussion seem to spike, 
do they not Trend?

Despite an unshakeable undercurrent of paranoid skepticism, in the analyses 
and especially in the comment threads that trail off from them, most of 
those who have looked at the issue are reassured that Twitter is not in 
fact censoring these topics. Their absence on the Trends listings is a 
product of the particular dynamics of the algorithm that determines Trends, 
and the misunderstanding most users have about what exactly the Trends 
algorithm is designed to identify. I do not disagree with this assessment, 
and have no particular interest in reopening these questions. Along with 
Gilad's thorough analysis, Angus Johnston has a series of posts (1, 2, 3, 
and 4) debunking the charge of censorship around #wikileaks. Trends has 
been designed (and re-designed) by Twitter not to simply measure 
popularity, i.e. the sheer quantity of posts using a certain word or 
hashtag. Instead, Twitter designed the Trends algorithm to capture topics 
that are enjoying a surge in popularity, rising distinctly above the normal 
level of chatter. To do this, their algorithm is designed to take into 
account not just the number of tweets, but factors such as: is the term 
accelerating in its use? Has it trended before? Is it being used across 
several networks of people, as opposed to a single, densely-interconnected 
cluster of users? Are the tweets different, or are they largely re-tweets 
of the same post? As Twitter representatives have said, they don't want 
simply the most tweeted word (in which case the Trend list might read like 
a grammar assignment about pronouns and indefinite articles) or the topics 
that are always popular and seem destined to remain so (apparently this 
means Justin Bieber).
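
To make the distinction between raw popularity and a "surge" concrete, here 
is a minimal sketch in Python. It is not Twitter's algorithm, which is not 
public; the class TermStats, the function trend_score, the multiplicative 
weighting, and all of the numbers are assumptions invented for 
illustration. It only combines the four factors named above: acceleration 
over a baseline, prior trending, spread across clusters of users, and the 
share of original tweets versus retweets.

```python
from dataclasses import dataclass


@dataclass
class TermStats:
    """Hypothetical per-term measurements; none of these names come from Twitter."""
    recent_count: int       # tweets using the term in the current window
    baseline_count: float   # typical per-window count over a longer history
    unique_fraction: float  # share of tweets that are not retweets (0.0 to 1.0)
    cluster_count: int      # distinct clusters of users using the term
    trended_before: bool    # whether the term has already appeared in Trends


def trend_score(stats: TermStats) -> float:
    """Toy score that rewards a surge over baseline, not sheer volume."""
    # Surge: how far the current window rises above the term's normal chatter.
    surge = stats.recent_count / (stats.baseline_count + 1.0)
    # Breadth: discount terms confined to a few interconnected clusters
    # (the cap of 5 is an arbitrary illustrative choice).
    breadth = min(stats.cluster_count, 5) / 5.0
    # Novelty: discount terms that have trended before (0.5 is arbitrary).
    novelty = 0.5 if stats.trended_before else 1.0
    return surge * stats.unique_fraction * breadth * novelty


# A heavily used but steady term versus a smaller term that is spiking.
steady = TermStats(recent_count=50_000, baseline_count=48_000,
                   unique_fraction=0.4, cluster_count=2, trended_before=True)
spiking = TermStats(recent_count=5_000, baseline_count=200,
                    unique_fraction=0.8, cluster_count=5, trended_before=False)

for name, stats in (("steady", steady), ("spiking", spiking)):
    print(f"{name}: count={stats.recent_count} score={trend_score(stats):.2f}")
```

Under these made-up weights, the steady term has ten times the tweet volume 
of the spiking one and still scores far lower, which is the kind of outcome 
that can read as censorship to someone who expects Trends to measure 
popularity.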

The charge of censorship is, on the face of it, counterintuitive. Twitter 
has, over the last few years, enjoyed and agreed with claims that it has 
played a catalytic role in recent political and civil unrest, particularly 
in the Arab world, wearing its political importance as a red badge of 
courage (see Shepherd and Busch). To censor these hot-button political 
topics from Trends would work against its current self-proclaimed 
purposes and, more importantly, its marketing tactics. And, as Johnston 
noted, the tweets themselves are available, many highly charged - so why, 
and for what ends, remove #wikileaks or #occupywallstreet from the Trends 
list, yet let the actual discussion of these topics run free?

On the other hand, the vigor and persistence of the charge of censorship is 
not surprising at all. Advocates of these political efforts want 
desperately for their topic to gain visibility. Those involved in the 
discussion likely have an exaggerated sense of how important and widely-
discussed it is. And, especially with #wikileaks and #occupywallstreet, the 
possibility that Twitter may be censoring their efforts would fit their 
supporters' ideological worldview: Twitter might be working against 
Wikileaks just as Amazon, Paypal, and Mastercard were; or in the case of 
#occupywallstreet, while the Twitter network supports the voice of the 
people, Twitter the corporation of course must have allegiances firmly 
intertwined with the fatcats of Wall Street.

But the debate about tools like Twitter Trends is, I believe, a debate we 
will be having more and more often. As more and more of our online public 
discourse takes place on a select set of private content platforms and 
communication networks, and these providers turn to complex algorithms to 
manage, curate, and organize these massive collections, there is an 
important tension emerging between what we expect these algorithms to be, 
and what they in fact are. Not only must we recognize that these algorithms 
are not neutral, and that they encode political choices, and that they 
frame information in a particular way. We must also understand what it 
means that we are coming to rely on these algorithms, that we want them to 
be neutral, we want them to be reliable, we want them to be the effective 
ways in which we come to know what is most important.

Twitter Trends is only the most visible of these tools. The search engine 
itself, whether Google or the search bar on your favorite content site 
(often the same engine, under the hood), is an algorithm that promises to 
provide a logical set of results in response to a query, but is in fact the 
result of an algorithm designed to take a range of criteria into account so 
as to serve up results that satisfy, not just the user, but the aims of the 
provider, their vision of relevance or newsworthiness or public import, and 
the particular demands of their business model. As James Grimmelmann 
observed, "Search engines pride themselves on being automated, except when 
they aren't." When Amazon, or YouTube, or Facebook, offer to 
algorithmically and in real time report on what is "most popular" or 
"liked" or "most viewed" or "best selling" or "most commented" or "highest 
rated," they are curating a list whose legitimacy is based on the 
presumption that it has not been curated. And we want them to feel that 
way, even to 
the point that we are unwilling to ask about the choices and implications 
of the algorithms we use every day.

Peel back the algorithms, and this becomes quite apparent. Yes, a casual 
visit to Twitter's home page may present Trends as an unproblematic list of 
terms that might appear to be a simple calculation. But in explaining how 
Trends works - in its policies and help pages, in its company blog, in 
tweets, in response to press queries, even in the comment threads of the 
censorship discussions - Twitter lays bare the variety of weighted factors 
Trends takes into account, and cops to the occasional and unfortunate 
consequences of these algorithms. Wikileaks may 
not have trended when people expected it to because it had before; because 
the discussion of #wikileaks grew too slowly and consistently over time to 
have spiked enough to draw the algorithm's attention; because the bulk of 
messages were retweets; or because the users tweeting about Wikileaks were 
already densely interconnected. When Twitter changed their algorithm 
significantly in May 2010 (though, undoubtedly, it has been tweaked in less 
noticeable ways before and after), they announced the change in their blog, 
explained why it was made - and even apologized directly to Justin Bieber, 
whose position in the Trends list would be diminished by the change. In 
response to charges of censorship, they have explained why they believe 
Trends should privilege terms that spike, terms that exceed single clusters 
of interconnected users, new content over retweets, new terms over already 
trending ones. Critics gather anecdotal evidence and conduct thorough 
statistical analysis, using available online tools that track the raw 
popularity of words in a vastly more exhaustive and catholic way than 
Twitter does, or at least is willing to make available to its users. The 
algorithms that define what is "trending" or what is "hot" or what is "most 
popular" are not simple measures; they are carefully designed to capture 
something the site providers want to capture, and to weed out the 
inevitable "mistakes" a simple calculation would make.

At the same time, Twitter most certainly does curate its Trends lists. It 
engages in traditional censorship: for example, a Twitter engineer 
acknowledges here that Trends excludes profanity, something that's obvious 
from the relatively circuitous path that prurient attempts to push dirty 
words onto the Trends list must take. Twitter will remove tweets that 
constitute specific threats of violence, copyright or trademark violations, 
impersonation of others, revelations of others' private information, or 
spam. (Twitter has even been criticized (1, 2) for not removing some terms 
from Trends, as in this user's complaint that #reasonstobeatyourgirlfriend 
was permitted to appear.) Twitter also engages in softer forms of 
governance, by designing the algorithm so as to privilege some kinds of 
content and exclude others, and some users and not others. Twitter offers 
rules, guidelines, and suggestions for proper tweeting, in the hopes of 
gently moving users towards the kinds of topics that suit their site and 
away from the kinds of content that, were it to trend, might reflect badly 
on the site. For some of their rules for proper profile content, tweet 
content, and hashtag use, the punishment imposed on violators is that their 
tweets will not factor into search or Trends - thereby culling the Trends 
lists by culling what content is even in consideration for it. Twitter 
includes terms in its Trends from promotional partners, terms that were not 
spiking in popularity otherwise. This list, automatically calculated on the 
fly, is yet also the result of careful curation to decide what it should 
represent, what counts as "trend-ness."

Ironically, terms like #wikileaks and #occupywallstreet are exactly the 
kinds of terms that, from a reasonable perspective, Twitter should want to 
show up as Trends. If we take the reasonable position that Twitter is 
benefiting from its role in the democratic uprisings of recent years, and 
that it is pitching itself as a vital tool for important political 
discussion, and that it wants to highlight terms that will support that 
vision and draw users to topics that strike them as relevant, 
#occupywallstreet seems to fit the bill. So despite carefully designing 
their algorithm away from the perennials of Bieber and the weeds of common 
language, it still cannot always successfully pluck out the vital public 
discussion it might want. In this, Twitter is in agreement with its 
critics; perhaps #wikileaks should have trended after the diplomatic cables 
were released. These algorithms are not perfect; they are still cudgels, 
where one might want scalpels. The Trends list can often look, in fact, 
like a study in insignificance. Not only are the interests of a few often 
precisely irrelevant to the rest of us, but much of what we talk about on 
Twitter every day is in fact quite everyday, despite their most heroic 
claims of political import. But many Twitter users take it to be not just 
a measure of visibility but a means of visibility - whether or not the 
appearance of a term or #hashtag increases audience, which is not in fact 
clear. Trends offers to propel a topic towards greater attention, and 
offers proof of the attention already being paid. Or seems to.

Of course, Twitter has in its hands the biggest resource by which to 
improve their tool, a massive and interested user base. One could imagine 
"crowdsourcing" this problem, asking users to rate the quality of the 
Trends lists, and assessing these responses over time and a huge number of 
data points. But they face a dilemma: revealing the workings of their 
algorithm, even enough to respond to charges of censorship and 
manipulation, much less to share the task of improving it, risks helping 
those who would game the system. Everyone from spammers to political 
activists to 4chan tricksters to narcissists might want to "optimize" their 
tweets and hashtags so as to show up in the Trends. So the mechanism 
underneath this tool, which is meant to present a (quasi) democratic 
assessment of what the public finds important right now, cannot reveal its 
own "secret sauce."

Which in some ways leaves us, and Twitter, in an unresolvable quandary. The 
algorithmic gloss of our aggregate social data practices can always be 
read/misread as censorship, if the results do not match what someone 
expects. If #occupywallstreet is not trending, does that mean (a) it is 
being purposefully censored? (b) it is very popular but consistently so, 
not a spike? (c) it is actually less popular than one might think? Broad 
scrapes of huge data, like Twitter Trends, are in some ways meant to show 
us what we know to be true, and to show us what we are unable to perceive 
as true because of our limited scope. And we can never really tell which it 
is showing us, or failing to show us. We remain trapped in an algorithmic 
regress, and not even Twitter can help, as it can't risk revealing the 
criteria it used.

But what is most important here is not the consequences of algorithms; it 
is our emerging and powerful faith in them. Trends measures "trends," a 
phenomenon Twitter gets to define and build into its algorithm. But we are 
invited to treat Trends as a reasonable measure of popularity and 
importance, a "trend" in our understanding of the term. And we want it to 
be so. We want Trends to be an impartial arbiter of what's relevant - and we 
want our pet topic, the one it seems certain that "everyone" is (or should 
be) talking about, to be duly noted by this objective measure specifically 
designed to do so. We want Twitter to be "right" about what is important - 
and sometimes we kinda want them to be wrong, deliberately wrong - because 
that will also fit our worldview: that when the facts are misrepresented, 
it's because someone did so deliberately, not because facts are in many 
ways the product of how they're manufactured.

We don't have a sufficient vocabulary for assessing the algorithmic 
intervention of a tool like Trends. We're not good at comprehending the 
complexity required to make a tool like Trends - that seems to effortlessly 
identify what's going on, that isn't swamped by the mundane or the 
irrelevant. We don't have a language for the unexpected associations 
algorithms make, beyond the intention (or even comprehension) of their 
designers. We don't have a clear sense of how to talk about the politics of 
this algorithm. If Trends, as designed, does leave #occupywallstreet off 
the list, even when its use is surging and even when some people think it 
should be there: is that the algorithm correctly assessing what is 
happening? Is it looking for the wrong things? Has it been turned from its 
proper ends by interested parties? Too often, maybe in nearly every 
instance in which we use these platforms, we fail to ask these questions. 
We equate the "hot" list with our understanding of what is popular, the 
"trends" list with what matters. Most importantly, we may be unwilling or 
unable to recognize our growing dependence on these algorithmic tools, as 
our means of navigating the huge corpuses of data that we must, because we 
want so badly for these tools to perform a simple, neutral calculus, 
without blurry edges, without human intervention, without having to be 
tweaked to get it "right," without being shaped by the interests of their 
providers. 

-Contributed by Tarleton Gillespie, Cornell University Department of 
Communication-


#  distributed via <nettime>: no commercial use without permission
#  <nettime>  is a moderated mailing list for net criticism,
#  collaborative text filtering and cultural politics of the nets
#  more info: http://mx.kein.org/mailman/listinfo/nettime-l
#  archive: http://www.nettime.org contact: nettime {AT} kein.org