nettime's filter algorithm on Sun, 11 Sep 2011 10:47:13 +0200 (CEST)



<nettime> Computer Generated News Articles




In Case You Wondered, a Real Human Wrote This Column
By STEVE LOHR

http://www.nytimes.com/2011/09/11/business/computer-generated-articles-are-gaining-traction.html

"WISCONSIN appears to be in the driver's seat en route to a win, as it
leads 51-10 after the third quarter. Wisconsin added to its lead when
Russell Wilson found Jacob Pedersen for an eight-yard touchdown to make the
score 44-3 ... ."

Those words began a news brief written within 60 seconds of the end of the 
third quarter of the Wisconsin-U.N.L.V. football game earlier this month. 
They may not seem like much, but they were written by a computer.

The clever code is the handiwork of Narrative Science, a start-up in 
Evanston, Ill., that offers proof of the progress of artificial 
intelligence, the ability of computers to mimic human reasoning.

The company's software takes data, like that from sports statistics,
company financial reports and housing starts and sales, and turns it into 
articles. For years, programmers have experimented with software that wrote 
such articles, typically for sports events, but these efforts had a 
formulaic, fill-in-the-blank style. They read as if a machine wrote them.

But Narrative Science is based on more than a decade of research, led by 
two of the company's founders, Kris Hammond and Larry Birnbaum,
co-directors of the Intelligent Information Laboratory at Northwestern
University, which holds a stake in the company. And the articles produced 
by Narrative Science are different.

"I thought it was magic," says Roger Lee, a general partner of Battery
Ventures, which led a $6 million investment in the company earlier this
year. "It's as if a human wrote it."

Experts in artificial intelligence and language are also impressed, if less 
enthralled. Oren Etzioni, a computer scientist at the University of 
Washington, says, "The quality of the narrative produced was quite good,"
as if written by a human, if not an accomplished wordsmith. Narrative
Science, Mr. Etzioni says, points to a larger trend in computing of "the
increasing sophistication in automatic language understanding and, now,
language generation."

The innovative work at Narrative Science raises the broader issue of 
whether such applications of artificial intelligence will mainly assist 
human workers or replace them. Technology is already undermining the 
economics of traditional journalism. Online advertising, while on the rise, 
has not offset the decline in print advertising. But will "robot
journalists" replace flesh-and-blood journalists in newsrooms?

The leaders of Narrative Science emphasized that their technology would be 
primarily a low-cost tool for publications to expand and enrich coverage 
when editorial budgets are under pressure. The company, founded last year, 
has 20 customers so far. Several are still experimenting with the 
technology, and Stuart Frankel, the chief executive of Narrative Science, 
wouldn't name them. They include newspaper chains seeking to offer
automated summary articles for more extensive coverage of local youth 
sports and to generate articles about the quarterly financial results of 
local public companies.

"Mostly, we're doing things that are not being done otherwise," Mr. Frankel
says.

The Narrative Science customers that are willing to talk do fit that model. 
The Big Ten Network, a joint venture of the Big Ten Conference and Fox 
Networks, began using the technology in the spring of 2010 for short recaps 
of baseball and softball games. They were posted on the network's Web site
within a minute or two of the end of each game; box scores and play-by-play 
data were used to generate the brief articles. (Previously, the network 
relied on online summaries provided by university sports offices.)

As the spring sports season progressed, the computer-generated articles 
improved, helped by suggestions from editors on the network's staff, says
Michael Calderon, vice president for digital and interactive media at the 
Big Ten Network.

The Narrative Science software can make inferences based on the historical 
data it collects and the sequence and outcomes of past games. To generate 
story "angles," explains Mr. Hammond of Narrative Science, the software
learns concepts for articles like "individual effort," "team effort," "come
from behind," "back and forth," "season high," "player's streak" and
"rankings for team." Then the software decides what element is most
important for that game, and it becomes the lead of the article, he said.
The data also determines vocabulary selection. A lopsided score may well be
termed a "rout" rather than a "win."
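
The two steps described above (pick the most important angle for the
lead, then let the data choose the vocabulary) can be illustrated with a
minimal Python sketch. This is not Narrative Science's code, which the
article does not show. The Game fields, the scoring weights and the
21-point "rout" threshold are all hypothetical, chosen only to make the
idea concrete.

```python
from dataclasses import dataclass


@dataclass
class Game:
    winner: str
    loser: str
    winner_score: int
    loser_score: int
    deficit_overcome: int  # hypothetical: largest deficit the winner erased
    lead_changes: int      # hypothetical: times the lead changed hands


def score_angles(game: Game) -> dict:
    """Give each candidate angle an importance score (weights are made up)."""
    return {
        "come from behind": game.deficit_overcome / 7.0,
        "back and forth": game.lead_changes / 3.0,
        "team effort": 1.0,  # fixed baseline used when nothing else stands out
    }


def pick_lead_angle(game: Game) -> str:
    """The highest-scoring angle becomes the lead of the article."""
    scores = score_angles(game)
    return max(scores, key=scores.get)


def result_word(game: Game, rout_margin: int = 21) -> str:
    """Vocabulary selection: a lopsided score is a 'rout', otherwise a 'win'."""
    margin = game.winner_score - game.loser_score
    return "rout" if margin >= rout_margin else "win"


# Score taken from the brief quoted in the article; other fields are invented.
lopsided = Game("Wisconsin", "U.N.L.V.", 51, 10, deficit_overcome=0, lead_changes=0)
# Entirely invented close game.
close = Game("Team A", "Team B", 28, 27, deficit_overcome=14, lead_changes=2)

for g in (lopsided, close):
    print(g.winner, "-", pick_lead_angle(g), "-", result_word(g))
```

A real system would presumably learn or tune such scores from historical
data rather than hard-code them, as the article suggests.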

"Composition is the key concept," Mr. Hammond says. "This is not just
taking data and spilling it over into text."

Last fall, the Big Ten Network began using Narrative Science for updates of 
football and basketball games. Those reports helped drive a surge in 
referrals to the Web site from Google's search algorithm, which highly
ranks new content on popular subjects, Mr. Calderon says. The network's Web
traffic for football games last season was 40 percent higher than in 2009.

Hanley Wood, a trade publisher for the construction industry, began using 
the program in August to provide monthly reports on more than 350 local 
housing markets, posted on its site, builderonline.com. The company had 
long collected the data, but hiring people to write trend articles would 
have been too costly, says Andrew Reid, president of Hanley Wood's digital
media and market intelligence unit.

Mr. Reid says Hanley Wood worked with Narrative Science for months to fine-
tune the software for construction. A former executive at Thomson Reuters, 
he says he was struck by the high quality of the articles.

"They got over a big linguistic hurdle," he observes. "The stories are not
duplicates by any means."

He was also impressed by the cost. Hanley Wood pays Narrative Science less 
than $10 for each article of about 500 words, and the price will very
likely decline over time. Even at $10, the cost is far less, by industry
estimates, than the average cost per article of local online news ventures
like AOL's Patch or answer sites, like those run by Demand Media.

NARRATIVE SCIENCE'S ambitions include moving further up the ladder of
quality. Both Mr. Birnbaum and Mr. Hammond are professors of journalism as 
well as computer science. The company itself is an outgrowth of 
collaboration between the two schools.

"This kind of technology can deepen journalism," says John Lavine, dean of
the Medill School of Journalism at Northwestern.

Mr. Hammond says the combination of advances in its writing engine and data 
mining can open new horizons for computer journalism, exploring 
"correlations that you did not expect," conceptually similar to
"Freakonomics," by two humans, the economist Steven D. Levitt and the
author Stephen J. Dubner.

Mr. Hammond cited a media maven's prediction that a computer program might
win a Pulitzer Prize in journalism in 20 years, and he begged to differ.

"In five years," he says, "a computer program will win a Pulitzer Prize,
and I'll be damned if it's not our technology."

Should it happen, the prize, of course, would not be awarded to abstract 
code, but to its human creators. 


#  distributed via <nettime>: no commercial use without permission
#  <nettime>  is a moderated mailing list for net criticism,
#  collaborative text filtering and cultural politics of the nets
#  more info: http://mx.kein.org/mailman/listinfo/nettime-l
#  archive: http://www.nettime.org contact: nettime@kein.org