La Vida Loca on Wikinews
February 13, 2006

Most days on Wikinews I log in, I check over the developing articles, nudge a few toward publication if I have time, respond to the messages on my talk page and requests for assistance in IRC...

Last week we had a nice story, a bit of investigative reporting. I was able to help the lead author (Daniel_Bush) by modifying a script from another contributor (MrMiscellanious) and compiling the output. We did some research, wrote it up, and published it on Tuesday as "Wikinews investigates Wikipedia usage by U.S. Senate staff member".

The topic, it seems, was very interesting to members of the press. Here's a description of what we did to learn what we learned.

Every edit on Wikipedia is credited to either a username or an IP address. Wikipedia lets you see all the edits made by either a username or an IP address by checking their user edits: the URL to see my edits is - the last word can be substituted with either a different username or an IP. The U.S. Congress has a specific block of IP addresses, registered to the U.S. Senate Sergeant at Arms. This block is to 156.255.255.

Daniel_Bush had started checking political articles, such as the senators' own articles, searching through their history for IP addresses which belonged to the Senate. He recorded his findings on a research page, compiling enough interesting edits to suggest a bigger story might be hidden in the history of Wikipedia.

MrMiscellanious's script simply counted up through the IPs owned by the Senate, and checked whether each had ever edited on Wikipedia. Some minor tweaks made it collect just the information needed, and helped create lists of the IPs which had edited, with links to their edits. In the 3 February scan of Wikipedia there were 180 editing IPs within the Senate's block.
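The original script is not reproduced here, but the counting-up idea can be sketched roughly as follows. Everything in this sketch is an assumption for illustration: the function names are made up, the address range is a reserved documentation range rather than the Senate's block, the lookup is a stand-in callable rather than a real query against Wikipedia, and the link pattern is the standard MediaWiki Special:Contributions page rather than anything taken from the script itself.

```python
import ipaddress
from typing import Callable, List

# Assumed link pattern: the standard MediaWiki contributions page.
CONTRIBS_BASE = "https://en.wikipedia.org/wiki/Special:Contributions/"


def contributions_url(ip: str) -> str:
    """Build a link to the list of edits credited to one IP address."""
    return CONTRIBS_BASE + ip


def scan_block(cidr: str, has_edits: Callable[[str], bool]) -> List[str]:
    """Count up through every address in a block and keep the ones that edited.

    `has_edits` is a hypothetical callable supplied by the caller; a real scan
    would look each address up on the wiki, which is not shown here.
    """
    network = ipaddress.ip_network(cidr)
    return [str(ip) for ip in network if has_edits(str(ip))]


if __name__ == "__main__":
    # Stand-in data: pretend two addresses in a documentation range have edits.
    known_editors = {"192.0.2.3", "192.0.2.5"}
    for ip in scan_block("192.0.2.0/29", lambda addr: addr in known_editors):
        print(ip, contributions_url(ip))
```

Passing the lookup in as a function keeps the counting loop separate from however the edit check is actually done, which makes the loop easy to try out on fake data first.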

While we were compiling the list, I contacted - via their websites - every sitting senator's office. Many of these contacts generated an automated e-mail response from the website server. The headers of these e-mails list the mail servers that relayed the message across the internet. This list can be spoofed, or altered, but the evidence we found suggested the headers were legitimate.
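For readers curious how the relay list can be read out of a message, here is a minimal sketch using Python's standard `email` module. It is not the method we used at the time; the sample message, host names and addresses are invented placeholders, and the sketch assumes each relay records its peer's IPv4 address in square brackets inside a `Received:` header, which is common but not guaranteed.

```python
import ipaddress
import re
from email import message_from_string
from typing import List

# Matches an IPv4 address written in square brackets, e.g. "[192.0.2.10]".
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def relay_ips(raw_message: str) -> List[str]:
    """Return bracketed IPv4 addresses from Received headers, newest hop first."""
    msg = message_from_string(raw_message)
    found: List[str] = []
    for header in msg.get_all("Received", []):
        found.extend(BRACKETED_IPV4.findall(header))
    return found


def in_block(ip: str, cidr: str) -> bool:
    """Check whether one address falls inside a given block."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


# Invented auto-reply, using reserved documentation addresses only.
SAMPLE = (
    "Received: from mail.example.org (mail.example.org [192.0.2.10])\n"
    "\tby mx.example.net with ESMTP\n"
    "Received: from web.example.org (web.example.org [198.51.100.7])\n"
    "\tby mail.example.org with ESMTP\n"
    "Subject: Auto-reply\n"
    "\n"
    "Thank you for contacting our office.\n"
)

if __name__ == "__main__":
    for ip in relay_ips(SAMPLE):
        print(ip, in_block(ip, "192.0.2.0/24"))
```

Because these headers can be forged, output like this is a lead to check against other evidence, not proof on its own.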

The early data suggested the Senate IPs were separated into blocks for different purposes, and were probably assigned physically. We had a hypothesis that at least one of these blocks was set aside for senators and their staff, and that it would be separated out into 100 sub-sections. We developed a model which would predict which senator would be assigned which section, and it had the sections broken down first by state and then by senior/junior senator.
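The shape of that model, state first and then senior/junior, can be illustrated with a small sketch. This is an assumption-laden toy, not the model itself: the post does not say how the states were ordered or how large each section was, so the ordering is passed in as a parameter and the function names and state labels are hypothetical.

```python
from typing import List, Tuple


def section_for(state: str, is_senior: bool, state_order: List[str]) -> int:
    """Map a senator to a section number: two sections per state, senior first."""
    return 2 * state_order.index(state) + (0 if is_senior else 1)


def senator_for(section: int, state_order: List[str]) -> Tuple[str, str]:
    """Invert the mapping: section number back to (state, seniority)."""
    state_index, is_junior = divmod(section, 2)
    return state_order[state_index], "junior" if is_junior else "senior"


if __name__ == "__main__":
    # Placeholder ordering; the real ordering is not given in the post.
    order = ["State A", "State B", "State C"]
    print(section_for("State B", False, order))
    print(senator_for(3, order))
```

With 50 states in `state_order`, the section numbers run from 0 to 99, matching the 100 sub-sections in the hypothesis.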

When we compared the information from the e-mails we found that not all the senators' websites were located within the senators and staff block. Most of the automated e-mails did come from that block, but two were from machines on a router (where multiple machines use a single IP address) and the rest were from a single block we thought was set aside for servers. The e-mail IPs that did fall within the target IP address set matched our prediction model with 100% accuracy.

The real test, we thought, would be in which articles were edited by whom. And this turned out to be the case. In examining the hundreds of edits by the IPs, the model proved very accurate up to a point: it consistently predicted the states of senators, but it was only about 78% accurate at predicting senior/junior senator.

We were confident enough of our results to go to the stage of contacting senators' offices. We were focusing in on a handful of fairly clear edits, and each of these offices was contacted by phone, sometimes more than once. There were also a half-dozen unusual or particularly beneficial edits we were looking at mentioning in the article, and the offices we had determined these edits came from were likewise contacted. None of our voicemails were returned, however.

By Monday the investigative report, penned mostly by Daniel_Bush with the results of the investigative work, was moved to a developing story on Wikinews, ready for public editing. On Tuesday there were clear signs the article was beginning to get some notice, and it was published.

On Wednesday JWales and staff contacted me, asking if I could take press questions about the story. I spoke with 4 reporters. Thursday had calls from 5 members of the press, and two Capitol Hill offices. Friday 6-8 (I lost track), and a senator's press officer. Saturday, I vegetated.

Investigative reporting is a lot of fun: you're finding out new information, and even though it's long and boring work, picking through piles of information and making sure you have the facts straight, it's really rewarding when the story is written and gets published. Next time, though, I hope it ends at that point; excited people calling on the phone, all with deadlines and lots of questions, isn't fun and it isn't reporting. It just made my life crazy for a while.

Posted by Amgine | 5:09 PM

