Latest Updates: Our Blog

Author Archive

Welcome Aboard, Ted Han

Posted
Sep 21st, 2011

Tags
IdeaLab,People

Author
Amanda Hickman

Back in August, we announced that we’d be welcoming a new lead developer, but he’s been on the job two weeks already and we managed to forget to say anything like “Welcome aboard!”

Well, better late than never. Continue reading »

Getting the Most out of DocumentCloud

Posted
Aug 2nd, 2011

Tags
Workspace

Author
Amanda Hickman

Updated! How I left MuckRock out is beyond me. There may be more updates as appropriate.

If you’re new to programming, looking at what others have done is probably the best way to get your bearings. DocumentCloud is no exception. You asked for more, better API examples. We’re long overdue for a roundup of some of the great tools DocumentCloud users have built on our API or by otherwise poking their heads under the hood. Continue reading »

DocumentCloud Merges with IRE

Posted
Jun 9th, 2011

Tags
People

Author
Amanda Hickman

DocumentCloud is beyond delighted to announce that we’ve found a long-term home for our project. We’re merging our operation with Investigative Reporters and Editors, a nonprofit grassroots organization committed to fostering excellence in investigative journalism. This transition means that DocumentCloud will have a permanent place in a longstanding resource for investigative reporting. IRE has a long and established history of supporting investigative reporting, and we’ll be a proud part of their ongoing work to provide journalists with tools that support their reporting. It goes without saying that DocumentCloud is a natural fit for an organization that has been upholding high professional standards and instilling a passion for public service journalism for more than 35 years.

IRE will continue to honor all of the promises we have made to our users, and our staff will be working to ensure a smooth transition. The best way to get your questions answered will still be reaching out to support@documentcloud.org or contacting us through the workspace. We’re still welcoming new users, so contact us to find out more about bringing your newsroom on board.

We’ve even got some great new tools in the works. More on that soon.

All of us are committed to the continuing success of DocumentCloud. Over the next few months, we’ll be handing off day-to-day responsibility for managing DocumentCloud to IRE’s staff based at the University of Missouri in Columbia, Mo. I’ll stay on as program director through the summer to facilitate a smooth transition. Developer Sam Clay is moving to San Francisco to join a startup there. Our lead developer, Jeremy Ashkenas, has moved to the New York Times’s Interactive News team, but will remain actively involved with DocumentCloud on the technical side. Our founders will be here to help DocumentCloud continue to thrive: Scott Klein, Aron Pilhofer and Eric Umansky will remain on the project as advisors and advocates.

We’re already interviewing strong candidates to take over as lead developer, but will be looking for more developers, too. More on that soon as well.

DocumentCloud was first envisioned by a team of editors at ProPublica and The New York Times, and was founded in 2009 through a grant from the John S. and James L. Knight Foundation to build an online catalog of primary source documents and a set of tools to help journalists get more out of source documents. We are all immensely grateful to Knight for their confidence in us. We think their investment paid off. Not only do newsrooms have a new resource that is already indispensable, but DocumentCloud helped demonstrate that 21st century newsrooms are ready to collaborate and share what were once privately held materials. The public is better informed because of it.

Since we launched in March of 2010, newsrooms and watchdog organizations have used DocumentCloud to analyze, annotate, and publish thousands of documents ranging from suspicious, if not outright spurious, expense reports filed by local authorities in Long Island, New York to hundreds of pages of correspondence released by the Financial Crisis Inquiry Commission, and much, much more. How much more? We encourage you to search our public catalog and see for yourself.

See Also:
IRE’s announcement: DocumentCloud joins IRE, and Knight’s: News Challenge Success Story Finds a Home

Frequently Answered Questions: Blogging Tools

Posted
May 30th, 2011

Tags
Workspace

Author
Amanda Hickman

DocumentCloud’s users are a diverse lot, to say the least. In some newsrooms skilled programmers are busy writing Python wrappers for our API, while in others users are embedding documents with no programmers to be found.

If you’re trying to integrate DocumentCloud with blogging software like WordPress or Blogger, we can help. Continue reading »

New Feature: Arbitrary Metadata

Posted
May 17th, 2011

Tags
Workspace

Author
Amanda Hickman

We’re thrilled to release a feature that has been simmering on a back burner since we launched DocumentCloud: more metadata! Using our document data tools, DocumentCloud users can tag documents with any values they need to store or search by.

Organize hearing transcripts by committee. Tag a stack of emails with information about who sent and received them. Add FOIA request numbers or the date a published document was originally retrieved, and you’ll know much more about document provenance at a glance. Continue reading »

Much Ado About Obama’s Birth Certificate on DocumentCloud

Posted
May 11th, 2011

Tags
IdeaLab

Author
Amanda Hickman

Cross-posted from PBS IdeaLab.

As we watched traffic stats skyrocket last month as newsroom after newsroom uploaded President Obama’s birth certificate to DocumentCloud and then embedded it, my reaction was hardly one of joy.

Why on Earth is a birth certificate more interesting than, say, the pages and pages of receipts documenting some outrageous meals (15 steaks, two orders of fish and a lamb chop for five people) submitted by National Grid to the Long Island Power Authority after their Hurricane Earl cleanup?

I like to think these are the documents we built DocumentCloud for — that we’re here to give a leg up to reporters scrutinizing spurious spending reports (reporting that prompted a formal state investigation) or documenting patent dishonesty and the unusual lengths one California town went to in order to conceal extraordinary salaries paid to city officials.

Vote of Confidence

Forgive me if I was underwhelmed by all the attention that the birth certificate got. My esteemed colleagues, however, helped me see the bright side of the flurry. For one thing, it was fast. Within minutes, 10 different newsrooms had uploaded the birth certificate and embedded it.

That says a lot: It says that when they have something they know their readers want to see, reporters turn to DocumentCloud. That’s a huge vote of confidence in us. Plus, we didn’t falter under the weight of the tenfold increase in traffic — that’s solid architecture for you. We built DocumentCloud with the hope that we could improve the way newsrooms share source documents with their readers, and at that, we’re thrilled to be succeeding.

Increasingly, DocumentCloud is a resource for breaking news. When the news broke that Osama bin Laden had been killed in a town called Abbottabad, a search for “Abbottabad” turned up some pretty rich stuff, most notably that a former Gitmo detainee led U.S. authorities to the Pakistani town back in 2008.

New Feature Roundup

Meanwhile, we’re still listening to our users and looking for more ways to make DocumentCloud easier to use and to help reporters give their readers the documents behind the story.

We’re looking forward to seeing what our users do with our new tool that lets you embed a single annotation, and we’re excited to watch the great uses newsrooms have put document sets to.

From embedding documents accumulated over two decades spent covering an Oregon commune where things went horribly awry to sharing the documents detailing the Federal Reserve’s support for ailing financial institutions, or the background material from coverage of a profoundly embarrassed local philanthropist, reporters seem to be getting the hang of embedding document sets.

So we have a question for the reporters who have been using DocumentCloud already: What would have made this even easier for you?

Discuss Much Ado About Obama’s Birth Certificate on DocumentCloud on PBS’s IdeaLab.

How DocumentCloud Helped Award-Winning Investigations

Posted
Apr 14th, 2011

Tags
IdeaLab

Author
Amanda Hickman

Cross-posted from PBS IdeaLab.

Investigative Reporters and Editors (IRE) announced their medal winners this week, and we were impressed to see that both winners wove DocumentCloud into their winning reporting. Since 1979, IRE has honored outstanding investigative work with their annual awards. This year they honored a Los Angeles Times series on outrageous salaries in one of California’s poorest towns and a collaboration between the International Consortium of Investigative Journalists and the BBC for a report on the global asbestos trade.

Breach of Faith

The Los Angeles Times was awarded an IRE Medal for Breach of Faith. An investigation of financial impropriety in a small town revealed what turned out to be a dramatic case of corruption and mismanagement in the quiet city of Bell, California. Los Angeles Times reporters uncovered exorbitant city salaries (including compensation packages topping the million dollar mark) and errors in financial reporting that were more than just mistakes.

They used DocumentCloud to post the falsified salary information that city administrators had provided to concerned citizens years earlier, and the subsequent indictment of the administrators who provided those false records.

Dangers in the Dust

The International Consortium of Investigative Journalists and the BBC share an IRE Medal for Dangers in the Dust: Inside the Global Asbestos Trade. Throughout their year-long reporting project, they added all manner of source material to a growing archive of documents.

Great work, and congratulations to both teams.

More Prize-Winning Work

We’ve seen other prestigious journalism awards go to DocumentCloud users, too. Last month, Alex Richards and Marshall Allen (who has since moved from the Las Vegas Sun to ProPublica) were honored with the Goldsmith Prize for Investigative Reporting for their in-depth report on hospital care in Las Vegas. Harvard’s Shorenstein Center gives the prize each year to honor investigative reporting that “promotes more effective and ethical conduct of government, the making of public policy, or the practice of politics.”

Alongside each of the five stories that made up that award-winning coverage, Richards and Allen used DocumentCloud to share their source documents with readers. It’s great reporting and exactly the kind of work we imagined we could help support when we set about building DocumentCloud.

At least one finalist for the Goldsmith Prize also put DocumentCloud to excellent use. ProPublica, NPR’s “Planet Money” and Chicago Public Radio’s “This American Life” collaborated on Betting Against the American Dream, an alarming exposé of Wall Street’s role in exacerbating its own meltdown. ProPublica used DocumentCloud to detail correspondence with their uncooperative subjects.

Discuss How DocumentCloud Helped Award-Winning Investigations on PBS’s IdeaLab.

FAQ: Should I Try Again?

Posted
Apr 1st, 2011

Tags
Documents,Workspace

Author
Amanda Hickman

Every once in a while, DocumentCloud gets hit with the kind of document stash that really slows us down. We can take a lot, but if one newsroom finally gets a 25,000 page FOIA turned over to them and another gets hold of 30,000 pages of documents for a breaking news story on the same afternoon, that’s a volume that will tax our servers.

We recently established a “fast lane” to ensure that smaller documents don’t have to get in line behind behemoths, but that doesn’t help if you’ve got a few MB of documents about a local scandal — you’ll still have to shuffle into line with the big sets. Continue reading »

DocumentCloud Enables Public Searches, Embeddable Sets

Posted
Mar 31st, 2011

Tags
IdeaLab

Author
Amanda Hickman

Cross-posted from PBS IdeaLab.

We quietly opened DocumentCloud’s catalog to public searches in January, and we’ve been working since to do more with the great documents that reporters have added to our catalog.

When Vancouver Sun investigative reporter Chad Skelton asked if there was a way to automate display of the growing cache of documents he was retrieving from the city’s ferry authority, the best answer we could offer was to point his readers to a search for the DocumentCloud project he was stashing them in. Our goal from the outset has been to help news organizations make their own substantive reporting more engaging online, not to drive traffic to DocumentCloud.org. Moreover, Chad was far from the only reporter asking us to make it easier to embed whole document sets. Homicide Watch even built a JavaScript widget to embed their sets. So the latest DocumentCloud feature, built out by our own Samuel Clay, is embeddable document sets.

Any DocumentCloud user can embed pretty much any set of documents on their site. It works whether or not the user’s own newsroom originally published the documents. This means that the Vancouver Sun can embed their ferry documents, and that any user can embed a set of documents matching a search for Scientology. Documents initially published by the New Yorker will open on newyorker.com while documents that were published by ProPublica will open there. Someone could also embed the complete set of public documents that match a search for former Illinois governor Rod Blagojevich.

More Tools

We’ve added plenty more tools to help newsrooms get the most out of DocumentCloud, too. A dozen different “How do I …” questions led us to dramatically increase the options available when users publish documents. Plus, a brainstorming session with American Public Media’s Andrew Haeg in the halls of this year’s Online News Association conference led to a tool newsrooms can use to share documents with reviewers outside of the newsroom.

Our users continue to help us make the most of the tools we’ve built, too. It’s been a few weeks since the unstoppable Chicago Tribune news apps team released dcupload, but the Python script, written against our API, makes it a whole lot easier for DocumentCloud users to upload a great heap of documents in one fell swoop.

Discuss DocumentCloud Enables Public Searches, Embeddable Sets on PBS’s IdeaLab.

Improved Document Collaboration

Posted
Mar 9th, 2011

Tags
Documents,Workspace

Author
Amanda Hickman

From inviting a law professor to help Arizona readers understand recent legislation to asking some top notch designers to review New York’s new ballot, DocumentCloud users have already found some great ways to bring experts from outside the newsroom in, and we thought it was time to make it much easier to do just that.

We spent some time at ONA last year, brainstorming with the good folks from the Public Insight Network — they really helped us distill this into a workable feature. We’re looking forward to seeing PIN newsrooms do some great reporting aided by this new feature. Continue reading »

DocumentCloud Passes Major Milestone: 1 Million Pages Uploaded

Posted
Mar 1st, 2011

Tags
IdeaLab

Author
Amanda Hickman

Cross-posted from PBS IdeaLab.

DocumentCloud’s Jeremy Ashkenas collaborated on this post.

It has been less than a year since DocumentCloud began adding users to our beta. Late Monday morning, a user uploaded our millionth page of primary source documents.

The thousands of documents in our catalog have arrived in small batches: five pages here, twenty there. The vast majority of the 65,000 documents that those million pages comprise remain private, but we’re fast closing in on 10,000 public documents in our catalog.

Broad Appeal

Journalists are using DocumentCloud to publish all sorts of documents, including these:

Remaking History

Documents in our catalog reach back into the past, as well. In 1970 Ruben Salazar was killed by police while covering an anti-war protest in east Los Angeles. A story rife with controversy, questions, and suspicions, his death became a rallying point in the Mexican American civil rights movement. Forty years later — after refusing a public records request for documents that might shed some light on the circumstances of his death — the Los Angeles County Sheriff’s Department agreed to turn the files over to the Office of Independent Review.

While Los Angeles Times reporters waited for the report, they assembled their own folio of early clippings on Ruben Salazar. Readers can review FBI files obtained by the Times in 1999 and LAPD records on the department’s repeated clashes with the journalist as well as a draft of the report prepared by the Office of Independent Review.

Join the Cloud

You can browse recently published documents by searching for “filter: published” or read up on other searches you might want to run. Here’s hoping that the next year brings millions more pages, and more great document-driven reporting.

Discuss DocumentCloud Passes Major Milestone: 1 Million Pages Uploaded on PBS’s IdeaLab.

Going Public

Posted
Jan 26th, 2011

Tags
Documents,Workspace

Author
Amanda Hickman

With close to 200 newsrooms contributing documents and thousands of documents in our catalog, we decided it was time to open DocumentCloud to public searches.

Wondering who is still covering the Deepwater Horizon oil spill? Try a search for “deepwater horizon” organization: transocean, and see documents that reference both the rig by name and the drilling contractor, Transocean. Then, click on the “Entities” tab to see more data provided by OpenCalais’ entity extraction.

Did you miss the Memphis Commercial Appeal’s coverage of Ernest Withers? Catch up with a search for group: commercial-appeal withers, and find every document uploaded by reporters in the Commercial Appeal newsroom that mentions Withers by name. Curious to see the annotations journalists have been making on the documents they’re sharing? Try a search for filter: annotated and you’ll skip any documents that were published without annotations.

There’s plenty more you can do with DocumentCloud’s search syntax. Check out our primer and try a few searches.
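
If you would rather script searches like the ones above than type them, here is a minimal sketch in Python. It is not an official client, and it assumes a few things the post doesn't spell out: that free text and field: value filters can simply be joined with spaces, that multi-word phrases take straight double quotes, and that the query travels as a q parameter. The endpoint below is a placeholder, and the helper names (quote_phrase, build_query, search_url) are made up for illustration.

```python
from urllib.parse import urlencode

# Placeholder endpoint (an assumption, not a documented DocumentCloud URL).
PLACEHOLDER_SEARCH_URL = "https://example.invalid/search"


def quote_phrase(term):
    """Wrap multi-word terms in double quotes so they read as one phrase."""
    return f'"{term}"' if " " in term else term


def build_query(*terms, **filters):
    """Join free-text terms and 'field: value' filters into one search string."""
    parts = [quote_phrase(term) for term in terms]
    parts += [f"{field}: {value}" for field, value in filters.items()]
    return " ".join(parts)


def search_url(query, base_url=PLACEHOLDER_SEARCH_URL):
    """URL-encode the query as a 'q' parameter (the parameter name is assumed)."""
    return f"{base_url}?{urlencode({'q': query})}"
```

Calling build_query("deepwater horizon", organization="transocean") reproduces the first search above, and build_query(filter="annotated") gives the annotations-only filter. Check the search primer for the fields that actually exist before relying on any of this.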

We’d love to know what you think, and what you’ve found.

PS. Finding bugs rather than documents? We want to know about those, too.

Which Metrics Matter for Measuring User Engagement?

Posted
Jan 18th, 2011

Tags
IdeaLab

Author
Amanda Hickman

Cross-posted from PBS IdeaLab.

Gail Robinson’s recent post on traffic in a post-loyal era got me thinking about measures of web traffic and, more broadly, how to measure the impact of non-profit journalism.

I certainly don’t disagree with Gotham Gazette’s decision to pass on providing Yahoo with free content. There’s no good reason that Yahoo can’t create a lively community without wholly reprinting Gotham Gazette’s excellent original reporting free of charge.

There are probably good reasons that it would complicate Gotham Gazette’s work to license stories to a commercial outlet like Yahoo Local, too: As a non-profit, the local policy publication regularly livens up stories by illustrating them with images licensed only for non-commercial use, or by independently licensing photos that aren’t available under a Creative Commons license at all. Sorting out the images that can be re-licensed to a commercial entity like Yahoo isn’t a trivial project, especially not for a small local publication.

It doesn’t look like Gotham Gazette is alone in declining Yahoo’s advances — Yahoo Local’s New York City page was recently dominated by pleas for piety from someone in Georgia:

yahoolocal_crop.png

And I definitely appreciate the impulse to own your traffic. One of the reasons DocumentCloud is thriving right now is that we’ve been very careful to ensure news organizations aren’t handing traffic off to us. They own their traffic. They can keep track of their readership numbers, evaluate efforts to increase site visits, and slap as many ads and extra navigation elements on embedded documents as they want. Even so, they want more: Users and prospective users alike regularly ask for better metrics on the documents they’re publishing.

Meaningful Metrics

Oakland Local, a project as commendable for its willingness to share insights as for its local coverage and community, has been quite open about the stats they look at as meaningful: Page views, unique visitors, average time on site and returning traffic. Returning visitors made up half their traffic when they spoke with Michele McLelland last spring. They also keep an eye on where their readers are coming from — they’re interested in how much of their audience is reading from Oakland.

When I was at Gotham Gazette, in addition to those basic web analytics, I kept a close watch on our comments — their vibrancy struck me as a good measure of participation.

So what do you measure?

So I’m curious: Do you look for measures of your impact beyond the kind of numbers you show to advertisers? Share your thoughts in the comments below.

Discuss Which Metrics Matter for Measuring User Engagement? on PBS’s IdeaLab.

Frequently Asked Questions: Journalism School Edition

Posted
Jan 5th, 2011

Tags
Accounts,Workspace

Author
Amanda Hickman

We get a decent number of inquiries from journalism schools interested in incorporating DocumentCloud into their coursework. That’s great, it really is. If you take a look at our list of document contributors, you’ll see a nice collection of journalism schools, student reporting projects and investigative reporting institutes. We absolutely welcome journalism schools.

That said, there are a few things worth knowing before you contact us. Continue reading »

Frequently Asked Questions: WikiLeaks Edition

Posted
Dec 20th, 2010

Tags
Workspace

Author
Amanda Hickman

With WikiLeaks in the news, there are a few questions (two, actually) that we’ve been asked rather frequently of late, questions we hadn’t anticipated in our original list of frequently asked questions. Questions like …

Is DocumentCloud the new WikiLeaks? Isn’t OpenLeaks just a Swedish DocumentCloud?

No, not really. We’re both nonprofits dedicated to publishing data and documents, but that’s about it.

To join DocumentCloud, you need to be a journalist, or work a lot like one. Our goal is to help reporters publish more source documents and to build a catalog of primary (and secondary) source documents that individual journalists have researched and written about: we expect our users to be uploading documents they’re reporting on. Document contributors make a commitment to us that they’re confident of the authenticity of the documents they upload. And every user tells us their name — it goes right on every document. Continue reading »

Altering Docs? Now There’s a Tool for That in DocumentCloud

Posted
Dec 9th, 2010

Tags
IdeaLab

Author
Amanda Hickman

Cross-posted from PBS IdeaLab.

When we embarked on the DocumentCloud project, tools for altering documents were the furthest thing from our minds. After all, a responsible journalist doesn’t tweak source documents!

But one of the first papers to embed material using DocumentCloud needed to do just that. The Chicago Tribune accompanied their coverage of a troubled foster home with a collection of letters and court orders. Though the documents offered an excellent illustration of the state child services agency’s lax oversight and slipped follow-ups, they were predictably full of personal information about children in the foster care system, individual agency staff names and other personal and identifying details about private individuals that the Tribune opted to omit from their reporting. That decision, however, left the news apps team replacing the whole stack of letters multiple times before the package was finally ready to post.

A tool, right inside of DocumentCloud, for replacing, removing and re-ordering the pages of a document would have helped them a lot.

When the “PBS NewsHour” shared a century-old handwritten Mark Twain essay, our OCR tools were not nearly up to the task of reading his handwriting. NewsHour transcribed the 10-page essay by hand and we worked with them to replace the text stored in DocumentCloud and displayed on the embedded essay.

By the time that Memphis’ Commercial Appeal wanted to run a lengthy series of handwritten letters from Abdulhakim Mujahid Muhammad, a young Memphis-born man who opened fire on a military recruiting center in Little Rock last summer, we at DocumentCloud were busy supporting nearly 200 newsrooms — offering to hide the text tab was the best we could do.

What NewsHour and Commercial Appeal really needed was a tool, right inside of DocumentCloud, with which to edit the text of each document.

And so, we’ve assembled what we think is a sweet suite of tools to let you re-order pages, insert new ones, delete old ones and edit the text that will appear in your embedded document. Check out our user guide to see how it all works. We welcome your bugs, feedback, rants, raves and, as ever, your documents.

Discuss Altering Docs? Now There’s a Tool for That in DocumentCloud on PBS’s IdeaLab.

Announcing: Document Modification Tools

Posted
Dec 9th, 2010

Tags
Workspace

Author
Amanda Hickman

Fine-tuning text, adding, removing and reordering pages: when we embarked on this project, tools for altering documents were the furthest thing from our minds. A responsible journalist doesn’t tweak source documents! But one of the first papers to embed material using DocumentCloud needed to do just that. The Chicago Tribune accompanied their coverage of a troubled foster home with a collection of letters and court orders. Though the documents offered an excellent illustration of the state child services agency’s lax oversight and slipped follow-ups, they were predictably full of personal information about children in the foster care system, individual agency staff names and other personal and identifying details about private individuals that The Trib opted to omit from their reporting. That decision, however, left the news apps team replacing the whole stack of letters multiple times before the package was finally ready to post.

A tool, right inside of DocumentCloud, for replacing, removing and reordering the pages of a document would have helped them a lot. Continue reading »

Last Minute News Challenge Tips: Tell a Story, Be Realistic, and More

Posted
Nov 24th, 2010

Tags
IdeaLab

Author
Amanda Hickman

Cross-posted from PBS IdeaLab.

Planning to spend the long weekend finalizing your Knight News Challenge application? It’s too late for my favorite bit of advice (“don’t wait until the last minute!”), but as someone who’s been involved with three different winning projects, I like to fancy that I’ve got some insight into what makes a good project.

A half dozen prospective applicants have sat down with me to workshop their News Challenge ideas, and I think I’ve helped them think through their projects to get them to a more viable place. The application process isn’t hard, but you do need to give some sincere thought to your project or you’re just wasting your time. Here’s the advice I keep giving people: Continue reading »

DocumentCloud Users Make Ballot Design An Election Issue

Posted
Oct 27th, 2010

Tags
IdeaLab

Author
Amanda Hickman

Cross-posted from PBS IdeaLab.

When we make lists of the kinds of source documents users can upload to DocumentCloud, they can get pretty long. DocumentCloud is court filings, hearing transcripts, testimony, legislation, lab reports, memos, meeting minutes, correspondence. I can say with absolute confidence that in all of our planning, “ballots” never once came up as the sort of document a news organization might want to annotate for readers. Our relentlessly creative users have shown us otherwise.

This summer, the Memphis Commercial Appeal rounded out its guide to August’s primary elections with a sample ballot. Their digital content editor told us that many readers who’d missed the sample ballot in the print edition turned to the version online as primary day approached. Earlier this month, they added the general election ballot to that guide.

New York Ballots

WNYC, New York City’s NPR affiliate, also published a few ballots this summer. In an effort to comply with a 2002 federal law that mandates significant updates to voting systems in each state, New York City introduced paper ballots for the 2010 primary election, replacing the city’s famously arcane voting machines. One look at the new design and everyone was up in arms, proclaiming its absurdity, but WNYC actually invited a group of ballot design experts to review the city’s new ballots. Their findings: the ballot was confusing.

Design for Democracy works to increase civic participation, in part through a ballot design project that aims to make voting easier and more accurate. WNYC used Design for Democracy’s feedback to annotate a sample ballot on their blog, offering readers vital voting advice.

When the city released sample ballots for November’s general election, a local think tank pointed out that the instructions erroneously advise voters to mark the oval above their candidate’s name. In fact, the relevant ovals appear below candidates’ names. WNYC highlighted the issue by embedding a sample ballot on their blog. Apparently the “oval above” language was mandated by state law. Don’t believe me? See for yourself: WNYC posted the legislation, with the relevant passage highlighted.

From now on, my laundry list of things DocumentCloud catalogs will most definitely include ballots.

Discuss DocumentCloud Users Make Ballot Design An Election Issue on PBS’s IdeaLab.

The Best User Feedback Comes From Watching and Listening

Posted
Oct 12th, 2010

Tags
IdeaLab

Author
Amanda Hickman

Cross-posted from PBS IdeaLab.

ProPublica used DocumentCloud to develop an excellent story they published Friday. I’d planned to write it up, but Krista Kjellman Schmidt, the news applications editor who worked on the story, put it much better than I ever could have. Here’s the opening of her post:

On Oct. 8, we published an investigation examining how a judicial opinion in a pivotal lawsuit brought by a Guantanamo detainee vanished, only to be replaced weeks later by an entirely different opinion. At the center of our reporting are two documents representing separate versions of that same opinion: the original opinion written by Judge Henry H. Kennedy, and a second opinion quietly put in the original’s place more than a month later.

Read the rest of the post to enjoy the tale of these two important documents. What Schmidt doesn’t mention, however, is that I just happened to have been camped out in ProPublica’s offices last week while she was putting the story together. House guests and a nearby demolition project conspired to drive me from my home office, so I had an unusual vantage point on their story.

At one point, while I was savoring an excellent apple cake (I should decamp to Exchange Place more often!), I overheard the news applications team comparing notes on how best to prepare thumbnail images for a chart that was to accompany the investigation, and I realized that we had failed to alert our users to some of the bells and whistles under the hood of DocumentCloud.

For instance, we’ve got those thumbnail images ready already. No need to break out Firebug and manually resize graphics!
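The post does not spell out how those ready-made images are addressed, so what follows is only a minimal sketch. It assumes each document exposes a page image URL template with `{page}` and `{size}` placeholders, and that "thumbnail" is one of the available sizes. The template, the size names, and the helper `page_image_url` are all hypothetical stand-ins, not a documented DocumentCloud interface.

```python
# Assumed size names; check the real service before relying on them.
SIZES = ("thumbnail", "small", "normal", "large")


def page_image_url(template: str, page: int, size: str = "thumbnail") -> str:
    """Fill a page-image URL template for one page at one size.

    `template` is assumed to contain the literal placeholders
    "{page}" and "{size}".
    """
    if size not in SIZES:
        raise ValueError(f"unknown size: {size!r}")
    if page < 1:
        raise ValueError("page numbers start at 1")
    return template.replace("{page}", str(page)).replace("{size}", size)


if __name__ == "__main__":
    # Placeholder URL for illustration only.
    template = "https://example.org/documents/123/pages/sample-p{page}-{size}.gif"
    for page in (1, 2, 3):
        print(page_image_url(template, page))
```

If the assumption holds, building a chart of page thumbnails becomes a matter of generating URLs rather than resizing screenshots by hand.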

Which brings me to my real point: The only way we know what our users need from us is by watching them try to use DocumentCloud and listening to them describe the use cases at the outer edges of what we’d expected.

Discuss The Best User Feedback Comes From Watching and Listening on PBS’s IdeaLab.

How to: Grab Thumbnail Images

Posted
Oct 8th, 2010

Tags
Workspace

Author
Amanda Hickman

When Krista Kjellman Schmidt was putting together the chart illustrating key deletions in a set of documents, she needed a quick way to grab thumbnail images of particular pages. She was lucky: I just happened to be at ProPublica’s offices and just happened to overhear a few snippets of conversation. “You might be able to use Firebug,” someone suggested. “And then reduce the images?” another followed. I’m not sure what prompted me to look up (possibly the fact that I’m a busybody?) but I did, and when I realized what she was trying to do I had a far better suggestion. It went something like this: Continue reading »

Questions, Answered

Posted
Sep 22nd, 2010

Tags
Workspace

Author
Amanda Hickman

Last week, Investigative Reporters and Editors had us over for a webinar all about DocumentCloud. A video of the webinar will be available very soon from IRE. In the meantime, I did promise to answer every last question from the back channel.

We covered a lot of ground but there were plenty of questions that I couldn’t get to in the hour allotted. I realized, as I read through the questions that folks were asking, that a good many of our current users have been hesitant to ask questions when our documentation isn’t clear. Part of being in beta means that DocumentCloud is changing fast and often. Sometimes our documentation lags behind, sometimes we miss things. Sometimes what we think is clear makes no sense at all to you. So we’re trying something new: on top of all the other ways in which you’re more than welcome to contact us, we’re setting up virtual office hours. Stop by: we’ll take all questions, no matter how technical or mundane. As for the questions you’ve already asked … Continue reading »

DocumentCloud Helps Newspapers Bring Transparency to Government

Posted
Sep 7th, 2010

Tags
IdeaLab

Author
Amanda Hickman

Cross-posted from PBS IdeaLab.

Since we last updated readers on DocumentCloud’s progress, we’ve made it much easier to upload a lot of documents at once, and introduced a related documents search that uses data about names and places provided by OpenCalais to find documents that are probably related to the one you’re looking at. We’ve also added a bit more context to the data we help reporters comb through. Most of this work is happening inside the gates of the DocumentCloud workspace, but it is resulting in some lively reporting. For example…

Using Documents to Tell the Story

This summer, as the federal 5th Circuit Court of Appeals prepared to hear arguments in a challenge to the University of Texas’s affirmative action policy, Texas Tribune complemented its coverage of the case with nearly 200 pages of annotated court documents, including the original district court ruling, the university’s appellate brief, as well as that of the plaintiffs in the case.

The Las Vegas Sun incorporated quite a trove of documents into its series on hospital care in Las Vegas. Readers were invited to browse everything from Department of Health and Human Services reports to individual records, right along with the Sun’s reporters. When they say that hospital-acquired infections cost the country $30 billion per year or account for close to 100,000 deaths, they back each number up with original documents.

The Columbia Missourian annotated the city budget and took a local blogger to task for exaggerating Columbia, Missouri’s cash reserves.

When Texas Governor Rick Perry challenged reporters to find anyone who can out-work him, Texas Tribune posted the governor’s May 2010 schedule alongside that of Florida’s Gov. Crist, New York’s Gov. Paterson and California’s Gov. Schwarzenegger and invited readers to help them skim over a hundred pages of briefings, receptions and photo ops for stories deserving of a closer look.

The Washington Post supplemented its reporting on the cozy relationship between the oil industry and the federal agency assigned to regulate it with an annotated report on the prospects for “Moving beyond Conflict” between regulator and regulated. Their document cache also included reports outlining just how cozy things had gotten by 2008. As Emily Keller pointed out in Free Government Info, a transparency project, documents like these give more transparency to journalism itself.

New Features in the Testing Lab

We’re also hard at work fine tuning the document viewer, transforming it into something that users could reasonably plug into a template with a narrower content column. Thus far folks have been stuck with a full page viewer. We haven’t fully rolled it out yet, but we’ve worked with a couple of our beta testers to implement it already.

Iowa State has a new men’s basketball coach, and the Des Moines Register included all 14 pages of his contract in its coverage of the contract’s finer points. Among the unusual clauses? Hoiberg can walk away if the university decides to increase academic standards for student athletes beyond the NCAA’s minimum.

Meanwhile, at the Santa Fe Reporter, Alexa Schirtzinger opted not to publish tables of information right inside her story on elder abuse in New Mexico, but she did use her staff blog to share the data that she had such a hard time tracking down. An annotation highlights the numbers that showed her that New Mexico fields more abuse complaints per nursing home bed than any other state.

DocumentCloud watchers will notice that the Register posted the contract right on the same page as Randy Peterson’s writeup instead of displaying the document in a full page. We’ll be making tweaks like this a lot easier for all of our users. In the meantime, if you’re skilled at the art of reverse engineering JavaScript, you can view the source of the Register’s story (or the Reporter’s) to see just how they toggled the sidebar or zoom on those documents.

Discuss DocumentCloud Helps Newspapers Bring Transparency to Government on PBS’s IdeaLab.

Uploading Documents Gets a Little Easier

Posted
Aug 18th, 2010

Tags
Workspace

Author
Amanda Hickman

You’ve always been able to script batch uploads using our API, but for users without coding skills, uploads were one at a time. Today we’re rolling out an improved document uploading dialog that will let you upload as many documents as you want, all in one fell swoop.

You’ll still use the “New Documents” button, but now that button takes you straight to a file selection dialog.
Use the control (on MS Windows) or command (on Macs) key to select additional documents, just like you would in your file browser.

File selection screenshot

We’ll start you off by suggesting a title, based on each file’s name, but you can edit that name and add additional information, including the source of each document and a description. As with the old upload dialog, you can decide right when you upload your document whether or not you’re ready to share it with the world yet. As ever, you can edit all of these fields again later.

If your documents share a common source or description, use the “Apply to All Files” link to copy your metadata to each document in this batch. Note: this new upload interface requires Flash. If that’s an issue for you, let us know ASAP and we’ll whip up an alternate interface that doesn’t require any plugins. Promise.
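For anyone scripting a batch instead of using the dialog, the same two conveniences (a title suggested from the file name, and shared metadata copied to every file) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not DocumentCloud's own code: the helpers `suggest_title` and `build_batch`, the `file` and `access` keys, and the "private" default are all hypothetical, and nothing here talks to the API.

```python
from pathlib import Path


def suggest_title(filename: str) -> str:
    """Guess a readable title from a file name (hypothetical heuristic)."""
    stem = Path(filename).stem  # drop the directory and the extension
    words = stem.replace("_", " ").replace("-", " ").split()
    return " ".join(word.capitalize() for word in words)


def build_batch(filenames, shared=None):
    """Return one metadata dict per file, copying shared fields to each.

    `shared` plays the role of "Apply to All Files": every key in it
    (for example source or description) lands on every entry.
    """
    shared = shared or {}
    batch = []
    for name in filenames:
        # "access" defaulting to "private" is an assumption for this sketch.
        entry = {"file": name, "title": suggest_title(name), "access": "private"}
        entry.update(shared)
        batch.append(entry)
    return batch


if __name__ == "__main__":
    files = ["city_budget_2010.pdf", "hearing-transcript.pdf"]
    common = {"source": "City Clerk", "description": "Records request"}
    for entry in build_batch(files, shared=common):
        print(entry)
```

Each resulting dictionary could then be handed to whatever upload call your script uses; titles and other fields remain editable afterward, just as in the dialog.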

As the files upload from your computer to DocumentCloud, you’ll see the progress of each transfer.

Better Processing, Too
This week’s release is more than just a new upload dialog. We’ve made some big changes to Docsplit and the RightAWS gem, and we’re hoping this means the dreaded “import failed” error will be a thing of the past. If looking under the hood is your thing, both Docsplit and our fork of RightAWS are on GitHub for your viewing (and reusing) pleasure.

Don’t be a Stranger
If you have gigabytes worth of documents to upload, get in touch before you start uploading so we can add more horsepower to handle your job. Otherwise, happy uploading! And don’t forget to tell us about what you’re publishing with DocumentCloud.