We added excerpts to our timelines and Entities tab today. We use OpenCalais to parse every document you upload and extract the names of people, organizations, terms and places in the text, and we display these under the Entities tab. We also extract date information from each document you upload and plot those dates on a timeline, which you can access from the Analyze menu.
DocumentCloud’s entities show you, at a glance, the people who are mentioned the most times in a given project, or the organizations that are named in each of the documents you’ve selected. Here’s how it works now:
Click on the “show pages” link next to any entity to reveal a thumbnail of each page in each document that contains that term, alongside excerpts highlighting the mention of the entity in the text. Clicking on the highlighted phrase will take you directly to the term within the document itself. In the screenshot below, you can see how the Environmental Protection Agency was correctly identified by both its proper name and its acronym.
We’ve added excerpts to the timeline as well. When you open a timeline from the Analyze menu and scroll over any date, you’ll see a few words of surrounding text along with the date as it appears in the document. That is useful for corroborating a single event across multiple sources or for comparing different accounts of what should be a shared timeline. Click on a date to go straight to the point in the document where that date appears.
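For the curious, the idea behind a timeline excerpt can be sketched in a few lines of Python. This is only an illustration and not DocumentCloud's actual code: the `date_excerpts` function, the `DATE_RE` pattern and the four-word window are all assumptions made up for this example, and the pattern only recognizes dates written like "March 3, 2010".

```python
import re

# Hypothetical pattern for illustration: matches dates such as "March 3, 2010".
# Real date extraction would need to handle many more formats.
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
DATE_RE = re.compile(r"\b(?:%s) \d{1,2}, \d{4}\b" % MONTHS)


def date_excerpts(text, words=4):
    """Return (date, excerpt) pairs for every date found in `text`.

    Each excerpt keeps up to `words` words on either side of the date,
    so the date can be read in context.
    """
    results = []
    for match in DATE_RE.finditer(text):
        # Words immediately before and after the matched date.
        before = text[:match.start()].split()[-words:]
        after = text[match.end():].split()[:words]
        excerpt = " ".join(before + [match.group()] + after)
        results.append((match.group(), excerpt))
    return results


if __name__ == "__main__":
    sample = ("The agency issued its final ruling on March 3, 2010 "
              "after months of public comment.")
    for date, excerpt in date_excerpts(sample):
        print(date, "|", excerpt)
```

Running the same function over several documents and grouping the results by date would give a rough version of the side-by-side comparison the timeline makes possible.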
Hopefully excerpts will come in handy for your DocumentCloud projects. If you think of a way we can make them even more useful, comment or let us know.