Job posting: DocumentCloud seeks data engineer

Posted
Mar 7th, 2015

Tags
Jobs

Author
Anthony DeBarros

We’re looking for a data engineer to join the growing team at DocumentCloud! If you’d enjoy a chance to help develop the next generation of our service — an open-source civic platform that more than 1,000 news organizations use to analyze, annotate and publish documents for the public good — we’d love to hear from you.

This is a full-time, two-year position funded by a grant from the Knight Foundation, and it comes with full University of Missouri benefits. We’re a nimble, tightly knit team that works remotely and stays connected via Slack and video chats, so you can live where you’d like and work flexible hours.

You’ll work to improve the extraction and analysis capabilities of DocumentCloud’s processing pipeline, which makes searching and analyzing document collections accessible to journalists. The pipeline consists of several open-source tools wrapped up in our Ruby-based infrastructure (a Rails-driven API and our CloudCrowd parallel processing toolkit). You’ll also play a key role in developing our production API capabilities, with a particular focus on what information we extract from documents for users and how best to extract it.

Our ideal candidate would have the following skills and qualities:

— Independent problem-solver who values learning, keeps current on trends, and knows how to pick the right set of tools for a problem.
— Able to write clean, well-documented code; you know your way around Git, and your GitHub account shows activity.
— Strong ability to collaborate and communicate with a distributed team.
— Experience with Ruby and Rails.
— Experience with Unix-based systems.
— Some knowledge of data science, linguistics, information extraction or search. Solr experience is a bonus.
— An interest in language and data processing.
— Knowledge of SQL (Postgres preferred).

You’ll join DocumentCloud at a significant time. We’re enjoying widespread use of our platform, and our tools have been used to investigate and publish stories ranging from the grand jury decision in Ferguson, Missouri, to the Guardian’s NSA spying leaks. We collaborate with organizations such as The Washington Post, The Associated Press and Mozilla’s OpenNews fellows to build better ways to present the news, and you’ll have the chance to be part of the community exploring this intersection of news, data and technology.

To apply, please contact us at jobs@documentcloud.org
