We’re happy to announce that we’ve received a $420,000 grant from the John S. and James L. Knight Foundation for a new project that we’re calling “PublicBits.” Our goal is to collect the world’s open data sources and make their historical data available through a dataset registry at PublicBits.org. This “dataset of datasets” will be accessible through a free, decentralized, redundant, open network inspired by BitTorrent. PublicBits.org will use Dat, our flagship project, to craft a new, collaborative framework that breaks down the silos of data portals. Our project takes inspiration from GitHub, the popular platform that lets developers build software together. We plan to bring the collaborative, historical, and reusable nature of open source to the open data movement.
Developer Max Ogden started building Dat with support from a Knight Prototype Grant in 2013. Over the past two years, Dat has evolved with input from civic tech, journalism, and data science partners, and with $900,000 in funding from the Alfred P. Sloan Foundation. Through this process, we identified a significant pain point that spans the open data landscape: the discovery and distribution of data. With PublicBits, we offer a new vision for an open data architecture that brings structure to the openness of data by making it findable, manageable, and reusable.
Long ago, Max Ogden set out to build a “GitHub for data,” but through prototypes, we learned that centralized hosting works differently for data than for source code, which poses problems for hosting big data. It’s possible to centrally host code as a service because code is small, but datasets are often too large to shuttle in and out of cloud storage, both because doing so can be computationally expensive and because it costs a lot of money. A distributed approach is more robust, cheaper, faster, and more open than a centralized model because it provides a strong network of redundant peers. To that end, we are pursuing partnerships with cloud storage companies, academic institutions, and internet freedom organizations to ensure there will be “super-sharer” hosts available, with lots of reliable storage and bandwidth.
The Dat development team will continue to be led by creator Max Ogden, while Dat team member Karissa McKelvey will take the reins at PublicBits. It will be a challenge to find the right team and execute the plan within the allotted time for the grant. The design must also contend with data that could be very large, diverse, and complicated. We need to make sure users are not overwhelmed by the amount of data, and that we can return relevant results. These challenges will not be insurmountable with the right team. Thankfully, we are already experts in the field and have built simple prototypes of the PublicBits registry and desktop application. Feel free to check out the code on GitHub. Contributions welcome!