Thursday, April 8, 2010

Other Data Blogs

A simple Google search of the term “Data Blog” results in 148,000 hits.

Conclusion: what we’re attempting is nothing new. However, compared to a search of “News Blog” (20.9 million hits) or “Sports Blog” (8.6 million hits), data blogging has yet to take flight (assuming that it will).

So data blogging is in a strange place: a few have taken the lead, yet the volume is nothing compared to other forms of blogging. The first order of business, then, is to see what these select few are up to.

From some simple Google research, here are some key findings about data blogs:
  1. Data blogs are predominantly blogs on data visualization. To me, this makes sense because people are trying to find new ways to display data and go beyond the ubiquitous scatter plots and pie charts. Also, since people are visual animals, this topic attracts more traffic than, say, writing about data mining, which makes it more profitable and pervasive.

  2. These data display sites come in three different forms:

    (a) Sites that provide tools to display data
    (b) Sites that display interesting data
    (c) Sites that discuss data visualization

  3. Many of the data blogs that feature interesting data are concerned with showcasing contemporary data. The Guardian has an excellent one called the Data Blog.

  4. There are virtually no sites that are blogging exclusively about historical data and its implications (aka what we’re doing). Hooray! There are, however, historical blogs and data blogs that occasionally write about history, though none are completely devoted to historical data. Note: there are also insanely interesting blogs about the history of data.

  5. Then you have really esoteric sites writing about data mining, data storage, legality of data, etc.

Since the Dataverse blog is a companion site to the Dataverse website, I also looked at what other people are doing in terms of data archiving. Here’s what I found:
  1. In general, there are two types of sites out there: sites that archive and sites that point to these archives. The Dataverse Project does both.

  2. The Dataverse site is far from perfect. We can learn from sites that are models for categorization (spatial, temporal, by discipline), for archiving, and for data attribution.

  3. There are also sites that provide space to archive. A well-known one is the Dataverse Network Project (not to be confused with our World-Historical Dataverse), run by the folks at Harvard.

  4. Finally, you have the sites of my dreams: sites that do not exist but that I wish did. These include blogs on the nature of data, on data software, on open source data, etc.

One of the main goals of this site is to delve deeper into each of these findings. Going forward, we hope to write about each of these elements in depth, in addition to the many other issues that will inevitably arise. Stay tuned!
