Data Guy / BI Playground

I was recently talking to Centric’s very own Azure MVP, Michael Collier. Michael told me about something pretty cool he learned at this year’s PDC: Microsoft’s recent release of Windows Azure Marketplace DataMarket (code-named Dallas). The DataMarket essentially acts as a clearinghouse of commercial and public data sets, making it easier to find and purchase the data you need to power your application or analysis (www.datamarket.azure.com). From the Microsoft site:

Content partners who collect data can publish it on DataMarket to increase its discoverability and achieve global reach with high availability. Data from databases, image files, reports and real-time feeds is provided in a consistent manner through Internet standards. Users can easily discover, explore, subscribe and consume data from both trusted public domains and from premium commercial providers.

End users who need data for business analysis and decision making can conveniently consume it directly in Microsoft Office applications such as Microsoft Excel and Microsoft BI tools (PowerPivot, SQL Server Reporting Services). Users can gain new insights into business performance and processes by bringing together disparate datasets in innovative ways.

Application developers can use data feeds to create content rich solutions that provide up-to-date relevant information in the right context for end users. Developers can use built-in support for consumption of data feeds from DataMarket within Visual Studio or from any Web development tool that supports HTTP.

Microsoft has also worked to create numerous mechanisms for consuming the data, including integration with Azure, Excel, C#, and Windows Phone 7.
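To give a feel for what consuming a feed from "any Web development tool that supports HTTP" might look like, here is a minimal Python sketch. It rests on several assumptions that are mine, not taken from the Microsoft text: that the feed accepts OData-style query options ($top, $filter, $format), that it authenticates with HTTP Basic using your account key as the password, and that the service root, the entity set name (CrimeStats), and the helper names are purely hypothetical placeholders. Check the documentation for the data set you subscribe to before relying on any of it.

```python
import base64
import json
import urllib.parse
import urllib.request

# Hypothetical placeholder, not a real DataMarket endpoint.
SERVICE_ROOT = "https://example.com/feed"


def build_query_url(service_root, entity_set, top=None, filter_expr=None):
    """Build a query URL with OData-style options (assumed to be supported)."""
    params = {"$format": "json"}
    if top is not None:
        params["$top"] = str(top)
    if filter_expr:
        params["$filter"] = filter_expr
    # Keep "$" literal in option names; percent-encode spaces as %20.
    query = urllib.parse.urlencode(params, quote_via=urllib.parse.quote, safe="$")
    return f"{service_root.rstrip('/')}/{entity_set}?{query}"


def basic_auth_header(account_key, user="accountKey"):
    """Build an HTTP Basic header; the user name here is an assumption."""
    raw = f"{user}:{account_key}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def fetch_rows(url, account_key):
    """Send the GET request and parse the JSON body (needs a real endpoint)."""
    request = urllib.request.Request(url, headers=basic_auth_header(account_key))
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only builds and prints the URL; no network call is made here.
    print(build_query_url(SERVICE_ROOT, "CrimeStats", top=5,
                          filter_expr="Year eq 2008"))
```

With a real service root and account key, fetch_rows(url, key) would return the parsed JSON response, which you could then hand to whatever analysis tool you prefer.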

So what kind of data is available? Right now specific data sets are a bit sparse, but the broad categories include Entertainment & Media, Health and Wellness, Sports, Transportation & Navigation, and Weather. Some specific sets I discovered while browsing the catalog:

  • Practice Fusion Medical Research Data – key info needed to power clinical research
  • Major League Baseball Stats, and
  • 2006 – 2008 Crime In The United States.

I will be interested to see whether this repository grows, or whether alternative data clearinghouses such as Google Public Data Explorer will continue to grow instead.

I welcome your comments,
Mike Brannan