Monday, August 16, 2010

PivotViewer: More Than Just Images

PivotViewer (also known simply as Pivot) is a framework from Microsoft Live Labs intended to support analysis of large datasets where each data entity has an image associated with it. We say this carefully because at first glance it looks like “yet-another-image-gallery-application”, but it really is not (although we’d agree that you could use it for that purpose if you wanted, just as you can use a chisel to pull up carpet tacks and a 6-burner commercial-class stove to cook a packet of soup).

Screenshot of AMG Movie Demo in PivotViewer showing the tiled view


The Silverlight-enabled viewer works much like an Excel pivot table. Data can be filtered by any of the available facets/categories, supplemented (if required) by keyword searching. Images can be shown in tiled view or organized in bar chart view by a chosen facet. Drilling down to item detail is as simple as zooming into an image. The corresponding data is displayed in a list to the side, and adjacent items can be quickly stepped through using forward/backward buttons.

Screenshot of AMG Movie Demo in PivotViewer showing chart view
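
To make the filter-then-chart idea concrete, here is a minimal Python sketch of faceted filtering with keyword search, followed by the per-facet counts that a bar chart view would display. It is only an illustration of the concept, not how PivotViewer itself is implemented; the function names and the sample movie records are hypothetical.

```python
from collections import Counter

# Hypothetical sample records: each item has a name and a set of facet values.
MOVIES = [
    {"name": "Alpha", "facets": {"Genre": "Drama", "Decade": "1990s"}},
    {"name": "Beta Returns", "facets": {"Genre": "Comedy", "Decade": "1990s"}},
    {"name": "Gamma", "facets": {"Genre": "Drama", "Decade": "2000s"}},
]


def filter_items(items, facet_filters=None, keyword=None):
    """Keep items that match every facet filter and, if given, the keyword.

    facet_filters maps a facet name to the set of values that are allowed.
    The keyword is matched case-insensitively against the item name.
    """
    facet_filters = facet_filters or {}
    keyword = keyword.lower() if keyword else None
    matches = []
    for item in items:
        facets = item["facets"]
        # Reject the item if any filtered facet has a value outside the allowed set.
        if any(facets.get(name) not in allowed
               for name, allowed in facet_filters.items()):
            continue
        if keyword and keyword not in item["name"].lower():
            continue
        matches.append(item)
    return matches


def count_by_facet(items, facet_name):
    """Count items per value of one facet: the heights of the bars in chart view."""
    return Counter(item["facets"].get(facet_name, "(no value)") for item in items)


nineties = filter_items(MOVIES, {"Decade": {"1990s"}})
bars = count_by_facet(nineties, "Genre")
```

Here `nineties` holds the two 1990s titles, and `bars` gives one count per genre among them, which is what the chart view would draw as bar heights.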



Underpinning the framework is the concept of a “collection”. A collection comprises a set of images and an XML file describing the images. The CollectionXML schema is a set of property-values that specify the collection as a whole, the facet categories into which the collection is organized and the individual items. The images in the collection are stored in Deep Zoom format and rendered using Seadragon technology.


CollectionXML Schema Overview
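
As a rough illustration of those three parts (the collection as a whole, its facet categories and its individual items), the following Python sketch generates a small collection file with the standard library. The element names, attribute names and namespace are written from our recollection of the CollectionXML format and should be treated as assumptions to check against the published schema; the `build_collection` helper and the `equipment.dzc` image path are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace as we recall it from the CollectionXML documentation (assumption).
CXML_NS = "http://schemas.microsoft.com/collection/metadata/2009"


def build_collection(name, facet_types, items, img_base="equipment.dzc"):
    """Return CollectionXML-style text describing a collection.

    facet_types: dict of facet category name -> type, e.g. {"Material": "String"}
    items: list of dicts with "name", "description" and "facets" keys
    img_base: hypothetical path to the Deep Zoom collection holding the images
    """
    # Properties of the collection as a whole.
    root = ET.Element(
        "Collection",
        {"Name": name, "SchemaVersion": "1.0", "xmlns": CXML_NS},
    )

    # The facet categories into which the collection is organized.
    categories = ET.SubElement(root, "FacetCategories")
    for facet_name, facet_type in facet_types.items():
        ET.SubElement(categories, "FacetCategory",
                      {"Name": facet_name, "Type": facet_type})

    # The individual items, each referring to its image by index.
    items_el = ET.SubElement(root, "Items", {"ImgBase": img_base})
    for index, item in enumerate(items):
        item_el = ET.SubElement(
            items_el, "Item",
            {"Id": str(index), "Img": f"#{index}", "Name": item["name"]},
        )
        ET.SubElement(item_el, "Description").text = item["description"]
        facets_el = ET.SubElement(item_el, "Facets")
        for facet_name, value in item["facets"].items():
            facet_el = ET.SubElement(facets_el, "Facet", {"Name": facet_name})
            # The value element is named after the facet type (e.g. <String>).
            ET.SubElement(facet_el, facet_types[facet_name], {"Value": str(value)})

    return ET.tostring(root, encoding="unicode")


cxml_text = build_collection(
    "Lab Equipment",
    {"Material": "String"},
    [{"name": "Beaker", "description": "250 ml beaker",
      "facets": {"Material": "Glass"}}],
)
```

The sketch covers only the XML side of a collection; the images still have to be converted to Deep Zoom format separately.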


Creating a Pivot collection is not as intimidating or difficult as it might sound, because Live Labs provides several tools to facilitate the process, including one that is based on Excel.

Screenshot of Pivot Collection Tool for Excel


This summer, Live Labs also released a Silverlight 4 control which can be embedded in web sites (including SharePoint) and used to view, manipulate and analyze collections. The tools are available (for free) from the Pivot site. The Silverlight PivotViewer control can be downloaded from: www.silverlight.net/learn/pivotviewer/

Our initial interest in PivotViewer was its visualization capability and its potential for presenting complex data in ways that make it easier for users to understand and analyze. To this end we decided to try it out for ourselves and build a mini application using the Silverlight control as the viewer and the Pivot Collection Tool for Microsoft Excel to create the underlying collection. We had available a small collection of data and images relating to laboratory equipment and thought this would provide an interesting proof-of-concept.

Unlike many “interesting concept” toolsets we have attempted to deploy in the past, this one turned out to be very straightforward to use, despite a paucity of documentation. While the Excel Collection tool is “plug-and-play”, some knowledge of .NET development and Silverlight is necessary to deploy the PivotViewer control. Thanks to Tim Heuer’s very helpful blog on how to deploy PivotViewer, we were able to get a basic Lab Equipment PivotViewer up and running very quickly.

Screenshot of Laboratory Equipment application - tile view


Screenshot of Laboratory Items by material type (chart view)



Although we knew going in that the small number of data items we were using was less than ideal (more is definitely better here), we thought that the set of uniform images we had available (complete with color coding) and the supporting data about the equipment (size, material type, category, descriptions, etc.) would make up for it. We were wrong! We had focused on the images and these, while necessary, are not sufficient. What is absolutely essential to make the most of this application is rich data. We had only two main facets and a small number of parameters for each, which significantly limited what we could do.

Contrast this with the AMG Movie demo provided as a sample with the control, where each movie is accompanied by a wealth of information: a description as well as faceted data such as date of release, director, actors, genre, box office takings, countries and runtime. It is this information that fuels the application.

Close-up of Movie demo item and accompanying data


When thinking about how Pivot could be used, our first thoughts had been the obvious “image gallery” type applications: a web-enabled version of an art gallery or museum, for example. The “out-of-the-box” ability to support filtering by multiple facets, supplemented by keyword searching, would be ideal. Users could look, for example, for all Impressionist paintings depicting lakes painted in France between this date and that. Similarly, it could be used to develop a very useful, usable interface to any large catalog of items: from clothing (women’s jeans boot-cut dark-wash) to hardware (small plate door knocker solid brass satin nickel finish).

However, it was when playing with the movie application that we realized that thinking of it as simply a front end to a catalog was to underplay its potential. We had started to look at the box office takings facet and it was then that the penny dropped. We found ourselves looking for patterns. What correlations were there between directors, actors and takings? It was very easy to ask these questions and then focus in on the results, arranging the items as tiles or as bar graphs. We could see the visual potential of PivotViewer really coming into play when looking at, for example, trends in sales of clothing or even real estate: anything where visual appearance (from color to style) is a factor in sales, cost of manufacture, page views or some other key metric.

Screenshot from AMG Movie demo showing Movies by Box Office Gross


In the movie demo, the images are a nice-to-have as a visualization but are not an essential part of the analysis per se. In other cases, we could envisage the images themselves being an essential part of the analysis. For example, retailers often study the selling power of pages in their printed catalogs or web sites to determine which layouts are the most effective. PivotViewer would make this a very easy analysis to conduct. Similarly, a greetings card manufacturer could look for patterns and trends in consumer choice of design.

In summary, we believe this technology has great potential when deployed in environments that are data rich and where visual appearance either is correlated with one or more key metrics or can facilitate visualization of complex data simply by making the individual items (or groups of items) more recognizable.