Saturday, September 18, 2010

Analyzing Email Communications: An Ego-Centric Approach

As a quick scan through prior blogs will show, throughout this year we have been exploring the application of social network visualization software to email communications. Our interest has been two-fold. First, we wanted to find tools to support those working in legal and regulatory environments who need to examine large numbers of emails for answers to “Who, What, Where, When and Who Knew” kinds of questions. Second, we wanted to see whether this approach might give behavioral psychologists tools to identify, and/or objectively measure, communication issues in workplace teams. In many workplace situations, email has become the primary communication mechanism, whether through cultural factors (as with many IT teams) or because of distance (as with geographically dispersed teams). At the same time, communication issues are cited as one of the primary reasons why projects fail. It seemed to us that tools for analyzing the flow of email communications in a team might help identify team members who are outside the group, or who have significantly fewer interactions with key individuals in the team, thereby enabling remedial action to be taken.

Software we have looked at so far includes Gephi, which is useful for large data sets, and NodeXL, which is easy to use and well suited to analyzing smaller groups of individuals, with great options for customizing the appearance of the graphs (e.g. color coding particular attributes or clusters). Data feeds into both are organized basically as edge lists and node lists, with Gephi requiring XML formatting and NodeXL spreadsheet or CSV lists. (Note: in an email environment, a node is an individual, represented by either an email address or a name, and an edge is the communication between two individuals, with the volume of communications represented by a weight measure.) The visualizations produced look at communication and clustering from a bird's-eye view across the entire data set.
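
To make the edge list and node list idea concrete, here is a minimal Python sketch of how raw sender/recipient pairs could be collapsed into a weighted edge list and a node list. The sample addresses and the function names build_edge_list and build_node_list are our own illustrative inventions, not part of Gephi or NodeXL, and a real export would also need to deal with CC lists, aliases and similar complications.

```python
from collections import Counter

# Hypothetical sample data: one (sender, recipient) pair per email sent.
messages = [
    ("alice@example.com", "bob@example.com"),
    ("alice@example.com", "bob@example.com"),
    ("bob@example.com", "carol@example.com"),
    ("carol@example.com", "alice@example.com"),
]


def build_edge_list(pairs):
    """Collapse individual emails into weighted, directed edges."""
    counts = Counter(pairs)
    # Each edge is (source, target, weight), where weight = number of emails.
    return sorted((src, dst, weight) for (src, dst), weight in counts.items())


def build_node_list(edges):
    """Every address that appears as a sender or a recipient is a node."""
    return sorted({person for src, dst, _ in edges for person in (src, dst)})


edges = build_edge_list(messages)
nodes = build_node_list(edges)

for src, dst, weight in edges:
    print(src, dst, weight)
```

The two emails from alice to bob become a single edge with a weight of 2, which is the "volume of communications" that the tools later draw as link width or a link label.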

UCINET takes a somewhat different approach. UCINET is a social network analysis program developed by researchers at the University of Kentucky and distributed by Analytic Technologies (see www.analytictech.com/ucinet/). There is a free trial version, and there are relatively low-cost options for students, researchers and single users.

Unlike NodeXL or Gephi, UCINET is not a complete visualization package but only the analytic engine. It is, however, integrated with a freeware program called NETDRAW. Since both are included in the download package, installation is straightforward. We did find in practice, though, that the package behaves like a set of separate tools operating on a common data set, in contrast to the more integrated environments of NodeXL or Gephi. Another difference is that UCINET works on matrices, not edge/node lists. Fortunately, it has an import function which accepts a standard edge list (e.g. person1, person2, weight) in Excel format. The import function then converts this into a matrix for analysis and visualization.
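
For readers who have not met the matrix form before, the conversion can be sketched in a few lines of Python. This is only an illustration of the general idea, not UCINET's actual import routine; the function name edge_list_to_matrix and the sample rows are assumptions of ours.

```python
def edge_list_to_matrix(edges):
    """Turn (source, target, weight) rows into a square adjacency matrix.

    Row i, column j holds the weight of the edge from labels[i] to labels[j],
    or 0 if there is no such edge.
    """
    labels = sorted({person for src, dst, _ in edges for person in (src, dst)})
    index = {name: i for i, name in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for src, dst, weight in edges:
        # Repeated rows for the same pair are added together.
        matrix[index[src]][index[dst]] += weight
    return labels, matrix


# Hypothetical edge list in the person1, person2, weight layout.
edges = [
    ("person1", "person2", 5),
    ("person2", "person3", 2),
    ("person3", "person1", 1),
]

labels, matrix = edge_list_to_matrix(edges)
for name, row in zip(labels, matrix):
    print(name, row)
```

Note that the matrix grows with the square of the number of individuals, which is one reason a matrix-based tool suits smaller networks better than very large ones.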

Our test data set is the same as before: an anonymized set of email communications. For this investigation we started with a small subset of 368 nodes and 1223 edges.

NETDRAW visualization of entire email network


NETDRAW is by no means as sophisticated as the graphical packages in Gephi or even NodeXL. Where the UCINET/NETDRAW package comes into its own, however, is in its ability to home in easily on a selected set of individuals. A checklist menu of nodes appears on the right-hand side of the graph, and altering the selections immediately redraws the graph showing only those individuals and their connections. We think this is very helpful when drilling down to investigate the interactions between a particular group of people.

Another great feature of UCINET/NETDRAW is its ability to visualize interactions from an “ego” perspective. Once an initial “ego” is selected, the software identifies all the individuals in communication with that individual and produces a subgraph of the communications between them. For example, simply selecting “Carmela Soprano” produced the following subgraph.

"Carmela Soprano" Ego Network Graph


NETDRAW can be configured to represent the volume of communications as the width of the link:

Network Graph with Link Width Representing Communication Volume


Or with the volume shown in a link label:

Network Graph with Link Label Showing Communication Volume
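
The ego perspective described above is easy to express in code. The following Python sketch assumes the weighted edge list format used earlier (source, target, weight). The function name ego_network and the sample names are hypothetical, and this is our reading of what an ego network is rather than NETDRAW's own implementation.

```python
def ego_network(edges, ego):
    """Return the edges among the ego and everyone in direct contact with it."""
    # Alters: anyone who received mail from, or sent mail to, the ego.
    alters = {dst for src, dst, _ in edges if src == ego}
    alters |= {src for src, dst, _ in edges if dst == ego}
    members = alters | {ego}
    # Keep only edges whose two endpoints are both inside the ego network,
    # so links between the alters themselves are included too.
    return [(s, d, w) for s, d, w in edges if s in members and d in members]


# Hypothetical weighted edge list: (sender, recipient, number of emails).
edges = [
    ("Ann", "Bob", 3),
    ("Bob", "Cy", 1),
    ("Cy", "Ann", 2),
    ("Cy", "Dee", 4),
    ("Dee", "Eve", 1),
]

subgraph = ego_network(edges, "Ann")
for sender, recipient, volume in subgraph:
    print(sender, recipient, volume)
```

Because the weights travel with the edges, the same subgraph can drive either display option: the weight becomes the link width or is printed as the link label.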


UCINET offers a range of node centrality measures including Closeness, Betweenness, Degree and Eigenvector. (For information about what these measures represent, see previous blogs or go to: http://en.wikipedia.org/wiki/Betweenness_centrality#Eigenvector_centrality). Once the measures are calculated, nodes can be colorized to represent one of the selected measures. For example, the nodes on the sub-graph below have been colorized to represent the value of the Indegree attribute.

It is also possible to filter based on a particular measure. The graph below shows the entire set filtered to show only nodes with high Eigenvector scores (a measure of the importance of the individual in the network).

Network filtered by Eigenvector Measure (to show 'Important' individuals only)
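
As a rough illustration of what sits behind two of these measures, here is a small Python sketch that computes Indegree and an Eigenvector score and then filters on the latter. It is a simplified approximation under our own assumptions: the graph is treated as undirected and unweighted for the Eigenvector calculation, scores are scaled so the top node has a score of 1, and the 0.7 cut-off is arbitrary. UCINET's own calculations and scaling may well differ.

```python
def indegree(edges):
    """Count how many distinct senders emailed each person."""
    counts = {}
    for src, dst, _ in edges:
        counts.setdefault(src, 0)
        counts[dst] = counts.get(dst, 0) + 1
    return counts


def eigenvector_centrality(edges, max_iter=200, tol=1e-10):
    """Approximate eigenvector centrality by power iteration."""
    neighbours = {}
    for src, dst, _ in edges:
        neighbours.setdefault(src, set()).add(dst)
        neighbours.setdefault(dst, set()).add(src)
    if not neighbours:
        return {}
    scores = {node: 1.0 for node in neighbours}
    for _ in range(max_iter):
        # Each node's new score is its own score plus its neighbours' scores.
        # Keeping the node's own score avoids oscillation on bipartite graphs
        # and leaves the final ranking unchanged.
        updated = {
            node: scores[node] + sum(scores[other] for other in neighbours[node])
            for node in neighbours
        }
        top = max(updated.values())
        updated = {node: value / top for node, value in updated.items()}
        if max(abs(updated[node] - scores[node]) for node in updated) < tol:
            return updated
        scores = updated
    return scores


# Hypothetical edge list: H is a hub that three colleagues email.
edges = [
    ("A", "H", 2),
    ("B", "H", 2),
    ("C", "H", 1),
    ("A", "B", 3),
    ("C", "D", 1),
]

in_counts = indegree(edges)
scores = eigenvector_centrality(edges)

# Filter: keep only the nodes whose score is at least 70% of the top score.
important = sorted(node for node, score in scores.items() if score >= 0.7)
print(in_counts)
print(important)
```

In this toy network the hub H scores highest, and D, who is connected only to C, scores lowest, which matches the intuition that a person is important if they communicate with other important people.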



UCINET/NETDRAW also has a number of algorithms for analyzing subgroups. For example, in the subgraph below (an “ego” network for Tom Hagen), it has identified three factions, represented by three different colors: red, blue and black.

Graph identifying Factions within a Subgroup


An analysis of cliques in the entire set identified 60 separate groups shown in the graph below.

Graph showing the 60 cliques identified in the data set
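
A clique here means a group in which every member communicates directly with every other member, and which cannot be enlarged without breaking that rule. The Python sketch below finds such groups using the basic Bron-Kerbosch algorithm, a standard technique for listing maximal cliques. The function name maximal_cliques, the minimum size of 3 and the sample data are assumptions of ours, and we are not claiming this reproduces UCINET's routine or its settings.

```python
def maximal_cliques(edges, min_size=3):
    """List maximal cliques using the basic Bron-Kerbosch algorithm."""
    # Direction and weight are ignored: any email in either direction
    # counts as a tie between two people.
    neighbours = {}
    for src, dst, _ in edges:
        if src != dst:
            neighbours.setdefault(src, set()).add(dst)
            neighbours.setdefault(dst, set()).add(src)

    found = []

    def expand(clique, candidates, excluded):
        # With nothing left to add and nothing excluded, the clique is maximal.
        if not candidates and not excluded:
            if len(clique) >= min_size:
                found.append(sorted(clique))
            return
        for node in list(candidates):
            expand(
                clique | {node},
                candidates & neighbours[node],
                excluded & neighbours[node],
            )
            # This node has been fully explored, so move it to the excluded set.
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(neighbours), set())
    return sorted(found)


# Hypothetical edge list: two triangles sharing person C, plus a stray link.
edges = [
    ("A", "B", 1), ("B", "C", 1), ("A", "C", 1),
    ("C", "D", 1), ("D", "E", 1), ("C", "E", 1),
    ("E", "F", 1),
]

cliques = maximal_cliques(edges)
print(cliques)
```

Note that C belongs to both groups. Cliques are allowed to overlap, which is why a data set of a few hundred people can yield dozens of them.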


What we liked about UCINET/NETDRAW is the ease with which we could explore the involvement of particular individuals in the network using the ego feature combined with the filtering and attribute-based node coloring. We also liked the wide range of analysis options, which included not only the standard centrality measures but also various clustering algorithms and analyses of cliques and subgroups. More extensive documentation would have been helpful (although we do appreciate that this was initially developed as a research tool). That said, whatever we did to it, it never crashed, and it managed to catch any errors gracefully.

Saturday, September 11, 2010

The Case of the Missing Spell Checker

A recent project involved creating one or more proof-of-concept SharePoint 2010 Foundation sites for a client. The aim was to demonstrate some of SharePoint’s collaboration features and show how the platform could support various teams within the client’s organization. In setting up the demonstration, we decided to create a small Knowledge Base using the built-in content creation tools.

The new page editing tools are certainly easier to use than in previous versions of SharePoint and adding in pictures is a cinch. The range of styles and fonts is also much improved. We did think the mechanism for linking pages – while very wiki-like – could have been made easier for less tech-savvy users. More importantly, since Foundation users do not get the content management and tagging features of the Standard and Enterprise versions, better tools for organizing the pages – other than simple links – would have been helpful. For example, it would have been nice to have been able to designate one of the pages as the “Home Page” of the Knowledge Base. Another great feature would have been to have an “Index Page” with an automatically created index of pages in the wiki.

SharePoint 2010 Foundation Content Editor: Insert Options


SharePoint 2010 Foundation Content Editor: Text Editing Options


It wasn’t until someone pointed out a glaring spelling error in the copy we’d been writing for the Knowledge Base that we realized that, most strangely, there wasn’t any form of spell checker in the content editor. At first we thought we’d simply mislaid it somewhere in the ribbon but after looking high and low for it and checking several blogs, we realized that it in fact doesn’t exist in Foundation. Microsoft skirt round the issue by declaring that spell checking exists in Standard and Enterprise, thereby carefully not saying that it doesn’t exist in Foundation.

This seems to us very strange and a significant drawback to Foundation (which is almost certain to be the de facto hosted version). After all, blog platforms and software like Blogger - on which ChromaScope is hosted - have incorporated spell checkers for some time now.

Blogger's Editing Options (Spell Check is the last icon on the right)


Intrigued, we decided to do a quick comparison of functionality between the HTML editors in Blogger and SharePoint 2010.

| Feature | Blogger | SharePoint 2010 Foundation |
| --- | --- | --- |
| Cut/Copy/Paste | Yes | Yes |
| Font Styles | Yes (7 available) | Yes (13 available) |
| Font Color | Yes (limited range) | Yes (extensive range) |
| Strike-through/SuperScript/Subscript | Strike-through only | Yes |
| Highlight Text | Yes | Yes |
| Paragraph Formatting (e.g. justification) | Yes | Yes |
| Style Gallery (e.g. Byline) | Quote only | Yes (7 available) |
| MarkUp Style Gallery (e.g. Heading1) | Title and Body only (from blog content editor) | Yes (14 available) |
| Text Layout (e.g. columns) | Yes, but through Page Design rather than content editor | Yes |
| Insert Picture/Image | Yes | Yes |
| Insert Video | Yes | Yes (but not as obvious how to do this) |
| Insert Link | Yes | Yes |
| Insert Jump Break | Yes | No |
| Insert Table | No | Yes |
| Select Elements based on HTML tag | No | Yes |
| CheckIn/CheckOut | No (but the publish function enables users to decide when pages become publicly available) | Yes |
| Tagging | Yes | No |
| Edit HTML Source | Yes | Yes |
| Page Templating | Yes | Yes, but by using SharePoint Designer |
| Language Support | Yes, including non-Latin | Extensive, including non-Latin |
| Spell Checking | Yes | No |

Overall, SharePoint 2010 Foundation has a very rich content editor, but some of the features, together with the rather technical HTML-element orientation, may make it difficult for the general user or, more likely, will mean those features simply languish unused. Blogger, on the other hand, with the exception of an easy way to add a table, has all the features the general user/content creator would need to compose content AND a spell checker! Hopefully Microsoft will take note of the feedback that we, and we are sure everyone else, will give them and make the text editor in SharePoint 2010 Foundation more like an easy-to-use content editor and less like an HTML editor for web designers.