Sunday, June 6, 2010

Searching SharePoint 2010 with FAST

FAST is a high-end search engine that is being provided by Microsoft (at additional cost) as an enterprise level alternative to SharePoint’s built-in search engine. Whereas standard SharePoint 2010 can handle millions of documents, the FAST search engine can index and search over a hundred million i.e. it can scale to handle not only document management for an entire organization but more specialist requirements such as regulatory compliance and litigation document review. It also has extensive support for languages other than English including Chinese, Japanese and Korean.

As well as being an enterprise level search engine, FAST incorporates a number of features designed to make it easier for end users to find things. For example, many users remember documents by their visual appearance. FAST supports visual recognition by displaying a small thumbnail next to the summary of the document so users looking for a specific document can rapidly identify it. In addition FAST also includes graphical previewers for PowerPoint documents which can be used, for example, to find that one particular slide in a presentation without having to open the whole file and go through it slide by slide. Results also include links to ‘Similar Results’ and to ‘Duplicates’.

Example of a FAST Results Display


To support its search capabilities, FAST includes extremely powerful content processing based on linguistics and text analysis. Examples of linguistic processing in the item and query processing include character normalization, normalization of stemming variations and suggested spelling corrections. FAST automatically extracts document metadata such as author, date last modified, and makes them available for fielded searching, faceted search refinement and relevancy tuning. In addition to document metadata, it is also possible to define what Microsoft refer to as “managed properties”. These are categories such as organization names, place names and dates that may exist in the content of the document and can help develop or refine a search. Defining a custom extractor will enable such properties to be identified and indexed. (Note: this is a similar capability to that offered by several ‘Early Case Assessment’ tools in the litigation space).

Example of FAST Refinement Category List for a Results Set


Sharepoint 2010 Standard provides the ability to refine search results based on key metadata/properties such as document type, author, date created. These refinement metadata values are based by default on the first 50 results returned. With FAST, refinement moves to a whole other level, so-called ‘Deep’ refinement, where the refinement categories are based on managed properties within the entire result set. Users are presented with a list of refinement categories together with the counts within each category. (Note: this functionality is similar to the refinement capability that many major eCommerce sites provide e.g. NewEgg.com, BestBuy etc).

SharePoint 2010 with FAST : Architectural Overview


A detailed feature comparison between SharePoint2010 Standard Search and FAST is and further information about FAST is provided in Microsoft’s document “FAST Search Server 2010 for SharePoint Evaluation Guide” downloadable from http://www.microsoft.com/downloads/details.aspx?FamilyID=f1e3fb39-6959-4185-8b28-5315300b6e6b&displaylang=en

No comments:

Post a Comment