Friday, February 17, 2012

Solution Faceted Search

Search is an integral part of every company with an online presence and a corpus of data, both structured and unstructured. The goal is always to help customers find the product or content they want to purchase or view, quickly and efficiently.

What is a facet? As per the Google definition, it's a particular feature or aspect of something. For those of us from a SQL background, think of facets as groupBy functionality. As we all understand, performing a groupBy in application code is an expensive operation. In the world of searchandizing, each facet is modeled as a Dimension with Dimension Values in a search engine (it could be any engine: Endeca, SOLR or something else).

E.g. Dimensions: Color, Size, Price, Men's, Shirts
Dimension Values: Red, Blue, Black, White
Ranges: Where a dimension has a lot of distinct values, e.g. Authors on books or Price, it is a good idea to group the values into ranges.
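
To make the groupBy analogy and the idea of ranges concrete, here is a minimal sketch in plain Python. It is not how Endeca or SOLR compute facets internally; the product records, field names and price ranges are all made up for illustration.

```python
from collections import Counter

# Hypothetical product records; the field names are assumptions for illustration.
products = [
    {"name": "Oxford shirt", "color": "Blue", "price": 35.0},
    {"name": "Polo shirt", "color": "Red", "price": 22.5},
    {"name": "Dress shirt", "color": "Blue", "price": 60.0},
    {"name": "T-shirt", "color": "White", "price": 12.0},
]


def facet_counts(records, dimension):
    """Count records per dimension value, much like
    SELECT dimension, COUNT(*) ... GROUP BY dimension."""
    return Counter(r[dimension] for r in records if dimension in r)


# Hypothetical price ranges: (label, lower bound inclusive, upper bound exclusive).
PRICE_RANGES = [
    ("Under $25", 0, 25),
    ("$25 to $50", 25, 50),
    ("$50 and up", 50, float("inf")),
]


def range_facet_counts(records, field, ranges):
    """Bucket a numeric field into ranges so the facet stays short."""
    counts = Counter()
    for r in records:
        for label, low, high in ranges:
            if low <= r[field] < high:
                counts[label] += 1
                break  # each record falls into exactly one range
    return counts


color_facet = facet_counts(products, "color")
price_facet = range_facet_counts(products, "price", PRICE_RANGES)
```

Here color_facet holds one count per color, while price_facet collapses four distinct prices into three ranges, which is what keeps a high-cardinality dimension usable in the front end.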

Faceted Search process:
Most search engines have an ETL process to merge/join data from various sources and to massage/clean it. If your data is not good, your facets will not be good either: e.g. if a company sells USB drives, GB would be a facet, but if the value is entered inconsistently as "G B", "GB." or "G.B" instead of "GB", it translates to wrong facets in the front end. Hence it is important to have good data.
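
As an illustration of the cleaning step, the sketch below normalizes the GB variants mentioned above to a single spelling before the value becomes a facet. The function name and the regular expression are assumptions for this example; a real ETL pipeline would typically drive this from a rules table.

```python
import re

# Matches a number followed by "GB" written with optional spaces and dots,
# e.g. "16 G B", "16GB.", "16 G.B" (pattern is an assumption for illustration).
_CAPACITY = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s*G\s*\.?\s*B\s*\.?\s*$", re.IGNORECASE
)


def normalize_capacity(raw):
    """Return a canonical '<number> GB' string, or None if the value
    does not look like a gigabyte capacity."""
    match = _CAPACITY.match(raw)
    if match is None:
        return None  # leave unknown values for manual review
    number = float(match.group(1))
    # Drop a trailing ".0" so "16.0" and "16" end up in the same facet.
    text = str(int(number)) if number.is_integer() else str(number)
    return f"{text} GB"


raw_values = ["16 G B", "16GB.", "16 G.B", "16gb"]
cleaned = {normalize_capacity(v) for v in raw_values}
```

With this in place the four raw spellings collapse to a single facet value instead of showing up as four separate ones.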
Once you get past cleaning the data, the next step is to build an attribute dictionary, map it to the data and build indices off it.
From the front end, navigation is driven by a navigation state. The state N=0 returns all data from the search engine, and as you drill down to specific data, the navigation state changes. Most of the time the navigation state is encoded or encrypted for front end customers to make the URL smaller. There are other parameters as well, such as navigation filters.
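
The last two steps can be sketched together: an attribute dictionary that assigns an ID to every dimension value, an index from those IDs to records, and a navigation state that selects IDs. This is a toy model, not the API of Endeca or SOLR; the IDs, the record names and the "+" separator between IDs are assumptions made for the example.

```python
# Hypothetical attribute dictionary: dimension value ID -> (dimension, value).
DIMENSION_VALUES = {
    1: ("Color", "Red"),
    2: ("Color", "Blue"),
    3: ("Size", "M"),
    4: ("Size", "L"),
}

# Hypothetical records, each tagged with the dimension value IDs that apply.
RECORDS = {
    "shirt-a": {1, 3},
    "shirt-b": {2, 3},
    "shirt-c": {2, 4},
}


def build_index(records):
    """Build an inverted index: dimension value ID -> set of record IDs."""
    index = {}
    for record_id, value_ids in records.items():
        for value_id in value_ids:
            index.setdefault(value_id, set()).add(record_id)
    return index


def parse_state(n):
    """Parse a navigation state such as '0' or '2+3' into a set of IDs.
    0 means the root state, i.e. no dimension value selected."""
    ids = {int(part) for part in n.split("+") if part}
    ids.discard(0)
    return ids


def navigate(index, records, n):
    """Return the records matching every dimension value in the state."""
    result = set(records)
    for value_id in parse_state(n):
        result &= index.get(value_id, set())
    return sorted(result)


INDEX = build_index(RECORDS)
everything = navigate(INDEX, RECORDS, "0")     # root state: all records
blue_only = navigate(INDEX, RECORDS, "2")      # drill into Color: Blue
blue_medium = navigate(INDEX, RECORDS, "2+3")  # then into Size: M
```

Each drill-down adds an ID to the state and intersects one more set, so the result can only shrink as the customer refines the selection.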


