Heliosearch/Solr Facet Functions and Analytics

Traditional faceted search (also called guided navigation) involves counting search results that belong to categories (also called facet constraints). The new facet functions in Heliosearch/Solr extend normal faceting by allowing additional aggregations on the document fields themselves. Combined with the new Subfacet feature, this provides powerful real-time analytics capabilities. Also see the page about the new JSON Facet API.

Aggregation Functions

Faceting involves breaking up the domain into multiple buckets and providing information about each bucket.
There are multiple aggregation functions / statistics that can be used:

Aggregation   Example                      Effect
sum           sum(sales)                   summation of numeric values
avg           avg(popularity)              average of numeric values
sumsq         sumsq(rent)                  sum of squares
min           min(salary)                  minimum value
max           max(mul(price,popularity))   maximum value
unique        unique(state)                number of unique values

 
Numeric aggregation functions such as avg can be applied to any numeric field, or to a function of multiple numeric fields, as in max(mul(price,popularity)).
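To make these aggregations concrete, the following Python sketch mimics what each one computes over a tiny in-memory set of documents. It is not Solr code: the documents, their field values, and the aggregate helper are all invented for illustration, and the real implementation works against the index rather than Python lists.

```python
# Hypothetical documents; field names and values are invented for illustration.
docs = [
    {"state": "NY", "price": 10.0, "popularity": 3},
    {"state": "CA", "price": 20.0, "popularity": 1},
    {"state": "NY", "price": 30.0, "popularity": 2},
]

def aggregate(docs):
    """Compute plain-Python equivalents of the aggregations listed above."""
    prices = [d["price"] for d in docs]
    return {
        "sum(price)": sum(prices),
        "avg(price)": sum(prices) / len(prices),
        "sumsq(price)": sum(p * p for p in prices),
        "min(price)": min(prices),
        # An aggregation over a function of several fields.
        "max(mul(price,popularity))": max(d["price"] * d["popularity"] for d in docs),
        # unique counts distinct values, not documents.
        "unique(state)": len({d["state"] for d in docs}),
    }

stats = aggregate(docs)
for name, value in stats.items():
    print(name, "=", value)
```

Note that unique(state) is 2 here even though there are three documents, because two of them share the same state.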

 

Simple Example

The faceting domain starts with the set of documents that match the main query and filters.
We can ask for statistics over this whole set of documents:
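As a minimal sketch, a request of this kind could be assembled in Python as follows. It assumes that the statistic labelled x is avg(price), as in the later examples, and it reuses the endpoint from the curl commands in this document. The build_stats_url helper is hypothetical and not part of any Solr client library.

```python
from urllib.parse import urlencode

# Endpoint taken from the curl examples in this document.
SOLR_QUERY_URL = "http://localhost:8983/solr/query"

def build_stats_url(facet_json, query="*:*"):
    """Return a GET URL carrying the main query and a json.facet parameter."""
    params = {"q": query, "json.facet": facet_json}
    # urlencode percent-escapes braces, quotes and parentheses for us.
    return SOLR_QUERY_URL + "?" + urlencode(params)

# Assumption: x is the average price over the whole domain.
url = build_stats_url('{x:"avg(price)"}')
print(url)
```

The printed URL can be pasted into a browser or passed to any HTTP client; no request is sent by the sketch itself.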


And the response will contain a facets section:

[...]
  "facets":{
    "count":32,
    "x":164.10218846797943
  }
[...]

 
If we want to break up the domain into buckets and then calculate a function per bucket, we simply add a nested facet command to the facet parameters. For example (using curl this time):

$ curl http://localhost:8983/solr/query -d 'q=*:*&
 json.facet={
   categories:{ 
     terms:{  // terms facet creates a bucket for each indexed term in the field
       field : cat,
       facet:{
         x : "avg(price)",
         y : "sum(price)"
       }
     }
   }
 }
'

The response will contain the two stats we asked for in each category bucket.

[...]
  "facets":{
    "count":32,
    "categories":{
      "buckets":[
        { 
          "val":"electronics",
          "count":12,
          "x":231.02666823069254,
          "y":2772.3200187683105
        },
        { 
          "val":"memory",
          "count":3,
          "x":86.66333262125652,
          "y":259.98999786376953
        },
[...]
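The shape of this response can be imitated in a few lines of Python, which may help when reasoning about what a terms facet with nested functions returns. This is only a sketch: the documents are invented, the terms_facet helper is hypothetical, and cat is modelled as a multi-valued field so that one document can land in several buckets.

```python
from collections import defaultdict

# Hypothetical documents; "cat" is treated as multi-valued.
docs = [
    {"cat": ["electronics", "memory"], "price": 100.0},
    {"cat": ["electronics"], "price": 300.0},
    {"cat": ["electronics"], "price": 200.0},
    {"cat": ["memory"], "price": 50.0},
]

def terms_facet(docs, field, value_field):
    """Bucket docs by each term in `field`, then compute avg and sum per bucket."""
    groups = defaultdict(list)
    for doc in docs:
        for term in doc[field]:
            groups[term].append(doc[value_field])
    buckets = [
        {"val": term, "count": len(vals), "x": sum(vals) / len(vals), "y": sum(vals)}
        for term, vals in groups.items()
    ]
    # Order buckets by count, descending.
    buckets.sort(key=lambda b: b["count"], reverse=True)
    return {"count": len(docs), "categories": {"buckets": buckets}}

result = terms_facet(docs, "cat", "price")
for bucket in result["categories"]["buckets"]:
    print(bucket)
```

As in the real response, the top-level count covers the whole domain, while each bucket carries its own count plus the x and y statistics.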

 

Facet Sorting

The default sort for a field or terms facet is by bucket count descending.
We can optionally sort ascending or descending by any facet function that appears in each bucket. For example, if we wanted to find the top buckets by average price, then we would add sort:"x desc" to the previous facet request:

$ curl http://localhost:8983/solr/query -d 'q=*:*&
 json.facet={
   categories:{ 
     terms:{
       field : cat,
       sort : "x desc",   // can also use sort:{x:desc}
       facet:{
         x : "avg(price)",
         y : "sum(price)"
       }
     }
   }
 }
'
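The effect of the sort parameter can be sketched in Python by reordering a list of buckets. The first two buckets reuse values from the response shown earlier; the third bucket is invented so that the ordering visibly changes, and sort_buckets is a hypothetical helper that only understands the "name direction" string form.

```python
def sort_buckets(buckets, sort_spec):
    """Sort buckets by a spec such as "x desc" or "count asc"."""
    key, _, direction = sort_spec.partition(" ")
    return sorted(buckets, key=lambda b: b[key], reverse=(direction == "desc"))

buckets = [
    {"val": "electronics", "count": 12, "x": 231.02666823069254},
    {"val": "memory", "count": 3, "x": 86.66333262125652},
    {"val": "hypothetical", "count": 1, "x": 479.95},  # invented bucket
]

by_count = [b["val"] for b in sort_buckets(buckets, "count desc")]
by_avg_price = [b["val"] for b in sort_buckets(buckets, "x desc")]
print(by_count)
print(by_avg_price)
```

Sorting by x moves the small but expensive bucket to the top, which sorting by count would have placed last.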

 

Try it out

Facet functions and Subfacets are currently available only in the latest release of Heliosearch. Download the latest release and give it a spin!