Heliosearch/Solr JSON Request API

Although query parameters are often an easy way to create Heliosearch/Solr requests by hand, they have a number of drawbacks:

  • Inherently un-structured, requiring unsightly parameters like f.facet_name.facet.range.start=5
  • Inherently un-typed… everything is a string.
  • More difficult to decipher large requests.
  • Harder to programmatically create a request.
  • Impossible to validate. Because of the lack of structure, we don’t know the set of valid parameters and thus can’t do good error checking.

Heliosearch already had a JSON API for faceting and analytics, and Heliosearch 0.09 has extended that to the complete Solr request!

Simple Example

First let’s add a few excellent books from the fantasy genre (the “commitWithin=1000” parameter makes them visible to searches within 1000 milliseconds):

$ curl 'http://localhost:8983/solr/update?commitWithin=1000' -d '
[
 {"id":"book1", "author":"Brandon Sanderson", "title":"The Final Empire",
   "series_s":"Mistborn", "sequence_i":1, "genre_s":"fantasy"},
 {"id":"book2", "author":"Brandon Sanderson", "title":"The Well of Ascension",
   "series_s":"Mistborn", "sequence_i":2, "genre_s":"fantasy"},
 {"id":"book3", "author":"Brandon Sanderson", "title":"The Hero of Ages",
   "series_s":"Mistborn", "sequence_i":3, "genre_s":"fantasy"}
]'

Now we can search them with a JSON request rather than using query parameters:

$ curl http://localhost:8983/solr/query -d '
{
  query:"hero"
}'

RESPONSE:

{
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "json":"\n{\n  query:\"hero\"\n}"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"book3",
        "author":"Brandon Sanderson",
        "author_s":"Brandon Sanderson",
        "title":["The Hero of Ages"],
        "series_s":"Mistborn",
        "sequence_i":3,
        "genre_s":"fantasy",
        "_version_":1486581355536973824
      }]
  }
}

A few things to note from our example:

  • The JSON body is treated as a parameter named “json” and is echoed back with the other params (unless you disable that with echoParams=none). This also causes it to be logged, which is normally important.
  • The JSON we send to Heliosearch/Solr can include unquoted simple strings and can contain comments. See JSON Extensions.
  • We don’t need to pass the Content-Type for indexing or for querying when we’re using JSON, since Heliosearch is smart enough to auto-detect it when curl is the client.
  • Unlike in Solr, HTTP GET requests are now allowed to have a request body (e.g. try using “curl -XGET” for the query).

Here’s a more complete example:

curl -XGET http://localhost:8983/solr/query -d '
{
  query : "*:*",
  filter : [
    "author:brandon",
    "genre_s:fantasy"
  ],
  offset : 0,
  limit : 5,
  fields : ["title","author"],  // we could also use the string form "title,author" 
  sort : "sequence_i desc",

  facet : {  // the JSON Facet API is nicely integrated as well
    avg_price : "avg(price)",
    top_authors : {terms : author} 
  }
}'
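
Because the request is now structured data, it is also straightforward to build from code. The following is a minimal Python sketch, not an official client: the helper name build_query_request is hypothetical, the explicit Content-Type header is an assumption (the text above only says the type is auto-detected for curl), and the body is emitted as strict JSON with quoted keys rather than the relaxed syntax used in the curl examples. The request is constructed but not sent.

```python
import json
import urllib.request

# Same endpoint as the curl examples above.
SOLR_QUERY_URL = "http://localhost:8983/solr/query"


def build_query_request(body, url=SOLR_QUERY_URL):
    """Serialize a request dict to JSON and wrap it in an HTTP request (not sent).

    Hypothetical helper; the Content-Type header is an assumption.
    """
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )


# The same request as the "more complete example", as a plain Python dict.
request_body = {
    "query": "*:*",
    "filter": ["author:brandon", "genre_s:fantasy"],
    "offset": 0,
    "limit": 5,
    "fields": ["title", "author"],
    "sort": "sequence_i desc",
    "facet": {
        "avg_price": "avg(price)",
        "top_authors": {"terms": "author"},
    },
}

request = build_query_request(request_body)

# To actually send it (requires a running server):
#   with urllib.request.urlopen(request) as response:
#       result = json.load(response)
```

Numbers stay numbers and lists stay lists all the way to the wire, which is the typing and structure that flat query parameters lack.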

Passing JSON via Request Parameter

It may sometimes be more convenient to pass the JSON body as a request parameter rather than in the actual body of the HTTP request. Heliosearch treats a json parameter the same as a JSON body.

$ curl http://localhost:8983/solr/query -d 'json={query:"hero"}&fq=author:brandon'
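
When the json parameter is produced by a program rather than typed by hand, it needs URL encoding like any other form field. A small Python sketch of building the same form body as the curl command above (strict JSON is used for the value; the variable name form_body is only illustrative):

```python
import json
from urllib.parse import urlencode

# Form-encode the JSON request alongside an ordinary Solr parameter.
form_body = urlencode({
    "json": json.dumps({"query": "hero"}),
    "fq": "author:brandon",
})

# form_body can now be sent as the body of a POST to /solr/query.
print(form_body)
```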

Smart merging of multiple JSON parameters

Multiple json parameters in a single request are merged before being interpreted.

  • Single-valued elements are overwritten by the last value.
  • Multi-valued elements like fields and filter are appended.
  • Parameters of the form json.<path>=<json_value> are merged in the appropriate place in the hierarchy. For example, a json.facet parameter is the same as “facet” within the JSON body.
  • A JSON body and plain json parameters are always parsed first, meaning that other request parameters come after them and overwrite single-valued elements.

Smart merging gives the best of both worlds… the structure of JSON with the ability to selectively separate out / decompose parts of the request!

Simple Example

curl 'http://localhost:8983/solr/query?json.limit=5&json.filter="genre_s:fantasy"' -d '
{
  query : "hero",
  limit : 10,
  filter : "author:brandon"
}'

is equivalent to

curl http://localhost:8983/solr/query -d '
{
  query : "hero",
  limit : 5,   // this parameter was overwritten
  filter : [ "author:brandon" , "genre_s:fantasy" ]  // this parameter was appended to
}'
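
The merge rules can be sketched in a few lines of Python. This is only an illustration of the behavior described in this post, not the server's implementation: the function names are hypothetical, and the set of multi-valued keys is an assumption limited to the two the text names (fields and filter).

```python
# Assumption: only the keys named in the text are treated as multi-valued.
MULTI_VALUED = {"fields", "filter"}


def as_list(value):
    """Wrap a single value in a list; copy an existing list."""
    return list(value) if isinstance(value, list) else [value]


def merge(base, extra):
    """Merge extra into base: append multi-valued keys, recurse into
    nested objects, and overwrite everything else with the later value."""
    merged = dict(base)
    for key, value in extra.items():
        if key in MULTI_VALUED and key in merged:
            merged[key] = as_list(merged[key]) + as_list(value)
        elif isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def expand_path(path, value):
    """Turn ("facet.avg_price", v) into {"facet": {"avg_price": v}}."""
    for part in reversed(path.split(".")):
        value = {part: value}
    return value


def apply_json_params(body, params):
    """Apply (path, value) pairs, i.e. json.<path> parameters with the
    "json." prefix already removed and the value already parsed."""
    for path, value in params:
        body = merge(body, expand_path(path, value))
    return body


body = {"query": "hero", "limit": 10, "filter": "author:brandon"}
params = [("limit", 5), ("filter", "genre_s:fantasy")]
merged = apply_json_params(body, params)
print(merged)
```

Here merged ends up with limit set to 5 and filter holding both filter strings, matching the equivalent request shown above.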

Facet Example

In fact, you don’t even need to start with a JSON body for smart merging to be very useful. Consider the following request composed entirely of request params:

curl http://localhost:8983/solr/query -d 'q=*:*&rows=1&
  json.facet.avg_price="avg(price)"&
  json.facet.top_authors.terms={field:"author_s",limit:5}'

That is equivalent to having the following JSON body or json parameter:

{
  facet: {
    avg_price: "avg(price)",
    top_authors: {
      terms: { 
        field: "author_s",
        limit:5
      }
    }
  }
}

Debugging

Want to see what your merged JSON looks like? Just ask for debugging information (i.e. use the debug=true param), and it will come back under the "json" key along with the other debugging information.

Passing Parameters via JSON

We can also pass normal request parameters in the JSON body within the params block:

$ curl "http://localhost:8983/solr/query?fl=title,author" -d '
{
  params:{
    q:"title:hero",
    rows:1
  }
}
'

Which is equivalent to:

$ curl "http://localhost:8983/solr/query?fl=title,author&q=title:hero&rows=1"
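
To make the equivalence concrete, here is a small Python sketch that flattens a params block into the query string shown above. The helper name is hypothetical, and the example assumes the URL parameters and the params block do not share keys; the text does not say which side wins on overlap, so the choice made here is arbitrary.

```python
from urllib.parse import urlencode


def params_to_query_string(url_params, json_body):
    """Combine URL parameters with the params block of a JSON request.

    On overlapping keys this sketch lets the params block win, which is
    an arbitrary choice and not behavior confirmed by the text.
    """
    combined = dict(url_params)
    combined.update(json_body.get("params", {}))
    # Keep "," and ":" readable, as in the curl example.
    return urlencode(combined, safe=",:")


query_string = params_to_query_string(
    {"fl": "title,author"},
    {"params": {"q": "title:hero", "rows": 1}},
)
print(query_string)
```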

Error Detection

Because we didn’t pollute the root body of the JSON request with the normal Solr request parameters (they are all contained in the params block), we now have the ability to validate requests and return an error for unknown JSON keys.

$ curl http://localhost:8983/solr/query -d '
{
  query : "hero",
  fulter : "author:brandon"  // oops, we misspelled "filter" 
}'

And we get an error back containing the error string:

"Unknown top-level key in JSON request : fulter"

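The same kind of check is easy to mirror on the client side before a request is sent. The Python sketch below is illustrative only: the set of known keys is an assumption assembled from the keys that appear in this post, and the server's real list may differ.

```python
# Assumption: top-level keys collected from the examples in this post.
KNOWN_TOP_LEVEL_KEYS = {
    "query", "filter", "offset", "limit", "fields", "sort", "facet", "params",
}


def find_unknown_keys(body):
    """Return one error string per unrecognized top-level key."""
    return [
        f"Unknown top-level key in JSON request : {key}"
        for key in body
        if key not in KNOWN_TOP_LEVEL_KEYS
    ]


errors = find_unknown_keys({"query": "hero", "fulter": "author:brandon"})
print(errors)
```

This only works because ordinary Solr parameters live in the params block; if they shared the root object, the set of valid keys would be open-ended.
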
Parameter Substitution / Macro Expansion

Of course, request templating via Parameter Substitution works fully with JSON request bodies or parameters as well.

Example:

$ curl "http://localhost:8983/solr/query?FIELD=text&TERM=hero&HOWMANY=10" -d '
{
  query:"${FIELD}:${TERM}",
  limit:${HOWMANY}
}'
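
As a rough client-side model of what the expansion produces, Python's string.Template happens to use the same ${NAME} placeholder syntax. The sketch below only imitates the simple case shown above; it is not the server's macro expander, the function name is hypothetical, and any further macro features are not modeled.

```python
from string import Template
from urllib.parse import parse_qsl, urlsplit


def expand_macros(url, body_text):
    """Replace ${NAME} placeholders in body_text with the values of the
    same-named query parameters in url."""
    params = dict(parse_qsl(urlsplit(url).query))
    return Template(body_text).substitute(params)


url = "http://localhost:8983/solr/query?FIELD=text&TERM=hero&HOWMANY=10"
body_text = '{\n  query:"${FIELD}:${TERM}",\n  limit:${HOWMANY}\n}'

expanded = expand_macros(url, body_text)
print(expanded)
```

After expansion the body reads query:"text:hero" and limit:10, which is then interpreted like any other JSON request.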

Current Status

The JSON Request API is currently in its infancy: only a few query parameters are supported (although the JSON Facet API, which is part of this, is more mature). Other Heliosearch/Solr features you may want access to (like highlighting) currently need to be controlled through the normal Solr request params (e.g. just add hl=true to the normal request parameters, or to the params block of a JSON request).

Have ideas on what will make the API better? Want to help out with development? We’d love to hear from you on the !