SolrCloud – Assigning Nodes To Machines

This post will outline a method for assigning specific shard replicas to specific nodes using the Solr Collections API and, as a bonus, the Solr Core Admin API. I’m assuming basic familiarity with Solr and SolrCloud.

How it works out of the box

SolrCloud is great. Fire off one simple command and presto! You have a live cluster and “it all just works”. Which is true: nodes get created, shards get assigned to nodes, and it all does, indeed, “just work”. Documents get routed to the correct shard at index time. Distributed queries are automatic. You can create replicas of all your shards. HA/DR is built in; when nodes go down and come back up, SolrCloud just figures it out and you get to sit back and watch the magic happen.

All this is great for getting running quickly and is fine for many production situations.

Reality intervenes though

Inevitably, though, the Operations group isn’t about to replace all of the hardware with identical machines just to make your life easier; you’ll have a mix of machines with varying capabilities. Furthermore, you may be hosting multiple collections with different usage patterns and/or index sizes.

Some real-world situations will require you to take control of the physical node assignment in your cluster; e.g. “I want node5 to have replicas for shard1 and shard3 of collectionA, and I want to dedicate nodes 10-20 exclusively to collectionZ.” It’s actually quite easy to do this.

First, I’ve set up a 4-node cluster, all running on my local machine in different JVMs (this is straight from the SolrCloud page):

java -DzkRun -Dbootstrap_confdir=../../conf -Dcollection.configName=myconf -jar start.jar
java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
java -Djetty.port=8900 -DzkHost=localhost:9983 -jar start.jar
java -Djetty.port=7500 -DzkHost=localhost:9983 -jar start.jar

NOTE: I’ve used the bootstrap_confdir option to push the Solr configuration from an arbitrary local directory to ZooKeeper and label it “myconf”. This will not create any collections! Since ZooKeeper now holds the Solr configuration files, they’ll be available to the Collections API when I do want to create a collection (see below). This can also be done with the ZkCli scripts (see the SolrCloud page).
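For reference, the ZkCli route looks something like the sketch below; I’m assuming the script that ships under example/cloud-scripts and the embedded ZooKeeper running on port 9983, so adjust paths and ports to taste:

./zkcli.sh -zkhost localhost:9983 -cmd upconfig -confdir ../../conf -confname myconf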

I also removed the /example/solr/collection1 directory completely before I started the first node with the embedded ZooKeeper (after copying the conf directory up a couple of levels). As you can see below, I have no collections defined and no cores in the drop-down; it’s a clean slate.

[Image: no_collections]

Hmmm, looks a little thin. Notice there are no nodes assigned to any collection.

Add some leaders

Now let’s create a 3-shard collection on our cluster using the Collections API:

http://localhost:8983/solr/admin/collections?
action=CREATE&name=mycollection&
maxShardsPerNode=4&collection.configName=myconf&numShards=3&
createNodeSet=127.0.0.1:8983_solr,127.0.0.1:7500_solr,127.0.0.1:8900_solr
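If you’d rather work from a shell than a browser, the same call goes through cURL unchanged; just quote the URL so the shell doesn’t eat the ampersands:

curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&maxShardsPerNode=4&collection.configName=myconf&numShards=3&createNodeSet=127.0.0.1:8983_solr,127.0.0.1:7500_solr,127.0.0.1:8900_solr"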

[Image: just_leaders]

What’s happened so far:

  • I’ve created a 3-shard collection (leaving one node unused) and assigned the leaders to the nodes I want using createNodeSet.
  • I cheated a little and added maxShardsPerNode so I’d have room to play.
  • We’re using “compositeId” routing, so the number of shards is fixed at collection creation. We can add new replicas, and we can split shards (see the SPLITSHARD sketch below), but we cannot add arbitrary shards. There’ll be another blog on the topic of “implicit” routing and core creation soon.
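For completeness, splitting a shard later is its own Collections API call. I’m not running it in this walkthrough, but a minimal sketch against the collection we just created (assuming a reasonably recent 4.x release, which is where SPLITSHARD lives) would be:

http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=mycollection&shard=shard1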

Expanding our capacity by adding nodes where we want

Now, say, you get more hardware, or decide you want more replicas, so you need to assign replicas for specific shards to specific nodes. It’s actually easy: just send the Collections API an ADDREPLICA command and specify the collection, the shard, and the node the replica should live on. The command looks like this:

http://localhost:8900/solr/admin/collections?action=ADDREPLICA&shard=shard2&collection=mycollection&node=127.0.0.1:8983_solr

I executed that command twice!
Here’s what this looks like:

[Image: mult_shard2]

Shard 2 has picked up two new replicas on the node at port 8983. It’s a little confusing, so pay attention to the port numbers, which in this demo are a stand-in for the physical nodes.

  • The 127.0.0.1:8983_solr form is a bit of magic that identifies the node to SolrCloud.
  • I executed this command twice, so two new replicas have been assigned to the same node. This can happen because I specified maxShardsPerNode when I created the collection.
  • It doesn’t matter which node you send the command to; node= defines where the replica gets created. In this case I sent the command to 8900 and created the replicas on 8983 (you can double-check the placement with the cluster-state peek below).
  • collection is critical: it tells the creation process that this core is part of the specified collection. The config files (schema.xml, solrconfig.xml, etc.) are pulled from ZooKeeper for this replica. Nothing has been pre-loaded on the node for this core; in fact, you won’t find a “conf” directory for this Solr core!
  • shard is also critical: it tells the admin handler which shard this core “belongs” to. Incidentally, all the replicas that make up a shard are called a “slice”.
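Speaking of which, if you want to confirm the placement without squinting at the graph, you can read the cluster state straight out of ZooKeeper. One way is the admin UI’s zookeeper endpoint (the same one the Cloud tab uses); this is a sketch and the exact parameters may vary by version, but the returned JSON lists each shard and the node every replica lives on:

curl "http://localhost:8983/solr/zookeeper?detail=true&path=/clusterstate.json"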

Bonus section (using the Solr admin/cores API)

The above works perfectly well, but I wanted to point out another mechanism for adding shard replicas to a collection, using the admin/cores API.

  • http://localhost:7500/solr/admin/cores?action=CREATE&name=shard1_r9&collection=mycollection&property.shard=shard1
  • http://localhost:7574/solr/admin/cores?action=CREATE&name=shard3_r2&collection=mycollection&property.shard=shard2
  • http://localhost:7574/solr/admin/cores?action=CREATE&name=shard2_r3&collection=mycollection&property.shard=shard3

[Image: lastshot]

  • Since this is the core API, each core is created on the node you send the command to: 7500 and 7574 respectively. This is very different from the Collections API, where you can specify an arbitrary node for the replica.
  • There was no need to pre-create a directory containing the Solr configuration files. Specifying that this core is part of a collection causes the configuration to be read from ZooKeeper, unlike using the Core Admin API in non-SolrCloud mode.
  • collection is critical. It tells Solr which collection this replica belongs to.
  • property.shard is a little obscure. It associates the core with a particular shard in the collection.
  • name is arbitrary. Be careful though, and use some rational naming scheme! If name is left off, the default is something like mycollection_shard3_replica1.
  • I put replicas for two shards on a node that isn’t visible (7574), but you have to “just know” it exists (one quick way to check is shown right after this list).
    • We’re looking into showing even nodes that ZK is unaware of; anyone with GUI skills is more than welcome to help out. Just ping us on the Solr User’s list; you can sign up via the Solr User’s List Instructions.
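One quick way to convince yourself that the 7574 node really is carrying those cores is to ask it directly with the Core Admin STATUS action (shown here with cURL; the browser works just as well):

curl "http://localhost:7574/solr/admin/cores?action=STATUS&wt=json"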

Conclusion

We’ve walked through the process of creating collections and assigning specific replicas of specific shards to specific nodes. The mechanics are easy once you know the magic; this post gathers that magic into a single place for easy reference. There are a number of parameters we haven’t explored; see the Solr Reference Guide.

I used the browser to execute all these commands, but one could use cURL, SolrJ (see the CollectionAdminRequest class) or some kind of scripting engine.

Notice the interaction between collections and cores. The Collections API is intended for cluster-wide commands, while the Core Admin API is specific to a particular node: the node you send the command to. Certain parameters to the Core Admin calls allow the resulting core to be attached to a collection in the cluster. While I’ve shown a way to add replicas to shards with the Core Admin API, I’d recommend using the Collections API whenever possible, as it’s intended to work with SolrCloud.

In a future blog we may repeat this exercise showing the different options for node creation for the two main kinds of collections, implicit and compositeId routing.

Wouldn’t it be nice to be able to do this via a nice GUI? As it happens, one is under development, probably in the Solr 4.8 time frame. Volunteers welcome! A nice GUI aside, for large, complex setups some kind of automation is probably necessary; this post should give you a place to start.