Introducing SolrCloud
Posted by Kelvin on 02 Jul 2018 at 02:17 pm | Tagged as: Lucene / Solr / Elasticsearch / Nutch
In this post, we will examine SolrCloud and how it differs from standalone Solr.
Introducing SolrCloud
In a standalone Solr installation, the data resides on a single machine and all requests are served from this machine.
SolrCloud is the operation mode in Solr where the data resides on multiple machines (known as a cluster), and requests are served from this cluster of machines.
SolrCloud terminology
Where a Solr 'database' in standalone Solr is known as a Solr core, a Solr 'database' in SolrCloud is known as a Solr collection.
Further, a Solr collection is split into a number of partitions known as shards. A shard is a logical partition of a collection, comprising a subset of all the documents in the collection. It is, however, a conceptual partitioning, in the sense that a shard actually consists of one or more replicas. A replica is a full 'instance' or 'copy' of a shard. Replicas provide fault tolerance to the Solr collection: when a machine goes down, replicas of that machine's shards can still be available on other machines.
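To make the terminology concrete, here is a minimal Python sketch of the collection / shard / replica relationship. It is only a toy model for illustration, not how Solr is implemented: the class names, the `products` collection and the `node-a`, `node-b`, `node-c` machine names are all made up. It assumes a shard stays usable as long as at least one of its replicas sits on a machine that is still up.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Replica:
    # Hypothetical name of the machine hosting this copy of the shard.
    node: str


@dataclass
class Shard:
    name: str
    replicas: List[Replica] = field(default_factory=list)

    def is_available(self, down_nodes: Set[str]) -> bool:
        # Assumption: a shard is usable if any replica lives on a live machine.
        return any(r.node not in down_nodes for r in self.replicas)


@dataclass
class Collection:
    name: str
    shards: List[Shard] = field(default_factory=list)

    def is_fully_available(self, down_nodes: Set[str]) -> bool:
        # Every shard holds a different subset of the documents,
        # so all shards must be reachable to see the whole collection.
        return all(s.is_available(down_nodes) for s in self.shards)


# Illustrative layout: 2 shards, 2 replicas each, spread over 3 machines.
products = Collection("products", [
    Shard("shard1", [Replica("node-a"), Replica("node-b")]),
    Shard("shard2", [Replica("node-b"), Replica("node-c")]),
])
```

With this layout, `products.is_fully_available({"node-b"})` is true, because each shard still has a replica on another machine. Taking down both `node-b` and `node-c` leaves `shard2` with no live replica, so the check returns false.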