Migrating Rule-Based Replica Rules to Autoscaling Policies

Creating rules for replica placement in a Solr cluster is now done with the autoscaling framework.

This document outlines how to migrate from the legacy rule-based replica placement to an autoscaling policy.

The autoscaling framework is designed to fully automate your cluster management. However, if you do not want actions taken on your cluster automatically, you can still use the framework to set rules and preferences. With a set of rules and preferences in place, instead of taking action directly, the system will suggest actions you can take manually.

The section Cluster Preferences Specification describes the capabilities of an autoscaling policy in detail. Below we’ll walk through a few examples to show how you would express your legacy rules in the autoscaling syntax. Every rule in the legacy rule-based replica framework can be expressed in the new syntax.

How Rules are Defined

One key difference between the frameworks is the way rules are defined.

With the rule-based replica placement framework, rules are defined with the Collections API at the time of collection creation.

The autoscaling framework, however, has its own API. Policies can be configured for the entire cluster or for individual collections depending on your needs.

The following is the legacy syntax for a rule that limits the cluster to one replica for each shard in any Solr node:

replica:<2,node:*,shard:**

The equivalent rule in the autoscaling policy is:

{"replica":"<2", "node":"#ANY", "shard":"#EACH"}
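As a rough sketch of how this rule might be submitted, the snippet below wraps it in a set-cluster-policy command and prepares a POST request without sending it. The base URL http://localhost:8983 and the endpoint path /api/cluster/autoscaling are assumptions about your deployment, and the helper names are hypothetical; check the autoscaling API documentation for your Solr version before relying on them.

```python
import json
import urllib.request

# Assumed base URL and endpoint path; adjust both for your deployment.
SOLR_BASE = "http://localhost:8983"
AUTOSCALING_PATH = "/api/cluster/autoscaling"


def cluster_policy_command(rules):
    """Wrap a list of policy rules in a set-cluster-policy command body."""
    return {"set-cluster-policy": list(rules)}


def build_request(rules):
    """Build (but do not send) the POST request that carries the policy."""
    body = json.dumps(cluster_policy_command(rules)).encode("utf-8")
    return urllib.request.Request(
        SOLR_BASE + AUTOSCALING_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


rule = {"replica": "<2", "node": "#ANY", "shard": "#EACH"}
request = build_request([rule])
# To apply the policy against a running cluster:
# urllib.request.urlopen(request)
```

Keeping the request construction separate from the network call makes it easy to inspect the JSON body before anything is changed on the cluster.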

Differences in Rule Syntaxes

Many elements of defining rules are similar in both frameworks, but some elements are different.

Rule Operators

All of the following operators can be directly used in the new policy syntax and they mean the same in both frameworks.

  • equals (no operator required): tag:x means the value for a tag must be equal to 'x'.

  • greater than (>): tag:>x means the tag value must be greater than 'x'. In this case, 'x' must be a number.

  • less than (<): tag:<x means the tag value must be less than 'x'. In this case also, 'x' must be a number.

  • not equal (!): tag:!x means tag value MUST NOT be equal to 'x'. The equals check is performed on a String value.
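To make the operator semantics concrete, here is a minimal sketch of how a rule value could be checked against a tag value. The functions split_operator and matches are hypothetical helpers written for illustration; they are not part of Solr and do not reflect how Solr evaluates rules internally.

```python
def split_operator(value):
    """Split a rule value into (operator, operand).

    Returns "=" as the operator when the value has no operator prefix.
    """
    if value[:1] in ("<", ">", "!"):
        return value[0], value[1:]
    return "=", value


def matches(rule_value, actual):
    """Return True if the actual tag value satisfies the rule value."""
    op, operand = split_operator(rule_value)
    if op == "<":
        return float(actual) < float(operand)  # numeric comparison
    if op == ">":
        return float(actual) > float(operand)  # numeric comparison
    if op == "!":
        return str(actual) != operand          # string comparison
    return str(actual) == operand              # plain equality
```

For example, matches("<2", 1) holds while matches("!overseer", "overseer") does not.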

Fuzzy Operator (~)

The autoscaling policy syntax has no ~ operator. Instead, each rule accepts a strict parameter, which can be true or false. To replace the ~ operator, add the attribute "strict":false to the rule.

For example:

Rule-based replica placement framework:
replica:<2~,node:*,shard:**
Autoscaling framework:
{"replica":"<2", "node":"#ANY", "shard":"#EACH", "strict": false}
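The conversion above can be sketched as a small function that strips a trailing ~ from any value and records it as "strict": false. The function name with_strictness is hypothetical, and the sketch assumes the other values already use the autoscaling wildcards.

```python
def with_strictness(rule):
    """Move a trailing ~ on any value into a rule-level "strict": False."""
    converted = {}
    fuzzy = False
    for tag, value in rule.items():
        if value.endswith("~"):
            value = value[:-1]  # drop the legacy fuzzy marker
            fuzzy = True
        converted[tag] = value
    if fuzzy:
        converted["strict"] = False
    return converted


legacy = {"replica": "<2~", "node": "#ANY", "shard": "#EACH"}
policy_rule = with_strictness(legacy)
```

Rules without a ~ are returned unchanged, so no strict attribute is added to them.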

Attributes

Attributes were known as "tags" in the rule-based replica placement framework. In the autoscaling framework, they are attributes that are used for node selection or to set global cluster-wide rules.

The available attributes in the autoscaling framework are similar to the tags that were available with rule-based replica placement. Attributes with the same name mean the same in both frameworks.

  • cores: Number of cores in the node

  • freedisk: Disk space available in the node

  • host: host name of the node

  • port: port of the node

  • node: node name

  • role: The role of the node. The only supported role is 'overseer'

  • ip_1, ip_2, ip_3, ip_4: These are IP fragments for each node. For example, in a host with IP 192.168.1.2, ip_1 = 2, ip_2 = 1, ip_3 = 168, and ip_4 = 192

  • sysprop.{PROPERTY_NAME}: These are values available from system properties. sysprop.key means a value that is passed to the node as -Dkey=keyValue during the node startup. It is possible to use rules like sysprop.key:expectedVal,shard:*
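The IP fragments are numbered from the last octet to the first, which is easy to get backwards. The following sketch derives them from a dotted IPv4 address; ip_fragments is a hypothetical helper for illustration, not a Solr function.

```python
def ip_fragments(ip):
    """Return the ip_1..ip_4 attribute values for a dotted IPv4 address."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted IPv4 address, got %r" % ip)
    # ip_1 is the last octet and ip_4 is the first.
    return {
        "ip_%d" % position: int(octet)
        for position, octet in enumerate(reversed(octets), start=1)
    }


fragments = ip_fragments("192.168.1.2")
```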

Snitches

There is no equivalent for a snitch in the autoscaling policy framework.

Porting Existing Replica Placement Rules

There is no automatic way to move from using rule-based replica placement rules to an autoscaling policy. Instead you will need to remove your replica rules from each collection and institute a policy using the autoscaling API.
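One possible shape for this manual migration is sketched below. It builds a named policy with the set-policy command and the Collections API query strings for a collection. The collection name products and the policy name one_replica_per_node are made up for illustration. Clearing the legacy rule and snitch attributes by passing empty values to MODIFYCOLLECTION is an assumption you should verify for your Solr version before using it.

```python
import json
from urllib.parse import urlencode


def named_policy_command(name, rules):
    """Build the body of a set-policy command defining one named policy."""
    return {"set-policy": {name: list(rules)}}


def modify_collection_query(collection, **attributes):
    """Build a Collections API MODIFYCOLLECTION path and query string."""
    params = {"action": "MODIFYCOLLECTION", "collection": collection}
    params.update(attributes)
    return "/solr/admin/collections?" + urlencode(params)


# Step 1 (assumption): clear the legacy attributes with empty values.
clear_rules = modify_collection_query("products", rule="", snitch="")

# Step 2: define a named policy through the autoscaling API.
policy_body = json.dumps(
    named_policy_command("one_replica_per_node", [{"replica": "<2", "node": "#ANY"}])
)

# Step 3: point the collection at the new policy.
attach_policy = modify_collection_query("products", policy="one_replica_per_node")
```

The sketch only builds strings, so nothing is sent to the cluster until you issue the requests yourself.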

The following examples are intended to help you translate your existing rules into new rules that fit the autoscaling framework.

Keep less than 2 replicas (at most 1 replica) of this collection on any node

For this rule, we define the replica condition with operators for "less than 2", and use a pre-defined tag named node to define nodes with any name.

Rule-based replica placement framework:
replica:<2,node:*
Autoscaling framework:
{"replica":"<2","node":"#ANY"}

For a given shard, keep less than 2 replicas on any node

For this rule, we use the shard condition to define any shard, the replica condition with operators for "less than 2", and finally a pre-defined tag named node to define nodes with any name.

Rule-based replica placement framework:
shard:*,replica:<2,node:*
Autoscaling framework:
{"replica":"<2","shard":"#EACH", "node":"#ANY"}

Assign all replicas in shard1 to rack 730

This rule limits the shard condition to 'shard1' but places no limit on the number of replicas. We’re also referencing a custom tag named rack.

Rule-based replica placement framework:
shard:shard1,replica:*,rack:730
Autoscaling framework:
{"replica":"#ALL", "shard":"shard1", "sysprop.rack":"730"}

In the rule-based replica placement framework, we needed to configure a custom Snitch which provides values for the tag rack.

With the autoscaling framework, however, we need to start all nodes with a system property to define the rack values. For example, bin/solr start -c -Drack=<rack-number>.
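The translations shown so far follow a simple pattern, sketched below. The wildcard mapping mirrors the examples in this document (* or ** becomes "#ANY" for node, "#EACH" for shard, and "#ALL" for replica), and any tag that is not predefined is assumed to be supplied as a system property. The function translate is a hypothetical helper, not a Solr tool.

```python
# Attributes this document lists as predefined, plus the replica and
# shard conditions.
PREDEFINED = {
    "replica", "shard", "cores", "freedisk", "host", "port", "node", "role",
    "ip_1", "ip_2", "ip_3", "ip_4",
}

# Wildcard replacements, taken from the examples in this document.
WILDCARDS = {"node": "#ANY", "shard": "#EACH", "replica": "#ALL"}


def translate(legacy_rule):
    """Translate a legacy 'tag:value,...' rule into an autoscaling rule."""
    policy_rule = {}
    for condition in legacy_rule.split(","):
        tag, _, value = condition.partition(":")
        if value in ("*", "**") and tag in WILDCARDS:
            value = WILDCARDS[tag]
        if tag not in PREDEFINED and not tag.startswith("sysprop."):
            tag = "sysprop." + tag  # custom tags become system properties
        policy_rule[tag] = value
    return policy_rule


rack_rule = translate("shard:shard1,replica:*,rack:730")
```

This sketch does not handle the ~ or ! operators, and it does not add the "node":"#ANY" condition that the cores example in the next section uses.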

Create replicas in nodes with less than 5 cores only

This rule uses the replica condition to define any number of replicas, but adds a pre-defined tag named cores and uses operators for "less than 5".

Rule-based replica placement framework:
cores:<5
Autoscaling framework:
{"cores":"<5", "node":"#ANY"}

Do not create any replicas in host 192.45.67.3

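The legacy rule negates the pre-defined host tag with the not-equal operator. The autoscaling rule expresses the same constraint by requiring zero replicas on that host.

Rule-based replica placement framework:
host:!192.45.67.3
Autoscaling framework:
{"replica": 0, "host":"192.45.67.3"}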
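This pattern can be sketched as a function that turns a legacy tag:!value condition into an autoscaling rule with "replica": 0. The name translate_negation is hypothetical, and the sketch handles only a single negated condition like the one above.

```python
def translate_negation(legacy_condition):
    """Turn 'tag:!value' into an autoscaling rule that excludes that value."""
    tag, _, value = legacy_condition.partition(":")
    if not value.startswith("!"):
        raise ValueError("expected a negated condition such as host:!name")
    # Zero replicas are allowed wherever the tag equals the excluded value.
    return {"replica": 0, tag: value[1:]}


host_rule = translate_negation("host:!192.45.67.3")
```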