Autoscaling API

The Autoscaling API is used to manage autoscaling policies, preferences, triggers, and listeners, and to get diagnostics on the state of the cluster.

Read API

The autoscaling Read API is available at /solr/admin/autoscaling or /api/cluster/autoscaling (v2 API style). It returns information about the configured cluster preferences, cluster policy, collection-specific policies, triggers, and listeners.

This API does not take any parameters.

Read API Response

The output will contain cluster preferences, cluster policy, and collection-specific policies.

Examples Using Read API

Output

{
    "responseHeader": {
        "status": 0,
        "QTime": 2
    },
    "cluster-policy": [
        {
            "replica": "<2",
            "shard": "#EACH",
            "node": "#ANY"
        }
    ],
    "WARNING": "This response format is experimental.  It is likely to change in the future."
}
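
As a minimal sketch, the Read API can be called from Python with only the standard library. The base address http://localhost:8983 and the helper names are assumptions made for illustration; the two paths are the ones listed above.

```python
import json
import urllib.request

# Assumed address of a local Solr node; adjust for your cluster.
SOLR_BASE = "http://localhost:8983"

def read_api_url(base=SOLR_BASE, v2=False):
    """Return the Read API URL in v1 style or v2 style."""
    path = "/api/cluster/autoscaling" if v2 else "/solr/admin/autoscaling"
    return base.rstrip("/") + path

def read_autoscaling_config(base=SOLR_BASE, v2=False):
    """GET the autoscaling configuration and decode the JSON body.

    The API takes no parameters, so the URL is used as-is.
    """
    with urllib.request.urlopen(read_api_url(base, v2)) as resp:
        return json.load(resp)
```

Calling read_autoscaling_config() against a running node should return a dictionary shaped like the output above, from which keys such as "cluster-policy" can be read.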

Diagnostics API

The diagnostics API shows the violations, if any, of all conditions in the cluster and, if applicable, the collection-specific policy. It is available at the /admin/autoscaling/diagnostics path.

This API does not take any parameters.

Diagnostics API Response

The output will contain sortedNodes, which is a list of nodes in the cluster sorted according to overall load in descending order (as determined by the preferences), and violations, which is a list of nodes along with the conditions that they violate.

Examples Using Diagnostics API

Here is an example with no violations. In the sortedNodes section, we can see that the first node is the most loaded (according to the number of cores):

{
    "responseHeader": {
        "status": 0,
        "QTime": 65
    },
    "diagnostics": {
        "sortedNodes": [
            {
                "node": "127.0.0.1:8983_solr",
                "cores": 3
            },
            {
                "node": "127.0.0.1:7574_solr",
                "cores": 2
            }
        ],
        "violations": []
    },
    "WARNING": "This response format is experimental.  It is likely to change in the future."
}

Suppose we added a condition to the cluster policy as follows:

{"replica": "<2", "shard": "#EACH", "node": "#ANY"}

Since the first node in the first example already had more than one replica for a shard, the diagnostics API will now return:

{
    "responseHeader": {
        "status": 0,
        "QTime": 45
    },
    "diagnostics": {
        "sortedNodes": [
            {
                "node": "127.0.0.1:8983_solr",
                "cores": 3
            },
            {
                "node": "127.0.0.1:7574_solr",
                "cores": 2
            }
        ],
        "violations": [
            {
                "collection": "gettingstarted",
                "shard": "shard1",
                "node": "127.0.0.1:8983_solr",
                "tagKey": "127.0.0.1:8983_solr",
                "violation": {
                    "replica": "2",
                    "delta": 0
                },
                "clause": {
                    "replica": "<2",
                    "shard": "#EACH",
                    "node": "#ANY",
                    "collection": "gettingstarted"
                }
            }
        ]
    },
    "WARNING": "This response format is experimental.  It is likely to change in the future."
}

In the above example, the node with port 8983 has two replicas for shard1, in violation of our policy.
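
A diagnostics response like the one above can be summarized per node with a few lines of Python. This is a sketch only: the function name is hypothetical, and the sample is trimmed from the response shown above. Violations that carry no "node" key are grouped under None.

```python
from collections import defaultdict

def violations_by_node(response):
    """Map node name -> list of (collection, shard, violated clause)."""
    grouped = defaultdict(list)
    for v in response.get("diagnostics", {}).get("violations", []):
        grouped[v.get("node")].append(
            (v.get("collection"), v.get("shard"), v.get("clause"))
        )
    return dict(grouped)

# Sample trimmed from the diagnostics response shown above.
sample = {
    "diagnostics": {
        "violations": [
            {
                "collection": "gettingstarted",
                "shard": "shard1",
                "node": "127.0.0.1:8983_solr",
                "clause": {"replica": "<2", "shard": "#EACH", "node": "#ANY"},
            }
        ]
    }
}
summary = violations_by_node(sample)
```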

Inline Policy Configuration

If there is no autoscaling policy configured or if you wish to use a configuration other than the default, it is possible to send the autoscaling policy JSON as an inline payload as follows:

curl -X POST -H 'Content-type:application/json' -d '{
 "cluster-policy": [
   {"replica": 0, "put" : "on-each-node", "nodeset": {"port" : "7574"}}]
 }' 'http://localhost:8983/api/cluster/autoscaling/diagnostics?omitHeader=true'

Output

{
  "diagnostics":{
    "sortedNodes":[{
        "node":"10.0.0.80:7574_solr",
        "isLive":true,
        "cores":2.0,
        "freedisk":567.4989128112793,
        "port":7574,
        "totaldisk":1044.122688293457,
        "replicas":{"mycoll":{
            "shard2":[{
                "core_node7":{
                  "core":"mycoll_shard2_replica_n4",
                  "shard":"shard2",
                  "collection":"mycoll",
                  "node_name":"10.0.0.80:7574_solr",
                  "type":"NRT",
                  "base_url":"http://10.0.0.80:7574/solr",
                  "state":"active",
                  "force_set_state":"false",
                  "INDEX.sizeInGB":6.426125764846802E-8}}],
            "shard1":[{
                "core_node3":{
                  "core":"mycoll_shard1_replica_n1",
                  "shard":"shard1",
                  "collection":"mycoll",
                  "node_name":"10.0.0.80:7574_solr",
                  "type":"NRT",
                  "base_url":"http://10.0.0.80:7574/solr",
                  "state":"active",
                  "force_set_state":"false",
                  "INDEX.sizeInGB":6.426125764846802E-8}}]}}}
      ,{
        "node":"10.0.0.80:8983_solr",
        "isLive":true,
        "cores":2.0,
        "freedisk":567.498908996582,
        "port":8983,
        "totaldisk":1044.122688293457,
        "replicas":{"mycoll":{
            "shard2":[{
                "core_node8":{
                  "core":"mycoll_shard2_replica_n6",
                  "shard":"shard2",
                  "collection":"mycoll",
                  "node_name":"10.0.0.80:8983_solr",
                  "type":"NRT",
                  "leader":"true",
                  "base_url":"http://10.0.0.80:8983/solr",
                  "state":"active",
                  "force_set_state":"false",
                  "INDEX.sizeInGB":6.426125764846802E-8}}],
            "shard1":[{
                "core_node5":{
                  "core":"mycoll_shard1_replica_n2",
                  "shard":"shard1",
                  "collection":"mycoll",
                  "node_name":"10.0.0.80:8983_solr",
                  "type":"NRT",
                  "leader":"true",
                  "base_url":"http://10.0.0.80:8983/solr",
                  "state":"active",
                  "force_set_state":"false",
                  "INDEX.sizeInGB":6.426125764846802E-8}}]}}}],
    "liveNodes":["10.0.0.80:7574_solr",
      "10.0.0.80:8983_solr"],
    "violations":[{
        "collection":"mycoll",
        "tagKey":7574,
        "violation":{
          "replica":{
            "NRT":2,
            "count":2},
          "delta":2.0},
        "clause":{
          "replica":0,
          "port":"7574",
          "collection":"mycoll"},
        "violatingReplicas":[{
            "core_node7":{
              "core":"mycoll_shard2_replica_n4",
              "shard":"shard2",
              "collection":"mycoll",
              "node_name":"10.0.0.80:7574_solr",
              "type":"NRT",
              "base_url":"http://10.0.0.80:7574/solr",
              "state":"active",
              "force_set_state":"false",
              "INDEX.sizeInGB":6.426125764846802E-8}}
          ,{
            "core_node3":{
              "core":"mycoll_shard1_replica_n1",
              "shard":"shard1",
              "collection":"mycoll",
              "node_name":"10.0.0.80:7574_solr",
              "type":"NRT",
              "base_url":"http://10.0.0.80:7574/solr",
              "state":"active",
              "force_set_state":"false",
              "INDEX.sizeInGB":6.426125764846802E-8}}]}],
    "config":{
      "cluster-policy":[{
          "replica":0,
          "port":"7574"}]}},
  "WARNING":"This response format is experimental.  It is likely to change in the future."}
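
The same inline request can be sketched in Python. The helper below only builds the POST and does not send it; its name is hypothetical, and the URL and policy are copied from the curl example above.

```python
import json
import urllib.request

def inline_policy_request(url, cluster_policy):
    """Build (but do not send) a POST carrying an inline cluster policy."""
    body = json.dumps({"cluster-policy": cluster_policy}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-type": "application/json"},
        method="POST",
    )

req = inline_policy_request(
    "http://localhost:8983/api/cluster/autoscaling/diagnostics?omitHeader=true",
    [{"replica": 0, "put": "on-each-node", "nodeset": {"port": "7574"}}],
)
# To send it against a running node: urllib.request.urlopen(req)
```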

Suggestions API

Suggestions are operations recommended by the system according to the policies and preferences the user has set.

Suggestions are made only if there are violations of active policies. The operation section of the response uses the defined preferences to identify the target node.

The API is available at /admin/autoscaling/suggestions. Here is an example output from a suggestion request:

{
  "responseHeader":{
    "status":0,
    "QTime":101},
  "suggestions":[{
      "type":"violation",
      "violation":{
        "collection":"mycoll",
        "shard":"shard2",
        "tagKey":"7574",
        "violation":{ "delta":-1},
        "clause":{
          "replica":"0",
          "shard":"#EACH",
          "port":7574,
          "collection":"mycoll"}},
      "operation":{
        "method":"POST",
        "path":"/c/mycoll",
        "command":{"move-replica":{
            "targetNode":"192.168.43.37:8983_solr",
            "replica":"core_node7"}}}},
    {
      "type":"violation",
      "violation":{
        "collection":"mycoll",
        "shard":"shard2",
        "tagKey":"7574",
        "violation":{ "delta":-1},
        "clause":{
          "replica":"0",
          "shard":"#EACH",
          "port":7574,
          "collection":"mycoll"}},
      "operation":{
        "method":"POST",
        "path":"/c/mycoll",
        "command":{"move-replica":{
            "targetNode":"192.168.43.37:7575_solr",
            "replica":"core_node15"}}}}],
  "WARNING":"This response format is experimental.  It is likely to change in the future."}

The suggested operation is an API call that can be invoked to remedy the current violation.
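
As a sketch of how a suggested operation might be turned into an HTTP request, the helper below reads the method, path, and command fields from one suggestion. It assumes that the path (such as /c/mycoll) is relative to the v2 API root at /api on a local node; that base URL and the helper name are assumptions, not something the response itself states.

```python
import json
import urllib.request

def operation_to_request(suggestion, base="http://localhost:8983/api"):
    """Build (but do not send) the request described by a suggestion.

    Assumption: `base` is the v2 API root that the operation path extends.
    """
    op = suggestion["operation"]
    body = json.dumps(op["command"]).encode("utf-8")
    return urllib.request.Request(
        base.rstrip("/") + op["path"],
        data=body,
        headers={"Content-type": "application/json"},
        method=op["method"],
    )

# One suggestion, trimmed from the example output above.
suggestion = {
    "type": "violation",
    "operation": {
        "method": "POST",
        "path": "/c/mycoll",
        "command": {"move-replica": {
            "targetNode": "192.168.43.37:8983_solr",
            "replica": "core_node7"}},
    },
}
req = operation_to_request(suggestion)
```

Reviewing each suggestion before sending it is a reasonable precaution, since the operation moves replicas between nodes.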

The types of suggestions available are:

  • violation : Fixes a violation of one or more policy rules
  • repair : Adds missing replicas
  • improvement : Moves replicas around so that the load is more evenly balanced according to the autoscaling preferences

By default, the suggestions API returns all of the above, in that order. However, it is possible to fetch only certain types by adding the request parameter type, which can be repeated, e.g., type=violation&type=repair
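
A small Python sketch of building a suggestions URL with a repeated type parameter follows. The helper name and the base URL used in the example call are assumptions; the three type names are the ones listed above.

```python
from urllib.parse import urlencode

SUGGESTION_TYPES = ("violation", "repair", "improvement")

def suggestions_url(base, types=()):
    """Return the suggestions URL, repeating `type` once per requested type."""
    unknown = [t for t in types if t not in SUGGESTION_TYPES]
    if unknown:
        raise ValueError("unknown suggestion type(s): " + ", ".join(unknown))
    url = base.rstrip("/") + "/admin/autoscaling/suggestions"
    if types:
        url += "?" + urlencode([("type", t) for t in types])
    return url

url = suggestions_url("http://localhost:8983/solr", ["violation", "repair"])
```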

Inline Policy Configuration

If there is no autoscaling policy configured or if you wish to use a configuration other than the default, it is possible to send the autoscaling policy JSON as an inline payload as follows:

curl -X POST -H 'Content-type:application/json'  -d '{
 "cluster-policy": [
    {"replica": 0, "put" : "on-each-node", "nodeset": {"port" : "7574"}}
   ]
}' 'http://localhost:8983/solr/admin/autoscaling/suggestions?omitHeader=true'

Output

{
  "suggestions":[{
      "type":"violation",
      "violation":{
        "collection":"mycoll",
        "tagKey":7574,
        "violation":{
          "replica":{
            "NRT":2,
            "count":2},
          "delta":2.0},
        "clause":{
          "replica":0,
          "port":"7574",
          "collection":"mycoll"}},
      "operation":{
        "method":"POST",
        "path":"/c/mycoll",
        "command":{"move-replica":{
            "targetNode":"10.0.0.80:8983_solr",
            "inPlaceMove":"true",
            "replica":"core_node8"}}}},
    {
      "type":"violation",
      "violation":{
        "collection":"mycoll",
        "tagKey":7574,
        "violation":{
          "replica":{
            "NRT":2,
            "count":2},
          "delta":2.0},
        "clause":{
          "replica":0,
          "port":"7574",
          "collection":"mycoll"}},
      "operation":{
        "method":"POST",
        "path":"/c/mycoll",
        "command":{"move-replica":{
            "targetNode":"10.0.0.80:8983_solr",
            "inPlaceMove":"true",
            "replica":"core_node5"}}}}],
  "WARNING":"This response format is experimental.  It is likely to change in the future."}

History API

The history of autoscaling events is available at /admin/autoscaling/history. It returns information about past autoscaling events and details about their processing. This history is kept in the .system collection, and is populated by a trigger listener SystemLogListener. By default this listener is added to all new triggers.

History events are regular Solr documents, so they can also be accessed directly by searching the .system collection. The history handler acts as a regular search handler, so all query parameters supported by the /select handler for that collection are supported here too. However, the history handler makes this process easier by offering a simpler syntax and knowledge of the field names used by SystemLogListener for serialization of event data.

History documents contain the action context, if it was available, which gives further insight into, for example, the exact operations that were computed and/or executed.

Specifically, the following query parameters can be used (they are turned into filter queries, so an implicit AND is applied):

trigger
The name of the trigger.
eventType
The event type or trigger type (e.g., nodeAdded).
collection
The name of the collection involved in event processing.
stage
An event processing stage.
action
A trigger action.
node
A node name that the event refers to.
beforeAction
A beforeAction stage.
afterAction
An afterAction stage.
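
The filter parameters above can be assembled into a history request URL as in the following sketch. The helper name, the base URL, and the use of rows as an extra search parameter are assumptions for illustration; the filter names are the ones listed above.

```python
from urllib.parse import urlencode

HISTORY_FILTERS = (
    "trigger", "eventType", "collection", "stage",
    "action", "node", "beforeAction", "afterAction",
)

def history_url(base, filters=None, extra=None):
    """Return a history URL; `extra` carries ordinary search parameters."""
    filters = dict(filters or {})
    unknown = sorted(set(filters) - set(HISTORY_FILTERS))
    if unknown:
        raise ValueError("not a history filter: " + ", ".join(unknown))
    params = {**filters, **(extra or {})}
    url = base.rstrip("/") + "/admin/autoscaling/history"
    return url + ("?" + urlencode(params) if params else "")

url = history_url(
    "http://localhost:8983/solr",
    {"trigger": ".auto_add_replicas", "stage": "SUCCEEDED"},
    extra={"rows": 5},
)
```

Because each filter becomes a filter query, passing both trigger and stage narrows the result to events that match both.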

Example output
{
    "responseHeader": {
        "status": 0,
        "QTime": 64
    },
    "response": {
        "numFound": 2,
        "start": 0,
        "docs": [
            {
                "type": "autoscaling_event",
                "source_s": "SystemLogListener",
                "id": "15f53efdf4bT2qlmj80580yuu997vktddfob3",
                "event.id_s": "14f0d67fe7b97d80T2qlmj80580yuu997vktddfob2",
                "event.type_s": "NODELOST",
                "event.source_s": ".auto_add_replicas",
                "event.time_l": 1508941720006000000,
                "timestamp": "2017-10-25T14:29:10.091Z",
                "event.property.eventTimes_ss": [
                    "1508941720006000000"
                ],
                "event.property._enqueue_time__ss": [
                    "1508941750088000000"
                ],
                "event.property.nodeNames_ss": [
                    "192.168.1.104:7574_solr"
                ],
                "stage_s": "STARTED",
                "event_str": "{\n  \"id\":\"14f0d67fe7b97d80T2qlmj80580yuu997vktddfob2\",\n  \"source\":\".auto_add_replicas\",\n  \"eventTime\":1508941720006000000,\n  \"eventType\":\"NODELOST\",\n  \"properties\":{\n    \"eventTimes\":[1508941720006000000],\n    \"_enqueue_time_\":1508941750088000000,\n    \"nodeNames\":[\"192.168.1.104:7574_solr\"]}}",
                "_version_": 1582240104552857600
            },
            {
                "type": "autoscaling_event",
                "source_s": "SystemLogListener",
                "id": "15f53eff316T2qlmj80580yuu997vktddfob6",
                "event.id_s": "14f0d67fe7b97d80T2qlmj80580yuu997vktddfob2",
                "event.type_s": "NODELOST",
                "event.source_s": ".auto_add_replicas",
                "event.time_l": 1508941720006000000,
                "timestamp": "2017-10-25T14:29:15.158Z",
                "event.property.eventTimes_ss": [
                    "1508941720006000000"
                ],
                "event.property._enqueue_time__ss": [
                    "1508941750088000000"
                ],
                "event.property.nodeNames_ss": [
                    "192.168.1.104:7574_solr"
                ],
                "stage_s": "SUCCEEDED",
                "event_str": "{\n  \"id\":\"14f0d67fe7b97d80T2qlmj80580yuu997vktddfob2\",\n  \"source\":\".auto_add_replicas\",\n  \"eventTime\":1508941720006000000,\n  \"eventType\":\"NODELOST\",\n  \"properties\":{\n    \"eventTimes\":[1508941720006000000],\n    \"_enqueue_time_\":1508941750088000000,\n    \"nodeNames\":[\"192.168.1.104:7574_solr\"]}}",
                "_version_": 1582240109859700736
            }
        ]
    }
}

Write API

The Write API is available at the same /admin/autoscaling and /api/cluster/autoscaling endpoints as the Read API but can only be used with the POST HTTP verb.

The payload of the POST request is a JSON message with commands to set and remove components. Multiple commands can be specified together in the payload. The commands are executed in the order specified and the changes are atomic, i.e., either all succeed or none.
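
A sketch of sending several commands in one payload follows. The helper only builds the POST request; its name and the default base URL are assumptions. Python dictionaries preserve insertion order, so the commands are serialized in the order they are written.

```python
import json
import urllib.request

def write_request(commands, base="http://localhost:8983/solr"):
    """Build (but do not send) a Write API POST with one or more commands."""
    body = json.dumps(commands).encode("utf-8")
    return urllib.request.Request(
        base.rstrip("/") + "/admin/autoscaling",
        data=body,
        headers={"Content-type": "application/json"},
        method="POST",
    )

# Two commands in a single payload, serialized in this order.
commands = {
    "set-cluster-preferences": [{"minimize": "cores"}],
    "set-cluster-policy": [{"replica": "<2", "shard": "#EACH", "node": "#ANY"}],
}
req = write_request(commands)
# To send it against a running node: urllib.request.urlopen(req)
```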

Create and Modify Cluster Preferences

Cluster preferences are specified as a list of sort preferences. Multiple sorting preferences can be specified and they are applied in the order they are set.

They are defined using the set-cluster-preferences command.

Each preference is a JSON map having the following syntax:

{'<sort_order>':'<sort_param>', 'precision':'<precision_val>'}

See the section Cluster Preferences Specification for details about the allowed values for the sort_order, sort_param and precision parameters.

Changing the cluster preferences after the cluster is already built doesn’t automatically reconfigure the cluster. However, all future cluster management operations will use the changed preferences.

Input

{
"set-cluster-preferences" : [
  {"minimize": "cores"}
  ]
}

Output

The output has a key named result which will return either success or failure depending on whether the command succeeded or failed.

{
    "responseHeader": {
        "status": 0,
        "QTime": 138
    },
    "result": "success",
    "WARNING": "This response format is experimental.  It is likely to change in the future."
}

Example Setting Cluster Preferences

In this example we add cluster preferences that sort on three different parameters:

{
  "set-cluster-preferences": [
    {
      "minimize": "cores",
      "precision": 2
    },
    {
      "maximize": "freedisk",
      "precision": 100
    },
    {
      "minimize": "sysLoadAvg",
      "precision": 10
    }
  ]
}

We can remove all cluster preferences by setting preferences to an empty list.

{
  "set-cluster-preferences": []
}
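
The preference maps above can also be generated programmatically. In this sketch the helper name is hypothetical, and only the sort order is validated; the allowed sort parameters and precision values are described in the Cluster Preferences Specification and are not checked here.

```python
import json

def preference(sort_order, sort_param, precision=None):
    """Return one preference map such as {"minimize": "cores", "precision": 2}."""
    if sort_order not in ("minimize", "maximize"):
        raise ValueError("sort order must be 'minimize' or 'maximize'")
    pref = {sort_order: sort_param}
    if precision is not None:
        pref["precision"] = precision
    return pref

payload = {
    "set-cluster-preferences": [
        preference("minimize", "cores", 2),
        preference("maximize", "freedisk", 100),
        preference("minimize", "sysLoadAvg", 10),
    ]
}
body = json.dumps(payload)  # JSON text to POST to the Write API
```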

Create and Modify Cluster Policies

Cluster policies are set using the set-cluster-policy command.

Like set-cluster-preferences, the policy definition is a JSON map defining the desired attributes and values.

Refer to the Policy Specification section for details of the allowed values for each condition in the policy.

Input

{
"set-cluster-policy": [
  {"replica": "<2", "shard": "#EACH", "node": "#ANY"}
  ]
}

Output

{
    "responseHeader": {
        "status": 0,
        "QTime": 47
    },
    "result": "success",
    "WARNING": "This response format is experimental.  It is likely to change in the future."
}

We can remove all cluster policy conditions by setting policy to an empty list.

{
  "set-cluster-policy": []
}

Changing the cluster policy after the cluster is already built doesn’t automatically reconfigure the cluster. However, all future cluster management operations will use the changed cluster policy.
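
To illustrate what the example condition {"replica": "<2", "shard": "#EACH", "node": "#ANY"} asks for, the sketch below counts replicas per shard on each node and reports the places where the count reaches two. It is an illustrative model only, not Solr's policy engine, and the placement data is hypothetical.

```python
from collections import Counter

def nodes_violating_replica_limit(placements, limit=2):
    """Return (collection, shard, node, count) where count is not below `limit`.

    `placements` holds one (collection, shard, node) tuple per replica.
    """
    counts = Counter(placements)
    return sorted(
        (c, s, n, k) for (c, s, n), k in counts.items() if k >= limit
    )

# Hypothetical placement: shard1 has two replicas on the 8983 node.
placements = [
    ("gettingstarted", "shard1", "127.0.0.1:8983_solr"),
    ("gettingstarted", "shard1", "127.0.0.1:8983_solr"),
    ("gettingstarted", "shard2", "127.0.0.1:8983_solr"),
    ("gettingstarted", "shard2", "127.0.0.1:7574_solr"),
]
found = nodes_violating_replica_limit(placements)
```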

Create and Modify Collection-Specific Policy

The set-policy command accepts a map of policy names to the list of conditions for that policy. Multiple named policies can be specified together. A named policy that does not already exist is created; if the named policy already exists, it is replaced.

Refer to the Policy Specification section for details of the allowed values for each condition in the policy.

Input

{
"set-policy": {
  "policy1": [
    {"replica": "1", "shard": "#EACH", "nodeset":{"port": "8983"}}
    ]
  }
}

Output

{
    "responseHeader": {
        "status": 0,
        "QTime": 246
    },
    "result": "success",
    "WARNING": "This response format is experimental.  It is likely to change in the future."
}

Changing the policy after the collection is already built doesn’t automatically reconfigure the collection. However, all future cluster management operations will use the changed policy.

Remove a Collection-Specific Policy

The remove-policy command accepts the name of a policy to be removed from Solr. The policy being removed must not be attached to any collection; otherwise, the command will fail.

Input

{"remove-policy": "policy1"}

Output

{
    "responseHeader": {
        "status": 0,
        "QTime": 42
    },
    "result": "success",
    "WARNING": "This response format is experimental.  It is likely to change in the future."
}

If you attempt to remove a policy that is being used by a collection, this command will fail to delete the policy until the collection itself is deleted.

Create/Update Trigger

The set-trigger command can be used to create a new trigger or overwrite an existing one.

You can see the section Trigger Configuration for a full list of configuration options.

Creating a nodeAdded Trigger
{
 "set-trigger": {
  "name" : "node_added_trigger",
  "event" : "nodeAdded",
  "waitFor" : "1s"
 }
}
Updating Trigger with waitFor set to 5 seconds
{
 "set-trigger": {
  "name" : "node_added_trigger",
  "event" : "nodeAdded",
  "waitFor" : "5s"
 }
}
Creating a nodeLost Trigger
{
 "set-trigger": {
  "name" : "node_lost_trigger1",
  "event" : "nodeLost",
  "waitFor" : "60s"
 }
}
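
Trigger payloads like the ones above can be generated rather than typed by hand, which avoids JSON slips such as trailing commas. This is a sketch: the helper name is hypothetical, and it covers only the name, event, and waitFor options used in these examples.

```python
import json

def set_trigger(name, event, wait_for_seconds):
    """Return a set-trigger command with waitFor expressed in seconds."""
    if wait_for_seconds < 0:
        raise ValueError("waitFor must not be negative")
    return {
        "set-trigger": {
            "name": name,
            "event": event,
            "waitFor": f"{wait_for_seconds}s",
        }
    }

# Sending the same trigger name again overwrites the existing trigger.
payload = json.dumps(set_trigger("node_added_trigger", "nodeAdded", 5))
```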

Remove Trigger

The remove-trigger command can be used to remove a trigger. It accepts a single parameter: the name of the trigger.

Removing the nodeLost Trigger
{
 "remove-trigger": {
  "name" : "node_lost_trigger1"
 }
}

Create/Update Trigger Listener

The set-listener command can be used to create or modify a listener for a trigger.

You can see the section Trigger Listener Configuration for a full list of configuration options.

Creating a listener for the nodeAdded Trigger
{
 "set-listener": {
    "name": "foo",
    "trigger": "node_added_trigger",
    "stage": ["STARTED", "ABORTED", "SUCCEEDED", "FAILED"],
    "class": "com.example.Listener"
 }
}

Remove Trigger Listener

The remove-listener command can be used to remove an existing listener. It accepts a single parameter: the name of the listener.

Removing the foo listener
{
 "remove-listener": {
    "name": "foo"
 }
}

Change Autoscaling Properties

The set-properties command can be used to change the default properties used by the Autoscaling framework.

The following properties can be specified in the payload:

triggerScheduleDelaySeconds
This is the delay in seconds between two executions of a trigger. Every trigger is scheduled using Java’s ScheduledThreadPoolExecutor with this delay. The default is 1 second.
triggerCooldownPeriodSeconds
Solr pauses all other triggers for this cool down period after a trigger fires so that the system can stabilize before running triggers again. The default is 5 seconds.
triggerCorePoolSize
The core pool size of the ScheduledThreadPoolExecutor used to schedule triggers. The default is 4 threads.

The command allows setting arbitrary properties in addition to the above properties. Such arbitrary properties can be useful in custom TriggerAction instances.

Change default triggerScheduleDelaySeconds
{
  "set-properties": {
    "triggerScheduleDelaySeconds": 8
  }
}

The set-properties command replaces older values if present, so setting the same property twice overwrites the earlier value. If a property is not specified, it retains its last set value, or the default if it was never changed. A changed value can be unset by using a null value.

Revert changed value of triggerScheduleDelaySeconds to default
{
  "set-properties": {
    "triggerScheduleDelaySeconds": null
  }
}

The changed values of these properties, if any, can be read using the Autoscaling Read API in the properties section.
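
The replace, retain, and unset behavior of set-properties can be modeled in a few lines of Python. This is an illustrative model of the semantics described above, not Solr's implementation; the function names are hypothetical, and the defaults are the ones listed in this section. Python's None serializes to JSON null.

```python
import json

DEFAULTS = {
    "triggerScheduleDelaySeconds": 1,
    "triggerCooldownPeriodSeconds": 5,
    "triggerCorePoolSize": 4,
}

def apply_set_properties(overrides, update):
    """Return the stored overrides after one set-properties command."""
    result = dict(overrides)
    for key, value in update.items():
        if value is None:
            result.pop(key, None)   # null unsets the changed value
        else:
            result[key] = value     # a new value replaces the old one
    return result

def effective(overrides):
    """Defaults apply wherever no override is stored."""
    return {**DEFAULTS, **overrides}

step1 = apply_set_properties({}, {"triggerScheduleDelaySeconds": 8})
step2 = apply_set_properties(step1, {"triggerCorePoolSize": 2})
step3 = apply_set_properties(step2, {"triggerScheduleDelaySeconds": None})
unset_body = json.dumps({"set-properties": {"triggerScheduleDelaySeconds": None}})
```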