Elasticsearch Performance Tuning

Elasticsearch performance tuning
You have numerous options when tuning the performance of NiFi and Elasticsearch. The
following guide introduces tools for monitoring performance and validating key tuning
parameters, and provides a performance tuning strategy that you can use with the component.
The NiFi and Elasticsearch component may be perceived as relatively slow in processing speed.
This may appear to be the case if you are testing the default configuration, which has not been
modified or tuned to improve processing performance. This default configuration includes a
mostly single threaded configuration of NiFi, and a minimal configuration of the Elasticsearch
cluster. This setup can result in performance issues when building the search index. However,
the performance of the solution is not only due to the default configuration.
It is also important to choose an optimal hardware configuration when testing. While CPU and
memory resources are not critical, the I/O bandwidth of the disk subsystem on the Elasticsearch
cluster emerges as a critical factor. Thus, careful selection of the environment from the start can
avoid several simple performance pitfalls.
The following documents outline this approach. They also provide a general understanding of the
index building process and the corresponding metrics that can be used to observe the process.
They provide a general understanding and interpretation of these metrics to further improve
performance. Also included is a list of configuration changes for two hardware specifications:
 A minimal configuration, that is applicable for use with a minimal hardware specification,
and
 A recommended configuration, that should be applied on the recommended hardware
specification.
Background information
Multiple factors influence the performance of the search index build, including hardware footprint,
catalog size, and attribute dictionary richness. Understanding the bottlenecks and how they
express themselves across the whole process is crucial to fine-tuning the search solution.
The index creation process consists of three key steps:
1. Data retrieval.
2. Data processing or Transformation.
3. Data uploading.
A set of predefined connectors consisting of multiple process groups for different purposes is
available. To handle the data retrieval, processing, and uploading stages of the index creation
process, each process group often has nested process groups.
Retrieving data group
Fetch data from database or Elasticsearch.
Processing data group

For the fetched data: build, update, or copy the index documents.
Uploading data group

Upload the index document to Elasticsearch.
Each group has an influence on the index building process's speed and
efficiency. The retrieving data group, for example, would be in control of the
flow file size (bucket size) and query execution frequency (scroll page size).
You can optimise the payload and retrieval cost from the database, as the
chunk of data NiFi processes as a unit, by altering these variables. The size
of the flow file affects Elasticsearch upload performance. Complicated and
large structures can take longer time for Elasticsearch to parse, resulting in
poor scalability.
The processing data group controls the amount of work NiFi can do. For
example, you can regulate how many flow files may be processed
concurrently by controlling the processor's thread count. This increases the
processing speed of a typical processor, potentially improving flow file pileup
in front of the processor. The NLP processor, for example, is a typical
processor that benefits significantly from additional threads. You can control
how many connections to Elasticsearch you make concurrently using the
more specialised bulk update type processor, allowing you to import more
data to Elasticsearch.
These scenarios will be examined in further detail in Interpreting patterns and

tuning search solution, using real-world examples and data.
Infrastructure requirements
HCL Commerce's infrastructure requirements are well defined, and while NiFi
and Elasticsearch may function on a reduced footprint, performance may
suffer if you reduce their resource allocation. Good I/O bandwidth is
necessary for both NiFi and Elasticsearch infrastructures, enough memory for
Java heap and Java native memory allocation, and preferably, enough
memory for file caching. The latter may need to be specified in the pod since
it ensures that the operating system has enough additional RAM for the
service.
Key Tunable Variables
The following key tunable variables can be adjusted to improve overall

processing time:
Processor thread count (Concurrent Tasks)
The default processor runs a single thread at a time, processing one flow file
at a time. If concurrent processing is desired, the number of concurrent tasks
that it can do can be adjusted. Set the number of threads for the process
group by changing the processor Concurrent Tasks value (under Processor
configuration or SCHEDULING tab).
Throughput can be improved if a CPU is able to multitask by increasing the
number of threads it employs. Two such examples are the transformation
processor (as in NLP) and the Bulk update processor (sends data to
Elasticsearch). This update does not help every processor. Most processors
come with an default configuration that takes this variable into account and
does not need to be altered. When performance testing reveals a bottleneck
in front of the processors, the default configuration may benefit from further
tuning.
When the processor can process flow files at the same rate as they come,
the Concurrent Tasks value is ideal, preventing large pileups of flow files in
the processor's wait queue. Because such a balance may not always be
feasible, the best configuration focuses on reducing the flow file pileup in the
waiting queue.
 Elasticsearch monitoring and tuning

The Elasticsearch cluster is a key component of the search solution
for HCL Commerce Version 9.1. A number of monitoring tools are
provided with key metrics that you should continuously monitor. The
following topics describe these tools and demonstrate how to
troubleshoot potential issues. While every possible situation is not
covered, the most frequent events and problems are noted and
resolutions provided.
 Monitoring Elasticsearch cluster
The Elasticsearch monitoring dashboard page displays numerous
metric displays, ranging from KPI, Shards and JVM Garbage
Collection, to CPU and Memory, Disk and Network information. In-
depth knowledge of Elasticsearch operations is required for a
complete understanding of some metrics, but the crucial indicators for
monitoring and troubleshooting Elasticsearch are broad enough that
they are easy to use.
 Key monitoring metrics
The query time dashboard and CPU utilization dashboard are the two
primary monitoring metrics that will enable you to monitor the
Elasticsearch (ES) cluster and identify changes over time
dynamically.
 Active threads dashboards
Thread pool dashboards display real-time information about worker
threads and their operations on the cluster, with metrics including the
number of tasks waiting to be executed and the number of active
threads. Understanding the role of each thread pool, such as the
Generic, Search, Bulk, Index, GET, Write, and Analysis pools, is
crucial for troubleshooting problems and optimizing performance in
Elasticsearch (ES). Additionally, the field data and query caches help
speed up search operations but may need to be cleared periodically,
causing temporary slowdowns.
 Total Operations dashboard
The total operations dashboard displays a breakdown of the total
number of index, search, get, and delete operations carried out on
your Elasticsearch cluster.
 Multiple or Concurent Elasticsearch operations
Multiple or concurrent Elasticsearch operations can reduce
performance. Finding the cause should be done with attention and
planning.
 Tuning knobs
Elasticsearch has several configuration points that may be changed
for optimal performance and the allocation of resources. The following
are some critical tuning knobs to think about.
 Tuning the Ingest service
NiFi can generate large amounts of data during the Ingest process.
Use the formulas provided in the following topics to tune your Ingest
processes so that NiFi provides data at the same rate as
Elasticsearch is able to consume it.
Interpreting patterns and tuning

search solution
You have several options for improving overall performance. Tuning ingestion also includes
learning about possible side effects and negative results.
A case of slow data transfer – multithreading

The transfer operation from NiFi to Elasticsearch is an example of a suboptimal segment that you
can readily detect. This operation is usually handled by the bulk update processor, which shows
up in Grafana as a rapid growth in the number of incoming documents, which then peaks,
followed by a slowdown, shallow decline in the number of documents due to the transfer speed.
As you can see, it looks like this:
Due to the slowdown transfer speed to Elasticsearch, you can see a shallow depletion of the
documents in the queue. You can increase this speed by opening more connections to
Elasticsearch and configuring more threads for the Bulk Update Processor to increase
throughput.
When the total number of threads is raised, the following graph represents what happens on the
system:
When the Bulk Update Processor is configured with only three threads, the initial configuration
shows a very shallow depletion of the documents. When configured with 16 threads, the ramp-
down angle increases significantly, and when configured with 64 threads, it improves even more.
The important distinction is that increasing the number of threads in this processor increases the
number of HTTP connections opened from NiFi to Elasticsearch, while the resulting CPU
utilisation remains almost unchanged.
Other processors, such as the NLP process group, could benefit from similar observation and
improvement. The maximum concurrency achievable in the system is restricted by the number of
physical CPU cores available to the NiFi pod while using CPU bound processors like NLP.
Side effects from thread increase

Increasing the number of processor threads is a risky modification that should be done in small
increments instead of in large ones. Going from 3 to 64 threads, for example, is not
recommended. Make smaller incremental changes of 16, 32, and eventually 64 threads, testing
each increment and observing the results. It is observed that after 32 threads per CPU, the
benefits are negligible in most cases.
Furthermore, the OS resources available to the pod/ JVM where NiFi runs should be given
special consideration. A CPU-bound processor, for instance, benefits from concurrency only if
there are any available cores on the CPU. Increased processor concurrency, on the other hand,
increases the JVM heap size and the required native memory. To detect and correct such
situations, it is critical to monitor heap values and overall memory.
Side effects of bucket size increase

Increasing bucket size greatly affects the NiFi system's resources, such as heap and, more
importantly, native memory. Since the native memory is used as a buffering area when sending
the data to Elasticsearch, this increase enlarges the total memory footprint of the pod.
Detecting and capturing such a failure is a rare occurrence that is often overlooked. One such
event is shared here, along with point to the metrics values:
The Java heap size is seen in the graph above. The brief pause in the middle of the graph
reflects the NiFi pod crashing and restarting. If you observe the CPU before the crash, you can
see that the CPU utilisation spikes, but the heap size never reaches 100%, instead staying
around 70%, which is a reasonable heap size.
However, the below graph shows a totally different picture:

The total memory assigned to the Pod can be seen here, and you can see that the requested
Memory is reaching the maximum allocated memory at the time of the NiFi JVM's crash. As a
result, more Memory must be allocated to accommodate larger bucket size.
With a large bucket size, a different problem may arise: Elasticsearch may slow down the data
import significantly as the flow file size increases, to a point that the benefits in NiFi are
completely negated and processing degrades due to the Elasticsearch data import. Organize and
track your ingest testing for different bucket size values to avoid such situations and confusing
results.
A case of slow processing- scroll size tune

The slowdown delivery of data to NiFi is the third most common reason of data processing delay.
This could be due to the lack of database processing power, improper tuning, or inefficient
customization. While this issue is best addressed on the database side, it is sometimes much
more efficient to optimise NiFi data access to minimize the impact of database overhead on
ingest execution.
In this particular case, the slowdown are going to be visible within the Grafana graphs as idle
time between the queued item’s peaks. This happens when the scroll.size parameter is
configured to be relatively low compared to the total size of the database table that is being
accessed. The scroll.size should ideally match the processing time, where the database query
time should be equal to the NiFi processing time of the extracted data. However, in special cases
where the SQL runs longer then the NiFi processing, you could observe this as short peaks in
the queried items graph, spaced apart by flat/idle line.
This idle time between the fetches of the database data could be mitigated by increasing
the scroll.page.size value to higher number. For example if the database has total of 1 M
catalog items, and the scroll.page.size is set to 100000 items, the whole process
involves 10 iterations. This indicates that the items in the queue have 10 spikes, with idle time
intervals. You can reduce the wait time by 50% by increasing scroll.page.size to 200000,
thereby reducing the cycles to 5 cycles. You should set scroll.page.size to 1M so that the whole
data can be retrieved in one cycle and only one idle period for the processing phase is
observed.The following graph shows one such case:
Side effect from very large scroll sizes

Since the effect of changing the NiFi pod might be considerable, this minor adjustment should be
carefully considered. Because all of the data are sent to NiFi at the same time, you will need
enough of storage and RAM to keep the pod functioning smoothly. If NiFi is running out of heap,
native memory, or storage space, careful monitoring and modification are needed to maintain
optimal performance.
Finding the sweet spot

Overall, the strategy described should provide a monitored and measured performance testing
and tuning of the NiFi/ Elasticsearch operations. While some tuning variables are more impactful
on the system, you should consider all three to obtain optimal throughput of the system.
Due to the side effects, very large tuning variable changes can adversely affect total processing
time. It should be noted again that careful change of the values should be monitored and
measured, making sure that each individual change produces a positive change (improves) to
the overall processing time.
Other considerations
 Cache size
 "[${TENANT:-}${ENVIRONMENT:-}live]:services/cache/nifi/Price":
 localCache:
 maxSize: -1
 maxSizeInHeapPercent: 8 (default 2)
 remoteCache:
 enabled: false
 remoteInvalidations:
 publish: false
 subscribe: false
 "[${TENANT:-}${ENVIRONMENT:-}auth]:services/cache/nifi/Inventory":
 localCache:
 maxSize: -1
 remoteCache:
 enabled: false
 publish: false
 "[${TENANT:-}${ENVIRONMENT:-}auth]:services/cache/nifi/Bulk":
 localCache:
 maxSize: -1
 remoteCache:
 enabled: false
 publish: false
 "[${TENANT:-}${ENVIRONMENT:-}auth]:services/cache/nifi/Wait":
 localCache:
 maxSize: -1
 remoteCache:
enabled: false
Elasticsearch monitoring and tuning

The Elasticsearch cluster is a key component of the search solution for HCL Commerce Version
9.1. A number of monitoring tools are provided with key metrics that you should continuously
monitor. The following topics describe these tools and demonstrate how to troubleshoot potential
issues. While every possible situation is not covered, the most frequent events and problems are
noted and resolutions provided.
 Monitoring the operation

 Understanding the Grafana graphs
You can use Grafana to analyze the performance of the ingest pipeline.
Monitoring the operation
Monitoring should always start with the operating system's resources and how they are utilized.
On a system level, identify if a resource is saturated, such as CPU (processor utilisation), IO
(network, disc, or memory), Memory, and so on.
The NiFi and Elasticsearch heaps, and also native memory utilization, should be given special
attention. If the memory size is inadequate for the workload, it must be extended. After each
tuning adjustment, the heap and native memory use should be monitored. This is crucial when
increasing processor concurrency or increasing bucket.size/flowfile size.
The Graphana NiFi Performance graph is the most convenient way to track the index's overall
development. You can look at the total execution pace, determine the speed of major processor
groups, and see how much data is generated and uploaded to Elasticsearch.
Grafana
You can use Grafana to analyze the performance of the ingest pipeline. The two most useful
graphs are Queued Items and Wait Link.
WaitLink process groups are added between process groups in the NiFi ingest connectors to
ensure that the previous stage is completed before the subsequent stage is started. Data
currently in use in an ongoing process cannot be used in subsequent stages. Furthermore, this
reduces the probability of multiple processes operating at the same time, which might result in
significant spikes in CPU, network, memory, or disc IO resource requests.
The time spent on WaitLink can be used to estimate the total time spent on a stage and identify
the stages that consume the most time and/or resources during the build. Since WaitLink is not
available for all process groups, the Queued Items graph offers more details about the
processing time for each process group.
The Bulk Service <- XXXX> charts are useful to look at within Queued Items. The processed
data (index documents) is sent from NiFi to Elasticsearch by these process groups. Bulk Service
- Product is the most essential. Use the timestamp in WaitLink to access the corresponding
stages because the curve runs from the beginning to the finish of the ingest pipeline.
The next two graphs, for example, illustrate that the Product Stage 1e has the most queued
items. This observation indicates that the retrieving data group and the processing data group
are capable of handling the task rapidly and sending large amounts of data to the Bulk service
group for transmission.
The duration with 100 queued items is short in this example, thus it is not a concern. A possible
bottleneck in the pipeline would be a process group that takes longer and has a larger number of
queued items.
Grafana may also be used to track other parameters.
NiFi counters and reporting

When running an ingest pipeline, you can verify the pipeline report using NiFi counters or
Grafana.
Due to high resource consumption, the NiFi counters access http://nifi.<your

domain>/nifi/counters is disabled by default.
You can enable it by adding the following two line within nifi-app.yaml /commerce-helmchart-
master/hcl-commerce-helmchart/stable/hcl-commerce/templates/nifi-app.yaml before installing
NiFi:
- name: "DOMAIN_NAME"
value: "{{ .Release.Namespace }}.svc.cluster.local"
- name: "SPIUSER_NAME"
value: {{ $.Values.common.spiUserName | quote }}
- name: "FEATURE_NIFI_COUNTER"
value: "true"
- name: "VAULT_CA"
value: {{ .Values.vaultCA.enabled | quote }}
Examine the report while the test is ongoing or after the ingest process is done if you enable it.
One disadvantage is that each connection can only display one report. By using the same
connector, another ingest pipeline can be run at the same time (please allow a couple of minutes
for this to complete). Once the ingest pipeline starts, the report created for the previous run is
deleted.
The ingest report, Ingest Metrics, is sent to the index run within Elasticsearch once an ingest
pipeline is completed. Grafana can be set up to display the report in the format you specify. All of
the reports for the various ingest pipelines and connections are saved. To view the report, select
connector and runID.
At Grafana, the data for Ingest Metrics differs from that for Queued Items/ Wait Link.
Elasticsearch receives the metrics from NiFi after the ingest process is complete. However,
Queued Items/ Wait Link uses Prometheus to collect data in real time.
You may not want to finish an ingest pipeline before running it again for tuning purposes, or the
process could fail at any time throughout the ingest process. In these circumstances, NiFi
counters may make reporting for particular stages of an ingest pipeline easier.
Kibana
Kibana can be used to monitor the resource consumption of Elasticsearch. For more information
about Kibana, see Kibana documentation.
Kibana is monitoring Elasticsearch activities in this graph. The CPU usage, JVM heap, and IO
operations rate are the key metrics for the index building process. The IO operation rate is the
main metric since it is difficult to push faster overall throughput if the IO rate is fully utilised. If the
speed is unacceptable, the best course of action is to look into other options that have a higher
throughput.
Understanding the Grafana graphs

You can use Grafana to analyze the performance of the ingest pipeline.
You can use Grafana to analyze the performance of the ingest pipeline. The two most useful
graphs are Queued Items and WaitLink.
Visual Representation of the NiFi activities
In the NiFi ingest connectors, WaitLink process groups are added between process groups to
ensure that the previous stage is completed before the next stage is started. This way,
subsequent stages will not use data that is currently being worked on in an unfinished process. In
addition, this reduces the occurrence of different processes running at the same time, which can
cause extreme spikes in resource requests for CPU, network, memory or disk I/O.
NiFi uses "flow files" to process data in batches. The number of documents included in a flow file
is defined by the scroll.bucket.size property. Setting (scroll.bucket.size)=300, for example,
would allow 300 catentryIds per flowfile if applied to the Product Update 1i processing
segment.
Both WaitLink and Bucket.Size values can be tracked in Grafana. Observing the activities and
quantities helps determine system behavior and aids in the detection of slow segments.
Interpreting the graphs and detecting a bottleneck

The "wait link" and "queued items" graphs showing data for the Bulk Service Processor, are key
metrics for understanding the Ingest/Index build operation. Bulk Service Processor values are
important since they indicate packages sent to the Elasticsearch cluster. This is because all of
the flowfile backlogs are found only inside of the Bulk Service and not in other Extract and
Transform phases from each stage.
Both graphs have a number of metrics that can be tracked (by clicking the coloured line on the
right side of the graph), but only the most key metrics are displayed by default. Hover the mouse
pointer over the graph line and see which curve belongs to which process group or wait link.
When the processor group name or wait link is clicked, a small pop-up box appears:
The "queued link" graph depicts the number of flow files queued for processing at a given
processor group. A sharp rise in the curve indicates that the previous processor (or processor
group) is processing data faster, or that the processor group is struggling to keep up with the
overall throughput of adjacent processor groups. In the image below, you can observe a rapid
increase in the number of queued items around the 21.54 timestamp, indicating that the
processor is not keeping up with the incoming flow:
Similarly, the graph's ramp-down section has a steep curve, indicating that the CPU was able to
complete the processing rapidly. The steeper the curve, the faster the processor can process the
flowfiles, and the shallower the curve, the slower the processor can process the data. A case of
sluggish data flow processing can be seen in the image below:
The incoming rate (centered at 22:22 timestamp) is substantially greater than the outgoing rate,
with the incoming rate being relatively steep compared to the shallow angle of the outgoing
curve.
These simple observations are simple to apply to graphs and identify potential bottlenecks.
However, the conclusions are not always true, and the processor groups are sometimes
constrained in their data processing. To conclude, more observations are needed to confirm the
bottleneck.
Below the queued items are graphs of WaitLink. WaitLink graphs, unlike queued items, show
which stage or segment is processing at any given time. In other words, while the X axis
indicates time (corresponding to the Queue Item graph), the Y - axis shows the active segment,
having values ranging from 0 to 1:
If the system supports various languages, you may see many WaitLinks appear at the same
time. Thus, graphs reaching the Y axis up to value 2 may be shown for two languages, and so
on.
Wait links are helpful in assessing which processing stage takes the longest to complete. The
slowest segments are the longest rectangles, which are the best candidates for ingestion
process optimization.
In the next topic let us explore few typical cases of suboptimal ingestion processing and you will
formulate a strategy to improve the processing speed.
Monitoring Elasticsearch cluster
The Elasticsearch monitoring dashboard page displays numerous metric displays, ranging from KPI,
Shards and JVM Garbage Collection, to CPU and Memory, Disk and Network information. In-depth
knowledge of Elasticsearch operations is required for a complete understanding of some metrics, but
the crucial indicators for monitoring and troubleshooting Elasticsearch are broad enough that they
are easy to use.
The Elasticsearch monitoring dashboard pagedisplays numerous metric displays.
KPI
Displays several metrics by single number value.
Shards
Displays high level shard information.
JVM Garbage Collection
Garbage collection is the automated process of cleaning up objects used by code in memory.
Breakers
circuit breakers dashboards that displays tripped circuit breakers, their frequency and value of the
metrics that tripped them.
CPU and Memory
Disk and Network
Documents
Displays dashboards that show the documents and the operations done on them.
Total Operations Statistics
Times
Dashboards that show service times to the following operations: Query, Indexing, Merging and
Throttle time.
Thread Pool
Critical dashboard that shows activity of the worker threads.
Caches
Dashboard presenting the Elasticsearch caches.
Segments
Indices
Count of documents and total Size.
Indices
Doc values.
Indices
Refreshes.
Indices
Fields.
For a complete understanding of some metrics, in-depth knowledge of Elasticsearch operations is

required. You should concentrate on the crucial indicators that are essential for monitoring and
troubleshooting Elasticsearch until you develop expertise in using Elasticsearch with HCL Commerce..
Key monitoring metrics

The query time dashboard and CPU utilization dashboard are the two primary monitoring metrics
that will enable you to monitor the Elasticsearch (ES) cluster and identify changes over time
dynamically.
The following two critical monitoring metrics allow you to dynamically monitor the Elasticsearch
cluster and spot variations over time:
 Query time dashboard

 CPU utilisation dashboard
The following screenshots display a given time slice's query time and processing time. The
minimum, maximum, and average values are shown in the side table, while the graph presents
the maximum values.
The chart is simple to understand and can be used to quickly identify issues. Any sudden jump in
the max query time values would indicate a severe problem in the cluster, and further
investigation is mandatory. The graph depicts a situation where the query processing time is
disturbed for a more extended period, and this period should be correlated with the other
displays to determine the source of the problem.
Sudden spikes and then a return to normal are also possible. These spikes are usually due to
external events impacting the query processing time.
Moving the mouse cursor over the image provides the actual values for that time interval.
CPU Utilisation
This Dashboard is so a critical dashboardthat can alert you to unusual behavior or a
difficulty of the system to cope with the current workload.
Resource consumption
The resource consumption group ovides in-depth information on the Elasticsearch cluster
operations and resource availability. The steady state of operation should be well
understood, and any deviation from the steady state should present an alert for potential
threats or instability in the system that needs to be investigated.
CPU utilisation
The CPU age graph is relatively simple. It presents each ES node as a separate color in
the graph and, at the same time, presents a table with min, max, and average values for
each node in the ES cluster. Detecting situations where the CPU is very high (starvation)
or low (contention and slowdown in the processing rate) is easy and noticeable. The
following picture shows how the CPU utilization jumps to 80%+ while the traffic and
indexing are performed on the site.
Network
The network resources graphs are relatively simple but will provide additional data and
easy determination if an excess requests volume is fluencing operations.
Garbage Collection operations – GC time

The JVM heap and garbage collection (GC) timing may be tracked using the GC time
dashboard. This dashboard displays the amount of time used to clear the JVM heap of
dead objects in order to make space for new allocations. The operation sually represents a
10% overhead and its effect negligible.
The table displays the minimum, maximum, and average values for the chosen time period
as well as the value for the Old (also known as tenure) and Young (also known as rsery)
spaces in the heap.
Note: The OpenJDK JVM, which powers the Elasticsearch cluster, employs a
generational GC algorithm that is comparable to the IBM gencon GC algorithm.
JVM Memory Utilisation

Memory allocation and heap space expansion in the Java Virtual Machine (JVM) can be
tracked by the memory age dashboard. An example is presented in the following graph:
The total heap space is overreported and could be confusing. Multiple metrics are
reported, but due to the ES heap setup (min=max), only the Elasticsearch-master-
NNN used heap is displayed.
The general expectation is that the used heap metrics will be below the max
heap metrics, while the CPU utilization chart will depict normal and steady resource
consumption. The GC Time chart should display low overhead and a short time spent
doing the garbage collection.
However, if the used heap metric is frequently maximum and close to the max
heap metric, steady operation is still present. This indicates that the overall heap in ES is
insufficient and should be increased.
Active threads dashboards

Thread pool dashboards display real-time information about worker threads and their operations
on the cluster, with metrics including the number of tasks waiting to be executed and the number
of active threads. Understanding the role of each thread pool, such as the Generic, Search, Bulk,
Index, GET, Write, and Analysis pools, is crucial for troubleshooting problems and optimizing
performance in Elasticsearch (ES). Additionally, the field data and query caches help speed up
search operations but may need to be cleared periodically, causing temporary slowdowns.
Thread pool dashboards display real-time information about the worker threads and how they
operate on the cluster. The information is on a cluster level, but each thread group is shown per
ES node.
The picture below depicts two graphs:
1. Thread pool operations queued.

2. Thread pool threads active.
Thread pool operations queued indicates the number of tasks waiting to be executed,
while Thread pool threads active indicates the number of threads executing tasks. Both metrics
are essential for monitoring the health and performance of Elasticsearch.
The Thread pool operations queued metric indicates the number of tasks waiting to be
executed by the thread pool. This can happen when the number of tasks submitted to the thread
pool exceeds the maximum number of threads available to execute them. When this happens,
tasks are placed in a queue and are executed as soon as a thread becomes available.
The Thread pool threads active metric indicates the number of threads that are actively
executing tasks. When this number is close to the maximum number of threads available, it can
indicate that the system is under heavy load and may be experiencing performance issues.
You can further inspect the active threads by placing the mouse pointer over the graph at a
specific time to get the count of active threads at that specific time, as shown in image below.
Understanding the thread pools and their role in Elasticsearch operations is crucial to
troubleshoot problems:
1. Generic thread pool
This thread pool runs tasks that do not fit into any specialized thread pool. The generic
thread pool runs internal tasks within Elasticsearch, such as sending and receiving
network requests.
2. Search thread pool

This thread pool is used to run search requests. It handles tasks related to search
operations, such as querying and filtering data. The number of threads in the search
thread pool is typically set to the number of available CPU cores on the Elasticsearch
node.
3. Bulk thread pool
This thread pool runs bulk indexing requests. It is responsible for handling tasks related
to indexing large volumes of data.
4. Index thread pool
This thread pool runs indexing requests that are not performed in bulk. It is responsible
for handling tasks related to indexing individual documents.
5. GET thread pool
This thread pool is used to run GET requests. It is responsible for handling tasks related to
retrieving individual documents.
6. Write thread pool
This thread pool runs write-related operations, including indexing, updating, and deleting
documents. It handles tasks related to writing operations that cannot be executed on the
Bulk thread pool or the Index thread pool.
7. Analysis thread pool
This thread pool is used to run analysis tasks. It handles tasks related to analyzing text,
such as tokenization and filtering.
8. Snapshot thread pool
This thread pool runs snapshots and restores operations. It is responsible for handling
tasks related to backing up and restoring data.
Note: The number of threads in

the Bulk, Index, GET, Write, Analysis and Snapshot thread pools is typically set
to a small number, such as 1 or 2, to prevent the system from overloading.
Each thread pool has its settings, such as the maximum number of threads and the queue size,
which can be configured to optimize performance based on the specific needs of your
Elasticsearch deployment.
The write thread pool and the search request thread pool are the two most essential threads that
must be closely watched and aged. Depending on the workload level and workload mix,
appropriate settings for each thread pool may be required to control the workload flow and keep
the cluster in stable operation.
Elasticsearch internal caches:
There are several internal caches in lasticsearch, but the critical cache pools for Eastic operation
are:
1. Field data cache
The field data cache is used to cache field values for frequently accessed fields, and it
helps to speed up sorting aggregations and scripted fields. The field data cache is
implemented as a soft reference cache, which means that the cache can be cleared by
the garbage collector when memory becomes scarce.
2. Query cache
The query cache is used to cache the results of frequently executed queries and helps
speed up search operations. The query cache is implemented as an Least Recently
Used (LRU) cache, which means that the least recently executed queries are evicted
from the cache when it becomes full.
The field cache, for example, will be cleared whenever an index refresh or index merging
operation is carried out, necessitating a new load of field values from the disc to memory.
Whenever a reindexing operation is performed, such as when a new document is added or
changed the index, the query cache will be deleted entirely.
It is important to note that clearing the cache can cause a temporary slowdown in performance,
as the cache will need to be re-populated with new data.
Total Operations dashboard

The total operations dashboard displays a breakdown of the total number of index, search, get,
and delete operations carried out on your Elasticsearch cluster.
The dashboard visualizes the number of operations performed on your Elasticsearch cluster,
broken down by index, search, get, and delete operations.
1. Total Operations rate

This dashboard shows the rate at which operations are being performed on your
Elasticsearch cluster, broken down by index, search, get, and delete operations.
2. Total Operations time
This dashboard shows the total time each operation took, broken down by index, search,
get, and delete operations.
This can help you identify operations that are taking longer than expected and that may require
optimization.
Monitoring and Analysis of a typical problem

One can observe the Elasticsearch operations and tune them for a single type of workload or a
mix of various operations workloads. Tuning Elasticsearch for a single type of workload is
relatively easy and straightforward.
When you have a mix of operations, for example building the index and processing production
search requests, the success of the tuning depends on how well the workload mix is understood
and simulated in the performance testing environment.
The following section describes the straightforward workload tuning of the instance for creating
an index on Elasticsearch. Before attempting to tune live traffic, this will likely be the team's initial
tuning test.
Single type operations of Elasticsearch

To determine if Elasticsearch is short on resources when building an index, you can use the
following guidelines based on the metrics mentioned earlier:
High CPU usage
If Elasticsearch uses a high percentage of the CPU during indexing, you can add more
CPU resources to your cluster. You can also check if any poorly performing queries or
indexing processes consume more CPU resources than expected.
High memory usage
If Elasticsearch uses a large amount of memory during indexing, you can add more
memory resources to your cluster. You can also check if poorly performing queries or
indexing processes consume more memory resources than expected. In addition, you
can monitor the memory usage of Elasticsearch caches, such as the field data cache,
and adjust the cache size accordingly.
Slow indexing rate
If the indexing rate is slower than expected, it may indicate that Elasticsearch is short on
resources, such as CPU or memory. You can check the CPU and memory usage metrics
to see if they are close to their limits. If so, you may need to add more resources to your
cluster. You can also check if any queries or indexing processes are causing bottlenecks
or if there are any network or disk I/O issues.
High disk I/O
If the disk I/O is high during indexing, you can add more disk resources to your cluster.
You can also check if poorly performing queries or indexing processes are causing high
disk I/O. In addition, you can monitor the disk space usage and adjust the Elasticsearch
configuration settings, such as the shard size, to optimize the disk usage.
Elasticsearch logs
Elasticsearch logs can provide valuable information about errors, warnings, and other
issues related to resource shortages. You can check the Elasticsearch logs to see if
there are any error messages related to resource shortages and take appropriate actions
to address them.
By monitoring these metrics and analyzing the patterns and

trends over time, you can determine if Elasticsearch is short on
resources and take appropriate actions to optimize your cluster
performance during the indexing process.
The case of low heap

When Elasticsearch is low on heap, it can cause various issues,
such as slow response times, increased latency, and even
crashes. These unstable behaviors can be exhibited when
executing a single task on Elasticsearch or when multiple
concurrent tasks are executed simultaneously on Elasticsearch,
such as in index building and the search requests service.
These are typical behaviors that Elasticsearch may exhibit when

the heap is running low:
1. Increased Garbage Collection (GC) activity
Elasticsearch's garbage collector must run more frequently to reclaim memory as the
heap fills up. This can increase CPU usage, cause longer GC pauses, and air
performance.
2. Reduced cache size
Elasticsearch uses various caches to speed up searches and queries. When the heap is
low, the cache size may be reduced or even disabled, resulting in slower search times.
3. Increased disk I/O
If Elasticsearch unable to fit all of its data in memory, it may need to swap data to disk,
resulting in increased disk I/O and slower performance.
It is important to note that Elasticsearch es native memory for

various caches and storage repositories. The typically
recommended setup for Elasticsearch heap is to allocate 50% of
the pod memory to JVM Heap space. Leave the other 50% to
the native memory and file cache.
Additional considerations arding memory

consumption.
Bulk size
Using bulk requests can significantly improve indexing performance when indexing
multiple documents at once. However, the size of the bulk requests can also affect
performance. If the bulk size is too large, it can cause resource exhaustion and slow
down the indexing process. If the bulk size is too small, it lead to more requests and
slower overall performance.
There is no correct size of the bulk request. Test with

different settings for the bulk request size to find the
optimal size for your workload. This configuration can be
performed by setting the NiFi flow file size, which will, in
turn, reflect directly on the bulk request size that NiFi will
make to Elasticsearch.
It is important to note that Elasticsearch limits the

maximum size of an HTTP request to 100 MB by default,
so you must ensure that no request exceeds this size.
Multiple or Concurent Elasticsearch

operations
Multiple or concurrent Elasticsearch operations can reduce performance. Finding the cause
should be done with attention and planning.
Detecting and determining the root cause of Elasticsearch slowness when concurrent operations
are executed is complex and should be performed with extra care and preparation. You can
anticipate carrying out environment testing in the following situations:
1. As anticipated for the peak season, when the production workload is at capacity or at its
peak.
2. All the index-related operations that are expected to be executed during the peak
workload (index build, NRT updates to the index, inventory index updates, etc.).
3. Other operations do not directly affect the index or search data but can affect the overall
operations (for example, full cache clear after reindexing, etc.).
The case of flow congestion
Certain situations are frequently encountered when executing a mixed workload, such as funnel
congestion and performance degradation. Differing from the single type of operation, this is a
more complicated tuning exercise that observes and adjusts for insufficient resources and
balances those resources according to some priority. For example, the production search
request should precede the index build operations.
In a case where the slowness of Elasticsearch is due to a workload mix in which the search
cluster was indexing and serving live request data, the tuning comes more complicated and
requires upfront decision-making regarding the priorities of the workload mix types.
In most cases, you need to prioritize one type of operation over another. In this example, it would
require selecting a higher priority for the Live searches and a lower priority for the index creation
on the Auth environment.
Changing write threads:

Changing the write and search threads in Elasticsearch can effectively tune the cluster for better
performance and provide prioritization between the write (Auth index build) and Live (search
threads requests) operations.
Elasticsearch has a write thread pool that handles indexing requests. By default, this pool has a
size of (number of CPU cores * 2) + 1. You can increase or decrease the size of this pool based
on the workload of your cluster.
To change the size of the write thread pool, you can use the thread_pool.write.size setting in
the Elasticsearch configuration file. For example, to increase the size of the write thread pool
to 16, you can add the following line to the configuration file:
thread_pool.write.size : 16.
Changing search threads

Elasticsearch also has a search thread pool that handles search requests. By default, this pool
has a size of (number of CPU cores * 3) / 2. You can increase or decrease the size of this pool
based on the workload of your cluster.
To change the size of the search thread pool, you can use the thread_pool.search.size setting
in the Elasticsearch configuration file. For example, to increase the size of the search thread pool
to 24, you can add the following line to the configuration file:
thread_pool.search.size: 24.
Important: Changing the thread pool sizes should be done cautiously and after thorough testing.
Increasing the thread pool size too much can lead to resource exhaustion and cause
performance issues. It is recommended that you monitor the resource usage of your cluster after
changing the thread pool sizes and adjust them accordingly to achieve optimal performance.
Adjusting index settings can also effectively tune Elasticsearch for better performance. The
following parameters can be changed to enhance indexing performance:
Refresh interval
By default, Elasticsearch refreshes the search index every second. This means that
when a document is indexed, it will be immediately available for search at the next
refresh interval. If you have a high volume of indexing requests, consider increasing the
refresh interval to reduce the frequency of index refreshes and improve indexing
performance. Conversely, if you need real-time indexing, you can decrease the refresh
interval.
To adjust the refresh interval, use the index.refresh_interval setting. For example, to set
the refresh interval to 5 seconds, you can run the following command:
PUT /my_index/_settings
{
"index" : {
"refresh_interval" : "5s"
}
}
Number of shards
The number of shards in an index can also affect indexing performance. If you have an
extensive index with many shards, indexing performance may be slower due to the
overhead of coordinating writes across multiple shards. In general, keeping the number
of shards per index between 1 and 5 is recommended.
To adjust the number of shards, you can use the index.number_of_shards setting. For
example, to set the number of shards to 3, you can run the following command:
PUT /my_index/_settings
{
"index" : {
"number_of_shards" : 3
}
}
Tuning knobs
Elasticsearch has several configuration points that may be changed for optimal performance and
the allocation of resources. The following are some critical tuning knobs to think about.
 Heap Size
The heap size is one of the most critical tuning parameters for Elasticsearch. It
determines the amount of memory allocated to Elasticsearch's JVM and affects various
operations, including caching, indexing, and search.
Note: It is recommended to allocate around 50%of available memory to the heap, up to a
maximum of 30 GB. Elasticsearch uses native memory for various caches and buffers in
intra- and inter-pod communications.
 Thread Pools
Elasticsearch uses various thread pools for different operations, such as indexing,
searching, and merging. You can tune the thread pool settings to control the number of
threads allocated for each type of operation and adjust the queue size for pending
requests. This can help balance the allocation of resources based on your specific
workload.
For more information, see Thread pools.
 Circuit Breakers
Circuit breakers protect Elasticsearch against excessive memory usage or disk space
consumption. You can configure circuit breaker settings to control how Elasticsearch
handles resource limitations and prevent out-of-memory or disk space errors.
For more information, see Circuit breaker settings.
 Cache Size
Elasticsearch uses various caches, such as the field data and query cache, to improve
search performance. You can adjust the cache size settings to optimize memory usage
based on your query patterns and data size.
 Field Data Cache

The Field Data Cache in Elasticsearch is a memory-based cache that stores the field
values of indexed documents in a compressed and optimized format. It speeds up query
execution by pre-loading frequently accessed field values into memory. By caching field
data, Elasticsearch avoids loading data from a disk for each query, improving
performance. It is a crucial optimization feature that can improve search performance,
especially for aggregations, sorting, and scripting operations.
For more information, see Field data cache settings.
 Field Data Loading Circuit Breaker

The field data loading circuit breaker protects against excessive memory usage by field
data caches. You can configure the indices.breaker.fielddata.limit setting to control the
memory allocated for field data caches and prevent out-of-memory errors.
For more information, see Field data circuit breaker.
 Query Caching
Elasticsearch supports query caching, which can improve query performance by caching
the results of frequently executed queries. To optimize performance, you can enable
query caching and adjust cache settings, such as the cache size and expiration time.
For more information, see Node query cache settings.
 Hardware Resources
Elasticsearch's performance is heavily influenced by the hardware resources available.
Consider tuning the hardware configuration, such as CPU memory, disk I/O, and network
settings, to match your workload requirements and optimize performance.
 File Descriptors and Process Limits
Elasticsearch is a resource-intensive application requiring sufficient file descriptors and
process limits to function correctly. You can increase the maximum number of open file
descriptors and adjust process limits to accommodate the needs of your Elasticsearch
cluster.
 Indexing Buffer Settings
Elasticsearch uses memory buffers to stage data before it is written to disk. You can tune
the indexing buffer sizes to optimize indexing performance.
The indices.memory.index_buffer_size setting controls the size of the buffer for each
shard.
For more information, see Indexing Buffer settings.
For more information, see Tune for indexing speed.
 Query-Time Filters
Elasticsearch provides query-time filters that allow you to apply filters to a query. Using
filters can improve query performance by reducing the amount of data that needs to be
processed.
 Refresh and Flush Intervals
Elasticsearch periodically refreshes its index to make new data searchable. You can
adjust the refresh interval to balance indexing performance and search latency.
Similarly, the flush interval controls how often Elasticsearch writes data from memory to
disk. Adjusting these intervals can impact indexing throughput and resource usage. An
Elasticsearch flush performs a Lucene commit and starts a new translog generation.
Flushes are performed automatically in the background to ensure the translog does not
grow too large, which would make replaying its operations take considerable time during
recovery.
For more information, see Index Modules.
For more information, see Translog.
 Translog Durability
The translog is a transaction log that ensures data durability in case of node failures.
Adjust the translog durability settings to balance data safety and indexing performance.
When a document is indexed or updated in Elasticsearch, it is first written to the translog

before being written to the index. This allows Elasticsearch to recover the changes in
case of node failures or restarts. The translog acts as a buffer, storing the changes
temporarily until they are flushed to disk and become part of the index.
1. Request Durability
With request durability, every indexing or update request is synced to disk before a
response is sent back to the client. This ensures that the changes are immediately
durable but can impact performance due to the disk synchronization overhead.
2. Translog Durability Settings
Elasticsearch provides configuration settings to control the durability of the translog.
These settings include:
a. translog.sync_interval
Specifies the time interval between sync operations. Changes in the translog are
periodically synced to disk based on this interval.
b. translog.durability
Controls the translog's durability level. It can be set to request, async,
or request_sync to balance performance and durability.
By default, Elasticsearch uses the async durability mode, where the changes are
periodically synced to disk but not necessarily after each request. This provides a good
balance between durability and performance.
 Aggregations
Elasticsearch provides powerful aggregation capabilities, but complex aggregations can
impact performance. You can tune aggregation settings, such
as search.max_buckets and indices.breaker.total.limit, to control the memory usage
and limit the number of buckets aggregations produce.
For more information, see Aggregations.
 Shard Size
Each shard in Elasticsearch comes with some overhead, so having a large number of
small shards can impact performance. It is essential to balance the number of shards and
the size of each shard based on your data volume and hardware resources.
 Shard Allocation
Shards are the basic units of data distribution in Elasticsearch. By default, Elasticsearch
tries to distribute shards evenly across nodes. However, you can control shard allocation
settings to ensure balanced resource usage and optimize cluster performance.
 Shard Routing
Elasticsearch distributes shards across nodes based on a hashing algorithm. You can
influence shard routing by customizing the shard allocation process using shard
allocation filters and allocation awareness settings. This can help balance data
distribution and improve cluster performance.
 Network Settings
Adjusting network settings, such as TCP configurations, can impact the performance and
responsiveness of Elasticsearch. To ensure efficient network communication, you can
optimize settings like TCP keep-alive, socket buffers, and connection timeouts.
 Data Serialization and Compression
Elasticsearch allows configuring data serialization and compression options, such as
using a more efficient binary format (like SMILE or CBOR) or enabling compression for
network communication. These settings can improve storage efficiency and reduce
network overhead.
See Save space and money with improved storage efficiency in Elasticsearch 7.10 for
more information.
For more information, see Index Modules.
All these tuning knobs provide flexibility for optimizing Elasticsearch based on your specific
workload, hardware resources, and performance requirements. It is essential to carefully monitor
the impact of any changes and conduct performance testing to ensure optimal results.
Additionally, always refer to the official Elasticsearch documentation and consider the
recommendations provided by Elastic for tuning and optimization.
Thread pools
A node uses several thread pools to manage memory consumption. Queues associated with many of
the thread pools enable pending requests to be held instead of discarded.
There are several thread pools, but the important ones include:
generic
For generic operations (for example, background node discovery). Thread pool type is scaling.
search
For count/search/suggest operations. Thread pool type is fixed with a size of int((# of allocated
processors * 3) / 2) + 1, and queue_size of 1000.
search_throttled
For count/search/suggest/get operations on search_throttled indices. Thread pool type is fixed with
a size of 1, and queue_size of 100.
search_coordination
For lightweight search-related coordination operations. Thread pool type is fixed with a size of a max
of min(5, (# of allocated processors) / 2), and queue_size of 1000.
get
For get operations. Thread pool type is fixed with a size of int((# of allocated processors * 3) / 2) + 1,
and queue_size of 1000.
analyze
For analyze requests. Thread pool type is fixed with a size of 1, queue size of 16.
write
For single-document index/delete/update and bulk requests. Thread pool type is fixed with a size
of # of allocated processors, queue_size of 10000. The maximum size for this pool is 1 + # of
allocated processors.
snapshot
For snapshot/restore operations. Thread pool type is scaling with a keep-alive of 5m. On nodes with
at least 750MB of heap the maximum size of this pool is 10 by default. On nodes with less than
750MB of heap the maximum size of this pool is min(5, (# of allocated processors) / 2) by default.
snapshot_meta
For snapshot repository metadata read operations. Thread pool type is scaling with a keep-alive
of 5m and a max of min(50, (# of allocated processors* 3)).
warmer
For segment warm-up operations. Thread pool type is scaling with a keep-alive of 5m and a max
of min(5, (# of allocated processors) / 2).
refresh
For refresh operations. Thread pool type is scaling with a keep-alive of 5m and a max of min(10, (# of
allocated processors) / 2).
fetch_shard_started
For listing shard states. Thread pool type is scaling with keep-alive of 5m and a default maximum size
of 2 * # of allocated processors.
fetch_shard_store
For listing shard stores. Thread pool type is scaling with keep-alive of 5m and a default maximum size
of 2 * # of allocated processors.
flush
For flush and translog fsync operations. Thread pool type is scaling with a keep-alive of 5m and a
default maximum size of min(5, (# of allocated processors) / 2).
force_merge
For force merge operations. Thread pool type is fixed with a size of max(1, (# of allocated processors)
/ 8) and an unbounded queue size.
management
For cluster management. Thread pool type is scaling with a keep-alive of 5m and a default maximum
size of 5.
system_read
For read operations on system indices. Thread pool type is fixed with a default maximum size
system_write
For write operations on system indices. Thread pool type is fixed with a default maximum size
system_critical_read
For critical read operations on system indices. Thread pool type is fixed with a default maximum size
system_critical_write
For critical write operations on system indices. Thread pool type is fixed with a default maximum size
watcher
For watch executions. Thread pool type is fixed with a default maximum size of min(5 * (# of
allocated processors), 50) and queue_size of 1000.
Thread pool settings are static and can be changed by editing elasticsearch.yml. Changing a specific
thread pool can be done by setting its type-specific parameters; for example, changing the number of
threads in the write thread pool:
thread_pool:
write:
size: 30
Thread pool types
The following are the types of thread pools and their respective parameters:
fixed
The fixed thread pool holds a fixed size of threads to handle the requests with a queue (optionally
bounded) for pending requests that have no threads to service them.
The size parameter controls the number of threads.
The queue_size allows to control the size of the queue of pending requests that have no threads to
execute them. By default, it is set to -1 which means its unbounded. When a request comes in and
the queue is full, it will abort the request.
thread_pool:
write:
size: 30
queue_size: 1000
scaling
The scaling thread pool holds a dynamic number of threads. This number is proportional to the
workload and varies between the value of the core and max parameters.
The keep_alive parameter determines how long a thread should be kept around in the thread pool
without it doing any work.
thread_pool:
warmer:
core: 1
max: 8
keep_alive: 2m
Allocated processors setting
The number of processors is automatically detected, and the thread pool settings are automatically
set based on it. In some cases it can be useful to override the number of detected processors. This
can be done by explicitly setting the node.processors setting. This setting is bounded by the number
of available processors and accepts floating point numbers, which can be useful in environments
where the Elasticsearch nodes are configured to run with CPU limits, such as cpu shares or quota
under Cgroups.
node.processors: 2
There are a few use-cases for explicitly overriding the node.processors setting:
1. If you are running multiple instances of Elasticsearch on the same host but want
Elasticsearch to size its thread pools as if it only has a fraction of the CPU, you should
override the node.processors setting to the desired fraction, for example, if you’re running
two instances of Elasticsearch on a 16-core machine, set node.processors to 8. Note that this
is an expert-level use case and there’s a lot more involved than just setting
the node.processors setting as there are other considerations like changing the number of
garbage collector threads, pinning processes to cores, and so on.
2. Sometimes the number of processors is wrongly detected and in such cases explicitly setting
the node.processors setting will workaround such issues.
In order to check the number of processors detected, use the nodes info API with the os flag.
Circuit breaker settings
Elasticsearch contains multiple circuit breakers used to prevent operations from causing an
OutOfMemoryError. Each breaker specifies a limit for how much memory it can use. Additionally,
there is a parent-level breaker that specifies the total amount of memory that can be used across all
breakers.
Except where noted otherwise, these settings can be dynamically updated on a live cluster with
the cluster-update-settings API.
For information about circuit breaker errors, see Circuit breaker errors.
Parent circuit breaker
The parent-level breaker can be configured with the following settings:
indices.breaker.total.use_real_memory
(Static) Determines whether the parent breaker should take real memory usage into account (true)
or only consider the amount that is reserved by child circuit breakers (false). Defaults to true.
indices.breaker.total.limit
(Dynamic) Starting limit for overall parent breaker. Defaults to 70% of JVM heap
if indices.breaker.total.use_real_memory is false. If indices.breaker.total.use_real_memory is true,
defaults to 95% of the JVM heap.
Field data circuit breaker
The field data circuit breaker estimates the heap memory required to load a field into the field data
cache. If loading the field would cause the cache to exceed a predefined memory limit, the circuit
breaker stops the operation and returns an error.
indices.breaker.fielddata.limit
(Dynamic) Limit for fielddata breaker. Defaults to 40% of JVM heap.
indices.breaker.fielddata.overhead
(Dynamic) A constant that all field data estimations are multiplied with to determine a final
estimation. Defaults to 1.03.
Request circuit breaker
The request circuit breaker allows Elasticsearch to prevent per-request data structures (for example,
memory used for calculating aggregations during a request) from exceeding a certain amount of
memory.
indices.breaker.request.limit
(Dynamic) Limit for request breaker, defaults to 60% of JVM heap.
indices.breaker.request.overhead
(Dynamic) A constant that all request estimations are multiplied with to determine a final estimation.
Defaults to 1.
In flight requests circuit breaker
The in flight requests circuit breaker allows Elasticsearch to limit the memory usage of all currently
active incoming requests on transport or HTTP level from exceeding a certain amount of memory on
a node. The memory usage is based on the content length of the request itself. This circuit breaker
also considers that memory is not only needed for representing the raw request but also as a
structured object which is reflected by default overhead.
network.breaker.inflight_requests.limit
(Dynamic) Limit for in flight requests breaker, defaults to 100% of JVM heap. This means that it is
bound by the limit configured for the parent circuit breaker.
network.breaker.inflight_requests.overhead
(Dynamic) A constant that all in flight requests estimations are multiplied with to determine a final
estimation. Defaults to 2.
Accounting requests circuit breaker
The accounting circuit breaker allows Elasticsearch to limit the memory usage of things held in
memory that are not released when a request is completed. This includes things like the Lucene
segment memory.
indices.breaker.accounting.limit
(Dynamic) Limit for accounting breaker, defaults to 100% of JVM heap. This means that it is bound by
the limit configured for the parent circuit breaker.
indices.breaker.accounting.overhead
(Dynamic) A constant that all accounting estimations are multiplied with to determine a final
estimation. Defaults to 1
Script compilation circuit breaker
Slightly different than the previous memory-based circuit breaker, the script compilation circuit
breaker limits the number of inline script compilations within a period of time.
See the "prefer-parameters" section of the scripting documentation for more information.
script.max_compilations_rate
(Dynamic) Limit for the number of unique dynamic scripts within a certain interval that are allowed
to be compiled. Defaults to 150/5m, meaning 150 every 5 minutes.
If the cluster regularly hits the given max_compilation_rate, it’s possible the script cache is
undersized, use Nodes Stats to inspect the number of recent cache
evictions, script.cache_evictions_history and compilations script.compilations_history. If there are a
large number of recent cache evictions or compilations, the script cache may be undersized, consider
doubling the size of the script cache via the setting script.cache.max_size.
Regex circuit breaker
Poorly written regular expressions can degrade cluster stability and performance. The regex circuit
breaker limits the use and complexity of regex in Painless scripts.
script.painless.regex.enabled
(Static) Enables regex in Painless scripts. Accepts:
limited (Default)
Enables regex but limits complexity using the script.painless.regex.limit-factor cluster setting.
true
Enables regex with no complexity limits. Disables the regex circuit breaker.
false
Disables regex. Any Painless script containing a regular expression returns an error.
script.painless.regex.limit-factor
(Static) Limits the number of characters a regular expression in a Painless script can consider.
Elasticsearch calculates this limit by multiplying the setting value by the script input’s character
length.
For example, the input foobarbaz has a character length of 9. If script.painless.regex.limit-factor is 6,

a regular expression on foobarbaz can consider up to 54 (9 * 6) characters. If the expression exceeds
this limit, it triggers the regex circuit breaker and returns an error.
Elasticsearch only applies this limit if script.painless.regex.enabled is limited.
EQL circuit breaker
When a sequence query is executed, the node handling the query needs to keep some structures in
memory, which are needed by the algorithm implementing the sequence matching. When large
amounts of data need to be processed, and/or a large amount of matched sequences is requested by
the user (by setting the size query param), the memory occupied by those structures could
potentially exceed the available memory of the JVM. This would cause an OutOfMemory exception
which would bring down the node.
To prevent this from happening, a special circuit breaker is used, which limits the memory allocation
during the execution of a sequence query. When the breaker is triggered,
an org.elasticsearch.common.breaker.CircuitBreakingException is thrown and a descriptive error
message is returned to the user.
This circuit breaker can be configured using the following settings:
breaker.eql_sequence.limit
(Dynamic) The limit for circuit breaker used to restrict the memory utilisation during the execution of
an EQL sequence query. This value is defined as a percentage of the JVM heap. Defaults to 50%. If
the parent circuit breaker is set to a value less than 50%, this setting uses that value as its default
instead.
breaker.eql_sequence.overhead
(Dynamic) A constant that sequence query memory estimates are multiplied by to determine a final
estimate. Defaults to 1.
breaker.eql_sequence.type
(Static) Circuit breaker type. Valid values are:
memory (Default)
The breaker limits memory usage for EQL sequence queries.
noop
Disables the breaker.
Machine learning circuit breaker
breaker.model_inference.limit
(Dynamic) The limit for the trained model circuit breaker. This value is defined as a percentage of the
JVM heap. Defaults to 50%. If the parent circuit breaker is set to a value less than 50%, this setting
uses that value as its default instead.
breaker.model_inference.overhead
(Dynamic) A constant that all trained model estimations are multiplied by to determine a final
estimation. See Circuit breaker settings. Defaults to 1.
breaker.model_inference.type
(Static) The underlying type of the circuit breaker. There are two valid
options: noop and memory. noop means the circuit breaker does nothing to prevent too much
memory usage. memory means the circuit breaker tracks the memory used by trained models and
can potentially break and prevent OutOfMemory errors. The default value is memory.
Field data cache settings

The field data cache contains field data and global ordinals, which are both used to support
aggregations on certain field types. Since these are on-heap data structures, it is important to
monitor the cache’s use.
Cache size
The entries in the cache are expensive to build, so the default behavior is to keep the cache
loaded in memory. The default cache size is unlimited, causing the cache to grow until it
reaches the limit set by the field data circuit breaker. This behavior can be configured.
If the cache size limit is set, the cache will begin clearing the least-recently-updated entries in
the cache. This setting can automatically avoid the circuit breaker limit, at the cost of
rebuilding the cache as needed.
If the circuit breaker limit is reached, further requests that increase the cache size will be
prevented. In this case you should manually clear the cache.
indices.fielddata.cache.size
(Static) The max size of the field data cache, eg 38% of node heap space, or an absolute value,
eg 12GB. Defaults to unbounded. If you choose to set it, it should be smaller than Field data
circuit breaker limit.
Monitoring field data

You can monitor memory usage for field data as well as the field data circuit breaker
using the nodes stats API or the cat fielddata API.
Circuit breaker settingsedit
Elasticsearch contains multiple circuit breakers used to prevent operations from causing an
OutOfMemoryError. Each breaker specifies a limit for how much memory it can use. Additionally,
there is a parent-level breaker that specifies the total amount of memory that can be used across all
breakers.
Except where noted otherwise, these settings can be dynamically updated on a live cluster with
the cluster-update-settings API.
For information about circuit breaker errors, see Circuit breaker errors.
Parent circuit breakeredit
The parent-level breaker can be configured with the following settings:
indices.breaker.total.use_real_memory
(Static) Determines whether the parent breaker should take real memory usage into account (true)
or only consider the amount that is reserved by child circuit breakers (false). Defaults to true.
indices.breaker.total.limit
(Dynamic) Starting limit for overall parent breaker. Defaults to 70% of JVM heap
if indices.breaker.total.use_real_memory is false. If indices.breaker.total.use_real_memory is true,
defaults to 95% of the JVM heap.
Field data circuit breakeredit
The field data circuit breaker estimates the heap memory required to load a field into the field data
cache. If loading the field would cause the cache to exceed a predefined memory limit, the circuit
breaker stops the operation and returns an error.
indices.breaker.fielddata.limit
(Dynamic) Limit for fielddata breaker. Defaults to 40% of JVM heap.
indices.breaker.fielddata.overhead
(Dynamic) A constant that all field data estimations are multiplied with to determine a final
estimation. Defaults to 1.03.
Request circuit breakeredit

The request circuit breaker allows Elasticsearch to prevent per-request data structures (for example,
memory used for calculating aggregations during a request) from exceeding a certain amount of
memory.
indices.breaker.request.limit
(Dynamic) Limit for request breaker, defaults to 60% of JVM heap.
indices.breaker.request.overhead
(Dynamic) A constant that all request estimations are multiplied with to determine a final estimation.
Defaults to 1.
In flight requests circuit breakeredit
The in flight requests circuit breaker allows Elasticsearch to limit the memory usage of all currently
active incoming requests on transport or HTTP level from exceeding a certain amount of memory on
a node. The memory usage is based on the content length of the request itself. This circuit breaker
also considers that memory is not only needed for representing the raw request but also as a
structured object which is reflected by default overhead.
network.breaker.inflight_requests.limit
(Dynamic) Limit for in flight requests breaker, defaults to 100% of JVM heap. This means that it is
bound by the limit configured for the parent circuit breaker.
network.breaker.inflight_requests.overhead
(Dynamic) A constant that all in flight requests estimations are multiplied with to determine a final
estimation. Defaults to 2.
Accounting requests circuit breakeredit
The accounting circuit breaker allows Elasticsearch to limit the memory usage of things held in
memory that are not released when a request is completed. This includes things like the Lucene
segment memory.
indices.breaker.accounting.limit
(Dynamic) Limit for accounting breaker, defaults to 100% of JVM heap. This means that it is bound by
the limit configured for the parent circuit breaker.
indices.breaker.accounting.overhead
(Dynamic) A constant that all accounting estimations are multiplied with to determine a final
estimation. Defaults to 1
Script compilation circuit breakeredit
Slightly different than the previous memory-based circuit breaker, the script compilation circuit
breaker limits the number of inline script compilations within a period of time.
See the "prefer-parameters" section of the scripting documentation for more information.
script.max_compilations_rate
(Dynamic) Limit for the number of unique dynamic scripts within a certain interval that are allowed
to be compiled. Defaults to 150/5m, meaning 150 every 5 minutes.
If the cluster regularly hits the given max_compilation_rate, it’s possible the script cache is
undersized, use Nodes Stats to inspect the number of recent cache
evictions, script.cache_evictions_history and compilations script.compilations_history. If there are a
large number of recent cache evictions or compilations, the script cache may be undersized, consider
doubling the size of the script cache via the setting script.cache.max_size.
Regex circuit breakeredit
Poorly written regular expressions can degrade cluster stability and performance. The regex circuit
breaker limits the use and complexity of regex in Painless scripts.
script.painless.regex.enabled
(Static) Enables regex in Painless scripts. Accepts:
limited (Default)
Enables regex but limits complexity using the script.painless.regex.limit-factor cluster setting.
true
Enables regex with no complexity limits. Disables the regex circuit breaker.
false
Disables regex. Any Painless script containing a regular expression returns an error.
script.painless.regex.limit-factor
(Static) Limits the number of characters a regular expression in a Painless script can consider.
Elasticsearch calculates this limit by multiplying the setting value by the script input’s character
length.
For example, the input foobarbaz has a character length of 9. If script.painless.regex.limit-factor is 6,

a regular expression on foobarbaz can consider up to 54 (9 * 6) characters. If the expression exceeds
this limit, it triggers the regex circuit breaker and returns an error.
Elasticsearch only applies this limit if script.painless.regex.enabled is limited.
EQL circuit breakeredit
When a sequence query is executed, the node handling the query needs to keep some structures in
memory, which are needed by the algorithm implementing the sequence matching. When large
amounts of data need to be processed, and/or a large amount of matched sequences is requested by
the user (by setting the size query param), the memory occupied by those structures could
potentially exceed the available memory of the JVM. This would cause an OutOfMemory exception
which would bring down the node.
To prevent this from happening, a special circuit breaker is used, which limits the memory allocation
during the execution of a sequence query. When the breaker is triggered,
an org.elasticsearch.common.breaker.CircuitBreakingException is thrown and a descriptive error
message is returned to the user.
This circuit breaker can be configured using the following settings:
breaker.eql_sequence.limit
(Dynamic) The limit for circuit breaker used to restrict the memory utilisation during the execution of
an EQL sequence query. This value is defined as a percentage of the JVM heap. Defaults to 50%. If
the parent circuit breaker is set to a value less than 50%, this setting uses that value as its default
instead.
breaker.eql_sequence.overhead
(Dynamic) A constant that sequence query memory estimates are multiplied by to determine a final
estimate. Defaults to 1.
breaker.eql_sequence.type
(Static) Circuit breaker type. Valid values are:
memory (Default)
The breaker limits memory usage for EQL sequence queries.
noop
Disables the breaker.
Machine learning circuit breakeredit
breaker.model_inference.limit
(Dynamic) The limit for the trained model circuit breaker. This value is defined as a percentage of the
JVM heap. Defaults to 50%. If the parent circuit breaker is set to a value less than 50%, this setting
uses that value as its default instead.
breaker.model_inference.overhead
(Dynamic) A constant that all trained model estimations are multiplied by to determine a final
estimation. See Circuit breaker settings. Defaults to 1.
breaker.model_inference.type
(Static) The underlying type of the circuit breaker. There are two valid
options: noop and memory. noop means the circuit breaker does nothing to prevent too much
memory usage. memory means the circuit breaker tracks the memory used by trained models and
can potentially break and prevent OutOfMemory errors. The default value is memory.
Node query cache settingsedit

The results of queries used in the filter context are cached in the node query cache for fast
lookup. There is one query cache per node that is shared by all shards. The cache uses an
LRU eviction policy: when the cache is full, the least recently used query results are evicted
to make way for new data. You cannot inspect the contents of the query cache.
Term queries and queries used outside of a filter context are not eligible for caching.
By default, the cache holds a maximum of 10000 queries in up to 10% of the total heap
space. To determine if a query is eligible for caching, Elasticsearch maintains a query history
to track occurrences.
Caching is done on a per segment basis if a segment contains at least 10000 documents and
the segment has at least 3% of the total documents of a shard. Because caching is per
segment, merging segments can invalidate cached queries.
The following setting is static and must be configured on every data node in the cluster:
indices.queries.cache.size
(Static) Controls the memory size for the filter cache. Accepts either a percentage value,
like 5%, or an exact value, like 512mb. Defaults to 10%.
Query cache index settingsedit

The following setting is an index setting that can be configured on a per-index basis. Can
only be set at index creation time or on a closed index:
index.queries.cache.enabled
(Static) Controls whether to enable query caching. Accepts true (default) or false.
Indexing buffer settingsedit

The indexing buffer is used to store newly indexed documents. When it fills up, the
documents in the buffer are written to a segment on disk. It is divided between all shards on
the node.
The following settings are static and must be configured on every data node in the cluster:
indices.memory.index_buffer_size
(Static) Accepts either a percentage or a byte size value. It defaults to 10%, meaning
that 10% of the total heap allocated to a node will be used as the indexing buffer size
shared across all shards.
indices.memory.min_index_buffer_size
(Static) If the index_buffer_size is specified as a percentage, then this setting can
be used to specify an absolute minimum. Defaults to 48mb.
indices.memory.max_index_buffer_size
(Static) If the index_buffer_size is specified as a percentage, then this setting can
be used to specify an absolute maximum. Defaults to unbounded.
Tune for indexing speededit

Use bulk requestsedit
Bulk requests will yield much better performance than single-document index requests. In
order to know the optimal size of a bulk request, you should run a benchmark on a single
node with a single shard. First try to index 100 documents at once, then 200, then 400, etc.
doubling the number of documents in a bulk request in every benchmark run. When the
indexing speed starts to plateau then you know you reached the optimal size of a bulk request
for your data. In case of tie, it is better to err in the direction of too few rather than too many
documents. Beware that too large bulk requests might put the cluster under memory pressure
when many of them are sent concurrently, so it is advisable to avoid going beyond a couple
tens of megabytes per request even if larger requests seem to perform better.
Use multiple workers/threads to send data to Elasticsearchedit

A single thread sending bulk requests is unlikely to be able to max out the indexing capacity
of an Elasticsearch cluster. In order to use all resources of the cluster, you should send data
from multiple threads or processes. In addition to making better use of the resources of the
cluster, this should help reduce the cost of each fsync.
Make sure to watch for TOO_MANY_REQUESTS (429) response codes

(EsRejectedExecutionException with the Java client), which is the way that Elasticsearch
tells you that it cannot keep up with the current indexing rate. When it happens, you should
pause indexing a bit before trying again, ideally with randomized exponential backoff.
Similarly to sizing bulk requests, only testing can tell what the optimal number of workers is.
This can be tested by progressively increasing the number of workers until either I/O or CPU
is saturated on the cluster.
Unset or increase the refresh intervaledit

The operation that consists of making changes visible to search - called a refresh - is costly,
and calling it often while there is ongoing indexing activity can hurt indexing speed.
By default, Elasticsearch periodically refreshes indices every second, but only on indices that
have received one search request or more in the last 30 seconds.
This is the optimal configuration if you have no or very little search traffic (e.g. less than one
search request every 5 minutes) and want to optimize for indexing speed. This behavior aims
to automatically optimize bulk indexing in the default case when no searches are performed.
In order to opt out of this behavior set the refresh interval explicitly.
On the other hand, if your index experiences regular search requests, this default behavior
means that Elasticsearch will refresh your index every 1 second. If you can afford to increase
the amount of time between when a document gets indexed and when it becomes visible,
increasing the index.refresh_interval to a larger value, e.g. 30s, might help improve
indexing speed.
Disable replicas for initial loadsedit

If you have a large amount of data that you want to load all at once into Elasticsearch, it may
be beneficial to set index.number_of_replicas to 0 in order to speed up indexing. Having
no replicas means that losing a single node may incur data loss, so it is important that the data
lives elsewhere so that this initial load can be retried in case of an issue. Once the initial load
is finished, you can set index.number_of_replicas back to its original value.
If index.refresh_interval is configured in the index settings, it may further help to unset

it during this initial load and setting it back to its original value once the initial load is
finished.
Disable swappingedit
You should make sure that the operating system is not swapping out the java process
by disabling swapping.
Give memory to the filesystem cacheedit

The filesystem cache will be used in order to buffer I/O operations. You should make sure to
give at least half the memory of the machine running Elasticsearch to the filesystem cache.
Use auto-generated idsedit

When indexing a document that has an explicit id, Elasticsearch needs to check whether a
document with the same id already exists within the same shard, which is a costly operation
and gets even more costly as the index grows. By using auto-generated ids, Elasticsearch can
skip this check, which makes indexing faster.
Use faster hardwareedit

If indexing is I/O-bound, consider increasing the size of the filesystem cache (see above) or
using faster storage. Elasticsearch generally creates individual files with sequential writes.
However, indexing involves writing multiple files concurrently, and a mix of random and
sequential reads too, so SSD drives tend to perform better than spinning disks.
Stripe your index across multiple SSDs by configuring a RAID 0 array. Remember that it will
increase the risk of failure since the failure of any one SSD destroys the index. However this
is typically the right tradeoff to make: optimize single shards for maximum performance, and
then add replicas across different nodes so there’s redundancy for any node failures. You can
also use snapshot and restore to backup the index for further insurance.
Directly-attached (local) storage generally performs better than remote storage because it is
simpler to configure well and avoids communications overheads. With careful tuning it is
sometimes possible to achieve acceptable performance using remote storage too. Benchmark
your system with a realistic workload to determine the effects of any tuning parameters. If
you cannot achieve the performance you expect, work with the vendor of your storage system
to identify the problem.
Indexing buffer sizeedit

If your node is doing only heavy indexing, be sure indices.memory.index_buffer_size is
large enough to give at most 512 MB indexing buffer per shard doing heavy indexing
(beyond that indexing performance does not typically improve). Elasticsearch takes that
setting (a percentage of the java heap or an absolute byte-size), and uses it as a shared buffer
across all active shards. Very active shards will naturally use this buffer more than shards that
are performing lightweight indexing.
The default is 10% which is often plenty: for example, if you give the JVM 10GB of memory,
it will give 1GB to the index buffer, which is enough to host two shards that are heavily
indexing.
Use cross-cluster replication to prevent searching from stealing resources

from indexingedit
Within a single cluster, indexing and searching can compete for resources. By setting up two
clusters, configuring cross-cluster replication to replicate data from one cluster to the other
one, and routing all searches to the cluster that has the follower indices, search activity will
no longer steal resources from indexing on the cluster that hosts the leader indices.
Avoid hot spottingedit

Hot Spotting can occur when node resources, shards, or requests are not evenly distributed.
Elasticsearch maintains cluster state by syncing it across nodes, so continually hot spotted
nodes can cause overall cluster performance degredation.
Additional optimizationsedit
Many of the strategies outlined in Tune for disk usage also provide an improvement in the
speed of indexing.
Index modulesedit
Index Modules are modules created per index and control all aspects related to an index.
Index Settingsedit
Index level settings can be set per-index. Settings may be:
static
They can only be set at index creation time or on a closed index, or by using the update-index-
settings API with the reopen query parameter set to true (which automatically closes and reopens
impacted indices).
dynamic
They can be changed on a live index using the update-index-settings API.
Changing static or dynamic index settings on a closed index could result in incorrect settings that are
impossible to rectify without deleting and recreating the index.
Static index settingsedit
Below is a list of all static index settings that are not associated with any specific index module:
index.number_of_shards
The number of primary shards that an index should have. Defaults to 1. This setting can only be set at
index creation time. It cannot be changed on a closed index.
The number of shards are limited to 1024 per index. This limitation is a safety limit to prevent
accidental creation of indices that can destabilize a cluster due to resource allocation. The limit can
be modified by specifying export ES_JAVA_OPTS="-Des.index.max_number_of_shards=128" system
property on every node that is part of the cluster.
index.number_of_routing_shards
Integer value used with index.number_of_shards to route documents to a primary shard.

See _routing field.
Elasticsearch uses this value when splitting an index. For example, a 5 shard index
with number_of_routing_shards set to 30 (5 x 2 x 3) could be split by a factor of 2 or 3. In other
words, it could be split as follows:
 5 → 10 → 30 (split by 2, then by 3)
 5 → 15 → 30 (split by 3, then by 2)
 5 → 30 (split by 6)
This setting’s default value depends on the number of primary shards in the index. The default is
designed to allow you to split by factors of 2 up to a maximum of 1024 shards.
In Elasticsearch 7.0.0 and later versions, this setting affects how documents are distributed across
shards. When reindexing an older index with custom routing, you must explicitly
set index.number_of_routing_shards to maintain the same document distribution. See the related
breaking change.
index.codec
The default value compresses stored data with LZ4 compression, but this can be set
to best_compression which uses DEFLATE for a higher compression ratio, at the expense of slower
stored fields performance. If you are updating the compression type, the new one will be applied
after segments are merged. Segment merging can be forced using force merge.
index.routing_partition_size
The number of shards a custom routing value can go to. Defaults to 1 and can only be set at index
creation time. This value must be less than the index.number_of_shards unless
the index.number_of_shards value is also 1. See Routing to an index partition for more details about
how this setting is used.
index.soft_deletes.enabled
[7.6.0] Deprecated in 7.6.0. Creating indices with soft-deletes disabled is deprecated and will be
removed in future Elasticsearch versions.Indicates whether soft deletes are enabled on the index.
Soft deletes can only be configured at index creation and only on indices created on or after
Elasticsearch 6.5.0. Defaults to true.
index.soft_deletes.retention_lease.period
The maximum period to retain a shard history retention lease before it is considered expired. Shard
history retention leases ensure that soft deletes are retained during merges on the Lucene index. If a
soft delete is merged away before it can be replicated to a follower the following process will fail due
to incomplete history on the leader. Defaults to 12h.
index.load_fixed_bitset_filters_eagerly
Indicates whether cached filters are pre-loaded for nested queries. Possible values are true (default)
and false.
index.shard.check_on_startup
Expert users only. This setting enables some very expensive processing at shard startup and is only
ever useful while diagnosing a problem in your cluster. If you do use it, you should do so only
temporarily and remove it once it is no longer needed.
Elasticsearch automatically performs integrity checks on the contents of shards at various points
during their lifecycle. For instance, it verifies the checksum of every file transferred when recovering
a replica or taking a snapshot. It also verifies the integrity of many important files when opening a
shard, which happens when starting up a node and when finishing a shard recovery or relocation.
You can therefore manually verify the integrity of a whole shard while it is running by taking a
snapshot of it into a fresh repository or by recovering it onto a fresh node.
This setting determines whether Elasticsearch performs additional integrity checks while opening a
shard. If these checks detect corruption then they will prevent the shard from being opened. It
accepts the following values:
false
Don’t perform additional checks for corruption when opening a shard. This is the default and
recommended behaviour.
checksum
Verify that the checksum of every file in the shard matches its contents. This will detect cases where
the data read from disk differ from the data that Elasticsearch originally wrote, for instance due to
undetected disk corruption or other hardware failures. These checks require reading the entire shard
from disk which takes substantial time and IO bandwidth and may affect cluster performance by
evicting important data from your filesystem cache.
true
Performs the same checks as checksum and also checks for logical inconsistencies in the shard, which
could for instance be caused by the data being corrupted while it was being written due to faulty
RAM or other hardware failures. These checks require reading the entire shard from disk which takes
substantial time and IO bandwidth, and then performing various checks on the contents of the shard
which take substantial time, CPU and memory.
Dynamic index settingsedit
Below is a list of all dynamic index settings that are not associated with any specific index module:
index.number_of_replicas
The number of replicas each primary shard has. Defaults to 1.
WARNING: Configuring it to 0 may lead to temporary availability loss
during node restarts or permanent data loss in case of data corruption.
index.auto_expand_replicas
Auto-expand the number of replicas based on the number of data nodes in the cluster. Set to a dash
delimited lower and upper bound (e.g. 0-5) or use all for the upper bound (e.g. 0-all). Defaults
to false (i.e. disabled). Note that the auto-expanded number of replicas only takes allocation
filtering rules into account, but ignores other allocation rules such as total shards per node, and this
can lead to the cluster health becoming YELLOW if the applicable rules prevent all the replicas from
being allocated.
If the upper bound is all then shard allocation

awareness and cluster.routing.allocation.same_shard.host are ignored for this index.
index.search.idle.after
How long a shard can not receive a search or get request until it’s considered search idle. (default
is 30s)
index.refresh_interval
How often to perform a refresh operation, which makes recent changes to the index visible to search.
Defaults to 1s. Can be set to -1 to disable refresh. If this setting is not explicitly set, shards that
haven’t seen search traffic for at least index.search.idle.after seconds will not receive background
refreshes until they receive a search request. Searches that hit an idle shard where a refresh is
pending will trigger a refresh as part of the search operation for that shard only. This behavior aims
to automatically optimize bulk indexing in the default case when no searches are performed. In order
to opt out of this behavior an explicit value of 1s should set as the refresh interval.
index.max_result_window
The maximum value of from + size for searches to this index. Defaults to 10000. Search requests take
heap memory and time proportional to from + size and this limits that memory. See Scroll or Search
After for a more efficient alternative to raising this.
index.max_inner_result_window
The maximum value of from + size for inner hits definition and top hits aggregations to this index.
Defaults to 100. Inner hits and top hits aggregation take heap memory and time proportional to from
+ size and this limits that memory.
index.max_rescore_window
The maximum value of window_size for rescore requests in searches of this index. Defaults
to index.max_result_window which defaults to 10000. Search requests take heap memory and time
proportional to max(window_size, from + size) and this limits that memory.
index.max_docvalue_fields_search
The maximum number of docvalue_fields that are allowed in a query. Defaults to 100. Doc-value
fields are costly since they might incur a per-field per-document seek.
index.max_script_fields
The maximum number of script_fields that are allowed in a query. Defaults to 32.
index.max_ngram_diff
The maximum allowed difference between min_gram and max_gram for NGramTokenizer and
NGramTokenFilter. Defaults to 1.
index.max_shingle_diff
The maximum allowed difference between max_shingle_size and min_shingle_size for

the shingle token filter. Defaults to 3.
index.max_refresh_listeners
Maximum number of refresh listeners available on each shard of the index. These listeners are used
to implement refresh=wait_for.
index.analyze.max_token_count
The maximum number of tokens that can be produced using _analyze API. Defaults to 10000.
index.highlight.max_analyzed_offset
The maximum number of characters that will be analyzed for a highlight request. This setting is only
applicable when highlighting is requested on a text that was indexed without offsets or term vectors.
Defaults to 1000000.
index.max_terms_count
The maximum number of terms that can be used in Terms Query. Defaults to 65536.
index.max_regex_length
The maximum length of regex that can be used in Regexp Query. Defaults to 1000.
index.query.default_field
(string or array of strings) Wildcard (*) patterns matching one or more fields. The following query
types search these matching fields by default:
 More like this
 Multi-match
 Query string
 Simple query string
Defaults to *, which matches all fields eligible for term-level queries, excluding metadata fields.
index.routing.allocation.enable
Controls shard allocation for this index. It can be set to:
 all (default) - Allows shard allocation for all shards.
 primaries - Allows shard allocation only for primary shards.
 new_primaries - Allows shard allocation only for newly-created primary shards.
 none - No shard allocation is allowed.
index.routing.rebalance.enable
Enables shard rebalancing for this index. It can be set to:
 all (default) - Allows shard rebalancing for all shards.
 primaries - Allows shard rebalancing only for primary shards.
 replicas - Allows shard rebalancing only for replica shards.
 none - No shard rebalancing is allowed.
index.gc_deletes
The length of time that a deleted document’s version number remains available for further versioned
operations. Defaults to 60s.
index.default_pipeline
Default ingest pipeline for the index. Index requests will fail if the default pipeline is set and the
pipeline does not exist. The default may be overridden using the pipeline parameter. The special
pipeline name _none indicates no default ingest pipeline will run.
index.final_pipeline
Final ingest pipeline for the index. Indexing requests will fail if the final pipeline is set and the
pipeline does not exist. The final pipeline always runs after the request pipeline (if specified) and the
default pipeline (if it exists). The special pipeline name _none indicates no final ingest pipeline will
run.
You can’t use a final pipeline to change the _index field. If the pipeline attempts to change
the _index field, the indexing request will fail.
index.hidden
Indicates whether the index should be hidden by default. Hidden indices are not returned by default
when using a wildcard expression. This behavior is controlled per request through the use of
the expand_wildcards parameter. Possible values are true and false (default).
Settings in other index modulesedit
Other index settings are available in index modules:
Analysis
Settings to define analyzers, tokenizers, token filters and character filters.
Index shard allocation
Control over where, when, and how shards are allocated to nodes.
Mapping
Enable or disable dynamic mapping for an index.
Merging
Control over how shards are merged by the background merge process.
Similarities
Configure custom similarity settings to customize how search results are scored.
Slowlog
Control over how slow queries and fetch requests are logged.
Store
Configure the type of filesystem used to access shard data.
Translog
Control over the transaction log and background flush operations.
History retention
Control over the retention of a history of operations in the index.
Indexing pressure
Configure indexing back pressure limits.
X-Pack index settingsedit
Index lifecycle management

Specify the lifecycle policy and rollover alias for an index.
Translogedit
Changes to Lucene are only persisted to disk during a Lucene commit, which is a relatively
expensive operation and so cannot be performed after every index or delete operation.
Changes that happen after one commit and before another will be removed from the index by
Lucene in the event of process exit or hardware failure.
Lucene commits are too expensive to perform on every individual change, so each shard copy
also writes operations into its transaction log known as the translog. All index and delete
operations are written to the translog after being processed by the internal Lucene index but
before they are acknowledged. In the event of a crash, recent operations that have been
acknowledged but not yet included in the last Lucene commit are instead recovered from the
translog when the shard recovers.
An Elasticsearch flush is the process of performing a Lucene commit and starting a new
translog generation. Flushes are performed automatically in the background in order to make
sure the translog does not grow too large, which would make replaying its operations take a
considerable amount of time during recovery. The ability to perform a flush manually is also
exposed through an API, although this is rarely needed.
Translog settingsedit
The data in the translog is only persisted to disk when the translog is fsynced and committed.
In the event of a hardware failure or an operating system crash or a JVM crash or a shard
failure, any data written since the previous translog commit will be lost.
By default, index.translog.durability is set to request meaning that Elasticsearch will

only report success of an index, delete, update, or bulk request to the client after the translog
has been successfully fsynced and committed on the primary and on every allocated replica.
If index.translog.durability is set to async then Elasticsearch fsyncs and commits the
translog only every index.translog.sync_interval which means that any operations that
were performed just before a crash may be lost when the node recovers.
The following dynamically updatable per-index settings control the behaviour of the translog:
index.translog.sync_interval
How often the translog is fsynced to disk and committed, regardless of write operations.
Defaults to 5s. Values less than 100ms are not allowed.
index.translog.durability
Whether or not to fsync and commit the translog after every index, delete, update, or
bulk request. This setting accepts the following parameters:
request
(default) fsync and commit after every request. In the event of hardware failure, all
acknowledged writes will already have been committed to disk.
async
fsync and commit in the background every sync_interval. In the event of a failure, all
acknowledged writes since the last automatic commit will be discarded.
index.translog.flush_threshold_size
The translog stores all operations that are not yet safely persisted in Lucene (i.e., are not part
of a Lucene commit point). Although these operations are available for reads, they will need
to be replayed if the shard was stopped and had to be recovered. This setting controls the
maximum total size of these operations, to prevent recoveries from taking too long. Once the
maximum size has been reached a flush will happen, generating a new Lucene commit point.
Defaults to 512mb.
Tuning the Ingest service

NiFi can generate large amounts of data during the Ingest process. Use the formulas provided in
the following topics to tune your Ingest processes so that NiFi provides data at the same rate as
Elasticsearch is able to consume it.
Why tune NiFi
NiFi can ingest information faster than Elasticsearch can index it. It is therefore possible for NiFi
to overwhelm Elasticsearch with data, which can result in performance degradation. If do not
have separate Auth and Live environments, your production search experience may be affected.
The solution is NiFi tuning. If the rate at which NiFi ingests data is less than or equal to the rate
at which Elasticsearch consumes it, then there is no performance degradation. This is a
straightforward solution in theory, however, each HCL Commerce Search environment is
different. Therefore, tuning has to be done on an individual basis, specifically for each
implementation. To ensure that you are able to do this, HCL Commerce provides the following
guidelines, methods and parameter settings.
General approach
Tuning NiFi to match your Elasticsearch throughput involves adding additional configuration
options to NiFi tuning parameters. These configuration points are provided in the following
documents. In addition to the tuning parameters themselves, a method for calculating
appropriate values for these tuning parameters is provided. This aids you in knowing what to
tune and how to tune it.
It is useful to break the tuning process down into two clear steps:
1. Tuning:
o By adding additional upgrade-friendly configuration for NiFi tuning parameters,
o By publicly documenting these configuration points,
o By privately documenting a method for calculating sane values for these tuning
parameters.
2. Automation:
o By adding an endpoint to Ingest service so that it can analyze the historical ingest
data, calculate the new tuning values

o By adding an endpoint to Ingest service so that it can assign the new tuning
values to the appropriate ingest pipelines
The automation phase includes adding an endpoint to the Ingest service. This endpoint will
analyze historical ingest data and calculate new tuning values. Another endpoint will assign
these new tuning values to the relevant ingest pipelines. This automation will streamline the
process and ensure accurate tuning based on actual data.
Summary of the tuning process
The Ingest dataflow consists of multiple business processing stages, linked together one after
another. Each stage is a stream of data moving from one location (database) to another
(Elasticsearch). Each dataflow involves three main ETL operations: Extracting, Transforming,and
Loading.
 Each operation can be controlled by the following tuning parameters:
o Extracting uses page size and bucket size to determine the size of each
payload in the data stream.

o Transforming can be spread across multiple concurrent threads.
o Loading rate is determined by the size of the request and the concurrent number
of threads sending to Elasticsearch.
The tuning goal of Ingest dataflow is to obtain environment-specific tuning settings optimized for
each stage with the least overall Ingest elapsed time. It is recommended that you attempt to
satisfy certain assumptions when performing tuning to obtain more reliable results: use the
heaviest ingest run, such as the re-index connector, for tuning estimation, and only allow one
exclusive re-indexing operation to run in NiFi at any time.
 When and how to tune your Ingest pipelines
Tuning Ingest dataflow is necessary to avoid overloading or idling the system. An
overloaded or underused system can experience performance issues.
 Metrics for tuning NiFi
Factors to consider when calculating tuning parameters for each stage and the metrics to
collect from each stage. The importance of monitoring resource utilization is emphasized.
 NiFi parameter tuning
The text provides formulas and suggested values for tuning parameters related to data
extraction, transformation, and bulk service in a given pipeline. These parameters are
calculated based on various factors and can help optimize the performance of the
system. Sample tests are provided to illustrate the application of these tuning
parameters.
 Recommended Parameters for NIFI and Elasticsearch
You can run your Elasticsearch and NiFi environments using the default settings, which
provides a minimal resource set. For best performance, tune your configuration or use
the recommended parameters for CPU, memory and system resources.
Related tasks
 Custom NiFi processors
Related reference
 Ingest Store index pipeline
 Ingest Store index schema
 Ingest Catalog index pipeline
 Ingest Catalog index schema
 Ingest Product index pipeline
 Ingest Product index schema
 Ingest Category index pipeline
 Ingest Category index schema
 Ingest Attribute index pipeline
 Ingest Attribute index schema
 Ingest Inventory index pipeline
 Ingest Inventory index schema
 Ingest Price index pipeline
 Ingest Price index schema
 Ingest URL index pipeline
 Ingest URL index schema
 Ingest Synonym index pipeline
 Ingest Stopword index pipeline
 Index field type aliases and usage
 Search issues
 Logging and troubleshooting the Ingest and Query services
 Elasticsearch index schema changes
Optimizing index build and overall

flow
Optmizations are provided full index builds and for tuning parameters. Potential improvements
that can be implimented for Near Real Time (NRT) index updates are not described.
The index building process

A full index build consists of three major steps: retrieving data, processing data, and uploading
data. There are several predefined connectors, which consist of several process groups for
different purposes. Usually, each process group will contain process subgroups to handle the
data retrieval, processing, and uploading stages that are associated with the build process.
 Retrieving data group: Fetch data from database or Elasticsearch.
 Processing data group: Based on the fetched data, build, update, and/or copy the index
documents.
 Uploading data group: Upload the index document to Elasticsearch. From the HCL
Commerce 9.1.3.0 release, each index has one associated group.
Tuning
Generally, default settings will work for this process, but it may take a long time to finish the
indexing process for a large data set. Depending on the data set size and hardware configuration
(memory, CPU, disk, and network), improvements to the subgroups and process groups can be
made so process flows are faster and more efficient within subgroups, and between connected
subgroups.
Retrieving data group

In this group, there are three different sources that the data can be retrieved from.
Data can be:
 Fetched from a database with SQL scroll;

 Fetched from a database with SQL without scroll;
 Fetched from Elasticsearch with scroll
Fetch data from a database with SQL scroll
Enter the SCROLL SQL group, then right click on the base canvas and select variables.
 The scroll.page.size is the number of rows fetched from database by the SQL.
 The scroll.bucket.size is the number of rows from the fetched data in each
bucket for processing. The bucket.size will determine the size of the flow files
(and the number of documents contained within each file).
Change the values of scroll.bucket.size and scroll.page.size based on the following

considerations:
 • Depending on the catalog size, the SQL can take a long time to get response
data back to the NiFi. The purpose of the scroll SQL is to limit the data size that
can be processed in NiFi at once, to avoid memory errors on large catalogs.
 • The scroll settings are optimal when the time that it takes to process the data is
evenly matched with the amount of time that the next SQL scroll takes to receive
its data. With this optimization, unnecessary processing or I/O delay is minimized.
 • The output from one subgroup is handed in turn to the next connected process
subgroup. This process must be audited, to ensure that there are no bottlenecks
which can impact the efficiency of the overall process.
Fetch data from database with SQL without scroll
In the process group, the data set is fetched from the database by using a single SQL
stream. For example, the Find Associations at DatabaseProductStage1b.
Enter the Processor group that we are optimizing, right click on the base canvas, and
select variables.
Set the scroll.bucket.size parameter to the number that you want.
The scroll.bucket.size is the number of rows from the fetched data that is placed in
each bucket for processing. The bucket.size will determine the size of the flow files (and
the number of documents contained within each file).
Fetch data from Elasticsearch
Since the index build is a staged process, some information may be added to existed
index documents. In the case, NiFi needs to fetch data from Elasticsearch. Let us use the
URL Stage 2 as the example.
Enter the SCROLL Elasticsearch group, right click on the base canvas, and
select variables.
Change scroll.bucket.size and scroll.page.size to values that you want, based on the
following considerations:
The scroll.page.size is the number of documents that are fetched from Elasticsearch. If
the number is too small, NiFi must make more connections to Elasticsearch.
The scroll.bucket.size is the number of documents from the fetched data in each bucket
for processing. The bucket.size will determine the size of the flow files (and the number
of documents contained within each file).
Another parameter that is useful for tuning is scroll.duration. This value defines the
amount of time that Elasticsearch will store the query result set in memory. This
parameter is useful when dealing with many stores and languages running in parallel,
where a running out of scroll error can be encountered. This error indicates that you are
running out of scroll space, and reducing the scroll duration will force Elasticsearch to
free older or obsolete buffers faster. Inversely, increasing the scroll duration in
Elasticsearch for that index will provide extra time to complete processing operations.
Processing data group
In the following example, the process group DatabaseProductStage 1a is used.
Enter the Create Product Document from Database, right click Create Product
Document from Database and select configure. Under the SCHEDULING tab, update
the Concurrent Tasks value to set the number of threads that will be used for the
process group. When increasing the number of concurrent tasks, the memory usage for
the process group is also increased accordingly. Therefore, setting this value to a
number that is greater than the number of CPUs that are allocated to the pod, or beyond
the amount of memory that is allocated to the pod may not make sense, and can have a
negative impact on performance.
Uploading documents group

Enter the Bulk Elasticsearch group. There are are several processes displayed. Right
click on each process, and select configure. Under the SCHEDULING tab, update
the Concurrent Tasks values to one that makes sense for your environment.
The Post Bulk Elasticsearch processor sends the created index documents to
Elasticsearch. By default, Elasticsearch will use the same number of CPUs as the
number of the connections. Considering the possible delay or pools, the number that is
set for the Post Bulk Elasticsearch processor may be larger than number of CPUs that
are allocated to the Elasticesearch pod.
It is a good rule of thumb to consider all processes in a group together. In general, it is

best to try and make later processes complete faster than earlier ones. Otherwise,
queued objects will consume extra memory and slow down overall processing efficiency.
Overall considerations for tuning

The general objective for tuning is to process and store the most
data for a given time period, given a configured system’s
resources, and their utilization. In general, we can tune and
improve the following areas:
Concurrent tasks/threads
Increasing the number of threads that are processed can help to improve a processing
group throughput, however this needs to be assessed carefully.
 Do not multithread processors that use scrolling to getting data from database or
Elasticsearch. Since scrolling approach is used to batch the data in sizes that are
best fit, multithreading would have negative impact of the overal system
processing efficiency.
 Consider increases to bucket sizes for multithreaded processors that send bulk
updates to Elasticsearch, or perform single reads from Elasticsearch.
 More concurrent tasks/threads naturally consume more memory and vCPU
resources.
 If the cost of memory garbage collection is high, you may need to reduce the
concurrent number of threads, or add additional memory resources.
 • Monitor all servers (Database, NiFi, and Elasticsearch) during the ingest
process to find the bottleneck in the pipeline.
Bucket size/Flow file size

Increasing the bucket size increases the flow file size, i.e., the number of documents that
are processed as a group in NiFi. The bigger the flow file, the better the efficiency.
However, limited operating system resources will limit the maximum size of the flow file.
 Flow file size tuning is very visible and impactful on the system.
 Large flow files have several negative side effects:
o They demand a large memory heap for NiFi.
o They require a matched funnel on Elasticsearch, to accept the data as it
comes over.
 NiFi GC overhead may become prohibitively high, or NiFi can run out of heap
space with an Out of Memory error.
Back pressure in links between

processes, process groups, and process
subgroups
In the pipeline, there are many links between steps. Each link has its own queue for the
resulting objects (queued items) for processing in the next step.
The link is a location which can be used to identify the bottleneck of a process or process
group. If a link has large number of queued items, the next process will be the bottleneck.
This is especially true for longer duration builds.
For example, the following link has 451 queued items for the process Analyze
Successful SQL Response.
Back pressure is a configuration threshold that controls the overall data streaming speed.
This threshold indicates how much data should be allowed to exist in the queue before
the component (Processor or Processor Group) that is producing the data in the queue is
no longer scheduled to run. This is designed to avoid the system from being overrun with
data in motion.
NiFi provides two configuration elements to control back pressure:
 Back Pressure Object Threshold - This is the number of objects that can be in
the queue before back pressure control is applied.
 Back Pressure Size Threshold - This specifies the size of the objects that can
be in the queue before back pressure control is applied.
If you usually work with documents, use the back pressure object threshold to control the
back pressure. To configure it, right click the link and select View Configuration.
The Back Pressure Object Threshold and Back Pressure Size threshold can be set
from their default values.
Monitoring and understanding

Elasticsearch and NiFi metrics
You can use Grafana and related tools to analyze the performance of the Ingest pipeline, and
Kibana to do the same with Elasticsearch.
Due to high resource consumption Monitoring should always start with operating system
resources, and their utilization. Identify if there is a resource that is saturated, such as CPU
(processor utilization), IO (network, disk, or memory), Memory, and so on, on the system level.
This is the first step in the tuning exercise – to ensure that we are not running the solution with
system resources that are improperly configured, consumed, or bottlenecked. The easiest way to
monitor this is with Grafana, and Kibana (for Elasticsearch specific metrics), or any other system
level monitor (for example, nmon). If a system resource is saturated, adjustment in the
environment is required before attempting further tuning. For example, there is no point to tune
processor threads/concurrency if there is not enough CPU resource available in the system.
Special attention should be paid to the NiFi and Elasticsearch heap. If the heap size is
inadequate for the workload, it will need adjustment. The heap utilization should be monitored
after each tuning change. This is especially crucial when increasing the concurrency of
processors, or changes to bucket.size/flowfile size. These heap values may be required to be
adjusted for each change to these key performance variables.
The easiest way to observe the overall progress of the index building is via the Grafana NiFi
Performance graph. We can observe the overall execution speed, identify major processor
group speed, and view the amount of data that is generated and pushed to Elasticsearch.
Grafana
You can use Grafana to analyze the performance of the Ingest pipeline. The two most useful
graphs are Queued Items and Wait Link. To set up these and other dashboards, refer
to Extensible metrics for monitoring and alerts.
In the NiFi ingest connectors, WaitLink process groups are added between process groups to
ensure that the previous stage is completed before the next stage is started. This way,
subsequent stages will not use data that is currently being worked on in an unfinished process. In
addition, this reduces the occurrence of different processes running at the same time, which can
cause extreme spikes in resource requests for CPU, network, memory or disk IO.
The time that is spent on WaitLink can be used to estimate the full time that is used for a stage,
and identify stages with the highest time and/or resource usage within the build. Since not all of
the process groups have WaitLink, the Queued Items graph provides more details for the time
taken for processing within each process group.
The useful charts to look at within Queued Items are the Bulk Service - <XXXX> charts. These
process groups send the processed data (index documents) to Elasticsearch from NiFi. The most
important one is Bulk Service – Product. Since the curve starts from the beginning to the end of
the ingest pipeline, we can use the timestamp in Wait Link to get the related stages.
For example, the following two graphs show that the biggest number of queued items is at
the Product Stage 1e. This observation means the retrieving data group and processing data
group can handle the task quickly, and send lots of data to the Bulk service group for
transferring.
In this example, the duration with 100 queued items is short and therefore is not a problem. If a
process group takes a longer time, with a larger number of queued items, it would be a possible
bottleneck in the pipeline.
We can also use the Grafana to monitor other metrics.
Kibana
Kibana can be used to monitor the resource consumption of Elasticsearch. For more information
about Kibana, refer to the Kibana documentation.
This graph displays Kibana monitoring Elasticsearch operations. For the index building process,
the key metrics are the CPU utilization, JVM heap, and IO operations rate. The IO operation rate
is the most critical metric, in the sense that if the IO rate is fully utilized, it is not possible to push
faster overall throughput. If the speed is not acceptable, the best course of action is to investigate
alternative solutions with higher throughput.
NiFi counters and reporting

When running ingest pipeline, we can use the NiFi counters or Grafana to check the pipeline
report.
Due to high resource consumption, the NiFi counters collection for HCL Commerce activities are
disabled by default.
You can enable it by adding the following lines within nifi-app.yaml (/commerce-
helmchart-master/hcl-commerce-helmchart/stable/hcl-commerce/templates/
nifi-app.yaml) before installing NiFi:
name: "FEATURE_NIFI_COUNTER"
value: "true"
After enabling it, you can view the report while the test is running, or after the Ingest process is
completed. One disadvantage is that you can only see one report for each connector. If you are
using the same connector to run another Ingest pipeline, the report that was generated for the
previous run will be removed at the beginning of the new Ingest process (this process can take a
couple of minutes).
After an Ingest pipeline is finished, the Ingest report, Ingest Metrics, will be sent to the
index run within Elasticsearch. You can configure Grafana to display the report in the format you
defined. The reports for the different Ingest pipelines and different connectors are all stored. You
can select connector and runID to view the report.
The data for Ingest Metrics at Grafana is different of the Queued Items/Wait Link. The metrics
will only be sent, by NiFi, to Elasticsearch after the Ingest process is finished. But Queued
Items/Wait Link are using Prometheus to collect information at runtime.
For tuning purposes, you may not want to finish an Ingest pipeline before running it again, or the
process can fail at any point in the Ingest process. In these cases, NiFi counters may be easier
to collect reports for some of the stages in an Ingest pipeline.
Tunable Parameters in the setup of

NiFi and Elasticsearch
How you can modify the values for the tunable parameters, and some default values and how
they can be improved in different circumstances.
System resources, memory footprint, and CPU allocation

The environment (recommended or minimal) will constitute of several nodes that will host the
NiFi and Elasticsearch pods. In general, you will have after the initial installation, three pods for
Elasticsearch and one NiFi.
In the minimal configuration, each of the pods is allocated six vCPUs and 10GB of memory.
In the recommended configuration, each of the pods is allocated sixteen vCPUs and a minimum
of sixteen GB of memory.
Adjustment or increase of the allocated resources is possible. In these cases, however,

additional testing to validate such configuration changes is nessessary, to ensure stability and
operability.
Disk space per node

The amount of disk space you require is strongly dependent on the size of your index. Since this
varies widely from installation to installation, there is no single recommended figure. To estimate,
consider that an index of one million catalog items running on the recommended system
configuration generates an index about six gigabytes in size.
This number does not reflect your actual requirements, however. Index files are dated and with
time will accumulate on the disk. As a general rule, provide at least ten times the disk space of
the anticipated index size to reflect this variability, and clean up old index files on a daily basis. In
the case of the six gigabyte index, this would mean allocating at least sixty gigabytes and running
a regular job to delete stale files.
NiFi
The processing speed of the data set, and the resulting speed of the search index creation is the
result of the throughput one can achieve in the NiFi and Elasticsearch cluster. There are several
parameters that would improve and optimize the throughput for a given hardware footprint; NiFi
processors threads and bucket size.
Threads
The default processor runs on a single thread, processing just one flow file at a time. If you use
concurrent processing, the number of concurrent tasks that it can do can be adjusted. Set the
number of threads for the process group by changing the processor Concurrent Tasks value
(under Processor configuration or SCHEDULING tab).
When the processor can process flow files at the same rate as they come, the Concurrent
Tasks value is ideal, preventing large pileups of flow files in the processor's wait queue.
If a CPU can multitask, increasing the threads available to the processor increases the processor
throughput. The transformation processor (as in NLP) and the Bulk update processor are two
such examples. You can assign more threads to the processor, or set the Concurrent
Tasks variable to be higher than one. Increasing the threads of the processor will result in
improvement of the processor throughput. The following screenshot represents an NLP
Processor which is set to sixteen concurrent tasks, equal to the number of virtual CPUs (vCPUs)
that are available on the node.
This update is
not useful for all processors. Most processors come with an default configuration that takes this
variable into account and does not need to be altered. When performance testing reveals a
bottleneck in front of the processors, the default configuration may benefit from further tuning.
Because such balancing may not always be feasible, the recommended best practice is to focus
on reducing the flow file pileup in the waiting queue.
Additional threads increase the processor bandwidth by the factor of the number of threads, if the
processor is doing computational processing (that is, you can experience linear scaling). In the
case of the I/O operations, the processor will experience some improvement that would start to
diminish after certain level of threads is set (such as nonlinear scalability that would end in
saturation).
In the case of the NLP Processor, the processor is purely computational, and the thread limit is
considered to be the same number of vCPUs that are available for the NiFi Pod.
Bucket size
Bucket size (scroll.bucket.size) is another parameter you can change to improve a processor
bandwidth. It changes the size of the flowfile that is processed. By increasing the bucket size you
are increasing the size of data that will be processed by a single processor as a group.
Bucket size changes are a bit more difficult to implement. The location of the variable is on the
second level parameters of the processor group.
For example, for DataBaseProductStage1a:
The top level appears like this:

Drill down one level, right-click on Create Product Document and select variables.
The Bucket Size is optimal when you see that the flow file is easily processed through the system
(including Elasticsearch upload). Increasing the size (number of documents in the flow file)
increases the throughput. However, when increased to very high size, the throughput tapers off
and gradually decreases. Simultaneously, the resources requirements of the system (NiFi and
Elasticsearch) increase, leading to lower throughput.
The Variables window opens. Here you can change the bucket size value.
Scroll page size (scroll.page.size)

scroll.page.size
The scroll.page.size parameter contains the number of rows fetched from database by
the SQL. SQL runs several times (once with each page turn) if the retrieved row count is
smaller than the total row count in the table.
When the time it takes to process the data is evenly matched with the time it takes for the
next SQL scroll to receive its data, the scroll settings are optimal. This improvement
eliminates needless processing and I/O time. Additional factors to consider include the
memory space available in NiFi to hold the result set while parsing and splitting this into
flowfiles, and the total number of flow files that would be created in NiFi at once. If the
scroll page size is larger, you should expect an impact on NiFi operations. When you
reach this limit, you can increase the resources allocated to NiFi, or limit the scroll page
size to reduce the impact to performance.
When retrieving data from Elasticsearch,

the scroll.page.size and scroll.bucket.size variables are also utilised. The SCROLL
Elasticsearch processor, for example, defines the following variables:
 The scroll.page.size is the number of documents that are fetched from

Elasticsearch. If the number is too small, NiFi must make more connections to
Elasticsearch.
 The scroll.bucket.size is the number of documents included from the fetched data
that are processed in each bucket. The bucket.size determines the size of the
flow files (and the number of documents that are contained within each file).
LISTAGG() and Serialize

In the ingest pipeline, most SQLs are using the aggregate to combine multiple rows
together. Due to a limitation of the database (especially with Oracle), and the size of the
data, LISTAGG can exceed the limits in place during the live.reindex and auth.reindex
index builds. To resolve the issue, you can disable the LISTAGG.
You can set LISTAGG locally or globally. To set it locally, change the
attribute flow.database.listagg.
You control attributes using UpdateAttribute processors, which update the attributes of
the flow files. For example, if you want to
set flow.database.listagg="false" for AttributeStage1b in auth.reindex, set it in
the Properties as follows: NiFi Flow > auth.reindex - Attribute Stage 1b (Find Attribute
Values) > Find Attribute Values > SCROLL SQL > Define scroll offset and page size.
Note: If you experience issues during a particular stage of ingest pipeline processing
where string aggregation is exceeding the 32k LISTAGG function limit, you will need to
disable LISTAGG for that particular processing stage. For instance, to disable LISTAGG
for Attribute Stage 1b (Find Attribute Values) in versions prior to 9.1.11:
1. Go to NiFi Flow > auth.reindex - Attribute Stage 1b (Find Attribute

Values) > Find Attribute Values > SCROLL SQL > Define scroll offset and
page size.
2. Double-click on the Define scroll offset and page size processor.
3. Click STOP & CONFIGURE in the upper right-hand corner of the processor to stop
the processor.
4. Click the Properties tab of the processor and then click the + icon to add
the flow.database.listagg property and set its value to false.
5. Restart the processor.
For versions 9.1.11 and up, use the following endpoint:

https://data-query/search/resources/api/v2/configuration?
nodeName=ingest&envType=auth
Where data-query is the URL for the Query server. Use the GET method to retrieve Ingest
configuration information. Use the PATCH method to set it false, with the following body to
the query:
{
"global": {
"connector": [
{
"name": "attribute",
"property": [
{
"name": "flow.database.listagg",
"value": "false"
}
]
}
]
}
}
You can make make global changes using Ingest profiles. For more information
and an example of how to change LISTAGG using an Ingest profile, see Ingest
configuration via REST.
When the list aggregate is disabled, the SQL proccess will return more rows and NiFi will
use the Serialize process to handle the returned data. In the case, the duration for
processing the data will be much longer. To account for this,
the page.size and bucket.size for the SQL process, and thread number for
the Serialize process must be increased.
Elasticsearch
The following section will discuss a few improvements that can be made to the
Elasticsearch configuration to increase the overall throughput and speed of index building.
Indices Refresh Rate

By default, Elasticsearch periodically refreshes indices every second, but only on indices
that have received one search request or more in the last 30 seconds. The HCL
Commerce index schema is setting this interval to 10 sec by default.
However, if it is viable, disable this behavior by setting its value to -1. If disabling this is
not viable, a longer time interval, such as 60 seconds will also make an impact. By
increasing the refresh rate to longer periods, the updated documents from the memory
buffer are less frequently written into the indices (and eventually written to disk). This
improves the processing speed on Elasticsearch, as fewer refresh events will result in
more resource bandwidth to receive data. In addition, bulk updates to the file system are
always more desirable. On the other side of the equation, the longer refresh intervals will
cause the memory buffer to inflate while trying to accommodate for all of the incoming
data.
For more information on the index refesh interval setting, see the Elasticsearch
documentation.
To set a custom value:

NiFi Flow > live reindex- StoreSchema > Setup Elasticsearch Index Schema :
Populate Store Index Schema
Ensure that you stop the processor. Select and edit the json object, replacing
the refresh_interval value with the value that you want.
Indexing Buffer Sizes
Increasing indexing buffer sizes will help to speed up the indexing operation and will
improve the overall index building speed.
The value is set in the indices.memory.index_buffer_size variable and is most commonly

set to 10% of the heap size by default.
Set this value higher, to 20% of the heap size.
Elasticsearch funnel configuration

The Elasticsearch cluster has complicated funnel requirements. In general, for ordinary
use the Elasticsearch cluster will determine the initial sizes for the threads on each server
according to the number of available CPUs for that pod. In simple terms, if the pod is
configured to allow up to six CPUs, the worker threads will be constrained at the same
number as well.
However, this may not be sufficient when dealing with large catalogs and large
bucketsizes. In the case of the index build, a bulk update request is issued from NiFi to
Elasticsearch. The Elasticsearch master node receives the request body, which is
comprised of multiple documents, and for each document, it determines the shard it
should be stored in. A connection is opened to the appropriate node/shard so that the
document can be processed.
Thus, the bulk update ends up using multiple connections, and it is quite possible to run
out of threads as well as connections. If Elasticsearch runs out of connections, a
response code, 429, is returned. This will interrupt the index build process, and the index
build fails.
To accommodate the needs for more connections and threads, the Elasticsearch server
can be configured to start with more threads and a deeper connection queue on each
node. The following file describes the key Elasticsearch configurations (contents are set
within the es-config.yaml configuration file):
replicas: 3
minimumMasterNodes: 2
ingress:
enabled: true
path: /
hosts:
- es.andon.svt.hcl.com
tls: []
volumeClaimTemplate:
accessModes: [ "ReadWriteOnce" ]
storageClassName: local-storage-es
resources:
requests:
storage: 15Gi
esJavaOpts: "-Xmx12g -Xms12g"
resources:
requests:
cpu: 2
memory: "14Gi"
limits:
cpu: 14
memory: "14Gi"
esConfig:
elasticsearch.yml: |
indices.fielddata.cache.size: "20%"
indices.queries.cache.size: "30%"
indices.memory.index_buffer_size: "20%"
node.processors: 12
thread_pool:
search:
size: 100
queue_size: 10000
min_queue_size: 100
max_queue_size: 10000
auto_queue_frame_size: 10000
target_response_time: 30s
thread_pool:
write:
size: 100
queue_size: 10000
The file defines two threadpools, write and search.
For each threadpool we can define the following parameters:
 size – the number of worker threads per node.

 queue_size – the number of connections that can be received and tracked in the
connection queue
 min_queue_size – the minimum size of the connection queue.
 max_queue_size – the maximum queue size, beyond which Elasticsearch will
send the 429 response code.
The actual values must be specific to the environment and its configuration.
Increasing the worker threads to 100, and the connections pool to 10000 will suffice for
catalogs of 1M items on an Elasticsearch cluster of 3 nodes and 3 shards with the default
configurations in place.
To apply the changes, the Elasticsearch cluster should be reinstalled using the new
configuration file.
The following steps describe the process:
1. Delete the existing Elasticsearch cluster.
helm delete -n elastic elasticsearch
2. Reinstall the Elasticsearch cluster using the modified configuration file.
helm install elasticsearch elastic/elasticsearch -f

es_es_config.yaml -n elastic
Sharding
It is useful to know the optimal number of index shards to be used as
your data grows during production. You can determine this based on
the existing size of the search index. Use the following three rules to
calculate when to adjust the number of index shards.
 An index shard should not not exceed 40% of the total

available storage of its node cluster.
 An index shard size should not exceed 50 GB; generally the
index performs best when its size is less than 25 GB per shard.
 The document counts and sizes across shards should be
similar.
Elasticsearch scaling and hardware

requirements
You can achieve additional capacity by implementing or extending pod clustering on
Elasticsearch, NiFi, or both. You can also consider the hardware footprint and key resources that
impact the processing and index creation speed.
Clustering
Elasticsearch
Elasticsearch comes installed by default as a three shard cluster. Both minimal and
recommended sizing implements such clustering, and the only differences are the
resources that are allocated. The recommended sizing has more vCPUs and memory per
pod, which is sufficient to drive traffic and build the index.
If required, additional capacity can be added by scaling horizontally, by increasing the

cluster size with additional nodes/pods for Elasticsearch. It is recommended, however, to
omptimize your infrastructure performance with faster storage, faster network
interconnection, faster memory and CPU, and more memory and CPU allocation, before
scaling horizontally.
NiFi
NiFi is configured as single server, in both minimal and recommended configurations. For
typical expected workloads this is sufficient. However, if Natural Language Processing
(NLP) processing presents a bottleneck, NiFi horizontal clustering will improve NLP
throughput, with linear scalability.
Sharding
It is useful to know the optimal number of index shards to be used as your data
grows during production. You can determine this based on the existing size of the
search index. Use the following three rules to calculate when to adjust the number
of index shards.
 An index shard should not not exceed 40% of the total available storage of
its node cluster.
 An index shard size should not exceed 50 GB; generally the index performs
best when its size is less than 25 GB per shard.
 The document counts and sizes across shards should be similar.
Hardware footprint
There are several factors in considering the hardware footprint and key resources
that impact the processing and index creation speed.
Resources influencing the operations

The default installation of the Nifi an Elasticsearch e-commerce cluster assumes one NiFi
pod, coupled with 3 clustered pods with Elasticsearch. The key resources are CPU,
RAM/Heap sizes and IO subsystem.
The minimum CPU resources that are recommended for each pod are set to 6 vCPUs
per pod. This is shown to have acceptable performance when building catalogs of
medium size (assuming 300 000 items). However, each catalog is different, and catalogs
that have an excessively large attribute dictionary can require extra resources to keep up
with the increased processing demand.
In general, NiFi processing will comfortably fit into the allocated CPU resource size,
except in the case of NLP processing. Typically NLP processing is not CPU bound.
Before increasing allocated CPU resources (to boost NLP processing speed), it is
recommended to re-test the index build with the existing hardware. During a second
index build, NLP will re-use some of the computation from the initial run, reducing the
overall processing time dramatically. Use these repeated builds to derive any increased
resource requirements.
More importantly, the heap sizes need to be adjusted to fit the indexed data size and
complexity. The NiFi and Elasticsearch process is streamlined, and as long as the
configuration is kept same, the heap should be sufficient for any size of the catalog.
However, some adjustment will be required if additional optimization is attempted on

larger catalogs (bigger flow files, more threads, etc.), or if the produced data set itself
becomes larger (due to a larger number of attributes, for example).
In the provided minimal and recommended cases, there are two heap configurations,
targeting the aforementioned 300,000 and 1M catalog. The tuning parameters are
different between these configurations, since a 1M item catalog requires larger heap
sizes in both NiFi and Elasticsearch (12GB and 16GB accordingly).
The most influential resource to the overall operation of the NiFi and Elasticsearch
coupling is the Elasticsearch I/O subsystem, which is generally driving the overall speed
of the processing. If Elasticsearch stores the indexed data on the disk slowly, it will also
slow down the overall build process, including NiFi execution speed. Thus, the file/io
system on Elasticsearch must be considered early, and ideally configured on local
SSD/NVMe storage for maximum throughput and I/O rates.
Defined minimal and recommended hardware footprint

You have two primary configuration deployment settings. One is for the default
deployment configuration, also known as the minimal configuration for NiFi and
Elasticsearch, and a second, known as the recommended configuration, that has shown
good results with larger catalogs with approximately 1M items.
The following table shows resource values for minimal and recommended hardware
footprint:
Pods vCPUs/ES vCPU/NiFi ES NiFi

Pod Pod heap Heap
6 6 12 9
Minimal Configuration 3 - ES
(default)
1-
Nifi
Recommended Configuration 16 16 16 12
3 - ES
1-
Nifi
Elasticsearch with dedicated Auth and

Live nodes
For production environments, Elasticsearch is recommended in a clustered configuration for high
availability and scalability considerations.
An Elasticsearch cluster consists of nodes that have different node roles, including a master-
eligible node, data node, ingest node, remote-eligible node and machine learning node. The
master-eligible node is responsible for cluster-wide operation such as index creation and
deletion, node tracking and shard allocation. It is crucial to have a stable Master node to maintain
the cluster health.
In the default HCL Commerce Elasticsearch cluster implementation, all nodes have all the node
roles. If one node is busy on data operations and under resource constraints while it also has the
master role, it can affect the cluster health, which in turn impacts data availability. Elasticsearch
in particular can demand high CPU usage in the Auth environment when doing ingest operations,
which would adversely impact the data queries in the production environment.
With the default clustered deployment, Elasticsearch automatically distributes the primary and
replica shards to the available nodes (pods). In an HCL Commerce deployment, with authoring
and live configurations, ingest operations could create periods of heavy activity on Elasticsearch,
that stress resources such as memory and CPU. If an Elasticsearch node hosts both authoring
and live indices, ingest operations could impact query performance and affect the availability of
the live storefront.
This documentation describes a sample configuration that defines dedicated Elasticsearch node
groups for authoring and live. By separating authoring and live indices into two different node
groups, we can prevent reindexing from impacting the live site.
Elasticsearch installation
To setup Elasticsearch with dedicated node groups, each group is installed as a separate Helm
release. This enables the use of different configurations for each group. When each node group
is started, they join a single Elasticsearch cluster.
The configuration presented here is a sample. You will need to adjust the node configurations
such as memory and CPU, and the number of nodes to meet your requirements. Depending on
your load requirements, you might need additional live nodes. You can also change the pod
configurations to better match your available infrastructure.
The values files for each release are available here:
 master.yaml
 auth.yaml
 live.yaml
The installation is done using Helm:
helm repo add elastic https://helm.elastic.co
kubectl create ns elastic
helm install es-master elastic/elasticsearch -f master.yaml -n elastic
helm install es-auth elastic/elasticsearch -f auth.yaml -n elastic
helm install es-live elastic/elasticsearch -f live.yaml -n elastic
Elasticsearch pods
In Elasticsearch terminology, an Elasticsearch node refers to an instance of Elasticsearch. In

Kubernetes, Elasticsearch nodes are deployed as stateful sets and run as pods. The sample
deploys pods with the following configurations and limits:
vcp
group pods u memory heap
maste 3 2 2G 1.5G
vcp
group pods u memory heap
auth 1 10 24G 16G
live 2 6 10G 6G
Kubernetes node groups
The Elasticsearch node groups, that are installed as separate Helm releases, can be configured
with different node affinity rules. For example, Elasticsearch live nodes (pods) can be deployed
within a particular Kubernetes node pool.
This sample from Google Cloud, defines affinity rules so the pods are only deployed within nodes
that belong to the elastic-live-pool node pool:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: cloud.google.com/gke-nodepool
operator: In
values:
- elastic-live-pool
While it is possible to deploy Auth and Live to the same node pool, using separate node pools
has the avantage that each pool can be configured with different Kubernetes node
characteristics. For example, the authoring node pool can be configured with a machine with
more memory than the live pool.
The sample is deployed in Google Cloud (GKE) using the following node configurations:
node
pools s vcpu memory
elastic-auth- 1 12 32G
node
pools s vcpu memory
pool
elastic-live-pool 2 8 16G
If your cluster is not currently configured with multiple node pools, all other deployments must
have affinity rules added, as otherwise they could deploy to any pool including the Elasticsearch
pools. If a non-Elasticsearch pod (for example, Redis or Vault) deploys to the Elasticsearch node
pools, the node might not be left with enough resources to start the Elasticsearch pods.
Master node groups
Besides the node groups for authoring and live, the sample configuration defines a group
of dedicated master nodes. These nodes do not handle traffic or maintain indices. Instead, their
only responsability is to manage the cluster state. In production environments, it is recommended
to have dedicated master nodes, as data nodes might become overwhelmed and unresponsive,
and this could lead to cluster state synchronization problems.
The dedicated master nodes typically require few resources. The sample configures the limits to
2 vcpu, memory to 2G and Java Heap to 1.5G. Monitoring should be used to confirm these
resources are sufficient.
In our example, master nodes can run within either the elastic-auth-pool or the elastic-live-pool
node pools:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: cloud.google.com/gke-nodepool
operator: In
values:
- elastic-auth-pool
- elastic-live-pool
Index affinity rules
The charts use lifecycle/postStart to automate required routing and sharding

configurations. index.routing.allocation.require.env must match the node
attribute node.attr.env configured in the same chart. Authoring is configured with 1 shard and
no replicas. Live is also configured with 1 shard, but the number of replicas is configured to the
number of nodes minus one. This ensures all live nodes maintain copies of all indices. In the
example configuration, with two live nodes, one live node runs the primary shard, while the other
runs a replica shard. If your configuration runs with more live pods, number_of_replicas must be
updated.
Authoring index configuration

Index pattern configuration for authoring indexes:
"index_patterns": [".auth*", "auth*", "log", "nlp", "run","live*"],
"settings": {
"index.routing.allocation.require.env": "auth",
"number_of_replicas": 0,
"number_of_shards": 1
The inclusion of the live* index pattern (as opposed to .live* which is used in the live
group) is to include a set of indices that must be kept within the authoring group.
The live.master.* indices are copies of the production ready data which are kept within
the authoring group, and are copied into the .live.* indices during push-to-live. Price
and inventory data for the live environment is ingested into
the live.price and live.inventory indices. These indices are copied into master
indices and then propagated into the live environment.
The live.store index is used during ingest while the .live.store.yyyymmddHHmm index
is a copy kept within the live group. This allows ingest and live operations to remain
available even if the other Elasticsearch node group is unreacheable.
Live index configuration
Index pattern configuration for live indexes:
"index_patterns": [".live*"],
"settings": {
"index.routing.allocation.require.env": "live",
"number_of_replicas": 1,
"number_of_shards": 1
Configuring NiFi
Additional configurations are required for NiFi to support the dual node group setup.
The following example uses curl to apply the configurations using the shard
queryapp server:
curl -XPATCH -v -u spiuser:<password>
'localhost:30900/search/resources/api/v2/configuration?
nodeName=ingest&envType=auth' -H 'Content-Type: application/json' -H
'Accept: application/json' -d@split.json
split.json:
"global": {
"connector": [
{
"name": "attribute",
"property": [
"name": "flow.inventory.copy",
"value": "auth,live"
},
"name": "flow.price.copy",
"value": "auth"
},
"name": "alias.keep.backup",
"value": "0"
},
"name": "cluster.index.nodegroup",
"value": "dual"
Connecting to Elasticsearch
Each Elasticsearch release generates its own set of services. While all the nodes
can handle requests for any index, there is additional overhead if the node handling
the request does not locally manage the index.
kubectl get services -n elastic
elasticsearch-auth ClusterIP
10.244.8.225 <none> 9200/TCP,9300/TCP 109m
elasticsearch-auth-headless ClusterIP None
<none> 9200/TCP,9300/TCP 109m
elasticsearch-live ClusterIP
10.244.7.235 <none> 9200/TCP,9300/TCP 109m
elasticsearch-live-headless ClusterIP None
<none> 9200/TCP,9300/TCP 109m
elasticsearch-master ClusterIP 10.244.6.29
<none> 9200/TCP,9300/TCP 109m
elasticsearch-master-headless ClusterIP None
<none> 9200/TCP,9300/TCP 109m
The recommended configuration is to have the live servers use the live
service elasticsearch-live.elastic.svc.cluster.local, while the rest can
connect directly to the authoring node via the authoring service (elasticsearch-
auth.elastic.svc.cluster.local). The master nodes do not own indices and
should never be used to connect to the cluster.
To use this configuration in Vault, define

the elasticSearchHost, elasticSearchPort and elasticSearchScheme keys under
the "live" branch, referencing the live service.
svt/qa/live/elasticSearchHost value="elasticsearch-
live.elastic.svc.cluster.local"
svt/qa/live/elasticSearchPort value="9200"
svt/qa/live/elasticSearchScheme value="http"
The authoring service is referenced under the environment level. It is used when not
running in live:
svt/qa/elasticSearchHost value="elasticsearch-
auth.elastic.svc.cluster.local"
Validating the environment
Once the installation of all the helm releases is complete (es-master, es-auth and
es-live) and the statefulsets are started, all pods should form a single Elasticsearch
cluster. This can be validated by executing the /_cat/nodes API as follows:
curl "localhost:9200/_cat/nodes?v"
ip heap.percent ram.percent cpu load_1m load_5m load_15m
node.role master name
1.1.1.0 12 89 0 0.06 0.11 0.10 m
- elasticsearch-master-0
1.1.1.1 37 89 0 0.02 0.11 0.17 m
- elasticsearch-master-1
1.1.1.2 43 90 0 0.00 0.14 0.14 m
* elasticsearch-master-2
1.1.2.0 7 73 0 0.00 0.14 0.14
cdfhirstw - elasticsearch-auth-0
1.1.3.0 21 56 0 0.06 0.11 0.10
cdfhirstw - elasticsearch-live-0
1.1.3.1 13 56 0 0.66 1.13 0.71
cdfhirstw - elasticsearch-live-1
After reindexing is complete, use the _cat/indices API to verify the indices' health
is 'green.' If the primary shard cannot be allocated, the index health will be 'red'. If a
replica shard is not allocated, the index health is 'yellow'. If there are indices that are
not green, there could be a problem with the Elasticsearch or NiFi configurations.
The Elasticsearch Cluster allocation explain API can describe reasons why the
cluster is unable to allocate a shard.
curl "localhost:9200/_cat/indices?v"
health status index uuid
pri rep docs.count docs.deleted store.size pri.store.size
green open .auth.12001.attribute.202312012227 R-
Te5iRhS1CFXY0OZv0wKg 1 0 43 0 44.1kb
44.1kb
green open .auth.12001.catalog.202312012227
MAkWPxKhSfyHuxyEqFkFRA 1 0 3 0 10.5kb
10.5kb
green open .auth.12001.category.202312012227
HOoMtbyVRt6Ow8FPPUco6Q 1 0 54 0 70.7kb
70.7kb
green open .auth.12001.description.202312012227
JWGHyUevTaK8vszHvBfVUg 1 0 327 0 106.9kb
106.9kb
green open .auth.12001.page.202312012227
EOZXi76ITeiQdC3iUFXmog 1 0 10 0 13.4kb
13.4kb
...
Similarly, the /_cat/shards API shows the nodes on which each index is allocated.
This is used to verify that the affinity for Auth and Live indices is working correctly.
curl "localhost:9200/_cat/shards?v"
index shard prirep state docs
store ip node
.auth.12001.category.202312051630 0 p STARTED 928 1.3mb
1.1.1.1 elasticsearch-auth-0
.auth.12001.description.202312051630 0 p STARTED 697064 494.7mb
1.1.1.1 elasticsearch-auth-0
...
.live.12001.category.202312011848 0 r STARTED 928 1.3mb
1.1.2.0 elasticsearch-live-0
.live.12001.category.202312011848 0 p STARTED 928 1.3mb
.live.12001.description.202312011848 0 r STARTED 697064 508mb
.live.12001.description.202312011848 0 p STARTED 697064 508mb
...
 Migrating to a dual nodegroup configuration
Existing environments can be migrated to a split Auth and Live
configuration.
 Indexing data lifecycles in dual-nodegroup Elasticsearch configurations
Different processes are involved in indexing data lifecycles in dual
Elasticsearch nodegroup configurations. The affected processes include
Near-Real-Time updates, offline dataloads, Push-To-Live scenarios, and
Update-Live operations. Cache invalidation processes are also performed
with each of these updates.
Minimal and recommended

configuration tunable parameter
values
Two configurationsof the NiFi tunable parameter values are presented, a minimal and an optimal
value.
A set of optimal processing values for the minimal configuration for NiFi and Elasticsearch are
provided. These are the default values for the HCL Commerce deployment. They are sufficient
for a typical index up to 300,000 catalog items. The second configuration provides optimal
processing of the recommended configuration for NiFi and Elasticsearch. These are the provided
recommended values for the HCL Commerce deployment. They are sufficient for a typical index
up to 1M catalog items.
Heap sizes
Optimal values:
NiFi heap – 9GB
Elasticsearch heap - 12GB
Recommended values:
NiFi heap – 12GB
Processor groups threading changes

Nifi Flow>>Logging Service
Optimal values:
Nifi Flow>>Logging Service>>Generate Log Message: Configuration/Concurrent
Tasks = 4
Nifi Flow>>Logging Service>>Batch Bulk Document – Log:
Configuration/Concurrent Tasks = 4
Recommended values:
Tasks = 8
All other processors are set at Concurrent Tasks= 1
All other processors are set at Concurrent Tasks = 1
Nifi Flow>>Bulk Service – Product

Optimal values:
Nifi Flow>>Bulk Service – Product>>Bulk Elastic Search>Track Bulk Request:
Nifi Flow>>Bulk Service – Product>>Bulk Elastic Search>Post Bulk
Elasticsearch-Product: Configuration/Concurrent Tasks = 16`
Nifi Flow>>Bulk Service – Product>>Bulk Elastic Search>Set
stage.end.time : Configuration/Concurrent Tasks = 4
Nifi Flow>>Bulk Service – Product>>Bulk Elastic Search>Analyze Bulk
Response: Configuration/Concurrent Tasks = 4
Nifi Flow>>Bulk Service – Product>>Bulk Elastic Search> Post Refresh
Elasticsearch: Configuration/Concurrent Tasks = 4
Nifi Flow>>Bulk Service – Product>>Bulk Elastic Search>Retry Flow File:
Nifi Flow>>Bulk Service – Product>>Bulk Elastic Search>Route On
Environment: Configuration/Concurrent Tasks = 4
Recommended values:
Elasticsearch-Product: Configuration/Concurrent Tasks = 32
Nifi Flow>>Bulk Service – Product>>Bulk Elastic Search>Set
stage.end.time : Configuration/Concurrent Tasks = 8
Nifi Flow>>Bulk Service – Price
Optimal values:
Nifi Flow>>Bulk Service – Price>>Bulk Elastic Search> Post Bulk
Elasticsearch-Price: Configuration/Concurrent Tasks = 16
Nifi Flow>>Bulk Service – Price>>Bulk Elastic Search> Set stage.end.time:
Nifi Flow>>Bulk Service – Price>>Bulk Elastic Search> Analyze Bulk
Nifi Flow>>Bulk Service – Price>>Bulk Elastic Search>Route On Environment:
Recommended values:
Nifi Flow>>Bulk Service – Inventory
Optimal values:
Nifi Flow>>Bulk Service – Inventory >>Bulk Elastic Search>Track Bulk
Request: Configuration/Concurrent Tasks = 4
Nifi Flow>>Bulk Service – Inventory >>Bulk Elastic Search> Post Bulk
Elasticsearch-Inventory: Configuration/Concurrent Tasks = 16
Nifi Flow>>Bulk Service – Inventory >>Bulk Elastic Search> Set
stage.end.time: Configuration/Concurrent Tasks = 4
Nifi Flow>>Bulk Service – Inventory >>Bulk Elastic Search> Analyze Bulk
Nifi Flow>>Bulk Service – Inventory >>Bulk Elastic Search>Route On
Nifi Flow>>Bulk Service – Inventory >>Bulk Elastic Search>Retry Flow File:
Recommended values:
Nifi Flow>>NLP Service
Optimal values:
Nifi Flow>>NLP Service>>Parse NLP Messages>>SCROLL Elasticsearch>>Generate
Query: Configuration/Concurrent Tasks = 6
Nifi Flow>>NLP Service>>Parse NLP Messages>>SCROLL Elasticsearch>>GET
Nifi Flow>>NLP Service>>Parse NLP Messages>>SCROLL Elasticsearch>> SCROLL
Nifi Flow>>NLP Service>>Parse NLP Messages>>NLP Processor>>Transform
Document-CoreNLP Ingest Processor: Configuration/Concurrent Tasks = 16
Nifi Flow>>NLP Service>>Parse NLP Messages>>Bulk Elasticsearch>> Track
Bulk Request: Configuration/Concurrent Tasks = 4
Nifi Flow>>NLP Service>>Parse NLP Messages>>Bulk Elasticsearch>> Post Bulk
Nifi Flow>>NLP Service>>Parse NLP Messages>>Bulk Elasticsearch>> Analyze
Bulk Response: Configuration/Concurrent Tasks = 4
Nifi Flow>>NLP Service>>Parse NLP Messages>>Bulk Elasticsearch>> Retry
Flow File: Configuration/Concurrent Tasks = 4
Recommended values:
Nifi Flow>>NLP Service>>Parse NLP Messages>>Bulk Elasticsearch>> Track
Bulk Request: Configuration/Concurrent Tasks = 8
Nifi Flow>>NLP Service>>Parse NLP Messages>>Bulk Elasticsearch>> Retry
Flow File: Configuration/Concurrent Tasks = 8
Bucket size changes

1i
Optimal:
Nifi Flow> live reindex- DatabaseProductStage1i:
Variables:scroll.bucket.size=5000
Recommended:
1e
Optimal:
Nifi Flow> live reindex- DatabaseProductStage1e:
Recommended:
Minimal and recommended configuration tunable parameter values

Two configurationsof the NiFi tunable parameter values are presented, a
minimal and an optimal value.
A set of optimal processing values for the minimal configuration for NiFi
and Elasticsearch are provided. These are the default values for the HCL
Commerce deployment. They are sufficient for a typical index up to 300,000
catalog items. The second configuration provides optimal processing of the
recommended configuration for NiFi and Elasticsearch. These are the
provided recommended values for the HCL Commerce deployment. They are
sufficient for a typical index up to 1M catalog items.
Heap sizes
Optimal values:
NiFi heap – 9GB
Recommended values:
NiFi heap – 12GB
Processor groups threading changes
Nifi Flow>>Logging Service
Optimal values:
Tasks = 4
Recommended values:
Tasks = 8
All other processors are set at Concurrent Tasks= 1
All other processors are set at Concurrent Tasks = 1
Nifi Flow>>Bulk Service – Product
Optimal values:
Elasticsearch-Product: Configuration/Concurrent Tasks = 16`
Nifi Flow>>Bulk Service – Product>>Bulk Elastic Search>Set stage.end.time :
Recommended values:
Elasticsearch-Product: Configuration/Concurrent Tasks = 32
Nifi Flow>>Bulk Service – Product>>Bulk Elastic Search>Set stage.end.time :
Nifi Flow>>Bulk Service – Price
Optimal values:
Recommended values:
Nifi Flow>>Bulk Service – Inventory
Optimal values:
Recommended values:
Nifi Flow>>NLP Service
Optimal values:
Nifi Flow>>NLP Service>>Parse NLP Messages>>Bulk Elasticsearch>> Track Bulk
Nifi Flow>>NLP Service>>Parse NLP Messages>>Bulk Elasticsearch>> Retry Flow
File: Configuration/Concurrent Tasks = 4
Recommended values:
Nifi Flow>>NLP Service>>Parse NLP Messages>>Bulk Elasticsearch>> Track Bulk
Nifi Flow>>NLP Service>>Parse NLP Messages>>Bulk Elasticsearch>> Retry Flow
File: Configuration/Concurrent Tasks = 8
Bucket size changes
1i
Optimal:
Recommended:
1e
Optimal:
Recommended:
Refresh rate change
NiFi Flow> live reindex- StoreSchema >> Setup Elasticsearch Index Schema :
Populate Store Index Schema
Note: To change this value, stop the processor, click and edit the json
object, and replace the refresh_interval value.
Extensible metrics for monitoring and
alerts
You can use the HCL Commerce Version 9.1 Metrics Monitoring framework with built-in
performance dashboards, or build your own. The monitoring data are collected using Micrometer
and are provided in the industry-standard Prometheus format. This means you can use them with
many different tools. HCL provides a set of Grafana dashboards to get you started.
You can also use this Metrics Monitoring framework to visualize the cache requests sent and
received by Nifi . Using this new API, http://NIFIHOST:30690/monitor/metrics, the monitoring
data can be collected in the industry-standard Prometheus format which can be used in Grafana
or any other different tools to visualize the cache requests sent and received by Nifi.
There are three parts to the monitoring framework. First, a fully-customizable presentation layer
enables you to use your preferred tools to report and analyze your systems' performance. The
flexibility of this layer comes from its use of a vendor-neutral, industry-standard data-
representation language. This is the open-source Prometheus toolkit. Finally, Prometheus gets
its data from the fully-customizable Micrometer library, which "scrapes" the data from your
containers.
Note: For more information on using of Grafana and sample dashboards, see HCL Commerce
Monitoring - Prometheus and Grafana Integration.
Reporting and dashboarding
The top of the framework is the reporting layer. Because your data is represented in the
Prometheus format, you can use many different tools to display and analyze it. One popular
dashboarding tool is Grafana (https://grafana.com/). Grafana is often used with Prometheus to
provide graphical analysis of monitoring data.
You can download the HCL Commerce Grafana Dashboards from the HCL License and Delivery
portal. For more information on the available HCL Commerce Grafana Dashboard package,
see HCL Commerce eAssemblies.
The Prometheus toolkit

HCL Commerce metrics use the Prometheus text-based exposition format. Although this format
is native to the Prometheus monitoring framework (https://prometheus.io/), the popularity of the
format has led to widespread adoption and support. For example, Prometheus joined the Cloud
Native Computing Foundation in 2016 as the second hosted project, after Kubernetes.
Micrometer application monitoring
Monitoring and performance data is scraped using the JVM-based Micrometer instrumentation
library. The key concept for Micrometer is the meter. A rich set of predefined meter primitives
exist that define times, counters, gauges and other data collection types. You can use the default
meters to aggregate performance and monitoring data from your containers, or customize your
own.
Metrics for the performance of each container are exposed at its /monitor/metrics endpoint.
They are collected by a process known as “scraping.” Micrometer scrapes the metrics endpoint
on all containers at a configurable internal. The metrics are stored in a database where other
services can access them. In Kubernetes environments, scrapers also add contextual metadata
to the metrics obtained by endpoints, such as the service, namespace, and pod that identify the
origin of the data.
Configuring meters
Metrics are enabled by default when using the HCL Commerce Helm charts. They can also be
enabled by configuring the environment variable:
EXPOSE_METRICS=true
Metrics are exposed on each pods on the following paths and ports:
Deployment Path Metrics port (http)
demoqaauthcrs-app /monitor/metrics 8280
demoqaauthsearch-* /monitor/metrics 3280
demoqaliveingest-app /monitor/metrics 30880
demoqalivequery-app /monitor/metrics 30280
demoqaauthts-app /monitor/metrics 5280
demoqaauthxc-app /monitor/metrics 9280

Deployment Path Metrics port (http)
demoqaingest-app /monitor/metrics 30880
demoqalivequery-app /monitor/metrics 30280
In addition to enabling metrics, the Helm chart exposes the metrics port thru the services, and
offers the option to define a servicemonitor
( metrics.servicemonitor.enabled, metrics.servicemonitor.namespace) for use with
Prometheus Operator.
Implementing custom meters
In addition to the default set of meters, you can add your own. When meters are enabled,
the Metrics class makes the global registry available. Meters added to the global registry are
automatically published to the metrics endpoint.
New meters can be added to the registry by using the Micrometer APIs. See the Micrometer
Javadoc for API details: https://javadoc.io/doc/io.micrometer/micrometer-core/1.3.5/index.html.
Samples
The following examples show how metrics can be used from custom code.
Counters
A positive count that can be increased by a fixed amount. For example, “number of
requests.” Prometheus includes functions such as rate() and increase() that can be
used to protect against counter resets
See the following examples for defining a Counter type:
Adding a new Counter with known value labels
private static Counter BACKEND_COUNTER =
Metrics.isEnabled()
? Counter.builder( "backend.calls.total" )
.tags( "result", "ok" )
.description("Number of successful backend requests")

.builder.register( Metrics.getRegistry())
: null;
if ( BACKEND_COUNTER != null ) {
BACKEND_COUNTER.increment();
Adding a new Counter with labels unknown in advance
if ( Metrics.isEnabled() ) {
Metrics.getRegistry().counter(
"backend.calls.total",
"result",
myGetResult()
).increment();
Timers
Timers are used to track the duration and frequency of events. Besides calculating the
average durations, the API allows to configure a set of Service Level Objectives (SLO),
which are translated to histogram buckets. SLOs can also be used to calculate quantiles.
For more information, see Histograms and Summaries on the Prometheus website.
The Metrics class defines SLOs for common usages. For example,
Metrics.DEFAULT_SLO_REST_DURATIONS_NAME defines buckets that are
appropriate for typical REST execution times. If your timer doesn’t match these durations,
you can specify new values as a long array. For more information, see .sla() in
the Timer.Builder class definition on the Micrometer website.
Example: Adding a new Timer with known value labels

private static Timer BACKEND_TIMER =
Metrics.isEnabled()
? Timer.builder( "backend.calls.duration" )
.tags( "result", "ok" )
.sla( Metrics.getSLOsByName(Metrics.DEFAULT_SLO_REST_DURATI
ONS_NAME) )
.description("Duration of successful backend requests")
.builder.register( Metrics.getRegistry())
: null;
if (BACKEND_TIMER != null ) {
startTime = System.nanoTime();
doWork();
if (BACKEND_TIMER != null ) {
final long deltaTime = System.nanoTime() - startTime;
BACKEND_TIMER .record(deltaTime, TimeUnit.NANOSECONDS );
When using a Timer with label values that are not known in advance, the micrometer API
doesn’t allow for SLO (.sla(..)) to be specified. In order to achieve this, define a meter
filter to merge the config. The Metrics.applySLO(final String metricName, final
long[] slos) or Metrics.applySLO(final String metricName, final String
name) utility methods can be used for the same.

Example: Adding a new Timer with label values not known in advance
private static String TIMER_NAME = "backend.calls.duration";
static {
Metrics.applySLO( TIMER_NAME,
Metrics.DEFAULT_SLO_REST_DURATIONS_NAME );
startTime = System.nanoTime();
doWork();
final long deltaTime = System.nanoTime() - startTime;
Metrics.getRegistry().timer(
TIMER_NAME,
"result",
getResult() )
.record( deltaTime , TimeUnit.MILLISECONDS );
Gauges
A gauge holds a value that can increase and decrease over time. The meter is mapped
to a function to obtain the value. Examples include number of active sessions and current
cache sizes.
Example: Defining a gauge
class MyService {
private Gauge myActiveClientsGauge;
final void setup() {
if (Metrics.isEnabled()) {
myActiveClientsGauge = Gauge.builder( "myservice.activeclients",
this,
MyService::getActiveClients )
.tags("endpoint", getEndpointName())
.register(Metrics.getRegistry());
private double getActiveClients() {
return nActiveClients;
}
 Enabling and customizing the Metrics
Monitoring framework in a development
environment
In order to use and extend the HCL
Commerce Version 9.1 Metrics Monitoring
framework in your HCL Commerce
development environment, enable it in each
of the Transaction, Search, Customization
(Xc) and Store servers.
Performance Measurement tool
As a developer, you can use the Performance Measurement tool to gather performance data on
a running application to help you identify any performance bottlenecks. You can use this tool in
both your development environment to test operation performance, and in your production
environment to analyze the actual performance of an operation. When you run this tool, you can
generate different reports to help you identify operations that impact performance and determine
how to improve caching performance.
The Performance Measurement tool is a flexible measurement tool that you can use in the
following ways:
 As a serviceability tool, which you can use to evaluate what the system is doing. This
evaluation can be done while the operation runs in the production environment.
 As a general performance measurement tool, which you can use to determine where
time is spent during a request execution.
 As a cache potential measurement tool, which you can use to determine the value that
caching could bring to various operations.
When you run this tool, you can generate any of the following types of reports to help you
measure and analyze your site performance:
 Performance reports
 Stack reports
 Execution reports
 Caller reports
When this tool runs, it uses API classes to gather metric data for operations to use for generating
the preceding reports. The API classes that the tool uses to gather the metrics are built on the
Java Logging mechanism. These API classes create a metric gatherer and then create objects of
a type OperationMetric. The tool uses these objects to gather the following information about a
single operation execution:
 Start time
 End time
 Duration
 Result size
 Name
 Key-value pairs. (that are used as a unique cache key)
 Whether caching is enabled for the operation
 Whether the operation result is fetched from cache.
 Unique ID of the request
 Unique ID of the parent request
For more information about how to run the Performance Measurement tool to generate reports,
see Using the Performance Measurement tool
Security
Depending on how you set up caching for your site, some operations that you cache can use
sensitive information as key-values. For example, a servlet login form request might include a
user password as a parameter. In this case, the parameter value is typically masked to prevent
the value from being included in any generated performance logs.
When you run the Performance Measurement tool, the tool does capture parameter values for
select statement within "GET" operations. If the parameter values that contain sensitive
information are masked, the generated Performance Measurement tool reports should not
include any sensitive information.
 Using the Performance Measurement tool

The Performance Measurement can help you analyze your cache settings and improve
performance.
 Performance reports
You can use the Performance Measurement tool to generate performance reports for
operations. You can use performance reports to measure the impact that your caching
practices have on your system performance. You can then analyze these reports to help
you adjust your caching to improve performance.
 Stack reports
You can use the Performance Measurement tool to generate stack reports. Stack reports
include the measurements for the stack of operations that are used when you run a
sample operation.
 Execution reports
You can use the Performance Measurement tool to generate execution reports.
Execution reports include the measurements for a single execution stack.
 Caller reports
You can use the Performance Measurement tool to generate caller reports, which identify
the source of the calls to run an operation. With these reports, you can analyze
operations and identify the sources for numerous calls for the operation to run.
Performance reports
You can use the Performance Measurement tool to generate performance reports for operations.
You can use performance reports to measure the impact that your caching practices have on your
system performance. You can then analyze these reports to help you adjust your caching to improve
performance.
When you are generating performance reports, three files are generated:
report-operations.csv
This report provides a simplified view of caching performance. Use this report when you do not need
complex statistics about data caching. This report includes the following information:
OPERATION_NAME
The name of the executed operation.
AVERAGE_CALL_DURATION_IN_MS
The average duration of a call (in seconds).
AVERAGE_RESULT_SIZE_IN_BYTES
The average size of a result when the result would be saved in cache.
CUMULATIVE_EXECUTION_TIME_MS
The amount of time that is spent by the system when it runs all of the measured executions of the
operation.
CALL_COUNT
The number of calls that are measured for a specific operation.
report-execution.csv
This report lists the main operations that the system executes. These operations are listed from
slowest duration to fastest. Use this report to help you identify the slowest requests on your system.
You can use this report with execution reports to help identify the performance of the requests by
matching the operation name and starting timestamp between the reports. This report includes the
following information:
OPERATION_NAME
The name of the executed operation.
DURATION_MS
The duration of the operation in milliseconds.
START_TIME_MS
The start time of the operation in milliseconds as a relative timestamp to the stop time.
STOP_TIME_MS
The stop time of the operation in milliseconds as a relative timestamp to the start time.
RESULT_SIZE
The size of the operation result.
KEY_VALUE
The list of key-values that are used to invoke the operation.
IDENTIFIER
The unique identifier for the execution.
report-operation-cache.csv
Use this report to help you analyze the cache efficiency and potential for every operation. This report
includes information for all the following metrics. This report can include measurements and
information for the following metrics:
MS_SAVED_PER_BYTE
The time (in seconds) that is saved on your system for every byte of cache that you allocate to a
specific operation. This value is based on the assumption that your cache is infinite and that the
cache access is instantaneous. You can use this information to help you determine the best place to
allocate your available cache resources.
CACHE_ALLOCATION_IN_BYTES
The recommended amount of memory (in bytes) to allocate to the cache. This amount is based on
the allocatedCacheSize variable that is set in the analysis.properties file.
AVERAGE_CALL_DURATION_IN_MS
The average duration of a call (in seconds).
AVERAGE_CACHE_HIT_DURATION_IN_MS
The average duration of a call when the call results in a cache hit.
AVERAGE_CACHE_MISS_DURATION_IN_MS
The average duration of a call when the call results in a cache miss.
AVERAGE_RESULT_SIZE_IN_BYTES
The average size of a result when the result would be saved in cache.
CUMULATIVE_EXECUTION_TIME_MS
The amount of time that is spent by the system when it runs all of the measured executions of the
operation.
MAX_CACHE_ALLOCATION_SIZE_IN_BYTES
This is the maximum amount of cache (in bytes) that this operation could take if all of the unique
calls are stored in cache.
MAX_CACHE_BENEFIT_MS
The amount of time that is saved during the execution of an operation if the operation uses a perfect
cache that takes no execution time for a cache hit.
UNIQUE_CACHE_ENTRY_COUNT
The number of unique cache entries that generate if you have an infinite cache and every operation
result is cached.
MAX_THEORIC_CACHE_HIT_COUNT
The number of cache hits that generate during the cache performance measurement if you have an
infinite cache and every operation result is cached and never invalidated.
REAL_CACHE_HIT_COUNT
The number of request results that are actually fetched from the cache when your cache is enabled.
You can use this information to find the operations that are redundant.
REAL_CACHE_ENABLED_COUNT
The number of calls that are executed where caching is enabled.
CACHE_ENABLED_CALL_PERCENTAGE
The percentage of calls that occur when the cache is enabled.
MAX_THEORIC_CACHE_HIT_PERCENTAGE
The theoretical maximum number of requests that result in cache hits if you have an infinite cache
and no invalidation occurs.
REAL_CACHE_HIT_PERCENTAGE
The percentage of requests that result in actual cache hits.
CACHE_EFFECTIVENESS_VS_THEORY_PERCENTAGE
The effectiveness of your cache as a percentage of the effectiveness of the maximum value that is
predicted by the theoretical caching of the operation. You can use this information to help you find
where your caching practices are inefficient. This information can also help you to pinpoint where
your cache is too efficient and might be missing a key.
CALL_COUNT
The number of calls that are measured for a specific operation.
Analysis with IBM Cloud tools

A suite of IBM Cloud tools is available to diagnose the overall health and performance of your
site. Run diagnostics and upload various metric containing files to get guided feedback and
analysis to help you understand your site configuration and performance.
These tools generate the following reports:
 Java Health Report

 DB2 Report
 IHS Repo
HCL Commerce
Search performance tuning
Consider these search performance tuning hints and tips when you administer HCL Commerce
Search.
HCL Commerce Search performance tuning falls under the following sections:
 Indexing server
 Search runtime server
Where the main objective of tuning the indexing server is for optimal memory management,
tuning the search server run time is to obtain the best response times.
When to perform full search index builds
The HCL Commerce Search index is automatically built when certain business tasks are
performed, as outlined in Common business tasks and their impact to the HCL Commerce
Search index. In several cases, common business tasks result in delta index builds that do not
pose a significant risk to production system performance. However, doing several delta index
builds without occasional full index builds might result in the search index gradually degrading
over time due to fragmentation. To avoid this issue, doing full search index builds when possible
ensures that the search index performs well over time.
When Lucene receives a delete request, it does not delete entries from the index, but instead
marks them for deletion and adds updated records to the end of the index. This marking results
in the catalog unevenly spreading out across different segment data files in the search index, and
might result in increased search response times. If you have a dedicated indexing server,
consider scheduling a periodic full search index build. Make this build a background task that
runs once per month, so that the deleted entries are flushed out, and to optimize the data.
Indexing server
Consider the following factors when you tune the indexing server:
Index build preprocessor is now using Varchar as field type rather than Clob
The data type of several columns of the TI_ATTR table were changed from CLOB. The
six columns are now defined as varchar(32672) in Db2, and varchar2(32767) for Oracle
in the wc-dataimport-preprocess-attribute.xml configuration file. The same change was
made in the ATTRIBUTES column of TI_ADATTR. This change reduces the
preprocessing time of these two tables.
This change requires that Oracle users enable the "Extended Data Types" feature
described in https://oracle-base.com/articles/12c/extended-data-types-12cR1. If you are
migrating from a previous version, ensure that you drop all temporary tables before
proceeding.
Note: You must also execute the instruction in bold in this sample, or the Oracle
database will not come back online after a restart. You need only execute this instruction
once.
CONN / AS SYSDBA
SHUTDOWN IMMEDIATE;
STARTUP UPGRADE;
ALTER SYSTEM SET max_string_size=extended;
@?/rdbms/admin/utl32k.sql
SHUTDOWN IMMEDIATE;
STARTUP;
An x-data-config.xml configuration file is available that consolidates the parameters that

activate CLOB preprocessing. To use CLOB in preprocessing, uncomment the relevant
sections.
 x-data-config.xml
The CopyColumnsDataPreProcessor preprocessor can reduce

temporary table processing time by 50%
The preprocessor uses SQL syntax to eliminate unnecessary round-trip communication
between Java and the database. This syntax can take the format: INSERT INTO <one
table> SELECT FROM <another table>.
To enable the preprocessor, copy and use the XML files that are provided.
 To enable the preprocessor for your CI/CD pipeline, begin by copying the XML
files within your development (toolkit) environment samples folder. Copy the XML
files from the samples/dataimport/copy_columns_data_preprocessor directory within
your development environment to the \WC\xml\search\dataImport directory for your
CI/CD pipeline.
Note: In 9.0.1.1, the samples/dataimport/copy_columns_data_preprocessor folder is

deprecated. All corresponding XML files have a copy in \WC\xml\search\
dataImport\v3\database\CatalogEntry folder (where database is
either db2 | oracle). The fileshave a .copycolumns extension in the toolkit. They are
also available in /profile/installedApps/localhost/ts.ear/xml/search/dataImport/v3/
database/CatalogEntry in the runtime of the Transaction and Utility server
Docker containers.
 If you want a quick trial of the preprocessor, copy the XML files from your Utility
Docker container to
the /profile/installedApps/localhost/ts.ear/xml/search/dataImport directory of your
Transaction server Docker Container. You can complete this procedure to test
the preprocessor results within your CI/CD pipeline or within a development
environment.
The following table lists the XML files that use

the CopyColumnsDataPreProcessor preprocessor and the temporary table or tables
that each file can optimize.
XML file Temporary table

 TI_CATENTREL_#INDEX_SCOPE_TAG#
wc-dataimport-  TI_ADATTR_#INDEX_SCOPE_TAG#
preprocess-
attribute.xml
wc-dataimport-  TI_CATALOG_#INDEX_SCOPE_TAG#
preprocess-  TI_CATALOGI_#INDEX_SCOPE_TAG#
catalog.xml
wc-dataimport-  TI_SEOURL_#INDEX_SCOPE_TAG#_#lang_tag#
XML file Temporary table
preprocess-
common.xml
wc-dataimport-  TI_CATGPENREL_#INDEX_SCOPE_TAG#
preprocess-direct-  TI_DPCATENTRY_#INDEX_SCOPE_TAG#
parent-catentry.xml  TI_GROUPING_#INDEX_SCOPE_TAG#
wc-dataimport-  TI_DPGROUPI_#INDEX_SCOPE_TAG#
preprocess-direct-  TI_DPGRPNAME_#INDEX_SCOPE_TAG#_#lang_tag
parent- #
catgroup.xml
wc-dataimport-  TI_CATENTRY_#INDEX_SCOPE_TAG#
preprocess-
fullbuild.xml
wc-dataimport-  TI_D_CATENTRY_#INDEX_SCOPE_TAG#
preprocess-  TI_CATENTRY_#INDEX_SCOPE_TAG#
fullbuild-
workspace.xml
wc-dataimport-  TI_OFFER_#INDEX_SCOPE_TAG#
preprocess-
offerprice.xml
wc-dataimport-  TI_APGROUPI_#INDEX_SCOPE_TAG#
preprocess-parent-
catgroup.xml
wc-dataimport- TI_PRODUCTSET_#INDEX_SCOPE_TAG#
preprocess-
productset.xml
Important: Before you build an index, ensure that you delete all temporary tables with
the exception of the following delta indexing tables:
 TI_DELTA_CATENTRY
 TI_DELTA_CATGROUP
 TI_DELTA_INVENTORY
Ensure that you have Tracing enabled. Run the index as usual, and use Trace to
determine what performance improvements occurred.
The CopyColumnsDataPreProcessor can rely heavily on database computation.

When the preprocessing occurs, you might encounter issues with the processing and see
the error "SQLCODE=-964,". This error can occur when you have insufficient log space
for the preprocessor. You can run SQL statements against your database to increase the
log space.
The transaction log size in the Db2 database is controlled by LOGFILSIZ and
LOGPRIMARY+LOGSECOND. The following SQL statements provide an example of
how to increase the log space to 4 KB*40000*(20+160)=28.8 GB:
db2 update db cfg for <dbname> using LOGFILSIZ 40000

db2 update db cfg for <dbname> using LOGPRIMARY 20
db2 update db cfg for <dbname> using LOGSECOND 160
In HCL Commerce version 9.0.1.3, CopyColumnsDataPreProcessor is used by default
for applicable tables in xml/search/dataImport/v3/db2/CatalogEntry.
Note: Tables in old directories (Db2 and Oracle) still
use StaticAttributeDataPreProcessor.
The transaction log is disabled by default for CopyColumnsDataPreProcessor. The

following example XML snippet defines how to disable the transaction log for Db2:
<_config:data-processing-config
processor="com.ibm.commerce.foundation.dataimport.preprocess.CopyColumnsDa
taPreProcessor">
<_config:table definition="..." name="..."/>
<_config:query sql="..."/>
<_config:property name="no_logging_sql" value="alter table
#TABLE_NAME# activate not logged initially" />
</_config:data-processing-config>
To enable the transaction log for specific CopyColumnsDataPreProcessor, just remove
the no_logging_sql property from the configuration. In this example,
the no_logging_sql property was removed:
processor="com.ibm.commerce.foundation.dataimport.preprocess.CopyColumnsDa
taPreProcessor">
<_config:table definition="..." name="..."/>
<_config:query sql="..."/>
The CatalogHierarchyDataPreProcessor can improve processing
speed when the fetchSize parameter is specified.
In HCL Commerce Version 9.0.0.6, CatalogHierarchyDataPreProcessor is updated to

improve performance. This preprocessor, which is enabled by default, is used to inject
processed data into the TI_APGROUP temporary table. TI_APGROUP becomes
inefficient at large sales catalog numbers because it iterates an internal data structure
and issues a query on each iteration. By specifying the fetchsize parameter, you can
improve the processing speed of the preprocessor by up to 50%. The fetchsize option is
a batch select process that uses WHERE catentry_id in any (?.?…?) clause.
The default fetchSize and batchSize of the preprocessor are each 500.
The fetchSize cannot be larger than 32767 for Db2, or 1000 for Oracle.
For example:
processor="com.ibm.commerce.foundation.dataimport.preprocess.CatalogHierar
chyDataPreProcessor"
masterCatalogId="10101" batchSize="500" fetchSize="1000">
...
The query for the TI_ADATTR temporary table is changed in
Version 9.0.0.6+
During index building, nearly all rtrim() and cast() calls were removed from the query
for the TI_ADATTR temporary table. These calls were redundant for ordinary index
builds. The removal of these calls improves the response time of this query against Db2
databases and improves scaling for large numbers of catalog entries. The change for this
query is enabled by default when you update to Version 9.0.0.6+.
Search caching for the indexing server
Typically disable all Solr caches on the indexing server.
Tuning index buffer size and commit actions
during data import (buildindex)
You can tune your solrconfig.xml file to allocate sufficient memory for index buffering and
prevent commit actions when you are building your index. When the RAM buffer for index
updates is full, Solr performs commit actions that persist data onto disks. When these
commit actions occur, Solr has a global exclusive lock on your entire JVM. This lock
prevents other threads from doing update operations, even when the thread is working
on different records or files. This locking can increase the amount of time that is required
to build your index. By increasing your RAM buffer size, and disabling the commit trigger,
you can reduce the chances of this locking. You can tune your Solr parameters for
commit timing and buffer size in the solrconfig.xml file:
 Allocate more memory for index buffering by changing the value for
the ramBufferSizeMB parameter. 2048 MB is the maximum memory that you
can allocate:
<ramBufferSizeMB>2048</ramBufferSizeMB>
 Disable the server-side automatic commit trigger to also reduce occurrence of

commit actions by commenting out the autoCommit trigger:
 
Paging and database heap size
configuration
Ensure that your memory and paging size is configured according to the size of your
catalog data or if your environment contains multiple indexes for different languages. For
example, if you are having issues with accessing or building large amounts of index data:
1. Increase the default paging size for your operating system. For example, 3 GB. In
cases where the operating system requires a higher paging size, adding more
memory to the system also helps to resolve issues.
2. Increase the default database heap size to a larger value. For example, increase
the Db2 heap size to 8192.
3. Increase the file descriptor limit to a higher value. For example: ulimit -n
8192.
Heap size configuration

Ensure that your HCL Commerce Search heap size is configured according to the size of
your catalog data or if your environment contains multiple indexes for different
languages. For example, if you are having issues with accessing large amounts of index
data, increase the default heap size to a larger value such as 1280. For more information,
see Configuring JVM heap size in WebSphere Liberty .
Important:
 Do not exceed 28 GB of heap size per JVM, even when you use a 64-bit
environment. In a 64-bit JVM, an address compressed reference optimization
feature exists that might be disabled if the heap space exceeds 28 GB. If it is
disabled, there can up to a 30% overall throughput degradation.
Shared pool size

configuration
Ensure that the SHARED_POOL_SIZE is configured according to your
environment. Increasing the shared pool size might improve the performance of the index
build process.
For example:
ALTER SYSTEM SET SHARED_POOL_SIZE='668M' SCOPE=BOTH

Multithreaded
running of SQL query
expressions
Consider enabling multithreading in Db2 to allow for increased performance
when you preprocess the search index.
To do so, set INTRA_PARALLEL=YES in the Db2 DBM configuration. On the DB

client side, update the data source property of currentDegree to ANY. The Setting Parallel
Processing utility is in di-parallel-process.properties. A sample configuration statement
is Database.jdbcURL=jdbc:db2://db:50000/mall:currentDegree=ANY. For more
information, see Common IBM Data Server Driver for JDBC and SQLJ properties for Db2
servers.
Search runtime
server
Consider the
following factors
when you tune the
search runtime
server:
Caching
Search caching for the runtime production subordinate servers
The starter configuration that is included in the CatalogEntry solrconfig.xml file is only
designed for a small scale development environment, such as HCL
Commerce Developer.
When you redeploy this index configuration to a larger scale system, such as a staging or
production system, customize at least the following cache parameters:
 queryResultWindowSize
 queryResultMaxDocsCached
 queryResultCache
 filterCache (Required on the product index when an extension index such as
Inventory exists)
 documentCache (Required on the product index when an extension index such
as Inventory exists)
The following example demonstrates how to define cache sizes for the Catalog Entry
index and its corresponding memory heap space that is required in the JVM:
Sample catalog size:

Catalog size
1.8 million entries.
Total attributes
2000
Total categories
10000
Each product contains
Twenty attributes.
Average size of each Catalog Entry
10 KB.
Sample calculation:
queryResultWindowSize
The size of each search result page in the storefront, such as 12 items per page. This
result includes two prefetch pages.
This action results in a queryResultWindowSize value of 36 (3 x 12).

queryResultMaxDocsCached
For optimal performance, set this value to be the same value as
queryResultWindowSize.
queryResultCache
The size of each queryResultCache is 4 bytes per docId (int) reference x
queryResultWindowSize, for a value of 144 bytes.
Allocate 10 M cache slots for caching the first three pages of the main query.
This action results in a total required queryResultCache size of 1.44 GB (144 B x

10000000).
filterCache
Assume an average search result size to be 5% of the entire catalog size of 1.8 M, or
90,000.
The size of each filterCache is 4 bytes per docId (int) reference x random number of
search hits of 90,000, equaling 360 KB.
Assign 5000 entries for the filterCache.
The total required size for the filterCache results in a value of 1.8 GB (360 KB x 5000).
Note: The filterCache is required on the product index when an extension index such as
Inventory exists, so that the query component functions correctly.
docum
entCac
he
Assume an average size of each Catalog Entry document to be 10 KB.
Assign 5% of the entire catalog to be cached, or 100000 entries for the documentCache.
The total required size for the documentCache results in a value of 1.0 GB (10 KB x
100000).
Note:
 Set the documentCache size to at least the maximum anticipated size of a search
result.
 The documentCache is required on the product index when an extension index
such as Inventory exists so that the query component functions correctly.
As a result, the e
JVM heap size t
required for eac
Entry core is 4.3
GB + 1.8 GB + 1
Managing
cache sizes
to conform
to JVM
memory
Ensure that you configure the fieldValueCache of the catalog entry index core in
the solrconfig.xml file. This configuration can prevent out-of-memory issues by limiting its
size to conform to JVM memory.
The cache set size depends on the facets field quantity and catalog size. The cache
entry size can roughly be computed by the quantity of catalog entries in the index core,
which is then multiplied by 4 bytes. That is, the potential quantity of cache entries equals
the quantity of potential facets.
For example, in the solrconfig.xml file:
<fieldValueCache class="solr.FastLRUCache"
size="300"
autowarmCount="128"
showItems="32" />
Note: The recommended solr.FastLRUCache caching implementation does not have a
hard limit to its size. It is useful for caches that have high hit ratios, but might significantly
exceed the size value that you set. If you are using solr.FastLRUCache, monitor your
heap usage during peak periods. If the cache is significantly exceeding its limit, consider
changing the fieldValueCache class to solr.LRUCache to avoid performance issues or
an out-of-memory condition.
For more information, see https://https://lucene.apache.org/solr/guide/7_3/query-settings-

in-solrconfig.html/solr/guide/6_6/query-settings-in-solrconfig.html.
Tunin
g the
searc
h
releva
ncy
data
cache
Ensure that you tune the search relevancy data cache for your catalog size.
Relevance data is stored in the following cache instance:
 services/cache/SearchNavigationDistributedMapCache
Each entry ranges 8 - 10 KB, containing 10 - 20 relevancy fields. The cache instance
also contains other types of cache entries. The database is used for every page hit when
the cache instance is full, reducing performance.
T
u
n
i
n
g
t
h
e
s
e
a
r
c
h
d
a
t
a
c
a
c
h
e
f
o
r
f
a
c
e
t
e
d
n
a
v
i
g
a
t
i
o
n
The HCL Commerce Search server code uses the WebSphere Dynamic Cache facility to
perform caching of database query results. Similar to the data cache used by the
main HCL Commerce server, this caching code is referred to as the HCL Commerce
Search server data cache.
Adju
sting
heap
spac
e
whe
n
sear
ch
prod
uct
displ
ay is
enab
led
When
the
searc
h
produ
ct
displa
y
featur
e is
enabl
ed,
adjust
the
heap
size
accor
ding
to
these
guideli
nes:
 Alloca
te
appro
ximate
ly 5
MB/ca
tegory
with
produ
ct
seque
ncing
file for
produ
ct
seque
ncing:
 For
Image
Facet
Overri
de:
~10M
B per
categ
ory
with
image
overri
de file.
 For
Seque
ncing
and
Image
Overri
de:
Assu
ming
a
baseli
ne of
100,0
00
produ
cts in
the
categ
ory,
allocat
e
~15M
B per
categ
ory
with
seque
ncing
and
image
overri
de file.
If you
are
using
manu
al
seque
ncing
with
many
categ
ories,
add
1.5
MB
per
categ
ory
that is
seque
nced
for
each
additio
nal
100,0
00
produ
cts.
For
exam
ple,
accor
ding
to the
15 MB
per
categ
ory
estima
te,
manu
al
seque
ncing
of 200
categ
ories
with a
catalo
g size
of
100k
can
use 3
GB of
memo
ry.
Manu
al
seque
ncing
of the
same
200
categ
ories
can
use 6
GB
when
the
catalo
g size
is 1.1
million
.
Theref
ore,
the
heap
space
that is
allocat
ed per
categ
ory
must
be
adjust
ed
accor
ding
to the
catalo
g size.
Face
t
perf
orma
nce
Consi
der
the
followi
ng
facet
perfor
manc
e
tuning
consid
eratio
ns
when
you
work
with
facets
in
starter
stores
:
 Tune
the
size of
the ser
vices/c
ache/S
earchN
avigati
onDist
ributed
MapCa
che ca
che
instan
ce
accor
ding
to the
numb
er of
categ
ories.
 Tune
the
size of
the ser
vices/c
ache/S
earchA
ttribute
Distrib
utedM
apCac
he cac
he
instan
ce
accor
ding
to the
numb
er of
attribu
te
diction
ary
faceta
ble
attribu
tes.
 Avoid
enabli
ng
many
attribu
te
diction
ary
facete
d
naviga
tion
attribu
tes in
the
storefr
ont
(Sho
w
facets
in
searc
h
result
s).
Avoidi
ng
many
of
these
attribu
tes
can
help
avoid
Solr
out-of-
memo
ry
issues
.
Exte
nsio
n
inde
x
Consi
der
the
followi
ng
usage
when
an
extens
ion
index
such
as
Invent
ory
exists
in
WebS
phere
Comm
erce
searc
h:
 The
filterC
ache
and
docu
mentC
ache
are
requir
ed on
the
produ
ct
index
when
an
extens
ion
index
such
as
Invent
ory
exists
in HC
L
Comm
erce
Searc
h so
that
the
query
comp
onent
functio
ns
correc
tly.
 You
can
typical
ly
disabl
e all
other
intern
al Solr
cache
s for
the
extens
ion
index
in the
searc
h run
time.
Conf
igura
tion
optio
ns
Searc
h
confi
gurati
on
Ensure that you are familiar with the various Solr configuration parameters, Solr Wiki:
solrconfig.xml. The documentation contains information for typical configuration
customizations that can potentially increase your search server performance. For
example, if your store contains a high number of categories or contracts, or if your search
server is receiving Too many boolean clauses errors, increase the default value
for maxBooleanClauses.
Indexing
changes
and other
considera
tions
Garbage
collection
The default garbage collector policy for the HCL Commerce JVM is the Generational
Concurrent Garbage Collector. Typically, you do not need to change this garbage
collector policy.
You can activate the Generational Concurrent Garbage Collector for the HCL Commerce
Search JVM by using the -Xgcpolicy:gencon command line option.
Note: Using a garbage collection policy other than the Generational Concurrent Garbage
Collector might result in situations with increased request processing times and high CPU
utilization.
For more information, see Garbage collection policies.
Spell checking
You can experience a performance impact when you enable spell checking for HCL
Commerce Search terms.
You might see performance gains in transaction throughput if either spell checking is
skipped where necessary, or when users search for products with catalog overrides.
For example, a search term that is submitted in a different language than the storefront
requires resources for spell checking. However, product names with catalog overrides
are already known and do not require any resources for spell checking.
The spell checker component, DirectSolrSpellChecker, uses data directly from the
CatalogEntry index, instead of relying on a separate stand-alone index.
Improving Stor
Preview perform
for search chan
To improve performance when you preview search changes, you can skip indexing
unstructured content when business users start Store Preview:
In the wc-component.xml file, set the IndexUnstructured property to false.
For more information, see Changing properties in the HCL Commerce configuration file
(wc-component.xml).
Performance
You can monitor
Commerce Sear
following method
Lucene Index t
Luke is a development and diagnostic tool for search indexes. It displays and modifies
search index content. For more information, see Luke - Lucene Index Toolbox.
WebSphere Ap
clients
JMX clients can read runtime statistics from Solr.
To set up the client:
1. Add the JMX registry in the Solr core configuration file, solrconfig.xml:
2.
3. <jmx
serviceURL="service:jmx:iiop://host_name:2809/jndi/JMXConnector"/>
4. Use jconsole in Rational Application Developer to connect to the runtime JMX.
When the Solr core is initialized, you can use jconsole to view information from
JMX, such as statistics information for caches.
For more information, see SolrJmx.
Common business tasks and their

impact to the HCL Commerce
Search index
There are impacts to the search index when certain business tasks are performed. Ensure that
you are aware of your daily tasks when they might require search index rebuilds.
General guidelines
Certain business tasks might introduce a significant impact to the production system. This is due
to certain changes that trigger a full reindexing, which might vary in time from several minutes to
hours, depending on the catalog size. The following general guidelines highlight the business
actions that you can perform during business hours, and those to avoid:
Perform the following business actions during business hours, as they do not pose a significant
risk to production system performance:
 Creating a sales catalog that results in adding new categories only.

 Updating an existing category property.
 Changes made to any new or existing products, SKUs, packages, or bundles.
 Product attachments should be updated with their associated products.
 Adding or updating attribute dictionary attributes.
 Updating default price lists for any product.
 Adding or updating any type of marketing activity. For example, Web activities, Dialog
activities, or search rules.
 Adding or updating search term associations.
 Committing or canceling any task groups.
 Undoing any uncommitted operations within tasks.
Avoid performing the following business actions during business hours, as they might affect
performance to the production system:
 Reparenting an existing category. This triggers a full reindexing, which is not suitable
during business hours where the system usage is high.
 Removing an existing category or a subcategory from a catalog. This triggers a full
reindexing, which is not suitable during business hours where the system usage is high.
 Individual attachments should not be reloaded or deleted separately. This operation must
be performed with an associated product.
Warning: By default, the updateSearchIndex scheduled job runs a full index update. Do not
run the updateSearchIndex scheduled job in any full index configurations on a production
environment.
The inventory index contains operational data and therefore can be used only for previewing in
an authoring environment. IT administrators can set up a recurring task to take snapshots of
inventory status from a production environment and use them in an authoring environment for
previewing, testing, and tuning of search rules.
Common business tasks that affect the search index

The following table highlights the common business tasks and their impact to the search index in
an authoring or staging environment, where:
 An authoring environment is one that has a workspace that is enabled and business
users can use this environment to perform changes within a workspace. Once the
workspace gets approved, committed data (from the base schema) can be reindexed and
then published to the production environment through index replication.
 A staging environment is similar to an authoring environment, with the exception that it
does not have workspace that is enabled. Business users can still make changes in this
staging environment, but changes are made immediately to the search index in this
staging environment.
Note: All reindexing types that are listed in the table are denoted against the product index and
not the category index, except where indicated (Category business component).
Business tasks that affect the search index
The following table groups business tasks and reindexing types by business components.
Business
component Business task Reindexing type
Catalog: master Linking or unlinking to an Delta: Product and Category index
or sales catalog existing category from a Note: A delta reindex is performed only if the
catalog tree number of changes that are affected by the
business task is less than
the DeltaIndexingThreshold threshold.
Changes to an association Delta: Product index
of existing product to a
catalog
Store: direct Adding a new stand-alone Full: All indexes
business model direct model store that
uses a separate master
catalog
Store: extended Adding a new extended Not required
business model site that uses an existing
indexed catalog asset store
Adding a product, or Delta: Product index
delete an existing product
from an existing extended
site
Catalog entry: Adds a product, or delete Delta: Product index
product, package, an existing product
bundle, kit, item Updates any existing Delta: Product index
property or adds a
property to an existing
catalog entry, such as the
product description,
product name, brand
name, thumbnail, images,
SKU
Updates any existing or Delta: Product index
adds new package or
bundle
Associating or removing a Delta: Product index
product attribute from an
existing product
Reparent a catalog entry Delta: Product index
Category Adds a category Delta: Category index
Deletes an existing Full
category
Business
Updates any existing Delta: Category index
property or adds a
property to an existing
category, such as the
category description,
thumbnail, images
Reparent a category Delta: Product and Category index
Merchandising Updates or adds new Not required
association merchandising association
Attribute Adding or removing any Not required
Dictionary value of a newly created
attributes product attribute in the
attribute dictionary
Updates to any value of an Delta: Product index
existing attribute in the
attribute dictionary that is
associated with products.
Updates made to an Delta: Product index
associated catalog entry's
Attribute Dictionary
attributes or their allowed
values.
Attributes Updates to any value of a Delta: Product index
newly created or existing
product attribute
Adding or removing any Delta: Product index
value of an existing
product attribute
Adding or removing a Delta: Product index
product attribute
Associated asset Uploads a new attachment Delta: Product and Unstructured index
and associates with an
existing product
Reuploads or deletes an Delta: Product and Unstructured index
existing attachment that is
associated with only one
product
Reuploads or deletes an Delta: Product and Unstructured index
existing attachment that is
associated with existing
products
Price Updates to any existing or Delta: Product index
Business
adds a new (default) price
rule to a store
Updates to store default Delta: Product index
offer price for a product
Updates to list price for a Not required
product
Contract Creating or changing a Full
contract using Catalog
Filter from within the
WebSphere Commerce
Accelerator.
Marketing Adding, Not required
changing, or deleting an
existing marketing activity
(Web, Dialog)
Search rule Adding, changing, or Not required
deleting an existing search
rule
Search term Adding, changing, or Not required
association deleting an existing search
term association
Versioning Rollback or forward to Delta: Product and Category index
another version of a
category
Rollback or forward to Delta: Product index
another version of a
product
Inventory Updates to the inventory Full: Inventory index
search index
Common business tasks that affect the search index with workspaces
enabled
The following table highlights the available index types for approved content (base) and
workspaces.
All reindexing types that are listed in the table are denoted against the following indexes:
Index types
Index type Approved Content (base) Workspace
Product index Yes Yes
Category index Yes Yes

Index types
Index type Approved Content (base) Workspace
Unstructured Yes Yes

index
The following table highlights the common business tasks and their impact to the search index in
an environment with workspaces:
Note: Upon approving a task group that contains catalog changes:
 Indexing is triggered against the base schema to index the workspace changes under the
Approved content index, and
 Indexing is required against the workspace schema to clean up the approved changes
from the workspace index.
The following table describes business tasks that affect the search index
Reindexing type
Business Approved
component Business task Content Workspace
Catalog: master or Linking or unlinking to an existing Delta: Product Delta: Product
sales catalog subcategory from a catalog tree and Category and Category
index index
Linking or unlinking for a top Full: All indexes Full: All indexes
category
Changes to an association of existing Delta: Product Delta: Product
product to a catalog index index
Create Sales Catalog Not required Not required
Update Catalog description Not required Not required
Update default Catalog Not required Not required
Store: direct Adding a new stand-alone direct Full: All indexes Full: All indexes
business model model store that uses a separate
master catalog
Store: extended Adding a new Extended Site that uses Not required Not required
business model an existing indexed catalog asset store
Adding a product, or delete an Delta: Product Delta: Product
existing product from an existing index index
Extended Site
Catalog entry: Adding a product, or delete an Delta: Product Delta: Product
product, package, existing product index index
bundle, kit, item Updating any existing property or Delta: Product Delta: Product
adding a property to an existing index index
catalog entry, such as the product
description, product name, brand
name, thumbnail, images, SKU
Reindexing type
Business Approved
Updating any existing or adding a Delta: Product Delta: Product
package or bundle index index
Associating or removing a product Delta: Product Delta: Product
attribute from an existing product index index
Reparenting a catalog entry Delta: Product Delta: Product
index index
Updating the sequence of a catalog Delta: Product Delta: Product
entry within a category index index
Unpublish a product (Display to Delta: Product Delta: Product
customer not selected in the index index
Management Center)
Category Adding a subcategory to an existing Delta: Category Delta: Category
category index index
Deleting a subcategory from an Full: All indexes Full
existing category
Updating any existing property or Delta: Category Delta: Category
adding a property to an existing index index
category, such as the category
description, thumbnail, images
Reparenting a category Delta: Product Delta: Product
and Category and Category
index index
Updating the sequence of a sales Delta: Category Delta: Category
category when Expanded Category index index
Navigation is disabled
Updating the sequence of a sales Delta: Product Delta: Product
category when Expanded Category and Category and Category
Navigation is enabled index index
Unpublish a category (Display to Delta: Category Delta: Category
customer not selected in the index index
Management Center)
Unpublish a category (Display to Full: Product Full: Product
customer not selected in the and Category and Category
Management Center) when deep index index
category unpublish is enabled
Merchandising Updating or adding new Not required Not required
association merchandising associations
Attribute Dictionary Adding a value to an existing Not required Not required
attributes attribute dictionary attribute
Updating or removing any value of an Delta: Product Delta: Product
Reindexing type
Business Approved
existing attribute in the attribute index index
dictionary that is associated with
products
Adding an attribute dictionary Not required Not required
attribute
Marking an attribute dictionary Delta: Product Delta: Product
attribute as searchable or facetable index index
Removing an attribute dictionary Delta: Product Delta: Product
attribute index index
Attributes Updates to any value of a newly Delta: Product Delta: Product
created or existing product attribute index index
Adding or removing any value of an Delta: Product Delta: Product
existing product attribute index index
Adding or removing a product Delta: Product Delta: Product
attribute index index
Associated asset Uploading a new attachment and Delta: Product Delta: Product
associating it with an existing product and and
Unstructured Unstructured
index index
Uploading or deleting an existing Delta: Product Delta: Product
attachment that is associated with one and and
or more products Unstructured Unstructured
Note: You must also update the index index
product
Price Updates to default store offer price Delta: Product Delta: Product
for a product index index
Contract Creating or changing a contract using Full: All indexes Full: All indexes
Catalog Filter from within the
WebSphere Commerce Accelerator.
Marketing Adding, changing, or deleting an Not required Not required
existing Web or Dialog activity
Dialog activities are

available in the Enterprise edition
of HCL Commerce.
Search rule Adding, changing, or deleting an Not required Not required

existing search rule
Search term Adding, changing, or deleting an Not required Not required
association existing search term association
Reindexing type
Business Approved
Versioning Rollback or forward to another Delta: Product Delta: Product
version of a category and Category and Category
index index
Rollback or forward to another Delta: Product Delta: Product
version of a product index index
Inventory Updates to the inventory search index Full: Inventory Full: Inventory
index index
Example: Reading a table row for common business tasks
 When a business user is working on a workspace schema and creating a new product in
the Catalogs tool, a delta reindexing is required to update the workspace product index.
 Performance logger
 The performance logger produces trace information about the response time when HCL
Commerce calls out to an external system. The trace can be used by a monitor to
measure response times.
 The trace is enabled using the following string: com.ibm.commerce.performance=fine.
 The name and location of the trace file
is: WAS_profiledir/logs/performanceTrace.json.
 The following is a sample entry in the performance trace file, in JSON format:

 {"timestamp": "2012/10/02 23:56:38:265 EDT", "threadID": "0000009c",
"source": "External OMS",
 "service": "getPage-getOrderList", "serviceTime": "6188 ms"},
 The following are the external service calls that are traced by default, with the API that
calls out the performance logger:
Default traced external service calls
Sour
API ce Service
SterlingClientHelper.getOrderLineDetails() Exter getCompleteOrder

Details
nal
OMS
AbstractProcessInventoryRequirementActionCmdImpl.callGetInven Exter monitorItemAvail
toryAvailabilityService() ability
nal
OMS
ComposeDOMInventoryAvailabilityCmdImpl.callGetInventoryAvaila Exter monitorItemAvail
bilityService() ability
nal
OMS
Default traced external service calls
Sour
API ce Service
ProcessInventoryRequirementCancelInventoryReservationActionCm Exter multiAPI-
dImpl.callCancelInventoryService() cancelReservatio
nal n
OMS
ProcessInventoryRequirementReserveInventoryActionCmdImpl.call Exter reserveAvailable
ReserveInventoryService() Inventory
nal
OMS
FetchTransferredExternalOrderByStoreMemberAndStatusCmdImpl.fe Exter getPage-
tchExternalOrders() getOrderList
nal
OMS
Data recording: Improving runtime

performance
Two options are provided which can improve runtime performance regarding the recording of
user behavior data.
Recording user behavior data
 An option that stores the data in cache but does not persist that data to the database.
This is the default behavior for the recently viewed categories and products, but can be
configured for any user behavior data recording.
 An option to set the maximum size of user behavior data that is stored in cache before
that data is persisted to the database.
For more information, see Marketing component configuration files (wc-admin-component.xml).
Areas for performance consideration
Area Performance consideration
Activity and All information that is related to the processing of an activity during the
behavior rules storefront is put into the marketing cache. All information includes definition
of the behavior rules that need to match against URLs and controller
commands.
Recording of Performed in batch mode. Only once there is an activity is the data that is
user data related to that activity recorded. For example, only once a target has a
behavior rule that requires a customer to have browsed the Furniture category
five times is the customers browsing of the Furniture recorded. The browsing
of other categories is not recorded in this instance.
Optional For high amounts of data. For example, for a recently browsed list, the data is
Area Performance consideration
recording of not recorded in the database.

user data
Detecting For controller commands, the command execution trigger functionality is

events used. The command execution trigger is only enabled once a rule is set up that
is interested in the successful exit of that controller command. For behavior
rules, all the rule definitions are in the marketing cache.
Accessing A customer's online behavior user data is in a user data cache. While a
user data customer is browsing the site, any access to their user data is from the cache
that avoids any database access.
Customer is in Processing time of large dialog activities can be configured to run off peak.
Segment Segment evaluation can be expensive and result in many customers being run
trigger through a dialog activity.
Aggregate Views, clicks, and customers reach an element are accumulated in memory
statistics and are periodically persisted to the database that avoids a database write on
every page visit.
Emerald REST Caching On TS Server for Commerce 9.1
Emerald Store is powered by the REST framework, and the cache is implemented using a REST
servlet.
To enable caching of the default Emerald store, copy the following example;
/Rest/WebContent/WEB-INF/cachespec.xml.emerald.sample.store in the
REST/WEB-APP/cachespec.xml file and restart.
Caching on Emerald's TS application
The following URLs are cached on Emerald;
/wcs/resources/store/{storeId}/adminLookup?q=findByStoreIdentifier&storeIdentifier={storename}
/wcs/resources/store/{storeId}/associated_promotion?
q=byProduct&qProductId={productId}&langId={langId
/wcs/resources/store/{storeId}/espot/{espotIdentifier}?
catalogId={catalogId}&name={name}&langId={langId
/wcs/resources/store/{storeId}/inventoryavailability/{catentryId}?langId={langId}
/wcs/resources/store/{storeId}/guestidentity?langId={langId}
/wcs/resources/store/{storeId}/online_store
Only a few of the cached urls require the invalidation method, which is enabled. The inventory url is
the critical url, which is cached and requires a caching trigger to keep it up to date with the system's
real inventory.
Enabling the REST cache for Emerald
The sample cachespec.xml that defines the rules is provided in the file:
/Rest/WebContent/WEB-INF/cachespec.xml.emerald.sample.store
To enable caching, you are required
Replace the Rest/WebContent/WEB-INF/cachespec.xml with the sample file.
Enable the caching inventory triggers to manage the inventory cache.
Restart the server.
Emerald Store and Inventory Caching
The product inventory is implemented as a separate rest call to the TS-app server in the Emerald
store, which is a REST-based store. This call is submitted straight from the browser to the TS-app,
bypassing the CDN layer. The store's performance will be significantly improved by caching this call.
Although the call can be cached on the CDN, the content will be temporary, and the invalidation
policy will be rudimentary. Caching it on the appserver, on the other hand, is considerably more
versatile and will enable invalidation techniques to satisfy business needs and keep the data fresh
over time.
Inventory Model and Operations
In general, the inventory/quantity is updated regularly from both the backend and user interaction
with the system (when submitting the order).
Daily updates and direct feeds would run on the production database, causing inventory
adjustments. The triggers would take track of these changes according to the inventory tracking
rules. In other words, the inventory feed will be treated in the same way that the user browses and
purchases items from the inventory.
While surfing, the browse page data will verify stock/inventory information to ensure that the end
user receives accurate information.
The inventory in the Emerald store is accessed through REST requests sent from the user's browser
to the TS app server. These REST calls return inventory information/counts as a response. In most
instances, the actual count is less essential than knowing whether the product is in stock or out of
stock. If that's the case, the Boolean value of "in stock" should only be refreshed if the following
conditions are met:
The stock count goes from positive number to zero (the product is sold/inventory is depleted).
The stock count goes from zero to positive number (new stock arrived in the system)
In other words, the inventory can be cached until it runs out of stock, at which point it should be
invalidated, OR the inventory can be stored until the backend feed puts the item back into stock.
There could be a bit more complicated lifecycle of the cached content as:
In stock
Low stock (where the stock is smaller than some value).
Out of stock.
Obviously, we will have to invalidate the inventory as it is passing through each of these conditions:
The stock count goes down to LowStock Value (we don’t really need to track if this approached from
high or low.
The Inventory count goes from positive value to zero.
The inventory count goes from zero to positive value.
In the inventory invalidation strategy, this mode of operation and logic must be followed. The
mechanism for tracking and emitting the invalidation message to the rest of the system should be
integrated into the caching triggers in this specific case.
Daily Updates
Daily updates and direct feeds operate on the production database and might induce alterations to
the database's inventory table. Since this inventory table's caching triggers will be active, data feed
change operations will be handled in the same location and in the same manner as user browsing
and checkout activities on the inventory.
Cache Dependencies and Invalidations
The design of the dependencies and invalidations is defined by the website's actual implementation.
The inventory calls on the TS-app server, for example, are rest calls that can be directly invalidated;
but, if the inventory is displayed on the search response pages, the cache will be emptied altogether.
Inventory Cache Triggers

HCL Commerce supports a variety of inventory types; however, the triggers presented here are for
non-ATP inventory, which has a simpler model and is more widely used.
Conditional Triggers
The triggers should fire only if the condition is satisfied, i.e. they should only insert into CACHEIVL
table only if the condition is satisfied. This is achieved with the when clause in the trigger:
REFERENCING OLD AS O NEW AS N FOR EACH ROW MODE DB2SQL
WHEN (clause)
BEGIN ATOMIC
The clause for the first case where we have only instock and out of stock would look like:
WHEN ((N.quantity > 0 AND O.quantity = 0) or (N.quantity = 0 and O.quantity > 0))
Or the trigger definition would look as:
WHEN ((N.quantity > 0 AND O.quantity = 0) or (N.quantity = 0 and O.quantity > 0))
BEGIN ATOMIC
When we have 3 levels of inventory, as in instock, low stock and out of stock:
WHEN ((N.quantity > 0 AND O.quantity = 0) or (N.quantity = 0 and O.quantity > 0) OR (N.quantity =
10 and O.quantity > 10))
BEGIN ATOMIC
Tips on how to implement special conditions
The example below demonstrates how to set up triggers on an inventory table. The trigger will insert
invalidation messages in the cache ivl for the following objects, in addition to the trigger clauses that
follow the business logic:
SKU level.
Product Level.
Bundle Level.
--#SET TERMINATOR #
DROP TRIGGER ch_inventory_u#
CREATE TRIGGER ch_inventory_u
AFTER UPDATE ON inventory
WHEN ((N.quantity > 0 AND O.quantity = 0) or (N.quantity = 0 and O.quantity > 0) OR (N.quantity =
10 and O.quantity > 10))
BEGIN ATOMIC
INSERT INTO cacheivl (template, dataid, inserttime) VALUES ( NULLIF('A', 'A'), 'catentryId:'||
RTRIM(CHAR(N.catentry_id)) , current_timestamp );
INSERT INTO cacheivl (template, dataid, inserttime)
(SELECT distinct NULLIF('A', 'A'), 'catentryId:'|| RTRIM(CHAR(product.catentry_id)) ,

current_timestamp
FROM catentry sku, catentry product, catentrel
WHERE O.catentry_id = N.catentry_id and
N.catentry_id = sku.catentry_id and
sku.catentry_id = catentrel.catentry_id_child and
catentrel.catentry_id_parent = product.catentry_id);
INSERT INTO cacheivl (template, dataid, inserttime)
(SELECT distinct NULLIF('A', 'A'), 'catentryId:'|| RTRIM(CHAR(bundle.catentry_id)) ,

current_timestamp
FROM catentry sku, catentry bundle, catentrel skutoprod, catentrel prodtobund
WHERE O.catentry_id = N.catentry_id and
N.catentry_id = sku.catentry_id and
sku.catentry_id = skutoprod.catentry_id_child and
skutoprod.catentry_id_parent = prodtobund.catentry_id_child and
prodtobund.catentry_id_parent = bundle.catentry_id);
END#
The 3 inserts in the trigger are matching the 3 possible objects that may have been affected by the
inventory table record. The provided trigger is directly applicable to DB2 database, and with minimal
changes it can be applied in Oracle DB as well. No further changes are needed, and the invalidations
will start working once the trigger is in place.

Elasticsearch Performance Tuning

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Elasticsearch Performance Tuning

Uploaded by

Copyright:

Available Formats

Elasticsearch performance tuning

The index creation process consists of three key steps:

Processing data group

Uploading data group

These scenarios will be examined in further detail in Interpreting patterns and

Key Tunable Variables

The following key tunable variables can be adjusted to improve overall

Processor thread count (Concurrent Tasks)

 Elasticsearch monitoring and tuning

Interpreting patterns and tuning

A case of slow data transfer – multithreading

Side effects from thread increase

Side effects of bucket size increase

However, the below graph shows a totally different picture:

A case of slow processing- scroll size tune

Side effect from very large scroll sizes

Finding the sweet spot

Elasticsearch monitoring and tuning

 Monitoring the operation

NiFi counters and reporting

Due to high resource consumption, the NiFi counters access http://nifi.<your

Understanding the Grafana graphs

Interpreting the graphs and detecting a bottleneck

Monitoring Elasticsearch cluster

The Elasticsearch monitoring dashboard pagedisplays numerous metric displays.

Displays several metrics by single number value.

Displays high level shard information.

JVM Garbage Collection

CPU and Memory

Disk and Network

Total Operations Statistics

Critical dashboard that shows activity of the worker threads.

Dashboard presenting the Elasticsearch caches.

For a complete understanding of some metrics, in-depth knowledge of Elasticsearch operations is

Key monitoring metrics

 Query time dashboard

Garbage Collection operations – GC time

JVM Memory Utilisation

Active threads dashboards

The picture below depicts two graphs:

1. Thread pool operations queued.

2. Search thread pool

Note: The number of threads in

Elasticsearch internal caches:

Total Operations dashboard

1. Total Operations rate

Monitoring and Analysis of a typical problem

Single type operations of Elasticsearch

By monitoring these metrics and analyzing the patterns and

The case of low heap

These are typical behaviors that Elasticsearch may exhibit when

It is important to note that Elasticsearch es native memory for

Additional considerations arding memory

There is no correct size of the bulk request. Test with

It is important to note that Elasticsearch limits the

Multiple or Concurent Elasticsearch

Changing write threads:

Changing search threads

For more information, see Thread pools.

For more information, see Circuit breaker settings.

 Field Data Cache

For more information, see Field data cache settings.

 Field Data Loading Circuit Breaker