
Sizing guidance for rendering in a small-sized Kubernetes configuration

This topic provides details of the environments used for rendering in a small-sized Kubernetes configuration, along with test results and recommendations for small configurations.

You can also find details of the environments used for rendering with the upper limit in a single-node Kubernetes configuration.

Methodology

Overview of DX rendering sizing-performance tests

This sizing work consisted of rendering scenarios of WCM, portlets, and DAM with a rendering setup enabled in AWS/Native-Kubernetes (Kubernetes installed directly in Amazon EC2 instances). A combination run was performed that rendered WCM content, DAM assets, and DX pages and portlets. The load distribution was WCM content (40%), DAM assets (30%), and DX pages and portlets (30%). All systems were pre-populated before performing the rendering tests.

To achieve the 1,000 concurrent user mark, an initial set of runs was done with a lower number of users on a single-node setup. The tests targeted the desired load of 1,000 users with an acceptable error rate (< 0.01%), and further steps were taken to optimize the limits on the available resources for each pod.

The following table contains the rendering scenario details for a small configuration.

| Concurrent users | WCM pages | DAM content | Pages and portlets content |
| ---------------- | --------- | ----------- | -------------------------- |
| 1,000 users      | 20        | 2,500       | 8                          |

For more information about the setup of test data, refer to the following:

Environment

This section provides details for the Kubernetes cluster, JMeter, and database.

AWS/Native Kubernetes

  • A Kubernetes platform is running on an AWS Elastic Compute Cloud (EC2) instance with the DX images installed and configured.

  • In AWS/Native Kubernetes, the tests were executed on EC2 instances with a single node instance (c5.2xlarge).

  • The tests used a remote DB2 instance for the core database (t3a.large).

  • [Small Configuration - Single node] - [c5.2xlarge]

    • Information

    • Processor details

    • Volume details

DB2 instance

  • Remote DB2 - [t3a.large]

  • Processor details

  • Volume details

JMeter agents

  • JMeter instance - [t2.xlarge]

  • To run the tests, a distributed AWS/JMeter agents setup consisting of one primary and two subordinates was used.

  • Processor details

  • Network details

  • Volume details

Note

Ramp-up time is 1.5 seconds per user. The test duration is the ramp-up time plus 1 hour at the peak concurrent user load. For example, a 1,000-user test ramps up over 1,500 seconds (25 minutes) and then runs for 1 hour at peak load.

Results

The initial test runs were conducted on an AWS-distributed Kubernetes setup with a single node. The system successfully handled concurrent user loads of 100, 200, 400, and 500 users, with a low error rate (0.0%). At 600 users, error rates increased dramatically and response times went up as well. For a response time to be considered optimal, it should be under 1 second. All the errors came from WCM and Pages and Portlets, not from DAM.

Test results were analyzed in Prometheus and Grafana dashboards. For HAProxy and Core pods, the CPU and memory limits were fully utilized. These limits were increased based on the CPU and memory usage observations from Grafana during the load test. Increasing the CPU and memory limits of HAProxy and Core pods resolved the errors.
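For reference, "increasing the CPU and memory limits" of a pod comes down to the container's resources block. The snippet below is a minimal sketch of that pattern, shown with the HAProxy values the tuning eventually settled on (see the Conclusion table); in a Helm-managed DX deployment, these values are set through the chart's values rather than by editing the pod spec directly.

```yaml
# Container-level resources with equal requests and limits, shown with the tuned
# HAProxy values from the Conclusion table (700 m CPU, 1024 Mi memory).
resources:
  requests:
    cpu: "700m"
    memory: "1024Mi"
  limits:
    cpu: "700m"
    memory: "1024Mi"
```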

In addition, the event loop lag of the Ring API pod was high, at 400 ms, for a user load of 500 users. After the CPU and memory limits of the Ring API pod were adjusted, the event loop lag dropped to 6.6 ms.

Based on these observations, the CPU and memory limits of the core, ringApi, and HAProxy pods were tuned one by one to verify that no errors occur at user loads of 600 to 1,000 users.

Conclusion

This performance tuning guide aims to show how the ratios of key pod limits can improve rendering response time in a simple single-pod system. This is an important step before illustrating the impact of scaling pods. This guide concludes the following:

  • Changes to the pod limits for the following pods significantly improve the responsiveness of the setup and enable the system to handle more users.
| Pod Name | Minimum Number of Pods | Container | Container Image | Container CPU Request and Limit | Container Memory Request and Limit |
| -------- | ---------------------- | --------- | --------------- | ------------------------------- | ---------------------------------- |
| core     | 1                      | core      | core            | 3000 m                          | 5000 Mi                            |
| ringApi  | 1                      | ringApi   | ringApi         | 500 m                           | 512 Mi                             |
| haproxy  | 1                      | haproxy   | haproxy         | 700 m                           | 1024 Mi                            |

Note

Performance tuning for a Kubernetes DX cluster must be conducted for the specific workload and number of concurrent users involved. These recommendations are intended as a starting point to speed up that tuning. Refer to the DX Core tuning guide for further enhancements.

Recommendations

  • Currently, default CPU and memory values in the Helm chart are the minimum values for DX to work. For a small-sized workload in AWS, the Kubernetes cluster should begin with a single node with at least a c5.2xlarge instance type to support a load of 1,000 users.

  • For testing purposes, the OpenLDAP pod values were adjusted to hold more authenticated users for rendering. However, the OpenLDAP pod is not intended for production use.

A number of alterations were made to the initial Helm chart configuration. The following table contains the number of pods and the limits for each pod. Using these values significantly improves the responsiveness of the setup and enables the system to handle 1,000 concurrent users with an improved error rate, average response time, throughput, and Ring API container event loop lag.

| Component | No. of pods | Request CPU (m) | Request Memory (Mi) | Limit CPU (m) | Limit Memory (Mi) |
| --------- | ----------- | --------------- | ------------------- | ------------- | ----------------- |
| contentComposer | 1 | 100 | 128 | 100 | 128 |
| core | 1 | 3000 | 5000 | 3000 | 5000 |
| digitalAssetManagement | 1 | 500 | 1536 | 500 | 1536 |
| imageProcessor | 1 | 200 | 2048 | 200 | 2048 |
| openLdap | 1 | 200 | 768 | 200 | 768 |
| persistenceNode | 1 | 500 | 1024 | 500 | 1024 |
| persistenceConnectionPool | 1 | 500 | 512 | 500 | 512 |
| ringApi | 1 | 500 | 512 | 500 | 512 |
| runtimeController | 1 | 100 | 256 | 100 | 256 |
| haproxy | 1 | 700 | 1024 | 700 | 1024 |
| licenseManager | 1 | 100 | 300 | 100 | 300 |
| Total | | 6400 | 13108 | 6400 | 13108 |

These request and limit totals (6,400 m CPU and 13,108 Mi memory) fit within the recommended c5.2xlarge node (8 vCPUs and 16 GiB of memory).

Note

The tuned Helm values are those for the pods listed in the Conclusion table (core, ringApi, and haproxy); the rest are default minimal values.

For convenience, these values were added to the small-config-values.yaml file in the hcl-dx-deployment Helm chart. To use these values, refer to the following steps; a usage sketch follows the list:

  1. Download the Helm chart from FlexNet or Harbor.

  2. Extract the TGZ file (hcl-dx-deployment-XXX.tgz).

  3. In the extracted folder, go to hcl-dx-deployment/value-samples/small-config-values.yaml.
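After step 3, the file can be passed to Helm with the -f option when installing or upgrading the hcl-dx-deployment release. The excerpt below is a hedged sketch of the kind of overrides such a values file carries, shown here with the tuned core and ringApi values from the table above; the key names are illustrative assumptions, so follow the structure already present in small-config-values.yaml rather than copying these keys verbatim.

```yaml
# Illustrative excerpt only -- key names are assumptions; use the structure
# found in small-config-values.yaml itself.
resources:
  core:
    requests:
      cpu: "3000m"
      memory: "5000Mi"
    limits:
      cpu: "3000m"
      memory: "5000Mi"
  ringApi:
    requests:
      cpu: "500m"
      memory: "512Mi"
    limits:
      cpu: "500m"
      memory: "512Mi"
```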

Guidance for rendering with the upper limit in a single-node configuration

This section provides details of the environments used for rendering with the upper limit in a single-node Kubernetes configuration, along with test results and recommendations for single-node configurations.

Methodology

Overview of DX rendering sizing-performance tests

This sizing work consisted of rendering scenarios of WCM, portlets, and DAM with a rendering setup enabled in AWS/Native-Kubernetes (Kubernetes installed directly in Amazon EC2 instances). A combination run was performed that rendered WCM content, DAM assets, and DX pages and portlets. The load distribution was WCM content (40%), DAM assets (30%), and DX pages and portlets (30%). All systems were pre-populated before performing the rendering tests.

To achieve the maximum throughput for users, an initial set of runs was done with a lower number of users on a single-node setup. Pods were then scaled accordingly.

The following table contains the rendering scenario details for a single-node upper limit configuration.

| Concurrent users | WCM pages | DAM content | Pages and portlets content |
| ---------------- | --------- | ----------- | -------------------------- |
| 10,000 users     | 200       | 25,000      | 80                         |

For more information about the setup of test data, refer to the following:

Environment

This section provides details for the Kubernetes cluster, JMeter, and database.

AWS/Native Kubernetes

  • A Kubernetes platform is running on an AWS Elastic Compute Cloud (EC2) instance with the DX images installed and configured.

  • In AWS/Native Kubernetes, the tests were executed on EC2 instances with a single node instance (starting with c5.2xlarge).

  • The tests used a remote DB2 instance for the core database (c5.2xlarge).

  • [Single-node Configuration] - [c5.9xlarge]

  • The tests started with c5.2xlarge, then c5.4xlarge, and then a c5.9xlarge instance after analyzing test results and observations.

    • Information

    • Processor details

    • Volume details

DB2 instance

  • Remote DB2 - [c5.2xlarge]

  • Processor details

  • Volume details

JMeter agents

  • JMeter instance - [c5.2xlarge]

  • To run the tests, a distributed AWS/JMeter agents setup consisting of one primary and eight subordinates was used.

  • Processor details

  • Volume details

Note

Ramp-up time is 1.5 seconds per user. The test duration is the ramp-up time plus 1 hour at the peak concurrent user load.

DX core tuning for concurrent user run

For tuning details and enhancements done to DX core during testing, see DX core tuning.

Results

The initial test runs were conducted on an AWS-distributed Kubernetes setup with a single node of instance types c5.2xlarge and c5.4xlarge. The system successfully handled concurrent user loads of 1,000, 2,000, 3,000, and 5,000 with a < 0.01% error rate. At 6,000 users, error rates increased dramatically and the response times went up as well. For a response time to be considered optimal, it should be under 1 second.

Later tests were done on a c5.9xlarge instance, and Horizontal Pod Autoscaling (HPA) was enabled for the core, DAM, HAProxy, and ringApi pods with thresholds of 50% CPU utilization and 80% memory utilization. The HPA test run finished successfully with no errors. Through the HPA tests, it was observed that four pods each of core, DAM, and HAProxy and three ringApi pods are required for a successful run with 6,000 concurrent users. With this setup, the test was run for 10,000 concurrent users. At 10,000 concurrent users, there were a few failures caused by the number of ringApi pods intermittently scaling down, so the ringApi pods were scaled to four. The test run with four pods each of core, DAM, HAProxy, and ringApi was successful for 10,000 concurrent users.
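The HPA configuration described above corresponds to a standard autoscaling/v2 HorizontalPodAutoscaler. The manifest below is a minimal sketch with the thresholds used in these tests (50% CPU and 80% memory utilization) and a ceiling of four replicas; the target workload name "core" is an assumption, so point scaleTargetRef at the actual core workload created by the chart and repeat the pattern for the DAM, HAProxy, and ringApi workloads.

```yaml
# Minimal sketch, assuming the core workload is a Deployment named "core".
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: core-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment     # adjust if the chart deploys core as a StatefulSet
    name: core           # assumed name; match the workload in your namespace
  minReplicas: 1
  maxReplicas: 4         # four core pods were sufficient for 10,000 users in these tests
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # CPU threshold used in the tests
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80   # memory threshold used in the tests
```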

Test results were analyzed in Prometheus and Grafana dashboards. CPU usage of the single node averaged 80% in the 10,000 concurrent user tests. Saturation was checked by reducing the load to 5,000, 3,000, and 2,500 users; at these loads, the average node CPU usage was around 70 to 80%. The recommended load is 2,500 users, at which response times are optimal.

Conclusion

This guidance shows the upper limit of a single-node Kubernetes cluster on an AWS c5.9xlarge instance. For single-node (c5.9xlarge) rendering scenarios for DAM, WCM, and pages with portlets, the recommended load is 2,500 concurrent users.

The following table contains the number of pods and the limits for each pod; a sketch for pinning these replica counts follows the table. Using these values significantly improves the responsiveness of the setup and enables the system to handle 2,500 concurrent users.

| Pod Name | Number of Pods | Container | Container Image | Container CPU Request and Limit | Container Memory Request and Limit |
| -------- | -------------- | --------- | --------------- | ------------------------------- | ---------------------------------- |
| core | 4 | core | core | 5000 m | 8000 Mi |
| ringApi | 4 | ringApi | ringApi | 800 m | 512 Mi |
| haproxy | 4 | haproxy | haproxy | 700 m | 1024 Mi |
| digitalAssetManagement | 4 | digitalAssetManagement | digitalAssetManagement | 1000 m | 2048 Mi |
| persistence-connection-pool | 2 | persistence-connection-pool | persistence-connection-pool | 500 m | 512 Mi |
| persistence-node | 2 | persistence-node | persistence-node | 1000 m | 2048 Mi |
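The replica counts in the table above can also be pinned statically instead of relying on HPA. The snippet below is a hedged sketch of a Helm values override expressing them; the key names are illustrative assumptions, so check the hcl-dx-deployment chart's values.yaml for the scaling keys it actually exposes.

```yaml
# Hedged sketch of a values override pinning the replica counts from the table
# above. Key names are illustrative; mirror the structure of the chart's values.yaml.
scaling:
  core: 4
  digitalAssetManagement: 4
  haproxy: 4
  ringApi: 4
  persistenceNode: 2
  persistenceConnectionPool: 2
```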

Note

Performance tuning for a Kubernetes DX cluster must be conducted for the specific workload and number of concurrent users involved. These recommendations are intended as a starting point to speed up that tuning. Refer to the DX Core tuning guide for further enhancements.

Recommendations

  • Currently, default CPU and memory values in the Helm chart are the minimum values for DX to work. For an upper limit on one instance in AWS, the Kubernetes cluster should begin with a single node with at least a c5.9xlarge instance type to support a load of 2,500 users for optimal response time.

  • For testing purposes, the OpenLDAP pod values were adjusted to hold more authenticated users for rendering. However, the OpenLDAP pod is not intended for production use.

A number of alterations were made to the initial Helm chart configuration. The following table contains the number of pods and the limits for each pod in a single-node setup.

| Component | No. of pods | Request CPU (m) | Request Memory (Mi) | Limit CPU (m) | Limit Memory (Mi) |
| --------- | ----------- | --------------- | ------------------- | ------------- | ----------------- |
| contentComposer | 1 | 100 | 128 | 100 | 128 |
| core | 4 | 5000 | 8000 | 5000 | 8000 |
| digitalAssetManagement | 4 | 1000 | 2048 | 1000 | 2048 |
| imageProcessor | 1 | 200 | 2048 | 200 | 2048 |
| openLdap | 1 | 200 | 768 | 200 | 768 |
| persistenceNode | 2 | 1000 | 2048 | 1000 | 2048 |
| persistenceConnectionPool | 2 | 500 | 512 | 500 | 512 |
| ringApi | 2 | 800 | 512 | 800 | 512 |
| runtimeController | 1 | 100 | 256 | 100 | 256 |
| haproxy | 1 | 700 | 1024 | 700 | 1024 |
| licenseManager | 1 | 100 | 300 | 100 | 300 |
| Total | | 30000 | 50860 | 30000 | 50860 |

The totals account for the number of pods of each component (for example, four core pods at 5,000 m each contribute 20,000 m CPU) and fit within the recommended c5.9xlarge node (36 vCPUs and 72 GiB of memory).

Note

The tuned Helm values are those for the pods listed in the Conclusion table; the rest are default minimal values.
