Using Logstash to push pod logs to OpenSearch
This topic outlines the steps for configuring Logstash to push pod logs to OpenSearch.
Introduction
Kubernetes is a widely adopted platform for container orchestration in cloud-native environments. As applications scale, managing the logs they generate is important for monitoring, debugging, and compliance. Logstash is an open source data processing pipeline that collects, processes, and forwards logs from Kubernetes pods to OpenSearch, an open source search and analytics engine.
Overview of Logstash, Filebeat, and OpenSearch
Logstash
Logstash is an open source tool for managing events and logs. It enables the collection, parsing, and transformation of logs before forwarding them to a specified destination (for example, OpenSearch). Logstash supports various input, filter, and output plugins, making it extensible and flexible.
Filebeat
Filebeat is a lightweight, open source log shipper developed by Elastic. It is designed to collect and send data to Elasticsearch or Logstash for indexing and analysis. Filebeat focuses on reading log files and forwarding their contents to a specified destination.
OpenSearch
OpenSearch is a community-driven, open source search and analytics suite derived from Elasticsearch. It provides powerful indexing, search, and visualization capabilities. OpenSearch is ideal for storing, searching, and analyzing large volumes of log data in near real-time.
System architecture
The following flowchart shows how pod logs are processed from Kubernetes to OpenSearch:
```mermaid
flowchart TB
    accTitle: Data flow overview
    accDescr: Flowchart showing how logs are processed from Kubernetes to OpenSearch
    node_1["`**Containers in Pods** generate log data.`"]
    node_2["`**Filebeat** collects logs.`"]
    node_3["`**Logstash** processes and filters logs.`"]
    node_4["`**OpenSearch** stores and indexes the processed logs.`"]
    node_1 --> node_2
    node_2 --> node_3
    node_3 --> node_4
```
Prerequisites
Make sure that you have the following:
- A running Kubernetes cluster.
- An accessible OpenSearch cluster. Deploy an OpenSearch cluster by following the official OpenSearch documentation, and ensure that the cluster is reachable from the Logstash instance.
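As a quick sanity check, you can verify both prerequisites from the host that will run Logstash. The endpoint URL below is a placeholder; substitute your own cluster details:

```shell
# Pre-flight checks before installing Logstash and Filebeat.
# The OpenSearch endpoint below is a hypothetical placeholder.
OS_ENDPOINT="https://opensearch.example.com:443"

# Is the Kubernetes cluster reachable from this host?
if command -v kubectl >/dev/null 2>&1 && kubectl get nodes >/dev/null 2>&1; then
  K8S_OK="yes"
else
  K8S_OK="no"
fi

# Does the OpenSearch endpoint answer? (-k skips certificate verification,
# matching the ssl_certificate_verification => false setting used later
# in the sample Logstash pipeline.)
if command -v curl >/dev/null 2>&1 && curl -sk --max-time 5 "$OS_ENDPOINT" >/dev/null 2>&1; then
  OS_OK="yes"
else
  OS_OK="no"
fi

echo "kubernetes reachable: $K8S_OK, opensearch reachable: $OS_OK"
```

If either check reports "no", resolve connectivity before continuing; the rest of the setup depends on both endpoints being reachable.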
Installing and configuring Logstash
- Install Logstash on a node within the Kubernetes cluster or on a dedicated log processing server. Refer to the official documentation Installing Logstash for more information.
- Configure Logstash for container logs.

  Modify the provided pipeline.conf file according to your requirements. This configuration file defines the input sources, filters, and output destinations for Logstash. Refer to the following sample of a pipeline.conf file:

  ```
  input {
    beats {
      port => 5044
    }
  }

  filter {
    mutate {
      update => { "[host][name]" => "${KUBE_HOSTNAME}" }
    }
  }

  output {
    opensearch {
      hosts => ["${OS_PROTOCOL}://${OS_HOSTNAME}:443"]
      index => "${OS_INDEX_NAME}"
      auth_type => {
        type => "basic"
        user => "${OS_USERNAME}"
        password => "${OPENSEARCH_PASSWORD}"
      }
      ssl => true
      ssl_certificate_verification => false
    }
  }
  ```
Refer to the following list for more information about the parameters in the example:
- Input configuration: Defines the data source. The example is configured to receive data from Filebeat on TCP port 5044.
- Filter: Processes the incoming data. The example uses the mutate filter to modify events. The update option sets the [host][name] field of the event to the value of the KUBE_HOSTNAME environment variable, which is useful for dynamically setting the hostname based on the environment where Logstash is running.
- Output configuration: The opensearch output plugin is configured to send logs to an OpenSearch server. The hosts parameter specifies the server's protocol, hostname, and port. The index parameter names the log index, and the user and password parameters provide authentication credentials. SSL encryption is enabled (ssl => true) with certificate verification disabled (ssl_certificate_verification => false). This configuration ensures that logs are sent to the correct OpenSearch endpoint.
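Logstash resolves the ${VAR} placeholders in pipeline.conf from environment variables, so they must be set before the pipeline starts. A minimal sketch, using hypothetical values:

```shell
# Hypothetical values -- substitute your own deployment details.
export KUBE_HOSTNAME="worker-node-1"
export OS_PROTOCOL="https"
export OS_HOSTNAME="opensearch.example.com"
export OS_INDEX_NAME="pod-logs"
export OS_USERNAME="admin"
export OPENSEARCH_PASSWORD="changeme"

# With these values, the opensearch output plugin targets this endpoint:
echo "${OS_PROTOCOL}://${OS_HOSTNAME}:443"

# Start Logstash with the pipeline (the binary path depends on your installation):
# bin/logstash -f pipeline.conf
```

In practice these variables are usually supplied through a systemd unit, container environment, or Kubernetes Secret rather than exported interactively.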
Installing and creating the Filebeat configuration file
- Install Filebeat on a node within the Kubernetes cluster or on a dedicated log processing server. Refer to the official documentation Filebeat quick start: installation and configuration for more information.
- Create the Filebeat configuration file.

  Create the filebeat-conf.yml file according to your requirements. This file specifies the log files to read from Kubernetes and the output destination to send them to. Refer to the following sample of a filebeat-conf.yml file:

  ```yaml
  filebeat.autodiscover:
    providers:
      - type: kubernetes
        hints.enabled: true
        kube_config: /home/centos/.kube/config
        templates:
          - condition:
              equals:
                kubernetes.namespace: "dxns"
            config:
              - type: container
                paths:
                  - /var/log/containers/*${data.kubernetes.container.id}.log
                fields:
                  pod_name: ${data.kubernetes.pod.name}

  output.logstash:
    hosts: ["${KUBE_HOSTNAME}:5044"]
  ```

  This sample configuration tells Filebeat to read all .log files in the /var/log/containers directory and send the log data to a Logstash instance running on ${KUBE_HOSTNAME}:5044. The kube_config file is the default way to authenticate to a Kubernetes cluster. By default, the kube_config file is located in the ~/.kube directory.
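Because Filebeat authenticates to the cluster through the kubeconfig file, it is worth confirming that the file referenced by kube_config exists before starting Filebeat. A small sketch of that check:

```shell
# Resolve the kubeconfig location: honor the KUBECONFIG variable if set,
# otherwise fall back to the default ~/.kube/config path.
KUBECONFIG_PATH="${KUBECONFIG:-$HOME/.kube/config}"

if [ -f "$KUBECONFIG_PATH" ]; then
  echo "kubeconfig found: $KUBECONFIG_PATH"
else
  echo "kubeconfig missing: $KUBECONFIG_PATH -- update kube_config in filebeat-conf.yml"
fi
```

If the file lives elsewhere (as in the sample, which points at /home/centos/.kube/config), set kube_config to that path explicitly.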
Managing indexes in OpenSearch
Configure index management policies to handle the creation, rollover, and deletion of indexes. This helps manage storage efficiently and maintain optimal performance. For detailed information, refer to the official OpenSearch documentation:
- Managing Indexes in OpenSearch
- OpenSearch Index Templates
- OpenSearch Index Lifecycle Management (ILM)
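As an illustration, an Index State Management (ISM) policy can automate rollover and deletion. The thresholds below are hypothetical example values, not recommendations; tune them to your retention requirements:

```json
{
  "policy": {
    "description": "Roll over pod-log indexes daily, delete after 30 days (example values)",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [
          { "rollover": { "min_index_age": "1d" } }
        ],
        "transitions": [
          { "state_name": "delete", "conditions": { "min_index_age": "30d" } }
        ]
      },
      {
        "name": "delete",
        "actions": [ { "delete": {} } ],
        "transitions": []
      }
    ]
  }
}
```

A policy like this is created through the ISM plugin API and attached to indexes through an index template so that newly rolled-over indexes inherit it automatically.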
Monitoring and debugging
This section contains recommendations for monitoring and debugging log data.
- Use OpenSearch Dashboards to visualize log data. Use the available filters to check specific deployment host names and pods in the log data. Viewing data in this format is more user-friendly and helps you reach conclusions more quickly than reading a regular server SystemOut format.
- Debug server issues using metrics and logs.
Case study results
The recommended architecture was implemented in a DX deployment. The following improvements were observed:
- Reduced effort when debugging logs for root cause analysis.
- Improved log visibility, faster incident response times, and better overall system reliability.
Conclusion
Using Logstash to push Kubernetes pod logs to OpenSearch provides a robust and scalable solution for log management. Follow the setup and best practices outlined in this topic to enhance observability and improve the operational efficiency of your log management.
For more information, refer to the official documentation for Logstash, Filebeat, OpenSearch, and Kubernetes.