What is the ELK stack?
The ELK stack is a set of tools for searching, analyzing, and visualizing large volumes of data in real time. It is composed of three main components:
- Elasticsearch [https://www.elastic.co/elasticsearch]
- Logstash [https://www.elastic.co/logstash]
- Kibana [https://www.elastic.co/kibana]
What is it used for?
- aggregating logs from all systems and applications
- log analytics
- visualizations for application and infrastructure monitoring, faster troubleshooting, security analytics, etc.
image source: https://www.guru99.com/
Logstash
- Server-side data processing pipeline that ingests data from multiple sources simultaneously, transforms it, and then sends it to a "stash" such as Elasticsearch.
- Supports a variety of input sources, such as:
- log files (log shipper)
- databases
- message queues
- Allows for complex data transformations and filtering
- Makes it easy to transform source data and load it into an Elasticsearch cluster
Logstash configuration examples:
# Sample Logstash configuration for a simple
# file -> Logstash -> Elasticsearch pipeline.
input {
  file {
    path => "/home/my-app/.pm2/logs/my-app-out.log"
    start_position => "beginning"
    sincedb_path => "/opt/logstash/sincedb-access"
  }
}

filter {
  grok {
    match => { "message" => "%{DATA:timestamp} - info: processRequestMain: my-product: (input|output) sessionid = \{%{GREEDYDATA:session_id}\} (reqXml|resXml) = %{GREEDYDATA:content_xml}" }
  }
  # Drop events that did not match the grok pattern
  if "_grokparsefailure" in [tags] {
    drop { }
  }
  # Parse the captured XML payload into a structured field
  xml {
    source => "content_xml"
    target => "content"
  }
  # Emit one event per element of the nested app array
  split {
    field => "[content][app]"
  }
  # Tag each event with environment metadata
  mutate {
    add_field => {
      "env" => "${MYAPP_ENV}"
      "instance_id" => "${MYAPP_INSTANCE_ID}"
    }
  }
}

output {
  amazon_es {
    hosts => [ "search-myapp-dev-af6m6cidasgqsnmskxup2fh57y.us-east-1.es.amazonaws.com" ]
    region => "us-east-1"
    index => "logstash-myapp-%{+YYYY.MM.dd}"
  }
}
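To sanity-check what the grok pattern above actually captures, here is a small Python sketch that applies an equivalent regular expression (grok's DATA maps to `.*?`, GREEDYDATA to `.*`) to a sample log line; the line and its values are invented for illustration, shaped to match the pattern:

```python
import re

# Regex equivalent of the grok pattern in the Logstash config above.
PATTERN = re.compile(
    r"(?P<timestamp>.*?) - info: processRequestMain: my-product: "
    r"(?:input|output) sessionid = \{(?P<session_id>.*)\} "
    r"(?:reqXml|resXml) = (?P<content_xml>.*)"
)

# Hypothetical log line matching the expected format
sample = ("2024-01-15 10:32:01 - info: processRequestMain: my-product: "
          "input sessionid = {abc-123} reqXml = <app><name>demo</name></app>")

m = PATTERN.match(sample)
if m:
    print(m.group("timestamp"))    # 2024-01-15 10:32:01
    print(m.group("session_id"))   # abc-123
    print(m.group("content_xml"))  # <app><name>demo</name></app>
```

Lines that fail to match would get the `_grokparsefailure` tag in Logstash, which is why the config drops them early.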
Elasticsearch
- Distributed, RESTful search and analytics engine
- Built on Apache Lucene
- Used for storing (it is basically a Database), searching, and analyzing large volumes of data (e.g. logs) quickly and in near real-time
- Scalable, fast, and able to handle complex queries
- Distributed under the Elastic License (source-available, not OSI-approved open source)
- OpenSearch is an open-source alternative (forked and supported by AWS)
- Fluentd is an open-source data collection alternative (to Logstash)
- Data in the form of JSON documents is sent to Elasticsearch using:
- API
- Ingestion tools
- Logstash - e.g. pushing parsed logs to Elasticsearch
- Amazon Kinesis Data Firehose
- The original document is automatically stored and a searchable reference is added to the document in the cluster’s index
- Elasticsearch's REST-based API is used to manipulate documents:
- send
- search
- retrieve
- Uses schema-free JSON documents
- Distributed system
- Enables it to process large volumes of data in parallel, quickly finding the best matches for your queries
- Operations such as reading or writing data usually complete in under a second, so Elasticsearch suits near real-time use cases such as application monitoring and anomaly detection
- Has client support for many languages: Java, Python, PHP, JavaScript (Node.js), Ruby, etc.
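Because documents are sent as JSON over the REST API, a common ingestion pattern is the `_bulk` endpoint, which accepts NDJSON: one action line followed by one document line per entry, with a trailing newline. The sketch below only builds such a body (the index name and documents are hypothetical); actually POSTing it to a cluster at `http://<host>:9200/_bulk` with the `Content-Type: application/x-ndjson` header is left out:

```python
import json

def bulk_index_body(index, docs):
    """Build an NDJSON body for Elasticsearch's _bulk API:
    an action line followed by the document source, one pair per doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # _bulk requires a trailing newline

# Hypothetical documents and index name
docs = [
    {"message": "request served", "status": 200},
    {"message": "timeout", "status": 504},
]
body = bulk_index_body("logstash-myapp-2024.01.15", docs)
print(body)
```

Batching documents this way is much cheaper than one HTTP request per document, which is why log shippers and Logstash outputs use it under the hood.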
Kibana
- Visualisation and reporting tool
- Used with Elasticsearch to:
- visualize the data
- build interactive dashboards
Filebeat
- https://www.elastic.co/beats/filebeat
- log shipper
- both Filebeat and Logstash can be used to send logs from a file-based data source to a supported output destination
- Filebeat is a lightweight option, ideal for environments with limited resources and basic log parsing needs. Conversely, Logstash is tailored for scenarios that demand advanced log processing
- Filebeat and Logstash can also be used in tandem when building a logging pipeline with the ELK Stack, since each serves a different function: Filebeat ships the logs, Logstash parses and transforms them
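A minimal Filebeat-to-Logstash setup might look like the sketch below (the paths and the Logstash host are assumptions for illustration): Filebeat tails the files and forwards raw lines, while Logstash listens on its beats input port and does the heavy parsing.

```yaml
# filebeat.yml (sketch - paths and host are assumptions)
filebeat.inputs:
  - type: filestream
    paths:
      - /home/my-app/.pm2/logs/*.log

output.logstash:
  hosts: ["localhost:5044"]
```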