Friday 16 February 2024

Introduction to ELK Stack





What is ELK stack?

ELK stands for Elasticsearch, Logstash and Kibana, the three core components described below.

What is it used for?
  • aggregating logs from all systems and applications
  • log analytics
  • visualizations for application and infrastructure monitoring, faster troubleshooting, security analytics etc.

Elasticsearch


  • distributed search and analytics engine built on Apache Lucene
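  • source-available rather than open source: since version 7.11 it is released under the Elastic License and SSPL instead of Apache 2.0
    • OpenSearch is an open-source alternative (a fork backed by AWS)
    • Fluentd is an open-source alternative on the data collection side (i.e. to Logstash, not to Elasticsearch itself)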
  • data in the form of JSON documents is sent to Elasticsearch using:
    • API
    • ingestion tools
      • Logstash
      • Amazon Kinesis Data Firehose
  • the original document is stored automatically, and a searchable reference to it is added to the cluster’s index
  • the Elasticsearch REST-based API is used to work with documents:
    • send
    • search
    • retrieve 
  • uses schema-free JSON documents
  • distributed system
    • enables it to process large volumes of data in parallel, quickly finding the best matches for your queries
  • operations such as reading or writing data usually take less than a second to complete, so Elasticsearch can be used for near real-time use cases such as application monitoring and anomaly detection
  • has client libraries for various languages: Java, Python, PHP, JavaScript (Node.js), Ruby etc.
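The send / search / retrieve operations listed above can be sketched with Python's standard library. This is only an illustration: the base URL (a local, unsecured node), the index name `app-logs` and the document ID `1` are assumptions, and the requests are built but not sent so that the sketch runs without a cluster.

```python
import json
from urllib.request import Request

# Assumption: a local, unsecured Elasticsearch node; adjust for a real cluster.
BASE_URL = "http://localhost:9200"
INDEX = "app-logs"  # hypothetical index name


def build_request(method, path, body=None):
    """Build (but do not send) a JSON request for the Elasticsearch REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return Request(
        f"{BASE_URL}{path}",
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


# send: index a JSON document under an explicit ID
send = build_request(
    "PUT", f"/{INDEX}/_doc/1", {"level": "error", "message": "connection timeout"}
)

# search: full-text match query against one field
search = build_request(
    "POST", f"/{INDEX}/_search", {"query": {"match": {"message": "timeout"}}}
)

# retrieve: fetch the stored document by its ID
retrieve = build_request("GET", f"/{INDEX}/_doc/1")

for req in (send, search, retrieve):
    print(req.get_method(), req.full_url)
```

To actually execute a request, pass it to `urllib.request.urlopen(req)` against a running node.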


Logstash


  • log shipper
  • transforms source data and loads it into an Elasticsearch cluster
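To make "transform and load" concrete, here is a rough Python sketch of the kind of work a shipper does: parse a raw log line into a JSON document and serialise a batch for the Elasticsearch `_bulk` endpoint. It is not Logstash itself; the log line format, the field names and the index name `app-logs` are assumptions chosen for illustration.

```python
import json
import re

# Assumed line format: "<ISO timestamp> <LEVEL> <free-text message>"
LINE_RE = re.compile(r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$")


def parse_line(line):
    """Turn one raw log line into a JSON-ready dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    doc = match.groupdict()
    doc["level"] = doc["level"].lower()  # normalise the field, as a filter stage might
    return doc


def to_bulk_body(docs, index):
    """Serialise documents as newline-delimited JSON for the _bulk endpoint."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))  # document line
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


raw = [
    "2024-02-16T10:15:00Z ERROR connection timeout",
    "not a structured line",  # skipped: does not match the assumed format
]
docs = [d for d in map(parse_line, raw) if d is not None]
print(to_bulk_body(docs, "app-logs"))
```

The resulting body would be sent with a POST to `/_bulk` and the `Content-Type: application/x-ndjson` header.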

Filebeat


  • https://www.elastic.co/beats/filebeat
  • log shipper
  • both Filebeat and Logstash can be used to send logs from a file-based data source to a supported output destination
  • Filebeat is a lightweight option, ideal for environments with limited resources and basic log parsing needs. Conversely, Logstash is tailored for scenarios that demand advanced log processing
  • Filebeat and Logstash can also be used in tandem when building a logging pipeline with the ELK Stack because they serve different functions: Filebeat collects and forwards logs, Logstash processes them

Kibana


  • visualisation and reporting tool
  • used with Elasticsearch to:
    • visualize the data
    • build interactive dashboards

Friday 2 February 2024

Installing Graphviz on macOS

I wanted to test the terraform graph command (see "Command: graph | Terraform | HashiCorp Developer") by changing into an arbitrary Terraform module directory and executing:

% terraform graph -type=plan | dot -Tpng >graph.png

But this failed with an error:

zsh: command not found: dot

Solution:

% brew install graphviz  
...
==> Installing graphviz
==> Pouring graphviz--9.0.0.arm64_ventura.bottle.tar.gz
🍺  /opt/homebrew/Cellar/graphviz/9.0.0: 287 files, 7.1MB
==> Running `brew cleanup graphviz`...
Disable this behaviour by setting HOMEBREW_NO_INSTALL_CLEANUP.
Hide these hints with HOMEBREW_NO_ENV_HINTS (see `man brew`).


To verify installation:

% dot --version 
dot - graphviz version 9.0.0 (20230911.1827)
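The same check can be scripted, for example before calling terraform graph from automation. A minimal sketch, assuming only that the Graphviz binary is named `dot` and accepts `--version` as shown above:

```python
import shutil
import subprocess


def find_dot():
    """Return the full path of the Graphviz `dot` binary, or None if it is not on PATH."""
    return shutil.which("dot")


dot_path = find_dot()
if dot_path is None:
    print("dot not found; on macOS with Homebrew: brew install graphviz")
else:
    # Print the version banner from whichever stream carries it
    result = subprocess.run([dot_path, "--version"], capture_output=True, text=True)
    print((result.stderr or result.stdout).strip())
```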

Wednesday 15 November 2023

Pingdom vs Site24x7 - Feature Comparison

 

| Pingdom | Site24x7 | Note |
|---|---|---|
| Check | (Website) Monitor | |
| Name of check | Display Name | |
| URL/IP | Web page URL | |
| Check interval | Check Frequency | Default is 1 minute in both |
| Test from | Monitoring Locations | |
| Port | N/A for Website Monitors but is available for e.g. Port Monitor (Custom Protocol) | If the website URL starts with http://, port 80 is used; if https://, port 443 is used. |
| User name | Web Credentials | |
| Password | Web Credentials | |
| Check for string: Should contain | Should contain string(s) | Pingdom: if this text is missing from the page, the site will be considered as down. Site24x7 offers flagging this as Trouble or Down. Provide a space-separated list of strings, all of which must be present in the response. Specify each string within double quotes. Example: {"status":"ok"} needs to be entered as {\"status\":\"ok\"}. In Terraform: "\"{\\\"status\\\":\\\"ok\\\"}\"" |
| Check for string: Should not contain | Should not contain string(s) | Site24x7 offers flagging this as Trouble or Down |
| N/A | Case sensitive | |
| N/A | Should match regular expression | Site24x7 offers flagging this as Trouble or Down |
| N/A | Should contain HTTP Response Header(s) | Site24x7 offers flagging this as Trouble or Down |
| POST data | HTTP Method = POST; Request Body: FORM, Text, XML, JSON | Send data to website via POST method, one per line. E.g. j_username=joe |
| N/A | HTTP Method: POST, GET, HEAD | |
| Request headers | HTTP Request Headers | |
| N/A (included in Req. headers) | User Agent | |
| | Authentication Method | Basic / NTLM, OAuth, Web Token |
| N/A | Client Certificate | Only PKCS #12 files are supported |
| N/A | Query Authoritative Name Server | |
| N/A | Force Domain / IP Addresses | |
| N/A | Accepted HTTP Status Codes | |
| N/A | Follow HTTP Redirection | |
| Monitor SSL/TLS certificate | Trust the Server SSL Certificate | |
| N/A | SSL Protocol | SSL version |
| N/A | HTTP Protocol | HTTP version |
| N/A | Enable ALPN | |
| Consider down prior to certificate expiring | | |
| Use IPv6 | Prefer IPv6 | |
| | Connection Timeout (e.g. 10 secs) (socket connection) | |
| | Monitor Groups | |
| | Dependent on Monitor | |
| | Tags | |
| | IT Automation Templates | |
| | Execute IT Automation during Scheduled Maintenance | |
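The two levels of escaping in the "Should contain string(s)" note can be reproduced with a small Python sketch. The helper name `quote` is hypothetical, and the sketch only handles backslashes and double quotes; Terraform template sequences such as `${` are not covered.

```python
def quote(value):
    """Wrap value in double quotes, escaping backslashes and embedded double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


expected = '{"status":"ok"}'

# Level 1: the value Site24x7 expects (string in double quotes, inner quotes escaped)
site24x7_value = quote(expected)

# Level 2: the same value written as a quoted Terraform string literal
terraform_literal = quote(site24x7_value)

print(site24x7_value)
print(terraform_literal)
```

The second printed line is the Terraform string quoted in the note: each level wraps the previous value in quotes and escapes what is already there, which is why the backslashes multiply.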




Alerts



Pingdom

Site24x7

Note

Alerting Settings

(within Check settings)

Threshold Profile

(referenced in Monitor)


N/A

Monitor Type 

(e.g. Website)


N/A

Display Name


(Pingdom uses 1 Second Opinion probe server)

Number of locations to report monitor as down


N/A

Notify when website content is modified (yes/no)

Notify as:

  • Down

  • Trouble 

  • Critical


Check importance 

(High or Low)

User profile >> Alert Settings


Choose notification mode:

  • Email

  • SMS

  • Voice Call

…for alert severity:

  • Down 

  • Critical

  • Trouble

  • Up


Who to alert?

User Alert Group


Consider down after:__ sec

(e.g. 30 sec timeout)


(response time)

Notify in case of read timeout (yes/no)


Response receive timeout period is 30 seconds and it can’t be changed.

Notify as:

  • Down

  • Trouble 

  • Critical


When down, alert after

(e.g. 2 mins)

Notification Profile >> Notification Delay


When the status is:

Down

Trouble 

Critical


…Notification Delay:

  • Notify immediately after failure

  • Notify after [2-5] continuous failures


Resend alert every

Notification Profile >> Persistent Alert


Notify After Every: __th error



Customized message



Alert when back up

Notification Profile >> Alert Configuration


When the status is:

  • Any

  • Down 

  • Critical

  • Trouble

  • Up



Connect Integrations

Third-Party Integrations

  • Webhooks

  • Microsoft Teams