
Logging

Logs help DevOps teams catch issues before they escalate. Detailed logging also simplifies troubleshooting. In addition, DevOps teams can use log data to build a more user-friendly product or service by exploring user engagement and detecting anomalies.

The Benefits of Logging

Logging allows you to report and persist error, warning, and info messages, which can later be retrieved and analyzed. Applications are often deployed in places where you don’t have access or where no debugging tools are available. Even then, log messages can still help you localize the problem. When metrics alert you that something is wrong, logs are the last and most important layer to turn to when you need to find out why.


At stack.io we can:

  • Set up centralized logging infrastructure so there’s one place to view and quickly search all your logs, e.g. application logs, infrastructure (host/OS) logs, Kubernetes (API server and all other components), ELB, NGINX, network, and CloudTrail.

  • Set up logging dashboards that allow you to visualize what’s happening with your infrastructure.

  • Set up log routing/forwarding from new solutions to existing log systems such as Splunk and CloudWatch.

  • Recommend best practices for log formatting, so you end up with a log template that makes debugging easier and captures useful troubleshooting data in both application and infrastructure logs.

  • Apply best practices to logging, including but not limited to: structured logging, building meaning and context into log messages, avoiding logging non-essential or sensitive data, capturing logs from diverse sources, aggregating and centralizing your log data, indexing logs for querying and analytics, configuring real-time log monitoring and alerts, and optimizing your log retention policy.
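
To make a few of these practices concrete, here is a minimal sketch of structured logging with context and redaction of sensitive fields, using only Python's standard `logging` and `json` modules. The field names, the `SENSITIVE_KEYS` set, the `context` key, and the `checkout` logger are all hypothetical choices made for illustration, not a prescribed stack.io template.

```python
import io
import json
import logging

# Hypothetical list of field names to treat as sensitive; adjust to your data.
SENSITIVE_KEYS = {"password", "token", "credit_card"}


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        # Context arrives via `extra={"context": {...}}` on the logging call.
        context = getattr(record, "context", {})
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Mask sensitive values instead of writing them to the log.
            "context": {
                key: "[REDACTED]" if key in SENSITIVE_KEYS else value
                for key, value in context.items()
            },
        }
        return json.dumps(entry)


def build_logger(name: str, stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def demo() -> dict:
    """Log one event to an in-memory buffer and return it parsed."""
    buffer = io.StringIO()
    logger = build_logger("checkout", buffer)
    logger.warning(
        "payment retry",
        extra={"context": {"order_id": "A-1001", "attempt": 2, "token": "secret-value"}},
    )
    return json.loads(buffer.getvalue())
```

Because every entry is a JSON object with consistent keys, a centralized log system can index fields such as `level` or `context.order_id` for querying and alerting, rather than relying on free-text pattern matching.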





DevOps Maturity

Where does your setup fit on our DevOps maturity scale?

+ Poor

  • When an incident happens, it’s difficult for us to find the information we need.
  • We have to check each system and log individually to find things, and we often come up empty-handed.
  • Our logs frequently fill up our disks, which causes outages.

+ Fair

  • Our team knows where to find logs, but investigating still takes time, and we have to use tools like awk/grep/sed to get the information we need.

+ Good

  • All of our logs go to a centralized location and are searchable.

+ Great

  • We have dashboards to give an overview of what’s happening, and where, at a glance. Even non-technical staff are able to easily find the information they need.

  • We have alerts set up to notify us of problems in our logs.