Deep Dive: Monitoring Anomalies with Grok

How does Grok detect incidents using performance data from your IT environment?

As soon as you add a monitoring target to Grok, the platform begins building cognitive models for each metric provided and surfaces insights once it has grokked enough information from the service. Any detected anomalies appear on the charts page.

Grok provides charts for every service monitored by the platform, making it easy to assess the health of your environment. Tabs let you switch the timeframe shown, and the options let you sort the charts and filter them by the custom dashboards you create.

The charts view shows all monitored services, and you can organize the page by sorting and filtering. It is useful for teams that want an at-a-glance view of the current health of their services. Clicking the blue bar collapses a chart, so you can keep open only the metrics you want to analyze further.

Woohoo customization!

Anomalies detected on each metric are aggregated to give a quick view of the health of a given service, and selecting one of these charts reveals the individual metric charts so you can analyze the detected issue further. For example, this chart shows a test WordPress instance with a detected issue that results from anomalies on both the CPU utilization and network traffic metrics.
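To make the idea of rolling metric-level anomalies up into a service-level view concrete, here is a minimal sketch. It assumes each metric already has an anomaly score between 0 and 1 per time step and that the service-level score is simply the highest metric score at that step. The function names, the 0.9 threshold, and the max rule are illustrative assumptions, not a description of how Grok combines metrics internally.

```python
from typing import Dict, List


def aggregate_service_anomaly(metric_scores: Dict[str, List[float]]) -> List[float]:
    """Combine per-metric anomaly scores into one service-level series.

    Assumption for illustration: the service score at each time step is the
    maximum score across its metrics.
    """
    if not metric_scores:
        return []
    lengths = {len(series) for series in metric_scores.values()}
    if len(lengths) != 1:
        raise ValueError("all metric series must cover the same time steps")
    return [max(step) for step in zip(*metric_scores.values())]


def contributing_metrics(
    metric_scores: Dict[str, List[float]], index: int, threshold: float = 0.9
) -> List[str]:
    """Return the metrics whose score at a given time step meets the threshold."""
    return sorted(
        name for name, series in metric_scores.items() if series[index] >= threshold
    )


# Hypothetical scores for a service with two metrics over four time steps.
scores = {
    "cpu_utilization": [0.1, 0.2, 0.95, 0.3],
    "network_traffic": [0.1, 0.1, 0.4, 0.97],
}
service_scores = aggregate_service_anomaly(scores)
```

With this shape, the service-level series tells you when something looks wrong, and `contributing_metrics(scores, 2)` tells you which metric chart to open first.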

Grok has identified anomalies in CPU utilization around midnight, followed by abnormal behavior in network traffic. Note how Grok alerts on patterns of behavior and not just spikes of activity: the pattern of network activity during the afternoon two days prior was treated as normal.
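The difference between alerting on patterns and alerting on spikes can be illustrated with a deliberately simple stand-in. The sketch below is not Grok's cognitive modeling. It swaps in a much cruder technique, a per-hour-of-day baseline, purely to show why the same reading can be normal in the afternoon and abnormal at midnight. All names, values, and the tolerance settings are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Reading = Tuple[int, float]  # (hour of day 0-23, metric value)


def learn_hourly_baseline(history: List[Reading]) -> Dict[int, Tuple[float, float]]:
    """Learn the typical value and spread of a metric for each hour of the day."""
    by_hour: Dict[int, List[float]] = {}
    for hour, value in history:
        by_hour.setdefault(hour, []).append(value)
    return {hour: (mean(vals), pstdev(vals)) for hour, vals in by_hour.items()}


def is_unusual(
    reading: Reading,
    baseline: Dict[int, Tuple[float, float]],
    tolerance: float = 3.0,
    min_spread: float = 1.0,
) -> bool:
    """Flag a reading that deviates from what is typical for its hour."""
    hour, value = reading
    if hour not in baseline:
        return False  # no history for this hour, so nothing to compare against
    typical, spread = baseline[hour]
    return abs(value - typical) > tolerance * max(spread, min_spread)


# Hypothetical network traffic history: busy afternoons, quiet midnights.
history = [(14, 800.0), (14, 820.0), (14, 790.0), (0, 50.0), (0, 55.0), (0, 45.0)]
baseline = learn_hourly_baseline(history)

afternoon_flagged = is_unusual((14, 810.0), baseline)
midnight_flagged = is_unusual((0, 810.0), baseline)
```

A fixed threshold of, say, 500 would flag both readings. The baseline comparison flags only the midnight one, which is the kind of context-aware behavior the charts above show.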

Because Grok learns the typical traffic patterns of a service, it flags this behavior whether it stems from a sudden surge in traffic or from an erroneous code push. If Grok is monitoring the front-end traffic via the metric streamer, you can quickly verify whether this was due to a traffic spike or a potential incident. Either way, Grok deemed the behavior abnormal in light of what it had learned about the service since monitoring began.

If your company uses a custom web dashboard, you can embed the chart using HTML. Grok generates embed code that works only with a specific URL, so your data is secure as long as the web destination you choose is secure as well.

Embed code contains a unique hash that must match what Grok has stored, so the data is secure if the endpoint is secure.
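One common way to tie an embed to a single URL is a keyed hash over the chart and the allowed page. The sketch below shows that general technique using Python's standard `hmac` and `hashlib` modules. It is an assumption about how such a check could work, not Grok's actual scheme, and the secret, chart ID, and URLs are all made up for illustration.

```python
import hashlib
import hmac


def make_embed_hash(secret: bytes, chart_id: str, allowed_url: str) -> str:
    """Derive a hash that binds a chart to the one URL allowed to embed it."""
    message = f"{chart_id}|{allowed_url}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def is_embed_allowed(
    secret: bytes, chart_id: str, requesting_url: str, presented_hash: str
) -> bool:
    """Check that the hash in the embed code matches the requesting page."""
    expected = make_embed_hash(secret, chart_id, requesting_url)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, presented_hash)


# Hypothetical values for illustration only.
secret = b"server-side-secret"
embed_hash = make_embed_hash(secret, "chart-42", "https://dash.example.com/ops")

same_page = is_embed_allowed(
    secret, "chart-42", "https://dash.example.com/ops", embed_hash
)
other_page = is_embed_allowed(
    secret, "chart-42", "https://elsewhere.example.net/", embed_hash
)
```

Under this scheme, copying the embed code to a different page fails the check, which is why the security of the data depends on the security of the page you chose.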

We hope the Grok dashboard and chart views provide a simple way to monitor the health of your environments, and we look forward to hearing ideas on how to expand the capability further. Our goal was to create a simple experience that lets you begin understanding an environment quickly, without requiring a data science background. With the Grok API, you can also retrieve data from your environment for use in other tools. Let us know in the comments which monitoring and analysis tools you use with Grok data!

Tarun Gangwani is Head of Product at Grok. He is an award-winning product and design professional whose work has been used by millions of people around the world. With his background in cognitive science and design, Tarun has delivered user-centered solutions to startups and enterprise companies within a wide variety of industries that leverage cloud technologies to deliver innovation to their clients. Tarun’s perspectives and work have been featured in major news publications, including the New York Times, CIO.com, Tech.Co and Forbes.
