Updates, Ideas and Messages from the Grok Team

What is AIOps?

A quick primer

Today’s IT organizations struggle to keep up with the pace of innovation required in a cloud-first world. Cloud computing has made it easier than ever to deliver applications at scale, but it comes at a cost. Applications run on complex environments with many points of failure. Identifying where to start in an IT incident process can lead down several rabbit holes, creating management fatigue. Worse, operating expenses can rise rapidly, making budgets more painful every year.

By 2020, on-premises IT will start to become the minority, but the rapid rise of public cloud does not remove spend entirely. Rather, most spend just shifts to cloud-based resources, so OpEx is still not reduced.

DevOps practices helped curb these costs, but they also led to a proliferation of tools that come with new challenges. To reduce OpEx, companies hired for DevOps to reduce staff and streamline operations. As if architecting systems were not hard enough, maintaining these systems adds another strenuous challenge. Monitoring and analytics tools try to ease the pain, but most add extra annoyances by firing arbitrary alerts against the data observed from these systems.

While IT operations analytics (ITOA) provides a basis for decreasing OpEx, no solution is actually helping to reduce cost and provide value to IT in a truly impactful way.

With machine learning, alerting tools can provide better insights for DevOps and IT as a whole. Rather than monitoring tools telling teams what they already know (your app is down), teams can use ML as a buffer between monitoring data and DevOps. Combined with automation, DevOps teams need not lift a finger during an IT incident: the system recognizes a dangerous event and responds automatically.

AIOps provides a framework for bringing this kind of intelligence and automation to IT. Teams that adopt an AIOps discipline can help DevOps work smarter, not harder. Rather than waiting for fires to happen, AIOps encourages a proactive culture within an IT organization.

Grok uses machine learning to sense issues in cloud environments or other streaming data sources to predict major cloud incidents. Once Grok finds an issue, it uses automation with tools you trust to proactively mitigate issues before they become an IT incident.

Grok enables this proactive behavior using industry-leading machine learning and automation. Grok senses behaviors that lead to downtime using anomaly detection, which finds outliers in app and device data. Once Grok finds anomalous behavior, it uses hooks into tools you trust to take action quickly.
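To make the idea of anomaly detection plus an automation hook concrete, here is a minimal sketch in Python. It is not Grok's algorithm: it assumes a simple rolling z-score as the outlier test, and the names `detect_anomalies`, `window`, `threshold`, and `on_anomaly` are hypothetical, chosen only for illustration.

```python
from collections import deque
from statistics import mean, stdev
from typing import Callable, Iterable, List, Tuple


def detect_anomalies(
    values: Iterable[float],
    window: int = 20,
    threshold: float = 3.0,
    on_anomaly: Callable[[int, float, float], None] = lambda i, v, z: None,
) -> List[Tuple[int, float]]:
    """Flag points that sit far from the recent rolling baseline.

    Illustrative assumption: a rolling z-score stands in for whatever
    model a real AIOps product would use.
    """
    history: deque = deque(maxlen=window)  # most recent "normal" values
    anomalies: List[Tuple[int, float]] = []

    for i, value in enumerate(values):
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            # A perfectly flat baseline gives sigma == 0; skip scoring then.
            if sigma > 0:
                z = (value - mu) / sigma
                if abs(z) > threshold:
                    anomalies.append((i, value))
                    on_anomaly(i, value, z)  # hypothetical automation hook
                    continue  # keep the outlier out of the baseline
        history.append(value)

    return anomalies


if __name__ == "__main__":
    # Hypothetical CPU-utilization samples with one obvious spike.
    samples = [10.0, 11.0] * 10 + [50.0] + [10.0, 11.0] * 2

    def page_on_call(index: int, value: float, z: float) -> None:
        # Stand-in for a call into a ticketing or remediation tool.
        print(f"anomaly at sample {index}: value={value}, z-score={z:.1f}")

    detect_anomalies(samples, on_anomaly=page_on_call)
```

The callback is where a call into an existing incident or remediation tool would go. Keeping flagged points out of the baseline is one design choice among several; it stops a single spike from inflating the threshold for the samples that follow.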

Want to learn more? Check out our whitepaper on AIOps, available today! It’s free to download and lays out our vision for a more efficient, powerful IT operations practice.

Get the Whitepaper

Happy Building!

Tarun Gangwani is Head of Product at Grok. He is an award-winning product and design professional whose work has been used by millions of people around the world. With his background in cognitive science and design, Tarun has delivered user-centered solutions to startups and enterprise companies within a wide variety of industries that leverage cloud technologies to deliver innovation to their clients. Tarun’s perspectives and work have been featured in major news publications, including the New York Times, Tech.Co and Forbes.
