How OpsClarity Leverages Data Science and AI to Improve Automation and Root-cause Analysis

Data science, artificial intelligence (AI) and machine learning have transformed e-commerce, personalization, digital marketing, search engines, etc. Large-scale analysis of data has become a powerful tool for businesses to create competitive differentiation. Seemingly left out of this, however, is IT operations - the place where all this incredible computation for data science takes [...]

By | December 13th, 2016|Data Science, Machine Learning|0 Comments

Optimizing Alerts on Free space on disks using Machine Learning

The available space on the disk (diskfree) has a significant and often catastrophic impact on applications and services running on the system. For this reason, every DevOps engineer knows that it is crucial to carefully monitor disk usage in all critical systems, especially ones that tend to rapidly use up disk space, such as heavily [...]

By | November 10th, 2016|Analytics, Data Science, Machine Learning|0 Comments

Leveraging AI and Machine Learning to tease out Seasonality patterns

The standard models, such as SMA, EWMA, etc., fail in the presence of trend or seasonality conditions. We have seen the effect of trend with metrics representing queue size. Many metrics representing important business concerns also exhibit strong seasonal behavior. For example, the number of active users on an e-commerce site shows both daily and [...]
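To illustrate the claim about trend, here is a minimal sketch (not OpsClarity's implementation) that runs one-step-ahead SMA and EWMA forecasts over a synthetic, steadily growing metric. The function names, the window size, the smoothing factor, and the synthetic series are all assumptions chosen for illustration. On such a series both models lag behind the data by a constant amount, so a residual-based alert would either fire continuously or need a threshold wide enough to hide real anomalies.

```python
def sma_forecasts(series, window):
    """One-step-ahead forecasts: the mean of the previous `window` points."""
    return [sum(series[t - window:t]) / window
            for t in range(window, len(series))]


def ewma_forecasts(series, alpha):
    """One-step-ahead forecasts from an exponentially weighted moving average."""
    level = series[0]
    forecasts = []
    for value in series[1:]:
        forecasts.append(level)  # forecast is made before `value` is seen
        level = alpha * value + (1 - alpha) * level
    return forecasts


# Hypothetical metric with a pure linear trend (e.g. a steadily growing queue).
trend = [float(t) for t in range(50)]
window, alpha = 5, 0.5

sma_residuals = [actual - forecast
                 for actual, forecast in zip(trend[window:], sma_forecasts(trend, window))]
ewma_residuals = [actual - forecast
                  for actual, forecast in zip(trend[1:], ewma_forecasts(trend, alpha))]

# Both residual series settle at a non-zero constant: the models never catch up.
print("SMA residual (last):", sma_residuals[-1])
print("EWMA residual (last):", ewma_residuals[-1])
```

For a unit-slope trend, the SMA residual stays at (window + 1) / 2 and the EWMA residual converges to 1 / alpha, which is why neither model can serve as a baseline without first accounting for the trend.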

Leveraging Anomaly Detection to Monitor for Errors

Metrics that represent errors pose a special problem. Error is often used in a generic sense that could imply something very serious where any non-zero value warrants investigation, e.g., 5xx errors, or it could represent something that has an acceptable baseline value but where an unusual change could indicate serious problems, e.g., 4xx errors, page faults, [...]
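The two kinds of error metric described above call for different checks. The sketch below is an illustration only, not the method used in the post: the function names, the three-sigma rule, and the sample values are assumptions.

```python
from statistics import mean, stdev


def zero_tolerance_alert(values):
    """Alert if any observation is non-zero (e.g. a 5xx-style metric)."""
    return any(v > 0 for v in values)


def baseline_alert(history, current, num_sigmas=3.0):
    """Alert if `current` deviates strongly from the historical baseline.

    Assumes `history` holds at least two points; the 3-sigma default is a
    hypothetical threshold, not a recommendation from the source.
    """
    mu = mean(history)
    sigma = max(stdev(history), 1e-9)  # guard against a perfectly flat history
    return abs(current - mu) > num_sigmas * sigma


# Hypothetical samples: counts per minute.
print(zero_tolerance_alert([0, 0, 1]))                  # a single 5xx is enough
print(baseline_alert([10, 12, 11, 9, 10, 11], 11))      # within the usual 4xx range
print(baseline_alert([10, 12, 11, 9, 10, 11], 40))      # unusual jump
```

The first check ignores history entirely, while the second only makes sense once a baseline has been observed, which is the distinction the excerpt draws between the two classes of error.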

By | October 23rd, 2016|Analytics, Data Science, Machine Learning|0 Comments

Queue Length in Messaging Systems – Kafka, SQS, etc.

The use of message queues/brokers is ubiquitous in any real-time application. Such intermediate modules (e.g. Apache Kafka, RabbitMQ, AWS SQS, etc.) improve system reliability by decoupling the producers from the consumers, thus freeing them from any synchronization requirements. A primary operational concern with such systems is whether the consumers are keeping up with the producers, [...]

By | October 21st, 2016|Analytics, Data Science, Machine Learning|0 Comments

Anomaly Detection and Machine Learning Applied to DevOps and Monitoring

In our previous blog post (Drowning in Alerts: Blame it on Statistical Models for Anomaly Detection), we talked about how standard anomaly detection constructs, such as SMA, exponential smoothing, etc., do not work well when it comes to Ops monitoring. At OpsClarity, we view monitoring as a multi-layered activity. For example, you may discover a [...]

By | October 21st, 2016|Data Science, Machine Learning|0 Comments

Operational Knowledge Graph : The intelligence behind OpsClarity

The OpsClarity platform has several analytics constructs that are specifically designed to manage the hyper-scale, hyper-change microservices architecture of large, complex and distributed data-intensive applications. The platform was built with the specific goal of significantly improving the troubleshooting workflow for these applications. It was designed from the ground up to handle the massive volume [...]

By | January 28th, 2016|Data Processing Frameworks, Data Science, Machine Learning|0 Comments