Open Source Tool Tests Effectiveness of Algorithms on IoT Data
REDWOOD CITY, CA –November 10, 2015— Numenta, Inc., a leader in machine intelligence, today launched the Numenta Anomaly Benchmark (NAB), an open-source benchmark and tool designed to help data researchers evaluate the effectiveness of algorithms for anomaly detection in streaming, real-time applications.
Anomalies in streaming data are patterns that do not conform to past patterns of behavior for a given data stream. Until now, no benchmark has existed to evaluate anomaly detection in real-time streaming data.
NAB will be publicly unveiled on November 13 during MLconf in San Francisco in a presentation by Numenta Research VP Subutai Ahmad, “Real-time Anomaly Detection for Real-time Data Needs.” A peer-reviewed paper on NAB also was accepted by the IEEE Conference on Machine Learning and Applications and will be presented during the conference on December 9-11 in Miami.
The Need for Anomaly Detection in Time-Series Data
Explosive growth in streaming data is happening across industries, largely driven by the rise of the Internet of Things (IoT) and the proliferation of connected real-time data sources and applications with sensors producing waves of data. Voluminous amounts of this data are being stored for later analysis, though it often isn’t necessary or practical to capture and store all the information. Instead, data analysts need a way to analyze time-series data in real time, identify when something is different and act upon that insight.
Different approaches are being pursued to solve this problem, in the form of anomaly detection algorithms. But until now, a measurement to gauge the effectiveness of real-time anomaly detection algorithms has been lacking. With this goal in mind, Numenta created NAB.
“There is an explosion in real-time streaming data sources. Data owners want to be able to model this data and figure out if anything has changed,” commented Numenta CEO Donna Dubinsky. “We created this open benchmark as a tool to help data scientists evaluate the effectiveness of different algorithms in finding anomalous behavior in these data streams.”
Early anomaly detection in streaming data has practical and significant applications across many industries – from monitoring critical IT infrastructure to detecting potential fraudulent financial transactions, from understanding energy consumption to geo-tracking of vehicles in logistics networks.
The Numenta Anomaly Benchmark
NAB is an open source framework that was created to help data professionals test, score and evaluate anomaly detection algorithms on time-series data and to compare their internal anomaly detection techniques to published algorithms.
NAB also allows people to test their algorithms against Numenta’s HTM detector, which is based on Numenta’s Hierarchical Temporal Memory technology. It uses a biologically inspired memory prediction algorithm to model real-time data streams and continuously learns.
The major components to the NAB framework include:
- Real-world data. Includes 58 labeled streaming data files that are a combination of real-world data sets along with some simulated datasets. All anomalies are marked.
- Anomaly windows. These are defined ranges of data points that surround a known anomaly label. NAB uses these windows to decide whether, and how early, an algorithm detected each anomaly.
- A scoring mechanism. Scoring is specifically designed for streaming data and rewards early detection.
“NAB’s emphasis on anomaly windows and early detection is pioneering. In addition the research community stands to benefit greatly from an open dataset containing real world data, and an open source tool for measuring the effectiveness of real-time anomaly detection algorithms,” said Varun Chandola, Assistant Professor in Computer Science and Engineering, SUNY Buffalo.
“Having a standard benchmark could spur innovation in real-time anomaly detection algorithms. Our hope is the open source community will add new data sets, propose different scoring mechanisms, and test and compare other algorithms with our HTM algorithms,” said Dubinsky.
For more detailed information on NAB, go to:
- NAB Peer-Reviewed Research Paper
- NAB Business White Paper
- Numenta Anomaly Benchmark Repository
- Algorithm & product code
Founded in 2005, Numenta has developed a cohesive theory, core software technology, and numerous applications all based on principles of the neocortex. Laying the groundwork for the new era of machine intelligence, this technology is ideal for large-scale analysis of continuously streaming data sets and excels at modeling and predicting patterns in data. Numenta has also developed a suite of products and demonstration applications that utilize its flexible and generalizable HTM learning algorithms to provide solutions that encompass the fields of machine generated data, human behavioral modeling, geo-location
processing, semantic understanding and sensory-motor control. In addition, Numenta has created NuPIC (Numenta Platform for Intelligent Computing) as an open source project. Numenta is based in Redwood City, California.