ai_anomaly_detection
Differences
| ai_anomaly_detection [2025/05/23 16:45] – [2. Batch Detection for Multiple Data Sets] eagleeyenebula | ai_anomaly_detection [2025/06/26 18:20] (current) – [AI Anomaly Detection] eagleeyenebula | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== AI Anomaly Detection ====== | ====== AI Anomaly Detection ====== | ||
| - | * **[[https:// | + | [[https:// |
| The **AI Anomaly Detection** system is a Python-based utility that identifies outliers in datasets using statistical principles like standard deviation. This function is essential for finding anomalous data points that deviate significantly from the dataset' | The **AI Anomaly Detection** system is a Python-based utility that identifies outliers in datasets using statistical principles like standard deviation. This function is essential for finding anomalous data points that deviate significantly from the dataset' | ||
| + | |||
| + | |||
| + | {{youtube> | ||
| + | |||
| ===== Overview ===== | ===== Overview ===== | ||
| Line 23: | Line 28: | ||
| **Threshold for Anomalies**: | **Threshold for Anomalies**: | ||
| Data points are considered anomalies if they fall outside the range: | Data points are considered anomalies if they fall outside the range: | ||
| + | < | ||
| [mean - (3 * standard deviation), mean + (3 * standard deviation)] | [mean - (3 * standard deviation), mean + (3 * standard deviation)] | ||
| + | </ | ||
| ==== 2. Logging Information ==== | ==== 2. Logging Information ==== | ||
| Line 34: | Line 39: | ||
| **Example Log Messages**: | **Example Log Messages**: | ||
| + | < | ||
| INFO: Detecting anomalies in the data... INFO: Anomalies detected: [120, -45] | INFO: Detecting anomalies in the data... INFO: Anomalies detected: [120, -45] | ||
| + | </ | ||
| ====== Function Details ====== | ====== Function Details ====== | ||
| Line 44: | Line 49: | ||
| **Signature**: | **Signature**: | ||
| - | python | + | < |
| + | python | ||
| def detect_anomalies(data: | def detect_anomalies(data: | ||
| """ | """ | ||
| Line 51: | Line 57: | ||
| :return: List of anomalies detected | :return: List of anomalies detected | ||
| """ | """ | ||
| + | </ | ||
| ===== Examples ===== | ===== Examples ===== | ||
| Line 57: | Line 63: | ||
| **Input Example**: | **Input Example**: | ||
| - | python | + | < |
| + | python | ||
| data = [10, 12, 15, 10, 11, 14, 120, 12, 9, -45] | data = [10, 12, 15, 10, 11, 14, 120, 12, 9, -45] | ||
| anomalies = detect_anomalies(data) | anomalies = detect_anomalies(data) | ||
| print(f" | print(f" | ||
| + | </ | ||
| **Output**: | **Output**: | ||
| - | + | < | |
| Anomalies: [120, -45] | Anomalies: [120, -45] | ||
| - | | + | </ |
| **Explanation**: | **Explanation**: | ||
| Line 75: | Line 82: | ||
| **Example: Empty Dataset**: | **Example: Empty Dataset**: | ||
| - | python | + | < |
| + | python | ||
| data = [] | data = [] | ||
| anomalies = detect_anomalies(data) | anomalies = detect_anomalies(data) | ||
| print(f" | print(f" | ||
| - | + | </ | |
| **Output**: | **Output**: | ||
| - | + | < | |
| Anomalies: [] | Anomalies: [] | ||
| - | + | </ | |
| **Explanation**: | **Explanation**: | ||
| The function immediately returns an empty list if the dataset is empty. | The function immediately returns an empty list if the dataset is empty. | ||
| **Example: All Data Within Range**: | **Example: All Data Within Range**: | ||
| - | python | + | < |
| + | python | ||
| data = [100, 102, 98, 101, 99] | data = [100, 102, 98, 101, 99] | ||
| anomalies = detect_anomalies(data) | anomalies = detect_anomalies(data) | ||
| print(f" | print(f" | ||
| - | + | </ | |
| **Output**: | **Output**: | ||
| - | + | < | |
| Anomalies: [] | Anomalies: [] | ||
| - | | + | </ |
| **Explanation**: | **Explanation**: | ||
| Line 109: | Line 115: | ||
| **Input Data**: | **Input Data**: | ||
| - | python | + | < |
| + | python | ||
| data = [100, 150, 200, 1000, 105, 210, 980, 115, 195] | data = [100, 150, 200, 1000, 105, 210, 980, 115, 195] | ||
| anomalies = detect_anomalies(data) | anomalies = detect_anomalies(data) | ||
| - | + | </ | |
| **Output**: | **Output**: | ||
| - | + | < | |
| Anomalies: [1000, 980] | Anomalies: [1000, 980] | ||
| - | + | </ | |
| **Explanation**: | **Explanation**: | ||
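| 1000 and 980 deviate most from the dataset mean (about 339), but with a standard deviation of about 350 they sit only about 1.9 and 1.8 standard deviations away, so the default threshold of 3 does not flag them; the output above requires a lower threshold, such as 1.5. | 1000 and 980 deviate most from the dataset mean (about 339), but with a standard deviation of about 350 they sit only about 1.9 and 1.8 standard deviations away, so the default threshold of 3 does not flag them; the output above requires a lower threshold, such as 1.5. | ||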
| Line 128: | Line 133: | ||
| **Framework for Live Data Streams**: | **Framework for Live Data Streams**: | ||
| - | python | + | < |
| + | python | ||
| import random | import random | ||
| import time | import time | ||
| Line 148: | Line 154: | ||
| stream_anomaly_detection() | stream_anomaly_detection() | ||
| + | </ | ||
| ===== Advanced Usage ===== | ===== Advanced Usage ===== | ||
| Line 154: | Line 160: | ||
| By default, the function uses **3 standard deviations** as the threshold for anomaly detection. To customize this, modify the following part of the function: | By default, the function uses **3 standard deviations** as the threshold for anomaly detection. To customize this, modify the following part of the function: | ||
| - | python | + | < |
| + | python | ||
| anomalies = [x for x in data if abs(x - mean) > THRESHOLD * std_dev] | anomalies = [x for x in data if abs(x - mean) > THRESHOLD * std_dev] | ||
| - | + | </ | |
| **Example Custom Threshold**: | **Example Custom Threshold**: | ||
| - | python | + | < |
| + | python | ||
| THRESHOLD = 2 # Using 2 standard deviations instead of 3 | THRESHOLD = 2 # Using 2 standard deviations instead of 3 | ||
| data = [12, 15, 18, 10, 140] | data = [12, 15, 18, 10, 140] | ||
| anomalies = detect_anomalies(data) | anomalies = detect_anomalies(data) | ||
| print(f" | print(f" | ||
| - | + | </ | |
| ==== 2. Batch Detection for Multiple Data Sets ==== | ==== 2. Batch Detection for Multiple Data Sets ==== | ||
| Line 170: | Line 177: | ||
| **Example**: | **Example**: | ||
| - | python | + | < |
| + | python | ||
| datasets = [ | datasets = [ | ||
| [10, 12, 14, 18, 200], | [10, 12, 14, 18, 200], | ||
| Line 180: | Line 188: | ||
| anomalies = detect_anomalies(data) | anomalies = detect_anomalies(data) | ||
| print(f" | print(f" | ||
| - | + | </ | |
| **Output**: | **Output**: | ||
| - | + | < | |
| Dataset 1: [200] Dataset 2: [700] Dataset 3: [500] | Dataset 1: [200] Dataset 2: [700] Dataset 3: [500] | ||
| + | </ | ||
| ==== 3. Combining with Visualization ==== | ==== 3. Combining with Visualization ==== | ||
| Line 190: | Line 198: | ||
| **Example with Matplotlib**: | **Example with Matplotlib**: | ||
| - | ```python | + | < |
| + | python | ||
| import matplotlib.pyplot as plt | import matplotlib.pyplot as plt | ||
| Line 209: | Line 218: | ||
| plt.legend() | plt.legend() | ||
| plt.show() | plt.show() | ||
| - | ``` | + | </ |
| - | + | ||
| - | --- | + | |
| ===== Applications ===== | ===== Applications ===== | ||
| Line 223: | Line 229: | ||
| **3. Preprocessing for AI Pipelines**: | **3. Preprocessing for AI Pipelines**: | ||
| Flag and handle anomalous data points before model training to improve model robustness and accuracy. | Flag and handle anomalous data points before model training to improve model robustness and accuracy. | ||
| - | |||
| - | --- | ||
| ===== Best Practices ===== | ===== Best Practices ===== | ||
| Line 236: | Line 240: | ||
| 3. **Visualization**: | 3. **Visualization**: | ||
| Combine detection results with visualizations for better interpretability. | Combine detection results with visualizations for better interpretability. | ||
| - | |||
| - | --- | ||
| - | |||
| ===== Conclusion ===== | ===== Conclusion ===== | ||
| The **AI Anomaly Detection** framework provides a robust, flexible, and extensible mechanism for outlier detection in numerical datasets. With applications ranging from real-time monitoring to preprocessing for AI pipelines, the system is a valuable tool for automated anomaly analysis. By leveraging advanced usage patterns like visualization and threshold adjustments, | The **AI Anomaly Detection** framework provides a robust, flexible, and extensible mechanism for outlier detection in numerical datasets. With applications ranging from real-time monitoring to preprocessing for AI pipelines, the system is a valuable tool for automated anomaly analysis. By leveraging advanced usage patterns like visualization and threshold adjustments, | ||
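
The threshold rule described on the page (flag values outside mean ± threshold × standard deviation) can be sketched as follows. This is a minimal illustration, not the page's actual implementation: the function body is not shown in this diff, so the name `detect_anomalies_sketch`, the `threshold` parameter, and the choice of population standard deviation are all assumptions made for the example.

```python
import statistics
from typing import List, Sequence


def detect_anomalies_sketch(data: Sequence[float], threshold: float = 3.0) -> List[float]:
    """Return values lying more than `threshold` standard deviations from the mean.

    Hypothetical stand-in for the page's detect_anomalies(); the real body is
    not visible in the diff.
    """
    if len(data) < 2:
        # An empty or single-point dataset has no meaningful spread.
        return []
    mean = statistics.fmean(data)
    std_dev = statistics.pstdev(data)  # population std dev (assumption)
    if std_dev == 0:
        # All values identical: nothing can deviate from the mean.
        return []
    return [x for x in data if abs(x - mean) > threshold * std_dev]


data = [10, 12, 15, 10, 11, 14, 120, 12, 9, -45]
print(detect_anomalies_sketch(data))        # default threshold of 3
print(detect_anomalies_sketch(data, 1.5))   # looser threshold
```

One consequence of the rule is worth keeping in mind when choosing a threshold: in a dataset of n points, no value can lie more than sqrt(n - 1) population standard deviations from the mean. With the 10-point input above, the default threshold of 3 therefore flags nothing, while a threshold of 1.5 flags 120 and -45. Small datasets generally need a lower threshold or a more robust method.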
ai_anomaly_detection.1748018716.txt.gz · Last modified: 2025/05/23 16:45 by eagleeyenebula
