    * Allows the pipeline to resume execution from the last known good state.

3. **Checkpoints Store (self.checkpoints)**:
    * Maintains the in-memory storage for all pipeline checkpoints.
    * Key: **step_name** (uniquely identifies the pipeline step).
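
For illustration only, a minimal sketch of how this in-memory store behaves, assuming the **save_checkpoint(step_name, data)** and **rollback_to_checkpoint(step_name)** methods used later on this page:

<code python>
from ai_disaster_recovery import DisasterRecovery

recovery = DisasterRecovery()

# Each call stores its data under the step name in self.checkpoints.
recovery.save_checkpoint("feature_extraction", {"rows_processed": 1000})
recovery.save_checkpoint("model_training", {"epoch": 5})

# Rolling back returns the data that was saved for that step.
state = recovery.rollback_to_checkpoint("feature_extraction")
print(state)  # expected: {'rows_processed': 1000}
</code>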
In scenarios requiring persistent storage of checkpoints, the module can be extended with custom serialization. The subclass below is a minimal sketch of how checkpoints can be saved to disk; the one-pickle-file-per-step layout is an illustrative choice, and the base class is assumed to provide the in-memory **save_checkpoint()**:

<code python>
import logging
import pickle

from ai_disaster_recovery import DisasterRecovery


class PersistentDisasterRecovery(DisasterRecovery):
    """Extends DisasterRecovery with on-disk checkpoint persistence via pickle."""

    def save_checkpoint(self, step_name, data):
        # Keep the in-memory checkpoint, then serialize the same data to disk
        # so it survives a process crash or restart.
        super().save_checkpoint(step_name, data)
        with open(f"{step_name}.checkpoint", "wb") as f:
            pickle.dump(data, f)

    def rollback_to_checkpoint(self, step_name):
        # Restore the checkpoint from disk; warn and return None if it was never saved.
        try:
            with open(f"{step_name}.checkpoint", "rb") as f:
                return pickle.load(f)
        except FileNotFoundError:
            logging.warning(f"Checkpoint file not found for step: {step_name}")
            return None
</code>

**Usage**
<code python>
persistent_recovery = PersistentDisasterRecovery()
</code>

**Save and rollback with disk persistence**
<code python>
persistent_recovery.save_checkpoint("step_3", {"data": [7, 8, 9]})
restored_data = persistent_recovery.rollback_to_checkpoint("step_3")
print(f"Restored data: {restored_data}")
</code>
  
**Expected Output:**
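
With the sketch above, where the rollback simply unpickles whatever was saved for the step, the usage prints:

<code>
Restored data: {'data': [7, 8, 9]}
</code>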
===== Use Cases =====
  
1. **AI Model Training Pipelines**:
   Save model state after every training epoch for fault recovery (see the sketch after this list).

2. **Data Processing Pipelines**:
   Save intermediate transformation results to prevent reprocessing from scratch in the event of failure.

3. **Workflow Management Systems**:
   Use checkpoints to incrementally save the state of a multi-step workflow.

4. **Debugging Complex Errors**:
   Roll back to a known-good state for error analysis and testing.
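
For the training use case, a minimal sketch of per-epoch checkpointing, assuming the **save_checkpoint**/**rollback_to_checkpoint** API shown earlier and using a stand-in dictionary for real model parameters:

<code python>
from ai_disaster_recovery import DisasterRecovery

recovery = DisasterRecovery()
model_state = {"weights": [0.0, 0.0, 0.0]}  # stand-in for real model parameters

for epoch in range(3):
    # ... one epoch of training would update model_state here ...
    model_state["weights"] = [w + 0.1 for w in model_state["weights"]]
    # Checkpoint after each completed epoch so a failure can resume from it.
    recovery.save_checkpoint(f"epoch_{epoch}", {"epoch": epoch, "state": dict(model_state)})

# After a crash, restore the last completed epoch instead of restarting training.
restored = recovery.rollback_to_checkpoint("epoch_2")
print(restored)
</code>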
  
===== Best Practices =====
  
1. **Granular Checkpoints**:
   Save checkpoints at critical pipeline steps (e.g., post-feature extraction, model training).

2. **Logging and Debugging**:
   Leverage logging to monitor checkpoint creation and rollback actions.

3. **Serialization**:
   Use serialization (e.g., **pickle**, **JSON**, or a database) for persistent checkpoint management, especially in distributed systems.

4. **Version Control**:
   Employ versioning for checkpoints to avoid overwriting critical recovery points (see the sketch after this list).

5. **Secure Recovery**:
   When using external storage (e.g., cloud), ensure encryption to secure sensitive pipeline states.
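
One way to apply the versioning practice is to give every checkpoint key an explicit version so older recovery points are never overwritten. A sketch, assuming the no-argument constructor and **save_checkpoint** API used earlier; the **VersionedDisasterRecovery** helper and the "@v1" naming are hypothetical:

<code python>
from ai_disaster_recovery import DisasterRecovery


class VersionedDisasterRecovery(DisasterRecovery):
    """Hypothetical helper: append an incrementing version to each step's key."""

    def __init__(self):
        super().__init__()
        self._versions = {}  # step_name -> latest version number

    def save_versioned_checkpoint(self, step_name, data):
        version = self._versions.get(step_name, 0) + 1
        self._versions[step_name] = version
        # "step_3@v1" and "step_3@v2" coexist instead of overwriting each other.
        self.save_checkpoint(f"{step_name}@v{version}", data)
        return version


recovery = VersionedDisasterRecovery()
recovery.save_versioned_checkpoint("step_3", {"data": [7, 8, 9]})
recovery.save_versioned_checkpoint("step_3", {"data": [10, 11, 12]})
latest = recovery.rollback_to_checkpoint("step_3@v2")
</code>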
===== Conclusion =====
  