jadia/backup-utility

Intelligent Cascading Backup & Audit Utility

An interactive command-line backup utility that guards against cascading data corruption (bit rot), accidental deletions, and unintended overwrites by archiving changed files before each sync.

⚠️ Important Disclaimer

This utility was written for one specific, hardcoded hardware topology. If you clone this repository, you must modify the code and configuration to match your own setup, or it will fail.

Core Assumptions & Hardcoded Defaults:

  1. Three-Drive Topology: The system expects exactly three external drives (1TB, 2TB, 4TB).
  2. Hub and Spoke Strategy (No Nested Cascading):
    • 1TB gets synced as a backup into the 2TB.
    • 1TB gets synced as a backup into the 4TB.
    • The 2TB gets synced into the 4TB. (The 1TB backup already on the 2TB drive is explicitly excluded so it does not get redundantly nested).
    • The primary Laptop (Source) gets synced directly to a separate folder on the 4TB.
  3. Mount Paths: The scripts strictly assume your drives will be mounted at /mnt/1tb, /mnt/2tb, and /mnt/4tb.
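
The topology above can be written down as a small list of sync jobs. This is only an illustrative sketch, not the logic in core_sync.sh: the destination folder names (backup_1tb, backup_2tb, backup_laptop) and the laptop source path are hypothetical, and only the mount points come from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncJob:
    source: str
    destination: str
    excludes: tuple = ()


# Hypothetical laptop source path; adjust to your own setup.
LAPTOP_SOURCE = "/home/user/"


def build_sync_plan():
    """Return the four hub-and-spoke jobs described in the assumptions list."""
    return [
        # 1TB -> 2TB
        SyncJob("/mnt/1tb/", "/mnt/2tb/backup_1tb/"),
        # 1TB -> 4TB
        SyncJob("/mnt/1tb/", "/mnt/4tb/backup_1tb/"),
        # 2TB -> 4TB, skipping the 1TB copy that already lives on the 2TB drive.
        # A leading slash anchors an rsync exclude pattern to the transfer root.
        SyncJob("/mnt/2tb/", "/mnt/4tb/backup_2tb/", excludes=("/backup_1tb/",)),
        # Laptop -> its own folder on the 4TB
        SyncJob(LAPTOP_SOURCE, "/mnt/4tb/backup_laptop/"),
    ]


if __name__ == "__main__":
    for job in build_sync_plan():
        print(f"{job.source} -> {job.destination} (excludes: {list(job.excludes)})")
```

Keeping the exclusion on the 2TB -> 4TB job is what stops the 1TB data from appearing on the 4TB drive twice.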

Quick Start

1. Prerequisites

Ensure you have the following installed on your Ubuntu/Linux system:

  • rsync
  • python3
  • sqlite3
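
If you want to verify the prerequisites before the first run, a check along these lines may help. It is a minimal sketch that is not part of the repository; it only looks for the three executables on PATH.

```python
import shutil

REQUIRED_TOOLS = ("rsync", "python3", "sqlite3")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of required executables that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found.")
```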

2. Configuration

Before running the utility, you must configure two files:

config.env: Open this file to map your fundamental hardware topology. You must define:

  1. The UUIDs of your hard drives.
  2. The mount points where they will be attached.
  3. The sync destination paths that define how data cascades between drives (e.g., 1TB -> 2TB).
  4. Thresholds for safe-sync warnings.
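
To give a feel for the four kinds of settings, here is a sketch that parses a KEY=VALUE file of the same general shape. Every key name and value in the sample is hypothetical and chosen only for illustration; check the config.env shipped in the repository for the real variable names.

```python
# All keys and values below are hypothetical placeholders, not the real config.env.
SAMPLE_ENV = """
# Drive identity
DRIVE_1TB_UUID="REPLACE-WITH-YOUR-UUID"
# Mount point
MOUNT_1TB=/mnt/1tb
# Sync destination (1TB -> 2TB)
DEST_1TB_ON_2TB=/mnt/2tb/backup_1tb
# Safe-sync warning threshold
MAX_DELETIONS=50
"""


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes from the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


if __name__ == "__main__":
    for key, value in parse_env(SAMPLE_ENV).items():
        print(f"{key} = {value}")
```

You can look up the UUID of each drive with `lsblk -f` or `blkid`.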

auditor_config.json: Open this JSON file to customize the Integrity Auditor's parameters:

  1. EXT_FILTER: An array of the file extensions you want protected (e.g., .jpg, .mp4, .pdf). Files whose extensions are not in this list are skipped, which reduces hashing time on mechanical drives.
  2. EXCLUSIONS: Directories that the auditor ignores entirely.
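
The two keys can be pictured with the sketch below. The key names EXT_FILTER and EXCLUSIONS come from this README; the sample values, the assumption that EXCLUSIONS is a list of absolute directory paths, and the case-insensitive extension match are illustrative guesses rather than the auditor's confirmed behavior.

```python
import json
from pathlib import PurePosixPath

# Sample values are hypothetical; only the two key names come from the README.
SAMPLE_CONFIG = """
{
  "EXT_FILTER": [".jpg", ".mp4", ".pdf"],
  "EXCLUSIONS": ["/mnt/4tb/lost+found"]
}
"""


def should_hash(path, config):
    """Return True if the file is outside every exclusion and has a listed extension."""
    candidate = PurePosixPath(path)
    for excluded in config["EXCLUSIONS"]:
        excluded_dir = PurePosixPath(excluded)
        if candidate == excluded_dir or excluded_dir in candidate.parents:
            return False
    # Assumption: extensions are compared case-insensitively.
    allowed = {ext.lower() for ext in config["EXT_FILTER"]}
    return candidate.suffix.lower() in allowed


if __name__ == "__main__":
    config = json.loads(SAMPLE_CONFIG)
    for sample in ("/mnt/4tb/photos/a.JPG", "/mnt/4tb/cache/a.tmp"):
        print(sample, should_hash(sample, config))
```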

3. Usage

Run the interactive menu wrapper:

./main.sh

Follow the on-screen prompts to mount drives, run dry-run analyses, sync data safely, or run the hashing auditor.
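
The dry-run analysis and safe sync can be pictured roughly as follows. This sketch is not the code in core_sync.sh: the use of --delete, the archive directory, and the threshold values are assumptions. The rsync flags themselves are real, and the parser relies on the output format of --itemize-changes, where removed files are reported with a `*deleting` prefix and newly created files show a run of `+` characters.

```python
def rsync_command(source, destination, backup_dir, dry_run=True, excludes=()):
    """Build an rsync argument list that archives replaced or deleted files."""
    cmd = [
        "rsync",
        "--archive",
        "--delete",  # assumption: the real script may not delete at all
        "--backup",
        f"--backup-dir={backup_dir}",
        "--itemize-changes",
    ]
    if dry_run:
        cmd.append("--dry-run")
    cmd.extend(f"--exclude={pattern}" for pattern in excludes)
    cmd.extend([source, destination])
    return cmd


def count_changes(itemized_output):
    """Count deletions, overwrites, and new files in --itemize-changes output."""
    counts = {"deleted": 0, "overwritten": 0, "created": 0}
    for line in itemized_output.splitlines():
        if line.startswith("*deleting"):
            counts["deleted"] += 1
        elif line.startswith(">f"):
            # Nine '+' characters after the two-letter prefix mark a new file.
            if line[2:11] == "+" * 9:
                counts["created"] += 1
            else:
                counts["overwritten"] += 1
    return counts


def exceeds_threshold(counts, max_deleted, max_overwritten):
    """Return True if the planned sync should trigger a warning."""
    return counts["deleted"] > max_deleted or counts["overwritten"] > max_overwritten


# Hand-written sample of dry-run output, used only to exercise the parser.
SAMPLE_OUTPUT = """*deleting   old/report.pdf
cd+++++++++ new/
>f+++++++++ new/photo.jpg
>f.st...... docs/notes.txt
"""

if __name__ == "__main__":
    print(" ".join(rsync_command("/mnt/1tb/", "/mnt/2tb/backup_1tb/", "/mnt/2tb/archive/")))
    print(count_changes(SAMPLE_OUTPUT))
```

Running the dry run first and comparing the counts to a threshold gives you a chance to stop before a large, unexpected batch of deletions or overwrites reaches the backup.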

Project Structure

  • main.sh: The interactive Bash menu and drive management wrapper.
  • core_sync.sh: The core rsync logic that handles --backup archiving and dry-run threshold validation.
  • auditor.py: A SQLite-backed Python script that calculates and tracks SHA-256 hashes to detect bit rot or other silent corruption. It also includes duplicate-file detection that maps files to known duplicates.
  • config.env: Centralized configuration.
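
The idea behind the auditor can be sketched in a few lines. This is not the code in auditor.py: the table layout, the status names, and the heuristic that a changed hash with an unchanged modification time suggests silent corruption are all assumptions made for illustration.

```python
import hashlib
import os
import sqlite3


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def open_db(db_path=":memory:"):
    """Open the hash database (hypothetical schema)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, sha256 TEXT NOT NULL, mtime_ns INTEGER NOT NULL)"
    )
    return conn


def audit_file(conn, path):
    """Return 'new', 'ok', 'modified', or 'suspect' for one file."""
    current_hash = sha256_of(path)
    mtime_ns = os.stat(path).st_mtime_ns
    row = conn.execute(
        "SELECT sha256, mtime_ns FROM files WHERE path = ?", (path,)
    ).fetchone()
    if row is None:
        status = "new"
    elif row[0] == current_hash:
        status = "ok"
    elif row[1] == mtime_ns:
        # Content changed but the timestamp did not: possible bit rot.
        status = "suspect"
    else:
        status = "modified"
    if status != "suspect":
        # Keep the last known-good record when corruption is suspected.
        conn.execute(
            "INSERT OR REPLACE INTO files (path, sha256, mtime_ns) VALUES (?, ?, ?)",
            (path, current_hash, mtime_ns),
        )
        conn.commit()
    return status


def find_duplicates(conn):
    """Return lists of paths that share the same SHA-256 hash."""
    groups = {}
    for path, digest in conn.execute("SELECT path, sha256 FROM files ORDER BY path"):
        groups.setdefault(digest, []).append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Storing the hash next to the modification time lets an audit tell an intentional edit apart from a file whose bytes changed without anyone touching it.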

Detailed Documentation

For a deep dive into how this utility works, recovery procedures, and troubleshooting, please refer to the docs/ directory:

  1. Architecture & Philosophy
  2. General Use Cases & Recovery
  3. Troubleshooting & Scenario Guides

Created by jadia.dev. The architecture, Python auditor, bash wrapper, and documentation in this repository were collaboratively designed, refactored, and generated by an Agentic AI Assistant, serving as an expert systems strategist.
