minimost.clean

Maintenance utilities for purging old uploads and messages.

There are two kinds of limit: age-based retention and size-based caps.

All cleanup functions are called automatically by a background daemon thread started in minimost.create_app(), so no cron job or external scheduler is required. The thread first runs 5 minutes after startup and then repeats every 24 hours. Settings are read from settings.json on each run:

  • "image_retention_days" — image file attachments (default: 30 days).

  • "file_retention_days" — all other file attachments (default: 30 days).

  • "message_retention_days" — messages in the message database (default: 770 days).

  • "max_upload_dir_size_mb" — total size cap for the uploads/ directory; oldest files are deleted when exceeded (0 or absent disables the cap).

  • "max_message_db_size_mb" — size cap for the shared message database; oldest messages are deleted when exceeded (0 or absent disables the cap).
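The settings listed above could be loaded along the following lines. This is only a sketch: the helper name load_cleanup_settings, the DEFAULTS table, and the fallback to defaults when the file is missing are assumptions made for illustration, not the module's documented behaviour.

```python
import json
from pathlib import Path

# Defaults copied from the list above; size caps default to 0 (disabled).
DEFAULTS = {
    "image_retention_days": 30,
    "file_retention_days": 30,
    "message_retention_days": 770,
    "max_upload_dir_size_mb": 0,
    "max_message_db_size_mb": 0,
}


def load_cleanup_settings(path):
    """Hypothetical helper: read settings.json, filling absent keys with defaults."""
    try:
        raw = json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        # Assumption: a missing file simply means "use every default".
        raw = {}
    return {key: raw.get(key, default) for key, default in DEFAULTS.items()}
```

Re-reading the file on every run, as described above, means a changed retention value takes effect at the next daily pass without restarting the application.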

This module can also be invoked directly for ad-hoc cleanup:

python3 src/minimost/clean.py
minimost.clean.delete_files_older_than(directory: str, image_days: int, file_days: int, dry_run: bool = False)

Delete files in directory based on type-specific retention periods.

Image files (jpg, jpeg, png, gif, webp) are removed when older than image_days; all other files are removed when older than file_days.

Parameters:
  • directory (str) – Path to the directory to clean.

  • image_days (int) – Retention period in days for image files.

  • file_days (int) – Retention period in days for non-image files.

  • dry_run (bool) – If True, only print what would be deleted without removing any files. Defaults to False.

Raises:

ValueError – If directory does not exist or is not a directory.
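The type-specific retention rule can be illustrated with a small selection function. This is a sketch under stated assumptions, not the module's source: the name expired_files is hypothetical, age is assumed to be measured by modification time, extensions are assumed to match case-insensitively, and only files directly inside the directory are considered.

```python
import os
import time

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}
SECONDS_PER_DAY = 86400


def expired_files(directory, image_days, file_days, now=None):
    """Hypothetical helper: list files past their type-specific retention."""
    if not os.path.isdir(directory):
        raise ValueError(f"{directory!r} does not exist or is not a directory")
    now = time.time() if now is None else now
    expired = []
    for entry in os.scandir(directory):
        if not entry.is_file(follow_symlinks=False):
            continue  # skip subdirectories and symlinks
        ext = os.path.splitext(entry.name)[1].lower()
        days = image_days if ext in IMAGE_EXTENSIONS else file_days
        # Assumption: "older than" is judged by modification time.
        if entry.stat().st_mtime < now - days * SECONDS_PER_DAY:
            expired.append(entry.path)
    return sorted(expired)
```

Returning the list instead of deleting immediately keeps a dry run trivial: print the paths in one mode, call os.remove on each in the other.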

minimost.clean.delete_files_over_size(directory: str, max_size_mb: float, dry_run: bool = False) → None

Delete the oldest files in directory until it fits within a size cap.

The combined size of every regular file directly in directory is compared against max_size_mb. While the total exceeds the cap, files are deleted oldest-first (by modification time) until the directory is back under it.

This bounds the disk footprint of uploads/ independently of the age-based retention in delete_files_older_than(): a burst of large uploads is trimmed by size even before any of it ages out. The two run together — age-based cleanup first, then this size cap on whatever remains.

Subdirectories are ignored (only regular files are counted and deleted), so the function is safe to point at a directory that nests other content. A cap of 0 (or any non-positive value) disables the check.

Parameters:
  • directory (str) – Path to the directory to bound.

  • max_size_mb (float) – Maximum combined size in mebibytes. Non-positive disables the check.

  • dry_run (bool) – If True, only print what would be deleted without removing any files. Defaults to False.

Raises:

ValueError – If directory does not exist or is not a directory.
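The oldest-first trimming described above can be sketched as a function that picks which files to remove. The name files_to_trim is hypothetical, and validating the directory before checking for a disabled cap is an assumption about ordering that the text does not specify.

```python
import os

BYTES_PER_MIB = 1024 * 1024


def files_to_trim(directory, max_size_mb):
    """Hypothetical helper: oldest files to delete so the directory fits the cap."""
    if not os.path.isdir(directory):
        raise ValueError(f"{directory!r} does not exist or is not a directory")
    if max_size_mb <= 0:
        return []  # a non-positive cap disables the check
    files = []
    for entry in os.scandir(directory):
        if entry.is_file(follow_symlinks=False):  # subdirectories are ignored
            st = entry.stat()
            files.append((st.st_mtime, entry.path, st.st_size))
    files.sort()  # oldest modification time first
    total = sum(size for _, _, size in files)
    cap = max_size_mb * BYTES_PER_MIB
    doomed = []
    for _, path, size in files:
        if total <= cap:
            break
        doomed.append(path)
        total -= size
    return doomed
```

Because the running total is reduced as each file is selected, the loop stops at the first point where the remaining files fit, so no more data is removed than needed.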

minimost.clean.delete_messages_older_than(users_dir: str, days: int, dry_run: bool = False)

Hard-delete messages older than days from every user database.

Iterates every *.db file in users_dir and removes rows from the messages table whose ts timestamp predates the cutoff. Each database is processed independently so a single corrupted file does not abort the run.

Parameters:
  • users_dir (str) – Path to the directory containing per-user .db files.

  • days (int) – Messages older than this many days are deleted.

  • dry_run (bool) – If True, print what would be deleted without making any changes. Defaults to False.

Raises:

ValueError – If users_dir does not exist or is not a directory.
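The per-database isolation described above might look like the following sketch. It assumes that ts holds Unix epoch seconds, which the text does not state, and the name purge_old_messages and its return value are hypothetical.

```python
import sqlite3
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def purge_old_messages(users_dir, days, now=None):
    """Hypothetical helper: delete old rows from every *.db in users_dir."""
    root = Path(users_dir)
    if not root.is_dir():
        raise ValueError(f"{users_dir!r} does not exist or is not a directory")
    now = time.time() if now is None else now
    cutoff = now - days * SECONDS_PER_DAY  # assumption: ts is epoch seconds
    deleted = {}
    for path in sorted(root.glob("*.db")):
        try:
            conn = sqlite3.connect(str(path))
            try:
                with conn:  # commit on success, roll back on error
                    cur = conn.execute(
                        "DELETE FROM messages WHERE ts < ?", (cutoff,)
                    )
                deleted[str(path)] = cur.rowcount
            finally:
                conn.close()
        except sqlite3.Error as exc:
            # One unreadable database must not abort the whole run.
            print(f"skipping {path}: {exc}")
    return deleted
```

Catching sqlite3.Error inside the loop is what keeps a corrupted file from stopping the cleanup of the remaining databases.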

minimost.clean._live_size_bytes(conn) → int

Return the size in bytes of the live (non-free) pages of a database.

page_count counts every page, including those on the freelist left behind by deletes/edits; those free pages are reclaimed when the database is compacted. (page_count - freelist_count) × page_size is therefore the size the .db file shrinks to after compaction, which is the meaningful quantity to cap: it ignores transient free-page bloat (so we don’t delete messages merely because space has not been reclaimed yet) and is independent of the WAL file, avoiding the WAL-mode quirk where os.stat on the main file lags committed changes until a checkpoint.
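The formula above maps directly onto three SQLite pragmas. The sketch below is one plausible way to compute it with the standard sqlite3 module; the function name live_size_bytes is illustrative and the actual implementation may differ.

```python
import sqlite3


def live_size_bytes(conn):
    """Sketch: (page_count - freelist_count) * page_size for a connection."""
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    freelist_count = conn.execute("PRAGMA freelist_count").fetchone()[0]
    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    # Free pages are excluded because compaction would reclaim them.
    return (page_count - freelist_count) * page_size
```

After a large delete, this value drops immediately even though the file on disk keeps its old size until the database is compacted.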

minimost.clean.delete_messages_over_size(db_path: str, max_size_mb: float, dry_run: bool = False, batch: int = 1000) → None

Delete the oldest messages until the message database fits a size cap.

The shared messages.db is the only database that grows with prunable content, so it is the only one a size cap can be enforced on. Size is measured as the live (post-compaction) data size — see _live_size_bytes() — so transient free-page bloat never triggers a deletion. When that size exceeds max_size_mb, the oldest messages (lowest ts) are deleted in batches of batch rows until the database is back under the cap, after which the freed pages are reclaimed in one VACUUM and the WAL checkpointed so the on-disk file shrinks to match. VACUUM runs at most once per call, and only when something was actually pruned.

A size cap of 0 (or any non-positive value) disables the check, leaving age-based retention as the only purge. The database is opened only when it actually exceeds the cap.

Parameters:
  • db_path (str) – Path to the shared messages.db file.

  • max_size_mb (float) – Maximum allowed size in mebibytes. Non-positive disables the cap.

  • dry_run (bool) – If True, only report what would be deleted.

  • batch (int) – Number of oldest messages to delete per cycle.
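The batch-then-compact loop described above can be sketched as follows. Several details are assumptions made for illustration: the names trim_messages and live_size, a rowid table called messages with a ts column, and opening the database unconditionally (the text says the real function opens it only when the cap is exceeded).

```python
import sqlite3

BYTES_PER_MIB = 1024 * 1024


def live_size(conn):
    """Size of the non-free pages, i.e. the size after compaction."""
    pages = conn.execute("PRAGMA page_count").fetchone()[0]
    free = conn.execute("PRAGMA freelist_count").fetchone()[0]
    size = conn.execute("PRAGMA page_size").fetchone()[0]
    return (pages - free) * size


def trim_messages(db_path, max_size_mb, batch=1000):
    """Hypothetical helper: delete oldest rows in batches, then VACUUM once."""
    if max_size_mb <= 0:
        return 0  # a non-positive cap disables the check
    cap = max_size_mb * BYTES_PER_MIB
    # Autocommit mode: VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        deleted = 0
        while live_size(conn) > cap:
            cur = conn.execute(
                "DELETE FROM messages WHERE rowid IN "
                "(SELECT rowid FROM messages ORDER BY ts ASC LIMIT ?)",
                (batch,),
            )
            if cur.rowcount == 0:
                break  # nothing left to prune
            deleted += cur.rowcount
        if deleted:
            conn.execute("VACUUM")  # reclaim the freed pages once
            conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")
        return deleted
    finally:
        conn.close()
```

Measuring the live size inside the loop lets each batch's effect be seen without compacting, which is why a single VACUUM at the end is enough.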

Usage

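As a cron job (optional, since the background thread started by minimost.create_app() already runs the cleanup every 24 hours; this entry deletes uploads older than 30 days at 02:30 every day):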

30 2 * * * /usr/bin/python3 /srv/minimost/src/minimost/clean.py

From the command line:

python3 src/minimost/clean.py

Programmatically:

from minimost.clean import delete_files_older_than

# Preview what would be deleted (no files removed)
delete_files_older_than("uploads", image_days=30, file_days=30, dry_run=True)

# Delete images and other files older than 14 days
delete_files_older_than("uploads", image_days=14, file_days=14)