A Python project implementing various autoencoder models including:
- **AE (Autoencoder)**
  - Standard autoencoder: learns a compressed latent representation of the input.
  - The latent space has no explicit distribution constraints, so it captures features purely for reconstruction.
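A minimal NumPy sketch of the idea (the sizes and linear encoder/decoder here are hypothetical; the project's actual models are neural networks):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 784-dim inputs compressed to a 32-dim latent code.
x = rng.standard_normal((4, 784))          # a batch of 4 inputs
W_enc = rng.standard_normal((784, 32)) * 0.01
W_dec = rng.standard_normal((32, 784)) * 0.01

z = x @ W_enc                              # encode: compressed latent representation
x_hat = z @ W_dec                          # decode: reconstruction from the code
recon_loss = np.mean((x - x_hat) ** 2)     # trained purely to minimize reconstruction error

print(z.shape, x_hat.shape)                # (4, 32) (4, 784)
```

Nothing constrains `z` here: training only rewards faithful reconstruction.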
- **VAE (Variational Autoencoder)**
  - The latent space is regularized to follow a standard normal distribution $\mathcal{N}(0, 1)$.
  - This allows smooth interpolation in the latent space and generative sampling.
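A sketch of the two VAE-specific pieces, with hypothetical batch and latent sizes: the reparameterization trick for sampling, and the closed-form KL divergence that pulls the latent distribution toward $\mathcal{N}(0, 1)$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder outputs for a batch of 4, latent dimension 8.
mu = rng.standard_normal((4, 8)) * 0.1
log_var = rng.standard_normal((4, 8)) * 0.1

# Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, 1),
# so sampling stays differentiable with respect to mu and log_var.
eps = rng.standard_normal(mu.shape)
z = mu + np.exp(0.5 * log_var) * eps

# Closed-form KL divergence between N(mu, sigma^2) and N(0, 1), per sample.
kl = -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=1)

print(z.shape, kl.shape)  # (4, 8) (4,)
```

The KL term is what makes the latent space smooth enough for interpolation and for sampling new data from the prior.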
- **Beta-VAE (VAE with β regularization)**
  - Extends the VAE by introducing a β weight on the KL divergence term.
  - Encourages disentangled latent representations, making each latent dimension capture an independent factor of variation.
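The change relative to the plain VAE is one scalar in the loss. A sketch (function name and example values are illustrative):

```python
import numpy as np

def beta_vae_loss(recon_loss, mu, log_var, beta=4.0):
    """ELBO-style loss with the KL term scaled by beta (beta=1 recovers the plain VAE)."""
    kl = -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var))
    return recon_loss + beta * kl

# With beta > 1 the KL penalty dominates, pressuring each latent dimension
# toward N(0, 1) unless it encodes a genuinely useful factor of variation.
mu = np.array([[0.5, -0.5]])
log_var = np.zeros((1, 2))
print(beta_vae_loss(1.0, mu, log_var, beta=1.0))  # 1.25
print(beta_vae_loss(1.0, mu, log_var, beta=4.0))  # 2.0
```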
- **VQ-VAE (Vector-Quantized VAE)**
  - The latent space is discrete, mapped to a finite set of embedding vectors.
  - Useful for discrete representation learning, compression, and tasks like generative modeling with autoregressive priors.
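The quantization step can be sketched as a nearest-neighbor lookup into the codebook (codebook size and dimensions here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical codebook: 16 embedding vectors of dimension 8.
codebook = rng.standard_normal((16, 8))
z_e = rng.standard_normal((4, 8))   # continuous encoder outputs for a batch of 4

# Quantize: replace each encoder output by its nearest codebook vector.
dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (4, 16)
indices = dists.argmin(axis=1)      # discrete codes, usable by an autoregressive prior
z_q = codebook[indices]             # quantized latents fed to the decoder

print(indices.shape, z_q.shape)     # (4,) (4, 8)
```

The `indices` are the discrete representation; the straight-through gradient trick needed to train through `argmin` is omitted here.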
Run a single training with:

```
make train
```

You can override parameters with Hydra, e.g.:

```
python train.py model=vq_vae training.epochs=5 training.optimizer.lr=1e-3
```

Run a parameter sweep with:

```
make sweep PARAMS="model=vq_vae training.epochs=3,5,7 training.optimizer.lr=5e-3,1e-3"
```

This launches all combinations of the parameter values. Each run gets its own output folder under `outputs/YYYY-MM-DD_HH-MM-SS/`.
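The sweep expands each comma-separated value list into a Cartesian product, one run per combination. A sketch of that expansion using the example values above:

```python
from itertools import product

# Values from the example sweep above.
epochs = [3, 5, 7]
lrs = [5e-3, 1e-3]

# One override string per combination, mirroring what Hydra launches.
runs = [f"model=vq_vae training.epochs={e} training.optimizer.lr={lr}"
        for e, lr in product(epochs, lrs)]

print(len(runs))  # 3 epochs x 2 learning rates = 6 runs
```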
All parameters are configurable via Hydra in the `config/` directory:

- **model**
  - `ae` — Autoencoder
  - `vae` — Variational Autoencoder
  - `beta_vae` — Beta-VAE
  - `vq_vae` — Vector-Quantized VAE
- **training**
  - `epochs` — number of training epochs
  - `batch_size` — batch size
  - `optimizer`
    - `name` — optimizer type (e.g., `adam`, `sgd`)
    - `lr` — learning rate
    - `weight_decay` — weight decay for regularization
  - `device` — `"cuda"`, `"cpu"`, or `"auto"`
- **visualization**
  - `active` — enable or disable visualization
  - `n_samples` — number of images to visualize
- **evaluation**
  - `active` — enable or disable evaluation
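A dotted override like `training.optimizer.lr=1e-3` simply addresses a path through these nested config groups. Hydra does much richer parsing and type handling, but the mapping can be sketched with a plain dict (the default values below are hypothetical):

```python
def apply_override(cfg: dict, override: str) -> None:
    """Apply one 'a.b.c=value' style override to a nested dict (simplified sketch;
    values are kept as strings, unlike Hydra's typed handling)."""
    key, _, value = override.partition("=")
    node = cfg
    *parents, leaf = key.split(".")
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value

# Hypothetical defaults mirroring the config groups above.
cfg = {"model": "ae", "training": {"epochs": 20, "optimizer": {"lr": 1e-2}}}
apply_override(cfg, "training.optimizer.lr=1e-3")
apply_override(cfg, "model=vae")
print(cfg["model"], cfg["training"]["optimizer"]["lr"])  # vae 1e-3
```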
Override any parameter directly from the command line using Hydra syntax. For example:
```
python train.py model=vae training.epochs=10 training.optimizer.lr=1e-3 visualization.n_samples=16
```

Hydra automatically creates per-run output directories:

- `outputs/YYYY-MM-DD_HH-MM-SS/`
  - `train.log` — logs for the run
  - `plot/` — reconstructed images for the model
Each multirun or sweep creates a separate folder, so runs do not overwrite each other.
```
make train                 # Run single training
make sweep PARAMS="..."    # Run Hydra multirun
make clean                 # Remove temporary files, virtualenv, outputs
make all                   # Show help
```