
Progressive Checkerboards for Autoregressive Multiscale Image Generation

example visualization

Code and pretrained models, as described in the paper: Progressive Checkerboards for Autoregressive Multiscale Image Generation.

This codebase builds on a combination of several prior codebases.

Download Pretrained Models

Pretrained models are available on HuggingFace at deigen/checkerboardgen.

To download them, you can use the download.sh script:

bash pretrained_models/download.sh

The models are also available individually at the following links:

Model               Google Drive   HuggingFace
Autoencoder         gdrive         hf
Checkerboard-L-2x   gdrive         hf
Checkerboard-L-4x   gdrive         hf

To download by hand, create the following directories:

mkdir -p pretrained_models/autoencoder
mkdir -p pretrained_models/checkerboard-L-2x
mkdir -p pretrained_models/checkerboard-L-4x

Then download the config files and checkpoints from the links above into their respective directories.

Generating Samples

To generate samples using the pretrained models, run the following command:

python main_sample.py --checkpoint pretrained_models/checkerboard-L-2x/checkpoint-last.pth --cfg 1.5 --steps_per_scale 4
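The --cfg flag sets the classifier-free guidance scale. For reference, here is a minimal sketch of the standard classifier-free guidance blend that such a flag typically controls (an illustration of the general technique, not the repo's actual sampling code):

```python
# Standard classifier-free guidance (CFG) blend -- illustrative only.
def cfg_combine(cond_logits, uncond_logits, scale):
    """Blend conditional and unconditional logits.

    scale = 1.0 reproduces the conditional logits; scale > 1.0 pushes
    predictions further toward the conditioning signal (e.g. class label).
    """
    return [u + scale * (c - u) for c, u in zip(cond_logits, uncond_logits)]

# With scale 1.5, each logit moves 1.5x of the way from uncond to cond.
print(cfg_combine([2.0, 0.0], [1.0, 1.0], 1.5))  # [2.5, -0.5]
```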

Training Models

To train a new model, use the following command:

python main.py --config configs/config_256_S.yaml

This assumes you have ImageNet downloaded and linked at data/imagenet. Adjust the config file as needed for different datasets or model configurations.

Training Data Cache

To speed up training, you can create a cached version of the training data by running:

python main.py --config configs/config_256_S.yaml --create_cache --batch_size 32

and then pass the --use_cache flag during training to read from the cache:

python main.py --config configs/config_256_S.yaml --use_cache

This creates a cache with all of the scales used for experiments in the paper. To only cache scales used in one model config, see the code in main.py.
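The cache amounts to precomputing each sample once and reusing it across epochs. A minimal sketch of the general pattern with stdlib pickle and a hypothetical `get_or_create` helper (the repo's actual cache logic lives in main.py and may differ):

```python
# General compute-once, read-many cache pattern -- illustrative only.
import pickle
from pathlib import Path

def get_or_create(cache_dir, key, compute_fn):
    """Return the cached value for `key`, computing and storing it on a miss."""
    path = Path(cache_dir) / f"{key}.pkl"
    if path.exists():                      # cache hit: skip recomputation
        return pickle.loads(path.read_bytes())
    value = compute_fn()                   # cache miss: compute once...
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(pickle.dumps(value))  # ...and persist for later epochs
    return value
```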

Evaluation

There are two evaluation methods used in this repo.

torch_fidelity is used for inline evaluations during training.

For final evaluations, we use the original TensorFlow implementations; these produce slightly different numbers, but are more comparable to prior work. The scripts are evaluator.py and run_evaluator.sh. The reference set used for these evaluations is available at the reference set link.
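Both evaluators ultimately compute the same FID formula; the small numerical differences come from feature extraction, not the metric itself. For reference, the Frechet distance between the two Gaussians fit to Inception features, sketched with NumPy/SciPy:

```python
# Frechet distance underlying FID (standard definition) -- illustrative only.
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """FID term: ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^(1/2))."""
    covmean = sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):   # discard tiny imaginary parts from sqrtm
        covmean = covmean.real
    diff = mu1 - mu2
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)
```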

Autoencoder Requantization

The autoencoder used in this work is a requantized version of the LlamaGen autoencoder. The requantization code is in train_ae_codebook/; it is somewhat complex for what it does, but is included for completeness. This code retrains just the quantization layer, as described in our paper, using the pretrained weights from LlamaGen.
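The component being retrained is the vector-quantization codebook: each encoder latent is snapped to its nearest code vector. A minimal sketch of that assignment step (illustrative of the general VQ operation; see train_ae_codebook/ for the actual training code):

```python
# Nearest-neighbor codebook assignment in vector quantization -- illustrative.
import numpy as np

def quantize(latents, codebook):
    """Map each latent vector to its nearest codebook entry (L2 distance).

    latents:  (N, D) array of encoder outputs
    codebook: (K, D) array of code vectors
    returns:  (indices, quantized) with quantized[i] = codebook[indices[i]]
    """
    # Squared L2 distances between every latent and every code: shape (N, K)
    d = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    indices = d.argmin(axis=1)
    return indices, codebook[indices]
```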

Errata

  • We use the standard reference set for FID evals, not the full ImageNet validation set as mentioned in the paper (v1).

Citation

If you find this code useful for your research, please consider citing the following paper:

@article{progcheck2026,
  title={Progressive Checkerboards for Autoregressive Multiscale Image Generation},
  author={David Eigen},
  journal={arXiv preprint arXiv:2602.03811},
  year={2026}
}
