Add DCAE and SAAF with dictionary context and AuxT support #356

Open
Yiozolm wants to merge 6 commits into InterDigitalInc:master from Yiozolm:pr-dcae-saaf-auxt

@Yiozolm Yiozolm commented Jun 19, 2026

Contributor

Adds two channel-slice image compression models from the per-model PR series:

  • DCAE (Dictionary-based Channel-wise Auto-regressive Entropy) from Lu et al., "Learned Image Compression with Dictionary-based Entropy Model", CVPR 2025.
  • SAAF (Sparse Attention with Adaptive Frequency) from Ma et al., "Learned Image Compression via Sparse Attention and Adaptive Frequency", CVPR 2026.

This PR builds on the containerized entropy-stack direction from #355: both models use HyperpriorLatentCodec + ChannelGroupsLatentCodec, with model-specific dictionary context heads layered on top instead of introducing a separate codec class.

Pretrained weights are intentionally not bundled. Calling pretrained=True raises a clear RuntimeError until weights are hosted.
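A minimal sketch of the guard described above, assuming a zoo-style factory function. The name `dcae_sketch`, the error message, and the dictionary returned in place of a real model are illustrative assumptions, not the PR's actual code:

```python
def dcae_sketch(pretrained: bool = False, **kwargs):
    """Hypothetical zoo-style factory; stands in for the real "dcae" entry."""
    if pretrained:
        # Fail loudly instead of silently returning an untrained model.
        raise RuntimeError(
            "Pretrained weights for 'dcae' are not hosted yet; "
            "use pretrained=False and load a converted checkpoint instead."
        )
    # Placeholder for constructing the real model with the given kwargs.
    return {"arch": "dcae", "kwargs": kwargs}
```

Raising early keeps a caller from mistaking randomly initialized weights for a trained model.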

Summary

  • New compressai.models.dcae.DCAE and compressai.models.saaf.SAAF model classes.
  • New dictionary cross-attention building blocks in compressai.layers.attn.dictionary.
  • New dictionary context helpers in compressai.models._helpers.dictionary_context.
  • New AuxT primitives and integration helpers in compressai.models._helpers.auxt.
  • Adds optional AuxT side-branch support to TCM via use_auxt=True.
  • Adds lightweight wavelet wrappers under compressai.layers.wave for AuxT.
  • Adds zoo entries for "dcae" and "saaf" using lazy imports.
  • Adds checkpoint converters:
    • examples/convert_dcae_checkpoint.py
    • examples/convert_saaf_checkpoint.py
  • Extends examples/convert_tcm_checkpoint.py for upstream TCM checkpoints that include AuxT keys.
  • Adds focused tests for model forward paths, from_state_dict round trips, converters, dictionary helpers, AuxT helpers, and zoo registration.
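
One way the lazy zoo registration could look is sketched below. The module paths are the ones listed in the summary; the `LAZY_MODELS` table and `resolve_model` function are hypothetical names, and the actual zoo wiring in this PR may differ:

```python
import importlib

# Zoo name -> (module path, class name). Nothing is imported until a model
# is actually requested. LAZY_MODELS and resolve_model are hypothetical.
LAZY_MODELS = {
    "dcae": ("compressai.models.dcae", "DCAE"),
    "saaf": ("compressai.models.saaf", "SAAF"),
}


def resolve_model(name, registry=LAZY_MODELS):
    """Import the module for `name` on first use and return its class."""
    try:
        module_path, attr = registry[name]
    except KeyError:
        raise KeyError(f"unknown model: {name!r}") from None
    module = importlib.import_module(module_path)
    return getattr(module, attr)
```

Deferring the import means that listing the zoo does not pay the import cost of every model module.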

Design notes

DCAE and SAAF share the same outer entropy topology:

HyperpriorLatentCodec(
    h_a=h_a,
    h_s=DualHyperSynthesis(h_mean_s, h_scale_s),
    latent_codec={
        "z": EntropyBottleneckLatentCodec(...),
        "y": ChannelGroupsLatentCodec(...),
    },
)

The model-specific variation lives in the per-slice dictionary context heads:

  • shared dictionary parameters are represented by SharedDictionary;
  • per-slice mean / scale prediction uses DictionaryMeanScaleContextHead;
  • LRP remains in the per-slice LRPGaussianLatentCodec leaf.

SAAF reuses the DCAE entropy wiring and adds:

  • integral AuxT-style aux_enc / aux_dec branches;
  • OLP regularization aggregated through the shared AuxT helper;
  • a training-only diffusion_loss head.

TCM keeps AuxT optional through use_auxt=True; from_state_dict auto-detects saved AuxT keys.
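
The auto-detection could be as simple as a prefix scan over the checkpoint keys. In this sketch the prefixes `aux_enc.` and `aux_dec.` are assumed from the SAAF branch names above; the real TCM key names and the function name `has_auxt_keys` are assumptions:

```python
# Assumed key prefixes for the AuxT side branches; the real names may differ.
AUXT_PREFIXES = ("aux_enc.", "aux_dec.")


def has_auxt_keys(state_dict, prefixes=AUXT_PREFIXES):
    """Return True if any checkpoint key belongs to an AuxT branch."""
    # str.startswith accepts a tuple, so one call checks every prefix.
    return any(key.startswith(prefixes) for key in state_dict)


# A loader could then pick the constructor flag from the checkpoint itself.
plain = {"g_a.0.weight": None}
with_auxt = {"g_a.0.weight": None, "aux_enc.0.weight": None}
use_auxt_plain = has_auxt_keys(plain)
use_auxt_saved = has_auxt_keys(with_auxt)
```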

State-dict layout

Upstream checkpoint layouts are converted into the CompressAI-native layout before loading:

latent_codec.h_a.*
latent_codec.h_s.h_mean_s.*
latent_codec.h_s.h_scale_s.*
latent_codec.z.entropy_bottleneck.*
latent_codec.y.channel_context.y{k}.*
latent_codec.y.latent_codec.y{k}.*

The converters handle upstream-specific naming, including DCAE / SAAF dictionary keys, mean/scale hyper-synthesis renames, per-slice context heads, LRP transforms, Gaussian conditional buffers, and AuxT key normalization.
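
A rough sketch of the prefix-renaming part of such a converter follows. The target prefixes come from the layout listed above; the upstream prefixes on the left are hypothetical, since upstream naming is not spelled out here, and the real converters cover many more cases:

```python
# (assumed upstream prefix, CompressAI-native prefix). Only the right-hand
# side is taken from the layout above; the left-hand side is illustrative.
RENAMES = [
    ("h_a.", "latent_codec.h_a."),
    ("h_mean_s.", "latent_codec.h_s.h_mean_s."),
    ("h_scale_s.", "latent_codec.h_s.h_scale_s."),
    ("entropy_bottleneck.", "latent_codec.z.entropy_bottleneck."),
]


def convert_key(key):
    """Rewrite one key; keys with no matching prefix pass through unchanged."""
    for old, new in RENAMES:
        if key.startswith(old):
            return new + key[len(old):]
    return key


def convert_state_dict(state_dict):
    """Return a new dict with every key mapped to the native layout."""
    return {convert_key(k): v for k, v in state_dict.items()}
```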

from_state_dict only infers constructor kwargs from the converted CompressAI layout; it does not directly accept every upstream checkpoint layout.
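
As an illustration of inferring a constructor argument from the converted layout, the sketch below counts slices from the `latent_codec.y.latent_codec.y{k}.*` keys. The function name `infer_num_slices` and the idea that the slice count is one of the inferred kwargs are assumptions:

```python
import re

# Matches per-slice leaf keys in the converted layout and captures k.
_SLICE_KEY = re.compile(r"^latent_codec\.y\.latent_codec\.y(\d+)\.")


def infer_num_slices(state_dict):
    """Infer the slice count from the highest y{k} index in the keys."""
    indices = set()
    for key in state_dict:
        match = _SLICE_KEY.match(key)
        if match:
            indices.add(int(match.group(1)))
    if not indices:
        # An unconverted upstream checkpoint would land here.
        raise ValueError("no latent_codec.y.latent_codec.y{k}.* keys found")
    return max(indices) + 1
```

A checkpoint still in an upstream layout has no matching keys, so the function raises instead of guessing, which mirrors the requirement to convert before loading.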

Commits

| Commit | Scope |
| --- | --- |
| feat(layers): lift dictionary cross-attention building blocks to compressai.layers.attn | Shared dictionary attention primitives |
| feat(models/_helpers): add SharedDictionary and DictionaryMeanScaleContextHead | DCAE / SAAF dictionary context wiring |
| feat(models): add DCAE with containerized codec | DCAE model + converter support |
| feat(models): add AuxT primitives, helpers, and TCM use_auxt opt-in | AuxT helpers, wavelet wrappers, TCM integration |
| feat(models): add SAAF with containerized codec and integral AuxT | SAAF model + converter support |
| test,zoo: cover and register DCAE SAAF and AuxT | Zoo entries and focused tests |

Test plan

  • uv run pytest tests/test_models.py tests/test_models_helpers.py tests/test_layers.py -q
  • uv run pytest tests/test_zoo.py -q
  • uv run pytest tests/test_latent_codecs.py -q
  • uv run ruff format --check compressai tests examples
  • uv run ruff check compressai tests examples

Notes for reviewers

  • DCAE-private and SAAF-private backbone blocks are intentionally kept inside their model files because they are not currently reused by other models.
  • The dictionary context helper is shared because it is used by both DCAE and SAAF.
  • AuxT is factored into _helpers.auxt because it is used by SAAF and optionally by TCM, and is expected to be reusable by future models with the same side-branch pattern.

Yiozolm changed the title from "Pr dcae saaf auxt" to "Add DCAE and SAAF with dictionary context and AuxT support" on Jun 19, 2026