BoltzFormer is designed for text-promptable segmentation, with superior performance on small objects. It performs Boltzmann sampling within the attention mechanism of the transformer, allowing the ...
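As a rough illustration of the idea (a minimal sketch, not the released BoltzFormer code), the snippet below draws a sparse subset of image patches from a temperature-scaled softmax (a Boltzmann distribution) over attention logits and restricts attention to that subset; the function name and the `temperature` / `num_samples` parameters are illustrative assumptions.

```python
# Minimal sketch of Boltzmann sampling over attention logits (illustrative only,
# not the official BoltzFormer implementation).
import torch

def boltzmann_sample_mask(logits: torch.Tensor, temperature: float, num_samples: int) -> torch.Tensor:
    """logits: (batch, num_patches) relevance scores for one query.
    Returns a boolean mask (batch, num_patches) with num_samples True entries per row."""
    probs = torch.softmax(logits / temperature, dim=-1)        # Boltzmann distribution
    idx = torch.multinomial(probs, num_samples, replacement=False)
    mask = torch.zeros_like(probs, dtype=torch.bool)
    mask.scatter_(-1, idx, True)                               # keep only the sampled patches
    return mask

# Usage: attention is computed only over the sampled patches; a lower temperature
# concentrates sampling on the highest-scoring (often small) regions.
logits = torch.randn(2, 196)                                   # e.g. a 14x14 patch grid
mask = boltzmann_sample_mask(logits, temperature=0.5, num_samples=32)
attn_scores = logits.masked_fill(~mask, float("-inf"))
attn = torch.softmax(attn_scores, dim=-1)
```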
Abstract: Vector-Quantization (VQ) based discrete generative models are widely used to learn powerful high-quality (HQ) priors for blind image restoration (BIR). In this paper, we diagnose the ...
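For readers unfamiliar with the term, the sketch below shows the generic vector-quantization step behind such discrete priors: a standard VQ-VAE-style nearest-codebook lookup, assumed here purely for illustration and not taken from this paper.

```python
# Generic VQ lookup: each feature vector is replaced by its nearest codebook
# entry, so images are represented by discrete indices into a learned HQ codebook.
import torch

def vector_quantize(z: torch.Tensor, codebook: torch.Tensor):
    """z: (N, D) encoder features; codebook: (K, D) learned codebook.
    Returns the quantized features and their discrete code indices."""
    dists = torch.cdist(z, codebook)          # (N, K) pairwise distances
    indices = dists.argmin(dim=-1)            # nearest codebook entry per feature
    z_q = codebook[indices]                   # discrete (quantized) representation
    return z_q, indices

z = torch.randn(64, 256)                      # e.g. flattened latent features
codebook = torch.randn(1024, 256)             # K = 1024 codes of dimension 256
z_q, codes = vector_quantize(z, codebook)
```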
Abstract: Variable-rate coding is challenging but indispensable for learned image compression (LIC), which is inherently characterized by nonlinear transform coding (NTC). Existing methods for ...
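For context only, the sketch below illustrates one common variable-rate mechanism in NTC-based LIC, namely per-rate learnable gain vectors that rescale the latent before quantization; this is a generic assumption for illustration, not the method proposed in the abstract above, and the `GainUnit` name and parameters are hypothetical.

```python
# Generic gain-unit sketch for variable-rate LIC: one set of per-channel gains
# per target rate point, applied to the latent before (hard) quantization.
import torch
import torch.nn as nn

class GainUnit(nn.Module):
    """Learnable per-channel gains, one set per supported rate point."""
    def __init__(self, num_channels: int, num_rates: int):
        super().__init__()
        self.gain = nn.Parameter(torch.ones(num_rates, num_channels))
        self.inv_gain = nn.Parameter(torch.ones(num_rates, num_channels))

    def forward(self, y: torch.Tensor, rate_idx: int) -> torch.Tensor:
        # Scale the latent y (B, C, H, W); larger gains preserve more precision
        # through rounding and therefore spend more bits.
        g = self.gain[rate_idx].view(1, -1, 1, 1)
        y_hat = torch.round(y * g)                             # hard quantization (inference)
        return y_hat * self.inv_gain[rate_idx].view(1, -1, 1, 1)

y = torch.randn(1, 192, 16, 16)                                # latent from a nonlinear analysis transform
unit = GainUnit(num_channels=192, num_rates=4)
y_hat = unit(y, rate_idx=2)                                    # select one of 4 rate points
```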