Publications | Vijay Veerabadran

2023

NeurIPS 2023

Adaptive recurrent vision performs zero-shot computation scaling to unseen difficulty levels

Vijay Veerabadran, Srinivas Ravishankar, Yuan Tang, and 2 more authors

Proceedings of the 37th International Conference on Neural Information Processing Systems, 2023

Abs HTML PDF

Humans solving algorithmic (or) reasoning problems typically exhibit solution times that grow as a function of problem difficulty. Adaptive recurrent neural networks have been shown to exhibit this property for various language-processing tasks. However, little work has been performed to assess whether such adaptive computation can also enable vision models to extrapolate solutions beyond their training distribution’s difficulty level, with prior work focusing on very simple tasks. In this study, we investigate a critical functional role of such adaptive processing using recurrent neural networks: to dynamically scale computational resources conditional on input requirements that allow for zero-shot generalization to novel difficulty levels not seen during training using two challenging visual reasoning tasks: PathFinder and Mazes. We combine convolutional recurrent neural networks (ConvRNNs) with a learnable halting mechanism based on Graves (2016). We explore various implementations of such adaptive ConvRNNs (AdRNNs) ranging from tying weights across layers to more sophisticated biologically inspired recurrent networks that possess long/short-range lateral connections and gating. We show that 1) AdRNNs learn to dynamically halt processing early (or late) to solve easier (or harder) problems, 2) these RNNs zero-shot generalize to more difficult problem settings not shown during training by dynamically increasing the number of recurrent iterations at test time. Our study provides modeling evidence supporting the hypothesis that recurrent processing enables the functional advantage of adaptively allocating compute resources conditional on input requirements and hence allowing generalization to harder difficulty levels of a visual reasoning problem without training.
VSS 2023

Cortically motivated recurrence enables visual task extrapolation

Vijay Veerabadran, Yuan Tang, Ritik Raina, and 1 more author

J. Vis., Aug 2023

Abs HTML PDF

Feedforward deep neural networks have become the standard class of models in computer vision. Yet, they possess a striking difference relative to their biological counterparts which predominantly perform...
VSS 2023

Cortically motivated recurrence enables task extrapolation

Vijay Veerabadran, Yuan Tang, Ritik Raina, and 1 more author

Vision Sciences Society, COSYNE, May 2023

Abs HTML PDF

Feedforward deep neural networks have become the standard class of models in the field of computer vision. Yet, they possess a striking difference relative to their biological counterparts which predominantly perform “recurrent” computations. Why do biological neurons evolve to employ recurrence pervasively? In this paper, we show that a recurrent network is able to flexibly adapt its computational budget during inference and generalize within-task across difficulties. Simultaneously in this study, we contribute a recurrent module we call LocRNN that is designed based on a prior computational model of local recurrent intracortical connections in primates to support such dynamic task extrapolation. LocRNN learns highly accurate solutions to the challenging visual reasoning problems of Mazes and PathFinder that we use here. More importantly, it is able to flexibly use less or more recurrent iterations during inference to zero-shot generalize to less- and more difficult instantiations of each task without requiring extra training data, a potential functional advantage of recurrence that biological visual systems capitalize on. Feedforward networks on the other hand with their fixed computational graphs only partially exhibit this trend, potentially owing to image-level similarities across difficulties. We also posit an intriguing tradeoff between recurrent networks’ representational capacity and their stability in the recurrent state space. Our work encourages further study of the role of recurrence in deep learning models – especially from the context of out-of-distribution generalization & task extrapolation – and their properties of task performance and stability.
Nature 2023

Subtle adversarial image manipulations influence both human and machine perception

Vijay Veerabadran, Josh Goldman, Shreya Shankar, and 8 more authors

Nature Communications, Aug 2023

Abs HTML PDF

Although artificial neural networks (ANNs) were inspired by the brain, ANNs exhibit a brittleness not generally observed in human perception. One shortcoming of ANNs is their susceptibility to adversarial perturbations-subtle modulations of natural images that result in changes to classification decisions, such as confidently mislabelling an image of an elephant, initially classified correctly, as a clock. In contrast, a human observer might well dismiss the perturbations as an innocuous imaging artifact. This phenomenon may point to a fundamental difference between human and machine perception, but it drives one to ask whether human sensitivity to adversarial perturbations might be revealed with appropriate behavioral measures. Here, we find that adversarial perturbations that fool ANNs similarly bias human choice. We further show that the effect is more likely driven by higher-order statistics of natural images to which both humans and ANNs are sensitive, rather than by the detailed architecture of the ANN.

2022

VSS 2022

Bio-inspired divisive normalization improves object recognition performance in ANNs

Vijay Veerabadran, Ritik Raina, and Virginia De Sa

Vision Sciences Society, Dec 2022

Abs HTML PDF

In this work we introduce DivNormEI, a novel bio-inspired convolutional network that performs divisive normalization, a canonical cortical computation, along with lateral inhibition and excitation that is tailored for integration into modern Artificial Neural Networks (ANNs). DivNormEI, an extension of prior computational models of divisive normalization (Schwartz & Simoncelli, 2001; Robinson et. al, 2007) in the primate primary visual cortex, is implemented as a modular fully differentiable neural network layer that can be integrated in a straightforward manner into most commonly used modern ANNs. DivNormEI normalizes incoming activations via learned non-linear within-feature shunting inhibition along with across-feature linear lateral inhibition and excitation. In this work, we show how the integration of DivNormEI within a task-driven self-supervised encoder-decoder architecture encourages the emergence of the well-known contrast-invariant tuning property found to be exhibited by simple cells in the primate primary visual cortex. In addition, the integration of DivNormEI into an ANN (VGG-9 network) trained to perform large-scale object recognition on static images from the ImageNet-100 dataset improves both sample efficiency and top-1 accuracy on a held-out validation set. We also discuss the ability of a larger hybrid ANN (ResNet-50 with hierarchical placement of DivNormEI) to perform competitively on the more challenging task of semantic image segmentation. We believe our findings from the bio-inspired DivNormEI model that simultaneously explains properties found in primate V1 neurons and outperforms the competing baseline architecture on large-scale object recognition will promote further investigation of this crucial cortical computation in the context of modern machine learning tasks and ANNs.

2021

NeurIPS 2021

Bio-inspired learnable divisive normalization for ANNs

Vijay Veerabadran, Ritik Raina, and Virginia R De Sa

3rd Workshop on Shared Visual Representations in Human and Machine Intelligence (SVRHM), NeurIPS, Dec 2021

Abs HTML PDF

In this work we introduce DivNormEI, a novel bio-inspired convolutional network that performs divisive normalization, a canonical cortical computation, along with lateral …

2020

arXiv

Learning compact generalizable neural representations supporting perceptual grouping

Vijay Veerabadran, and Virginia R De Sa

arXiv preprint, Jun 2020

Abs HTML PDF

Work at the intersection of vision science and deep learning is starting to explore the efficacy of deep convolutional networks (DCNs) and recurrent networks in solving perceptual grouping problems that underlie primate visual recognition and segmentation. Here, we extend this line of work to investigate the compactness and generalizability of DCN solutions to learning low-level perceptual grouping routines involving contour integration. We introduce V1Net, a bio-inspired recurrent unit that incorporates lateral connections ubiquitous in cortical circuitry. Feedforward convolutional layers in DCNs can be substituted with V1Net modules to enhance their contextual visual processing support for perceptual grouping. We compare the learning efficiency and accuracy of V1Net-DCNs to that of 14 carefully selected feedforward and recurrent neural architectures (including state-of-the-art DCNs) on MarkedLong – a synthetic forced-choice contour integration dataset of 800,000 images we introduce here – and the previously published Pathfinder contour integration benchmarks. We gauged solution generalizability by measuring the transfer learning performance of our candidate models trained on MarkedLong that were fine-tuned to learn PathFinder. Our results demonstrate that a compact 3-layer V1Net-DCN matches or outperforms the test accuracy and sample efficiency of all tested comparison models which contain between 5x and 1000x more trainable parameters; we also note that V1Net-DCN learns the most compact generalizable solution to MarkedLong. A visualization of the temporal dynamics of a V1Net-DCN elucidates its usage of interpretable grouping computations to solve MarkedLong. The compact and rich representations of V1Net-DCN also make it a promising candidate to build on-device machine vision algorithms as well as help better understand biological cortical circuitry.
CVPRW 2020

Adversarial distortion for learned video compression

Vijay Veerabadran, R Pourreza, and others

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Jun 2020

Abs HTML PDF

In this paper, we present a novel adversarial lossy video compression model. At extremely low bit-rates, standard video coding schemes suffer from unpleasant reconstruction artifacts …

2018

NeurIPS 2018

Learning long-range spatial dependencies with horizontal gated recurrent units

D Linsley, J Kim, Vijay Veerabadran, and 1 more author

Proceedings of the 32nd International Conference on Neural Information Processing Systems, Jun 2018

Abs HTML PDF

Progress in deep learning has spawned great successes in many engineering applications. As a prime example, convolutional neural networks, a type of feedforward neural networks …