
Modelling visual complexity of real-world scenes

Fintan Nagle and Nilli Lavie

Institute of Cognitive Neuroscience, University College London
Visual complexity and perceptual load

Espeseth 2010 PLOS One; Molloy 2015 J Neurosci
How do we estimate the perceptual load of real-world scenes?

Which is more visually complex?

[Example image pairs with complexity ratings: 18.7 vs 30.1; 16.9 vs 30.7; 34.0 vs 19.27]
Modelling visual complexity: previous work

Machado et al 2006 Acta Psychologica

• Model: feedforward network with one hidden layer predicting complexity from 329 image features
• 800 images of several classes
• Spearman's ρ rank correlation: .83 on an unseen validation set
• However, results could be affected by inter-class differences
Corchs et al 2016 PLOS One

• Model: particle swarm optimised regression on features
• 49 real-world scenes
• r = .81 on unseen validation set of 49 images

Previous work has used classifiers on small feature vectors and has not demonstrated wide applicability to real-world scenes.
Data collection

Data collected using a custom Web platform (Flask server in Python, hosted on AWS).

• 4,000 images
• 62 observers (11 male), mean age 24.9
• 75,020 comparisons (1,210 each on average)
TrueSkill convergence

Convergence of the TrueSkill algorithm after rating of the first 2,000 images (iterations 0-4) and after the addition of the second 2,000 images (iterations 5-8).

Final distribution of complexity ratings for all 4,000 images.

2AFC judgements matched the TrueSkill ratings in 76.1% of comparisons.
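The ratings here come from Microsoft's TrueSkill algorithm (available in Python via the `trueskill` package). As an illustration of the underlying idea, a minimal Elo-style sketch that turns 2AFC "more complex" judgements into a one-dimensional ranking; all image names, scores, and the K-factor are invented for the example, not the talk's actual data.

```python
# Minimal Elo-style sketch of rating items from 2AFC comparisons.
# The talk uses TrueSkill; this simpler update illustrates the same
# idea: each "more complex" judgement nudges the chosen image's score
# up and the other image's score down.

def expected(r_a: float, r_b: float) -> float:
    """Model probability that item A is judged more complex than item B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update(r_winner: float, r_loser: float, k: float = 32.0):
    """Return updated ratings after A (winner) was judged more complex."""
    e = expected(r_winner, r_loser)
    return r_winner + k * (1.0 - e), r_loser - k * (1.0 - e)

# Hypothetical comparisons: (judged more complex, judged less complex)
comparisons = [("img_a", "img_b"), ("img_a", "img_c"), ("img_b", "img_c")]
ratings = {"img_a": 1000.0, "img_b": 1000.0, "img_c": 1000.0}
for winner, loser in comparisons:
    ratings[winner], ratings[loser] = update(ratings[winner], ratings[loser])

# img_a wins twice, so it ends up ranked highest.
ranked = sorted(ratings, key=ratings.get, reverse=True)
```

TrueSkill refines this scheme by tracking a mean and an uncertainty per item, which is what makes the convergence plots above meaningful.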
Modelling

[Predicted vs. perceived complexity: r = 0.83]
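Both kinds of evaluation figure quoted in this deck (Spearman's ρ in Machado et al, Pearson's r here and in Corchs et al) can be computed with SciPy. A toy sketch with invented ratings and predictions:

```python
import numpy as np
from scipy.stats import spearmanr, pearsonr

# Hypothetical held-out perceived ratings and model predictions.
perceived = np.array([12.1, 30.7, 18.4, 25.0, 9.8])
predicted = np.array([14.0, 28.9, 17.1, 26.3, 11.2])

rho, _ = spearmanr(perceived, predicted)  # rank correlation (as in Machado et al)
r, _ = pearsonr(perceived, predicted)     # linear correlation (as reported here)
# The two arrays have identical rank orders, so rho == 1.0 in this toy case.
```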
Object taggings and their influence on complexity

Correlations with perceived complexity ratings of predictions from the PASCAL VOC object taggings.
Complexity and feature congestion

[Original image and its contrast clutter map]

Rosenholtz et al 2007 JoV
Other features

• Object taggings from Mask R-CNN
• Edge ratio
• Salience models

Replication of Corchs et al: we replicated the linear feature combination proposed. Results: r = .47. This feature combination does not work well on a diverse set of real-world images.
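A hedged sketch of a feature-based baseline of the kind replicated above: an edge-ratio feature (fraction of pixels whose Sobel gradient magnitude exceeds a threshold) combined linearly with ratings by least squares. The images, ratings, and the 10%-of-maximum threshold are toy assumptions, not the exact features of Corchs et al.

```python
import numpy as np
from scipy import ndimage

def edge_ratio(img: np.ndarray, thresh: float = 0.1) -> float:
    """Fraction of pixels whose Sobel gradient magnitude exceeds
    thresh * (maximum magnitude). Threshold choice is an assumption."""
    gx = ndimage.sobel(img, axis=0)
    gy = ndimage.sobel(img, axis=1)
    mag = np.hypot(gx, gy)
    return float((mag > thresh * mag.max()).mean())

# Toy images: a uniform field (no edges) vs. random noise (many edges).
rng = np.random.default_rng(0)
flat = np.full((64, 64), 0.5)
noisy = rng.random((64, 64))

# Fit a linear feature combination (edge ratio + intercept) to toy
# complexity ratings by ordinary least squares.
X = np.array([[edge_ratio(im), 1.0] for im in (flat, noisy)])
y = np.array([10.0, 30.0])  # invented complexity ratings
w, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With only a handful of weak features like this, it is unsurprising that such a combination transfers poorly to a diverse real-world image set.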
Predicting complexity from image statistics and features
Complexity maps: occlusion method
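The occlusion method can be sketched as: grey out one patch of the image at a time and record how much the predicted complexity drops, giving one map cell per patch. The 8-pixel patch size and the variance-based stand-in predictor below are assumptions for illustration, not the talk's trained network.

```python
import numpy as np

def occlusion_map(img: np.ndarray, predict, patch: int = 8) -> np.ndarray:
    """Occlusion-based complexity map: replace each patch with the image
    mean and record the drop in predicted complexity."""
    base = predict(img)
    h, w = img.shape
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            occluded = img.copy()
            occluded[i:i + patch, j:j + patch] = img.mean()
            heat[i // patch, j // patch] = base - predict(occluded)
    return heat

def predict(im: np.ndarray) -> float:
    """Hypothetical stand-in predictor: complexity ~ pixel variance."""
    return float(im.var())

rng = np.random.default_rng(1)
img = np.zeros((32, 32))
img[:16, :16] = rng.random((16, 16))  # one textured quadrant
heat = occlusion_map(img, predict)
# Occluding the textured quadrant causes the largest drops, so it
# dominates the resulting map.
```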
Moving from local complexity to global complexity

Perceived complexity maps

• Images split into discs
• Discs rated via 2AFC

[Example disc ratings]

Correlations between disc and whole-image ratings
Complexity maps: interpolation and blurring
Complexity maps: bicubic interpolation
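A minimal sketch of the interpolation-and-blurring step, assuming a hypothetical 4x4 grid of per-disc ratings: cubic-spline upsampling (`scipy.ndimage.zoom` with `order=3` gives a bicubic-style interpolation) followed by Gaussian blurring to smooth disc boundaries. Grid values, zoom factor, and sigma are invented for the example.

```python
import numpy as np
from scipy import ndimage

# Hypothetical 4x4 grid of disc complexity ratings (one value per disc).
coarse = np.array([[10., 12., 20., 22.],
                   [11., 14., 24., 26.],
                   [ 9., 13., 25., 30.],
                   [ 8., 11., 22., 28.]])

# Cubic-spline (order=3) upsampling to a dense 64x64 map, then Gaussian
# blurring so the map varies smoothly across disc boundaries.
dense = ndimage.zoom(coarse, 16, order=3)
smooth = ndimage.gaussian_filter(dense, sigma=4)
```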
Conclusions

• Observers show good agreement on 2AFC decisions on visual complexity.

• A one-dimensional embedding explains these decisions well (agreeing with 76.1% of comparisons).

• Visual complexity can be effectively predicted by a convolutional network (r = 0.83) on a large, diverse set (3,600 training, 400 test) of indoor and outdoor real-world (natural and man-made) images. This approach outperforms the previous state of the art (feature-based models).

• Our model allows estimation of the complexity of natural scene images.

• Provides a plausible tool to operationalize perceptual load.

• Some preliminary work in this area: Chech & Lavie 2019 ECVP (poster this afternoon!)
Thanks
