
Modelling visual complexity of real-world scenes

Fintan Nagle

Imperial College London

f.nagle@imperial.ac.uk
About me

• Cognitive scientist and developer
• Experience with deep learning and building interactive websites

Research interests:

• Temporal visual search & dynamic object recognition (PhD work)
• Visual complexity (postdoc work)
• How is episodic memory implemented in connectionist networks? (opposing Gallistel and King's view; talk next week at GEM Bochum)
• How distrust and misinformation spread in online communities; how critical thinking develops online (co-PI, grant with NCH & Northeastern)
• Knowledge representation and question-answering: NLP vs. semantic networks and knowledge networks
Visual complexity and perceptual load

Espeseth 2010 PLOS One; Molloy 2015 J Neurosci


Visual complexity and perceptual load

How do we estimate the perceptual load of real-world scenes?


Visual complexity and perceptual load

Which is more visually complex?


Visual complexity and perceptual load

Which is more visually complex?


Modelling visual complexity: previous work

Machado et al 2006 Acta Psychologica
• Model: feedforward network with one hidden layer to predict complexity from 329 image features
• 800 images of several classes
• Spearman’s ρ rank correlation: .83 on unseen validation set

Corchs et al 2016 PLOS One
• Model: particle swarm optimised regression on features
• 49 real-world scenes
• r = .81 on unseen validation set of 49 images

➤ Previous work has used classifiers on small feature vectors and has not demonstrated wide applicability to real-world scenes.
How subjective is complexity?

We asked 4 observers to choose the more complex image from 100 randomly
selected pairs of PASCAL VOC images.

• 48% of pairs were judged consistently by all four observers
• 33.5% were agreed upon by 3 out of 4 observers
• 18.5% were tied at 2 versus 2 observers

This indicates a good degree of agreement between observers’ judgements of natural scene images: the percept of visual complexity is not so subjective that it cannot be modelled.
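The breakdown above can be computed by tallying, for each image pair, the size of the largest voting faction among the four observers. A minimal sketch (the vote data below is hypothetical, just to show the tallying logic):

```python
from collections import Counter

def agreement_breakdown(votes):
    """Classify each pair's 2AFC votes (one 'A'/'B' choice per observer)
    by the size of the largest voting faction: 4-0, 3-1, or 2-2."""
    labels = {4: "unanimous", 3: "majority", 2: "tied"}
    counts = Counter()
    for pair_votes in votes:
        top = Counter(pair_votes).most_common(1)[0][1]  # largest faction size
        counts[labels[top]] += 1
    return counts

# Hypothetical data: 4 observers vote on 3 image pairs.
votes = [["A", "A", "A", "A"],   # all agree    -> unanimous
         ["A", "A", "A", "B"],   # 3 vs 1       -> majority
         ["A", "A", "B", "B"]]   # split 2 vs 2 -> tied
breakdown = agreement_breakdown(votes)
```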
Data collection

Data collected using a custom Web platform (Flask server in Python, hosted on Amazon Web Services)

• 4,000 images
• 62 observers (11 male), mean age 24.9
• 75,020 comparisons (1,210 per observer on average)

[Figure: continuum of images ordered by complexity]
TrueSkill convergence

Convergence of the TrueSkill algorithm after rating of the first 2,000 images (iterations 0-4) and after the addition of the second 2,000 images (iterations 5-8).

Final distribution of complexity ratings for all 4,000 images.

2AFC judgements matched the TrueSkill ratings in 76.1% of comparisons.
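TrueSkill treats each image's complexity rating as a Gaussian (μ, σ) and updates both images after every 2AFC comparison: the chosen ("more complex") image wins. A stdlib-only sketch of the standard win/lose update for one comparison (we used an existing TrueSkill implementation; β and the μ=25, σ=25/3 prior below are the conventional defaults, and the draw case is omitted):

```python
import math

def phi(x):   # standard normal pdf
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):   # standard normal cdf
    return (1 + math.erf(x / math.sqrt(2))) / 2

def trueskill_update(winner, loser, beta=25 / 6):
    """One TrueSkill-style update after a 2AFC comparison.
    Each rating is (mu, sigma); returns the updated (winner, loser)."""
    mu_w, s_w = winner
    mu_l, s_l = loser
    c = math.sqrt(2 * beta ** 2 + s_w ** 2 + s_l ** 2)
    t = (mu_w - mu_l) / c
    v = phi(t) / Phi(t)        # mean-shift factor
    w = v * (v + t)            # variance-shrink factor
    new_w = (mu_w + s_w ** 2 / c * v,
             s_w * math.sqrt(max(1 - s_w ** 2 / c ** 2 * w, 1e-9)))
    new_l = (mu_l - s_l ** 2 / c * v,
             s_l * math.sqrt(max(1 - s_l ** 2 / c ** 2 * w, 1e-9)))
    return new_w, new_l

# Both images start at the default prior; image A is judged more complex.
(a_mu, a_sigma), (b_mu, b_sigma) = trueskill_update((25.0, 25 / 3), (25.0, 25 / 3))
```

After the update the winner's mean rises, the loser's falls, and both uncertainties shrink, which is why repeated comparisons converge on a stable ranking.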
Visual complexity and perceptual load

Which is more visually complex?

(TrueSkill complexity ratings: 18.7 vs 30.1)
Visual complexity and perceptual load

Which is more visually complex?

(TrueSkill complexity ratings: 16.9 vs 30.7)
Visual complexity and perceptual load

Which is more visually complex?

(TrueSkill complexity ratings: 34.0 vs 19.27)
Modelling

[Scatter plot: predicted vs perceived complexity, r = 0.83]
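The r = 0.83 figure is a Pearson correlation between the network's predictions and the TrueSkill ratings on held-out images. A minimal sketch of that evaluation step (the predicted values below are made up purely to illustrate the computation):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between predicted and perceived complexity."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy held-out set: TrueSkill ratings vs hypothetical network predictions.
perceived = [18.7, 30.1, 16.9, 30.7, 34.0, 19.27]
predicted = [20.0, 28.5, 18.0, 29.0, 33.0, 21.0]
r = pearson_r(predicted, perceived)
```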
Object taggings and their influence on complexity

[Figures: image count; correlations with perceived complexity ratings of predictions from the PASCAL VOC object taggings]
Complexity and feature congestion

Original image vs contrast clutter map (Rosenholtz et al 2007 JoV)
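The contrast channel of feature congestion measures local luminance variability. A toy stand-in using a sliding-window standard deviation (the actual Rosenholtz model works on CIELab channels across multiple scales and pools them, which this sketch ignores):

```python
import numpy as np

def contrast_clutter(img, win=3):
    """Rough stand-in for a contrast clutter map: standard deviation of
    luminance in a win x win neighbourhood around each pixel."""
    h, w = img.shape
    r = win // 2
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            patch = img[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
            out[y, x] = patch.std()
    return out

# A flat region has zero contrast clutter; a textured one does not.
flat = np.full((8, 8), 0.5)
textured = (np.indices((8, 8)).sum(0) % 2).astype(float)  # checkerboard
```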
Other features

• Object taggings from Mask R-CNN
• Edge ratio
• Salience models

Replication of Corchs et al: we replicated the proposed linear feature combination, running it on our images. Result: r = .47. This feature combination does not work well on a diverse set of real-world images.
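A linear feature combination of the Corchs kind can be fitted by ordinary least squares. A sketch on synthetic data (the actual image features and fitted weights from the paper are not reproduced here; the feature matrix and weights below are invented):

```python
import numpy as np

# 49 images x 3 features (e.g. edge ratio, clutter, object count),
# with ratings generated from known weights plus noise -- all synthetic.
rng = np.random.default_rng(0)
features = rng.random((49, 3))
true_w = np.array([2.0, -1.0, 0.5])
ratings = features @ true_w + 0.1 * rng.standard_normal(49)

# Fit the linear combination (with intercept) by least squares.
X = np.column_stack([features, np.ones(49)])
coef, *_ = np.linalg.lstsq(X, ratings, rcond=None)
predicted = X @ coef
```

On real images the fit quality depends entirely on how much of perceived complexity the chosen features capture, which is the limitation the r = .47 replication result illustrates.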
Predicting complexity from image statistics and features
Complexity maps: occlusion method
Moving from local complexity to global complexity

Perceived complexity maps:
• Images split into discs
• Discs rated via 2AFC (example disc ratings)
• Correlations between disc and whole-image ratings
Complexity maps: interpolation and blurring
Complexity maps: bicubic interpolation
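Going from per-disc ratings to a full-image complexity map is an upsample-and-smooth step. A numpy-only sketch using block upsampling plus a box blur (the maps in the talk use bicubic interpolation instead, and the 2x2 ratings grid here is a toy example):

```python
import numpy as np

def complexity_map(disc_ratings, scale=4):
    """Turn a coarse grid of per-disc complexity ratings into a smooth
    full-size map: nearest-neighbour block upsample, then box blur."""
    up = np.kron(disc_ratings, np.ones((scale, scale)))  # block upsample
    k = scale                      # blur radius tied to upsampling factor
    padded = np.pad(up, k, mode="edge")
    out = np.zeros_like(up)
    h, w = up.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = padded[y:y + 2 * k + 1, x:x + 2 * k + 1].mean()
    return out

ratings = np.array([[18.7, 30.1],
                    [16.9, 34.0]])  # toy 2x2 grid of disc ratings
m = complexity_map(ratings)
```

The box blur is just a stand-in for the smoothing stage; swapping in bicubic interpolation changes the kernel, not the overall local-to-global pipeline.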
Conclusions

• Observers show good agreement on 2AFC decisions on visual complexity.

• A one-dimensional embedding explains these decisions well (agreeing with 76.1% of comparisons).

• Visual complexity can be effectively predicted by a convolutional network (r = 0.83) on a large, diverse set (3,600 training, 400 test) of indoor and outdoor real-world (natural and man-made) images. This approach outperforms the previous state of the art (feature-based models).

• Our model allows evaluation of the complexity of natural scene images and investigation of the relationship between visual complexity and visual perceptual load.

• Some preliminary work in this area: Chech & Lavie 2019 ECVP (poster)
Thanks

fintan.nagle@ucl.ac.uk
