SCRIPT ANALYZER:

A Tool for Quantitative Paleography


VINODH RAJAN
UNIVERSITY OF HAMBURG
HAMBURG, GERMANY

QUANTITATIVE PALEOGRAPHY

• Paleography based on quantitative metrics


• The idea that paleographic features can be ‘measured’
• Goes back several decades, e.g. Loew (1912), Mallon (1952)
• But has gained momentum recently thanks to advances in computational methods

WHY QUANTITATIVE PALEOGRAPHY

• It’s much more interesting!


• With numbers you can do many things
• Basically, the whole world of statistics opens up
• Run different kinds of analysis
• Ability to visualize data in different ways
• It’s much more fun

[...] manuscripts have the potential to deliver up a vast quantity of
information about how scribes wrote. Perhaps the automated
analysis of script will soon turn its attention to reconstructing the
motion of the scribe’s pen on the page and [...] explore the ways that
these strokes evolved. It is then that we will begin to be able to
measure the scribe’s art. (Stansbury, 2009)

SCRIPT ANALYSIS FRAMEWORK

• Characters need a proper framework for analysis


• We propose a new framework to analyze characters via their ductus, among other things
• Take a ‘human-aided’ approach rather than a purely automatic one
• Operate as a ‘grey’ box rather than a ‘black’ box

SCRIPT ANALYSIS FRAMEWORK

Character → Spline Conversion → Trajectory Reconstruction → Stroke Segmentation → Metrics Extraction

SCRIPT ANALYZER: THE TOOL

• This is basically a digital manifestation of the Script Analysis Framework


• A prototype implementation of the framework was developed in Python 2.7
• Provides a GUI for performing the various steps of the framework
• From creating the digitized glyphs to extracting metrics

SPLINE CONVERSION

• Digitization of a character as splines


• Abstract away from the ‘appearance’
• Capture the structure
• This is probably the trickiest step
• It usually has to be done manually
• With very clear images, it can be done automatically
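
As a rough illustration of what digitizing a character as splines can look like, the sketch below stores a glyph as a list of cubic Bézier segments and samples points along them. The choice of cubic Béziers, the `Bezier` type, the function names, and the example glyph are all assumptions made for illustration; the slides do not say which spline type or data layout the tool uses.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# Hypothetical representation: one segment = four control points.
Bezier = Tuple[Point, Point, Point, Point]


def bezier_point(seg: Bezier, t: float) -> Point:
    """Evaluate a cubic Bezier segment at parameter t in [0, 1]."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = seg
    u = 1.0 - t
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    return (b0 * x0 + b1 * x1 + b2 * x2 + b3 * x3,
            b0 * y0 + b1 * y1 + b2 * y2 + b3 * y3)


def sample_glyph(glyph: List[Bezier], n: int = 20) -> List[Point]:
    """Sample every segment at n + 1 evenly spaced parameters.

    The first sample of each later segment is skipped, because it
    repeats the last sample of the segment before it.
    """
    points: List[Point] = []
    for i, seg in enumerate(glyph):
        start = 0 if i == 0 else 1
        for k in range(start, n + 1):
            points.append(bezier_point(seg, k / n))
    return points


# A made-up two-segment glyph (an S-like wave), not taken from any script.
glyph = [
    ((0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)),
    ((1.0, 0.0), (1.0, -1.0), (2.0, -1.0), (2.0, 0.0)),
]
points = sample_glyph(glyph, n=10)
```

The sampled points keep only the structure of the character (where the line goes), not its appearance (ink width, texture), which is the abstraction this step aims for.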

RECONSTRUCTING TRAJECTORY

• We now try to reconstruct the ‘ductus’


• Reconstruction is based on the following criteria
• Effort Minimization
• Preferred directions of writing
• A user can also give their own ductus
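
One way to read the two criteria above is as a cost function over candidate writing orders. The sketch below is a minimal brute-force version, not the tool's actual algorithm: it assumes each spline piece is reduced to its two endpoints, that "effort" is pen-up travel between pieces, and that the preferred directions are left-to-right and top-to-bottom (with y growing downward). `PREFERRED`, `DIRECTION_WEIGHT`, and all function names are hypothetical.

```python
import itertools
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]  # (start, end) of one spline piece

# Assumed preference: left-to-right and top-to-bottom (y grows downward).
PREFERRED = (1.0, 1.0)
DIRECTION_WEIGHT = 0.5  # hypothetical weight of the direction penalty


def direction_penalty(start: Point, end: Point) -> float:
    """Amount of movement that goes against the preferred directions."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    return max(0.0, -dx * PREFERRED[0]) + max(0.0, -dy * PREFERRED[1])


def cost(path: Sequence[Segment]) -> float:
    """Pen-up travel between pieces plus weighted direction penalties."""
    total = 0.0
    for i, (start, end) in enumerate(path):
        total += DIRECTION_WEIGHT * direction_penalty(start, end)
        if i > 0:
            total += math.dist(path[i - 1][1], start)
    return total


def reconstruct(segments: List[Segment]) -> List[Segment]:
    """Try every order and orientation; only feasible for a few pieces."""
    best: List[Segment] = []
    best_cost = math.inf
    for order in itertools.permutations(segments):
        for flips in itertools.product((False, True), repeat=len(order)):
            path = [(e, s) if f else (s, e) for (s, e), f in zip(order, flips)]
            c = cost(path)
            if c < best_cost:
                best, best_cost = path, c
    return best


# An L-shaped character whose pieces were digitized "backwards".
pieces = [((2.0, 2.0), (0.0, 2.0)), ((0.0, 2.0), (0.0, 0.0))]
ductus = reconstruct(pieces)
```

For this example the cheapest path writes the vertical piece downward first and then the horizontal piece rightward without lifting the pen. A user-supplied ductus would simply bypass `reconstruct`.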

STROKE SEGMENTATION

• Detect points at which the ‘pen velocity’ slows down


• Curvature extrema, abrupt changes in direction
• And segment the characters at those points
• Each of these segments corresponds to a ‘stroke’
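
The segmentation idea can be sketched on a sampled trajectory by cutting wherever the direction changes sharply. This is a simplified stand-in for curvature-based detection: the turning-angle test, the 60-degree default threshold, and the function names are assumptions, and the input is assumed to contain at least two points.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def turning_angle(a: Point, b: Point, c: Point) -> float:
    """Absolute change of direction (radians) at b when moving a -> b -> c."""
    h1 = math.atan2(b[1] - a[1], b[0] - a[0])
    h2 = math.atan2(c[1] - b[1], c[0] - b[0])
    diff = abs(h2 - h1)
    return min(diff, 2 * math.pi - diff)


def segment_strokes(points: List[Point],
                    threshold_deg: float = 60.0) -> List[List[Point]]:
    """Cut the trajectory at interior points that turn more than the threshold."""
    threshold = math.radians(threshold_deg)
    strokes: List[List[Point]] = []
    current = [points[0]]
    for i in range(1, len(points) - 1):
        current.append(points[i])
        if turning_angle(points[i - 1], points[i], points[i + 1]) > threshold:
            strokes.append(current)
            current = [points[i]]  # the cut point also starts the next stroke
    current.append(points[-1])
    strokes.append(current)
    return strokes


# An L-shaped trajectory: down, then right (y grows downward).
trajectory = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0), (1.0, 2.0), (2.0, 2.0)]
strokes = segment_strokes(trajectory)
```

Here the 90-degree corner at (0, 2) is the only cut point, so the trajectory splits into a vertical stroke and a horizontal stroke that share that point.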

METRICS EXTRACTION

• Three kinds of metrics are extracted


• Visual Metrics
• Dynamic Metrics
• Cognitive Metrics
• Metrics are ‘descriptive’ and ‘interpretable’ by humans
• Metrics are meant to be script-agnostic (at least in principle)
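
To give a feel for what such descriptive metrics might look like in code, the sketch below computes three of them from segmented strokes: a length-breadth index, the angles between consecutive strokes, and an entropy of stroke directions. The slides name these metrics but do not define them, so every formula here (bounding-box ratio, chord directions, 8 direction bins) is an assumed definition for illustration only.

```python
import math
from collections import Counter
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def length_breadth_index(strokes: List[Stroke]) -> float:
    """Bounding-box height divided by width (assumed definition).

    Raises ZeroDivisionError for a glyph with zero width.
    """
    xs = [x for s in strokes for x, _ in s]
    ys = [y for s in strokes for _, y in s]
    return (max(ys) - min(ys)) / (max(xs) - min(xs))


def stroke_direction(stroke: Stroke) -> float:
    """Direction of the chord from first to last point, in degrees."""
    (x0, y0), (x1, y1) = stroke[0], stroke[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360.0


def angles_between_strokes(strokes: List[Stroke]) -> List[float]:
    """Angle in degrees (0 to 180) between each pair of consecutive strokes."""
    dirs = [stroke_direction(s) for s in strokes]
    angles = []
    for a, b in zip(dirs, dirs[1:]):
        d = abs(b - a) % 360.0
        angles.append(min(d, 360.0 - d))
    return angles


def direction_entropy(strokes: List[Stroke], bins: int = 8) -> float:
    """Shannon entropy (bits) of stroke directions quantized into equal bins."""
    width = 360.0 / bins
    counts = Counter(int(stroke_direction(s) // width) % bins for s in strokes)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Two strokes of an L-shaped character: down, then right.
example = [[(0.0, 0.0), (0.0, 2.0)], [(0.0, 2.0), (2.0, 2.0)]]
lbi = length_breadth_index(example)
angles = angles_between_strokes(example)
entropy = direction_entropy(example)
```

Because each function returns a plain number or list of numbers with an obvious geometric reading, the results stay interpretable by a paleographer, and nothing in them depends on a particular script.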

METRICS EXTRACTION
Visual
• Size
• Length-Breadth Index
• Circularity
• Divergence
• Stroke Angles
• Compactness

Production
• Angles between strokes
• Perplexity of writing
• Fluency of writing
• Changeability
• Entropy of writing

SAVING GLYPHS

SCRIPT REPOSITORY

POSSIBLE APPLICATIONS

Development of Grantha and Kannada from Brahmi

FUTURE WORK

• The software was written around 2012 (that’s ancient!)


• It does not run on modern machines without workarounds
• It is a pain to install and configure
• Make it web-based to increase accessibility and usability
• Add data visualization and simple statistical capabilities

CONCLUSION

• Quantitative Paleography
• Overview of the Script Analysis Framework
• Processing glyphs with the Script Analyzer
• Overview of various metrics
• Possible applications

QUESTIONS?
