You are on page 1of 53

Network Intelligence Meets User

Centered Social Media Networks Reda


Alhajj
Visit to download the full and correct content document:
https://textbookfull.com/product/network-intelligence-meets-user-centered-social-medi
a-networks-reda-alhajj/
More products digital (pdf, epub, mobi) instant
download maybe you interests ...

Social Computing and Social Media User Experience and


Behavior Gabriele Meiselwitz

https://textbookfull.com/product/social-computing-and-social-
media-user-experience-and-behavior-gabriele-meiselwitz/

Fundamentals of User-centered Design Brian Still

https://textbookfull.com/product/fundamentals-of-user-centered-
design-brian-still/

Clinical Neurotechnology meets Artificial Intelligence


Philosophical Ethical Legal and Social Implications 1st
Edition Orsolya Friedrich

https://textbookfull.com/product/clinical-neurotechnology-meets-
artificial-intelligence-philosophical-ethical-legal-and-social-
implications-1st-edition-orsolya-friedrich/

Deploying Wireless Sensor Networks. Theory and Practice


1st Edition Mustapha Reda Senouci

https://textbookfull.com/product/deploying-wireless-sensor-
networks-theory-and-practice-1st-edition-mustapha-reda-senouci/
AI Meets BI: Artificial Intelligence and Business
Intelligence 1st Edition Lakshman Bulusu

https://textbookfull.com/product/ai-meets-bi-artificial-
intelligence-and-business-intelligence-1st-edition-lakshman-
bulusu/

Social Network Analysis and Law Enforcement:


Applications for Intelligence Analysis Morgan Burcher

https://textbookfull.com/product/social-network-analysis-and-law-
enforcement-applications-for-intelligence-analysis-morgan-
burcher/

Network Oriented Modeling for Adaptive Networks


Designing Higher Order Adaptive Biological Mental and
Social Network Models Jan Treur

https://textbookfull.com/product/network-oriented-modeling-for-
adaptive-networks-designing-higher-order-adaptive-biological-
mental-and-social-network-models-jan-treur/

Social Computing and Social Media Design Ethics User


Behavior and Social Network Analysis 12th International
Conference SCSM 2020 Held as Part of the 22nd HCI
International Conference HCII 2020 Copenhagen Denmark
July 19 24 2020 Proceedings Gabriele Meiselwitz
https://textbookfull.com/product/social-computing-and-social-
media-design-ethics-user-behavior-and-social-network-
analysis-12th-international-conference-scsm-2020-held-as-part-of-
the-22nd-hci-international-conference-hcii-2020-copenh/

Social Media Analytics for User Behavior Modeling: A


Task Heterogeneity Perspective 1st Edition Arun Reddy
Nelakurthi

https://textbookfull.com/product/social-media-analytics-for-user-
behavior-modeling-a-task-heterogeneity-perspective-1st-edition-
arun-reddy-nelakurthi/
Lecture Notes in Social Networks

Reda Alhajj · H. Ulrich Hoppe


Tobias Hecking · Piotr Bródka
Przemyslaw Kazienko Editors

Network
Intelligence
Meets User
Centered Social
Media Networks
Lecture Notes in Social Networks

Series editors
Reda Alhajj, University of Calgary, Calgary, AB, Canada
Uwe Glässer, Simon Fraser University, Burnaby, BC, Canada
Huan Liu, Arizona State University, Tempe, AZ, USA
Rafael Wittek, University of Groningen, Groningen, The Netherlands
Daniel Zeng, University of Arizona, Tucson, AZ, USA

Advisory Board
Charu C. Aggarwal, Yorktown Heights, NY, USA
Patricia L. Brantingham, Simon Fraser University, Burnaby, BC, Canada
Thilo Gross, University of Bristol, Bristol, UK
Jiawei Han, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Raúl Manásevich, University of Chile, Santiago, Chile
Anthony J. Masys, University of Leicester, Ottawa, ON, Canada
Carlo Morselli, School of Criminology, Montreal, QC, Canada
More information about this series at http://www.springer.com/series/8768
Reda Alhajj • H. Ulrich Hoppe • Tobias Hecking
Piotr Bródka • Przemyslaw Kazienko
Editors

Network Intelligence Meets


User Centered Social Media
Networks

123
Editors
Reda Alhajj H. Ulrich Hoppe
Department of Computer Science Department of Computer Science and
University of Calgary Applied Cognitive Science
Calgary, Alberta, Canada University of Duisburg-Essen
Duisburg, Nordrhein-Westfalen, Germany
Tobias Hecking
Computer Science and Applied Piotr Bródka
Cognitive Science Department of Computational Intelligence
University of Duisburg-Essen The Engine Centre, Faculty of
Duisburg, Nordrhein-Westfalen, Germany Computer Science and Management
Wroclaw University of Science
Przemyslaw Kazienko and Technology
Faculty of Computer Science Wroclaw, Poland
and Management
The Engine Centre
Wroclaw University of Science
and Technology
Wroclaw, Poland

ISSN 2190-5428 ISSN 2190-5436 (electronic)


Lecture Notes in Social Networks
ISBN 978-3-319-90311-8 ISBN 978-3-319-90312-5 (eBook)
https://doi.org/10.1007/978-3-319-90312-5

Library of Congress Control Number: 2018944599

© Springer International Publishing AG, part of Springer Nature 2018


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, express or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Contents

Part I Centrality and Influence


Targeting Influential Nodes for Recovery in Bootstrap Percolation
on Hyperbolic Networks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Christine Marshall, Colm O’Riordan, and James Cruickshank
Process-Driven Betweenness Centrality Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Mareike Bockholt and Katharina A. Zweig
Behavior-Based Relevance Estimation for Social Networks
Interaction Relations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Pascal Held and Mitch Köhler

Part II Knowledge and Information Diffusion


Network Patterns of Direct and Indirect Reciprocity
in edX MOOC Forums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Oleksandra Poquet and Shane Dawson
Extracting the Main Path of Historic Events from Wikipedia . . . . . . . . . . . . . . 65
Benjamin Cabrera and Barbara König
Identifying Accelerators of Information Diffusion Across Social
Media Channels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Tobias Hecking, Laura Steinert, Simon Leßmann, Víctor H. Masías,
and H. Ulrich Hoppe

Part III Algorithms and Applications I


Community Aliveness: Discovering Interaction Decay Patterns
in Online Social Communities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Mohammed Abufouda

v
vi Contents

Extended Feature-Driven Graph Model for Social Media Networks . . . . . . 119


Ziyaad Qasem, Tobias Hecking, Benjamin Cabrera, Marc Jansen,
and H. Ulrich Hoppe
Incremental Learning in Dynamic Networks for Node Classification. . . . . . 133
Tomasz Kajdanowicz, Kamil Tagowski, Maciej Falkiewicz,
and Przemyslaw Kazienko
Sponge Walker: Community Detection in Large Directed Social
Networks Using Local Structures and Random Walks . . . . . . . . . . . . . . . . . . . . . . 143
Omar Jaafor and Babiga Birregah

Part IV Algorithms and Applications II


Market Basket Analysis Using Minimum Spanning Trees . . . . . . . . . . . . . . . . . . 155
Mauricio A. Valle, Gonzalo A. Ruz, and Rodrigo Morrás
Simulating Trade in Economic Networks with TrEcSim . . . . . . . . . . . . . . . . . . . . 169
Gabriel Barina, Calin Sicoe, Mihai Udrescu, and Mircea Vladutiu
Towards an ILP Approach for Learning Privacy Heuristics
from Users’ Regrets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
Nicolás Emilio Díaz Ferreyra, Rene Meis, and Maritta Heisel

Part V Content Analysis


Trump Versus Clinton: Twitter Communication During
the US Primaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
Jennifer Fromm, Stefanie Melzer, Björn Ross, and Stefan Stieglitz
Strength of Nations: A Case Study on Estimating the Influence
of Leading Countries Using Social Media Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 219
Alexandru Topîrceanu and Mihai Udrescu
Identifying Promising Research Topics in Computer Science . . . . . . . . . . . . . . 231
Rajmund Klemiński and Przemyslaw Kazienko

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
Part I
Centrality and Influence
Targeting Influential Nodes for Recovery
in Bootstrap Percolation on Hyperbolic
Networks

Christine Marshall, Colm O’Riordan, and James Cruickshank

Abstract The influence of our peers is a powerful reinforcement for our social
behaviour, evidenced in voter behaviour and trend adoption. Bootstrap percolation
is a simple method for modelling this process. In this work we look at bootstrap
percolation on hyperbolic random geometric graphs, which have been used to model
the Internet graph, and introduce a form of bootstrap percolation with recovery,
showing that random targeting of nodes for recovery will delay adoption, but this
effect is enhanced when nodes of high degree are selectively targeted.

Keywords Bootstrap percolation · Bootstrap percolation with recovery ·


Hyperbolic random geometric graphs

1 Introduction

In this work, we examine agent-based modelling of Bootstrap Percolation on Hyper-


bolic Random Geometric graphs, and introduce a modified version of Bootstrap
Percolation with Recovery on the same set of graphs. Bootstrap percolation has
been used to model social reinforcement, with the rationale that individuals are
more likely to adopt a new technology, for example, if a significant number of
their contacts already use that technology. A key question in bootstrap percolation
is whether the activity will percolate or not. Our focus is on the effects of
spatial structure on the spread or percolation of an activity, specifically observing
bootstrap percolation on hyperbolic random geometric graphs. Our motivation for
using hyperbolic networks is they share many features with real-world complex

C. Marshall () · C. O’Riordan


Discipline of Information Technology, National University of Ireland, Galway, Ireland
e-mail: c.marshall1@nuigalway.ie; colm.oriordan@nuigalway.ie
J. Cruickshank
School of Mathematics, National University of Ireland, Galway, Ireland
e-mail: james.cruickshank@nuigalway.ie

© Springer International Publishing AG, part of Springer Nature 2018 3


R. Alhajj et al. (eds.), Network Intelligence Meets User Centered Social Media
Networks, Lecture Notes in Social Networks,
https://doi.org/10.1007/978-3-319-90312-5_1
4 C. Marshall et al.

networks; typically displaying power-law degree distribution, high clustering, short


path lengths, high closeness and significant betweenness centralisation.
We have previously created a set of hyperbolic random geometric graphs from
unconnected, with increasing edge density, to fully connected. In this work, we
attach a population of agents to the graphs and explore the effect of percolation
processes on these agents on this set of graphs. Our first question is whether we can
identify a distinct percolation threshold as we increase the number of edges, above
which the activity percolates and below which the activity fails to percolate.
Typically work on bootstrap percolation has looked at the effect of the initial
active seed set on the final outcome of the process, looking at the choice of active
seed, or at the fraction of active seeds required to guarantee percolation of the
activity. This has been used in, for example, viral marketing to see which nodes
might best be targeted to optimise the spread of information. Our overall interest
is in targeting nodes which might prevent the percolation of the activity, given a
random seed set of fixed size. This could model a situation where a network is
randomly targeted with active seeds, and we wish to know which nodes might have
the best chance of obstructing the process. Our focus is on minimising the spread of
an activity given a small-scale attack at random points within the network.
We develop a modified version of standard bootstrap percolation which allows
for the recovery of a defined percentage of active nodes after each activation step;
by recovery we mean the transition from active state back to inactive state. Our next
questions are whether this will impact the spread of the activity, and, furthermore,
if we selectively target active nodes of high degree, will this have a greater impact
on the spread of activity than the same percentage chosen at random, given that hub
nodes are commonly regarded as influential nodes in a network.
Section 2 provides related information about bootstrap percolation and hyper-
bolic graphs. Section 3 introduces our conceptual framework for Bootstrap Perco-
lation with recovery. Section 4 describes our experimental set-up, and our results
are presented in Sect. 5. Our conclusions are discussed in Sect. 6, together with
suggestions for future work.

2 Background

In this section we describe bootstrap percolation and hyperbolic random geometric


graphs.

2.1 Bootstrap Percolation

Bootstrap percolation is a dynamic process in which an activity can spread to


a node in a network if the number of active neighbours of that node is greater
than a predefined activation threshold. This process can model forms of social
Targeting Influential Nodes for Recovery in Bootstrap Percolation on. . . 5

reinforcement where the spread of an activity largely depends on a tipping point in


the number of activated contacts. These activities can include the spread of opinions
or the adoption of a new technology; the underlying assumption is that people are
more likely to adopt a new activity if several of their contacts are already involved
in the same activity. The idea of Bootstrap Percolation was introduced by Chalupa,
Leath and Reich in their 1979 work on Bethe lattices studying the mechanisms of
ferromagnetism, or how materials become magnetised [13].
The bootstrap percolation model is essentially a two-state model, with agents
either active or inactive. In some of the literature, these states are termed infected or
uninfected, but basically means that an activity is present or not present. Standard
bootstrap percolation is a discrete-time process, where the activation mechanism
occurs synchronously for all nodes in the graph in rounds, at each time step. Starting
with an inactive population of agents, a set of nodes is chosen as the active seed set,
this choice may be at random or deterministic. An integer value activation threshold
is chosen and the activation mechanism occurs at each time step, whereby a node
becomes activated if the number of its active neighbours is at least that value. The
rounds of activation are then repeated until equilibrium, where no further state
change is possible, or at some predetermined parameter, such as the number of
rounds. In the spatial form of the process, agents are attached to nodes in a network
and activation depends on the number of active nodes directly connected to a node.
Bootstrap percolation has been studied on a variety of random graphs and
complex networks [2–4, 6, 19, 34]. A key question is whether the activity will
completely percolate. A lot of research has concentrated on the relationship between
the initial active seed set and the final outcome of the process; this has in essence
involved looking at the choice of the initial active seed set to optimise the spread
of the activity [20], or the size of the initial seed set to ensure percolation [11, 17].
Kempe et al. investigated algorithms for optimising this set selection on a variety
of networks, noting that this was NP-hard [21]. The bootstrap percolation process
has been used to model information diffusion [23–25] and viral marketing [15],
behavioural diffusion [12], such as opinion formation and voter trends, the adoption
of brands, technology and innovation in social networks [16, 18, 31], and also
cascading failures in power systems.
In the standard form of bootstrap percolation, activated nodes must remain active,
no reverse state is allowed. In 2014, Coker and Gunderson investigated the idea of
bootstrap percolation with recovery on lattice grids based on an update rule that
infected nodes with few infected neighbours will become uninfected [14]. They used
a probabilistic approach to determine thresholds for the probability of percolation
based on the size of the initial active seed set. By examining various configurations
of 2-tiles (pairs of sites that share an edge or a corner in the lattice grid), they
determine the critical probabilities for percolation.
In epidemiology, related models are the SI and SIS models of disease propagation
[32] which have similar state changes to bootstrap percolation and bootstrap
percolation with recovery, respectively. In the SI model, the population is com-
partmentalised into either Susceptible (i.e. uninfected) or Infected sinks. Each
susceptible individual has a probability of becoming infected, based on the trans-
6 C. Marshall et al.

mission rate, which is a function of the typical number of contacts per individual,
and of the infectivity of the disease. Once infected, individuals move to the infected
compartment and remain infected. The SIS model is an extension of this model
where infected individuals will recover (and become susceptible again) based on
the rate of recovery.
The key point of interest is the epidemic threshold, and the effects of varying
rates of infection and recovery on the spread of the disease. In the SI model, the
infection will ultimately spread to the entire population. In the SIS model, for high
recovery rates the illness will die out in the population and for low recovery rates the
illness will become endemic [5]. These models are compartmental models, where
the population in each sink is homogeneous; each individual within a compartment
has the same likelihood of changing state.
Bootstrap percolation is distinctly different from the epidemic models, as the
population of agents is heterogeneous, each with local knowledge of node properties
in their neighbourhood. Once the initial active set has been chosen, the spread of the
activity is determined by the number of neighbourhood links to active nodes, which
is a function of the underlying network topology.
To address issues of network structure in the epidemic models, Pastor-Satorras
and Vespignani in 2001 [27] introduced new compartments based on node degree,
with nodes of the same degree having the same likelihood of changing state. In
large-scale free models, the hub nodes ensure that the infection is likely to spread
[5]. However, this approach does not take into account community structure or local
clustering. Agent-based modelling of bootstrap percolation on a network is a simple
dynamic process ideally suited to accounting for individual node features.

2.2 Hyperbolic Random Geometric Graphs

In Random Geometric Graphs, pairs of nodes are connected if they lie within a spec-
ified distance parameter of each other. They were developed in the 1970s to model
situations where the distance between nodes is of interest, as in modelling the spread
of a forest fire. These graphs have generally been created in the Euclidean plane,
particularly the unit square and unit disc. In computer science, these models are
frequently used to model wireless ad-hoc and sensor networks [7], where proximity
between nodes defines network connections. Other uses include modelling neural
networks [9], mapping protein-protein interactions [28], and in percolation theory
[11], modelling processes such as diffusion in a network, fluid percolation in porous
materials, fracture patterns in earthquakes and conductivity [30].
Hyperbolic random geometric graphs were developed by Krioukov et al. in 2010
[22]. In this model, pairs of nodes are connected if the hyperbolic distance between
them is less than a specified distance parameter. To highlight the contrast between
hyperbolic and Euclidean random geometric graphs, the Poincaré disc model
transforms the negative curvature of the hyperbolic plane to a 2 dimensional disc.
Points within the disc are not uniformly distributed; the hyperbolic model has more
Targeting Influential Nodes for Recovery in Bootstrap Percolation on. . . 7

Fig. 1 Claudio Rocchini


Order-3 heptakis heptagonal
tiling 2007, distributed under
a CC BY 2.5 licence [29]

“room” than the Euclidean disc. As the radius of a Euclidean disc increases, the
circumference increases linearly; in a hyperbolic disc, the circumference increases
exponentially. For example, a hyperbolic tree can represent more data as child nodes
have as much space as parent nodes. This tends to give a fish-eye view to nodes
within the disc, with greater emphasis placed on central nodes, while outer nodes
are exponentially distant, this effect is captured in an image by Claudio Rocchini
[29], see Fig. 1.
Hyperbolic geometric graphs have been used to explore structural properties and
processes in complex networks, such as clustering [10] and navigability. Krioukov
et al. have proposed that these models have potential for modelling the Internet
graph, which they suggest has an underlying hyperbolic geometry [22]. Recent
work by Papadopoulos et al. has developed a tool to map real-world complex
networks, such as the Autonomous Systems Internet, to a hyperbolic space using
the hyperbolic geometric model [26]. Recently, researchers have developed faster
algorithms for creating large hyperbolic networks. Von Looz et al. have developed a
speedy generator to create representative subsets of hyperbolic graphs, allowing for
networks with billions of edges, while retaining key features of hyperbolic graphs
[33], while Bringmann et al. have developed a generator to create a generalised
hyperbolic model in linear time, by avoiding the use of hyperbolic cosines [8].

3 Conceptual Framework for Bootstrap Percolation with


Recovery

The concept of bootstrap percolation with recovery allows for an activated node to
become deactivated. This might model a change of mind situation, in trend adoption
8 C. Marshall et al.

or behavioural diffusion, where individuals are initially interested in an activity


popular with their contacts, but immediately change their mind and no longer
endorse the activity, at least in the short term. In standard bootstrap percolation,
a lot of work focuses on optimising the size and selection of the initial active seed
set to ensure percolation. Our own focus is on a fixed-size randomly chosen seed
set, motivated by the notion of a small-scale random seeding in a network, with
a view to targeting recovery to inhibit the spread of the activity. Our proposed
method of recovery involves targeting active nodes with specific node properties,
such as high degree centrality; our aim is to identify those properties that might
have the most impact on delaying the percolation process. We have chosen node
degree, in the first instance, as node degree is a standard measure of influence in
networks, [1, 5], and the hyperbolic geometric graphs display highly skewed degree
centralisation, reflecting the high level of variance in node degree within each graph.
We examine the effect that this has on the percolation of the activity, in comparison
with the standard bootstrap percolation process, in which recovery is not permitted.
As a control, we also simulate bootstrap percolation with random selection of active
nodes for recovery.

3.1 Proposed Method of Bootstrap Percolation with Recovery

Our proposed method involves setting up the experiments as for the standard
bootstrap process but introducing a recovery mechanism to follow the activation
mechanism in each time step.
1. Set Parameters
– Set size of active seed set Ao
– Set integer value activation threshold r
– Set recovery rate percentage RR%
2. Define Mechanisms
– Activation mechanism
• For all inactive nodes, activate if the number of active neighbours is at least
r.
– Recovery mechanism
• For all active nodes, select RR% to deactivate, by random or deterministic
selection
3. Define Process
– For each graph, attach an agent to each node in the graph
– Set all agents inactive
– Select active seed set
– Apply activation mechanism followed by recovery mechanism at each time
step until equilibrium
Targeting Influential Nodes for Recovery in Bootstrap Percolation on. . . 9

4 Experimental Set-up

Our simulations involve creating a set of hyperbolic random geometric graphs and
observing the effect of graph topology on the spread of activity by embedding the
bootstrap percolation process on these graphs.
A set of hyperbolic geometric graphs, each with 1000 nodes, was created with
varying edge density from disconnected to fully connected, with 20 graphs created
at each value of the distance parameter. Bootstrap percolation was embedded on
this set of graphs and then our modified versions of bootstrap percolation with
recovery, both random and targeted, were simulated on the same set of graphs to
compare outcomes. The recovery experiments were repeated with varying recovery
rate percentages (RR%) of 10–90%, in steps of 10%.

4.1 Graph Creation

In this research, hyperbolic random geometric graphs are created using the model
outlined by Krioukov et al. [22]. This approach situates the graph in a disc of radius
R within the Poincaré disc model of hyperbolic space. Nodes in the graph have polar
coordinates (r, θ ) and are positioned by generating these coordinates as random
variates; edges are created between pairs of nodes where the hyperbolic distance
between them is less than the radius R of the disc of interest, varying R from 0.1 to
12 in increments of 0.1, to create graphs with an increasing number of edges, from
disconnected to fully connected, with 20 graphs created at each distance parameter
R. The relationship with R and edge density is not linear, see Fig. 2; therefore, R is
used in all our results for ease of comparison.

Distance Parameter R mapped to Density


1

0.8

0.6
Density

0.4

0.2

0
0 2 4 6 8 10 12
Distance parameter R

Fig. 2 Density parameter R mapped to edge density


10 C. Marshall et al.

Fig. 3 Hyperbolic random geometric graph with 1000 nodes and an edge density of 0.037

Figure 3 shows a hyperbolic random geometric graph with 1000 nodes, distance
parameter R = 4, and an edge density of 0.037. This shows a tree-like structure, with
the major hub nodes in the centre and the majority of nodes, that is, those of lower
degree, are situated towards the boundary.

4.2 Simulation of Bootstrap Percolation on the Set of


Hyperbolic Graphs

A population of agents, engaged in the process of bootstrap percolation, is attached


to each graph to observe any spread in activity. An agent is assigned to each node,
with all agents initially inactive and, from these, 20 nodes are randomly selected to
form the initial active seed set, Ao . The activation mechanism involves an activation
threshold r whereby an inactive node becomes active if it has at least r active
neighbours. At each time step, this activation mechanism is applied synchronously
to all nodes in the graph and is repeated until no further state change is possible and
a stable equilibrium is reached; the number of active nodes is recorded at this point.
Our experiments are then repeated varying the activation threshold r from 2 to 10.

4.3 Bootstrap Percolation with Recovery

The bootstrap percolation process is modified to allow for reverse state change from
active to inactive, as outlined in Sect. 3.1 by introducing a recovery phase in each
time step, immediately following activation. Our first set of experiments involves
Targeting Influential Nodes for Recovery in Bootstrap Percolation on. . . 11

random selection of active nodes for recovery. The experiments are then repeated
targeting active nodes of high degree for recovery. In both sets of simulations, the
experiments are repeated with varying Recovery Rate percentages (RR%) from 10
to 90% in steps of 10% for each activation threshold in the range 2–10, as before.
As for the standard process, a population of agents, engaged in the process of
bootstrap percolation, is attached to each graph to observe any spread in activity. An
agent is assigned to each node, with all agents initially inactive and, from these, 20
nodes are randomly selected to form the initial active seed set, Ao . The activation
mechanism involves an activation threshold r whereby an inactive node becomes
active if it has at least r active neighbours. The recovery mechanism involves
selecting a percentage of the active nodes for deactivation.
At each time step, the activation mechanism is applied synchronously to all nodes
in the graph followed by the recovery mechanism applied to all active nodes. This
cycle of activation and recovery at each time step is repeated until equilibrium; the
number of active nodes is recorded at this point. Our experiments are then repeated
at this Recovery Rate varying the activation threshold r from 2 to 10. We then repeat
the experiments incrementing the recovery rate by 10% for each set, from 10 to 90%.
Our test simulations showed that equilibrium was reached rapidly for all graphs,
and we therefore set an arbitrary cut-off point at 100 time steps, to ensure we
captured equilibrium. In the case of graphs which had complete percolation in
the standard bootstrap process, equilibrium cycled between full activation and
deactivation of the recovery rate fraction.

Random Recovery

After each activation time step, the recovery rate percentage of active nodes is
randomly selected to become inactive.

Targeted Recovery Based on Node Degree Ranking

After each activation time step, active nodes were ranked according to node degree,
from highest to lowest, the recovery rate percentage of the top-ranked active nodes
is then selected to become inactive.

5 Results

In this section we describe our results from simulating the processes of bootstrap
percolation and bootstrap percolation with random recovery and then targeted
recovery, all on the same set of hyperbolic random geometric graphs, using averaged
data from the 20 graphs created at each parameter R.
12 C. Marshall et al.

Bootstrap Percolation: Number of active nodes at equilibrium


12 1000
900

Number of active nodes at equilibrium


10
800
Distance parameter R

700
8
600
6 500
400
4
300
200
2
100
0
2 3 4 5 6 7 8 9 10
Activation threshold

Fig. 4 Heat Map for Ao = 20, with AT from 2 to 10, and R = 0.1 to 12, showing the number of
final active nodes at equilibrium

5.1 Bootstrap Percolation

The heatmap in Fig. 4 shows a distinct threshold above which the activity completely
percolated and below which the activity failed to percolate. This percolation
threshold lay between values of R from 1.5 to 5.8, representing edge densities of
0.005–0.1, respectively, demonstrating increasing edge density with each increase
in activation value. Each increase in activation threshold has an inhibiting effect on
the spread of activity, as it requires more of a node’s contacts to be active before
the activity will spread. Increasing the edge density allows an inactive node greater
potential access to activated contacts, and this is manifested by the increase in the
percolation threshold.
In all of our simulations, the heat maps displayed a clear percolation threshold.
For comparison purposes, we developed a simple algorithm to find a representative
point within the threshold, to allow for representation of the percolation threshold
as a curve.

5.2 Bootstrap Percolation with Recovery

In the following charts, the standard Bootstrap Percolation process is depicted as


a case of 0% recovery. Each fence plot has a differing recovery rate percentage
(RR%) and represents the percolation threshold for that particular recovery rate.
Targeting Influential Nodes for Recovery in Bootstrap Percolation on. . . 13

Fig. 5 Bootstrap percolation with RANDOM recovery

It can be clearly seen that either form of recovery significantly raises the threshold
for complete percolation of the activity and that, in general, each increase in RR%
increases this stepwise.

Bootstrap Percolation with Random Recovery

Figure 5 shows bootstrap percolation with recovery randomly targeting a percentage


of active nodes following activation in each time step. It can be seen that, overall,
the edge density of the threshold increases as the recovery rate increases from 10 to
90%, with the form of each curve similar to its neighbour. However, after the initial
increase at 10%, increasing the recovery rate had little further effect before 50%
recovery.
We can also gain information by looking at the edges of each curve. Percolation
threshold values arising from an activation threshold of 2 are depicted on the left
wall of the fence plot, whereas percolation threshold values arising from increasing
the activation threshold to 10 are depicted on the right edge; the increase in
activation threshold means that more active contacts are required for an inactive
node to become activated. For an activation threshold of 2, these values for R
at the percolation threshold increased from 2 to 4.7, representing edge densities
ranging from 0.01 to 0.055. For an activation threshold of 10, the values for R at
the percolation threshold increased from 5.2 to 8, representing edge densities from
0.075 to 0.3.

Bootstrap Percolation with Targeted Recovery

Figure 6 shows bootstrap percolation with recovery selectively targeting the top-
ranked active nodes of the highest degree after each activation period. This clearly
shows that the corresponding edge density of thresholds has increased significantly
for all parameters, when compared with randomly selected recovery. Unlike random
recovery, where the increase in recovery percentage had no effect between 10
14 C. Marshall et al.

Fig. 6 Bootstrap percolation with TARGETED recovery

and 50%, the percolation threshold curves for targeted recovery showed a regular
stepwise increase as the recovery percentage increased, for all percentages.
The most notable observation, with each percentage increase in recovery rate,
was that for an activation threshold of 2, the percolation threshold occurred at values
of R between 2 and 4, representing edge densities in the range 0.01–0.037, and
that for an activation threshold of 10, the values for R at the percolation threshold
increased from 5.2 to 7.6, representing edge densities of 0.075–0.25, with the form
of each curve again similar to its neighbour. This demonstrates that selectively
targeting the nodes of the highest degree for recovery has had a significantly
greater impact on the spread of the activity compared with random selection, and
particularly when compared with the standard bootstrap process.

6 Conclusion

We have studied bootstrap percolation on hyperbolic geometric graphs and noticed


that there was a clear threshold above which the activity percolated to all nodes
and below which the activity failed to percolate. We developed a modified form of
bootstrap percolation which allowed for recovery of active nodes to inactive state.
In our experiments with recovery on the same set of graphs, we noticed that with
random recovery, after the initial increase at 10%, there was little further impact
until we increased the recovery rate to 50%, at which the percolation threshold was
significantly increased to graphs with higher edge density, with all curves of similar
form to their neighbours.
Our experiments with targeted recovery, based on the top-ranked nodes of the
highest degree, showed that this delayed the threshold even further, and had a
stepwise increase in the threshold for all recovery rate increases. This suggests
that nodes of high degree are influential in the bootstrap percolation process.
Following these results, we surmise that it may be possible to fine-tune our targeting,
by choosing differing node properties, or by targeting nodes highly ranked for a
combination of properties.
Targeting Influential Nodes for Recovery in Bootstrap Percolation on. . . 15

Our next intention is to target active nodes for recovery based on properties which
are significantly skewed in the hyperbolic graphs, such as clustering coefficient,
betweenness centralisation and closeness centralisation. Our intuition is that finding
measures which are highly skewed in the hyperbolic graphs will allow us to
significantly delay percolation.
It would also be interesting to specifically target the most influential nodes in
the network, with a priori rankings, so that whenever such a node became active it
would always revert to inactivity. This would effectively immunise that node from
activity and would represent an influential player constantly resisting an onslaught
of activity.
Another potential area of study would concentrate on graphs within the percola-
tion threshold zone of transition, investigating graph or individual node properties
that facilitate the process above a certain edge density, or those which impede
percolation below a certain edge density. It might be useful to observe the impact
of targeted recovery within this percolation threshold in order to determine the local
structural properties of individual nodes, or neighbourhoods, which might have the
greatest impact on global outcomes in the graph.

References

1. Albert, R., Jeong, H., Barabási, A.L.: Error and attack tolerance of complex networks. Nature
406(6794), 378–382 (2000)
2. Amini, H., Fountoulakis, N.: Bootstrap percolation in power-law random graphs. J. Stat. Phys.
155(1), 72–92 (2014)
3. Balister, P., Bollobàs, B., Johnson, J.R., Walters, M.: Random majority percolation. Random
Struct. Algoritm. 36(3), 315–340 (2010)
4. Balogh, J., Pittel, B.G.: Bootstrap percolation on the random regular graph. Random Struct.
Algoritm. 30(12), 257–286 (2007)
5. Barabási, A.L.: Network Science. Cambridge University Press, Cambridge (2016)
6. Baxter, G.J., Dorogovtsev, S.N., Goltsev, A.V., Mendes, J.F.: Bootstrap percolation on complex
networks. Phys. Rev. E 82(1), 011103 (2010)
7. Bénézit, F., Dimakis, A.G., Thiran, P., Vetterli, M.: Order-optimal consensus through random-
ized path averaging. IEEE Trans. Inf. Theory 56(10), 5150–5167 (2010)
8. Bringmann, K., Keusch, R., Lengler, J.: Geometric inhomogeneous random graphs. Preprint
(2015). arXiv:1511.00576
9. Bullmore, E., Bassett, D.: Brain graphs: graphical models of the human brain connectome.
Annu. Rev. Clin. Psychol. 7, 113–140 (2011)
10. Candellero, E., Fountoulakis, N.: Clustering and the hyperbolic geometry of complex networks.
In: Bonato, A., Graham, F., Pralat, P. (eds.) Algorithms and Models for the Web Graph. WAW
2014. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial
Intelligence and Lecture Notes in Bioinformatics), vol. 8882, pp. 1–12. Springer, Cham (2014)
11. Candellero, E., Fountoulakis, N.: Bootstrap percolation and the geometry of complex networks.
Stoch. Process. Appl. 126, 234–264 (2015)
12. Centola, D.: The spread of behavior in an online social network experiment. Science
329(5996), 1194–1197 (2010)
13. Chalupa, J., Leath, P.L., Reich, G.R.: Bootstrap percolation on a bethe lattice. J. Phys. C Solid
State Phys. 12(1), L31 (1979)
16 C. Marshall et al.

14. Coker, T., Gunderson, K.: A sharp threshold for a modified bootstrap percolation with recovery.
J. Stat. Phys. 157(3), 531–570 (2014)
15. Domingos, P., Richardson, M.: Mining the network value of customers. In: Proceedings of the
Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,
KDD ’01, pp. 57–66. ACM, New York (2001)
16. Gleeson, J.P.: Cascades on correlated and modular random networks. Phys. Rev. E 77(4),
046117 (2008)
17. Gomez Rodriguez, M., Leskovec, J., Krause, A.: Inferring networks of diffusion and influence.
In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, pp. 1019–1028. ACM, New York (2010)
18. Jackson, M.O., López-Pintado, D.: Diffusion and contagion in networks with heterogeneous
agents and homophily. Netw. Sci. 1(01), 49–67 (2013)
19. Janson, S., Łuczak, T., Turova, T., Vallier, T.: Bootstrap percolation on the random graph gn,p .
Ann. Appl. Probab. 22(5), 1989–2047 (2012)
20. Kempe, D., Kleinberg, J.M., Tardos, É.: Influential nodes in a diffusion model for social
networks. In: ICALP, vol. 5, pp. 1127–1138. Springer, Berlin (2005)
21. Kempe, D., Kleinberg, J.M., Tardos, É.: Maximizing the spread of influence through a social
network. Theory Comput. 11(4), 105–147 (2015)
22. Krioukov, D., Papadopoulos, F., Kitsak, M., Vahdat, A., Boguñá, M.: Hyperbolic geometry of
complex networks. Phys. Rev. E 82, 036106 (2010)
23. Leskovec, J., Backstrom, L., Kleinberg, J.: Meme-tracking and the dynamics of the news cycle.
In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, pp. 497–506. ACM, New York (2009)
24. Liben-Nowell, D., Kleinberg, J.: Tracing information flow on a global scale using internet
chain-letter data. Proc. Natl. Acad. Sci. 105(12), 4633–4638 (2008)
25. Myers, S.A., Zhu, C., Leskovec, J.: Information diffusion and external influence in networks.
In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, pp. 33–41. ACM, New York (2012)
26. Papadopoulos, F., Psomas, C., Krioukov, D.: Network mapping by replaying hyperbolic
growth. IEEE/ACM Trans. Networking 23(1), 198–211 (2015)
27. Pastor-Satorras, R., Vespignani, A.: Epidemic spreading in scale-free networks. Phys. Rev.
Lett. 86(14), 3200 (2001)
28. Pržulj, N.: Biological network comparison using graphlet degree distribution. Bioinformatics
23(2), e177–e183 (2007)
29. Rocchini, C.: Order-3 heptakis heptagonal tiling. https://commons.wikimedia.org/wiki/File:
Order-3_heptakis_heptagonal_tiling.png (2007). Accessed 15 May 2017
30. Sahini, M., Sahimi, M.: Applications of Percolation Theory. CRC Press, Boca Raton (1994)
31. Shrestha, M., Moore, C.: Message-passing approach for threshold models of behavior in
networks. Phys. Rev. E 89(2), 022805 (2014)
32. Tassier, T.: Simple epidemics and SIS models. In: The Economics of Epidemiology, pp. 9–16.
Springer, Berlin (2013)
33. von Looz, M., Staudt, C.L., Meyerhenke, H., Prutkin, R.: Fast generation of dynamic complex
networks with underlying hyperbolic geometry. Preprint (2015). arXiv:1501.03545
34. Watts, D.J.: A simple model of global cascades on random networks. Proc. Natl. Acad. Sci.
99(9), 5766–5771 (2002)
Process-Driven Betweenness Centrality
Measures

Mareike Bockholt and Katharina A. Zweig

Abstract In network analysis, it is often desired to determine the most central


node of a network, for example, for identifying the most influential individual in
a social network. Borgatti states that almost all centrality measures assume that
there exists a process moving through the network from node to node (Borgatti,
Soc Netw 27(1):55–71, 2005). A node is then considered as central if it is important
with respect to the underlying process. One often used measure is the betweenness
centrality which is supposed to measure to which extent a node is “between”
all other nodes by counting on how many shortest paths a node lies. However,
most centrality indices make implicit assumptions about the underlying process.
However, data containing a network and trajectories that a process takes on this
network are available: this can be used for computing the centrality. Hence, in this
work, we use existing data sets, human paths through the Wikipedia network, human
solutions of a game in the game’s state space, and passengers’ travels between US
American airports, in order to (1) test the assumptions of the betweenness centrality
for these processes, and (2) derive several variants of a “process-driven betweenness
centrality” using information about the network process. The comparison of the
resulting node rankings yields that there are nodes which are stable with respect to
their ranking while others increase or decrease in importance dramatically.

Keywords Network analysis · Centrality measures · Network processes · Path


data analysis

M. Bockholt () · K. A. Zweig


Department of Computer Science, University of Kaiserslautern, Kaiserslautern, Germany
e-mail: mareike.bockholt@cs.uni-kl.de; zweig@cs.uni-kl.de

© Springer International Publishing AG, part of Springer Nature 2018 17


R. Alhajj et al. (eds.), Network Intelligence Meets User Centered Social Media
Networks, Lecture Notes in Social Networks,
https://doi.org/10.1007/978-3-319-90312-5_2
18 M. Bockholt and K. A. Zweig

1 Introduction

An often performed task in network analysis is the identification of the most


important nodes in a network. The goal might be to find the most influential
individual in a social network, the most vulnerable location in a transportation
network, or the leader in a terrorist network [6, 17]. The identification of such
nodes is usually done with a centrality measure which computes a value for each
node of the network based on the network structure [2, 10, 20]. The concept
of centrality in networks was first introduced by Bavelas in the late 1940s who
considered human communication networks [2]. Inspired by this idea, a large
number of different methods for measuring the centrality of a node were proposed
in the following decades, where the best known centrality measures are degree
centrality [9], closeness centrality [9], betweenness centrality [1, 9], and Eigenvector
centrality [3] (for an overview, cf. [5] or [14]).
An important contribution was made by Borgatti who states that almost all
centrality indices are based on the assumption that there is some kind of traffic
(or communication or process) flowing through the network by moving from node
to node [4]. This might be the propagation of information in a social network,
packages being routed through the WWW, or the spreading of a disease in human
interaction networks. A node is then considered as central, that is it is assigned a
high value of the measure, if it is somehow important with respect to this underlying
process. However, different centrality measures make different assumptions about
the process flowing through the network. Some measures are based on shortest
paths in the network, assuming that the underlying process moves on shortest paths.
Others assume that, if whatever flows through the network is at one node, it will
spread simultaneously to all the node’s neighbors, while others assume that it can
only be at one node at a time.
Borgatti provides a typology of the most popular centrality measures by con-
sidering the network process [4] and argues that centrality measure values for a
network can only be interpreted in a meaningful way, if the assumptions of the
measure respect the properties of the process. He identifies two different dimensions
by which the process can vary: First, on which type of trajectory does the process
move through the network, and second, how does the process spread from node
to node? For the first dimension, he differentiates between shortest paths, paths
(not necessarily shortest, but nodes and edges can only occur at most once in it),
trails (edges might occur several times in it, while nodes cannot be repeated), and
walks (in which nodes and edges might occur several times). The second identified
dimension is the “mechanism of node-to-node transmission”: A process might
transfer a good from node to node (e.g. a package), or it passes something to the
next node by duplicating it: whatever flows through the network is passed to the next
node and simultaneously stays at the current node (e.g. information flowing through
a network or a viral infection: it will still be at the current node after it was passed to
another node). Duplication can take place in a serial (one neighbour at once) or in a
Process-Driven Betweenness Centrality 19

Fig. 1 Introductory example. Node A can be seen as a gatekeeper between the two node groups.
Classic betweenness centrality hence assigns a high value to A. If, however, the network process
only moves within the two groups (indicated by boldly drawn edges), should A still be considered
as the most central node of the network?

parallel manner (simultaneously to all neighbours). This categorization is a helpful


tool for choosing an appropriate centrality measure, given a network and a process
on the network.
The betweenness centrality is an often used centrality measure in social network
analysis [1, 9]. The idea is to measure to which extent a node v is positioned
“between” all other nodes. A node with a high betweenness value is assumed to have
control over other nodes: by passing information or withholding it, it can control
the information flow. Consider for example the graph in Fig. 1 in which node A
can be seen as the gatekeeper between the left subgraph and the right subgraph:
A can prevent information being passed to the other subgraph because all paths
between the two subgraphs pass through A. Betweenness centrality counts how
many shortest paths from any node s to any other node t pass through the given
node v and averages over all node pairs (s, t). In order to account for cases in which
there are many shortest paths between s and t, but only a few of them pass through
v, the measure normalizes by the total number of shortest paths between s and t.
Formally, with σst denoting the number of shortest paths from a node s to a node t
(with σst = 1 if s = t) and with σst (v) denoting the number of shortest paths from
s to t that pass through v, we can define the betweenness centrality for a node v as1

  σst (v)
Bc (v) = (1)
σst
s∈V , t∈V ,
s=v s=t=v

According to Borgatti [4], this measure is appropriate for networks on which the
process of interest literally moves from node to node by a transfer mechanism and
travels on shortest paths. This becomes obvious in Fig. 1: if there were added two

1 In literature, it is often noted as C B . Since we are only considering this centrality measure and for
a better readability in Sect. 6, we use this notation.
20 M. Bockholt and K. A. Zweig

additional nodes D  and E  , and edges linking D to D  , D  to E  , and E  to E,


the high importance of A as a gatekeeper can only be still justified if the process
exclusively takes shortest paths and can only take one path at once.
When taking a closer look at the formula, it is easy to see that there are even more
simplifying assumptions in this measure: it assumed that the process flows between
any node pair (s, t) and the amount of traffic flowing from s to t is equal (and
equally important) for all node pairs. For a given network with a process fulfilling
the two main assumptions, these further assumptions are justifiable: If there is
no other information about the process available, these assumptions are the best
possible. However, if the information of how the process moves through the network
is available, the importance of the nodes can be measured differently. Consider again
Fig. 1 in which the bold directed edges indicate that there is communication between
those nodes and no communication between the other nodes. Hence, in both groups,
there are many entities moving from one of the upper nodes to (the) one node below,
using the nodes B and C, respectively, but no entity moving from one group to
the other. Although the assumptions of moving on shortest paths and by a transfer
mechanism are met, should the node A still be considered as the most central node
if there is actually no communication at all between those two groups?
However, there are data sets available containing a network structure and
trajectories that the network’s process has taken. It is therefore possible to actually
use the information of how the process moves through the network in order to
assign a centrality value to the nodes. It is clear that this is a different approach
than the classic betweenness centrality or other existing centrality measures: classic
centrality indices solely use the structure of the given graph in order to compute
centrality values for the nodes—the choice which centrality index is appropriate for
this network can then be taken based on the knowledge about the properties of the
process. Here, the information about the process is already used for computing the
centrality value.
Hence, the contribution of this work is the following: we use three data sets
in order to (1) investigate whether the assumptions of the betweenness centrality
are met by those processes, and (2) incorporate the information about the process
contained in the data sets into a process-driven betweenness centrality (PDBC).
Several variants of PDBC measures are introduced in order to analyse which piece
of information affects the node ranking in which manner. For example, one variant
only counts the shortest paths between those node pairs which are source and
destination of the process at least once. Another variant does not count the number
of shortest paths but counts the number of actually used trajectories in which a
node is contained. Those variants are then applied on the available data sets and the
resulting node rankings of each variant compared to each other. It can be observed
that the resulting rankings show a high correlation to each other, but there are nodes
whose importance increases or decreases considerably.
This article is therefore structured as follows: Sect. 2 introduces the necessary
definitions and notations before Sect. 3 presents related work in this area. Section 4
gives a description of the available data sets. Section 5 discusses the assumptions
of the betweenness centrality and tests whether those assumptions are met in the
Process-Driven Betweenness Centrality 21

data sets. Section 6 will introduce four variants of a process-driven betweenness


centrality and Sect. 7 discusses the results of the PDBC measures on the given data
sets, before Sect. 8 summarises the articles and gives an outlook for future work.

2 Definitions

G = (V , E) denotes a directed simple unweighted graph with vertex set V , and


edge set E ⊆ V × V . A path is an alternating (finite) sequence of nodes and
edges, P = (v1 , e1 , v2 , . . . , ek−1 , vk ) with vi ∈ V and ej = (vj , vj +1 ) ∈ E
for all i ∈ {1, . . . , k} and j ∈ {1, . . . , k − 1}, respectively.2 Since G is simple,
P is uniquely determined by its node sequence and the notation can be simplified
to P = (v1 , v2 , . . . , vk ). The length of a path P (|P |) is defined as its number
of (not necessarily distinct) edges. The start node of the path P is denoted by
s(P ) = v0 , the end node by t (P ) = vk (occasionally referred to as source/start
and destination/target). Let d(v, w) denote the length of the shortest path from node
v to node w. If w cannot be reached from v, we set d(v, w) := ∞.

3 Related Work

As illustrated in Sect. 1, the betweennness centrality contains several assumptions.


Since these assumptions are not met in all networks and all processes, there have
been proposed many variants of the betweenness centrality. The most prominent
variants questioning the assumption of shortest paths are the flow betweenness
centrality of Freeman [11], and the random walk betweenness centrality by New-
man [16], but there are also variants as routing betweenness [7] or variants for
dynamic networks where paths up to a certain factor longer than the shortest one
contribute to the centrality [6]. A variant questioning the assumption that all paths
contribute equally to centrality value is the length-scaled betweenness [5].
Path Analysis The idea of analysing the process moving through a network is not
new. In many real-world networks, entities use the network to navigate through it
by moving from one node to the other. This includes Internet users surfing the Web
yielding clickstream data, or passengers travelling through a transportation network.
The navigation of an entity in a network to a target node, but with only local
information about the network structure is called decentralized search. The fact
that humans are often able to find surprisingly short paths through a network was
already illustrated by Milgram in his famous small-world experiment in 1967 [15].
An answer of how humans actually find these short paths was not known before

2 Notethat we do not require the nodes and edges to be pairwise distinct. In some literature, P
would be referred to as a walk.
22 M. Bockholt and K. A. Zweig

Kleinberg investigated which effect the network structure has on the performance of
any decentralized search algorithm [13]. West and Leskovec analysed the navigation
of information seekers in the Wikipedia networks [23] and also find that human
paths are surprisingly short, and show similar characteristics in their structure. This
is also found by Iyengar et al. [21] who considered human navigation in a word
game.
Combination of Centrality and Path Analysis There are approaches of using the
information about the network process in order to infer knowledge about the
network itself: West et al. [24] use the paths of information seekers in the Wikipedia
network to compute a semantic similarity of the contained articles. Rosvall et al. [19]
show how real-word pathway data incorporated into a second-order Markov model
has an effect on the node rankings in an approach generalising PageRank, and can
derive communities of the network by the network’s usage patterns. Zheng [26]
identify popular places based on GPS sequences of travellers. Also based on GPS
trajectories, the approach of Yuan et al. [25] makes use of Taxi drivers’ trajectories
in order to compute the effectively quickest path between two places in a city. Dorn
et al. [8] can show that the results of betweenness centrality in the air transportation
network significantly change if the centrality considers the number of actually taken
paths traversing through an airport instead of all possible shortest paths.

4 Data Sets

The goal is to investigate whether the assumptions built into the betweenness
centrality are actually met by a data set containing the information how a process
moves in a network. Since we consider the betweenness centrality, it is, according
to Borgatti [4], only meaningful to consider processes which (1) move by a transfer
mechanism, and (2) move through the network with a target, that is a predetermined
node to reach. The first condition implies that the movement of the entity can be
modelled as a path. Data sets appropriate to use in this work, hence, need to fulfil
the following requirements:
1. It contains a network structure, given as graph G = (V , E)
2. It contains a set of paths of one or several entities moving through the network,
given as P = {P1 , . . . , P }
3. The entities move through the network with a predetermined goal to reach
4. That they aim to reach as soon as possible
We use the following data sets (see also Table 1).
Wikispeedia This data set provided by West et al. [23, 24] contains (a subset of)
the network of Wikipedia articles where a node represents an article and there exists
an edge from one node to another if there exists a link in the one article leading to
the other article. The paths represent human navigation paths, collected by the game
Wikispeedia in which a player navigates from one (given) article to another (given)
Process-Driven Betweenness Centrality 23

Table 1 Overview of the used data sets


Data set Source Graph type Nodes Edges
Wikispeedia [23, 24] Directed Articles Hyperlinks
Rush Hour [12] Undirected Configurations Valid game moves
DB1B [18] Directed Cities Non-stop airline connections
Path length
Data set |V | |E| |P | Range Average
Wikispeedia 4592 119,804 51,306 [1, 404] 5
Rush Hour 364 1524 3044 [3, 33] 5
DB1B 419 12,015 63,681,979 [1, 14] 1.3

article by following the links in the articles. Only paths reaching their determined
target are considered. We model the Wikispeedia network as a directed, simple
unweighted graph (GW ) and the set of paths as PW .
Rush Hour This data set contains a state space of a single-player board game
called Rush Hour where each node represents a possible game configuration and
there is an edge from node v to node w if configuration w can be reached from
v by a valid move in the game. We only include those nodes which are reachable
from the node representing the start configuration of the game. Since all moves
are reversible, the network is modelled as undirected unweighted graph (denoted
by GR ) where one node represents the start configuration and one or more nodes
represent configurations in which the game is solved (final nodes). A path in this
data set is then the solution of a player trying to reach the final node from the start
node by a sequence of valid moves. Only paths ending in a final node are considered.
The set of paths will be denoted by PR . The data set was collected by Pelánek and
Jarušek [12] by their web-based tool for education (available under tutor.fi.muni.cz).
DB1B This data set contains a sample of 10% of all airline tickets of all reporting
airlines, including all intermediate stops of a passenger’s travel, provided by the US
Bureau of Transportation Statistics which publishes for each quarter of a year the
Airline Origin and Destination Survey (DB1B) [18]. We consider the passengers’
travels for the years 2010 and 2011. A path is a journey of a passenger travelling
from one airport to another, possibly with one or more intermediate stops. The
network (GD ) is modelled as a simple directed and unweighted graph and is
extracted from the ticket data by identifying city areas (possibly including more
than one airport) as nodes, and adding a directed edge from a node v to a node w if
at least one passenger’s journey contains a flight from one airport in the area of node
v to another airport in the area of node w. Passengers’ journeys which are symmetric
in the sense that the passenger travels from airport A to airport B over i intermediate
airports and via the same intermediate airports back to A will be considered as two
paths: one from A to B and one from B to A.
Another random document with
no related content on Scribd:
Close equivalents of the culture-areas of American ethnologists
are in fact recognized by European archæologists. Thus, Déchelette
distinguishes seven “geographical provinces” in the Bronze Age of
Europe, as follows: 1, Ægean (Greece, islands, coast of Asia Minor);
2, Italian (with Sicily and Sardinia); 3, Iberian (Spain, Portugal,
Balearics); 4, Western (France, Great Britain, Ireland, Belgium,
southern Germany, Switzerland, Bohemia); 5, Danubian (Hungary,
Moravia, the Balkan countries); 6, Scandinavian (including northern
Germany and the Baltic coast); 7, Uralic (Russia and western
Siberia).
The fourteen hundred years generally allowed the Scandinavian
Bronze Age are divisible into five or six periods,[35] which become
progressively shorter.

2500-2100, Neolithic, with copper, and 2100-1900, with occasional


bronze, have already been mentioned.
1900-1600 B.C. Burial in stone cysts. Little decoration of bronze, and
that only in straight lines. Flat ax heads or celts. Triangular daggers.
Daggers mounted on staves like ax heads.
1600-1400 B.C. Occasional cremation of corpses, the ashes put into
very small cysts. Decoration of bronze in engraved spirals. Flanged and
stop-ridged axes. Swords. Straight fibulas.
1400-1050 B.C. Cremation general. Axes of socketed type. Bowed
fibulas. Bronze vessels with lids.
1050-850 B.C. Spiral ornament decaying. Fibulas with two large bosses.
Ship-shaped razors.
850-650 B.C. Ornamentation plastic, rather than engraved, often
produced in the casting. Rows of concentric circles and other patterns
replace spirals. Fibulas with two large disks. Knives with voluted antennæ-
like handles. Sporadic occurrence of iron.
650-500 B.C. Iron increasing in use; decorative bronze deteriorating.

236. Problems of Chronology


The dating of events in the Neolithic and Metal Ages is of much
more importance than in the Palæolithic. Whether an invention was
made in Babylonia in 5000 or in 3000 B.C. means the difference
between its occurring in the hazy past of a formative culture or in a
well advanced and directly documented phase of that culture. If the
dolmens and other megalithic monuments of northern Europe were
erected about 3000 B.C., they are older than the pyramids of Egypt
and contemporaneous with the first slight unfoldings of civilization in
Crete and Troy. But if their date is 1000 B.C., they were set up when
pure alphabetic writing and iron and horses were in use in western
Asia, when Egypt was already senile, and the Cretan and Trojan
cultures half forgotten. In the one case, the megaliths represent a
local achievement, perhaps independent of the stone architecture of
Egypt; in the other event, they are likely to be a belated and crudely
barbarian imitation of this architecture.
But in the Palæolithic, year dates scarcely matter. Whether the
Mousterian phase culminated 25,000 or 75,000 years ago is
irrelevant: it was far before the beginning of historic time in either
case. If one sets the earlier date, the Chellean and Magdalenian are
also stretched farther off; if the later, it is because one shrinks his
estimate of the whole Palæolithic, the sequence of whose periods
remains fixed. It is really only the relative chronology that counts
within the Old Stone Age. The durations are so great, and so wholly
prehistoric, that the only value of figures is the vividness of their
concrete impression on the mind, and the emphasis that they place
on the length of human antiquity as compared with the brevity of
recorded history. Palæolithic datings might almost be said to be
useful in proportion as they are not taken seriously.
At the same time, the chronology of the Palæolithic is aided by
several lines of geological evidence that are practically absent for the
Neolithic and Bronze Ages: thickness of strata, height of river
deposits and moraines, depth of erosion, species of wild animals.
The Neolithic is too brief to show notable traces of geological
processes. Its age must therefore be determined by subtler means:
slight changes in temperature and precipitation; the thickness of
refuse deposits; and above all, the linking of its latter phases to the
earliest datable events in documentary or inscriptional history. By
these aids, comparison has gradually built up a chronology which is
accepted as approximate by most authorities. This chronology puts
the beginning of bronze in the Baltic region at about 1900 B.C., in
Spain at 2500, the first Swiss lake-dwellings at 4000, the
domestication of cattle and grain in middle Europe around 5500, the
first pottery-bearing shellmounds about 8000 B.C. Of course, these
figures must not be taken as accurate. Estimates vary somewhat.
Yet the dates cited probably represent the opinion of the majority of
specialists without serious deviation: except on one point, and that
an important one.
This point is the hinge from the end of prehistory to the beginning
of history: the date of the first dynasties of Egypt and Babylonian
Sumer. On this matter, there was for many years a wide discrepancy
among Orientalists. The present tendency is to set 3400 or 3315
B.C., with an error of not over a century, as the time when upper and
lower Egypt began to be ruled by a single king, Mena, the founder of
the “first dynasty”; 2750 as the date of Sargon of Akkad, the first
consolidator and empire-builder in the Babylonian region; and about
3100 as the period of the earliest discovered datable remains from
the Sumerian city states.
The longer reckoning puts the Egyptian first dynasty back to about
4000; according to some, even earlier; and Sargon to 3750. This last
date rests on the discovery by Nabonidus, the last king of Babylon,
successor of Nebuchadnezzar in the sixth century B.C., of a deeply
buried inscription in a foundation wall erected by Naram-sin, son of
Sargon. Nabonidus had antiquarian tastes, and set his
archæologists and historians to compute how long before him
Naram-sin had lived. Their answer was 3,200 years—the thirty-
eighth century B.C., we should say; and this figure Nabonidus had
put into an inscription which has come down to us. All this looks
direct and sure enough. But did the king’s scholars really know when
they told him that just 3,200 years had elapsed since his
predecessor’s reign, or were they guessing? The number is a round
one: eighty forties of years, such as the Old Testament is fond of
reckoning with. The trend of modern opinion, based on a variety of
considerations, is that the Babylonian historians were deceiving
either the king or themselves.
The bearing of the discrepancy is this. The scholars who are in
more or less agreement that bronze reached Mediterranean Europe
about 2500 and Northern Europe about 1900, were generally
building on the longer chronology of the Near East. They were
putting Mena around 4000 and Sargon in 3750. This allowed an
interval of over a thousand years in which bronze working could
have been carried to Spain, and two thousand to Denmark. With the
shorter chronology for Egypt and Babylonia, the time available for
the transmission has to be cut down by a thousand years. Should
the European dates, therefore, be made correspondingly more
recent? Or is the diminished interval still sufficient—that is, might a
few centuries have sufficed to carry the bronze arts from the Orient
to Sicily and Spain, and a thousand years to bear them to
Scandinavia? The interval seems reasonable, and is accordingly the
one here accepted. But it is possible that if the shorter chronology
becomes proved beyond contradiction for Egypt and Babylonia, the
dates for the Neolithic and Bronze Ages of Europe may also have to
be abbreviated somewhat.
One famous time-reckoning has in fact been made by the Dane
Sophus Müller, who, without basing very much on any Oriental
chronology, shortens all the prehistoric periods of Europe. His
scheme of approximate dating is reproduced, with some
simplifications, in Figure 42. It will be seen that he sets the second or
dolmen period of the Full Neolithic in Scandinavia from about 1800
to 1400 B.C., whereas the more usual reckoning puts it at 3500 to
2500; his earliest Kitchenmidden date is 4000, as against 8000.
Fig. 42. The development of prehistoric civilization in Europe. Simplified from
Sophus Müller. His absolute dates are generally considered too low, but their
relative intervals are almost undisputed. The diagram shows very clearly the
persistent cultural precedence of the countries nearest the Orient, and the
lagging of western and especially of northern Europe.

237. Principles of the Prehistoric Spread of


Culture
This chronology has much to commend it besides its almost daring
conservatism; especially the clarity of its consistent recognition of
certain cultural processes. Five principles and three extensions are
set up by Müller:

1. The south [of Europe, with the Near East] was the vanguard and
dispensing source of culture; the peripheral regions, especially in the north
[of Europe] followed and received.
2. The elements of southern culture were transmitted to the north only in
reduction and extract.
3. They were also subject to modifications.
4. These elements of southern culture sometimes appeared in the
remoter areas with great vigor and new qualities of their own.
5. But such remote appearances are later in time than the occurrence of
the same elements in the south.
6. Forms of artifacts or ornaments may survive for a long time with but
little modification, especially if transmitted to new territory.
7. Separate elements characteristic of successive periods in a culture
center may occur contemporaneously in the marginal areas, their diffusion
having occurred at different rates of speed.
8. Marginal cultures thus present a curious mixture of traits whose
original age is great and of others that are much newer; the latter, in fact,
occasionally reach the peripheries earlier than old traits.

The basic idea of these formulations is that of the gradual radiation


of culture from creative focal centers to backward marginal areas,
without the original dependence of the peripheries wholly precluding
their subsequent independent development. It is obvious that this
point of view is substantially identical with that which has been held
to in the presentation of native American culture in the preceding
chapter.
It is only fair to say that a number of eminent archæologists
combat the prevalent opinion that the sources of European Neolithic
and Bronze Age civilization are to be derived almost wholly from the
Orient. They speak of this view as an “Oriental mirage.” They see
more specific differences than identities between the several local
cultures of the two regions, and tend to explain the similarities as
due to independent invention.
Since knowledge of ancient cultures is necessarily never
complete, there is a wide range of facts to which either explanation
is, theoretically, applicable. But the focal-marginal diffusion
interpretation has the following considerations in its favor.
Within the fully historic period, there have been numerous
undoubted diffusions, of which the alphabet, the week, and the true
arch may be taken as illustrations. At least in the earlier portion of
the historic period, the flow of such diffusions was regularly out of the
Orient; which raises a considerable presumption that the flow was in
the same direction as early as the Neolithic. On the other hand,
indubitably independent parallelisms are very difficult to establish
within historic areas and periods, and therefore likely to have been
equally rare during prehistory.
Then, too, the diffusion interpretation explains a large part of
civilization to a certain degree in terms of a large, consistent
scheme. To the contrary, the parallelistic opinion leaves the facts
both unexplained and unrelated. If the Etruscans devised the true
arch and liver divination independently of the Babylonians, there are
two sets of phenomena awaiting interpretation instead of one. To say
that they are both “natural” events is equivalent to calling them
accidental, that is, unexplainable. To fall back on instinctive impulses
of the human mind will not do, else all or most nations should have
made these inventions.
Of course it is important to remember that no sane interpretation of
culture explains everything. We do not know what caused the true
arch to be invented in Babylonia, hieroglyphic writing in Egypt, the
alphabet in Phœnicia, at a certain time rather than at another or
rather than in another place. The diffusion point of view simply
accepts certain intensive focal developments of culture as
empirically given by the facts, and then relates as many other facts
as possible to these. Every clear-minded historian, anthropologist,
and sociologist admits that we are still in ignorance on the problems
of what caused the great bursts of higher organization and original
productiveness of early Egypt and Sumer, of Crete, of ancient North
China, of the Mayas, of Periclean Athens. We know many of the
events of civilization, know them in their place and order. We can
infer from these something of the processes of imitation,
conservatism, rationalization that have shaped them. We know as
yet as good as nothing of the first or productive causes of civilization.
It is extremely important that this limitation of our understanding be
frankly realized. It is only awareness of darkness that brings seeking
for light. Scientific problems must be felt before they can be
grappled. But within the bounds of our actual knowledge, the
principle of culture derivation and transmission seems to integrate,
and thus in a measure to explain, a far greater body of facts than any
other principle—provided it is not stretched into an instrument of
magic and forced to explain everything.
CHAPTER XV
THE GROWTH OF CIVILIZATION: OLD WORLD
HISTORY AND ETHNOLOGY

238. The early focal area.—239. Egypt and Sumer and their background.
—240. Predynastic Egypt.—241. Culture growth in dynastic Egypt.—
242. The Sumerian development.—243. The Sumerian hinterland.—
244. Entry of Semites and Indo-Europeans.—245. Iranian peoples
and cultures.—246. The composite culture of the Near East.—247.
Phœnicians, Aramæans, Hebrews.—248. Other contributing
nationalities.—249. Ægean civilization.—250. Europe.—251. China.—
252. Growth and spread of Chinese civilization.—253. The Lolos.—
254. Korea.—255. Japan.—256. Central and northern Asia.—257.
India.—258. Indian caste and religion.—259. Relations between India
and the outer world.—260. Indo-China.—261. Oceania.—262. The
East Indies.—263. Melanesia and Polynesia.—264. Australia.—265.
Tasmania.—266. Africa.—267. Egyptian radiations.—268. The
influence of other cultures.—269. The Bushmen.—270. The West
African culture-area and its meaning.—271. Civilization, race, and the
future.

238. The Early Focal Area


The prehistoric archæology of Europe and the Near East, outlined
in the last chapter, besides arriving at a tolerable chronology, reveals
a set of processes of which the outstanding one is the principle of
the origin of culture at focal centers and its diffusion to marginal
tracts. Obviously this principle should apply in the field of history as
well as prehistory, and should be even more easily traceable there.
In the Western Hemisphere it is plain that the great hearth of
cultural nourishment and production has been Middle America—the
tracts at the two ends of the intercontinental bridge, the Isthmus of
Panama. That a similarly preëminent focal area existed in the
Eastern Hemisphere has been implied over and over again in the
pages that immediately precede this one, in the references to the
priority of Egypt and Babylonia—the countries of the Nile and of the
Two Rivers. These two lands lie at no great distance from each
other: they are closer than Mexico and Peru. Like these two, they are
also connected by a strip of mostly favorable territory—the “Fertile
Crescent” of Palestine, Syria, and northern Mesopotamia. Curiously,
the two countries also lie in two continents connected by a land
bridge: the Isthmus of Suez is a parallel to that of Panama.
Both in Egypt and in Babylonia we find a little before 3000 B.C. a
system of partly phonetic writing, which, though cumbersome by
modern standards, was adequate to record whatever was spoken.
Copper was abundant and bronze in use for weapons and tools.
Pottery was being turned on wheels. Economic life was at bottom
agricultural. The same food plants were grown: barley and wheat;
similar beer brewed from them. The same animals were raised:
cattle, swine, sheep, goats; with the ass for transport. Architecture
was in sun-dried brick. Considerable walled cities had arisen. Their
rulers struggled or attained supremacy over one another as avowed
kings with millions of subjects. A regulated calendar existed by which
events were dated. There were taxes, governors, courts of law,
police protection, and social order. A series of great gods, with
particular names and attributes, were worshiped in temples.

239. Egypt and Sumer and Their Background


It is scarcely conceivable that these two parallel growths of
civilization, easily the most advanced that had until then appeared on
earth, should have sprung up independently within a thousand miles
of each other. Had Sumerian culture blossomed far away, say on the
shores of the Pacific instead of the lower Euphrates, its essential
separateness, like that of Middle American civilization, might be
probable. But not only is the stretch of land between Babylonia and
Egypt relatively short: it is, except in the Suez district, productive and
pleasant, and was settled fairly densely by relatively advanced
nations soon after the historic period opens or even before. The
same is true of the adjoining regions. Canaan, Syria, Mesopotamia,
Troy, Crete, Elam, southwestern Turkistan, had all passed beyond
barbarism and into the period of city life during the fourth and third
millenia B.C. This cannot be a series of coincidences. Evidently
western Asia, together with the nearest European islands and the
adjacent fertile corner of Africa, formed a complex but connected
unit, a larger hearth in which culture was glowing at a number of
points. It merely happened that a little upstream from the mouths of
the Nile and of the Euphrates the development flamed up faster
during the fifth and fourth millenia. The causes can only be
conjectured. Perhaps when agriculture came to be systematically
instead of casually conducted, these annually overflowed bottom
lands proved unusually favorable; their population grew,
necessitating fixed government and social order, which in turn
enabled a still more rapid growth of numbers, the fuller exploitation
of resources, and division of labor. This looks plausible enough. But
too much weight should not be attached to explanations of this sort:
they remain chiefly hypothetical. That culture had however by 3000
B.C. attained a greater richness and organization in Egypt and in the
Babylonian region than elsewhere, are facts, and can hardly be
anything but causally connected facts. These two civilizations had
evidently arisen out of a common Near Eastern high level of
Neolithic culture, much as the peaks of Mexico and Peru arose
above the plateau of Middle American culture in which they were
grounded.
Of course this means that Egypt and Sumer did not stand in
parental-filial relation. They were rather collateral kin—brothers, or
better, perhaps, the two most eminent of a group of cousins.
Attempts to derive Egyptian hieroglyphic from Babylonian Cuneiform
writing, and vice versa, have been rejected as unproved by the
majority of unbiased scholars. But it is likely that at least the idea of
making legible records, of using pictorial signs for sounds of speech,
was carried from one people to the other, which thereupon worked
out its own symbols and meanings. Just so, while the Phœnician
alphabet has never yet been led back to either Egyptian, Cuneiform,
Cretan, or Hittite writing with enough evidence to satisfy more than a
minority fraction of the world of scholarship, it seems incredible that
this new form of writing should have originated uninfluenced by any
of the several systems which had been in current use in the near
neighborhood, in part in Phœnicia itself, for from one to two or three
thousand years. Such a view denies neither the essentially new
element in Phœnician script nor its cultural importance. It does not
consider the origin of the alphabet explained away by a reference to
another and earlier system of writing. It does bring the alphabet into
some sort of causal relation with the other systems, without merging
it in them. It is along lines like this that the relation of early Egypt and
Babylonia to each other and to the other cultures of the ancient Near
East must be conceived.

240. Predynastic Egypt


Egyptian civilization was already in full blown flower at the time of
the consolidation of Lower and Upper Egypt under the first dynasty
in the thirty-fourth century. Its developmental stages must have
reached much farther back. Hieroglyphic writing, for instance, had
taken on substantially the forms and degree of efficiency which it
maintained for the next three thousand years. An elaborate,
conventional system of this sort must have required centuries for its
formative stages. A non-lunar 365-day calendar was in use. This
was easily the most accurate and effective calendar developed in the
ancient world, and furnished the basis of our own. It erred by the few
hours’ difference between the solar and the assumed year. This
difference the Egyptians did not correct but recorded, with the result
that when the initial day had slowly swung around the cycle of the
seasons, they reckoned a “Sothic year” of 1,461 years. One of these
was completed in 2781; which gives 4241 B.C. as the date of the
fixing of the calendar. This is considered the earliest exactly known
date in human history. Of course, a calendar of such fineness cannot
be established without long continued observations, whose duration
will be the greater for lack of astronomical instruments. Centuries
must have elapsed while this calendar was being worked out. Nor
would oral tradition be a sufficient vehicle for carrying the
observations. Permanent records must have been transmitted from
generation to generation; and these presuppose stability of society,
enduring buildings, towns, and a class with leisure to devote to
astronomical computations. It is safe therefore to set 4500 B.C. as
the time when Egypt had emerged from a tribal or rural peasant
condition into one that can be called “civilized” in the original
meaning of that word: a period of city states, or at least districts
organized under recognized rulers. From 4500 on, then, is the time
of the Predynastic Local Kingdoms.
Beyond this time there must lie another: the Predynastic Tribal
period, before towns or calendars or writing or metal, when pottery
was being made, stone ground, boats built, plants and animals being
domesticated—the typical pure Neolithic Age, in short. Yet with all its
prehistoric wealth, Egypt has not yet produced any true Neolithic
remains. It is hardly likely that the country was uninhabited for
thousands of years; much more probably have Neolithic remains
been obliterated. This inference is strengthened by the paucity and
dubiousness of Upper Palæolithic artifacts in Egypt. Lower
Palæolithic flint implements are abundant, just as are remains from
the whole of the period of metals. What has happened to the missing
deposits and burials of the Upper Palæolithic and Neolithic that fell
between?
Apparently they have been buried on the floor of the Nile valley
under alluvial deposits. The Eolithic and Lower Palæolithic
implements are found on the plateau through which the valley
stretches; also cemented into conglomerate formed of gravel and
stone washed down from this plateau and cut into terraces by the
Nile when it still flowed from 30 to 100 feet higher than at present;
and on the terraces. Just when these terraces were formed, it is
difficult to say in terms of European Pleistocene periods; but not later
than the third glacial epoch, it would seem, and perhaps as early as
the second. As the Chellean of Europe is put after the third glaciation
in the chronology followed in this book, the antiquity of the first flints
used, and perhaps deliberately shaped, in Egypt, is carried back to
an extremely high antiquity by the specimens imbedded in the
terrace cliffs.
About the time of the last glaciation—the Mousterian or end of the
Lower Palæolithic in Europe—the Nile ceased cutting down through
the gravels that bordered it and began to build up its bed and its
valley with a deposit of mud as it does to-day. From excavations to
the base of dated monuments it is known that during the last 4,000
years this alluvium has been laid down at the rate of a foot in 300
years in northern Egypt. As it there attains a depth of over a hundred
feet, the process of deposition is indicated as having begun 30,000
or more years ago. Of course no computation of this sort is entirely
reliable because various factors can enter to change the rate; but the
probability is that, other things equal, the deposition would have
been slower at first than of late, and the time of the aggrading
correspondingly longer. In any event geologists agree that their
Recent—the last 10,000 years or so—is insufficient, and that the
deposition of the floor of the Nile valley must have begun during
what in Europe was part of the Würm glaciation.
This is the period of transition from Lower to Upper Palæolithic (§
69, 70, 213) in Europe. The disappearance of Upper Palæolithic
remains in Egypt is most plausibly explained by the fact that the
Upper Palæolithic, just as later the Neolithic, was indeed
represented in Egypt, very likely flourishingly, but that in the mild
climate its artifacts were lost or interred on or near the surface of the
valley itself and have therefore long since been covered over so
deep that only future lucky accidents, like well-soundings, may now
and then bring a specimen to light.
For the Neolithic, there actually are such discoveries: bits of
pottery brought up from borings, 60 feet deep in the vicinity of one of
the monuments referred to, 75 and 90 feet deep at other points in
lower Egypt. The smallest of these figures computes to a lapse of
18,000 years—nearly twice as long as the estimated age of the
earliest pottery in Europe. It is always necessary not to lay too much
reliance on durations calculated solely from thickness of strata,
whether these are geological or culture-bearing. But in this case
general probability confirms, at least in the rough. In 4000 B.C.,
when Egypt was beginning to use copper, western Europe was still
in its first phase of stone polishing; in 1500 B.C., when Egypt was
becoming acquainted with iron, Europe was scarcely yet at the
height of its bronze industry. If the fisher folk camped on their oyster
shells on the Baltic shores were able to make pottery by 8000 B.C.,
there is nothing staggering in the suggestion that the Egyptians knew
the art in 16,000 B.C. They have had writing more than twice as long
as the North Europeans.
The dates themselves, then, need not be taken too literally. They
are calculated from slender even though impressive evidence and
subject to revision by perhaps thousands of years. But they do
suggest strongly the distinct precedence of northern Africa, and by
implication of western Asia, over Europe in the Neolithic, as
precedence is clear in the Bronze and early Iron Ages and indicated
for the Lower Palæolithic.
If, accordingly, the beginning of the Early Neolithic—the age of
pottery, bow, dog—be set for Egypt somewhere around 16,000 B.C.,
about coeval with the beginning of the Magdalenian in middle
western Europe (§ 215), the Full Neolithic, the time of first
domestication of animals and plants and polishing of stone, could be
estimated at around 10,000 B.C., when Europe was still lingering in
its epi-Palæolithic phase (§ 216). One can cut away several
thousand years and retain the essential situation unimpaired: 8000,
or 7000 B.C., still leaves Egypt in the van; helps to explain the
appearance of eastern grains and animals in Europe around 6000-
5000 B.C.; and, what is most to the point, allows a sufficient interval
for agriculture and the allied phases of civilization to have reached
the degree of development which they display when the Predynastic
Local Kingdoms drift into our vision around 4500 B.C. A long Full
Neolithic, then, is both demanded by the situation in Egypt and
indicated by such facts as there are; a long period in which millet,
barley, split-wheat, wheat, flax, cattle, sheep, and asses were
gradually modified and made more useful by breeding under
domestication. This Full Neolithic, or its last portion, was the
Predynastic Tribal Age of Egypt; which, when it passed into the
Predynastic Local Kingdoms phase about 4500 B.C., had brought
these plants and animals substantially to their modern forms, and
had increased and coördinated the population of the land to a point
that the devising of calendar, metallurgy, writing, and kingship soon
followed.

241. Culture Growth in Dynastic Egypt


The story of the growth of Egyptian civilization during the Dynastic
or historic period is a fascinating one. There were three phases of
prosperity and splendor. The first was the Old Kingdom, 3000-2500,
culminating in the fourth dynasty of the Great Pyramid builders, in
the twenty-ninth century. The second was the Middle or Feudal
Kingdom, with its climax in the twelfth dynasty, 2000-1788. After the
invasion of the Asiatic Hyksos in the seventeenth century, the New
Empire of the eighteenth and nineteenth dynasties arose, whose
greatest extension fell in the reign of Thutmose III, 1501-1447,
although the country retained some powers of offensive for a couple
of centuries longer. There followed a slow nationalistic decline, a
transient seventh century conquest by Assyria, a brief and fictitious
renascence supported by foreign mercenaries, and the Persian
conquest of 525 B.C., since which time Egypt has never been an
independent power under native rulers. A waning of cultural energy,
at least relatively to other peoples, had set in before the military
decline. By 1000 B.C. certainly, by 1500 perhaps, Egypt was
receiving more elements of civilization than she was imparting. She
still loomed wealthy and refined in contrast with younger nations; but
these were producing more that was new.
Copper smelted from the ores of the Sinai peninsula had
apparently come into use by 4000, but remained scarce for a long
time. The first dynasty had some low grade bronze; by 2500 bronze
containing a tenth of tin was in use. Iron was introduced about 1500
or soon after, but for centuries remained a sparing Asiatic import. In
fact, the conservatism that was settling over the old age of the
civilization caused it to cling with unusual tenacity to bronze. As late
as Ptolemaic and Roman times, when the graves of foreigners
abounded in iron, those of Egyptians were still prevailingly bronze
furnished.
Quite different must have been the social psychology three
thousand years earlier. The first masonry was laid in place of adobe
brick to line tomb walls in the thirty-first century; and within a century
and a half the grandest stone architecture in human history had been
attained in the Pyramid of Cheops. This was a burst of cultural
energy such as has been equalled only in the rise of Greek art or
modern science.
Glass making seems to have been discovered in Egypt in the early
dynasties and to have spread from there to Syria and the Euphrates.
The earliest glass was colored faïence, at most translucent, and
devoted chiefly to jewelry, or to surfacing brick. Later it was blown
into vessels, usually small bottles, and only gradually attained to
clearness. From western Asia the art was carried to Europe, and, in
Christian times, to China, which at first paid gem prices for glass
beads, but later was perhaps stimulated by knowledge of the new art
into devising porcelain—a pottery vitrified through.
The horse first reached Egypt with the Hyksos. With it came the
war chariot. Wheeled vehicles seem to have been lacking previously.
The alphabet, the arch, the zodiac, coinage, heavy metal armor, and
many other important inventions gained no foot-hold in Egypt until
after the country had definitely passed under foreign domination. The
superior intensity of early Egyptian civilization had evidently fostered
a spirit of cultural self-sufficiency, analogous to that of China or
Byzantine Greece, which produced a resistance to innovation from
without. At the same time, inward development continued, as
attested by numerous advances in religion, literature, dress, the arts,
and science when the Old Kingdom and New Empire are compared.
In the eleventh and twelfth dynasties, for instance, the monarchy
was feudal; in the eighteenth lived the famous monotheistic
iconoclastic ruler Ikhnaton.
The civilization of ancient Egypt was wholly produced and carried
by a Hamitic-speaking people. This people has sometimes been
thought to have come from Asia, but its Hamitic relatives hold Africa
from Somaliland to Morocco even to-day, and there is no cogent
reason to look for its ancestors outside that continent.
242. The Sumerian Development
The story of Babylonia is less completely known than that of Egypt
because as a rockless land it was forced to depend on sun-dried
brick, because the climate is less arid, and also because of its
position. Egypt was isolated by deserts, Babylonia open to many
neighbors, and so was invaded, fought over, and never unified as
long as Egypt. The civilization was therefore never so nationally
specific, so concentrated; and its records, though abundant, are
patchy. Babylonia, the region of the lower Euphrates and Tigris, was
the leading culture center of Asia in the third and second millenia
before Christ; and while mostly a Semitic land in this period, its
civilization was the product of another people, the Sumerians,
established near the mouth of the two rivers in the fourth millenium.
It was the Sumerians who in this thousand years worked out the
Cuneiform or wedge-shaped system of writing—mixedly phonetic
and ideographic like the Egyptian, but of wholly different values, so
far as can be told to-day, and executed in straight strokes instead of
the pictorial forms of Hieroglyphic or the cursive ones of its derivative
Hieratic. It was the Semites who, coming in from the Arabian region,
took over this writing, together with the culture that accompanied it.
Their dependence is shown by the fact that they used characters
with Sumerian phonetic values as well as by their retention of
Sumerian as a sacred language, for which, as time went on, they
were compelled to compile dictionaries, to the easement of modern
archæologists.
Of the local origins of this Sumerian city-state civilization with its
irrigation and intensive agriculture, nothing is known. It kept
substantially abreast of Egypt, or at most a few centuries in arrears.
In certain features, as the use of metals, it seems to have been in
advance. Egypt is a long-drawn oasis stretched through a great
desert. Babylonia lies adjacent to desert and highland, steppe and
mountains. Within a range of a few hundred miles, its environment is
far more varied. Not only copper but tin is said to have been
available among neighboring peoples. So, too, Babylonia is likely to
have had the early domestic animals—except perhaps cattle—earlier
than Egypt, because it lay nearer to what seem to have been their
native habitats. The result is that whereas Egyptian culture makes
the superficial impression of having been largely evolved on the spot
by the Egyptians, Sumerian culture already promises to resolve,
when we shall know it better, into a blend to which a series of
peoples contributed measurably. The rôle of the Sumerians, like that
of their Semitic successors, was perhaps primarily that of organizers.

243. The Sumerian Hinterland


There are some evidences of these cultures previous to the
Sumerian one or coeval with it. In Elam, the foot-hill country east of
Sumer, a mound at Susa contains over 100 feet of culture-bearing
deposits; in southern Turkistan, at Anau, 300 miles east of the
Caspian sea, one is more than fifty feet deep. The date of the first
occupancy of these sites has been set, largely on the basis of the
rate of accumulation of the deposits—an unsafe criterion—at 18,000
and 8000 B.C. respectively. These dates, particularly the former, are
surely too high. But a remote antiquity is indicated. Both sites show
adobe brick houses and hand-made, painted pottery at the very
bottom. Susa contains copper implements in the lowest stratum;
Anau, three-fourths way to the base—in the same level as remains
of sheep and camels. Still lower levels at Anau yielded remains of
tamed cattle, pigs, and goats, while wheat and barley appear at the
very bottom, before domesticated animals were kept. Whatever the
date of the introduction of copper in these regions, it was very likely
anterior to the thirty-first century, when bronze was already used by
the Sumerians. These excavations therefore shed light on the Full
Neolithic and Eneolithic or Transition phases of west Asiatic culture
which must have preceded the Sumerian civilization known to us.
The languages of these early west Asiatic peoples have not been
classified. Sumerian was non-Indo-European, non-Semitic, non-
Hamitic. Some have thought to detect Turkish, that is Ural-Altaic,
resemblances in it. But others find similarities to modern African
languages. This divergence of opinion probably means that
Sumerian cannot yet be safely linked with any other linguistic group.
The same applies to Elamite, which is known from inscriptions of a
later date than the early strata at Susa. What the Neolithic and
Bronze Age people of Anau spoke there is no means of knowing.
The region at present is Turkish, but this is of course no evidence
that it was so thousands of years ago. The speech of the region
might conceivably have been Indo-European, for it lies at the foot of
the Persian plateau, which by the Iron Age was occupied by the
Iranian branch of the Indo-Europeans, who are generally thought to
have entered it from the north or northwest. But again there is not a
shred of positive evidence for an Indo-European population at this
wholly prehistoric period. The one thing that is clear is that the early
civilization of the general west Asiatic area was not developed by the
Semites who were its chief carriers later. This is a situation parallel to
that obtaining in Europe and Asia Minor, whose civilization in the full
historic period has been almost wholly in the hands of Indo-
Europeans, whereas at the dawn of history the nations who were
culturally in the lead, the Hittites, Trojans, Cretans, Lydians,
Etruscans, and Iberians, were all non-Indo-European.

244. Entry of Semites and Indo-Europeans


The Semitic invasions seem to have proceeded from the great
motherland of that stock, Arabia, whose deserts and half-deserts
have been at all times like a multiplying hive. Some of the
movements were outright conquests, others half-forceful
penetrations, still others infiltrations. Several great waves can be
distinguished. About 3000 there was a drift which brought the
Akkadians, Sargon’s people, into Babylonia, perhaps the Assyrians
into their home up the Tigris, the Canaanites and Phœnicians into
the Syrian region. About 2200 the Amorites flowed north: into
Babylonia, where Babylon now sprang up and the famous lawgiver
Hammurabi ruled; into Mesopotamia proper; and into Syria. Around
1400, the Aramæans gradually occupied the Syrian district, and the
Hebrews began to dispossess the Canaanites. Around 700 still
another wave brought the Chaldæans into Babylonia, to erect a
great Semitic kingdom once more—that of Nebuchadnezzar. Then,

You might also like