17 views

Uploaded by Diego Astorga

- Isi Articu
- Scimakelatex.26951.John.doe.Jane.doe
- DUET SamplePaperzsda
- DSF
- API_I_C7
- ParkSahni_maxLTRout
- schultes_diss.pdf
- Labroratory 1
- Can Dan 2014
- UT Dallas Syllabus for opre6385.001 06s taught by Chelliah Sriskandarajah (chelliah)
- Nln Live Icmc Smc 2014 Proceedings
- More Golay Sequences
- MIT1_204S10_lec10
- Michail Zak and Colin P. Williams- Quantum Neural Nets
- Ehrgott00approximation.ps
- Exploring Congestion Control Using Client-Server Technology
- final e-portfolio assignment
- CCNAX Time Table
- 2012-5-14--11-3-55-860_CSIS-Dec2011-I-Dep
- The International Conference on Computing, Networking and Digital Technologies (ICCNDT 2012)

You are on page 1of 11

University of Ljubljana

vladimir.batagelj,andrej.mrvar @uni-lj.si

May 28, 1997 / January 3, 1999

Abstract

Large networks, having thousands of vertices and lines, can be found in many different

areas, e. g: genealogies, flow graphs of programs, molecule, computer networks, trans-

portation networks, social networks, intra/inter organisational networks ... Many standard

network algorithms are very time and space consuming and therefore unsuitable for analy-

sis of such networks. In the article we present some approaches to analysis and visualisation

of large networks implemented in program Pajek. Some typical examples are also given.

1 Introduction

Pajek (Slovene word for Spider) is a program, for Windows (32 bit), for

analysis of large networks. It is freely available, for noncommercial use,

at its homepage:

http://vlado.fmf.uni-lj.si/pub/networks/pajek/

Large networks can be found in many different areas. Usually they are produced auto-

matically, using computers, from different data sources that are already available in computer

readable form. For example:

large genealogies (genealogies having some people [21]), Theoretical Computer

Science Genealogy (

persons [30]);

networks derived from dictionaries and other texts (character mutation/insertion/ deletion

network on
English words [24]);

transportation networks (American airlines with airports [31]);

large molecule (molecule having thousands of atoms, e. g. DNA [28]);

communication networks: links among pages or servers on Internet, usage of

Usenet [29], phone calls [20];

1

Figure 1: Goals.

bibliographies, citation networks [9, 7], Erdös graph (network with

co-

authors [18]), . . .

Such networks cannot be treated efficiently using standard network analysis tools which are

mostly based on matrix representation and are therefore limited to networks of moderate size

(some tens or hundreds of vertices at most).

The main goals in the design of Pajek are (see Figure 1):

to support abstraction by (recursive) factorization of a large network into several smaller

networks that can be treated further using more sophisticated methods;

to provide the user with some powerful visualisation tools;

to implement a selection of efficient algorithms for analysis of large networks.

One of the approaches to support abstraction is: find clusters (components, neighbourhoods

of ’central’ vertices, cores, . . . ) in a network, extract and show vertices that belong to the same

clusters separately, possibly with the parts of the context (detailed local view), shrink vertices

in clusters and show relations among clusters (global view).

2

Figure 2: USA presidents genealogy.

3

George Herbert Walker Bush & Barbara Pierce

Samuel Howard Fay & Susan Shellman Franklin Delano Roosevelt & Eleanor (Anna) Roosevelt

Samuel Prescott PhillipsFay & Harriet Howard James Roosevelt & Sara Delano

Samuel Howard & Anna Lillie Warren Delano & Catherine Robbins Lyman

John Lillie & Abigail Breck Joseph Lyman & Anne Jean Robbins

Theophilus Lillie & Hannah Ruck Edward Hutchinson Robbins & Elizabeth Murray

John Ruck & Hannah Hutchinson Nathaniel Robbins & Elizabeth Hutchinson

Elisha Hutchinson & Hannah Hawkins Edward Hutchinson & Lydia Foster

Figure 3: Shortest path between Franklin Delano Roosvelt and George Herbert Walker Bush in

USA presidents genealogy.

Shuffle 0,00 s 0,015 s 0,17 s 2,22 s

Quick Sor 0,00 s 0,00 s 0,40 s 5,14 s

Heap Sort 0,00 s 0,06 s 0,98 s 14,35 s

Insertion Sort "!# 0,07 s 7,50 s 12,5 min 20,83 hours

XY "$# 0,10 s 1,67 min 1,16 d 3,17 years

Time % ( ) and space & ( ) complexities of an algorithm are estimates of the time and memory

space needed to run it on instances of size (in our case – number of vertices or lines).

In most large networks the number of lines ' is of the same order as the number of vertices

– or at most () (such networks are considered sparse networks). In the following

we assume that we have to analyse large but sparse networks.

According to capabilities of nowadays computers, space complexity for storing sparse net-

works is not crucial any more. The problem can be solved using appropriate data structures for

internal representation of data (doubly linked lists representation of networks is used in Pajek).

Having much faster computers does not help a lot in the case of high order time complexities.

In the theory problems solvable with algorithms of polynomial complexity are considered easy.

4

Pajek - shadow [0.00,1.00] Sep- 5-1998

World Trade (Snyder and Kick, 1979) - cores

uki

net

bel

lux

fra

ita

den

jap

usa

can

bra

arg

ire

swi

spa

por

wge

ege

pol

aus

hun

cze

yug

gre

bul

rum

usr

fin

swe

nor

irn

tur

irq

egy

leb

cha

ind

pak

aut

cub

mex

uru

nig

ken

saf

mor

sud

syr

isr

sau

kuw

sri

tha

mla

gua

hon

els

nic

cos

pan

col

ven

ecu

per

chi

tai

kor

vnr

phi

ins

nze

mli

sen

nir

ivo

upv

gha

cam

gab

maa

alg

hai

dom

jam

tri

bol

par

mat

alb

cyp

ice

dah

nau

gui

lib

sie

tog

car

chd

con

zai

uga

bur

rwa

som

eth

tun

liy

jor

yem

afg

mon

kod

brm

nep

kmr

lao

vnd

maa

dom

mon

mex

cam

som

yem

wge

swe

kuw

rum

mor

brm

den

ege

hun

gua

hon

pan

gha

gab

mat

dah

nau

uga

nep

kmr

usa

can

spa

aus

yug

egy

cha

pak

cub

ken

sud

sau

mla

ven

ecu

nze

sen

upv

jam

chd

con

rwa

kod

vnd

cze

cos

cyp

bra

arg

swi

por

gre

nor

uru

per

par

bur

net

usr

aut

tha

kor

vnr

tog

car

eth

tun

afg

bel

jap

pol

bul

leb

ind

nig

saf

syr

phi

alg

hai

bol

alb

gui

lao

uki

lux

els

nic

col

chi

ins

mli

ivo

ice

sie

zai

fra

tur

ire

irn

irq

nir

jor

ita

fin

isr

sri

tai

lib

liy

tri

But, in the case of large , in practice already algorithms of time complexity of order !

can be too slow (for the interactive use), what can be seen in the Table 1.

Therefore, most of the algorithms implemented in Pajek have subquadratic time complex-

ities: () , () * , (),+ , or are restricted to small sets of selected vertices.

5

3 Data Structures

Currently Pajek uses six data structures to implement the algorithms:

network – main object (vertices and lines);

permutation – reordering of vertices;

vector – values of vertices;

cluster – subset of vertices (e. g. one class from partition);

partition – tells for each vertex to which cluster the vertex belongs;

hierarchy – hierarchically ordered clusters and vertices.

The power of Pajek is based on several transformations which support different transitions

among these data structures.

Besides its own input formats, Pajek supports several other formats: UCINET DL; GED [19],

genealogies can be read either as Ore-graph or p-graph [21, 16, 17]; and some molecular for-

mats: BS (Ball and Stick), MAC (Mac Molecule) and MOL (MDL MOLfile).

4 Algorithms

Using the above data structures the basic set of efficient algorithms was implemented [1, 24, 13,

14], for example:

partitions: degree, depth, core, - -cliques, centers;

binary operations: union, intersection, difference;

components: strong, weak, biconnected, symmetric [1];

decompositions: symmetric-acyclic;

paths: shortest path(s), all paths between two vertices [11];

flows: maximum flow between two vertices [14];

citation weights: Paths Count Method and SPLC Method [2];

neighbourhood: . -neighbours;

CPM (Critical Path Method);

extracting subnetwork;

shrinking clusters in network (generalized blockmodeling) [4];

reordering: topological ordering, Richards’s numbering, depth/breadth first search;

6

Lovasz Sos

Babai Simonovits

Frankl

Graham

Faudree

Burr

Schelp

Spencer

Alon

Chung

Szemeredi

Trotter

Lehel

Pach West

Furedi Gyarfas

Hajnal Gould

Jacobson

Tuza

Rodl

Bollobas

Nesetril

Harary

7

reduction: hierarchy, subdivision, degree;

simplifications and transformations: deleting loops, multiple lines, transforming arcs to

edges . . .

We can define an often used sequence of elementary operations as a macro and run it as

a single command. Using systems of macros we can adapt Pajek to special groups of users

(analysis of genealogies, chemical applications, . . . ).

Some special algorithms for solving problems from different areas of network analysis were

also included in Pajek: e. g. algorithms for checking whether a program is written struc-

turally [15]; simulation of Petri nets [12]; searching for given fragments/patterns in molecule or

genealogies.

Special emphasis was given to automatic generation of network layouts. Several standard

algorithms for automatic graph drawing were implemented: spring embedders based on minimi-

sation of the total energy of the system (Kamada-Kawai [10] and Fruchterman-Reingold [6]),

layouts determined by eigenvectors (Lanczos algorithm [3, 5]), drawing in layers (genealogies

and other acyclic structures).

These algorithms were modified and extended to enable additional options: drawing with

constraints (optimisation of the selected part of the network, fixing some vertices to predefined

positions, using values of lines as similarities or dissimilarities), drawing in space. Pajek also

provides tools for manual graph editing.

Pajek supports several output graphic formats which can be examined by special 2D and

3D viewers (Encapsulated PostScript – GSView [25]; VRML – CosmoPlayer [23]; MDL-

MOL – Rasmol [28], Chime [22]; Kinemages – Mage [26]).

5 Examples

In Figure 2 the largest connected p-graph component of the Genealogy of the USA presi-

dents [32] is presented. The shortest kinship path between Franklin Delano Roosvelt and

George Herbert Walker Bush, determined using macro Path, is displayed in Figure 3.

Figure 4 represents a reordered matrix of Snyder and Kick’s world trade relation (iterative

core decomposition with additional analysis of internal structure of cores).

The cliques decomposition (obtained by MODEL2 [27]) of the main core of Erdös graph [18]

is displayed in Figure 5.

The last picture (Figure 6) presents a snapshot of the 3D display by Mage [26] of the Prison

network (from the UCINET dataset).

6 Future plans

Pajek is in a constant development. The latest version is available from its homepage. In the

near future we are planning to implement the following additional options: different clustering

and decomposition procedures, some statistics (triad counts), animation and presentation of the

sequence of networks, output formatting, control structures in macros, planarity testing and

layout. . .

8

Figure 6: Prison 3D graph.

9

References

[1] AHO, A. V., HOPCROFT, J. E., ULLMAN, J. D. (1976): The Design and Analysis of Computer

Algorithms. Addison-Wesley, Reading, MA.

[2] BATAGELJ, V. (1994): An Efficient Algorithm for Citation Networks Analysis. Presented at

EASST’94, Budapest, Hungary, August 28-31, 1994.

[3] DATTA, B. N. (1995): Numerical Linear Algebra and Applications. Brooks&Cole Publishing

Company, Pacific Grove.

[4] DOREIAN, P., BATAGELJ, V., FERLIGOJ, A. (1994): Partitioning Networks Based on General-

ized Concepts of Equivalence. Journal of Mathematical Sociology, 19, 1, 1-27.

[5] GOLUB, G. H., van LOAN, C. F. (1996): Matrix Computations. The John Hopkins University

Press, Baltimore.

ment. Software, Practice and Experience, 21, 1129-1164.

[7] HUMMON, N. P., DOREIAN, P. (1989): Connectivity in a Citation Network: The Development

of DNA Theory. Social Networks, 11, 39-63.

[8] HUMMON, N. P., DOREIAN, P. (1990): Computational Methods for Social Network Analysis.

Social Networks, 12, 273-288.

[9] HUMMON, N. P., DOREIAN, P., FREEMAN, L. C. (1990): Analyzing the Structure of the

Centrality-Productivity Literature Created Between 1948 and 1979. Knowledge: Creation, Dif-

fusion, Utilization, Vol. 11, June, No. 4, 459-480.

[10] KAMADA, T., KAWAI, S. (1989): An Algorithm for Drawing General Undirected Graphs. Infor-

mation Processing Letters, 31, 7-15.

[11] KNUTH, D. E. (1993): The Stanford GraphBase. Stanford University, ACM Press, New York.

[12] PETERSON, J. L., (1981): Petri Net Theory and Modeling of Systems. Prentice-Hall, Inc., Engle-

wood Cliffs, N.J.

[13] ROGERS, E. M., KINCAID, D. L. (1981): Communication Networks, Toward a New Paradigm

for Research. The Free Press, New York.

[14] TARJAN, R. E. (1983): Data Structures and Network Algorithms. Society for Industrial and Ap-

plied Mathematics Philadelphia, Pennsylvania.

[15] WATSON, A. H., MCCABE, T. J. (1996): Structured Testing: A Testing Methodology Using the

Cyclomatic Complexity Metric. Computer Systems Laboratory, National Institute of Standards and

Technology Special Publication 500-235, Gaithersburg, MD 20899-0001.

[16] WHITE, D. R., JORION, P. (1992): Representing and Analyzing Kinship: A New Approach.

Current Athropology 33, 454-462.

[17] WHITE, D. R., JORION, P. (1996): Kinship Networks and Discrete Structure Theory: Applica-

tions and Implications. Social Networks 18, 267-314.

10

[18] Erdös Number Project:

http://www.oakland.edu/˜grossman/erdoshp.html

http://www.gendex.com/gedcom55/55gcint.htm

http://portal.research.bell-labs.com/orgs/ssr/people/north/contest.html

[21] p-graphs:

http://eclectic.ss.uci.edu/˜drwhite/pgraph/p-graphs.html

http://www.mdli.com/download/chimedown.html

http://cosmosoftware.com/

ftp://labrea.stanford.edu/pub/dict/

ftp://ftp.cs.wisc.edu/pub/ghost/rjl/

http://www.prosci.org/Kinemage/MageSoftware.html

http://vlado.fmf.uni-lj.si/pub/networks/stran/default.htm

http://klaatu.oit.umass.edu/microbio/rasmol/getras.htm

http://www.sscnet.ucla.edu/soc/csoc/netscan/netscan.htm

http://sigact.acm.org/genealogy/

[31] Transportation Networks, National Transportation Atlas Database, Bureau of Transportation Statis-

tics:

http://www.bts.gov/gis/ntatlas/networks.html

ftp://www.dcs.hull.ac.uk/public/genealogy/

11

- Isi ArticuUploaded byAndres Acevedo Mejia
- Scimakelatex.26951.John.doe.Jane.doeUploaded bymdp anon
- DUET SamplePaperzsdaUploaded bySaad Bin Sadaqat
- DSFUploaded bygagan-randhava-8248
- API_I_C7Uploaded byvozoscribd
- ParkSahni_maxLTRoutUploaded byivan
- schultes_diss.pdfUploaded byDaNi Alexis
- Labroratory 1Uploaded byAshiqin Rashid
- Can Dan 2014Uploaded byamindur
- UT Dallas Syllabus for opre6385.001 06s taught by Chelliah Sriskandarajah (chelliah)Uploaded byUT Dallas Provost's Technology Group
- Nln Live Icmc Smc 2014 ProceedingsUploaded byThan Van Nispen
- More Golay SequencesUploaded bydeep_678
- MIT1_204S10_lec10Uploaded byChippyVijayan
- Michail Zak and Colin P. Williams- Quantum Neural NetsUploaded bydcsi3
- Ehrgott00approximation.psUploaded bybb_sportsman
- Exploring Congestion Control Using Client-Server TechnologyUploaded byEduardo Castrillón
- final e-portfolio assignmentUploaded byapi-302594281
- CCNAX Time TableUploaded byThọ Phạm
- 2012-5-14--11-3-55-860_CSIS-Dec2011-I-DepUploaded bykhalid anwar
- The International Conference on Computing, Networking and Digital Technologies (ICCNDT 2012)Uploaded bySDIWCjb
- trc CSSUploaded byreynald manzano
- TMS_PASSENGER TERMINALS_Operation&Maintenance Manual.13.06.2014.pdfUploaded bySergey Girshev
- Lisa Shipley Resume v2Uploaded byshipleyl
- fahaddxbnumpicUploaded byfahad siddiqui
- mba III semUploaded byDev Janghel
- It ManagerUploaded byapi-79124038
- solving np complete problems with quantum computingUploaded byapi-377494169
- Unit-VUploaded byAshi Krey
- Computational IntelligenceUploaded bySibarama Panigrahi
- Sistema de Comunicación AcústicaUploaded byLeticia Elizabeth Jimenez Flores

- Ecos de la Revolución PingüinaUploaded byCentro de La Aldeacultural
- Wenu Mapu La Astronomía MapucheUploaded byDiego Astorga
- La Cuestion MulticulturalUploaded byDiego Astorga
- Programa Seminario 1Uploaded byDiego Astorga
- escuela efectivasUploaded byCristian Daniel Villegas Fernandez
- 514Uploaded byDiego Astorga
- 02. Capítulo 2. A. Elementos para una sociología del curriculum...Juan Carlos TedescoUploaded byDiego Astorga
- Jaime PastorUploaded byDiego Astorga
- Dossier_WallersteinUploaded byDiego Astorga
- BUTLER Cuerpos Prefacio IntroUploaded bypuckdgc
- Perry Anderson- Historia y Lecciones Del NeoliberalismoUploaded byMelvin Gerardo López
- Norbert Elias - El Proceso de La Civilizacion[1]Uploaded byliguoria91732
- Rodó, José Enrique. Ariel. Motivos de ProteoUploaded byJosé Carlos Morales
- offe2Uploaded byPablo Almada

- 2011 10 Lyp Mathematics Sa1 13Uploaded byRohit Maheshwari
- Us 5373560Uploaded bycbetter
- Keller Trotter - Applied CombinatoricsUploaded byMasacroso
- Handout 9.docxUploaded bymano
- Rr210501 Discrete Structures and Graph TheoryUploaded bySRINIVASA RAO GANTA
- Matgraph - A Matlab Toolbox for Graph TheoryUploaded byTan Duy
- Depth First Search in PrologUploaded byAji Satrio
- Plugin Iran Team Selection Test 2011 90Uploaded byandrejkt
- ArticlesUploaded bySparsh Shrivastava
- ERROR DETECTION AND CORRECTION.docxUploaded byPawan Kumar
- 1-5-18 M Tech CSE Batch 2018.pdfUploaded bySangeeta
- ClawfreeGraphs_MJM_JMHUploaded byghanmakad
- D1,L9 Solving Linear Programming problems.pptUploaded bymokhtarppg
- Ieep201 - CopyUploaded byRyan Mills
- A Primal Dual Integer Programming AlgorithmUploaded byJuan Sebastián Poveda Gulfo
- Mathematical Attack on RSAUploaded bykgskgm
- CRT ExampleUploaded bySagar Jaikar
- CCandRCPUploaded byMarina Kamalova
- Lecture 01Uploaded byDeion Liew
- MAT301 Assignment4 SolUploaded byHongyi Zhao
- 103Bsp13_hw2_solUploaded byoscura
- BenesUploaded byGabriella Ella
- dismath.docxUploaded byCristina Joy Sanchez
- ps9Uploaded byEhsan Spencer
- InterlacingUploaded byAmirul Arifin
- Design and Analysis of Algorithms Important Two Mark Questions With Answers _ Studentsblog100Uploaded byRahul Sharma
- 10 Mathematics Impq Sa 1 1 Real Numbers 2Uploaded bySurya Sriram
- Anytime A-star Algorithm.docUploaded byDisha Sharma
- Algorithmic Graph TheoryUploaded byNatalia Coiutti
- Pigeonhole PrincipleUploaded bynafizz