
Department of Computer Science Institute for System Architecture, Chair for Computer Networks

Content and File Management

Introduction

[Figure: a File Server holding source code files, a Web Server holding published Web pages, and Wikis]

Questions raised by managing content and files:
• Access control?
• Concurrent access?
• Update conflicts?
• Error handling?
• Remote editing?
• Versioning?
• Recovery of old versions?
• Easy usage?
• Deep know-how needed?
• Time-dependent publication?

Outline

1. Revision Control Systems


– Software for managing and monitoring document and
software revisions (versions)
– Primarily concerned with recovery of former revisions,
storing, versioning and locking of project files or
directories
– CVS, SVN, Git
2. Web-based Content Management Systems (CMS)
– System used to manage the content of a Web site by
providing means for e.g. creation, modification, removal
of content and support for publishing, format
management, revision control, indexing, search and
retrieval
– TYPO3
3. Wikis
– Software that allows users to easily generate, edit,
interconnect pages and that can be used to create
collaborative Web sites
– MediaWiki, Semantic Wikis

Categories of Revision Control Systems

• Local Version Control (e.g. rcs)
  – The version-controlled data (File 1, File 2) and the local version database holding Version 1, Version 2 and Version 3 reside on the same computer
• Centralized Version Control Systems (e.g. Concurrent Versions System (CVS), Subversion (SVN))
  – A central version control server holds the version database (Version 1 to Version 3)
  – Computer A and Computer B check out the files into their version-controlled directories
• Decentralized Version Control Systems (e.g. Git)
  – Computer A and Computer B each hold a version-controlled directory together with a local version database (Version 1 to Version 3)
  – A server holds a version database as well; versions are exchanged between the databases

CVS (Concurrent Versions System)
Repository and Sandbox

[Figure: the initiator's directory /home/project holds the raw files src1.php, src2.php and src3.jpg. On the CVS server, the Repository /home/cvs/project1 contains the CVSROOT directory and module1 with the history files src1.php,v, src2.php,v and src3.jpg,v. Each CVS client holds a Sandbox /home/cvs/module1 with a CVS directory and checked-out revisions of src1.php, src2.php and src3.jpg.]

Repository:
• Central storage for all raw files that are under revision control (RC)
• Each raw file is converted to a history file (*,v) that contains the current revision and the recovery information for earlier revisions
• Provides a “CVSROOT” directory containing additional meta-information

Sandbox (a working directory containing specific checked-out revisions):
• A Sandbox is created by the “cvs checkout” command, which reconstructs specific revisions of the raw files from the history files
• Every directory in a CVS Sandbox has a “CVS” directory, which contains management files providing e.g. the current revision number or the check-out time

CVS – Selected mechanisms
Versioning

[Figure: the Repository /home/cvs/project1 (CVSROOT, module1) on the CVS server holds src1.php,v {1.1-1.9}, src2.php,v {1.1-1.13} and src3.jpg,v {1.1-1.7}. One client Sandbox /home/cvs/module1 contains src1.php {1.9}, src2.php {1.3} and src3.jpg {1.7}; another contains src1.php {1.3}, src2.php {1.13} and src3.jpg {1.5}.]

• A revision number is allocated for every modification of a source file that is registered in the Repository by the “cvs commit” command
• Because the history files keep the history information, all revisions remain available and accessible, so clients can load any of them from the Repository (via “cvs checkout” or “cvs export”; export has the same effect as checkout, but no management files are created)

CVS – Selected mechanisms
Merging

Example with src1.php:

1. The Repository holds revision -r1.7, which Developer 1 and Developer 2 both obtain via “cvs checkout”:

     if (i>10)
     {
     echo "hello world";
     }

2. Developer 1 edits the condition to “if (i>5)” and runs “cvs commit”, which creates -r1.8 in the Repository.
3. Developer 2 edits the output to “echo "hallo Welt";” in the checked-out -r1.7.
4. Developer 2 runs “cvs update”: the differences between -r1.7 and -r1.8 in src1.php are transmitted and merged with the locally checked-out revision.
5. Developer 2 runs “cvs commit”, which creates -r1.9:

     if (i>5)
     {
     echo "hallo Welt";
     }

• Merging is a mechanism for joining two different revisions of the same file
  – Note: if the same lines are edited, a conflict occurs that has to be resolved manually by the developers
• “cvs update” checks whether new revisions exist in the Repository and, if so, tries to merge them with the local revision
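The merge rule from the src1.php example can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm CVS itself uses: it assumes that edits only replace lines in place, so that the common base and both edited copies have the same number of lines. The function name merge_three_way and the variable names are made up for this sketch.

```python
def merge_three_way(base, mine, theirs):
    """Merge two edited copies of `base`, line by line.

    Simplifying assumption (not true for CVS in general): edits only
    replace lines, so all three versions have the same length.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:        # unchanged, or both sides made the same edit
            merged.append(m)
        elif m == b:      # only the repository side changed this line
            merged.append(t)
        elif t == b:      # only the local side changed this line
            merged.append(m)
        else:             # both sides changed it differently: conflict
            conflicts.append(number)
            merged.append(m)  # keep the local text, report the line number
    return merged, conflicts


r1_7 = ['if (i>10)', '{', 'echo "hello world";', '}']    # common base
r1_8 = ['if (i>5)', '{', 'echo "hello world";', '}']     # committed by Developer 1
local = ['if (i>10)', '{', 'echo "hallo Welt";', '}']    # edited by Developer 2

merged, conflicts = merge_three_way(r1_7, local, r1_8)
print(merged)
print(conflicts)
```

Because the two developers touched different lines, the merged result contains both edits and the conflict list stays empty. Had both changed the same line differently, its line number would be reported for manual resolution.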
CVS – Selected mechanisms

Tagging

[Figure: the tag release_1_0 marks one revision in each revision line: src1.php (1.1 to 1.4), src2.php (1.1 to 1.4) and src3.jpg (1.1 to 1.5)]

• Tagging creates a kind of symbolic link between various files and their revisions
• To do so, a tag (e.g. release_1_0) is placed on a particular revision of each file
• The tagged revisions can then be checked out together by referring to this meaningful tag
• Each tag (label) is unique within a history file

Locking
• CVS provides a way to set an explicit commit lock on a revision
• While the commit lock is active, other users are not able to commit a revision of the particular file

CVS – Selected mechanisms

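Branching

[Figure: interplay of Repository and Sandboxes when working with a branch]

1. “cvs checkout” brings -r1.2 into a Sandbox
2. “cvs tag -b branch1” labels a branch (-b) with the branch tag name “branch1” on -r1.2 in the Repository
3. “cvs update -r branch1” writes the label “branch1” onto the local revision -r1.2
4. EDIT, then “cvs commit” stores the edited revision as -r1.2.1.1 on branch1
5. Another Sandbox obtains -r1.3 via “cvs checkout”
6. “cvs update -j” (-j: joining) joins the branch changes into the checked-out -r1.3
7. “cvs commit” stores the joined revision as -r1.4
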
CVS – Central shortcomings

• CVS does not support directory versioning
• “cvs commit” is not atomic
  – “cvs commit” is run for every file separately
  – When there is a conflict regarding one file, the other files are still committed while the conflicting file is not, which leaves the Repository in an inconsistent state
• No self-defined properties (meta information)
• Management of tags and branches is difficult and resource intensive
• No efficient storing of binary files

[Figure: comparison of two history files.
History file of a raw text file (incremental): -r1.9 is stored in full as the current revision (if (i>5) / { / echo "hallo Welt"; / }); for -r1.8 only the recovery information is stored (line 3: echo "hello world";).
History file of a raw binary file (full data): -r1.5 and -r1.4 each store the complete src3.jpg.]

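The incremental storage of text files can be illustrated with a small Python sketch: the newest revision is kept in full, and an older revision is kept only as the lines that differ from its successor. This is a simplified model under the assumption that revisions differ only by in-place line changes; it is not the actual *,v file format, and the function names are made up for the illustration.

```python
def make_reverse_delta(newer, older):
    """Return {line number: old text} for every line of `older`
    that differs from `newer` (1-based line numbers).

    Assumption for this sketch: both revisions have the same length.
    """
    if len(newer) != len(older):
        raise ValueError("this sketch only handles in-place line edits")
    return {
        number: old
        for number, (new, old) in enumerate(zip(newer, older), start=1)
        if new != old
    }


def apply_reverse_delta(newer, delta):
    """Rebuild the older revision from the newer one plus the delta."""
    return [delta.get(number, line)
            for number, line in enumerate(newer, start=1)]


r1_9 = ['if (i>5)', '{', 'echo "hallo Welt";', '}']    # stored in full
r1_8 = ['if (i>5)', '{', 'echo "hello world";', '}']   # stored as delta only

delta = make_reverse_delta(r1_9, r1_8)
print(delta)
print(apply_reverse_delta(r1_9, delta))
```

For a binary file such as src3.jpg a line-based delta of this kind is not meaningful, which is why every revision is stored with its full data.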
SVN (Subversion System)

• Designed to be a modern replacement for CVS that solves the shortcomings mentioned above
• Software for converting CVS Repositories to SVN Repositories exists (e.g. cvs2svn)
• Implements a subset of the WebDAV / Delta-V functionality

Architecture

• Clients: command line client app, GUI client apps (e.g. Subclipse)
• Access methods:
  – WebDAV: http:// or https:// (SSL) via the Internet or a local network, served by Apache with mod_dav and mod_dav_svn
  – SVN: svn:// (svnserve) or ssh:// (SSH in front of svnserve)
  – Local: file:///
• Repository back ends: Berkeley DB, FSFS (Fairly Secure File System)

mod_dav: WebDAV / Delta-V module for Apache
mod_dav_svn: WebDAV / Delta-V module for SVN
svnserve: SVN's own server implementation

SVN – Tags and Branches

• Realised by copying the revisions to an extra folder
  – The “svn copy” method does not copy the files physically to a folder; instead, a reference to the original data is created
  – Files and directories can be copied
  – If a copy is modified, only the differences are stored

Tagging
[Figure: in the Repository http://myhost.de/svn, module1 is copied (COPY) to the tag release_1_0. The tag is a copy of a particular revision of the directory and is available at http://myhost.de/svn/tag/release_1_0.]

Branching
[Figure: src1.php in […]/svn/module1 has the revisions … -r2, -r4, -r6 and -r7. A COPY creates the branch […]/svn/branches/bugfix1, which receives -r3 and -r5. In the Sandbox (module1, module1/bugfix), src1.php -r6 was checked out before; the branch is merged into it (MERGE) and the merged -r6 is committed (COMMIT) as -r7.]

Git - Overview
• Widespread decentralized Version Control System used for huge software projects (such as the Linux Kernel)
• Optimized for nonlinear development and branching

Typical Version Control Systems realise versions by tracking changes (∆) to an original version of each file:

[Figure: across Version 1 to Version 6, File A is stored as the original plus ∆1, ∆2, ∆3; File B as the original plus ∆1, ∆2; File C as the original plus ∆1, ∆2, ∆3]

Git creates a new snapshot of the versioned files with every commit:

             Snapshot 1  Snapshot 2  Snapshot 3  Snapshot 4  Snapshot 5  Snapshot 6
File A       Av1         Av2         Av2         Av3         Av3         Av4
File B       Bv1         Bv2         Bv2         Bv2         Bv3         Bv3
File C       Cv1         Cv1         Cv2         Cv2         Cv3         Cv4
Identifier   1           2           3           4           5           6

• The SHA1 hash of the concatenated hashes of the versioned files is used as the identifier of a snapshot
• Link to previous file: if a file has not changed, only a link to the previous file is stored (increases efficiency and reduces storage space)
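The snapshot idea can be sketched in Python with the standard hashlib module. The sketch follows the simplified description on this slide (snapshot identifier = SHA1 over the concatenated file hashes); real Git additionally hashes tree and commit objects with file names and metadata, which is left out here. The class SnapshotStore and its attribute names are hypothetical.

```python
import hashlib


def sha1_hex(data):
    """Hex SHA1 digest of a bytes object."""
    return hashlib.sha1(data).hexdigest()


class SnapshotStore:
    """Toy content-addressed store: each distinct file content is kept once."""

    def __init__(self):
        self.blobs = {}      # file hash -> file content
        self.snapshots = {}  # snapshot id -> {file name: file hash}

    def commit(self, files):
        """Record a snapshot of `files` ({name: bytes}) and return its id."""
        tree = {}
        for name in sorted(files):
            blob_id = sha1_hex(files[name])
            # Unchanged content is already stored: only the hash is
            # referenced again, which plays the role of the "link".
            self.blobs.setdefault(blob_id, files[name])
            tree[name] = blob_id
        joined = "".join(tree[name] for name in sorted(tree))
        snapshot_id = sha1_hex(joined.encode("ascii"))
        self.snapshots[snapshot_id] = tree
        return snapshot_id


store = SnapshotStore()
s1 = store.commit({"A": b"Av1", "B": b"Bv1", "C": b"Cv1"})
s2 = store.commit({"A": b"Av2", "B": b"Bv2", "C": b"Cv1"})
print(s1, s2)
print(len(store.blobs))
```

File C is identical in both snapshots, so its content is stored only once and both snapshots refer to the same hash.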
Git - Branching
• A branch in Git is a reference to a snapshot
• Every branch has a name. The main branch is named “master”
• One branch is always the active one, marked by the “head” reference

Result after 3 commits (chain of 3 snapshots):
  78aafc, 8aacf1, 68bcfc; master and head point to 68bcfc

1 $ git branch bugfix
  Git command: creates a branch (in the local repository) with the name “bugfix”
  Result after command: 78aafc, 8aacf1, 68bcfc; master (head) and bugfix both point to 68bcfc

2 $ git checkout bugfix
  Git command: branch “bugfix” is activated
  Result after command: 78aafc, 8aacf1, 68bcfc; master and bugfix (head) both point to 68bcfc

3 $ git commit
  Git command: commit changes to the active branch
  Result after command: 78aafc, 8aacf1, 68bcfc, 93aabc; master points to 68bcfc, bugfix (head) points to 93aabc
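That a branch is only a named reference, and “head” only a marker for the active branch, can be modelled in a few lines of Python. This is a toy model for illustration, not Git's implementation; the class ToyRepo is hypothetical and the snapshot identifiers are passed in by hand (taken from the slide) instead of being computed.

```python
class ToyRepo:
    """Branches are names pointing at snapshot ids; head names the active branch."""

    def __init__(self):
        self.parents = {}                 # snapshot id -> list of parent ids
        self.branches = {"master": None}  # branch name -> snapshot id
        self.head = "master"              # name of the active branch

    def commit(self, snapshot_id):
        """Add a snapshot on top of the active branch and move that branch."""
        tip = self.branches[self.head]
        self.parents[snapshot_id] = [tip] if tip is not None else []
        self.branches[self.head] = snapshot_id

    def branch(self, name):
        """Create a new reference to the snapshot of the active branch."""
        self.branches[name] = self.branches[self.head]

    def checkout(self, name):
        """Activate an existing branch."""
        if name not in self.branches:
            raise KeyError(name)
        self.head = name


repo = ToyRepo()
for snapshot in ("78aafc", "8aacf1", "68bcfc"):
    repo.commit(snapshot)
repo.branch("bugfix")      # 1: git branch bugfix
repo.checkout("bugfix")    # 2: git checkout bugfix
repo.commit("93aabc")      # 3: git commit
print(repo.branches, repo.head)
```

Creating the branch copies nothing but a reference, which is why branching is cheap in this model; only the commit in step 3 makes master and bugfix diverge.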
Git - Merging
• Merging is done into the active branch

Starting point: 78aafc, 8aacf1, 68bcfc, 93aabc; master points to 68bcfc, bugfix (head) points to 93aabc

4 $ git checkout master
5 $ git commit
  Branch “master” is activated and new changes are committed to this branch
  Result after commands: 68bcfc now has two successors, ab11aa (master, head) and 93aabc (bugfix)

6 $ git merge bugfix
  Git command: merge branch “bugfix” into the active branch
  Result after command: the new snapshot ac61ca joins ab11aa and 93aabc; master (head) points to ac61ca, bugfix still points to 93aabc
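The effect of a merge on the snapshot graph can be sketched as follows. The sketch only models the references and the parent links; combining the file contents is not shown. The function names and the explicit merge_id parameter are assumptions for this illustration (Git computes the identifier itself). If the active branch has no snapshots of its own, Git simply moves the reference forward (a fast-forward); otherwise a snapshot with two parents is created, as on the slide.

```python
def ancestors(parents, snapshot):
    """All snapshots reachable from `snapshot`, including itself."""
    seen, stack = set(), [snapshot]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def merge(parents, branches, active, other, merge_id):
    """Merge branch `other` into branch `active`; return what happened."""
    ours, theirs = branches[active], branches[other]
    if theirs in ancestors(parents, ours):
        return "already up to date"
    if ours in ancestors(parents, theirs):
        branches[active] = theirs          # just move the reference forward
        return "fast-forward"
    parents[merge_id] = [ours, theirs]     # snapshot with two parents
    branches[active] = merge_id
    return "merge commit"


# History from the slide after steps 4 and 5.
parents = {
    "78aafc": [],
    "8aacf1": ["78aafc"],
    "68bcfc": ["8aacf1"],
    "93aabc": ["68bcfc"],   # committed on bugfix
    "ab11aa": ["68bcfc"],   # committed on master
}
branches = {"master": "ab11aa", "bugfix": "93aabc"}

result = merge(parents, branches, "master", "bugfix", "ac61ca")  # 6: git merge bugfix
print(result, branches, parents["ac61ca"])
```

Only the active branch (master) is moved by the merge; bugfix keeps pointing at 93aabc.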
Git – Remote branches
Starting point: the (remote) Canonical Project Repository contains ca672a and ab11aa; master (head) points to ab11aa

1 $ git clone user@somedomain.com/repo.git
  Via the Git command “clone”, a copy of a remote repository is transferred to the local computer. Two branches are created: the master branch and the branch origin/master both point to the origin’s master branch.
  Local Repository: ca672a, ab11aa; origin/master and master (head) point to ab11aa

2 $ git commit
  After two commits the local master branch is no longer equal to the remote master branch (origin/master).
  Local Repository: ab11aa (origin/master), cd78a2, bbc77a (master, head)

4 $ git fetch
  Changes done by others in the remote repository can be transferred to the local repository via the Git command “fetch”. In this example, the remote repository has been updated by two new snapshots.
  Local Repository: ab11aa is followed by 17aaba and aba17a (origin/master) on one line and by cd78a2 and bbc77a (master, head) on the other

The resulting branches in the local repository can be merged in the usual manner.

Git – Distributed Workflows
• Git enables distributed software development workflows with several hierarchical levels
• The Git command “push” is used to transfer data to a remote repository (write access to the repository is necessary)
• Example with two levels (Canonical Project Repository, Integration Manager, Developers 1 to 3; each developer has a local and a public version database):
  1. Developers fetch from the Canonical Project Repository and merge with their local copy. They merge their local repositories with the canonical one at regular intervals (e.g. after they have finished a task)
  2. Each developer pushes from the local version database to his or her public version database
  3. The Integration Manager has read access to the public version databases and selects valuable changes from the available modifications into the Integration Manager's local version database
  4. The Integration Manager pushes the selected changes to the Canonical Project Repository

Motivation: Manual Web publishing vs. CMS

[Figure: administrative effort plotted against content volume, with one curve for manual Web publishing and one for using a Content Management System]

Content

• Content basically consists of three components
  – Structure information
    • Defines the composition, sequence and encapsulation of raw data
    • E.g. HTML structure tags, DTDs, XML Schema
  – Raw data
    • Composed according to the structure, e.g. text, pictures, audio
  – Layout information
    • Formal specification for presentation on a potential output medium, e.g. layouting based on CSS or XSL-FO

Example (the HTML tags are structure data, the style rule is layout, the heading text and the picture are raw data):

<html><head><title>Test-page</title>
<style type="text/css">
h1 {text-align:center} </style>
</head><body><h1>Picture</h1>
<img src="picture.jpg"/>
</body></html>

Content Lifecycle – CMS functionalities

Content Lifecycle and the functionalities a Content Management System offers per phase:

• Creation: Authoring, Authentication, Check-in / Check-out, WYSIWYG, Multi-user ability
• Organisation: User management, Workflow / Quality protection, Page integrity / Link management, Log functionality
• Publishing: Presentation, Personalisation, Multi-Channel-Output, Portal function
• Archiving: Retrieval, Backup / Rollback, Version management

Creation phase: Deals with collection, creation and editing of the content
Organisation phase: Deals with quality check and clearing of the content for publication
Publication phase: Deals with publication of the content in the intranet or Internet
Archiving phase: After removing the content from the network, content can be stored in the content repository for documentation purposes or future usage (advantage: retrieval or rollback)

Features of a CMS

Basic features (offered by typical CMS):
• WYSIWYG editor
• Fixed Workflows / Quality protection
• Search function / Retrieval (full text or via meta information)
• Version management
• Page integrity / Link management
• Separation of structure, layout and raw data
• Time-independent publishing
• Content modify locking with Check-in / Check-out
• Admission control / Access control
• Meta information and Maintenance function
• Log functionality
• Multi-language management

Advanced features (special features of a CMS, characterise different types of CMS):
• Mass operation
• Foreign format conversion
• Import of existing Web sites
• Backup / Rollback
• Community function (chat, forum, Wikis…)
• Multi-user ability
• Portal function
• Shopping function (eCommerce)
• Freely designable Workflows / Quality protection
• Multi-channel output (WAP, SMS, CDROM, Fax…)
• Functional intersection with KMS

Example for a widespread CMS: TYPO3

• Open source CMS
• Based on PHP and a MySQL database
• TypoScript (TS) is used to configure TYPO3
  – TypoScript is a declarative language
  – TypoScript is converted into HTML by PHP logic (TYPO3 Front-end Engine)
  – Areas of application:
    • Design and integration of templates (front-end = published Web site)
      – Pure TS template
      – HTML template + TS
    • Configuration of the back-end (administration + content management area)
      – User/user group properties (User TSConfig)
      – Page properties (Page TSConfig)

TYPO3 - Separation of structure, layout and raw data

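[Figure: in the back-end (BE), the BE-User (editor) maintains text, layouting, the page tree structure, templates and raw data elements (picture, audio). The content is stored in the MySQL database and in the file system (CSS layout, TP files, pictures, audio). From these, the TYPO3 Front-end Engine generates the HTML Web site for the front-end (FE), where the FE-User (visitor) views it.]

TP: Template
TS: TypoScript
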
TypoScript example

# default page   (comment row, not interpreted)
# The PAGE object builds a framework for embedding further content
page = PAGE
page.10 = TEXT
page.10.value = The TUD logo
page.10.wrap = <h1>|</h1>
# A Content Object Array (COA) is created and put to "position" 20 of the page
page.20 = COA
page.20.wrap = <table border="1">|</table>
page.20.10 = COA
# The COA is wrapped into HTML table row elements
page.20.10.wrap = <tr>|</tr>
# Abbreviatory syntax for assigning values to positions
page.20.10 {
  10 = TEXT
  10.value = 266 x 77
  10.wrap = <td>|</td>
  # After creating an image object a file is associated with it
  20 = IMAGE
  20.file = fileadmin/user_upload/tu-logo.gif
  20.wrap = <td><div>|</div></td>
}

COA = content object that itself can contain further objects
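How the “wrap” property and the numbered positions of the TypoScript example work together can be mimicked with a short Python sketch. It is a toy model, not the TYPO3 Front-end Engine: the dictionary layout, the functions wrap and render, and the plain img tag emitted for IMAGE are assumptions made for the illustration, and the HTML skeleton that a real PAGE object produces is omitted.

```python
def wrap(content, pattern):
    """Insert `content` at the position of the pipe symbol in `pattern`."""
    before, _, after = pattern.partition("|")
    return before + content + after


def render(obj):
    """Render a content object; containers render their children by position."""
    kind = obj["type"]
    if kind == "TEXT":
        body = obj["value"]
    elif kind == "IMAGE":
        body = '<img src="%s" />' % obj["file"]   # simplified image output
    elif kind in ("COA", "PAGE"):
        children = obj["children"]
        body = "".join(render(children[pos]) for pos in sorted(children))
    else:
        raise ValueError("unknown object type: %s" % kind)
    return wrap(body, obj.get("wrap", "|"))


page = {
    "type": "PAGE",
    "children": {
        10: {"type": "TEXT", "value": "The TUD logo", "wrap": "<h1>|</h1>"},
        20: {
            "type": "COA", "wrap": '<table border="1">|</table>',
            "children": {
                10: {
                    "type": "COA", "wrap": "<tr>|</tr>",
                    "children": {
                        10: {"type": "TEXT", "value": "266 x 77",
                             "wrap": "<td>|</td>"},
                        20: {"type": "IMAGE",
                             "file": "fileadmin/user_upload/tu-logo.gif",
                             "wrap": "<td><div>|</div></td>"},
                    },
                },
            },
        },
    },
}

html = render(page)
print(html)
```

The numeric positions only determine the order of output, so an object placed at position 15 later would appear between the heading and the table without renumbering anything.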
Wikis

• Software systems that allow users to easily generate, interconnect and edit pages and that can be used to create collaborative Web sites
• Support basic features of CMS
  – Separation of structure, layout and raw data
  – Integrated editing functionality
  – Version management, history mechanism, retrieval of former versions
• Often provide a special Wiki syntax for formatting purposes and link creation
• Can be used as a means for group communication (e.g. for software/research project management)
• Various Wiki engines available: UseMod Wiki, TWiki (both store content in pure text files), MediaWiki, …

MediaWiki

• PHP-based Wiki implementation
• Uses a MySQL or PostgreSQL database
• Very easy dialog-driven installation and configuration
• Extensible by various add-ons (e.g. Semantic MediaWiki)
• Used by many highly frequented Web sites (e.g. Wikipedia)

Examples of MediaWiki syntax:

==heading==                    Include headings in different sizes
===level 3===
====level 4====

[[Link to another page]]       Internal link to another page on the wiki
[[Link|different title]]       (the name/title of the article has to be provided as the link)

[[de:Seite auf Deutsch]]       Interwiki link (enables easy interconnection between Wiki
(e.g. [[de:Dresden]])          content in different languages); the example links to the
                               German Wikipedia (appears under “languages”)

# one                          Numbered list
# two
# three

#REDIRECT [[Other article]]    Redirect one title to another article
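The double-bracket link syntax is regular enough to be picked out of page source with a regular expression. The following Python sketch is only an approximation for the link forms listed above, not MediaWiki's own parser; the function name extract_links is made up for the illustration.

```python
import re

# [[target]] or [[target|label]]; nested brackets are not handled.
LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def extract_links(wikitext):
    """Return (target, label) pairs; the label defaults to the target."""
    return [
        (match.group(1).strip(), (match.group(2) or match.group(1)).strip())
        for match in LINK.finditer(wikitext)
    ]


source = ("See [[Link to another page]], [[Link|different title]] "
          "and the German version [[de:Dresden]].")
links = extract_links(source)
print(links)
```

An interwiki link such as de:Dresden is returned like any other target here; telling it apart from an internal link would require knowing the configured language prefixes.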
MediaWiki

[Figure: screenshot of the source of a MediaWiki article with callouts: link to the article about “industrialization”; create a new section by creating its heading; include an image and assign different attributes to it; include regular HTML tags]

Semantic Wikis

• Enable encoding of semantic data within regular Wiki pages
• Provide possibilities to annotate pages with meaningful relations to other pages and with attributes
• The data can be used for semantic search (inline querying tools are available)
• Provide RDF export functionality

Example (from a Semantic MediaWiki article about Dresden)

... is the capital of [[Is capital of::Saxony]] ...

  subject:   “Dresden” (current page)
  predicate: “Is capital of”
  object:    “Saxony” (further Wiki page)

... the number of students living in Dresden is [[Has students:=35.000]] ...

  subject:   “Dresden” (current page)
  predicate: “Has students”
  object:    “35.000” (regular value/attribute)
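The two annotation forms shown in the Dresden example map directly onto subject, predicate, object triples. A minimal Python sketch of that extraction follows; it covers only the “::” and “:=” forms from the example, is not the parser used by Semantic MediaWiki, and the function name extract_triples is hypothetical.

```python
import re

# [[predicate::object]] marks a relation, [[predicate:=value]] an attribute.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)(::|:=)([^\[\]|]+)\]\]")


def extract_triples(page_title, wikitext):
    """Return (subject, predicate, object, kind) for each annotation."""
    triples = []
    for match in ANNOTATION.finditer(wikitext):
        predicate, separator, value = match.groups()
        kind = "relation" if separator == "::" else "attribute"
        triples.append((page_title, predicate.strip(), value.strip(), kind))
    return triples


text = ("... is the capital of [[Is capital of::Saxony]] ... "
        "the number of students living in Dresden is [[Has students:=35.000]] ...")
triples = extract_triples("Dresden", text)
for triple in triples:
    print(triple)
```

The subject is never written in the annotation itself; it is always the page that contains it, which is why the page title is passed in separately.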
Conclusion
[Figure: overview of the systems covered]
• Revision control for files on a File / Web Server
  – Centralized systems: CVS / SVN, accessed with CVS / SVN client software
  – Decentralized system: Git, accessed with Git client software
• Organizing the content lifecycle of pages on a Web Server, accessed with a Web browser
  – Web-based CMS (e.g. TYPO3)
  – Wiki engine (e.g. MediaWiki with semantic extension)
• WebDAV (HTTP + extension) and Delta-V (versioning extension), accessed with a Web browser or a WebDAV client

References

Revision Control

CVS                  http://www.nongnu.org/cvs/
SVN                  http://subversion.apache.org/
Git                  http://git-scm.com/
Open Book about Git  http://progit.org/

Content Management

TYPO3                http://www.typo3.com/
MediaWiki            http://www.mediawiki.org/wiki/MediaWiki
Semantic MediaWiki   http://semantic-mediawiki.org/wiki/Semantic_MediaWiki
