You are on page 1of 17

NAME

gittutorial - A tutorial introduction to Git (for version 1.5.1 or newer)

SYNOPSIS
git *

DESCRIPTION
This tutorial explains how to import a new project into Git, make changes to it, and share
changes with other developers.
If you are instead primarily interested in using Git to fetch a project, for example, to test the
latest version, you may prefer to start with the first two chapters of The Git Users Manual.
First, note that you can get documentation for a command such as git log --graph with:
$ man git-log

or:
$ git help log

With the latter, you can use the manual viewer of your choice; see git-help[1] for more
information.
It is a good idea to introduce yourself to Git with your name and public email address before
doing any operation. The easiest way to do so is:
$ git config --global user.name "Your Name Comes Here"
$ git config --global user.email you@yourdomain.example.com

Importing a new project


Assume you have a tarball project.tar.gz with your initial work. You can place it under Git
revision control as follows.
$ tar xzf project.tar.gz
$ cd project
$ git init

Git will reply


Initialized empty Git repository in .git/

Youve now initialized the working directoryyou may notice a new directory created, named
".git".
Next, tell Git to take a snapshot of the contents of all files under the current directory (note the .),
with git add:
$ git add .

This snapshot is now stored in a temporary staging area which Git calls the "index". You can
permanently store the contents of the index in the repository with git commit:
$ git commit

This will prompt you for a commit message. Youve now stored the first version of your project
in Git.

Making changes
Modify some files, then add their updated contents to the index:
$ git add file1 file2 file3

You are now ready to commit. You can see what is about to be committed using git diff with the
--cached option:
$ git diff --cached

(Without --cached, git diff will show you any changes that youve made but not yet added to the
index.) You can also get a brief summary of the situation with git status:
$ git status
# On branch master
# Changes to be committed:
#
(use "git reset HEAD <file>..." to unstage)
#
#
modified:
file1
#
modified:
file2
#
modified:
file3
#

If you need to make any further adjustments, do so now, and then add any newly modified
content to the index. Finally, commit your changes with:
$ git commit

This will again prompt you for a message describing the change, and then record a new version
of the project.

Alternatively, instead of running git add beforehand, you can use


$ git commit -a

which will automatically notice any modified (but not new) files, add them to the index, and
commit, all in one step.
A note on commit messages: Though not required, its a good idea to begin the commit message
with a single short (less than 50 character) line summarizing the change, followed by a blank line
and then a more thorough description. The text up to the first blank line in a commit message is
treated as the commit title, and that title is used throughout Git. For example, git-format-patch[1]
turns a commit into email, and it uses the title on the Subject line and the rest of the commit in
the body.

Git tracks content not files


Many revision control systems provide an add command that tells the system to start tracking
changes to a new file. Gits add command does something simpler and more powerful: git add is
used both for new and newly modified files, and in both cases it takes a snapshot of the given
files and stages that content in the index, ready for inclusion in the next commit.

Viewing project history


At any point you can view the history of your changes using
$ git log

If you also want to see complete diffs at each step, use


$ git log -p

Often the overview of the change is useful to get a feel of each step
$ git log --stat --summary

Managing branches
A single Git repository can maintain multiple branches of development. To create a new branch
named "experimental", use
$ git branch experimental

If you now run


$ git branch

youll get a list of all existing branches:


experimental
* master

The "experimental" branch is the one you just created, and the "master" branch is a default
branch that was created for you automatically. The asterisk marks the branch you are currently
on; type
$ git checkout experimental

to switch to the experimental branch. Now edit a file, commit the change, and switch back to the
master branch:
(edit file)
$ git commit -a
$ git checkout master

Check that the change you made is no longer visible, since it was made on the experimental
branch and youre back on the master branch.
You can make a different change on the master branch:
(edit file)
$ git commit -a

at this point the two branches have diverged, with different changes made in each. To merge the
changes made in experimental into master, run
$ git merge experimental

If the changes dont conflict, youre done. If there are conflicts, markers will be left in the
problematic files showing the conflict;
$ git diff

will show this. Once youve edited the files to resolve the conflicts,
$ git commit -a

will commit the result of the merge. Finally,


$ gitk

will show a nice graphical representation of the resulting history.


At this point you could delete the experimental branch with
$ git branch -d experimental

This command ensures that the changes in the experimental branch are already in the current
branch.
If you develop on a branch crazy-idea, then regret it, you can always delete the branch with
$ git branch -D crazy-idea

Branches are cheap and easy, so this is a good way to try something out.

Using Git for collaboration


Suppose that Alice has started a new project with a Git repository in /home/alice/project, and that
Bob, who has a home directory on the same machine, wants to contribute.
Bob begins with:
bob$ git clone /home/alice/project myrepo

This creates a new directory "myrepo" containing a clone of Alices repository. The clone is on
an equal footing with the original project, possessing its own copy of the original projects
history.
Bob then makes some changes and commits them:
(edit files)
bob$ git commit -a
(repeat as necessary)

When hes ready, he tells Alice to pull changes from the repository at /home/bob/myrepo. She
does this with:
alice$ cd /home/alice/project
alice$ git pull /home/bob/myrepo master

This merges the changes from Bobs "master" branch into Alices current branch. If Alice has
made her own changes in the meantime, then she may need to manually fix any conflicts.
The "pull" command thus performs two operations: it fetches changes from a remote branch,
then merges them into the current branch.
Note that in general, Alice would want her local changes committed before initiating this "pull".
If Bobs work conflicts with what Alice did since their histories forked, Alice will use her
working tree and the index to resolve conflicts, and existing local changes will interfere with the
conflict resolution process (Git will still perform the fetch but will refuse to merge --- Alice will
have to get rid of her local changes in some way and pull again when this happens).

Alice can peek at what Bob did without merging first, using the "fetch" command; this allows
Alice to inspect what Bob did, using a special symbol "FETCH_HEAD", in order to determine if
he has anything worth pulling, like this:
alice$ git fetch /home/bob/myrepo master
alice$ git log -p HEAD..FETCH_HEAD

This operation is safe even if Alice has uncommitted local changes. The range notation
"HEAD..FETCH_HEAD" means "show everything that is reachable from the FETCH_HEAD
but exclude anything that is reachable from HEAD". Alice already knows everything that leads
to her current state (HEAD), and reviews what Bob has in his state (FETCH_HEAD) that she has
not seen with this command.
If Alice wants to visualize what Bob did since their histories forked she can issue the following
command:
$ gitk HEAD..FETCH_HEAD

This uses the same two-dot range notation we saw earlier with git log.
Alice may want to view what both of them did since they forked. She can use three-dot form
instead of the two-dot form:
$ gitk HEAD...FETCH_HEAD

This means "show everything that is reachable from either one, but exclude anything that is
reachable from both of them".
Please note that these range notation can be used with both gitk and "git log".
After inspecting what Bob did, if there is nothing urgent, Alice may decide to continue working
without pulling from Bob. If Bobs history does have something Alice would immediately need,
Alice may choose to stash her work-in-progress first, do a "pull", and then finally unstash her
work-in-progress on top of the resulting history.
When you are working in a small closely knit group, it is not unusual to interact with the same
repository over and over again. By defining remote repository shorthand, you can make it easier:
alice$ git remote add bob /home/bob/myrepo

With this, Alice can perform the first part of the "pull" operation alone using the git fetch
command without merging them with her own branch, using:
alice$ git fetch bob

Unlike the longhand form, when Alice fetches from Bob using a remote repository shorthand set
up with git remote, what was fetched is stored in a remote-tracking branch, in this case
bob/master. So after this:
alice$ git log -p master..bob/master

shows a list of all the changes that Bob made since he branched from Alices master branch.
After examining those changes, Alice could merge the changes into her master branch:
alice$ git merge bob/master

This merge can also be done by pulling from her own remote-tracking branch, like this:
alice$ git pull . remotes/bob/master

Note that git pull always merges into the current branch, regardless of what else is given on the
command line.
Later, Bob can update his repo with Alices latest changes using
bob$ git pull

Note that he doesnt need to give the path to Alices repository; when Bob cloned Alices
repository, Git stored the location of her repository in the repository configuration, and that
location is used for pulls:
bob$ git config --get remote.origin.url
/home/alice/project

(The complete configuration created by git clone is visible using git config -l, and the gitconfig[1] man page explains the meaning of each option.)
Git also keeps a pristine copy of Alices master branch under the name "origin/master":
bob$ git branch -r
origin/master

If Bob later decides to work from a different host, he can still perform clones and pulls using the
ssh protocol:
bob$ git clone alice.org:/home/alice/project myrepo

Alternatively, Git has a native protocol, or can use rsync or http; see git-pull[1] for details.
Git can also be used in a CVS-like mode, with a central repository that various users push
changes to; see git-push[1] and gitcvs-migration[7].

Exploring history
Git history is represented as a series of interrelated commits. We have already seen that the git
log command can list those commits. Note that first line of each git log entry also gives a name
for the commit:
$ git log
commit c82a22c39cbc32576f64f5c6b3f24b99ea8149c7
Author: Junio C Hamano <junkio@cox.net>
Date:
Tue May 16 17:18:22 2006 -0700
merge-base: Clarify the comments on post processing.

We can give this name to git show to see the details about this commit.
$ git show c82a22c39cbc32576f64f5c6b3f24b99ea8149c7

But there are other ways to refer to commits. You can use any initial part of the name that is long
enough to uniquely identify the commit:
$ git show c82a22c39c

# the first few characters of the name are


# usually enough
$ git show HEAD
# the tip of the current branch
$ git show experimental
# the tip of the "experimental" branch

Every commit usually has one "parent" commit which points to the previous state of the project:
$ git show HEAD^ # to see the parent of HEAD
$ git show HEAD^^ # to see the grandparent of HEAD
$ git show HEAD~4 # to see the great-great grandparent of HEAD

Note that merge commits may have more than one parent:
$ git show HEAD^1 # show the first parent of HEAD (same as HEAD^)
$ git show HEAD^2 # show the second parent of HEAD

You can also give commits names of your own; after running
$ git tag v2.5 1b2e1d63ff

you can refer to 1b2e1d63ff by the name "v2.5". If you intend to share this name with other
people (for example, to identify a release version), you should create a "tag" object, and perhaps
sign it; see git-tag[1] for details.
Any Git command that needs to know a commit can take any of these names. For example:
$ git diff v2.5 HEAD
# compare the current HEAD to v2.5
$ git branch stable v2.5 # start a new branch named "stable" based
# at v2.5
$ git reset --hard HEAD^ # reset your current branch and working

# directory to its state at HEAD^

Be careful with that last command: in addition to losing any changes in the working directory, it
will also remove all later commits from this branch. If this branch is the only branch containing
those commits, they will be lost. Also, dont use git reset on a publicly-visible branch that other
developers pull from, as it will force needless merges on other developers to clean up the history.
If you need to undo changes that you have pushed, use git revert instead.
The git grep command can search for strings in any version of your project, so
$ git grep "hello" v2.5

searches for all occurrences of "hello" in v2.5.


If you leave out the commit name, git grep will search any of the files it manages in your current
directory. So
$ git grep "hello"

is a quick way to search just the files that are tracked by Git.
Many Git commands also take sets of commits, which can be specified in a number of ways.
Here are some examples with git log:
$
$
$
$

git
git
git
git

log
log
log
log

v2.5..v2.6
# commits
v2.5..
# commits
--since="2 weeks ago" # commits
v2.5.. Makefile
# commits
# Makefile

between v2.5 and v2.6


since v2.5
from the last 2 weeks
since v2.5 which modify

You can also give git log a "range" of commits where the first is not necessarily an ancestor of
the second; for example, if the tips of the branches "stable" and "master" diverged from a
common commit some time ago, then
$ git log stable..master

will list commits made in the master branch but not in the stable branch, while
$ git log master..stable

will show the list of commits made on the stable branch but not the master branch.
The git log command has a weakness: it must present commits in a list. When the history has
lines of development that diverged and then merged back together, the order in which git log
presents those commits is meaningless.
Most projects with multiple contributors (such as the Linux kernel, or Git itself) have frequent
merges, and gitk does a better job of visualizing their history. For example,

$ gitk --since="2 weeks ago" drivers/

allows you to browse any commits from the last 2 weeks of commits that modified files under
the "drivers" directory. (Note: you can adjust gitks fonts by holding down the control key while
pressing "-" or "+".)
Finally, most commands that take filenames will optionally allow you to precede any filename
by a commit, to specify a particular version of the file:
$ git diff v2.5:Makefile HEAD:Makefile.in

You can also use git show to see any such file:
$ git show v2.5:Makefile

Next Steps
This tutorial should be enough to perform basic distributed revision control for your projects.
However, to fully understand the depth and power of Git you need to understand two simple
ideas on which it is based:

The object database is the rather elegant system used to store the history of your
projectfiles, directories, and commits.
The index file is a cache of the state of a directory tree, used to create commits, check out
working directories, and hold the various trees involved in a merge.

Part two of this tutorial explains the object database, the index file, and a few other odds and
ends that youll need to make the most of Git. You can find it at gittutorial-2[7].
If you dont want to continue with that right away, a few other digressions that may be
interesting at this point are:

git-format-patch[1], git-am[1]: These convert series of git commits into emailed patches,
and vice versa, useful for projects such as the Linux kernel which rely heavily on emailed
patches.
git-bisect[1]: When there is a regression in your project, one way to track down the bug is
by searching through the history to find the exact commit thats to blame. Git bisect can
help you perform a binary search for that commit. It is smart enough to perform a closeto-optimal search even in the case of complex non-linear history with lots of merged
branches.
gitworkflows[7]: Gives an overview of recommended workflows.
Everyday Git with 20 Commands Or So
gitcvs-migration[7]: Git for CVS users.

NAME
gittutorial-2 - A tutorial introduction to Git: part two

SYNOPSIS
git *

DESCRIPTION
You should work through gittutorial[7] before reading this tutorial.
The goal of this tutorial is to introduce two fundamental pieces of Gits architecturethe object
database and the index fileand to provide the reader with everything necessary to understand
the rest of the Git documentation.

The Git object database


Lets start a new project and create a small amount of history:
$ mkdir test-project
$ cd test-project
$ git init
Initialized empty Git repository in .git/
$ echo 'hello world' > file.txt
$ git add .
$ git commit -a -m "initial commit"
[master (root-commit) 54196cc] initial commit
1 file changed, 1 insertion(+)
create mode 100644 file.txt
$ echo 'hello world!' >file.txt
$ git commit -a -m "add emphasis"
[master c4d59f3] add emphasis
1 file changed, 1 insertion(+), 1 deletion(-)

What are the 7 digits of hex that Git responded to the commit with?
We saw in part one of the tutorial that commits have names like this. It turns out that every
object in the Git history is stored under a 40-digit hex name. That name is the SHA-1 hash of the
objects contents; among other things, this ensures that Git will never store the same data twice
(since identical data is given an identical SHA-1 name), and that the contents of a Git object will
never change (since that would change the objects name as well). The 7 char hex strings here
are simply the abbreviation of such 40 character long strings. Abbreviations can be used
everywhere where the 40 character strings can be used, so long as they are unambiguous.

It is expected that the content of the commit object you created while following the example
above generates a different SHA-1 hash than the one shown above because the commit object
records the time when it was created and the name of the person performing the commit.
We can ask Git about this particular object with the cat-file command. Dont copy the 40 hex
digits from this example but use those from your own version. Note that you can shorten it to
only a few characters to save yourself typing all 40 hex digits:
$ git cat-file -t 54196cc2
commit
$ git cat-file commit 54196cc2
tree 92b8b694ffb1675e5975148e1121810081dbdffe
author J. Bruce Fields <bfields@puzzle.fieldses.org> 1143414668 -0500
committer J. Bruce Fields <bfields@puzzle.fieldses.org> 1143414668 -0500
initial commit

A tree can refer to one or more "blob" objects, each corresponding to a file. In addition, a tree
can also refer to other tree objects, thus creating a directory hierarchy. You can examine the
contents of any tree using ls-tree (remember that a long enough initial portion of the SHA-1 will
also work):
$ git ls-tree 92b8b694
100644 blob 3b18e512dba79e4c8300dd08aeb37f8e728b8dad

file.txt

Thus we see that this tree has one file in it. The SHA-1 hash is a reference to that files data:
$ git cat-file -t 3b18e512
blob

A "blob" is just file data, which we can also examine with cat-file:
$ git cat-file blob 3b18e512
hello world

Note that this is the old file data; so the object that Git named in its response to the initial tree
was a tree with a snapshot of the directory state that was recorded by the first commit.
All of these objects are stored under their SHA-1 names inside the Git directory:
$ find .git/objects/
.git/objects/
.git/objects/pack
.git/objects/info
.git/objects/3b
.git/objects/3b/18e512dba79e4c8300dd08aeb37f8e728b8dad
.git/objects/92
.git/objects/92/b8b694ffb1675e5975148e1121810081dbdffe
.git/objects/54
.git/objects/54/196cc2703dc165cbd373a65a4dcf22d50ae7f7
.git/objects/a0

.git/objects/a0/423896973644771497bdc03eb99d5281615b51
.git/objects/d0
.git/objects/d0/492b368b66bdabf2ac1fd8c92b39d3db916e59
.git/objects/c4
.git/objects/c4/d59f390b9cfd4318117afde11d601c1085f241

and the contents of these files is just the compressed data plus a header identifying their length
and their type. The type is either a blob, a tree, a commit, or a tag.
The simplest commit to find is the HEAD commit, which we can find from .git/HEAD:
$ cat .git/HEAD
ref: refs/heads/master

As you can see, this tells us which branch were currently on, and it tells us this by naming a file
under the .git directory, which itself contains a SHA-1 name referring to a commit object, which
we can examine with cat-file:
$ cat .git/refs/heads/master
c4d59f390b9cfd4318117afde11d601c1085f241
$ git cat-file -t c4d59f39
commit
$ git cat-file commit c4d59f39
tree d0492b368b66bdabf2ac1fd8c92b39d3db916e59
parent 54196cc2703dc165cbd373a65a4dcf22d50ae7f7
author J. Bruce Fields <bfields@puzzle.fieldses.org> 1143418702 -0500
committer J. Bruce Fields <bfields@puzzle.fieldses.org> 1143418702 -0500
add emphasis

The "tree" object here refers to the new state of the tree:
$ git ls-tree d0492b36
100644 blob a0423896973644771497bdc03eb99d5281615b51
$ git cat-file blob a0423896
hello world!

file.txt

and the "parent" object refers to the previous commit:


$ git cat-file commit 54196cc2
tree 92b8b694ffb1675e5975148e1121810081dbdffe
author J. Bruce Fields <bfields@puzzle.fieldses.org> 1143414668 -0500
committer J. Bruce Fields <bfields@puzzle.fieldses.org> 1143414668 -0500
initial commit

The tree object is the tree we examined first, and this commit is unusual in that it lacks any
parent.

Most commits have only one parent, but it is also common for a commit to have multiple
parents. In that case the commit represents a merge, with the parent references pointing to the
heads of the merged branches.
Besides blobs, trees, and commits, the only remaining type of object is a "tag", which we wont
discuss here; refer to git-tag[1] for details.
So now we know how Git uses the object database to represent a projects history:

"commit" objects refer to "tree" objects representing the snapshot of a directory tree at a
particular point in the history, and refer to "parent" commits to show how theyre
connected into the project history.
"tree" objects represent the state of a single directory, associating directory names to
"blob" objects containing file data and "tree" objects containing subdirectory information.
"blob" objects contain file data without any other structure.
References to commit objects at the head of each branch are stored in files under
.git/refs/heads/.
The name of the current branch is stored in .git/HEAD.

Note, by the way, that lots of commands take a tree as an argument. But as we can see above, a
tree can be referred to in many different waysby the SHA-1 name for that tree, by the name of
a commit that refers to the tree, by the name of a branch whose head refers to that tree, etc.--and
most such commands can accept any of these names.
In command synopses, the word "tree-ish" is sometimes used to designate such an argument.

The index file


The primary tool weve been using to create commits is git-commit -a, which creates a
commit including every change youve made to your working tree. But what if you want to
commit changes only to certain files? Or only certain changes to certain files?
If we look at the way commits are created under the cover, well see that there are more flexible
ways creating commits.
Continuing with our test-project, lets modify file.txt again:
$ echo "hello world, again" >>file.txt

but this time instead of immediately making the commit, lets take an intermediate step, and ask
for diffs along the way to keep track of whats happening:
$ git diff
--- a/file.txt
+++ b/file.txt
@@ -1 +1,2 @@
hello world!

+hello world, again


$ git add file.txt
$ git diff

The last diff is empty, but no new commits have been made, and the head still doesnt contain
the new line:
$ git diff HEAD
diff --git a/file.txt b/file.txt
index a042389..513feba 100644
--- a/file.txt
+++ b/file.txt
@@ -1 +1,2 @@
hello world!
+hello world, again

So git diff is comparing against something other than the head. The thing that its comparing
against is actually the index file, which is stored in .git/index in a binary format, but whose
contents we can examine with ls-files:
$ git ls-files --stage
100644 513feba2e53ebbd2532419ded848ba19de88ba00 0
$ git cat-file -t 513feba2
blob
$ git cat-file blob 513feba2
hello world!
hello world, again

file.txt

So what our git add did was store a new blob and then put a reference to it in the index file. If we
modify the file again, well see that the new modifications are reflected in the git diff output:
$ echo 'again?' >>file.txt
$ git diff
index 513feba..ba3da7b 100644
--- a/file.txt
+++ b/file.txt
@@ -1,2 +1,3 @@
hello world!
hello world, again
+again?

With the right arguments, git diff can also show us the difference between the working directory
and the last commit, or between the index and the last commit:
$ git diff HEAD
diff --git a/file.txt b/file.txt
index a042389..ba3da7b 100644
--- a/file.txt
+++ b/file.txt
@@ -1 +1,3 @@
hello world!
+hello world, again
+again?

$ git diff --cached


diff --git a/file.txt b/file.txt
index a042389..513feba 100644
--- a/file.txt
+++ b/file.txt
@@ -1 +1,2 @@
hello world!
+hello world, again

At any time, we can create a new commit using git commit (without the "-a" option), and verify
that the state committed only includes the changes stored in the index file, not the additional
change that is still only in our working tree:
$ git commit -m "repeat"
$ git diff HEAD
diff --git a/file.txt b/file.txt
index 513feba..ba3da7b 100644
--- a/file.txt
+++ b/file.txt
@@ -1,2 +1,3 @@
hello world!
hello world, again
+again?

So by default git commit uses the index to create the commit, not the working tree; the "-a"
option to commit tells it to first update the index with all changes in the working tree.
Finally, its worth looking at the effect of git add on the index file:
$ echo "goodbye, world" >closing.txt
$ git add closing.txt

The effect of the git add was to add one entry to the index file:
$ git ls-files --stage
100644 8b9743b20d4b15be3955fc8d5cd2b09cd2336138 0
100644 513feba2e53ebbd2532419ded848ba19de88ba00 0

closing.txt
file.txt

And, as you can see with cat-file, this new entry refers to the current contents of the file:
$ git cat-file blob 8b9743b2
goodbye, world

The "status" command is a useful way to get a quick summary of the situation:
$ git status
# On branch master
# Changes to be committed:
#
(use "git reset HEAD <file>..." to unstage)
#
#
new file: closing.txt
#

# Changes not staged for commit:


#
(use "git add <file>..." to update what will be committed)
#
#
modified: file.txt
#

Since the current state of closing.txt is cached in the index file, it is listed as "Changes to be
committed". Since file.txt has changes in the working directory that arent reflected in the index,
it is marked "changed but not updated". At this point, running "git commit" would create a
commit that added closing.txt (with its new contents), but that didnt modify file.txt.
Also, note that a bare git diff shows the changes to file.txt, but not the addition of closing.txt,
because the version of closing.txt in the index file is identical to the one in the working directory.
In addition to being the staging area for new commits, the index file is also populated from the
object database when checking out a branch, and is used to hold the trees involved in a merge
operation. See gitcore-tutorial[7] and the relevant man pages for details.

What next?
At this point you should know everything necessary to read the man pages for any of the git
commands; one good place to start would be with the commands mentioned in Everyday Git.
You should be able to find any unknown jargon in gitglossary[7].
The Git Users Manual provides a more comprehensive introduction to Git.
gitcvs-migration[7] explains how to import a CVS repository into Git, and shows how to use Git
in a CVS-like way.
For some interesting examples of Git use, see the howtos.
For Git developers, gitcore-tutorial[7] goes into detail on the lower-level Git mechanisms
involved in, for example, creating a new commit.

You might also like