Professional Documents
Culture Documents
5004 GIT Succinctly Syncfusion
5004 GIT Succinctly Syncfusion
By
Ryan Hodson
dited by
This publication was edited by Praveen Ramesh, director of development, Syncfusion, Inc.
Table of Contents
The Story behind the Succinctly Series of Books ........................................... 6
Introduction ......................................................................................................... 8
Faster Commands ............................................................................................. 9
Stability .............................................................................................................. 9
Isolated Environments ....................................................................................... 9
Efficient Merging................................................................................................ 9
Chapter 1: Overview .......................................................................................... 10
The Working Directory ..................................................................................... 10
The Staging Area ............................................................................................ 10
Committed History ........................................................................................... 11
Development Branches ................................................................................... 11
Chapter 2: Getting Started ................................................................................ 13
Installation ....................................................................................................... 13
Configuration ................................................................................................... 13
User Info ...................................................................................................... 14
Editor ........................................................................................................... 14
Aliases ......................................................................................................... 14
Initializing Repositories .................................................................................... 14
Cloning Repositories ....................................................................................... 15
Chapter 3: Recording Changes ........................................................................ 16
The Staging Area ............................................................................................ 16
Inspecting the Stage........................................................................................ 17
Generating Diffs ........................................................................................... 18
Commits .......................................................................................................... 19
Inspecting Commits ......................................................................................... 20
Useful Configurations .................................................................................. 21
Tagging Commits ............................................................................................ 22
Chapter 4: Undoing Changes ........................................................................... 23
Undoing in the Working Directory .................................................................... 23
Individual Files ............................................................................................. 24
Undoing in the Staging Area ........................................................................... 25
Undoing Commits ............................................................................................ 26
Resetting ..................................................................................................... 26
Reverting ..................................................................................................... 27
Amending .................................................................................................... 28
Chapter 5: Branches ......................................................................................... 29
Manipulating Branches .................................................................................... 29
Listing Branches .......................................................................................... 30
Creating Branches ....................................................................................... 30
Deleting Branches ....................................................................................... 31
Checking Out Branches .................................................................................. 31
Detached HEADs .......................................................................................... 33
Merging Branches ........................................................................................... 34
Fast-forward Merges .................................................................................... 35
4
Free forever
Syncfusion will be working to produce books on several topics. The books will always be
free. Any updates we publish will also be free.
Introduction
Git is an open-source version control system known for its speed, stability, and
distributed collaboration model. Originally created in 2006 to manage the entire
Linux kernel, Git now boasts a comprehensive feature set, an active
development team, and several free hosting communities.
Git was designed from the ground up, paying little attention to the existing
standards of centralized versioning systems. So, if youre coming from an SVN or
CVS background, try to forget everything you know about version control before
reading this guide.
Distributed software development is fundamentally different from centralized
version control systems. Instead of storing file information in a single central
repository, Git gives every developer a full copy of the repository. To facilitate
collaboration, Git lets each of these repositories share changes with any other
repository.
Faster Commands
First, a local copy of the repository means that almost all version control actions
are much faster. Instead of communicating with the central server over a network
connection, Git actions are performed on the local machine. This also means you
can work offline without changing your workflow.
Stability
Since each collaborator essentially has a backup of the whole project, the risk of
a server crash, a corrupted repository, or any other type of data loss is much
lower than that of centralized systems that rely on a single point-of-access.
Isolated Environments
Every copy of a Git repository, whether local or remote, retains the full history of
a project. Having a complete, isolated development environment gives each user
the freedom to experiment with new additions before polishing them up into
clean, publishable commits.
Efficient Merging
A complete history for each developer also means a divergent history for each
developer. As soon as you make a single local commit, youre out of sync with
everyone else on the project. To cope with this massive amount of branching, Git
became very good at merging divergent lines of development.
Chapter 1 Overview
Each Git repository contains 4 components:
10
Committed History
Once youve configured your changes in the staging area, you can commit it to
the project history where it will remain as a safe revision. Commits are safe in
the sense that Git will never change them on its own, although it is possible for
you to manually rewrite project history.
Development Branches
So far, were still only able to create a linear project history, adding one commit
on top of another. Branches make it possible to develop multiple unrelated
features in parallel by forking the project history.
11
12
Configuration
Git comes with a long list of configuration options covering everything from your
name to your favorite merge tool. You can set options with the git config
13
config
config
config
config
--global
--global
--global
--global
alias.st
alias.ci
alias.co
alias.br
status
commit
checkout
branch
Learn more by running the git help config in your Git Bash prompt.
Initializing Repositories
Git is designed to be as unobtrusive as possible. The only difference between a
Git repository and an ordinary project folder is an extra .git directory in the
project root (not in every subfolder like SVN). To turn an ordinary project folder
into a full-fledged Git repository, run the git init command:
git init <path>
The <path> argument should be a path to the repository (leaving it blank will
use the current working directory). Now, you can use all of Gits wonderful
version control features.
14
Cloning Repositories
As an alternative to git init, you can clone an existing Git repository using the
following command:
git clone ssh://<user>@<host>/path/to/repo.git
This logs into the <host> machine using SSH and downloads the repo.git
project. This is a complete copy, not just a link to the servers repository. You
have your own history, working directory, staging area, and branch structure, and
no one will see any changes you make until you push them back to a public
repository.
15
16
17
#
#
#
#
#
#
#
#
#
#
#
#
On branch master
Changes to be committed:
new file:
foobar.txt
foo.txt
Untracked files:
bar.txt
The first section, Changes to be committed is your staged snapshot. If you were
to run git commit right now, only these files would be added to the project
history. The next section lists tracked files that will not be included in the next
commit. Finally, Untracked files contains files in your working directory that
havent been added to the repository.
Generating Diffs
If you need more detailed information about the changes in your working
directory or staging area, you can generate a diff with the following command:
git diff
This outputs a diff of every unstaged change in your working directory. You can
also generate a diff of all staged changes with the --cached flag:
git diff cached
Note that the project history is outside the scope of git status. For displaying
committed snapshots, youll need git log.
18
Commits
Commits represent every saved version of a project, which makes them the
atomic unit of Git-based version control. Each commit contains a snapshot of the
project, your user information, the date, a commit message, and an SHA-1
checksum of its entire contents:
commit b650e3bd831aba05fa62d6f6d064e7ca02b5ee1b
Author: john <john@example.com>
Date:
Wed Jan 11 00:45:10 2012 -0600
Some commit message
This checksum serves as a commits unique ID, and it also means that a commit
will never be corrupted or unintentionally altered without Git knowing about it.
Since the staging area already contains the desired changeset, committing
doesnt require any involvement from the working directory.
19
Inspecting Commits
Like a repositorys status, viewing its history is one of the most common tasks in
Git version control. You can display the current branchs commits with:
git log
We now have the only two tools we need to inspect every component of a Git
repository.
20
Useful Configurations
Git provides a plethora of formatting options for git log, a few of which are
included here. To display each commit on a single line, use:
git log oneline
Or, to target the history of an individual file instead of the whole repository, use:
git log --oneline <file>
Filtering the log output is also very useful once your history grows beyond one
screenful of commits. You can use the following to display commits contained in
<until> but not in <since>. Both arguments can be a commit ID, a branch
name, or a tag:
git log <since>..<until>
Finally, you can display a diffstat of the changes in each commit. This is useful to
see what files were affected by a particular commit.
git log stat
For visualizing history, you might also want to look at the gitk command, which
is actually a separate program dedicated to graphing branches. Run git help
gitk for details.
21
Tagging Commits
Tags are simple pointers to commits, and they are incredibly useful for
bookmarking important revisions like public releases. The git tag command
can be used to create a new tag:
git tag -a v1.0 -m "Stable release"
The -a option tells Git to create an annotated tag, which lets you record a
message along with it (specified with -m).
Running the same command without arguments will list your existing tags:
git tag
22
To complicate things even further, there are multiple ways to undo a commit. You
can either:
1. Simply delete the commit from the project history.
2. Leave the commit as is, using a new commit to undo the changes
introduced by the first commit.
Git has a dedicated tool for each of these situations. Lets start with the working
directory.
23
24
25
Undoing Commits
There are two ways to undo a commit using Git: You can either reset it by simply
removing it from the project history, or you can revert it by generating a new
commit that gets rid of the changes introduced in the original. Undoing by
introducing another commit may seem excessive, but rewriting history by
completely removing commits can have dire consequences in multi-user
workflows (read more in Remote Repositories).
Resetting
The ever-versatile git reset can also be used to move the HEAD reference.
git reset HEAD~1
The HEAD~1 syntax parameter specifies the commit that occurs immediately
before HEAD (likewise, HEAD~2 refers to the second commit before HEAD). By
moving the HEAD reference backward, youre effectively removing the most
recent commit from the projects history.
26
27
This is the ideal way of undoing changes that have already been committed to a
public repository.
Amending
In addition to completely undoing commits, you can also amend the most recent
commit by staging changes as usual, then running:
git commit amend
This replaces the previous commit instead of creating a new one, which is very
useful if you forgot to add a file or two. For your convenience, the commit editor
is seeded with the old commits message. Again, you must be careful when
using the --amend flag, since it rewrites history much like git reset.
28
Chapter 5 Branches
Branches multiply the basic functionality offered by commits by allowing users to
fork their history. Creating a new branch is akin to requesting a new development
environment, complete with an isolated working directory, staging area, and
project history.
Manipulating Branches
Git separates branch functionality into a few different commands. The git
branch command is used for listing, creating, or deleting branches.
29
Listing Branches
First and foremost, youll need to be able to view your existing branches:
git branch
This will output all of your current branches, along with an asterisk next to the
one thats currently checked out (more on that later):
* master
some-feature
quick-bug-fix
The master branch is Gits default branch, which is created with the first commit
in any repository. Many developers use this branch as the main history of the
projecta permanent branch that contains every major change it goes through.
Creating Branches
You can create a new branch by passing the branch name to the same git
branch command:
git branch <name>
This creates a pointer to the current HEAD, but does not switch to the new branch
(youll need git checkout for that). Immediately after requesting a new
branch, your repository will look something like the following.
30
31
After checking out the specified branch, your working directory is updated to
match the specified branchs commit. In addition, the HEAD is updated to point to
the new branch, and all new commits will be stored on the new branch. You can
think of checking out a branch as switching to a new project folderexcept it will
be much easier to pull changes back into the project.
32
uncommitted changes. If this isnt the case, git checkout has the potential to
overwrite your modifications.
As with committing a safe revision, youre free to experiment on a new branch
without fear of destroying existing functionality. But, you now have a dedicated
history to work with, so you can record the progress of an experiment using the
exact same git add and git commit commands from earlier in the book.
33
Merging Branches
Merging is the process of pulling commits from one branch into another. There
are many ways to combine branches, but the goal is always to share information
between branches. This makes merging one of the most important features of
Git. The two most common merge methodologies are:
They both use the same command, git merge, but the method is automatically
determined based on the structure of your history. In each case, the branch you
want to merge into must be checked out, and the target branch will remain
unchanged. The next two sections present two possible merge scenarios for the
following commands:
git checkout master
git merge some-feature
Again, this merges the some-feature branch into the master branch, leaving
the former untouched. Youd typically run these commands once youve
completed a feature and want to integrate it into the stable project.
34
Fast-forward Merges
The first scenario looks like this:
35
Of course, we could have made the two commits directly on the master branch;
however, using a dedicated feature branch gave us a safe environment to
experiment with new code. If it didnt turn out quite right, we could have simply
deleted the branch (opposed to resetting/reverting). Or, if we added a bunch of
intermediate commits containing broken code, we could clean it up before
merging it into master (see Rebasing). As projects get more complicated and
acquire more collaborators, this kind of branched development makes Git a
fantastic organizational tool.
3-way Merges
But, not all situations are simple enough for a fast-forward commit. Remember,
the main advantage of branches is the ability to explore many independent lines
of development simultaneously. As a result, youll often encounter a scenario that
looks like the following:
36
<file>
Every file with a conflict is stored under the Unmerged paths section. Git
annotates these files to show you the content from both versions:
<<<<<<< HEAD
This content is from the current branch.
=======
This is a conflicting change from another branch.
>>>>>>> some-feature
The part before the ======= is from the master branch, and the rest is from the
branch youre trying to integrate.
To resolve the conflict, get rid of the <<<<<<, =======, and >>>>>>> notation,
and change the code to whatever you want to keep. Then, tell Git youre done
resolving the conflict with the git add command:
git add <file>
Thats right; all you have to do is stage the conflicted file to mark it as resolved.
Finally, complete the 3-way merge by generating the merge commit:
git commit
The log message is seeded with a merge notice, along with a conflicts list,
which can be particularly useful when trying to figure out where something went
wrong in a project.
And thats all there is to merging in Git. Now that we have an understanding of
the mechanics behind Git branches, we can take an in-depth look at how veteran
Git users leverage branches in their everyday workflow.
Branching Workflows
The workflows presented in this section are the hallmark of Git-based revision
control. The lightweight, easy-to-merge nature of Gits branch implementation
makes them one of the most productive tools in your software development
arsenal.
All branching workflows revolve around the git branch, git checkout, and
git merge commands presented earlier this chapter.
Types of Branches
Its often useful to assign special meaning to different branches for the sake of
organizing a project. This section introduces the most common types of
branches, but keep in mind these distinctions are purely superficialto Git, a
branch is a branch.
38
Figure 28: Using the master branch exclusively for public releases
Topic Branches
Topic branches generally fall into two categories: feature branches and hotfix
branches. Feature branches are temporary branches that encapsulate a new
feature or refactor, protecting the main project from untested code. They typically
stem from another feature branch or an integration branch, but not the super
stable branch.
39
40
Rebasing
Rebasing is the process of moving a branch to a new base. Gits rebasing
capabilities make branches even more flexible by allowing users to manually
organize their branches. Like merging, git rebase requires the branch to be
checked out and takes the new base as an argument:
git checkout some-feature
git rebase master
This moves the entire some-feature branch onto the tip of master:
41
42
Since the history has diverged, Git has to use an extra merge commit to combine
the branches. Doing this many times over the course of developing a longrunning feature can result in a very messy history.
These extra merge commits are superfluousthey exist only to pull changes
from master into some-feature. Typically, youll want your merge commits to
mean something, like the completion of a new feature. This is why many
developers choose to pull in changes with git rebase, since it results in a
completely linear history in the feature branch.
Interactive Rebasing
Interactive rebasing goes one step further and allows you to change commits as
youre moving them to the new base. You can specify an interactive rebase with
the -i flag:
git rebase i master
This populates a text editor with a summary of each commit in the feature
branch, along with a command that determines how it should be transferred to
the new base. For example, if you have two commits on a feature branch, you
might specify an interactive rebase like the following:
pick 58dec2a First commit for new feature
squash 6ac8a9f Second commit for new feature
The default pick command moves the first commit to the new base just like the
normal git rebase, but then the squash command tells Git to combine the
second commit with the previous one, so you wind up with one commit
containing all of your changes:
43
44
Other developers will think you are a brilliant programmer, and knew precisely
how to implement the entire feature in one fell swoop. This kind of organization is
very important for ensuring large projects have a navigable history.
Rewriting History
Rebasing is a powerful tool, but you must be judicious in your rewriting of history.
Both kinds of rebasing dont actually move existing commitsthey create brand
new ones (denoted by an asterisk in the above diagram). If you inspect commits
that were subjected to a rebase, youll notice that they have different IDs, even
though they represent the same content. This means rebasing destroys existing
commits in the process of moving them.
As you might imagine, this has dramatic consequences for collaborative
workflows. Destroying a public commit (e.g., anything on the master branch) is
like ripping out the basis of everyone elses work. Git wont have any idea how to
combine everyones changes, and youll have a whole lot of apologizing to do.
Well take a more in-depth look at this scenario after we learn how to
communicate with remote repositories.
For now, just abide by the golden rule of rebasing: never rebase a branch that
has been pushed to a public repository.
45
Manipulating Remotes
Similar to git branch, the git remote command is used to manage
connections to other repositories. Remotes are nothing more than bookmarks to
other repositoriesinstead of typing the full path, they let you reference it with a
user-friendly name. Well learn how we can use these bookmarks within Git in
Remote Workflows.
Listing Remotes
You can view your existing remotes by calling the git remote command with
no arguments:
git remote
If you have no remotes, this command wont output any information. If you used
git clone to get your repository, youll see an origin remote. Git
automatically adds this connection, under the assumption that youll probably
want to interact with it down the road.
You can request a little bit more information about your remotes with the -v flag:
git remote v
This displays the complete path to the repository. Specifying remote paths is
discussed in the next section.
Creating Remotes
The git remote add command creates a new connection to a remote
repository.
46
Remote Branches
Commits may be the atomic unit of Git-based version control, but branches are
the medium in which remote repositories communicate. Remote branches act
just like the local branches weve covered thus far, except they represent a
branch in someone elses repository.
47
completely isolated development environments. This also means Git will never
automatically fetch branches to access updated informationyou must do this
manually.
But, this is a good thing, since it means you dont have to constantly worry about
what everyone else is contributing while doing your work. This is only possible
due to the non-linear workflow enabled by Git branches.
Inspecting Remote Branches
For all intents and purposes, remote branches behave like read-only branches.
You can safely inspect their history and view their commits via git checkout,
but you cannot continue developing them before integrating them into your local
repository. This makes sense when you consider the fact that remote branches
are copies of other users commits.
The .. syntax is very useful for filtering log history. For example, the following
command displays any new updates from origin/master that are not in your
local master branch. Its generally a good idea to run this before merging
changes so you know exactly what youre integrating:
git log master..origin/master
If this outputs any commits, it means you are behind the official project and you
should probably update your repository. This is described in the next section.
It is possible to checkout remote branches, but it will put you in a detached HEAD
state. This is safe for viewing other users changes before integrating them, but
any changes you add will be lost unless you create a new local branch tip to
reference them.
Merging/Rebasing
Of course, the whole point of fetching is to integrate the resulting remote
branches into your local project. Lets say youre a contributor to an open-source
project, and youve been working on a feature called some-feature. As the
official project (typically pointed to by origin) moves forward, you may want to
incorporate its new commits into your repository. This would ensure that your
feature still works with the bleeding-edge developments.
Fortunately, you can use the exact same git merge command to incorporate
changes from origin/master into your feature branch:
git checkout some-feature
git fetch origin
git merge origin/master
49
Since your history has diverged, this results in a 3-way merge, after which your
some-feature branch has access to the most up-to-date version of the official
project.
50
As with local rebasing, this creates a perfectly linear history free of superfluous
merge commits:
Figure 36: Rebasing the feature branch onto the official master
Rebasing/merging remote branches has the exact same trade-offs as discussed
in the chapter on local branches.
Pulling
Since the fetch/merge sequence is such a common occurrence in distributed
development, Git provides a pull command as a convenient shortcut:
git pull origin/master
This fetches the origins master branch, and then merges it into the current
branch in one step. You can also pass the --rebase option to use git rebase
instead of git merge.
Pushing
To complement the git fetch command, Git also provides a push command.
Pushing is almost the opposite of fetching, in that fetching imports branches,
while pushing exports branches to another repository.
git push <remote> <branch>
The above command sends the local <branch> to the specified remote
repository. Except, instead of a remote branch, git push creates a local
branch. For example, executing git push mary my-feature in your local
51
repository will look like the following from Marys perspective (your repository will
be unaffected by the push).
Figure 37: Pushing a feature branch from your repository into Marys
Notice that my-feature is a local branch in Marys repository, whereas it would
be a remote branch had she fetched it herself.
This makes pushing a dangerous operation. Imagine youre developing in your
own local repository, when, all of a sudden, a new local branch shows up out of
nowhere. But, repositories are supposed to serve as completely isolated
development environments, so why should git push even exist? As well
discover shortly, pushing is a necessary tool for maintaining public Git
repositories.
52
Remote Workflows
Now that we have a basic idea of how Git interacts with other repositories, we
can discuss the real-world workflows that are supported by these commands.
The two most common collaboration models are: the centralized workflow and
the integrator workflow. SVN and CVS users should be quite comfortable with
Gits flavor of centralized development, but using Git means youll also get to
leverage its highly-efficient merge capabilities. The integrator workflow is a
typical distributed collaboration model and is not possible in purely centralized
systems.
As you read through these workflows, keep in mind that Git treats all repositories
as equals. There is no master repository according to Git as there is with SVN
or CVS. The official code base is merely a project conventionthe only reason
its the official repository is because thats where everyones origin remote
points.
Public (Bare) Repositories
Every collaboration model involves at least one public repository that serves as a
point-of-entry for multiple developers. Public repositories have the unique
constraint of being barethey must not have a working directory. This prevents
developers from accidentally overwriting each others work with git push. You
can create a bare repository by passing the --bare option to git init:
git init --bare <path>
Public repositories should only function as storage facilitiesnot development
environments. This is conveyed by adding a .git extension to the repositorys
file path, since the internal repository database resides in the project root instead
of the .git subdirectory. So, a complete example might look like:
git init --bare some-repo.git
Aside from a lack of a working directory, there is nothing special about a bare
repository. You can add remote connections, push to it, and pull from it in the
usual fashion.
The Centralized Workflow
The centralized workflow is best suited to small teams where each developer has
write access to the repository. It allows collaboration by using a single central
repository, much like the SVN or CVS workflow. In this model, all changes must
be shared through the central repository, which is usually stored on a server to
enable Internet-based collaboration.
53
54
55
56
incorporate the update into the main project. You probably dont want to give him
push access to the central repository, since he could start pushing all sorts of
random commits, and you would effectively lose control of the project.
But, what you can do is tell the contributor to push the changes to his own public
repository. Then, you can pull his bug fix into your private repository to ensure it
doesnt contain any undeclared code. If you approve his contributions, all you
have to do is merge them into a local branch and push it to the main repository
as usual. Youve become an integrator, in addition to an ordinary developer:
57
58
Conclusion
Supporting these centralized and distributed collaboration models was all Git was
ever meant to do. The working directory, the stage, commits, branches, and
remotes were all specifically designed to enable these workflows, and virtually
everything in Git revolves around these components.
True to the UNIX philosophy, Git was designed as a suite of interoperable tools,
not a single monolithic program. As you continue to explore Gits numerous
capabilities, youll find that its very easy to adapt individual commands to entirely
novel workflows.
I now leave it to you to apply these concepts to real-world projects, but as you
begin to incorporate Git into your daily workflow, remember that its not a silver
bullet for project management. It is merely a tool for tracking your files, and no
amount of intimate Git knowledge can make up for a haphazard set of
conventions within a development team.
59