Pro Git

Scott Chacon
August 4, 2009
Contents
1 Getting Started
1.1 About Version Control
1.1.1 Local Version Control Systems
1.1.2 Centralized Version Control Systems
1.1.3 Distributed Version Control Systems
1.2 A Short History of Git
1.3 Git Basics
1.3.1 Snapshots, Not Differences
1.3.2 Nearly Every Operation Is Local
1.3.3 Git Has Integrity
1.3.4 Git Generally Only Adds Data
1.3.5 The Three States
1.4 Installing Git
1.4.1 Installing from Source
1.4.2 Installing on Linux
1.4.3 Installing on Mac
1.4.4 Installing on Windows
1.5 First-Time Git Setup
1.5.1 Your Identity
1.5.2 Your Editor
1.5.3 Your Diff Tool
1.5.4 Checking Your Settings
1.6 Getting Help
1.7 Summary
2 Git Basics
2.1 Getting a Git Repository
2.1.1 Initializing a Repository in an Existing Directory
2.1.2 Cloning an Existing Repository
2.2.8 Removing Files
2.2.9 Moving Files
2.3 Viewing the Commit History
2.3.1 Limiting Log Output
2.3.2 Using a GUI to Visualize History
2.4 Undoing Things
2.4.1 Changing Your Last Commit
2.4.2 Unstaging a Staged File
2.4.3 Unmodifying a Modified File
2.5 Working with Remotes
2.5.1 Showing Your Remotes
2.5.2 Adding Remote Repositories
2.5.3 Fetching and Pulling from Your Remotes
2.5.4 Pushing to Your Remotes
2.5.5 Inspecting a Remote
2.5.6 Removing and Renaming Remotes
2.6 Tagging
2.6.1 Listing Your Tags
2.6.2 Creating Tags
2.6.3 Annotated Tags
2.6.4 Signed Tags
2.6.5 Lightweight Tags
2.6.6 Verifying Tags
2.6.7 Tagging Later
2.6.8 Sharing Tags
2.7 Tips and Tricks
2.7.1 Auto-Completion
2.7.2 Git Aliases
2.8 Summary
3 Git Branching
3.1 What a Branch Is
3.2 Basic Branching and Merging
3.2.1 Basic Branching
3.2.2 Basic Merging
3.2.3 Basic Merge Conflicts
3.3 Branch Management
3.4 Branching Workflows
3.4.1 Long-Running Branches
3.4.2 Topic Branches
3.5 Remote Branches
3.5.1 Pushing
3.5.2 Tracking Branches
3.5.3 Deleting Remote Branches
3.6 Rebasing
3.6.1 The Basic Rebase
3.6.2 More Interesting Rebases
3.6.3 The Perils of Rebasing
3.7 Summary
4 Git on the Server
4.1 The Protocols
4.1.1 Local Protocol
    The Pros
    The Cons
4.1.2 The SSH Protocol
    The Pros
    The Cons
4.1.3 The Git Protocol
    The Pros
    The Cons
4.1.4 The HTTP/S Protocol
    The Pros
    The Cons
4.2 Getting Git on a Server
4.2.1 Putting the Bare Repository on a Server
4.2.2 Small Setups
    SSH Access
4.3 Generating Your SSH Public Key
4.4 Setting Up the Server
4.5 Public Access
4.6 GitWeb
4.7 Gitosis
4.8 Git Daemon
4.9 Hosted Git
4.9.1 GitHub
4.9.2 Setting Up a User Account
4.9.3 Creating a New Repository
4.9.4 Importing from Subversion
4.9.5 Adding Collaborators
4.9.6 Your Project
4.9.7 Forking Projects
4.9.8 GitHub Summary
4.10 Summary
5 Distributed Git
5.1 Distributed Workflows
5.1.1 Centralized Workflow
5.1.2 Integration-Manager Workflow
5.1.3 Dictator and Lieutenants Workflow
5.2 Contributing to a Project
5.2.1 Commit Guidelines
5.2.2 Private Small Team
5.2.3 Private Managed Team
5.2.4 Public Small Project
5.2.5 Public Large Project
5.2.6 Summary
5.3 Maintaining a Project
5.3.1 Working in Topic Branches
5.3.2 Applying Patches from E-mail
    Applying a Patch with apply
    Applying a Patch with am
5.3.3 Checking Out Remote Branches
5.3.4 Determining What Is Introduced
5.3.5 Integrating Contributed Work
    Merging Workflows
    Large-Merging Workflows
    Rebasing and Cherry Picking Workflows
5.3.6 Tagging Your Releases
5.3.7 Generating a Build Number
5.3.8 Preparing a Release
5.3.9 The Shortlog
5.4 Summary
6 Git Tools
6.1 Revision Selection
6.1.1 Single Revisions
6.1.2 Short SHA
6.1.3 A SHORT NOTE ABOUT SHA-1
6.1.4 Branch References
6.1.5 RefLog Shortnames
6.1.6 Ancestry References
6.1.7 Commit Ranges
    Double Dot
    Multiple Points
    Triple Dot
6.2 Interactive Staging
6.2.1 Staging and Unstaging Files
6.2.2 Staging Patches
6.3 Stashing
6.3.1 Stashing Your Work
6.3.2 Creating a Branch from a Stash
6.4 Rewriting History
6.4.1 Changing the Last Commit
6.4.2 Changing Multiple Commit Messages
6.4.3 Reordering Commits
6.4.4 Squashing a Commit
6.4.5 Splitting a Commit
6.4.6 The Nuclear Option: filter-branch
    Removing a File from Every Commit
    Making a Subdirectory the New Root
    Changing E-Mail Addresses Globally
6.5 Debugging with Git
6.5.1 File Annotation
6.5.2 Binary Search
6.6 Submodules
6.6.1 Starting with Submodules
6.6.2 Cloning a Project with Submodules
6.6.3 Superprojects
6.6.4 Issues with Submodules
6.7 Subtree Merging
6.8 Summary
7 Customizing Git
7.1 Git Configuration
7.1.1 Basic Client Configuration
    core.editor
    commit.template
    core.pager
    user.signingkey
    core.excludesfile
    help.autocorrect
7.1.2 Colors in Git
    color.ui
    color.*
7.1.3 External Merge and Diff Tools
7.1.4 Formatting and Whitespace
    core.autocrlf
    core.whitespace
7.1.5 Server Configuration
    receive.fsckObjects
    receive.denyNonFastForwards
    receive.denyDeletes
7.2 Git Attributes
7.2.1 Binary Files
    Identifying Binary Files
    Diffing Binary Files
7.2.2 Keyword Expansion
7.2.3 Exporting Your Repository
    export-ignore
    export-subst
7.2.4 Merge Strategies
7.3 Git Hooks
7.3.1 Installing a Hook
7.3.2 Client-Side Hooks
    Committing-Workflow Hooks
    E-mail Workflow Hooks
    Other Client Hooks
7.3.3 Server-Side Hooks
    pre-receive and post-receive
    update
7.4 An Example Git-Enforced Policy
7.4.1 Server-Side Hook
    Enforcing a Specific Commit-Message Format
    Enforcing a User-Based ACL System
    Enforcing Fast-Forward-Only Pushes
7.4.2 Client-Side Hooks
7.5 Summary
8 Git and Other Systems
8.1 Git and Subversion
8.1.1 git svn
8.1.2 Setting Up
8.1.3 Getting Started
8.1.4 Committing Back to Subversion
8.1.5 Pulling in New Changes
8.1.6 Git Branching Issues
8.1.7 Subversion Branching
    Creating a New SVN Branch
8.1.8 Switching Active Branches
8.1.9 Subversion Commands
    SVN Style History
    SVN Annotation
    SVN Server Information
    Ignoring What Subversion Ignores
8.1.10 Git-Svn Summary
8.2 Migrating to Git
8.2.1 Importing
8.2.2 Subversion
8.2.3 Perforce
8.2.4 A Custom Importer
8.3 Summary
9 Git Internals
9.1 Plumbing and Porcelain
9.2 Git Objects
9.2.1 Tree Objects
9.2.2 Commit Objects
9.2.3 Object Storage
9.3 Git References
9.3.1 The HEAD
9.3.2 Tags
9.3.3 Remotes
9.4 Packfiles
9.5 The Refspec
9.5.1 Pushing Refspecs
9.5.2 Deleting References
9.6 Transfer Protocols
9.6.1 The Dumb Protocol
9.6.2 The Smart Protocol
    Uploading Data
    Downloading Data
9.7 Maintenance and Data Recovery
9.7.1 Maintenance
9.7.2 Data Recovery
9.7.3 Removing Objects
9.8 Summary
Chapter 1
Getting Started
This chapter is about getting started with Git. We begin by
explaining some background on version control tools, then move on to
how to get Git running on your system, and finally how to set it up
so you can start working with it. By the end of this chapter you
should understand why Git exists and why you should use it, and you
should be all set up to do so.
1.1 About Version Control
What is version control, and why should you care? Version control
is a system that records changes to a file or set of files over time
so that you can recall specific versions later. For the examples in
this book you will use software source code as the files being version
controlled, though in reality you can do this with nearly any type of
file on a computer.

If you are a graphic or web designer and want to keep every version
of an image or layout (which you would most certainly want to), a
Version Control System (VCS) is a very wise thing to use. It allows
you to revert files back to a previous state, revert the entire
project back to a previous state, compare changes over time, see
who last modified something that might be causing a problem, who
introduced an issue and when, and more. Using a VCS also generally
means that if you screw things up or lose files, you can easily
recover. In addition, you get all this for very little overhead.
1.1.1 Local Version Control Systems
Many people's version-control method of choice is to copy files into
another directory (perhaps a time-stamped directory, if they're clever).
This approach is very common because it is so simple, but it is also
incredibly error prone. It is easy to forget which directory you're in
and accidentally write to the wrong file or copy over files you don't
mean to.
To deal with this issue, programmers long ago developed local
VCSs that had a simple database that kept all the changes to files
under revision control (see Figure 1.1).
Figure 1.1: Local version control diagram
One of the more popular VCS tools was a system called rcs, which
is still distributed with many computers today. Even the popular Mac
OS X operating system includes the rcs command when you install
the Developer Tools. This tool basically works by keeping patch sets
(that is, the differences between files) from one change to another in
a special format on disk; it can then re-create what any file looked
like at any point in time by adding up all the patches.
1.1.2 Centralized Version Control Systems
The next major issue that people encounter is that they need to
collaborate with developers on other systems. To deal with this
problem, Centralized Version Control Systems (CVCSs) were developed.
These systems, such as CVS, Subversion, and Perforce, have a single
server that contains all the versioned files, and a number of clients
that check out files from that central place. For many years, this has
been the standard for version control (see Figure 1.2).

This setup offers many advantages, especially over local VCSs.
For example, everyone knows to a certain degree what everyone
else on the project is doing. Administrators have fine-grained
control over who can do what, and it's far easier to administer a CVCS
than it is to deal with local databases on every client.
However, this setup also has some serious downsides. The most
obvious is the single point of failure that the centralized server
represents. If that server goes down for an hour, then during that hour
nobody can collaborate at all or save versioned changes to anything
they're working on. If the hard disk the central database is on
becomes corrupted, and proper backups haven't been kept, you lose
absolutely everything: the entire history of the project except
whatever single snapshots people happen to have on their local machines.
Local VCSs suffer from this same problem: whenever you have the
entire history of the project in a single place, you risk losing
everything.

Figure 1.2: Centralized version control diagram
1.1.3 Distributed Version Control Systems
This is where Distributed Version Control Systems (DVCSs) step in.
In a DVCS (such as Git, Mercurial, Bazaar or Darcs), clients don't
just check out the latest snapshot of the files: they fully mirror the
repository. Thus if any server dies, and these systems were
collaborating via it, any of the client repositories can be copied back
up to the server to restore it. Every checkout is really a full backup
of all the data (see Figure 1.3).
Furthermore, many of these systems deal pretty well with having
several remote repositories they can work with, so you can
collaborate with different groups of people in different ways
simultaneously within the same project. This allows you to set up
several types of workflows that aren't possible in centralized systems,
such as hierarchical models.

Figure 1.3: Distributed version control diagram
1.2 A Short History of Git
As with many great things in life, Git began with a bit of creative
destruction and fiery controversy. The Linux kernel is an open source
software project of fairly large scope. For most of the lifetime of
the Linux kernel maintenance (1991–2002), changes to the software
were passed around as patches and archived files. In 2002, the Linux
kernel project began using a proprietary DVCS called BitKeeper.
In 2005, the relationship between the community that developed
the Linux kernel and the commercial company that developed
BitKeeper broke down, and the tool's free-of-charge status was revoked.
This prompted the Linux development community (and in particular
Linus Torvalds, the creator of Linux) to develop their own tool based
on some of the lessons they learned while using BitKeeper. Some of
the goals of the new system were as follows:
• Speed
• Simple design
• Strong support for non-linear development (thousands of parallel
branches)
• Fully distributed
• Able to handle large projects like the Linux kernel efficiently
(speed and data size)
Since its birth in 2005, Git has evolved and matured to be easy to use
and yet retain these initial qualities. It's incredibly fast, it's very
efficient with large projects, and it has an incredible branching system
for non-linear development (see Chapter 3).
1.3 Git Basics
So, what is Git in a nutshell? This is an important section to absorb,
because if you understand what Git is and the fundamentals of how
it works, then using Git effectively will probably be much easier for
you. As you learn Git, try to clear your mind of the things you may
know about other VCSs, such as Subversion and Perforce; doing so
will help you avoid subtle confusion when using the tool. Git stores
and thinks about information much differently than these other
systems, even though the user interface is fairly similar;
understanding those differences will help prevent you from becoming
confused while using it.
1.3.1 Snapshots, Not Differences
The major difference between Git and any other VCS (Subversion and
friends included) is the way Git thinks about its data. Conceptually,
most other systems store information as a list of file-based changes.
These systems (CVS, Subversion, Perforce, Bazaar, and so on) think
of the information they keep as a set of files and the changes made
to each file over time, as illustrated in Figure 1.4.
Figure 1.4: Other systems tend to store data as changes to a
base version of each file.
Git doesn't think of or store its data this way. Instead, Git thinks
of its data more like a set of snapshots of a mini filesystem. Every
time you commit, or save the state of your project in Git, it basically
takes a picture of what all your files look like at that moment and
stores a reference to that snapshot. To be efficient, if files have not
changed, Git doesn't store the file again, just a link to the previous
identical file it has already stored. Git thinks about its data more like
Figure 1.5.
Figure 1.5: Git stores data as snapshots of the project over
time.
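To see the snapshot model at work, here is a minimal sketch you can run in a scratch directory. It assumes git is installed; the file names, commit messages, and identity are made up for illustration. It commits two snapshots, changing only one of two files, and then asks Git which stored object each snapshot uses for each file.

```shell
#!/bin/sh
# Sketch: an unchanged file is stored once and shared by both snapshots.
# All names below (files, messages, identity) are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical identity, supplied through Git's standard environment variables,
# so the commits work without any global configuration.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# First snapshot: two files.
printf 'stable text\n' > unchanged.txt
printf 'version 1\n' > changing.txt
git add unchanged.txt changing.txt
git commit -q -m "First snapshot"

# Second snapshot: only one of the files is edited.
printf 'version 2\n' > changing.txt
git add changing.txt
git commit -q -m "Second snapshot"

# Look up the ID of the stored object behind each file in each snapshot.
unchanged_before=$(git rev-parse HEAD~1:unchanged.txt)
unchanged_after=$(git rev-parse HEAD:unchanged.txt)
changing_before=$(git rev-parse HEAD~1:changing.txt)
changing_after=$(git rev-parse HEAD:changing.txt)

echo "unchanged.txt: $unchanged_before -> $unchanged_after"
echo "changing.txt:  $changing_before -> $changing_after"
```

The two IDs printed for unchanged.txt should be identical, because the second snapshot simply links to the object already stored, while changing.txt gets a new ID.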
This is an important distinction between Git and nearly all other
VCSs. It makes Git reconsider almost every aspect of version control
that most other systems copied from the previous generation. This
makes Git more like a mini filesystem with some incredibly powerful
tools built on top of it, rather than simply a VCS. We'll explore some
of the benefits you gain by thinking of your data this way when we
cover Git branching in Chapter 3.
1.3.2 Nearly Every Operation Is Local
Most operations in Git only need local files and resources to operate;
generally no information is needed from another computer on your
network. If you're used to a CVCS where most operations have that
network latency overhead, this aspect of Git will make you think that
the gods of speed have blessed Git with unworldly powers. Because
you have the entire history of the project right there on your local
disk, most operations seem almost instantaneous.
For example, to browse the history of the project, Git doesn't need
to go out to the server to get the history and display it for you; it
simply reads it directly from your local database. This means you see
the project history almost instantly. If you want to see the changes
introduced between the current version of a file and the file a month
ago, Git can look up the file a month ago and do a local difference
calculation, instead of having to either ask a remote server to do it
or pull an older version of the file from the remote server to do it
locally.
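As a rough illustration of both operations, the sketch below builds a throwaway repository with one backdated commit, then browses the history and compares the current file against its state from more than a month ago. It assumes git is installed; the file name, dates, and identity are invented for the example, and nothing in it contacts a server.

```shell
#!/bin/sh
# Sketch: history browsing and an "a month ago" comparison, both fully local.
# File name, commit date, and identity are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# An old commit, backdated so that it is clearly more than a month old.
old_date="2009-01-01T12:00:00"
printf 'first draft\n' > notes.txt
git add notes.txt
GIT_AUTHOR_DATE="$old_date" GIT_COMMITTER_DATE="$old_date" \
    git commit -q -m "First draft"

# A commit made now.
printf 'second draft\n' > notes.txt
git commit -q -a -m "Second draft"

# Browse the history: read straight from the local database.
history=$(git log --oneline)
echo "$history"

# Find the newest commit that is at least a month old, then diff against it.
old_commit=$(git rev-list -n 1 --before="1 month ago" HEAD)
changes=$(git diff "$old_commit" HEAD -- notes.txt)
echo "$changes"
```

The diff should show "first draft" removed and "second draft" added. In a real project you would skip the setup and run only the last few commands inside your own repository.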
This also means that there is very little you can't do if you're offline
or off VPN. If you get on an airplane or a train and want to do a little
work, you can commit happily until you get to a network connection
to upload. If you go home and can't get your VPN client working
properly, you can still work. In many other systems, doing so is either
impossible or painful. In Perforce, for example, you can't do much
when you aren't connected to the server; and in Subversion and CVS,
you can edit files, but you can't commit changes to your database
(because your database is offline). This may not seem like a huge
deal, but you may be surprised what a big difference it can make.
1.3.3 Git Has Integrity
Everything in Git is check-summed before it is stored and is then
referred to by that checksum. This means it's impossible to change the
contents of any file or directory without Git knowing about it. This
functionality is built into Git at the lowest levels and is integral to its
philosophy. You can't lose information in transit or get file corruption
without Git being able to detect it.

The mechanism that Git uses for this checksumming is called a
SHA-1 hash. This is a 40-character string composed of hexadecimal
characters (0–9 and a–f) and calculated based on the contents of a
file or directory structure in Git. A SHA-1 hash looks something like
this:
24b9da6552252987aa493b52f8696cd6d3b00373
You will see these hash values all over the place in Git because it
uses them so much. In fact, Git stores everything not by file name
but in the Git database addressable by the hash value of its contents.
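If you are curious how such a value is derived for a file, the following sketch computes one by hand. It assumes a system with the sha1sum utility (GNU coreutils); the sample content is arbitrary. For file contents, Git hashes a short header (the word "blob", the size in bytes, and a NUL byte) followed by the content itself.

```shell
#!/bin/sh
# Sketch: compute by hand the SHA-1 ID Git assigns to a file's contents.
# Assumes sha1sum is available; the sample content is arbitrary.
set -e
content='test content'

# 13 is the size in bytes of "test content" plus its trailing newline.
manual_id=$(printf 'blob 13\0%s\n' "$content" | sha1sum | cut -d ' ' -f 1)
echo "manual: $manual_id"

# If git is installed, it should report the same ID for the same bytes.
if command -v git >/dev/null 2>&1; then
    git_id=$(printf '%s\n' "$content" | git hash-object --stdin)
    echo "git:    $git_id"
fi
```

Both lines should print the same 40-character value. Change a single byte of the content and the ID changes completely, which is how Git notices any alteration.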
1.3.4 Git Generally Only Adds Data
When you do actions in Git, nearly all of them only add data to the Git
database. It is very difficult to get the system to do anything that is
not undoable or to make it erase data in any way. As in any VCS, you
can lose or mess up changes you haven't committed yet; but after
you commit a snapshot into Git, it is very difficult to lose, especially
if you regularly push your database to another repository.

This makes using Git a joy because we know we can experiment
without the danger of severely screwing things up. For a more
in-depth look at how Git stores its data and how you can recover data
that seems lost, see “Under the Covers” in Chapter 9.
1.3.5 The Three States
Now, pay attention. This is the main thing to remember about Git if
you want the rest of your learning process to go smoothly. Git has
three main states that your files can reside in: committed, modified,
and staged. Committed means that the data is safely stored in your
local database. Modified means that you have changed the file but
have not committed it to your database yet. Staged means that you
have marked a modified file in its current version to go into your next
commit snapshot.
This leads us to the three main sections of a Git project: the Git
directory, the working directory, and the staging area.
Figure 1.6: Working directory, staging area, and git directory
The Git directory is where Git stores the metadata and object
database for your project. This is the most important part of Git,
and it is what is copied when you clone a repository from another
computer.
The working directory is a single checkout of one version of the
project. These files are pulled out of the compressed database in the
Git directory and placed on disk for you to use or modify.

The staging area is a simple file, generally contained in your Git
directory, that stores information about what will go into your next
commit. It's sometimes referred to as the index, but it's becoming
standard to refer to it as the staging area.
The basic Git workflow goes something like this:
1. You modify files in your working directory.
2. You stage the files, adding snapshots of them to your staging
area.
3. You do a commit, which takes the files as they are in the staging
area and stores that snapshot permanently to your Git directory.
If a particular version of a file is in the Git directory, it's considered
committed. If it's modified but has been added to the staging area,
it is staged. And if it was changed since it was checked out but has
not been staged, it is modified. In Chapter 2, you'll learn more about
these states and how you can either take advantage of them or skip
the staged part entirely.
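The workflow and the three states above can be sketched as a short session. This is only an illustration, assuming git is installed; the file name, messages, and identity are made up. After each step it records the short machine-readable status, in which the first column describes the staging area and the second the working directory.

```shell
#!/bin/sh
# Sketch: walk one file through the committed, modified, and staged states.
# File name, commit messages, and identity are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Start with a committed file.
printf 'first line\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"
committed=$(git status --porcelain)      # empty: nothing differs from the last commit

# 1. Modify the file in the working directory.
printf 'second line\n' >> notes.txt
modified=$(git status --porcelain)       # " M notes.txt": changed but not staged

# 2. Stage it, adding a snapshot of it to the staging area.
git add notes.txt
staged=$(git status --porcelain)         # "M  notes.txt": marked for the next commit

# 3. Commit, storing the staged snapshot in the Git directory.
git commit -q -m "Extend notes"
after_commit=$(git status --porcelain)   # empty again: the file is committed

printf 'modified: [%s]\nstaged:   [%s]\n' "$modified" "$staged"
```

Running plain git status at each step gives the same information in a longer, more readable form.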
1.4 Installing Git
Let's get into using some Git. First things first: you have to install
it. You can get it a number of ways; the two major ones are to install
it from source or to install an existing package for your platform.
1.4.1 Installing from Source
If you can, it's generally useful to install Git from source, because
you'll get the most recent version. Each version of Git tends to
include useful UI enhancements, so getting the latest version is
often the best route if you feel comfortable compiling software from
source. It is also the case that many Linux distributions contain very
old packages, so unless you're on a very up-to-date distro or are
using backports, installing from source may be the best bet.
To install Git, you need to have the following libraries that Git
depends on: curl, zlib, openssl, expat, and libiconv. For example, if
you're on a system that has yum (such as Fedora) or apt-get (such
as a Debian-based system), you can use one of these commands to
install all of the dependencies:
$ yum install curl-devel expat-devel gettext-devel \
openssl-devel zlib-devel
$ apt-get install libcurl4-gnutls-dev libexpat1-dev gettext \
libz-dev libssl-dev
When you have all the necessary dependencies, you can go ahead
and grab the latest snapshot from the Git web site:
http://git-scm.com/download
Then, compile and install:
$ tar -zxf git-1.6.0.5.tar.gz
$ cd git-1.6.0.5
$ make prefix=/usr/local all
$ sudo make prefix=/usr/local install
After this is done, you can also get Git via Git itself for updates:
$ git clone git://git.kernel.org/pub/scm/git/git.git
1.4.2 Installing on Linux
If you want to install Git on Linux via a binary installer, you can
generally do so through the basic package-management tool that comes
with your distribution. If you're on Fedora, you can use yum:
$ yum install git-core
Or if you're on a Debian-based distribution like Ubuntu, try apt-get:
$ apt-get install git-core
1.4.3 Installing on Mac
There are two easy ways to install Git on a Mac. The easiest is to use
the graphical Git installer, which you can download from the Google
Code page (see Figure 1.7):
http://code.google.com/p/git-osx-installer
Figure 1.7: Git OS X installer
The other major way is to install Git via MacPorts
(http://www.macports.org). If you have MacPorts installed, install Git via
$ sudo port install git-core +svn +doc +bash_completion +gitweb
You don't have to add all the extras, but you'll probably want to
include +svn in case you ever have to use Git with Subversion
repositories (see Chapter 8).
1.4.4 Installing on Windows
Installing Git on Windows is very easy. The msysGit project has one
of the easier installation procedures. Simply download the installer
exe file from the Google Code page, and run it:
http://code.google.com/p/msysgit
After it's installed, you have both a command-line version (including
an SSH client that will come in handy later) and the standard
GUI.
1.5 First-Time Git Setup
Now that you have Git on your system, you'll want to do a few things
to customize your Git environment. You should have to do these
things only once; they'll stick around between upgrades. You can
also change them at any time by running through the commands
again.
Git comes with a tool called git config that lets you get and set
configuration variables that control all aspects of how Git looks and
operates. These variables can be stored in three different places:
• /etc/gitconfig file: Contains values for every user on the system
and all their repositories. If you pass the option --system to git
config, it reads and writes from this file specifically.
• ~/.gitconfig file: Specific to your user. You can make Git read
and write to this file specifically by passing the --global option.
• config file in the Git directory (that is, .git/config) of whatever
repository you're currently using: Specific to that single
repository.
Each level overrides values in the previous level, so values
in .git/config trump those in /etc/gitconfig.
On Windows systems, Git looks for the .gitconfig file in the $HOME
directory (C:\Documents and Settings\$USER for most people). It also still
looks for /etc/gitconfig, although it's relative to the MSys root, which
is wherever you decide to install Git on your Windows system when
you run the installer.
1.5.1 Your Identity
The first thing you should do when you install Git is to set your user
name and e-mail address. This is important because every Git
commit uses this information, and it's immutably baked into the commits
you pass around:
$ git config --global user.name "John Doe"
$ git config --global user.email johndoe@example.com
Again, you need to do this only once if you pass the --global option,
because then Git will always use that information for anything you
do on that system. If you want to override this with a different name
or e-mail address for specific projects, you can run the command
without the --global option when you're in that project.
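To see the override order in action without touching your real settings, you can point HOME at a scratch directory first. This is only a sketch, assuming a POSIX shell and that no other configuration source in your environment sets user.name. The directory layout and both names are made up for the example.

```shell
#!/bin/sh
# Sketch: a repository-level value overrides the --global one.
set -e
sandbox=$(mktemp -d)
export HOME="$sandbox/home"              # throwaway home, so the real
mkdir -p "$HOME" "$sandbox/project"      # ~/.gitconfig is left alone

git config --global user.name "Global Name"

cd "$sandbox/project"
git init -q
before=$(git config user.name)           # only the global value exists

git config user.name "Project Name"      # no --global: writes .git/config
after=$(git config user.name)            # repository value wins here
global_value=$(git config --global user.name)

echo "before override: $before"
echo "inside project:  $after"
echo "global file:     $global_value"
```

The global file is left unchanged by the second command; only lookups made from inside that one repository see the project-specific name.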
1.5.2 Your Editor
Now that your identity is set up, you can configure the default text
editor that will be used when Git needs you to type in a message. By
default, Git uses your system's default editor, which is generally Vi
or Vim. If you want to use a different text editor, such as Emacs, you
can do the following:
$ git config --global core.editor emacs
1.5.3 Your Diff Tool
Another useful option you may want to configure is the default diff
tool to use to resolve merge conflicts. Say you want to use vimdiff:
$ git config --global merge.tool vimdiff
Git accepts kdiff3, tkdiff, meld, xxdiff, emerge, vimdiff, gvimdiff,
ecmerge, and opendiff as valid merge tools. You can also set up a
custom tool; see Chapter 7 for more information about doing that.
1.5.4 Checking Your Settings
If you want to check your settings, you can use the git config --list
command to list all the settings Git can find at that point:
$ git config --list
user.name=Scott Chacon
user.email=schacon@gmail.com
color.status=auto
color.branch=auto
color.interactive=auto
color.diff=auto
...
You may see keys more than once, because Git reads the same
key from different files (/etc/gitconfig and ~/.gitconfig, for example).
In this case, Git uses the last value for each unique key it sees.
You can also check what Git thinks a specific key's value is by
typing git config <key>:
$ git config user.name
Scott Chacon
1.6 Getting Help
If you ever need help while using Git, there are three ways to get the
manual page (manpage) help for any of the Git commands:
$ git help <verb>
$ git <verb> --help
$ man git-<verb>
For example, you can get the manpage help for the config command
by running
$ git help config
These commands are nice because you can access them anywhere,
even offline. If the manpages and this book aren't enough and you
need in-person help, you can try the #git or #github channel on the
Freenode IRC server (irc.freenode.net). These channels are
regularly filled with hundreds of people who are all very knowledgeable
about Git and are often willing to help.
1.7 Summary
You should have a basic understanding of what Git is and how it's
different from the CVCS you may have been using. You should also
now have a working version of Git on your system that's set up with
your personal identity. It's now time to learn some Git basics.
Chapter 2
Git Basics
If you can read only one chapter to get going with Git, this is it. This
chapter covers every basic command you need to do the vast
majority of the things you'll eventually spend your time doing with Git. By
the end of the chapter, you should be able to configure and
initialize a repository, begin and stop tracking files, and stage and commit
changes. We'll also show you how to set up Git to ignore certain files
and file patterns, how to undo mistakes quickly and easily, how to
browse the history of your project and view changes between
commits, and how to push and pull from remote repositories.
2.1 Getting a Git Repository
You can get a Git project using two main approaches. The first takes
an existing project or directory and imports it into Git. The second
clones an existing Git repository from another server.
2.1.1 Initializing a Repository in an Existing Directory
If you're starting to track an existing project in Git, you need to go
to the project's directory and type
$ git init
This creates a new subdirectory named .git that contains all of
your necessary repository files: a Git repository skeleton. At this
point, nothing in your project is tracked yet. (See Chapter 9 for more
information about exactly what files are contained in the .git
directory you just created.)
If you want to start version-controlling existing files (as opposed
to an empty directory), you should probably begin tracking those
files and do an initial commit. You can accomplish that with a few git
add commands that specify the files you want to track, followed by a
commit:
$ git add *.c
$ git add README
$ git commit -m 'initial project version'
We'll go over what these commands do in just a minute. At this
point, you have a Git repository with tracked files and an initial
commit.
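If you'd like to try this sequence somewhere harmless first, the sketch below runs it in a scratch directory. It assumes a POSIX shell and a Git recent enough to support git rev-list --count. The two sample files and the identity are invented for the example.

```shell
#!/bin/sh
# Sketch: turn a plain directory into a repository with one commit.
set -e
project=$(mktemp -d)                     # stand-in for an existing project
cd "$project"
printf 'int main(void) { return 0; }\n' > main.c
echo "Example project" > README

git init -q                              # creates the .git subdirectory
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

git add -- *.c                           # the shell expands *.c to main.c
git add README
git commit -q -m 'initial project version'

git ls-files                             # lists the files Git now tracks
commit_count=$(git rev-list --count HEAD)
echo "commits so far: $commit_count"
```

Nothing is tracked until the git add commands run; git init on its own only creates the empty repository skeleton.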
2.1.2 Cloning an Existing Repository
If you want to get a copy of an existing Git repository (for example,
a project you'd like to contribute to), the command you need is git
clone. If you're familiar with other VCS systems such as Subversion,
you'll notice that the command is clone and not checkout. This is an
important distinction: Git receives a copy of nearly all data that the
server has. Every version of every file for the history of the project
is pulled down when you run git clone. In fact, if your server disk
gets corrupted, you can use any of the clones on any client to set the
server back to the state it was in when it was cloned (you may lose
some server-side hooks and such, but all the versioned data would
be there; see Chapter 4 for more details).
You clone a repository with git clone [url]. For example, if you
want to clone the Ruby Git library called Grit, you can do so like this:
$ git clone git://github.com/schacon/grit.git
That creates a directory named “grit”, initializes a .git directory
inside it, pulls down all the data for that repository, and checks out
a working copy of the latest version. If you go into the new grit
directory, you'll see the project files in there, ready to be worked on
or used. If you want to clone the repository into a directory named
something other than grit, you can specify that as the next
command-line option:
$ git clone git://github.com/schacon/grit.git mygrit
That command does the same thing as the previous one, but the
target directory is called mygrit.
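The same behavior can be tried without a network connection, because git clone also accepts a path to a repository on the local disk. The sketch below uses that as a stand-in for a remote URL. The directory names upstream and mycopy, the sample file, and the identity are all hypothetical.

```shell
#!/bin/sh
# Sketch: clone a local repository into a directory of your choosing.
set -e
work=$(mktemp -d)
cd "$work"

git init -q upstream                     # plays the role of the server
cd upstream
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"
echo "Example project" > README
git add README
git commit -q -m "first commit"

cd "$work"
git clone -q upstream mycopy             # second argument names the target

cd mycopy
subject=$(git log -1 --pretty=format:%s) # history came along with the files
echo "latest commit in the clone: $subject"
```

The clone ends up with its own .git directory and a checked-out working copy, just as described for the grit example.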
Git has a number of different transfer protocols you can use. The
previous example uses the git:// protocol, but you may also see
http(s):// or user@server:/path.git, which uses the SSH transfer protocol.
Chapter 4 will introduce all of the available options the server can
set up to access your Git repository and the pros and cons of each.
2.2 Recording Changes to the Repository
You have a bona fide Git repository and a checkout or working copy
of the files for that project. You need to make some changes and
commit snapshots of those changes into your repository each time
the project reaches a state you want to record.
Remember that each file in your working directory can be in one of
two states: tracked or untracked. Tracked files are files that were in
the last snapshot; they can be unmodified, modified, or staged.
Untracked files are everything else: any files in your working directory
that were not in your last snapshot and are not in your staging area.
When you first clone a repository, all of your files will be tracked and
unmodified because you just checked them out and haven't edited
anything.
As you edit files, Git sees them as modified, because you've changed
them since your last commit. You stage these modified files and then
commit all your staged changes, and the cycle repeats. This lifecycle
is illustrated in Figure 2.1.
Figure 2.1: The lifecycle of the status of your files
2.2.1 Checking the Status of Your Files
The main tool you use to determine which files are in which state
is the git status command. If you run this command directly after a
clone, you should see something like this:
$ git status
# On branch master
nothing to commit (working directory clean)
This means you have a clean working directory. In other words,
there are no tracked and modified files. Git also doesn't see any
untracked files, or they would be listed here. Finally, the command
tells you which branch you're on. For now, that is always master,
which is the default; you won't worry about it here. The next chapter
will go over branches and references in detail.
Let's say you add a new file to your project, a simple README
file. If the file didn't exist before, and you run git status, you see
your untracked file like so:
$ vim README
$ git status
# On branch master
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# README
nothing added to commit but untracked files present (use "git add" to track)
You can see that your new README file is untracked, because
it's under the “Untracked files” heading in your status output.
Untracked basically means that Git sees a file you didn't have in the
previous snapshot (commit); Git won't start including it in your commit
snapshots until you explicitly tell it to do so. It does this so you don't
accidentally begin including generated binary files or other files that
you did not mean to include. You do want to start including README,
so let's start tracking the file.
2.2.2 Tracking New Files
In order to begin tracking a new file, you use the command git add.
To begin tracking the README file, you can run this:
$ git add README
If you run your status command again, you can see that your
README file is now tracked and staged:
$ git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# new file: README
#
You can tell that it's staged because it's under the “Changes to
be committed” heading. If you commit at this point, the version of
the file at the time you ran git add is what will be in the historical
snapshot. You may recall that when you ran git init earlier, you then
ran git add (files); that was to begin tracking files in your
directory. The git add command takes a path name for either a file or a
directory; if it's a directory, the command adds all the files in that
directory recursively.
2.2.3 Staging Modified Files
Let's change a file that was already tracked. If you change a
previously tracked file called benchmarks.rb and then run your status
command again, you get something that looks like this:
$ git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# new file: README
#
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
#
# modified: benchmarks.rb
#
The benchmarks.rb file appears under a section named “Changed
but not updated”, which means that a file that is tracked has been
modified in the working directory but not yet staged. To stage it, you
run the git add command (it's a multipurpose command: you use it
to begin tracking new files, to stage files, and to do other things like
marking merge-conflicted files as resolved). Let's run git add now to
stage the benchmarks.rb file, and then run git status again:
$ git add benchmarks.rb
$ git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# new file: README
# modified: benchmarks.rb
#
Both files are staged and will go into your next commit. At this
point, suppose you remember one little change that you want to make
in benchmarks.rb before you commit it. You open it again and make
that change, and you're ready to commit. However, let's run git
status one more time:
$ vim benchmarks.rb
$ git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# new file: README
# modified: benchmarks.rb
#
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
#
# modified: benchmarks.rb
#
What the heck? Now benchmarks.rb is listed as both staged and
unstaged. How is that possible? It turns out that Git stages a file
exactly as it is when you run the git add command. If you commit
now, the version of benchmarks.rb as it was when you last ran the
git add command is how it will go into the commit, not the version of
the file as it looks in your working directory when you run git commit.
If you modify a file after you run git add, you have to run git add again
to stage the latest version of the file:
$ git add benchmarks.rb
$ git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# new file: README
# modified: benchmarks.rb
#
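To convince yourself that a commit records the staged snapshot and not whatever is on disk, you can compare the two afterwards. This is a small sketch, assuming a POSIX shell. The file contents and the identity are invented, and git show HEAD:benchmarks.rb is used here to print the committed version of the file.

```shell
#!/bin/sh
# Sketch: the commit takes the version staged by the last git add.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "version 1" > benchmarks.rb
git add benchmarks.rb
git commit -q -m "first version"

echo "version 2" > benchmarks.rb
git add benchmarks.rb                    # stage version 2
echo "version 3 (not staged)" > benchmarks.rb
git commit -q -m "commit what was staged"

committed=$(git show HEAD:benchmarks.rb) # content stored in the commit
on_disk=$(cat benchmarks.rb)             # content in the working directory
echo "in the commit: $committed"
echo "on disk:       $on_disk"
```

The later edit stays in the working directory as a modified file until you run git add again.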
2.2.4 Ignoring Files
Often, you'll have a class of files that you don't want Git to
automatically add or even show you as being untracked. These are generally
automatically generated files such as log files or files produced by
your build system. In such cases, you can create a file listing
patterns to match them named .gitignore. Here is an example .gitignore
file:
$ cat .gitignore
*.[oa]
*~
The first line tells Git to ignore any files ending in .o or .a (object
and archive files that may be the product of building your code).
The second line tells Git to ignore all files that end with a tilde (~),
which is used by many text editors such as Emacs to mark temporary
files. You may also include a log, tmp, or pid directory, automatically
generated documentation, and so on. Setting up a .gitignore file
before you get going is generally a good idea so you don't accidentally
commit files that you really don't want in your Git repository.
The rules for the patterns you can put in the .gitignore file are as
follows:
• Blank lines or lines starting with # are ignored.
• Standard glob patterns work.
• You can end patterns with a forward slash (/) to specify a
directory.
• You can negate a pattern by starting it with an exclamation point
(!).
Glob patterns are like simplified regular expressions that shells use.
An asterisk (*) matches zero or more characters; [abc] matches any
character inside the brackets (in this case a, b, or c); a question mark
(?) matches a single character; and brackets enclosing characters
separated by a hyphen ([0-9]) match any character between them
(in this case 0 through 9).
Here is another example .gitignore file:
# a comment - this is ignored
# no .a files
*.a
# but do track lib.a, even though you're ignoring .a files above
!lib.a
# only ignore the root TODO file, not subdir/TODO
/TODO
# ignore all files in the build/ directory
build/
# ignore doc/notes.txt, but not doc/server/arch.txt
doc/*.txt
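One way to check how such patterns behave is to create a few empty files and ask Git which untracked ones it would still show. The sketch below does that with git ls-files --others --exclude-standard, which lists untracked files that are not ignored. It assumes a POSIX shell and no extra global ignore file in your environment. The helper name visible and the sample file names are made up for the example.

```shell
#!/bin/sh
# Sketch: test the ignore patterns above against sample paths.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
printf '%s\n' '*.a' '!lib.a' '/TODO' 'build/' 'doc/*.txt' > .gitignore

mkdir -p subdir build doc/server
touch foo.a lib.a TODO subdir/TODO build/out.o \
      doc/notes.txt doc/server/arch.txt

# Hypothetical helper: prints the path if Git would still show it as
# untracked, and prints nothing if the path is ignored.
visible() {
  git ls-files --others --exclude-standard -- "$1"
}

for path in foo.a lib.a TODO subdir/TODO build/out.o \
            doc/notes.txt doc/server/arch.txt; do
  if [ -n "$(visible "$path")" ]; then
    echo "shown:   $path"
  else
    echo "ignored: $path"
  fi
done
```

Only lib.a, subdir/TODO, and doc/server/arch.txt should be reported as shown, matching the comments in the example file.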
2.2.5 Viewing Your Staged and Unstaged Changes
If the git status command is too vague for you (you want to know
exactly what you changed, not just which files were changed), you
can use the git diff command. We'll cover git diff in more detail
later, but you'll probably use it most often to answer these two
questions: What have you changed but not yet staged? And what have
you staged that you are about to commit? Although git status
answers those questions very generally, git diff shows you the exact
lines added and removed: the patch, as it were.
Let's say you edit and stage the README file again and then edit
the benchmarks.rb file without staging it. If you run your status
command, you once again see something like this:
$ git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# new file: README
#
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
#
# modified: benchmarks.rb
#
To see what you've changed but not yet staged, type git diff with
no other arguments:
$ git diff
diff --git a/benchmarks.rb b/benchmarks.rb
index 3cb747f..da65585 100644
--- a/benchmarks.rb
+++ b/benchmarks.rb
@@ -36,6 +36,10 @@ def main
@commit.parents[0].parents[0].parents[0]
end
+ run_code(x, 'commits 1') do
+ git.commits.size
+ end
+
run_code(x, 'commits 2') do
log = git.commits('master', 15)
log.size
That command compares what is in your working directory with
what is in your staging area. The result tells you the changes you've
made that you haven't yet staged.
If you want to see what you've staged that will go into your next
commit, you can use git diff --cached. (In Git versions 1.6.1 and later,
you can also use git diff --staged, which may be easier to remember.)
This command compares your staged changes to your last commit:
$ git diff --cached
diff --git a/README b/README
new file mode 100644
index 0000000..03902a1
--- /dev/null
+++ b/README
@@ -0,0 +1,5 @@
+grit
+ by Tom Preston-Werner, Chris Wanstrath
+ http://github.com/mojombo/grit
+
+Grit is a Ruby library for extracting information from a Git repository
It's important to note that git diff by itself doesn't show all changes
made since your last commit, only changes that are still unstaged.
This can be confusing, because if you've staged all of your changes,
git diff will give you no output.
For another example, if you stage the benchmarks.rb file and then
edit it, you can use git diff to see the changes in the file that are
staged and the changes that are unstaged:
$ git add benchmarks.rb
$ echo '# test line' >> benchmarks.rb
$ git status
# On branch master
#
# Changes to be committed:
#
# modified: benchmarks.rb
#
# Changed but not updated:
#
# modified: benchmarks.rb
#
Now you can use git diff to see what is still unstaged:
$ git diff
diff --git a/benchmarks.rb b/benchmarks.rb
index e445e28..86b2f7c 100644
--- a/benchmarks.rb
+++ b/benchmarks.rb
@@ -127,3 +127,4 @@ end
main()
##pp Grit::GitRuby.cache_client.stats
+# test line
and git diff --cached to see what you’ve staged so far:
$ git diff --cached
diff --git a/benchmarks.rb b/benchmarks.rb
index 3cb747f..e445e28 100644
--- a/benchmarks.rb
+++ b/benchmarks.rb
@@ -36,6 +36,10 @@ def main
@commit.parents[0].parents[0].parents[0]
end
+ run_code(x, 'commits 1') do
+ git.commits.size
+ end
+
run_code(x, 'commits 2') do
log = git.commits('master', 15)
log.size
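The split between the two commands is easy to reproduce in a scratch repository. The following sketch, assuming a POSIX shell with grep available, stages one change, leaves a second one unstaged, and then keeps only the added lines from each diff. The file contents and the identity are placeholders.

```shell
#!/bin/sh
# Sketch: git diff shows unstaged changes, --cached shows staged ones.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "line one" > benchmarks.rb
git add benchmarks.rb
git commit -q -m "add benchmarks"

echo "staged line" >> benchmarks.rb
git add benchmarks.rb                    # this change is now staged
echo "# test line" >> benchmarks.rb      # this one is not

# Keep added lines only; the pattern skips the "+++ b/..." header.
unstaged=$(git diff | grep '^+[^+]')
staged=$(git diff --cached | grep '^+[^+]')

echo "git diff:          $unstaged"
echo "git diff --cached: $staged"
```

Each added line appears in exactly one of the two diffs, depending on whether git add ran after it was written.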
2.2.6 Committing Your Changes
Now that your staging area is set up the way you want it, you can
commit your changes. Remember that anything that is still unstaged
(any files you have created or modified that you haven't run git add
on since you edited them) won't go into this commit. They will
stay as modified files on your disk. In this case, the last time you
ran git status, you saw that everything was staged, so you're ready
to commit your changes. The simplest way to commit is to type git
commit:
$ git commit
Doing so launches your editor of choice. (This is set by your shell's
$EDITOR environment variable, usually vim or emacs, although you
can configure it with whatever you want using the git config --global
core.editor command as you saw in Chapter 1.)
The editor displays the following text (this example is a Vim screen):
# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# new file: README
# modified: benchmarks.rb
~
~
~
".git/COMMIT_EDITMSG" 10L, 283C
You can see that the default commit message contains the latest
output of the git status command commented out and one empty line
on top. You can remove these comments and type your commit
message, or you can leave them there to help you remember what you're
committing. (For an even more explicit reminder of what you've
modified, you can pass the -v option to git commit. Doing so also puts the
diff of your change in the editor so you can see exactly what you did.)
When you exit the editor, Git creates your commit with that commit
message (with the comments and diff stripped out).
Alternatively, you can type your commit message inline with the
commit command by specifying it after a -m flag, like this:
$ git commit -m "Story 182: Fix benchmarks for speed"
[master]: created 463dc4f: "Fix benchmarks for speed"
2 files changed, 3 insertions(+), 0 deletions(-)
create mode 100644 README
Now you've created your first commit! You can see that the
commit has given you some output about itself: which branch you
committed to (master), what SHA-1 checksum the commit has (463dc4f),
how many files were changed, and statistics about lines added and
removed in the commit.
Remember that the commit records the snapshot you set up in
your staging area. Anything you didn't stage is still sitting there
modified; you can do another commit to add it to your history.
Every time you perform a commit, you're recording a snapshot of your
project that you can revert to or compare to later.
2.2.7 Skipping the Staging Area
Although it can be amazingly useful for crafting commits exactly how
you want them, the staging area is sometimes a bit more complex
than you need in your workflow. If you want to skip the staging
area, Git provides a simple shortcut. Providing the -a option to the
git commit command makes Git automatically stage every file that is
already tracked before doing the commit, letting you skip the git add
part:
$ git status
# On branch master
#
# Changed but not updated:
#
# modified: benchmarks.rb
#
$ git commit -a -m 'added new benchmarks'
[master 83e38c7] added new benchmarks
1 files changed, 5 insertions(+), 0 deletions(-)
Notice how you don't have to run git add on the benchmarks.rb
file in this case before you commit.
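Because -a only picks up files that are already tracked, a brand-new file is left out of the commit. The sketch below shows that boundary, assuming a POSIX shell. The file name scratch.txt and the identity are invented for the example.

```shell
#!/bin/sh
# Sketch: commit -a stages tracked files but leaves untracked ones alone.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "benchmark 1" > benchmarks.rb
git add benchmarks.rb
git commit -q -m "add benchmarks"

echo "benchmark 2" >> benchmarks.rb      # tracked file, modified
echo "scratch" > scratch.txt             # new file, never added

git commit -q -a -m 'added new benchmarks'

remaining=$(git status --porcelain)      # what the commit did not take
echo "left over after commit -a: $remaining"
```

The edit to benchmarks.rb is committed without a separate git add, while scratch.txt is still reported as untracked.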
2.2.8 Removing Files
To remove a file from Git, you have to remove it from your tracked
files (more accurately, remove it from your staging area) and then
commit. The git rm command does that and also removes the file
from your working directory so you don't see it as an untracked file
next time around.
If you simply remove the file from your working directory, it shows
up under the “Changed but not updated” (that is, unstaged) area of
your git status output:
$ rm grit.gemspec
$ git status
# On branch master
#
# Changed but not updated:
# (use "git add/rm <file>..." to update what will be committed)
#
# deleted: grit.gemspec
#
Then, if you run git rm, it stages the file's removal:
$ git rm grit.gemspec
rm 'grit.gemspec'
$ git status
# On branch master
#
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# deleted: grit.gemspec
#
The next time you commit, the file will be gone and no longer
tracked. If you modified the file and added it to the index already,
you must force the removal with the -f option. This is a safety feature
to prevent accidental removal of data that hasn't yet been recorded
in a snapshot and that can't be recovered from Git.
Another useful thing you may want to do is to keep the file in
your working tree but remove it from your staging area. In other
words, you may want to keep the file on your hard drive but not
have Git track it anymore. This is particularly useful if you forgot to
add something to your .gitignore file and accidentally added it, like
a large log file or a bunch of .a compiled files. To do this, use the
--cached option:
$ git rm --cached readme.txt
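As a quick check that --cached leaves the file on disk, you can run the sequence in a scratch repository. This sketch assumes a POSIX shell. The file name debug.log and the identity are hypothetical.

```shell
#!/bin/sh
# Sketch: stop tracking a file without deleting it from disk.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "noisy output" > debug.log
git add debug.log
git commit -q -m "accidentally track a log file"

git rm -q --cached debug.log             # remove from the staging area only
echo "debug.log" > .gitignore            # keep it from being re-added

tracked=$(git ls-files debug.log)        # empty once Git stops tracking it
if [ -f debug.log ] && [ -z "$tracked" ]; then
  echo "debug.log is still on disk but no longer tracked"
fi
```

The removal still has to be committed, just like any other staged change.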
You can pass files, directories, and file-glob patterns to the git rm
command. That means you can do things such as
$ git rm log/\*.log
Note the backslash (\) in front of the *. This is necessary because
Git does its own filename expansion in addition to your shell's
filename expansion. This command removes all files that have the .log
extension in the log/ directory. Or, you can do something like this:
$ git rm \*~
This command removes all files that end with ~.
2.2.9 Moving Files
Unlike many other VCS systems, Git doesn't explicitly track file
movement. If you rename a file in Git, no metadata is stored in Git that
tells it you renamed the file. However, Git is pretty smart about
figuring that out after the fact; we'll deal with detecting file movement
a bit later.
Thus it's a bit confusing that Git has a mv command. If you want
to rename a file in Git, you can run something like
$ git mv file_from file_to
and it works fine. In fact, if you run something like this and look
at the status, you'll see that Git considers it a renamed file:
$ git mv README.txt README
$ git status
# On branch master
# Your branch is ahead of 'origin/master' by 1 commit.
#
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# renamed: README.txt -> README
#
However, this is equivalent to running something like this:
$ mv README.txt README
$ git rm README.txt
$ git add README
Git figures out that it's a rename implicitly, so it doesn't matter
if you rename a file that way or with the mv command. The only real
difference is that mv is one command instead of three; it's a
convenience function. More important, you can use any tool you like to
rename a file, and address the add/rm later, before you commit.
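You can verify the equivalence by doing the rename the long way and asking Git how it interprets the result. The sketch below assumes a POSIX shell and Git's default rename detection in git status. The file contents and the identity are invented.

```shell
#!/bin/sh
# Sketch: a plain mv followed by git rm and git add is seen as a rename.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "Some text that describes the project." > README.txt
git add README.txt
git commit -q -m "add readme"

mv README.txt README                     # ordinary shell rename
git rm -q README.txt                     # stage removal of the old name
git add README                           # stage the new name

status=$(git status --porcelain)
echo "$status"
```

Git stores no rename record here; it matches the removed and added paths by their identical content and reports them as one renamed file.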
2.3 Viewing the Commit History
After you have created several commits, or if you have cloned a
repository with an existing commit history, you'll probably want to
look back to see what has happened. The most basic and powerful
tool to do this is the git log command.
These examples use a very simple project called simplegit that I
often use for demonstrations. To get the project, run
$ git clone git://github.com/schacon/simplegit-progit.git
When you run git log in this project, you should get output that
looks something like this:
$ git log
commit ca82a6dff817ec66f44342007202690a93763949
Author: Scott Chacon <schacon@gee-mail.com>
Date: Mon Mar 17 21:52:11 2008 -0700
changed the verison number
commit 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7
Author: Scott Chacon <schacon@gee-mail.com>
Date: Sat Mar 15 16:40:33 2008 -0700
removed unnecessary test code
commit a11bef06a3f659402fe7563abf99ad00de2209e6
Author: Scott Chacon <schacon@gee-mail.com>
Date: Sat Mar 15 10:31:28 2008 -0700
first commit
By default, with no arguments, git log lists the commits made
in that repository in reverse chronological order. That is, the most
recent commits show up first. As you can see, this command lists
each commit with its SHA-1 checksum, the author's name and e-mail,
the date written, and the commit message.
A huge number and variety of options to the git log command are
available to show you exactly what you're looking for. Here, we'll
show you some of the most-used options.
One of the more helpful options is -p, which shows the diff
introduced in each commit. You can also use -2, which limits the output
to only the last two entries:
$ git log -p -2
commit ca82a6dff817ec66f44342007202690a93763949
Author: Scott Chacon <schacon@gee-mail.com>
Date: Mon Mar 17 21:52:11 2008 -0700
changed the verison number
diff --git a/Rakefile b/Rakefile
index a874b73..8f94139 100644
--- a/Rakefile
+++ b/Rakefile
@@ -5,7 +5,7 @@ require 'rake/gempackagetask'
spec = Gem::Specification.new do |s|
- s.version = "0.1.0"
+ s.version = "0.1.1"
s.author = "Scott Chacon"
commit 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7
Author: Scott Chacon <schacon@gee-mail.com>
Date: Sat Mar 15 16:40:33 2008 -0700
removed unnecessary test code
diff --git a/lib/simplegit.rb b/lib/simplegit.rb
index a0a60ae..47c6340 100644
--- a/lib/simplegit.rb
+++ b/lib/simplegit.rb
@@ -18,8 +18,3 @@ class SimpleGit
end
end
-
-if $0 == __FILE__
- git = SimpleGit.new
- puts git.show
-end
\ No newline at end of file
This option displays the same information but with a diff directly
following each entry. This is very helpful for code review or to quickly
browse what happened during a series of commits that a collaborator
has added. You can also use a series of summarizing options with git
log. For example, if you want to see some abbreviated stats for each
commit, you can use the --stat option:
$ git log --stat
commit ca82a6dff817ec66f44342007202690a93763949
Author: Scott Chacon <schacon@gee-mail.com>
Date: Mon Mar 17 21:52:11 2008 -0700
changed the verison number
Rakefile | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
commit 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7
Author: Scott Chacon <schacon@gee-mail.com>
Date: Sat Mar 15 16:40:33 2008 -0700
removed unnecessary test code
lib/simplegit.rb | 5 -----
1 files changed, 0 insertions(+), 5 deletions(-)
commit a11bef06a3f659402fe7563abf99ad00de2209e6
Author: Scott Chacon <schacon@gee-mail.com>
Date: Sat Mar 15 10:31:28 2008 -0700
first commit
README | 6 ++++++
Rakefile | 23 +++++++++++++++++++++++
lib/simplegit.rb | 25 +++++++++++++++++++++++++
3 files changed, 54 insertions(+), 0 deletions(-)
As you can see, the --stat option prints below each commit entry
a list of modified files, how many files were changed, and how many
lines in those files were added and removed. It also puts a summary
of the information at the end. Another really useful option is --pretty.
This option changes the log output to formats other than the default.
A few prebuilt options are available for you to use. The oneline option
prints each commit on a single line, which is useful if you're looking
at a lot of commits. In addition, the short, full, and fuller options
show the output in roughly the same format but with less or more
information, respectively:
$ git log --pretty=oneline
ca82a6dff817ec66f44342007202690a93763949 changed the verison number
085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7 removed unnecessary test code
a11bef06a3f659402fe7563abf99ad00de2209e6 first commit
The most interesting option is format, which allows you to specify
your own log output format. This is especially useful when you're
generating output for machine parsing: because you specify the
format explicitly, you know it won't change with updates to Git.
$ git log --pretty=format:"%h - %an, %ar : %s"
ca82a6d - Scott Chacon, 11 months ago : changed the verison number
085bb3b - Scott Chacon, 11 months ago : removed unnecessary test code
a11bef0 - Scott Chacon, 11 months ago : first commit
Table 2.1 lists some of the more useful options that format takes.
Option   Description of Output
%H       Commit hash
%h       Abbreviated commit hash
%T       Tree hash
%t       Abbreviated tree hash
%P       Parent hashes
%p       Abbreviated parent hashes
%an      Author name
%ae      Author e-mail
%ad      Author date (format respects the --date= option)
%ar      Author date, relative
%cn      Committer name
%ce      Committer e-mail
%cd      Committer date
%cr      Committer date, relative
%s       Subject
You may be wondering what the difference is between author and
committer. The author is the person who originally wrote the work,
whereas the committer is the person who last applied the work. So, if
you send in a patch to a project and one of the core members applies
the patch, both of you get credit: you as the author and the core
member as the committer. We'll cover this distinction a bit more in
Chapter 5.
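To make the author/committer distinction concrete, here is a minimal sketch that builds a throwaway repository and records one commit under two different identities. The names, e-mail addresses, and the demo directory are made up for illustration, and setting the identities through environment variables is just one convenient way to do it.

```shell
# Build a scratch repository; nothing here touches your real projects.
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"

echo "hello" > README
git add README

# Hypothetical identities, applied to this single commit only.
GIT_AUTHOR_NAME="Patch Author" GIT_AUTHOR_EMAIL="author@example.com" \
GIT_COMMITTER_NAME="Core Member" GIT_COMMITTER_EMAIL="core@example.com" \
    git commit -q -m "add README"

# %an is the author name, %cn the committer name, %s the subject.
line=$(git log -1 --pretty=format:"%an|%cn|%s")
echo "$line"
```

Because the format string is spelled out, a script can split the line on the | character without guessing at the layout.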
The oneline and format options are particularly useful with another
log option called --graph. This option adds a nice little ASCII
graph showing your branch and merge history, which we can see in our
copy of the Grit project repository:
$ git log --pretty=format:"%h %s" --graph
* 2d3acf9 ignore errors from SIGCHLD on trap
* 5e3ee11 Merge branch 'master' of git://github.com/dustin/grit
|\
| * 420eac9 Added a method for getting the current branch.
* | 30e367c timeout code and tests
* | 5a09431 add timeout protection to grit
* | e1193f8 support for heads with slashes in them
|/
* d6016bc require time for xmlschema
* 11d191e Merge branch 'defunkt' into local
Those are only some simple output-formatting options to git log;
there are many more. Table 2.2 lists the options we've covered so
far and some other common formatting options that may be useful,
along with how they change the output of the log command.
Option            Description
-p                Show the patch introduced with each commit.
--stat            Show statistics for files modified in each commit.
--shortstat       Display only the changed/insertions/deletions line
                  from the --stat command.
--name-only       Show the list of files modified after the commit
                  information.
--name-status     Show the list of files affected with added/modified/
                  deleted information as well.
--abbrev-commit   Show only the first few characters of the SHA-1
                  checksum instead of all 40.
--relative-date   Display the date in a relative format (for example,
                  “2 weeks ago”) instead of using the full date format.
--graph           Display an ASCII graph of the branch and merge
                  history beside the log output.
--pretty          Show commits in an alternate format. Options include
                  oneline, short, full, fuller, and format (where you
                  specify your own format).
2.3.1 Limiting Log Output
In addition to output-formatting options, git log takes a number of
useful limiting options, that is, options that let you show only a
subset of commits. You've seen one such option already: the -2
option, which shows only the last two commits. In fact, you can do
-<n>, where n is any integer, to show the last n commits. In reality,
you're unlikely to use that often, because Git by default pipes all
output through a pager so you see only one page of log output at a
time.
However, the time-limiting options such as --since and --until are
very useful. For example, this command gets the list of commits
made in the last two weeks:
$ git log --since=2.weeks
This command works with lots of formats: you can specify a
specific date (“2008-01-15”) or a relative date such as “2 years 1 day
3 minutes ago”.
You can also filter the list to commits that match some search
criteria. The --author option allows you to filter on a specific author,
and the --grep option lets you search for keywords in the commit
messages. (Note that if you want to specify both author and grep options,
you have to add --all-match or the command will match commits with
either.)
The last really useful option to pass to git log as a filter is a path.
If you specify a directory or file name, you can limit the log output
to commits that introduced a change to those files. This is always
the last option and is generally preceded by double dashes (--) to
separate the paths from the options.
In Table 2.3 we'll list these and a few other common options for
your reference.
Option              Description
-(n)                Show only the last n commits.
--since, --after    Limit the commits to those made after the
                    specified date.
--until, --before   Limit the commits to those made before the
                    specified date.
--author            Only show commits in which the author entry
                    matches the specified string.
--committer         Only show commits in which the committer entry
                    matches the specified string.
For example, if you want to see which commits modifying test
files in the Git source code history were committed by Junio Hamano
and were not merges in the month of October 2008, you can run
something like this:
$ git log --pretty="%h - %s" --author=gitster --since="2008-10-01" \
--before="2008-11-01" --no-merges -- t/
5610e3b - Fix testcase failure when extended attribute
acd3b9e - Enhance hold_lock_file_for_{update,append}()
f563754 - demonstrate breakage of detached checkout wi
d1a43f2 - reset --hard/read-tree --reset -u: remove un
51a94af - Fix "checkout --track -b newbranch" on detac
b0ad11e - pull: allow "git pull origin $something:$cur
Of the nearly 20,000 commits in the Git source code history, this
command shows the 6 that match those criteria.
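The same combination of filters can be tried on a small scale. The sketch below is only an illustration: the authors alice and bob, the file names, the dates, and the commit_on helper are all invented, and the commit dates are pinned through environment variables so the result does not depend on when you run it.

```shell
# Scratch repository with a t/ directory standing in for a test suite.
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
mkdir t

# Fixed committer identity for the whole sketch (hypothetical values).
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
export GIT_AUTHOR_EMAIL="demo@example.com"

# commit_on DATE AUTHOR FILE: helper for this sketch, not a Git command.
commit_on() {
    echo "$1" >> "$3"
    git add "$3"
    GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" GIT_AUTHOR_NAME="$2" \
        git commit -q -m "change $3"
}

commit_on "2008-09-15 12:00:00 +0000" alice t/one.sh   # too early
commit_on "2008-10-10 12:00:00 +0000" alice t/one.sh   # passes every filter
commit_on "2008-10-15 12:00:00 +0000" bob   t/two.sh   # wrong author
commit_on "2008-10-20 12:00:00 +0000" alice README     # outside t/
commit_on "2008-11-15 12:00:00 +0000" alice t/one.sh   # too late

# Author, date range, no merges, and a path limit after the double dash.
matches=$(git log --pretty=format:"%s" --author=alice \
    --since="2008-10-01" --before="2008-11-01" --no-merges -- t/)
echo "$matches"
```

Of the five commits, only the one dated 10 October survives all the filters; each of the others is excluded for the reason noted beside it.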
2.3.2 Using a GUI to Visualize History
If you like to use a more graphical tool to visualize your commit
history, you may want to take a look at a Tcl/Tk program called gitk that
is distributed with Git. Gitk is basically a visual git log tool, and it
accepts nearly all the filtering options that git log does. If you type
gitk on the command line in your project, you should see something
like Figure 2.2.
Figure 2.2: The gitk history visualizer
You can see the commit history in the top half of the window along
with a nice ancestry graph. The diff viewer in the bottom half of the
window shows you the changes introduced at any commit you click.
2.4 Undoing Things
At any stage, you may want to undo something. Here, we'll review
a few basic tools for undoing changes that you've made. Be careful,
because you can't always undo some of these undos. This is one of
the few areas in Git where you may lose some work if you do it wrong.
2.4.1 Changing Your Last Commit
One of the common undos takes place when you commit too early
and possibly forget to add some files, or you mess up your commit
message. If you want to try that commit again, you can run commit
with the --amend option:
$ git commit --amend
This command takes your staging area and uses it for the commit.
If you have made no changes since your last commit (for instance,
you run this command immediately after your previous commit),
then your snapshot will look exactly the same and all you'll change
is your commit message.
The same commit-message editor fires up, but it already contains
the message of your previous commit. You can edit the message the
same as always, but it overwrites your previous commit.
As an example, if you commit and then realize you forgot to stage
the changes in a file you wanted to add to this commit, you can do
something like this:
$ git commit -m 'initial commit'
$ git add forgotten_file
$ git commit --amend
After these three commands, you end up with a single commit: the
second commit replaces the results of the first.
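You can check that claim in a scratch repository. The following is a sketch under a few assumptions: the file names and identity are invented, and --no-edit (available in reasonably recent Git versions) is used so the example runs without opening an editor.

```shell
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"

# Hypothetical identity so the commits work on any machine.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

echo "main" > main.rb
echo "forgotten" > forgotten_file
git add main.rb
git commit -q -m "initial commit"       # forgotten_file was left out

git add forgotten_file
git commit -q --amend --no-edit         # reuse the message, skip the editor

# One commit in the history, containing both files.
count=$(git rev-list --count HEAD)
nfiles=$(git ls-tree -r --name-only HEAD | wc -l | tr -d ' ')
echo "$count commit, $nfiles files"
```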
2.4.2 Unstaging a Staged File
The next two sections demonstrate how to wrangle your staging area
and working directory changes. The nice part is that the command
you use to determine the state of those two areas also reminds you
how to undo changes to them. For example, let's say you've changed
two files and want to commit them as two separate changes, but you
accidentally type git add . and stage them both. How can you unstage
one of the two? The git status command reminds you:
$ git add .
$ git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# modified: README.txt
# modified: benchmarks.rb
#
Right below the “Changes to be committed” text, it says use git
reset HEAD <file>... to unstage. So, let's use that advice to unstage
the benchmarks.rb file:
$ git reset HEAD benchmarks.rb
benchmarks.rb: locally modified
$ git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# modified: README.txt
#
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
# (use "git checkout -- <file>..." to discard changes in working directory)
#
# modified: benchmarks.rb
#
The command is a bit strange, but it works. The benchmarks.rb
file is modified but once again unstaged.
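If you would rather confirm the result from a script than read the status output, a sketch like this one asks Git directly which files are staged and which are not. The repository and identity are made up for the example; the two file names follow the ones used above.

```shell
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

echo "a" > README.txt
echo "b" > benchmarks.rb
git add .
git commit -q -m "initial commit"

echo "more" >> README.txt
echo "more" >> benchmarks.rb
git add .                               # stages both files by accident

git reset -q HEAD benchmarks.rb         # unstage one; its edits are kept

staged=$(git diff --cached --name-only) # what the next commit would contain
unstaged=$(git diff --name-only)        # modified but not staged
echo "staged: $staged, unstaged: $unstaged"
```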
2.4.3 Unmodifying a Modified File
What if you realize that you don't want to keep your changes to the
benchmarks.rb file? How can you easily unmodify it, that is, revert it
back to what it looked like when you last committed (or initially cloned,
or however you got it into your working directory)? Luckily, git status
tells you how to do that, too. In the last example output, the unstaged
area looks like this:
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
# (use "git checkout -- <file>..." to discard changes in working directory)
#
# modified: benchmarks.rb
#
It tells you pretty explicitly how to discard the changes you've
made (at least, the newer versions of Git, 1.6.1 and later, do this;
if you have an older version, we highly recommend upgrading it to
get some of these nicer usability features). Let's do what it says:
$ git checkout -- benchmarks.rb
$ git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# modified: README.txt
#
You can see that the changes have been reverted. You should also
realize that this is a dangerous command: any changes you made to
that file are gone, because you just copied another file over it. Don't
ever use this command unless you absolutely know that you don't want the
file. If you just need to get it out of the way, we'll go over stashing
and branching in the next chapter; these are generally better ways
to go.
Remember, anything that is committed in Git can almost always be
recovered. Even commits that were on branches that were deleted
or commits that were overwritten with an --amend commit can be
recovered (see Chapter 9 for data recovery). However, anything you
lose that was never committed is likely never to be seen again.
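A short sketch shows how completely the uncommitted edit disappears. The file contents and identity below are placeholders chosen for the example.

```shell
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

echo "original" > benchmarks.rb
git add benchmarks.rb
git commit -q -m "initial commit"

echo "experiment" >> benchmarks.rb      # an edit that is never committed

git checkout -- benchmarks.rb           # overwrite it with the committed copy

content=$(cat benchmarks.rb)
dirty=$(git status --porcelain)         # empty when nothing is modified
echo "content: $content"
```

The appended line is not stored anywhere in the repository, so after the checkout there is nothing left for Git to recover.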
2.5 Working with Remotes
To be able to collaborate on any Git project, you need to know how to
manage your remote repositories. Remote repositories are versions
of your project that are hosted on the Internet or network somewhere.
You can have several of them, each of which generally is
either read-only or read/write for you. Collaborating with others
involves managing these remote repositories and pushing and pulling
data to and from them when you need to share work. Managing
remote repositories includes knowing how to add remote repositories,
remove remotes that are no longer valid, manage various remote
branches and define them as being tracked or not, and more. In this
section, we'll cover these remote-management skills.
2.5.1 Showing Your Remotes
To see which remote servers you have configured, you can run the
git remote command. It lists the shortnames of each remote handle
you've specified. If you've cloned your repository, you should at least
see origin; that is the default name Git gives to the server you
cloned from:
$ git clone git://github.com/schacon/ticgit.git
Initialized empty Git repository in /private/tmp/ticgit/.git/
remote: Counting objects: 595, done.
remote: Compressing objects: 100% (269/269), done.
remote: Total 595 (delta 255), reused 589 (delta 253)
Receiving objects: 100% (595/595), 73.31 KiB | 1 KiB/s, done.
Resolving deltas: 100% (255/255), done.
$ cd ticgit
$ git remote
origin
You can also specify -v, which shows you the URL that Git has
stored for the shortname to be expanded to:
$ git remote -v
origin git://github.com/schacon/ticgit.git
If you have more than one remote, the command lists them all.
For example, my Grit repository looks something like this.
$ cd grit
$ git remote -v
bakkdoor git://github.com/bakkdoor/grit.git
cho45 git://github.com/cho45/grit.git
defunkt git://github.com/defunkt/grit.git
koke git://github.com/koke/grit.git
origin git@github.com:mojombo/grit.git
This means we can pull contributions from any of these users
pretty easily. But notice that only the origin remote is an SSH URL,
so it's the only one I can push to (we'll cover why this is in Chapter
4).
2.5.2 Adding Remote Repositories
I've mentioned and given some demonstrations of adding remote
repositories in previous sections, but here is how to do it explicitly.
To add a new remote Git repository as a shortname you can reference
easily, run git remote add [shortname] [url]:
$ git remote
origin
$ git remote add pb git://github.com/paulboone/ticgit.git
$ git remote -v
origin git://github.com/schacon/ticgit.git
pb git://github.com/paulboone/ticgit.git
Now you can use the string pb on the command line in lieu of the
whole URL. For example, if you want to fetch all the information that
Paul has but that you don't yet have in your repository, you can run
git fetch pb:
$ git fetch pb
remote: Counting objects: 58, done.
remote: Compressing objects: 100% (41/41), done.
remote: Total 44 (delta 24), reused 1 (delta 0)
Unpacking objects: 100% (44/44), done.
From git://github.com/paulboone/ticgit
* [new branch] master -> pb/master
* [new branch] ticgit -> pb/ticgit
Paul's master branch is accessible locally as pb/master; you can
merge it into one of your branches, or you can check out a local
branch at that point if you want to inspect it.
2.5.3 Fetching and Pulling from Your Remotes
As you just saw, to get data from your remote projects, you can run:
$ git fetch [remote-name]
The command goes out to that remote project and pulls down all
the data from that remote project that you don't have yet. After you
do this, you should have references to all the branches from that
remote, which you can merge in or inspect at any time. (We'll go
over what branches are and how to use them in much more detail in
Chapter 3.)
If you clone a repository, the git clone command automatically adds
that remote repository under the name origin. So, git fetch origin fetches
any new work that has been pushed to that server since you cloned
(or last fetched from) it. It's important to note that the fetch command
pulls the data to your local repository; it doesn't automatically
merge it with any of your work or modify what you're currently
working on. You have to merge it manually into your work when
you're ready.
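The whole add-and-fetch cycle can be rehearsed without a network. In this sketch, two plain local repositories stand in for the servers; the directory names upstream, paul, and mine are invented, and a local path is used where you would normally give a URL.

```shell
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Two stand-in "servers", each with a single commit.
for name in upstream paul; do
    git init -q "$name"
    (cd "$name" && echo "$name" > README && git add README &&
        git commit -q -m "first commit in $name")
done

git clone -q upstream mine              # clone registers "origin" for you
cd mine
git remote add pb "$work/paul"          # second remote under a shortname
git fetch -q pb                         # download Paul's work, merge nothing

remotes=$(git remote | paste -s -d, -)
branch=$(git symbolic-ref --short HEAD)
fetched=$(git log -1 --pretty=format:"%s" "pb/$branch")
current=$(git log -1 --pretty=format:"%s")
echo "remotes: $remotes"
echo "pb/$branch: $fetched"
echo "checked out: $current"
```

Paul's commit is reachable through the pb/ reference, while the checked-out branch still points at the commit you cloned, which is the fetch-without-merge behavior described above.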
If you have a branch set up to track a remote branch (see the next
section and Chapter 3 for more information), you can use the git pull
command to automatically fetch and then merge a remote branch
into your current branch. This may be an easier or more comfortable
workflow for you, and by default, the git clone command automatically
sets up your local master branch to track the remote master
branch on the server you cloned from (assuming the remote has a
master branch). Running git pull generally fetches data from the
server you originally cloned from and automatically tries to merge it
into the code you're currently working on.
2.5.4 Pushing to Your Remotes
When you have your project at a point that you want to share, you
have to push it upstream. The command for this is simple: git push
[remote-name] [branch-name]. If you want to push your master branch
to your origin server (again, cloning generally sets up both of those
names for you automatically), then you can run this to push your
work back up to the server:
$ git push origin master
This command works only if you cloned from a server to which
you have write access and if nobody has pushed in the meantime.
If you and someone else clone at the same time and they push
upstream and then you push upstream, your push will rightly be
rejected. You'll have to pull down their work first and incorporate it
into yours before you'll be allowed to push. See Chapter 3 for more
detailed information on how to push to remote servers.
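The rejected-push scenario is easy to reproduce locally. The sketch below assumes a bare repository named server.git as the shared server and two clones named alice and bob; all of these names are invented. The pull uses --no-rebase and --no-edit so that it merges without prompting, which is a choice made for this example rather than a requirement.

```shell
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q --bare server.git           # stands in for the shared server

# The first collaborator publishes the initial commit.
git clone -q server.git alice 2>/dev/null
(cd alice && echo one > a.txt && git add a.txt &&
    git commit -q -m "alice: first" && git push -q origin HEAD)

# The second collaborator clones; then both commit independently.
git clone -q server.git bob
(cd alice && echo two > a2.txt && git add a2.txt &&
    git commit -q -m "alice: second" && git push -q origin HEAD)
cd bob
branch=$(git symbolic-ref --short HEAD)
echo three > b.txt
git add b.txt
git commit -q -m "bob: first"

# The server has a commit Bob lacks, so this push is refused.
if git push -q origin HEAD 2>/dev/null; then
    first_push=accepted
else
    first_push=rejected
fi

# Pull the missing work in, then try again.
git pull -q --no-rebase --no-edit origin "$branch"
if git push -q origin HEAD 2>/dev/null; then
    second_push=accepted
else
    second_push=rejected
fi
echo "first push: $first_push, second push: $second_push"
```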
2.5.5 Inspecting a Remote
If you want to see more information about a particular remote, you
can use the git remote show [remote-name] command. If you run this
command with a particular shortname, such as origin, you get
something like this:
$ git remote show origin
* remote origin
URL: git://github.com/schacon/ticgit.git
Remote branch merged with 'git pull' while on branch master
master
Tracked remote branches
master
ticgit
It lists the URL for the remote repository as well as the tracking
branch information. The command helpfully tells you that if you're on
the master branch and you run git pull, it will automatically merge
in the master branch on the remote after it fetches all the remote
references. It also lists all the remote references it has pulled down.
That is a simple example you're likely to encounter. When you're
using Git more heavily, however, you may see much more information
from git remote show:
$ git remote show origin
* remote origin
URL: git@github.com:defunkt/github.git
Remote branch merged with 'git pull' while on branch issues
issues
Remote branch merged with 'git pull' while on branch master
master
New remote branches (next fetch will store in remotes/origin)
caching
Stale tracking branches (use 'git remote prune')
libwalker
walker2
Tracked remote branches
acl
apiv2
dashboard2
issues
master
postgres
Local branch pushed with 'git push'
master:master
This command shows which branch is automatically pushed when
you run git push on certain branches. It also shows you which remote
branches on the server you don't yet have, which remote branches
you have that have been removed from the server, and multiple branches
that are automatically merged when you run git pull.
2.5.6 Removing and Renaming Remotes
If you want to rename a reference, in newer versions of Git you can
run git remote rename to change a remote's shortname. For instance,
if you want to rename pb to paul, you can do so with git remote rename:
$ git remote rename pb paul
$ git remote
origin
paul
It's worth mentioning that this changes your remote branch names,
too. What used to be referenced at pb/master is now at paul/master.
If you want to remove a reference for some reason (you've moved
the server or are no longer using a particular mirror, or perhaps a
contributor isn't contributing anymore), you can use git remote rm:
$ git remote rm paul
$ git remote
origin
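Renaming and removing only edit your local configuration, so you can try both in an empty scratch repository. This sketch reuses the pb URL from earlier in the section; no network access happens, because adding a remote just records the URL.

```shell
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"

# Adding a remote stores the shortname and URL; nothing is contacted.
git remote add pb git://github.com/paulboone/ticgit.git

git remote rename pb paul
after_rename=$(git remote)
url=$(git config --get remote.paul.url) # the URL moved with the new name

git remote rm paul
after_rm=$(git remote)                  # empty: no remotes are left

echo "after rename: $after_rename ($url)"
```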
2.6 Tagging
Like most VCSs, Git has the ability to tag specific points in history
as being important. Generally, people use this functionality to mark
release points (v1.0, and so on). In this section, you'll learn how to
list the available tags, how to create new tags, and what the different
types of tags are.
2.6.1 Listing Your Tags
Listing the available tags in Git is straightforward. Just type git tag:
$ git tag
v0.1
v1.3
This command lists the tags in alphabetical order; the order in
which they appear has no real importance.
You can also search for tags with a particular pattern. The Git
source repo, for instance, contains more than 240 tags. If you're
only interested in looking at the 1.4.2 series, you can run this:
$ git tag -l 'v1.4.2.*'
v1.4.2.1
v1.4.2.2
v1.4.2.3
v1.4.2.4
2.6.2 Creating Tags
Git uses two main types of tags: lightweight and annotated. A lightweight
tag is very much like a branch that doesn't change; it's just a pointer
to a specific commit. Annotated tags, however, are stored as full
objects in the Git database. They're checksummed; contain the tagger
name, e-mail, and date; have a tagging message; and can be signed
and verified with GNU Privacy Guard (GPG). It's generally
recommended that you create annotated tags so you can have all this
information; but if you want a temporary tag or for some reason don't
want to keep the other information, lightweight tags are available
too.
2.6.3 Annotated Tags
Creating an annotated tag in Git is simple. The easiest way is to
specify -a when you run the tag command:
$ git tag -a v1.4 -m 'my version 1.4'
$ git tag
v0.1
v1.3
v1.4
The -m specifies a tagging message, which is stored with the tag.
If you don't specify a message for an annotated tag, Git launches
your editor so you can type it in.
You can see the tag data along with the commit that was tagged
by using the git show command:
$ git show v1.4
tag v1.4
Tagger: Scott Chacon <schacon@gee-mail.com>
Date: Mon Feb 9 14:45:11 2009 -0800
my version 1.4
commit 15027957951b64cf874c3557a0f3547bd83b3ff6
Merge: 4a447f7... a6b4c97...
Author: Scott Chacon <schacon@gee-mail.com>
Date: Sun Feb 8 19:02:46 2009 -0800
Merge branch 'experiment'
That shows the tagger information, the date the commit was tagged,
and the annotation message before showing the commit information.
2.6.4 Signed Tags
You can also sign your tags with GPG, assuming you have a private
key. All you have to do is use -s instead of -a:
$ git tag -s v1.5 -m 'my signed 1.5 tag'
You need a passphrase to unlock the secret key for
user: "Scott Chacon <schacon@gee-mail.com>"
1024-bit DSA key, ID F721C45A, created 2009-02-09
If you run git show on that tag, you can see your GPG signature
attached to it:
$ git show v1.5
tag v1.5
Tagger: Scott Chacon <schacon@gee-mail.com>
Date: Mon Feb 9 15:22:20 2009 -0800
my signed 1.5 tag
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)
iEYEABECAAYFAkmQurIACgkQON3DxfchxFr5cACeIMN+ZxLKggJQf0QYiQBwgySN
Ki0An2JeAVUCAiJ7Ox6ZEtK+NvZAj82/
=WryJ
-----END PGP SIGNATURE-----
commit 15027957951b64cf874c3557a0f3547bd83b3ff6
Merge: 4a447f7... a6b4c97...
Author: Scott Chacon <schacon@gee-mail.com>
Date: Sun Feb 8 19:02:46 2009 -0800
Merge branch 'experiment'
A bit later, you'll learn how to verify signed tags.
2.6.5 Lightweight Tags
Another way to tag commits is with a lightweight tag. This is
basically the commit checksum stored in a file; no other information is
kept. To create a lightweight tag, don't supply the -a, -s, or -m option:
$ git tag v1.4-lw
$ git tag
v0.1
v1.3
v1.4
v1.4-lw
v1.5
This time, if you run git show on the tag, you don't see the extra
tag information. The command just shows the commit:
$ git show v1.4-lw
commit 15027957951b64cf874c3557a0f3547bd83b3ff6
Merge: 4a447f7... a6b4c97...
Author: Scott Chacon <schacon@gee-mail.com>
Date: Sun Feb 8 19:02:46 2009 -0800
Merge branch 'experiment'
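The difference between the two kinds of tag is visible in the object database. As a small sketch (the repository, identity, and commit are placeholders), git cat-file -t reports the type of object each tag name resolves to:

```shell
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

echo "v1" > README
git add README
git commit -q -m "first commit"

git tag -a v1.4 -m "my version 1.4"     # annotated: a separate tag object
git tag v1.4-lw                         # lightweight: only a name for the commit

annotated_type=$(git cat-file -t v1.4)
lightweight_type=$(git cat-file -t v1.4-lw)

# Both names still lead to the same commit once the tag object is peeled.
annotated_commit=$(git rev-parse "v1.4^{commit}")
lightweight_commit=$(git rev-parse v1.4-lw)

echo "v1.4 is a $annotated_type, v1.4-lw is a $lightweight_type"
```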
2.6.6 Verifying Tags
To verify a signed tag, you use git tag -v [tag-name]. This command
uses GPG to verify the signature. You need the signer's public key in
your keyring for this to work properly:
$ git tag -v v1.4.2.1
object 883653babd8ee7ea23e6a5c392bb739348b1eb61
type commit
tag v1.4.2.1
tagger Junio C Hamano <junkio@cox.net> 1158138501 -0700
GIT 1.4.2.1
Minor fixes since 1.4.2, including git-mv and git-http with alternates.
gpg: Signature made Wed Sep 13 02:08:25 2006 PDT using DSA key ID F3119B9A
gpg: Good signature from "Junio C Hamano <junkio@cox.net>"
gpg: aka "[jpeg image of size 1513]"
Primary key fingerprint: 3565 2A26 2040 E066 C9A7 4A7D C0C6 D9A4 F311 9B9A
If you don't have the signer's public key, you get something like
this instead:
gpg: Signature made Wed Sep 13 02:08:25 2006 PDT using DSA key ID F3119B9A
gpg: Can't check signature: public key not found
error: could not verify the tag 'v1.4.2.1'
2.6.7 Tagging Later
You can also tag commits after you've moved past them. Suppose
your commit history looks like this:
$ git log --pretty=oneline
15027957951b64cf874c3557a0f3547bd83b3ff6 Merge branch 'experiment'
a6b4c97498bd301d84096da251c98a07c7723e65 beginning write support
0d52aaab4479697da7686c15f77a3d64d9165190 one more thing
6d52a271eda8725415634dd79daabbc4d9b6008e Merge branch 'experiment'
0b7434d86859cc7b8c3d5e1dddfed66ff742fcbc added a commit function
4682c3261057305bdd616e23b64b0857d832627b added a todo file
166ae0c4d3f420721acbb115cc33848dfcc2121a started write support
9fceb02d0ae598e95dc970b74767f19372d61af8 updated rakefile
964f16d36dfccde844893cac5b347e7b3d44abbc commit the todo
8a5cbc430f1a9c3d00faaeffd07798508422908a updated readme
Now, suppose you forgot to tag the project at v1.2, which was at
the “updated rakefile” commit. You can add it after the fact. To tag
that commit, you specify the commit checksum (or part of it) at the
end of the command:
$ git tag -a v1.2 9fceb02
You can see that you've tagged the commit:
$ git tag
v0.1
v1.2
v1.3
v1.4
v1.4-lw
v1.5
$ git show v1.2
tag v1.2
Tagger: Scott Chacon <schacon@gee-mail.com>
Date: Mon Feb 9 15:32:16 2009 -0800
version 1.2
commit 9fceb02d0ae598e95dc970b74767f19372d61af8
Author: Magnus Chacon <mchacon@gee-mail.com>
Date: Sun Apr 27 20:43:35 2008 -0700
updated rakefile
...
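Here is the same after-the-fact tagging as a self-contained sketch. The three commit messages are borrowed from the history above, but the repository itself is a throwaway, and a -m message is supplied so that no editor opens.

```shell
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

for msg in "updated readme" "updated rakefile" "started write support"; do
    echo "$msg" >> notes.txt
    git add notes.txt
    git commit -q -m "$msg"
done

# Abbreviated checksum of the commit one step behind HEAD.
short=$(git rev-parse --short HEAD~1)

# The checksum goes at the end of the command.
git tag -a v1.2 -m "version 1.2" "$short"

tagged_subject=$(git log -1 --pretty=format:"%s" v1.2)
echo "v1.2 points at: $tagged_subject"
```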
2.6.8 Sharing Tags
By default, the git push command doesn't transfer tags to remote
servers. You will have to explicitly push tags to a shared server
after you have created them. This process is just like sharing remote
branches: you can run git push origin [tagname].
$ git push origin v1.5
Counting objects: 50, done.
Compressing objects: 100% (38/38), done.
Writing objects: 100% (44/44), 4.56 KiB, done.
Total 44 (delta 18), reused 8 (delta 1)
To git@github.com:schacon/simplegit.git
* [new tag] v1.5 -> v1.5
If you have a lot of tags that you want to push up at once, you can
also use the --tags option to the git push command. This will transfer
all of your tags to the remote server that are not already there.
$ git push origin --tags
Counting objects: 50, done.
Compressing objects: 100% (38/38), done.
Writing objects: 100% (44/44), 4.56 KiB, done.
Total 44 (delta 18), reused 8 (delta 1)
To git@github.com:schacon/simplegit.git
* [new tag] v0.1 -> v0.1
* [new tag] v1.2 -> v1.2
* [new tag] v1.4 -> v1.4
* [new tag] v1.4-lw -> v1.4-lw
* [new tag] v1.5 -> v1.5
Now, when someone else clones or pulls from your repository, they
will get all your tags as well.
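To watch the three cases side by side, the sketch below pushes to a local bare repository (an invented stand-in named server.git) and lists the server's tags after each step. It assumes your Git configuration has not been changed to push tags automatically.

```shell
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q --bare server.git
git clone -q server.git local 2>/dev/null
cd local
echo "v1" > README
git add README
git commit -q -m "first commit"
git tag -a v1.5 -m "my 1.5 tag"
git tag v1.4-lw

# server_tags: helper for this sketch; lists the tags the server holds.
server_tags() {
    git --git-dir="$work/server.git" tag | paste -s -d, -
}

git push -q origin HEAD                 # an ordinary push sends no tags
after_plain=$(server_tags)

git push -q origin v1.5                 # push one tag by name
after_named=$(server_tags)

git push -q origin --tags               # push every tag not yet there
after_all=$(server_tags)

echo "plain: [$after_plain] named: [$after_named] all: [$after_all]"
```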
2.7 Tips and Tricks
Before we finish this chapter on basic Git, a few little tips and tricks
may make your Git experience a bit simpler, easier, or more familiar.
Many people use Git without using any of these tips, and we won't
refer to them or assume you've used them later in the book, but you
should probably know how to do them.
2.7.1 Auto-Completion
If you use the Bash shell, Git comes with a nice auto-completion
script you can enable. Download the Git source code, and look in the
contrib/completion directory; there should be a file called git-completion.bash.
Copy this file to your home directory as .git-completion.bash, and add
this to your .bashrc file:
source ~/.git-completion.bash
If you want to set up Git to automatically have Bash shell
completion for all users, copy this script to the /opt/local/etc/bash_completion.d
directory on Mac systems or to the /etc/bash_completion.d/ directory
on Linux systems. This is a directory of scripts that Bash will
automatically load to provide shell completions.
If you're using Windows with Git Bash, which is the default when
installing Git on Windows with msysGit, auto-completion should be
preconfigured.
Press the Tab key when you're writing a Git command, and it
should return a set of suggestions for you to pick from:
$ git co<tab><tab>
commit config
In this case, typing git co and then pressing the Tab key twice
suggests commit and config. Adding m<tab> completes git commit
automatically.
This also works with options, which is probably more useful. For
instance, if you're running a git log command and can't remember
one of the options, you can start typing it and press Tab to see what
matches:
$ git log --s<tab>
--shortstat --since= --src-prefix= --stat --summary
That's a pretty nice trick and may save you some time and
documentation reading.
2.7.2 Git Aliases
Git doesn't infer your command if you type it in partially. If you don't
want to type the entire text of each of the Git commands, you can
easily set up an alias for each command using git config. Here are a
couple of examples you may want to set up:
$ git config --global alias.co checkout
$ git config --global alias.br branch
$ git config --global alias.ci commit
$ git config --global alias.st status
This means that, for example, instead of typing git commit, you just
need to type git ci. As you go on using Git, you'll probably use other
commands frequently as well; in this case, don't hesitate to create
new aliases.
This technique can also be very useful in creating commands that
you think should exist. For example, to correct the usability problem
you encountered with unstaging a file, you can add your own unstage
alias to Git:
$ git config --global alias.unstage 'reset HEAD --'
This makes the following two commands equivalent:
$ git unstage fileA
$ git reset HEAD fileA
This seems a bit clearer. It's also common to add a last command,
like this:
$ git config --global alias.last 'log -1 HEAD'
This way, you can see the last commit easily:
$ git last
commit 66938dae3329c7aebe598c2246a8e6af90d04646
Author: Josh Goebel <dreamer3@example.com>
Date:   Tue Aug 26 19:48:51 2008 +0800

    test for current head

    Signed-off-by: Scott Chacon <schacon@example.com>
As you can tell, Git simply replaces the new command with whatever
you alias it for. However, maybe you want to run an external
command, rather than a Git subcommand. In that case, you start
the command with a ! character. This is useful if you write your
own tools that work with a Git repository. We can demonstrate by
aliasing git visual to run gitk:
$ git config --global alias.visual "!gitk"
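If you'd like to try aliases without touching your global configuration, the following is a minimal sketch. It assumes a throwaway directory created with mktemp and omits --global, so the aliases are stored only in that repository's own configuration. The alias names are the ones used above.

```shell
set -e
cd "$(mktemp -d)"          # throwaway directory, an assumption of this sketch
git init -q

# Without --global, each alias lands in this repository's .git/config only.
git config alias.st status
git config alias.unstage 'reset HEAD --'
git config alias.last 'log -1 HEAD'

# Read the definitions back to see what Git will substitute for each alias.
st_alias=$(git config --get alias.st)
unstage_alias=$(git config --get alias.unstage)
git config --get-regexp '^alias\.'
```

Reading an alias back with git config --get is a quick way to check what a short name expands to before you rely on it.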
2.8 Summary
At this point, you can do all the basic local Git operations: creating
or cloning a repository, making changes, staging and committing
those changes, and viewing the history of all the changes the
repository has been through. Next, we'll cover Git's killer feature:
its branching model.
Chapter 3
Git Branching
Nearly every VCS has some form of branching support. Branching
means you diverge from the main line of development and continue
to do work without messing with that main line. In many VCS tools,
this is a somewhat expensive process, often requiring you to create
a new copy of your source code directory, which can take a long time
for large projects.
Some people refer to the branching model in Git as its “killer
feature,” and it certainly sets Git apart in the VCS community. Why is it
so special? The way Git branches is incredibly lightweight, making
branching operations nearly instantaneous and switching back and
forth between branches generally just as fast. Unlike many other
VCSs, Git encourages a workflow that branches and merges often,
even multiple times in a day. Understanding and mastering this
feature gives you a powerful and unique tool and can literally change
the way that you develop.
3.1 What a Branch Is
To really understand the way Git does branching, we need to take a
step back and examine how Git stores its data. As you may remember
from Chapter 1, Git doesn't store data as a series of changesets or
deltas, but instead as a series of snapshots.
When you commit in Git, Git stores a commit object that contains
a pointer to the snapshot of the content you staged, the author and
message metadata, and zero or more pointers to the commit or
commits that were the direct parents of this commit: zero parents for the
first commit, one parent for a normal commit, and multiple parents
for a commit that results from a merge of two or more branches.
To visualize this, let's assume that you have a directory containing
three files, and you stage them all and commit. Staging the files
checksums each one (the SHA-1 hash we mentioned in Chapter 1),
stores that version of the file in the Git repository (Git refers to them
as blobs), and adds that checksum to the staging area:
$ git add README test.rb LICENSE
$ git commit -m 'initial commit of my project'
When you create the commit by running git commit, Git checksums
each subdirectory (in this case, just the root project directory) and
stores those tree objects in the Git repository. Git then creates a
commit object that has the metadata and a pointer to the root project
tree so it can re-create that snapshot when needed.
Your Git repository now contains five objects: one blob for the
contents of each of your three files, one tree that lists the contents
of the directory and specifies which file names are stored as which
blobs, and one commit with the pointer to that root tree and all the
commit metadata. Conceptually, the data in your Git repository looks
something like Figure 3.1.
Figure 3.1: Single commit repository data
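You can check the five-object count yourself. The sketch below assumes a throwaway directory, an example identity supplied through environment variables, and three small files whose contents differ (identical contents would share a single blob).

```shell
set -e
export GIT_AUTHOR_NAME='Example User' GIT_AUTHOR_EMAIL='user@example.com'
export GIT_COMMITTER_NAME='Example User' GIT_COMMITTER_EMAIL='user@example.com'
cd "$(mktemp -d)"          # throwaway directory, an assumption of this sketch
git init -q

# Three files with different contents, so each gets its own blob.
printf 'readme\n'  > README
printf 'puts 1\n'  > test.rb
printf 'license\n' > LICENSE
git add README test.rb LICENSE
git commit -q -m 'initial commit of my project'

# Loose objects so far: 3 blobs + 1 tree + 1 commit.
object_count=$(git count-objects | cut -d' ' -f1)
commit_type=$(git cat-file -t HEAD)
tree_type=$(git cat-file -t 'HEAD^{tree}')
git cat-file -p 'HEAD^{tree}'   # the root tree maps file names to blobs
```

The last command prints the root tree, which is where the file names live; the blobs themselves store only content.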
If you make some changes and commit again, the next commit
stores a pointer to the commit that came immediately before it. After
two more commits, your history might look something like Figure
3.2.
Figure 3.2: Git object data for multiple commits
A branch in Git is simply a lightweight movable pointer to one of
these commits. The default branch name in Git is master. As you
initially make commits, you're given a master branch that points to
the last commit you made. Every time you commit, it moves forward
automatically.
Figure 3.3: Branch pointing into the commit data's history
What happens if you create a new branch? Well, doing so creates
a new pointer for you to move around. Let's say you create a new
branch called testing. You do this with the git branch command:
$ git branch testing
This creates a new pointer at the same commit you're currently
on (see Figure 3.4).
Figure 3.4: Multiple branches pointing into the commit's data history
How does Git know what branch you're currently on? It keeps
a special pointer called HEAD. Note that this is a lot different than
the concept of HEAD in other VCSs you may be used to, such as
Subversion or CVS. In Git, this is a pointer to the local branch you're
currently on. In this case, you're still on master. The git branch
command only created a new branch; it didn't switch to that branch
(see Figure 3.5).
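To see both facts at once (the new branch points at the same commit, and HEAD has not moved), you can run something like the sketch below. The throwaway directory, the example identity, and the empty first commit are assumptions made to keep it short. The git symbolic-ref line only forces the first branch to be named master regardless of local defaults.

```shell
set -e
export GIT_AUTHOR_NAME='Example User' GIT_AUTHOR_EMAIL='user@example.com'
export GIT_COMMITTER_NAME='Example User' GIT_COMMITTER_EMAIL='user@example.com'
cd "$(mktemp -d)"                          # throwaway directory (assumed)
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git commit -q --allow-empty -m 'first commit'

git branch testing                         # new pointer, same commit

master_commit=$(git rev-parse master)
testing_commit=$(git rev-parse testing)
head_ref=$(git symbolic-ref HEAD)          # the branch HEAD points to
echo "master=$master_commit testing=$testing_commit HEAD=$head_ref"
```

Both branch names resolve to the same commit ID, and HEAD still names refs/heads/master.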
To switch to an existing branch, you run the git checkout command.
Let's switch to the new testing branch:
$ git checkout testing
Figure 3.5: HEAD file pointing to the branch you're on
Figure 3.6: HEAD points to another branch when you switch branches.
This moves HEAD to point to the testing branch (see Figure 3.6).
What is the significance of that? Well, let's do another commit:
$ vim test.rb
$ git commit -a -m 'made a change'
Figure 3.7 illustrates the result.
This is interesting, because now your testing branch has moved
forward, but your master branch still points to the commit you were
on when you ran git checkout to switch branches. Let's switch back
to the master branch:
$ git checkout master
Figure 3.8 shows the result.
Figure 3.7: The branch that HEAD points to moves forward with each commit.
Figure 3.8: HEAD moves to another branch on a checkout.
That command did two things. It moved the HEAD pointer back to
point to the master branch, and it reverted the files in your working
directory back to the snapshot that master points to. This also means
the changes you make from this point forward will diverge from an
older version of the project. It essentially rewinds the work you've
done in your testing branch temporarily so you can go in a different
direction.
Let's make a few changes and commit again:
$ vim test.rb
$ git commit -a -m 'made other changes'
Now your project history has diverged (see Figure 3.9). You
created and switched to a branch, did some work on it, and then switched
back to your main branch and did other work. Both of those changes
are isolated in separate branches: you can switch back and forth
between the branches and merge them together when you're ready.
And you did all that with simple branch and checkout commands.
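The whole walkthrough can be replayed in a scratch repository. This is a sketch under a few assumptions: a throwaway directory, an example identity, and echo in place of editing test.rb in vim. It then asks Git how many commits each branch has that the other lacks.

```shell
set -e
export GIT_AUTHOR_NAME='Example User' GIT_AUTHOR_EMAIL='user@example.com'
export GIT_COMMITTER_NAME='Example User' GIT_COMMITTER_EMAIL='user@example.com'
cd "$(mktemp -d)"                          # throwaway directory (assumed)
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
echo 'v1' > test.rb
git add test.rb
git commit -q -m 'initial commit'

git branch testing
git checkout -q testing
echo 'testing change' >> test.rb           # stands in for editing in vim
git commit -q -a -m 'made a change'

git checkout -q master                     # working directory rewinds to v1
echo 'master change' >> test.rb
git commit -q -a -m 'made other changes'

# Each branch now has one commit the other does not.
only_on_testing=$(git rev-list --count master..testing)
only_on_master=$(git rev-list --count testing..master)
fork_point=$(git merge-base master testing)
git log --oneline --graph --all
```

The final git log draws the fork, with both branch tips hanging off the initial commit.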
Because a branch in Git is in actuality a simple file that contains
the 40 character SHA-1 checksum of the commit it points to, branches
are cheap to create and destroy. Creating a new branch is as quick
and simple as writing 41 bytes to a file (40 characters and a newline).
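You can measure those 41 bytes directly. The sketch below assumes the classic on-disk layout: a SHA-1 repository whose branches are stored as loose files under .git/refs/heads. The two GIT_DEFAULT_* variables request that layout on newer Git versions, and older versions ignore them. Branches that have been packed into .git/packed-refs will not have such a file.

```shell
set -e
# Assumption: classic SHA-1 repository with loose ref files.
export GIT_DEFAULT_HASH=sha1 GIT_DEFAULT_REF_FORMAT=files
export GIT_AUTHOR_NAME='Example User' GIT_AUTHOR_EMAIL='user@example.com'
export GIT_COMMITTER_NAME='Example User' GIT_COMMITTER_EMAIL='user@example.com'
cd "$(mktemp -d)"                          # throwaway directory (assumed)
git init -q
git commit -q --allow-empty -m 'first commit'
git branch testing

# The branch is just a small file holding the commit's checksum.
ref_file=.git/refs/heads/testing
ref_bytes=$(wc -c < "$ref_file" | tr -d ' ')
ref_content=$(cat "$ref_file")
echo "$ref_file: $ref_bytes bytes -> $ref_content"
```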
Figure 3.9: The branch histories have diverged.
This is in sharp contrast to the way most VCS tools branch, which
involves copying all of the project's files into a second directory. This
can take several seconds or even minutes, depending on the size
of the project, whereas in Git the process is always instantaneous.
Also, because we're recording the parents when we commit, finding
a proper merge base for merging is automatically done for us and is
generally very easy to do. These features help encourage developers
to create and use branches often.
Let's see why you should do so.
3.2 Basic Branching and Merging
Let's go through a simple example of branching and merging with
a workflow that you might use in the real world. You'll follow these
steps:
1. Do work on a web site.
2. Create a branch for a new story you're working on.
3. Do some work in that branch.
At this stage, you'll receive a call that another issue is critical and
you need a hotfix. You'll do the following:
1. Revert back to your production branch.
2. Create a branch to add the hotfix.
3. After it's tested, merge the hotfix branch, and push to production.
4. Switch back to your original story and continue working.
3.2.1 Basic Branching
First, let's say you're working on your project and have a couple of
commits already (see Figure 3.10).
Figure 3.10: A short and simple commit history
You've decided that you're going to work on issue #53 in whatever
issue-tracking system your company uses. To be clear, Git isn't tied
into any particular issue-tracking system; but because issue #53 is
a focused topic that you want to work on, you'll create a new branch
in which to work. To create a branch and switch to it at the same
time, you can run the git checkout command with the -b switch:
$ git checkout -b iss53
Switched to a new branch "iss53"
This is shorthand for:
$ git branch iss53
$ git checkout iss53
Figure 3.11 illustrates the result.
Figure 3.11: Creating a new branch pointer
You work on your web site and do some commits. Doing so moves
the iss53 branch forward, because you have it checked out (that is,
your HEAD is pointing to it; see Figure 3.12):
$ vim index.html
$ git commit -a -m 'added a new footer [issue 53]'
Figure 3.12: The iss53 branch has moved forward with your work.
Now you get the call that there is an issue with the web site, and
you need to fix it immediately. With Git, you don't have to deploy your
fix along with the iss53 changes you've made, and you don't have to
put a lot of effort into reverting those changes before you can work
on applying your fix to what is in production. All you have to do is
switch back to your master branch.
However, before you do that, note that if your working directory or
staging area has uncommitted changes that conflict with the branch
you're checking out, Git won't let you switch branches. It's best to
have a clean working state when you switch branches. There are
ways to get around this (namely, stashing and commit amending)
that we'll cover later. For now, you've committed all your changes,
so you can switch back to your master branch:
$ git checkout master
Switched to branch "master"
At this point, your project working directory is exactly the way it
was before you started working on issue #53, and you can
concentrate on your hotfix. This is an important point to remember: Git
resets your working directory to look like the snapshot of the
commit that the branch you check out points to. It adds, removes, and
modifies files automatically to make sure your working copy is what
the branch looked like on your last commit to it.
Next, you have a hotfix to make. Let's create a hotfix branch on
which to work until it's completed (see Figure 3.13):
$ git checkout -b 'hotfix'
Switched to a new branch "hotfix"
$ vim index.html
$ git commit -a -m 'fixed the broken email address'
[hotfix]: created 3a0874c: "fixed the broken email address"
1 files changed, 0 insertions(+), 1 deletions(-)
You can run your tests, make sure the hotfix is what you want, and
merge it back into your master branch to deploy to production. You
do this with the git merge command:
$ git checkout master
$ git merge hotfix
Updating f42c576..3a0874c
Fast forward
README | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
Figure 3.13: hotfix branch based back at your master branch point
You'll notice the phrase “Fast forward” in that merge. Because
the commit pointed to by the branch you merged in was directly
upstream of the commit you're on, Git moves the pointer forward. To
phrase that another way, when you try to merge one commit with a
commit that can be reached by following the first commit's history,
Git simplifies things by moving the pointer forward because there is
no divergent work to merge together; this is called a “fast forward”.
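A fast-forward can be confirmed before and after the fact. The sketch below assumes a throwaway directory, an example identity, and made-up file contents. It first asks whether master is an ancestor of hotfix, then merges with --ff-only, which refuses to do anything other than move the pointer.

```shell
set -e
export GIT_AUTHOR_NAME='Example User' GIT_AUTHOR_EMAIL='user@example.com'
export GIT_COMMITTER_NAME='Example User' GIT_COMMITTER_EMAIL='user@example.com'
cd "$(mktemp -d)"                          # throwaway directory (assumed)
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
echo 'contact: old address' > index.html
git add index.html
git commit -q -m 'initial commit'

git checkout -q -b hotfix
echo 'contact: new address' > index.html
git commit -q -a -m 'fixed the broken email address'

git checkout -q master
# master is reachable from hotfix, so there is no divergent work to merge.
if git merge-base --is-ancestor master hotfix; then ancestor=yes; else ancestor=no; fi
git merge -q --ff-only hotfix

master_commit=$(git rev-parse master)
hotfix_commit=$(git rev-parse hotfix)
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
echo "ancestor=$ancestor parents=$parent_count"
```

After the merge both branch names resolve to the same commit, and that commit still has a single parent: no merge commit was created.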
Your change is now in the snapshot of the commit pointed to by
the master branch, and you can deploy your change (see Figure 3.14).
Figure 3.14: Your master branch points to the same place as your hotfix branch after the merge.
After your super-important fix is deployed, you're ready to
switch back to the work you were doing before you were interrupted.
However, first you'll delete the hotfix branch, because you no longer
need it; the master branch points at the same place. You can delete
it with the -d option to git branch:
$ git branch -d hotfix
Deleted branch hotfix (3a0874c).
Now you can switch back to your work-in-progress branch on
issue #53 and continue working on it (see Figure 3.15):
$ git checkout iss53
Switched to branch "iss53"
$ vim index.html
$ git commit -a -m 'finished the new footer [issue 53]'
[iss53]: created ad82d7a: "finished the new footer [issue 53]"
1 files changed, 1 insertions(+), 0 deletions(-)
Figure 3.15: Your iss53 branch can move forward independently.
It's worth noting here that the work you did in your hotfix branch
is not contained in the files in your iss53 branch. If you need to pull
it in, you can merge your master branch into your iss53 branch by
running git merge master, or you can wait to integrate those changes
until you decide to pull the iss53 branch back into master later.
3.2.2 Basic Merging
Suppose you've decided that your issue #53 work is complete and
ready to be merged into your master branch. In order to do that, you'll
merge in your iss53 branch, much like you merged in your hotfix
branch earlier. All you have to do is check out the branch you wish
to merge into and then run the git merge command:
$ git checkout master
$ git merge iss53
Merge made by recursive.
README | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
This looks a bit different than the hotfix merge you did earlier. In
this case, your development history has diverged from some older
point. Because the commit on the branch you're on isn't a direct
ancestor of the branch you're merging in, Git has to do some work. In
this case, Git does a simple three-way merge, using the two
snapshots pointed to by the branch tips and the common ancestor of the
two. Figure 3.16 highlights the three snapshots that Git uses to do
its merge in this case.
Figure 3.16: Git automatically identifies the best common-ancestor merge base for branch merging.
Instead of just moving the branch pointer forward, Git creates a
new snapshot that results from this three-way merge and
automatically creates a new commit that points to it (see Figure 3.17). This
is referred to as a merge commit and is special in that it has more
than one parent.
It's worth pointing out that Git determines the best common
ancestor to use for its merge base; this is different than CVS or
Subversion (before version 1.5), where the developer doing the merge
has to figure out the best merge base for themselves. This makes
merging a heck of a lot easier in Git than in these other systems.
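Both claims (Git picks the merge base, and the result has more than one parent) can be observed in a small sketch. It assumes a throwaway directory, an example identity, and invented file names; the two branches touch different files so the merge completes without conflicts.

```shell
set -e
export GIT_AUTHOR_NAME='Example User' GIT_AUTHOR_EMAIL='user@example.com'
export GIT_COMMITTER_NAME='Example User' GIT_COMMITTER_EMAIL='user@example.com'
cd "$(mktemp -d)"                          # throwaway directory (assumed)
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
echo 'page' > index.html
git add index.html
git commit -q -m 'initial commit'
base=$(git rev-parse HEAD)                 # the commit both branches share

git checkout -q -b iss53
echo 'footer' > footer.html
git add footer.html
git commit -q -m 'added a new footer [issue 53]'

git checkout -q master
echo 'readme' > README
git add README
git commit -q -m 'unrelated work on master'

# Git works out the common ancestor on its own.
merge_base=$(git merge-base master iss53)
git merge -q --no-edit iss53               # histories diverged: real merge

parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
echo "merge base: $merge_base, parents of merge commit: $parent_count"
```

The merge commit records two parent lines, one for each branch tip, and the working directory now holds the files from both sides.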
Figure 3.17: Git automatically creates a new commit object that contains the merged work.
Now that your work is merged in, you have no further need for the
iss53 branch. You can delete it and then manually close the ticket in
your ticket-tracking system:
$ git branch -d iss53
3.2.3 Basic Merge Conflicts
Occasionally, this process doesn't go smoothly. If you changed the
same part of the same file differently in the two branches you're
merging together, Git won't be able to merge them cleanly. If your
fix for issue #53 modified the same part of a file as the hotfix, you'll
get a merge conflict that looks something like this:
$ git merge iss53
Auto-merging index.html
CONFLICT (content): Merge conflict in index.html
Automatic merge failed; fix conflicts and then commit the result.
Git hasn't automatically created a new merge commit. It has
paused the process while you resolve the conflict. If you want to
see which files are unmerged at any point after a merge conflict, you
can run git status:
[master*]$ git status
index.html: needs merge
# On branch master
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
# (use "git checkout -- <file>..." to discard changes in working directory)
#
# unmerged: index.html
#
Anything that has merge conflicts and hasn't been resolved is
listed as unmerged. Git adds standard conflict-resolution markers
to the files that have conflicts, so you can open them manually and
resolve those conflicts. Your file contains a section that looks
something like this:
<<<<<<< HEAD:index.html
<div id="footer">contact : email.support@github.com</div>
=======
<div id="footer">
please contact us at support@github.com
</div>
>>>>>>> iss53:index.html
This means the version in HEAD (your master branch, because
that was what you had checked out when you ran your merge
command) is the top part of that block (everything above the =======),
while the version in your iss53 branch looks like everything in the
bottom part. In order to resolve the conflict, you have to either choose
one side or the other or merge the contents yourself. For instance,
you might resolve this conflict by replacing the entire block with this:
<div id="footer">
please contact us at email.support@github.com
</div>
This resolution has a little of each section, and I've fully removed
the <<<<<<<, =======, and >>>>>>> lines. After you've resolved each of
these sections in each conflicted file, run git add on each file to mark
it as resolved. Staging the file marks it as resolved in Git. If you
want to use a graphical tool to resolve these issues, you can run git
mergetool, which fires up an appropriate visual merge tool and walks
you through the conflicts:
$ git mergetool
merge tool candidates: kdiff3 tkdiff xxdiff meld gvimdiff opendiff emerge vimdiff
Merging the files: index.html
Normal merge conflict for 'index.html':
{local}: modified
{remote}: modified
Hit return to start merge resolution tool (opendiff):
If you want to use a merge tool other than the default (Git chose
opendiff for me in this case because I ran the command on a Mac),
you can see all the supported tools listed at the top after “merge
tool candidates”. Type the name of the tool you'd rather use. In
Chapter 7, we'll discuss how you can change this default value for
your environment.
After you exit the merge tool, Git asks you if the merge was
successful. If you tell the script that it was, it stages the file to mark it
as resolved for you.
You can run git status again to verify that all conflicts have been
resolved:
$ git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# modified: index.html
#
If you're happy with that, and you verify that everything that had
conflicts has been staged, you can type git commit to finalize the
merge commit. The commit message by default looks something like
this:
Merge branch 'iss53'
Conflicts:
index.html
#
# It looks like you may be committing a MERGE.
# If this is not correct, please remove the file
# .git/MERGE_HEAD
# and try again.
#
You can modify that message with details about how you resolved
the merge if you think it would be helpful to others looking at this
merge in the future: why you did what you did, if it's not obvious.
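The conflict cycle (merge stops, you edit, you stage, you commit) can be rehearsed end to end. The sketch below assumes a throwaway directory, an example identity, and one-line file contents invented for the purpose. It resolves the conflict by overwriting the file rather than opening an editor, and lists unmerged paths with git ls-files -u.

```shell
set -e
export GIT_AUTHOR_NAME='Example User' GIT_AUTHOR_EMAIL='user@example.com'
export GIT_COMMITTER_NAME='Example User' GIT_COMMITTER_EMAIL='user@example.com'
cd "$(mktemp -d)"                          # throwaway directory (assumed)
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
echo 'contact: support@example.com' > index.html
git add index.html
git commit -q -m 'initial commit'

git checkout -q -b iss53
echo 'please contact us at support@example.com' > index.html
git commit -q -a -m 'reword footer [issue 53]'

git checkout -q master
echo 'contact: email.support@example.com' > index.html
git commit -q -a -m 'fixed the broken email address'

# Both branches changed the same line, so the merge stops with a conflict.
merge_status=0
git merge iss53 > /dev/null 2>&1 || merge_status=$?
unmerged=$(git ls-files -u | cut -f2 | sort -u)

# Resolve by writing the final content, then stage the file to mark it resolved.
echo 'please contact us at email.support@example.com' > index.html
git add index.html
still_unmerged=$(git ls-files -u | cut -f2 | sort -u)

git commit -q --no-edit                    # concludes the paused merge
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
echo "unmerged before: $unmerged, after: '$still_unmerged', parents: $parent_count"
```

Staging is the step that tells Git the conflict is settled; until then, git commit refuses to conclude the merge.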
3.3 Branch Management
Now that you've created, merged, and deleted some branches, let's
look at some branch-management tools that will come in handy when
you begin using branches all the time.
The git branch command does more than just create and delete
branches. If you run it with no arguments, you get a simple listing
of your current branches:
$ git branch
iss53
* master
testing
Notice the * character that prefixes the master branch: it indicates
the branch that you currently have checked out. This means that if
you commit at this point, the master branch will be moved forward
with your new work. To see the last commit on each branch, you can
run git branch -v:
$ git branch -v
iss53 93b412c fix javascript issue
* master 7a98805 Merge branch 'iss53'
testing 782fd34 add scott to the author list in the readmes
Another useful option to figure out what state your branches are in
is to filter this list to branches that you have or have not yet merged
into the branch you're currently on. The useful --merged and
--no-merged options have been available in Git since version 1.5.6 for this
purpose. To see which branches are already merged into the branch
you're on, you can run git branch --merged:
$ git branch --merged
iss53
* master
Because you already merged in iss53 earlier, you see it in your list.
Branches on this list without the * in front of them are generally fine
to delete with git branch -d; you've already incorporated their work
into another branch, so you're not going to lose anything.
To see all the branches that contain work you haven't yet merged
in, you can run git branch --no-merged:
$ git branch --no-merged
testing
This shows your other branch. Because it contains work that isn't
merged in yet, trying to delete it with git branch -d will fail:
$ git branch -d testing
error: The branch 'testing' is not an ancestor of your current HEAD.
If you are sure you want to delete it, run git branch -D testing.
If you really do want to delete the branch and lose that work, you can
force it with -D, as the helpful message points out.
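Here is a compact way to watch --merged, --no-merged, -d, and -D interact. The sketch assumes a throwaway directory, an example identity, and empty commits standing in for real work. The branch names follow the listing above.

```shell
set -e
export GIT_AUTHOR_NAME='Example User' GIT_AUTHOR_EMAIL='user@example.com'
export GIT_COMMITTER_NAME='Example User' GIT_COMMITTER_EMAIL='user@example.com'
cd "$(mktemp -d)"                          # throwaway directory (assumed)
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git commit -q --allow-empty -m 'initial commit'

git branch iss53                           # points at a commit master already contains
git checkout -q -b testing
git commit -q --allow-empty -m 'work that only exists on testing'
git checkout -q master

# Strip the "* " marker and indentation so only branch names remain.
merged=$(git branch --merged | tr -d '* ' | sort)
not_merged=$(git branch --no-merged | tr -d '* ' | sort)

# -d refuses to delete a branch with unmerged work; -D forces it.
if git branch -d testing 2>/dev/null; then safe_delete=ok; else safe_delete=refused; fi
git branch -q -D testing
remaining=$(git branch | tr -d '* ' | sort)
echo "merged: $merged / not merged: $not_merged / -d: $safe_delete"
```

The exact wording of the refusal differs between Git versions, which is why the sketch checks the exit status rather than the message.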
3.4 Branching Workflows
Now that you have the basics of branching and merging down, what
can or should you do with them? In this section, we'll cover some
common workflows that this lightweight branching makes possible,
so you can decide if you would like to incorporate it into your own
development cycle.
3.4.1 Long-Running Branches
Because Git uses a simple three-way merge, merging from one branch
into another multiple times over a long period is generally easy to do.
This means you can have several branches that are always open and
that you use for different stages of your development cycle; you can
merge regularly from some of them into others.
Many Git developers have a workflow that embraces this approach,
such as having only code that is entirely stable in their master branch
(possibly only code that has been or will be released). They have
another parallel branch named develop or next that they work from
or use to test stability; it isn't necessarily always stable, but
whenever it gets to a stable state, it can be merged into master. It's used
to pull in topic branches (short-lived branches, like your earlier iss53
branch) when they're ready, to make sure they pass all the tests and
don't introduce bugs.
In reality, we're talking about pointers moving up the line of
commits you're making. The stable branches are farther down the line
in your commit history, and the bleeding-edge branches are farther
up the history (see Figure 3.18).
Figure 3.18: More stable branches are generally farther down the commit history.
It's generally easier to think about them as work silos, where sets
of commits graduate to a more stable silo when they're fully tested
(see Figure 3.19).
Figure 3.19: It may be helpful to think of your branches as silos.
You can keep doing this for several levels of stability. Some larger
projects also have a proposed or pu (proposed updates) branch that has
integrated branches that may not be ready to go into the next or master
branch. The idea is that your branches are at various levels of
stability; when they reach a more stable level, they're merged into the
branch above them. Again, having multiple long-running branches
isn't necessary, but it's often helpful, especially when you're dealing
with very large or complex projects.
3.4.2 Topic Branches
Topic branches, however, are useful in projects of any size. A topic
branch is a short-lived branch that you create and use for a single
particular feature or related work. This is something you've likely
never done with a VCS before because it's generally too expensive
to create and merge branches. But in Git it's common to create, work
on, merge, and delete branches several times a day.
You saw this in the last section with the iss53 and hotfix branches
you created. You did a few commits on them and deleted them
directly after merging them into your main branch. This technique
allows you to context-switch quickly and completely. Because your
work is separated into silos where all the changes in that branch have
to do with that topic, it's easier to see what has happened during code
review and such. You can keep the changes there for minutes, days,
or months, and merge them in when they're ready, regardless of the
order in which they were created or worked on.
Consider an example of doing some work (on master), branching off
for an issue (iss91), working on it for a bit, branching off the second
branch to try another way of handling the same thing (iss91v2), going
back to your master branch and working there for a while, and then
branching off there to do some work that you're not sure is a good
idea (dumbidea branch). Your commit history will look something like
Figure 3.20.
Figure 3.20: Your commit history with multiple topic branches
Now, let's say you decide you like the second solution to your issue
best (iss91v2); and you showed the dumbidea branch to your coworkers,
and it turns out to be genius. You can throw away the original iss91
branch (losing commits C5 and C6) and merge in the other two. Your
history then looks like Figure 3.21.
It's important to remember when you're doing all this that these
branches are completely local. When you're branching and
merging, everything is being done only in your Git repository; no server
communication is happening.
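The topic-branch scenario above can be approximated in a few lines. This is a rough sketch, not a reproduction of the figures: it assumes a throwaway directory, an example identity, empty commits in place of real work, and a hypothetical commit helper function. The commit counts do not match C1 through C6 exactly.

```shell
set -e
export GIT_AUTHOR_NAME='Example User' GIT_AUTHOR_EMAIL='user@example.com'
export GIT_COMMITTER_NAME='Example User' GIT_COMMITTER_EMAIL='user@example.com'
cd "$(mktemp -d)"                          # throwaway directory (assumed)
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master

# Hypothetical helper: empty commits keep the sketch short.
commit() { git commit -q --allow-empty -m "$1"; }

commit 'start of the project'
git checkout -q -b iss91;    commit 'first attempt, part 1'
git checkout -q -b iss91v2;  commit 'second attempt'
git checkout -q iss91;       commit 'first attempt, part 2'
abandoned=$(git rev-parse iss91)           # remember the tip we will discard
git checkout -q master;      commit 'more work on master'
git checkout -q -b dumbidea; commit 'an experiment'

git checkout -q master
git merge -q --no-edit dumbidea            # fast-forward
git merge -q --no-edit iss91v2             # diverged, so a merge commit
git branch -q -D iss91                     # discard the first attempt
git log --oneline --graph
```

Everything here happens in the local repository; nothing contacts a server.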
3.5 Remote Branches
Remote branches are references to the state of branches on your
remote repositories. They're local branches that you can't move;
they're moved automatically whenever you do any network
communication. Remote branches act as bookmarks to remind you where
the branches on your remote repositories were the last time you
connected to them.
They take the form (remote)/(branch). For instance, if you wanted
to see what the master branch on your origin remote looked like as of
the last time you communicated with it, you would check the
origin/master branch. If you were working on an issue with a partner and
they pushed up an iss53 branch, you might have your own local iss53
branch; but the branch on the server would point to the commit at
origin/iss53.
Figure 3.21: Your history after merging in dumbidea and iss91v2
This may be a bit confusing, so let's look at an example. Let's
say you have a Git server on your network at git.ourcompany.com. If
you clone from this, Git automatically names it origin for you, pulls
down all its data, creates a pointer to where its master branch is, and
names it origin/master locally; and you can't move it. Git also gives
you your own master branch starting at the same place as origin's
master branch, so you have something to work from (see Figure 3.22).
If you do some work on your local master branch, and, in the
meantime, someone else pushes to git.ourcompany.com and updates its
master branch, then your histories move forward differently. Also,
as long as you stay out of contact with your origin server, your
origin/master pointer doesn't move (see Figure 3.23).
To synchronize your work, you run a git fetch origin command.
This command looks up which server origin is (in this case, it's
git.ourcompany.com), fetches any data from it that you don't yet have,
and updates your local database, moving your origin/master pointer
to its new, more up-to-date position (see Figure 3.24).
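You don't need a real server to watch origin/master stay put and then move. In the sketch below a local bare repository stands in for git.ourcompany.com. The directory names (seed, origin.git, mine, theirs), the throwaway location, and the example identity are all assumptions.

```shell
set -e
export GIT_AUTHOR_NAME='Example User' GIT_AUTHOR_EMAIL='user@example.com'
export GIT_COMMITTER_NAME='Example User' GIT_COMMITTER_EMAIL='user@example.com'
cd "$(mktemp -d)"                          # throwaway directory (assumed)

# A bare repository plays the role of the server.
git init -q seed
( cd seed && git symbolic-ref HEAD refs/heads/master \
  && git commit -q --allow-empty -m 'initial commit' )
git clone -q --bare seed origin.git
git clone -q origin.git mine               # your clone
git clone -q origin.git theirs             # a collaborator's clone

# Someone else pushes new work to the server.
( cd theirs && git commit -q --allow-empty -m 'their work' \
  && git push -q origin master )

cd mine
before=$(git rev-parse origin/master)      # stale: no contact with origin yet
git fetch -q origin
after=$(git rev-parse origin/master)       # moved to the server's new tip
server_tip=$(git --git-dir=../origin.git rev-parse master)
local_master=$(git rev-parse master)       # your own branch did not move
echo "origin/master: $before -> $after"
```

Note that the fetch moves only origin/master. Your local master stays where it was until you merge.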
Figure 3.22: A Git clone gives you your own master branch and origin/master pointing to origin's master branch.
Figure 3.23: Working locally and having someone push to your remote server makes each history move forward differently.
To demonstrate having multiple remote servers and what remote
branches for those remote projects look like, let's assume you have
another internal Git server that is used only for development by one
of your sprint teams. This server is at git.team1.ourcompany.com. You
can add it as a new remote reference to the project you're currently
working on by running the git remote add command as we covered in
Chapter 2. Name this remote teamone, which will be your shortname
for that whole URL (see Figure 3.25).
Figure 3.24: The git fetch command updates your remote references.
Figure 3.25: Adding another server as a remote
Now, you can run git fetch teamone to fetch everything the teamone
server has that you don't have yet. Because that server has a subset of
the data your origin server has right now, Git fetches no data but sets
a remote branch called teamone/master to point to the commit that
teamone has as its master branch (see Figure 3.26).
Figure 3.26: You get a reference to teamone's master branch position locally.
3.5.1 Pushing
When you want to share a branch with the world, you need to push
it up to a remote that you have write access to. Your local branches
aren't automatically synchronized to the remotes you write to; you
have to explicitly push the branches you want to share. That way,
you can use private branches for work you don't want to share, and
push up only the topic branches you want to collaborate on.
If you have a branch named serverfix that you want to work on
with others, you can push it up the same way you pushed your first
branch. Run git push (remote) (branch):
$ git push origin serverfix
Counting objects: 20, done.
Compressing objects: 100% (14/14), done.
Writing objects: 100% (15/15), 1.74 KiB, done.
Total 15 (delta 5), reused 0 (delta 0)
To git@github.com:schacon/simplegit.git
* [new branch] serverfix -> serverfix
This is a bit of a shortcut. Git automatically expands the serverfix
branch name out to refs/heads/serverfix:refs/heads/serverfix, which
means, “Take my serverfix local branch and push it to update the
remote's serverfix branch.” We'll go over the refs/heads/ part in
detail in Chapter 9, but you can generally leave it off. You can also
do git push origin serverfix:serverfix, which does the same thing; it
says, “Take my serverfix and make it the remote's serverfix.” You can
use this format to push a local branch into a remote branch that is
named differently. If you didn't want it to be called serverfix on the
remote, you could instead run git push origin serverfix:awesomebranch
to push your local serverfix branch to the awesomebranch branch on the
remote project.
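You can try both refspec forms without a real server. The following is a minimal sketch, assuming a scratch bare repository under a temporary directory stands in for the remote; the paths, the file name, and the identity values are made up for illustration.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)                       # scratch area (hypothetical paths)
trap 'rm -rf "$work"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# A bare repository plays the role of the remote server.
git init -q --bare "$work/server.git"

# A local repository with one commit on a branch named serverfix.
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/serverfix
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "work on serverfix"
git remote add origin "$work/server.git"

# Short form: Git expands this to refs/heads/serverfix:refs/heads/serverfix.
git push -q origin serverfix

# Explicit form: push local serverfix to a differently named remote branch.
git push -q origin serverfix:awesomebranch

# Both remote branches now point at the same commit.
git ls-remote --heads origin
```

Both lines printed by git ls-remote carry the same commit ID, because the two pushes sent the same local branch under two remote names.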
The next time one of your collaborators fetches from the server,
they will get a reference to where the server's version of serverfix
is under the remote branch origin/serverfix:
$ git fetch origin
remote: Counting objects: 20, done.
remote: Compressing objects: 100% (14/14), done.
remote: Total 15 (delta 5), reused 0 (delta 0)
Unpacking objects: 100% (15/15), done.
From git@github.com:schacon/simplegit
* [new branch] serverfix -> origin/serverfix
It's important to note that when you do a fetch that brings down
new remote branches, you don't automatically have local, editable
copies of them. In other words, in this case, you don't have a new
serverfix branch; you only have an origin/serverfix pointer that you
can't modify.
To merge this work into your current working branch, you can run
git merge origin/serverfix. If you want your own serverfix branch that
you can work on, you can base it off your remote branch:
$ git checkout -b serverfix origin/serverfix
Branch serverfix set up to track remote branch refs/remotes/origin/serverfix.
Switched to a new branch "serverfix"
This gives you a local branch that you can work on that starts
where origin/serverfix is.
3.5.2 Tracking Branches
Checking out a local branch from a remote branch automatically
creates what is called a tracking branch. Tracking branches are local
branches that have a direct relationship to a remote branch. If you're
on a tracking branch and type git push, Git automatically knows
which server and branch to push to. Also, running git pull while
on one of these branches fetches all the remote references and then
automatically merges in the corresponding remote branch.
When you clone a repository, it generally automatically creates a
master branch that tracks origin/master. That's why git push and git
pull work out of the box with no other arguments. However, you can
set up other tracking branches if you wish: ones that don't track
branches on origin and don't track the master branch. The simple
case is the example you just saw, running git checkout -b [branch]
[remotename]/[branch]. If you have Git version 1.6.2 or later, you can
also use the --track shorthand:
$ git checkout --track origin/serverfix
Branch serverfix set up to track remote branch refs/remotes/origin/serverfix.
Switched to a new branch "serverfix"
To set up a local branch with a different name than the remote
branch, you can easily use the first version with a different local
branch name:
$ git checkout -b sf origin/serverfix
Branch sf set up to track remote branch refs/remotes/origin/serverfix.
Switched to a new branch "sf"
Now, your local branch sf will automatically push to and pull from
origin/serverfix.
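To confirm which remote branch a local branch tracks, you can ask Git directly. This is a small sketch under stated assumptions: two throwaway repositories in a temporary directory play the server and the collaborator's clone, and the @{upstream} revision syntax requires a Git version newer than the 1.6.2 mentioned above.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)                       # scratch area (hypothetical paths)
trap 'rm -rf "$work"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# "Server" repository with a master branch and a serverfix branch.
git init -q "$work/server"
cd "$work/server"
git symbolic-ref HEAD refs/heads/master
echo base > README
git add README
git commit -q -m "initial commit"
git branch serverfix

# A collaborator clones it; at first the clone only has origin/serverfix.
git clone -q "$work/server" "$work/clone"
cd "$work/clone"

# Create a local branch sf that starts at, and tracks, origin/serverfix.
git checkout -q -b sf origin/serverfix

# Ask Git which remote branch sf is tracking.
upstream=$(git rev-parse --abbrev-ref 'sf@{upstream}')
echo "sf tracks $upstream"
```

The local name (sf) and the remote name (serverfix) are independent; only the tracking relationship ties them together.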
3.5.3 Deleting Remote Branches
Suppose you're done with a remote branch: say, you and your
collaborators are finished with a feature and have merged it into your
remote's master branch (or whatever branch your stable codeline is
in). You can delete a remote branch using the rather obtuse syntax
git push [remotename] :[branch]. If you want to delete your serverfix
branch from the server, you run the following:
$ git push origin :serverfix
To git@github.com:schacon/simplegit.git
- [deleted] serverfix
Boom. No more branch on your server. You may want to dog-ear
this page, because you'll need that command, and you'll likely
forget the syntax. A way to remember this command is by recalling
the git push [remotename] [localbranch]:[remotebranch] syntax that we
went over a bit earlier. If you leave off the [localbranch] portion, then
you're basically saying, “Take nothing on my side and make it be
[remotebranch].”
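The empty-source refspec is easy to verify in a sandbox. The sketch below assumes a scratch bare repository as the remote (all paths and identity values are illustrative) and lists the remote's branches after the delete.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)                       # scratch area (hypothetical paths)
trap 'rm -rf "$work"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git init -q --bare "$work/server.git"   # stands in for the remote

git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
echo base > README
git add README
git commit -q -m "initial commit"
git branch serverfix
git remote add origin "$work/server.git"
git push -q origin master serverfix     # the remote now has both branches

# Nothing before the colon: "take nothing and make it be serverfix".
git push -q origin :serverfix

# Only master is left on the remote.
remaining=$(git ls-remote --heads origin | cut -f2)
echo "$remaining"
```

Note that the local serverfix branch still exists after this; the push only removes the branch on the remote.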
3.6 Rebasing
In Git, there are two main ways to integrate changes from one branch
into another: the merge and the rebase. In this section you'll learn what
rebasing is, how to do it, why it's a pretty amazing tool, and in what
cases you won't want to use it.
3.6.1 The Basic Rebase
If you go back to an earlier example from the Merge section (see
Figure 3.27), you can see that you diverged your work and made
commits on two different branches.
Figure 3.27: Your initial diverged commit history
The easiest way to integrate the branches, as we've already covered,
is the merge command. It performs a three-way merge between
the two latest branch snapshots (C3 and C4) and the most recent
common ancestor of the two (C2), creating a new snapshot (and
commit), as shown in Figure 3.28.
Figure 3.28: Merging a branch to integrate the diverged work history
However, there is another way: you can take the patch of the
change that was introduced in C3 and reapply it on top of C4. In Git,
this is called rebasing. With the rebase command, you can take all
the changes that were committed on one branch and replay them on
another one.
In this example, you'd run the following:
$ git checkout experiment
$ git rebase master
First, rewinding head to replay your work on top of it...
Applying: added staged command
It works by going to the common ancestor of the two branches
(the one you're on and the one you're rebasing onto), getting the
diff introduced by each commit of the branch you're on, saving those
diffs to temporary files, resetting the current branch to the same
commit as the branch you are rebasing onto, and finally applying
each change in turn. Figure 3.29 illustrates this process.
Figure 3.29: Rebasing the change introduced in C3 onto C4
At this point, you can go back to the master branch and do a
fast-forward merge (see Figure 3.30).
Figure 3.30: Fast-forwarding the master branch
Now, the snapshot pointed to by the rebased C3 is exactly the same
as the one that was pointed to by C5 in the merge example. There is no
difference in the end product of the integration, but rebasing makes
for a cleaner history. If you examine the log of a rebased branch, it
looks like a linear history: it appears that all the work happened in
series, even when it originally happened in parallel.
Often, you'll do this to make sure your commits apply cleanly on
a remote branch, perhaps in a project to which you're trying to
contribute but that you don't maintain. In this case, you'd do your
work in a branch and then rebase your work onto origin/master when
you were ready to submit your patches to the main project. That
way, the maintainer doesn't have to do any integration work, just
a fast-forward or a clean apply.
Note that the snapshot pointed to by the final commit you end up
with, whether it's the last of the rebased commits for a rebase or the
final merge commit after a merge, is the same snapshot; it's only
the history that is different. Rebasing replays changes from one line
of work onto another in the order they were introduced, whereas
merging takes the endpoints and merges them together.
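The claim that merge and rebase end at the same snapshot can be checked by comparing tree IDs. The following is a rough sketch, assuming a throwaway repository whose commits are labeled C2, C3, and C4 to mirror the figures; the file names and the extra branch called merged are invented for the demonstration.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)                       # scratch area (hypothetical paths)
trap 'rm -rf "$work"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
echo base > base.txt; git add base.txt; git commit -q -m C2

# Diverge: C3 on experiment, C4 on master (different files, no conflict).
git checkout -q -b experiment
echo exp > exp.txt; git add exp.txt; git commit -q -m C3
git checkout -q master
echo main > main.txt; git add main.txt; git commit -q -m C4

# Option 1: merge, done on a throwaway branch so master stays at C4.
git checkout -q -b merged master
git merge -q --no-edit experiment
merge_tree=$(git rev-parse 'merged^{tree}')

# Option 2: rebase experiment onto master.
git checkout -q experiment
git rebase -q master
rebase_tree=$(git rev-parse 'experiment^{tree}')

echo "merge  snapshot: $merge_tree"
echo "rebase snapshot: $rebase_tree"
```

The two tree IDs match even though the commit graphs differ: the merged branch ends in a commit with two parents, while experiment is a straight line.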
3.6.2 More Interesting Rebases
You can also have your rebase replay on something other than the
rebase branch. Take a history like Figure 3.31, for example. You
branched a topic branch (server) to add some server-side
functionality to your project, and made a commit. Then, you branched off
that to make the client-side changes (client) and committed a few
times. Finally, you went back to your server branch and did a few
more commits.
Figure 3.31: A history with a topic branch off another topic branch
Suppose you decide that you want to merge your client-side changes
into your mainline for a release, but you want to hold off on the
server-side changes until they're tested further. You can take the changes
on client that aren't on server (C8 and C9) and replay them on your
master branch by using the --onto option of git rebase:
$ git rebase --onto master server client
This basically says, “Check out the client branch, figure out the
patches from the common ancestor of the client and server branches,
and then replay them onto master.” It's a bit complex, but the result,
shown in Figure 3.32, is pretty cool.
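To see exactly which commits --onto moves, you can rebuild a history of the same shape in a scratch repository. This is a sketch under assumptions: the commit helper function and the one-file-per-commit layout are invented so that every commit applies without conflicts, and the commit labels only loosely follow the figures.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)                       # scratch area (hypothetical paths)
trap 'rm -rf "$work"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# Hypothetical helper: one new file per commit, named after the commit.
commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
commit C1; commit C2
git checkout -q -b server; commit C3
git checkout -q -b client; commit C8; commit C9     # client branches off server
git checkout -q server;    commit C4; commit C10
git checkout -q master;    commit C5; commit C6

# Replay the commits on client that aren't on server onto master.
git rebase -q --onto master server client

# Commits that client now has on top of master, oldest first.
replayed=$(git log --reverse --format=%s master..client | paste -s -d ' ' -)
echo "replayed onto master: $replayed"
```

Only C8 and C9 are replayed; C3, which client shared with server, is left behind, so the file C3.txt is absent from the client working directory after the rebase.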
Figure 3.32: Rebasing a topic branch off another topic branch
Now you can fast-forward your master branch (see Figure 3.33):
$ git checkout master
$ git merge client
Figure 3.33: Fast-forwarding your master branch to include the client
branch changes
Let's say you decide to pull in your server branch as well. You can
rebase the server branch onto the master branch without having to
check it out first by running git rebase [basebranch] [topicbranch],
which checks out the topic branch (in this case, server) for you and
replays it onto the base branch (master):
$ git rebase master server
This replays your server work on top of your master work, as shown
in Figure 3.34.
Figure 3.34: Rebasing your server branch on top of your master branch
Then, you can fast-forward the base branch (master):
$ git checkout master
$ git merge server
You can remove the client and server branches because all the
work is integrated and you don't need them anymore, leaving your
history for this entire process looking like Figure 3.35:
$ git branch -d client
$ git branch -d server
Figure 3.35: Final commit history
3.6.3 The Perils of Rebasing
Ahh, but the bliss of rebasing isn't without its drawbacks, which can
be summed up in a single line:
Do not rebase commits that you have pushed to a public
repository.
If you follow that guideline, you'll be fine. If you don't, people will
hate you, and you'll be scorned by friends and family.
When you rebase stuff, you're abandoning existing commits and
creating new ones that are similar but different. If you push commits
somewhere and others pull them down and base work on them, and
then you rewrite those commits with git rebase and push them up
again, your collaborators will have to re-merge their work and things
will get messy when you try to pull their work back into yours.
Let's look at an example of how rebasing work that you've made
public can cause problems. Suppose you clone from a central server
and then do some work off that. Your commit history looks like Figure
3.36.
Figure 3.36: Clone a repository, and base some work on it.
Now, someone else does more work that includes a merge, and
pushes that work to the central server. You fetch them and merge
the new remote branch into your work, making your history look
something like Figure 3.37.
Figure 3.37: Fetch more commits, and merge them into your work.
Next, the person who pushed the merged work decides to go back
and rebase their work instead; they do a git push --force to overwrite
the history on the server. You then fetch from that server, bringing
down the new commits.
Figure 3.38: Someone pushes rebased commits, abandoning commits
you've based your work on.
At this point, you have to merge this work in again, even though
you've already done so. Rebasing changes the SHA-1 hashes of these
commits so to Git they look like new commits, when in fact you
already have the C4 work in your history (see Figure 3.39).
You have to merge that work in at some point so you can keep
up with the other developer in the future. After you do that, your
commit history will contain both the C4 and C4' commits, which have
different SHA-1 hashes but introduce the same work and have the
same commit message. If you run a git log when your history looks
like this, you'll see two commits that have the same author date and
message, which will be confusing. Furthermore, if you push this
history back up to the server, you'll reintroduce all those rebased
commits to the central server, which can further confuse people.
Figure 3.39: You merge in the same work again into a new merge commit.
If you treat rebasing as a way to clean up and work with commits
before you push them, and if you only rebase commits that have
never been available publicly, then you'll be fine. If you rebase
commits that have already been pushed publicly, and people may have
based work on those commits, then you may be in for some
frustrating trouble.
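The root of the trouble, a rebased commit that carries the same change under a new SHA-1, can be observed directly. This sketch assumes a scratch repository with illustrative commit labels, and it uses git patch-id (a real Git command that fingerprints a diff) as one way to show that the change itself is unchanged.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)                       # scratch area (hypothetical paths)
trap 'rm -rf "$work"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
echo base > base.txt; git add base.txt; git commit -q -m C2
git checkout -q -b experiment
echo exp > exp.txt; git add exp.txt; git commit -q -m C3
git checkout -q master
echo main > main.txt; git add main.txt; git commit -q -m C4

# Record the commit ID and the patch fingerprint before rebasing.
git checkout -q experiment
before=$(git rev-parse experiment)
patch_before=$(git show "$before" | git patch-id | cut -d' ' -f1)

git rebase -q master

# Same change, replayed on a new parent: new commit ID, same patch.
after=$(git rev-parse experiment)
patch_after=$(git show "$after" | git patch-id | cut -d' ' -f1)

echo "commit before: $before"
echo "commit after:  $after"
echo "patch ids:     $patch_before / $patch_after"
```

Anyone who based work on the old commit ID still has it in their history, which is why merging the rebased branch brings in what looks like a second copy of the same work.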
3.7 Summary
We've covered basic branching and merging in Git. You should feel
comfortable creating and switching to new branches, switching
between branches, and merging local branches together. You should
also be able to share your branches by pushing them to a shared
server, work with others on shared branches, and rebase your
branches before they are shared.
Chapter 4
Git on the Server
At this point, you should be able to do most of the day-to-day tasks for
which you'll be using Git. However, in order to do any collaboration
in Git, you'll need to have a remote Git repository. Although you
can technically push changes to and pull changes from individuals'
repositories, doing so is discouraged because you can fairly easily
confuse what they're working on if you're not careful. Furthermore,
you want your collaborators to be able to access the repository even if
your computer is offline; having a more reliable common repository
is often useful. Therefore, the preferred method for collaborating
with someone is to set up an intermediate repository that you both
have access to, and push to and pull from that. We'll refer to this
repository as a “Git server”, but you'll notice that it generally takes
a tiny amount of resources to host a Git repository, so you'll rarely
need to use an entire server for it.
Running a Git server is simple. First, you choose which protocols
you want your server to communicate with. The first section of this
chapter will cover the available protocols and the pros and cons of
each. The next sections will explain some typical setups using those
protocols and how to get your server running with them. Last, we'll
go over a few hosted options, if you don't mind hosting your code
on someone else's server and don't want to go through the hassle of
setting up and maintaining your own server.
If you have no interest in running your own server, you can skip
to the last section of the chapter to see some options for setting up
a hosted account and then move on to the next chapter, where we
discuss the various ins and outs of working in a distributed source
control environment.
A remote repository is generally a bare repository: a Git repository
that has no working directory. Because the repository is only
used as a collaboration point, there is no reason to have a snapshot
checked out on disk; it's just the Git data. In the simplest terms, a
bare repository is the contents of your project's .git directory and
nothing else.
4.1 The Protocols
Git can use four major network protocols to transfer data: Local,
Secure Shell (SSH), Git, and HTTP. Here we'll discuss what they are
and in what basic circumstances you would want (or not want) to use
them.
It's important to note that with the exception of the HTTP
protocols, all of these require Git to be installed and working on the
server.
4.1.1 Local Protocol
The most basic is the Local protocol, in which the remote repository
is in another directory on disk. This is often used if everyone on your
team has access to a shared filesystem such as an NFS mount, or in
the less likely case that everyone logs in to the same computer. The
latter wouldn't be ideal, because all your code repository instances
would reside on the same computer, making a catastrophic loss much
more likely.
If you have a shared mounted filesystem, then you can clone, push
to, and pull from a local file-based repository. To clone a repository
like this or to add one as a remote to an existing project, use the path
to the repository as the URL. For example, to clone a local repository,
you can run something like this:
$ git clone /opt/git/project.git
Or you can do this:
$ git clone file:///opt/git/project.git
Git operates slightly differently if you explicitly specify file:// at
the beginning of the URL. If you just specify the path, Git tries to use
hardlinks or directly copy the files it needs. If you specify file://,
Git fires up the processes that it normally uses to transfer data over
a network, which is generally a much less efficient method of
transferring the data. The main reason to specify the file:// prefix is if you
want a clean copy of the repository with extraneous references or
objects left out, generally after an import from another
version-control system or something similar (see Chapter 9 for maintenance
tasks). We'll use the normal path here because doing so is almost
always faster.
To add a local repository to an existing Git project, you can run
something like this:
$ git remote add local_proj /opt/git/project.git
Then, you can push to and pull from that remote as though you
were doing so over a network.
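The three Local-protocol commands can be exercised end to end in a sandbox. In this sketch a bare repository under a temporary directory stands in for /opt/git/project.git; the directory names by_path and by_url and the identity values are assumptions made for the example.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)                       # stands in for /opt/git (hypothetical)
trap 'rm -rf "$work"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# An existing project, plus a bare copy that acts as the shared repository.
git init -q "$work/src"
cd "$work/src"
git symbolic-ref HEAD refs/heads/master
echo hello > README
git add README
git commit -q -m "initial commit"
git clone -q --bare "$work/src" "$work/project.git"

# Clone by plain path (Git may hardlink or copy files directly).
git clone -q "$work/project.git" "$work/by_path"

# Clone with an explicit file:// URL (goes through the network transfer code).
git clone -q "file://$work/project.git" "$work/by_url"

# Add the shared repository as a remote of the existing project and fetch.
git remote add local_proj "$work/project.git"
git fetch -q local_proj

git -C "$work/by_path" log --oneline
git -C "$work/by_url" log --oneline
```

Both clones end up with the same history; the difference between the two forms is only in how the data is transferred.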
The Pros
The pros of file-based repositories are that they're simple and they
use existing file permissions and network access. If you already have
a shared filesystem to which your whole team has access, setting up
a repository is very easy. You stick the bare repository copy
somewhere everyone has shared access to and set the read/write
permissions as you would for any other shared directory. We'll discuss how
to export a bare repository copy for this purpose in the next section,
“Getting Git on a Server.”
This is also a nice option for quickly grabbing work from someone
else's working repository. If you and a co-worker are working on the
same project and they want you to check something out, running a
command like git pull /home/john/project is often easier than them
pushing to a remote server and you pulling down.
The Cons
The cons of this method are that shared access is generally more
difficult to set up and reach from multiple locations than basic network
access. If you want to push from your laptop when you're at home,
you have to mount the remote disk, which can be difficult and slow
compared to network-based access.
It's also important to mention that this isn't necessarily the fastest
option if you're using a shared mount of some kind. A local repository
is fast only if you have fast access to the data. A repository on NFS
is often slower than the same repository over SSH on the same server,
since SSH access allows Git to run off local disks on each system.
4.1.2 The SSH Protocol
Probably the most common transport protocol for Git is SSH. This
is because SSH access to servers is already set up in most places,
and if it isn't, it's easy to do. SSH is also the only network-based
protocol that you can easily read from and write to. The other two
network protocols (HTTP and Git) are generally read-only, so even
if you have them available for the unwashed masses, you still need
SSH for your own write commands. SSH is also an authenticated
network protocol, and because it's ubiquitous, it's generally easy to
set up and use.
To clone a Git repository over SSH, you can specify an ssh:// URL like
this:
$ git clone ssh://user@server/project.git
Or you can leave the protocol off; Git assumes SSH if you aren't
explicit:
$ git clone user@server:project.git
You can also not specify a user, and Git assumes the user you're
currently logged in as.
The Pros
The pros of using SSH are many. First, you basically have to use
it if you want authenticated write access to your repository over a
network. Second, SSH is relatively easy to set up: SSH daemons
are commonplace, many network admins have experience with them,
and many OS distributions are set up with them or have tools to
manage them. Next, access over SSH is secure: all data transfer is
encrypted and authenticated. Last, like the Git and Local protocols,
SSH is efficient, making the data as compact as possible before
transferring it.
The Cons
The negative aspect of SSH is that you can't serve anonymous access
of your repository over it. People must have access to your machine
over SSH to access it, even in a read-only capacity, which doesn't
make SSH access conducive to open source projects. If you're using
it only within your corporate network, SSH may be the only protocol
you need to deal with. If you want to allow anonymous read-only
access to your projects, you'll have to set up SSH for you to push
over but something else for others to pull over.
4.1.3 The Git Protocol
Next is the Git protocol. This is a special daemon that comes
packaged with Git; it listens on a dedicated port (9418) and provides a
service similar to the SSH protocol, but with absolutely no
authentication. In order for a repository to be served over the Git
protocol, you must create the git-daemon-export-ok file (the daemon won't
serve a repository without that file in it), but other than that there
is no security. Either the Git repository is available for everyone to
clone or it isn't. This means that there is generally no pushing over
this protocol. You can enable push access, but given the lack of
authentication, if you turn on push access, anyone on the Internet who
finds your project's URL could push to your project. Suffice it to say
that this is rare.
The Pros
The Git protocol is the fastest transfer protocol available. If you're
serving a lot of traffic for a public project or serving a very large
project that doesn't require user authentication for read access, it's
likely that you'll want to set up a Git daemon to serve your project.
It uses the same data-transfer mechanism as the SSH protocol but
without the encryption and authentication overhead.
The Cons
The downside of the Git protocol is the lack of authentication. It's
generally undesirable for the Git protocol to be the only access to
your project. Generally, you'll pair it with SSH access for the few
developers who have push (write) access and have everyone else use
git:// for read-only access. It's also probably the most difficult
protocol to set up. It must run its own daemon, which is custom (we'll
look at setting one up in the “Gitosis” section of this chapter), and it
requires xinetd configuration or the like, which isn't always a walk
in the park. It also requires firewall access to port 9418, which isn't
a standard port that corporate firewalls always allow. Behind big
corporate firewalls, this obscure port is commonly blocked.
4.1.4 The HTTP/S Protocol
Last we have the HTTP protocol. The beauty of the HTTP or HTTPS
protocol is the simplicity of setting it up. Basically, all you have to do
is put the bare Git repository under your HTTP document root and
set up a specific post-update hook, and you're done (see Chapter 7
for details on Git hooks). At that point, anyone who can access the
web server under which you put the repository can also clone your
repository. To allow read access to your repository over HTTP, do
something like this:
$ cd /var/www/htdocs/
$ git clone --bare /path/to/git_project gitproject.git
$ cd gitproject.git
$ mv hooks/post-update.sample hooks/post-update
$ chmod a+x hooks/post-update
That's all. The post-update hook that comes with Git by default
runs the appropriate command (git update-server-info) to make HTTP
fetching and cloning work properly. This command is run when you
push to this repository over SSH; then, other people can clone via
something like
$ git clone http://example.com/gitproject.git
In this particular case, we're using the /var/www/htdocs path that is
common for Apache setups, but you can use any static web server;
just put the bare repository in its path. The Git data is served as basic
static files (see Chapter 9 for details about exactly how it's served).
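To see what the post-update hook contributes, you can run git update-server-info by hand and look at the file it writes. This is a sketch under assumptions: a directory under a temporary path plays the role of the web server's document root, and no web server is actually started.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)                       # scratch area (hypothetical paths)
trap 'rm -rf "$work"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git init -q "$work/git_project"
cd "$work/git_project"
git symbolic-ref HEAD refs/heads/master
echo hello > README
git add README
git commit -q -m "initial commit"

# "$work/htdocs" stands in for a document root such as /var/www/htdocs.
mkdir -p "$work/htdocs"
cd "$work/htdocs"
git clone -q --bare "$work/git_project" gitproject.git
cd gitproject.git

# This is the command the post-update hook runs after each push.
git update-server-info

# A static web server can now hand this ref listing to HTTP clients.
cat info/refs
```

Because a plain web server cannot list refs on its own, the info/refs file has to be refreshed after every push, which is why the command lives in a hook.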
It's possible to make Git push over HTTP as well, although that
technique isn't as widely used and requires you to set up complex
WebDAV requirements. Because it's rarely used, we won't cover it in
this book. If you're interested in using the HTTP-push protocols, you
can read about preparing a repository for this purpose at
http://www.kernel.org/pub/software/scm/git/docs/howto/setup-git-server-over-http.txt.
One nice thing about making Git push over HTTP is that you can
use any WebDAV server, without specific Git features; so, you can use
this functionality if your web-hosting provider supports WebDAV for
writing updates to your web site.
The Pros
The upside of using the HTTP protocol is that it's easy to set up.
Running the handful of required commands gives you a simple way to
give the world read access to your Git repository. It takes only a few
minutes to do. The HTTP protocol also isn't very resource intensive
on your server. Because it generally uses a static HTTP server to
serve all the data, a normal Apache server can serve thousands of
files per second on average; it's difficult to overload even a small
server.
You can also serve your repositories read-only over HTTPS, which
means you can encrypt the content transfer; or you can go so far as
to make the clients use specific signed SSL certificates. Generally, if
you're going to these lengths, it's easier to use SSH public keys; but
it may be a better solution in your specific case to use signed SSL
certificates or other HTTP-based authentication methods for
read-only access over HTTPS.
Another nice thing is that HTTP is such a commonly used protocol
that corporate firewalls are often set up to allow traffic through this
port.
The Cons
The downside of serving your repository over HTTP is that it's
relatively inefficient for the client. It generally takes a lot longer to clone
or fetch from the repository, and you often have a lot more network
overhead and transfer volume over HTTP than with any of the other
network protocols. Because it's not as intelligent about transferring
only the data you need (there is no dynamic work on the part of the
server in these transactions), the HTTP protocol is often referred
to as a dumb protocol. For more information about the differences
in efficiency between the HTTP protocol and the other protocols, see
Chapter 9.
4.2 Getting Git on a Server
In order to initially set up any Git server, you have to export an
existing repository into a new bare repository: a repository that doesn't
contain a working directory. This is generally straightforward to do.
In order to clone your repository to create a new bare repository, you
run the clone command with the --bare option. By convention, bare
repository directories end in .git, like so:
$ git clone --bare my_project my_project.git
Initialized empty Git repository in /opt/projects/my_project.git/
The output for this command is a little confusing. Since clone is
basically a git init then a git fetch, we see some output from the
git init part, which creates an empty directory. The actual object
transfer gives no output, but it does happen. You should now have a
copy of the Git directory data in your my_project.git directory.
This is roughly equivalent to something like
$ cp -Rf my_project/.git my_project.git
There are a couple of minor differences in the configuration file,
but for your purpose, this is close to the same thing. It takes the
Git repository by itself, without a working directory, and creates a
directory specifically for it alone.
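You can confirm what a bare clone contains by inspecting it. The sketch below assumes a scratch project created under a temporary directory (its contents and the identity values are illustrative) and asks Git whether the clone is bare.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)                       # scratch area (hypothetical paths)
trap 'rm -rf "$work"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

cd "$work"
git init -q my_project
cd my_project
git symbolic-ref HEAD refs/heads/master
echo hello > README
git add README
git commit -q -m "initial commit"
cd ..

# Export the existing repository into a new bare repository.
git clone -q --bare my_project my_project.git

# The bare clone reports itself as bare and has no checked-out files.
is_bare=$(git -C my_project.git rev-parse --is-bare-repository)
echo "bare: $is_bare"
ls my_project.git
```

The listing shows the same kind of entries you would find inside my_project/.git (HEAD, config, objects, refs, and so on), with no README checked out beside them.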
4.2.1 Putting the Bare Repository on a Server
Now that you have a bare copy of your repository, all you need to do
is put it on a server and set up your protocols. Let's say you've set up
a server called git.example.com that you have SSH access to, and you
want to store all your Git repositories under the /opt/git directory.
You can set up your new repository by copying your bare repository
over:
$ scp -r my_project.git user@git.example.com:/opt/git
At this point, other users who have SSH access to the same server
and read access to the /opt/git directory can clone your repository
by running
$ git clone user@git.example.com:/opt/git/my_project.git
If a user SSHs into a server and has write access to the
/opt/git/my_project.git directory, they will also automatically have
push access. Git will automatically add group write permissions to a
repository properly if you run the git init command with the --shared
option.
$ ssh user@git.example.com
$ cd /opt/git/my_project.git
$ git init --bare --shared
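The effect of --shared is recorded in the repository's configuration, so you can check it locally before touching a real server. This sketch assumes a scratch bare repository in a temporary directory in place of /opt/git/my_project.git.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)                       # stands in for /opt/git (hypothetical)
trap 'rm -rf "$work"' EXIT

git init -q --bare "$work/my_project.git"
cd "$work/my_project.git"

# Before: the shared-repository setting is normally not set at all.
before=$(git config core.sharedRepository || echo unset)

# Re-running init with --shared is safe on an existing repository.
git init -q --bare --shared

# After: Git has recorded that the repository is group-shared.
after=$(git config core.sharedRepository)
echo "core.sharedRepository: $before -> $after"
```

Re-initializing does not destroy any commits, refs, or other data already in the repository; it only updates permissions and configuration.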
You see how easy it is to take a Git repository, create a bare
version, and place it on a server to which you and your collaborators
have SSH access. Now you're ready to collaborate on the same
project.
It's important to note that this is literally all you need to do to run a
useful Git server to which several people have access: just add
SSH-able accounts on a server, and stick a bare repository somewhere
that all those users have read and write access to. You're ready to
go; nothing else is needed.
In the next few sections, you'll see how to expand to more
sophisticated setups. This discussion will include not having to create
user accounts for each user, adding public read access to
repositories, setting up web UIs, using the Gitosis tool, and more. However,
keep in mind that to collaborate with a couple of people on a private
project, all you need is an SSH server and a bare repository.
4.2.2 Small Setups
If you're a small outfit or are just trying out Git in your organization
and have only a few developers, things can be simple for you. One of
the most complicated aspects of setting up a Git server is user
management. If you want some repositories to be read-only to certain
users and read/write to others, access and permissions can be a bit
difficult to arrange.
SSH Access
If you already have a server to which all your developers have SSH
access, it's generally easiest to set up your first repository there,
because you have to do almost no work (as we covered in the last
section). If you want more complex access control type permissions
on your repositories, you can handle them with the normal filesystem
permissions of the operating system your server runs.
If you want to place your repositories on a server that doesn't
have accounts for everyone on your team whom you want to have
write access, then you must set up SSH access for them. We assume
that if you have a server with which to do this, you already have an
SSH server installed, and that's how you're accessing the server.
There are a few ways you can give access to everyone on your
team. The first is to set up accounts for everybody, which is
straightforward but can be cumbersome. You may not want to run adduser
and set temporary passwords for every user.
A second method is to create a single 'git' user on the machine,
ask every user who is to have write access to send you an SSH public
key, and add that key to the ~/.ssh/authorized_keys file of your new 'git'
user. At that point, everyone will be able to access that machine via
the 'git' user. This doesn't affect the commit data in any way; the
SSH user you connect as doesn't affect the commits you've recorded.
Another way to do it is to have your SSH server authenticate from
an LDAP server or some other centralized authentication source that
you may already have set up. As long as each user can get shell
access on the machine, any SSH authentication mechanism you can
think of should work.
4.3 Generating Your SSH Public Key
That being said, many Git servers authenticate using SSH public
keys. In order to provide a public key, each user in your system must
generate one if they don't already have one. This process is similar
across all operating systems. First, you should check to make sure
you don't already have a key. By default, a user's SSH keys are stored
in that user's ~/.ssh directory. You can easily check to see if you have
a key already by going to that directory and listing the contents:
$ cd ~/.ssh
$ ls
authorized_keys2 id_dsa known_hosts
config id_dsa.pub
You're looking for a pair of files named something and something.pub,
where the something is usually id_dsa or id_rsa. The .pub file is your
public key, and the other file is your private key. If you don't have
these files (or you don't even have a .ssh directory), you can create
them by running a program called ssh-keygen, which is provided with
the SSH package on Linux/Mac systems and comes with the MSysGit
package on Windows:
$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/Users/schacon/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /Users/schacon/.ssh/id_rsa.
Your public key has been saved in /Users/schacon/.ssh/id_rsa.pub.
The key fingerprint is:
43:c5:5b:5f:b1:f1:50:43:ad:20:a6:92:6a:1f:9a:3a schacon@agadorlaptop.local
First it confirms where you want to save the key (.ssh/id_rsa), and
then it asks twice for a passphrase, which you can leave empty if you
don't want to type a password when you use the key.
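The check-then-generate steps above can be sketched as a small script. This is only a sketch under stated assumptions: the KEY_DIR variable, its scratch-directory default, and the 2048-bit key size are illustrative choices, and the empty passphrase is used only so the script runs without prompting.

```shell
#!/bin/sh
# Sketch: create an SSH key pair only if one does not already exist.
# KEY_DIR defaults to a scratch directory so the example is safe to run;
# set KEY_DIR="$HOME/.ssh" to act on your real keys (assumption: you
# want an RSA key named id_rsa, as in the listing above).
set -eu

KEY_DIR="${KEY_DIR:-$(mktemp -d)}"
KEY_FILE="$KEY_DIR/id_rsa"

mkdir -p "$KEY_DIR"
chmod 700 "$KEY_DIR"

if [ -f "$KEY_FILE" ] && [ -f "$KEY_FILE.pub" ]; then
    echo "Key pair already present: $KEY_FILE"
else
    # -N "" sets an empty passphrase; -q suppresses the progress output.
    ssh-keygen -q -t rsa -b 2048 -N "" -f "$KEY_FILE"
    echo "Generated new key pair: $KEY_FILE"
fi

# The .pub file is the half you send to the server administrator.
cat "$KEY_FILE.pub"
```

If you would rather protect the key with a passphrase, drop the -N option and ssh-keygen prompts for one as shown in the listing above.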
Now, each user that does this has to send their public key to you
or whoever is administrating the Git server (assuming you're using
an SSH server setup that requires public keys). All they have to do is
copy the contents of the .pub file and e-mail it. The public keys look
something like this:
$ cat ~/.ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAklOUpkDHrfHY17SbrmTIpNLTGK9Tjom/BWDSU
GPl+nafzlHDTYW7hdI4yZ5ew18JH4JW9jbhUFrviQzM7xlELEVf4h9lFX5QVkbPppSwg0cda3
Pbv7kOdJ/MTyBlWXFCR+HAo3FXRitBqxiX1nKhXpHAZsMciLq8V6RjsNAQwdsdMFvSlVK/7XA
t3FaoJoAsncM1Q9x5+3V0Ww68/eIFmb1zuUFljQJKprrX88XypNDvjYNby6vw/Pb0rwert/En
mZ+AW4OZPnTPI89ZPmVMLuayrD2cE86Z/il8b+gw3r3+1nKatmIkjn2so1d01QraTlMqVSsbx
NrRFi9wrf+M7Q== schacon@agadorlaptop.local
For a more in-depth tutorial on creating an SSH key on multiple
operating systems, see the GitHub guide on SSH keys at
http://github.com/guides/providing-your-ssh-key.
4.4 Setting Up the Server
Let's walk through setting up SSH access on the server side. In
this example, you'll use the authorized_keys method for authenticating
your users. We also assume you're running a standard Linux
distribution like Ubuntu. First, you create a 'git' user and a .ssh directory
for that user:
$ sudo adduser git
$ su git
$ cd
$ mkdir .ssh
Next, you need to add some developer SSH public keys to the
authorized_keys file for that user. Let's assume you've received a few
keys by e-mail and saved them to temporary files. Again, the public
keys look something like this:
$ cat /tmp/id_rsa.john.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCB007n/ww+ouN4gSLKssMxXnBOvf9LGt4L
ojG6rs6hPB09j9R/T17/x4lhJA0F3FR1rP6kYBRsWj2aThGw6HXLm9/5zytK6Ztg3RPKK+4k
Yjh6541NYsnEAZuXz0jTTyAUfrtU3Z5E003C4oxOj6H0rfIF1kKI9MAQLMdpGW1GYEIgS9Ez
Sdfd8AcCIicTDWbqLAcU4UpkaX8KyGlLwsNuuGztobF8m72ALC/nLF6JLtPofwFBlgc+myiv
O7TCUSBdLQlgMVOFq1I2uPWQOkOWQAHukEOmfjy2jctxSDBQ220ymjaNsHT4kgtZg2AYYgPq
dAv8JggJICUvax2T9va5 gsg-keypair
You just append them to your authorized_keys file:
$ cat /tmp/id_rsa.john.pub >> ~/.ssh/authorized_keys
$ cat /tmp/id_rsa.josie.pub >> ~/.ssh/authorized_keys
$ cat /tmp/id_rsa.jessica.pub >> ~/.ssh/authorized_keys
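With more than a handful of keys, the append step above can be wrapped in a loop. The following is a minimal sketch rather than a prescribed procedure: SSH_DIR and INBOX are hypothetical variables that default to scratch directories, and the placeholder key files exist only so the script runs on its own.

```shell
#!/bin/sh
# Sketch: append every collected public key to authorized_keys.
# SSH_DIR and INBOX are scratch directories here (assumptions); on a
# real server they would be /home/git/.ssh and wherever you saved the
# e-mailed keys.
set -eu

SSH_DIR="${SSH_DIR:-$(mktemp -d)}"
INBOX="${INBOX:-$(mktemp -d)}"

# Placeholder key files so the sketch is self-contained (not real keys).
for user in john josie jessica; do
    [ -f "$INBOX/id_rsa.$user.pub" ] ||
        echo "ssh-rsa AAAAplaceholder$user $user@example" > "$INBOX/id_rsa.$user.pub"
done

AUTH="$SSH_DIR/authorized_keys"
touch "$AUTH"

for key in "$INBOX"/id_rsa.*.pub; do
    # Skip keys that are already listed so reruns do not duplicate them.
    if ! grep -qxF "$(cat "$key")" "$AUTH"; then
        cat "$key" >> "$AUTH"
    fi
done

# With its default settings, sshd tends to reject key files that other
# users can write to, so tighten the permissions.
chmod 700 "$SSH_DIR"
chmod 600 "$AUTH"
```

Running the loop twice leaves the file unchanged, which is handy if you rerun it whenever a new key arrives.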
Now, you can set up an empty repository for them by running git
init with the --bare option, which initializes the repository without a
working directory:
$ cd /opt/git
$ mkdir project.git
$ cd project.git
$ git --bare init
Then, John, Josie, or Jessica can push the first version of their
project into that repository by adding it as a remote and pushing
up a branch. Note that someone must shell onto the machine and
create a bare repository every time you want to add a project. Let's
use gitserver as the hostname of the server on which you've set up
your 'git' user and repository. If you're running it internally, and you
set up DNS for gitserver to point to that server, then you can use the
commands pretty much as is:
# on John's computer
$ cd myproject
$ git init
$ git add .
$ git commit -m 'initial commit'
$ git remote add origin git@gitserver:/opt/git/project.git
$ git push origin master
At this point, the others can clone it down and push changes back
up just as easily:
$ git clone git@gitserver:/opt/git/project.git
$ vim README
$ git commit -am 'fix for the README file'
$ git push origin master
With this method, you can quickly get a read/write Git server up
and running for a handful of developers.
As an extra precaution, you can easily restrict the 'git' user to
only doing Git activities with a limited shell tool called git-shell that
comes with Git. If you set this as your 'git' user's login shell, then the
'git' user can't have normal shell access to your server. To use this,
specify git-shell instead of bash or csh for your user's login shell. To
do so, you'll likely have to edit your /etc/passwd file:
$ sudo vim /etc/passwd
At the bottom, you should find a line that looks something like
this:
git:x:1000:1000::/home/git:/bin/sh
Change /bin/sh to /usr/bin/git-shell (or run which git-shell to see
where it's installed). The line should look something like this:
git:x:1000:1000::/home/git:/usr/bin/git-shell
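To make the change concrete, here is a small sketch that performs the same edit on a scratch copy of the entry instead of the real /etc/passwd. The PASSWD_COPY file and the fallback path are assumptions for illustration; the actual location of git-shell varies between distributions.

```shell
#!/bin/sh
# Sketch: show the effect of switching the 'git' user's login shell,
# using a scratch copy of a passwd entry rather than /etc/passwd itself.
set -eu

PASSWD_COPY="$(mktemp)"
echo 'git:x:1000:1000::/home/git:/bin/sh' > "$PASSWD_COPY"

# Use the installed git-shell if there is one; otherwise fall back to the
# path used in the text (an assumption, it varies by distribution).
GIT_SHELL="$(command -v git-shell || echo /usr/bin/git-shell)"

# Rewrite only the seventh (login shell) field of the 'git' entry.
awk -F: -v OFS=: -v sh="$GIT_SHELL" '$1 == "git" { $7 = sh } { print }' \
    "$PASSWD_COPY" > "$PASSWD_COPY.new"
mv "$PASSWD_COPY.new" "$PASSWD_COPY"

cat "$PASSWD_COPY"
```

On a real server you may prefer a tool such as chsh (for example, sudo chsh -s "$(command -v git-shell)" git) over editing the file by hand; some systems also expect the shell to be listed in /etc/shells.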
Now, the 'git' user can only use the SSH connection to push and
pull Git repositories and can't shell onto the machine. If you try, you'll
see a login rejection like this:
$ ssh git@gitserver
fatal: What do you think I am? A shell?
Connection to gitserver closed.
4.5 Public Access
What if you want anonymous read access to your project? Perhaps
instead of hosting an internal private project, you want to host an
open source project. Or maybe you have a bunch of automated build
servers or continuous integration servers that change a lot, and you
don't want to have to generate SSH keys all the time; you just want
to add simple anonymous read access.
Probably the simplest way for smaller setups is to run a static web
server with its document root where your Git repositories are, and
then enable that post-update hook we mentioned in the first section
of this chapter. Let's work from the previous example. Say you have
your repositories in the /opt/git directory, and an Apache server is
running on your machine. Again, you can use any web server for
this, but as an example, we'll demonstrate some basic Apache
configurations that should give you an idea of what you might need.
First you need to enable the hook:
$ cd project.git
$ mv hooks/post-update.sample hooks/post-update
$ chmod a+x hooks/post-update
If you're using a version of Git earlier than 1.6, the mv command
isn't necessary; Git started naming the hooks examples with the
.sample postfix only recently.
What does this post-update hook do? It looks basically like this:
$ cat hooks/post-update
#!/bin/sh
exec git-update-server-info
This means that when you push to the server via SSH, Git will run
this command to update the files needed for HTTP fetching.
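If you are curious what those files look like, the following sketch runs the hook's command by hand in a scratch directory. It is an illustration only: the repository names, the placeholder commit identity, and the use of a local bare clone in place of /opt/git/project.git are all assumptions.

```shell
#!/bin/sh
# Sketch: see what the post-update hook's command produces.
# Everything happens in a scratch directory; no server is involved.
set -eu

WORK="$(mktemp -d)"
cd "$WORK"

# Placeholder identity for the commit made below.
GIT_AUTHOR_NAME='Example'; GIT_AUTHOR_EMAIL='example@example.com'
GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"; GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

# A small repository with one commit, standing in for a real project.
git init -q source
cd source
git symbolic-ref HEAD refs/heads/master
echo 'hello' > README
git add README
git commit -q -m 'initial commit'
cd ..

# A bare clone plays the role of /opt/git/project.git.
git clone -q --bare source project.git
cd project.git

# This is the command the hook runs after each push.
git update-server-info

# info/refs lists the refs so that a plain HTTP client can discover them.
cat info/refs
```

The hook matters because a static web server cannot compute this listing on request; it can only serve files that already exist.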
Next, you need to add a VirtualHost entry to your Apache
configuration with the document root as the root directory of your Git
projects. Here, we're assuming that you have wildcard DNS set up
to send *.gitserver to whatever box you're using to run all this:
<VirtualHost *:80>
    ServerName git.gitserver
    DocumentRoot /opt/git
    <Directory /opt/git/>
        Order allow,deny
        allow from all
    </Directory>
</VirtualHost>
You'll also need to set the Unix user group of the /opt/git
directories to www-data so your web server can read-access the repositories,
because the Apache instance running the CGI script will (by default)
be running as that user:
$ chgrp -R www-data /opt/git
When you restart Apache, you should be able to clone your
repositories under that directory by specifying the URL for your project:
$ git clone http://git.gitserver/project.git
This way, you can set up HTTP-based read access to any of your
projects for a fair number of users in a few minutes. Another simple
option for public unauthenticated access is to start a Git daemon,
although that requires you to daemonize the process. We'll cover
that option in the Git Daemon section later in this chapter, if you
prefer that route.
4.6 GitWeb
Now that you have basic read/write and read-only access to your
project, you may want to set up a simple web-based visualizer. Git
comes with a CGI script called GitWeb that is commonly used for this.
Figure 4.1: The GitWeb web-based user interface
You can see GitWeb in use at sites like http://git.kernel.org (see
Figure 4.1).
If you want to check out what GitWeb would look like for your
project, Git comes with a command to fire up a temporary instance if
you have a lightweight server on your system like lighttpd or webrick.
On Linux machines, lighttpd is often installed, so you may be able to
get it to run by typing git instaweb in your project directory. If you're
running a Mac, Leopard comes preinstalled with Ruby, so webrick may
be your best bet. To start instaweb with a non-lighttpd handler, you
can run it with the --httpd option:
$ git instaweb --httpd=webrick
[2009-02-21 10:02:21] INFO WEBrick 1.3.1
[2009-02-21 10:02:21] INFO ruby 1.8.6 (2008-03-03) [universal-darwin9.0]
That starts up an HTTPD server on port 1234 and then
automatically starts a web browser that opens on that page. It's pretty easy
on your part. When you're done and want to shut down the server,
you can run the same command with the --stop option:
$ git instaweb --httpd=webrick --stop
If you want to run the web interface on a server all the time for
your team or for an open source project you're hosting, you'll need to
set up the CGI script to be served by your normal web server. Some
Linux distributions have a gitweb package that you may be able to
install via apt or yum, so you may want to try that first. We'll walk
through installing GitWeb manually very quickly. First, you need to
get the Git source code, which GitWeb comes with, and generate the
custom CGI script:
$ git clone git://git.kernel.org/pub/scm/git/git.git
$ cd git/
$ make GITWEB_PROJECTROOT="/opt/git" \
prefix=/usr gitweb/gitweb.cgi
$ sudo cp -Rf gitweb /var/www/
Notice that you have to tell the command where to find your Git
repositories with the GITWEB_PROJECTROOT variable. Now, you need to
make Apache use CGI for that script, for which you can add a
VirtualHost:
<VirtualHost *:80>
    ServerName gitserver
    DocumentRoot /var/www/gitweb
    <Directory /var/www/gitweb>
        Options +ExecCGI +FollowSymLinks +SymLinksIfOwnerMatch
        AllowOverride All
        order allow,deny
        Allow from all
        AddHandler cgi-script cgi
        DirectoryIndex gitweb.cgi
    </Directory>
</VirtualHost>
Again, GitWeb can be served with any CGI-capable web server; if
you prefer to use something else, it shouldn't be difficult to set up.
At this point, you should be able to visit http://gitserver/ to view
your repositories online, and you can use http://git.gitserver to
clone and fetch your repositories over HTTP.
4.7 Gitosis
Keeping all users' public keys in the authorized_keys file for access
works well only for a while. When you have hundreds of users, it's
much more of a pain to manage that process. You have to shell onto
the server each time, and there is no access control: everyone in
the file has read and write access to every project.
At this point, you may want to turn to a widely used software
project called Gitosis. Gitosis is basically a set of scripts that help
you manage the authorized_keys file as well as implement some simple
access controls. The really interesting part is that the UI for this tool
for adding people and determining access isn't a web interface but
a special Git repository. You set up the information in that project,
and when you push it, Gitosis reconfigures the server based on that,
which is cool.
Installing Gitosis isn't the simplest task ever, but it's not too
difficult. It's easiest to use a Linux server for it; these examples use a
stock Ubuntu 8.10 server.
Gitosis requires some Python tools, so first you have to install
the Python setuptools package, which Ubuntu provides as
python-setuptools:
$ apt-get install python-setuptools
Next, you clone and install Gitosis from the project's main site:
$ git clone git://eagain.net/gitosis.git
$ cd gitosis
$ sudo python setup.py install
That installs a couple of executables that Gitosis will use. Next,
Gitosis wants to put its repositories under /home/git, which is fine.
But you have already set up your repositories in /opt/git, so instead
of reconfiguring everything, you create a symlink:
$ ln -s /opt/git /home/git/repositories
Gitosis is going to manage your keys for you, so you need to
remove the current file, re-add the keys later, and let Gitosis control the
authorized_keys file automatically. For now, move the authorized_keys
file out of the way:
$ mv /home/git/.ssh/authorized_keys /home/git/.ssh/ak.bak
Next you need to turn your shell back on for the 'git' user, if you
changed it to the git-shell command. People still won't be able to log
in, but Gitosis will control that for you. So, let's change this line in
your /etc/passwd file
git:x:1000:1000::/home/git:/usr/bin/git-shell
back to this:
git:x:1000:1000::/home/git:/bin/sh
Now it's time to initialize Gitosis. You do this by running the
gitosis-init command with your personal public key. If your public
key isn't on the server, you'll have to copy it there:
$ sudo -H -u git gitosis-init < /tmp/id_dsa.pub
Initialized empty Git repository in /opt/git/gitosis-admin.git/
Reinitialized existing Git repository in /opt/git/gitosis-admin.git/
This lets the user with that key modify the main Git repository
that controls the Gitosis setup. Next, you have to manually set the
execute bit on the post-update script for your new control repository:
$ sudo chmod 755 /opt/git/gitosis-admin.git/hooks/post-update
You're ready to roll. If you're set up correctly, you can try to SSH
into your server as the user for which you added the public key to
initialize Gitosis. You should see something like this:
$ ssh git@gitserver
PTY allocation request failed on channel 0
fatal: unrecognized command 'gitosis-serve schacon@quaternion'
Connection to gitserver closed.
That means Gitosis recognized you but shut you out because you're
not trying to do any Git commands. So, let's do an actual Git
command: you'll clone the Gitosis control repository.
# on your local computer
$ git clone git@gitserver:gitosis-admin.git
Now you have a directory named gitosis-admin, which has two
major parts:
$ cd gitosis-admin
$ find .
./gitosis.conf
./keydir
./keydir/scott.pub
The gitosis.conf file is the control file you use to specify users,
repositories, and permissions. The keydir directory is where you
store the public keys of all the users who have any sort of access to
your repositories, one file per user. The name of the file in keydir
(in the previous example, scott.pub) will be different for you; Gitosis
takes that name from the description at the end of the public key
that was imported with the gitosis-init script.
If you look at the gitosis.conf file, it should only specify
information about the gitosis-admin project that you just cloned:
$ cat gitosis.conf
[gitosis]
[group gitosis-admin]
writable = gitosis-admin
members = scott
It shows you that the 'scott' user (the user with whose public
key you initialized Gitosis) is the only one who has access to the
gitosis-admin project.
Now, let's add a new project for you. You'll add a new section
called mobile where you'll list the developers on your mobile team
and projects that those developers need access to. Because 'scott'
is the only user in the system right now, you'll add him as the only
member, and you'll create a new project called iphone_project to start
on:
[group mobile]
writable = iphone_project
members = scott
Whenever you make changes to the gitosis-admin project, you have
to commit the changes and push them back up to the server in order
for them to take effect:
$ git commit -am 'add iphone_project and mobile group'
[master]: created 8962da8: "changed name"
1 files changed, 4 insertions(+), 0 deletions(-)
$ git push
Counting objects: 5, done.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 272 bytes, done.
Total 3 (delta 1), reused 0 (delta 0)
To git@gitserver:/opt/git/gitosis-admin.git
fb27aec..8962da8 master -> master
You can make your first push to the new iphone_project project by
adding your server as a remote to your local version of the project
and pushing. You no longer have to manually create a bare repository
for new projects on the server; Gitosis creates them automatically
when it sees the first push:
$ git remote add origin git@gitserver:iphone_project.git
$ git push origin master
Initialized empty Git repository in /opt/git/iphone_project.git/
Counting objects: 3, done.
Writing objects: 100% (3/3), 230 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
To git@gitserver:iphone_project.git
* [new branch] master -> master
Notice that you don't need to specify the path (in fact, doing so
won't work), just a colon and then the name of the project; Gitosis
finds it for you.
You want to work on this project with your friends, so you'll have to
re-add their public keys. But instead of appending them manually to
the ~/.ssh/authorized_keys file on your server, you'll add them, one key
per file, into the keydir directory. How you name the keys determines
how you refer to the users in the gitosis.conf file. Let's re-add the
public keys for John, Josie, and Jessica:
$ cp /tmp/id_rsa.john.pub keydir/john.pub
$ cp /tmp/id_rsa.josie.pub keydir/josie.pub
$ cp /tmp/id_rsa.jessica.pub keydir/jessica.pub
Now you can add them all to your 'mobile' team so they have read
and write access to iphone_project:
[group mobile]
writable = iphone_project
members = scott john josie jessica
After you commit and push that change, all four users will be able
to read from and write to that project.
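One detail worth spelling out: newly copied key files are untracked, so git commit -a on its own would leave them out. The sketch below stages them explicitly. It runs entirely in a scratch directory, with a local bare repository standing in for the server; the placeholder keys, the commit identity, and the commit messages are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: stage new key files explicitly before committing and pushing.
# A local bare repository stands in for git@gitserver:gitosis-admin.git.
set -eu

WORK="$(mktemp -d)"
cd "$WORK"

# Placeholder identity for the commits made below.
GIT_AUTHOR_NAME='Example'; GIT_AUTHOR_EMAIL='example@example.com'
GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME"; GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

git init -q --bare server.git
git clone -q server.git gitosis-admin
cd gitosis-admin
git symbolic-ref HEAD refs/heads/master

# Starting state, roughly what a fresh gitosis-admin clone contains.
mkdir keydir
echo 'ssh-rsa AAAAplaceholder scott@example' > keydir/scott.pub
printf '[gitosis]\n\n[group gitosis-admin]\nwritable = gitosis-admin\nmembers = scott\n' > gitosis.conf
git add .
git commit -q -m 'initial gitosis-admin contents'

# Add a user: one new key file plus an edit to gitosis.conf.
echo 'ssh-rsa AAAAplaceholder john@example' > keydir/john.pub
printf '\n[group mobile]\nwritable = iphone_project\nmembers = scott john\n' >> gitosis.conf

# The key file is untracked, so name it explicitly when staging.
git add keydir/john.pub gitosis.conf
git commit -q -m 'add john to the mobile group'
git push -q origin master
```

Against a real Gitosis server the last command is the step that triggers the reconfiguration described above.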
Gitosis has simple access controls as well. If you want John to
have only read access to this project, you can do this instead:
[group mobile]
writable = iphone_project
members = scott josie jessica
[group mobile_ro]
readonly = iphone_project
members = john
Now John can clone the project and get updates, but Gitosis won't
allow him to push back up to the project. You can create as many
of these groups as you want, each containing different users and
projects. You can also specify another group as one of the members,
to inherit all of its members automatically.
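As a sketch of group inheritance, the script below writes a configuration in which one group names another as a member. Note the assumptions: the @ prefix for referring to a group is how we understand Gitosis to mark group references, but the text above does not spell out the syntax, so check it against your Gitosis version. The mobile_docs group, the iphone_docs project, and the CONF variable are hypothetical.

```shell
#!/bin/sh
# Sketch: a group that pulls in another group's members.
# CONF is a scratch file here; in practice it is gitosis.conf in your
# gitosis-admin clone. Group and project names are hypothetical, and the
# "@group" member syntax is an assumption to verify for your version.
set -eu

CONF="${CONF:-$(mktemp)}"

cat >> "$CONF" <<'EOF'
[group mobile]
writable = iphone_project
members = scott josie jessica

[group mobile_docs]
writable = iphone_docs
members = @mobile john
EOF

cat "$CONF"
```

Under that assumption, everyone in mobile plus John would be able to write to iphone_docs, while John still has no write access to iphone_project.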
If you have any issues, it may be useful to add loglevel=DEBUG under
the [gitosis] section. If you've lost push access by pushing a
messed-up configuration, you can manually fix the file on the server under
/home/git/.gitosis.conf, the file from which Gitosis reads its info. A
push to the project takes the gitosis.conf file you just pushed up and
sticks it there. If you edit that file manually, it remains like that until
the next successful push to the gitosis-admin project.
4.8 Git Daemon
For public, unauthenticated read access to your projects, you'll want
to move past the HTTP protocol and start using the Git protocol. The
main reason is speed. The Git protocol is far more efficient and thus
faster than the HTTP protocol, so using it will save your users time.
Again, this is for unauthenticated read-only access. If you're
running this on a server outside your firewall, it should only be used for
projects that are publicly visible to the world. If the server you're
running it on is inside your firewall, you might use it for projects
that a large number of people or computers (continuous integration
or build servers) have read-only access to, when you don't want to
have to add an SSH key for each.
In any case, the Git protocol is relatively easy to set up. Basically,
you need to run this command in a daemonized manner:
git daemon --reuseaddr --base-path=/opt/git/ /opt/git/
--reuseaddr allows the server to restart without waiting for old
connections to time out, the --base-path option allows people to clone
projects without specifying the entire path, and the path at the end
tells the Git daemon where to look for repositories to export. If you're
running a firewall, you'll also need to punch a hole in it at port 9418
on the box you're setting this up on.
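The effect of --base-path is easiest to see as a path mapping. The following sketch only does the string arithmetic; it does not start or contact a daemon, and the url and BASE_PATH values are simply the ones used in this chapter's examples.

```shell
#!/bin/sh
# Sketch: how --base-path turns the path in a git:// URL into a
# directory on the server. Pure string manipulation, no network access.
set -eu

BASE_PATH="/opt/git"
url="git://gitserver/project.git"

# Strip the scheme and host, leaving "/project.git".
request_path="/${url#git://*/}"

# The daemon looks for the repository under the base path.
resolved="$BASE_PATH$request_path"
echo "$url -> $resolved"
```

In other words, with the daemon command above a user would clone git://gitserver/project.git rather than git://gitserver/opt/git/project.git.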
You can daemonize this process a number of ways, depending on
the operating system you're running. On an Ubuntu machine, you
use an Upstart script. So, in the following file
/etc/event.d/local-git-daemon
you put this script:
start on startup
stop on shutdown
exec /usr/bin/git daemon \
    --user=git --group=git \
    --reuseaddr \
    --base-path=/opt/git/ \
    /opt/git/
respawn
For security reasons, it is strongly encouraged to have this
daemon run as a user with read-only permissions to the repositories;
you can easily do this by creating a new user 'git-ro' and running the
daemon as them. For the sake of simplicity we'll simply run it as the
same 'git' user that Gitosis is running as.
When you restart your machine, your Git daemon will start
automatically and respawn if it goes down. To get it running without
having to reboot, you can run this:
initctl start local-git-daemon
On other systems, you may want to use xinetd, a script in your
sysvinit system, or something else, as long as you get that
command daemonized and watched somehow.
Next, you have to tell your Gitosis server which repositories to
allow unauthenticated Git server-based access to. If you add a
section for each repository, you can specify the ones from which you
want your Git daemon to allow reading. If you want to allow Git
protocol access for your iPhone project, you add this to the end of the
gitosis.conf file:
[repo iphone_project]
daemon = yes
When that is committed and pushed up, your running daemon
should start serving requests for the project to anyone who has
access to port 9418 on your server.
If you decide not to use Gitosis, but you want to set up a Git
daemon, you'll have to run this on each project you want the Git daemon
to serve:
$ cd /path/to/project.git
$ touch git-daemon-export-ok
The presence of that file tells Git that it's OK to serve this project
without authentication.
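If every repository in a directory is meant to be public, the touch step can be looped. This is a minimal sketch under that assumption: REPO_ROOT is a hypothetical variable defaulting to a scratch directory (the text uses /opt/git), and the two stand-in directories exist only so the script runs on its own.

```shell
#!/bin/sh
# Sketch: mark every repository under one directory as exportable.
# REPO_ROOT is a scratch directory here; the text uses /opt/git.
set -eu

REPO_ROOT="${REPO_ROOT:-$(mktemp -d)}"

# Stand-in repositories so the sketch is self-contained.
mkdir -p "$REPO_ROOT/project.git" "$REPO_ROOT/iphone_project.git"

for repo in "$REPO_ROOT"/*.git; do
    # Skip anything that is not a directory (including an unmatched glob).
    [ -d "$repo" ] || continue
    touch "$repo/git-daemon-export-ok"
    echo "exported: $repo"
done
```

Only do this when nothing under the directory is private, because the daemon performs no authentication; removing the marker file from a repository stops it from being served.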
Gitosis can also control which projects GitWeb shows. First, you
need to add something like the following to the /etc/gitweb.conf file:
$projects_list = "/home/git/gitosis/projects.list";
$projectroot = "/home/git/repositories";
$export_ok = "git-daemon-export-ok";
@git_base_url_list = ('git://gitserver');
You can control which projects GitWeb lets users browse by adding
or removing a gitweb setting in the Gitosis configuration file. For
instance, if you want the iPhone project to show up on GitWeb, you
make the repo setting look like this:
[repo iphone_project]
daemon = yes
gitweb = yes
Now, if you commit and push the project, GitWeb will
automatically start showing your iPhone project.
4.9 Hosted Git
If you don't want to go through all of the work involved in setting
up your own Git server, you have several options for hosting your
Git projects on an external dedicated hosting site. Doing so offers a
number of advantages: a hosting site is generally quick to set up and
easy to start projects on, and no server maintenance or monitoring
is involved. Even if you set up and run your own server internally,
you may still want to use a public hosting site for your open source
code; it's generally easier for the open source community to find
and help you with.
These days, you have a huge number of hosting options to choose
from, each with different advantages and disadvantages. To see an
up-to-date list, check out the GitHosting page on the main Git wiki:
http://git.or.cz/gitwiki/GitHosting
Because we can't cover all of them, and because I happen to work
at one of them, we'll use this section to walk through setting up an
account and creating a new project at GitHub. This will give you an
idea of what is involved.
GitHub is by far the largest open source Git hosting site and it's
also one of the very few that offers both public and private hosting
options so you can keep your open source and private commercial
code in the same place. In fact, we used GitHub to privately
collaborate on this book.
4.9.1 GitHub
GitHub is slightly different than most code-hosting sites in the way
that it namespaces projects. Instead of being primarily based on
the project, GitHub is user centric. That means when I host my grit
project on GitHub, you won't find it at github.com/grit but instead at
github.com/schacon/grit. There is no canonical version of any project,
which allows a project to move from one user to another seamlessly
if the first author abandons the project.
GitHub is also a commercial company that charges for accounts
that maintain private repositories, but anyone can quickly get a free
account to host as many open source projects as they want. We'll
quickly go over how that is done.
4.9.2 Setting Up a User Account
The first thing you need to do is set up a free user account. If you visit
the Pricing and Signup page at http://github.com/plans and click
the “Sign Up” button on the free account (see Figure 4.2), you're
taken to the signup page.
Figure 4.2: The GitHub plan page
Here you must choose a username that isn't yet taken in the
system and enter an e-mail address that will be associated with the
account and a password (see Figure 4.3).
If you have it available, this is a good time to add your public
SSH key as well. We covered how to generate a new key earlier, in
the “Generating Your SSH Public Key” section. Take the contents of
the public key of that pair, and paste it into the SSH Public Key text
box. Clicking the “explain ssh keys” link takes you to detailed
instructions on how to do so on all major operating systems. Clicking
the “I agree, sign me up” button takes you to your new user
dashboard (see Figure 4.4).
Next you can create a new repository.
4.9.3 Creating a New Repository
Start by clicking the “create a new one” link next to Your Repositories
on the user dashboard. You're taken to the Create a New Repository
form (see Figure 4.5).
All you really have to do is provide a project name, but you can also
add a description. When that is done, click the “Create Repository”
button. Now you have a new repository on GitHub (see Figure 4.6).
Since you have no code there yet, GitHub will show you
instructions for how to create a brand-new project, push an existing Git
project up, or import a project from a public Subversion repository
(see Figure 4.7).
Figure 4.3: The GitHub user signup form
Figure 4.4: The GitHub user dashboard
These instructions are similar to what we've already gone over.
To initialize a project if it isn't already a Git project, you use
$ git init
$ git add .
$ git commit -m 'initial commit'
When you have a Git repository locally, add GitHub as a remote
and push up your master branch:
$ git remote add origin git@github.com:testinguser/iphone_project.git
$ git push origin master
Figure 4.5: Creating a new repository on GitHub
Figure 4.6: GitHub project header information
Now your project is hosted on GitHub, and you can give the URL to
anyone you want to share your project with. In this case, it's
http://github.com/testinguser/iphone_project. You can also see from
the header on each of your project's pages that you have two Git
URLs (see Figure 4.8).
The Public Clone URL is a public, read-only Git URL over which
anyone can clone the project. Feel free to give out that URL and post
it on your web site or what have you.
The Your Clone URL is a read/write SSH-based URL that you can
read or write over only if you connect with the SSH private key
associated with the public key you uploaded for your user. When other
users visit this project page, they won't see that URL, only the public
one.
4.9.4 Importing from Subversion
If you have an existing public Subversion project that you want to
import into Git, GitHub can often do that for you. At the bottom of
the instructions page is a link to a Subversion import. If you click
it, you see a form with information about the import process and a
text box where you can paste in the URL of your public Subversion
project (see Figure 4.9).
Figure 4.7: Instructions for a new repository
Figure 4.8: Project header with a public URL and a private URL
If your project is very large, nonstandard, or private, this process
probably won't work for you. In Chapter 7, you'll learn how to do
more complicated manual project imports.
4.9.5 Adding Collaborators
Let's add the rest of the team. If John, Josie, and Jessica all sign up
for accounts on GitHub, and you want to give them push access to
your repository, you can add them to your project as collaborators.
Doing so will allow pushes from their public keys to work.
Figure 4.9: Subversion importing interface
Click the “edit” button in the project header or the Admin tab at
the top of the project to reach the Admin page of your GitHub project
(see Figure 4.10).
Figure 4.10: GitHub administration page
To give another user write access to your project, click the “Add
another collaborator” link. A new text box appears, into which you
can type a username. As you type, a helper pops up, showing you
possible username matches. When you find the correct user, click
the Add button to add that user as a collaborator on your project
(see Figure 4.11).
When you're finished adding collaborators, you should see a list
of them in the Repository Collaborators box (see Figure 4.12).
If you need to revoke access to individuals, you can click the
“revoke” link, and their push access will be removed. For future
projects, you can also copy collaborator groups by copying the
permissions of an existing project.
Figure 4.11: Adding a collaborator to your project
Figure 4.12: A list of collaborators on your project
4.9.6 Your Project
After you push your project up or have it imported from Subversion,
you have a main project page that looks something like Figure 4.13.
When people visit your project, they see this page. It contains tabs
to different aspects of your projects. The Commits tab shows a list of
commits in reverse chronological order, similar to the output of the
git log command. The Network tab shows all the people who have
forked your project and contributed back. The Downloads tab allows
you to upload project binaries and link to tarballs and zipped versions
of any tagged points in your project. The Wiki tab provides a wiki
where you can write documentation or other information about your
project. The Graphs tab has some contribution visualizations and
statistics about your project. The main Source tab that you land on
shows your project's main directory listing and automatically renders
the README file below it if you have one. This tab also shows a box
with the latest commit information.
Figure 4.13: A GitHub main project page
4.9.7 Forking Projects
If you want to contribute to an existing project to which you don't
have push access, GitHub encourages forking the project. When you
land on a project page that looks interesting and you want to hack
on it a bit, you can click the “fork” button in the project header to
have GitHub copy that project to your user so you can push to it.
This way, projects don't have to worry about adding users as
collaborators to give them push access. People can fork a project and
push to it, and the main project maintainer can pull in those changes
by adding them as remotes and merging in their work.
To Iork a project, vIsIt the project page (In thIs case, mojombo/
chronIc) and cIIck the “Iork” button In the header (see IIgure 4.14).
button.
AIter a Iew seconds, you're taken to your new project page, whIch
IndIcates that thIs project Is a Iork oI another one (see IIgure 4-15).
4.9.8 GitHub Summary
That's all we'll cover about GitHub, but it's important to note how
quickly you can do all this. You can create an account, add a new
project, and push to it in a matter of minutes. If your project is open
source, you also get a huge community of developers who now have
visibility into your project and may well fork it and help contribute
to it. At the very least, this may be a way to get up and running with
Git and try it out quickly.
Figure 4.14: Get a writable copy of any repository by clicking the “fork” button
Figure 4.15: Your fork of a project
4.10 Summary
You have several options to get a remote Git repository up and
running so that you can collaborate with others or share your work.
Running your own server gives you a lot of control and allows
you to run the server within your own firewall, but such a server
generally requires a fair amount of your time to set up and maintain.
If you place your data on a hosted server, it's easy to set up and
maintain; however, you have to be able to keep your code on someone
else's servers, and some organizations don't allow that.
It should be fairly straightforward to determine which solution or
combination of solutions is appropriate for you and your
organization.
Chapter 5
Distributed Git
Now that you have a remote Git repository set up as a point for all
the developers to share their code, and you're familiar with basic Git
commands in a local workflow, you'll look at how to utilize some of
the distributed workflows that Git affords you.
In this chapter, you'll see how to work with Git in a distributed
environment as a contributor and an integrator. That is, you'll learn
how to contribute code successfully to a project and make it as easy
on you and the project maintainer as possible, and also how to
maintain a project successfully with a number of developers contributing.
5.1 Distributed Workflows
Unlike Centralized Version Control Systems (CVCSs), the distributed
nature of Git allows you to be far more flexible in how developers
collaborate on projects. In centralized systems, every developer is
a node working more or less equally on a central hub. In Git,
however, every developer is potentially both a node and a hub: that is,
every developer can both contribute code to other repositories and
maintain a public repository on which others can base their work
and which they can contribute to. This opens a vast range of
workflow possibilities for your project and/or your team, so I'll cover a
few common paradigms that take advantage of this flexibility. I'll go
over the strengths and possible weaknesses of each design; you can
choose a single one to use, or you can mix and match features from
each.
5.1.1 Centralized Workflow
In centralized systems, there is generally a single collaboration
model: the centralized workflow. One central hub, or repository, can
accept code, and everyone synchronizes their work to it. A number of
developers are nodes (consumers of that hub) and synchronize
to that one place (see Figure 5.1).
Figure 5.1: Centralized workflow
This means that if two developers clone from the hub and both
make changes, the first developer to push their changes back up can
do so with no problems. The second developer must merge in the
first one's work before pushing changes up, so as not to overwrite
the first developer's changes. This concept is true in Git as it is in
Subversion (or any CVCS), and this model works perfectly in Git.
If you have a small team or are already comfortable with a
centralized workflow in your company or team, you can easily continue
using that workflow with Git. Simply set up a single repository, and
give everyone on your team push access; Git won't let users
overwrite each other. If one developer clones, makes changes, and then
tries to push their changes while another developer has pushed in
the meantime, the server will reject that developer's changes. They
will be told that they're trying to push non-fast-forward changes and
that they won't be able to do so until they fetch and merge. This
workflow is attractive to a lot of people because it's a paradigm that
many are familiar and comfortable with.
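The rejection described above can be reproduced with throwaway repositories. This is a minimal sketch, assuming Git is installed; the directory names (seed, hub.git, john, jessica), the file contents, and the demo identity are made up for illustration, and a local bare repository stands in for the server.

```shell
set -e
# Demo identity so commits work without any global configuration.
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME='Demo User' GIT_COMMITTER_EMAIL='demo@example.com'
cd "$(mktemp -d)"

# Seed a shared hub with one commit on master.
git init -q seed
(cd seed &&
  git symbolic-ref HEAD refs/heads/master &&
  echo 'hello' > README &&
  git add README &&
  git commit -q -m 'initial commit')
git clone -q --bare seed hub.git

# Two developers clone the hub and each commit locally.
git clone -q hub.git john
git clone -q hub.git jessica
(cd john && echo 'fix' > lib.rb && git add lib.rb &&
  git commit -q -m 'remove invalid default value')
(cd jessica && echo 'task' > TODO && git add TODO &&
  git commit -q -m 'add reset task')

# The first push succeeds.
(cd jessica && git push -q origin master)

# The second push is refused because it is not a fast-forward.
if (cd john && git push -q origin master 2>/dev/null); then
  john_push=accepted
else
  john_push=rejected
fi
echo "second push was $john_push"
```

Note that the two developers touched different files; the push is still refused, because the check is about commit ancestry rather than file contents.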
5.1.2 Integration-Manager Workflow
Because Git allows you to have multiple remote repositories, it's
possible to have a workflow where each developer has write access to
their own public repository and read access to everyone else's. This
scenario often includes a canonical repository that represents the
“official” project. To contribute to that project, you create your own
public clone of the project and push your changes to it. Then, you
can send a request to the maintainer of the main project to pull in
your changes. They can add your repository as a remote, test your
changes locally, merge them into their branch, and push back to their
repository. The process works as follows (see Figure 5.2).
1. The project maintainer pushes to their public repository.
2. A contributor clones that repository and makes changes.
3. The contributor pushes to their own public copy.
4. The contributor sends the maintainer an e-mail asking them to
pull changes.
5. The maintainer adds the contributor's repo as a remote and
merges locally.
6. The maintainer pushes merged changes to the main repository.
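The six steps can be walked through with local repositories. This is only a sketch under stated assumptions: Git is installed, the e-mail in step 4 is skipped, and every name here (seed, blessed.git, contrib.git, the remote name contributor) is hypothetical.

```shell
set -e
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME='Demo User' GIT_COMMITTER_EMAIL='demo@example.com'
cd "$(mktemp -d)"
root=$(pwd)

# Step 1: the maintainer publishes the canonical repository.
git init -q seed
(cd seed &&
  git symbolic-ref HEAD refs/heads/master &&
  echo 'hello' > README && git add README &&
  git commit -q -m 'initial commit')
git clone -q --bare seed blessed.git

# The contributor's own public copy (what a hosting site calls a fork).
git clone -q --bare blessed.git contrib.git

# Steps 2 and 3: clone, change, push to the contributor's public copy.
git clone -q blessed.git contributor
(cd contributor &&
  echo 'patch' > patch.txt && git add patch.txt &&
  git commit -q -m 'add patch' &&
  git push -q "$root/contrib.git" master)

# Steps 5 and 6: the maintainer adds the remote, merges, and pushes.
git clone -q blessed.git maintainer
cd maintainer
git remote add contributor "$root/contrib.git"
git fetch -q contributor
git merge -q contributor/master
git push -q origin master

blessed_tip=$(git --git-dir="$root/blessed.git" rev-parse master)
contrib_tip=$(git --git-dir="$root/contrib.git" rev-parse master)
echo "blessed: $blessed_tip"
echo "contrib: $contrib_tip"
```

In this sketch the merge is a fast-forward, so both repositories end at the same commit; with concurrent work on the canonical repository the maintainer would get a merge commit instead.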
Figure 5.2: Integration-manager workflow
This is a very common workflow with sites like GitHub, where
it's easy to fork a project and push your changes into your fork for
everyone to see. One of the main advantages of this approach is that
you can continue to work, and the maintainer of the main repository
can pull in your changes at any time. Contributors don't have to wait
for the project to incorporate their changes; each party can work
at their own pace.
5.1.3 Dictator and Lieutenants Workflow
This is a variant of a multiple-repository workflow. It's generally used
by huge projects with hundreds of collaborators; one famous example
is the Linux kernel. Various integration managers are in charge
of certain parts of the repository; they're called lieutenants. All the
lieutenants have one integration manager known as the benevolent
dictator. The benevolent dictator's repository serves as the reference
repository from which all the collaborators need to pull. The
process works like this (see Figure 5.3).
1. Regular developers work on their topic branch and rebase their
work on top of master. The master branch is that of the dictator.
2. Lieutenants merge the developers' topic branches into their
master branch.
3. The dictator merges the lieutenants' master branches into the
dictator's master branch.
4. The dictator pushes their master to the reference repository so
the other developers can rebase on it.
Figure 5.3: Benevolent dictator workflow
This kind of workflow isn't common but can be useful in very big
projects or in highly hierarchical environments, because it allows
the project leader (the dictator) to delegate much of the work and
collect large subsets of code at multiple points before integrating
them.
These are some commonly used workflows that are possible with
a distributed system like Git, but you can see that many variations
are possible to suit your particular real-world workflow. Now that
you can (I hope) determine which workflow combination may work
for you, I'll cover some more specific examples of how to accomplish
the main roles that make up the different flows.
5.2 Contributing to a Project
You know what the different workflows are, and you should have a
pretty good grasp of fundamental Git usage. In this section, you'll
learn about a few common patterns for contributing to a project.
The main difficulty with describing this process is that there are
a huge number of variations on how it's done. Because Git is very
flexible, people can and do work together in many ways, and it's
problematic to describe how you should contribute to a project: every
project is a bit different. Some of the variables involved are active
contributor size, chosen workflow, your commit access, and possibly
the external contribution method.
The first variable is active contributor size. How many users are
actively contributing code to this project, and how often? In many
instances, you'll have two or three developers with a few commits
a day, or possibly less for somewhat dormant projects. For really
large companies or projects, the number of developers could be in
the thousands, with dozens or even hundreds of patches coming in
each day. This is important because with more and more developers,
you run into more issues with making sure your code applies cleanly
or can be easily merged. Changes you submit may be rendered
obsolete or severely broken by work that is merged in while you were
working or while your changes were waiting to be approved or
applied. How can you keep your code consistently up to date and your
patches valid?
The next variable is the workflow in use for the project. Is it
centralized, with each developer having equal write access to the main
codeline? Does the project have a maintainer or integration manager
who checks all the patches? Are all the patches peer-reviewed and
approved? Are you involved in that process? Is a lieutenant system
in place, and do you have to submit your work to them first?
The next issue is your commit access. The workflow required in
order to contribute to a project is much different if you have write
access to the project than if you don't. If you don't have write access,
how does the project prefer to accept contributed work? Does it even
have a policy? How much work are you contributing at a time? How
often do you contribute?
All these questions can affect how you contribute effectively to
a project and what workflows are preferred or available to you. I'll
cover aspects of each of these in a series of use cases, moving from
simple to more complex; you should be able to construct the specific
workflows you need in practice from these examples.
5.2.1 Commit Guidelines
Before you start looking at the specific use cases, here's a quick note
about commit messages. Having a good guideline for creating
commits and sticking to it makes working with Git and collaborating with
others a lot easier. The Git project provides a document that lays out
a number of good tips for creating commits from which to submit
patches; you can read it in the Git source code in the
Documentation/SubmittingPatches file.
First, you don't want to submit any whitespace errors. Git
provides an easy way to check for this: before you commit, run
git diff --check, which identifies possible whitespace errors and lists
them for you. Here is an example, where I've replaced a red terminal
color with Xs.
$ git diff --check
lib/simplegit.rb:5: trailing whitespace.
+ @git_dir = File.expand_path(git_dir)XX
lib/simplegit.rb:7: trailing whitespace.
+ XXXXXXXXXXX
lib/simplegit.rb:26: trailing whitespace.
+ def command(git_cmd)XXXX
If you run that command before committing, you can tell if you're
about to commit whitespace issues that may annoy other developers.
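Because git diff --check exits with a non-zero status when it finds problems, it can also gate a script. The following is a small sketch, assuming Git is installed; the repository name demo and the file simplegit.rb with its contents are invented for the example.

```shell
set -e
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME='Demo User' GIT_COMMITTER_EMAIL='demo@example.com'
cd "$(mktemp -d)"

git init -q demo
cd demo
printf 'def command(git_cmd)\nend\n' > simplegit.rb
git add simplegit.rb
git commit -q -m 'add command stub'

# Introduce trailing spaces on the first line, without staging them.
printf 'def command(git_cmd)   \nend\n' > simplegit.rb

# Capture the report and branch on the exit status.
if report=$(git diff --check); then
  status=clean
else
  status=dirty
fi
echo "$report"
echo "working tree is $status"
```

Running the check on the unstaged diff, as here, matches the advice above to look before you commit.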
Next, try to make each commit a logically separate changeset.
If you can, try to make your changes digestible: don't code for a
whole weekend on five different issues and then submit them all as
one massive commit on Monday. Even if you don't commit during
the weekend, use the staging area on Monday to split your work into
at least one commit per issue, with a useful message per commit. If
some of the changes modify the same file, try to use git add --patch
to partially stage files (covered in detail in Chapter 6). The project
snapshot at the tip of the branch is identical whether you do one
commit or five, as long as all the changes are added at some point,
so try to make things easier on your fellow developers when they
have to review your changes. This approach also makes it easier to
pull out or revert one of the changesets if you need to later. Chapter
6 describes a number of useful Git tricks for rewriting history and
interactively staging files; use these tools to help craft a clean and
understandable history.
The last thing to keep in mind is the commit message. Getting
in the habit of creating quality commit messages makes using and
collaborating with Git a lot easier. As a general rule, your messages
should start with a single line that's no more than about 50
characters and that describes the changeset concisely, followed by a blank
line, followed by a more detailed explanation. The Git project
requires that the more detailed explanation include your motivation
for the change and contrast its implementation with previous
behavior; this is a good guideline to follow. It's also a good idea to use
the imperative present tense in these messages. In other words, use
commands. Instead of “I added tests for” or “Adding tests for,” use
“Add tests for.” Here is a template originally written by Tim Pope at
tpope.net.
Short (50 chars or less) summary of changes

More detailed explanatory text, if necessary. Wrap it to about 72
characters or so. In some contexts, the first line is treated as the
subject of an email and the rest of the text as the body. The blank
line separating the summary from the body is critical (unless you omit
the body entirely); tools like rebase can get confused if you run the
two together.

Further paragraphs come after blank lines.

 - Bullet points are okay, too

 - Typically a hyphen or asterisk is used for the bullet, preceded by a
   single space, with blank lines in between, but conventions vary here
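The two structural rules in the template (a short summary line, then a blank line) are easy to check mechanically. Here is a rough sketch in plain shell; check_commit_msg is a hypothetical helper written for this example, not a Git command, and the 50-character limit is the guideline from the text rather than anything Git enforces.

```shell
# check_commit_msg FILE
# Hypothetical helper: checks the first two lines of a commit message file.
check_commit_msg() {
  subject=$(sed -n '1p' "$1")
  second=$(sed -n '2p' "$1")
  if [ "${#subject}" -gt 50 ]; then
    echo "summary line is ${#subject} characters; aim for 50 or fewer"
    return 1
  fi
  if [ -n "$second" ]; then
    echo "second line should be blank"
    return 1
  fi
  echo "ok"
}

# Usage: write a message to a scratch file and check it.
msg=$(mktemp)
printf 'Add tests for log limit\n\nExplain the motivation here.\n' > "$msg"
check_commit_msg "$msg"
```

A helper like this could be called from a commit-msg hook, which receives the path of the message file as its first argument, but treat that wiring as an assumption to verify against your own setup.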
If all your commit messages look like this, things will be a lot
easier for you and the developers you work with. The Git project has
well-formatted commit messages; I encourage you to run
git log --no-merges there to see what a nicely formatted project-commit
history looks like.
In the following examples, and throughout most of this book, for
the sake of brevity I don't format messages nicely like this; instead,
I use the -m option to git commit. Do as I say, not as I do.
5.2.2 Private Small Team
The simplest setup you're likely to encounter is a private project with
one or two other developers. By private, I mean closed source: not
read-accessible to the outside world. You and the other developers
all have push access to the repository.
In this environment, you can follow a workflow similar to what you
might do when using Subversion or another centralized system. You
still get the advantages of things like offline committing and vastly
simpler branching and merging, but the workflow can be very
similar; the main difference is that merges happen client-side rather than
on the server at commit time. Let's see what it might look like when
two developers start to work together with a shared repository. The
first developer, John, clones the repository, makes a change, and
commits locally. (I'm replacing the protocol messages with ... in
these examples to shorten them somewhat.)
# John's Machine
$ git clone john@githost:simplegit.git
Initialized empty Git repository in /home/john/simplegit/.git/
...
$ cd simplegit/
$ vim lib/simplegit.rb
$ git commit -am 'removed invalid default value'
[master 738ee87] removed invalid default value
1 files changed, 1 insertions(+), 1 deletions(-)
The second developer, Jessica, does the same thing: she clones the
repository and commits a change.
# Jessica's Machine
$ git clone jessica@githost:simplegit.git
Initialized empty Git repository in /home/jessica/simplegit/.git/
...
$ cd simplegit/
$ vim TODO
$ git commit -am 'add reset task'
[master fbff5bc] add reset task
1 files changed, 1 insertions(+), 0 deletions(-)
Now, Jessica pushes her work up to the server.
# Jessica's Machine
$ git push origin master
...
To jessica@githost:simplegit.git
1edee6b..fbff5bc master -> master
John tries to push his change up, too.
# John's Machine
$ git push origin master
To john@githost:simplegit.git
! [rejected] master -> master (non-fast forward)
error: failed to push some refs to 'john@githost:simplegit.git'
John isn't allowed to push because Jessica has pushed in the
meantime. This is especially important to understand if you're used to
Subversion, because you'll notice that the two developers didn't edit
the same file. Although Subversion automatically does such a merge
on the server if different files are edited, in Git you must merge the
commits locally. John has to fetch Jessica's changes and merge them
in before he will be allowed to push.
$ git fetch origin
...
From john@githost:simplegit
+ 049d078...fbff5bc master -> origin/master
At this point, John's local repository looks something like Figure
5.4.
Figure 5.4: John's initial repository
John has a reference to the changes Jessica pushed up, but he has
to merge them into his own work before he is allowed to push.
$ git merge origin/master
Merge made by recursive.
TODO | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
The merge goes smoothly; John's commit history now looks like
Figure 5.5.
Now, John can test his code to make sure it still works properly,
and then he can push his new merged work up to the server.
Figure 5.5: John's repository after merging origin/master
$ git push origin master
...
To john@githost:simplegit.git
fbff5bc..72bbc59 master -> master
Finally, John's commit history looks like Figure 5.6.
Figure 5.6: John's history after pushing to the origin server
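John's fetch, merge, and push can be replayed end to end with scratch repositories. This is a sketch under a few assumptions: Git is installed, a local bare repository (hub.git) plays the server, and the file names and commit messages are stand-ins for the ones in the transcripts.

```shell
set -e
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME='Demo User' GIT_COMMITTER_EMAIL='demo@example.com'
cd "$(mktemp -d)"

git init -q seed
(cd seed &&
  git symbolic-ref HEAD refs/heads/master &&
  echo 'hello' > README && git add README &&
  git commit -q -m 'initial commit')
git clone -q --bare seed hub.git
git clone -q hub.git john
git clone -q hub.git jessica

# Jessica commits and pushes first.
(cd jessica && echo 'task' > TODO && git add TODO &&
  git commit -q -m 'add reset task' &&
  git push -q origin master)

# John commits, then fetches and merges before pushing.
cd john
echo 'fix' > lib.rb
git add lib.rb
git commit -q -m 'remove invalid default value'
git fetch -q origin
git merge -q -m 'Merge origin/master' origin/master

# rev-list --parents prints the commit followed by its parents.
parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
git push -q origin master

local_tip=$(git rev-parse HEAD)
hub_tip=$(git --git-dir=../hub.git rev-parse master)
echo "merge commit has $parents parents"
```

The merge commit has two parents because both histories are preserved; once it exists, the push is an ordinary fast-forward from the server's point of view.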
In the meantime, Jessica has been working on a topic branch.
She's created a topic branch called issue54 and done three commits
on that branch. She hasn't fetched John's changes yet, so her commit
history looks like Figure 5.7.
Figure 5.7: Jessica's initial commit history
Jessica wants to sync up with John, so she fetches.
# Jessica's Machine
$ git fetch origin
...
From jessica@githost:simplegit
fbff5bc..72bbc59 master -> origin/master
That pulls down the work John has pushed up in the meantime.
Jessica's history now looks like Figure 5.8.
Figure 5.8: Jessica's history after fetching John's changes
Jessica thinks her topic branch is ready, but she wants to know
what she has to merge her work into so that she can push. She runs
git log to find out.
$ git log --no-merges origin/master ^issue54
commit 738ee872852dfaa9d6634e0dea7a324040193016
Author: John Smith <jsmith@example.com>
Date: Fri May 29 16:01:27 2009 -0700
removed invalid default value
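The ^issue54 argument excludes every commit reachable from issue54, so the command lists what is on origin/master but not yet in the topic branch. The sketch below, which assumes Git is installed, uses a local master branch in place of origin/master and invented file names to show that the caret form and the two-dot range form select the same commits.

```shell
set -e
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME='Demo User' GIT_COMMITTER_EMAIL='demo@example.com'
cd "$(mktemp -d)"

git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
echo 'base' > README
git add README
git commit -q -m 'base'

# Topic work that master does not have.
git checkout -q -b issue54
echo 'topic' > topic.txt
git add topic.txt
git commit -q -m 'topic work'

# Work on master that the topic branch does not have.
git checkout -q master
echo 'other' > other.txt
git add other.txt
git commit -q -m 'removed invalid default value'

# Commits reachable from master but not from issue54, written two ways.
caret=$(git log --no-merges --format=%s master ^issue54)
dots=$(git log --no-merges --format=%s issue54..master)
echo "$caret"
```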
Now, Jessica can merge her topic work into her master branch,
merge John's work (origin/master) into her master branch, and then
push back to the server again. First, she switches back to her master
branch to integrate all this work.
$ git checkout master
Switched to branch "master"
Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded.
She can merge either origin/master or issue54 first; they're both
upstream, so the order doesn't matter. The end snapshot should be
identical no matter which order she chooses; only the history will be
slightly different. She chooses to merge in issue54 first.
$ git merge issue54
Updating fbff5bc..4af4298
Fast forward
README | 1 +
lib/simplegit.rb | 6 +++++-
2 files changed, 6 insertions(+), 1 deletions(-)
No problems occur; as you can see, it was a simple fast-forward.
Now Jessica merges in John's work (origin/master).
$ git merge origin/master
Auto-merging lib/simplegit.rb
Merge made by recursive.
lib/simplegit.rb | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
Everything merges cleanly, and Jessica's history looks like Figure
5.9.
Figure 5.9: Jessica's history after merging John's changes
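The earlier claim that merge order changes the history but not the end snapshot can be checked by comparing tree IDs. This is a sketch for the conflict-free case only, assuming Git is installed; the branch names upstream, order1, and order2 are invented, with upstream standing in for origin/master.

```shell
set -e
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME='Demo User' GIT_COMMITTER_EMAIL='demo@example.com'
cd "$(mktemp -d)"

git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
echo 'base' > README
git add README
git commit -q -m 'base'

# Two independent lines of work starting from master.
git checkout -q -b issue54 master
echo 'topic' > topic.txt
git add topic.txt
git commit -q -m 'topic work'
git checkout -q -b upstream master
echo 'other' > other.txt
git add other.txt
git commit -q -m 'upstream work'

# Order 1: topic branch first, then upstream.
git checkout -q -b order1 master
git merge -q issue54
git merge -q -m 'merge upstream' upstream

# Order 2: upstream first, then the topic branch.
git checkout -q -b order2 master
git merge -q upstream
git merge -q -m 'merge issue54' issue54

tree1=$(git rev-parse 'order1^{tree}')
tree2=$(git rev-parse 'order2^{tree}')
commit1=$(git rev-parse order1)
commit2=$(git rev-parse order2)
echo "trees:   $tree1 $tree2"
echo "commits: $commit1 $commit2"
```

The tree IDs match while the commit IDs differ, which is the "same snapshot, slightly different history" outcome described above.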
Now origin/master is reachable from Jessica's master branch, so she
should be able to successfully push (assuming John hasn't pushed
again in the meantime).
$ git push origin master
...
To jessica@githost:simplegit.git
72bbc59..8059c15 master -> master
Each developer has committed a few times and merged each other's
work successfully; see Figure 5.10.
Figure 5.10: Jessica's history after pushing all changes back to the server
That is one of the simplest workflows. You work for a while,
generally in a topic branch, and merge into your master branch when
it's ready to be integrated. When you want to share that work, you
merge it into your own master branch, then fetch and merge
origin/master if it has changed, and finally push to the master branch on the
server. The general sequence is something like that shown in Figure
5.11.
Figure 5.11: General sequence of events for a simple multiple-developer Git workflow
5.2.3 Private Managed Team
In this next scenario, you'll look at contributor roles in a larger
private group. You'll learn how to work in an environment where small
groups collaborate on features and then those team-based
contributions are integrated by another party.
Let's say that John and Jessica are working together on one
feature, while Jessica and Josie are working on a second. In this case,
the company is using a type of integration-manager workflow where
the work of the individual groups is integrated only by certain
engineers, and the master branch of the main repo can be updated only
by those engineers. In this scenario, all work is done in team-based
branches and pulled together by the integrators later.
Let's follow Jessica's workflow as she works on her two features,
collaborating in parallel with two different developers in this
environment. Assuming she already has her repository cloned, she
decides to work on featureA first. She creates a new branch for the
feature and does some work on it there.
# Jessica's Machine
$ git checkout -b featureA
Switched to a new branch "featureA"
$ vim lib/simplegit.rb
$ git commit -am 'add limit to log function'
[featureA 3300904] add limit to log function
1 files changed, 1 insertions(+), 1 deletions(-)
At this point, she needs to share her work with John, so she pushes
her featureA branch commits up to the server. Jessica doesn't have
push access to the master branch (only the integrators do), so she
has to push to another branch in order to collaborate with John.
$ git push origin featureA
...
To jessica@githost:simplegit.git
* [new branch] featureA -> featureA
Jessica e-mails John to tell him that she's pushed some work into
a branch named featureA and he can look at it now. While she waits
for feedback from John, Jessica decides to start working on featureB
with Josie. To begin, she starts a new feature branch, basing it off
the server's master branch.
# Jessica's Machine
$ git fetch origin
$ git checkout -b featureB origin/master
Switched to a new branch "featureB"
Now, Jessica makes a couple of commits on the featureB branch.
$ vim lib/simplegit.rb
$ git commit -am 'made the ls-tree function recursive'
[featureB e5b0fdc] made the ls-tree function recursive
1 files changed, 1 insertions(+), 1 deletions(-)
$ vim lib/simplegit.rb
$ git commit -am 'add ls-files'
[featureB 8512791] add ls-files
1 files changed, 5 insertions(+), 0 deletions(-)
Jessica's repository looks like Figure 5.12.
Figure 5.12: Jessica's initial commit history
She's ready to push up her work, but gets an e-mail from Josie
that a branch with some initial work on it was already pushed to the
server as featureBee. Jessica first needs to merge those changes in
with her own before she can push to the server. To do that, she
fetches Josie's changes down with git fetch.
$ git fetch origin
...
From jessica@githost:simplegit
* [new branch] featureBee -> origin/featureBee
Jessica can now merge this into the work she did with git merge.
$ git merge origin/featureBee
Auto-merging lib/simplegit.rb
Merge made by recursive.
lib/simplegit.rb | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
There is a bit of a problem: she needs to push the merged work
in her featureB branch to the featureBee branch on the server. She can
do so by specifying the local branch followed by a colon (:) followed
by the remote branch to the git push command.
$ git push origin featureB:featureBee
...
To jessica@githost:simplegit.git
fba9af8..cd685d1 featureB -> featureBee
This is called a refspec. See Chapter 9 for a more detailed
discussion of Git refspecs and different things you can do with them.
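To see the local:remote form in isolation, the sketch below pushes a local featureB branch to a differently named branch on a scratch server. It assumes Git is installed; unlike the scenario above, featureBee does not exist on the server beforehand, so the push creates it. The repository names and file contents are invented.

```shell
set -e
export GIT_AUTHOR_NAME='Demo User' GIT_AUTHOR_EMAIL='demo@example.com'
export GIT_COMMITTER_NAME='Demo User' GIT_COMMITTER_EMAIL='demo@example.com'
cd "$(mktemp -d)"

git init -q seed
(cd seed &&
  git symbolic-ref HEAD refs/heads/master &&
  echo 'hello' > README && git add README &&
  git commit -q -m 'initial commit')
git clone -q --bare seed hub.git
git clone -q hub.git jessica

cd jessica
git checkout -q -b featureB
echo 'work' > feature.txt
git add feature.txt
git commit -q -m 'add ls-files'

# Left of the colon: the local branch. Right of it: the branch on the server.
git push -q origin featureB:featureBee

local_tip=$(git rev-parse featureB)
remote_tip=$(git --git-dir=../hub.git rev-parse refs/heads/featureBee)

# The server never receives a branch called featureB.
if git --git-dir=../hub.git show-ref --verify -q refs/heads/featureB; then
  server_has_featureB=yes
else
  server_has_featureB=no
fi
echo "server has featureB: $server_has_featureB"
```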
Next, John e-mails Jessica to say he's pushed some changes to the
featureA branch and ask her to verify them. She runs a git fetch to
pull down those changes.
$ git fetch origin
...
From jessica@githost:simplegit
3300904..aad881d featureA -> origin/featureA
Then, she can see what has been changed with git log.
$ git log origin/featureA ^featureA
commit aad881d154acdaeb2b6b18ea0e827ed8a6d671e6
Author: John Smith <jsmith@example.com>
Date: Fri May 29 19:57:33 2009 -0700
changed log output to 30 from 25
Finally, she merges John's work into her own featureA branch.
$ git checkout featureA
Switched to branch "featureA"
$ git merge origin/featureA
Updating 3300904..aad881d
Fast forward
lib/simplegit.rb | 10 +++++++++-
1 files changed, 9 insertions(+), 1 deletions(-)
Jessica wants to tweak something, so she commits again and then
pushes this back up to the server.
$ git commit -am 'small tweak'
[featureA ed774b3] small tweak
1 files changed, 1 insertions(+), 1 deletions(-)
$ git push origin featureA
...
To jessica@githost:simplegit.git
3300904..ed774b3 featureA -> featureA
Jessica's commit history now looks something like Figure 5.13.
Figure 5.13: Jessica's history after committing on a feature branch
Jessica, Josie, and John inform the integrators that the featureA
and featureBee branches on the server are ready for integration into
the mainline. After they integrate these branches into the mainline,
a fetch will bring down the new merge commits, making the commit
history look like Figure 5.14.
Figure 5.14: Jessica's history after merging both her topic branches
Many groups switch to Git because of this ability to have multiple
teams working in parallel, merging the different lines of work late
in the process. The ability of smaller subgroups of a team to
collaborate via remote branches without necessarily having to involve or
impede the entire team is a huge benefit of Git. The sequence for
the workflow you saw here is something like Figure 5.15.
5.2.4 Public Small Project
Contributing to public projects is a bit different. Because you don't
have the permissions to directly update branches on the project, you
have to get the work to the maintainers some other way. This first
example describes contributing via forking on Git hosts that support
easy forking. The repo.or.cz and GitHub hosting sites both support
this, and many project maintainers expect this style of contribution.
The next section deals with projects that prefer to accept contributed
patches via e-mail.
First, you'll probably want to clone the main repository, create
a topic branch for the patch or patch series you're planning to
contribute, and do your work there. The sequence looks basically like
this.
$ git clone (url)
$ cd project
$ git checkout -b featureA
$ (work)
$ git commit
$ (work)
$ git commit
Figure 5.15: Basic sequence of this managed-team workflow
You may want to use rebase -i to squash your work down to a
single commit, or rearrange the work in the commits to make the
patch easier for the maintainer to review; see Chapter 6 for more
information about interactive rebasing.
When your branch work is finished and you're ready to contribute
it back to the maintainers, go to the original project page and click
the “Fork” button, creating your own writable fork of the project.
You then need to add in this new repository URL as a second remote,
in this case named myfork.
$ git remote add myfork (url)
You need to push your work up to it. It's easiest to push the topic
branch you're working on up to your repository, rather than merging
into your master branch and pushing that up. The reason is that if
the work isn't accepted or is cherry-picked, you don't have to rewind
your master branch. If the maintainers merge, rebase, or
cherry-pick your work, you'll eventually get it back via pulling from their
repository anyhow.
$ git push myfork featureA
When your work has been pushed up to your fork, you need to
notify the maintainer. This is often called a pull request, and you
can either generate it via the website (GitHub has a “pull request”
button that automatically messages the maintainer) or run the git
request-pull command and e-mail the output to the project maintainer
manually.
The request-pull command takes the base branch into which you
want your topic branch pulled and the Git repository URL you want
them to pull from, and outputs a summary of all the changes you're
asking to be pulled in. For instance, if Jessica wants to send John
a pull request, and she's done two commits on the topic branch she
just pushed up, she can run this.
$ git request-pull origin/master myfork
The following changes since commit 1edee6b1d61823a2de3b09c160d7080b8d1b3a40:
John Smith (1):
added a new function
are available in the git repository at:
git://githost/simplegit.git featureA
Jessica Smith (2):
add limit to log function
change log output to 30 from 25
lib/simplegit.rb | 10 +++++++++-
1 files changed, 9 insertions(+), 1 deletions(-)
The output can be sent to the maintainer: it tells them where the
work was branched from, summarizes the commits, and tells where
to pull this work from.
On a project for which you're not the maintainer, it's generally
easier to have a branch like master always track origin/master and to do
your work in topic branches that you can easily discard if they're
rejected. Having work themes isolated into topic branches also makes
it easier for you to rebase your work if the tip of the main
repository has moved in the meantime and your commits no longer apply
cleanly. For example, if you want to submit a second topic of work
to the project, don't continue working on the topic branch you just
pushed up; start over from the main repository's master branch.
$ git checkout -b featureB origin/master
$ (work)
$ git commit
$ git push myfork featureB
$ (email maintainer)
$ git fetch origin
Now, each of your topics is contained within a silo (similar to a
patch queue) that you can rewrite, rebase, and modify without the
topics interfering or interdepending on each other, as in Figure 5.16.
Figure 5.16: Initial commit history with featureB work
Let's say the project maintainer has pulled in a bunch of other
patches and tried your first branch, but it no longer cleanly merges.
In this case, you can try to rebase that branch on top of
origin/master, resolve the conflicts for the maintainer, and then resubmit
your changes.
$ git checkout featureA
$ git rebase origin/master
$ git push -f myfork featureA
This rewrites your history to now look like Figure 5.17.
Figure 5.17: Commit history after featureA work
Because you rebased the branch, you have to specify the -f to your
push command in order to be able to replace the featureA branch on
the server with a commit that isn't a descendant of it. An alternative
would be to push this new work to a different branch on the server
(perhaps called featureAv2).
Let's look at one more possible scenario: the maintainer has looked
at work in your second branch and likes the concept but would like
you to change an implementation detail. You'll also take this
opportunity to move the work to be based off the project's current
master branch. You start a new branch based off the current
origin/master branch, squash the featureB changes there, resolve any
conflicts, make the implementation change, and then push that up as a
new branch.
$ git checkout -b featureBv2 origin/master
$ git merge --no-commit --squash featureB
$ (change implementation)
$ git commit
$ git push myfork featureBv2
The --squash option takes all the work on the merged branch and
squashes it into one non-merge commit on top of the branch you're
on. The --no-commit option tells Git not to automatically record a
commit. This allows you to introduce all the changes from another
branch and then make more changes before recording the new commit.
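As a rough illustration of what --squash and --no-commit do together, the following sketch replays the scenario in a throwaway repository. The branch names come from the text; the file name file.txt, the identity, and the commit messages are made up for the example, and a local master branch stands in for origin/master.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is called master
git config user.name "Example Dev"           # hypothetical identity
git config user.email "dev@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "base"

# two commits on the original topic branch
git checkout -q -b featureB
echo one >> file.txt && git commit -q -am "first featureB commit"
echo two >> file.txt && git commit -q -am "second featureB commit"

# new branch from master: stage all of featureB's work without committing
git checkout -q -b featureBv2 master
git merge --squash --no-commit featureB >/dev/null
echo three >> file.txt                       # the requested implementation change
git commit -q -am "featureB, reworked"

commits_on_top=$(git rev-list --count master..featureBv2)
parent_count=$(git rev-list --parents -n 1 HEAD | awk '{print NF-1}')
echo "commits on top of master: $commits_on_top, parents: $parent_count"
```

The single commit on featureBv2 carries both featureB commits plus the extra change, and it has one parent, so it is an ordinary commit rather than a merge.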
Now you can send the maintainer a message that you've made the
requested changes and they can find those changes in your featureBv2
branch (see Figure 5.18).
Figure 5.18: Commit history after featureBv2 work
5.2.5 Public Large Project
Many larger projects have established procedures for accepting
patches. You'll need to check the specific rules for each project,
because they will differ. However, many larger public projects accept
patches via a developer mailing list, so I'll go over an example of
that now.
The workflow is similar to the previous use case: you create
topic branches for each patch series you work on. The difference is
how you submit them to the project. Instead of forking the project
and pushing to your own writable version, you generate e-mail
versions of each commit series and e-mail them to the developer
mailing list.
$ git checkout -b topicA
$ (work)
$ git commit
$ (work)
$ git commit
Now you have two commits that you want to send to the mailing
list. You use git format-patch to generate the mbox-formatted files
that you can e-mail to the list. It turns each commit into an e-mail
message with the first line of the commit message as the subject and
the rest of the message plus the patch that the commit introduces
as the body. The nice thing about this is that applying a patch from
an e-mail generated with format-patch preserves all the commit
information properly, as you'll see more of in the next section when
you apply these commits.
$ git format-patch -M origin/master
0001-add-limit-to-log-function.patch
0002-changed-log-output-to-30-from-25.patch
The format-patch command prints out the names of the patch files
it creates. The -M switch tells Git to look for renames. The files end
up looking like this:
$ cat 0001-add-limit-to-log-function.patch
From 330090432754092d704da8e76ca5c05c198e71a8 Mon Sep 17 00:00:00 2001
From: Jessica Smith <jessica@example.com>
Date: Sun, 6 Apr 2008 10:17:23 -0700
Subject: [PATCH 1/2] add limit to log function
Limit log functionality to the first 20
---
lib/simplegit.rb | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/lib/simplegit.rb b/lib/simplegit.rb
index 76f47bc..f9815f1 100644
--- a/lib/simplegit.rb
+++ b/lib/simplegit.rb
@@ -14,7 +14,7 @@ class SimpleGit
end
def log(treeish = 'master')
- command("git log #{treeish}")
+ command("git log -n 20 #{treeish}")
end
def ls_tree(treeish = 'master')
--
1.6.2.rc1.20.g8c5b.dirty
You can also edit these patch files to add more information for the
e-mail list that you don't want to show up in the commit message. If
you add text between the --- line and the beginning of the patch (the
lib/simplegit.rb line), then developers can read it, but applying the
patch excludes it.
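To make the point about text after the --- line concrete, here is a small sketch in a scratch repository. The file lib.txt, the branch names upstream and apply-test, and the reviewer note are all hypothetical; upstream simply plays the role of origin/master. The note is inserted with awk, then the edited patch is applied with git am to check that the note does not reach the commit message.

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"           # hypothetical identity
git config user.email "dev@example.com"

echo base > lib.txt && git add lib.txt && git commit -q -m "base"
git branch upstream                          # stand-in for origin/master
echo change >> lib.txt && git commit -q -am "add limit to log function"

# one patch file per commit; format-patch prints the file name it wrote
patch=$(git format-patch -M upstream)

# add a note for the list right after the first line that is exactly "---"
awk '{print} /^---$/ && !done {print "Note for reviewers: not part of the commit."; done=1}' \
    "$patch" > noted.patch

# apply the edited patch on a fresh branch and read back the commit message
git checkout -q -b apply-test upstream
git am -q noted.patch
applied_message=$(git log -1 --format=%B)
echo "$applied_message"
```

The note is still present in noted.patch for anyone reading the e-mail, but the commit created by git am carries only the original message.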
To e-mail this to a mailing list, you can either paste the file into
your e-mail program or send it via a command-line program. Pasting
the text often causes formatting issues, especially with “smarter”
clients that don't preserve newlines and other whitespace
appropriately. Luckily, Git provides a tool to help you send properly
formatted patches via IMAP, which may be easier for you. I'll
demonstrate how to send a patch via Gmail, which happens to be the
e-mail agent I use; you can read detailed instructions for a number of
mail programs at the end of the aforementioned
Documentation/SubmittingPatches file in the Git source code.
First, you need to set up the imap section in your ~/.gitconfig file.
You can set each value separately with a series of git config
commands, or you can add them manually, but in the end, your config
file should look something like this:
[imap]
folder = "[Gmail]/Drafts"
host = imaps://imap.gmail.com
user = user@gmail.com
pass = p4ssw0rd
port = 993
sslverify = false
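If your IMAP server doesn't use SSL, the last two lines probably
aren't necessary, and the host value will be imap:// instead of imaps://.
When that is set up, you can use git send-email to place the patch
series in the Drafts folder of the specified IMAP server. (Strictly
speaking, the imap settings are read by git imap-send, which is the
command that uploads messages to an IMAP folder; git send-email hands
the messages to sendmail or an SMTP server, as the log output that
follows shows.)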
$ git send-email *.patch
0001-added-limit-to-log-function.patch
0002-changed-log-output-to-30-from-25.patch
Who should the emails appear to be from? [Jessica Smith <jessica@example.com>]
Emails will be sent from: Jessica Smith <jessica@example.com>
Who should the emails be sent to? jessica@example.com
Message-ID to be used as In-Reply-To for the first email? y
Then, Git spits out a bunch of log information looking something
like this for each patch you're sending:
(mbox) Adding cc: Jessica Smith <jessica@example.com> from line 'From: Jessica Smith <jessica@example.com>'
OK. Log says:
Sendmail: /usr/sbin/sendmail -i jessica@example.com
From: Jessica Smith <jessica@example.com>
To: jessica@example.com
Subject: [PATCH 1/2] added limit to log function
Date: Sat, 30 May 2009 13:29:15 -0700
Message-Id: <1243715356-61726-1-git-send-email-jessica@example.com>
X-Mailer: git-send-email 1.6.2.rc1.20.g8c5b.dirty
In-Reply-To: <y>
References: <y>
Result: OK
At this point, you should be able to go to your Drafts folder, change
the To field to the mailing list you're sending the patch to, possibly
CC the maintainer or person responsible for that section, and send
it off.
5.2.6 Summary
This section has covered a number of common workflows for dealing
with several very different types of Git projects you're likely to
encounter and introduced a couple of new tools to help you manage
this process. Next, you'll see how to work the other side of the
coin: maintaining a Git project. You'll learn how to be a benevolent
dictator or integration manager.
5.3 Maintaining a Project
In addition to knowing how to effectively contribute to a project,
you'll likely need to know how to maintain one. This can consist
of accepting and applying patches generated via format-patch and
e-mailed to you, or integrating changes in remote branches for
repositories you've added as remotes to your project. Whether you
maintain a canonical repository or want to help by verifying or
approving patches, you need to know how to accept work in a way that
is clearest for other contributors and sustainable by you over the
long run.
5.3.1 Working in Topic Branches
When you're thinking of integrating new work, it's generally a good
idea to try it out in a topic branch, a temporary branch specifically
made to try out that new work. This way, it's easy to tweak a patch
individually and leave it if it's not working until you have time to
come back to it. If you create a simple branch name based on the theme
of the work you're going to try, such as ruby_client or something
similarly descriptive, you can easily remember it if you have to
abandon it for a while and come back later. The maintainer of the Git
project tends to namespace these branches as well, such as
sc/ruby_client, where sc is short for the person who contributed the
work. As you'll remember, you can create the branch based off your
master branch like this:
$ git branch sc/ruby_client master
Or, if you want to also switch to it immediately, you can use the
checkout -b option:
$ git checkout -b sc/ruby_client master
Now you're ready to add your contributed work into this topic
branch and determine if you want to merge it into your longer-term
branches.
5.3.2 Applying Patches from E-mail
If you receive a patch over e-mail that you need to integrate into your
project, you need to apply the patch in your topic branch to evaluate
it. There are two ways to apply an e-mailed patch: with git apply or
with git am.
Applying a Patch with apply
If you received the patch from someone who generated it with the git
diff or a Unix diff command, you can apply it with the git apply
command. Assuming you saved the patch at /tmp/patch-ruby-client.patch,
you can apply the patch like this:
$ git apply /tmp/patch-ruby-client.patch
This modifies the files in your working directory. It's almost
identical to running a patch -p1 command to apply the patch, although
it's more paranoid and accepts fewer fuzzy matches than patch. It also
handles file adds, deletes, and renames if they're described in the
git diff format, which patch won't do. Finally, git apply is an “apply
all or abort all” model where either everything is applied or nothing
is, whereas patch can partially apply patchfiles, leaving your working
directory in a weird state. git apply is overall much more paranoid
than patch. It won't create a commit for you; after running it, you
must stage and commit the changes introduced manually.
You can also use git apply to see if a patch applies cleanly before
you try actually applying it: you can run git apply --check with the
patch.
$ git apply --check 0001-seeing-if-this-helps-the-gem.patch
error: patch failed: ticgit.gemspec:1
error: ticgit.gemspec: patch does not apply
If there is no output, then the patch should apply cleanly. This
command also exits with a non-zero status if the check fails, so you
can use it in scripts if you want.
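One way the exit status might be used in a script is sketched below. The repository, the file notes.txt, and the patch name good.patch are invented for the example; the only behaviour relied on is that git apply --check exits with zero when the patch would apply and non-zero when it would not.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Maintainer"    # hypothetical identity
git config user.email "maintainer@example.com"

printf 'line one\n' > notes.txt
git add notes.txt && git commit -q -m "add notes"

# build a patch from an uncommitted edit, then undo the edit
printf 'line one\nline two\n' > notes.txt
git diff > good.patch
git checkout -q -- notes.txt

# the working tree matches the patch context, so the check passes
if git apply --check good.patch; then fresh=clean; else fresh=conflict; fi

# change the file so the context no longer matches; the check now fails
printf 'completely different\n' > notes.txt
if git apply --check good.patch 2>/dev/null; then stale=clean; else stale=conflict; fi

echo "fresh tree: $fresh, changed tree: $stale"
```

Because --check never touches the working directory, the same test can be run repeatedly before deciding whether to call git apply for real.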
Applying a Patch with am
If the contributor is a Git user and was good enough to use the
format-patch command to generate their patch, then your job is easier
because the patch contains author information and a commit message
for you. If you can, encourage your contributors to use format-patch
instead of diff to generate patches for you. You should only have to
use git apply for legacy patches and things like that.
To apply a patch generated by format-patch, you use git am.
Technically, git am is built to read an mbox file, which is a simple,
plain-text format for storing one or more e-mail messages in one text
file. It looks something like this:
From 330090432754092d704da8e76ca5c05c198e71a8 Mon Sep 17 00:00:00 2001
From: Jessica Smith <jessica@example.com>
Date: Sun, 6 Apr 2008 10:17:23 -0700
Subject: [PATCH 1/2] add limit to log function
Limit log functionality to the first 20
This is the beginning of the output of the format-patch command
that you saw in the previous section. This is also a valid mbox e-mail
format. If someone has e-mailed you the patch properly using git
send-email, and you download that into an mbox format, then you
can point git am to that mbox file, and it will start applying all the
patches it sees. If you run a mail client that can save several e-mails
out in mbox format, you can save entire patch series into a file and
then use git am to apply them one at a time.
However, if someone uploaded a patch file generated via format-patch
to a ticketing system or something similar, you can save the file
locally and then pass that file saved on your disk to git am to apply
it:
$ git am 0001-limit-log-function.patch
Applying: add limit to log function
You can see that it applied cleanly and automatically created the
new commit for you. The author information is taken from the e-mail's
From and Date headers, and the message of the commit is taken
from the Subject and body (before the patch) of the e-mail. For
example, if this patch was applied from the mbox example I just showed,
the commit generated would look something like this:
$ git log --pretty=fuller -1
commit 6c5e70b984a60b3cecd395edd5b48a7575bf58e0
Author: Jessica Smith <jessica@example.com>
AuthorDate: Sun Apr 6 10:17:23 2008 -0700
Commit: Scott Chacon <schacon@gmail.com>
CommitDate: Thu Apr 9 09:19:06 2009 -0700
add limit to log function
Limit log functionality to the first 20
The Commit information indicates the person who applied the patch
and the time it was applied. The Author information is the individual
who originally created the patch and when it was originally created.
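The author/committer split can be reproduced in a scratch repository. In this sketch one repository plays both roles by switching its configured identity between steps; the e-mail addresses, the file f.txt, and the branch names base and sc/topic are placeholders chosen for the example.

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master

# contributor writes a commit and exports it as a patch
git config user.name "Jessica Smith"
git config user.email "jessica@example.com"
echo base > f.txt && git add f.txt && git commit -q -m "base"
git branch base
echo more >> f.txt && git commit -q -am "add limit to log function"
patch=$(git format-patch base)

# maintainer (a different identity) applies it on a topic branch
git config user.name "Scott Chacon"
git config user.email "maintainer@example.com"
git checkout -q -b sc/topic base
git am -q "$patch"

author=$(git log -1 --format=%an)
committer=$(git log -1 --format=%cn)
echo "Author: $author / Commit: $committer"
```

The %an and %cn placeholders print the same two names that --pretty=fuller labels Author and Commit.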
But it's possible that the patch won't apply cleanly. Perhaps your
main branch has diverged too far from the branch the patch was built
from, or the patch depends on another patch you haven't applied yet.
In that case, the git am process will fail and ask you what you want
to do:
$ git am 0001-seeing-if-this-helps-the-gem.patch
Applying: seeing if this helps the gem
error: patch failed: ticgit.gemspec:1
error: ticgit.gemspec: patch does not apply
Patch failed at 0001.
When you have resolved this problem run "git am --resolved".
If you would prefer to skip this patch, instead run "git am --skip".
To restore the original branch and stop patching run "git am --abort".
This command puts conflict markers in any files it has issues with,
much like a conflicted merge or rebase operation. You solve this
issue much the same way: edit the file to resolve the conflict, stage
the new file, and then run git am --resolved to continue to the next
patch.
$ (fix the file)
$ git add ticgit.gemspec
$ git am --resolved
Applying: seeing if this helps the gem
If you want Git to try a bit more intelligently to resolve the conflict,
you can pass a -3 option to it, which makes Git attempt a three-way
merge. This option isn't on by default because it doesn't work if the
commit the patch says it was based on isn't in your repository. If you
do have that commit (if the patch was based on a public commit), then
the -3 option is generally much smarter about applying a conflicting
patch.
$ git am -3 0001-seeing-if-this-helps-the-gem.patch
Applying: seeing if this helps the gem
error: patch failed: ticgit.gemspec:1
error: ticgit.gemspec: patch does not apply
Using index info to reconstruct a base tree...
Falling back to patching base and 3-way merge...
No changes -- Patch already applied.
In this case, I was trying to apply a patch I had already applied.
Without the -3 option, it looks like a conflict.
If you're applying a number of patches from an mbox, you can also
run the am command in interactive mode, which stops at each patch
it finds and asks if you want to apply it:
$ git am -3 -i mbox
Commit Body is:
--------------------------
seeing if this helps the gem
--------------------------
Apply? [y]es/[n]o/[e]dit/[v]iew patch/[a]ccept all
This is nice if you have a number of patches saved, because you
can view the patch first if you don't remember what it is, or not apply
the patch if you've already done so.
When all the patches for your topic are applied and committed into
your branch, you can choose whether and how to integrate them into
a longer-running branch.
5.3.3 Checking Out Remote Branches
If your contribution came from a Git user who set up their own
repository, pushed a number of changes into it, and then sent you the
URL to the repository and the name of the remote branch the changes
are in, you can add them as a remote and do merges locally.
For instance, if Jessica sends you an e-mail saying that she has a
great new feature in the ruby-client branch of her repository, you can
test it by adding the remote and checking out that branch locally:
$ git remote add jessica git://github.com/jessica/myproject.git
$ git fetch jessica
$ git checkout -b rubyclient jessica/ruby-client
If she e-mails you again later with another branch containing
another great feature, you can fetch and check out because you already
have the remote set up.
This is most useful if you're working with a person consistently. If
someone only has a single patch to contribute once in a while, then
accepting it over e-mail may be less time consuming than requiring
everyone to run their own server and having to continually add and
remove remotes to get a few patches. You're also unlikely to want to
have hundreds of remotes, each for someone who contributes only a
patch or two. However, scripts and hosted services may make this
easier; it depends largely on how you develop and how your
contributors develop.
The other advantage of this approach is that you get the history
of the commits as well. Although you may have legitimate merge
issues, you know where in your history their work is based; a proper
three-way merge is the default rather than having to supply a -3 and
hope the patch was generated off a public commit to which you have
access.
If you aren't working with a person consistently but still want to
pull from them in this way, you can provide the URL of the remote
repository to the git pull command. This does a one-time pull and
doesn't save the URL as a remote reference:
$ git pull git://github.com/onetimeguy/project.git
From git://github.com/onetimeguy/project
* branch HEAD -> FETCH_HEAD
Merge made by recursive.
5.3.4 Determining What Is Introduced
Now you have a topic branch that contains contributed work. At this
point, you can determine what you'd like to do with it. This section
revisits a couple of commands so you can see how you can use them
to review exactly what you'll be introducing if you merge this into
your main branch.
It's often helpful to get a review of all the commits that are in
this branch but that aren't in your master branch. You can exclude
commits in the master branch by adding the --not option before the
branch name. For example, if your contributor sends you two patches
and you create a branch called contrib and apply those patches
there, you can run this:
$ git log contrib --not master
commit 5b6235bd297351589efc4d73316f0a68d484f118
Author: Scott Chacon <schacon@gmail.com>
Date: Fri Oct 24 09:53:59 2008 -0700
seeing if this helps the gem
commit 7482e0d16d04bea79d0dba8988cc78df655f16a0
Author: Scott Chacon <schacon@gmail.com>
Date: Mon Oct 22 19:38:36 2008 -0700
updated the gemspec to hopefully work better
To see what changes each commit introduces, remember that you
can pass the -p option to git log and it will append the diff
introduced to each commit.
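The --not form selects the same commits as the two-dot range master..contrib, which can be checked in a scratch repository. In this sketch the branch names and the two contrib commit messages follow the example above, while the files f.txt and g.txt and the identity are placeholders.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Maintainer"    # hypothetical identity
git config user.email "maintainer@example.com"

echo base > f.txt && git add f.txt && git commit -q -m "base"

# two contributed commits on the topic branch
git checkout -q -b contrib
echo one >> f.txt && git commit -q -am "updated the gemspec to hopefully work better"
echo two >> f.txt && git commit -q -am "seeing if this helps the gem"

# master moves on independently
git checkout -q master
echo other > g.txt && git add g.txt && git commit -q -m "unrelated master work"

with_not=$(git log --format=%s contrib --not master)
with_dots=$(git log --format=%s master..contrib)
echo "$with_not"
```

Both commands list only the two contrib commits; the base commit and the later master commit are excluded.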
To see a full diff of what would happen if you were to merge this
topic branch with another branch, you may have to use a weird trick
to get the correct results. You may think to run this:
$ git diff master
This command gives you a diff, but it may be misleading. If your
master branch has moved forward since you created the topic branch
from it, then you'll get seemingly strange results. This happens
because Git directly compares the snapshots of the last commit of the
topic branch you're on and the snapshot of the last commit on the
master branch. For example, if you've added a line in a file on the
master branch, a direct comparison of the snapshots will look like the
topic branch is going to remove that line.
If master is a direct ancestor of your topic branch, this isn't a
problem, but if the two histories have diverged, the diff will look
like you're adding all the new stuff in your topic branch and removing
everything unique to the master branch.
What you really want to see are the changes added to the topic
branch, the work you'll introduce if you merge this branch with
master. You do that by having Git compare the last commit on your
topic branch with the first common ancestor it has with the master
branch.
Technically, you can do that by explicitly figuring out the common
ancestor and then running your diff on it:
$ git merge-base contrib master
36c7dba2c95e6bbb78dfa822519ecfec6e1ca649
$ git diff 36c7db
However, that isn't convenient, so Git provides another shorthand
for doing the same thing: the triple-dot syntax. In the context of the
diff command, you can put three periods after another branch to
do a diff between the last commit of the branch you're on and its
common ancestor with another branch:
$ git diff master...contrib
This command shows you only the work your current topic branch
has introduced since its common ancestor with master. That is a very
useful syntax to remember.
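The difference between the plain diff and the triple-dot diff is easiest to see once the two histories have diverged. The sketch below builds such a history in a scratch repository; the file names shared.txt, topic.txt, and main.txt and the identity are placeholders.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Maintainer"    # hypothetical identity
git config user.email "maintainer@example.com"

echo a > shared.txt && git add shared.txt && git commit -q -m "base"

# work on the topic branch
git checkout -q -b contrib
echo topic > topic.txt && git add topic.txt && git commit -q -m "topic work"

# master moves forward after the branch point
git checkout -q master
echo main > main.txt && git add main.txt && git commit -q -m "master moves on"
git checkout -q contrib

# snapshot comparison: also reports main.txt, as if contrib removed it
plain=$(git diff --name-only master | sort | xargs)
# comparison against the common ancestor: only the topic's own work
triple=$(git diff --name-only master...contrib | xargs)

echo "git diff master:            $plain"
echo "git diff master...contrib:  $triple"
```

The plain diff lists main.txt even though the topic branch never touched it, which is exactly the misleading result described above.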
5.3.5 Integrating Contributed Work
When all the work in your topic branch is ready to be integrated into
a more mainline branch, the question is how to do it. Furthermore,
what overall workflow do you want to use to maintain your project?
You have a number of choices, so I'll cover a few of them.
Merging Workflows
One simple workflow merges your work into your master branch. In
this scenario, you have a master branch that contains basically stable
code. When you have work in a topic branch that you've done or
that someone has contributed and you've verified, you merge it into
your master branch, delete the topic branch, and then continue the
process. If you have a repository with work in two branches named
ruby_client and php_client that looks like Figure 5.19 and merge
ruby_client first and then php_client next, then your history will end
up looking like Figure 5.20.
That is probably the simplest workflow, but it's problematic if
you're dealing with larger repositories or projects.
If you have more developers or a larger project, you'll probably
want to use at least a two-phase merge cycle. In this scenario, you
have two long-running branches, master and develop, in which you
determine that master is updated only when a very stable release is cut
and all new code is integrated into the develop branch. You regularly
push both of these branches to the public repository. Each time you
have a new topic branch to merge in (Figure 5.21), you merge it into
develop (Figure 5.22); then, when you tag a release, you fast-forward
master to wherever the now-stable develop branch is (Figure 5.23).
This way, when people clone your project's repository, they can
either check out master to build the latest stable version and keep
up to date on that easily, or they can check out develop, which is the
more cutting-edge stuff. You can also continue this concept, having
an integrate branch where all the work is merged together. Then,
when the codebase on that branch is stable and passes tests, you
merge it into a develop branch, and when that has proven itself stable
for a while, you fast-forward your master branch.
Figure 5.19: History with several topic branches
Figure 5.20: After a topic branch merge
Figure 5.21: Before a topic branch merge
Figure 5.22: After a topic branch merge
Figure 5.23: After a topic branch release
Large-Merging Workflows
The Git project has four long-running branches: master, next, and pu
(proposed updates) for new work, and maint for maintenance backports.
When new work is introduced by contributors, it's collected
into topic branches in the maintainer's repository in a manner similar
to what I've described (see Figure 5.24). At this point, the topics
are evaluated to determine whether they're safe and ready for
consumption or whether they need more work. If they're safe, they're
merged into next, and that branch is pushed up so everyone can try
the topics integrated together.
Figure 5.24: Managing a complex series of parallel contributed topic branches
If the topics still need work, they're merged into pu instead. When
it's determined that they're totally stable, the topics are re-merged
into master, and the next and pu branches are then rebuilt from the
topics that were in next but didn't yet graduate to master. This means
master almost always moves forward, next is rebased occasionally, and
pu is rebased even more often (see Figure 5.25).
Figure 5.25: Merging contributed topic branches into long-term integration branches
When a topic branch has finally been merged into master, it's
removed from the repository. The Git project also has a maint branch
that is forked off from the last release to provide backported patches
in case a maintenance release is required. Thus, when you clone
the Git repository, you have four branches that you can check out to
evaluate the project in different stages of development, depending
on how cutting edge you want to be or how you want to contribute,
and the maintainer has a structured workflow to help them vet new
contributions.
Rebasing and Cherry Picking Workflows
Other maintainers prefer to rebase or cherry-pick contributed work
on top of their master branch, rather than merging it in, to keep a
mostly linear history. When you have work in a topic branch and have
determined that you want to integrate it, you move to that branch
and run the rebase command to rebuild the changes on top of your
current master (or develop, and so on) branch. If that works well, you
can fast-forward your master branch, and you'll end up with a linear
project history.
The other way to move introduced work from one branch to another
is to cherry-pick it. A cherry-pick in Git is like a rebase for a
single commit. It takes the patch that was introduced in a commit
and tries to reapply it on the branch you're currently on. This is
useful if you have a number of commits on a topic branch and you want
to integrate only one of them, or if you only have one commit on a
topic branch and you'd prefer to cherry-pick it rather than run
rebase. For example, suppose you have a project that looks like Figure
5.26.
Figure 5.26: Example history before a cherry pick
If you want to pull commit e43a6 into your master branch, you can
run
$ git cherry-pick e43a6fd3e94888d76779ad79fb568ed180e5fcdf
Finished one cherry-pick.
[master]: created a0a41a9: "More friendly message when locking the index fails."
3 files changed, 17 insertions(+), 3 deletions(-)
This pulls the same change introduced in e43a6, but you get a new
commit SHA-1 value, because the date applied is different. Now your
history looks like Figure 5.27.
Figure 5.27: History after cherry-picking a commit on a topic branch
Now you can remove your topic branch and drop the commits you
didn't want to pull in.
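A small sketch of the same idea in a scratch repository: one commit is picked from a two-commit topic branch, and the result is compared with the original. The branch name topic, the file names, and the commit messages are placeholders. In this sketch the picked commit also sits on a different parent, which on its own is enough to give it a new SHA-1.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Maintainer"    # hypothetical identity
git config user.email "maintainer@example.com"

echo base > f.txt && git add f.txt && git commit -q -m "base"

# topic branch with one commit to keep and one to leave behind
git checkout -q -b topic
echo wanted > wanted.txt && git add wanted.txt && git commit -q -m "wanted change"
wanted=$(git rev-parse HEAD)
echo other > other.txt && git add other.txt && git commit -q -m "unwanted change"

# master has moved on, so the picked commit gets a different parent
git checkout -q master
echo main > main.txt && git add main.txt && git commit -q -m "master work"

git cherry-pick "$wanted" >/dev/null
picked=$(git rev-parse HEAD)

# same patch content, different commit identity
if [ "$(git diff "$wanted^" "$wanted")" = "$(git diff "$picked^" "$picked")" ]; then
    same_patch=yes
else
    same_patch=no
fi
echo "original $wanted"
echo "picked   $picked (same patch: $same_patch)"
```

Only wanted.txt arrives on master; other.txt stays on the topic branch and disappears with it when the branch is deleted.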
5.3.6 Tagging Your Releases
When you've decided to cut a release, you'll probably want to drop a
tag so you can re-create that release at any point going forward. You
can create a new tag as I discussed in Chapter 2. If you decide to
sign the tag as the maintainer, the tagging may look something like
this:
$ git tag -s v1.5 -m 'my signed 1.5 tag'
You need a passphrase to unlock the secret key for
user: "Scott Chacon <schacon@gmail.com>"
1024-bit DSA key, ID F721C45A, created 2009-02-09
If you do sign your tags, you may have the problem of distributing
the public PGP key used to sign your tags. The maintainer of the
Git project has solved this issue by including their public key as a
blob in the repository and then adding a tag that points directly to
that content. To do this, you can figure out which key you want by
running gpg --list-keys:
$ gpg --list-keys
/Users/schacon/.gnupg/pubring.gpg
---------------------------------
pub 1024D/F721C45A 2009-02-09 [expires: 2010-02-09]
uid Scott Chacon <schacon@gmail.com>
sub 2048g/45D02282 2009-02-09 [expires: 2010-02-09]
Then, you can directly import the key into the Git database by
exporting it and piping that through git hash-object, which writes a
new blob with those contents into Git and gives you back the SHA-1
of the blob:
$ gpg -a --export F721C45A | git hash-object -w --stdin
659ef797d181633c87ec71ac3f9ba29fe5775b92
Now that you have the contents of your key in Git, you can create
a tag that points directly to it by specifying the new SHA-1 value that
the hash-object command gave you:
$ git tag -a maintainer-pgp-pub 659ef797d181633c87ec71ac3f9ba29fe5775b92
If you run git push --tags, the maintainer-pgp-pub tag will be shared
with everyone. If anyone wants to verify a tag, they can directly
import your PGP key by pulling the blob directly out of the database
and importing it into GPG:
$ git show maintainer-pgp-pub | gpg --import
They can use that key to verify all your signed tags. Also, if you
include instructions in the tag message, running git show <tag> will
let you give the end user more specific instructions about tag
verification.
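The blob-plus-tag mechanism can be tried without GPG at all. In the sketch below a one-line placeholder string stands in for the exported key, so nothing here is a real key; the tag name is the one used above, and the tag message is only an example of the kind of instructions mentioned.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Maintainer"    # hypothetical identity
git config user.email "maintainer@example.com"

# write arbitrary content into the object database as a blob
blob=$(printf 'PRETEND-PUBLIC-KEY\n' | git hash-object -w --stdin)
object_type=$(git cat-file -t "$blob")

# an annotated tag can point at any object, not only at commits
git tag -a maintainer-pgp-pub \
    -m "Import with: git show maintainer-pgp-pub | gpg --import" "$blob"

# peel the tag back to the blob to recover the original bytes
recovered=$(git show "maintainer-pgp-pub^{blob}")
echo "$object_type: $recovered"
```

Running git show on the tag name itself prints the tag message first and then the blob, which is how the verification instructions reach the user.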
5.3.7 Generating a Build Number
Because Git doesn't have monotonically increasing numbers like 'v123'
or the equivalent to go with each commit, if you want to have a
human-readable name to go with a commit, you can run git describe
on that commit. Git gives you the name of the nearest tag with the
number of commits on top of that tag and a partial SHA-1 value of
the commit you're describing:
$ git describe master
v1.6.2-rc1-20-g8c5b85c
This way, you can export a snapshot or build and name it something
understandable to people. In fact, if you build Git from source
code cloned from the Git repository, git --version gives you something
that looks like this. If you're describing a commit that you
have directly tagged, it gives you the tag name.
The git describe command favors annotated tags (tags created
with the -a or -s flag), so release tags should be created this way
if you're using git describe, to ensure the commit is named properly
when described. You can also use this string as the target of
a checkout or show command, although it relies on the abbreviated
SHA-1 value at the end, so it may not be valid forever. For instance,
the Linux kernel recently jumped from 8 to 10 characters to ensure
SHA-1 object uniqueness, so older git describe output names were
invalidated.
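The shape of the describe string can be seen in a scratch repository with one annotated tag. The tag name v1.0, the file f.txt, and the identity are placeholders; the abbreviated hash at the end will differ on every run.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Maintainer"    # hypothetical identity
git config user.email "maintainer@example.com"

echo 1 > f.txt && git add f.txt && git commit -q -m "first"
git tag -a v1.0 -m "release 1.0"             # annotated, so describe will use it

# directly on the tagged commit: just the tag name
on_tag=$(git describe master)

echo 2 >> f.txt && git commit -q -am "second"
echo 3 >> f.txt && git commit -q -am "third"

# two commits later: <tag>-<commits since tag>-g<abbreviated SHA-1>
after_tag=$(git describe master)

echo "$on_tag"
echo "$after_tag"
```

The second string can be passed back to commands such as git show or git rev-parse, where it resolves through the abbreviated SHA-1 at the end.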
5.3.8 Preparing a Release
Now you want to release a build. One of the things you'll want to
do is create an archive of the latest snapshot of your code for those
poor souls who don't use Git. The command to do this is git archive:
$ git archive master --prefix='project/' | gzip > `git describe master`.tar.gz
$ ls *.tar.gz
v1.6.2-rc1-20-g8c5b85c.tar.gz
If someone opens that tarball, they get the latest snapshot of your
project under a project directory. You can also create a zip archive
in much the same way, but by passing the --format=zip option to git
archive:
$ git archive master --prefix='project/' --format=zip > `git describe master`.zip
You now have a nice tarball and a zip archive of your project
release that you can upload to your website or e-mail to people.
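To check what --prefix does to the paths inside the archive, the tarball can be listed straight after it is written. This sketch uses a scratch repository with two placeholder files and a placeholder tag v1.0; the archive is named after the git describe output, as in the example above.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Maintainer"    # hypothetical identity
git config user.email "maintainer@example.com"

mkdir lib
echo 'puts "hi"' > lib/simplegit.rb
echo readme > README
git add . && git commit -q -m "initial"
git tag -a v1.0 -m "release 1.0"

# name the tarball after the described commit
name=$(git describe master)
git archive --prefix='project/' master | gzip > "$name.tar.gz"

# every path in the archive starts with the prefix
contents=$(tar -tzf "$name.tar.gz")
echo "$contents"
```

The trailing slash in the prefix matters: without it the prefix is glued onto each file name instead of forming a directory.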
5.3.9 The Shortlog
It's time to e-mail your mailing list of people who want to know what's
happening in your project. A nice way of quickly getting a sort of
changelog of what has been added to your project since your last
release or e-mail is to use the git shortlog command. It summarizes
all the commits in the range you give it; for example, the following
gives you a summary of all the commits since your last release, if
your last release was named v1.0.1:
$ git shortlog --no-merges master --not v1.0.1
Chris Wanstrath (8):
Add support for annotated tags to Grit::Tag
Add packed-refs annotated tag support.
Add Grit::Commit#to_patch
Update version and History.txt
Remove stray `puts`
Make ls_tree ignore nils
Tom Preston-Werner (4):
fix dates in history
dynamic version method
Version bump to 1.0.2
Regenerated gemspec for version 1.0.2
You get a clean summary of all the commits since v1.0.1, grouped
by author, that you can e-mail to your list.
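The same command can be tried on a small scratch repository. In this sketch the author "Demo User", the tag, and the commit messages are invented for the example; the shortlog invocation is the one shown above.

```shell
set -e
cd "$(mktemp -d)"                            # throwaway directory (assumption)
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is called master
git config user.name "Demo User"             # hypothetical identity
git config user.email demo@example.com

git commit -q --allow-empty -m "initial release"
git tag -a v1.0.1 -m "version 1.0.1"
git commit -q --allow-empty -m "fix dates in history"
git commit -q --allow-empty -m "dynamic version method"

# Everything reachable from master but not from the v1.0.1 tag
summary=$(git shortlog --no-merges master --not v1.0.1)
echo "$summary"
```

Only the two commits made after the tag should appear, grouped under the single author.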
5.4 Summary
You should feel fairly comfortable contributing to a project in Git as
well as maintaining your own project or integrating other users'
contributions. Congratulations on being an effective Git developer! In
the next chapter, you'll learn more powerful tools and tips for dealing
with complex situations, which will truly make you a Git master.
Chapter 6
Git Tools
By now, you've learned most of the day-to-day commands and workflows
that you need to manage or maintain a Git repository for your
source code control. You've accomplished the basic tasks of tracking
and committing files, and you've harnessed the power of the staging
area and lightweight topic branching and merging.
Now you'll explore a number of very powerful things that Git can
do that you may not necessarily use on a day-to-day basis but that
you may need at some point.
6.1 Revision Selection
Git allows you to specify specific commits or a range of commits
in several ways. They aren't necessarily obvious but are helpful to
know.
6.1.1 Single Revisions
You can obviously refer to a commit by the SHA-1 hash that it's given,
but there are more human-friendly ways to refer to commits as well.
This section outlines the various ways you can refer to a single
commit.
6.1.2 Short SHA
Git is smart enough to figure out what commit you meant to type if
you provide the first few characters, as long as your partial SHA-1
is at least four characters long and unambiguous; that is, only one
object in the current repository begins with that partial SHA-1.
For example, to see a specific commit, suppose you run a git log
command and identify the commit where you added certain functionality.
$ git log
commit 734713bc047d87bf7eac9674765ae793478c50d3
Author: Scott Chacon <schacon@gmail.com>
Date: Fri Jan 2 18:32:33 2009 -0800
fixed refs handling, added gc auto, updated tests
commit d921970aadf03b3cf0e71becdaab3147ba71cdef
Merge: 1c002dd... 35cfb2b...
Author: Scott Chacon <schacon@gmail.com>
Date: Thu Dec 11 15:08:43 2008 -0800
Merge commit 'phedders/rdocs'
commit 1c002dd4b536e7479fe34593e72e6c6c1819e53b
Author: Scott Chacon <schacon@gmail.com>
Date: Thu Dec 11 14:58:32 2008 -0800
added some blame and merge stuff
In this case, choose 1c002dd.... If you git show that commit, the
following commands are equivalent (assuming the shorter versions
are unambiguous).
$ git show 1c002dd4b536e7479fe34593e72e6c6c1819e53b
$ git show 1c002dd4b536e7479f
$ git show 1c002d
Git can figure out a short, unique abbreviation for your SHA-1
values. If you pass --abbrev-commit to the git log command, the output
will use shorter values but keep them unique. It defaults to using
seven characters but makes them longer if necessary to keep the
SHA-1 unambiguous.
$ git log --abbrev-commit --pretty=oneline
ca82a6d changed the verison number
085bb3b removed unnecessary test code
a11bef0 first commit
Generally, eight to ten characters are more than enough to be
unique within a project. One of the largest Git projects, the Linux
kernel, is beginning to need 12 characters out of the possible 40 to
stay unique.
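You can ask Git directly for an abbreviation it considers unique and confirm that the short form resolves back to the full value. This is a small sketch in a scratch repository; the identity and commit message are placeholders.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory (assumption)
git init -q
git config user.name "Demo User"           # hypothetical identity
git config user.email demo@example.com
git commit -q --allow-empty -m "first commit"

full=$(git rev-parse HEAD)                 # full SHA-1 of the commit
short=$(git rev-parse --short HEAD)        # shortest abbreviation Git considers unique
resolved=$(git rev-parse "$short")         # expand the abbreviation again
echo "$short -> $resolved"
```

The rev-parse command used here is a plumbing tool; it is handy for this kind of check even if you rarely need it day to day.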
6.1.3 A Short Note About SHA-1
A lot of people become concerned at some point that they will, by
random happenstance, have two objects in their repository that hash
to the same SHA-1 value. What then?
If you do happen to commit an object that hashes to the same
SHA-1 value as a previous object in your repository, Git will see the
previous object already in your Git database and assume it was already
written. If you try to check out that object again at some point,
you'll always get the data of the first object.
However, you should be aware of how ridiculously unlikely this
scenario is. The SHA-1 digest is 20 bytes or 160 bits. The number
of randomly hashed objects needed to ensure a 50% probability of
a single collision is about $2^{80}$ (the formula for determining
collision probability is $p = \frac{n(n-1)}{2} \times \frac{1}{2^{160}}$).
$2^{80}$ is $1.2 \times 10^{24}$, or 1 million billion billion.
That's 1,200 times the number of grains of sand on the earth.
Here's an example to give you an idea of what it would take to get
a SHA-1 collision. If all 6.5 billion humans on Earth were programming,
and every second, each one was producing code that was the
equivalent of the entire Linux kernel history (1 million Git objects)
and pushing it into one enormous Git repository, it would take 5 years
until that repository contained enough objects to have a 50% probability
of a single SHA-1 object collision. A higher probability exists
that every member of your programming team will be attacked and
killed by wolves in unrelated incidents on the same night.
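The two figures above can be checked by hand. The following is only a back-of-the-envelope sketch: it uses the approximate formula quoted in the text, and it assumes a year of roughly $3.16 \times 10^{7}$ seconds, a value that does not appear in the original.

```latex
% Collision probability with n = 2^{80} objects, using the formula from the text
p = \frac{n(n-1)}{2} \times \frac{1}{2^{160}}
  \approx \frac{n^{2}}{2^{161}}
  = \frac{2^{160}}{2^{161}}
  = \frac{1}{2}

% Time to produce 2^{80} objects at 6.5 billion people times 1 million objects per second
t = \frac{2^{80}}{6.5 \times 10^{9} \times 10^{6}\ \mathrm{s^{-1}}}
  \approx \frac{1.2 \times 10^{24}}{6.5 \times 10^{15}}\ \mathrm{s}
  \approx 1.9 \times 10^{8}\ \mathrm{s}
  \approx 5.9\ \text{years}
```

The result lands between five and six years, which agrees with the rough figure given above.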
6.1.4 Branch References
The most straightforward way to specify a commit requires that it
have a branch reference pointed at it. Then, you can use a branch
name in any Git command that expects a commit object or SHA-1
value. For instance, if you want to show the last commit object on a
branch, the following commands are equivalent, assuming that the
topic1 branch points to ca82a6d.
$ git show ca82a6dff817ec66f44342007202690a93763949
$ git show topic1
If you want to see which specific SHA a branch points to, or if you
want to see what any of these examples boils down to in terms of
SHAs, you can use a Git plumbing tool called rev-parse. You can see
Chapter 9 for more information about plumbing tools. Basically,
rev-parse exists for lower-level operations and isn't designed to be used
in day-to-day operations. However, it can be helpful sometimes when
you need to see what's really going on. Here you can run rev-parse
on your branch.
$ git rev-parse topic1
ca82a6dff817ec66f44342007202690a93763949
6.1.5 RefLog Shortnames
One of the things Git does in the background while you're working
away is keep a reflog: a log of where your HEAD and branch references
have been for the last few months.
You can see your reflog by using git reflog.
$ git reflog
734713b... HEAD@{0}: commit: fixed refs handling, added gc auto, updated
d921970... HEAD@{1}: merge phedders/rdocs: Merge made by recursive.
1c002dd... HEAD@{2}: commit: added some blame and merge stuff
1c36188... HEAD@{3}: rebase -i (squash): updating HEAD
95df984... HEAD@{4}: commit: # This is a combination of two commits.
1c36188... HEAD@{5}: rebase -i (squash): updating HEAD
7e05da5... HEAD@{6}: rebase -i (pick): updating HEAD
Every time your branch tip is updated for any reason, Git stores
that information for you in this temporary history. And you can specify
older commits with this data, as well. If you want to see the fifth
prior value of the HEAD of your repository, you can use the @{n}
reference that you see in the reflog output.
$ git show HEAD@{5}
You can also use this syntax to see where a branch was some
specific amount of time ago. For instance, to see where your master
branch was yesterday, you can type
$ git show master@{yesterday}
That shows you where the branch tip was yesterday. This technique
only works for data that's still in your reflog, so you can't use
it to look for commits older than a few months.
To see reflog information formatted like the git log output, you
can run git log -g.
$ git log -g master
commit 734713bc047d87bf7eac9674765ae793478c50d3
Reflog: master@{0} (Scott Chacon <schacon@gmail.com>)
Reflog message: commit: fixed refs handling, added gc auto, updated
Author: Scott Chacon <schacon@gmail.com>
Date: Fri Jan 2 18:32:33 2009 -0800
fixed refs handling, added gc auto, updated tests
commit d921970aadf03b3cf0e71becdaab3147ba71cdef
Reflog: master@{1} (Scott Chacon <schacon@gmail.com>)
Reflog message: merge phedders/rdocs: Merge made by recursive.
Author: Scott Chacon <schacon@gmail.com>
Date: Thu Dec 11 15:08:43 2008 -0800
Merge commit 'phedders/rdocs'
It's important to note that the reflog information is strictly local:
it's a log of what you've done in your repository. The references
won't be the same on someone else's copy of the repository, and right
after you initially clone a repository, you'll have an empty reflog, as
no activity has occurred yet in your repository. Running git show
HEAD@{2.months.ago} will work only if you cloned the project at least two
months ago. If you cloned it five minutes ago, you'll get no results.
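The @{n} syntax is easy to try out on a fresh repository. This sketch assumes a scratch directory and a made-up identity; it records two commits and then asks the reflog where HEAD pointed one step earlier.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory (assumption)
git init -q
git config user.name "Demo User"           # hypothetical identity
git config user.email demo@example.com

git commit -q --allow-empty -m "first"
first=$(git rev-parse HEAD)
git commit -q --allow-empty -m "second"

previous=$(git rev-parse 'HEAD@{1}')       # where HEAD was one reflog entry ago
entries=$(git reflog | wc -l | tr -d ' ')  # one reflog line per recorded HEAD update
echo "HEAD@{1} = $previous ($entries reflog entries)"
```

The braces need quoting in most shells, which is why the reference is written inside single quotes.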
6.1.6 Ancestry References
The other main way to specify a commit is via its ancestry. If you
place a ^ at the end of a reference, Git resolves it to mean the parent
of that commit. Suppose you look at the history of your project.
$ git log --pretty=format:'%h %s' --graph
* 734713b fixed refs handling, added gc auto, updated tests
* d921970 Merge commit 'phedders/rdocs'
|\
| * 35cfb2b Some rdoc changes
* | 1c002dd added some blame and merge stuff
|/
* 1c36188 ignore *.gem
* 9b29157 add open3_detach to gemspec file list
Then, you can see the previous commit by specifying HEAD^, which
means “the parent of HEAD.”
$ git show HEAD^
commit d921970aadf03b3cf0e71becdaab3147ba71cdef
Merge: 1c002dd... 35cfb2b...
Author: Scott Chacon <schacon@gmail.com>
Date: Thu Dec 11 15:08:43 2008 -0800
Merge commit 'phedders/rdocs'
You can also specify a number after the ^. For example, d921970^2
means “the second parent of d921970.” This syntax is only useful for
merge commits, which have more than one parent. The first parent
is the branch you were on when you merged, and the second is the
commit on the branch that you merged in.
$ git show d921970^
commit 1c002dd4b536e7479fe34593e72e6c6c1819e53b
Author: Scott Chacon <schacon@gmail.com>
Date: Thu Dec 11 14:58:32 2008 -0800
added some blame and merge stuff
$ git show d921970^2
commit 35cfb2b795a55793d7cc56a6cc2060b4bb732548
Author: Paul Hedderly <paul+git@mjr.org>
Date: Wed Dec 10 22:22:03 2008 +0000
Some rdoc changes
The other main ancestry specification is the ~. This also refers
to the first parent, so HEAD~ and HEAD^ are equivalent. The difference
becomes apparent when you specify a number. HEAD~2 means “the
first parent of the first parent,” or “the grandparent”; it traverses
the first parents the number of times you specify. For example, in
the history listed earlier, HEAD~3 would be
$ git show HEAD~3
commit 1c3618887afb5fbcbea25b7c013f4e2114448b8d
Author: Tom Preston-Werner <tom@mojombo.com>
Date: Fri Nov 7 13:47:59 2008 -0500
ignore *.gem
This can also be written HEAD^^^, which again is the first parent of
the first parent of the first parent.
$ git show HEAD^^^
commit 1c3618887afb5fbcbea25b7c013f4e2114448b8d
Author: Tom Preston-Werner <tom@mojombo.com>
Date: Fri Nov 7 13:47:59 2008 -0500
ignore *.gem
You can also combine these syntaxes. You can get the second
parent of the previous reference (assuming it was a merge commit)
by using HEAD~3^2, and so on.
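The ancestry operators are easier to trust once you have seen them resolve on a history you built yourself. The sketch below creates a tiny merge history in a scratch repository; the branch name topic, the identity, and the commit messages are invented for illustration.

```shell
set -e
cd "$(mktemp -d)"                            # throwaway directory (assumption)
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is called master
git config user.name "Demo User"             # hypothetical identity
git config user.email demo@example.com

git commit -q --allow-empty -m "base"
git checkout -q -b topic
git commit -q --allow-empty -m "topic work"
git checkout -q master
git commit -q --allow-empty -m "master work"
git merge -q --no-ff -m "merge topic" topic  # first parent: master work, second: topic work
git commit -q --allow-empty -m "after merge"

subject() { git log -1 --format=%s "$1"; }   # helper: print one commit's subject line

parent=$(subject 'HEAD^')                    # merge topic
second_parent=$(subject 'HEAD~1^2')          # topic work (combined syntax)
grandparent=$(subject 'HEAD~2')              # master work (first parents only)
tilde3=$(subject 'HEAD~3')                   # base
carets=$(subject 'HEAD^^^')                  # base, same commit as HEAD~3
echo "$parent / $second_parent / $grandparent / $tilde3 / $carets"
```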
6.1.7 Commit Ranges
Now that you can specify individual commits, let's see how to specify
ranges of commits. This is particularly useful for managing your
branches. If you have a lot of branches, you can use range specifications
to answer questions such as, “What work is on this branch
that I haven't yet merged into my main branch?”
Double Dot
The most common range specification is the double-dot syntax. This
basically asks Git to resolve a range of commits that are reachable
from one commit but aren't reachable from another. For example,
say you have a commit history that looks like Figure 6.1.
Figure 6.1: Example history for range selection
You want to see what is in your experiment branch that hasn't yet
been merged into your master branch. You can ask Git to show you
a log of just those commits with master..experiment, which means “all
commits reachable by experiment that aren't reachable by master.”
For the sake of brevity and clarity in these examples, I'll use the
letters of the commit objects from the diagram in place of the actual
log output in the order that they would display.
$ git log master..experiment
D
C
If, on the other hand, you want to see the opposite (all commits
in master that aren't in experiment), you can reverse the branch names.
experiment..master shows you everything in master not reachable from
experiment.
$ git log experiment..master
F
E
This is useful if you want to keep the experiment branch up to date
and preview what you're about to merge in. Another very frequent
use of this syntax is to see what you're about to push to a remote.
$ git log origin/master..HEAD
This command shows you any commits in your current branch that
aren't in the master branch on your origin remote. If you run a git push
and your current branch is tracking origin/master, the commits listed
by git log origin/master..HEAD are the commits that will be transferred
to the server. You can also leave off one side of the syntax to have Git
assume HEAD. For example, you can get the same results as in the
previous example by typing git log origin/master.. because Git substitutes
HEAD if one side is missing.
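To see the double-dot syntax at work, you can rebuild a history shaped like the one described for Figure 6.1. This is a sketch under assumptions: the exact shape (A and B shared, C and D on experiment, E and F on master) is inferred from the outputs shown above, and the identity is made up.

```shell
set -e
cd "$(mktemp -d)"                            # throwaway directory (assumption)
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is called master
git config user.name "Demo User"             # hypothetical identity
git config user.email demo@example.com

git commit -q --allow-empty -m A
git commit -q --allow-empty -m B
git checkout -q -b experiment
git commit -q --allow-empty -m C
git commit -q --allow-empty -m D
git checkout -q master
git commit -q --allow-empty -m E
git commit -q --allow-empty -m F

only_experiment=$(git log --format=%s master..experiment)  # D then C
only_master=$(git log --format=%s experiment..master)      # F then E
implied_head=$(git log --format=%s experiment..)           # HEAD is master here
echo "$only_experiment"
echo "$only_master"
```

Because master is checked out, leaving the right-hand side empty gives the same result as experiment..master.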
Multiple Points
The double-dot syntax is useful as a shorthand, but perhaps you want
to specify more than two branches to indicate your revision, such as
seeing what commits are in any of several branches that aren't in
the branch you're currently on. Git allows you to do this by using
either the ^ character or --not before any reference from which you
don't want to see reachable commits. Thus these three commands
are equivalent.
$ git log refA..refB
$ git log ^refA refB
$ git log refB --not refA
This is nice because with this syntax you can specify more than
two references in your query, which you cannot do with the double-dot
syntax. For instance, if you want to see all commits that are reachable
from refA or refB but not from refC, you can type one of these.
$ git log refA refB ^refC
$ git log refA refB --not refC
This makes for a very powerful revision query system that should
help you figure out what is in your branches.
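The equivalence of the three spellings, and the three-reference query, can be verified on a small history. In this sketch the branches refA, refB, and refC match the names used above, while the history behind them (one commit each on refA and refB, with refC at their common base) is an assumption made for the example.

```shell
set -e
cd "$(mktemp -d)"                            # throwaway directory (assumption)
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is called master
git config user.name "Demo User"             # hypothetical identity
git config user.email demo@example.com

git commit -q --allow-empty -m base
git branch refC                              # refC stays at the base commit
git checkout -q -b refA
git commit -q --allow-empty -m a1
git checkout -q -b refB refC                 # refB also starts from the base commit
git commit -q --allow-empty -m b1

dots=$(git log --format=%s refA..refB)
caret=$(git log --format=%s ^refA refB)
with_not=$(git log --format=%s refB --not refA)
either=$(git log --format=%s refA refB ^refC | LC_ALL=C sort)  # sorted for a stable order
echo "$dots | $caret | $with_not"
echo "$either"
```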
Triple Dot
The last major range-selection syntax is the triple-dot syntax, which
specifies all the commits that are reachable by either of two references
but not by both of them. Look back at the example commit
history in Figure 6.1. If you want to see what is in master or experiment
but not any common references, you can run
$ git log master...experiment
F
E
D
C
Again, this gives you normal log output but shows you only the
commit information for those four commits, appearing in the traditional
commit date ordering.
A common switch to use with the log command in this case is
--left-right, which shows you which side of the range each commit is
in. This helps make the data more useful.
$ git log --left-right master...experiment
< F
< E
> D
> C
With these tools, you can much more easily let Git know what
commit or commits you want to inspect.
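The triple-dot syntax can be tried on a history of the same assumed shape as Figure 6.1 (A and B shared, C and D on experiment, E and F on master). Note one difference from the listing above: with a custom --format, this sketch asks for the left/right marker explicitly through the %m placeholder. The results are sorted so they don't depend on commit timestamps.

```shell
set -e
cd "$(mktemp -d)"                            # throwaway directory (assumption)
git init -q
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is called master
git config user.name "Demo User"             # hypothetical identity
git config user.email demo@example.com

git commit -q --allow-empty -m A
git commit -q --allow-empty -m B
git checkout -q -b experiment
git commit -q --allow-empty -m C
git commit -q --allow-empty -m D
git checkout -q master
git commit -q --allow-empty -m E
git commit -q --allow-empty -m F

# Commits reachable from exactly one of the two branches
symmetric=$(git log --format=%s master...experiment | LC_ALL=C sort)
# %m prints < for the left side (master) and > for the right side (experiment)
marked=$(git log --left-right --format='%m %s' master...experiment | LC_ALL=C sort)
echo "$symmetric"
echo "$marked"
```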
6.2 Interactive Staging
Git comes with a couple of scripts that make some command-line
tasks easier. Here, you'll look at a few interactive commands that
can help you easily craft your commits to include only certain combinations
and parts of files. These tools are very helpful if you modify
a bunch of files and then decide that you want those changes to be
in several focused commits rather than one big messy commit. This
way, you can make sure your commits are logically separate changesets
and can be easily reviewed by the developers working with you.
If you run git add with the -i or --interactive option, Git goes into an
interactive shell mode, displaying something like this.
$ git add -i
staged unstaged path
1: unchanged +0/-1 TODO
2: unchanged +1/-1 index.html
3: unchanged +5/-1 lib/simplegit.rb
*** Commands ***
1: status 2: update 3: revert 4: add untracked
5: patch 6: diff 7: quit 8: help
What now>
You can see that this command shows you a much different view of
your staging area: basically the same information you get with git
status but a bit more succinct and informative. It lists the changes
you've staged on the left and unstaged changes on the right.
After this comes a Commands section. Here you can do a number
of things, including staging files, unstaging files, staging parts
of files, adding untracked files, and seeing diffs of what has been
staged.
6.2.1 Staging and Unstaging Files
If you type 2 or u at the What now> prompt, the script prompts you for
which files you want to stage.
What now> 2
staged unstaged path
1: unchanged +0/-1 TODO
2: unchanged +1/-1 index.html
3: unchanged +5/-1 lib/simplegit.rb
Update>>
To stage the TODO and index.html files, you can type the numbers.
Update>> 1,2
staged unstaged path
* 1: unchanged +0/-1 TODO
* 2: unchanged +1/-1 index.html
3: unchanged +5/-1 lib/simplegit.rb
Update>>
The * next to each file means the file is selected to be staged. If
you press Enter after typing nothing at the Update>> prompt, Git takes
anything selected and stages it for you.
Update>>
updated 2 paths
*** Commands ***
1: status 2: update 3: revert 4: add untracked
5: patch 6: diff 7: quit 8: help
What now> 1
staged unstaged path
1: +0/-1 nothing TODO
2: +1/-1 nothing index.html
3: unchanged +5/-1 lib/simplegit.rb
Now you can see that the TODO and index.html files are staged
and the simplegit.rb file is still unstaged. If you want to unstage the
TODO file at this point, you use the 3 or r (for revert) option.
*** Commands ***
1: status 2: update 3: revert 4: add untracked
5: patch 6: diff 7: quit 8: help
What now> 3
staged unstaged path
1: +0/-1 nothing TODO
2: +1/-1 nothing index.html
3: unchanged +5/-1 lib/simplegit.rb
Revert>> 1
staged unstaged path
* 1: +0/-1 nothing TODO
2: +1/-1 nothing index.html
3: unchanged +5/-1 lib/simplegit.rb
Revert>> [enter]
reverted one path
Looking at your Git status again, you can see that you've unstaged
the TODO file.
*** Commands ***
1: status 2: update 3: revert 4: add untracked
5: patch 6: diff 7: quit 8: help
What now> 1
staged unstaged path
1: unchanged +0/-1 TODO
2: +1/-1 nothing index.html
3: unchanged +5/-1 lib/simplegit.rb
To see the diff of what you've staged, you can use the 6 or d (for
diff) command. It shows you a list of your staged files, and you can
select the ones for which you would like to see the staged diff. This
is much like specifying git diff --cached on the command line.
*** Commands ***
1: status 2: update 3: revert 4: add untracked
5: patch 6: diff 7: quit 8: help
What now> 6
staged unstaged path
1: +1/-1 nothing index.html
Review diff>> 1
diff --git a/index.html b/index.html
index 4d07108..4335f49 100644
--- a/index.html
+++ b/index.html
@@ -16,7 +16,7 @@ Date Finder
<p id="out">...</p>
-<div id="footer">contact : support@github.com</div>
+<div id="footer">contact : email.support@github.com</div>
<script type="text/javascript">
With these basic commands, you can use the interactive add mode
to deal with your staging area a little more easily.
6.2.2 Staging Patches
It's also possible for Git to stage certain parts of files and not the
rest. For example, if you make two changes to your simplegit.rb file
and want to stage one of them and not the other, doing so is very
easy in Git. From the interactive prompt, type 5 or p (for patch). Git
will ask you which files you would like to partially stage. Then, for
each section of the selected files, it will display hunks of the file diff
and ask if you would like to stage them, one by one.
diff --git a/lib/simplegit.rb b/lib/simplegit.rb
index dd5ecc4..57399e0 100644
--- a/lib/simplegit.rb
+++ b/lib/simplegit.rb
@@ -22,7 +22,7 @@ class SimpleGit
end
def log(treeish = 'master')
- command("git log -n 25 #{treeish}")
+ command("git log -n 30 #{treeish}")
end
def blame(path)
Stage this hunk [y,n,a,d,/,j,J,g,e,?]?
You have a lot of options at this point. Typing ? shows a list of
what you can do.
Stage this hunk [y,n,a,d,/,j,J,g,e,?]? ?
y - stage this hunk
n - do not stage this hunk
a - stage this and all the remaining hunks in the file
d - do not stage this hunk nor any of the remaining hunks in the file
g - select a hunk to go to
/ - search for a hunk matching the given regex
j - leave this hunk undecided, see next undecided hunk
J - leave this hunk undecided, see next hunk
k - leave this hunk undecided, see previous undecided hunk
K - leave this hunk undecided, see previous hunk
s - split the current hunk into smaller hunks
e - manually edit the current hunk
? - print help
Generally, you'll type y or n if you want to stage each hunk, but
staging all of them in certain files or skipping a hunk decision until
later can be helpful too. If you stage one part of the file and leave
another part unstaged, your status output will look like this.
What now> 1
staged unstaged path
1: unchanged +0/-1 TODO
2: +1/-1 nothing index.html
3: +1/-1 +4/-0 lib/simplegit.rb
The status of the simplegit.rb file is interesting. It shows you that
a couple of lines are staged and a couple are unstaged. You've partially
staged this file. At this point, you can exit the interactive adding
script and run git commit to commit the partially staged files.
Finally, you don't need to be in interactive add mode to do the
partial-file staging. You can start the same script by using git add
-p or git add --patch on the command line.
6.3 Stashing
Often, when you've been working on part of your project, things are
in a messy state and you want to switch branches for a bit to work
on something else. The problem is, you don't want to do a commit
of half-done work just so you can get back to this point later. The
answer to this issue is the git stash command.
Stashing takes the dirty state of your working directory (that is,
your modified tracked files and staged changes) and saves it on a
stack of unfinished changes that you can reapply at any time.
6.3.1 Stashing Your Work
To demonstrate, you'll go into your project and start working on a
couple of files and possibly stage one of the changes. If you run git
status, you can see your dirty state.
$ git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# modified: index.html
#
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
#
# modified: lib/simplegit.rb
#
Now you want to switch branches, but you don't want to commit
what you've been working on yet, so you'll stash the changes. To
push a new stash onto your stack, run git stash.
$ git stash
Saved working directory and index state \
"WIP on master: 049d078 added the index file"
HEAD is now at 049d078 added the index file
(To restore them type "git stash apply")
Your working directory is clean.
$ git status
# On branch master
nothing to commit (working directory clean)
At this point, you can easily switch branches and do work elsewhere;
your changes are stored on your stack. To see which stashes
you've stored, you can use git stash list.
$ git stash list
stash@{0}: WIP on master: 049d078 added the index file
stash@{1}: WIP on master: c264051... Revert "added file_size"
stash@{2}: WIP on master: 21d80a5... added number to log
In this case, two stashes were done previously, so you have access
to three different stashed works. You can reapply the one you
just stashed by using the command shown in the help output of the
original stash command: git stash apply. If you want to apply one
of the older stashes, you can specify it by naming it, like this: git
stash apply stash@{2}. If you don't specify a stash, Git assumes the most
recent stash and tries to apply it.
$ git stash apply
# On branch master
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
#
# modified: index.html
# modified: lib/simplegit.rb
#
You can see that Git re-modifies the files that were reverted when
you saved the stash. In this case, you had a clean working directory
when you tried to apply the stash, and you tried to apply it on the
same branch you saved it from, but having a clean working directory
and applying it on the same branch aren't necessary to successfully
apply a stash. You can save a stash on one branch, switch to another
branch later, and try to reapply the changes. You can also have modified
and uncommitted files in your working directory when you apply
a stash. Git gives you merge conflicts if anything no longer applies
cleanly.
The changes to your files were reapplied, but the file you staged
before wasn't restaged. To do that, you must run the git stash apply
command with a --index option to tell the command to try to reapply
the staged changes. If you had run that instead, you'd have gotten
back to your original position.
$ git stash apply --index
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# modified: index.html
#
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
#
# modified: lib/simplegit.rb
#
The apply option only tries to apply the stashed work; you continue
to have it on your stack. To remove it, you can run git stash
drop with the name of the stash to remove.
$ git stash list
stash@{0}: WIP on master: 049d078 added the index file
stash@{1}: WIP on master: c264051... Revert "added file_size"
stash@{2}: WIP on master: 21d80a5... added number to log
$ git stash drop stash@{0}
Dropped stash@{0} (364e91f3f268f0900bc3ee613f9f733e82aaed43)
You can also run git stash pop to apply the stash and then immediately
drop it from your stack.
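The whole round trip (stash, apply with --index, drop) fits in a short script. The file names below echo the ones used in this section, but the repository, its contents, and the identity are assumptions made for the sketch.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory (assumption)
git init -q
git config user.name "Demo User"           # hypothetical identity
git config user.email demo@example.com

echo one > index.html
echo one > simplegit.rb
git add index.html simplegit.rb
git commit -q -m "added the index file"

echo two >> index.html
git add index.html                         # a staged change
echo two >> simplegit.rb                   # an unstaged change

git stash -q                               # push the dirty state onto the stack
clean=$(git status --porcelain)            # empty: the working directory is clean

git stash apply -q --index                 # reapply, restoring the staged change too
staged=$(git diff --cached --name-only)    # index.html
unstaged=$(git diff --name-only)           # simplegit.rb

git stash drop -q                          # apply leaves the stash on the stack
remaining=$(git stash list | wc -l | tr -d ' ')
echo "staged=$staged unstaged=$unstaged remaining=$remaining"
```

Without --index, both files would come back as unstaged modifications.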
6.3.2 Creating a Branch from a Stash
If you stash some work, leave it there for a while, and continue on the
branch from which you stashed the work, you may have a problem
reapplying the work. If the apply tries to modify a file that you've
since modified, you'll get a merge conflict and will have to try to
resolve it. If you want an easier way to test the stashed changes again,
you can run git stash branch, which creates a new branch for you,
checks out the commit you were on when you stashed your work,
reapplies your work there, and then drops the stash if it applies
successfully.
$ git stash branch testchanges
Switched to a new branch "testchanges"
# On branch testchanges
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# modified: index.html
#
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
#
# modified: lib/simplegit.rb
#
Dropped refs/stash@{0} (f0dfc4d5dc332d1cee34a634182e168c4efc3359)
This is a nice shortcut to recover stashed work easily and work on
it in a new branch.
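Here is a sketch of the situation this command is meant for: the branch has moved on and touched the same file since the stash was saved. The file contents, commit messages, and identity are hypothetical; testchanges is the branch name used above.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory (assumption)
git init -q
git config user.name "Demo User"           # hypothetical identity
git config user.email demo@example.com

echo one > index.html
git add index.html
git commit -q -m "added the index file"

echo "stashed work" >> index.html
git stash -q                               # save the half-done change

echo "later change" >> index.html          # the branch moves on and edits the same file
git commit -q -am "later change to the index file"

# New branch at the commit the stash was made on, with the stash reapplied there
git stash branch testchanges > /dev/null 2>&1

branch=$(git rev-parse --abbrev-ref HEAD)          # testchanges
modified=$(git diff --name-only)                   # index.html, as before the stash
remaining=$(git stash list | wc -l | tr -d ' ')    # 0: the stash was dropped
echo "branch=$branch modified=$modified remaining=$remaining"
```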
6.4 Rewriting History
Many times, when working with Git, you may want to revise your
commit history for some reason. One of the great things about Git
is that it allows you to make decisions at the last possible moment.
You can decide what files go into which commits right before you
commit with the staging area, you can decide that you didn't mean
to be working on something yet with the stash command, and you
can rewrite commits that already happened so they look like they
happened in a different way. This can involve changing the order
of the commits, changing messages or modifying files in a commit,
squashing together or splitting apart commits, or removing commits
entirely, all before you share your work with others.
In this section, you'll cover how to accomplish these very useful
tasks so that you can make your commit history look the way you
want before you share it with others.
6.4.1 Changing the Last Commit
Changing your last commit is probably the most common rewriting of
history that you'll do. You'll often want to do two basic things to your
last commit: change the commit message, or change the snapshot
you just recorded by adding, changing and removing files.
If you only want to modify your last commit message, it's very
simple.
$ git commit --amend
That drops you into your text editor, which has your last commit
message in it, ready for you to modify the message. When you save
and close the editor, Git writes a new commit containing that
message and makes it your new last commit.
If you've committed and then you want to change the snapshot you
committed by adding or changing files, possibly because you forgot
to add a newly created file when you originally committed, the process
works basically the same way. You stage the changes you want
by editing a file and running git add on it or running git rm on a
tracked file, and the subsequent git commit --amend takes your current
staging area and makes it the snapshot for the new commit.
You need to be careful with this technique because amending
changes the SHA-1 of the commit. It's like a very small rebase:
don't amend your last commit if you've already pushed it.
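The effect on the SHA-1 is easy to observe. In this sketch a forgotten file is added and the message is fixed in a single amend; passing -m avoids opening an editor. The file names, messages, and identity are invented for the example.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory (assumption)
git init -q
git config user.name "Demo User"           # hypothetical identity
git config user.email demo@example.com

echo one > README
git add README
git commit -q -m "initial comit"           # typo in the message, and NOTES was forgotten
before=$(git rev-parse HEAD)

echo two > NOTES
git add NOTES                              # stage the forgotten file
git commit -q --amend -m "initial commit"  # replace the last commit
after=$(git rev-parse HEAD)

count=$(git rev-list --count HEAD)         # still a single commit in the history
message=$(git log -1 --format=%s)
files=$(git ls-tree --name-only HEAD | LC_ALL=C sort)
echo "$before -> $after ($count commit, files: $files)"
```

The history still holds one commit, but it is a different object from the one you started with, which is why amending a pushed commit causes trouble.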
6.4.2 Changing Multiple Commit Messages
To modify a commit that is farther back in your history, you must
move to more complex tools. Git doesn't have a modify-history tool,
but you can use the rebase tool to rebase a series of commits onto
the HEAD they were originally based on instead of moving them to
another one. With the interactive rebase tool, you can then stop after
each commit you want to modify and change the message, add files,
or do whatever you wish. You can run rebase interactively by adding
the -i option to git rebase. You must indicate how far back you want
to rewrite commits by telling the command which commit to rebase
onto.
For example, if you want to change the last three commit messages,
or any of the commit messages in that group, you supply as
an argument to git rebase -i the parent of the last commit you want
to edit, which is HEAD~2^ or HEAD~3. It may be easier to remember the
~3 because you're trying to edit the last three commits, but keep in
mind that you're actually designating four commits ago, the parent
of the last commit you want to edit.
$ git rebase -i HEAD~3
Remember again that this is a rebasing command: every commit
included in the range HEAD~3..HEAD will be rewritten, whether you
change the message or not. Don't include any commit you've already
pushed to a central server, because doing so will confuse other developers
by providing an alternate version of the same change.
Running this command gives you a list of commits in your text
editor that looks something like this.
pick f7f3f6d changed my name a bit
pick 310154e updated README formatting and added blame
pick a5f4a0d added cat-file
# Rebase 710f0f8..a5f4a0d onto 710f0f8
#
# Commands:
# p, pick = use commit
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
#
# If you remove a line here THAT COMMIT WILL BE LOST.
# However, if you remove everything, the rebase will be aborted.
#
It's important to note that these commits are listed in the opposite
order than you normally see them using the log command. If you run
a log, you see something like this.
$ git log --pretty=format:"%h %s" HEAD~3..HEAD
a5f4a0d added cat-file
310154e updated README formatting and added blame
f7f3f6d changed my name a bit
Notice the reverse order. The interactive rebase gives you a script
that it's going to run. It will start at the commit you specify on the
command line (HEAD~3) and replay the changes introduced in each of
these commits from top to bottom. It lists the oldest at the top, rather
than the newest, because that's the first one it will replay.
You need to edit the script so that it stops at the commit you want
to edit. To do so, change the word pick to the word edit for each of
the commits you want the script to stop after. For example, to modify
only the third commit message, you change the file to look like this:
edit f7f3f6d changed my name a bit
pick 310154e updated README formatting and added blame
pick a5f4a0d added cat-file
When you save and exit the editor, Git rewinds you back to the
last commit in that list and drops you on the command line with the
following message:
$ git rebase -i HEAD~3
Stopped at 7482e0d... updated the gemspec to hopefully work better
You can amend the commit now, with
git commit --amend
Once you’re satisfied with your changes, run
git rebase --continue
These instructions tell you exactly what to do. Type
$ git commit --amend
Change the commit message, and exit the editor. Then, run
$ git rebase --continue
This command will apply the other two commits automatically,
and then you're done. If you change pick to edit on more lines, you
can repeat these steps for each commit you change to edit. Each
time, Git will stop, let you amend the commit, and continue when
you're finished.
6.4.3 Reordering Commits
You can also use interactive rebases to reorder or remove commits
entirely. If you want to remove the “added cat-file” commit and
change the order in which the other two commits are introduced,
you can change the rebase script from this
pick f7f3f6d changed my name a bit
pick 310154e updated README formatting and added blame
pick a5f4a0d added cat-file
to this:
pick 310154e updated README formatting and added blame
pick f7f3f6d changed my name a bit
When you save and exit the editor, Git rewinds your branch to the
parent of these commits, applies 310154e and then f7f3f6d, and then
stops. You effectively change the order of those commits and remove
the “added cat-file” commit completely.
6.4.4 Squashing a Commit
It's also possible to take a series of commits and squash them down
into a single commit with the interactive rebasing tool. The script
puts helpful instructions in the rebase message:
#
# Commands:
# p, pick = use commit
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
#
# If you remove a line here THAT COMMIT WILL BE LOST.
# However, if you remove everything, the rebase will be aborted.
#
If, instead of “pick” or “edit”, you specify “squash”, Git applies
both that change and the change directly before it and makes you
merge the commit messages together. So, if you want to make a
single commit from these three commits, you make the script look
like this:
pick f7f3f6d changed my name a bit
squash 310154e updated README formatting and added blame
squash a5f4a0d added cat-file
When you save and exit the editor, Git applies all three changes
and then puts you back into the editor to merge the three commit
messages:
# This is a combination of 3 commits.
# The first commit's message is:
changed my name a bit
# This is the 2nd commit message:
updated README formatting and added blame
# This is the 3rd commit message:
added cat-file
When you save that, you have a single commit that introduces the
changes of all three previous commits.
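The squash can also be scripted rather than typed into an editor. The sketch below is an illustration under assumptions, not something the original text does: it relies on the GIT_SEQUENCE_EDITOR environment variable (available in newer Git releases) to substitute a small sed command for the editor, and on GIT_EDITOR=true to accept the combined commit message unchanged. The repository, file, and commit names are invented.

```shell
# Scratch repository: one base commit plus three commits to squash.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
for n in base one two three; do
    echo "$n" >> file.txt
    git add file.txt
    git commit -q -m "$n"
done

# Stand-in for the editor: change "pick" to "squash" on lines 2 and 3
# of the rebase script, leaving the first (oldest) commit as "pick".
GIT_SEQUENCE_EDITOR='f() { sed "2,3s/^pick/squash/" "$1" > "$1.tmp" && mv "$1.tmp" "$1"; }; f' \
GIT_EDITOR=true \
git rebase -i HEAD~3

# Only the base commit and the single squashed commit should remain.
count=$(git rev-list --count HEAD)
echo "$count"
```

The same caveat applies as for the manual procedure: the squashed commit has a new SHA-1, so this is only safe for commits that have not been pushed.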
6.4.5 Splitting a Commit
Splitting a commit undoes a commit and then partially stages and
commits as many times as commits you want to end up with. For
example, suppose you want to split the middle commit of your three
commits. Instead of “updated README formatting and added blame”,
you want to split it into two commits: “updated README formatting”
for the first, and “added blame” for the second. You can do that in
the rebase -i script by changing the instruction on the commit you
want to split to “edit”:
pick f7f3f6d changed my name a bit
edit 310154e updated README formatting and added blame
pick a5f4a0d added cat-file
Then, when the script drops you to the command line, you reset
that commit, take the changes that have been reset, and create
multiple commits out of them. When you save and exit the editor,
Git rewinds to the parent of the first commit in your list, applies the
first commit (f7f3f6d), applies the second (310154e), and drops you to
the console. There, you can do a mixed reset of that commit with
git reset HEAD^, which effectively undoes that commit and leaves the
modified files unstaged. Now you can stage and commit files until
you have several commits, and run git rebase --continue when you're
done:
$ git reset HEAD^
$ git add README
$ git commit -m 'updated README formatting'
$ git add lib/simplegit.rb
$ git commit -m 'added blame'
$ git rebase --continue
Git applies the last commit (a5f4a0d) in the script, and your history
looks like this:
$ git log -4 --pretty=format:"%h %s"
1c002dd added cat-file
9b29157 added blame
35cfb2b updated README formatting
f3cc40e changed my name a bit
Once again, this changes the SHAs of all the commits in your list,
so make sure no commit shows up in that list that you've already
pushed to a shared repository.
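The whole split can be rehearsed in a throwaway repository. This is a sketch under assumptions: the file contents and the "base", "first", and "last" commits are invented, and GIT_SEQUENCE_EDITOR (a newer Git feature not used in the original text) stands in for editing the rebase script by hand.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
echo base > base.txt; git add base.txt; git commit -q -m "base"
echo a > a.txt;       git add a.txt;    git commit -q -m "first"
echo r > README; echo l > lib.rb
git add README lib.rb
git commit -q -m "updated README formatting and added blame"
echo c > c.txt;       git add c.txt;    git commit -q -m "last"

# Mark the middle commit (line 2 of the rebase script) as "edit".
GIT_SEQUENCE_EDITOR='f() { sed "2s/^pick/edit/" "$1" > "$1.tmp" && mv "$1.tmp" "$1"; }; f' \
git rebase -i HEAD~3

# The rebase has stopped after the middle commit: undo it with a mixed
# reset, then commit its two files separately.
git reset -q 'HEAD^'
git add README && git commit -q -m "updated README formatting"
git add lib.rb && git commit -q -m "added blame"
GIT_EDITOR=true git rebase --continue

git log -4 --pretty=format:"%h %s"
count=$(git rev-list --count HEAD)
```

The log should list the "last" commit, then "added blame", then "updated README formatting", then "first", mirroring the history shown above.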
6.4.6 The Nuclear Option: filter-branch
There is another history-rewriting option that you can use if you need
to rewrite a larger number of commits in some scriptable way (for
instance, changing your e-mail address globally or removing a file
from every commit). The command is filter-branch, and it can rewrite
huge swaths of your history, so you probably shouldn't use it unless
your project isn't yet public and other people haven't based work off
the commits you're about to rewrite. However, it can be very useful.
You'll learn a few of the common uses so you can get an idea of some
of the things it's capable of.
Removing a File from Every Commit
This occurs fairly commonly. Someone accidentally commits a huge
binary file with a thoughtless git add ., and you want to remove it
everywhere. Perhaps you accidentally committed a file that contained
a password, and you want to make your project open source.
filter-branch is the tool you probably want to use to scrub your entire
history. To remove a file named passwords.txt from your entire history,
you can use the --tree-filter option to filter-branch:
$ git filter-branch --tree-filter 'rm -f passwords.txt' HEAD
Rewrite 6b9b3cf04e7c5686a9cb838c3f36a8cb6a0fc2bd (21/21)
Ref 'refs/heads/master' was rewritten
The --tree-filter option runs the specified command after each
checkout of the project and then recommits the results. In this case,
you remove a file called passwords.txt from every snapshot, whether
it exists or not. If you want to remove all accidentally committed
editor backup files, you can run something like
git filter-branch --tree-filter 'rm -f *~' HEAD.
You'll be able to watch Git rewriting trees and commits and then
move the branch pointer at the end. It's generally a good idea to
do this in a testing branch and then hard-reset your master branch
after you've determined the outcome is what you really want. To
run filter-branch on all your branches, you can pass --all to the
command.
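To try the tree filter safely, you can run it against a scratch repository first. The sketch below assumes invented file contents and commit messages. It also sets FILTER_BRANCH_SQUELCH_WARNING, which only matters on newer Git releases, where filter-branch prints a deprecation notice and pauses before running.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
echo hello > README;         git add README;        git commit -q -m "add README"
echo secret > passwords.txt; git add passwords.txt; git commit -q -m "oops"
echo more >> README;         git commit -q -am "update README"

# Rewrite every commit reachable from HEAD without passwords.txt.
FILTER_BRANCH_SQUELCH_WARNING=1 \
git filter-branch --tree-filter 'rm -f passwords.txt' HEAD

# List rewritten commits that still touch the file; this should be empty.
remaining=$(git log --oneline HEAD -- passwords.txt)
count=$(git rev-list --count HEAD)
echo "commits: $count, still touching passwords.txt: [$remaining]"
```

Note that filter-branch keeps a backup of the old branch under refs/original/, so the file is still reachable in the repository until that backup is removed and the objects are pruned.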
Making a Subdirectory the New Root
Suppose you've done an import from another source control system
and have subdirectories that make no sense (trunk, tags, and so on).
If you want to make the trunk subdirectory be the new project root
for every commit, filter-branch can help you do that, too:
$ git filter-branch --subdirectory-filter trunk HEAD
Rewrite 856f0bf61e41a27326cdae8f09fe708d679f596f (12/12)
Ref 'refs/heads/master' was rewritten
Now your new project root is what was in the trunk subdirectory
each time. Git will also automatically remove commits that did not
affect the subdirectory.
Changing E-Mail Addresses Globally
Another common case is that you forgot to run git config to set your
name and e-mail address before you started working, or perhaps you
want to open-source a project at work and change all your work
e-mail addresses to your personal address. In any case, you can change
e-mail addresses in multiple commits in a batch with filter-branch as
well. You need to be careful to change only the e-mail addresses that
are yours, so you use --commit-filter:
$ git filter-branch --commit-filter '
if [ "$GIT_AUTHOR_EMAIL" = "schacon@localhost" ];
then
GIT_AUTHOR_NAME="Scott Chacon";
GIT_AUTHOR_EMAIL="schacon@example.com";
git commit-tree "$@";
else
git commit-tree "$@";
fi' HEAD
This goes through and rewrites every commit to have your new
address. Because commits contain the SHA-1 values of their parents,
this command changes every commit SHA in your history, not just
those that have the matching e-mail address.
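An alternative worth knowing about, not used in the text above, is filter-branch's --env-filter option, which only adjusts the environment each commit is re-created with. The following is a minimal sketch assuming a scratch repository whose commits were authored as schacon@localhost. It rewrites the author fields only, so the committer fields keep their old values.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
for n in 1 2; do
    echo "$n" > file.txt
    git add file.txt
    git -c user.name="schacon" -c user.email="schacon@localhost" \
        commit -q -m "commit $n"
done

# The filter runs once per commit; changed variables must be exported.
FILTER_BRANCH_SQUELCH_WARNING=1 \
git filter-branch --env-filter '
if [ "$GIT_AUTHOR_EMAIL" = "schacon@localhost" ]; then
    GIT_AUTHOR_NAME="Scott Chacon"
    GIT_AUTHOR_EMAIL="schacon@example.com"
    export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL
fi' HEAD

# Collect the distinct author e-mail addresses left in the history.
emails=$(git log --format=%ae | sort -u)
echo "$emails"
```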
6.5 Debugging with Git
Git also provides a couple of tools to help you debug issues in your
projects. Because Git is designed to work with nearly any type of
project, these tools are pretty generic, but they can often help you
hunt for a bug or culprit when things go wrong.
6.5.1 File Annotation
If you track down a bug in your code and want to know when it was
introduced and why, file annotation is often your best tool. It shows
you what commit was the last to modify each line of any file. So, if
you see that a method in your code is buggy, you can annotate the
file with git blame to see when each line of the method was last edited
and by whom. This example uses the -L option to limit the output to
lines 12 through 22:
$ git blame -L 12,22 simplegit.rb
^4832fe2 (Scott Chacon 2008-03-15 10:31:28 -0700 12) def show(tree = 'master')
^4832fe2 (Scott Chacon 2008-03-15 10:31:28 -0700 13) command("git show #{tree}")
^4832fe2 (Scott Chacon 2008-03-15 10:31:28 -0700 14) end
^4832fe2 (Scott Chacon 2008-03-15 10:31:28 -0700 15)
9f6560e4 (Scott Chacon 2008-03-17 21:52:20 -0700 16) def log(tree = 'master')
79eaf55d (Scott Chacon 2008-04-06 10:15:08 -0700 17) command("git log #{tree}")
9f6560e4 (Scott Chacon 2008-03-17 21:52:20 -0700 18) end
9f6560e4 (Scott Chacon 2008-03-17 21:52:20 -0700 19)
42cf2861 (Magnus Chacon 2008-04-13 10:45:01 -0700 20) def blame(path)
42cf2861 (Magnus Chacon 2008-04-13 10:45:01 -0700 21) command("git blame #{path}")
42cf2861 (Magnus Chacon 2008-04-13 10:45:01 -0700 22) end
Notice that the first field is the partial SHA-1 of the commit that
last modified that line. The next two fields are values extracted from
that commit (the author name and the authored date of that commit),
so you can easily see who modified that line and when. After
that come the line number and the content of the file. Also note the
^4832fe2 commit lines, which designate that those lines were in this
file's original commit. That commit is when this file was first added
to this project, and those lines have been unchanged since. This is a
tad confusing, because now you've seen at least three different ways
that Git uses the ^ to modify a commit SHA, but that is what it means
here.
Another cool thing about Git is that it doesn't track file renames
explicitly. It records the snapshots and then tries to figure out what
was renamed implicitly, after the fact. One of the interesting features
of this is that you can ask it to figure out all sorts of code movement
as well. If you pass -C to git blame, Git analyzes the file you're
annotating and tries to figure out where snippets of code within it
originally came from if they were copied from elsewhere. Recently, I
was refactoring a file named GITServerHandler.m into multiple files, one
of which was GITPackUpload.m. By blaming GITPackUpload.m with the -C
option, I could see where sections of the code originally came from:
$ git blame -C -L 141,153 GITPackUpload.m
f344f58d GITServerHandler.m (Scott 2009-01-04 141)
f344f58d GITServerHandler.m (Scott 2009-01-04 142) - (void) gatherObjectShasFromC
f344f58d GITServerHandler.m (Scott 2009-01-04 143) {
70befddd GITServerHandler.m (Scott 2009-03-22 144) //NSLog(@"GATHER COMMI
ad11ac80 GITPackUpload.m (Scott 2009-03-24 145)
ad11ac80 GITPackUpload.m (Scott 2009-03-24 146) NSString *parentSha;
ad11ac80 GITPackUpload.m (Scott 2009-03-24 147) GITCommit *commit = [g
ad11ac80 GITPackUpload.m (Scott 2009-03-24 148)
ad11ac80 GITPackUpload.m (Scott 2009-03-24 149) //NSLog(@"GATHER COMMI
ad11ac80 GITPackUpload.m (Scott 2009-03-24 150)
56ef2caf GITServerHandler.m (Scott 2009-01-05 151) if(commit) {
56ef2caf GITServerHandler.m (Scott 2009-01-05 152) [refDict setOb
56ef2caf GITServerHandler.m (Scott 2009-01-05 153)
This is really useful. Normally, you get as the original commit the
commit where you copied the code over, because that is the first time
you touched those lines in this file. Git tells you the original commit
where you wrote those lines, even if it was in another file.
6.5.2 Binary Search
Annotating a file helps if you know where the issue is to begin with.
If you don't know what is breaking, and there have been dozens or
hundreds of commits since the last state where you know the code
worked, you'll likely turn to git bisect for help. The bisect command
does a binary search through your commit history to help you identify
as quickly as possible which commit introduced an issue.
Let's say you just pushed out a release of your code to a production
environment, you're getting bug reports about something that wasn't
happening in your development environment, and you can't imagine
why the code is doing that. You go back to your code, and it turns out
you can reproduce the issue, but you can't figure out what is going
wrong. You can bisect the code to find out. First you run git bisect
start to get things going, and then you use git bisect bad to tell the
system that the current commit you're on is broken. Then, you must
tell bisect when the last known good state was, using git bisect good
[good_commit]:
$ git bisect start
$ git bisect bad
$ git bisect good v1.0
Bisecting: 6 revisions left to test after this
[ecb6e1bc347ccecc5f9350d878ce677feb13d3b2] error handling on repo
Git figured out that about 12 commits came between the commit
you marked as the last good commit (v1.0) and the current bad
version, and it checked out the middle one for you. At this point, you
can run your test to see if the issue exists as of this commit. If it
does, then it was introduced sometime before this middle commit; if
it doesn't, then the problem was introduced sometime after the middle
commit. It turns out there is no issue here, and you tell Git that
by typing git bisect good and continue your journey:
$ git bisect good
Bisecting: 3 revisions left to test after this
[b047b02ea83310a70fd603dc8cd7a6cd13d15c04] secure this thing
Now you're on another commit, halfway between the one you just
tested and your bad commit. You run your test again and find that
this commit is broken, so you tell Git that with git bisect bad:
$ git bisect bad
Bisecting: 1 revisions left to test after this
[f71ce38690acf49c1f3c9bea38e09d82a5ce6014] drop exceptions table
This commit is fine, and now Git has all the information it needs to
determine where the issue was introduced. It tells you the SHA-1 of
the first bad commit and shows some of the commit information and
which files were modified in that commit so you can figure out what
happened that may have introduced this bug:
$ git bisect good
b047b02ea83310a70fd603dc8cd7a6cd13d15c04 is first bad commit
commit b047b02ea83310a70fd603dc8cd7a6cd13d15c04
Author: PJ Hyett <pjhyett@example.com>
Date: Tue Jan 27 14:48:32 2009 -0800
secure this thing
:040000 040000 40ee3e7821b895e52c1695092db9bdc4c61d1730 f24d3c6ebcfc639b1a3814550e62d60b8e68a8e4 M config
When you're finished, you should run git bisect reset to reset your
HEAD to where you were before you started, or you'll end up in a
weird state:
$ git bisect reset
This is a powerful tool that can help you check hundreds of commits
for an introduced bug in minutes. In fact, if you have a script
that will exit 0 if the project is good or non-0 if the project is bad,
you can fully automate git bisect. First, you again tell it the scope of
the bisect by providing the known bad and good commits. You can do
this by listing them with the bisect start command if you want, listing
the known bad commit first and the known good commit second:
$ git bisect start HEAD v1.0
$ git bisect run test-error.sh
Doing so automatically runs test-error.sh on each checked-out commit
until Git finds the first broken commit. You can also run something
like make or make tests or whatever you have that runs automated
tests for you.
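The automated form can be tried end to end in a scratch repository. In this sketch everything except the git bisect commands is an assumption made for illustration: the history has ten commits, the fifth adds a marker line "BUG" to app.txt, and the test command is an inline sh one-liner standing in for a script such as test-error.sh.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Ten commits; the fifth one introduces the marker string "BUG".
for n in 1 2 3 4 5 6 7 8 9 10; do
    if [ "$n" -eq 5 ]; then
        echo "BUG" >> app.txt
    else
        echo "line $n" >> app.txt
    fi
    git add app.txt
    git commit -q -m "commit $n"
    if [ "$n" -eq 1 ]; then git tag v1.0; fi
    if [ "$n" -eq 5 ]; then expected=$(git rev-parse HEAD); fi
done

# Known bad commit first, known good commit second.
git bisect start HEAD v1.0
# The test exits 0 (good) when the marker is absent, non-zero (bad) otherwise.
git bisect run sh -c '! grep -q BUG app.txt'

# While the bisect session is active, refs/bisect/bad names the first bad commit.
found=$(git rev-parse refs/bisect/bad)
git bisect reset
echo "first bad commit: $found"
```

If the test script is not on your PATH, you will probably need to give its path explicitly, for example ./test-error.sh.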
6.6 Submodules
It often happens that while working on one project, you need to use
another project from within it. Perhaps it's a library that a third party
developed or that you're developing separately and using in multiple
parent projects. A common issue arises in these scenarios: you want
to be able to treat the two projects as separate yet still be able to use
one from within the other.
Here's an example. Suppose you're developing a web site and
creating Atom feeds. Instead of writing your own Atom-generating code,
you decide to use a library. You're likely to have to either include this
code from a shared library like a CPAN install or Ruby gem, or copy
the source code into your own project tree. The issue with including
the library is that it's difficult to customize the library in any way
and often more difficult to deploy it, because you need to make sure
every client has that library available. The issue with vendoring the
code into your own project is that any custom changes you make are
difficult to merge when upstream changes become available.
Git addresses this issue using submodules. Submodules allow you
to keep a Git repository as a subdirectory of another Git repository.
This lets you clone another repository into your project and keep your
commits separate.
6.6.1 Starting with Submodules
Suppose you want to add the Rack library (a Ruby web server
gateway interface) to your project, possibly maintain your own changes
to it, but continue to merge in upstream changes. The first thing
you should do is clone the external repository into your subdirectory.
You add external projects as submodules with the git submodule
add command:
$ git submodule add git://github.com/chneukirchen/rack.git rack
Initialized empty Git repository in /opt/subtest/rack/.git/
remote: Counting objects: 3181, done.
remote: Compressing objects: 100% (1534/1534), done.
remote: Total 3181 (delta 1951), reused 2623 (delta 1603)
Receiving objects: 100% (3181/3181), 675.42 KiB | 422 KiB/s, done.
Resolving deltas: 100% (1951/1951), done.
Now you have the Rack project under a subdirectory named rack
within your project. You can go into that subdirectory, make changes,
add your own writable remote repository to push your changes into,
fetch and merge from the original repository, and more. If you run
git status right after you add the submodule, you see two things:
$ git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# new file: .gitmodules
# new file: rack
#
First you notice the .gitmodules file. This is a configuration file that
stores the mapping between the project's URL and the local
subdirectory you've pulled it into:
$ cat .gitmodules
[submodule "rack"]
path = rack
url = git://github.com/chneukirchen/rack.git
If you have multiple submodules, you'll have multiple entries in
this file. It's important to note that this file is version-controlled with
your other files, like your .gitignore file. It's pushed and pulled with
the rest of your project. This is how other people who clone this
project know where to get the submodule projects from.
The other listing in the git status output is the rack entry. If you
run git diff on that, you see something interesting:
$ git diff --cached rack
diff --git a/rack b/rack
new file mode 160000
index 0000000..08d709f
--- /dev/null
+++ b/rack
@@ -0,0 +1 @@
+Subproject commit 08d709f78b8c5b0fbeb7821e37fa53e69afcf433
Although rack is a subdirectory in your working directory, Git sees
it as a submodule and doesn't track its contents when you're not in
that directory. Instead, Git records it as a particular commit from
that repository. When you make changes and commit in that
subdirectory, the superproject notices that the HEAD there has changed
and records the exact commit you're currently working off of. That
way, when others clone this project, they can re-create the
environment exactly.
This is an important point with submodules: you record them as
the exact commit they're at. You can't record a submodule at master
or some other symbolic reference.
When you commit, you see something like this:
$ git commit -m 'first commit with submodule rack'
[master 0550271] first commit with submodule rack
2 files changed, 4 insertions(+), 0 deletions(-)
create mode 100644 .gitmodules
create mode 160000 rack
Notice the 160000 mode for the rack entry. That is a special mode
in Git that basically means you're recording a commit as a directory
entry rather than a subdirectory or a file.
You can treat the rack directory as a separate project and then
update your superproject from time to time with a pointer to the latest
commit in that subproject. All the Git commands work independently
in the two directories:
$ git log -1
commit 0550271328a0038865aad6331e620cd7238601bb
Author: Scott Chacon <schacon@gmail.com>
Date: Thu Apr 9 09:03:56 2009 -0700
first commit with submodule rack
$ cd rack/
$ git log -1
commit 08d709f78b8c5b0fbeb7821e37fa53e69afcf433
Author: Christian Neukirchen <chneukirchen@gmail.com>
Date: Wed Mar 25 14:49:04 2009 +0100
Document version change
6.6.2 Cloning a Project with Submodules
Here you'll clone a project with a submodule in it. When you receive
such a project, you get the directories that contain submodules, but
none of the files yet:
$ git clone git://github.com/schacon/myproject.git
Initialized empty Git repository in /opt/myproject/.git/
remote: Counting objects: 6, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 6 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (6/6), done.
$ cd myproject
$ ls -l
total 8
-rw-r--r-- 1 schacon admin 3 Apr 9 09:11 README
drwxr-xr-x 2 schacon admin 68 Apr 9 09:11 rack
$ ls rack/
$
The rack directory is there, but empty. You must run two commands:
git submodule init to initialize your local configuration file,
and git submodule update to fetch all the data from that project and
check out the appropriate commit listed in your superproject.
$ git submodule init
Submodule 'rack' (git://github.com/chneukirchen/rack.git) registered for path 'rack'
$ git submodule update
Initialized empty Git repository in /opt/myproject/rack/.git/
remote: Counting objects: 3181, done.
remote: Compressing objects: 100% (1534/1534), done.
remote: Total 3181 (delta 1951), reused 2623 (delta 1603)
Receiving objects: 100% (3181/3181), 675.42 KiB | 173 KiB/s, done.
Resolving deltas: 100% (1951/1951), done.
Submodule path 'rack': checked out '08d709f78b8c5b0fbeb7821e37fa53e69afcf433'
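Later Git releases than the one this text describes can combine the clone, init, and update steps by passing --recursive to git clone. The sketch below demonstrates that with two local scratch repositories. The lib and super names are invented, and the protocol.file.allow setting is only there because recent Git versions refuse local-path submodule clones by default.

```shell
tmp=$(mktemp -d)

# A small library repository (illustrative stand-in for Rack).
git init -q "$tmp/lib"
cd "$tmp/lib"
git config user.name "Example"
git config user.email "example@example.com"
echo "lib code" > lib.txt
git add lib.txt
git commit -q -m "lib commit"

# A superproject that embeds the library as a submodule.
git init -q "$tmp/super"
cd "$tmp/super"
git config user.name "Example"
git config user.email "example@example.com"
git -c protocol.file.allow=always submodule add "$tmp/lib" lib
git commit -q -m "add lib submodule"

# One-step clone: the submodule is initialized and checked out as well.
git -c protocol.file.allow=always clone -q --recursive "$tmp/super" "$tmp/copy"
ls "$tmp/copy/lib"
```

With a plain git clone, the lib directory in the copy would be empty until you ran git submodule init and git submodule update yourself.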
Now your rack subdirectory is at the exact state it was in when
you committed earlier. If another developer makes changes to the
rack code and commits, and you pull that reference down and merge
it in, you get something a bit odd:
$ git merge origin/master
Updating 0550271..85a3eee
Fast forward
rack | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
[master*]$ git status
# On branch master
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
# (use "git checkout -- <file>..." to discard changes in working directory)
#
# modified: rack
#
You merged in what is basically a change to the pointer for your
submodule, but it doesn't update the code in the submodule
directory, so it looks like you have a dirty state in your working directory:
$ git diff
diff --git a/rack b/rack
index 6c5e70b..08d709f 160000
--- a/rack
+++ b/rack
@@ -1 +1 @@
-Subproject commit 6c5e70b984a60b3cecd395edd5b48a7575bf58e0
+Subproject commit 08d709f78b8c5b0fbeb7821e37fa53e69afcf433
This is the case because the pointer you have for the submodule
isn't what is actually in the submodule directory. To fix this, you must
run git submodule update again:
$ git submodule update
remote: Counting objects: 5, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 1), reused 2 (delta 0)
Unpacking objects: 100% (3/3), done.
From git@github.com:schacon/rack
08d709f..6c5e70b master -> origin/master
Submodule path 'rack': checked out '6c5e70b984a60b3cecd395edd5b48a7575bf58e0'
You have to do this every time you pull down a submodule change
in the main project. It's strange, but it works.
One common problem happens when a developer makes a change
locally in a submodule but doesn't push it to a public server. Then,
they commit a pointer to that non-public state and push up the
superproject. When other developers try to run git submodule update, the
submodule system can't find the commit that is referenced, because
it exists only on the first developer's system. If that happens, you see
an error like this:
$ git submodule update
fatal: reference isn’t a tree: 6c5e70b984a60b3cecd395edd5b48a7575bf58e0
Unable to checkout '6c5e70b984a60b3cecd395edd5b48a7575bf58e0' in submodule path 'rack'
You have to see who last changed the submodule:
$ git log -1 rack
commit 85a3eee996800fcfa91e2119372dd4172bf76678
Author: Scott Chacon <schacon@gmail.com>
Date: Thu Apr 9 09:19:14 2009 -0700
added a submodule reference I will never make public. hahahahaha!
Then, you e-mail that guy and yell at him.
6.6.3 Superprojects
Sometimes, developers want to get a combination of a large project's
subdirectories, depending on what team they're on. This is common
if you're coming from CVS or Subversion, where you've defined a
module or collection of subdirectories, and you want to keep this
type of workflow.
A good way to do this in Git is to make each of the subfolders a
separate Git repository and then create superproject Git repositories
that contain multiple submodules. A benefit of this approach is
that you can more specifically define the relationships between the
projects with tags and branches in the superprojects.
6.6.4 Issues with Submodules
Using submodules isn't without hiccups, however. First, you must be
relatively careful when working in the submodule directory. When
you run git submodule update, it checks out the specific version of the
project, but not within a branch. This is called having a detached
head: it means the HEAD file points directly to a commit, not to
a symbolic reference. The issue is that you generally don't want
to work in a detached head environment, because it's easy to lose
changes. If you do an initial submodule update, commit in that
submodule directory without creating a branch to work in, and then run git
submodule update again from the superproject without committing in
the meantime, Git will overwrite your changes without telling you.
Technically you won't lose the work, but you won't have a branch
pointing to it, so it will be somewhat difficult to retrieve.
To avoid this issue, create a branch when you work in a submodule
directory with git checkout -b work or something equivalent. When
you do the submodule update a second time, it will still revert your
work, but at least you have a pointer to get back to.
Switching branches with submodules in them can also be tricky.
If you create a new branch, add a submodule there, and then switch
back to a branch without that submodule, you still have the
submodule directory as an untracked directory:
$ git checkout -b rack
Switched to a new branch "rack"
$ git submodule add git@github.com:schacon/rack.git rack
Initialized empty Git repository in /opt/myproj/rack/.git/
...
Receiving objects: 100% (3184/3184), 677.42 KiB | 34 KiB/s, done.
Resolving deltas: 100% (1952/1952), done.
$ git commit -am 'added rack submodule'
[rack cc49a69] added rack submodule
2 files changed, 4 insertions(+), 0 deletions(-)
create mode 100644 .gitmodules
create mode 160000 rack
$ git checkout master
Switched to branch "master"
$ git status
# On branch master
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# rack/
You have to either move it out of the way or remove it, in which
case you have to clone it again when you switch back, and you may
lose local changes or branches that you didn't push up.
The last main caveat that many people run into involves switching
from subdirectories to submodules. If you've been tracking files in
your project and you want to move them out into a submodule, you
must be careful or Git will get angry at you. Assume that you have the
rack files in a subdirectory of your project, and you want to switch it
to a submodule. If you delete the subdirectory and then run submodule
add, Git yells at you:
$ rm -Rf rack/
$ git submodule add git@github.com:schacon/rack.git rack
'rack' already exists in the index
You have to unstage the rack directory first. Then you can add the
submodule:
$ git rm -r rack
$ git submodule add git@github.com:schacon/rack.git rack
Initialized empty Git repository in /opt/testsub/rack/.git/
remote: Counting objects: 3184, done.
remote: Compressing objects: 100% (1465/1465), done.
remote: Total 3184 (delta 1952), reused 2770 (delta 1675)
Receiving objects: 100% (3184/3184), 677.42 KiB | 88 KiB/s, done.
Resolving deltas: 100% (1952/1952), done.
Now suppose you did that in a branch. If you try to switch back
to a branch where those files are still in the actual tree rather than
a submodule, you get this error:
$ git checkout master
error: Untracked working tree file 'rack/AUTHORS' would be overwritten by merge.
You have to move the rack submodule directory out of the way
before you can switch to a branch that doesn't have it:
$ mv rack /tmp/
$ git checkout master
Switched to branch "master"
$ ls
README rack
Then, when you switch back, you get an empty rack directory. You
can either run git submodule update to reclone, or you can move your
/tmp/rack directory back into the empty directory.
6.7 Subtree Merging
Now that you've seen the difficulties of the submodule system, let's
look at an alternate way to solve the same problem. When Git merges,
it looks at what it has to merge together and then chooses an
appropriate merging strategy to use. If you're merging two branches, Git
uses a recursive strategy. If you're merging more than two branches,
Git picks the octopus strategy. These strategies are automatically
chosen for you because the recursive strategy can handle complex
three-way merge situations (for example, more than one common
ancestor), but it can only handle merging two branches. The
octopus merge can handle multiple branches but is more cautious to
avoid difficult conflicts, so it's chosen as the default strategy if you're
trying to merge more than two branches.
However, there are other strategies you can choose as well. One
of them is the subtree merge, and you can use it to deal with the
subproject issue. Here you'll see how to do the same rack embedding
as in the last section, but using subtree merges instead.
The idea of the subtree merge is that you have two projects, and
one of the projects maps to a subdirectory of the other one and vice
versa. When you specify a subtree merge, Git is smart enough to
figure out that one is a subtree of the other and merge appropriately.
It's pretty amazing.
You first add the Rack application to your project. You add the
Rack project as a remote reference in your own project and then
check it out into its own branch:
$ git remote add rack_remote git@github.com:schacon/rack.git
$ git fetch rack_remote
warning: no common commits
remote: Counting objects: 3184, done.
remote: Compressing objects: 100% (1465/1465), done.
remote: Total 3184 (delta 1952), reused 2770 (delta 1675)
Receiving objects: 100% (3184/3184), 677.42 KiB | 4 KiB/s, done.
Resolving deltas: 100% (1952/1952), done.
From git@github.com:schacon/rack
* [new branch] build -> rack_remote/build
* [new branch] master -> rack_remote/master
* [new branch] rack-0.4 -> rack_remote/rack-0.4
* [new branch] rack-0.9 -> rack_remote/rack-0.9
$ git checkout -b rack_branch rack_remote/master
Branch rack_branch set up to track remote branch refs/remotes/rack_remote/master.
Switched to a new branch "rack_branch"
Now you have the root of the Rack project in your rack_branch
branch and your own project in the master branch. If you check out
one and then the other, you can see that they have different project
roots:
$ ls
AUTHORS KNOWN-ISSUES Rakefile contrib lib
COPYING README bin example test
$ git checkout master
Switched to branch "master"
$ ls
README
You want to pull the Rack project into your master project as a
subdirectory. You can do that in Git with git read-tree. You'll learn
more about read-tree and its friends in Chapter 9, but for now know
that it reads the root tree of one branch into your current staging
area and working directory. You just switched back to your master
branch, and you pull the rack branch into the rack subdirectory of
your master branch of your main project:
$ git read-tree --prefix=rack/ -u rack_branch
When you commit, it looks like you have all the Rack files under
that subdirectory, as though you copied them in from a tarball.
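The read-tree step can be tried without touching a real project. The
following is only a sketch under stated assumptions: the two throwaway
repositories and the names upstream, project, vendor_remote,
vendor_branch, and vendor/ are invented for illustration and stand in
for Rack and your own project.

```shell
set -e
sandbox=$(mktemp -d)
# Point HOME at a scratch directory so your real configuration is untouched.
export HOME="$sandbox/home" GIT_CONFIG_NOSYSTEM=1
unset XDG_CONFIG_HOME
mkdir -p "$HOME"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# A stand-in for the upstream project (hypothetical).
git init -q "$sandbox/upstream"
cd "$sandbox/upstream"
echo "upstream readme" > README
git add README
git commit -q -m "upstream: initial commit"
upstream_branch=$(git symbolic-ref --short HEAD)

# A stand-in for your own project (hypothetical).
git init -q "$sandbox/project"
cd "$sandbox/project"
echo "main readme" > README
git add README
git commit -q -m "project: initial commit"
main_branch=$(git symbolic-ref --short HEAD)

# Fetch the upstream history and keep it on its own branch.
git remote add vendor_remote "$sandbox/upstream"
git fetch -q vendor_remote
git checkout -q -b vendor_branch "vendor_remote/$upstream_branch"

# Back on the main branch, read the upstream tree into vendor/.
git checkout -q "$main_branch"
git read-tree --prefix=vendor/ -u vendor_branch
git commit -q -m "Import upstream under vendor/"
imported=$(cat vendor/README)
echo "vendor/README contains: $imported"
```

After the last commit, the main branch holds both its own README and a
copy of the upstream files under vendor/.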
What gets interesting is that you can fairly easily merge changes
from one of the branches to the other. So, if the Rack project updates,
you can pull in upstream changes by switching to that branch and
pulling:
$ git checkout rack_branch
$ git pull
Then, you can merge those changes back into your master branch.
You can use git merge -s subtree and it will work fine, but Git will also
merge the histories together, which you probably don't want. To pull
in the changes and prepopulate the commit message, use the --squash
and --no-commit options as well as the -s subtree strategy option:
$ git checkout master
$ git merge --squash -s subtree --no-commit rack_branch
Squash commit -- not updating HEAD
Automatic merge went well; stopped before committing as requested
All the changes from your Rack project are merged in and ready to
be committed locally. You can also do the opposite: make changes
in the rack subdirectory of your master branch and then merge them
into your rack_branch branch later to submit them to the maintainers
or push them upstream.
To get a diff between what you have in your rack subdirectory and
the code in your rack_branch branch (to see if you need to merge
them), you can't use the normal diff command. Instead, you must
run git diff-tree with the branch you want to compare to:
$ git diff-tree -p rack_branch
Or, to compare what is in your rack subdirectory with what the
master branch on the server was the last time you fetched, you can
run
$ git diff-tree -p rack_remote/master
6.8 Summary
You've seen a number of advanced tools that allow you to manipulate
your commits and staging area more precisely. When you notice
issues, you should be able to easily figure out what commit introduced
them, when, and by whom. If you want to use subprojects in your
project, you've learned a few ways to accommodate those needs. At
this point, you should be able to do most of the things in Git that you'll
need on the command line day to day and feel comfortable doing so.
Chapter 7
Customizing Git
So far, I've covered the basics of how Git works and how to use it, and
I've introduced a number of tools that Git provides to help you use it
easily and efficiently. In this chapter, I'll go through some operations
that you can use to make Git operate in a more customized fashion by
introducing several important configuration settings and the hooks
system. With these tools, it's easy to get Git to work exactly the way
you, your company, or your group needs it to.
7.1 Git Configuration
As you briefly saw in Chapter 1, you can specify Git configuration
settings with the git config command. One of the first things you did
was set up your name and e-mail address:
$ git config --global user.name "John Doe"
$ git config --global user.email johndoe@example.com
Now you'll learn a few of the more interesting options that you
can set in this manner to customize your Git usage.
You saw some simple Git configuration details in the first chapter,
but I'll go over them again quickly here. Git uses a series of
configuration files to determine non-default behavior that you may want.
The first place Git looks for these values is in an /etc/gitconfig file,
which contains values for every user on the system and all of their
repositories. If you pass the option --system to git config, it reads and
writes from this file specifically.
The next place Git looks is the ~/.gitconfig file, which is specific
to each user. You can make Git read and write to this file by passing
the --global option.
Finally, Git looks for configuration values in the config file in the
Git directory (.git/config) of whatever repository you're currently
using. These values are specific to that single repository. Each level
overwrites values in the previous level, so values in .git/config trump
those in /etc/gitconfig, for instance. You can also set these values by
manually editing the file and inserting the correct syntax, but it's
generally easier to run the git config command.
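To see the override order in action, here is a minimal sketch that sets
the same key at two levels and asks Git which value wins. It assumes a
scratch HOME directory and a throwaway repository; the names "Global
Name" and "Repo Name" are placeholders, not anything Git requires.

```shell
set -e
sandbox=$(mktemp -d)
# Point HOME at a scratch directory so your real ~/.gitconfig is untouched.
export HOME="$sandbox/home" GIT_CONFIG_NOSYSTEM=1
unset XDG_CONFIG_HOME
mkdir -p "$HOME"

git init -q "$sandbox/repo"
cd "$sandbox/repo"

# Written to $HOME/.gitconfig (the per-user level).
git config --global user.name "Global Name"
# Written to .git/config (the per-repository level).
git config user.name "Repo Name"

global_value=$(git config --global --get user.name)
effective_value=$(git config --get user.name)
echo "global:    $global_value"
echo "effective: $effective_value"
```

Inside the repository the effective value is the repository-level one;
outside any repository, only the global value would apply.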
7.1.1 Basic Client Configuration
The configuration options recognized by Git fall into two categories:
client side and server side. The majority of the options are client side,
configuring your personal working preferences. Although tons of
options are available, I'll only cover the few that either are commonly
used or can significantly affect your workflow. Many options are
useful only in edge cases that I won't go over here. If you want to see a
list of all the options your version of Git recognizes, you can run
$ git config --help
The manual page for git config lists all the available options in
quite a bit of detail.
core.editor
By default, Git uses whatever you've set as your default text editor
or else falls back to the Vi editor to create and edit your commit and
tag messages. To change that default to something else, you can use
the core.editor setting:
$ git config --global core.editor emacs
Now, no matter what is set as your default shell editor variable,
Git will fire up Emacs to edit messages.
commit.template
If you set this to the path of a file on your system, Git will use that
file as the default message when you commit. For instance, suppose
you create a template file at $HOME/.gitmessage.txt that looks like this:
subject line
what happened
[ticket: X]
To tell Git to use it as the default message that appears in your
editor when you run git commit, set the commit.template configuration
value:
$ git config --global commit.template $HOME/.gitmessage.txt
$ git commit
Then, your editor will open to something like this for your
placeholder commit message when you commit:
subject line
what happened
[ticket: X]
# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# modified: lib/test.rb
#
~
~
".git/COMMIT_EDITMSG" 14L, 297C
If you have a commit-message policy in place, then putting a
template for that policy on your system and configuring Git to use it by
default can help increase the chance of that policy being followed
regularly.
core.pager
The core.pager setting determines what pager is used when Git pages
output such as log and diff. You can set it to more or to your favorite
pager (by default, it's less), or you can turn it off by setting it to a
blank string:
$ git config --global core.pager ''
If you run that, Git will print the entire output of all commands
without paging it, no matter how long it is.
user.signingkey
If you're making signed annotated tags (as discussed in Chapter 2),
setting your GPG signing key as a configuration setting makes things
easier. Set your key ID like so:
$ git config --global user.signingkey <gpg-key-id>
Now, you can sign tags without having to specify your key every
time with the git tag command:
$ git tag -s <tag-name>
core.excludesfile
You can put patterns in your project's .gitignore file to have Git not
see them as untracked files or try to stage them when you run git add
on them, as discussed in Chapter 2. However, if you want another
file outside of your project to hold those values or have extra values,
you can tell Git where that file is with the core.excludesfile setting.
Simply set it to the path of a file that has content similar to what a
.gitignore file would have.
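As an illustration, the following sketch creates a per-user ignore file
and points core.excludesfile at it. The file name .gitignore_global and
the two patterns are assumptions chosen for the example; any path and
any .gitignore-style patterns would do.

```shell
set -e
sandbox=$(mktemp -d)
# Point HOME at a scratch directory so your real configuration is untouched.
export HOME="$sandbox/home" GIT_CONFIG_NOSYSTEM=1
unset XDG_CONFIG_HOME
mkdir -p "$HOME"

# A hypothetical per-user ignore file with two editor/OS patterns.
printf '%s\n' '*.swp' '.DS_Store' > "$HOME/.gitignore_global"
git config --global core.excludesfile "$HOME/.gitignore_global"

git init -q "$sandbox/repo"
cd "$sandbox/repo"
touch notes.txt notes.txt.swp

# Only notes.txt should be reported; the swap file is ignored.
untracked=$(git status --porcelain)
echo "$untracked"
```

The swap file never shows up as untracked, even though the project
itself has no .gitignore file.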
help.autocorrect
This option is available only in Git 1.6.1 and later. If you mistype a
command in Git 1.6, it shows you something like this:
$ git com
git: 'com' is not a git-command. See 'git --help'.
Did you mean this?
commit
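If you set help.autocorrect to 1, Git will automatically run the
command if it has only one match under this scenario. The value is a
delay in tenths of a second, so 1 means Git waits 0.1 seconds before
running the corrected command.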
7.1.2 Colors in Git
Git can color its output to your terminal, which can help you visually
parse the output quickly and easily. A number of options can help
you set the coloring to your preference.
color.ui
Git automatically colors most of its output if you ask it to. You can
get very specific about what you want colored and how, but to turn
on all the default terminal coloring, set color.ui to true:
$ git config --global color.ui true
When that value is set, Git colors its output if the output goes to
a terminal. Other possible settings are false, which never colors the
output, and always, which sets colors all the time, even if you're
redirecting Git commands to a file or piping them to another command.
This setting was added in Git version 1.5.5; if you have an older
version, you'll have to specify all the color settings individually.
You'll rarely want color.ui = always. In most scenarios, if you want
color codes in your redirected output, you can instead pass a --color
flag to the Git command to force it to use color codes. The color.ui =
true setting is almost always what you'll want to use.
color.*
If you want to be more specific about which commands are colored
and how, or you have an older version, Git provides verb-specific
coloring settings. Each of these can be set to true, false, or always:
color.branch
color.diff
color.interactive
color.status
In addition, each of these has subsettings you can use to set
specific colors for parts of the output, if you want to override each color.
For example, to set the meta information in your diff output to blue
foreground, black background, and bold text, you can run
$ git config --global color.diff.meta "blue black bold"
You can set the color to any of the following values: normal, black,
red, green, yellow, blue, magenta, cyan, or white. If you want an
attribute like bold in the previous example, you can choose from bold,
dim, ul, blink, and reverse.
See the git config manpage for all the subsettings you can
configure, if you want to do that.
7.1.3 External Merge and Diff Tools
Although Git has an internal implementation of diff, which is what
you've been using, you can set up an external tool instead. You can
also set up a graphical merge conflict-resolution tool instead of
having to resolve conflicts manually. I'll demonstrate setting up the
Perforce Visual Merge Tool (P4Merge) to do your diffs and merge
resolutions, because it's a nice graphical tool and it's free.
If you want to try this out, P4Merge works on all major platforms,
so you should be able to do so. I'll use path names in the examples
that work on Mac and Linux systems; for Windows, you'll have to
change /usr/local/bin to an executable path in your environment.
You can download P4Merge here:
http://www.perforce.com/perforce/downloads/component.html
To begin, you'll set up external wrapper scripts to run your
commands. I'll use the Mac path for the executable; in other systems, it
will be where your p4merge binary is installed. Set up a merge wrapper
script named extMerge that calls your binary with all the arguments
provided:
$ cat /usr/local/bin/extMerge
#!/bin/sh
/Applications/p4merge.app/Contents/MacOS/p4merge "$@"
The diff wrapper checks to make sure seven arguments are
provided and passes two of them to your merge script. By default, Git
passes the following arguments to the diff program:
path old-file old-hex old-mode new-file new-hex new-mode
Because you only want the old-file and new-file arguments, you
use the wrapper script to pass the ones you need:
$ cat /usr/local/bin/extDiff
#!/bin/sh
[ $# -eq 7 ] && /usr/local/bin/extMerge "$2" "$5"
You also need to make sure these tools are executable:
$ sudo chmod +x /usr/local/bin/extMerge
$ sudo chmod +x /usr/local/bin/extDiff
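If you want to check the seven-argument convention for yourself before
wiring in a graphical tool, a throwaway script that just reports what it
receives is enough. This is only a sketch: the script name showArgs, the
scratch repository, and file.txt are invented for the example, and it
sets diff.external for a single command with git -c rather than in your
config file.

```shell
set -e
sandbox=$(mktemp -d)
# Point HOME at a scratch directory so your real configuration is untouched.
export HOME="$sandbox/home" GIT_CONFIG_NOSYSTEM=1
unset XDG_CONFIG_HOME
mkdir -p "$HOME"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Hypothetical stand-in for extDiff: report the argument count and the path.
cat > "$sandbox/showArgs" <<'EOF'
#!/bin/sh
echo "argc=$# path=$1"
EOF
chmod +x "$sandbox/showArgs"

git init -q "$sandbox/repo"
cd "$sandbox/repo"
echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt

# Use the script as the external diff program for this one invocation.
seen=$(git -c diff.external="$sandbox/showArgs" diff)
echo "$seen"
```

The script is called once per changed file, which is why the extDiff
wrapper can rely on the second and fifth arguments being the old and new
files.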
Now you can set up your config file to use your custom merge
resolution and diff tools. This takes a number of custom settings:
merge.tool to tell Git which merge tool to use, mergetool.*.cmd to specify
how to run the command, mergetool.*.trustExitCode to tell Git if the exit
code of that program indicates a successful merge resolution or not,
and diff.external to tell Git what command to run for diffs. So, you
can either run four config commands
$ git config --global merge.tool extMerge
$ git config --global mergetool.extMerge.cmd \
'extMerge "$BASE" "$LOCAL" "$REMOTE" "$MERGED"'
$ git config --global mergetool.extMerge.trustExitCode false
$ git config --global diff.external extDiff
or you can edit your ~/.gitconfig file to add these lines:
[merge]
tool = extMerge
[mergetool "extMerge"]
cmd = extMerge "$BASE" "$LOCAL" "$REMOTE" "$MERGED"
trustExitCode = false
[diff]
external = extDiff
After all this is set, if you run diff commands such as this:
$ git diff 32d1776b1^ 32d1776b1
Instead of getting the diff output on the command line, Git fires
up P4Merge, which looks something like Figure 7.1.
If you try to merge two branches and subsequently have merge
conflicts, you can run the command git mergetool; it starts P4Merge
to let you resolve the conflicts through that GUI tool.
The nice thing about this wrapper setup is that you can change
your diff and merge tools easily. For example, to change your extDiff
and extMerge tools to run the KDiff3 tool instead, all you have to do is
edit your extMerge file:
$ cat /usr/local/bin/extMerge
#!/bin/sh
/Applications/kdiff3.app/Contents/MacOS/kdiff3 "$@"
Now, Git will use the KDiff3 tool for diff viewing and merge conflict
resolution.
Git comes preset to use a number of other merge-resolution tools
without your having to set up the cmd configuration. You can set your
merge tool to kdiff3, opendiff, tkdiff, meld, xxdiff, emerge, vimdiff, or
gvimdiff. If you're not interested in using KDiff3 for diff but rather
want to use it just for merge resolution, and the kdiff3 command is
in your path, then you can run
$ git config --global merge.tool kdiff3
Figure 7.1: P4Merge
If you run this instead of setting up the extMerge and extDiff files,
Git will use KDiff3 for merge resolution and the normal Git diff tool
for diffs.
7.1.4 Formatting and Whitespace
Formatting and whitespace issues are some of the more frustrating
and subtle problems that many developers encounter when
collaborating, especially cross-platform. It's very easy for patches or other
collaborated work to introduce subtle whitespace changes because
editors silently introduce them or Windows programmers add
carriage returns at the end of lines they touch in cross-platform projects.
Git has a few configuration options to help with these issues.
core.autocrlf
If you're programming on Windows or using another system but
working with people who are programming on Windows, you'll probably
run into line-ending issues at some point. This is because Windows
uses both a carriage-return character and a linefeed character for
newlines in its files, whereas Mac and Linux systems use only the
linefeed character. This is a subtle but incredibly annoying fact of
cross-platform work.
Git can handle this by auto-converting CRLF line endings into LF
when you commit, and vice versa when it checks out code onto your
filesystem. You can turn on this functionality with the core.autocrlf
setting. If you're on a Windows machine, set it to true; this converts
LF endings into CRLF when you check out code:
$ git config --global core.autocrlf true
If you're on a Linux or Mac system that uses LF line endings, then
you don't want Git to automatically convert them when you check
out files; however, if a file with CRLF endings accidentally gets
introduced, then you may want Git to fix it. You can tell Git to
convert CRLF to LF on commit but not the other way around by setting
core.autocrlf to input:
$ git config --global core.autocrlf input
This setup should leave you with CRLF endings in Windows
checkouts but LF endings on Mac and Linux systems and in the repository.
If you're a Windows programmer doing a Windows-only project,
then you can turn off this functionality, recording the carriage
returns in the repository by setting the config value to false:
$ git config --global core.autocrlf false
core.whitespace
Git comes preset to detect and fix some whitespace issues. It can
look for four primary whitespace issues: two are enabled by default
and can be turned off, and two aren't enabled by default but can be
activated.
The two that are turned on by default are trailing-space, which
looks for spaces at the end of a line, and space-before-tab, which looks
for spaces before tabs at the beginning of a line.
The two that are disabled by default but can be turned on are
indent-with-non-tab, which looks for lines that begin with eight or more
spaces instead of tabs, and cr-at-eol, which tells Git that carriage
returns at the end of lines are OK.
You can tell Git which of these you want enabled by setting core.whitespace
to the values you want on or off, separated by commas. You can
disable settings by either leaving them out of the setting string or
prepending a - in front of the value. For example, if you want all but
cr-at-eol to be set, you can do this:
$ git config --global core.whitespace \
trailing-space,space-before-tab,indent-with-non-tab
Git will detect these issues when you run a git diff command and
try to color them so you can possibly fix them before you commit. It
will also use these values to help you when you apply patches with
git apply. When you're applying patches, you can ask Git to warn you
if it's applying patches with the specified whitespace issues:
$ git apply --whitespace=warn <patch>
Or you can have Git try to automatically fix the issue before
applying the patch:
$ git apply --whitespace=fix <patch>
These options apply to the git rebase command as well. If you've
committed whitespace issues but haven't yet pushed upstream, you can
run a rebase with the --whitespace=fix option to have Git automatically
fix whitespace issues as it's rewriting the patches.
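A rebase with --whitespace=fix might look like the sketch below. It
assumes a scratch repository with one commit that introduces trailing
spaces; the file name notes.txt and the choice of HEAD~1 as the base are
made up for the example, and in a real project you would rebase onto
whatever you last pushed.

```shell
set -e
sandbox=$(mktemp -d)
# Point HOME at a scratch directory so your real configuration is untouched.
export HOME="$sandbox/home" GIT_CONFIG_NOSYSTEM=1
unset XDG_CONFIG_HOME
mkdir -p "$HOME"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q "$sandbox/repo"
cd "$sandbox/repo"
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# This commit sneaks in trailing spaces (a trailing-space issue).
printf 'second line   \n' >> notes.txt
git commit -q -a -m "Add a line with trailing spaces"

# Rewrite the last commit, fixing whitespace while the patch is reapplied.
git rebase -q --whitespace=fix HEAD~1

last_line=$(tail -n 1 notes.txt)
echo "[$last_line]"
```

The brackets in the final echo make it easy to see that the trailing
spaces are gone from the rewritten commit.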
7.1.5 Server Configuration
Not nearly as many configuration options are available for the server
side of Git, but there are a few interesting ones you may want to take
note of.
receive.fsckObjects
By default, Git doesn't check all the objects it receives during a
push for consistency. Although Git can check to make sure each object
still matches its SHA-1 checksum and points to valid objects, it doesn't do
that by default on every push. This is a relatively expensive operation
and may add a lot of time to each push, depending on the size of the
repository or the push. If you want Git to check object consistency
on every push, you can force it to do so by setting receive.fsckObjects
to true:
$ git config --system receive.fsckObjects true
Now, Git will check the integrity of your repository before each
push is accepted to make sure faulty clients aren't introducing
corrupt data.
receive.denyNonFastForwards
If you rebase commits that you've already pushed and then try to
push again, or otherwise try to push a commit to a remote branch
that doesn't contain the commit that the remote branch currently
points to, you'll be denied. This is generally good policy, but in the
case of the rebase, you may determine that you know what you're
doing and can force-update the remote branch with a -f flag to your
push command.
To disable the ability to force-update remote branches to
non-fast-forward references, set receive.denyNonFastForwards:
$ git config --system receive.denyNonFastForwards true
The other way you can do this is via server-side receive hooks,
which I'll cover in a bit. That approach lets you do more complex
things like deny non-fast-forwards to a certain subset of users.
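The effect of receive.denyNonFastForwards is easy to observe with a
local bare repository playing the server. This is a sketch under stated
assumptions: server.git and client are scratch directories invented for
the example, and the setting is applied to that one repository rather
than with --system.

```shell
set -e
sandbox=$(mktemp -d)
# Point HOME at a scratch directory so your real configuration is untouched.
export HOME="$sandbox/home" GIT_CONFIG_NOSYSTEM=1
unset XDG_CONFIG_HOME
mkdir -p "$HOME"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# A bare repository standing in for the server (hypothetical path).
git init -q --bare "$sandbox/server.git"
git --git-dir="$sandbox/server.git" config receive.denyNonFastForwards true

git init -q "$sandbox/client"
cd "$sandbox/client"
echo v1 > file.txt
git add file.txt
git commit -q -m "first"
branch=$(git symbolic-ref --short HEAD)
git push -q "$sandbox/server.git" "$branch"

# Rewrite the pushed commit, then try to force the new history up.
git commit -q --amend -m "first, rewritten"
if git push -q -f "$sandbox/server.git" "$branch" 2>/dev/null; then
  forced=accepted
else
  forced=rejected
fi
echo "forced push was $forced"
```

Even with -f, the server refuses the update, because the rewritten
commit does not contain the commit the branch currently points to.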
receive.denyDeletes
One of the workarounds to the denyNonFastForwards policy is for the
user to delete the branch and then push it back up with the new
reference. In newer versions of Git (beginning with version 1.6.1),
you can set receive.denyDeletes to true:
$ git config --system receive.denyDeletes true
This denies branch and tag deletion over a push across the board;
no user can do it. To remove remote branches, you must remove
the ref files from the server manually. There are also more
interesting ways to do this on a per-user basis via ACLs, as you'll learn at the
end of this chapter.
7.2 Git Attributes
Some of these settings can also be specified for a path, so that Git
applies those settings only for a subdirectory or subset of files. These
path-specific settings are called Git attributes and are set either in
a .gitattributes file in one of your directories (normally the root of
your project) or in the .git/info/attributes file if you don't want the
attributes file committed with your project.
Using attributes, you can do things like specify separate merge
strategies for individual files or directories in your project, tell Git
how to diff non-text files, or have Git filter content before you check
it into or out of Git. In this section, you'll learn about some of the
attributes you can set on your paths in your Git project and see a few
examples of using this feature in practice.
7.2.1 Binary Files
One cool trick for which you can use Git attributes is telling Git which
files are binary (in cases it otherwise may not be able to figure out)
and giving Git special instructions about how to handle those files.
For instance, some text files may be machine generated and not
diffable, whereas some binary files can be diffed; you'll see how to
tell Git which is which.
Identifying Binary Files
Some files look like text files but for all intents and purposes are to
be treated as binary data. For instance, Xcode projects on the Mac
contain a file that ends in .pbxproj, which is basically a JSON (plain
text JavaScript data format) dataset written out to disk by the IDE that
records your build settings and so on. Although it's technically a text
file, because it's all ASCII, you don't want to treat it as such because
it's really a lightweight database: you can't merge the contents if
two people changed it, and diffs generally aren't helpful. The file is
meant to be consumed by a machine. In essence, you want to treat
it like a binary file.
To tell Git to treat all pbxproj files as binary data, add the following
line to your .gitattributes file:
*.pbxproj -crlf -diff
Now, Git won't try to convert or fix CRLF issues; nor will it try to
compute or print a diff for changes in this file when you run git show
or git diff on your project. In the 1.6 series of Git, you can also use a
macro that is provided that means -crlf -diff:
*.pbxproj binary
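To confirm how Git interprets an attribute line, you can ask it directly
with git check-attr. The sketch below assumes a scratch repository and
an invented file name, project.pbxproj; the file does not even need to
exist for the query to work. In current Git versions the binary macro
expands to -diff -merge -text, but the diff attribute is unset either
way.

```shell
set -e
sandbox=$(mktemp -d)
# Point HOME at a scratch directory so your real configuration is untouched.
export HOME="$sandbox/home" GIT_CONFIG_NOSYSTEM=1
unset XDG_CONFIG_HOME
mkdir -p "$HOME"

git init -q "$sandbox/repo"
cd "$sandbox/repo"
echo '*.pbxproj binary' > .gitattributes

# Ask Git what the diff attribute resolves to for a matching path.
attrs=$(git check-attr diff project.pbxproj)
echo "$attrs"
```

An answer of "unset" means Git will not try to produce a textual diff
for that path.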
Diffing Binary Files
In the 1.6 series of Git, you can use the Git attributes functionality to
effectively diff binary files. You do this by telling Git how to convert
your binary data to a text format that can be compared via the normal
diff.
Because this is a pretty cool and not widely known feature, I'll go
over a few examples. First, you'll use this technique to solve one of
the most annoying problems known to humanity: version-controlling
Word documents. Everyone knows that Word is the most horrific
editor around, but, oddly, everyone uses it. If you want to
version-control Word documents, you can stick them in a Git repository and
commit every once in a while, but what good does that do? If you
run git diff normally, you only see something like this:
$ git diff
diff --git a/chapter1.doc b/chapter1.doc
index 88839c4..4afcb7c 100644
Binary files a/chapter1.doc and b/chapter1.doc differ
You can't directly compare two versions unless you check them
out and scan them manually, right? It turns out you can do this fairly
well using Git attributes. Put the following line in your .gitattributes
file:
*.doc diff=word
This tells Git that any file that matches this pattern (.doc) should
use the “word” filter when you try to view a diff that contains changes.
What is the “word” filter? You have to set it up. Here you'll
configure Git to use the strings program to convert Word documents into
readable text files, which it will then diff properly:
$ git config diff.word.textconv strings
Now Git knows that if it tries to do a diff between two snapshots,
and any of the files end in .doc, it should run those files through the
“word” filter, which is defined as the strings program. This effectively
makes nice text-based versions of your Word files before attempting
to diff them.
Here's an example. I put Chapter 1 of this book into Git, added
some text to a paragraph, and saved the document. Then, I ran git
diff to see what changed:
$ git diff
diff --git a/chapter1.doc b/chapter1.doc
index c1c8a0a..b93c9e4 100644
--- a/chapter1.doc
+++ b/chapter1.doc
@@ -8,7 +8,8 @@ re going to cover Version Control Systems (VCS) and Git basics
re going to cover how to get it and set it up for the first time if you don
t already have it on your system.
In Chapter Two we will go over basic Git usage - how to use Git for the 80%
-s going on, modify stuff and contribute changes. If the book spontaneously
+s going on, modify stuff and contribute changes. If the book spontaneously
+Let's see if this works.
Git successfully and succinctly tells me that I added the string
“Let's see if this works”, which is correct. It's not perfect (it adds a
bunch of random stuff at the end) but it certainly works. If you can
find or write a Word-to-plain-text converter that works well enough,
that solution will likely be incredibly effective. However, strings is
available on most Mac and Linux systems, so it may be a good first
try to do this with many binary formats.
Another interesting problem you can solve this way involves
diffing image files. One way to do this is to run JPEG files through a
filter that extracts their EXIF information, metadata that is recorded
with most image formats. If you download and install the exiftool
program, you can use it to convert your images into text about the
metadata, so at least the diff will show you a textual representation
of any changes that happened:
$ echo '*.png diff=exif' >> .gitattributes
$ git config diff.exif.textconv exiftool
If you replace an image in your project and run git diff, you see
something like this:
diff --git a/image.png b/image.png
index 88839c4..4afcb7c 100644
--- a/image.png
+++ b/image.png
@@ -1,12 +1,12 @@
ExifTool Version Number : 7.74
-File Size : 70 kB
-File Modification Date/Time : 2009:04:21 07:02:45-07:00
+File Size : 94 kB
+File Modification Date/Time : 2009:04:21 07:02:43-07:00
File Type : PNG
MIME Type : image/png
-Image Width : 1058
-Image Height : 889
+Image Width : 1056
+Image Height : 827
Bit Depth : 8
Color Type : RGB with Alpha
You can easily see that the file size and image dimensions have
both changed.
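The same mechanism works with any program that takes a file name and
prints text. As a rough sketch, the example below uses od as the
converter in place of strings or exiftool. The driver name hexdump, the
*.dat pattern, and the file blob.dat are assumptions made for this
illustration only.

```shell
set -e
sandbox=$(mktemp -d)
# Point HOME at a scratch directory so your real configuration is untouched.
export HOME="$sandbox/home" GIT_CONFIG_NOSYSTEM=1
unset XDG_CONFIG_HOME
mkdir -p "$HOME"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q "$sandbox/repo"
cd "$sandbox/repo"

# Route *.dat files through a hypothetical "hexdump" diff driver.
echo '*.dat diff=hexdump' > .gitattributes
git config diff.hexdump.textconv "od -An -c"

# A small file containing a NUL byte, which Git would treat as binary.
printf 'ab\000cd\n' > blob.dat
git add .gitattributes blob.dat
git commit -q -m "Add binary file"

# Change one byte and diff the textual conversions.
printf 'ab\000ce\n' > blob.dat
diff_output=$(git diff)
echo "$diff_output"
```

Instead of the "Binary files differ" message, the diff shows one removed
and one added line of od output, with the changed byte visible.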
7.2.2 Keyword Expansion
SVN- or CVS-style keyword expansion is often requested by
developers used to those systems. The main problem with this in Git is
that you can't modify a file with information about the commit after
you've committed, because Git checksums the file first. However,
you can inject text into a file when it's checked out and remove it
again before it's added to a commit. Git attributes offers you two
ways to do this.
First, you can inject the SHA-1 checksum of a blob into an $Id$
field in the file automatically. If you set this attribute on a file or set
of files, then the next time you check out that branch, Git will replace
that field with the SHA-1 of the blob. It's important to notice that it
isn't the SHA of the commit, but of the blob itself:
$ echo '*.txt ident' >> .gitattributes
$ echo '$Id$' > test.txt
The next time you check out this file, Git injects the SHA of the
blob:
$ rm test.txt
$ git checkout -- test.txt
$ cat test.txt
$Id: 42812b7653c7b88933f8a9d6cad0ca16714b9bb3 $
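Note that the file has to be tracked before Git can check it out again.
The following sketch runs the whole sequence in a scratch repository,
including the add and commit steps, and compares the injected value with
the blob ID that Git reports. The repository path and commit message are
invented for the example.

```shell
set -e
sandbox=$(mktemp -d)
# Point HOME at a scratch directory so your real configuration is untouched.
export HOME="$sandbox/home" GIT_CONFIG_NOSYSTEM=1
unset XDG_CONFIG_HOME
mkdir -p "$HOME"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q "$sandbox/repo"
cd "$sandbox/repo"
echo '*.txt ident' > .gitattributes
echo '$Id$' > test.txt
git add .gitattributes test.txt
git commit -q -m "Add ident demo"

# Remove the working copy and check it out again to trigger the expansion.
rm test.txt
git checkout -- test.txt

# The ID of the stored blob, which is what ident injects.
blob=$(git rev-parse HEAD:test.txt)
echo "blob: $blob"
cat test.txt
```

The value between "$Id:" and the closing "$" matches the blob ID, not
the commit ID.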
However, that result is of limited use. If you've used keyword
substitution in CVS or Subversion, you can include a datestamp;
the SHA isn't all that helpful, because it's fairly random and you can't
tell if one SHA is older or newer than another.
It turns out that you can write your own filters for doing
substitutions in files on commit/checkout. These are the “clean” and
“smudge” filters. In the .gitattributes file, you can set a filter for
particular paths and then set up scripts that will process files just
before they're checked out (“smudge”, see Figure 7.2) and just before
they're staged (“clean”, see Figure 7.3). These filters can be
set to do all sorts of fun things.
The original commit message for this functionality gives a
simple example of running all your C source code through the indent
program before committing. You can set it up by setting the filter
attribute in your .gitattributes file to filter *.c files with the “indent”
filter:
Figure 7.2: The “smudge” filter is run on checkout.
Figure 7.3: The “clean” filter is run when files are staged.
*.c filter=indent
Then, tell Git what the “indent” filter does on smudge and clean:
$ git config --global filter.indent.clean indent
$ git config --global filter.indent.smudge cat
In this case, when you commit files that match *.c, Git will run
them through the indent program before it commits them and then
run them through the cat program before it checks them back out
onto disk. The cat program is basically a no-op: it spits out the same
data that it gets in. This combination effectively filters all C source
code files through indent before committing.
Another interesting example gets $Date$ keyword expansion, RCS
style. To do this properly, you need a small script that takes a
filename, figures out the last commit date for this project, and inserts
the date into the file. Here is a small Ruby script that does that:
#! /usr/bin/env ruby
data = STDIN.read
last_date = `git log --pretty=format:"%ad" -1`
puts data.gsub('$Date$', '$Date: ' + last_date.to_s + '$')
All the script does is get the latest commit date from the git log
command, stick that into any $Date$ strings it sees in stdin, and print
the results; it should be simple to do in whatever language you're
most comfortable in. You can name this file expand_date and put it in
your path. Now, you need to set up a filter in Git (call it dater) and
tell it to use your expand_date filter to smudge the files on checkout.
You'll use a Perl expression to clean that up on commit.
$ git config filter.dater.smudge expand_date
$ git config filter.dater.clean 'perl -pe "s/\\\$Date[^\\\$]*\\\$/\\\$Date\\\$/"'
This Perl snippet strips out anything it sees in a $Date$ string, to
get back to where you started. Now that your filter is ready, you can
test it by setting up a file with your $Date$ keyword and then setting
up a Git attribute for that file that engages the new filter.
$ echo '# $Date$' > date_test.txt
$ echo 'date*.txt filter=dater' >> .gitattributes
If you commit those changes and check out the file again, you see
the keyword properly substituted.
$ git add date_test.txt .gitattributes
$ git commit -m "Testing date expansion in Git"
$ rm date_test.txt
$ git checkout date_test.txt
$ cat date_test.txt
# $Date: Tue Apr 21 07:26:52 2009 -0700$
You can see how powerful this technique can be for customized
applications. You have to be careful, though, because the .gitattributes
file is committed and passed around with the project but the driver
(in this case, dater) isn't, so it won't work everywhere. When you
design these filters, they should be able to fail gracefully and have
the project still work properly.
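As a rough illustration of why the smudge and clean halves have to mirror each other, here is a minimal sketch of the dater pair written as plain Ruby functions. The names smudge_date and clean_date are made up for this example and are not Git commands; the real filters read stdin and write stdout as described above.

```ruby
# Sketch of the dater filter pair as plain Ruby functions.
# smudge_date and clean_date are illustrative names, not part of Git.

# Expand every bare $Date$ keyword with the given date string (smudge).
def smudge_date(text, date)
  text.gsub('$Date$') { "$Date: #{date}$" }
end

# Collapse any expanded $Date: ...$ keyword back to $Date$ (clean).
def clean_date(text)
  text.gsub(/\$Date[^$]*\$/) { '$Date$' }
end

original = "# $Date$\n"
expanded = smudge_date(original, 'Tue Apr 21 07:26:52 2009 -0700')
puts expanded
# Cleaning the smudged text should give back exactly what was committed.
puts clean_date(expanded) == original
```

If cleaning a smudged file did not return the committed content, Git would report the file as modified right after every checkout, so the round trip is worth checking whenever you write a filter pair.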
7.2.3 Exporting Your Repository
Git attribute data also allows you to do some interesting things when
exporting an archive of your project.
export-ignore
You can tell Git not to export certain files or directories when
generating an archive. If there is a subdirectory or file that you don't want
to include in your archive file but that you do want checked into your
project, you can determine those files via the export-ignore attribute.
For example, say you have some test files in a test/ subdirectory,
and it doesn't make sense to include them in the tarball export of
your project. You can add the following line to your Git attributes
file.
test/ export-ignore
Now, when you run git archive to create a tarball of your project,
that directory won't be included in the archive.
export-subst
Another thing you can do for your archives is some simple keyword
substitution. Git lets you put the string $Format:$ in any file with
any of the --pretty=format formatting shortcodes, many of which you
saw in Chapter 2. For instance, if you want to include a file named
LAST_COMMIT in your project, and have the last commit date
automatically injected into it when git archive runs, you can set up the
file like this.
$ echo 'Last commit date: $Format:%cd$' > LAST_COMMIT
$ echo "LAST_COMMIT export-subst" >> .gitattributes
$ git add LAST_COMMIT .gitattributes
$ git commit -am 'adding LAST_COMMIT file for archives'
When you run git archive, the contents of that file when people
open the archive file will look like this.
$ cat LAST_COMMIT
Last commit date: Tue Apr 21 08:38:48 2009 -0700
7.2.4 Merge Strategies
You can also use Git attributes to tell Git to use different merge
strategies for specific files in your project. One very useful option
is to tell Git to not try to merge specific files when they have
conflicts, but rather to use your side of the merge over someone else's.
This is helpful if a branch in your project has diverged or is
specialized, but you want to be able to merge changes back in from it, and
you want to ignore certain files. Say you have a database settings file
called database.xml that is different in two branches, and you want
to merge in your other branch without messing up the database file.
You can set up an attribute like this.
database.xml merge=ours
If you merge in the other branch, instead of having merge conflicts
with the database.xml file, you see something like this.
$ git merge topic
Auto-merging database.xml
Merge made by recursive.
In this case, database.xml stays at whatever version you originally
had.
7.3 Git Hooks
Like many other version control systems, Git has a way to fire off
custom scripts when certain important actions occur. There are two
groups of these hooks: client side and server side. The client-side
hooks are for client operations such as committing and merging.
The server-side hooks are for Git server operations such as
receiving pushed commits. You can use these hooks for all sorts of reasons,
and you'll learn about a few of them here.
7.3.1 Installing a Hook
The hooks are all stored in the hooks subdirectory of the Git directory.
In most projects, that's .git/hooks. By default, Git populates this
directory with a bunch of example scripts, many of which are useful by
themselves; they also document the input values of each script.
All the examples are written as shell scripts, with some Perl thrown
in, but any properly named executable scripts will work fine; you
can write them in Ruby or Python or what have you. For post-1.6
versions of Git, these example hook files end with .sample; you'll need
to rename them. For pre-1.6 versions of Git, the example files are
named properly but are not executable.
To enable a hook script, put a file in the hooks subdirectory of your
Git directory that is named appropriately and is executable. From
that point forward, it should be called. I'll cover most of the major
hook filenames here.
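The two requirements for enabling a hook (the right name and the executable bit) can be sketched in a few lines of Ruby. The helper install_hook below is hypothetical, not something Git ships, and the demonstration uses a temporary directory standing in for a real .git directory.

```ruby
require 'fileutils'
require 'tmpdir'

# Copy a script into a repository's hooks directory under the hook's name
# and mark it executable. install_hook is a hypothetical helper name.
def install_hook(script_path, git_dir, hook_name)
  hooks_dir = File.join(git_dir, 'hooks')
  FileUtils.mkdir_p(hooks_dir)
  target = File.join(hooks_dir, hook_name)
  FileUtils.cp(script_path, target)
  FileUtils.chmod(0755, target)
  target
end

# Demonstration in a throwaway directory standing in for a real repository.
Dir.mktmpdir do |dir|
  script = File.join(dir, 'my-check.rb')
  File.write(script, "#!/usr/bin/env ruby\nexit 0\n")
  installed = install_hook(script, File.join(dir, '.git'), 'pre-commit')
  puts File.basename(installed)
  puts File.executable?(installed)
end
```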
7.3.2 Client-Side Hooks
There are a lot of client-side hooks. This section splits them into
committing-workflow hooks, e-mail–workflow scripts, and the rest of
the client-side scripts.
Committing-Workflow Hooks
The first four hooks have to do with the committing process. The
pre-commit hook is run first, before you even type in a commit message.
It's used to inspect the snapshot that's about to be committed, to see
if you've forgotten something, to make sure tests run, or to examine
whatever you need to inspect in the code. Exiting non-zero from
this hook aborts the commit, although you can bypass it with git
commit --no-verify. You can do things like check for code style (run lint
or something equivalent), check for trailing whitespace (the default
hook does exactly that), or check for appropriate documentation on
new methods.
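To make the trailing-whitespace check concrete, here is a minimal sketch of the test itself, separated from any Git plumbing. The function name trailing_whitespace_lines is an assumption for this example; a real pre-commit hook would feed it the staged content and exit non-zero if the result is not empty.

```ruby
# Return the 1-based numbers of the lines that end in spaces or tabs.
# trailing_whitespace_lines is an illustrative name, not a Git facility.
def trailing_whitespace_lines(text)
  text.each_line.with_index(1)
      .select { |line, _number| line.chomp =~ /[ \t]+\z/ }
      .map { |_line, number| number }
end

sample = "def ok\n  1 \nend\t\n"
p trailing_whitespace_lines(sample)
```

In the sample, the second line ends in a space and the third in a tab, so both line numbers are reported.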
The prepare-commit-msg hook is run before the commit message
editor is fired up but after the default message is created. It lets you
edit the default message before the commit author sees it. This
hook takes a few options: the path to the file that holds the
commit message so far, the type of commit, and the commit SHA-1 if this
is an amended commit. This hook generally isn't useful for normal
commits; rather, it's good for commits where the default message
is auto-generated, such as templated commit messages, merge
commits, squashed commits, and amended commits. You may use it in
conjunction with a commit template to programmatically insert
information.
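One way such programmatic insertion might look is sketched below. It assumes a team convention, invented for this example, where branch names carry a ticket number (such as ticket-1234-fix-login), and it prepends a marker in the same [ref: N] style that the policy example later in this chapter checks for. The function name with_ticket_ref is hypothetical; in a real hook the message would be read from, and written back to, the file named by the first argument.

```ruby
# Prepend a "[ref: N]" marker to a default commit message when the branch
# name contains a ticket number. The branch-naming convention and the
# helper name are assumptions made for this sketch.
def with_ticket_ref(message, branch_name)
  ticket = branch_name[/\d+/]
  return message if ticket.nil? || message.include?('[ref:')
  "[ref: #{ticket}] #{message}"
end

puts with_ticket_ref('Fix login timeout', 'ticket-1234-fix-login')
puts with_ticket_ref('Update README', 'master')
```

The second call leaves the message untouched because the branch name carries no number, which keeps the hook harmless on branches that don't follow the convention.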
The commit-msg hook takes one parameter, which again is the path
to a temporary file that contains the current commit message. If
this script exits non-zero, Git aborts the commit process, so you can
use it to validate your project state or commit message before
allowing a commit to go through. In the last section of this chapter, I'll
demonstrate using this hook to check that your commit message is
conformant to a required pattern.
After the entire commit process is completed, the post-commit hook
runs. It doesn't take any parameters, but you can easily get the last
commit by running git log -1 HEAD. Generally, this script is used for
notification or something similar.
The committing-workflow client-side scripts can be used in just
about any workflow. They're often used to enforce certain policies,
although it's important to note that these scripts aren't transferred
during a clone. You can enforce policy on the server side to reject
pushes of commits that don't conform to some policy, but it's
entirely up to the developer to use these scripts on the client side. So,
these are scripts to help developers, and they must be set up and
maintained by them, although they can be overridden or modified by
them at any time.
E-mail Workflow Hooks
You can set up three client-side hooks for an e-mail–based workflow.
They're all invoked by the git am command, so if you aren't using that
command in your workflow, you can safely skip to the next section. If
you're taking patches over e-mail prepared by git format-patch, then
some of these may be helpful to you.
The first hook that is run is applypatch-msg. It takes a single
argument: the name of the temporary file that contains the proposed
commit message. Git aborts the patch if this script exits non-zero.
You can use this to make sure a commit message is properly
formatted or to normalize the message by having the script edit it in place.
The next hook to run when applying patches via git am is
pre-applypatch. It takes no arguments and is run after the patch is
applied, so you can use it to inspect the snapshot before making the
commit. You can run tests or otherwise inspect the working tree
with this script. If something is missing or the tests don't pass,
exiting non-zero also aborts the git am script without committing the
patch.
The last hook to run during a git am operation is post-applypatch.
You can use it to notify a group or the author of the patch you pulled
in that you've done so. You can't stop the patching process with this
script.
Other Client Hooks
The pre-rebase hook runs before you rebase anything and can halt
the process by exiting non-zero. You can use this hook to disallow
rebasing any commits that have already been pushed. The example
pre-rebase hook that Git installs does this, although it assumes that
next is the name of the branch you publish. You'll likely need to
change that to whatever your stable, published branch is.
After you run a successful git checkout, the post-checkout hook runs;
you can use it to set up your working directory properly for your
project environment. This may mean moving in large binary files that
you don't want source controlled, auto-generating documentation, or
something along those lines.
Finally, the post-merge hook runs after a successful merge command.
You can use it to restore data in the working tree that Git can't track,
such as permissions data. This hook can likewise validate the
presence of files external to Git control that you may want copied in when
the working tree changes.
7.3.3 Server-Side Hooks
In addition to the client-side hooks, you can use a couple of important
server-side hooks as a system administrator to enforce nearly any
kind of policy for your project. These scripts run before and after
pushes to the server. The pre hooks can exit non-zero at any time to
reject the push as well as print an error message back to the client;
you can set up a push policy that's as complex as you wish.
pre-receive and post-receive
The first script to run when handling a push from a client is
pre-receive. It takes a list of references that are being pushed from stdin;
if it exits non-zero, none of them are accepted. You can use this hook
to do things like make sure none of the updated references are
non-fast-forwards, or to check that the user doing the pushing has create,
delete, or push access or access to push updates to all the files they're
modifying with the push.
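The list on stdin arrives as one line per reference, each holding the old SHA-1, the new SHA-1, and the reference name separated by spaces. A minimal sketch of reading it follows; RefUpdate and parse_ref_updates are names invented for this example, and the SHA values in the demonstration are placeholders rather than real objects.

```ruby
# One pushed reference: where it pointed before, where it should point now.
RefUpdate = Struct.new(:old_sha, :new_sha, :ref_name)

# Turn the text a pre-receive hook reads from stdin into RefUpdate records.
# parse_ref_updates is a hypothetical helper name.
def parse_ref_updates(input)
  input.each_line.map(&:strip).reject(&:empty?).map do |line|
    old_sha, new_sha, ref_name = line.split(' ', 3)
    RefUpdate.new(old_sha, new_sha, ref_name)
  end
end

# Placeholder input; a real hook would call parse_ref_updates(STDIN.read).
sample = "#{'a' * 40} #{'b' * 40} refs/heads/master\n" \
         "#{'c' * 40} #{'d' * 40} refs/heads/topic\n"
parse_ref_updates(sample).each do |update|
  puts "#{update.ref_name}: #{update.old_sha[0, 6]} -> #{update.new_sha[0, 6]}"
end
```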
The post-receive hook runs after the entire process is completed
and can be used to update other services or notify users. It takes the
same stdin data as the pre-receive hook. Examples include e-mailing
a list, notifying a continuous integration server, or updating a
ticket-tracking system; you can even parse the commit messages to see if
any tickets need to be opened, modified, or closed. This script can't
stop the push process, but the client doesn't disconnect until it has
completed, so be careful when you try to do anything that may take
a long time.
update
The update script is very similar to the pre-receive script, except that
it's run once for each branch the pusher is trying to update. If the
pusher is trying to push to multiple branches, pre-receive runs only
once, whereas update runs once per branch they're pushing to.
Instead of reading from stdin, this script takes three arguments: the
name of the reference (branch), the SHA-1 that reference pointed
to before the push, and the SHA-1 the user is trying to push. If the
update script exits non-zero, only that reference is rejected; other
references can still be updated.
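One detail worth handling in an update script: when a push creates a reference, Git passes an all-zero value as the old SHA-1, and when a push deletes one, the new SHA-1 is all zeros. The sketch below classifies the three cases; update_kind is a name invented for this example.

```ruby
# Classify what a push is doing to one reference, based on the old and new
# values that the update hook receives. update_kind is an illustrative name.
def update_kind(old_sha, new_sha)
  zero = /\A0+\z/
  if old_sha =~ zero
    :create
  elsif new_sha =~ zero
    :delete
  else
    :update
  end
end

puts update_kind('0' * 40, 'b' * 40)
puts update_kind('a' * 40, '0' * 40)
puts update_kind('a' * 40, 'b' * 40)
```

A policy script might skip range-based checks such as old..new for the create and delete cases, because an all-zero value does not name a real commit.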
7.4 An Example Git-Enforced Policy
In this section, you'll use what you've learned to establish a Git
workflow that checks for a custom commit message format, enforces
fast-forward-only pushes, and allows only certain users to modify certain
subdirectories in a project. You'll build client scripts that help the
developer know if their push will be rejected and server scripts that
actually enforce the policies.
I used Ruby to write these, both because it's my preferred
scripting language and because I feel it's the most pseudocode-looking of
the scripting languages; thus you should be able to roughly follow
the code even if you don't use Ruby. However, any language will
work fine. All the sample hook scripts distributed with Git are in
either Perl or Bash scripting, so you can also see plenty of examples of
hooks in those languages by looking at the samples.
7.4.1 Server-Side Hook
All the server-side work will go into the update file in your hooks
directory. The update file runs once per branch being pushed and
takes the reference being pushed to, the old revision where that
branch was, and the new revision being pushed. You also have
access to the user doing the pushing if the push is being run over SSH.
If you've allowed everyone to connect with a single user (like “git”)
via public-key authentication, you may have to give that user a shell
wrapper that determines which user is connecting based on the
public key, and set an environment variable specifying that user. Here I
assume the connecting user is in the $USER environment variable, so
your update script begins by gathering all the information you need.
#!/usr/bin/env ruby
$refname = ARGV[0]
$oldrev = ARGV[1]
$newrev = ARGV[2]
$user = ENV['USER']
puts "Enforcing Policies... \n(#{$refname}) (#{$oldrev[0,6]}) (#{$newrev[0,6]})"
Yes, I'm using global variables. Don't judge me; it's easier to
demonstrate in this manner.
Enforcing a Specific Commit-Message Format
Your first challenge is to enforce that each commit message must
adhere to a particular format. Just to have a target, assume that each
message has to include a string that looks like “[ref: 1234]” because
you want each commit to link to a work item in your ticketing system.
You must look at each commit being pushed up, see if that string is
in the commit message, and, if the string is absent from any of the
commits, exit non-zero so the push is rejected.
You can get a list of the SHA-1 values of all the commits that are
being pushed by taking the $newrev and $oldrev values and passing
them to a Git plumbing command called git rev-list. This is basically
the git log command, but by default it prints out only the SHA-1
values and no other information. So, to get a list of all the commit
SHAs introduced between one commit SHA and another, you can run
something like this.
$ git rev-list 538c33..d14fc7
d14fc7c847ab946ec39590d87783c69b031bdfb7
9f585da4401b0a3999e84113824d15245c13f0be
234071a1be950e2a8d078e6141f5cd20c1e61ad3
dfa04c9ef3d5197182f13fb5b9b1fb7717d2222a
17716ec0f1ff5c77eff40b7fe912f9f6cfd0e475
You can take that output, loop through each of those commit SHAs,
grab the message for it, and test that message against a regular
expression that looks for a pattern.
You have to figure out how to get the commit message from each
of these commits to test. To get the raw commit data, you can use
another plumbing command called git cat-file. I'll go over all these
plumbing commands in detail in Chapter 9, but for now, here's what
that command gives you.
$ git cat-file commit ca82a6
tree cfda3bf379e4f8dba8717dee55aab78aef7f4daf
parent 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7
author Scott Chacon <schacon@gmail.com> 1205815931 -0700
committer Scott Chacon <schacon@gmail.com> 1240030591 -0700

changed the verison number
A simple way to get the commit message from a commit when
you have the SHA-1 value is to go to the first blank line and take
everything after that. You can do so with the sed command on Unix
systems.
$ git cat-file commit ca82a6 | sed '1,/^$/d'
changed the verison number
You can use that incantation to grab the commit message from
each commit that is trying to be pushed and exit if you see anything
that doesn't match. To exit the script and reject the push, exit
non-zero. The whole method looks like this.
$regex = /\[ref: (\d+)\]/
# enforced custom commit message format
def check_message_format
  missed_revs = `git rev-list #{$oldrev}..#{$newrev}`.split("\n")
  missed_revs.each do |rev|
    message = `git cat-file commit #{rev} | sed '1,/^$/d'`
    if !$regex.match(message)
      puts "[POLICY] Your message is not formatted correctly"
      exit 1
    end
  end
end
check_message_format
Putting that in your update script will reject updates that contain
commits that have messages that don't adhere to your rule.
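If sed is not available on the server, the same "everything after the first blank line" step can be done in Ruby. This is only a sketch of an alternative, not what the script above does; commit_message is a name invented here, and the raw text is the sample git cat-file output shown earlier.

```ruby
# Return the message part of raw `git cat-file commit` output, that is,
# everything after the first blank line. commit_message is a made-up name.
def commit_message(raw_commit)
  _headers, message = raw_commit.split("\n\n", 2)
  message.to_s
end

raw = <<~COMMIT
  tree cfda3bf379e4f8dba8717dee55aab78aef7f4daf
  parent 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7
  author Scott Chacon <schacon@gmail.com> 1205815931 -0700
  committer Scott Chacon <schacon@gmail.com> 1240030591 -0700

  changed the verison number
COMMIT

message = commit_message(raw)
puts message
# The sample message carries no [ref: N] marker, so the policy would reject it.
puts message.match?(/\[ref: (\d+)\]/)
```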
Enforcing a User-Based ACL System
Suppose you want to add a mechanism that uses an access control
list (ACL) that specifies which users are allowed to push changes
to which parts of your projects. Some people have full access, and
others only have access to push changes to certain subdirectories or
specific files. To enforce this, you'll write those rules to a file named
acl that lives in your bare Git repository on the server. You'll have the
update hook look at those rules, see what files are being introduced
for all the commits being pushed, and determine whether the user
doing the push has access to update all those files.
The first thing you'll do is write your ACL. Here you'll use a format
very much like the CVS ACL mechanism: it uses a series of lines,
where the first field is avail or unavail, the next field is a
comma-delimited list of the users to which the rule applies, and the last field
is the path to which the rule applies (blank meaning open access).
All of these fields are delimited by a pipe (|) character.
In this case, you have a couple of administrators, some
documentation writers with access to the doc directory, and one developer
who only has access to the lib and tests directories, and your ACL
file looks like this.
avail|nickh,pjhyett,defunkt,tpw
avail|usinclair,cdickens,ebronte|doc
avail|schacon|lib
avail|schacon|tests
You begin by reading this data into a structure that you can use. In
this case, to keep the example simple, you'll only enforce the avail
directives. Here is a method that gives you an associative array where
the key is the user name and the value is an array of paths to which
the user has write access.
def get_acl_access_data(acl_file)
  # read in ACL data
  acl_file = File.read(acl_file).split("\n").reject { |line| line == '' }
  access = {}
  acl_file.each do |line|
    avail, users, path = line.split('|')
    next unless avail == 'avail'
    users.split(',').each do |user|
      access[user] ||= []
      access[user] << path
    end
  end
  access
end
On the ACL file you looked at earlier, this get_acl_access_data method
returns a data structure that looks like this.
{"defunkt"=>[nil],
"tpw"=>[nil],
"nickh"=>[nil],
"pjhyett"=>[nil],
"schacon"=>["lib", "tests"],
"cdickens"=>["doc"],
"usinclair"=>["doc"],
"ebronte"=>["doc"]}
Now that you have the permissions sorted out, you need to
determine what paths the commits being pushed have modified, so you
can make sure the user who's pushing has access to all of them.
You can pretty easily see what files have been modified in a single
commit with the --name-only option to the git log command
(mentioned briefly in Chapter 2).
$ git log -1 --name-only --pretty=format:'' 9f585d
README
lib/test.rb
If you use the ACL structure returned from the get_acl_access_data
method and check it against the listed files in each of the commits,
you can determine whether the user has access to push all of their
commits.
# only allows certain users to modify certain subdirectories in a project
def check_directory_perms
  access = get_acl_access_data('acl')
  # see if anyone is trying to push something they can't
  new_commits = `git rev-list #{$oldrev}..#{$newrev}`.split("\n")
  new_commits.each do |rev|
    files_modified = `git log -1 --name-only --pretty=format:'' #{rev}`.split("\n")
    files_modified.each do |path|
      next if path.size == 0
      has_file_access = false
      # a user missing from the ACL file gets no access at all
      (access[$user] || []).each do |access_path|
        if !access_path || # user has access to everything
           (path.index(access_path) == 0) # access to this path
          has_file_access = true
        end
      end
      if !has_file_access
        puts "[POLICY] You do not have access to push to #{path}"
        exit 1
      end
    end
  end
end
check_directory_perms
Most of that should be easy to follow. You get a list of new
commits being pushed to your server with git rev-list. Then, for each
of those, you find which files are modified and make sure the user
who's pushing has access to all the paths being modified. One
Rubyism that may not be clear is path.index(access_path) == 0, which is true
if path begins with access_path; this ensures that access_path does not
merely appear somewhere inside the modified path, but that the
modified path begins with an allowed path.
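A plain prefix test has one side effect worth knowing: an entry such as lib also matches a sibling directory named library, because the string library/x.rb begins with lib. Whether that matters depends on your layout. The sketch below shows a stricter variant that only matches at a directory boundary; under_path? is a name invented for this example and is not used by the scripts in this chapter.

```ruby
# True when path is the allowed entry itself or lies underneath it.
# A nil or empty entry means open access, as in the ACL format above.
# under_path? is a hypothetical helper, not part of the chapter's scripts.
def under_path?(path, allowed)
  return true if allowed.nil? || allowed.empty?
  dir = allowed.chomp('/')
  path == dir || path.start_with?(dir + '/')
end

puts 'library/x.rb'.index('lib') == 0      # the plain prefix test accepts this
puts under_path?('library/x.rb', 'lib')    # the boundary-aware test does not
puts under_path?('lib/test.rb', 'lib')
puts under_path?('README', nil)
```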
Now your users can't push any commits with badly formed
messages or with modified files outside of their designated paths.
Enforcing Fast-Forward-Only Pushes
The only thing left is to enforce fast-forward-only pushes. In Git
versions 1.6 or newer, you can set the receive.denyDeletes and
receive.denyNonFastForwards settings. But enforcing this with a hook will
work in older versions of Git, and you can modify it to do so only for
certain users or whatever else you come up with later.
The logic for checking this is to see if any commits are reachable
from the older revision that aren't reachable from the newer one. If
there are none, then it was a fast-forward push; otherwise, you deny
it.
# enforces fast-forward only pushes
def check_fast_forward
  missed_refs = `git rev-list #{$newrev}..#{$oldrev}`
  missed_ref_count = missed_refs.split("\n").size
  if missed_ref_count > 0
    puts "[POLICY] Cannot push a non fast-forward reference"
    exit 1
  end
end
check_fast_forward
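The reachability rule behind this check can be tried out without a repository. The following sketch models a tiny history as a hash from each commit to its parents; the commit names and the helpers ancestors and fast_forward? are all invented for the illustration.

```ruby
require 'set'

# Every commit reachable from start (including start itself) in a toy
# history where each commit maps to the list of its parents.
def ancestors(graph, start)
  seen = Set.new
  stack = [start]
  until stack.empty?
    commit = stack.pop
    next if seen.include?(commit)
    seen << commit
    stack.concat(graph.fetch(commit, []))
  end
  seen
end

# A push is a fast-forward when nothing reachable from the old revision
# is missing from the new one (the toy analogue of rev-list new..old).
def fast_forward?(graph, old_rev, new_rev)
  (ancestors(graph, old_rev) - ancestors(graph, new_rev)).empty?
end

graph = {
  'a' => [],
  'b' => ['a'],
  'c' => ['b'], # c builds on b
  'x' => ['a']  # x diverges from b
}
puts fast_forward?(graph, 'b', 'c')
puts fast_forward?(graph, 'b', 'x')
```

Moving the branch from b to c keeps b in the history, so it is a fast-forward; moving it from b to x would drop b, so it is not.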
Everything is set up. If you run chmod u+x .git/hooks/update, which
is the file into which you should have put all this code, and then
try to push a non-fast-forwarded reference, you get something like
this.
$ git push -f origin master
Counting objects: 5, done.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 323 bytes, done.
Total 3 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
Enforcing Policies...
(refs/heads/master) (8338c5) (c5b616)
[POLICY] Cannot push a non fast-forward reference
error: hooks/update exited with error code 1
error: hook declined to update refs/heads/master
To git@gitserver:project.git
! [remote rejected] master -> master (hook declined)
error: failed to push some refs to 'git@gitserver:project.git'
There are a couple of interesting things here. First, you see this
where the hook starts running.
Enforcing Policies...
(refs/heads/master) (8338c5) (c5b616)
Notice that you printed that out to stdout at the very beginning of
your update script. It's important to note that anything your script
prints to stdout will be transferred to the client.
The next thing you'll notice is the error message.
[POLICY] Cannot push a non fast-forward reference
error: hooks/update exited with error code 1
error: hook declined to update refs/heads/master
The first line was printed out by you; the other two were Git telling
you that the update script exited non-zero and that is what is
declining your push. Lastly, you have this.
To git@gitserver:project.git
! [remote rejected] master -> master (hook declined)
error: failed to push some refs to 'git@gitserver:project.git'
You'll see a remote rejected message for each reference that your
hook declined, and it tells you that it was declined specifically
because of a hook failure.
Furthermore, if the ref marker isn't there in any of your commits,
you'll see the error message you're printing out for that.
[POLICY] Your message is not formatted correctly
Or if someone tries to edit a file they don't have access to and
push a commit containing it, they will see something similar. For
instance, if a documentation author tries to push a commit modifying
something in the lib directory, they see
[POLICY] You do not have access to push to lib/test.rb
That's all. From now on, as long as that update script is there and
executable, your repository will never be rewound and will never
have a commit message without your pattern in it, and your users
will be sandboxed.
7.4.2 Client-Side Hooks
The downside to this approach is the whining that will inevitably
result when your users' commit pushes are rejected. Having their
carefully crafted work rejected at the last minute can be extremely
frustrating and confusing, and furthermore, they will have to edit
their history to correct it, which isn't always for the faint of heart.
The answer to this dilemma is to provide some client-side hooks
that users can use to notify them when they're doing something that
the server is likely to reject. That way, they can correct any problems
before committing and before those issues become more difficult to
fix. Because hooks aren't transferred with a clone of a project, you
must distribute these scripts some other way and then have your
users copy them to their .git/hooks directory and make them
executable. You can distribute these hooks within the project or in a
separate project, but there is no way to set them up automatically.
To begin, you should check your commit message just before each
commit is recorded, so you know the server won't reject your changes
due to badly formatted commit messages. To do this, you can add the
commit-msg hook. If you have it read the message from the file passed
as the first argument and compare that to the pattern, you can force
Git to abort the commit if there is no match.
#!/usr/bin/env ruby
message_file = ARGV[0]
message = File.read(message_file)
$regex = /\[ref: (\d+)\]/
if !$regex.match(message)
  puts "[POLICY] Your message is not formatted correctly"
  exit 1
end
If that script is in place (in .git/hooks/commit-msg) and executable,
and you commit with a message that isn't properly formatted, you
see this.
$ git commit -am 'test'
[POLICY] Your message is not formatted correctly
No commit was completed in that instance. However, if your
message contains the proper pattern, Git allows you to commit.
$ git commit -am 'test [ref: 132]'
[master e05c914] test [ref: 132]
1 files changed, 1 insertions(+), 0 deletions(-)
Next, you want to make sure you aren't modifying files that are
outside your ACL scope. If your project's .git directory contains a
copy of the ACL file you used previously, then the following pre-commit
script will enforce those constraints for you.
#!/usr/bin/env ruby
$user = ENV['USER']
# [ insert get_acl_access_data method from above ]
# only allows certain users to modify certain subdirectories in a project
def check_directory_perms
  access = get_acl_access_data('.git/acl')
  files_modified = `git diff-index --cached --name-only HEAD`.split("\n")
  files_modified.each do |path|
    next if path.size == 0
    has_file_access = false
    # a user missing from the ACL file gets no access at all
    (access[$user] || []).each do |access_path|
      if !access_path || (path.index(access_path) == 0)
        has_file_access = true
      end
    end
    if !has_file_access
      puts "[POLICY] You do not have access to push to #{path}"
      exit 1
    end
  end
end
check_directory_perms
This is roughly the same script as the server-side part, but with
two important differences. First, the ACL file is in a different place,
because this script runs from your working directory, not from your
Git directory. You have to change the path to the ACL file from this
access = get_acl_access_data('acl')
to this.
access = get_acl_access_data('.git/acl')
The other important difference is the way you get a listing of the
files that have been changed. Because the server-side method looks
at the log of commits, and, at this point, the commit hasn't been
recorded yet, you must get your file listing from the staging area
instead. Instead of
files_modified = `git log -1 --name-only --pretty=format:'' #{rev}`
you have to use
files_modified = `git diff-index --cached --name-only HEAD`
But those are the only two differences; otherwise, the script
works the same way. One caveat is that it expects you to be running
locally as the same user you push as to the remote machine. If that
is different, you must set the $user variable manually.
The last thing you have to do is check that you're not trying to push
non-fast-forwarded references, but that is a bit less common. To get
a reference that isn't a fast-forward, you either have to rebase past
a commit you've already pushed up or try pushing a different local
branch up to the same remote branch.
Because the server will tell you that you can't push a
non-fast-forward anyway, and the hook prevents forced pushes, the only
accidental thing you can try to catch is rebasing commits that have
already been pushed.
Here is an example pre-rebase script that checks for that. It gets
a list of all the commits you're about to rewrite and checks whether
they exist in any of your remote references. If it sees one that is
reachable from one of your remote references, it aborts the rebase.
#!/usr/bin/env ruby
base_branch = ARGV[0]
if ARGV[1]
  topic_branch = ARGV[1]
else
  topic_branch = "HEAD"
end
target_shas = `git rev-list #{base_branch}..#{topic_branch}`.split("\n")
# skip symbolic entries such as "origin/HEAD -> origin/master"
remote_refs = `git branch -r`.split("\n").map { |r| r.strip }.reject { |r| r.include?('->') }
target_shas.each do |sha|
  remote_refs.each do |remote_ref|
    shas_pushed = `git rev-list ^#{sha}^@ refs/remotes/#{remote_ref}`
    if shas_pushed.split("\n").include?(sha)
      puts "[POLICY] Commit #{sha} has already been pushed to #{remote_ref}"
      exit 1
    end
  end
end
This script uses a syntax that wasn't covered in the Revision
Selection section of Chapter 6. You get a list of commits that have already
been pushed up by running this.
git rev-list ^#{sha}^@ refs/remotes/#{remote_ref}
The SHA^@ syntax resolves to all the parents of that commit. You're
looking for any commit that is reachable from the last commit on the
remote and that isn't reachable from any parent of any of the SHAs
you're trying to push up, meaning it's a fast-forward.
The main drawback to this approach is that it can be very slow
and is often unnecessary: if you don't try to force the push with -f,
the server will warn you and not accept the push. However, it's an
interesting exercise and can in theory help you avoid a rebase that
you might later have to go back and fix.
7.5 Summary
You've covered most of the major ways that you can customize your
Git client and server to best fit your workflow and projects. You've
learned about all sorts of configuration settings, file-based
attributes, and event hooks, and you've built an example
policy-enforcing server. You should now be able to make Git fit
nearly any workflow you can dream up.
Chapter 8
Git and Other Systems
The world isn't perfect. Usually, you can't immediately switch every
project you come in contact with to Git. Sometimes you're stuck on
a project using another VCS, and many times that system is
Subversion. You'll spend the first part of this chapter learning
about git svn, the bidirectional Subversion gateway tool in Git.
At some point, you may want to convert your existing project to
Git. The second part of this chapter covers how to migrate your
project into Git: first from Subversion, then from Perforce, and
finally via a custom import script for a nonstandard importing case.
8.1 Git and Subversion
Currently, the majority of open source development projects and a
large number of corporate projects use Subversion to manage their
source code. It's the most popular open source VCS and has been
around for nearly a decade. It's also very similar in many ways to
CVS, which was the big boy of the source-control world before that.
One of Git's great features is a bidirectional bridge to Subversion
called git svn. This tool allows you to use Git as a valid client to a
Subversion server, so you can use all the local features of Git and
then push to a Subversion server as if you were using Subversion
locally. This means you can do local branching and merging, use the
staging area, use rebasing and cherry-picking, and so on, while your
collaborators continue to work in their dark and ancient ways. It's a
good way to sneak Git into the corporate environment and help your
fellow developers become more efficient while you lobby to get the
infrastructure changed to support Git fully. The Subversion bridge
is the gateway drug to the DVCS world.
8.1.1 git svn
The base command in Git for all the Subversion bridging commands
is git svn. You preface everything with that. It takes quite a few
commands, so you'll learn about the common ones while going through
a few small workflows.
It's important to note that when you're using git svn, you're
interacting with Subversion, which is a system that is far less
sophisticated than Git. Although you can easily do local branching
and merging, it's generally best to keep your history as linear as
possible by rebasing your work and avoiding doing things like
simultaneously interacting with a Git remote repository.
Don't rewrite your history and try to push again, and don't push to
a parallel Git repository to collaborate with fellow Git developers at
the same time. Subversion can have only a single linear history, and
confusing it is very easy. If you're working with a team, and some
are using SVN and others are using Git, make sure everyone is using
the SVN server to collaborate; doing so will make your life easier.
8.1.2 Setting Up
To demonstrate this functionality, you need a typical SVN repository
that you have write access to. If you want to copy these examples,
you'll have to make a writeable copy of my test repository. In order
to do that easily, you can use a tool called svnsync that comes with
more recent versions of Subversion; it should be distributed with
at least 1.4. For these tests, I created a new Subversion repository
on Google Code that was a partial copy of the protobuf project, which
is a tool that encodes structured data for network transmission.
To follow along, you first need to create a new local Subversion
repository.
$ mkdir /tmp/test-svn
$ svnadmin create /tmp/test-svn
Then, enable all users to change revprops; the easy way is to
add a pre-revprop-change script that always exits 0.
$ cat /tmp/test-svn/hooks/pre-revprop-change
#!/bin/sh
exit 0;
$ chmod +x /tmp/test-svn/hooks/pre-revprop-change
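The same hook can be installed from a script instead of by hand. The following is only a sketch: the helper name install_revprop_hook is hypothetical, and it assumes the repository directory already exists (for example, the one created with svnadmin create above).

```ruby
require "fileutils"

# Write a permissive pre-revprop-change hook into an SVN repository's
# hooks directory and mark it executable. Returns the hook's path.
# (install_revprop_hook is an illustrative name, not part of any tool.)
def install_revprop_hook(repo_path)
  hooks_dir = File.join(repo_path, "hooks")
  FileUtils.mkdir_p(hooks_dir)
  hook_path = File.join(hooks_dir, "pre-revprop-change")
  # The hook body matches the script shown above: always succeed.
  File.write(hook_path, "#!/bin/sh\nexit 0\n")
  File.chmod(0755, hook_path)
  hook_path
end
```

Calling install_revprop_hook("/tmp/test-svn") would leave the repository in the same state as the two shell commands above.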
You can now sync this project to your local machine by calling
svnsync init with the to and from repositories.
$ svnsync init file:///tmp/test-svn http://progit-example.googlecode.com/svn/
This sets up the properties to run the sync. You can then clone
the code by running
$ svnsync sync file:///tmp/test-svn
Committed revision 1.
Copied properties for revision 1.
Committed revision 2.
Copied properties for revision 2.
Committed revision 3.
...
Although this operation may take only a few minutes, if you try to
copy the original repository to another remote repository instead of a
local one, the process will take nearly an hour, even though there are
fewer than 100 commits. Subversion has to clone one revision at a
time and then push it back into another repository; it's ridiculously
inefficient, but it's the only easy way to do this.
8.1.3 Getting Started
Now that you have a Subversion repository to which you have write
access, you can go through a typical workflow. You'll start with the
git svn clone command, which imports an entire Subversion repository
into a local Git repository. Remember that if you're importing from
a real hosted Subversion repository, you should replace the
file:///tmp/test-svn here with the URL of your Subversion repository.
$ git svn clone file:///tmp/test-svn -T trunk -b branches -t tags
Initialized empty Git repository in /Users/schacon/projects/testsvnsync/svn/.git/
r1 = b4e387bc68740b5af56c2a5faf4003ae42bd135c (trunk)
A m4/acx_pthread.m4
A m4/stl_hash.m4
...
r75 = d1957f3b307922124eec6314e15bcda59e3d9610 (trunk)
Found possible branch point: file:///tmp/test-svn/trunk => \
    file:///tmp/test-svn/branches/my-calc-branch, 75
Found branch parent: (my-calc-branch) d1957f3b307922124eec6314e15bcda59e3d9610
Following parent with do_switch
Successfully followed parent
r76 = 8624824ecc0badd73f40ea2f01fce51894189b01 (my-calc-branch)
Checked out HEAD:
file:///tmp/test-svn/branches/my-calc-branch r76
This runs the equivalent of two commands, git svn init followed
by git svn fetch, on the URL you provide. This can take a while.
The test project has only about 75 commits and the codebase isn't
that big, so it takes just a few minutes. However, Git has to check out
each version, one at a time, and commit it individually. For a project
with hundreds or thousands of commits, this can literally take hours
or even days to finish.
The -T trunk -b branches -t tags part tells Git that this Subversion
repository follows the basic branching and tagging conventions. If
you name your trunk, branches, or tags differently, you can change
these options. Because this is so common, you can replace this
entire part with -s, which means standard layout and implies all those
options. The following command is equivalent.
$ git svn clone file:///tmp/test-svn -s
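To make the equivalence concrete, here is a small sketch that expands the shorthand into the explicit options it implies. The helper name expand_standard_layout is hypothetical; it only manipulates an argument list and does not run git.

```ruby
# Expand git svn's -s (standard layout) shorthand into the explicit
# -T/-b/-t options it implies. Other arguments pass through unchanged.
# (expand_standard_layout is an illustrative name.)
def expand_standard_layout(args)
  args.flat_map do |arg|
    if arg == "-s" || arg == "--stdlayout"
      ["-T", "trunk", "-b", "branches", "-t", "tags"]
    else
      [arg]
    end
  end
end
```

Given ["clone", "file:///tmp/test-svn", "-s"], the result is the argument list of the longer clone command shown earlier in this section.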
At this point, you should have a valid Git repository that has
imported your branches and tags.
$ git branch -a
* master
my-calc-branch
tags/2.0.2
tags/release-2.0.1
tags/release-2.0.2
tags/release-2.0.2rc1
trunk
It's important to note how this tool namespaces your remote
references differently. When you're cloning a normal Git repository,
you get all the branches on that remote server available locally as
something like origin/[branch], namespaced by the name of the remote.
However, git svn assumes that you won't have multiple remotes and
saves all its references to points on the remote server with no
namespacing. You can use the Git plumbing command show-ref to look at
all your full reference names.
$ git show-ref
1cbd4904d9982f386d87f88fce1c24ad7c0f0471 refs/heads/master
aee1ecc26318164f355a883f5d99cff0c852d3c4 refs/remotes/my-calc-branch
03d09b0e2aad427e34a6d50ff147128e76c0e0f5 refs/remotes/tags/2.0.2
50d02cc0adc9da4319eeba0900430ba219b9c376 refs/remotes/tags/release-2.0.1
4caaa711a50c77879a91b8b90380060f672745cb refs/remotes/tags/release-2.0.2
1c4cb508144c513ff1214c3488abe66dcb92916f refs/remotes/tags/release-2.0.2rc1
1cbd4904d9982f386d87f88fce1c24ad7c0f0471 refs/remotes/trunk
A normal Git repository looks more like this.
$ git show-ref
83e38c7a0af325a9722f2fdc56b10188806d83a1 refs/heads/master
3e15e38c198baac84223acfc6224bb8b99ff2281 refs/remotes/gitserver/master
0a30dd3b0c795b80212ae723640d4e5d48cabdff refs/remotes/origin/master
25812380387fdd55f916652be4881c6f11600d6f refs/remotes/origin/testing
You have two remote servers: one named gitserver with a master
branch, and another named origin with two branches, master and
testing.
Notice how in the example of remote references imported from
git svn, tags are added as remote branches, not as real Git tags.
Your Subversion import looks like it has a remote named tags with
branches under it.
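As a rough illustration of that layout, the following sketch sorts git show-ref output from a git svn clone into local branches, Subversion tags, and Subversion branches. The function name classify_refs is hypothetical, and it assumes the un-namespaced refs/remotes layout shown in the first listing above.

```ruby
# Group `git show-ref` output from a git svn clone by kind of reference.
# Assumes refs/remotes/tags/* are Subversion tags and every other
# refs/remotes/* entry is a Subversion branch (including trunk).
def classify_refs(show_ref_output)
  groups = { local: [], svn_tags: [], svn_branches: [] }
  show_ref_output.each_line do |line|
    _sha, ref = line.split(" ", 2)
    next if ref.nil?
    case ref.strip
    when %r{\Arefs/heads/(.+)\z}        then groups[:local] << $1
    when %r{\Arefs/remotes/tags/(.+)\z} then groups[:svn_tags] << $1
    when %r{\Arefs/remotes/(.+)\z}      then groups[:svn_branches] << $1
    end
  end
  groups
end
```

The tags pattern is tested before the general remotes pattern, because a Subversion tag would otherwise be counted as a branch named tags/...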
8.1.4 Committing Back to Subversion
Now that you have a working repository, you can do some work on the
project and push your commits back upstream, using Git effectively
as an SVN client. If you edit one of the files and commit it, you have a
commit that exists in Git locally that doesn't exist on the Subversion
server.
$ git commit -am 'Adding git-svn instructions to the README'
[master 97031e5] Adding git-svn instructions to the README
1 files changed, 1 insertions(+), 1 deletions(-)
Next, you need to push your change upstream. Notice how this
changes the way you work with Subversion: you can do several
commits offline and then push them all at once to the Subversion
server. To push to a Subversion server, you run the git svn dcommit
command.
$ git svn dcommit
Committing to file:///tmp/test-svn/trunk ...
M README.txt
Committed r79
M README.txt
r79 = 938b1a547c2cc92033b74d32030e86468294a5c8 (trunk)
No changes between current HEAD and refs/remotes/trunk
Resetting to the latest refs/remotes/trunk
This takes all the commits you've made on top of the Subversion
server code, does a Subversion commit for each, and then rewrites
your local Git commit to include a unique identifier. This is
important because it means that all the SHA-1 checksums for your
commits change. Partly for this reason, working with Git-based remote
versions of your projects concurrently with a Subversion server isn't
a good idea. If you look at the last commit, you can see the new
git-svn-id that was added.
$ git log -1
commit 938b1a547c2cc92033b74d32030e86468294a5c8
Author: schacon <schacon@4c93b258-373f-11de-be05-5f7a86268029>
Date: Sat May 2 22:06:44 2009 +0000
Adding git-svn instructions to the README
git-svn-id: file:///tmp/test-svn/trunk@79 4c93b258-373f-11de-be05-5f7a86268029
Notice that the SHA checksum that originally started with 97031e5
when you committed now begins with 938b1a5. If you want to push to
both a Git server and a Subversion server, you have to push (dcommit)
to the Subversion server first, because that action changes your
commit data.
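The identifier that dcommit appends has a regular shape: a URL, an @ sign, the Subversion revision, and the repository UUID. A minimal sketch of pulling those pieces out of a commit message follows; the name parse_git_svn_id is hypothetical, and the pattern is inferred only from the example output above.

```ruby
# Extract the Subversion URL, revision, and repository UUID from the
# git-svn-id line of a commit message. Returns nil if there is none.
# The line format is assumed from the sample `git log` output above.
def parse_git_svn_id(message)
  match = message.match(/^\s*git-svn-id: (\S+)@(\d+) (\S+)\s*$/)
  return nil unless match
  { url: match[1], revision: match[2].to_i, uuid: match[3] }
end
```

A commit that returns nil here has not been through dcommit, which is a handy way to tell local-only commits from ones that already exist on the Subversion server.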
8.1.5 Pulling in New Changes
If you're working with other developers, then at some point one of
you will push, and then the other one will try to push a change that
conflicts. That change will be rejected until you merge in their work.
In git svn, it looks like this.
$ git svn dcommit
Committing to file:///tmp/test-svn/trunk ...
Merge conflict during commit: Your file or directory 'README.txt' is probably \
out-of-date: resource out of date; try updating at /Users/schacon/libexec/git-\
core/git-svn line 482
To resolve this situation, you can run git svn rebase, which pulls
down any changes on the server that you don't have yet and rebases
any work you have on top of what is on the server.
$ git svn rebase
M README.txt
r80 = ff829ab914e8775c7c025d741beb3d523ee30bc4 (trunk)
First, rewinding head to replay your work on top of it...
Applying: first user change
Now, all your work is on top of what is on the Subversion server,
so you can successfully dcommit.
$ git svn dcommit
Committing to file:///tmp/test-svn/trunk ...
M README.txt
Committed r81
M README.txt
r81 = 456cbe6337abe49154db70106d1836bc1332deed (trunk)
No changes between current HEAD and refs/remotes/trunk
Resetting to the latest refs/remotes/trunk
It's important to remember that unlike Git, which requires you to
merge upstream work you don't yet have locally before you can push,
git svn makes you do that only if the changes conflict. If someone else
pushes a change to one file and then you push a change to another
file, your dcommit will work fine.
$ git svn dcommit
Committing to file:///tmp/test-svn/trunk ...
M configure.ac
Committed r84
M autogen.sh
r83 = 8aa54a74d452f82eee10076ab2584c1fc424853b (trunk)
M configure.ac
r84 = cdbac939211ccb18aa744e581e46563af5d962d0 (trunk)
W: d2f23b80f67aaaa1f6f5aaef48fce3263ac71a92 and refs/remotes/trunk differ, \
using rebase:
:100755 100755 efa5a59965fbbb5b2b0a12890f1b351bb5493c18 \
015e4c98c482f0fa71e4d5434338014530b37fa6 M autogen.sh
First, rewinding head to replay your work on top of it...
Nothing to do.
This is important to remember, because the outcome is a project
state that didn't exist on either of your computers when you pushed.
If the changes are incompatible but don't conflict, you may get
issues that are difficult to diagnose. This is different from using a
Git server: in Git, you can fully test the state on your client system
before publishing it, whereas in SVN, you can't ever be certain that
the states immediately before commit and after commit are identical.
You should also run this command to pull in changes from the
Subversion server, even if you're not ready to commit yourself. You
can run git svn fetch to grab the new data, but git svn rebase does
the fetch and then updates your local commits.
$ git svn rebase
M generate_descriptor_proto.sh
r82 = bd16df9173e424c6f52c337ab6efa7f7643282f1 (trunk)
First, rewinding head to replay your work on top of it...
Fast-forwarded master to refs/remotes/trunk.
Running git svn rebase every once in a while makes sure your code
is always up to date. You need to be sure your working directory is
clean when you run this, though. If you have local changes, you
must either stash your work or temporarily commit it before running
git svn rebase; otherwise, the command will stop if it sees that the
rebase will result in a merge conflict.
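A script that wraps git svn rebase could check for a clean working directory first. The sketch below works on the text produced by git status --porcelain rather than running git itself; the name clean_for_rebase? is hypothetical, and treating untracked files ("??" lines) as harmless is an assumption made for this example.

```ruby
# Decide from `git status --porcelain` output whether the working
# directory looks clean enough to run git svn rebase. Assumption:
# untracked files ("??" lines) do not count as local changes here;
# any other status line means there is work to stash or commit.
def clean_for_rebase?(porcelain_output)
  porcelain_output.each_line.none? do |line|
    !line.strip.empty? && !line.start_with?("??")
  end
end
```

When this returns false, stash or temporarily commit your work as described above before rebasing.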
8.1.6 Git Branching Issues
When you've become comfortable with a Git workflow, you'll likely
create topic branches, do work on them, and then merge them in. If
you're pushing to a Subversion server via git svn, you may want to
rebase your work onto a single branch each time instead of merging
branches together. The reason to prefer rebasing is that Subversion
has a linear history and doesn't deal with merges like Git does, so
git svn follows only the first parent when converting the snapshots
into Subversion commits.
Suppose your history looks like the following: you created an
experiment branch, did two commits, and then merged them back into
master. When you dcommit, you see output like this.
$ git svn dcommit
Committing to file:///tmp/test-svn/trunk ...
M CHANGES.txt
Committed r85
M CHANGES.txt
r85 = 4bfebeec434d156c36f2bcd18f4e3d97dc3269a2 (trunk)
No changes between current HEAD and refs/remotes/trunk
Resetting to the latest refs/remotes/trunk
COPYING.txt: locally modified
INSTALL.txt: locally modified
M COPYING.txt
M INSTALL.txt
Committed r86
M INSTALL.txt
M COPYING.txt
r86 = 2647f6b86ccfcaad4ec58c520e369ec81f7c283c (trunk)
No changes between current HEAD and refs/remotes/trunk
Resetting to the latest refs/remotes/trunk
Running dcommit on a branch with merged history works fine,
except that when you look at your Git project history, it hasn't
rewritten either of the commits you made on the experiment branch;
instead, all those changes appear in the SVN version of the single
merge commit.
When someone else clones that work, all they see is the merge
commit with all the work squashed into it; they don't see the commit
data about where it came from or when it was committed.
8.1.7 Subversion Branching
Branching in Subversion isn't the same as branching in Git; if you can
avoid using it much, that's probably best. However, you can create
and commit to branches in Subversion using git svn.
Creating a New SVN Branch
To create a new branch in Subversion, you run git svn branch [branchname].
$ git svn branch opera
Copying file:///tmp/test-svn/trunk at r87 to file:///tmp/test-svn/branches/opera...
Found possible branch point: file:///tmp/test-svn/trunk => \
file:///tmp/test-svn/branches/opera, 87
Found branch parent: (opera) 1f6bfe471083cbca06ac8d4176f7ad4de0d62e5f
Following parent with do_switch
Successfully followed parent
r89 = 9b6fe0b90c5c9adf9165f700897518dbc54a7cbf (opera)
This does the equivalent of the svn copy trunk branches/opera
command in Subversion and operates on the Subversion server. It's
important to note that it doesn't check you out into that branch; if
you commit at this point, that commit will go to trunk on the server,
not opera.
8.1.8 Switching Active Branches
Git figures out what branch your dcommits go to by looking for the
tip of any of your Subversion branches in your history; you should
have only one, and it should be the last one with a git-svn-id in your
current branch history.
If you want to work on more than one branch simultaneously, you
can set up local branches to dcommit to specific Subversion branches
by starting them at the imported Subversion commit for that branch.
If you want an opera branch that you can work on separately, you can
run
$ git branch opera remotes/opera
Now, if you want to merge your opera branch into trunk (your master
branch), you can do so with a normal git merge. But you need to
provide a descriptive commit message (via -m), or the merge will say
“Merge branch opera” instead of something useful.
Remember that although you're using git merge to do this
operation, and the merge likely will be much easier than it would be
in Subversion (because Git will automatically detect the appropriate
merge base for you), this isn't a normal Git merge commit. You have
to push this data back to a Subversion server that can't handle a
commit that tracks more than one parent; so, after you push it up,
it will look like a single commit that squashed in all the work of
another branch under a single commit. After you merge one branch
into another, you can't easily go back and continue working on that
branch, as you normally can in Git. The dcommit command that you
run erases any information that says what branch was merged in,
so subsequent merge-base calculations will be wrong; the dcommit
makes your git merge result look like you ran git merge --squash.
Unfortunately, there's no good way to avoid this situation:
Subversion can't store this information, so you'll always be crippled
by its limitations while you're using it as your server. To avoid
issues, you should delete the local branch (in this case, opera) after
you merge it into trunk.
8.1.9 Subversion Commands
The git svn toolset provides a number of commands to help ease
the transition to Git by providing some functionality that's similar to
what you had in Subversion. Here are a few commands that give you
what Subversion used to.
SVN Style History
If you're used to Subversion and want to see your history in SVN
output style, you can run git svn log to view your commit history in
SVN formatting.
$ git svn log
------------------------------------------------------------------------
r87 | schacon | 2009-05-02 16:07:37 -0700 (Sat, 02 May 2009) | 2 lines
autogen change
------------------------------------------------------------------------
r86 | schacon | 2009-05-02 16:00:21 -0700 (Sat, 02 May 2009) | 2 lines
Merge branch 'experiment'
------------------------------------------------------------------------
r85 | schacon | 2009-05-02 16:00:09 -0700 (Sat, 02 May 2009) | 2 lines
updated the changelog
You should know two important things about git svn log. First, it
works offline, unlike the real svn log command, which asks the
Subversion server for the data. Second, it only shows you commits that
have been committed up to the Subversion server. Local Git commits
that you haven't dcommitted don't show up; neither do commits
that people have made to the Subversion server in the meantime.
It's more like the last known state of the commits on the Subversion
server.
SVN Annotation
Much as the git svn log command simulates the svn log command
offline, you can get the equivalent of svn annotate by running git svn
blame [FILE]. The output looks like this.
$ git svn blame README.txt
2 temporal Protocol Buffers - Google's data interchange format
2 temporal Copyright 2008 Google Inc.
2 temporal http://code.google.com/apis/protocolbuffers/
2 temporal
22 temporal C++ Installation - Unix
22 temporal =======================
2 temporal
79 schacon Committing in git-svn.
78 schacon
2 temporal To build and install the C++ Protocol Buffer runtime and the Protocol
2 temporal Buffer compiler (protoc) execute the following:
2 temporal
Again, it doesn't show commits that you did locally in Git or that
have been pushed to Subversion in the meantime.
SVN Server Information
You can also get the same sort of information that svn info gives you
by running git svn info.
$ git svn info
Path: .
URL: https://schacon-test.googlecode.com/svn/trunk
Repository Root: https://schacon-test.googlecode.com/svn
Repository UUID: 4c93b258-373f-11de-be05-5f7a86268029
Revision: 87
Node Kind: directory
Schedule: normal
Last Changed Author: schacon
Last Changed Rev: 87
Last Changed Date: 2009-05-02 16:07:37 -0700 (Sat, 02 May 2009)
This is like blame and log in that it runs offline and is up to date only
as of the last time you communicated with the Subversion server.
Ignoring What Subversion Ignores
If you clone a Subversion repository that has svn:ignore properties
set anywhere, you'll likely want to set corresponding .gitignore files
so you don't accidentally commit files that you shouldn't. git svn has
two commands to help with this issue. The first is git svn
create-ignore, which automatically creates corresponding .gitignore
files for you so your next commit can include them.
The second command is git svn show-ignore, which prints to stdout
the lines you need to put in a .gitignore file so you can redirect the
output into your project exclude file.
$ git svn show-ignore > .git/info/exclude
That way, you don't litter the project with .gitignore files. This is
a good option if you're the only Git user on a Subversion team, and
your teammates don't want .gitignore files in the project.
8.1.10 Git-Svn Summary
The git svn tools are useful if you're stuck with a Subversion server
for now or are otherwise in a development environment that
necessitates running a Subversion server. You should consider it
crippled Git, however, or you'll hit issues in translation that may
confuse you and your collaborators. To stay out of trouble, try to
follow these guidelines.
• Keep a linear Git history that doesn't contain merge commits
made by git merge. Rebase any work you do outside of your
mainline branch back onto it; don't merge it in.
• Don't set up and collaborate on a separate Git server. Possibly
have one to speed up clones for new developers, but don't push
anything to it that doesn't have a git-svn-id entry. You may even
want to add a pre-receive hook that checks each commit message
for a git-svn-id and rejects pushes that contain commits without
it.
If you follow those guidelines, working with a Subversion server can
be more bearable. However, if it's possible to move to a real Git
server, doing so can gain your team a lot more.
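The pre-receive check mentioned in the second guideline could be built around a test like the one below. This is a sketch of the core decision only: the name commits_without_svn_id is hypothetical, and the hash of messages is assumed to be collected elsewhere (a real hook would read "old new ref" lines on stdin and look up each new commit's message with git, which is not shown).

```ruby
# Return the SHAs whose commit message lacks a git-svn-id line.
# `messages` maps each pushed commit's SHA to its full message; how
# that map is gathered inside the hook is left out of this sketch.
def commits_without_svn_id(messages)
  messages.reject { |_sha, msg| msg =~ /^git-svn-id: \S+@\d+ \S+/ }.keys
end
```

A hook using this would print the offending SHAs and exit with a nonzero status whenever the returned list is not empty, which causes the push to be rejected.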
8.2 Migrating to Git
If you have an existing codebase in another VCS but you've decided
to start using Git, you must migrate your project one way or another.
This section goes over some importers that are included with Git for
common systems and then demonstrates how to develop your own
custom importer.
8.2.1 Importing
You'll learn how to import data from two of the bigger professionally
used SCM systems, Subversion and Perforce, both because they
make up the majority of users I hear of who are currently switching,
and because high-quality tools for both systems are distributed with
Git.
8.2.2 Subversion
If you read the previous section about using git svn, you can easily
use those instructions to git svn clone a repository; then, stop using
the Subversion server, push to a new Git server, and start using that.
If you want the history, you can accomplish that as quickly as you can
pull the data out of the Subversion server (which may take a while).
However, the import isn't perfect; and because it will take so long,
you may as well do it right. The first problem is the author
information. In Subversion, each person committing has a user on the
system who is recorded in the commit information. The examples in the
previous section show schacon in some places, such as the blame output
and the git svn log. If you want to map this to better Git author data,
you need a mapping from the Subversion users to the Git authors.
Create a file called users.txt that has this mapping in a format like
this.
schacon = Scott Chacon <schacon@geemail.com>
selse = Someo Nelse <selse@geemail.com>
To get a list of the author names that SVN uses, you can run this.
$ svn log --xml | grep author | sort -u | perl -pe 's/.*>(.*?)<.*/$1 = /'
That gives you the log output in XML format; you can look for
the authors, create a unique list, and then strip out the XML.
(Obviously this only works on a machine with grep, sort, and perl
installed.) Then, redirect that output into your users.txt file so you
can add the equivalent Git user data next to each entry.
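If grep, sort, and perl aren't available, the same list can be produced in Ruby. This is a sketch under the assumption that the XML contains one <author> element per log entry, as svn log --xml normally emits; the name author_stubs is hypothetical.

```ruby
# Build users.txt stub lines ("name = ") from `svn log --xml` output:
# one line per distinct author, sorted, ready to be completed by hand.
def author_stubs(svn_log_xml)
  svn_log_xml.scan(%r{<author>(.*?)</author>}).flatten.uniq.sort.map do |name|
    "#{name} = "
  end
end
```

You would write the returned lines to users.txt and then fill in the Git name and email after each equals sign, in the format shown above.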
You can provide this file to git svn to help it map the author data
more accurately. You can also tell git svn not to include the
metadata that Subversion normally imports, by passing --no-metadata to
the clone or init command. This makes your import command look
like this.
$ git svn clone http://my-project.googlecode.com/svn/ \
      --authors-file=users.txt --no-metadata -s my_project
Now you should have a nicer Subversion import in your my_project
directory. Instead of commits that look like this
commit 37efa680e8473b615de980fa935944215428a35a
Author: schacon <schacon@4c93b258-373f-11de-be05-5f7a86268029>
Date: Sun May 3 00:12:22 2009 +0000
fixed install - go to trunk
git-svn-id: https://my-project.googlecode.com/svn/trunk@94 4c93b258-373f-11de-be05-5f7a86268029
they look like this.
commit 03a8785f44c8ea5cdb0e8834b7c8e6c469be2ff2
Author: Scott Chacon <schacon@geemail.com>
Date: Sun May 3 00:12:22 2009 +0000
fixed install - go to trunk
Not only does the Author field look a lot better, but the git-svn-id
is no longer there, either.
You need to do a bit of post-import cleanup. For one thing, you
should clean up the weird references that git svn set up. First you'll
move the tags so they're actual tags rather than strange remote
branches, and then you'll move the rest of the branches so they're
local.
To move the tags to be proper Git tags, run
$ cp -Rf .git/refs/remotes/tags/* .git/refs/tags/
$ rm -Rf .git/refs/remotes/tags
This takes the references that were remote branches that started
with tags/ and makes them real (lightweight) tags.
Next, move the rest of the references under refs/remotes to be local
branches.
$ cp -Rf .git/refs/remotes/* .git/refs/heads/
$ rm -Rf .git/refs/remotes
Now all the old branches are real Git branches and all the old tags
are real Git tags. The last thing to do is add your new Git server as a
remote and push to it. Because you want all your branches and tags
to go up, you can run these commands; --all pushes only the branches
under refs/heads, so the tags need a push of their own.
$ git push origin --all
$ git push origin --tags
All your branches and tags should be on your new Git server in a
nice, clean import.
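The two cleanup steps amount to a simple renaming rule for references. The sketch below states that rule as a function; the name cleaned_ref_name is hypothetical, and it only computes names, leaving the actual moving of files under .git/refs to the cp and rm commands above.

```ruby
# Map a reference created by git svn to the name it should have after
# cleanup: remote "tags" become real tags, other remotes become local
# branches, and anything else is left alone.
def cleaned_ref_name(ref)
  case ref
  when %r{\Arefs/remotes/tags/(.+)\z} then "refs/tags/#{$1}"
  when %r{\Arefs/remotes/(.+)\z}      then "refs/heads/#{$1}"
  else ref
  end
end
```

For example, refs/remotes/tags/release-2.0.1 maps to refs/tags/release-2.0.1, and refs/remotes/trunk maps to refs/heads/trunk. The order of the two patterns matters, which is also why the shell commands move the tags before the remaining remotes.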
8.2.3 Perforce
The next system you'll look at importing from is Perforce. A Perforce
importer is also distributed with Git, but only in the contrib section
of the source code; it isn't available by default like git svn. To run
it, you must get the Git source code, which you can download from
git.kernel.org.
$ git clone git://git.kernel.org/pub/scm/git/git.git
$ cd git/contrib/fast-import
In this fast-import directory, you should find an executable Python
script named git-p4. You must have Python and the p4 tool installed
on your machine for this import to work. For example, you'll import
the Jam project from the Perforce Public Depot. To set up your client,
you must export the P4PORT environment variable to point to the
Perforce depot.
$ export P4PORT=public.perforce.com:1666
Run the git-p4 clone command to import the Jam project from the
Perforce server, supplying the depot and project path and the path
into which you want to import the project.
$ git-p4 clone //public/jam/src@all /opt/p4import
Importing from //public/jam/src@all into /opt/p4import
Reinitialized existing Git repository in /opt/p4import/.git/
Import destination: refs/remotes/p4/master
Importing revision 4409 (100%)
If you go to the /opt/p4import directory and run git log, you can see
your imported work.
$ git log -2
commit 1fd4ec126171790efd2db83548b85b1bbbc07dc2
Author: Perforce staff <support@perforce.com>
Date: Thu Aug 19 10:18:45 2004 -0800
Drop 'rc3' moniker of jam-2.5. Folded rc2 and rc3 RELNOTES into
the main part of the document. Built new tar/zip balls.
Only 16 months later.
[git-p4: depot-paths = "//public/jam/src/": change = 4409]
commit ca8870db541a23ed867f38847eda65bf4363371d
Author: Richard Geiger <rmg@perforce.com>
Date: Tue Apr 22 20:51:34 2003 -0800
Update derived jamgram.c
[git-p4: depot-paths = "//public/jam/src/": change = 3108]
You can see the git-p4 identifier in each commit. It's fine to keep
that identifier there, in case you need to reference the Perforce
change number later. However, if you'd like to remove the identifier,
now is the time to do so, before you start doing work on the new
repository. You can use git filter-branch to remove the identifier
strings en masse.
$ git filter-branch --msg-filter '
sed -e "/^\[git-p4:/d"
'
Rewrite 1fd4ec126171790efd2db83548b85b1bbbc07dc2 (123/123)
Ref 'refs/heads/master' was rewritten
If you run git log, you can see that all the SHA-1 checksums for
the commits have changed, but the git-p4 strings are no longer in the
commit messages.
$ git log -2
commit 10a16d60cffca14d454a15c6164378f4082bc5b0
Author: Perforce staff <support@perforce.com>
Date: Thu Aug 19 10:18:45 2004 -0800
Drop 'rc3' moniker of jam-2.5. Folded rc2 and rc3 RELNOTES into
the main part of the document. Built new tar/zip balls.
Only 16 months later.
commit 2b6c6db311dd76c34c66ec1c40a49405e6b527b2
Author: Richard Geiger <rmg@perforce.com>
Date: Tue Apr 22 20:51:34 2003 -0800
Update derived jamgram.c
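The message filter used with git filter-branch above simply deletes every line that begins with [git-p4:. For readers more comfortable with Ruby than sed, a sketch of the same transformation follows; the name strip_git_p4 is hypothetical.

```ruby
# Remove every line that starts with "[git-p4:" from a commit message,
# mirroring what sed -e "/^\[git-p4:/d" does to the message text.
def strip_git_p4(message)
  message.each_line.reject { |line| line.start_with?("[git-p4:") }.join
end
```

All other lines, including the blank line that preceded the identifier, are passed through unchanged.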
Your import is ready to push up to your new Git server.
8.2.4 A Custom Importer
If your system isn't Subversion or Perforce, you should look for an
importer online; quality importers are available for CVS, Clear Case,
Visual Source Safe, even a directory of archives. If none of these
tools works for you, you have a rarer tool, or you otherwise need a
more custom importing process, you should use git fast-import. This
command reads simple instructions from stdin to write specific Git
data. It's much easier to create Git objects this way than to run the
raw Git commands or try to write the raw objects (see Chapter 9
for more information). This way, you can write an import script that
reads the necessary information out of the system you're importing
from and prints straightforward instructions to stdout. You can then
run this program and pipe its output through git fast-import.
To quickly demonstrate, you'll write a simple importer. Suppose
you work in current, you back up your project by occasionally
copying the directory into a time-stamped back_YYYY_MM_DD backup
directory, and you want to import this into Git. Your directory
structure looks like this.
$ ls /opt/import_from
back_2009_01_02
back_2009_01_04
back_2009_01_14
back_2009_02_03
current
In order to import a Git directory, you need to review how Git
stores its data. As you may remember, Git is fundamentally a linked
list of commit objects that point to a snapshot of content. All you
have to do is tell fast-import what the content snapshots are, what
commit data points to them, and the order they go in. Your strategy
will be to go through the snapshots one at a time and create commits
with the contents of each directory, linking each commit back to the
previous one.
As you did in the “An Example Git Enforced Policy” section of
Chapter 7, we'll write this in Ruby, because it's what I generally
work with and it tends to be easy to read. You can write this example
pretty easily in anything you're familiar with; it just needs
to print the appropriate information to stdout.
To begin, you'll change into the target directory and identify every
subdirectory, each of which is a snapshot that you want to import as
a commit. You'll change into each subdirectory and print the commands
necessary to export it. Your basic main loop looks like this:
last_mark = nil
# loop through the directories in name order, so that the
# date-stamped snapshots are imported oldest first
Dir.chdir(ARGV[0]) do
  Dir.glob("*").sort.each do |dir|
    next if File.file?(dir)
    # move into the target directory
    Dir.chdir(dir) do
      last_mark = print_export(dir, last_mark)
    end
  end
end
You run print_export inside each directory. It takes the directory
name and the mark of the previous snapshot and returns the mark of
this one; that way, you can link them properly. “Mark” is the
fast-import term for an identifier you give to a commit; as you create
commits, you give each one a mark that you can use to link to it from
other commits. So, the first thing to do in your print_export method
is generate a mark from the directory name:
mark = convert_dir_to_mark(dir)
You'll do this by creating an array of directories and using the
index value as the mark, because a mark must be an integer. Your
method looks like this:
$marks = []
def convert_dir_to_mark(dir)
  if !$marks.include?(dir)
    $marks << dir
  end
  ($marks.index(dir) + 1).to_s
end
Now that you have an integer representation of your commit, you
need a date for the commit metadata. Because the date is expressed
in the name of the directory, you'll parse it out. The next line in your
print_export method is
date = convert_dir_to_date(dir)
where convert_dir_to_date is defined as
def convert_dir_to_date(dir)
  if dir == 'current'
    return Time.now().to_i
  else
    dir = dir.gsub('back_', '')
    (year, month, day) = dir.split('_')
    return Time.local(year, month, day).to_i
  end
end
That returns an integer value for the date of each directory. The
last piece of meta-information you need for each commit is the committer
data, which you hardcode in a global variable:
$author = 'Scott Chacon <schacon@example.com>'
Now you're ready to begin printing out the commit data for your
importer. The initial information states that you're defining a commit
object and what branch it's on, followed by the mark you've generated,
the committer information and commit message, and then the
previous commit, if any. The code looks like this:
# print the import information
puts 'commit refs/heads/master'
puts 'mark :' + mark
puts "committer #{$author} #{date} -0700"
export_data('imported from ' + dir)
puts 'from :' + last_mark if last_mark
You hardcode the time zone (-0700) because doing so is easy.
If you're importing from another system, you must specify the time
zone as an offset. The commit message must be expressed in a special
format:
data (size)\n(contents)
The format consists of the word data, the size of the data to be
read (in bytes), a newline, and finally the data. Because you need to
use the same format to specify the file contents later, you create a
helper method, export_data.
def export_data(string)
  # the size is a byte count, so use bytesize rather than size
  print "data #{string.bytesize}\n#{string}"
end
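The size in a data command counts bytes, not characters, which matters as soon as a commit message or file contains non-ASCII text. The following is a minimal sketch of that framing, assuming Ruby 1.9 or later (where String#size counts characters); the name data_command is hypothetical and is not part of the importer above.

```ruby
# Hypothetical helper: frame a string as a fast-import "data" command.
# The count is in bytes, so bytesize is used rather than size.
def data_command(string)
  "data #{string.bytesize}\n#{string}"
end

puts data_command("imported from back_2009_01_02")
```

Returning the string instead of printing it makes the framing easy to check on its own before it is wired into an importer.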
All that's left is to specify the file contents for each snapshot. This
is easy, because you have each one in a directory: you can print out
the deleteall command followed by the contents of each file in the
directory. Git will then record each snapshot appropriately.
puts 'deleteall'
Dir.glob("**/*").each do |file|
  next if !File.file?(file)
  inline_data(file)
end
Note: Because many systems think of their revisions as changes
from one commit to another, fast-import can also take commands
with each commit to specify which files have been added, removed,
or modified and what the new contents are. You could calculate the
differences between snapshots and provide only this data, but doing
so is more complex; you may as well give Git all the data and let
it figure it out. If this is better suited to your data, check the
fast-import man page for details about how to provide your data in this
manner.
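If you did want to send only the differences, a rough sketch might look like the following. It assumes each snapshot has already been read into a hash of path to contents, that paths contain no characters needing quoting, and that the commit does not also emit deleteall. The D (filedelete) command comes from the fast-import manual; the helper name delta_commands is hypothetical.

```ruby
# Hypothetical helper: given two snapshots as { path => contents } hashes,
# return fast-import file commands that describe only what changed.
def delta_commands(previous, current)
  commands = []
  # Paths that disappeared become filedelete commands.
  (previous.keys - current.keys).sort.each do |path|
    commands << "D #{path}\n"
  end
  # New or modified paths become filemodify commands with inline data.
  current.keys.sort.each do |path|
    next if previous[path] == current[path]
    body = current[path]
    commands << "M 644 inline #{path}\ndata #{body.bytesize}\n#{body}\n"
  end
  commands.join
end
```

The trade-off is the one the note describes: the importer has to keep the previous snapshot around and compare it, whereas deleteall plus a full listing needs no state at all.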
The format for listing the new file contents or specifying a modified
file with the new contents is as follows:
M 644 inline path/to/file
data (size)
(file contents)
Here, 644 is the mode (if you have executable files, you need to
detect and specify 755 instead), and inline says you'll list the contents
immediately after this line. Your inline_data method looks like this:
def inline_data(file, code = 'M', mode = '644')
  content = File.read(file)
  puts "#{code} #{mode} inline #{file}"
  export_data(content)
end
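The importer above always uses the default mode of 644. One way to detect executable files, as the text suggests, is sketched below; the helper name mode_for is hypothetical, and it relies on Ruby's File.executable? as an approximation of "has the executable bit set" on a UNIX-like system.

```ruby
# Hypothetical helper: choose the fast-import mode for a regular file.
def mode_for(file)
  File.executable?(file) ? '755' : '644'
end
```

You could then pass the result through the existing parameters, for example inline_data(file, 'M', mode_for(file)).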
You reuse the export_data method you defined earlier, because it's
the same as the way you specified your commit message data.
The last thing you need to do is to return the current mark so it
can be passed to the next iteration:
return mark
That's it. If you run this script, you'll get content that looks something
like this:
$ ruby import.rb /opt/import_from
commit refs/heads/master
mark :1
committer Scott Chacon <schacon@example.com> 1230883200 -0700
data 29
imported from back_2009_01_02deleteall
M 644 inline file.rb
data 12
version two
commit refs/heads/master
mark :2
committer Scott Chacon <schacon@example.com> 1231056000 -0700
data 29
imported from back_2009_01_04from :1
deleteall
M 644 inline file.rb
data 14
version three
M 644 inline new.rb
data 16
new version one
(...)
To run the importer, pipe this output through git fast-import while
in the Git directory you want to import into. You can create a new
directory and then run git init in it for a starting point, and then run
your script.
$ git init
Initialized empty Git repository in /opt/import_to/.git/
$ ruby import.rb /opt/import_from | git fast-import
git-fast-import statistics:
---------------------------------------------------------------------
Alloc'd objects: 5000
Total objects: 18 ( 1 duplicates )
blobs : 7 ( 1 duplicates 0 deltas)
trees : 6 ( 0 duplicates 1 deltas)
commits: 5 ( 0 duplicates 0 deltas)
tags : 0 ( 0 duplicates 0 deltas)
Total branches: 1 ( 1 loads )
marks: 1024 ( 5 unique )
atoms: 3
Memory total: 2255 KiB
pools: 2098 KiB
objects: 156 KiB
---------------------------------------------------------------------
pack_report: getpagesize() = 4096
pack_report: core.packedGitWindowSize = 33554432
pack_report: core.packedGitLimit = 268435456
pack_report: pack_used_ctr = 9
pack_report: pack_mmap_calls = 5
pack_report: pack_open_windows = 1 / 1
pack_report: pack_mapped = 1356 / 1356
---------------------------------------------------------------------
As you can see, when it completes successfully, it gives you a
bunch of statistics about what it accomplished. In this case, you imported
18 objects total for 5 commits into 1 branch. Now, you can
run git log to see your new history.
$ git log -2
commit 10bfe7d22ce15ee25b60a824c8982157ca593d41
Author: Scott Chacon <schacon@example.com>
Date: Sun May 3 12:57:39 2009 -0700
imported from current
commit 7e519590de754d079dd73b44d695a42c9d2df452
Author: Scott Chacon <schacon@example.com>
Date: Tue Feb 3 01:00:00 2009 -0700
imported from back_2009_02_03
There you go: a nice, clean Git repository. It's important to note
that nothing is checked out; you don't have any files in your working
directory at first. To get them, you must reset your branch to where
master is now.
$ ls
$ git reset --hard master
HEAD is now at 10bfe7d imported from current
$ ls
file.rb lib
You can do a lot more with the fast-import tool: handle different
modes, binary data, multiple branches and merging, tags, progress
indicators, and more. A number of examples of more complex scenarios
are available in the contrib/fast-import directory of the Git source
code; one of the better ones is the git-p4 script I just covered.
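Two of those extras, progress indicators and tags, are small enough to sketch here. The progress and reset commands are described in the fast-import manual; the helper names below are hypothetical, and the tag shown is a lightweight one that simply points a ref under refs/tags at a commit you marked earlier.

```ruby
# Hypothetical helpers that return extra fast-import commands as strings.

# A progress line; fast-import echoes it to its standard output when reached.
def progress_command(message)
  "progress #{message}\n"
end

# Point a lightweight tag at a commit that was previously given a mark.
def lightweight_tag_command(name, mark)
  "reset refs/tags/#{name}\nfrom :#{mark}\n\n"
end

print progress_command("imported back_2009_01_02")
print lightweight_tag_command("first-backup", 1)
```

In an importer like the one above, these strings would be printed between commits, after the commit that defines the mark being referenced.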
8.3 Summary
You should feel comfortable using Git with Subversion or importing
nearly any existing repository into a new Git one without losing data.
The next chapter will cover the raw internals of Git so you can craft
every single byte, if need be.
Chapter 9
Git Internals
You may have skipped to this chapter from a previous chapter, or
you may have gotten here after reading the rest of the book. In
either case, this is where you'll go over the inner workings and
implementation of Git. I found that learning this information was
fundamentally important to understanding how useful and powerful Git
is, but others have argued to me that it can be confusing and
unnecessarily complex for beginners. Thus, I've made this discussion the
last chapter in the book so you could read it early or later in your
learning process. I leave it up to you to decide.
Now that you're here, let's get started. First, if it isn't yet clear, Git
is fundamentally a content-addressable filesystem with a VCS user
interface written on top of it. You'll learn more about what this means
in a bit.
In the early days of Git (mostly pre 1.5), the user interface was
much more complex because it emphasized this filesystem rather
than a polished VCS. In the last few years, the UI has been refined
until it's as clean and easy to use as any system out there, but often,
the stereotype lingers about the early Git UI that was complex and
difficult to learn.
The content-addressable filesystem layer is amazingly cool, so I'll
cover that first in this chapter; then, you'll learn about the transport
mechanisms and the repository maintenance tasks that you may
eventually have to deal with.
9.1 Plumbing and Porcelain
This book covers how to use Git with 30 or so verbs such as checkout,
branch, remote, and so on. But because Git was initially a toolkit for a
VCS rather than a full user-friendly VCS, it has a bunch of verbs that
do low-level work and were designed to be chained together UNIX
style or called from scripts. These commands are generally referred
to as “plumbing” commands, and the more user-friendly commands
are called “porcelain” commands.
The book's first eight chapters deal almost exclusively with porcelain
commands. But in this chapter, you'll be dealing mostly with the
lower-level plumbing commands, because they give you access to the
inner workings of Git and help demonstrate how and why Git does
what it does. These commands aren't meant to be used manually on
the command line, but rather to be used as building blocks for new
tools and custom scripts.
When you run git init in a new or existing directory, Git creates
the .git directory, which is where almost everything that Git stores
and manipulates is located. If you want to back up or clone your
repository, copying this single directory elsewhere gives you nearly
everything you need. This entire chapter basically deals with the
stuff in this directory. Here's what it looks like:
$ ls
HEAD
branches/
config
description
hooks/
index
info/
objects/
refs/
You may see some other files in there, but this is a fresh git init
repository; it's what you see by default. The branches directory isn't
used by newer Git versions, and the description file is only used by
the GitWeb program, so don't worry about those. The config file contains
your project-specific configuration options, and the info directory
keeps a global exclude file for ignored patterns that you don't
want to track in a .gitignore file. The hooks directory contains your
client- or server-side hook scripts, which are discussed in detail in
Chapter 7.
This leaves four important entries: the HEAD and index files and the
objects and refs directories. These are the core parts of Git. The
objects directory stores all the content for your database, the refs
directory stores pointers into commit objects in that data (branches),
the HEAD file points to the branch you currently have checked out, and
the index file is where Git stores your staging area information. You'll
now look at each of these sections in detail to see how Git operates.
9.2 Git Objects
Git is a content-addressable filesystem. Great. What does that mean?
It means that at the core of Git is a simple key-value data store. You
can insert any kind of content into it, and it will give you back a key
that you can use to retrieve the content again at any time. To demonstrate,
you can use the plumbing command hash-object, which takes
some data, stores it in your .git directory, and gives you back the
key the data is stored as. First, you initialize a new Git repository
and verify that there is nothing in the objects directory:
$ mkdir test
$ cd test
$ git init
Initialized empty Git repository in /tmp/test/.git/
$ find .git/objects
.git/objects
.git/objects/info
.git/objects/pack
$ find .git/objects -type f
$
Git has initialized the objects directory and created pack and info
subdirectories in it, but there are no regular files. Now, store some
text in your Git database:
$ echo 'test content' | git hash-object -w --stdin
d670460b4b4aece5915caf5c68d12f560a9fe3e4
The -w tells hash-object to store the object; otherwise, the command
simply tells you what the key would be. --stdin tells the command
to read the content from stdin; if you don't specify this, hash-object
expects the path to a file. The output from the command is a
40-character checksum hash. This is the SHA-1 hash, a checksum
of the content you're storing plus a header, which you'll learn about
in a bit. Now you can see how Git has stored your data:
$ find .git/objects -type f
.git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4
You can see a file in the objects directory. This is how Git stores the
content initially: as a single file per piece of content, named with
the SHA-1 checksum of the content and its header. The subdirectory
is named with the first 2 characters of the SHA, and the filename is
the remaining 38 characters.
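To make the key-value idea concrete, here is a toy in-memory store sketched in Ruby. It is not how Git stores data on disk; it only shows that the key is derived from the content itself. The class name ToyStore is hypothetical, and the header format it uses ("blob", a space, the size in bytes, and a null byte) is the one explained in the Object Storage section later in this chapter.

```ruby
require 'digest/sha1'

# A toy content-addressable store: the key is the SHA-1 of a small header
# plus the content, so identical content always gets the same key.
class ToyStore
  def initialize
    @objects = {}
  end

  # Store content as a blob and return its 40-character key.
  def put(content)
    key = Digest::SHA1.hexdigest("blob #{content.bytesize}\0#{content}")
    @objects[key] = content
    key
  end

  # Retrieve content by key; raises KeyError for an unknown key.
  def get(key)
    @objects.fetch(key)
  end
end

store = ToyStore.new
key = store.put("test content\n")   # echo appends the trailing newline
puts key
puts ".git/objects/#{key[0, 2]}/#{key[2, 38]}"
```

The key printed here should match the one hash-object reported above, and the second line shows how that key maps to a subdirectory and filename.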
You can pull the content back out of Git with the cat-file command.
This command is sort of a Swiss army knife for inspecting Git
objects. Passing -p to it instructs the cat-file command to figure out
the type of content and display it nicely for you:
$ git cat-file -p d670460b4b4aece5915caf5c68d12f560a9fe3e4
test content
Now, you can add content to Git and pull it back out again. You
can also do this with content in files. For example, you can do some
simple version control on a file. First, create a new file and save its
contents in your database:
$ echo 'version 1' > test.txt
$ git hash-object -w test.txt
83baae61804e65cc73a7201a7252750c76066a30
Then, write some new content to the file, and save it again:
$ echo 'version 2' > test.txt
$ git hash-object -w test.txt
1f7a7a472abf3dd9643fd615f6da379c4acb3e3a
Your database contains the two new versions of the file as well as
the first content you stored there:
$ find .git/objects -type f
.git/objects/1f/7a7a472abf3dd9643fd615f6da379c4acb3e3a
.git/objects/83/baae61804e65cc73a7201a7252750c76066a30
.git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4
Now you can revert the file back to the first version
$ git cat-file -p 83baae61804e65cc73a7201a7252750c76066a30 > test.txt
$ cat test.txt
version 1
or the second version:
$ git cat-file -p 1f7a7a472abf3dd9643fd615f6da379c4acb3e3a > test.txt
$ cat test.txt
version 2
But remembering the SHA-1 key for each version of your file isn't
practical; plus, you aren't storing the filename in your system, just
the content. This object type is called a blob. You can have Git tell
you the object type of any object in Git, given its SHA-1 key, with
cat-file -t:
$ git cat-file -t 1f7a7a472abf3dd9643fd615f6da379c4acb3e3a
blob
9.2.1 Tree Objects
The next type you'll look at is the tree object, which solves the problem
of storing the filename and also allows you to store a group of
files together. Git stores content in a manner similar to a UNIX
filesystem, but a bit simplified. All the content is stored as tree and
blob objects, with trees corresponding to UNIX directory entries and
blobs corresponding more or less to inodes or file contents. A single
tree object contains one or more tree entries, each of which contains
a SHA-1 pointer to a blob or subtree with its associated mode, type,
and filename. For example, the most recent tree in the simplegit
project may look something like this:
$ git cat-file -p master^{tree}
100644 blob a906cb2a4a904a152e80877d4088654daad0c859 README
100644 blob 8f94139338f9404f26296befa88755fc2598c289 Rakefile
040000 tree 99f1a6d12cb4b6f19c8655fca46c3ecf317074e0 lib
The master^{tree} syntax specifies the tree object that is pointed to
by the last commit on your master branch. Notice that the lib subdirectory
isn't a blob but a pointer to another tree:
$ git cat-file -p 99f1a6d12cb4b6f19c8655fca46c3ecf317074e0
100644 blob 47c6340d6459e05787f644c2447d2595f5d3a54b simplegit.rb
Conceptually, the data that Git is storing is something like Figure
9.1.
Figure 9.1: Simple version of the Git data model
You can create your own tree. Git normally creates a tree by taking
the state of your staging area or index and writing a tree object
from it. So, to create a tree object, you first have to set up an index by
staging some files. To create an index with a single entry (the first
version of your test.txt file), you can use the plumbing command
update-index. You use this command to artificially add the earlier version
of the test.txt file to a new staging area. You must pass it the --add
option because the file doesn't yet exist in your staging area (you
don't even have a staging area set up yet) and --cacheinfo because
the file you're adding isn't in your directory but is in your database.
Then, you specify the mode, SHA-1, and filename:
$ git update-index --add --cacheinfo 100644 \
83baae61804e65cc73a7201a7252750c76066a30 test.txt
In this case, you're specifying a mode of 100644, which means it's a
normal file. Other options are 100755, which means it's an executable
file, and 120000, which specifies a symbolic link. The mode is taken
from normal UNIX modes but is much less flexible: these three
modes are the only ones that are valid for files (blobs) in Git (although
other modes are used for directories and submodules).
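As a quick reference, the modes can be written down as a small Ruby table, together with a sketch of how a tool might pick one for a path in the working directory. The helper name blob_mode is hypothetical; checking only the owner's execute bit is an assumption about how Git normalizes permissions, and 160000 is the mode Git uses for submodule entries.

```ruby
# Modes Git records in tree entries (the last two are not for regular files).
GIT_MODES = {
  '100644' => 'normal file',
  '100755' => 'executable file',
  '120000' => 'symbolic link',
  '040000' => 'directory (tree)',
  '160000' => 'submodule (gitlink)'
}

# Hypothetical helper: pick the blob mode for a path in the working directory.
def blob_mode(path)
  stat = File.lstat(path)            # lstat so symlinks are not followed
  return '120000' if stat.symlink?
  (stat.mode & 0100).zero? ? '100644' : '100755'
end
```

Whatever other permission bits a file has on disk, only these values would ever be handed to update-index.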
Now, you can use the write-tree command to write the staging
area out to a tree object. No -w option is needed; calling write-tree
automatically creates a tree object from the state of the index if that
tree doesn't yet exist:
$ git write-tree
d8329fc1cc938780ffdd9f94e0d364e0ea74f579
$ git cat-file -p d8329fc1cc938780ffdd9f94e0d364e0ea74f579
100644 blob 83baae61804e65cc73a7201a7252750c76066a30 test.txt
You can also verify that this is a tree object:
$ git cat-file -t d8329fc1cc938780ffdd9f94e0d364e0ea74f579
tree
You'll now create a new tree with the second version of test.txt
and a new file as well:
$ echo 'new file' > new.txt
$ git update-index test.txt
$ git update-index --add new.txt
Your staging area now has the new version of test.txt as well as
the new file new.txt. Write out that tree (recording the state of the
staging area or index to a tree object) and see what it looks like:
$ git write-tree
0155eb4229851634a0f03eb265b69f5a2d56f341
$ git cat-file -p 0155eb4229851634a0f03eb265b69f5a2d56f341
100644 blob fa49b077972391ad58037050f2a75f74e3671e92 new.txt
100644 blob 1f7a7a472abf3dd9643fd615f6da379c4acb3e3a test.txt
Notice that this tree has both file entries and also that the test.txt
SHA is the “version 2” SHA from earlier (1f7a7a). Just for fun, you'll
add the first tree as a subdirectory into this one. You can read trees
into your staging area by calling read-tree. In this case, you can read
an existing tree into your staging area as a subtree by using the --prefix
option to read-tree:
$ git read-tree --prefix=bak d8329fc1cc938780ffdd9f94e0d364e0ea74f579
$ git write-tree
3c4e9cd789d88d8d89c1073707c3585e41b0e614
$ git cat-file -p 3c4e9cd789d88d8d89c1073707c3585e41b0e614
040000 tree d8329fc1cc938780ffdd9f94e0d364e0ea74f579 bak
100644 blob fa49b077972391ad58037050f2a75f74e3671e92 new.txt
100644 blob 1f7a7a472abf3dd9643fd615f6da379c4acb3e3a test.txt
If you created a working directory from the new tree you just
wrote, you would get the two files in the top level of the working
directory and a subdirectory named bak that contained the first version
of the test.txt file. You can think of the data that Git contains for
these structures as being like Figure 9.2.
9.2.2 Commit Objects
You have three trees that specify the different snapshots of your
project that you want to track, but the earlier problem remains: you
must remember all three SHA-1 values in order to recall the snapshots.
You also don't have any information about who saved the snapshots,
when they were saved, or why they were saved. This is the
basic information that the commit object stores for you.
To create a commit object, you call commit-tree and specify a single
tree SHA-1 and which commit objects, if any, directly preceded it.
Start with the first tree you wrote:
Figure 9.2: The content structure of your current Git data
$ echo 'first commit' | git commit-tree d8329f
fdf4fc3344e67ab068f836878b6c4951e3b15f3d
Now you can look at your new commit object with cat-file:
$ git cat-file -p fdf4fc3
tree d8329fc1cc938780ffdd9f94e0d364e0ea74f579
author Scott Chacon <schacon@gmail.com> 1243040974 -0700
committer Scott Chacon <schacon@gmail.com> 1243040974 -0700
first commit
The format for a commit object is simple: it specifies the top-level
tree for the snapshot of the project at that point; the author/committer
information pulled from your user.name and user.email configuration
settings, with the current timestamp; a blank line, and
then the commit message.
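The layout can be sketched as a small Ruby function that assembles the text of a commit object. The helper name commit_content is hypothetical. The parent lines are not visible in the first commit shown above because it has no predecessor; in a commit that does have one, Git records each as a "parent" line between the tree and author lines.

```ruby
# Hypothetical helper: assemble the text of a commit object.
def commit_content(tree, author, committer, message, parents = [])
  lines = ["tree #{tree}"]
  parents.each { |parent| lines << "parent #{parent}" }
  lines << "author #{author}"
  lines << "committer #{committer}"
  # A blank line separates the metadata from the commit message.
  lines.join("\n") + "\n\n" + message + "\n"
end

who = 'Scott Chacon <schacon@gmail.com> 1243040974 -0700'
puts commit_content('d8329fc1cc938780ffdd9f94e0d364e0ea74f579', who, who, 'first commit')
```

This only builds the content; storing it would still require the header, checksum, and compression steps described in the Object Storage section.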
Next, you'll write the other two commit objects, each referencing
the commit that came directly before it:
$ echo 'second commit' | git commit-tree 0155eb -p fdf4fc3
cac0cab538b970a37ea1e769cbbde608743bc96d
$ echo 'third commit' | git commit-tree 3c4e9c -p cac0cab
1a410efbd13591db07496601ebc7a059dd55cfe9
Each of the three commit objects points to one of the three snapshot
trees you created. Oddly enough, you have a real Git history
now that you can view with the git log command, if you run it on the
last commit SHA-1:
$ git log --stat 1a410e
commit 1a410efbd13591db07496601ebc7a059dd55cfe9
Author: Scott Chacon <schacon@gmail.com>
Date: Fri May 22 18:15:24 2009 -0700
third commit
bak/test.txt | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
commit cac0cab538b970a37ea1e769cbbde608743bc96d
Author: Scott Chacon <schacon@gmail.com>
Date: Fri May 22 18:14:29 2009 -0700
second commit
new.txt | 1 +
test.txt | 2 +-
2 files changed, 2 insertions(+), 1 deletions(-)
commit fdf4fc3344e67ab068f836878b6c4951e3b15f3d
Author: Scott Chacon <schacon@gmail.com>
Date: Fri May 22 18:09:34 2009 -0700
first commit
test.txt | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
Amazing. You've just done the low-level operations to build up a
Git history without using any of the front ends. This is essentially
what Git does when you run the git add and git commit commands:
it stores blobs for the files that have changed, updates the index,
writes out trees, and writes commit objects that reference the
top-level trees and the commits that came immediately before them.
These three main Git objects (the blob, the tree, and the commit)
are initially stored as separate files in your .git/objects directory.
Here are all the objects in the example directory now, commented
with what they store:
$ find .git/objects -type f
.git/objects/01/55eb4229851634a0f03eb265b69f5a2d56f341 # tree 2
.git/objects/1a/410efbd13591db07496601ebc7a059dd55cfe9 # commit 3
.git/objects/1f/7a7a472abf3dd9643fd615f6da379c4acb3e3a # test.txt v2
.git/objects/3c/4e9cd789d88d8d89c1073707c3585e41b0e614 # tree 3
.git/objects/83/baae61804e65cc73a7201a7252750c76066a30 # test.txt v1
.git/objects/ca/c0cab538b970a37ea1e769cbbde608743bc96d # commit 2
.git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4 # 'test content'
.git/objects/d8/329fc1cc938780ffdd9f94e0d364e0ea74f579 # tree 1
.git/objects/fa/49b077972391ad58037050f2a75f74e3671e92 # new.txt
.git/objects/fd/f4fc3344e67ab068f836878b6c4951e3b15f3d # commit 1
If you follow all the internal pointers, you get an object graph
something like Figure 9.3.
9.2.3 Object Storage
I mentioned earlier that a header is stored with the content. Let's
take a minute to look at how Git stores its objects. You'll see how
to store a blob object (in this case, the string “what is up, doc?”)
interactively in the Ruby scripting language. You can start up
interactive Ruby mode with the irb command:
Figure 9.3: All the objects in your Git directory
$ irb
>> content = "what is up, doc?"
=> "what is up, doc?"
Git constructs a header that starts with the type of the object, in
this case a blob. Then, it adds a space followed by the size of the
content and finally a null byte:
>> header = "blob #{content.length}\0"
=> "blob 16\000"
Git concatenates the header and the original content and then calculates
the SHA-1 checksum of that new content. You can calculate
the SHA-1 value of a string in Ruby by including the SHA1 digest library
with the require command and then calling Digest::SHA1.hexdigest()
with the string:
>> store = header + content
=> "blob 16\000what is up, doc?"
>> require 'digest/sha1'
=> true
>> sha1 = Digest::SHA1.hexdigest(store)
=> "bd9dbf5aae1a3862dd1526723246b20206e5fc37"
Git compresses the new content with zlib, which you can do in
Ruby with the zlib library. First, you need to require the library and
then run Zlib::Deflate.deflate() on the content:
>> require 'zlib'
=> true
>> zlib_content = Zlib::Deflate.deflate(store)
=> "x\234K\312\311OR04c(\317H,Q\310,V(-\320QH\311O\266\a\000_\034\a\235"
Finally, you'll write your zlib-deflated content to an object on disk.
You'll determine the path of the object you want to write out (the
first two characters of the SHA-1 value being the subdirectory name,
and the last 38 characters being the filename within that directory).
In Ruby, you can use the FileUtils.mkdir_p() function to create the
subdirectory if it doesn't exist. Then, open the file with File.open()
and write out the previously zlib-compressed content to the file with
a write() call on the resulting file handle:
>> path = '.git/objects/' + sha1[0,2] + '/' + sha1[2,38]
=> ".git/objects/bd/9dbf5aae1a3862dd1526723246b20206e5fc37"
>> require 'fileutils'
=> true
>> FileUtils.mkdir_p(File.dirname(path))
=> ".git/objects/bd"
>> File.open(path, 'w') { |f| f.write zlib_content }
=> 32
That's it: you've created a valid Git blob object. All Git objects
are stored the same way, just with different types; instead of the
string blob, the header will begin with commit or tree. Also, although
the blob content can be nearly anything, the commit and tree content
are very specifically formatted.
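To illustrate both points, the sketch below wraps arbitrary content in the type header and then builds the content of a tree by hand. It assumes Ruby 2.0 or later (for String#b) and the raw tree layout Git uses: for each entry, the mode, a space, the filename, a null byte, and the 20-byte binary SHA-1. The helper names are hypothetical, and the sketch does not sort entries or handle subdirectories (whose raw mode is written as 40000).

```ruby
require 'digest/sha1'
require 'zlib'

# Wrap content in the "<type> <size>\0" header described above.
def git_object(type, content)
  "#{type} #{content.bytesize}\0".b + content.b
end

# Serialize tree entries given as [mode, name, hex_sha] triples.
def tree_content(entries)
  entries.map do |mode, name, hex_sha|
    "#{mode} #{name}\0".b + [hex_sha].pack('H*')
  end.join
end

blob = git_object('blob', 'what is up, doc?')
puts Digest::SHA1.hexdigest(blob)

tree = git_object('tree', tree_content([
  ['100644', 'test.txt', '83baae61804e65cc73a7201a7252750c76066a30']
]))
puts Digest::SHA1.hexdigest(tree)

# Round trip: inflating the compressed bytes gives back header plus content.
header, body = Zlib::Inflate.inflate(Zlib::Deflate.deflate(blob)).split("\0", 2)
puts header
puts body
```

The first checksum should match the one computed in irb above, and the second should match the tree that write-tree produced for the single test.txt entry earlier in the chapter.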
9.3 Git References
You can run something like git log 1a410e to look through your whole
history, but you still have to remember that 1a410e is the last commit
in order to walk that history to find all those objects. You need a file
in which you can store the SHA-1 value under a simple name so you
can use that pointer rather than the raw SHA-1 value.
In Git, these are called “references” or “refs”; you can find the
files that contain the SHA-1 values in the .git/refs directory. In the
current project, this directory contains no files, but it does contain a
simple structure:
$ find .git/refs
.git/refs
.git/refs/heads
.git/refs/tags
$ find .git/refs -type f
$
To create a new reference that will help you remember where your
latest commit is, you can technically do something as simple as this:
$ echo "1a410efbd13591db07496601ebc7a059dd55cfe9" > .git/refs/heads/master
Now, you can use the head reference you just created instead of
the SHA-1 value in your Git commands:
$ git log --pretty=oneline master
1a410efbd13591db07496601ebc7a059dd55cfe9 third commit
cac0cab538b970a37ea1e769cbbde608743bc96d second commit
fdf4fc3344e67ab068f836878b6c4951e3b15f3d first commit
You aren't encouraged to directly edit the reference files. If you
want to update a reference, Git provides a safer command to do this,
called update-ref:
$ git update-ref refs/heads/master 1a410efbd13591db07496601ebc7a059dd55cfe9
That's basically what a branch in Git is: a simple pointer or reference
to the head of a line of work. To create a branch back at the
second commit, you can do this:
$ git update-ref refs/heads/test cac0ca
Your branch will contain only work from that commit down:
$ git log --pretty=oneline test
cac0cab538b970a37ea1e769cbbde608743bc96d second commit
fdf4fc3344e67ab068f836878b6c4951e3b15f3d first commit
Now, your Git database conceptually looks something like Figure
9.4.
Figure 9.4: Git directory objects with branch head references
included
When you run commands like git branch (branchname), Git basically
runs that update-ref command to add the SHA-1 of the last commit
of the branch you're on into whatever new reference you want to
create.
9.3.1 The HEAD
The question now is, when you run git branch (branchname), how does
Git know the SHA-1 of the last commit? The answer is the HEAD file.
The HEAD file is a symbolic reference to the branch you're currently
on. By symbolic reference, I mean that unlike a normal reference, it
doesn't generally contain a SHA-1 value but rather a pointer to another
reference. If you look at the file, you'll normally see something
like this:
$ cat .git/HEAD
ref: refs/heads/master
If you run git checkout test, Git updates the file to look like this:
$ cat .git/HEAD
ref: refs/heads/test
When you run git commit, it creates the commit object, specifying
the parent of that commit object to be whatever SHA-1 value the
reference in HEAD points to.
You can aIso manuaIIy edIt thIs fiIe, but agaIn a saIer command
exIsts to do so. symbolic-ref. You can read the vaIue oI your I£AÐ
vIa thIs command.
$ git symbolic-ref HEAD
refs/heads/master
You can also set the value of HEAD:
$ git symbolic-ref HEAD refs/heads/test
$ cat .git/HEAD
ref: refs/heads/test
You can't set a symbolic reference outside of the refs style:
$ git symbolic-ref HEAD test
fatal: Refusing to point HEAD outside of refs/
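The lookup that HEAD makes possible can be sketched in a few lines of
Python. This is only an illustration of the idea, not how Git itself
is implemented: the names resolve_ref and make_demo_git_dir are
hypothetical, the sketch reads loose reference files only, and it
ignores every other place Git may store references.

```python
import os
import tempfile


def resolve_ref(git_dir, name="HEAD", max_depth=5):
    """Follow 'ref: ...' pointers until a file holding a SHA-1 is reached.

    Returns (final_reference_name, sha1). Hypothetical helper: it only
    reads loose reference files under git_dir.
    """
    for _ in range(max_depth):
        with open(os.path.join(git_dir, name)) as f:
            value = f.read().strip()
        if not value.startswith("ref: "):
            return name, value          # a normal reference: a SHA-1 value
        name = value[len("ref: "):]     # a symbolic reference: keep following
    raise ValueError("too many levels of symbolic references")


def make_demo_git_dir():
    """Build a throwaway directory laid out like the .git examples above."""
    git_dir = tempfile.mkdtemp()
    os.makedirs(os.path.join(git_dir, "refs", "heads"))
    with open(os.path.join(git_dir, "HEAD"), "w") as f:
        f.write("ref: refs/heads/master\n")
    with open(os.path.join(git_dir, "refs", "heads", "master"), "w") as f:
        f.write("1a410efbd13591db07496601ebc7a059dd55cfe9\n")
    return git_dir
```

Calling resolve_ref(make_demo_git_dir()) follows HEAD to
refs/heads/master and returns the SHA-1 stored there, which is the
value a new commit would record as its parent.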
9.3.2 Tags
You've just gone over Git's three main object types, but there is a
fourth. The tag object is very much like a commit object: it contains
a tagger, a date, a message, and a pointer. The main difference is that
a tag object points to a commit rather than a tree. It's like a branch
reference, but it never moves: it always points to the same commit
but gives it a friendlier name.
As discussed in Chapter 2, there are two types of tags: annotated
and lightweight. You can make a lightweight tag by running
something like this:
$ git update-ref refs/tags/v1.0 cac0cab538b970a37ea1e769cbbde608743bc96d
That is all a lightweight tag is: a branch that never moves. An
annotated tag is more complex, however. If you create an annotated
tag, Git creates a tag object and then writes a reference to point to
it rather than directly to the commit. You can see this by creating an
annotated tag (-a specifies that it's an annotated tag):
$ git tag -a v1.1 1a410efbd13591db07496601ebc7a059dd55cfe9 -m 'test tag'
Here's the object SHA-1 value it created:
$ cat .git/refs/tags/v1.1
9585191f37f7b0fb9444f35a9bf50de191beadc2
Now, run the cat-file command on that SHA-1 value:
$ git cat-file -p 9585191f37f7b0fb9444f35a9bf50de191beadc2
object 1a410efbd13591db07496601ebc7a059dd55cfe9
type commit
tag v1.1
tagger Scott Chacon <schacon@gmail.com> Sat May 23 16:48:58 2009 -0700
test tag
Notice that the object entry points to the commit SHA-1 value that
you tagged. Also notice that it doesn't need to point to a commit;
you can tag any Git object. In the Git source code, for example, the
maintainer has added their GPG public key as a blob object and then
tagged it. You can view the public key by running
$ git cat-file blob junio-gpg-pub
in the Git source code. The Linux kernel also has a
non-commit-pointing tag object: the first tag created points to the
initial tree of the import of the source code.
9.3.3 Remotes
The third type of reference that you'll see is a remote reference. If
you add a remote and push to it, Git stores the value you last pushed
to that remote for each branch in the refs/remotes directory. For
instance, you can add a remote called origin and push your master
branch to it:
$ git remote add origin git@github.com:schacon/simplegit-progit.git
$ git push origin master
Counting objects: 11, done.
Compressing objects: 100% (5/5), done.
Writing objects: 100% (7/7), 716 bytes, done.
Total 7 (delta 2), reused 4 (delta 1)
To git@github.com:schacon/simplegit-progit.git
a11bef0..ca82a6d master -> master
Then, you can see what the master branch on the origin remote
was the last time you communicated with the server, by checking
the refs/remotes/origin/master file:
$ cat .git/refs/remotes/origin/master
ca82a6dff817ec66f44342007202690a93763949
Remote references differ from branches (refs/heads references)
mainly in that they can't be checked out. Git moves them around as
bookmarks to the last known state of where those branches were on
those servers.
9.4 Packfiles
Let's go back to the objects database for your test Git repository. At
this point, you have 11 objects: 4 blobs, 3 trees, 3 commits, and 1
tag.
$ find .git/objects -type f
.git/objects/01/55eb4229851634a0f03eb265b69f5a2d56f341 # tree 2
.git/objects/1a/410efbd13591db07496601ebc7a059dd55cfe9 # commit 3
.git/objects/1f/7a7a472abf3dd9643fd615f6da379c4acb3e3a # test.txt v2
.git/objects/3c/4e9cd789d88d8d89c1073707c3585e41b0e614 # tree 3
.git/objects/83/baae61804e65cc73a7201a7252750c76066a30 # test.txt v1
.git/objects/95/85191f37f7b0fb9444f35a9bf50de191beadc2 # tag
.git/objects/ca/c0cab538b970a37ea1e769cbbde608743bc96d # commit 2
.git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4 # 'test content'
.git/objects/d8/329fc1cc938780ffdd9f94e0d364e0ea74f579 # tree 1
.git/objects/fa/49b077972391ad58037050f2a75f74e3671e92 # new.txt
.git/objects/fd/f4fc3344e67ab068f836878b6c4951e3b15f3d # commit 1
Git compresses the contents of these files with zlib, and you're not
storing much, so all these files collectively take up only 925 bytes.
You'll add some larger content to the repository to demonstrate an
interesting feature of Git. Add the repo.rb file from the Grit library
you worked with earlier; this is about a 12K source code file:
$ curl http://github.com/mojombo/grit/raw/master/lib/grit/repo.rb > repo.rb
$ git add repo.rb
$ git commit -m 'added repo.rb'
[master 484a592] added repo.rb
3 files changed, 459 insertions(+), 2 deletions(-)
delete mode 100644 bak/test.txt
create mode 100644 repo.rb
rewrite test.txt (100%)
If you look at the resulting tree, you can see the SHA-1 value your
repo.rb file got for the blob object:
$ git cat-file -p master^{tree}
100644 blob fa49b077972391ad58037050f2a75f74e3671e92 new.txt
100644 blob 9bc1dc421dcd51b4ac296e3e5b6e2a99cf44391e repo.rb
100644 blob e3f094f522629ae358806b17daf78246c27c007b test.txt
You can then use git cat-file to see how big that object is:
$ git cat-file -s 9bc1dc421dcd51b4ac296e3e5b6e2a99cf44391e
12898
Now, modify that file a little, and see what happens:
$ echo '# testing' >> repo.rb
$ git commit -am 'modified repo a bit'
[master ab1afef] modified repo a bit
1 files changed, 1 insertions(+), 0 deletions(-)
Check the tree created by that commit, and you see something
interesting:
$ git cat-file -p master^{tree}
100644 blob fa49b077972391ad58037050f2a75f74e3671e92 new.txt
100644 blob 05408d195263d853f09dca71d55116663690c27c repo.rb
100644 blob e3f094f522629ae358806b17daf78246c27c007b test.txt
The blob is now a different blob, which means that although you
added only a single line to the end of a 400-line file, Git stored that
new content as a completely new object:
$ git cat-file -s 05408d195263d853f09dca71d55116663690c27c
12908
You have two nearly identical 12K objects on your disk. Wouldn't
it be nice if Git could store one of them in full but then the second
object only as the delta between it and the first?
It turns out that it can. The initial format in which Git saves
objects on disk is called a loose object format. However, occasionally
Git packs up several of these objects into a single binary file called
a packfile in order to save space and be more efficient. Git does
this if you have too many loose objects around, if you run the git gc
command manually, or if you push to a remote server. To see what
happens, you can manually ask Git to pack up the objects by calling
the git gc command:
$ git gc
Counting objects: 17, done.
Delta compression using 2 threads.
Compressing objects: 100% (13/13), done.
Writing objects: 100% (17/17), done.
Total 17 (delta 1), reused 10 (delta 0)
If you look in your objects directory, you'll find that most of your
objects are gone, and a new pair of files has appeared:
$ find .git/objects -type f
.git/objects/71/08f7ecb345ee9d0084193f147cdad4d2998293
.git/objects/d6/70460b4b4aece5915caf5c68d12f560a9fe3e4
.git/objects/info/packs
.git/objects/pack/pack-7a16e4488ae40c7d2bc56ea2bd43e25212a66c45.idx
.git/objects/pack/pack-7a16e4488ae40c7d2bc56ea2bd43e25212a66c45.pack
The objects that remain are the blobs that aren't pointed to by
any commit: in this case, the “what is up, doc?” example and the
“test content” example blobs you created earlier. Because you never
added them to any commits, they're considered dangling and aren't
packed up in your new packfile.
The other files are your new packfile and an index. The packfile
is a single file containing the contents of all the objects that were
removed from your filesystem. The index is a file that contains offsets
into that packfile so you can quickly seek to a specific object. What
is cool is that although the objects on disk before you ran the gc were
collectively about 12K in size, the new packfile is only 6K. You've
halved your disk usage by packing your objects.
How does Git do this? When Git packs objects, it looks for files
that are named and sized similarly, and stores just the deltas from
one version of the file to the next. You can look into the packfile
and see what Git did to save space. The git verify-pack plumbing
command allows you to see what was packed up:
$ git verify-pack -v \
.git/objects/pack/pack-7a16e4488ae40c7d2bc56ea2bd43e25212a66c45.idx
0155eb4229851634a0f03eb265b69f5a2d56f341 tree 71 76 5400
05408d195263d853f09dca71d55116663690c27c blob 12908 3478 874
09f01cea547666f58d6a8d809583841a7c6f0130 tree 106 107 5086
1a410efbd13591db07496601ebc7a059dd55cfe9 commit 225 151 322
1f7a7a472abf3dd9643fd615f6da379c4acb3e3a blob 10 19 5381
3c4e9cd789d88d8d89c1073707c3585e41b0e614 tree 101 105 5211
484a59275031909e19aadb7c92262719cfcdf19a commit 226 153 169
83baae61804e65cc73a7201a7252750c76066a30 blob 10 19 5362
9585191f37f7b0fb9444f35a9bf50de191beadc2 tag 136 127 5476
9bc1dc421dcd51b4ac296e3e5b6e2a99cf44391e blob 7 18 5193 1 \
05408d195263d853f09dca71d55116663690c27c
ab1afef80fac8e34258ff41fc1b867c702daa24b commit 232 157 12
cac0cab538b970a37ea1e769cbbde608743bc96d commit 226 154 473
d8329fc1cc938780ffdd9f94e0d364e0ea74f579 tree 36 46 5316
e3f094f522629ae358806b17daf78246c27c007b blob 1486 734 4352
f8f51d7d8a1760462eca26eebafde32087499533 tree 106 107 749
fa49b077972391ad58037050f2a75f74e3671e92 blob 9 18 856
fdf4fc3344e67ab068f836878b6c4951e3b15f3d commit 177 122 627
chain length = 1: 1 object
pack-7a16e4488ae40c7d2bc56ea2bd43e25212a66c45.pack: ok
Here, the 9bc1d blob, which if you remember was the first version
of your repo.rb file, is referencing the 05408 blob, which was the
second version of the file. The third column in the output is the size
of the object in the pack, so you can see that 05408 takes up 12K of
the file but that 9bc1d only takes up 7 bytes. What is also interesting
is that the second version of the file is the one that is stored intact,
whereas the original version is stored as a delta; this is because
you're most likely to need faster access to the most recent version of
the file.
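If you want to pick the delta entries out of that listing
programmatically, a small parser is enough. The sketch below is
hypothetical helper code, not part of Git: it assumes each object line
has five whitespace-separated fields (SHA-1, type, size, size in the
packfile, offset) and that a deltified object carries two more (delta
depth and base SHA-1), as in the output above once the wrapped line is
joined.

```python
from typing import NamedTuple, Optional


class PackEntry(NamedTuple):
    sha: str
    type: str
    size: int
    size_in_pack: int
    offset: int
    depth: Optional[int] = None   # only present for deltified objects
    base: Optional[str] = None    # SHA-1 of the object the delta applies to


def parse_verify_pack(text):
    """Parse `git verify-pack -v` output into PackEntry tuples.

    Summary lines such as 'chain length = 1: 1 object' are skipped.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) not in (5, 7) or len(fields[0]) != 40:
            continue
        sha, kind = fields[0], fields[1]
        size, size_in_pack, offset = (int(x) for x in fields[2:5])
        if len(fields) == 7:
            entries.append(PackEntry(sha, kind, size, size_in_pack, offset,
                                     int(fields[5]), fields[6]))
        else:
            entries.append(PackEntry(sha, kind, size, size_in_pack, offset))
    return entries


SAMPLE = """\
05408d195263d853f09dca71d55116663690c27c blob 12908 3478 874
9bc1dc421dcd51b4ac296e3e5b6e2a99cf44391e blob 7 18 5193 1 05408d195263d853f09dca71d55116663690c27c
chain length = 1: 1 object
"""
```

Filtering parse_verify_pack(SAMPLE) for entries whose base is set
leaves only the 9bc1d blob, with 05408 as its base.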
The really nice thing about this is that it can be repacked at any
time. Git will occasionally repack your database automatically,
always trying to save more space. You can also manually repack at
any time by running git gc by hand.
9.5 The Refspec
Throughout this book, you've used simple mappings from remote
branches to local references, but they can be more complex.
Suppose you add a remote like this:
$ git remote add origin git@github.com:schacon/simplegit-progit.git
It adds a section to your .git/config file, specifying the name of the
remote (origin), the URL of the remote repository, and the refspec for
fetching:
[remote "origin"]
url = git@github.com:schacon/simplegit-progit.git
fetch = +refs/heads/*:refs/remotes/origin/*
The format of the refspec is an optional +, followed by <src>:<dst>,
where <src> is the pattern for references on the remote side and <dst>
is where those references will be written locally. The + tells Git to
update the reference even if it isn't a fast-forward.
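To make the <src>:<dst> mapping concrete, here is a minimal Python
sketch of how a fetch refspec could be taken apart and applied to a
remote reference name. The names Refspec, parse_refspec, and map_ref
are invented for this illustration, and the sketch handles only a
trailing /* pattern; it is not Git's own matching code.

```python
from typing import NamedTuple


class Refspec(NamedTuple):
    force: bool   # True when the refspec starts with '+'
    src: str      # pattern for references on the remote side
    dst: str      # where matching references are written locally


def parse_refspec(spec):
    """Split '[+]<src>:<dst>' into its three parts."""
    force = spec.startswith("+")
    if force:
        spec = spec[1:]
    src, _, dst = spec.partition(":")
    return Refspec(force, src, dst)


def map_ref(refspec, remote_ref):
    """Return the local name for remote_ref, or None if it doesn't match."""
    if refspec.src.endswith("/*") and refspec.dst.endswith("/*"):
        src_prefix = refspec.src[:-1]   # drop '*', keep the trailing '/'
        dst_prefix = refspec.dst[:-1]
        if remote_ref.startswith(src_prefix):
            return dst_prefix + remote_ref[len(src_prefix):]
        return None
    return refspec.dst if remote_ref == refspec.src else None
```

With the refspec +refs/heads/*:refs/remotes/origin/*, the remote
reference refs/heads/master maps to refs/remotes/origin/master, and
the force flag is set because of the leading +.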
In the default case that is automatically written by a git remote
add command, Git fetches all the references under refs/heads/ on the
server and writes them to refs/remotes/origin/ locally. So, if there is
a master branch on the server, you can access the log of that branch
locally via
$ git log origin/master
$ git log remotes/origin/master
$ git log refs/remotes/origin/master
They're all equivalent, because Git expands each of them to
refs/remotes/origin/master.
If you want Git instead to pull down only the master branch each
time, and not every other branch on the remote server, you can
change the fetch line to
fetch = +refs/heads/master:refs/remotes/origin/master
This is just the default refspec for git fetch for that remote. If you
want to do something one time, you can specify the refspec on the
command line, too. To pull the master branch on the remote down to
origin/mymaster locally, you can run
$ git fetch origin master:refs/remotes/origin/mymaster
You can also specify multiple refspecs. On the command line, you
can pull down several branches like so:
$ git fetch origin master:refs/remotes/origin/mymaster \
topic:refs/remotes/origin/topic
From git@github.com:schacon/simplegit
! [rejected] master -> origin/mymaster (non fast forward)
* [new branch] topic -> origin/topic
In this case, the master branch pull was rejected because it wasn't
a fast-forward reference. You can override that by specifying the +
in front of the refspec.
You can also specify multiple refspecs for fetching in your
configuration file. If you want to always fetch the master and
experiment branches, add two lines:
[remote "origin"]
url = git@github.com:schacon/simplegit-progit.git
fetch = +refs/heads/master:refs/remotes/origin/master
fetch = +refs/heads/experiment:refs/remotes/origin/experiment
You can't use partial globs in the pattern, so this would be invalid:
fetch = +refs/heads/qa*:refs/remotes/origin/qa*
However, you can use namespacing to accomplish something like
that. If you have a QA team that pushes a series of branches, and you
want to get the master branch and any of the QA team's branches but
nothing else, you can use a config section like this:
[remote "origin"]
url = git@github.com:schacon/simplegit-progit.git
fetch = +refs/heads/master:refs/remotes/origin/master
fetch = +refs/heads/qa/*:refs/remotes/origin/qa/*
If you have a complex workflow process that has a QA team pushing
branches, developers pushing branches, and integration teams
pushing and collaborating on remote branches, you can namespace
them easily this way.
9.5.1 Pushing Refspecs
It's nice that you can fetch namespaced references that way, but how
does the QA team get their branches into a qa/ namespace in the first
place? You accomplish that by using refspecs to push.
If the QA team wants to push their master branch to qa/master on
the remote server, they can run
$ git push origin master:refs/heads/qa/master
If they want Git to do that automatically each time they run git
push origin, they can add a push value to their config file:
[remote "origin"]
url = git@github.com:schacon/simplegit-progit.git
fetch = +refs/heads/*:refs/remotes/origin/*
push = refs/heads/master:refs/heads/qa/master
Again, this will cause a git push origin to push the local master
branch to the remote qa/master branch by default.
9.5.2 Deleting References
You can also use the refspec to delete references from the remote
server by running something like this:
$ git push origin :topic
Because the refspec is <src>:<dst>, by leaving off the <src> part,
this basically says to make the topic branch on the remote nothing,
which deletes it.
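The rule that an empty <src> means deletion is easy to express in
code. The function below is a hypothetical illustration that only
understands refspecs written with an explicit colon; it is not how
git push itself interprets its arguments.

```python
def describe_push_refspec(spec):
    """Classify a '[+]<src>:<dst>' push refspec.

    Returns ('delete', dst) when <src> is empty and
    ('update', src, dst) otherwise.
    """
    if spec.startswith("+"):
        spec = spec[1:]
    src, sep, dst = spec.partition(":")
    if not sep or not dst:
        raise ValueError("this sketch needs the <src>:<dst> form")
    if not src:
        return ("delete", dst)        # nothing on the left: remove dst
    return ("update", src, dst)
```

For :topic this returns ('delete', 'topic'), while
master:refs/heads/qa/master is reported as an update of the remote
qa/master branch from the local master.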
9.6 Transfer Protocols
Git can transfer data between two repositories in two major ways:
over HTTP and via the so-called smart protocols used in the file://,
ssh://, and git:// transports. This section will quickly cover how
these two main protocols operate.
9.6.1 The Dumb Protocol
Git transport over HTTP is often referred to as the dumb protocol
because it requires no Git-specific code on the server side during
the transport process. The fetch process is a series of GET requests,
where the client can assume the layout of the Git repository on the
server. Let's follow the http-fetch process for the simplegit library:
$ git clone http://github.com/schacon/simplegit-progit.git
The first thing this command does is pull down the info/refs file.
This file is written by the update-server-info command, which is why
you need to enable that as a post-receive hook in order for the HTTP
transport to work properly.
=> GET info/refs
ca82a6dff817ec66f44342007202690a93763949 refs/heads/master
Now you have a list of the remote references and SHAs. Next,
you look for what the HEAD reference is so you know what to check
out when you're finished.
=> GET HEAD
ref: refs/heads/master
You need to check out the master branch when you've completed
the process. At this point, you're ready to start the walking process.
Because your starting point is the ca82a6 commit object you saw in
the info/refs file, you start by fetching that:
=> GET objects/ca/82a6dff817ec66f44342007202690a93763949
(179 bytes of binary data)
You get an object back: that object is in loose format on the
server, and you fetched it over a static HTTP GET request. You can
zlib-uncompress it, strip off the header, and look at the commit
content:
$ git cat-file -p ca82a6dff817ec66f44342007202690a93763949
tree cfda3bf379e4f8dba8717dee55aab78aef7f4daf
parent 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7
author Scott Chacon <schacon@gmail.com> 1205815931 -0700
committer Scott Chacon <schacon@gmail.com> 1240030591 -0700
changed the verison number
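The two steps the client just performed, turning a SHA into a request
path and unpacking what comes back, can be sketched as follows. The
function names are hypothetical, and the sketch assumes the loose
object layout described earlier in this chapter: zlib-compressed data
whose header is the object type, a space, the content size, and a
null byte.

```python
import zlib


def loose_object_path(sha):
    """Path of a loose object relative to the repository URL."""
    return "objects/%s/%s" % (sha[:2], sha[2:])


def decode_loose_object(raw):
    """Uncompress a loose object and strip off its header.

    Returns (object_type, content). Raises ValueError if the size in
    the header doesn't match the content that follows it.
    """
    data = zlib.decompress(raw)
    header, _, content = data.partition(b"\0")
    kind, size = header.split(b" ")
    if int(size) != len(content):
        raise ValueError("object size does not match its header")
    return kind.decode("ascii"), content
```

For the commit above, loose_object_path gives
objects/ca/82a6dff817ec66f44342007202690a93763949, which is exactly the
path used in the GET request.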
Next, you have two more objects to retrieve: cfda3b, which is
the tree of content that the commit we just retrieved points to, and
085bb3, which is the parent commit:
=> GET objects/08/5bb3bcb608e1e8451d4b2432f8ecbe6306e7e7
(179 bytes of data)
That gives you your next commit object. Grab the tree object:
=> GET objects/cf/da3bf379e4f8dba8717dee55aab78aef7f4daf
(404 - Not Found)
Oops: it looks like that tree object isn't in loose format on the
server, so you get a 404 response back. There are a couple of reasons
for this: the object could be in an alternate repository, or it could
be in a packfile in this repository. Git checks for any listed alternates
first:
=> GET objects/info/http-alternates
(empty file)
If this comes back with a list of alternate URLs, Git checks for
loose files and packfiles there; this is a nice mechanism for projects
that are forks of one another to share objects on disk. However,
because no alternates are listed in this case, your object must be
in a packfile. To see what packfiles are available on this server, you
need to get the objects/info/packs file, which contains a listing of them
(also generated by update-server-info):
=> GET objects/info/packs
P pack-816a9b2334da9953e530f27bcac22082a9f5b835.pack
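Reading that listing is straightforward. The sketch below uses
made-up function names and assumes every packfile appears on its own
line prefixed with "P ", as in the response above; from each name it
derives the paths of the index and pack you would request next.

```python
def parse_packs_listing(text):
    """Return the packfile names listed in an objects/info/packs file."""
    names = []
    for line in text.splitlines():
        if line.startswith("P "):
            names.append(line[2:].strip())
    return names


def pack_request_paths(pack_name):
    """Return (index_path, pack_path) for a name like 'pack-<sha>.pack'."""
    if not pack_name.endswith(".pack"):
        raise ValueError("expected a name ending in .pack")
    stem = pack_name[:-len(".pack")]
    return ("objects/pack/%s.idx" % stem, "objects/pack/%s.pack" % stem)
```

Fetching the small .idx file first lets the client confirm which
packfile holds the object before downloading the much larger .pack
file.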
There is only one packfile on the server, so your object is obviously
in there, but you'll check the index file to make sure. This is also
useful if you have multiple packfiles on the server, so you can see
which packfile contains the object you need:
=> GET objects/pack/pack-816a9b2334da9953e530f27bcac22082a9f5b835.idx
(4k of binary data)
Now that you have the packfile index, you can see if your object
is in it, because the index lists the SHAs of the objects contained
in the packfile and the offsets to those objects. Your object is there,
so go ahead and get the whole packfile:
=> GET objects/pack/pack-816a9b2334da9953e530f27bcac22082a9f5b835.pack
(13k of binary data)
You have your tree object, so you continue walking your commits.
They're all also within the packfile you just downloaded, so you don't
have to do any more requests to your server. Git checks out a working
copy of the master branch that was pointed to by the HEAD reference
you downloaded at the beginning.
The entire output of this process looks like this:
$ git clone http://github.com/schacon/simplegit-progit.git
Initialized empty Git repository in /private/tmp/simplegit-progit/.git/
got ca82a6dff817ec66f44342007202690a93763949
walk ca82a6dff817ec66f44342007202690a93763949
got 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7
Getting alternates list for http://github.com/schacon/simplegit-progit.git
Getting pack list for http://github.com/schacon/simplegit-progit.git
Getting index for pack 816a9b2334da9953e530f27bcac22082a9f5b835
Getting pack 816a9b2334da9953e530f27bcac22082a9f5b835
which contains cfda3bf379e4f8dba8717dee55aab78aef7f4daf
walk 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7
walk a11bef06a3f659402fe7563abf99ad00de2209e6
9.6.2 The Smart Protocol
The HTTP method is simple but a bit inefficient. Using smart
protocols is a more common method of transferring data. These protocols
have a process on the remote end that is intelligent about Git: it
can read local data and figure out what the client has or needs and
generate custom data for it. There are two sets of processes for
transferring data: a pair for uploading data and a pair for
downloading data.
Uploading Data
To upload data to a remote process, Git uses the send-pack and
receive-pack processes. The send-pack process runs on the client and
connects to a receive-pack process on the remote side.
For example, say you run git push origin master in your project,
and origin is defined as a URL that uses the SSH protocol. Git fires
up the send-pack process, which initiates a connection over SSH to
your server. It tries to run a command on the remote server via an
SSH call that looks something like this:
$ ssh -x git@github.com "git-receive-pack 'schacon/simplegit-progit.git'"
005bca82a6dff817ec66f44342007202690a93763949 refs/heads/master report-status delete-refs
003e085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7 refs/heads/topic
0000
The git-receive-pack command immediately responds with one line
for each reference it currently has: in this case, the master and
topic branches and their SHAs. The first line also has a list of the
server's capabilities (here, report-status and delete-refs).
Each line starts with a 4-byte hex value specifying how long the
line is, counting those 4 bytes themselves. Your first line starts
with 005b, which is 91 in decimal, meaning the whole line is 91 bytes
long and 87 bytes remain after the length. The next line starts with
003e, which is 62, so you read the remaining 58 bytes. The next line
is 0000, meaning the server is done with its references listing.
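The framing can be sketched in Python as a pair of helpers. They are
hypothetical names, not Git code, and they assume the 4-digit hex
length counts its own 4 bytes; that assumption agrees with the 003e
line for a full 40-character SHA, since 4 + 40 + 1 + 16 + 1 = 62.

```python
def write_pkt_line(payload):
    """Prefix payload (bytes) with its 4-digit hex length."""
    return b"%04x" % (len(payload) + 4) + payload


def read_pkt_lines(data):
    """Split a byte string into payloads; None marks a 0000 line."""
    payloads = []
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 4], 16)
        if length == 0:
            payloads.append(None)      # end of this listing
            pos += 4
            continue
        if length < 4:
            raise ValueError("invalid length prefix")
        payloads.append(data[pos + 4:pos + length])
        pos += length
    return payloads
```

Writing the topic line with write_pkt_line produces a 003e prefix, and
feeding the result followed by 0000 to read_pkt_lines returns the
original payload and then None.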
Now that it knows the server's state, your send-pack process
determines what commits it has that the server doesn't. For each
reference that this push will update, the send-pack process tells the
receive-pack process that information. For instance, if you're
updating the master branch and adding an experiment branch, the
send-pack response may look something like this:
0085ca82a6dff817ec66f44342007202690a93763949 15027957951b64cf874c3557a0f3547bd83b3ff6 \
refs/heads/master report-status
00670000000000000000000000000000000000000000 cdfdb42577e2506715f8cfeacdbabc092bf63e8d \
refs/heads/experiment
0000
The SHA-1 value of all '0's means that nothing was there before,
because you're adding the experiment reference. If you were
deleting a reference, you would see the opposite: all '0's on the
right side.
Git sends a line for each reference you're updating with the old
SHA, the new SHA, and the reference that is being updated. The
first line also has the client's capabilities. Next, the client uploads
a packfile of all the objects the server doesn't have yet. Finally, the
server responds with a success (or failure) indication:
000eunpack ok
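The meaning of the old and new SHA pair in each update line can be
captured in a short function. This is a hypothetical sketch: it
expects one line with the length prefix already removed, and it
assumes any capability list has been cut off or follows a null byte.

```python
ZERO_SHA = "0" * 40


def classify_update(command):
    """Return ('create' | 'delete' | 'update', reference_name)."""
    command = command.split("\0", 1)[0]       # drop capabilities, if any
    old, new, ref = command.strip().split(" ", 2)
    if old == ZERO_SHA and new == ZERO_SHA:
        raise ValueError("old and new SHA cannot both be zero")
    if old == ZERO_SHA:
        return ("create", ref)                # nothing was there before
    if new == ZERO_SHA:
        return ("delete", ref)                # nothing will be left
    return ("update", ref)
```

Applied to the experiment line above, it reports a creation; swapping
the two SHAs would make it report a deletion.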
Downloading Data
When you download data, the fetch-pack and upload-pack processes
are involved. The client initiates a fetch-pack process that connects
to an upload-pack process on the remote side to negotiate what data
will be transferred down.
There are different ways to initiate the upload-pack process on the
remote repository. You can run via SSH in the same manner as the
receive-pack process. You can also initiate the process via the Git
daemon, which listens on a server on port 9418 by default. The
fetch-pack process sends data that looks like this to the daemon after
connecting:
003fgit-upload-pack schacon/simplegit-progit.git\0host=myserver.com\0
It starts with the 4 bytes specifying how much data is following,
then the command to run followed by a null byte, and then
the server's hostname followed by a final null byte. The Git
daemon checks that the command can be run and that the repository
exists and has public permissions. If everything is cool, it fires up
the upload-pack process and hands off the request to it.
If you're doing the fetch over SSH, fetch-pack instead runs
something like this:
$ ssh -x git@github.com "git-upload-pack 'schacon/simplegit-progit.git'"
In either case, after fetch-pack connects, upload-pack sends back
something like this:
0088ca82a6dff817ec66f44342007202690a93763949 HEAD\0multi_ack thin-pack \
side-band side-band-64k ofs-delta shallow no-progress include-tag
003fca82a6dff817ec66f44342007202690a93763949 refs/heads/master
003e085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7 refs/heads/topic
0000
This is very similar to what receive-pack responds with, but the
capabilities are different. In addition, it sends back the HEAD
reference so the client knows what to check out if this is a clone.
At this point, the fetch-pack process looks at what objects it has
and responds with the objects that it needs by sending “want” and
then the SHA it wants. It sends all the objects it already has with
“have” and then the SHA. At the end of this list, it writes “done” to
initiate the upload-pack process to begin sending the packfile of the
data it needs:
003cwant ca82a6dff817ec66f44342007202690a93763949 ofs-delta
0032have 085bb3bcb608e1e8451d4b2432f8ecbe6306e7e7
0000
0009done
That is a very basic case of the transfer protocols. In more
complex cases, the client supports multi_ack or side-band capabilities,
but this example shows you the basic back and forth used by the smart
protocol processes.
9.7 Maintenance and Data Recovery
Occasionally, you may have to do some cleanup: make a repository
more compact, clean up an imported repository, or recover lost work.
This section will cover some of these scenarios.
9.7.1 Maintenance
Occasionally, Git automatically runs a command called “auto gc”.
Most of the time, this command does nothing. However, if there
are too many loose objects (objects not in a packfile) or too many
packfiles, Git launches a full-fledged git gc command. The gc stands
for garbage collect, and the command does a number of things: it
gathers up all the loose objects and places them in packfiles, it
consolidates packfiles into one big packfile, and it removes objects
that aren't reachable from any commit and are a few months old.
You can run auto gc manually as follows:
$ git gc --auto
Again, this generally does nothing. You must have around 7,000
loose objects or more than 50 packfiles for Git to fire up a real gc
command. You can modify these limits with the gc.auto and
gc.autopacklimit config settings, respectively.
The other thing gc will do is pack up your references into a single
file. Suppose your repository contains the following branches and
tags:
$ find .git/refs -type f
.git/refs/heads/experiment
.git/refs/heads/master
.git/refs/tags/v1.0
.git/refs/tags/v1.1
If you run git gc, you'll no longer have these files in the refs
directory. Git will move them for the sake of efficiency into a file
named .git/packed-refs that looks like this:
$ cat .git/packed-refs
# pack-refs with: peeled
cac0cab538b970a37ea1e769cbbde608743bc96d refs/heads/experiment
ab1afef80fac8e34258ff41fc1b867c702daa24b refs/heads/master
cac0cab538b970a37ea1e769cbbde608743bc96d refs/tags/v1.0
9585191f37f7b0fb9444f35a9bf50de191beadc2 refs/tags/v1.1
^1a410efbd13591db07496601ebc7a059dd55cfe9
If you update a reference, Git doesn't edit this file but instead
writes a new file to refs/heads. To get the appropriate SHA for a given
reference, Git checks for that reference in the refs directory and then
checks the packed-refs file as a fallback. So, if you can't find a
reference in the refs directory, it's probably in your packed-refs file.
Notice the last line of the file, which begins with a ^. This means
the tag directly above is an annotated tag and that line is the commit
that the annotated tag points to.
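A reader for this file format fits in a few lines. The sketch below
is illustrative only, and parse_packed_refs is a made-up name: it
skips the comment header, records each SHA and reference name, and
attaches a line beginning with ^ to the reference directly above it.

```python
def parse_packed_refs(text):
    """Parse packed-refs content.

    Returns (refs, peeled): refs maps reference names to SHA-1 values,
    and peeled maps annotated tag names to the commit they point to.
    """
    refs = {}
    peeled = {}
    last_name = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("^"):
            if last_name is None:
                raise ValueError("peeled line without a preceding reference")
            peeled[last_name] = line[1:]
            continue
        sha, name = line.split(" ", 1)
        refs[name] = sha
        last_name = name
    return refs, peeled


SAMPLE_PACKED_REFS = """\
# pack-refs with: peeled
cac0cab538b970a37ea1e769cbbde608743bc96d refs/heads/experiment
ab1afef80fac8e34258ff41fc1b867c702daa24b refs/heads/master
cac0cab538b970a37ea1e769cbbde608743bc96d refs/tags/v1.0
9585191f37f7b0fb9444f35a9bf50de191beadc2 refs/tags/v1.1
^1a410efbd13591db07496601ebc7a059dd55cfe9
"""
```

On the sample, refs/tags/v1.1 maps to the tag object 9585191f, and the
peeled entry for the same name gives the commit 1a410efb.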
9.7.2 Data Recovery
At some point in your Git journey, you may accidentally lose a
commit. Generally, this happens because you force-delete a branch that
had work on it, and it turns out you wanted the branch after all, or
you hard-reset a branch, thus abandoning commits that you wanted
something from. Assuming this happens, how can you get your
commits back?
Here's an example that hard-resets the master branch in your test
repository to an older commit and then recovers the lost commits.
First, let's review where your repository is at this point:
$ git log --pretty=oneline
ab1afef80fac8e34258ff41fc1b867c702daa24b modified repo a bit
484a59275031909e19aadb7c92262719cfcdf19a added repo.rb
1a410efbd13591db07496601ebc7a059dd55cfe9 third commit
cac0cab538b970a37ea1e769cbbde608743bc96d second commit
fdf4fc3344e67ab068f836878b6c4951e3b15f3d first commit
Now, move the master branch back to the middle commit:
$ git reset --hard 1a410efbd13591db07496601ebc7a059dd55cfe9
HEAD is now at 1a410ef third commit
$ git log --pretty=oneline
1a410efbd13591db07496601ebc7a059dd55cfe9 third commit
cac0cab538b970a37ea1e769cbbde608743bc96d second commit
fdf4fc3344e67ab068f836878b6c4951e3b15f3d first commit
You've effectively lost the top two commits: you have no branch
from which those commits are reachable. You need to find the latest
commit SHA and then add a branch that points to it. The trick is
finding that latest commit SHA; it's not like you've memorized it,
right?
Often, the quickest way is to use a tool called git reflog. As
you're working, Git silently records what your HEAD is every time
you change it. Each time you commit or change branches, the reflog
is updated. The reflog is also updated by the git update-ref
command, which is another reason to use it instead of just writing the
SHA value to your ref files, as we covered in the “Git References”
section of this chapter earlier. You can see where you've been at any
time by running git reflog:
$ git reflog
1a410ef HEAD@{0}: 1a410efbd13591db07496601ebc7a059dd55cfe9: updating HEAD
ab1afef HEAD@{1}: ab1afef80fac8e34258ff41fc1b867c702daa24b: updating HEAD
Here we can see the two commits that we have had checked out,
but there is not much information here. To see the same
information in a much more useful way, we can run git log -g, which will
give you a normal log output for your reflog:
$ git log -g
commit 1a410efbd13591db07496601ebc7a059dd55cfe9
Reflog: HEAD@{0} (Scott Chacon <schacon@gmail.com>)
Reflog message: updating HEAD
Author: Scott Chacon <schacon@gmail.com>
Date: Fri May 22 18:22:37 2009 -0700
third commit
commit ab1afef80fac8e34258ff41fc1b867c702daa24b
Reflog: HEAD@{1} (Scott Chacon <schacon@gmail.com>)
Reflog message: updating HEAD
Author: Scott Chacon <schacon@gmail.com>
Date: Fri May 22 18:15:24 2009 -0700
modified repo a bit
It looks like the bottom commit is the one you lost, so you can
recover it by creating a new branch at that commit. For example,
you can start a branch named recover-branch at that commit (ab1afef):
$ git branch recover-branch ab1afef
$ git log --pretty=oneline recover-branch
ab1afef80fac8e34258ff41fc1b867c702daa24b modified repo a bit
484a59275031909e19aadb7c92262719cfcdf19a added repo.rb
1a410efbd13591db07496601ebc7a059dd55cfe9 third commit
cac0cab538b970a37ea1e769cbbde608743bc96d second commit
fdf4fc3344e67ab068f836878b6c4951e3b15f3d first commit
Cool: now you have a branch named recover-branch that is where
your master branch used to be, making the first two commits
reachable again. Next, suppose your loss was for some reason not in the
reflog; you can simulate that by removing recover-branch and
deleting the reflog. Now the first two commits aren't reachable by
anything:
$ git branch -D recover-branch
$ rm -Rf .git/logs/
Because the reflog data is kept in the .git/logs/ directory, you
effectively have no reflog. How can you recover that commit at this
point? One way is to use the git fsck utility, which checks your
database for integrity. If you run it with the --full option, it shows
you all objects that aren't pointed to by another object.
$ git fsck --full
dangling blob d670460b4b4aece5915caf5c68d12f560a9fe3e4
dangling commit ab1afef80fac8e34258ff41fc1b867c702daa24b
dangling tree aea790b9a58f6cf6f2804eeac9f0abbe9631e4c9
dangling blob 7108f7ecb345ee9d0084193f147cdad4d2998293
In this case, you can see your missing commit on the line that starts
with the words dangling commit. You can recover it the same way, by
adding a branch that points to that SHA.
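The two recovery paths described above can be tried end to end in a throwaway repository. The following is only a sketch under stated assumptions: the temporary directory, the placeholder identity, and the file name file.txt are illustrative and not part of the original example, and the commit is lost here with a hard reset.

```shell
#!/bin/sh
# Sketch: lose a commit with a hard reset, then recover it two ways.
set -eu

repo=$(mktemp -d)                         # throwaway repository (illustrative)
cd "$repo"
git init -q .
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

# Build a short history of three commits.
for n in first second third; do
    echo "$n" > file.txt
    git add file.txt
    git commit -q -m "$n commit"
done

lost=$(git rev-parse HEAD)     # the commit we are about to lose
git reset -q --hard HEAD~1     # the branch no longer points at $lost

# Way 1: the reflog still remembers where HEAD was one move ago.
from_reflog=$(git rev-parse 'HEAD@{1}')
git branch recover-branch "$from_reflog"

# Way 2: simulate a missing reflog, then ask fsck for dangling commits.
git branch -q -D recover-branch
rm -rf .git/logs
from_fsck=$(git fsck --full 2>/dev/null |
    awk '$1 == "dangling" && $2 == "commit" { print $3 }')
git branch recover-branch "$from_fsck"

git log --pretty=oneline recover-branch
```

Both from_reflog and from_fsck should name the same commit as lost. The awk filter assumes exactly one dangling commit, which holds in this small repository but may not in a real one.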
9.7.3 Removing Objects
There are a lot of great things about Git, but one feature that can
cause issues is the fact that a git clone downloads the entire history
of the project, including every version of every file. This is fine if
the whole thing is source code, because Git is highly optimized to
compress that data efficiently. However, if someone at any point in
the history of your project added a single huge file, every clone for all
time will be forced to download that large file, even if it was removed
from the project in the very next commit. Because it's reachable from
the history, it will always be there.
This can be a huge problem when you're converting Subversion
or Perforce repositories into Git. Because you don't download the
whole history in those systems, this type of addition carries few
consequences. If you did an import from another system or otherwise
find that your repository is much larger than it should be, here is
how you can find and remove large objects.
Be warned: this technique is destructive to your commit history.
It rewrites every commit object downstream from the earliest tree
you have to modify to remove a large file reference. If you do this
immediately after an import, before anyone has started to base work
on the commit, you're fine; otherwise, you have to notify all
contributors that they must rebase their work onto your new commits.
To demonstrate, you'll add a large file into your test repository,
remove it in the next commit, find it, and remove it permanently
from the repository. First, add a large object to your history.
$ curl http://kernel.org/pub/software/scm/git/git-1.6.3.1.tar.bz2 > git.tbz2
$ git add git.tbz2
$ git commit -am 'added git tarball'
[master 6df7640] added git tarball
1 files changed, 0 insertions(+), 0 deletions(-)
create mode 100644 git.tbz2
Oops. You didn't want to add a huge tarball to your project.
Better get rid of it.
$ git rm git.tbz2
rm 'git.tbz2'
$ git commit -m 'oops - removed large tarball'
[master da3f30d] oops - removed large tarball
1 files changed, 0 insertions(+), 0 deletions(-)
delete mode 100644 git.tbz2
Now, gc your database and see how much space you're using.
$ git gc
Counting objects: 21, done.
Delta compression using 2 threads.
Compressing objects: 100% (16/16), done.
Writing objects: 100% (21/21), done.
Total 21 (delta 3), reused 15 (delta 1)
You can run the count-objects command to quickly see how much
space you're using.
$ git count-objects -v
count: 4
size: 16
in-pack: 21
packs: 1
size-pack: 2016
prune-packable: 0
garbage: 0
The size-pack entry is the size of your packfiles in kilobytes, so
you're using 2MB. Before the last commit, you were using closer to
2K. Clearly, removing the file from the previous commit didn't
remove it from your history. Every time anyone clones this repository,
they will have to clone all 2MB just to get this tiny project, because
you accidentally added a big file. Let's get rid of it.
First you have to find it. In this case, you already know what file
it is. But suppose you didn't; how would you identify what file or
files were taking up so much space? If you run git gc, all the objects
are in a packfile; you can identify the big objects by running another
plumbing command called git verify-pack and sorting on the third
field in the output, which is file size. You can also pipe it through the
tail command because you're only interested in the last few largest
files.
$ git verify-pack -v .git/objects/pack/pack-3f8c0...bb.idx | sort -k 3 -n | tail -3
e3f094f522629ae358806b17daf78246c27c007b blob 1486 734 4667
05408d195263d853f09dca71d55116663690c27c blob 12908 3478 1189
7a9eb2fba2b1811321254ac360970fc169ba2330 blob 2056716 2056872 5401
The big object is at the bottom: 2MB. To find out what file it is,
you'll use the rev-list command, which you used briefly in Chapter
7. If you pass --objects to rev-list, it lists all the commit SHAs and
also the blob SHAs with the file paths associated with them. You can
use this to find your blob's name.
$ git rev-list --objects --all | grep 7a9eb2fb
7a9eb2fba2b1811321254ac360970fc169ba2330 git.tbz2
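The verify-pack and rev-list steps above can also be combined into a single pipeline. The sketch below is an alternative to the commands in the text, not the method the text uses: it feeds the rev-list output to git cat-file --batch-check, which reports each object's type and size and echoes the path back through the %(rest) placeholder. The file names small.txt and big.bin, the 200 KB size, and the placeholder identity are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: report the largest blob reachable from any ref, with its path.
set -eu

repo=$(mktemp -d)                         # throwaway repository (illustrative)
cd "$repo"
git init -q .
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

echo "small" > small.txt
# 200 KB of zero bytes stands in for the accidental tarball.
dd if=/dev/zero of=big.bin bs=1024 count=200 2>/dev/null
git add small.txt big.bin
git commit -q -m "added big file"
git rm -q big.bin
git commit -q -m "oops - removed big file"

# rev-list prints "<sha> <path>"; cat-file adds the type and size of each
# object and repeats the path via %(rest).
largest=$(git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
    awk '$1 == "blob"' | sort -k 2 -n | tail -1)
echo "$largest"
```

The line printed has four fields: the object type, the uncompressed size in bytes, the SHA, and the path. Unlike verify-pack, this approach does not require the objects to be packed first.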
Now, you need to remove this file from all trees in your past. You
can easily see what commits modified this file.
$ git log --pretty=oneline -- git.tbz2
da3f30d019005479c99eb4c3406225613985a1db oops - removed large tarball
6df764092f3e7c8f5f94cbe08ee5cf42e92a0289 added git tarball
You must rewrite all the commits downstream from 6df76 to fully
remove this file from your Git history. To do so, you use filter-branch,
which you used in Chapter 6.
$ git filter-branch --index-filter \
'git rm --cached --ignore-unmatch git.tbz2' -- 6df7640^..
Rewrite 6df764092f3e7c8f5f94cbe08ee5cf42e92a0289 (1/2)rm 'git.tbz2'
Rewrite da3f30d019005479c99eb4c3406225613985a1db (2/2)
Ref 'refs/heads/master' was rewritten
The --index-filter option is similar to the --tree-filter option used
in Chapter 6, except that instead of passing a command that
modifies files checked out on disk, you're modifying your staging area or
index each time. Rather than remove a specific file with something
like rm file, you have to remove it with git rm --cached; you must
remove it from the index, not from disk. The reason to do it this
way is speed: because Git doesn't have to check out each revision
to disk before running your filter, the process can be much, much
faster. You can accomplish the same task with --tree-filter if you
want. The --ignore-unmatch option to git rm tells it not to error out
if the pattern you're trying to remove isn't there. Finally, you ask
filter-branch to rewrite your history only from the 6df7640 commit up,
because you know that is where this problem started. Otherwise, it
will start from the beginning and will unnecessarily take longer.
Your history no longer contains a reference to that file. However,
your reflog and a new set of refs that Git added when you did the
filter-branch under .git/refs/original still do, so you have to remove
them and then repack the database. You need to get rid of anything
that has a pointer to those old commits before you repack.
$ rm -Rf .git/refs/original
$ rm -Rf .git/logs/
$ git gc
Counting objects: 19, done.
Delta compression using 2 threads.
Compressing objects: 100% (14/14), done.
Writing objects: 100% (19/19), done.
Total 19 (delta 3), reused 16 (delta 1)
Let's see how much space you saved.
$ git count-objects -v
count: 8
size: 2040
in-pack: 19
packs: 1
size-pack: 7
prune-packable: 0
garbage: 0
The packed repository size is down to 7K, which is much better
than 2MB. You can see from the size value that the big object is still
in your loose objects, so it's not gone, but it won't be transferred on
a push or subsequent clone, which is what is important. If you really
wanted to, you could remove the object completely by running git
prune --expire with an expiry time (for example, git prune --expire now).
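The whole removal procedure can be rehearsed in a scratch repository before it is applied to a real one. The sketch below makes several assumptions that are not in the text: big.bin and main.txt are hypothetical file names, the backup refs are deleted with git update-ref -d rather than rm -Rf, and the reflog is emptied with git reflog expire rather than by deleting .git/logs/. It also passes --prune=now to git gc so that the unreachable blob is deleted immediately instead of lingering as a loose object.

```shell
#!/bin/sh
# Sketch: purge an accidentally committed file, then confirm its blob is gone.
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1    # silences the notice in newer Git

repo=$(mktemp -d)                         # throwaway repository (illustrative)
cd "$repo"
git init -q .
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

echo "code" > main.txt
git add main.txt
git commit -q -m "first commit"

# big.bin is a hypothetical stand-in for the tarball in the text.
dd if=/dev/zero of=big.bin bs=1024 count=200 2>/dev/null
git add big.bin
git commit -q -m "added big file"
bad=$(git rev-parse HEAD)                 # first commit containing the file
blob=$(git rev-parse HEAD:big.bin)        # SHA of the unwanted blob
git rm -q big.bin
git commit -q -m "oops - removed big file"

# Rewrite every commit from $bad onward, dropping big.bin from the index.
git filter-branch --index-filter \
    'git rm --cached --ignore-unmatch big.bin' -- "$bad^..HEAD" >/dev/null 2>&1

# Drop everything that still points at the old commits, then repack.
git for-each-ref --format='%(refname)' refs/original |
while read -r ref; do
    git update-ref -d "$ref"
done
git reflog expire --expire=now --all
git gc -q --prune=now

if git cat-file -e "$blob" 2>/dev/null; then
    blob_state=present
else
    blob_state=gone
fi
echo "blob $blob is $blob_state"
```

Using update-ref to delete the backup refs also works when the refs have been packed, which removing files under .git/refs/original would miss.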
9.8 Summary
You should have a pretty good understanding of what Git does in
the background and, to some degree, how it's implemented. This
chapter has covered a number of plumbing commands, which are
lower level and simpler than the porcelain commands you've
learned about in the rest of the book. Understanding how Git works
at a lower level should make it easier to understand why it's doing
what it's doing and also to write your own tools and helping scripts
to make your specific workflow work for you.
Git as a content-addressable filesystem is a very powerful tool that
you can easily use as more than just a VCS. I hope you can use your
newfound knowledge of Git internals to implement your own cool
application of this technology and feel more comfortable using Git in
more advanced ways.