
EVERYTHING YOU ALWAYS WANTED TO KNOW ABOUT OPENSTACK NETWORK*

* But Were Afraid to Ask

AKA debugging an OpenStack VLAN setup

Disclaimer
Here is a tentative guide to testing and debugging (mostly) the networking in the OpenStack cloud world.
We have spent a huge amount of time looking at packet dumps in order to distill this information for you, in the belief that, by following the recipes outlined in the following pages, you will have an easier time!
Keep in mind that this comes more from day-by-day debugging than from a structured plan, so I tried to separate the pieces according to the architecture that I have in mind... but it is, and will remain, a work in progress.
Reference setup:
The setup is the following:
1. compute node: Ubuntu server 14.04, 4 ethernet interfaces mapped on em1-4 (3 used)
2. controller + compute node: Ubuntu server 14.04, 4 ethernet interfaces mapped on em1-4 (3 used)
3. network node: Ubuntu server 14.04, 4 ethernet interfaces mapped on em1-4 (3 used)

The networking configuration is implemented within the neutron service and is based on a VLAN approach, so as to obtain a complete L2 separation of a multi-tenant environment.
Follow the OpenStack guide to configure the services (in the appendix you can find the configuration files that have been used in this case and a few configuration scripts).
Preliminary checks
Once you have agreed with your network administrators on the switch configuration (if you have no direct access to the switches), let's double-check the port configuration for the VLAN IDs: capture an LLDP packet (0x88cc) from each host and for each interface:

# tcpdump -vvv -s 1500 ether proto 0x88cc -i em1

(wait for a packet and then CTRL-C)
This command will give you some information about the switch that you are connected to and about the VLAN configuration.
NB: if the port is in trunk mode you may get the same result as if the port had no VLAN settings.
An example of the output of the command for an interface attached to a port that is configured as access:
tcpdump: WARNING: em1: no IPv4 address assigned
tcpdump: listening on em1, link-type EN10MB (Ethernet), capture size 1500 bytes
12:33:03.255101 LLDP, length 351
[...]
    System Name TLV (5), length 13: stackdr2.GARR
      0x0000:  7374 6163 6b64 7232 2e47 4152 52
[...]
    Port Description TLV (4), length 21: GigabitEthernet2/0/31
[...]
    Organization specific TLV (127), length 6: OUI Ethernet bridged (0x0080c2)
      Port VLAN Id Subtype (1)
      port vlan id (PVID): 320
[...]
1 packet captured
1 packet received by filter
0 packets dropped by kernel

An example of the output of the command for an interface attached to a port that is configured as trunk:
# tcpdump -vvv -s 1500 ether proto 0x88cc -i em3
tcpdump: WARNING: em3: no IPv4 address assigned
tcpdump: listening on em3, link-type EN10MB (Ethernet), capture size 1500 bytes
12:32:11.513135 LLDP, length 349
[...]
    System Name TLV (5), length 13: stackdr2.GARR
[...]
    Port Description TLV (4), length 20: GigabitEthernet2/0/3
[...]
      Port VLAN Id Subtype (1)
      port vlan id (PVID): 1
[...]^C
1 packet captured
1 packet received by filter
0 packets dropped by kernel

Check Interfaces
On compute nodes, use the following command to see information about interfaces (IPs, VLAN IDs) and to know whether the interfaces are up:
# ip a

One good initial sanity check is to make sure that your interfaces are up:
# ip a | grep 'em[13]' | grep state
2: em3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
6: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
37: br-em3: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
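If something looks wrong, the -d (details) flag of ip gives per-interface specifics (MTU, master bridge, link flags); for example:

# ip -d link show em3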

Troubleshooting Open vSwitch

Open vSwitch is a multilayer virtual switch. Full documentation can be found at the website. In practice you need to ensure that the required bridges (br-int, br-ex, br-em1, br-em3, etc.) exist and have the proper ports connected to them, using the ovs-vsctl and ovs-ofctl commands.
To list the bridges on a system (VLAN networks are trunked through the em3 network interface):
# ovs-vsctl list-br
br-em3
br-ex
br-int

Example: on the network node (you should follow the same logic on the compute one).
Let's check the chain of ports and bridges. The bridge br-em3 contains the physical network interface em3 (trunk network) and the virtual interface phy-br-em3, attached to the int-br-em3 of the br-int:
# ovs-vsctl list-ports br-em3
em3
phy-br-em3

# ovs-vsctl show
    Bridge "br-em3"
        Port "em3"
            Interface "em3"
        Port "phy-br-em3"
            Interface "phy-br-em3"
                type: patch
                options: {peer="int-br-em3"}
        Port "br-em3"
            Interface "br-em3"
                type: internal

br-int contains int-br-em3, which pairs with phy-br-em3 to connect to the physical network (used to reach the compute nodes), the TAP devices that connect to the DHCP instances, and the qr interfaces that connect to the virtual routers:
# ovs-vsctl list-ports br-int
int-br-em3
int-br-ex
qr-9ae4acd4-92
qr-ae75168a-67
qr-e323976e-2b
qr-e3debf8d-ee
tap1474f18d-a9
tap7c29ce27-4e
tapc974ab53-25
tapd9762af3-4b

# ovs-vsctl show
    Bridge br-int
        fail_mode: secure
        Port "tapd9762af3-4b"
            tag: 5
            Interface "tapd9762af3-4b"
                type: internal
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
        [...]
        Port "qr-9ae4acd4-92"
            tag: 1
            Interface "qr-9ae4acd4-92"
                type: internal
        Port br-int
            Interface br-int
                type: internal
        Port "tap1474f18d-a9"
            tag: 3
            Interface "tap1474f18d-a9"
                type: internal

And the external bridge br-ex (again from ovs-vsctl show):
    Bridge br-ex
        Port br-ex
            Interface br-ex
                type: internal
        Port "em4"
            Interface "em4"
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}

If any of these links is missing or incorrect, it suggests a configuration error.
NB: you can also check the correct VLAN tag translation along the overall chain with ovs-ofctl commands, i.e. (more details follow):
# ovs-ofctl dump-flows br-int
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=6718.658s, table=0, n_packets=0, n_bytes=0, idle_age=6718, priority=3,in_port=1,dl_vlan=325 actions=mod_vlan_vid:4,NORMAL
 cookie=0x0, duration=6719.335s, table=0, n_packets=0, n_bytes=0, idle_age=6719, priority=3,in_port=1,dl_vlan=327 actions=mod_vlan_vid:3,NORMAL
 cookie=0x0, duration=6720.508s, table=0, n_packets=3, n_bytes=328, idle_age=6715, priority=3,in_port=1,dl_vlan=328 actions=mod_vlan_vid:1,NORMAL
 cookie=0x0, duration=5840.156s, table=0, n_packets=139, n_bytes=13302, idle_age=972, priority=3,in_port=1,dl_vlan=320 actions=mod_vlan_vid:5,NORMAL
 cookie=0x0, duration=6719.906s, table=0, n_packets=58, n_bytes=6845, idle_age=6464, priority=3,in_port=1,dl_vlan=324 actions=mod_vlan_vid:2,NORMAL
 cookie=0x0, duration=6792.845s, table=0, n_packets=555, n_bytes=100492, idle_age=9, priority=2,in_port=1 actions=drop
 cookie=0x0, duration=6792.025s, table=0, n_packets=555, n_bytes=97888, idle_age=9, priority=2,in_port=2 actions=drop
 cookie=0x0, duration=6793.667s, table=0, n_packets=203, n_bytes=22402, idle_age=4535, priority=1 actions=NORMAL
 cookie=0x0, duration=6793.605s, table=23, n_packets=0, n_bytes=0, idle_age=6793, priority=0 actions=drop

Bridges can be added with ovs-vsctl add-br, and ports can be added to bridges with
ovs-vsctl add-port.
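For example, a minimal sketch (br-test and port em2 are hypothetical names used only for illustration):

# ovs-vsctl add-br br-test
# ovs-vsctl add-port br-test em2
# ovs-vsctl del-port br-test em2
# ovs-vsctl del-br br-test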

Troubleshoot neutron traffic

Refer to the Cloud Administrator Guide for a variety of networking scenarios and their connection paths. We use the Open vSwitch (OVS) backend.
See the following figure for reference.
1. The instance generates a packet and sends it through the virtual NIC inside the instance, such as eth0.
2. The packet transfers to a Test Access Point (TAP) device on the compute host, such as tap1d40b89c-fe. You can find out which TAP is being used by looking at the /etc/libvirt/qemu/instance-xxxxxxxx.xml file.

Following, an example with the interesting parts in evidence:
<domain type='kvm'>
  <name>instance-00000015</name>
  <uuid>cc2b7876-6d3a-4b78-b817-ed36146a9b9e</uuid>
  [....]
  <controller type='pci' index='0' model='pci-root'/>
  <interface type='bridge'>
    <mac address='fa:16:3e:a4:56:3d'/>
    <source bridge='qbrff8e411e-6e'/>
    <target dev='tapff8e411e-6e'/>
    <model type='virtio'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
  </interface>
  <serial type='file'>
    <source
  [....]

Another means of finding the device name is to use the neutron commands. To get the port ID associated with IP address 192.168.4.102, do this:
# neutron port-list | grep 192.168.4.102 | cut -d \| -f 2
ff8e411e-6e08-499f-b9a5-0beca2c94b85
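Note that, by convention, the TAP device name is "tap" plus the first 11 characters of the port ID, so the port found here corresponds to the tapff8e411e-6e device seen in the libvirt XML above; a quick check on the compute node:

# ip link show tapff8e411e-6e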

fig: Neutron network paths

See here for more details, in the networking scenarios chapter.

Looking also at the neutron part and highlighting the VLAN configuration, we have something like the following (I recycled the image, so its br-eth1 is br-emXX in my setup and the ethYY are emZZ, but the flow is the point that I want to stress here):

1. The TAP device is connected to the integration bridge, br-int. This bridge connects all the instance TAP devices and any other bridges on the system. int-br-eth1 is one half of a veth pair connecting to the bridge br-eth1, which handles VLAN networks trunked over the physical Ethernet device eth1.
2. The TAP devices and veth devices are normal Linux network devices and may be inspected with the usual tools, such as ip and tcpdump. Open vSwitch internal devices are only visible within the Open vSwitch environment:
# tcpdump -i int-br-em3
tcpdump: int-br-em3: No such device exists
(SIOCGIFHWADDR: No such device)

3. To watch packets on internal interfaces you need to create a dummy network device and add it to the bridge containing the internal interface you want to snoop on. Then tell Open vSwitch to mirror all traffic to or from the internal port onto this dummy port, so that you can run tcpdump on the dummy interface and see the traffic on the internal port.
4. Capture packets from an internal interface on the integration bridge, br-int (advanced):
   1. Create and bring up a dummy interface, snooper0:
      # ip link add name snooper0 type dummy
      # ip link set dev snooper0 up
   2. Add device snooper0 to bridge br-int:
      # ovs-vsctl add-port br-int snooper0
   3. Create a mirror of, for example, the int-br-em3 interface to snooper0 (all on one line; it returns the UUID of the mirror port):
      # ovs-vsctl -- set Bridge br-int mirrors=@m \
        -- --id=@snooper0 get Port snooper0 \
        -- --id=@int-br-em3 get Port int-br-em3 \
        -- --id=@m create Mirror name=mymirror \
        select-dst-port=@int-br-em3 select-src-port=@int-br-em3 \
        output-port=@snooper0
      dcce2c59-be1a-4f2d-b00b-9d906c77ee8a
   4. From here you can see the traffic going through int-br-em3 with a tcpdump -i snooper0.
   5. Clean up the mirrors:
      # ovs-vsctl clear Bridge br-int mirrors
      # ovs-vsctl del-port br-int snooper0
      # ip link delete dev snooper0

On the integration bridge, networks are distinguished using internal VLAN IDs (unrelated to the segmentation IDs used in the network definition and on the physical wire), regardless of how the networking service defines them. This allows instances on the same host to communicate directly without transiting the rest of the virtual, or physical, network. On the br-int, incoming packets are translated from external tags to internal tags. Other translations also happen on the other bridges and will be discussed later.
5. To discover which internal VLAN tag is in use for a given external VLAN, use the ovs-ofctl command:
   1. Find the external VLAN tag of the network you're interested in with:
      # neutron net-show --fields provider:segmentation_id <network name>
      +--------------------------+-------+
      | Field                    | Value |
      +--------------------------+-------+
      | provider:network_type    | vlan  |
      | provider:segmentation_id | 324   |
      +--------------------------+-------+

   2. Grep for the provider:segmentation_id, 324 in this case, in the output of ovs-ofctl dump-flows br-int:
      # ovs-ofctl dump-flows br-int | grep vlan=324
      cookie=0x0, duration=105039.122s, table=0, n_packets=5963, n_bytes=482203, idle_age=1104, hard_age=65534, priority=3,in_port=1,dl_vlan=324 actions=mod_vlan_vid:1,NORMAL

   3. Here you can see that packets received on port ID 1 with the VLAN tag 324 are modified to carry the internal VLAN tag 1. Digging a little deeper, you can confirm that port 1 is in fact int-br-em3:
      # ovs-ofctl show br-int

      OFPT_FEATURES_REPLY (xid=0x2): dpid:0000029a51549b40
      n_tables:254, n_buffers:256
      capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
      actions: OUTPUT SET_VLAN_VID SET_VLAN_PCP STRIP_VLAN SET_DL_SRC SET_DL_DST SET_NW_SRC SET_NW_DST SET_NW_TOS SET_TP_SRC SET_TP_DST ENQUEUE
       1(int-br-em3): addr:52:40:bd:b3:88:9c
           config:     0
           state:      0
           speed: 0 Mbps now, 0 Mbps max
       2(qvof3b63d31-a0): addr:4e:db:74:04:53:4d
           config:     0
           state:      0
           current:    10GB-FD COPPER
           speed: 10000 Mbps now, 0 Mbps max
       3(qvo65fb5ad8-b5): addr:92:75:b8:03:cc:1d
           config:     0
           state:      0
           current:    10GB-FD COPPER
           speed: 10000 Mbps now, 0 Mbps max
       4(qvoa6e8c6e3-1c): addr:82:22:71:c5:4e:f8
           config:     0
           state:      0
           current:    10GB-FD COPPER
           speed: 10000 Mbps now, 0 Mbps max
       5(qvo1d40b89c-fe): addr:5e:e3:15:53:e5:16
           config:     0
           state:      0
           current:    10GB-FD COPPER
           speed: 10000 Mbps now, 0 Mbps max
       6(qvoff8e411e-6e): addr:02:a9:38:d6:88:22
           config:     0
           state:      0
           current:    10GB-FD COPPER
           speed: 10000 Mbps now, 0 Mbps max
       LOCAL(br-int): addr:02:9a:51:54:9b:40
           config:     0
           state:      0
           speed: 0 Mbps now, 0 Mbps max
      OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0

   4. (NB: this is NOT valid if you are using a GRE tunnel.) VLAN-based networks exit the integration bridge via a veth interface, i.e. int-br-em3 (int-br-eth1 in the picture), and arrive on a bridge, i.e. br-em3 (br-eth1), on the other member of the veth pair, phy-br-em3 (phy-br-eth1). Packets on this interface arrive with internal VLAN tags and are translated to external tags in the reverse of the process described above:
      # ovs-ofctl dump-flows br-em3 | grep 324
      cookie=0x0, duration=105402.89s, table=0, n_packets=7374, n_bytes=905197, idle_age=1468, hard_age=65534, priority=4,in_port=2,dl_vlan=1 actions=mod_vlan_vid:324,NORMAL

6. Packets, now tagged with the external VLAN tag, then exit onto the physical network via em3 (eth1 in the picture). The Layer 2 switch this interface is connected to must be configured as trunk on the VLAN IDs used. The next hop for this packet must also be on the same layer 2 network.
7. The packet is then received on the network node. Note that any traffic to the l3-agent or dhcp-agent will be visible only within their network namespace. Watching any interfaces outside those namespaces, even those that carry the network traffic, will only show broadcast packets like Address Resolution Protocol (ARP) requests; unicast traffic to the router or DHCP address will not be seen. See Dealing with Network Namespaces for detail on how to run commands within these namespaces (a capture example follows this list).
8. Alternatively, it is possible to configure VLAN-based networks to use external routers rather than the l3-agent shown here, as long as the external router is on the same VLAN:
   1. VLAN-based networks are received as tagged packets on a physical network interface, eth1 in this example. Just as on the compute node, this interface is a member of the br-eth1 bridge.
   2. GRE-based networks will be passed to the tunnel bridge br-tun, which behaves just like the GRE interfaces on the compute node.
9. Next, the packets from either input go through the integration bridge, again just as on the compute node.
10. The packet then makes it to the l3-agent. This is actually another TAP device within the router's network namespace. Router namespaces are named in the form qrouter-<router_uuid>. Running ip a within the namespace will show the TAP device name, qr-e6256f7d-31 in this example:
    # ip netns exec qrouter-e521f9d0-a1bd-4ff4-bc81-78a60dd88fe5 ip a | grep state
    10: qr-e6256f7d-31: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    11: qg-35916e1f-36: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 500
    28: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN

11. The qg-<n> interface in the l3-agent router namespace sends the packet on to its next hop through device eth2 on the external bridge br-ex. This bridge is constructed similarly to br-eth1 and may be inspected in the same way.
12. This external bridge also includes a physical network interface, eth2 in this example, which finally lands the packet on the external network, destined for an external router or destination.
13. DHCP agents running on OpenStack networks run in namespaces similar to the l3-agents'. DHCP namespaces are named qdhcp-<uuid> and have a TAP device on the integration bridge. Debugging of DHCP issues usually involves working inside this network namespace.
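As anticipated, once you know the router namespace you can capture traffic directly inside it; a sketch reusing the qrouter namespace and the qg interface from step 10 (substitute your own names):

# ip netns exec qrouter-e521f9d0-a1bd-4ff4-bc81-78a60dd88fe5 tcpdump -n -i qg-35916e1f-36 icmp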

Debug a problem along the Path

Ping is your best friend! From an instance:
1. See whether you can ping an external host, such as 8.8.8.8 (Google, which usually is up: from stats, 99.9%).
2. If you can't, try the IP address of the compute node where the virtual machine is hosted.
3. If you can ping this IP, then the problem is somewhere between the compute node and that compute node's gateway.
4. If you can't, the problem is between the instance and the compute node. Check also the bridge connecting the compute node's main NIC with the vnet NIC of the VM.
5. Launch a second instance and see whether the two instances can ping each other. If they can, the issue might be related to the firewall on the compute node. See further on for iptables debugging.
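If you cannot log in to the instance at all, you can still ping it from the network node, inside the DHCP namespace of its tenant network (a sketch reusing the qdhcp namespace and the instance IP used elsewhere in this guide):

# ip netns exec qdhcp-f7bff056-1d27-4c12-a917-6ffe2925a44b ping -c 3 192.168.4.103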

tcpdump
This is your second best friend to help with troubleshooting network issues. Using tcpdump at several points along the network path should help finding where the problem is.
For example, run the following command:
tcpdump -i any -n -v 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
on:

1. An external server outside of the cloud (in the example 193.206.159.201)
2. A compute node
3. An instance running on that compute node

In this example, these locations have the following IP addresses:
Instance: 10.0.2.24, 203.0.113.30
Compute node: 10.0.0.42, 203.0.113.34
External server: 1.2.3.4

Next, open a new shell to the instance and then ping the external host where tcpdump is running. If the network path to the external server and back is fully functional, you see something like the following:
On the external server:
$ tcpdump -i any -n -v 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
10:20:23.517242 IP (tos 0x0, ttl 64, id 65416, offset 0, flags [none], proto ICMP (1), length 84)
    193.206.159.201 > 90.147.91.10: ICMP echo reply, id 1606, seq 28, length 64

which received the ping request and sent a ping reply.

On the compute node you can follow the traffic along the path:
1. On the tap device which connects the VM to the linux bridge (to find the tap see the previous section):
# tcpdump -i tap88ab3af7-7d -n -v 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: WARNING: tap88ab3af7-7d: no IPv4 address assigned
tcpdump: listening on tap88ab3af7-7d, link-type EN10MB (Ethernet), capture size 65535 bytes
10:36:31.000419 IP (tos 0x0, ttl 64, id 1469, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.4.103 > 8.8.8.8: ICMP echo request, id 1709, seq 1, length 64

2. On the two sides of the veth pair between the linux bridge and the OVS br-int:
# tcpdump -i qbr88ab3af7-7d -n -v 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: WARNING: qbr88ab3af7-7d: no IPv4 address assigned
tcpdump: listening on qbr88ab3af7-7d, link-type EN10MB (Ethernet), capture size 65535 bytes
10:36:59.035767 IP (tos 0x0, ttl 64, id 1497, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.4.103 > 8.8.8.8: ICMP echo request, id 1709, seq 29, length 64

root@compute:~# tcpdump -i qvb88ab3af7-7d -n -v 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: WARNING: qvb88ab3af7-7d: no IPv4 address assigned
tcpdump: listening on qvb88ab3af7-7d, link-type EN10MB (Ethernet), capture size 65535 bytes
10:37:18.058899 IP (tos 0x0, ttl 64, id 1516, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.4.103 > 8.8.8.8: ICMP echo request, id 1709, seq 48, length 64

3. And finally on the outgoing interface (em1 in the example):
# tcpdump -i em1 -n -v 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: WARNING: em1: no IPv4 address assigned
tcpdump: listening on em1, link-type EN10MB (Ethernet), capture size 65535 bytes
10:37:49.099383 IP (tos 0x0, ttl 64, id 1547, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.4.103 > 8.8.8.8: ICMP echo request, id 1709, seq 79, length 64

On the instance:
# tcpdump -i any -n -v 'icmp[icmptype] = icmp-echoreply or icmp[icmptype] = icmp-echo'
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
09:27:04.801759 IP (tos 0x0, ttl 64, id 36704, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.4.103 > 192.168.21.107: ICMP echo request, id 1693, seq 27, length 64

NB: it can be useful to show the VLAN tag when debugging traffic. To do this use:
# tcpdump -i <iface> -U -w - | tcpdump -en -r - vlan <id>
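For example, to watch only the frames tagged with VLAN 324 on the trunk interface em3 of this setup:

# tcpdump -i em3 -U -w - | tcpdump -en -r - vlan 324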

iptables and security rules

OpenStack Compute automatically manages iptables, including forwarding packets to and from instances on a compute node, forwarding floating IP traffic, and managing security group rules.

# iptables-save

shows you all the rules.
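The per-port chains embed the beginning of the neutron port ID, so a quick way to isolate the rules that apply to a single instance is to grep for its port ID prefix (here the ff8e411e port found earlier; the exact chain names depend on the firewall driver in use):

# iptables-save | grep ff8e411e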

Example of setup of security rules

To show the security rules:
# nova secgroup-list-rules default
+-------------+-----------+---------+----------+--------------+
| IP Protocol | From Port | To Port | IP Range | Source Group |
+-------------+-----------+---------+----------+--------------+
|             |           |         |          | default      |
|             |           |         |          | default      |
+-------------+-----------+---------+----------+--------------+

To set up a rule to make ICMP traffic pass through:
# nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0
+-------------+-----------+---------+-----------+--------------+
| IP Protocol | From Port | To Port | IP Range  | Source Group |
+-------------+-----------+---------+-----------+--------------+
| icmp        | -1        | -1      | 0.0.0.0/0 |              |
|             |           |         |           | default      |
|             |           |         |           | default      |
+-------------+-----------+---------+-----------+--------------+
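Similarly, to let SSH traffic in:

# nova secgroup-add-rule default tcp 22 22 0.0.0.0/0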

Troubleshooting DNS
The SSH server does a reverse DNS lookup on the IP address that you are connecting from, so if you can use SSH to log in to an instance but it takes on the order of a minute, then you might have a DNS issue.
A quick way to check whether DNS is working is to resolve a hostname inside your instance by using the host command. If DNS is working, you should see:
# host garr.it
garr.it mail is handled by 15 lx1.dir.garr.it.
garr.it mail is handled by 20 lx5.dir.garr.it.

Note: if you're running the Cirros image, it doesn't have the "host" program installed, in which case you can use ping to try to access a machine by hostname to see whether it resolves.
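You can also query the dnsmasq instance serving the tenant network directly from the network node, inside the qdhcp namespace (a sketch reusing the namespace and the DHCP port IP shown in the next section):

# ip netns exec qdhcp-f7bff056-1d27-4c12-a917-6ffe2925a44b host garr.it 192.168.4.100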

Dealing with Network Namespaces

Linux network namespaces are a kernel feature the networking service uses to support multiple isolated layer-2 networks with overlapping IP address ranges. Your network nodes will run their dhcp-agents and l3-agents in isolated namespaces. NB: network interfaces and traffic on those interfaces will not be visible in the default namespace.
L3-agent router namespaces are named qrouter-<router_uuid>, and dhcp-agent namespaces are named qdhcp-<net_uuid>.
To see whether you are using namespaces, run ip netns:
# ip netns
qrouter-80fdf884-37c3-4d33-a340-cd1a09510e59
qdhcp-c3cfc51b-f07c-47ae-bdb4-b029035c08d7
qdhcp-f7bff056-1d27-4c12-a917-6ffe2925a44b
qrouter-edcb7cb5-37fd-4b31-81c5-cee1bda75369
qdhcp-286f2844-6b76-42e5-9664-ab5123bde2d5
qrouter-3618b020-4f3c-4a72-8c02-e25db0c4769d
qdhcp-c8a29266-e9ac-45e0-be6d-79c32f501194
qrouter-301f264a-8ef1-413d-b252-c0886fc2c815
qrouter-9d378195-ee93-45f0-b27f-2bd48b774f5a
qdhcp-13c334c1-ad39-4c51-b396-953430059b22

This output shows a network node with five networks running dhcp-agents, each also running an l3-agent router.
A list of existing networks and their UUIDs can be obtained by running neutron net-list with administrative credentials.
# neutron net-list
+--------------------------------------+------------------+-----------------------------------------------------+
| id                                   | name             | subnets                                             |
+--------------------------------------+------------------+-----------------------------------------------------+
| 13c334c1-ad39-4c51-b396-953430059b22 | intnet324        | edd7678a-277c-477e-a5ac-84258e6b1794 192.168.1.0/24 |
| 286f2844-6b76-42e5-9664-ab5123bde2d5 | inafnet          | dbf5bd19-de67-4b84-a97b-8e322f9343dc 192.168.3.0/24 |
| 99e9c208-b72a-427f-97f6-2443cdd6de9c | extnetflat319    | e0ef8d6f-3fa9-4a05-ae2c-5ec229357f4b 90.147.91.0/24 |
| b4ef2523-bebe-4dbe-b5b7-82983fec6be8 | extnetflat319bis | 91ccda54-2af1-4a59-bf08-8bb0821c1c08 90.147.91.0/24 |
| c3cfc51b-f07c-47ae-bdb4-b029035c08d7 | intnet328        | 0d36feb3-4c83-4867-a227-fb972564125c 192.168.8.0/24 |
| c8a29266-e9ac-45e0-be6d-79c32f501194 | ingvnet          | 915f9929-e49b-4a95-a193-c71227ff870d 192.168.2.0/24 |
| f7bff056-1d27-4c12-a917-6ffe2925a44b | eneanet          | d9d1ba30-4a14-4aab-a95f-4ed2c3f895d3 192.168.4.0/24 |
+--------------------------------------+------------------+-----------------------------------------------------+

Once you've determined which namespace you need to work in, you can use any of the debugging tools mentioned earlier by prefixing the command with ip netns exec <namespace>.
For example, to see what network interfaces exist in one of the qdhcp namespaces returned above, do this:
# ip netns exec qdhcp-f7bff056-1d27-4c12-a917-6ffe2925a44b ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
61: tapd9762af3-4b: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether fa:16:3e:b8:2e:0c brd ff:ff:ff:ff:ff:ff
    inet 192.168.4.100/24 brd 192.168.4.255 scope global tapd9762af3-4b
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:feb8:2e0c/64 scope link
       valid_lft forever preferred_lft forever

From this you see that the DHCP server on that network is using the tapd9762af3-4b device and has an IP address of 192.168.4.100.
The usual commands mentioned previously can be run in the same way.
Note: it is also possible to run a shell and have an interactive session within the namespace, i.e.:
# ip netns exec qdhcp-f7bff056-1d27-4c12-a917-6ffe2925a44b bash
root@network:~# ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

tapd9762af3-4b Link encap:Ethernet  HWaddr fa:16:3e:b8:2e:0c
          inet addr:192.168.4.100  Bcast:192.168.4.255  Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:feb8:2e0c/64 Scope:Link
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:22 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1788 (1.7 KB)  TX bytes:738 (738.0 B)

Mapping of physnet vs network inside the neutron DB

Sometimes there can be an error, unclear from the logs' point of view, claiming that no suitable resources were found at the moment of VM creation. It could be related to a problem in the neutron DB. To find out:

1. Check that the nova services are running on the compute nodes and the controller:
# nova service-list
+----+------------------+------------+----------+---------+-------+----------------------------+-----------------+
| Id | Binary           | Host       | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+----+------------------+------------+----------+---------+-------+----------------------------+-----------------+
| 1  | nova-compute     | compute    | nova     | enabled | up    | 2015-02-12T13:52:45.000000 | -               |
| 2  | nova-cert        | controller | internal | enabled | up    | 2015-02-12T13:52:40.000000 | -               |
| 3  | nova-consoleauth | controller | internal | enabled | up    | 2015-02-12T13:52:40.000000 | -               |
| 4  | nova-scheduler   | controller | internal | enabled | up    | 2015-02-12T13:52:45.000000 | -               |
| 5  | nova-conductor   | controller | internal | enabled | up    | 2015-02-12T13:52:44.000000 | -               |
| 6  | nova-compute     | controller | nova     | enabled | up    | 2015-02-12T13:52:46.000000 | -               |
+----+------------------+------------+----------+---------+-------+----------------------------+-----------------+

2. Check that there are enough HW resources:
# nova hypervisor-stats
+----------------------+--------+
| Property             | Value  |
+----------------------+--------+
| count                | 2      |
| current_workload     | 0      |
| disk_available_least | 1130   |
| free_disk_gb         | 1274   |
| free_ram_mb          | 367374 |
| local_gb             | 1454   |
| local_gb_used        | 180    |
| memory_mb            | 386830 |
| memory_mb_used       | 19456  |
| running_vms          | 6      |
| vcpus                | 80     |
| vcpus_used           | 9      |
+----------------------+--------+

3. Check that there is no problem in the mapping between physnets and networks in the neutron DB (i.e. trunknet is our VLAN-tagged network):
select * from ml2_vlan_allocations;
+------------------+---------+-----------+
| physical_network | vlan_id | allocated |
+------------------+---------+-----------+
| trunknet         |     319 |         0 |
| trunknet         |     320 |         0 |
| trunknet         |     321 |         0 |
| trunknet         |     322 |         0 |
| trunknet         |     323 |         0 |
| trunknet         |     324 |         0 |
| trunknet         |     325 |         0 |
| trunknet         |     326 |         0 |
| trunknet         |     327 |         0 |
| trunknet         |     328 |         0 |
+------------------+---------+-----------+
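The table lives in the neutron database; assuming MySQL on the controller, the query can be run non-interactively like this:

# mysql -u root -p neutron -e 'select * from ml2_vlan_allocations;'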

Debugging with logs - where are the logs?

Below is a quick summary table of the services' log locations; more can be found in OpenStack log locations.
Node type           | Service                                | Log location
--------------------+----------------------------------------+---------------------------------------------------------------
Cloud controller    | nova-*                                 | /var/log/nova
Cloud controller    | glance-*                               | /var/log/glance
Cloud controller    | cinder-*                               | /var/log/cinder
Cloud controller    | keystone-*                             | /var/log/keystone
Cloud controller    | neutron-*                              | /var/log/neutron
Cloud controller    | horizon                                | /var/log/apache2/
All nodes           | misc (swift, dnsmasq)                  | /var/log/syslog
Compute nodes       | libvirt                                | /var/log/libvirt/libvirtd.log
Compute nodes       | console (boot messages) for instances  | /var/lib/nova/instances/instance-<instance VM id>/console.log
Block Storage nodes | cinder-volume                          | /var/log/cinder/cinder-volume.log

Backup + Recovery (for Real)

This chapter describes only how to back up configuration files and databases that the various OpenStack components need to run. It does not describe how to back up objects inside Object Storage or data contained inside Block Storage.

Database Backups
The cloud controller is the MySQL server hosting the databases for nova, glance, cinder, and keystone. To create a database backup:
# mysqldump -u <admin user> -h controller -p --all-databases > openstack.sql

To back up a single database (i.e. nova) you can run:

# mysqldump -u <admin user> -h controller -p nova > nova.sql

You can easily automate this process. The following script dumps the entire MySQL database and deletes any backups older than seven days:
#!/bin/bash
backup_dir="/var/lib/backups/mysql"
filename="${backup_dir}/mysql-`hostname`-`eval date +%Y%m%d`.sql.gz"
# Dump the entire MySQL database
/usr/bin/mysqldump -u root -p123grid --all-databases | gzip > $filename
# Delete backups older than 7 days
find $backup_dir -ctime +7 -type f -delete
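To run it automatically, a crontab entry on the controller such as the following will do (the script path is hypothetical):

0 1 * * * /root/mysql_backup.sh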

File System Backups

Compute
The /etc/nova directory on both the cloud controller and compute nodes should be backed up.
/var/lib/nova is another directory to back up.
Note: it is not useful to back up the /var/lib/nova/instances subdirectory on compute nodes, which contains the KVM images of running instances, unless you need to maintain backup copies of all instances.

Image Catalog and Delivery
/etc/glance and /var/log/glance should be backed up.
/var/lib/glance should also be backed up.

There are two ways to ensure stability with this directory. The first is to make sure this directory runs on a RAID array: if a disk fails, the directory is still available. The second way is to use a tool such as rsync to replicate the images to another server:
# rsync -az --progress /var/lib/glance/images backup-server:/var/lib/glance/images/

Identity
/etc/keystone and /var/log/keystone follow the same rules as the other components.
/var/lib/keystone should not contain any data being used.

Recovering Backups
Recovering backups is a simple process.
1. Ensure that the service you are recovering is not running, i.e. in the case of nova:
# stop nova-cert
# stop nova-consoleauth
# stop nova-novncproxy
# stop nova-objectstore
# stop nova-scheduler

2. Import a previously backed-up database:
# mysql -u root -p --one-database neutron < /root/mysqldump_20150210

3. Restore the backed-up nova directories:
# mv /etc/nova{,.orig}
# cp -a /path/to/backup/nova /etc/

4. Start everything back up:
# for i in nova-api nova-cert nova-consoleauth nova-novncproxy nova-scheduler; do start $i; done

Appendix

Debugging Nova
Knowing the flow for provisioning an instance can be useful to identify a problem in the chain:

1. The dashboard or CLI gets the user credentials and authenticates with the Identity Service via REST API.
2. The Identity Service authenticates the user with the user credentials, and then generates and sends back an auth token, which will be used for sending the request to other components through REST calls.
3. The dashboard or CLI converts the new instance request specified in the launch instance or nova boot form to a REST API request and sends it to nova-api.
4. nova-api receives the request and sends a request to the Identity Service for validation of the auth token and access permission.
5. The Identity Service validates the token and sends updated authentication headers with roles and permissions.
6. nova-api checks for conflicts with the nova database.
7. nova-api creates an initial database entry for the new instance.
8. nova-api sends the rpc.call request to nova-scheduler, expecting to get an updated instance entry with the host ID specified.
9. nova-scheduler picks up the request from the queue.
10. nova-scheduler interacts with the nova database to find an appropriate host via filtering and weighing.
11. nova-scheduler returns the updated instance entry, with the appropriate host ID, after filtering and weighing.
12. nova-scheduler sends the rpc.cast request to nova-compute for launching an instance on the appropriate host.
13. nova-compute picks up the request from the queue.
14. nova-compute sends the rpc.call request to nova-conductor to fetch the instance information such as host ID and flavor (RAM, CPU, disk).
15. nova-conductor picks up the request from the queue.
16. nova-conductor interacts with the nova database.
17. nova-conductor returns the instance information.
18. nova-compute picks up the instance information from the queue.
19. nova-compute performs the REST call by passing the auth token to glance-api. Then, nova-compute uses the Image ID to retrieve the Image URI from the Image Service, and loads the image from the image storage.
20. glance-api validates the auth token with keystone.
21. nova-compute gets the image metadata.
22. nova-compute performs the REST call by passing the auth token to the Network API to allocate and configure the network so that the instance gets an IP address.
23. neutron-server validates the auth token with keystone.
24. nova-compute retrieves the network info.
25. nova-compute performs the REST call by passing the auth token to the Volume API to attach volumes to the instance.
26. cinder-api validates the auth token with keystone.
27. nova-compute retrieves the block storage info.
28. nova-compute generates data for the hypervisor driver and executes the request on the hypervisor (via libvirt or API).

Configuration options

All the details about configuration options can be found here:
http://docs.openstack.org/juno/config-reference/content/index.html

Any comment is more than welcome.
Alex