Robotic Drives Are Going Into AVR Mode
Problem
Robotic drives are going into AVR mode, and backups are halting with a pending mount
request.
Error
This problem is most often caused by communication problems. There are two
NetBackup daemons for robotic control: one runs on the machine with robotic control, the other
runs on the machine that has drives in the robot. For example, if the robot is a TLD robot, the two
daemons are tldcd (runs on the server with robotic control) and tldd (runs on the server with drives
in the robot). When this problem occurs, the drives change from TLD control to
AVR (Automatic Volume Recognition) control, so that jobs go into a pending mount state rather
than failing. The intent is that if network communication between the two servers fails for a
short time, the jobs do not have to fail; they can wait until the connection comes back up.
However, at times this can be caused by more severe problems.
Drives in a robot go into AVR control mode because these two daemons are unable to
communicate. The usual causes are:
1. Network connectivity has failed outright. In this case, the network must be restored.
2. There are multiple interfaces on one or both of the machines that cannot route to or resolve
each other. In this case, either the routing must be changed so that requests can reach their
destination, or name resolution must be fixed. Adding the proper host names to the /etc/hosts
file has been shown to work in some situations.
3. The tldcd daemon has entered an uninterruptible state or is hung, making it unable to
reply to tldd. In this case, shut down the media management daemons by running
/usr/openv/volmgr/bin/stopltid.
Next, run /usr/openv/volmgr/bin/vmps to get the pid (process ID) of the tldcd daemon and run
a kill command on it. If that does not work, use kill -9. If this does not kill the process, the server
will have to be rebooted. To restart the daemons, run /usr/openv/volmgr/bin/ltid.
Note: The daemon does not time out because it is hung on a system call, which is outside an
application's control.
4. The /etc/services file is missing the correct entries on one or both of the servers. Below are the
entries that should be in /etc/services:
# Media Manager services #
vmd 13701/tcp vmd
acsd 13702/tcp acsd
tl8cd 13705/tcp tl8cd
odld 13706/tcp odld
tldcd 13711/tcp tldcd
tl4d 13713/tcp tl4d
tshd 13715/tcp tshd
tlmd 13716/tcp tlmd
tlhcd 13717/tcp tlhcd
rsmd 13719/tcp rsmd
# End Media Manager services #
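The entries above can be checked with a short script rather than by eye. The following is only a sketch: check_mm_services is a hypothetical helper name, not a NetBackup command, and it assumes a POSIX shell with awk available. It reports any listed service that has no uncommented name and port/tcp line in the given file.

```shell
#!/bin/sh
# Sketch only: check_mm_services is a hypothetical helper, not part of NetBackup.
# It reports Media Manager entries that are missing from a services file.
check_mm_services() {
    services_file="${1:-/etc/services}"
    if [ ! -r "$services_file" ]; then
        echo "cannot read: $services_file"
        return 2
    fi
    missing=0
    # name:port pairs copied from the list of required entries
    for entry in vmd:13701 acsd:13702 tl8cd:13705 odld:13706 tldcd:13711 \
                 tl4d:13713 tshd:13715 tlmd:13716 tlhcd:13717 rsmd:13719; do
        name="${entry%%:*}"
        port="${entry##*:}"
        # A match needs the service name in field 1 and port/tcp in field 2,
        # so commented-out lines (starting with '#') do not count.
        if ! awk -v n="$name" -v p="$port/tcp" \
                '$1 == n && $2 == p { found = 1 } END { exit !found }' \
                "$services_file"; then
            echo "missing: $name $port/tcp"
            missing=1
        fi
    done
    return "$missing"
}
```

Run it on both servers, for example with check_mm_services /etc/services. An exit status of 0 means every listed entry was found; 1 means at least one is missing.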
Note: This problem can occur not only between two different servers, but also on a single
server.
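Causes 1 and 2 (failed connectivity, or hosts that cannot resolve or route to each other) can be probed from the server that runs tldd. The sketch below is an illustration under assumptions: check_robot_host is a hypothetical helper name, and it assumes the getent and nc utilities are installed and that nc supports the -z and -w options. The default port 13711 is the tldcd entry from the /etc/services list.

```shell
#!/bin/sh
# Sketch only: check_robot_host is a hypothetical helper, not part of NetBackup.
# Assumes getent and nc (with -z and -w support) are available on this system.
check_robot_host() {
    host="$1"
    port="${2:-13711}"   # tldcd port from the /etc/services list

    # Step 1: can the name be resolved (via /etc/hosts, DNS, and so on)?
    if ! getent hosts "$host" >/dev/null 2>&1; then
        echo "cannot resolve: $host"
        return 1
    fi

    # Step 2: can a TCP connection be opened to the robotic control port?
    if ! nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
        echo "cannot connect: $host port $port"
        return 2
    fi

    echo "ok: $host port $port"
}
```

For example, check_robot_host robot-server (where robot-server is a placeholder for the host with robotic control) prints which step failed. A resolution failure points at name resolution or /etc/hosts; a connection failure points at routing, the network, or a tldcd daemon that is not listening.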