You are on page 1of 24
Vuwwrwwwwwwewwowwweevvewewweuwweuvwsd Content ‘System Status ~ Memory System Status - 10 System Status - CPU Performance Trending with sar ‘Troubleshooting Basies: The Process Troubleshooting Basics: The Tools strace and irace Common Problems Troubleshooting Incorrect Inability to Boot 7 ‘Types in Configuration Fes Corrupt Flesystems Rescue Environment Lab Tasks 1. Recovering Damaged MBR. Permissions Chapter MONITORING & TROUBLESHOOTING ‘Memory Status Linux provides several tools for examining the memory usage on running systems as shown in the following examples: 4 free 4 cat /proc/senintfo ‘Swap usage can be summarized by either of the folowing: 4 swapon ~3 4 cat /proc/sueps Field [Meaning fF [umber of processes waiting to run [b__ [amber of processes uninterruptedlysleeping| spd [amount of swap used, in kilobytes ree [amount of memory free, in Kilobytes System Status - Memory free Quick summary of memory usoge Displays swap usage on per swap fle / pation basis Process Fes * Tpcoc/neninfo ® 7proc/swape vastat *rastat 2~ plays unring stats every? seconds [buff [amount of memory used for butfers, in Kilobytes 5i__|memory being swapped from disk, in klobytes/sec [so_| memory being swapped to disk, in Klobytes/soe fbi [Blocks writen to disk, in bloeksise [bo _ [blocks read from disk, in blocksisee Fin number of interupts per second [es [number of context switches per second jus [percent of total CPU consumed by user processes, [sy _ [percent of total CPU consumed by eystem processes percent of total CPU spent waiting on WO fia [percent of total CPU ile time Linux also provides the vastat command, Commonly used on Unix systems to examine memory usage, mastat produaes output Ike: # metat procs. ~ nemory— sap cb swpd free buff cache si so bi bo ince us sy id va 6 8 "8 31106 413726 155988 6 «817142 15.2 ssystem= ----cpu- 450038 16 4 783 A A EY VewwE wer ewwUEUerowuwwwowwwowed Vo Status ‘Several commands are also available to examine UO statistics on Unux systems. Some basic V0 information is provided by mastat. In ‘addition, stat can be run witha few aifferent options to abtain ‘moce detailed states. This command presents summary information regarding VO: System Status ~ astat Toate + Shows detailed VO statistics on 8 per dive basis Process Files Yoroe/ interrupts 1 /proe/stat yo ‘More detailed information about specific devices can be found by: f dostat -a -x /dev/eda + output omitted... iets [Description [Device _|name, in devi format, where # is the major number 4 dostat -a 5 7) " Device: tps slkread/s Blkwrtn/s Blkread slivrtn — [F£90/s [merged read requests per secon: sa 093 aS 35.68 4433955 3456508 [argn/s [merged write requests per second fas 8.08 8 32 ® 7 7 add 8.12 36 2:51 35283 ageng [2/8 __ toad requests per second are 8.64 oa 1st dea 146162 [rs write requests per second Field [Description [rsec/s | sectors read per second [Device _[name, in devé format, where # is the majornumber| _[wsec/s [sectors written per second ps 17/0 requests por second [avarg-sz average sizo of requests, in sectors BUK_read/s [aumber of blocks read per second [avaqu-sz average quove length of requests BTK wrtn/s [pumber of blocks writen per second [avait [average wait ime until 170 request served UK read [total numberof blocks read favctm [average time to service 17 O requests in seconds UK wrta [total number of Blocks writen [tatit [percentage of CPU time used servicing VO requesta 183 (CPU Details Feirly detailed information about the processors} on the systom can be obtained from the /proc/cpuinfo file as shown in the following example: 5 cat /proc/epuinto processor ° vendor_id Genuinerntet cpu family 1 6 rodel PB rodel nane ee and 15-4 System Status - CPU Process Files * Iproclepuinto ‘Shows dealed info about physical CPUs) estat -< "= provides 8 summery of CPU uilization pstat ‘provides detailed CPU uttization + Including forced sleep by Xen hypervisor (CPU Status (unl-processor) ‘Several tools provide information about the system CPU configuration ‘Futilzation. This command displays an average, for all processors: 4 fostat -< avg-opu: Buser nice tsys Nidle 8.81 93.96 8.83 6.64 See Se Ses ae ere fi [acon oa eee aaa care RRA RRR RAR ARERR RR ARAAA RR RAR RAO Veer www wuWwwwUwWuwwwweuwwwwwwed (CPU Status (multi-processor) Slightly more detailed information can be obtained with mpstat, ‘which fs usually run with the =P ALL option to display information ‘bout all processors on the system f mpstat -P ALL ad 8.48 a7 a Field [Description is avorage of all CPUs; a number is a specie CPU tnice faye Hovait —tirg Asoft. tsteal 8. [taser [percentage of CPU used for user processes [tnice [percentage of CPU used for niced user processes [says [percentage of CPU used for system processes: siowait [percentage of CPUICPUs time spent waiting for isk UO [srg [percentage of CPUICPUs time spent servicing interupts [ssoft | percentage of CPUICPUs time spent servicing softwar inerrupts fisteat [percentage of ime spent in involuntary wait by virtual ICPUICPUs while the hypervisor was servicing another vital CPU fbiate_|percontage of CPU time spent idle jintr7s_[number of interrupts per second processed tidte 99.27 98.99 intr/s 335.15 15-5 ‘System Activity Reporter ‘The System Activity Reporter (sar) is flexible tool which can be sed to observe realtime performance data or take samples over time to show performance trends. ‘The basic syntax ofthe sar command is: sar [option] [count] ‘The option indicates what output is desired, and the optional count ‘says how many samples are desiod. Useful options include: [Option Result =A__ [display (almost) all avaliable statistics a display disk VO statistics =| display system paging staisties =e __| display process creation statistics =4___|sioplay perclisk activity statistics Gisplay interrupt statistics: irq is the number of the IRQ to sisplay: SUM displays the total numberof interupts; PROC sisplays the number of interupts per processor: ALL sisplays tho fist 16 intorupts: XALL displays all 4 Bega; sar Performance Trending sade * system process located at /usr/Iib/sa/sade ‘an rom /ete/eron.d/sysstat 1 loge ectiy to ves tog/sa/ Shows system activity overtime based on sade logs cafes to clplaying data from 10 minute inteals devices: EDEV displays network errors: SOCK displays | soce | statistics on network sockets; FULL displays all network [rus | statistics -4__ | aisplays queve lengths and load averages mz [displays memory and swap utilization statistics =a [displays memory VO statistics =u [aisplays CPU valization statistics =v [displays kernel file table statistics ma |alsplays switching statistics x _|alsplays ewepping statistics lsplays process staisics, pid is the PID of the process to monitor, SELP indicates the sar process itself; SUM displays system processes major and minor fait statistics; ALL cispays statistics for al cispiays stats about fof the process whose children are monitored: SELF ingicates children of the sar process itslf; ALL displays statistics for child processes of all system processes 156 [=7 [display staistios about TTY activity RAR RARE ERR RAR RAR RANA ARAN ewww wwwwewwwwuwowwwwwwwwwowed Investigate In many ways, troubleshooting is lke reading a good detective novel ‘One must not stop atthe fist possible explanation. Instead, evidence ‘must be gathered and analyzed. The investigation can't end until theres enough proof to cstinguish truth from fiction. Sometimes the {ruth is exactly as expected, and the challenge is proving it. Other times, the ruth wall be quite cifferent from what was originally, suspected. Dont form an opinion too quickly when troubleshooting. A change ‘that might corect one problem could make another problem worse ‘A superficial ix that only addresses @ surface problem, without Correcting the root cause, may hide the problem for awhile, but ‘eventualy the problem will return, perhaps making the problem worse! Without proper understanding of the problem, itis rare to antive at 8 {900d solution. Be patent. Make sure the problem is understood betore trying to solve it. Such obvious advice may seem ridiculous, but itis surprising the numberof times its ignored. Document Discoveries IT knowing is half the battle, what's the other hall? Remembering While it may be possible to keep everything in one's head during 3 ‘short troubleshooting session, the longer the session, ofthe higher ‘the stress of the situation, the more important it becomes to keep a record of everything learned during the process. Dorit make things, Troubleshooting Ba: lovestigato ‘Decument Discoveries Ferm Multiple Hypotheses ‘Test Each Hypothesis Consider Al Solutions (Then Pick The Bost) Document Changes, Fepoat the Process (H Needed) too complicated. Paper and a pen is enough to get the job done, but ifthe voubleshooting session is lang or the problem reluns, 8 few basic notes taken during the troubleshooting process can be invaluable. Good notes can also be the basis of future documentation Form Multiple Hypotheses. Eventually, enough information willbe gathered to support at least ‘one hypothesis. At this point itis tempting to jump in and attempt to solve the problem. Resist this temptation! Take a moment to ‘consider and rank other possible hypotheses. Many problems are ‘easy to solve, but only alter one stops focusing on the wrong ‘hypothesis. ‘Test Each Hypothesis Even after forming hypotheses, continue to collect information. If possible, perform one or more experiments to confirm the most likely ‘exlanation. Such confirmation is especially important when the fx is potentially dangecous or time consuming. A good experiment may ‘ao reveal 8 more serious underlying problem. Without careful confirmation, one may end up repeatedly addressing surface problems without ever artving at the root cause. Consider All Solutions (Then Pick The Best) ‘Many problems allow multiple solutions. Instead of immediately choosing the first solution that comes to mind, consider alternatives, Fecusing too much on one solution may resull in ignoring a faster, 167 safer solution, Taking time to consider the consequences of 2 Solution is @ertieal part of not making the problem worse. Document Changes {All changes made during the troubleshooting process should be ‘documented. ideal, any and al changes to @ system should always be documented. It a change is later discovered to be inappropriate, something as simple as a chronological ist of chenges on a piece of ‘baper can save significant time when reversing unintended damage. Repent the Process (W Needed) ‘Sometimes resolving one problem can reveal another. In these cases, itis tempting to assume the new problem is just an extension of the ‘ld. While that may often be true, care should stil be takon to follow proper troubleshooting procedure, not just slap @ quick fix on and hope forthe best. 15-8 POR ARERR RRR ARR RRR AAR NRO COCONUT U wwe wwwwwwwwwwwod ‘Troubleshooting Basics: The Tools Every Tool Is Useful ‘Analyzing Log Files ‘kc, doesg. grep. head, tail Reviewing System Status ‘cat fete/*vrelease, chkconfig, date, 4, dig, free, getfacl, 1s, tsar, mount, netstat, ps, rpm, service, top Reviewing Network Settings ‘cat /ete/resolv.cont dig, ethtol, ip addr, ip route Every Too! Is Useful ‘There is no replacement for a deep knowledge of a system's services and subsystems, and the tools that can be used to explore and control ther. Under the right conditions, any command is Potential troubleshooting tool. However, some tools are more commonly used than ‘everyting else. These most common tools are presented inthe next severl tables. ‘Analyzing Log Files Log files should always be examined earl in the troubleshooting process. Alot can be learned quickly using basic tools to examine the systems logfiles. [Command [Description fark | The eblity to rapidly create simple reports from logs by fiterng cerlain columns, summarizing, and applying formatting can reveal things that would otherwise be lost inthe noise ofthe fogs. [While some Kemel events are recorded in system logfiles, others may only be present in the Kernel fing butfr. Log files are often filed with a mixture of useful and lrelevant information, While ile tempting to attempt mental tering, is better to use software to do it Wile things sometimes bresk as a result of long past events, most often a recent change Is atthe root of new problems. Monitoring logs as new information fs added can accelerate the woubleshooting proc 159 Reviewing System Status ‘Minor misconfiguations can have surprising side effects. Early in the troubleshooting process iti important to review the general system [commana [cat /ete/*-release Description [When working in on environment with multiple Linux versions, Knowing how To quicky determine whats running on & system can be helpful ‘ctkconfig list [service ane status| pore daemon pe ef [Sometimes things break when a service is running that shouldnt be. More offen things break when @ service should be running but isnt. If service depends on multiple daemor's, make sure that al are runing. te [As services becoming increasingly time sensitive, systems can break in unexpected ways when thelr clocks are wrong ath Running out of disk space can cause many applications to exhibit stvange behavior. Just as It is possible to run out of jaf ~ih data blocks, itis also possible to run out of inodes. Some Linux filesystems can automatically allocate more inades, but Jotners can not fees When Linux rons low on available RAM, its forced to page data out to swap, resulting in poor performance. Running low on memory can also force the kerel to activate the OOM Kil avaliable but actualy unusable, ‘This can even happen when memory is technically Improper file permissions are a common cause of unexpected behavior In adalion to wadlional Unix permissions, [consider other options. like SELinux contexts, extended attibutes, and FACLs. Pay attention to not only what is mounted, but which options the flesystoms are mounted with, Unfortunately, when the root filesystem is in readonly mode, the /etc/mtab fle ean not be Updated and tho output of the mount command may be incorrect. For this reason It may be necessary to query the kernel directly netstat -tulpa [A sovvice may be running but not Iistening tothe right pant or address. Even if a service is configured to listen to the sof =i [correct pon and adress, it will not be able to if another process has already claimed it. pa Va Teis normal for configuration fles to change. lis not narmal for binaries and Worries to change, Changes may indiate fling storage, bad memory, or human action fruntevel [When a system is brought up manually, tis easy accidentally leave itn an incorrect runlevel. Knowing the current runlevel is also important when confirming the correct services are running. top In addition tots default view, top Includes several useful commands to tweak what information lt shows and how. ITs also a simple interface for certain types of process management, o [When woubleshooting iis important to know wo is using the system. 15-10 RRA RARER RR AREA A RRA ARAN www Uwwewwwwwwowwowwwowwwwuwwud Reviewing Network Status Linux has a very network oriented design. Even services only listening on localhost can break in unexpected ways when the network is ‘misconfigured or misbehaving, [Command [Description aig Many services depend on working name resolution, ost cat /ete/resolv.cont! fethtoot [When physical indicators are not visible, the ethtool command can be used to check lnk status, It can also be used Inti-tool to correct link level setings when autonegotation fails, ip addr [Although ist possible to view most of a systems network configuration using ifconfig the ip command wil ip route provide more correct and more complete information. feepeump While i may be tedious to get spectic information about network behavior using tepdump, ils helpful for gating @ general idea of whats happening on the network 8.11 ‘The strace Command ‘The stace command can be used to intercept and record the system calls made, and the signals received by a process. Ths allows, ‘examination of the boundary layer between the user and Kernel space ‘which can be very useful fr identiving wy a process is failing Using strace to analyze how a program interacts with the system is ‘especially useful when the source code ls not ready avaliable. In ‘addition to its importance in troubleshooting, strace can provide ‘deep insight into how the system operates, ‘Any user may trace their own running processess; additionally the root user may trace any running processess. For exemple, fallowing could be used to attach to and trace the running slapd ‘daemon: 4 strace -p $(pgrep slapd) + + + output omitted. ‘trace Output ‘Output from strace will correspond to either a system cal or signal ‘Output from a system calls comprised of three components: The ‘system cal, any arguments surrounded by parenthesis, and the result (ofthe call folowing an equal sign. An exit status of =1 usually Indicates an error. For example: § strace 1s enterprise exceve("/bin/ls*, {"le", “enterprise'}, [/* brk(@) ‘= éxdaa2000 16-12 vars */]) = 6 strace and Itrace Tacos systems call and signals. 1 Sp planus atoch to 3 running process 222 Hlenane— output te 1 We tracerenpe— spociy 2 fiter for whats esplayed “i Tolow fd wace) forked chid processes 222 show counts, teeaee very sia to steaes in functonslity and options but shows ‘Syoe Roery calle nade by Uns proouns 25 show system als 1 -a 3 indent output for nested cals access(*/ete/Id.so.preload", ROK) = -1 BNOENT (Wo such > file or directory) open(*/ete/d.so.cacie", 0 ROOMY) = 3 esnips ss Curly braces are used to Indicate dereferenced C structures. Square braces are used to indicate simple pointers or an array of values. Since steace often creates a large amount of output, it's often Convenient to redirect i to fil, For example, the following could be Used to launch the Z shell, trace any forked child processes, and record all le access tothe fies. trace file: 4# strace ~f -0 files.trace -e trace=file 25h + output enitted . un the 1s command counting the number of times each system c ‘was made and print tolls showing the number and time spent in ‘each call (useful for basi profling or botleneck isolation 4 strace -c le ‘output omitted . ‘The following example shows the three config files that OpenSSH's sshd reads as it starts. Note that strace sends its output to STOERR by default, so if you want to pipe it to other commands like grep for further altering you must redirect the output appropristely: 4 strace ~f ~copen /asr/sbin/sshd 41 | orep ssh + + + output omitted. . Trace just the network related system calls as Netcat attempts to Se ea een ea ree eo can WCC WUWOUWT EWU oUUNUwwUwewowwNd ‘connect to a local telnet service: | steace -e tracemnetwork ne localhost 23 + output omitted... ‘The trace Command ‘The Ltrace command can be used to intercept and record the dynamic eals made to shared libraries. The amount of output ‘generated by the trace command can be overwhelming for some ‘commands (especialy ifthe ~8 option is used to also showy system calls). You can focus the output to just the interection between the program and some list of libraries. For example, to execute the id -2 ‘command and show the calls made to the Libselinux.so module, execute: § Verace -L /Lib/Libselinux.so.1 id -2 is_selinux enabled(@xcicTas, 8:9£29168, Sxclaffc, 8, -l)> = getconléx804c2c8, exfee80Et4, exB64b179, 8840828, 8) > user_ussysten_runcontined_t Remember that you can see what libraries @ program is inked against using the 1dé command, 15-13 Common Problems Incorrect flo permissions Inability to boot Imeorrest bootloader installation Corrupt ile systems, Typos in configuration fles Disks fu oro fre blocks + no tres inoges Running out of (virtual) memory Runaway processes Problems and Solutions in Linux When prablems aceu in Linus, they manifest thameslves via ‘symptoms. Sometimes there is 3 stong correlation between the ‘symptom and problem such that determining the root problem is very ‘2887. Other times, the problem manifests itsell in symptoms that are ‘ube or otherwise non-obvious. These latter types of problems can 'be especialy dificult to solve the frst time you encounter them, Experience isa large i not the largest, factor in speedy troubleshooting. The upcoming pages document common problems, thor symptoms and solutions, Hopeful this wil sve you Ue i 1 tue. 18-14 Pe es ee ee en ee wwwwwwwwwwwwwwwwwwwwvvvvwwowwswuud Troubleshooting Incorrect File Permissions ‘very strict permissions ‘tap? not world writable and sticky (1777) Filesystom is «hierarchy What are the permissions on / What UID & GID Is deomon running as? = What are the permissions on te lest « Vina ae the permissions on parent directories? Incoract SELinux fle context Possible Solutions Incorrect fle permissions can be very dificult to debug. For example if permissions on /tmp/ are incorrect, xfs wil ceport that it started ‘successtully, but when making an attempt to write a socket in /tmp, wil fall and silently ex. strace and Ltrace ar invaluable for debugging opaque problems lke these. Use strace on the daemon or program to See it itis getting any "permission denied’ error messages. Then take the appropriate action to fix the permission problem. ne problem which can be dificult to diagnose is incorrect permissions on the root directory, /. To see the permissions on /, use the 1s ld 1/eomerand. On Linux, you should see: § is “ta / drwix-xx-x 22 root root 1024 Sep 18 67:25 / (On RHEL, the root directory does not have write access for the owner, i.e. 0585 mode, Incorrect SELinux Security Context ‘On RHELS, probloms with @ weong SELinux context can be fixed with the cheon or restorecon commands. For example, inorder for Apache to be able to read a filet needs to have the ht tpd_sys_content_t SELinux context. Here isan Mlustration of a common problem: 4 cd /var/inea/htal/; Is -alt Grwxr-or-x. Toot root systen_uzobject_r:httpa_sys_content_tist . rwnr-xr-x. root root systenusobject_rshttpd_sys_content_trs® root root unconfined vrobject_r:httpd_sys_content_t:s@ index.html sritist logo.pag has the incorrect context. It can be corrected with ‘The ogo. fi # chcon -t hetpd_sys_content_t Logo,png 15-15 X86 Linux Boot Process Bios 2% Quick Hardware Check 2€ ail Hardware configuration Selects and passes contol to Master Boot Record on boot ‘device LILO / GRUB first stage in MBR. 2 Executes 2nd stage Boot loader 2nd stage der 28 Starts execution of kernal Linux kemet 2 nitiaizes and configures hardware 48 Finds and decompresses intramis image(s) Loads drivers 4% Creates foot device, mounts root patton read-only. Location of ‘oot filesystem is usually passed to kernel by boot loader 14 Executes /sbin/init (can be overidden with init=/sex/ex option) Int (inital process) Reads /etc/inittab, Determines defauit runlevel: if not 15-16 Inability to Boot Understanding exact stops performed during booting wil greatly aid ‘roubleehooting ‘108, hardware, and boot loader problems Kernel issues = Does Komal have divers fr device that has rot filesystem, and forthe (00 fisystem type? Flesystom Corruption or Label erors “= Flsystom is corupted enough that the kemelcennot mount i readonly Configuration tle errors exe! flow = Tete/inittab, /ete/tstab, bot leader contig ‘defined, prompts. 1 Executes a variety of scripts to bring up core system services. Scripts executed by init on a RHELG system include: 1 Processes seLinux and enforcing parameters passed to kernel Uses Libsetinax to mount the /selinux flesystem. Loads SELinux policy. 1 /ete/ze.4/re-sysinit which performs inal tasks that ae the ‘same regardless of the runlevel being entered. 28 fete/ze.d/re which runs all SysV iit scripts referenced by the ‘symbolic inks in the corresponding /ete/rcx.d/ directory for tho runiovel being entered. Starts mingetty’s attached to 9 set of local vitual terminals, 28Starts a display manager such 2s gam if entering runlevel 5. RAR ARR RA RAR RRR RRA ARAMA ARRAY we wewewwewwewwwwwwwewwYwwwrwwwwwwd ‘Typos in Configuration Files ‘Typos in ortan configuration fle can prevent succesful boot = boot loader con es 2 fetc/initias 1 Jeve/fatab Cheek /var/tog)* (Cheek documentation for proper syntax Essontial Configuration Flos Typographical errors in essential configuration files can prevent Linux from booting correctly. This can occur when mistakes are made in the boot loader configuration files, In order to recover from an incorrectly installed boot loader, the system usually needs to be booted off of rescue media. To fx mistakes in other important files, such as fstab or inittab, you can usually do a minimal boot off of the local installation, either by booting with a rescue shell instead of int (init=/bin/bash atthe bootloader command line) or by booting into runlevel 1 appending 1 to the kernel parameters from the GRUB boot loader. (On RHEL, the GRUB configuration file that defines the menu items. displayed on boot isthe boot /grub/grub.coaf fe. The ‘/etc/grub.cont and /boot /grub/nens.1st file are symbolic inks to ‘this contig file. 1617 Solutions: ‘At boottime, all flesystems are checked to see if they ae in clean state. Flesystems are marked clean as the final step of being ‘unmounted. If @ fllesystem is not clean, fsek is run against it. fsck Is 2 front-end program wich calls an appropriate program, (89 (e2fsckiB),reiserfsck(8), xfs_check(S) to check the filesystom. If there is enough damage to the filesystem that data might potentially be lost, the automatic filesystem check halts, and you are prompted to manually tun feck. You wall be prompted to press at tech repair step. This can become quite tedious, 50 you can use feck =y to tell Fsck to always assume "yes" Files recovered by fsck wil be stored inthe lost+found/ directory at the root ofthe filesystem from which they were recovered. ‘ost#found/ is not a normal directory file, but rather one wihich has a pool of prellocated data blocks avallable fr its use. Ifit gets Femoved it ean be recreated with the most +found command. 15-18 Corrupt Filesystems Filesystoms need to be clean before they are mounted roed-write Unioss major damage has occurred, uncloan Mesystems ean still be ‘mounted read-only ‘Flosystem could be unclean because of hardware falure {ck will return flesystom to consistent stato = Flesystem stscturesrepaved, not Mlesystom dats 1 Can take @ LONG tine on vary logo flesystems Journaling filesystems partially prevent need to run fsck "Burs, Ela, JFS, ResorFS, ES A Ve wwwewwewwewwwUwwwwewwrwwewwwwwsd ‘The RHELG Rescue Environment Using the rescue environment is @ last-stop efor, nat a first step. ‘Most problems do not require its use, and can be solved more easily ina more fulfeatured envionment, such as single user mode. The rescue environment has 2 lrvted set of tities, One useful lity is the chroot command: | chroot /ant/sysinage chroot will change / (the roo filesystem) to begin from /ant/sysinage/. From there you can run grub-install, vi, ena {and everything else as you would normaly W Auto-mounting Filesystems Fail Use mknod to create device nodes, use fdisk -1 to view partitions. run fsck against filesystems, then mount if needed) the root filesystem at /mnt/sysinage/ (and other filesystems under ‘Innt/sysinage/ 2s needed). 4 nknod /dev/héa 4 mknod /dev/hdat 1 mknod /dev/hda2 i taisk -L /aevinsa 4 edfsck -y /dev/ndat 4 edfsck -y /dev/nda2 Wrount /dev/ndax /ant/sysinage Rescue Environment ‘Some problems require use of a vseue mode = Boot of install sk 1 and to Linur Fescue atthe boot prompt + Fesystems wil be mounted undo fat /eysiage/ if possilo * After fing the problem, type exit et the promt to reboot the system 4 chroot /ant/sysinage (On RHEL, the rescue environment provides several options for detecting and mounting Linux fiesystems. One option is for the rescue environment to scan all gives on the system and automatically mount detected Lux filesystems in read-urite mode. nether option will mount detected Linux filesystems in read-only ‘mode. The last option is to skip the detection of Linux flesystems completely. ‘Task 1: Recovering Damaged MBR. Page: 1521 Time: 18 minutes Requirements: & (1 station) @ (classroom server) 18.20 Lab 15 PET CTT LT ee CwewwwwrwwwwrUEUwwworwerwweswuved oie th damaged MOR T: ie k1 Jae te rescve environment to recover a damag as! Requirements Recovering Damaged MBR 'B.(1 station) (classroom server) Zaee aS TE eee Relevance Certain problems are severe enough that they render the system Lunbootable by normal means. In these cases, you must boot the system sing an alternate boot medi and possibly aitenate roo filesystem to repair the system. Ths task teaches how to enter and use the rescue ‘environment 1) The following actions require administrative privileges. Switch to root login shall: 8 su -1 Password: makeitso E for any disk or filesystem disaster includes having 2 copy of 3nd what each partion contains. At a minimum you should know where your /boot, /, use, and /var filesystems are located. Use df to view ‘where they are currently located tae = output onitted . 3) Record which partition contains /boot: Result: 4) Record which parttion contains /: Result 55) Record which parttion contains Just: Resut: 16-21 66) Record which partion contains /var: Result: 7), Overwrite your MBR by running the command listed below. Please double-check your syntax before running the command, as simple typos here can easly render ‘the machine inoperable (beyond the scope of this course to recover} 4d ifs/dev/zero of=/dev/ sda countel be=446 ‘Carey tye this command as any typos coud «output onitted . ‘campleteydesboy you Lux stalin, reboot. £8) Your instructor will provide instructions on whether PXE is available, and if so how to PXE boot your lab system. I PXE isnot avilable, skip to the next step. PXE boot your workstation, Most Intl systems do tis with the key after POST. From the PXE menu, select Rescue Node by typing XEx (replace with appropiate selection from description in menu). ‘This wil bot tothe Rascue environment. Skip to 10, 9) I you are using the installation DVD, the Rescue CO, or boot. iso mede CD, complete the fllowing: Boot from the optical media provided by your instructor. ‘At the menu, select Rescue installed systen 10) After a moment the rescue environment will ask about language and keyboard preferences, Select as appropriate, 11) I you are doing @ network rescue with no DHCP server, supply the following networking parameters tothe first stage ofthe installer. using DHCP, only the NPS Server and Directory settings will be necessary. Be sure to replace distro withthe proper value forthe version of Linux you are using: 15.22 RRR RR RAR RAR RRR AR RAR ARTA AAAS Ve ewe ewww wEUNWwYUwUUWNweNWWwed 12) 13) IP address Netmask Dotautt gateway 0 254 Primary nameserver 254 INFS sever 254 Directoy /export/actinstall/aistro| Iwill then ask i t should attempt to locate and automatically mount the Linux filesystems. in many situations this will work fine and you would choose Continue. I would then mount the root filesystem at /ant/sysinage/ and mount child flesystoms as wl In order to become versed with the steps required in the worst case ot have it attempt automatic mounting, Select Skip to go deectly to command prompt, Select Shell Start Shell from the menu: nai. do Starting shell. Dash-4.i8 fdisk —t * Not the parton ruber for your Bat pation Ifthe system boing rescued is using LVM based devices, it may be necessary to siscover, import, activate, of create the necessary device files; 1 lndiskscan ‘ln some cass, misconiwed extemal storage targets, tae eM es + such a5 SCSI o FE, alto automaticaly detect Note idev/sd (74.51 GiB) int casa tat 1 LV phil volume is dtecta, 6 disks 17 partitions @ Lin physical volune whole disks L WM physical volume # lum vgs “ther is an x inte atts then the Volume Group We tp HL ASW attr Size Free has teen exgertd and wil eed tobe inported wing vgstationk 15 8 ig-n~ 34.189 25.689 ‘mwpimort vy station. 1 lum vochange -a y vg_stationx 5 ogical volune(s) in volune group "vg.stationX nov active Is /dev/vg_stationt/ Waroot Wwisvap Wtemp Wiser Wvar 15-23 14) 15) 16) 15-24 ‘Mount the normal filesystems under /ant/sysinage/: f mkdir /ant/sysinage oust /dev/vg_stationt/Lv_soot /ant/sysinage/ + Once the oot lesystem i mounted is possible to look at fant /sysinage/ete/ fstab to see the somal f mount -o bind /dev /ant/sysinage/dev/ ‘mounts. LABEL or UUID rlrences ae used instead of 4 aout -o bind /sys /ant/sysinage/sys/ the device flonome, use te fds or bIkLd 4 out 0 bind /proc /ant/sysinage/proc/ cmmards to ientty the device lenmes # mount /dev/vg_stationt/tv_usr /ant/sysinage/usr/ 4 aout /dev/vg_stationx/Iv_var /ant/sysinage/var/ 4 mount /dev/vg_stationX/lv_tap /ant/sysinage/tnp/ # mount /dev/edal /ant/sysinage/boot/ 4 chroot /nnt/sysinage/ +The choot command may not prove an inéeaton that you are curently in a pseudo soo. Repair the master boot record (MBR) on the primary dive: 4 grub ‘The hit and bX parameter passed to grub shud be ‘grub> root (ho, 0) the haré lve ond partion te BOS toot ram oot (hd8, 8) Filesystem type is ext2fs, partition type 6x83 ‘grub> setup (hab) snips Done grub> quit, GRUB should find its stagot,stagel_S, and stage2 images, and report succeeded in anything run after the checks. If there are problems, ask your instructor. xt chroot, exit rescue mode and reboot the system. Once the system has ‘rebooted, remove any boot media you used and start a normal boot process. I the steps above were performed correctly your system should boot in @ normal fashion fb exit ‘teint the cvooted ste exit “esminte th seve envoment Soloct reboot Reboot from the menu. RRA RRR AR RAR RAR RAR AAA AAA RATA AAD

You might also like