em at ee Pl 4 ik 0 15 1B Se 5d OT it KF EB w oF ATR RPM Linux RABBI A (2.4.0) WRU CTE LAT A SHAR AUIS LP AT. {PARAS RARE? He IPT? AGU ITEAEA Bl, KA ABRUE, ART MH Jy id RAL “RK” SEE. PUL, LBM RAT A, RATA, ee tMC IAL TART A. ESF. oe AORRES STI ABA ACLS HHI TEP DARI SRS, SET LE A ABOU". “AULA”, RAD SE BUTTS IRAL, HAE EMR REE. oh, PRATER, ESA AAER i), TE MR PONE ARGU A BR Soe aE AB A A J tS A OER OE, REE PEERS NVA, BL, RAMEN. SRNR REA LPAI. BKE, RAMEE MORE Pia), MME AOR, ARIE LE MAR BIMUFAEEEM, ETRE ASME TMA. QRENTUNE, MERE AF PARLE AT EAMES. HART S HAAR ASAE. JUAER AX Linux WANES) MR ec HE eae, Bl, ATR ALE, MTT LEE, ASE — MRE — ERE, GA BARBIE NCA UHM, J) BLEDEL ER a PO Eo Te RRR A LI, eA SPA ABYSS RE TSI SRA EMS RY “TAT, RE. AB Linux AAPA AR. FET, VRE RE AEE, te OE WEEE M, URRES. AAP, RTD RSA, MPT ATER, Ae BRB MLA. ARTERY BURT, RUFF RAS ARI RTO Linux PU 2.3.38 AR, FARES 2.3.98 Al 2.4.0 RUBRIG, SUISHRAE 2.4.0 LETRA SB UE RS. BERTH LAA KI LAT PR IPT PB GRE. TUE, SATA, REASAA, HMMA AHR 24.0 T. RATS PRUNRBR RE TAR, L240, ARTIF SPAT BEAL RURAL SH I TERR HE: DAS RA UIE Fie, BASH ZR A RS, AAR CAE EA ATR I A AHR Oe, A ati SOLVERS TWICE SE. DUIS. AT a APR ULL A. RR eR RAS URE RAN TAPE Sh BE AIK, EMEA PBR A BAT BY LE LT AR AT BER A A AY BRS AK II LAT YR PE ACD HEE SESE, eee HRY Toe ae he a ACIP. AVE. BaP (Sete RSA EL ALIN, “SORE T A LR aH, SRR ZR, RE SBR) SOE, TE sO RUE MEM ASE. MCP. BARRA PERE R. RAS PK POISE, RATA, BERR AEE ROT, TEI Pe] A HEE ARIAL ADR FR» SPEEA ek ES RS A SY R19 ACHE AU RG, Ubi aa Pe EMBASE, DML BR, RITA A TT ATER PMR ASRT AEE, BURT NB RE RAT. SAB AA SAE ETHLEANLA TW, MRRET “iwi TA ERS RG, UMN RAR PREM FARRAR fil AMM AC TEA A B92 BAR ie eB OAH BC I BPR. 3s Ta ANISM AAAs DS SE CER AS A YF SURI [S.A CATR RAT eT. RING URAL: eT eRe MBM A CONET. WAR -BAECMMRLS. Fe Fuebthitd, HET UES ZITA. Linux 9 BA HE SE LAT EE TEA TT I. DATE LA Bl, AMI ER Cy Acme DI IEA SEAR AR: RE Linux CABLE Unix) REA SRA? SRA Linux SEPARA? HOR PAR, AT RIK. AMR AAP HL ACARI. 
HEL, Bat SCAU GPL PIN AED. WR, Linux ABU HOARE RELA, RAPE ATMS, ASTI. MEARS PEAS SI ALI PTE ER Te — FETTER ES RRR A RA RZ BT PARAL TLE ee BON ALR Di TRA PERU be ATF EEE LIB ATTA. MOA Re RE AAO RRA A PET LO SIA” AYERM, (RFMD )LA IGA —-NR RE R. OR. PRA, HP HLALR Pe WES, AERA RU LAE SHUR VARA MR MEST, kt TAA PERLE RM. MO, OT MERA, RADHA. Heo SBMA, LP ROR ATEAL. BPE TA Linux WRIA IK AIL ALE BEATA fi, eR ves HLL, WIR EAR, ik 28 Pe a A ed SP TaD aT AR, PAT ARE BE, MGB T RT OZ eT OM a nt SB TA. NPE, TS EP. IL, PEREIRA SCI. UE LE PEM, SCPE REEL ASSO Unix HERIODEL AL, AH. PARRUGEY socket AUMLATIALTIN, BGK Bh, SAE SMP RSE A RSS| PAI, IE. LE PAH, ABLE AOA Bese Tika AL RIT Mk DBR ALA REBEL BUF TAA. AA AL, BRACES Pe aS RS TART SL ESP Le AS RE COIR RE as FAR SAE eA eR). RTI, A AE RE OL A TE RAT, FECHA. MATL A. -2 BURBPRAT FA HE SCPE ERA ATE ti AIR, ATTA RH SRD HT PAAR aA, ABCD ATE OF A RTT ATR, A OR HAR. (HL, RUT RAIN, BUTT, WHIT. 20 FER, A TRAE PEA ASA AIT. +R, FSH NICT ARL L MVHS ADIT A. ARE, GR PER SL TREN, HAE SEE aT. RT SAL EN Tes TEA CUE BGB IR, FTIR ARUN SEA PAROS FEI. THT AAO Aa La PWS CBI Unix A Linux (Ska), PEATE a EES LAX Linux (5, JETER FE LRH A, HARE RIE HAYES. PIRABBI NER, SRAEBY TA EBL. SMD ATSB JL Ae Be A er Bd A FUSES, (PENG AAL. PYAR BIA, BVI MRE HE ids OLE AR TAPE OE Bh, BORER ME AS, ATAU AICS CL EE RB PAAR, ADS es ee fea BEB A FO A RS Ba. SE LR RE APU RAPS APR Linux OFFLLY SAS = (WAT ASS — PEE T RT TARR EE TRA Unix CIRCA Linux BORRUTIL. thy EE EE ARR, AAT IPE OR EIR HELA BRR HI STARR Hy BR Ma MAE BAT MF AR Aa AY A CE R.A, PBL TUR AE ALR. Hoa, ESUR KECK, BE, TRL, RT eek aA 2S ol LR RI A, PT DAB Ca AAT At eB. ATM, REM WRASS 1 RS HRZMEAAT SARA EI, SORA SY SR MABIRA RAE, CBE. EAB (Decao Mao) 19 Orchard Hill Road, Newtown, CT.06470 usa BA BIKA FREI BLAS SUNTIARLLGR 138 S48. LIF (310009) 20015 HLH Blk WeR.. LL Linux ABMS. 1.2 Intel X86 CPU APH PE 3 13 1386 ARRAS EBL eee 14 Linux WBE PAY C (LFS . 1.5) Linux PPLE TIC Si BARB. 
SE FERRER 21 Linux A 4e 9 SE AHE: 2.2 AunbBRAT ROEHL ae. 2.3 Jc CE a a 24 TTA, 25 ALP HPA EE... 26 Op UH 1 FP 2.7 PR TUM HIKE... 2.8 Tene ue, 29 RMA... 210 AEM AYE... = DLL MB ER A 77 eV BH LB BIR PB. AAAS... 3.1 X86 CPU af Pi MRRP SHS. 3.2 spi de IDT MRE HL, 3.3 PURI ARCA TEA yb, 3.4 HAO AIRS. a 3.5 SHG Bottom Half..... 3.6 SUE EA Ae. 37 HYPER... 38 AGIAN... 3.9 RSAN S SBE. Linux WAH Pin sear) Cat 84% PRSMAAL 44 42 43 44 45 46 47 48 49 BSH LHRK oa 52 53 34 55 56 $7 58 S56 HE -HEBEAY Unix HEREIALRLE . 61 62 63 64 65 66 67 68 OBR, SRSA. Oa. PT SH SERIBH fork( ). vfork( )'d clone( ).. HAA execvel oon AFA exit )'5 waitd( ).. HERA ES DH... IH Me... ARK nanosleep( )#11 pauses ) Pa Be aT aL ER HE. ASH 2 BI a a. Ui IR Seat. SCAT AR AEON Se AR... AT RA. SCHE'S Sik SERIO E.. FPURICAE RAE proc IES sree AGIA ptrace( RUMEFLEE ROE REAP. SR. “We He Aw LL Linux A 4% fa) JS TEN RABAWARE LL, Unix RP RAE RE 4+ AH IM. SIRS Unix BEE BMGT LRAT AREOLA EAR, JP ELYRIA kay HAT ALA Rh HURT Sets TTA, AEA. TOL AR BRE RE” WP REE TR. APB Unix A 6 RRC, RS KIRA LR Ee HE EAU, HASST OLE, SEE IRR ARAN FAL MLA RR AA AE Unix AOR ARS ICI, iL, UAT Unix (PERE, ALISTER T 4 Unix Ph. SEL, IR ARR. WAS) Unix PARA. Unix WA LRZ—N BSD MRAM NAAR ALD BE RAY. OK, Unix RT, FRAT RY. FBR AEA T TS 6 HER RSL AIPRIET . (GREAT ASPET Unix Py CHS CPE Bit TA SUB EAT EH ERAT. NECA, ZAR Andrew S, Tanenbaum M47) £—P MH) “OK Unix" #8 fk Rt Minix, 7 PC HLL. KOERIZE 20 THEE BO A A 90 EACH BL IZ FH. (BE, Minix Bie 8 Unix”, CSE Unix HA. AL, Mini AMATI “PA”, 5 Unix WBUR PARRA, 2h8e LAR I FA TUiS. PRR Unix BARRE AK, GT He“ 9bIE" Shell BST ATER “SARE. UU RAPHY RHA SCRE, BASIE SRR ALTER Unie BE. GRRE. Minix RRA AA ARB ETH, BE I. BB Minix (ik MR 3) Nae Linus Torvalds sai 7 HEA, Eh Minix Ai, SEALE BERK Unix (eG, HMR AMAL, TE PC ALES, THRE TRIE STELSEAIRN Unix Wik. iE, PORE BEY GILYR) Unix RSE. MA RACANVIRICES, ANAT CENT. vy, Tanenbaum £2 MAME REL, UUW RPE. RA RR. BRE MLTR", MENSA, LAABL, Linus Torvalds DMEF Tae. 
WT ALIA b AL Unix, Linus Torvalds HCE 9 Linux. ABIL ERIM i ARR AMR BARR, (HLA YAH TOA HAIRS T. Linus Torvalds 4 EA TCH L Linux AKI — PR AU BREE SRM L, AGHEA CNR ART. REET NEMA RES. AER BP aA PRR SR PR, IS eee” PSP TRIER de Linux /9 Hei8d era cn HES FSP UAB REFER P28 Unix AE Unix, BELLARY GNU, AL “Gnu is Not Unix” fit RS) RE RAISES, il) Linux ABRIL. APE. Tek, Ht Linus Torvalds + 4908 Linux WMP, MESH, ate T PSF IER AZ. RIM, PSP ASE. on GNU OY C WE goo, AAPA TA edb, HAA Shell RISD PLA, THE Web HR%A% Apatche, i) MS Mozilla SERRE ase Netscape) 879, WIE RA ZACSENA. ATPRERETA HF LSC DIF EP LP TR. KASHRHSS, HARM ERAAT RAMS, RARER TRIACS. TF Rite BOTT ALLEL AM RAE, EASA RIS MZ, Linux 5 EAT Minix FUE BIATZEOE? HPI PUA, TD Linux tb FR": Minix E728 Unix ASCE, iT Linux HAS LALA Unix, if) AL Unix AYRE, HE SHAH Unix RAS RA AAALA KRG, CHEN ORPRE, MARTH EET HME RRS, Wes Ae EW “HEAL” ROARS, WUE, UAE SOR, CANT. ERAN RRSS. APRA AIRS RS ER AE EZ hs ER WAY “Client/Server” MKB. MI, UHM A ARMA AT Heda ay UAL HL FESR RAE “ARLE”, CT NICE BW TEA BP mG SP Se EL 9 SRE I SRRANABHBMERN EL, MAN BAS MS RATA MMI. URS ERL, BERR CUA BPO RED, (Bey DUR BP. SELLA TBR, ECR SE MAE NY LA Hess RE A RE MUA. BPA, AFP PRAIA” (Micro-Kemel) ENGEL. HALA T ASM RS, ERREN REA “WAR” RH (Embedded System), MARMEURMIRAM S|. THERA, + BARMERA AMR, BP RAMY AIA EPROM PF, HERE iif ROR MR MM. BL, LT TAIRA RAAT RMR, Ml PSOS, VxWorks %. “8, BAHASA, SRR SARA EMRK E, ASB LEE RIALI Gl BRE) RAR, HEM ASIN TIA, BRT RAR. SOU BAAR OE, ESET Ee RY “2” CMacro-Kemel), BAK “HELA A” (Monolithic Kemel). HASH AR FARA ER, A Bh ee. te JERR. Linux RA OAK ALR BRE (2000 Unix BE “FP” AN. RRB en AS ee COR IR). BL — TEAM MALY, HAE A RL RR RAE), TRE AK, FAW THARK. RAMAABATE, MRAM AERIARIRIE, TR REM, AL AMMET © ASSP QUNIR F. SJR ANS SH AA BI CC, a “CMR AT REAL PLE HIE PRR a EA PEP, ‘TOF “setup” ay tit Ew ee SRL mri Alpha, M68K. MIPS, SPARC, Power PC 3 (Pentium, Pentium Il $$) B+ i386 AW). ATLA Linux A SES i DRE. FS. 
de — 4 REA) CPU LE, Linux WW BBE ANAL REE, COMI INH CPU HIM, RIES CPU HAT. Bit, A HHESIE T 1386 CPU, JPA CAR CPU HWE, (LER AS SIT ibs CPU HHH. CER MEAE HY Linux REET. ERIN fet /usr/sreMlinux. WARIS, GNU Fa FBIM Linux YH EM tar CFE, URE s e—44 Hinux AF BR. AP BUR ERT. BLRAA linux PTA. Linux HUBER, KOE Ba COPYING HR FSF ASE HF ALE BLA GPL MH AEA [— README Linwx Pa Be ce A AE FA [— Maiefile EAD Linux ARTA GARTH m eke SCE [— Documentation AH Linux HATH J— aca arch J& architecture ia) M40, BOE S UCP U BUR SCHR HA RAGS NA ARO FEE RRS DAREN n STRAP LR inclodejasm HZ K [— Alpha——ii DEC FREAD 64 Chu [— i386 ~~ X86 AI A038 GANA 32 CPU. wR 80486, Petium sPentium H. 494%, hOIAMD Ko Fe BRS |__ m68k— tH Motorola sP'32 9 68000 5H } mips —— RISC CPU EI RISC CPU TH. EAA T sun LA aha BLA F— 390 —— BM ERRATA {ings —— intel fy 14-64 SHI 64 (2 CPU WH: AED CPU TERE, AB—VPHMA boot, mm, kernel FF HR, linux — PMOTH RANE. AACR. KAM MENG. pL ARWAT RMT CPU PARE MOR ERG Rb Rae RIC BRM, ERR RAG drivers ee CD LI LI Ae BE 719 BL HO L— 6 SCPE RAR, ED CHROMATE RSE SCHR, se FNAL ARH ARE es f— include BA THA Eh REEL larch FH ORE, ME include LEW EAR CPU BER-OFAR. WAR THR asm WAR RANMA AE #2" BAAR CPU AY GAT TH. kt asm i386. acm mokk . BRIE ZIb. SBA MUTE Fa hin. net L__ init Linux BEC maint ) AAEM ALL AL, (246: main, versione EK |__ipe Linux WEEMS IR. WAR: utile, semc, shmc, msg.c LH |__ kernel MER PMAILAINE, 14% sched.cy fork.c, exit.cr signal.c, sys.cs time.c, resource.c, dma.c,softirg.c + itimene BX 4E i UL AS LAL TAR fe a [— an PERI. MMRTERE AE, HE. swap.c, swapfile.c, page inc, page.alloc.c swap state, vmsean.c, kmalloc.e, vmalloc.c, memory.c, mmapc XE L— net LT REANIM RLM AR BE BE L_ scripts HF REM HBOG OCH Linux Aya La fA —RATE, Linux Sis eu A SESE Let IPR LB. c Ah SPMD), TAR IERIE RAL SE. PUN, RE a TA RLS AAA CPU RRS, GELS ET AR RR ETRY AR CPU HD. FE, de net FAR POST SMEARS, (AS bE SR BR. TH ROR SRL AR Ae ERRAWZ A, WEE PAK Linux WAM 36%, TERA! Linux BY, AAEM AML STAR LS Ap. 
PER Bh, RRP RSE — AO, ACO ASP. A, AAT ELEE Linux 999 Bea Linux, FDL ZEUEA! Linux PARAS RIERA, HH ROREAR, BRE PO. AH, WOAH BUA. Ml) Linux iB ALIEN. Linux AMARA RAT LAT A CASULA SINCLAR, BRS ROY “x yy.z2”> HP x PF OBO LIA, ii yy. zz WF OF) 99 Zi. PHRF RA RT. HER JTRS 2 LB) pNIN SEBEL NN AES OB) 20 Z MCT. EALERTS ETAT" BU VANE, H0.99p15, AEA ALRI ALAC 0.99 APTA 15 UAB. HHP Linux BURPSYI TTC, A ORBEINT AT DOA LE Pa ERAS, CLERC IP A AE. E. AWUEARRTHEA, At, REA-ER SAR, PRP TARAS TEL GR ORAT NR” GB AL “TTR”, PELL Linux (Ay RAR AA SAAT RRM AAS yy.zz TH. x RAS TRAN ERIE LN ADCR, yy ARR ORE, Bah ASAE, RU ATTN” LRM”. OUR yy AMBER MAM, CERT RRR, H EPROM AE ETE RD LEA ARE. RAIS Ao PAT ER EK A RSS FE PA A BMA RR, UE POR, MRR RAF yy HU RRA, Zi, TPR OE FTE IA. (LEAS A TILIA RAP BAT HL EF wz, WARE AMMAN ADRS. COT ERAI INSEE, RATA MEAS Bs, WSCA EH 2.0.34 THERE 2.0.35 RRA 2.0.34 PH) HEIR, RAAT He AIR. “RATA” Al “TP RHR” AY 22 RAMS, EER MMR. BN, FPA 2.3 HR ASSAD) 2.3.99 Wt, AAV AAT RUB At 2.2.18. Linux ANB 0.0.2 MAT 1991 ERA FR ARAT, 2.2 AE 1999 2E 1 AT. Linux Wy BAe AL SIRE. SLES ARERR. AERA MEE 2.3.28 FB, RT COAT MYA ERITH 2.4.0 HR RB. Linux 1% HAE A FY APBLAL IH Linus FATT RAE AN MA. (RAS A) ¢ERAT Linux S(F RAEN ATH (distribution), fi] Red Hat, Caldera FF. MADMIN RAT HA PRAM ATA LATA, (UREA — SC. RATNER TAZ Wh ARP CE REIT. SUID. PPP ELNS > Mba Rew AU PERE Ay ANE Ay, TUL Ab ANA fs AY CHD. ASTOR AAT AGH AS IFT ARATE EAR SS 5 ANT ARAT YT AAT HR A a Het Fl, 67 ARERR I. BOR IID. CETL, MW ERR +P Linux, BF 1B HER Linox” RIE RATA RIE TT RCA. 9b, APE Linux AA MNRAS BeAr fa fh AYARLACHO “Red Hat 6.0” yikti#. GUO, Caldera 2.2 PIM PY BAL 2.2.5 Hk APAS RP, HRTEM RTE ERIE. UA Bt ES RRR, WAR POE UO PRA BARET, GOBER. OE MASA FTP APRA HARI ARP, BEWRAMASH RAMA, BF. NTE TIE ide BK mak ARE RAAT. Linux WRT) RARE TRA. BAP MRT IE, AZ EE TRAM. TEL, AT eee. ATT we SR ET A Tee eT, DR) FHP SOIR. i TRA TR FRAT EHS RHE. BT RMASIPL ai RAT RE RAKE. Linux PA RAPES HAVA ERR RFE (ELE DU i FR SGU Fe TR A ES AC hl. 
FURY, Apt B) ASRS PR AOLAT, PEE AER LHD ALERT A HA DLS AAD 3 JER FORMER ATE. BAUM, ET MASK” REINER, A ASL AW Embedded Linux: $34 “UE EN ERMA, AAI T RT-Linux: Ht FPL TOR, AW AGLFFRMT Baby Linux; BG SR FP Linux AC FR. RABIN Limux PARA ACTIN, MER NAR a th, {RRL SEMA EAS. A FSF OS CLA (GPL), LARA AR ifsc PELAL IAG, HEAL, BER Linux RRS ATTEE, ABR ACATIA RAL LT » SEIAHR. Linux U5 Linux APRGRICS, ARUP, MIA CREAN BPA. A ARIE FSF (948. FSF WHT HUN GNU Athi TP AHMET GEL. A629 GPL (General Public License), ULIY Copyleft, ROE SKM AT VEILED Copyright #28 7S FAI HME. Copyright BiB Aa XE IMAL, PREME REAL RAE AT EY BEAL, TT Copyleft sav F AL SI TE RET Sil. HO, (ABR 7&8 GPL MI "2X9. TK GPL ME. FOIE RL BE OL GNU Sct. 9¢ HATO GNU SF SRG TTA. Bae. GPL ib FOPP EAT Se RH GNU BPR RES. IF PLE URAEE HE, (QAR GPL RRR. MITTAL, Hea A GNU KIT ARTE GNU HK rae KCTS, Aa CRI. ED Ba Hat GNU Gea GNU), FF HDB MR TERE HAE TCU, RENE ROTATES. BZ MR PPAR TE GNU BRACES IG Month CAB FETT HR, FS 1 TR A aE es AY LH TP GE BEM he SRAM ATA ETE). Bees, BHR HESRSORAR BRK. Dil. OUR ~ MiP RAM GNU BAHU FE CAPD MURR. MASSE GPL RHA RAR. RL, GPL AYRE Dw FE: AH SARE RTA RSE RAS TPO BE LE FEARS AE SE. AASB) Linux AB ORD, UR A BURRIS A ea eT ABC. REMAP SAT Linux ABP REE, OITA HPA TPIS. (LE, MURA TT BEY AP, JR ESRD LAR FEA PT I A TU UA eB PAR, AS 2 GPL RRMA. SL, FSP ADH) RAR WEA Be, HA A Ae a. BEA, HARMS BLA. NAS ERAT, — ANY Undocumented DOS, Fi BN Undocumented Windows. PRACT BA A DOSWindows ABA HAL LAF. HERA #84, fE# il] CAndrew Schulman, David Maxey UJ Matt Pietrek %) ——si6 7 oie ttl RDI HA BEE AL Bil HAM DOSWindows API CMH EEF BHT FEI) SEE AECL T (ADEE SLA Microsoft SA POAMES AM Chi AA) MISH RE. FEET. Microsoft BATHS aE TI ABI A ARPES ATE FA 2 BGR LEA, A A ABR. Microsoft BLAH TE RERIILEICA RUIRD JE ASME PR POO AR BE GULL FAIA SP 2 ET Rat, a AMET RI FEG Microsoft A 3EF, Mi Microsoft ef Lh LAT RAMA A WB MIA BIAS DOS/ Windows MM StPHitai Ft. (PETE TPH Microsoh AHMAR ARE LMR, WAL iM. A ROP AM BAe. 
ATSDR MAB ES, of WML HL Se Linu wy Heated Ea) BEATS A Microsoft (8% RARER VER Be2245 FSF 5 Microsoft HEHE, Wot ARATE. LZ Be ACHR BB ie. GPL HLH EY COPYING MISCPE. 4H AE Linux RAP, CHER % Aslust/scrMinux/COPYING » ide FARE Linux AK tar KP. BARR SITEMEARP. A NERAA KNEAD HAN FRE. 1.2 Intel X86 CPU A 5] 4F3LF A, Intel ORR ERAS ER SLi RV MAL ARASH 4004 BEAL Intel ‘iE H. FI X86 FTI, ALA intel Sh 16 (LiALEARE BORG THLE CPU SARI, all tA AUG AUR SLAIN S HRS ARE, ELA] 8086, 8088, 80186, 80286, 80386, 80486 LIA LIE AA 9158) Pentium HUSAIBM i848 8088 ADE PC AGT SEBLEUR, X86 FURR RET IBM PC SEAR BLAU AIRBUS T SEP 80186 JPR) ASST, 85 IBM GE ILTE BC BUH HEH 80186 HK. PAT, ABARAT AI REEMA, std RA Linux PURE TE ARE Bt FPS E HEYA ULM. 46 X86 RIM, 8086 Fl BOBS JE 16 (MEIER, FTA 80386 FTHAY 32 AEE, 80286 Ml AAS) JA 8088 FI 80386, CUAL ALIA 16 (2) 32 Arse RIM YAS) 25MM. 80286 AMRIT AE 16 RESID, (LE CEMA LETTE T A “SHUAEAR SC” SN CADRES” URE “BATR—* CPU # “16 FE" BE 32 0" WT, FEA aah A” CALUD (HEI RRSP ABER, FN RUBE”, HES ALU RATER CRA BIS. IBA “abst 388" CSREES RIES MR RB SREB BRET SER BL, A Shk. BR Math, BUR 5 — PRR. (LE, MUR 8 fhe CPU PLB AY RS 2, USERS LALA SEHS, BY >t 8 EAN REINA UT 256 AAR ANSWML ATE, EAA Ay J FAVA, A858 RE CPU (yihhk Ae ABE 16 fii. RetwieART Ae 8 Ge CPU EMEA HS LI HOSTHE. TE 8 CPU ES: be ARR He hp LE 16 HERR HE. 4 CPU (AERA 8 free B16 ALTAR, ACR SBAL ARM ACEH DL OT AIT IR AT Lo A BRD AHL 7H] (64K) EA, ENTIAK. Ms SAME? Abe ALI SB AL FUE, BAR REABAS HUH tet Yes RIT IM, ULAR 64K MY Le (at. BNR AINE AL ALA T 18S, 1M FPMACS RENN CARE KEE ROAE T, BRATS MEL, BAK ABLE 4M PSA ESI). EOL RE, LEER, TR Je PLS EWA Ae Ht AN. BLA Intel RE T HEH 16 (ie CPU, BY 8086 ARH IM “PATHS AY ee Sra), HEL AR ay Ee at HIRI T, AURAL 20 1. JAE, ANUREBRERAET Inet AUULEE A DET): SUR a a Wee #2 200, (A CPU HY ALU MYA AA 16 OE, BLS OE) AH LL PTET CRE 16 CU. fo PURIRANA MERRIE? TRE TRAST. (IL, ADR TE He 8 fi CPU PARE. HR E20 (AMES SR SBE RR IE, (HE Rie CPU ABE HE. 
A, tity PDP-11 RUBLE 16 (LAD, (LEA FC MMU CATE RESE HIG) ATLAS 16 (fsa MRR Sy 24 fh ALSEIAL. 65, Intel BLT AYE INE SEA AIST, BMAF ELM. “6. PAK Biorin Tntel 4 8086 CPU HE SU “Eee 7F 88". CS. DS. SS AMES, 4S TA SUT ABA S Ma EMAISLH. ARES PE RAE 16 (eR, AME HE AES 16 (1. Be “OTA” aS AY PA AGHMAL” ABE 16 CH, (HALA EE RZ HABE CPU ABB Babs Seek ay fe eA FARM, HER —7 20 ROUSE RR HHE. GREE, REEL T AA 16 COA SHB 20 {USE GRR, RA “BRAY. TSE RR BLE 7 ASH TN AER OL 20 Cede BRP AT, 16 ie, FLA ZEA DOIN Se feb he: PY RABEL HS 12 CSBP SH OY 16 CARIN, MO ALRBNME PAIS 4 REAR. RTS BREE ASLO H “BONAR” FA, RATE OE, ERA MEnL Sind ERIM. ate AE ASU BRAS FEAR IO EHTS OREM”, SERS ROUT It FP RRY 64K SATAN I, ACEI ALR FIR, PKR ERA ARGC AETA RUNS", HAE, lid CORRS PASIAN. TERRA) LEPTIN — oe, TE ANE BR HB FAI EO A 4 APR, RAS ORE RAS AAD. Sta, 4 CPU RRL AT NR, ROE, RRA LTA EE, AER pe AUIESS. BET 8086 MULLAH A dF Stik Uy SOAR UA ARIAT ERR, BC Ay TBR SIS Fes HULL ORS RAR BU", BRA An “Seb. BR, GEISHA EACH ITC bY “ate RAR” OS, ELAS 8086 AREF AEG, Intel M 80286 TTHASEIR“ PAD HEK” (Protected Mode, 1272-5551 19 80286 FRESH AARHEA PREP, AASB ORT L CHHEAESN). PL, ALL 32 Ref) 80386 CPU tH IFARARIY T . RFP, JM 8088/8086 Hi) 80386 HLTEIR T ak A Ee BRILL 16 fic CPU BIBLACAY 32 (CPU AY EEK, TM) 80286 MAE ARIA EK CERI — PHAR. I 80386 LUE, Intel (t) CPU SE 80486, Pentium, Pentium I FSA'S, Sk ASHE LAR T WLM, Chae DALAM ictal, RAS TARA RE SR. TARA, ATLL 1386 iY, Be i386 CPU. F TD RATSLL 80386 WIR, SHH 1386 AMMO AESK 80386 4+ 32 fi. CPU. tha BEIM ALU Bc Oe M32 GA. AUT CERTTE BEL. AAR Se BL. bd AR MAT 32 (A, FOP ALRE DAE T 464 Tk). AFA 4h 32 HE CPU MIE, SMM DBR AT. AE TREE. ATP RAHI FR, 80386 DAE ES EEL HF 8B LOCA ALL, CCI AER AL. AERP ALE hi AS, RARER TE AB HEA | AER EA) — BL, J ALGAE HEY CPU PY AAD BEDI? LEAF Intel 98H A RIE SEE ith Intel B62 SAR aE A ASIA ASCO. AR BLSS TERY 16 GRA ATLA FUGA DUS Bk es Fe-ah >, ALL HR T PRE AERE PS BGS. TSU ARAP ESR. HR Be ee PRUE TEM TOSI, SUI MBA VIUILERET CAE, JAE HR, FRERZ AS. 
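The way the 8086 forms a 20-bit address from a 16-bit segment register and a 16-bit offset, as described above, can be sketched in C. This is only an illustration: the function name real_mode_address is hypothetical and does not come from the kernel source. The masking to 20 bits models the 8086's 20 address lines.

```c
#include <stdint.h>

/*
 * Illustrative only: 8086 real-mode address formation.
 * The segment value is shifted left by 4 bits and added to the
 * 16-bit offset; the result is truncated to 20 bits.
 */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    uint32_t linear = ((uint32_t)segment << 4) + offset;
    return linear & 0xFFFFFu;   /* only 20 address lines exist */
}
```

Note that many different segment:offset pairs map to the same physical address, and that nothing in this calculation checks whether the access is permitted, which is why this scheme offers no memory protection.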
ARE, GAGE MERA, TRIE REUSE. Af. Intel BARN SABE: GRP HOR POCA ATT? INA. MAA AL CBRE > ARE PL TRG MEIRET. A, AR THBS Ra) -P PAeSEA, CPU ain) Dk RR USEING ee eM (CL) ARR SS th RR ae EL LASER Zr aR, METS POHL CER, Ta ARS DEL AAR ER. IR (2) HUBBER EASIN AS, FAINTED “HALE eR”. ) JAR Be Hi ei Mrep Apa bai nux Bet my 4) MSP RHE, GRRE RK, GREER. (5) ORAS ICR FFP Ur FA PORTIA TE 2 LA 0 © AES HRM HAE, SHAH SERB “RAS” BRERA TERETE TET. TESS LAL ADRES MRA CPU TAA REE” EM, it CPU. PERETTI NUE RSEZE CPU PAY RAP. ERP” APRA, Zech GROEN) ABA CER A ORAL) Hah, OREAR MEW LM METI, CPLA SHANA P EMT ARLH AMET ER HAS, SBR AAS, ADE SE le) AEA ED oh HERE]. HET TXB, 80386 (BEC 17 FF ALAR LE BCA FRR COBIRSLAY). Pat EAL BRM. TYE, TE 80386 CPU TG THU 5 TF A: — PIE Dal HY LH AF 77 8S GDTR( global descriptor table register), A/S AR Jed fib HE MI Bt 2 AF7F-AF LDTR(local descriptor table register), 4H) MT LAF HAPMEAT PE TUDE MSA, EM VR. HAT RETR, AE ES HHNKS LEMMA, ARPA ER IS TR “LS”, Hee RR RL, RARE RRM 13 AL CAM 3 AICS PR UHL FRR TE WT RAR ee a Se (index), #0 1.1 Wis. 15 3.210 Ty RPL RTP RAS, DO= EA, ARNE ‘TL= 0M f8HF GDTR — = 1 te) LoTR MA 8192 PEE RR ALI FORE AEE MB RNY D1 Ramee RAE GDTR ak LDTR ‘PHBL ARAB Et AER A AE TER HN bane ei, ATE TA POTEET ik, HAT CURR, ORBLE ARAN ARIE 3 ACRRRLERLL ITS GDTR 3% LDTR AEA PH AZM RA. ASA A RSL, ioeLE PRP RUA. BMEM ATION 8 PP, BMRA SHAME), Bah Fhe See 1.2 BTR Sit B31~-B24 Al B23~B16 4) Uwe bitl6~-bit23 AN dit24~-bit31. TT L19~-L16 #U LIS~-LO RI AER HE Cimit)iH) bitO~bitlS Fl bitle~bitl9. JL? DPL 84> 2 RLELBL, THT type 4 4 EMILE. EAI AEA 2 A L.3 Dios 1 a B3I—B24 G|pja Ligue P |ppL} s | wee B23-B16 Bis—Bo Lis-Lo 1.2 era i TYPE 1 P DPL S | E /EDIC] RW| A L— A=0 REAM: =1 CART B=0, SEER ED=O, 11:1 EBD ED=L, (AFAR GBERBED Wao, BARE A, 1, HA Eel, (0556 CHO, BERTH Col, Mei HeE RO, RAK Rel, Tie SO. 
Aastha Sel, MUSA Bet DPL=OO—-O1, ABER PeO, HAAMLEM oo ec ADA.3 AHL TYPE FAME, PRATT CAB * Ob PCH” SE BEB A HR AD typedef struct { unsigned int base24 31 : J/* TE MOEA BRE #/ unsigned int g@ : 1; /* granularity, AAA AGL, 0 Has. 1 AR KB #/ Linux ABR a unsigned int db /* defalut operation size f#MRiya, 0-16 fit, 1-32 ft */ unsigned int unused : 1; /* HERERO 4/ unsigned int avl : 1; /& avalable, WRAITH EI +/ unsigned int seg limit 16.19: 4, /* BLK AEH RR 4 fr #/ unsigned int p: 1; J segment present, Jy 0 BL RATER AIAN BATE ATE! +/ unsigned int dpl : 2; /* Descriptor privilege level, WM ABUATALER */ unsigned int s: 1; Je PUR 1 RAR, 0 UAE +/ unsigned int type : 4; Je BROS, 5 boot S ake, aH 4 unsigned int base_0 23 : 24; je ESWALAIAS 24 fe #/ unsigned int seg limit 0.15 : 16 /* BRICAEAMII 16 fi +/ } Beam, EL BAYER type Wl, “: 4” REAR 4, PRIA 64 1, BLS PES EK -RA: AT AIHA AGRA MHD? BN, tT A AEHHEAN Ay 8 fica {G24 MAREE? RA APRA AE: FPA Intel ATR ALE 24 CSET, JK RR 3. bik fa) CU HY IE IL GRIME: Sg RAIL 1 AY HAE A Po KB, THER KARE BLAINE 16 BEAN AERLL 64K, BRLA—“MRIREACPT RENE 64K X4K=256M, TRIES 24 ALOHA. BELA, AT DUEH!, Intel RACAL 24 (LMWH SETH, AA LV ARB SAH 32 4 {AE 80286 CBRE T, FRR AEE. SITE Intel BOSE A — HE Bi A SERS” OR ‘ee A—-TRA ERMA ARM GEL MOV, POP AES ARE SAE), CPU RIEU EL iC BE RH PS A URE HAIR A CPU UAB —7S “RE” IRRITL. GCHE, CPU PIL RFE SRL E THEM, PLB A ARR R DH. PRN T T, —a SET COPA AD, BSBA ERR MME RTILAY, BRET ROE PHB FRR], GK RRAP ARG ER CPU HEFT. 7 80386 MA RAR SMe, WER AMR MEM, MAAI FUER ERO, FEAR BLCHE BE RURK, BORGER — 7M O FERRE aE: 32 (ith a rei) ~ “he Bh. EP AEHE 0, JEIT IOS Seth LABIA, CPU HSI Eo BELT oP SR HUAUHBHL. QOPRAU HALA AT ee“ BLAP TESS / AEE aR” URRY “ECA” Ht, JTL Intel HH “OP TH (Plat) kkk. Linux APR IOURCES CRUEL geo) ALY (Minh. SREB Mee, PD SLAF OL T BAR, BA AERR REBUN REULAL, TRUER UA TE FAP GE TA. HF 80386 WRASSE IEA AME, DAB RECS SPT EEE AE TANF. EAR SET AR BEA TY LSE Intel AT FARA EA AVR 80386 HAAKGTLOEELH, TUS. 
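To make the layout of the 8-byte segment descriptor more concrete, the following sketch extracts the base address, limit, privilege level and present bit from a descriptor held in a 64-bit integer. All function names here are hypothetical. Shifts and masks are used instead of C bit-fields because the placement of bit-fields is compiler dependent, so a bit-field declaration like the pseudo-structure above should be read as documentation rather than as portable code.

```c
#include <stdint.h>

/* Base address: bits 16..39 hold base 0..23, bits 56..63 hold base 24..31. */
uint32_t desc_base(uint64_t d)
{
    return (uint32_t)((d >> 16) & 0xFFFFFFu) |
           ((uint32_t)((d >> 56) & 0xFFu) << 24);
}

/* Limit: bits 0..15 hold limit 0..15, bits 48..51 hold limit 16..19. */
uint32_t desc_limit(uint64_t d)
{
    return (uint32_t)(d & 0xFFFFu) |
           ((uint32_t)((d >> 48) & 0xFu) << 16);
}

/* Descriptor privilege level (dpl), bits 45..46. */
unsigned desc_dpl(uint64_t d)       { return (unsigned)((d >> 45) & 0x3u); }

/* Present bit (p), bit 47. */
int desc_present(uint64_t d)        { return (int)((d >> 47) & 1u); }

/* Granularity bit (g), bit 55: 0 = byte units, 1 = 4 KB units. */
int desc_page_granular(uint64_t d)  { return (int)((d >> 55) & 1u); }

/* Segment size in bytes, taking the granularity bit into account. */
uint64_t desc_size_bytes(uint64_t d)
{
    uint64_t units = (uint64_t)desc_limit(d) + 1;
    return desc_page_granular(d) ? units << 12 : units;
}
```

With a base of 0, a 20-bit limit of 0xFFFFF and the granularity bit set, desc_size_bytes() yields 4 GB, which is the "flat" arrangement in which a segment covers the whole 32-bit address space.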
UATE, ARATE SS PARI, CPU SALIBA AY 4-8 LALA GDTR 8% LDTR Bf PA AE TL EL A CPU Hh. ZEWEitARE, CPU Seer A MUmTH AY p PRARAL CAE “present”), SUR p Arabi 0, RAAT RMD RAAT CLA, TR EMR MAD, nt CPU AF EUSA (exception, 41 FH iii). ij AAIC AY AR SAR IT RT Ok MBER SEA DC ARS BRL AEST PAO, THREE TP AAEM, FEAE p RAE RE 1. AMINE, AT ADS -10. BLS titer aie TERE ST DVS Bia, JRE AILP p Anas irate 0. SY LSA Ee AEA SCH Fe 1386 GAPS PAHS) UR MARS A ORAS LBA UIE ERR PEI RE, IRAP TAU PE, WIR A RS BIER PIRCR. GME SDE SIN, AAR RITERE GDTR #1 LDTR ffi LGDT/LLDT AA SGDT/SLDT “Sah MLHS HUES . TEAR HH PING AER ELE RIOR AIRE LEME RAYA EP) (RA, AAR URE ASA ASHES GDTR AA LDTR OPS AE, Ey BALA OG RICE sa He AA ARLE, MAEUIT MBER AT TENT Sin) CREE RARAS FABRA TRH OE FRAT RRA MRP OL), IA, 80386 2K he BA VPP aR, FAR ETN AAR AZ BROLIN? 80386 JFASLAIR—AR CPU HA AT BABAR. IGPU RAR AS AU RAR. TH AR Bal, Th 0 Mk, 3 RI. Ee RRS WAT HUE AIS, MATA LODT, aR fitE O PARAS BA REUEHE, TD RIAA / HLS CIN, OUT) WIEBE O BEL 1 He. A, Pam WURRIEAB AE 3 ho AREAS (FH 17 EH rh SCAR Be AST ch ASTER CS BTA LA BEAT) 4104) dpl BRE (dpl 4275 “descriptor privilege level). “MK, AMAIA AT dpl FBS JOE 0 BORAT FA UE I. TE ERAN dpl FEL. ABTA, ERAT AH. AUS, 16 ANB AS Te RT IR 13 AAT FeRAM, TOA AC Tt ae? BRA ERB — ROAR: typedef struct { unsigned short seg idx : 13; /* 13 iPAN RHA T BR +/ unsigned short ti : 1; (oe BRIBE. 0 ARAN GDI, 1 fea LDP >/ unsignedshort rpl : 2; /* Requested Privilege Level , SHURE I */ ) RBS: BLASER CS PN ti Oe LT, SADE BR Ae. 0 OY, WR RRA aH Ri cpl WAAR. Se SPM AAMT. CPU SIMLUO A, DLR BOREHY TAT ALOR AR AS AS TH HE BER CH IS AMET IB FPA pl SF ERAT A ASAT LZ De, RATA EEA. RSC ANP TE AT JtSb, PRY S/RBGH AGT GDTR Ae BELA Tt LDTR AAR, HSE i386 CPU ATL SEH EL APES IDTR, SHERE CE Intel RIBAK “ES”, Task) AMS TR DLA AES RASIN) “ARSARASEL” TSS A, RAGE Meat PAN MELA. Intel HER A386 ATER AA SCAN A CPU ASEM AT ARS AP eR DLE» A A AS TE A AT E, PERE, MOS, WHT. HEREABHRLER AD BGA DA. REE, JF BPAY FEAR FAY CPU MIRIAM Av. 
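The three fields of a 16-bit segment register value described above (a 13-bit index, the table indicator ti, and the 2-bit rpl) can be pulled apart as follows. This is a minimal sketch; the structure and function names are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Decoded form of a 16-bit segment selector (names are illustrative). */
struct selector_fields {
    unsigned index;  /* 13-bit index into the descriptor table       */
    unsigned ti;     /* table indicator: 0 = use GDT, 1 = use LDT    */
    unsigned rpl;    /* requested privilege level, 0 (highest) to 3  */
};

struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = sel >> 3;          /* upper 13 bits */
    f.ti    = (sel >> 2) & 1u;   /* bit 2         */
    f.rpl   = sel & 3u;          /* lowest 2 bits */
    return f;
}
```

Because each descriptor occupies 8 bytes, the value sel & ~7 is also the byte offset of the selected descriptor from the start of the table, which is why the index occupies the upper 13 bits.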
ALA, 7E 80386 |-SCBLI AA Unix MA, 48 Linun, #8 ALT PATH), BO DEAD 3 Sh, TE ARBOR AS APH RAR. ARSED t He PIA Unix Ate PRZ hy ROR AU PRS “i. Linux 118 28D) 13 386 RAN AE LALA SHRAARENEA OE, AGHLAAY, AEAREE, SHAME, TT REMB ATU. M80 FHI, RA fee MMEA T ARREARS (OL Unix HE) HARK, TI RABEL E REE ER TH Intel J, 80286 FRR SERS “CRIA”, LAV BESt Se PERE. (AR BL, eA Bist 7 WARRANTS REACH, MHS REM X86 REM K STH AAT ESR CPU FE ATH. At, ERAVLGHT 80386 PRLS T MTA SMH. WAL, 80386 ERT rem ihse 80286 FERRE EE ATE IR T A. AUTUEL, 80386 HRA ESE BL AL. eve HE Heh or A ay FES RH) 32 EAA A BR) RUPE 32 CANON. LC “RENE”, BAR REAM Lk, FR LSD OL PEPE AY FA FEA CRA Hh. GL. BRC HR A ad) 2 AL I HB ERE TE “RR” ARK, RAK RRR T : AA, WR TR TE PANTIES AN, MME ER SERN AD. FN, Ieee fel, BUR ASB nl DLAs 8192 MIT (BLA 13 CFD, URubat a LAL ES HE DAFUL, ECAR DARPA RA UU SAR I. A, TSAR RHE AS BE AER YE HZ k, BLAS GBLH. BTAL, Ze 90386 Hh, PRPS Rast eet aT hI. bilo. CPU ity PSI AT BOAR A eR CS A OM Umi SRA eR a PDP-il PARE be. AE PDP-L1 HP CPU fy SHAT BOR AF AEP A 8 2 FRE SW HT FEAT HARM EM TR. FE, 7 80386 |, BERR AAA A TEER. HANES PARP, MRA E ET REIT. HAE. 80386 MRE TE ATE PER ARR EBA ey PE a ERA. UAC EE UE A AR B17 TB Ae BY HL REAL RAY. or Ti a AAP OAT TO RAIL NPE “STEAL” J, Imcel BERL I “AR esa”, THe, RCP Sg EE ah UL I a ek Rs Peis ew We Jk: ate, SAUER, OSS Heii LARRY pe Bb, BO3R6 HE HE Hb 1) SP RR 4K PSA TT CE PT A EER TRA PRANK CHRMAY 4K TIE). RSE, RES hh eT ee EE MERE. (URC P, ESM Gh es ed IB tee Ae CRI PENI E Pi). CABREL AE. ARR TCP AR Ae NP a, (EL AY FE. SPAT MER PENAL aaa it TA RR, 3 GDTR 4G LDTR Pete MBH Fe RRA bE the A ALSh. PRAIA, Af 32 REHEAT Si RE aE ee) typeded struct { unsigned int dir:l0; /* AYP RMR RTH bes, KERR T RK 4/ unsigned int page:10; /* APR AATF ATT in, AOR — PTCA 4/ unsigned int offset:12; /* 4K ‘FW 7HE DURA AS ate +/ 1 Seebhb: “2. Bie Ran Jk-MaHTT DAR 14 Size. 31 2 2 mu 0 ti Ea ie HIRI page TEA offset TH. 
4 Sef Abbe Of DLE, AEDUIRT ab Sey 2!° = 1024 SH SRG, EA RAE AP OLB A, ART Ae SPIE 1024 SIH HGATI. AIF GDTR AI LDTR, SUMAN T —“P aT AER CRI ETRE TURE ROSTHEL. GARE, MER MEAL Rb AON EE A 1) A CRS (6 ST Hi ae AEH» (2) DUSRESMMY dir OLB Pon, ZEB RASA TL ae A (3) PAR HEMBHER IY page (7 BLIP tak, EAA BUR TE MD Be PA HN EY HE HAI. (4) Em AH YO A SeR PR PAY offset (2 RAMI PI Ee (5) Esme EAE RT HHS 1.5 LOW dea. BR | mde | sett vote TE cR3 po 15 Ate ee HA, AAV BARK ARH) RIL, PRIA) ATARI, TOT AS AR TA BR AR AB RE DRAG? ROR DRA. UOT ett 8 dir #0 page MBE a HEA 20 45 Waal TRG Re AA EAE KX 1KS0M 20. PS RI A 4K EY ETA WH AKXIM=4G, FEAT AR 32 AHR SS TNA A. (EE, SEN AREA RE — PEER SS te EI 4G 109 PAVE lhl, AAS SARIN, WER, ET. BG ANB eet AH MLE TURE. TIARA. WUTC T DL APETO I A, UN RP MT, Hh eH BORODL RE. ARTA TART]. RS, TERRE ASIA GR — 7a EC I SIA AG Te (A, ABR AMLARE NT, DRS RT AY ETE, ARR EO. 4h, —PU (Ab ak aK ST, DE ER RRMA A a PT. 102d PIER EE aK “2B. HEALTH. PES T1024 THE TEE OI Rk OL AES TL AEA T . HIE ilk, Ze 64 fh Linux fa aR LA) AAG KEENER E (Y Alpha CPU # AHA rb Je 8K FA, A Ak AAD AOE AK REM T 8 PT. WHA, a RA PC REINARET, TTR ate) At TR RE Rat. FMA AT A ARATE 4K TE, ETI 12 RABEL OL HH, ARTA LRAT MAB 20 ATH AE Ta RA) 12 RMT Pe a A fl. 
PR, AR: typedef unsigned unsigned unsigned unsigned unsigned unsigned unsigned unsigned unsigned unsigned unsigned |} Aaa: struct { int ptba : 20; /* TURREHUMLANFG 20 (2 */ int avail : 3; /* DLAMURARREAL */ int gil; (* global, Sift si */ int ps: 1: (SIAC, 0 AEA AK SET #/ int reserved : 1; /* (RRL, AWA O */ int ail; /* accessed, BRI Mid ¥/ int ped : 1; eK AMI) RTPA int pwt : 1; (# Write Through, FR PST Rae +/ int us: l; /* WOR AS CALM ALOR, Ay L TRAP AR +/ int rw (® SURAT */ int pil: /* HOM RTM DEAE */ BRM OW ARM 16 RR oe AM HG 20 Ae Dia can a , vam L__p writabie |___puser defined L____pwrite-tnrough L___p cache disable Lp accessed L__ypiny B16 AARTRA TRAM AEA L SIAL, (Re “TMA” ft ps, BDU 8 (ERE, (5 7 fh Ce HRAPREEAD MWA D Diny) ws, RTRBAHLBRSM, HAL “AE” T. STIR “i. Pe anne ARTUR p HOM, RMAMYAGRAMRDENE, HAR RSFSR, CPU HIDE 4 Umi” CPage Fault) S48 CHAR NB ICT AT, (ARM AUP WIE AAD SRE, PUR AT EU Pa A at OT GA TRA ES BRAT HE, FO p MEAL. MR, RTL EAE T'S AE HR BATHE ADIN p MEY 0. AA, MATER AEET. Mp AON, RRMA SFR, BUTEA PAD PE AS Ak PUA HB wa AES, “GH RP AY ps Cpage size) (029 OI, Late ak RAT I HT TM eb AT A Kh a AK S15, RARE Linux ABE ARAM RAN. (LE, M Pentium ERIT, Intel HAT PSE GANS FEAL. A ps HA LE, RAAT 4M PA, RAE T 2 AT WR, SEH hE PAIR 22 ARLE ARAL ECE IM FWP. A, BMPR ARR AOR, Bl 1024X4M=4G, (AAU ALD T — NR. ROSA TE Re AR aI, Baa SREY SBME BS LL Ae et PG Ah Se RSA, 4M SAY DTT A A HD AR Se ESHER oh Ey Intel (8G AE LY. SU, 1386 CPU SRA METER CRO, HRA PG AAO XK. “4 PG HE RRL ML, CPU RFE IE TCR Fe fit EY TL HS AA Pentium Pro FF4f, Intel AE LaF Fi. A UA FAY ARE AY A» tel FF HL A 4ECR4 LAIN T —fik PAB (29% Physical Address Extension), “4 PAE (itt at 1 if, thik BARA FERSERT 36 1 CRMMNT 4 Or). GUA, TOA SAAT PL BR TERR. BLK SER PAD Re EHH 36 A OAG YEH HL 1B), ATLL BIR, A NBA LS Intel OFT XPARVALES. IES. Intel GHEE T 64 (rh 1A-64 RSH. Linux WB RUS EH IAG eit. HK, Linux RKBAE Alpha CPU | sch 64 ALLL. PRAGA IBI, 80386 EAL CURRIE ie AUK RE. LN TR. 
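The splitting of a 32-bit linear address into the dir, page and offset fields, and the final step of combining a page table entry with the offset, can be sketched as below. This assumes the ordinary 4 KB page case (ps bit clear, no PAE). The names split_linear, entry_present and frame_to_physical are hypothetical and are not kernel functions.

```c
#include <stdint.h>

/* The three fields of a 32-bit linear address with 4 KB pages. */
struct linear_parts {
    uint32_t dir;     /* top 10 bits: index into the page directory */
    uint32_t page;    /* next 10 bits: index into the page table    */
    uint32_t offset;  /* low 12 bits: offset within the 4 KB page   */
};

struct linear_parts split_linear(uint32_t linear)
{
    struct linear_parts p;
    p.dir    = linear >> 22;
    p.page   = (linear >> 12) & 0x3FFu;
    p.offset = linear & 0xFFFu;
    return p;
}

/* Bit 0 (p) of a directory or page table entry: 1 = present in memory. */
int entry_present(uint32_t entry)
{
    return (int)(entry & 1u);
}

/*
 * The upper 20 bits of a page table entry give the page frame base;
 * the low 12 bits of the entry are flags and are replaced by the offset.
 */
uint32_t frame_to_physical(uint32_t pte, uint32_t offset)
{
    return (pte & 0xFFFFF000u) | (offset & 0xFFFu);
}
```

A full translation would use dir to select a page directory entry, follow it to a page table, use page to select an entry there, and raise a page fault whenever entry_present() is false at either level.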
PRR SA BOR BL, AMER PERE [Jeena a, DAA RRR AT BER AEA A, TID AEE T= 14) Linux ARRAS PH CBE Linux ARAIAEAAEEA GNU 09 C ASE, GNU NULARART ME LIL gee, GNU At CB AE Y GEANSIC HERE) (ET ROT HE. TARBALL. A il, BAAR, EE Hee RY AL EBD BE PAN AY RG, ARREARS ST He, IERCA SEL, BILLA ZER ED —— FIRE CE FE BURT. UL. FT SUSHIL EC. PRA, ATE RRMAT SAD. BTLL, BUNA RT He SUMS Linux ARIE, SARE SBIR — Me ALLE TS IE ER ET. DU EAA. EEE Se MELA HE. dE. geo KM CHIBI T “inline"# “const”. JES, GNU 1 C ACHE HEMI, goo BEE C EES CHARR, BRLUA CH PMU EK HB] C FURR PRI. NRE EU, inline BR (SMH Sitdefine 2252 QAR, (BARRO, Re, FH inline MRC ALS APR RSE, MUR inline ABLE. BI. HP. WRT CMG. Fb RARE, RE inline RRMA AA TSI, AA) eR AT So APY inline BRM SME, ALY MOA ARED Nc SOPRA Th acti. 1S. Linux a6 a, OA. NT SCH 64 Air CPU SB (Alpha aE 64 1004), goo SHAT ARR AYRE ACHR ACH! “long long int”, BRMCARRIE? HAS, WS CHAM RAH “RAHAT” (attribute), MM “aligned”, "packed"; gee HKD SAAR. RRA SRE CNT RRS. WR, ZEBRA CHM ANSI.C) $GR#EIAFER ET, BORAT AT AEP E-2EDPSE, GHWT, geo KEMRINS inline, FUE WT “nine” RAPES CE CHHPREREF), PUTER SRSA BRS inline, PRPET IPR, ATARI, goo AVL AIREY “inline, GABLE “2”, TU “__inline __"@GhF(RAF “inline”. IAP AEH, “__asm__"3EiF “asm”. RATT ARB “asm”, TARR BB “__asm__" HRA. gee BR PRBS “attribute”, AR PERATEMA. dn: struct foo{ char ay int x(z] attribute__ ((packed)): } J&B PE HAE" packed" TET a SRAM x ZAM TY 32 RRMA PE TA. IAF, “packed” BASH RERREMRRT HF ¢E Linux MALE £ eco A C MS FE. (EYRE Linux HUA BOR ALBEF goo MBER. ACH dey BF geo WM Linux ACEP ATIR AR, ELE Linux BPE T goo. TERENAS AT TB BT Hi. RARER geo KE. RAL, Linux WA SARA TT EM gcc MARY HR. AREA: GPE, Linux Wy RATT SHEE eI ae?” PRL, “HR, MEA BETTE a PERE.” ASE, AEE ASI I), GNU ET CL Fe. FAR, HF eco BHCRSLT FO) BA CPU EME. EIB OF Unix abit. SBE Unix Jt ICRA BS BSA, TELA Unix RBA T CIR. MUL, CAT ABLE Unix ARAM. Unix BRIE. Cif WRK AS Unix Kit, CALA, TH SEIN A RH. HK, TERME AWEK, HHA. UNATATA. HY Linux AMR LLP RABE. 
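gcc generates code for many different CPUs, and it also supports the assembly language of each CPU it targets.

As mentioned above, the Linux kernel makes heavy use of inline functions. It also makes heavy use of macros, and the way some of these macros are defined can be puzzling at first sight, so a few words of explanation are in order. Let us start with an example taken from fs/proc/kcore.c:

    #define DUMP_WRITE(addr,nr) do { memcpy(bufp,addr,nr); bufp += nr; } while(0)

As the reader knows, a do-while loop executes its body first and tests the loop condition afterwards. This definition therefore means that each use of the macro executes the body once, and only once. But why define the macro through a do-while loop at all? It looks odd. Let us consider the alternatives. First, could it be defined like this?

    #define DUMP_WRITE(addr,nr) memcpy(bufp,addr,nr); bufp += nr;

No. If the macro is used inside an if statement, things go wrong. Consider a hypothetical example:

    if (addr)
        DUMP_WRITE(addr,nr);
    else
        do_something_else();

After preprocessing, this code becomes:

    if (addr)
        memcpy(bufp,addr,nr); bufp += nr;;
    else
        do_something_else();

Compiling this fails with an error, because gcc considers the if statement to have ended after memcpy() and then runs into an else. If DUMP_WRITE() and do_something_else() swap places, the code does compile, but the problem is even worse: bufp += nr is executed whether or not the condition holds.

The reader will immediately think of adding braces to the definition:

    #define DUMP_WRITE(addr,nr) {memcpy(bufp,addr,nr); bufp += nr;}

But the example above still fails to compile, because preprocessing turns it into:

    if (addr)
        {memcpy(bufp,addr,nr); bufp += nr;};
    else
        do_something_else();

Again, when gcc meets the ";" in front of the else it considers the if statement finished, so the else that follows does not belong to any if. By contrast, the definition using do-while causes no problem in any context.

With this in mind, let us look at the definition of an "empty operation". Because the Linux kernel has to cater for many different CPUs and many different system configurations, it is often necessary to define a macro as an empty operation under certain conditions. An example is prepare_to_switch() in include/asm-i386/system.h:

    #define prepare_to_switch() do { } while(0)

Some CPUs need to do preparatory work in prepare_to_switch() before switching processes, while others, such as the i386, do not, so for them it is defined as an empty operation.

Another topic is queue operations. The kernel uses queues and queue operations extensively. If a separate set of queue routines (functions) were written for every data structure, there would be a great deal of duplication. Suppose there is a data structure foo, and a doubly linked queue of such structures must be maintained. The simplest and most common approach is to add two pointers to the type definition of the structure, for example:

    typedef struct foo
    {
        struct foo *prev;
        struct foo *next;
    } foo_t;

A set of routines for the various queue operations is then written for this data structure. Because the type of the two pointers that maintain the queue is fixed (both point to a foo structure), these routines cannot be used for queues of any other data structure. In other words, there must be as many sets of queue routines as there are kinds of data structures to be queued.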
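Returning briefly to the do { ... } while (0) idiom discussed earlier in this section, the following sketch shows a two-statement macro used safely as the body of an if/else. The macro APPEND_BYTES and the function append_or_mark are hypothetical; they are modelled on DUMP_WRITE but take the destination pointer as an explicit argument so that the example is self-contained.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical macro modelled on DUMP_WRITE. The do { } while (0)
 * wrapper makes the two statements behave as a single statement that
 * still expects a trailing semicolon at the point of use.
 */
#define APPEND_BYTES(dst, src, n) \
    do { memcpy((dst), (src), (n)); (dst) += (n); } while (0)

/*
 * Copies n bytes from src into out if src is non-NULL, otherwise
 * writes a single '?' marker. Returns the number of bytes written.
 */
size_t append_or_mark(char *out, const char *src, size_t n)
{
    char *p = out;

    if (src)
        APPEND_BYTES(p, src, n);   /* expands to one statement */
    else
        *p++ = '?';

    return (size_t)(p - out);
}
```

If the wrapper is removed, or replaced by plain braces, the if/else in append_or_mark() no longer compiles, which is exactly the failure described in the text.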
Linux therefore takes a more general approach. The pointers prev and next are pulled out of the host structure and made into a small structure of their own, list_head. A list_head can be embedded in any host structure, in the same way that a hook can be fitted to any object so that it can be hung on a chain. The queue operations are written once, for list_head, and serve every kind of host structure. The definition is in include/linux/list.h:

    struct list_head {
        struct list_head *next, *prev;
    };

A host structure that must be on several queues at once simply embeds several list_head members. The page structure used in memory management is an example (include/linux/mm.h):

    typedef struct page {
        struct list_head list;
        ......
        struct page *next_hash;
        ......
        struct list_head lru;
        ......
    } mem_map_t;

A page structure contains two list_head members, so it can be on two doubly linked queues at the same time. It also has a plain pointer next_hash, which links it into a singly linked hash chain.

A list_head, whether it is a queue head or a member of a host structure, is initialized with the macro INIT_LIST_HEAD:

    #define INIT_LIST_HEAD(ptr) do { \
        (ptr)->next = (ptr); (ptr)->prev = (ptr); \
    } while (0)

The argument ptr points to the list_head to be initialized. After initialization both pointers point to the list_head itself, which represents an empty queue.

To put a page structure onto a queue through its member list, the kernel calls list_add(), an inline function defined in include/linux/list.h:

    static __inline__ void list_add(struct list_head *new, struct list_head *head)
    {
        __list_add(new, head, head->next);
    }

The argument new points to the list_head to be inserted (inside the host structure), and head points to the position after which it is inserted, which may be the queue head or the list_head of another element. list_add() does its work by calling a second inline function, __list_add().
[list_add() > __list_add()]

    /*
     * Insert a new entry between two known consecutive entries.
     *
     * This is only for internal list manipulation where we know
     * the prev/next entries already!
     */
    static __inline__ void __list_add(struct list_head * new,
        struct list_head * prev,
        struct list_head * next)
    {
        next->prev = new;
        new->next = next;
        new->prev = prev;
        prev->next = new;
    }

The bracketed line above a listing shows the call path: here it says that __list_add() is being read as called from list_add(). This notation is used throughout, because the same function can be reached along different paths and what matters is the scenario in which it is called.

Removal from a queue is done by list_del():

    static __inline__ void list_del(struct list_head *entry)
    {
        __list_del(entry->prev, entry->next);
    }

which in turn calls the inline function __list_del():

[list_del() > __list_del()]

    static __inline__ void __list_del(struct list_head * prev,
                      struct list_head * next)
    {
        next->prev = prev;
        prev->next = next;
    }

Note that __list_del() operates only on the two neighbours of entry. The pointers inside entry itself are left unchanged, so after removal entry still points to its former neighbours.

All of these operations work on list_head structures and know nothing about the host. That raises a question: when a queue is scanned, what is found is a pointer to a list_head, so how does the code get from there to the host structure that contains it? An example is line 188 of mm/page_alloc.c:

[rmqueue()]

    page = memlist_entry(curr, struct page, list);

Here memlist_entry() converts the list_head pointer curr into a pointer to the page structure that contains it. In the same file memlist_entry is defined as another name for list_entry:

    #define memlist_entry list_entry

and list_entry is defined in include/linux/list.h.
    /**
     * list_entry - get the struct for this entry
     * @ptr:    the &struct list_head pointer.
     * @type:   the type of the struct this is embedded in.
     * @member: the name of the list_struct within the struct.
     */
    #define list_entry(ptr, type, member) \
        ((type *)((char *)(ptr)-(unsigned long)(&((type *)0)->member)))

After preprocessing, line 188 of mm/page_alloc.c becomes:

    page=((struct page*)((char*)(curr)-(unsigned long)(&((struct page*)0)->list)));

Here curr is the address of the member list inside some page structure. What is wanted is the address of the page structure itself, so the offset of list within the structure has to be subtracted from curr. The expression &((struct page*)0)->list is the address that list would have if the page structure were located at address 0, which is exactly that offset. The same macro serves for the other queue: if the page structure is linked through lru, the member argument is lru and everything else stays the same.

1.5 Assembly Language Code in the Linux Kernel Source

Every operating system contains some assembly code. In Unix System V, for example, the kernel had roughly 2000 lines of assembly spread over about 20 files. The reasons fall into a few groups:

• Some low-level operations have no counterpart in C. On the 386 there are separate input/output instructions such as inb and outb, and operations on CPU registers cannot be expressed in C either.

• Some special CPU instructions, for example those that disable and enable interrupts, have no C equivalent. New instructions added within a CPU family, as with the Pentium, Pentium II, and Pentium MMX, are also not produced by an ordinary C compiler.

• Some code is executed so often that its speed matters a great deal, for example the entry to and exit from system calls. Such code is written by hand in assembly.

• Occasionally the size of the code is tightly limited, as with the boot code, which must fit into a small fixed space.

Assembly code appears in the kernel in two forms. The first is whole files of assembly source with the suffix .S. These files are passed through the C preprocessor before being assembled, so they can use #include, #ifdef, and macros. The second is assembly embedded in C source. ANSI C does not define such a facility, but C compilers generally offer one.
The GNU C compiler gcc supports embedded assembly well. There is one further difficulty, however. The assembly language used in the kernel is not written in the format found in Intel's documentation. Readers who learned i386 assembly on DOS or Windows will find that even the pure assembly files in the kernel look unfamiliar, and the assembly embedded in C code adds its own conventions for connecting instructions to C variables. Both are introduced below, to the extent needed for reading the kernel source.

1.5.1 The GNU 386 Assembly Language

Under DOS/Windows, 386 assembly is written in the format defined by Intel, and most books on the 386 use it. In the Unix world a different format is used, the one defined by AT&T. When AT&T ported Unix to the 80386 it defined an assembly format that followed Unix tradition. Unix was first developed on the PDP-11 and later moved to the VAX and the 68000 family, and the assemblers for those machines share a common style that differs from Intel's. AT&T's 386 assembly format follows that style, and Unixware inherited it. GNU software is developed in the Unix tradition (GNU stands for "GNU is Not Unix"), and in order to stay compatible with Unix system software GNU adopted the AT&T 386 assembly format rather than Intel's.

How large is the difference? The instructions are the same; only the notation differs. The main points are these:

(1) In the Intel format instruction names and register names are mostly written in upper case; in the AT&T format they are written in lower case.

(2) In the AT&T format every register name carries the prefix "%"; in the Intel format it has no prefix.

(3) The order of the source and destination operands is reversed. In the Intel format the destination comes first; in the AT&T format the source comes first. To move the contents of eax into ebx, the Intel format writes "MOV EBX, EAX" and the AT&T format writes "movl %eax, %ebx". The Intel form reads like the assignment "EBX = EAX", the AT&T form like "%eax -> %ebx".

(4) In the AT&T format the size of a memory operand is given by the last letter of the instruction name: b for 8 bits, w for 16 bits, and l for 32 bits. In the Intel format it is given by a prefix on the operand, "BYTE PTR", "WORD PTR", or "DWORD PTR". To load the 8-bit value at FOO into AL:

    MOV AL, BYTE PTR FOO    (Intel format)
    movb FOO, %al           (AT&T format)

(5) In the AT&T format an immediate operand carries the prefix "$"; in the Intel format it has none. The Intel "PUSH 4" is written "pushl $4" in the AT&T format.
(6) In the AT&T format the operand of an absolute jump or call (one that gives the target address itself rather than a relative displacement) carries the prefix "*", somewhat like a pointer dereference in C. The Intel format has no such prefix.

(7) Far jumps and far calls are written "ljmp" and "lcall" in the AT&T format and "JMP FAR" and "CALL FAR" in the Intel format. The operands are also written differently:

    CALL FAR SECTION:OFFSET     (Intel format)
    JMP FAR SECTION:OFFSET      (Intel format)
    lcall $section, $offset     (AT&T format)
    ljmp $section, $offset      (AT&T format)

The far return instruction follows the same pattern:

    RET FAR STACK_ADJUST        (Intel format)
    lret $stack_adjust          (AT&T format)

(8) The general form of an indirect memory operand is:

    SECTION:[BASE+INDEX*SCALE+DISP]     (Intel format)
    section:disp(base, index, scale)    (AT&T format)

In the AT&T format the arithmetic is implied by the positions of the components. For example, with SECTION omitted, INDEX and SCALE empty, BASE equal to EBP, and DISP equal to -4:

    [ebp-4]         (Intel format)
    -4(%ebp)        (AT&T format)

When only a base register is given, the operand is written (%ebp), which is the same as (%ebp,,) or (%ebp,0,0). As another example, with INDEX equal to EAX, SCALE equal to 4 (for 32-bit elements), DISP equal to foo, and the other components empty:

    [foo+EAX*4]     (Intel format)
    foo(,%eax,4)    (AT&T format)

This form is typically used to access an element of an array: the base of the array is foo, each element is scale bytes long, and index is the subscript.

1.5.2 386 Assembly Embedded in C Code

To embed assembly in C code, gcc provides the "asm" statement. For example, include/asm/io.h contains this line:

    #define __SLOW_DOWN_IO __asm__ __volatile__("outb %al,$0x80")

Both asm and volatile are written with "__" before and after, which gcc accepts as explained earlier. This is the simplest kind of embedded assembly: a single instruction that has no connection with any C variable. It is an 8-bit output instruction, so the instruction name has the suffix "b"; 0x80 is a constant, so it has the prefix "$"; and al is a register name, so it has the prefix "%". All of this is AT&T format.

Several instructions can be placed in one asm statement. With another configuration option, __SLOW_DOWN_IO is defined like this:

    #define __SLOW_DOWN_IO __asm__ __volatile__("jmp 1f \n1:\tjmp 1f \n1:")

Here "\n" is a newline and "\t" is a TAB character, so what gcc passes to the assembler gas is:

        jmp 1f
    1:  jmp 1f
    1:

The target 1f means the first label 1 found in the forward direction (f for forward); 1b would mean the first label 1 found backward. This use of numeric local labels is a tradition of Unix assemblers. The two jump instructions do nothing except use up some CPU time, which is the whole purpose of the macro.

Embedded assembly becomes more complicated when the instructions have to work together with variables of the surrounding C code.
When assembly is embedded in C, the hard part is connecting the operands of the instructions with C variables. Here is a simple example from include/asm/atomic.h:

    static __inline__ void atomic_add(int i, atomic_t *v)
    {
        __asm__ __volatile__(
            LOCK "addl %1,%0"
            :"=m" (v->counter)
            :"ir" (i), "m" (v->counter));
    }

In general an asm statement consists of up to four parts separated by ":":

    instruction part : output part : input part : clobber part

The instruction part is required; the others may be left out, but if a later part is present the ":" of each empty part before it must still be written.

The instruction part holds the assembly instructions as a string. Where an instruction needs an operand that comes from the C code, the programmer does not name a register or memory location. Instead a placeholder is written, "%" followed by a number: %0, %1, and so on. The output and input parts then state, for each placeholder in order, which C expression it stands for and what kind of operand it is allowed to be. Because placeholders use the prefix "%", a real register name inside such a statement has to be written with two percent signs, as in "%%eax", so that the two cannot be confused.

This division of labour is needed because the programmer cannot know how gcc has assigned registers in the surrounding C code. gcc compiles C into assembly and hands it to gas; the embedded instructions pass through this process too. The programmer therefore only states the requirements, called constraints, and gcc decides which register or memory operand to substitute for each placeholder, adding whatever instructions are needed before and after the embedded code to make the operands available.

The output part lists the operands that receive results. In the example it is:

    :"=m" (v->counter)

There is one output operand, which becomes %0. The constraint "=m" says that it is a memory operand, and the expression in parentheses says that it is v->counter. The constraint of an output operand always contains "=".
The input part of the example has two operands. The first, "ir" (i), becomes %1. The constraint says that it may be an immediate operand ("i", for immediate) or a value in a register ("r"), and that the value comes from the C variable i. The second, "m" (v->counter), is again the memory operand v->counter. Input constraints have no "=".

When compiling the statement, gcc generates whatever extra code the constraints require. If an input operand has to be in a register, gcc emits an instruction such as movl in front of the embedded code to load it, and after the embedded code it emits instructions to store output operands held in registers back into their variables. If a register has to be borrowed for this, gcc may emit a pushl before the embedded code and a matching popl after it, so that the register's previous contents are preserved.

The most common constraint letters are:

    "m", "v", "o"       memory operand
    "r"                 any general register
    "q"                 one of eax, ebx, ecx, edx
    "i", "h"            immediate operand
    "E", "F"            floating-point operand
    "g"                 any of the above
    "a", "b", "c", "d"  the register eax, ebx, ecx, or edx respectively
    "S", "D"            the register esi or edi respectively
    "I"                 a constant in the range 0 to 31

An input constraint may also be a digit, such as "0" or "1"; this means that the input operand must occupy the same location as the output operand with that number. The clobber part lists what the embedded code changes apart from its output operands. The entry "memory" tells gcc that memory is modified in a way it cannot see, so values cached in registers must not be relied on across the statement.

Returning to the example: there is one instruction, addl, with the prefix LOCK. It adds %1, the value of i, to %0, the memory cell v->counter. The LOCK prefix locks the bus for the duration of the instruction, so that no other CPU in a multiprocessor system can access the cell in the middle of the operation. In effect the function performs the C statement "v->counter += i;", with the guarantee that the operation is atomic. That guarantee is the reason for writing it in assembly: the code gcc generates for the C statement might take several instructions, and the operation could then be interrupted part way through.

A further example of the same kind can be found in include/asm-i386/bitops.h.
    #ifdef CONFIG_SMP
    #define LOCK_PREFIX "lock ; "
    #else
    #define LOCK_PREFIX ""
    #endif

    #define ADDR (*(volatile long *) addr)

    static __inline__ void set_bit(int nr, volatile void * addr)
    {
        __asm__ __volatile__( LOCK_PREFIX
            "btsl %1,%0"
            :"=m" (ADDR)
            :"Ir" (nr));
    }

The instruction btsl sets one bit of a 32-bit operand to 1; the argument nr says which bit. The LOCK prefix is included only when the kernel is configured for multiprocessor systems.

A more involved example, and one worth studying, is __memcpy() in include/asm-i386/string.h:

    static inline void * __memcpy(void * to, const void * from, size_t n)
    {
    int d0, d1, d2;
    __asm__ __volatile__(
        "rep ; movsl\n\t"
        "testb $2,%b4\n\t"
        "je 1f\n\t"
        "movsw\n"
        "1:\ttestb $1,%b4\n\t"
        "je 2f\n\t"
        "movsb\n"
        "2:"
        : "=&c" (d0), "=&D" (d1), "=&S" (d2)
        :"0" (n/4), "q" (n),"1" ((long) to),"2" ((long) from)
        : "memory");
    return (to);
    }

This is the kernel's own implementation underlying memcpy(); it has nothing to do with the memcpy() of the user-space C library.

Start with the operands. The output part has three operands, %0 to %2. Operand %0 is the variable d0 and must be in ecx; %1 is d1 and must be in edi; %2 is d2 and must be in esi. The input part has four operands, %3 to %6. Operand %3 is n/4, and its constraint "0" says that it must be in the same place as %0, that is, in ecx. Operand %4 is n itself and may be in any of eax, ebx, ecx, or edx. Operands %5 and %6 are to and from, and they must be in the same places as %1 and %2, that is, in edi and esi. The effect is that before the first instruction runs gcc has loaded n/4 into ecx, to into edi, and from into esi. The variables d0, d1, and d2 exist only to tell gcc that these three registers are changed by the code.

Now the instructions. The first is "rep", a prefix that causes the following movsl to be repeated. Each repetition decrements ecx by 1, and the repetition stops when ecx reaches 0, so movsl is executed n/4 times. movsl copies one 32-bit long word from the address in esi to the address in edi and adds 4 to both registers. When this instruction finishes, all complete long words have been copied and at most three bytes remain.

The remaining instructions deal with those bytes. The first testb examines bit 1 of the low byte of operand %4, that is, of n. If the bit is 1, at least two bytes remain, and movsw copies a 16-bit word, adding 2 to esi and edi; otherwise the movsw is skipped by jumping to label 1. The second testb examines bit 0 of n. If it is 1, one byte remains and movsb copies it; otherwise the code jumps to label 2 and is done.
Readers who want to check their understanding of such code can compile it and look at the result with objdump. As an exercise, the following function strncmp(), also from include/asm-i386/string.h, can be analysed in the same way. Doing so requires some knowledge of the i386 instruction set, for which Intel's manuals or a book on the subject should be consulted.

    static inline int strncmp(const char * cs,const char * ct,size_t count)
    {
    register int __res;
    int d0, d1, d2;
    __asm__ __volatile__(
        "1:\tdecl %3\n\t"
        "js 2f\n\t"
        "lodsb\n\t"
        "scasb\n\t"
        "jne 3f\n\t"
        "testb %%al,%%al\n\t"
        "jne 1b\n"
        "2:\txorl %%eax,%%eax\n\t"
        "jmp 4f\n"
        "3:\tsbbl %%eax,%%eax\n\t"
        "orb $1,%%al\n"
        "4:"
                 :"=a" (__res), "=&S" (d0), "=&D" (d1), "=&c" (d2)
                 :"1" (cs),"2" (ct),"3" (count));
    return __res;
    }

Chapter 2 Memory Management

2.1 The Basic Framework of Linux Memory Management

The i386 CPU, including the Pentium, uses two-level page mapping when paging is enabled. The mapping is carried out by hardware, the MMU inside the CPU. The first level is the page directory and the second level is the page table; each holds 1024 entries, and together with a 12-bit offset inside the page they cover a 32-bit address.

Linux, however, is designed to run on many kinds of CPU, including 64-bit CPUs such as the Alpha, and a two-level scheme designed for 32-bit addresses is not adequate there. The kernel's mapping mechanism is therefore designed with three levels. Between the page directory and the page table there is an additional middle directory. In the source code the page directory is called the PGD, the middle directory the PMD, and the page table the PT. An entry in a page table is a PTE, short for "Page Table Entry". PGD, PMD, and PT are all arrays. Correspondingly, a linear address is logically divided into four fields: an index into the PGD, an index into the PMD, an index into the PT, and an offset within the physical page.

The mapping of a linear address issued by the CPU then proceeds in four steps:

(1) The highest field of the linear address is used as an index into the PGD to find an entry, which points to a PMD.

(2) The second field is used as an index into that PMD to find an entry, which points to a page table.

(3) The third field is used as an index into that page table to find a PTE, which points to a physical page.
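(4) The last field of the linear address is the offset within the physical page; adding it to the start address of the page gives the physical address.

[Figure 2.1: Three-level address mapping. The linear address is divided into a PGD index, a PMD index, a PT index, and a page offset.]

This three-level scheme is a model. The actual mapping is done by the MMU hardware of the CPU on which the kernel runs, and the model has to be fitted onto that hardware. On the i386, whose MMU has only two levels, the PMD level is in effect skipped. Starting with the Pentium Pro, Intel introduced a physical address extension, PAE, which widens the address from 32 to 36 bits and supports three-level mapping in hardware. How does the kernel deal with the different cases? Look at this part of include/asm-i386/pgtable.h:

    #if CONFIG_X86_PAE
    # include <asm/pgtable-3level.h>
    #else
    # include <asm/pgtable-2level.h>
    #endif

Before the kernel is compiled it has to be configured. During configuration the directory name include/asm is linked to the directory for the chosen CPU, which for the i386 is include/asm-i386. One of the configuration options is PAE: if the CPU is a Pentium Pro or later and 36-bit addresses are wanted, CONFIG_X86_PAE is set to 1, otherwise to 0. Depending on this option either pgtable-3level.h or pgtable-2level.h is used, the former for three-level mapping with 36-bit addresses and the latter for two-level mapping with 32-bit addresses. This book concentrates on 32-bit addresses with two-level mapping; readers who understand that case will have no difficulty with the 36-bit case.

The file pgtable-2level.h defines the basic shape of the PGD and PMD for two-level mapping:

    /*
     * traditional i386 two-level paging structure:
     */

    #define PGDIR_SHIFT     22
    #define PTRS_PER_PGD    1024

    /*
     * the i386 is two-level, so we don't really have any
     * PMD directory physically.
     */
    #define PMD_SHIFT       22
    #define PTRS_PER_PMD    1

    #define PTRS_PER_PTE    1024

PGDIR_SHIFT gives the position in the linear address where the PGD index begins. Its value is 22, so the index starts at bit 22 (the 23rd bit). The PGD index is the highest field of the address, so it runs from bit 22 to bit 31 and is 10 bits wide. In pgtable.h a related constant, PGDIR_SIZE, is defined:

    #define PGDIR_SIZE      (1UL << PGDIR_SHIFT)

......

What matters here is next->pgd. The statement converts the address of the page directory of the incoming process into a physical address and loads it into the control register CR3 with a mov instruction. From then on CR3 points to the PGD of the new process, next.

When a process uses an LDT, its local descriptor table exists as a segment of its own, and the global descriptor table GDT must contain an entry that gives the start address and length of that segment. In addition every process has a TSS (task state segment) structure, which also needs an entry in the GDT. Each process therefore occupies two entries in the GDT. How large is the GDT? The field of a segment register that serves as index into the GDT is 13 bits wide, so the GDT can hold 8192 descriptors.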
Some of these entries are taken by the system: entries 2 and 3 of the GDT are used for the kernel's code and data segments, entries 4 and 5 are always used for the code and data segments of the current process, entry 1 is always 0, and so on. That leaves 8180 entries, so in theory the largest number of processes in the system is 4090.

2.2 The Complete Process of Address Mapping

The Intel architecture combines segmentation with paging. When a program issues a memory address, that address is a virtual (logical) address. The segmentation unit of the MMU first maps it to a linear address, and the paging unit then maps the linear address to a physical address. Paging was added to an architecture that already had segmentation, and on the i386 segmentation cannot be switched off: once the CPU is in protected mode every address goes through the segment mapping, and paging, if enabled, is applied afterwards.

Seen from Linux, the segment mapping is unnecessary. The kernel has to go through it because the i386 gives it no choice, but it arranges matters so that the segment mapping does nothing, as will be seen below. Other CPUs on which Linux runs, such as the M68K and the PowerPC, have no segmentation at all, so a kernel that relied on it would be much harder to port. The rest of this section follows one address through the whole process on the i386.

Take this simple program:

    #include <stdio.h>

    greeting()
    {
        printf("Hello, world!\n");
    }

    main()
    {
        greeting();
    }

The main function calls greeting(), which prints "Hello, world!". After the program has been compiled and linked with gcc and ld, we have an executable file, hello. Linux has a utility, objdump, that is very useful for looking inside such a file. It is used here to disassemble the program:

    % objdump -d hello

The output is long; the part of interest is this:

    08048568 <greeting>:
     8048568:  55                 pushl  %ebp
     8048569:  89 e5              movl   %esp,%ebp
     804856b:  68 04 94 04 08     pushl  $0x8049404
     8048570:  e8 ff fe ff ff     call   8048474 <_init+0x84>
     8048575:  83 c4 04           addl   $0x4,%esp
     8048578:  c9                 leave
     8048579:  c3                 ret
     804857a:  89 f6              movl   %esi,%esi

    0804857c <main>:
     804857c:  55                 pushl  %ebp
     804857d:  89 e5              movl   %esp,%ebp
     804857f:  e8 e4 ff ff ff     call   8048568 <greeting>
     8048584:  c9                 leave
     8048585:  c3                 ret
     8048586:  90                 nop
     8048587:  90                 nop

From this listing it can be seen that ld has assigned the address 0x08048568 to greeting(). For executables in ELF format, ld always lays the program out starting from 0x08000000, and this is the same for every program. The address a piece of code actually gets depends on what comes before it in the image.

Suppose the program is running and the CPU has reached the instruction "call 08048568" in main(), so that control is to be transferred to the virtual address 0x08048568. The following traces how this address is mapped.

The first stage is the segment mapping. The address is the target of an instruction fetch, so the i386 uses the code segment register CS as the selector for the mapping. The format of a segment register is shown in Figure 2.3.

[Figure 2.3: Segment register format]

    bits 15..3   index into the descriptor table
    bit  2       TI, Table Indicator: 0 = GDT, 1 = LDT
    bits 1..0    RPL, Requested Privilege Level

When bit 2 is 0 the GDT is used, and when it is 1 the LDT is used. Intel's design intends the GDT for global segments and an LDT for the segments local to each process. RPL is the requested privilege level; there are four levels, of which 0 is the highest.

What is in CS at this moment? The kernel sets the segment registers of a process when it starts executing a new program image, by means of the macro start_thread in include/asm-i386/processor.h:

    #define start_thread(regs, new_eip, new_esp) do {          \
        __asm__("movl %0,%%fs ; movl %0,%%gs": :"r" (0));      \
        set_fs(USER_DS);                                        \
        regs->xds = __USER_DS;                                  \
        regs->xes = __USER_DS;                                  \
        regs->xss = __USER_DS;                                  \
        regs->xcs = __USER_CS;                                  \
        regs->eip = new_eip;                                    \
        regs->esp = new_esp;                                    \
    } while (0)

Here regs->xds is the saved image of the segment register DS, and similarly for the others. Only CS is set to USER_CS; all the other segment registers are set to USER_DS. Note in particular the stack segment register SS, which is also set to USER_DS. Intel's idea was that code, data, and stack should each have their own segment, but Linux does not follow it: under Linux the data and the stack of a process are not separated by segments.

The values of USER_CS and USER_DS, together with those used by the kernel, are defined in include/asm-i386/segment.h:

    #define __KERNEL_CS 0x10
    #define __KERNEL_DS 0x18

    #define __USER_CS   0x23
    #define __USER_DS   0x2B
So the Linux kernel uses only four different segment register values, two for the kernel itself and two for all processes. Written out in binary and compared with the format of a segment register, they are:

                          Index            TI   RPL
    __KERNEL_CS  0x10     0000 0000 0001 0 | 0 | 00
    __KERNEL_DS  0x18     0000 0000 0001 1 | 0 | 00
    __USER_CS    0x23     0000 0000 0010 0 | 0 | 11
    __USER_DS    0x2B     0000 0000 0010 1 | 0 | 11

which gives:

    __KERNEL_CS:  index = 2,  RPL = 0
    __KERNEL_DS:  index = 3,  RPL = 0
    __USER_CS:    index = 4,  RPL = 3
    __USER_DS:    index = 5,  RPL = 3

First, TI is 0 in all four, so all of them use the GDT. This does not match Intel's intention. In fact the Linux kernel normally does not use LDTs at all. An LDT is used only in VM86 mode, to run wine and other software that emulates Windows or DOS on Linux.

Second, RPL takes only two values, 0 for the kernel (the highest level) and 3 for user processes (the lowest level). The two levels in between are not used.

Back to the program. When it is executed in user space CS holds __USER_CS, that is, 0x23. The CPU therefore uses entry 4 of the GDT as the descriptor for the mapping. The initial contents of the GDT are defined in arch/i386/kernel/head.S; the part of interest is:

    /*
     * This contains typically 140 quadwords, depending on NR_CPUS.
     *
     * NOTE! Make sure the gdt descriptor in head.S matches this if you
     * change anything.
     */
    ENTRY(gdt_table)
        .quad 0x0000000000000000    /* NULL descriptor */
        .quad 0x0000000000000000    /* not used */
        .quad 0x00cf9a000000ffff    /* 0x10 kernel 4GB code at 0x00000000 */
        .quad 0x00cf92000000ffff    /* 0x18 kernel 4GB data at 0x00000000 */
        .quad 0x00cffa000000ffff    /* 0x23 user   4GB code at 0x00000000 */
        .quad 0x00cff2000000ffff    /* 0x2b user   4GB data at 0x00000000 */
        .quad 0x0000000000000000    /* not used */
        .quad 0x0000000000000000    /* not used */

The first entry of the GDT (index 0) is not used, as Intel requires, and the second entry is not used either. Entries 2 to 5 are the four descriptors that correspond to the four segment register values above. Written out in binary they are:

    K_CS:  0000 0000 1100 1111 1001 1010 0000 0000
           0000 0000 0000 0000 1111 1111 1111 1111
    K_DS:  0000 0000 1100 1111 1001 0010 0000 0000
           0000 0000 0000 0000 1111 1111 1111 1111
    U_CS:  0000 0000 1100 1111 1111 1010 0000 0000
           0000 0000 0000 0000 1111 1111 1111 1111
    U_DS:  0000 0000 1100 1111 1111 0010 0000 0000
           0000 0000 0000 0000 1111 1111 1111 1111

Comparing these with the descriptor format in Figure 2.4 leads to the following conclusions.

(1) The following fields are identical in all four descriptors:
• B0-B15 and B16-B31 are all 0: the base address of every segment is 0;
• L0-L15 and L16-L19 are all 1: the segment limit is 0xfffff;
• the G bit is 1: the limit is counted in units of 4KB;
• the D bit is 1: instructions in the segment use 32-bit operands and addresses;
• the P bit is 1: all four segments are present in memory.

The conclusion is that every segment starts at address 0 and covers the entire 4GB virtual address space, so the mapping from virtual address to linear address leaves the value unchanged. When Linux kernel and application code are discussed below, virtual address and linear address can therefore be treated as the same thing.

(2) The differences are all in bits 40 to 46 of the descriptor, that is, in the type field, the S bit, and the DPL field:

• KERNEL_CS: DPL = 0, the kernel level; S = 1, a code or data segment; type = 1010, a code segment that is readable and executable and has not yet been accessed.
• KERNEL_DS: DPL = 0, the kernel level; S = 1, a code or data segment; type = 0010, a data segment that is readable and writable and has not yet been accessed.
• USER_CS: DPL = 3, the user level; S = 1, a code or data segment; type = 1010, a code segment that is readable and executable and has not yet been accessed.
• USER_DS: DPL = 3, the user level; S = 1, a code or data segment; type = 0010, a data segment that is readable and writable and has not yet been accessed.

There are only two kinds of difference. One is DPL, which is 0 for the kernel and 3 for user processes. The other is the segment type, code or data. These fields are checked by the CPU when a segment register is loaded. For example, if the DPL of a descriptor is 0 and the CPU is running at level 3, the attempt to load the corresponding selector into a segment register causes an exception, so a user process cannot simply switch to a kernel segment. Beyond this, Linux relies on paging, not on segmentation, for memory protection, because the four segments by themselves place no restriction on the addresses that can be reached.

[Figure 2.4: Segment descriptor format. A descriptor is 8 bytes long: bits 0-15 hold limit bits L15-L0, bits 16-39 hold base bits B23-B0, bits 40-43 the type, bit 44 the S bit, bits 45-46 DPL, bit 47 the P bit, bits 48-51 limit bits L19-L16, bit 54 the D bit, bit 55 the G bit, and bits 56-63 base bits B31-B24.]

In the example, then, the virtual address 0x08048568 comes out of the segment mapping as the linear address 0x08048568, unchanged.

The second stage is the page mapping. All processes share the same user segments, as just seen, but their page mappings are different. Every process has its own page directory PGD, and a pointer to it is kept in the mm_struct structure of the process. Whenever a process is scheduled to run, the kernel sets the control register CR3 for it.
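The MMU always takes the pointer to the current page directory from CR3. Since this pointer is used by the hardware in the course of mapping, it has to be a physical address, whereas the kernel itself works with virtual addresses, so the pointer must be converted before it is loaded. This is done in the inline function switch_mm(), in include/asm-i386/mmu_context.h. Only the one line that matters here is shown:

    static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next, struct task_struct *tsk, unsigned cpu)
    {
        ......
        asm volatile("movl %0,%%cr3": :"r" (__pa(next->pgd)));
        ......
    }

The macro __pa() converts the PGD pointer of the process about to run from a virtual address to a physical address, and the result is loaded into the register %%cr3, that is, CR3. A careful reader may ask: this line is itself executed through the address mapping, so when CR3 changes in the middle of the code, is execution not disturbed? It is not, because this code runs in the kernel, and the kernel part of the address space is mapped in the same way in the page directories of all processes.

When the program is about to transfer to the address 0x08048568, then, the process is running and CR3 already points to its page directory. Written in binary the linear address 0x08048568 is:

    0000 1000 0000 0100 1000 0101 0110 1000

According to the format of a linear address, the highest 10 bits are 0000100000, which is 32 in decimal. The MMU in the i386 CPU therefore uses 32 as the index into the page directory and finds the corresponding directory entry. The entry is 32 bits wide. Its upper 20 bits point to a page table; the CPU appends twelve 0 bits to these 20 bits to obtain the address of the page table. A page table occupies exactly one page of 4KB and is always aligned on a 4KB boundary, so the low 12 bits of its address are always 0. The low 12 bits of the directory entry are thus free for control and status information, such as the P bit that says whether the page table is present in memory.

Having found the page table, the CPU looks at the next 10 bits of the linear address. For 0x08048568 these are 0001001000, which is 72 in decimal. The CPU uses 72 as the index into the page table and finds the corresponding page table entry. Like a directory entry it is 32 bits wide; the upper 20 bits followed by twelve 0 bits give the start address of a physical page, and the low 12 bits hold control and status information.

Finally the lowest 12 bits of the linear address, here 0x568, are added to the start address of the page. If the page found through the mapping starts at physical address 0x740000, the entry of greeting() is at physical address 0x740568.

This may look like a great deal of work for every memory access. In practice the i386 CPU keeps recently used directory and page table entries in a cache inside the MMU, so the tables in memory are consulted only the first time a page is accessed. Most of the work is also done by hardware and overlaps with other operations, so the cost is much smaller than it seems.

It was said above that Linux normally does not use LDTs. The exception is software that runs Windows or DOS programs on Linux, and for its benefit the kernel provides two system calls.

2.2.1 modify_ldt(int func, void *ptr, unsigned long bytecount)

This system call lets a process read or change its local descriptor table.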
One user of this call is "WINE", a name derived from "Windows Emulation". Its purpose is to run Windows software on Linux. Many people have applications that exist only for Windows (MS Word, for example) and would like to keep using them after moving to Linux, and WINE was developed to make that possible. The system call modify_ldt() was introduced to support WINE and similar software.

When func is 0, the call reads the local descriptor table of the process into the user buffer that ptr points to. When func is 1, ptr points to a structure modify_ldt_ldt_s and bytecount is sizeof(struct modify_ldt_ldt_s). The structure is defined in include/asm-i386/ldt.h:

    struct modify_ldt_ldt_s {
        unsigned int  entry_number;
        unsigned long base_addr;
        unsigned int  limit;
        unsigned int  seg_32bit:1;
        unsigned int  contents:2;
        unsigned int  read_exec_only:1;
        unsigned int  limit_in_pages:1;
        unsigned int  seg_not_present:1;
        unsigned int  useable:1;
    };

Here entry_number is the index of the entry in the LDT that is to be changed, and the remaining fields give the contents of the new descriptor.

2.2.2 vm86(struct vm86_struct *info)

Similar to modify_ldt(), there is a system call vm86(), which is used to run DOS software on Linux. The i386 CPU has a VM86 mode, in which software written for real mode can run while the CPU stays in protected mode under a multitasking system. This allows operating systems that run in protected mode, such as Windows and OS/2, to run DOS programs, and Linux offers the same possibility through this call. The mode was designed for the 80386 with exactly this purpose. The implementation concerns few readers and is not discussed further here; those interested can find the code in include/asm-i386/vm86.h and arch/i386/kernel/vm86.c.

Neither system call has anything to do with the ordinary memory management of Linux. They exist only to support software that emulates Windows and DOS.

2.3 Some Important Data Structures and Functions

From the point of view of the hardware, the data structures on which Linux memory management rests are the page directory PGD and the page tables PT used by the paging unit, together with the GDT and, where used, the LDT of the segmentation unit. The kernel builds its own management structures on top of them, and the most important ones are introduced in this section.

The entries of the page directory PGD, the middle directory PMD, and the page table PT are defined in the source as the types pgd_t,
pmd_t, and pte_t respectively. The definitions are in include/asm-i386/page.h:

    /*
     * These are used to make use of C type-checking..
     */
    #if CONFIG_X86_PAE
    typedef struct { unsigned long pte_low, pte_high; } pte_t;
    typedef struct { unsigned long long pmd; } pmd_t;
    typedef struct { unsigned long long pgd; } pgd_t;
    #define pte_val(x) ((x).pte_low | ((unsigned long long)(x).pte_high << 32))
    ......

The kernel also needs to get from a page table entry to the page structure that describes the physical page. The upper bits of the entry, shifted right by PAGE_SHIFT, give the number of the physical page, and that number is used as an index into the array mem_map:

    ...... >> PAGE_SHIFT))))

Here mem_map is a pointer to an array of page structures. Since every element of the array is a page structure, mem_map+x is the same as &mem_map[x]. In the other direction, the page structure for a given kernel virtual address is found with the following macro (include/asm-i386/page.h):

    #define virt_to_page(kaddr) (mem_map + (__pa(kaddr) >> PAGE_SHIFT))

The page structure itself is defined in include/linux/mm.h:

    /*
     * Try to keep the most commonly accessed fields in single cache lines
     * here (16 bytes or greater).  This ordering should be particularly
     * beneficial on 32-bit processors.
     *
     * The first line is data used in page cache lookup, the second line
     * is used for linear searches (eg. clock algorithm scans).
     */
    typedef struct page {
        struct list_head list;
        struct address_space *mapping;
        unsigned long index;
        struct page *next_hash;
        atomic_t count;
        unsigned long flags;    /* atomic flags, some possibly updated asynchronously */
        struct list_head lru;
        unsigned long age;
        wait_queue_head_t wait;
        struct page **pprev_hash;
        struct buffer_head * buffers;
        void *virtual;  /* non-NULL if kmapped */
        struct zone_struct *zone;
    } mem_map_t;

The kernel often uses the name map for a pointer to a page structure. When the contents of the page come from a file, index is the position of the page within the file; when the page has been swapped out, index says where it went. The order of the fields is chosen, as the comment explains, so that fields used together fall into the same cache line.

At initialization the kernel sets up an array of page structures, mem_map, according to the amount of physical memory. It serves as a "warehouse" of physical pages: every physical page has one page structure, and the index of the structure in the array is the number of the physical page.

The pages in the warehouse are divided into two management zones, ZONE_DMA and ZONE_NORMAL, according to need. A third zone, ZONE_HIGHMEM, was added later for physical addresses above 1GB. The pages in ZONE_DMA are reserved for DMA. Why do the pages used for DMA have to be managed separately?
First, DMA pages are indispensable for disk I/O. If all physical pages were handed out, pages could no longer be exchanged with the swap area, and that cannot be allowed. Second, and this is partly specific to the i386, page-based memory management is implemented inside the i386 CPU rather than by a separate MMU as on some other CPUs, so DMA does not pass through the MMU's address mapping. A peripheral must therefore supply physical addresses directly, and some peripherals (especially interface cards on the ISA bus) are limited in this respect and require DMA addresses that are not too high. Third, precisely because DMA bypasses the MMU, a DMA buffer larger than one page must consist of physically contiguous pages: the DMA controller cannot rely on the MMU inside the CPU to map contiguous virtual pages onto scattered physical pages. For these reasons the physical pages used for DMA are managed separately.

Each management zone has its own data structure, zone_struct. A zone_struct contains a set of "free area" queues (free_area_t). Why a set of queues rather than one? Because the kernel often needs to allocate physically contiguous pages in blocks, free pages have to be managed by block size. The zone therefore keeps one queue of isolated pages (blocks of length 1), another of blocks of 2 contiguous pages, and further queues for blocks of 4, 8, 16, and so on. The constant MAX_ORDER is defined as 10, so free_area[] holds ten queues, for orders 0 through 9, and the largest contiguous block is 2^9 = 512 pages, or 2 MB with 4 KB pages. Both data structures and the related constants are defined in include/linux/mmzone.h:

    11 /*
    12  * Free memory management - zoned buddy allocator.
    13  */
    14
    15 #define MAX_ORDER 10
    16
    17 typedef struct free_area_struct {
    18     struct list_head    free_list;
    19     unsigned int        *map;
    20 } free_area_t;
    21
    22 struct pglist_data;
    23
    24 typedef struct zone_struct {
    25     /*
    26      * Commonly accessed fields:
    27      */
    28     spinlock_t          lock;
    29     unsigned long       offset;
    30     unsigned long       free_pages;
    31     unsigned long       inactive_clean_pages;
    32     unsigned long       inactive_dirty_pages;
    33     unsigned long       pages_min, pages_low, pages_high;
    34
    35     /*
    36      * free areas of different sizes
    37      */
    38     struct list_head    inactive_clean_list;
    39     free_area_t         free_area[MAX_ORDER];
    40
    41     /*
    42      * rarely used fields:
    43      */
    44     char                *name;
    45     unsigned long       size;
    46     /*
    47      * Discontig memory support fields.
    48      */
    49     struct pglist_data  *zone_pgdat;
    50     unsigned long       zone_start_paddr;
    51     unsigned long       zone_start_mapnr;
    52     struct page         *zone_mem_map;
    53 } zone_t;
    54
    55 #define ZONE_DMA        0
    56 #define ZONE_NORMAL     1
    57 #define ZONE_HIGHMEM    2
    58 #define MAX_NR_ZONES    3

The offset field of the zone structure gives the starting page number of the zone within mem_map.
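The relationship between an order and a block size can be sketched in a few lines of user-space C. This is only an illustration under stated assumptions: block_pages, block_bytes, and order_for_pages are hypothetical helper names, not kernel functions, and PAGE_SHIFT is taken to be 12 as on the i386.

```c
#include <assert.h>

#define PAGE_SHIFT 12   /* assumption: 4 KB pages, as on i386 */
#define MAX_ORDER  10   /* number of free-area queues: orders 0..9 */

/* Number of pages in one block of the given order (hypothetical helper). */
static unsigned long block_pages(unsigned int order)
{
    assert(order < MAX_ORDER);
    return 1UL << order;
}

/* Size in bytes of one block of the given order (hypothetical helper). */
static unsigned long block_bytes(unsigned int order)
{
    return block_pages(order) << PAGE_SHIFT;
}

/*
 * Smallest order whose block holds at least nr_pages pages,
 * or -1 if the request exceeds the largest block.
 */
static int order_for_pages(unsigned long nr_pages)
{
    unsigned int order;

    for (order = 0; order < MAX_ORDER; order++)
        if ((1UL << order) >= nr_pages)
            return (int)order;
    return -1;
}
```

A request for 3 pages, for example, has to be served from the order-2 queue (blocks of 4 pages), which is why allocations are usually expressed as an order rather than as a page count.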
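Once the zones have been set up, every physical page belongs permanently to one zone, determined by the page's starting address, much as the police precinct responsible for a building is determined by the building's address.

The member of free_area_struct that maintains the doubly linked queue is a list_head structure. This is a general-purpose structure used throughout the Linux kernel wherever a doubly linked list is needed. It is very simple: just two pointers, prev and next. Looking back at the page structure, its first member is also a list_head; this is how the page structure of a physical page is linked into the doubly linked queue of a free_area_struct. Each zone thus holds a whole set of free-area queues.

In traditional computer architectures the entire physical address space is uniform: the CPU needs the same time to access any address in it. This is called Uniform Memory Architecture, or UMA. In some newer architectures, especially multiprocessor systems, this uniformity no longer holds. Consider a system built as follows:

  o The system has a single bus (for example, a PCI bus).
  o Several CPU modules are attached to the system bus. Each module has its own local physical memory but can also reach the memory of the other CPU modules over the bus.
  o A shared memory module is also attached to the system bus, and every CPU module can access it over the bus.

All of this memory is addressed consecutively and forms one contiguous physical address space. For any given CPU, access to its local memory is clearly fastest, while access across the bus to the shared module or to another CPU module is slower and is subject to the uncertainty of bus contention. In such a system the physical address space is contiguous but its "texture" is not uniform, hence the name Non-Uniform Memory Architecture, or NUMA.

On a NUMA system, a run of contiguous physical pages should normally be allocated within a region of uniform texture, called a node. Suppose CPU module 1 asks for four physical pages, but because its local memory is nearly exhausted, the first three come from its own module and the last from CPU module 2. That is clearly unsuitable; it would be far better to take all four pages from the shared module.

Strictly speaking, pure UMA hardly exists. Even the simplest single-CPU PC has RAM, ROM (for the BIOS), and static RAM on the graphics card in its physical address space. In a UMA system, however, everything other than main RAM is small, so these memories sit at special addresses as little "islands" that merely need some care in programming. A true NUMA system, by contrast, needs support from the kernel's memory management.

Because multiprocessor systems are becoming widespread, Linux 2.4.0 supports NUMA (as a compile-time option). With NUMA, the physical page management described above had to be revised. Zones are no longer the top-level structure; instead, every memory node contains at least two zones. Likewise the page array is no longer global but belongs to a particular node. Above zone_struct (and the page array) there is therefore another layer, the pglist_data structure, which represents a memory node.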
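It is defined in include/linux/mmzone.h:

    79 typedef struct pglist_data {
    80     zone_t node_zones[MAX_NR_ZONES];
    81     zonelist_t node_zonelists[NR_GFPINDEX];
    82     struct page *node_mem_map;
    83     unsigned long *valid_addr_bitmap;
    84     struct bootmem_data *bdata;
    85     unsigned long node_start_paddr;
    86     unsigned long node_start_mapnr;
    87     unsigned long node_size;
    88     int node_id;
    89     struct pglist_data *node_next;
    90 } pg_data_t;

The pglist_data structures of the memory nodes can evidently be chained into a singly linked list through node_next. In each structure, node_mem_map points to the node's own array of page structures, and node_zones[] holds the node's zones, at most three. Conversely, each zone_struct has a pointer zone_pgdat back to the pglist_data of the node it belongs to.

pglist_data also contains an array node_zonelists[], whose element type is defined in the same file:

    71 typedef struct zonelist_struct {
    72     zone_t * zones [MAX_NR_ZONES+1]; // NULL delimited
    73     int gfp_mask;
    74 } zonelist_t;

Here zones[] is an array of pointers, each pointing to a zone in a particular order. When pages are allocated, the zone pointed to by zones[0] is tried first; if it cannot satisfy the request, the zone pointed to by zones[1] is tried, and so on. These zones may belong to different memory nodes. For the example above, one could specify: first try ZONE_DMA of node 1 (CPU module 1); if it cannot supply four pages, allocate all of them from ZONE_DMA of the shared module. Each zonelist_t thus defines one allocation policy. A node should not be restricted to a single policy, however, so pglist_data provides an array of zonelist_t whose size is NR_GFPINDEX:

    76 #define NR_GFPINDEX    0x100

That is, up to 0x100 = 256 different policies can be defined. Each allocation request states which policy to use.

The structures discussed so far manage physical space. We now turn to the management of virtual space, that is, of virtual pages. Unlike physical space, virtual space has no single central warehouse; it is managed per process, and each process has its own virtual (user) space. As mentioned earlier, though, the "system space" of every process is shared by all of them. From here on we will often use "virtual space" and "user space" of a process interchangeably.

If physical space is managed from the supply side ("what is left in the warehouse?"), virtual space is managed from the demand side ("which parts of the virtual space do we need?"). Hardly any process really needs the whole 3 GB of user space. Moreover, the parts a process does use are not necessarily contiguous; they usually form a number of separate "regions". An abstraction of such a region is therefore an important data structure. In the Linux kernel this is vm_area_struct, defined in include/linux/mm.h:

    35 /*
    36  * This struct defines a memory VMM memory area. There is one of these
    37  * per VM-area/task.  A VM area is any part of the process virtual memory
    38  * space that has a special rule for the page-fault handlers (ie a shared
    39  * library, the executable area etc).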
    40  */
    41 struct vm_area_struct {
    42     struct mm_struct * vm_mm;    /* VM area parameters */
    43     unsigned long vm_start;
    44     unsigned long vm_end;
    45
    46     /* linked list of VM areas per task, sorted by address */
    47     struct vm_area_struct *vm_next;
    48
    49     pgprot_t vm_page_prot;
    50     unsigned long vm_flags;
    51
    52     /* AVL tree of VM areas per task, sorted by address */
    53     short vm_avl_height;
    54     struct vm_area_struct * vm_avl_left;
    55     struct vm_area_struct * vm_avl_right;
    56
    57     /* For areas with an address space and backing store,
    58      * one of the address_space->i_mmap{,shared} lists,
    59      * for shm areas, the list of attaches, otherwise unused.
    60      */
    61     struct vm_area_struct *vm_next_share;
    62     struct vm_area_struct **vm_pprev_share;
    63
    64     struct vm_operations_struct * vm_ops;
    65     unsigned long vm_pgoff;      /* offset in PAGE_SIZE units, *not* PAGE_CACHE_SIZE */
    66     struct file * vm_file;
    67     unsigned long vm_raend;
    68     void * vm_private_data;      /* was vm_pte (shared mem) */
    69 };

In kernel code, variables of this type are usually named vma. The fields vm_start and vm_end delimit a virtual memory region: vm_start lies inside the region, vm_end does not. Regions are not divided by address contiguity alone; other attributes matter too, chiefly the access permissions of the pages. If the first half and the second half of an address range have different permissions or other attributes, the range must be split into two regions. All pages within one region therefore share the same access permissions (protection attributes) and certain other attributes; this is what vm_page_prot and vm_flags record.

All regions of a process are linked together in ascending address order through the vm_next pointer. Because regions are split on attributes as well as on contiguity, the user space of a process may well consist of a large number of regions. Finding the region that contains a given virtual address is a very frequent operation in the kernel; a linear search along vm_next every time would noticeably hurt performance. So, besides the linear list, an AVL (Adelson-Velskii and Landis) tree is built once the number of regions becomes large. An AVL tree is a balanced tree; as any data structures text explains, a search in it costs O(lg n), proportional to the logarithm of the tree size rather than to the size itself. The fields vm_avl_height, vm_avl_left, and vm_avl_right of vm_area_struct place the region in that tree.

Virtual pages (or regions) become associated with disk files in two situations. One is swapping: when physical pages run short, pages that have not been used for a long time can be written out to disk to free physical pages for processes that urgently need them. This is demand paging in the usual sense. The other is mapping a disk file into the user space of a process.
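The half-open convention for vm_start and vm_end, and the rule that one region holds only pages with identical attributes, can be illustrated with a small user-space sketch. The names vma_sketch, vma_contains, and vma_can_merge are hypothetical and exist only for this illustration; the kernel's own structure is the vm_area_struct shown above.

```c
#include <assert.h>

/* A simplified stand-in for vm_area_struct (illustration only). */
struct vma_sketch {
    unsigned long start;   /* first address inside the region */
    unsigned long end;     /* first address past the region */
    unsigned long flags;   /* attributes shared by every page in the region */
};

/* start is inclusive, end is exclusive. */
static int vma_contains(const struct vma_sketch *vma, unsigned long addr)
{
    return vma->start <= addr && addr < vma->end;
}

/*
 * Two neighbouring ranges can be described by a single region only if
 * they are contiguous and carry the same attributes.
 */
static int vma_can_merge(const struct vma_sketch *lo,
                         const struct vma_sketch *hi)
{
    return lo->end == hi->start && lo->flags == hi->flags;
}
```

With this convention, the size of a region is simply end minus start, and two adjacent regions never overlap at their common boundary.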
Linux provides the system call mmap() (it actually dates from Unix SysV R4.2), which lets a process map an already opened file into its user space. The process can then access the file's content as if it were a character array in memory, without going through lseek(), read(), or write(). Because regions (and ultimately pages) can be tied to disk files in this way, vm_area_struct contains several fields, such as vm_next_share, vm_pprev_share, and vm_file, that record and manage the association. Their use is described later in the context of concrete scenarios.

Another important field of the region structure is vm_ops, a pointer to a vm_operations_struct. This structure is also defined in include/linux/mm.h:

    115 /*
    116  * These are the virtual MM functions - opening of an area, closing and
    117  * unmapping it (needed to keep files on disk up-to-date etc), pointer
    118  * to the functions called when a no-page or a wp-page exception occurs.
    119  */
    120 struct vm_operations_struct {
    121     void (*open)(struct vm_area_struct * area);
    122     void (*close)(struct vm_area_struct * area);
    123     struct page * (*nopage)(struct vm_area_struct * area, unsigned long address, int write_access);
    124 };

The structure consists entirely of function pointers. open, close, and nopage are used to open a region, close it, and establish a mapping, respectively. They exist because different regions may need different additional operations. The nopage pointer names the function to call when a page fault exception (see Chapter 3) occurs because a virtual page is not in memory.

Finally, vm_area_struct has a pointer vm_mm to an mm_struct, which is defined in include/linux/sched.h:

    203 struct mm_struct {
    204     struct vm_area_struct * mmap;          /* list of VMAs */
    205     struct vm_area_struct * mmap_avl;      /* tree of VMAs */
    206     struct vm_area_struct * mmap_cache;    /* last find_vma result */
    207     pgd_t * pgd;
    208     atomic_t mm_users;                     /* How many users with user space?
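 */
    209     atomic_t mm_count;                     /* How many references to "struct mm_struct" (users count as 1) */
    210     int map_count;                         /* number of VMAs */
    211     struct semaphore mmap_sem;
    212     spinlock_t page_table_lock;
    213
    214     struct list_head mmlist;               /* List of all active mm's */
    215
    216     unsigned long start_code, end_code, start_data, end_data;
    217     unsigned long start_brk, brk, start_stack;
    218     unsigned long arg_start, arg_end, env_start, env_end;
    219     unsigned long rss, total_vm, locked_vm;
    220     unsigned long def_flags;
    221     unsigned long cpu_vm_mask;
    222     unsigned long swap_cnt;    /* number of pages to swap on next pass */
    223     unsigned long swap_address;
    224
    225     /* Architecture-specific MM context */
    226     mm_context_t context;
    227 };

In kernel code, variables (pointers) of this type are usually named mm. This structure clearly sits at a higher level than vm_area_struct. Each process has exactly one mm_struct, and the process control block, task_struct, contains a pointer to it. mm_struct can be regarded as the abstraction of the whole user space of a process and as its top-level control structure.

The first three pointers all concern regions. mmap heads the singly linked linear list of region structures. mmap_avl is the root of the AVL tree of region structures discussed above. mmap_cache points to the region structure used most recently: addresses used by a program tend to show locality, so the region used last is quite likely to be the one needed next, and caching it improves efficiency. map_count tells how many region structures are in the list (or the AVL tree), that is, how many regions the process has. The pointer pgd obviously points to the page directory of the process; when the kernel schedules a process to run, it converts this pointer to a physical address and loads it into control register CR3, as seen earlier.

Because the mm_struct and the vm_area_struct structures beneath it must be accessed under mutual exclusion, the structure includes a semaphore for P and V operations, mmap_sem. page_table_lock serves a similar purpose.

Although a process uses only one mm_struct, one mm_struct may be shared by several processes. The simplest example: when a process creates a child with vfork() or clone() (see Chapter 4), the child may share the parent's mm_struct. The counters mm_users and mm_count exist for this reason. The type atomic_t is really just an integer, but operations on it must be atomic, that is, they must not be disturbed by interrupts or anything else.

The segments pointer (kept in the architecture-specific context) points to the local descriptor table, LDT, of the process. Ordinary processes do not use an LDT; only those running in VM86 mode have one. The purpose of the remaining fields is fairly obvious: start_code, end_code, start_data, end_data and the like mark the beginning and end of the code segment, data segment, heap, and stack of the process image. Note that these "segments" of the process image must not be confused with the "segments" of segmented memory management.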
As later scenarios will show, an mm_struct and the vm_area_struct structures beneath it express only the demand for virtual space. That a region exists for a virtual address does not guarantee that the page containing the address has been mapped to a physical page (in memory or in the swap area), still less that the page is in memory. When an unmapped page is accessed, a Page Fault exception is raised, and the Page Fault handler deals with the situation. In this sense, mm_struct and vm_area_struct describe the demand for pages; page, zone_struct, and related structures describe the supply; and the page directory, middle directory, and page tables are the bridge between the two. Figure 2.5 sketches the relationships among the data structures used in virtual memory management.

[Figure 2.5: relationships among task_struct, mm_struct, vm_area_struct, vm_operations_struct, pgd_t, and pte_t]

As mentioned above, the kernel frequently performs the following operation: given a virtual address belonging to some process, find the region it lies in and the corresponding vm_area_struct. This is done by find_vma(), in mm/mmap.c:

    404 /* Look up the first VMA which satisfies  addr < vm_end,  NULL if none. */
    405 struct vm_area_struct * find_vma(struct mm_struct * mm, unsigned long addr)
    406 {
    407     struct vm_area_struct *vma = NULL;
    408
    409     if (mm) {
    410         /* Check the cache first. */
    411         /* (Cache hit rate is typically around 35%.) */
    412         vma = mm->mmap_cache;
    413         if (!(vma && vma->vm_end > addr && vma->vm_start <= addr)) {
    414             if (!mm->mmap_avl) {
    415                 /* Go through the linear list. */
    416                 vma = mm->mmap;
    417                 while (vma && vma->vm_end <= addr)
    418                     vma = vma->vm_next;
    419             } else {
    420                 /* Then go through the AVL tree quickly. */
    421                 struct vm_area_struct * tree = mm->mmap_avl;
    422                 vma = NULL;
    423                 for (;;) {
    424                     if (tree == vm_avl_empty)
    425                         break;
    426                     if (tree->vm_end > addr) {
    427                         vma = tree;
    428                         if (tree->vm_start <= addr)
    429                             break;
    430                         tree = tree->vm_avl_left;
    431                     } else
    432                         tree = tree->vm_avl_right;
    433                 }
    434             }
    435             if (vma)
    436                 mm->mmap_cache = vma;
    437         }
    438     }
    439     return vma;
    440 }

Whenever we speak of a particular user-space virtual address, we must say which process's virtual space it belongs to, so the function takes two arguments: the address and a pointer to the mm_struct of the process. The function first checks whether the address happens to fall in the region accessed last time. According to the author's comment, the hit rate is typically around 35%, which is why mm_struct carries the mmap_cache pointer. On a miss a search is needed: if an AVL tree has been built (mmap_avl is non-zero) the tree is searched, otherwise the linear list is searched. Finally,
if a region is found, mmap_cache is set to point to its vm_area_struct. A return value of zero (NULL) means that no region has yet been established for the address. In that case a new region structure usually has to be created and then inserted into the linear list or AVL tree of the mm_struct by calling insert_vm_struct(). Its source code is in the same file:

    961 void insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vmp)
    962 {
    963     lock_vma_mappings(vmp);
    964     spin_lock(&current->mm->page_table_lock);
    965     __insert_vm_struct(mm, vmp);
    966     spin_unlock(&current->mm->page_table_lock);
    967     unlock_vma_mappings(vmp);
    968 }
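The linear-list path of lookup and insertion can be sketched in user-space C as follows. This is a minimal illustration, not the kernel's code: the types region and space and the functions find_region and insert_region are hypothetical names, locking is omitted, and the AVL tree is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

struct region {
    unsigned long start, end;   /* the region covers [start, end) */
    struct region *next;        /* next region in ascending address order */
};

struct space {
    struct region *list;        /* head of the sorted list */
    struct region *cache;       /* result of the last successful lookup */
    int count;                  /* number of regions in the list */
};

/*
 * Return the first region whose end lies above addr, or NULL.
 * The region returned may lie entirely above addr (addr is in a hole).
 */
static struct region *find_region(struct space *sp, unsigned long addr)
{
    struct region *r = sp->cache;

    if (r && r->start <= addr && addr < r->end)
        return r;                               /* cache hit */
    for (r = sp->list; r && r->end <= addr; r = r->next)
        ;                                       /* skip regions below addr */
    if (r)
        sp->cache = r;
    return r;
}

/* Insert a region, keeping the list sorted by start address. */
static void insert_region(struct space *sp, struct region *new_r)
{
    struct region **pprev = &sp->list;

    while (*pprev && (*pprev)->start <= new_r->start)
        pprev = &(*pprev)->next;
    new_r->next = *pprev;
    *pprev = new_r;
    sp->count++;
}
```

The pointer-to-pointer walk in insert_region avoids a special case for insertion at the head of the list. Note that a non-NULL result from find_region does not mean the address is inside the region; the caller still has to compare the address with the region's start.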

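[...]

    913 /* Insert vm structure into process list sorted by address
    914  * and into the inode's i_mmap ring.  If vm_file is non-NULL
    915  * then the i_shared_lock must be held here.
    916  */
    917 void __insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vmp)
    918 {
    919     struct vm_area_struct **pprev;
    920     struct file * file;
    921
    922     if (!mm->mmap_avl) {
    923         pprev = &mm->mmap;
    924         while (*pprev && (*pprev)->vm_start <= vmp->vm_start)
    925             pprev = &(*pprev)->vm_next;
    926     } else {
    927         struct vm_area_struct *prev, *next;
    928         avl_insert_neighbours(vmp, &mm->mmap_avl, &prev, &next);
    929         pprev = (prev ? &prev->vm_next : &mm->mmap);
    930         if (*pprev != next)
    931             printk("insert_vm_struct: tree inconsistent with list\n");
    932     }
    933     vmp->vm_next = *pprev;
    934     *pprev = vmp;
    935
    936     mm->map_count++;
    937     if (mm->map_count >= AVL_MIN_MAP_COUNT && !mm->mmap_avl)
    938         build_mmap_avl(mm);
    939
        ......
    959 }

While a virtual space contains only a few regions, searching the linear list is efficient enough, so no AVL tree is built. Once the number of regions reaches AVL_MIN_MAP_COUNT, that is, 32, build_mmap_avl() is called to build the AVL tree and speed up searching.

2.4 Out-of-Bounds Access

The paging mechanism translates each linear address (which may be thought of as a virtual address) into a physical address by way of the page directory and page tables. If some obstacle in this process prevents the CPU from finally reaching the physical memory location, the mapping fails and the current instruction cannot complete. The CPU then raises a Page Fault exception and executes the page fault handler, so that the program can either resume from the instruction that was suspended by the failed mapping or be cleaned up. The obstacle may be one of the following:

  o The page directory entry or page table entry is empty: the mapping from this linear address to a physical address has not been established, or has been removed.
  o The physical page is not in memory.
  o The kind of access requested by the instruction conflicts with the page's permissions, for example an attempt to write a read-only page.

In this scenario we assume that a user program once mapped an open file into memory with the mmap() system call and has since removed the mapping with munmap(). Removing a mapped region often leaves an isolated hole in the virtual address space, and the addresses in it should no longer be used. User programs have bugs, however, and somewhere the program accesses the removed region again (programmers will agree this is hardly surprising). The out-of-bounds access to an invalid address makes the mapping fail and raises a Page Fault exception.

We do not discuss here how the CPU responds to exceptions in general (see the chapter on interrupts and exceptions). We start from the point at which the CPU has entered the exception handler for the page fault and has reached the main routine, do_page_fault(). The code of do_page_fault() is in arch/i386/mm/fault.c.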
The function is fairly long. We will read it piece by piece, following the path of execution as the scenario unfolds. Here is its beginning:

    106 asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code)
    107 {
    108     struct task_struct *tsk;
    109     struct mm_struct *mm;
    110     struct vm_area_struct * vma;
    111     unsigned long address;
    112     unsigned long page;
    113     unsigned long fixup;
    114     int write;
    115     siginfo_t info;
    116
    117     /* get the address */
    118     __asm__("movl %%cr2,%0":"=r" (address));
    119
    120     tsk = current;
    121
    122     /*
    123      * We fault-in kernel-space virtual memory on-demand. The
    124      * 'reference' page table is init_mm.pgd.
    125      *
    126      * NOTE! We MUST NOT take any locks for this case. We may
    127      * be in an interrupt or a critical region, and should
    128      * only copy the information from the master page table,
    129      * nothing more.
    130      */
    131     if (address >= TASK_SIZE)
    132         goto vmalloc_fault;
    133
    134     mm = tsk->mm;
    135     info.si_code = SEGV_MAPERR;
    136
    137     /*
    138      * If we're in an interrupt or have no user
    139      * context, we must not take the fault..
    140      */
    141     if (in_interrupt() || !mm)
    142         goto no_context;
    143
    144     down(&mm->mmap_sem);
    145
    146     vma = find_vma(mm, address);
    147     if (!vma)
    148         goto bad_area;
    149     if (vma->vm_start <= address)
    150         goto good_area;
    151     if (!(vma->vm_flags & VM_GROWSDOWN))
    152         goto bad_area;

The function opens with a line of assembly. Why assembly? When an i386 CPU raises a page fault, it places the linear address that failed to map in control register CR2, and the handler obviously needs this information. C has no construct for reading CR2, so assembly must be used. The statement has an output part and no input part; it binds %0 to the variable address and requires that the variable be allocated in a register.

The kernel's interrupt and exception entry code also passes in two parameters. One is regs, a pointer to a pt_regs structure holding a copy of the CPU registers as they were just before the exception; this is the context saved by the entry code. The other, error_code, further identifies the cause of the failure.

Next the function obtains the address of the current process's task_struct. In the kernel the macro current yields the address of the task_struct of the current (running) process. Each task_struct contains a pointer to the process's mm_struct, which holds the information related to virtual memory management and mapping. Note that the mapping the CPU actually performs does not involve mm_struct at all; as explained earlier, it goes through the page directory and page tables. The mm_struct, however, reflects, or describes, that mapping. Two special cases must be checked next.
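The first special case is that in_interrupt() returns non-zero: the mapping failure occurred in an interrupt service routine and therefore has nothing to do with the current process. The second is that the mm pointer of the current process is NULL: the mapping for this process has not yet been established, so the failure cannot be related to the current process either. But if in_interrupt() returned 0 and we are not in an interrupt service routine, how can the failure be unrelated to the current process? It can only have happened in some interrupt or exception handling path that in_interrupt() does not detect. In either special case control passes through a goto to the label no_context, whose code we do not examine here.

The operations that follow require mutual exclusion, that is, other processes must not interfere, so they are bracketed by the semaphore P and V operations down() and up(). mm_struct provides the semaphore mmap_sem for this purpose. Once down() returns, no other process can interfere.

Then find_vma() tries to find, in the virtual space, the first region whose end address is greater than the given address. If there is none, the fault must be due to an out-of-bounds access. When would none be found? Recall how the kernel lays out the user virtual space: the stack sits at the top of user space and grows downward, while the code and data of the process are allocated from the bottom upward. If no region ends above the given address, the address lies above the stack, that is, above 3 GB. An access from user space to the system's space is certainly out of bounds, so control goes to bad_area (line 148).

If such a region is found and its start address is not higher than the given address (line 149), the address falls within that region. The mapping must then already exist, so control goes to good_area to investigate the cause of the failure further. That is not the case in our scenario either.

What remains is that the address falls in the hole between two regions: the mapping of its page has not been established or has been removed. There can be two kinds of hole in the user virtual space. The first occurs only once: it is the large hole below the stack, representing space that is available for dynamic allocation (through the brk() system call) but has not yet been allocated. The second kind is left behind when a mapped region is removed, or when a block of addresses is skipped while a mapping is being set up. How are the two told apart? See line 151. The stack grows downward, so if the region found by find_vma() is the stack region, its vm_flags should have the VM_GROWSDOWN flag set. If the flag is 0, the region above the hole is not the stack, and the hole was left by a removed mapping. This is the situation in our scenario, so control goes to the label bad_area, at line 224:

    220 /*
    221  * Something tried to access memory that isn't in our memory map..
    222  * Fix it, but check if it's kernel or user first..
    223  */
    224 bad_area:
    225     up(&mm->mmap_sem);
    226
    227 bad_area_nosemaphore:
    228     /* User mode accesses just cause a SIGSEGV */
    229     if (error_code & 4) {
    230         tsk->thread.cr2 = address;
    231         tsk->thread.error_code = error_code;
    232         tsk->thread.trap_no = 14;
    233         info.si_signo = SIGSEGV;
    234         info.si_errno = 0;
    235         /* info.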
si_code has been set above */
    236         info.si_addr = (void *)address;
    237         force_sig_info(SIGSEGV, &info, tsk);
    238         return;
    239     }

By the time control reaches this point, mutual exclusion is no longer needed (the mm_struct is not operated on any more), so the critical section is left through up(). Then error_code is examined further. The meaning of its bits is documented at the top of the function:

    96 /*
    97  * This routine handles page faults.  It determines the address,
    98  * and the problem, and then passes it off to one of the appropriate
    99  * routines.
    100  *
    101  * error_code:
    102  *    bit 0 == 0 means no page found, 1 means protection fault
    103  *    bit 1 == 0 means read, 1 means write
    104  *    bit 2 == 0 means kernel, 1 means user-mode
    105  */

When bit 2 of error_code is 1, the failure occurred while the CPU was in user mode, which is exactly our scenario, so control reaches line 229. There, after some fields in the task_struct of the current process have been set, a forced signal (or "software interrupt"), SIGSEGV, is sent to the process. Signals are discussed in detail later in the book; despite the name, a signal is a software interrupt mechanism. Before the CPU returns from exception handling to user space, the pending signals of the current process are checked and handled. The reaction to SIGSEGV is familiar: the message "Segment Fault", which programmers dread and yet see often, appears on the screen, and the process is aborted.

This completes the exception handling for this scenario. We have read only a small part of do_page_fault(); other scenarios will bring us back to it.

2.5 Extending the User Stack

In the previous scenario we followed an out-of-bounds access that made the mapping fail and led to the process being aborted. Sometimes, however, an out-of-bounds access is normal. We now look at a scenario in which the user stack is too small and is extended as a result of an out-of-bounds access.

Suppose a running process has used up the stack region allocated to it: starting from the "top" of the stack (remember that the stack grows downward), it has reached the lower edge of the mapped stack region. In other words, the stack pointer %esp in the CPU already points to the start address of the stack region; see Figure 2.6.

[Figure 2.6: extension of the user stack region]

Suppose a subroutine is now called. The CPU must push the return address onto the stack, that is, write it to virtual address (%esp - 4). In our scenario that address falls into the hole; it is not yet mapped, so a page fault is inevitable. Following the path already taken in the previous scenario, we arrive at line 151 of arch/i386/mm/fault.c:

    151     if (!(vma->vm_flags & VM_GROWSDOWN))
    152         goto bad_area;
    153     if (error_code & 4) {
    154         /*
    155          * accessing the stack below %esp is always a bug.
    156          * The "+ 32" is there due to some instructions (like
    157          * pusha) doing post-decrement on the stack and that
    158          * doesn't show up until later..
    159          */
    160         if (address + 32 < regs->esp)
    161             goto bad_area;
    162     }
    163     if (expand_stack(vma, address))
    164         goto bad_area;

This time the region above the hole is the stack region and its VM_GROWSDOWN flag is 1, so execution continues. When the failure occurs in user mode (bit 2 is 1), an out-of-bounds access caused by a stack operation is treated as a special case, so the handler must still check whether the faulting address lies right next to where the stack pointer points. In our scenario it is %esp - 4, which certainly does. But what if it were %esp - 40? Then it would not come from a normal stack operation; it would be a genuine illegal access. How is "normal" decided? Usually 4 bytes are pushed at a time, so the address should be %esp - 4. The i386, however, has the pusha instruction, which pushes 32 bytes (the contents of eight 32-bit registers) at once. The criterion is therefore %esp - 32. Anything beyond that is certainly an error, and control goes to bad_area as in the previous scenario.

In our scenario the test passes. Since this is a legitimate request to extend the stack, pages should be allocated and mapped starting from the top of the hole and merged into the stack region. So expand_stack() is called, an inline function defined in include/linux/mm.h:

    487 /* vma is the first one with  address < vma->vm_end,
    488  * and even  address < vma->vm_start. Have to extend vma. */
    489 static inline int expand_stack(struct vm_area_struct * vma, unsigned long address)
    490 {
    491     unsigned long grow;
    492
    493     address &= PAGE_MASK;
    494     grow = (vma->vm_start - address) >> PAGE_SHIFT;
    495     if (vma->vm_end - address > current->rlim[RLIMIT_STACK].rlim_cur ||
    496         ((vma->vm_mm->total_vm + grow) << PAGE_SHIFT) > current->rlim[RLIMIT_AS].rlim_cur)
    497         return -ENOMEM;
    498     vma->vm_start = address;
    499     vma->vm_pgoff -= grow;
    500     vma->vm_mm->total_vm += grow;
    501     if (vma->vm_flags & VM_LOCKED)
    502         vma->vm_mm->locked_vm += grow;
    503     return 0;
    504 }

The parameter vma points to a vm_area_struct representing a region, here the region that holds the user-space stack. The address is first aligned to a page boundary, and the number of pages by which the region must grow to include it is computed (usually just one). One question remains: can the stack keep growing without limit until the whole hole is used up? No. The task_struct of every process contains an array of rlim structures that limit the use of each kind of resource, and RLIMIT_STACK is the limit on the size of the user-space stack. That is what is checked here. If the extended region would exceed the resources available for the stack, or would push the total number of dynamically allocated pages beyond the limit for the process, the stack cannot be extended.
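The two checks just described, the 32-byte window below the stack pointer and the size limit applied when the region grows, can be tried out in a small user-space sketch. The names stack_region, looks_like_stack_access, and grow_stack are hypothetical; the sketch keeps only the stack-size limit and omits the address-space limit and the page accounting that the kernel function also performs.

```c
#include <assert.h>

#define PAGE_SHIFT 12                       /* assumption: 4 KB pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* A downward-growing stack region covering [start, end). */
struct stack_region {
    unsigned long start, end;
};

/*
 * A user-mode fault below the stack region is accepted as a stack access
 * only if it lies no more than 32 bytes below the stack pointer.
 */
static int looks_like_stack_access(unsigned long addr, unsigned long esp)
{
    return addr + 32 >= esp;
}

/*
 * Lower the region's start to the page containing addr.
 * Returns 0 on success, -1 if the region would exceed limit bytes.
 */
static int grow_stack(struct stack_region *st, unsigned long addr,
                      unsigned long limit)
{
    addr &= PAGE_MASK;
    if (st->end - addr > limit)
        return -1;
    st->start = addr;
    return 0;
}
```

Only the bookkeeping of the region changes in grow_stack; as in the kernel, no physical page is allocated at this stage.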
In that case the function returns the negative error code -ENOMEM, meaning that no memory can be allocated; otherwise it returns 0. When expand_stack() returns non-zero, that is, -ENOMEM, do_page_fault() again goes to bad_area, and the outcome is the same as in the previous scenario. Resources are not normally exhausted, though, so expand_stack() usually returns successfully. As we have seen, however, expand_stack() only changes the vm_area_struct of the stack region; it does not map the newly added page to physical memory. That task is carried out by the code at good_area, which follows:

    165 /*
    166  * Ok, we have a good vm_area for this memory access, so
    167  * we can handle it..
    168  */
    169 good_area:
    170     info.si_code = SEGV_ACCERR;
    171     write = 0;
    172     switch (error_code & 3) {
    173         default:    /* 3: write, present */
    174 #ifdef TEST_VERIFY_AREA
    175             if (regs->cs == KERNEL_CS)
    176                 printk("WP fault at %08lx\n", regs->eip);
    177 #endif
    178             /* fall through */
    179         case 2:     /* write, not present */
    180             if (!(vma->vm_flags & VM_WRITE))
    181                 goto bad_area;
    182             write++;
    183             break;
    184         case 1:     /* read, present */
    185             goto bad_area;
    186         case 0:     /* read, not present */
    187             if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
    188                 goto bad_area;
    189     }
    190
    191     /*
    192      * If for any reason at all we couldn't handle the fault,
    193      * make sure we exit gracefully rather than endlessly redo
    194      * the fault.
    195      */
    196     switch (handle_mm_fault(mm, vma, address, write)) {
    197     case 1:
    198         tsk->min_flt++;
    199         break;
    200     case 2:
    201         tsk->maj_flt++;
    202         break;
    203     case 0:
    204         goto do_sigbus;
    205     default:
    206         goto out_of_memory;
    207     }

In the first switch statement the kernel uses the error_code passed in by the exception entry code to pin down the cause of the failure and choose a response (the low three bits of error_code were listed in the previous section). In our scenario bit 0 is 0, meaning no physical page is present, and bit 1 is 1, meaning a write, so the value is 2. Since this is a write, the region must be checked for write permission, and the stack region is writable. Execution therefore reaches line 196 and calls the virtual memory routine handle_mm_fault().
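The decision made by the first switch statement can be restated as a small pure function, which makes it easy to try individual error codes. This is an illustrative sketch: check_access, fault_is_user, and the verdict enumeration are hypothetical names, and the VM_* constants are defined locally for the example.

```c
#include <assert.h>

/* Region permission bits, defined locally for this sketch. */
#define VM_READ  0x1UL
#define VM_WRITE 0x2UL
#define VM_EXEC  0x4UL

enum verdict { ACCESS_BAD = 0, ACCESS_READ_OK = 1, ACCESS_WRITE_OK = 2 };

/* Bit 2 of the error code: 1 means the fault came from user mode. */
static int fault_is_user(unsigned long error_code)
{
    return (error_code & 4) != 0;
}

/*
 * Decide whether a fault inside a known region is a legitimate access.
 * Bit 0: 0 = page not present, 1 = protection fault.
 * Bit 1: 0 = read, 1 = write.
 */
static enum verdict check_access(unsigned long error_code,
                                 unsigned long vm_flags)
{
    switch (error_code & 3) {
    case 3:     /* write, present */
    case 2:     /* write, not present */
        return (vm_flags & VM_WRITE) ? ACCESS_WRITE_OK : ACCESS_BAD;
    case 1:     /* read, present: a protection violation on a read */
        return ACCESS_BAD;
    default:    /* 0: read, not present */
        return (vm_flags & (VM_READ | VM_EXEC)) ? ACCESS_READ_OK
                                                : ACCESS_BAD;
    }
}
```

In the stack-extension scenario the error code is 6 (user mode, write, page not present), so a writable region yields ACCESS_WRITE_OK and the handler proceeds to establish the mapping.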
This function is defined in mm/memory.c:

[do_page_fault() > handle_mm_fault()]

    1189 /*
    1190  * By the time we get here, we already hold the mm semaphore
    1191  */
    1192 int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct * vma,
    1193     unsigned long address, int write_access)
    1194 {
    1195     int ret = -1;
    1196     pgd_t *pgd;
    1197     pmd_t *pmd;
    1198
    1199     pgd = pgd_offset(mm, address);
    1200     pmd = pmd_alloc(pgd, address);
    1201
    1202     if (pmd) {
    1203         pte_t * pte = pte_alloc(pmd, address);
    1204         if (pte)
    1205             ret = handle_pte_fault(mm, vma, address, write_access, pte);
    1206     }
    1207     return ret;
    1208 }

Given the address and the mm_struct that represents the virtual space, the macro pgd_offset() computes a pointer to the page directory entry for that address. It is defined in include/asm-i386/pgtable.h:

    311 /* to find an entry in a page-table-directory. */
    312 #define pgd_index(address) ((address >> PGDIR_SHIFT) & (PTRS_PER_PGD-1))
    316 #define pgd_offset(mm, address) ((mm)->pgd+pgd_index(address))

The following pmd_alloc() would normally allocate (or find) a middle directory entry. Because the i386 uses only two levels of mapping, it is defined in include/asm-i386/pgtable-2level.h simply as "return (pmd_t *)pgd;". In other words, on the i386 the page directory entry itself is treated as a middle directory containing a single entry (a table of size 1). On the i386, therefore, pmd_alloc() can never fail, and pmd cannot be 0.

What does a page directory entry point to? A page table. If the page table for this address does not exist yet, that is, if the directory entry is empty, a page table must be allocated first; only then can an entry in it be set up to map the page.
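How a 32-bit linear address splits into a directory index, a table index, and a page offset under two-level paging can be checked with a short sketch. The constants are assumed to take their usual i386 non-PAE values, and the function names pgd_index_of, pte_index_of, and page_offset_of are hypothetical.

```c
#include <assert.h>

/* Assumed values for i386 two-level paging without PAE. */
#define PAGE_SHIFT    12
#define PGDIR_SHIFT   22
#define PTRS_PER_PGD  1024
#define PTRS_PER_PTE  1024

/* Index of the page directory entry covering the address. */
static unsigned long pgd_index_of(unsigned long address)
{
    return (address >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
}

/* Index of the page table entry within its page table. */
static unsigned long pte_index_of(unsigned long address)
{
    return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}

/* Byte offset of the address within its page. */
static unsigned long page_offset_of(unsigned long address)
{
    return address & ((1UL << PAGE_SHIFT) - 1);
}
```

The top 10 bits select one of 1024 directory entries, the next 10 bits select one of 1024 table entries, and the low 12 bits address a byte within the 4 KB page.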
The allocation is done by pte_alloc(), in include/asm-i386/pgalloc.h:

[do_page_fault() > handle_mm_fault() > pte_alloc()]

    120 extern inline pte_t * pte_alloc(pmd_t * pmd, unsigned long address)
    121 {
    122     address = (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
    123
    124     if (pmd_none(*pmd))
    125         goto getnew;
    126     if (pmd_bad(*pmd))
    127         goto fix;
    128     return (pte_t *)pmd_page(*pmd) + address;
    129 getnew:
    130 {
    131     unsigned long page = (unsigned long) get_pte_fast();
    132
    133     if (!page)
    134         return get_pte_slow(pmd, address);
    135     set_pmd(pmd, __pmd(_PAGE_TABLE + __pa(page)));
    136     return (pte_t *)page + address;
    137 }
    138 fix:
    139     __handle_bad_pmd(pmd);
    140     return NULL;
    141 }

The given address is first converted into its index within the page table. We assume here that the directory entry pointed to by pmd is empty, so control goes to the label getnew to allocate a page table. A page table occupies exactly one physical page.

The kernel optimizes page table allocation. When a page table is released, it is first kept in a pool rather than having its physical page freed; the physical page is really freed only when the pool is full. When a page table is needed, the pool is checked first: that is get_pte_fast(). If the pool is empty, the page table has to be allocated through get_pte_slow(). Why "slow"? Because allocating a physical page can sometimes take a long time: physical memory may be exhausted, and pages currently in use may first have to be swapped out to disk.

Once a page table has been obtained, set_pmd() writes its starting address, together with some attribute flags, into the middle directory entry pmd, which on the i386 is actually the page directory entry pgd. The "infrastructure" for the mapping is now complete, but the page table entry pte is still empty. What remains is the physical page itself, and that is handled by handle_pte_fault(), defined in mm/memory.c:

[do_page_fault() > handle_mm_fault() > handle_pte_fault()]

    1135 /*
    1136  * These routines also need to handle stuff like marking pages dirty
    1137  * and/or accessed for architectures that don't do it in hardware (most
    1138  * RISC architectures).  The early dirtying is also good on the i386.
    1139  *
    1140  * There is also a hook called "update_mmu_cache()" that architectures
    1141  * with external mmu caches can use to update those (ie the Sparc or
    1142  * PowerPC hashed page tables that act as extended TLBs).
    1143  *
    1144  * Note the "page_table_lock". It is to protect against kswapd removing
    1145  * pages from under us.
Note that kswapd only ever removes pages, never
    1146  * adds them. As such, once we have noticed that the page is not present,
    1147  * we can drop the lock early.
    1148  *
    1149  * The adding of pages is protected by the MM semaphore (which we hold),
    1150  * so we don't need to worry about a page being suddenly been added into
    1151  * our VM.
    1152  */
    1153 static inline int handle_pte_fault(struct mm_struct *mm,
    1154     struct vm_area_struct * vma, unsigned long address,
    1155     int write_access, pte_t * pte)
    1156 {
    1157     pte_t entry;
    1158
    1159     /*
    1160      * We need the page table lock to synchronize with kswapd
    1161      * and the SMP-safe atomic PTE updates.
    1162      */
    1163     spin_lock(&mm->page_table_lock);
    1164     entry = *pte;
    1165     if (!pte_present(entry)) {
    1166         /*
    1167          * If it truly wasn't present, we know that kswapd
    1168          * and the PTE updates will not touch it later. So
    1169          * drop the lock.
    1170          */
    1171         spin_unlock(&mm->page_table_lock);
    1172         if (pte_none(entry))
    1173             return do_no_page(mm, vma, address, write_access, pte);
    1174         return do_swap_page(mm, vma, address, pte, pte_to_swp_entry(entry), write_access);
    1175     }
    1176
    1177     if (write_access) {
    1178         if (!pte_write(entry))
    1179             return do_wp_page(mm, vma, address, pte, entry);
    1180
    1181         entry = pte_mkdirty(entry);
    1182     }
    1183     entry = pte_mkyoung(entry);
    1184     establish_pte(vma, address, pte, entry);
    1185     spin_unlock(&mm->page_table_lock);
    1186     return 1;
    1187 }

In our scenario, whether the page table was newly allocated or already existed, the page table entry in question is certainly empty. The condition of the first if statement is therefore satisfied: pte_present() tests whether the page mapped by an entry is in memory, and our physical page has not even been allocated. The condition tested by pte_none() is satisfied as well, since it tests whether the entry is empty. So do_no_page() is called (otherwise it would be do_swap_page()). Incidentally, if pte_present() finds that the mapped page is indeed in memory, the problem must lie in the access permissions.

The function do_no_page() is also defined in mm/memory.c. Here we look only briefly at the part relevant to this scenario. As mentioned earlier, the region structure vm_area_struct has a pointer vm_ops to a vm_operations_struct. That structure is in effect a jump table, and the function pointers in it usually relate to file operations.
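The branching in handle_pte_fault() reduces to a three-way test on the page table entry plus the kind of access. The sketch below restates it as a pure classification function; classify_pte_fault and the action enumeration are hypothetical names, and the two flag bits are defined locally and assumed to sit in bits 0 and 1 of the entry, as they do on the i386.

```c
#include <assert.h>

/* Flag bits of a page table entry, defined locally for this sketch. */
#define PTE_PRESENT 0x001UL
#define PTE_RW      0x002UL

enum action {
    DO_NO_PAGE,       /* entry empty: no mapping was ever established */
    DO_SWAP_PAGE,     /* entry non-empty but page not in memory */
    DO_WP_PAGE,       /* page present but write-protected, and a write */
    MARK_AND_RETURN   /* page present and access allowed: update flags */
};

/* Choose the handler for a fault, given the entry and the access type. */
static enum action classify_pte_fault(unsigned long pte, int write_access)
{
    if (!(pte & PTE_PRESENT)) {
        if (pte == 0)
            return DO_NO_PAGE;
        return DO_SWAP_PAGE;
    }
    if (write_access && !(pte & PTE_RW))
        return DO_WP_PAGE;
    return MARK_AND_RETURN;
}
```

In the stack-extension scenario the entry is zero, so the result is DO_NO_PAGE regardless of the access type.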
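One of these function pointers is used for allocating physical pages. Why should the allocation of physical pages have anything to do with file operations? Because it matters for file sharing. When several processes map the same file into their virtual spaces, memory usually needs to hold only one copy of each physical page. A separate copy has to be made only when a process needs to write to the file; this is called "copy on write", or COW. COW is described in more detail in the chapter on processes, in connection with fork(). Thus, once mmap() has tied a virtual region to an open file (which may be a device), calls to these functions can turn memory operations into file operations, or perform whatever additional file operations are needed. Swapping physical pages to disk is obviously related to file operations as well. It is therefore often necessary to designate specific operations for a specific region in advance.

If an operation for allocating physical pages has been designated for a region vma, it is vma->vm_ops->nopage(). However, vma->vm_ops and vma->vm_ops->nopage may both be NULL, which means that no nopage() operation has been designated, or that no vm_operations_struct has been attached at all. When there is no designated nopage() operation, the kernel calls do_anonymous_page() to allocate the physical page. The beginning of do_no_page() reads:

[do_page_fault() > handle_mm_fault() > handle_pte_fault() > do_no_page()]

    1080 /*
    1081  * do_no_page() tries to create a new page mapping. It aggressively
    1082  * tries to share with existing pages, but makes a separate copy if
    1083  * the "write_access" parameter is true in order to avoid the next
    1084  * page fault.
    1085  *
    1086  * As this is called only for pages that do not currently exist, we
    1087  * do not need to flush old virtual caches or the TLB.
    1088  *
    1089  * This is called with the MM semaphore held.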
    1090  */
    1091 static int do_no_page(struct mm_struct * mm, struct vm_area_struct * vma,
    1092     unsigned long address, int write_access, pte_t *page_table)
    1093 {
    1094     struct page * new_page;
    1095     pte_t entry;
    1096
    1097     if (!vma->vm_ops || !vma->vm_ops->nopage)
    1098         return do_anonymous_page(mm, vma, page_table, write_access, address);
        ......
    1133 }

In our scenario the region concerned is used for the stack and has nothing to do with the file system or page sharing. Nobody will have designated a nopage() operation for it, so do_anonymous_page() is entered:

[do_page_fault() > handle_mm_fault() > handle_pte_fault() > do_no_page() > do_anonymous_page()]

    1058 /*
    1059  * This only needs the MM semaphore
    1060  */
    1061 static int do_anonymous_page(struct mm_struct * mm, struct vm_area_struct * vma, pte_t *page_table, int write_access, unsigned long addr)
    1062 {
    1063     struct page *page = NULL;
    1064     pte_t entry = pte_wrprotect(mk_pte(ZERO_PAGE(addr), vma->vm_page_prot));
    1065     if (write_access) {
    1066         page = alloc_page(GFP_HIGHUSER);
    1067         if (!page)
    1068             return -1;
    1069         clear_user_highpage(page, addr);
    1070         entry = pte_mkwrite(pte_mkdirty(mk_pte(page, vma->vm_page_prot)));
    1071         mm->rss++;
    1072         flush_page_to_ram(page);
    1073     }
    1074     set_pte(page_table, entry);
    1075     /* No need to invalidate - it was non-present before */
    1076     update_mmu_cache(vma, addr, entry);
    1077     return 1;    /* Minor fault */
    1078 }

Notice first that if the fault was caused by a read, the entry built by mk_pte() is adjusted by pte_wrprotect(); if it was caused by a write (write_access is non-zero), it is adjusted by pte_mkwrite(). What is the difference? See include/asm-i386/pgtable.h:

    277 static inline pte_t pte_wrprotect(pte_t pte)    { (pte).pte_low &= ~_PAGE_RW; return pte; }
    270 static inline int pte_write(pte_t pte)          { return (pte).pte_low & _PAGE_RW; }

pte_wrprotect() clears the _PAGE_RW flag, so the physical page can only be read. pte_write(), shown here, merely tests that flag; its counterpart pte_mkwrite() sets the flag to 1, making the page writable.

Furthermore, for a read the physical page mapped is always ZERO_PAGE, which is defined in include/asm-i386/pgtable.h:

    91 /*
    92  * ZERO_PAGE is a global shared page that is always zero: used
    93  * for zero-mapped memory areas etc..
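  94   */
  95  extern unsigned long empty_zero_page[1024];
  96  #define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))

In other words, for a read access the page is mapped, read-only, to the same physical page empty_zero_page whatever the virtual address is. That page is all zeros, hence its name, and a read simply returns zeros. Only for a write access is a separate physical page obtained through alloc_page(). In our scenario the fault was caused by a write (pushing a return address onto the stack), so alloc_page() allocates a physical page, a page table entry is built from it together with the area's vm_page_prot (line 1070), and set_pte() stores the entry in the page table slot that page_table points to. With that, the downward extension of the stack area is complete. update_mmu_cache() is an empty function on the i386 CPU (see include/asm-i386/pgtable.h), because the i386 MMU is part of the CPU itself and there is no separate MMU.

The stack area has now been extended. Control returns level by level, each level returning 1, until it is back in do_page_fault(). There, some special handling remains for VM86 mode and the VGA frame buffer, which is not relevant to our scenario (arch/i386/mm/fault.c):

[do_page_fault()]

 209  /*
 210   * Did it hit the DOS screen memory VA from vm86 mode?
 211   */
 212  if (regs->eflags & VM_MASK) {
 213      unsigned long bit = (address - 0xA0000) >> PAGE_SHIFT;
 214      if (bit < 32)
 215          tsk->thread.screen_bitmap |= 1 << bit;
 216  }
 217  up(&mm->mmap_sem);
 218  return;

Finally, one peculiarity of exception handling deserves mention. When the CPU returns from a page fault to user space, it first re-executes the instruction that was aborted by the failed mapping and only then continues. On an interrupt or a trap, the CPU pushes the address of the next instruction as the return address. An exception is different: the CPU pushes the address of the instruction that could not complete (division by zero, failed mapping and so on), not that of the next one, so that the aborted instruction is executed again in full on return. This is implemented in the CPU's internal circuitry and needs no software intervention. In this sense "page fault interrupt" is a misnomer; "page fault exception" is the right term. In our scenario a subroutine call had to push its return address beyond the mapped area, and at that time nothing was pushed at all (the stack pointer %esp pointed at the start of the stack area, see Figure 2.6). Now that the handler has returned, the instruction is executed again and succeeds.

2.6 Use and Turnover of Physical Pages

Apart from the CPU, memory is the most important resource of a computer (peripherals aside). Much of the design of an operating system such as Linux revolves around how memory is used and managed.

Some terminology first. A "virtual page" is a fixed-size region of the virtual address space, aligned to the page size (4KB), together with its contents. A virtual page must eventually be backed by, that is mapped onto, some physical storage medium, and that is a "physical page". Depending on the medium, a physical page may be in memory or on disk.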
To tell the two cases apart, this book calls the former a "(physical) memory page" and the latter an "on-disk (physical) page". Moreover, "allocating" and "releasing" a physical page refers only to the physical medium, whereas "allocating" and "releasing" a virtual page refers to the virtual address range, and sometimes to the combination of the two. Readers should keep these meanings apart. The term "swap" is also historical: early systems swapped whole processes between memory and disk, while today the unit is the page, yet the old word survives alongside "paging".

The user space of every process is 3GB of virtual memory. Virtual pages that have been "allocated" are generally sparse within that space, and only the ones actually mapped consume physical pages. Physical memory pages are the scarcer resource, so the system must manage both the memory pages and the on-disk pages that back them.

During system initialization the kernel creates one page structure for every physical memory page it has detected, forming an array of page structures that the global pointer mem_map refers to. (The page structure plays a role for a physical page similar to that of task_struct for a process.) The pages are then combined into physically contiguous blocks and grouped into "zones", each of which keeps queues of free blocks for allocation, as described earlier.

Similarly, a swap device, such as a swap area on disk or an ordinary file used for the purpose, needs a data structure in memory to describe and manage it. The kernel defines the swap_info_struct structure for this, in include/linux/swap.h:

  49  struct swap_info_struct {
  50      unsigned int flags;
  51      kdev_t swap_device;
  52      spinlock_t sdev_lock;
  53      struct dentry * swap_file;
  54      struct vfsmount *swap_vfsmnt;
  55      unsigned short * swap_map;
  56      unsigned int lowest_bit;
  57      unsigned int highest_bit;
  58      unsigned int cluster_next;
  59      unsigned int cluster_nr;
  60      int prio;           /* swap priority */
  61      int pages;
  62      unsigned long max;
  63      int next;           /* next entry on swap list */
  64  };

The pointer swap_map refers to an array in which every unsigned short element represents one physical page on the disk (or in the file), and the index of the element is the position of that page on the disk or in the file.
The size of the array is given by pages, the number of pages in the swap device or file. The first page on the device, the one corresponding to swap_map[0], is never used for swapping: it holds information about the device or file itself together with a bitmap showing which pages are usable. Because the usable pages of a device are not necessarily contiguous, the fields lowest_bit and highest_bit record where the range of pages available for allocation begins and ends, and max is the largest page number of the device, that is, its physical size.

Since disk access is organized around tracks, on-disk pages are best allocated in clusters along the track rather than at random; the fields cluster_next and cluster_nr exist for this purpose. Linux allows several swap devices (or files) to be in use at once, so the kernel keeps an array swap_info of swap_info_struct structures, defined in mm/swapfile.c:

  25  struct swap_info_struct swap_info[MAX_SWAPFILES];

In addition there is a queue swap_list, which links the swap_info_struct structures that have been allocated and put into use, ordered by priority:

  23  struct swap_list_t swap_list = {-1, -1};

The swap_list_t structure is defined in include/linux/swap.h:

 153  struct swap_list_t {
 154      int head;    /* head of priority-ordered swapfile list */
 155      int next;    /* swapfile to be used next */
 156  };

The queue starts out empty, so head and next are both -1. When the system call swap_on() designates a file for swapping, the corresponding swap_info_struct is linked into the queue.

Just as pte_t describes a page in memory, the kernel has the swp_entry_t structure to identify a page on a swap device. It is defined in include/linux/shmem_fs.h:

   8  /*
   9   * A swap entry has to fit into a "unsigned long", as
  10   * the entry is hidden in the "index" field of the
  11   * swapper address space.
  12   *
  13   * We have to move it here, since not every user of fs.h is including
  14   * mm.h, but mm.h is including fs.h via sched.h :-/
  15   */
  16  typedef struct {
  17      unsigned long val;
  18  } swp_entry_t;

So a swp_entry_t is really just a 32-bit unsigned integer. That integer, however, is divided into three parts:

[Figure 2.7: layout of a swap entry. offset occupies bits 31..8, type occupies bits 7..1, and bit 0 is 0.]

The fields type and offset are accessed with the following macros, defined next to those for pte_t in include/asm-i386/pgtable.h:

 336  /* Encode and de-code a swap entry */
 337  #define SWP_TYPE(x)              (((x).val >> 1) & 0x3f)
 338  #define SWP_OFFSET(x)            ((x).val >> 8)
 339  #define SWP_ENTRY(type, offset)  ((swp_entry_t) { ((type) << 1) | ((offset) << 8) })
 340  #define pte_to_swp_entry(pte)    ((swp_entry_t) { (pte).pte_low })
 341  #define swp_entry_to_pte(x)      ((pte_t) { (x).val })

Here offset is the position of the page within a swap device or file, that is, its logical page number in the file, and type says which file the page is in. The name type is misleading: it has nothing to do with a "type" and is simply the sequence number of the swap file. The field is 7 bits wide, which would allow values up to 127, although SWP_TYPE() as quoted masks with 0x3f and so extracts only 6 bits.

Why is type shifted left by one bit? This follows from the layout of pte_t. A pte_t is also a 32-bit integer. When the page is in memory, the top 20 bits hold the page frame address (pages are 4KB, so the low 12 bits of the address are always 0) and the lowest bit, the P flag, is 1. A swp_entry_t has the same size as a pte_t, and when a page is swapped out its page table entry is replaced by the corresponding swp_entry_t. Bit 0 must then be 0 to say that the page is not in memory. When the P flag is 0 the MMU in the CPU ignores all other bits of the entry and raises a page fault, so the remaining bits are free for software to use. When a page is in memory its page table entry is a pte_t; when it is on disk the entry is a swp_entry_t, from which SWP_TYPE(entry) and SWP_OFFSET(entry) yield the swap file and the position within it.

On-disk pages are released by the function __swap_free(), in mm/swapfile.c. Its counterpart for allocation is __get_swap_page(), which we will meet later. The code of __swap_free() begins as follows:

 141  /*
 142   * Caller has made sure that the swapdevice corresponding to entry
 143   * is still around or has not been recycled.
 144   */
 145  void __swap_free(swp_entry_t entry, unsigned short count)
 146  {
 147      struct swap_info_struct * p;
 148      unsigned long offset, type;
 149
 150      if (!entry.val)
 151          goto out;
 152
 153      type = SWP_TYPE(entry);
 154      if (type >= nr_swapfiles)
 155          goto bad_nofile;
 156      p = & swap_info[type];
 157      if (!(p->flags & SWP_USED))
 158          goto bad_device;

If entry.val is 0 there is nothing to do, because page 0 of a swap device or file is never used for swapping. As explained above, SWP_TYPE() does not return a type at all but the sequence number of the swap file, which is the index of its swap_info_struct in the array swap_info[]. Line 156 uses that index to obtain a pointer to the swap_info_struct of the file in question.
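The encoding macros quoted in this section can be tried out in user space. The following is a minimal sketch that copies the three i386 macros as they appear in the listing; the helper make_entry() is a hypothetical name introduced only for illustration, and the check it performs (bit 0 stays clear) is the "page not present" convention described in the text.

```c
#include <assert.h>

/* User-space copy of the 2.4.0 i386 swap-entry layout quoted above:
 * bit 0 is 0 (not present), type starts at bit 1, offset at bit 8. */
typedef struct {
    unsigned long val;
} swp_entry_t;

#define SWP_TYPE(x)              (((x).val >> 1) & 0x3f)
#define SWP_OFFSET(x)            ((x).val >> 8)
#define SWP_ENTRY(type, offset)  ((swp_entry_t) { ((type) << 1) | ((offset) << 8) })

/* Hypothetical helper (not a kernel function): build an entry and
 * verify that the P bit position is left clear. */
static swp_entry_t make_entry(unsigned long type, unsigned long offset)
{
    swp_entry_t e = SWP_ENTRY(type, offset);

    assert((e.val & 1UL) == 0);
    return e;
}
```

For instance, make_entry(3, 0x1234) yields an entry from which SWP_TYPE() recovers 3 and SWP_OFFSET() recovers 0x1234.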
Let us read on:

 159      offset = SWP_OFFSET(entry);
 160      if (offset >= p->max)
 161          goto bad_offset;
 162      if (!p->swap_map[offset])
 163          goto bad_free;
 164      swap_list_lock();
 165      if (p->prio > swap_info[swap_list.next].prio)
 166          swap_list.next = type;
 167      swap_device_lock(p);
 168      if (p->swap_map[offset] < SWAP_MAP_MAX) {
 169          if (p->swap_map[offset] < count)
 170              goto bad_count;
 171          if (!(p->swap_map[offset] -= count)) {
 172              if (offset < p->lowest_bit)
 173                  p->lowest_bit = offset;
 174              if (offset > p->highest_bit)
 175                  p->highest_bit = offset;
 176              nr_swap_pages++;
 177          }
 178      }
 179      swap_device_unlock(p);
 180      swap_list_unlock();
 181  out:
 182      return;

Here offset is the position of the page within the file, and p->swap_map[offset] is the allocation (use) count of that page; a count of 0 means the page has not been allocated. The count must also stay below SWAP_MAP_MAX. The parameter count says how many users are giving up the page, so it is subtracted from the use count. When the result reaches 0 the page is truly free: lowest_bit and highest_bit are adjusted if the page lies outside the current range, and nr_swap_pages, the number of on-disk pages available for allocation, is incremented. Note that releasing an on-disk page involves no disk operation at all; it is pure bookkeeping in memory and therefore cheap.

With these structures in mind we can look at the turnover of memory pages. A free memory page has a count of 0 in its page structure; when the page is allocated the count is set to 1 (rmqueue() calls set_page_count() for this). "Turnover" has two meanings. The first is the allocation, use and reclamation of pages, which does not necessarily involve swapping. The second is swapping proper, whose ultimate purpose is also the reclamation of pages. Only pages mapped into user space are ever swapped out; pages used by the kernel are not. By content and nature, user-space pages fall into the following kinds:

  - Ordinary user-space pages: the code, data and stack segments of a process and its dynamically allocated heap.
  - The contents of open files mapped into user space through the system call mmap().
  - Shared memory areas between processes.

These pages involve allocation, use and reclamation as well as swapping out and in. Pages mapped into system space are never swapped out, but they can still be divided roughly by how they are used and turned over. To begin with, the pages occupied by kernel code and kernel global variables are neither allocated nor released; this part of the space is static.
(By contrast, the code and global variables of a process live in user space, and the pages they occupy are dynamic: they are allocated before use, are eventually released, and may be swapped out and reallocated in between.)

The remaining pages used by the kernel are allocated dynamically but always stay in memory. They fall into two classes. The first class holds nothing worth keeping once it has been used, so it can be released at once:

  - Data structures allocated with kmalloc() or vmalloc() for temporary or administrative use, such as vm_area_struct structures. Since one page usually holds several structures of the same kind, the page can only be released when all of them are free.
  - Pages allocated with alloc_pages() for temporary or administrative use, such as the two pages holding the system stack of each process, or pages used when copying parameters from user space.

The second class consists of pages whose contents remain worth keeping even after they are no longer in use. Releasing them immediately would be wasteful, because the same contents may be needed again shortly. Such pages are kept in an LRU queue for a while and aged; if they are used again during that time they are revived, otherwise they are finally released. This class includes:

  - Space used in file system operations to cache the dentry structures of open files.
  - Space used in file system operations to cache the inode structures of open files.
  - Buffers for file system read and write operations.

In principle a page would be swapped out only when the system runs short of memory and another page is needed. Doing all the work at that moment, however, would make the system stall exactly when it is busiest. A better approach is to swap pages out periodically, preferably when the system is idle, picking pages that have not been accessed recently, so that a pool of free pages is always at hand. Doing so blindly could cause "thrashing": a page that has just been written out is accessed again and must be read back in.

To avoid this, the act of breaking a page's mapping is separated from releasing the page. When a page is chosen for swap-out, only its page table mapping is broken at first, while the contents stay in memory and the page structure is moved into an "inactive" queue. If the page is accessed again soon, the resulting page fault merely has to restore the mapping, with no disk read. Writing the contents to the swap device is likewise separated from freeing the page: a page whose contents have been modified is "dirty" and must be written out before it becomes "clean", and a clean page can still be kept in a cache queue until the memory is genuinely needed for something else. The same idea is applied when a page is swapped in: the on-disk copy is not released immediately, so an unmodified page need not be written again when it is next swapped out.
A physical memory page therefore passes through the following states during its turnover:

  (1) Free. The page structure is linked through its list member into the free_area queue of some zone. The use count count is 0.
  (2) Allocated. A page is taken from a free queue by __alloc_pages() or __get_free_page(). Its count is set to 1 and its list member is no longer linked into any queue.
  (3) Active. The page structure is linked through its lru member into the active queue active_list, and at least one process has a (user-space) page table entry pointing to the page. Every time a mapping to the page is established or restored, count is incremented by 1.
  (4) Inactive, dirty. The page structure is linked through lru into the queue inactive_dirty_list, and no page table entry of any process points to the page. Every time a mapping is broken, count is decremented by 1.
  (5) The contents of an inactive dirty page are written to the swap device, and the page structure is moved from inactive_dirty_list to an inactive clean queue.
  (6) Inactive, clean. The page structure is linked through lru into an inactive clean queue; every zone has one, called inactive_clean_list.
  (7) If the page is accessed within some time after becoming inactive, it returns to the active state and its mapping is restored.
  (8) When needed, pages are reclaimed from the clean queues and either returned to a free queue or allocated again directly.

Of course the states matter only for pages involved in swapping. To manage them the kernel keeps two global LRU queues, active_list and inactive_dirty_list, and in addition each zone has its own inactive_clean_list. Which LRU queue a page structure is on says what state the page is in. Swappable pages are also tied to an address_space structure named swapper_space, whose queues hold the page structures of all such pages, and to speed up lookups there is a hash table page_hash_table.

Let us look at how the kernel links a page structure into these queues. The function add_to_swap_cache() does so, in mm/swap_state.c:

  54  void add_to_swap_cache(struct page *page, swp_entry_t entry)
  55  {
  56      unsigned long flags;
  57
  58  #ifdef SWAP_CACHE_INFO
  59      swap_cache_add_total++;
  60  #endif
  61      if (!PageLocked(page))
  62          BUG();
  63      if (PageTestandSetSwapCache(page))
  64          BUG();
  65      if (page->mapping)
  66          BUG();
  67      flags = page->flags & ~((1 << PG_error) | (1 << PG_arch_1));
  68      page->flags = flags | (1 << PG_uptodate);
  69      add_to_page_cache_locked(page, &swapper_space, entry.val);
  70  }

On entry the page must already be locked, its PG_swap_cache flag must have been 0, and its mapping pointer must be 0. Because the contents of the page have just been read in or are about to be written out, they agree with the disk, so the PG_uptodate flag is set to 1. The function add_to_page_cache_locked() is in mm/filemap.c:

 476  /*
 477   * Add a page to the inode page cache.
 478   *
 479   * The caller must have locked the page and
 480   * set all the page flags correctly.
 481   */
 482  void add_to_page_cache_locked(struct page * page, struct address_space *mapping, unsigned long index)
 483  {
 484      if (!PageLocked(page))
 485          BUG();
 486
 487      page_cache_get(page);
 488      spin_lock(&pagecache_lock);
 489      page->index = index;
 490      add_page_to_inode_queue(mapping, page);
 491      add_page_to_hash_queue(page, page_hash(mapping, index));
 492      lru_cache_add(page);
 493      spin_unlock(&pagecache_lock);
 494  }

The parameter mapping is a pointer to an address_space structure, here &swapper_space. The structure is defined in include/linux/fs.h:

 365  struct address_space {
 366      struct list_head    clean_pages;     /* list of clean pages */
 367      struct list_head    dirty_pages;     /* list of dirty pages */
 368      struct list_head    locked_pages;    /* list of locked pages */
 369      unsigned long       nrpages;         /* number of total pages */
 370      struct address_space_operations *a_ops;  /* methods */
 371      struct inode        *host;           /* owner: inode, block_device */
 372      struct vm_area_struct *i_mmap;       /* list of private mappings */
 373      struct vm_area_struct *i_mmap_shared; /* list of shared mappings */
 374      spinlock_t          i_shared_lock;   /* and spinlock protecting it */
 375  };

The structure has three queue heads, for "clean" pages, "dirty" pages and pages that are locked in memory for the time being. The structure swapper_space itself is defined in mm/swap_state.c:

  31  struct address_space swapper_space = {
  32      LIST_HEAD_INIT(swapper_space.clean_pages),
  33      LIST_HEAD_INIT(swapper_space.dirty_pages),
  34      LIST_HEAD_INIT(swapper_space.locked_pages),
  35      0,              /* nrpages */
  36      &swap_aops,
  37  };

The last field initialized is the pointer a_ops, which is set to swap_aops, a structure holding the function pointers for the various swap operations.

As add_to_page_cache_locked() shows, the page structure is linked into three queues: through its list member into a queue of swapper_space, through the pointers next_hash and pprev_hash into a hash queue, and through its lru member into the LRU queue active_list. The call page_cache_get() is defined in pagemap.h as get_page(page), which simply increments page->count. The two definitions are in include/linux/mm.h and include/linux/pagemap.h respectively:

 150  #define get_page(p)          atomic_inc(&(p)->count)

  31  #define page_cache_get(x)    get_page(x)

The page structure is first added by add_page_to_inode_queue() to the clean_pages queue of swapper_space; the code is in include/linux/pagemap.h:

  72  static inline void add_page_to_inode_queue(struct address_space *mapping, struct page * page)
  73  {
  74      struct list_head *head = &mapping->clean_pages;
  75
  76      mapping->nrpages++;
  77      list_add(&page->list, head);
  78      page->mapping = mapping;
  79  }

The page goes into the clean_pages queue of swapper_space because a page that has just been swapped in agrees with its on-disk copy and so is "clean". Why is the function called add_page_to_inode_queue()? Because page caching is used not only for swapping but also for buffering file reads and writes. An address_space structure is normally embedded in the inode structure of a file as the member i_data, so in the usual case the queue really does belong to an inode. The structure swapper_space is the one special case that stands alone. Next, add_page_to_hash_queue() links the page into the appropriate hash queue; its code is in mm/filemap.c:
  58  static void add_page_to_hash_queue(struct page * page, struct page **p)
  59  {
  60      struct page *next = *p;
  61
  62      *p = page;
  63      page->next_hash = next;
  64      page->pprev_hash = p;
  65      if (next)
  66          next->pprev_hash = &page->next_hash;
  67      if (page->buffers)
  68          PAGE_BUG(page);
  69      atomic_inc(&page_cache_size);
  70  }

The hash queue into which the page is linked is determined by the following macro:

      #define page_hash(mapping,index) (page_hash_table+_page_hashfn(mapping,index))

Finally lru_cache_add() puts the page structure on the LRU queue active_list; its code is in mm/swap.c:

 226  /**
 227   * lru_cache_add: add a page to the page lists
 228   * @page: the page to add
 229   */
 230  void lru_cache_add(struct page * page)
 231  {
 232      spin_lock(&pagemap_lru_lock);
 233      if (!PageLocked(page))
 234          BUG();
 235      DEBUG_ADD_PAGE
 236      add_page_to_active_list(page);
 237      /* This should be relatively rare */
 238      if (!page->age)
 239          deactivate_page_nolock(page);
 240      spin_unlock(&pagemap_lru_lock);
 241  }

Here add_page_to_active_list() is a macro, defined in include/linux/swap.h:

 209  #define add_page_to_active_list(page) { \
 210      DEBUG_ADD_PAGE \
 211      ZERO_PAGE_BUG \
 212      SetPageActive(page); \
 213      list_add(&(page)->lru, &active_list); \
 214      nr_active_pages++; \
 215  }

The page structure is linked into the LRU queue through its lru member. The flags PG_active, PG_inactive_dirty and PG_inactive_clean in the page structure record which LRU queue the page is currently on.

The code that performs the actual swapping out and in is examined in later sections. To let the system administrator put swap devices or files into service and take them out again, Linux provides the two system calls swapon() and swapoff():

      swapon(const char *path, int swapflags)
      swapoff(const char *path)

The first designates a device or file for swapping; the second stops using it. The medium need not be a disk in principle, but not every storage device is suitable. Flash memory, increasingly common in embedded systems, is a case in point: a write must be preceded by a slow erase, and the number of erase cycles a device tolerates is limited, so flash memory is a poor choice as a swap device even though it can hold a file system.
In practice, when a Linux system is initialized the swap devices are usually enabled from the start-up scripts under /etc/rc.d, either through the system call swapon() or by running the command swapon. On systems without a suitable swap device the call is simply never made, and the swapping machinery stays idle although it is present in the kernel.

2.7 Allocation of Physical Pages

When a process needs a memory page, for whatever purpose, a physical page has to be allocated. Some requests, DMA for example, need several physically contiguous pages, because a DMA transfer bypasses the address mapping. All such allocation goes through alloc_pages(). The Linux 2.4.0 source contains two versions of alloc_pages(), one in mm/numa.c and the other in mm/page_alloc.c; which one is compiled depends on the conditional-compilation option CONFIG_DISCONTIGMEM. Why two? Because memory on a non-uniform (NUMA) system is organized per node, whereas a uniform system has a single contiguous memory. Let us first look at the alloc_pages() used for NUMA structures, in mm/numa.c:

  43  #ifdef CONFIG_DISCONTIGMEM
      ......
  91  /*
  92   * This can be refined. Currently, tries to do round robin, instead
  93   * should do concentratic circle search, starting from current node.
  94   */
  95  struct page * alloc_pages(int gfp_mask, unsigned long order)
  96  {
  97      struct page *ret = 0;
  98      pg_data_t *start, *temp;
  99  #ifndef CONFIG_NUMA
 100      unsigned long flags;
 101      static pg_data_t *next = 0;
 102  #endif
 103
 104      if (order >= MAX_ORDER)
 105          return NULL;
 106  #ifdef CONFIG_NUMA
 107      temp = NODE_DATA(numa_node_id());
 108  #else
 109      spin_lock_irqsave(&node_lock, flags);
 110      if (!next) next = pgdat_list;
 111      temp = next;
 112      next = next->node_next;
 113      spin_unlock_irqrestore(&node_lock, flags);
 114  #endif
 115      start = temp;
 116      while (temp) {
 117          if ((ret = alloc_pages_pgdat(temp, gfp_mask, order)))
 118              return(ret);
 119          temp = temp->node_next;
 120      }
 121      temp = pgdat_list;
 122      while (temp != start) {
 123          if ((ret = alloc_pages_pgdat(temp, gfp_mask, order)))
 124              return(ret);
 125          temp = temp->node_next;
 126      }
 127      return(0);
 128  }

Note that the condition under which this function is compiled is CONFIG_DISCONTIGMEM, "discontiguous memory", not CONFIG_NUMA. Physical memory with holes in its address range can be seen as a generalized form of NUMA: the pieces of memory are equally fast, but because of the holes they cannot be managed as one uniform array and have to be handled node by node, just as on a true NUMA machine.
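The two while loops in the NUMA version of alloc_pages() quoted above amount to a wrap-around search over a singly linked list of nodes. A minimal user-space sketch of that search follows. Everything in it is an assumption made for illustration: struct node is a hypothetical stand-in for pg_data_t, and try_node() stands in for alloc_pages_pgdat() by simply checking a free-page counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for pg_data_t: one memory node on a list. */
struct node {
    int id;
    unsigned long free_pages;
    struct node *node_next;
};

/* Stand-in for alloc_pages_pgdat(): succeed if the node has enough pages. */
static int try_node(struct node *n, unsigned long want)
{
    if (n->free_pages < want)
        return 0;
    n->free_pages -= want;
    return 1;
}

/* First pass: from 'start' to the end of the list.
 * Second pass: from the head of the list up to, but not including, 'start'.
 * Returns the id of the node that satisfied the request, or -1. */
static int alloc_from_nodes(struct node *list, struct node *start,
                            unsigned long want)
{
    struct node *temp;

    assert(list != NULL && start != NULL);
    for (temp = start; temp; temp = temp->node_next)
        if (try_node(temp, want))
            return temp->id;
    for (temp = list; temp != start; temp = temp->node_next)
        if (try_node(temp, want))
            return temp->id;
    return -1;
}
```

The second pass only terminates correctly if start is a member of the list, which is the same precondition the kernel code relies on.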
So on a machine with discontiguous memory, even if it is not NUMA, the memory is managed through several pg_data_t structures, one per node.

The function takes two parameters. The first, gfp_mask, is an integer that expresses the allocation strategy to be used. The second, order, expresses the size of the block wanted: 1, 2, 4, ..., 2^order pages.

On a NUMA machine the macros NODE_DATA() and numa_node_id() yield the pg_data_t of the node on which the current CPU sits. On a machine that merely has discontiguous memory, the pg_data_t structures are linked in the list pgdat_list and are tried in rotation so that the load is spread evenly over the nodes.

The allocation itself is attempted in two while loops. The first goes from temp to the end of the list; the second starts again at the head of the list and stops at the node where the first loop began. As soon as one node satisfies the request, the function returns a pointer to the page structure; it returns 0 only if every node has failed. For each node alloc_pages_pgdat() is called, also in mm/numa.c:

  85  static struct page * alloc_pages_pgdat(pg_data_t *pgdat, int gfp_mask,
  86      unsigned long order)
  87  {
  88      return __alloc_pages(pgdat->node_zonelists + gfp_mask, order);
  89  }

As can be seen, gfp_mask serves as an index into the array node_zonelists[] of the given node, which selects the allocation strategy. Compare this with the alloc_pages() of the uniform (UMA) structure. A UMA machine has just one node, contig_page_data, whereas a NUMA machine has several, each of which looks like a UMA machine on its own. The UMA alloc_pages() is in mm/page_alloc.c:

 343  #ifndef CONFIG_DISCONTIGMEM
 344  static inline struct page * alloc_pages(int gfp_mask, unsigned long order)
 345  {
 346      /*
 347       * Gets optimized away by the compiler.
 348       */
 349      if (order >= MAX_ORDER)
 350          return NULL;
 351      return __alloc_pages(contig_page_data.node_zonelists+(gfp_mask), order);
 352  }

In contrast to the NUMA version, this function is compiled only when CONFIG_DISCONTIGMEM is not defined, so exactly one of the two ends up in the kernel.

The actual work is done by __alloc_pages(), in mm/page_alloc.c. We read it in sections:

[alloc_pages() > __alloc_pages()]

 270  /*
 271   * This is the 'heart' of the zoned buddy allocator:
 272   */
 273  struct page * __alloc_pages(zonelist_t *zonelist, unsigned long order)
 274  {
 275      zone_t **zone;
 276      int direct_reclaim = 0;
 277      unsigned int gfp_mask = zonelist->gfp_mask;
 278      struct page * page;
 279
 280      /*
 281       * Allocations put pressure on the VM subsystem.
 282       */
 283      memory_pressure++;
 284
 285      /*
 286       * (If anyone calls gfp from interrupts nonatomically then it
 287       * will sooner or later tripped up by a schedule().)
 288       *
 289       * We are falling back to lower-level zones if allocation
 290       * in a higher zone fails.
 291       */
 292
 293      /*
 294       * Can we take pages directly from the inactive_clean
 295       * list?
 296       */
 297      if (order == 0 && (gfp_mask & __GFP_WAIT) &&
 298              !(current->flags & PF_MEMALLOC))
 299          direct_reclaim = 1;
 300
 301      /*
 302       * If we are about to get low on free pages and we also have
 303       * an inactive page shortage, wake up kswapd.
 304       */
 305      if (inactive_shortage() > inactive_target / 2 && free_shortage())
 306          wakeup_kswapd(0);
 307      /*
 308       * If we are about to get low on free pages and cleaning
 309       * the inactive_dirty pages would fix the situation,
 310       * wake up bdflush.
 311       */
 312      else if (free_shortage() && nr_inactive_dirty_pages > free_shortage()
 313              && nr_inactive_dirty_pages >= freepages.high)
 314          wakeup_bdflush(0);
 315

There are two parameters. The first, zonelist, points to a zonelist_t structure that represents one particular allocation strategy. The second, order, is the same as in alloc_pages(). The global variable memory_pressure measures the pressure on memory pages: it is increased when pages are allocated and decreased when they are returned. The local variable gfp_mask is taken from the strategy structure; it is a set of flag bits that control the allocation.

If a single page is wanted (order is 0), the caller is willing to wait, and the request does not come from a process doing memory management itself, direct_reclaim is set to 1. This says that a page may, if necessary, be reclaimed directly from the inactive "clean" queue of a zone. Pages on those queues still have their contents cached and are merely not in the free queues; since they are generally not contiguous, they are only considered for single-page requests.

Next, when the system is running short of free pages and inactive pages, kswapd is woken up to swap some pages out; when the shortage could be cured by cleaning inactive dirty pages, bdflush is woken up instead. Both are kernel threads that work in the background (see the section on periodic swap-out). The code continues:

[alloc_pages() > __alloc_pages()]

 316  try_again:
 317      /*
 318       * First, see if we have any zones with lots of free memory.
 319       *
 320       * We allocate free memory first because it doesn't contain
 321       * any data ... DUH!
 322       */
 323      zone = zonelist->zones;
 324      for (;;) {
 325          zone_t *z = *(zone++);
 326          if (!z)
 327              break;
 328          if (!z->size)
 329              BUG();
 330
 331          if (z->free_pages >= z->pages_low) {
 332              page = rmqueue(z, order);
 333              if (page)
 334                  return page;
 335          } else if (z->free_pages < z->pages_min &&
 336                  waitqueue_active(&kreclaimd_wait)) {
 337              wake_up_interruptible(&kreclaimd_wait);
 338          }
 339      }
 340

This is a loop over all the zones named by the allocation strategy. If the number of free pages in a zone is at or above its "low watermark", rmqueue() is called to try to allocate from that zone. If the number of free pages in a zone has dropped below its minimum and a process (the kernel thread kreclaimd) is sleeping on the wait queue kreclaimd_wait, that process is woken up so that it can help reclaim some pages for later use. The function rmqueue() tries to allocate a block of pages from one zone; it is in mm/page_alloc.c:

[alloc_pages() > __alloc_pages() > rmqueue()]

 172  static struct page * rmqueue(zone_t *zone, unsigned long order)
 173  {
 174      free_area_t * area = zone->free_area + order;
 175      unsigned long curr_order = order;
 176      struct list_head *head, *curr;
 177      unsigned long flags;
 178      struct page *page;
 179
 180      spin_lock_irqsave(&zone->lock, flags);
 181      do {
 182          head = &area->free_list;
 183          curr = memlist_next(head);
 184
 185          if (curr != head) {
 186              unsigned int index;
 187
 188              page = memlist_entry(curr, struct page, list);
 189              if (BAD_RANGE(zone,page))
 190                  BUG();
 191              memlist_del(curr);
 192              index = (page - mem_map) - zone->offset;
 193              MARK_USED(index, curr_order, area);
 194              zone->free_pages -= 1 << order;
 195
 196              page = expand(zone, page, index, order, curr_order, area);
 197              spin_unlock_irqrestore(&zone->lock, flags);
 198
 199              set_page_count(page, 1);
 200              if (BAD_RANGE(zone,page))
 201                  BUG();
 202              DEBUG_ADD_PAGE
 203              return page;
 204          }
 205          curr_order++;
 206          area++;
 207      } while (curr_order < MAX_ORDER);
 208      spin_unlock_irqrestore(&zone->lock, flags);
 209
 210      return NULL;
 211  }

As its name suggests, the function removes a page structure, standing for a block of pages, from a queue. The operation must not be disturbed: no interrupt may come in the middle, and on a multiprocessor no other CPU may touch the same queues at the same time. That is why the zone is locked with spin_lock_irqsave(), which also disables interrupts.
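The search in rmqueue(), together with the splitting done by its helper expand(), can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: the free lists are small arrays of block start indices rather than linked page structures, MAX_ORDER is shrunk to 4, the bitmap handled by MARK_USED() is omitted, and the names push_block(), model_expand() and model_rmqueue() are hypothetical.

```c
#include <assert.h>

#define MAX_ORDER  4     /* illustrative only; the kernel's value is larger */
#define MAX_BLOCKS 16

/* One free list per order; an entry is the index of a block's first page. */
static long free_list[MAX_ORDER][MAX_BLOCKS];
static int  free_cnt[MAX_ORDER];

static void push_block(int order, long index)
{
    assert(free_cnt[order] < MAX_BLOCKS);
    free_list[order][free_cnt[order]++] = index;
}

/* Split a block found at order 'high' down to order 'low'. At each step
 * the lower half goes back on the free list of the next smaller order
 * and the upper half is carried forward, as in expand(). */
static long model_expand(long index, int low, int high)
{
    long size = 1L << high;

    while (high > low) {
        high--;
        size >>= 1;
        push_block(high, index);   /* lower half stays free */
        index += size;             /* continue with the upper half */
    }
    return index;
}

/* Look for a free block of the wanted order, moving up to larger
 * orders when a list is empty. Returns the first page index or -1. */
static long model_rmqueue(int order)
{
    int curr;

    for (curr = order; curr < MAX_ORDER; curr++) {
        if (free_cnt[curr] > 0) {
            long index = free_list[curr][--free_cnt[curr]];
            return model_expand(index, order, curr);
        }
    }
    return -1;
}
```

Starting from a single free 8-page block at index 0, a request for one page returns page 7 and leaves blocks of 4, 2 and 1 pages (at indices 0, 4 and 6) on the lower-order lists.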
A zone keeps its free blocks in the array zone->free_area, one queue per block size, so zone->free_area + order is the queue for blocks of exactly the size wanted. The main work is done in a do-while loop. It starts with the queue for the wanted size. If that queue is empty, it moves on to the queue for the next larger size, and so on. When a block is found in a larger queue, the part actually needed is handed out and the remainder is broken up into smaller blocks that are linked into the appropriate queues (this is the job of expand(), called at line 196). The macro memlist_entry() at line 188 obtains the page structure pointer from the first element of the queue, and memlist_del() then removes that element. This has nothing to do with the LRU queues discussed earlier. The function expand() is in the same file, mm/page_alloc.c:

[alloc_pages() > __alloc_pages() > rmqueue() > expand()]

 150  static inline struct page * expand (zone_t *zone, struct page *page,
 151      unsigned long index, int low, int high, free_area_t * area)
 152  {
 153      unsigned long size = 1 << high;
 154
 155      while (high > low) {
 156          if (BAD_RANGE(zone,page))
 157              BUG();
 158          area--;
 159          high--;
 160          size >>= 1;
 161          memlist_add_head(&(page)->list, &(area)->free_list);
 162          MARK_USED(index, high, area);
 163          index += size;
 164          page += size;
 165      }
 166      if (BAD_RANGE(zone,page))
 167          BUG();
 168      return page;
 169  }

The parameter low is the order of the block wanted, and high is the order of the queue in which a free block was actually found, that is, curr_order. If the two are equal, the while loop starting at line 155 is skipped. Otherwise the block is halved: its first half is linked into the queue of the next lower order and the bitmap of that queue is updated (lines 158 to 162), and the process then continues with the second half (lines 163 and 164). This is repeated until high equals low, at which point what is left is a block of exactly the size wanted. Given that a free block has been found, rmqueue() therefore always succeeds.

If rmqueue() fails, or if a zone does not have enough free pages to stay at its watermark, the for loop in __alloc_pages() moves on to the next zone of the allocation strategy, until an allocation succeeds or the last zone has failed too (line 327). On success __alloc_pages() returns a page pointer to the first page of the block allocated, whose use count count has been set to 1.

If no zone in the strategy could satisfy the request while keeping its "low watermark", there are two things to try. One is to lower the requirement on the watermark. The other is to take into account the inactive "clean" pages cached in each zone, which can be reclaimed quickly. Let us continue with __alloc_pages() (mm/page_alloc.c):

[alloc_pages() > __alloc_pages()]

 341      /*
 342       * Try to allocate a page from a zone with a HIGH
 343       * amount of free + inactive_clean pages.
 344       *
 345       * If there is a lot of activity, inactive_target
 346       * will be high and we'll have a good chance of
 347       * finding a page using the HIGH limit.
 348       */
 349      page = __alloc_pages_limit(zonelist, order, PAGES_HIGH, direct_reclaim);
 350      if (page)
 351          return page;
 352
 353      /*
 354       * Then try to allocate a page from a zone with more
 355       * than zone->pages_low free + inactive_clean pages.
 356       *
 357       * When the working set is very large and VM activity
 358       * is low, we're most likely to have our allocation
 359       * succeed here.
 360       */
 361      page = __alloc_pages_limit(zonelist, order, PAGES_LOW, direct_reclaim);
 362      if (page)
 363          return page;
 364

Here __alloc_pages_limit() is first called with the parameter PAGES_HIGH; if that does not work, the requirement is lowered and it is called again with PAGES_LOW. The code of __alloc_pages_limit() is also in mm/page_alloc.c:

[alloc_pages() > __alloc_pages() > __alloc_pages_limit()]

 213  #define PAGES_MIN   0
 214  #define PAGES_LOW   1
 215  #define PAGES_HIGH  2
 216
 217  /*
 218   * This function does the dirty work for __alloc_pages
 219   * and is separated out to keep the code size smaller.
 220   * (suggested by Davem at 1:30 AM, typed by Rik at 6 AM)
 221   */
 222  static struct page * __alloc_pages_limit(zonelist_t *zonelist,
 223          unsigned long order, int limit, int direct_reclaim)
 224  {
 225      zone_t **zone = zonelist->zones;
 226
 227      for (;;) {
 228          zone_t *z = *(zone++);
 229          unsigned long water_mark;
 230
 231          if (!z)
 232              break;
 233          if (!z->size)
 234              BUG();
 235
 236          /*
 237           * We allocate if the number of free + inactive_clean
 238           * pages is above the watermark.
 239           */
 240          switch (limit) {
 241              default:
 242              case PAGES_MIN:
 243                  water_mark = z->pages_min;
 244                  break;
 245              case PAGES_LOW:
 246                  water_mark = z->pages_low;
 247                  break;
 248              case PAGES_HIGH:
 249                  water_mark = z->pages_high;
 250          }
 251
 252          if (z->free_pages + z->inactive_clean_pages > water_mark) {
 253              struct page *page = NULL;
 254              /* If possible, reclaim a page directly. */
 255              if (direct_reclaim && z->free_pages < z->pages_min + 8)
 256                  page = reclaim_page(z);
 257              /* If that fails, fall back to rmqueue. */
 258              if (!page)
 259                  page = rmqueue(z, order);
 260              if (page)
 261                  return page;
 262          }
 263      }
 264
 265      /* Found nothing. */
 266      return NULL;
 267  }

Like the for loop in __alloc_pages(), this loop walks through the zones of the allocation strategy. The difference is that the inactive clean pages of a zone are now counted together with its free pages when the total is compared with the watermark. The function reclaim_page() reclaims a page from the inactive_clean_list queue of a zone; its code is in mm/vmscan.c and is discussed together with periodic swap-out. Note that a page is reclaimed directly only when direct_reclaim is non-zero, which implies a single-page request, and only when the zone is nearly out of free pages. In all other cases, or if the reclaim fails, rmqueue() is used as before.

If this does not succeed either, the pages of the zones are really running low. The code of __alloc_pages() continues:

[alloc_pages() > __alloc_pages()]

 365      /*
 366       * OK, none of the zones on our zonelist has lots
 367       * of pages free.
 368       *
 369       * We wake up kswapd, in the hope that kswapd will
 370       * resolve this situation before memory gets tight.
 371       *
 372       * We also yield the CPU, because that:
 373       * - gives kswapd a chance to do something
 374       * - slows down allocations, in particular the
 375       *   allocations from the fast allocator that's
 376       *   causing the problems ...
 377       * - ... which minimises the impact the "bad guys"
 378       *   have on the rest of the system
 379       * - if we don't have __GFP_IO set, kswapd may be
 380       *   able to free some memory we can't free ourselves
 381       */
 382      wakeup_kswapd(0);
 383      if (gfp_mask & __GFP_WAIT) {
 384          __set_current_state(TASK_RUNNING);
 385          current->policy |= SCHED_YIELD;
 386          schedule();
 387      }
 388
 389      /*
 390       * After waking up kswapd, we try to allocate a page
 391       * from any zone which isn't critical yet.
 392       *
 393       * Kswapd should, in most situations, bring the situation
 394       * back to normal in no time.
 395       */
 396      page = __alloc_pages_limit(zonelist, order, PAGES_MIN, direct_reclaim);
 397      if (page)
 398          return page;
 399

First the kernel thread kswapd is woken up so that it starts swapping pages out at once. Then, if the allocation strategy says that the caller would rather wait than fail, the current process yields the CPU for one round of scheduling. This gives kswapd a chance to run immediately, and other processes may also release some pages in the meantime; it also slows down the processes that allocate pages fastest, which reduces their impact on the rest of the system. When the process is scheduled again, or if it is not allowed to wait, __alloc_pages_limit() is called once more, this time with PAGES_MIN. And if that fails too? Then it depends on who is asking for the pages.
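The watermark test at the centre of __alloc_pages_limit() can be isolated in a small user-space sketch. The structure zone_model below is a hypothetical, trimmed-down stand-in for zone_t that keeps only the five counters the test uses, and first_zone_above() is an illustrative name; the sketch reports which zone would be chosen and leaves out the actual reclaim_page() and rmqueue() calls.

```c
#include <assert.h>

enum { PAGES_MIN = 0, PAGES_LOW = 1, PAGES_HIGH = 2 };

/* Hypothetical stand-in for zone_t with only the fields used here. */
struct zone_model {
    unsigned long free_pages;
    unsigned long inactive_clean_pages;
    unsigned long pages_min;
    unsigned long pages_low;
    unsigned long pages_high;
};

/* Map the 'limit' argument to the zone's watermark; unknown values
 * fall back to pages_min, as the default label does in the listing. */
static unsigned long pick_water_mark(const struct zone_model *z, int limit)
{
    switch (limit) {
    case PAGES_HIGH:
        return z->pages_high;
    case PAGES_LOW:
        return z->pages_low;
    default:
        return z->pages_min;
    }
}

/* Index of the first zone whose free + inactive_clean pages exceed the
 * chosen watermark, or -1 if no zone qualifies. */
static int first_zone_above(const struct zone_model *zones, int nzones,
                            int limit)
{
    int i;

    assert(zones != 0 && nzones >= 0);
    for (i = 0; i < nzones; i++) {
        const struct zone_model *z = &zones[i];

        if (z->free_pages + z->inactive_clean_pages >
            pick_water_mark(z, limit))
            return i;
    }
    return -1;
}
```

Calling it three times with PAGES_HIGH, PAGES_LOW and PAGES_MIN reproduces the progressively relaxed attempts that __alloc_pages() makes.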
If the process (or thread) requesting the page is itself a "memory allocation worker" such as kswapd or kreclaimd, its request must be satisfied, because it needs the page precisely in order to free memory. Such processes have the PF_MEMALLOC bit set to 1 in the flags field of their task_struct. We first look at what is done for ordinary processes, that is, those whose PF_MEMALLOC bit is 0.

[alloc_pages() > __alloc_pages()]

400     /*
401      * Damn, we didn't succeed.
402      *
403      * This can be due to 2 reasons:
404      * - we're doing a higher-order allocation
405      *   --> move pages to the free list until we succeed
406      * - we're /really/ tight on memory
407      *   --> wait on the kswapd waitqueue until memory is freed
408      */
409     if (!(current->flags & PF_MEMALLOC)) {
410         /*
411          * Are we dealing with a higher order allocation?
412          *
413          * Move pages from the inactive_clean to the free list
414          * in the hope of creating a large, physically contiguous
415          * piece of free memory.
416          */
417         if (order > 0 && (gfp_mask & __GFP_WAIT)) {
418             zone = zonelist->zones;
419             /* First, clean some dirty pages. */
420             current->flags |= PF_MEMALLOC;
421             page_launder(gfp_mask, 1);
422             current->flags &= ~PF_MEMALLOC;
423             for (;;) {
424                 zone_t *z = *(zone++);
425                 if (!z)
426                     break;
427                 if (!z->size)
428                     continue;
429                 while (z->inactive_clean_pages) {
430                     struct page * page;
431                     /* Move one page to the free list. */
432                     page = reclaim_page(z);
433                     if (!page)
434                         break;
435                     __free_page(page);
436                     /* Try if the allocation succeeds. */
437                     page = rmqueue(z, order);
438                     if (page)
439                         return page;
440                 }
441             }
442         }
443         /*
444          * When we arrive here, we are really tight on memory.
445          *
446          * We wake up kswapd and sleep until kswapd wakes us
447          * up again. After that we loop back to the start.
448          *
449          * We have to do this because something else might eat
450          * the memory kswapd frees for us and we need to be
451          * reliable. Note that we don't loop back for higher
452          * order allocations since it is possible that kswapd
453          * simply cannot free a large enough contiguous area
454          * of memory *ever*.
455          */
456         if ((gfp_mask & (__GFP_WAIT|__GFP_IO)) == (__GFP_WAIT|__GFP_IO)) {
457             wakeup_kswapd(1);
458             memory_pressure++;
459             if (!order)
460                 goto try_again;
461         /*
462          * If __GFP_IO isn't set, we can't wait on kswapd because
463          * kswapd just might need some IO locks /we/ are holding ...
464          *
465          * SUBTLE: The scheduling point above makes sure that
466          * kswapd does get the chance to free memory we can't
467          * free ourselves...
468          */
469         } else if (gfp_mask & __GFP_WAIT) {
470             try_to_free_pages(gfp_mask);
471             memory_pressure++;
472             if (!order)
473                 goto try_again;
474         }
475
476     }
477

There are two possible reasons why an ordinary process has failed up to this point. The first is that a block of several contiguous pages is wanted: the zones may well have enough pages in total, but the free ones are not contiguous, while others sit on the inactive clean lists (counted in inactive_clean_pages). Beyond those there may be inactive dirty pages (counted in inactive_dirty_pages) that would become reclaimable once written out. So page_launder() is called first to "launder" dirty pages into clean ones, and then a for loop walks the zones and, in an inner while loop, reclaims inactive clean pages one at a time. Because __free_page() merges a freed page with its free neighbours into larger blocks, rmqueue() is tried again after each page is released.

Why is the PF_MEMALLOC bit of the current process set to 1 around the call to page_launder()? Because page_launder() may itself need to allocate a few temporary pages and could therefore come back into this code recursively. With PF_MEMALLOC set, the nested call does not enter lines 409 to 476.

The second reason, which also applies when a single page is wanted or when the attempt above has failed, is that memory really is short. If the caller may both wait and perform I/O, it wakes kswapd with wakeup_kswapd(1) and sleeps until kswapd wakes it again. For a single-page request it then jumps back with goto to the label try_again near the start of __alloc_pages(). The other possibility is to call try_to_free_pages() directly; this does essentially the work that kswapd itself would do. Higher-order requests do not go back to try_again, since kswapd may never be able to free a large enough contiguous area.

In all the code executed so far, pages have been allocated through __alloc_pages_limit().
The most permissive of those attempts used PAGES_MIN, so something was still held back: every zone kept a reserve corresponding to its pages_min watermark. These pages are kept for emergencies, and by now the situation has become one where the reserve has to be touched. We continue with the code of __alloc_pages():

[alloc_pages() > __alloc_pages()]

478     /*
479      * Final phase: allocate anything we can!
480      *
481      * Higher order allocations, GFP_ATOMIC allocations and
482      * recursive allocations (PF_MEMALLOC) end up here.
483      *
484      * Only recursive allocations can use the very last pages
485      * in the system, otherwise it would be just too easy to
486      * deadlock the system...
487      */
488     zone = zonelist->zones;
489     for (;;) {
490         zone_t *z = *(zone++);
491         struct page * page = NULL;
492         if (!z)
493             break;
494         if (!z->size)
495             BUG();
496
497         /*
498          * SUBTLE: direct_reclaim is only possible if the task
499          * becomes PF_MEMALLOC while looping above. This will
500          * happen when the OOM killer selects this task for
501          * instant execution...
502          */
503         if (direct_reclaim) {
504             page = reclaim_page(z);
505             if (page)
506                 return page;
507         }
508
509         /* XXX: is pages_min/4 a good amount to reserve for this? */
510         if (z->free_pages < z->pages_min / 4 &&
511                 !(current->flags & PF_MEMALLOC))
512             continue;
513         page = rmqueue(z, order);
514         if (page)
515             return page;
516     }
517
518     /* No luck.. */
519     printk(KERN_ERR "__alloc_pages: %lu-order allocation failed.\n", order);
520     return NULL;
521 }

If even this fails, something is seriously wrong with the system.

To avoid having to search for pages that can be swapped out, and to swap them out, only at the moment a page fault occurs, which is exactly when the CPU is busy, the Linux kernel checks periodically and swaps a number of pages out in advance. This makes room ahead of time and lightens the load at page-fault time. Since page usage cannot be predicted exactly, this does not rule out the case where a page fault finds no free page and a victim has to be found on the spot, but it does make that case less likely. With suitably chosen parameters, such as how often pages are swapped out and how many each time, the case can be made rare in practice. For this purpose the kernel has a "daemon" dedicated to periodic page swap-out: kswapd.
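In principle kswapd is like a process. It has its own process control block, a task_struct, and is scheduled by the kernel like any other process. Because it is scheduled as a process, it can be made to run when the system is relatively idle. Compared with an ordinary process, though, kswapd is special in two ways. First, it has no address space of its own, which is why operating system theory calls it a "thread" to mark the difference. It uses the kernel's address space, and in this respect it resembles an interrupt service routine. Second, its code is statically linked into the kernel, so it can call kernel routines directly instead of being limited, as ordinary processes are, to the predefined services offered through system calls.

This section follows kswapd from the moment it is scheduled through one pass of its routine. We begin with its creation. The code is in mm/vmscan.c:

1146 static int __init kswapd_init(void)
1147 {
1148     printk("Starting kswapd v1.8\n");
1149     swap_setup();
1150     kernel_thread(kswapd, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGNAL);
1151     kernel_thread(kreclaimd, NULL, CLONE_FS | CLONE_FILES | CLONE_SIGNAL);
1152     return 0;
1153 }

The function kswapd_init() is called during system initialization and does two things. The first, in swap_setup(), is to set the global variable page_cluster according to the amount of physical memory:

[kswapd_init() > swap_setup()]

293 /*
294  * Perform any setup for the swap system
295  */
296 void __init swap_setup(void)
297 {
298     /* Use a smaller cluster for memory <16MB or <32MB */
299     if (num_physpages < ((16 * 1024 * 1024) >> PAGE_SHIFT))
300         page_cluster = 2;
301     else if (num_physpages < ((32 * 1024 * 1024) >> PAGE_SHIFT))
302         page_cluster = 3;
303     else
304         page_cluster = 4;
305 }

This parameter is related to disk I/O. Reading from disk requires a seek first, and seeking is slow, so reading a single page at a time is uneconomical. It is better to read several pages once a read is being done anyway, which is called read-ahead. Read-ahead, however, means that more memory pages have to be set aside each time, so a suitable amount has to be chosen, and basing it on the size of physical memory is reasonable.

The second thing is to create the thread kswapd, which is done by kernel_thread(). Another thread, kreclaimd, is created here as well. It is also part of memory management but is less complex and less important than kswapd, so we set it aside for now. Thread creation is covered in the chapter on process management. Here we assume that the kswapd thread has been created and starts executing at the function kswapd(). Its code is in mm/vmscan.c:

947 /*
948  * The background pageout daemon, started as a kernel thread
949  * from the init process.
950  *
951  * This basically trickles out pages so that we have _some_
952  * free memory available even if there is no other activity
953  * that frees anything up. This is needed for things like routing
954  * etc, where we otherwise might have all activity going on in
955  * asynchronous contexts that cannot page things out.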
956  *
957  * If there are applications that are active memory-allocators
958  * (most normal use), this basically shouldn't matter.
959  */
960 int kswapd(void *unused)
961 {
962     struct task_struct *tsk = current;
963
964     tsk->session = 1;
965     tsk->pgrp = 1;
966     strcpy(tsk->comm, "kswapd");
967     sigfillset(&tsk->blocked);
968     kswapd_task = tsk;
969
970     /*
971      * Tell the memory management that we're a "memory allocator",
972      * and that if we need more memory we should get access to it
973      * regardless (see "__alloc_pages()"). "kswapd" should
974      * never get caught in the normal page freeing logic.
975      *
976      * (Kswapd normally doesn't need memory anyway, but sometimes
977      * you need a small amount of memory in order to be able to
978      * page out something else, and this flag essentially protects
979      * us from recursively trying to free more memory as we're
980      * trying to free the first piece of memory in the first place).
981      */
982     tsk->flags |= PF_MEMALLOC;
983
984     /*
985      * Kswapd main loop.
986      */
987     for (;;) {
988         static int recalc = 0;
989
990         /* If needed, try to free some memory. */
991         if (inactive_shortage() || free_shortage()) {
992             int wait = 0;
993             /* Do we need to do some synchronous flushing? */
994             if (waitqueue_active(&kswapd_done))
995                 wait = 1;
996             do_try_to_free_pages(GFP_KSWAPD, wait);
997         }
998
999         /*
1000         * Do some (very minimal) background scanning. This
1001         * will scan all pages on the active list once
1002         * every minute. This clears old referenced bits
1003         * and moves unused pages to the inactive list.
1004         */
1005        refill_inactive_scan(6, 0);
1006
1007        /* Once a second, recalculate some VM stats. */
1008        if (time_after(jiffies, recalc + HZ)) {
1009            recalc = jiffies;
1010            recalculate_vm_stats();
1011        }
1012
1013        /*
1014         * Wake up everybody waiting for free memory
1015         * and unplug the disk queue.
1016         */
1017        wake_up_all(&kswapd_done);
1018        run_task_queue(&tq_disk);
1019
1020        /*
1021         * We go to sleep if either the free page shortage
1022         * or the inactive page shortage is gone. We do this
1023         * because:
1024         * 1) we need no more free pages   or
1025         * 2) the inactive pages need to be flushed to disk,
1026         *    it wouldn't help to eat CPU time now ...
1027         *
1028         * We go to sleep for one second, but if it's needed
1029         * we'll be woken up earlier...
1030         */
1031        if (!free_shortage() || !inactive_shortage()) {
1032            interruptible_sleep_on_timeout(&kswapd_wait, HZ);
1033        /*
1034         * If we couldn't free enough memory, we see if it was
1035         * due to the system just not having enough memory.
1036         * If that is the case, the only solution is to kill
1037         * a process (the alternative is eternal deadlock).
1038         *
1039         * If there still is enough memory around, we just loop
1040         * and try free some more memory...
1041         */
1042        } else if (out_of_memory()) {
1043            oom_kill();
1044        }
1045    }
1046 }

Like most server programs, this one consists of an endless loop. After some simple initialization the program enters the loop. At the end of each iteration it normally calls interruptible_sleep_on_timeout() and goes to sleep, leaving the kernel free to schedule other processes. After a certain time the kernel wakes kswapd and schedules it again, and kswapd returns to the top of the loop. How long is "a certain time"? It is the constant HZ. HZ determines how many timer interrupts occur per second and is fixed once the kernel has been compiled. Passing HZ to interruptible_sleep_on_timeout() therefore means that kswapd is to be scheduled again after one second. In other words, a call to interruptible_sleep_on_timeout() normally returns only one second later. In some situations, however, the kernel wakes kswapd before the second has elapsed, and kswapd then returns early and starts a new iteration. The loop thus runs at least once per second, and this is the routine path of kswapd.

What does kswapd do on this path? The work falls into two parts. The first part is done only when physical pages are found to be in short supply. Its purpose is to pick out a number of pages in advance and break their mappings, so that they move from the active to the inactive state and are ready to be swapped out. The second part is done on every pass. Its purpose is to write inactive "dirty" pages to the swap device so that they become inactive "clean" pages, which remain cached, or to go one step further and reclaim such pages as free pages.

We start with the first part, which begins by checking whether the physical pages available for allocation or turnover are in short supply:

[kswapd() > inactive_shortage()]

805 /*
806  * How many inactive pages are we short?
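807  */
808 int inactive_shortage(void)
809 {
810     int shortage = 0;
811
812     shortage += freepages.high;
813     shortage += inactive_target;
814     shortage -= nr_free_pages();
815     shortage -= nr_inactive_clean_pages();
816     shortage -= nr_inactive_dirty_pages;
817
818     if (shortage > 0)
819         return shortage;
820
821     return 0;
822 }

The supply of physical pages that the system should maintain is given by two global quantities, freepages.high and inactive_target, the desired numbers of free pages and of inactive pages. Their sum is the potential supply under normal conditions. The pages that make up this supply come from three sources. The first is the pages that are currently free and can be allocated immediately. They are spread over the zones and merged into physically contiguous blocks of 2, 4, 8, ..., 2^N pages; nr_free_pages() counts them. The second is the existing inactive "clean" pages. These too can in effect be allocated at once, but their contents may still be needed, so keeping more of them helps to reduce reads from the swap device. They are also spread over the zones but are not merged into blocks; nr_inactive_clean_pages() counts them. The last is the existing inactive "dirty" pages, which must be "cleaned", that is, written to the swap device, before they can be allocated. These pages are all on a single list, and the global variable nr_inactive_dirty_pages records how many there are. The two functions just mentioned are in mm/page_alloc.c and are simple enough for the reader to study unaided.

Maintaining the total potential supply is not enough, however. free_shortage() is also used to check whether some particular zone is seriously short, that is, whether the number of pages that can be allocated directly (everything except inactive "dirty" pages) is below a minimum. This function is in mm/vmscan.c and is likewise left to the reader.

If allocatable pages are found to be short, some pages have to be released or swapped out. This is done by do_try_to_free_pages(). Before that, waitqueue_active() is called to see whether anything is waiting on the kswapd_done wait queue, and the result is passed to do_try_to_free_pages() as a parameter. A process that cannot continue until kswapd has finished a pass sleeps on kswapd_done, and kswapd wakes every waiter at the end of each pass (wake_up_all() at line 1017). The inline function waitqueue_active() simply checks whether the queue is non-empty. It is defined in include/linux/wait.h:

[kswapd() > waitqueue_active()]

152 static inline int waitqueue_active(wait_queue_head_t *q)
153 {
154 #if WAITQUEUE_DEBUG
155     if (!q)
156         WQ_BUG();
157     CHECK_MAGIC_WQHEAD(q);
158 #endif
159
160     return !list_empty(&q->task_list);
161 }

Next comes the call to do_try_to_free_pages(), which tries to make some memory pages available.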
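The balance computed by inactive_shortage() is simply demand minus supply, clamped at zero. A minimal user-space sketch of that arithmetic follows. The structure vm_snapshot and the function toy_inactive_shortage are hypothetical names introduced only for illustration; in the kernel the five values come from the globals and counting functions shown in the listing above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of the values that inactive_shortage() reads. */
struct vm_snapshot {
    int freepages_high;     /* stands for freepages.high */
    int inactive_target;    /* stands for inactive_target */
    int nr_free;            /* stands for nr_free_pages() */
    int nr_inactive_clean;  /* stands for nr_inactive_clean_pages() */
    int nr_inactive_dirty;  /* stands for nr_inactive_dirty_pages */
};

/* Demand (two targets) minus supply (three sources), never negative. */
static int toy_inactive_shortage(const struct vm_snapshot *s)
{
    int shortage;

    assert(s != NULL);
    shortage = s->freepages_high + s->inactive_target
             - s->nr_free - s->nr_inactive_clean - s->nr_inactive_dirty;
    return shortage > 0 ? shortage : 0;
}
```

With targets of 100 and 200 and a supply of 50 + 60 + 70 pages, the sketch reports a shortage of 120 pages. Once the supply exceeds the 300-page target, the result is 0 and kswapd has no reason to call do_try_to_free_pages() on this account.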
The code of do_try_to_free_pages() is in mm/vmscan.c:

[kswapd() > do_try_to_free_pages()]

907 static int do_try_to_free_pages(unsigned int gfp_mask, int user)
908 {
909     int ret = 0;
910
911     /*
912      * If we're low on free pages, move pages from the
913      * inactive_dirty list to the inactive_clean list.
914      *
915      * Usually bdflush will have pre-cleaned the pages
916      * before we get around to moving them to the other
917      * list, so this is a relatively cheap operation.
918      */
919     if (free_shortage() || nr_inactive_dirty_pages > nr_free_pages() +
920             nr_inactive_clean_pages())
921         ret += page_launder(gfp_mask, user);
922
923     /*
924      * If needed, we move pages from the active list
925      * to the inactive list. We also "eat" pages from
926      * the inode and dentry cache whenever we do this.
927      */
928     if (free_shortage() || inactive_shortage()) {
929         shrink_dcache_memory(6, gfp_mask);
930         shrink_icache_memory(6, gfp_mask);
931         ret += refill_inactive(gfp_mask, user);
932     } else {
933         /*
934          * Reclaim unused slab cache memory.
935          */
936         kmem_cache_reap(gfp_mask);
937         ret = 1;
938     }
939
940     return ret;
941 }

Breaking the mapping of an active page so that it becomes inactive, and possibly swapping it out to the swap device afterwards, is a last resort, because nobody can predict exactly which pages are the right ones to swap out. "Least recently used" is an effective rule in general, but it does not hold everywhere. It is therefore best not to touch pages that are in active use if that can be avoided. With this in mind the code starts with the easy measures and increases the effort step by step. It first calls page_launder(), which tries to "wash" the dirty pages that are already inactive so that they become immediately allocatable. The word "launder" in the name refers to doing the laundry. This function is called not only from the kswapd thread but also whenever a page has to be allocated and no free page is available, as we have already seen. Its code is in mm/vmscan.c:

[kswapd() > do_try_to_free_pages() > page_launder()]

465 /**
466  * page_launder - clean dirty inactive pages, move to inactive_clean list
467  * @gfp_mask: what operations we are allowed to do
468  * @sync: should we wait synchronously for the cleaning of pages
469  *
470  * When this function is called, we are most likely low on free +
471  * inactive_clean pages. Since we want to refill those pages as
472  * soon as possible, we'll make two loops over the inactive list,
473  * one to move the already cleaned pages to the inactive_clean lists
474  * and one to (often asynchronously) clean the dirty inactive pages.
475  *
476  * In situations where kswapd cannot keep up, user processes will
477  * end up calling this function. Since the user process needs to
478  * have a page before it can continue with its allocation, we'll
479  * do synchronous page flushing in that case.
480  *
481  * This code is heavily inspired by the FreeBSD source code. Thanks
482  * go out to Matthew Dillon.
483  */
484 #define MAX_LAUNDER     (4 * (1 << page_cluster))
485 int page_launder(int gfp_mask, int sync)
486 {
487     int launder_loop, maxscan, cleaned_pages, maxlaunder;
488     int can_get_io_locks;
489     struct list_head * page_lru;
490     struct page * page;
491
492     /*
493      * We can only grab the IO locks (eg. for flushing dirty
494      * buffers to disk) if __GFP_IO is set.
495      */
496     can_get_io_locks = gfp_mask & __GFP_IO;
497
498     launder_loop = 0;
499     maxlaunder = 0;
500     cleaned_pages = 0;
501
502 dirty_page_rescan:
503     spin_lock(&pagemap_lru_lock);
504     maxscan = nr_inactive_dirty_pages;
505     while ((page_lru = inactive_dirty_list.prev) != &inactive_dirty_list &&
506                 maxscan-- > 0) {
507         page = list_entry(page_lru, struct page, lru);
508
509         /* Wrong page on list?! (list corruption, should not happen) */
510         if (!PageInactiveDirty(page)) {
511             printk("VM: page_launder, wrong page on list.\n");
512             list_del(page_lru);
513             nr_inactive_dirty_pages--;
514             page->zone->inactive_dirty_pages--;
515             continue;
516         }
517
518         /* Page is or was in use?  Move it to the active list. */
519         if (PageTestandClearReferenced(page) || page->age > 0 ||
520                 (!page->buffers && page_count(page) > 1) ||
521                 page_ramdisk(page)) {
522             del_page_from_inactive_dirty_list(page);
523             add_page_to_active_list(page);
524             continue;
525         }
526
527         /*
528          * The page is locked. IO in progress?
529          * Move it to the back of the list.
530          */
531         if (TryLockPage(page)) {
532             list_del(page_lru);
533             list_add(page_lru, &inactive_dirty_list);
534             continue;
535         }
536
537         /*
538          * Dirty swap-cache page? Write it out if
539          * last copy..
540          */
541         if (PageDirty(page)) {
542             int (*writepage)(struct page *) = page->mapping->a_ops->writepage;
543             int result;
544
545             if (!writepage)
546                 goto page_active;
547
548             /* First time through? Move it to the back of the list */
549             if (!launder_loop) {
550                 list_del(page_lru);
551                 list_add(page_lru, &inactive_dirty_list);
552                 UnlockPage(page);
553                 continue;
554             }
555
556             /* OK, do a physical asynchronous write to swap.  */
557             ClearPageDirty(page);
558             page_cache_get(page);
559             spin_unlock(&pagemap_lru_lock);
560
561             result = writepage(page);
562             page_cache_release(page);
563
564             /* And re-start the thing.. */
565             spin_lock(&pagemap_lru_lock);
566             if (result != 1)
567                 continue;
568             /* writepage refused to do anything */
569             set_page_dirty(page);
570             goto page_active;
571         }
572
573         /*
574          * If the page has buffers, try to free the buffer mappings
575          * associated with this page. If we succeed we either free
576          * the page (in case it was a buffercache only page) or we
577          * move the page to the inactive_clean list.
578          *
579          * On the first round, we should free all previously cleaned
580          * buffer pages
581          */
582         if (page->buffers) {
583             int wait, clearedbuf;
584             int freed_page = 0;
585             /*
586              * Since we might be doing disk IO, we have to
587              * drop the spinlock and take an extra reference
588              * on the page so it doesn't go away from under us.
589              */
590             del_page_from_inactive_dirty_list(page);
591             page_cache_get(page);
592             spin_unlock(&pagemap_lru_lock);
593
594             /* Will we do (asynchronous) IO? */
595             if (launder_loop && maxlaunder == 0 && sync)
596                 wait = 2;   /* Synchrounous IO */
597             else if (launder_loop && maxlaunder-- > 0)
598                 wait = 1;   /* Async IO */
599             else
600                 wait = 0;   /* No IO */
601
602             /* Try to free the page buffers. */
603             clearedbuf = try_to_free_buffers(page, wait);
604
605             /*
606              * Re-take the spinlock. Note that we cannot
607              * unlock the page yet since we're still
608              * accessing the page_struct here...
609              */
610             spin_lock(&pagemap_lru_lock);
611
612             /* The buffers were not freed. */
613             if (!clearedbuf) {
614                 add_page_to_inactive_dirty_list(page);
615
616             /* The page was only in the buffer cache. */
617             } else if (!page->mapping) {
618                 atomic_dec(&buffermem_pages);
619                 freed_page = 1;
620                 cleaned_pages++;
621
622             /* The page has more users besides the cache and us. */
623             } else if (page_count(page) > 2) {
624                 add_page_to_active_list(page);
625
626             /* OK, we "created" a freeable page. */
627             } else /* page->mapping && page_count(page) == 2 */ {
628                 add_page_to_inactive_clean_list(page);
629                 cleaned_pages++;
630             }
631
632             /*
633              * Unlock the page and drop the extra reference.
634              * We can only do it here because we ar accessing
635              * the page struct above.
636              */
637             UnlockPage(page);
638             page_cache_release(page);
639
640             /*
641              * If we're freeing buffer cache pages, stop when
642              * we've got enough free memory.
643              */
644             if (freed_page && !free_shortage())
645                 break;
646             continue;
647         } else if (page->mapping && !PageDirty(page)) {
648             /*
649              * If a page had an extra reference in
650              * deactivate_page(), we will find it here.
651              * Now the page is really freeable, so we
652              * move it to the inactive_clean list.
653              */
654             del_page_from_inactive_dirty_list(page);
655             add_page_to_inactive_clean_list(page);
656             UnlockPage(page);
657             cleaned_pages++;
658         } else {
659 page_active:
660             /*
661              * OK, we don't know what to do with the page.
662              * It's no use keeping it here, so we move it to
663              * the active list.
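664              */
665             del_page_from_inactive_dirty_list(page);
666             add_page_to_active_list(page);
667             UnlockPage(page);
668         }
669     }
670     spin_unlock(&pagemap_lru_lock);

The local variable cleaned_pages accumulates the number of pages that have been "washed clean". Another local variable, launder_loop, controls how many times the inactive dirty list is scanned. During the first scan launder_loop is 0. If a second scan is needed, it is set to 1 and control goes back to the label dirty_page_rescan (line 502) to start another scan.

The inactive dirty list is scanned by a while loop (line 505). Because pages are often moved from their current position to the other end of the list during the scan, reaching the end of the list cannot be the only condition for ending the loop. The number of pages scanned must also be bounded, so that the same page is not processed repeatedly and the loop cannot run forever. That is the role of maxscan.

For each page on the list, the code first checks that its PG_inactive_dirty flag is 1. If it is not, the page should not be on this list at all and something has gone wrong, so the page is removed from the list (line 512). Normal inactive dirty pages are then put through the following checks in order and handled accordingly.

(1) Some pages are on the inactive dirty list but belong back on the active list, either because circumstances have changed or because they should not have been put here in the first place (lines 519 to 525). These are:
  • Pages that have been accessed since they entered the inactive dirty list, for example because a page fault on the page restored its mapping.
  • Pages whose "age" has not yet run out. The page structure has a field age whose value reflects how often the page is accessed. We return to this later.
  • Pages that are not used as file read/write buffers but whose use count is greater than 1. This means the page is still mapped by at least one process. As explained earlier, the use count is set to 1 when a page is allocated, and every further use of the page, including use as a file read/write buffer, adds 1. So if a page is not used as a buffer and its count is greater than 1, some process must still be using it.
  • Pages that belong to a ramdisk, that is, memory used to emulate a disk. Such pages should of course not be swapped out to disk.

(2) The page is locked (line 531), so TryLockPage() returns 1. This shows that an operation such as input/output is in progress on the page. Such a page stays on the inactive dirty list but is moved to the other end of it. Note that a page that was not locked has now been locked by this call.

(3) The page is still "dirty" (line 541), that is, the PG_dirty flag in the page structure is 1. In principle it should be written out, but some special cases have to be considered (lines 541 to 571). First, the address_space structure the page belongs to must provide a function for writing a page. If it does not, control goes to page_active and the page is returned to the active list. For ordinary swapping, the address_space structure is swapper_space, its address_space_operations structure is swap_aops, and the write operation it provides is swap_writepage(), which we have seen before. During the first scan the page is only moved to the other end of the same list and is not written (lines 549 to 554). If there is a second scan, the page really is written out. Before the write, ClearPageDirty() clears the PG_dirty flag, and the page is then written through the function provided by its address_space. The actual write differs with the purpose of the page, for example an ordinary user-space page, a file mapping set up through mmap(), or a file system read/write buffer. The write may be synchronous (the current process sleeps until it completes) or asynchronous, but it always takes some time, during which the kernel may enter page_launder() again. Clearing PG_dirty beforehand prevents the page from being written a second time.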
This way the same page cannot be written out twice (see the test at line 541). There is one more case to handle: if writepage() returns 1, it has refused to write the page, and page_launder() sets the PG_dirty flag again and returns the page to the active list (lines 569 to 570). Note also that the use count of the page is incremented by page_cache_get() before writepage() is called and decremented by page_cache_release() afterwards, so the page cannot disappear while the write is being issued. Finally, after a dirty page has been written it is not moved to an inactive clean list at once. Only its PG_dirty flag has been cleared.

Consequently, whenever the CPU reaches line 582, the PG_dirty flag of the page is 0.

(4) The page is no longer dirty and is used as a file read/write buffer (lines 582 to 647). It is first taken off the inactive dirty list, and try_to_free_buffers() then tries to release its buffers. If they cannot be released, the page goes back to the inactive dirty list. Otherwise, depending on its mapping and use count, it is freed, linked into the active list, or moved to an inactive clean list. In the case where the page was only in the buffer cache, its use count has already been reduced by 1 inside try_to_free_buffers(), and page_cache_release() at line 638 reduces it to 0, which finally returns the page to the free lists. If a page has been freed in this way and free pages are no longer short, the scan can stop (see lines 644 and 645). Otherwise it continues. The code of try_to_free_buffers() is in fs/buffer.c, and readers can study it after the chapter on file systems.

(5) The page is no longer dirty and is on the list of some address_space structure. This is a page that has already been "washed clean", so it is moved to the inactive clean list of its zone.

(6) Finally, if none of the above applies (line 658), the page cannot be handled here and is returned to the active list.

After one scan, whether a second scan is made depends on whether free pages are still short and on whether the __GFP_IO bit is set in the parameter gfp_mask.

[kswapd() > do_try_to_free_pages() > page_launder()]

671
672     /*
673      * If we don't have enough free pages, we loop back once
674      * to queue the dirty pages for writeout. When we were called
675      * by a user process (that /needs/ a free page) and we didn't
676      * free anything yet, we wait synchronously on the writeout of
677      * MAX_SYNC_LAUNDER pages.
678      *
679      * We also wake up bdflush, since bdflush should, under most
680      * loads, flush out the dirty pages before we have to wait on
681      * IO.
682      */
683     if (can_get_io_locks && !launder_loop && free_shortage()) {
684         launder_loop = 1;
685         /* If we cleaned pages, never do synchronous IO. */
686         if (cleaned_pages)
687             sync = 0;
688         /* We only do a few "out of order" flushes. */
689         maxlaunder = MAX_LAUNDER;
690         /* Kflushd takes care of the rest. */
691         wakeup_bdflush(0);
692         goto dirty_page_rescan;
693     }
694
695     /* Return the number of pages moved to the inactive_clean list. */
696     return cleaned_pages;
697 }

If a second scan is decided on, control goes back to the label dirty_page_rescan at line 502. Since launder_loop has now been set to 1, the code cannot come back for a third scan, so each call to page_launder() makes at most two scans.

Back in do_try_to_free_pages(): if allocatable physical pages are still short after page_launder(), more pages have to be reclaimed. They are not taken only from the pages mapped into the user space of the various processes. There are four sources, which is the purpose of the three functions called here (shrink_dcache_memory(), shrink_icache_memory() and refill_inactive()) and of kmem_cache_reap(). As the chapter on file systems will show, opening a file allocates and uses dentry structures, which represent directory entries, and inode structures, which represent file index nodes. These structures are not released as soon as the file is closed. They are kept on LRU lists as a reserve in case they are needed again soon. Over time a large number of dentry and inode structures can therefore accumulate and occupy a considerable number of physical pages, and shrink_dcache_memory() and shrink_icache_memory() are used to reclaim a suitable share of them. In addition, the kernel dynamically allocates many other data structures while it runs, using a mechanism called "slab". As we shall see later, this mechanism obtains memory pages "wholesale" from memory management and cuts them into small pieces for "retail". The real demand for such pages changes as the system runs, but the slab mechanism tends to allocate and hold on to free pages rather than return them, so from time to time they have to be "harvested" with kmem_cache_reap(). Readers can study these functions after the relevant material. Here we are mainly concerned with reclaiming from the active list, which is done by refill_inactive(). Its code is in mm/vmscan.c:

[kswapd() > do_try_to_free_pages() > refill_inactive()]

824 /*
825  * We need to make the locks finer granularity, but right
826  * now we need this so that we can do page allocations
827  * without holding the kernel lock etc.
828  *
829  * We want to try to free "count" pages, and we want to
830  * cluster them so that we get good swap-out behaviour.
831  *
832  * OTOH, if we're a user process (and not kswapd), we
833  * really care about latency. In that case we don't try
834  * to free too many pages.
835  */
836 static int refill_inactive(unsigned int gfp_mask, int user)
837 {
838     int priority, count, start_count, made_progress;
839
840     count = inactive_shortage() + free_shortage();
841     if (user)
842         count = (1 << page_cluster);
843     start_count = count;
844
845     /* Always trim SLAB caches when memory gets low. */
846     kmem_cache_reap(gfp_mask);
847
848     priority = 6;
849     do {
850         made_progress = 0;
851
852         if (current->need_resched) {
853             __set_current_state(TASK_RUNNING);
854             schedule();
855         }
856
857         while (refill_inactive_scan(priority, 1)) {
858             made_progress = 1;
859             if (--count <= 0)
860                 goto done;
861         }
862
863         /*
864          * don't be too light against the d/i cache since
865          * refill_inactive() almost never fail when there's
866          * really plenty of memory free.
867          */
868         shrink_dcache_memory(priority, gfp_mask);
869         shrink_icache_memory(priority, gfp_mask);
870
871         /*
872          * Then, try to page stuff out..
873          */
874         while (swap_out(priority, gfp_mask)) {
875             made_progress = 1;
876             if (--count <= 0)
877                 goto done;
878         }
879
880         /*
881          * If we either have enough free memory, or if
882          * page_launder() will be able to make enough
883          * free memory, then stop.
884          */
885         if (!inactive_shortage() || !free_shortage())
886             goto done;
887
888         /*
889          * Only switch to a lower "priority" if we
890          * didn't make any useful progress in the
891          * last loop.
892          */
893         if (!made_progress)
894             priority--;
895     } while (priority >= 0);
896
897     /* Always end on a refill_inactive.., may sleep... */
898     while (refill_inactive_scan(0, 1)) {
899         if (--count <= 0)
900             goto done;
901     }
902
903 done:
904     return (count < start_count);
905 }

The parameter user is passed down from kswapd() and says whether anything is waiting on the kswapd_done queue. It determines whether page reclaim can take its time, and therefore affects the number of pages to be reclaimed in this call.

First, kmem_cache_reap() "harvests" the free physical pages held by the slab mechanism. This is the least intrusive measure. Readers can study its code after the later section on kernel buffer management.

Then comes a do-while loop. It starts at the lowest priority, 6, and increases the effort step by step down to priority 0. Either the target is reached, meaning enough pages have been reclaimed, or the target is still not met at the highest priority, in which case the loop gives up (after one more call to refill_inactive_scan()).

At the start of each iteration, need_resched in the task_struct of the current process is checked. If it is 1, some interrupt service routine has requested a reschedule, so schedule() is called to let the kernel reschedule. Before that the state of the process is set to TASK_RUNNING to express that it wants to keep running. As Chapter 4 will show, need_resched exists to force scheduling, and the flag is checked every time the CPU finishes a system call or an interrupt service and returns from kernel space to user space.
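The way refill_inactive() lowers its priority only after a pass that made no progress can be isolated in a small user-space sketch. This is an illustration under stated assumptions and not the kernel routine: toy_scan_fn, toy_refill, toy_pool and toy_pool_scan are hypothetical names, a single callback stands in for both refill_inactive_scan() and swap_out(), and the calls to schedule(), the dentry and inode cache shrinkers and the final refill_inactive_scan(0, 1) pass are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: one reclaim attempt at the given priority,
 * returning nonzero if a page was obtained. */
typedef int (*toy_scan_fn)(int priority, void *ctx);

/*
 * Start at the lowest priority (6) and move toward 0 only after a
 * pass that made no progress; stop as soon as `count` pages are found.
 * Returns nonzero if at least one page was obtained.
 */
static int toy_refill(int count, toy_scan_fn scan, void *ctx)
{
    int start_count = count;
    int priority = 6;

    assert(scan != NULL);
    do {
        int made_progress = 0;

        while (scan(priority, ctx)) {
            made_progress = 1;
            if (--count <= 0)
                goto done;
        }
        if (!made_progress)
            priority--;
    } while (priority >= 0);
done:
    return count < start_count;
}

/* Hypothetical page source that only yields pages once the caller
 * has escalated to unlock_priority or beyond. */
struct toy_pool {
    int pages_left;
    int unlock_priority;
    int last_priority;   /* priority of the most recent attempt */
};

static int toy_pool_scan(int priority, void *ctx)
{
    struct toy_pool *p = ctx;

    p->last_priority = priority;
    if (priority > p->unlock_priority || p->pages_left == 0)
        return 0;
    p->pages_left--;
    return 1;
}
```

With a pool of three pages that only opens up at priority 4, a request for two pages fails at priorities 6 and 5 and then succeeds twice at priority 4 without escalating any further. With an empty pool the sketch walks all the way down to priority 0 and reports failure.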
Since kswapd is a kernel thread, it never "returns to user space", so the need_resched check made on that path never happens for it and it could hold the CPU indefinitely. It therefore has to discipline itself: before operations that may take a long time it checks the flag and calls schedule() on its own.

What does the loop do? Mainly two things. One is to scan the active list through refill_inactive_scan(), looking for pages that can be made inactive. The other is to pick a process through swap_out() and scan its page tables, looking for pages that can be made inactive. In addition, dentry and inode structures are reclaimed further.

We look at refill_inactive_scan() first. The function is in mm/vmscan.c:

699 /**
700  * refill_inactive_scan - scan the active list and find pages to deactivate
701  * @priority: the priority at which to scan
702  * @oneshot: exit after deactivating one page
703  *
704  * This function will scan a portion of the active list to find
705  * unused pages, those pages will then be moved to the inactive list.
706  */
707 int refill_inactive_scan(unsigned int priority, int oneshot)
708 {
709     struct list_head * page_lru;
710     struct page * page;
711     int maxscan, page_active = 0;
712     int ret = 0;
713
714     /* Take the lock while messing with the list... */
715     spin_lock(&pagemap_lru_lock);
716     maxscan = nr_active_pages >> priority;
717     while (maxscan-- > 0 && (page_lru = active_list.prev) != &active_list) {
718         page = list_entry(page_lru, struct page, lru);
719
720         /* Wrong page on list?! (list corruption, should not happen) */
721         if (!PageActive(page)) {
722             printk("VM: refill_inactive, wrong page on list.\n");
723             list_del(page_lru);
724             nr_active_pages--;
725             continue;
726         }
727
728         /* Do aging on the pages. */
729         if (PageTestandClearReferenced(page)) {
730             age_page_up_nolock(page);
731             page_active = 1;
732         } else {
733             age_page_down_ageonly(page);
734             /*
735              * Since we don't hold a reference on the page
736              * ourselves, we have to do our test a bit more
737              * strict then deactivate_page(). This is needed
738              * since otherwise the system could hang shuffling
739              * unfreeable pages from the active list to the
740              * inactive_dirty list and back again...
741              *
742              * SUBTLE: we can have buffer pages with count 1.
743              */
744             if (page->age == 0 && page_count(page) <=
745                         (page->buffers ? 2 : 1)) {
746                 deactivate_page_nolock(page);
747                 page_active = 0;
748             } else {
749                 page_active = 1;
750             }
751         }
752         /*
753          * If the page is still on the active list, move it
754          * to the other end of the list. Otherwise it was
755          * deactivated by age_page_down and we exit successfully.
756          */
757         if (page_active || PageActive(page)) {
758             list_del(page_lru);
759             list_add(page_lru, &active_list);
760         } else {
761             ret = 1;
762             if (oneshot)
763                 break;
764         }
765     }
766     spin_unlock(&pagemap_lru_lock);
767
768     return ret;
769 }

Here too a local variable maxscan limits the number of pages scanned. The scan does not necessarily cover the whole active list. Depending on the parameter priority only part of it is scanned, and the whole list is scanned only when priority is 0 (see line 716). For each page scanned, the code first verifies that it really is an active page. Then, depending on whether the page has been accessed (see line 729), its age is increased or decreased. When the age of a page has dropped to 0, the page has not been accessed for a long time. Even so, it still has to be checked whether the page is mapped into user space. If the page is not used as a file system read/write buffer, a use count greater than 1 means that a user-space mapping still exists, and the page cannot yet be made inactive (see line 744). Such pages have to wait until swap_out() scans the corresponding page tables and breaks the mapping. A page that stays active is moved from its current position to the other end of the list. A page that passes the test is made inactive by deactivate_page_nolock(). Once a page has been successfully moved from the active to the inactive state and the parameter oneshot is 1, the scan ends. Otherwise it continues.

Now for swap_out(). This is an important function. As the reader has seen, the decision about each page depends on whether its mappings have been broken, and this function is what follows the page tables of a process and breaks them. The code of swap_out() is in mm/vmscan.c:

[kswapd() > do_try_to_free_pages() > refill_inactive() > swap_out()]

297 /*
298  * Select the task with maximal swap_cnt and try to swap out a page.
299  * N.B. This function returns only 0 or 1.  Return values != 1 from
300  * the lower level routines result in continued processing.
301  */
302 #define SWAP_SHIFT 5
303 #define SWAP_MIN 8
304
305 static int swap_out(unsigned int priority, int gfp_mask)
306 {
307     int counter;
308     int __ret = 0;
309
310     /*
311      * We make one or two passes through the task list, indexed by
312      * assign = {0, 1}:
313      *   Pass 1: select the swappable task with maximal RSS that has
314      *         not yet been swapped out.
315      *   Pass 2: re-assign rss swap_cnt values, then select as above.
316      *
317      * With this approach, there's no need to remember the last task
318      * swapped out.  If the swap-out fails, we clear swap_cnt so the
319      * task won't be selected again until all others have been tried.
320      *
321      * Think of swap_cnt as a "shadow rss" - it tells us which process
322      * we want to page out (always try largest first).
323      */
324     counter = (nr_threads << SWAP_SHIFT) >> priority;
325     if (counter < 1)
326         counter = 1;
327
328     for (; counter >= 0; counter--) {
329         struct list_head *p;
330         unsigned long max_cnt = 0;
331         struct mm_struct *best = NULL;
332         int assign = 0;
333         int found_task = 0;
334     select:
335         spin_lock(&mmlist_lock);
336         p = init_mm.mmlist.next;
337         for (; p != &init_mm.mmlist; p = p->next) {
338             struct mm_struct *mm = list_entry(p, struct mm_struct, mmlist);
339             if (mm->rss <= 0)
340                 continue;
341             found_task++;
342             /* Refresh swap_cnt? */
343             if (assign == 1) {
344                 mm->swap_cnt = (mm->rss >> SWAP_SHIFT);
345                 if (mm->swap_cnt < SWAP_MIN)
346                     mm->swap_cnt = SWAP_MIN;
347             }
348             if (mm->swap_cnt > max_cnt) {
349                 max_cnt = mm->swap_cnt;
350                 best = mm;
351             }
352         }
353
354         /* Make sure it doesn't disappear */
355         if (best)
356             atomic_inc(&best->mm_users);
357         spin_unlock(&mmlist_lock);
358
359         /*
360          * We have dropped the tasklist_lock, but we
361          * know that "mm" still exists: we are running
362          * with the big kernel lock, and exit_mm()
363          * cannot race with us.
364          */
365         if (!best) {
366             if (!assign && found_task > 0) {
367                 assign = 1;
368                 goto select;
369             }
370             break;
371         } else {
372             __ret = swap_out_mm(best, gfp_mask);
373             mmput(best);
374             break;
375         }
376     }
377     return __ret;
378 }

The body of this function is a for loop whose iteration count is set by counter, which in turn depends on priority and on nr_threads, the number of processes (threads) currently in the system. When priority is 0, counter equals (nr_threads << SWAP_SHIFT), that is, 32 times nr_threads. The value expresses how determined the kernel is to get pages swapped out. The parameter gfp_mask carries control flags down to the lower levels.

Each iteration looks for the most suitable process, best. Once it is found, its page tables are scanned, and the pages that qualify have their mappings to memory pages broken for the time being, or are moved on to the inactive state, in preparation for being written to the swap device. Although the function is called swap_out, it only prepares pages for swap-out; it does not necessarily write anything to disk. Which process is "most suitable"? The policy can be summed up as "take from the rich" combined with "take turns".

Every process has its own virtual address space, and the pages that have been allocated and mapped in that space form a set. At any given moment, though, not every page in the set has a physical page in memory; usually only a subset does. That subset is called the resident set, and its size is rss. The mm_struct structure has a member rss that records how many memory pages the process currently occupies, so the process with the largest rss is the "richest".

The inner for loop walks the mm_struct structures of all processes. In the listing they are chained into a circular list through their mmlist member, with init_mm at its head, so the scan starts at init_mm.mmlist.next and stops when it comes back to init_mm.mmlist.

Besides rss, mm_struct has a member swap_cnt, which reflects how many of the process's pages have not yet been examined in the current round of swap-out effort. Each time a page of the process is examined, mm->swap_cnt is decremented, until it reaches 0. So mm->rss tells how much memory a process holds, while mm->swap_cnt tells how much of it is still to be examined in this round, and the process with the largest swap_cnt is the best target. When every process's swap_cnt has dropped to 0, the scan finds no best. The flag assign is then set to 1 and the list is scanned again; this time each process's swap_cnt is recomputed from its current mm->rss, and a new round starts from the richest process.

Having found best, swap_out() goes to work on it through swap_out_mm(), whose return value becomes the result of swap_out(). Before that, atomic_inc() at line 356 increments the use count in the mm_struct structure,
mm_users; once the work is done, mmput() at line 373 restores it. For the duration of the operation the mm_struct structure therefore has one more user and cannot be released part-way through.

The code of swap_out_mm() is also in vmscan.c:

[kswapd() > do_try_to_free_pages() > refill_inactive() > swap_out() > swap_out_mm()]

257 static int swap_out_mm(struct mm_struct * mm, int gfp_mask)
258 {
259     int result = 0;
260     unsigned long address;
261     struct vm_area_struct* vma;
262
263     /*
264      * Go through process' page directory.
265      */
266
267     /*
268      * Find the proper vm-area after freezing the vma chain
269      * and ptes.
270      */
271     spin_lock(&mm->page_table_lock);
272     address = mm->swap_address;
273     vma = find_vma(mm, address);
274     if (vma) {
275         if (address < vma->vm_start)
276             address = vma->vm_start;
277
278         for (;;) {
279             result = swap_out_vma(mm, vma, address, gfp_mask);
280             if (result)
281                 goto out_unlock;
282             vma = vma->vm_next;
283             if (!vma)
284                 break;
285             address = vma->vm_start;
286         }
287     }
288     /* Reset to 0 when we reach the end of address space */
289     mm->swap_address = 0;
290     mm->swap_cnt = 0;
291
292 out_unlock:
293     spin_unlock(&mm->page_table_lock);
294     return result;
295 }

First, mm->swap_address is the address of the page to be examined next. It starts at 0 and is reset to 0 once all pages have been gone through (line 289). In a for loop the function finds the virtual memory area vma that contains the current address and calls swap_out_vma() to try to swap out one page. If that succeeds (returns 1), the job is done for this call; otherwise the next area is tried. The calls descend level by level, through swap_out_vma(), swap_out_pgd() and swap_out_pmd(), down to try_to_swap_out(), which tries to swap out the memory page that one page table entry pte points to. The intermediate functions are in the same file and are left to the reader. We go straight to try_to_swap_out(), which is where the real work happens, and read it piece by piece:

[kswapd() > do_try_to_free_pages() > refill_inactive() > swap_out() > swap_out_mm() > swap_out_vma() > swap_out_pgd() > swap_out_pmd() > try_to_swap_out()]

27 /*
28  * The swap-out functions return 1 if they successfully
29  * threw something out, and we got a free page. It returns
30  * zero if it couldn't do anything, and any other value
31  * indicates it decreased rss, but the page was shared.
32  *
33  * NOTE!
    * If it sleeps, it *must* return 1 to make sure we
34  * don't continue with the swap-out. Otherwise we may be
35  * using a process that no longer actually exists (it might
36  * have died while we slept).
37  */
38 static int try_to_swap_out(struct mm_struct * mm, struct vm_area_struct* vma, unsigned long address, pte_t * page_table, int gfp_mask)
39 {
40     pte_t pte;
41     swp_entry_t entry;
42     struct page * page;
43     int onlist;
44
45     pte = *page_table;
46     if (!pte_present(pte))
47         goto out_failed;
48     page = pte_page(pte);
49     if ((!VALID_PAGE(page)) || PageReserved(page))
50         goto out_failed;
51
52     if (!mm->swap_cnt)
53         return 1;
54
55     mm->swap_cnt--;

Note first that the parameter page_table points to a single page table entry, not to a page table. The entry's content is copied into the variable pte, and pte_present() tests whether the physical page it refers to is in memory. If it is not, control goes to out_failed and this attempt fails:

106 out_failed:
107     return 0;

When try_to_swap_out() returns 0, the level above skips this page and tries the next page mapped by the same page table. When a page table is exhausted, the level above that moves on to the next page table.

If the physical page is in memory, pte_page() converts the entry's content into a pointer to the page structure of that physical page. All page structures live in the array mem_map, so (page - mem_map) is the page's index in that array. If the index is not below max_mapnr, the highest physical page number, the page is not a valid memory page (it might be storage on a peripheral interface board, for example), and the entry is skipped as well:

118 #define VALID_PAGE(page)    ((page - mem_map) < max_mapnr)

Pages that are reserved and may not be swapped out are skipped too. With these special cases out of the way, a page is now actually going to be examined, so mm->swap_cnt is decremented. We continue with the code of try_to_swap_out():

[kswapd() > do_try_to_free_pages() > refill_inactive() > swap_out() > swap_out_mm() > swap_out_vma() > swap_out_pgd() > swap_out_pmd() > try_to_swap_out()]

57     onlist = PageActive(page);
58     /* Don't look at this pte if it's been accessed recently. */
59     if (ptep_test_and_clear_young(page_table)) {
60         age_page_up(page);
61         goto out_failed;
62     }
63     if (!onlist)
64         /* The page is still mapped, so it can't be freeable... */
65         age_page_down_ageonly(page);
66
67     /*
68      * If the page is in active use by us, or if the page
69      * is in active use by others, don't unmap it or
70      * (worse) start unneeded IO.
71      */
72     if (page->age > 0)
73         goto out_failed;
74

The page structure of a memory page carries a number of flag bits. Among them, PG_active says whether the page is currently "active", that is, whether it is still on the active_list queue:

230 #define PageActive(page)    test_bit(PG_active, &(page)->flags)

A swappable physical page is always on some LRU queue. If it is not on active_list, it must be on inactive_dirty_list or on one of the inactive_clean_list queues, as a page that is currently inactive and waiting for its final disposal.

With the page's current state recorded, the inline function ptep_test_and_clear_young() tests whether the page is "young" and clears the bit it tests. It is defined in include/asm-i386/pgtable.h:

285 static inline int ptep_test_and_clear_young(pte_t *ptep) { return test_and_clear_bit(_PAGE_BIT_ACCESSED, ptep); }

What is tested is the _PAGE_ACCESSED flag. Whenever the i386 CPU's mapping hardware uses a page table entry to translate a virtual address and then accesses the resulting physical address, it automatically sets _PAGE_ACCESSED in that entry to 1. So a result of 1 means that the page has been accessed at least once since try_to_swap_out() last looked at this same entry; the page is still "young". As a rule, a page that was accessed recently is likely to be accessed again soon, so it should not be swapped out. Having read the flag, the function clears it to 0 so that it can be tested afresh next time. A young page is not a suitable candidate, and control goes to out_failed. Before that jump one more thing is done: the access is recorded on the page so that the aging code takes it into account. The text names SetPageReferenced(), which sets the PG_referenced flag in the page structure, for this purpose; in the listing above the call made is age_page_up(), which moves a page that is no longer active back to the active list and raises its age so that it is kept under observation for a while longer:

[kswapd() > do_try_to_free_pages() > refill_inactive() > swap_out() > swap_out_mm() > swap_out_vma() > swap_out_pgd() > swap_out_pmd() > try_to_swap_out() > age_page_up()]

125 void age_page_up(struct page * page)
126 {
127     /*
128      * We're dealing with an inactive page, move the page
129      * to the active list.
130      */
131     if (!page->age)
132         activate_page(page);
133
134     /* The actual page aging bit */
135     page->age += PAGE_AGE_ADV;
136     if (page->age > PAGE_AGE_MAX)
137         page->age = PAGE_AGE_MAX;
138 }

At out_failed the function returns 0, and the caller skips this page. When this process and this page come around again in a later round, however, the _PAGE_ACCESSED flag in the same entry has already been cleared to 0, and unless the page has been accessed in the meantime it no longer counts as "young". The reader may ask: since this page is mapped (otherwise we would not have got this far), how can it not be on the active list?
As will be seen later in do_swap_page(), when a page fault restores the mapping of an inactive page (a page sitting on the swap-in/swap-out queue), the page is not moved to the active list right away. That work is left to page_launder(), described earlier. A page can therefore be mapped and still not be on the active list.

If the page is not "young", it is examined further, but it is still not swapped out at once; it is kept under observation for a while. For how long? That depends on page->age. For a page that is not on the active list, age_page_down_ageonly() lowers the age here (mm/swap.c):

103 /*
104  * We use this (minimal) function in the case where we
105  * know we can't deactivate the page (yet).
106  */
107 void age_page_down_ageonly(struct page * page)
108 {
109     page->age /= 2;
110 }

As long as page->age has not reached 0, the page cannot be swapped out, and control again goes to out_failed. A page that has passed all of these filters is, in principle, a candidate for swap-out. We read on:

[kswapd() > do_try_to_free_pages() > refill_inactive() > swap_out() > swap_out_mm() > swap_out_vma() > swap_out_pgd() > swap_out_pmd() > try_to_swap_out()]

75     if (TryLockPage(page))
76         goto out_failed;
77
78     /* From this point on, the odds are that we're going to
79      * nuke this pte, so read and clear the pte.  This hook
80      * is needed on CPUs which update the accessed and dirty
81      * bits in hardware.
82      */
83     pte = ptep_get_and_clear(page_table);
84     flush_tlb_page(vma, address);
85
86     /*
87      * Is the page already in the swap cache? If so, then
88      * we can just drop our reference to it without doing
89      * any IO - it's already up-to-date on disk.
90      *
91      * Return 0, as we didn't actually free any real
92      * memory, and we should just continue our scan.
93      */
94     if (PageSwapCache(page)) {
95         entry.val = page->index;
96         if (pte_dirty(pte))
97             set_page_dirty(page);
98 set_swap_pte:
99         swap_duplicate(entry);
100         set_pte(page_table, swp_entry_to_pte(entry));
101 drop_pte:
102         UnlockPage(page);
103         mm->rss--;
104         deactivate_page(page);
105         page_cache_release(page);
106 out_failed:
107         return 0;
108     }

The operations on the page structure that follow must be mutually exclusive, or in other words must not be interfered with, so the structure is first locked through TryLockPage() (include/linux/mm.h):

183 #define TryLockPage(page)    test_and_set_bit(PG_locked, &(page)->flags)

A return value of 1 means that PG_locked was already 1: another process has locked the structure first. The page structure cannot be processed any further in that case, and the function again returns with a failure. Once the lock has been taken, swap-out can be prepared according to the state of the page.

The first step is ptep_get_and_clear(), which reads the page table entry once more and clears it to 0, withdrawing the mapping for the time being. The entry was already read at line 45, so why read it again? On a multiprocessor system the target process may be running on another CPU at this very moment, and the content of the entry may have changed in the meantime.

If the page structure is already on the queue used for swapping pages in and out, that is, the queue of the swapper_space structure, then the page's content is already on the swap device, and breaking the mapping for the time being is all that is needed. Pages on that queue are either "clean" or "dirty", however, so if the page table entry shows that the page has been written to, set_page_dirty() moves it to the "dirty" queue. The macro PageSwapCache() is defined as follows (include/linux/mm.h):

217 #define PageSwapCache(page)    test_bit(PG_swap_cache, &(page)->flags)

A PG_swap_cache flag of 1 means that the page structure is on swapper_space's queue, so the page is a swap-in/swap-out page. In that case the index field of the page structure holds a 32-bit swp_entry_t, which is in effect a pointer to the page's image on the swap device. The function swap_duplicate() does two things: it checks the entry for validity, and it increments the share count of the corresponding on-disk page. Its code is in mm/swapfile.c:

[kswapd() > do_try_to_free_pages() > refill_inactive() > swap_out() > swap_out_mm() > swap_out_vma() > swap_out_pgd() > swap_out_pmd() > try_to_swap_out() > swap_duplicate()]

820 /*
821  * Verify that a swap entry is valid and increment its swap map count.
822  * Kernel_lock is held, which guarantees existance of swap device.
823  *
824  * Note: if swap_map[] reaches SWAP_MAP_MAX the entries are treated as
825  * "permanent", but will be reclaimed by the next swapoff.
826  */
827 int swap_duplicate(swp_entry_t entry)
828 {
829     struct swap_info_struct * p;
830     unsigned long offset, type;
831     int result = 0;
832
833     /* Swap entry 0 is illegal */
834     if (!entry.val)
835         goto out;
836     type = SWP_TYPE(entry);
837     if (type >= nr_swapfiles)
838         goto bad_file;
839     p = type + swap_info;
840     offset = SWP_OFFSET(entry);
841     if (offset >= p->max)
842         goto bad_offset;
843     if (!p->swap_map[offset])
844         goto bad_unused;
845     /*
846      * Entry is valid, so increment the map count.
847      */
848     swap_device_lock(p);
849     if (p->swap_map[offset] < SWAP_MAP_MAX)
850         p->swap_map[offset]++;
851     else {
852         static int overflow = 0;
853         if (overflow++ < 5)
854             printk("VM: swap entry overflow\n");
855         p->swap_map[offset] = SWAP_MAP_MAX;
856     }
857     swap_device_unlock(p);
858     result = 1;
859 out:
860     return result;
861
862 bad_file:
863     printk("Bad swap file entry %08lx\n", entry.val);
864     goto out;
865 bad_offset:
866     printk("Bad swap offset entry %08lx\n", entry.val);
867     goto out;
868 bad_unused:
869     printk("Unused swap offset entry in swap_dup %08lx\n", entry.val);
870     goto out;
871 }

As explained earlier, swp_entry_t is really a 32-bit unsigned integer. Its content can never be all zeros, but its lowest bit is always 0. The high 24-bit field, offset, is the page number on the device, and the field type is in fact the index of the swap device itself. As was also pointed out earlier, type here has nothing to do with a "type" in the usual sense. Using it as a subscript into the kernel array swap_info gives the swap_info_struct of the device in question, and the array swap_map[] in that structure records the share count of each page on the device. Since the page being handled is already on the swap device, its count cannot be 0, otherwise something is wrong; on the other hand, the count must not go beyond SWAP_MAP_MAX after the increment.

Back in try_to_swap_out(), line 100 calls set_pte() to put the entry that points to the on-disk page into the page table entry. The mapping to a memory page has thereby become a mapping to an on-disk page. By the time execution reaches the label drop_pte, the mapping of the target page to physical memory has been broken, so the resident set of the process shrinks by one page.
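The checks made by swap_duplicate() can be tried out in user space. What follows is only a minimal sketch, not kernel code. It assumes the swap entry layout described above with the device index in bits 1 to 6 and the page offset from bit 8 upward (an assumption about the i386 encoding), and every toy_ name, the device count and the limit TOY_SWAP_MAP_MAX are hypothetical stand-ins chosen for illustration.

```c
/* User-space model of the validity checks and the saturating share count.
 * All toy_ names and constants are hypothetical; this is not kernel code. */
typedef struct { unsigned long val; } toy_swp_entry_t;

#define TOY_SWP_TYPE(x)    (((x).val >> 1) & 0x3f) /* assumed: bits 1-6 = device index */
#define TOY_SWP_OFFSET(x)  ((x).val >> 8)          /* assumed: bits 8+  = page number  */
#define TOY_SWAP_MAP_MAX   0x7fff                  /* illustrative saturation limit    */
#define TOY_NR_DEVICES     2
#define TOY_DEVICE_PAGES   16

struct toy_swap_device {
    unsigned long max;                          /* number of page slots     */
    unsigned short swap_map[TOY_DEVICE_PAGES];  /* share count of each slot */
};

struct toy_swap_device toy_swap_info[TOY_NR_DEVICES] = {
    { TOY_DEVICE_PAGES, {0} },
    { TOY_DEVICE_PAGES, {0} },
};

/* Build an entry; bit 0 stays clear, like a non-present page table entry. */
toy_swp_entry_t toy_make_entry(unsigned long type, unsigned long offset)
{
    toy_swp_entry_t e = { (type << 1) | (offset << 8) };
    return e;
}

/* Returns 1 and raises the slot's count if the entry is valid, else 0. */
int toy_swap_duplicate(toy_swp_entry_t entry)
{
    struct toy_swap_device *p;
    unsigned long type, offset;

    if (!entry.val)                     /* entry 0 is illegal */
        return 0;
    type = TOY_SWP_TYPE(entry);
    if (type >= TOY_NR_DEVICES)         /* no such device */
        return 0;
    p = &toy_swap_info[type];
    offset = TOY_SWP_OFFSET(entry);
    if (offset >= p->max)               /* beyond the end of the device */
        return 0;
    if (!p->swap_map[offset])           /* slot is not in use */
        return 0;
    if (p->swap_map[offset] < TOY_SWAP_MAP_MAX)
        p->swap_map[offset]++;          /* otherwise stay pinned at the limit */
    return 1;
}
```

Marking a slot as in use (count 1) and then calling toy_swap_duplicate() on its entry raises the count to 2, while an entry for an unused slot is rejected, which mirrors the bad_unused path in the listing.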
Since one mapping of this physical page has just been broken, the page may well satisfy the conditions for becoming inactive. deactivate_page() is therefore called to see whether the page can be put into the inactive state, in which case its page structure is moved from the active list to an inactive list. The code is in mm/swap.c:

[kswapd() > do_try_to_free_pages() > refill_inactive() > swap_out() > swap_out_mm() > swap_out_vma() > swap_out_pgd() > swap_out_pmd() > try_to_swap_out() > deactivate_page()]

189 void deactivate_page(struct page * page)
190 {
191     spin_lock(&pagemap_lru_lock);
192     deactivate_page_nolock(page);
193     spin_unlock(&pagemap_lru_lock);
194 }

[kswapd() > do_try_to_free_pages() > refill_inactive() > swap_out() > swap_out_mm() > swap_out_vma() > swap_out_pgd() > swap_out_pmd() > try_to_swap_out() > deactivate_page() > deactivate_page_nolock()]

154 /**
155  * (de)activate_page - move pages from/to active and inactive lists
156  * @page: the page we want to move
157  * @nolock - are we already holding the pagemap_lru_lock?
158  *
159  * Deactivate_page will move an active page to the right
160  * inactive list, while activate_page will move a page back
161  * from one of the inactive lists to the active list. If
162  * called on a page which is not on any of the lists, the
163  * page is left alone.
164  */
165 void deactivate_page_nolock(struct page * page)
166 {
167     /*
168      * One for the cache, one for the extra reference the
169      * caller has and (maybe) one for the buffers.
170      *
171      * This isn't perfect, but works for just about everything.
172      * Besides, as long as we don't move unfreeable pages to the
173      * inactive_clean list it doesn't need to be perfect...
174      */
175     int maxcount = (page->buffers ? 3 : 2);
176     page->age = 0;
177     ClearPageReferenced(page);
178
179     /*
180      * Don't touch it if it's not on the active list.
181      * (some pages aren't on any list at all)
182      */
183     if (PageActive(page) && page_count(page) <= maxcount && !page_ramdisk(page)) {
184         del_page_from_active_list(page);
185         add_page_to_inactive_dirty_list(page);
186     }
187 }

The page structure of a physical page contains a counter, count. It is 0 for a free page and is set to 1 when the page is allocated (see the code of __alloc_pages() and rmqueue()). After that, each additional "user" of the page, for instance a mapping that is set up or restored, adds 1 to count. A count of 2 at this point therefore means that the mapping just broken was the last mapping of this physical page, and a page whose last mapping has been broken is certainly inactive. So 2 is the reference value for the test; this is maxcount.

There is a special case, though. The page may have been mapped to an ordinary file through mmap(), and that file may also have been opened and be accessed through regular file operations, so that the page doubles as a buffer for reading and writing the file. The page is then divided into several buffers, the pointer buffers in its page structure points to a queue of buffer_head structures, and that queue counts as one more "user" of the page. Hence, when page->buffers is non-zero, maxcount is 3.

Another special case is a memory page that belongs to a ramdisk, that is, part of physical memory used to emulate a disk. Such a page never becomes inactive. On the basis of these conditions deactivate_page_nolock() decides whether the page should leave the active list.

Once the decision has been made, the page structure has to be moved from active_list, the LRU queue of active pages, to an inactive queue. There are two kinds of inactive queue, however. One holds "dirty" pages, whose content has changed since it was last written to the swap device. The other holds "clean" pages. There is only one inactive "dirty" queue, inactive_dirty_list, whereas there are many inactive "clean" queues: every page zone has its own inactive_clean_list. So which queue should a page go to when it turns from active to inactive? The first step is always the "dirty" queue. Unlinking a page structure from the active list is done by the macro del_page_from_active_list(), defined in include/linux/swap.h:

234 #define del_page_from_active_list(page) { \
235     list_del(&(page)->lru); \
236     ClearPageActive(page); \
237     nr_active_pages--; \
238     DEBUG_ADD_PAGE \
239     ZERO_PAGE_BUG \
240 }
Linking the page structure into the inactive "dirty" queue is done by the macro add_page_to_inactive_dirty_list():

217 #define add_page_to_inactive_dirty_list(page) { \
218     DEBUG_ADD_PAGE \
219     ZERO_PAGE_BUG \
220     SetPageInactiveDirty(page); \
221     list_add(&(page)->lru, &inactive_dirty_list); \
222     nr_inactive_dirty_pages++; \
223     page->zone->inactive_dirty_pages++; \
224 }

ClearPageActive() and SetPageInactiveDirty() clear the PG_active flag of the page structure to 0 and set its PG_inactive_dirty flag to 1, respectively. Note that the use count in the page structure does not change in the process.

Back in try_to_swap_out(): now that a mapping to the memory page has been broken, the use count of the page must be decremented. This is done by the macro page_cache_release(), which comes down to __free_pages():

    #define page_cache_release(x)    __free_page(x)

379 #define __free_page(page)    __free_pages((page), 0)

__free_pages() decrements the use count and actually frees the page only when the count reaches 0. try_to_swap_out() then returns 0, because no memory has really been freed yet, and the levels above carry on with the next page. This goes on until a page is really disposed of, the end of the address space is reached, or mm->swap_cnt has dropped to 0 (see line 52). One call of swap_out_mm() can therefore work through many pages of one process.

What if the page structure is not on swapper_space's queue? Then either no image of the page has been set up on the swap device yet, or the page comes from a file. Recall how a page fault on an unmapped page is handled: it depends on whether the area containing the page provides a vm_operations_struct structure and, through the function pointer nopage in that structure, a specific operation. If a nopage operation is provided, the pages of the area come from a file, and the position of a page within the file can be computed from the virtual address. Otherwise the page is an ordinary one for which no on-disk page exists yet (its page table entry was 0), and a memory page was simply allocated and mapped. How such pages are treated is shown by the next part of try_to_swap_out():

[kswapd() > do_try_to_free_pages() > refill_inactive() > swap_out() > swap_out_mm() > swap_out_vma() > swap_out_pgd() > swap_out_pmd() > try_to_swap_out()]

110     /*
111      * Is it a clean page? Then it must be recoverable
112      * by just paging it in again, and we can just drop
113      * it..
114      *
115      * However, this won't actually free any real
116      * memory, as the page will just be in the page cache
117      * somewhere, and as such we should just continue
118      * our scan.
119      *
120      * Basically, this just makes it possible for us to do
121      * some real work in the future in "refill_inactive()".
122      */
123     flush_cache_page(vma, address);
124     if (!pte_dirty(pte))
125         goto drop_pte;
126
127     /*
128      * Ok, it's really dirty. That means that
129      * we should either create a new swap cache
130      * entry for it, or we should write it back
131      * to its own backing store.
132      */
133     if (page->mapping) {
134         set_page_dirty(page);
135         goto drop_pte;
136     }
137
138     /*
139      * This is a dirty, swappable page.  First of all,
140      * get a suitable swap entry for it, and make sure
141      * we have the swap cache set up to associate the
142      * page with that swap entry.
143      */
144     entry = get_swap_page();
145     if (!entry.val)
146         goto out_unlock_restore; /* No swap space left */
147
148     /* Add it to the swap cache and mark it dirty */
149     add_to_swap_cache(page, entry);
150     set_page_dirty(page);
151     goto set_swap_pte;
152
153 out_unlock_restore:
154     set_pte(page_table, pte);
155     UnlockPage(page);
156     return 0;
157 }

pte_dirty() is an inline function, defined in include/asm-i386/pgtable.h:

269 static inline int pte_dirty(pte_t pte) { return (pte).pte_low & _PAGE_DIRTY; }

A page table entry has a "D" flag (_PAGE_DIRTY). When the i386 CPU writes to a page, it automatically sets this flag in the corresponding entry to 1, marking the page as "dirty". If the flag is 0, the memory page has not been written to. Such a page, not being in the swap cache, has not been modified since its mapping was last established. If it was mapped from a file through mmap(), its content still matches the file and nothing needs to be written back. The mapping can therefore simply be removed rather than replaced (the entry was already cleared at line 83), and control goes to the label drop_pte, where the deactivate_page() call seen above is made. Note that the page table entry stays 0 from line 83 on, while page_cache_release() drops the reference that the mapping held.

Further down, if the page belongs to a file mapping created by mmap(), the pointer mapping in its page structure points to the corresponding address_space structure. For such a page, if _PAGE_DIRTY in the page table entry is 1, then before going to drop_pte the PG_dirty flag in the page structure is set to 1 and the page is moved to the "dirty" page queue of that file mapping.
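The order of the tests that try_to_swap_out() applies after it has cleared the page table entry can be summarised as a small decision function. This is a sketch under stated assumptions rather than kernel code: the structure toy_page_state and the enum values are hypothetical names introduced here, and each field stands for one of the tests in the listing (PageSwapCache(), pte_dirty(), page->mapping, and whether get_swap_page() returned a non-zero entry).

```c
/* Hypothetical summary of the branch order in try_to_swap_out(). */
enum toy_action {
    TOY_REUSE_SWAP_ENTRY,  /* already in the swap cache: point the pte at its entry */
    TOY_DROP_CLEAN,        /* clean page: just drop the mapping                      */
    TOY_MARK_FILE_DIRTY,   /* dirty file-backed page: mark dirty, then drop          */
    TOY_ALLOC_SWAP_ENTRY,  /* dirty anonymous page: give it a swap slot              */
    TOY_RESTORE_PTE        /* no swap space left: put the old pte back               */
};

struct toy_page_state {
    int in_swap_cache;     /* stands for PageSwapCache(page)            */
    int pte_dirty;         /* stands for pte_dirty(pte)                 */
    int has_mapping;       /* stands for page->mapping != NULL          */
    int swap_space_left;   /* stands for get_swap_page() succeeding     */
};

enum toy_action toy_classify(const struct toy_page_state *s)
{
    if (s->in_swap_cache)
        return TOY_REUSE_SWAP_ENTRY;
    if (!s->pte_dirty)
        return TOY_DROP_CLEAN;
    if (s->has_mapping)
        return TOY_MARK_FILE_DIRTY;
    if (!s->swap_space_left)
        return TOY_RESTORE_PTE;
    return TOY_ALLOC_SWAP_ENTRY;
}
```

The order matters: the swap cache test comes first, so a page that already has an image on the swap device never gets a second slot, and the dirty test precedes the mapping test, so a clean file page is dropped without being queued for write-back.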
The code of set_page_dirty() is in include/linux/mm.h and mm/filemap.c:

[kswapd() > do_try_to_free_pages() > refill_inactive() > swap_out() > swap_out_mm() > swap_out_vma() > swap_out_pgd() > swap_out_pmd() > try_to_swap_out() > set_page_dirty()]

187 static inline void set_page_dirty(struct page * page)
188 {
189     if (!test_and_set_bit(PG_dirty, &page->flags))
190         __set_page_dirty(page);
191 }

134 /*
135  * Add a page to the dirty page list.
136  */
137 void __set_page_dirty(struct page *page)
138 {
139     struct address_space *mapping = page->mapping;
140
141     spin_lock(&pagecache_lock);
142     list_del(&page->list);
143     list_add(&page->list, &mapping->dirty_pages);
144     spin_unlock(&pagecache_lock);
145
146     mark_inode_dirty_pages(mapping->host);
147 }

Back to try_to_swap_out() once more. A page that reaches this point is certainly "dirty", is not on swapper_space's swap-in/swap-out queue, and does not belong to a file mapping, yet it is swappable. It is an ordinary "dirty" page that has no image on disk yet. The first thing to do is to allocate an on-disk page through get_swap_page(), which is a macro:

150 #define get_swap_page() __get_swap_page(1)

That is, __get_swap_page() allocates a page on the swap device. Its code is in mm/swapfile.c and is left to the reader. The use count of the on-disk page is set to 1 at allocation, is raised to 2 by swap_duplicate() at line 99, and is lowered again through swap_free() when a user goes away. If no on-disk page can be allocated, control goes to out_unlock_restore, where the original mapping is restored.

After a successful allocation, add_to_swap_cache() links the page into swapper_space's queue; that function was covered earlier. set_page_dirty() then moves the page to the "dirty" queue. The actual write-out is left to page_launder(). This completes try_to_swap_out().

swap_out() calls swap_out_mm() inside a for loop, and refill_inactive() in turn calls swap_out() inside a while loop, until the required number of pages has been dealt with or no further progress can be made. With that, do_try_to_free_pages() is done. Back in kswapd(), what follows is a call to refill_inactive_scan() for some background scanning, after which kswapd() has finished one routine pass. Before going to sleep again, kswapd() checks whether the shortage of memory pages has been relieved; if it has, wake_up_all() wakes the processes that are sleeping while they wait for memory.
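How swap_out() chooses the process it hands to swap_out_mm() can be reproduced in a few lines. The sketch below is a user-space model under stated assumptions: an array of hypothetical toy_mm records stands in for the list of mm_struct structures, the constants copy the values 5 and 8 from the listing of swap_out(), and the function returns an array index instead of a pointer.

```c
/* User-space model of the two-pass selection in swap_out().
 * toy_mm and toy_pick_victim are hypothetical names. */
#define TOY_SWAP_SHIFT 5
#define TOY_SWAP_MIN   8

struct toy_mm {
    long rss;               /* pages currently resident              */
    unsigned long swap_cnt; /* pages still to examine in this round  */
};

/* Returns the index of the record with the largest swap_cnt, refreshing
 * every swap_cnt from rss when a first pass finds none, or -1 when no
 * record has resident pages. */
int toy_pick_victim(struct toy_mm *mms, int n)
{
    int assign = 0;

    for (;;) {
        unsigned long max_cnt = 0;
        int best = -1, found = 0, i;

        for (i = 0; i < n; i++) {
            if (mms[i].rss <= 0)
                continue;                       /* nothing resident */
            found++;
            if (assign) {                       /* second pass: start a new round */
                mms[i].swap_cnt = (unsigned long)(mms[i].rss >> TOY_SWAP_SHIFT);
                if (mms[i].swap_cnt < TOY_SWAP_MIN)
                    mms[i].swap_cnt = TOY_SWAP_MIN;
            }
            if (mms[i].swap_cnt > max_cnt) {
                max_cnt = mms[i].swap_cnt;
                best = i;
            }
        }
        if (best >= 0)
            return best;
        if (!assign && found > 0) {
            assign = 1;                         /* every quota is used up: refresh */
            continue;
        }
        return -1;
    }
}
```

With two records holding 640 and 64 resident pages and all quotas at 0, the first call refreshes the quotas to 20 and 8 and picks the larger process. Once that process's swap_cnt has been set back to 0, as swap_out_mm() does at the end of the address space, the next call picks the smaller one, which is the "take turns" part of the policy.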
The reader may wonder about these loops: swap_out_mm() and swap_out() are called repeatedly, so what happens if no page can be obtained however often they are called? Does refill_inactive() ever come to an end? In practice it does. Each level limits the amount of scanning it does, and refill_inactive() gives up once it has gone through all priority levels without making progress. In the extreme case, when memory pages really cannot be found any other way, the last resort is oom_kill(), which kills a process and so sacrifices a part to save the whole.

Having gone through kswapd, we turn to the other kernel thread, kreclaimd. Its code is also in mm/vmscan.c:

1095 DECLARE_WAIT_QUEUE_HEAD(kreclaimd_wait);
1096 /*
1097  * Kreclaimd will move pages from the inactive_clean list to the
1098  * free list, in order to keep atomic allocations possible under
1099  * all circumstances. Even when kswapd is blocked on IO.
1100  */
1101 int kreclaimd(void *unused)
1102 {
1103     struct task_struct *tsk = current;
1104     pg_data_t *pgdat;
1105
1106     tsk->session = 1;
1107     tsk->pgrp = 1;
1108     strcpy(tsk->comm, "kreclaimd");
1109     sigfillset(&tsk->blocked);
1110     current->flags |= PF_MEMALLOC;
1111
1112     while (1) {
1113
1114         /*
1115          * We sleep until someone wakes us up from
1116          * page_alloc.c::__alloc_pages().
1117          */
1118         interruptible_sleep_on(&kreclaimd_wait);
1119
1120         /*
1121          * Move some pages from the inactive_clean lists to
1122          * the free lists, if it is needed.
1123          */
1124         pgdat = pgdat_list;
1125         do {
1126             int i;
1127             for(i = 0; i < MAX_NR_ZONES; i++) {
1128                 zone_t *zone = pgdat->node_zones + i;
1129                 if (!zone->size)
1130                     continue;
1131
1132                 while (zone->free_pages < zone->pages_low) {
1133                     struct page * page;
1134                     page = reclaim_page(zone);
1135                     if (!page)
1136                         break;
1137                     __free_page(page);
1138                 }
1139             }
1140             pgdat = pgdat->node_next;
1141         } while (pgdat);
1142     }
1143 }

Much of this resembles kswapd() and needs no further comment. Note that the PF_MEMALLOC flag in the flags field of the thread's task_struct is set to 1, which marks this kernel thread as one that works on behalf of memory allocation. We saw earlier that __alloc_pages() wakes up kreclaimd as well as kswapd. Each time it is woken, kreclaimd calls reclaim_page() to go through the inactive "clean" queues of the page zones, reclaims memory pages from them and frees them. The code of reclaim_page() is in mm/vmscan.c.
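The inner loop of kreclaimd() tops up each zone only as far as its low-water mark and stops early when nothing more can be reclaimed. A minimal user-space sketch of that behaviour follows; the structure toy_zone, the counter that stands in for the inactive "clean" queue, and the function names are all hypothetical, and reclaiming a page is reduced to decrementing that counter.

```c
/* User-space model of one wake-up of kreclaimd(); all names are hypothetical. */
struct toy_zone {
    unsigned long size;                 /* 0 means the zone is empty          */
    unsigned long free_pages;           /* pages on the free lists            */
    unsigned long pages_low;            /* low-water mark                     */
    unsigned long inactive_clean_pages; /* pages that can be reclaimed at once */
};

/* Stand-in for reclaim_page(): 1 if a clean page could be taken, else 0. */
int toy_reclaim_page(struct toy_zone *z)
{
    if (z->inactive_clean_pages == 0)
        return 0;
    z->inactive_clean_pages--;
    return 1;
}

/* One pass over all zones; returns the number of pages moved to the free lists. */
unsigned long toy_kreclaimd_pass(struct toy_zone *zones, int nr_zones)
{
    unsigned long moved = 0;
    int i;

    for (i = 0; i < nr_zones; i++) {
        struct toy_zone *z = &zones[i];

        if (!z->size)
            continue;                   /* skip empty zones */
        while (z->free_pages < z->pages_low) {
            if (!toy_reclaim_page(z))
                break;                  /* nothing left to reclaim here */
            z->free_pages++;            /* stands for __free_page(page) */
            moved++;
        }
    }
    return moved;
}
```

A zone with 2 free pages, a low-water mark of 5 and 10 clean pages ends the pass with 5 free pages and 7 clean ones, while a zone with too few clean pages is simply left below its mark until pages are supplied from elsewhere.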
The listing follows; the reader should be able to work through it without help:

[kreclaimd() > reclaim_page()]

381 /**
382  * reclaim_page - reclaims one page from the inactive_clean list
383  * @zone: reclaim a page from this zone
384  *
385  * The pages on the inactive_clean can be instantly reclaimed.
386  * The tests look impressive, but most of the time we'll grab
387  * the first page of the list and exit successfully.
388  */
389 struct page * reclaim_page(zone_t * zone)
390 {
391     struct page * page = NULL;
392     struct list_head * page_lru;
393     int maxscan;
394
395     /*
396      * We only need the pagemap_lru_lock if we don't reclaim the page,
397      * but we have to grab the pagecache_lock before the pagemap_lru_lock
398      * to avoid deadlocks and most of the time we'll succeed anyway.
399      */
400     spin_lock(&pagecache_lock);
401     spin_lock(&pagemap_lru_lock);
402     maxscan = zone->inactive_clean_pages;
403     while ((page_lru = zone->inactive_clean_list.prev) !=
404             &zone->inactive_clean_list && maxscan--) {
405         page = list_entry(page_lru, struct page, lru);
406
407         /* Wrong page on list?! (list corruption, should not happen) */
408         if (!PageInactiveClean(page)) {
409             printk("VM: reclaim_page, wrong page on list.\n");
410             list_del(page_lru);
411             page->zone->inactive_clean_pages--;
412             continue;
413         }
414
415         /* Page is or was in use?  Move it to the active list. */
416         if (PageTestandClearReferenced(page) || page->age > 0 ||
417                 (!page->buffers && page_count(page) > 1)) {
418             del_page_from_inactive_clean_list(page);
419             add_page_to_active_list(page);
420             continue;
421         }
422
423         /* The page is dirty, or locked, move to inactive_dirty list. */
424         if (page->buffers || PageDirty(page) || TryLockPage(page)) {
425             del_page_from_inactive_clean_list(page);
426             add_page_to_inactive_dirty_list(page);
427             continue;
428         }
429
430         /* OK, remove the page from the caches. */
431         if (PageSwapCache(page)) {
432             __delete_from_swap_cache(page);
433             goto found_page;
434         }
435
436         if (page->mapping) {
437             __remove_inode_page(page);
438             goto found_page;
439         }
440
441         /* We should never ever get here. */
442         printk(KERN_ERR "VM: reclaim_page, found unknown page\n");
443         list_del(page_lru);
444         zone->inactive_clean_pages--;
445         UnlockPage(page);
446     }
447     /* Reset page pointer, maybe we encountered an unfreeable page. */
448     page = NULL;
449     goto out;
450
451 found_page:
452     del_page_from_inactive_clean_list(page);
453     UnlockPage(page);
454     page->age = PAGE_AGE_START;
455     if (page_count(page) != 1)
456         printk("VM: reclaim_page, found page with count %d!\n",
457                 page_count(page));
458 out:
459     spin_unlock(&pagemap_lru_lock);
460     spin_unlock(&pagecache_lock);
461     memory_pressure++;
462     return page;
463 }

2.9 Swapping Pages In

When the i386 CPU translates a linear address into a physical address, it may find that the mapping for the address has been set up but that the P (Present) flag in the page table entry or page directory entry is 0. This means that the physical page is not in memory, and the memory access cannot be completed. On the surface this looks different from the case discussed earlier, in which no mapping has been established at all. For the hardware the two are the same. The MMU of the CPU does not distinguish between them: whenever the P flag is 0, the translation is considered to have failed and the CPU raises a page fault. In fact, once the CPU finds P to be 0 it pays no attention to the other bits of the entry, so what those bits mean and how they are used is entirely up to software. Telling apart a page that is not in memory from a mapping that does not exist is therefore the job of the fault handler. In the scenario of the out-of-bounds access we already saw the first lines of the function handle_pte_fault():

[do_page_fault() > handle_mm_fault() > handle_pte_fault()]

1153 static inline int handle_pte_fault(struct mm_struct *mm,
1154     struct vm_area_struct * vma, unsigned long address,
1155     int write_access, pte_t * pte)
1156 {
1157     pte_t entry;
1158
1159     /*
1160      * We need the page table lock to synchronize with kswapd
1161      * and the SMP-safe atomic PTE updates.
1162      */
1163     spin_lock(&mm->page_table_lock);
1164     entry = *pte;
1165     if (!pte_present(entry)) {
1166         /*
1167          * If it truly wasn't present, we know that kswapd
1168          * and the PTE updates will not touch it later. So
1169          * drop the lock.
1170          */
1171         spin_unlock(&mm->page_table_lock);
1172         if (pte_none(entry))
1173             return do_no_page(mm, vma, address, write_access, pte);
1174         return do_swap_page(mm, vma, address, pte, pte_to_swp_entry(entry), write_access);
1175     }

The first test is pte_present(), which checks the P flag of the page table entry to see whether the physical page it refers to is in memory. If the page is not in memory, pte_none() checks further whether the entry is empty, that is, all zeros. An empty entry means that no mapping has been established, so do_no_page() is called; that function was covered in an earlier scenario. A non-empty entry means that the mapping exists and only the physical page is missing from memory, so do_swap_page() is called to bring the page in from the swap device. Everything that happens in this scenario before handle_pte_fault(), including the path taken, is the same as in the out-of-bounds scenario, so we go straight into do_swap_page(). Its code is in mm/memory.c:

[do_page_fault() > handle_mm_fault() > handle_pte_fault() > do_swap_page()]

1018 static int do_swap_page(struct mm_struct * mm,
1019     struct vm_area_struct * vma, unsigned long address,
1020     pte_t * page_table, swp_entry_t entry, int write_access)
1021 {
1022     struct page *page = lookup_swap_cache(entry);
1023     pte_t pte;
1024
1025     if (!page) {
1026         lock_kernel();
1027         swapin_readahead(entry);
1028         page = read_swap_cache(entry);
1029         unlock_kernel();
1030         if (!page)
1031             return -1;
1032
1033         flush_page_to_ram(page);
1034         flush_icache_page(vma, page);
1035     }
1036
1037     mm->rss++;
1038
1039     pte = mk_pte(page, vma->vm_page_prot);
1040
1041     /*
1042      * Freeze the "shared"ness of the page, ie page_count + swap_count.
1043      * Must lock page before transferring our swap count to already
1044      * obtained page count.
1045      */
1046     lock_page(page);
1047     swap_free(entry);
1048     if (write_access && !is_page_shared(page))
1049         pte = pte_mkwrite(pte_mkdirty(pte));
1050     UnlockPage(page);
1051
1052     set_pte(page_table, pte);
1053     /* No need to invalidate - it was non-present before */
1054     update_mmu_cache(vma, address, pte);
1055     return 1;    /* Minor fault */
1056 }

Let us look first at the parameters passed in. The reader is advised to go back over the earlier scenarios and retrace the path from the moment the CPU raises the exception to this point, so that the origin of each parameter is clear. The parameters mm, vma and address are self-explanatory: they are a pointer to the mm_struct structure of the current process,
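a pointer to the vm_area_struct structure of the area concerned, and the linear address for which the translation failed. The parameter page_table points to the page table entry that failed to map, and entry is the content of that entry. As said before, when the physical page is in memory the page table entry is a pte_t structure that points to a memory page, and when the page is not in memory it is a swp_entry_t structure that points to an on-disk page. Both are really 32-bit unsigned integers. It should be stressed that "not in memory" is meant in the logical sense, from the point of view of the CPU's mapping hardware. The page may in fact still be on an inactive queue, or even on the active list. The last parameter, write_access, gives the kind of access (read or write) that was being made when the mapping failed. It is determined in the switch statement of do_page_fault() (see arch/i386/mm/fault.c) from bit 1 of the error code error_code supplied by the CPU (note that in that switch statement there is no break between "default:" and "case 2:"), and is passed down level by level from there.

Since the physical page is not in memory, entry points to an on-disk page. Its layout was given earlier: the lowest bit (the P flag) is 0, indicating that the page is not in memory, and the high 24 bits hold the page number on the device. The first page of a swap device (page number 0) is never used for swapping, so entry cannot be all zeros. That is what makes it possible to tell an empty page table entry from one whose page is not in memory.

Knowing where the page is on the swap device, bringing it in would seem straightforward. It is not quite so. The page may still be on swapper_space's swap-in/swap-out queue, not yet finally released, in which case a good deal of work is saved. So lookup_swap_cache() is called first. Its code is in swap_state.c and is left to the reader.

If the page is not found there, the memory page formerly used for this virtual page has been released and its content now exists only on disk. read_swap_cache() is then used to allocate a memory page and read the content in from disk. Why is swapin_readahead() called before that? Reading one page at a time from a disk is uneconomical. Every read requires a seek to bring the head to the right position, and it is the seek that takes the time; once the head is in position, reading several pages takes about as long as reading one. So if the disk has to be read anyway, a few more pages are read along with the one needed. Such a group is called a page cluster. Since not all of the pages read are needed immediately, this is read-ahead. Pages read ahead are linked for the time being into the active list and into swapper_space's swap-in/swap-out queue. If they turn out not to be needed, kswapd and kreclaimd will reclaim them step by step after a while. By the time read_swap_cache() is called, then, the page wanted is normally already on the active list and only has to be found. It is also possible, however, that the read-ahead failed because not enough memory pages could be allocated. In that case the page really has to be read again, and this time it must be read in. The careful reader may ask: if the allocation failed during read-ahead, can it succeed the second time? It can, because the circumstances may have changed in between.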
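To make the notion of a page cluster concrete, the sketch below computes which on-disk page slots a read-ahead would cover for a faulting offset. It rests on assumptions that the text does not spell out: that a cluster is an aligned group of 2^k consecutive slots, that slot 0 is skipped because the first page of a swap device is not used for swapping, and that the window is cut off at the end of the device. The names toy_window and toy_readahead_window are hypothetical.

```c
/* Hypothetical helper: which slots does a read-ahead around `offset` cover? */
struct toy_window {
    unsigned long first;  /* first slot to read        */
    unsigned long count;  /* number of slots to read   */
};

struct toy_window toy_readahead_window(unsigned long offset,
                                       unsigned int cluster_shift,
                                       unsigned long max_slots)
{
    struct toy_window w;
    unsigned long size = 1UL << cluster_shift;
    unsigned long end;

    /* Round down to the start of the cluster that contains offset. */
    w.first = (offset >> cluster_shift) << cluster_shift;
    end = w.first + size;
    if (end > max_slots)
        end = max_slots;       /* do not run past the end of the device */
    if (w.first == 0)
        w.first = 1;           /* slot 0 is never used for swapping */
    w.count = (end > w.first) ? end - w.first : 0;
    return w;
}
```

For a cluster of 16 slots, a fault on slot 21 yields the window 16 to 31, so the page that is needed and fifteen of its neighbours are fetched after a single seek.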
Recall the following fragment of __alloc_pages():

382      wakeup_kswapd(0);
383      if (gfp_mask & __GFP_WAIT) {
384          __set_current_state(TASK_RUNNING);
385          current->policy |= SCHED_YIELD;
386          schedule();
387      }

Both swapin_readahead() and read_swap_cache() allocate memory with the __GFP_WAIT bit of gfp_mask set to 1. When no page can be allocated, kswapd is woken up and the current process calls schedule() to give up the CPU, so that kswapd gets a chance to run and free some pages. Thus, even if allocation failed inside swapin_readahead(), the attempt made in read_swap_cache() may well succeed. If it still fails, nothing more can be done and do_swap_page() returns -1 at line 1031. In the normal case swapin_readahead() succeeds and read_swap_cache() simply finds the page in the swap cache. read_swap_cache() is in fact read_swap_cache_async() called with the parameter wait set to 1, meaning that the caller waits until the page has actually been read in:

125  #define read_swap_cache(entry) read_swap_cache_async(entry, 1);

The code of read_swap_cache_async() is in mm/swap_state.c:

[do_page_fault() > handle_mm_fault() > handle_pte_fault() > do_swap_page() > read_swap_cache_async()]

204  /*
205   * Locate a page of swap in physical memory, reserving swap cache space
206   * and reading the disk if it is not already cached.  If wait==0, we are
207   * only doing readahead, so don't worry if the page is already locked.
208   *
209   * A failure return means that either the page allocation failed or that
210   * the swap entry is no longer in use.
211   */
212
213  struct page * read_swap_cache_async(swp_entry_t entry, int wait)
214  {
215      struct page *found_page = 0, *new_page;
216      unsigned long new_page_addr;
217
218      /*
219       * Make sure the swap entry is still in use.
220       */
221      if (!swap_duplicate(entry))    /* Account for the swap cache */
222          goto out;
223      /*
224       * Look for the page in the swap cache.
225       */
226      found_page = lookup_swap_cache(entry);
227      if (found_page)
228          goto out_free_swap;
229
230      new_page_addr = __get_free_page(GFP_USER);
231      if (!new_page_addr)
232          goto out_free_swap;    /* Out of memory */
233      new_page = virt_to_page(new_page_addr);
234
235      /*
236       * Check the swap cache again, in case we stalled above.
237       */
238      found_page = lookup_swap_cache(entry);
239      if (found_page)
240          goto out_free_page;
241      /*
242       * Add it to the swap cache and read its contents.
243       */
244      lock_page(new_page);
245      add_to_swap_cache(new_page, entry);
246      rw_swap_page(READ, new_page, wait);
247      return new_page;
248
249  out_free_page:
250      page_cache_release(new_page);
251  out_free_swap:
252      swap_free(entry);
253  out:
254      return found_page;
255  }

The reader may have noticed that lookup_swap_cache() is called twice here. The first call is needed because swapin_readahead() may already have read the page in, in which case it is found in the swapper_space queue. The second call is needed because __get_free_page() may block while waiting for memory, and another process may bring the same page in during that time. If a new page really has to be read, add_to_swap_cache() links its page structure into the swapper_space queue and the active list, as seen before, and rw_swap_page() reads the contents from the swap device. After read_swap_cache() returns, the page is therefore in the swapper_space queue and on the active list.

Now consider the use counts. Take the page on the swap device first. At line 221, swap_duplicate() increments the count of the swap page. On the failure paths this is undone by swap_free() at line 252; on the success path swap_free() is not called, so the count stays incremented by 1. Later, in do_swap_page(), swap_free() is called at line 1047 and decrements the count by 1. The increment accounts for the reference now held by the swap cache. The decrement reflects the fact that the page table entry of the process no longer refers to the swap page but to a memory page; it balances the increment made when the page was swapped out (see line 99 of try_to_swap_out()). Next, consider the use count in the page structure.
When a memory page is allocated its use count is set to 1. add_to_swap_cache() then links it into the swap cache queue and the LRU active list, and in doing so add_to_page_cache_locked() calls page_cache_get(), which increments the count. A page that has been swapped in and is also mapped into user space therefore has a use count of 2. Pages that are read ahead are treated differently, as the code of swapin_readahead() shows:

[do_page_fault() > handle_mm_fault() > handle_pte_fault() > do_swap_page() > swapin_readahead()]

990   void swapin_readahead(swp_entry_t entry)
991   {
      ......
1001      for (i = 0; i < num; offset++, i++) {
      ......
1009          /* Ok, do the async read-ahead now */
1010          new_page = read_swap_cache_async(SWP_ENTRY(SWP_TYPE(entry), offset), 0);
1011          if (new_page != NULL)
1012              page_cache_release(new_page);
1013          swap_free(SWP_ENTRY(SWP_TYPE(entry), offset));
1014      }
1015      return;
1016  }

swapin_readahead() reads the pages of the cluster one by one in a loop, calling read_swap_cache_async() for each. On return from read_swap_cache_async() the use count of each page is 2. A page that is merely read ahead has no mapping in user space, so page_cache_release() is called to decrement the count, leaving it at 1. Pages read ahead thus sit on the active list with a use count of 1. If the process later does need such a page, the count rises to 2. Otherwise the read-ahead was wasted and the page will eventually be moved from the active list to an inactive queue by refill_inactive_scan() (see line 744 of mm/vmscan.c) and reclaimed.

Back in do_swap_page(), flush_page_to_ram() and flush_icache_page() are empty operations on the i386. pte_mkdirty() sets the D bit of the page table entry to 1, marking the page as "dirty", and pte_mkwrite() sets the _PAGE_RW bit to 1. The reader may ask whether the code checks that the page is allowed to be written at all. It does, but earlier: in case 2 of the switch statement in do_page_fault(), a write to a region that does not permit writing (VM_WRITE is 0) goes to bad_area. Note the difference between the two flags. VM_WRITE is comparatively static: it says that the region may be written. _PAGE_RW is dynamic: it says that this particular page can be written right now. VM_WRITE being 1 is a necessary condition for _PAGE_RW to be 1, but not a sufficient one. Accordingly, the entry built at line 1039 from vma->vm_page_prot has _PAGE_RW set to 0 (VM_WRITE is a bit in vma->vm_flags, not in vma->vm_page_prot). The write bit is turned on only when the faulting access is a write and the page is not shared.
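The decision made at lines 1048 and 1049 of do_swap_page() can be reduced to a few lines. The sketch below is only an illustration: it uses the i386 flag values for the present, read/write and dirty bits, and the function swapin_pte_flags() is a hypothetical name that does not exist in the kernel. It assumes, as the text describes, that the protection bits taken from the region leave the write bit clear.

```c
#include <assert.h>
#include <stdint.h>

/* i386 page table flag bits used in this sketch. */
#define PAGE_PRESENT  0x001u
#define PAGE_RW       0x002u
#define PAGE_DIRTY    0x040u

/* Hypothetical helper: flags for a page that was just swapped in.
 *   prot         - protection bits from the region (write bit clear)
 *   write_access - non-zero if the faulting access was a write
 *   shared       - non-zero if other users still reference the page */
static uint32_t swapin_pte_flags(uint32_t prot, int write_access, int shared)
{
    uint32_t pte = prot;

    /* Corresponds to pte_mkwrite(pte_mkdirty(pte)) in the listing. */
    if (write_access && !shared)
        pte |= PAGE_RW | PAGE_DIRTY;
    return pte;
}
```

Only the combination "write access, page not shared" yields a writable entry; in every other case the page stays write-protected and a later write raises another fault.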
Otherwise the page is mapped without write permission. A later write to it raises another page fault, and this time handle_pte_fault() calls do_wp_page(), which handles the case known as COW (copy_on_write). COW is best explained together with fork(), so do_wp_page() is not examined here. Finally, update_mmu_cache() is an empty function on the i386, because the MMU of an i386 is part of the CPU itself.

2.10 Management of Kernel Buffers

While running, the kernel constantly needs buffers for its own data structures. When a process is created, for example, a task_struct has to be allocated, and it is released again when the process exits. Unlike the page structures, which form a static array set up at initialization, such structures come and go dynamically, and their sizes differ widely. Nor can the kernel keep them on a stack, because their lifetime is not tied to a function call.

One obvious approach is to manage a pool in the same way that malloc() manages the "heap" in user space. For the kernel this has several drawbacks:

● After many allocations and releases of varying sizes the pool becomes fragmented, and a request may fail although enough memory is free in total.
● Kernel data structures usually have to be initialized before use, for example their queue heads. If a released structure could keep its initialized state, that work would not have to be repeated on every allocation.
● The way buffers are organized has a strong effect on the hardware cache. A search strategy such as first fit has to visit many block headers scattered through memory, and each visit may displace useful data from the cache.
● Allocating only in sizes that are powers of two, as the page allocator does, wastes too much space for small structures.

In the 1990s a method called "slab" allocation was introduced in Solaris 2.4, a variant of Unix, and it addresses these problems. Linux adopted the method with some modifications. In the slab scheme every important data structure has its own dedicated buffer queue, and the structures in a queue are treated as "objects": each queue can have a "constructor" and a "destructor", so that objects are initialized when the memory is obtained and remain in a usable state between a release and the next allocation. The queues are built from the kernel's generic doubly linked lists (struct list_head).
Free buffers are not returned to the page allocator at once. As part of do_try_to_free_pages(), kswapd periodically calls kmem_cache_reap(), which collects slabs that have become completely free and releases their pages; this is one more of the jobs that kswapd performs.

A slab is a block of one or more pages that holds objects of a single type, and the name of the method comes from these blocks. How many objects fit depends on their size. An inode structure, for example, takes roughly 300 bytes, so a single page can hold eight or more of them. Objects of one type are always allocated from, and returned to, the slabs of their own queue. Figure 2.8 shows the structure of a slab.

[Figure 2.8: structure of a slab]

The main points are:

● A slab may consist of 1, 2, 4, ... up to 32 physical pages, depending on the size of the objects.
● Each slab has a descriptor of type slab_t, through which the slabs that hold the same type of object are linked into a doubly linked queue.
● The queue is kept in three sections: first the slabs whose objects are all allocated, then the slabs that are partly used, and finally the slabs that are completely free.
● Each slab contains an array of objects, and a chain that links its free objects by their index in the array.
● At the start of each slab there is a small "coloring area". Its size is chosen so that the first object starts on a boundary of a "cache line" of the hardware cache (16 bytes on the 80386, 32 bytes on the Pentium), and it differs from slab to slab within one queue. Objects at the same relative position in different slabs therefore start at different offsets and do not compete for the same cache lines.
● The size of each object is rounded up so that every object starts on a cache line boundary; any space left over at the end of a slab compensates for the coloring area at its start.
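The effect of the coloring area can be shown with a few lines of arithmetic. The following sketch is a simplified illustration, not kernel code: struct toy_cache and next_colour_offset() are hypothetical names, and the three fields stand for "number of colours", "bytes per colour step" and "colour of the next slab". It assumes that each new slab takes the next colour in turn and that the sequence wraps around.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down queue descriptor: only the colouring fields. */
struct toy_cache {
    size_t colour;       /* number of distinct colours               */
    size_t colour_off;   /* bytes per colour step (one cache line)   */
    size_t colour_next;  /* colour to be given to the next new slab  */
};

/* Return the size in bytes of the colouring area for the next slab,
 * then advance the colour, wrapping after `colour` slabs. */
static size_t next_colour_offset(struct toy_cache *c)
{
    size_t offset = c->colour_next * c->colour_off;

    if (++c->colour_next >= c->colour)
        c->colour_next = 0;
    return offset;
}
```

With 3 colours and a 32-byte step, successive slabs start their objects at offsets 0, 32, 64, 0, and so on, so the first objects of neighbouring slabs fall on different cache lines.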
Each slab thus consists of a coloring area, the management data, and the array of objects. The descriptor of a slab, slab_t, is defined in mm/slab.c:

138  /*
139   * slab_t
140   *
141   * Manages the objs in a slab. Placed either at the beginning of mem allocated
142   * for a slab, or allocated from an general cache.
143   * Slabs are chained into one ordered list: fully used, partial, then fully
144   * free slabs.
145   */
146  typedef struct slab_s {
147      struct list_head    list;
148      unsigned long       colouroff;
149      void                *s_mem;     /* including colour offset */
150      unsigned int        inuse;      /* num of objs active in slab */
151      kmem_bufctl_t       free;
152  } slab_t;

The field list links the slab into its queue, colouroff is the size of the coloring area of this slab, s_mem points to the start of the object area, and inuse is the number of objects currently allocated. Finally, free identifies the first free object in the chain of free objects. Its type is defined as follows:

110  /*
111   * kmem_bufctl_t:
112   *
113   * Bufctl's are used for linking objs within a slab
114   * linked offsets.
115   *
116   * This implementaion relies on "struct page" for locating the cache &
117   * slab an object belongs to.
118   * This allows the bufctl structure to be small (one int), but limits
119   * the number of objects a slab (not a cache) can contain when off-slab
120   * bufctls are used. The limit is the size of the largest general cache
121   * that does not use off-slab slabs.
122   * For 32bit archs with 4 kB pages, is this 56.
123   * This is not serious, as it is only for large objects, when it is unwise
124   * to have too many per slab.
125   * Note: This limit can be raised by introducing a general cache whose size
126   * is less than 512 (PAGE_SIZE<<3), but greater than 256.
127   */
128
129  #define BUFCTL_END 0xffffFFFF
130  #define SLAB_LIMIT 0xffffFFFE
131  typedef unsigned int kmem_bufctl_t;

The chain of free objects is therefore not a list of pointers but an array of integers: each element holds the index of the next free object, and the last one holds BUFCTL_END.

Every type of object has its own slab queue, and the head of the queue is a structure of type kmem_cache_t. Besides maintaining the queue, this structure records the parameters of its slabs, such as their size and the number of objects in each, and holds two function pointers: one to the constructor of the objects and one to the destructor. Interestingly, the kmem_cache_t structures themselves are objects that live on slabs, and these slabs are managed by one more kmem_cache_t structure, named cache_cache.
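The index-based chain of free objects described above is easy to try out in isolation. The sketch below is a self-contained toy model, not the kernel's implementation: struct toy_slab, toy_init(), toy_alloc() and toy_free() are hypothetical names, the slab holds only four object slots, and objects are represented by their index alone.

```c
#include <assert.h>

#define TOY_NUM  4             /* objects per slab (arbitrary)          */
#define TOY_END  0xffffffffu   /* end-of-chain marker, like BUFCTL_END  */

/* Hypothetical miniature slab: objects are identified by index only. */
struct toy_slab {
    unsigned int inuse;             /* objects currently allocated   */
    unsigned int free;              /* index of the first free object */
    unsigned int bufctl[TOY_NUM];   /* bufctl[i] = next free index    */
};

/* Chain all objects together: 0 -> 1 -> 2 -> 3 -> end. */
static void toy_init(struct toy_slab *s)
{
    for (unsigned int i = 0; i < TOY_NUM; i++)
        s->bufctl[i] = i + 1;
    s->bufctl[TOY_NUM - 1] = TOY_END;
    s->free = 0;
    s->inuse = 0;
}

/* Take the first free object; returns TOY_END if the slab is full. */
static unsigned int toy_alloc(struct toy_slab *s)
{
    unsigned int idx = s->free;

    if (idx == TOY_END)
        return TOY_END;
    s->free = s->bufctl[idx];
    s->inuse++;
    return idx;
}

/* Put a released object back at the head of the free chain. */
static void toy_free(struct toy_slab *s, unsigned int idx)
{
    s->bufctl[idx] = s->free;
    s->free = idx;
    s->inuse--;
}
```

Because a released object goes to the head of the chain, it is the first one handed out again, which favours reuse of memory that was touched recently.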
The result is a hierarchy:

● At the top is cache_cache, a kmem_cache_t structure that maintains the first level slab queue; the objects on these slabs are kmem_cache_t structures.
● Each object on a first level slab is a kmem_cache_t structure, that is, the head of a queue, and it maintains a second level slab queue.
● Each second level slab queue is dedicated to one kind of object.
● Each second level slab holds a number of objects of that one kind.

[Figure 2.9: hierarchy of slab queues, with cache_cache at the top and the slab lists for inode, vm_area_struct, mm_struct and other structures below it]

As Figure 2.9 shows, the top level slab queue, cache_cache, holds kmem_cache_t structures. Each of these heads a slab queue for one particular data structure, such as inode, vm_area_struct or mm_struct, or even IP packets. Once a data structure has a dedicated queue, objects of that type are allocated and released with the following two functions:

void *kmem_cache_alloc(kmem_cache_t *cachep, int flags);
void kmem_cache_free(kmem_cache_t *cachep, void *objp);

Whenever a data structure with a dedicated slab queue is needed, it is allocated with kmem_cache_alloc(). The structures mm_struct, vm_area_struct, file, dentry and inode, among others, all have dedicated slab queues and are allocated in this way.

The descriptor of a slab is not always stored on the slab itself. For small objects the descriptor and the index array of the free chain are placed at the start of the slab; for large objects they are kept apart from the slab, in a buffer of their own, so that the pages of the slab can be used for objects more efficiently.

Not every data structure deserves a dedicated queue. Some are used rarely, need no initialization, or are small. Besides the dedicated queues, Linux therefore provides a general purpose pool, referred to here as "slab_cache". Its organization resembles that of cache_cache, except that the queues at its top level are distinguished by buffer size rather than by object type. The sizes are 32, 64, 128 and so on, doubling each time, up to 128K (that is, 32 pages). Buffers are allocated from and returned to this pool with the following two functions:

void *kmalloc(size_t size, int flags);
void kfree(const void *objp);

A data structure that has no dedicated slab queue and is not large should therefore be allocated with kmalloc(). Small, infrequently used structures such as vfsmount are handled in this way.
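How a request is matched to one of the general purpose sizes can be sketched as follows. This is an illustration only: the table toy_sizes and the function toy_size_class() are hypothetical, and the table simply lists the doubling sizes from 32 bytes to 128 KB mentioned in the text, terminated by 0.

```c
#include <assert.h>
#include <stddef.h>

/* Size classes assumed for illustration: powers of two from 32 bytes
 * to 128 KB, with 0 marking the end of the table. */
static const size_t toy_sizes[] = {
    32, 64, 128, 256, 512, 1024, 2048, 4096,
    8192, 16384, 32768, 65536, 131072, 0
};

/* Return the smallest size class that can hold `size` bytes,
 * or 0 if the request exceeds the largest class. */
static size_t toy_size_class(size_t size)
{
    for (const size_t *p = toy_sizes; *p; p++)
        if (size <= *p)
            return *p;
    return 0;
}
```

A request for 33 bytes is served from the 64-byte class, so up to nearly half of a buffer may go unused; this is the price paid for not creating a dedicated queue.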
If the size of a data structure is close to a page, it is better to allocate a whole page for it directly with alloc_pages().

There is one more pair of functions for allocating and releasing kernel memory, vmalloc() and vfree():

void *vmalloc(unsigned long size);
void vfree(void *addr);

vmalloc() allocates a range of virtual addresses from the kernel part of the address space (above 3 GB) together with the physical pages behind it. It is similar to the system call brk(), except that brk() is called by a process and allocates in user space, whereas vmalloc() is called inside the kernel and allocates in kernel space. Space allocated by vmalloc() is never swapped out by kswapd: kswapd only scans the user space of processes and does not see pages allocated through vmalloc(). Data structures allocated with kmalloc() are not swapped out either, but kswapd can reclaim slabs that have become free from the slab queues. vmalloc() is very similar to ioremap(), so it is not analysed separately here.

In the Linux kernel, most dedicated queues are created with NULL as the pointer to the constructor, which means that they do not take full advantage of what the slab mechanism offers. The example chosen here is therefore one that does use a constructor. It comes from the networking code, in net/core/skbuff.c:

473  void __init skb_init(void)
474  {
475      int i;
476
477      skbuff_head_cache = kmem_cache_create("skbuff_head_cache",
478                            sizeof(struct sk_buff),
479                            0,
480                            SLAB_HWCACHE_ALIGN,
481                            skb_headerinit, NULL);
482      if (!skbuff_head_cache)
483          panic("cannot create skbuff cache");
484
485      for (i=0; i<NR_CPUS; i++)
486          skb_queue_head_init(&skb_head_pool[i].list);
487  }

skb_init() is the initialization function of the network buffer subsystem. It calls kmem_cache_create() to set up a dedicated queue for the sk_buff data structure. The queue is named "skbuff_head_cache"; the names of all queues can be inspected with "cat /proc/slabinfo". The size of each object is sizeof(struct sk_buff). The offset parameter is 0, so no particular offset is requested for the first object on a slab. The flags parameter is SLAB_HWCACHE_ALIGN, which asks for objects to be aligned on cache line boundaries (16 or 32 bytes). The constructor is skb_headerinit() and the destructor is NULL, so nothing special is done to the objects when a slab is dismantled.

The code of kmem_cache_create() is long and is not listed here. In outline, it allocates a kmem_cache_t structure from cache_cache to serve as the head of the slab queue for sk_buff and fills it in.
The data structure kmem_cache_t is defined in mm/slab.c:

181  struct kmem_cache_s {
182  /* 1) each alloc & free */
183      /* full, partial first, then free */
184      struct list_head    slabs;
185      struct list_head    *firstnotfull;
186      unsigned int        objsize;
187      unsigned int        flags;      /* constant flags */
188      unsigned int        num;        /* # of objs per slab */
189      spinlock_t          spinlock;
190  #ifdef CONFIG_SMP
191      unsigned int        batchcount;
192  #endif
193
194  /* 2) slab additions /removals */
195      /* order of pgs per slab (2^n) */
196      unsigned int        gfporder;
197
198      /* force GFP flags, e.g. GFP_DMA */
199      unsigned int        gfpflags;
200
201      size_t              colour;         /* cache colouring range */
202      unsigned int        colour_off;     /* colour offset */
203      unsigned int        colour_next;    /* cache colouring */
204      kmem_cache_t        *slabp_cache;
205      unsigned int        growing;
206      unsigned int        dflags;         /* dynamic flags */
207
208      /* constructor func */
209      void (*ctor)(void *, kmem_cache_t *, unsigned long);
210
211      /* de-constructor func */
212      void (*dtor)(void *, kmem_cache_t *, unsigned long);
213
214      unsigned long       failures;
215
216  /* 3) cache creation/removal */
217      char                name[CACHE_NAMELEN];
218      struct list_head    next;
219  #ifdef CONFIG_SMP
220  /* 4) per-cpu data */
221      cpucache_t          *cpudata[NR_CPUS];
222  #endif
223  #if STATS
224      unsigned long       num_active;
225      unsigned long       num_allocations;
226      unsigned long       high_mark;
227      unsigned long       grown;
228      unsigned long       reaped;
229      unsigned long       errors;
230  #ifdef CONFIG_SMP
231      atomic_t            allochit;
232      atomic_t            allocmiss;
233      atomic_t            freehit;
234      atomic_t            freemiss;
235  #endif
236  #endif
237  };

This defines kmem_cache_s; the type kmem_cache_t is then defined in include/linux/slab.h:

12  typedef struct kmem_cache_s kmem_cache_t;

The queue head slabs maintains the slab queue, and the pointer firstnotfull points to the first slab in the queue that still has free objects. When firstnotfull points to the queue head slabs itself, no slab in the queue has a free object. The queue head next links this structure into the list of slab queue heads that hangs off cache_cache (not to be confused with the slab queue that holds the structures themselves).
When the descriptor of a slab is kept apart from the slab itself, that is, off-slab, the pointer slabp_cache points to the general purpose queue from which the descriptors are allocated. The remaining fields describe the slabs of this queue: objsize is the size of an object, here sizeof(struct sk_buff); num is the number of objects on each slab; gfporder gives the size of a slab, which is always 2^n pages with gfporder being n.

As mentioned, each slab starts with a "coloring area" whose purpose is to shift the objects of different slabs relative to one another. If the coloring area of the first slab has size 0, that of the second slab is one step larger, that of the third two steps, and so on, so that objects in the same relative position on different slabs fall on different cache lines. The number of distinct offsets is limited, however. It is recorded in colour, while colour_next holds the colour to be used for the next slab; when colour_next reaches colour it starts again from 0. The size of one step is colour_off, so the coloring area of a slab is colour_next times colour_off, and the largest coloring area is bounded by (colour_off × colour).

To fill in a kmem_cache_t structure, kmem_cache_create() has to work out a number of parameters: how many pages each slab consists of, how many objects fit on it, how much room is left for coloring, whether the slab_t descriptor is placed on the slab or off it, and so on. It also stores the pointers to the constructor and destructor in ctor and dtor. Finally, the structure is linked through next into the list headed by cache_cache. Note that kmem_cache_create() only sets up the infrastructure of the queue; it creates no slabs. Slabs are created on demand, when an allocation finds no free object, by kmem_cache_grow().

2.10.2 Allocating and Freeing Buffers

Once a dedicated queue has been set up, buffers are allocated from it with kmem_cache_alloc(). We continue with skbuff_head_cache as the example. The following code is in net/core/skbuff.c:

165  struct sk_buff *alloc_skb(unsigned int size,int gfp_mask)
166  {
     ......
181      skb = skb_head_from_pool();
182      if (skb == NULL) {
183          skb = kmem_cache_alloc(skbuff_head_cache, gfp_mask);
184          if (skb == NULL)
185              goto nohead;
186      }
     ......
215  }

alloc_skb() is a function of the network subsystem and is a typical caller of kmem_cache_alloc(). Only the part that obtains an sk_buff structure is of interest here; the rest of the function belongs to the networking code. Other dedicated queues are used with kmem_cache_alloc() in the same way.
When an sk_buff structure is needed, skb_head_from_pool() first tries to take one from a small pool of structures kept ready for reuse. Only if that fails is kmem_cache_alloc() called, which is the part that concerns us. Its code is in mm/slab.c:

[alloc_skb() > kmem_cache_alloc()]

1506  void * kmem_cache_alloc (kmem_cache_t *cachep, int flags)
1507  {
1508      return __kmem_cache_alloc(cachep, flags);
1509  }

[alloc_skb() > kmem_cache_alloc() > __kmem_cache_alloc()]

1291  static inline void * __kmem_cache_alloc (kmem_cache_t *cachep, int flags)
1292  {
1293      unsigned long save_flags;
1294      void* objp;
1295
1296      kmem_cache_alloc_head(cachep, flags);
1297  try_again:
1298      local_irq_save(save_flags);
1299  #ifdef CONFIG_SMP
      ......
1319  #else
1320      objp = kmem_cache_alloc_one(cachep);
1321  #endif
1322      local_irq_restore(save_flags);
1323      return objp;
1324  alloc_new_slab:
1325  #ifdef CONFIG_SMP
      ......
1328  #endif
1329      local_irq_restore(save_flags);
1330      if (kmem_cache_grow(cachep, flags))
1331          /* Someone may have stolen our objs.  Doesn't matter, we'll
1332           * just come back here again.
1333           */
1334          goto try_again;
1335      return NULL;
1336  }

As seen above, alloc_skb() passes the pointer skbuff_head_cache, which refers to the slab queue created for sk_buff structures, and it arrives here as the parameter cachep. kmem_cache_alloc_head() at line 1296 exists for debugging and is empty in a production kernel. To keep things simple we assume a system that is not SMP, so that the main operation is kmem_cache_alloc_one(). This is a macro:

1246  /*
1247   * Returns a ptr to an obj in the given cache.
1248   * caller must guarantee synchronization
1249   * #define for the goto optimization 8-)
1250   */
1251  #define kmem_cache_alloc_one(cachep)                \
1252  ({                                                  \
1253      slab_t  *slabp;                                 \
1254                                                      \
1255      /* Get slab alloc is to come from. */           \
1256      {                                               \
1257          struct list_head* p = cachep->firstnotfull; \
1258          if (p == &cachep->slabs)                    \
1259              goto alloc_new_slab;                    \
1260          slabp = list_entry(p,slab_t, list);         \
1261      }                                               \
1262      kmem_cache_alloc_one_tail(cachep, slabp);       \
1263  })

The macro has to be read as part of __kmem_cache_alloc(), into which it expands; that is why it can jump to a label defined there. It starts from the pointer firstnotfull in the head of the slab queue, which gives the first slab that still has free objects.
If this pointer points to the head of the slab queue itself (rather than to a slab), no slab in the queue has a free object, and the queue has to be extended with a new slab. In that case the code jumps to the label alloc_new_slab in __kmem_cache_alloc() (line 1324). If a slab with free objects is found, kmem_cache_alloc_one_tail() allocates an object from it and returns a pointer to the object:

[alloc_skb() > kmem_cache_alloc() > __kmem_cache_alloc() > kmem_cache_alloc_one_tail()]

1211  static inline void * kmem_cache_alloc_one_tail (kmem_cache_t *cachep,
1212                               slab_t *slabp)
1213  {
1214      void *objp;
1215
1216      STATS_INC_ALLOCED(cachep);
1217      STATS_INC_ACTIVE(cachep);
1218      STATS_SET_HIGH(cachep);
1219
1220      /* get obj pointer */
1221      slabp->inuse++;
1222      objp = slabp->s_mem + slabp->free*cachep->objsize;
1223      slabp->free=slab_bufctl(slabp)[slabp->free];
1224
1225      if (slabp->free == BUFCTL_END)
1226          /* slab now full: move to next slab for next alloc */
1227          cachep->firstnotfull = slabp->list.next;
1228  #if DEBUG
      ......
1242  #endif
1243      return objp;
1244  }

The field free of the slab_t structure holds the index of the first free object, and s_mem points to the start of the object area of the slab, so the address of the object follows directly from the two and the object size. The field free is then advanced to the next free object with the help of the macro slab_bufctl:

154  #define slab_bufctl(slabp) \
155      ((kmem_bufctl_t *)(((slab_t*)slabp)+1))

This macro yields an array of kmem_bufctl_t that is located immediately after the slab_t structure. The array is indexed by object number, and each element holds the index of the next free object. Taking the element at index free and storing it back into free therefore removes the first object from the chain. If free becomes BUFCTL_END, the slab has no free objects left, and firstnotfull is moved to the next slab in the queue.

If no slab in the queue has a free object, or the queue has no slabs at all, the code jumps to alloc_new_slab as described above, and kmem_cache_grow() is called to allocate a new slab so that the queue "grows". The code of kmem_cache_grow() is also in mm/slab.c:

[alloc_skb() > kmem_cache_alloc() > __kmem_cache_alloc() > kmem_cache_grow()]

1066  /*
1067   * Grow (by 1) the number of slabs within a cache.  This is called by
1068   * kmem_cache_alloc() when there are no active objs left in a cache.
1069   */
1070  static int kmem_cache_grow (kmem_cache_t * cachep, int flags)
1071  {
1072      slab_t  *slabp;
1073      struct page *page;
1074      void        *objp;
1075      size_t       offset;
1076      unsigned int i, local_flags;
1077      unsigned long ctor_flags;
1078      unsigned long save_flags;
1079
1080      /* Be lazy and only check for valid flags here,
1081       * keeping it out of the critical path in kmem_cache_alloc().
1082       */
1083      if (flags & ~(SLAB_DMA|SLAB_LEVEL_MASK|SLAB_NO_GROW))
1084          BUG();
1085      if (flags & SLAB_NO_GROW)
1086          return 0;
1087
1088      /*
1089       * The test for missing atomic flag is performed here, rather than
1090       * the more obvious place, simply to reduce the critical path length
1091       * in kmem_cache_alloc(). If a caller is seriously mis-behaving they
1092       * will eventually be caught here (where it matters).
1093       */
1094      if (in_interrupt() && (flags & SLAB_LEVEL_MASK) != SLAB_ATOMIC)
1095          BUG();
1096
1097      ctor_flags = SLAB_CTOR_CONSTRUCTOR;
1098      local_flags = (flags & SLAB_LEVEL_MASK);
1099      if (local_flags == SLAB_ATOMIC)
1100          /*
1101           * Not allowed to sleep.  Need to tell a constructor about
1102           * this - it might need to know...
1103           */
1104          ctor_flags |= SLAB_CTOR_ATOMIC;
1105
1106      /* About to mess with non-constant members - lock. */
1107      spin_lock_irqsave(&cachep->spinlock, save_flags);
1108
1109      /* Get colour for the slab, and cal the next value. */
1110      offset = cachep->colour_next;
1111      cachep->colour_next++;
1112      if (cachep->colour_next >= cachep->colour)
1113          cachep->colour_next = 0;
1114      offset *= cachep->colour_off;
1115      cachep->dflags |= DFLGS_GROWN;
1116
1117      cachep->growing++;
1118      spin_unlock_irqrestore(&cachep->spinlock, save_flags);
1119
1120      /* A series of memory allocations for a new slab.
1121       * Neither the cache-chain semaphore, or cache-lock, are
1122       * held, but the incrementing c_growing prevents this
1123       * cache from being reaped or shrunk.
1124       * Note: The cache could be selected in for reaping in
1125       * kmem_cache_reap(), but when the final test is made the
1126       * growing value will be seen.
1127       */
1128
1129      /* Get mem for the objs. */
1130      if (!(objp = kmem_getpages(cachep, flags)))
1131          goto failed;
1132
1133      /* Get slab management. */
1134      if (!(slabp = kmem_cache_slabmgmt(cachep, objp, offset, local_flags)))
1135          goto opps1;
1136
1137      /* Nasty!!!!!! I hope this is OK. */
1138      i = 1 << cachep->gfporder;
1139      page = virt_to_page(objp);
1140      do {
1141          SET_PAGE_CACHE(page, cachep);
1142          SET_PAGE_SLAB(page, slabp);
1143          PageSetSlab(page);
1144          page++;
1145      } while (--i);
1146
1147      kmem_cache_init_objs(cachep, slabp, ctor_flags);
1148
1149      spin_lock_irqsave(&cachep->spinlock, save_flags);
1150      cachep->growing--;
1151
1152      /* Make slab active. */
1153      list_add_tail(&slabp->list,&cachep->slabs);
1154      if (cachep->firstnotfull == &cachep->slabs)
1155          cachep->firstnotfull = &slabp->list;
1156      STATS_INC_GROWN(cachep);
1157      cachep->failures = 0;
1158
1159      spin_unlock_irqrestore(&cachep->spinlock, save_flags);
1160      return 1;
1161  opps1:
1162      kmem_freepages(cachep, objp);
1163  failed:
1164      spin_lock_irqsave(&cachep->spinlock, save_flags);
1165      cachep->growing--;
1166      spin_unlock_irqrestore(&cachep->spinlock, save_flags);
1167      return 0;
1168  }

kmem_cache_grow() first checks its parameters and computes the coloring offset for the new slab, updating colour_next in the queue head for the slab after it. It then calls kmem_getpages() to allocate the pages for the slab, 2^gfporder of them; this ends in a call to alloc_pages(), whose details are not repeated here. Once the pages have been obtained, kmem_cache_slabmgmt() sets up the descriptor of the slab. Its code is in mm/slab.c:

[alloc_skb() > kmem_cache_alloc() > __kmem_cache_alloc() > kmem_cache_grow() > kmem_cache_slabmgmt()]

996   /* Get the memory for a slab management obj. */
997   static inline slab_t * kmem_cache_slabmgmt (kmem_cache_t *cachep,
998               void *objp, int colour_off, int local_flags)
999   {
1000      slab_t *slabp;
1001
1002      if (OFF_SLAB(cachep)) {
1003          /* Slab management obj is off-slab. */
1004          slabp = kmem_cache_alloc(cachep->slabp_cache, local_flags);
1005          if (!slabp)
1006              return NULL;
1007      } else {
1008          /* FIXME: change to
1009              slabp = objp
1010           * if you enable OPTIMIZE
1011           */
1012          slabp = objp+colour_off;
1013          colour_off += L1_CACHE_ALIGN(cachep->num *
1014                  sizeof(kmem_bufctl_t) + sizeof(slab_t));
1015      }
1016      slabp->inuse = 0;
1017      slabp->colouroff = colour_off;
1018      slabp->s_mem = objp+colour_off;
1019
1020      return slabp;
1021  }

As mentioned, the slab_t descriptor of a slab is either placed on the slab itself or kept off the slab. If it is off-slab, a slab_t structure is allocated with kmem_cache_alloc() from the queue that slabp_cache points to. Otherwise the descriptor is placed on the slab, immediately after the coloring area, as lines 1012 to 1017 show. In that case colour_off is increased at line 1013 by the cache-line-aligned size of the slab_t structure plus the index array of the free chain, so that it now covers everything in front of the first object. slabp->s_mem therefore always points to the first object on the slab.

Back in kmem_cache_grow(), the page structure of every page of the slab is then set up with the macros SET_PAGE_CACHE and SET_PAGE_SLAB. These store pointers to the queue head and to the slab descriptor in the next and prev fields of the list head in the page structure, so that the queue and the slab can be found from any page of the slab. In addition, the PG_slab flag of each page is set to mark it as belonging to a slab. Finally, kmem_cache_init_objs() initializes the objects on the slab:

[alloc_skb() > kmem_cache_alloc() > __kmem_cache_alloc() > kmem_cache_grow() > kmem_cache_init_objs()]

1023  static inline void kmem_cache_init_objs (kmem_cache_t * cachep,
1024              slab_t * slabp, unsigned long ctor_flags)
1025  {
1026      int i;
1027
1028      for (i = 0; i < cachep->num; i++) {
1029          void* objp = slabp->s_mem+cachep->objsize*i;
1030  #if DEBUG
      ......
1037  #endif
1038
1039          /*
1040           * Constructors are not allowed to allocate memory from
1041           * the same cache which they are a constructor for.
1042           * Otherwise, deadlock. They must also be threaded.
1043           */
1044          if (cachep->ctor)
1045              cachep->ctor(objp, cachep, ctor_flags);
1046  #if DEBUG
      ......
1059  #endif
1060          slab_bufctl(slabp)[i] = i+1;
1061      }
1062      slab_bufctl(slabp)[i-1] = BUFCTL_END;
1063      slabp->free = 0;
1064  }

The code is simple. The constructor, if there is one, is called for every object on the slab; for the sk_buff queue this is skb_headerinit(). At line 1060 each object is then entered into the chain of free objects, so that afterwards all objects of the new slab are free. Once the new slab has been linked into the queue, control returns to the label try_again (see line 1334 of __kmem_cache_alloc()), and this time the allocation succeeds.

To summarize, using the network buffer as the example: to obtain a buffer head, alloc_skb() first tries skb_head_from_pool(), and if that fails it allocates from the dedicated slab queue through kmem_cache_alloc(). If some slab in the queue has a free object, the object is taken from it; if not, kmem_cache_grow() first allocates a new slab and the allocation is retried.

Does a slab queue then only ever grow? No. As mentioned earlier, kswapd periodically calls kmem_cache_reap() to "harvest" the queues: it looks for slabs that are completely free and releases the pages they occupy. That code is examined later.

A buffer allocated from a dedicated queue is released with kmem_cache_free() when it is no longer needed. The code is in mm/slab.c:

1554  void kmem_cache_free (kmem_cache_t *cachep, void *objp)
1555  {
1556      unsigned long flags;
1557  #if DEBUG
      ......
1561  #endif
1562
1563      local_irq_save(flags);
1564      __kmem_cache_free(cachep, objp);
1565      local_irq_restore(flags);
1566  }

The actual work is done by __kmem_cache_free(), which is called with interrupts disabled:

[kmem_cache_free() > __kmem_cache_free()]

1466  /*
1467   * __kmem_cache_free
1468   * called with disabled ints
1469   */
1470  static inline void __kmem_cache_free (kmem_cache_t *cachep, void* objp)
1471  {
1472  #ifdef CONFIG_SMP
      ......
1493  #else
1494      kmem_cache_free_one(cachep, objp);
1495  #endif
1496  }

Again we leave the SMP case aside, so that only kmem_cache_free_one() remains to be examined:

[kmem_cache_free() > __kmem_cache_free() > kmem_cache_free_one()]

1367  static inline void kmem_cache_free_one(kmem_cache_t *cachep, void *objp)
1368  {
Linux Asai am cua 1369, slab_t* slabp; 1370 1371 CHECK_PAGE (virt_to_page (ob jp)) ; 1372 /* reduces memory footprint 1373 * 1374 if (OPTIMIZE(cachep) ) 1375 slabp = (void#) ((unsigned long)obip& (~(PAGE_STZE-1))); 1376 else 1377 af 1378 slabp = GET PAGE SLAB(virt_to_page (objp)) ; 1379 1380 -#if DEBUG 1402 endif 1403 { 1404 unsigned int objnr = (objp-slabp->s_mem) /cachep->objsize; 1405, 1406 slab_bufet] (slabp) [ob jnr] = slabp>>free: 1407 slabp->free = ob jnr; 1408 + 1409 STATS_DEC_ACTIVE cachep) ; 1410 1it /* fixup slab chain */ m2 if (slabp->inuse— == cachep->num) 13 goto moveslab_partial; wid if (1slabp->inuse) 1415 goto mveslab free 116 return; 1417 1418 noveslab_partial: 1419 /* was full. 1420 % Even if the page is now empty, we can set ¢ firstnotfull to 1421 % slabp: there are no partial slabs in this case 1422 */ 1428 ( 1424 struct list_head #t = cachep->firstnotfull; 1425 1426 cachep->firstnotfull = &slabp->list: 1427 if (slabp->list. next == t) 1428 return: 1429 List_del (@slabp>list) ; 1430 List_add_tail @slabp list, t): 1431 return: 1432 } 1433 noveslab_free: 1434 ie 1435 # was partial, now empty. 1436 * ¢ firstnotfull might point to slabp 148. re Re wee sana 1437 * FIXME: optimize 1438 +/ 1439 { M440 struct list_head #t = cachep->firstnotfull->prev: 141 late list_del (&stabp->Iist) ; 1443 list add_tail @slabp->List, &eachep->slabs) ; 1444 if (cachep->firstnotfull == &slabp->List) 1445 cachep>firstnotfull = t->next; 1446 return; 1447 b 4g} {RESHEY CHECK PAGE SUFI TANF MR, AESEON TT MVR SP Wy EA) ARLE PCS SR SLETLASE A ATAEROTR. MEAP, UAT ATAU kmem_cache_grow( )Pity 1142 47), LI page B58 SHES list A, BUA TOUUGEREBEL prev, HEALTH STRAY slab, BFLIE at EM GET_PAGE_SLAB LPL LAGI HURT slab HUTHET HEAT SRL BLT EY slab, a ri] LAGE ink J tea ICE PE HLS TET HT CL 1404~ 140747). IBY, DE RDTR stab WSEAS ROA. RL A SA nl Res © BOK slab LBM, MURA T. 
  Control then goes to moveslab_partial, where the slab is moved to the head of the partially used section of the list and firstnotfull is made to point to it.
• The slab already had free objects and now all of its objects are free. Control goes to moveslab_free, where the slab is moved to the fully free section at the tail of the list, and firstnotfull is adjusted if it pointed to this slab.
• The slab already had free objects and now has one more, but is still not entirely free. Nothing has to be moved.

As the code shows, freeing an object is cheap. Even when a slab becomes entirely free, the slab itself is not released here; releasing slabs is left to kswapd and others, through kmem_cache_reap().

Besides the dedicated caches, the kernel maintains a set of general-purpose caches, cache_sizes, for buffers that do not warrant a cache of their own. Buffers are allocated from them with kmalloc(), which is also defined in mm/slab.c:

1511 /**
1512  * kmalloc - allocate memory
1513  * @size: how many bytes of memory are required.
1514  * @flags: the type of memory to allocate.
1515  *
1516  * kmalloc is the normal method of allocating memory
1517  * in the kernel.  The @flags argument may be one of:
1518  *
1519  * %GFP_BUFFER - XXX
1520  *
1521  * %GFP_ATOMIC - allocation will not sleep.  Use inside interrupt handlers.
1522  *
1523  * %GFP_USER - allocate memory on behalf of user.  May sleep.
1524  *
1525  * %GFP_KERNEL - allocate normal kernel ram.  May sleep.
1526  *
1527  * %GFP_NFS - has a slightly lower probability of sleeping than %GFP_KERNEL.
1528  * Don't use unless you're in the NFS code.
1529  *
1530  * %GFP_KSWAPD - Don't use unless you're modifying kswapd.
1531  */
1532 void * kmalloc (size_t size, int flags)
1533 {
1534     cache_sizes_t *csizep = cache_sizes;
1535
1536     for (; csizep->cs_size; csizep++) {
1537         if (size > csizep->cs_size)
1538             continue;
1539         return __kmem_cache_alloc(flags & GFP_DMA ?
1540              csizep->cs_dmacachep : csizep->cs_cachep, flags);
1541     }
1542     BUG(); // too big size
1543     return NULL;
1544 }

A for loop scans the cache_sizes array from the smallest size upward. The first cache whose object size is large enough is selected, and __kmem_cache_alloc() allocates a buffer from it (from its DMA twin if GFP_DMA is set). __kmem_cache_alloc() has been covered above.

Finally, we look at how caches are trimmed. A slab is built from whole pages, so once a slab is entirely free its pages can be given back. Each time kswapd runs, it calls kmem_cache_reap() to reclaim such slabs.
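The size-class lookup performed by the for loop in kmalloc() can be tried out in user space. The following is only a minimal sketch under stated assumptions, not kernel code: the table size_classes and the helper pick_size_class() are hypothetical names, and the table contents (powers of two from 32 bytes to 128KB, terminated by 0) are assumed for illustration rather than taken from the listing.

```c
#include <stddef.h>

/* Hypothetical size table. Like cache_sizes[], it is sorted in
 * ascending order and terminated by an entry of size 0. */
static const size_t size_classes[] = {
    32, 64, 128, 256, 512, 1024, 2048, 4096,
    8192, 16384, 32768, 65536, 131072, 0
};

/* Return the smallest class that can hold `size` bytes, or 0 when the
 * request is larger than every class (the case where kmalloc() hits
 * BUG() in the listing). */
static size_t pick_size_class(size_t size)
{
    const size_t *p;

    for (p = size_classes; *p; p++) {
        if (size > *p)
            continue;       /* too small, try the next class */
        return *p;          /* first class that fits */
    }
    return 0;
}
```

Under these assumptions a request for 100 bytes lands in the 128-byte class, so up to 28 bytes per object are unused. That internal fragmentation is the price of keeping the lookup a short linear scan.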
The code of kmem_cache_reap() is in mm/slab.c:

1701 /**
1702  * kmem_cache_reap - Reclaim memory from caches.
1703  * @gfp_mask: the type of memory required.
1704  *
1705  * Called from try_to_free_page().
1706  */
1707 void kmem_cache_reap (int gfp_mask)
1708 {
1709     slab_t *slabp;
1710     kmem_cache_t *searchp;
1711     kmem_cache_t *best_cachep;
1712     unsigned int best_pages;
1713     unsigned int best_len;
1714     unsigned int scan;
1715
1716     if (gfp_mask & __GFP_WAIT)
1717         down(&cache_chain_sem);
1718     else
1719         if (down_trylock(&cache_chain_sem))
1720             return;
1721
1722     scan = REAP_SCANLEN;
1723     best_len = 0;
1724     best_pages = 0;
1725     best_cachep = NULL;
1726     searchp = clock_searchp;
1727     do {
1728         unsigned int pages;
1729         struct list_head* p;
1730         unsigned int full_free;
1731
1732         /* It's safe to test this without holding the cache-lock. */
1733         if (searchp->flags & SLAB_NO_REAP)
1734             goto next;
1735         spin_lock_irq(&searchp->spinlock);
1736         if (searchp->growing)
1737             goto next_unlock;
1738         if (searchp->dflags & DFLGS_GROWN) {
1739             searchp->dflags &= ~DFLGS_GROWN;
1740             goto next_unlock;
1741         }
1742 #ifdef CONFIG_SMP
1750 #endif
1751
1752         full_free = 0;
1753         p = searchp->slabs.prev;
1754         while (p != &searchp->slabs) {
1755             slabp = list_entry(p, slab_t, list);
1756             if (slabp->inuse)
1757                 break;
1758             full_free++;
1759             p = p->prev;
1760         }
1761
1762         /*
1763          * Try to avoid slabs with constructors and/or
1764          * more than one page per slab (as it can be difficult
1765          * to get high orders from gfp()).
1766          */
1767         pages = full_free * (1<<searchp->gfporder);
1768         if (searchp->ctor)
1769             pages = (pages*4+1)/5;
1770         if (searchp->gfporder)
1771             pages = (pages*4+1)/5;
1772         if (pages > best_pages) {
1773             best_cachep = searchp;
1774             best_len = full_free;
1775             best_pages = pages;
1776             if (full_free >= REAP_PERFECT) {
1777                 clock_searchp = list_entry(searchp->next.next,
1778                             kmem_cache_t,next);
1779                 goto perfect;
1780             }
1781         }
1782 next_unlock:
1783         spin_unlock_irq(&searchp->spinlock);
1784 next:
1785         searchp = list_entry(searchp->next.next,kmem_cache_t,next);
1786     } while (--scan && searchp != clock_searchp);
1787
1788     clock_searchp = searchp;
1789
1790     if (!best_cachep)
1791         /* couldn't find anything to reap */
1792         goto out;
1793
1794     spin_lock_irq(&best_cachep->spinlock);
1795 perfect:
1796     /* free only 80% of the free slabs */
1797     best_len = (best_len*4 + 1)/5;
1798     for (scan = 0; scan < best_len; scan++) {
1799         struct list_head *p;
1800
1801         if (best_cachep->growing)
1802             break;
1803         p = best_cachep->slabs.prev;
1804         if (p == &best_cachep->slabs)
1805             break;
1806         slabp = list_entry(p,slab_t,list);
1807         if (slabp->inuse)
1808             break;
1809         list_del(&slabp->list);
1810         if (best_cachep->firstnotfull == &slabp->list)
1811             best_cachep->firstnotfull = &best_cachep->slabs;
1812         STATS_INC_REAPED(best_cachep);
1813
1814         /* Safe to drop the lock. The slab is no longer linked to the
1815          * cache.
1816          */
1817         spin_unlock_irq(&best_cachep->spinlock);
1818         kmem_slab_destroy(best_cachep, slabp);
1819         spin_lock_irq(&best_cachep->spinlock);
1820     }
1821     spin_unlock_irq(&best_cachep->spinlock);
1822 out:
1823     up(&cache_chain_sem);
1824     return;
1825 }

The function scans the chain of caches headed by cache_cache, looking for a cache that is worth reaping. It does not have to start from cache_cache every time, nor does it scan the whole chain in one call (at most REAP_SCANLEN caches are examined), so a global pointer records where the next scan should start. This is clock_searchp:

360 /* Place maintainer for reaping. */
361 static kmem_cache_t *clock_searchp = &cache_cache;

Once a cache to reap has been chosen, not all of its free slabs are released, only about 80% of them. Each slab to be released is handed to kmem_slab_destroy(), which calls the destructor for every object in the slab and then frees the pages the slab occupies.

[kmem_cache_reap() > kmem_slab_destroy()]

540 /* Destroy all the objs in a slab, and release the mem back to the system.
541  * Before calling the slab must have been unlinked from the cache.
542  * The cache-lock is not held/needed.
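543  */
544 static void kmem_slab_destroy (kmem_cache_t *cachep, slab_t *slabp)
545 {
546     if (cachep->dtor
547 #if DEBUG
548         || cachep->flags & (SLAB_POISON|SLAB_RED_ZONE)
549 #endif
550     ) {
551         int i;
552         for (i = 0; i < cachep->num; i++) {
553             void* objp = slabp->s_mem+cachep->objsize*i;
554 #if DEBUG
563 #endif
564             if (cachep->dtor)
565                 (cachep->dtor)(objp, cachep, 0);
566 #if DEBUG
573 #endif
574         }
575     }
576
577     kmem_freepages(cachep, slabp->s_mem-slabp->colouroff);
578     if (OFF_SLAB(cachep))
579         kmem_cache_free(cachep->slabp_cache, slabp);
580 }

2.11 Address Mapping of Peripheral Device Memory

No system can do without input and output, so access to peripheral devices is an important issue in CPU design. Broadly, there are two ways of accessing devices: memory mapped and I/O mapped. On a CPU that uses memory mapping, the storage cells of a device, such as its control, status and data registers, appear in the system as part of memory. The CPU accesses them exactly as it accesses a memory cell, so no special I/O instructions are needed. The PDP-11, and later the M68K and PowerPC, work this way. On an I/O-mapped system the storage cells of devices and the memory belong to two separate address spaces. Instructions that access memory cannot be used on device registers, which is why the X86 CPU has the dedicated IN and OUT instructions. The "address space" available to the I/O instructions is comparatively small, however, and the X86 I/O address space is by now very crowded.

As technology developed, pure I/O mapping turned out to be inadequate. It suits only the early days, when a device typically had a handful of registers through which everything could be done. Things are very different now. A graphics card plugged into a PC may carry 2MB of memory, and possibly a ROM holding executable code. With the arrival of the PCI bus the problem became even more pronounced. So whether the CPU is designed for I/O mapping or memory mapping, there has to be a way of mapping the memory on a device card into the memory address space, which in practice means the virtual address space. In the Linux kernel such mappings are set up by the function ioremap().

In ordinary memory management we first allocate a range of virtual addresses, then allocate physical pages for it and set up the mapping. The mapping need not even be established all at once; it can be built up gradually as accesses to the virtual pages raise page faults. ioremap() is different. First, we start with a range of physical addresses: the addresses at which the memory on the device card appears on the bus. These are not necessarily the local physical addresses of the cells on the card. They are the addresses "seen" by the CPU on the bus, and a mapping may already have been applied in between. That mapping is transparent to the CPU, and such an address is sometimes called a "bus address". For example, take an "intelligent" graphics card with its own microprocessor. To that processor the on-board memory starts at address 0; that is the local physical address. When the card is plugged into a PCI slot of a PC, the PC's CPU may see the same memory starting at 0x0000 f000 0000 0000, after a mapping probably performed by the PCI interface chip. From the point of view of the system (PC) CPU, the physical range starts at 0x0000 f000 0000 0000, and this is the bus address of the range.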
In Linux the CPU cannot access memory by physical address; it must use virtual addresses. The kernel therefore has to work "backwards": starting from the physical address, it finds a range of virtual addresses and establishes the mapping. Second, the need arises only when operating peripheral devices, which is the kernel's business, so the virtual range lies in system space (above 3GB). In the Linux 2.2 kernels this function was called vremap(); it was later renamed ioremap(), which makes the point more explicit. Finally, pages mapped in this way are not subject to dynamic allocation of physical pages, and they are not swapped out by kswapd.

We start with ioremap(), an inline function defined in include/asm-i386/io.h:

140 extern inline void * ioremap (unsigned long offset, unsigned long size)
141 {
142     return __ioremap(offset, size, 0);
143 }

The real work is done by __ioremap(), whose code is in arch/i386/mm/ioremap.c:

[ioremap() > __ioremap()]

 92 /*
 93  * Remap an arbitrary physical address space into the kernel virtual
 94  * address space. Needed when the kernel wants to access high addresses
 95  * directly.
 96  *
 97  * NOTE! We need to allow non-page-aligned mappings too: we will obviously
 98  * have to convert them into an offset in a page-aligned mapping, but the
 99  * caller shouldn't need to know that small detail.
100  */
101 void * __ioremap(unsigned long phys_addr, unsigned long size, unsigned long flags)
102 {
103     void * addr;
104     struct vm_struct * area;
105     unsigned long offset, last_addr;
106
107     /* Don't allow wraparound or zero size */
108     last_addr = phys_addr + size - 1;
109     if (!size || last_addr < phys_addr)
110         return NULL;
111
112     /*
113      * Don't remap the low PCI/ISA area, it's always mapped..
114      */
115     if (phys_addr >= 0xA0000 && last_addr < 0x100000)
116         return phys_to_virt(phys_addr);
117
118     /*
119      * Don't allow anybody to remap normal RAM that we're using..
120      */
121     if (phys_addr < virt_to_phys(high_memory)) {
122         char *t_addr, *t_end;
123         struct page *page;
124
125         t_addr = __va(phys_addr);
126         t_end = t_addr + (size - 1);
127
128         for(page = virt_to_page(t_addr); page <= virt_to_page(t_end); page++)
129             if(!PageReserved(page))
130                 return NULL;
131     }
132
133     /*
134      * Mappings have to be page-aligned
135      */
136     offset = phys_addr & ~PAGE_MASK;
137     phys_addr &= PAGE_MASK;
138     size = PAGE_ALIGN(last_addr) - phys_addr;
139
140     /*
141      * Ok, go for it..
142      */
143     area = get_vm_area(size, VM_IOREMAP);
144     if (!area)
145         return NULL;
146     addr = area->addr;
147     if (remap_area_pages(VMALLOC_VMADDR(addr), phys_addr, size, flags)) {
148         vfree(addr);
149         return NULL;
150     }
151     return (void *) (offset + (char *)addr);
152 }

The function begins with some sanity checks. Line 109 makes sure that the size is not 0 and that the range does not wrap around the end of the 32-bit address space. The physical addresses from 0xa0000 to 0x100000 are used by the VGA card and the BIOS. This area is reserved in the physical address space and is always mapped, so there is nothing to do but return the corresponding virtual address. The variable high_memory at line 121 is set at initialization from the amount of physical memory detected; it is the virtual address corresponding to the upper limit of physical memory. If the requested phys_addr lies below this limit, it conflicts with the system's RAM, unless the page frames involved are reserved ones, that is, holes. After these checks the range is aligned to page boundaries (lines 136-138), because mappings are made in units of pages.

With that done, the function can "go for it". First it needs a range of virtual addresses. This time the range belongs to the kernel rather than to any particular process, so it is not searched for among the virtual areas hanging off some process's mm_struct; instead a sufficiently large range is taken from the kernel's own list of virtual areas. The function get_vm_area() is defined in mm/vmalloc.c:

[ioremap() > __ioremap() > get_vm_area()]

168 struct vm_struct * get_vm_area(unsigned long size, unsigned long flags)
169 {
170     unsigned long addr;
171     struct vm_struct **p, *tmp, *area;
172
173     area = (struct vm_struct *) kmalloc(sizeof(*area), GFP_KERNEL);
174     if (!area)
175         return NULL;
176     size += PAGE_SIZE;
177     addr = VMALLOC_START;
178     write_lock(&vmlist_lock);
179     for (p = &vmlist; (tmp = *p) ; p = &tmp->next) {
180         if ((size + addr) < addr) {
181             write_unlock(&vmlist_lock);
182             kfree(area);
183             return NULL;
184         }
185         if (size + addr < (unsigned long) tmp->addr)
186             break;
187         addr = tmp->size + (unsigned long) tmp->addr;
188         if (addr > VMALLOC_END-size) {
189             write_unlock(&vmlist_lock);
190             kfree(area);
191             return NULL;
192         }
193     }
194     area->flags = flags;
195     area->addr = (void *)addr;
196     area->size = size;
197     area->next = *p;
198     *p = area;
199     write_unlock(&vmlist_lock);
200     return area;
201 }

The kernel keeps a list of its own virtual areas, vmlist, a singly linked list of vm_struct structures. Both vm_struct and vmlist are used by the kernel only. Conceptually vm_struct resembles the vm_area_struct used for process address spaces, but it is much simpler. The definitions are in include/linux/vmalloc.h and mm/vmalloc.c:

14 struct vm_struct {
15     unsigned long flags;
16     void * addr;
17     unsigned long size;
18     struct vm_struct * next;
19 };

18 struct vm_struct * vmlist;

As explained earlier, kernel virtual addresses in system space are related to physical addresses by a simple mapping: adding an offset of 3GB to a physical address gives the kernel virtual address. The variable high_memory marks the virtual address corresponding to the upper limit of the physical memory actually present; it is set during system initialization. When the kernel needs a range of virtual addresses, it allocates it starting 8MB above that address. For this purpose include/asm-i386/pgtable.h defines VMALLOC_START and related constants:

132 /* Just any arbitrary offset to the start of the vmalloc VM area: the
133  * current 8MB value just means that there will be a 8MB "hole" after the
134  * physical memory until the kernel virtual memory starts.  That means that
135  * any out-of-bounds memory accesses will hopefully be caught.
136  * The vmalloc() routines leaves a hole of 4kB between each vmalloced
137  * area for the same reason. ;)
138  */
139 #define VMALLOC_OFFSET  (8*1024*1024)
140 #define VMALLOC_START   (((unsigned long) high_memory + 2*VMALLOC_OFFSET-1) & \
141                         ~(VMALLOC_OFFSET-1))
142 #define VMALLOC_VMADDR(x) ((unsigned long)(x))
143 #define VMALLOC_END     (FIXADDR_START)

The comment in the source explains clearly why an 8MB hole is left, and why a hole of one page is left after every allocated range: it makes out-of-bounds accesses easier to catch. The if statement at line 185 checks whether the gap in front of the next existing area is large enough, that is, whether the candidate start address plus the requested size still lies below the start of that area.
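Because line 176 has already added one page to the requested size, the size recorded for each area includes its trailing hole page.

Back in __ioremap(), once a virtual range has been obtained, the mapping is set up by remap_area_pages():

[ioremap() > __ioremap() > remap_area_pages()]

62 static int remap_area_pages(unsigned long address, unsigned long phys_addr,
63                  unsigned long size, unsigned long flags)
64 {
65     pgd_t * dir;
66     unsigned long end = address + size;
67
68     phys_addr -= address;
69     dir = pgd_offset(&init_mm, address);
70     flush_cache_all();
71     if (address >= end)
72         BUG();
73     do {
74         pmd_t *pmd;
75         pmd = pmd_alloc_kernel(dir, address);
76         if (!pmd)
77             return -ENOMEM;
78         if (remap_area_pmd(pmd, address, end - address,
79                      phys_addr + address, flags))
80             return -ENOMEM;
81         address = (address + PGDIR_SIZE) & PGDIR_MASK;
82         dir++;
83     } while (address && (address < end));
84     flush_tlb_all();
85     return 0;
86 }

As we have seen, the task_struct of every process has a pointer to an mm_struct, through which the process's page directory is found. Kernel space, however, does not belong to any particular process, so a separate mm_struct, called init_mm, is set aside for the kernel. The kernel has no task_struct to represent it either, so line 69 looks up the page directory entry for the start address directly in init_mm. The loop then walks through all the directory entries the range touches. Line 68 looks odd at first: it subtracts the virtual address from the physical address, giving a (possibly negative) displacement. At lines 78-79 the displacement is added to the current virtual address again, which yields the physical address. Since the virtual address changes inside the loop (line 81), the physical address changes along with it. For the i386 CPU, pmd_alloc_kernel() at line 75 is simply pmd_alloc(), as defined in include/asm-i386/pgalloc.h:

151 #define pmd_alloc_kernel        pmd_alloc

The inline function pmd_alloc() has two definitions, one for two-level and one for three-level mapping. The one for two-level mapping is in include/asm-i386/pgtable_2level.h:

[ioremap() > __ioremap() > remap_area_pages() > pmd_alloc()]

16 extern inline pmd_t * pmd_alloc(pgd_t *pgd, unsigned long address)
17 {
18     if (!pgd)
19         BUG();
20     return (pmd_t *) pgd;
21 }

So with the two-level paging of the i386, the page directory entry is simply treated as the middle directory entry; nothing is actually "allocated". Even on a Pentium CPU with Physical Address Extension (PAE), where three-level mapping is used, the function mainly "finds" the middle directory entry and allocates only when none is found.

In this way the do-while loop starting at line 73 of remap_area_pages() calls remap_area_pmd() for every page directory entry involved.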
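Two details of the loop in remap_area_pages() are easy to check in isolation: the stepping rule of line 81 together with the wraparound guard of line 83, and the displacement trick of lines 68 and 79. The sketch below models both with 32-bit unsigned arithmetic. It is a user-space illustration only; the helper names count_pgd_entries() and phys_for() are hypothetical, and PGDIR_SHIFT is assumed to be 22, as for i386 two-level paging.

```c
#include <stdint.h>

#define PGDIR_SHIFT 22                          /* assumed: i386, two-level paging */
#define PGDIR_SIZE  (UINT32_C(1) << PGDIR_SHIFT)
#define PGDIR_MASK  (~(PGDIR_SIZE - 1))

/* Hypothetical helper: count the page directory entries touched by the
 * virtual range [address, address + size), stepping the way the
 * do-while loop in remap_area_pages() does. */
static unsigned count_pgd_entries(uint32_t address, uint32_t size)
{
    uint32_t end = address + size;
    unsigned n = 0;

    do {
        n++;
        /* advance to the start of the next 4MB directory slot */
        address = (address + PGDIR_SIZE) & PGDIR_MASK;
    } while (address && address < end);  /* address == 0 means it wrapped */
    return n;
}

/* Hypothetical helper: recover the physical address that belongs to
 * `virt` from a displacement computed once up front. The subtraction
 * may wrap around, but the later addition undoes the wrap. */
static uint32_t phys_for(uint32_t virt, uint32_t first_virt, uint32_t first_phys)
{
    uint32_t delta = first_phys - first_virt;   /* like line 68 */
    return delta + virt;                        /* like line 79 */
}
```

The test for address being nonzero matters for ranges that reach the top of the address space: there the rounded-up address wraps to 0, and without the guard the loop would start over from the bottom.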
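remap_area_pmd() is almost identical. For every page table involved (with the two-level mapping of the i386, each middle directory entry is in fact a page directory entry and so points to a page table), it first allocates or finds the page table and then calls remap_area_pte(), which is also defined in arch/i386/mm/ioremap.c:

[ioremap() > __ioremap() > remap_area_pages() > remap_area_pmd() > remap_area_pte()]

15 static inline void remap_area_pte(pte_t * pte, unsigned long address, unsigned long size,
16     unsigned long phys_addr, unsigned long flags)
17 {
18     unsigned long end;
19
20     address &= ~PMD_MASK;
21     end = address + size;
22     if (end > PMD_SIZE)
23         end = PMD_SIZE;
24     if (address >= end)
25         BUG();
26     do {
27         if (!pte_none(*pte)) {
28             printk("remap_area_pte: page already exists\n");
29             BUG();
30         }
31         set_pte(pte, mk_pte_phys(phys_addr, __pgprot(_PAGE_PRESENT | _PAGE_RW |
32                     _PAGE_DIRTY | _PAGE_ACCESSED | flags)));
33         address += PAGE_SIZE;
34         phys_addr += PAGE_SIZE;
35         pte++;
36     } while (address && (address < end));
37 }

The loop simply sets every page table entry involved (line 31). Each entry is preset with _PAGE_PRESENT, _PAGE_RW, _PAGE_DIRTY and _PAGE_ACCESSED, plus whatever flags the caller supplied.

In the scenario of kswapd swapping out pages we saw that kswapd picks a suitable process from the queue of task structures and then swaps out some of its pages through swap_out_mm(). The kernel's mm_struct, init_mm, stands alone: it cannot be reached from the task structure of any process. kswapd therefore never sees the virtual areas in init_mm, and the pages of those areas are never swapped out; they stay resident in memory.

2.12 The System Call brk()

Although its visibility is low, brk() is perhaps one of the most frequently used system calls: it is how a user process asks the kernel for memory. People are often not aware that they are calling brk(), because hardly anyone calls it directly. It is reached indirectly through C library functions such as malloc() (or language constructs such as new in C++). If malloc() is thought of as retail, brk() is wholesale. The library function malloc() keeps a small warehouse for the user process (malloc itself is part of the process). When the process needs more memory, it asks the warehouse; when the warehouse runs low, it buys wholesale from the kernel through brk().

As mentioned before, every process has 3GB of user virtual address space. This does not mean that a user process may use addresses in that 3GB range at will. Virtual addresses can be used only once they are mapped onto some physical storage (memory or disk space), and the kernel sets up and manages those mappings. Asking the kernel for memory means asking it to allocate a range of virtual addresses together with the corresponding physical pages and to establish the mapping between them. Since each process's virtual address space is large (3GB) and what is actually needed is small, the kernel cannot allocate physical space and set up mappings for the whole address space when a process is created. It can only "allocate" as much as is needed, when it is needed.

How, then, does the kernel manage the 3GB of virtual address space of each process? Roughly speaking, the image file produced by compiling and linking a user program has a code segment and a data segment (comprising the data section and the bss section), with the code segment below and the data segment above. The data segment contains all statically allocated data, including global variables and local variables declared static.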
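This space is the basic requirement of a process, so when the kernel builds the run-time image of a process it allocates all of it, both the virtual address ranges and the physical pages, and sets up the mapping between the two. The space used by the stack is also a basic requirement and is likewise allocated when the process is set up (though it can be extended). The difference is that the stack is placed at the top of the virtual address space and grows downward at run time, while the code and data segments sit at the bottom and do not grow upward at run time. (These must not be confused with the "code segment" and "data segment" that the segment registers establish in the X86 architecture.) The region between the top of the data segment, end_data, and the lower edge of the stack is one enormous hole. This is the space that can be allocated dynamically at run time.

Initially the dynamically allocated space starts at the process's end_data, an address known to both the kernel and the process. From then on, every dynamic allocation of a piece of "memory" pushes this boundary up by some distance, and both the kernel and the process must keep track of where the boundary currently is. On the process side this is handled by malloc() or similar library functions. In the kernel the current boundary is recorded in the process's mm_struct: the structure has a field brk that marks the current boundary of the dynamically allocated area. When a process needs memory, it adds the requested size to its current boundary; the sum is the new boundary it asks for, and that is the parameter brk of the brk() call. If the kernel can satisfy the request, brk() returns 0, and from then on all the virtual addresses between the old and the new boundary can be used. If the kernel finds that it cannot satisfy the request (for example, because physical space is exhausted), or that the new boundary comes too close to the stack at the top, it refuses and the call returns -1.

The system call brk() is implemented in the kernel by sys_brk(), whose code is in mm/mmap.c. The function can be used both to allocate space, by pushing the boundary of the dynamically allocated area up, and to release space, by pulling it back. Its code accordingly falls into two parts. We read the first part first:

[sys_brk()]

113 /*
114  *  sys_brk() for the most part doesn't need the global kernel
115  *  lock, except when an application is doing something nasty
116  *  like trying to un-brk an area that has already been mapped
117  *  to a regular file.  in this case, the unmapping will need
118  *  to invoke file system routines that need the global lock.
119  */
120 asmlinkage unsigned long sys_brk(unsigned long brk)
121 {
122     unsigned long rlim, retval;
123     unsigned long newbrk, oldbrk;
124     struct mm_struct *mm = current->mm;
125
126     down(&mm->mmap_sem);
127
128     if (brk < mm->end_code)
129         goto out;
130     newbrk = PAGE_ALIGN(brk);
131     oldbrk = PAGE_ALIGN(mm->brk);
132     if (oldbrk == newbrk)
133         goto set_brk;
134
135     /* Always allow shrinking brk. */
136     if (brk <= mm->brk) {
137         if (!do_munmap(mm, newbrk, oldbrk-newbrk))
138             goto set_brk;
139         goto out;
140     }
141

The parameter brk is the requested new boundary. It must not lie below the end of the code segment, and it is aligned to a page boundary. If the new boundary is lower than the old one, the request is not for more space but for releasing space, so do_munmap() is called to remove the mapping of part of the range. This is an important function.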
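The tests at lines 128-136 of sys_brk() can be restated as a small pure function, which makes the three outcomes of the first part easy to see. This is a sketch for illustration, not the kernel's code: the enum brk_action and the function classify_brk() are hypothetical, and PAGE_SIZE is assumed to be 4096 bytes. The growing case is only reported, because the listing above does not cover it.

```c
#define PAGE_SIZE     4096UL                 /* assumed page size (i386) */
#define PAGE_ALIGN(a) (((a) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

/* Hypothetical outcome codes for one brk request. */
enum brk_action { BRK_REJECT, BRK_NOCHANGE, BRK_SHRINK, BRK_GROW };

/* Classify a request for the new boundary `brk`, given the current
 * boundary `cur_brk` and the end of the code segment `end_code`.
 * For BRK_SHRINK, *start and *len receive the range that would be
 * passed to do_munmap(); otherwise both are set to 0. */
static enum brk_action classify_brk(unsigned long brk, unsigned long cur_brk,
                                    unsigned long end_code,
                                    unsigned long *start, unsigned long *len)
{
    unsigned long newbrk, oldbrk;

    *start = 0;
    *len = 0;
    if (brk < end_code)              /* below the code segment: refuse */
        return BRK_REJECT;
    newbrk = PAGE_ALIGN(brk);
    oldbrk = PAGE_ALIGN(cur_brk);
    if (oldbrk == newbrk)            /* same page: only the record changes */
        return BRK_NOCHANGE;
    if (brk <= cur_brk) {            /* shrinking is always allowed */
        *start = newbrk;
        *len = oldbrk - newbrk;
        return BRK_SHRINK;
    }
    return BRK_GROW;                 /* handled by the second part of sys_brk() */
}
```

Because both boundaries are rounded up to a page boundary before they are compared, small adjustments that stay within the same page never reach do_munmap() at all.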
The code of do_munmap() is in mm/mmap.c:

[sys_brk() > do_munmap()]

664 /* Munmap is split into 2 main parts -- this part which finds
665  * what needs doing, and the areas themselves, which do the
666  * work.  This now handles partial unmappings.
667  * Jeremy Fitzhardine
668  */
669 int do_munmap(struct mm_struct *mm, unsigned long addr, size_t len)
670 {
671     struct vm_area_struct *mpnt, *prev, **npp, *free, *extra;
672
673     if ((addr & ~PAGE_MASK) || addr > TASK_SIZE || len > TASK_SIZE-addr)
674         return -EINVAL;
675
676     if ((len = PAGE_ALIGN(len)) == 0)
677         return -EINVAL;
678
679     /* Check if this memory area is ok - put it on the temporary
680      * list if so..  The checks here are pretty simple --
681      * every area affected in some way (by any overlap) is put
682      * on the list.  If nothing is put on, nothing is affected.
683      */
684     mpnt = find_vma_prev(mm, addr, &prev);
685     if (!mpnt)
686         return 0;
687     /* we have  addr < mpnt->vm_end  */
688
689     if (mpnt->vm_start >= addr+len)
690         return 0;
691
692     /* If we'll make "hole", check the vm areas limit */
693     if ((mpnt->vm_start < addr && mpnt->vm_end > addr+len)
694         && mm->map_count >= MAX_MAP_COUNT)
695         return -ENOMEM;
696

The function find_vma_prev() does essentially the same job as find_vma(), which we read in the section "Several Important Data Structures and Functions". It scans the list, or the AVL tree, of vm_area_struct structures of the current process's user space for the first area whose end address is higher than addr, and returns a pointer to that area's vm_area_struct if there is one. The difference is that it also returns, through the parameter prev, a pointer to the structure of the preceding area. We shall see shortly why that pointer is needed. If the returned pointer is 0, or if the start address of the area found is not below addr+len, then the space to be unmapped was never mapped in the first place, and the function returns 0 at once. If the space lies in the middle of an area, unmapping it leaves a hole and splits the original area in two. However, the number of virtual areas a process may own is limited, so if that number has already reached the upper limit MAX_MAP_COUNT, the operation is not allowed. We read on:

[sys_brk() > do_munmap()]

697     /*
698      * We may need one additional vma to fix up the mappings ...
699      * and this is the last chance for an easy error exit.
700      */
701     extra = kmem_cache_alloc(vm_area_cachep, SLAB_KERNEL);
702     if (!extra)
703         return -ENOMEM;
704
705     npp = (prev ? &prev->vm_next : &mm->mmap);
706     free = NULL;
707     spin_lock(&mm->page_table_lock);
708     for ( ; mpnt && mpnt->vm_start < addr+len; mpnt = *npp) {
709         *npp = mpnt->vm_next;
710         mpnt->vm_next = free;
711         free = mpnt;
712         if (mm->mmap_avl)
713             avl_remove(mpnt, &mm->mmap_avl);
714     }
715     mm->mmap_cache = NULL;  /* Kill the cache. */
716     spin_unlock(&mm->page_table_lock);
717

Since unmapping part of a range may split an existing area in two, a blank vm_area_struct, extra, is allocated in advance. On the other hand, the space to be unmapped may span several areas, so a for loop moves every area involved onto a temporary list, free. If an AVL tree has been built, the vm_area_struct structures of these areas are removed from the tree as well. As mentioned earlier, the pointer mmap_cache in mm_struct points to the area found by the most recent find_vma() operation, because operations on virtual areas tend to be consecutive (see the code of find_vma()). Now that the layout of the user space has changed, that continuity has most likely been broken, so the pointer is cleared to 0. With this, all preparations are complete and the actual unmapping can begin.

[sys_brk() > do_munmap()]

718     /* Ok - we have the memory areas we should free on the 'free' list,
719      * so release them, and unmap the page range..
720      * If the one of the segments is only being partially unmapped,
721      * it will put new vm_area_struct(s) into the address space.
722      * In that case we have to be careful with VM_DENYWRITE.
723      */
724     while ((mpnt = free) != NULL) {
725         unsigned long st, end, size;
