Manual for BLUPF90 family of programs

Ignacy Misztal (ignacy@uga.edu), Shogo Tsuruta (shogo@uga.edu), Daniela Lourenco (danilino@uga.edu)
University of Georgia, USA
Ignacio Aguilar (iaguilar@inia.org.uy)
INIA, Uruguay
Andres Legarra (andres.legarra@toulouse.inra.fr)
INRA Toulouse, France
Zulma Vitezica (zulma.vitezica@ensat.fr)
ENSAT, France

University of Georgia, Athens, USA.
First released May 12, 2014
Last edit May 21, 2014

Table of Contents

Introduction
List of programs from Wiki page
Programs in a chart
Parameter file for application programs
Description of effects
Definition of random effects
Correlated effects
Data and Pedigree files
Data file
Pedigree file
Error messages in parameter file
RENUMF90 parameter file
When to use what program
BLUP
Variance component estimation
Genomic programs
Examples for parameter file:
Appendix A (single trait animal model)
Appendix B (multiple trait sire model)
Appendix C (test-day model)
Appendix D (multibreed maternal effect model)
Appendix E (random regression model)
Appendix F (terminal cross model)
Appendix G (competitive model)
Appendix H (genomic model)
Appendix I (complete genomic analysis)
Appendix J (selected programming details)

Introduction

BLUPF90 is a family of programs for mixed-model computations with focus on animal breeding applications. The programs can do data conditioning, estimate variances using several methods, calculate BLUP for very large data sets, calculate approximate accuracy, and use SNP information for improved accuracy of breeding values and for genome-wide association studies (GWAS).

The programs have been designed with 3 goals in mind:
1. Flexibility to support a large set of models found in animal breeding applications.
2. Simplicity of software to minimize errors and facilitate modifications.
3. Efficiency at the algorithmic level.

Aside from being used in hundreds of studies, the programs are utilized for commercial genetic evaluation in dairy, beef, pigs and broiler chicken by major companies/institutions/associations in the US and beyond.

The programs are written in Fortran 90/95 and originated as exercises for a class taught by Ignacy Misztal at the University of Georgia. Over time, they have been upgraded and enhanced by many contributors. Details on programming and computing algorithms are available in an Interbull 1999 paper and as course notes. Nearly all programs are available in source code.

Online information about the programs is available at http://nce.ads.uga.edu/wiki/doku.php as wiki pages. There is a discussion group, blupf90, at groups.yahoo.com.

List of programs from Wiki page

Latest versions are available from the website at http://nce.ads.uga.edu/wiki/doku.php?id=application_programs
(Use the latest versions. All applications for Linux, Mac OSX, and Windows are updated frequently.)

The programs support mixed models with multiple correlated effects, multiple animal models, and dominance.

+ BLUPF90 - BLUP in memory (blupf90.pdf)
+ REMLF90 - accelerated EM REML (remlf90.pdf)
+ QXPAK - joint analysis of QTL and polygenic effects (M. Perez-Enciso); QxPak web page
+ AIREMLF90 - Average Information REML with several options including EM-REML and heterogeneous residual variances (S. Tsuruta)
+ CBLUP90 - solutions for bivariate linear-threshold models
+ CBLUP90THR - as above but with thresholds computed and many linear traits (B. Auvray)
+ CBLUP90REML - as above but with quasi REML (B. Auvray)
+ GIBBSF90 - simple block implementation of Gibbs sampling
+ GIBBS1F90 - as above but faster for creating mixed model equations only once
+ GIBBS2F90 - as above but with joint sampling of correlated effects
+ GIBBS3F90 - as above with support for heterogeneous residual variances
+ POSTGIBBSF90 - statistics and graphics for post-Gibbs analysis (S. Tsuruta)
+ THRGIBBSF90 - Gibbs sampling for any combination of categorical and linear traits (D. Lee)
+ THRGIBBS1F90 - as above but simplified with several options (S. Tsuruta)
+ RENUMF90 - a renumbering program that also can check pedigrees and assign unknown parent groups; supports large data sets
+ INBUPGF90 - a program to calculate inbreeding coefficients with incomplete pedigree (I. Aguilar)

Available by request
+ MRF90 - Method R program suitable for very large data sets; contact T. Druet.
+ COXF90 - Bayesian Cox model; contact J. P. Sanchez (JuanPablo.Sanchez@irta.cat)
+ BLUPF90HYP - BLUPF90 with hypothesis testing (F and Chi2 tests); contact J. P. Sanchez as above

Available only under research agreement
+ BLUP90IOD2 - BLUP by iteration on data with support for very large models (S. Tsuruta)
+ CBLUP90IOD - BLUP by iteration on data for threshold-linear models
+ ACCF90 - approximation of accuracies for breeding values
+ BLUP90MBE - BLUP by iteration on data with support for very large models for multi-breed evaluations
+ BLUP90ADJ - BLUP data preadjustment tool

Included in application programs
+ PREGSF90 - genomic preprocessor that combines genomic and pedigree relationships (I. Aguilar)
+ POSTGSF90 - genomic postprocessor that extracts SNP solutions after genomic evaluations (single step, GBLUP) (I. Aguilar)

Other programming contributions were made by Miguel Perez-Enciso and Francois Guillaume (Jenkins hashing functions).

Programs in a chart

[Chart of the programs grouped by task. Recoverable panel labels: renumbering and data quality control; BLUP with explicit equations; BLUP by iteration on data (large data sets); (average information) REML; Bayesian analysis, linear traits; Bayesian analysis, categorical traits; approximated accuracies; sample analysis; preprocessing of SNP data (invoked automatically); POSTGSF90 post-genomic analyses including estimation of SNP effects and GWAS; prediction of GEBV based only on SNP effects.]

Application programs (BLUP*, *REMLF90, THRGIBBS* and GIBBS*) are driven by parameter files and require data files with effects renumbered consecutively from 1. Renumbering and quality control can be done by RENUMF90, which is also driven by a parameter file. Separating renumbering from the application programs makes it possible to support complicated models. Some models are not directly supported by RENUMF90 and require tweaking the parameter file of the application programs.

Parameter file for application programs

The parameter file consists of fixed keywords, which cannot be changed, each followed by values, with the following structure:

    Keywords* and example values    Description

    DATAFILE
    file.dat                        Name of file with phenotypes; free Fortran format (space-delimited file)
    NUMBER_OF_TRAITS
    2                               Number of traits
    NUMBER_OF_EFFECTS
    6                               Number of effects in a model except for residual
    OBSERVATION(S)
    1 2                             Position(s) of observations in data file
    WEIGHT(S)
                                    Position of weight on observations if used, otherwise blank
                                    (the example value and its explanatory note are not legible in the source)
    EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
    4 6 10 cross                    4 6 = crossclassified effect positions in data file for 2 traits; 10 = levels
    5 0 100 cross                   5 0 = crossclassified effect positions for 2 traits; 100 = levels
    6 6 1 cov                       6 6 = covariable positions in data file
    7 7 10 cov 4                    7 7 = covariable nested in effect position 4; 10 = levels
    8 8 1000 cross                  8 8 = crossclassified effect positions for 2 traits; 1000 = levels
    10 9 1000 cross                 10 9 = crossclassified effect positions for 2 traits; 1000 = levels
    RANDOM_RESIDUAL VALUES
    10 1                            Residual variance or residual covariance matrix
    1 10                            (for a 2 trait model)
    RANDOM_GROUP
    5 6                             List of effect numbers that form a group
                                    (here correlated random effects 5 and 6)
    RANDOM_TYPE
    add_animal                      Type of random effect (distribution), e.g., diagonal, add_sire, add_animal, add_an_upg
    FILE
    file.ped                        Pedigree file or other file associated with random effect; blank if none
    (CO)VARIANCES
                                    (Co)variance matrix for each random effect
                                    (the example values for the 2 trait model are not legible in the source)

* Keywords need to be typed exactly (up to 20 characters). When preparing a new parameter file, consider modifying an existing file.

Description of effects

The effects are specified after the keyword:

    EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]

Each line contains the following:
- Position(s) of each effect in the data file; t positions for t traits
- Number of levels (assumed consecutive from 1)
- Type of effect: "cross" for crossclassified, and "cov" for covariable
  o crossclassified uses integer numbers from 1
  o covariable uses integer or real numbers
- For nested covariables, the following number (or t numbers for t traits) indicates the position of the nesting effect in the data file
- Text after # can be used as a comment

Consider a data file (file.dat) with several columns (the column layout is only partly legible in the source; the surviving sample values are 76, 3.20 and 18.00).
Let i go from 1 to 50, j from 1 to 80, and k from 1 to 200.
The model:

    $y_{ijk} = a_j + b_i + c\,x_{ijk} + e_{ijk}$

will be specified in the parameter file as

    DATAFILE
    file.dat
    NUMBER_OF_TRAITS
    1
    NUMBER_OF_EFFECTS
    3
    OBSERVATION(S)
    4
    WEIGHT(S)

    EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
    2 80 cross    # position 2, 80 levels
    1 50 cross    # position 1, 50 levels
    6 1 cov       # covariable on position 6, one level

By definition, a regular covariable has one level (i.e., a single slope as regression).

For a similar model but with a nested covariable:

    $y_{ijk} = a_j + b_i + c_i\,x_{ijk} + e_{ijk}$

the description will change to:

    EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
    2 80 cross    # position 2, 80 levels
    1 50 cross    # position 1, 50 levels
    6 50 cov 1    # covariable on position 6 nested in position 1; 50 levels

Assume a two trait model:

    $y_{1ijk} = a_j + c_{1i}\,x_{ijk} + e_{1ijk}$
    $y_{2ijk} = b_i + c_{2i}\,x_{ijk} + e_{2ijk}$

This corresponds to:

    NUMBER_OF_TRAITS
    2
    NUMBER_OF_EFFECTS
    3
    EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
    2 0 80 cross    # position 2 for trait 1 only, 80 levels
    0 1 50 cross    # position 1 for trait 2 only, 50 levels
    6 6 50 cov 1    # covariable on position 6 for two traits, nested in position 1

"0" in effect definitions means that the effect is missing for that trait.

The two crossclassified effects above can be merged:

    NUMBER_OF_EFFECTS
    2
    EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
    2 1 80 cross    # positions 2 and 1 for traits 1 and 2; 80 is max(50,80) levels
    6 6 50 cov 1    # covariable on position 6 for two traits, nested in position 1

Definition of random effects

RANDOM_GROUP defines one group of random effects. A group is one effect or multiple (correlated) effects that share the same covariance structure, e.g., direct-maternal effects or random regressions.

The structure of RANDOM_GROUP is:

    RANDOM_GROUP
    5

The value corresponds to the effect number specified above; "5" means that the 5th effect is random.
Or "5 6" means that the 5th and 6th effects are correlated random effects.

RANDOM_TYPE defines a covariance structure:

    diagonal    $\mathrm{var}(u) = sI$ or $G \otimes I$, where $s$ is a variance and $G$ is a covariance matrix.

For other types, see "Random effects and Pedigree files".

Assume a model:

    y = farm + animal_additive + animal_environment + error

with var(animal_additive) = $2.5\,A$, var(animal_environment) = $5.1\,I$, var(error) = $13.7\,I$.

With these effects:

    EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
    3 100 cross     # effect 1: farm
    2 1000 cross    # effect 2: additive genetic
    2 1000 cross    # effect 3: permanent environment
    RANDOM_RESIDUAL VALUES
    13.7
    RANDOM_GROUP
    2               # this is for effect 2 on the effect list
    RANDOM_TYPE
    add_animal      # additive genetic
    FILE
    file.ped        # name of pedigree file
    (CO)VARIANCES
    2.5
    RANDOM_GROUP
    3               # effect 3 on the effect list above
    RANDOM_TYPE
    diagonal        # permanent environment
    FILE
                    # no file associated with diagonal structures
    (CO)VARIANCES
    5.1

Correlated effects

Assume a model:

    y = farm + season + direct + maternal + error

with

    $\mathrm{var}(\text{direct}, \text{maternal}) = \begin{bmatrix} 5 & 1 \\ 1 & 2 \end{bmatrix} \otimes A$

and the effects specified as:

    EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
    3 100 cross     # effect 1: farm
    4 4 cross       # effect 2: season
    2 1000 cross    # effect 3: direct
    2 1000 cross    # effect 4: maternal

The distribution of the random effects is specified below:

    RANDOM_GROUP
    3 4             # direct and maternal effects
    RANDOM_TYPE
    add_animal      # additive genetic
    FILE
    file.ped        # name of pedigree file
    (CO)VARIANCES
    5 1
    1 2

Random regression models may have many correlated random effects.
Assume a data file with the following positions:

    1 to 4: polynomials
    5:      animal number (1000 levels)
    6:      herd year season (50 levels)

    EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
    6 50 cross      # herd year season
    1 1000 cov 5    # first polynomial nested within the animal effect, position 5
    2 1000 cov 5    # second polynomial nested within the animal effect, position 5
    3 1000 cov 5    # third polynomial nested within the animal effect, position 5
    4 1000 cov 5    # fourth polynomial nested within the animal effect, position 5
    RANDOM_GROUP
    2 3 4 5         # all covariables are correlated (effects 2, 3, 4, and 5 on the list above)
    RANDOM_TYPE
    add_animal      # additive genetic
    FILE
    file.ped        # name of pedigree file
    (CO)VARIANCES
    (4x4 matrix)

There are a few types of additive genetic effects, each with a different pedigree format.

a) additive sire (add_sire)
The pedigree file has the following format:
    sire number, sire's sire number, sire's maternal grandsire (MGS) number
where unknown sire's sire and/or sire's MGS numbers are replaced by 0.

b) additive animal (add_animal)
The pedigree file has the following format:
    animal number, sire number, dam number
where unknown sire and/or dam numbers are replaced by 0.
c) additive animal with unknown parent groups (add_an_upg)
The pedigree file has the following format:
    animal number, sire number, dam number, parent code
where sire and/or dam numbers can be replaced by unknown parent group numbers, and
    parent code = 3 - number of known parents:
        1 (both parents known)
        2 (one parent known)
        3 (both parents unknown)

d) additive animal with unknown parent groups and inbreeding (add_an_upginb)
The pedigree file has the following format:
    animal number, sire number, dam number, inb/upg code
where sire and/or dam numbers can be replaced by unknown parent group numbers, and

    $\text{inb/upg code} = \dfrac{4000}{(1+m_s)(1-F_s) + (1+m_d)(1-F_d)}$

where $m_s$ ($m_d$) is 0 whenever the sire (dam) is known, and 1 otherwise, and $F_s$ ($F_d$) is the coefficient of inbreeding of the sire (dam). For example, the inb/upg code for an animal with both parents known and non-inbred is 2000.

e) parental dominance (par_domin)
The pedigree class file has the following format:
    s-d s-sd s-dd ss-d ds-d ss-sd ss-dd ds-sd ds-dd code
where x-y is a combination number of animals x and y, s is sire, d is dam, sd is sire of dam, etc. Code is a number from 0 to 255 and refers to the combination of missing subclasses. If one line is:
    p s8 s7 s6 s5 s4 s3 s2 s1 code
then $\text{code} = \sum_{i=1}^{8} a_i\,2^{\,i-1}$, where $a_i = 0$ if $s_i = 0$ and $a_i = 1$ otherwise. For example, the code for a line with all nonzero parental subclasses is 255. For a line with only zero parental subclasses, code = 0. If lines are ordered so that parental classes with code = 0 come last, they may be omitted and will be added automatically.
The parental dominance file can be created by program RENDOMN.

Data and Pedigree files

All files are free format, with fields separated by spaces. By default, 0 is a missing value for all effects, including covariables.
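
The parent code (add_an_upg) and the inb/upg code (add_an_upginb) described above depend only on which parents are known and on the parents' inbreeding coefficients. The following is a minimal Python sketch of the two formulas, not code from the BLUPF90 programs: the function names `parent_code` and `inb_upg_code` are hypothetical, an unknown parent is assumed to contribute an inbreeding coefficient of 0, and no rounding is applied because the text does not say how the code is stored.

```python
def parent_code(sire_known: bool, dam_known: bool) -> int:
    """Parent code for add_an_upg: 3 minus the number of known parents."""
    return 3 - (int(sire_known) + int(dam_known))


def inb_upg_code(sire_known: bool, dam_known: bool,
                 f_sire: float = 0.0, f_dam: float = 0.0) -> float:
    """inb/upg code = 4000 / ((1+ms)(1-Fs) + (1+md)(1-Fd)).

    ms (md) is 0 when the sire (dam) is known and 1 otherwise.
    Assumption: callers pass F = 0 for an unknown parent.
    """
    ms = 0 if sire_known else 1
    md = 0 if dam_known else 1
    return 4000.0 / ((1 + ms) * (1.0 - f_sire) + (1 + md) * (1.0 - f_dam))


if __name__ == "__main__":
    # Both parents known and non-inbred: the example given in the text.
    print(parent_code(True, True), inb_upg_code(True, True))
    # Both parents unknown.
    print(parent_code(False, False), inb_upg_code(False, False))
```

With both parents known and non-inbred the sketch reproduces the value 2000 quoted in the text; an inbred parent lowers the denominator and therefore raises the code.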
Transferring a file from Windows (DOS) to Linux environment
Use "dos2unix" to convert the DOS (Windows) format to the UNIX (Linux) format if the programs show an error message while reading a file.

Data file
a. A space is the delimiter. At least one space between columns is required.
b. A dot (.) is read as an ordinary character, not as a missing value (default missing value = 0).
c. Check the data again, especially after converting from another format or software such as EXCEL, SAS, ...
d. For Gibbs sampling programs with "OPTION cont", copy the previous output files somewhere else first, in case a mistake overwrites those files.

Pedigree file
a. An original pedigree file for RENUMF90 can include alphanumeric characters in free format.
b. Remove duplicates.
c. Use 0 for unknown parent(s).

Error messages in parameter file
a. Wrong data file name
   Check the output on the screen for the data file name and the number of records. The program will not stop if a file with the wrong name happens to exist.
b. Wrong pedigree file name
   Check the output on the screen for the pedigree file name and the number of animals. The program will not stop if a file with the wrong name happens to exist.
c. Wrong positions or formats for observations and effects
   The program may not stop and may produce wrong results. Check the output on the screen for the number of levels for each effect.
d. Missing or skipping one or more fixed lines in the parameter file
   The program may stop. Check for the missing line.
e. Misspelling
   The program may stop. Correct the spelling.
f. Missing an empty last line
   The program may not stop. Parameter, data, and pedigree files may need one extra line at the end of the file.
g. (Co)variance matrix is not symmetric, not positive definite, or not the right size
   The program may not stop.
h. A good result does not mean that your parameter file is correct. Always double-check!
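
Because the programs may not stop on a bad (co)variance matrix (item g above), it can help to check the matrix before a run. The following is a minimal sketch in plain Python, not part of the BLUPF90 programs: the function name `check_covariance`, the tolerance, and the idea of passing the expected dimension as `expected_size` are assumptions made for illustration. Positive definiteness is tested by attempting a Cholesky factorisation.

```python
import math


def check_covariance(matrix, expected_size, tol=1e-12):
    """Return a list of problems found in a (co)variance matrix (empty if none)."""
    n = len(matrix)
    if n != expected_size or any(len(row) != n for row in matrix):
        return ["not right sized"]  # later checks need a square matrix
    for i in range(n):
        for j in range(i):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                return ["not symmetric"]
    # A Cholesky factorisation exists only for positive definite matrices.
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return ["not positive definite"]
                lower[i][j] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return []


if __name__ == "__main__":
    print(check_covariance([[5.0, 1.0], [1.0, 2.0]], 2))
    print(check_covariance([[1.0, 2.0], [2.0, 1.0]], 2))
```

An empty list means the three checks passed; it says nothing about whether the values themselves are sensible starting points for the model.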
RENUMF90 parameter file

RENUMF90 is a renumbering program that creates input (data and pedigree) files for BLUPF90 programs and provides basic statistics.

Parameter file

    DATAFILE
    f                       # data file name; input files cannot contain the character #
                            # because it is used as a comment
    TRAITS
    t1 t2 ... tn            # positions of traits in data file
    FIELDS_PASSED TO OUTPUT
    p1 p2 ... pm            # positions that are not renumbered
    WEIGHT(S)
    w                       # position of weight - fraction to the residual variance
    RESIDUAL_VARIANCE
    R                       # matrix of residual (co)variances
    EFFECT
    e1 e2 e3 ... type form  # e1 e2 e3 ... = position of this effect for each trait
                            # type = 'cross' for crossclassified or 'cov' for covariables
                            # form = 'alpha' for alphanumeric or 'numer' for numeric
    EFFECT
    d1 d2 d3 ... cov        # d1 d2 d3 ... = positions of covariables nested in the
                            # following crossclassified effects
    NESTED
    e1 e2 e3 ... form       # e1 e2 e3 ... = positions of crossclassified effects nested
                            # form = 'alpha' for alphanumeric or 'numer' for numeric
    RANDOM
    random_type             # 'diagonal', 'sire' or 'animal' for random effect
    OPTIONAL
    o1 o2 o3 ...            # 'pe' for permanent environment, 'mat' for maternal, and
                            # 'mpe' for maternal permanent environment
    FILE
    fped                    # pedigree file name
    FILE_POS
    animal sire dam alt_dam yob
                            # positions of animal, sire, dam, alternate dam (recipient dam),
                            # and year of birth in pedigree file (default 1 2 3 0 0)
    SNP_FILE
    fsnp                    # specify a SNP file with ID and SNP information; the relationship
                            # matrix will include the genomic information; the fsnp file should
                            # start with ID in the same format as fped, and SNP info needs to
                            # start from a fixed column and include digits 0, 1, 2 and 5; ID and
                            # SNP info need to be separated by at least one space; see more
                            # information in PREGSF90.
    PED_DEPTH
    p                       # depth of pedigree search (default 3); all pedigrees are loaded
                            # if ... (the condition is truncated in the source)
    GEN_INT
    min avg max             # minimum, average and maximum generation interval; applicable only
                            # if year of birth is present in the pedigree file; minimum and
                            # maximum are used for pedigree checks; average is used to predict
                            # year of birth of a parent with missing pedigree
    REC_SEX
    sex                     # if only one sex has records, specifies which parent it is; used
                            # for pedigree checks
    UPG_TYPE
    t                       # 'yob' = based on year of birth; if 'in_pedigrees', the value of a
                            # missing parent should be -x, where x is the UPG number that this
                            # missing parent should be allocated to; in this option, all known
                            # parents should have pedigree lines, i.e., each parent field should
                            # contain either the ID of a real parent or a negative UPG number;
                            # if 'internal', allocation is by a user-written function
                            # custom_upg(year_of_birth, sex, ID, parent_code)
    (CO)VARIANCES
    G                       # (co)variances for animal effects or animal + maternal effects
    (CO)VARIANCES_PE
    GPE                     # (co)variances for the PE effect
    (CO)VARIANCES_MPE
    GMPE                    # (co)variances for the MPE effect

The additive pedigree file built by RENUMF90 is renaddxx.ped and has the following structure:
1) animal number (from 1)
2) parent 1 number or unknown parent group number for parent 1
3) parent 2 number or unknown parent group number for parent 2
4) 3 minus number of known parents
5) known or estimated year of birth (0 if not provided)
6) number of known parents (if genotypes are used: 10 + number of known parents)
7) number of records
8) number of progenies as parent 1
9) number of progenies as parent 2
10) original animal id

Can we change the maximum size of character fields?

    OPTION alpha_size nn    # new size (default 20 characters)

How can we specify interactions?

Combining fields or interactions
Several fields in the data file can be combined into one using the COMBINE keyword.

    COMBINE a b c ...       # COMBINE keywords need to be at the top of the parameter file,
                            # but possibly after comments
For example:

    COMBINE 7 2 3 4

combines the content of fields 2 3 4 into field 7; the data file is not changed, only the program treats field 7 as fields 2 3 4 put together (without spaces). The combined field can be treated as "numeric" if its total length is < 9, or as "alpha".

Example

In the listings below, ? marks characters that are illegible in the source.

Input file - data

    aa 1 10
    aa 2 12
    ?? ? ??
    cc 1 ?2
    cc 2 14
    ?? 2 ?3
    ?? 2 1?

Pedigree file - ped

    ?? 0  0  2006
    cc ?? ?? 2006
    aa ?? 0  2006
    ee ?? 0  2002
    ?? 0  0  2002
    ?? ?? 0  2002
    ?? 0  0  2002
    ?? 0  0  2002
    ?? 0  0  2000

Parameter file - testpar1

    # Parameter file for program renf90; it is translated to parameter
    # file for BLUPF90 family of programs.
    DATAFILE
    data
    TRAITS
    3
    FIELDS_PASSED TO OUTPUT
    1
    WEIGHT(S)

    RESIDUAL_VARIANCE
    1
    EFFECT
    2 cross num
    EFFECT
    1 cross alpha
    RANDOM
    animal
    OPTIONAL
    #mat
    FILE
    ped
    FILE_POS
    1 2 3 0 4
    PED_DEPTH
    3
    GEN_INT
    1 2 10
    UPG_TYPE
    yob
    2002 2003

Output log

    RENUMF90 version 1.73
    name of parameter file? testpar1
    datafile:data
    traits: 3
    fields passed: 1
    1.000
    Processing effect 1 of type cross
    item kind=num
    Processing effect 2 of type cross
    item kind=alpha
    pedigree file name "ped"
    positions of animal, sire, dam, alternate dam and yob 1 2 3 0 4
    pedigree traced to generation 3
    Minimum, average and maximum generation intervals: 1 2 10
    Unknown parent groups separated by years: 2002 2003
    Maximum size of character fields: 20
    hash tables for effects set up
    table with 2 elements sorted
    added count
    Effect group 1 of column 1 with 2 levels
    table expanded from 10000 to 10000 records
    added count
    Effect group 2 of column 1 with 5 levels
    wrote statistics in file "renf90.tables"

    Basic statistics for input data (missing value code is 0)
    Pos    Min       Max       Mean      SD        N
    2      1.0000    2.0000    ?         0.53482   7
    3      10.000    14.000    12.286    1.4960    7

    Correlation matrix
             2       3
    2      1.00    0.80
    3      0.80    1.00

    Counts of nonzero values (order as above)
    (counts only partly legible in the source; the last legible value is 7)

    random effect 2
    type:animal
    opened output pedigree file "renadd02.ped"
    10 pedigree records loaded
    4 parent(s) in round 1
    (one message, partly legible in the source: "... younger than parent 1 by ...")

    Unknown parent group allocation
    Equation   Group   #Animals   Years
    10         1       ?          -2002
    11         2       ?          2002-2002
    12         3       ?          2003-

    Number of animals with records: 5
    Number of parents without records: 4

Output data file - renf90.dat
Columns: observation, effect 1, effect 2, animal ID (field passed to output). The rows are not legible in the source.

Output pedigree file - renadd02.ped
Columns: animal, sire, dam, 3 - #known parents, birth year, #known parents, #records, #progeny of sire, #progeny of dam, original animal ID. The rows are not legible in the source.

Output parameter file - renf90.par

    DATAFILE
    renf90.dat
    NUMBER_OF_TRAITS
    1
    NUMBER_OF_EFFECTS
    2
    OBSERVATION(S)
    1
    WEIGHT(S)

    EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
    2 2 cross
    3 12 cross
    RANDOM_RESIDUAL VALUES
    1.000
    RANDOM_GROUP
    2
    RANDOM_TYPE
    add_an_upg
    FILE
    renadd02.ped
    (CO)VARIANCES
    1.000

Output tables after renumbering - renf90.tables

    Effect group 1 of column 1 with 2 levels
    Value   #   consecutive number
    1       ?   1
    2       ?   2

When to use what program and computing limits

BLUP

BLUPF90 sets up equations in memory. It can support a few million equations with a simple model, and far fewer with complicated models (multiple traits, maternal effects, random regression, etc.). BLUPF90 uses three solvers, chosen with options. PCG is the default solver and is usually the fastest one. SOR requires less memory but usually converges more slowly. Sparse Cholesky (FSPAK) is usually the most accurate method but uses the most memory.

The following options are available:

    OPTION conv_crit 1e-12
        Set convergence criteria (default 1e-10).
    OPTION maxrounds 10000
        Set maximum number of rounds (default 1000).
    OPTION solv_method FSPAK
        Selection of solving method: FSPAK, SOR or PCG (default PCG).
    OPTION r_factor 1.6
        Set relaxation factor for SOR (default 1.4).
    OPTION sol se
        Store solutions and s.e. If this option is used, the solving method will turn to FSPAK.
    OPTION blksize 3
        Set block size for preconditioner (default 1).

BLUP90IOD uses an iteration on data algorithm. It can handle hundreds of millions of equations with complicated models in a reasonable time. However, it is only available with a research contract or for research at UGA.

The following options are available:

    OPTION conv_crit 1e-12
        Set convergence criteria (default 1e-12).
    OPTION maxrounds 10000
        Set maximum number of rounds (default 5000).
    OPTION blksize 3
        Set block size for preconditioner (default 1). Usually blksize will be the same as the number of traits.
    OPTION init_eq 10
        Set the number of effects to be solved directly (default 0).
    OPTION solv_method FSPAK
        Solving method for initial equations (default DIRECT).
    OPTION tol 1d-12
        Tolerance to get a positive definite matrix (default 1d-12).
    OPTION residual
        y-hat and residuals will be included in "yhat_residual".
    OPTION avgeps 50
        Use the average of the last 50 eps for convergence.
    OPTION cont 1
        Restart the program from the previous solutions.
    OPTION missing -1
        Set the missing value (default 0).
    OPTION restart 100
        Set the number of iterations after which residuals are recomputed (default 100).
    OPTION ... (the option name is missing in the source)
        Use the previous solution file to start the iteration. Additional software is required to use this option.
    OPTION random_upg 1 2
        Set the UPG random. "1" is the weight for random UPG. If the second number exists, the weight will be inverted (e.g., 1/2 = 0.5).
    OPTION SNP_file snp
        Specify the SNP file name snp to use genotype data.

Variance component estimation

There is no single best choice for variance component estimation. The programs below offer choices for simple and complicated models. For advice on what works best under your circumstances, search for the paper "Reliable computing in estimation of variance components".

REMLF90 uses EM REML. For most problems it is the most reliable algorithm but can take hundreds of rounds of iteration. REMLF90 was found to have problems converging with random regression models. In this case, using starting variances that are too large rather than too small usually helps. Also, EM does not calculate standard errors for the estimates.

The following options are available:

    OPTION conv_crit 1d-12
        Convergence criterion (default 1d-10).
    OPTION maxrounds 10000
        Maximum rounds (default 5000).
    OPTION sol se
        Store solutions and s.e.
    OPTION residual
        y-hat and residuals will be included in "yhat_residual".
    OPTION missing -999
        Specify missing observations (default 0).
    OPTION use_yams
        Run the program with YAMS (modified FSPAK). The computing time can be dramatically improved.
    OPTION constant_var 5 1 2
        5: effect number, 1: first trait number, 2: second trait number, implying that the covariance between traits 1 and 2 for effect 5 is fixed.
    OPTION SNP_file snp
        Specify the SNP file name snp to use genotype data.

AIREMLF90 uses Average Information REML. It usually converges much faster but sometimes does not converge. Very slow convergence usually indicates that the model is overparameterized and there is insufficient information to estimate some variances. AI REML calculates standard errors for the estimates.

The following options are available:

    OPTION conv_crit 1d-12
        Convergence criterion (default 1d-10).
    OPTION maxrounds 500
        Maximum rounds (default 5000).
        When it is negative, the program calculates BLUP without running REML.
    OPTION EM-REML 10
        Run EM-REML for the first 10 rounds to get initial variances within the parameter space (default 0).
    OPTION tol 1d-18
        Tolerance (or precision) for positive definite matrix and G-inverse subroutines (default 1d-14).
    OPTION sol se
        Store solutions and s.e.
    OPTION missing -1
        Set the missing observation (default 0).
    OPTION constant_var 5 1 2
        5: effect number, 1: first trait number, 2: second trait number, implying that the covariance between traits 1 and 2 for effect 5 is fixed.

Heterogeneous residual variances for a single trait

    OPTION hetres_pos 10 11
        Specify positions of covariables.
    OPTION hetres_pol 4.0 0.1 0.1
        Initial values of coefficients for heterogeneous residual variances. Use ln(a0, a1, a2, ...) to make these values. When the number of positions = the number of polynomials, the regressions do not include the intercept (e.g., linear spline).

Heterogeneous residual variances for multiple traits (the convergence will be very slow)

    OPTION hetres_pos 10 10 11 11
        Specify positions of covariables (trait first).
    OPTION hetres_pol 4.0 4.0 0.1 0.1 0.01 0.01
        Initial values of coefficients for heterogeneous residual variances, using ln(a0, a1, a2, ...) to make these values (trait first). "4.0 4.0" are intercepts for the first and second traits. "0.1 0.1" could be linear and "0.01 0.01" could be quadratic. To transform back to the original scale, use exp(a0+a1*X1+a2*X2).
    OPTION SNP_file snp
        Specify the SNP file name snp to use genotype data.

GIBBSxF90 programs implement Bayesian methods. These methods potentially have better statistical properties. Also, they are more stable and use less memory for complicated models. After running any of the Gibbs sampling programs, samples can be analyzed (posterior means, SD, and convergence parameters) with the POSTGIBBSF90 program.

In practical cases, results from Gibbs samplers and REML are similar. Choose one or the other based on computing feasibility.
If there are large differences beyond sampling errors, this usually indicates problems with the Gibbs sampler. Try longer chains or different priors.

Gibbs samplers may be slow to achieve convergence if initial values are far away from those at convergence, e.g., 100 times too low or too high. Before using more complicated models, Karin Meyer advocates using a series of simpler models.

GIBBS1F90 can run models with over 20 traits. However, if models are different per trait, the lines due to effects need to be modified. Also, with too many differences in models among traits, the program becomes increasingly slower.

GIBBS2F90 adds joint sampling of correlated effects. This results in faster mixing with random regression and maternal models.

Interactive inputs:

    number of samples and length of burn-in?
        In the first run, if you have no idea about the number of samples and burn-in, just type your guess (10000 or whatever) for samples and (0) for burn-in. You may need 2 or 3 runs to figure out the convergence.
    Give n to store every n-th sample?
        Gibbs samples are highly correlated, so you do not have to keep all samples (every 10th, 20th, 50th, ...).

The following options are available for GIBBSxF90:

    OPTION fixed_var all 1 2 3
        All solutions and posterior means and SD for effects 1, 2, and 3 are stored in "all_solutions" and in "final_solutions" every round using fixed variances. Without numbers, all solutions for all effects are stored.
    OPTION fixed_var mean 1 2 3
        Posterior means and SD for effects 1, 2, and 3 are stored in "final_solutions".
    OPTION solution all 1 2 3
        All solutions and posterior means and SD for effects 1, 2, and 3 are stored in "all_solutions" and in "final_solutions" every round. Without numbers, all solutions for all effects are stored.
    OPTION solution mean 1 2 3
        Posterior means and SD for effects 1, 2, and 3 are stored in "final_solutions".
OPTION cont 10000
10000 is the number of samples run previously when restarting the program from the last run.

OPTION prior 5 2 -1 5
The (co)variance priors are specified in the parameter file. Degree of belief for all random effects should be specified using the following structure:
OPTION prior eff1 db1 eff2 db2 ... effn dbn -1 dbres
effx corresponds to the effect number and dbx to the degree of belief for this random effect; -1 marks the residual, and dbres is the degree of belief of the residual variance. In this example, 2 is the degree of belief for the 5th effect, and 5 is the degree of belief for the residual.

OPTION seed 123 321
Two seeds for a random number generator can be specified.

OPTION SNP_file snp
Specify the SNP file name snp to use genotype data.

GIBBS3F90 adds estimation of heterogeneous residual covariances in classes. The computing costs usually increase with the number of classes.

OPTION hetres_int 5 10
The position (5) to identify the interval in the data file and the number of intervals (10) for heterogeneous residual variances.

Other options are the same as for GIBBS1F90 and GIBBS2F90.

THRGIBBS1F90 is a Gibbs sampling program to analyze categorical and continuous traits simultaneously. The following options are available:

OPTION cat 0 0 2 5
"0" indicates that the first and second traits are linear. "2" and "5" indicate that the third and fourth traits are categorical with 2 (binary) and 5 categories.

OPTION thresholds 0.0 1.0 2.0
Set the fixed thresholds. No need to set 0 for binary traits.

OPTION residual 1
Set the residual variance = 1.

OPTION censored xx
Negative values of the last category in the data set indicate censored records. "xx" determines the lower and upper limit of the category + xx when sampling from the distribution.

Other options are the same as for GIBBS1F90 and GIBBS2F90.

POSTGIBBSF90 is a program to calculate posterior means and SD and diagnose the convergence.
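The burn-in and thinning inputs described above can be illustrated with a minimal Python sketch. It is not POSTGIBBSF90 code: the function names are hypothetical, the samples are assumed to be already loaded as a list of numbers for one parameter, and the SD uses the n-1 divisor, which is an assumption about the definition.

```python
import math


def posterior_summary(samples, burn_in=0, thin=1):
    """Return (mean, sd, n_kept) for one parameter.

    samples : list of Gibbs samples for a single parameter (assumed input)
    burn_in : number of leading samples to discard
    thin    : keep every thin-th sample after burn-in
    """
    kept = samples[burn_in::thin]
    n = len(kept)
    if n < 2:
        raise ValueError("need at least two samples after burn-in and thinning")
    mean = sum(kept) / n
    # Sample variance with n-1 divisor (assumption, see lead-in).
    var = sum((x - mean) ** 2 for x in kept) / (n - 1)
    return mean, math.sqrt(var), n


def lag_autocorrelation(samples, lag):
    """Autocorrelation of the chain at a given lag (simple moment estimator)."""
    n = len(samples)
    if lag < 0 or lag >= n:
        raise ValueError("lag must satisfy 0 <= lag < number of samples")
    mean = sum(samples) / n
    denom = sum((x - mean) ** 2 for x in samples)
    if denom == 0:
        raise ValueError("chain is constant; autocorrelation is undefined")
    num = sum((samples[i] - mean) * (samples[i + lag] - mean)
              for i in range(n - lag))
    return num / denom
```

High autocorrelation at the chosen lag suggests that a larger thinning interval (or a longer chain) may be needed before the posterior mean and SD are trusted.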
The Program reads “gibbs_samples” and “fort.99" files from Gibbs sampling programs. Read 1000 samples from round 10 to 10000 Burn-in? 1000 # in the first run, type 0 for burn-in to include all samples Give n to read every n-th sample? (1. means read all samples) 10 # Type the same number used with a Gibbs sampling program. # samples after burn-in = 9000 Input fi gibbs_samples, fort.99, and other files used in a parameter file from (THR)GIBBSxF9O Output files: postgibbs_samples, postout, postmean, postsd postgibbs_samples RELARARALAAARAAEABABAREAARABAAANAAANAALBABAAOO VVUVUVUUUUUU UV UUUUUUVUVUUUVUVUUUVUUUSUUUUSLY 25 A text file containing all Gibbs samples from gibbs_samples for other software (EXCEL, SAS, ..) to calculate posterior means and SD, and to create graphs. postmean Posterior means postsd Posterior standard deviations postout seevsens Monte Carlo. —_Errorby Time Series Pos. efft eff2 trtt trtz MCE Mean HPD Effective Median Mode Interval (95%) sample size 104 4 2 2 1362-02 09889 0.788 1.215 704 0.9844 0.9861 204 4 1 2 1288602 1008 90.777 1219 B41 1.006 0.952 304 4 2 2 Leaeon 165 «1347 1.987 B03 1652 1579 4 0 0 4 1 9530603 2447 2407 2484 «4256 ©2447 28.53 5 0 0 1 2 8253603 118 1154 12.48 395811831182 6 0 0 2 2 1233602 30.1 © 29.65 3058 387.8 30.09 29.97 sevesee posterior Standard Deviation **" Pos. effi eff2 tt tt2 PSD Mean Psp Geweke __Autocorrelations Interval (95%) diagnostic lag: 1050. 1 4 4 1 1 04184 09889 07648 ©1213 «002 0853 0.188 0.089 204 4 4 2 oats2 1006 0.7742 1.237 OAL 0828 0.111 0.066 3.4 4 2 2 01656 165 «1335 1.988 0.05 0.828 0.108 0.021, 4 0 0 1 1 01967 2447 24.09 24.86 0.01 0.034 0.029 0.062 5 0 0 1 2 01643 1184 1281 1216 0.03 0.032 0.005 0.016 6 0 0 2 2 0289 30a 2962 3057 «= 0.02 0.07 0.018 0.037 where "Pos." 
position of each parameter in the parameter file
"eff1" and "eff2": effect number in the parameter file
"trt1" and "trt2": trait number in the parameter file
"MCE": Monte Carlo Error
"Mean": posterior mean
"HPD interval (95%)": 95% Highest Probability Density
"Effective sample size": at least > 10 is recommended; > 30 may be better.
"Median": median of Gibbs samples
"Mode": when the distribution of the samples is not normal, "Mean" and "Mode" could be different.
"Independent chain size": number of independent cycles of Gibbs samples
"PSD": Posterior Standard Deviation
"PSD interval (95%)": 95% Posterior Standard Deviation interval
"Geweke diagnostic": ratio between the first half and the second half of the samples; it should be < 1.0, but it is not useful because it is < 1.0 most of the time.
"Autocorrelations": autocorrelations between two lags. High correlation implies samples are not independent.
"Independent # batches"

Choose a graph for samples (= 1) or histogram (= 2); or exit (= 0)
Positions
1 2 3 # choose from the position numbers 1 through 6

If the graph is stable (not increasing or decreasing), the convergence is met. All samples before that point should be discarded as burn-in.

print = 1; other graphs = 2; or stop = 0
2
Choose a graph for samples (= 1) or histogram (= 2); or exit (= 0)
2
Type position and # bins
1 20

[Figure: histogram of the samples for the chosen position]

The distribution should usually be normal (Mean = Mode = Median).

print = 1; other graphs = 2; or stop = 0
0
*** Log Marginal Density for Bayes Factor *** after 900 burn-in
log(p) = -179448.742766031

This value could be used when calculating Bayes Factor and/or DIC.

Genomic programs

The PREGSF90 program constructs a genomic relationship matrix G and a relationship matrix A for genotyped animals.
The relationship matrix A based on the pedigree information in mixed model equations is replaced by matrix H, which combines the pedigree and genomic information. The main difference between A^-1 and H^-1 is the structure of G^-1 - A22^-1. Some of the options for PREGSF90 can also be used with BLUPF90, (AI)REMLF90, GIBBS1F90, GIBBS2F90, GIBBS3F90, THRGIBBS1F90, and BLUP90IOD2.

OPTION SNP_file
The SNP file should contain:
Field 1 - animal ID with the same format as in the pedigree file
Field 2 - genotypes with 0, 1, 2, and 5 (missing) or real values for gene content (e.g., 0.12)
The two fields (animal ID and SNP) need to be separated by at least one space, and Field 2 should have a fixed format (i.e., all rows of genotypes should start at the same column number or position).

80 21201012002012012011020210112181211111210100
0x4 21110102511101220721110111511132101112210100
516 21100103202252021120210121102131202212111101
381 21110131112201220850200020103022212211111100

The renumbered ID file for genotypes, named as the genotype file name with the suffix XrefID, is created by RENUMF90 (using the SNP file). It contains the sequential renumbered ID and the original ID, which must be in the same order as in the SNP file, as follows:

3732 80 2474 e024 406 S16 pean 102,

The pedigree file from RENUMF90 looks like:

1732 11010 1058 13.12 10 0 go 74 8691 99081 3.12 200 Bore 06 9691 98251332 102 sie 941 3691 692913 12 100 102,

Several optional files are available:
Allele frequencies (OPTION FreqFile)
Map file (OPTION chrinfo)
Weight file (OPTION weightedG)
G or its inverse, A22 or its inverse, etc., as specified by the respective OPTIONS.

OPTION chrinfo
Read SNP map information from the file. These files are useful to check for Mendelian conflicts and HWE (also with OPTION sex_chr) and for POSTGSF90 (ssGWAS).
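The SNP file layout described above (an ID field, then a fixed-position genotype string coded 0, 1, 2, or 5 for missing) can be checked with a short Python sketch. This is only an illustration under stated assumptions: the function name is hypothetical, only the integer-coded format is handled (not real-valued gene content), and the checks are not those performed by PREGSF90 itself.

```python
def read_snp_rows(lines):
    """Parse SNP-file rows into (animal_id, [genotype codes]) tuples.

    Checks, as assumed from the format description:
      * two whitespace-separated fields per row,
      * genotypes start at the same column in every row,
      * every row has the same number of SNP,
      * only the codes 0, 1, 2 and 5 (missing) appear.
    """
    records = []
    start_col = None
    n_snp = None
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip()
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"row {number}: expected 2 fields, got {len(fields)}")
        animal_id, genotypes = fields
        # 1-based column where the genotype string starts
        col = len(line) - len(genotypes) + 1
        if start_col is None:
            start_col, n_snp = col, len(genotypes)
        if col != start_col:
            raise ValueError(f"row {number}: genotypes start at column {col}, "
                             f"expected {start_col}")
        if len(genotypes) != n_snp:
            raise ValueError(f"row {number}: {len(genotypes)} SNP, expected {n_snp}")
        if set(genotypes) - set("0125"):
            raise ValueError(f"row {number}: unexpected genotype code")
        records.append((animal_id, [int(c) for c in genotypes]))
    return records
```

Running such a check before RENUMF90 can catch misaligned rows early; the example rows printed in this copy of the manual contain a few characters outside 0, 1, 2, 5 and would be rejected by it.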
Format = all numeric variables: SNP order, chromosome, position (bp). The SNP order corresponds to the index number of the SNP in the map sorted by chromosome and position. The first line in the file corresponds to the first SNP in the genotype file, and so on. Other alphanumeric fields are optional.

By default, PREGSF90 always creates GimA22i in binary format for use by later programs specifying OPTION readGimA22i. With OPTION saveAscii, this file can be stored in ASCII format: i, j, G^-1 - A22^-1

"freqdata.count" contains allele frequencies in the original genotype file with the format: SNP number (related to the genotype file) and allele frequency.

"freqdata.count.after.clean" contains allele frequencies as used in calculations with the format: SNP number (related to the genotype file), allele frequency, and code of exclusion.
Exclusion codes:
1: Call Rate
2: MAF
3: Monomorphic
4: Excluded by request
5: Mendelian error
6: HWE
7: High Correlation with other(s) SNP

"Gen_call_rate" contains a list of animals excluded with call rate below the threshold.

"Gen_conflicts" contains a report of animals with Mendelian conflicts with their parents.

The program can store files such as G or its inverse, A22 or its inverse, or other reports from QC as specified by their respective OPTIONS.

Options for creation of the genomic relationship matrix (G)

The genomic relationship matrix G can be created in different ways.

OPTION whichG x
Specify how G is created. The variable x can be:
1: G = ZZ'/k; VanRaden, 2008 (default)
2: G = ZDZ'/n, where D = 1/(2p(1-p)); Amin et al., 2007; Leutenegger et al., 2003
3: As 2 with modification UAR from Yang et al., 2010

OPTION whichfreq x
Specify what frequency is used to create G. The variable x can be:
0: read from file "freqdata" or from the other file using OPTION FreqFile
1: 0.5
2: current, calculated from genotypes (default)

OPTION FreqFile
Read allele frequencies from a file.
For example, based on allele frequencies calculated by estfreq.f90 (VanRaden, 2008) with format: SNP, frequency, where SNP corresponds to the index of the SNP based on the same order as in the genotype file. If whichfreq is set to 0, the default file name is "freqdata".

OPTION whichScale x
Specify how G is scaled. The variable x can be:
1: 2 sum{p(1-p)}; VanRaden, 2008 (default)
2: tr(ZZ')/n; Legarra, 2009; Hayes, 2009
3: correction; Gianola et al., 2009

OPTION weightedG
Read weights from a file to create a weighted genomic relationship. Weighting Z* = Z sqrt(D) => G = Z*Z*' = ZDZ' (format: one column of weights in the same order as in the genotype file). Weights can be extracted from the output of the POSTGSF90 program.

OPTION maxsnp x
Set the maximum length of string to read marker data from a file. It is only necessary if greater than the default (400,000).

Quality Control (QC) for G

By default the following QC can be run:
MAF
Call rate (SNPs and animals)
Monomorphic
Parent-progeny conflicts (SNPs and animals)

Parameters can be modified with the following options:

OPTION minfreq x
Ignore all SNP with MAF < x (default value = 0.05).

OPTION threshold_diagonal_g x
Check for extremely large diagonals in the genomic relationship matrix. If the optional x is present, the threshold will be set (default value = 1.6).

OPTION plotpca
Plot the first two principal components to look for stratification in the population.

OPTION extra_info_pca file col
Read the column col from the file to plot with different colors for different classes. The file should contain at least one variable with different classes for each genotyped individual, and the order should match the order of the genotype file. Variables could be alphanumeric and separated by one or more spaces.

OPTION saveCleanSNPs
Save clean genotype data with excluded SNP and animals based on the OPTIONS specified.
*_clean files are created:
+ gt_clean
+ gt_clean_XrefID
*_removed files are created:
+ gt_SNPs_removed
+ gt_Animals_removed
where "gt" is the genotype file.
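The default construction of G described in this section (whichG 1, whichfreq 2, whichScale 1, i.e. G = ZZ'/k with k = 2 sum{p(1-p)} and frequencies computed from the genotypes) can be sketched in plain Python. This is a small illustration, not PREGSF90 code: the function name is hypothetical, no quality control is applied, and missing genotypes (code 5) are simply set to zero after centering, which is an assumption.

```python
def vanraden_g(genotypes, missing=5):
    """Build G = ZZ'/k from integer genotypes (rows = animals, columns = SNP).

    Z is the genotype matrix centered by 2p, with p the allele frequency
    computed from the observed genotypes; k = 2 * sum(p * (1 - p)).
    Missing genotypes contribute 0 after centering (assumption).
    """
    n_animals = len(genotypes)
    n_snp = len(genotypes[0])

    # Allele frequencies from the current genotypes (whichfreq 2).
    freqs = []
    for j in range(n_snp):
        observed = [row[j] for row in genotypes if row[j] != missing]
        freqs.append(sum(observed) / (2.0 * len(observed)) if observed else 0.0)

    # Centered genotype matrix Z.
    z = [[0.0 if row[j] == missing else row[j] - 2.0 * freqs[j]
          for j in range(n_snp)]
         for row in genotypes]

    # Scaling factor (whichScale 1).
    k = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if k == 0.0:
        raise ValueError("all SNP are monomorphic; G cannot be scaled")

    g = [[sum(a * b for a, b in zip(z[i], z[l])) / k
          for l in range(n_animals)]
         for i in range(n_animals)]
    return g, freqs
```

For realistic numbers of animals and SNP this triple loop is far too slow; it is only meant to make the formula concrete.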
OPTION no_quality_control
Turn off all quality control. It is useful to speed up computation when the QC was performed previously.

OPTION outcallrate
Print all call rate information for SNP and individuals. The files "callrate" for SNP and "callrate_a" for individuals are created.

Quality Control for off-diagonals of A22 and G

OPTION thrWarnCorAG x
Set the threshold to issue a warning if the correlation between A22 and G is lower than x (default value = 0.02).

Options for H

These options include different weights to create G^-1 - A22^-1 as
tau (alpha G + beta A22 + gamma I + delta)^-1 - omega A22^-1
where the parameters are used to scale the genomic information to be compatible with the pedigree information, to make matrices invertible in the presence of clones, and to control bias.

The default values are:
tau = 1
alpha = 0.95
beta = 0.05
gamma = 0
delta = 0
omega = 1

Options to change these defaults are specified with:
OPTION TauOmega tau omega
OPTION AlphaBeta alpha beta
OPTION GammaDelta gamma delta

OPTION tunedG x
Scale G based on A22. The variable x can be:
0: no scaling
1: mean(diag(G))=1, mean(offdiag(G))=0
2: mean(diag(G))=mean(diag(A22)), mean(offdiag(G))=mean(offdiag(A22)) (default)
3: mean(G)=mean(A22)
4: rescale G using the first adjustment as in Powell et al. (2010) or Vitezica et al. (2011).

OPTION nthreads n
Specify the number of threads to be used with MKL-OpenMP for creation and inversion of matrices.

OPTION nthreadsiod n
Specify the number of threads to be used with MKL-OpenMP in BLUP90IOD for matrix-vector multiplications in the PCG algorithm.

OPTION graphics s
Generate plots with GNUPLOT. If the optional parameter s is present, set the time in seconds to show the plot. Avoid using in batch programs!

OPTION msg x
Set the level of verbosity: 0 minimal; 1 gives lots of diagnostics.

Save and Read options:

OPTION saveAscii
Save intermediate matrices (GimA22i, G, Gi, etc.) files as ASCII (default = binary).
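The tuning (tunedG 2) and blending (AlphaBeta) steps described in the options for H above can be sketched as follows. The sketch assumes that tuning is a linear rescaling G* = a + b G, with a and b chosen so that the mean diagonal and mean off-diagonal of G* match those of A22; this form, like the function names, is an assumption for illustration and is not taken from the PREGSF90 source.

```python
def mean_diag_offdiag(m):
    """Return (mean of diagonal, mean of off-diagonal) of a square matrix."""
    n = len(m)
    if n < 2:
        raise ValueError("matrix must be at least 2 x 2")
    diag = sum(m[i][i] for i in range(n)) / n
    off = sum(m[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    return diag, off


def tune_and_blend(g, a22, alpha=0.95, beta=0.05):
    """Tune G to A22 (assumed form a + b*G), then blend as alpha*G + beta*A22."""
    dg, og = mean_diag_offdiag(g)
    da, oa = mean_diag_offdiag(a22)
    if dg == og:
        raise ValueError("cannot tune: mean diagonal equals mean off-diagonal of G")
    # Solve a + b*dg = da and a + b*og = oa for a and b.
    b = (da - oa) / (dg - og)
    a = da - b * dg
    n = len(g)
    tuned = [[a + b * g[i][j] for j in range(n)] for i in range(n)]
    blended = [[alpha * tuned[i][j] + beta * a22[i][j] for j in range(n)]
               for i in range(n)]
    return tuned, blended
```

With the default alpha = 0.95 and beta = 0.05, the blended matrix stays close to the tuned G while the small contribution from A22 helps keep it invertible.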
OPTION saveHinv
Save H^-1 in "Hinv.txt" (format: i, j, val with i, j the index level for the additive genetic effect).

OPTION saveAinv
Save A^-1 in "Ainv.txt" (format: i, j, val with i, j the index level for the additive genetic effect).

The following options use the information of the original ID (alphanumeric) stored in the 10th column of the "renaddxx.ped" file created by RENUMF90.

OPTION saveHinvOrig
Save H^-1 with original IDs.

OPTION saveAinvOrig
Save A^-1 with original IDs.

OPTION saveDiagGOrig
Save the diagonal of G in "DiagGOrig.txt" (format: id, val with id the original ID).

OPTION saveGOrig
Save G in "G_Orig.txt" (format: id_i, id_j, val with id_i and id_j the original IDs).

OPTION saveA22Orig
Save A22 in "A22_Orig.txt" (format: id_i, id_j, val with id_i and id_j the original IDs).

OPTION readOrigId
Read information from the "renaddxx.ped" file: original ID and possibly year of birth for use in parent-progeny conflict checks. Needed only when none of the previous "save*Orig" options is present.

OPTION savePLINK
Save genotypes in PLINK format files: toPLINK.ped and toPLINK.map.

Save and Read intermediate files:

OPTION readGimA22i
This option can be used in analysis programs (BLUPF90, REMLF90, etc.) in order to use matrices stored in the GimA22i file (default filename). In general, methods used to create and invert matrices in such programs do not use optimized versions. For a large number of genotyped animals, run PREGSF90 first and read the stored matrices in the analysis programs. An optional file argument can be used to specify another file name or path. For example,
OPTION readGimA22i ../../pregsrun/GimA22i

Other intermediate matrix files can be stored for inspection or for use in BLUPF90 programs as a user_file type of random effect. See tricks and REMLF90 for details.

OPTION saveA22
OPTION saveA22Inverse
OPTION saveG all
If the optional all is present, all intermediate matrices for G will be saved.
OPTION saveGinverse
OPTION saveGmA22
OPTION readG
OPTION readGinverse
OPTION readA22
OPTION readA22Inverse
OPTION readGmA22

POSTGSF90

The following options for POSTGSF90 (ssGWAS) are available:

OPTION Manhattan_plot
Plot the Manhattan plot (SNP effects) for each trait and correlated effect using GNUPLOT.

OPTION Manhattan_plot_R
Plot the Manhattan plot (SNP effects) for each trait and correlated effect using R. TIF images are created: manplot_sft1e2.tif (note: t1e2 corresponds to trait 1, effect 2). The Cairo package is required.

OPTION plotsnp n
Control the values of SNP effects to use in Manhattan plots:
1: plot regular SNP effects: abs(val)
2: plot standardized SNP effects: abs(val/sd) (default)

OPTION SNP_moving_average n
Solutions for SNP effects are computed as a moving average of n adjacent SNPs.

OPTION windows_variance n
Calculate the variance explained by n adjacent SNPs.

OPTION windows_variance_mbp n
Calculate the variance explained by an n Mb window of adjacent SNPs.

OPTION windows_variance_type n
Set the window type for variance calculations:
1: moving windows
2: exclusive windows

OPTION which_weight x
Generate a weight variable to be used in the creation of a weighted genomic relationship matrix:
1: w = y^2 * (2(p(1-p)))
2: w = y^2
with scaled weight = w * nSnp/sum(w)

Output files for POSTGSF90:

"snp_sol" contains solutions of SNP and weights
1: trait
2: effect
3: SNP
4: Chromosome
5: Position
6: SNP solution
7: weight
if OPTION windows_variance is used:
8: variance explained by n adjacent SNP

"chrsnp" contains data to create plots by GNUPLOT
1: trait
2: effect
3: values of SNP effects to use in Manhattan plots
4: SNP
5: Chromosome
6: Position

"chrsnpvar" contains data to create plots by GNUPLOT
1: trait
2: effect
3: variance explained by n adjacent SNP
4: SNP
5: Chromosome
6: Position

"snp_pred" contains gene frequencies + SNP effects

Graphic control files:
Several files are created to generate graphics using either GNUPLOT or R.
File name rules: "sft1e2.R". The first letter indicates "S" for solutions of SNP and "V" for variance explained. "t1e2" indicates that the file is for trait 1 and effect 2.
Filename extension:
xxx.gnuplot => GNUPLOT
xxx.R => R programs
xxx.tif => image

PREDF90 predicts GEBV for young animals based only on genotypes. The prediction is based on SNP effects obtained from POSTGSF90. For young animals that were not included in the previous analysis, GEBV can be calculated using the "snp_pred" file from POSTGSF90.

Input files:
"snp_pred"
- information about the random effect (number of traits + correlated effects)
- gene frequencies + solutions of SNP effects
Prepare an updated genotype file in the same format as used in POSTGSF90.

Output file:
"SNP_predictions" - ID, calling rate, and GEBV

Parameters:
1. alpha - fraction of G used (default=0.95); affects scale of prediction
2. callrate - to be used later for discarding genotypes with poor quality (default=0.7)

Sample run using the example from our website

"renum.par" for RENUMF90
DATAFILE
phenotypes.txt
TRAITS
3 36 A i i i i i i inn i ai a i me a BVUVUUVULLUVUULULUUUUULEULULULUULULUUULUULUUULLOD FIELDS_PASSED To OUTPUT WeIGHTIS) RESIDUAL VARIANCE f variances are from airemif90 results 0.9038 eFFecT 1Lcross alpha EFFECT 2 ross alpha animal RANDOM animal Fie pedigree SNP_FILE smarker.geno.clean (coyvaRiances, 0.99516-01 Run RENUMF9O RENUMESO version 1.96 name of paranater file?renua.per mumber of animals with records 15800 mumber of animale with genotypes 1500 ‘wrote renumbered data "rene90.dat" ““renf90.par” from RENUMF90 ‘#BLUPF9O parameter file created by RENF90 DATAFILE renf90.dat NUMBER_OF_TRAITS 1 NUMBER_OF_EFFECTS 2 ‘OBSERVATION(S) 1 WEIGHTIS) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECTIEFFECT NESTED] 2 Leross 3 15800 cross RANDOM. RESIDUAL VALUES 0.9038 RANDOM_GROUP 2 RANDOM _TYPE add_animal Fue renadd02.ped (coyvaniances. 0.99516-01 OPTION SNP_file marker.geno.clean Run BLUPF90 pane of paraneter £2e?rent90.par ound 67 convergence= 1.2592041363980448-012 ound 68 convergence= 9.0255920585124428-013 68 iterations, convergence criterion= 9.0255928585124438-013, solutions stored in file: "solutions" Sa/postases0 name of parameter file?renf90.par postas 1.12 Solutions read from file: "solutions" Files for pedictions by SUP effects in file: "enp_prea” Snead -5 np pred 3000 a ° 15800 0.751 0.362 0.568 0.680 0.184 0.298 0.514 0.717 0.464 0.502 0.639 0.773 0.622 0.673 0.238 0.556 0.606 0.590 0.660 0.439 0.609 0.418 0.572 0.401 Run PREDFSO Preat90 1.00 Predicts EBVe from genotypes based on results from single-s name of genotype file? 
marker.geno.clean Mumber of SNP: 3000 Number of trate: 2 number of correlated traita: 2 3000 sup The genotype file contains 3000 SNP atarting from position 7 ‘3002 0.1286204 Bore -0.1033363, sore 0.1308723, 08 -0.1908423, 9024 0.365095, 9038 0.1939673 9061 -0.3284970 9063 0.124869 8065 -2.s096019R-02 Processed 1500 genotypes 38 CATTLE ALTEALELAETLAEALEEALELELEEEELARERLEEEARADADLEDED ARETE ERE ERA IATA a eee Average calling rate: 1.00 head -5 SNP_predictions 2002 1.00 0.1156 gore 1.00 -0.1007 gore 1.00 0.1276 gore 1.00 -0.1887 02 1.00 -0.3582 PREDICTF90 is to calculate J and residuals using the same parameter file and “solutions” and can be used to calculate predictive ability ly, Output files: “yhat_residual” Format: record #, original y, 9, residual “pvs.dat” The same format as “solutions” including (G)EBV. 39 Examples for parameter files Sire model without A DATAFILE estat NUMBER_OF_TRAITS 1 NUMBER_OF_EFFECTS. 2 OBSERVATION(S} 3 WeicHT(s) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] 1 2eross 23 cross RANDOM_RESIDUAL VALUES 10 RANDOM_GROUP 2 RANDOM_TYPE diagonal File (coyvartances 1 Sire model with A DATAFILE testidat NUMBER_OF_TRAITS 1 NUMBER_OF_EFFECTS. 2 OBSERVATION(S) a \WEIGHT(S) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] 12 eross 23.cross RANDOM_RESIDUAL VALUES 10 RANDOM_GROUP 2 RANDOM_TYPE ‘add_sire Fue sire.ped {Co]VARIANCES 1 CELALTELGTLAECBALLELELGCELUDULDLERLADTADLUDULGDTADTELADLAEDADLELAD VUVUVULUUEULULUULUUULULUULULUUULUULLULULULULLLL model le (2) trait: DATAFILE testdat NUMBER_OF_TRAITS. 2 [NUMBER_OF_EFFECTS 2 ‘OBSERVATION(S) 34 WEIGHT(S) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] Laz cross 223 cross RANDOM._RESIDUAL VALUES 101 1s RANDOM_GROUP 2 RANDOM _TYPE add_sire Fue ped {Co)ARIANCES 10a oat Animal model DATAFILE testdat NUMBER_OF_TRAITS. 
2 NUMBER_OF_EFFECTS 2 ‘OBSERVATION(S) 3 WEIGHTIS) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] a 2eross 5.10.cross RANDOM. RESIDUAL VALUES 10 RANDOM_GROUP 2 RANDOM _TYPE ‘add_animal FILE animal ped (CO)VARIANCES. 1 a Multiple trait animal model # Example 1: 2 trait animal model DATAFILE testidat NUMBER_OF_TRAITS 2 NUMBER_OF_EFFECTS 2 ‘OBSERVATION(S) 34 WEIGHTIS) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] 112 cross 55.10 cross RANDOM_RESIOUAL VALUES 101 1s RANDOM_GROUP 2 RANDOM_TYPE ‘adé_animal Fie animal ped {Co)VARIANCES toa oat 4 Example 2: different model for each trait DATAFILE testedat NUMBER_OF_TRAITS. 2 NUMBER_OF_EFFECTS 3 ‘OBSERVATION(S) 34 WeIGHT(s) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] 122 cross 55.10 cross 67 30 cross RANDOM RESIDUAL VALUES 301 as RANDOM_GROUP 2 RANDOM._TYPE ‘add_animal Fie 42 CTCTLTLUGBGELLULLALLELALELDLELEELAGEEDE DDD 0D440190O 00D VVVVUVUVUVUVUVUUUUULULELULEUEULELELELEUYLULELYE animal ped (coyvaRIANcES to. oat RANDOM_GROUP a RANDOM _TYPE diagonal FILE {CO)ARIANCES 10 on Animal model with UPG DATAFILE testdat NUMBER_OF_TRAITS, 2 NUMBER_OF_EFFECTS 2 ‘OBSERVATION(S) a4 WEIGHTIS) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TVPE_OF_EFFECT [EFFECT NESTED] 112 cross 553 cross RANDOM RESIDUAL VALUES 101 15 RANDOM_GROUP 2 RANDOM_TYPE ‘add_an_upe FILE animal ped {COWARIANCES 10. oat Animal mode! with inbreeding DATAFILE testdat NUMBER_OF_TRAITS. 2 NUMBER_OF_EFFECTS 2 ‘OBSERVATION(S) 34 WEIGHT(S) 43 EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_ LEVELS TYPE_OF_EFFECT [EFFECT NESTED] 112 cross 5513 cross RANDOM_RESIDUAL VALUES 101 15 RANDOM_GROUP 2 RANDOM _TYPE adé_an_upginb FILE animal.ped (Co]VARIANCES. 
toa oat Repeatability model 1 DATAFILE testat NUMBER_OF_TRAITS a NUMBER_OF_EFFECTS 3 OBSERVATION(S) 3 WEIGHT(S) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] 12eross 5S ross 5 10 cross RANDOM_RESIDUAL VALUES 10 RANDOM_GROUP 2 RANDOM _TYPE ‘add_animal FILE animal.ped (CO]VARIANCES, 2 RANDOM_GROUP 3 RANDOM_TYPE diagonal FILE (coyvaniances 1 Repeatability model 2 VPVUVUVUVUVUUUUUUUUUUVULUUUULUEUULULEULELEULELEELYS DATAFILE testidat NUMBER_OF_TRAITS. 2 NUMBER_OF_EFFECTS 3 ‘OBSERVATION(S) 34 WEIGHTIS) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] 112 cross 555 cross 5510 cross RANDOM. RESIDUAL VALUES 101 1s RANDOM_GROUP 2 RANDOM_TYPE ‘add_animal FILE animal ped (CO)ARIANCES. toa oaa RANDOM_GROUP. 3 RANDOM_TYPE diagonal Fue (coyvariances 10.1 oat Maternal effect model DATAFILE maternal.dat NUMBER_OF_TRAITS. 1 NUMBER_OF_EFFECTS 4 OBSERVATION(S) 4 WeIGHTis) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] 45, 2.22473 cross RANDOM_RESIDUAL VALUES 1050 RANDOM_GROUP 23 RANDOM_TYPE ‘add_animal Fie maternal.ped {covariances 450-100 100 340 RANDOM_GROUP 4 RANDOM _TYPE diagonal FILE {(coyvantances, 370 # For (THR)GIBBSxF90 # Example 1 DATAFILE testdat NUMBER_OF_TRAITS 2 NUMBER_OF_EFFECTS 5 OBSERVATION(S) 34 weiciTis) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] 102 cross 022 cross 5510 cross 6.030 cross 07 20cross RANDOM_RESIDUAL VALUES 101 as RANDOM_GROUP 3 RANDOM_TYPE ‘adé_animal File animal ped (CO]VARIANCES, 10a oat 46 TALELELELELLULTLAULLLULAURAULARLULALALALLADLADUAARALD VBEVEVEUVEVUVLELELEVYVEVYVLYUVVSYVSVSSVTYVVsVeyoyvesy RANDOM_GROUP 4 RANDOM_TYPE diagonal FILE (CoyVARIANCES, 10 oo RANDOM_GROUP 5 RANDOM_TYPE diagonal FILE (CoyvARIANCES oo on # Example 2 DATAFILE testidat NUMBER_OF_TRAITS 2 NUMBER_OF_EFFECTS 5 (OBSERVATIONS) 34 WeIGHT{s) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] 102 cross 022 cross 55.10 
cross 6030 cross 0730 cross RANDOM. RESIDUAL VALUES 101 1s RANDOM_GROUP 3 RANDOM _TYPE ‘add_animal ri animal ped (coyvaRIANces 10. oat RANDOM_GROUP 45 RANDOM_TYPE a7 { 48 ( ( , ( {Co)ARIANCES ( 1000 ‘ 0000 ‘ 0000 ; 0001 ( # Dominance model DATAFILE dom.dat NUMBER_OF_TRAITS. 1 NUMBER_OF_EFFECTS 4 OBSERVATION(s) 3 WEIGHTIS) « « [ ¢ « ¢ ¢ « ee 410 ¢ jain eo Sine goss : RANDOM_RESIDUAL VALUES € in a 3 € RANDOM_TvPE € cana Fue € add.ped € (coyananes 7 ‘ RANDOM_GROUP € _ € tasoou, i € rue € pele (CO)VARIANCES € 2 € e e e e e e € e ¢ Random regression model # Example 1 DATAFILE NUMBER_OF_TRAITS. 1 VVUVUVUVUVUUUUEULUULEUULLELELELEUULELELULELELE NUMBER_OF_EFFECTS 10 ‘OBSERVATION(S) 9 WeIGHT(S) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] 1 788 cross 2 azcross 5 1c 6 1c 3 15097 cross 5 15097 cov 3 6 15097 cov 3 3 81883 cross 5 81883 cov 3 6 81883 cov 3 RANDOM _RESIDUAL VALUES 100 RANDOM_GROUP 367 RANDOM _TYPE diagonal FILE {covariances 1001 4 1101 110 RANDOM_GROUP 8910 RANDOM _TYPE add_an_upg, File ped_score (coyvantances, 1001 4 1104 11.0 # Example 2 DATAFILE testdatt NUMBER_OF_TRAITS. 2 NUMBER_OF_EFFECTS 9 ‘OBSERVATION(S) 34 WeIGHTiS) 49 50 EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] 112 cross 661c0v 77 1c0v 225 cross 665cov22 775 cov22 22:10 cross 6610cov22 7730 cov22 RANDOM_RESIDUAL VALUES 101 1s RANDOM_GROUP 456 RANDOM_TYPE diagonal Fie (coyvaRiances, 10.10.1010.20.1 o1101010.10.1 0.10.1 1010.101 o1010.110.10.1 01010101101 o1010.101012 RANDOM._GROUP 789 RANDOM_TYPE ‘add_animal FILE animal ped (CoyvaRIANcES. 10101020303 o110.1010.10.1 101101010. 0101011010. 
01010101102, o10r010.1011 # Example 3 DATAFILE testedat2 NUMBER_OF_TRAITS 2 NUMBER_OF EFFECTS 10 ‘OBSERVATION(S) 34 WEIGHT(S) CATLULGLTRLBRLLELABULGBLGLAUBLLLBLAUBLULBULBLALELALABLALLADALAD VUVUVUVUUULULUVULELUVUELLLEUUUELELEUELELUUEESE EFFECTS: POSITIONS 1 112 cross 66100 774.c0v 881cov 665 cov22 775 cov22 B5cov22 6610cov22 7710cov22 88.10cov22 RANDOM. RESIDUAL VALUES 101 1s RANDOM_GROUP 567 RANDOM _TYPE diagonal FILE \_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] (coyvARIANcES. 10.10.10.10.101 o1102010.102 01011010.103 01010110101 0.10.10.101101 1010101012 RANDOM_GROUP 2910 RANDOM_TYPE 184_animal Fie animalped (coyvaniances 10.10.1010.102 01101010201 e101 1010.10.21 01010110103 0101010110 oaor0s01014 Random regression model with heterogeneous residual variances ‘HHH using airemif90 # Example 1: wi DATAFILE testidat NUMBER_OF_TRAITS. 1 NUMBER_OF_EFFECTS 51 9 OBSERVATION(S) 3 WeIGHTIS) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] Aeros 61cov T1eov SScross 65covs 75cov5 5.10 cross 610 cov, 710 cov, RANDOM_RESIDUAL VALUES 10 RANDOM_GROUP 456 RANDOM_TYPE diagonal FILE (coyvatances 1010.1 oa10a oa0.11 RANDOM_GROUP 789 RANDOM_TYPE ‘add_animal FILE animal ped (coyvantances 10.102 o1104 oaoat OPTION hetres_pos 67 OPTION hetres pol 4.0.00.1 # Example 2: with no intercept DATAFILE testidat NUMBER_OF_TRAITS a NUMBER_OF_EFFECTS 7 OBSERVATION(S) 3 WeIGHTIS) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] UU UU UU UU UU eb bbe 12eross Stcov 7ieov 65covs 75covs 6 10.covs, 710 cov, RANDOM RESIDUAL VALUES 10 RANDOM_GROUP 45 RANDOM_TYPE diagonal FUE (coyvaniances 10.1 oat RANDOM_GROUP 67 RANDOM _TYPE ‘add_animal FILE animal ped (CO)ARIANCES 104 oat OPTION hetres_pos 67 OPTION hetres_pol 1.00.1 ‘iH using GIBBS3F90 DATAFILE testdat NUMBER_OF_TRAITS. 
1 NUMBER_OF_EFFECTS 9 OBSERVATION‘) 3 WEIGHT(S) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED] 12cross 61cov Trev 55 .cross 65covs 75covs 5 10 crass 6 10covs, 710 cov, 53 RANDOM. RESIDUAL VALUES 10 RANDOM_GROUP 456 RANDOM _TYPE diagonal Fie {(Co}vaRIAnces, 2010.1 e110. oa0aa RANDOM_GRoUP 789 RANDOM_TYPE add_animal File animal ped {(CO}VARIANCES 10.101 o1101 oaoar OPTION hetres.int@ 5 Competitive model DATAFILE competition dat NUMBER_OF_TRAITS 1 NUMBER_OF_EFFECTS 9 OBSERVATION(s) 24 WEIGHTIS) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECTIEFFECT NESTED] 2.88 cross 3 362 cross 21.2409 cross 4.8004 cross 220¢0v5, 220cov6 220cov7 220cove 2009 220¢ov10 220cov 11 220cov 12 220cov 13 220cov14 54 CALETA EALAELAETLELATEDLALTGEDDDD9214404% UU UU UU UU UU RU UUEEEEELULEEES 220cov5 220cov 16 220¢cov17 220cov 18 22.8004 cov 19. RANDOM_RESIDUAL VALUES 1225.8 RANDOM_GROUP 45 RANDOM_TYPE add_animal Fite renadd04.ped (CoyVARIANCES 267.03 25.313 25,313 108.48 RANDOM_GROUP 2 RANDOM_TYPE diagonal FILE {CO)ARIANCES. 89.187 RANDOM_GROUP a RANDOM_TYPE diagonal File (coyvaniances, 167.34 55. Appendix A (single trait animal model) Single trait “USDA-type” animal model. This example is from the documentation of program JAA20, Y= hysi+ hsy+ Pk +a, + ey where Yiu - production yield hys: - fixed herd year season hsi - random herd x sire interaction Px: random permanent environment a random animal and var( hsi) = .05, var(p.)=.1, varlas)=.5, var(e)=2 Data file (ic Format: animal/hys/p/hs/y b232is 424313 535418 636332 Relationship file (i) Format: animal/dam/sire/code «730 5 za 6 110 eosu 913 8 10 736 a a36 Parameter file # Example of single-trait animal model with one fixed effect DATAFILE ke NUMBER_OF_TRAITS 1 NUMBER_OF_EFFECTS 4 OBSERVATION(S) 5 WercHT(s) LAGALLARCAALLARALLADELELDLADLALDLVLADELTUADUADLALAALAAALS? 
VVUGVLUVLLULULLEVLULUVLELELULLEDULULLULELELLELELLULELE EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF EFFECT [EFFECT NESTED] 2a cross 36 cross 44.cross 118 cross RANDOM_RESIOUAL VALUES 1 RANDOM_GROUP zi RANDOM_TYPE diagonal FILE (coyVARIANCES, a RANDOM_GROUP 3 RANDOM _TYPE diagonal FILE {coARIANcEs 0s RANDOM_GROUP 4 RANDOM_TYPE add_an_ups, Fite & (coyvaRIAnces, s Execution Prone/ignacy/£90/exampies blupt90 name of paranater £le7exiap Parameter #:1e: extep Data file se 5 Position of Weight (1) ° Value of Missing Trait/Observation ° position (2) levels (positions for nested] 2 3 3 6 ‘ ‘ 2 au“ Residual (co)variance Matrixe 1.000 Bandon Bffect 2 type of Random BE¢ect: diagonal trait effect (Co) VARIANCES 2 2 0.100 Random Effect 3 ‘ype of Randos Effect: diagonal trait efsect (co) VARIANCES + 3 0.050 Randoa Befect 4 type of Random Eteact, additive animal Pedigree Pile: ie trait effect (co)VARIANCES 0.500 PEMARKS (1) Weight position 0 means no weights utilized (2) REeect positions of 0 for some effects and traits means that such effects are missing for specified traits Data record length = 5 original & 0.20 Anvarted & 10.00 original & 0.05 snverted & 20.00 original & 0.50 2.00 solutions stored in file: "solutions" ‘Ynome/ignacy/£90/examples cat solutions trait/eteect level solution 252 2 aL.eses ii 2 43.7839 aa 3 34 aa 2 0.008 12 2 0.0088 ras} 3 -0.0159 iva 4 o.01ss 12 5 0.031 aa, 6-0-0321 13 i 0.0000 a)i3 2 0.0079 ae 3 0.0081 13 «ovo ae 2 -a17627 a4 2 019583 16 3 lanes 2.4 «019206 dice 5 -a.o7as 14 6 2.374 14 7 0-esi2 58 e244446404046464600406606060008082006068088028280204282804284.44442424244444 4 ARUAUVAAAURAAVAVAVDADAVLADAVLVAVADVAVARAVAVAVVAIATSD Appendix B (multiple trait sire model) Example of multiple trait sire model (from LR. Schaeffer notes of 1985). 
Models

Trait 1: $y_1 = h_i + s_{1j} + e_1$
Trait 2: $y_2 = \mu + s_{2j} + e_2$

where
h - fixed herd
s - random sire
and
var(s) = A[8 6; 6 17], var(e) = I[10 10; 10 20]

Data file (lrsdat)
Format: h//s/y1/y2
21.30 3.8 50.3 as 52.6 5 055.0

Pedigree file (lrsrel)
Format: bull/sire/MGS
3

Parameter file (lrsex)
# Example of two trait sire model with unequal models
DATAFILE
lrsdat
NUMBER_OF_TRAITS
2
NUMBER_OF_EFFECTS
2
OBSERVATION(S)
4 5
WEIGHT(S)

EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
1 2 2 cross
3 3 5 cross
RANDOM_RESIDUAL VALUES
10 10
10 20
RANDOM_GROUP
2
RANDOM_TYPE
add_sire
FILE
lrsrel
(CO)VARIANCES
8 6
6 17

Execution
/home/ignacy/f90/examples blupf90
name of parameter file?lrsex
BLUPF90 1.00
Data file: lrsdat
Number of Traits 2
Number of Effects 2
Position of Observations 4 5
Position of Weight (1) 0
Value of Missing Trait/Observation 0
# type position (2) levels [positions for nested]
1 cross-classified 1 2 2
2 cross-classified 3 3 5
Residual (co)variance Matrix
10.000 10.000
10.000 20.000
Type of Random Effect: additive sire
Pedigree File: lrsrel
trait effect (CO)VARIANCES
1 2 8.000 6.000
2 2 6.000 17.000
(1) Weight position 0 means no weights utilized
(2) Effect positions of 0 for some effects and traits means that such effects are missing for specified traits
Data record length = 5
inverted G
0.17 -0.06
-0.06 0.08
solutions stored in file: "solutions"
/home/ignacy/f90/examples cat solutions
trait/effect level solution
3.2180 0.0000 0.2243 -0.0210 0.8227 -0.2866 0.4969 0.7522 0.6178 -0.0769
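The values 0.17, -0.06 and 0.08 printed in the log above are the elements of the inverse of the sire (co)variance matrix [8 6; 6 17]. As a quick sanity check, the following minimal Python sketch reproduces them. The helper name `invert_2x2` is ours for illustration and is not part of BLUPF90; only the matrix itself is taken from the parameter file.

```python
def invert_2x2(m):
    """Return the inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


# Sire (co)variance matrix from the (CO)VARIANCES block of the parameter file
G = [[8.0, 6.0], [6.0, 17.0]]
G_inv = invert_2x2(G)

# Print with two decimals, the precision used in the log
for row in G_inv:
    print(" ".join(f"{x:6.2f}" for x in row))
```

The determinant is 8 x 17 - 6 x 6 = 100, so the inverse is [0.17 -0.06; -0.06 0.08], in agreement with the log.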
Appendix C (test-day model)

This test-day model example comes from the paper of Schaeffer and Dekkers (WCGALP94 18:443).

Model

$y_{ijk} = h_i + \beta_1 X_1 + \beta_2 X_2 + a_j + \gamma_{1j} X_1 + \gamma_{2j} X_2 + e_{ijk}$

where
$y_{ijk}$ - yield of test day
$h_i$ - test day effect
$X_1$ - days in milk
$X_2$ - log(days in milk)
$\beta_1$, $\beta_2$ - fixed regressions
$a_j$ - random animal
$\gamma_{1j}$, $\gamma_{2j}$ - random regressions for each animal
and
var($e_{ijk}$) = 4; var($a_j$, $\gamma_{1j}$, $\gamma_{2j}$) = [2.25 4 -.7; 4 1375 12; -.7 12 94]$^{-1}$

Data file (lrsrrdat)
Format: h/a/X1/X2/y
173 1.42905 26 34 2.19395 29, @ 3.64087 37 323'0.908127 23 84 1.20949 18 58 1.65907 25 5 4.12087 44 178 0.538528 21 139 0.795038 6 313 0.992924 19 60 1.62597 29 184 0.505376 1 58 0.657717 15, 105 1.06635 22 14 3.08125 35 aue 0.238817 11, 165 0.614366 14 74 1.41625 23 31 2.20632 28 21s 0.249674 8 1 1.32586 22

Relationship file (lrsrrrel)
Format: animal/sire/dam
197 2108 392 a8 000

Parameter file (exlrsrr)
# Example of single-trait random-regression model
DATAFILE
lrsrrdat
NUMBER_OF_TRAITS
1
NUMBER_OF_EFFECTS
6
OBSERVATION(S)
5
WEIGHT(S)

EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
1G cross
3 1 cov
4 1 cov
2 11 cross
3 11 cov 2
4 11 cov 2
RANDOM_RESIDUAL VALUES
1
RANDOM_GROUP
4 5 6
RANDOM_TYPE
add_animal
FILE
lrsrrrel
(CO)VARIANCES
0.447906 -0.001334 0.003506
-0.001334 0.000732 -0.000103
0.003506 -0.000103 0.010678

Execution
/home/ignacy/f90/examples blupf90
name of parameter file?exlrsrr
BLUPF90 1.00
Parameter file: exlrsrr
Data file: lrsrrdat
Number of Effects 6
Position of Observations 5
Position of Weight (1) 0
Value of Missing Trait/Observation 0
# type position (2) levels [positions for nested]
1 cross-classified 1
2 covariable 3 1
3 covariable 4 1
4 cross-classified 2 11
5 covariable 3 11 2
6 covariable 4 11 2
Residual (co)variance Matrix
1.000
correlated random effects 4 5 6
Type of Random Effect: additive animal
Pedigree File: lrsrrrel
trait effect (CO)VARIANCES
1 4 0.448 -0.001 0.004
1 5 -0.001 0.001 0.000
1 6 0.004 0.000 0.011
(1) Weight position 0 means no weights utilized
(2) Effect positions of 0 for some effects and traits means that such effects are missing for specified traits
Data record length = 5
original G
0.45 0.00 0.00
0.00 0.00 0.00
0.00 0.00 0.01
inverted G
2.25 4.00 -0.70
4.00 1375.09 11.95
-0.70 11.98 94.00
solutions stored in file: "solutions"
/home/ignacy/f90/examples cat solutions
trait/effect level solution
19.7278 37.8500 -0.0498 5.2512 -0.4430 0.2708 -0.7288 Peers) -0.1626 -0.4928 ose 0.4574 -0.6288 0.4574 0.0369 0.0068 -0.0054 0.0068 0.0167 0.0133 -0.0238 0.0350 0.0238 0.0008 -0.0370 0.035 0.0479 0.0767 -0.0149 -0.0377 -0.0103 0.0366 -0.0480 0.0366 -0.014s

Appendix D (multibreed maternal effect model)

This model was used for studies on multibreed evaluation in beef cattle. It is provided as an example of a model with maternal effect and different models per trait.

Model (in concise form, with most indices omitted)

$y_1 = cg_1 + bt + mbt + a + m + e$
$y_2 = cg_2 + bt + mbt + a + m + pe + e$
$y_3 = cg_3 + bt + mbt + a + e$

where
$y_{1-3}$ - birth weight, weaning weight, and gain
$cg_{1-3}$ - contemporary groups separate for each trait
bt - breed type
mbt - maternal breed type
a - additive effect
m - maternal effect
pe - permanent environmental effect of the dam

Data file (data.out)
Format:
1. contemporary group for trait 1
2. contemporary group for trait 2
3. contemporary group for trait 3
4. animal breed type
5. maternal breed type
6. animal id
7. dam id
8. birth weight
9. weaning weight
10.
gain

Relationship file (pedi.outok)
Format:
animal
sire or unknown parent group
dam or unknown parent group
"1 + number of missing parents"

Parameter file (exlrsrr)
DATAFILE
data.out
NUMBER_OF_TRAITS
3
NUMBER_OF_EFFECTS
6
OBSERVATION(S)
8 9 10
WEIGHT(S)

EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
1 2 3 133085 cross
4 4 4 1Bi cross
5 5 0 165 cross
6 6 6 1724112 cross
7 7 0 1724112 cross
0 7 0 1724112 cross
RANDOM_RESIDUAL VALUES
23 00 00
00 1312.9 0.0
00 00 1286.3
RANDOM_GROUP
4 5
RANDOM_TYPE
add_an_upg
FILE
pedi.outok
(CO)VARIANCES
29 363 16-46 00 366 $002 1108 00 © 916 6 1108 3130 00 © 00 “46 00 00 101 00 oo 916 00 «00a. 09 00 «00 0 oo
RANDOM_GROUP
2
RANDOM_TYPE
diagonal
FILE

(CO)VARIANCES
02688 00 © 00 00 13.129 00 00 002.463
RANDOM_GROUP
a 00 00 00 0 00 0.0
RANDOM_TYPE
diagonal
FILE

(CO)VARIANCES
0268 00 © 00 00 = 13.129 0.0 00 =00 a
RANDOM_GROUP
6
RANDOM_TYPE
diagonal
FILE

(CO)VARIANCES
Cr) oo 455 00 a) 6

Appendix E (random regression model)

A single-trait random regression model for test-day milk using cubic Legendre polynomials.

Model

$y_{ijkl} = hym_{ij} + \sum_{m=1}^{4} \alpha_m(l)\, h_{im} + \sum_{m=1}^{4} \alpha_m(l)\, u_{km} + \sum_{m=1}^{4} \alpha_m(l)\, p_{km} + e_{ijkl}$

where
$y_{ijkl}$ - test day milk
$hym_{ij}$ - herd-year-test for herd i and year-test j
$h_{im}$ - effects of herd i
$\alpha_m(l)$ - value of m-th Legendre polynomial at point corresponding to DIM = l
u - additive effects
p - permanent environmental effects

Data file (datarr)
Format:
1. herd
2. herd-year-test
3-6. values of Legendre polynomials
7. weight for residuals: 100/var($e_{ijkl}$)
8. test day
9.
animal

Relationship file (pedirr)
Format: animal sire dam

Parameter file (exrr3)
DATAFILE
datarr
NUMBER_OF_TRAITS
1
NUMBER_OF_EFFECTS
13
OBSERVATION(S)
8
WEIGHT(S)
7
EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT
2 3726 cross # herd-year-test
3 88 cov 1 # herd
4 84 cov 1
5 84 cov 1
6 88 cov 1
3 21874 cov 9 # additive
4 21874 cov 9
5 21874 cov 9
6 21874 cov 9
3 21874 cov 9 # pe
4 21874 cov 9
5 21874 cov 9
6 21874 cov 9
RANDOM_RESIDUAL VALUES
100
RANDOM_GROUP
6 7 8 9
RANDOM_TYPE
add_animal
FILE
pedirr
(CO)VARIANCES
(4x4 matrix)
RANDOM_GROUP
10 11 12 13
RANDOM_TYPE
diagonal
FILE

(CO)VARIANCES
(4x4 matrix)

Appendix F (terminal cross model)

A terminal cross model by Fernando et al. and Lo et al.

breed A: $y_a = cg_a + u_a + e_a$
breed B: $y_b = cg_b + u_b + e_b$
cross: $y_{ab} = cg_{ab} + u_{a,ab} + u_{b,ab} + e_{ab}$

Data file (data.cross)
1. cg A (85 levels)
2. cg B (110 levels)
3. cg crossbred (87 levels)
4. animal - breed A (2400 animals) or parent from breed A
5. animal - breed B (3000 animals) or parent from breed B
7, 8, 9. ya yb

Pedigree files: pedig_A for breed A and pedig_B for breed B

Parameter file
# Example of a terminal cross model
DATAFILE
data.cross
NUMBER_OF_TRAITS
3
NUMBER_OF_EFFECTS
3
OBSERVATION(S)
6 7 8
WEIGHT(S)

EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
1 2 3 110 cross
4 0 4 2400 cross
0 5 5 3000 cross
RANDOM_RESIDUAL VALUES
100 0 0
0 100 0
0 0 100
RANDOM_GROUP
2
RANDOM_TYPE
add_animal
FILE
pedig_A
(CO)VARIANCES
(9x9 matrix)
RANDOM_GROUP
3
RANDOM_TYPE
add_animal
FILE
pedig_B
(CO)VARIANCES
(9x9 matrix)

Appendix G (competitive model)

Example of a competitive model (a la Muir and Schinkel)

$y = cg + a + c_1 + c_2 + \dots + c_5 + e$

$c_i$ is the effect of the i-th competitor; assumed pen size of up to 6.

Data file (data.comp)
1. y
2. cg (max 120)
3. animal (max 3000)
4. competitor 1
5. competitor 2
...
8. competitor 5
If pen size is less than 10, unused fields set to 0.
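The data layout described above can be illustrated with a minimal Python sketch. It assumes the eight-field record listed (y, cg, animal, then five competitor fields, since a pen of up to 6 animals gives at most 5 pen mates) and pads unused competitor fields with 0. The function name `competition_record` and the example values are hypothetical and serve only as an illustration; they are not part of the BLUPF90 programs.

```python
MAX_COMPETITORS = 5  # a pen of up to 6 animals means at most 5 pen mates


def competition_record(y, cg, animal, competitors):
    """Build one data line: y, cg, animal, then five competitor fields.

    Unused competitor fields are set to 0, as described in the text.
    """
    if len(competitors) > MAX_COMPETITORS:
        raise ValueError("at most 5 competitors per record")
    if animal in competitors:
        raise ValueError("an animal cannot be its own competitor")
    padded = list(competitors) + [0] * (MAX_COMPETITORS - len(competitors))
    return " ".join(str(v) for v in [y, cg, animal] + padded)


# Hypothetical pen of four animals (ids 11 to 14); record for animal 11
line = competition_record(2.5, 7, 11, [12, 13, 14])
print(line)
```

Each animal in a pen gets its own record in which the remaining pen mates fill the competitor fields, so every record has the same number of columns regardless of pen size.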
Parameter file ‘Example of a competitive mode! DATAFILE = 0.0% otimating Regrs bot + bate 0.008 "0.997 ete | Correlations of off-diagonal elements of G and A22 is 0.660; low numbers indicated genotyped mistakes zoo | ot Poor pedigrees jsion Coefficients G = bO 11' + BLA +e Ripiasniog colericins’so'nis''" “3.006” ‘1-600 correlation off-piagonal elesents @ 6A 0.660 tend @ as aiphata + beta*aaa: (aisha bets) 0.950 0.050 seatistie of Canonte wateix ee Diagoned asoo 09 oth Late 0.008 SeePattgonnn — z2eent 000 -0.ias Lesa s00g reequanay ~ Diagonal of @ ara ae Diagonal lemons of G shoul be 3202. Too large o = oveae too small elrents inate: tea 1.446 Genotyping mistakes Ringe: cau0 Cees bined nes See imeone et (2013) T o.enea ° 2 Slone se 3 Siscae a8 tise ite 5 “aloes sa ies i $ fee at a Laer ie 5 tus ‘ a Tia 5 Hone i Lise i ooUgs i ree : is tee 3 1 130, ° er ° ers $ is ie, é tae : OOM a sonia 0 matete according to 822 - athod: 2 alagonni'at “°PGaE? “ofeauagomal Ar 0.002 ALLA: 0.008 bégference: Bingen G: 51999 Seediogonal @: 0.000 All! 91000 bitfarence DPS" bueg ~ 0 beebiags 07398" dant /tagrog) 0.99 Die A OffDiag ~ ¢ OfeDiag: Dige & all ~ @ al Mew Alpha: 0,008 0.008 Final Pedrigres-Based Matrix statistic of Ral. Matrix A22 New Beta: 0.050 Mew Delta 0.008 0.998 01999, 82 i a i i iii eee 83, Diagonal 1500 1.001 1.000 1.250 0.000, off-diagonal 2248500 0.003. «0.000 0.750 0.008, Statistics of G after scaling as in Chen et al (2014) or Vitezica et al. (2011) Statistics should be same as for A22. Statistic of Genomic Matrix Mean men vax ver Diagonal 1s00 1.001 0.896 1.447 0.002, orfediagonal 2248500 «0.003. -0.134 0.822 0.002 Correlation of Genonic Inbreeding and Pedigree Inbreeding ALL elesents ~ Diagonal / off-Diagonal Eotinating Regression Coefficients © = BO 11! + DLA +e Regression coefficients BO bi = 0.000 0.995 correlation all elenents G&A 0.663 of Diagonal Using 70306 elements from A22 >= 0.02000 Eotimating Regression Coefficients ¢ = 20 11! 
+ BLA +e Regression coefficients BO Bi= -0.001 0.998 Correlation Off-Diagonal elements G&A 0.679 creating A22-inverse "Wall time: 08-05-2011 16h Sem 10s @66 Inverse using ginv2 elapsed tine ”3.5¢446100000000 Statistics of Azy* Statiatic of Inv. Rel. Matrix A22 x Moan msn wae var Diagonal 1500 1.607 1.086 9.221 0.575 off-diagonal 2248500 -0.001 1.067 0.533 0.001 creating G-inverse ‘Wall tine: 08-05-2011 16h Sem 17s 987 Inverse using ginv? elapsed tine "4.24635400000000 Wall time: 08-05-2011 16h 56m 265 068 Statistics of G* 2.x diag(G* - Ans") is approx. measure of extra genomic info in terms of effective daughters Statistic of Inv. Genomic Matrix Diagonal 1500 8.007 3.597 64.893 21.055, Oefrdiagonal «2248500 -0.005 -12.697 6.632 0.056. Creating Gima22i in file: "Giaaz2i" Calculating GaA22/Gian22i Matrix Densem storage Calculating GaA22/GinA22i Matrix..velapsed time 0.1269617 2 a 2 2 2 2 2 2 2 2 2 2 . 2 2 3 2 2 3 5 2 2 3 2 3 2 2 2 2 2 2 3 3 “4 84 Setup Genoate Done. waieazai 1.00000000000000 ‘aatrix increased from 100000 to 150000 § filled: 0.9000 Gatrix increseed from 150000 to 225000 $ filled: 0.9000 matrix incressed from 225000 to 337500 § filled: 0.9000 atrix increased from 337500 to 506250 $ filled: 0.9000 Matrix increased from 506250 to 759375 $ filled: 0.9000 Sateix increased from 759375 to 1139062 % filled: 0.9000 Sitcix increased from 1139062 to 1708593 § filled: 0.9000 Finished peda sm 30-68333 8, 1153064 nonzerces round 1 “convergence” 3.2347761279059925-006 round 2 Cenvengence= 1.6159551481596982-005 round 3 SShvergence= 9. 
6751370583609918-006 ound 4 Gonvargence= | 6_5334826759414475-006 ound 5 convergence= | 2.7117511659833212-006 ‘Found, 64 convergance=2.7210309586176838-012 Found 5 Gonvergence= | 1.9310295787582112-012 round 66 cenvergence= 1.6104729921881481-012 ound 69 Convergence= 1.2592041366430062-012 Found 62 convergence= _9.025592062452769E-029 68 iterations, convergence eriterion= 9.025592062452768R-013 solutions stored in file: "solutions" Solution file solutions trait/esfect level solution a gs 1 a orssizia 12 1 O:20194865 12 2 0133749439 a 3 010475742 ae &-0/31055520 12 5 _0.22360631 12 6 -0.0945480 12 7 “olosi86435 2 8 0,28033163 ‘Variance component estimation by AIREMLF90 name of parameter £ile?renf90.par + xP fite: mazker.gono.clean SNP Xeef file: sarker.geno.clean_Xref1D { prequency to Center Zip to create GnZZ'/k (default whichtreq = 2): AI-RIMLFDO ver. 1.96 Parameter £10: ant90.per Data file Fene90 dat Womber of Teaits 1 Number of Befects 2 Position of Observations = 1 Position of Weight (1) ° Value of Missing Trait/observation ° Statistic of Inv. Genoatc Matrix Diagonal 2.007 3.597 64.893 21.055, off-diagonal 01005 12.697 6.632 (0.056 creating Gim22i in file: "Ginaz2i" et ta AA REZ AHAATAEAAZARAAAAALAAARLAALABLAAAR 85 calculating Gnaz2/cimaz2i Matrix Densen storage Calculating GaA22/GimA22i Matrix..velapsed time 0.1089821 ‘Setup Genomic Done weima22i 1,00000000000000 hash matrix increased fro 95429 to 128142 $ filled: 0.9000 hash matrix increased from 126142 to 192213 § fillea: 0.9000, hah matrix from 192213 to 280319 § filled: 0.8000. hash matrix from 288319 to 432478 § filled: 0.9000, hash matrix from 432478 to 649717 § filled: 0.9000. hash matrix increased from 649717 to 973075 $ filled: 9.9000 hash matrix increased from 973075 to 1459612 8 filled: 0.9000 hash matzix increased from 95428 to 126142 $ filled: 0.9000 hah matrix increased from 128142 to 192212 $ £illed: 9.9000. 
hhash matrix increased from 192213 to 268319 $ filled: 0.9000, hash matrix increased from to 432478 § filled: 0.9000, hash matrix increased from to 648717 § filtea: 0.9000, hash matrix increased from 0 973075 $ ¢iliea: 0.9000 hash aateix increased from 973075 to 1459612 $ filled: 0.9000 ‘finished peds in’ 32.01313. 8, 1193064 nonzeroes SPARSE svarrsTics DIMENSION OF MATRIX = ‘ze Im FACTOR (OF CALLS SPARS SoLW OF CALLS Der / IDET ‘TOTAL CPU TIME IN FSPAK ‘TIME FoR srwmoLic FAC 0.676899, ‘PIME FOR NUMERICAL. FAC 21017693 ‘TMe FoR SOLE : 0008995 (TIME FOR SPARSE goLva = ‘0,000000 ‘TIME FOR SPARSE IKVERSE = 41147369, “Rog = | 43515,7413644011, are = 43519,741 3644021 an round 1 convergence= 0.42305178038i002 Getta convergence= 0.252173522062583 058510 new G 0,28516 -2iogh = 53013.2734486059 arc = 59017.2724486053 In round 2 convergences_0.141351613622645 delta convergences 0.117430750820623 0,52205, “Bog = 52800.6601605267 Are = 52904. 6co160s267 In round 3) convergence= "1,7253305659253585-002 daits convergence= 4.768938966058494E-002 0.49575, new o 0.52606 Blog = 52785.2479463395, rare = 52789,2479¢63205, Im round ‘4 convergences ”1,1018917634514988-004 dette convergence= 3. 6624971044840092-003 0749400 9.46sse1 VUVEVELELLULELELELELESELELELLEULELELELELELES 86 0,53164 -2logh = 52785.1635385807 arc = _52799.1635305807 Tn round ‘5 convergences 2.8046958472400735-009 doles convergence= 1.777604045032979E-005, Estimates of variance components Final Betinates Genetic variance(s) for effect 053167 inverse of AI matrix (Sampling Variance) (0. 40¢4aE-03 -0.173678-03 Correlations from inverse of AT matrix fo. 71219 0.71219 1.0000 SB for R 0.121258-01 se solutions stored in file: "solutions" CERLGEEEGEAEELEBLEDBEAELLERETALLELADTADLALDTELEAADLADTALD | VPUVEVUELELELELELELELELEEELEEELEELEEEEELELELELE 87 Appendix I (complete genomic analysis) Data files are available at http://nce.ads.uga.edu/wi _from_uga_2014 joku.php? 
Using RENUMF90, PREGSF90, BLUPF90 (BLUP), BLUPF90 (ssGBLUP), PREDICTF90, POSTGSF90 (ssGWAS)

Simulated data
Single trait with heritability of 0.30 and phenotypic variance = 1.0
Five generations
Total of 994 parents from generations 1 to 4 were genotyped
Three hundred progeny from the 5th generation had genotypes and pedigree, but phenotypes were removed for traditional and genomic evaluations

Data Structure:
# Animal Generation Sex Mu QTL Residual Phenotype
(Phenotype = Mu + QTL + Residual)
1011 -0.926104 1.586661 1.76056 0.11 1.093034 -0.4s1621 -0.54405s 11 0.135824 0.984936 1.94911 21 0.064242 0.802145 0.242097 21 0.342068 0.028434 1.3705,
...
6095 5 2 1 1.801324 -0.4o4a22 2.3065
6096 5 2 1 0.772964 0.791936 2.5649
6097 5 2 1 0.749241 0.285@15 2.02406
6098 5 1 1 1.042522 -1.606656 0.425866
6099 5 1 1 0.891219 0.179843 2.07116
6100 5 2 1 0.745673 0.034715 1.78059

Pedigree: 6100 animals
# Animal Sire Dam
00
...
6095 4576 4403
6096 4576 4065
6097 4576 2263
6099 4576 2690

Genotypes: 1294 animals genotyped for 1000 SNP across 5 chromosomes
# Animal SNP1 SNP2 SNP3 SNP4 SNP5 ... SNP1000
6100 2712.1

Map:
# SNP order chromosome position
11 0.00000 116722 0.33406 o.so66

Parameter file for RENUMF90
DATAFILE
newdata.txt
TRAITS
7
FIELDS_PASSED TO OUTPUT
2
WEIGHT(S)

RESIDUAL_VARIANCE
0.70
EFFECT
4 cross alpha # mu
EFFECT
1 cross alpha # animal
RANDOM
animal
FILE
ped.txt
FILE_POS
1 2 3 0 0
SNP_FILE
snp.txt
PED_DEPTH
0
(CO)VARIANCES
0.30
OPTION chrinfo map.txt

Log file for RENUMF90
RENUMF90 version 1.106
name of parameter file? renum.par
datafile:newdata.txt
fields passed: 2
0.7000
Processing effect 1 of type cross
item kind=alpha
Processing effect 2 of type cross
item kind=alpha
Pedigree file name "ped.txt"
Positions of animal, sire, dam, alternate dam and yob 1 2 3 0 0
SNP file name "snp.
txt"
Reading (CO)VARIANCES: 1 x 1
Maximum size of character fields: 20
Maximum size of record (max_string_readline): 800
Maximum number of fields for input file (max_field_readline): 100
hash tables for effects set up
read 6100 records
table with 1 elements sorted
added count
Effect group 1 of column 1 with 1 levels
table expanded from 10000 to 10000 records
added count
Effect group 2 of column 1 with 6100 levels
wrote statistics in file "renf90.tables"
Basic statistics for input data (missing value code is 0)
7 2.0083 5.0863 1.0042 0.99034 6100
random effect with SNPs 2
type: animal
file: snp.txt
read SNPs 1294 records
Effect group 2 of column 1 with 6100 levels
random effect 2
opened output pedigree file "renadd02.ped"
read 6100 pedigree records
Pedigree checks
Number of animals with records 6100
Number of animals with genotypes 1294
Number of animals with records or genotypes 6100
Number of animals with genotypes and no records 0
Number of parents without records or genotypes 0
Total number of animals 6100
Wrote cross reference IDs for SNP file "snp.txt_XrefID"
Wrote parameter file "renf90.par"
Wrote renumbered data "renf90.dat"

Parameter file for PREGSF90 without quality control
DATAFILE
renf90.dat
NUMBER_OF_TRAITS
1
NUMBER_OF_EFFECTS
2
OBSERVATION(S)
1
WEIGHT(S)

EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
2 1 cross
3 6100 cross
RANDOM_RESIDUAL VALUES
0.70000
RANDOM_GROUP
2
RANDOM_TYPE
add_animal
FILE
renadd02.ped
(CO)VARIANCES
0.30000
OPTION SNP_file snp.txt
OPTION chrinfo map.txt
OPTION no_quality_control

Log file for PREGSF90 without quality control
name of parameter file?
rene90.par regs 1.10 Parameter file: ene90.par Data file: ene90 dat dumber of BEfects 2 Position of Observations 2 Position of Weight (1) ° Value of Missing Trait/Observation ° position (2) levels [positions for nested] Residual (co)variance Matrix 0.70000, Random HEfect(s) 2 type of Random Effect additive ania Pedigree File ronadd02.ped trait effect (Co)VARIANCES 1 2 0.2000 (2) Weight position 0 masns no weights utilized (2) Ettect positions of 0 for some effects and traits means that auch effects are missing for specified traits options read from parameter file * Sup tite: anp. txt, 90 100 ELATLAGELALELLELELDLELDLAURLADLEDLUDAETADTUAGDEADEAAAAD VUVUVEEUEEEELEEELELELELEEELELELELEELELELELES a1 + sup xret file: snp. tet_xret1D + map file: map. txt No Quality Control Checks 1111! (default fs Genomic Library: Version 1.164 Optimized OpenkP Version * stionship matrix (i) created for effect: . creating a22 Extracting subset of: 2312 pedigrees from: 6100 elapsed time: 0.0150 Calculating 222 Matzix by Colleau OpenMP...elapsed tine: .0190 Column position in file for the first marker: 8 Format to read SNP #:1e: (7x,40000031) Number of SNPs: 1000 Number of Genctyped animals: 1294 Reading SNP file elapsed time: .06 Statistics of alleles frequencies in the current population 1000 0.504 0.043 0.929 0.032 Reading MAP file: "map.txt" - 1000 suPs out of 1000 Min and max # of chromosome: 1 5 Min and max # of SNP: 1 1000 Genotypes missings (8): 0.000 calculating 6 matrix Dgema ML Wehreads= «G16 Elapsed omp get_tine: 0.7359, Scale by Sum(2pq). Average: 435.221580281360 Blend Gas alphatG + beta+az2: (alpha,bets) 0.950 0.050 Frequency ~ Diagonal of Moan: 0.999 Msn 0169s. ax 1468 Range: o.029 tease ciass count 3 0.9523, +300 4 0.9810 380 92 010 207 038 337 067 096 24 183 382 362 au 40 468 1 Fr 20 a Check for diagonal of genomic relationship matrix Check £or diagonal of genomic relationship matrix, genotypes not removed: 0 Final Pedrigree-Based Matrixx Statistic of Rel. 
Matrix a22 Diagonal 1294 1.001 1.000 1.250 0.000 off-diagonal «1673142 «0.005 «0.000 «0.750 (0.001 Statistic of Genoaic Matrix off-diagonal «1673142 0.005. 0.158 0.791 0.002 correlation of Genomic Inbreeding and Pedigree Inbreeding Correlation: 0.2177 ALL elements - Diagonal / off-piagenal Retimating Regression Coefficients @ = BO 11' + BLA +e Regression coefficients bO bi = 0.000 0.992 correlation all elements G&A 0.737 ofe-Diagonal ‘Using @2426 elements from A22 >= .02000 Eetimating Regression Coefficients G = b0 11' + bLA +e Regression coefficients bO bi = -0.003 0.999. correlation Off-Diagonal elements G&A 0.777 creating A22-snverse Inverse IAPACK MEL dpotr£/i Wthresde= 8 16 Elapsed omp_get_time: 0.1071 Q2ECAAEEAAAAELELAGCELLARLALLELELBLELRLELELLALBLELLRLALALLRLAD VWEVEEEELELELELEL ELE LELELELELELEL EEE EEEEEEEESE Pinal a22 Inv Matrix Mose msn Max var Diagonal 1294 1.851 1.067 5.012 0.431 off-diagonal «1673142 -0.001. 1.200 0.600 0.001. Creating G-inverse Inverse IAPACK KL dpotrt/i #threade= @ 16 Elapsed omp_get_time: 0,1050 Statistic of Inv. Genomic Matrix N Moan msn Max var Diagonal 1204 13.457 5.627 45.588 27.905 off-diagonal «1673142 -0.010 -13.500 6.896 0.226. Check for diagonal of Inverse Genomic - Inverse of pedigree relationship matzix Saving Gimaz2i in file: “Gimaz2i= Pinal G Inv ~ A22 Inv Matrix Diagonal 1294 11.606 4.746 40.210 off-diagonal 1673142 -0,009 -12.500 6.396 + Setup Genomic Done 111 + Parameter file for PREGSF90 with quality control DATAFILE renf90.dat NUMBER_OF_TRAITS 1 NUMBER_OF_EFFECTS 2 (OBSERVATIONS) 1 WEIGHT(S) 21.707 EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECTIEFFECT NESTED] 2 Leross 3 6100 cross RANDOM. RESIDUAL VALUES 0.70000 RANDOM_GROUP 2 RANDOM_TYPE ‘add_animal Fur renadd02.ped (coyvaRIANces 0.30000 OPTION SNP file spt OPTION chrinfo map. txt Log file for PREGSF90 with quality control pane of paranater file? 
precs 1.10 Paraneter £1e: ene90.par Data file: rent90.dat Mumber of Traits a umber of Effects 2 Position of Weight (1) ° Value of Missing Teait/observation ° position (2) levels Residual (co)vartance Matrix 0.70000, Random Eefect(e) 2 type of Random Effect: additive animal Pedigree File: renadd02.ped trait effect (CO)VARIANCES 2 2 0.3000 (2) Weight position 0 means no weight (2) Eetect positions of 0 for some a: ng for specified traite options read from parameter file: + uP fi10: emp. txt 94 (positions for nested] Read 6100 animals from pedigres file: “renadd02.ped" munber of Genotyped Animale: 1294 creating a22 EAAECAALAAATEALELADCABLULDEBLELABELBALALBLADAELABLELTLLAD PUBSSEELEEEEEEELELEEESELELELEEES Extracting subset of: 2312 pedigrees from: 6100 elapsed tine: Calculating A22 Matrix by Colleau OpenlP. elapsed tine: .0189 Mumbers of threadesd 16 Resding SMP file (Colum position in file for the first marker: 8 Format to read SMP file: (7x, 40000051) mumber of SNP=: 1000 Munber of Genotyped animals: 1294 Reading SNP file elapsed time: .06 Statistics of alleles freqvencis in the current population m 1000 Mean: 0.504 min. 
0.063 Max 0.929 Resding MAP file: ‘map.txt" - 1000 SuPs out of 1000 Min and max # of chromosome: 1 5 Msn and max # of SNP: 1 1000 95 0.0160 Quality Control - SNPs with Call Rate < callzate (0.90) will removed: 0 Quality Control - swps with MAF < minfreq (0.05) will removed: 1 Quality Control - Monomorphic SKPs will be removed: 0 ‘quality Control - Removed Animals with Call rate < callrate ( 0.90): 0 ‘quality Control - check Parent-Progeny Mendelian conflicts ‘Total animals: 6100 - Genotyped animale: 1294 - Retective: 1296 Mumber of pairs Individual - size: 450 Mumber of pairs Individual - Dam: 440 Mumber of trios Individual - Sire - Dam: 206 Parent-progeny conflicts or HME could eliminate SNPs in sex Chr Provide map information and sex Chr to checks using autosomes ‘Total munber of parent-progeny evaluations: 890 mumber of SNPs with Mendelian conflicts: 0 checking Animals for Mendelian conflicts ‘otal number of effective SNP for checks on Animals: 999 munber of Parent-Progeny Mendelian Conflicts: 0 Munber of effective supe (after gc): 999 Munber of effective Indiviuals (after Qc): 1294 Statiation of alleles frequencies in the current population after Quality Control (MAP, monomorphic, call ra HWE, Mondalian conflicts) 96 w 999 Moan 0.504 Man! 0.081 Max 0.929 var: 01032 Conctypes missings (8): 0.100 cenctypes miss loge after cleannig (3): 0.000 calculating G matrix Dgemn 100, #thrende= «8 «16 Elapsed omp_get_tine: 0.9840 Scale by Sun(2pq). Average: 435.140105710293 Blend G as aiphatc + betataz2: (alpha/beta) 0.950 0.050 eequency ~ Diagonal of 6 w: 3234 Masa: 0.999 Min: 0.895, Max: 1469 Range: 0.029, class; 20 fciass count A 27 2 103 2 304 ‘ 379 5 285 ‘ 437 e a ° 3 10 a a ° 2 2 3 ° 1 ° 5 ° 16 ° v ° 1 ° 19 ° a ° Cchack for diagonal of genomic relationship matrix check for diagonal of genonic relationship matrix, genotypes not removed: 0 Final Padrigree-Based Matrix Statistic of Rel. 
Matrix a22 M Maan vin Max var Diagonal 1294 1.001 1.000 1.280 0.000 Ore-aiagonal «1673142 «0.005 0.000070 0.001 EEAAEEALAGCAELALELAEEALALAEEELALEALALALALELALAD VOVSVVVOVVVVVVOVOGSEVSVouousyvyovyuvyvyuyvyuyvysoyousysvysesd Statistic of Genomic Mateix 7 N Maan Min, Max var Diagonal 3294 1.001 0.898 1.470 0.002 off-diagonal 1673142 0.005 -0.158 0.791 0.002 Correlation of Genoaic Inbreeding and Pedigree Inbreeding Correlation: 0.2180 ‘ALL olements - Diagonal / off-Diagonal stimating Regression Coefficients G = b0 11! + bL A +e Regression coefficients BO bl= 0.000 0.991, Correlation all elements 6A 0.717 off Diagonal Using 83426 elements from A22 >= .02000 Eetinating Regression Coefficients G = bO 11! + BLA +e Regression coefficients b0 bl = -0.003 0,999, Correlation Off-Diagonal elements GA 0.777 creating A22-inverse Inverse IAPACK MUL dpotrf/i #threade= @ 16 Elapsed omp_get_time; 0.1068 anal A22 Inv Matrix, Statistic of Inv. Rel. Matrix a22 Diagonal, 1298 1.851 1.067 5.812 0.431, off-diagonal «1673142 -0.001 -1.200 0.600 0.001. Creating G-inverse Inverse LAPACK MG, dpotze/i Whreade= @ 16 Blapsed omp_get_tine: 0.1047 Final Genomic Inv Mateix Statistic of Inv. Genomic Matrix Diagonal i204 13.466 5.963 45.587 28.023 oef-diagonal 1673142 -0.010-13.521 6.897 0.227 check for diagonal of inverse Genomic - Inverse of pedigre Saving Gimaz2i in site: "oimaz2i" Final G Inv - 922 Inv Matrixx Statistic of inv. Ganoaie- A22 Matrix Diagonal 3296 11.615 4.782 40,309 ofe-diagonal 1673142 -0.009 -12.521 6.387 relationship matrix 21.740 0.211 setup Genomic Done 11! + Parameter file for PREGSF90 with quality control, removing SNP from chromosome 5 and saving the clean SNP file DATAFILE renf90.dat NUMBER_OF_TRAITS 1 NUMBER_OF_EFFECTS. 
2 OBSERVATION(S} a weiciTis) EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECTIEFFECT NESTED] 2 teres 3 6100 cross RANDOM_RESIOUAL VALUES 0.70000 RANDOM_GROUP 2 RANDOM_TYPE ‘add_animal FILE renadd02.ped {COARIANCES 1030000 ‘OPTION SNP_filesnp.txt OPTION chrinfo map.txt ‘OPTION excludeCHR 5 ‘OPTION savecleansNPs Log file for PREGSF90 with quality control, removing SNP from chromosome 5 and saving the clean SNP file ane of paranater £1167 rent90.par recs 1.10 Parameter f1le: ene90.par Data file ene90 dat Number of Traits t Position of Observations 1 Position of Weight (1) ° Value of Missing Trait/Observation ° position (2) levels (positions for nested] 2 a 3 6100 EALAALALALALAALAGLALAGAALAGLELADRLADABABRAGALALAAD 99 Residual (co)variance Matrix 0.70000, Random Befect(s) 2 type of Randos Efrect, additive animal Pedigree File: renadd02.ped trait eefect _(CO)VARIANCES 1 2 0.3000 (2) Weight position 0 means no weights utilized (2) Etrect positions of 0 for some effects and traits means that such effects are missing for specified traits options read from paraneter fi16 * owe ite: enp.txe + Sup xref file: onp. txt Xeee1D + map file: mep.txe + Save Clean SNP data to (SNP_file)_clean file (default .falee.) + mxclude Chromosomes (default des Genomic Library: Version 1.166 Modified relationship matrix (i) created for affect: 2 Ry Optimized OpenkP Version * Read 6100 animals from pedigree file: "renadd02.ped" Mumber of Cenotyped Animals: 1294 ‘ting subset of: 2312 pedigreas from: 6100 elapsed tine; 0.0150 elapsed tine: .0190 Reading SuP file Column position in file for the firet marker: 8 Format to read SNP file: (7x,40000032) Manber of SNPs: 1000 Munber of Genotyped animals: 1294 Roading SNP file elapsed tine: .06 Statistics of alleles frequencies in the current population Mean: 0.504 Min! 
0.043 Max: 01929 var: 0.032 Reading MAP file: "aap.txt" - 1000 supe out of 1000 Min and max # of chromosome: 15 Min and max # of SNP: 1 1000 Excluded 199 SNPs from 1 chromosomes: 5 Quality Control - SNPs with Call Rate < callrate (0.90) will resoved: 199 2 2 - » - > » 4 > 2 > 2 > > > > 4 > 2 = 2 > 4 di » id 2 2 > 2 4 100 Quality Control ~ supa with MAF < minfreq ( 0.05) will removed: 1 Quality Control ~ Monomorphic SNPs will be removed: 0 ‘quality Control - Removed Animale with Call rate < callrate ( 0.90): 0 ‘quality Control - check Parent-Progeny Mendelian conflicts Total animals: 6100 - Genotyped animale: 1294 - Effective: 1294 umber of pairs Individual - Size: 450 Mumber of pairs Individual - Dam: 440 Number of trios Individual - Sire - Das: 206 No sex Chrososose information is available Parent-progeny conflicts or HAE could elizinate SNPs in sex Chr Provide mp information and sex Chr to checks using autosonas checking SNPs for Mendelian conflicts otal number of effective SNP: 801 ‘Total nusber of parent-progeny evaluations: 890 mumber of SNPs with Mendelian conflicts: 0 Checking Animals for Mandelian conflicts otal number of effective SMP for checks on Animals: 801 Munber of Parent-Progeny Mendelian conflicts: 0 Tei finn ta tical wmber of effective sure career ce): 001 <<] | terremoving chromosomes munber of effective Indiviuale (after Qc): 1294 Statistics of alleles frequencies in the current population after Quality Control (AP, monomorphic, call rate, HME, Mendelian conflicts) mM: 801 Moan: 0.503 min: 0.081 Max 0.28 var: 0.032 List of SNPs removed in: " Clean genotype file was creat Ht New files with clean genotypes cross reference ID file was rented: Genotypes missings (8): 19.900 Genotypes missinge after cleannig (8): 0.000 calculating @ Matrix ‘Dgenm Xu #thrende= «@ «16 Elapsed omp_get_tine: 0.8766 Scale by Sun(2pq). 
Average: 349.571560214902
Blend G as alpha*G + beta*A22: (alpha,beta) 0.950 0.050
Frequency - Diagonal of G
[histogram of the diagonal of G: class values and counts not legible]
Check for diagonal of genomic relationship matrix
Check for diagonal of genomic relationship matrix, genotypes not removed: 0

Final Pedigree-Based Matrix
Statistic of Rel. Matrix A22
                    N    Mean     Min     Max     Var
Diagonal         1294   1.001   1.000   1.250   0.000
Off-diagonal  1673142   0.008   0.000   0.750   0.002

Statistic of Genomic Matrix
                    N    Mean     Min     Max     Var
Diagonal         1294   1.001   0.876   1.593   0.002
Off-diagonal  1673142   0.008  -0.169   0.861   0.003

Correlation of Genomic Inbreeding and Pedigree Inbreeding
Correlation: 0.2092

All elements - Diagonal / Off-Diagonal
Estimating Regression Coefficients G = b0 11' + b1 A + e
Regression coefficients b0 b1 = 0.000 0.991
Correlation all elements G & A 0.677

Using 82426 elements from A22 >= .02000
Estimating Regression Coefficients G = b0 11' + b1 A + e
Regression coefficients b0 b1 = 0.002 0.996
Correlation Off-Diagonal elements G & A 0.742

Inverse LAPACK MKL dpotrf/i #threads= 8 16 Elapsed omp_get_time: 0.1409
Statistic of Inv. Rel. Matrix A22
                    N    Mean     Min     Max     Var
Diagonal         1294   1.851   1.067   5.012   0.431
Off-diagonal  1673142   0.002  -1.200   0.600   0.002

Inverse LAPACK MKL dpotrf/i #threads= 8 16 Elapsed omp_get_time: 0.1370
Statistic of Inv. Genomic Matrix
                    N    Mean     Min     Max     Var
Diagonal         1294  17.075   7.840  56.092  43.645
Off-diagonal  1673142  -0.013 -16.499   8.893   0.309

Check for diagonal of Inverse Genomic - Inverse of pedigree relationship matrix
Saving GimA22i in file: "GimA22i"
Statistic of Inv.
Genomic - A22 Matrix
                    N    Mean     Min     Max     Var
Diagonal         1294  15.223   6.789   1.043  35.648
Off-diagonal  1673142  -0.012 -15.499   6.393   0.289

Parameter file for PREGSF90 with quality control and PCA analysis

Include extra option:
OPTION plotpca

Parameter file for BLUPF90 without genomic information

DATAFILE
renf90_5.dat
NUMBER_OF_TRAITS
1
NUMBER_OF_EFFECTS
2
OBSERVATION(S)
1
WEIGHT(S)

EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
2 1 cross
3 6100 cross
RANDOM_RESIDUAL VALUES
0.70000
RANDOM_GROUP
2
RANDOM_TYPE
add_animal
FILE
renadd02.ped
(CO)VARIANCES
0.30000

renf90_5.dat has phenotypes for all animals but generation 5.
Linux code to remove phenotypes for those animals:
awk '{if ($4==5) print 0,$2,$3,$4; else print $1,$2,$3,$4}' renf90.dat > renf90_5.dat

Log file for BLUPF90 without genomic information

name of parameter file?
 * convergence criterion (default=1e-12): 1.0000000E-15
BLUPF90 1.48
Parameter file: renf90.par
Data file: renf90_5.dat
Position of Weight (1) 0
Value of Missing Trait/Observation 0
EFFECTS
  levels [positions for nested]
Residual (co)variance Matrix
0.70000
Random Effect(s) 2
Type of Random Effect: additive animal
Pedigree File: renadd02.ped
trait effect (CO)VARIANCES
1 2 0.3000
REMARKS
(1) Weight position 0 means no weights utilized
(2) Effect positions of 0 for some effects and traits means that such effects are missing for specified traits
Data record length = 3
# equations = 6101
G
0.30000
read 6100 records in 1.4997000E-02 s, 2220 nonzeroes
read 6100 additive pedigrees
finished peds in 1.9996000E-02 s, 27178 nonzeroes
round = 3 convergence = 0.5923E-06
round = 4 convergence = 0.6219E-04
round = 5 convergence = 0.2122E-06
round = 40 convergence = 0.1230E-13
round = 41 convergence = 0.3164E-14
round = 42 convergence = 0.2804E-14
round = 43
convergence = 0.1081E-14
round = 44 convergence = 0.5761E-15
44 iterations, convergence criterion= 0.5761E-15

Solutions for BLUPF90 without genomic information

trait/effect level solution
1 1 1 -1.02176505
1 2 2 0.16420973
1 2 4 0-00s26330
1 2 5 -0.13277100

The solution file (solutions) has 4 columns:
1) Trait [only 1 trait in this example]
2) Effect [we have 2 effects: overall mean (effect 1) and additive genetic direct (effect 2)]
3) Level [number of the level for each effect in the model]
4) Solution

EBV accuracy

If accuracy of EBV is desired, it can be calculated based on standard errors (se) for EBV. BLUPF90 has an option for calculating se:
OPTION sol se

Solutions for BLUPF90 with option to calculate se

The third numeric column is the solution (EBV) and the last column is its standard error.

trait/effect level solution standard error
1 1 1 t.0217650¢ 0.02496066
1 2 1 -0.24665117 0.29158195
1 2 2 0.16421026 0. das es2
1 2 3 0.32371755 0.29408286
1 2 4 010038218 0.38229658
1 2 5 -0.13277154 0.46566701

Parameter file for BLUPF90 with genomic information (ssGBLUP)

DATAFILE
renf90_5.dat
NUMBER_OF_TRAITS
1
NUMBER_OF_EFFECTS
2
OBSERVATION(S)
1
WEIGHT(S)

EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
2 1 cross
3 6100 cross
RANDOM_RESIDUAL VALUES
0.70000
RANDOM_GROUP
2
RANDOM_TYPE
add_animal
FILE
renadd02.ped
(CO)VARIANCES
0.30000
OPTION SNP_file snp.txt
OPTION chrinfo map.txt
OPTION conv_crit 1e-15

Log file for BLUPF90 with genomic information (ssGBLUP)

name of parameter file?
renf90.par
 * convergence criterion (default=1e-12): 1.0000000E-15
Options read from parameter file:
 * SNP Xref File: snp.txt_XrefID
BLUPF90 1.48
Parameter file: renf90.par
Data file: renf90_5.dat
Number of Effects 2
Position of Weight (1) 0
Value of Missing Trait/Observation 0
EFFECTS
  # type position (2) levels [positions for nested]
Residual (co)variance Matrix
0.70000
Type of Random Effect: additive animal
Pedigree File: renadd02.ped
trait effect (CO)VARIANCES
REMARKS
(1) Weight position 0 means no weights utilized
(2) Effect positions of 0 for some effects and traits means that such effects are missing for specified traits
Data record length = 3
# equations = 6101
G
0.30000
read 6100 records in 0.149970 s, 2201 nonzeroes
read 6100 additive pedigrees
Optimized OpenMP Version
 * Read 6100 animals from pedigree file: "renadd02.ped"
Number of Genotyped Animals: 1294
Creating A22
Extracting subset of: 2312 pedigrees from: 6100 elapsed time: 0.0150
Calculating A22 Matrix by Colleau OpenMP. Numbers of threads= 8 16 elapsed time: .034
Reading SNP file
Column position in file for the first marker: 8
Format to read SNP file: (7x,400000i1)
Number of SNPs: 1000
Number of Genotyped animals: 1294
Reading SNP file elapsed time: .06
Statistics of alleles frequencies in the current population
Mean: 0.504 Max: 0.929 Var: 0.032
Reading MAP file: "map.txt" - 1000 SNPs out of 1000
Min and max # of chromosome: 1 5
Min and max # of SNP: 1 1000
Quality Control - SNPs with Call Rate < callrate ( 0.90) will removed: 0
Quality Control - SNPs with MAF < minfreq ( 0.05) will removed: 1
Quality Control - Monomorphic SNPs will be removed: 0
Quality Control - Removed Animals with Call rate < callrate ( 0.90): 0
Quality Control - Check Parent-Progeny Mendelian conflicts
Total animals: 6100 - Genotyped animals: 1294 - Effective: 1294
Number of pairs Individual - Sire: 450
Number of pairs Individual - Dam: 440
Number of trios Individual - Sire - Dam: 206
No sex Chromosome information is available
Parent-progeny conflicts or HWE could eliminate SNPs in sex Chr
Provide map information and sex Chr to checks
using autosomes
Checking SNPs for Mendelian conflicts
Total number of effective SNP: 999
Total number of parent-progeny evaluations: 890
Number of SNPs with Mendelian conflicts: 0
Checking Animals for Mendelian conflicts
Total number of effective SNP for checks on Animals: 999
Number of Parent-Progeny Mendelian Conflicts: 0
Number of effective SNPs (after QC): 999
Number of effective Indiviuals (after QC): 1294
Statistics of alleles frequencies in the current population after
Quality Control (MAF, monomorphic, call rate, HWE, Mendelian conflicts)
Mean: 0.508 Min: 0.051 Var: 0.032
Genotypes missings (%): 0.100
Genotypes missings after cleannig (%): 0.000
Calculating G Matrix
Dgemm MKL #threads= 8 16 Elapsed omp_get_time: 1.0240
Scale by Sum(2pq). Average: 435.140185710293
Blend G as alpha*G + beta*A22: (alpha,beta) 0.950 0.050
Frequency - Diagonal of G
N: 1294 Mean: 0.999
[histogram of the diagonal of G: class values and counts not legible]
Check for diagonal of genomic relationship matrix
Check for diagonal of genomic relationship matrix, genotypes not removed: 0

Statistic of Rel. Matrix A22
                    N    Mean     Min     Max     Var
Diagonal         1294   1.001   1.000   1.250   0.000
Off-diagonal  1673142   0.005   0.000   0.750   0.001

Final Genomic Matrix
Statistic of Genomic Matrix
                    N    Mean     Min     Max     Var
Diagonal         1294   1.001   0.898   1.470   0.002
Off-diagonal  1673142   0.005  -0.188   0.791   0.002

Correlation of Genomic Inbreeding and Pedigree Inbreeding
Correlation: 0.2180

All elements - Diagonal / Off-Diagonal
Estimating Regression Coefficients G = b0 11' + b1 A + e
Regression coefficients b0 b1 = 0.000 0.993

Off-Diagonal
Using 83426 elements from A22 >= .02000
Estimating Regression Coefficients G = b0 11' + b1 A + e
Regression coefficients b0 b1 = -0.003 0.999
Correlation Off-Diagonal elements G & A 0.777

Creating A22-inverse
Inverse LAPACK MKL dpotrf/i #threads= 8 16 Elapsed omp_get_time: 0.1058
Final A22 Inv Matrix
Statistic of Inv. Rel. Matrix A22
                    N    Mean     Min     Max     Var
Diagonal         1294   1.851   1.067   5.012   0.431
Off-diagonal  1673142  -0.001  -1.200   0.600   0.001

Creating G-inverse
Inverse LAPACK MKL dpotrf/i #threads= 8 16 Elapsed omp_get_time: 0.2093
                    N    Mean     Min     Max     Var
Diagonal         1294  13.466   5.863  45.587  26.023
Off-diagonal  1673142  -0.010 -13.521   6.897   0.227

Check for diagonal of Inverse Genomic - Inverse of pedigree relationship matrix
Final G Inv - A22 Inv Matrix
Statistic of Inv. Genomic - A22 Matrix
                    N    Mean     Min     Max     Var
Diagonal         1294  12.615   4.762  40.309  21.740
Off-diagonal  1673142  -0.009 -12.521   6.397   0.211

hash matrix increased from 131072 to 262144 % filled: 0.8000
hash matrix increased from 262144 to 524288 % filled: 0.8000
hash matrix increased from 524288 to 1048576 % filled: 0.8000
hash matrix increased from 1048576 to 2097152 % filled: 0.8000
finished peds in 25.61810 s, 961721 nonzeroes
round = 1 convergence = 0.6397E-03
round = 2 convergence = 0.4280E-03
round = 3 convergence = 0.3112E-03
round = 4 convergence = 0.9994E-04
round = 5 convergence = 0.6129E-04
round = 90 convergence = 0.3590E-14
round = 91 convergence = 0.2549E-14
round = 92 convergence = 0.2022E-14
round = 93 convergence = 0.1453E-14
round = 94 convergence = 0.9599E-15
94 iterations, convergence criterion= 0.9599E-15

Solutions for BLUPF90 with genomic information (ssGBLUP)

The solution file has the same format as in BLUPF90 without genomic information. The option for calculating se for EBV can also be used here.

Parameter file for PREDICTF90

Predictivity can be measured as the correlation between corrected phenotypes and (G)EBV. In this example we show predictivity for all young animals and for all young genotyped animals.
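PREDICTF90 reports corr(Y,Yhat) in its log, so the sketch below is only a cross-check of that number. It assumes the yhat_residual file described later under "Output files from PREDICTF90" is whitespace-separated, has no header line, and holds animal | y | yhat | residual in that order. The function name corr_y_yhat is hypothetical and is not part of the BLUPF90 programs.

```shell
# Hypothetical helper (not part of PREDICTF90): Pearson correlation between
# column 2 (y) and column 3 (yhat) of a whitespace-separated file.
# Assumption: no header line; lines with fewer than 3 fields are skipped.
corr_y_yhat() {
  awk '
    NF >= 3 {
      n++
      sy  += $2;      sh  += $3        # sums of y and yhat
      syy += $2 * $2; shh += $3 * $3   # sums of squares
      syh += $2 * $3                   # sum of cross-products
    }
    END {
      if (n < 2) { print "NA"; exit }
      cov = syh - sy * sh / n          # corrected cross-product
      vy  = syy - sy * sy / n          # corrected sum of squares of y
      vh  = shh - sh * sh / n          # corrected sum of squares of yhat
      if (vy <= 0 || vh <= 0) { print "NA"; exit }
      printf "%.4f\n", cov / sqrt(vy * vh)
    }
  ' "$1"
}
```

A possible call, run in the folder where PREDICTF90 wrote its output, is: corr_y_yhat yhat_residual. The divisor n cancels in the ratio, so corrected sums are used instead of variances.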
1) Predictivity for traditional BLUP

Because this program needs the solutions file, it can be run in the same folder as BLUP.

Parameter file:

DATAFILE
pred.dat
NUMBER_OF_TRAITS
1
NUMBER_OF_EFFECTS
2
OBSERVATION(S)
1
WEIGHT(S)

EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
2 1 cross
3 6100 cross
RANDOM_RESIDUAL VALUES
0.70000
RANDOM_GROUP
2
RANDOM_TYPE
add_animal
FILE
renadd02.ped
(CO)VARIANCES
0.30000
OPTION include_effects 2

pred.dat is the data file only for animals in the 5th generation.
Linux code to create pred.dat:
awk '$4==5' renf90.dat > pred.dat

For creating the data file only for genotyped animals in the 5th generation (gen.dat):
awk 'BEGIN {for (i=0;i<1000;i++) print i}' > ind
paste -d" " pred.dat ind | sort -k3,3 > dat1.temp
awk '{print $1}' snp.txt_XrefID | sort -k1,1 > dat2.temp
join -1 1 -2 3 dat2.temp dat1.temp | sort -n -k5,5 | awk '{print $2,$3,$1,$4}' > gen.dat

 *** include effects to predict Yhat n, effects 1 2
PREDICTF90 1.3
Parameter file: pred.par
Data file: pred.dat
Number of Traits 1
Number of Effects 2
Position of Observations 1
Position of Weight (1) 0
Value of Missing Trait/Observation 0
EFFECTS
  # type position (2) levels [positions for nested]
  1 cross-classified 2 1
  2 cross-classified 3 6100
Residual (co)variance Matrix
0.70000
Random Effect(s) 2
Type of Random Effect: additive animal
Pedigree File: renadd02.ped
trait effect (CO)VARIANCES
1 2 0.3000
REMARKS
(1) Weight position 0 means no weights utilized
(2) Effect positions of 0 for some effects and traits means that such effects are missing for specified traits
Data record length = 3
# equations = 6101
 *** effects to include in Yhat (T/F): F T
solutions read from file: solutions
y(s), yhat(s), residual(s) in written in "yhat_residual" file
1000 records read
Trait: 1 1000
mean Y 2.802014134522520E-003
var Y 0.982498631522066
mean Yhat 1.
204401846975080E-002
var Yhat 7.951567205276141E-002
cov (Y,Yhat) 8.256236151221331E-002
corr (Y,Yhat) 0.297261012657303     <- Predictivity
wrote bvs for animals in data in file "bvs.dat"

Output files from PREDICTF90

yhat_residual
46a   0.266520   0.415535   0.339710
2176  0.419925   0.094263   0.808877
293¢  2.184195   0.187782   3.016178
797   -.a26ses   0.328604   1.919726
6021  1.143225   0.430780   1.734210

yhat_residual has 4 columns: animal | y | yhat | residual

Because OPTION include_effects 2 was used:
y is the phenotype minus all effects other than animal
yhat receives the second effect, which is the animal effect
residual is the phenotype minus the animal effect

bvs.dat
1 2 46a   o.aassas
1 2 2176  0.094262
1 2 29%   0.187782
1 2 797   0.328604
1 2 6021  0.430780

bvs.dat has 4 columns: trait | effect | animal | solution (EBV)

Log file for predicting genotyped animals in 5th generation

name of parameter file? pred.par
 *** include effects to predict Yhat n, effects 1 2
PREDICTF90 1.3
Parameter file: gen.par
Data file: gen.dat
Number of Traits 1
Number of Effects 2
Position of Observations 1
Position of Weight (1) 0
Value of Missing Trait/Observation 0
EFFECTS
  # type position (2) levels [positions for nested]
  1 cross-classified 2 1
  2 cross-classified 3 6100
Random Effect(s) 2
Type of Random Effect: additive animal
Pedigree File: renadd02.ped
trait effect (CO)VARIANCES
1 2 0.3000
REMARKS
(1) Weight position 0 means no weights utilized
(2) Effect positions of 0 for some effects and traits means that such effects are missing for specified traits
Data record length = 3
# equations = 6101
 *** effects to include in Yhat (T/F): F T
solutions read from file: solutions
y(s), yhat(s), residual(s) in written in "yhat_residual" file
300 records read
Trait: 1 300
mean Y -5.204056186291079E-002
var Y 0.979795877964320
mean Yhat 1.187536126623551E-002
var Yhat 17 sagas03e4zai6sex-002
cov (Y,Yhat) 8.232182257800019E-002
corr (Y,Yhat) 0.306765659847626     <- Predictivity
wrote bvs for animals in data in file "bvs.dat"

2) Predictivity for ssGBLUP

Parameter file:

DATAFILE
pred.dat
NUMBER_OF_TRAITS
1
NUMBER_OF_EFFECTS
2
OBSERVATION(S)
1
WEIGHT(S)

EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
2 1 cross
3 6100 cross
RANDOM_RESIDUAL VALUES
0.70000
RANDOM_GROUP
2
RANDOM_TYPE
add_animal
FILE
renadd02.ped
(CO)VARIANCES
0.30000
OPTION SNP_file snp.txt
OPTION chrinfo map.txt
OPTION include_effects 2

Log file for predicting all animals in 5th generation

name of parameter file? pred.par
 *** include effects to predict Yhat n, effects 1 2
PREDICTF90 1.3
Parameter file: pred.par
Data file: pred.
dat
Number of Traits 1
Number of Effects 2
Position of Observations 1
Position of Weight (1) 0
Value of Missing Trait/Observation 0
EFFECTS
  # type position (2) levels [positions for nested]
  1 cross-classified 2 1
  2 cross-classified 3 6100
Residual (co)variance Matrix
0.70000
Random Effect(s) 2
Type of Random Effect: additive animal
Pedigree File: renadd02.ped
trait effect (CO)VARIANCES
1 2 0.3000
REMARKS
(1) Weight position 0 means no weights utilized
(2) Effect positions of 0 for some effects and traits means that such effects are missing for specified traits
Data record length = 3
# equations = 6101
 *** effects to include in Yhat (T/F): F T
solutions read from file: solutions
Animal Effect: 2
y(s), yhat(s), residual(s) in written in "yhat_residual" file
1000 records read
Trait: 1 1000
mean Y 1.729620725009590E-002
var Y 0. se2aseeress7i62
mean Yhat -1.45923218850¢1898-002
var Yhat 8.642763611542703E-002
cov (Y,Yhat) 9.1890718 }926587E-002
corr (Y,Yhat) 0.315340214001692     <- Predictivity

Output files from PREDICTF90

Output files for PREDICTF90 follow the same pattern whether using genomic information or not.

Log file for predicting genotyped animals in 5th generation

name of parameter file? pred.par
 *** include effects to predict Yhat n, effects 1 2
PREDICTF90 1.3
Parameter file: gen.par
Data file: gen.dat
Number of Traits 1
Number of Effects 2
Position of Observations 1
Position of Weight (1) 0
Value of Missing Trait/Observation 0
EFFECTS
  # type position (2) levels [positions for nested]
  1 cross-classified 2 1
  2 cross-classified 3 6100
Residual (co)variance Matrix
0.70000
Random Effect(s) 2
Type of Random Effect: additive animal
Pedigree File: renadd02.ped
trait effect (CO)VARIANCES
1 2 0.3000
REMARKS
(1) Weight position 0 means no weights utilized
(2) Effect positions of 0 for some effects and traits means that such effects are missing for specified traits
Data record length = 3
# equations = 6101
 *** effects to include in Yhat (T/F): F T
solutions read from file: solutions
Animal Effect: 2
y(s), yhat(s), residual(s) in written in "yhat_residual" file
300 records read
Trait: 1 300
mean Y 3.754737233690292E-002
var Y 0.979795861954027
mean Yhat -1.863066248595715E-002
var Yhat 0.119326686734040
cov (Y,Yhat) 0.117365728215231
corr (Y,Yhat) 0.3432453846129¢0     <- Predictivity
wrote bvs for animals in data in file "bvs.dat"

Parameter files for GWAS using ssGBLUP (ssGWAS)

Run BLUPF90 with genomic information and save G^-1 and A22^-1

DATAFILE
renf90.dat
NUMBER_OF_TRAITS
1
NUMBER_OF_EFFECTS
2
OBSERVATION(S)
1
WEIGHT(S)

EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
2 1 cross
3 6100 cross
RANDOM_RESIDUAL VALUES
0.70000
RANDOM_GROUP
2
RANDOM_TYPE
add_animal
FILE
renadd02.ped
(CO)VARIANCES
0.30000
OPTION SNP_file snp.txt
OPTION chrinfo map.txt
OPTION no_quality_control
OPTION saveGinverse
OPTION saveA22inverse
OPTION weightedG wei

Weights for SNP can be updated by an iterative process, where the initial weights are all equal to 1.
Linux code to get initial weights for 1000 SNP:
awk 'BEGIN {for (i=1;i<=1000;i++) print 1}' > wei

Run POSTGSF90 and read G^-1 and A22^-1

DATAFILE
renf90.dat
NUMBER_OF_TRAITS
1
NUMBER_OF_EFFECTS
2
OBSERVATION(S)
1
WEIGHT(S)

EFFECTS: POSITIONS_IN_DATAFILE NUMBER_OF_LEVELS TYPE_OF_EFFECT [EFFECT NESTED]
2 1 cross
3 6100 cross
RANDOM_RESIDUAL VALUES
0.70000
RANDOM_GROUP
2
RANDOM_TYPE
add_animal
FILE
renadd02.ped
(CO)VARIANCES
0.30000
OPTION SNP_file snp.txt
OPTION chrinfo map.txt
OPTION no_quality_control
OPTION Manhattan_plot
OPTION readGinverse
OPTION readA22inverse
OPTION weightedG wei
OPTION windows_variance 5

Moving average of SNP effects can be obtained by using the following option:
OPTION SNP_moving_average n
where n is the number of SNP.

[Figure: Manhattan plots for SNP windows variance]

Output files for ssGWAS

snp_sol
1 2 1 1 0  0.7001368E-02  0.2208213      0.1119293  0.1126648E-03
1 2 2 1 0 -0.1359349E-01  0.5065436      0.2104747  0.2118577E-03
1 2 3 1 0  os7iazsseo2    0.3917027      0.757968   0.7808942E-03
1 2 4 1 0 -oaazz40ae-02   0.6873333E-01  1.271113   0.1279465E-02
1 2 5 1 0  oSa7i6296-03   0.1539137E-02  1.261010   0.1269296E-02

snp_sol has 9 columns: trait | effect | SNP | chromosome | position | SNP_solution | weight | % of variance explained by n adjacent SNP | variance explained by n adjacent SNP

chrsnpvar
1 2 0.11192934589 1 1 0
1 2 0.2104747539  2 1 0
1 2 0.757968023   3 1 0
1 2 12707         4 1 0
1 2 1.2610103595  5 1 0

chrsnpvar has 6 columns: trait | effect | % of variance explained by n adjacent SNP | SNP | chromosome | position
This file is used by POSTGSF90 for Manhattan plots.

Appendix J (selected programming details)

This section provides some programming insight into an early version of the blupf90 program.

The model is completely described in the module MODEL.

module model
implicit none
!
! Types of effects
   integer, parameter :: effcross=0,&      ! effects can be cross-classified
                         effcov=1          ! or covariables
!
! Types of random effects
   integer, parameter :: g_fixed=1,&       ! fixed effect
                         g_diag=2,&        ! diagonal
                         g_A=3,&           ! additive animal
                         g_A_UPG=4,&       ! additive animal with unknown
                                           ! parent groups
                         g_A_UPG_INB=5,&   ! additive animal with unknown
                                           ! parent groups and inbreeding
                         g_As=6,&          ! additive sire
                         g_PD=7,&          ! parental dominance
                         g_last=8          ! last type

   character (40) :: parfile,&             ! name of parameter file
                     datafile              ! name of data set

   integer :: ntrait,&                     ! number of traits
              neff,&                       ! number of effects
              miss=0                       ! value of missing trait/effect

   integer, allocatable :: pos_y(:)        ! positions of observations
   integer :: pos_weight                   ! position of weight of records; zero if none

   integer, allocatable :: pos_eff(:,:),&  ! positions of effects for each trait
                           nlev(:),&       ! number of levels
                           effecttype(:),& ! type of effects
                           nestedcov(:,:),&! position of nesting effect for each trait,
                                           ! if the effect is nested covariable
                           randomtype(:),& ! status of each effect, as above
                           randomnumb(:)   ! number of consecutive correlated effects
   character (40), allocatable :: randomfile(:) ! name of file associated with given
                                           ! effect
   real, allocatable :: r(:,:),&           ! residual (co)variance matrix
                        rinv(:,:),&        ! and its inverse
                        g(:,:,:)           ! the random (co)variance matrix for each trait
end module model

The core of the program is presented below.

program BLUPF90
use model; use sparsem; use sparseop
implicit none

real, allocatable :: y(:),&               ! observation value
                     indata(:)            ! one line of input data
real :: weight_y                          ! weight for records

type (sparse_hashm) :: xx                 ! X'X in sparse hash form
type (sparse_ija) :: xx_ija               ! X'X in IJA form, for use with FSPAK only
real, allocatable :: xy(:),sol(:)         ! X'Y and solutions

real, allocatable :: weight_cov(:,:)
integer, allocatable :: address(:,:)      ! start and address of each effect
integer :: neq,io,&                       ! number of equations and io-status
           data_len,&                     ! length of data record to read
           i,j,k,l                        ! extra variables
real :: val, dat_eff
!
call read_parameters
call print_parameters
neq=ntrait*sum(nlev)
data_len=max(pos_weight,maxval(pos_y),maxval(pos_eff))
print*,'Data record length = ',data_len
allocate (xy(neq), sol(neq), address(neff,ntrait),&
          weight_cov(neff,ntrait), y(ntrait), indata(data_len))

call zerom(xx,neq); xy=0
call setup_g                              ! invert R matrices

open(50,file=datafile)                    ! data file
!
! Contributions from records
do
   read(50,*,iostat=io) indata
   if (io.ne.0) exit
   call decode_record
   call find_addresses
   call find_rinv
   do i=1,neff
      do j=1,neff
         do k=1,ntrait
            do l=1,ntrait
               val=weight_cov(i,k)*weight_cov(j,l)*weight_y*rinv(k,l)
               call addm(val,address(i,k),address(j,l),xx)
            enddo
         enddo
      enddo
      do k=1,ntrait
         do l=1,ntrait
            xy(address(i,k))=xy(address(i,k))+rinv(k,l)*y(l)*weight_cov(i,k) &
                             *weight_y
         enddo
      enddo
   enddo
enddo
!
! Random effects' contributions
do i=1,neff
   select case (randomtype(i))
      case (g_fixed)
         continue                         ! fixed effect, do nothing
      case (g_diag)
         call add_g_diag(i)
      case (g_A, g_As, g_A_UPG, g_A_UPG_INB)
         call add_g_add(randomtype(i),i)
      case (g_PD)
         call add_g_domin(i)
      case default
         print*,'unimplemented random type',randomtype(i)
   endselect
enddo

if (neq < 15) then
   print*,'left hand side'
   call printm(xx)
   print '( '' right hand side:'' ,100f8.1)',xy
endif

call solve_iterm(xx,xy,sol)
! Comment the line above and uncomment the lines below only if
! solutions by FSPAK are desired
!xx_ija=xx
!call fspak90('solve',xx_ija,xy,sol)

if (neq < 15) print '( '' solution:'' ,100f7.3)',sol

call store_solutions
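As a hand check of the dimensioning step in the listing above, the two sizes computed before the allocate statement can be worked out for the single-trait example used earlier in this manual. The values plugged in are taken from that example's parameter file (one trait, observation in position 1, no weight, effects in positions 2 and 3 with 1 and 6100 levels); the worked arithmetic itself is an illustration and not part of the program.

```latex
\[
\mathrm{neq} = \mathrm{ntrait} \times \sum_i \mathrm{nlev}_i = 1 \times (1 + 6100) = 6101
\]
\[
\mathrm{data\_len} = \max\bigl(\mathrm{pos\_weight},\ \max(\mathrm{pos\_y}),\ \max(\mathrm{pos\_eff})\bigr) = \max(0,\ 1,\ 3) = 3
\]
```

These are the number of equations and the data record length that the BLUPF90 and PREDICTF90 logs print for this example.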
