MKL 2017 Developer Reference Fortran PDF

Intel Math Kernel Library
Developer Reference
Revision: 082
MKL 2017
Legal Information
Contents
Legal Information.............................................................................. 33
Introducing the Intel Math Kernel Library.........................................39
Getting Help and Support................................................................... 41
What's New........................................................................................ 43
Notational Conventions...................................................................... 45
Chapter 1: Function Domains

Performance Enhancements...................................................................... 51
Parallelism.............................................................................................. 51
Chapter 2: BLAS and Sparse BLAS Routines

BLAS Routines......................................................................................... 53
Naming Conventions for BLAS Routines.............................................. 53
Fortran 95 Interface Conventions for BLAS Routines............................. 55
Matrix Storage Schemes for BLAS Routines......................................... 55
BLAS Level 1 Routines and Functions..................................................56
?asum....................................................................................56
?axpy.................................................................................... 58
?copy.....................................................................................59
?dot.......................................................................................60
?sdot..................................................................................... 61
?dotc..................................................................................... 62
?dotu.....................................................................................63
?nrm2....................................................................................65
?rot....................................................................................... 66
?rotg..................................................................................... 67
?rotm.................................................................................... 68
?rotmg...................................................................................70
?scal......................................................................................71
?swap.................................................................................... 73
i?amax...................................................................................74
i?amin................................................................................... 75
?cabs1................................................................................... 76
BLAS Level 2 Routines...................................................................... 76
?gbmv................................................................................... 77
?gemv................................................................................... 80
?ger...................................................................................... 83
?gerc..................................................................................... 84
?geru.....................................................................................86
?hbmv................................................................................... 87
?hemv................................................................................... 90
?her...................................................................................... 92
?her2.....................................................................................94
?hpmv................................................................................... 95
?hpr...................................................................................... 97
?hpr2.....................................................................................99
?sbmv.................................................................................. 101
?spmv.................................................................................. 103
?spr..................................................................................... 105
?spr2................................................................................... 107
?symv.................................................................................. 109
?syr..................................................................................... 111
?syr2................................................................................... 112
?tbmv.................................................................................. 114
?tbsv................................................................................... 117
?tpmv.................................................................................. 119
?tpsv................................................................................... 121
?trmv...................................................................................123
?trsv.................................................................................... 125
BLAS Level 3 Routines.................................................................... 127
?gemm.................................................................................127
?hemm.................................................................................131
?herk................................................................................... 133
?her2k................................................................................. 136
?symm................................................................................. 138
?syrk................................................................................... 141
?syr2k..................................................................................144
?trmm..................................................................................147
?trsm................................................................................... 149
Sparse BLAS Level 1 Routines.................................................................. 152
Vector Arguments.......................................................................... 152
Naming Conventions for Sparse BLAS Routines.................................. 152
Routines and Data Types.................................................................152
BLAS Level 1 Routines That Can Work With Sparse Vectors.................. 153
?axpyi.......................................................................................... 153
?doti............................................................................................ 155
?dotci........................................................................................... 156
?dotui...........................................................................................157
?gthr............................................................................................158
?gthrz.......................................................................................... 159
?roti.............................................................................................161
?sctr............................................................................................ 162
Sparse BLAS Level 2 and Level 3 Routines................................................. 163
Naming Conventions in Sparse BLAS Level 2 and Level 3.....................164
Sparse Matrix Storage Formats for Sparse BLAS Routines.................... 165
Routines and Supported Operations..................................................165
Interface Consideration...................................................................166
Sparse BLAS Level 2 and Level 3 Routines.........................................171
mkl_?csrgemv.......................................................................174
mkl_?bsrgemv.......................................................................176
mkl_?coogemv...................................................................... 178
mkl_?diagemv.......................................................................181
mkl_?csrsymv....................................................................... 183
mkl_?bsrsymv.......................................................................186
mkl_?coosymv...................................................................... 188
mkl_?diasymv....................................................................... 190
mkl_?csrtrsv......................................................................... 193
mkl_?bsrtrsv.........................................................................195
mkl_?cootrsv........................................................................ 198
mkl_?diatrsv......................................................................... 201
mkl_cspblas_?csrgemv........................................................... 203
mkl_cspblas_?bsrgemv...........................................................206
mkl_cspblas_?coogemv.......................................................... 208
mkl_cspblas_?csrsymv........................................................... 210
mkl_cspblas_?bsrsymv........................................................... 213
mkl_cspblas_?coosymv.......................................................... 215
mkl_cspblas_?csrtrsv............................................................. 217
mkl_cspblas_?bsrtrsv............................................................. 220
mkl_cspblas_?cootrsv............................................................ 223
mkl_?csrmv.......................................................................... 226
mkl_?bsrmv.......................................................................... 229
mkl_?cscmv.......................................................................... 233
mkl_?coomv......................................................................... 236
mkl_?csrsv........................................................................... 239
mkl_?bsrsv........................................................................... 243
mkl_?cscsv........................................................................... 246
mkl_?coosv...........................................................................250
mkl_?csrmm......................................................................... 253
mkl_?bsrmm.........................................................................257
mkl_?cscmm.........................................................................261
mkl_?coomm........................................................................ 265
mkl_?csrsm.......................................................................... 268
mkl_?cscsm.......................................................................... 272
mkl_?coosm..........................................................................276
mkl_?bsrsm.......................................................................... 279
mkl_?diamv.......................................................................... 282
mkl_?skymv......................................................................... 286
mkl_?diasv........................................................................... 289
mkl_?skysv...........................................................................292
mkl_?diamm......................................................................... 295
mkl_?skymm........................................................................ 299
mkl_?diasm.......................................................................... 302
mkl_?skysm..........................................................................306
mkl_?dnscsr..........................................................................309
mkl_?csrcoo..........................................................................312
mkl_?csrbsr.......................................................................... 315
mkl_?csrcsc.......................................................................... 318
mkl_?csrdia.......................................................................... 321
mkl_?csrsky..........................................................................324
mkl_?csradd......................................................................... 327
mkl_?csrmultcsr.................................................................... 331
mkl_?csrmultd...................................................................... 335
Inspector-executor Sparse BLAS Routines..................................................338
Naming conventions in Inspector-executor Sparse BLAS Routines......... 338
Sparse Matrix Storage Formats for Inspector-executor Sparse BLAS Routines................................................................................... 339
Supported Inspector-executor Sparse BLAS Operations....................... 339
Matrix manipulation routines........................................................... 340
mkl_sparse_?_create_csr........................................................341
mkl_sparse_?_create_csc....................................................... 342
mkl_sparse_?_create_coo.......................................................344
mkl_sparse_?_create_bsr....................................................... 346
mkl_sparse_copy...................................................................348
mkl_sparse_destroy............................................................... 350
mkl_sparse_convert_csr......................................................... 351
mkl_sparse_convert_bsr.........................................................352
mkl_sparse_?_export_csr....................................................... 353
mkl_sparse_?_export_bsr....................................................... 355
mkl_sparse_?_set_value.........................................................357
Inspector-executor Sparse BLAS Analysis Routines............................. 359
mkl_sparse_set_mv_hint........................................................ 359
mkl_sparse_set_sv_hint......................................................... 361
mkl_sparse_set_mm_hint....................................................... 363
mkl_sparse_set_sm_hint........................................................ 366
mkl_sparse_set_memory_hint.................................................368
mkl_sparse_optimize............................................................. 369
Inspector-executor Sparse BLAS Execution Routines........................... 371
mkl_sparse_?_mv..................................................................371
mkl_sparse_?_trsv.................................................................374
mkl_sparse_?_mm.................................................................376
mkl_sparse_?_trsm................................................................380
mkl_sparse_?_add................................................................. 383
mkl_sparse_spmm.................................................................385
mkl_sparse_?_spmmd............................................................386
BLAS-like Extensions.............................................................................. 388
?axpby......................................................................................... 389
?gem2vu...................................................................................... 391
?gem2vc.......................................................................................393
?gemmt........................................................................................395
?gemm3m.................................................................................... 399
?gemm_batch............................................................................... 402
?gemm3m_batch........................................................................... 406
mkl_?imatcopy.............................................................................. 410
mkl_?omatcopy............................................................................. 413
mkl_?omatcopy2........................................................................... 415
mkl_?omatadd...............................................................................418
?gemm_alloc.................................................................................422
?gemm_pack.................................................................................423
?gemm_compute........................................................................... 425
?gemm_free..................................................................................427
Chapter 3: LAPACK Routines

Naming Conventions for LAPACK Routines................................................. 429
Fortran 95 Interface Conventions for LAPACK Routines................................ 430
Intel MKL Fortran 95 Interfaces for LAPACK Routines vs. Netlib Implementation......................................................................... 431
Matrix Storage Schemes for LAPACK Routines............................................ 432
Mathematical Notation for LAPACK Routines............................................... 432
Error Analysis........................................................................................ 433
LAPACK Linear Equation Routines............................................................. 434
LAPACK Linear Equation Computational Routines................................ 434
Matrix Factorization: LAPACK Computational Routines.................436
Solving Systems of Linear Equations: LAPACK Computational Routines........................................................................... 479
Estimating the Condition Number: LAPACK Computational Routines........................................................................... 518
Refining the Solution and Estimating Its Error: LAPACK Computational Routines...................................................... 548
Matrix Inversion: LAPACK Computational Routines..................... 613
Matrix Equilibration: LAPACK Computational Routines................. 641
LAPACK Linear Equation Driver Routines........................................... 660
?gesv................................................................................... 661
?gesvx................................................................................. 664
?gesvxx................................................................................670
?gbsv...................................................................................678
?gbsvx................................................................................. 680
?gbsvxx................................................................................686
?gtsv................................................................................... 694
?gtsvx..................................................................................696
?dtsvb..................................................................................700
?posv...................................................................................702
?posvx................................................................................. 705
?posvxx................................................................................710
?ppsv...................................................................................717
?ppsvx................................................................................. 719
?pbsv...................................................................................723
?pbsvx................................................................................. 725
?ptsv................................................................................... 730
?ptsvx..................................................................................732
?sysv................................................................................... 735
?sysv_rook........................................................................... 737
?sysvx..................................................................................740
?sysvxx................................................................................ 744
?hesv...................................................................................751
?hesv_rook........................................................................... 754
?hesvx................................................................................. 756
?hesvxx................................................................................760
?spsv................................................................................... 768
?spsvx................................................................................. 770
?hpsv...................................................................................773
?hpsvx................................................................................. 775
LAPACK Least Squares and Eigenvalue Problem Routines............................. 779
LAPACK Least Squares and Eigenvalue Problem Computational Routines........ 779
Orthogonal Factorizations: LAPACK Computational Routines.........780
Singular Value Decomposition: LAPACK Computational Routines...864
Symmetric Eigenvalue Problems: LAPACK Computational Routines........................................................................... 889
Generalized Symmetric-Definite Eigenvalue Problems: LAPACK Computational Routines...................................................... 954
Nonsymmetric Eigenvalue Problems: LAPACK Computational Routines........................................................................... 968
Generalized Nonsymmetric Eigenvalue Problems: LAPACK Computational Routines.................................................... 1013
Generalized Singular Value Decomposition: LAPACK Computational Routines.................................................... 1050
Cosine-Sine Decomposition: LAPACK Computational Routines.... 1070
LAPACK Least Squares and Eigenvalue Problem Driver Routines..........1081
Linear Least Squares (LLS) Problems: LAPACK Driver Routines...1081
Generalized Linear Least Squares (LLS) Problems: LAPACK Driver Routines................................................................1096
Symmetric Eigenvalue Problems: LAPACK Driver Routines......... 1102
Nonsymmetric Eigenvalue Problems: LAPACK Driver Routines.... 1179
Singular Value Decomposition: LAPACK Driver Routines.............1198
Cosine-Sine Decomposition: LAPACK Driver Routines................ 1232
Generalized Symmetric Definite Eigenvalue Problems: LAPACK Driver Routines................................................................1242
Generalized Nonsymmetric Eigenvalue Problems: LAPACK Driver Routines......................................................................... 1303
LAPACK Auxiliary Routines..................................................................... 1336
?lacgv.........................................................................................1349
?lacrm........................................................................................ 1350
?lacrt..........................................................................................1351
?laesy.........................................................................................1352
?rot............................................................................................1353
?spmv........................................................................................ 1354
?spr........................................................................................... 1355
?syconv...................................................................................... 1357
?symv........................................................................................ 1359
?syr............................................................................................1360
i?max1....................................................................................... 1361
?sum1........................................................................................ 1362
?gbtf2.........................................................................................1363
?gebd2....................................................................................... 1364
?gehd2....................................................................................... 1366
?gelq2........................................................................................ 1368
?geql2........................................................................................ 1369
?geqr2........................................................................................1371
?geqr2p...................................................................................... 1372
?geqrt2.......................................................................................1374
?geqrt3.......................................................................................1375
?gerq2........................................................................................1377
?gesc2........................................................................................1378
?getc2........................................................................................ 1380
?getf2.........................................................................................1381
?gtts2.........................................................................................1382
?isnan........................................................................................ 1383
?laisnan...................................................................................... 1384
?labrd.........................................................................................1384
?lacn2........................................................................................ 1387
?lacon.........................................................................................1388
?lacpy.........................................................................................1390
?ladiv......................................................................................... 1391
?lae2.......................................................................................... 1392
?laebz.........................................................................................1393
?laed0........................................................................................ 1397
?laed1........................................................................................ 1399
?laed2........................................................................................ 1401
?laed3........................................................................................ 1403
?laed4........................................................................................ 1405
?laed5........................................................................................ 1406
?laed6........................................................................................ 1407
?laed7........................................................................................ 1408
?laed8........................................................................................ 1412
?laed9........................................................................................ 1415
?laeda........................................................................................ 1416
?laein......................................................................................... 1418
?laev2........................................................................................ 1420
?laexc.........................................................................................1422
?lag2.......................................................................................... 1423
?lags2........................................................................................ 1425
?lagtf..........................................................................................1428
?lagtm........................................................................................ 1430
?lagts......................................................................................... 1431
?lagv2........................................................................................ 1433
?lahqr.........................................................................................1435
?lahrd.........................................................................................1437
?lahr2.........................................................................................1439
?laic1......................................................................................... 1442
?lakf2......................................................................................... 1444
?laln2......................................................................................... 1445
?lals0......................................................................................... 1448
?lalsa..........................................................................................1451
?lalsd......................................................................................... 1454
?lamrg........................................................................................1456
?laneg........................................................................................ 1457
?langb........................................................................................ 1458
?lange........................................................................................ 1460
?langt.........................................................................................1461
?lanhs........................................................................................ 1462
?lansb........................................................................................ 1463
?lanhb........................................................................................ 1465
?lansp........................................................................................ 1466
?lanhp........................................................................................ 1468
?lanst/?lanht............................................................................... 1469
?lansy.........................................................................................1470
?lanhe........................................................................................ 1471
?lantb.........................................................................................1473
?lantp.........................................................................................1474
?lantr......................................................................................... 1476
?lanv2........................................................................................ 1478
?lapll.......................................................................................... 1479
?lapmr........................................................................................1480
?lapmt........................................................................................ 1481
?lapy2........................................................................................ 1482
?lapy3........................................................................................ 1483
?laqgb........................................................................................ 1483
?laqge........................................................................................ 1485
?laqhb........................................................................................ 1487
?laqp2........................................................................................ 1488
?laqps........................................................................................ 1490
?laqr0.........................................................................................1492
?laqr1.........................................................................................1495
?laqr2.........................................................................................1496
?laqr3.........................................................................................1500
?laqr4.........................................................................................1503
?laqr5.........................................................................................1506
?laqsb........................................................................................ 1509
?laqsp........................................................................................ 1511
?laqsy.........................................................................................1512
?laqtr......................................................................................... 1514
?lar1v.........................................................................................1516
?lar2v.........................................................................................1519
?laran.........................................................................................1520
?larf........................................................................................... 1521
?larfb......................................................................................... 1522
?larfg......................................................................................... 1526
?larfgp........................................................................................1527
?larft.......................................................................................... 1529
?larfx..........................................................................................1531
?large.........................................................................................1533
?largv.........................................................................................1534
?larnd.........................................................................................1535
?larnv.........................................................................................1536
?laror......................................................................................... 1537
?larot......................................................................................... 1540
?larra......................................................................................... 1543
?larrb......................................................................................... 1545
?larrc..........................................................................................1547
?larrd......................................................................................... 1548
?larre......................................................................................... 1552
?larrf.......................................................................................... 1555
?larrj.......................................................................................... 1557
?larrk......................................................................................... 1559
?larrr..........................................................................................1560
?larrv......................................................................................... 1561
?lartg......................................................................................... 1565
?lartgp........................................................................................1566
?lartgs........................................................................................ 1568
?lartv......................................................................................... 1569
?laruv.........................................................................................1570
?larz...........................................................................................1571
?larzb......................................................................................... 1573
?larzt..........................................................................................1575
?las2.......................................................................................... 1578
?lascl..........................................................................................1579
?lasd0........................................................................................ 1580
?lasd1........................................................................................ 1582
?lasd2........................................................................................ 1584
?lasd3........................................................................................ 1587
?lasd4........................................................................................ 1590
?lasd5........................................................................................ 1592
?lasd6........................................................................................ 1593
?lasd7........................................................................................ 1597
?lasd8........................................................................................ 1600
?lasd9........................................................................................ 1602
?lasda.........................................................................................1604
?lasdq........................................................................................ 1607
?lasdt......................................................................................... 1610
?laset......................................................................................... 1610
?lasq1........................................................................................ 1612
?lasq2........................................................................................ 1613
?lasq3........................................................................................ 1614
?lasq4........................................................................................ 1616
?lasq5........................................................................................ 1617
?lasq6........................................................................................ 1619
?lasr...........................................................................................1620
?lasrt..........................................................................................1623
?lassq.........................................................................................1623
?lasv2.........................................................................................1625
?laswp........................................................................................ 1626
?lasy2.........................................................................................1627
?lasyf......................................................................................... 1629
?lasyf_rook................................................................................. 1631
?lahef......................................................................................... 1633
?lahef_rook................................................................................. 1635
?latbs......................................................................................... 1637
?latm1........................................................................................ 1639
?latm2........................................................................................ 1641
?latm3........................................................................................ 1644
?latm5........................................................................................ 1647
?latm6........................................................................................ 1651
?latme........................................................................................ 1654
?latmr........................................................................................ 1658
?latdf..........................................................................................1666
?latps......................................................................................... 1667
?latrd......................................................................................... 1670
?latrs..........................................................................................1672
?latrz..........................................................................................1676
?lauu2........................................................................................ 1678
?lauum....................................................................................... 1679
?orbdb1/?unbdb1......................................................................... 1680
?orbdb2/?unbdb2......................................................................... 1683
?orbdb3/?unbdb3......................................................................... 1686
?orbdb4/?unbdb4......................................................................... 1689
?orbdb5/?unbdb5......................................................................... 1693
?orbdb6/?unbdb6......................................................................... 1695
?org2l/?ung2l.............................................................................. 1698
?org2r/?ung2r............................................................................. 1699
?orgl2/?ungl2.............................................................................. 1700
?orgr2/?ungr2............................................................................. 1702
?orm2l/?unm2l............................................................................ 1703
?orm2r/?unm2r........................................................................... 1705
?orml2/?unml2............................................................................ 1707
?ormr2/?unmr2........................................................................... 1709
?ormr3/?unmr3........................................................................... 1711
?pbtf2.........................................................................................1713
?potf2.........................................................................................1715
?ptts2.........................................................................................1716
?rscl........................................................................................... 1717
?syswapr.................................................................................... 1718
?heswapr.................................................................................... 1720
?sygs2/?hegs2.............................................................................1723
?sytd2/?hetd2............................................................................. 1724
?sytf2......................................................................................... 1726
?sytf2_rook................................................................................. 1728
?hetf2.........................................................................................1729
?hetf2_rook.................................................................................1731
?tgex2........................................................................................ 1732
?tgsy2........................................................................................ 1735
?trti2.......................................................................................... 1738
clag2z.........................................................................................1739
dlag2s........................................................................................ 1740
slag2d........................................................................................ 1741
zlag2c.........................................................................................1742
?larfp......................................................................................... 1743
ila?lc.......................................................................................... 1744
ila?lr...........................................................................................1745
?gsvj0........................................................................................ 1746
?gsvj1........................................................................................ 1749
?sfrk...........................................................................................1752
?hfrk.......................................................................................... 1754
?tfsm..........................................................................................1755
?lansf......................................................................................... 1757
?lanhf......................................................................................... 1759
?tfttp..........................................................................................1760
?tfttr.......................................................................................... 1761
?tpqrt2 ...................................................................................... 1763
?tprfb ........................................................................................ 1765
?tpttf..........................................................................................1768
?tpttr..........................................................................................1770
?trttf.......................................................................................... 1771
?trttp..........................................................................................1772
?pstf2.........................................................................................1773
dlat2s ........................................................................................ 1775
zlat2c ........................................................................................ 1776
?lacp2........................................................................................ 1777
?la_gbamv.................................................................................. 1778
?la_gbrcond................................................................................ 1780
?la_gbrcond_c............................................................................. 1782
?la_gbrcond_x............................................................................. 1784
?la_gbrfsx_extended.................................................................... 1785
?la_gbrpvgrw...............................................................................1792
?la_geamv.................................................................................. 1793
?la_gercond.................................................................................1795
?la_gercond_c............................................................................. 1796
?la_gercond_x............................................................................. 1798
?la_gerfsx_extended.....................................................................1799
?la_heamv.................................................................................. 1805
?la_hercond_c............................................................................. 1807
?la_hercond_x............................................................................. 1808
?la_herfsx_extended.................................................................... 1810
?la_herpvgrw...............................................................................1816
?la_lin_berr................................................................................. 1817
?la_porcond................................................................................ 1818
?la_porcond_c............................................................................. 1819
?la_porcond_x............................................................................. 1821
?la_porfsx_extended.................................................................... 1822
?la_porpvgrw...............................................................................1829
?laqhe........................................................................................ 1830
?laqhp........................................................................................ 1831
?larcm........................................................................................ 1833
?la_gerpvgrw...............................................................................1834
?larscl2.......................................................................................1835
?lascl2........................................................................................ 1836
?la_syamv...................................................................................1837
?la_syrcond.................................................................................1839
?la_syrcond_c..............................................................................1840
?la_syrcond_x............................................................................. 1842
?la_syrfsx_extended.....................................................................1843
?la_syrpvgrw............................................................................... 1850
?la_wwaddw................................................................................1851
mkl_?tppack................................................................................1852
mkl_?tpunpack............................................................................ 1854
LAPACK Utility Functions and Routines.....................................................1856
ilaver..........................................................................................1857
ilaenv......................................................................................... 1857
iparmq........................................................................................1859
ieeeck.........................................................................................1861
?labad........................................................................................ 1862
?lamch....................................................................................... 1863
?lamc1....................................................................................... 1864
?lamc2....................................................................................... 1864
?lamc3....................................................................................... 1865
?lamc4....................................................................................... 1866
?lamc5....................................................................................... 1867
chla_transtype............................................................................. 1867
iladiag........................................................................................ 1868
ilaprec........................................................................................ 1869
ilatrans....................................................................................... 1869
ilauplo........................................................................................ 1870
xerbla_array................................................................................1871
LAPACK Test Functions and Routines....................................................... 1871
?latms........................................................................................ 1872
Chapter 4: ScaLAPACK Routines

Overview of ScaLAPACK Routines............................................................1877
ScaLAPACK Array Descriptors................................................................. 1878
Naming Conventions for ScaLAPACK Routines...........................................1880
ScaLAPACK Computational Routines........................................................ 1881
Systems of Linear Equations: ScaLAPACK Computational Routines...... 1881
Matrix Factorization: ScaLAPACK Computational Routines.................. 1882
p?getrf............................................................................... 1882
p?gbtrf............................................................................... 1884
p?dbtrf............................................................................... 1886
p?dttrf................................................................................1889
p?potrf............................................................................... 1891
p?pbtrf............................................................................... 1892
p?pttrf................................................................................1895
Solving Systems of Linear Equations: ScaLAPACK Computational Routines................................................................................. 1897
p?getrs...............................................................................1897
p?gbtrs...............................................................................1899
p?dbtrs...............................................................................1901
p?dttrs............................................................................... 1904
p?potrs...............................................................................1906
p?pbtrs...............................................................................1908
p?pttrs............................................................................... 1910
p?trtrs................................................................................ 1912
Estimating the Condition Number: ScaLAPACK Computational Routines.... 1914
p?gecon..............................................................................1914
p?pocon..............................................................................1917
p?trcon...............................................................................1919
Refining the Solution and Estimating Its Error: ScaLAPACK Computational Routines............................................................ 1922
p?gerfs............................................................................... 1922
p?porfs............................................................................... 1926
p?trrfs................................................................................ 1929
Matrix Inversion: ScaLAPACK Computational Routines....................... 1932
p?getri............................................................................... 1933
p?potri............................................................................... 1935
p?trtri.................................................................................1936
Matrix Equilibration: ScaLAPACK Computational Routines...................1938
p?geequ............................................................................. 1938
p?poequ............................................................................. 1940
Orthogonal Factorizations: ScaLAPACK Computational Routines.......... 1942
p?geqrf...............................................................................1942
p?geqpf.............................................................................. 1945
p?orgqr.............................................................................. 1948
p?ungqr..............................................................................1949
p?ormqr............................................................................. 1951
p?unmqr.............................................................................1954
p?gelqf............................................................................... 1957
p?orglq...............................................................................1959
p?unglq.............................................................................. 1961
p?ormlq..............................................................................1963
p?unmlq............................................................................. 1966
p?geqlf............................................................................... 1969
p?orgql...............................................................................1971
p?ungql.............................................................................. 1973
p?ormql..............................................................................1975
p?unmql............................................................................. 1978
p?gerqf...............................................................................1981
p?orgrq.............................................................................. 1983
p?ungrq..............................................................................1985
p?ormr3............................................................................. 1987
p?unmr3.............................................................................1990
p?ormrq............................................................................. 1994
p?unmrq.............................................................................1997
p?tzrzf................................................................................2000
p?ormrz..............................................................................2002
p?unmrz............................................................................. 2005
p?ggqrf...............................................................................2008
p?ggrqf...............................................................................2012
Symmetric Eigenvalue Problems: ScaLAPACK Computational Routines. 2017
p?syngst.............................................................................2017
p?syntrd............................................................................. 2020
p?sytrd...............................................................................2024
p?ormtr.............................................................................. 2027
p?hengst............................................................................ 2030
p?hentrd.............................................................................2033
p?hetrd.............................................................................. 2037
p?unmtr............................................................................. 2040
p?stebz.............................................................................. 2044
p?stedc.............................................................................. 2047
p?stein............................................................................... 2049
Nonsymmetric Eigenvalue Problems: ScaLAPACK Computational
Routines................................................................................. 2053
p?gehrd..............................................................................2053
p?ormhr............................................................................. 2056
p?unmhr.............................................................................2059
p?lahqr...............................................................................2063
p?hseqr.............................................................................. 2065
p?trevc............................................................................... 2067
Singular Value Decomposition: ScaLAPACK Driver Routines................ 2070
p?gebrd.............................................................................. 2071
p?ormbr............................................................................. 2074
p?unmbr.............................................................................2079
Generalized Symmetric-Definite Eigenvalue Problems: ScaLAPACK
Computational Routines............................................................ 2083
p?sygst...............................................................................2083
p?hegst.............................................................................. 2085
ScaLAPACK Driver Routines....................................................................2087
p?gesv........................................................................................2087
p?gesvx...................................................................................... 2089
p?gbsv........................................................................................2094
p?dbsv........................................................................................2097
p?dtsv........................................................................................ 2099
p?posv........................................................................................2102
p?posvx...................................................................................... 2104
p?pbsv........................................................................................2109
p?ptsv........................................................................................ 2112
p?gels.........................................................................................2114
p?syev........................................................................................2118
p?syevd...................................................................................... 2121
p?syevr.......................................................................................2123
p?syevx...................................................................................... 2128
p?heev....................................................................................... 2134
p?heevd......................................................................................2137
p?heevr...................................................................................... 2140
p?heevx......................................................................................2145
p?gesvd...................................................................................... 2152
p?sygvx...................................................................................... 2156
p?hegvx......................................................................................2163
ScaLAPACK Auxiliary Routines................................................................ 2171
b?laapp.......................................................................................2177
b?laexc....................................................................................... 2178
b?trexc....................................................................................... 2180
p?lacgv....................................................................................... 2182
p?max1...................................................................................... 2183
pilaver........................................................................................ 2184
pmpcol....................................................................................... 2184
pmpim2...................................................................................... 2186
?combamax1............................................................................... 2187
p?sum1...................................................................................... 2188
p?dbtrsv..................................................................................... 2188
p?dttrsv...................................................................................... 2191
p?gebal.......................................................................................2195
p?gebd2......................................................................................2196
p?gehd2..................................................................................... 2200
p?gelq2...................................................................................... 2202
p?geql2...................................................................................... 2205
p?geqr2...................................................................................... 2207
p?gerq2...................................................................................... 2209
p?getf2....................................................................................... 2211
p?labrd....................................................................................... 2213
p?lacon.......................................................................................2217
p?laconsb....................................................................................2218
p?lacp2.......................................................................................2219
p?lacp3.......................................................................................2221
p?lacpy....................................................................................... 2223
p?laevswp................................................................................... 2224
p?lahrd....................................................................................... 2226
p?laiect.......................................................................................2228
p?lamve......................................................................................2229
p?lange.......................................................................................2231
p?lanhs.......................................................................................2233
p?lansy, p?lanhe.......................................................................... 2235
p?lantr........................................................................................2237
p?lapiv........................................................................................2239
p?lapv2.......................................................................................2241
p?laqge.......................................................................................2243
p?laqr0....................................................................................... 2245
p?laqr1....................................................................................... 2248
p?laqr2....................................................................................... 2251
p?laqr3....................................................................................... 2254
p?laqr4....................................................................................... 2257
p?laqr5....................................................................................... 2259
p?laqsy....................................................................................... 2262
p?lared1d....................................................................................2264
p?lared2d....................................................................................2265
p?larf......................................................................................... 2266
p?larfb........................................................................................2269
p?larfc........................................................................................ 2273
p?larfg........................................................................................2276
p?larft........................................................................................ 2277
p?larz......................................................................................... 2280
p?larzb....................................................................................... 2283
p?larzc........................................................................................2287
p?larzt........................................................................................ 2289
p?lascl........................................................................................ 2292
p?lase2.......................................................................................2294
p?laset....................................................................................... 2296
p?lasmsub...................................................................................2297
p?lasrt........................................................................................ 2299
p?lassq....................................................................................... 2301
p?laswp...................................................................................... 2302
p?latra........................................................................................2304
p?latrd........................................................................................2305
p?latrs........................................................................................ 2309
p?latrz........................................................................................ 2311
p?lauu2...................................................................................... 2313
p?lauum..................................................................................... 2315
p?lawil........................................................................................ 2316
p?org2l/p?ung2l........................................................................... 2317
p?org2r/p?ung2r.......................................................................... 2320
p?orgl2/p?ungl2........................................................................... 2322
p?orgr2/p?ungr2.......................................................................... 2324
p?orm2l/p?unm2l......................................................................... 2326
p?orm2r/p?unm2r........................................................................ 2330
p?orml2/p?unml2......................................................................... 2333
p?ormr2/p?unmr2........................................................................ 2337
p?pbtrsv..................................................................................... 2340
p?pttrsv...................................................................................... 2344
p?potf2....................................................................................... 2347
p?rot.......................................................................................... 2349
p?rscl......................................................................................... 2351
p?sygs2/p?hegs2......................................................................... 2352
p?sytd2/p?hetd2.......................................................................... 2355
p?trord....................................................................................... 2358
p?trsen....................................................................................... 2362
p?trti2........................................................................................ 2367
?lahqr2....................................................................................... 2368
?lamsh....................................................................................... 2370
?lapst......................................................................................... 2372
?laqr6.........................................................................................2372
?lar1va....................................................................................... 2376
?laref..........................................................................................2378
?larrb2........................................................................................2381
?larrd2........................................................................................2384
?larre2........................................................................................2388
?larre2a...................................................................................... 2392
?larrf2........................................................................................ 2396
?larrv2........................................................................................2399
?lasorte...................................................................................... 2404
?lasrt2........................................................................................ 2405
?stegr2....................................................................................... 2406
?stegr2a..................................................................................... 2410
?stegr2b..................................................................................... 2414
?stein2....................................................................................... 2418
?dbtf2.........................................................................................2420
?dbtrf......................................................................................... 2421
?dttrf..........................................................................................2423
?dttrsv........................................................................................2424
?pttrsv........................................................................................2425
?steqr2....................................................................................... 2427
?trmvt........................................................................................ 2428
pilaenv....................................................................................... 2431
pilaenvx......................................................................................2432
pjlaenv....................................................................................... 2434
Additional ScaLAPACK Routines...................................................... 2435
ScaLAPACK Utility Functions and Routines................................................2436
p?labad.......................................................................................2436
p?lachkieee................................................................................. 2437
p?lamch......................................................................................2438
p?lasnbt......................................................................................2439
ScaLAPACK Redistribution/Copy Routines.................................................2440
p?gemr2d................................................................................... 2440
p?trmr2d.................................................................................... 2442
Chapter 5: Sparse Solver Routines

Intel MKL PARDISO - Parallel Direct Sparse Solver Interface....................... 2447
pardiso....................................................................................... 2452
pardisoinit...................................................................................2459
pardiso_64.................................................................................. 2460
pardiso_getenv_f, pardiso_setenv_f, pardiso_getenv, pardiso_setenv.. 2461
mkl_pardiso_pivot........................................................................2463
pardiso_getdiag........................................................................... 2464
pardiso_handle_store................................................................... 2466
pardiso_handle_restore.................................................................2467
pardiso_handle_delete.................................................................. 2467
pardiso_handle_store_64.............................................................. 2468
pardiso_handle_restore_64........................................................... 2469
pardiso_handle_delete_64.............................................................2470
Intel MKL PARDISO Parameters in Tabular Form............................... 2470
pardiso iparm Parameter............................................................... 2475
PARDISO_DATA_TYPE................................................................... 2486
Parallel Direct Sparse Solver for Clusters Interface.................................... 2486
cluster_sparse_solver................................................................... 2488
cluster_sparse_solver_64.............................................................. 2494
cluster_sparse_solver iparm Parameter........................................... 2494
Direct Sparse Solver (DSS) Interface Routines..........................................2499
DSS Interface Description............................................................. 2501
DSS Implementation Details.......................................................... 2501
DSS Routines...............................................................................2502
dss_create.......................................................................... 2502
dss_define_structure............................................................ 2504
dss_reorder.........................................................................2505
dss_factor_real, dss_factor_complex...................................... 2507
dss_solve_real, dss_solve_complex........................................ 2508
dss_delete.......................................................................... 2511
dss_statistics.......................................................................2512
mkl_cvt_to_null_terminated_str............................................ 2515
Iterative Sparse Solvers based on Reverse Communication Interface (RCI ISS)............... 2515
CG Interface Description............................................................... 2516
FGMRES Interface Description........................................................2522
RCI ISS Routines..........................................................................2529
dcg_init.............................................................................. 2529
dcg_check...........................................................................2530
dcg.................................................................................... 2531
dcg_get.............................................................................. 2533
dcgmrhs_init....................................................................... 2533
dcgmrhs_check....................................................................2534
dcgmrhs............................................................................. 2535
dcgmrhs_get....................................................................... 2537
dfgmres_init........................................................................2538
dfgmres_check.................................................................... 2539
dfgmres..............................................................................2540
dfgmres_get........................................................................2542
RCI ISS Implementation Details..................................................... 2543
Preconditioners based on Incomplete LU Factorization Technique................ 2543
ILU0 and ILUT Preconditioners Interface Description......................... 2545
dcsrilu0.......................................................................................2546
dcsrilut....................................................................................... 2549
Sparse Matrix Checker Routines............................................................. 2552
sparse_matrix_checker................................................................. 2552
sparse_matrix_checker_init........................................................... 2554
Chapter 6: Extended Eigensolver Routines

The FEAST Algorithm............................................................................ 2557
Extended Eigensolver Functionality......................................................... 2558
Parallelism in Extended Eigensolver Routines................................... 2559
Achieving Performance With Extended Eigensolver Routines............... 2559
Extended Eigensolver Interfaces............................................................. 2560
Extended Eigensolver Naming Conventions...................................... 2560
feastinit...................................................................................... 2561
Extended Eigensolver Input Parameters...........................................2562
Extended Eigensolver Output Details...............................................2563
Extended Eigensolver RCI Routines.................................................2564
Extended Eigensolver RCI Interface Description....................... 2564
?feast_srci/?feast_hrci.......................................................... 2566
Extended Eigensolver Predefined Interfaces..................................... 2569
Matrix Storage.....................................................................2570
?feast_syev/?feast_heev.......................................................2570
?feast_sygv/?feast_hegv.......................................................2572
?feast_sbev/?feast_hbev.......................................................2575
?feast_sbgv/?feast_hbgv...................................................... 2577
?feast_scsrev/?feast_hcsrev.................................................. 2580
?feast_scsrgv/?feast_hcsrgv..................................................2582
Chapter 7: Vector Mathematical Functions

VM Data Types, Accuracy Modes, and Performance Tips............................. 2587
VM Naming Conventions........................................................................2588
VM Function Interfaces................................................................. 2589
VM Mathematical Function Interfaces...................................... 2589
VM Pack Function Interfaces.................................................. 2589
VM Unpack Function Interfaces.............................................. 2589
VM Service Function Interfaces.............................................. 2589
VM Input Function Interfaces................................................. 2590
VM Output Function Interfaces...............................................2590
Vector Indexing Methods....................................................................... 2590
VM Error Diagnostics.............................................................................2591
VM Mathematical Functions....................................................................2592
Special Value Notations................................................................. 2593
Arithmetic Functions..................................................................... 2594
v?Add.................................................................................2594
v?Sub.................................................................................2596
v?Sqr................................................................................. 2597
v?Mul................................................................................. 2598
v?MulByConj....................................................................... 2600
v?Conj................................................................................2601
v?Abs................................................................................. 2602
v?Arg................................................................................. 2604
v?LinearFrac........................................................................2605
Power and Root Functions............................................................. 2607
v?Inv................................................................................. 2607
v?Div................................................................................. 2608
v?Sqrt................................................................................ 2610
v?InvSqrt............................................................................2612
v?Cbrt................................................................................ 2613
v?InvCbrt........................................................................... 2614
v?Pow2o3........................................................................... 2615
v?Pow3o2........................................................................... 2616
v?Pow................................................................................ 2618
v?Powx...............................................................................2621
v?Hypot..............................................................................2623
Exponential and Logarithmic Functions............................................2624
v?Exp.................................................................................2624
v?Expm1............................................................................ 2626
v?Ln...................................................................................2627
v?Log10............................................................................. 2629
v?Log1p..............................................................................2631
Trigonometric Functions................................................................ 2632
v?Cos................................................................................. 2633
v?Sin..................................................................................2634
v?SinCos............................................................................ 2636
v?CIS................................................................................. 2638
v?Tan................................................................................. 2639
v?Acos............................................................................... 2641
v?Asin................................................................................ 2643
v?Atan................................................................................2645
v?Atan2..............................................................................2646
Hyperbolic Functions.....................................................................2648
v?Cosh............................................................................... 2648
v?Sinh................................................................................2651
v?Tanh............................................................................... 2653
v?Acosh..............................................................................2655
v?Asinh.............................................................................. 2657
v?Atanh..............................................................................2659
Special Functions......................................................................... 2661
v?Erf.................................................................................. 2661
v?Erfc.................................................................................2663
v?CdfNorm..........................................................................2665
v?ErfInv..............................................................................2667
v?ErfcInv............................................................................ 2669
v?CdfNormInv..................................................................... 2671
v?LGamma..........................................................................2672
v?TGamma......................................................................... 2674
v?ExpInt1........................................................................... 2675
Rounding Functions...................................................................... 2676
v?Floor............................................................................... 2676
v?Ceil.................................................................................2677
v?Trunc.............................................................................. 2678
v?Round............................................................................. 2680
v?NearbyInt........................................................................ 2681
v?Rint................................................................................ 2682
v?Modf............................................................................... 2683
v?Frac................................................................................ 2684
VM Pack/Unpack Functions.................................................................... 2685
v?Pack........................................................................................2686
v?Unpack.................................................................................... 2688
VM Service Functions............................................................................ 2689
vmlSetMode................................................................................ 2690
vmlgetmode................................................................................ 2691
vmlSetErrStatus...........................................................................2692
vmlgeterrstatus........................................................................... 2693
vmlclearerrstatus......................................................................... 2693
vmlSetErrorCallBack..................................................................... 2694
vmlGetErrorCallBack.....................................................................2696
vmlClearErrorCallBack.................................................................. 2696
Chapter 8: Statistical Functions

Random Number Generators.................................................................. 2697
Random Number Generators Conventions........................................ 2698
Random Number Generators Mathematical Notation................. 2699
Random Number Generators Naming Conventions.................... 2700
Basic Generators.......................................................................... 2703
BRNG Parameter Definition....................................................2705
Random Streams................................................................. 2706
BRNG Data Types.................................................................2706
Error Reporting............................................................................ 2707
VS RNG Usage Model.................................................................... 2708
Service Routines.......................................................................... 2710
vslNewStream..................................................................... 2711
vslNewStreamEx..................................................................2712
vsliNewAbstractStream......................................................... 2713
vsldNewAbstractStream........................................................ 2715
vslsNewAbstractStream........................................................ 2717
vslDeleteStream.................................................................. 2718
vslCopyStream.................................................................... 2719
vslCopyStreamState............................................................. 2720
vslSaveStreamF...................................................................2720
vslLoadStreamF................................................................... 2721
vslSaveStreamM.................................................................. 2723
vslLoadStreamM.................................................................. 2724
vslGetStreamSize.................................................................2725
vslLeapfrogStream............................................................... 2725
vslSkipAheadStream............................................................ 2727
vslGetStreamStateBrng........................................................ 2729
vslGetNumRegBrngs.............................................................2729
Distribution Generators................................................................. 2730
Continuous Distributions....................................................... 2733
Discrete Distributions........................................................... 2761
Advanced Service Routines............................................................ 2779
Advanced Service Routine Data Types..................................... 2779
vslGetBrngProperties............................................................ 2780
Convolution and Correlation................................................................... 2780
Convolution and Correlation Naming Conventions............................. 2781
Convolution and Correlation Data Types.......................................... 2782
Convolution and Correlation Parameters.......................................... 2782
Convolution and Correlation Task Status and Error Reporting..............2784
Convolution and Correlation Task Constructors................................. 2785
vslConvNewTask/vslCorrNewTask........................................... 2786
vslConvNewTask1D/vslCorrNewTask1D................................... 2788
vslConvNewTaskX/vslCorrNewTaskX....................................... 2789
vslConvNewTaskX1D/vslCorrNewTaskX1D................................2792
Convolution and Correlation Task Editors......................................... 2794
vslConvSetMode/vslCorrSetMode........................................... 2795
vslConvSetInternalPrecision/vslCorrSetInternalPrecision............2796
vslConvSetStart/vslCorrSetStart............................................ 2797
vslConvSetDecimation/vslCorrSetDecimation........................... 2798
Task Execution Routines................................................................ 2798
vslConvExec/vslCorrExec...................................................... 2799
vslConvExec1D/vslCorrExec1D...............................................2801
vslConvExecX/vslCorrExecX...................................................2803
vslConvExecX1D/vslCorrExecX1D........................................... 2806
Convolution and Correlation Task Destructors...................................2808
vslConvDeleteTask/vslCorrDeleteTask......................................2808
Convolution and Correlation Task Copiers........................................ 2809
vslConvCopyTask/vslCorrCopyTask......................................... 2809
Convolution and Correlation Usage Examples................................... 2810
Convolution and Correlation Mathematical Notation and Definitions..... 2813
Convolution and Correlation Data Allocation..................................... 2814
Summary Statistics...............................................................................2816
Summary Statistics Naming Conventions.........................................2817

Summary Statistics Data Types...................................................... 2817
Summary Statistics Parameters......................................................2818
Summary Statistics Task Status and Error Reporting......................... 2818
Summary Statistics Task Constructors.............................................2822
vslSSNewTask..................................................................... 2822
Summary Statistics Task Editors.....................................................2824
vslSSEditTask...................................................................... 2825
vslSSEditMoments................................................................2834
vslSSEditSums.................................................................... 2835
vslSSEditCovCor.................................................................. 2836
vslSSEditCP.........................................................................2838
vslSSEditPartialCovCor..........................................................2840
vslSSEditQuantiles............................................................... 2841
vslSSEditStreamQuantiles..................................................... 2842
vslSSEditPooledCovariance.................................................... 2843
vslSSEditRobustCovariance....................................................2845
vslSSEditOutliersDetection.................................................... 2846
vslSSEditMissingValues......................................................... 2848
vslSSEditCorParameterization................................................ 2851
Summary Statistics Task Computation Routines................................2852
vslSSCompute..................................................................... 2855
Summary Statistics Task Destructor................................................2856
vslSSDeleteTask...................................................................2856
Summary Statistics Usage Examples...............................................2857
Summary Statistics Mathematical Notation and Definitions.................2859
Chapter 9: Fourier Transform Functions

FFT Functions...................................................................................... 2866
FFT Interface............................................................................... 2867
Computing an FFT........................................................................ 2867
Configuration Settings.................................................................. 2867
DFTI_PRECISION................................................................. 2870
DFTI_FORWARD_DOMAIN..................................................... 2870
DFTI_DIMENSION, DFTI_LENGTHS.........................................2871
DFTI_PLACEMENT................................................................ 2871
DFTI_FORWARD_SCALE, DFTI_BACKWARD_SCALE...................2872
DFTI_NUMBER_OF_USER_THREADS....................................... 2872
DFTI_THREAD_LIMIT............................................................2872
DFTI_INPUT_STRIDES, DFTI_OUTPUT_STRIDES...................... 2873
DFTI_NUMBER_OF_TRANSFORMS.......................................... 2874
DFTI_INPUT_DISTANCE, DFTI_OUTPUT_DISTANCE.................. 2875
DFTI_COMPLEX_STORAGE, DFTI_REAL_STORAGE, DFTI_CONJUGATE_EVEN_STORAGE....................................2876
DFTI_PACKED_FORMAT........................................................ 2878
DFTI_WORKSPACE............................................................... 2886
DFTI_COMMIT_STATUS.........................................................2887
DFTI_ORDERING................................................................. 2887
FFT Descriptor Manipulation Functions............................................ 2887
DftiCreateDescriptor.............................................................2888
DftiCommitDescriptor........................................................... 2889
DftiFreeDescriptor................................................................ 2890
DftiCopyDescriptor............................................................... 2891
FFT Descriptor Configuration Functions............................................2892
DftiSetValue........................................................................ 2892
DftiGetValue........................................................................2894
FFT Computation Functions............................................................2896
DftiComputeForward.............................................................2896
DftiComputeBackward.......................................................... 2899
Configuring and Computing an FFT in Fortran.......................... 2901
Status Checking Functions.............................................................2904
DftiErrorClass...................................................................... 2904
DftiErrorMessage................................................................. 2905
Cluster FFT Functions............................................................................2906
Computing Cluster FFT..................................................................2907
Distributing Data among Processes................................................. 2908
Cluster FFT Interface.................................................................... 2909
Cluster FFT Descriptor Manipulation Functions.................................. 2910
DftiCreateDescriptorDM........................................................ 2910
DftiCommitDescriptorDM.......................................................2911
DftiFreeDescriptorDM........................................................... 2912
Cluster FFT Computation Functions................................................. 2913
DftiComputeForwardDM........................................................ 2913
DftiComputeBackwardDM...................................................... 2914
Cluster FFT Descriptor Configuration Functions................................. 2916
DftiSetValueDM....................................................................2916
DftiGetValueDM................................................................... 2918
Error Codes................................................................................. 2920
Chapter 10: PBLAS Routines

PBLAS Routines Overview...................................................................... 2923
PBLAS Routine Naming Conventions........................................................2924
PBLAS Level 1 Routines......................................................................... 2926
p?amax...................................................................................... 2926
p?asum.......................................................................................2927
p?axpy....................................................................................... 2928
p?copy........................................................................................2930
p?dot..........................................................................................2931
p?dotc........................................................................................ 2933
p?dotu........................................................................................ 2934
p?nrm2.......................................................................................2935
p?scal.........................................................................................2936
p?swap....................................................................................... 2937
p?gemv...................................................................................... 2939
p?agemv.....................................................................................2942
p?ger..........................................................................................2945
p?gerc........................................................................................ 2946
p?geru........................................................................................2948
p?hemv...................................................................................... 2950
p?ahemv.....................................................................................2952
p?her......................................................................................... 2954
p?her2........................................................................................2955
p?symv.......................................................................................2957
p?asymv..................................................................................... 2959
p?syr..........................................................................................2961
p?syr2........................................................................................ 2963
p?trmv........................................................................................2965
p?atrmv...................................................................................... 2967
p?trsv.........................................................................................2970
p?geadd......................................................................................2972
p?tradd.......................................................................................2974
p?gemm..................................................................................... 2976
p?hemm..................................................................................... 2979
p?herk........................................................................................ 2981
p?her2k...................................................................................... 2983
p?symm......................................................................................2985
p?syrk........................................................................................ 2987
p?syr2k...................................................................................... 2990
p?tran........................................................................................ 2992
p?tranu.......................................................................................2993
p?tranc....................................................................................... 2995
p?trmm...................................................................................... 2996
p?trsm........................................................................................2998
Chapter 11: Partial Differential Equations Support

Trigonometric Transform Routines........................................................... 3003
Trigonometric Transforms Implemented...........................................3004
Sequence of Invoking TT Routines.................................................. 3005
Trigonometric Transform Interface Description..................................3006
TT Routines................................................................................. 3007
?_init_trig_transform............................................................3007
?_commit_trig_transform......................................................3008
?_forward_trig_transform..................................................... 3010
?_backward_trig_transform................................................... 3012
free_trig_transform.............................................................. 3014
Common Parameters of the Trigonometric Transforms....................... 3015
Trigonometric Transform Implementation Details.............................. 3018
Fast Poisson Solver Routines.................................................................. 3019
Poisson Solver Implementation...................................................... 3020
Sequence of Invoking Poisson Solver Routines................................. 3025
Fast Poisson Solver Interface Description.........................................3027
Routines for the Cartesian Solver................................................... 3028
?_init_Helmholtz_2D/?_init_Helmholtz_3D.............................. 3028
?_commit_Helmholtz_2D/?_commit_Helmholtz_3D.................... 3030
?_Helmholtz_2D/?_Helmholtz_3D.......................................... 3033
free_Helmholtz_2D/free_Helmholtz_3D...................................3036
Routines for the Spherical Solver....................................................3037
?_init_sph_p/?_init_sph_np...................................................3037
?_commit_sph_p/?_commit_sph_np.......................................3039
?_sph_p/?_sph_np............................................................... 3041
free_sph_p/free_sph_np....................................................... 3043
Common Parameters for the Poisson Solver..................................... 3044
ipar....................................................................................3044
dpar and spar......................................................................3048
Caveat on Parameter Modifications......................................... 3050
Parameters That Define Boundary Conditions...........................3050
Poisson Solver Implementation Details............................................ 3053
Calling PDE Support Routines from Fortran...............................................3059
Chapter 12: Nonlinear Optimization Problem Solvers

Nonlinear Solver Organization and Implementation................................... 3061
Nonlinear Solver Routine Naming Conventions..........................................3062
Nonlinear Least Squares Problem without Constraints................................3062
?trnlsp_init..................................................................................3063
?trnlsp_check.............................................................................. 3065
?trnlsp_solve............................................................................... 3066
?trnlsp_get..................................................................................3068
?trnlsp_delete..............................................................................3069
Nonlinear Least Squares Problem with Linear (Bound) Constraints...............3070
?trnlspbc_init...............................................................................3070
?trnlspbc_check........................................................................... 3072
?trnlspbc_solve............................................................................ 3075
?trnlspbc_get...............................................................................3076
?trnlspbc_delete.......................................................................... 3077
Jacobian Matrix Calculation Routines....................................................... 3078
?jacobi_init..................................................................................3078
?jacobi_solve...............................................................................3079
?jacobi_delete............................................................................. 3080
?jacobi........................................................................................3081
?jacobix...................................................................................... 3082
Chapter 13: Support Functions

Using a Fortran Interface Module for Support Functions............................. 3088
Version Information.............................................................................. 3089
mkl_get_version_string.................................................................3089
Threading Control.................................................................................3090
mkl_set_num_threads.................................................................. 3091
mkl_domain_set_num_threads...................................................... 3091
mkl_set_num_threads_local.......................................................... 3092
mkl_set_dynamic......................................................................... 3094
mkl_get_max_threads.................................................................. 3095
mkl_domain_get_max_threads...................................................... 3096
mkl_get_dynamic.........................................................................3097
mkl_set_num_stripes................................................................... 3098
mkl_get_num_stripes................................................................... 3099
Error Handling..................................................................................... 3099
Error Handling for Linear Algebra Routines.......................................3099
xerbla................................................................................ 3099
pxerbla...............................................................................3101
Handling Fatal Errors.................................................................... 3101
mkl_set_exit_handler........................................................... 3102
Character Equality Testing..................................................................... 3102
lsame......................................................................................... 3102
lsamen....................................................................................... 3103
Timing................................................................................................ 3104
second/dsecnd.............................................................................3104
mkl_get_cpu_clocks..................................................................... 3104
mkl_get_cpu_frequency................................................................ 3105
mkl_get_max_cpu_frequency........................................................ 3106
mkl_get_clocks_frequency.............................................................3106
Memory Management............................................................................3107
mkl_free_buffers..........................................................................3107
mkl_thread_free_buffers............................................................... 3108
mkl_disable_fast_mm...................................................................3108
mkl_mem_stat............................................................................ 3109
mkl_peak_mem_usage................................................................. 3109
mkl_malloc..................................................................................3110
mkl_calloc...................................................................................3111
mkl_realloc................................................................................. 3112
mkl_free..................................................................................... 3113
mkl_set_memory_limit................................................................. 3114
Usage Examples for the Memory Functions...................................... 3115
Single Dynamic Library Control...............................................................3117
mkl_set_interface_layer................................................................ 3117
mkl_set_threading_layer...............................................................3118
mkl_set_xerbla............................................................................ 3119
mkl_set_progress.........................................................................3120
mkl_set_pardiso_pivot.................................................................. 3121
Intel Many Integrated Core Architecture Support...................................... 3121
mkl_mic_enable...........................................................................3122
mkl_mic_disable.......................................................................... 3122
mkl_mic_get_device_count........................................................... 3123
mkl_mic_set_workdivision.............................................................3124
mkl_mic_get_workdivision............................................................ 3125
mkl_mic_set_max_memory...........................................................3127
mkl_mic_free_memory................................................................. 3128
mkl_mic_register_memory............................................................ 3129
mkl_mic_set_device_num_threads................................................. 3130
mkl_mic_set_resource_limit.......................................................... 3131
mkl_mic_get_resource_limit.......................................................... 3134
mkl_mic_set_offload_report.......................................................... 3135
mkl_mic_set_flags....................................................................... 3136
mkl_mic_get_flags....................................................................... 3136
mkl_mic_get_status..................................................................... 3137
mkl_mic_clear_status................................................................... 3139
mkl_mic_get_meminfo..................................................................3140
mkl_mic_get_cpuinfo....................................................................3141
Conditional Numerical Reproducibility Control........................................... 3143
mkl_cbwr_set.............................................................................. 3143
mkl_cbwr_get..............................................................................3144
mkl_cbwr_get_auto_branch...........................................................3145
Named Constants for CNR Control.................................................. 3146
Reproducibility Conditions............................................................. 3147
Usage Examples for CNR Support Functions..................................... 3148
Miscellaneous.......................................................................................3148
mkl_progress...............................................................................3148
mkl_enable_instructions ...............................................................3150
mkl_set_env_mode...................................................................... 3151
mkl_verbose................................................................................3152
mkl_set_mpi .............................................................................. 3153
mkl_finalize.................................................................................3155
Chapter 14: BLACS Routines

Matrix Shapes...................................................................................... 3157
Repeatability and Coherence.................................................................. 3158
BLACS Combine Operations................................................................... 3161
?gamx2d.....................................................................................3162
?gamn2d.....................................................................................3163
?gsum2d.....................................................................................3165
BLACS Point To Point Communication...................................................... 3166
?gesd2d......................................................................................3168
?trsd2d....................................................................................... 3168
?gerv2d...................................................................................... 3169
?trrv2d....................................................................................... 3170
BLACS Broadcast Routines..................................................................... 3170
?gebs2d......................................................................................3172
?trbs2d....................................................................................... 3172
?gebr2d...................................................................................... 3173
?trbr2d....................................................................................... 3174
BLACS Support Routines........................................................................3175
Initialization Routines................................................................... 3175
blacs_pinfo......................................................................... 3175
blacs_setup.........................................................................3176
blacs_get............................................................................ 3176
blacs_set............................................................................ 3177
blacs_gridinit.......................................................................3179
blacs_gridmap..................................................................... 3180
Destruction Routines.....................................................................3181
blacs_freebuff..................................................................... 3182
blacs_gridexit......................................................................3182
blacs_abort......................................................................... 3182
blacs_exit........................................................................... 3183
Informational Routines..................................................................3183
blacs_gridinfo......................................................................3184
blacs_pnum........................................................................ 3184
blacs_pcoord....................................................................... 3184
Miscellaneous Routines................................................................. 3185
blacs_barrier....................................................................... 3185
Examples of BLACS Routines Usage........................................................ 3186
Chapter 15: Data Fitting Functions

Data Fitting Function Naming Conventions............................................... 3197
Data Fitting Function Data Types............................................................ 3198
Mathematical Conventions for Data Fitting Functions................................. 3198
Data Fitting Usage Model....................................................................... 3201
Data Fitting Usage Examples..................................................................3201
Data Fitting Function Task Status and Error Reporting............................... 3202
Data Fitting Task Creation and Initialization Routines.................................3204
df?newtask1d.............................................................................. 3204
Task Configuration Routines................................................................... 3206
df?editppspline1d.........................................................................3207
df?editptr....................................................................................3214
dfieditval.....................................................................................3215
df?editidxptr................................................................................3217
df?queryptr................................................................................. 3218
dfiqueryval.................................................................................. 3219
df?queryidxptr............................................................................. 3220
Data Fitting Computational Routines....................................................... 3221
df?construct1d.............................................................................3222
df?interpolate1d/df?interpolateex1d............................................... 3223
df?integrate1d/df?integrateex1d.................................................... 3229
df?searchcells1d/df?searchcellsex1d............................................... 3233
df?interpcallback..........................................................................3234
df?integrcallback..........................................................................3235
df?searchcellscallback................................................................... 3237
Data Fitting Task Destructors................................................................. 3239
dfdeletetask................................................................................ 3239
Appendix A: Linear Solvers Basics

Sparse Linear Systems.......................................................................... 3241
Matrix Fundamentals.................................................................... 3241
Direct Method.............................................................................. 3242
Sparse Matrix Storage Formats...............................................................3246
DSS Symmetric Matrix Storage...................................................... 3246
DSS Nonsymmetric Matrix Storage................................................. 3247
DSS Structurally Symmetric Matrix Storage..................................... 3248
DSS Distributed Symmetric Matrix Storage...................................... 3249
Sparse BLAS CSR Matrix Storage Format......................................... 3250
Sparse BLAS CSC Matrix Storage Format......................................... 3252
Sparse BLAS Coordinate Matrix Storage Format................................3252
Sparse BLAS Diagonal Matrix Storage Format...................................3253
Sparse BLAS Skyline Matrix Storage Format.....................................3254
Sparse BLAS BSR Matrix Storage Format......................................... 3255
Appendix B: Routine and Function Arguments

Vector Arguments in BLAS..................................................................... 3259
Vector Arguments in VM........................................................................ 3260
Matrix Arguments................................................................................. 3261
Appendix C: FFTW Interface to Intel Math Kernel Library

FFTW Notational Conventions................................................................. 3267
FFTW2 Interface to Intel Math Kernel Library ......................................... 3267
Wrappers Reference..................................................................... 3267
One-dimensional Complex-to-complex FFTs ............................ 3267
Multi-dimensional Complex-to-complex FFTs............................ 3268
One-dimensional Real-to-half-complex/Half-complex-to-real FFTs.... 3268
Multi-dimensional Real-to-complex/Complex-to-real FFTs.......... 3268
Multi-threaded FFTW............................................................ 3269
FFTW Support Functions....................................................... 3269
Calling FFTW2 Interface Wrappers from Fortran................................3270
Limitations of the FFTW2 Interface to Intel MKL................................3271
Installing FFTW2 Interface Wrappers...............................................3271
Creating the Wrapper Library.................................................3271
Application Assembling ........................................................ 3272
Running FFTW2 Interface Wrapper Examples........................... 3272
FFTW3 Interface to Intel Math Kernel Library.......................................... 3273
Using FFTW3 Wrappers................................................................. 3273
Calling FFTW3 Interface Wrappers from Fortran................................3275
Building Your Own FFTW3 Interface Wrapper Library......................... 3275
Building an Application With FFTW3 Interface Wrappers.....................3276
Running FFTW3 Interface Wrapper Examples................................... 3276
MPI FFTW3 Wrappers....................................................................3276
Building Your Own Wrapper Library.........................................3277
Building an Application......................................................... 3277
Running Examples............................................................... 3277
Appendix D: Code Examples

BLAS Code Examples............................................................................ 3279
Fourier Transform Functions Code Examples.............................................3282
FFT Code Examples...................................................................... 3282
Examples of Using OpenMP* Threading for FFT Computation......3286
Examples for Cluster FFT Functions.................................................3289
Auxiliary Data Transformations.......................................................3290
Bibliography
Glossary
Legal Information
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this
document.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of
merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from
course of performance, course of dealing, or usage in trade.
This document contains information on products, services and/or processes in development. All information
provided here is subject to change without notice. Contact your Intel representative to obtain the latest
forecast, schedule, specifications and roadmaps.
The products and services described may contain defects or errors which may cause deviations from
published specifications. Current characterized errata are available on request.
Software and workloads used in performance tests may have been optimized for performance only on Intel
microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific
computer systems, components, software, operations and functions. Any change to any of those factors may
cause the results to vary. You should consult other information and performance tests to assist you in fully
evaluating your contemplated purchases, including the performance of that product when combined with
other products.
Cilk, Intel, the Intel logo, Intel Atom, Intel Core, Intel Inside, Intel NetBurst, Intel SpeedStep, Intel vPro,
Intel Xeon Phi, Intel XScale, Itanium, MMX, Pentium, Thunderbolt, Ultrabook, VTune and Xeon are
trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Microsoft, Windows, and the Windows logo are trademarks, or registered trademarks of Microsoft Corporation
in the United States and/or other countries.
Java is a registered trademark of Oracle and/or its affiliates.
Third Party Content

Intel Math Kernel Library (Intel MKL) includes content from several 3rd party sources that was originally
governed by the licenses referenced below:
Portions Copyright 2001 Hewlett-Packard Development Company, L.P.
Sections on the Linear Algebra PACKage (LAPACK) routines include derivative work portions that have
been copyrighted:
1991, 1992, and 1998 by The Numerical Algorithms Group, Ltd.
Intel MKL supports LAPACK 3.5 set of computational, driver, auxiliary and utility routines under the
following license:
Copyright 1992-2011 The University of Tennessee and The University of Tennessee Research Foundation. All rights
reserved.
Copyright 2000-2011 The University of California Berkeley. All rights reserved.
Copyright 2006-2012 The University of Colorado Denver. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the
following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following
disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following
disclaimer listed in this license in the documentation and/or other materials provided with the distribution.
Neither the name of the copyright holders nor the names of its contributors may be used to endorse or promote
products derived from this software without specific prior written permission.
The copyright holders provide no reassurances that the source code provided does not infringe any patent, copyright,
or any other intellectual property rights of third parties. The copyright holders disclaim any liability to any recipient for
claims brought against recipient by any third party for infringement of that parties intellectual property rights.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The original versions of LAPACK from which that part of Intel MKL was derived can be obtained from
http://www.netlib.org/lapack/index.html. The authors of LAPACK are E. Anderson, Z. Bai, C. Bischof, S.
Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D.
Sorensen.
The original versions of the Basic Linear Algebra Subprograms (BLAS) from which the respective part of
Intel MKL was derived can be obtained from http://www.netlib.org/blas/index.html.
XBLAS is distributed under the following copyright:
Copyright 2008-2009 The University of California Berkeley. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided
that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the
following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the
following disclaimer listed in this license in the documentation and/or other materials provided with the
distribution.
- Neither the name of the copyright holders nor the names of its contributors may be used to endorse or
promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The original versions of the Basic Linear Algebra Communication Subprograms (BLACS) from which the
respective part of Intel MKL was derived can be obtained from http://www.netlib.org/blacs/index.html.
The authors of BLACS are Jack Dongarra and R. Clint Whaley.
The original versions of Scalable LAPACK (ScaLAPACK) from which the respective part of Intel MKL was
derived can be obtained from http://www.netlib.org/scalapack/index.html. The authors of ScaLAPACK are
L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G.
Henry, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley.
The original versions of the Parallel Basic Linear Algebra Subprograms (PBLAS) routines from which the
respective part of Intel MKL was derived can be obtained from http://www.netlib.org/scalapack/html/
pblas_qref.html.
PARDISO (PARallel DIrect SOlver)* in Intel MKL was originally developed by the Department of Computer
Science at the University of Basel (http://www.unibas.ch). It can be obtained at http://www.pardiso-
project.org.
The Extended Eigensolver functionality is based on the Feast solver package and is distributed under the
following license:
Copyright 2009, The Regents of the University of Massachusetts, Amherst.

Developed by E. Polizzi
All rights reserved.
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the
following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the
following disclaimer in the documentation and/or other materials provided with the distribution.
3. Neither the name of the University nor the names of its contributors may be used to endorse or
promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ''AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.
Some Fast Fourier Transform (FFT) functions in this release of Intel MKL have been generated by the
SPIRAL software generation system (http://www.spiral.net/) under license from Carnegie Mellon
University. The authors of SPIRAL are Markus Puschel, Jose Moura, Jeremy Johnson, David Padua,
Manuela Veloso, Bryan Singer, Jianxin Xiong, Franz Franchetti, Aca Gacic, Yevgen Voronenko, Kang Chen,
Robert W. Johnson, and Nick Rizzolo.
Open MPI is distributed under the New BSD license, listed below.
Most files in this release are marked with the copyrights of the organizations who have edited them. The
copyrights below are in no particular order and generally reflect members of the Open MPI core team who
have contributed code to this release. The copyrights for code used under license from other parties are
included in the corresponding files.
Copyright 2004-2010 The Trustees of Indiana University and Indiana University Research and
Technology Corporation. All rights reserved.
Copyright 2004-2010 The University of Tennessee and The University of Tennessee Research
Foundation. All rights reserved.
Copyright 2004-2010 High Performance Computing Center Stuttgart, University of Stuttgart. All rights
reserved.
Copyright 2004-2008 The Regents of the University of California. All rights reserved.
Copyright 2006-2010 Los Alamos National Security, LLC. All rights reserved.
Copyright 2006-2010 Cisco Systems, Inc. All rights reserved.
Copyright 2006-2010 Voltaire, Inc. All rights reserved.
Copyright 2006-2011 Sandia National Laboratories. All rights reserved.
Copyright 2006-2010 Sun Microsystems, Inc. All rights reserved. Use is subject to license terms.
Copyright 2006-2010 The University of Houston. All rights reserved.
Copyright 2006-2009 Myricom, Inc. All rights reserved.
Copyright 2007-2008 UT-Battelle, LLC. All rights reserved.
Copyright 2007-2010 IBM Corporation. All rights reserved.
Copyright 1998-2005 Forschungszentrum Juelich, Juelich Supercomputing Centre, Federal Republic of
Germany
Copyright 2005-2008 ZIH, TU Dresden, Federal Republic of Germany

Copyright 2007 Evergrid, Inc. All rights reserved.
Copyright 2008 Chelsio, Inc. All rights reserved.
Copyright 2008-2009 Institut National de Recherche en Informatique. All rights reserved.
Copyright 2007 Lawrence Livermore National Security, LLC. All rights reserved.
Copyright 2007-2009 Mellanox Technologies. All rights reserved.
Copyright 2006-2010 QLogic Corporation. All rights reserved.
Copyright 2008-2010 Oak Ridge National Labs. All rights reserved.
Copyright 2006-2010 Oracle and/or its affiliates. All rights reserved.
Copyright 2009 Bull SAS. All rights reserved.
Copyright 2010 ARM ltd. All rights reserved.
Copyright 2010-2011 Alex Brick . All rights reserved.
Copyright 2012 The University of Wisconsin-La Crosse. All rights reserved.
Copyright 2013-2014 Intel, Inc. All rights reserved.
Copyright 2011-2014 NVIDIA Corporation. All rights reserved.
- Redistributions of source code must retain the above copyright notice, this list of conditions and the
following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the
following disclaimer listed in this license in the documentation and/or other materials provided with the
distribution.
- Neither the name of the copyright holders nor the names of its contributors may be used to endorse or
promote products derived from this software without specific prior written permission.
The copyright holders provide no reassurances that the source code provided does not infringe any
patent, copyright, or any other intellectual property rights of third parties. The copyright holders disclaim
any liability to any recipient for claims brought against recipient by any third party for infringement of that
parties intellectual property rights.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The Safe C Library is distributed under the following copyright:
Copyright (c)
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
associated documentation files (the "Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
of the Software, and to permit persons to whom the Software is furnished to do so, subject to the
following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions
of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT
OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
HPL Copyright Notice and Licensing Terms
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the
following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions, and the
following disclaimer in the documentation and/or other materials provided with the distribution.
3. All advertising materials mentioning features or use of this software must display the following
acknowledgement: This product includes software developed at the University of Tennessee, Knoxville,
Innovative Computing Laboratories.
4. The name of the University, the name of the Laboratory, or the names of its contributors may not be
used to endorse or promote products derived from this software without specific written permission.
Copyright 2017, Intel Corporation. All rights reserved.
Introducing the Intel Math Kernel Library
The Intel Math Kernel Library (Intel MKL) improves performance with math routines for software
applications that solve large computational problems. Intel MKL provides BLAS and LAPACK linear algebra
routines, fast Fourier transforms, vectorized math functions, random number generation functions, and other
functionality.
NOTE
It is your responsibility when using Intel MKL to ensure that input data has the required format and
does not contain invalid characters. These can cause unexpected behavior of the library.
The library requires subroutine and function parameters to be valid before being passed. While some
Intel MKL routines do limited checking of parameter errors, your application should check for NULL
pointers, for example.
Intel MKL is optimized for the latest Intel processors, including processors with multiple cores (see the Intel
MKL Release Notes for the full list of supported processors). Intel MKL also performs well on non-Intel
processors.
For more details about functionality provided by Intel MKL, see the Function Domains section.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for
optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and
SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or
effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-
dependent optimizations in this product are intended for use with Intel microprocessors. Certain
optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to
the applicable product User and Reference Guides for more information regarding the specific instruction
sets covered by this notice.
Notice revision #20110804
Getting Help and Support
Intel provides a support website that contains a rich repository of self-help information, including
getting-started tips, known product issues, product errata, license information, user forums, and more. Visit the
Intel MKL support website at http://www.intel.com/software/products/support/.
What's New
This Developer Reference documents the Intel Math Kernel Library (Intel MKL) 2017 Update 2 release for the
Fortran interface.
NOTE
This publication, the Intel Math Kernel Library Developer Reference, was previously known as the Intel
Math Kernel Library Reference Manual.
The following function domains were updated with new functions, enhancements to the existing functionality,
or improvements to the existing documentation:
- New LAPACK functions provide factorization particularly for tall and skinny matrices. See ?gelq, ?geqr,
?gemlq, ?gemqr, and ?getsls.
- Support functions have been added to specify and return the number of partitions along the leading
dimension of the output matrix for parallel ?gemm functions. For more details, see mkl_set_num_stripes
and mkl_get_num_stripes.
- Support of ScaLAPACK by the progress routine has been implemented. For more details, see
mkl_progress and mkl_set_progress.
The manual has also been updated to reflect other enhancements to the product, and minor improvements
and error corrections have been made.
Notational Conventions
This manual uses the following terms to refer to operating systems:
Windows* OS: This term refers to information that is valid on all supported Windows* operating systems.
Linux* OS: This term refers to information that is valid on all supported Linux* operating systems.
macOS*: This term refers to information that is valid on Intel-based systems running the macOS* operating
system.
This manual uses the following notational conventions:
- Routine name shorthand (for example, ?ungqr instead of cungqr/zungqr).
- Font conventions used for distinction between the text and the code.
Routine Name Shorthand

For shorthand, names that contain a question mark "?" represent groups of routines with similar
functionality. Each group typically consists of routines used with four basic data types: single-precision real,
double-precision real, single-precision complex, and double-precision complex. The question mark is used to
indicate any or all possible varieties of a function; for example:
?swap: Refers to all four data types of the vector-vector ?swap routine: sswap, dswap, cswap, and zswap.
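As a brief illustration of the shorthand, the sketch below calls two members of the ?swap group, sswap and dswap, which differ only in the data type they operate on. It assumes the program is linked against a BLAS implementation such as Intel MKL; the program name, array names, and values are hypothetical.

```fortran
program swap_shorthand
  implicit none
  integer, parameter :: n = 3
  real             :: sx(n), sy(n)   ! single-precision real data -> sswap
  double precision :: dx(n), dy(n)   ! double-precision real data -> dswap
  external :: sswap, dswap           ! provided by the linked BLAS library

  sx = [1.0, 2.0, 3.0]
  sy = [4.0, 5.0, 6.0]
  dx = [1.0d0, 2.0d0, 3.0d0]
  dy = [4.0d0, 5.0d0, 6.0d0]

  ! Same operation, same argument list; only the prefix and data type differ.
  call sswap(n, sx, 1, sy, 1)
  call dswap(n, dx, 1, dy, 1)

  ! After the calls, sx and dx hold the values that were in sy and dy.
  print *, 'sx after sswap:', sx
  print *, 'dx after dswap:', dx
end program swap_shorthand
```

The complex variants cswap and zswap take the same arguments with single-precision and double-precision complex arrays, respectively.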
Font Conventions
The following font conventions are used:
UPPERCASE COURIER: Data type used in the description of input and output parameters for the Fortran
interface. For example, CHARACTER*1.
lowercase courier: Code examples. For example:
    a(k+i,j) = matrix(i,j)
lowercase courier italic: Variables in the descriptions of arguments and parameters. For example, incx.
*: Used as a multiplication symbol in code examples and equations and where required by the programming
language syntax.
Chapter 1: Function Domains
NOTE
It is your responsibility when using Intel MKL to ensure that input data has the required format and
does not contain invalid characters. These can cause unexpected behavior of the library.
The library requires subroutine and function parameters to be valid before being passed. While some
Intel MKL routines do limited checking of parameter errors, your application should check for NULL
pointers, for example.
The Intel Math Kernel Library includes Fortran routines and functions optimized for Intel processor-based
computers running operating systems that support multiprocessing. In addition to the Fortran interface, Intel
MKL includes a C-language interface for the Discrete Fourier transform functions, as well as for the Vector
Mathematics and Vector Statistics functions. For hardware and software requirements to use Intel MKL, see
Intel MKL Release Notes.
BLAS Routines
The BLAS routines and functions are divided into the following groups according to the operations they
perform:
- BLAS Level 1 Routines perform operations of both addition and reduction on vectors of data. Typical
operations include scaling and dot products.
- BLAS Level 2 Routines perform matrix-vector operations, such as matrix-vector multiplication, rank-1 and
rank-2 matrix updates, and solution of triangular systems.
- BLAS Level 3 Routines perform matrix-matrix operations, such as matrix-matrix multiplication, rank-k
update, and solution of triangular systems.
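The three levels can be illustrated with one routine from each: ddot (vector-vector), dgemv (matrix-vector), and dgemm (matrix-matrix). This is a minimal sketch that assumes linking against a BLAS implementation such as Intel MKL; the program name, variable names, and data are hypothetical.

```fortran
program blas_levels
  implicit none
  integer, parameter :: n = 2
  double precision :: x(n), y(n), z(n), a(n,n), b(n,n), c(n,n), d
  double precision, external :: ddot   ! Level 1 function
  external :: dgemv, dgemm             ! Level 2 and Level 3 subroutines

  x = [1.0d0, 2.0d0]
  y = [3.0d0, 4.0d0]
  z = 0.0d0
  ! Fortran stores matrices column by column, so this is a = [1 2; 3 4].
  a = reshape([1.0d0, 3.0d0, 2.0d0, 4.0d0], [n, n])
  ! b is the 2-by-2 identity matrix.
  b = reshape([1.0d0, 0.0d0, 0.0d0, 1.0d0], [n, n])
  c = 0.0d0

  ! Level 1: dot product d = x . y = 1*3 + 2*4 = 11
  d = ddot(n, x, 1, y, 1)

  ! Level 2: z := 1.0*a*x + 0.0*z, which gives z = (5, 11)
  call dgemv('N', n, n, 1.0d0, a, n, x, 1, 0.0d0, z, 1)

  ! Level 3: c := 1.0*a*b + 0.0*c, which reproduces a because b is the identity
  call dgemm('N', 'N', n, n, n, 1.0d0, a, n, b, n, 0.0d0, c, n)

  print *, 'ddot  :', d
  print *, 'dgemv :', z
  print *, 'dgemm :', c
end program blas_levels
```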
Starting from release 8.0, Intel MKL also supports the Fortran 95 interface to the BLAS routines.
Starting from release 10.1, a number of BLAS-like Extensions are added to enable the user to perform
certain data manipulation, including matrix in-place and out-of-place transposition operations combined with
simple matrix arithmetic operations.
Sparse BLAS Routines

The Sparse BLAS Level 1 routines and functions and the Sparse BLAS Level 2 and Level 3 routines operate on
sparse vectors and matrices. These routines perform operations similar to the BLAS Level 1, 2, and 3
routines. The Sparse BLAS routines take advantage of vector and matrix sparsity: they allow you to store
only the non-zero elements of vectors and matrices. Intel MKL also supports the Fortran 95 interface to
Sparse BLAS routines.
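To make the storage saving concrete, the following sketch keeps only the non-zero elements of a sparse vector, together with their indices, and forms its dot product with a full vector. It is plain Fortran written for illustration and does not call Intel MKL; Sparse BLAS Level 1 routines such as ?doti operate on this kind of compressed representation. All names and values here are hypothetical.

```fortran
program sparse_dot_sketch
  implicit none
  integer, parameter :: n = 6    ! length of the full vectors
  integer, parameter :: nz = 2   ! number of non-zero elements of x
  double precision :: xval(nz), y(n), d
  integer :: indx(nz), i

  ! The full vector x = (0, 5, 0, 0, 7, 0) is stored as values plus indices.
  xval = [5.0d0, 7.0d0]
  indx = [2, 5]
  y = [1.0d0, 2.0d0, 3.0d0, 4.0d0, 5.0d0, 6.0d0]

  ! Only the nz stored elements contribute to the dot product.
  d = 0.0d0
  do i = 1, nz
     d = d + xval(i) * y(indx(i))
  end do

  print *, 'sparse dot product:', d   ! 5*2 + 7*5 = 45
end program sparse_dot_sketch
```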
LAPACK Routines
The Intel Math Kernel Library fully supports the LAPACK 3.6 set of computational, driver, auxiliary and utility
routines.
The original versions of LAPACK from which that part of Intel MKL was derived can be obtained from
http://www.netlib.org/lapack/index.html. The authors of LAPACK are E. Anderson, Z. Bai, C. Bischof, S. Blackford,
J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen.
The LAPACK routines can be divided into the following groups according to the operations they perform:
- Routines for solving systems of linear equations, factoring and inverting matrices, and estimating
condition numbers (see LAPACK Routines: Linear Equations).
- Routines for solving least squares problems, eigenvalue and singular value problems, and Sylvester's
equations (see LAPACK Routines: Least Squares and Eigenvalue Problems).
- Auxiliary and utility routines used to perform certain subtasks, common low-level computation or related
tasks (see LAPACK Auxiliary Routines and LAPACK Utility Functions and Routines).
Starting from release 8.0, Intel MKL also supports the Fortran 95 interface to LAPACK computational and
driver routines. This interface provides an opportunity for simplified calls of LAPACK routines with fewer
required arguments.
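A short sketch of the first group of routines: the driver dgesv factors a small matrix and solves a linear system in one call. It assumes linking against a LAPACK implementation such as Intel MKL; the program name and data are hypothetical. With the Fortran 95 interface the same solve is usually written with fewer arguments (something like call gesv(a, b) after the appropriate USE statement); treat that spelling as an assumption and confirm it against the Fortran 95 interface conventions for LAPACK.

```fortran
program solve_linear_system
  implicit none
  integer, parameter :: n = 2, nrhs = 1
  double precision :: a(n,n), b(n,nrhs)
  integer :: ipiv(n), info
  external :: dgesv   ! provided by the linked LAPACK library

  ! Column-major storage: a = [2 1; 1 3], right-hand side b = (3, 5).
  a = reshape([2.0d0, 1.0d0, 1.0d0, 3.0d0], [n, n])
  b(:, 1) = [3.0d0, 5.0d0]

  ! On exit, a holds the LU factors, ipiv the pivot indices,
  ! and b is overwritten with the solution.
  call dgesv(n, nrhs, a, n, ipiv, b, n, info)

  if (info /= 0) then
     print *, 'dgesv reported info =', info
  else
     print *, 'solution:', b(:, 1)   ! mathematically x = (0.8, 1.4)
  end if
end program solve_linear_system
```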
ScaLAPACK Routines
The ScaLAPACK package (provided only for Intel 64 and Intel Many Integrated Core architectures, see
Chapter 6) runs on distributed-memory architectures and includes routines for solving systems of linear
equations, solving linear least squares problems, eigenvalue and singular value problems, as well as
performing a number of related computational tasks.
The original versions of ScaLAPACK from which that part of Intel MKL was derived can be obtained from
http://www.netlib.org/scalapack/index.html. The authors of ScaLAPACK are L. Blackford, J. Choi, A. Cleary, E.
D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and
R. Whaley.
The Intel MKL version of ScaLAPACK is optimized for Intel processors and uses the MPICH implementation of MPI as
well as Intel MPI.
PBLAS Routines
The PBLAS routines perform operations with distributed vectors and matrices.
- PBLAS Level 1 Routines perform operations of both addition and reduction on vectors of data. Typical
operations include scaling and dot products.
- PBLAS Level 2 Routines perform distributed matrix-vector operations, such as matrix-vector multiplication,
rank-1 and rank-2 matrix updates, and solution of triangular systems.
- PBLAS Level 3 Routines perform distributed matrix-matrix operations, such as matrix-matrix
multiplication, rank-k update, and solution of triangular systems.
Intel MKL provides the PBLAS routines with an interface similar to the interface used in the Netlib PBLAS (part
of the ScaLAPACK package, see http://www.netlib.org/scalapack/html/pblas_qref.html).
Sparse Solver Routines

Direct sparse solver routines in Intel MKL (see Chapter 8) solve systems of linear equations with symmetric and
symmetrically structured sparse matrices with real or complex coefficients. For symmetric matrices, these Intel
MKL subroutines can solve both positive-definite and indefinite systems. Intel MKL includes a solver based on the
PARDISO* sparse solver, referred to as Intel MKL PARDISO, as well as an alternative set of user-callable direct
sparse solver routines.
If you use the Intel MKL PARDISO sparse solver, please cite:
O. Schenk and K. Gartner. Solving unsymmetric sparse systems of linear equations with PARDISO. J. of Future
Generation Computer Systems, 20(3):475-487, 2004.
Intel MKL also provides an iterative sparse solver (see Chapter 8) that uses Sparse BLAS Level 2 and Level 3
routines and works with different sparse data formats.
Extended Eigensolver Routines

The Extended Eigensolver RCI Routines are a set of high-performance numerical routines for solving standard
(Ax = λx) and generalized (Ax = λBx) eigenvalue problems, where A and B are symmetric or Hermitian. They
yield all the eigenvalues and eigenvectors within a given search interval. The routines are based on the Feast
algorithm, an innovative fast and stable numerical algorithm presented in [Polizzi09], which deviates fundamentally
from the traditional Krylov subspace iteration based techniques (Arnoldi and Lanczos algorithms [Bai00]) or
other Davidson-Jacobi techniques [Sleijpen96]. The Feast algorithm is inspired by the density-matrix
representation and contour integration technique in quantum mechanics.
It is free from orthogonalization procedures. Its main computational tasks consist of solving very few inner
independent linear systems with multiple right-hand sides and one reduced eigenvalue problem orders of
magnitude smaller than the original one. The Feast algorithm combines simplicity and efficiency and offers
many important capabilities for achieving high performance, robustness, accuracy, and scalability on parallel
architectures. This algorithm is expected to significantly augment numerical performance in large-scale
modern applications.
Some of the characteristics of the Feast algorithm [Polizzi09] are:
- Converges quickly in 2-3 iterations with very high accuracy
- Naturally captures all eigenvalue multiplicities
- No explicit orthogonalization procedure
- Can reuse the basis of a pre-computed subspace as a suitable initial guess for performing outer-refinement
iterations. This capability can also be used for solving a series of eigenvalue problems that are close to
one another.
- The number of internal iterations is independent of the size of the system and the number of eigenpairs in
the search interval
- The inner linear systems can be solved either iteratively (even with modest relative residual error) or
directly
VM Functions
The Vector Mathematics functions (see Chapter 9) include a set of highly optimized implementations of
certain computationally expensive core mathematical functions (power, trigonometric, exponential,
hyperbolic, etc.) that operate on vectors of real and complex numbers.
Application programs that might gain significant performance from VM include nonlinear programming
software, computation of integrals, and many others. VM provides interfaces for both Fortran and C.
Statistical Functions
Vector Statistics (VS) contains three sets of functions (see Chapter 10) providing:
- Pseudorandom, quasi-random, and non-deterministic random number generator subroutines
implementing basic continuous and discrete distributions. To provide best performance, the VS
subroutines use calls to highly optimized Basic Random Number Generators (BRNGs) and a set of vector
mathematical functions.
- A wide variety of convolution and correlation operations.
- Initial statistical analysis of raw single and double precision multi-dimensional datasets.
Fourier Transform Functions

The Intel MKL multidimensional Fast Fourier Transform (FFT) functions with mixed radix support (see
Chapter 11) provide uniformity of discrete Fourier transform computation and combine functionality with
ease of use. Both Fortran and C interface specifications are given. There is also a cluster version of FFT
functions, which runs on distributed-memory architectures and is provided only for Intel 64 and Intel Many
Integrated Core architectures.
The FFT functions provide fast computation via the FFT algorithms for arbitrary lengths. See the Intel MKL
Developer Guide for the specific radices supported.
Partial Differential Equations Support

Intel MKL provides tools for solving Partial Differential Equations (PDE) (see Chapter 13). These tools are
the Trigonometric Transform interface routines and the Poisson Solver.
The Trigonometric Transform routines may be helpful to users who implement their own solvers similar to the
Intel MKL Poisson Solver. The users can improve performance of their solvers by using fast sine, cosine, and
staggered cosine transforms implemented in the Trigonometric Transform interface.
The Poisson Solver is designed for fast solving of simple Helmholtz, Poisson, and Laplace problems. The
Trigonometric Transform interface, which underlies the solver, is based on the Intel MKL FFT interface (refer
to Chapter 11), optimized for Intel processors.
Nonlinear Optimization Problem Solvers

Intel MKL provides Nonlinear Optimization Problem Solver routines (see Chapter 14) that can be used to
solve nonlinear least squares problems with or without linear (bound) constraints through the Trust-Region
(TR) algorithms and to compute the Jacobian matrix by central differences.
Support Functions
The Intel MKL support functions (see Chapter 15) are used to support the operation of the Intel MKL
software and provide basic information on the library and library operation, such as the current library
version, timing, setting and measuring of CPU frequency, error handling, and memory allocation.
Starting from release 10.0, the Intel MKL support functions provide additional threading control.
Starting from release 10.1, Intel MKL selectively supports a Progress Routine feature to track progress of a
lengthy computation and/or interrupt the computation using a callback function mechanism. The user
application can define a function called mkl_progress that is regularly called from the Intel MKL routine
supporting the progress routine feature. See the Progress Routines section in Chapter 15 for reference. Refer
to a specific LAPACK or DSS/PARDISO function description to see whether the function supports this feature
or not.
BLACS Routines
The Intel Math Kernel Library implements routines from the BLACS (Basic Linear Algebra Communication
Subprograms) package (see Chapter 16) that are used to support a linear algebra oriented message passing
interface that may be implemented efficiently and uniformly across a large range of distributed memory
platforms.
The original versions of BLACS from which that part of Intel MKL was derived can be obtained from
http://www.netlib.org/blacs/index.html. The authors of BLACS are Jack Dongarra and R. Clint Whaley.
Data Fitting Functions

The Data Fitting component includes a set of highly-optimized implementations of algorithms for the
following spline-based computations:
spline construction
interpolation including computation of derivatives and integration
search
The algorithms operate on single- and double-precision vector-valued functions defined at the points of a given partition.
You can use Data Fitting algorithms in applications that are based on data approximation. See Data Fitting
Functions for more information.
Optimization Notice
Performance Enhancements
The Intel Math Kernel Library has been optimized by exploiting both processor and system features and
capabilities. Special care has been given to those routines that most profit from cache-management
techniques. These especially include matrix-matrix operation routines such as dgemm().
In addition, code optimization techniques have been applied so that the scheduling of the integer and
floating-point units depends as little as possible on results still being computed within the processor.
The major optimization techniques used throughout the library include:
Loop unrolling to minimize loop management costs

Blocking of data to improve data reuse opportunities
Copying to reduce chances of data eviction from cache
Data prefetching to help hide memory latency
Multiple simultaneous operations (for example, dot products in dgemm) to eliminate stalls due to
arithmetic unit pipelines
Use of hardware features such as the SIMD arithmetic units, where appropriate
These are techniques from which the arithmetic code benefits the most.
Parallelism
In addition to the performance enhancements discussed above, Intel MKL offers performance gains through
parallelism provided by symmetric multiprocessing (SMP). You can obtain
improvements from SMP in the following ways:
One way is based on user-managed threads in the program and further distribution of the operations over
the threads based on data decomposition, domain decomposition, control decomposition, or some other
parallelizing technique. Each thread can use any of the Intel MKL functions (except for the deprecated
?lacon LAPACK routine) because the library has been designed to be thread-safe.
Another method is to use the FFT and BLAS level 3 routines. They have been parallelized and require no
alterations of your application to gain the performance enhancements of multiprocessing. Performance
using multiple processors on the level 3 BLAS shows excellent scaling. Because the threads are created and
managed within the library, the application does not need to be recompiled as thread-safe.
Yet another method is to use tuned LAPACK routines. Currently these include the single- and
double-precision flavors of routines for QR factorization of general matrices, triangular factorization of general
and symmetric positive-definite matrices, solving systems of equations with such matrices, as well as
solving symmetric eigenvalue problems.
For instructions on setting the number of available processors for the BLAS level 3 and LAPACK routines, see
Intel MKL Developer Guide.
Chapter 2: BLAS and Sparse BLAS Routines
This chapter describes the Intel Math Kernel Library implementation of the BLAS and Sparse BLAS routines,
and BLAS-like extensions. The routine descriptions are arranged in several sections:
BLAS Level 1 Routines (vector-vector operations)
BLAS Level 2 Routines (matrix-vector operations)
BLAS Level 3 Routines (matrix-matrix operations)
Sparse BLAS Level 1 Routines (vector-vector operations).
Sparse BLAS Level 2 and Level 3 Routines (matrix-vector and matrix-matrix operations)
BLAS-like Extensions
Each section presents the routine and function group descriptions in alphabetical order by routine or function
group name; for example, the ?asum group, the ?axpy group. The question mark in the group name
corresponds to different character codes indicating the data type (s, d, c, and z or their combination); see
Routine Naming Conventions.
When BLAS or Sparse BLAS routines encounter an error, they call the error reporting routine xerbla.
In BLAS Level 1 groups i?amax and i?amin, an "i" is placed before the data-type indicator and corresponds
to the index of an element in the vector. These groups are placed at the end of the BLAS Level 1 section.
BLAS Routines
Naming Conventions for BLAS Routines

BLAS routine names have the following structure:
<character> <name> <mod> ( )
The <character> field indicates the data type:
s real, single precision
c complex, single precision
d real, double precision
z complex, double precision
Some routines and functions can have combined character codes, such as sc or dz.
For example, the function scasum uses a complex input array and returns a real value.
The <name> field, in BLAS level 1, indicates the operation type. For example, the BLAS level 1 routines
?dot, ?rot, ?swap compute a vector dot product, vector rotation, and vector swap, respectively.
In BLAS level 2 and 3, <name> reflects the matrix argument type:
ge general matrix
gb general band matrix
sy symmetric matrix
sp symmetric matrix (packed storage)
sb symmetric band matrix
he Hermitian matrix
hp Hermitian matrix (packed storage)
hb Hermitian band matrix
tr triangular matrix
tp triangular matrix (packed storage)
tb triangular band matrix.
The <mod> field, if present, provides additional details of the operation. BLAS level 1 names can have the
following characters in the <mod> field:
c conjugated vector
u unconjugated vector
g Givens rotation construction
m modified Givens rotation

mg modified Givens rotation construction
BLAS level 2 names can have the following characters in the <mod> field:
mv matrix-vector product
sv solving a system of linear equations with a single unknown vector
r rank-1 update of a matrix
r2 rank-2 update of a matrix.
BLAS level 3 names can have the following characters in the <mod> field:
mm matrix-matrix product
sm solving a system of linear equations with multiple unknown vectors
rk rank-k update of a matrix
r2k rank-2k update of a matrix.
The examples below illustrate how to interpret BLAS routine names:
ddot <d> <dot>: double-precision real vector-vector dot product
cdotc <c> <dot> <c>: complex vector-vector dot product, conjugated
scasum <sc> <asum>: sum of magnitudes of vector elements, single precision real
output and single precision complex input
cdotu <c> <dot> <u>: vector-vector dot product, unconjugated, complex
sgemv <s> <ge> <mv>: matrix-vector product, general matrix, single precision
ztrmm <z> <tr> <mm>: matrix-matrix product, triangular matrix, double-precision
complex.
Sparse BLAS level 1 naming conventions are similar to those of BLAS level 1. For more information, see
Naming Conventions.
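The naming structure described above can be sketched as a small decoder. This is an illustrative Python sketch, not part of Intel MKL: the function decode_blas_name and its tables are hypothetical, and it handles only Level 2 and Level 3 names built from the single-letter data type codes and the matrix and <mod> codes listed in this section.

```python
# Hypothetical helper (not an Intel MKL API): split a Level 2/3 BLAS routine
# name into the <character>, <name>, and <mod> fields listed in this section.
TYPES = {
    "s": "real, single precision",
    "d": "real, double precision",
    "c": "complex, single precision",
    "z": "complex, double precision",
}
MATRICES = {
    "ge": "general matrix", "gb": "general band matrix",
    "sy": "symmetric matrix", "sp": "symmetric matrix (packed storage)",
    "sb": "symmetric band matrix", "he": "Hermitian matrix",
    "hp": "Hermitian matrix (packed storage)", "hb": "Hermitian band matrix",
    "tr": "triangular matrix", "tp": "triangular matrix (packed storage)",
    "tb": "triangular band matrix",
}
OPERATIONS = {
    "mv": "matrix-vector product",
    "sv": "solving a system with a single unknown vector",
    "r": "rank-1 update", "r2": "rank-2 update",
    "mm": "matrix-matrix product",
    "sm": "solving a system with multiple unknown vectors",
    "rk": "rank-k update", "r2k": "rank-2k update",
}


def decode_blas_name(routine):
    """Return (data type, matrix type, operation) for names such as 'sgemv'."""
    routine = routine.lower()
    char, name, mod = routine[:1], routine[1:3], routine[3:]
    if char not in TYPES or name not in MATRICES or mod not in OPERATIONS:
        raise ValueError("name not covered by this sketch: " + routine)
    return TYPES[char], MATRICES[name], OPERATIONS[mod]
```

For example, decode_blas_name("ztrmm") yields the double-precision complex type, the triangular matrix argument, and the matrix-matrix product, matching the last example in the list above. Names with other <mod> suffixes, such as cgerc, are rejected by this sketch.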
Fortran 95 Interface Conventions for BLAS Routines

The Fortran 95 interface to BLAS and Sparse BLAS Level 1 routines is implemented through wrappers that call
the respective FORTRAN 77 routines. This interface uses such features of Fortran 95 as assumed-shape arrays
and optional arguments to provide simplified calls to BLAS and Sparse BLAS Level 1 routines with fewer
parameters.
NOTE
For BLAS, Intel MKL offers two types of Fortran 95 interfaces:
using mkl_blas.fi only through the include 'mkl.fi' statement. Such interfaces allow you to
make use of the original BLAS routines with all their arguments
using blas.f90, which includes improved interfaces. This file is used to generate the module files
blas95.mod and f95_precision.mod. See also the section "Fortran 95 interfaces and wrappers to
LAPACK and BLAS" of the Intel MKL Developer Guide for details. The module files are used to process
the Fortran use statements referencing the BLAS interface: use blas95 and use f95_precision.
The main conventions used in Fortran 95 interface are as follows:
The names of parameters used in Fortran 95 interface are typically the same as those used for the
respective generic (FORTRAN 77) interface. In rare cases formal argument names may be different.
Some input parameters such as array dimensions are not required in Fortran 95 and are skipped from the
calling sequence. Array dimensions are reconstructed from the user data that must exactly follow the
required array shape.
A parameter can be skipped if its value is completely defined by the presence or absence of another
parameter in the calling sequence, and the restored value is the only meaningful value for the skipped
parameter.
Parameters specifying the increment values incx and incy are skipped. In most cases their values are
equal to 1. In Fortran 95, a different increment can be specified directly in the
corresponding array argument, for example by passing a strided array section.
Some generic parameters are declared as optional in Fortran 95 interface and may or may not be present
in the calling sequence. A parameter can be declared optional if it satisfies one of the following conditions:
1. It can take only a few possible values. The default value of such parameter typically is the first value in
the list; all exceptions to this rule are explicitly stated in the routine description.
2. It has a natural default value.
Optional parameters are given in square brackets in Fortran 95 call syntax.
The particular rules used for reconstructing the values of omitted optional parameters are specific for each
routine and are detailed in the respective "Fortran 95 Notes" subsection at the end of routine specification
section. If this subsection is omitted, the Fortran 95 interface for the given routine does not differ from the
corresponding FORTRAN 77 interface.
Note that this interface is not implemented in the current version of Sparse BLAS Level 2 and Level 3
routines.
Matrix Storage Schemes for BLAS Routines

Matrix arguments of BLAS routines can use the following storage schemes:
Full storage: a matrix A is stored in a two-dimensional array a, with the matrix element Aij stored in the
array element a(i,j).
Packed storage: symmetric, Hermitian, or triangular matrices can be stored more compactly;
the upper or lower triangle of the matrix is packed by columns in a one-dimensional array.
Band storage: a band matrix is stored compactly in a two-dimensional array: columns of the matrix are
stored in the corresponding columns of the array, and diagonals of the matrix are stored in rows of the
array.
For more information on matrix storage schemes, see Matrix Arguments in Appendix B.
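The packed and band schemes can be illustrated with a short Python sketch. The helper names pack_upper and to_band are hypothetical and use 0-based indexing; the band row offset ku + i - j follows the conventional BLAS layout, which in 1-based Fortran terms places A(i,j) in array element (ku+1+i-j, j). Treat this as an illustration of the layouts, not as Intel MKL code.

```python
def pack_upper(a):
    """Pack the upper triangle of a square matrix by columns into a 1-D list."""
    n = len(a)
    return [a[i][j] for j in range(n) for i in range(j + 1)]


def to_band(a, kl, ku):
    """Store an m-by-n band matrix in a (kl+ku+1)-by-n array.

    Column j of the matrix stays in column j of the array; the element on
    diagonal (i - j) is stored in row ku + i - j (0-based).
    """
    m, n = len(a), len(a[0])
    ab = [[0] * n for _ in range(kl + ku + 1)]
    for j in range(n):
        # Only rows inside the band of column j are stored.
        for i in range(max(0, j - ku), min(m, j + kl + 1)):
            ab[ku + i - j][j] = a[i][j]
    return ab
```

For a tridiagonal 3-by-3 matrix with kl = ku = 1, to_band returns three rows holding the super-diagonal, the main diagonal, and the sub-diagonal, with unused corner positions left at zero.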
BLAS Level 1 Routines and Functions

BLAS Level 1 includes routines and functions that perform vector-vector operations. The table BLAS Level 1
Routine and Function Groups and Their Data Types lists the BLAS Level 1 routine and function groups and the
data types associated with them.
BLAS Level 1 Routine and Function Groups and Their Data Types
Routine or Function Group   Data Types          Description
?asum                       s, d, sc, dz        Sum of vector magnitudes (functions)
?axpy                       s, d, c, z          Scalar-vector product (routines)
?copy                       s, d, c, z          Copy vector (routines)
?dot                        s, d                Dot product (functions)
?sdot                       sd, d               Dot product with double precision (functions)
?dotc                       c, z                Dot product conjugated (functions)
?dotu                       c, z                Dot product unconjugated (functions)
?nrm2                       s, d, sc, dz        Vector 2-norm (Euclidean norm) (functions)
?rot                        s, d, cs, zd        Plane rotation of points (routines)
?rotg                       s, d, c, z          Generate Givens rotation of points (routines)
?rotm                       s, d                Modified Givens plane rotation of points (routines)
?rotmg                      s, d                Generate modified Givens plane rotation of points (routines)
?scal                       s, d, c, z, cs, zd  Vector-scalar product (routines)
?swap                       s, d, c, z          Vector-vector swap (routines)
i?amax                      s, d, c, z          Index of the maximum absolute value element of a vector (functions)
i?amin                      s, d, c, z          Index of the minimum absolute value element of a vector (functions)
?cabs1                      s, d                Auxiliary functions, compute the absolute value of a complex number of single or double precision
?asum
Computes the sum of magnitudes of the vector
elements.
Syntax
res = sasum(n, x, incx)
res = scasum(n, x, incx)
res = dasum(n, x, incx)
res = dzasum(n, x, incx)
res = asum(x)
Include Files
mkl.fi, blas.f90
Description
The ?asum routine computes the sum of the magnitudes of elements of a real vector, or the sum of
magnitudes of the real and imaginary parts of elements of a complex vector:
res = |Re(x(1))| + |Im(x(1))| + |Re(x(2))| + |Im(x(2))| + ... + |Re(x(n))| + |Im(x(n))|,
where x is a vector with n elements.
Input Parameters
n INTEGER. Specifies the number of elements in vector x.
x REAL for sasum

DOUBLE PRECISION for dasum
COMPLEX for scasum
DOUBLE COMPLEX for dzasum
Array, size at least (1 + (n-1)*abs(incx)).
incx INTEGER. Specifies the increment for indexing vector x.
Output Parameters
res REAL for sasum

DOUBLE PRECISION for dasum
REAL for scasum
DOUBLE PRECISION for dzasum
Contains the sum of magnitudes of real and imaginary parts of all elements
of the vector.
BLAS 95 Interface Notes

Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or reconstructible arguments, see BLAS 95
Interface Conventions.
Specific details for the routine asum interface are the following:
x Holds the array of size n.
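The role of n and incx in ?asum can be sketched in Python. The function below is a hypothetical reference model, not Intel MKL code; it uses 0-based indexing and mirrors the Netlib reference behavior of returning zero when n or incx is not positive, which is an assumption as far as Intel MKL is concerned.

```python
def asum(n, x, incx=1):
    """Sum |Re| + |Im| over n elements of x taken with stride incx."""
    if n <= 0 or incx <= 0:
        # Assumption: same early exit as the Netlib reference routines.
        return 0.0
    total = 0.0
    for k in range(n):
        value = complex(x[k * incx])  # real inputs simply have a zero imaginary part
        total += abs(value.real) + abs(value.imag)
    return total
```

With incx = 2 the function reads every second element, which is why the array must hold at least 1 + (n-1)*abs(incx) entries.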
?axpy
Computes a vector-scalar product and adds the result
to a vector.
Syntax
call saxpy(n, a, x, incx, y, incy)
call daxpy(n, a, x, incx, y, incy)
call caxpy(n, a, x, incx, y, incy)
call zaxpy(n, a, x, incx, y, incy)
call axpy(x, y [,a])
Include Files
mkl.fi, blas.f90
Description
The ?axpy routines perform a vector-vector operation defined as
y := a*x + y
where:
a is a scalar
x and y are vectors each with a number of elements that equals n.
Input Parameters
n INTEGER. Specifies the number of elements in vectors x and y.
a REAL for saxpy

DOUBLE PRECISION for daxpy
COMPLEX for caxpy
DOUBLE COMPLEX for zaxpy
Specifies the scalar a.
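x REAL for saxpy
DOUBLE PRECISION for daxpy
COMPLEX for caxpy
DOUBLE COMPLEX for zaxpy
Array, size at least (1 + (n-1)*abs(incx)).
incx INTEGER. Specifies the increment for the elements of x.
y REAL for saxpy
DOUBLE PRECISION for daxpy
COMPLEX for caxpy
DOUBLE COMPLEX for zaxpy
Array, size at least (1 + (n-1)*abs(incy)).
incy INTEGER. Specifies the increment for the elements of y.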
Output Parameters
y Contains the updated vector y.

BLAS 95 Interface Notes
Specific details for the routine axpy interface are the following:
x Holds the array of size n.
y Holds the array of size n.
a The default value is 1.
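The update y := a*x + y with increments can be sketched as follows. This Python function is a hypothetical reference model with 0-based indexing, not Intel MKL code. It assumes the reference BLAS convention that a negative increment walks the array backwards, starting from offset (1-n)*inc.

```python
def axpy(n, a, x, incx, y, incy):
    """Update y in place with y := a*x + y over n elements; returns y."""
    if n <= 0:
        return y
    # Assumption: negative increments start at the far end of the array,
    # as in the reference BLAS implementation.
    ix = 0 if incx >= 0 else (1 - n) * incx
    iy = 0 if incy >= 0 else (1 - n) * incy
    for _ in range(n):
        y[iy] += a * x[ix]
        ix += incx
        iy += incy
    return y
```

Calling axpy(3, 1, [1, 2, 3], -1, [0, 0, 0], 1) therefore adds x in reverse order to y.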
?copy
Copies vector to another vector.
Syntax
call scopy(n, x, incx, y, incy)
call dcopy(n, x, incx, y, incy)
call ccopy(n, x, incx, y, incy)
call zcopy(n, x, incx, y, incy)
call copy(x, y)
Include Files
mkl.fi, blas.f90
Description
The ?copy routines perform a vector-vector operation defined as
y = x,
where x and y are vectors.
Input Parameters
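n INTEGER. Specifies the number of elements in vectors x and y.
x REAL for scopy
DOUBLE PRECISION for dcopy
COMPLEX for ccopy
DOUBLE COMPLEX for zcopy
Array, size at least (1 + (n-1)*abs(incx)).
incx INTEGER. Specifies the increment for the elements of x.
y REAL for scopy
DOUBLE PRECISION for dcopy
COMPLEX for ccopy
DOUBLE COMPLEX for zcopy
Array, size at least (1 + (n-1)*abs(incy)).
incy INTEGER. Specifies the increment for the elements of y.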
Output Parameters
y Contains a copy of the vector x if n is positive. Otherwise, parameters are
unaltered.

BLAS 95 Interface Notes
Specific details for the routine copy interface are the following:
x Holds the vector with the number of elements n.
y Holds the vector with the number of elements n.
?dot
Computes a vector-vector dot product.
Syntax
res = sdot(n, x, incx, y, incy)
res = ddot(n, x, incx, y, incy)
res = dot(x, y)
Include Files
mkl.fi, blas.f90
Description
The ?dot routines perform a vector-vector reduction operation defined as
res = \sum_{i=1}^{n} x_i y_i
where x_i and y_i are elements of vectors x and y.
Input Parameters
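n INTEGER. Specifies the number of elements in vectors x and y.
x REAL for sdot
DOUBLE PRECISION for ddot
Array, size at least (1+(n-1)*abs(incx)).
incx INTEGER. Specifies the increment for the elements of x.
y REAL for sdot
DOUBLE PRECISION for ddot
Array, size at least (1+(n-1)*abs(incy)).
incy INTEGER. Specifies the increment for the elements of y.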
Output Parameters
res REAL for sdot
DOUBLE PRECISION for ddot
Contains the result of the dot product of x and y, if n is positive. Otherwise,
res contains 0.

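BLAS 95 Interface Notes
Specific details for the routine dot interface are the following:
x Holds the vector with the number of elements n.
y Holds the vector with the number of elements n.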
?sdot
Computes a vector-vector dot product with double
precision.
Syntax
res = sdsdot(n, sb, sx, incx, sy, incy)
res = dsdot(n, sx, incx, sy, incy)
res = sdot(sx, sy)
res = sdot(sx, sy, sb)
Include Files
mkl.fi, blas.f90
Description
The ?sdot routines compute the inner product of two vectors with double precision. Both routines use double
precision accumulation of the intermediate results, but the sdsdot routine outputs the final result in single
precision, whereas the dsdot routine outputs the double precision result. The function sdsdot also adds
scalar value sb to the inner product.
Input Parameters
n INTEGER. Specifies the number of elements in the input vectors sx and sy.
sb REAL. Single precision scalar to be added to inner product (for the function
sdsdot only).
sx, sy REAL.
Arrays, size at least (1+(n -1)*abs(incx)) and (1+(n-1)*abs(incy)),
respectively. Contain the input single precision vectors.
incx INTEGER. Specifies the increment for the elements of sx.
incy INTEGER. Specifies the increment for the elements of sy.
Output Parameters
res REAL for sdsdot

DOUBLE PRECISION for dsdot
Contains the result of the dot product of sx and sy (with sb added for
sdsdot), if n is positive. Otherwise, res contains sb for sdsdot and 0 for
dsdot.

BLAS 95 Interface Notes
Specific details for the routine sdot interface are the following:
sx Holds the vector with the number of elements n.
sy Holds the vector with the number of elements n.
NOTE
Note that scalar parameter sb is declared as a required parameter in Fortran 95 interface for the
function sdot to distinguish between function flavors that output final result in different precision.
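The difference between the two flavors can be sketched in Python, whose float type is double precision. The helper to_single and the functions below are hypothetical models of the arithmetic described above (unit increments only), not Intel MKL code. Rounding to single precision is emulated with the standard struct module.

```python
import struct


def to_single(value):
    """Round a double-precision float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def dsdot(sx, sy):
    """Accumulate in double precision and return the double-precision result."""
    return sum(float(a) * float(b) for a, b in zip(sx, sy))


def sdsdot(sb, sx, sy):
    """Accumulate in double precision, add sb, then round to single precision."""
    return to_single(float(sb) + dsdot(sx, sy))
```

For single-precision inputs whose product needs more than 24 significant bits, dsdot keeps the extra bits while sdsdot rounds them away at the end, so the two results differ only in the final rounding step.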
?dotc
Computes a dot product of a conjugated vector with
another vector.
Syntax
res = cdotc(n, x, incx, y, incy)
res = zdotc(n, x, incx, y, incy)
res = dotc(x, y)
Include Files
mkl.fi, blas.f90
Description
The ?dotc routines perform a vector-vector operation defined as:
res = \sum_{i=1}^{n} \operatorname{conjg}(x_i) \, y_i
where x_i and y_i are elements of vectors x and y.
Input Parameters
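n INTEGER. Specifies the number of elements in vectors x and y.
x COMPLEX for cdotc
DOUBLE COMPLEX for zdotc
Array, size at least (1 + (n -1)*abs(incx)).
incx INTEGER. Specifies the increment for the elements of x.
y COMPLEX for cdotc
DOUBLE COMPLEX for zdotc
Array, size at least (1 + (n -1)*abs(incy)).
incy INTEGER. Specifies the increment for the elements of y.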
Output Parameters
res COMPLEX for cdotc
DOUBLE COMPLEX for zdotc
Contains the result of the dot product of the conjugated x and unconjugated
y, if n is positive. Otherwise, it contains 0.

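BLAS 95 Interface Notes
Specific details for the routine dotc interface are the following:
x Holds the vector with the number of elements n.
y Holds the vector with the number of elements n.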
?dotu
Computes a vector-vector dot product.
Syntax
res = cdotu(n, x, incx, y, incy)
res = zdotu(n, x, incx, y, incy)

res = dotu(x, y)
Include Files
mkl.fi, blas.f90
Description
The ?dotu routines perform a vector-vector reduction operation defined as
res = \sum_{i=1}^{n} x_i y_i
where x_i and y_i are elements of complex vectors x and y.
Input Parameters
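n INTEGER. Specifies the number of elements in vectors x and y.
x COMPLEX for cdotu
DOUBLE COMPLEX for zdotu
Array, size at least (1 + (n -1)*abs(incx)).
incx INTEGER. Specifies the increment for the elements of x.
y COMPLEX for cdotu
DOUBLE COMPLEX for zdotu
Array, size at least (1 + (n -1)*abs(incy)).
incy INTEGER. Specifies the increment for the elements of y.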
Output Parameters
res COMPLEX for cdotu
DOUBLE COMPLEX for zdotu
Contains the result of the dot product of x and y, if n is positive. Otherwise,
it contains 0.
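The only difference between the conjugated (?dotc) and unconjugated (?dotu) dot products is whether the elements of x are conjugated before multiplication. The Python functions below are hypothetical reference models for unit increments, not Intel MKL code.

```python
def dotu(x, y):
    """Unconjugated dot product: sum of x_i * y_i."""
    return sum(a * b for a, b in zip(x, y))


def dotc(x, y):
    """Conjugated dot product: sum of conjg(x_i) * y_i."""
    return sum(a.conjugate() * b for a, b in zip(x, y))
```

A quick way to tell the two apart: dotc(x, x) is always real and non-negative (the sum of squared magnitudes), whereas dotu(x, x) is in general complex.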

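BLAS 95 Interface Notes
Specific details for the routine dotu interface are the following:
x Holds the vector with the number of elements n.
y Holds the vector with the number of elements n.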
?nrm2
Computes the Euclidean norm of a vector.
Syntax
res = snrm2(n, x, incx)
res = dnrm2(n, x, incx)
res = scnrm2(n, x, incx)
res = dznrm2(n, x, incx)
res = nrm2(x)
Include Files
mkl.fi, blas.f90
Description
The ?nrm2 routines perform a vector reduction operation defined as
res = ||x||,
where:
x is a vector,
res is a value containing the Euclidean norm of the elements of x.
Input Parameters
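n INTEGER. Specifies the number of elements in vector x.
x REAL for snrm2
DOUBLE PRECISION for dnrm2
COMPLEX for scnrm2
DOUBLE COMPLEX for dznrm2
Array, size at least (1 + (n -1)*abs (incx)).
incx INTEGER. Specifies the increment for the elements of x.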
Output Parameters
res REAL for snrm2

DOUBLE PRECISION for dnrm2
REAL for scnrm2
DOUBLE PRECISION for dznrm2
Contains the Euclidean norm of the vector x.

BLAS 95 Interface Notes
Specific details for the routine nrm2 interface are the following:
x Holds the vector with the number of elements n.
?rot
Performs rotation of points in the plane.
Syntax
call srot(n, x, incx, y, incy, c, s)
call drot(n, x, incx, y, incy, c, s)
call csrot(n, x, incx, y, incy, c, s)
call zdrot(n, x, incx, y, incy, c, s)
call rot(x, y, c, s)
Include Files
mkl.fi, blas.f90
Description
Given two vectors x and y, each vector element of these vectors is replaced as follows:
xi = c*xi + s*yi
yi = c*yi - s*xi
where both right-hand sides use the values of xi and yi on entry.
Input Parameters
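n INTEGER. Specifies the number of elements in vectors x and y.
x REAL for srot
DOUBLE PRECISION for drot
COMPLEX for csrot
DOUBLE COMPLEX for zdrot
Array, size at least (1 + (n-1)*abs(incx)).
incx INTEGER. Specifies the increment for the elements of x.
y REAL for srot
DOUBLE PRECISION for drot
COMPLEX for csrot
DOUBLE COMPLEX for zdrot
Array, size at least (1 + (n-1)*abs(incy)).
incy INTEGER. Specifies the increment for the elements of y.
c REAL for srot
DOUBLE PRECISION for drot
REAL for csrot
DOUBLE PRECISION for zdrot
A scalar.
s REAL for srot
DOUBLE PRECISION for drot
REAL for csrot
DOUBLE PRECISION for zdrot
A scalar.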
Output Parameters
x Each element is replaced by c*x + s*y.
y Each element is replaced by c*y - s*x.
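The element-wise replacement can be sketched in Python. The function rot below is a hypothetical reference model for unit increments, not Intel MKL code. The point it illustrates is that both updated values are computed from the original pair (x(i), y(i)).

```python
def rot(x, y, c, s):
    """Apply the plane rotation in place to the paired elements of x and y."""
    for i in range(len(x)):
        xi, yi = x[i], y[i]  # save the entry values; both updates use them
        x[i] = c * xi + s * yi
        y[i] = c * yi - s * xi
```

With c = 0 and s = 1, the point (1, 0) is mapped to (0, -1).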

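BLAS 95 Interface Notes
Specific details for the routine rot interface are the following:
x Holds the vector with the number of elements n.
y Holds the vector with the number of elements n.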
?rotg
Computes the parameters for a Givens rotation.
Syntax
call srotg(a, b, c, s)
call drotg(a, b, c, s)
call crotg(a, b, c, s)
call zrotg(a, b, c, s)
call rotg(a, b, c, s)
Include Files
mkl.fi, blas.f90
Description
Given the Cartesian coordinates (a, b) of a point, these routines return the parameters c, s, r, and z
associated with the Givens rotation. The parameters c and s define a unitary matrix such that:
\begin{bmatrix} c & s \\ -s & c \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} r \\ 0 \end{bmatrix}
The parameter z is defined such that if |a| > |b|, z is s; otherwise if c is not 0 z is 1/c; otherwise z is 1.
See a more accurate LAPACK version ?lartg.
Input Parameters
a REAL for srotg
DOUBLE PRECISION for drotg
COMPLEX for crotg
DOUBLE COMPLEX for zrotg
Provides the x-coordinate of the point p.
b REAL for srotg
DOUBLE PRECISION for drotg
COMPLEX for crotg
DOUBLE COMPLEX for zrotg
Provides the y-coordinate of the point p.
Output Parameters
a Contains the parameter r associated with the Givens rotation.
b Contains the parameter z associated with the Givens rotation.
c REAL for srotg
DOUBLE PRECISION for drotg
REAL for crotg
DOUBLE PRECISION for zrotg
Contains the parameter c associated with the Givens rotation.
s REAL for srotg
DOUBLE PRECISION for drotg
COMPLEX for crotg
DOUBLE COMPLEX for zrotg
Contains the parameter s associated with the Givens rotation.
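For the real flavors, the four returned parameters can be sketched as below. This Python function is a hypothetical model that follows the classic reference BLAS algorithm and the definition of z given in the description. It is not Intel MKL code, and Intel MKL may compute the same quantities differently.

```python
import math


def rotg(a, b):
    """Return (r, z, c, s) for real a and b."""
    roe = a if abs(a) > abs(b) else b  # r takes the sign of the larger input
    scale = abs(a) + abs(b)
    if scale == 0.0:
        return 0.0, 0.0, 1.0, 0.0
    # Scale before squaring to reduce the risk of overflow or underflow.
    r = math.copysign(scale * math.hypot(a / scale, b / scale), roe)
    c, s = a / r, b / r
    if abs(a) > abs(b):
        z = s
    elif c != 0.0:
        z = 1.0 / c
    else:
        z = 1.0
    return r, z, c, s
```

For (a, b) = (3, 4) the sketch gives r close to 5, with c*a + s*b equal to r and -s*a + c*b equal to zero up to rounding.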
?rotm
Performs modified Givens rotation of points in the
plane.
Syntax
call srotm(n, x, incx, y, incy, param)
call drotm(n, x, incx, y, incy, param)
call rotm(x, y, param)
Include Files
mkl.fi, blas.f90
Description
Given two vectors x and y, each vector element of these vectors is replaced as follows:
\begin{bmatrix} x_i \\ y_i \end{bmatrix} = H \begin{bmatrix} x_i \\ y_i \end{bmatrix}
for i=1 to n, where H is a modified Givens transformation matrix whose values are stored in the param(2)
through param(5) array. See discussion on the param argument.
Input Parameters
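n INTEGER. Specifies the number of elements in vectors x and y.
x REAL for srotm
DOUBLE PRECISION for drotm
Array, size at least (1 + (n-1)*abs(incx)).
incx INTEGER. Specifies the increment for the elements of x.
y REAL for srotm
DOUBLE PRECISION for drotm
Array, size at least (1 + (n-1)*abs(incy)).
incy INTEGER. Specifies the increment for the elements of y.
param REAL for srotm
DOUBLE PRECISION for drotm
Array, size 5.
The elements of the param array are:
param(1) contains a switch, flag. param(2-5) contain h11, h21, h12, and
h22, respectively, the components of the array H.
Depending on the values of flag, the components of H are set as follows:
flag = -1.0: H = \begin{bmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{bmatrix}
flag = 0.0: H = \begin{bmatrix} 1.0 & h_{12} \\ h_{21} & 1.0 \end{bmatrix}
flag = 1.0: H = \begin{bmatrix} h_{11} & 1.0 \\ -1.0 & h_{22} \end{bmatrix}
flag = -2.0: H = \begin{bmatrix} 1.0 & 0.0 \\ 0.0 & 1.0 \end{bmatrix}
In the last three cases, the matrix entries of 1.0, -1.0, and 0.0 are assumed
based on the value of flag and are not required to be set in the param
vector.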
Output Parameters
x Each element x(i) is replaced by h11*x(i) + h12*y(i).
y Each element y(i) is replaced by h21*x(i) + h22*y(i).
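How the flag in param(1) selects the entries of H can be sketched in Python. The functions h_matrix and rotm are hypothetical reference models for unit increments, not Intel MKL code. The param list is 0-based here, so param[0] is flag and param[1] through param[4] hold h11, h21, h12, and h22.

```python
def h_matrix(param):
    """Return (h11, h12, h21, h22) implied by the flag in param[0]."""
    flag, h11, h21, h12, h22 = param
    if flag == -1.0:
        return h11, h12, h21, h22  # all four entries are taken from param
    if flag == 0.0:
        return 1.0, h12, h21, 1.0  # unit diagonal is implied
    if flag == 1.0:
        return h11, 1.0, -1.0, h22  # off-diagonal entries 1.0 and -1.0 are implied
    if flag == -2.0:
        return 1.0, 0.0, 0.0, 1.0  # identity: the vectors are left unchanged
    raise ValueError("flag must be -1.0, 0.0, 1.0, or -2.0")


def rotm(x, y, param):
    """Apply H in place to each pair (x[i], y[i])."""
    h11, h12, h21, h22 = h_matrix(param)
    for i in range(len(x)):
        xi, yi = x[i], y[i]
        x[i] = h11 * xi + h12 * yi
        y[i] = h21 * xi + h22 * yi
```

Entries of param that are implied by the flag are ignored, so they may hold arbitrary values.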

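BLAS 95 Interface Notes
Specific details for the routine rotm interface are the following:
x Holds the vector with the number of elements n.
y Holds the vector with the number of elements n.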
?rotmg
Computes the parameters for a modified Givens
rotation.
Syntax
call srotmg(d1, d2, x1, y1, param)
call drotmg(d1, d2, x1, y1, param)
call rotmg(d1, d2, x1, y1, param)
Include Files
mkl.fi, blas.f90
Description
Given Cartesian coordinates (x1, y1) of an input vector, these routines compute the components of a
modified Givens transformation matrix H that zeros the y-component of the resulting vector:
\begin{bmatrix} x_1 \\ 0 \end{bmatrix} = H \begin{bmatrix} x_1 \sqrt{d_1} \\ y_1 \sqrt{d_2} \end{bmatrix}
Input Parameters
d1 REAL for srotmg
DOUBLE PRECISION for drotmg
Provides the scaling factor for the x-coordinate of the input vector.
d2 REAL for srotmg
DOUBLE PRECISION for drotmg
Provides the scaling factor for the y-coordinate of the input vector.
x1 REAL for srotmg
DOUBLE PRECISION for drotmg
Provides the x-coordinate of the input vector.
y1 REAL for srotmg
DOUBLE PRECISION for drotmg
Provides the y-coordinate of the input vector.
Output Parameters
d1 REAL for srotmg
DOUBLE PRECISION for drotmg
Provides the first diagonal element of the updated matrix.
d2 REAL for srotmg
DOUBLE PRECISION for drotmg
Provides the second diagonal element of the updated matrix.
x1 REAL for srotmg
DOUBLE PRECISION for drotmg
Provides the x-coordinate of the rotated vector before scaling.
param REAL for srotmg
DOUBLE PRECISION for drotmg
Array, size 5.
The elements of the param array are:
param(1) contains a switch, flag. The other array elements param(2-5)
contain the components of the array H: h11, h21, h12, and h22, respectively.
Depending on the values of flag, the components of H are set as follows:
flag = -1.0: H = \begin{bmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{bmatrix}
flag = 0.0: H = \begin{bmatrix} 1.0 & h_{12} \\ h_{21} & 1.0 \end{bmatrix}
flag = 1.0: H = \begin{bmatrix} h_{11} & 1.0 \\ -1.0 & h_{22} \end{bmatrix}
flag = -2.0: H = \begin{bmatrix} 1.0 & 0.0 \\ 0.0 & 1.0 \end{bmatrix}
In the last three cases, the matrix entries of 1.0, -1.0, and 0.0 are assumed
based on the value of flag and are not required to be set in the param
vector.
?scal
Computes the product of a vector by a scalar.
Syntax
call sscal(n, a, x, incx)
call dscal(n, a, x, incx)
call cscal(n, a, x, incx)

call zscal(n, a, x, incx)
call csscal(n, a, x, incx)
call zdscal(n, a, x, incx)
call scal(x, a)
Include Files
mkl.fi, blas.f90
Description
The ?scal routines perform a vector operation defined as
x = a*x
where:
a is a scalar, x is an n-element vector.
Input Parameters
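n INTEGER. Specifies the number of elements in vector x.
a REAL for sscal and csscal
DOUBLE PRECISION for dscal and zdscal
COMPLEX for cscal
DOUBLE COMPLEX for zscal
Specifies the scalar a.
x REAL for sscal
DOUBLE PRECISION for dscal
COMPLEX for cscal and csscal
DOUBLE COMPLEX for zscal and zdscal
Array, size at least (1 + (n-1)*abs(incx)).
incx INTEGER. Specifies the increment for the elements of x.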
Output Parameters
x Updated vector x.

BLAS 95 Interface Notes
Specific details for the routine scal interface are the following:
x Holds the vector with the number of elements n.
?swap
Swaps a vector with another vector.
Syntax
call sswap(n, x, incx, y, incy)
call dswap(n, x, incx, y, incy)
call cswap(n, x, incx, y, incy)
call zswap(n, x, incx, y, incy)
call swap(x, y)
Include Files
mkl.fi, blas.f90
Description
Given two vectors x and y, the ?swap routines return vectors y and x swapped, each replacing the other.
Input Parameters
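n INTEGER. Specifies the number of elements in vectors x and y.
x REAL for sswap
DOUBLE PRECISION for dswap
COMPLEX for cswap
DOUBLE COMPLEX for zswap
Array, size at least (1 + (n-1)*abs(incx)).
incx INTEGER. Specifies the increment for the elements of x.
y REAL for sswap
DOUBLE PRECISION for dswap
COMPLEX for cswap
DOUBLE COMPLEX for zswap
Array, size at least (1 + (n-1)*abs(incy)).
incy INTEGER. Specifies the increment for the elements of y.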
Output Parameters
x Contains the resultant vector x, that is, the input vector y.
y Contains the resultant vector y, that is, the input vector x.

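BLAS 95 Interface Notes
Specific details for the routine swap interface are the following:
x Holds the vector with the number of elements n.
y Holds the vector with the number of elements n.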
i?amax
Finds the index of the element with maximum
absolute value.
Syntax
index = isamax(n, x, incx)
index = idamax(n, x, incx)
index = icamax(n, x, incx)
index = izamax(n, x, incx)
index = iamax(x)
Include Files
mkl.fi, blas.f90
Description
Given a vector x, the i?amax functions return the position of the vector element x(i) that has the largest
absolute value for real flavors, or the largest sum |Re(x(i))|+|Im(x(i))| for complex flavors.
If n is not positive, 0 is returned.
If more than one vector element is found with the same largest absolute value, the index of the first one
encountered is returned.
Input Parameters
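n INTEGER. Specifies the number of elements in vector x.
x REAL for isamax
DOUBLE PRECISION for idamax
COMPLEX for icamax
DOUBLE COMPLEX for izamax
Array, size at least (1 + (n-1)*abs(incx)).
incx INTEGER. Specifies the increment for the elements of x.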
Output Parameters
index INTEGER. Contains the position of the vector element with the largest
absolute value, so that x(index) has the largest absolute value.

BLAS 95 Interface Notes
Functions and routines in Fortran 95 interface have fewer arguments in the calling sequence than their
FORTRAN 77 counterparts. For general conventions applied to skip redundant or reconstructible arguments,
see BLAS 95 Interface Conventions.
Specific details for the function iamax interface are the following:
x Holds the vector with the number of elements n.
i?amin
Finds the index of the element with the smallest
absolute value.
Syntax
index = isamin(n, x, incx)
index = idamin(n, x, incx)
index = icamin(n, x, incx)
index = izamin(n, x, incx)
index = iamin(x)
Include Files
mkl.fi, blas.f90
Description
Given a vector x, the i?amin functions return the position of the vector element x(i) that has the smallest
absolute value for real flavors, or the smallest sum |Re(x(i))|+|Im(x(i))| for complex flavors.
If n is not positive, 0 is returned.
If more than one vector element is found with the same smallest absolute value, the index of the first one
encountered is returned.
Input Parameters
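n INTEGER. On entry, n specifies the number of elements in vector x.
x REAL for isamin
DOUBLE PRECISION for idamin
COMPLEX for icamin
DOUBLE COMPLEX for izamin
Array, size at least (1 + (n-1)*abs(incx)).
incx INTEGER. Specifies the increment for the elements of x.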
Output Parameters
index INTEGER. Contains the position of the vector element with the smallest
absolute value, so that x(index) has the smallest absolute value.
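The selection rules for i?amax and i?amin (magnitude measured as |Re| + |Im| for complex flavors, first occurrence wins, 0 returned when n is not positive) can be sketched in Python. The functions below are hypothetical reference models, not Intel MKL code. They return a 1-based position within the n elements visited, and returning 0 for a non-positive incx is an assumption taken from the Netlib reference routines.

```python
def cabs1(z):
    """Magnitude used for comparisons: |Re(z)| + |Im(z)|."""
    z = complex(z)
    return abs(z.real) + abs(z.imag)


def _first_extreme(n, x, incx, better):
    if n <= 0 or incx <= 0:
        return 0
    best = 1  # 1-based position, as in the Fortran interface
    for k in range(1, n):
        # Strict comparison keeps the first of several equal candidates.
        if better(cabs1(x[k * incx]), cabs1(x[(best - 1) * incx])):
            best = k + 1
    return best


def iamax(n, x, incx=1):
    return _first_extreme(n, x, incx, lambda new, old: new > old)


def iamin(n, x, incx=1):
    return _first_extreme(n, x, incx, lambda new, old: new < old)
```

Note that for complex data the result can differ from a search by true modulus: 3+4i and 5 have the same modulus, but |Re| + |Im| gives 7 and 5, so iamax picks 3+4i.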

BLAS 95 Interface Notes
Functions and routines in Fortran 95 interface have fewer arguments in the calling sequence than their
FORTRAN 77 counterparts. For general conventions applied to skip redundant or reconstructible arguments,
see BLAS 95 Interface Conventions.
Specific details for the function iamin interface are the following:
x Holds the vector with the number of elements n.
?cabs1
Computes absolute value of complex number.
Syntax
res = scabs1(z)
res = dcabs1(z)
res = cabs1(z)
Include Files
mkl.fi, blas.f90
Description
The ?cabs1 is an auxiliary routine for a few BLAS Level 1 routines. This routine performs an operation
defined as
res=|Re(z)|+|Im(z)|,
where z is a scalar, and res is the sum of the absolute values of the real and imaginary parts of z.
Note that this value is not the complex modulus of z.
Input Parameters
z COMPLEX scalar for scabs1.

DOUBLE COMPLEX scalar for dcabs1.
Output Parameters
res REAL for scabs1.

DOUBLE PRECISION for dcabs1.
Contains the value |Re(z)|+|Im(z)| for the complex number z.
BLAS Level 2 Routines

This section describes BLAS Level 2 routines, which perform matrix-vector operations. Table BLAS Level 2
Routine Groups and Their Data Types lists the BLAS Level 2 routine groups and the data types associated
with them.
BLAS Level 2 Routine Groups and Their Data Types
Routine Groups Data Types Description
?gbmv s, d, c, z Matrix-vector product using a general band matrix
?gemv s, d, c, z Matrix-vector product using a general matrix
?ger s, d Rank-1 update of a general matrix
?gerc c, z Rank-1 update of a conjugated general matrix
?geru c, z Rank-1 update of a general matrix, unconjugated
?hbmv c, z Matrix-vector product using a Hermitian band matrix
76
Routine Groups Data Types Description
?hemv c, z Matrix-vector product using a Hermitian matrix
?her c, z Rank-1 update of a Hermitian matrix
?her2 c, z Rank-2 update of a Hermitian matrix
?hpmv c, z Matrix-vector product using a Hermitian packed matrix
?hpr c, z Rank-1 update of a Hermitian packed matrix
?hpr2 c, z Rank-2 update of a Hermitian packed matrix
?sbmv s, d Matrix-vector product using symmetric band matrix
?spmv s, d Matrix-vector product using a symmetric packed matrix
?spr s, d Rank-1 update of a symmetric packed matrix
?spr2 s, d Rank-2 update of a symmetric packed matrix
?symv s, d Matrix-vector product using a symmetric matrix
?syr s, d Rank-1 update of a symmetric matrix
?syr2 s, d Rank-2 update of a symmetric matrix
?tbmv s, d, c, z Matrix-vector product using a triangular band matrix
?tbsv s, d, c, z Solution of a linear system of equations with a triangular band matrix
?tpmv s, d, c, z Matrix-vector product using a triangular packed matrix
?tpsv s, d, c, z Solution of a linear system of equations with a triangular packed matrix
?trmv s, d, c, z Matrix-vector product using a triangular matrix
?trsv s, d, c, z Solution of a linear system of equations with a triangular matrix
?gbmv
Computes a matrix-vector product using a general
band matrix
Syntax
call sgbmv(trans, m, n, kl, ku, alpha, a, lda, x, incx, beta, y, incy)
call dgbmv(trans, m, n, kl, ku, alpha, a, lda, x, incx, beta, y, incy)
call cgbmv(trans, m, n, kl, ku, alpha, a, lda, x, incx, beta, y, incy)
call zgbmv(trans, m, n, kl, ku, alpha, a, lda, x, incx, beta, y, incy)
call gbmv(a, x, y [,kl] [,m] [,alpha] [,beta] [,trans])
Include Files
mkl.fi, blas.f90
Description
The ?gbmv routines perform a matrix-vector operation defined as
y := alpha*A*x + beta*y,
or
y := alpha*A'*x + beta*y,
or
y := alpha*conjg(A')*x + beta*y,
where:
alpha and beta are scalars,
x and y are vectors,
A is an m-by-n band matrix, with kl sub-diagonals and ku super-diagonals.
Input Parameters
trans CHARACTER*1. Specifies the operation:

If trans= 'N' or 'n', then y := alpha*A*x + beta*y
If trans= 'T' or 't', then y := alpha*A'*x + beta*y
If trans= 'C' or 'c', then y := alpha*conjg(A')*x + beta*y
m INTEGER. Specifies the number of rows of the matrix A.

The value of m must be at least zero.
n INTEGER. Specifies the number of columns of the matrix A.

The value of n must be at least zero.
kl INTEGER. Specifies the number of sub-diagonals of the matrix A.

The value of kl must satisfy 0 ≤ kl.
ku INTEGER. Specifies the number of super-diagonals of the matrix A.

The value of ku must satisfy 0 ≤ ku.
alpha REAL for sgbmv

DOUBLE PRECISION for dgbmv
COMPLEX for cgbmv
DOUBLE COMPLEX for zgbmv
Specifies the scalar alpha.
a REAL for sgbmv

DOUBLE PRECISION for dgbmv
COMPLEX for cgbmv
DOUBLE COMPLEX for zgbmv
Array, size (lda, n).
Before entry, the leading (kl + ku + 1) by n part of the array a must
contain the matrix of coefficients. This matrix must be supplied
column-by-column, with the leading diagonal of the matrix in row (ku + 1) of
the array, the first super-diagonal starting at position 2 in row ku, the first
sub-diagonal starting at position 1 in row (ku + 2), and so on. Elements in the
array a that do not correspond to elements in the band matrix (such as the
top left ku by ku triangle) are not referenced.
The following program segment transfers a band matrix from conventional
full matrix storage (matrix) to band storage (a):
      do 20, j = 1, n
         k = ku + 1 - j
         do 10, i = max(1, j-ku), min(m, j+kl)
            a(k+i, j) = matrix(i,j)
   10    continue
   20 continue
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program. The value of lda must be at least (kl + ku + 1).
x REAL for sgbmv

DOUBLE PRECISION for dgbmv
COMPLEX for cgbmv
DOUBLE COMPLEX for zgbmv
Array, size at least (1 + (n - 1)*abs(incx)) when trans= 'N' or 'n',
and at least (1 + (m - 1)*abs(incx)) otherwise. Before entry, the array
x must contain the vector x.
incx INTEGER. Specifies the increment for the elements of x. incx must not be
zero.
beta REAL for sgbmv

DOUBLE PRECISION for dgbmv
COMPLEX for cgbmv
DOUBLE COMPLEX for zgbmv
Specifies the scalar beta. When beta is equal to zero, then y need not be
set on input.
y REAL for sgbmv

DOUBLE PRECISION for dgbmv
COMPLEX for cgbmv
DOUBLE COMPLEX for zgbmv
Array, size at least (1 +(m - 1)*abs(incy)) when trans= 'N' or 'n'
and at least (1 +(n - 1)*abs(incy)) otherwise. Before entry, the
incremented array y must contain the vector y.
incy INTEGER. Specifies the increment for the elements of y.

The value of incy must not be zero.
Output Parameters
y Updated vector y.

Specific details for the routine gbmv interface are the following:
a Holds the array a of size (kl+ku+1, n). Contains an m-by-n band matrix with
kl lower diagonals and ku upper diagonals.
x Holds the vector with the number of elements rx, where rx = n if trans =
'N', rx = m otherwise.
y Holds the vector with the number of elements ry, where ry = m if trans =
'N', ry = n otherwise.
trans Must be 'N', 'C', or 'T'.
The default value is 'N'.
kl If omitted, assumed kl = ku, that is, the number of lower diagonals equals
the number of the upper diagonals.
ku Restored as ku = lda-kl-1, where lda is the leading dimension of matrix A.
m If omitted, assumed m = n, that is, a square matrix.
alpha The default value is 1.
beta The default value is 0.
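The band storage layout and the call sequence described for ?gbmv can be sketched as follows. This is a minimal example, assuming the program is linked against Intel MKL or another library that provides the standard BLAS routine dgbmv; the program name, the array name full, and the matrix values are made up for illustration.

```fortran
program gbmv_sketch
  implicit none
  integer, parameter :: m = 3, n = 3, kl = 1, ku = 1
  integer, parameter :: lda = kl + ku + 1
  double precision :: full(m, n), a(lda, n), x(n), y(m)
  integer :: i, j, k
  external :: dgbmv   ! resolved at link time (assumption: a BLAS library is linked)

  ! Tridiagonal matrix in conventional storage (column-major literal):
  !   1 2 0
  !   3 4 5
  !   0 6 7
  full = reshape([1d0, 3d0, 0d0,  2d0, 4d0, 6d0,  0d0, 5d0, 7d0], [m, n])

  ! Copy the band into band storage: A(i,j) goes to a(ku+1+i-j, j).
  a = 0d0
  do j = 1, n
     k = ku + 1 - j
     do i = max(1, j - ku), min(m, j + kl)
        a(k + i, j) = full(i, j)
     end do
  end do

  x = 1d0
  y = 0d0
  ! y := 1*A*x + 0*y
  call dgbmv('N', m, n, kl, ku, 1d0, a, lda, x, 1, 0d0, y, 1)

  print *, y   ! row sums of the matrix: 3, 12, 13
end program gbmv_sketch
```

With x set to all ones and beta equal to zero, y receives the row sums of the matrix, which makes the result easy to check by hand.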
?gemv
Computes a matrix-vector product using a general
matrix.
Syntax
call sgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call dgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call cgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call zgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call scgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call dzgemv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call gemv(a, x, y [,alpha][,beta] [,trans])
Include Files
mkl.fi, blas.f90
Description
The ?gemv routines perform a matrix-vector operation defined as
y := alpha*A*x + beta*y,
or
y := alpha*A'*x + beta*y,
or
y := alpha*conjg(A')*x + beta*y,
where:
A is an m-by-n matrix.
Input Parameters
trans CHARACTER*1. Specifies the operation:

if trans= 'N' or 'n', then y := alpha*A*x + beta*y;
if trans= 'T' or 't', then y := alpha*A'*x + beta*y;
if trans= 'C' or 'c', then y := alpha*conjg(A')*x + beta*y.
m INTEGER. Specifies the number of rows of the matrix A. The value of m

must be at least zero.
n INTEGER. Specifies the number of columns of the matrix A. The value of n

must be at least zero.
alpha REAL for sgemv

DOUBLE PRECISION for dgemv
COMPLEX for cgemv, scgemv
DOUBLE COMPLEX for zgemv, dzgemv
a REAL for sgemv, scgemv

DOUBLE PRECISION for dgemv, dzgemv
COMPLEX for cgemv
DOUBLE COMPLEX for zgemv
Array, size (lda, n). Before entry, the leading m-by-n part of the array a
must contain the matrix of coefficients.
lda INTEGER. Specifies the leading dimension of a as declared in the calling

(sub)program.
The value of lda must be at least max(1, m).
x REAL for sgemv

Array, size at least (1+(n-1)*abs(incx)) when trans= 'N' or 'n' and
at least (1+(m - 1)*abs(incx)) otherwise. Before entry, the incremented
array x must contain the vector x.
incx INTEGER. Specifies the increment for the elements of x.

The value of incx must not be zero.
beta REAL for sgemv

Specifies the scalar beta. When beta is set to zero, then y need not be set
on input.
y REAL for sgemv

Array, size at least (1 +(m - 1)*abs(incy)) when trans= 'N' or 'n'
and at least (1 +(n - 1)*abs(incy)) otherwise. Before entry with non-
zero beta, the incremented array y must contain the vector y.

Output Parameters
y Updated vector y.

Specific details for the routine gemv interface are the following:
a Holds the matrix A of size (m,n).
x Holds the vector with the number of elements rx where rx = n if trans =
'N', rx = m otherwise.
y Holds the vector with the number of elements ry where ry = m if trans =
'N', ry = n otherwise.
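The roles of trans, alpha, and beta in ?gemv can be illustrated with a small call sequence. The sketch below assumes the program is linked against Intel MKL or another library that provides the standard BLAS routine dgemv; the program name, variable names, and data are illustrative only.

```fortran
program gemv_sketch
  implicit none
  integer, parameter :: m = 2, n = 3
  double precision :: a(m, n), xn(n), yn(m), xt(m), yt(n)
  external :: dgemv   ! resolved at link time (assumption: a BLAS library is linked)

  ! 2-by-3 matrix (column-major literal):
  !   1 2 3
  !   4 5 6
  a = reshape([1d0, 4d0,  2d0, 5d0,  3d0, 6d0], [m, n])

  ! trans = 'N': x has n elements, y has m elements.
  ! y := 2*A*x + 1*y with x = (1,1,1) and y = (1,1) on entry.
  xn = 1d0
  yn = 1d0
  call dgemv('N', m, n, 2d0, a, m, xn, 1, 1d0, yn, 1)
  print *, yn   ! 2*6 + 1 = 13, 2*15 + 1 = 31

  ! trans = 'T': x has m elements, y has n elements.
  ! y := A'*x with x = (1,1); beta = 0, so y need not be set on input.
  xt = 1d0
  yt = 0d0
  call dgemv('T', m, n, 1d0, a, m, xt, 1, 0d0, yt, 1)
  print *, yt   ! column sums: 5, 7, 9
end program gemv_sketch
```

Note how the lengths of x and y swap between the two calls, while m, n, and lda always describe the stored matrix A.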
?ger
Performs a rank-1 update of a general matrix.
Syntax
call sger(m, n, alpha, x, incx, y, incy, a, lda)
call dger(m, n, alpha, x, incx, y, incy, a, lda)
call ger(a, x, y [,alpha])
Include Files
mkl.fi, blas.f90
Description
The ?ger routines perform a matrix-vector operation defined as
A := alpha*x*y'+ A,
where:
alpha is a scalar,
x is an m-element vector,
y is an n-element vector,
A is an m-by-n general matrix.
Input Parameters


alpha REAL for sger

DOUBLE PRECISION for dger
x REAL for sger

Array, size at least (1 + (m - 1)*abs(incx)). Before entry, the
incremented array x must contain the m-element vector x.

y REAL for sger

Array, size at least (1 + (n - 1)*abs(incy)). Before entry, the
incremented array y must contain the n-element vector y.

a REAL for sger


(sub)program.
Output Parameters
a Overwritten by the updated matrix.

Specific details for the routine ger interface are the following:
x Holds the vector with the number of elements m.
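A rank-1 update A := alpha*x*y' + A can be tried on a small matrix as follows. This sketch assumes the program is linked against Intel MKL or another library that provides the standard BLAS routine dger; the program name and the data are illustrative assumptions.

```fortran
program ger_sketch
  implicit none
  integer, parameter :: m = 2, n = 3
  double precision :: a(m, n), x(m), y(n)
  integer :: i
  external :: dger   ! resolved at link time (assumption: a BLAS library is linked)

  a = 0d0
  x = [1d0, 2d0]
  y = [1d0, 2d0, 3d0]

  ! A := 1*x*y' + A; since A starts at zero, A becomes the outer product x*y'.
  call dger(m, n, 1d0, x, 1, y, 1, a, m)

  do i = 1, m
     print *, a(i, :)   ! row 1: 1 2 3, row 2: 2 4 6
  end do
end program ger_sketch
```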
?gerc
Performs a rank-1 update (conjugated) of a general
matrix.
Syntax
call cgerc(m, n, alpha, x, incx, y, incy, a, lda)
call zgerc(m, n, alpha, x, incx, y, incy, a, lda)
call gerc(a, x, y [,alpha])
Include Files
mkl.fi, blas.f90
Description
The ?gerc routines perform a matrix-vector operation defined as
A := alpha*x*conjg(y') + A,
where:
alpha is a scalar,
Input Parameters


alpha COMPLEX for cgerc

DOUBLE COMPLEX for zgerc
x COMPLEX for cgerc


y COMPLEX for cgerc


a COMPLEX for cgerc


(sub)program.
Output Parameters

Specific details for the routine gerc interface are the following:
?geru
Performs a rank-1 update (unconjugated) of a general
matrix.
Syntax
call cgeru(m, n, alpha, x, incx, y, incy, a, lda)
call zgeru(m, n, alpha, x, incx, y, incy, a, lda)
call geru(a, x, y [,alpha])
Include Files
mkl.fi, blas.f90
Description
The ?geru routines perform a matrix-vector operation defined as
A := alpha*x*y' + A,
where:
alpha is a scalar,
Input Parameters


alpha COMPLEX for cgeru

DOUBLE COMPLEX for zgeru
x COMPLEX for cgeru

y COMPLEX for cgeru


a COMPLEX for cgeru


(sub)program.
Output Parameters

Specific details for the routine geru interface are the following:
?hbmv
Computes a matrix-vector product using a Hermitian
band matrix.
Syntax
call chbmv(uplo, n, k, alpha, a, lda, x, incx, beta, y, incy)
call zhbmv(uplo, n, k, alpha, a, lda, x, incx, beta, y, incy)
call hbmv(a, x, y [,uplo][,alpha] [,beta])
Include Files
mkl.fi, blas.f90
Description
The ?hbmv routines perform a matrix-vector operation defined as
y := alpha*A*x + beta*y,
where:
x and y are n-element vectors,
A is an n-by-n Hermitian band matrix, with k super-diagonals.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
Hermitian band matrix A is used:
If uplo = 'U' or 'u', then the upper triangular part of the matrix A is
used.
If uplo = 'L' or 'l', then the lower triangular part of the matrix A is used.
n INTEGER. Specifies the order of the matrix A. The value of n must be at

least zero.
k INTEGER. For uplo = 'U' or 'u': specifies the number of super-diagonals
of the matrix A.
For uplo = 'L' or 'l': specifies the number of sub-diagonals of the
matrix A.
The value of k must satisfy 0 ≤ k.
alpha COMPLEX for chbmv

DOUBLE COMPLEX for zhbmv
a COMPLEX for chbmv

Before entry with uplo = 'U' or 'u', the leading (k + 1) by n part of the
array a must contain the upper triangular band part of the Hermitian
matrix. The matrix must be supplied column-by-column, with the leading
diagonal of the matrix in row (k + 1) of the array, the first super-diagonal
starting at position 2 in row k, and so on. The top left k by k triangle of the
array a is not referenced.
The following program segment transfers the upper triangular part of a
Hermitian band matrix from conventional full matrix storage (matrix) to
band storage (a):
      do 20, j = 1, n
         m = k + 1 - j
         do 10, i = max( 1, j - k ), j
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue
Before entry with uplo = 'L' or 'l', the leading (k + 1) by n part of the
array a must contain the lower triangular band part of the Hermitian matrix,
supplied column-by-column, with the leading diagonal of the matrix in row
1 of the array, the first sub-diagonal starting at position 1 in row 2, and so
on. The bottom right k by k triangle of the array a is not referenced.
The following program segment transfers the lower triangular part of a
Hermitian band matrix from conventional full matrix storage (matrix) to
band storage (a):
      do 20, j = 1, n
         m = 1 - j
         do 10, i = j, min( n, j + k )
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue
The imaginary parts of the diagonal elements need not be set and are
assumed to be zero.

lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program. The value of lda must be at least (k + 1).
x COMPLEX for chbmv

Array, size at least (1 + (n - 1)*abs(incx)). Before entry, the
incremented array x must contain the vector x.

beta COMPLEX for chbmv

Specifies the scalar beta.
y COMPLEX for chbmv


Output Parameters
y Overwritten by the updated vector y.

Specific details for the routine hbmv interface are the following:
a Holds the array a of size (k+1,n).
uplo Must be 'U' or 'L'. The default value is 'U'.
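The upper band storage scheme for a Hermitian band matrix can be exercised with a small example. The sketch below assumes the program is linked against Intel MKL or another library that provides the standard BLAS routine zhbmv; the program name, the array name full, and the matrix values are assumptions for illustration. Only the upper triangle of the matrix is supplied.

```fortran
program hbmv_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 3, k = 1, lda = k + 1
  complex(dp) :: full(n, n), a(lda, n), x(n), y(n)
  integer :: i, j, m
  external :: zhbmv   ! resolved at link time (assumption: a BLAS library is linked)

  ! Upper triangle of the Hermitian band matrix
  !    2    i    0
  !   -i    3    1
  !    0    1    4
  full = (0d0, 0d0)
  full(1,1) = (2d0, 0d0); full(1,2) = (0d0, 1d0)
  full(2,2) = (3d0, 0d0); full(2,3) = (1d0, 0d0)
  full(3,3) = (4d0, 0d0)

  ! Upper band storage: A(i,j) goes to a(k+1+i-j, j) for max(1,j-k) <= i <= j.
  a = (0d0, 0d0)
  do j = 1, n
     m = k + 1 - j
     do i = max(1, j - k), j
        a(m + i, j) = full(i, j)
     end do
  end do

  x = (1d0, 0d0)
  y = (0d0, 0d0)
  ! y := 1*A*x + 0*y
  call zhbmv('U', n, k, (1d0, 0d0), a, lda, x, 1, (0d0, 0d0), y, 1)

  print *, y   ! (2,1), (4,-1), (5,0)
end program hbmv_sketch
```

The second component shows the Hermitian symmetry at work: the routine uses conjg(A(1,2)) = -i for the element A(2,1), which is never stored.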
?hemv
Computes a matrix-vector product using a Hermitian
matrix.
Syntax
call chemv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call zhemv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call hemv(a, x, y [,uplo][,alpha] [,beta])
Include Files
mkl.fi, blas.f90
Description
The ?hemv routines perform a matrix-vector operation defined as
y := alpha*A*x + beta*y,
where:
A is an n-by-n Hermitian matrix.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used.
If uplo = 'U' or 'u', then the upper triangular part of the array a is used.
If uplo = 'L' or 'l', then the lower triangular part of the array a is used.

n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.
alpha COMPLEX for chemv

DOUBLE COMPLEX for zhemv
a COMPLEX for chemv

Before entry with uplo = 'U' or 'u', the leading n-by-n upper triangular
part of the array a must contain the upper triangular part of the Hermitian
matrix and the strictly lower triangular part of a is not referenced. Before
entry with uplo = 'L' or 'l', the leading n-by-n lower triangular part of
the array a must contain the lower triangular part of the Hermitian matrix
and the strictly upper triangular part of a is not referenced.
The imaginary parts of the diagonal elements need not be set and are
assumed to be zero.

lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program. The value of lda must be at least max(1, n).
x COMPLEX for chemv

incremented array x must contain the n-element vector x.

beta COMPLEX for chemv

Specifies the scalar beta. When beta is supplied as zero then y need not be
set on input.
y COMPLEX for chemv


Output Parameters

Specific details for the routine hemv interface are the following:
a Holds the matrix A of size (n,n).
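The statement that only one triangle of a is referenced can be checked with a short experiment. The sketch below assumes the program is linked against Intel MKL or another library that provides the standard BLAS routine zhemv; the program name, the data, and the sentinel value placed in the unused triangle are assumptions made for this example.

```fortran
program hemv_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 2
  complex(dp) :: a(n, n), x(n), y(n)
  external :: zhemv   ! resolved at link time (assumption: a BLAS library is linked)

  ! Hermitian matrix
  !   1      2+i
  !   2-i    3
  ! Only the lower triangle is supplied. a(1,2) holds a sentinel value that
  ! the routine should not read when uplo = 'L'.
  a(1,1) = (1d0, 0d0)
  a(2,1) = (2d0, -1d0)
  a(2,2) = (3d0, 0d0)
  a(1,2) = (99d0, 99d0)

  x(1) = (1d0, 0d0)
  x(2) = (0d0, 1d0)
  y = (0d0, 0d0)

  ! y := 1*A*x + 0*y
  call zhemv('L', n, (1d0, 0d0), a, n, x, 1, (0d0, 0d0), y, 1)

  print *, y   ! (0,2) and (2,2)
end program hemv_sketch
```

The first component is 1*1 + (2+i)*i = 2i, so the element A(1,2) was taken as conjg(a(2,1)) and the sentinel did not enter the result.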
?her
Performs a rank-1 update of a Hermitian matrix.
Syntax
call cher(uplo, n, alpha, x, incx, a, lda)
call zher(uplo, n, alpha, x, incx, a, lda)
call her(a, x [,uplo] [, alpha])
Include Files
mkl.fi, blas.f90
Description
The ?her routines perform a matrix-vector operation defined as
A := alpha*x*conjg(x') + A,
where:
alpha is a real scalar,
x is an n-element vector,
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used.

n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.
alpha REAL for cher
DOUBLE PRECISION for zher
x COMPLEX for cher

DOUBLE COMPLEX for zher
Array, size at least (1 + (n - 1)*abs(incx)). Before entry, the
incremented array x must contain the n-element vector x.

a COMPLEX for cher

DOUBLE COMPLEX for zher
matrix and the strictly lower triangular part of a is not referenced.
Before entry with uplo = 'L' or 'l', the leading n-by-n lower triangular
part of the array a must contain the lower triangular part of the Hermitian
matrix and the strictly upper triangular part of a is not referenced.
The imaginary parts of the diagonal elements need not be set and are
assumed to be zero.

Output Parameters
a With uplo = 'U' or 'u', the upper triangular part of the array a is
overwritten by the upper triangular part of the updated matrix.
With uplo = 'L' or 'l', the lower triangular part of the array a is
overwritten by the lower triangular part of the updated matrix.
The imaginary parts of the diagonal elements are set to zero.
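A Hermitian rank-1 update with a real alpha can be sketched as follows, assuming the program is linked against Intel MKL or another library that provides the standard BLAS routine zher. The program name and the data are illustrative assumptions. Note that alpha is passed as a DOUBLE PRECISION value, not as a complex one.

```fortran
program her_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 2
  complex(dp) :: a(n, n), x(n)
  external :: zher   ! resolved at link time (assumption: a BLAS library is linked)

  a = (0d0, 0d0)
  x(1) = (1d0, 0d0)
  x(2) = (0d0, 1d0)

  ! A := 1*x*conjg(x') + A, updating only the upper triangle of a.
  call zher('U', n, 1d0, x, 1, a, n)

  print *, a(1,1)   ! 1*conjg(1) = (1,0)
  print *, a(1,2)   ! 1*conjg(i) = (0,-1)
  print *, a(2,2)   ! i*conjg(i) = (1,0)
  print *, a(2,1)   ! strictly lower part is left at (0,0)
end program her_sketch
```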

Specific details for the routine her interface are the following:
?her2
Performs a rank-2 update of a Hermitian matrix.
Syntax
call cher2(uplo, n, alpha, x, incx, y, incy, a, lda)
call zher2(uplo, n, alpha, x, incx, y, incy, a, lda)
call her2(a, x, y [,uplo][,alpha])
Include Files
mkl.fi, blas.f90
Description
The ?her2 routines perform a matrix-vector operation defined as
A := alpha*x*conjg(y') + conjg(alpha)*y*conjg(x') + A,
where:
alpha is a scalar,
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used.

n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.
alpha COMPLEX for cher2

DOUBLE COMPLEX for zher2
x COMPLEX for cher2


y COMPLEX for cher2

a COMPLEX for cher2

part of the array a must contain the lower triangular part of the Hermitian
The imaginary parts of the diagonal elements need not be set and are
assumed to be zero.

Output Parameters

Specific details for the routine her2 interface are the following:
?hpmv
Computes a matrix-vector product using a Hermitian
packed matrix.
Syntax
call chpmv(uplo, n, alpha, ap, x, incx, beta, y, incy)
call zhpmv(uplo, n, alpha, ap, x, incx, beta, y, incy)
call hpmv(ap, x, y [,uplo][,alpha] [,beta])
Include Files
mkl.fi, blas.f90
Description
The ?hpmv routines perform a matrix-vector operation defined as
y := alpha*A*x + beta*y,
where:
A is an n-by-n Hermitian matrix, supplied in packed form.
Input Parameters
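uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
matrix A is supplied in the packed array ap.
If uplo = 'U' or 'u', then the upper triangular part of the matrix A is
supplied in the packed array ap.
If uplo = 'L' or 'l', then the lower triangular part of the matrix A is
supplied in the packed array ap.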

n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.
alpha COMPLEX for chpmv

DOUBLE COMPLEX for zhpmv
ap COMPLEX for chpmv

Array, size at least ((n*(n + 1))/2).
Before entry with uplo = 'U' or 'u', the array ap must contain the upper
triangular part of the Hermitian matrix packed sequentially, column-by-column,
so that ap(1) contains A(1,1), ap(2) and ap(3) contain A(1,2) and
A(2,2) respectively, and so on. Before entry with uplo = 'L' or 'l', the
array ap must contain the lower triangular part of the Hermitian matrix
packed sequentially, column-by-column, so that ap(1) contains A(1,1),
ap(2) and ap(3) contain A(2,1) and A(3,1) respectively, and so on.
The imaginary parts of the diagonal elements need not be set and are
assumed to be zero.
x COMPLEX for chpmv

DOUBLE COMPLEX for zhpmv
Array, size at least (1 +(n - 1)*abs(incx)). Before entry, the
incremented array x must contain the n-element vector x.
beta COMPLEX for chpmv

When beta is equal to zero then y need not be set on input.
y COMPLEX for chpmv


Output Parameters

Specific details for the routine hpmv interface are the following:
ap Holds the array ap of size (n*(n+1)/2).
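The packed layout for uplo = 'U' can be produced with a running index over the columns of the upper triangle. The sketch below assumes the program is linked against Intel MKL or another library that provides the standard BLAS routine zhpmv; the program name, the array name full, and the data are assumptions for illustration.

```fortran
program hpmv_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 3
  complex(dp) :: full(n, n), ap(n*(n + 1)/2), x(n), y(n)
  integer :: i, j, idx
  external :: zhpmv   ! resolved at link time (assumption: a BLAS library is linked)

  ! Upper triangle of the Hermitian matrix
  !    1    i    0
  !   -i    2    1
  !    0    1    3
  full = (0d0, 0d0)
  full(1,1) = (1d0, 0d0); full(1,2) = (0d0, 1d0)
  full(2,2) = (2d0, 0d0); full(2,3) = (1d0, 0d0)
  full(3,3) = (3d0, 0d0)

  ! Pack the upper triangle column by column:
  ! ap = A(1,1), A(1,2), A(2,2), A(1,3), A(2,3), A(3,3)
  idx = 0
  do j = 1, n
     do i = 1, j
        idx = idx + 1
        ap(idx) = full(i, j)
     end do
  end do

  x = (1d0, 0d0)
  y = (0d0, 0d0)
  ! y := 1*A*x + 0*y
  call zhpmv('U', n, (1d0, 0d0), ap, x, 1, (0d0, 0d0), y, 1)

  print *, y   ! (1,1), (3,-1), (4,0)
end program hpmv_sketch
```

Equivalently, for the upper packed form the element A(i,j) with i <= j is stored at ap(i + j*(j-1)/2).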
?hpr
Performs a rank-1 update of a Hermitian packed
matrix.
Syntax
call chpr(uplo, n, alpha, x, incx, ap)
call zhpr(uplo, n, alpha, x, incx, ap)
call hpr(ap, x [,uplo] [, alpha])
Include Files
mkl.fi, blas.f90
Description
The ?hpr routines perform a matrix-vector operation defined as
A := alpha*x*conjg(x') + A,
where:
Input Parameters
If uplo = 'U' or 'u', the upper triangular part of the matrix A is supplied
in the packed array ap.
If uplo = 'L' or 'l', the lower triangular part of the matrix A is supplied in
the packed array ap.

n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.
alpha REAL for chpr

DOUBLE PRECISION for zhpr
x COMPLEX for chpr

DOUBLE COMPLEX for zhpr
zero.
ap COMPLEX for chpr

DOUBLE COMPLEX for zhpr
A2, 2 respectively, and so on.
Before entry with uplo = 'L' or 'l', the array ap must contain the lower
assumed to be zero.
Output Parameters
ap With uplo = 'U' or 'u', overwritten by the upper triangular part of the
updated matrix.
With uplo = 'L' or 'l', overwritten by the lower triangular part of the
updated matrix.

Specific details for the routine hpr interface are the following:
?hpr2
Performs a rank-2 update of a Hermitian packed
matrix.
Syntax
call chpr2(uplo, n, alpha, x, incx, y, incy, ap)
call zhpr2(uplo, n, alpha, x, incx, y, incy, ap)
call hpr2(ap, x, y [,uplo][,alpha])
Include Files
mkl.fi, blas.f90
Description
The ?hpr2 routines perform a matrix-vector operation defined as
A := alpha*x*conjg(y') + conjg(alpha)*y*conjg(x') + A,
where:
alpha is a scalar,
Input Parameters

n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.
alpha COMPLEX for chpr2

DOUBLE COMPLEX for zhpr2
x COMPLEX for chpr2

Array, size at least (1 +(n - 1)*abs(incx)). Before entry, the
incremented array x must contain the n-element vector x.

y COMPLEX for chpr2

Array, size at least (1 +(n - 1)*abs(incy)). Before entry, the
incremented array y must contain the n-element vector y.

ap COMPLEX for chpr2

assumed to be zero.
Output Parameters
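ap With uplo = 'U' or 'u', overwritten by the upper triangular part of the
updated matrix.
With uplo = 'L' or 'l', overwritten by the lower triangular part of the
updated matrix.
The imaginary parts of the diagonal elements are set to zero.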

Specific details for the routine hpr2 interface are the following:
?sbmv
Computes a matrix-vector product using a symmetric
band matrix.
Syntax
call ssbmv(uplo, n, k, alpha, a, lda, x, incx, beta, y, incy)
call dsbmv(uplo, n, k, alpha, a, lda, x, incx, beta, y, incy)
call sbmv(a, x, y [,uplo][,alpha] [,beta])
Include Files
mkl.fi, blas.f90
Description
The ?sbmv routines perform a matrix-vector operation defined as
y := alpha*A*x + beta*y,
where:
A is an n-by-n symmetric band matrix, with k super-diagonals.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
band matrix A is used:
if uplo = 'U' or 'u' - upper triangular part;
if uplo = 'L' or 'l' - lower triangular part.

n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.
k INTEGER. Specifies the number of super-diagonals of the matrix A.

alpha REAL for ssbmv

DOUBLE PRECISION for dsbmv
a REAL for ssbmv

Array, size (lda, n). Before entry with uplo = 'U' or 'u', the leading (k
+ 1) by n part of the array a must contain the upper triangular band part
of the symmetric matrix, supplied column-by-column, with the leading
diagonal of the matrix in row (k + 1) of the array, the first super-diagonal
starting at position 2 in row k, and so on. The top left k by k triangle of the
array a is not referenced.
The following program segment transfers the upper triangular part of a
symmetric band matrix from conventional full matrix storage (matrix) to
band storage (a):
      do 20, j = 1, n
         m = k + 1 - j
         do 10, i = max( 1, j - k ), j
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue
Before entry with uplo = 'L' or 'l', the leading (k + 1) by n part of the
array a must contain the lower triangular band part of the symmetric
matrix, supplied column-by-column, with the leading diagonal of the matrix
in row 1 of the array, the first sub-diagonal starting at position 1 in row 2,
and so on. The bottom right k by k triangle of the array a is not referenced.
The following program segment transfers the lower triangular part of a
symmetric band matrix from conventional full matrix storage (matrix) to
band storage (a):
      do 20, j = 1, n
         m = 1 - j
         do 10, i = j, min( n, j + k )
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue

x REAL for ssbmv

incremented array x must contain the vector x.

beta REAL for ssbmv

y REAL for ssbmv


Output Parameters

Specific details for the routine sbmv interface are the following:
?spmv
Computes a matrix-vector product using a symmetric
packed matrix.
Syntax
call sspmv(uplo, n, alpha, ap, x, incx, beta, y, incy)
call dspmv(uplo, n, alpha, ap, x, incx, beta, y, incy)
call spmv(ap, x, y [,uplo][,alpha] [,beta])
Include Files
mkl.fi, blas.f90
Description
The ?spmv routines perform a matrix-vector operation defined as
y := alpha*A*x + beta*y,
where:
A is an n-by-n symmetric matrix, supplied in packed form.
Input Parameters

n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.
alpha REAL for sspmv

DOUBLE PRECISION for dspmv
ap REAL for sspmv

triangular part of the symmetric matrix packed sequentially, column-by-
column, so that ap(1) contains a(1,1), ap(2) and ap(3) contain a(1,2)
and a(2, 2) respectively, and so on. Before entry with uplo = 'L' or
'l', the array ap must contain the lower triangular part of the symmetric
matrix packed sequentially, column-by-column, so that ap(1) contains
a(1,1), ap(2) and ap(3) contain a(2,1) and a(3,1) respectively, and so
on.
x REAL for sspmv


beta REAL for sspmv
When beta is supplied as zero, then y need not be set on input.
y REAL for sspmv


Output Parameters

Specific details for the routine spmv interface are the following:
?spr
Performs a rank-1 update of a symmetric packed
matrix.
Syntax
call sspr(uplo, n, alpha, x, incx, ap)
call dspr(uplo, n, alpha, x, incx, ap)
call spr(ap, x [,uplo] [, alpha])
Include Files
mkl.fi, blas.f90
Description
The ?spr routines perform a matrix-vector operation defined as
A := alpha*x*x' + A,
where:
Input Parameters

n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.
alpha REAL for sspr

DOUBLE PRECISION for dspr
x REAL for sspr


ap REAL for sspr

Output Parameters
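ap With uplo = 'U' or 'u', overwritten by the upper triangular part of the
updated matrix.
With uplo = 'L' or 'l', overwritten by the lower triangular part of the
updated matrix.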

Specific details for the routine spr interface are the following:
?spr2
Performs a rank-2 update of a symmetric packed
matrix.
Syntax
call sspr2(uplo, n, alpha, x, incx, y, incy, ap)
call dspr2(uplo, n, alpha, x, incx, y, incy, ap)
call spr2(ap, x, y [,uplo][,alpha])
Include Files
mkl.fi, blas.f90
Description
The ?spr2 routines perform a matrix-vector operation defined as
A := alpha*x*y' + alpha*y*x' + A,
where:
alpha is a scalar,
Input Parameters

n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.
alpha REAL for sspr2

DOUBLE PRECISION for dspr2
x REAL for sspr2


y REAL for sspr2

incy INTEGER. Specifies the increment for the elements of y. The value of incy
must not be zero.
ap REAL for sspr2

Output Parameters
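ap With uplo = 'U' or 'u', overwritten by the upper triangular part of the
updated matrix.
With uplo = 'L' or 'l', overwritten by the lower triangular part of the
updated matrix.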

Specific details for the routine spr2 interface are the following:
?symv
Computes a matrix-vector product for a symmetric
matrix.
Syntax
call ssymv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call dsymv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call symv(a, x, y [,uplo][,alpha] [,beta])
Include Files
mkl.fi, blas.f90
Description
The ?symv routines perform a matrix-vector operation defined as
y := alpha*A*x + beta*y,
where:
A is an n-by-n symmetric matrix.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used.
If uplo = 'U' or 'u', then the upper triangular part of the array a is used.
If uplo = 'L' or 'l', then the lower triangular part of the array a is used.

n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.
alpha REAL for ssymv

DOUBLE PRECISION for dsymv
a REAL for ssymv

part of the array a must contain the upper triangular part of the symmetric
matrix A and the strictly lower triangular part of a is not referenced. Before
the array a must contain the lower triangular part of the symmetric matrix
A and the strictly upper triangular part of a is not referenced.

x REAL for ssymv


beta REAL for ssymv

When beta is supplied as zero, then y need not be set on input.
y REAL for ssymv


Output Parameters

Specific details for the routine symv interface are the following:
?syr
Performs a rank-1 update of a symmetric matrix.
Syntax
call ssyr(uplo, n, alpha, x, incx, a, lda)
call dsyr(uplo, n, alpha, x, incx, a, lda)
call syr(a, x [,uplo] [, alpha])
Include Files
mkl.fi, blas.f90
Description
The ?syr routines perform a matrix-vector operation defined as
A := alpha*x*x' + A ,
where:
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used.

n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.
alpha REAL for ssyr

DOUBLE PRECISION for dsyr
x REAL for ssyr

Array, size at least (1 + (n-1)*abs(incx)). Before entry, the

a REAL for ssyr

matrix A and the strictly lower triangular part of a is not referenced.
part of the array a must contain the lower triangular part of the symmetric
matrix A and the strictly upper triangular part of a is not referenced.

Output Parameters

Specific details for the routine syr interface are the following:
?syr2
Performs a rank-2 update of a symmetric matrix.
Syntax
call ssyr2(uplo, n, alpha, x, incx, y, incy, a, lda)
call dsyr2(uplo, n, alpha, x, incx, y, incy, a, lda)
call syr2(a, x, y [,uplo][,alpha])
Include Files
mkl.fi, blas.f90
Description
The ?syr2 routines perform a matrix-vector operation defined as
A := alpha*x*y'+ alpha*y*x' + A,
where:
alpha is a scalar,
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used.

n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.
alpha REAL for ssyr2

DOUBLE PRECISION for dsyr2
x REAL for ssyr2


y REAL for ssyr2

must not be zero.
a REAL for ssyr2


Output Parameters

Specific details for the routine syr2 interface are the following:
x Holds the vector x of length n.
y Holds the vector y of length n.
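A symmetric rank-2 update A := alpha*x*y' + alpha*y*x' + A can be tried on a 2-by-2 matrix as follows. The sketch assumes the program is linked against Intel MKL or another library that provides the standard BLAS routine dsyr2; the program name and the data are illustrative assumptions.

```fortran
program syr2_sketch
  implicit none
  integer, parameter :: n = 2
  double precision :: a(n, n), x(n), y(n)
  external :: dsyr2   ! resolved at link time (assumption: a BLAS library is linked)

  a = 0d0
  x = [1d0, 2d0]
  y = [3d0, 4d0]

  ! A := 1*x*y' + 1*y*x' + A, updating only the upper triangle of a.
  call dsyr2('U', n, 1d0, x, 1, y, 1, a, n)

  print *, a(1,1), a(1,2)   ! 6, 10
  print *, a(2,1), a(2,2)   ! 0 (not referenced), 16
end program syr2_sketch
```

The full updated matrix is symmetric with off-diagonal element 10, but with uplo = 'U' only the upper triangle of a is written, so a(2,1) keeps its input value.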
?tbmv
Computes a matrix-vector product using a triangular
band matrix.
Syntax
call stbmv(uplo, trans, diag, n, k, a, lda, x, incx)
call dtbmv(uplo, trans, diag, n, k, a, lda, x, incx)
call ctbmv(uplo, trans, diag, n, k, a, lda, x, incx)
call ztbmv(uplo, trans, diag, n, k, a, lda, x, incx)
call tbmv(a, x [,uplo] [, trans] [,diag])
Include Files
mkl.fi, blas.f90
Description
The ?tbmv routines perform one of the matrix-vector operations defined as
x := A*x, or x := A'*x, or x := conjg(A')*x,

where:
A is an n-by-n unit, or non-unit, upper or lower triangular band matrix, with (k +1) diagonals.
Input Parameters
uplo CHARACTER*1. Specifies whether the matrix A is an upper or lower

triangular matrix:
if uplo = 'U' or 'u', then the matrix is upper triangular;
if uplo = 'L' or 'l', then the matrix is lower triangular.

trans CHARACTER*1. Specifies the operation:

if trans= 'N' or 'n', then x := A*x;
if trans= 'T' or 't', then x := A'*x;
if trans= 'C' or 'c', then x := conjg(A')*x.
diag CHARACTER*1. Specifies whether the matrix A is unit triangular:

if diag = 'U' or 'u', then the matrix is unit triangular;
if diag = 'N' or 'n', then the matrix is not unit triangular.

n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.
k INTEGER. On entry with uplo = 'U' or 'u' specifies the number of super-
diagonals of the matrix A. On entry with uplo = 'L' or 'l', k specifies the
number of sub-diagonals of the matrix A.
a REAL for stbmv

DOUBLE PRECISION for dtbmv
COMPLEX for ctbmv
DOUBLE COMPLEX for ztbmv
Array, size (lda, n). Before entry with uplo = 'U' or 'u', the leading (k + 1) by n part of the
array a must contain the upper triangular band part of the matrix of
coefficients, supplied column-by-column, with the leading diagonal of the
matrix in row (k + 1) of the array, the first super-diagonal starting at
position 2 in row k, and so on. The top left k by k triangle of the array a is
not referenced. The following program segment transfers an upper
triangular band matrix from conventional full matrix storage (matrix) to
band storage (a):
      do 20, j = 1, n
         m = k + 1 - j
         do 10, i = max( 1, j - k ), j
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue
Before entry with uplo = 'L' or 'l', the leading (k + 1) by n part of the
array a must contain the lower triangular band part of the matrix of
coefficients, supplied column-by-column, with the leading diagonal of the
matrix in row 1 of the array, the first sub-diagonal starting at position 1 in
row 2, and so on. The bottom right k by k triangle of the array a is not
referenced. The following program segment transfers a lower triangular
band matrix from conventional full matrix storage (matrix) to band storage
(a):
      do 20, j = 1, n
         m = 1 - j
         do 10, i = j, min( n, j + k )
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue
Note that when diag = 'U' or 'u', the elements of the array a
corresponding to the diagonal elements of the matrix are not referenced,
but are assumed to be unity.

x REAL for stbmv

DOUBLE PRECISION for dtbmv
COMPLEX for ctbmv
DOUBLE COMPLEX for ztbmv

Output Parameters
x Overwritten with the transformed vector x.

Specific details for the routine tbmv interface are the following:
diag Must be 'N' or 'U'. The default value is 'N'.
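The effect of the diag argument can be seen by calling the routine twice on the same band array. The sketch below assumes the program is linked against Intel MKL or another library that provides the standard BLAS routine dtbmv; the program name, the array name full, and the matrix values are assumptions for illustration.

```fortran
program tbmv_sketch
  implicit none
  integer, parameter :: n = 3, k = 1, lda = k + 1
  double precision :: full(n, n), a(lda, n), x(n), xu(n)
  integer :: i, j, m
  external :: dtbmv   ! resolved at link time (assumption: a BLAS library is linked)

  ! Upper triangular band matrix with one super-diagonal:
  !   1 2 0
  !   0 3 4
  !   0 0 5
  full = 0d0
  full(1,1) = 1d0; full(1,2) = 2d0
  full(2,2) = 3d0; full(2,3) = 4d0
  full(3,3) = 5d0

  ! Upper band storage: A(i,j) goes to a(k+1+i-j, j).
  a = 0d0
  do j = 1, n
     m = k + 1 - j
     do i = max(1, j - k), j
        a(m + i, j) = full(i, j)
     end do
  end do

  ! Non-unit triangular: the stored diagonal is used. x := A*x
  x = 1d0
  call dtbmv('U', 'N', 'N', n, k, a, lda, x, 1)
  print *, x    ! 3, 7, 5

  ! Unit triangular: the stored diagonal is ignored and taken as 1.
  xu = 1d0
  call dtbmv('U', 'N', 'U', n, k, a, lda, xu, 1)
  print *, xu   ! 3, 5, 1
end program tbmv_sketch
```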
?tbsv
Solves a system of linear equations whose coefficients
are in a triangular band matrix.
Syntax
call stbsv(uplo, trans, diag, n, k, a, lda, x, incx)
call dtbsv(uplo, trans, diag, n, k, a, lda, x, incx)
call ctbsv(uplo, trans, diag, n, k, a, lda, x, incx)
call ztbsv(uplo, trans, diag, n, k, a, lda, x, incx)
call tbsv(a, x [,uplo] [, trans] [,diag])
Include Files
mkl.fi, blas.f90
Description
The ?tbsv routines solve one of the following systems of equations:
A*x = b, or A'*x = b, or conjg(A')*x = b,

where:
b and x are n-element vectors,
A is an n-by-n unit, or non-unit, upper or lower triangular band matrix, with (k + 1) diagonals.
The routine does not test for singularity or near-singularity.

Such tests must be performed before calling this routine.
Input Parameters
uplo CHARACTER*1. Specifies whether the matrix A is an upper or lower

triangular matrix:
if uplo = 'U' or 'u' the matrix is upper triangular;
if uplo = 'L' or 'l', the matrix is lower triangular.
trans CHARACTER*1. Specifies the system of equations:

if trans= 'N' or 'n', then A*x = b;
if trans= 'T' or 't', then A'*x = b;
if trans= 'C' or 'c', then conjg(A')*x = b.

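diag CHARACTER*1. Specifies whether the matrix A is unit triangular:

if diag = 'U' or 'u' then the matrix is unit triangular;
if diag = 'N' or 'n', then the matrix is not unit triangular.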

n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.
k INTEGER. On entry with uplo = 'U' or 'u', k specifies the number of

super-diagonals of the matrix A. On entry with uplo = 'L' or 'l', k
specifies the number of sub-diagonals of the matrix A.
a REAL for stbsv

DOUBLE PRECISION for dtbsv
COMPLEX for ctbsv
DOUBLE COMPLEX for ztbsv
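Array, size (lda, n). Before entry with uplo = 'U' or 'u', the leading (k + 1) by n part of the
array a must contain the upper triangular band part of the matrix of
coefficients, supplied column-by-column, with the leading diagonal of the
matrix in row (k + 1) of the array, the first super-diagonal starting at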
position 2 in row k, and so on. The top left k by k triangle of the array a is
not referenced.
The following program segment transfers an upper triangular band matrix
from conventional full matrix storage (matrix) to band storage (a):
      do 20, j = 1, n
         m = k + 1 - j
         do 10, i = max( 1, j - k ), j
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue
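Before entry with uplo = 'L' or 'l', the leading (k + 1) by n part of the
array a must contain the lower triangular band part of the matrix of
coefficients, supplied column-by-column, with the leading diagonal of the
matrix in row 1 of the array, the first sub-diagonal starting at position 1 in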
row 2, and so on. The bottom right k by k triangle of the array a is not
referenced.
The following program segment transfers a lower triangular band matrix
from conventional full matrix storage (matrix) to band storage (a):
      do 20, j = 1, n
         m = 1 - j
         do 10, i = j, min( n, j + k )
            a( m + i, j ) = matrix( i, j )
   10    continue
   20 continue
When diag = 'U' or 'u', the elements of the array a corresponding to the
diagonal elements of the matrix are not referenced, but are assumed to be
unity.

x REAL for stbsv

DOUBLE PRECISION for dtbsv
COMPLEX for ctbsv
DOUBLE COMPLEX for ztbsv
incremented array x must contain the n-element right-hand side vector b.
Output Parameters
x Overwritten with the solution vector x.
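A triangular band solve with the lower storage scheme can be sketched as follows. The example assumes the program is linked against Intel MKL or another library that provides the standard BLAS routine dtbsv; the program name, the array name full, and the data are assumptions for illustration. The right-hand side is chosen so that the exact solution is a vector of ones, and all diagonal entries are nonzero because the routine itself does not test for singularity.

```fortran
program tbsv_sketch
  implicit none
  integer, parameter :: n = 3, k = 1, lda = k + 1
  double precision :: full(n, n), a(lda, n), x(n)
  integer :: i, j, m
  external :: dtbsv   ! resolved at link time (assumption: a BLAS library is linked)

  ! Lower triangular band matrix with one sub-diagonal:
  !   2 0 0
  !   1 3 0
  !   0 4 5
  full = 0d0
  full(1,1) = 2d0
  full(2,1) = 1d0; full(2,2) = 3d0
  full(3,2) = 4d0; full(3,3) = 5d0

  ! Lower band storage: A(i,j) goes to a(1+i-j, j) for j <= i <= min(n, j+k).
  a = 0d0
  do j = 1, n
     m = 1 - j
     do i = j, min(n, j + k)
        a(m + i, j) = full(i, j)
     end do
  end do

  ! On entry x holds the right-hand side b = A*(1,1,1)' = (2, 4, 9)'.
  x = [2d0, 4d0, 9d0]
  call dtbsv('L', 'N', 'N', n, k, a, lda, x, 1)

  print *, x   ! solution vector: 1, 1, 1
end program tbsv_sketch
```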

Specific details for the routine tbsv interface are the following:
?tpmv
Computes a matrix-vector product using a triangular
packed matrix.
Syntax
call stpmv(uplo, trans, diag, n, ap, x, incx)
call dtpmv(uplo, trans, diag, n, ap, x, incx)
call ctpmv(uplo, trans, diag, n, ap, x, incx)
call ztpmv(uplo, trans, diag, n, ap, x, incx)
call tpmv(ap, x [,uplo] [, trans] [,diag])
Include Files
mkl.fi, blas.f90
Description
The ?tpmv routines perform one of the matrix-vector operations defined as
x := A*x, or x := A'*x, or x := conjg(A')*x,

where:
A is an n-by-n unit, or non-unit, upper or lower triangular matrix, supplied in packed form.
Input Parameters
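uplo CHARACTER*1. Specifies whether the matrix A is upper or lower triangular:
if uplo = 'U' or 'u', then the matrix is upper triangular;
if uplo = 'L' or 'l', then the matrix is lower triangular.
trans CHARACTER*1. Specifies the operation:
if trans = 'N' or 'n', then x := A*x;
if trans = 'T' or 't', then x := A'*x;
if trans = 'C' or 'c', then x := conjg(A')*x.
diag CHARACTER*1. Specifies whether the matrix A is unit triangular:
if diag = 'U' or 'u', then the matrix is unit triangular;
if diag = 'N' or 'n', then the matrix is not unit triangular.
n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.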
ap REAL for stpmv

DOUBLE PRECISION for dtpmv
COMPLEX for ctpmv
DOUBLE COMPLEX for ztpmv
Array, size at least ((n*(n + 1))/2). Before entry with uplo = 'U' or 'u',
the array ap must contain the upper triangular matrix packed sequentially,
column-by-column, so that ap(1) contains a(1,1), ap(2) and ap(3) contain
a(1,2) and a(2,2) respectively, and so on. Before entry with uplo = 'L' or
'l', the array ap must contain the lower triangular matrix packed
sequentially, column-by-column, so that ap(1) contains a(1,1), ap(2) and
ap(3) contain a(2,1) and a(3,1) respectively, and so on. When diag = 'U' or
'u', the diagonal elements of a are not referenced, but are assumed to be
unity.
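The packed layout described above can be produced from a full matrix with two short loops. This is a minimal sketch under stated assumptions: the program name, the arrays apu and apl, and the test values a(i,j) = 10*i + j are illustrative and are not part of the library.

```fortran
program packed_storage_sketch
  implicit none
  integer, parameter :: n = 4
  real :: a(n, n), apu(n*(n + 1)/2), apl(n*(n + 1)/2)
  integer :: i, j, pos

  ! Recognizable test values: a(i,j) = 10*i + j.
  do j = 1, n
    do i = 1, n
      a(i, j) = real(10*i + j)
    end do
  end do

  ! Upper triangle, column by column: a(1,1), a(1,2), a(2,2), a(1,3), ...
  pos = 0
  do j = 1, n
    do i = 1, j
      pos = pos + 1
      apu(pos) = a(i, j)
    end do
  end do

  ! Lower triangle, column by column: a(1,1), a(2,1), a(3,1), ...
  pos = 0
  do j = 1, n
    do i = j, n
      pos = pos + 1
      apl(pos) = a(i, j)
    end do
  end do

  ! Spot checks against the layout stated in the text.
  if (apu(2) /= a(1, 2) .or. apu(3) /= a(2, 2)) error stop 'upper packing mismatch'
  if (apl(2) /= a(2, 1) .or. apl(3) /= a(3, 1)) error stop 'lower packing mismatch'

  print '(a,10f6.1)', 'upper packed: ', apu
  print '(a,10f6.1)', 'lower packed: ', apl
end program packed_storage_sketch
```

Either packed array could then be passed as the ap argument, with uplo set to match the triangle that was packed.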
x REAL for stpmv

DOUBLE PRECISION for dtpmv
COMPLEX for ctpmv
DOUBLE COMPLEX for ztpmv

Output Parameters
Specific details for the routine tpmv interface are the following:
?tpsv
Solves a system of linear equations whose coefficients
are in a triangular packed matrix.
Syntax
call stpsv(uplo, trans, diag, n, ap, x, incx)
call dtpsv(uplo, trans, diag, n, ap, x, incx)
call ctpsv(uplo, trans, diag, n, ap, x, incx)
call ztpsv(uplo, trans, diag, n, ap, x, incx)
call tpsv(ap, x [,uplo] [, trans] [,diag])
Include Files
mkl.fi, blas.f90
Description
The ?tpsv routines solve one of the following systems of equations
A*x = b, or A'*x = b, or conjg(A')*x = b,
where:
b and x are n-element vectors,
A is an n-by-n unit, or non-unit, upper or lower triangular matrix, supplied in packed form.
This routine does not test for singularity or near-singularity.
Input Parameters

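uplo CHARACTER*1. Specifies whether the matrix A is upper or lower triangular:
if uplo = 'U' or 'u', then the matrix is upper triangular;
if uplo = 'L' or 'l', then the matrix is lower triangular.
trans CHARACTER*1. Specifies the system of equations:
if trans = 'N' or 'n', then A*x = b;
if trans = 'T' or 't', then A'*x = b;
if trans = 'C' or 'c', then conjg(A')*x = b.
diag CHARACTER*1. Specifies whether the matrix A is unit triangular:
if diag = 'U' or 'u', then the matrix is unit triangular;
if diag = 'N' or 'n', then the matrix is not unit triangular.
n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.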
ap REAL for stpsv

DOUBLE PRECISION for dtpsv
COMPLEX for ctpsv
DOUBLE COMPLEX for ztpsv
Array, size at least ((n*(n + 1))/2).
Before entry with uplo = 'U' or 'u', the array ap must contain the upper
triangular part of the triangular matrix packed sequentially,
column-by-column, so that ap(1) contains a(1, 1), ap(2) and ap(3) contain
a(1, 2) and a(2, 2) respectively, and so on.
Before entry with uplo = 'L' or 'l', the array ap must contain the lower
triangular part of the triangular matrix packed sequentially,
column-by-column, so that ap(1) contains a(1, 1), ap(2) and ap(3) contain
a(2, 1) and a(3, 1) respectively, and so on.
When diag = 'U' or 'u', the diagonal elements of a are not referenced,
but are assumed to be unity.
x REAL for stpsv

DOUBLE PRECISION for dtpsv
COMPLEX for ctpsv
DOUBLE COMPLEX for ztpsv

Output Parameters

Specific details for the routine tpsv interface are the following:
?trmv
Computes a matrix-vector product using a triangular
matrix.
Syntax
call strmv(uplo, trans, diag, n, a, lda, x, incx)
call dtrmv(uplo, trans, diag, n, a, lda, x, incx)
call ctrmv(uplo, trans, diag, n, a, lda, x, incx)
call ztrmv(uplo, trans, diag, n, a, lda, x, incx)
call trmv(a, x [,uplo] [, trans] [,diag])
Include Files
mkl.fi, blas.f90
Description
The ?trmv routines perform one of the following matrix-vector operations defined as
x := A*x, or x := A'*x, or x := conjg(A')*x,
where:
x is an n-element vector,
A is an n-by-n unit, or non-unit, upper or lower triangular matrix.
Input Parameters

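uplo CHARACTER*1. Specifies whether the matrix A is upper or lower triangular:
if uplo = 'U' or 'u', then the matrix is upper triangular;
if uplo = 'L' or 'l', then the matrix is lower triangular.
trans CHARACTER*1. Specifies the operation:
if trans = 'N' or 'n', then x := A*x;
if trans = 'T' or 't', then x := A'*x;
if trans = 'C' or 'c', then x := conjg(A')*x.
diag CHARACTER*1. Specifies whether the matrix A is unit triangular:
if diag = 'U' or 'u', then the matrix is unit triangular;
if diag = 'N' or 'n', then the matrix is not unit triangular.
n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.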
a REAL for strmv

DOUBLE PRECISION for dtrmv
COMPLEX for ctrmv
DOUBLE COMPLEX for ztrmv
Array, size (lda, n). Before entry with uplo = 'U' or 'u', the leading
n-by-n upper triangular part of the array a must contain the upper triangular
matrix and the strictly lower triangular part of a is not referenced.
Before entry with uplo = 'L' or 'l', the leading n-by-n lower triangular part
of the array a must contain the lower triangular matrix and the strictly upper
triangular part of a is not referenced.
When diag = 'U' or 'u', the diagonal elements of a are not referenced
either, but are assumed to be unity.

x REAL for strmv

DOUBLE PRECISION for dtrmv
COMPLEX for ctrmv
DOUBLE COMPLEX for ztrmv

Output Parameters

Specific details for the routine trmv interface are the following:
?trsv
Solves a system of linear equations whose coefficients
are in a triangular matrix.
Syntax
call strsv(uplo, trans, diag, n, a, lda, x, incx)
call dtrsv(uplo, trans, diag, n, a, lda, x, incx)
call ctrsv(uplo, trans, diag, n, a, lda, x, incx)
call ztrsv(uplo, trans, diag, n, a, lda, x, incx)
call trsv(a, x [,uplo] [, trans] [,diag])
Include Files
mkl.fi, blas.f90
Description
The ?trsv routines solve one of the systems of equations:
A*x = b, or A'*x = b, or conjg(A')*x = b,
where:
b and x are n-element vectors,
A is an n-by-n unit, or non-unit, upper or lower triangular matrix.
The routine does not test for singularity or near-singularity.
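To make the operation concrete, the following sketch solves A*x = b by forward substitution for one case only: a lower triangular, non-unit A with no transpose. It is a hand-written illustration, not the library's implementation; the program name and the test data are assumptions.

```fortran
program trsv_sketch
  implicit none
  integer, parameter :: n = 3
  double precision :: a(n, n), x(n), b(n)
  integer :: i, j

  ! Lower triangular A; the strictly upper part is never read by the loop.
  a = 0.0d0
  a(1, 1) = 2.0d0
  a(2, 1) = 1.0d0; a(2, 2) = 4.0d0
  a(3, 1) = 3.0d0; a(3, 2) = 5.0d0; a(3, 3) = 8.0d0

  b = [2.0d0, 9.0d0, 37.0d0]
  x = b   ! on entry x holds b; on exit it holds the solution

  ! Column-oriented forward substitution for the lower, non-transposed,
  ! non-unit case. There is deliberately no singularity test.
  do j = 1, n
    x(j) = x(j) / a(j, j)
    do i = j + 1, n
      x(i) = x(i) - x(j)*a(i, j)
    end do
  end do

  ! Self-check: the residual A*x - b should vanish for this data.
  if (maxval(abs(matmul(a, x) - b)) > 1.0d-12) error stop 'residual too large'
  print '(a,3f8.3)', 'x = ', x
end program trsv_sketch
```

For this data the solution is x = (1, 2, 3). With the library linked, the same case would correspond to the argument list shown in the Syntax section with uplo = 'L', trans = 'N', diag = 'N', lda = n, and incx = 1.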
Input Parameters

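uplo CHARACTER*1. Specifies whether the matrix A is upper or lower triangular:
if uplo = 'U' or 'u', then the matrix is upper triangular;
if uplo = 'L' or 'l', then the matrix is lower triangular.
trans CHARACTER*1. Specifies the system of equations:
if trans = 'N' or 'n', then A*x = b;
if trans = 'T' or 't', then A'*x = b;
if trans = 'C' or 'c', then conjg(A')*x = b.
diag CHARACTER*1. Specifies whether the matrix A is unit triangular:
if diag = 'U' or 'u', then the matrix is unit triangular;
if diag = 'N' or 'n', then the matrix is not unit triangular.
n INTEGER. Specifies the order of the matrix A. The value of n must be at
least zero.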
a REAL for strsv

DOUBLE PRECISION for dtrsv
COMPLEX for ctrsv
DOUBLE COMPLEX for ztrsv
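Array, size (lda, n). Before entry with uplo = 'U' or 'u', the leading
n-by-n upper triangular part of the array a must contain the upper triangular
matrix and the strictly lower triangular part of a is not referenced.
Before entry with uplo = 'L' or 'l', the leading n-by-n lower triangular part
of the array a must contain the lower triangular matrix and the strictly upper
triangular part of a is not referenced.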

x REAL for strsv

DOUBLE PRECISION for dtrsv
COMPLEX for ctrsv
DOUBLE COMPLEX for ztrsv

Output Parameters

Specific details for the routine trsv interface are the following:
a Holds the matrix a of size (n,n).
BLAS Level 3 Routines
BLAS Level 3 routines perform matrix-matrix operations. Table BLAS Level 3 Routine Groups and Their Data
Types lists the BLAS Level 3 routine groups and the data types associated with them.
BLAS Level 3 Routine Groups and Their Data Types
Routine Group   Data Types   Description
?gemm           s, d, c, z   Computes a matrix-matrix product with general matrices.
?hemm           c, z         Computes a matrix-matrix product where one input matrix is Hermitian.
?herk           c, z         Performs a Hermitian rank-k update.
?her2k          c, z         Performs a Hermitian rank-2k update.
?symm           s, d, c, z   Computes a matrix-matrix product where one input matrix is symmetric.
?syrk           s, d, c, z   Performs a symmetric rank-k update.
?syr2k          s, d, c, z   Performs a symmetric rank-2k update.
?trmm           s, d, c, z   Computes a matrix-matrix product where one input matrix is triangular.
?trsm           s, d, c, z   Solves a triangular matrix equation.
Symmetric Multiprocessing Version of Intel MKL

Many applications spend considerable time executing BLAS routines. This time can be scaled by the number
of processors available on the system through using the symmetric multiprocessing (SMP) feature built into
the Intel MKL Library. The performance enhancements based on the parallel use of the processors are
available without any programming effort on your part.
To enhance performance, the library uses the following methods:
The BLAS functions are blocked where possible to restructure the code in a way that increases the
localization of data reference, enhances cache memory use, and reduces the dependency on the memory
bus.
The code is distributed across the processors to maximize parallelism.
?gemm
Computes a matrix-matrix product with general
matrices.
Syntax
call sgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call dgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call cgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call zgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call scgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call dzgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call gemm(a, b, c [,transa][,transb] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?gemm routines compute a scalar-matrix-matrix product and add the result to a scalar-matrix product,
with general matrices. The operation is defined as
C := alpha*op(A)*op(B) + beta*C,
where:
alpha and beta are scalars,
op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H,
A, B and C are matrices:
op(A) is an m-by-k matrix,
op(B) is a k-by-n matrix,
C is an m-by-n matrix.
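The definition above, for the case where neither A nor B is transposed, can be written as a plain triple loop. The sketch below is a hand-written reference for illustration, not the library's blocked implementation; the program name, the test values, and the choice lda = 4 (larger than m, to show that only the leading m rows are used) are assumptions.

```fortran
program gemm_sketch
  implicit none
  integer, parameter :: m = 2, n = 2, k = 3
  integer, parameter :: lda = 4, ldb = 3, ldc = 2   ! lda > m on purpose
  double precision :: a(lda, k), b(ldb, n), c(ldc, n), cref(m, n)
  double precision :: alpha, beta, t
  integer :: i, j, l

  alpha = 2.0d0
  beta  = 0.5d0

  ! A = [1 2 3; 4 5 6] stored in the leading m rows of a (column-major).
  a = 0.0d0
  a(1:m, :) = reshape([1.0d0, 4.0d0, 2.0d0, 5.0d0, 3.0d0, 6.0d0], [m, k])
  ! B has columns (1,0,1) and (0,1,1).
  b(1:k, :) = reshape([1.0d0, 0.0d0, 1.0d0, 0.0d0, 1.0d0, 1.0d0], [k, n])
  c = 1.0d0

  ! Independent reference using the matmul intrinsic.
  cref = alpha*matmul(a(1:m, :), b(1:k, :)) + beta*c(1:m, :)

  ! C := alpha*A*B + beta*C, written out element by element.
  do j = 1, n
    do i = 1, m
      t = 0.0d0
      do l = 1, k
        t = t + a(i, l)*b(l, j)
      end do
      c(i, j) = alpha*t + beta*c(i, j)
    end do
  end do

  if (maxval(abs(c(1:m, :) - cref)) > 1.0d-12) error stop 'gemm sketch mismatch'
  do i = 1, m
    print '(2f8.3)', c(i, :)
  end do
end program gemm_sketch
```

For these values C becomes [8.5 10.5; 20.5 22.5]. The rows of a beyond m are padding and never enter the result, which is the role the leading dimension plays in the routine's argument list.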
See also
cblas_?gemm for the C language interface to this routine

?gemm3m, BLAS-like extension routines, that use matrix multiplication for similar matrix-matrix operations
Input Parameters
transa CHARACTER*1. Specifies the form of op(A) used in the matrix

multiplication:
if transa = 'N' or 'n', then op(A) = A;
if transa = 'T' or 't', then op(A) = A^T;
if transa = 'C' or 'c', then op(A) = A^H.
transb CHARACTER*1. Specifies the form of op(B) used in the matrix

multiplication:
if transb = 'N' or 'n', then op(B) = B;
if transb = 'T' or 't', then op(B) = B^T;
if transb = 'C' or 'c', then op(B) = B^H.
m INTEGER. Specifies the number of rows of the matrix op(A) and of the
matrix C. The value of m must be at least zero.
n INTEGER. Specifies the number of columns of the matrix op(B) and the
number of columns of the matrix C.
The value of n must be at least zero.
k INTEGER. Specifies the number of columns of the matrix op(A) and the
number of rows of the matrix op(B).
The value of k must be at least zero.
alpha REAL for sgemm

DOUBLE PRECISION for dgemm
COMPLEX for cgemm, scgemm
DOUBLE COMPLEX for zgemm, dzgemm
a REAL for sgemm, scgemm

DOUBLE PRECISION for dgemm, dzgemm
COMPLEX for cgemm
DOUBLE COMPLEX for zgemm
Array, size lda by ka, where ka is k when transa = 'N' or 'n', and is m
otherwise. Before entry with transa = 'N' or 'n', the leading m-by-k part
of the array a must contain the matrix A, otherwise the leading k-by-m part
of the array a must contain the matrix A.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program.
When transa = 'N' or 'n', then lda must be at least max(1, m),
otherwise lda must be at least max(1, k).
b REAL for sgemm

DOUBLE PRECISION for dgemm
COMPLEX for cgemm, scgemm
DOUBLE COMPLEX for zgemm, dzgemm
Array, size ldb by kb, where kb is n when transb = 'N' or 'n', and is k
otherwise. Before entry with transb = 'N' or 'n', the leading k-by-n part
of the array b must contain the matrix B, otherwise the leading n-by-k part
of the array b must contain the matrix B.
ldb INTEGER. Specifies the leading dimension of b as declared in the calling

(sub)program.
When transb = 'N' or 'n', then ldb must be at least max(1, k),
otherwise ldb must be at least max(1, n).
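beta REAL for sgemm

DOUBLE PRECISION for dgemm
COMPLEX for cgemm, scgemm
DOUBLE COMPLEX for zgemm, dzgemm
Specifies the scalar beta.
When beta is equal to zero, then c need not be set on input.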
c REAL for sgemm

DOUBLE PRECISION for dgemm
COMPLEX for cgemm, scgemm
DOUBLE COMPLEX for zgemm, dzgemm
Array, size ldc by n. Before entry, the leading m-by-n part of the array c
must contain the matrix C, except when beta is equal to zero, in which
case c need not be set on entry.
ldc INTEGER. Specifies the leading dimension of c as declared in the calling

(sub)program.
The value of ldc must be at least max(1, m).
Output Parameters
c Overwritten by the m-by-n matrix (alpha*op(A)*op(B) + beta*C).
Example
For examples of routine usage, see the code in the Intel MKL installation directory:
sgemm: examples\blas\source\sgemmx.f
dgemm: examples\blas\source\dgemmx.f
cgemm: examples\blas\source\cgemmx.f
zgemm: examples\blas\source\zgemmx.f

Specific details for the routine gemm interface are the following:
a Holds the matrix A of size (ma,ka) where

ka = k if transa='N',
ka = m otherwise,
ma = m if transa='N',
ma = k otherwise.
b Holds the matrix B of size (mb,kb) where

kb = n if transb = 'N',
kb = k otherwise,
mb = k if transb = 'N',
mb = n otherwise.
c Holds the matrix C of size (m,n).
transa Must be 'N', 'C', or 'T'.
transb Must be 'N', 'C', or 'T'.
?hemm
Computes a matrix-matrix product where one input
matrix is Hermitian.
Syntax
call chemm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call zhemm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call hemm(a, b, c [,side][,uplo] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?hemm routines compute a scalar-matrix-matrix product using a Hermitian matrix A and a general matrix
B and add the result to a scalar-matrix product using a general matrix C. The operation is defined as
C := alpha*A*B + beta*C
or
C := alpha*B*A + beta*C,
where:
A is a Hermitian matrix,
B and C are m-by-n matrices.
Input Parameters
side CHARACTER*1. Specifies whether the Hermitian matrix A appears on the left
or right in the operation as follows:
if side = 'L' or 'l', then C := alpha*A*B + beta*C;
if side = 'R' or 'r', then C := alpha*B*A + beta*C.
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
Hermitian matrix A is used:
If uplo = 'U' or 'u', then the upper triangular part of the Hermitian
matrix A is used.
If uplo = 'L' or 'l', then the lower triangular part of the Hermitian matrix
A is used.
m INTEGER. Specifies the number of rows of the matrix C.

n INTEGER. Specifies the number of columns of the matrix C.

alpha COMPLEX for chemm

DOUBLE COMPLEX for zhemm
a COMPLEX for chemm

DOUBLE COMPLEX for zhemm
Array, size (lda, ka), where ka is m when side = 'L' or 'l' and is n
otherwise. Before entry with side = 'L' or 'l', the m-by-m part of the
array a must contain the Hermitian matrix, such that when uplo = 'U' or
'u', the leading m-by-m upper triangular part of the array a must contain
the upper triangular part of the Hermitian matrix and the strictly lower
triangular part of a is not referenced, and when uplo = 'L' or 'l', the
leading m-by-m lower triangular part of the array a must contain the lower
triangular part of the Hermitian matrix, and the strictly upper triangular
part of a is not referenced.
Before entry with side = 'R' or 'r', the n-by-n part of the array a must
contain the Hermitian matrix, such that when uplo = 'U' or 'u', the
leading n-by-n upper triangular part of the array a must contain the upper
triangular part of the Hermitian matrix and the strictly lower triangular part
of a is not referenced, and when uplo = 'L' or 'l', the leading n-by-n
lower triangular part of the array a must contain the lower triangular part of
the Hermitian matrix, and the strictly upper triangular part of a is not
referenced. The imaginary parts of the diagonal elements need not be set;
they are assumed to be zero.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program. When side = 'L' or 'l' then lda must be at least max(1,
m), otherwise lda must be at least max(1,n).
b COMPLEX for chemm

DOUBLE COMPLEX for zhemm
Array, size ldb by n.
The leading m-by-n part of the array b must contain the matrix B.
ldb INTEGER. Specifies the leading dimension of b as declared in the calling
(sub)program. ldb must be at least max(1, m).
beta COMPLEX for chemm

DOUBLE COMPLEX for zhemm
Specifies the scalar beta.
When beta is supplied as zero, then c need not be set on input.
c COMPLEX for chemm

DOUBLE COMPLEX for zhemm
Array, size (ldc, n). Before entry, the leading m-by-n part of the array c
must contain the matrix C, except when beta is zero, in which case c need
not be set on entry.
ldc INTEGER. Specifies the leading dimension of c as declared in the calling
(sub)program. ldc must be at least max(1, m).
Output Parameters
c Overwritten by the m-by-n updated matrix.

Specific details for the routine hemm interface are the following:
a Holds the matrix A of size (k,k) where

k = m if side = 'L',
k = n otherwise.
b Holds the matrix B of size (m,n).
side Must be 'L' or 'R'. The default value is 'L'.
?herk
Performs a Hermitian rank-k update.
Syntax
call cherk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call zherk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call herk(a, c [,uplo] [, trans] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?herk routines perform a rank-k matrix-matrix operation using a general matrix A and a Hermitian
matrix C. The operation is defined as
C := alpha*A*A^H + beta*C,
or
C := alpha*A^H*A + beta*C,
where:
alpha and beta are real scalars,
C is an n-by-n Hermitian matrix,
A is an n-by-k matrix in the first case and a k-by-n matrix in the second case.
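The first form of the update can be sketched with explicit loops that touch only the upper triangle of C, which is how the storage is treated when the upper part is selected. This is a hand-written illustration under stated assumptions, not the library's implementation; the program name, the kind parameter dp, and the test data are hypothetical.

```fortran
program herk_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, k = 2
  complex(dp) :: a(n, k), c(n, n), full(n, n)
  real(dp) :: alpha, beta
  integer :: i, j, l

  alpha = 1.0_dp
  beta  = 2.0_dp

  ! Arbitrary complex test data for the general n-by-k matrix A.
  do l = 1, k
    do i = 1, n
      a(i, l) = cmplx(i, l, kind=dp)
    end do
  end do

  ! A full Hermitian matrix: full(i,j) = conjg(full(j,i)), real diagonal.
  do j = 1, n
    do i = 1, n
      full(i, j) = cmplx(i + j, i - j, kind=dp)
    end do
  end do

  ! c carries only the upper triangle; the strictly lower part stays zero.
  c = (0.0_dp, 0.0_dp)
  do j = 1, n
    do i = 1, j
      c(i, j) = full(i, j)
    end do
  end do

  ! Dense reference result computed on the full Hermitian matrix.
  full = alpha*matmul(a, conjg(transpose(a))) + beta*full

  ! C := alpha*A*A^H + beta*C, updating the upper triangle only.
  do j = 1, n
    do i = 1, j
      c(i, j) = beta*c(i, j)
      do l = 1, k
        c(i, j) = c(i, j) + alpha*a(i, l)*conjg(a(j, l))
      end do
    end do
    ! The diagonal of a Hermitian matrix is real.
    c(j, j) = cmplx(real(c(j, j), dp), 0.0_dp, kind=dp)
  end do

  do j = 1, n
    do i = 1, j
      if (abs(c(i, j) - full(i, j)) > 1.0e-12_dp) error stop 'upper triangle mismatch'
    end do
  end do
  if (c(2, 1) /= (0.0_dp, 0.0_dp)) error stop 'lower triangle was modified'

  do i = 1, n
    print *, c(i, i:n)
  end do
end program herk_sketch
```

Because alpha and beta are real and only one triangle is written, the result stays Hermitian by construction; the other triangle is implied by conjugate symmetry.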
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array c is used.
If uplo = 'U' or 'u', then the upper triangular part of the array c is used.
If uplo = 'L' or 'l', then the lower triangular part of the array c is used.
trans CHARACTER*1. Specifies the operation:
if trans = 'N' or 'n', then C := alpha*A*A^H + beta*C;
if trans = 'C' or 'c', then C := alpha*A^H*A + beta*C.
n INTEGER. Specifies the order of the matrix C. The value of n must be at

least zero.
k INTEGER. With trans= 'N' or 'n', k specifies the number of columns of

the matrix A, and with trans= 'C' or 'c', k specifies the number of rows
of the matrix A.
alpha REAL for cherk

DOUBLE PRECISION for zherk
a COMPLEX for cherk

DOUBLE COMPLEX for zherk
Array, size (lda, ka), where ka is k when trans= 'N' or 'n', and is n
otherwise. Before entry with trans= 'N' or 'n', the leading n-by-k part of
the array a must contain the matrix A, otherwise the leading k-by-n part of
the array a must contain the matrix A.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program. When trans = 'N' or 'n', then lda must be at least max(1,
n), otherwise lda must be at least max(1, k).
beta REAL for cherk

DOUBLE PRECISION for zherk
c COMPLEX for cherk

DOUBLE COMPLEX for zherk
Array, size ldc by n.
Before entry with uplo = 'U' or 'u', the leading n-by-n upper triangular
part of the array c must contain the upper triangular part of the Hermitian
matrix and the strictly lower triangular part of c is not referenced.
Before entry with uplo = 'L' or 'l', the leading n-by-n lower triangular
part of the array c must contain the lower triangular part of the Hermitian
matrix and the strictly upper triangular part of c is not referenced.
The imaginary parts of the diagonal elements need not be set; they are
assumed to be zero.
ldc INTEGER. Specifies the leading dimension of c as declared in the calling
(sub)program. The value of ldc must be at least max(1, n).
Output Parameters
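c With uplo = 'U' or 'u', the upper triangular part of the array c is
overwritten by the upper triangular part of the updated matrix.
With uplo = 'L' or 'l', the lower triangular part of the array c is
overwritten by the lower triangular part of the updated matrix.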

Specific details for the routine herk interface are the following:
a Holds the matrix A of size (ma,ka) where
ka = k if trans='N',
ka = n otherwise,
ma = n if trans='N',
ma = k otherwise.
c Holds the matrix C of size (n,n).
trans Must be 'N' or 'C'. The default value is 'N'.
?her2k
Performs a Hermitian rank-2k update.
Syntax
call cher2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call zher2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call her2k(a, b, c [,uplo][,trans] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?her2k routines perform a rank-2k matrix-matrix operation using general matrices A and B and a
Hermitian matrix C. The operation is defined as
C := alpha*A*B^H + conjg(alpha)*B*A^H + beta*C,
or
C := alpha*A^H*B + conjg(alpha)*B^H*A + beta*C,
where:
alpha is a scalar and beta is a real scalar,
C is an n-by-n Hermitian matrix,
A and B are n-by-k matrices in the first case and k-by-n matrices in the second case.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array c is used.
If uplo = 'U' or 'u', then the upper triangular part of the array c is used.
If uplo = 'L' or 'l', then the lower triangular part of the array c is used.
trans CHARACTER*1. Specifies the operation:
if trans = 'N' or 'n', then C := alpha*A*B^H + conjg(alpha)*B*A^H + beta*C;
if trans = 'C' or 'c', then C := alpha*A^H*B + conjg(alpha)*B^H*A + beta*C.
n INTEGER. Specifies the order of the matrix C. The value of n must be at
least zero.
k INTEGER. With trans = 'N' or 'n', k specifies the number of columns of the
matrix A, and with trans = 'C' or 'c', k specifies the number of rows of
the matrix A.
The value of k must be at least zero.
alpha COMPLEX for cher2k

DOUBLE COMPLEX for zher2k
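a COMPLEX for cher2k

DOUBLE COMPLEX for zher2k
Array, size (lda, ka), where ka is k when trans = 'N' or 'n', and is n
otherwise. Before entry with trans = 'N' or 'n', the leading n-by-k part of
the array a must contain the matrix A, otherwise the leading k-by-n part of
the array a must contain the matrix A.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program.
When trans = 'N' or 'n', then lda must be at least
max(1, n), otherwise lda must be at least max(1, k).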
beta REAL for cher2k

DOUBLE PRECISION for zher2k
b COMPLEX for cher2k

DOUBLE COMPLEX for zher2k
Array, size (ldb, kb), where kb is k when trans = 'N' or 'n', and is n
otherwise. Before entry with trans = 'N' or 'n', the leading n-by-k part of
the array b must contain the matrix B, otherwise the leading k-by-n part of
the array b must contain the matrix B.
ldb INTEGER. Specifies the leading dimension of b as declared in the calling

(sub)program.
When trans= 'N' or 'n', then ldb must be at least max(1, n), otherwise
ldb must be at least max(1, k).
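c COMPLEX for cher2k

DOUBLE COMPLEX for zher2k
Array, size ldc by n.
Before entry with uplo = 'U' or 'u', the leading n-by-n upper triangular
part of the array c must contain the upper triangular part of the Hermitian
matrix and the strictly lower triangular part of c is not referenced.
Before entry with uplo = 'L' or 'l', the leading n-by-n lower triangular
part of the array c must contain the lower triangular part of the Hermitian
matrix and the strictly upper triangular part of c is not referenced.
The imaginary parts of the diagonal elements need not be set; they are
assumed to be zero.
ldc INTEGER. Specifies the leading dimension of c as declared in the calling
(sub)program. The value of ldc must be at least max(1, n).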
Output Parameters

Specific details for the routine her2k interface are the following:
a Holds the matrix A of size (ma,ka) where
ka = k if trans = 'N',
ka = n otherwise,
ma = n if trans = 'N',
ma = k otherwise.
b Holds the matrix B of size (mb,kb) where
kb = k if trans = 'N',
kb = n otherwise,
mb = n if trans = 'N',
mb = k otherwise.
?symm
Computes a matrix-matrix product where one input
matrix is symmetric.
Syntax
call ssymm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call dsymm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call csymm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call zsymm(side, uplo, m, n, alpha, a, lda, b, ldb, beta, c, ldc)
call symm(a, b, c [,side][,uplo] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?symm routines compute a scalar-matrix-matrix product with one symmetric matrix and add the result to
a scalar-matrix product. The operation is defined as
C := alpha*A*B + beta*C,
or
C := alpha*B*A + beta*C,
where:
A is a symmetric matrix,
B and C are m-by-n matrices.
Input Parameters
side CHARACTER*1. Specifies whether the symmetric matrix A appears on the

left or right in the operation:
if side = 'L' or 'l', then C := alpha*A*B + beta*C;
if side = 'R' or 'r', then C := alpha*B*A + beta*C.
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
symmetric matrix A is used:
if uplo = 'U' or 'u', then the upper triangular part is used;
if uplo = 'L' or 'l', then the lower triangular part is used.
m INTEGER. Specifies the number of rows of the matrix C.

n INTEGER. Specifies the number of columns of the matrix C.

alpha REAL for ssymm

DOUBLE PRECISION for dsymm
COMPLEX for csymm
DOUBLE COMPLEX for zsymm
a REAL for ssymm

DOUBLE PRECISION for dsymm
COMPLEX for csymm
DOUBLE COMPLEX for zsymm
Array, size (lda, ka), where ka is m when side = 'L' or 'l' and is n
otherwise.
Before entry with side = 'L' or 'l', the m-by-m part of the array a must
contain the symmetric matrix, such that when uplo = 'U' or 'u', the
leading m-by-m upper triangular part of the array a must contain the upper
triangular part of the symmetric matrix and the strictly lower triangular part
of a is not referenced, and when uplo = 'L' or 'l', the leading m-by-m
lower triangular part of the array a must contain the lower triangular part of
the symmetric matrix and the strictly upper triangular part of a is not
referenced.
Before entry with side = 'R' or 'r', the n-by-n part of the array a must
contain the symmetric matrix, such that when uplo = 'U' or 'u', the
leading n-by-n upper triangular part of the array a
must contain the upper triangular part of the symmetric matrix and the
strictly lower triangular part of a is not referenced, and when uplo = 'L'
or 'l', the leading n-by-n lower triangular part of the array a must contain
the lower triangular part of the symmetric matrix and the strictly upper
triangular part of a is not referenced.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program. When side = 'L' or 'l' then lda must be at least max(1,
m), otherwise lda must be at least max(1, n).
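b REAL for ssymm

DOUBLE PRECISION for dsymm
COMPLEX for csymm
DOUBLE COMPLEX for zsymm
Array, size ldb by n.
The leading m-by-n part of the array b must contain the matrix B.
ldb INTEGER. Specifies the leading dimension of b as declared in the calling
(sub)program. ldb must be at least max(1, m).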
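beta REAL for ssymm

DOUBLE PRECISION for dsymm
COMPLEX for csymm
DOUBLE COMPLEX for zsymm
When beta is set to zero, then c need not be set on input.
c REAL for ssymm

DOUBLE PRECISION for dsymm
COMPLEX for csymm
DOUBLE COMPLEX for zsymm
Array, size (ldc, n). Before entry, the leading m-by-n part of the array c
must contain the matrix C, except when beta is zero, in which case c need
not be set on entry.
ldc INTEGER. Specifies the leading dimension of c as declared in the calling
(sub)program. ldc must be at least max(1, m).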
Output Parameters
c Overwritten by the m-by-n updated matrix.

Specific details for the routine symm interface are the following:
a Holds the matrix A of size (k,k) where
k = m if side = 'L',
k = n otherwise.
?syrk
Performs a symmetric rank-k update.
Syntax
call ssyrk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call dsyrk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call csyrk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call zsyrk(uplo, trans, n, k, alpha, a, lda, beta, c, ldc)
call syrk(a, c [,uplo] [, trans] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?syrk routines perform a rank-k matrix-matrix operation for a symmetric matrix C using a general
matrix A. The operation is defined as
C := alpha*A*A' + beta*C,
or
C := alpha*A'*A + beta*C,
where:
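alpha and beta are scalars,
C is an n-by-n symmetric matrix,
A is an n-by-k matrix in the first case and a k-by-n matrix in the second case.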
Input Parameters
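uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array c is used.
If uplo = 'U' or 'u', then the upper triangular part of the array c is used.
If uplo = 'L' or 'l', then the lower triangular part of the array c is used.
trans CHARACTER*1. Specifies the operation:
if trans = 'N' or 'n', then C := alpha*A*A' + beta*C;
if trans = 'T' or 't', then C := alpha*A'*A + beta*C;
if trans = 'C' or 'c', then C := alpha*A'*A + beta*C.
n INTEGER. Specifies the order of the matrix C. The value of n must be at
least zero.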
k INTEGER. On entry with trans= 'N' or 'n', k specifies the number of

columns of the matrix A, and on entry with trans = 'T', 't', 'C', or 'c',
k specifies the number of rows of the matrix A.
alpha REAL for ssyrk

DOUBLE PRECISION for dsyrk
COMPLEX for csyrk
DOUBLE COMPLEX for zsyrk
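a REAL for ssyrk

DOUBLE PRECISION for dsyrk
COMPLEX for csyrk
DOUBLE COMPLEX for zsyrk
Array, size (lda, ka), where ka is k when trans = 'N' or 'n', and is n
otherwise. Before entry with trans = 'N' or 'n', the leading n-by-k part of
the array a must contain the matrix A, otherwise the leading k-by-n part of
the array a must contain the matrix A.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program. When trans = 'N' or 'n', then lda must be at least max(1,
n), otherwise lda must be at least max(1, k).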
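beta REAL for ssyrk

DOUBLE PRECISION for dsyrk
COMPLEX for csyrk
DOUBLE COMPLEX for zsyrk
c REAL for ssyrk

DOUBLE PRECISION for dsyrk
COMPLEX for csyrk
DOUBLE COMPLEX for zsyrk
Array, size (ldc, n). Before entry with uplo = 'U' or 'u', the leading
n-by-n upper triangular part of the array c must contain the upper triangular
part of the symmetric matrix and the strictly lower triangular part of c is not
referenced.
Before entry with uplo = 'L' or 'l', the leading n-by-n lower triangular
part of the array c must contain the lower triangular part of the symmetric
matrix and the strictly upper triangular part of c is not referenced.
ldc INTEGER. Specifies the leading dimension of c as declared in the calling
(sub)program. The value of ldc must be at least max(1, n).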
Output Parameters

Specific details for the routine syrk interface are the following:
a Holds the matrix A of size (ma,ka) where
ka = k if trans = 'N',
ka = n otherwise,
ma = n if trans = 'N',
ma = k otherwise.
?syr2k
Performs a symmetric rank-2k update.
Syntax
call ssyr2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call dsyr2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call csyr2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call zsyr2k(uplo, trans, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call syr2k(a, b, c [,uplo][,trans] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?syr2k routines perform a rank-2k matrix-matrix operation for a symmetric matrix C using general
matrices A and B. The operation is defined as
C := alpha*A*B' + alpha*B*A' + beta*C,
or
C := alpha*A'*B + alpha*B'*A + beta*C,
where:
C is an n-by-n symmetric matrix,
A and B are n-by-k matrices in the first case, and k-by-n matrices in the second case.
Input Parameters
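uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array c is used.
If uplo = 'U' or 'u', then the upper triangular part of the array c is used.
If uplo = 'L' or 'l', then the lower triangular part of the array c is used.
trans CHARACTER*1. Specifies the operation:
if trans = 'N' or 'n', then C := alpha*A*B' + alpha*B*A' + beta*C;
if trans = 'T' or 't', then C := alpha*A'*B + alpha*B'*A + beta*C;
if trans = 'C' or 'c', then C := alpha*A'*B + alpha*B'*A + beta*C.
n INTEGER. Specifies the order of the matrix C. The value of n must be at
least zero.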
k INTEGER. On entry with trans= 'N' or 'n', k specifies the number of
columns of the matrices A and B, and on entry with trans= 'T' or 't' or
'C' or 'c', k specifies the number of rows of the matrices A and B. The
value of k must be at least zero.
alpha REAL for ssyr2k

DOUBLE PRECISION for dsyr2k
COMPLEX for csyr2k
DOUBLE COMPLEX for zsyr2k
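a REAL for ssyr2k

DOUBLE PRECISION for dsyr2k
COMPLEX for csyr2k
DOUBLE COMPLEX for zsyr2k
Array, size (lda, ka), where ka is k when trans = 'N' or 'n', and is n
otherwise. Before entry with trans = 'N' or 'n', the leading n-by-k part of
the array a must contain the matrix A, otherwise the leading k-by-n part of
the array a must contain the matrix A.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program.
When trans = 'N' or 'n', then lda must be at least
max(1, n), otherwise lda must be at least max(1, k).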
b REAL for ssyr2k

DOUBLE PRECISION for dsyr2k
COMPLEX for csyr2k
DOUBLE COMPLEX for zsyr2k
Array, size (ldb, kb), where kb is k when trans = 'N' or 'n', and is n
otherwise. Before entry with trans = 'N' or 'n', the leading n-by-k part of
the array b must contain the matrix B, otherwise the leading k-by-n part of
the array b must contain the matrix B.
ldb INTEGER. Specifies the leading dimension of b as declared in the calling

(sub)program.
When trans= 'N' or 'n', then ldb must be at least max(1, n), otherwise
ldb must be at least max(1, k).
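beta REAL for ssyr2k

DOUBLE PRECISION for dsyr2k
COMPLEX for csyr2k
DOUBLE COMPLEX for zsyr2k
c REAL for ssyr2k

DOUBLE PRECISION for dsyr2k
COMPLEX for csyr2k
DOUBLE COMPLEX for zsyr2k
Array, size (ldc, n). Before entry with uplo = 'U' or 'u', the leading
n-by-n upper triangular part of the array c must contain the upper triangular
part of the symmetric matrix and the strictly lower triangular part of c is not
referenced.
Before entry with uplo = 'L' or 'l', the leading n-by-n lower triangular
part of the array c must contain the lower triangular part of the symmetric
matrix and the strictly upper triangular part of c is not referenced.
ldc INTEGER. Specifies the leading dimension of c as declared in the calling
(sub)program. The value of ldc must be at least max(1, n).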
Output Parameters

Specific details for the routine syr2k interface are the following:
a Holds the matrix A of size (ma,ka) where
ka = k if trans = 'N',
ka = n otherwise,
ma = n if trans = 'N',
ma = k otherwise.
b Holds the matrix B of size (mb,kb) where
kb = k if trans = 'N',
kb = n otherwise,
mb = n if trans = 'N',
mb = k otherwise.
?trmm
Computes a matrix-matrix product where one input
matrix is triangular.
Syntax
call strmm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)

call dtrmm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call ctrmm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call ztrmm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call trmm(a, b [,side] [, uplo] [,transa][,diag] [,alpha])
Include Files
mkl.fi, blas.f90
Description
The ?trmm routines compute a scalar-matrix-matrix product with one triangular matrix. The operation is
defined as
B := alpha*op(A)*B
or
B := alpha*B*op(A)
where:
alpha is a scalar,
B is an m-by-n matrix,
A is a unit, or non-unit, upper or lower triangular matrix,
op(A) is one of op(A) = A, or op(A) = A', or op(A) = conjg(A').
Input Parameters
side CHARACTER*1. Specifies whether op(A) appears on the left or right of B in

the operation:
if side = 'L' or 'l', then B := alpha*op(A)*B;
if side = 'R' or 'r', then B := alpha*B*op(A).

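uplo CHARACTER*1. Specifies whether the matrix A is upper or lower triangular:
if uplo = 'U' or 'u', then the matrix is upper triangular;
if uplo = 'L' or 'l', then the matrix is lower triangular.
transa CHARACTER*1. Specifies the form of op(A) used in the matrix
multiplication:
if transa = 'N' or 'n', then op(A) = A;
if transa = 'T' or 't', then op(A) = A';
if transa = 'C' or 'c', then op(A) = conjg(A').
diag CHARACTER*1. Specifies whether the matrix A is unit triangular:
if diag = 'U' or 'u', then the matrix is unit triangular;
if diag = 'N' or 'n', then the matrix is not unit triangular.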
m INTEGER. Specifies the number of rows of B. The value of m must be at

least zero.
n INTEGER. Specifies the number of columns of B. The value of n must be at

least zero.
alpha REAL for strmm

DOUBLE PRECISION for dtrmm
COMPLEX for ctrmm
DOUBLE COMPLEX for ztrmm
When alpha is zero, then a is not referenced and b need not be set before
entry.
a REAL for strmm

DOUBLE PRECISION for dtrmm
COMPLEX for ctrmm
DOUBLE COMPLEX for ztrmm
Array, size lda by k, where k is m when side = 'L' or 'l' and is n
when side = 'R' or 'r'. Before entry with uplo = 'U' or 'u', the
leading k by k upper triangular part of the array a must contain the upper
triangular matrix and the strictly lower triangular part of a is not
referenced.
Before entry with uplo = 'L' or 'l', the leading k by k lower triangular
part of the array a must contain the lower triangular matrix and the strictly
upper triangular part of a is not referenced.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program. When side = 'L' or 'l', then lda must be at least max(1,
m), when side = 'R' or 'r', then lda must be at least max(1, n).
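b REAL for strmm

DOUBLE PRECISION for dtrmm
COMPLEX for ctrmm
DOUBLE COMPLEX for ztrmm
Array, size ldb by n. Before entry, the leading m-by-n part of the array b
must contain the matrix B.
ldb INTEGER. Specifies the leading dimension of b as declared in the calling
(sub)program. The value of ldb must be at least max(1, m).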
Output Parameters
b Overwritten by the transformed matrix.

Specific details for the routine trmm interface are the following:
a Holds the matrix A of size (k,k) where
k = m if side = 'L',
k = n otherwise.
?trsm
Solves a triangular matrix equation.
Syntax
call strsm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call dtrsm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call ctrsm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call ztrsm(side, uplo, transa, diag, m, n, alpha, a, lda, b, ldb)
call trsm(a, b [,side] [, uplo] [,transa][,diag] [,alpha])
Include Files
mkl.fi, blas.f90
Description
The ?trsm routines solve one of the following matrix equations:
op(A)*X = alpha*B,
or
X*op(A) = alpha*B,
where:
alpha is a scalar,
X and B are m-by-n matrices,
A is a unit, or non-unit, upper or lower triangular matrix,
op(A) is one of op(A) = A, or op(A) = A', or op(A) = conjg(A').
The matrix B is overwritten by the solution matrix X.
Input Parameters
side CHARACTER*1. Specifies whether op(A) appears on the left or right of X in

the equation:
if side = 'L' or 'l', then op(A)*X = alpha*B;
if side = 'R' or 'r', then X*op(A) = alpha*B.

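uplo CHARACTER*1. Specifies whether the matrix A is upper or lower triangular:
if uplo = 'U' or 'u', then the matrix is upper triangular;
if uplo = 'L' or 'l', then the matrix is lower triangular.
transa CHARACTER*1. Specifies the form of op(A) used in the matrix
multiplication:
if transa = 'N' or 'n', then op(A) = A;
if transa = 'T' or 't', then op(A) = A';
if transa = 'C' or 'c', then op(A) = conjg(A').
diag CHARACTER*1. Specifies whether the matrix A is unit triangular:
if diag = 'U' or 'u', then the matrix is unit triangular;
if diag = 'N' or 'n', then the matrix is not unit triangular.
m INTEGER. Specifies the number of rows of B. The value of m must be at
least zero.
n INTEGER. Specifies the number of columns of B. The value of n must be at
least zero.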
alpha REAL for strsm

DOUBLE PRECISION for dtrsm
COMPLEX for ctrsm
DOUBLE COMPLEX for ztrsm
When alpha is zero, then a is not referenced and b need not be set before
entry.
a REAL for strsm

DOUBLE PRECISION for dtrsm
COMPLEX for ctrsm
DOUBLE COMPLEX for ztrsm
Array, size (lda, k), where k is m when side = 'L' or 'l' and is n
when side = 'R' or 'r'. Before entry with uplo = 'U' or 'u', the
leading k by k upper triangular part of the array a must contain the upper
triangular matrix and the strictly lower triangular part of a is not
referenced.
Before entry with uplo = 'L' or 'l', the leading k by k lower triangular
part of the array a must contain the lower triangular matrix and the strictly
upper triangular part of a is not referenced.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program. When side = 'L' or 'l', then lda must be at least max(1,
m), when side = 'R' or 'r', then lda must be at least max(1, n).
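b REAL for strsm

DOUBLE PRECISION for dtrsm
COMPLEX for ctrsm
DOUBLE COMPLEX for ztrsm
Array, size ldb by n. Before entry, the leading m-by-n part of the array b
must contain the right-hand side matrix B.
ldb INTEGER. Specifies the leading dimension of b as declared in the calling
(sub)program. The value of ldb must be at least max(1, m).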
Output Parameters
b Overwritten by the solution matrix X.

Specific details for the routine trsm interface are the following:
a Holds the matrix A of size (k,k) where k = m if side = 'L', k = n

otherwise.
Sparse BLAS Level 1 Routines

This section describes Sparse BLAS Level 1, an extension of BLAS Level 1 included in the Intel Math Kernel
Library beginning with the Intel MKL release 2.1. Sparse BLAS Level 1 is a group of routines and functions
that perform a number of common vector operations on sparse vectors stored in compressed form.
Sparse vectors are those in which the majority of elements are zeros. Sparse BLAS routines and functions
are specially implemented to take advantage of vector sparsity. This allows you to achieve large savings in
computer time and memory. If nz is the number of non-zero vector elements, the computer time taken by
Sparse BLAS operations will be O(nz).
Vector Arguments
Compressed sparse vectors. Let a be a vector stored in an array, and assume that the only non-zero
elements of a are the following:
a(k1), a(k2), a(k3) . . . a(knz),
where nz is the total number of non-zero elements in a.

In Sparse BLAS, this vector can be represented in compressed form by two arrays, x (values) and indx
(indices). Each array has nz elements:
x(1)=a(k1), x(2)=a(k2), . . . x(nz)= a(knz),
indx(1)=k1, indx(2)=k2, . . . indx(nz)= knz.
Thus, a sparse vector is fully determined by the triple (nz, x, indx). If you pass a negative or zero value of nz
to Sparse BLAS, the subroutines do not modify any arrays or variables.
Full-storage vectors. Sparse BLAS routines can also use a vector argument fully stored in a single array (a
full-storage vector). If y is a full-storage vector, its elements must be stored contiguously: the first element
in y(1), the second in y(2), and so on. This corresponds to an increment incy = 1 in BLAS Level 1. No
increment value for full-storage vectors is passed as an argument to Sparse BLAS routines or functions.
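The compressed form described above can be illustrated with a minimal sketch. It uses only Fortran intrinsics and calls no Intel MKL routine; the program name and the sample data are hypothetical. For the sample vector it yields nz = 3, indx = (2, 5, 8), and x = (3.0, -1.5, 2.0).

```fortran
! Sketch: build the compressed form (nz, x, indx) of a full-storage vector a.
! Plain Fortran only; the sample data are illustrative assumptions.
program compress_sketch
  implicit none
  integer, parameter :: n = 8
  real :: a(n)
  real, allocatable :: x(:)
  integer, allocatable :: indx(:)
  integer :: nz, i

  a = [0.0, 3.0, 0.0, 0.0, -1.5, 0.0, 0.0, 2.0]

  nz   = count(a /= 0.0)                  ! number of non-zero elements
  indx = pack([(i, i = 1, n)], a /= 0.0)  ! one-based positions k1, ..., knz
  x    = a(indx)                          ! values a(k1), ..., a(knz)

  print *, 'nz   =', nz
  print *, 'indx =', indx
  print *, 'x    =', x
end program compress_sketch
```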
Naming Conventions for Sparse BLAS Routines

Similar to BLAS, the names of Sparse BLAS subprograms have prefixes that determine the data type
involved: s and d for single- and double-precision real; c and z for single- and double-precision complex
respectively.
If a Sparse BLAS routine is an extension of a "dense" one, the subprogram name is formed by appending the
suffix i (standing for indexed) to the name of the corresponding "dense" subprogram. For example, the
Sparse BLAS routine saxpyi corresponds to the BLAS routine saxpy, and the Sparse BLAS function cdotci
corresponds to the BLAS function cdotc.
Routines and Data Types

Routines and data types supported in the Intel MKL implementation of Sparse BLAS are listed in Table
Sparse BLAS Routines and Their Data Types.
Sparse BLAS Routines and Their Data Types
Routine/Function   Data Types   Description
?axpyi             s, d, c, z   Scalar-vector product plus vector (routines)
?doti              s, d         Dot product (functions)
?dotci             c, z         Complex dot product conjugated (functions)
?dotui             c, z         Complex dot product unconjugated (functions)
?gthr              s, d, c, z   Gathering a full-storage sparse vector into compressed form nz, x, indx (routines)
?gthrz             s, d, c, z   Gathering a full-storage sparse vector into compressed form and assigning zeros to
                                gathered elements in the full-storage vector (routines)
?roti              s, d         Givens rotation (routines)
?sctr              s, d, c, z   Scattering a vector from compressed form to full-storage form (routines)
BLAS Level 1 Routines That Can Work With Sparse Vectors

The following BLAS Level 1 routines will give correct results when you pass to them a compressed-form array
x (with the increment incx=1):
?asum sum of absolute values of vector elements
?copy copying a vector
?nrm2 Euclidean norm of a vector
?scal scaling a vector
i?amax index of the element with the largest absolute value for real flavors, or the
largest sum |Re(x(i))|+|Im(x(i))| for complex flavors.
i?amin index of the element with the smallest absolute value for real flavors, or the
smallest sum |Re(x(i))|+|Im(x(i))| for complex flavors.
The result i returned by i?amax and i?amin should be interpreted as index in the compressed-form array, so
that the largest (smallest) value is x(i); the corresponding index in full-storage array is indx(i).
You can also call ?rotg to compute the parameters of Givens rotation and then pass these parameters to the
Sparse BLAS routines ?roti.
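The index interpretation described above can be sketched as follows. The intrinsic maxloc stands in for a call such as isamax(nz, x, 1), so no Intel MKL routine is needed to run it; the program name and data are hypothetical. For this data the largest absolute value sits at position 2 of the compressed array, which corresponds to element indx(2) = 5 of the full-storage vector.

```fortran
! Sketch: map an index returned for the compressed array x back to the
! full-storage vector. maxloc is used here in place of isamax (assumption
! made so that the example runs without Intel MKL).
program amax_index_sketch
  implicit none
  integer, parameter :: nz = 3
  real    :: x(nz)    = [3.0, -4.5, 2.0]
  integer :: indx(nz) = [2, 5, 8]
  integer :: i

  i = maxloc(abs(x), dim=1)   ! position within the compressed array
  print *, 'position in compressed array  :', i
  print *, 'position in full-storage array:', indx(i)
end program amax_index_sketch
```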
?axpyi
Adds a scalar multiple of a compressed sparse vector to
a full-storage vector.
Syntax
call saxpyi(nz, a, x, indx, y)
call daxpyi(nz, a, x, indx, y)
call caxpyi(nz, a, x, indx, y)
call zaxpyi(nz, a, x, indx, y)
call axpyi(x, indx, y [, a])
Include Files
mkl.fi, blas.f90
Description
The ?axpyi routines perform a vector-vector operation defined as
y := a*x + y
where:
a is a scalar,
x is a sparse vector stored in compressed form,
y is a vector in full storage form.
The ?axpyi routines reference or modify only the elements of y whose indices are listed in the array indx.
The values in indx must be distinct.
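As a rough illustration of the operation, the loop below performs the same update that the description above defines. It is a plain Fortran sketch with hypothetical data, not the library implementation; with Intel MKL linked, the loop would correspond to call saxpyi(nz, a, x, indx, y). Only y(2), y(5), and y(8) change.

```fortran
! Sketch of the ?axpyi update: y(indx(i)) := a*x(i) + y(indx(i)).
! Reference loop only; not the Intel MKL implementation.
program axpyi_sketch
  implicit none
  integer, parameter :: nz = 3, n = 8
  real    :: x(nz)    = [3.0, -1.5, 2.0]
  integer :: indx(nz) = [2, 5, 8]
  real    :: y(n), a
  integer :: i

  y = 1.0
  a = 2.0
  do i = 1, nz
     y(indx(i)) = y(indx(i)) + a * x(i)   ! elements not listed in indx are untouched
  end do
  print *, y
end program axpyi_sketch
```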
Input Parameters
nz INTEGER. The number of elements in x and indx.
a REAL for saxpyi

DOUBLE PRECISION for daxpyi
COMPLEX for caxpyi
DOUBLE COMPLEX for zaxpyi
x REAL for saxpyi

DOUBLE PRECISION for daxpyi
COMPLEX for caxpyi
DOUBLE COMPLEX for zaxpyi
Array, size at least nz.
indx INTEGER. Specifies the indices for the elements of x.

y REAL for saxpyi

DOUBLE PRECISION for daxpyi
COMPLEX for caxpyi
DOUBLE COMPLEX for zaxpyi
Array, size at least max(indx(i)).
Output Parameters
Specific details for the routine axpyi interface are the following:
x Holds the vector with the number of elements nz.
indx Holds the vector with the number of elements nz.
y Holds the vector with the number of elements nz.
?doti
Computes the dot product of a compressed sparse real
vector by a full-storage real vector.
Syntax
res = sdoti(nz, x, indx, y )
res = ddoti(nz, x, indx, y )
res = doti(x, indx, y)
Include Files
mkl.fi, blas.f90
Description
The ?doti routines return the dot product of x and y defined as
res = x(1)*y(indx(1)) + x(2)*y(indx(2)) +...+ x(nz)*y(indx(nz))
where the triple (nz, x, indx) defines a sparse real vector stored in compressed form, and y is a real vector in
full storage form. The functions reference only the elements of y whose indices are listed in the array indx.
The values in indx must be distinct.
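The formula above can be checked with a short plain Fortran sketch (hypothetical data, no Intel MKL call). It evaluates the sum once with an explicit loop and once with a vector subscript; both give 3*2 + (-1.5)*5 + 2*8 = 14.5 for this data.

```fortran
! Sketch of the ?doti formula: res = sum over i of x(i)*y(indx(i)).
! Reference code only; not the Intel MKL implementation.
program doti_sketch
  implicit none
  integer, parameter :: nz = 3, n = 8
  real    :: x(nz)    = [3.0, -1.5, 2.0]
  integer :: indx(nz) = [2, 5, 8]
  real    :: y(n), res
  integer :: i

  y = [(real(i), i = 1, n)]          ! y = 1, 2, ..., 8
  res = 0.0
  do i = 1, nz
     res = res + x(i) * y(indx(i))
  end do
  print *, 'loop result            :', res
  print *, 'vector-subscript result:', sum(x * y(indx))
end program doti_sketch
```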
Input Parameters
nz INTEGER. The number of elements in x and indx .
x REAL for sdoti

DOUBLE PRECISION for ddoti

y REAL for sdoti

Output Parameters
res REAL for sdoti

Contains the dot product of x and y, if nz is positive. Otherwise, res
contains 0.

Specific details for the routine doti interface are the following:
?dotci
Computes the conjugated dot product of a
compressed sparse complex vector with a full-storage
complex vector.
Syntax
res = cdotci(nz, x, indx, y )
res = zdotci(nz, x, indx, y )
res = dotci(x, indx, y)
Include Files
mkl.fi, blas.f90
Description
The ?dotci routines return the dot product of x and y defined as
conjg(x(1))*y(indx(1)) + ... + conjg(x(nz))*y(indx(nz))
where the triple (nz, x, indx) defines a sparse complex vector stored in compressed form, and y is a complex
vector in full storage form. The functions reference only the elements of y whose indices are listed in the
array indx. The values in indx must be distinct.
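To make the role of the conjugation concrete, the sketch below evaluates the conjugated sum next to the unconjugated sum used by ?dotui. It is plain Fortran with hypothetical data and does not call Intel MKL. For this data the conjugated result is (1, -3) and the unconjugated result is (3, 3).

```fortran
! Sketch: conjugated (?dotci) versus unconjugated (?dotui) sparse dot product.
! Reference formulas only; not the Intel MKL implementation.
program dotci_sketch
  implicit none
  integer, parameter :: nz = 2, n = 4
  integer :: indx(nz) = [1, 3]
  complex :: x(nz), y(n), conj_res, unconj_res

  x = [cmplx(1.0, 2.0), cmplx(0.0, -1.0)]
  y = [cmplx(2.0, 0.0), cmplx(0.0, 0.0), cmplx(1.0, 1.0), cmplx(0.0, 0.0)]

  conj_res   = sum(conjg(x) * y(indx))   ! conjg(x(i))*y(indx(i)) summed over i
  unconj_res = sum(x * y(indx))          ! x(i)*y(indx(i)) summed over i

  print *, 'conjugated   :', conj_res
  print *, 'unconjugated :', unconj_res
end program dotci_sketch
```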
Input Parameters
nz INTEGER. The number of elements in x and indx .
x COMPLEX for cdotci

DOUBLE COMPLEX for zdotci
y COMPLEX for cdotci

Output Parameters
res COMPLEX for cdotci

Contains the conjugated dot product of x and y, if nz is positive. Otherwise,
it contains 0.

Specific details for the routine dotci interface are the following:
x Holds the vector with the number of elements (nz).
indx Holds the vector with the number of elements (nz).
y Holds the vector with the number of elements (nz).
?dotui
Computes the dot product of a compressed sparse
complex vector by a full-storage complex vector.
Syntax
res = cdotui(nz, x, indx, y )
res = zdotui(nz, x, indx, y )
res = dotui(x, indx, y)
Include Files
mkl.fi, blas.f90
Description
The ?dotui routines return the dot product of x and y defined as
res = x(1)*y(indx(1)) + x(2)*y(indx(2)) +...+ x(nz)*y(indx(nz))
where the triple (nz, x, indx) defines a sparse complex vector stored in compressed form, and y is a complex
vector in full storage form. The functions reference only the elements of y whose indices are listed in the
array indx. The values in indx must be distinct.
Input Parameters
x COMPLEX for cdotui

DOUBLE COMPLEX for zdotui

y COMPLEX for cdotui

Output Parameters
res COMPLEX for cdotui

Contains the dot product of x and y, if nz is positive. Otherwise, res
contains 0.

Specific details for the routine dotui interface are the following:
?gthr
Gathers a full-storage sparse vector's elements into
compressed form.
Syntax
call sgthr(nz, y, x, indx )
call dgthr(nz, y, x, indx )
call cgthr(nz, y, x, indx )
call zgthr(nz, y, x, indx )
call gthr(x, indx, y)
Include Files
mkl.fi, blas.f90
Description
The ?gthr routines gather the specified elements of a full-storage sparse vector y into compressed form (nz,
x, indx). The routines reference only the elements of y whose indices are listed in the array indx:
x(i) = y(indx(i)), for i=1,2,... ,nz.
Input Parameters
nz INTEGER. The number of elements of y to be gathered.
indx INTEGER. Specifies indices of elements to be gathered.

y REAL for sgthr

DOUBLE PRECISION for dgthr
COMPLEX for cgthr
DOUBLE COMPLEX for zgthr
Output Parameters
x REAL for sgthr

DOUBLE PRECISION for dgthr
COMPLEX for cgthr
DOUBLE COMPLEX for zgthr
Contains the vector converted to the compressed form.

Specific details for the routine gthr interface are the following:
?gthrz
Gathers a sparse vector's elements into compressed
form, replacing them by zeros.
Syntax
call sgthrz(nz, y, x, indx )
call dgthrz(nz, y, x, indx )
call cgthrz(nz, y, x, indx )
call zgthrz(nz, y, x, indx )
call gthrz(x, indx, y)
Include Files
mkl.fi, blas.f90
Description
The ?gthrz routines gather the elements with indices specified by the array indx from a full-storage vector y
into compressed form (nz, x, indx) and overwrite the gathered elements of y by zeros. Other elements of y
are not referenced or modified (see also ?gthr).
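The two steps described above (gather, then overwrite with zeros) can be sketched in plain Fortran with vector subscripts. The data and program name are hypothetical and no Intel MKL routine is called. The first assignment alone corresponds to what ?gthr does. For this data x becomes (20, 40, 50) and y becomes (10, 0, 30, 0, 0, 60).

```fortran
! Sketch of ?gthrz: gather y(indx(i)) into x(i), then zero the gathered
! elements of y. Reference code only; not the Intel MKL implementation.
program gthrz_sketch
  implicit none
  integer, parameter :: nz = 3, n = 6
  integer :: indx(nz) = [2, 4, 5]
  real    :: y(n)     = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
  real    :: x(nz)

  x = y(indx)        ! gather step: x(i) = y(indx(i))
  y(indx) = 0.0      ! additional step of ?gthrz; valid because indx has no repeats

  print *, 'x =', x
  print *, 'y =', y
end program gthrz_sketch
```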
Input Parameters
nz INTEGER. The number of elements of y to be gathered.
indx INTEGER. Specifies indices of elements to be gathered.

y REAL for sgthrz

DOUBLE PRECISION for dgthrz
COMPLEX for cgthrz
DOUBLE COMPLEX for zgthrz
Output Parameters
x REAL for sgthrz

DOUBLE PRECISION for dgthrz
COMPLEX for cgthrz
DOUBLE COMPLEX for zgthrz
Contains the vector converted to the compressed form.
y The updated vector y.

Specific details for the routine gthrz interface are the following:
?roti
Applies Givens rotation to sparse vectors one of which
is in compressed form.
Syntax
call sroti(nz, x, indx, y, c, s)
call droti(nz, x, indx, y, c, s)
call roti(x, indx, y, c, s)
Include Files
mkl.fi, blas.f90
Description
The ?roti routines apply the Givens rotation to elements of two real vectors, x (in compressed form nz, x,
indx) and y (in full storage form):
x(i) = c*x(i) + s*y(indx(i))

y(indx(i)) = c*y(indx(i))- s*x(i)
The routines reference only the elements of y whose indices are listed in the array indx. The values in indx
must be distinct.
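A point that is easy to miss in the formulas above is that both updates are understood to use the values of x(i) and y(indx(i)) from before the rotation. The plain Fortran sketch below makes this explicit with a temporary variable; it is an illustration under that assumption, with hypothetical data, and not the library implementation.

```fortran
! Sketch of the ?roti update. The old x(i) is saved in temp so that the
! update of y(indx(i)) does not see the already rotated x(i).
program roti_sketch
  implicit none
  integer, parameter :: nz = 2, n = 4
  integer :: indx(nz) = [1, 3]
  real    :: x(nz)    = [1.0, 2.0]
  real    :: y(n)     = [3.0, 0.0, 4.0, 0.0]
  real    :: c, s, temp
  integer :: i

  c = 0.6
  s = 0.8
  do i = 1, nz
     temp       = x(i)                          ! old value of x(i)
     x(i)       = c * temp + s * y(indx(i))
     y(indx(i)) = c * y(indx(i)) - s * temp
  end do

  print *, 'x =', x
  print *, 'y =', y
end program roti_sketch
```

The parameters c and s would typically come from ?rotg, as noted in the section on BLAS Level 1 routines that can work with sparse vectors.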
Input Parameters
x REAL for sroti

DOUBLE PRECISION for droti

y REAL for sroti

DOUBLE PRECISION for droti
c REAL for sroti

DOUBLE PRECISION for droti.
A scalar.
s REAL for sroti
DOUBLE PRECISION for droti.

A scalar.
Output Parameters
x and y The updated arrays.

Specific details for the routine roti interface are the following:
?sctr
Converts compressed sparse vectors into full storage
form.
Syntax
call ssctr(nz, x, indx, y )
call dsctr(nz, x, indx, y )
call csctr(nz, x, indx, y )
call zsctr(nz, x, indx, y )
call sctr(x, indx, y)
Include Files
mkl.fi, blas.f90
Description
The ?sctr routines scatter the elements of the compressed sparse vector (nz, x, indx) to a full-storage
vector y. The routines modify only the elements of y whose indices are listed in the array indx:
y(indx(i)) = x(i), for i=1,2,... ,nz.
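The scatter operation is the counterpart of the gather performed by ?gthr. A minimal plain Fortran sketch with hypothetical data (no Intel MKL call) is shown below; starting from y = 0 it produces y = (0, 20, 0, 40, 50, 0).

```fortran
! Sketch of ?sctr: y(indx(i)) = x(i) for i = 1, ..., nz.
! Reference code only; not the Intel MKL implementation.
program sctr_sketch
  implicit none
  integer, parameter :: nz = 3, n = 6
  integer :: indx(nz) = [2, 4, 5]
  real    :: x(nz)    = [20.0, 40.0, 50.0]
  real    :: y(n)
  integer :: i

  y = 0.0
  do i = 1, nz
     y(indx(i)) = x(i)   ! only the elements listed in indx are modified
  end do
  print *, y
end program sctr_sketch
```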
Input Parameters
nz INTEGER. The number of elements of x to be scattered.
indx INTEGER. Specifies indices of elements to be scattered.

x REAL for ssctr

DOUBLE PRECISION for dsctr
COMPLEX for csctr
DOUBLE COMPLEX for zsctr
Contains the vector to be converted to full-storage form.
Output Parameters
y REAL for ssctr

DOUBLE PRECISION for dsctr
COMPLEX for csctr
DOUBLE COMPLEX for zsctr
Contains the vector y with updated elements.

Specific details for the routine sctr interface are the following:
Sparse BLAS Level 2 and Level 3 Routines

This section describes Sparse BLAS Level 2 and Level 3 routines included in the Intel Math Kernel Library
(Intel MKL). Sparse BLAS Level 2 is a group of routines and functions that perform operations between a
sparse matrix and dense vectors. Sparse BLAS Level 3 is a group of routines and functions that perform
operations between a sparse matrix and dense matrices.
The terms and concepts required to understand the use of the Intel MKL Sparse BLAS Level 2 and Level 3
routines are discussed in the Linear Solvers Basics appendix.
The Sparse BLAS routines can be useful to implement iterative methods for solving large sparse systems of
equations or eigenvalue problems. For example, these routines can be considered as building blocks for
Iterative Sparse Solvers based on Reverse Communication Interface (RCI ISS) described in Chapter 8 of
the manual.
Intel MKL provides Sparse BLAS Level 2 and Level 3 routines with typical (or conventional) interface similar
to the interface used in the NIST* Sparse BLAS library [Rem05].
Some software packages and libraries (the PARDISO* Solver used in Intel MKL, Sparskit 2 [Saad94], the
Compaq* Extended Math Library (CXML) [CXML01]) use a different (early) variation of the compressed sparse
row (CSR) format and support only Level 2 operations with simplified interfaces. Intel MKL provides an
additional set of Sparse BLAS Level 2 routines with similar simplified interfaces. Each of these routines
operates only on a matrix of the fixed type.
The routines described in this section support both one-based indexing and zero-based indexing of the input
data (see details in the section One-based and Zero-based Indexing).
Naming Conventions in Sparse BLAS Level 2 and Level 3

Each Sparse BLAS Level 2 and Level 3 routine has a six- or eight-character base name preceded by the prefix
mkl_ or mkl_cspblas_ .
The routines with typical (conventional) interface have six-character base names in accordance with the
template:
mkl_<character><data><operation>( )
The routines with simplified interfaces have eight-character base names in accordance with the templates:
mkl_<character><data><mtype><operation>( )
for routines with one-based indexing; and
mkl_cspblas_<character><data><mtype><operation>( )
for routines with zero-based indexing.

The <data> field indicates the sparse matrix storage format (see section Sparse Matrix Storage Formats):
coo coordinate format
csr compressed sparse row format and its variations
csc compressed sparse column format and its variations
dia diagonal format
sky skyline storage format
bsr block sparse row format and its variations
The <operation> field indicates the type of operation:
mv matrix-vector product (Level 2)
mm matrix-matrix product (Level 3)
sv solving a single triangular system (Level 2)
sm solving triangular systems with multiple right-hand sides (Level 3)
The field <mtype> indicates the matrix type:
ge sparse representation of a general matrix
sy sparse representation of the upper or lower triangle of a symmetric matrix
tr sparse representation of a triangular matrix
Sparse Matrix Storage Formats for Sparse BLAS Routines
The current version of Intel MKL Sparse BLAS Level 2 and Level 3 routines support the following point entry
[Duff86] storage formats for sparse matrices:
compressed sparse row format (CSR) and its variations;

compressed sparse column format (CSC);
coordinate format;
diagonal format;
skyline storage format;
and one block entry storage format:
block sparse row format (BSR) and its variations.

For more information see "Sparse Matrix Storage Formats" in Appendix A.
Intel MKL provides auxiliary routines (matrix converters) that convert a sparse matrix from one storage
format to another.
Routines and Supported Operations

This section describes operations supported by the Intel MKL Sparse BLAS Level 2 and Level 3 routines. The
following notations are used here:
A is a sparse matrix;
B and C are dense matrices;
D is a diagonal scaling matrix;
x and y are dense vectors;
alpha and beta are scalars;
op(A) is one of the possible operations:

op(A) = A;
op(A) = A^T - transpose of A;
op(A) = A^H - conjugated transpose of A.
inv(op(A)) denotes the inverse of op(A).
The Intel MKL Sparse BLAS Level 2 and Level 3 routines support the following operations:
computing the vector product between a sparse matrix and a dense vector:
y := alpha*op(A)*x + beta*y
solving a single triangular system:
y := alpha*inv(op(A))*x
computing a product between sparse matrix and dense matrix:

C := alpha*op(A)*B + beta*C
solving a sparse triangular system with multiple right-hand sides:
C := alpha*inv(op(A))*B
Intel MKL provides an additional set of the Sparse BLAS Level 2 routines with simplified interfaces. Each of
these routines operates on a matrix of the fixed type. The following operations are supported:
computing the vector product between a sparse matrix and a dense vector (for general and symmetric
matrices):
y := op(A)*x
solving a single triangular system (for triangular matrices):

y := inv(op(A))*x
Matrix type is indicated by the field <mtype> in the routine name (see section Naming Conventions in Sparse
BLAS Level 2 and Level 3).
NOTE
The routines with simplified interfaces support only four sparse matrix storage formats, specifically:
CSR format in the 3-array variation accepted in the direct sparse solvers and in the CXML;
diagonal format accepted in the CXML;
coordinate format;
BSR format in the 3-array variation.
Note that routines with both typical (conventional) and simplified interfaces use the same computational
kernels that work with certain internal data structures.
The Intel MKL Sparse BLAS Level 2 and Level 3 routines do not support in-place operations.
A complete list of the routines is given in the section Sparse BLAS Level 2 and Level 3 Routines.
Interface Consideration
One-Based and Zero-Based Indexing

The Intel MKL Sparse BLAS Level 2 and Level 3 routines support one-based and zero-based indexing of data
arrays.
Routines with typical interfaces support zero-based indexing for the following sparse data storage formats:
CSR, CSC, BSR, and COO. Routines with simplified interfaces support zero based indexing for the following
sparse data storage formats: CSR, BSR, and COO. See the complete list of Sparse BLAS Level 2 and Level 3
Routines.
The one-based indexing uses the convention of starting array indices at 1. The zero-based indexing uses the
convention of starting array indices at 0. For example, indices of the 5-element array x can be presented in
case of one-based indexing as follows:
Element index: 1 2 3 4 5
Element value: 1.0 5.0 7.0 8.0 9.0
and in case of zero-based indexing as follows:

Element index: 0 1 2 3 4
Element value: 1.0 5.0 7.0 8.0 9.0
The detailed descriptions of the one-based and zero-based variants of the sparse data storage formats are
given in the "Sparse Matrix Storage Formats" in Appendix A.
Most parameters of the routines are identical for both one-based and zero-based indexing, but some of them
have certain differences. The following table lists all these differences.
Parameter   One-based Indexing / Zero-based Indexing
val
    One-based indexing:  Array containing non-zero elements of the matrix A, its length is
                         pntre(m) - pntrb(1).
    Zero-based indexing: Array containing non-zero elements of the matrix A, its length is
                         pntre(m-1) - pntrb(0).
pntrb
    One-based indexing:  Array of length m. This array contains row indices, such that
                         pntrb(i) - pntrb(1) + 1 is the first index of row i in the arrays val and indx.
    Zero-based indexing: Array of length m. This array contains row indices, such that
                         pntrb(i) - pntrb(0) is the first index of row i in the arrays val and indx.
pntre
    One-based indexing:  Array of length m. This array contains row indices, such that
                         pntre(i) - pntrb(1) is the last index of row i in the arrays val and indx.
    Zero-based indexing: Array of length m. This array contains row indices, such that
                         pntre(i) - pntrb(0) - 1 is the last index of row i in the arrays val and indx.
ia
    One-based indexing:  Array of length m + 1, containing indices of elements in the array a, such that
                         ia(i) is the index in the array a of the first non-zero element from the row i.
                         The value of the last element ia(m + 1) is equal to the number of non-zeros plus one.
    Zero-based indexing: Array of length m + 1, containing indices of elements in the array a, such that
                         ia(i) is the index in the array a of the first non-zero element from the row i.
                         The value of the last element ia(m) is equal to the number of non-zeros.
ldb
    One-based indexing:  Specifies the leading dimension of b as declared in the calling (sub)program.
    Zero-based indexing: Specifies the second dimension of b as declared in the calling (sub)program.
ldc
    One-based indexing:  Specifies the leading dimension of c as declared in the calling (sub)program.
    Zero-based indexing: Specifies the second dimension of c as declared in the calling (sub)program.
Differences Between Intel MKL and NIST* Interfaces

The Intel MKL Sparse BLAS Level 3 routines have the following conventional interfaces:
mkl_xyyymm(transa, m, n, k, alpha, matdescra, arg(A), b, ldb, beta, c, ldc), for matrix-
matrix product;
mkl_xyyysm(transa, m, n, alpha, matdescra, arg(A), b, ldb, c, ldc), for triangular solvers
with multiple right-hand sides.
Here x denotes data type, and yyy - sparse matrix data structure (storage format).
The analogous NIST* Sparse BLAS (NSB) library routines have the following interfaces:
xyyymm(transa, m, n, k, alpha, descra, arg(A), b, ldb, beta, c, ldc, work, lwork), for
matrix-matrix product;
xyyysm(transa, m, n, unitd, dv, alpha, descra, arg(A), b, ldb, beta, c, ldc, work,
lwork), for triangular solvers with multiple right-hand sides.
Some similar arguments are used in both libraries. The argument transa indicates what operation is
performed and is slightly different in the NSB library (see Table Parameter transa). The arguments m and k
are the number of rows and columns in the matrix A, respectively, n is the number of columns in the matrix C.
The arguments alpha and beta are scalar alpha and beta respectively (beta is not used in the Intel MKL
triangular solvers.) The arguments b and c are rectangular arrays with the leading dimension ldb and ldc,
respectively. arg(A) denotes the list of arguments that describe the sparse representation of A.
Parameter transa
             MKL interface   NSB interface   Operation
data type    CHARACTER*1     INTEGER
value        N or n          0               op(A) = A
             T or t          1               op(A) = A^T
             C or c          2               op(A) = A^T or op(A) = A^H
Parameter matdescra
The parameter matdescra describes the relevant characteristic of the matrix A. This manual describes
matdescra as an array of six elements in line with the NIST* implementation. However, only the first four
elements of the array are used in the current versions of the Intel MKL Sparse BLAS routines. Elements
matdescra(5) and matdescra(6) are reserved for future use. Note that whether matdescra is described in
your application as an array of length 6 or 4 is of no importance because the array is declared as a pointer in
the Intel MKL routines. To learn more about declaration of the matdescra array, see the Sparse BLAS
examples located in the Intel MKL installation directory: examples/spblasf/ for Fortran. The table below
lists elements of the parameter matdescra, their Fortran values, and their meanings. The parameter
matdescra corresponds to the argument descra from NSB library.
Possible Values of the Parameter matdescra (descra)
              MKL interface                  NSB interface   Matrix characteristics
              one-based      zero-based
              indexing       indexing
data type     CHARACTER      Char            INTEGER
1st element   matdescra(1)   matdescra(0)    descra(1)       matrix structure
value         G              G               0               general
              S              S               1               symmetric (A = A^T)
              H              H               2               Hermitian (A = A^H)
              T              T               3               triangular
              A              A               4               skew(anti)-symmetric (A = -A^T)
              D              D               5               diagonal
2nd element   matdescra(2)   matdescra(1)    descra(2)       upper/lower triangular indicator
value         L              L               1               lower
              U              U               2               upper
3rd element   matdescra(3)   matdescra(2)    descra(3)       main diagonal type
value         N              N               0               non-unit
              U              U               1               unit
4th element   matdescra(4)   matdescra(3)    descra(4)       type of indexing
value         F                              1               one-based indexing
                             C               0               zero-based indexing
In some cases possible element values of the parameter matdescra depend on the values of other elements.
The Table "Possible Combinations of Element Values of the Parameter matdescra" lists all possible
combinations of element values for both multiplication routines and triangular solvers.
Possible Combinations of Element Values of the Parameter matdescra
Routines                  matdescra(1)   matdescra(2)   matdescra(3)   matdescra(4)
Multiplication Routines   G              ignored        ignored        F (default) or C
                          S or H         L (default)    N (default)    F (default) or C
                          S or H         L (default)    U              F (default) or C
                          S or H         U              N (default)    F (default) or C
                          S or H         U              U              F (default) or C
                          A              L (default)    ignored        F (default) or C
                          A              U              ignored        F (default) or C
Multiplication Routines   T              L              U              F (default) or C
and Triangular Solvers    T              L              N              F (default) or C
                          T              U              U              F (default) or C
                          T              U              N              F (default) or C
                          D              ignored        N (default)    F (default) or C
                          D              ignored        U              F (default) or C
For a matrix in the skyline format with the main diagonal declared to be a unit, diagonal elements must be
stored in the sparse representation even if they are zero. In all other formats, diagonal elements can be
stored (if needed) in the sparse representation if they are not zero.
Operations with Partial Matrices

One distinctive feature of the Intel MKL Sparse BLAS routines is the ability to perform operations
only on partial matrices composed of certain parts (triangles and the main diagonal) of the input sparse
matrix. This is done by properly setting the first three elements of the parameter matdescra.
An arbitrary sparse matrix A can be decomposed as
A = L + D + U
where L is the strict lower triangle of A, U is the strict upper triangle of A, D is the main diagonal.
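The decomposition can be made concrete with a small dense example. The sketch below is plain Fortran with a hypothetical 3-by-3 matrix; it works on a dense array purely for illustration and says nothing about how the library treats the sparse representation internally. It splits A into L, D, and U and confirms that their sum reproduces A.

```fortran
! Sketch: split a dense matrix A into strict lower triangle L, main
! diagonal D, and strict upper triangle U, so that A = L + D + U.
program ldu_split_sketch
  implicit none
  integer, parameter :: n = 3
  real    :: a(n,n), l(n,n), d(n,n), u(n,n)
  integer :: i, j

  ! Column-major fill; the rows of A are (1 2 3), (4 5 6), (7 8 9).
  a = reshape([1.0, 4.0, 7.0, 2.0, 5.0, 8.0, 3.0, 6.0, 9.0], [n, n])

  l = 0.0
  d = 0.0
  u = 0.0
  do j = 1, n
     do i = 1, n
        if (i > j) then
           l(i,j) = a(i,j)      ! strictly below the diagonal
        else if (i == j) then
           d(i,j) = a(i,j)      ! main diagonal
        else
           u(i,j) = a(i,j)      ! strictly above the diagonal
        end if
     end do
  end do

  print *, 'max |A - (L+D+U)| =', maxval(abs(a - (l + d + u)))
end program ldu_split_sketch
```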
Table "Output Matrices for Multiplication Routines" shows correspondence between the output matrices and
values of the parameter matdescra for the sparse matrix A for multiplication routines.
Output Matrices for Multiplication Routines
matdescra(1)   matdescra(2)   matdescra(3)   Output Matrix
G              ignored        ignored        alpha*op(A)*x + beta*y
                                             alpha*op(A)*B + beta*C
S or H         L              N              alpha*op(L+D+L')*x + beta*y
                                             alpha*op(L+D+L')*B + beta*C
S or H         L              U              alpha*op(L+I+L')*x + beta*y
                                             alpha*op(L+I+L')*B + beta*C
S or H         U              N              alpha*op(U'+D+U)*x + beta*y
                                             alpha*op(U'+D+U)*B + beta*C
S or H         U              U              alpha*op(U'+I+U)*x + beta*y
                                             alpha*op(U'+I+U)*B + beta*C
T              L              U              alpha*op(L+I)*x + beta*y
                                             alpha*op(L+I)*B + beta*C
T              L              N              alpha*op(L+D)*x + beta*y
                                             alpha*op(L+D)*B + beta*C
T              U              U              alpha*op(U+I)*x + beta*y
                                             alpha*op(U+I)*B + beta*C
T              U              N              alpha*op(U+D)*x + beta*y
                                             alpha*op(U+D)*B + beta*C
A              L              ignored        alpha*op(L-L')*x + beta*y
                                             alpha*op(L-L')*B + beta*C
A              U              ignored        alpha*op(U-U')*x + beta*y
                                             alpha*op(U-U')*B + beta*C
D              ignored        N              alpha*D*x + beta*y
                                             alpha*D*B + beta*C
D              ignored        U              alpha*x + beta*y
                                             alpha*B + beta*C
Table Output Matrices for Triangular Solvers shows correspondence between the output matrices and values
of the parameter matdescra for the sparse matrix A for triangular solvers.
Output Matrices for Triangular Solvers
matdescra(1)   matdescra(2)   matdescra(3)   Output Matrix
T              L              N              alpha*inv(op(L))*x
                                             alpha*inv(op(L))*B
T              L              U              alpha*inv(op(L))*x
                                             alpha*inv(op(L))*B
T              U              N              alpha*inv(op(U))*x
                                             alpha*inv(op(U))*B
T              U              U              alpha*inv(op(U))*x
                                             alpha*inv(op(U))*B
D              ignored        N              alpha*inv(D)*x
                                             alpha*inv(D)*B
D              ignored        U              alpha*x
                                             alpha*B
Sparse BLAS Level 2 and Level 3 Routines
Table Sparse BLAS Level 2 and Level 3 Routines lists the sparse BLAS Level 2 and Level 3 routines
described in more detail later in this section.
Sparse BLAS Level 2 and Level 3 Routines
Routine/Function       Description
Simplified interface, one-based indexing
mkl_?csrgemv           Computes matrix-vector product of a sparse general matrix in the CSR format (3-array variation).
mkl_?bsrgemv           Computes matrix-vector product of a sparse general matrix in the BSR format (3-array variation).
mkl_?coogemv           Computes matrix-vector product of a sparse general matrix in the coordinate format.
mkl_?diagemv           Computes matrix-vector product of a sparse general matrix in the diagonal format.
mkl_?csrsymv           Computes matrix-vector product of a sparse symmetrical matrix in the CSR format (3-array variation).
mkl_?bsrsymv           Computes matrix-vector product of a sparse symmetrical matrix in the BSR format (3-array variation).
mkl_?coosymv           Computes matrix-vector product of a sparse symmetrical matrix in the coordinate format.
mkl_?diasymv           Computes matrix-vector product of a sparse symmetrical matrix in the diagonal format.
mkl_?csrtrsv           Triangular solver with simplified interface for a sparse matrix in the CSR format (3-array variation).
mkl_?bsrtrsv           Triangular solver with simplified interface for a sparse matrix in the BSR format (3-array variation).
mkl_?cootrsv           Triangular solver with simplified interface for a sparse matrix in the coordinate format.
mkl_?diatrsv           Triangular solver with simplified interface for a sparse matrix in the diagonal format.
Simplified interface, zero-based indexing
mkl_cspblas_?csrgemv   Computes matrix-vector product of a sparse general matrix in the CSR format (3-array variation)
                       with zero-based indexing.
mkl_cspblas_?bsrgemv   Computes matrix-vector product of a sparse general matrix in the BSR format (3-array variation)
                       with zero-based indexing.
mkl_cspblas_?coogemv   Computes matrix-vector product of a sparse general matrix in the coordinate format with
                       zero-based indexing.
mkl_cspblas_?csrsymv   Computes matrix-vector product of a sparse symmetrical matrix in the CSR format (3-array
                       variation) with zero-based indexing.
mkl_cspblas_?bsrsymv   Computes matrix-vector product of a sparse symmetrical matrix in the BSR format (3-array
                       variation) with zero-based indexing.
mkl_cspblas_?coosymv   Computes matrix-vector product of a sparse symmetrical matrix in the coordinate format with
                       zero-based indexing.
mkl_cspblas_?csrtrsv   Triangular solver with simplified interface for a sparse matrix in the CSR format (3-array
                       variation) with zero-based indexing.
mkl_cspblas_?bsrtrsv   Triangular solver with simplified interface for a sparse matrix in the BSR format (3-array
                       variation) with zero-based indexing.
mkl_cspblas_?cootrsv   Triangular solver with simplified interface for a sparse matrix in the coordinate format with
                       zero-based indexing.
Typical (conventional) interface, one-based and zero-based indexing
mkl_?csrmv             Computes matrix-vector product of a sparse matrix in the CSR format.
mkl_?bsrmv             Computes matrix-vector product of a sparse matrix in the BSR format.
mkl_?cscmv             Computes matrix-vector product for a sparse matrix in the CSC format.
mkl_?coomv             Computes matrix-vector product for a sparse matrix in the coordinate format.
mkl_?csrsv             Solves a system of linear equations for a sparse matrix in the CSR format.
mkl_?bsrsv             Solves a system of linear equations for a sparse matrix in the BSR format.
mkl_?cscsv             Solves a system of linear equations for a sparse matrix in the CSC format.
mkl_?coosv             Solves a system of linear equations for a sparse matrix in the coordinate format.
mkl_?csrmm             Computes matrix-matrix product of a sparse matrix in the CSR format.
mkl_?bsrmm             Computes matrix-matrix product of a sparse matrix in the BSR format.
mkl_?cscmm             Computes matrix-matrix product of a sparse matrix in the CSC format.
mkl_?coomm             Computes matrix-matrix product of a sparse matrix in the coordinate format.
mkl_?csrsm             Solves a system of linear matrix equations for a sparse matrix in the CSR format.
mkl_?bsrsm             Solves a system of linear matrix equations for a sparse matrix in the BSR format.
mkl_?cscsm             Solves a system of linear matrix equations for a sparse matrix in the CSC format.
mkl_?coosm             Solves a system of linear matrix equations for a sparse matrix in the coordinate format.
Typical (conventional) interface, one-based indexing
mkl_?diamv             Computes matrix-vector product of a sparse matrix in the diagonal format.
mkl_?skymv             Computes matrix-vector product for a sparse matrix in the skyline storage format.
mkl_?diasv             Solves a system of linear equations for a sparse matrix in the diagonal format.
mkl_?skysv             Solves a system of linear equations for a sparse matrix in the skyline format.
mkl_?diamm             Computes matrix-matrix product of a sparse matrix in the diagonal format.
mkl_?skymm             Computes matrix-matrix product of a sparse matrix in the skyline storage format.
mkl_?diasm             Solves a system of linear matrix equations for a sparse matrix in the diagonal format.
mkl_?skysm             Solves a system of linear matrix equations for a sparse matrix in the skyline storage format.
Auxiliary routines
Matrix converters
mkl_?dnscsr            Converts a sparse matrix in uncompressed representation to CSR format (3-array variation)
                       and vice versa.
mkl_?csrcoo            Converts a sparse matrix in CSR format (3-array variation) to coordinate format and vice versa.
mkl_?csrbsr            Converts a sparse matrix in CSR format to BSR format (3-array variations) and vice versa.
mkl_?csrcsc            Converts a sparse matrix in CSR format to CSC format and vice versa (3-array variations).
mkl_?csrdia            Converts a sparse matrix in CSR format (3-array variation) to diagonal format and vice versa.
mkl_?csrsky            Converts a sparse matrix in CSR format (3-array variation) to skyline format and vice versa.
Operations on sparse matrices
mkl_?csradd            Computes the sum of two sparse matrices stored in the CSR format (3-array variation) with
                       one-based indexing.
mkl_?csrmultcsr        Computes the product of two sparse matrices stored in the CSR format (3-array variation) with
                       one-based indexing.
mkl_?csrmultd          Computes the product of two sparse matrices stored in the CSR format (3-array variation) with
                       one-based indexing. The result is stored in the dense matrix.
mkl_?csrgemv
Computes matrix - vector product of a sparse general
matrix stored in the CSR format (3-array variation)
with one-based indexing.
Syntax
call mkl_scsrgemv(transa, m, a, ia, ja, x, y)
call mkl_dcsrgemv(transa, m, a, ia, ja, x, y)
call mkl_ccsrgemv(transa, m, a, ia, ja, x, y)
call mkl_zcsrgemv(transa, m, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_?csrgemv routine performs a matrix-vector operation defined as
y := A*x
or
y := AT*x,
where:
A is an m-by-m sparse square matrix in the CSR format (3-array variation), AT is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
Parameter descriptions are common for all implemented interfaces with the exception of data types that refer
here to the FORTRAN 77 standard types. Data types specific to the different interfaces are described in the
section "Interfaces" below.
transa CHARACTER*1. Specifies the operation.
If transa = 'N' or 'n', then y := A*x
If transa = 'T' or 't' or 'C' or 'c', then y := AT*x.
m INTEGER. Number of rows of the matrix A.
a REAL for mkl_scsrgemv.

DOUBLE PRECISION for mkl_dcsrgemv.
COMPLEX for mkl_ccsrgemv.
DOUBLE COMPLEX for mkl_zcsrgemv.
Array containing non-zero elements of the matrix A. Its length is equal to
the number of non-zero elements in the matrix A. Refer to values array
description in Sparse Matrix Storage Formats for more details.
ia INTEGER. Array of length m + 1, containing indices of elements in the array

a, such that ia(i) is the index in the array a of the first non-zero element
from the row i. The value of the last element ia(m + 1) is equal to the
number of non-zeros plus one. Refer to rowIndex array description in
Sparse Matrix Storage Formats for more details.
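ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to the length of the array a. Refer to columns array
description in Sparse Matrix Storage Formats for more details.
x REAL for mkl_scsrgemv.
DOUBLE PRECISION for mkl_dcsrgemv.
COMPLEX for mkl_ccsrgemv.
DOUBLE COMPLEX for mkl_zcsrgemv.
Array, size is m.
On entry, the array x must contain the vector x.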
Output Parameters
y REAL for mkl_scsrgemv.
DOUBLE PRECISION for mkl_dcsrgemv.
COMPLEX for mkl_ccsrgemv.
DOUBLE COMPLEX for mkl_zcsrgemv.
Array, size at least m.
On exit, the array y contains the vector y.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
SUBROUTINE mkl_dcsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER ia(*), ja(*)
DOUBLE PRECISION a(*), x(*), y(*)
SUBROUTINE mkl_ccsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER ia(*), ja(*)
COMPLEX a(*), x(*), y(*)
SUBROUTINE mkl_zcsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER ia(*), ja(*)
DOUBLE COMPLEX a(*), x(*), y(*)
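To make the roles of a, ia, and ja concrete, the following is a minimal plain-Fortran sketch of the operation that mkl_dcsrgemv is documented to perform. It is a reference loop written for illustration only, not the Intel MKL implementation; the module name csr_gemv_sketch and the routine name ref_dcsrgemv are hypothetical.

```fortran
module csr_gemv_sketch
  implicit none
contains
  ! Illustrative reference loop (hypothetical name, not an MKL routine):
  ! y := A*x or y := A^T*x for an m-by-m matrix in one-based 3-array CSR.
  subroutine ref_dcsrgemv(transa, m, a, ia, ja, x, y)
    character(len=1), intent(in) :: transa
    integer, intent(in) :: m
    double precision, intent(in) :: a(*), x(*)
    integer, intent(in) :: ia(*), ja(*)
    double precision, intent(out) :: y(*)
    integer :: i, k
    logical :: notrans

    notrans = (transa == 'N' .or. transa == 'n')
    y(1:m) = 0.0d0
    do i = 1, m
      ! Row i occupies positions ia(i) .. ia(i+1)-1 of a and ja.
      do k = ia(i), ia(i+1) - 1
        if (notrans) then
          y(i) = y(i) + a(k) * x(ja(k))
        else
          ! Transposed product: entry (i, ja(k)) contributes to y(ja(k)).
          y(ja(k)) = y(ja(k)) + a(k) * x(i)
        end if
      end do
    end do
  end subroutine ref_dcsrgemv
end module csr_gemv_sketch
```

For the 3-by-3 matrix with rows (1 0 2), (0 3 0), (4 0 5), the one-based arrays are a = (1, 2, 3, 4, 5), ja = (1, 3, 2, 1, 3), and ia = (1, 3, 4, 6). With x = (1, 2, 3) the sketch gives y = (7, 6, 19) for transa = 'N' and y = (13, 6, 17) for transa = 'T'.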
mkl_?bsrgemv
Computes matrix - vector product of a sparse general
matrix stored in the BSR format (3-array variation)
with one-based indexing.
Syntax
call mkl_sbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_dbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_cbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_zbsrgemv(transa, m, lb, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_?bsrgemv routine performs a matrix-vector operation defined as
y := A*x
or
y := AT*x,
where:
A is an m-by-m block sparse square matrix in the BSR format (3-array variation), AT is the transpose of A.
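NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
transa CHARACTER*1. Specifies the operation.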
If transa = 'N' or 'n', then the matrix-vector product is computed as
y := A*x
If transa = 'T' or 't' or 'C' or 'c', then the matrix-vector product is
computed as y := AT*x,
m INTEGER. Number of block rows of the matrix A.
lb INTEGER. Size of the block in the matrix A.
a REAL for mkl_sbsrgemv.

DOUBLE PRECISION for mkl_dbsrgemv.
COMPLEX for mkl_cbsrgemv.
DOUBLE COMPLEX for mkl_zbsrgemv.
Array containing elements of non-zero blocks of the matrix A. Its length is
equal to the number of non-zero blocks in the matrix A multiplied by lb*lb.
Refer to values array description in BSR Format for more details.
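ia INTEGER. Array of length (m + 1), containing indices of blocks in the array
a, such that ia(i) is the index in the array a of the first non-zero element
from the row i. The value of the last element ia(m + 1) is equal to the
number of non-zero blocks plus one. Refer to rowIndex array description in
BSR Format for more details.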
ja INTEGER. Array containing the column indices for each non-zero block in
the matrix A.
Its length is equal to the number of non-zero blocks of the matrix A. Refer
to columns array description in BSR Format for more details.
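x REAL for mkl_sbsrgemv.
DOUBLE PRECISION for mkl_dbsrgemv.
COMPLEX for mkl_cbsrgemv.
DOUBLE COMPLEX for mkl_zbsrgemv.
Array, size (m*lb).
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_sbsrgemv.
DOUBLE PRECISION for mkl_dbsrgemv.
COMPLEX for mkl_cbsrgemv.
DOUBLE COMPLEX for mkl_zbsrgemv.
Array, size at least (m*lb).
On exit, the array y contains the vector y.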
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
SUBROUTINE mkl_dbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
INTEGER ia(*), ja(*)
DOUBLE PRECISION a(*), x(*), y(*)
SUBROUTINE mkl_cbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
INTEGER ia(*), ja(*)
COMPLEX a(*), x(*), y(*)
SUBROUTINE mkl_zbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
INTEGER ia(*), ja(*)
DOUBLE COMPLEX a(*), x(*), y(*)
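The block layout can be illustrated with a small plain-Fortran reference loop for the transa = 'N' case. This is a sketch under a stated assumption: elements inside each lb-by-lb block are taken to be stored in column-major order, which is how the BSR Format section is understood to describe one-based storage; verify this against that section before relying on it. The names bsr_gemv_sketch and ref_dbsrgemv_n are hypothetical and are not MKL routines.

```fortran
module bsr_gemv_sketch
  implicit none
contains
  ! Illustrative reference loop (hypothetical name): y := A*x for a matrix
  ! with m block rows of lb-by-lb blocks in one-based 3-array BSR storage.
  subroutine ref_dbsrgemv_n(m, lb, a, ia, ja, x, y)
    integer, intent(in) :: m, lb
    double precision, intent(in) :: a(*), x(*)
    integer, intent(in) :: ia(*), ja(*)
    double precision, intent(out) :: y(*)
    integer :: ib, kb, r, c, base, row, col

    y(1:m*lb) = 0.0d0
    do ib = 1, m                        ! block row
      do kb = ia(ib), ia(ib+1) - 1      ! non-zero blocks of this block row
        base = (kb - 1) * lb * lb       ! offset of block kb inside a
        do c = 1, lb
          col = (ja(kb) - 1) * lb + c   ! scalar column index
          do r = 1, lb
            row = (ib - 1) * lb + r     ! scalar row index
            ! ASSUMPTION: column-major element order inside each block.
            y(row) = y(row) + a(base + (c - 1) * lb + r) * x(col)
          end do
        end do
      end do
    end do
  end subroutine ref_dbsrgemv_n
end module bsr_gemv_sketch
```

Note the sizes implied by the parameter descriptions: x and y hold m*lb elements, ia holds m + 1 entries, and a holds lb*lb values for every stored block, including any zeros inside a non-zero block.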
mkl_?coogemv
Computes matrix-vector product of a sparse general
matrix stored in the coordinate format with one-based
indexing.
Syntax
call mkl_scoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_dcoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_ccoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_zcoogemv(transa, m, val, rowind, colind, nnz, x, y)
Include Files
mkl.fi
Description
The mkl_?coogemv routine performs a matrix-vector operation defined as
y := A*x
or
y := AT*x,
where:
A is an m-by-m sparse square matrix in the coordinate format, AT is the transpose of A.
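NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
transa CHARACTER*1. Specifies the operation.
If transa = 'N' or 'n', then the matrix-vector product is computed as
y := A*x
If transa = 'T' or 't' or 'C' or 'c', then the matrix-vector product is
computed as y := AT*x.
m INTEGER. Number of rows of the matrix A.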
val REAL for mkl_scoogemv.

DOUBLE PRECISION for mkl_dcoogemv.
COMPLEX for mkl_ccoogemv.
DOUBLE COMPLEX for mkl_zcoogemv.
Array of length nnz, contains non-zero elements of the matrix A in
arbitrary order.
Refer to values array description in Coordinate Format for more details.
rowind INTEGER. Array of length nnz, contains the row indices for each non-zero
element of the matrix A.
Refer to rows array description in Coordinate Format for more details.
colind INTEGER. Array of length nnz, contains the column indices for each non-
zero element of the matrix A. Refer to columns array description in
Coordinate Format for more details.
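nnz INTEGER. Specifies the number of non-zero elements of the matrix A.
Refer to nnz description in Coordinate Format for more details.
x REAL for mkl_scoogemv.
DOUBLE PRECISION for mkl_dcoogemv.
COMPLEX for mkl_ccoogemv.
DOUBLE COMPLEX for mkl_zcoogemv.
Array, size is m.
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_scoogemv.
DOUBLE PRECISION for mkl_dcoogemv.
COMPLEX for mkl_ccoogemv.
DOUBLE COMPLEX for mkl_zcoogemv.
Array, size at least m.
On exit, the array y contains the vector y.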
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
INTEGER rowind(*), colind(*)
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dcoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
INTEGER rowind(*), colind(*)
DOUBLE PRECISION val(*), x(*), y(*)
SUBROUTINE mkl_ccoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
INTEGER rowind(*), colind(*)
COMPLEX val(*), x(*), y(*)
SUBROUTINE mkl_zcoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
INTEGER rowind(*), colind(*)
DOUBLE COMPLEX val(*), x(*), y(*)
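Because the coordinate format lists each non-zero as an independent (row, column, value) triple, the product reduces to a single loop over nnz entries in any order. The following plain-Fortran sketch shows that loop; it is an illustration of the documented operation, not the MKL implementation, and the names coo_gemv_sketch and ref_dcoogemv are hypothetical.

```fortran
module coo_gemv_sketch
  implicit none
contains
  ! Illustrative reference loop (hypothetical name): y := A*x or y := A^T*x
  ! for an m-by-m matrix in one-based coordinate storage.
  subroutine ref_dcoogemv(transa, m, val, rowind, colind, nnz, x, y)
    character(len=1), intent(in) :: transa
    integer, intent(in) :: m, nnz
    double precision, intent(in) :: val(*), x(*)
    integer, intent(in) :: rowind(*), colind(*)
    double precision, intent(out) :: y(*)
    integer :: k
    logical :: notrans

    notrans = (transa == 'N' .or. transa == 'n')
    y(1:m) = 0.0d0
    do k = 1, nnz
      if (notrans) then
        y(rowind(k)) = y(rowind(k)) + val(k) * x(colind(k))
      else
        ! Transposing A swaps the roles of the row and column indices.
        y(colind(k)) = y(colind(k)) + val(k) * x(rowind(k))
      end if
    end do
  end subroutine ref_dcoogemv
end module coo_gemv_sketch
```

The entries do not need to be sorted: val = (5, 1, 3, 2, 4) with rowind = (3, 1, 2, 1, 3) and colind = (3, 1, 2, 3, 1) describes the same matrix as rows (1 0 2), (0 3 0), (4 0 5).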
mkl_?diagemv
Computes matrix - vector product of a sparse general
matrix stored in the diagonal format with one-based
indexing.
Syntax
call mkl_sdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
call mkl_ddiagemv(transa, m, val, lval, idiag, ndiag, x, y)
call mkl_cdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
call mkl_zdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
Include Files
mkl.fi
Description
The mkl_?diagemv routine performs a matrix-vector operation defined as
y := A*x
or
y := AT*x,
where:
A is an m-by-m sparse square matrix in the diagonal storage format, AT is the transpose of A.
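NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
transa CHARACTER*1. Specifies the operation.
If transa = 'N' or 'n', then y := A*x
If transa = 'T' or 't' or 'C' or 'c', then y := AT*x.
m INTEGER. Number of rows of the matrix A.
val REAL for mkl_sdiagemv.
DOUBLE PRECISION for mkl_ddiagemv.
COMPLEX for mkl_cdiagemv.
DOUBLE COMPLEX for mkl_zdiagemv.
Two-dimensional array of size lval by ndiag, contains non-zero diagonals of
the matrix A. Refer to values array description in Diagonal Storage Scheme
for more details.
lval INTEGER. Leading dimension of val, lval >= m. Refer to lval description in
Diagonal Storage Scheme for more details.
idiag INTEGER. Array of length ndiag, contains the distances between the main
diagonal and each non-zero diagonal in the matrix A.
Refer to distance array description in Diagonal Storage Scheme for more
details.
ndiag INTEGER. Specifies the number of non-zero diagonals of the matrix A.
x REAL for mkl_sdiagemv.
DOUBLE PRECISION for mkl_ddiagemv.
COMPLEX for mkl_cdiagemv.
DOUBLE COMPLEX for mkl_zdiagemv.
Array, size is m.
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_sdiagemv.
DOUBLE PRECISION for mkl_ddiagemv.
COMPLEX for mkl_cdiagemv.
DOUBLE COMPLEX for mkl_zdiagemv.
Array, size at least m.
On exit, the array y contains the vector y.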
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 transa
INTEGER m, lval, ndiag
INTEGER idiag(*)
REAL val(lval,*), x(*), y(*)
SUBROUTINE mkl_ddiagemv(transa, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 transa
INTEGER m, lval, ndiag
INTEGER idiag(*)
DOUBLE PRECISION val(lval,*), x(*), y(*)
SUBROUTINE mkl_cdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 transa
INTEGER m, lval, ndiag
INTEGER idiag(*)
COMPLEX val(lval,*), x(*), y(*)
SUBROUTINE mkl_zdiagemv(transa, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 transa
INTEGER m, lval, ndiag
INTEGER idiag(*)
DOUBLE COMPLEX val(lval,*), x(*), y(*)
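The relationship between val, lval, and idiag may be easier to see in code. The sketch below assumes the layout described in the Diagonal Storage Scheme section as it is commonly understood: column k of val holds the diagonal at distance idiag(k) from the main diagonal, with val(i, k) = A(i, i + idiag(k)), so padding slots are the ones whose column index would fall outside 1..m. That layout is an assumption here, and the names dia_gemv_sketch and ref_ddiagemv are hypothetical.

```fortran
module dia_gemv_sketch
  implicit none
contains
  ! Illustrative reference loop (hypothetical name): y := A*x or y := A^T*x
  ! for an m-by-m matrix in one-based diagonal storage.
  subroutine ref_ddiagemv(transa, m, val, lval, idiag, ndiag, x, y)
    character(len=1), intent(in) :: transa
    integer, intent(in) :: m, lval, ndiag
    double precision, intent(in) :: val(lval, *), x(*)
    integer, intent(in) :: idiag(*)
    double precision, intent(out) :: y(*)
    integer :: i, j, k, d
    logical :: notrans

    notrans = (transa == 'N' .or. transa == 'n')
    y(1:m) = 0.0d0
    do k = 1, ndiag
      d = idiag(k)                  ! < 0 below, 0 on, > 0 above the main diagonal
      ! Visit only rows i for which column j = i + d lies inside 1..m,
      ! which skips the padding slots of val.
      do i = max(1, 1 - d), min(m, m - d)
        j = i + d
        if (notrans) then
          y(i) = y(i) + val(i, k) * x(j)
        else
          y(j) = y(j) + val(i, k) * x(i)
        end if
      end do
    end do
  end subroutine ref_ddiagemv
end module dia_gemv_sketch
```

For the tridiagonal matrix with rows (1 2 0), (3 4 5), (0 6 7), idiag = (-1, 0, 1) and the columns of val are (pad, 3, 6), (1, 4, 7), and (2, 5, pad), where pad marks slots that are never read.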
mkl_?csrsymv
Computes matrix - vector product of a sparse
symmetrical matrix stored in the CSR format (3-array
variation) with one-based indexing.
Syntax
call mkl_scsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_dcsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_ccsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_zcsrsymv(uplo, m, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_?csrsymv routine performs a matrix-vector operation defined as
y := A*x
where:
A is an upper or lower triangle of the symmetrical sparse matrix in the CSR format (3-array variation).
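NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
m INTEGER. Number of rows of the matrix A.
a REAL for mkl_scsrsymv.
DOUBLE PRECISION for mkl_dcsrsymv.
COMPLEX for mkl_ccsrsymv.
DOUBLE COMPLEX for mkl_zcsrsymv.
Array containing non-zero elements of the matrix A. Its length is equal to
the number of non-zero elements in the matrix A. Refer to values array
description in Sparse Matrix Storage Formats for more details.
ia INTEGER. Array of length m + 1, containing indices of elements in the array
a, such that ia(i) is the index in the array a of the first non-zero element
from the row i. The value of the last element ia(m + 1) is equal to the
number of non-zeros plus one. Refer to rowIndex array description in
Sparse Matrix Storage Formats for more details.
ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to the length of the array a. Refer to columns array
description in Sparse Matrix Storage Formats for more details.
x REAL for mkl_scsrsymv.
DOUBLE PRECISION for mkl_dcsrsymv.
COMPLEX for mkl_ccsrsymv.
DOUBLE COMPLEX for mkl_zcsrsymv.
Array, size is m.
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_scsrsymv.
DOUBLE PRECISION for mkl_dcsrsymv.
COMPLEX for mkl_ccsrsymv.
DOUBLE COMPLEX for mkl_zcsrsymv.
Array, size at least m.
On exit, the array y contains the vector y.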
Interfaces
FORTRAN 77:
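SUBROUTINE mkl_scsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
SUBROUTINE mkl_dcsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
INTEGER ia(*), ja(*)
DOUBLE PRECISION a(*), x(*), y(*)
SUBROUTINE mkl_ccsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
INTEGER ia(*), ja(*)
COMPLEX a(*), x(*), y(*)
SUBROUTINE mkl_zcsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
INTEGER ia(*), ja(*)
DOUBLE COMPLEX a(*), x(*), y(*)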
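Since only one triangle of the symmetric matrix is stored, each off-diagonal entry has to contribute twice to the product. A minimal plain-Fortran sketch of that idea follows. It is not the MKL implementation; the names csr_symv_sketch and ref_dcsrsymv are hypothetical, and the choice to skip any entry lying outside the selected triangle is an assumption made for this illustration.

```fortran
module csr_symv_sketch
  implicit none
contains
  ! Illustrative reference loop (hypothetical name): y := A*x where only the
  ! upper or lower triangle of symmetric A is stored in one-based CSR.
  subroutine ref_dcsrsymv(uplo, m, a, ia, ja, x, y)
    character(len=1), intent(in) :: uplo
    integer, intent(in) :: m
    double precision, intent(in) :: a(*), x(*)
    integer, intent(in) :: ia(*), ja(*)
    double precision, intent(out) :: y(*)
    integer :: i, j, k
    logical :: upper

    upper = (uplo == 'U' .or. uplo == 'u')
    y(1:m) = 0.0d0
    do i = 1, m
      do k = ia(i), ia(i+1) - 1
        j = ja(k)
        ! ASSUMPTION: entries outside the selected triangle are ignored.
        if (upper .and. j < i) cycle
        if (.not. upper .and. j > i) cycle
        y(i) = y(i) + a(k) * x(j)
        ! Mirror the off-diagonal entry A(j,i) = A(i,j).
        if (j /= i) y(j) = y(j) + a(k) * x(i)
      end do
    end do
  end subroutine ref_dcsrsymv
end module csr_symv_sketch
```

For the symmetric matrix with rows (2 1 0), (1 3 4), (0 4 5), the upper triangle is a = (2, 1, 3, 4, 5), ja = (1, 2, 2, 3, 3), ia = (1, 3, 5, 6), and the lower triangle is a = (2, 1, 3, 4, 5), ja = (1, 1, 2, 2, 3), ia = (1, 2, 4, 6). Either choice gives y = (4, 19, 23) for x = (1, 2, 3).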
mkl_?bsrsymv
Computes matrix-vector product of a sparse
symmetrical matrix stored in the BSR format (3-array
variation) with one-based indexing.
Syntax
call mkl_sbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_dbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_cbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_zbsrsymv(uplo, m, lb, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_?bsrsymv routine performs a matrix-vector operation defined as
y := A*x
where:
A is an upper or lower triangle of the symmetrical sparse matrix in the BSR format (3-array variation).
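NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
considered.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
m INTEGER. Number of block rows of the matrix A.
lb INTEGER. Size of the block in the matrix A.
a REAL for mkl_sbsrsymv.
DOUBLE PRECISION for mkl_dbsrsymv.
COMPLEX for mkl_cbsrsymv.
DOUBLE COMPLEX for mkl_zbsrsymv.
Array containing elements of non-zero blocks of the matrix A. Its length is
equal to the number of non-zero blocks in the matrix A multiplied by lb*lb.
Refer to values array description in BSR Format for more details.
ia INTEGER. Array of length (m + 1), containing indices of blocks in the array
a, such that ia(i) is the index in the array a of the first non-zero element
from the row i. The value of the last element ia(m + 1) is equal to the
number of non-zero blocks plus one. Refer to rowIndex array description in
BSR Format for more details.
ja INTEGER. Array containing the column indices for each non-zero block in
the matrix A.
Its length is equal to the number of non-zero blocks of the matrix A. Refer
to columns array description in BSR Format for more details.
x REAL for mkl_sbsrsymv.
DOUBLE PRECISION for mkl_dbsrsymv.
COMPLEX for mkl_cbsrsymv.
DOUBLE COMPLEX for mkl_zbsrsymv.
Array, size (m*lb).
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_sbsrsymv.
DOUBLE PRECISION for mkl_dbsrsymv.
COMPLEX for mkl_cbsrsymv.
DOUBLE COMPLEX for mkl_zbsrsymv.
Array, size at least (m*lb).
On exit, the array y contains the vector y.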
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
SUBROUTINE mkl_dbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
INTEGER ia(*), ja(*)
DOUBLE PRECISION a(*), x(*), y(*)
SUBROUTINE mkl_cbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
INTEGER ia(*), ja(*)
COMPLEX a(*), x(*), y(*)
SUBROUTINE mkl_zbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
INTEGER ia(*), ja(*)
DOUBLE COMPLEX a(*), x(*), y(*)
mkl_?coosymv
Computes matrix - vector product of a sparse
symmetrical matrix stored in the coordinate format
with one-based indexing.
Syntax
call mkl_scoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_dcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_ccoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_zcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
Include Files
mkl.fi
Description
The mkl_?coosymv routine performs a matrix-vector operation defined as
y := A*x
where:
A is an upper or lower triangle of the symmetrical sparse matrix in the coordinate format.
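NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
m INTEGER. Number of rows of the matrix A.
val REAL for mkl_scoosymv.
DOUBLE PRECISION for mkl_dcoosymv.
COMPLEX for mkl_ccoosymv.
DOUBLE COMPLEX for mkl_zcoosymv.
Array of length nnz, contains non-zero elements of the matrix A in
arbitrary order.
Refer to values array description in Coordinate Format for more details.
rowind INTEGER. Array of length nnz, contains the row indices for each non-zero
element of the matrix A.
Refer to rows array description in Coordinate Format for more details.
colind INTEGER. Array of length nnz, contains the column indices for each
non-zero element of the matrix A. Refer to columns array description in
Coordinate Format for more details.
nnz INTEGER. Specifies the number of non-zero elements of the matrix A.
Refer to nnz description in Coordinate Format for more details.
x REAL for mkl_scoosymv.
DOUBLE PRECISION for mkl_dcoosymv.
COMPLEX for mkl_ccoosymv.
DOUBLE COMPLEX for mkl_zcoosymv.
Array, size is m.
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_scoosymv.
DOUBLE PRECISION for mkl_dcoosymv.
COMPLEX for mkl_ccoosymv.
DOUBLE COMPLEX for mkl_zcoosymv.
Array, size at least m.
On exit, the array y contains the vector y.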
Interfaces
FORTRAN 77:
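SUBROUTINE mkl_scoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
INTEGER rowind(*), colind(*)
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
INTEGER rowind(*), colind(*)
DOUBLE PRECISION val(*), x(*), y(*)
SUBROUTINE mkl_ccoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
INTEGER rowind(*), colind(*)
COMPLEX val(*), x(*), y(*)
SUBROUTINE mkl_zcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
INTEGER rowind(*), colind(*)
DOUBLE COMPLEX val(*), x(*), y(*)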
mkl_?diasymv
Computes matrix - vector product of a sparse
symmetrical matrix stored in the diagonal format with
one-based indexing.
Syntax
call mkl_sdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
call mkl_ddiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
call mkl_cdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
call mkl_zdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
Include Files
mkl.fi
Description
The mkl_?diasymv routine performs a matrix-vector operation defined as
y := A*x
where:
A is an upper or lower triangle of the symmetrical sparse matrix.
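NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
m INTEGER. Number of rows of the matrix A.
val REAL for mkl_sdiasymv.
DOUBLE PRECISION for mkl_ddiasymv.
COMPLEX for mkl_cdiasymv.
DOUBLE COMPLEX for mkl_zdiasymv.
Two-dimensional array of size lval by ndiag, contains non-zero diagonals
of the matrix A. Refer to values array description in Diagonal Storage
Scheme for more details.
lval INTEGER. Leading dimension of val, lval >= m. Refer to lval description in
Diagonal Storage Scheme for more details.
idiag INTEGER. Array of length ndiag, contains the distances between the main
diagonal and each non-zero diagonal in the matrix A.
Refer to distance array description in Diagonal Storage Scheme for more
details.
ndiag INTEGER. Specifies the number of non-zero diagonals of the matrix A.
x REAL for mkl_sdiasymv.
DOUBLE PRECISION for mkl_ddiasymv.
COMPLEX for mkl_cdiasymv.
DOUBLE COMPLEX for mkl_zdiasymv.
Array, size is m.
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_sdiasymv.
DOUBLE PRECISION for mkl_ddiasymv.
COMPLEX for mkl_cdiasymv.
DOUBLE COMPLEX for mkl_zdiasymv.
Array, size at least m.
On exit, the array y contains the vector y.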
Interfaces
FORTRAN 77:
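SUBROUTINE mkl_sdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 uplo
INTEGER m, lval, ndiag
INTEGER idiag(*)
REAL val(lval,*), x(*), y(*)
SUBROUTINE mkl_ddiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 uplo
INTEGER m, lval, ndiag
INTEGER idiag(*)
DOUBLE PRECISION val(lval,*), x(*), y(*)
SUBROUTINE mkl_cdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 uplo
INTEGER m, lval, ndiag
INTEGER idiag(*)
COMPLEX val(lval,*), x(*), y(*)
SUBROUTINE mkl_zdiasymv(uplo, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 uplo
INTEGER m, lval, ndiag
INTEGER idiag(*)
DOUBLE COMPLEX val(lval,*), x(*), y(*)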
mkl_?csrtrsv
Triangular solvers with simplified interface for a sparse
matrix in the CSR format (3-array variation) with one-
based indexing.
Syntax
call mkl_scsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_dcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_ccsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_zcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_?csrtrsv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix stored in the CSR format (3 array variation):
A*y = x
or
AT*y = x,
where:
A is a sparse upper or lower triangular matrix with unit or non-unit main diagonal, AT is the transpose of A.
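NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.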
transa CHARACTER*1. Specifies the system of linear equations.

If transa = 'N' or 'n', then A*y = x
If transa = 'T' or 't' or 'C' or 'c', then AT*y = x,
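diag CHARACTER*1. Specifies whether A is unit triangular.
If diag = 'U' or 'u', then A is unit triangular.
If diag = 'N' or 'n', then A is not unit triangular.
m INTEGER. Number of rows of the matrix A.
a REAL for mkl_scsrtrsv.
DOUBLE PRECISION for mkl_dcsrtrsv.
COMPLEX for mkl_ccsrtrsv.
DOUBLE COMPLEX for mkl_zcsrtrsv.
Array containing non-zero elements of the matrix A. Its length is equal to
the number of non-zero elements in the matrix A. Refer to values array
description in Sparse Matrix Storage Formats for more details.
NOTE
The non-zero elements of the given row of the matrix must be stored
in the same order as they appear in the row (from left to right).
No diagonal element can be omitted from a sparse storage if the solver
is called with the non-unit indicator.
ia INTEGER. Array of length m + 1, containing indices of elements in the array
a, such that ia(i) is the index in the array a of the first non-zero element
from the row i. The value of the last element ia(m + 1) is equal to the
number of non-zeros plus one. Refer to rowIndex array description in
Sparse Matrix Storage Formats for more details.
ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to the length of the array a. Refer to columns array
description in Sparse Matrix Storage Formats for more details.
NOTE
Column indices must be sorted in increasing order for each row.
x REAL for mkl_scsrtrsv.
DOUBLE PRECISION for mkl_dcsrtrsv.
COMPLEX for mkl_ccsrtrsv.
DOUBLE COMPLEX for mkl_zcsrtrsv.
Array, size is m.
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_scsrtrsv.
DOUBLE PRECISION for mkl_dcsrtrsv.
COMPLEX for mkl_ccsrtrsv.
DOUBLE COMPLEX for mkl_zcsrtrsv.
Array, size at least m.
Contains the vector y.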
Interfaces
FORTRAN 77:
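SUBROUTINE mkl_scsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
SUBROUTINE mkl_dcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m
INTEGER ia(*), ja(*)
DOUBLE PRECISION a(*), x(*), y(*)
SUBROUTINE mkl_ccsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m
INTEGER ia(*), ja(*)
COMPLEX a(*), x(*), y(*)
SUBROUTINE mkl_zcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m
INTEGER ia(*), ja(*)
DOUBLE COMPLEX a(*), x(*), y(*)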
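The solve A*y = x for a triangular CSR matrix amounts to forward substitution (lower triangle) or back substitution (upper triangle). The sketch below covers only the transa = 'N' case and is meant to show why the diagonal entry must be present when diag = 'N'. It is not the MKL implementation; the names csr_trsv_sketch and ref_dcsrtrsv_n are hypothetical.

```fortran
module csr_trsv_sketch
  implicit none
contains
  ! Illustrative reference loop (hypothetical name): solves A*y = x for a
  ! triangular matrix in one-based 3-array CSR storage, transa = 'N' only.
  subroutine ref_dcsrtrsv_n(uplo, diag, m, a, ia, ja, x, y)
    character(len=1), intent(in) :: uplo, diag
    integer, intent(in) :: m
    double precision, intent(in) :: a(*), x(*)
    integer, intent(in) :: ia(*), ja(*)
    double precision, intent(out) :: y(*)
    integer :: i, j, k, first, last, step
    logical :: lower, unitdiag
    double precision :: s, d

    lower = (uplo == 'L' .or. uplo == 'l')
    unitdiag = (diag == 'U' .or. diag == 'u')
    if (lower) then
      first = 1; last = m; step = 1      ! forward substitution
    else
      first = m; last = 1; step = -1     ! back substitution
    end if
    do i = first, last, step
      s = x(i)
      d = 1.0d0                          ! used as-is when diag = 'U'
      do k = ia(i), ia(i+1) - 1
        j = ja(k)
        if (j == i) then
          if (.not. unitdiag) d = a(k)   ! stored diagonal is required here
        else if ((lower .and. j < i) .or. (.not. lower .and. j > i)) then
          s = s - a(k) * y(j)            ! y(j) was already computed
        end if
      end do
      y(i) = s / d
    end do
  end subroutine ref_dcsrtrsv_n
end module csr_trsv_sketch
```

With the lower triangular matrix whose rows are (2 0 0), (1 3 0), (4 5 6), stored as a = (2, 1, 3, 4, 5, 6), ja = (1, 1, 2, 1, 2, 3), ia = (1, 2, 4, 7), the right-hand side x = (2, 7, 32) yields y = (1, 2, 3). Passing diag = 'U' makes the sketch ignore the stored diagonal values and treat them as one.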
mkl_?bsrtrsv
Triangular solver with simplified interface for a sparse
matrix stored in the BSR format (3-array variation)
with one-based indexing.
Syntax
call mkl_sbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_dbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_cbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_zbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
Include Files
mkl.fi
Description
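The mkl_?bsrtrsv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix stored in the BSR format (3-array variation):
A*y = x
or
AT*y = x,
where:
A is a sparse upper or lower triangular matrix with unit or non-unit main diagonal, AT is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
transa CHARACTER*1. Specifies the system of linear equations.
If transa = 'N' or 'n', then A*y = x
If transa = 'T' or 't' or 'C' or 'c', then AT*y = x.
diag CHARACTER*1. Specifies whether A is a unit triangular matrix.
If diag = 'U' or 'u', then A is unit triangular.
If diag = 'N' or 'n', then A is not unit triangular.
m INTEGER. Number of block rows of the matrix A.
lb INTEGER. Size of the block in the matrix A.
a REAL for mkl_sbsrtrsv.
DOUBLE PRECISION for mkl_dbsrtrsv.
COMPLEX for mkl_cbsrtrsv.
DOUBLE COMPLEX for mkl_zbsrtrsv.
Array containing elements of non-zero blocks of the matrix A. Its length is
equal to the number of non-zero blocks in the matrix A multiplied by lb*lb.
Refer to values array description in BSR Format for more details.
NOTE
The non-zero elements of the given row of the matrix must be stored
in the same order as they appear in the row (from left to right).
No diagonal element can be omitted from a sparse storage if the solver
is called with the non-unit indicator.
ia INTEGER. Array of length (m + 1), containing indices of blocks in the array
a, such that ia(I) is the index in the array a of the first non-zero element
from the row I. The value of the last element ia(m + 1) is equal to the
number of non-zero blocks plus one. Refer to rowIndex array description in
BSR Format for more details.
ja INTEGER.
Array containing the column indices for each non-zero block in the matrix A.
Its length is equal to the number of non-zero blocks of the matrix A. Refer
to columns array description in BSR Format for more details.
x REAL for mkl_sbsrtrsv.
DOUBLE PRECISION for mkl_dbsrtrsv.
COMPLEX for mkl_cbsrtrsv.
DOUBLE COMPLEX for mkl_zbsrtrsv.
Array, size (m*lb).
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_sbsrtrsv.
DOUBLE PRECISION for mkl_dbsrtrsv.
COMPLEX for mkl_cbsrtrsv.
DOUBLE COMPLEX for mkl_zbsrtrsv.
Array, size at least (m*lb).
On exit, the array y contains the vector y.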
Interfaces
FORTRAN 77:
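SUBROUTINE mkl_sbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, lb
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
SUBROUTINE mkl_dbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, lb
INTEGER ia(*), ja(*)
DOUBLE PRECISION a(*), x(*), y(*)
SUBROUTINE mkl_cbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, lb
INTEGER ia(*), ja(*)
COMPLEX a(*), x(*), y(*)
SUBROUTINE mkl_zbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, lb
INTEGER ia(*), ja(*)
DOUBLE COMPLEX a(*), x(*), y(*)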
mkl_?cootrsv
Triangular solvers with simplified interface for a sparse
matrix in the coordinate format with one-based
indexing.
Syntax
call mkl_scootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
call mkl_dcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
call mkl_ccootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
call mkl_zcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
Include Files
mkl.fi
Description
The mkl_?cootrsv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix stored in the coordinate format:
A*y = x
or
AT*y = x,
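where:
A is a sparse upper or lower triangular matrix with unit or non-unit main diagonal, AT is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
considered.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
transa CHARACTER*1. Specifies the system of linear equations.
If transa = 'N' or 'n', then A*y = x
If transa = 'T' or 't' or 'C' or 'c', then AT*y = x.
diag CHARACTER*1. Specifies whether A is unit triangular.
If diag = 'U' or 'u', then A is unit triangular.
If diag = 'N' or 'n', then A is not unit triangular.
m INTEGER. Number of rows of the matrix A.
val REAL for mkl_scootrsv.
DOUBLE PRECISION for mkl_dcootrsv.
COMPLEX for mkl_ccootrsv.
DOUBLE COMPLEX for mkl_zcootrsv.
Array of length nnz, contains non-zero elements of the matrix A in
arbitrary order.
Refer to values array description in Coordinate Format for more details.
rowind INTEGER. Array of length nnz, contains the row indices for each non-zero
element of the matrix A.
Refer to rows array description in Coordinate Format for more details.
colind INTEGER. Array of length nnz, contains the column indices for each
non-zero element of the matrix A. Refer to columns array description in
Coordinate Format for more details.
nnz INTEGER. Specifies the number of non-zero elements of the matrix A.
Refer to nnz description in Coordinate Format for more details.
x REAL for mkl_scootrsv.
DOUBLE PRECISION for mkl_dcootrsv.
COMPLEX for mkl_ccootrsv.
DOUBLE COMPLEX for mkl_zcootrsv.
Array, size is m.
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_scootrsv.
DOUBLE PRECISION for mkl_dcootrsv.
COMPLEX for mkl_ccootrsv.
DOUBLE COMPLEX for mkl_zcootrsv.
Array, size at least m.
Contains the vector y.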
Interfaces
FORTRAN 77:
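SUBROUTINE mkl_scootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, nnz
INTEGER rowind(*), colind(*)
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, nnz
INTEGER rowind(*), colind(*)
DOUBLE PRECISION val(*), x(*), y(*)
SUBROUTINE mkl_ccootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, nnz
INTEGER rowind(*), colind(*)
COMPLEX val(*), x(*), y(*)
SUBROUTINE mkl_zcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, nnz
INTEGER rowind(*), colind(*)
DOUBLE COMPLEX val(*), x(*), y(*)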
mkl_?diatrsv
Triangular solvers with simplified interface for a sparse
matrix in the diagonal format with one-based
indexing.
Syntax
call mkl_sdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
call mkl_ddiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
call mkl_cdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
call mkl_zdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
Include Files
mkl.fi
Description
The mkl_?diatrsv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix stored in the diagonal format:
A*y = x
or
AT*y = x,
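where:
A is a sparse upper or lower triangular matrix with unit or non-unit main diagonal, AT is the transpose of A.
NOTE
This routine supports only one-based indexing of the input arrays.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
transa CHARACTER*1. Specifies the system of linear equations.
If transa = 'N' or 'n', then A*y = x
If transa = 'T' or 't' or 'C' or 'c', then AT*y = x.
diag CHARACTER*1. Specifies whether A is unit triangular.
If diag = 'U' or 'u', then A is unit triangular.
If diag = 'N' or 'n', then A is not unit triangular.
m INTEGER. Number of rows of the matrix A.
val REAL for mkl_sdiatrsv.
DOUBLE PRECISION for mkl_ddiatrsv.
COMPLEX for mkl_cdiatrsv.
DOUBLE COMPLEX for mkl_zdiatrsv.
Two-dimensional array of size lval by ndiag, contains non-zero diagonals
of the matrix A. Refer to values array description in Diagonal Storage
Scheme for more details.
lval INTEGER. Leading dimension of val, lval >= m. Refer to lval description in
Diagonal Storage Scheme for more details.
idiag INTEGER. Array of length ndiag, contains the distances between the main
diagonal and each non-zero diagonal in the matrix A.
NOTE
All elements of this array must be sorted in increasing order.
Refer to distance array description in Diagonal Storage Scheme for more
details.
ndiag INTEGER. Specifies the number of non-zero diagonals of the matrix A.
x REAL for mkl_sdiatrsv.
DOUBLE PRECISION for mkl_ddiatrsv.
COMPLEX for mkl_cdiatrsv.
DOUBLE COMPLEX for mkl_zdiatrsv.
Array, size is m.
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_sdiatrsv.
DOUBLE PRECISION for mkl_ddiatrsv.
COMPLEX for mkl_cdiatrsv.
DOUBLE COMPLEX for mkl_zdiatrsv.
Array, size at least m.
Contains the vector y.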
Interfaces
FORTRAN 77:
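SUBROUTINE mkl_sdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, lval, ndiag
INTEGER idiag(*)
REAL val(lval,*), x(*), y(*)
SUBROUTINE mkl_ddiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, lval, ndiag
INTEGER idiag(*)
DOUBLE PRECISION val(lval,*), x(*), y(*)
SUBROUTINE mkl_cdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, lval, ndiag
INTEGER idiag(*)
COMPLEX val(lval,*), x(*), y(*)
SUBROUTINE mkl_zdiatrsv(uplo, transa, diag, m, val, lval, idiag, ndiag, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m, lval, ndiag
INTEGER idiag(*)
DOUBLE COMPLEX val(lval,*), x(*), y(*)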
mkl_cspblas_?csrgemv
Computes matrix - vector product of a sparse general
matrix stored in the CSR format (3-array variation)
with zero-based indexing.
Syntax
call mkl_cspblas_scsrgemv(transa, m, a, ia, ja, x, y)
call mkl_cspblas_dcsrgemv(transa, m, a, ia, ja, x, y)
call mkl_cspblas_ccsrgemv(transa, m, a, ia, ja, x, y)
call mkl_cspblas_zcsrgemv(transa, m, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_cspblas_?csrgemv routine performs a matrix-vector operation defined as
y := A*x
or
y := AT*x,
where:
A is an m-by-m sparse square matrix in the CSR format (3-array variation) with zero-based indexing, AT is
the transpose of A.
NOTE
This routine supports only zero-based indexing of the input arrays.
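Input Parameters
transa CHARACTER*1. Specifies the operation.
If transa = 'N' or 'n', then the matrix-vector product is computed as
y := A*x
If transa = 'T' or 't' or 'C' or 'c', then the matrix-vector product is
computed as y := AT*x.
m INTEGER. Number of rows of the matrix A.
a REAL for mkl_cspblas_scsrgemv.
DOUBLE PRECISION for mkl_cspblas_dcsrgemv.
COMPLEX for mkl_cspblas_ccsrgemv.
DOUBLE COMPLEX for mkl_cspblas_zcsrgemv.
Array containing non-zero elements of the matrix A. Its length is equal to
the number of non-zero elements in the matrix A. Refer to values array
description in Sparse Matrix Storage Formats for more details.
ia INTEGER. Array of length m + 1, containing indices of elements in the array
a, such that ia(I) is the index in the array a of the first non-zero element
from the row I. The value of the last element ia(m) is equal to the number
of non-zeros. Refer to rowIndex array description in Sparse Matrix Storage
Formats for more details.
ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to the length of the array a. Refer to columns array
description in Sparse Matrix Storage Formats for more details.
x REAL for mkl_cspblas_scsrgemv.
DOUBLE PRECISION for mkl_cspblas_dcsrgemv.
COMPLEX for mkl_cspblas_ccsrgemv.
DOUBLE COMPLEX for mkl_cspblas_zcsrgemv.
Array, size is m.
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_cspblas_scsrgemv.
DOUBLE PRECISION for mkl_cspblas_dcsrgemv.
COMPLEX for mkl_cspblas_ccsrgemv.
DOUBLE COMPLEX for mkl_cspblas_zcsrgemv.
Array, size at least m.
On exit, the array y contains the vector y.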
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dcsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER ia(*), ja(*)
DOUBLE PRECISION a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_ccsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER ia(*), ja(*)
COMPLEX a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_zcsrgemv(transa, m, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER ia(*), ja(*)
DOUBLE COMPLEX a(*), x(*), y(*)
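The zero-based routines expect the values stored in ia and ja to count from zero, even though Fortran still addresses the arrays themselves from position 1. The sketch below shows one way to shift one-based CSR index arrays to zero-based values, plus a reference loop (transa = 'N' only) that reads such arrays. Both routines are illustrations with hypothetical names (csr_zero_based_sketch, to_zero_based, ref_cspblas_dcsrgemv_n) and are not part of Intel MKL.

```fortran
module csr_zero_based_sketch
  implicit none
contains
  ! Shifts one-based CSR index values to zero-based values in place.
  subroutine to_zero_based(m, ia, ja)
    integer, intent(in) :: m
    integer, intent(inout) :: ia(*), ja(*)
    integer :: nnz

    nnz = ia(m+1) - 1              ! one-based: last entry is nnz + 1
    ja(1:nnz) = ja(1:nnz) - 1
    ia(1:m+1) = ia(1:m+1) - 1      ! zero-based: last entry becomes nnz
  end subroutine to_zero_based

  ! Illustrative reference loop: y := A*x with zero-based index values.
  subroutine ref_cspblas_dcsrgemv_n(m, a, ia, ja, x, y)
    integer, intent(in) :: m
    double precision, intent(in) :: a(*), x(*)
    integer, intent(in) :: ia(*), ja(*)
    double precision, intent(out) :: y(*)
    integer :: i, k

    y(1:m) = 0.0d0
    do i = 1, m
      ! Zero-based offsets ia(i) .. ia(i+1)-1 map to Fortran positions
      ! ia(i)+1 .. ia(i+1); zero-based column ja(k) is x(ja(k)+1).
      do k = ia(i) + 1, ia(i+1)
        y(i) = y(i) + a(k) * x(ja(k) + 1)
      end do
    end do
  end subroutine ref_cspblas_dcsrgemv_n
end module csr_zero_based_sketch
```

Starting from the one-based arrays ia = (1, 3, 4, 6) and ja = (1, 3, 2, 1, 3), the conversion gives ia = (0, 2, 3, 5) and ja = (0, 2, 1, 0, 2); the last entry of ia then equals the number of non-zeros, as the parameter description states.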
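mkl_cspblas_?bsrgemv
Computes matrix - vector product of a sparse general
matrix stored in the BSR format (3-array variation)
with zero-based indexing.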
Syntax
call mkl_cspblas_sbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_cspblas_dbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_cspblas_cbsrgemv(transa, m, lb, a, ia, ja, x, y)
call mkl_cspblas_zbsrgemv(transa, m, lb, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_cspblas_?bsrgemv routine performs a matrix-vector operation defined as
y := A*x
or
y := AT*x,
where:
A is an m-by-m block sparse square matrix in the BSR format (3-array variation) with zero-based indexing,
AT is the transpose of A.
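NOTE
This routine supports only zero-based indexing of the input arrays.
Input Parameters
transa CHARACTER*1. Specifies the operation.
If transa = 'N' or 'n', then the matrix-vector product is computed as
y := A*x
If transa = 'T' or 't' or 'C' or 'c', then the matrix-vector product is
computed as y := AT*x.
m INTEGER. Number of block rows of the matrix A.
lb INTEGER. Size of the block in the matrix A.
a REAL for mkl_cspblas_sbsrgemv.
DOUBLE PRECISION for mkl_cspblas_dbsrgemv.
COMPLEX for mkl_cspblas_cbsrgemv.
DOUBLE COMPLEX for mkl_cspblas_zbsrgemv.
Array containing elements of non-zero blocks of the matrix A. Its length is
equal to the number of non-zero blocks in the matrix A multiplied by lb*lb.
Refer to values array description in BSR Format for more details.
ia INTEGER. Array of length (m + 1), containing indices of blocks in the array
a, such that ia(i) is the index in the array a of the first non-zero element
from the row i. The value of the last element ia(m + 1) is equal to the
number of non-zero blocks. Refer to rowIndex array description in BSR
Format for more details.
ja INTEGER. Array containing the column indices for each non-zero block in
the matrix A.
Its length is equal to the number of non-zero blocks of the matrix A. Refer
to columns array description in BSR Format for more details.
x REAL for mkl_cspblas_sbsrgemv.
DOUBLE PRECISION for mkl_cspblas_dbsrgemv.
COMPLEX for mkl_cspblas_cbsrgemv.
DOUBLE COMPLEX for mkl_cspblas_zbsrgemv.
Array, size (m*lb).
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_cspblas_sbsrgemv.
DOUBLE PRECISION for mkl_cspblas_dbsrgemv.
COMPLEX for mkl_cspblas_cbsrgemv.
DOUBLE COMPLEX for mkl_cspblas_zbsrgemv.
Array, size at least (m*lb).
On exit, the array y contains the vector y.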
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_sbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
INTEGER ia(*), ja(*)
DOUBLE PRECISION a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_cbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
INTEGER ia(*), ja(*)
COMPLEX a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_zbsrgemv(transa, m, lb, a, ia, ja, x, y)
CHARACTER*1 transa
INTEGER m, lb
INTEGER ia(*), ja(*)
DOUBLE COMPLEX a(*), x(*), y(*)
mkl_cspblas_?coogemv
Computes matrix - vector product of a sparse general
matrix stored in the coordinate format with zero-based
indexing.
Syntax
call mkl_cspblas_scoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_dcoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_ccoogemv(transa, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_zcoogemv(transa, m, val, rowind, colind, nnz, x, y)
Include Files
mkl.fi
Description
The mkl_cspblas_?coogemv routine performs a matrix-vector operation defined as
y := A*x
or
y := AT*x,
where:
A is an m-by-m sparse square matrix in the coordinate format with zero-based indexing, AT is the transpose
of A.
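NOTE
This routine supports only zero-based indexing of the input arrays.
Input Parameters
transa CHARACTER*1. Specifies the operation.
If transa = 'N' or 'n', then the matrix-vector product is computed as
y := A*x
If transa = 'T' or 't' or 'C' or 'c', then the matrix-vector product is
computed as y := AT*x.
m INTEGER. Number of rows of the matrix A.
val REAL for mkl_cspblas_scoogemv.
DOUBLE PRECISION for mkl_cspblas_dcoogemv.
COMPLEX for mkl_cspblas_ccoogemv.
DOUBLE COMPLEX for mkl_cspblas_zcoogemv.
Array of length nnz, contains non-zero elements of the matrix A in
arbitrary order.
Refer to values array description in Coordinate Format for more details.
rowind INTEGER. Array of length nnz, contains the row indices for each non-zero
element of the matrix A.
Refer to rows array description in Coordinate Format for more details.
colind INTEGER. Array of length nnz, contains the column indices for each
non-zero element of the matrix A. Refer to columns array description in
Coordinate Format for more details.
nnz INTEGER. Specifies the number of non-zero elements of the matrix A.
Refer to nnz description in Coordinate Format for more details.
x REAL for mkl_cspblas_scoogemv.
DOUBLE PRECISION for mkl_cspblas_dcoogemv.
COMPLEX for mkl_cspblas_ccoogemv.
DOUBLE COMPLEX for mkl_cspblas_zcoogemv.
Array, size is m.
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_cspblas_scoogemv.
DOUBLE PRECISION for mkl_cspblas_dcoogemv.
COMPLEX for mkl_cspblas_ccoogemv.
DOUBLE COMPLEX for mkl_cspblas_zcoogemv.
Array, size at least m.
On exit, the array y contains the vector y.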
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
INTEGER rowind(*), colind(*)
REAL val(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dcoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
INTEGER rowind(*), colind(*)
DOUBLE PRECISION val(*), x(*), y(*)
SUBROUTINE mkl_cspblas_ccoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
INTEGER rowind(*), colind(*)
COMPLEX val(*), x(*), y(*)
SUBROUTINE mkl_cspblas_zcoogemv(transa, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
INTEGER rowind(*), colind(*)
DOUBLE COMPLEX val(*), x(*), y(*)
mkl_cspblas_?csrsymv
Computes matrix - vector product of a sparse
symmetrical matrix stored in the CSR format (3-array
variation) with zero-based indexing.
Syntax
call mkl_cspblas_scsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_cspblas_dcsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_cspblas_ccsrsymv(uplo, m, a, ia, ja, x, y)
call mkl_cspblas_zcsrsymv(uplo, m, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_cspblas_?csrsymv routine performs a matrix-vector operation defined as
y := A*x
where:
A is an upper or lower triangle of the symmetrical sparse matrix in the CSR format (3-array variation) with
zero-based indexing.
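NOTE
This routine supports only zero-based indexing of the input arrays.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
m INTEGER. Number of rows of the matrix A.
a REAL for mkl_cspblas_scsrsymv.
DOUBLE PRECISION for mkl_cspblas_dcsrsymv.
COMPLEX for mkl_cspblas_ccsrsymv.
DOUBLE COMPLEX for mkl_cspblas_zcsrsymv.
Array containing non-zero elements of the matrix A. Its length is equal to
the number of non-zero elements in the matrix A. Refer to values array
description in Sparse Matrix Storage Formats for more details.
ia INTEGER. Array of length m + 1, containing indices of elements in the array
a, such that ia(i) is the index in the array a of the first non-zero element
from the row i. The value of the last element ia(m + 1) is equal to the
number of non-zeros. Refer to rowIndex array description in Sparse Matrix
Storage Formats for more details.
ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to the length of the array a. Refer to columns array
description in Sparse Matrix Storage Formats for more details.
x REAL for mkl_cspblas_scsrsymv.
DOUBLE PRECISION for mkl_cspblas_dcsrsymv.
COMPLEX for mkl_cspblas_ccsrsymv.
DOUBLE COMPLEX for mkl_cspblas_zcsrsymv.
Array, size is m.
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_cspblas_scsrsymv.
DOUBLE PRECISION for mkl_cspblas_dcsrsymv.
COMPLEX for mkl_cspblas_ccsrsymv.
DOUBLE COMPLEX for mkl_cspblas_zcsrsymv.
Array, size at least m.
On exit, the array y contains the vector y.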
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dcsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
INTEGER ia(*), ja(*)
DOUBLE PRECISION a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_ccsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
INTEGER ia(*), ja(*)
COMPLEX a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_zcsrsymv(uplo, m, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m
INTEGER ia(*), ja(*)
DOUBLE COMPLEX a(*), x(*), y(*)
mkl_cspblas_?bsrsymv
Computes matrix - vector product of a sparse
symmetrical matrix stored in the BSR format (3-array
variation) with zero-based indexing.
Syntax
call mkl_cspblas_sbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_cspblas_dbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_cspblas_cbsrsymv(uplo, m, lb, a, ia, ja, x, y)
call mkl_cspblas_zbsrsymv(uplo, m, lb, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_cspblas_?bsrsymv routine performs a matrix-vector operation defined as
y := A*x
where:
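A is an upper or lower triangle of the symmetrical sparse matrix in the BSR format (3-array variation) with
zero-based indexing.
NOTE
This routine supports only zero-based indexing of the input arrays.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
m INTEGER. Number of block rows of the matrix A.
lb INTEGER. Size of the block in the matrix A.
a REAL for mkl_cspblas_sbsrsymv.
DOUBLE PRECISION for mkl_cspblas_dbsrsymv.
COMPLEX for mkl_cspblas_cbsrsymv.
DOUBLE COMPLEX for mkl_cspblas_zbsrsymv.
Array containing elements of non-zero blocks of the matrix A. Its length is
equal to the number of non-zero blocks in the matrix A multiplied by lb*lb.
Refer to values array description in BSR Format for more details.
ia INTEGER. Array of length (m + 1), containing indices of blocks in the array
a, such that ia(i) is the index in the array a of the first non-zero element
from the row i. The value of the last element ia(m + 1) is equal to the
number of non-zero blocks. Refer to rowIndex array description in BSR
Format for more details.
ja INTEGER. Array containing the column indices for each non-zero block in
the matrix A.
Its length is equal to the number of non-zero blocks of the matrix A. Refer
to columns array description in BSR Format for more details.
x REAL for mkl_cspblas_sbsrsymv.
DOUBLE PRECISION for mkl_cspblas_dbsrsymv.
COMPLEX for mkl_cspblas_cbsrsymv.
DOUBLE COMPLEX for mkl_cspblas_zbsrsymv.
Array, size (m*lb).
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_cspblas_sbsrsymv.
DOUBLE PRECISION for mkl_cspblas_dbsrsymv.
COMPLEX for mkl_cspblas_cbsrsymv.
DOUBLE COMPLEX for mkl_cspblas_zbsrsymv.
Array, size at least (m*lb).
On exit, the array y contains the vector y.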
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_sbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
INTEGER ia(*), ja(*)
DOUBLE PRECISION a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_cbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
INTEGER ia(*), ja(*)
COMPLEX a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_zbsrsymv(uplo, m, lb, a, ia, ja, x, y)
CHARACTER*1 uplo
INTEGER m, lb
INTEGER ia(*), ja(*)
DOUBLE COMPLEX a(*), x(*), y(*)
mkl_cspblas_?coosymv
Computes matrix - vector product of a sparse
symmetrical matrix stored in the coordinate format
with zero-based indexing.
Syntax
call mkl_cspblas_scoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_dcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_ccoosymv(uplo, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_zcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
Include Files
mkl.fi
Description
The mkl_cspblas_?coosymv routine performs a matrix-vector operation defined as
y := A*x
where:
A is an upper or lower triangle of the symmetrical sparse matrix in the coordinate format with zero-based
indexing.
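NOTE
This routine supports only zero-based indexing of the input arrays.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
m INTEGER. Number of rows of the matrix A.
val REAL for mkl_cspblas_scoosymv.
DOUBLE PRECISION for mkl_cspblas_dcoosymv.
COMPLEX for mkl_cspblas_ccoosymv.
DOUBLE COMPLEX for mkl_cspblas_zcoosymv.
Array of length nnz, contains non-zero elements of the matrix A in
arbitrary order.
Refer to values array description in Coordinate Format for more details.
rowind INTEGER. Array of length nnz, contains the row indices for each non-zero
element of the matrix A.
Refer to rows array description in Coordinate Format for more details.
colind INTEGER. Array of length nnz, contains the column indices for each
non-zero element of the matrix A. Refer to columns array description in
Coordinate Format for more details.
nnz INTEGER. Specifies the number of non-zero elements of the matrix A.
Refer to nnz description in Coordinate Format for more details.
x REAL for mkl_cspblas_scoosymv.
DOUBLE PRECISION for mkl_cspblas_dcoosymv.
COMPLEX for mkl_cspblas_ccoosymv.
DOUBLE COMPLEX for mkl_cspblas_zcoosymv.
Array, size is m.
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_cspblas_scoosymv.
DOUBLE PRECISION for mkl_cspblas_dcoosymv.
COMPLEX for mkl_cspblas_ccoosymv.
DOUBLE COMPLEX for mkl_cspblas_zcoosymv.
Array, size at least m.
On exit, the array y contains the vector y.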
Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
INTEGER rowind(*), colind(*)
REAL val(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
INTEGER rowind(*), colind(*)
DOUBLE PRECISION val(*), x(*), y(*)
SUBROUTINE mkl_cspblas_ccoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
INTEGER rowind(*), colind(*)
COMPLEX val(*), x(*), y(*)
SUBROUTINE mkl_cspblas_zcoosymv(uplo, m, val, rowind, colind, nnz, x, y)
CHARACTER*1 uplo
INTEGER m, nnz
INTEGER rowind(*), colind(*)
DOUBLE COMPLEX val(*), x(*), y(*)
mkl_cspblas_?csrtrsv
Triangular solvers with simplified interface for a sparse
matrix in the CSR format (3-array variation) with
zero-based indexing.
Syntax
call mkl_cspblas_scsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_cspblas_dcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_cspblas_ccsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
call mkl_cspblas_zcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_cspblas_?csrtrsv routine solves a system of linear equations with matrix-vector operations for a
sparse matrix stored in the CSR format (3-array variation) with zero-based indexing:
A*y = x
or
AT*y = x,
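where:
A is a sparse upper or lower triangular matrix with unit or non-unit main diagonal, AT is the transpose of A.
NOTE
This routine supports only zero-based indexing of the input arrays.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is
used.
If uplo = 'U' or 'u', then the upper triangle of the matrix A is used.
If uplo = 'L' or 'l', then the lower triangle of the matrix A is used.
transa CHARACTER*1. Specifies the system of linear equations.
If transa = 'N' or 'n', then A*y = x
If transa = 'T' or 't' or 'C' or 'c', then AT*y = x.
diag CHARACTER*1. Specifies whether matrix A is unit triangular.
If diag = 'U' or 'u', then A is unit triangular.
If diag = 'N' or 'n', then A is not unit triangular.
m INTEGER. Number of rows of the matrix A.
a REAL for mkl_cspblas_scsrtrsv.
DOUBLE PRECISION for mkl_cspblas_dcsrtrsv.
COMPLEX for mkl_cspblas_ccsrtrsv.
DOUBLE COMPLEX for mkl_cspblas_zcsrtrsv.
Array containing non-zero elements of the matrix A. Its length is equal to
the number of non-zero elements in the matrix A. Refer to values array
description in Sparse Matrix Storage Formats for more details.
NOTE
The non-zero elements of the given row of the matrix must be stored
in the same order as they appear in the row (from left to right).
No diagonal element can be omitted from a sparse storage if the solver
is called with the non-unit indicator.
ia INTEGER. Array of length m+1, containing indices of elements in the array a,
such that ia(i) is the index in the array a of the first non-zero element
from the row i. The value of the last element ia(m) is equal to the number
of non-zeros. Refer to rowIndex array description in Sparse Matrix Storage
Formats for more details.
ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to the length of the array a. Refer to columns array
description in Sparse Matrix Storage Formats for more details.
NOTE
Column indices must be sorted in increasing order for each row.
x REAL for mkl_cspblas_scsrtrsv.
DOUBLE PRECISION for mkl_cspblas_dcsrtrsv.
COMPLEX for mkl_cspblas_ccsrtrsv.
DOUBLE COMPLEX for mkl_cspblas_zcsrtrsv.
Array, size is m.
On entry, the array x must contain the vector x.
Output Parameters
y REAL for mkl_cspblas_scsrtrsv.
DOUBLE PRECISION for mkl_cspblas_dcsrtrsv.
COMPLEX for mkl_cspblas_ccsrtrsv.
DOUBLE COMPLEX for mkl_cspblas_zcsrtrsv.
Array, size at least m.
Contains the vector y.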
Interfaces
FORTRAN 77:
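SUBROUTINE mkl_cspblas_scsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m
INTEGER ia(*), ja(*)
REAL a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m
INTEGER ia(*), ja(*)
DOUBLE PRECISION a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_ccsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m
INTEGER ia(*), ja(*)
COMPLEX a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_zcsrtrsv(uplo, transa, diag, m, a, ia, ja, x, y)
CHARACTER*1 uplo, transa, diag
INTEGER m
INTEGER ia(*), ja(*)
DOUBLE COMPLEX a(*), x(*), y(*)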
mkl_cspblas_?bsrtrsv
Triangular solver with simplified interface for a sparse
matrix stored in the BSR format (3-array variation)
with zero-based indexing.
Syntax
call mkl_cspblas_sbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_cspblas_dbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_cspblas_cbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
call mkl_cspblas_zbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
Include Files
mkl.fi
Description
The mkl_cspblas_?bsrtrsv routine solves a system of linear equations with matrix-vector operations for a
sparse matrix stored in the BSR format (3-array variation) with zero-based indexing:
A*y = x
or
AT*y = x,
where:
NOTE
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangle of the matrix A is used.

y := A*x
diag CHARACTER*1. Specifies whether matrix A is unit triangular or not.

If diag = 'U' or 'u', A is unit triangular.
If diag = 'N' or 'n', A is not unit triangular.
a REAL for mkl_cspblas_sbsrtrsv.

DOUBLE PRECISION for mkl_cspblas_dbsrtrsv.
COMPLEX for mkl_cspblas_cbsrtrsv.
DOUBLE COMPLEX for mkl_cspblas_zbsrtrsv.
NOTE

from the row I. The value of the last element ia(m + 1) is equal to the
number of non-zero blocks. Refer to rowIndex array description in BSR
Format for more details.
the matrix A.
x REAL for mkl_cspblas_sbsrtrsv.

Array, size (m*lb).
Output Parameters
y REAL for mkl_cspblas_sbsrtrsv.

Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_sbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
INTEGER m, lb
REAL a(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)
INTEGER m, lb
SUBROUTINE mkl_cspblas_cbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)

INTEGER m, lb
SUBROUTINE mkl_cspblas_zbsrtrsv(uplo, transa, diag, m, lb, a, ia, ja, x, y)

INTEGER m, lb
mkl_cspblas_?cootrsv
Triangular solver with simplified interface for a sparse matrix in the coordinate format with zero-based indexing.
Syntax
call mkl_cspblas_scootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_dcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_ccootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
call mkl_cspblas_zcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
Include Files
mkl.fi
Description
The mkl_cspblas_?cootrsv routine solves a system of linear equations with matrix-vector operations for a
sparse matrix stored in the coordinate format with zero-based indexing:
A*y = x
or
AT*y = x,
where:
NOTE
Input Parameters
considered.


val REAL for mkl_cspblas_scootrsv.

DOUBLE PRECISION for mkl_cspblas_dcootrsv.
COMPLEX for mkl_cspblas_ccootrsv.
DOUBLE COMPLEX for mkl_cspblas_zcootrsv.
arbitrary order.

x REAL for mkl_cspblas_scootrsv.

Array, size is m.
Output Parameters
y REAL for mkl_cspblas_scootrsv.

Interfaces
FORTRAN 77:
SUBROUTINE mkl_cspblas_scootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)
INTEGER m, nnz
REAL val(*), x(*), y(*)
SUBROUTINE mkl_cspblas_dcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)

INTEGER m, nnz
SUBROUTINE mkl_cspblas_ccootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)

INTEGER m, nnz
SUBROUTINE mkl_cspblas_zcootrsv(uplo, transa, diag, m, val, rowind, colind, nnz, x, y)

INTEGER m, nnz
mkl_?csrmv
Computes matrix-vector product of a sparse matrix stored in the CSR format.
Syntax
call mkl_scsrmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
call mkl_dcsrmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
call mkl_ccsrmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
call mkl_zcsrmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
Include Files
mkl.fi
Description
The mkl_?csrmv routine performs a matrix-vector operation defined as
y := alpha*A*x + beta*y
or
y := alpha*AT*x + beta*y,
where:
A is an m-by-k sparse matrix in the CSR format, AT is the transpose of A.
NOTE
This routine supports a CSR format both with one-based indexing and zero-based indexing.
Input Parameters

If transa = 'N' or 'n', then y := alpha*A*x + beta*y
If transa = 'T' or 't' or 'C' or 'c', then y := alpha*AT*x + beta*y,
k INTEGER. Number of columns of the matrix A.
alpha REAL for mkl_scsrmv.

DOUBLE PRECISION for mkl_dcsrmv.
COMPLEX for mkl_ccsrmv.
DOUBLE COMPLEX for mkl_zcsrmv.
matdescra CHARACTER. Array of six elements, specifies properties of the matrix used
for operation. Only first four array elements are used, their possible values
are given in Table Possible Values of the Parameter matdescra (descra).
Possible combinations of element values of this parameter are given in
Table Possible Combinations of Element Values of the Parameter
matdescra.
val REAL for mkl_scsrmv.

Array containing non-zero elements of the matrix A.
For one-based indexing its length is pntre(m) - pntrb(1).
For zero-based indexing its length is pntre(m-1) - pntrb(0).
Refer to values array description in CSR Format for more details.
indx INTEGER. Array containing the column indices for each non-zero element of
the matrix A.
Its length is equal to length of the val array.
Refer to columns array description in CSR Format for more details.
pntrb INTEGER. Array of length m.

For one-based indexing this array contains row indices, such that pntrb(i)
- pntrb(1) + 1 is the first index of row i in the arrays val and indx.
For zero-based indexing this array contains row indices, such that
pntrb(i) - pntrb(0) is the first index of row i in the arrays val and
indx.
Refer to pointerB array description in CSR Format for more details.
pntre INTEGER. Array of length m.

For one-based indexing this array contains row indices, such that pntre(i)
- pntrb(1) is the last index of row i in the arrays val and indx.
For zero-based indexing this array contains row indices, such that
pntre(i) - pntrb(0) - 1 is the last index of row i in the arrays val and
indx.
Refer to pointerE array description in CSR Format for more details.
x REAL for mkl_scsrmv.

Array, size at least k if transa = 'N' or 'n' and at least m otherwise. On
entry, the array x must contain the vector x.
beta REAL for mkl_scsrmv.

y REAL for mkl_scsrmv.

Array, size at least m if transa = 'N' or 'n' and at least k otherwise. On
entry, the array y must contain the vector y.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrmv(transa, m, k, alpha, matdescra, val, indx,
pntrb, pntre, x, beta, y)
CHARACTER*1 transa
CHARACTER matdescra(*)
INTEGER m, k
INTEGER indx(*), pntrb(m), pntre(m)
REAL alpha, beta
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dcsrmv(transa, m, k, alpha, matdescra, val, indx,

CHARACTER*1 transa
INTEGER m, k
DOUBLE PRECISION alpha, beta
SUBROUTINE mkl_ccsrmv(transa, m, k, alpha, matdescra, val, indx,
CHARACTER*1 transa
INTEGER m, k
COMPLEX alpha, beta
SUBROUTINE mkl_zcsrmv(transa, m, k, alpha, matdescra, val, indx,

CHARACTER*1 transa
INTEGER m, k
DOUBLE COMPLEX alpha, beta
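As an illustration of the pntrb/pntre index arithmetic described above, the sketch below calls mkl_dcsrmv on a small one-based CSR matrix and compares the result with a plain-loop evaluation of y := alpha*A*x + beta*y. It assumes linking against Intel MKL with default (LP64) integers. The test matrix, the program name, and the array yref are hypothetical names introduced here; the matdescra values ('G' for a general matrix, 'F' for one-based indexing) are assumed from the matdescra table referenced in the parameter description, which is not reproduced in this section.

```fortran
program csrmv_sketch
  implicit none
  integer, parameter :: m = 3, k = 3, nnz = 5
  double precision :: val(nnz), x(k), y(m), yref(m), alpha, beta, s
  integer :: indx(nnz), pntrb(m), pntre(m), i, j
  character :: matdescra(6)
  external mkl_dcsrmv

  ! Test matrix (chosen for this sketch), one-based CSR:
  !     | 1 0 2 |
  ! A = | 0 3 0 |
  !     | 4 0 5 |
  val   = (/ 1.0d0, 2.0d0, 3.0d0, 4.0d0, 5.0d0 /)
  indx  = (/ 1, 3, 2, 1, 3 /)     ! column index of each stored value
  pntrb = (/ 1, 3, 4 /)           ! start of each row in val/indx
  pntre = (/ 3, 4, 6 /)           ! one past the end of each row
  x     = (/ 1.0d0, 2.0d0, 3.0d0 /)
  y     = 1.0d0
  alpha = 2.0d0
  beta  = 1.0d0
  ! Assumed meaning: general matrix, one-based indexing; elements 2-3 are
  ! not relevant for a general matrix, elements 5-6 are unused.
  matdescra = (/ 'G', 'L', 'N', 'F', ' ', ' ' /)

  ! Plain-loop reference using the index ranges quoted in the text:
  ! row i occupies positions pntrb(i)-pntrb(1)+1 .. pntre(i)-pntrb(1).
  do i = 1, m
     s = 0.0d0
     do j = pntrb(i) - pntrb(1) + 1, pntre(i) - pntrb(1)
        s = s + val(j) * x(indx(j))
     end do
     yref(i) = alpha * s + beta * y(i)
  end do

  call mkl_dcsrmv('N', m, k, alpha, matdescra, val, indx, pntrb, pntre, &
                  x, beta, y)

  print *, 'mkl_dcsrmv :', y
  print *, 'reference  :', yref
end program csrmv_sketch
```

For this data A*x = (7, 6, 19), so the reference loop yields yref = (15, 13, 39).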
mkl_?bsrmv
Computes matrix-vector product of a sparse matrix stored in the BSR format.
Syntax
call mkl_sbsrmv(transa, m, k, lb, alpha, matdescra, val, indx, pntrb, pntre, x, beta,
y)
call mkl_dbsrmv(transa, m, k, lb, alpha, matdescra, val, indx, pntrb, pntre, x, beta,
y)
call mkl_cbsrmv(transa, m, k, lb, alpha, matdescra, val, indx, pntrb, pntre, x, beta,
y)
call mkl_zbsrmv(transa, m, k, lb, alpha, matdescra, val, indx, pntrb, pntre, x, beta,
y)
Include Files
mkl.fi
Description
The mkl_?bsrmv routine performs a matrix-vector operation defined as
y := alpha*A*x + beta*y
or
y := alpha*AT*x + beta*y,
where:

A is an m-by-k block sparse matrix in the BSR format, AT is the transpose of A.
NOTE
This routine supports a BSR format both with one-based indexing and zero-based indexing.
Input Parameters

computed as y := alpha*AT*x + beta*y,
k INTEGER. Number of block columns of the matrix A.
alpha REAL for mkl_sbsrmv.

DOUBLE PRECISION for mkl_dbsrmv.
COMPLEX for mkl_cbsrmv.
DOUBLE COMPLEX for mkl_zbsrmv.
matdescra.
val REAL for mkl_sbsrmv.

indx INTEGER. Array containing the column indices for each non-zero block in
the matrix A.
Its length is equal to the number of non-zero blocks in the matrix A.
Refer to columns array description in BSR Format for more details.

For one-based indexing: this array contains row indices, such that
pntrb(i) - pntrb(1) + 1 is the first index of block row i in the array
indx.
For zero-based indexing: this array contains row indices, such that
pntrb(i) - pntrb(0) is the first index of block row i in the array indx.
Refer to pointerB array description in BSR Format for more details.

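pntre INTEGER. Array of length m.

For one-based indexing this array contains row indices, such that pntre(i)
- pntrb(1) is the last index of block row i in the array indx.
For zero-based indexing this array contains row indices, such that
pntre(i) - pntrb(0) - 1 is the last index of block row i in the array
indx.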
Refer to pointerE array description in BSR Format for more details.
x REAL for mkl_sbsrmv.

Array, size at least (k*lb) if transa = 'N' or 'n', and at least (m*lb)
otherwise. On entry, the array x must contain the vector x.
beta REAL for mkl_sbsrmv.

y REAL for mkl_sbsrmv.

Array, size at least (m*lb) if transa = 'N' or 'n', and at least (k*lb)
otherwise. On entry, the array y must contain the vector y.
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrmv(transa, m, k, lb, alpha, matdescra, val, indx,
CHARACTER*1 transa
INTEGER m, k, lb
REAL alpha, beta
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dbsrmv(transa, m, k, lb, alpha, matdescra, val, indx,

CHARACTER*1 transa
INTEGER m, k, lb
SUBROUTINE mkl_cbsrmv(transa, m, k, lb, alpha, matdescra, val, indx,

CHARACTER*1 transa
INTEGER m, k, lb
COMPLEX alpha, beta
SUBROUTINE mkl_zbsrmv(transa, m, k, lb, alpha, matdescra, val, indx,

CHARACTER*1 transa
INTEGER m, k, lb
mkl_?cscmv
Computes matrix-vector product for a sparse matrix in
the CSC format.
Syntax
call mkl_scscmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
call mkl_dcscmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
call mkl_ccscmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
call mkl_zcscmv(transa, m, k, alpha, matdescra, val, indx, pntrb, pntre, x, beta, y)
Include Files
mkl.fi
Description
The mkl_?cscmv routine performs a matrix-vector operation defined as
y := alpha*A*x + beta*y
or
y := alpha*AT*x + beta*y,
where:
A is an m-by-k sparse matrix in compressed sparse column (CSC) format, AT is the transpose of A.
NOTE
This routine supports CSC format both with one-based indexing and zero-based indexing.
Input Parameters

alpha REAL for mkl_scscmv.

DOUBLE PRECISION for mkl_dcscmv.
COMPLEX for mkl_ccscmv.
DOUBLE COMPLEX for mkl_zcscmv.
matdescra.
val REAL for mkl_scscmv.

For one-based indexing its length is pntre(k) - pntrb(1).
Refer to values array description in CSC Format for more details.
indx INTEGER. Array containing the row indices for each non-zero element of the
matrix A.
Refer to rows array description in CSC Format for more details.
pntrb INTEGER. Array of length k.

For one-based indexing this array contains column indices, such that
pntrb(i) - pntrb(1) + 1 is the first index of column i in the arrays val
and indx.
For zero-based indexing this array contains column indices, such that
pntrb(i) - pntrb(0) is the first index of column i in the arrays val and
indx.
Refer to pointerB array description in CSC Format for more details.
pntre INTEGER. Array of length k.

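For one-based indexing this array contains column indices, such that
pntre(i) - pntrb(1) is the last index of column i in the arrays val and
indx.
For zero-based indexing this array contains column indices, such that
pntre(i) - pntrb(0) - 1 is the last index of column i in the arrays val
and indx.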
Refer to pointerE array description in CSC Format for more details.
x REAL for mkl_scscmv.

beta REAL for mkl_scscmv.

y REAL for mkl_scscmv.

Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scscmv(transa, m, k, alpha, matdescra, val, indx,
CHARACTER*1 transa
INTEGER m, k
REAL alpha, beta
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dcscmv(transa, m, k, alpha, matdescra, val, indx,

CHARACTER*1 transa
SUBROUTINE mkl_ccscmv(transa, m, k, alpha, matdescra, val, indx,

CHARACTER*1 transa
COMPLEX alpha, beta
SUBROUTINE mkl_zcscmv(transa, m, k, alpha, matdescra, val, indx,

CHARACTER*1 transa
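The column-oriented storage can be illustrated with a small rectangular matrix. The sketch below calls mkl_dcscmv and also evaluates y := alpha*A*x + beta*y with a plain scatter loop over the columns. It assumes linking against Intel MKL with default (LP64) integers; the 2-by-3 test matrix, the program name, and yref are illustrative, and the matdescra values ('G' general, 'F' one-based) are assumed from the matdescra table referenced in this section.

```fortran
program cscmv_sketch
  implicit none
  integer, parameter :: m = 2, k = 3, nnz = 3
  double precision :: val(nnz), x(k), y(m), yref(m), alpha, beta
  integer :: indx(nnz), pntrb(k), pntre(k), j, p
  character :: matdescra(6)
  external mkl_dcscmv

  ! Test matrix (chosen for this sketch), one-based CSC:
  ! A = | 1 0 2 |
  !     | 0 3 0 |
  val   = (/ 1.0d0, 3.0d0, 2.0d0 /)   ! values stored column by column
  indx  = (/ 1, 2, 1 /)               ! row index of each stored value
  pntrb = (/ 1, 2, 3 /)               ! start of each column
  pntre = (/ 2, 3, 4 /)               ! one past the end of each column
  x     = (/ 1.0d0, 2.0d0, 3.0d0 /)   ! size k for transa = 'N'
  y     = 0.0d0                       ! size m for transa = 'N'
  alpha = 1.0d0
  beta  = 0.0d0
  matdescra = (/ 'G', 'L', 'N', 'F', ' ', ' ' /)   ! assumed: general, one-based

  ! Plain-loop reference: each column j scatters val(p)*x(j) into row indx(p).
  yref = beta * y
  do j = 1, k
     do p = pntrb(j) - pntrb(1) + 1, pntre(j) - pntrb(1)
        yref(indx(p)) = yref(indx(p)) + alpha * val(p) * x(j)
     end do
  end do

  call mkl_dcscmv('N', m, k, alpha, matdescra, val, indx, pntrb, pntre, &
                  x, beta, y)

  print *, 'mkl_dcscmv :', y
  print *, 'reference  :', yref
end program cscmv_sketch
```

Note that pntrb and pntre have length k (one entry per column), in contrast to the CSR routines where they have length m.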
mkl_?coomv
Computes matrix-vector product for a sparse matrix in the coordinate format.
Syntax
call mkl_scoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta, y)
call mkl_dcoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta, y)
call mkl_ccoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta, y)
call mkl_zcoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta, y)
Include Files
mkl.fi
Description
The mkl_?coomv routine performs a matrix-vector operation defined as
y := alpha*A*x + beta*y
or
y := alpha*AT*x + beta*y,
where:
A is an m-by-k sparse matrix in the coordinate format, AT is the transpose of A.
NOTE
This routine supports a coordinate format both with one-based indexing and zero-based indexing.
Input Parameters

alpha REAL for mkl_scoomv.

DOUBLE PRECISION for mkl_dcoomv.
COMPLEX for mkl_ccoomv.
DOUBLE COMPLEX for mkl_zcoomv.
matdescra.
val REAL for mkl_scoomv.

arbitrary order.
zero element of the matrix A.
Refer to columns array description in Coordinate Format for more details.

x REAL for mkl_scoomv.

beta REAL for mkl_scoomv.

y REAL for mkl_scoomv.

Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta, y)
CHARACTER*1 transa
INTEGER m, k, nnz
REAL alpha, beta
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dcoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta, y)

CHARACTER*1 transa
INTEGER m, k, nnz
SUBROUTINE mkl_ccoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta, y)
CHARACTER*1 transa
INTEGER m, k, nnz
COMPLEX alpha, beta
SUBROUTINE mkl_zcoomv(transa, m, k, alpha, matdescra, val, rowind, colind, nnz, x, beta, y)

CHARACTER*1 transa
INTEGER m, k, nnz
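A short sketch of the transposed product with coordinate storage follows. It calls mkl_dcoomv with transa = 'T' and checks the result against a plain loop over the stored entries. It assumes linking against Intel MKL with default (LP64) integers; the test matrix, the program name, and yref are illustrative. The sizes of x and y for the transposed case are assumed to follow the same convention stated for mkl_?csrmv (x of size m, y of size k), and the matdescra values ('G' general, 'F' one-based) are assumed from the referenced matdescra table.

```fortran
program coomv_sketch
  implicit none
  integer, parameter :: m = 2, k = 3, nnz = 3
  double precision :: val(nnz), x(m), y(k), yref(k), alpha, beta
  integer :: rowind(nnz), colind(nnz), p
  character :: matdescra(6)
  external mkl_dcoomv

  ! Test matrix (chosen for this sketch), one-based coordinate format,
  ! with the entries deliberately listed in arbitrary order:
  ! A = | 1 0 2 |
  !     | 0 3 0 |
  val    = (/ 3.0d0, 2.0d0, 1.0d0 /)
  rowind = (/ 2, 1, 1 /)
  colind = (/ 2, 3, 1 /)
  x      = (/ 1.0d0, 2.0d0 /)   ! size m because A is transposed
  y      = 0.0d0                ! size k because A is transposed
  alpha  = 1.0d0
  beta   = 0.0d0
  matdescra = (/ 'G', 'L', 'N', 'F', ' ', ' ' /)   ! assumed: general, one-based

  ! Plain-loop reference for y := alpha*AT*x + beta*y:
  ! entry (rowind(p), colind(p)) of A contributes to y(colind(p)).
  yref = beta * y
  do p = 1, nnz
     yref(colind(p)) = yref(colind(p)) + alpha * val(p) * x(rowind(p))
  end do

  call mkl_dcoomv('T', m, k, alpha, matdescra, val, rowind, colind, nnz, &
                  x, beta, y)

  print *, 'mkl_dcoomv :', y
  print *, 'reference  :', yref
end program coomv_sketch
```

For this data AT*x = (1, 6, 2).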
mkl_?csrsv
Solves a system of linear equations for a sparse
matrix in the CSR format.
Syntax
call mkl_scsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_dcsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_ccsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_zcsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
Include Files
mkl.fi
Description
The mkl_?csrsv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix in the CSR format:
y := alpha*inv(A)*x
or
y := alpha*inv(AT)*x,
where:
alpha is scalar, x and y are vectors, A is a sparse upper or lower triangular matrix with unit or non-unit main
diagonal, AT is the transpose of A.
NOTE
Input Parameters

If transa = 'N' or 'n', then y := alpha*inv(A)*x
If transa = 'T' or 't' or 'C' or 'c', then y := alpha*inv(AT)*x,
m INTEGER. Number of columns of the matrix A.
alpha REAL for mkl_scsrsv.

DOUBLE PRECISION for mkl_dcsrsv.
COMPLEX for mkl_ccsrsv.
DOUBLE COMPLEX for mkl_zcsrsv.
matdescra.
val REAL for mkl_scsrsv.

NOTE
the matrix A.
NOTE

indx.

pntre(i) - pntrb(0) - 1 is the last index of row i in the arrays val and
indx.
x REAL for mkl_scsrsv.

On entry, the array x must contain the vector x. The elements are accessed
with unit increment.
y REAL for mkl_scsrsv.

On entry, the array y must contain the vector y. The elements are accessed
with unit increment.
Output Parameters
y Contains the solution vector, alpha*inv(A)*x or alpha*inv(AT)*x.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
CHARACTER*1 transa
INTEGER m
REAL alpha
REAL val(*)
REAL x(*), y(*)
SUBROUTINE mkl_dcsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)

CHARACTER*1 transa
INTEGER m
DOUBLE PRECISION alpha
DOUBLE PRECISION val(*)
DOUBLE PRECISION x(*), y(*)
SUBROUTINE mkl_ccsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)

CHARACTER*1 transa
INTEGER m
COMPLEX alpha
COMPLEX val(*)
COMPLEX x(*), y(*)
SUBROUTINE mkl_zcsrsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)

CHARACTER*1 transa
INTEGER m
DOUBLE COMPLEX alpha
DOUBLE COMPLEX val(*)
DOUBLE COMPLEX x(*), y(*)
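The triangular solve y := alpha*inv(A)*x can be sketched as follows. The program calls mkl_dcsrsv on a small lower triangular matrix in one-based CSR storage and then verifies A*y = alpha*x with plain loops. It assumes linking against Intel MKL with default (LP64) integers. The test matrix and program name are illustrative; the matdescra values ('T' triangular, 'L' lower, 'N' non-unit diagonal, 'F' one-based) are assumed from the matdescra table referenced in this section, and column indices are kept sorted within each row as a precaution.

```fortran
program csrsv_sketch
  implicit none
  integer, parameter :: m = 3, nnz = 5
  double precision :: val(nnz), x(m), y(m), alpha, s, resid
  integer :: indx(nnz), pntrb(m), pntre(m), i, j
  character :: matdescra(6)
  external mkl_dcsrsv

  ! Lower triangular test matrix (chosen for this sketch), one-based CSR:
  !     | 2 0 0 |
  ! A = | 1 3 0 |
  !     | 0 4 5 |
  val   = (/ 2.0d0, 1.0d0, 3.0d0, 4.0d0, 5.0d0 /)
  indx  = (/ 1, 1, 2, 2, 3 /)
  pntrb = (/ 1, 2, 4 /)
  pntre = (/ 2, 4, 6 /)
  x     = (/ 2.0d0, 7.0d0, 23.0d0 /)
  alpha = 2.0d0
  ! Assumed meaning: triangular, lower, non-unit diagonal, one-based.
  matdescra = (/ 'T', 'L', 'N', 'F', ' ', ' ' /)

  call mkl_dcsrsv('N', m, alpha, matdescra, val, indx, pntrb, pntre, x, y)

  ! Independent check with plain loops: largest entry of |A*y - alpha*x|.
  resid = 0.0d0
  do i = 1, m
     s = -alpha * x(i)
     do j = pntrb(i) - pntrb(1) + 1, pntre(i) - pntrb(1)
        s = s + val(j) * y(indx(j))
     end do
     resid = max(resid, abs(s))
  end do

  print *, 'y                   =', y
  print *, 'max |A*y - alpha*x| =', resid
end program csrsv_sketch
```

Since inv(A)*x = (1, 2, 3) for this right-hand side, the exact result with alpha = 2 is y = (2, 4, 6).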
mkl_?bsrsv
Solves a system of linear equations for a sparse matrix in the BSR format.
Syntax
call mkl_sbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_dbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_cbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_zbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)
Include Files
mkl.fi
Description
The mkl_?bsrsv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix in the BSR format:
y := alpha*inv(A)*x
or
y := alpha*inv(AT)*x,
where:
NOTE
Input Parameters

If transa = 'T' or 't' or 'C' or 'c', then y := alpha*inv(AT)*x,
m INTEGER. Number of block columns of the matrix A.
alpha REAL for mkl_sbsrsv.

DOUBLE PRECISION for mkl_dbsrsv.
COMPLEX for mkl_cbsrsv.
DOUBLE COMPLEX for mkl_zbsrsv.
matdescra.
val REAL for mkl_sbsrsv.

Refer to the values array description in BSR Format for more details.
NOTE
the matrix A.
Refer to the columns array description in BSR Format for more details.

indx.
pntrb(i) - pntrb(0) is the first index of block row i in the array indx.

indx.
x REAL for mkl_sbsrsv.

y REAL for mkl_sbsrsv.

Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)
CHARACTER*1 transa
INTEGER m, lb
REAL alpha
REAL val(*)
REAL x(*), y(*)
SUBROUTINE mkl_dbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)

CHARACTER*1 transa
INTEGER m, lb
SUBROUTINE mkl_cbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)

CHARACTER*1 transa
INTEGER m, lb
COMPLEX alpha
COMPLEX val(*)
COMPLEX x(*), y(*)
SUBROUTINE mkl_zbsrsv(transa, m, lb, alpha, matdescra, val, indx, pntrb, pntre, x, y)

CHARACTER*1 transa
INTEGER m, lb
mkl_?cscsv
Solves a system of linear equations for a sparse matrix in the CSC format.
Syntax
call mkl_scscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_dcscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_ccscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
call mkl_zcscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
Include Files
mkl.fi
Description
The mkl_?cscsv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix in the CSC format:
y := alpha*inv(A)*x
or
y := alpha*inv(AT)*x,
where:
NOTE
This routine supports a CSC format both with one-based indexing and zero-based indexing.
Input Parameters

If transa = 'T' or 't' or 'C' or 'c', then y := alpha*inv(AT)*x,
alpha REAL for mkl_scscsv.

DOUBLE PRECISION for mkl_dcscsv.
COMPLEX for mkl_ccscsv.
DOUBLE COMPLEX for mkl_zcscsv.
matdescra.
val REAL for mkl_scscsv.

NOTE
matrix A.
Refer to columns array description in CSC Format for more details.
NOTE
Row indices must be sorted in increasing order for each column.

and indx.
indx.

indx.
and indx.
x REAL for mkl_scscsv.

y REAL for mkl_scscsv.

Output Parameters
y Contains the solution vector, alpha*inv(A)*x or alpha*inv(AT)*x.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)
CHARACTER*1 transa
INTEGER m
REAL alpha
REAL val(*)
REAL x(*), y(*)
SUBROUTINE mkl_dcscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)

CHARACTER*1 transa
INTEGER m
SUBROUTINE mkl_ccscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)

CHARACTER*1 transa
INTEGER m
COMPLEX alpha
COMPLEX val(*)
COMPLEX x(*), y(*)
SUBROUTINE mkl_zcscsv(transa, m, alpha, matdescra, val, indx, pntrb, pntre, x, y)

CHARACTER*1 transa
INTEGER m
mkl_?coosv
Solves a system of linear equations for a sparse matrix in the coordinate format.
Syntax
call mkl_scoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)
call mkl_dcoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)
call mkl_ccoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)
call mkl_zcoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)
Include Files
mkl.fi
Description
The mkl_?coosv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix in the coordinate format:
y := alpha*inv(A)*x
or
y := alpha*inv(AT)*x,
where:
NOTE
Input Parameters

alpha REAL for mkl_scoosv.

DOUBLE PRECISION for mkl_dcoosv.
COMPLEX for mkl_ccoosv.
DOUBLE COMPLEX for mkl_zcoosv.
matdescra.
val REAL for mkl_scoosv.

arbitrary order.

x REAL for mkl_scoosv.

y REAL for mkl_scoosv.

Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)
CHARACTER*1 transa
INTEGER m, nnz
REAL alpha
REAL val(*)
REAL x(*), y(*)
SUBROUTINE mkl_dcoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)

CHARACTER*1 transa
INTEGER m, nnz
SUBROUTINE mkl_ccoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)

CHARACTER*1 transa
INTEGER m, nnz
COMPLEX alpha
COMPLEX val(*)
COMPLEX x(*), y(*)
SUBROUTINE mkl_zcoosv(transa, m, alpha, matdescra, val, rowind, colind, nnz, x, y)

CHARACTER*1 transa
INTEGER m, nnz
mkl_?csrmm
Computes matrix-matrix product of a sparse matrix stored in the CSR format.
Syntax
call mkl_scsrmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
beta, c, ldc)
call mkl_dcsrmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
beta, c, ldc)
call mkl_ccsrmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
beta, c, ldc)
call mkl_zcsrmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
beta, c, ldc)
Include Files
mkl.fi
Description
The mkl_?csrmm routine performs a matrix-matrix operation defined as
C := alpha*A*B + beta*C
or
C := alpha*AT*B + beta*C
or
C := alpha*AH*B + beta*C,
where:
B and C are dense matrices, A is an m-by-k sparse matrix in compressed sparse row (CSR) format, AT is the
transpose of A, and AH is the conjugate transpose of A.
NOTE
Input Parameters

If transa = 'N' or 'n', then C := alpha*A*B + beta*C,
If transa = 'T' or 't', then C := alpha*AT*B + beta*C,
If transa = 'C' or 'c', then C := alpha*AH*B + beta*C.
n INTEGER. Number of columns of the matrix C.
alpha REAL for mkl_scsrmm.

DOUBLE PRECISION for mkl_dcsrmm.
COMPLEX for mkl_ccsrmm.
DOUBLE COMPLEX for mkl_zcsrmm.
matdescra.
val REAL for mkl_scsrmm.

For zero-based indexing its length is pntre(m-1) - pntrb(0).
the matrix A.

For one-based indexing this array contains row indices, such that pntrb(I)
- pntrb(1) + 1 is the first index of row I in the arrays val and indx.
For zero-based indexing this array contains row indices, such that
pntrb(I) - pntrb(0) is the first index of row I in the arrays val and
indx.

For one-based indexing this array contains row indices, such that pntre(I)
- pntrb(1) is the last index of row I in the arrays val and indx.
For zero-based indexing this array contains row indices, such that
pntre(I) - pntrb(0) - 1 is the last index of row I in the arrays val and
indx.
b REAL for mkl_scsrmm.

Array, size ldb by at least n for non-transposed matrix A and at least m for
transposed for one-based indexing, and (at least k for non-transposed
matrix A and at least m for transposed, ldb) for zero-based indexing.
On entry with transa='N' or 'n', the leading k-by-n part of the array b
must contain the matrix B, otherwise the leading m-by-n part of the array b
must contain the matrix B.
ldb INTEGER. Specifies the leading dimension of b for one-based indexing, and
the second dimension of b for zero-based indexing, as declared in the
calling (sub)program.
beta REAL for mkl_scsrmm.

c REAL for mkl_scsrmm.

Array, size ldc by n for one-based indexing, and (m, ldc) for zero-based
indexing.
On entry, the leading m-by-n part of the array c must contain the matrix C,
otherwise the leading k-by-n part of the array c must contain the matrix C.
ldc INTEGER. Specifies the leading dimension of c for one-based indexing, and
the second dimension of c for zero-based indexing, as declared in the
calling (sub)program.
Output Parameters
c Overwritten by the matrix (alpha*A*B + beta*C), (alpha*AT*B +
beta*C), or (alpha*AH*B + beta*C).
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrmm(transa, m, n, k, alpha, matdescra, val, indx,
pntrb, pntre, b, ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER m, n, k, ldb, ldc
REAL alpha, beta
REAL val(*), b(ldb,*), c(ldc,*)

CHARACTER*1 transa
DOUBLE PRECISION val(*), b(ldb,*), c(ldc,*)

CHARACTER*1 transa
COMPLEX alpha, beta
COMPLEX val(*), b(ldb,*), c(ldc,*)

CHARACTER*1 transa
DOUBLE COMPLEX val(*), b(ldb,*), c(ldc,*)
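A minimal sketch of the sparse-times-dense product C := alpha*A*B + beta*C with one-based indexing follows. It calls mkl_dcsrmm and compares the result with a plain triple loop. It assumes linking against Intel MKL with default (LP64) integers; the test matrices, the program name, and cref are illustrative, and the matdescra values ('G' general, 'F' one-based) are assumed from the referenced matdescra table. With one-based indexing b and c are ordinary column-major Fortran arrays with leading dimensions ldb and ldc.

```fortran
program csrmm_sketch
  implicit none
  integer, parameter :: m = 3, n = 2, k = 3, nnz = 5
  integer, parameter :: ldb = k, ldc = m
  double precision :: val(nnz), b(ldb,n), c(ldc,n), cref(m,n), alpha, beta
  integer :: indx(nnz), pntrb(m), pntre(m), i, j, p
  character :: matdescra(6)
  external mkl_dcsrmm

  ! Test matrix (chosen for this sketch), one-based CSR:
  !     | 1 0 2 |
  ! A = | 0 3 0 |
  !     | 4 0 5 |
  val   = (/ 1.0d0, 2.0d0, 3.0d0, 4.0d0, 5.0d0 /)
  indx  = (/ 1, 3, 2, 1, 3 /)
  pntrb = (/ 1, 3, 4 /)
  pntre = (/ 3, 4, 6 /)
  ! Dense k-by-n matrix B, stored column by column.
  b = reshape((/ 1.0d0, 0.0d0, 1.0d0,   0.0d0, 1.0d0, 1.0d0 /), (/ ldb, n /))
  c = 0.0d0
  alpha = 1.0d0
  beta  = 0.0d0
  matdescra = (/ 'G', 'L', 'N', 'F', ' ', ' ' /)   ! assumed: general, one-based

  ! Plain-loop reference: row i of A times column j of B.
  do j = 1, n
     do i = 1, m
        cref(i,j) = beta * c(i,j)
        do p = pntrb(i) - pntrb(1) + 1, pntre(i) - pntrb(1)
           cref(i,j) = cref(i,j) + alpha * val(p) * b(indx(p), j)
        end do
     end do
  end do

  call mkl_dcsrmm('N', m, n, k, alpha, matdescra, val, indx, pntrb, pntre, &
                  b, ldb, beta, c, ldc)

  print *, 'max |C - Cref| =', maxval(abs(c - cref))
end program csrmm_sketch
```

For this data the columns of A*B are (3, 0, 9) and (2, 3, 5).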
mkl_?bsrmm
Computes matrix-matrix product of a sparse matrix stored in the BSR format.
Syntax
call mkl_sbsrmm(transa, m, n, k, lb, alpha, matdescra, val, indx, pntrb, pntre, b,
ldb, beta, c, ldc)
call mkl_dbsrmm(transa, m, n, k, lb, alpha, matdescra, val, indx, pntrb, pntre, b,
ldb, beta, c, ldc)
call mkl_cbsrmm(transa, m, n, k, lb, alpha, matdescra, val, indx, pntrb, pntre, b,
ldb, beta, c, ldc)
call mkl_zbsrmm(transa, m, n, k, lb, alpha, matdescra, val, indx, pntrb, pntre, b,
ldb, beta, c, ldc)
Include Files
mkl.fi
Description
The mkl_?bsrmm routine performs a matrix-matrix operation defined as
C := alpha*A*B + beta*C
or
C := alpha*AT*B + beta*C
or
C := alpha*AH*B + beta*C,
where:
B and C are dense matrices, A is an m-by-k sparse matrix in block sparse row (BSR) format, AT is the
transpose of A, and AH is the conjugate transpose of A.
NOTE
Input Parameters

If transa = 'N' or 'n', then the matrix-matrix product is computed as
C := alpha*A*B + beta*C
If transa = 'T' or 't', then the matrix-matrix product is computed as
C := alpha*AT*B + beta*C
If transa = 'C' or 'c', then the matrix-matrix product is computed as
C := alpha*AH*B + beta*C,

k INTEGER. Number of block columns of the matrix A.
alpha REAL for mkl_sbsrmm.

DOUBLE PRECISION for mkl_dbsrmm.
COMPLEX for mkl_cbsrmm.
DOUBLE COMPLEX for mkl_zbsrmm.
matdescra.
val REAL for mkl_sbsrmm.

the matrix A.
Its length is equal to the number of non-zero blocks in the matrix A. Refer
to the columns array description in BSR Format for more details.

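For one-based indexing this array contains row indices, such that
pntrb(I) - pntrb(1) + 1 is the first index of block row I in the array
indx.
For zero-based indexing this array contains row indices, such that
pntrb(I) - pntrb(0) is the first index of block row I in the array indx.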

For one-based indexing this array contains row indices, such that pntre(I)
- pntrb(1) is the last index of block row I in the array indx.
For zero-based indexing this array contains row indices, such that
pntre(I) - pntrb(0) - 1 is the last index of block row I in the array
indx.
b REAL for mkl_sbsrmm.

On entry with transa='N' or 'n', the leading n-by-k block part of the
array b must contain the matrix B, otherwise the leading m-by-n block part
of the array b must contain the matrix B.
ldb INTEGER. Specifies the leading dimension (in blocks) of b as declared in the
calling (sub)program.
beta REAL for mkl_sbsrmm.

c REAL for mkl_sbsrmm.

Array, size (ldc, n) for one-based indexing, size (k, ldc) for zero-based
indexing.
On entry, the leading m-by-n block part of the array c must contain the
matrix C, otherwise the leading n-by-k block part of the array c must
contain the matrix C.
ldc INTEGER. Specifies the leading dimension (in blocks) of c as declared in the
calling (sub)program.
Output Parameters
c Overwritten by the matrix (alpha*A*B + beta*C) or (alpha*AT*B +
beta*C) or (alpha*AH*B + beta*C).
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrmm(transa, m, n, k, lb, alpha, matdescra, val,
indx, pntrb, pntre, b, ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER m, n, k, lb, ldb, ldc
REAL alpha, beta
SUBROUTINE mkl_dbsrmm(transa, m, n, k, lb, alpha, matdescra, val,

CHARACTER*1 transa
SUBROUTINE mkl_cbsrmm(transa, m, n, k, lb, alpha, matdescra, val,

CHARACTER*1 transa
COMPLEX alpha, beta
SUBROUTINE mkl_zbsrmm(transa, m, n, k, lb, alpha, matdescra, val,

CHARACTER*1 transa
mkl_?cscmm
Computes matrix-matrix product of a sparse matrix
stored in the CSC format.
Syntax
call mkl_scscmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
beta, c, ldc)
call mkl_dcscmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
beta, c, ldc)
call mkl_ccscmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
beta, c, ldc)
call mkl_zcscmm(transa, m, n, k, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
beta, c, ldc)
Include Files
mkl.fi
Description
The mkl_?cscmm routine performs a matrix-matrix operation defined as
C := alpha*A*B + beta*C
or
C := alpha*AT*B + beta*C
or
C := alpha*AH*B + beta*C,
where:
B and C are dense matrices, A is an m-by-k sparse matrix in compressed sparse column (CSC) format, AT is
the transpose of A, and AH is the conjugate transpose of A.
NOTE
This routine supports CSC format both with one-based indexing and zero-based indexing.
Input Parameters

If transa = 'N' or 'n', then C := alpha*A*B + beta*C
If transa = 'T' or 't', then C := alpha*AT*B + beta*C
If transa = 'C' or 'c', then C := alpha*AH*B + beta*C
alpha REAL for mkl_scscmm.

DOUBLE PRECISION for mkl_dcscmm.
COMPLEX for mkl_ccscmm.
DOUBLE COMPLEX for mkl_zcscmm.
matdescra.
val REAL for mkl_scscmm.

For one-based indexing its length is pntre(k) - pntrb(1).
matrix A.
pntrb INTEGER. Array of length k.

and indx.
indx.
pntre INTEGER. Array of length k.

indx.
and indx.
b REAL for mkl_scscmm.

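On entry with transa = 'N' or 'n', the leading k-by-n part of the array b
must contain the matrix B, otherwise the leading m-by-n part of the array b
must contain the matrix B.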
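beta REAL for mkl_scscmm.
DOUBLE PRECISION for mkl_dcscmm.
COMPLEX for mkl_ccscmm.
DOUBLE COMPLEX for mkl_zcscmm.
Specifies the scalar beta.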
c REAL for mkl_scscmm.

indexing.
Output Parameters
c Overwritten by the matrix (alpha*A*B + beta*C) or (alpha*AT*B +
beta*C) or (alpha*AH*B + beta*C).
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scscmm(transa, m, n, k, alpha, matdescra, val, indx,
CHARACTER*1 transa
INTEGER indx(*), pntrb(k), pntre(k)
REAL alpha, beta
SUBROUTINE mkl_dcscmm(transa, m, n, k, alpha, matdescra, val, indx,

CHARACTER*1 transa
SUBROUTINE mkl_ccscmm(transa, m, n, k, alpha, matdescra, val, indx,

CHARACTER*1 transa
COMPLEX alpha, beta
SUBROUTINE mkl_zcscmm(transa, m, n, k, alpha, matdescra, val, indx,

CHARACTER*1 transa
mkl_?coomm
Computes matrix-matrix product of a sparse matrix stored in the coordinate format.
Syntax
call mkl_scoomm(transa, m, n, k, alpha, matdescra, val, rowind, colind, nnz, b, ldb,
beta, c, ldc)
call mkl_dcoomm(transa, m, n, k, alpha, matdescra, val, rowind, colind, nnz, b, ldb,
beta, c, ldc)
call mkl_ccoomm(transa, m, n, k, alpha, matdescra, val, rowind, colind, nnz, b, ldb,
beta, c, ldc)
call mkl_zcoomm(transa, m, n, k, alpha, matdescra, val, rowind, colind, nnz, b, ldb,
beta, c, ldc)
Include Files
mkl.fi
Description
The mkl_?coomm routine performs a matrix-matrix operation defined as
C := alpha*A*B + beta*C
or
C := alpha*AT*B + beta*C
or
C := alpha*AH*B + beta*C,
where:
B and C are dense matrices, A is an m-by-k sparse matrix in the coordinate format, AT is the transpose of A,
and AH is the conjugate transpose of A.
NOTE
Input Parameters

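If transa = 'N' or 'n', then C := alpha*A*B + beta*C
If transa = 'T' or 't', then C := alpha*AT*B + beta*C
If transa = 'C' or 'c', then C := alpha*AH*B + beta*C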
alpha REAL for mkl_scoomm.

DOUBLE PRECISION for mkl_dcoomm.
COMPLEX for mkl_ccoomm.
DOUBLE COMPLEX for mkl_zcoomm.
matdescra.
val REAL for mkl_scoomm.

arbitrary order.

b REAL for mkl_scoomm.

beta REAL for mkl_scoomm.

c REAL for mkl_scoomm.

indexing.
Output Parameters
c Overwritten by the matrix (alpha*A*B + beta*C), (alpha*AT*B +
beta*C), or (alpha*AH*B + beta*C).

Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoomm(transa, m, n, k, alpha, matdescra, val,
rowind, colind, nnz, b, ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER m, n, k, ldb, ldc, nnz
REAL alpha, beta
SUBROUTINE mkl_dcoomm(transa, m, n, k, alpha, matdescra, val,

CHARACTER*1 transa
SUBROUTINE mkl_ccoomm(transa, m, n, k, alpha, matdescra, val,

CHARACTER*1 transa
COMPLEX alpha, beta
SUBROUTINE mkl_zcoomm(transa, m, n, k, alpha, matdescra, val,

CHARACTER*1 transa
mkl_?csrsm
Solves a system of linear matrix equations for a
sparse matrix in the CSR format.
Syntax
call mkl_scsrsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c,
ldc)
call mkl_dcsrsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c,
ldc)
call mkl_ccsrsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c,
ldc)
call mkl_zcsrsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c,
ldc)
Include Files
mkl.fi
Description
The mkl_?csrsm routine solves a system of linear equations with matrix-matrix operations for a sparse
matrix in the CSR format:
C := alpha*inv(A)*B
or
C := alpha*inv(AT)*B,
where:
alpha is scalar, B and C are dense matrices, A is a sparse upper or lower triangular matrix with unit or non-
unit main diagonal, AT is the transpose of A.
NOTE
Input Parameters

If transa = 'N' or 'n', then C := alpha*inv(A)*B
If transa = 'T' or 't' or 'C' or 'c', then C := alpha*inv(AT)*B,
alpha REAL for mkl_scsrsm.

DOUBLE PRECISION for mkl_dcsrsm.
COMPLEX for mkl_ccsrsm.
DOUBLE COMPLEX for mkl_zcsrsm.
matdescra.
val REAL for mkl_scsrsm.


NOTE
the matrix A.
NOTE

indx.

pntre(i) - pntrb(0) - 1 is the last index of row i in the arrays val and
indx.
b REAL for mkl_scsrsm.

Array, size (ldb, n) for one-based indexing, and (m, ldb) for zero-based
indexing.
On entry the leading m-by-n part of the array b must contain the matrix B.
Output Parameters
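c REAL for mkl_scsrsm.
DOUBLE PRECISION for mkl_dcsrsm.
COMPLEX for mkl_ccsrsm.
DOUBLE COMPLEX for mkl_zcsrsm.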
indexing.
The leading m-by-n part of the array c contains the output matrix C.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrsm(transa, m, n, alpha, matdescra, val, indx,
pntrb, pntre, b, ldb, c, ldc)
CHARACTER*1 transa
INTEGER m, n, ldb, ldc
REAL alpha
SUBROUTINE mkl_dcsrsm(transa, m, n, alpha, matdescra, val, indx,

CHARACTER*1 transa
SUBROUTINE mkl_ccsrsm(transa, m, n, alpha, matdescra, val, indx,

CHARACTER*1 transa
COMPLEX alpha
SUBROUTINE mkl_zcsrsm(transa, m, n, alpha, matdescra, val, indx,

CHARACTER*1 transa
mkl_?cscsm
Solves a system of linear matrix equations for a
sparse matrix in the CSC format.
Syntax
call mkl_scscsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c,
ldc)
call mkl_dcscsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c,
ldc)
call mkl_ccscsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c,
ldc)
call mkl_zcscsm(transa, m, n, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c,
ldc)
Include Files
mkl.fi
Description
The mkl_?cscsm routine solves a system of linear equations with matrix-matrix operations for a sparse
matrix in the CSC format:
C := alpha*inv(A)*B
or
C := alpha*inv(AT)*B,
where:
NOTE
This routine supports a CSC format both with one-based indexing and zero-based indexing.
Input Parameters
transa CHARACTER*1. Specifies the system of equations.

If transa = 'N' or 'n', then C := alpha*inv(A)*B
alpha REAL for mkl_scscsm.

DOUBLE PRECISION for mkl_dcscsm.
COMPLEX for mkl_ccscsm.
DOUBLE COMPLEX for mkl_zcscsm.
matdescra.
val REAL for mkl_scscsm.

For one-based indexing its length is pntre(m) - pntrb(1).
NOTE

matrix A. Its length is equal to length of the val array.
NOTE
Row indices must be sorted in increasing order for each column.

pntrb(I) - pntrb(1) + 1 is the first index of column I in the arrays val
and indx.
pntrb(I) - pntrb(0) is the first index of column I in the arrays val and
indx.

pntre(I) - pntrb(1) is the last index of column I in the arrays val and
indx.
pntre(I) - pntrb(0) - 1 is the last index of column I in the arrays val
and indx.
b REAL for mkl_scscsm.

Array, size ldb by n for one-based indexing, and (m, ldb) for zero-based
indexing.
Output Parameters
c REAL for mkl_scscsm.

indexing.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scscsm(transa, m, n, alpha, matdescra, val, indx,
CHARACTER*1 transa
REAL alpha
SUBROUTINE mkl_dcscsm(transa, m, n, alpha, matdescra, val, indx,

CHARACTER*1 transa
SUBROUTINE mkl_ccscsm(transa, m, n, alpha, matdescra, val, indx,

CHARACTER*1 transa
COMPLEX alpha
SUBROUTINE mkl_zcscsm(transa, m, n, alpha, matdescra, val, indx,

CHARACTER*1 transa
mkl_?coosm
Solves a system of linear matrix equations for a
sparse matrix in the coordinate format.
Syntax
call mkl_scoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c,
ldc)
call mkl_dcoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c,
ldc)
call mkl_ccoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c,
ldc)
call mkl_zcoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c,
ldc)
Include Files
mkl.fi
Description
The mkl_?coosm routine solves a system of linear equations with matrix-matrix operations for a sparse
matrix in the coordinate format:
C := alpha*inv(A)*B
or
C := alpha*inv(AT)*B,
where:
NOTE
Input Parameters
C := alpha*inv(A)*B
computed as C := alpha*inv(AT)*B,
alpha REAL for mkl_scoosm.

DOUBLE PRECISION for mkl_dcoosm.
COMPLEX for mkl_ccoosm.
DOUBLE COMPLEX for mkl_zcoosm.
matdescra.
val REAL for mkl_scoosm.

arbitrary order.

b REAL for mkl_scoosm.

Array, size ldb by n for one-based indexing, and (m, ldb) for zero-based
indexing.
Before entry the leading m-by-n part of the array b must contain the matrix
B.
Output Parameters
c REAL for mkl_scoosm.

indexing.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c, ldc)
CHARACTER*1 transa
INTEGER m, n, ldb, ldc, nnz
REAL alpha
SUBROUTINE mkl_dcoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c, ldc)
CHARACTER*1 transa
SUBROUTINE mkl_ccoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c, ldc)
CHARACTER*1 transa
COMPLEX alpha
SUBROUTINE mkl_zcoosm(transa, m, n, alpha, matdescra, val, rowind, colind, nnz, b, ldb, c, ldc)
CHARACTER*1 transa
mkl_?bsrsm
Solves a system of linear matrix equations for a
sparse matrix in the BSR format.
Syntax
call mkl_sbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
c, ldc)
call mkl_dbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
c, ldc)
call mkl_cbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
c, ldc)
call mkl_zbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb,
c, ldc)
Include Files
mkl.fi
Description
The mkl_?bsrsm routine solves a system of linear equations with matrix-matrix operations for a sparse
matrix in the BSR format:
C := alpha*inv(A)*B
or
C := alpha*inv(AT)*B,
where:
NOTE
Input Parameters

C := alpha*inv(A)*B.
computed as C := alpha*inv(AT)*B.
m INTEGER. Number of block columns of the matrix A.
alpha REAL for mkl_sbsrsm.

DOUBLE PRECISION for mkl_dbsrsm.
COMPLEX for mkl_cbsrsm.
DOUBLE COMPLEX for mkl_zbsrsm.
matdescra.
val REAL for mkl_sbsrsm.

NOTE
the matrix A.
Refer to the columns array description in BSR Format for more details.

indx.
pntrb(i) - pntrb(0) is the first index of block row i in the array indx.

indx.
b REAL for mkl_sbsrsm.

Array, size (ldb, n) for one-based indexing, size (m, ldb) for zero-based
indexing.
ldb INTEGER. Specifies the leading dimension (in blocks) of b as declared in the
ldc INTEGER. Specifies the leading dimension (in blocks) of c as declared in the
Output Parameters
c REAL for mkl_sbsrsm.

Array, size (ldc, n) for one-based indexing, size (m, ldc) for zero-based
indexing.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c, ldc)
CHARACTER*1 transa
INTEGER m, n, lb, ldb, ldc
REAL alpha
SUBROUTINE mkl_dbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c, ldc)
CHARACTER*1 transa
SUBROUTINE mkl_cbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c, ldc)
CHARACTER*1 transa
COMPLEX alpha
SUBROUTINE mkl_zbsrsm(transa, m, n, lb, alpha, matdescra, val, indx, pntrb, pntre, b, ldb, c, ldc)
CHARACTER*1 transa
mkl_?diamv
Computes matrix-vector product for a sparse matrix
in the diagonal format with one-based indexing.
Syntax
call mkl_sdiamv(transa, m, k, alpha, matdescra, val, lval, idiag, ndiag, x, beta, y)
call mkl_ddiamv(transa, m, k, alpha, matdescra, val, lval, idiag, ndiag, x, beta, y)
call mkl_cdiamv(transa, m, k, alpha, matdescra, val, lval, idiag, ndiag, x, beta, y)
call mkl_zdiamv(transa, m, k, alpha, matdescra, val, lval, idiag, ndiag, x, beta, y)
Include Files
mkl.fi
Description
The mkl_?diamv routine performs a matrix-vector operation defined as
y := alpha*A*x + beta*y
or
y := alpha*AT*x + beta*y,
where:
A is an m-by-k sparse matrix stored in the diagonal format, AT is the transpose of A.
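The product for transa = 'N' can be sketched in a few lines of Python. This is an illustration of the arithmetic, not a call into Intel MKL. It assumes the layout given in the Diagonal Storage Scheme description: val(i, j) holds the element of row i that lies on the j-th stored diagonal, and idiag(j) is the distance of that diagonal from the main diagonal (negative below, positive above). The name dia_matvec is hypothetical, and the loops use Python's zero-based positions; the distances in idiag do not depend on the index base.

```python
def dia_matvec(m, k, alpha, val, idiag, x, beta, y):
    """Return alpha*A*x + beta*y for A stored by diagonals.

    Assumed layout (see lead-in): val[i][j] is the entry of row i on the
    diagonal whose distance from the main diagonal is idiag[j].
    Padding entries of val fall outside the matrix and are never read.
    """
    out = [beta * yi for yi in y]
    for j, dist in enumerate(idiag):
        for i in range(m):
            col = i + dist
            if 0 <= col < k:
                out[i] += alpha * val[i][j] * x[col]
    return out


# A = [[1, 2, 0],
#      [3, 4, 5],
#      [0, 6, 7]] with diagonals at distances -1, 0, +1.
idiag = [-1, 0, 1]
val = [
    [0.0, 1.0, 2.0],   # row 0: sub-diagonal slot is padding
    [3.0, 4.0, 5.0],
    [6.0, 7.0, 0.0],   # row 2: super-diagonal slot is padding
]
result = dia_matvec(3, 3, 1.0, val, idiag, [1.0, 1.0, 1.0], 0.0, [0.0, 0.0, 0.0])
```

Because every stored element keeps its original row number, the column is recovered as the row plus the diagonal distance, which is why no separate column index array is needed.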
NOTE
Input Parameters

If transa = 'N' or 'n', then y := alpha*A*x + beta*y,
If transa = 'T' or 't' or 'C' or 'c', then y := alpha*AT*x + beta*y.
alpha REAL for mkl_sdiamv.

DOUBLE PRECISION for mkl_ddiamv.
COMPLEX for mkl_cdiamv.
DOUBLE COMPLEX for mkl_zdiamv.
matdescra.
val REAL for mkl_sdiamv.


details.
x REAL for mkl_sdiamv.

Array, size at least k if transa = 'N' or 'n', and at least m otherwise. On
beta REAL for mkl_sdiamv.

y REAL for mkl_sdiamv.

Array, size at least m if transa = 'N' or 'n', and at least k otherwise. On
Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiamv(transa, m, k, alpha, matdescra, val, lval, idiag,
ndiag, x, beta, y)
CHARACTER*1 transa
INTEGER m, k, lval, ndiag
INTEGER idiag(*)
REAL alpha, beta
SUBROUTINE mkl_ddiamv(transa, m, k, alpha, matdescra, val, lval, idiag,

ndiag, x, beta, y)
CHARACTER*1 transa
INTEGER idiag(*)
SUBROUTINE mkl_cdiamv(transa, m, k, alpha, matdescra, val, lval, idiag,

ndiag, x, beta, y)
CHARACTER*1 transa
INTEGER idiag(*)
COMPLEX alpha, beta
SUBROUTINE mkl_zdiamv(transa, m, k, alpha, matdescra, val, lval, idiag,

ndiag, x, beta, y)
CHARACTER*1 transa
INTEGER idiag(*)
mkl_?skymv
Computes matrix-vector product for a sparse matrix
in the skyline storage format with one-based indexing.
Syntax
call mkl_sskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
call mkl_dskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
call mkl_cskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
call mkl_zskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
Include Files
mkl.fi
Description
The mkl_?skymv routine performs a matrix-vector operation defined as
y := alpha*A*x + beta*y
or
y := alpha*AT*x + beta*y,
where:
A is an m-by-k sparse matrix stored using the skyline storage scheme, AT is the transpose of A.
NOTE
Input Parameters
alpha REAL for mkl_sskymv.

DOUBLE PRECISION for mkl_dskymv.
COMPLEX for mkl_cskymv.
DOUBLE COMPLEX for mkl_zskymv.
matdescra.
NOTE
General matrices (matdescra(1)='G') are not supported.
val REAL for mkl_sskymv.

Array containing the set of elements of the matrix A in the skyline profile
form.
If matdescra(2)= 'L', then val contains elements from the lower triangle
of the matrix A.
If matdescra(2)= 'U', then val contains elements from the upper
triangle of the matrix A.
Refer to values array description in Skyline Storage Scheme for more
details.
pntr INTEGER. Array of length (m + 1) for lower triangle, and (k + 1) for

upper triangle.
It contains the indices specifying the positions in val of the first
element in each row (column) of the matrix A. Refer to pointers array
description in Skyline Storage Scheme for more details.
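The role of pntr can be illustrated by expanding a lower triangular skyline matrix into a dense array. The Python sketch below is not part of Intel MKL and rests on assumptions about the Skyline Storage Scheme that are not spelled out in this section: for the lower triangle, val stores each row contiguously from its first non-zero element up to and including the diagonal, and pntr(m + 1) points one position past the last stored element. The name sky_lower_to_dense is hypothetical.

```python
def sky_lower_to_dense(m, val, pntr):
    """Expand a lower triangular skyline matrix into a dense m-by-m list.

    Assumptions (see lead-in): pntr is one-based, has m + 1 entries,
    and row i occupies val positions pntr[i] .. pntr[i+1] - 1,
    ending at the diagonal element.
    """
    base = pntr[0]
    dense = [[0.0] * m for _ in range(m)]
    for i in range(m):
        start = pntr[i] - base            # position of first element of row i
        length = pntr[i + 1] - pntr[i]    # number of stored elements in row i
        first_col = i - length + 1        # row ends at the diagonal
        for t in range(length):
            dense[i][first_col + t] = val[start + t]
    return dense


# Lower triangle of
#   [[1, 0, 0],
#    [2, 3, 0],
#    [4, 0, 5]]
# The zero inside the profile of row 3 is stored explicitly.
val = [1.0, 2.0, 3.0, 4.0, 0.0, 5.0]
pntr = [1, 2, 4, 7]
dense = sky_lower_to_dense(3, val, pntr)
```

For the upper triangle the same idea applies column by column, which is why the length of pntr depends on k rather than m in that case.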
x REAL for mkl_sskymv.


beta REAL for mkl_sskymv.

y REAL for mkl_sskymv.

Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
CHARACTER*1 transa
INTEGER m, k
INTEGER pntr(*)
REAL alpha, beta
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)

CHARACTER*1 transa
INTEGER m, k
INTEGER pntr(*)
SUBROUTINE mkl_cskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)
CHARACTER*1 transa
INTEGER m, k
INTEGER pntr(*)
COMPLEX alpha, beta
SUBROUTINE mkl_zskymv(transa, m, k, alpha, matdescra, val, pntr, x, beta, y)

CHARACTER*1 transa
INTEGER m, k
INTEGER pntr(*)
mkl_?diasv
Solves a system of linear equations for a sparse
matrix in the diagonal format with one-based
indexing.
Syntax
call mkl_sdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
call mkl_ddiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
call mkl_cdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
call mkl_zdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
Include Files
mkl.fi
Description
The mkl_?diasv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix stored in the diagonal format:
y := alpha*inv(A)*x
or
y := alpha*inv(AT)*x,
where:
NOTE
Input Parameters

If transa = 'T' or 't' or 'C' or 'c', then y := alpha*inv(AT)*x,
alpha REAL for mkl_sdiasv.

DOUBLE PRECISION for mkl_ddiasv.
COMPLEX for mkl_cdiasv.
DOUBLE COMPLEX for mkl_zdiasv.
matdescra.
val REAL for mkl_sdiasv.


NOTE

details.
x REAL for mkl_sdiasv.

y REAL for mkl_sdiasv.

Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)
CHARACTER*1 transa
INTEGER idiag(*)
REAL alpha
SUBROUTINE mkl_ddiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)

CHARACTER*1 transa
INTEGER idiag(*)
SUBROUTINE mkl_cdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)

CHARACTER*1 transa
INTEGER idiag(*)
COMPLEX alpha
SUBROUTINE mkl_zdiasv(transa, m, alpha, matdescra, val, lval, idiag, ndiag, x, y)

CHARACTER*1 transa
INTEGER idiag(*)
mkl_?skysv
Solves a system of linear equations for a sparse
matrix in the skyline format with one-based indexing.
Syntax
call mkl_sskysv(transa, m, alpha, matdescra, val, pntr, x, y)
call mkl_dskysv(transa, m, alpha, matdescra, val, pntr, x, y)
call mkl_cskysv(transa, m, alpha, matdescra, val, pntr, x, y)
call mkl_zskysv(transa, m, alpha, matdescra, val, pntr, x, y)
Include Files
mkl.fi
Description
The mkl_?skysv routine solves a system of linear equations with matrix-vector operations for a sparse
matrix in the skyline storage format:
y := alpha*inv(A)*x
or
y := alpha*inv(AT)*x,
where:
NOTE
Input Parameters

alpha REAL for mkl_sskysv.

DOUBLE PRECISION for mkl_dskysv.
COMPLEX for mkl_cskysv.
DOUBLE COMPLEX for mkl_zskysv.
matdescra.
NOTE
val REAL for mkl_sskysv.

form.
If matdescra(2)= 'L', then val contains elements from the lower triangle
of the matrix A.
If matdescra(2)= 'U', then val contains elements from the upper
details.

upper triangle.
It contains the indices specifying the positions in val of the first
element in each row (column) of the matrix A. Refer to pointers array
description in Skyline Storage Scheme for more details.
x REAL for mkl_sskysv.

y REAL for mkl_sskysv.

Output Parameters
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sskysv(transa, m, alpha, matdescra, val, pntr, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER pntr(*)
REAL alpha
REAL val(*), x(*), y(*)
SUBROUTINE mkl_dskysv(transa, m, alpha, matdescra, val, pntr, x, y)

CHARACTER*1 transa
INTEGER m
INTEGER pntr(*)
SUBROUTINE mkl_cskysv(transa, m, alpha, matdescra, val, pntr, x, y)
CHARACTER*1 transa
INTEGER m
INTEGER pntr(*)
COMPLEX alpha
SUBROUTINE mkl_zskysv(transa, m, alpha, matdescra, val, pntr, x, y)

CHARACTER*1 transa
INTEGER m
INTEGER pntr(*)
mkl_?diamm
Computes matrix-matrix product of a sparse matrix
stored in the diagonal format with one-based
indexing.
Syntax
call mkl_sdiamm(transa, m, n, k, alpha, matdescra, val, lval, idiag, ndiag, b, ldb,
beta, c, ldc)
call mkl_ddiamm(transa, m, n, k, alpha, matdescra, val, lval, idiag, ndiag, b, ldb,
beta, c, ldc)
call mkl_cdiamm(transa, m, n, k, alpha, matdescra, val, lval, idiag, ndiag, b, ldb,
beta, c, ldc)
call mkl_zdiamm(transa, m, n, k, alpha, matdescra, val, lval, idiag, ndiag, b, ldb,
beta, c, ldc)
Include Files
mkl.fi
Description
The mkl_?diamm routine performs a matrix-matrix operation defined as
C := alpha*A*B + beta*C
or
C := alpha*AT*B + beta*C,
or
C := alpha*AH*B + beta*C,
where:
B and C are dense matrices, A is an m-by-k sparse matrix in the diagonal format, AT is the transpose of A,
and AH is the conjugate transpose of A.
NOTE
Input Parameters

alpha REAL for mkl_sdiamm.

DOUBLE PRECISION for mkl_ddiamm.
COMPLEX for mkl_cdiamm.
DOUBLE COMPLEX for mkl_zdiamm.
matdescra.
val REAL for mkl_sdiamm.


details.
b REAL for mkl_sdiamm.

Array, size (ldb, n).

(sub)program.
beta REAL for mkl_sdiamm.

c REAL for mkl_sdiamm.


(sub)program.
Output Parameters

Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiamm(transa, m, n, k, alpha, matdescra, val, lval,
idiag, ndiag, b, ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER m, n, k, ldb, ldc, lval, ndiag
INTEGER idiag(*)
REAL alpha, beta
REAL val(lval,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_ddiamm(transa, m, n, k, alpha, matdescra, val, lval,

CHARACTER*1 transa
INTEGER idiag(*)
DOUBLE PRECISION val(lval,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_cdiamm(transa, m, n, k, alpha, matdescra, val, lval,

CHARACTER*1 transa
INTEGER idiag(*)
COMPLEX alpha, beta
COMPLEX val(lval,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_zdiamm(transa, m, n, k, alpha, matdescra, val, lval,

CHARACTER*1 transa
INTEGER idiag(*)
DOUBLE COMPLEX val(lval,*), b(ldb,*), c(ldc,*)
mkl_?skymm
Computes matrix-matrix product of a sparse matrix
stored using the skyline storage scheme with
one-based indexing.
Syntax
call mkl_sskymm(transa, m, n, k, alpha, matdescra, val, pntr, b, ldb, beta, c, ldc)
call mkl_dskymm(transa, m, n, k, alpha, matdescra, val, pntr, b, ldb, beta, c, ldc)
call mkl_cskymm(transa, m, n, k, alpha, matdescra, val, pntr, b, ldb, beta, c, ldc)
call mkl_zskymm(transa, m, n, k, alpha, matdescra, val, pntr, b, ldb, beta, c, ldc)
Include Files
mkl.fi
Description
The mkl_?skymm routine performs a matrix-matrix operation defined as
C := alpha*A*B + beta*C
or
C := alpha*AT*B + beta*C,
or
C := alpha*AH*B + beta*C,
where:
B and C are dense matrices, A is an m-by-k sparse matrix in the skyline storage format, AT is the transpose
of A, and AH is the conjugate transpose of A.
NOTE
Input Parameters

alpha REAL for mkl_sskymm.

DOUBLE PRECISION for mkl_dskymm.
COMPLEX for mkl_cskymm.
DOUBLE COMPLEX for mkl_zskymm.
matdescra.
NOTE
General matrices (matdescra(1)='G') are not supported.
val REAL for mkl_sskymm.

form.
of the matrix A.
details.

upper triangle.
It contains the indices specifying the positions of the first element of the
matrix A in each row (for the lower triangle) or column (for upper triangle)
in the val array. Refer to pointers array description in Skyline Storage
b REAL for mkl_sskymm.

(sub)program.
beta REAL for mkl_sskymm.

c REAL for mkl_sskymm.


(sub)program.
Output Parameters

Interfaces
FORTRAN 77:
SUBROUTINE mkl_sskymm(transa, m, n, k, alpha, matdescra, val, pntr, b,
ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER pntr(*)
REAL alpha, beta
SUBROUTINE mkl_dskymm(transa, m, n, k, alpha, matdescra, val, pntr, b,

ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER pntr(*)
SUBROUTINE mkl_cskymm(transa, m, n, k, alpha, matdescra, val, pntr, b,

ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER pntr(*)
COMPLEX alpha, beta
SUBROUTINE mkl_zskymm(transa, m, n, k, alpha, matdescra, val, pntr, b,

ldb, beta, c, ldc)
CHARACTER*1 transa
INTEGER pntr(*)
mkl_?diasm
Solves a system of linear matrix equations for a
sparse matrix in the diagonal format with one-based
indexing.
Syntax
call mkl_sdiasm(transa, m, n, alpha, matdescra, val, lval, idiag, ndiag, b, ldb, c,
ldc)
call mkl_ddiasm(transa, m, n, alpha, matdescra, val, lval, idiag, ndiag, b, ldb, c,
ldc)
call mkl_cdiasm(transa, m, n, alpha, matdescra, val, lval, idiag, ndiag, b, ldb, c,
ldc)
call mkl_zdiasm(transa, m, n, alpha, matdescra, val, lval, idiag, ndiag, b, ldb, c,
ldc)
Include Files
mkl.fi
Description
The mkl_?diasm routine solves a system of linear equations with matrix-matrix operations for a sparse
matrix in the diagonal format:
C := alpha*inv(A)*B
or
C := alpha*inv(AT)*B,
where:
NOTE
Input Parameters

If transa = 'N' or 'n', then C := alpha*inv(A)*B,
If transa = 'T' or 't' or 'C' or 'c', then C := alpha*inv(AT)*B.
alpha REAL for mkl_sdiasm.

DOUBLE PRECISION for mkl_ddiasm.
COMPLEX for mkl_cdiasm.
DOUBLE COMPLEX for mkl_zdiasm.
matdescra.
val REAL for mkl_sdiasm.



NOTE

details.
b REAL for mkl_sdiasm.


(sub)program.

(sub)program.
Output Parameters
c REAL for mkl_sdiasm.

The leading m-by-n part of the array c contains the matrix C.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdiasm(transa, m, n, alpha, matdescra, val, lval, idiag,
ndiag, b, ldb, c, ldc)
CHARACTER*1 transa
INTEGER m, n, ldb, ldc, lval, ndiag
INTEGER idiag(*)
REAL alpha
REAL val(lval,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_ddiasm(transa, m, n, alpha, matdescra, val, lval, idiag,

CHARACTER*1 transa
INTEGER idiag(*)
DOUBLE PRECISION val(lval,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_cdiasm(transa, m, n, alpha, matdescra, val, lval, idiag,

CHARACTER*1 transa
INTEGER idiag(*)
COMPLEX alpha
COMPLEX val(lval,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_zdiasm(transa, m, n, alpha, matdescra, val, lval, idiag,

CHARACTER*1 transa
INTEGER idiag(*)
DOUBLE COMPLEX val(lval,*), b(ldb,*), c(ldc,*)
mkl_?skysm
Solves a system of linear matrix equations for a
sparse matrix stored using the skyline storage scheme
with one-based indexing.
Syntax
call mkl_sskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
call mkl_dskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
call mkl_cskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
call mkl_zskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
Include Files
mkl.fi
Description
The mkl_?skysm routine solves a system of linear equations with matrix-matrix operations for a sparse
matrix in the skyline storage format:
C := alpha*inv(A)*B
or
C := alpha*inv(AT)*B,
where:
NOTE
Input Parameters

If transa = 'N' or 'n', then C := alpha*inv(A)*B,
alpha REAL for mkl_sskysm.

DOUBLE PRECISION for mkl_dskysm.
COMPLEX for mkl_cskysm.
DOUBLE COMPLEX for mkl_zskysm.
matdescra.
NOTE
val REAL for mkl_sskysm.

form.
of the matrix A.
details.
pntr INTEGER. Array of length (m + 1) for lower triangle, and (n + 1) for

upper triangle.
It contains the indices specifying the positions of the first element of the
matrix A in each row (for the lower triangle) or column (for upper triangle)
in the val array. Refer to pointers array description in Skyline Storage
b REAL for mkl_sskysm.


(sub)program.

(sub)program.
Output Parameters
c REAL for mkl_sskysm.

The leading m-by-n part of the array c contains the matrix C.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)

CHARACTER*1 transa
INTEGER pntr(*)
REAL alpha
SUBROUTINE mkl_dskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)

CHARACTER*1 transa
INTEGER pntr(*)
SUBROUTINE mkl_cskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)

CHARACTER*1 transa
INTEGER pntr(*)
COMPLEX alpha
SUBROUTINE mkl_zskysm(transa, m, n, alpha, matdescra, val, pntr, b, ldb, c, ldc)
CHARACTER*1 transa
INTEGER pntr(*)
mkl_?dnscsr
Converts a sparse matrix in uncompressed
representation to the CSR format and vice versa.
Syntax
call mkl_sdnscsr(job, m, n, adns, lda, acsr, ja, ia, info)
call mkl_ddnscsr(job, m, n, adns, lda, acsr, ja, ia, info)
call mkl_cdnscsr(job, m, n, adns, lda, acsr, ja, ia, info)
call mkl_zdnscsr(job, m, n, adns, lda, acsr, ja, ia, info)
Include Files
mkl.fi
Description
This routine converts a sparse matrix A between formats: stored as a rectangular array (dense
representation) and stored using compressed sparse row (CSR) format (3-array variation).
Input Parameters
job INTEGER
Array, contains the following conversion parameters:
job(1): Conversion type.

If job(1)=0, the rectangular matrix A is converted to the CSR
format;
if job(1)=1, the rectangular matrix A is restored from the CSR
format.
job(2): Index base for the rectangular matrix A.
If job(2)=0, zero-based indexing for the rectangular matrix A is
used;
if job(2)=1, one-based indexing for the rectangular matrix A is used.
job(3): Index base for the matrix in CSR format.
If job(3)=0, zero-based indexing for the matrix in CSR format is
used;

if job(3)=1, one-based indexing for the matrix in CSR format is
used.
job(4): Portion of matrix.
If job(4)=0, adns is a lower triangular part of matrix A;
If job(4)=1, adns is an upper triangular part of matrix A;
If job(4)=2, adns is a whole matrix A.
job(5)=nzmax: maximum number of the non-zero elements allowed if
job(1)=0.
job(6): job indicator for conversion to CSR format.
If job(6)=0, only array ia is generated for the output storage.
If job(6)>0, arrays acsr, ia, ja are generated for the output
storage.
n INTEGER. Number of columns of the matrix A.
adns (input/output)
REAL for mkl_sdnscsr.
DOUBLE PRECISION for mkl_ddnscsr.
COMPLEX for mkl_cdnscsr.
DOUBLE COMPLEX for mkl_zdnscsr.
If the conversion type is from uncompressed to CSR, on input adns
contains an uncompressed (dense) representation of matrix A.
lda INTEGER. Specifies the leading dimension of adns as declared in the calling
(sub)program.
For zero-based indexing of A, lda must be at least max(1, n).
For one-based indexing of A, lda must be at least max(1, m).
acsr (input/output)
REAL for mkl_sdnscsr.
DOUBLE PRECISION for mkl_ddnscsr.
COMPLEX for mkl_cdnscsr.
DOUBLE COMPLEX for mkl_zdnscsr.
If conversion type is from CSR to uncompressed, on input acsr contains
the non-zero elements of the matrix A. Its length is equal to the number of
non-zero elements in the matrix A. Refer to values array description in
ja (input/output) INTEGER. If conversion type is from CSR to uncompressed,

on input ja contains the column indices for each non-zero element of the
matrix A.
Its length is equal to the length of the array acsr. Refer to columns array
ia (input/output) INTEGER. Array of length m + 1.
If conversion type is from CSR to uncompressed, on input ia contains

indices of elements in the array acsr, such that ia(i) is the index in the
array acsr of the first non-zero element from the row i.
The value of ia(m + 1) is equal to the number of non-zeros plus one. Refer
to rowIndex array description in Sparse Matrix Storage Formats for more
details.
Output Parameters
adns If conversion type is from CSR to uncompressed, on output adns contains

the uncompressed (dense) representation of matrix A.
acsr, ja, ia If conversion type is from uncompressed to CSR, on output acsr, ja, and
ia contain the compressed sparse row (CSR) format (3-array variation) of
matrix A (see Sparse Matrix Storage Formats for a description of the
storage format).
info INTEGER. Integer info indicator only for converting the matrix A to the CSR
format.
If info=0, the execution is successful.
If info=i, the routine is interrupted processing the i-th row because there
is no space in the arrays acsr and ja according to the value nzmax.
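The interplay of nzmax and info can be sketched in plain Python. The code below does not call Intel MKL; it mimics the dense-to-CSR direction under the assumptions job(1)=0, job(3)=1 (one-based CSR), job(4)=2 (whole matrix) and job(6)>0, with adns given as a list of rows. The name dense_to_csr and the tuple return value are hypothetical.

```python
def dense_to_csr(adns, nzmax):
    """Convert a dense matrix (list of rows) to one-based 3-array CSR.

    Returns (acsr, ja, ia, info). info is 0 on success, or the one-based
    number of the row being processed when more than nzmax non-zero
    elements would have to be stored.
    """
    acsr, ja, ia = [], [], [1]
    for i, row in enumerate(adns, start=1):
        for j, value in enumerate(row, start=1):
            if value != 0:
                if len(acsr) == nzmax:
                    return acsr, ja, ia, i   # no space left: stop at row i
                acsr.append(value)
                ja.append(j)
        ia.append(len(acsr) + 1)             # start of the next row
    return acsr, ja, ia, 0


adns = [
    [1.0, 0.0, 2.0],
    [0.0, 0.0, 3.0],
    [4.0, 5.0, 0.0],
]
acsr, ja, ia, info = dense_to_csr(adns, nzmax=10)
```

With enough space, the last entry of ia equals the number of non-zeros plus one, matching the description of ia(m + 1). Calling the sketch with nzmax=2 stops while processing the second row and reports info=2.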
Interfaces
FORTRAN 77:
SUBROUTINE mkl_sdnscsr(job, m, n, adns, lda, acsr, ja, ia, info)
INTEGER job(8)
INTEGER m, n, lda, info
INTEGER ja(*), ia(m+1)
REAL adns(*), acsr(*)
SUBROUTINE mkl_ddnscsr(job, m, n, adns, lda, acsr, ja, ia, info)

INTEGER job(8)
DOUBLE PRECISION adns(*), acsr(*)
SUBROUTINE mkl_cdnscsr(job, m, n, adns, lda, acsr, ja, ia, info)

INTEGER job(8)
COMPLEX adns(*), acsr(*)
SUBROUTINE mkl_zdnscsr(job, m, n, adns, lda, acsr, ja, ia, info)

INTEGER job(8)
DOUBLE COMPLEX adns(*), acsr(*)
mkl_?csrcoo
Converts a sparse matrix in the CSR format to the
coordinate format and vice versa.
Syntax
call mkl_scsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
call mkl_dcsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
call mkl_ccsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
call mkl_zcsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
Include Files
mkl.fi
Description
This routine converts a sparse matrix A stored in the compressed sparse row (CSR) format (3-array
variation) to coordinate format and vice versa.
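The CSR-to-coordinate direction simply replaces the row pointer array with an explicit row index for every non-zero element. The following Python sketch shows the idea without calling Intel MKL. It assumes the same index base for both formats (job(2) equal to job(3)) and that all three output arrays are requested (job(6)=3). The name csr_to_coo is hypothetical.

```python
def csr_to_coo(n, acsr, ja, ia):
    """Expand 3-array CSR into coordinate arrays (acoo, rowind, colind).

    The index base (0 or 1) is taken from ia[0] and preserved in the
    output, which is an assumption of this sketch.
    """
    base = ia[0]
    acoo, rowind, colind = [], [], []
    for i in range(n):
        # Positions ia[i] .. ia[i+1]-1 of acsr and ja belong to row i.
        for p in range(ia[i] - base, ia[i + 1] - base):
            acoo.append(acsr[p])
            rowind.append(i + base)
            colind.append(ja[p])
    return acoo, rowind, colind


# One-based CSR for [[1, 0, 2], [0, 0, 3], [4, 5, 0]].
acsr = [1.0, 2.0, 3.0, 4.0, 5.0]
ja = [1, 3, 3, 1, 2]
ia = [1, 3, 4, 6]
acoo, rowind, colind = csr_to_coo(3, acsr, ja, ia)
```

The values and column indices are copied unchanged; only rowind is new information, which is consistent with the option of filling in rowind alone.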
Input Parameters
job INTEGER
job(1)
If job(1)=0, the matrix in the CSR format is converted to the coordinate
format;
if job(1)=1, the matrix in the coordinate format is converted to the CSR
format.
if job(1)=2, the matrix in the coordinate format is converted to the CSR
format, and the column indices in CSR representation are sorted in the
increasing order within each row.
job(2)
If job(2)=0, zero-based indexing for the matrix in CSR format is used;
if job(2)=1, one-based indexing for the matrix in CSR format is used.
job(3)
If job(3)=0, zero-based indexing for the matrix in coordinate format is
used;
if job(3)=1, one-based indexing for the matrix in coordinate format is
used.
job(5)
job(5)=nzmax - maximum number of the non-zero elements allowed if
job(1)=0.
job(6) - job indicator.
For conversion to the coordinate format:
If job(6)=1, only array rowind is filled in for the output storage.
If job(6)=2, arrays rowind, colind are filled in for the output storage.
If job(6)=3, all arrays rowind, colind, acoo are filled in for the output
storage.
For conversion to the CSR format:
If job(6)=0, all arrays acsr, ja, ia are filled in for the output storage.
If job(6)=1, only array ia is filled in for the output storage.
If job(6)=2, then it is assumed that the routine already has been called
with the job(6)=1, and the user allocated the required space for storing
the output arrays acsr and ja.
n INTEGER. Dimension of the matrix A.
nnz INTEGER. Specifies the number of non-zero elements of the matrix A for
job(1)≠0.
acsr (input/output)
REAL for mkl_scsrcoo.
DOUBLE PRECISION for mkl_dcsrcoo.
COMPLEX for mkl_ccsrcoo.
DOUBLE COMPLEX for mkl_zcsrcoo.
ja (input/output) INTEGER. Array containing the column indices for each non-
ia (input/output) INTEGER. Array of length n + 1, containing indices of

elements in the array acsr, such that ia(i) is the index in the array acsr
of the first non-zero element from the row i. The value of the last element
ia(n + 1) is equal to the number of non-zeros plus one. Refer to

rowIndex array description in Sparse Matrix Storage Formats for more
details.
acoo (input/output)
REAL for mkl_scsrcoo.
DOUBLE PRECISION for mkl_dcsrcoo.
COMPLEX for mkl_ccsrcoo.
DOUBLE COMPLEX for mkl_zcsrcoo.
rowind (input/output) INTEGER. Array of length nnz, contains the row indices for
each non-zero element of the matrix A.
colind (input/output) INTEGER. Array of length nnz, contains the column indices for
each non-zero element of the matrix A. Refer to columns array description
in Coordinate Format for more details.
Output Parameters
nnz Returns the number of converted elements of the matrix A for job(1)=0.
info INTEGER. Integer info indicator only for converting the matrix A from the
CSR format.
If info=1, the routine is interrupted because there is no space in the arrays

acoo, rowind, colind according to the value nzmax.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
INTEGER job(8)
INTEGER n, nnz, info
INTEGER ja(*), ia(n+1), rowind(*), colind(*)
REAL acsr(*), acoo(*)
SUBROUTINE mkl_dcsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
INTEGER job(8)
DOUBLE PRECISION acsr(*), acoo(*)
SUBROUTINE mkl_ccsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
INTEGER job(8)
COMPLEX acsr(*), acoo(*)
SUBROUTINE mkl_zcsrcoo(job, n, acsr, ja, ia, nnz, acoo, rowind, colind, info)
INTEGER job(8)
DOUBLE COMPLEX acsr(*), acoo(*)
mkl_?csrbsr
Converts a square sparse matrix in the CSR format to
the BSR format and vice versa.
Syntax
call mkl_scsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
call mkl_dcsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
call mkl_ccsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
call mkl_zcsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
Include Files
mkl.fi
Description
This routine converts a square sparse matrix A stored in the compressed sparse row (CSR) format (3-array
variation) to the block sparse row (BSR) format and vice versa.
Input Parameters
job INTEGER
job(1)
If job(1)=0, the matrix in the CSR format is converted to the BSR format;
if job(1)=1, the matrix in the BSR format is converted to the CSR format.
job(2)
job(3)
If job(3)=0, zero-based indexing for the matrix in the BSR format is used;
if job(3)=1, one-based indexing for the matrix in the BSR format is used.
job(4) is only used for conversion to CSR format. By default, the converter
saves the blocks without checking whether an element is zero or not. If
job(4)=1, then the converter only saves non-zero elements in blocks.
For conversion to the BSR format:
If job(6)=0, only arrays jab, iab are generated for the output storage.
If job(6)>0, all output arrays absr, jab, and iab are filled in for the
output storage.
If job(6)=-1, iab(1) returns the number of non-zero blocks.

If job(6)=0, only arrays ja, ia are generated for the output storage.
m INTEGER. Actual row dimension of the matrix A for conversion to the BSR
format; block row dimension of the matrix A for conversion to the CSR format.
mblk INTEGER. Size of the block in the matrix A.
ldabsr INTEGER. Leading dimension of the array absr as declared in the calling
program. ldabsr must be greater than or equal to mblk*mblk.
acsr (input/output)
REAL for mkl_scsrbsr.
DOUBLE PRECISION for mkl_dcsrbsr.
COMPLEX for mkl_ccsrbsr.
DOUBLE COMPLEX for mkl_zcsrbsr.
ja (input/output) INTEGER. Array containing the column indices for each non-

ia (input/output) INTEGER. Array of length m + 1, containing indices of

elements in the array acsr, such that ia(I) is the index in the array acsr
of the first non-zero element from the row I. The value of the last
element ia(m + 1) is equal to the number of non-zeros plus one. Refer to
details.
absr (input/output)
REAL for mkl_scsrbsr.
DOUBLE PRECISION for mkl_dcsrbsr.
COMPLEX for mkl_ccsrbsr.
DOUBLE COMPLEX for mkl_zcsrbsr.
equal to the number of non-zero blocks in the matrix A multiplied by
mblk*mblk. Refer to values array description in BSR Format for more
details.
jab (input/output) INTEGER. Array containing the column indices for each non-
zero block of the matrix A.
iab (input/output) INTEGER. Array of length (m + 1), containing indices of

blocks in the array absr, such that iab(i) is the index in the array absr of
the first non-zero element from the i-th row. The value of the last
element iab(m + 1) is equal to the number of non-zero blocks plus one.
Refer to rowIndex array description in BSR Format for more details.
Output Parameters
CSR format.
If info=1, it means that mblk is equal to 0.
If info=2, it means that ldabsr is less than mblk*mblk and there is no

space for all blocks.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
INTEGER job(8)
INTEGER m, mblk, ldabsr, info
INTEGER ja(*), ia(m+1), jab(*), iab(*)
REAL acsr(*), absr(ldabsr,*)
SUBROUTINE mkl_dcsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
INTEGER job(8)
DOUBLE PRECISION acsr(*), absr(ldabsr,*)
SUBROUTINE mkl_ccsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
INTEGER job(8)
COMPLEX acsr(*), absr(ldabsr,*)
SUBROUTINE mkl_zcsrbsr(job, m, mblk, ldabsr, acsr, ja, ia, absr, jab, iab, info)
INTEGER job(8)
DOUBLE COMPLEX acsr(*), absr(ldabsr,*)
mkl_?csrcsc
Converts a square sparse matrix in the CSR format to
the CSC format and vice versa.
Syntax
call mkl_scsrcsc(job, m, acsr, ja, ia, acsc, ja1, ia1, info)
call mkl_dcsrcsc(job, m, acsr, ja, ia, acsc, ja1, ia1, info)
call mkl_ccsrcsc(job, m, acsr, ja, ia, acsc, ja1, ia1, info)
call mkl_zcsrcsc(job, m, acsr, ja, ia, acsc, ja1, ia1, info)
Include Files
mkl.fi
Description
This routine converts a square sparse matrix A stored in the compressed sparse row (CSR) format (3-array
variation) to the compressed sparse column (CSC) format and vice versa.
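The index arithmetic behind a CSR to CSC conversion can be sketched in a few lines. The sketch below is an illustration in Python, not the MKL implementation: the function name csr_to_csc is hypothetical, the arrays use one-based indexing as in the 3-array variation, and values are always produced (the job array is not modeled).

```python
def csr_to_csc(m, acsr, ja, ia):
    """Convert an m-by-m matrix from one-based 3-array CSR to one-based CSC.

    Hypothetical helper for illustration only; it is not an MKL routine.
    """
    nnz = ia[m] - 1                      # ia(m + 1) equals nnz + 1
    counts = [0] * m                     # number of entries in each column
    for col in ja[:nnz]:
        counts[col - 1] += 1
    ia1 = [1] * (m + 1)                  # one-based column pointers
    for j in range(m):
        ia1[j + 1] = ia1[j] + counts[j]
    next_slot = ia1[:m]                  # next free one-based position per column
    acsc = [0.0] * nnz
    ja1 = [0] * nnz
    for i in range(m):                   # visiting rows in order keeps row indices sorted
        for p in range(ia[i] - 1, ia[i + 1] - 1):
            j = ja[p] - 1
            q = next_slot[j] - 1
            acsc[q] = acsr[p]
            ja1[q] = i + 1               # store the one-based row index
            next_slot[j] += 1
    return acsc, ja1, ia1
```

For the 3-by-3 matrix with rows (1, 0, 2), (0, 3, 0), (4, 0, 5), the CSR arrays acsr = (1, 2, 3, 4, 5), ja = (1, 3, 2, 1, 3), ia = (1, 3, 4, 6) map to acsc = (1, 4, 3, 2, 5), ja1 = (1, 3, 2, 1, 3), ia1 = (1, 3, 4, 6).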
Input Parameters
job INTEGER
job(1)
If job(1)=0, the matrix in the CSR format is converted to the CSC format;
if job(1)=1, the matrix in the CSC format is converted to the CSR format.
job(2)
job(3)
If job(3)=0, zero-based indexing for the matrix in the CSC format is used;
if job(3)=1, one-based indexing for the matrix in the CSC format is used.

For conversion to the CSC format:
If job(6)=0, only arrays ja1, ia1 are filled in for the output storage.
If job(6)≠0, all output arrays acsc, ja1, and ia1 are filled in for the
output storage.
For conversion to the CSR format:
If job(6)=0, only arrays ja, ia are filled in for the output storage.
If job(6)≠0, all output arrays acsr, ja, and ia are filled in for the output
storage.
m INTEGER. Dimension of the square matrix A.
acsr (input/output)
REAL for mkl_scsrcsc.
DOUBLE PRECISION for mkl_dcsrcsc.
COMPLEX for mkl_ccsrcsc.
DOUBLE COMPLEX for mkl_zcsrcsc.
Array containing non-zero elements of the square matrix A. Its length is
equal to the number of non-zero elements in the matrix A. Refer to values
array description in Sparse Matrix Storage Formats for more details.


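ia (input/output) INTEGER. Array of length m + 1, containing indices of
elements in the array acsr, such that ia(i) is the index in the array acsr
of the first non-zero element from the row i. The value of the last
element ia(m + 1) is equal to the number of non-zeros plus one. Refer to
rowIndex array description in Sparse Matrix Storage Formats for more
details.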
acsc (input/output)
REAL for mkl_scsrcsc.
DOUBLE PRECISION for mkl_dcsrcsc.
COMPLEX for mkl_ccsrcsc.
DOUBLE COMPLEX for mkl_zcsrcsc.
Array containing non-zero elements of the square matrix A. Its length is
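equal to the number of non-zero elements in the matrix A. Refer to values
array description in Sparse Matrix Storage Formats for more details.
ja1 (input/output) INTEGER. Array containing the row indices for each non-zero
element of the matrix A.
Its length is equal to the length of the array acsc. Refer to columns array
description in Sparse Matrix Storage Formats for more details.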
ia1 (input/output) INTEGER. Array of length m + 1, containing indices of
elements in the array acsc, such that ia1(i) is the index in the array acsc
of the first non-zero element from the column i. The value of the last
element ia1(m + 1) is equal to the number of non-zeros plus one. Refer to
rowIndex array description in Sparse Matrix Storage Formats for more
details.
Output Parameters
info INTEGER. This parameter is not used now.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrcsc(job, m, acsr, ja, ia, acsc, ja1, ia1, info)
INTEGER job(8)
INTEGER m, info
INTEGER ja(*), ia(m+1), ja1(*), ia1(m+1)
REAL acsr(*), acsc(*)
SUBROUTINE mkl_dcsrcsc(job, m, acsr, ja, ia, acsc, ja1, ia1, info)

INTEGER job(8)
INTEGER m, info
DOUBLE PRECISION acsr(*), acsc(*)
SUBROUTINE mkl_ccsrcsc(job, m, acsr, ja, ia, acsc, ja1, ia1, info)

INTEGER job(8)
INTEGER m, info
COMPLEX acsr(*), acsc(*)
SUBROUTINE mkl_zcsrcsc(job, m, acsr, ja, ia, acsc, ja1, ia1, info)

INTEGER job(8)
INTEGER m, info
DOUBLE COMPLEX acsr(*), acsc(*)
mkl_?csrdia
Converts a sparse matrix in the CSR format to the
diagonal format and vice versa.
Syntax
call mkl_scsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
call mkl_dcsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
call mkl_ccsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
call mkl_zcsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
Include Files
mkl.fi
Description
This routine converts a sparse matrix A stored in the compressed sparse row (CSR) format (3-array
variation) to the diagonal format and vice versa.
Input Parameters
job INTEGER
job(1)
If job(1)=0, the matrix in the CSR format is converted to the diagonal
format;
if job(1)=1, the matrix in the diagonal format is converted to the CSR
format.
job(2)
job(3)
If job(3)=0, zero-based indexing for the matrix in the diagonal format is
used;
if job(3)=1, one-based indexing for the matrix in the diagonal format is
used.
For conversion to the diagonal format:
If job(6)=0, diagonals are not selected internally, and acsr_rem, ja_rem,
ia_rem are not filled in for the output storage.
If job(6)=1, diagonals are not selected internally, and acsr_rem, ja_rem,
ia_rem are filled in for the output storage.
If job(6)=10, diagonals are selected internally, and acsr_rem, ja_rem,
ia_rem are not filled in for the output storage.
If job(6)=11, diagonals are selected internally, and acsr_rem, ja_rem,
ia_rem are filled in for the output storage.
For conversion to the CSR format:
If job(6)=0, the routine checks whether each entry in the array adia is zero.
Zero entries are not included in the array acsr.
If job(6)≠0, the entries in the array adia are not checked for zero values.
m INTEGER. Dimension of the matrix A.
acsr (input/output)
REAL for mkl_scsrdia.
DOUBLE PRECISION for mkl_dcsrdia.
COMPLEX for mkl_ccsrdia.
DOUBLE COMPLEX for mkl_zcsrdia.


details.
adia (input/output)
REAL for mkl_scsrdia.
DOUBLE PRECISION for mkl_dcsrdia.
COMPLEX for mkl_ccsrdia.
DOUBLE COMPLEX for mkl_zcsrdia.
Array of size (ndiag x idiag) containing diagonals of the matrix A.
The key point of the storage is that each element in the array adia retains
the row number of the original matrix. To achieve this, diagonals in the
lower triangular part of the matrix are padded from the top, and those in
the upper triangular part are padded from the bottom.
ndiag INTEGER.
Specifies the leading dimension of the array adia as declared in the calling
(sub)program, must be at least max(1, m).
distance INTEGER.
Array of length idiag, containing the distances between the main diagonal
and each non-zero diagonal to be extracted. The distance is positive if the
diagonal is above the main diagonal, and negative if the diagonal is below
the main diagonal. The main diagonal has a distance equal to zero.
idiag INTEGER.
Number of diagonals to be extracted. For conversion to diagonal format on
return this parameter may be modified.
acsr_rem, ja_rem, ia_rem Remainder of the matrix in the CSR format if it is needed for conversion to
the diagonal format.
Output Parameters
info INTEGER. This parameter is not used now.
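The padding rule for adia can be illustrated with a small sketch. It is written in Python for brevity and is not the MKL implementation: the function name csr_to_dia is hypothetical, the diagonals are supplied by the caller in distance (no internal selection), adia is modeled as a flat column-major array with leading dimension ndiag = m, and entries that lie on diagonals not listed in distance are simply skipped rather than written to a remainder matrix.

```python
def csr_to_dia(m, acsr, ja, ia, distance):
    """Store selected diagonals of a one-based CSR matrix in diagonal format.

    Hypothetical helper for illustration only; it is not an MKL routine.
    """
    ndiag = m                                   # leading dimension, at least max(1, m)
    idiag = len(distance)
    adia = [0.0] * (ndiag * idiag)              # column k holds diagonal distance[k]
    column_of = {d: k for k, d in enumerate(distance)}
    for i in range(m):
        for p in range(ia[i] - 1, ia[i + 1] - 1):
            d = (ja[p] - 1) - i                 # positive above the main diagonal
            if d in column_of:
                # The element keeps its row number i inside the diagonal's column,
                # so lower diagonals are padded at the top, upper ones at the bottom.
                adia[i + ndiag * column_of[d]] = acsr[p]
    return adia
```

For the 3-by-3 matrix with rows (1, 0, 2), (0, 3, 0), (4, 0, 5) and distance = (-2, 0, 2), the three columns of adia are (0, 0, 4), (1, 3, 5), and (2, 0, 0).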
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
INTEGER job(8)
INTEGER m, info, ndiag, idiag
INTEGER ja(*), ia(m+1), distance(*), ja_rem(*), ia_rem(*)
REAL acsr(*), adia(*), acsr_rem(*)
SUBROUTINE mkl_dcsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
INTEGER job(8)
DOUBLE PRECISION acsr(*), adia(*), acsr_rem(*)
SUBROUTINE mkl_ccsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
INTEGER job(8)
COMPLEX acsr(*), adia(*), acsr_rem(*)
SUBROUTINE mkl_zcsrdia(job, m, acsr, ja, ia, adia, ndiag, distance, idiag, acsr_rem, ja_rem,
ia_rem, info)
INTEGER job(8)
DOUBLE COMPLEX acsr(*), adia(*), acsr_rem(*)
mkl_?csrsky
Converts a sparse matrix in CSR format to the skyline
format and vice versa.
Syntax
call mkl_scsrsky(job, m, acsr, ja, ia, asky, pointers, info)
call mkl_dcsrsky(job, m, acsr, ja, ia, asky, pointers, info)
call mkl_ccsrsky(job, m, acsr, ja, ia, asky, pointers, info)
call mkl_zcsrsky(job, m, acsr, ja, ia, asky, pointers, info)
Include Files
mkl.fi
Description
This routine converts a sparse matrix A stored in the compressed sparse row (CSR) format (3-array
variation) to the skyline format and vice versa.
Input Parameters
job INTEGER
job(1)
If job(1)=0, the matrix in the CSR format is converted to the skyline
format;
if job(1)=1, the matrix in the skyline format is converted to the CSR
format.
job(2)
job(3)
If job(3)=0, zero-based indexing for the matrix in the skyline format is
used;
if job(3)=1, one-based indexing for the matrix in the skyline format is
used.
job(4)
For conversion to the skyline format:
If job(4)=0, the upper part of the matrix A in the CSR format is converted.
If job(4)=1, the lower part of the matrix A in the CSR format is converted.
For conversion to the CSR format:
If job(4)=0, the matrix is converted to the upper part of the matrix A in
the CSR format.
If job(4)=1, the matrix is converted to the lower part of the matrix A in
the CSR format.
job(5)
job(5)=nzmax - maximum number of the non-zero elements of the matrix
A if job(1)=0.

job(6)
Only for conversion to the skyline format:
If job(6)=0, only the array pointers is filled in for the output storage.
If job(6)=1, both output arrays asky and pointers are filled in for the
output storage.
m INTEGER. Dimension of the matrix A.
acsr (input/output)
REAL for mkl_scsrsky.
DOUBLE PRECISION for mkl_dcsrsky.
COMPLEX for mkl_ccsrsky.
DOUBLE COMPLEX for mkl_zcsrsky.


details.
asky (input/output)
REAL for mkl_scsrsky.
DOUBLE PRECISION for mkl_dcsrsky.

COMPLEX for mkl_ccsrsky.
DOUBLE COMPLEX for mkl_zcsrsky.
Array. For the lower triangular part of A, it contains the set of elements
from each row, starting from the first non-zero element up to and including
the diagonal element. For the upper triangular part, it contains the set of
elements from each column of the matrix, starting with the first non-zero
element down to and including the diagonal element. Encountered zero
elements are included in the sets. Refer to values array description in
Skyline Storage Format for more details.
pointers (input/output) INTEGER.
Array with dimension (m+1), where m is the number of rows for the lower
triangle (columns for the upper triangle). pointers(i) - pointers(1) + 1
gives the index of the element in the array asky that is the first non-zero
element in row (column) i. The value of pointers(m + 1) is set to
nnz + pointers(1), where nnz is the number of elements in the array asky.
Refer to pointers array description in Skyline Storage Format for more
details.
Output Parameters
CSR format.
If info=1, the routine is interrupted because there is no space in the array
asky according to the value nzmax.
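The relationship between asky and pointers for the lower triangle can be sketched as follows. The sketch is in Python and is not the MKL implementation: the function name csr_lower_to_skyline is hypothetical, indexing is one-based with pointers(1) = 1, and a row with no entries on or below the diagonal is assumed to store a single zero for its diagonal element.

```python
def csr_lower_to_skyline(m, acsr, ja, ia):
    """Build skyline storage for the lower triangle of a one-based CSR matrix.

    Hypothetical helper for illustration only; it is not an MKL routine.
    """
    asky = []
    pointers = [1]                               # one-based, pointers(1) = 1
    for i in range(m):
        # Entries of row i that lie on or below the main diagonal.
        row = {ja[p] - 1: acsr[p]
               for p in range(ia[i] - 1, ia[i + 1] - 1)
               if ja[p] - 1 <= i}
        first = min(row) if row else i           # assumption: empty row stores the diagonal
        # Zeros met between the first non-zero and the diagonal are stored too.
        asky.extend(row.get(j, 0.0) for j in range(first, i + 1))
        pointers.append(len(asky) + 1)
    return asky, pointers
```

For the 3-by-3 matrix with rows (1, 0, 2), (0, 3, 0), (4, 0, 5), the lower triangle gives asky = (1, 3, 4, 0, 5) and pointers = (1, 2, 3, 6), so pointers(m + 1) = nnz + pointers(1) = 5 + 1.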
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrsky(job, m, acsr, ja, ia, asky, pointers, info)
INTEGER job(8)
INTEGER m, info
INTEGER ja(*), ia(m+1), pointers(m+1)
REAL acsr(*), asky(*)
SUBROUTINE mkl_dcsrsky(job, m, acsr, ja, ia, asky, pointers, info)

INTEGER job(8)
INTEGER m, info
DOUBLE PRECISION acsr(*), asky(*)
SUBROUTINE mkl_ccsrsky(job, m, acsr, ja, ia, asky, pointers, info)
INTEGER job(8)
INTEGER m, info
COMPLEX acsr(*), asky(*)
SUBROUTINE mkl_zcsrsky(job, m, acsr, ja, ia, asky, pointers, info)

INTEGER job(8)
INTEGER m, info
DOUBLE COMPLEX acsr(*), asky(*)
mkl_?csradd
Computes the sum of two matrices stored in the CSR
format (3-array variation) with one-based indexing.
Syntax
call mkl_scsradd(trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic,
nzmax, info)
call mkl_dcsradd(trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic,
nzmax, info)
call mkl_ccsradd(trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic,
nzmax, info)
call mkl_zcsradd(trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic,
nzmax, info)
Include Files
mkl.fi
Description
The mkl_?csradd routine performs a matrix-matrix operation defined as
C := A+beta*op(B)
where:
A, B, C are the sparse matrices in the CSR format (3-array variation).
op(B) is one of op(B) = B, or op(B) = B^T, or op(B) = B^H
beta is a scalar.
The routine works correctly if and only if the column indices in sparse matrix representations of matrices A
and B are arranged in the increasing order for each row. If not, use the parameter sort (see below) to
reorder column indices and the corresponding elements of the input matrices.
NOTE
Input Parameters
trans CHARACTER*1. Specifies the operation.

If trans = 'N' or 'n', then C := A+beta*B
If trans = 'T' or 't', then C := A+beta*B^T
If trans = 'C' or 'c', then C := A+beta*B^H.
request INTEGER.
If request=0, the routine performs addition. The memory for the output
arrays ic, jc, c must be allocated beforehand.
If request=1, the routine only computes the values of the array ic of
length m + 1. The memory for the ic array must be allocated beforehand.
On exit the value ic(m+1) - 1 is the actual number of the elements in the
arrays c and jc.
If request=2, the routine performs addition. Use this value only after the
routine has been called with request=1 and the output arrays jc and c have
been allocated in the calling program with length at least ic(m+1) - 1.
sort INTEGER. Specifies the type of reordering. If this parameter is not set
(default), the routine does not perform reordering.
If sort=1, the routine arranges the column indices ja for each row in the
increasing order and reorders the corresponding values of the matrix A in
the array a.
If sort=2, the routine arranges the column indices jb for each row in the
increasing order and reorders the corresponding values of the matrix B in
the array b.
If sort=3, the routine performs reordering for both input matrices A and B.
a REAL for mkl_scsradd.

DOUBLE PRECISION for mkl_dcsradd.
COMPLEX for mkl_ccsradd.
DOUBLE COMPLEX for mkl_zcsradd.
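ja INTEGER. Array containing the column indices for each non-zero element of
the matrix A. For each row the column indices must be arranged in the
increasing order.
The length of this array is equal to the length of the array a. Refer to
columns array description in Sparse Matrix Storage Formats for more
details.
ia INTEGER. Array of length m + 1, containing indices of elements in the array
a, such that ia(i) is the index in the array a of the first non-zero element
from the row i. The value of the last element ia(m + 1) is equal to the
number of non-zero elements of the matrix A plus one. Refer to rowIndex
array description in Sparse Matrix Storage Formats for more details.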
beta REAL for mkl_scsradd.

b REAL for mkl_scsradd.

Array containing non-zero elements of the matrix B. Its length is equal to
the number of non-zero elements in the matrix B. Refer to values array
description in Sparse Matrix Storage Formats for more details.
jb INTEGER. Array containing the column indices for each non-zero element of
the matrix B. For each row the column indices must be arranged in the
increasing order.
The length of this array is equal to the length of the array b. Refer to
columns array description in Sparse Matrix Storage Formats for more
details.
ib INTEGER. Array of length m + 1 when trans = 'N' or 'n', or n + 1
otherwise.
This array contains indices of elements in the array b, such that ib(i) is
the index in the array b of the first non-zero element from the row i. The
value of the last element ib(m + 1) or ib(n + 1) is equal to the number
of non-zero elements of the matrix B plus one. Refer to rowIndex array
description in Sparse Matrix Storage Formats for more details.
nzmax INTEGER. The length of the arrays c and jc.

This parameter is used only if request=0. The routine stops calculation if
the number of elements in the result matrix C exceeds the specified value
of nzmax.
Output Parameters
c REAL for mkl_scsradd.

Array containing non-zero elements of the result matrix C. Its length is
equal to the number of non-zero elements in the matrix C. Refer to values
array description in Sparse Matrix Storage Formats for more details.
jc INTEGER. Array containing the column indices for each non-zero element of
the matrix C.
The length of this array is equal to the length of the array c. Refer to
columns array description in Sparse Matrix Storage Formats for more
details.
ic INTEGER. Array of length m + 1, containing indices of elements in the array
c, such that ic(i) is the index in the array c of the first non-zero element
from the row i. The value of the last element ic(m + 1) is equal to the
number of non-zero elements of the matrix C plus one. Refer to rowIndex
array description in Sparse Matrix Storage Formats for more details.
info INTEGER.
If info=I>0, the routine stops calculation in the I-th row of the matrix C
because the number of elements in C exceeds nzmax.
If info=-1, the routine calculates only the size of the arrays c and jc and
returns this value plus 1 as the last element of the array ic.
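The two-stage calling pattern (request=1 to size the result, then request=2 to fill it) can be sketched in Python. This is an illustration under stated assumptions, not the MKL implementation: the function names are hypothetical, only the trans = 'N' case is covered, the column indices of each row are assumed to be sorted already, and entries that cancel to zero are kept in the result.

```python
def csr_add_row_pointers(m, ja, ia, jb, ib):
    """Stage 1 (analogous to request=1): compute only the row pointers ic."""
    ic = [1]
    for i in range(m):
        cols = set(ja[ia[i] - 1:ia[i + 1] - 1]) | set(jb[ib[i] - 1:ib[i + 1] - 1])
        ic.append(ic[-1] + len(cols))
    return ic


def csr_add_fill(m, a, ja, ia, beta, b, jb, ib, ic):
    """Stage 2 (analogous to request=2): fill c and jc for C := A + beta*B."""
    nnz = ic[m] - 1                        # ic(m + 1) - 1 entries are needed
    c = [0.0] * nnz
    jc = [0] * nnz
    for i in range(m):
        pa, pa_end = ia[i] - 1, ia[i + 1] - 1
        pb, pb_end = ib[i] - 1, ib[i + 1] - 1
        q = ic[i] - 1
        while pa < pa_end or pb < pb_end:  # merge two sorted index lists
            col_a = ja[pa] if pa < pa_end else float("inf")
            col_b = jb[pb] if pb < pb_end else float("inf")
            col = min(col_a, col_b)
            value = 0.0
            if col_a == col:
                value += a[pa]
                pa += 1
            if col_b == col:
                value += beta * b[pb]
                pb += 1
            jc[q] = col
            c[q] = value
            q += 1
    return c, jc
```

The merge step relies on sorted column indices within each row, which is the same precondition the routine description states for the input matrices.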
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsradd( trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic, nzmax,
info)
CHARACTER trans
INTEGER request, sort, m, n, nzmax, info
INTEGER ja(*), jb(*), jc(*), ia(*), ib(*), ic(*)
REAL a(*), b(*), c(*), beta
SUBROUTINE mkl_dcsradd( trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic, nzmax,
info)
CHARACTER trans
DOUBLE PRECISION a(*), b(*), c(*), beta
SUBROUTINE mkl_ccsradd( trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic, nzmax,
info)
CHARACTER trans
COMPLEX a(*), b(*), c(*), beta
SUBROUTINE mkl_zcsradd( trans, request, sort, m, n, a, ja, ia, beta, b, jb, ib, c, jc, ic, nzmax,
info)
CHARACTER trans
DOUBLE COMPLEX a(*), b(*), c(*), beta
mkl_?csrmultcsr
Computes product of two sparse matrices stored in
the CSR format (3-array variation) with one-based
indexing.
Syntax
call mkl_scsrmultcsr(trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
call mkl_dcsrmultcsr(trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
call mkl_ccsrmultcsr(trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
call mkl_zcsrmultcsr(trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
Include Files
mkl.fi
Description
The mkl_?csrmultcsr routine performs a matrix-matrix operation defined as
C := op(A)*B
where:
A, B, C are the sparse matrices in the CSR format (3-array variation);
op(A) is one of op(A) = A, or op(A) = A^T, or op(A) = A^H.
You can use the parameter sort to perform or not perform reordering of non-zero entries in input and output
sparse matrices. The purpose of reordering is to rearrange non-zero entries in compressed sparse row matrix
so that column indices in compressed sparse representation are sorted in the increasing order for each row.
The following table shows correspondence between the value of the parameter sort and the type of
reordering performed by this routine for each sparse matrix involved:
Value of the        Reordering of A       Reordering of B       Reordering of C
parameter sort      (arrays a, ja, ia)    (arrays b, jb, ib)    (arrays c, jc, ic)
1                   yes                   no                    yes
2                   no                    yes                   yes
3                   yes                   yes                   yes
4                   yes                   no                    no
5                   no                    yes                   no
6                   yes                   yes                   no
7                   no                    no                    no
arbitrary value     no                    no                    yes
not equal to
1, 2,..., 7
NOTE
Input Parameters

trans CHARACTER*1. Specifies the operation.
If trans = 'N' or 'n', then C := A*B
If trans = 'T' or 't' or 'C' or 'c', then C := A^T*B.
request INTEGER.
If request=0, the routine performs multiplication, the memory for the
output arrays ic, jc, c must be allocated beforehand.
If request=1, the routine computes only values of the array ic of length
m + 1; the memory for this array must be allocated beforehand. On exit the
value ic(m+1) - 1 is the actual number of the elements in the arrays c
and jc.
If request=2, the routine assumes that it has been called previously with
request=1 and that the output arrays jc and c have been allocated in the
calling program with length at least ic(m+1) - 1.
sort INTEGER. Specifies whether the routine performs reordering of non-zero
entries in input and/or output sparse matrices (see table above).
k INTEGER. Number of columns of the matrix B.
a REAL for mkl_scsrmultcsr.

DOUBLE PRECISION for mkl_dcsrmultcsr.
COMPLEX for mkl_ccsrmultcsr.
DOUBLE COMPLEX for mkl_zcsrmultcsr.
increasing order.
details.
ia INTEGER. Array of length m + 1.

This array contains indices of elements in the array a, such that ia(i) is
the index in the array a of the first non-zero element from the row i. The
value of the last element ia(m + 1) is equal to the number of non-zero
elements of the matrix A plus one. Refer to rowIndex array description in
Sparse Matrix Storage Formats for more details.
b REAL for mkl_scsrmultcsr.

increasing order.
details.
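ib INTEGER. Array of length n + 1 when trans = 'N' or 'n', or m + 1
otherwise.
This array contains indices of elements in the array b, such that ib(i) is
the index in the array b of the first non-zero element from the row i. The
value of the last element ib(n + 1) or ib(m + 1) is equal to the number
of non-zero elements of the matrix B plus one. Refer to rowIndex array
description in Sparse Matrix Storage Formats for more details.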
nzmax INTEGER. The length of the arrays c and jc.

This parameter is used only if request=0. The routine stops calculation if
the number of elements in the result matrix C exceeds the specified value
of nzmax.
Output Parameters
c REAL for mkl_scsrmultcsr.

Array containing non-zero elements of the result matrix C. Its length is
equal to the number of non-zero elements in the matrix C. Refer to values
array description in Sparse Matrix Storage Formats for more details.
jc INTEGER. Array containing the column indices for each non-zero element of
the matrix C.
The length of this array is equal to the length of the array c. Refer to
columns array description in Sparse Matrix Storage Formats for more
details.
ic INTEGER. Array of length m + 1 when trans = 'N' or 'n', or n + 1
otherwise.
This array contains indices of elements in the array c, such that ic(i) is
the index in the array c of the first non-zero element from the row i. The
value of the last element ic(m + 1) or ic(n + 1) is equal to the number
of non-zero elements of the matrix C plus one. Refer to rowIndex array
description in Sparse Matrix Storage Formats for more details.
info INTEGER.
If info=I>0, the routine stops calculation in the I-th row of the matrix C
because the number of elements in C exceeds nzmax.
If info=-1, the routine calculates only the size of the arrays c and jc and
returns this value plus 1 as the last element of the array ic.
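A row-by-row product of two CSR matrices with a sorted CSR result can be sketched as follows. The sketch is in Python and is not the MKL implementation: the function name csr_mult_csr is hypothetical, only the trans = 'N' case is covered, indexing is one-based, and the output columns are always sorted, which corresponds to the table rows where C is reordered.

```python
def csr_mult_csr(m, a, ja, ia, b, jb, ib):
    """Compute C := A*B for one-based 3-array CSR inputs, returning (c, jc, ic).

    Hypothetical helper for illustration only; it is not an MKL routine.
    """
    c, jc, ic = [], [], [1]
    for i in range(m):
        acc = {}                                  # column index -> accumulated value
        for p in range(ia[i] - 1, ia[i + 1] - 1):
            r = ja[p] - 1                         # row of B selected by this entry of A
            for q in range(ib[r] - 1, ib[r + 1] - 1):
                acc[jb[q]] = acc.get(jb[q], 0.0) + a[p] * b[q]
        for col in sorted(acc):                   # emit columns in increasing order
            jc.append(col)
            c.append(acc[col])
        ic.append(len(c) + 1)                     # one-based pointer to the next row
    return c, jc, ic
```

On exit ic(m + 1) - 1 equals the number of stored entries, which is the quantity a request=1 call reports before the arrays c and jc are allocated.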
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrmultcsr( trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
CHARACTER*1 trans
INTEGER request, sort, m, n, k, nzmax, info
REAL a(*), b(*), c(*)
SUBROUTINE mkl_dcsrmultcsr( trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
CHARACTER*1 trans
DOUBLE PRECISION a(*), b(*), c(*)
SUBROUTINE mkl_ccsrmultcsr( trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
CHARACTER*1 trans
COMPLEX a(*), b(*), c(*)
SUBROUTINE mkl_zcsrmultcsr( trans, request, sort, m, n, k, a, ja, ia, b, jb, ib, c, jc, ic,
nzmax, info)
CHARACTER*1 trans
DOUBLE COMPLEX a(*), b(*), c(*)
mkl_?csrmultd
Computes product of two sparse matrices stored in
the CSR format (3-array variation) with one-based
indexing. The result is stored in the dense matrix.
Syntax
call mkl_scsrmultd(trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
call mkl_dcsrmultd(trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
call mkl_ccsrmultd(trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
call mkl_zcsrmultd(trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
Include Files
mkl.fi
Description
The mkl_?csrmultd routine performs a matrix-matrix operation defined as
C := op(A)*B
where:
A, B are the sparse matrices in the CSR format (3-array variation), C is dense matrix;
op(A) is one of op(A) = A, or op(A) = A^T, or op(A) = A^H.
The routine works correctly if and only if the column indices in sparse matrix representations of matrices A
and B are arranged in the increasing order for each row.
NOTE
Input Parameters

trans CHARACTER*1. Specifies the operation.
If trans = 'N' or 'n', then C := A*B
If trans = 'T' or 't' or 'C' or 'c', then C := A^T*B.
k INTEGER. Number of columns of the matrix B.
a REAL for mkl_scsrmultd.

DOUBLE PRECISION for mkl_dcsrmultd.
COMPLEX for mkl_ccsrmultd.
DOUBLE COMPLEX for mkl_zcsrmultd.
increasing order.
details.
ia INTEGER. Array of length m + 1 when trans = 'N' or 'n', or n + 1

otherwise.
This array contains indices of elements in the array a, such that ia(i) is
the index in the array a of the first non-zero element from the row i. The
value of the last element ia(m + 1) or ia(n + 1) is equal to the number
of non-zero elements of the matrix A plus one. Refer to rowIndex array
description in Sparse Matrix Storage Formats for more details.
b REAL for mkl_scsrmultd.

increasing order.
details.
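ib INTEGER. Array of length m + 1.
This array contains indices of elements in the array b, such that ib(i) is
the index in the array b of the first non-zero element from the row i. The
value of the last element ib(m + 1) is equal to the number of non-zero
elements of the matrix B plus one. Refer to rowIndex array description in
Sparse Matrix Storage Formats for more details.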
Output Parameters
c REAL for mkl_scsrmultd.

Array containing non-zero elements of the result matrix C.
ldc INTEGER. Specifies the leading dimension of the dense matrix C as declared
in the calling (sub)program. Must be at least max(m, 1) when trans =
'N' or 'n', or max(1, n) otherwise.
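The role of ldc in the dense result can be illustrated with a short sketch. It is written in Python and is not the MKL implementation: the function name csr_mult_dense is hypothetical, only the trans = 'N' case is covered, indexing of the CSR inputs is one-based, and the dense matrix C is modeled as a flat column-major array in which the element in row i and column j (zero-based) is stored at position i + ldc*j.

```python
def csr_mult_dense(m, k, a, ja, ia, b, jb, ib, ldc):
    """Compute C := A*B for one-based CSR inputs into a column-major dense array.

    Hypothetical helper for illustration only; it is not an MKL routine.
    """
    if ldc < max(m, 1):
        raise ValueError("ldc must be at least max(m, 1)")
    c = [0.0] * (ldc * k)                      # k columns, each of stride ldc
    for i in range(m):
        for p in range(ia[i] - 1, ia[i + 1] - 1):
            r = ja[p] - 1                      # row of B selected by this entry of A
            for q in range(ib[r] - 1, ib[r + 1] - 1):
                j = jb[q] - 1
                c[i + ldc * j] += a[p] * b[q]  # column-major addressing
    return c
```

With ldc larger than m, the extra positions at the end of each column are left untouched, which is how a sub-block of a larger dense array would be addressed.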
Interfaces
FORTRAN 77:
SUBROUTINE mkl_scsrmultd( trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)
CHARACTER*1 trans
INTEGER m, n, k, ldc
INTEGER ja(*), jb(*), ia(*), ib(*)
REAL a(*), b(*), c(ldc, *)
SUBROUTINE mkl_dcsrmultd( trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)

CHARACTER*1 trans
DOUBLE PRECISION a(*), b(*), c(ldc, *)
SUBROUTINE mkl_ccsrmultd( trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)

CHARACTER*1 trans
COMPLEX a(*), b(*), c(ldc, *)
SUBROUTINE mkl_zcsrmultd( trans, m, n, k, a, ja, ia, b, jb, ib, c, ldc)

CHARACTER*1 trans
DOUBLE COMPLEX a(*), b(*), c(ldc, *)
Inspector-executor Sparse BLAS Routines

The inspector-executor API for Sparse BLAS divides operations into two stages: analysis and execution.
During the initial analysis stage, the API inspects the matrix sparsity pattern and applies matrix structure
changes. In the execution stage, subsequent routine calls reuse this information in order to improve
performance.
The inspector-executor API supports key Sparse BLAS operations for iterative sparse solvers:
Sparse matrix-vector multiplication

Sparse matrix-matrix multiplication with a sparse or dense result
Solution of triangular systems
Sparse matrix addition
Naming conventions in Inspector-executor Sparse BLAS Routines

The Inspector-executor Sparse BLAS API routine names use the following convention:
mkl_sparse_[<character>_]<operation>[_<format>]
The data type is included in the name only if the function accepts dense matrix or scalar floating point
parameters.
The <operation> field indicates the type of operation:
create create matrix handle
copy create a copy of matrix handle
convert convert matrix between sparse formats
export export matrix from internal representation to CSR or BSR format
destroy frees memory allocated for matrix handle
set_<op>_hint provide information about number of upcoming compute operations and
operation type for optimization purposes, where <op> is mv, sv, mm, sm, or
memory
optimize analyze the matrix using hints and store optimization information in matrix
handle
mv compute sparse matrix-vector product
mm compute sparse matrix by dense matrix product (batch mv)
spmm/spmmd compute sparse matrix by sparse matrix product and store the result as a
sparse/dense matrix
trsv solve a triangular system
trsm solve a triangular system with multiple right-hand sides
add compute sum of two sparse matrices
The <format> field indicates the sparse matrix storage format:
coo coordinate format
csr compressed sparse row format plus variations
csc compressed sparse column format plus variations
bsr block sparse row format plus variations
The format is included in the function name only if the function parameters include an explicit sparse matrix
in one of the conventional sparse matrix formats.
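The naming pattern can be expressed as a small string-building sketch. The helper routine_name below is hypothetical and exists only to illustrate how the optional fields combine; it is written in Python and does not call the library.

```python
def routine_name(operation, character=None, fmt=None):
    """Compose a name following mkl_sparse_[<character>_]<operation>[_<format>].

    Hypothetical helper for illustration only.
    """
    parts = ["mkl_sparse"]
    if character:                 # s, d, c, or z; present only for typed routines
        parts.append(character)
    parts.append(operation)
    if fmt:                       # coo, csr, csc, or bsr; present only for explicit formats
        parts.append(fmt)
    return "_".join(parts)


print(routine_name("create", character="d", fmt="csr"))
print(routine_name("copy"))
print(routine_name("convert", fmt="csr"))
```

The three calls produce mkl_sparse_d_create_csr, mkl_sparse_copy, and mkl_sparse_convert_csr.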
Sparse Matrix Storage Formats for Inspector-executor Sparse BLAS Routines

Inspector-executor Sparse BLAS routines support four conventional sparse matrix storage formats:
compressed sparse row format (CSR) plus variations

compressed sparse column format (CSC) plus variations
coordinate format (COO)
block sparse row format (BSR) plus variations
Computational routines operate on a matrix handle that stores a matrix in CSR or BSR formats. Other
formats should be converted to CSR or BSR format before calling any computational routines. For more
information see Sparse Matrix Storage Formats.
Supported Inspector-executor Sparse BLAS Operations

The Inspector-executor Sparse BLAS API can perform several operations involving sparse matrices. These
notations are used in the description of the operations:
A, G, V are sparse matrices

B and C are dense matrices
x and y are dense vectors
alpha and beta are scalars
op(A) represents a possible transposition of matrix A
op(A) = A
op(A) = A^T - transpose of A
op(A) = A^H - conjugate transpose of A
op(A)^(-1) denotes the inverse of op(A).
The Inspector-executor Sparse BLAS routines support the following operations:

computing the vector product between a sparse matrix and a dense vector:
y := alpha*op(A)*x + beta*y
solving a single triangular system:
y := alpha*inv(op(A))*x
computing a product between a sparse matrix and a dense matrix:
C := alpha*op(A)*B + beta*C
computing a product between sparse matrices with a sparse result:
V := alpha*op(A)*G
computing a product between sparse matrices with a dense result:
C := alpha*op(A)*G
computing a sum of sparse matrices with a sparse result:
V := alpha*op(A) + G
solving a sparse triangular system with multiple right-hand sides:
C := alpha*inv(op(A))*B
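As a concrete reading of the notation, the sparse-by-dense product C := alpha*op(A)*B + beta*C can be sketched in Python. This is an illustration, not the library implementation: the function name sparse_mm is hypothetical, A is given in zero-based 3-array CSR form, B and C are lists of rows, and only op(A) = A and op(A) = A^T are modeled.

```python
def sparse_mm(alpha, a, ja, ia, B, beta, C, transpose=False):
    """Return alpha*op(A)*B + beta*C for a zero-based CSR matrix A.

    Hypothetical helper for illustration only; B and C are lists of rows.
    """
    out = [[beta * x for x in row] for row in C]
    for i in range(len(ia) - 1):
        for p in range(ia[i], ia[i + 1]):
            # Entry a[p] sits at (i, ja[p]) in A and at (ja[p], i) in A^T.
            r, s = (ja[p], i) if transpose else (i, ja[p])
            for j in range(len(B[s])):
                out[r][j] += alpha * a[p] * B[s][j]
    return out
```

Taking B with a single column reduces the same loop to the matrix-vector form y := alpha*op(A)*x + beta*y.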
Matrix manipulation routines

The Matrix Manipulation routines table lists the Matrix Manipulation routines and the data types associated
with them.
Matrix Manipulation Routines and Their Data Types
Function Group             Data Types   Description
mkl_sparse_?_create_csr    s, d, c, z   Creates a handle for a CSR format matrix.
mkl_sparse_?_create_csc    s, d, c, z   Creates a handle for a CSC format matrix.
mkl_sparse_?_create_coo    s, d, c, z   Creates a handle for a matrix in COO format.
mkl_sparse_?_create_bsr    s, d, c, z   Creates a handle for a matrix in BSR format.
mkl_sparse_copy            NA           Creates a copy of a matrix handle.
mkl_sparse_destroy         NA           Frees memory allocated for matrix handle.
mkl_sparse_convert_csr     NA           Converts internal matrix representation to CSR format.
mkl_sparse_convert_bsr     NA           Converts internal matrix representation to BSR format or
                                        changes BSR block size.
mkl_sparse_?_export_csr    s, d, c, z   Exports CSR matrix from internal representation.
mkl_sparse_?_export_bsr    s, d, c, z   Exports BSR matrix from internal representation.
mkl_sparse_?_set_value     s, d, c, z   Changes a single value of matrix in internal
                                        representation.
mkl_sparse_?_create_csr
Creates a handle for a CSR format matrix.
Syntax
stat = mkl_sparse_s_create_csr (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_d_create_csr (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_c_create_csr (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_z_create_csr (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_create_csr routine creates a handle for an m-by-k matrix A in CSR format.
Input Parameters
indexing sparse_index_base_t.
Indicates how input arrays are indexed.
SPARSE_INDEX_BASE_ZERO  Zero-based (C-style) indexing: indices start at 0.
SPARSE_INDEX_BASE_ONE   One-based (Fortran-style) indexing: indices start at 1.
rows C_INT.
Number of rows of matrix A.
cols C_INT.
Number of columns of matrix A.
rows_start C_INT.
Array of length at least m. This array contains row indices, such that
rows_start(i) - rows_start(1) is the first index of row i in the arrays
values and col_indx.
rows_end C_INT.
Array of length at least m. This array contains row indices, such that
rows_end(i) - rows_start(1) - 1 is the last index of row i in the arrays
values and col_indx.
col_indx C_INT.
For one-based indexing, array containing the column indices plus one for
each non-zero element of the matrix A. For zero-based indexing, array
containing the column indices for each non-zero element of the matrix A.
Its length is at least rows_end(rows) - rows_start(1).
values C_FLOAT for mkl_sparse_s_create_csr

C_DOUBLE for mkl_sparse_d_create_csr
C_FLOAT_COMPLEX for mkl_sparse_c_create_csr
C_DOUBLE_COMPLEX for mkl_sparse_z_create_csr
Array containing non-zero elements of the matrix A. Its length is equal to the
length of the col_indx array.
Output Parameters
A SPARSE_MATRIX_T.
Handle containing internal data for subsequent Inspector-executor Sparse
BLAS operations.
sparse_status_t INTEGER.
Value indicating whether the operation was successful or not, and why:
SPARSE_STATUS_SUCCESS           The operation was successful.
SPARSE_STATUS_NOT_INITIALIZED   The routine encountered an empty handle or
                                matrix array.
SPARSE_STATUS_ALLOC_FAILED      Internal memory allocation failed.
SPARSE_STATUS_INVALID_VALUE     The input parameters contain an invalid value.
SPARSE_STATUS_EXECUTION_FAILED  Execution failed.
SPARSE_STATUS_INTERNAL_ERROR    An error in algorithm implementation occurred.
SPARSE_STATUS_NOT_SUPPORTED     The requested operation is not supported.
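How rows_start and rows_end delimit a row can be sketched in Python. This is an illustration of the offset formulas given above, not a call into the library: both function names are hypothetical, the arrays follow the one-based convention, and the row pointers are derived from a conventional 3-array row-pointer array of length rows + 1.

```python
def split_row_pointers(ia):
    """Derive rows_start and rows_end from a 3-array CSR row-pointer array."""
    return ia[:-1], ia[1:]


def row_entries(i, rows_start, rows_end, col_indx, values):
    """Return the (column, value) pairs of one-based row i.

    Hypothetical helper for illustration only; it is not an MKL routine.
    """
    base = rows_start[0]                    # plays the role of rows_start(1)
    first = rows_start[i - 1] - base        # rows_start(i) - rows_start(1)
    last = rows_end[i - 1] - base - 1       # rows_end(i) - rows_start(1) - 1
    return [(col_indx[p], values[p]) for p in range(first, last + 1)]
```

For the row pointers (1, 3, 4, 6), rows_start is (1, 3, 4) and rows_end is (3, 4, 6), so row 3 covers offsets 3 through 4 of col_indx and values.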
mkl_sparse_?_create_csc
Creates a handle for a CSC format matrix.
Syntax
stat = mkl_sparse_s_create_csc (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_d_create_csc (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_c_create_csc (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_z_create_csc (A, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_create_csc routine creates a handle for an m-by-k matrix A in CSC format.
Input Parameters

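indexing Indicates how input arrays are indexed:
SPARSE_INDEX_BASE_ZERO Zero-based (C-style) indexing: indices start at 0.
SPARSE_INDEX_BASE_ONE One-based (Fortran-style) indexing: indices start at 1.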
rows C_INT.
Number of rows of the matrix A.
cols C_INT.
Number of columns of the matrix A.
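rows_start C_INT.
Array of length at least m. This array contains row indices, such that
rows_start(i) - rows_start(1) is the first index of row i in the arrays
values and col_indx.
rows_end C_INT.
Array of length at least m. This array contains row indices, such that
rows_end(i) - rows_start(1) - 1 is the last index of row i in the arrays
values and col_indx.
col_indx C_INT.
For one-based indexing, array containing the column indices plus one for
each non-zero element of the matrix A. For zero-based indexing, array
containing the column indices for each non-zero element of the matrix A.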
Its length is at least rows_end(rows - 1) - rows_start(1).
values C_FLOAT for mkl_sparse_s_create_csc
C_DOUBLE for mkl_sparse_d_create_csc
C_FLOAT_COMPLEX for mkl_sparse_c_create_csc
C_DOUBLE_COMPLEX for mkl_sparse_z_create_csc
Array containing non-zero elements of the matrix A. Its length is equal to the
length of the col_indx array.
Output Parameters
A SPARSE_MATRIX_T.
Handle containing internal data.

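stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.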
mkl_sparse_?_create_coo
Creates a handle for a matrix in COO format.
Syntax
stat = mkl_sparse_s_create_coo (A, indexing, rows, cols, nnz, row_indx, col_indx,
values)
stat = mkl_sparse_d_create_coo (A, indexing, rows, cols, nnz, row_indx, col_indx,
values)
stat = mkl_sparse_c_create_coo (A, indexing, rows, cols, nnz, row_indx, col_indx,
values)
stat = mkl_sparse_z_create_coo (A, indexing, rows, cols, nnz, row_indx, col_indx,
values)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_create_coo routine creates a handle for an m-by-k matrix A in COO format.
Input Parameters

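indexing Indicates how input arrays are indexed:
SPARSE_INDEX_BASE_ZERO Zero-based (C-style) indexing: indices start at 0.
SPARSE_INDEX_BASE_ONE One-based (Fortran-style) indexing: indices start at 1.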
rows C_INT.
Number of rows of matrix A.
cols C_INT.
Number of columns of matrix A.
nnz C_INT.
Specifies the number of non-zero elements of the matrix A.
row_indx C_INT.
Array of length nnz, containing the row indices for each non-zero element
of matrix A.
col_indx C_INT.
Array of length nnz, containing the column indices for each non-zero
element of matrix A.
values C_FLOAT for mkl_sparse_s_create_coo

C_DOUBLE for mkl_sparse_d_create_coo
C_FLOAT_COMPLEX for mkl_sparse_c_create_coo
C_DOUBLE_COMPLEX for mkl_sparse_z_create_coo
Array of length nnz, containing the non-zero elements of matrix A in
arbitrary order.
Output Parameters
A SPARSE_MATRIX_T.
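Handle containing internal data.
stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.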
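As an illustration of the COO interface, the sketch below builds a small matrix from coordinate triplets and then converts the handle to CSR with mkl_sparse_convert_csr. It assumes the mkl_spblas module is available and the LP64 interface is linked; the matrix data, the program name, and the helper check are hypothetical.

```fortran
program coo_to_csr_sketch
  use, intrinsic :: iso_c_binding, only: c_int, c_double
  use mkl_spblas            ! module provided by mkl_spblas.f90 (assumed compiled)
  implicit none

  integer, parameter :: m = 3, k = 3, nnz = 5
  ! One-based (row, column, value) triplets of an illustrative 3-by-3 matrix
  integer :: row_indx(nnz) = (/ 1, 1, 2, 3, 3 /)
  integer :: col_indx(nnz) = (/ 1, 3, 2, 1, 3 /)
  real(c_double) :: values(nnz) = (/ 1.0d0, 2.0d0, 3.0d0, 4.0d0, 5.0d0 /)

  type(sparse_matrix_t) :: coo, csr
  integer(c_int) :: stat

  stat = mkl_sparse_d_create_coo(coo, SPARSE_INDEX_BASE_ONE, m, k, nnz, &
                                 row_indx, col_indx, values)
  call check(stat, 'mkl_sparse_d_create_coo')

  ! Convert the internal representation to CSR without transposing.
  stat = mkl_sparse_convert_csr(coo, SPARSE_OPERATION_NON_TRANSPOSE, csr)
  call check(stat, 'mkl_sparse_convert_csr')
  print *, 'COO handle converted to CSR'

  ! Both handles own internal data and are released separately.
  stat = mkl_sparse_destroy(coo)
  call check(stat, 'mkl_sparse_destroy (coo)')
  stat = mkl_sparse_destroy(csr)
  call check(stat, 'mkl_sparse_destroy (csr)')

contains

  ! Hypothetical helper: stop with a message on any non-success status.
  subroutine check(s, what)
    integer(c_int), intent(in) :: s
    character(len=*), intent(in) :: what
    if (s /= SPARSE_STATUS_SUCCESS) then
       print *, what, ' failed, stat =', s
       error stop
    end if
  end subroutine check
end program coo_to_csr_sketch
```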
mkl_sparse_?_create_bsr
Creates a handle for a matrix in BSR format.
Syntax
stat = mkl_sparse_s_create_bsr (A, indexing, block_layout, rows, cols, block_size,
rows_start, rows_end, col_indx, values)
stat = mkl_sparse_d_create_bsr (A, indexing, block_layout, rows, cols, block_size,
rows_start, rows_end, col_indx, values)
stat = mkl_sparse_c_create_bsr (A, indexing, block_layout, rows, cols, block_size,
rows_start, rows_end, col_indx, values)
stat = mkl_sparse_z_create_bsr (A, indexing, block_layout, rows, cols, block_size,
rows_start, rows_end, col_indx, values)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_create_bsr routine creates a handle for an m-by-k matrix A in BSR format.
Input Parameters

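indexing Indicates how input arrays are indexed:
SPARSE_INDEX_BASE_ZERO Zero-based (C-style) indexing: indices start at 0.
SPARSE_INDEX_BASE_ONE One-based (Fortran-style) indexing: indices start at 1.
block_layout C_INT.
Specifies layout of blocks:
SPARSE_LAYOUT_ROW_MAJOR Storage of elements of blocks uses row major layout.
SPARSE_LAYOUT_COLUMN_MAJOR Storage of elements of blocks uses column major layout.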
NOTE
If you specify SPARSE_INDEX_BASE_ZERO for indexing, you must use
SPARSE_LAYOUT_ROW_MAJOR for block_layout. Similarly, if you
specify SPARSE_INDEX_BASE_ONE for indexing, you must use
SPARSE_LAYOUT_COLUMN_MAJOR for block_layout. Otherwise
mkl_sparse_?_create_bsr returns stat =
SPARSE_STATUS_NOT_SUPPORTED.
rows C_INT.
Number of block rows of matrix A.
cols C_INT.
Number of block columns of matrix A.
block_size C_INT.
Size of blocks in matrix A.
rows_start C_INT.
Array of length m. This array contains row indices, such that
rows_start(i) - rows_start(1) is the first index of block row i in the
arrays values and col_indx.
rows_end C_INT.
Array of length m. This array contains row indices, such that rows_end(i)
- rows_start(1) - 1 is the last index of block row i in the arrays values
and col_indx.
col_indx C_INT.
For one-based indexing, array containing the column indices plus one for
each non-zero block of the matrix A. For zero-based indexing, array
containing the column indices for each non-zero block of the matrix A. Its
length is rows_end(rows) - rows_start(1).
values C_FLOAT for mkl_sparse_s_create_bsr
C_DOUBLE for mkl_sparse_d_create_bsr
C_FLOAT_COMPLEX for mkl_sparse_c_create_bsr
C_DOUBLE_COMPLEX for mkl_sparse_z_create_bsr
Array containing non-zero elements of the matrix A. Its length is equal to the
length of the col_indx array multiplied by block_size*block_size.
Output Parameters
A SPARSE_MATRIX_T.

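Handle containing internal data.
stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.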
mkl_sparse_copy
Creates a copy of a matrix handle.
Syntax
stat = mkl_sparse_copy (source, descr, dest)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_copy routine creates a copy of a matrix handle.
Input Parameters
source SPARSE_MATRIX_T.
Specifies handle containing internal data.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties.
type - Specifies the type of a sparse matrix:
SPARSE_MATRIX_TYPE_GENERAL The matrix is processed as is.
SPARSE_MATRIX_TYPE_SYMMETRIC The matrix is symmetric (only the requested triangle is processed).
SPARSE_MATRIX_TYPE_HERMITIAN The matrix is Hermitian (only the requested triangle is processed).
SPARSE_MATRIX_TYPE_TRIANGULAR The matrix is triangular (only the requested triangle is processed).
SPARSE_MATRIX_TYPE_DIAGONAL The matrix is diagonal (only diagonal elements are processed).
SPARSE_MATRIX_TYPE_BLOCK_TRIANGULAR The matrix is block-triangular (only the requested triangle is processed). (Applies to BSR format only.)
SPARSE_MATRIX_TYPE_BLOCK_DIAGONAL The matrix is block-diagonal (only diagonal blocks are processed). (Applies to BSR format only.)
mode - Specifies the triangular matrix part for symmetric, Hermitian, triangular, and block-triangular matrices:
SPARSE_FILL_MODE_LOWER The lower triangular matrix part is processed.
SPARSE_FILL_MODE_UPPER The upper triangular matrix part is processed.
diag - Specifies diagonal type for non-general matrices:
SPARSE_DIAG_NON_UNIT Diagonal elements might not be equal to one.
SPARSE_DIAG_UNIT Diagonal elements are equal to one.
Output Parameters
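dest SPARSE_MATRIX_T.
Handle containing internal data.
stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.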
mkl_sparse_destroy
Frees memory allocated for matrix handle.
Syntax
stat = mkl_sparse_destroy (A)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_destroy routine frees memory allocated for a matrix handle.
NOTE
You must free memory allocated for matrices after you finish using them. The mkl_sparse_destroy
routine provides a utility to do so.
Input Parameters
A SPARSE_MATRIX_T.
Output Parameters

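stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.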
mkl_sparse_convert_csr
Converts internal matrix representation to CSR
format.
Syntax
stat = mkl_sparse_convert_csr (source, operation, dest)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_convert_csr routine converts internal matrix representation to CSR format.
Input Parameters
source SPARSE_MATRIX_T.
Specifies handle containing internal data.
operation C_INT.
Specifies operation op() on input matrix.
SPARSE_OPERATION_NON_TRANSPOSE Non-transpose, op(A) = A.
SPARSE_OPERATION_TRANSPOSE Transpose, op(A) = A^T.
SPARSE_OPERATION_CONJUGATE_TRANSPOSE Conjugate transpose, op(A) = A^H.
Output Parameters
dest SPARSE_MATRIX_T.

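stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.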
mkl_sparse_convert_bsr
Converts internal matrix representation to BSR format
or changes BSR block size.
Syntax
stat = mkl_sparse_convert_bsr (source, block_size, block_layout, operation, dest)
Include Files
mkl_spblas.f90
Description
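The mkl_sparse_convert_bsr routine converts internal matrix representation to BSR format or changes
BSR block size.
Input Parameters
source SPARSE_MATRIX_T.
Specifies handle containing internal data.
block_size C_INT.
Size of the block in the output structure.
block_layout C_INT.
Specifies layout of blocks:
SPARSE_LAYOUT_ROW_MAJOR Storage of elements of blocks uses row major layout.
SPARSE_LAYOUT_COLUMN_MAJOR Storage of elements of blocks uses column major layout.
operation C_INT.
Specifies operation op() on input matrix.
SPARSE_OPERATION_NON_TRANSPOSE Non-transpose, op(A) = A.
SPARSE_OPERATION_TRANSPOSE Transpose, op(A) = A^T.
SPARSE_OPERATION_CONJUGATE_TRANSPOSE Conjugate transpose, op(A) = A^H.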
Output Parameters
dest SPARSE_MATRIX_T.

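stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.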
mkl_sparse_?_export_csr
Exports CSR matrix from internal representation.
Syntax
stat = mkl_sparse_s_export_csr (source, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_d_export_csr (source, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_c_export_csr (source, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
stat = mkl_sparse_z_export_csr (source, indexing, rows, cols, rows_start, rows_end,
col_indx, values)
Include Files
mkl_spblas.f90
Description
If the matrix specified by the source handle is in CSR format, the mkl_sparse_?_export_csr routine
exports an m-by-k matrix A in CSR format from the internal representation. The routine returns
pointers to the internal representation and does not allocate additional memory.
NOTE
Since the exported data is a copy of an internally stored structure, any changes made to it have no
effect on subsequent Inspector-executor Sparse BLAS operations.
If the matrix is not already in CSR format, the routine returns SPARSE_STATUS_INVALID_VALUE.
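Input Parameters
source SPARSE_MATRIX_T.
Specifies handle containing internal data.
Output Parameters
indexing Indicates how the exported arrays are indexed:
SPARSE_INDEX_BASE_ZERO Zero-based (C-style) indexing: indices start at 0.
SPARSE_INDEX_BASE_ONE One-based (Fortran-style) indexing: indices start at 1.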
rows C_INT.
Number of rows of the matrix source.
cols C_INT.
Number of columns of the matrix source.
rows_start C_INT.
Pointer to array of length m. This array contains row indices, such that
rows_start(i) - rows_start(1) is the first index of row i in the arrays
values and col_indx.
rows_end C_INT.
Pointer to array of length m. This array contains row indices, such that
rows_end(i) - rows_start(1) - 1 is the last index of row i in the arrays
values and col_indx.
col_indx C_INT.
For one-based indexing, pointer to array containing the column indices plus
one for each non-zero element of the matrix source. For zero-based
indexing, pointer to array containing the column indices for each non-zero
element of the matrix source. Its length is rows_end(rows) -
rows_start(1).
values C_FLOAT for mkl_sparse_s_export_csr
C_DOUBLE for mkl_sparse_d_export_csr
C_FLOAT_COMPLEX for mkl_sparse_c_export_csr
C_DOUBLE_COMPLEX for mkl_sparse_z_export_csr
Pointer to array containing non-zero elements of the matrix A. Its length is
equal to the length of the col_indx array.
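stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.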
mkl_sparse_?_export_bsr
Exports BSR matrix from internal representation.
Syntax
stat = mkl_sparse_s_export_bsr (source, indexing, block_layout, rows, cols,
block_size, rows_start, rows_end, col_indx, values)
stat = mkl_sparse_d_export_bsr (source, indexing, block_layout, rows, cols,
block_size, rows_start, rows_end, col_indx, values)
stat = mkl_sparse_c_export_bsr (source, indexing, block_layout, rows, cols,
block_size, rows_start, rows_end, col_indx, values)
stat = mkl_sparse_z_export_bsr (source, indexing, block_layout, rows, cols,
block_size, rows_start, rows_end, col_indx, values)
Include Files
mkl_spblas.f90
Description
If the matrix specified by the source handle is in BSR format, the mkl_sparse_?_export_bsr routine
exports an m-by-k matrix A in BSR format from the internal representation. The routine returns pointers to
the internal representation and does not allocate additional memory.
NOTE
Since the exported data is a copy of an internally stored structure, any changes made to it have no
effect on subsequent Inspector-executor Sparse BLAS operations.
If the matrix is not already in BSR format, the routine returns SPARSE_STATUS_INVALID_VALUE.
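Input Parameters
source SPARSE_MATRIX_T.
Specifies handle containing internal data.
Output Parameters
indexing Indicates how the exported arrays are indexed:
SPARSE_INDEX_BASE_ZERO Zero-based (C-style) indexing: indices start at 0.
SPARSE_INDEX_BASE_ONE One-based (Fortran-style) indexing: indices start at 1.
block_layout C_INT.
Specifies layout of blocks:
SPARSE_LAYOUT_ROW_MAJOR Storage of elements of blocks uses row major layout.
SPARSE_LAYOUT_COLUMN_MAJOR Storage of elements of blocks uses column major layout.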
rows C_INT.
Number of block rows of the matrix source.
cols C_INT.
Number of block columns of the matrix source.
block_size C_INT.
Size of the block in matrix source.
rows_start C_INT.
Pointer to array of length m. This array contains row indices, such that
rows_start(i) - rows_start(1) is the first index of block row i in the
arrays values and col_indx.
rows_end C_INT.
Pointer to array of length m. This array contains row indices, such that
rows_end(i) - rows_start(1) - 1 is the last index of block row i in the
arrays values and col_indx.
col_indx C_INT.
For one-based indexing, pointer to array containing the column indices plus
one for each non-zero block of the matrix source. For zero-based indexing,
pointer to array containing the column indices for each non-zero block of
the matrix source. Its length is rows_end(rows) - rows_start(1).
values C_FLOAT for mkl_sparse_s_export_bsr

C_DOUBLE for mkl_sparse_d_export_bsr
C_FLOAT_COMPLEX for mkl_sparse_c_export_bsr
C_DOUBLE_COMPLEX for mkl_sparse_z_export_bsr
Pointer to array containing non-zero elements of matrix source. Its length is
equal to length of the col_indx array multiplied by
block_size*block_size.
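stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.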
mkl_sparse_?_set_value
Changes a single value of matrix in internal
representation.
Syntax
stat = mkl_sparse_s_set_value (A, row, col, value)
stat = mkl_sparse_d_set_value (A, row, col, value)
stat = mkl_sparse_c_set_value (A, row, col, value)
stat = mkl_sparse_z_set_value (A, row, col, value)
Include Files
mkl_spblas.f90
Description
Use the mkl_sparse_?_set_value routine to change a single value of a matrix in internal Inspector-
executor Sparse BLAS format.
Input Parameters
A SPARSE_MATRIX_T.
Specifies handle containing internal data.
row C_INT.
Indicates row of matrix in which to set value.
col C_INT.
Indicates column of matrix in which to set value.
value C_FLOAT for mkl_sparse_s_set_value
C_DOUBLE for mkl_sparse_d_set_value
C_FLOAT_COMPLEX for mkl_sparse_c_set_value
C_DOUBLE_COMPLEX for mkl_sparse_z_set_value
Indicates the value to set.
Output Parameters
A Handle containing modified internal data.
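stat INTEGER.
Value indicating whether the operation was successful or not, and why:
SPARSE_STATUS_SUCCESS The operation was successful.
SPARSE_STATUS_INVALID_VALUE The input parameters contain an invalid value.
SPARSE_STATUS_INTERNAL_ERROR An error in algorithm implementation occurred.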
Inspector-executor Sparse BLAS Analysis Routines
Analysis Routines and Their Data Types
Routine or Function Group Description
mkl_sparse_set_mv_hint Provides estimate of number and type of upcoming matrix-vector operations.
mkl_sparse_set_sv_hint Provides estimate of number and type of upcoming triangular system solver operations.
mkl_sparse_set_mm_hint Provides estimate of number and type of upcoming matrix-matrix multiplication operations.
mkl_sparse_set_sm_hint Provides estimate of number and type of upcoming triangular matrix solve with multiple right hand sides operations.
mkl_sparse_set_memory_hint Provides memory requirements for performance optimization purposes.
mkl_sparse_optimize Analyzes matrix structure and performs optimizations using the hints provided in the handle.
Optimization Notice
mkl_sparse_set_mv_hint
Provides estimate of number and type of upcoming
matrix-vector operations.
Syntax
stat = mkl_sparse_set_mv_hint (A, operation, descr, expected_calls)
Include Files
mkl_spblas.f90
Description
Use the mkl_sparse_set_mv_hint routine to provide the Inspector-executor Sparse BLAS API an estimate
of the number of upcoming matrix-vector multiplication operations for performance optimization, and specify
whether or not to perform an operation on the matrix.
Input Parameters
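operation C_INT.
Specifies operation op() on input matrix.
SPARSE_OPERATION_NON_TRANSPOSE Non-transpose, op(A) = A.
SPARSE_OPERATION_TRANSPOSE Transpose, op(A) = A^T.
SPARSE_OPERATION_CONJUGATE_TRANSPOSE Conjugate transpose, op(A) = A^H.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties (type, mode, and diag); the
possible values are listed for mkl_sparse_copy.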
expected_calls C_INT.
Number of expected calls to execution routine.
Output Parameters
A SPARSE_MATRIX_T.

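stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.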
mkl_sparse_set_sv_hint
Provides estimate of number and type of upcoming
triangular system solver operations.
Syntax
stat = mkl_sparse_set_sv_hint (A, operation, descr, expected_calls)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_set_sv_hint routine provides an estimate of the number of upcoming triangular system solver
operations and type of these operations for performance optimization.
Input Parameters
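operation C_INT.
Specifies operation op() on input matrix.
SPARSE_OPERATION_NON_TRANSPOSE Non-transpose, op(A) = A.
SPARSE_OPERATION_TRANSPOSE Transpose, op(A) = A^T.
SPARSE_OPERATION_CONJUGATE_TRANSPOSE Conjugate transpose, op(A) = A^H.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties (type, mode, and diag); the
possible values are listed for mkl_sparse_copy.
expected_calls C_INT.
Number of expected calls to execution routine.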
Output Parameters
A SPARSE_MATRIX_T.

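stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.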
mkl_sparse_set_mm_hint
Provides estimate of number and type of upcoming
matrix-matrix multiplication operations.
Syntax
stat = mkl_sparse_set_mm_hint (A, operation, descr, layout, dense_matrix_size,
expected_calls)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_set_mm_hint routine provides an estimate of the number of upcoming matrix-matrix
multiplication operations and type of these operations for performance optimization purposes.
Input Parameters
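operation C_INT.
Specifies operation op() on input matrix.
SPARSE_OPERATION_NON_TRANSPOSE Non-transpose, op(A) = A.
SPARSE_OPERATION_TRANSPOSE Transpose, op(A) = A^T.
SPARSE_OPERATION_CONJUGATE_TRANSPOSE Conjugate transpose, op(A) = A^H.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties (type, mode, and diag); the
possible values are listed for mkl_sparse_copy.
layout C_INT.
Specifies layout of elements:
SPARSE_LAYOUT_COLUMN_MAJOR Storage of elements uses column major layout.
SPARSE_LAYOUT_ROW_MAJOR Storage of elements uses row major layout.
dense_matrix_size C_INT.
Number of columns in dense matrix.
expected_calls C_INT.
Number of expected calls to execution routine.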
Output Parameters
A SPARSE_MATRIX_T.

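stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.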
mkl_sparse_set_sm_hint
Provides estimate of number and type of upcoming
triangular matrix solve with multiple right hand sides
operations.
Syntax
stat = mkl_sparse_set_sm_hint (A, operation, descr, layout, dense_matrix_size,
expected_calls)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_set_sm_hint routine provides an estimate of the number of upcoming triangular matrix
solve with multiple right hand sides operations and type of these operations for performance optimization
purposes.
Input Parameters
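operation C_INT.
Specifies operation op() on input matrix.
SPARSE_OPERATION_NON_TRANSPOSE Non-transpose, op(A) = A.
SPARSE_OPERATION_TRANSPOSE Transpose, op(A) = A^T.
SPARSE_OPERATION_CONJUGATE_TRANSPOSE Conjugate transpose, op(A) = A^H.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties (type, mode, and diag); the
possible values are listed for mkl_sparse_copy.
layout C_INT.
Specifies layout of elements:
SPARSE_LAYOUT_COLUMN_MAJOR Storage of elements uses column major layout.
SPARSE_LAYOUT_ROW_MAJOR Storage of elements uses row major layout.
dense_matrix_size C_INT.
Number of right-hand sides.
expected_calls C_INT.
Number of expected calls to execution routine.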
Output Parameters
A SPARSE_MATRIX_T.

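stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.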
mkl_sparse_set_memory_hint
Provides memory requirements for performance
optimization purposes.
Syntax
stat = mkl_sparse_set_memory_hint (A, policy)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_set_memory_hint routine allocates additional memory for further performance
optimization purposes.
Input Parameters
policy C_INT.
Specify memory utilization policy for optimization routine using these types:
SPARSE_MEMORY_NONE Routine can allocate memory only for auxiliary structures (such as for workload balancing); the amount of memory is proportional to vector size.
SPARSE_MEMORY_AGGRESSIVE Default. Routine can allocate memory up to the size of matrix A for converting into the appropriate sparse format.
Output Parameters
A SPARSE_MATRIX_T.

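stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.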
mkl_sparse_optimize
Analyzes matrix structure and performs optimizations
using the hints provided in the handle.
Syntax
stat = mkl_sparse_optimize (A)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_optimize routine analyzes matrix structure and performs optimizations using the hints
provided in the handle. Generally, specifying a higher number of expected operations allows for more
aggressive and time-consuming optimizations.
Input Parameters
A SPARSE_MATRIX_T.
Output Parameters

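stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.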
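The analysis routines are typically called between handle creation and the first execution call. The sketch below shows one possible ordering under stated assumptions: the mkl_spblas module is available, the LP64 interface is linked, and the matrix, the expected call count of 1000, and the program name are illustrative only. Hint failures are reported rather than treated as fatal here, which is a design choice of this sketch, not a documented requirement.

```fortran
program analysis_sketch
  use, intrinsic :: iso_c_binding, only: c_int, c_double
  use mkl_spblas            ! module provided by mkl_spblas.f90 (assumed compiled)
  implicit none

  integer, parameter :: m = 3, k = 3, nnz = 5
  integer :: rows_start(m) = (/ 1, 3, 4 /)
  integer :: rows_end(m)   = (/ 3, 4, 6 /)
  integer :: col_indx(nnz) = (/ 1, 3, 2, 1, 3 /)
  real(c_double) :: values(nnz) = (/ 1.0d0, 2.0d0, 3.0d0, 4.0d0, 5.0d0 /)

  type(sparse_matrix_t) :: A
  type(matrix_descr) :: descr
  integer(c_int) :: stat

  stat = mkl_sparse_d_create_csr(A, SPARSE_INDEX_BASE_ONE, m, k, &
                                 rows_start, rows_end, col_indx, values)
  if (stat /= SPARSE_STATUS_SUCCESS) error stop

  ! General matrix; mode and diag are set only to avoid undefined fields.
  descr%type = SPARSE_MATRIX_TYPE_GENERAL
  descr%mode = SPARSE_FILL_MODE_LOWER
  descr%diag = SPARSE_DIAG_NON_UNIT

  ! Announce roughly 1000 upcoming non-transposed matrix-vector products.
  stat = mkl_sparse_set_mv_hint(A, SPARSE_OPERATION_NON_TRANSPOSE, descr, 1000)
  if (stat /= SPARSE_STATUS_SUCCESS) print *, 'set_mv_hint returned', stat

  ! Allow the optimizer to allocate memory up to the size of the matrix.
  stat = mkl_sparse_set_memory_hint(A, SPARSE_MEMORY_AGGRESSIVE)
  if (stat /= SPARSE_STATUS_SUCCESS) print *, 'set_memory_hint returned', stat

  ! Analyze the matrix using the hints stored in the handle.
  stat = mkl_sparse_optimize(A)
  if (stat /= SPARSE_STATUS_SUCCESS) print *, 'optimize returned', stat

  stat = mkl_sparse_destroy(A)
  if (stat /= SPARSE_STATUS_SUCCESS) error stop
end program analysis_sketch
```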
Inspector-executor Sparse BLAS Execution Routines
Execution Routines and Their Data Types
Routine or Function Group Data Types Description
mkl_sparse_?_mv s, d, c, z Computes a sparse matrix-vector product.
mkl_sparse_?_trsv s, d, c, z Solves a system of linear equations for a square sparse matrix.
mkl_sparse_?_mm s, d, c, z Computes the product of a sparse matrix and a dense matrix.
mkl_sparse_?_trsm s, d, c, z Solves a system of linear equations with multiple right hand sides for a square sparse matrix.
mkl_sparse_?_add s, d, c, z Computes sum of two sparse matrices.
mkl_sparse_spmm s, d, c, z Computes the product of two sparse matrices and stores the result as a sparse matrix.
mkl_sparse_?_spmmd s, d, c, z Computes the product of two sparse matrices and stores the result as a dense matrix.
mkl_sparse_?_mv
Computes a sparse matrix-vector product.
Syntax
stat = mkl_sparse_s_mv (operation, alpha, A, descr, x, beta, y)
stat = mkl_sparse_d_mv (operation, alpha, A, descr, x, beta, y)
stat = mkl_sparse_c_mv (operation, alpha, A, descr, x, beta, y)
stat = mkl_sparse_z_mv (operation, alpha, A, descr, x, beta, y)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_mv routine computes a sparse matrix-vector product defined as
y := alpha*op(A)*x + beta*y
where:
alpha and beta are scalars, x and y are vectors, and A is a matrix handle of a matrix with m rows and k
columns.
Input Parameters
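operation C_INT.
Specifies operation op() on input matrix.
SPARSE_OPERATION_NON_TRANSPOSE Non-transpose, op(A) = A.
SPARSE_OPERATION_TRANSPOSE Transpose, op(A) = A^T.
SPARSE_OPERATION_CONJUGATE_TRANSPOSE Conjugate transpose, op(A) = A^H.
alpha C_FLOAT for mkl_sparse_s_mv
C_DOUBLE for mkl_sparse_d_mv
C_FLOAT_COMPLEX for mkl_sparse_c_mv
C_DOUBLE_COMPLEX for mkl_sparse_z_mv
Specifies the scalar alpha.
A SPARSE_MATRIX_T.
Handle containing sparse matrix in internal data structure.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties (type, mode, and diag); the
possible values are listed for mkl_sparse_copy.
x C_FLOAT for mkl_sparse_s_mv
C_DOUBLE for mkl_sparse_d_mv
C_FLOAT_COMPLEX for mkl_sparse_c_mv
C_DOUBLE_COMPLEX for mkl_sparse_z_mv
Array with size equal to the number of columns of A if operation =
SPARSE_OPERATION_NON_TRANSPOSE and at least the number of rows of A otherwise. On
entry, the array x must contain the vector x.
beta C_FLOAT for mkl_sparse_s_mv
C_DOUBLE for mkl_sparse_d_mv
C_FLOAT_COMPLEX for mkl_sparse_c_mv
C_DOUBLE_COMPLEX for mkl_sparse_z_mv
Specifies the scalar beta.
y C_FLOAT for mkl_sparse_s_mv
C_DOUBLE for mkl_sparse_d_mv
C_FLOAT_COMPLEX for mkl_sparse_c_mv
C_DOUBLE_COMPLEX for mkl_sparse_z_mv
Array with size at least m if
operation=SPARSE_OPERATION_NON_TRANSPOSE and at least k otherwise.
On entry, the array y must contain the vector y.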
Output Parameters
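y C_FLOAT for mkl_sparse_s_mv
C_DOUBLE for mkl_sparse_d_mv
C_FLOAT_COMPLEX for mkl_sparse_c_mv
C_DOUBLE_COMPLEX for mkl_sparse_z_mv
Overwritten by the updated vector y.
stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.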
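A small worked case may help fix the roles of alpha, beta, x, and y in the matrix-vector product. The sketch below assumes the mkl_spblas module is available and the LP64 interface is linked; the matrix, vectors, tolerance, and program name are illustrative. The program checks the result against values computed by hand for this specific data.

```fortran
program mv_sketch
  use, intrinsic :: iso_c_binding, only: c_int, c_double
  use mkl_spblas            ! module provided by mkl_spblas.f90 (assumed compiled)
  implicit none

  integer, parameter :: m = 3, k = 3, nnz = 5
  ! One-based CSR arrays for A = [1 0 2; 0 3 0; 4 0 5]
  integer :: rows_start(m) = (/ 1, 3, 4 /)
  integer :: rows_end(m)   = (/ 3, 4, 6 /)
  integer :: col_indx(nnz) = (/ 1, 3, 2, 1, 3 /)
  real(c_double) :: values(nnz) = (/ 1.0d0, 2.0d0, 3.0d0, 4.0d0, 5.0d0 /)

  real(c_double) :: x(k) = (/ 1.0d0, 1.0d0, 1.0d0 /)
  real(c_double) :: y(m) = (/ 1.0d0, 1.0d0, 1.0d0 /)
  real(c_double), parameter :: alpha = 2.0d0, beta = 1.0d0
  ! Hand-computed: A*x = (3, 3, 9), so alpha*A*x + beta*y = (7, 7, 19)
  real(c_double), parameter :: expected(m) = (/ 7.0d0, 7.0d0, 19.0d0 /)

  type(sparse_matrix_t) :: A
  type(matrix_descr) :: descr
  integer(c_int) :: stat

  stat = mkl_sparse_d_create_csr(A, SPARSE_INDEX_BASE_ONE, m, k, &
                                 rows_start, rows_end, col_indx, values)
  if (stat /= SPARSE_STATUS_SUCCESS) error stop

  descr%type = SPARSE_MATRIX_TYPE_GENERAL
  descr%mode = SPARSE_FILL_MODE_LOWER    ! not used for general matrices
  descr%diag = SPARSE_DIAG_NON_UNIT      ! not used for general matrices

  ! y := alpha*A*x + beta*y
  stat = mkl_sparse_d_mv(SPARSE_OPERATION_NON_TRANSPOSE, alpha, A, descr, x, beta, y)
  if (stat /= SPARSE_STATUS_SUCCESS) error stop

  if (any(abs(y - expected) > 1.0d-12)) error stop
  print *, 'y =', y

  stat = mkl_sparse_destroy(A)
  if (stat /= SPARSE_STATUS_SUCCESS) error stop
end program mv_sketch
```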
mkl_sparse_?_trsv
Solves a system of linear equations for a triangular
sparse matrix.
Syntax
stat = mkl_sparse_s_trsv (operation, alpha, A, descr, x, y)
stat = mkl_sparse_d_trsv (operation, alpha, A, descr, x, y)
stat = mkl_sparse_c_trsv (operation, alpha, A, descr, x, y)
stat = mkl_sparse_z_trsv (operation, alpha, A, descr, x, y)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_trsv routine solves a system of linear equations for a matrix:
op(A)*y = alpha * x
where A is a triangular sparse matrix, alpha is a scalar, and x and y are vectors.
Input Parameters
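operation C_INT.
Specifies operation op() on input matrix.
SPARSE_OPERATION_NON_TRANSPOSE Non-transpose, op(A) = A.
SPARSE_OPERATION_TRANSPOSE Transpose, op(A) = A^T.
SPARSE_OPERATION_CONJUGATE_TRANSPOSE Conjugate transpose, op(A) = A^H.
alpha C_FLOAT for mkl_sparse_s_trsv
C_DOUBLE for mkl_sparse_d_trsv
C_FLOAT_COMPLEX for mkl_sparse_c_trsv
C_DOUBLE_COMPLEX for mkl_sparse_z_trsv
Specifies the scalar alpha.
A SPARSE_MATRIX_T.
Handle containing sparse matrix in internal data structure.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties (type, mode, and diag); the
possible values are listed for mkl_sparse_copy.
x C_FLOAT for mkl_sparse_s_trsv
C_DOUBLE for mkl_sparse_d_trsv
C_FLOAT_COMPLEX for mkl_sparse_c_trsv
C_DOUBLE_COMPLEX for mkl_sparse_z_trsv
Array of size at least m, where m is the number of rows of matrix A. On
entry, the array x must contain the vector x.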
Output Parameters
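y C_FLOAT for mkl_sparse_s_trsv
C_DOUBLE for mkl_sparse_d_trsv
C_FLOAT_COMPLEX for mkl_sparse_c_trsv
C_DOUBLE_COMPLEX for mkl_sparse_z_trsv
Array of size at least m containing the solution to the system of linear
equations.
stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.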
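To make the roles of descr, x, and y in the triangular solve concrete, the sketch below solves L*y = alpha*x for a small lower triangular matrix. It assumes the mkl_spblas module is available and the LP64 interface is linked; the matrix, right-hand side, tolerance, and program name are illustrative, and the expected solution was computed by forward substitution by hand.

```fortran
program trsv_sketch
  use, intrinsic :: iso_c_binding, only: c_int, c_double
  use mkl_spblas            ! module provided by mkl_spblas.f90 (assumed compiled)
  implicit none

  integer, parameter :: m = 3, nnz = 5
  ! One-based CSR arrays for the lower triangular matrix
  !     | 2 0 0 |
  ! L = | 1 3 0 |
  !     | 0 4 5 |
  integer :: rows_start(m) = (/ 1, 2, 4 /)
  integer :: rows_end(m)   = (/ 2, 4, 6 /)
  integer :: col_indx(nnz) = (/ 1, 1, 2, 2, 3 /)
  real(c_double) :: values(nnz) = (/ 2.0d0, 1.0d0, 3.0d0, 4.0d0, 5.0d0 /)

  real(c_double) :: x(m) = (/ 2.0d0, 7.0d0, 23.0d0 /)   ! right-hand side
  real(c_double) :: y(m)                                ! solution
  real(c_double), parameter :: alpha = 1.0d0
  ! Forward substitution by hand: y = (1, 2, 3)
  real(c_double), parameter :: expected(m) = (/ 1.0d0, 2.0d0, 3.0d0 /)

  type(sparse_matrix_t) :: A
  type(matrix_descr) :: descr
  integer(c_int) :: stat

  stat = mkl_sparse_d_create_csr(A, SPARSE_INDEX_BASE_ONE, m, m, &
                                 rows_start, rows_end, col_indx, values)
  if (stat /= SPARSE_STATUS_SUCCESS) error stop

  ! Describe the matrix as lower triangular with a non-unit diagonal.
  descr%type = SPARSE_MATRIX_TYPE_TRIANGULAR
  descr%mode = SPARSE_FILL_MODE_LOWER
  descr%diag = SPARSE_DIAG_NON_UNIT

  ! Solve op(A)*y = alpha*x with op(A) = A.
  y = 0.0d0
  stat = mkl_sparse_d_trsv(SPARSE_OPERATION_NON_TRANSPOSE, alpha, A, descr, x, y)
  if (stat /= SPARSE_STATUS_SUCCESS) error stop

  if (any(abs(y - expected) > 1.0d-12)) error stop
  print *, 'y =', y

  stat = mkl_sparse_destroy(A)
  if (stat /= SPARSE_STATUS_SUCCESS) error stop
end program trsv_sketch
```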
mkl_sparse_?_mm
Computes the product of a sparse matrix and a dense
matrix.
Syntax
stat = mkl_sparse_s_mm (operation, alpha, A, descr, layout, x, columns, ldx, beta, y,
ldy)
stat = mkl_sparse_d_mm (operation, alpha, A, descr, layout, x, columns, ldx, beta, y,
ldy)
stat = mkl_sparse_c_mm (operation, alpha, A, descr, layout, x, columns, ldx, beta, y,
ldy)
stat = mkl_sparse_z_mm (operation, alpha, A, descr, layout, x, columns, ldx, beta, y,
ldy)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_mm routine performs a matrix-matrix operation:
y := alpha*op(A)*x + beta*y
where alpha and beta are scalars, A is a sparse matrix, and x and y are dense matrices.
The mkl_sparse_?_mm and mkl_sparse_?_trsm routines support these configurations:
Sparse matrix indexing | Column-major dense matrix: layout = SPARSE_LAYOUT_COLUMN_MAJOR | Row-major dense matrix: layout = SPARSE_LAYOUT_ROW_MAJOR
0-based sparse matrix: SPARSE_INDEX_BASE_ZERO | CSR; BSR: general non-transposed matrix multiplication only | All formats
1-based sparse matrix: SPARSE_INDEX_BASE_ONE | All formats | CSR; BSR: general non-transposed matrix multiplication only
Input Parameters
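operation C_INT.
Specifies operation op() on input matrix.
SPARSE_OPERATION_NON_TRANSPOSE Non-transpose, op(A) = A.
SPARSE_OPERATION_TRANSPOSE Transpose, op(A) = A^T.
SPARSE_OPERATION_CONJUGATE_TRANSPOSE Conjugate transpose, op(A) = A^H.
alpha C_FLOAT for mkl_sparse_s_mm
C_DOUBLE for mkl_sparse_d_mm
C_FLOAT_COMPLEX for mkl_sparse_c_mm
C_DOUBLE_COMPLEX for mkl_sparse_z_mm
Specifies the scalar alpha.
A SPARSE_MATRIX_T.
Handle containing sparse matrix in internal data structure.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties (type, mode, and diag); the
possible values are listed for mkl_sparse_copy.
layout C_INT.
Describes the storage scheme for the dense matrix:
SPARSE_LAYOUT_COLUMN_MAJOR Storage of elements uses column major layout.
SPARSE_LAYOUT_ROW_MAJOR Storage of elements uses row major layout.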
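x C_FLOAT for mkl_sparse_s_mm
C_DOUBLE for mkl_sparse_d_mm
C_FLOAT_COMPLEX for mkl_sparse_c_mm
C_DOUBLE_COMPLEX for mkl_sparse_z_mm
Array of size at least rows*cols, where
Dimension | layout = SPARSE_LAYOUT_COLUMN_MAJOR | layout = SPARSE_LAYOUT_ROW_MAJOR
rows (number of rows in x) | ldx | If op(A) = A, number of columns in A; if op(A) = A^T, number of rows in A
cols (number of columns in x) | columns | ldx
columns C_INT.
Number of columns of matrix y.
ldx C_INT.
Specifies the leading dimension of matrix x.
beta C_FLOAT for mkl_sparse_s_mm
C_DOUBLE for mkl_sparse_d_mm
C_FLOAT_COMPLEX for mkl_sparse_c_mm
C_DOUBLE_COMPLEX for mkl_sparse_z_mm
Specifies the scalar beta.
y C_FLOAT for mkl_sparse_s_mm
C_DOUBLE for mkl_sparse_d_mm
C_FLOAT_COMPLEX for mkl_sparse_c_mm
C_DOUBLE_COMPLEX for mkl_sparse_z_mm
Array of size at least rows*cols, where
Dimension | layout = SPARSE_LAYOUT_COLUMN_MAJOR | layout = SPARSE_LAYOUT_ROW_MAJOR
rows (number of rows in y) | ldy | If op(A) = A, number of rows in A; if op(A) = A^T, number of columns in A
cols (number of columns in y) | columns | ldy
ldy C_INT.
Specifies the leading dimension of matrix y.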
Output Parameters
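y C_FLOAT for mkl_sparse_s_mm
C_DOUBLE for mkl_sparse_d_mm
C_FLOAT_COMPLEX for mkl_sparse_c_mm
C_DOUBLE_COMPLEX for mkl_sparse_z_mm
Overwritten by the updated matrix y.
stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.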
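The relationship between layout, columns, ldx, and ldy is easiest to see on a small case. The sketch below multiplies a one-based CSR matrix by a column-major dense matrix with two columns. It assumes the mkl_spblas module is available and the LP64 interface is linked; the data, tolerance, and program name are illustrative, and the expected result was computed by hand.

```fortran
program mm_sketch
  use, intrinsic :: iso_c_binding, only: c_int, c_double
  use mkl_spblas            ! module provided by mkl_spblas.f90 (assumed compiled)
  implicit none

  integer, parameter :: m = 3, k = 3, nnz = 5, ncols = 2
  ! One-based CSR arrays for A = [1 0 2; 0 3 0; 4 0 5]
  integer :: rows_start(m) = (/ 1, 3, 4 /)
  integer :: rows_end(m)   = (/ 3, 4, 6 /)
  integer :: col_indx(nnz) = (/ 1, 3, 2, 1, 3 /)
  real(c_double) :: values(nnz) = (/ 1.0d0, 2.0d0, 3.0d0, 4.0d0, 5.0d0 /)

  ! Column-major dense matrices: x is k-by-ncols, y is m-by-ncols.
  real(c_double) :: x(k, ncols), y(m, ncols), expected(m, ncols)
  real(c_double), parameter :: alpha = 1.0d0, beta = 0.0d0

  type(sparse_matrix_t) :: A
  type(matrix_descr) :: descr
  integer(c_int) :: stat

  x(:, 1) = (/ 1.0d0, 1.0d0, 1.0d0 /)
  x(:, 2) = (/ 1.0d0, 0.0d0, 0.0d0 /)
  y = 0.0d0
  ! Hand-computed columns of A*x
  expected(:, 1) = (/ 3.0d0, 3.0d0, 9.0d0 /)
  expected(:, 2) = (/ 1.0d0, 0.0d0, 4.0d0 /)

  stat = mkl_sparse_d_create_csr(A, SPARSE_INDEX_BASE_ONE, m, k, &
                                 rows_start, rows_end, col_indx, values)
  if (stat /= SPARSE_STATUS_SUCCESS) error stop

  descr%type = SPARSE_MATRIX_TYPE_GENERAL
  descr%mode = SPARSE_FILL_MODE_LOWER    ! not used for general matrices
  descr%diag = SPARSE_DIAG_NON_UNIT      ! not used for general matrices

  ! y := alpha*A*x + beta*y; for column-major storage ldx = k and ldy = m here.
  stat = mkl_sparse_d_mm(SPARSE_OPERATION_NON_TRANSPOSE, alpha, A, descr, &
                         SPARSE_LAYOUT_COLUMN_MAJOR, x, ncols, k, beta, y, m)
  if (stat /= SPARSE_STATUS_SUCCESS) error stop

  if (any(abs(y - expected) > 1.0d-12)) error stop
  print *, 'y(:,1) =', y(:, 1)
  print *, 'y(:,2) =', y(:, 2)

  stat = mkl_sparse_destroy(A)
  if (stat /= SPARSE_STATUS_SUCCESS) error stop
end program mm_sketch
```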
mkl_sparse_?_trsm
Solves a system of linear equations with multiple right
hand sides for a triangular sparse matrix.
Syntax
stat = mkl_sparse_s_trsm (operation, alpha, A, descr, layout, x, columns, ldx, y, ldy)
stat = mkl_sparse_d_trsm (operation, alpha, A, descr, layout, x, columns, ldx, y, ldy)
stat = mkl_sparse_c_trsm (operation, alpha, A, descr, layout, x, columns, ldx, y, ldy)
stat = mkl_sparse_z_trsm (operation, alpha, A, descr, layout, x, columns, ldx, y, ldy)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_trsm routine solves a system of linear equations with multiple right hand sides for a
triangular sparse matrix:
op(A)*y = alpha*x
where:
alpha is a scalar, x and y are dense matrices, and A is a sparse matrix.
The mkl_sparse_?_mm and mkl_sparse_?_trsm routines support these configurations:
Sparse matrix indexing | Column-major dense matrix: layout = SPARSE_LAYOUT_COLUMN_MAJOR | Row-major dense matrix: layout = SPARSE_LAYOUT_ROW_MAJOR
0-based sparse matrix: SPARSE_INDEX_BASE_ZERO | CSR; BSR: general non-transposed matrix multiplication only | All formats
1-based sparse matrix: SPARSE_INDEX_BASE_ONE | All formats | CSR; BSR: general non-transposed matrix multiplication only
Input Parameters
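operation C_INT.
Specifies operation op() on input matrix.
SPARSE_OPERATION_NON_TRANSPOSE Non-transpose, op(A) = A.
SPARSE_OPERATION_TRANSPOSE Transpose, op(A) = A^T.
SPARSE_OPERATION_CONJUGATE_TRANSPOSE Conjugate transpose, op(A) = A^H.
alpha C_FLOAT for mkl_sparse_s_trsm
C_DOUBLE for mkl_sparse_d_trsm
C_FLOAT_COMPLEX for mkl_sparse_c_trsm
C_DOUBLE_COMPLEX for mkl_sparse_z_trsm
Specifies the scalar alpha.
A SPARSE_MATRIX_T.
Handle containing sparse matrix in internal data structure.
descr MATRIX_DESCR.
Descriptor specifying sparse matrix properties (type, mode, and diag); the
possible values are listed for mkl_sparse_copy.
layout C_INT.
Describes the storage scheme for the dense matrix:
SPARSE_LAYOUT_COLUMN_MAJOR Storage of elements uses column major layout.
SPARSE_LAYOUT_ROW_MAJOR Storage of elements uses row major layout.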
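x C_FLOAT for mkl_sparse_s_trsm
C_DOUBLE for mkl_sparse_d_trsm
C_FLOAT_COMPLEX for mkl_sparse_c_trsm
C_DOUBLE_COMPLEX for mkl_sparse_z_trsm
Array of size at least rows*cols, where
Dimension | layout = SPARSE_LAYOUT_COLUMN_MAJOR | layout = SPARSE_LAYOUT_ROW_MAJOR
rows (number of rows in x) | ldx | number of rows in A
cols (number of columns in x) | columns | ldx
On entry, the array x must contain the matrix x.
columns C_INT.
Number of columns in matrix y.
ldx C_INT.
Specifies the leading dimension of matrix x.
y C_FLOAT for mkl_sparse_s_trsm
C_DOUBLE for mkl_sparse_d_trsm
C_FLOAT_COMPLEX for mkl_sparse_c_trsm
C_DOUBLE_COMPLEX for mkl_sparse_z_trsm
Array of size at least rows*cols, where
Dimension | layout = SPARSE_LAYOUT_COLUMN_MAJOR | layout = SPARSE_LAYOUT_ROW_MAJOR
rows (number of rows in y) | ldy | number of rows in A
cols (number of columns in y) | columns | ldy
ldy C_INT.
Specifies the leading dimension of matrix y.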
Output Parameters
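y C_FLOAT for mkl_sparse_s_trsm
C_DOUBLE for mkl_sparse_d_trsm
C_FLOAT_COMPLEX for mkl_sparse_c_trsm
C_DOUBLE_COMPLEX for mkl_sparse_z_trsm
Overwritten by the updated matrix y.
stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.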
mkl_sparse_?_add
Computes sum of two sparse matrices.
Syntax
stat = mkl_sparse_s_add (operation, A, alpha, B, C)
stat = mkl_sparse_d_add (operation, A, alpha, B, C)
stat = mkl_sparse_c_add (operation, A, alpha, B, C)
stat = mkl_sparse_z_add (operation, A, alpha, B, C)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_add routine performs a matrix-matrix operation:
C := alpha*op(A) + B
where alpha is a scalar and A, B, and C are sparse matrices.
Input Parameters
A SPARSE_MATRIX_T.
Handle containing a sparse matrix in internal data structure.
alpha C_FLOAT for mkl_sparse_s_add

C_DOUBLE for mkl_sparse_d_add
C_FLOAT_COMPLEX for mkl_sparse_c_add
C_DOUBLE_COMPLEX for mkl_sparse_z_add
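operation C_INT.
Specifies operation op() on input matrix.
SPARSE_OPERATION_NON_TRANSPOSE Non-transpose, op(A) = A.
SPARSE_OPERATION_TRANSPOSE Transpose, op(A) = A^T.
SPARSE_OPERATION_CONJUGATE_TRANSPOSE Conjugate transpose, op(A) = A^H.
B SPARSE_MATRIX_T.
Handle containing a sparse matrix in internal data structure.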
Output Parameters
C SPARSE_MATRIX_T.
Handle containing the resulting sparse matrix in internal data structure.

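stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.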
mkl_sparse_spmm
Computes the product of two sparse matrices and
stores the result as a sparse matrix.
Syntax
stat = mkl_sparse_spmm (operation, A, B, C)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_spmm routine performs a matrix-matrix operation:
C := op(A)*B
where A, B, and C are sparse matrices.
Input Parameters
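operation C_INT.
Specifies operation op() on input matrix.
SPARSE_OPERATION_NON_TRANSPOSE Non-transpose, op(A) = A.
SPARSE_OPERATION_TRANSPOSE Transpose, op(A) = A^T.
SPARSE_OPERATION_CONJUGATE_TRANSPOSE Conjugate transpose, op(A) = A^H.
A SPARSE_MATRIX_T.
Handle containing a sparse matrix in internal data structure.
B SPARSE_MATRIX_T.
Handle containing a sparse matrix in internal data structure.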
Output Parameters
C SPARSE_MATRIX_T.
Handle containing the resulting sparse matrix in internal data structure.

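stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.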
mkl_sparse_?_spmmd
Computes the product of two sparse matrices and
stores the result as a dense matrix.
Syntax
stat = mkl_sparse_s_spmmd (operation, A, B, layout, C, ldc)
stat = mkl_sparse_d_spmmd (operation, A, B, layout, C, ldc)
stat = mkl_sparse_c_spmmd (operation, A, B, layout, C, ldc)
stat = mkl_sparse_z_spmmd (operation, A, B, layout, C, ldc)
Include Files
mkl_spblas.f90
Description
The mkl_sparse_?_spmmd routine performs a matrix-matrix operation:
C := op(A)*B
where A and B are sparse matrices and C is a dense matrix.
Input Parameters
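operation C_INT.
Specifies operation op() on input matrix.
SPARSE_OPERATION_NON_TRANSPOSE Non-transpose, op(A) = A.
SPARSE_OPERATION_TRANSPOSE Transpose, op(A) = A^T.
SPARSE_OPERATION_CONJUGATE_TRANSPOSE Conjugate transpose, op(A) = A^H.
A SPARSE_MATRIX_T.
Handle containing a sparse matrix in internal data structure.
B SPARSE_MATRIX_T.
Handle containing a sparse matrix in internal data structure.
layout C_INT.
Describes the storage scheme for the dense matrix:
SPARSE_LAYOUT_COLUMN_MAJOR Storage of elements uses column major layout.
SPARSE_LAYOUT_ROW_MAJOR Storage of elements uses row major layout.
ldc C_INT.
Leading dimension of matrix C.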
Output Parameters
C C_FLOAT for mkl_sparse_s_spmmd

C_DOUBLE for mkl_sparse_d_spmmd
C_FLOAT_COMPLEX for mkl_sparse_c_spmmd
C_DOUBLE_COMPLEX for mkl_sparse_z_spmmd
Resulting dense matrix stored in internal data structure.

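stat INTEGER.
Value indicating whether the operation was successful or not, and why. The
possible values are the SPARSE_STATUS_* constants listed for
mkl_sparse_?_create_csr.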
Intel MKL provides C and Fortran routines to extend the functionality of the BLAS routines. These include
routines to compute vector products, matrix-vector products, and matrix-matrix products.
Intel MKL also provides routines to perform certain data manipulation, including matrix in-place and out-of-
place transposition operations combined with simple matrix arithmetic operations. Transposition operations
are Copy As Is, Conjugate transpose, Transpose, and Conjugate. Each routine adds the possibility of scaling
during the transposition operation by giving some alpha and/or beta parameters. Each routine supports
both row-major orderings and column-major orderings.
Table BLAS-like Extensions lists these routines.
The <?> symbol in the routine short names is a precision prefix that indicates the data type:
s REAL
d DOUBLE PRECISION
c COMPLEX
z DOUBLE COMPLEX
Routine Data Types Description
?axpby s, d, c, z Scales two vectors, adds them to one another and stores result in the vector.
?gem2vu s, d Two matrix-vector products using a general matrix, real data.
?gem2vc c, z Two matrix-vector products using a general matrix, complex data.
?gemmt s, d, c, z Computes a matrix-matrix product with general matrices but updates only the upper or lower triangular part of the result matrix.
?gemm3m c, z Computes a scalar-matrix-matrix product using matrix multiplications and adds the result to a scalar-matrix product.
?gemm_batch s, d, c, z Computes scalar-matrix-matrix products and adds the results to scalar matrix products for groups of general matrices.
?gemm3m_batch c, z Computes a scalar-matrix-matrix product using matrix multiplications and adds the result to a scalar-matrix product.
mkl_?imatcopy s, d, c, z Performs scaling and in-place transposition/copying of matrices.
mkl_?omatcopy s, d, c, z Performs scaling and out-of-place transposition/copying of matrices.
mkl_?omatcopy2 s, d, c, z Performs two-strided scaling and out-of-place transposition/copying of matrices.
mkl_?omatadd s, d, c, z Performs scaling and sum of two matrices including their out-of-place transposition/copying.
?gemm_alloc s, d Allocates storage for a packed matrix.
?gemm_pack s, d Performs scaling and packing of the matrix into the previously allocated buffer.
?gemm_compute s, d Computes a matrix-matrix product with general matrices where one or both input matrices are stored in a packed data structure and adds the result to a scalar-matrix product.
?gemm_free s, d Frees the storage previously allocated for the packed matrix.
?axpby
Scales two vectors, adds them to one another and
stores result in the vector.
Syntax
call saxpby(n, a, x, incx, b, y, incy)
call daxpby(n, a, x, incx, b, y, incy)
call caxpby(n, a, x, incx, b, y, incy)
call zaxpby(n, a, x, incx, b, y, incy)
call axpby(x, y [,a] [,b])
Include Files
mkl.fi, blas.f90
Description
The ?axpby routines perform a vector-vector operation defined as
y := a*x + b*y
where:
a and b are scalars
x and y are vectors each with n elements.
Input Parameters
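n INTEGER. Specifies the number of elements in vectors x and y.
a REAL for saxpby
DOUBLE PRECISION for daxpby
COMPLEX for caxpby
DOUBLE COMPLEX for zaxpby
Specifies the scalar a.
x REAL for saxpby
DOUBLE PRECISION for daxpby
COMPLEX for caxpby
DOUBLE COMPLEX for zaxpby
Array, size at least (1 + (n-1)*abs(incx)).
incx INTEGER. Specifies the increment for the elements of x.
b REAL for saxpby
DOUBLE PRECISION for daxpby
COMPLEX for caxpby
DOUBLE COMPLEX for zaxpby
Specifies the scalar b.
y REAL for saxpby
DOUBLE PRECISION for daxpby
COMPLEX for caxpby
DOUBLE COMPLEX for zaxpby
Array, size at least (1 + (n-1)*abs(incy)).
incy INTEGER. Specifies the increment for the elements of y.
Output Parameters
y Contains the updated vector y.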
Example
For examples of routine usage, see the code in the Intel MKL installation directory:
saxpby: examples\blas\source\saxpbyx.f
daxpby: examples\blas\source\daxpbyx.f
caxpby: examples\blas\source\caxpbyx.f
zaxpby: examples\blas\source\zaxpbyx.f

Specific details for the routine axpby interface are the following:
x Holds the array of size n.
y Holds the array of size n.
a The default value is 1.
b The default value is 1.
?gem2vu
Computes two matrix-vector products using a general
matrix (real data)
Syntax
call sgem2vu(m, n, alpha, a, lda, x1, incx1, x2, incx2, beta, y1, incy1, y2, incy2)
call dgem2vu(m, n, alpha, a, lda, x1, incx1, x2, incx2, beta, y1, incy1, y2, incy2)
call gem2vu(a, x1, x2, y1, y2 [,alpha][,beta] )
Include Files
mkl.fi, blas.f90
Description
The ?gem2vu routines perform two matrix-vector operations defined as
y1 := alpha*A*x1 + beta*y1,
and
y2 := alpha*A'*x2 + beta*y2,
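where:
alpha and beta are scalars,
x1, x2, y1, and y2 are vectors,
A is an m-by-n matrix.
Input Parameters
m INTEGER. Specifies the number of rows of the matrix A. The value of m must be at least zero.
n INTEGER. Specifies the number of columns of the matrix A. The value of n must be at least zero.
alpha REAL for sgem2vu
DOUBLE PRECISION for dgem2vu
Specifies the scalar alpha.
a REAL for sgem2vu
DOUBLE PRECISION for dgem2vu
Array, size (lda, n). Before entry, the leading m-by-n part of the array a
must contain the matrix of coefficients.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program. The value of lda must be at least max(1, m).
x1 REAL for sgem2vu
DOUBLE PRECISION for dgem2vu
Array, size at least (1+(n-1)*abs(incx1)). Before entry, the incremented
array x1 must contain the vector x1.
incx1 INTEGER. Specifies the increment for the elements of x1.
The value of incx1 must not be zero.
x2 REAL for sgem2vu
DOUBLE PRECISION for dgem2vu
Array, size at least (1+(m-1)*abs(incx2)). Before entry, the incremented
array x2 must contain the vector x2.
incx2 INTEGER. Specifies the increment for the elements of x2.
The value of incx2 must not be zero.
beta REAL for sgem2vu
DOUBLE PRECISION for dgem2vu
Specifies the scalar beta. When beta is set to zero, then y1 and y2 need not
be set on input.
y1 REAL for sgem2vu
DOUBLE PRECISION for dgem2vu
Array, size at least (1+(m-1)*abs(incy1)). Before entry with non-zero
beta, the incremented array y1 must contain the vector y1.
incy1 INTEGER. Specifies the increment for the elements of y1.
The value of incy1 must not be zero.
y2 REAL for sgem2vu
DOUBLE PRECISION for dgem2vu
Array, size at least (1+(n-1)*abs(incy2)). Before entry with non-zero
beta, the incremented array y2 must contain the vector y2.
incy2 INTEGER. Specifies the increment for the elements of y2.
The value of incy2 must not be zero.
Output Parameters
y1 Updated vector y1.
y2 Updated vector y2.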
Specific details for the routine gem2vu interface are the following:
x1 Holds the vector with the number of elements rx1 where rx1 = n.
x2 Holds the vector with the number of elements rx2 where rx2 = m.
y1 Holds the vector with the number of elements ry1 where ry1 = m.
y2 Holds the vector with the number of elements ry2 where ry2 = n.
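The vector lengths listed above (x1 and y2 of length n, x2 and y1 of length m) are easy to mix up, so a small check against the Fortran intrinsics may help. This is a sketch only: it assumes Intel MKL is linked, uses the FORTRAN 77 style sgem2vu call from the Syntax section with unit increments, and the data and tolerance are arbitrary.

```fortran
program gem2vu_sketch
  implicit none
  integer, parameter :: m = 3, n = 2
  real :: a(m,n), x1(n), x2(m), y1(m), y2(n)
  real :: ref1(m), ref2(n), alpha, beta
  external :: sgem2vu           ! MKL BLAS-like extension (assumes MKL is linked)

  a  = reshape([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [m, n])
  x1 = [1.0, -1.0]              ! length n, multiplied by A
  x2 = [1.0, 0.0, 2.0]          ! length m, multiplied by A'
  y1 = 1.0
  y2 = 1.0
  alpha = 2.0
  beta  = 3.0

  ! Reference results from the defining formulas, computed before the call.
  ref1 = alpha*matmul(a, x1) + beta*y1
  ref2 = alpha*matmul(transpose(a), x2) + beta*y2

  ! lda = m because a is declared as a(m,n); all increments are 1.
  call sgem2vu(m, n, alpha, a, m, x1, 1, x2, 1, beta, y1, 1, y2, 1)

  if (maxval(abs(y1 - ref1)) > 1.0e-5) error stop 'y1 mismatch'
  if (maxval(abs(y2 - ref2)) > 1.0e-5) error stop 'y2 mismatch'
  print *, 'y1 =', y1
  print *, 'y2 =', y2
end program gem2vu_sketch
```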
?gem2vc
Computes two matrix-vector products using a general
matrix (complex data)
Syntax
call cgem2vc(m, n, alpha, a, lda, x1, incx1, x2, incx2, beta, y1, incy1, y2, incy2)
call zgem2vc(m, n, alpha, a, lda, x1, incx1, x2, incx2, beta, y1, incy1, y2, incy2)
call gem2vc(a, x1, x2, y1, y2 [,alpha][,beta] )
Include Files
mkl.fi, blas.f90
Description
The ?gem2vc routines perform two matrix-vector operations defined as
y1 := alpha*A*x1 + beta*y1,
and
y2 := alpha*conjg(A')*x2 + beta*y2,
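where:
alpha and beta are scalars,
x1, x2, y1, and y2 are vectors,
A is an m-by-n matrix.
Input Parameters
m INTEGER. Specifies the number of rows of the matrix A. The value of m must be at least zero.
n INTEGER. Specifies the number of columns of the matrix A. The value of n must be at least zero.
alpha COMPLEX for cgem2vc
DOUBLE COMPLEX for zgem2vc
Specifies the scalar alpha.
a COMPLEX for cgem2vc
DOUBLE COMPLEX for zgem2vc
Array, size (lda, n). Before entry, the leading m-by-n part of the array a
must contain the matrix of coefficients.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program. The value of lda must be at least max(1, m).
x1 COMPLEX for cgem2vc
DOUBLE COMPLEX for zgem2vc
Array, size at least (1+(n-1)*abs(incx1)). Before entry, the incremented
array x1 must contain the vector x1.
incx1 INTEGER. Specifies the increment for the elements of x1.
The value of incx1 must not be zero.
x2 COMPLEX for cgem2vc
DOUBLE COMPLEX for zgem2vc
Array, size at least (1+(m-1)*abs(incx2)). Before entry, the incremented
array x2 must contain the vector x2.
incx2 INTEGER. Specifies the increment for the elements of x2.
The value of incx2 must not be zero.
beta COMPLEX for cgem2vc
DOUBLE COMPLEX for zgem2vc
Specifies the scalar beta. When beta is set to zero, then y1 and y2 need not
be set on input.
y1 COMPLEX for cgem2vc
DOUBLE COMPLEX for zgem2vc
Array, size at least (1+(m-1)*abs(incy1)). Before entry with non-zero
beta, the incremented array y1 must contain the vector y1.
incy1 INTEGER. Specifies the increment for the elements of y1.
The value of incy1 must not be zero.
y2 COMPLEX for cgem2vc
DOUBLE COMPLEX for zgem2vc
Array, size at least (1+(n-1)*abs(incy2)). Before entry with non-zero
beta, the incremented array y2 must contain the vector y2.
incy2 INTEGER. Specifies the increment for the elements of y2.
The value of incy2 must not be zero.
Output Parameters
y1 Updated vector y1.
y2 Updated vector y2.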
Specific details for the routine gem2vc interface are the following:
x1 Holds the vector with the number of elements rx1 where rx1 = n.
x2 Holds the vector with the number of elements rx2 where rx2 = m.
y1 Holds the vector with the number of elements ry1 where ry1 = m.
y2 Holds the vector with the number of elements ry2 where ry2 = n.
?gemmt
Computes a matrix-matrix product with general
matrices but updates only the upper or lower
triangular part of the result matrix.
Syntax
call sgemmt (uplo, transa, transb, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call dgemmt (uplo, transa, transb, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call cgemmt (uplo, transa, transb, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call zgemmt (uplo, transa, transb, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call gemmt (a, b, c[, uplo] [, transa] [, transb] [, alpha] [, beta])
Include Files
mkl.fi, blas.f90
Description
The ?gemmt routines compute a scalar-matrix-matrix product with general matrices and add the result to the
upper or lower part of a scalar-matrix product. These routines are similar to the ?gemm routines, but they
only access and update a triangular part of the square result matrix (see Application Notes below).
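The operation is defined as
C := alpha*op(A)*op(B) + beta*C,
where:
op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H,
alpha and beta are scalars,
A, B and C are matrices:
op(A) is an n-by-k matrix,
op(B) is a k-by-n matrix,
C is an n-by-n upper or lower triangular matrix.
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array c is used. If uplo = 'U' or 'u', then the upper triangular part of the
array c is used. If uplo = 'L' or 'l', then the lower triangular part of the
array c is used.
transa CHARACTER*1. Specifies the form of op(A) used in the matrix
multiplication:
if transa = 'N' or 'n', then op(A) = A;
if transa = 'T' or 't', then op(A) = A^T;
if transa = 'C' or 'c', then op(A) = A^H.
transb CHARACTER*1. Specifies the form of op(B) used in the matrix
multiplication:
if transb = 'N' or 'n', then op(B) = B;
if transb = 'T' or 't', then op(B) = B^T;
if transb = 'C' or 'c', then op(B) = B^H.
n INTEGER. Specifies the order of the matrix C. The value of n must be at
least zero.
k INTEGER. Specifies the number of columns of the matrix op(A) and the
number of rows of the matrix op(B). The value of k must be at least zero.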
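alpha REAL for sgemmt
DOUBLE PRECISION for dgemmt
COMPLEX for cgemmt
DOUBLE COMPLEX for zgemmt
Specifies the scalar alpha.
a REAL for sgemmt
DOUBLE PRECISION for dgemmt
COMPLEX for cgemmt
DOUBLE COMPLEX for zgemmt
Array, size lda by ka, where ka is k when transa = 'N' or 'n', and is n
otherwise. Before entry with transa = 'N' or 'n', the leading n-by-k part
of the array a must contain the matrix A, otherwise the leading k-by-n part
of the array a must contain the matrix A.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program. When transa = 'N' or 'n', then lda must be at least
max(1, n), otherwise lda must be at least max(1, k).
b REAL for sgemmt
DOUBLE PRECISION for dgemmt
COMPLEX for cgemmt
DOUBLE COMPLEX for zgemmt
Array, size ldb by kb, where kb is n when transb = 'N' or 'n', and is k
otherwise. Before entry with transb = 'N' or 'n', the leading k-by-n part
of the array b must contain the matrix B, otherwise the leading n-by-k part
of the array b must contain the matrix B.
ldb INTEGER. Specifies the leading dimension of b as declared in the calling
(sub)program. When transb = 'N' or 'n', then ldb must be at least
max(1, k), otherwise ldb must be at least max(1, n).
beta REAL for sgemmt
DOUBLE PRECISION for dgemmt
COMPLEX for cgemmt
DOUBLE COMPLEX for zgemmt
Specifies the scalar beta. When beta is equal to zero, then c need not be
set on input.
c REAL for sgemmt
DOUBLE PRECISION for dgemmt
COMPLEX for cgemmt
DOUBLE COMPLEX for zgemmt
Array, size ldc by n.
Before entry with uplo = 'U' or 'u', the leading n-by-n upper triangular
part of the array c must contain the upper triangular part of the matrix C
and the strictly lower triangular part of c is not referenced.
Before entry with uplo = 'L' or 'l', the leading n-by-n lower triangular
part of the array c must contain the lower triangular part of the matrix C
and the strictly upper triangular part of c is not referenced.
ldc INTEGER. Specifies the leading dimension of c as declared in the calling
(sub)program. The value of ldc must be at least max(1, n).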
Output Parameters
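c When uplo = 'U' or 'u', the upper triangular part of the array c is
overwritten by the upper triangular part of the updated matrix.
When uplo = 'L' or 'l', the lower triangular part of the array c is
overwritten by the lower triangular part of the updated matrix.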
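Fortran 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or reconstructible arguments, see Fortran 95
Interface Conventions for BLAS Routines.
Specific details for the routine gemmt interface are the following:
a Holds the matrix A of size (ma,ka) where
ka = k if transa = 'N',
ka = n otherwise,
ma = n if transa = 'N',
ma = k otherwise.
b Holds the matrix B of size (mb,kb) where
kb = n if transb = 'N',
kb = k otherwise,
mb = k if transb = 'N',
mb = n otherwise.
c Holds the matrix C of size (n,n).
uplo Must be 'U' or 'L'.
The default value is 'U'.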
Application Notes
These routines only access and update the upper or lower triangular part of the result matrix. This can be
useful when the result is known to be symmetric; for example, when computing a product of the form C :=
alpha*B*S*B^T + beta*C, where S and C are symmetric matrices and B is a general matrix. In this case,
first compute A := B*S (which can be done using the corresponding ?symm routine), then compute C :=
alpha*A*B^T + beta*C using the ?gemmt routine.
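The two-step recipe in these notes can be sketched as follows. The example assumes Intel MKL is linked and uses the standard BLAS dsymm call with side = 'R' to form A := B*S, followed by the dgemmt call from the Syntax section; the dimension name p, the sample data, and the checks are illustrative assumptions.

```fortran
program gemmt_sketch
  implicit none
  integer, parameter :: n = 3, p = 2      ! B is n-by-p, S is p-by-p (p is a hypothetical name)
  double precision :: b(n,p), s(p,p), a(n,p), c(n,n), c0(n,n), ref(n,n)
  double precision :: alpha, beta
  integer :: i, j
  external :: dsymm, dgemmt               ! assumes MKL is linked

  b = reshape([1d0, 2d0, 3d0, 4d0, 5d0, 6d0], [n, p])
  s = reshape([2d0, 1d0, 1d0, 3d0], [p, p])          ! symmetric
  do j = 1, n
    do i = 1, n
      c0(i,j) = dble(i + j)                          ! symmetric starting C
    end do
  end do
  c = c0
  alpha = 0.5d0
  beta  = 2d0

  ! Full reference result, formed with intrinsics.
  ref = alpha*matmul(matmul(b, s), transpose(b)) + beta*c0

  ! Step 1: A := B*S, with the symmetric matrix S applied from the right.
  call dsymm('R', 'U', n, p, 1d0, s, p, b, n, 0d0, a, n)
  ! Step 2: C := alpha*A*B^T + beta*C, updating only the upper triangle.
  call dgemmt('U', 'N', 'T', n, p, alpha, a, n, b, n, beta, c, n)

  do j = 1, n
    do i = 1, n
      if (i <= j) then
        if (abs(c(i,j) - ref(i,j)) > 1d-12) error stop 'upper triangle mismatch'
      else
        if (c(i,j) /= c0(i,j)) error stop 'strictly lower triangle was modified'
      end if
    end do
  end do
  print *, 'upper triangle updated; strictly lower triangle untouched'
end program gemmt_sketch
```

Only the triangle selected by uplo holds the result afterwards, so code that needs the full symmetric matrix has to mirror it explicitly.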
?gemm3m
Computes a scalar-matrix-matrix product using matrix
multiplications and adds the result to a scalar-matrix
product.
Syntax
call cgemm3m(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call zgemm3m(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
call gemm3m(a, b, c [,transa][,transb] [,alpha][,beta])
Include Files
mkl.fi, blas.f90
Description
The ?gemm3m routines perform a matrix-matrix operation with general complex matrices. These routines are
similar to the ?gemm routines, but they use fewer matrix multiplication operations (see Application Notes
below).
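The operation is defined as
C := alpha*op(A)*op(B) + beta*C,
where:
op(x) is one of op(x) = x, or op(x) = x', or op(x) = conjg(x'),
alpha and beta are scalars,
A, B and C are matrices:
op(A) is an m-by-k matrix,
op(B) is a k-by-n matrix,
C is an m-by-n matrix.
Input Parameters
transa CHARACTER*1. Specifies the form of op(A) used in the matrix
multiplication:
if transa = 'N' or 'n', then op(A) = A;
if transa = 'T' or 't', then op(A) = A';
if transa = 'C' or 'c', then op(A) = conjg(A').
transb CHARACTER*1. Specifies the form of op(B) used in the matrix
multiplication:
if transb = 'N' or 'n', then op(B) = B;
if transb = 'T' or 't', then op(B) = B';
if transb = 'C' or 'c', then op(B) = conjg(B').
m INTEGER. Specifies the number of rows of the matrix op(A) and of the
matrix C. The value of m must be at least zero.
n INTEGER. Specifies the number of columns of the matrix op(B) and the
number of columns of the matrix C. The value of n must be at least zero.
k INTEGER. Specifies the number of columns of the matrix op(A) and the
number of rows of the matrix op(B). The value of k must be at least zero.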
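alpha COMPLEX for cgemm3m
DOUBLE COMPLEX for zgemm3m
Specifies the scalar alpha.
a COMPLEX for cgemm3m
DOUBLE COMPLEX for zgemm3m
Array, size lda by ka, where ka is k when transa = 'N' or 'n', and is m
otherwise. Before entry with transa = 'N' or 'n', the leading m-by-k part
of the array a must contain the matrix A, otherwise the leading k-by-m part
of the array a must contain the matrix A.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program.
When transa = 'N' or 'n', then lda must be at least max(1, m),
otherwise lda must be at least max(1, k).
b COMPLEX for cgemm3m
DOUBLE COMPLEX for zgemm3m
Array, size ldb by kb, where kb is n when transb = 'N' or 'n', and is k
otherwise. Before entry with transb = 'N' or 'n', the leading k-by-n part
of the array b must contain the matrix B, otherwise the leading n-by-k part
of the array b must contain the matrix B.
ldb INTEGER. Specifies the leading dimension of b as declared in the calling
(sub)program.
When transb = 'N' or 'n', then ldb must be at least max(1, k),
otherwise ldb must be at least max(1, n).
beta COMPLEX for cgemm3m
DOUBLE COMPLEX for zgemm3m
Specifies the scalar beta. When beta is equal to zero, then c need not be
set on input.
c COMPLEX for cgemm3m
DOUBLE COMPLEX for zgemm3m
Array, size ldc by n. Before entry, the leading m-by-n part of the array c
must contain the matrix C, except when beta is equal to zero, in which
case c need not be set on entry.
ldc INTEGER. Specifies the leading dimension of c as declared in the calling
(sub)program.
The value of ldc must be at least max(1, m).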
Output Parameters
c Overwritten by the m-by-n matrix (alpha*op(A)*op(B) + beta*C).

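Specific details for the routine gemm3m interface are the following:
a Holds the matrix A of size (ma,ka) where
ka = k if transa = 'N',
ka = m otherwise,
ma = m if transa = 'N',
ma = k otherwise.
b Holds the matrix B of size (mb,kb) where
kb = n if transb = 'N',
kb = k otherwise,
mb = k if transb = 'N',
mb = n otherwise.
c Holds the matrix C of size (m,n).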
Application Notes
These routines perform a complex matrix multiplication by forming the real and imaginary parts of the input
matrices. This uses three real matrix multiplications and five real matrix additions instead of the conventional
four real matrix multiplications and two real matrix additions. The use of three real matrix multiplications
reduces the time spent in matrix operations by 25%, resulting in significant savings in compute time for
large matrices.
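If the errors in the floating point calculations satisfy the following conditions:
$fl(x \,\mathrm{op}\, y) = (x \,\mathrm{op}\, y)(1+\delta),\ |\delta| \le u,\ \mathrm{op} = \times, /$
$fl(x \pm y) = x(1+\alpha) \pm y(1+\beta),\ |\alpha|, |\beta| \le u$
then for an n-by-n matrix $\hat{C} = fl(C_1 + iC_2) = fl((A_1 + iA_2)(B_1 + iB_2)) = \hat{C}_1 + i\hat{C}_2$, the following bounds are
satisfied:
$\|\hat{C}_1 - C_1\| \le 2(n+1)\,u\,\|A\|\,\|B\| + O(u^2),$
$\|\hat{C}_2 - C_2\| \le 4(n+4)\,u\,\|A\|\,\|B\| + O(u^2),$
where $\|A\| = \max(\|A_1\|, \|A_2\|)$, and $\|B\| = \max(\|B_1\|, \|B_2\|)$.
Thus the corresponding matrix multiplications are stable.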
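The count of three real multiplications and five real additions can be made concrete with one standard 3M formulation. The sketch below is pure Fortran and does not call MKL; whether the library uses exactly these intermediate products internally is an assumption, and the names t1, t2, and t3 are illustrative.

```fortran
program gemm3m_idea
  implicit none
  integer, parameter :: n = 2
  real :: a1(n,n), a2(n,n), b1(n,n), b2(n,n)      ! real and imaginary parts of A and B
  real :: t1(n,n), t2(n,n), t3(n,n), c1(n,n), c2(n,n)
  complex :: c_ref(n,n)

  a1 = reshape([1.0, 2.0, 3.0, 4.0], [n, n])
  a2 = reshape([0.5, -1.0, 2.0, 1.5], [n, n])
  b1 = reshape([2.0, 0.0, -1.0, 1.0], [n, n])
  b2 = reshape([1.0, 3.0, 0.5, -2.0], [n, n])

  ! Three real matrix multiplications.
  t1 = matmul(a1, b1)
  t2 = matmul(a2, b2)
  t3 = matmul(a1 + a2, b1 + b2)     ! two of the five additions happen here

  ! Remaining three additions/subtractions.
  c1 = t1 - t2                      ! real part:      A1*B1 - A2*B2
  c2 = t3 - t1 - t2                 ! imaginary part: A1*B2 + A2*B1

  ! Conventional complex product for comparison.
  c_ref = matmul(cmplx(a1, a2), cmplx(b1, b2))
  if (maxval(abs(cmplx(c1, c2) - c_ref)) > 1.0e-5) error stop '3M result mismatch'
  print *, 'real part:     ', c1
  print *, 'imaginary part:', c2
end program gemm3m_idea
```

The imaginary part is obtained by subtraction from t3, which is why its error bound in the notes above is looser than the bound for the real part.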
?gemm_batch
Computes scalar-matrix-matrix products and adds the
results to scalar matrix products for groups of general
matrices.
Syntax
call sgemm_batch(transa_array, transb_array, m_array, n_array, k_array, alpha_array,
a_array, lda_array, b_array, ldb_array, beta_array, c_array, ldc_array, group_count,
group_size)
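call dgemm_batch(transa_array, transb_array, m_array, n_array, k_array, alpha_array,
a_array, lda_array, b_array, ldb_array, beta_array, c_array, ldc_array, group_count,
group_size)
call cgemm_batch(transa_array, transb_array, m_array, n_array, k_array, alpha_array,
a_array, lda_array, b_array, ldb_array, beta_array, c_array, ldc_array, group_count,
group_size)
call zgemm_batch(transa_array, transb_array, m_array, n_array, k_array, alpha_array,
a_array, lda_array, b_array, ldb_array, beta_array, c_array, ldc_array, group_count,
group_size)
call sgemm_batch(a_array, b_array, c_array, m_array, n_array, k_array, group_size
[,transa_array][,transb_array] [,alpha_array][,beta_array])
call dgemm_batch(a_array, b_array, c_array, m_array, n_array, k_array, group_size
[,transa_array][,transb_array] [,alpha_array][,beta_array])
call cgemm_batch(a_array, b_array, c_array, m_array, n_array, k_array, group_size
[,transa_array][,transb_array] [,alpha_array][,beta_array])
call zgemm_batch(a_array, b_array, c_array, m_array, n_array, k_array, group_size
[,transa_array][,transb_array] [,alpha_array][,beta_array])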
Include Files
mkl.fi, blas.f90
Description
The ?gemm_batch routines perform a series of matrix-matrix operations with general matrices. They are
similar to the ?gemm routine counterparts, but the ?gemm_batch routines perform matrix-matrix operations
with groups of matrices, processing a number of groups at once. The groups contain matrices with the same
parameters.
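The operation is defined as
idx = 1
for i = 1..group_count
    alpha and beta in alpha_array(i) and beta_array(i)
    for j = 1..group_size(i)
        A, B, and C matrix in a_array(idx), b_array(idx), and c_array(idx)
        C := alpha*op(A)*op(B) + beta*C,
        idx = idx + 1
    end for
end for
where:
op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H,
alpha and beta are scalar elements of alpha_array and beta_array,
A, B and C are matrices such that for m, n, and k which are elements of m_array, n_array, and k_array:
op(A) is an m-by-k matrix,
op(B) is a k-by-n matrix,
C is an m-by-n matrix.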
A, B, and C represent matrices stored at addresses pointed to by a_array, b_array, and c_array,
respectively. The number of entries in a_array, b_array, and c_array is total_batch_count = the sum of all
of the group_size entries.
See also gemm for a detailed description of multiplication for general matrices and ?gemm3m_batch, BLAS-
like extension routines for similar matrix-matrix operations.
NOTE
Error checking is not performed for Intel MKL Windows* single dynamic libraries for the ?gemm_batch
routines.
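The grouping scheme can be illustrated with two groups: two 2-by-2 products that share one set of parameters, and one 3-by-3 product with another. This is a hedged sketch, not a definitive usage pattern: it assumes Intel MKL is linked, an Intel 64 target (so the pointer arrays are 8-byte integers), and a compiler that provides the non-standard LOC extension for taking addresses. The array names a1, a2, and so on are illustrative.

```fortran
program gemm_batch_sketch
  implicit none
  integer, parameter :: group_count = 2
  integer, parameter :: total = 3                 ! total_batch_count = sum(group_size)
  character(len=1) :: transa_array(group_count), transb_array(group_count)
  integer :: m_array(group_count), n_array(group_count), k_array(group_count)
  integer :: lda_array(group_count), ldb_array(group_count), ldc_array(group_count)
  integer :: group_size(group_count)
  double precision :: alpha_array(group_count), beta_array(group_count)
  integer(kind=8) :: a_array(total), b_array(total), c_array(total)  ! assumes Intel 64
  double precision :: a1(2,2,2), b1(2,2,2), a2(3,3), b2(3,3)
  ! volatile: the C matrices are modified through raw addresses, not named arguments.
  double precision, volatile :: c1(2,2,2), c2(3,3)
  double precision :: ref1(2,2,2), ref2(3,3)
  integer :: i, j
  external :: dgemm_batch                         ! assumes MKL is linked

  a1 = reshape([(dble(i), i = 1, 8)], [2, 2, 2])
  b1 = reshape([(dble(9 - i), i = 1, 8)], [2, 2, 2])
  a2 = reshape([(dble(i), i = 1, 9)], [3, 3])
  b2 = reshape([(dble(2*i), i = 1, 9)], [3, 3])
  c1 = 1d0
  c2 = 1d0

  transa_array = 'N'
  transb_array = 'N'
  m_array = [2, 3];  n_array = [2, 3];  k_array = [2, 3]
  lda_array = [2, 3]; ldb_array = [2, 3]; ldc_array = [2, 3]
  alpha_array = [1d0, 2d0]
  beta_array  = [0d0, 1d0]
  group_size  = [2, 1]

  ! Group 1 occupies entries 1..2 of the pointer arrays, group 2 entry 3.
  do j = 1, 2
    ref1(:,:,j) = alpha_array(1)*matmul(a1(:,:,j), b1(:,:,j)) + beta_array(1)*c1(:,:,j)
    a_array(j) = loc(a1(1,1,j))
    b_array(j) = loc(b1(1,1,j))
    c_array(j) = loc(c1(1,1,j))
  end do
  ref2 = alpha_array(2)*matmul(a2, b2) + beta_array(2)*c2
  a_array(3) = loc(a2)
  b_array(3) = loc(b2)
  c_array(3) = loc(c2)

  call dgemm_batch(transa_array, transb_array, m_array, n_array, k_array, alpha_array, &
                   a_array, lda_array, b_array, ldb_array, beta_array, c_array, ldc_array, &
                   group_count, group_size)

  if (maxval(abs(c1 - ref1)) > 1d-12) error stop 'group 1 mismatch'
  if (maxval(abs(c2 - ref2)) > 1d-12) error stop 'group 2 mismatch'
  print *, 'all batched products match the matmul reference'
end program gemm_batch_sketch
```

The per-group arrays have group_count entries, while the three pointer arrays have one entry per matrix, ordered group by group.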
Input Parameters
transa_array CHARACTER*1. Array of size group_count. For the group i, transa_i =
transa_array(i) specifies the form of op(A) used in the matrix
multiplication:
if transa_i = 'N' or 'n', then op(A) = A;
if transa_i = 'T' or 't', then op(A) = A^T;
if transa_i = 'C' or 'c', then op(A) = A^H.
transb_array CHARACTER*1. Array of size group_count. For the group i, transb_i =
transb_array(i) specifies the form of op(B) used in the matrix
multiplication:
if transb_i = 'N' or 'n', then op(B) = B;
if transb_i = 'T' or 't', then op(B) = B^T;
if transb_i = 'C' or 'c', then op(B) = B^H.
m_array INTEGER. Array of size group_count. For the group i, m_i = m_array(i)
specifies the number of rows of the matrix op(A) and of the matrix C.
The value of each element of m_array must be at least zero.
n_array INTEGER. Array of size group_count. For the group i, n_i = n_array(i)
specifies the number of columns of the matrix op(B) and the number of
columns of the matrix C.
The value of each element of n_array must be at least zero.
k_array INTEGER. Array of size group_count. For the group i, k_i = k_array(i)
specifies the number of columns of the matrix op(A) and the number of
rows of the matrix op(B).
The value of each element of k_array must be at least zero.
alpha_array REAL for sgemm_batch
DOUBLE PRECISION for dgemm_batch
COMPLEX for cgemm_batch
DOUBLE COMPLEX for zgemm_batch
Array of size group_count. For the group i, alpha_array(i) specifies the
scalar alpha_i.
a_array INTEGER*8 for Intel 64 architecture
INTEGER*4 for IA-32 architecture
Array, size total_batch_count, of pointers to arrays used to store A
matrices.
lda_array INTEGER. Array of size group_count. For the group i, lda_i =
lda_array(i) specifies the leading dimension of the array storing matrix A
as declared in the calling (sub)program.
When transa_i = 'N' or 'n', then lda_i must be at least max(1, m_i),
otherwise lda_i must be at least max(1, k_i).
b_array INTEGER*8 for Intel 64 architecture
INTEGER*4 for IA-32 architecture
Array, size total_batch_count, of pointers to arrays used to store B
matrices.
ldb_array INTEGER.
Array of size group_count. For the group i, ldb_i = ldb_array(i)
specifies the leading dimension of the array storing matrix B as declared in
the calling (sub)program.
When transb_i = 'N' or 'n', then ldb_i must be at least max(1, k_i),
otherwise ldb_i must be at least max(1, n_i).
beta_array REAL for sgemm_batch
DOUBLE PRECISION for dgemm_batch
COMPLEX for cgemm_batch
DOUBLE COMPLEX for zgemm_batch
Array of size group_count. For the group i, beta_array(i) specifies the
scalar beta_i.
When beta_i is equal to zero, then C matrices in group i need not be set on
input.
c_array INTEGER*8 for Intel 64 architecture
INTEGER*4 for IA-32 architecture
Array, size total_batch_count, of pointers to arrays used to store C
matrices.
ldc_array INTEGER.
Array of size group_count. For the group i, ldc_i = ldc_array(i)
specifies the leading dimension of all arrays storing matrix C in group i as
declared in the calling (sub)program.
ldc_i must be at least max(1, m_i).
group_count INTEGER.
Specifies the number of groups. Must be at least 0.
group_size INTEGER.
Array of size group_count. The element group_size(i) specifies the
number of matrices in group i. Each element in group_size must be at
least 0.
Output Parameters
c_array Overwritten by the m_i-by-n_i matrix (alpha_i*op(A)*op(B) + beta_i*C) for
group i.

Specific details for the routine gemm_batch interface are the following:
a_array Holds pointers to arrays containing matrices A of size (ma,ka) where
ka = k if transa_array = 'N',
ka = m otherwise,
ma = m if transa_array = 'N',
ma = k otherwise.
b_array Holds pointers to arrays containing matrices B of size (mb,kb) where
kb = n if transb_array = 'N',
kb = k otherwise,
mb = k if transb_array = 'N',
mb = n otherwise.
c_array Holds pointers to arrays containing matrices C of size (m,n).
m_array Array indicating number of rows of matrices op(A) and C for each group.
n_array Array indicating number of columns of matrices op(B) and C for each
group.
k_array Array indicating number of columns of matrices op(A) and number of rows
of matrices op(B) for each group.
group_size Array indicating number of matrices for each group. Each element in
group_size must be at least 0.
transa_array Array with each element set to one of 'N', 'C', or 'T'.
The default values are 'N'.
transb_array Array with each element set to one of 'N', 'C', or 'T'.
The default values are 'N'.
alpha_array Array of alpha values; the default value is 1.
beta_array Array of beta values; the default value is 0.
?gemm3m_batch
Computes scalar-matrix-matrix products and adds the
results to scalar matrix products for groups of general
matrices.
Syntax
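call cgemm3m_batch(transa_array, transb_array, m_array, n_array, k_array, alpha_array,
a_array, lda_array, b_array, ldb_array, beta_array, c_array, ldc_array, group_count,
group_size)
call zgemm3m_batch(transa_array, transb_array, m_array, n_array, k_array, alpha_array,
a_array, lda_array, b_array, ldb_array, beta_array, c_array, ldc_array, group_count,
group_size)
call cgemm3m_batch(a_array, b_array, c_array, m_array, n_array, k_array, group_size
[,transa_array][,transb_array] [,alpha_array][,beta_array])
call zgemm3m_batch(a_array, b_array, c_array, m_array, n_array, k_array, group_size
[,transa_array][,transb_array] [,alpha_array][,beta_array])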
Include Files
mkl.fi, blas.f90
Description
The ?gemm3m_batch routines perform a series of matrix-matrix operations with general matrices. They are
similar to the ?gemm3m routine counterparts, but the ?gemm3m_batch routines perform matrix-matrix
operations with groups of matrices, processing a number of groups at once. The groups contain matrices with
the same parameters. The ?gemm3m_batch routines use fewer matrix multiplications than the ?gemm_batch
routines, as described in the Application Notes.
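The operation is defined as
idx = 1
for i = 1..group_count
    alpha and beta in alpha_array(i) and beta_array(i)
    for j = 1..group_size(i)
        A, B, and C matrix in a_array(idx), b_array(idx), and c_array(idx)
        C := alpha*op(A)*op(B) + beta*C,
        idx = idx + 1
    end for
end for
where:
op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H,
alpha and beta are scalar elements of alpha_array and beta_array,
A, B and C are matrices such that for m, n, and k which are elements of m_array, n_array, and k_array:
op(A) is an m-by-k matrix,
op(B) is a k-by-n matrix,
C is an m-by-n matrix.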
A, B, and C represent matrices stored at addresses pointed to by a_array, b_array, and c_array,
respectively. The number of entries in a_array, b_array, and c_array is total_batch_count = the sum of all
the group_size entries.
See also gemm for a detailed description of multiplication for general matrices and gemm_batch, BLAS-like
extension routines for similar matrix-matrix operations.
NOTE
Error checking is not performed for Intel MKL Windows* single dynamic libraries for the
?gemm3m_batch routines.
Input Parameters
transa_array CHARACTER*1. Array of size group_count. For the group i, transa_i =
transa_array(i) specifies the form of op(A) used in the matrix
multiplication:
if transa_i = 'N' or 'n', then op(A) = A;
if transa_i = 'T' or 't', then op(A) = A^T;
if transa_i = 'C' or 'c', then op(A) = A^H.
transb_array CHARACTER*1. Array of size group_count. For the group i, transb_i =
transb_array(i) specifies the form of op(B) used in the matrix
multiplication:
if transb_i = 'N' or 'n', then op(B) = B;
if transb_i = 'T' or 't', then op(B) = B^T;
if transb_i = 'C' or 'c', then op(B) = B^H.
m_array INTEGER. Array of size group_count. For the group i, m_i = m_array(i)
specifies the number of rows of the matrix op(A) and of the matrix C.
The value of each element of m_array must be at least zero.
n_array INTEGER. Array of size group_count. For the group i, n_i = n_array(i)
specifies the number of columns of the matrix op(B) and the number of
columns of the matrix C.
The value of each element of n_array must be at least zero.
k_array INTEGER. Array of size group_count. For the group i, k_i = k_array(i)
specifies the number of columns of the matrix op(A) and the number of
rows of the matrix op(B).
The value of each element of k_array must be at least zero.
alpha_array COMPLEX for cgemm3m_batch
DOUBLE COMPLEX for zgemm3m_batch
Array of size group_count. For the group i, alpha_array(i) specifies the
scalar alpha_i.
a_array INTEGER*8 for Intel 64 architecture
INTEGER*4 for IA-32 architecture
Array, size total_batch_count, of pointers to arrays used to store A
matrices.
lda_array INTEGER. Array of size group_count. For the group i, lda_i =
lda_array(i) specifies the leading dimension of the array storing matrix A
as declared in the calling (sub)program.
When transa_i = 'N' or 'n', then lda_i must be at least max(1, m_i),
otherwise lda_i must be at least max(1, k_i).
b_array INTEGER*8 for Intel 64 architecture
INTEGER*4 for IA-32 architecture
Array, size total_batch_count, of pointers to arrays used to store B
matrices.
ldb_array INTEGER.
Array of size group_count. For the group i, ldb_i = ldb_array(i)
specifies the leading dimension of the array storing matrix B as declared in
the calling (sub)program.
When transb_i = 'N' or 'n', then ldb_i must be at least max(1, k_i),
otherwise ldb_i must be at least max(1, n_i).
beta_array COMPLEX for cgemm3m_batch
DOUBLE COMPLEX for zgemm3m_batch
Array of size group_count. For the group i, beta_array(i) specifies the
scalar beta_i.
When beta_i is equal to zero, then C matrices in group i need not be set on
input.
c_array INTEGER*8 for Intel 64 architecture
INTEGER*4 for IA-32 architecture
Array, size total_batch_count, of pointers to arrays used to store C
matrices.
ldc_array INTEGER.
Array of size group_count. For the group i, ldc_i = ldc_array(i)
specifies the leading dimension of all arrays storing matrix C in group i as
declared in the calling (sub)program.
ldc_i must be at least max(1, m_i).
group_count INTEGER.
Specifies the number of groups. Must be at least 0.
group_size INTEGER.
Array of size group_count. The element group_size(i) specifies the
number of matrices in group i. Each element in group_size must be at
least 0.
Output Parameters
c_array Overwritten by the m_i-by-n_i matrix (alpha_i*op(A)*op(B) + beta_i*C) for
group i.

Specific details for the routine gemm3m_batch interface are the following:
a_array Holds pointers to arrays containing matrices A of size (ma,ka) where
ka = k if transa_array = 'N',
ka = m otherwise,
ma = m if transa_array = 'N',
ma = k otherwise.
b_array Holds pointers to arrays containing matrices B of size (mb,kb) where
kb = n if transb_array = 'N',
kb = k otherwise,
mb = k if transb_array = 'N',
mb = n otherwise.
c_array Holds pointers to arrays containing matrices C of size (m,n).
m_array Array indicating number of rows of matrices op(A) and C for each group.
n_array Array indicating number of columns of matrices op(B) and C for each
group.
k_array Array indicating number of columns of matrices op(A) and number of rows
of matrices op(B) for each group.
group_size Array indicating number of matrices for each group. Each element in
group_size must be at least 0.
transa_array Array with each element set to one of 'N', 'C', or 'T'.
The default values are 'N'.
transb_array Array with each element set to one of 'N', 'C', or 'T'.
The default values are 'N'.
alpha_array Array of alpha values; the default value is 1.
beta_array Array of beta values; the default value is 0.
Application Notes
These routines perform a complex matrix multiplication by forming the real and imaginary parts of the input
matrices. This uses three real matrix multiplications and five real matrix additions instead of the conventional
four real matrix multiplications and two real matrix additions. The use of three real matrix multiplications
reduces the time spent in matrix operations by 25%, resulting in significant savings in compute time for
large matrices.
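If the errors in the floating point calculations satisfy the following conditions:
$fl(x \,\mathrm{op}\, y) = (x \,\mathrm{op}\, y)(1+\delta),\ |\delta| \le u,\ \mathrm{op} = \times, /$
$fl(x \pm y) = x(1+\alpha) \pm y(1+\beta),\ |\alpha|, |\beta| \le u$
then for an n-by-n matrix $\hat{C} = fl(C_1 + iC_2) = fl((A_1 + iA_2)(B_1 + iB_2)) = \hat{C}_1 + i\hat{C}_2$, the following bounds are
satisfied:
$\|\hat{C}_1 - C_1\| \le 2(n+1)\,u\,\|A\|\,\|B\| + O(u^2),$
$\|\hat{C}_2 - C_2\| \le 4(n+4)\,u\,\|A\|\,\|B\| + O(u^2),$
where $\|A\| = \max(\|A_1\|, \|A_2\|)$, and $\|B\| = \max(\|B_1\|, \|B_2\|)$.
Thus the corresponding matrix multiplications are stable.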
mkl_?imatcopy
Performs scaling and in-place transposition/copying of
matrices.
Syntax
call mkl_simatcopy(ordering, trans, rows, cols, alpha, ab, lda, ldb)
call mkl_dimatcopy(ordering, trans, rows, cols, alpha, ab, lda, ldb)
call mkl_cimatcopy(ordering, trans, rows, cols, alpha, ab, lda, ldb)
call mkl_zimatcopy(ordering, trans, rows, cols, alpha, ab, lda, ldb)
Include Files
mkl.fi
Description
The mkl_?imatcopy routine performs scaling and in-place transposition/copying of matrices. A transposition
operation can be a normal matrix copy, a transposition, a conjugate transposition, or just a conjugation. The
operation is defined as follows:
AB := alpha*op(AB).
NOTE
Different arrays must not overlap.
Input Parameters
ordering CHARACTER*1. Ordering of the matrix storage.
If ordering = 'R' or 'r', the ordering is row-major.
If ordering = 'C' or 'c', the ordering is column-major.
trans CHARACTER*1. Parameter that specifies the operation type.
If trans = 'N' or 'n', op(AB)=AB and the matrix AB is assumed
unchanged on input.
If trans = 'T' or 't', it is assumed that AB should be transposed.
If trans = 'C' or 'c', it is assumed that AB should be conjugate
transposed.
If trans = 'R' or 'r', it is assumed that AB should be only conjugated.
If the data is real, then trans = 'R' is the same as trans = 'N', and
trans = 'C' is the same as trans = 'T'.
rows INTEGER. The number of rows in matrix AB before the transpose operation.
cols INTEGER. The number of columns in matrix AB before the transpose
operation.
ab REAL for mkl_simatcopy.
DOUBLE PRECISION for mkl_dimatcopy.
COMPLEX for mkl_cimatcopy.
DOUBLE COMPLEX for mkl_zimatcopy.
Array, size ab(lda,*).
alpha REAL for mkl_simatcopy.
DOUBLE PRECISION for mkl_dimatcopy.
COMPLEX for mkl_cimatcopy.
DOUBLE COMPLEX for mkl_zimatcopy.
This parameter scales the input matrix by alpha.
lda INTEGER. Distance between the first elements in adjacent columns (in the
case of the column-major order) or rows (in the case of the row-major
order) in the source matrix; measured in the number of elements.
This parameter must be at least max(1,rows) if ordering = 'C' or 'c', and
max(1,cols) otherwise.
ldb INTEGER. Distance between the first elements in adjacent columns (in the
case of the column-major order) or rows (in the case of the row-major
order) in the destination matrix; measured in the number of elements.
To determine the minimum value of ldb on output, consider the following
guideline:
If ordering = 'C' or 'c', then
If trans = 'T' or 't' or 'C' or 'c', this parameter must be at least
max(1,cols)
If trans = 'N' or 'n' or 'R' or 'r', this parameter must be at least
max(1,rows)
If ordering = 'R' or 'r', then
If trans = 'T' or 't' or 'C' or 'c', this parameter must be at least
max(1,rows)
If trans = 'N' or 'n' or 'R' or 'r', this parameter must be at least
max(1,cols)
Output Parameters
ab REAL for mkl_simatcopy.
DOUBLE PRECISION for mkl_dimatcopy.
COMPLEX for mkl_cimatcopy.
DOUBLE COMPLEX for mkl_zimatcopy.
Array, size ab(ldb,*).
Contains the matrix AB.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_simatcopy ( ordering, trans, rows, cols, alpha, ab, lda, ldb )
CHARACTER*1 ordering, trans
INTEGER rows, cols, lda, ldb
REAL ab(*), alpha
SUBROUTINE mkl_dimatcopy ( ordering, trans, rows, cols, alpha, ab, lda, ldb )
CHARACTER*1 ordering, trans
INTEGER rows, cols, lda, ldb
DOUBLE PRECISION ab(*), alpha
SUBROUTINE mkl_cimatcopy ( ordering, trans, rows, cols, alpha, ab, lda, ldb )
CHARACTER*1 ordering, trans
INTEGER rows, cols, lda, ldb
COMPLEX ab(*), alpha
SUBROUTINE mkl_zimatcopy ( ordering, trans, rows, cols, alpha, ab, lda, ldb )
CHARACTER*1 ordering, trans
INTEGER rows, cols, lda, ldb
DOUBLE COMPLEX ab(*), alpha
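A short sketch of an in-place scaled transpose may clarify how lda and ldb relate for a non-square matrix. It assumes Intel MKL is linked and column-major ordering; the buffer is declared one-dimensional because its logical shape changes from rows-by-cols (lda = rows) to cols-by-rows (ldb = cols). The data and names are illustrative.

```fortran
program imatcopy_sketch
  implicit none
  integer, parameter :: rows = 2, cols = 3
  double precision :: ab(rows*cols), src(rows,cols), ref(cols,rows)
  external :: mkl_dimatcopy       ! assumes MKL is linked

  src = reshape([1d0, 2d0, 3d0, 4d0, 5d0, 6d0], [rows, cols])
  ab  = reshape(src, [rows*cols])       ! column-major copy of the source matrix
  ref = 2d0*transpose(src)              ! expected result: AB := alpha*AB^T

  ! Column-major, transpose, alpha = 2; lda = rows on input, ldb = cols on output.
  call mkl_dimatcopy('C', 'T', rows, cols, 2d0, ab, rows, cols)

  if (any(reshape(ab, [cols, rows]) /= ref)) error stop 'imatcopy mismatch'
  print *, 'ab viewed as cols-by-rows:', ab
end program imatcopy_sketch
```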
mkl_?omatcopy
Performs scaling and out-of-place transposition/copying
of matrices.
Syntax
call mkl_somatcopy(ordering, trans, rows, cols, alpha, a, lda, b, ldb)
call mkl_domatcopy(ordering, trans, rows, cols, alpha, a, lda, b, ldb)
call mkl_comatcopy(ordering, trans, rows, cols, alpha, a, lda, b, ldb)
call mkl_zomatcopy(ordering, trans, rows, cols, alpha, a, lda, b, ldb)
Include Files
mkl.fi
Description
The mkl_?omatcopy routine performs scaling and out-of-place transposition/copying of matrices. A
transposition operation can be a normal matrix copy, a transposition, a conjugate transposition, or just a
conjugation. The operation is defined as follows:
B := alpha*op(A)
The routine parameter descriptions are common for all implemented interfaces with the exception of data
types that mostly refer here to the FORTRAN 77 standard types. Data types specific to the different
interfaces are described in the section "Interfaces" below.
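NOTE
Different arrays must not overlap.
Input Parameters
ordering CHARACTER*1. Ordering of the matrix storage.
If ordering = 'R' or 'r', the ordering is row-major.
If ordering = 'C' or 'c', the ordering is column-major.
trans CHARACTER*1. Parameter that specifies the operation type.
If trans = 'N' or 'n', op(A)=A and the matrix A is assumed unchanged
on input.
If trans = 'T' or 't', it is assumed that A should be transposed.
If trans = 'C' or 'c', it is assumed that A should be conjugate
transposed.
If trans = 'R' or 'r', it is assumed that A should be only conjugated.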
rows INTEGER. The number of rows in matrix A (the source matrix).
cols INTEGER. The number of columns in matrix A (the source matrix).
alpha REAL for mkl_somatcopy.
DOUBLE PRECISION for mkl_domatcopy.
COMPLEX for mkl_comatcopy.
DOUBLE COMPLEX for mkl_zomatcopy.
This parameter scales the input matrix by alpha.
a REAL for mkl_somatcopy.
DOUBLE PRECISION for mkl_domatcopy.
COMPLEX for mkl_comatcopy.
DOUBLE COMPLEX for mkl_zomatcopy.
Array, size a(lda,*).
lda INTEGER.
If ordering = 'R' or 'r', lda represents the number of elements in array
a between adjacent rows of matrix A; lda must be at least equal to the
number of columns of matrix A.
If ordering = 'C' or 'c', lda represents the number of elements in array
a between adjacent columns of matrix A; lda must be at least equal to the
number of rows in matrix A.
b REAL for mkl_somatcopy.
DOUBLE PRECISION for mkl_domatcopy.
COMPLEX for mkl_comatcopy.
DOUBLE COMPLEX for mkl_zomatcopy.
Array, size b(ldb,*).
ldb INTEGER.
If ordering = 'R' or 'r', ldb represents the number of elements in array
b between adjacent rows of matrix B.
If trans = 'T' or 't' or 'C' or 'c', ldb must be at least equal to
rows.
If trans = 'N' or 'n' or 'R' or 'r', ldb must be at least equal to
cols.
If ordering = 'C' or 'c', ldb represents the number of elements in array
b between adjacent columns of matrix B.
If trans = 'T' or 't' or 'C' or 'c', ldb must be at least equal to
cols.
If trans = 'N' or 'n' or 'R' or 'r', ldb must be at least equal to
rows.
Output Parameters
b REAL for mkl_somatcopy.
DOUBLE PRECISION for mkl_domatcopy.
COMPLEX for mkl_comatcopy.
DOUBLE COMPLEX for mkl_zomatcopy.
Contains the destination matrix.
Interfaces
FORTRAN 77:
SUBROUTINE mkl_somatcopy ( ordering, trans, rows, cols, alpha, a, lda, b, ldb )
CHARACTER*1 ordering, trans
INTEGER rows, cols, lda, ldb
REAL alpha, b(ldb,*), a(lda,*)
SUBROUTINE mkl_domatcopy ( ordering, trans, rows, cols, alpha, a, lda, b, ldb )
CHARACTER*1 ordering, trans
INTEGER rows, cols, lda, ldb
DOUBLE PRECISION alpha, b(ldb,*), a(lda,*)
SUBROUTINE mkl_comatcopy ( ordering, trans, rows, cols, alpha, a, lda, b, ldb )
CHARACTER*1 ordering, trans
INTEGER rows, cols, lda, ldb
COMPLEX alpha, b(ldb,*), a(lda,*)
SUBROUTINE mkl_zomatcopy ( ordering, trans, rows, cols, alpha, a, lda, b, ldb )
CHARACTER*1 ordering, trans
INTEGER rows, cols, lda, ldb
DOUBLE COMPLEX alpha, b(ldb,*), a(lda,*)
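The out-of-place variant can be sketched in the same way. The example assumes Intel MKL is linked, column-major ordering, and that rows and cols give the dimensions of the source matrix A, which is the reading consistent with the ldb rules above (for a column-major transpose, ldb must be at least cols). The data and names are illustrative.

```fortran
program omatcopy_sketch
  implicit none
  integer, parameter :: rows = 2, cols = 3
  double precision :: a(rows,cols), b(cols,rows), ref(cols,rows)
  double precision :: alpha
  external :: mkl_domatcopy       ! assumes MKL is linked

  a = reshape([1d0, 2d0, 3d0, 4d0, 5d0, 6d0], [rows, cols])
  b = 0d0
  alpha = 3d0
  ref = alpha*transpose(a)        ! expected result: B := alpha*A^T

  ! Column-major, transpose: lda = rows (leading dimension of A),
  ! ldb = cols (leading dimension of the cols-by-rows destination B).
  call mkl_domatcopy('C', 'T', rows, cols, alpha, a, rows, b, cols)

  if (any(b /= ref)) error stop 'omatcopy mismatch'
  print *, 'b =', b
end program omatcopy_sketch
```

Unlike the in-place routine, the source array a is left unchanged, and a and b must not overlap.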
mkl_?omatcopy2
Performs two-strided scaling and out-of-place
transposition/copying of matrices.
Syntax
call mkl_somatcopy2(ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb,
strideb)
call mkl_domatcopy2(ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb,
strideb)
call mkl_comatcopy2(ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb,
strideb)
call mkl_zomatcopy2(ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb,
strideb)
Include Files
mkl.fi
Description
The mkl_?omatcopy2 routine performs two-strided scaling and out-of-place transposition/copying of
matrices. A transposition operation can be a normal matrix copy, a transposition, a conjugate transposition,
or just a conjugation. The operation is defined as follows:
B := alpha*op(A)
Normally, matrices in the BLAS or LAPACK are specified by a single stride index. For instance, in the column-
major order, A(2,1) is stored in memory one element away from A(1,1), but A(1,2) is a leading dimension
away. The leading dimension in this case is at least the number of rows of the source matrix. If a matrix has
two strides, then both A(2,1) and A(1,2) may be an arbitrary distance from A(1,1).
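NOTE
Different arrays must not overlap.
Input Parameters
ordering CHARACTER*1. Ordering of the matrix storage.
If ordering = 'R' or 'r', the ordering is row-major.
If ordering = 'C' or 'c', the ordering is column-major.
trans CHARACTER*1. Parameter that specifies the operation type.
If trans = 'N' or 'n', op(A)=A and the matrix A is assumed unchanged
on input.
If trans = 'T' or 't', it is assumed that A should be transposed.
If trans = 'C' or 'c', it is assumed that A should be conjugate
transposed.
If trans = 'R' or 'r', it is assumed that A should be only conjugated.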
rows INTEGER. The number of rows in matrix B (the destination matrix).
cols INTEGER. The number of columns in matrix B (the destination matrix).
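alpha REAL for mkl_somatcopy2.
DOUBLE PRECISION for mkl_domatcopy2.
COMPLEX for mkl_comatcopy2.
DOUBLE COMPLEX for mkl_zomatcopy2.
This parameter scales the input matrix by alpha.
a REAL for mkl_somatcopy2.
DOUBLE PRECISION for mkl_domatcopy2.
COMPLEX for mkl_comatcopy2.
DOUBLE COMPLEX for mkl_zomatcopy2.
Array, size a(*).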
lda INTEGER.
If ordering = 'R' or 'r', lda represents the number of elements in array
a between adjacent rows of matrix A; lda must be at least equal to the
number of columns of matrix A.
If ordering = 'C' or 'c', lda represents the number of elements in array
a between adjacent columns of matrix A; lda must be at least 1 and not
more than the number of columns in matrix A.
stridea INTEGER.
If ordering = 'R' or 'r', stridea represents the number of elements in
array a between adjacent columns of matrix A. stridea must be at least 1
and not more than the number of columns in matrix A.
If ordering = 'C' or 'c', stridea represents the number of elements in
array a between adjacent rows of matrix A. stridea must be at least equal
to the number of columns in matrix A.
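b REAL for mkl_somatcopy2.
DOUBLE PRECISION for mkl_domatcopy2.
COMPLEX for mkl_comatcopy2.
DOUBLE COMPLEX for mkl_zomatcopy2.
Array, size b(*).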
ldb INTEGER.
If ordering = 'R' or 'r', ldb represents the number of elements in array
b between adjacent rows of matrix B.
If trans = 'T' or 't' or 'C' or 'c', ldb must be at least equal to
rows/strideb.
If trans = 'N' or 'n' or 'R' or 'r', ldb must be at least equal to
cols/strideb.
If ordering = 'C' or 'c', ldb represents the number of elements in array
b between adjacent columns of matrix B.
If trans = 'T' or 't' or 'C' or 'c', ldb must be at least 1 and not
more than rows/strideb.
If trans = 'N' or 'n' or 'R' or 'r', ldb must be at least 1 and not
more than cols/strideb.
strideb INTEGER.
If ordering = 'R' or 'r', strideb represents the number of elements in
array b between adjacent columns of matrix B.
If trans = 'T' or 't' or 'C' or 'c', strideb must be at least 1 and
not more than rows (the number of rows in matrix B).
If trans = 'N' or 'n' or 'R' or 'r', strideb must be at least 1 and
not more than cols (the number of columns in matrix B).
If ordering = 'C' or 'c', strideb represents the number of elements in
array b between adjacent rows of matrix B.
If trans = 'T' or 't' or 'C' or 'c', strideb must be at least equal to
rows (the number of rows in matrix B).
If trans = 'N' or 'n' or 'R' or 'r', strideb must be at least equal to
cols (the number of columns in matrix B).
Output Parameters
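b REAL for mkl_somatcopy2.
DOUBLE PRECISION for mkl_domatcopy2.
COMPLEX for mkl_comatcopy2.
DOUBLE COMPLEX for mkl_zomatcopy2.
Contains the destination matrix.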
Interfaces
FORTRAN 77:
SUBROUTINE mkl_somatcopy2 ( ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb, strideb )
INTEGER rows, cols, lda, stridea, ldb, strideb
REAL alpha, b(*), a(*)
SUBROUTINE mkl_domatcopy2 ( ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb, strideb )
DOUBLE PRECISION alpha, b(*), a(*)
SUBROUTINE mkl_comatcopy2 ( ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb, strideb )
COMPLEX alpha, b(*), a(*)
SUBROUTINE mkl_zomatcopy2 ( ordering, trans, rows, cols, alpha, a, lda, stridea, b, ldb, strideb )
DOUBLE COMPLEX alpha, b(*), a(*)
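The two-strided addressing described for mkl_?omatcopy2 can be illustrated with a small reference sketch in plain Fortran. This is not the MKL implementation. The module omatcopy2_sketch, the function pos, and the subroutine ref_domatcopy2 are hypothetical names introduced here. The sketch assumes ordering = 'C', handles only trans = 'N' and 'T', treats rows and cols as the dimensions of the destination matrix B, and omits all argument checking.

```fortran
! Illustrative reference only; this is not the MKL implementation.
module omatcopy2_sketch
  implicit none
contains
  ! 1-based position in a flat array of element (i,j) when adjacent rows
  ! are `stride` elements apart and adjacent columns are `ld` elements
  ! apart (the ordering = 'C' convention of the parameter descriptions).
  pure integer function pos(i, j, ld, stride)
    integer, intent(in) :: i, j, ld, stride
    pos = 1 + (i - 1)*stride + (j - 1)*ld
  end function pos

  ! Hypothetical helper: B := alpha*op(A) for trans = 'N' or 'T' only.
  ! rows and cols are taken as the dimensions of the destination B.
  subroutine ref_domatcopy2(trans, rows, cols, alpha, a, lda, stridea, &
                            b, ldb, strideb)
    character(len=1), intent(in) :: trans
    integer, intent(in) :: rows, cols, lda, stridea, ldb, strideb
    double precision, intent(in) :: alpha, a(*)
    double precision, intent(inout) :: b(*)
    integer :: i, j
    do j = 1, cols
      do i = 1, rows
        if (trans == 'T' .or. trans == 't') then
          b(pos(i, j, ldb, strideb)) = alpha*a(pos(j, i, lda, stridea))
        else
          b(pos(i, j, ldb, strideb)) = alpha*a(pos(i, j, lda, stridea))
        end if
      end do
    end do
  end subroutine ref_domatcopy2
end module omatcopy2_sketch

program demo_omatcopy2
  use omatcopy2_sketch
  implicit none
  double precision :: a(12), b(6)
  integer :: k
  a = [(dble(k), k = 1, 12)]
  b = 0.0d0
  ! A is the 3-by-2 matrix [1 7; 3 9; 5 11] embedded in a:
  ! rows are 2 elements apart (stridea = 2), columns 6 apart (lda = 6).
  ! B = 2*A^T is 2-by-3 and stored contiguously (strideb = 1, ldb = 2).
  call ref_domatcopy2('T', 2, 3, 2.0d0, a, 6, 2, b, 2, 1)
  print '(6f6.1)', b
end program demo_omatcopy2
```

In this example both A(2,1) and A(1,2) lie more than one element away from A(1,1), which is the situation a single leading dimension cannot describe.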
mkl_?omatadd
Scales and sums two matrices, in addition to
performing out-of-place transposition operations.
Syntax
call mkl_somatadd(ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc)
call mkl_domatadd(ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc)
call mkl_comatadd(ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc)
call mkl_zomatadd(ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc)
Include Files
mkl.fi
Description
The mkl_?omatadd routine scales and adds two matrices, as well as performing out-of-place transposition
operations. A transposition operation can be no operation, a transposition, a conjugate transposition, or a
conjugation (without transposition). The following out-of-place memory movement is done:
C := alpha*op(A) + beta*op(B)
where the op(A) and op(B) operations are transpose, conjugate-transpose, conjugate (no transpose), or no
transpose, depending on the values of transa and transb. If no transposition of the source matrices is
required, m is the number of rows and n is the number of columns in the source matrices A and B. In this
case, the output matrix C is m-by-n.
NOTE
Note that different arrays must not overlap.
Input Parameters

ordering CHARACTER*1. Ordering of the matrix storage.
If ordering = 'R' or 'r', the ordering is row-major.
If ordering = 'C' or 'c', the ordering is column-major.
transa CHARACTER*1. Parameter that specifies the operation type on matrix A.
If transa = 'N' or 'n', op(A)=A and the matrix A is assumed unchanged
on input.
If transa = 'T' or 't', it is assumed that A should be transposed.
If transa = 'C' or 'c', it is assumed that A should be conjugate
transposed.
If transa = 'R' or 'r', it is assumed that A should be conjugated (and not
transposed).
If the data is real, then transa = 'R' is the same as transa = 'N', and
transa = 'C' is the same as transa = 'T'.
transb CHARACTER*1. Parameter that specifies the operation type on matrix B.
If transb = 'N' or 'n', op(B)=B and the matrix B is assumed unchanged
on input.
If transb = 'T' or 't', it is assumed that B should be transposed.
If transb = 'C' or 'c', it is assumed that B should be conjugate
transposed.
If transb = 'R' or 'r', it is assumed that B should be conjugated (and not
transposed).
If the data is real, then transb = 'R' is the same as transb = 'N', and
transb = 'C' is the same as transb = 'T'.
m INTEGER. The number of matrix rows in op(A), op(B), and C.
n INTEGER. The number of matrix columns in op(A), op(B), and C.
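alpha REAL for mkl_somatadd.
DOUBLE PRECISION for mkl_domatadd.
COMPLEX for mkl_comatadd.
DOUBLE COMPLEX for mkl_zomatadd.
This parameter scales the input matrix by alpha.
a REAL for mkl_somatadd.
DOUBLE PRECISION for mkl_domatadd.
COMPLEX for mkl_comatadd.
DOUBLE COMPLEX for mkl_zomatadd.
Array, size a(lda,*).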
lda INTEGER. Distance between the first elements in adjacent columns (in the
case of the column-major order) or rows (in the case of the row-major
order) in the source matrix A; measured in the number of elements.
For ordering = 'C' or 'c': when transa = 'N', 'n', 'R', or 'r', lda
must be at least max(1,m); otherwise lda must be at least max(1,n).
For ordering = 'R' or 'r': when transa = 'N', 'n', 'R', or 'r', lda
must be at least max(1,n); otherwise lda must be at least max(1,m).
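beta REAL for mkl_somatadd.
DOUBLE PRECISION for mkl_domatadd.
COMPLEX for mkl_comatadd.
DOUBLE COMPLEX for mkl_zomatadd.
This parameter scales the input matrix by beta.
b REAL for mkl_somatadd.
DOUBLE PRECISION for mkl_domatadd.
COMPLEX for mkl_comatadd.
DOUBLE COMPLEX for mkl_zomatadd.
Array, size b(ldb,*).
ldb INTEGER. Distance between the first elements in adjacent columns (in the
case of the column-major order) or rows (in the case of the row-major
order) in the source matrix B; measured in the number of elements.
For ordering = 'C' or 'c': when transb = 'N', 'n', 'R', or 'r', ldb
must be at least max(1,m); otherwise ldb must be at least max(1,n).
For ordering = 'R' or 'r': when transb = 'N', 'n', 'R', or 'r', ldb
must be at least max(1,n); otherwise ldb must be at least max(1,m).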
ldc INTEGER. Distance between the first elements in adjacent columns (in the
case of the column-major order) or rows (in the case of the row-major
order) in the destination matrix C; measured in the number of elements.
If ordering = 'C' or 'c', then ldc must be at least max(1, m),
otherwise ldc must be at least max(1, n).
Output Parameters
c REAL for mkl_somatadd.
DOUBLE PRECISION for mkl_domatadd.
COMPLEX for mkl_comatadd.
DOUBLE COMPLEX for mkl_zomatadd.
Array, size c(ldc,*).
Interfaces
FORTRAN 77:
SUBROUTINE mkl_somatadd ( ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc )
CHARACTER*1 ordering, transa, transb
INTEGER m, n, lda, ldb, ldc
REAL alpha, beta
REAL a(lda,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_domatadd ( ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc )
DOUBLE PRECISION a(lda,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_comatadd ( ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc )
COMPLEX alpha, beta
COMPLEX a(lda,*), b(ldb,*), c(ldc,*)
SUBROUTINE mkl_zomatadd ( ordering, transa, transb, m, n, alpha, a, lda, beta, b, ldb, c, ldc )
DOUBLE COMPLEX a(lda,*), b(ldb,*), c(ldc,*)
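A short usage sketch for mkl_domatadd follows. It uses the argument list from the Syntax section and assumes the program is linked against Intel MKL. The routine is declared external here rather than through mkl.fi so that the example does not depend on the source form of the include file. The matrix values, the program name, and the self-check against the Fortran intrinsic transpose are illustrative choices only.

```fortran
program demo_omatadd
  implicit none
  external :: mkl_domatadd          ! provided by Intel MKL at link time
  integer, parameter :: m = 2, n = 3
  double precision :: a(m, n), b(n, m), c(m, n), expected(m, n)
  double precision, parameter :: alpha = 2.0d0, beta = -1.0d0

  a = reshape([1.0d0, 2.0d0, 3.0d0, 4.0d0, 5.0d0, 6.0d0], [m, n])
  b = reshape([10.0d0, 20.0d0, 30.0d0, 40.0d0, 50.0d0, 60.0d0], [n, m])
  c = 0.0d0

  ! C := alpha*A + beta*B**T in column-major order.
  ! op(A) = A is m-by-n (lda = m); op(B) = B**T is m-by-n, so B itself
  ! is n-by-m (ldb = n); C is m-by-n (ldc = m).
  call mkl_domatadd('C', 'N', 'T', m, n, alpha, a, m, beta, b, n, c, m)

  ! Compare with the same expression written with Fortran intrinsics.
  expected = alpha*a + beta*transpose(b)
  if (maxval(abs(c - expected)) > 0.0d0) error stop 'unexpected result'
  print '(3f8.1)', transpose(c)     ! print C row by row
end program demo_omatadd
```

Note that m and n describe op(A), op(B), and C, so the array that is transposed on input is declared with its dimensions swapped.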
?gemm_alloc
Allocates storage for a packed matrix.
Syntax
dest = sgemm_alloc (identifier, m, n, k)
dest = dgemm_alloc (identifier, m, n, k)
Include Files
mkl.fi
Description
The ?gemm_alloc routine is one of a set of related routines that enable use of an internal packed storage.
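Call the ?gemm_alloc routine first to allocate storage for a packed matrix structure to be used in subsequent
calls, ultimately to compute
C := alpha*op(A)*op(B) + beta*C,
where:
op(X) is one of the operations op(X) = X, op(X) = X^T, or op(X) = X^H,
alpha and beta are scalars,
A, B, and C are matrices:
op(A) is an m-by-k matrix,
op(B) is a k-by-n matrix,
C is an m-by-n matrix.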
Input Parameters
identifier CHARACTER*1. Specifies which matrix is to be packed:

If identifier = 'A' or 'a', the routine allocates storage to pack matrix
A.
If identifier = 'B' or 'b', the routine allocates storage to pack matrix
B.
m INTEGER. Specifies the number of rows of matrix op(A) and of the matrix C.
n INTEGER. Specifies the number of columns of matrix op(B) and the number
of columns of matrix C. The value of n must be at least zero.
k INTEGER. Specifies the number of columns of matrix op(A) and the number
of rows of matrix op(B). The value of k must be at least zero.
Output Parameters
dest POINTER.
Pointer to allocated storage.
See Also
?gemm_pack Performs scaling and packing of the matrix into the previously allocated buffer.
?gemm_compute Computes a matrix-matrix product with general matrices where one or both
input matrices are stored in a packed data structure and adds the result to a scalar-matrix
product.
?gemm_free Frees the storage previously allocated for the packed matrix.
?gemm for a detailed description of general matrix multiplication.
?gemm_pack
Performs scaling and packing of the matrix into the
previously allocated buffer.
Syntax
call sgemm_pack (identifier, trans, m, n, k, alpha, src, ld, dest)
call dgemm_pack (identifier, trans, m, n, k, alpha, src, ld, dest)
Include Files
mkl.fi
Description
The ?gemm_pack routine is one of a set of related routines that enable use of an internal packed storage.
Call ?gemm_pack after successfully calling ?gemm_alloc. The ?gemm_pack scales the identified matrix by
alpha and packs it into the buffer allocated previously with ?gemm_alloc.
The ?gemm_pack routine performs this operation:
dest := alpha*op(src) as part of the computation C := alpha*op(A)*op(B) + beta*C

where:

src is a matrix,
A, B, and C are matrices,
op(src) is an m-by-k matrix if identifier = 'A' or 'a',
op(src) is a k-by-n matrix if identifier = 'B' or 'b',
dest is an internal packed storage buffer.
NOTE
For best performance, use the same number of threads for packing and for computing.
If packing for both A and B matrices, you must use the same number of threads for packing A as for
packing B.
Input Parameters
identifier CHARACTER*1. Specifies which matrix is to be packed:

If identifier = 'A' or 'a', the routine allocates storage to pack matrix
A.
If identifier = 'B' or 'b', the routine allocates storage to pack matrix
B.
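trans CHARACTER*1. Specifies the form of op(src) used in the packing:
If trans = 'N' or 'n', op(src) = src.
If trans = 'T' or 't', op(src) = src^T.
If trans = 'C' or 'c', op(src) = src^H.
m INTEGER. Specifies the number of rows of matrix op(A) and of the matrix C.
n INTEGER. Specifies the number of columns of matrix op(B) and the
number of columns of the matrix C. The value of n must be at least zero.
k INTEGER. Specifies the number of columns of matrix op(A) and the number
of rows of matrix op(B). The value of k must be at least zero.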
alpha REAL for sgemm_pack
DOUBLE PRECISION for dgemm_pack
Specifies the scalar alpha.
src REAL for sgemm_pack
DOUBLE PRECISION for dgemm_pack
Array:
If identifier = 'A' or 'a' and trans = 'N' or 'n': size ld*k. Before entry, the leading
m-by-k part of the array src must contain the matrix A.
If identifier = 'A' or 'a' and trans = 'T', 't', 'C', or 'c': size ld*m. Before entry, the
leading k-by-m part of the array src must contain the matrix A.
If identifier = 'B' or 'b' and trans = 'N' or 'n': size ld*n. Before entry, the leading
k-by-n part of the array src must contain the matrix B.
If identifier = 'B' or 'b' and trans = 'T', 't', 'C', or 'c': size ld*k. Before entry, the
leading n-by-k part of the array src must contain the matrix B.
ld INTEGER. Specifies the leading dimension of src as declared in the calling
(sub)program.
If identifier = 'A' or 'a' and trans = 'N' or 'n', ld must be at least max(1, m).
If identifier = 'A' or 'a' and trans = 'T', 't', 'C', or 'c', ld must be at least max(1, k).
If identifier = 'B' or 'b' and trans = 'N' or 'n', ld must be at least max(1, k).
If identifier = 'B' or 'b' and trans = 'T', 't', 'C', or 'c', ld must be at least max(1, n).
dest POINTER.
Scaled and packed internal storage buffer.
Output Parameters
dest Overwritten by the matrix alpha*op(src).
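See Also
?gemm_alloc Allocates storage for a packed matrix.
?gemm_compute Computes a matrix-matrix product with general matrices where one or both
input matrices are stored in a packed data structure and adds the result to a scalar-matrix
product.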
?gemm_compute
Computes a matrix-matrix product with general
matrices where one or both input matrices are stored
in a packed data structure and adds the result to a
scalar-matrix product.
Syntax
call sgemm_compute (transa, transb, m, n, k, a, lda, b, ldb, beta, c, ldc)
call dgemm_compute (transa, transb, m, n, k, a, lda, b, ldb, beta, c, ldc)
Include Files
mkl.fi
Description
The ?gemm_compute routine is one of a set of related routines that enable use of an internal packed storage.
After calling ?gemm_pack, call ?gemm_compute to compute
C := op(A)*op(B) + beta*C,
where:
beta is a scalar,
A, B, and C are matrices:
op(A) is an m-by-k matrix,
op(B) is a k-by-n matrix,
C is an m-by-n matrix.
NOTE
For best performance, use the same number of threads for packing and for computing.
If packing for both A and B matrices, you must use the same number of threads for packing A as for
packing B.
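The operation computed by ?gemm_compute has no alpha argument because the scaling is applied earlier, when ?gemm_pack fills the packed buffer. As a worked illustration, assuming only matrix A is packed and B is passed unpacked, the two calls combine as follows:

```latex
\begin{aligned}
\text{?gemm\_pack:}\quad    & \mathit{dest} := \alpha\,\mathrm{op}(A) \\
\text{?gemm\_compute:}\quad & C := \mathit{dest}\cdot\mathrm{op}(B) + \beta C
                              \;=\; \alpha\,\mathrm{op}(A)\,\mathrm{op}(B) + \beta C
\end{aligned}
```

If both A and B are packed, each ?gemm_pack call applies its own alpha, so the product carries both scaling factors.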
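Input Parameters
transa CHARACTER*1. Specifies the form of op(A) used in the matrix
multiplication:
If transa = 'N' or 'n', op(A) = A.
If transa = 'T' or 't', op(A) = A^T.
If transa = 'C' or 'c', op(A) = A^H.
If transa = 'P' or 'p', the matrix in array a is packed and lda is ignored.
transb CHARACTER*1. Specifies the form of op(B) used in the matrix
multiplication:
If transb = 'N' or 'n', op(B) = B.
If transb = 'T' or 't', op(B) = B^T.
If transb = 'C' or 'c', op(B) = B^H.
If transb = 'P' or 'p', the matrix in array b is packed and ldb is ignored.
m INTEGER. Specifies the number of rows of matrix op(A) and of the matrix C.
n INTEGER. Specifies the number of columns of matrix op(B) and the
number of columns of the matrix C. The value of n must be at least zero.
k INTEGER. Specifies the number of columns of matrix op(A) and the number
of rows of matrix op(B). The value of k must be at least zero.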
a REAL for sgemm_compute
DOUBLE PRECISION for dgemm_compute
Array:
If transa = 'N' or 'n': size lda*k. Before entry, the leading m-by-k part of the array a
must contain the matrix A.
If transa = 'T', 't', 'C', or 'c': size lda*m. Before entry, the leading k-by-m part of the
array a must contain the matrix A.
If transa = 'P' or 'p': stored in internal packed format.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program.
If transa = 'N' or 'n', lda must be at least max (1, m).
If transa = 'T', 't', 'C', or 'c', lda must be at least max (1, k).
If transa = 'P' or 'p', lda is ignored.
b REAL for sgemm_compute
DOUBLE PRECISION for dgemm_compute
Array:
If transb = 'N' or 'n': size ldb*n. Before entry, the leading k-by-n part of the array b
must contain the matrix B.
If transb = 'T', 't', 'C', or 'c': size ldb*k. Before entry, the leading n-by-k part of the
array b must contain the matrix B.
If transb = 'P' or 'p': stored in internal packed format.
ldb INTEGER. Specifies the leading dimension of b as declared in the calling
(sub)program.
If transb = 'N' or 'n', ldb must be at least max (1, k).
If transb = 'T', 't', 'C', or 'c', ldb must be at least max (1, n).
If transb = 'P' or 'p', ldb is ignored.
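beta REAL for sgemm_compute
DOUBLE PRECISION for dgemm_compute
Specifies the scalar beta. When beta is equal to zero, then c need not be
set on input.
c REAL for sgemm_compute
DOUBLE PRECISION for dgemm_compute
Array, size ldc by n. Before entry, the leading m-by-n part of the array c must
contain the matrix C, except when beta is equal to zero, in which case c need
not be set on entry.
ldc INTEGER. Specifies the leading dimension of c as declared in the calling
(sub)program.
The value of ldc must be at least max (1, m).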
Output Parameters
c Overwritten by the m-by-n matrix op(A)*op(B) + beta*C.
See Also
?gemm_free
Frees the storage previously allocated for the packed
matrix.
Syntax
call sgemm_free (dest)
call dgemm_free (dest)
Include Files
mkl.fi
Description
The ?gemm_free routine is one of a set of related routines that enable use of an internal packed storage. Call
the ?gemm_free routine last to release storage for the packed matrix structure allocated with ?gemm_alloc.
Input Parameters
dest POINTER.
Previously allocated storage.
Output Parameters
dest The freed buffer.
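See Also
?gemm_compute Computes a matrix-matrix product with general matrices where one or both
input matrices are stored in a packed data structure and adds the result to a scalar-matrix
product.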
LAPACK Routines 3
This chapter describes the Intel Math Kernel Library implementation of routines from the LAPACK package
that are used for solving systems of linear equations, linear least squares problems, eigenvalue and singular
value problems, and performing a number of related computational tasks. The library includes LAPACK
routines for both real and complex data. Routines are supported for systems of equations with the following
types of matrices:
general
banded
symmetric or Hermitian positive-definite (full, packed, and rectangular full packed (RFP) storage)
symmetric or Hermitian positive-definite banded
symmetric or Hermitian indefinite (both full and packed storage)
symmetric or Hermitian indefinite banded
triangular (full, packed, and RFP storage)
triangular banded
tridiagonal
diagonally dominant tridiagonal.
NOTE
Different arrays used as parameters to Intel MKL LAPACK routines must not overlap.
WARNING
LAPACK routines assume that input matrices do not contain IEEE 754 special values such as INF or
NaN values. Using these special values may cause LAPACK to return unexpected results or become
unstable.
Intel MKL supports the Fortran 95 interface, which uses simplified routine calls with shorter argument lists, in
addition to the FORTRAN 77 interface to LAPACK computational and driver routines. The syntax section of the
routine description gives the calling sequence for the Fortran 95 interface, where available, immediately after
the FORTRAN 77 calls.
Naming Conventions for LAPACK Routines

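To call one of the routines from a FORTRAN 77 program, you can use the LAPACK name.
LAPACK names have the structure ?yyzzz or ?yyzz, where the initial symbol ? indicates the data type:
s real, single precision
c complex, single precision
d real, double precision
z complex, double precision
The letters yy indicate the matrix type and storage scheme, and zzz (or zz) indicates the computation
performed. For example, in ?getrf, ge stands for a general matrix and trf for triangular factorization.
Some routines can have combined character codes, such as ds or zc.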
The Fortran 95 interfaces to the LAPACK computational and driver routines are the same as the FORTRAN 77
names but without the first letter that indicates the data type. For example, the name of the routine that
performs a triangular factorization of general real matrices in Fortran 95 is getrf. Different data types are
handled through the definition of a specific internal parameter that refers to a module block with named
constants for single and double precision.
Fortran 95 Interface Conventions for LAPACK Routines

Intel MKL implements the Fortran 95 interface to LAPACK through wrappers that call respective FORTRAN 77
routines. This interface uses such Fortran 95 features as assumed-shape arrays and optional arguments to
provide simplified calls to LAPACK routines with fewer arguments.
NOTE
For LAPACK, Intel MKL offers two types of the Fortran 95 interfaces:
using mkl_lapack.fi only through the include 'mkl_lapack.fi' statement. Such interfaces
allow you to make use of the original LAPACK routines with all their arguments
using lapack.f90 that includes improved interfaces. This file is used to generate the module files
lapack95.mod and f95_precision.mod. See also the section "Fortran 95 interfaces and wrappers
to LAPACK and BLAS" of the Intel MKL Developer Guide for details. The module files are used to
process the FORTRAN use clauses referencing the LAPACK interface: use lapack95 and use
f95_precision.
The main conventions for the Fortran 95 interface are as follows:

The names of arguments used in Fortran 95 call are typically the same as for the respective generic
(FORTRAN 77) interface. In rare cases, formal argument names may be different. For instance, select
instead of selctg.
Input arguments such as array dimensions are not required in Fortran 95 and are skipped from the calling
sequence. Array dimensions are reconstructed from the user data that must exactly follow the required
array shape.
Another type of generic arguments that are skipped in the Fortran 95 interface are arguments that
represent workspace arrays (such as work, rwork, and so on). The only exception are cases when
workspace arrays return significant information on output.
NOTE
Internally, workspace arrays are allocated by the Fortran 95 interface wrapper, and are of optimal size
for the best performance of the routine.
An argument can also be skipped if its value is completely defined by the presence or absence of another
argument in the calling sequence, and the restored value is the only meaningful value for the skipped
argument.
Some generic arguments are declared as optional in the Fortran 95 interface and may or may not be
present in the calling sequence. An argument can be declared optional if it meets one of the following
conditions:
If an argument value is completely defined by the presence or absence of another argument in the
calling sequence, it can be declared optional. The difference from the skipped argument in this case is
that the optional argument can have some meaningful values that are distinct from the value
reconstructed by default. For example, if some argument (like jobz) can take only two values and one
of these values directly implies the use of another argument, then the value of jobz can be uniquely
reconstructed from the actual presence or absence of this second argument, and jobz can be omitted.
If an input argument can take only a few possible values, it can be declared as optional. The default
value of such argument is typically set as the first value in the list and all exceptions to this rule are
explicitly stated in the routine description.
If an input argument has a natural default value, it can be declared as optional. The default value of
such optional argument is set to its natural default value.
Argument info is declared as optional in the Fortran 95 interface. If it is present in the calling sequence,
the value assigned to info is interpreted as follows:

If this value is more than -1000, its meaning is the same as in the FORTRAN 77 routine.

If this value is equal to -1000, it means that there is not enough work memory.

If this value is equal to -1001, incompatible arguments are present in the calling sequence.

If this value is equal to -i, the ith parameter (counting parameters in the FORTRAN 77 interface, not
the Fortran 95 interface) had an illegal value.
Optional arguments are given in square brackets in the Fortran 95 call syntax.
The "Fortran 95 Notes" subsection at the end of the topic describing each routine details concrete rules for
reconstructing the values of the omitted optional parameters.
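The difference between the two calling conventions can be sketched with the LU factorization routine. The example below assumes the program is built against Intel MKL with the lapack95 module files available. The matrix values and program name are illustrative. The Fortran 95 call omits the dimensions and leading dimension, which are taken from the array shape, and both ipiv and info could be left out because they are optional.

```fortran
program demo_getrf_interfaces
  use lapack95, only: getrf        ! generic Fortran 95 interface (MKL module)
  implicit none
  external :: dgetrf               ! FORTRAN 77 LAPACK routine
  integer, parameter :: n = 3
  double precision :: a77(n, n), a95(n, n)
  integer :: ipiv77(n), ipiv95(n), info

  a77 = reshape([4.0d0, 2.0d0, 1.0d0, &
                 3.0d0, 5.0d0, 2.0d0, &
                 1.0d0, 1.0d0, 6.0d0], [n, n])
  a95 = a77

  ! FORTRAN 77: dimensions and leading dimension are passed explicitly.
  call dgetrf(n, n, a77, n, ipiv77, info)
  if (info /= 0) error stop 'dgetrf failed'

  ! Fortran 95: dimensions come from the array shape; ipiv and info
  ! are optional arguments.
  call getrf(a95, ipiv95, info)
  if (info /= 0) error stop 'getrf failed'

  print *, 'same factors:', all(a77 == a95), &
           ' same pivots:', all(ipiv77 == ipiv95)
end program demo_getrf_interfaces
```

When info is passed to the Fortran 95 interface, the values -1000 and -1001 have the wrapper-specific meanings listed above, so a caller that distinguishes error causes should test for them before applying the FORTRAN 77 interpretation.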
Intel MKL Fortran 95 Interfaces for LAPACK Routines vs. Netlib Implementation
The following list presents the general differences between the Intel MKL LAPACK95 implementation and its
Netlib counterpart:
The Intel MKL Fortran 95 interfaces are provided for pure procedures.
Names of interfaces do not contain the LA_ prefix.
An optional array argument always has the target attribute.
Functionality of the Intel MKL LAPACK95 wrapper is close to the FORTRAN 77 original implementation in
the getrf, gbtrf, and potrf interfaces.
If jobz argument value specifies presence or absence of z argument, then z is always declared as
optional and jobz is restored depending on whether z is present or not.
To avoid double error checking, processing of the info argument is limited to checking the allocated
memory and the consistency of optional arguments.
If an argument that is present in the list of arguments completely defines another argument, the latter is
always declared as optional.
You can transform an application that uses the Netlib LAPACK interfaces so that it works with the Intel MKL
interfaces, provided that:
a. The application is correct, that is, unambiguous, compiler-independent, and contains no errors.
b. Each routine name denotes only one specific routine. If any routine name in the application coincides
with a name of the original Netlib routine (for example, after removing the LA_ prefix) but denotes a
routine different from the Netlib original routine, this name should be modified through context name
replacement.
You should transform your application in the following cases:
When using the Netlib routines that differ from the Intel MKL routines only by the LA_ prefix or in the
array attribute target. The only transformation required in this case is context name replacement.
When using Netlib routines that differ from the Intel MKL routines by the LA_ prefix, the target array
attribute, and the names of formal arguments. In the case of positional passing of arguments, no
additional transformation except context name replacement is required. In the case of the keywords
passing of arguments, in addition to the context name replacement the names of mismatching keywords
should also be modified.
When using the Netlib routines that differ from the respective Intel MKL routines by the LA_ prefix, the
target array attribute, sequence of the arguments, arguments missing in Intel MKL but present in Netlib
and, vice versa, present in Intel MKL but missing in Netlib. Remove the differences in the sequence and
range of the arguments in process of all the transformations when you use the Netlib routines specified by
this bullet and the preceding bullet.
When using the getrf, gbtrf, and potrf interfaces, that is, new functionality implemented in Intel MKL
but unavailable in the Netlib source. To override the differences, build the desired functionality explicitly
with the Intel MKL means or create a new subroutine with the new functionality, using specific MKL
interfaces corresponding to LAPACK 77 routines. You can call the LAPACK 77 routines directly but using
the new Intel MKL interfaces is preferable. Note that if the transformed application calls getrf, gbtrf or
potrf without controlling arguments rcond and norm, just context name replacement is enough in
modifying the calls into the Intel MKL interfaces, as described in the first bullet above. The Netlib
functionality is preserved in such cases.
When using the Netlib auxiliary routines. In this case, call a corresponding subroutine directly, using the
Intel MKL LAPACK 77 interfaces.
Transform your application as follows:
1. Make sure conditions a. and b. are met.

2. Select Netlib LAPACK 95 calls. For each call, do the following:
Determine the type of difference and do the required transformations.
Revise results to eliminate unneeded code or data, which may appear after several identical calls.
3. Make sure the transformations are correct and complete.
Matrix Storage Schemes for LAPACK Routines

LAPACK routines use the following matrix storage schemes:
Full storage: an m-by-n matrix A is stored in a two-dimensional array a, with the matrix element a_ij
(i = 1..m, j = 1..n) stored in the array element a(i,j).
Packed storage scheme allows you to store symmetric, Hermitian, or triangular matrices more
compactly: the upper or lower triangle of the matrix is packed by columns in a one-dimensional array.
Band storage: an m-by-n band matrix with kl sub-diagonals and ku superdiagonals is stored compactly in
a two-dimensional array ab with kl+ku+1 rows and n columns. Columns of the matrix are stored in the
corresponding columns of the array, and diagonals of the matrix are stored in rows of the array.
Rectangular Full Packed (RFP) storage: the upper or lower triangle of the matrix is packed combining
the full and packed storage schemes. This combination enables using half of the full storage as packed
storage while maintaining efficiency by using Level 3 BLAS/LAPACK kernels as the full storage.
Generally in LAPACK routines, arrays that hold matrices in packed storage have names ending in p; arrays
with matrices in band storage have names ending in b; arrays with matrices in the RFP storage have names
ending in fp.
For more information on matrix storage schemes, see "Matrix Arguments" in Appendix B.
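The packed and band schemes can be made concrete with the standard LAPACK index mappings. The sketch below, in plain Fortran with no library calls, assumes the usual conventions: for the upper triangle packed by columns, a(i,j) with i <= j is stored in ap(i + j*(j-1)/2), and for band storage, a(i,j) is stored in ab(ku+1+i-j, j). The matrix size, the bandwidths, and the fill values are arbitrary choices for illustration.

```fortran
program demo_storage
  implicit none
  integer, parameter :: n = 4, kl = 1, ku = 2
  double precision :: a(n, n), ap(n*(n + 1)/2), ab(kl + ku + 1, n)
  integer :: i, j

  ! Fill a with recognisable values a(i,j) = 10*i + j.
  do j = 1, n
    do i = 1, n
      a(i, j) = 10*i + j
    end do
  end do

  ! Packed storage, upper triangle by columns:
  ! a(i,j) -> ap(i + j*(j-1)/2) for i <= j.
  do j = 1, n
    do i = 1, j
      ap(i + j*(j - 1)/2) = a(i, j)
    end do
  end do

  ! Band storage: a(i,j) -> ab(ku+1+i-j, j)
  ! for max(1,j-ku) <= i <= min(n,j+kl); unused entries stay zero.
  ab = 0.0d0
  do j = 1, n
    do i = max(1, j - ku), min(n, j + kl)
      ab(ku + 1 + i - j, j) = a(i, j)
    end do
  end do

  print '(a,10f5.0)', 'ap:', ap
  do i = 1, kl + ku + 1
    print '(a,i0,a,4f5.0)', 'ab row ', i, ':', ab(i, :)
  end do
end program demo_storage
```

In the band array, row ku+1 holds the main diagonal, the rows above it hold the superdiagonals, and the rows below it hold the subdiagonals, which matches the statement that diagonals of the matrix are stored in rows of the array.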
Mathematical Notation for LAPACK Routines

Descriptions of LAPACK routines use the following notation:
$A^H$ For an M-by-N matrix A, denotes the conjugate transposed N-by-M
matrix with elements $(A^H)_{ij} = \overline{a_{ji}}$.
For a real-valued matrix, $A^H = A^T$.
$x \cdot y$ The dot product of two vectors x and y.
$Ax = b$ A system of linear equations with an n-by-n matrix $A = \{a_{ij}\}$, a
right-hand side vector $b = \{b_i\}$, and an unknown vector $x = \{x_i\}$.
$AX = B$ A set of systems with a common matrix A and multiple right-hand
sides. The columns of B are individual right-hand sides, and the
columns of X are the corresponding solutions.
$|x|$ The vector with elements $|x_i|$ (absolute values of $x_i$).
$|A|$ The matrix with elements $|a_{ij}|$ (absolute values of $a_{ij}$).
$\|x\|_\infty = \max_i |x_i|$ The infinity-norm of the vector x.
$\|A\|_\infty = \max_i \sum_j |a_{ij}|$ The infinity-norm of the matrix A.
$\|A\|_1 = \max_j \sum_i |a_{ij}|$ The one-norm of the matrix A. $\|A\|_1 = \|A^T\|_\infty = \|A^H\|_\infty$
$\|x\|_2$ The 2-norm of the vector x: $\|x\|_2 = (\sum_i |x_i|^2)^{1/2} = \|x\|_E$ (see
the definition for Euclidean norm in this section).
$\|A\|_2$ The 2-norm (or spectral norm) of the matrix A.
$\|A\|_E$ The Euclidean norm of the matrix A: $\|A\|_E^2 = \sum_i \sum_j |a_{ij}|^2$.
$\kappa(A) = \|A\|\,\|A^{-1}\|$ The condition number of the matrix A.
$\lambda_i$ Eigenvalues of the matrix A (for the definition of eigenvalues, see
Eigenvalue Problems).
$\sigma_i$ Singular values of the matrix A. They are equal to square roots of the
eigenvalues of $A^H A$. (For more information, see Singular Value
Decomposition).
Error Analysis
In practice, most computations are performed with rounding errors. Besides, you often need to solve a
system Ax = b, where the data (the elements of A and b) are not known exactly. Therefore, it is important
to understand how the data errors and rounding errors can affect the solution x.
Data perturbations. If x is the exact solution of Ax = b, and $x + \delta x$ is the exact solution of a perturbed
problem $(A + \delta A)(x + \delta x) = (b + \delta b)$, then this estimate, given up to linear terms of perturbations,
holds:
$$\frac{\|\delta x\|}{\|x\|} \le \kappa(A)\left(\frac{\|\delta A\|}{\|A\|} + \frac{\|\delta b\|}{\|b\|}\right),$$
where $A + \delta A$ is nonsingular and
$$\|A^{-1}\|\,\|\delta A\| < 1.$$
In other words, relative errors in A or b may be amplified in the solution vector x by a factor
$\kappa(A) = \|A\|\,\|A^{-1}\|$ called the condition number of A.
Rounding errors have the same effect as relative perturbations $c(n)\varepsilon$ in the original data. Here $\varepsilon$ is the
machine precision, defined as the smallest positive number x such that 1 + x > 1; and c(n) is a modest
function of the matrix order n. The corresponding solution error is
$\|\delta x\|/\|x\| \le c(n)\,\kappa(A)\,\varepsilon$. (The value of c(n) is seldom greater than 10n.)
NOTE
Machine precision depends on the data type used.
Thus, if your matrix A is ill-conditioned (that is, its condition number $\kappa(A)$ is very large), then the error in
the solution x can also be large; you might even encounter a complete loss of precision. LAPACK provides
routines that allow you to estimate $\kappa(A)$ (see Routines for Estimating the Condition Number) and also give
you a more precise estimate for the actual solution error (see Refining the Solution and Estimating Its Error).
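The amplification of data errors by the condition number can be observed on a small example. The sketch below, in plain Fortran with no library calls, uses a nearly singular 2-by-2 matrix chosen for illustration, computes its condition number in the infinity-norm from a closed-form inverse, and compares the relative change in the solution with the relative change in the right-hand side. It also prints the values of the Fortran epsilon intrinsic, which reports the spacing of numbers near 1 and is closely related to the machine precision discussed above.

```fortran
program demo_condition
  implicit none
  double precision, parameter :: d = 1.0d-4
  double precision :: a(2, 2), ainv(2, 2), b(2), db(2), x(2), dx(2)
  double precision :: kappa, rel_in, rel_out

  a = reshape([1.0d0, 1.0d0, 1.0d0, 1.0d0 + d], [2, 2])
  ! Closed-form inverse of the symmetric 2-by-2 matrix above (det = d).
  ainv = reshape([1.0d0 + d, -1.0d0, -1.0d0, 1.0d0], [2, 2])/d

  kappa = norm_inf(a)*norm_inf(ainv)

  b  = [2.0d0, 2.0d0 + d]          ! exact solution is x = (1, 1)
  db = [0.0d0, 1.0d-6]             ! small perturbation of the right-hand side
  x  = matmul(ainv, b)
  dx = matmul(ainv, db)            ! resulting change in the solution

  rel_in  = maxval(abs(db))/maxval(abs(b))
  rel_out = maxval(abs(dx))/maxval(abs(x))

  print '(a,es10.3)', 'kappa(A)            = ', kappa
  print '(a,es10.3)', '||db||/||b||        = ', rel_in
  print '(a,es10.3)', '||dx||/||x||        = ', rel_out
  print '(a,es10.3)', 'kappa*||db||/||b||  = ', kappa*rel_in
  print '(a,es10.3)', 'epsilon (single)    = ', epsilon(1.0)
  print '(a,es10.3)', 'epsilon (double)    = ', epsilon(1.0d0)
contains
  ! Infinity-norm of a matrix: maximum absolute row sum.
  pure double precision function norm_inf(m)
    double precision, intent(in) :: m(:, :)
    norm_inf = maxval(sum(abs(m), dim=2))
  end function norm_inf
end program demo_condition
```

Here a perturbation of relative size about 5e-7 in b changes the solution by about 1e-2, which stays within the bound given by the condition number (about 4e4) times the relative perturbation.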
LAPACK Linear Equation Routines

This section describes routines for performing the following computations:
factoring the matrix (except for triangular matrices)

equilibrating the matrix (except for RFP matrices)
solving a system of linear equations
estimating the condition number of a matrix (except for RFP matrices)
refining the solution of linear equations and computing its error bounds (except for RFP matrices)
inverting the matrix.
To solve a particular problem, you can call two or more computational routines or call a corresponding driver
routine that combines several tasks in one call. For example, to solve a system of linear equations with a
general matrix, call ?getrf (LU factorization) and then ?getrs (computing the solution). Then, call ?gerfs
to refine the solution and get the error bounds. Alternatively, use the driver routine ?gesvx that performs all
these tasks in one call.
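The two-step computational sequence mentioned above can be sketched as follows, assuming the program is linked against a LAPACK implementation such as Intel MKL. The routines are called through the FORTRAN 77 interface. The matrix and right-hand side are illustrative values chosen so that the exact solution is (1, 1, 1).

```fortran
program demo_solve
  implicit none
  external :: dgetrf, dgetrs
  integer, parameter :: n = 3, nrhs = 1
  double precision :: a(n, n), lu(n, n), b(n, nrhs), x(n, nrhs)
  integer :: ipiv(n), info

  a = reshape([4.0d0, 2.0d0, 1.0d0, &
               3.0d0, 5.0d0, 2.0d0, &
               1.0d0, 1.0d0, 6.0d0], [n, n])
  b(:, 1) = [8.0d0, 8.0d0, 9.0d0]   ! row sums of a, so x = (1, 1, 1)

  ! Step 1: LU factorization. The factors overwrite the copy in lu.
  lu = a
  call dgetrf(n, n, lu, n, ipiv, info)
  if (info /= 0) error stop 'dgetrf failed'

  ! Step 2: solve using the factors. The solution overwrites x.
  x = b
  call dgetrs('N', n, nrhs, lu, n, ipiv, x, n, info)
  if (info /= 0) error stop 'dgetrs failed'

  print '(a,3f10.6)', 'x        = ', x(:, 1)
  print '(a,es10.3)', 'residual = ', maxval(abs(matmul(a, x) - b))
end program demo_solve
```

Keeping the original matrix in a and factoring a copy makes it possible to check the residual afterwards, and the same factors in lu and ipiv can be reused for further right-hand sides without refactoring.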
LAPACK Linear Equation Computational Routines

Table "Computational Routines for Systems of Equations with Real Matrices" lists the LAPACK computational
routines (FORTRAN 77 and Fortran 95 interfaces) for factorizing, equilibrating, and inverting real matrices,
estimating their condition numbers, solving systems of equations with real matrices, refining the solution,
and estimating its error. Table "Computational Routines for Systems of Equations with Complex Matrices" lists
similar routines for complex matrices. Respective routine names in the Fortran 95 interface are without the
first symbol (see Routine Naming Conventions).
Computational Routines for Systems of Equations with Real Matrices
Matrix type, storage scheme | Factorize matrix | Equilibrate matrix | Solve system | Condition number | Estimate error | Invert matrix
general | ?getrf | ?geequ, ?geequb | ?getrs | ?gecon | ?gerfs, ?gerfsx | ?getri
general band | ?gbtrf | ?gbequ, ?gbequb | ?gbtrs | ?gbcon | ?gbrfs, ?gbrfsx |
general tridiagonal | ?gttrf | | ?gttrs | ?gtcon | ?gtrfs |
diagonally dominant tridiagonal | ?dttrfb | | ?dttrsb | | |
symmetric positive-definite | ?potrf | ?poequ, ?poequb | ?potrs | ?pocon | ?porfs, ?porfsx | ?potri
symmetric positive-definite, packed storage | ?pptrf | ?ppequ | ?pptrs | ?ppcon | ?pprfs | ?pptri
symmetric positive-definite, RFP storage | ?pftrf | | ?pftrs | | | ?pftri
symmetric positive-definite, band | ?pbtrf | ?pbequ | ?pbtrs | ?pbcon | ?pbrfs |
symmetric positive-definite, tridiagonal | ?pttrf | | ?pttrs | ?ptcon | ?ptrfs |
symmetric indefinite | ?sytrf, ?sytrf_rook | ?syequb | ?sytrs, ?sytrs_rook, ?sytrs2 | ?sycon, ?sycon_rook | ?syrfs, ?syrfsx | ?sytri, ?sytri_rook, ?sytri2, ?sytri2x
symmetric indefinite, packed storage | ?sptrf, mkl_?spffrt2, mkl_?spffrtx | | ?sptrs | ?spcon | ?sprfs | ?sptri
triangular | | | ?trtrs | ?trcon | ?trrfs | ?trtri
triangular, packed storage | | | ?tptrs | ?tpcon | ?tprfs | ?tptri
triangular, RFP storage | | | | | | ?tftri
triangular band | | | ?tbtrs | ?tbcon | ?tbrfs |
In the table above, ? denotes s (single precision) or d (double precision) for the FORTRAN 77 interface.
Computational Routines for Systems of Equations with Complex Matrices
Matrix type,                  Factorize      Equilibrate  Solve         Condition    Estimate  Invert
storage scheme                matrix         matrix       system        number       error     matrix
general                       ?getrf         ?geequ,      ?getrs        ?gecon       ?gerfs,   ?getri
                                             ?geequb                                 ?gerfsx
general band                  ?gbtrf         ?gbequ,      ?gbtrs        ?gbcon       ?gbrfs,
                                             ?gbequb                                 ?gbrfsx
general tridiagonal           ?gttrf                      ?gttrs        ?gtcon       ?gtrfs
Hermitian positive-definite   ?potrf         ?poequ,      ?potrs        ?pocon       ?porfs,   ?potri
                                             ?poequb                                 ?porfsx
Hermitian positive-definite,  ?pptrf         ?ppequ       ?pptrs        ?ppcon       ?pprfs    ?pptri
packed storage
Hermitian positive-definite,  ?pftrf                      ?pftrs                               ?pftri
RFP storage
Hermitian positive-definite,  ?pbtrf         ?pbequ       ?pbtrs        ?pbcon       ?pbrfs
band
Hermitian positive-definite,  ?pttrf                      ?pttrs        ?ptcon       ?ptrfs
tridiagonal
Hermitian indefinite          ?hetrf,        ?heequb      ?hetrs,       ?hecon,      ?herfs,   ?hetri,
                              ?hetrf_rook                 ?hetrs_rook,  ?hecon_rook  ?herfsx   ?hetri_rook,
                                                          ?hetrs2                              ?hetri2,
                                                                                               ?hetri2x
symmetric indefinite          ?sytrf,        ?syequb      ?sytrs,       ?sycon,      ?syrfs,   ?sytri,
                              ?sytrf_rook                 ?sytrs_rook,  ?sycon_rook  ?syrfsx   ?sytri_rook,
                                                          ?sytrs2                              ?sytri2,
                                                                                               ?sytri2x
Hermitian indefinite,         ?hptrf                      ?hptrs        ?hpcon       ?hprfs    ?hptri
packed storage
symmetric indefinite,         ?sptrf,                     ?sptrs        ?spcon       ?sprfs    ?sptri
packed storage                mkl_?spffrt2,
                              mkl_?spffrtx
triangular                                                ?trtrs        ?trcon       ?trrfs    ?trtri
triangular, packed storage                                ?tptrs        ?tpcon       ?tprfs    ?tptri
triangular, RFP storage                                                                        ?tftri
triangular band                                           ?tbtrs        ?tbcon       ?tbrfs
In the table above, ? stands for c (single precision complex) or z (double precision complex) for the
FORTRAN 77 interface.
Matrix Factorization: LAPACK Computational Routines

This section describes the LAPACK routines for matrix factorization. The following factorizations are
supported:
LU factorization
Cholesky factorization of real symmetric positive-definite matrices
Cholesky factorization of real symmetric positive-definite matrices with pivoting
Cholesky factorization of Hermitian positive-definite matrices
Cholesky factorization of Hermitian positive-definite matrices with pivoting
Bunch-Kaufman factorization of real and complex symmetric matrices
Bunch-Kaufman factorization of Hermitian matrices.
You can compute:
the LU factorization using full and band storage of matrices

the Cholesky factorization using full, packed, RFP, and band storage
the Bunch-Kaufman factorization using full and packed storage.
?getrf
Computes the LU factorization of a general m-by-n
matrix.
Syntax
call sgetrf( m, n, a, lda, ipiv, info )
call dgetrf( m, n, a, lda, ipiv, info )
call cgetrf( m, n, a, lda, ipiv, info )
call zgetrf( m, n, a, lda, ipiv, info )
call getrf( a [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the LU factorization of a general m-by-n matrix A as
A = P*L*U,
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m >
n) and U is upper triangular (upper trapezoidal if m < n). The routine uses partial pivoting, with row
interchanges.
NOTE
This routine supports the Progress Routine feature. See the Progress Function section for details.
Input Parameters
m INTEGER. The number of rows in the matrix A; m ≥ 0.
n INTEGER. The number of columns in A; n ≥ 0.
a REAL for sgetrf

DOUBLE PRECISION for dgetrf
COMPLEX for cgetrf
DOUBLE COMPLEX for zgetrf.
Array, size lda by *. Contains the matrix A. The second dimension of

a must be at least max(1, n).
lda INTEGER. The leading dimension of array a.
Output Parameters
a Overwritten by L and U. The unit diagonal elements of L are not

stored.
ipiv INTEGER.
Array, size at least max(1,min(m, n)). The pivot indices; for 1 ≤ i ≤
min(m, n), row i was interchanged with row ipiv(i).
info INTEGER. If info=0, the execution is successful.

If info = -i, the i-th parameter had an illegal value.
If info = i, u(i,i) is 0. The factorization has been completed, but U is
exactly singular. Division by 0 will occur if you use the factor U for
solving a system of linear equations.
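LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or reconstructible arguments, see LAPACK
95 Interface Conventions.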
Specific details for the routine getrf interface are as follows:
ipiv Holds the vector of length min(m,n).
Application Notes
The computed L and U are the exact factors of a perturbed matrix A + E, where
|E| ≤ c(min(m,n)) ε P|L||U|,
c(n) is a modest linear function of n, and ε is the machine precision.

The approximate number of floating-point operations for real flavors is
(2/3)n^3 if m = n,
(1/3)n^2(3m-n) if m > n,
(1/3)m^2(3n-m) if m < n.
The number of operations for complex flavors is four times greater.

After calling this routine with m = n, you can call the following:
?getrs to solve A*X = B or A^T*X = B or A^H*X = B
?gecon to estimate the condition number of A
?getri to compute the inverse of A.
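The factor-then-solve sequence described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the FORTRAN 77 interface, double precision, and a program linked against Intel MKL or another LAPACK library. The 3-by-3 test matrix and the program name are made up for the example.

```fortran
program getrf_getrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: a(n,n), a0(n,n), b(n,nrhs), x(n,nrhs)
  integer  :: ipiv(n), info
  external :: dgetrf, dgetrs

  ! reshape fills column by column, so each source line below is one column of A.
  a = reshape([ 2.0_dp,  4.0_dp, -2.0_dp,  &
                1.0_dp, -6.0_dp,  7.0_dp,  &
                1.0_dp,  0.0_dp,  2.0_dp ], [n, n])
  a0 = a                        ! dgetrf overwrites a with L and U
  b(:,1) = [5.0_dp, -2.0_dp, 9.0_dp]
  x = b                         ! dgetrs overwrites the right-hand side

  ! LU factorization with partial pivoting: A = P*L*U
  call dgetrf(n, n, a, n, ipiv, info)
  if (info < 0) then
     print *, 'dgetrf: argument', -info, 'had an illegal value'
     stop 1
  else if (info > 0) then
     print *, 'dgetrf: u(i,i) is exactly zero for i =', info
     stop 1
  end if

  ! Solve A*X = B using the factors and pivot indices from dgetrf
  call dgetrs('N', n, nrhs, a, n, ipiv, x, n, info)
  if (info /= 0) stop 1

  print *, 'x        =', x(:,1)
  print *, 'residual =', maxval(abs(matmul(a0, x) - b))
end program getrf_getrs_sketch
```

The exact solution of this system is x = (1, 1, 2), so the printed values should agree with it up to rounding. Checking info after the factorization matters because a positive value means U is singular and the subsequent solve would divide by zero.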
See Also
mkl_progress
mkl_?getrfnpi
Performs LU factorization (complete or incomplete) of
a general matrix without pivoting.
Syntax
call mkl_sgetrfnpi (m, n, nfact, a, lda, info)
call mkl_dgetrfnpi (m, n, nfact, a, lda, info )
call mkl_cgetrfnpi (m, n, nfact, a, lda, info )
call mkl_zgetrfnpi (m, n, nfact, a, lda, info )
call mkl_getrfnpi ( a [, nfact] [, info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the LU factorization of a general m-by-n matrix A without using pivoting. It supports
incomplete factorization. The factorization has the form:
A = L*U,
where L is lower triangular with unit diagonal elements (lower trapezoidal if m > n) and U is upper triangular
(upper trapezoidal if m < n).
Incomplete factorization has the form:
A = L*U + Ã,
where L is lower trapezoidal with unit diagonal elements, U is upper trapezoidal, and Ã is the unfactored
part of matrix A. See the application notes section for further details.
NOTE
Use ?getrf if the matrix might not be diagonally dominant.
Input Parameters
The data types are given for the Fortran interface.
m INTEGER. The number of rows in matrix A; m ≥ 0.
n INTEGER. The number of columns in matrix A; n ≥ 0.
nfact INTEGER. The number of rows and columns to factor; 0 ≤ nfact ≤ min(m,
n). Note that if nfact < min(m, n), incomplete factorization is performed.
a REAL for mkl_sgetrfnpi

DOUBLE PRECISION for mkl_dgetrfnpi
COMPLEX for mkl_cgetrfnpi
DOUBLE COMPLEX for mkl_zgetrfnpi

Array of size (lda,*). Contains the matrix A. The second dimension of a
must be at least max(1, n).
lda INTEGER. The leading dimension of array a; lda ≥ max(1, m).
Output Parameters
a Overwritten by L and U. The unit diagonal elements of L are not stored.

When incomplete factorization is specified by setting nfact < min(m, n), a
also contains the unfactored submatrix Ã22. See the application notes
section for further details.
info INTEGER. If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, u(i,i) is 0. The requested factorization has been completed, but U
is exactly singular. Division by 0 will occur if factorization is completed and
factor U is used for solving a system of linear equations.

Specific details for the routine getrf interface are as follows:
Application Notes
The computed L and U are the exact factors of a perturbed matrix A + E, with
|E| ≤ c(min(m, n)) ε |L||U|,
where c(n) is a modest linear function of n, and ε is the machine precision.
The approximate number of floating-point operations for real flavors is
(2/3)n^3 if m = n = nfact,
(1/3)n^2(3m-n) if m > n = nfact,
(1/3)m^2(3n-m) if m = nfact < n,
(2/3)(n^3 - (n-nfact)^3) if m = n, nfact < min(m, n),
(1/3)(n^2(3m-n) - (n-nfact)^2(3m - 2nfact - n)) if m > n > nfact,
(1/3)(m^2(3n-m) - (m-nfact)^2(3n - 2nfact - m)) if nfact < m < n.
When incomplete factorization is specified, the first nfact rows and columns are factored, with the update of
the remaining rows and columns of A as follows:
If matrix A is represented as a block 2-by-2 matrix:
A = [ A11  A12 ]
    [ A21  A22 ]
where
A11 is a square matrix of order nfact,
A21 is an (m - nfact)-by-nfact matrix,
A12 is an nfact-by-(n - nfact) matrix, and
A22 is an (m - nfact)-by-(n - nfact) matrix.
The result is
A = [ L1 ] * [ U1  U2 ] + [ 0   0  ]
    [ L2 ]                [ 0  Ã22 ]
L1 is a lower triangular square matrix of order nfact with unit diagonal and U1 is an upper triangular square
matrix of order nfact. L1 and U1 result from LU factorization of matrix A11: A11 = L1*U1.
L2 is an (m - nfact)-by-nfact matrix and L2 = A21*U1^(-1). U2 is an nfact-by-(n - nfact) matrix and
U2 = L1^(-1)*A12.
Ã22 is an (m - nfact)-by-(n - nfact) matrix and Ã22 = A22 - L2*U2.
On exit, elements of the upper triangle U1 are stored in place of the upper triangle of block A11 in array a;
elements of the lower triangle L1 are stored in the lower triangle of block A11 in array a (unit diagonal
elements are not stored). Elements of L2 replace elements of A21; U2 replaces elements of A12 and Ã22
replaces elements of A22.
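The block structure of an incomplete factorization can be checked with a small stand-alone sketch. The code below does not call mkl_?getrfnpi; it is a plain, unblocked elimination without pivoting, written only to illustrate how L1, L2, U1, U2 and the unfactored block share one array. The test matrix, the array names l, u and rest, and the program name are assumptions made for the example.

```fortran
program partial_lu_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 4, n = 4, nfact = 2
  real(dp) :: a(m,n), a0(m,n), l(m,nfact), u(nfact,n), rest(m,n)
  integer  :: i, j, k

  ! Diagonally dominant test matrix, so elimination without pivoting is safe.
  do j = 1, n
     do i = 1, m
        a(i,j) = 1.0_dp / real(i + j, dp)
     end do
     a(j,j) = a(j,j) + 4.0_dp
  end do
  a0 = a

  ! Right-looking elimination without pivoting, stopped after nfact columns.
  do k = 1, nfact
     a(k+1:m, k) = a(k+1:m, k) / a(k,k)                       ! multipliers (L)
     do j = k+1, n
        a(k+1:m, j) = a(k+1:m, j) - a(k+1:m, k) * a(k, j)     ! trailing update
     end do
  end do

  ! Unpack the factors from the overwritten array.
  l = 0.0_dp
  u = 0.0_dp
  rest = 0.0_dp
  do k = 1, nfact
     l(k,k)      = 1.0_dp               ! unit diagonal is implicit in a
     l(k+1:m, k) = a(k+1:m, k)          ! L1 below the diagonal, then L2
     u(k, k:n)   = a(k, k:n)            ! U1 on and above the diagonal, then U2
  end do
  rest(nfact+1:m, nfact+1:n) = a(nfact+1:m, nfact+1:n)   ! unfactored block

  ! [L1; L2] * [U1 U2] + [0 0; 0 unfactored block] should reproduce A.
  print *, 'reconstruction error =', maxval(abs(matmul(l, u) + rest - a0))
end program partial_lu_sketch
```

The printed reconstruction error should be at rounding level. Setting nfact = min(m, n) in this sketch leaves no unfactored block and gives the complete factorization A = L*U.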
?getrf2
Computes LU factorization using partial pivoting with
row interchanges.
Syntax
call sgetrf2 (m, n, a, lda, ipiv, info )
call dgetrf2 (m, n, a, lda, ipiv, info )
call cgetrf2 (m, n, a, lda, ipiv, info )
call zgetrf2 (m, n, a, lda, ipiv, info )
Include Files
mkl.fi
Description
?getrf2 computes an LU factorization of a general m-by-n matrix A using partial pivoting with row
interchanges.
The factorization has the form
A=P*L*U
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m >
n), and U is upper triangular (upper trapezoidal if m < n).
This is the recursive version of the algorithm. It divides the matrix into four submatrices:
A = [ A11  A12 ]
    [ A21  A22 ]
where A11 is n1 by n1 and A22 is n2 by n2, with n1 = min(m, n)/2 and n2 = n - n1.
The subroutine calls itself to factor the left block column [ A11; A21 ], does the swaps on the right block
column [ A12; A22 ], solves for A12, updates A22, and then calls itself to factor A22 and do the swaps on A21.
Input Parameters
m INTEGER. The number of rows of the matrix A. m >= 0.
n INTEGER. The number of columns of the matrix A. n >= 0.
a REAL for sgetrf2

DOUBLE PRECISION for dgetrf2
COMPLEX for cgetrf2
DOUBLE COMPLEX for zgetrf2
Array, size (lda,n).
On entry, the m-by-n matrix to be factored.
lda INTEGER. The leading dimension of the array a. lda >= max(1,m).
Output Parameters
a On exit, the factors L and U from the factorization A = P * L * U; the unit

diagonal elements of L are not stored.
ipiv INTEGER. Array, size (min(m,n)).
The pivot indices; for 1 <= i <= min(m,n), row i of the matrix was
interchanged with row ipiv(i).
info INTEGER. = 0: successful exit.

< 0: if info = -i, the i-th argument had an illegal value.
> 0: if info = i, Ui, i is exactly zero. The factorization has been completed,
but the factor U is exactly singular, and division by zero will occur if it is
used to solve a system of equations.
?gbtrf
Computes the LU factorization of a general m-by-n
band matrix.
Syntax
call sgbtrf( m, n, kl, ku, ab, ldab, ipiv, info )
call dgbtrf( m, n, kl, ku, ab, ldab, ipiv, info )
call cgbtrf( m, n, kl, ku, ab, ldab, ipiv, info )
call zgbtrf( m, n, kl, ku, ab, ldab, ipiv, info )
call gbtrf( ab [,kl] [,m] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine forms the LU factorization of a general m-by-n band matrix A with kl non-zero subdiagonals and
ku non-zero superdiagonals, that is,
A = P*L*U,
where P is a permutation matrix; L is lower triangular with unit diagonal elements and at most kl non-zero
elements in each column; U is an upper triangular band matrix with kl + ku superdiagonals. The routine uses
partial pivoting, with row interchanges (which creates the additional kl superdiagonals in U).
NOTE
This routine supports the Progress Routine feature. See Progress Function section for details.
Input Parameters
m INTEGER. The number of rows in matrix A; m ≥ 0.
n INTEGER. The number of columns in matrix A; n ≥ 0.
kl INTEGER. The number of subdiagonals within the band of A; kl ≥ 0.
ku INTEGER. The number of superdiagonals within the band of A; ku ≥ 0.
ab REAL for sgbtrf

DOUBLE PRECISION for dgbtrf
COMPLEX for cgbtrf
DOUBLE COMPLEX for zgbtrf.
Array, size ldab by *.
The array ab contains the matrix A in band storage, in rows kl + 1 to
2*kl + ku + 1; rows 1 to kl of the array need not be set. The j-th
column of A is stored in the j-th column of the array ab as follows:
ab(kl + ku + 1 + i - j, j) = a(i,j) for max(1,j-ku) ≤ i ≤
min(m,j+kl).
ldab INTEGER. The leading dimension of the array ab; ldab ≥ 2*kl + ku
+ 1.
Output Parameters
ab Overwritten by L and U. U is stored as an upper triangular band

matrix with kl + ku superdiagonals in rows 1 to kl + ku + 1, and
the multipliers used during the factorization are stored in rows kl +
ku + 2 to 2*kl + ku + 1.
See Application Notes below for further details.
ipiv INTEGER.
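Array, size at least max(1,min(m, n)). The pivot indices; for 1 ≤ i ≤
min(m, n), row i was interchanged with row ipiv(i).
info INTEGER. If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, u(i,i) is 0. The factorization has been completed, but U is
exactly singular. Division by 0 will occur if you use the factor U for
solving a system of linear equations.
LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or reconstructible arguments, see LAPACK
95 Interface Conventions.
Specific details for the routine gbtrf interface are as follows: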
ab Holds the array A of size (2*kl+ku+1,n).
ipiv Holds the vector of length min(m,n).
kl If omitted, assumed kl = ku.
ku Restored as ku = lda-2*kl-1.
m If omitted, assumed m = n.
Application Notes
The computed L and U are the exact factors of a perturbed matrix A + E, where
|E| ≤ c(kl+ku+1) ε P|L||U|,
c(k) is a modest linear function of k, and ε is the machine precision.

The total number of floating-point operations for real flavors varies between approximately 2n(ku+1)kl and
2n(kl+ku+1)kl. The number of operations for complex flavors is four times greater. All these estimates
assume that kl and ku are much less than min(m,n).
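The band storage scheme is illustrated by the following example, when m = n = 6, kl = 2, ku = 1:
On entry:
 *    *    *    +    +    +
 *    *    +    +    +    +
 *   a12  a23  a34  a45  a56
a11  a22  a33  a44  a55  a66
a21  a32  a43  a54  a65   *
a31  a42  a53  a64   *    *
On exit:
 *    *    *   u14  u25  u36
 *    *   u13  u24  u35  u46
 *   u12  u23  u34  u45  u56
u11  u22  u33  u44  u55  u66
m21  m32  m43  m54  m65   *
m31  m42  m53  m64   *    *
Elements marked * are not used; elements marked + need not be set on entry, but are required by the
routine to store elements of U because of fill-in resulting from the row interchanges.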
After calling this routine with m = n, you can call the following routines:
?gbtrs to solve A*X = B or A^T*X = B or A^H*X = B
?gbcon to estimate the condition number of A.
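The band storage rule ab(kl + ku + 1 + i - j, j) = a(i,j) and the factor-then-solve sequence can be sketched together. This is an illustration under stated assumptions: FORTRAN 77 interface, double precision, a program linked against Intel MKL or another LAPACK library. The dense helper array, the test values, and the program name are made up for the example.

```fortran
program gbtrf_gbtrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 6, kl = 2, ku = 1, nrhs = 1
  integer, parameter :: ldab = 2*kl + ku + 1
  real(dp) :: dense(n,n), ab(ldab,n), xref(n), x(n,nrhs)
  integer  :: ipiv(n), info, i, j
  external :: dgbtrf, dgbtrs

  ! Build a dense copy of a band matrix (only used to form the right-hand side).
  dense = 0.0_dp
  do j = 1, n
     do i = max(1, j-ku), min(n, j+kl)
        dense(i,j) = 1.0_dp / real(i + j, dp)
     end do
     dense(j,j) = dense(j,j) + 4.0_dp
  end do

  ! Pack into band storage: rows kl+1 .. 2*kl+ku+1 hold A,
  ! rows 1 .. kl are left free for fill-in created by row interchanges.
  ab = 0.0_dp
  do j = 1, n
     do i = max(1, j-ku), min(n, j+kl)
        ab(kl+ku+1+i-j, j) = dense(i,j)
     end do
  end do

  ! Right-hand side chosen so that the exact solution is (1, 2, ..., n).
  xref = [(real(i, dp), i = 1, n)]
  x(:,1) = matmul(dense, xref)

  call dgbtrf(n, n, kl, ku, ab, ldab, ipiv, info)
  if (info /= 0) stop 1
  call dgbtrs('N', n, kl, ku, nrhs, ab, ldab, ipiv, x, n, info)
  if (info /= 0) stop 1

  print *, 'max error =', maxval(abs(x(:,1) - xref))
end program gbtrf_gbtrs_sketch
```

Note that ldab must be at least 2*kl + ku + 1 for the factorization, which is kl rows more than the compact band storage used by routines that do not pivot.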
See Also
mkl_progress
?gttrf
Computes the LU factorization of a tridiagonal matrix.
Syntax
call sgttrf( n, dl, d, du, du2, ipiv, info )
call dgttrf( n, dl, d, du, du2, ipiv, info )
call cgttrf( n, dl, d, du, du2, ipiv, info )
call zgttrf( n, dl, d, du, du2, ipiv, info )
call gttrf( dl, d, du, du2 [, ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the LU factorization of a real or complex tridiagonal matrix A using elimination with
partial pivoting and row interchanges. The factorization has the form
A = L*U,
where L is a product of permutation and unit lower bidiagonal matrices and U is upper triangular with
nonzeroes in only the main diagonal and first two superdiagonals.
Input Parameters
n INTEGER. The order of the matrix A; n ≥ 0.
dl, d, du REAL for sgttrf

DOUBLE PRECISION for dgttrf
COMPLEX for cgttrf
DOUBLE COMPLEX for zgttrf.
Arrays containing elements of A.
The array dl of dimension (n - 1) contains the subdiagonal elements
of A.
The array d of dimension n contains the diagonal elements of A.
The array du of dimension (n - 1) contains the superdiagonal
elements of A.
Output Parameters
dl Overwritten by the (n-1) multipliers that define the matrix L from the
LU factorization of A. The matrix L has unit diagonal elements, and the
(n-1) elements of dl form the subdiagonal. All other elements of L
are zero.
d Overwritten by the n diagonal elements of the upper triangular matrix

U from the LU factorization of A.
du Overwritten by the (n-1) elements of the first superdiagonal of U.
du2 REAL for sgttrf

DOUBLE PRECISION for dgttrf
COMPLEX for cgttrf
DOUBLE COMPLEX for zgttrf.
Array, dimension (n -2). On exit, du2 contains (n-2) elements of
the second superdiagonal of U.
ipiv INTEGER.
Array, dimension (n). The pivot indices: for 1 ≤ i ≤ n, row i was
interchanged with row ipiv(i). ipiv(i) is always i or i+1; ipiv(i) = i
indicates a row interchange was not required.
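info INTEGER. If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, u(i,i) is 0. The factorization has been completed, but U is
exactly singular. Division by zero will occur if you use the factor U for
solving a system of linear equations.
LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or reconstructible arguments, see LAPACK
95 Interface Conventions.
Specific details for the routine gttrf interface are as follows: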
dl Holds the vector of length (n-1).
d Holds the vector of length n.
du Holds the vector of length (n-1).
du2 Holds the vector of length (n-2).
ipiv Holds the vector of length n.
Application Notes
After calling this routine, you can call the following routines:
?gttrs to solve A*X = B or A^T*X = B or A^H*X = B
?gtcon to estimate the condition number of A.
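A tridiagonal factor-then-solve sequence might look like the sketch below. It assumes the FORTRAN 77 interface, double precision, and linking against Intel MKL or another LAPACK library; the solve step uses the tridiagonal solver dgttrs, which takes the four arrays and pivot indices produced by dgttrf. The test matrix and the program name are made up for the example.

```fortran
program gttrf_gttrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 5, nrhs = 1
  real(dp) :: dl(n-1), d(n), du(n-1), du2(n-2), xref(n), b(n,nrhs)
  integer  :: ipiv(n), info, i
  external :: dgttrf, dgttrs

  ! Tridiagonal test matrix: 2 on the diagonal, -1 on both off-diagonals.
  dl = -1.0_dp
  d  =  2.0_dp
  du = -1.0_dp

  ! Right-hand side b = A*xref, formed before dl, d, du are overwritten.
  xref = [(real(i, dp), i = 1, n)]
  b(1,1) = d(1)*xref(1) + du(1)*xref(2)
  do i = 2, n-1
     b(i,1) = dl(i-1)*xref(i-1) + d(i)*xref(i) + du(i)*xref(i+1)
  end do
  b(n,1) = dl(n-1)*xref(n-1) + d(n)*xref(n)

  ! Factor: dl, d, du are overwritten; du2 receives the second superdiagonal of U.
  call dgttrf(n, dl, d, du, du2, ipiv, info)
  if (info /= 0) stop 1

  ! Solve A*X = B with the factors; b is overwritten by the solution.
  call dgttrs('N', n, nrhs, dl, d, du, du2, ipiv, b, n, info)
  if (info /= 0) stop 1

  print *, 'max error =', maxval(abs(b(:,1) - xref))
end program gttrf_gttrs_sketch
```

The array du2 has no input meaning; it exists because row interchanges can create a second superdiagonal in U.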
?dttrfb
Computes the factorization of a diagonally dominant
tridiagonal matrix.
Syntax
call sdttrfb( n, dl, d, du, info )
call ddttrfb( n, dl, d, du, info )
call cdttrfb( n, dl, d, du, info )
call zdttrfb( n, dl, d, du, info )

call dttrfb( dl, d, du [, info] )
Include Files
mkl.fi, lapack.f90
Description
The ?dttrfb routine computes the factorization of a real or complex tridiagonal matrix A with the BABE
(Burning At Both Ends) algorithm without pivoting. The factorization has the form
A = L1*U*L2
where
L1 and L2 are unit lower bidiagonal with k and n - k - 1 subdiagonal elements, respectively, where k =
n/2, and
U is an upper bidiagonal matrix with nonzeroes in only the main diagonal and first superdiagonal.
Input Parameters
n INTEGER. The order of the matrix A; n ≥ 0.
dl, d, du REAL for sdttrfb

DOUBLE PRECISION for ddttrfb
COMPLEX for cdttrfb
DOUBLE COMPLEX for zdttrfb.
Arrays containing elements of A.
The array dl of dimension (n - 1) contains the subdiagonal
elements of A.
The array d of dimension n contains the diagonal elements of A.
The array du of dimension (n - 1) contains the superdiagonal

elements of A.
Output Parameters
dl Overwritten by the (n -1) multipliers that define the matrix L from

the LU factorization of A.
d Overwritten by the n diagonal element reciprocals of the upper

triangular matrix U from the factorization of A.
du Overwritten by the (n-1) elements of the superdiagonal of U.

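info INTEGER. If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, u(i,i) is 0. The factorization has been completed, but U is
exactly singular. Division by zero will occur if you use the factor U for
solving a system of linear equations.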
Application Notes
A diagonally dominant tridiagonal system is defined such that |d(i)| > |dl(i-1)| + |du(i)| for any i,
1 < i < n, and |d(1)| > |du(1)|, |d(n)| > |dl(n-1)|.
The underlying BABE algorithm is designed for diagonally dominant systems. For such systems, elimination
without pivoting is numerically stable, unlike for general systems, which require elimination with partial
pivoting (see ?gttrf). Solving a diagonally dominant system without pivoting is also much faster than solving a
general system with pivoting.
NOTE
The current implementation of BABE has a potential accuracy issue on very small or large data
close to the underflow or overflow threshold respectively. Scale the matrix before applying the
solver in the case of such input data.
Applying the ?dttrfb factorization to systems that are not diagonally dominant may lead to accuracy
loss or to false detection of singularity because no pivoting is performed.
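Because ?dttrfb does not pivot, it can be useful to test the diagonal dominance condition from the application notes before choosing it over ?gttrf. The following stand-alone sketch implements that test for real data; the module name, the function name is_diag_dominant, and the test values are hypothetical and not part of Intel MKL.

```fortran
module tridiag_checks
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Returns .true. if |d(1)| > |du(1)|, |d(n)| > |dl(n-1)|, and
  ! |d(i)| > |dl(i-1)| + |du(i)| for 1 < i < n. Assumes n >= 2.
  pure function is_diag_dominant(dl, d, du) result(ok)
    real(dp), intent(in) :: dl(:), d(:), du(:)
    logical :: ok
    integer :: i, n
    n = size(d)
    ok = .false.
    if (n < 2 .or. size(dl) /= n-1 .or. size(du) /= n-1) return
    ok = abs(d(1)) > abs(du(1)) .and. abs(d(n)) > abs(dl(n-1))
    do i = 2, n-1
       ok = ok .and. abs(d(i)) > abs(dl(i-1)) + abs(du(i))
    end do
  end function is_diag_dominant
end module tridiag_checks

program dominance_sketch
  use tridiag_checks
  implicit none
  real(dp) :: dl(3), d(4), du(3)

  dl = -1.0_dp
  du = -1.0_dp

  d = 4.0_dp
  print *, is_diag_dominant(dl, d, du)   ! every row satisfies the strict inequality

  d = 2.0_dp
  print *, is_diag_dominant(dl, d, du)   ! interior rows have |d| = |dl| + |du|, not strictly greater
end program dominance_sketch
```

The first call reports T and the second reports F, because the definition requires strict inequality in every row. A matrix that fails the test is a candidate for ?gttrf instead.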
?potrf
Computes the Cholesky factorization of a symmetric
(Hermitian) positive-definite matrix.
Syntax
call spotrf( uplo, n, a, lda, info )
call dpotrf( uplo, n, a, lda, info )
call cpotrf( uplo, n, a, lda, info )
call zpotrf( uplo, n, a, lda, info )
call potrf( a [, uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine forms the Cholesky factorization of a symmetric positive-definite or, for complex data, Hermitian
positive-definite matrix A:
A = U^T*U for real data, A = U^H*U for complex data if uplo='U'
A = L*L^T for real data, A = L*L^H for complex data if uplo='L'
where L is a lower triangular matrix and U is upper triangular.
NOTE
This routine supports the Progress Routine feature. See the Progress Function section for details.
Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'.

Indicates whether the upper or lower triangular part of A is stored and
how A is factored:
If uplo = 'U', the array a stores the upper triangular part of the
matrix A, and the strictly lower triangular part of the matrix is not
referenced.
If uplo = 'L', the array a stores the lower triangular part of the
matrix A, and the strictly upper triangular part of the matrix is not
referenced.
n INTEGER. The order of matrix A; n ≥ 0.
a REAL for spotrf

DOUBLE PRECISION for dpotrf
COMPLEX for cpotrf
DOUBLE COMPLEX for zpotrf.
Array, size (lda,*). The array a contains either the upper or the
lower triangular part of the matrix A (see uplo). The second dimension
of a must be at least max(1, n).
lda INTEGER. The leading dimension of a.
Output Parameters
a The upper or lower triangular part of a is overwritten by the Cholesky

factor U or L, as specified by uplo.

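info INTEGER. If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, the leading minor of order i (and therefore the matrix A
itself) is not positive-definite, and the factorization could not be
completed. This may indicate an error in forming the matrix A.
LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or reconstructible arguments, see LAPACK
95 Interface Conventions.
Specific details for the routine potrf interface are as follows: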
a Holds the matrix A of size (n, n).
Application Notes
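If uplo = 'U', the computed factor U is the exact factor of a perturbed matrix A + E, where
|E| ≤ c(n) ε |U^H||U|, |e(i,j)| ≤ c(n) ε sqrt(a(i,i)*a(j,j)),
c(n) is a modest linear function of n, and ε is the machine precision.
A similar estimate holds for uplo = 'L'.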
The total number of floating-point operations is approximately (1/3)n^3 for real flavors or (4/3)n^3 for
complex flavors.
After calling this routine, you can call the following routines:
?potrs to solve A*X = B
?pocon to estimate the condition number of A
?potri to compute the inverse of A.
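The Cholesky factor-then-solve sequence can be sketched as follows, assuming the FORTRAN 77 interface, double precision, and a program linked against Intel MKL or another LAPACK library. The 3-by-3 matrix and the program name are made up for the example.

```fortran
program potrf_potrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: a(n,n), x(n,nrhs)
  integer  :: info, i
  external :: dpotrf, dpotrs

  ! Symmetric positive-definite test matrix (symmetric, so the fill order is immaterial).
  a = reshape([ 4.0_dp, 2.0_dp, 2.0_dp,  &
                2.0_dp, 5.0_dp, 3.0_dp,  &
                2.0_dp, 3.0_dp, 6.0_dp ], [n, n])
  x(:,1) = [8.0_dp, 10.0_dp, 11.0_dp]    ! equals A*(1,1,1)

  ! With uplo = 'U' only the upper triangle is read and overwritten by U.
  call dpotrf('U', n, a, n, info)
  if (info > 0) then
     print *, 'leading minor of order', info, 'is not positive-definite'
     stop 1
  else if (info < 0) then
     stop 1
  end if

  print *, 'Upper triangular factor U, row by row:'
  do i = 1, n
     print *, a(i, i:n)
  end do

  ! Solve A*X = B using the factor; x is overwritten by the solution.
  call dpotrs('U', n, nrhs, a, n, x, n, info)
  if (info /= 0) stop 1
  print *, 'x =', x(:,1)
end program potrf_potrs_sketch
```

For this matrix the exact factor has rows (2, 1, 1), (2, 1), and (2), and the exact solution is (1, 1, 1). The strictly lower triangle of a is not referenced and still holds the input values on exit.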
See Also
mkl_progress
?potrf2
Computes Cholesky factorization using a recursive
algorithm.
Syntax
call spotrf2(uplo, n, a, lda, info)
call dpotrf2(uplo, n, a, lda, info)
call cpotrf2(uplo, n, a, lda, info)
call zpotrf2(uplo, n, a, lda, info)
Include Files
mkl.fi
Description
?potrf2 computes the Cholesky factorization of a real symmetric or complex Hermitian positive-definite matrix A
using the recursive algorithm. The factorization has the form
for real flavors:
A = U^T * U, if uplo = 'U', or
A = L * L^T, if uplo = 'L',
for complex flavors:
A = U^H * U, if uplo = 'U', or
A = L * L^H, if uplo = 'L',
where U is an upper triangular matrix and L is lower triangular.

This is the recursive version of the algorithm. It divides the matrix into four submatrices:
A = [ A11  A12 ]
    [ A21  A22 ]
where A11 is n1 by n1 and A22 is n2 by n2, with n1 = n/2 and n2 = n-n1.
The subroutine calls itself to factor A11, updates and scales A21 or A12, updates A22, and then calls itself to
factor A22.
Input Parameters
uplo CHARACTER*1. = 'U': Upper triangle of A is stored;
= 'L': Lower triangle of A is stored.
n INTEGER. The order of the matrix A; n ≥ 0.
a REAL for spotrf2

DOUBLE PRECISION for dpotrf2
COMPLEX for cpotrf2
DOUBLE COMPLEX for zpotrf2
On entry, the symmetric matrix A.

If uplo = 'U', the leading n-by-n upper triangular part of a contains the
upper triangular part of the matrix A, and the strictly lower triangular part
of a is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of a contains the
lower triangular part of the matrix A, and the strictly upper triangular part
of a is not referenced.
lda INTEGER. The leading dimension of the array a;
lda ≥ max(1,n).
Output Parameters
a On exit, if info = 0, the factor U or L from the Cholesky factorization.
For real flavors:
A = U^T*U or A = L*L^T;
For complex flavors:
A = U^H*U or A = L*L^H.
info INTEGER. = 0: successful exit

< 0: if info = -i, the i-th argument had an illegal value
> 0: if info = i, the leading minor of order i is not positive definite, and the
factorization could not be completed.
?pstrf
Computes the Cholesky factorization with complete
pivoting of a real symmetric (complex Hermitian)
positive semidefinite matrix.
Syntax
call spstrf( uplo, n, a, lda, piv, rank, tol, work, info )
call dpstrf( uplo, n, a, lda, piv, rank, tol, work, info )
call cpstrf( uplo, n, a, lda, piv, rank, tol, work, info )
call zpstrf( uplo, n, a, lda, piv, rank, tol, work, info )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the Cholesky factorization with complete pivoting of a real symmetric (complex
Hermitian) positive semidefinite matrix. The form of the factorization is:
P^T * A * P = U^T * U, if uplo ='U' for real flavors,
P^T * A * P = U^H * U, if uplo ='U' for complex flavors,
P^T * A * P = L * L^T, if uplo ='L' for real flavors,
P^T * A * P = L * L^H, if uplo ='L' for complex flavors,
where P is a permutation matrix stored as vector piv, and U and L are upper and lower triangular matrices,
respectively.
This algorithm does not attempt to check that A is positive semidefinite. This version of the algorithm calls
level 3 BLAS.
Input Parameters

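uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is stored:
If uplo = 'U', the array a stores the upper triangular part of the
matrix A, and the strictly lower triangular part of the matrix is not
referenced.
If uplo = 'L', the array a stores the lower triangular part of the
matrix A, and the strictly upper triangular part of the matrix is not
referenced.
n INTEGER. The order of matrix A; n ≥ 0.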
a REAL for spstrf

DOUBLE PRECISION for dpstrf
COMPLEX for cpstrf
DOUBLE COMPLEX for zpstrf.
Array a, size (lda,*). The array a contains either the upper or the
lower triangular part of the matrix A (see uplo). The second
dimension of a must be at least max(1, n).
work REAL for spstrf and cpstrf

DOUBLE PRECISION for dpstrf and zpstrf.
work(*) is a workspace array. The dimension of work is at least
max(1,2*n).
tol REAL for single precision flavors

DOUBLE PRECISION for double precision flavors.
User defined tolerance. If tol < 0, then n*ε*max(A(k,k)), where ε is the
machine precision, will be used (see Error Analysis for the definition of
machine precision). The algorithm terminates at the (k-1)-st step, if
the pivot ≤ tol.
lda INTEGER. The leading dimension of a; at least max(1, n).
Output Parameters
a If info = 0, the factor U or L from the Cholesky factorization is as

described in Description.
piv INTEGER.
Array, size at least max(1, n). The array piv is such that the nonzero
entries of P are P(piv(k), k), 1 ≤ k ≤ n.
rank INTEGER.
The rank of a given by the number of steps the algorithm completed.

info INTEGER. If info = 0, the execution is successful.
If info = -k, the k-th argument had an illegal value.
If info > 0, the matrix A is either rank deficient with a computed
rank as returned in rank, or is not positive semidefinite.
See Also
?pftrf
Computes the Cholesky factorization of a symmetric
(Hermitian) positive-definite matrix using the
Rectangular Full Packed (RFP) format.
Syntax
call spftrf( transr, uplo, n, a, info )
call dpftrf( transr, uplo, n, a, info )
call cpftrf( transr, uplo, n, a, info )
call zpftrf( transr, uplo, n, a, info )
Include Files
mkl.fi, lapack.f90
Description
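The routine forms the Cholesky factorization of a symmetric positive-definite or, for complex data, a
Hermitian positive-definite matrix A:
A = U^T*U for real data, A = U^H*U for complex data if uplo='U'
A = L*L^T for real data, A = L*L^H for complex data if uplo='L'
where L is a lower triangular matrix and U is upper triangular.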
The matrix A is in the Rectangular Full Packed (RFP) format. For the description of the RFP format, see Matrix
Storage Schemes.
This is the block version of the algorithm, calling Level 3 BLAS.
Input Parameters
transr CHARACTER*1. Must be 'N', 'T' (for real data) or 'C' (for complex
data).
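If transr = 'N', the Normal form of RFP A is stored.
If transr = 'T', the Transpose form of RFP A is stored.
If transr = 'C', the Conjugate-Transpose form of RFP A is stored.
uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is stored:
If uplo = 'U', the array a stores the upper triangular part of the
matrix A.
If uplo = 'L', the array a stores the lower triangular part of the
matrix A.
n INTEGER. The order of the matrix A; n ≥ 0.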
a REAL for spftrf

DOUBLE PRECISION for dpftrf
COMPLEX for cpftrf
DOUBLE COMPLEX for zpftrf.
Array, size (n*(n+1)/2). The array a contains the matrix A in the RFP
format.
Output Parameters
a a is overwritten by the Cholesky factor U or L, as specified by uplo
and transr.


See Also
?pptrf
Computes the Cholesky factorization of a symmetric
(Hermitian) positive-definite matrix using packed
storage.
Syntax
call spptrf( uplo, n, ap, info )
call dpptrf( uplo, n, ap, info )

call cpptrf( uplo, n, ap, info )
call zpptrf( uplo, n, ap, info )
call pptrf( ap [, uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
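The routine forms the Cholesky factorization of a symmetric positive-definite or, for complex data, Hermitian
positive-definite packed matrix A:
A = U^T*U for real data, A = U^H*U for complex data if uplo='U'
A = L*L^T for real data, A = L*L^H for complex data if uplo='L'
where L is a lower triangular matrix and U is upper triangular.
NOTE
This routine supports the Progress Routine feature. See the Progress Function section for details.
Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is packed in
the array ap, and how A is factored:
If uplo = 'U', the array ap stores the upper triangular part of the
matrix A, and A is factored as U^H*U.
If uplo = 'L', the array ap stores the lower triangular part of the
matrix A; A is factored as L*L^H.
n INTEGER. The order of matrix A; n ≥ 0.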
ap REAL for spptrf

DOUBLE PRECISION for dpptrf
COMPLEX for cpptrf
DOUBLE COMPLEX for zpptrf.
Array, size at least max(1, n(n+1)/2). The array ap contains either the
upper or the lower triangular part of the matrix A (as specified by
uplo) in packed storage (see Matrix Storage Schemes).
Output Parameters
ap Overwritten by the Cholesky factor U or L, as specified by uplo.

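LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or reconstructible arguments, see LAPACK
95 Interface Conventions.
Specific details for the routine pptrf interface are as follows: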
ap Holds the array A of size (n*(n+1)/2).
Application Notes

The total number of floating-point operations is approximately (1/3)n^3 for real flavors and (4/3)n^3 for
complex flavors.
After calling this routine, you can call the following routines:
?pptrs to solve A*X = B
?ppcon to estimate the condition number of A
?pptri to compute the inverse of A.
See Also
mkl_progress
?pbtrf
Computes the Cholesky factorization of a symmetric
(Hermitian) positive-definite band matrix.
Syntax
call spbtrf( uplo, n, kd, ab, ldab, info )
call dpbtrf( uplo, n, kd, ab, ldab, info )
call cpbtrf( uplo, n, kd, ab, ldab, info )
call zpbtrf( uplo, n, kd, ab, ldab, info )
call pbtrf( ab [, uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
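The routine forms the Cholesky factorization of a symmetric positive-definite or, for complex data, Hermitian
positive-definite band matrix A:
A = U^T*U for real data, A = U^H*U for complex data if uplo='U'
A = L*L^T for real data, A = L*L^H for complex data if uplo='L'
where L is a lower triangular matrix and U is upper triangular.
NOTE
This routine supports the Progress Routine feature. See the Progress Function section for details.
Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is stored in
the array ab, and how A is factored:
If uplo = 'U', the upper triangle of A is stored.
If uplo = 'L', the lower triangle of A is stored.
n INTEGER. The order of matrix A; n ≥ 0.
kd INTEGER. The number of superdiagonals or subdiagonals in the matrix
A; kd ≥ 0.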
ab REAL for spbtrf

DOUBLE PRECISION for dpbtrf
COMPLEX for cpbtrf
DOUBLE COMPLEX for zpbtrf.
Array, size (ldab,*). The array ab contains either the upper or the
lower triangular part of the matrix A (as specified by uplo) in band
storage (see Matrix Storage Schemes). The second dimension of ab
must be at least max(1, n).
ldab INTEGER. The leading dimension of the array ab; ldab ≥ kd + 1.
Output Parameters
ab The upper or lower triangular part of A (in band storage) is

overwritten by the Cholesky factor U or L, as specified by uplo.

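LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or reconstructible arguments, see LAPACK
95 Interface Conventions.
Specific details for the routine pbtrf interface are as follows: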
ab Holds the array A of size (kd+1,n).
Application Notes

The total number of floating-point operations for real flavors is approximately n(kd+1)^2. The number of
operations for complex flavors is 4 times greater. All these estimates assume that kd is much less than n.
After calling this routine, you can call the following routines:
?pbtrs to solve A*X = B
?pbcon to estimate the condition number of A.
See Also
mkl_progress
?pttrf
Computes the factorization of a symmetric (Hermitian)
positive-definite tridiagonal matrix.
Syntax
call spttrf( n, d, e, info )
call dpttrf( n, d, e, info )
call cpttrf( n, d, e, info )
call zpttrf( n, d, e, info )
call pttrf( d, e [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine forms the factorization of a symmetric positive-definite or, for complex data, Hermitian positive-
definite tridiagonal matrix A:
A = L*D*L^T for real flavors, or
A = L*D*L^H for complex flavors,
where D is diagonal and L is unit lower bidiagonal. The factorization may also be regarded as having the form
A = U^T*D*U for real flavors, or A = U^H*D*U for complex flavors, where U is unit upper bidiagonal.
Input Parameters
n INTEGER. The order of the matrix A; n ≥ 0.
d REAL for spttrf, cpttrf

DOUBLE PRECISION for dpttrf, zpttrf.
Array, dimension (n). Contains the diagonal elements of A.
e REAL for spttrf

DOUBLE PRECISION for dpttrf
COMPLEX for cpttrf
DOUBLE COMPLEX for zpttrf.
Array, dimension (n -1). Contains the subdiagonal elements of A.
Output Parameters
d Overwritten by the n diagonal elements of the diagonal matrix D from

the L*D*L^T (for real flavors) or L*D*L^H (for complex flavors)
factorization of A.
e Overwritten by the (n - 1) sub-diagonal elements of the unit

bidiagonal factor L or U from the factorization of A.


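info INTEGER. If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, the leading minor of order i (and therefore the matrix A
itself) is not positive-definite; if i < n, the factorization could not be
completed, while if i = n, the factorization was completed, but d(n)
≤ 0.
LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or reconstructible arguments, see LAPACK
95 Interface Conventions.
Specific details for the routine pttrf interface are as follows:
d Holds the vector of length n.
e Holds the vector of length (n-1).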
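The L*D*L^T factorization of a positive-definite tridiagonal matrix and a subsequent solve can be sketched as follows. The sketch assumes the FORTRAN 77 interface, double precision, and linking against Intel MKL or another LAPACK library; dpttrs is the solver that consumes the d and e arrays produced by dpttrf. The test matrix and the program name are made up for the example.

```fortran
program pttrf_pttrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 5, nrhs = 1
  real(dp) :: d(n), e(n-1), xref(n), b(n,nrhs)
  integer  :: info, i
  external :: dpttrf, dpttrs

  ! Symmetric positive-definite tridiagonal matrix:
  ! 2 on the diagonal, -1 on the subdiagonal (and, by symmetry, the superdiagonal).
  d = 2.0_dp
  e = -1.0_dp

  ! Right-hand side b = A*xref, formed before d and e are overwritten.
  xref = [(real(i, dp), i = 1, n)]
  b(1,1) = d(1)*xref(1) + e(1)*xref(2)
  do i = 2, n-1
     b(i,1) = e(i-1)*xref(i-1) + d(i)*xref(i) + e(i)*xref(i+1)
  end do
  b(n,1) = e(n-1)*xref(n-1) + d(n)*xref(n)

  ! Factor: d receives the diagonal of D, e the subdiagonal of the unit bidiagonal L.
  call dpttrf(n, d, e, info)
  if (info > 0) then
     print *, 'leading minor of order', info, 'is not positive-definite'
     stop 1
  else if (info < 0) then
     stop 1
  end if

  ! Solve A*X = B with the factors; b is overwritten by the solution.
  call dpttrs(n, nrhs, d, e, b, n, info)
  if (info /= 0) stop 1

  print *, 'max error =', maxval(abs(b(:,1) - xref))
end program pttrf_pttrs_sketch
```

No uplo argument is needed for the real flavors because only one off-diagonal is stored.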
?sytrf
Computes the Bunch-Kaufman factorization of a
symmetric matrix.
Syntax
call ssytrf( uplo, n, a, lda, ipiv, work, lwork, info )
call dsytrf( uplo, n, a, lda, ipiv, work, lwork, info )
call csytrf( uplo, n, a, lda, ipiv, work, lwork, info )
call zsytrf( uplo, n, a, lda, ipiv, work, lwork, info )
call sytrf( a [, uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the factorization of a real/complex symmetric matrix A using the Bunch-Kaufman
diagonal pivoting method. The form of the factorization is:
if uplo='U', A = U*D*U^T
if uplo='L', A = L*D*L^T,
where A is the input matrix, U and L are products of permutation and triangular matrices with unit diagonal
(upper triangular for U and lower triangular for L), and D is a symmetric block-diagonal matrix with 1-by-1
and 2-by-2 diagonal blocks. U and L have 2-by-2 unit diagonal blocks corresponding to the 2-by-2 blocks of
D.
NOTE
This routine supports the Progress Routine feature. See the Progress Routine section for details.
Input Parameters

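uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is stored and
how A is factored:
If uplo = 'U', the array a stores the upper triangular part of the
matrix A, and A is factored as U*D*U^T.
If uplo = 'L', the array a stores the lower triangular part of the
matrix A, and A is factored as L*D*L^T.
n INTEGER. The order of matrix A; n ≥ 0.
a REAL for ssytrf
DOUBLE PRECISION for dsytrf
COMPLEX for csytrf
DOUBLE COMPLEX for zsytrf.
Array, size (lda,*). The array a contains either the upper or the
lower triangular part of the matrix A (see uplo). The second dimension
of a must be at least max(1, n).
lda INTEGER. The leading dimension of a; at least max(1, n).
work Same type as a. A workspace array, dimension at least
max(1,lwork).
lwork INTEGER. The size of the work array (lwork ≥ n).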

If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the
first entry of the work array, and no error message related to lwork is
issued by xerbla.
See Application Notes for the suggested value of lwork.
Output Parameters
a The upper or lower triangular part of a is overwritten by details of the
block-diagonal matrix D and the multipliers used to obtain the factor U
(or L).
work(1) If info = 0, on exit work(1) contains the minimum value of lwork
required for optimum performance. Use this lwork for subsequent
runs.
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D. If ipiv(i) = k > 0, then D(i,i) is a 1-by-1
block, and the i-th row and column of A was interchanged with the
k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and (i-1)-th row and column of A
was interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and (i+1)-th row and column of A
was interchanged with the m-th row and column.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, D(i,i) is 0. The factorization has been completed, but D is
exactly singular. Division by 0 will occur if you use D for solving a
system of linear equations.

Specific details for the routine sytrf interface are as follows:
a holds the matrix A of size (n, n)
ipiv holds the vector of length n
uplo must be 'U' or 'L'. The default value is 'U'.
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you choose the first option and supply any admissible lwork, that is, a value no less than the minimal value
described, the routine completes the task, though probably not as fast as with the recommended workspace,
and returns the recommended workspace size in the first element of the array work on exit. Use
this value (work(1)) for subsequent runs.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
Note that if you set lwork to less than the minimal required value and not -1, the routine returns immediately
with an error exit and does not provide any information on the recommended workspace.
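The workspace query described above can be sketched as the two-call pattern below. This is an illustrative sketch under stated assumptions, not the library's own example: it assumes linking against Intel MKL (or any LAPACK providing dsytrf), and the 4-by-4 matrix values and the name wquery are made up for the demonstration.

```fortran
program sytrf_workspace_query
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, lda = n
  real(dp) :: a(lda, n), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: ipiv(n), info, lwork
  external :: dsytrf

  ! Upper triangle of a symmetric indefinite matrix (illustrative values);
  ! the strictly lower triangle is not referenced when uplo = 'U'.
  a = 0.0_dp
  a(1,1:4) = [ 2.0_dp, -1.0_dp,  0.0_dp,  3.0_dp ]
  a(2,2:4) = [ 0.0_dp,  4.0_dp,  1.0_dp ]
  a(3,3:4) = [ -3.0_dp, 2.0_dp ]
  a(4,4)   = 1.0_dp

  ! Call 1: workspace query. With lwork = -1 the routine only reports the
  ! recommended workspace size in the first element of the work argument.
  lwork = -1
  call dsytrf('U', n, a, lda, ipiv, wquery, lwork, info)
  if (info /= 0) error stop 1
  lwork = max(n, int(wquery(1)))
  allocate(work(lwork))

  ! Call 2: the actual factorization with the recommended workspace.
  call dsytrf('U', n, a, lda, ipiv, work, lwork, info)
  if (info < 0) error stop 2

  print *, 'lwork used =', lwork
  print *, 'info       =', info
  print *, 'ipiv       =', ipiv
end program sytrf_workspace_query
```

The query call does not modify a, so the same array can be passed to both calls. A positive info after the second call means the factorization completed but D is exactly singular.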
The 2-by-2 unit diagonal blocks and the unit diagonal elements of U and L are not stored. The remaining
elements of U and L are stored in the corresponding columns of the array a, but additional row interchanges
are required to recover U or L explicitly (which is seldom necessary).
If ipiv(i) = i for all i =1...n, then all off-diagonal elements of U (L) are stored explicitly in the
corresponding elements of the array a.
If uplo = 'U', the computed factors U and D are the exact factors of a perturbed matrix A + E, where
|E| ≤ c(n)ε P|U||D||U^T|P^T
c(n) is a modest linear function of n, and ε is the machine precision. A similar estimate holds for the
computed L and D when uplo = 'L'.
The total number of floating-point operations is approximately (1/3)n^3 for real flavors or (4/3)n^3 for
complex flavors.
After calling this routine, you can call the following routines:
?sytrs to solve A*X = B
?sycon to estimate the condition number of A
?sytri to compute the inverse of A.
If uplo = 'U', then A = U*D*U^T, where
U = P(n)*U(n)* ... *P(k)*U(k)*...,
that is, U is a product of terms P(k)*U(k), where
k decreases from n to 1 in steps of 1 or 2.
D is a block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks D(k).
P(k) is a permutation matrix as defined by ipiv(k).
U(k) is a unit upper triangular matrix, such that if the diagonal block D(k) is of order s (s = 1 or 2), then
             k-s   s   n-k
       k-s (  I    v    0  )
U(k) =  s  (  0    I    0  )
       n-k (  0    0    I  )
If s = 1, D(k) overwrites A(k,k), and v overwrites A(1:k-1,k).
If s = 2, the upper triangle of D(k) overwrites A(k-1,k-1), A(k-1,k) and A(k,k), and v overwrites
A(1:k-2,k-1:k).
If uplo = 'L', then A = L*D*L^T, where
L = P(1)*L(1)* ... *P(k)*L(k)*...,
that is, L is a product of terms P(k)*L(k), where
k increases from 1 to n in steps of 1 or 2.
D is a block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks D(k).
P(k) is a permutation matrix as defined by ipiv(k).
L(k) is a unit lower triangular matrix, such that if the diagonal block D(k) is of order s (s = 1 or 2), then
                 k-1   s   n-k-s+1
         k-1   (  I    0    0  )
L(k) =    s    (  0    I    0  )
       n-k-s+1 (  0    v    I  )
If s = 1, D(k) overwrites A(k,k), and v overwrites A(k+1:n,k).
If s = 2, the lower triangle of D(k) overwrites A(k,k), A(k+1,k), and A(k+1,k+1), and v overwrites
A(k+2:n,k:k+1).
See Also
mkl_progress
?sytrf_rook
Computes the bounded Bunch-Kaufman factorization
of a symmetric matrix.
Syntax
call ssytrf_rook( uplo, n, a, lda, ipiv, work, lwork, info )
call dsytrf_rook( uplo, n, a, lda, ipiv, work, lwork, info )
call csytrf_rook( uplo, n, a, lda, ipiv, work, lwork, info )
call zsytrf_rook( uplo, n, a, lda, ipiv, work, lwork, info )
call sytrf_rook( a [, uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
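The routine computes the factorization of a real/complex symmetric matrix A using the bounded
Bunch-Kaufman ("rook") diagonal pivoting method. The form of the factorization is:
if uplo='U', A = U*D*U^T
if uplo='L', A = L*D*L^T,
where A is the input matrix, U and L are products of permutation and triangular matrices with unit diagonal
(upper triangular for U and lower triangular for L), and D is a symmetric block-diagonal matrix with 1-by-1
and 2-by-2 diagonal blocks. U and L have 2-by-2 unit diagonal blocks corresponding to the 2-by-2 blocks of
D.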
Input Parameters

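uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is stored and
how A is factored:
If uplo = 'U', the array a stores the upper triangular part of the
matrix A, and A is factored as U*D*U^T.
If uplo = 'L', the array a stores the lower triangular part of the
matrix A, and A is factored as L*D*L^T.
n INTEGER. The order of matrix A; n ≥ 0.
a REAL for ssytrf_rook
DOUBLE PRECISION for dsytrf_rook
COMPLEX for csytrf_rook
DOUBLE COMPLEX for zsytrf_rook.
Array, size (lda,n). The array a contains either the upper or the
lower triangular part of the matrix A (see uplo).
lda INTEGER. The leading dimension of a; at least max(1, n).
work Same type as a. A workspace array, dimension at least
max(1,lwork).
lwork INTEGER. The size of the work array (lwork ≥ n).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the
first entry of the work array, and no error message related to lwork is
issued by xerbla.
See ?sytrf Application Notes for the suggested value of lwork.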
Output Parameters

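a The upper or lower triangular part of a is overwritten by details of the
block-diagonal matrix D and the multipliers used to obtain the factor U
(or L).
work(1) If info = 0, on exit work(1) contains the minimum value of lwork
required for optimum performance. Use this lwork for subsequent
runs.
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D.
If ipiv(k) > 0, then rows and columns k and ipiv(k) were
interchanged and D(k,k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) < 0 and ipiv(k - 1) < 0, then rows
and columns k and -ipiv(k) were interchanged, rows and columns
k - 1 and -ipiv(k - 1) were interchanged, and D(k-1:k, k-1:k) is a 2-by-2
diagonal block.
If uplo = 'L' and ipiv(k) < 0 and ipiv(k + 1) < 0, then rows
and columns k and -ipiv(k) were interchanged, rows and columns
k + 1 and -ipiv(k + 1) were interchanged, and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, D(i,i) is 0. The factorization has been completed, but D is
exactly singular. Division by 0 will occur if you use D for solving a
system of linear equations.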



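Specific details for the routine sytrf_rook interface are as follows:
a holds the matrix A of size (n, n)
ipiv holds the vector of length n
uplo must be 'U' or 'L'. The default value is 'U'.
Application Notes
The total number of floating-point operations is approximately (1/3)n^3 for real flavors or (4/3)n^3 for
complex flavors.
After calling this routine, you can call the following routines:
?sytrs_rook to solve A*X = B
?sycon_rook to estimate the condition number of A
?sytri_rook to compute the inverse of A.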
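If uplo = 'U', then A = U*D*U^T, where
U = P(n)*U(n)* ... *P(k)*U(k)*...,
that is, U is a product of terms P(k)*U(k), where
k decreases from n to 1 in steps of 1 or 2.
D is a block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks D(k).
P(k) is a permutation matrix as defined by ipiv(k).
U(k) is a unit upper triangular matrix, such that if the diagonal block D(k) is of order s (s = 1 or 2), then
             k-s   s   n-k
       k-s (  I    v    0  )
U(k) =  s  (  0    I    0  )
       n-k (  0    0    I  )
If s = 1, D(k) overwrites A(k,k), and v overwrites A(1:k-1,k).
If s = 2, the upper triangle of D(k) overwrites A(k-1,k-1), A(k-1,k) and A(k,k), and v overwrites
A(1:k-2,k-1:k).
If uplo = 'L', then A = L*D*L^T, where
L = P(1)*L(1)* ... *P(k)*L(k)*...,
that is, L is a product of terms P(k)*L(k), where
k increases from 1 to n in steps of 1 or 2.
D is a block diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks D(k).
P(k) is a permutation matrix as defined by ipiv(k).
L(k) is a unit lower triangular matrix, such that if the diagonal block D(k) is of order s (s = 1 or 2), then
                 k-1   s   n-k-s+1
         k-1   (  I    0    0  )
L(k) =    s    (  0    I    0  )
       n-k-s+1 (  0    v    I  )
If s = 1, D(k) overwrites A(k,k), and v overwrites A(k+1:n,k).
If s = 2, the lower triangle of D(k) overwrites A(k,k), A(k+1,k), and A(k+1,k+1), and v overwrites
A(k+2:n,k:k+1).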
See Also
?hetrf
Computes the Bunch-Kaufman factorization of a
complex Hermitian matrix.
Syntax
call chetrf( uplo, n, a, lda, ipiv, work, lwork, info )
call zhetrf( uplo, n, a, lda, ipiv, work, lwork, info )
call hetrf( a [, uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
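The routine computes the factorization of a complex Hermitian matrix A using the Bunch-Kaufman diagonal
pivoting method:
if uplo='U', A = U*D*U^H
if uplo='L', A = L*D*L^H,
where A is the input matrix, U and L are products of permutation and triangular matrices with unit diagonal
(upper triangular for U and lower triangular for L), and D is a Hermitian block-diagonal matrix with 1-by-1
and 2-by-2 diagonal blocks. U and L have 2-by-2 unit diagonal blocks corresponding to the 2-by-2 blocks of
D.
NOTE
This routine supports the Progress Routine feature. See the Progress Routine section for details.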
Input Parameters

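uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is stored and
how A is factored:
If uplo = 'U', the array a stores the upper triangular part of the
matrix A, and A is factored as U*D*U^H.
If uplo = 'L', the array a stores the lower triangular part of the
matrix A, and A is factored as L*D*L^H.
n INTEGER. The order of matrix A; n ≥ 0.
a, work COMPLEX for chetrf
DOUBLE COMPLEX for zhetrf.
Arrays, size (lda,*), work(*).
The array a contains the upper or the lower triangular part of the
matrix A (see uplo). The second dimension of a must be at least
max(1, n).
work(*) is a workspace array of dimension at least max(1, lwork).
lda INTEGER. The leading dimension of a; at least max(1, n).
lwork INTEGER. The size of the work array (lwork ≥ n).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the
first entry of the work array, and no error message related to lwork is
issued by xerbla.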
Output Parameters

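a The upper or lower triangular part of a is overwritten by details of the
block-diagonal matrix D and the multipliers used to obtain the factor U
(or L).
work(1) If info = 0, on exit work(1) contains the minimum value of lwork
required for optimum performance. Use this lwork for subsequent
runs.
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D. If ipiv(i) = k > 0, then D(i,i) is a 1-by-1
block, and the i-th row and column of A was interchanged with the
k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and (i-1)-th row and column of A
was interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and (i+1)-th row and column of A
was interchanged with the m-th row and column.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, D(i,i) is 0. The factorization has been completed, but D is
exactly singular. Division by 0 will occur if you use D for solving a
system of linear equations.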


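Specific details for the routine hetrf interface are as follows:
a holds the matrix A of size (n, n)
ipiv holds the vector of length n
uplo must be 'U' or 'L'. The default value is 'U'.
Application Notes
This routine is suitable for Hermitian matrices that are not known to be positive-definite. If A is in fact
positive-definite, the routine does not perform interchanges, and no 2-by-2 diagonal blocks occur in D.
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the
first element of the array work. This operation is called a workspace query.
The 2-by-2 unit diagonal blocks and the unit diagonal elements of U and L are not stored. The remaining
elements of U and L are stored in the corresponding columns of the array a, but additional row interchanges
are required to recover U or L explicitly (which is seldom necessary).
If ipiv(i) = i for all i = 1...n, then all off-diagonal elements of U (L) are stored explicitly in the corresponding
elements of the array a.
If uplo = 'U', the computed factors U and D are the exact factors of a perturbed matrix A + E, where
|E| ≤ c(n)ε P|U||D||U^H|P^T
c(n) is a modest linear function of n, and ε is the machine precision.
A similar estimate holds for the computed L and D when uplo = 'L'.
The total number of floating-point operations is approximately (4/3)n^3.
After calling this routine, you can call the following routines:
?hetrs to solve A*X = B
?hecon to estimate the condition number of A
?hetri to compute the inverse of A.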
See Also
mkl_progress
?hetrf_rook
Computes the bounded Bunch-Kaufman factorization
of a complex Hermitian matrix.
Syntax
call chetrf_rook( uplo, n, a, lda, ipiv, work, lwork, info )
call zhetrf_rook( uplo, n, a, lda, ipiv, work, lwork, info )
call hetrf_rook( a [, uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the factorization of a complex Hermitian matrix A using the bounded Bunch-Kaufman
diagonal pivoting method:
if uplo='U', A = U*D*U^H
if uplo='L', A = L*D*L^H,
where A is the input matrix, U (or L) is a product of permutation and unit upper (or lower) triangular
matrices, and D is a Hermitian block-diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks.
This is the blocked version of the algorithm, calling Level 3 BLAS.
Input Parameters

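uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is stored:
If uplo = 'U', the array a stores the upper triangular part of the
matrix A.
If uplo = 'L', the array a stores the lower triangular part of the
matrix A.
n INTEGER. The order of matrix A; n ≥ 0.
a COMPLEX for chetrf_rook
COMPLEX*16 for zhetrf_rook.
Array a, size (lda,n).
The array a contains the upper or the lower triangular part of the
matrix A (see uplo).
If uplo = 'U', the leading n-by-n upper triangular part of a contains
the upper triangular part of the matrix A, and the strictly lower
triangular part of a is not referenced. If uplo = 'L', the leading n-by-n
lower triangular part of a contains the lower triangular part of the
matrix A, and the strictly upper triangular part of a is not referenced.
lda INTEGER. The leading dimension of a; at least max(1, n).
work COMPLEX for chetrf_rook
COMPLEX*16 for zhetrf_rook.
Array work(*).
work(*) is a workspace array of dimension at least max(1, lwork).
lwork INTEGER.
The length of work. lwork ≥ 1. For best performance lwork ≥ n*nb,
where nb is the block size returned by ilaenv.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the
first entry of the work array, and no error message related to lwork is
issued by xerbla.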
Output Parameters
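a The block diagonal matrix D and the multipliers used to obtain the
factor U or L (see Application Notes for further details).
work(1) If info = 0, on exit work(1) contains the minimum value of lwork
required for optimum performance. Use this lwork for subsequent
runs.
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D.
If uplo = 'U':
If ipiv(k) > 0, then rows and columns k and ipiv(k) were
interchanged and D(k,k) is a 1-by-1 diagonal block.
If ipiv(k) < 0 and ipiv(k - 1) < 0, then rows and columns k and
-ipiv(k) were interchanged and rows and columns k - 1 and
-ipiv(k - 1) were interchanged, D(k-1:k, k-1:k) is a 2-by-2 diagonal
block.
If uplo = 'L':
If ipiv(k) > 0, then rows and columns k and ipiv(k) were
interchanged and D(k,k) is a 1-by-1 diagonal block.
If ipiv(k) < 0 and ipiv(k + 1) < 0, then rows and columns k and
-ipiv(k) were interchanged and rows and columns k + 1 and
-ipiv(k + 1) were interchanged, D(k:k+1, k:k+1) is a 2-by-2 diagonal
block.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, D(i,i) is exactly 0. The factorization has been completed,
but the block diagonal matrix D is exactly singular, and division by 0
will occur if you use D for solving a system of linear equations.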

Specific details for the routine hetrf_rook interface are as follows:
Application Notes
If uplo = 'U', then A = U*D*U^H, where
U = P(n)*U(n)* ... *P(k)*U(k)* ...,
i.e., U is a product of terms P(k)*U(k), where k decreases from n to 1 in steps of 1 or 2, and D is a block
diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks D(k). P(k) is a permutation matrix as defined by
ipiv(k), and U(k) is a unit upper triangular matrix, such that if the diagonal block D(k) is of order s (s = 1
or 2), then
             k-s   s   n-k
       k-s (  I    v    0  )
U(k) =  s  (  0    I    0  )
       n-k (  0    0    I  )
If s = 1, D(k) overwrites A(k,k), and v overwrites A(1:k-1,k).
If s = 2, the upper triangle of D(k) overwrites A(k-1,k-1), A(k-1,k), and A(k,k), and v overwrites
A(1:k-2,k-1:k).
If uplo = 'L', then A = L*D*L^H, where
L = P(1)*L(1)* ... *P(k)*L(k)* ...,
i.e., L is a product of terms P(k)*L(k), where k increases from 1 to n in steps of 1 or 2, and D is a block
diagonal matrix with 1-by-1 and 2-by-2 diagonal blocks D(k). P(k) is a permutation matrix as defined by
ipiv(k), and L(k) is a unit lower triangular matrix, such that if the diagonal block D(k) is of order s (s = 1 or
2), then
                 k-1   s   n-k-s+1
         k-1   (  I    0    0  )
L(k) =    s    (  0    I    0  )
       n-k-s+1 (  0    v    I  )
If s = 1, D(k) overwrites A(k,k), and v overwrites A(k+1:n,k).
If s = 2, the lower triangle of D(k) overwrites A(k,k), A(k+1,k), and A(k+1,k+1), and v overwrites
A(k+2:n,k:k+1).
See Also
mkl_progress
?sptrf
Computes the Bunch-Kaufman factorization of a
symmetric matrix using packed storage.
Syntax
call ssptrf( uplo, n, ap, ipiv, info )
call dsptrf( uplo, n, ap, ipiv, info )
call csptrf( uplo, n, ap, ipiv, info )
call zsptrf( uplo, n, ap, ipiv, info )
call sptrf( ap [,uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the factorization of a real/complex symmetric matrix A stored in the packed format
using the Bunch-Kaufman diagonal pivoting method. The form of the factorization is:
if uplo='U', A = U*D*U^T
if uplo='L', A = L*D*L^T,
where U and L are products of permutation and triangular matrices with unit diagonal (upper triangular for U
and lower triangular for L), and D is a symmetric block-diagonal matrix with 1-by-1 and 2-by-2 diagonal
blocks. U and L have 2-by-2 unit diagonal blocks corresponding to the 2-by-2 blocks of D.
NOTE
This routine supports the Progress Routine feature. See the Progress Routine section for details.
Input Parameters

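uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is packed in
the array ap and how A is factored:
If uplo = 'U', the array ap stores the upper triangular part of the
matrix A, and A is factored as U*D*U^T.
If uplo = 'L', the array ap stores the lower triangular part of the
matrix A, and A is factored as L*D*L^T.
n INTEGER. The order of matrix A; n ≥ 0.
ap REAL for ssptrf
DOUBLE PRECISION for dsptrf
COMPLEX for csptrf
DOUBLE COMPLEX for zsptrf.
Array, size at least max(1, n(n+1)/2). The array ap contains the upper
or the lower triangular part of the matrix A (as specified by uplo) in
packed storage (see Matrix Storage Schemes).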
Output Parameters
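ap The upper or lower triangle of A (as specified by uplo) is overwritten
by details of the block-diagonal matrix D and the multipliers used to
obtain the factor U (or L).
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D. If ipiv(i) = k > 0, then D(i,i) is a 1-by-1
block, and the i-th row and column of A was interchanged with the
k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and (i-1)-th row and column of A
was interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and (i+1)-th row and column of A
was interchanged with the m-th row and column.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, D(i,i) is 0. The factorization has been completed, but D is
exactly singular. Division by 0 will occur if you use D for solving a
system of linear equations.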


Specific details for the routine sptrf interface are as follows:
Application Notes
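The 2-by-2 unit diagonal blocks and the unit diagonal elements of U and L are not stored. The remaining
elements of U and L overwrite elements of the corresponding columns of the array ap, but additional row
interchanges are required to recover U or L explicitly (which is seldom necessary).
If ipiv(i) = i for all i = 1...n, then all off-diagonal elements of U (L) are stored explicitly in packed form.
If uplo = 'U', the computed factors U and D are the exact factors of a perturbed matrix A + E, where
|E| ≤ c(n)ε P|U||D||U^T|P^T
c(n) is a modest linear function of n, and ε is the machine precision. A similar estimate holds for the
computed L and D when uplo = 'L'.
The total number of floating-point operations is approximately (1/3)n^3 for real flavors or (4/3)n^3 for
complex flavors.
After calling this routine, you can call the following routines:
?sptrs to solve A*X = B
?spcon to estimate the condition number of A
?sptri to compute the inverse of A.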
See Also
mkl_progress
?hptrf
Computes the Bunch-Kaufman factorization of a
complex Hermitian matrix using packed storage.
Syntax
call chptrf( uplo, n, ap, ipiv, info )
call zhptrf( uplo, n, ap, ipiv, info )
call hptrf( ap [,uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
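The routine computes the factorization of a complex Hermitian packed matrix A using the Bunch-Kaufman
diagonal pivoting method:
if uplo='U', A = U*D*U^H
if uplo='L', A = L*D*L^H,
where A is the input matrix, U and L are products of permutation and triangular matrices with unit diagonal
(upper triangular for U and lower triangular for L), and D is a Hermitian block-diagonal matrix with 1-by-1
and 2-by-2 diagonal blocks. U and L have 2-by-2 unit diagonal blocks corresponding to the 2-by-2 blocks of
D.
NOTE
This routine supports the Progress Routine feature. See the Progress Routine section for details.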
Input Parameters

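uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is packed
and how A is factored:
If uplo = 'U', the array ap stores the upper triangular part of the
matrix A, and A is factored as U*D*U^H.
If uplo = 'L', the array ap stores the lower triangular part of the
matrix A, and A is factored as L*D*L^H.
n INTEGER. The order of matrix A; n ≥ 0.
ap COMPLEX for chptrf
DOUBLE COMPLEX for zhptrf.
Array, size at least max(1, n(n+1)/2). The array ap contains the upper
or the lower triangular part of the matrix A (as specified by uplo) in
packed storage (see Matrix Storage Schemes).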
Output Parameters
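ap The upper or lower triangle of A (as specified by uplo) is overwritten
by details of the block-diagonal matrix D and the multipliers used to
obtain the factor U (or L).
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges
and the block structure of D. If ipiv(i) = k > 0, then D(i,i) is a 1-by-1
block, and the i-th row and column of A was interchanged with the
k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and (i-1)-th row and column of A
was interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and (i+1)-th row and column of A
was interchanged with the m-th row and column.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, D(i,i) is 0. The factorization has been completed, but D is
exactly singular. Division by 0 will occur if you use D for solving a
system of linear equations.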



Specific details for the routine hptrf interface are as follows:
Application Notes
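The 2-by-2 unit diagonal blocks and the unit diagonal elements of U and L are not stored. The remaining
elements of U and L are stored in the array ap, but additional row interchanges are required to recover U or L
explicitly (which is seldom necessary).
If ipiv(i) = i for all i = 1...n, then all off-diagonal elements of U (L) are stored explicitly in packed form.
If uplo = 'U', the computed factors U and D are the exact factors of a perturbed matrix A + E, where
|E| ≤ c(n)ε P|U||D||U^H|P^T
c(n) is a modest linear function of n, and ε is the machine precision.
A similar estimate holds for the computed L and D when uplo = 'L'.
The total number of floating-point operations is approximately (4/3)n^3.
After calling this routine, you can call the following routines:
?hptrs to solve A*X = B
?hpcon to estimate the condition number of A
?hptri to compute the inverse of A.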
See Also
mkl_progress
mkl_?spffrt2, mkl_?spffrtx
Computes the partial L*D*L^T factorization of a
symmetric matrix using packed storage.
Syntax
call mkl_sspffrt2( ap, n, ncolm, work, work2 )
call mkl_dspffrt2( ap, n, ncolm, work, work2 )
call mkl_cspffrt2( ap, n, ncolm, work, work2 )
call mkl_zspffrt2( ap, n, ncolm, work, work2 )
call mkl_sspffrtx( ap, n, ncolm, work, work2 )
call mkl_dspffrtx( ap, n, ncolm, work, work2 )
call mkl_cspffrtx( ap, n, ncolm, work, work2 )
call mkl_zspffrtx( ap, n, ncolm, work, work2 )
Include Files
mkl.fi
Description
The routine computes the partial factorization A = L*D*L^T, where L is a lower triangular matrix and D is a
diagonal matrix.
CAUTION
The routine assumes that the matrix A is factorizable. The routine does not perform pivoting and does
not handle diagonal elements which are zero, which cause the routine to produce incorrect results
without any indication.
Consider the matrix
A = ( a  b^T )
    ( b  C   ),
where a is the element in the first row and first column of A, b is a column vector of size n - 1 containing
the elements of the first column of A from the second through n-th row, C is the lower-right square
submatrix of A, and I is the identity matrix.
The mkl_?spffrt2 routine performs ncolm successive factorizations of the form
A = ( a  b^T ) = ( a  0 ) * ( a^-1  0                ) * ( a  b^T )
    ( b  C   )   ( b  I )   ( 0     C - b*a^-1*b^T )   ( 0  I   ).
The mkl_?spffrtx routine performs ncolm successive factorizations of the form
A = ( a  b^T ) = ( 1       0 ) * ( a  0                ) * ( 1  (b*a^-1)^T )
    ( b  C   )   ( b*a^-1  I )   ( 0  C - b*a^-1*b^T )   ( 0  I          ).
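As a small worked check of these two forms, take the illustrative 2-by-2 matrix with a = 4, b = 2, C = 3 (values chosen here for the example, not taken from the reference), so that the updated block is C - b a^{-1} b^T = 3 - 1 = 2. The first product below follows the mkl_?spffrt2 form and the second the mkl_?spffrtx form; multiplying out either one recovers A.

```latex
\[
A=\begin{pmatrix}4&2\\2&3\end{pmatrix}
 =\begin{pmatrix}4&0\\2&1\end{pmatrix}
  \begin{pmatrix}\tfrac14&0\\0&2\end{pmatrix}
  \begin{pmatrix}4&2\\0&1\end{pmatrix}
 =\begin{pmatrix}1&0\\\tfrac12&1\end{pmatrix}
  \begin{pmatrix}4&0\\0&2\end{pmatrix}
  \begin{pmatrix}1&\tfrac12\\0&1\end{pmatrix}.
\]
```

In the second form the middle factor is the diagonal matrix D and the outer factors are the unit triangular L and L^T, which is the shape of a complete L*D*L^T factorization once every column has been processed.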
The approximate number of floating point operations performed by real flavors of these routines is
(1/6)*ncolm*(2*ncolm^2 - 6*ncolm*n + 3*ncolm + 6*n^2 - 6*n + 7).
The approximate number of floating point operations performed by complex flavors of these routines is
(1/3)*ncolm*(4*ncolm^2 - 12*ncolm*n + 9*ncolm + 12*n^2 - 18*n + 8).
Input Parameters
ap REAL for mkl_sspffrt2 and mkl_sspffrtx
DOUBLE PRECISION for mkl_dspffrt2 and mkl_dspffrtx
COMPLEX for mkl_cspffrt2 and mkl_cspffrtx
DOUBLE COMPLEX for mkl_zspffrt2 and mkl_zspffrtx.
Array, size at least max(1, n(n+1)/2). The array ap contains the lower
triangular part of the matrix A in packed storage (see Matrix Storage
Schemes for uplo = 'L').
n INTEGER. The order of the matrix A; n ≥ 0.
ncolm INTEGER. The number of columns to factor, ncolm ≤ n.
work, work2 REAL for mkl_sspffrt2 and mkl_sspffrtx

DOUBLE PRECISION for mkl_dspffrt2 and mkl_dspffrtx
COMPLEX for mkl_cspffrt2 and mkl_cspffrtx
DOUBLE COMPLEX for mkl_zspffrt2 and mkl_zspffrtx.
Workspace arrays, size of each at least n.
Output Parameters
ap Overwritten by the factor L. The first ncolm diagonal elements of the

input matrix A are replaced with the diagonal elements of D. The
subdiagonal elements of the first ncolm columns are replaced with the
corresponding elements of L. The rest of the input array is updated as
indicated in the Description section.
NOTE
Specifying ncolm = n results in complete factorization A = L*D*L^T.
See Also
mkl_progress
Solving Systems of Linear Equations: LAPACK Computational Routines

This section describes the LAPACK routines for solving systems of linear equations. Before calling most of
these routines, you need to factorize the matrix of your system of equations (see Routines for Matrix
Factorization in this chapter). However, the factorization is not necessary if your system of equations has a
triangular matrix.
?getrs
Solves a system of linear equations with an LU-
factored square coefficient matrix, with multiple right-
hand sides.
Syntax
call sgetrs( trans, n, nrhs, a, lda, ipiv, b, ldb, info )
call dgetrs( trans, n, nrhs, a, lda, ipiv, b, ldb, info )
call cgetrs( trans, n, nrhs, a, lda, ipiv, b, ldb, info )
call zgetrs( trans, n, nrhs, a, lda, ipiv, b, ldb, info )
call getrs( a, ipiv, b [, trans] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the following systems of linear equations:
A*X = B if trans='N',
A^T*X = B if trans='T',
A^H*X = B if trans='C' (for complex matrices only).
Before calling this routine, you must call ?getrf to compute the LU factorization of A.
Input Parameters
trans CHARACTER*1. Must be 'N' or 'T' or 'C'.

Indicates the form of the equations:
If trans = 'N', then A*X = B is solved for X.
If trans = 'T', then A^T*X = B is solved for X.
If trans = 'C', then A^H*X = B is solved for X.
n INTEGER. The order of A; the number of rows in B (n ≥ 0).
nrhs INTEGER. The number of right-hand sides; nrhs ≥ 0.
a, b REAL for sgetrs

DOUBLE PRECISION for dgetrs
COMPLEX for cgetrs
DOUBLE COMPLEX for zgetrs.
Arrays: a (size lda by *), b (size ldb by *).
The array a contains LU factorization of matrix A resulting from the
call of ?getrf.
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations.
The second dimension of a must be at least max(1,n) and the second
dimension of b at least max(1,nrhs).
lda INTEGER. The leading dimension of a; lda ≥ max(1, n).
ldb INTEGER. The leading dimension of b; ldb ≥ max(1, n).
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?getrf.
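Output Parameters
b Overwritten by the solution matrix X.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.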

Specific details for the routine getrs interface are as follows:
b Holds the matrix B of size (n, nrhs).
trans Must be 'N', 'C', or 'T'. The default value is 'N'.
Application Notes
For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations
(A + E)x = b, where
|E| ≤ c(n)ε P|L||U|
c(n) is a modest linear function of n, and ε is the machine precision.
If x0 is the true solution, the computed solution x satisfies this error bound:
||x - x0||∞ / ||x||∞ ≤ c(n) cond(A,x) ε
where cond(A,x) = || |A^-1||A| |x| ||∞ / ||x||∞ ≤ ||A^-1||∞ ||A||∞ = κ∞(A).
Note that cond(A,x) can be much smaller than κ∞(A); the condition number of A^T and A^H might or might
not be equal to κ∞(A).
The approximate number of floating-point operations for one right-hand side vector b is 2n^2 for real flavors
and 8n^2 for complex flavors.
To estimate the condition number κ∞(A), call ?gecon.
To refine the solution and estimate the error, call ?gerfs.
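The factor-then-solve sequence for ?getrs can be sketched as follows. This is a minimal illustration rather than an official example: it assumes linking against Intel MKL (or any LAPACK providing dgetrf and dgetrs), and the 3-by-3 matrix, the two right-hand sides, and the names a0, b0, and resid are assumptions made for the demonstration.

```fortran
program getrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 2, lda = n, ldb = n
  real(dp) :: a(lda,n), a0(lda,n), b(ldb,nrhs), b0(ldb,nrhs), resid
  integer :: ipiv(n), info
  external :: dgetrf, dgetrs

  ! Illustrative nonsingular matrix and right-hand sides (column-major order).
  a0 = reshape([ 2.0_dp, 4.0_dp, -2.0_dp, &
                 1.0_dp, 3.0_dp,  1.0_dp, &
                 1.0_dp, 1.0_dp,  5.0_dp ], [lda, n])
  b0 = reshape([ 4.0_dp, 8.0_dp, 4.0_dp, &
                 1.0_dp, 0.0_dp, 2.0_dp ], [ldb, nrhs])
  a = a0
  b = b0

  ! Factor A = P*L*U; a is overwritten by the factors, ipiv by the pivots.
  call dgetrf(n, n, a, lda, ipiv, info)
  if (info /= 0) error stop 1

  ! Solve A*X = B using the factors; b is overwritten by X.
  call dgetrs('N', n, nrhs, a, lda, ipiv, b, ldb, info)
  if (info /= 0) error stop 2

  ! Check the solution against the saved copies of A and B.
  resid = maxval(abs(matmul(a0, b) - b0))
  print '(a,es10.2)', 'max |A*X - B| = ', resid
  if (resid > 1.0e-12_dp) error stop 3
end program getrs_sketch
```

Because the factors in a and ipiv are not modified by ?getrs, the same factorization can be reused for further right-hand sides, or with trans = 'T' to solve the transposed system.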
See Also
?gbtrs
Solves a system of linear equations with an LU-
factored band coefficient matrix, with multiple right-
hand sides.
Syntax
call sgbtrs( trans, n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call dgbtrs( trans, n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call cgbtrs( trans, n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call zgbtrs( trans, n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call gbtrs( ab, b, ipiv [, kl] [, trans] [, info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the following systems of linear equations:
A*X = B if trans='N',
A^T*X = B if trans='T',
A^H*X = B if trans='C' (for complex matrices only).
Here A is an LU-factored general band matrix of order n with kl nonzero subdiagonals and ku nonzero
superdiagonals. Before calling this routine, call ?gbtrf to compute the LU factorization of A.
Input Parameters
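trans CHARACTER*1. Must be 'N' or 'T' or 'C'.
n INTEGER. The order of A; the number of rows in B; n ≥ 0.
kl INTEGER. The number of subdiagonals within the band of A; kl ≥ 0.
ku INTEGER. The number of superdiagonals within the band of A; ku ≥ 0.
nrhs INTEGER. The number of right-hand sides; nrhs ≥ 0.
ab, b REAL for sgbtrs
DOUBLE PRECISION for dgbtrs
COMPLEX for cgbtrs
DOUBLE COMPLEX for zgbtrs.
Arrays: ab(ldab,*), b(ldb,*).
The array ab contains elements of the LU factors of the matrix A as
returned by ?gbtrf. The second dimension of ab must be at least
max(1, n).
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations. The second dimension of b must be at least
max(1,nrhs).
ldab INTEGER. The leading dimension of the array ab; ldab ≥ 2*kl + ku + 1.
ldb INTEGER. The leading dimension of b; ldb ≥ max(1, n).
ipiv INTEGER. Array, size at least max(1, n). The ipiv array, as returned
by ?gbtrf.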
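Output Parameters
b Overwritten by the solution matrix X.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.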


Specific details for the routine gbtrs interface are as follows:
ipiv Holds the vector of length n.
ku Restored as lda-2*kl-1.
Application Notes
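For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations
(A + E)x = b, where
|E| ≤ c(kl + ku + 1)ε P|L||U|
c(k) is a modest linear function of k, and ε is the machine precision.
If x0 is the true solution, the computed solution x satisfies this error bound:
||x - x0||∞ / ||x||∞ ≤ c(kl + ku + 1) cond(A,x) ε
where cond(A,x) = || |A^-1||A| |x| ||∞ / ||x||∞ ≤ ||A^-1||∞ ||A||∞ = κ∞(A).
The approximate number of floating-point operations for one right-hand side vector is 2n(ku + 2kl) for real
flavors. The number of operations for complex flavors is 4 times greater. All these estimates assume that kl
and ku are much less than n.
To estimate the condition number κ∞(A), call ?gbcon.
To refine the solution and estimate the error, call ?gbrfs.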
See Also
?gttrs
Solves a system of linear equations with a tridiagonal
coefficient matrix using the LU factorization computed
by ?gttrf.
Syntax
call sgttrs( trans, n, nrhs, dl, d, du, du2, ipiv, b, ldb, info )
call dgttrs( trans, n, nrhs, dl, d, du, du2, ipiv, b, ldb, info )
call cgttrs( trans, n, nrhs, dl, d, du, du2, ipiv, b, ldb, info )
call zgttrs( trans, n, nrhs, dl, d, du, du2, ipiv, b, ldb, info )
call gttrs( dl, d, du, du2, b, ipiv [, trans] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the following systems of linear equations with multiple right-hand sides:
A*X = B if trans='N',
A^T*X = B if trans='T',
A^H*X = B if trans='C' (for complex matrices only).
Before calling this routine, you must call ?gttrf to compute the LU factorization of A.
Input Parameters

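trans CHARACTER*1. Must be 'N' or 'T' or 'C'.
Indicates the form of the equations:
If trans = 'N', then A*X = B is solved for X.
If trans = 'T', then A^T*X = B is solved for X.
If trans = 'C', then A^H*X = B is solved for X.
n INTEGER. The order of A; n ≥ 0.
nrhs INTEGER. The number of right-hand sides, that is, the number of
columns in B; nrhs ≥ 0.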
dl,d,du,du2,b REAL for sgttrs

DOUBLE PRECISION for dgttrs
COMPLEX for cgttrs
DOUBLE COMPLEX for zgttrs.
Arrays: dl(n -1), d(n), du(n -1), du2(n -2), b(ldb,nrhs).
The array dl contains the (n - 1) multipliers that define the matrix L

from the LU factorization of A.
The array d contains the n diagonal elements of the upper triangular
matrix U from the LU factorization of A.
The array du contains the (n - 1) elements of the first superdiagonal
of U.
The array du2 contains the (n - 2) elements of the second
superdiagonal of U.
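The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations.
ldb INTEGER. The leading dimension of b; ldb ≥ max(1, n).
ipiv INTEGER. Array, size (n). The ipiv array, as returned by ?gttrf.
Output Parameters
b Overwritten by the solution matrix X.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.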


Specific details for the routine gttrs interface are as follows:
Application Notes
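For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations
(A + E)x = b, where
|E| ≤ c(n)ε P|L||U|
c(n) is a modest linear function of n, and ε is the machine precision.
If x0 is the true solution, the computed solution x satisfies this error bound:
||x - x0||∞ / ||x||∞ ≤ c(n) cond(A,x) ε
where cond(A,x) = || |A^-1||A| |x| ||∞ / ||x||∞ ≤ ||A^-1||∞ ||A||∞ = κ∞(A).
The approximate number of floating-point operations for one right-hand side vector b is 7n (including n
divisions) for real flavors and 34n (including 2n divisions) for complex flavors.
To estimate the condition number κ∞(A), call ?gtcon.
To refine the solution and estimate the error, call ?gtrfs.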
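A tridiagonal solve with ?gttrf followed by ?gttrs might look like the sketch below. It is an illustration under stated assumptions, not a documented example: it assumes linking against Intel MKL (or any LAPACK providing dgttrf and dgttrs), and the 5-by-5 matrix, the known solution x_true, and the name err are chosen for the demonstration. The right-hand side is built from x_true before the factorization, because ?gttrf overwrites dl, d, and du.

```fortran
program gttrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 5, nrhs = 1, ldb = n
  real(dp) :: dl(n-1), d(n), du(n-1), du2(n-2), b(ldb,nrhs), x_true(n), err
  integer :: ipiv(n), info, i
  external :: dgttrf, dgttrs

  ! Illustrative tridiagonal matrix: 3 on the diagonal, -1 beside it.
  dl = -1.0_dp
  d  =  3.0_dp
  du = -1.0_dp
  x_true = [(real(i, dp), i = 1, n)]

  ! Form b = A*x_true directly from the three diagonals.
  do i = 1, n
    b(i,1) = d(i) * x_true(i)
    if (i > 1) b(i,1) = b(i,1) + dl(i-1) * x_true(i-1)
    if (i < n) b(i,1) = b(i,1) + du(i) * x_true(i+1)
  end do

  ! LU factorization; dl, d, du are overwritten and du2, ipiv are filled in.
  call dgttrf(n, dl, d, du, du2, ipiv, info)
  if (info /= 0) error stop 1

  ! Solve A*x = b; b is overwritten by the solution.
  call dgttrs('N', n, nrhs, dl, d, du, du2, ipiv, b, ldb, info)
  if (info /= 0) error stop 2

  err = maxval(abs(b(:,1) - x_true))
  print '(a,es10.2)', 'max |x - x_true| = ', err
  if (err > 1.0e-12_dp) error stop 3
end program gttrs_sketch
```

All five arrays returned by ?gttrf (dl, d, du, du2, ipiv) must be passed unchanged to ?gttrs.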
See Also
?dttrsb
Solves a system of linear equations with a diagonally
dominant tridiagonal coefficient matrix using the LU
factorization computed by ?dttrfb.
Syntax
call sdttrsb( trans, n, nrhs, dl, d, du, b, ldb, info )
call ddttrsb( trans, n, nrhs, dl, d, du, b, ldb, info )
call cdttrsb( trans, n, nrhs, dl, d, du, b, ldb, info )
call zdttrsb( trans, n, nrhs, dl, d, du, b, ldb, info )
call dttrsb( dl, d, du, b [, trans] [, info] )
Include Files
mkl.fi, lapack.f90
Description
The ?dttrsb routine solves the following systems of linear equations with multiple right-hand sides for X:
A*X = B if trans='N',
A^T*X = B if trans='T',
A^H*X = B if trans='C' (for complex matrices only).
Before calling this routine, call ?dttrfb to compute the factorization of A.
Input Parameters

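trans CHARACTER*1. Must be 'N' or 'T' or 'C'.
Indicates the form of the equations solved for X:
If trans = 'N', then A*X = B.
If trans = 'T', then A^T*X = B.
If trans = 'C', then A^H*X = B.
n INTEGER. The order of A; n ≥ 0.
nrhs INTEGER. The number of right-hand sides, that is, the number of
columns in B; nrhs ≥ 0.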
dl, d, du, b REAL for sdttrsb

DOUBLE PRECISION for ddttrsb
COMPLEX for cdttrsb
DOUBLE COMPLEX for zdttrsb.
Arrays: dl(n -1), d(n), du(n -1), b(ldb,nrhs).
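The array dl contains the (n - 1) multipliers that define the
matrices L1, L2 from the factorization of A.
The array d contains the n diagonal elements of the upper triangular
matrix U from the factorization of A.
The array du contains the (n - 1) elements of the superdiagonal of U.
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations.
ldb INTEGER. The leading dimension of b; ldb ≥ max(1, n).
Output Parameters
b Overwritten by the solution matrix X.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.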

?potrs
Solves a system of linear equations with a Cholesky-
factored symmetric (Hermitian) positive-definite
coefficient matrix.
Syntax
call spotrs( uplo, n, nrhs, a, lda, b, ldb, info )
call dpotrs( uplo, n, nrhs, a, lda, b, ldb, info )
call cpotrs( uplo, n, nrhs, a, lda, b, ldb, info )
call zpotrs( uplo, n, nrhs, a, lda, b, ldb, info )
call potrs( a, b [,uplo] [, info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B with a symmetric positive-definite or, for
complex data, Hermitian positive-definite matrix A, given the Cholesky factorization of A:
A = U^T*U for real data, A = U^H*U for complex data if uplo='U'
A = L*L^T for real data, A = L*L^H for complex data if uplo='L'
where L is a lower triangular matrix and U is upper triangular. The system is solved with multiple right-hand
sides stored in the columns of the matrix B.
Before calling this routine, you must call ?potrf to compute the Cholesky factorization of A.
Input Parameters

uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates how the input matrix A has been factored:
If uplo = 'U', U is stored, where A = U^T*U for real data, A = U^H*U
for complex data.
If uplo = 'L', L is stored, where A = L*L^T for real data, A = L*L^H for
complex data.
n INTEGER. The order of matrix A; n ≥ 0.
nrhs INTEGER. The number of right-hand sides (nrhs ≥ 0).
a, b REAL for spotrs

DOUBLE PRECISION for dpotrs
COMPLEX for cpotrs
DOUBLE COMPLEX for zpotrs.
Arrays: a(lda,*), b(ldb,*).
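The array a contains the factor U or L (see uplo) as returned by ?potrf.
The second dimension of a must be at least max(1,n).
The array b contains the matrix B whose columns are the right-hand
sides for the systems of equations. The second dimension of b must
be at least max(1,nrhs).
lda INTEGER. The leading dimension of a; lda ≥ max(1, n).
ldb INTEGER. The leading dimension of b; ldb ≥ max(1, n).
Output Parameters
b Overwritten by the solution matrix X.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.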


Specific details for the routine potrs interface are as follows:
Application Notes
If uplo = 'U', the computed solution for each right-hand side b is the exact solution of a perturbed system
of equations (A + E)x = b, where
|E| ≤ c(n)ε |U^H||U|
c(n) is a modest linear function of n, and ε is the machine precision.
A similar estimate holds for uplo = 'L'. If x0 is the true solution, the computed solution x satisfies this
error bound:
||x - x0||∞ / ||x||∞ ≤ c(n) cond(A,x) ε
where cond(A,x) = || |A^-1||A| |x| ||∞ / ||x||∞ ≤ ||A^-1||∞ ||A||∞ = κ∞(A).
Note that cond(A,x) can be much smaller than κ∞(A). The approximate number of floating-point operations
for one right-hand side vector b is 2n^2 for real flavors and 8n^2 for complex flavors.
To estimate the condition number κ∞(A), call ?pocon.
To refine the solution and estimate the error, call ?porfs.
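The Cholesky factor-then-solve sequence can be sketched as below. This is a hedged illustration, not an example from the reference: it assumes linking against Intel MKL (or any LAPACK providing dpotrf and dpotrs), and the diagonally dominant 3-by-3 matrix, the right-hand side, and the names a0, b0, and resid are assumptions for the demonstration. The same uplo value must be passed to both routines.

```fortran
program potrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1, lda = n, ldb = n
  real(dp) :: a(lda,n), a0(lda,n), b(ldb,nrhs), b0(ldb,nrhs), resid
  integer :: info
  external :: dpotrf, dpotrs

  ! Illustrative symmetric positive-definite matrix (diagonally dominant
  ! with a positive diagonal), stored in full for the residual check.
  a0 = reshape([ 4.0_dp, 1.0_dp, 0.0_dp, &
                 1.0_dp, 3.0_dp, 1.0_dp, &
                 0.0_dp, 1.0_dp, 2.0_dp ], [lda, n])
  b0(:,1) = [ 1.0_dp, 2.0_dp, 3.0_dp ]
  a = a0
  b = b0

  ! Cholesky factorization A = L*L^T; the lower triangle of a receives L.
  call dpotrf('L', n, a, lda, info)
  if (info /= 0) error stop 1

  ! Solve A*X = B with the stored factor; b is overwritten by X.
  call dpotrs('L', n, nrhs, a, lda, b, ldb, info)
  if (info /= 0) error stop 2

  resid = maxval(abs(matmul(a0, b) - b0))
  print '(a,es10.2)', 'max |A*X - B| = ', resid
  if (resid > 1.0e-12_dp) error stop 3
end program potrs_sketch
```

A positive info from the factorization step would mean the matrix is not positive-definite, in which case the solve step should be skipped.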
See Also
?pftrs
Solves a system of linear equations with a Cholesky-
factored symmetric (Hermitian) positive-definite
coefficient matrix using the Rectangular Full Packed
(RFP) format.
Syntax
call spftrs( transr, uplo, n, nrhs, a, b, ldb, info )
call dpftrs( transr, uplo, n, nrhs, a, b, ldb, info )
call cpftrs( transr, uplo, n, nrhs, a, b, ldb, info )
call zpftrs( transr, uplo, n, nrhs, a, b, ldb, info )
Include Files
mkl.fi, lapack.f90
Description
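The routine solves a system of linear equations A*X = B with a symmetric positive-definite or, for complex
data, Hermitian positive-definite matrix A using the Cholesky factorization of A:
A = U^T*U for real data, A = U^H*U for complex data if uplo='U'
A = L*L^T for real data, A = L*L^H for complex data if uplo='L'
Before calling ?pftrs, you must call ?pftrf to compute the Cholesky factorization of A. L stands for a lower
triangular matrix and U for an upper triangular matrix.
The matrix A is in the Rectangular Full Packed (RFP) format. For the description of the RFP format, see Matrix
Storage Schemes.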
Input Parameters
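transr CHARACTER*1. Must be 'N', 'T' (for real data) or 'C' (for complex
data).
If transr = 'N', the untransposed factor of A is stored in RFP format.
If transr = 'T', the transposed factor of A is stored in RFP format.
If transr = 'C', the conjugate-transposed factor of A is stored in RFP
format.
uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates how the input matrix A has been factored:
If uplo = 'U', U is stored, where A = U^T*U for real data, A = U^H*U
for complex data.
If uplo = 'L', L is stored, where A = L*L^T for real data, A = L*L^H for
complex data.
n INTEGER. The order of the matrix A; n ≥ 0.
nrhs INTEGER. The number of right-hand sides, that is, the number of
columns of the matrix B; nrhs ≥ 0.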
a, b REAL for spftrs

DOUBLE PRECISION for dpftrs
COMPLEX for cpftrs
DOUBLE COMPLEX for zpftrs.
Arrays: a(n*(n + 1)/2), b(ldb,nrhs).
The array a contains, in the RFP format, the factor U or L obtained by
factorization of matrix A.
Output Parameters
b The solution matrix X.

See Also
?pptrs
Solves a system of linear equations with a packed
Cholesky-factored symmetric (Hermitian) positive-
definite coefficient matrix.
Syntax
call spptrs( uplo, n, nrhs, ap, b, ldb, info )
call dpptrs( uplo, n, nrhs, ap, b, ldb, info )
call cpptrs( uplo, n, nrhs, ap, b, ldb, info )
call zpptrs( uplo, n, nrhs, ap, b, ldb, info )
call pptrs( ap, b [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B with a packed symmetric positive-definite or,
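for complex data, Hermitian positive-definite matrix A, given the Cholesky factorization of A:
A = U^T*U for real data, A = U^H*U for complex data if uplo = 'U'
A = L*L^T for real data, A = L*L^H for complex data if uplo = 'L'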
Before calling this routine, you must call ?pptrf to compute the Cholesky factorization of A.
Input Parameters

for complex data.
complex data
nrhs INTEGER. The number of right-hand sides (nrhs ≥ 0).
ap, b REAL for spptrs

DOUBLE PRECISION for dpptrs
COMPLEX for cpptrs
DOUBLE COMPLEX for zpptrs.
Arrays: ap(*), b(ldb,*)
The size of ap must be at least max(1,n(n+1)/2).

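The array ap contains the factor U or L, as specified by uplo, in
packed storage (see Matrix Storage Schemes).
The array b contains the matrix B whose columns are the right-hand
sides for the system of equations. The second dimension of b must
be at least max(1, nrhs).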
Output Parameters


Specific details for the routine pptrs interface are as follows:
Application Notes
If uplo = 'U', the computed solution for each right-hand side b is the exact solution of a perturbed system
of equations (A + E)x = b, where
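|E| ≤ c(n)ε |U^H||U|,
c(n) is a modest linear function of n, and ε is the machine precision.
A similar estimate holds for uplo = 'L'. If x0 is the true solution, the computed solution x satisfies this
error bound:
||x - x0||∞ / ||x||∞ ≤ c(n) cond(A,x) ε,
where cond(A,x) = || |A^-1||A||x| ||∞ / ||x||∞ ≤ ||A^-1||∞ ||A||∞ = κ∞(A).
Note that cond(A,x) can be much smaller than κ∞(A).
The approximate number of floating-point operations for one right-hand side vector b is 2n^2 for real flavors
and 8n^2 for complex flavors.
To estimate the condition number κ∞(A), call ?ppcon.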
To refine the solution and estimate the error, call ?pprfs.
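As an illustration of the packed-storage variant, the sketch below factors and solves the same kind of small system with dpptrf and dpptrs. The matrix values and the program name are assumptions made for the example; for uplo = 'U' the array ap holds the upper triangle column by column. The program assumes linkage against Intel MKL or another LAPACK implementation.

```fortran
program pptrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: ap(n*(n+1)/2), b(n,nrhs)
  integer :: info
  external :: dpptrf, dpptrs

  ! Upper triangle of the illustrative matrix
  !   [4 2 0; 2 5 2; 0 2 5]
  ! packed by columns: a11, a12, a22, a13, a23, a33.
  ap = [4.0_dp, 2.0_dp, 5.0_dp, 0.0_dp, 2.0_dp, 5.0_dp]
  ! Right-hand side chosen so that the exact solution is (1, 2, 3).
  b(:,1) = [8.0_dp, 18.0_dp, 19.0_dp]

  ! Packed Cholesky factorization; ap is overwritten by the factor U.
  call dpptrf('U', n, ap, info)
  if (info /= 0) error stop 'dpptrf failed'

  ! Solve A*X = B; b is overwritten by X.
  call dpptrs('U', n, nrhs, ap, b, n, info)
  if (info /= 0) error stop 'dpptrs failed'

  print '(3f12.8)', b(:,1)
end program pptrs_sketch
```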
See Also
?pbtrs
Solves a system of linear equations with a Cholesky-factored
symmetric (Hermitian) positive-definite band
coefficient matrix.
Syntax
call spbtrs( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call dpbtrs( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call cpbtrs( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call zpbtrs( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call pbtrs( ab, b [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for real data a system of linear equations A*X = B with a symmetric positive-definite or,
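for complex data, Hermitian positive-definite band matrix A, given the Cholesky factorization of A:
A = U^T*U for real data, A = U^H*U for complex data if uplo = 'U'
A = L*L^T for real data, A = L*L^H for complex data if uplo = 'L'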
Before calling this routine, you must call ?pbtrf to compute the Cholesky factorization of A in the band
storage form.
Input Parameters

If uplo = 'U', U is stored in ab, where A = U^T*U for real matrices
and A = U^H*U for complex matrices.
If uplo = 'L', L is stored in ab, where A = L*L^T for real matrices and
A = L*L^H for complex matrices.
kd INTEGER. The number of superdiagonals or subdiagonals in the matrix
A; kd ≥ 0.
ab, b REAL for spbtrs

DOUBLE PRECISION for dpbtrs
COMPLEX for cpbtrs
DOUBLE COMPLEX for zpbtrs.
Arrays: ab(ldab,*), b(ldb,*).
The array ab contains the Cholesky factor, as returned by the
factorization routine, in band storage form.
The second dimension of ab must be at least max(1, n), and the
second dimension of b at least max(1,nrhs).
ldab INTEGER. The leading dimension of the array ab; ldab ≥ kd + 1.
Output Parameters


Specific details for the routine pbtrs interface are as follows:
Application Notes
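For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations
(A + E)x = b, where
|E| ≤ c(kd + 1)ε P|U^H||U| or |E| ≤ c(kd + 1)ε P|L^H||L|,
c(k) is a modest linear function of k, and ε is the machine precision.
The approximate number of floating-point operations for one right-hand side vector is 4n*kd for real flavors
and 16n*kd for complex flavors.
To estimate the condition number κ∞(A), call ?pbcon.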
To refine the solution and estimate the error, call ?pbrfs.
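The band-storage layout expected by ?pbtrf and ?pbtrs can be sketched for a small case. In this hedged example (test data and program name are invented, and linkage against MKL or another LAPACK is assumed), a symmetric tridiagonal matrix (kd = 1) is stored with uplo = 'U', so row kd+1 of ab holds the diagonal and row 1 holds the superdiagonal.

```fortran
program pbtrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, kd = 1, nrhs = 1, ldab = kd + 1
  real(dp) :: ab(ldab,n), b(n,nrhs)
  integer :: info
  external :: dpbtrf, dpbtrs

  ! Illustrative matrix [4 2 0; 2 5 2; 0 2 5] in upper band storage:
  ! ab(kd+1+i-j, j) = a(i,j) for max(1, j-kd) <= i <= j.
  ! ab(1,1) is not referenced.
  ab = reshape([0.0_dp, 4.0_dp, &
                2.0_dp, 5.0_dp, &
                2.0_dp, 5.0_dp], [ldab, n])
  ! Right-hand side chosen so that the exact solution is (1, 2, 3).
  b(:,1) = [8.0_dp, 18.0_dp, 19.0_dp]

  ! Band Cholesky factorization; ab is overwritten by the factor.
  call dpbtrf('U', n, kd, ab, ldab, info)
  if (info /= 0) error stop 'dpbtrf failed'

  ! Solve A*X = B; b is overwritten by X.
  call dpbtrs('U', n, kd, nrhs, ab, ldab, b, n, info)
  if (info /= 0) error stop 'dpbtrs failed'

  print '(3f12.8)', b(:,1)
end program pbtrs_sketch
```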
See Also
?pttrs
Solves a system of linear equations with a symmetric
(Hermitian) positive-definite tridiagonal coefficient
matrix using the factorization computed by ?pttrf.
Syntax
call spttrs( n, nrhs, d, e, b, ldb, info )
call dpttrs( n, nrhs, d, e, b, ldb, info )
call cpttrs( uplo, n, nrhs, d, e, b, ldb, info )
call zpttrs( uplo, n, nrhs, d, e, b, ldb, info )
call pttrs( d, e, b [,info] )
call pttrs( d, e, b [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X a system of linear equations A*X = B with a symmetric (Hermitian) positive-definite
tridiagonal matrix A. Before calling this routine, call ?pttrf to compute the L*D*L^T or U^T*D*U factorization
of A for real data and the L*D*L^H or U^H*D*U factorization of A for complex data.
Input Parameters
uplo CHARACTER*1. Used for cpttrs/zpttrs only. Must be 'U' or 'L'.

Specifies whether the superdiagonal or the subdiagonal of the
tridiagonal matrix A is stored and how A is factored:
If uplo = 'U', the array e stores the conjugated values of the
superdiagonal of U, and A is factored as U^H*D*U.
If uplo = 'L', the array e stores the subdiagonal of L, and A is
factored as L*D*L^H.
d REAL for spttrs, cpttrs

DOUBLE PRECISION for dpttrs, zpttrs.
Array, dimension (n). Contains the diagonal elements of the diagonal
matrix D from the factorization computed by ?pttrf.
e, b REAL for spttrs

DOUBLE PRECISION for dpttrs
COMPLEX for cpttrs

DOUBLE COMPLEX for zpttrs.
Arrays: e(n -1), b(ldb, nrhs).
The array e contains the (n - 1) sub-diagonal elements of the unit
bidiagonal factor L or the conjugated values of the superdiagonal of U
from the factorization computed by ?pttrf (see uplo).
Output Parameters


Specific details for the routine pttrs interface are as follows:
uplo Used in complex flavors only. Must be 'U' or 'L'. The default value is
'U'.
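A minimal sketch of the ?pttrf / ?pttrs sequence for real data follows. The tridiagonal test matrix and the program name are assumptions chosen for illustration, and the program assumes linkage against Intel MKL or another LAPACK implementation. The real flavors take no uplo argument.

```fortran
program pttrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: d(n), e(n-1), b(n,nrhs)
  integer :: info
  external :: dpttrf, dpttrs

  ! Illustrative SPD tridiagonal matrix [4 2 0; 2 5 2; 0 2 5]:
  ! d holds the diagonal, e the off-diagonal.
  d = [4.0_dp, 5.0_dp, 5.0_dp]
  e = [2.0_dp, 2.0_dp]
  ! Right-hand side chosen so that the exact solution is (1, 2, 3).
  b(:,1) = [8.0_dp, 18.0_dp, 19.0_dp]

  ! L*D*L^T factorization; d and e are overwritten by D and the
  ! subdiagonal of the unit bidiagonal factor L.
  call dpttrf(n, d, e, info)
  if (info /= 0) error stop 'dpttrf failed'

  ! Solve A*X = B; b is overwritten by X.
  call dpttrs(n, nrhs, d, e, b, n, info)
  if (info /= 0) error stop 'dpttrs failed'

  print '(3f12.8)', b(:,1)
end program pttrs_sketch
```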
See Also
?sytrs
Solves a system of linear equations with a UDU^T- or
LDL^T-factored symmetric coefficient matrix.
Syntax
call ssytrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call dsytrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call csytrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call zsytrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call sytrs( a, b, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B with a symmetric matrix A, given the
Bunch-Kaufman factorization of A:
if uplo = 'U', A = U*D*U^T
if uplo = 'L', A = L*D*L^T,
where U and L are upper and lower triangular matrices with unit diagonal and D is a symmetric
block-diagonal matrix. The system is solved with multiple right-hand sides stored in the columns of the matrix B.
You must supply to this routine the factor U (or L) and the array ipiv returned by the factorization routine
?sytrf.
Input Parameters

If uplo = 'U', the array a stores the upper triangular factor U of the
factorization A = U*D*U^T.
If uplo = 'L', the array a stores the lower triangular factor L of the
factorization A = L*D*L^T.
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned
by ?sytrf.
a, b REAL for ssytrs

DOUBLE PRECISION for dsytrs
COMPLEX for csytrs
DOUBLE COMPLEX for zsytrs.
The array a contains the factor U or L (see uplo). The second
dimension of a must be at least max(1,n).
The array b contains the matrix B whose columns are the right-hand
sides for the system of equations. The second dimension of b must be
at least max(1,nrhs).
Output Parameters

Specific details for the routine sytrs interface are as follows:
Application Notes
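For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations
(A + E)x = b, where
|E| ≤ c(n)ε P|U||D||U^T|P^T or |E| ≤ c(n)ε P|L||D||L^T|P^T,
c(n) is a modest linear function of n, and ε is the machine precision.
The total number of floating-point operations for one right-hand side vector is approximately 2n^2 for real
flavors or 8n^2 for complex flavors.
To estimate the condition number κ∞(A), call ?sycon.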
To refine the solution and estimate the error, call ?syrfs.
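The sketch below shows one way the Bunch-Kaufman factor and the ipiv array flow from ?sytrf into ?sytrs. The symmetric indefinite test matrix, the workspace size 64*n, and the program name are choices made for this example only; the program assumes linkage against Intel MKL or another LAPACK implementation.

```fortran
program sytrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1, lwork = 64*n
  real(dp) :: a(n,n), b(n,nrhs), work(lwork)
  integer :: ipiv(n), info
  external :: dsytrf, dsytrs

  ! Illustrative symmetric indefinite matrix (column-major).
  a = reshape([1.0_dp,  2.0_dp, 0.0_dp, &
               2.0_dp, -1.0_dp, 1.0_dp, &
               0.0_dp,  1.0_dp, 3.0_dp], [n, n])
  ! Right-hand side chosen so that the exact solution is (1, 2, 3).
  b(:,1) = [5.0_dp, 3.0_dp, 11.0_dp]

  ! Bunch-Kaufman factorization A = U*D*U^T; a and ipiv are overwritten.
  call dsytrf('U', n, a, n, ipiv, work, lwork, info)
  if (info /= 0) error stop 'dsytrf failed'

  ! Solve A*X = B with the factor and pivots; b is overwritten by X.
  call dsytrs('U', n, nrhs, a, n, ipiv, b, n, info)
  if (info /= 0) error stop 'dsytrs failed'

  print '(3f12.8)', b(:,1)
end program sytrs_sketch
```

The value of uplo passed to dsytrs must match the one used in dsytrf, since it tells the solver which factor the array a holds.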
See Also
?sytrs_rook
Solves a system of linear equations with a UDU- or
LDL-factored symmetric coefficient matrix.
Syntax
call ssytrs_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call dsytrs_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call csytrs_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call zsytrs_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call sytrs_rook( a, b, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves a system of linear equations A*X = B with a symmetric matrix A, using the factorization A
= U*D*U^T or A = L*D*L^T computed by ?sytrf_rook.
Input Parameters

If uplo = 'U', the factorization is of the form A = U*D*U^T.
If uplo = 'L', the factorization is of the form A = L*D*L^T.
ipiv INTEGER.
The ipiv array, as returned by ?sytrf_rook.
a, b REAL for ssytrs_rook

DOUBLE PRECISION for dsytrs_rook
COMPLEX for csytrs_rook
DOUBLE COMPLEX for zsytrs_rook.
Arrays: a(lda,n), b(ldb,nrhs).
The array a contains the block diagonal matrix D and the multipliers
used to obtain U or L as computed by ?sytrf_rook (see uplo).
The array b contains the matrix B whose columns are the right-hand
sides for the system of equations.
Output Parameters


Specific details for the routine sytrs_rook interface are as follows:
Application Notes
See Also
?hetrs
Solves a system of linear equations with a UDU^H- or
LDL^H-factored Hermitian coefficient matrix.
Syntax
call chetrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call zhetrs( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call hetrs( a, b, ipiv [, uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B with a Hermitian matrix A, given the
Bunch-Kaufman factorization of A:
if uplo = 'U', A = U*D*U^H
if uplo = 'L', A = L*D*L^H,
where U and L are upper and lower triangular matrices with unit diagonal and D is a Hermitian
block-diagonal matrix. The system is solved with multiple right-hand sides stored in the columns of the matrix B.
You must supply to this routine the factor U (or L) and the array ipiv returned by the factorization routine
?hetrf.
Input Parameters

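If uplo = 'U', the array a stores the upper triangular factor U of the
factorization A = U*D*U^H.
If uplo = 'L', the array a stores the lower triangular factor L of the
factorization A = L*D*L^H.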
ipiv INTEGER.
Array, size at least max(1, n).
The ipiv array, as returned by ?hetrf.
a, b COMPLEX for chetrs

DOUBLE COMPLEX for zhetrs.
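The array a contains the factor U or L (see uplo).
The array b contains the matrix B whose columns are the right-hand
sides for the system of equations.
The second dimension of a must be at least max(1,n), the second
dimension of b at least max(1,nrhs).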
Output Parameters


Specific details for the routine hetrs interface are as follows:
Application Notes
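For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations
(A + E)x = b, where
|E| ≤ c(n)ε P|U||D||U^H|P^T or |E| ≤ c(n)ε P|L||D||L^H|P^T,
c(n) is a modest linear function of n, and ε is the machine precision.
The total number of floating-point operations for one right-hand side vector is approximately 8n^2.
To estimate the condition number κ∞(A), call ?hecon.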
To refine the solution and estimate the error, call ?herfs.
See Also
?hetrs_rook
Solves a system of linear equations with a UDU- or
LDL-factored Hermitian coefficient matrix.
Syntax
call chetrs_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call zhetrs_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, info )
call hetrs_rook( a, b, ipiv [, uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves a system of linear equations A*X = B with a complex Hermitian matrix A using the
factorization A = U*D*U^H or A = L*D*L^H computed by ?hetrf_rook.
Input Parameters

If uplo = 'U', the factorization is of the form A = U*D*U^H.
If uplo = 'L', the factorization is of the form A = L*D*L^H.
ipiv INTEGER.
The ipiv array, as returned by ?hetrf_rook.
a, b COMPLEX for chetrs_rook

DOUBLE COMPLEX for zhetrs_rook.
Arrays: a(lda,n), b(ldb,nrhs).
The array a contains the block diagonal matrix D and the multipliers
used to obtain the factor U or L as computed by ?hetrf_rook (see
uplo).
Output Parameters


Specific details for the routine hetrs_rook interface are as follows:
?sytrs2
Solves a system of linear equations with a UDU- or
LDL-factored symmetric coefficient matrix.
Syntax
call ssytrs2( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, info )
call dsytrs2( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, info )
call csytrs2( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, info )
call zsytrs2( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, info )
call sytrs2( a,b,ipiv[,uplo][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves a system of linear equations A*X = B with a symmetric matrix A using the factorization of
A:
if uplo='U', A = U*D*U^T
if uplo='L', A = L*D*L^T
where
U and L are upper and lower triangular matrices with unit diagonal
D is a symmetric block-diagonal matrix.
The factorization is computed by ?sytrf.
Input Parameters

a, b REAL for ssytrs2

DOUBLE PRECISION for dsytrs2
COMPLEX for csytrs2
DOUBLE COMPLEX for zsytrs2
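The array a contains the block diagonal matrix D and the multipliers
used to obtain the factor U or L as computed by ?sytrf.
The array b contains the right-hand side matrix B.
The second dimension of a must be at least max(1,n), and the
second dimension of b at least max(1,nrhs).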
ipiv INTEGER. Array of size n. The ipiv array contains details of the
interchanges and the block structure of D as determined by ?sytrf.
work REAL for ssytrs2

DOUBLE PRECISION for dsytrs2
COMPLEX for csytrs2
DOUBLE COMPLEX for zsytrs2
Workspace array, size n.
Output Parameters


Specific details for the routine sytrs2 interface are as follows:
uplo Indicates how the input matrix A has been factored. Must be 'U' or
'L'.
See Also
?sytrf
?hetrs2
Solves a system of linear equations with a UDU- or
LDL-factored Hermitian coefficient matrix.
Syntax
call chetrs2( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, info )
call zhetrs2( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, info )
call hetrs2( a, b, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves a system of linear equations A*X = B with a complex Hermitian matrix A using the
factorization of A:
if uplo='U', A = U*D*U^H
if uplo='L', A = L*D*L^H
where
U and L are upper and lower triangular matrices with unit diagonal
D is a Hermitian block-diagonal matrix.
The factorization is computed by ?hetrf.
Input Parameters

a, b COMPLEX for chetrs2

DOUBLE COMPLEX for zhetrs2
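The array a contains the block diagonal matrix D and the multipliers
used to obtain the factor U or L as computed by ?hetrf.
The array b contains the right-hand side matrix B.
The second dimension of a must be at least max(1,n), and the
second dimension of b at least max(1,nrhs).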
ipiv INTEGER. Array of size n. The ipiv array contains details of the
interchanges and the block structure of D as determined by ?hetrf.
work COMPLEX for chetrs2

DOUBLE COMPLEX for zhetrs2
Workspace array, size n.
Output Parameters


Specific details for the routine hetrs2 interface are as follows:
See Also
?hetrf
?sptrs
Solves a system of linear equations with a UDU- or
LDL-factored symmetric coefficient matrix using
packed storage.
Syntax
call ssptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call dsptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call csptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call zsptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call sptrs( ap, b, ipiv [, uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B with a symmetric matrix A, given the
Bunch-Kaufman factorization of A:
if uplo = 'U', A = U*D*U^T
if uplo = 'L', A = L*D*L^T,
where U and L are upper and lower packed triangular matrices with unit diagonal and D is a symmetric
block-diagonal matrix. The system is solved with multiple right-hand sides stored in the columns of the
matrix B. You must supply the factor U (or L) and the array ipiv returned by the factorization routine ?sptrf.
Input Parameters

If uplo = 'U', the array ap stores the packed factor U of the
factorization A = U*D*U^T. If uplo = 'L', the array ap stores the
packed factor L of the factorization A = L*D*L^T.
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?sptrf.
ap REAL for ssptrs

DOUBLE PRECISION for dsptrs
COMPLEX for csptrs
DOUBLE COMPLEX for zsptrs.
The dimension of array ap must be at least max(1, n(n+1)/2). The
array ap contains the factor U or L, as specified by uplo, in packed
storage (see Matrix Storage Schemes).
b REAL for ssptrs

DOUBLE PRECISION for dsptrs
COMPLEX for csptrs
DOUBLE COMPLEX for zsptrs.
The array b(ldb,*) contains the matrix B whose columns are the
right-hand sides for the system of equations. The second dimension of
b must be at least max(1, nrhs).
Output Parameters


Specific details for the routine sptrs interface are as follows:
Application Notes
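For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations
(A + E)x = b, where
|E| ≤ c(n)ε P|U||D||U^T|P^T or |E| ≤ c(n)ε P|L||D||L^T|P^T,
c(n) is a modest linear function of n, and ε is the machine precision.
To estimate the condition number κ∞(A), call ?spcon.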
To refine the solution and estimate the error, call ?sprfs.
See Also
?hptrs
Solves a system of linear equations with a UDU- or
LDL-factored Hermitian coefficient matrix using
packed storage.
Syntax
call chptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call zhptrs( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call hptrs( ap, b, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B with a Hermitian matrix A, given the
Bunch-Kaufman factorization of A:
if uplo = 'U', A = U*D*U^H
if uplo = 'L', A = L*D*L^H,
where U and L are upper and lower packed triangular matrices with unit diagonal and D is a Hermitian
block-diagonal matrix. The system is solved with multiple right-hand sides stored in the columns of the
matrix B.
You must supply to this routine the arrays ap (containing U or L) and ipiv in the form returned by the
factorization routine ?hptrf.
Input Parameters

If uplo = 'U', the array ap stores the packed factor U of the
factorization A = U*D*U^H. If uplo = 'L', the array ap stores the
packed factor L of the factorization A = L*D*L^H.
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned
by ?hptrf.
ap COMPLEX for chptrs

DOUBLE COMPLEX for zhptrs.
The dimension of array ap(*) must be at least max(1,n(n+1)/2). The
array ap contains the factor U or L, as specified by uplo, in packed
storage (see Matrix Storage Schemes).
b COMPLEX for chptrs

DOUBLE COMPLEX for zhptrs.
The array b(ldb,*) contains the matrix B whose columns are the
right-hand sides for the system of equations. The second dimension of
b must be at least max(1, nrhs).
Output Parameters


Specific details for the routine hptrs interface are as follows:
Application Notes
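For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations
(A + E)x = b, where
|E| ≤ c(n)ε P|U||D||U^H|P^T or |E| ≤ c(n)ε P|L||D||L^H|P^T,
c(n) is a modest linear function of n, and ε is the machine precision.
The total number of floating-point operations for one right-hand side vector is approximately 8n^2 for complex
flavors.
To estimate the condition number κ∞(A), call ?hpcon.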
To refine the solution and estimate the error, call ?hprfs.
See Also
?trtrs
Solves a system of linear equations with a triangular
coefficient matrix, with multiple right-hand sides.
Syntax
call strtrs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, info )
call dtrtrs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, info )
call ctrtrs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, info )
call ztrtrs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, info )
call trtrs( a, b [,uplo] [, trans] [,diag] [,info] )
Include Files
mkl.fi, lapack.f90
Description
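The routine solves for X the following systems of linear equations with a triangular matrix A, with multiple
right-hand sides stored in B:
A*X = B if trans = 'N',
A^T*X = B if trans = 'T',
A^H*X = B if trans = 'C' (for complex matrices only).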
Input Parameters

uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether A is upper or lower triangular:
If uplo = 'U', then A is upper triangular.
If uplo = 'L', then A is lower triangular.
diag CHARACTER*1. Must be 'N' or 'U'.

If diag = 'N', then A is not a unit triangular matrix.
If diag = 'U', then A is unit triangular: diagonal elements of A are
assumed to be 1 and not referenced in the array a.
a REAL for strtrs

DOUBLE PRECISION for dtrtrs
COMPLEX for ctrtrs
DOUBLE COMPLEX for ztrtrs.
The array a(lda,*) contains the matrix A.
b REAL for strtrs

DOUBLE PRECISION for dtrtrs
COMPLEX for ctrtrs
DOUBLE COMPLEX for ztrtrs.
The array b(ldb,*) contains the matrix B whose columns are the
right-hand sides for the systems of equations.
The second dimension of b must be at least max(1,nrhs).
Output Parameters


Specific details for the routine trtrs interface are as follows:
a Stands for argument ap in FORTRAN 77 interface. Holds the matrix A

of size (n*(n+1)/2).
Application Notes
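For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations
(A + E)x = b, where
|E| ≤ c(n)ε |A|,
c(n) is a modest linear function of n, and ε is the machine precision. If x0 is the true solution, the computed
solution x satisfies this error bound:
||x - x0||∞ / ||x||∞ ≤ c(n) cond(A,x) ε,
where cond(A,x) = || |A^-1||A||x| ||∞ / ||x||∞ ≤ ||A^-1||∞ ||A||∞ = κ∞(A).
The approximate number of floating-point operations for one right-hand side vector b is n^2 for real flavors
and 4n^2 for complex flavors.
To estimate the condition number κ∞(A), call ?trcon.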
To estimate the error in the solution, call ?trrfs.
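A triangular solve needs no prior factorization, as the following sketch with dtrtrs illustrates. The upper triangular test matrix, right-hand side, and program name are invented for the example, and the program assumes linkage against Intel MKL or another LAPACK implementation.

```fortran
program trtrs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: a(n,n), b(n,nrhs)
  integer :: info
  external :: dtrtrs

  ! Illustrative upper triangular matrix [2 1 0; 0 2 1; 0 0 2],
  ! stored column by column.
  a = reshape([2.0_dp, 0.0_dp, 0.0_dp, &
               1.0_dp, 2.0_dp, 0.0_dp, &
               0.0_dp, 1.0_dp, 2.0_dp], [n, n])
  ! Right-hand side chosen so that the exact solution is (1, 2, 3).
  b(:,1) = [4.0_dp, 7.0_dp, 6.0_dp]

  ! Solve A*X = B (trans = 'N') with a non-unit diagonal (diag = 'N').
  ! A positive info reports a zero diagonal element, i.e. a singular A.
  call dtrtrs('U', 'N', 'N', n, nrhs, a, n, b, n, info)
  if (info /= 0) error stop 'dtrtrs failed'

  print '(3f12.8)', b(:,1)
end program trtrs_sketch
```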
See Also
?tptrs
Solves a system of linear equations with a packed
triangular coefficient matrix, with multiple right-hand
sides.
Syntax
call stptrs( uplo, trans, diag, n, nrhs, ap, b, ldb, info )
call dtptrs( uplo, trans, diag, n, nrhs, ap, b, ldb, info )
call ctptrs( uplo, trans, diag, n, nrhs, ap, b, ldb, info )
call ztptrs( uplo, trans, diag, n, nrhs, ap, b, ldb, info )
call tptrs( ap, b [,uplo] [, trans] [,diag] [,info] )
Include Files
mkl.fi, lapack.f90
Description
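The routine solves for X the following systems of linear equations with a packed triangular matrix A, with
multiple right-hand sides stored in B:
A*X = B if trans = 'N',
A^T*X = B if trans = 'T',
A^H*X = B if trans = 'C' (for complex matrices only).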
Input Parameters



If diag = 'U', then A is unit triangular: diagonal elements are
assumed to be 1 and not referenced in the array ap.
ap REAL for stptrs

DOUBLE PRECISION for dtptrs
COMPLEX for ctptrs
DOUBLE COMPLEX for ztptrs.
The dimension of array ap(*) must be at least max(1,n(n+1)/2). The
array ap contains the matrix A in packed storage (see Matrix Storage
Schemes).
b REAL for stptrs

DOUBLE PRECISION for dtptrs
COMPLEX for ctptrs
DOUBLE COMPLEX for ztptrs.
The array b(ldb,*) contains the matrix B whose columns are the
right-hand sides for the system of equations.
The second dimension of b must be at least max(1, nrhs).
Output Parameters


Specific details for the routine tptrs interface are as follows:
Application Notes
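For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations
(A + E)x = b, where
|E| ≤ c(n)ε |A|,
c(n) is a modest linear function of n, and ε is the machine precision.
The approximate number of floating-point operations for one right-hand side vector b is n^2 for real flavors
and 4n^2 for complex flavors.
To estimate the condition number κ∞(A), call ?tpcon.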
To estimate the error in the solution, call ?tprfs.
See Also
?tbtrs
Solves a system of linear equations with a band
triangular coefficient matrix, with multiple right-hand
sides.
Syntax
call stbtrs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, info )
call dtbtrs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, info )
call ctbtrs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, info )
call ztbtrs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, info )
call tbtrs( ab, b [,uplo] [, trans] [,diag] [,info] )
Include Files
mkl.fi, lapack.f90
Description
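The routine solves for X the following systems of linear equations with a band triangular matrix A, with
multiple right-hand sides stored in B:
A*X = B if trans = 'N',
A^T*X = B if trans = 'T',
A^H*X = B if trans = 'C' (for complex matrices only).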
Input Parameters




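If diag = 'U', then A is unit triangular: diagonal elements are
assumed to be 1 and not referenced in the array ab.
kd INTEGER. The number of superdiagonals or subdiagonals in the matrix
A; kd ≥ 0.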
ab REAL for stbtrs

DOUBLE PRECISION for dtbtrs
COMPLEX for ctbtrs
DOUBLE COMPLEX for ztbtrs.
The array ab(ldab,*) contains the matrix A in band storage form.
The second dimension of ab must be at least max(1, n).
b REAL for stbtrs

DOUBLE PRECISION for dtbtrs
COMPLEX for ctbtrs
DOUBLE COMPLEX for ztbtrs.
The array b(ldb,*) contains the matrix B whose columns are the
right-hand sides for the systems of equations.
The second dimension of b must be at least max(1,nrhs).
ldab INTEGER. The leading dimension of ab; ldab ≥ kd + 1.
Output Parameters


Specific details for the routine tbtrs interface are as follows:
ab Holds the array A of size (kd+1,n)
Application Notes
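For each right-hand side b, the computed solution is the exact solution of a perturbed system of equations
(A + E)x = b, where
|E| ≤ c(n)ε |A|,
c(n) is a modest linear function of n, and ε is the machine precision. If x0 is the true solution, the computed
solution x satisfies this error bound:
||x - x0||∞ / ||x||∞ ≤ c(n) cond(A,x) ε,
where cond(A,x) = || |A^-1||A||x| ||∞ / ||x||∞ ≤ ||A^-1||∞ ||A||∞ = κ∞(A).
The approximate number of floating-point operations for one right-hand side vector b is 2n*kd for real
flavors and 8n*kd for complex flavors.
To estimate the condition number κ∞(A), call ?tbcon.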
To estimate the error in the solution, call ?tbrfs.
See Also
Estimating the Condition Number: LAPACK Computational Routines

This section describes the LAPACK routines for estimating the condition number of a matrix. The condition
number is used for analyzing the errors in the solution of a system of linear equations (see Error Analysis).
Since the condition number may be arbitrarily large when the matrix is nearly singular, the routines actually
compute the reciprocal condition number.
?gecon
Estimates the reciprocal of the condition number of a
general matrix in the 1-norm or the infinity-norm.
Syntax
call sgecon( norm, n, a, lda, anorm, rcond, work, iwork, info )
call dgecon( norm, n, a, lda, anorm, rcond, work, iwork, info )
call cgecon( norm, n, a, lda, anorm, rcond, work, rwork, info )
call zgecon( norm, n, a, lda, anorm, rcond, work, rwork, info )
call gecon( a, anorm, rcond [,norm] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a general matrix A in the 1-norm or infinity-
norm:
κ1(A) = ||A||1 ||A^-1||1 = κ∞(A^T) = κ∞(A^H)
κ∞(A) = ||A||∞ ||A^-1||∞ = κ1(A^T) = κ1(A^H).
An estimate is obtained for ||A^-1||, and the reciprocal of the condition number is computed as rcond =
1 / (||A|| ||A^-1||).
Before calling this routine:
compute anorm (either ||A||1 = max_j Σ_i |a_ij| or ||A||∞ = max_i Σ_j |a_ij|)

call ?getrf to compute the LU factorization of A.
Input Parameters
norm CHARACTER*1. Must be '1' or 'O' or 'I'.

If norm = '1' or 'O', then the routine estimates the condition
number of matrix A in 1-norm.
If norm = 'I', then the routine estimates the condition number of
matrix A in infinity-norm.
a, work REAL for sgecon

DOUBLE PRECISION for dgecon
COMPLEX for cgecon
DOUBLE COMPLEX for zgecon. Arrays: a(lda,*), work(*).
The array a contains the LU-factored matrix A, as returned by
?getrf. The second dimension of a must be at least max(1,n). The
array work is a workspace for the routine.
The dimension of work must be at least max(1, 4*n) for real flavors
and max(1, 2*n) for complex flavors.
anorm REAL for single precision flavors.
DOUBLE PRECISION for double precision flavors.
The norm of the original matrix A (see Description).
iwork INTEGER. Workspace array, size at least max(1, n).
rwork REAL for cgecon

DOUBLE PRECISION for zgecon.
Workspace array, size at least max(1, 2*n).
Output Parameters
rcond REAL for single precision flavors.
DOUBLE PRECISION for double precision flavors.
An estimate of the reciprocal of the condition number. The routine sets
rcond = 0 if the estimate underflows; in this case the matrix is
singular (to working precision). However, anytime rcond is small
compared to 1.0, for the working precision, the matrix may be poorly
conditioned or even singular.


Specific details for the routine gecon interface are as follows:
norm Must be '1', 'O', or 'I'. The default value is '1'.
Application Notes
The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b or A^H*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires
approximately 2*n^2 floating-point operations for real flavors and 8*n^2 for complex flavors.
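The calling sequence for ?gecon (norm first, then LU factorization, then the estimate) can be sketched as below. The test matrix and program name are invented for illustration; the 1-norm is obtained here with the LAPACK auxiliary function dlange, which is one possible way to compute anorm and must be called before dgetrf overwrites a. The program assumes linkage against Intel MKL or another LAPACK implementation.

```fortran
program gecon_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n,n), work(4*n), anorm, rcond
  integer :: ipiv(n), iwork(n), info
  real(dp), external :: dlange
  external :: dgetrf, dgecon

  ! Illustrative general matrix [4 1 0; 2 5 1; 0 2 5], column-major.
  a = reshape([4.0_dp, 2.0_dp, 0.0_dp, &
               1.0_dp, 5.0_dp, 2.0_dp, &
               0.0_dp, 1.0_dp, 5.0_dp], [n, n])

  ! Step 1: 1-norm of the original matrix (largest column sum).
  anorm = dlange('1', n, n, a, n, work)

  ! Step 2: LU factorization; a is overwritten by the factors.
  call dgetrf(n, n, a, n, ipiv, info)
  if (info /= 0) error stop 'dgetrf failed'

  ! Step 3: estimate the reciprocal condition number in the 1-norm.
  call dgecon('1', n, a, n, anorm, rcond, work, iwork, info)
  if (info /= 0) error stop 'dgecon failed'

  print *, 'anorm =', anorm, ' rcond =', rcond
end program gecon_sketch
```

A small rcond relative to 1.0 signals that solutions computed from these factors may be inaccurate.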
See Also
?gbcon
Estimates the reciprocal of the condition number of a
band matrix in the 1-norm or the infinity-norm.
Syntax
call sgbcon( norm, n, kl, ku, ab, ldab, ipiv, anorm, rcond, work, iwork, info )
call dgbcon( norm, n, kl, ku, ab, ldab, ipiv, anorm, rcond, work, iwork, info )
call cgbcon( norm, n, kl, ku, ab, ldab, ipiv, anorm, rcond, work, rwork, info )
call zgbcon( norm, n, kl, ku, ab, ldab, ipiv, anorm, rcond, work, rwork, info )
call gbcon( ab, ipiv, anorm, rcond [,kl] [,norm] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a general band matrix A in the 1-norm or
infinity-norm:
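κ1(A) = ||A||1 ||A^-1||1 = κ∞(A^T) = κ∞(A^H)
κ∞(A) = ||A||∞ ||A^-1||∞ = κ1(A^T) = κ1(A^H).
An estimate is obtained for ||A^-1||, and the reciprocal of the condition number is computed as rcond =
1 / (||A|| ||A^-1||).
Before calling this routine:
compute anorm (either ||A||1 = max_j Σ_i |a_ij| or ||A||∞ = max_i Σ_j |a_ij|)

call ?gbtrf to compute the LU factorization of A.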
Input Parameters

ldab INTEGER. The leading dimension of the array ab. (ldab ≥ 2*kl + ku + 1).
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned
by ?gbtrf.
ab, work REAL for sgbcon

DOUBLE PRECISION for dgbcon
COMPLEX for cgbcon
DOUBLE COMPLEX for zgbcon.
Arrays: ab(ldab,*), work(*).
The array ab contains the factored band matrix A, as returned by
?gbtrf.
The second dimension of ab must be at least max(1,n). The array
work is a workspace for the routine.
The dimension of work must be at least max(1, 3*n) for real flavors
and max(1, 2*n) for complex flavors.
anorm REAL for single precision flavors.
DOUBLE PRECISION for double precision flavors.
The norm of the original matrix A (see Description).
rwork REAL for cgbcon

DOUBLE PRECISION for zgbcon.
Output Parameters

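rcond REAL for single precision flavors.
DOUBLE PRECISION for double precision flavors.
An estimate of the reciprocal of the condition number. The routine sets
rcond = 0 if the estimate underflows; in this case the matrix is singular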
(to working precision). However, anytime rcond is small compared to
1.0, for the working precision, the matrix may be poorly conditioned
or even singular.


Specific details for the routine gbcon interface are as follows:
Application Notes
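The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b or A^H*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires
approximately 2n(ku + 2kl) floating-point operations for real flavors and 8n(ku + 2kl) for complex
flavors.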
See Also
?gtcon
Estimates the reciprocal of the condition number of a
tridiagonal matrix.
Syntax
call sgtcon( norm, n, dl, d, du, du2, ipiv, anorm, rcond, work, iwork, info )
call dgtcon( norm, n, dl, d, du, du2, ipiv, anorm, rcond, work, iwork, info )
call cgtcon( norm, n, dl, d, du, du2, ipiv, anorm, rcond, work, info )
call zgtcon( norm, n, dl, d, du, du2, ipiv, anorm, rcond, work, info )
call gtcon( dl, d, du, du2, ipiv, anorm, rcond [,norm] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a real or complex tridiagonal matrix A in the
1-norm or infinity-norm:
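κ1(A) = ||A||1 ||A^-1||1
κ∞(A) = ||A||∞ ||A^-1||∞
An estimate is obtained for ||A^-1||, and the reciprocal of the condition number is computed as rcond =
1 / (||A|| ||A^-1||).
Before calling this routine:
compute anorm (either ||A||1 = max_j Σ_i |a_ij| or ||A||∞ = max_i Σ_j |a_ij|)

call ?gttrf to compute the LU factorization of A.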
Input Parameters

dl,d,du,du2 REAL for sgtcon

DOUBLE PRECISION for dgtcon
COMPLEX for cgtcon
DOUBLE COMPLEX for zgtcon.
Arrays: dl(n -1), d(n), du(n -1), du2(n -2).
The array dl contains the (n - 1) multipliers that define the matrix L
from the LU factorization of A as computed by ?gttrf.
The array d contains the n diagonal elements of the upper triangular
matrix U from the LU factorization of A.
The array du contains the (n - 1) elements of the first superdiagonal
of U.
The array du2 contains the (n - 2) elements of the second
superdiagonal of U.
ipiv INTEGER.
Array, size (n). The array of pivot indices, as returned by ?gttrf.

anorm REAL for single precision flavors.
DOUBLE PRECISION for double precision flavors.
The norm of the original matrix A (see Description).
work REAL for sgtcon

DOUBLE PRECISION for dgtcon
COMPLEX for cgtcon
DOUBLE COMPLEX for zgtcon.
Workspace array, size (2*n).
iwork INTEGER. Workspace array, size (n). Used for real flavors only.
Output Parameters

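rcond REAL for single precision flavors.
DOUBLE PRECISION for double precision flavors.
An estimate of the reciprocal of the condition number. The routine sets
rcond = 0 if the estimate underflows; in this case the matrix is singular
(to working precision). However, anytime rcond is small compared to
1.0, for the working precision, the matrix may be poorly conditioned
or even singular.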


Specific details for the routine gtcon interface are as follows:
Application Notes
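The computed rcond is never less than r (the reciprocal of the true condition number) and in practice is
nearly always less than 10r. A call to this routine involves solving a number of systems of linear equations
A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires approximately 2n^2
floating-point operations for real flavors and 8n^2 for complex flavors.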
?pocon
Estimates the reciprocal of the condition number of a
symmetric (Hermitian) positive-definite matrix.
Syntax
call spocon( uplo, n, a, lda, anorm, rcond, work, iwork, info )
call dpocon( uplo, n, a, lda, anorm, rcond, work, iwork, info )
call cpocon( uplo, n, a, lda, anorm, rcond, work, rwork, info )
call zpocon( uplo, n, a, lda, anorm, rcond, work, rwork, info )
call pocon( a, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a symmetric (Hermitian) positive-definite
matrix A:
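κ1(A) = ||A||1 ||A^-1||1 (since A is symmetric or Hermitian, κ∞(A) = κ1(A)).
An estimate is obtained for ||A^-1||, and the reciprocal of the condition number is computed as rcond =
1 / (||A|| ||A^-1||).
Before calling this routine:
compute anorm (either ||A||1 = max_j Σ_i |a_ij| or ||A||∞ = max_i Σ_j |a_ij|)

call ?potrf to compute the Cholesky factorization of A.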
Input Parameters
a, work REAL for spocon

DOUBLE PRECISION for dpocon
COMPLEX for cpocon
DOUBLE COMPLEX for zpocon.
Arrays: a(lda,*), work(*).
The array a contains the factored matrix A, as returned by ?potrf.

The array work is a workspace for the routine. The dimension of work
must be at least max(1, 3*n) for real flavors and max(1, 2*n) for
complex flavors.
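anorm REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
The norm of the original matrix A (see Description).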
rwork REAL for cpocon

DOUBLE PRECISION for zpocon.
Workspace array, size at least max(1, n).
Output Parameters
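rcond REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
An estimate of the reciprocal of the condition number. The routine sets
rcond = 0 if the estimate underflows; in this case the matrix is singular
(to working precision). However, anytime rcond is small compared to
1.0, for the working precision, the matrix may be poorly conditioned
or even singular.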


Specific details for the routine pocon interface are as follows:
Application Notes
See Also
?ppcon
Estimates the reciprocal of the condition number of a
packed symmetric (Hermitian) positive-definite
matrix.
Syntax
call sppcon( uplo, n, ap, anorm, rcond, work, iwork, info )
call dppcon( uplo, n, ap, anorm, rcond, work, iwork, info )
call cppcon( uplo, n, ap, anorm, rcond, work, rwork, info )
call zppcon( uplo, n, ap, anorm, rcond, work, rwork, info )
call ppcon( ap, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a packed symmetric (Hermitian) positive-
definite matrix A:
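κ1(A) = ||A||1 ||A^-1||1 (since A is symmetric or Hermitian, κ∞(A) = κ1(A)).
An estimate is obtained for ||A^-1||, and the reciprocal of the condition number is computed as rcond =
1 / (||A|| ||A^-1||).
Before calling this routine:
compute anorm (either ||A||1 = max_j Σ_i |a_ij| or ||A||∞ = max_i Σ_j |a_ij|)

call ?pptrf to compute the Cholesky factorization of A.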
Input Parameters
ap, work REAL for sppcon

DOUBLE PRECISION for dppcon
COMPLEX for cppcon
DOUBLE COMPLEX for zppcon.
Arrays: ap(*), work(*).
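The array ap contains the packed factored matrix A, as returned by
?pptrf. The dimension of ap must be at least max(1,n(n+1)/2).
The array work is a workspace for the routine. The dimension of work
must be at least max(1, 3*n) for real flavors and max(1, 2*n) for
complex flavors.
anorm REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
The norm of the original matrix A (see Description).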

rwork REAL for cppcon

DOUBLE PRECISION for zppcon.
Output Parameters

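rcond REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
An estimate of the reciprocal of the condition number. The routine sets
rcond = 0 if the estimate underflows; in this case the matrix is singular
(to working precision). However, anytime rcond is small compared to
1.0, for the working precision, the matrix may be poorly conditioned
or even singular.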

Specific details for the routine ppcon interface are as follows:
Application Notes
See Also
?pbcon
Estimates the reciprocal of the condition number of a
symmetric (Hermitian) positive-definite band matrix.
Syntax
call spbcon( uplo, n, kd, ab, ldab, anorm, rcond, work, iwork, info )
call dpbcon( uplo, n, kd, ab, ldab, anorm, rcond, work, iwork, info )
call cpbcon( uplo, n, kd, ab, ldab, anorm, rcond, work, rwork, info )
call zpbcon( uplo, n, kd, ab, ldab, anorm, rcond, work, rwork, info )
call pbcon( ab, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a symmetric (Hermitian) positive-definite
band matrix A:
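An estimate is obtained for $\|A^{-1}\|$, and the reciprocal of the condition number is computed as
rcond = $1 / (\|A\| \|A^{-1}\|)$.

Before calling this routine, compute anorm and call ?pbtrf to compute the Cholesky factorization of A.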
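The band storage used by ?pbtrf and ?pbcon can be sketched as below for uplo = 'U', where a(i,j) is stored in ab(kd+1+i-j, j) for max(1, j-kd) <= i <= j (the standard LAPACK band layout). The program name and the tridiagonal test matrix are hypothetical; the smallest allowed leading dimension, ldab = kd + 1, is used.

```fortran
program band_upper_sketch
  implicit none
  integer, parameter :: n = 4, kd = 1, ldab = kd + 1
  real :: a(n, n), ab(ldab, n)
  integer :: i, j

  ! Symmetric tridiagonal test matrix: 2 on the diagonal, -1 beside it.
  a = 0.0
  do j = 1, n
    a(j, j) = 2.0
    if (j > 1) then
      a(j-1, j) = -1.0
      a(j, j-1) = -1.0
    end if
  end do

  ! Upper band storage: a(i,j) -> ab(kd+1+i-j, j) for max(1,j-kd) <= i <= j.
  ab = 0.0
  do j = 1, n
    do i = max(1, j - kd), j
      ab(kd + 1 + i - j, j) = a(i, j)
    end do
  end do

  ! Row kd+1 of ab holds the diagonal; row kd holds the superdiagonal.
  do i = 1, ldab
    print '(4f6.1)', ab(i, :)
  end do
end program band_upper_sketch
```

The unused corner element ab(1,1) is left at zero here; band routines do not reference it.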
Input Parameters

A; kd ≥ 0.
ldab INTEGER. The leading dimension of the array ab. (ldab ≥ kd + 1).
ab, work REAL for spbcon

DOUBLE PRECISION for dpbcon
COMPLEX for cpbcon
DOUBLE COMPLEX for zpbcon.
Arrays: ab(ldab,*), work(*).
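The array ab contains the factored matrix A in band form, as returned
by ?pbtrf. The second dimension of ab must be at least max(1, n).
The array work is a workspace for the routine. The dimension of work
must be at least max(1, 3*n) for real flavors and max(1, 2*n) for
complex flavors.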

rwork REAL for cpbcon

DOUBLE PRECISION for zpbcon.
Output Parameters

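rcond = 0 if the estimate underflows; in this case the matrix is
singular (to working precision). However, anytime rcond is small
compared to 1.0, for the working precision, the matrix may be poorly
conditioned or even singular.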


Specific details for the routine pbcon interface are as follows:
Application Notes
A call to this routine involves solving a number of systems of linear equations A*x = b; the number is
usually 4 or 5 and never more than 11. Each solution requires approximately 4*n(kd + 1) floating-point
operations for real flavors and 16*n(kd + 1) for complex flavors.
See Also
?ptcon
Estimates the reciprocal of the condition number of a
symmetric (Hermitian) positive-definite tridiagonal
matrix.
Syntax
call sptcon( n, d, e, anorm, rcond, work, info )
call dptcon( n, d, e, anorm, rcond, work, info )
call cptcon( n, d, e, anorm, rcond, work, info )
call zptcon( n, d, e, anorm, rcond, work, info )
call ptcon( d, e, anorm, rcond [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the reciprocal of the condition number (in the 1-norm) of a real symmetric or complex
Hermitian positive-definite tridiagonal matrix using the factorization A = L*D*L^T for real flavors and
A = L*D*L^H for complex flavors, or A = U^T*D*U for real flavors and A = U^H*D*U for complex flavors,
computed by ?pttrf.

The norm $\|A^{-1}\|$ is computed by a direct method, and the reciprocal of the condition number is computed
as rcond = $1 / (\|A\| \|A^{-1}\|)$.

Before calling this routine:
compute anorm as $\|A\|_1 = \max_j \sum_i |a_{ij}|$
call ?pttrf to compute the factorization of A.
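A minimal sketch of this sequence for the real double-precision flavor is shown below. The program name and the test matrix (2 on the diagonal, -1 off the diagonal) are hypothetical; the FORTRAN 77 interface with default (LP64) integers and a link against MKL or another LAPACK library are assumed.

```fortran
program ptcon_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: d(n), e(n-1), work(n), anorm, colsum, rcond
  integer  :: j, info
  external :: dpttrf, dptcon

  ! Tridiagonal test matrix: diagonal d, off-diagonal e (illustrative values).
  d = 2.0_dp
  e = -1.0_dp

  ! 1-norm of A: largest absolute column sum, computed before factorization.
  anorm = 0.0_dp
  do j = 1, n
    colsum = abs(d(j))
    if (j > 1) colsum = colsum + abs(e(j-1))
    if (j < n) colsum = colsum + abs(e(j))
    anorm = max(anorm, colsum)
  end do

  ! Factorize A; d and e are overwritten by the factors.
  call dpttrf(n, d, e, info)
  if (info /= 0) stop 'dpttrf failed'

  ! Reciprocal condition number from the factors and anorm.
  call dptcon(n, d, e, anorm, rcond, work, info)
  if (info /= 0) stop 'dptcon failed'

  print '(a, es12.4)', 'rcond = ', rcond
end program ptcon_sketch
```

For this particular matrix the 1-norm of A is 4 and the 1-norm of its inverse is 3, so rcond should come out close to 1/12.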
Input Parameters
d, work REAL for single precision flavors

Arrays, dimension (n).
The array d contains the n diagonal elements of the diagonal matrix D
from the factorization of A, as computed by ?pttrf ;
work is a workspace array.
e REAL for sptcon

DOUBLE PRECISION for dptcon
COMPLEX for cptcon
DOUBLE COMPLEX for zptcon.
Array, size (n -1).
Contains off-diagonal elements of the unit bidiagonal factor U or L
from the factorization computed by ?pttrf.

The 1-norm of the original matrix A (see Description).
Output Parameters

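rcond = 0 if the estimate underflows; in this case the matrix is
singular (to working precision). However, anytime rcond is small
compared to 1.0, for the working precision, the matrix may be poorly
conditioned or even singular.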
info INTEGER.
If info = 0, the execution is successful.

Specific details for the routine ptcon interface are as follows:
Application Notes
4*n(kd + 1) floating-point operations for real flavors and 16*n(kd + 1) for complex flavors.
?sycon
Estimates the reciprocal of the condition number of a
symmetric matrix.
Syntax
call ssycon( uplo, n, a, lda, ipiv, anorm, rcond, work, iwork, info )
call dsycon( uplo, n, a, lda, ipiv, anorm, rcond, work, iwork, info )
call csycon( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call zsycon( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call sycon( a, ipiv, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a symmetric matrix A:
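$\kappa_1(A) = \|A\|_1 \|A^{-1}\|_1$ (since A is symmetric, $\kappa_\infty(A) = \kappa_1(A)$).
An estimate is obtained for $\|A^{-1}\|$, and the reciprocal of the condition number is computed as
rcond = $1 / (\|A\| \|A^{-1}\|)$.

Before calling this routine, compute anorm and call ?sytrf to compute the factorization of A.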
Input Parameters

a, work REAL for ssycon

DOUBLE PRECISION for dsycon
COMPLEX for csycon
DOUBLE COMPLEX for zsycon.
The array a contains the factored matrix A, as returned by ?sytrf.

The array work is a workspace for the routine.

The dimension of work must be at least max(1, 2*n).
ipiv INTEGER. Array, size at least max(1, n).

The array ipiv, as returned by ?sytrf.

Output Parameters

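rcond = 0 if the estimate underflows; in this case the matrix is
singular (to working precision). However, anytime rcond is small
compared to 1.0, for the working precision, the matrix may be poorly
conditioned or even singular.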


Specific details for the routine sycon interface are as follows:
Application Notes
See Also
?sycon_rook
Estimates the reciprocal of the condition number of a
symmetric matrix.
Syntax
call ssycon_rook( uplo, n, a, lda, ipiv, anorm, rcond, work, iwork, info )
call dsycon_rook( uplo, n, a, lda, ipiv, anorm, rcond, work, iwork, info )
call csycon_rook( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call zsycon_rook( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call sycon_rook( a, ipiv, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a symmetric matrix A:

call ?sytrf_rook to compute the factorization of A.
Input Parameters

a, work REAL for ssycon_rook

DOUBLE PRECISION for dsycon_rook
COMPLEX for csycon_rook
DOUBLE COMPLEX for zsycon_rook.
The array a contains the factored matrix A, as returned by
?sytrf_rook. The second dimension of a must be at least max(1,n).
The array work is a workspace for the routine.
The dimension of work must be at least max(1, 2*n).

The array ipiv, as returned by ?sytrf_rook.

Output Parameters

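rcond = 0 if the estimate underflows; in this case the matrix is
singular (to working precision). However, anytime rcond is small
compared to 1.0, for the working precision, the matrix may be poorly
conditioned or even singular.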


Specific details for the routine sycon_rook interface are as follows:
Application Notes
See Also
?hecon
Estimates the reciprocal of the condition number of a
Hermitian matrix.
Syntax
call checon( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call zhecon( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call hecon( a, ipiv, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a Hermitian matrix A:
$\kappa_1(A) = \|A\|_1 \|A^{-1}\|_1$ (since A is Hermitian, $\kappa_\infty(A) = \kappa_1(A)$).

Before calling this routine:
compute anorm (either $\|A\|_1 = \max_j \sum_i |a_{ij}|$ or $\|A\|_\infty = \max_i \sum_j |a_{ij}|$)
call ?hetrf to compute the factorization of A.
Input Parameters

a, work COMPLEX for checon

DOUBLE COMPLEX for zhecon.
The array a contains the factored matrix A, as returned by ?hetrf.

The array work is a workspace for the routine. The dimension of work
must be at least max(1, 2*n).

The array ipiv, as returned by ?hetrf.

Output Parameters

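rcond = 0 if the estimate underflows; in this case the matrix is
singular (to working precision). However, anytime rcond is small
compared to 1.0, for the working precision, the matrix may be poorly
conditioned or even singular.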

Specific details for the routine hecon interface are as follows:
Application Notes
A call to this routine involves solving a number of systems of linear equations A*x = b; the number is
usually 5 and never more than 11. Each solution requires approximately 8n^2 floating-point operations.
See Also
?hecon_rook
Estimates the reciprocal of the condition number of a
Hermitian matrix using factorization obtained with one
of the bounded diagonal pivoting methods (max 2
interchanges).
Syntax
call checon_rook( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call zhecon_rook( uplo, n, a, lda, ipiv, anorm, rcond, work, info )
call hecon_rook( a, ipiv, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a Hermitian matrix A using the factorization
A = U*D*U^H or A = L*D*L^H computed by ?hetrf_rook.
An estimate is obtained for norm(A^-1), and the reciprocal of the condition number is computed as
rcond = 1/(anorm*norm(A^-1)).
Input Parameters

a, work COMPLEX for checon_rook

COMPLEX*16 for zhecon_rook.
Arrays: a(lda,n), work(*).
The array a contains the factored matrix A, as returned by
?hetrf_rook. The second dimension of a must be at least max(1,n).

The array ipiv, as returned by ?hetrf_rook.
anorm REAL for checon_rook

DOUBLE PRECISION for zhecon_rook.
The 1-norm of the original matrix A (see Description).
Output Parameters
rcond REAL for checon_rook

DOUBLE PRECISION for zhecon_rook.
The reciprocal of the condition number of the matrix A, computed as
rcond = 1/(anorm * ainvnm), where ainvnm is an estimate of the
1-norm of A^-1 computed in this routine.


Specific details for the routine hecon_rook interface are as follows:
See Also
?spcon
Estimates the reciprocal of the condition number of a
packed symmetric matrix.
Syntax
call sspcon( uplo, n, ap, ipiv, anorm, rcond, work, iwork, info )
call dspcon( uplo, n, ap, ipiv, anorm, rcond, work, iwork, info )
call cspcon( uplo, n, ap, ipiv, anorm, rcond, work, info )
call zspcon( uplo, n, ap, ipiv, anorm, rcond, work, info )
call spcon( ap, ipiv, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a packed symmetric matrix A:
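An estimate is obtained for $\|A^{-1}\|$, and the reciprocal of the condition number is computed as
rcond = $1 / (\|A\| \|A^{-1}\|)$.

Before calling this routine, compute anorm and call ?sptrf to compute the factorization of A.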
Input Parameters

If uplo = 'U', the array ap stores the packed upper triangular factor
U of the factorization A = U*D*U^T.
If uplo = 'L', the array ap stores the packed lower triangular factor
L of the factorization A = L*D*L^T.
ap, work REAL for sspcon

DOUBLE PRECISION for dspcon
COMPLEX for cspcon
DOUBLE COMPLEX for zspcon.
Arrays: ap(*), work(*).
The array ap contains the packed factored matrix A, as returned by
?sptrf. The dimension of ap must be at least max(1,n(n+1)/2).

The array ipiv, as returned by ?sptrf.

Output Parameters

rcond = 0 if the estimate underflows; in this case the matrix is
singular (to working precision). However, anytime rcond is small
compared to 1.0, for the working precision, the matrix may be poorly
conditioned or even singular.
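One way to act on this guidance in calling code is sketched below. The function name classify and the choice of epsilon(1.0d0) as the cutoff for "small compared to 1.0" are assumptions made for illustration; the reference does not prescribe a specific threshold.

```fortran
program rcond_check_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)

  print *, trim(classify(0.0_dp))
  print *, trim(classify(1.0e-18_dp))
  print *, trim(classify(0.25_dp))

contains

  ! Hypothetical helper: label a matrix from its rcond estimate.
  function classify(rcond) result(label)
    real(dp), intent(in) :: rcond
    character(len=15) :: label
    if (rcond == 0.0_dp) then
      ! The estimate underflowed: singular to working precision.
      label = 'singular'
    else if (rcond < epsilon(1.0_dp)) then
      ! Assumed cutoff: below machine epsilon for double precision.
      label = 'ill-conditioned'
    else
      label = 'ok'
    end if
  end function classify

end program rcond_check_sketch
```

A stricter or looser cutoff may suit a given application; the point is only that rcond should be checked before the computed solution is trusted.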


Specific details for the routine spcon interface are as follows:
Application Notes
See Also
?hpcon
Estimates the reciprocal of the condition number of a
packed Hermitian matrix.
Syntax
call chpcon( uplo, n, ap, ipiv, anorm, rcond, work, info )
call zhpcon( uplo, n, ap, ipiv, anorm, rcond, work, info )
call hpcon( ap, ipiv, anorm, rcond [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a Hermitian matrix A:
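$\kappa_1(A) = \|A\|_1 \|A^{-1}\|_1$ (since A is Hermitian, $\kappa_\infty(A) = \kappa_1(A)$).
An estimate is obtained for $\|A^{-1}\|$, and the reciprocal of the condition number is computed as
rcond = $1 / (\|A\| \|A^{-1}\|)$.

Before calling this routine:
compute anorm (either $\|A\|_1 = \max_j \sum_i |a_{ij}|$ or $\|A\|_\infty = \max_i \sum_j |a_{ij}|$)
call ?hptrf to compute the factorization of A.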
Input Parameters

If uplo = 'U', the array ap stores the packed upper triangular factor
U of the factorization A = U*D*U^H.
If uplo = 'L', the array ap stores the packed lower triangular factor
L of the factorization A = L*D*L^H.
ap, work COMPLEX for chpcon

DOUBLE COMPLEX for zhpcon.
The array ap(*) contains the packed factored matrix A, as returned
by ?hptrf. The dimension of ap must be at least max(1,n(n+1)/2).
The array work(*) is a workspace for the routine. The dimension of
work must be at least max(1, 2*n).
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv, as returned by ?hptrf.

Output Parameters

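rcond = 0 if the estimate underflows; in this case the matrix is
singular (to working precision). However, anytime rcond is small
compared to 1.0, for the working precision, the matrix may be poorly
conditioned or even singular.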


Specific details for the routine hpcon interface are as follows:
Application Notes
A call to this routine involves solving a number of systems of linear equations A*x = b; the number is
usually 5 and never more than 11. Each solution requires approximately 8n^2 floating-point operations.
See Also
?trcon
Estimates the reciprocal of the condition number of a
triangular matrix.
Syntax
call strcon( norm, uplo, diag, n, a, lda, rcond, work, iwork, info )
call dtrcon( norm, uplo, diag, n, a, lda, rcond, work, iwork, info )
call ctrcon( norm, uplo, diag, n, a, lda, rcond, work, rwork, info )
call ztrcon( norm, uplo, diag, n, a, lda, rcond, work, rwork, info )
call trcon( a, rcond [,uplo] [,diag] [,norm] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a triangular matrix A in either the 1-norm or
infinity-norm:
$\kappa_1(A) = \|A\|_1 \|A^{-1}\|_1 = \kappa_\infty(A^T) = \kappa_\infty(A^H)$
$\kappa_\infty(A) = \|A\|_\infty \|A^{-1}\|_\infty = \kappa_1(A^T) = \kappa_1(A^H)$.
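Because no prior factorization is needed for a triangular matrix, the routine can be called directly. The sketch below estimates the reciprocal condition number in both norms for a small upper triangular matrix. The program name and matrix values are hypothetical; the FORTRAN 77 interface with default (LP64) integers and a link against MKL or another LAPACK library are assumed.

```fortran
program trcon_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, lda = n
  real(dp) :: a(lda, n), work(3*n), rcond1, rcondi
  integer  :: iwork(n), info

  external :: dtrcon

  ! Upper triangular test matrix (illustrative values).
  a = 0.0_dp
  a(1,1) = 2.0_dp; a(1,2) = 1.0_dp; a(1,3) = 0.5_dp
  a(2,2) = 4.0_dp; a(2,3) = 1.0_dp
  a(3,3) = 8.0_dp

  ! norm = '1': estimate in the 1-norm; diag = 'N': non-unit diagonal.
  call dtrcon('1', 'U', 'N', n, a, lda, rcond1, work, iwork, info)
  if (info /= 0) stop 'dtrcon (1-norm) failed'

  ! norm = 'I': estimate in the infinity-norm.
  call dtrcon('I', 'U', 'N', n, a, lda, rcondi, work, iwork, info)
  if (info /= 0) stop 'dtrcon (infinity-norm) failed'

  print '(a, es12.4)', 'rcond (1-norm)        = ', rcond1
  print '(a, es12.4)', 'rcond (infinity-norm) = ', rcondi
end program trcon_sketch
```

The two estimates generally differ for a non-symmetric triangular matrix, which is why the norm argument is part of the interface.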
Input Parameters

If uplo = 'U', the array a stores the upper triangle of A, other array
elements are not referenced.
If uplo = 'L', the array a stores the lower triangle of A, other array
elements are not referenced.


a, work REAL for strcon

DOUBLE PRECISION for dtrcon
COMPLEX for ctrcon
DOUBLE COMPLEX for ztrcon.
The array a(lda,*) contains the matrix A. The second dimension of a
must be at least max(1,n).

The array work is a workspace for the routine. The dimension of
work must be at least max(1, 3*n) for real flavors and max(1, 2*n)
for complex flavors.
rwork REAL for ctrcon

DOUBLE PRECISION for ztrcon.
Output Parameters

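rcond = 0 if the estimate underflows; in this case the matrix is
singular (to working precision). However, anytime rcond is small
compared to 1.0, for the working precision, the matrix may be poorly
conditioned or even singular.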


Specific details for the routine trcon interface are as follows:
Application Notes
A call to this routine involves solving a number of systems of linear equations A*x = b; the number is
usually 4 or 5 and never more than 11. Each solution requires approximately n^2 floating-point
operations for real flavors and 4n^2 operations for complex flavors.
See Also
?tpcon
Estimates the reciprocal of the condition number of a
packed triangular matrix.
Syntax
call stpcon( norm, uplo, diag, n, ap, rcond, work, iwork, info )
call dtpcon( norm, uplo, diag, n, ap, rcond, work, iwork, info )
call ctpcon( norm, uplo, diag, n, ap, rcond, work, rwork, info )
call ztpcon( norm, uplo, diag, n, ap, rcond, work, rwork, info )
call tpcon( ap, rcond [,uplo] [,diag] [,norm] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a packed triangular matrix A in either the 1-
norm or infinity-norm:
$\kappa_1(A) = \|A\|_1 \|A^{-1}\|_1 = \kappa_\infty(A^T) = \kappa_\infty(A^H)$
$\kappa_\infty(A) = \|A\|_\infty \|A^{-1}\|_\infty = \kappa_1(A^T) = \kappa_1(A^H)$.
Input Parameters

uplo CHARACTER*1. Must be 'U' or 'L'. Indicates whether A is upper or
lower triangular:
If uplo = 'U', the array ap stores the upper triangle of A in packed
form.
If uplo = 'L', the array ap stores the lower triangle of A in packed
form.


ap, work REAL for stpcon

DOUBLE PRECISION for dtpcon
COMPLEX for ctpcon
DOUBLE COMPLEX for ztpcon.
The array ap(*) contains the packed matrix A. The dimension of ap
must be at least max(1,n(n+1)/2).
The array work(*) is a workspace for the routine. The dimension of
work must be at least max(1, 3*n) for real flavors and max(1, 2*n)
for complex flavors.
rwork REAL for ctpcon

DOUBLE PRECISION for ztpcon.
Output Parameters

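rcond = 0 if the estimate underflows; in this case the matrix is
singular (to working precision). However, anytime rcond is small
compared to 1.0, for the working precision, the matrix may be poorly
conditioned or even singular.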


Specific details for the routine tpcon interface are as follows:
Application Notes
A call to this routine involves solving a number of systems of linear equations A*x = b; the number is
usually 4 or 5 and never more than 11. Each solution requires approximately n^2 floating-point
operations for real flavors and 4n^2 operations for complex flavors.
See Also
?tbcon
Estimates the reciprocal of the condition number of a
triangular band matrix.
Syntax
call stbcon( norm, uplo, diag, n, kd, ab, ldab, rcond, work, iwork, info )
call dtbcon( norm, uplo, diag, n, kd, ab, ldab, rcond, work, iwork, info )
call ctbcon( norm, uplo, diag, n, kd, ab, ldab, rcond, work, rwork, info )
call ztbcon( norm, uplo, diag, n, kd, ab, ldab, rcond, work, rwork, info )
call tbcon( ab, rcond [,uplo] [,diag] [,norm] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine estimates the reciprocal of the condition number of a triangular band matrix A in either the 1-
norm or infinity-norm:
$\kappa_1(A) = \|A\|_1 \|A^{-1}\|_1 = \kappa_\infty(A^T) = \kappa_\infty(A^H)$
$\kappa_\infty(A) = \|A\|_\infty \|A^{-1}\|_\infty = \kappa_1(A^T) = \kappa_1(A^H)$.
Input Parameters

uplo CHARACTER*1. Must be 'U' or 'L'. Indicates whether A is upper or
lower triangular:
If uplo = 'U', the array ab stores the upper triangle of A in band
form.
If uplo = 'L', the array ab stores the lower triangle of A in band
form.



A; kd ≥ 0.
ab, work REAL for stbcon

DOUBLE PRECISION for dtbcon
COMPLEX for ctbcon
DOUBLE COMPLEX for ztbcon.
The array ab(ldab,*) contains the band matrix A. The second
dimension of ab must be at least max(1,n).
The array work is a workspace for the routine. The dimension of
work(*) must be at least max(1, 3*n) for real flavors and max(1,
2*n) for complex flavors.
ldab INTEGER. The leading dimension of the array ab. (ldab ≥ kd + 1).
rwork REAL for ctbcon

DOUBLE PRECISION for ztbcon.
Output Parameters

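rcond = 0 if the estimate underflows; in this case the matrix is
singular (to working precision). However, anytime rcond is small
compared to 1.0, for the working precision, the matrix may be poorly
conditioned or even singular.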

Specific details for the routine tbcon interface are as follows:
Application Notes
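A call to this routine involves solving a number of systems of linear equations A*x = b; the number is
usually 4 or 5 and never more than 11. Each solution requires approximately 2*n(kd + 1) floating-point
operations for real flavors and 8*n(kd + 1) operations for complex flavors.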
See Also
Refining the Solution and Estimating Its Error: LAPACK Computational Routines
This section describes the LAPACK routines for refining the computed solution of a system of linear equations
and estimating the solution error. You can call these routines after factorizing the matrix of the system of
equations and computing the solution (see Routines for Matrix Factorization and Routines for Solving
Systems of Linear Equations).
?gerfs
Refines the solution of a system of linear equations
with a general coefficient matrix and estimates its
error.
Syntax
call sgerfs( trans, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call dgerfs( trans, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call cgerfs( trans, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call zgerfs( trans, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call gerfs( a, af, ipiv, b, x [,trans] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine performs an iterative refinement of the solution to a system of linear equations A*X = B or
A^T*X = B or A^H*X = B with a general matrix A, with multiple right-hand sides. For each computed solution
vector x, the routine computes the component-wise backward error β. This error is the smallest relative
perturbation in elements of A and b such that x is the exact solution of the perturbed system:
$|\delta a_{ij}| \le \beta |a_{ij}|$, $|\delta b_i| \le \beta |b_i|$ such that $(A + \delta A)x = (b + \delta b)$.
Finally, the routine estimates the component-wise forward error in the computed solution
$\|x - x_e\|_\infty / \|x\|_\infty$ (here $x_e$ is the exact solution).

Before calling this routine:
call the factorization routine ?getrf
call the solver routine ?getrs.
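The factorize, solve, refine sequence can be sketched as follows for the real double-precision flavor. The program name, matrix, and right-hand side are hypothetical; the FORTRAN 77 interface with default (LP64) integers and a link against MKL or another LAPACK library are assumed. A copy of the original matrix is kept because ?gerfs needs both A and its LU factors.

```fortran
program gerfs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: a(n, n), af(n, n), b(n, nrhs), x(n, nrhs)
  real(dp) :: ferr(nrhs), berr(nrhs), work(3*n)
  integer  :: ipiv(n), iwork(n), info
  external :: dgetrf, dgetrs, dgerfs

  ! Illustrative system (column-major initialization).
  a = reshape([4.0_dp, 1.0_dp, 0.0_dp, &
               1.0_dp, 3.0_dp, 1.0_dp, &
               0.0_dp, 1.0_dp, 2.0_dp], [n, n])
  b(:, 1) = [1.0_dp, 2.0_dp, 3.0_dp]

  ! Factorize a copy so that the original A is preserved for ?gerfs.
  af = a
  call dgetrf(n, n, af, n, ipiv, info)
  if (info /= 0) stop 'dgetrf failed'

  ! Solve using the factors; x starts as a copy of b.
  x = b
  call dgetrs('N', n, nrhs, af, n, ipiv, x, n, info)
  if (info /= 0) stop 'dgetrs failed'

  ! Refine x and obtain forward/backward error estimates.
  call dgerfs('N', n, nrhs, a, n, af, n, ipiv, b, n, x, n, &
              ferr, berr, work, iwork, info)
  if (info /= 0) stop 'dgerfs failed'

  print '(a, 3f10.6)', 'x    = ', x(:, 1)
  print '(a, es10.2)', 'ferr = ', ferr(1)
  print '(a, es10.2)', 'berr = ', berr(1)
end program gerfs_sketch
```

For a well-conditioned system like this one, berr is expected to be on the order of machine precision.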
Input Parameters

If trans = 'N', the system has the form A*X = B.
If trans = 'T', the system has the form A^T*X = B.
If trans = 'C', the system has the form A^H*X = B.
a,af,b,x,work REAL for sgerfs

DOUBLE PRECISION for dgerfs
COMPLEX for cgerfs
DOUBLE COMPLEX for zgerfs.
Arrays:
a(size lda by *) contains the original matrix A, as supplied to ?getrf.
af(size ldaf by *) contains the factored matrix A, as returned by
?getrf.
b(size ldb by *) contains the right-hand side matrix B.
x(size ldx by *) contains the solution matrix X.
work(size *) is a workspace array.
The second dimension of a and af must be at least max(1, n); the
second dimension of b and x must be at least max(1, nrhs); the
dimension of work must be at least max(1, 3*n) for real flavors and
max(1, 2*n) for complex flavors.
ldaf INTEGER. The leading dimension of af; ldaf ≥ max(1, n).
ldx INTEGER. The leading dimension of x; ldx ≥ max(1, n).
ipiv INTEGER.
The ipiv array, as returned by ?getrf.
iwork INTEGER.
rwork REAL for cgerfs

DOUBLE PRECISION for zgerfs.
Output Parameters
x The refined solution matrix X.
ferr, berr REAL for single precision flavors

Arrays, size at least max(1, nrhs). Contain the component-wise
forward and backward errors, respectively, for each solution vector.


Specific details for the routine gerfs interface are as follows:
af Holds the matrix AF of size (n, n).
x Holds the matrix X of size (n, nrhs).
ferr Holds the vector of length (nrhs).
berr Holds the vector of length (nrhs).
Application Notes
The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
For each right-hand side, computation of the backward error involves a minimum of 4n^2 floating-point
operations (for real flavors) or 16n^2 operations (for complex flavors). In addition, each step of iterative
refinement involves 6n^2 operations (for real flavors) or 24n^2 operations (for complex flavors); the number of
iterations may range from 1 to 5. Estimating the forward error involves solving a number of systems of linear
equations A*x = b with the same coefficient matrix A and different right hand sides b; the number is usually
4 or 5 and never more than 11. Each solution requires approximately 2n^2 floating-point operations for real
flavors or 8n^2 for complex flavors.
See Also
?gerfsx
Uses extra precise iterative refinement to improve the
solution to the system of linear equations with a
general coefficient matrix A and provides error bounds
and backward error estimates.
Syntax
call sgerfsx( trans, equed, n, nrhs, a, lda, af, ldaf, ipiv, r, c, b, ldb, x, ldx,
rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork,
info )
call dgerfsx( trans, equed, n, nrhs, a, lda, af, ldaf, ipiv, r, c, b, ldb, x, ldx,
rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork,
info )
call cgerfsx( trans, equed, n, nrhs, a, lda, af, ldaf, ipiv, r, c, b, ldb, x, ldx,
rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork,
info )
call zgerfsx( trans, equed, n, nrhs, a, lda, af, ldaf, ipiv, r, c, b, ldb, x, ldx,
rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork,
info )
Include Files
mkl.fi, lapack.f90
Description
The routine improves the computed solution to a system of linear equations and provides error bounds and
backward error estimates for the solution. In addition to a normwise error bound, the code provides a
maximum componentwise error bound, if possible. See comments for err_bnds_norm and err_bnds_comp
for details of the error bounds.
The original system of linear equations may have been equilibrated before calling this routine, as described
by the parameters equed, r, and c below. In this case, the solution and error bounds returned are for the
original unequilibrated system.
Input Parameters
trans CHARACTER*1. Must be 'N', 'T', or 'C'.

Specifies the form of the system of equations:
If trans = 'N', the system has the form A*X = B (No transpose).
If trans = 'T', the system has the form A^T*X = B (Transpose).
If trans = 'C', the system has the form A^H*X = B (Conjugate transpose
for complex flavors, Transpose for real flavors).
equed CHARACTER*1. Must be 'N', 'R', 'C', or 'B'.

Specifies the form of equilibration that was done to A before calling this
routine.
If equed = 'N', no equilibration was done.
If equed = 'R', row equilibration was done, that is, A has been
premultiplied by diag(r).
If equed = 'C', column equilibration was done, that is, A has been
postmultiplied by diag(c).
If equed = 'B', both row and column equilibration was done, that is, A has
been replaced by diag(r)*A*diag(c). The right-hand side B has been
changed accordingly.
n INTEGER. The number of linear equations; the order of the matrix A; n ≥ 0.
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrices B and X; nrhs ≥ 0.
a, af, b, work REAL for sgerfsx

DOUBLE PRECISION for dgerfsx
COMPLEX for cgerfsx
DOUBLE COMPLEX for zgerfsx.
Arrays: a (size lda by *), af (size ldaf by *), b(size ldb by *), work(*).
The array a contains the original n-by-n matrix A.
The array af contains the factored form of the matrix A, that is, the factors
L and U from the factorization A = P*L*U as computed by ?getrf.
The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations. The second dimension of b must be at least
max(1,nrhs).
work (size *) is a workspace array. The dimension of work must be at least
max(1,4*n) for real flavors, and at least max(1,2*n) for complex flavors.
ipiv INTEGER.
Array, size at least max(1, n). Contains the pivot indices as computed by
?getrf; for 1 ≤ i ≤ n, row i of the matrix was interchanged with row
ipiv(i).
r, c REAL for single precision flavors

Arrays: r (size n), c (size n). The array r contains the row scale factors for
A, and the array c contains the column scale factors for A.
If equed = 'R' or 'B', A is multiplied on the left by diag(r); if equed = 'N'
or 'C', r is not accessed.
If equed = 'R' or 'B', each element of r must be positive.
If equed = 'C' or 'B', A is multiplied on the right by diag(c); if equed =
'N' or 'R', c is not accessed.
If equed = 'C' or 'B', each element of c must be positive.
Each element of r or c should be a power of the radix to ensure a reliable
solution and error estimates. Scaling by powers of the radix does not cause
rounding errors unless the result underflows or overflows. Rounding errors
during scaling lead to refining with a matrix that is not equivalent to the
input matrix, producing error estimates that may not be reliable.
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).
x REAL for sgerfsx

COMPLEX for cgerfsx
Array, size ldx by *.
The solution matrix X as computed by ?getrs.
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1, n).
n_err_bnds INTEGER. Number of error bounds to return for each right hand side and
each type (normwise or componentwise). See err_bnds_norm and
err_bnds_comp descriptions in the Output Parameters section below.
nparams INTEGER. Specifies the number of parameters set in params. If 0, the

params array is never referenced and default values are used.
params REAL for single precision flavors

Array, size nparams. Specifies algorithm parameters. If an entry is less than
0.0, that entry is filled with the default value used for that parameter. Only
positions up to nparams are accessed; defaults are used for higher-
numbered parameters. If defaults are acceptable, you can pass nparams =
0, which prevents the source code from accessing the params argument.
params(1) : Whether to perform iterative refinement or not. Default: 1.0
  =0.0  No refinement is performed and no error bounds are computed.
  =1.0  Use the double-precision refinement algorithm, possibly with
        doubled-single computations if the compilation environment
        does not support double precision.
  (Other values are reserved for future use.)
params(2) : Maximum number of residual computations allowed for refinement.
  Default     10.0
  Aggressive  Set to 100.0 to permit convergence using approximate
              factorizations or factorizations other than LU. If the
              factorization uses a technique other than Gaussian
              elimination, the guarantees in err_bnds_norm and
              err_bnds_comp may no longer be trustworthy.
params(3) : Flag determining if the code will attempt to find a solution
with a small componentwise relative error in the double-precision algorithm.
Positive is true, 0.0 is false. Default: 1.0 (attempt componentwise
convergence).
iwork INTEGER. Workspace array, size at least max(1, n); used in real flavors
only.
rwork REAL for single precision flavors

Workspace array, size at least max(1, 3*n); used in complex flavors only.
Output Parameters
x REAL for sgerfsx

COMPLEX for cgerfsx
The improved solution matrix X.

Reciprocal scaled condition number. An estimate of the reciprocal Skeel
condition number of the matrix A after equilibration (if done). If rcond is
less than the machine precision, in particular, if rcond = 0, the matrix is
singular to working precision. Note that the error may still be small even if
this number is very small and the matrix appears ill-conditioned.
berr REAL for single precision flavors

Array, size at least max(1, nrhs). Contains the componentwise relative
backward error for each solution vector x(j), that is, the smallest relative
change in any element of A or B that makes x(j) an exact solution.
err_bnds_norm REAL for single precision flavors

Array of size nrhs by n_err_bnds. For each right-hand side, contains
information about various error bounds and condition numbers
corresponding to the normwise relative error, which is defined as follows:
Normwise relative error in the i-th solution vector
The array is indexed by the type of error information as described below.

There are currently up to three pieces of information returned.
The first index in err_bnds_norm(i,:) corresponds to the i-th right-hand
side.
The second index in err_bnds_norm(:,err) contains the following three
fields:
err=1 "Trust/don't trust" boolean. Trust the answer if
the reciprocal condition number is greater than the
threshold sqrt(n)*slamch() for single
precision flavors and sqrt(n)*dlamch() for
double precision flavors.
err=2 "Guaranteed" error bound. The estimated

forward error, almost certainly within a factor of
10 of the true error so long as the next entry is
greater than the threshold sqrt(n)*slamch()
for single precision flavors and
sqrt(n)*dlamch() for double precision
flavors. This error bound should only be trusted
if the previous boolean is true.
err=3 Reciprocal condition number. Estimated

normwise reciprocal condition number.
Compared with the threshold
sqrt(n)*slamch() for single precision flavors
and sqrt(n)*dlamch() for double precision
flavors to determine if the error estimate is
"guaranteed". These reciprocal condition
numbers for some appropriately scaled matrix Z
are:
Let z=s*a, where s scales each row by a power

of the radix so all absolute row sums of z are
approximately 1.
err_bnds_comp REAL for single precision flavors

corresponding to the componentwise relative error, which is defined as
follows:
Componentwise relative error in the i-th solution vector:
The array is indexed by the right-hand side i, on which the componentwise

relative error depends, and by the type of error information as described
below. There are currently up to three pieces of information returned for
each right-hand side. If componentwise accuracy is not requested
(params(3) = 0.0), then err_bnds_comp is not accessed. If n_err_bnds
< 3, then at most the first (:,n_err_bnds) entries are returned.
The first index in err_bnds_comp(i,:) corresponds to the i-th right-hand

side.
The second index in err_bnds_comp(:,err) contains the following three
fields:



componentwise reciprocal condition number.
are:
Let z=s*(a*diag(x)), where x is the solution

for the current right-hand side and s scales each
row of a*diag(x) by a power of the radix so all
absolute row sums of z are approximately 1.

Output parameter only if the input contains erroneous values, namely, in
params(1), params(2), params(3). In such a case, the corresponding
elements of params are filled with default values on output.
info INTEGER. If info = 0, the execution is successful. The solution to every

right-hand side is guaranteed.
If 0 < info ≤ n: U(info,info) is exactly zero. The factorization has been
completed, but the factor U is exactly singular, so the solution and error
bounds could not be computed; rcond = 0 is returned.
If info = n+j: The solution corresponding to the j-th right-hand side is not
guaranteed. The solutions corresponding to other right-hand sides k with k
> j may not be guaranteed as well, but only the first such right-hand side is
reported. If a small componentwise error is not requested (params(3) =
0.0), then the j-th right-hand side is the first with a normwise error bound
that is not guaranteed (the smallest j such that err_bnds_norm(j,1) =
0.0 or err_bnds_comp(j,1) = 0.0). See the definition of
err_bnds_norm and err_bnds_comp for err = 1. To get information about
all of the right-hand sides, check err_bnds_norm or err_bnds_comp.
See Also
?gbrfs
Refines the solution of a system of linear equations
with a general band coefficient matrix and estimates
its error.
Syntax
call sgbrfs( trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, b, ldb, x, ldx, ferr,
berr, work, iwork, info )
call dgbrfs( trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, b, ldb, x, ldx, ferr,
berr, work, iwork, info )
call cgbrfs( trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, b, ldb, x, ldx, ferr,
berr, work, rwork, info )
call zgbrfs( trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, b, ldb, x, ldx, ferr,
berr, work, rwork, info )
call gbrfs( ab, afb, ipiv, b, x [,kl] [,trans] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
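The routine performs an iterative refinement of the solution to a system of linear equations A*X = B or
A^T*X = B or A^H*X = B with a band matrix A, with multiple right-hand sides. For each computed solution
vector x, the routine computes the component-wise backward error β. This error is the smallest relative
perturbation in elements of A and b such that x is the exact solution of the perturbed system:
$|\delta a_{ij}| \le \beta |a_{ij}|$, $|\delta b_i| \le \beta |b_i|$ such that $(A + \delta A)x = (b + \delta b)$.

Before calling this routine:
call the factorization routine ?gbtrf
call the solver routine ?gbtrs.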
Input Parameters

kl INTEGER. The number of sub-diagonals within the band of A; kl ≥ 0.
ku INTEGER. The number of super-diagonals within the band of A; ku ≥ 0.
ab,afb,b,x,work REAL for sgbrfs

DOUBLE PRECISION for dgbrfs
COMPLEX for cgbrfs
DOUBLE COMPLEX for zgbrfs.
Arrays:
ab(size ldab by *) contains the original band matrix A, as supplied
to ?gbtrf, but stored in rows from 1 to kl + ku + 1.
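afb(size ldafb by *) contains the factored band matrix A, as returned
by ?gbtrf.

work(*) is a workspace array.
The second dimension of ab and afb must be at least max(1, n); the
second dimension of b and x must be at least max(1, nrhs); the
dimension of work must be at least max(1, 3*n) for real flavors and
max(1, 2*n) for complex flavors.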
ldab INTEGER. The leading dimension of ab.
ldafb INTEGER. The leading dimension of afb.
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?gbtrf.
rwork REAL for cgbrfs

DOUBLE PRECISION for zgbrfs.
Output Parameters

info INTEGER. If info =0, the execution is successful.


Specific details for the routine gbrfs interface are as follows:
ab Holds the array A of size (kl+ku+1,n).
afb Holds the array AF of size (2*kl+ku+1,n).
ku Restored as ku = lda-kl-1.
Application Notes
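The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.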
For each right-hand side, computation of the backward error involves a minimum of 4n(kl + ku)
floating-point operations (for real flavors) or 16n(kl + ku) operations (for complex flavors). In addition, each step of
iterative refinement involves 2n(4kl + 3ku) operations (for real flavors) or 8n(4kl + 3ku) operations (for
complex flavors); the number of iterations may range from 1 to 5. Estimating the forward error involves
solving a number of systems of linear equations A*x = b; the number is usually 4 or 5 and never more than
11. Each solution requires approximately 2n^2 floating-point operations for real flavors or 8n^2 for complex
flavors.
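The operation counts above can be combined into a rough cost estimate per right-hand side. The following is a minimal sketch in Python, used here only for illustration; the function name gbrfs_flops and its arguments are hypothetical and are not part of the library. It assumes the number of refinement steps and the number of solves are supplied by the caller, since the routine does not report them.

```python
def gbrfs_flops(n, kl, ku, refine_steps=1, solves=4, complex_flavor=False):
    """Rough per-right-hand-side operation count built from the figures quoted above.

    Hypothetical helper for illustration; not an MKL function.
    """
    scale = 4 if complex_flavor else 1          # complex counts are 4x the real ones
    backward = 4 * n * (kl + ku) * scale        # minimum cost of the backward error
    refine = 2 * n * (4 * kl + 3 * ku) * scale  # cost of one refinement step
    solve = 2 * n * n * scale                   # cost of one solve in the forward-error estimate
    return backward + refine_steps * refine + solves * solve


if __name__ == "__main__":
    # One refinement step and four solves for a real system with n = 1000, kl = 2, ku = 3.
    print(gbrfs_flops(1000, 2, 3))
```

With these figures the solves used for the forward error estimate dominate the total, so the estimate is most sensitive to the assumed number of solves.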
See Also
?gbrfsx
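Uses extra precise iterative refinement to improve the
solution to the system of linear equations with a
banded coefficient matrix A and provides error bounds
and backward error estimates.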
Syntax
call sgbrfsx( trans, equed, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, r, c, b, ldb,
x, ldx, rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
iwork, info )
call dgbrfsx( trans, equed, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, r, c, b, ldb,
x, ldx, rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
iwork, info )
call cgbrfsx( trans, equed, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, r, c, b, ldb,
x, ldx, rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
rwork, info )
call zgbrfsx( trans, equed, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, r, c, b, ldb,
x, ldx, rcond, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
rwork, info )
Include Files
mkl.fi, lapack.f90
Description
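The original system of linear equations may have been equilibrated before calling this routine, as described
by the parameters equed, r, and c below. In this case, the solution and error bounds returned are for the
original unequilibrated system.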
Input Parameters

If trans = 'C', the system has the form A^H*X = B (conjugate transpose
for complex flavors, transpose for real flavors).

routine.
been replaced by diag(r)*A*diag(c). The right-hand side B has been
ab, afb, b, work REAL for sgbrfsx

DOUBLE PRECISION for dgbrfsx
COMPLEX for cgbrfsx
DOUBLE COMPLEX for zgbrfsx.
Arrays: ab(ldab,*), afb(ldafb,*), b(ldb,*), work(*).
The array ab contains the original matrix A in band storage, in rows 1 to
kl+ku+1. The j-th column of A is stored in the j-th column of the array ab as
follows:
ab(ku+1+i-j,j) = A(i,j) for max(1,j-ku) ≤ i ≤ min(n,j+kl).
The array afb contains details of the LU factorization of the banded matrix
A as computed by ?gbtrf. U is stored as an upper triangular banded matrix
with kl + ku superdiagonals in rows 1 to kl + ku + 1. The multipliers used
during the factorization are stored in rows kl + ku + 2 to 2*kl + ku + 1.
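The band-storage mapping ab(ku+1+i-j,j) = A(i,j) can be checked with a small sketch. The Python below is illustrative only and is not part of the Fortran interface; the helper name to_band_storage is hypothetical. It keeps the 1-based indices of the text and subtracts 1 only when touching the Python lists.

```python
def to_band_storage(a, kl, ku):
    """Pack a dense n-by-n matrix (list of rows) into a (kl+ku+1)-by-n band array.

    Hypothetical helper for illustration; positions outside the band stay 0.0.
    """
    n = len(a)
    ab = [[0.0] * n for _ in range(kl + ku + 1)]
    for j in range(1, n + 1):                               # 1-based column index
        for i in range(max(1, j - ku), min(n, j + kl) + 1):  # rows inside the band
            # ab(ku+1+i-j, j) = A(i, j), shifted to 0-based list indices
            ab[ku + 1 + i - j - 1][j - 1] = a[i - 1][j - 1]
    return ab


if __name__ == "__main__":
    a = [[1, 2, 0, 0],
         [3, 4, 5, 0],
         [0, 6, 7, 8],
         [0, 0, 9, 10]]
    for row in to_band_storage(a, kl=1, ku=1):
        print(row)
```

For this tridiagonal example the first row of the band array holds the superdiagonal, the second the main diagonal, and the third the subdiagonal. The factored array afb needs kl additional rows, as described above.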
max(1,nrhs).
work(*) is a workspace array. The dimension of work must be at least
ldab INTEGER. The leading dimension of the array ab; ldab ≥ kl+ku+1.
ldafb INTEGER. The leading dimension of the array afb; ldafb ≥ 2*kl+ku+1.
ipiv INTEGER.
Array, size at least max(1, n). Contains the pivot indices as computed by
?gbtrf; for row 1 ≤ i ≤ n, row i of the matrix was interchanged with row
ipiv(i).

Arrays: r(n), c(n). The array r contains the row scale factors for A, and
the array c contains the column scale factors for A.
If equed = 'R' or 'B', A is multiplied on the left by diag(r); if equed =
'N' or 'C', r is not accessed.
If equed = 'R' or 'B', each element of r must be positive.

If equed = 'C' or 'B', each element of c must be positive.

x REAL for sgbrfsx
DOUBLE PRECISION for dgbrfsx
COMPLEX for cgbrfsx
DOUBLE COMPLEX for zgbrfsx.
Array, size (ldx,*).
The solution matrix X as computed by sgbtrs/dgbtrs for real flavors or
cgbtrs/zgbtrs for complex flavors.
n_err_bnds INTEGER. Number of error bounds to return for each right-hand side and


0.0, that entry will be filled with the default value used for that parameter.
Only positions up to nparams are accessed; defaults are used for higher-
(for single precision flavors), 1.0D+0 (for double precision flavors).

are computed.

double precision.

refinement.
Default 10.0


convergence).
only.

Output Parameters
x REAL for sgbrfsx

COMPLEX for cgbrfsx





side.
fields:



are

approximately 1.

follows:


side.
fields:

err=2 "Guaranteed" error bound. The estimated



are


params(1), params(2), and params(3). In such a case, the corresponding


0.0 or err_bnds_comp(j,1) = 0.0. See the definition of err_bnds_norm
and err_bnds_comp for err = 1. To get information about all of the right-
hand sides, check err_bnds_norm or err_bnds_comp.
See Also
?gtrfs
Refines the solution of a system of linear equations
with a tridiagonal coefficient matrix and estimates its
error.
Syntax
call sgtrfs( trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x, ldx,
ferr, berr, work, iwork, info )
call dgtrfs( trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x, ldx,
ferr, berr, work, iwork, info )
call cgtrfs( trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x, ldx,
ferr, berr, work, rwork, info )
call zgtrfs( trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x, ldx,
ferr, berr, work, rwork, info )
call gtrfs( dl, d, du, dlf, df, duf, du2, ipiv, b, x [,trans] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
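The routine performs an iterative refinement of the solution to a system of linear equations A*X = B or A^T*X
= B or A^H*X = B with a tridiagonal matrix A, with multiple right-hand sides. For each computed solution
vector x, the routine computes the component-wise backward error β. This error is the smallest relative
perturbation in elements of A and b such that x is the exact solution of the perturbed system:
$|\delta a_{ij}| \le \beta |a_{ij}|$, $|\delta b_i| \le \beta |b_i|$ such that $(A + \delta A)x = (b + \delta b)$.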
Before calling this routine, call the factorization routine ?gttrf and then the solver routine ?gttrs.
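The component-wise backward error described above has a well-known closed form (the Oettli-Prager result): the largest ratio of |b - A*x| to |A|*|x| + |b|, taken over the rows. The Python sketch below is illustrative only; the function name is hypothetical, a dense matrix is used for brevity, and rows with a zero denominator are simply skipped, which is a simplification of what a production routine would do.

```python
def componentwise_backward_error(a, x, b):
    """Return max_i |b - A x|_i / (|A| |x| + |b|)_i for a dense matrix a (list of rows).

    Hypothetical helper for illustration; rows whose denominator is zero are skipped.
    """
    n = len(b)
    berr = 0.0
    for i in range(n):
        residual = b[i] - sum(a[i][j] * x[j] for j in range(n))
        scale = sum(abs(a[i][j]) * abs(x[j]) for j in range(n)) + abs(b[i])
        if scale > 0.0:
            berr = max(berr, abs(residual) / scale)
    return berr


if __name__ == "__main__":
    a = [[2.0, 1.0], [1.0, 3.0]]
    b = [3.0, 4.0]
    print(componentwise_backward_error(a, [1.0, 1.0], b))    # exact solution
    print(componentwise_backward_error(a, [1.0, 1.01], b))   # slightly perturbed solution
```

An exact solution gives a backward error of zero; a solution that is accurate to working precision gives a value near the machine epsilon.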
Input Parameters

dl,d,du,dlf,df,duf,du2,b,x,work REAL for sgtrfs
DOUBLE PRECISION for dgtrfs
COMPLEX for cgtrfs
DOUBLE COMPLEX for zgtrfs.
Arrays:
dl, dimension (n -1), contains the subdiagonal elements of A.
d, dimension (n), contains the diagonal elements of A.
du, dimension (n -1), contains the superdiagonal elements of A.
dlf, dimension (n -1), contains the (n - 1) multipliers that define the
matrix L from the LU factorization of A as computed by ?gttrf.
df, dimension (n), contains the n diagonal elements of the upper

triangular matrix U from the LU factorization of A.
duf, dimension (n -1), contains the (n - 1) elements of the first

superdiagonal of U.
du2, dimension (n -2), contains the (n - 2) elements of the second
superdiagonal of U.
b(ldb,nrhs) contains the right-hand side matrix B.
x(ldx,nrhs) contains the solution matrix X, as computed by ?gttrs.
work(*) is a workspace array; the dimension of work must be at least
max(1, 3*n) for real flavors and max(1, 2*n) for complex flavors.
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?gttrf.
rwork REAL for cgtrfs

DOUBLE PRECISION for zgtrfs.
Workspace array, size (n). Used for complex flavors only.
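The three-vector storage of A (dl, d, du) can be illustrated with a matrix-vector product, which is the basic operation behind the residual b - A*x used in refinement. The Python sketch below is illustrative only and is not part of the Fortran interface; the function name tridiag_matvec is hypothetical.

```python
def tridiag_matvec(dl, d, du, x):
    """Compute y = A*x for a tridiagonal A stored as subdiagonal dl, diagonal d, superdiagonal du.

    Hypothetical helper for illustration; dl and du have length n-1, d has length n.
    """
    n = len(d)
    y = [d[i] * x[i] for i in range(n)]   # diagonal contribution
    for i in range(n - 1):
        y[i] += du[i] * x[i + 1]          # superdiagonal element A(i, i+1)
        y[i + 1] += dl[i] * x[i]          # subdiagonal element A(i+1, i)
    return y


if __name__ == "__main__":
    # A = [[1, 2, 0], [3, 4, 5], [0, 6, 7]]
    print(tridiag_matvec([3, 6], [1, 4, 7], [2, 5], [1, 1, 1]))  # [3, 12, 13]
```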
Output Parameters

Arrays, size at least max(1,nrhs). Contain the component-wise
forward and backward errors, respectively, for each solution vector.


Specific details for the routine gtrfs interface are as follows:
dlf Holds the vector of length (n-1).
df Holds the vector of length n.
duf Holds the vector of length (n-1).
b Holds the matrix B of size (n,nrhs).
x Holds the matrix X of size (n,nrhs).
See Also
?porfs
Refines the solution of a system of linear equations
with a symmetric (Hermitian) positive-definite
coefficient matrix and estimates its error.
Syntax
call sporfs( uplo, n, nrhs, a, lda, af, ldaf, b, ldb, x, ldx, ferr, berr, work, iwork,
info )
call dporfs( uplo, n, nrhs, a, lda, af, ldaf, b, ldb, x, ldx, ferr, berr, work, iwork,
info )
call cporfs( uplo, n, nrhs, a, lda, af, ldaf, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call zporfs( uplo, n, nrhs, a, lda, af, ldaf, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call porfs( a, af, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine performs an iterative refinement of the solution to a system of linear equations A*X = B with a
symmetric (Hermitian) positive definite matrix A, with multiple right-hand sides. For each computed solution
vector x, the routine computes the component-wise backward error β.
Before calling this routine, call the factorization routine ?potrf and then the solver routine ?potrs.
Input Parameters

a,af,b,x,work REAL for sporfs

DOUBLE PRECISION for dporfs
COMPLEX for cporfs
DOUBLE COMPLEX for zporfs.
Arrays:
a(lda,*) contains the original matrix A, as supplied to ?potrf.
af(ldaf,*) contains the factored matrix A, as returned by ?potrf.
b(ldb,*) contains the right-hand side matrix B.
x(ldx,*) contains the solution matrix X.
rwork REAL for cporfs

DOUBLE PRECISION for zporfs.
Output Parameters


Specific details for the routine porfs interface are as follows:
af Holds the matrix AF of size (n,n).
Application Notes
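The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
Estimating the forward error involves solving a number of systems of linear
equations A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires
approximately 2n^2 floating-point operations for real flavors or 8n^2 for complex flavors.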
See Also
?porfsx
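Uses extra precise iterative refinement to improve the
solution to the system of linear equations with a
symmetric/Hermitian positive-definite coefficient
matrix A and provides error bounds and backward
error estimates.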
Syntax
call sporfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, s, b, ldb, x, ldx, rcond, berr,
n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork, info )
call dporfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, s, b, ldb, x, ldx, rcond, berr,
n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork, info )
call cporfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, s, b, ldb, x, ldx, rcond, berr,
n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork, info )
call zporfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, s, b, ldb, x, ldx, rcond, berr,
n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork, info )
Include Files
mkl.fi, lapack.f90
Description
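The original system of linear equations may have been equilibrated before calling this routine, as described
by the parameters equed and s below. In this case, the solution and error bounds returned are for the
original unequilibrated system.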
Input Parameters

equed CHARACTER*1. Must be 'N' or 'Y'.

routine.
If equed = 'Y', both row and column equilibration was done, that is, A has
been replaced by diag(s)*A*diag(s). The right-hand side B has been
a, af, b, work REAL for sporfsx

DOUBLE PRECISION for dporfsx
COMPLEX for cporfsx
DOUBLE COMPLEX for zporfsx.
Arrays: a(lda,*), af(ldaf,*), b(ldb,*), work(*).
The array a contains the symmetric/Hermitian matrix A as specified by

uplo. If uplo = 'U', the leading n-by-n upper triangular part of a contains
the upper triangular part of the matrix A and the strictly lower triangular
part of a is not referenced. If uplo = 'L', the leading n-by-n lower
triangular part of a contains the lower triangular part of the matrix A and
the strictly upper triangular part of a is not referenced. The second
dimension of a must be at least max(1,n).
The array af contains the triangular factor L or U from the Cholesky
factorization A = U**T*U or A = L*L**T as computed by ?potrf.
max(1,nrhs).
s REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Array of size n. The array s contains the scale factors for A.
If equed = 'N', s is not accessed.
If equed = 'Y', each element of s must be positive.
Each element of s should be a power of the radix to ensure a reliable

x REAL for sporfsx

COMPLEX for cporfsx
The solution matrix X as computed by ?potrs




are computed.

double precision.

refinement.
Default 10.0


convergence).
only.

Output Parameters
x REAL for sporfsx

COMPLEX for cporfsx



side.
fields:



are

approximately 1.

follows:


side.
fields:



are


Output parameter only if the input contains erroneous values, namely in
params(1), params(2), or params(3). In such a case, the corresponding


See Also
?pprfs
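Refines the solution of a system of linear equations
with a symmetric (Hermitian) positive-definite
coefficient matrix stored in a packed format and
estimates its error.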
Syntax
call spprfs( uplo, n, nrhs, ap, afp, b, ldb, x, ldx, ferr, berr, work, iwork, info )
call dpprfs( uplo, n, nrhs, ap, afp, b, ldb, x, ldx, ferr, berr, work, iwork, info )
call cpprfs( uplo, n, nrhs, ap, afp, b, ldb, x, ldx, ferr, berr, work, rwork, info )
call zpprfs( uplo, n, nrhs, ap, afp, b, ldb, x, ldx, ferr, berr, work, rwork, info )
call pprfs( ap, afp, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
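The routine performs an iterative refinement of the solution to a system of linear equations A*X = B with a
packed symmetric (Hermitian) positive definite matrix A, with multiple right-hand sides. For each computed solution
vector x, the routine computes the component-wise backward error β.
Finally, the routine estimates the component-wise forward error in the computed solution
$\|x - x_e\|_\infty / \|x\|_\infty$
where $x_e$ is the exact solution.
Before calling this routine, call the factorization routine ?pptrf and then the solver routine ?pptrs.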
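The forward error ratio defined above can be evaluated directly when a reference solution is available, for example in a test with a manufactured right-hand side. The Python sketch below is illustrative only; the function name is hypothetical, and it assumes the infinity norm (largest absolute component) in both numerator and denominator. In normal use the exact solution is unknown, which is why the routine returns an estimated bound in ferr instead.

```python
def relative_forward_error(x, xe):
    """Return ||x - xe||_inf / ||x||_inf for a computed solution x and a reference xe.

    Hypothetical helper for illustration; assumes x is not the zero vector.
    """
    num = max(abs(xi - xei) for xi, xei in zip(x, xe))
    den = max(abs(xi) for xi in x)
    return num / den


if __name__ == "__main__":
    print(relative_forward_error([1.0, 2.0, 4.0], [1.0, 2.5, 4.0]))  # 0.125
```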
Input Parameters

ap, afp, b, x, work REAL for spprfs

DOUBLE PRECISION for dpprfs
COMPLEX for cpprfs
DOUBLE COMPLEX for zpprfs.
Arrays:
ap(*) contains the original matrix A in a packed format, as supplied
to ?pptrf.
afp(*) contains the factored matrix A in a packed format, as

returned by ?pptrf.

The dimension of arrays ap and afp must be at least max(1, n(n+1)/2);
the second dimension of b and x must be at least max(1,
nrhs); the dimension of work must be at least max(1, 3*n) for real
flavors and max(1, 2*n) for complex flavors.
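The n(n+1)/2 length follows from storing one triangle of A column by column. The Python sketch below is illustrative only; the helper name pack_upper is hypothetical, and it assumes the usual LAPACK packed layout for uplo = 'U', in which A(i,j) with i ≤ j is stored at position i + j(j-1)/2 (1-based).

```python
def pack_upper(a):
    """Pack the upper triangle of a dense n-by-n matrix (list of rows) column by column.

    Hypothetical helper for illustration; assumes the uplo = 'U' packed layout.
    """
    n = len(a)
    ap = [0.0] * (n * (n + 1) // 2)
    for j in range(1, n + 1):              # 1-based column index
        for i in range(1, j + 1):          # rows on or above the diagonal
            ap[i + j * (j - 1) // 2 - 1] = a[i - 1][j - 1]
    return ap


if __name__ == "__main__":
    a = [[1, 2, 4],
         [2, 3, 5],
         [4, 5, 6]]
    print(pack_upper(a))  # [1, 2, 3, 4, 5, 6]
```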
rwork REAL for cpprfs

DOUBLE PRECISION for zpprfs.
Output Parameters
ferr, berr REAL for single precision flavors.



Specific details for the routine pprfs interface are as follows:
afp Holds the array AF of size (n*(n+1)/2).
Application Notes
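The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.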
iterations may range from 1 to 5.
Estimating the forward error involves solving a number of systems of linear equations A*x = b; the number
of systems is usually 4 or 5 and never more than 11. Each solution requires approximately 2n^2 floating-point
operations for real flavors or 8n^2 for complex flavors.
See Also
?pbrfs
Refines the solution of a system of linear equations
with a band symmetric (Hermitian) positive-definite
coefficient matrix and estimates its error.
Syntax
call spbrfs( uplo, n, kd, nrhs, ab, ldab, afb, ldafb, b, ldb, x, ldx, ferr, berr,
work, iwork, info )
call dpbrfs( uplo, n, kd, nrhs, ab, ldab, afb, ldafb, b, ldb, x, ldx, ferr, berr,
work, iwork, info )
call cpbrfs( uplo, n, kd, nrhs, ab, ldab, afb, ldafb, b, ldb, x, ldx, ferr, berr,
work, rwork, info )
call zpbrfs( uplo, n, kd, nrhs, ab, ldab, afb, ldafb, b, ldb, x, ldx, ferr, berr,
work, rwork, info )
call pbrfs( ab, afb, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
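The routine performs an iterative refinement of the solution to a system of linear equations A*X = B with a
symmetric (Hermitian) positive definite band matrix A, with multiple right-hand sides. For each computed
solution vector x, the routine computes the component-wise backward error β. This error is the smallest
relative perturbation in elements of A and b such that x is the exact solution of the perturbed system:
$|\delta a_{ij}| \le \beta |a_{ij}|$, $|\delta b_i| \le \beta |b_i|$ such that $(A + \delta A)x = (b + \delta b)$.
Before calling this routine, call the factorization routine ?pbtrf and then the solver routine ?pbtrs.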
Input Parameters


kd INTEGER. The number of superdiagonals or subdiagonals in the matrix
A; kd ≥ 0.
ab,afb,b,x,work REAL for spbrfs

DOUBLE PRECISION for dpbrfs
COMPLEX for cpbrfs
DOUBLE COMPLEX for zpbrfs.
Arrays:
ab(ldab,*) contains the original band matrix A, as supplied to ?pbtrf.
afb(ldafb,*) contains the factored band matrix A, as returned by ?pbtrf.
The second dimension of ab and afb must be at least max(1, n).
ldab INTEGER. The leading dimension of ab; ldab ≥ kd + 1.
ldafb INTEGER. The leading dimension of afb; ldafb ≥ kd + 1.
rwork REAL for cpbrfs

DOUBLE PRECISION for zpbrfs.
Output Parameters



Specific details for the routine pbrfs interface are as follows:
ab Holds the array A of size (kd+1, n).
afb Holds the array AF of size (kd+1, n).
Application Notes
The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
For each right-hand side, computation of the backward error involves a minimum of 8n*kd floating-point
operations (for real flavors) or 32n*kd operations (for complex flavors). In addition, each step of iterative
refinement involves 12n*kd operations (for real flavors) or 48n*kd operations (for complex flavors); the
number of iterations may range from 1 to 5.
Estimating the forward error involves solving a number of systems of linear equations A*x = b; the number
is usually 4 or 5 and never more than 11. Each solution requires approximately 4n*kd floating-point
operations for real flavors or 16n*kd for complex flavors.
See Also
?ptrfs
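Refines the solution of a system of linear equations
with a symmetric (Hermitian) positive-definite
tridiagonal coefficient matrix and estimates its error.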
Syntax
call sptrfs( n, nrhs, d, e, df, ef, b, ldb, x, ldx, ferr, berr, work, info )
call dptrfs( n, nrhs, d, e, df, ef, b, ldb, x, ldx, ferr, berr, work, info )
call cptrfs( uplo, n, nrhs, d, e, df, ef, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call zptrfs( uplo, n, nrhs, d, e, df, ef, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call ptrfs( d, df, e, ef, b, x [,ferr] [,berr] [,info] )
call ptrfs( d, df, e, ef, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
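The routine performs an iterative refinement of the solution to a system of linear equations A*X = B with a
symmetric (Hermitian) positive definite tridiagonal matrix A, with multiple right-hand sides. For each
computed solution vector x, the routine computes the component-wise backward error β. This error is the
smallest relative perturbation in elements of A and b such that x is the exact solution of the perturbed
system:
$|\delta a_{ij}| \le \beta |a_{ij}|$, $|\delta b_i| \le \beta |b_i|$ such that $(A + \delta A)x = (b + \delta b)$.
Before calling this routine, call the factorization routine ?pttrf and then the solver routine ?pttrs.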
Input Parameters
uplo CHARACTER*1. Used for complex flavors only. Must be 'U' or 'L'.
Specifies whether the superdiagonal or the subdiagonal of the
tridiagonal matrix A is stored and how A is factored:
If uplo = 'U', the array e stores the superdiagonal of A, and A is
factored as U^H*D*U.
If uplo = 'L', the array e stores the subdiagonal of A, and A is
factored as L*D*L^H.
d, df, rwork REAL for single precision flavors DOUBLE PRECISION for double
precision flavors
Arrays: d(n), df(n), rwork(n).
The array d contains the n diagonal elements of the tridiagonal matrix

A.
The array df contains the n diagonal elements of the diagonal matrix
D from the factorization of A as computed by ?pttrf.
The array rwork is a workspace array used for complex flavors only.
e,ef,b,x,work REAL for sptrfs

DOUBLE PRECISION for dptrfs
COMPLEX for cptrfs
DOUBLE COMPLEX for zptrfs.
Arrays: e(n -1), ef(n -1), b(ldb,nrhs), x(ldx,nrhs), work(*).
The array e contains the (n - 1) off-diagonal elements of the

tridiagonal matrix A (see uplo).
The array ef contains the (n - 1) off-diagonal elements of the unit
bidiagonal factor U or L from the factorization computed by ?pttrf
(see uplo).
The array x contains the solution matrix X as computed by ?pttrs.
The array work is a workspace array. The dimension of work must be

at least 2*n for real flavors, and at least n for complex flavors.
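The relationship between the original arrays d, e and the factored arrays df, ef can be checked by multiplying the factors back together. The Python sketch below is illustrative only and covers the real case, where A = L*D*L^T with a unit lower bidiagonal L; the function name tridiag_from_ldlt is hypothetical.

```python
def tridiag_from_ldlt(df, ef):
    """Rebuild the diagonal d and off-diagonal e of A = L*D*L^T (real case).

    df holds the diagonal of D, ef the subdiagonal of the unit bidiagonal factor L.
    Hypothetical helper for illustration; not an MKL function.
    """
    n = len(df)
    # d(1) = df(1); d(i+1) = df(i+1) + ef(i)^2 * df(i)
    d = [df[0]] + [df[i] + ef[i - 1] ** 2 * df[i - 1] for i in range(1, n)]
    # e(i) = ef(i) * df(i)
    e = [ef[i] * df[i] for i in range(n - 1)]
    return d, e


if __name__ == "__main__":
    # Factors of the matrix with diagonal (4, 5, 6) and off-diagonal (2, 3).
    print(tridiag_from_ldlt([4.0, 4.0, 3.75], [0.5, 0.75]))
```

Passing factored arrays that do not reproduce d and e in this way means the refinement would be applied to a different matrix than the one that was solved.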
Output Parameters

info INTEGER.

Specific details for the routine ptrfs interface are as follows:
ef Holds the vector of length (n-1).
uplo Used in complex flavors only. Must be 'U' or 'L'. The default value is
'U'.
See Also
?syrfs
Refines the solution of a system of linear equations
with a symmetric coefficient matrix and estimates its
error.
Syntax
call ssyrfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call dsyrfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call csyrfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call zsyrfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call syrfs( a, af, ipiv, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
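The routine performs an iterative refinement of the solution to a system of linear equations A*X = B with a
symmetric full-storage matrix A, with multiple right-hand sides. For each computed solution vector x, the
routine computes the component-wise backward error β. This error is the smallest relative perturbation in
elements of A and b such that x is the exact solution of the perturbed system:
$|\delta a_{ij}| \le \beta |a_{ij}|$, $|\delta b_i| \le \beta |b_i|$ such that $(A + \delta A)x = (b + \delta b)$.
Before calling this routine, call the factorization routine ?sytrf and then the solver routine ?sytrs.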
Input Parameters

a,af,b,x,work REAL for ssyrfs

DOUBLE PRECISION for dsyrfs
COMPLEX for csyrfs
DOUBLE COMPLEX for zsyrfs.
Arrays:
a(lda,*) contains the original matrix A, as supplied to ?sytrf.

af(ldaf,*) contains the factored matrix A, as returned by ?sytrf.
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?sytrf.
rwork REAL for csyrfs

DOUBLE PRECISION for zsyrfs.
Output Parameters



Specific details for the routine syrfs interface are as follows:
Application Notes
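The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
Estimating the forward error involves solving a number of systems of linear
equations A*x = b; the number is usually 4 or 5 and never more than 11. Each solution requires
approximately 2n^2 floating-point operations for real flavors or 8n^2 for complex flavors.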
See Also
?syrfsx
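Uses extra precise iterative refinement to improve the
solution to the system of linear equations with a
symmetric indefinite coefficient matrix A and provides
error bounds and backward error estimates.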
Syntax
call ssyrfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, ipiv, s, b, ldb, x, ldx, rcond,
berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork, info )
call dsyrfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, ipiv, s, b, ldb, x, ldx, rcond,
berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork, info )
call csyrfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, ipiv, s, b, ldb, x, ldx, rcond,
berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork, info )
call zsyrfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, ipiv, s, b, ldb, x, ldx, rcond,
berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routine improves the computed solution to a system of linear equations when the coefficient matrix is
symmetric indefinite, and provides error bounds and backward error estimates for the solution. In addition to
a normwise error bound, the code provides a maximum componentwise error bound, if possible. See
comments for err_bnds_norm and err_bnds_comp for details of the error bounds.
Input Parameters


routine.
a, af, b, work REAL for ssyrfsx

DOUBLE PRECISION for dsyrfsx
COMPLEX for csyrfsx
DOUBLE COMPLEX for zsyrfsx.
The array a contains the symmetric/Hermitian matrix A as specified by

uplo. If uplo = 'U', the leading n-by-n upper triangular part of a contains
the upper triangular part of the matrix A and the strictly lower triangular
part of a is not referenced. If uplo = 'L', the leading n-by-n lower
triangular part of a contains the lower triangular part of the matrix A and
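the strictly upper triangular part of a is not referenced. The second
dimension of a must be at least max(1,n).
The array af contains the block diagonal matrix D and the multipliers used
to obtain the factor U or L from the factorization A = U*D*U^T or A =
L*D*L^T as computed by ?sytrf.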
max(1,nrhs).
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges and the
block structure of D as determined by ?sytrf.

Array, size (n). The array s contains the scale factors for A.

x REAL for ssyrfsx

COMPLEX for csyrfsx
The solution matrix X as computed by ?sytrs




are computed.

double precision.

refinement.
Default 10.0


convergence).
only.

Output Parameters
x REAL for ssyrfsx

COMPLEX for csyrfsx




side.
The second index in err_bnds_norm(:,err) contains the following three
fields:



are:


follows:


side.
fields:



are:



See Also
?herfs
Refines the solution of a system of linear equations
with a complex Hermitian coefficient matrix and
estimates its error.
Syntax
call cherfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call zherfs( uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call herfs( a, af, ipiv, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
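The routine performs an iterative refinement of the solution to a system of linear equations A*X = B with a
complex Hermitian full-storage matrix A, with multiple right-hand sides. For each computed solution vector x,
the routine computes the component-wise backward error β. This error is the smallest relative perturbation
in elements of A and b such that x is the exact solution of the perturbed system:
$|\delta a_{ij}| \le \beta |a_{ij}|$, $|\delta b_i| \le \beta |b_i|$ such that $(A + \delta A)x = (b + \delta b)$.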
Before calling this routine, call the factorization routine ?hetrf and then the solver routine ?hetrs.
Input Parameters

a,af,b,x,work COMPLEX for cherfs

DOUBLE COMPLEX for zherfs.
Arrays:
a(size lda by *) contains the original matrix A, as supplied to ?hetrf.
af(size ldaf by *) contains the factored matrix A, as returned
by ?hetrf.

dimension of work must be at least max(1, 2*n).
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?hetrf.
rwork REAL for cherfs

DOUBLE PRECISION for zherfs.
Output Parameters
ferr, berr REAL for cherfs
DOUBLE PRECISION for zherfs.


Specific details for the routine herfs interface are as follows:
Application Notes
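The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.
For each right-hand side, computation of the backward error involves a minimum of 16n^2 operations. In
addition, each step of iterative refinement involves 24n^2 operations; the number of iterations may range
from 1 to 5.
Estimating the forward error involves solving a number of systems of linear equations A*x = b; the number
is usually 4 or 5 and never more than 11. Each solution requires approximately 8n^2 floating-point operations.
The real counterparts of this routine are ssyrfs/dsyrfs.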
See Also
?herfsx
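Uses extra precise iterative refinement to improve the
solution to the system of linear equations with a
Hermitian indefinite coefficient matrix A and provides
error bounds and backward error estimates.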
Syntax
call cherfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, ipiv, s, b, ldb, x, ldx, rcond,
berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork, info )
call zherfsx( uplo, equed, n, nrhs, a, lda, af, ldaf, ipiv, s, b, ldb, x, ldx, rcond,
berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routine improves the computed solution to a system of linear equations when the coefficient matrix is
Hermitian indefinite, and provides error bounds and backward error estimates for the solution. In addition to
a normwise error bound, the code provides a maximum componentwise error bound, if possible. See
comments for err_bnds_norm and err_bnds_comp for details of the error bounds.
Input Parameters


routine.
a, af, b, work COMPLEX for cherfsx

DOUBLE COMPLEX for zherfsx.
The array a contains the Hermitian matrix A as specified by uplo. If uplo =

'U', the leading n-by-n upper triangular part of a contains the upper
triangular part of the matrix A and the strictly lower triangular part of a is
not referenced. If uplo = 'L', the leading n-by-n lower triangular part of a
contains the lower triangular part of the matrix A and the strictly upper
triangular part of a is not referenced. The second dimension of a must be at
least max(1,n).
The array af contains the block diagonal matrix D and the multipliers used
to obtain the factor U or L from the factorization A = U*D*U^H or A =
L*D*L^H as computed by chetrf for cherfsx or zhetrf for zherfsx.
max(1,nrhs).
max(1,2*n).
ipiv INTEGER.
Array, size at least max(1, n). Contains details of the interchanges and the
block structure of D as determined by chetrf for cherfsx or zhetrf for
zherfsx.

Array, size (n). The array s contains the scale factors for A.

x COMPLEX for cherfsx

The solution matrix X as computed by ?hetrs



(for cherfsx), 1.0D+0 (for zherfsx).

are computed.

double precision.

refinement.
Default 10
Aggressive Set to 100 to permit convergence using


convergence).
rwork REAL for cherfsx

DOUBLE PRECISION for zherfsx.
Output Parameters
x COMPLEX for cherfsx

rcond REAL for cherfsx

berr REAL for cherfsx

err_bnds_norm REAL for cherfsx


side.
fields:

threshold sqrt(n)*slamch() for cherfsx and
sqrt(n)*dlamch() for zherfsx.

for cherfsx and sqrt(n)*dlamch() for
zherfsx. This error bound should only be
trusted if the previous boolean is true.

are:

approximately 1.
err_bnds_comp REAL for cherfsx

follows:


side.
fields:

threshold sqrt(n)*slamch() for cherfsx and
sqrt(n)*dlamch() for zherfsx.

for cherfsx and sqrt(n)*dlamch() for
zherfsx. This error bound should only be

are:



See Also
?sprfs
Refines the solution of a system of linear equations
with a packed symmetric coefficient matrix and
estimates the solution error.
Syntax
call ssprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work, iwork,
info )
call dsprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work, iwork,
info )
call csprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call zsprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call sprfs( ap, afp, ipiv, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
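The routine performs an iterative refinement of the solution to a system of linear equations A*X = B with a
packed symmetric matrix A, with multiple right-hand sides. For each computed solution vector x, the routine
computes the component-wise backward error β. This error is the smallest relative perturbation in elements
of A and b such that x is the exact solution of the perturbed system:
$|\delta a_{ij}| \le \beta |a_{ij}|$, $|\delta b_i| \le \beta |b_i|$ such that $(A + \delta A)x = (b + \delta b)$.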
Before calling this routine, call the factorization routine ?sptrf and then the solver routine ?sptrs.
Input Parameters

ap,afp,b,x,work REAL for ssprfs

DOUBLE PRECISION for dsprfs
COMPLEX for csprfs
DOUBLE COMPLEX for zsprfs.
Arrays:
ap (size *) contains the original packed matrix A, as supplied to ?sptrf.
afp (size *) contains the factored packed matrix A, as returned
by ?sptrf.
The dimension of arrays ap and afp must be at least max(1, n(n+1)/2);
the second dimension of b and x must be at least max(1,
nrhs); the dimension of work must be at least max(1, 3*n) for real
flavors and max(1, 2*n) for complex flavors.
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?sptrf.
rwork REAL for csprfs

DOUBLE PRECISION for zsprfs.
Output Parameters



Specific details for the routine sprfs interface are as follows:
Application Notes
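The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.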
iterations may range from 1 to 5.
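Estimating the forward error involves solving a number of systems of linear equations A*x = b; the number
of systems is usually 4 or 5 and never more than 11. Each solution requires approximately 2n^2 floating-point
operations for real flavors or 8n^2 for complex flavors.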
See Also
?hprfs
Refines the solution of a system of linear equations
with a packed complex Hermitian coefficient matrix
and estimates the solution error.
Syntax
call chprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call zhprfs( uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call hprfs( ap, afp, ipiv, b, x [,uplo] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
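The routine performs an iterative refinement of the solution to a system of linear equations A*X = B with a
packed complex Hermitian matrix A, with multiple right-hand sides. For each computed solution vector x, the
routine computes the component-wise backward error β.
Before calling this routine, call the factorization routine ?hptrf and then the solver routine ?hptrs.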
Input Parameters

ap,afp,b,x,work COMPLEX for chprfs

DOUBLE COMPLEX for zhprfs.
Arrays:
ap (size *) contains the original packed matrix A, as supplied to ?hptrf.
afp (size *) contains the factored packed matrix A, as returned
by ?hptrf.
The dimension of arrays ap and afp must be at least max(1, n(n+1)/2);
the second dimension of b and x must be at least
max(1,nrhs); the dimension of work must be at least max(1, 2*n).
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?hptrf.
rwork REAL for chprfs

DOUBLE PRECISION for zhprfs.
Output Parameters
ferr, berr REAL for chprfs.

DOUBLE PRECISION for zhprfs.
Arrays, size at least max(1,nrhs). Contain the component-wise
forward and backward errors, respectively, for each solution vector.


Specific details for the routine hprfs interface are as follows:
Application Notes
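The bounds returned in ferr are not rigorous, but in practice they almost always overestimate the actual
error.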
For each right-hand side, computation of the backward error involves a minimum of 16n^2 operations. In
addition, each step of iterative refinement involves 24n^2 operations; the number of iterations may range
from 1 to 5.
Estimating the forward error involves solving a number of systems of linear equations A*x = b; the number
is usually 4 or 5 and never more than 11. Each solution requires approximately 8n^2 floating-point operations.
The real counterparts of this routine are ssprfs/dsprfs.
See Also
?trrfs
Estimates the error in the solution of a system of
linear equations with a triangular coefficient matrix.
Syntax
call strrfs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call dtrrfs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, x, ldx, ferr, berr, work,
iwork, info )
call ctrrfs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call ztrrfs( uplo, trans, diag, n, nrhs, a, lda, b, ldb, x, ldx, ferr, berr, work,
rwork, info )
call trrfs( a, b, x [,uplo] [,trans] [,diag] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
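The routine estimates the errors in the solution to a system of linear equations A*X = B or A^T*X = B or
A^H*X = B with a triangular matrix A, with multiple right-hand sides. For each computed solution vector x, the
routine computes the component-wise backward error β. This error is the smallest relative perturbation in
elements of A and b such that x is the exact solution of the perturbed system:
$|\delta a_{ij}| \le \beta |a_{ij}|$, $|\delta b_i| \le \beta |b_i|$ such that $(A + \delta A)x = (b + \delta b)$.
The routine also estimates the component-wise forward error in the computed solution
$\|x - x_e\|_\infty / \|x\|_\infty$ (here $x_e$ is the exact solution).
Before calling this routine, call the solver routine ?trtrs.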
Input Parameters



If diag = 'U', then A is unit triangular: diagonal elements of A are

a, b, x, work REAL for strrfs

DOUBLE PRECISION for dtrrfs
COMPLEX for ctrrfs
DOUBLE COMPLEX for ztrrfs.
Arrays:
a(size lda by *) contains the upper or lower triangular matrix A, as
specified by uplo.
The second dimension of a must be at least max(1,n); the second
dimension of b and x must be at least max(1,nrhs); the dimension of
work must be at least max(1,3*n) for real flavors and max(1,2*n)
for complex flavors.
rwork REAL for ctrrfs

DOUBLE PRECISION for ztrrfs.
Output Parameters



Specific details for the routine trrfs interface are as follows:
Application Notes
error.
A call to this routine involves, for each right-hand side, solving a number of systems of linear equations A*x
= b; the number of systems is usually 4 or 5 and never more than 11. Each solution requires approximately
n^2 floating-point operations for real flavors or 4n^2 for complex flavors.
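The following is a minimal sketch of the call sequence described above (solve with ?trtrs, then estimate errors with ?trrfs), using the double precision flavors. The matrix values, the program name, and the iwork length of n (taken from the standard LAPACK specification) are assumptions made for illustration only.

```fortran
program trrfs_sketch
  implicit none
  integer, parameter :: n = 3, nrhs = 1
  double precision :: a(n,n), b(n,nrhs), x(n,nrhs)
  double precision :: ferr(nrhs), berr(nrhs), work(3*n)
  integer :: iwork(n), info
  external :: dtrtrs, dtrrfs

  ! Upper triangular A; entries below the diagonal are not referenced.
  a = 0.0d0
  a(1,1) = 2.0d0; a(1,2) = 1.0d0; a(1,3) = 3.0d0
  a(2,2) = 4.0d0; a(2,3) = 5.0d0
  a(3,3) = 8.0d0
  ! Right-hand side chosen so that the exact solution is (1, 1, 1).
  b(:,1) = [ 6.0d0, 9.0d0, 8.0d0 ]

  ! ?trtrs overwrites its right-hand side, so solve on a copy and keep b.
  x = b
  call dtrtrs('U', 'N', 'N', n, nrhs, a, n, x, n, info)
  if (info /= 0) stop 'dtrtrs failed'

  ! Estimate forward (ferr) and backward (berr) errors of the computed x.
  call dtrrfs('U', 'N', 'N', n, nrhs, a, n, b, n, x, n, &
              ferr, berr, work, iwork, info)
  if (info /= 0) stop 'dtrrfs failed'

  print *, 'x    =', x(:,1)
  print *, 'ferr =', ferr(1), ' berr =', berr(1)
end program trrfs_sketch
```

Keeping b untouched matters here because ?trrfs needs the original right-hand side together with the computed solution.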
See Also
?tprfs
Estimates the error in the solution of a system of
linear equations with a packed triangular coefficient
matrix.
Syntax
call stprfs( uplo, trans, diag, n, nrhs, ap, b, ldb, x, ldx, ferr, berr, work, iwork,
info )
call dtprfs( uplo, trans, diag, n, nrhs, ap, b, ldb, x, ldx, ferr, berr, work, iwork,
info )
call ctprfs( uplo, trans, diag, n, nrhs, ap, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call ztprfs( uplo, trans, diag, n, nrhs, ap, b, ldb, x, ldx, ferr, berr, work, rwork,
info )
call tprfs( ap, b, x [,uplo] [,trans] [,diag] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
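The routine estimates the errors in the solution to a system of linear equations A*X = B or A^T*X = B or
A^H*X = B with a packed triangular matrix A, with multiple right-hand sides. For each computed solution
vector x, the routine computes the component-wise backward error.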
Before calling this routine, call the solver routine ?tptrs.
Input Parameters



If diag = 'N', A is not a unit triangular matrix.
If diag = 'U', A is unit triangular: diagonal elements of A are
assumed to be 1 and not referenced in the array ap.

ap, b, x, work REAL for stprfs

DOUBLE PRECISION for dtprfs
COMPLEX for ctprfs
DOUBLE COMPLEX for ztprfs.
Arrays:
ap (size *) contains the upper or lower triangular matrix A, as
specified by uplo.

The dimension of ap must be at least max(1,n(n+1)/2); the second
dimension of b and x must be at least max(1,nrhs); the dimension of
work must be at least max(1,3*n) for real flavors and max(1,2*n)
for complex flavors.
rwork REAL for ctprfs

DOUBLE PRECISION for ztprfs.
Output Parameters



Specific details for the routine tprfs interface are as follows:
Application Notes
error.
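A call to this routine involves, for each right-hand side, solving a number of systems of linear equations A*x
= b; the number of systems is usually 4 or 5 and never more than 11. Each solution requires approximately
n^2 floating-point operations for real flavors or 4n^2 for complex flavors.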
See Also
?tbrfs
Estimates the error in the solution of a system of
linear equations with a triangular band coefficient
matrix.
Syntax
call stbrfs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, x, ldx, ferr, berr,
work, iwork, info )
call dtbrfs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, x, ldx, ferr, berr,
work, iwork, info )
call ctbrfs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, x, ldx, ferr, berr,
work, rwork, info )
call ztbrfs( uplo, trans, diag, n, kd, nrhs, ab, ldab, b, ldb, x, ldx, ferr, berr,
work, rwork, info )
call tbrfs( ab, b, x [,uplo] [,trans] [,diag] [,ferr] [,berr] [,info] )
Include Files
mkl.fi, lapack.f90
Description
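The routine estimates the errors in the solution to a system of linear equations A*X = B or A^T*X = B or
A^H*X = B with a triangular band matrix A, with multiple right-hand sides. For each computed solution vector
x, the routine computes the component-wise backward error. This error is the smallest relative
perturbation in elements of A and b such that x is the exact solution of the perturbed system.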
Before calling this routine, call the solver routine ?tbtrs.
Input Parameters



If diag = 'N', A is not a unit triangular matrix.

kd INTEGER. The number of super-diagonals or sub-diagonals in the
matrix A; kd ≥ 0.
ab, b, x, work REAL for stbrfs

DOUBLE PRECISION for dtbrfs
COMPLEX for ctbrfs
DOUBLE COMPLEX for ztbrfs.
Arrays:
ab(size ldab by *) contains the upper or lower triangular matrix A, as
specified by uplo, in band storage format.
The second dimension of ab must be at least max(1,n); the second
dimension of b and x must be at least max(1,nrhs). The dimension
of work must be at least max(1,3*n) for real flavors and max(1,2*n)
for complex flavors.
rwork REAL for ctbrfs

DOUBLE PRECISION for ztbrfs.
Output Parameters


Specific details for the routine tbrfs interface are as follows:
Application Notes
error.
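A call to this routine involves, for each right-hand side, solving a number of systems of linear equations A*x
= b; the number of systems is usually 4 or 5 and never more than 11. Each solution requires approximately
2n*kd floating-point operations for real flavors or 8n*kd operations for complex flavors.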
See Also
Matrix Inversion: LAPACK Computational Routines

It is seldom necessary to compute an explicit inverse of a matrix. In particular, do not attempt to solve a
system of equations Ax = b by first computing A^(-1) and then forming the matrix-vector product x = A^(-1)*b.
Call a solver routine instead (see Routines for Solving Systems of Linear Equations); this is more efficient
and more accurate.
However, matrix inversion routines are provided for the rare occasions when an explicit inverse matrix is
needed.
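As an illustration of the recommendation above, the following minimal sketch solves A*x = b with the factorization and solver routines ?getrf and ?getrs rather than forming an explicit inverse. The matrix values and the program name are assumptions chosen for illustration; the right-hand side is picked so that the exact solution is (1, 1, 1).

```fortran
program solve_not_invert
  implicit none
  integer, parameter :: n = 3, nrhs = 1
  double precision :: a(n,n), b(n,nrhs)
  integer :: ipiv(n), info
  external :: dgetrf, dgetrs

  ! Column-major initialization: each source line below is one column of A.
  a = reshape([ 4.0d0, 2.0d0, 1.0d0, &
                2.0d0, 5.0d0, 3.0d0, &
                1.0d0, 3.0d0, 6.0d0 ], [n, n])
  b(:,1) = [ 7.0d0, 10.0d0, 10.0d0 ]

  ! Factorize A = P*L*U once.
  call dgetrf(n, n, a, n, ipiv, info)
  if (info /= 0) stop 'dgetrf failed'

  ! Solve using the factors; b is overwritten by the solution x.
  call dgetrs('N', n, nrhs, a, n, ipiv, b, n, info)
  if (info /= 0) stop 'dgetrs failed'

  print '(3f12.8)', b(:,1)
end program solve_not_invert
```

The same factors can be reused for further right-hand sides, which is another reason an explicit inverse is rarely needed.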
?getri
Computes the inverse of an LU-factored general
matrix.
Syntax
call sgetri( n, a, lda, ipiv, work, lwork, info )
call dgetri( n, a, lda, ipiv, work, lwork, info )
call cgetri( n, a, lda, ipiv, work, lwork, info )
call zgetri( n, a, lda, ipiv, work, lwork, info )

call getri( a, ipiv [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a general matrix A. Before calling this routine, call ?getrf to
factorize A.
Input Parameters
a, work REAL for sgetri

DOUBLE PRECISION for dgetri
COMPLEX for cgetri
DOUBLE COMPLEX for zgetri.
a(lda,*) contains the factorization of the matrix A, as returned by
?getrf: A = P*L*U.
work(*) is a workspace array of dimension at least max(1,lwork).
ipiv INTEGER.
The ipiv array, as returned by ?getrf.
lwork INTEGER. The size of the work array; lwork ≥ n.

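If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the
first entry of the work array, and no error message related to lwork is
issued by xerbla.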
See Application Notes below for the suggested value of lwork.
Output Parameters
a Overwritten by the n-by-n matrix inv(A).

runs.

If info = i, the i-th diagonal element of the factor U is zero, U is
singular, and the inversion could not be completed.

Specific details for the routine getri interface are as follows:
Application Notes
lwork = -1.
The computed inverse X satisfies the following error bound:
$|XA - I| \le c(n)\,\varepsilon\,|X|\,P\,|L|\,|U|$,
where c(n) is a modest linear function of n; $\varepsilon$ is the machine precision; I denotes the identity matrix; P, L,
and U are the factors of the matrix factorization A = P*L*U.
complex flavors.
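A minimal sketch of the usual calling pattern for the double precision flavor follows: factorize with ?getrf, query the workspace size by calling ?getri with lwork = -1 (the standard LAPACK workspace-query convention), then invert. The matrix values, program name, and the residual check at the end are assumptions added for illustration.

```fortran
program getri_sketch
  implicit none
  integer, parameter :: n = 3
  double precision :: a(n,n), a0(n,n), resid(n,n), wquery(1)
  double precision, allocatable :: work(:)
  integer :: ipiv(n), info, lwork, i
  external :: dgetrf, dgetri

  a = reshape([ 2.0d0, 1.0d0, 0.0d0, &
                1.0d0, 3.0d0, 1.0d0, &
                0.0d0, 1.0d0, 4.0d0 ], [n, n])
  a0 = a                      ! keep the original for the residual check

  call dgetrf(n, n, a, n, ipiv, info)
  if (info /= 0) stop 'dgetrf failed'

  ! Workspace query: with lwork = -1 the suggested size is returned in wquery(1).
  call dgetri(n, a, n, ipiv, wquery, -1, info)
  lwork = max(n, int(wquery(1)))
  allocate(work(lwork))

  ! a now holds the LU factors and is overwritten by inv(A).
  call dgetri(n, a, n, ipiv, work, lwork, info)
  if (info /= 0) stop 'dgetri failed'

  ! Form X*A - I, the residual that appears in the error bound.
  resid = matmul(a, a0)
  do i = 1, n
     resid(i,i) = resid(i,i) - 1.0d0
  end do
  print *, 'max |X*A - I| =', maxval(abs(resid))
end program getri_sketch
```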
See Also
?potri
Computes the inverse of a symmetric (Hermitian)
positive-definite matrix using the Cholesky
factorization.
Syntax
call spotri( uplo, n, a, lda, info )
call dpotri( uplo, n, a, lda, info )
call cpotri( uplo, n, a, lda, info )
call zpotri( uplo, n, a, lda, info )
call potri( a [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a symmetric positive definite or, for complex flavors, Hermitian
positive-definite matrix A. Before calling this routine, call ?potrf to factorize A.
Input Parameters

a REAL for spotri

DOUBLE PRECISION for dpotri
COMPLEX for cpotri
DOUBLE COMPLEX for zpotri.
Array a(size lda by *). Contains the factorization of the matrix A, as
returned by ?potrf.
The second dimension of a must be at least max(1, n).
Output Parameters
a Overwritten by the upper or lower triangle of the inverse of A.
info INTEGER.
If info = i, the i-th diagonal element of the Cholesky factor (and
therefore the factor itself) is zero, and the inversion could not be
completed.

Specific details for the routine potri interface are as follows:
Application Notes
The computed inverse X satisfies the following error bounds:
$\|XA - I\|_2 \le c(n)\,\varepsilon\,\kappa_2(A)$, $\|AX - I\|_2 \le c(n)\,\varepsilon\,\kappa_2(A)$,
where c(n) is a modest linear function of n, and $\varepsilon$ is the machine precision; I denotes the identity matrix.
The 2-norm $\|A\|_2$ of a matrix A is defined by $\|A\|_2 = \max_{x \cdot x = 1} (Ax \cdot Ax)^{1/2}$, and the condition number
$\kappa_2(A)$ is defined by $\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2$.
complex flavors.
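The sketch below shows one possible use of the double precision flavor: factorize with ?potrf, invert with ?potri, and then copy the computed triangle into the other half, because only the triangle selected by uplo is overwritten with the inverse. The matrix values and the program name are illustrative assumptions.

```fortran
program potri_sketch
  implicit none
  integer, parameter :: n = 3
  double precision :: a(n,n)
  integer :: i, j, info
  external :: dpotrf, dpotri

  ! A symmetric positive-definite test matrix (illustrative values).
  a = reshape([ 4.0d0, 2.0d0, 1.0d0, &
                2.0d0, 5.0d0, 3.0d0, &
                1.0d0, 3.0d0, 6.0d0 ], [n, n])

  call dpotrf('U', n, a, n, info)
  if (info /= 0) stop 'dpotrf failed (matrix not positive definite?)'

  call dpotri('U', n, a, n, info)
  if (info /= 0) stop 'dpotri failed'

  ! Only the upper triangle holds inv(A); mirror it to get the full matrix.
  do j = 1, n
     do i = j + 1, n
        a(i,j) = a(j,i)
     end do
  end do

  do i = 1, n
     print '(3f12.8)', a(i,:)
  end do
end program potri_sketch
```

For the complex flavors the mirrored entries would need to be conjugated, since the inverse is Hermitian rather than symmetric.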
See Also
?pftri
Computes the inverse of a symmetric (Hermitian)
positive-definite matrix in RFP format using the
Cholesky factorization.
Syntax
call spftri( transr, uplo, n, a, info )
call dpftri( transr, uplo, n, a, info )
call cpftri( transr, uplo, n, a, info )
call zpftri( transr, uplo, n, a, info )
Include Files
mkl.fi, lapack.f90
Description
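The routine computes the inverse inv(A) of a symmetric positive definite or, for complex data, Hermitian
positive-definite matrix A using the Cholesky factorization:
A = U^T*U for real data, A = U^H*U for complex data, if uplo = 'U'
A = L*L^T for real data, A = L*L^H for complex data, if uplo = 'L'
Before calling this routine, call ?pftrf to factorize A.
The matrix A is in the Rectangular Full Packed (RFP) format. For the description of the RFP format, see Matrix
Storage Schemes.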
Input Parameters
transr CHARACTER*1. Must be 'N', 'T' (for real data) or 'C' (for complex
data).
If transr = 'N', the Normal transr of RFP U (if uplo = 'U') or L (if
uplo = 'L') is stored.
If transr = 'T', the Transpose transr of RFP U (if uplo = 'U') or L
(if uplo = 'L') is stored.
If transr = 'C', the Conjugate-Transpose transr of RFP U (if uplo
= 'U') or L (if uplo = 'L') is stored.

If uplo = 'U', A = U^T*U for real data or A = U^H*U for complex data,
and U is stored.
If uplo = 'L', A = L*L^T for real data or A = L*L^H for complex data,
and L is stored.
a REAL for spftri

DOUBLE PRECISION for dpftri
COMPLEX for cpftri
DOUBLE COMPLEX for zpftri.
Array, size (n*(n+1)/2). The array a contains the factor U or L
of the matrix A in the RFP format.
Output Parameters
a The symmetric/Hermitian inverse of the original matrix in the same

storage format.

If info = i, the (i,i) element of the factor U or L is zero, and the

inverse could not be computed.
See Also
?pptri
Computes the inverse of a packed symmetric
(Hermitian) positive-definite matrix using Cholesky
factorization.
Syntax
call spptri( uplo, n, ap, info )
call dpptri( uplo, n, ap, info )
call cpptri( uplo, n, ap, info )
call zpptri( uplo, n, ap, info )
call pptri( ap [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a symmetric positive definite or, for complex flavors, Hermitian
positive-definite matrix A in packed form. Before calling this routine, call ?pptrf to factorize A.
Input Parameters

Indicates whether the upper or lower triangular factor is stored in ap:
If uplo = 'U', then the upper triangular factor is stored.
If uplo = 'L', then the lower triangular factor is stored.
ap REAL for spptri

DOUBLE PRECISION for dpptri
COMPLEX for cpptri
DOUBLE COMPLEX for zpptri.
Array, size at least max(1, n(n+1)/2).
Contains the factorization of the packed matrix A, as returned by
?pptrf.
Output Parameters
ap Overwritten by the packed n-by-n matrix inv(A).
info INTEGER.

If info = i, the i-th diagonal element of the Cholesky factor (and
therefore the factor itself) is zero, and the inversion could not be
completed.

Specific details for the routine pptri interface are as follows:
Application Notes
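The computed inverse X satisfies the following error bounds:
$\|XA - I\|_2 \le c(n)\,\varepsilon\,\kappa_2(A)$, $\|AX - I\|_2 \le c(n)\,\varepsilon\,\kappa_2(A)$,
where c(n) is a modest linear function of n, and $\varepsilon$ is the machine precision; I denotes the identity matrix.
The 2-norm $\|A\|_2$ of a matrix A is defined by $\|A\|_2 = \max_{x \cdot x = 1} (Ax \cdot Ax)^{1/2}$, and the condition number
$\kappa_2(A)$ is defined by $\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2$.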
complex flavors.
See Also
?sytri
Computes the inverse of a symmetric matrix using
U*D*U^T or L*D*L^T Bunch-Kaufman factorization.
Syntax
call ssytri( uplo, n, a, lda, ipiv, work, info )
call dsytri( uplo, n, a, lda, ipiv, work, info )
call csytri( uplo, n, a, lda, ipiv, work, info )
call zsytri( uplo, n, a, lda, ipiv, work, info )
call sytri( a, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a symmetric matrix A. Before calling this routine, call ?sytrf to
factorize A.
Input Parameters

If uplo = 'U', the array a stores the Bunch-Kaufman factorization
A = U*D*U^T.
If uplo = 'L', the array a stores the Bunch-Kaufman factorization
A = L*D*L^T.
a, work REAL for ssytri

DOUBLE PRECISION for dsytri
COMPLEX for csytri
DOUBLE COMPLEX for zsytri.
Arrays:
a(size lda by *) contains the factorization of the matrix A, as returned
by ?sytrf.
work(*) is a workspace array. The dimension of work must be at

least max(1,2*n).
ipiv INTEGER.
The ipiv array, as returned by ?sytrf.
Output Parameters
info INTEGER.
If info = -i, the i-th parameter had an illegal value.
If info = i, the i-th diagonal element of D is zero, D is singular, and the
inversion could not be completed.

Specific details for the routine sytri interface are as follows:
Application Notes
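The computed inverse X satisfies the following error bounds:
$|D U^T P^T X P U - I| \le c(n)\,\varepsilon\,(|D||U^T|P^T|X|P|U| + |D||D^{-1}|)$
for uplo = 'U', and
$|D L^T P^T X P L - I| \le c(n)\,\varepsilon\,(|D||L^T|P^T|X|P|L| + |D||D^{-1}|)$
for uplo = 'L'. Here c(n) is a modest linear function of n, and $\varepsilon$ is the machine precision; I denotes the
identity matrix.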
complex flavors.
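A minimal sketch of the ?sytrf / ?sytri sequence for the double precision flavor is shown below. The matrix values and program name are illustrative assumptions, and the workspace query on ?sytrf (lwork = -1) follows the standard LAPACK convention. A single work array is sized to satisfy both routines, using the max(1,2*n) requirement stated above for ?sytri.

```fortran
program sytri_sketch
  implicit none
  integer, parameter :: n = 3
  double precision :: a(n,n), wquery(1)
  double precision, allocatable :: work(:)
  integer :: ipiv(n), info, lwork, i, j
  external :: dsytrf, dsytri

  ! A symmetric indefinite test matrix (illustrative values).
  a = reshape([ 1.0d0, 2.0d0, 3.0d0, &
                2.0d0, 1.0d0, 4.0d0, &
                3.0d0, 4.0d0, 1.0d0 ], [n, n])

  ! Query the workspace size for the factorization, then make sure the
  ! same array is also large enough for dsytri.
  call dsytrf('L', n, a, n, ipiv, wquery, -1, info)
  lwork = max(2*n, int(wquery(1)))
  allocate(work(lwork))

  call dsytrf('L', n, a, n, ipiv, work, lwork, info)
  if (info /= 0) stop 'dsytrf failed'

  call dsytri('L', n, a, n, ipiv, work, info)
  if (info /= 0) stop 'dsytri failed'

  ! Only the lower triangle holds inv(A); mirror it to the upper triangle.
  do j = 1, n
     do i = j + 1, n
        a(j,i) = a(i,j)
     end do
  end do

  do i = 1, n
     print '(3f12.8)', a(i,:)
  end do
end program sytri_sketch
```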
See Also
?sytri_rook
Computes the inverse of a symmetric matrix using
U*D*U^T or L*D*L^T bounded Bunch-Kaufman
factorization.
Syntax
call ssytri_rook( uplo, n, a, lda, ipiv, work, info )
call dsytri_rook( uplo, n, a, lda, ipiv, work, info )
call csytri_rook( uplo, n, a, lda, ipiv, work, info )
call zsytri_rook( uplo, n, a, lda, ipiv, work, info )

call sytri_rook( a, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a symmetric matrix A. Before calling this routine, call
?sytrf_rook to factorize A.
Input Parameters

If uplo = 'U', the array a stores the factorization A = U*D*U^T.
If uplo = 'L', the array a stores the factorization A = L*D*L^T.
a, work REAL for ssytri_rook

DOUBLE PRECISION for dsytri_rook
COMPLEX for csytri_rook
DOUBLE COMPLEX for zsytri_rook.
Arrays:
a(size lda by *) contains the factorization of the matrix A, as returned
by ?sytrf_rook.
work(*) is a workspace array. The dimension of work must be at

least max(1,n).
ipiv INTEGER.
The ipiv array, as returned by ?sytrf_rook.
Output Parameters
a Overwritten by the n-by-n matrix inv(A). If uplo = 'U', the upper

triangular part of the inverse is formed and the part of a below the
diagonal is not referenced; if uplo = 'L' the lower triangular part of
the inverse is formed and the part of a above the diagonal is not
referenced.
info INTEGER.
If info = i, the i-th diagonal element of D is zero, D is singular, and the
inversion could not be completed.

Specific details for the routine sytri_rook interface are as follows:
Application Notes
complex flavors.
See Also
?hetri
Computes the inverse of a complex Hermitian matrix
using U*D*U^H or L*D*L^H Bunch-Kaufman
factorization.
Syntax
call chetri( uplo, n, a, lda, ipiv, work, info )
call zhetri( uplo, n, a, lda, ipiv, work, info )
call hetri( a, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a complex Hermitian matrix A. Before calling this routine, call
?hetrf to factorize A.
Input Parameters

If uplo = 'U', the array a stores the Bunch-Kaufman factorization
A = U*D*U^H.
If uplo = 'L', the array a stores the Bunch-Kaufman factorization
A = L*D*L^H.
a, work COMPLEX for chetri

DOUBLE COMPLEX for zhetri.
Arrays:
a(size lda by *) contains the factorization of the matrix A, as returned
by ?hetrf.

The dimension of work must be at least max(1,n).
ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by ?hetrf.
Output Parameters
info INTEGER.

If info = i, the i-th diagonal element of D is zero, D is singular, and
the inversion could not be completed.

Specific details for the routine hetri interface are as follows:
Application Notes
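The computed inverse X satisfies the following error bounds:
$|D U^H P^T X P U - I| \le c(n)\,\varepsilon\,(|D||U^H|P^T|X|P|U| + |D||D^{-1}|)$
for uplo = 'U', and
$|D L^H P^T X P L - I| \le c(n)\,\varepsilon\,(|D||L^H|P^T|X|P|L| + |D||D^{-1}|)$
for uplo = 'L'. Here c(n) is a modest linear function of n, and $\varepsilon$ is the machine precision; I denotes the
identity matrix.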
The total number of floating-point operations is approximately (8/3)n^3 for complex flavors.
The real counterpart of this routine is ?sytri.
See Also
?hetri_rook
Computes the inverse of a complex Hermitian matrix
using U*D*U^H or L*D*L^H bounded Bunch-Kaufman
factorization.
Syntax
call chetri_rook( uplo, n, a, lda, ipiv, work, info )
call zhetri_rook( uplo, n, a, lda, ipiv, work, info )
call hetri_rook( a, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a complex Hermitian matrix A. Before calling this routine, call
?hetrf_rook to factorize A.
Input Parameters

If uplo = 'U', the array a stores the factorization A = U*D*U^H.
If uplo = 'L', the array a stores the factorization A = L*D*L^H.
a, work COMPLEX for chetri_rook

DOUBLE COMPLEX for zhetri_rook.
Arrays:
a(size lda by *) contains the factorization of the matrix A, as returned
by ?hetrf_rook.

ipiv INTEGER.
Array, size at least max(1, n). The ipiv array, as returned by
?hetrf_rook.
Output Parameters
info INTEGER.


Specific details for the routine hetri_rook interface are as follows:
Application Notes
The total number of floating-point operations is approximately (8/3)n^3 for complex flavors.
The real counterpart of this routine is ?sytri_rook.
See Also
?sytri2
Computes the inverse of a symmetric indefinite matrix
through setting the leading dimension of the
workspace and calling ?sytri2x.
Syntax
call ssytri2( uplo, n, a, lda, ipiv, work, lwork, info )
call dsytri2( uplo, n, a, lda, ipiv, work, lwork, info )
call csytri2( uplo, n, a, lda, ipiv, work, lwork, info )
call zsytri2( uplo, n, a, lda, ipiv, work, lwork, info )
call sytri2( a,ipiv[,uplo][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a symmetric indefinite matrix A using the factorization
A = U*D*U^T or A = L*D*L^T computed by ?sytrf.
The ?sytri2 routine sets the leading dimension of the workspace before calling ?sytri2x, which actually
computes the inverse.
Input Parameters

a, work REAL for ssytri2

DOUBLE PRECISION for dsytri2
COMPLEX for csytri2
DOUBLE COMPLEX for zsytri2
Array a(size lda by n) contains the block diagonal matrix D and the
multipliers used to obtain the factor U or L as returned by ?sytrf.
work is a workspace array of dimension (n+nb+1)*(nb+3).
ipiv INTEGER.
Details of the interchanges and the block structure of D as returned
by ?sytrf.
lwork INTEGER. The dimension of the work array.

lwork ≥ (n+nb+1)*(nb+3)
where
nb is the block size parameter as returned by ?sytrf.

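If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the
first entry of the work array, and no error message related to lwork is
issued by xerbla.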
Output Parameters
a If info = 0, the symmetric inverse of the original matrix.
If uplo = 'U', the upper triangular part of the inverse is formed and
the part of A below the diagonal is not referenced.
If uplo = 'L', the lower triangular part of the inverse is formed and
the part of A above the diagonal is not referenced.
info INTEGER.
If info = i, D(i,i) = 0; D is singular and its inversion could not be
computed.

Specific details for the routine sytri2 interface are as follows:
uplo Indicates how the matrix A has been factored. Must be 'U' or 'L'.
See Also
?sytrf
?sytri2x
?hetri2
Computes the inverse of a Hermitian indefinite matrix
through setting the leading dimension of the
workspace and calling ?hetri2x.
Syntax
call chetri2( uplo, n, a, lda, ipiv, work, lwork, info )
call zhetri2( uplo, n, a, lda, ipiv, work, lwork, info )
call hetri2( a,ipiv[,uplo][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a Hermitian indefinite matrix A using the factorization
A = U*D*U^H or A = L*D*L^H computed by ?hetrf.
The ?hetri2 routine sets the leading dimension of the workspace before calling ?hetri2x, which actually
computes the inverse.
Input Parameters

a, work COMPLEX for chetri2

DOUBLE COMPLEX for zhetri2
Array a(size lda by *) contains the block diagonal matrix D and the
multipliers used to obtain the factor U or L as returned by ?hetrf.
work is a workspace array of dimension (n+nb+1)*(nb+3).
ipiv INTEGER.
Details of the interchanges and the block structure of D as returned
by ?hetrf.
lwork INTEGER. The dimension of the work array.

lwork ≥ (n+nb+1)*(nb+3)
where
nb is the block size parameter as returned by ?hetrf.

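If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the
first entry of the work array, and no error message related to lwork is
issued by xerbla.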
Output Parameters
a If info = 0, the inverse of the original matrix.
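If uplo = 'U', the upper triangular part of the inverse is formed and
the part of A below the diagonal is not referenced.
If uplo = 'L', the lower triangular part of the inverse is formed and
the part of A above the diagonal is not referenced.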
info INTEGER.
If info = i, D(i,i) = 0; D is singular and its inversion could not be
computed.

Specific details for the routine hetri2 interface are as follows:
uplo Indicates how the matrix A has been factored. Must be 'U' or 'L'.
See Also
?hetrf
?hetri2x
?sytri2x
Computes the inverse of a symmetric indefinite matrix
after ?sytri2 sets the leading dimension of the
workspace.
Syntax
call ssytri2x( uplo, n, a, lda, ipiv, work, nb, info )
call dsytri2x( uplo, n, a, lda, ipiv, work, nb, info )
call csytri2x( uplo, n, a, lda, ipiv, work, nb, info )
call zsytri2x( uplo, n, a, lda, ipiv, work, nb, info )
call sytri2x( a,ipiv,nb[,uplo][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a symmetric indefinite matrix A using the factorization
A = U*D*U^T or A = L*D*L^T computed by ?sytrf.
The ?sytri2x routine actually computes the inverse; the ?sytri2 routine sets the leading dimension of the
workspace and then calls ?sytri2x.
Input Parameters

a, work REAL for ssytri2x

DOUBLE PRECISION for dsytri2x
COMPLEX for csytri2x
DOUBLE COMPLEX for zsytri2x
Arrays:
a(lda,*) contains the nb (block size) diagonal matrix D and the
multipliers used to obtain the factor U or L as returned by ?sytrf.
work is a workspace array of dimension (n+nb+1)*(nb+3)
where
nb is the block size as set by ?sytrf.
ipiv INTEGER.
Details of the interchanges and the nb structure of D as returned by
?sytrf.
nb INTEGER. Block size.
Output Parameters
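a If info = 0, the symmetric inverse of the original matrix.
If uplo = 'U', the upper triangular part of the inverse is formed and
the part of A below the diagonal is not referenced.
If uplo = 'L', the lower triangular part of the inverse is formed and
the part of A above the diagonal is not referenced.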
info INTEGER.
If info = i, D(i,i) = 0; D is singular and its inversion could not be
computed.

Specific details for the routine sytri2x interface are as follows:
nb Holds the block size.
uplo Indicates how the matrix A has been factored. Must be 'U' or 'L'.
See Also
?sytrf
?sytri2
?hetri2x
Computes the inverse of a Hermitian indefinite matrix
after ?hetri2 sets the leading dimension of the
workspace.
Syntax
call chetri2x( uplo, n, a, lda, ipiv, work, nb, info )
call zhetri2x( uplo, n, a, lda, ipiv, work, nb, info )
call hetri2x( a,ipiv,nb[,uplo][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a Hermitian indefinite matrix A using the factorization
A = U*D*U^H or A = L*D*L^H computed by ?hetrf.
The ?hetri2x routine actually computes the inverse; the ?hetri2 routine sets the leading dimension of the
workspace and then calls ?hetri2x.
Input Parameters

a, work COMPLEX for chetri2x

DOUBLE COMPLEX for zhetri2x
Arrays:
a(lda,*) contains the nb (block size) diagonal matrix D and the
multipliers used to obtain the factor U or L as returned by ?hetrf.
work is a workspace array of the dimension (n+nb+1)*(nb+3)

where
nb is the block size as set by ?hetrf.
ipiv INTEGER.
Details of the interchanges and the nb structure of D as returned by
?hetrf.
nb INTEGER. Block size.
Output Parameters
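a If info = 0, the inverse of the original matrix.
If uplo = 'U', the upper triangular part of the inverse is formed and
the part of A below the diagonal is not referenced.
If uplo = 'L', the lower triangular part of the inverse is formed and
the part of A above the diagonal is not referenced.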
info INTEGER.
If info = i, D(i,i) = 0; D is singular and its inversion could not be
computed.

Specific details for the routine hetri2x interface are as follows:
nb Holds the block size.
uplo Indicates how the matrix A has been factored. Must be 'U' or 'L'.
See Also
?hetrf
?hetri2
?sptri
Computes the inverse of a symmetric matrix using
U*D*U^T or L*D*L^T Bunch-Kaufman factorization of
a matrix in packed storage.
Syntax
call ssptri( uplo, n, ap, ipiv, work, info )
call dsptri( uplo, n, ap, ipiv, work, info )
call csptri( uplo, n, ap, ipiv, work, info )
call zsptri( uplo, n, ap, ipiv, work, info )
call sptri( ap, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a packed symmetric matrix A. Before calling this routine, call
?sptrf to factorize A.
Input Parameters

If uplo = 'U', the array ap stores the Bunch-Kaufman factorization
A = U*D*U^T.
If uplo = 'L', the array ap stores the Bunch-Kaufman factorization
A = L*D*L^T.
ap, work REAL for ssptri

DOUBLE PRECISION for dsptri
COMPLEX for csptri
DOUBLE COMPLEX for zsptri.
Arrays:
ap(*) contains the factorization of the matrix A, as returned by
?sptrf.
The dimension of ap must be at least max(1,n(n+1)/2).

ipiv INTEGER.
The ipiv array, as returned by ?sptrf.
Output Parameters
ap Overwritten by the matrix inv(A) in packed form.
info INTEGER.


Specific details for the routine sptri interface are as follows:
Application Notes
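The computed inverse X satisfies the following error bounds:
$|D U^T P^T X P U - I| \le c(n)\,\varepsilon\,(|D||U^T|P^T|X|P|U| + |D||D^{-1}|)$
for uplo = 'U', and
$|D L^T P^T X P L - I| \le c(n)\,\varepsilon\,(|D||L^T|P^T|X|P|L| + |D||D^{-1}|)$
for uplo = 'L'. Here c(n) is a modest linear function of n, and $\varepsilon$ is the machine precision; I denotes the
identity matrix.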
complex flavors.
See Also
?hptri
Computes the inverse of a complex Hermitian matrix
using U*D*U^H or L*D*L^H Bunch-Kaufman factorization
of a matrix in packed storage.
Syntax
call chptri( uplo, n, ap, ipiv, work, info )
call zhptri( uplo, n, ap, ipiv, work, info )
call hptri( ap, ipiv [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a complex Hermitian matrix A using packed storage. Before
calling this routine, call ?hptrf to factorize A.
Input Parameters

If uplo = 'U', the array ap stores the packed Bunch-Kaufman
factorization A = U*D*U^H.
If uplo = 'L', the array ap stores the packed Bunch-Kaufman
factorization A = L*D*L^H.

ap, work COMPLEX for chptri

DOUBLE COMPLEX for zhptri.
Arrays:
ap(*) contains the factorization of the matrix A, as returned by
?hptrf.
The dimension of ap must be at least max(1,n(n+1)/2).

ipiv INTEGER.
The ipiv array, as returned by ?hptrf.
Output Parameters
ap Overwritten by the matrix inv(A).
info INTEGER.


Specific details for the routine hptri interface are as follows:
Application Notes
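The computed inverse X satisfies the following error bounds:
$|D U^H P^T X P U - I| \le c(n)\,\varepsilon\,(|D||U^H|P^T|X|P|U| + |D||D^{-1}|)$
for uplo = 'U', and
$|D L^H P^T X P L - I| \le c(n)\,\varepsilon\,(|D||L^H|P^T|X|P|L| + |D||D^{-1}|)$
for uplo = 'L'. Here c(n) is a modest linear function of n, and $\varepsilon$ is the machine precision; I denotes the
identity matrix.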
The real counterpart of this routine is ?sptri.
See Also
?trtri
Computes the inverse of a triangular matrix.
Syntax
call strtri( uplo, diag, n, a, lda, info )
call dtrtri( uplo, diag, n, a, lda, info )
call ctrtri( uplo, diag, n, a, lda, info )
call ztrtri( uplo, diag, n, a, lda, info )
call trtri( a [,uplo] [,diag] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a triangular matrix A.
Input Parameters



a REAL for strtri

DOUBLE PRECISION for dtrtri
COMPLEX for ctrtri
DOUBLE COMPLEX for ztrtri.
Array, size lda by *. Contains the matrix A. The
second dimension of a must be at least max(1,n).
lda INTEGER. The first dimension of a; lda ≥ max(1, n).
Output Parameters
a Overwritten by the matrix inv(A).
info INTEGER.

If info = i, the i-th diagonal element of A is zero, A is singular, and
the inversion could not be completed.

Specific details for the routine trtri interface are as follows:
Application Notes
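The computed inverse X satisfies the following error bounds:
$|XA - I| \le c(n)\,\varepsilon\,|X||A|$
$|X - A^{-1}| \le c(n)\,\varepsilon\,|A^{-1}||A||X|$,
where c(n) is a modest linear function of n; $\varepsilon$ is the machine precision; I denotes the identity matrix.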
complex flavors.
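A minimal sketch of a call to the double precision flavor follows. No prior factorization is needed, since the input is already triangular. The matrix values and the program name are assumptions for illustration.

```fortran
program trtri_sketch
  implicit none
  integer, parameter :: n = 3
  double precision :: a(n,n)
  integer :: i, info
  external :: dtrtri

  ! Upper triangular, non-unit diagonal; the strictly lower part is not referenced.
  a = 0.0d0
  a(1,1) = 2.0d0; a(1,2) = 1.0d0; a(1,3) = 3.0d0
  a(2,2) = 4.0d0; a(2,3) = 5.0d0
  a(3,3) = 8.0d0

  ! On exit the upper triangle of a holds inv(A).
  call dtrtri('U', 'N', n, a, n, info)
  if (info > 0) then
     print *, 'diagonal element', info, 'is zero; A is singular'
     stop
  else if (info < 0) then
     stop 'dtrtri: illegal argument'
  end if

  do i = 1, n
     print '(3f12.8)', a(i,:)
  end do
end program trtri_sketch
```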
See Also
?tftri
Computes the inverse of a triangular matrix stored in
the Rectangular Full Packed (RFP) format.
Syntax
call stftri( transr, uplo, diag, n, a, info )
call dtftri( transr, uplo, diag, n, a, info )
call ctftri( transr, uplo, diag, n, a, info )
call ztftri( transr, uplo, diag, n, a, info )
Include Files
mkl.fi, lapack.f90
Description
Computes the inverse of a triangular matrix A stored in the Rectangular Full Packed (RFP) format. For the
description of the RFP format, see Matrix Storage Schemes.
This is the block version of the algorithm, calling Level 3 BLAS.
Input Parameters
transr CHARACTER*1. Must be 'N', 'T' (for real data) or 'C' (for complex
data).
If transr = 'N', the Normal transr of RFP A is stored.
If transr = 'T', the Transpose transr of RFP A is stored.
If transr = 'C', the Conjugate-Transpose transr of RFP A is stored.

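Indicates whether the upper or lower triangular part of RFP A is
stored:
If uplo = 'U', the array a stores the upper triangular part of the
matrix A.
If uplo = 'L', the array a stores the lower triangular part of the
matrix A.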


a REAL for stftri

DOUBLE PRECISION for dtftri
COMPLEX for ctftri
DOUBLE COMPLEX for ztftri.
Array, size max(1, n*(n + 1)/2). The array a contains the matrix A in
the RFP format.
Output Parameters
a The (triangular) inverse of the original matrix in the same storage

format.

If info = i, A(i,i) is exactly zero. The triangular matrix is singular

and its inverse cannot be computed.
See Also
?tptri
Computes the inverse of a triangular matrix using
packed storage.
Syntax
call stptri( uplo, diag, n, ap, info )
call dtptri( uplo, diag, n, ap, info )

call ctptri( uplo, diag, n, ap, info )
call ztptri( uplo, diag, n, ap, info )
call tptri( ap [,uplo] [,diag] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes the inverse inv(A) of a packed triangular matrix A.
Input Parameters



ap REAL for stptri

DOUBLE PRECISION for dtptri
COMPLEX for ctptri
DOUBLE COMPLEX for ztptri.
Array, size at least max(1,n(n+1)/2).
Contains the packed triangular matrix A.
Output Parameters
ap Overwritten by the packed n-by-n matrix inv(A).
info INTEGER.

If info = i, the i-th diagonal element of A is zero, A is singular, and
the inversion could not be completed.

Specific details for the routine tptri interface are as follows:
Application Notes
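The computed inverse X satisfies the following error bounds:
$|XA - I| \le c(n)\,\varepsilon\,|X||A|$
$|X - A^{-1}| \le c(n)\,\varepsilon\,|A^{-1}||A||X|$,
where c(n) is a modest linear function of n; $\varepsilon$ is the machine precision; I denotes the identity matrix.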
complex flavors.
See Also
Matrix Equilibration: LAPACK Computational Routines

Routines described in this section are used to compute scaling factors needed to equilibrate a matrix. Note
that these routines do not actually scale the matrices.
?geequ
Computes row and column scaling factors intended to
equilibrate a general matrix and reduce its condition
number.
Syntax
call sgeequ( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call dgeequ( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call cgeequ( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call zgeequ( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call geequ( a, r, c [,rowcnd] [,colcnd] [,amax] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate an m-by-n matrix A and reduce its
condition number. The output array r returns the row scale factors and the array c the column scale factors.
These factors are chosen to try to make the largest element in each row and column of the matrix B with
elements b(i,j) = r(i)*a(i,j)*c(j) have absolute value 1.
See the ?laqge auxiliary function, which uses the scaling factors computed by ?geequ.
Input Parameters
m INTEGER. The number of rows of the matrix A; m ≥ 0.
n INTEGER. The number of columns of the matrix A; n ≥ 0.
a REAL for sgeequ

DOUBLE PRECISION for dgeequ
COMPLEX for cgeequ
DOUBLE COMPLEX for zgeequ.
Array: size lda by *.
Contains the m-by-n matrix A whose equilibration factors are to be
computed.
lda INTEGER. The leading dimension of a; lda ≥ max(1, m).
Output Parameters

Arrays: r (size m), c (size n).
If info = 0, or info>m, the array r contains the row scale factors of
the matrix A.
If info = 0, the array c contains the column scale factors of the
matrix A.
rowcnd REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
If info = 0 or info>m, rowcnd contains the ratio of the smallest
r(i) to the largest r(i).
colcnd REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
If info = 0, colcnd contains the ratio of the smallest c(i) to the
largest c(i).
amax REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Absolute value of the largest element of the matrix A.
info INTEGER.
If info = i, i > 0, and
i ≤ m, the i-th row of A is exactly zero;
i>m, the (i-m)th column of A is exactly zero.

Specific details for the routine geequ interface are as follows:
a Holds the matrix A of size (m, n).
r Holds the vector of length m.
c Holds the vector of length n.
Application Notes
All the components of r and c are restricted to be between SMLNUM = smallest safe number and BIGNUM=
largest safe number. Use of these scaling factors is not guaranteed to reduce the condition number of A but
works well in practice.
SMLNUM and BIGNUM are parameters representing machine precision. You can use the ?lamch routines to
compute them. For example, compute single precision values of SMLNUM and BIGNUM as follows:
SMLNUM = slamch ('s')

BIGNUM = 1 / SMLNUM
If rowcnd ≥ 0.1 and amax is neither too large nor too small, it is not worth scaling by r.
If colcnd ≥ 0.1, it is not worth scaling by c.
If amax is very close to SMLNUM or very close to BIGNUM, the matrix A should be scaled.
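Because ?geequ only computes the scale factors, the caller applies them. The sketch below, using the double precision flavor, scales rows when rowcnd < 0.1 and columns when colcnd < 0.1, following the guidance above. The matrix values and program name are illustrative assumptions, and the amax test is deliberately left out because the notes do not quantify "very close"; the ?laqge auxiliary routine is the library's own way of applying these factors.

```fortran
program geequ_sketch
  implicit none
  integer, parameter :: m = 3, n = 3
  double precision :: a(m,n), r(m), c(n), rowcnd, colcnd, amax
  integer :: i, j, info
  external :: dgeequ

  ! A matrix whose rows differ in magnitude by several orders (illustrative).
  a = reshape([ 1.0d0, 2.0d3, 3.0d-4, &
                4.0d0, 5.0d3, 6.0d-4, &
                7.0d0, 8.0d3, 1.0d-3 ], [m, n])

  call dgeequ(m, n, a, m, r, c, rowcnd, colcnd, amax, info)
  if (info /= 0) stop 'dgeequ: illegal argument or exactly zero row/column'

  print *, 'rowcnd =', rowcnd, ' colcnd =', colcnd, ' amax =', amax

  ! Row scaling: a(i,j) <- r(i)*a(i,j), only when it is worth doing.
  if (rowcnd < 0.1d0) then
     do j = 1, n
        do i = 1, m
           a(i,j) = r(i) * a(i,j)
        end do
     end do
  end if

  ! Column scaling: a(i,j) <- a(i,j)*c(j), only when it is worth doing.
  if (colcnd < 0.1d0) then
     do j = 1, n
        a(:,j) = a(:,j) * c(j)
     end do
  end if

  do i = 1, m
     print '(3es14.6)', a(i,:)
  end do
end program geequ_sketch
```

If the scaled matrix is then used to solve a system, the right-hand side and the solution must be scaled consistently with r and c.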
See Also
Error Analysis
?geequb
Computes row and column scaling factors restricted to
a power of radix to equilibrate a general matrix and
reduce its condition number.
Syntax
call sgeequb( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call dgeequb( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call cgeequb( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
call zgeequb( m, n, a, lda, r, c, rowcnd, colcnd, amax, info )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate an m-by-n general matrix A and
reduce its condition number. The output array r returns the row scale factors and the array c returns the
column scale factors. These factors are chosen to try to make the largest element in each row and column of
the matrix B with elements b(i,j) = r(i)*a(i,j)*c(j) have an absolute value of at most the radix.
r(i) and c(j) are restricted to be a power of the radix between SMLNUM = smallest safe number and
BIGNUM = largest safe number. Use of these scaling factors is not guaranteed to reduce the condition number
of A but works well in practice.
SMLNUM and BIGNUM are parameters representing machine precision. You can use the ?lamch routines to
compute them. For example, compute single precision values of SMLNUM and BIGNUM as follows:
SMLNUM = slamch ('s')
BIGNUM = 1 / SMLNUM
This routine differs from ?geequ by restricting the scaling factors to a power of the radix. Except for
overflow and underflow, scaling by these factors introduces no additional rounding errors. However, the
magnitudes of the scaled entries are no longer approximately 1 but lie between 1/sqrt(radix) and sqrt(radix).
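A minimal sketch of the difference between an unrestricted scale factor and one restricted to a power of the radix. The rounding rule used here (truncating the base-radix logarithm of the row maximum) is an assumption chosen for illustration and is not taken from this section. The value rowmax is a hypothetical largest absolute entry of one row.

```fortran
program radix_scaling_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: rowmax, base, r_plain, r_pow

  rowmax = 4000.0_dp                       ! hypothetical largest |a(i,j)| in a row
  base   = real(radix(rowmax), dp)         ! 2 on binary floating-point hardware

  ! Unrestricted factor: scales the row maximum to 1.
  r_plain = 1.0_dp / rowmax

  ! Power-of-radix factor (assumed rule): truncate log_base(rowmax) to an integer.
  r_pow = 1.0_dp / base**int(log(rowmax) / log(base))

  print *, 'radix                 = ', base
  print *, 'unrestricted factor   = ', r_plain, ' scaled maximum = ', rowmax*r_plain
  print *, 'power-of-radix factor = ', r_pow,   ' scaled maximum = ', rowmax*r_pow
end program radix_scaling_sketch
```

With radix 2 the restricted factor here is 1/2048, so the scaled row maximum is no longer 1 but stays below the radix. Multiplying by it changes only the exponent of each entry.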
Input Parameters
a REAL for sgeequb
DOUBLE PRECISION for dgeequb
COMPLEX for cgeequb
DOUBLE COMPLEX for zgeequb.
Array: size (lda,*).
Contains the m-by-n matrix A whose equilibration factors are to be
computed.
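Output Parameters
r, c REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Arrays: r(m), c(n).
If info = 0, or info > m, the array r contains the row scale factors for
the matrix A.
If info = 0, the array c contains the column scale factors for the
matrix A.
rowcnd REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
If info = 0 or info > m, rowcnd contains the ratio of the smallest
r(i) to the largest r(i). If rowcnd ≥ 0.1, and amax is neither too
large nor too small, it is not worth scaling by r.
colcnd REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
If info = 0, colcnd contains the ratio of the smallest c(i) to the
largest c(i). If colcnd ≥ 0.1, it is not worth scaling by c.
amax REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Absolute value of the largest element of the matrix A. If amax is very
close to SMLNUM or very close to BIGNUM, the matrix should be scaled.
info INTEGER.
If info = i, i > 0, and
i ≤ m, the i-th row of A is exactly zero;
i > m, the (i-m)-th column of A is exactly zero.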
See Also
Error Analysis
?gbequ
Computes row and column scaling factors intended to
equilibrate a banded matrix and reduce its condition
number.
Syntax
call sgbequ( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call dgbequ( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call cgbequ( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call zgbequ( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call gbequ( ab, r, c [,kl] [,rowcnd] [,colcnd] [,amax] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate an m-by-n band matrix A and reduce
its condition number. The output array r returns the row scale factors and the array c returns the column
scale factors. These factors are chosen to try to make the largest element in each row and column of the
matrix B with elements b(i,j) = r(i)*a(i,j)*c(j) have absolute value 1.
See the ?laqgb auxiliary function, which uses the scaling factors computed by ?gbequ.
Input Parameters
ab REAL for sgbequ
DOUBLE PRECISION for dgbequ
COMPLEX for cgbequ
DOUBLE COMPLEX for zgbequ.
Array, size ldab by *. Contains the original band matrix A stored in
rows from 1 to kl + ku + 1. The second dimension of ab must be at
least max(1,n).
ldab INTEGER. The leading dimension of ab; ldab ≥ kl+ku+1.
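Output Parameters
r, c REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Arrays: r(m), c(n).
If info = 0, or info > m, the array r contains the row scale factors of
the matrix A.
If info = 0, the array c contains the column scale factors of the
matrix A.
rowcnd REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
If info = 0 or info > m, rowcnd contains the ratio of the smallest
r(i) to the largest r(i).
colcnd REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
If info = 0, colcnd contains the ratio of the smallest c(i) to the
largest c(i).

info INTEGER.
If info = i, i > 0, and
i ≤ m, the i-th row of A is exactly zero;
i > m, the (i-m)-th column of A is exactly zero.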

Specific details for the routine gbequ interface are as follows:
r Holds the vector of length (m).
c Holds the vector of length n.
Application Notes
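All the components of r and c are restricted to be between SMLNUM = smallest safe number and BIGNUM =
largest safe number. Use of these scaling factors is not guaranteed to reduce the condition number of A but
works well in practice.
SMLNUM and BIGNUM are parameters representing machine precision. You can use the ?lamch routines to
compute them. For example, compute single precision values of SMLNUM and BIGNUM as follows:
SMLNUM = slamch ('s')
BIGNUM = 1 / SMLNUM
If rowcnd ≥ 0.1 and amax is neither too large nor too small, it is not worth scaling by r.
If colcnd ≥ 0.1, it is not worth scaling by c.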
See Also
Error Analysis
?gbequb
Computes row and column scaling factors restricted to
a power of radix to equilibrate a banded matrix and
reduce its condition number.
Syntax
call sgbequb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call dgbequb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call cgbequb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
call zgbequb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, info )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate an m-by-n banded matrix A and
reduce its condition number. The output array r returns the row scale factors and the array c returns the
column scale factors. These factors are chosen to try to make the largest element in each row and column of
the matrix B with elements b(i,j) = r(i)*a(i,j)*c(j) have an absolute value of at most the radix.
r(i) and c(j) are restricted to be a power of the radix between SMLNUM = smallest safe number and
BIGNUM = largest safe number. Use of these scaling factors is not guaranteed to reduce the condition
number of A but works well in practice.
SMLNUM and BIGNUM are parameters representing machine precision. You can use the ?lamch routines to
compute them. For example, compute single precision values of SMLNUM and BIGNUM as follows:
SMLNUM = slamch ('s')
BIGNUM = 1 / SMLNUM
This routine differs from ?gbequ by restricting the scaling factors to a power of the radix. Except for
overflow and underflow, scaling by these factors introduces no additional rounding errors. However, the
magnitudes of the scaled entries are no longer approximately 1 but lie between 1/sqrt(radix) and sqrt(radix).
Input Parameters
ab REAL for sgbequb
DOUBLE PRECISION for dgbequb
COMPLEX for cgbequb
DOUBLE COMPLEX for zgbequb.
Array: size ldab by *.
Contains the original banded matrix A stored in rows from 1 to kl +
ku + 1. The j-th column of A is stored in the j-th column of the array
ab as follows:
ab(ku+1+i-j,j) = a(i,j) for max(1,j-ku) ≤ i ≤ min(m,j+kl).
The second dimension of ab must be at least max(1,n).
ldab INTEGER. The leading dimension of ab; ldab ≥ kl+ku+1.
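The band storage mapping ab(ku+1+i-j,j) = a(i,j) can be sketched as follows. The sketch uses a small square matrix (so the row and column counts coincide) with hypothetical entry values 10*i + j inside the band. It only illustrates the index mapping and does not call any library routine.

```fortran
program band_storage_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, kl = 1, ku = 2, ldab = kl + ku + 1
  real(dp) :: a(n, n), ab(ldab, n)
  integer  :: i, j

  ! Dense matrix with entries 10*i + j inside the band and zeros elsewhere.
  a = 0.0_dp
  do j = 1, n
    do i = max(1, j - ku), min(n, j + kl)
      a(i, j) = real(10*i + j, dp)
    end do
  end do

  ! Pack column j of A into column j of ab; the diagonal lands in row ku+1.
  ab = 0.0_dp
  do j = 1, n
    do i = max(1, j - ku), min(n, j + kl)
      ab(ku + 1 + i - j, j) = a(i, j)
    end do
  end do

  print *, 'band storage array ab (rows 1 to kl+ku+1):'
  do i = 1, ldab
    print '(4f8.1)', ab(i, :)
  end do
end program band_storage_sketch
```

Positions of ab that correspond to no element of A (the upper left and lower right corners) are left at zero in this sketch.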
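Output Parameters
r, c REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Arrays: r(m), c(n).
If info = 0, or info > m, the array r contains the row scale factors for
the matrix A.
If info = 0, the array c contains the column scale factors for the
matrix A.
rowcnd REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
If info = 0 or info > m, rowcnd contains the ratio of the smallest
r(i) to the largest r(i). If rowcnd ≥ 0.1, and amax is neither too
large nor too small, it is not worth scaling by r.
colcnd REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
If info = 0, colcnd contains the ratio of the smallest c(i) to the
largest c(i). If colcnd ≥ 0.1, it is not worth scaling by c.
amax REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Absolute value of the largest element of the matrix A. If amax is very
close to SMLNUM or BIGNUM, the matrix should be scaled.
info INTEGER.
If info = i, i > 0, and
i ≤ m, the i-th row of A is exactly zero;
i > m, the (i-m)-th column of A is exactly zero.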
See Also
Error Analysis
?poequ
Computes row and column scaling factors intended to
equilibrate a symmetric (Hermitian) positive definite
matrix and reduce its condition number.
Syntax
call spoequ( n, a, lda, s, scond, amax, info )
call dpoequ( n, a, lda, s, scond, amax, info )
call cpoequ( n, a, lda, s, scond, amax, info )
call zpoequ( n, a, lda, s, scond, amax, info )
call poequ( a, s [,scond] [,amax] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate a symmetric (Hermitian)
positive-definite matrix A and reduce its condition number (with respect to the two-norm). The output array s
returns scale factors such that s(i) contains 1/sqrt(a(i,i)).
These factors are chosen so that the scaled matrix B with elements b(i,j) = s(i)*a(i,j)*s(j) has diagonal
elements equal to 1.
This choice of s puts the condition number of B within a factor n of the smallest possible condition number
over all possible diagonal scalings.
See the ?laqsy auxiliary function, which uses the scaling factors computed by ?poequ.
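As a rough illustration of the scaling described above, the following sketch computes s(i) = 1/sqrt(a(i,i)) by hand for a small symmetric positive-definite matrix with made-up entries and forms the scaled matrix B. It does not call ?poequ. The quantity scond is computed as the ratio of the smallest to the largest s(i).

```fortran
program poequ_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n, n), b(n, n), s(n), scond
  integer  :: i, j

  ! Symmetric, diagonally dominant (hence positive-definite) test matrix.
  a = reshape([4.0_dp,   2.0_dp,    0.0_dp, &
               2.0_dp, 100.0_dp,    5.0_dp, &
               0.0_dp,   5.0_dp, 2500.0_dp], [n, n])

  ! Scale factors from the diagonal only.
  do i = 1, n
    s(i) = 1.0_dp / sqrt(a(i, i))
  end do
  scond = minval(s) / maxval(s)

  ! Scaled matrix B with b(i,j) = s(i)*a(i,j)*s(j).
  do j = 1, n
    do i = 1, n
      b(i, j) = s(i) * a(i, j) * s(j)
    end do
  end do

  print *, 's     = ', s
  print *, 'scond = ', scond
  print *, 'scaled matrix B:'
  do i = 1, n
    print '(3f10.5)', b(i, :)
  end do
end program poequ_sketch
```

For this matrix scond is well below 0.1, so scaling by s would be considered worthwhile under the rule given in the Application Notes.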
Input Parameters
a REAL for spoequ
DOUBLE PRECISION for dpoequ
COMPLEX for cpoequ
DOUBLE COMPLEX for zpoequ.
Contains the n-by-n symmetric or Hermitian positive definite matrix A
whose scaling factors are to be computed. Only the diagonal elements
of A are referenced.
lda INTEGER. The leading dimension of a; lda ≥ max(1,n).
Output Parameters
s REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Array, size n.
If info = 0, the array s contains the scale factors for A.
scond REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
If info = 0, scond contains the ratio of the smallest s(i) to the
largest s(i).
info INTEGER.

Specific details for the routine poequ interface are as follows:
s Holds the vector of length n.
Application Notes
If scond ≥ 0.1 and amax is neither too large nor too small, it is not worth scaling by s.
See Also
Error Analysis
?poequb
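Computes row and column scaling factors intended to
equilibrate a symmetric (Hermitian) positive definite
matrix and reduce its condition number.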
Syntax
call spoequb( n, a, lda, s, scond, amax, info )
call dpoequb( n, a, lda, s, scond, amax, info )
call cpoequb( n, a, lda, s, scond, amax, info )
call zpoequb( n, a, lda, s, scond, amax, info )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate a symmetric (Hermitian)
positive-definite matrix A and reduce its condition number (with respect to the two-norm).
These factors are chosen so that the scaled matrix B with elements b(i,j) = s(i)*a(i,j)*s(j) has diagonal
elements equal to 1. s(i) is the power of two nearest to, but not exceeding, 1/sqrt(a(i,i)).
Input Parameters
a REAL for spoequb
DOUBLE PRECISION for dpoequb
COMPLEX for cpoequb
DOUBLE COMPLEX for zpoequb.
Contains the n-by-n symmetric or Hermitian positive definite matrix A
whose scaling factors are to be computed. Only the diagonal elements
of A are referenced.
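Output Parameters
s REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Array, size (n).
scond REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
If info = 0, scond contains the ratio of the smallest s(i) to the
largest s(i). If scond ≥ 0.1, and amax is neither too large nor too
small, it is not worth scaling by s.

info INTEGER.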
See Also
Error Analysis
?ppequ
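Computes row and column scaling factors intended to
equilibrate a symmetric (Hermitian) positive definite
matrix in packed storage and reduce its condition
number.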
Syntax
call sppequ( uplo, n, ap, s, scond, amax, info )
call dppequ( uplo, n, ap, s, scond, amax, info )
call cppequ( uplo, n, ap, s, scond, amax, info )
call zppequ( uplo, n, ap, s, scond, amax, info )
call ppequ( ap, s [,scond] [,amax] [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate a symmetric (Hermitian) positive
definite matrix A in packed storage and reduce its condition number (with respect to the two-norm). The
output array s returns scale factors such that s(i) contains 1/sqrt(a(i,i)).
These factors are chosen so that the scaled matrix B with elements b(i,j) = s(i)*a(i,j)*s(j) has diagonal
elements equal to 1.
See the ?laqsp auxiliary function, which uses the scaling factors computed by ?ppequ.
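Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is packed in
the array ap:
If uplo = 'U', the array ap stores the upper triangular part of the
matrix A.
If uplo = 'L', the array ap stores the lower triangular part of the
matrix A.
ap REAL for sppequ
DOUBLE PRECISION for dppequ
COMPLEX for cppequ
DOUBLE COMPLEX for zppequ.
Array, size at least max(1,n(n+1)/2). The array ap contains the
upper or the lower triangular part of the matrix A (as specified by
uplo) in packed storage (see Matrix Storage Schemes).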
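Output Parameters
s REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Array, size (n).
scond REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
If info = 0, scond contains the ratio of the smallest s(i) to the
largest s(i).

info INTEGER.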

Specific details for the routine ppequ interface are as follows:
Application Notes
See Also
Error Analysis
?pbequ
Computes row and column scaling factors intended to
equilibrate a symmetric (Hermitian) positive-definite
band matrix and reduce its condition number.
Syntax
call spbequ( uplo, n, kd, ab, ldab, s, scond, amax, info )
call dpbequ( uplo, n, kd, ab, ldab, s, scond, amax, info )
call cpbequ( uplo, n, kd, ab, ldab, s, scond, amax, info )
call zpbequ( uplo, n, kd, ab, ldab, s, scond, amax, info )
call pbequ( ab, s [,scond] [,amax] [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate a symmetric (Hermitian) positive
definite band matrix A and reduce its condition number (with respect to the two-norm). The output array s
returns scale factors such that s(i) contains 1/sqrt(a(i,i)).
These factors are chosen so that the scaled matrix B with elements b(i,j) = s(i)*a(i,j)*s(j) has diagonal
elements equal to 1. This choice of s puts the condition number of B within a factor n of the smallest possible
condition number over all possible diagonal scalings.
See the ?laqsb auxiliary function, which uses the scaling factors computed by ?pbequ.
Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is stored in
the array ab:
If uplo = 'U', the array ab stores the upper triangular part of the
matrix A.
If uplo = 'L', the array ab stores the lower triangular part of the
matrix A.
kd INTEGER. The number of superdiagonals or subdiagonals in the matrix
A; kd ≥ 0.
ab REAL for spbequ
DOUBLE PRECISION for dpbequ
COMPLEX for cpbequ
DOUBLE COMPLEX for zpbequ.
Array, size ldab by *.
The array ab contains either the upper or the lower triangular part of
the matrix A (as specified by uplo) in band storage (see Matrix
Storage Schemes).
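Output Parameters
s REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Array, size (n).
scond REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
If info = 0, scond contains the ratio of the smallest s(i) to the
largest s(i).

info INTEGER.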

Specific details for the routine pbequ interface are as follows:
Application Notes
See Also
Error Analysis
?syequb
Computes row and column scaling factors intended to
equilibrate a symmetric indefinite matrix and reduce
its condition number.
Syntax
call ssyequb( uplo, n, a, lda, s, scond, amax, work, info )
call dsyequb( uplo, n, a, lda, s, scond, amax, work, info )
call csyequb( uplo, n, a, lda, s, scond, amax, work, info )
call zsyequb( uplo, n, a, lda, s, scond, amax, work, info )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate a symmetric indefinite matrix A and
reduce its condition number (with respect to the two-norm).
The array s contains the scale factors, s(i) = 1/sqrt(A(i,i)). These factors are chosen so that the
scaled matrix B with elements b(i,j)=s(i)*a(i,j)*s(j) has ones on the diagonal.
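Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is stored:
If uplo = 'U', the array a stores the upper triangular part of the
matrix A.
If uplo = 'L', the array a stores the lower triangular part of the
matrix A.
a, work REAL for ssyequb
DOUBLE PRECISION for dsyequb
COMPLEX for csyequb
DOUBLE COMPLEX for zsyequb.
Array a: lda by *.
Contains the n-by-n symmetric indefinite matrix A whose scaling
factors are to be computed. Only the diagonal elements of A are
referenced. The second dimension of a must be at least max(1,n).
work is a workspace array. The dimension of work is at least
max(1,3*n).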
Output Parameters
s REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Array, size (n).

info INTEGER.
See Also
Error Analysis
?heequb
Computes row and column scaling factors intended to
equilibrate a Hermitian indefinite matrix and reduce its
condition number.
Syntax
call cheequb( uplo, n, a, lda, s, scond, amax, work, info )
call zheequb( uplo, n, a, lda, s, scond, amax, work, info )
Include Files
mkl.fi, lapack.f90
Description
The routine computes row and column scalings intended to equilibrate a Hermitian indefinite matrix A and
reduce its condition number (with respect to the two-norm).
The array s contains the scale factors, s(i) = 1/sqrt(A(i,i)). These factors are chosen so that the
scaled matrix B with elements b(i,j)=s(i)*a(i,j)*s(j) has ones on the diagonal.
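Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the upper or lower triangular part of A is stored:
If uplo = 'U', the array a stores the upper triangular part of the
matrix A.
If uplo = 'L', the array a stores the lower triangular part of the
matrix A.
a, work COMPLEX for cheequb
DOUBLE COMPLEX for zheequb.
Array a: size lda by *.
Contains the n-by-n Hermitian indefinite matrix A whose scaling
factors are to be computed. Only the diagonal elements of A are
referenced.
work is a workspace array. The dimension of work is at least
max(1,3*n).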
Output Parameters
s REAL for cheequb
DOUBLE PRECISION for zheequb.
Array, size (n).
scond REAL for cheequb
DOUBLE PRECISION for zheequb.
amax REAL for cheequb
DOUBLE PRECISION for zheequb.
info INTEGER.
See Also
Error Analysis
LAPACK Linear Equation Driver Routines

Table "Driver Routines for Solving Systems of Linear Equations" lists the LAPACK driver routines for solving
systems of linear equations with real or complex matrices.
Driver Routines for Solving Systems of Linear Equations
Matrix type, storage scheme                             Simple Driver                        Expert Driver   Expert Driver using Extra-Precise Iterative Refinement
general                                                 ?gesv                                ?gesvx          ?gesvxx
general band                                            ?gbsv                                ?gbsvx          ?gbsvxx
general tridiagonal                                     ?gtsv                                ?gtsvx
diagonally dominant tridiagonal                         ?dtsvb
symmetric/Hermitian positive-definite                   ?posv                                ?posvx          ?posvxx
symmetric/Hermitian positive-definite, packed storage   ?ppsv                                ?ppsvx
symmetric/Hermitian positive-definite, band             ?pbsv                                ?pbsvx
symmetric/Hermitian positive-definite, tridiagonal      ?ptsv                                ?ptsvx
symmetric/Hermitian indefinite                          ?sysv/?hesv, ?sysv_rook/?hesv_rook   ?sysvx/?hesvx   ?sysvxx/?hesvxx
symmetric/Hermitian indefinite, packed storage          ?spsv/?hpsv                          ?spsvx/?hpsvx
complex symmetric                                       ?sysv, ?sysv_rook                    ?sysvx
complex symmetric, packed storage                       ?spsv                                ?spsvx
In this table ? stands for s (single precision real), d (double precision real), c (single precision complex), or z
(double precision complex). In the description of the ?gesv and ?posv routines, the ? sign also stands for the
combined character codes ds and zc of the mixed precision subroutines.
?gesv
Computes the solution to the system of linear
equations with a square coefficient matrix A and
multiple right-hand sides.
Syntax
call sgesv( n, nrhs, a, lda, ipiv, b, ldb, info )
call dgesv( n, nrhs, a, lda, ipiv, b, ldb, info )
call cgesv( n, nrhs, a, lda, ipiv, b, ldb, info )
call zgesv( n, nrhs, a, lda, ipiv, b, ldb, info )
call dsgesv( n, nrhs, a, lda, ipiv, b, ldb, x, ldx, work, swork, iter, info )
call zcgesv( n, nrhs, a, lda, ipiv, b, ldb, x, ldx, work, swork, rwork, iter, info )
call gesv( a, b [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B, where A is an n-by-n matrix, the columns
of matrix B are individual right-hand sides, and the columns of X are the corresponding solutions.
The LU decomposition with partial pivoting and row interchanges is used to factor A as A = P*L*U, where P
is a permutation matrix, L is unit lower triangular, and U is upper triangular. The factored form of A is then
used to solve the system of equations A*X = B.
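A minimal calling sketch for the double precision flavor, using the FORTRAN 77 syntax listed above. The 2-by-2 system is made up for illustration, and the program must be linked against Intel MKL or another LAPACK implementation.

```fortran
program gesv_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2, nrhs = 1, lda = n, ldb = n
  real(dp) :: a(lda, n), b(ldb, nrhs)
  integer  :: ipiv(n), info
  external :: dgesv

  ! Column-major storage: A = [2 1; 1 3], B = [3; 5].
  a = reshape([2.0_dp, 1.0_dp, 1.0_dp, 3.0_dp], [lda, n])
  b(:, 1) = [3.0_dp, 5.0_dp]

  ! On exit a holds the L and U factors and b holds the solution X.
  call dgesv(n, nrhs, a, lda, ipiv, b, ldb, info)

  if (info /= 0) then
    print *, 'dgesv returned info = ', info
    stop 1
  end if

  print *, 'solution x    = ', b(:, 1)
  print *, 'pivot indices = ', ipiv
end program gesv_sketch
```

The exact solution of this system is x = (0.8, 1.4), so the printed values should agree with it to within rounding.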
The dsgesv and zcgesv are mixed precision iterative refinement subroutines for exploiting fast single
precision hardware. They first attempt to factorize the matrix in single precision (dsgesv) or single complex
precision (zcgesv) and use this factorization within an iterative refinement procedure to produce a solution
with double precision (dsgesv) / double complex precision (zcgesv) normwise backward error quality (see
below). If the approach fails, the method switches to a double precision or double complex precision
factorization respectively and computes the solution.
Iterative refinement does not pay off if the ratio of single precision performance to double precision
performance is too small. A reasonable strategy should take the number of right-hand sides and the size of
the matrix into account. This might be done with a call to ilaenv in the future. At present, iterative
refinement is always attempted.
The iterative refinement process is stopped if
iter > itermax
or for all the right-hand sides:
rnrm < sqrt(n)*xnrm*anrm*eps*bwdmax
where
iter is the number of the current iteration in the iterative refinement process
rnrm is the infinity-norm of the residual
xnrm is the infinity-norm of the solution
anrm is the infinity-operator-norm of the matrix A
eps is the machine epsilon returned by dlamch('Epsilon').
The values itermax and bwdmax are fixed to 30 and 1.0d+00 respectively.
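A minimal calling sketch for the mixed precision routine dsgesv, using the syntax and workspace sizes listed in this section. The system is made up for illustration, and the program must be linked against Intel MKL or another LAPACK implementation. The sign of iter tells which path the routine took.

```fortran
program dsgesv_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), sp = kind(1.0)
  integer, parameter :: n = 2, nrhs = 1, lda = n, ldb = n, ldx = n
  real(dp) :: a(lda, n), b(ldb, nrhs), x(ldx, nrhs), work(n*nrhs)
  real(sp) :: swork(n*(n + nrhs))
  integer  :: ipiv(n), iter, info
  external :: dsgesv

  a = reshape([2.0_dp, 1.0_dp, 1.0_dp, 3.0_dp], [lda, n])
  b(:, 1) = [3.0_dp, 5.0_dp]

  ! b is left unchanged; the solution is returned in x.
  call dsgesv(n, nrhs, a, lda, ipiv, b, ldb, x, ldx, work, swork, iter, info)

  if (info /= 0) then
    print *, 'dsgesv returned info = ', info
    stop 1
  end if

  if (iter >= 0) then
    print *, 'single precision factorization refined in ', iter, ' iteration(s)'
  else
    print *, 'fell back to double precision factorization, iter = ', iter
  end if
  print *, 'solution x = ', x(:, 1)
end program dsgesv_sketch
```

Whichever path is taken, the returned x should agree with the exact solution (0.8, 1.4) to roughly double precision accuracy for this well-conditioned system.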
Input Parameters
n INTEGER. The number of linear equations, that is, the order of the
matrix A; n ≥ 0.
a REAL for sgesv
DOUBLE PRECISION for dgesv and dsgesv
COMPLEX for cgesv
DOUBLE COMPLEX for zgesv and zcgesv.
The array a (size lda by *) contains the n-by-n coefficient matrix
A.
The second dimension of a must be at least max(1, n).
b REAL for sgesv
DOUBLE PRECISION for dgesv and dsgesv
COMPLEX for cgesv
DOUBLE COMPLEX for zgesv and zcgesv.
The array b (size ldb by *) contains the n-by-nrhs right-hand side
matrix B.
The second dimension of b must be at least max(1, nrhs).
lda INTEGER. The leading dimension of the array a; lda ≥ max(1, n).
ldx INTEGER. The leading dimension of the array x; ldx ≥ max(1, n).
work DOUBLE PRECISION for dsgesv

DOUBLE COMPLEX for zcgesv.
Workspace array, size at least max(1,n*nrhs). This array is used to
hold the residual vectors.
swork REAL for dsgesv

COMPLEX for zcgesv.
Workspace array, size at least max(1,n*(n+nrhs)). This array is
used to hold the single precision matrix and the right-hand sides or
solutions in single precision.
rwork DOUBLE PRECISION. Workspace array, size at least max(1,n).
Output Parameters
a Overwritten by the factors L and U from the factorization of A =
P*L*U; the unit diagonal elements of L are not stored.
If iterative refinement has been successfully used (info = 0 and
iter ≥ 0), then A is unchanged.
If double precision factorization has been used (info = 0 and iter <
0), then the array A contains the factors L and U from the
factorization A = P*L*U; the unit diagonal elements of L are not
stored.
b Overwritten by the solution matrix X for sgesv, dgesv, cgesv, zgesv.
Unchanged for dsgesv and zcgesv.
ipiv INTEGER.
Array, size at least max(1, n). The pivot indices that define the
permutation matrix P; row i of the matrix was interchanged with row
ipiv(i). Corresponds to the single precision factorization (if info = 0
and iter ≥ 0) or the double precision factorization (if info = 0 and
iter < 0).
x DOUBLE PRECISION for dsgesv
DOUBLE COMPLEX for zcgesv.
Array, size ldx by nrhs. If info = 0, contains the n-by-nrhs solution
matrix X.
iter INTEGER.
If iter < 0: iterative refinement has failed, double precision
factorization has been performed.
If iter = -1: the routine fell back to full precision for an
implementation- or machine-specific reason.
If iter = -2: narrowing the precision induced an overflow, the
routine fell back to full precision.
If iter = -3: failure of sgetrf for dsgesv, or cgetrf for zcgesv.
If iter = -31: stop the iterative refinement after the 30th
iteration.
If iter > 0: iterative refinement has been successfully used. Returns
the number of iterations.
info INTEGER.
If info = i, U(i,i) (computed in double precision for mixed precision
subroutines) is exactly zero. The factorization has been completed,
but the factor U is exactly singular, so the solution could not be
computed.

Specific details for the routine gesv interface are as follows:
NOTE
The Fortran 95 interface is not yet available for the mixed precision subroutines dsgesv/zcgesv.
See Also
ilaenv
?lamch
?getrf
?gesvx
Computes the solution to the system of linear
equations with a square coefficient matrix A and
multiple right-hand sides, and provides error bounds
on the solution.
Syntax
call sgesvx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, ferr, berr, work, iwork, info )
call dgesvx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, ferr, berr, work, iwork, info )
call cgesvx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, ferr, berr, work, rwork, info )
call zgesvx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, ferr, berr, work, rwork, info )
call gesvx( a, b, x [,af] [,ipiv] [,fact] [,trans] [,equed] [,r] [,c] [,ferr] [,berr]
[,rcond] [,rpvgrw] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the LU factorization to compute the solution to a real or complex system of linear equations
A*X = B, where A is an n-by-n matrix, the columns of matrix B are individual right-hand sides, and the
columns of X are the corresponding solutions.
Error bounds on the solution and a condition estimate are also provided.
The routine ?gesvx performs the following steps:
1. If fact = 'E', real scaling factors r and c are computed to equilibrate the system:
   trans = 'N': diag(r)*A*diag(c)*inv(diag(c))*X = diag(r)*B
   trans = 'T': (diag(r)*A*diag(c))^T*inv(diag(r))*X = diag(c)*B
   trans = 'C': (diag(r)*A*diag(c))^H*inv(diag(r))*X = diag(c)*B
   Whether or not the system will be equilibrated depends on the scaling of the matrix A, but if
   equilibration is used, A is overwritten by diag(r)*A*diag(c) and B by diag(r)*B (if trans = 'N') or
   diag(c)*B (if trans = 'T' or 'C').
2. If fact = 'N' or 'E', the LU decomposition is used to factor the matrix A (after equilibration if fact
   = 'E') as A = P*L*U, where P is a permutation matrix, L is a unit lower triangular matrix, and U is
   upper triangular.
3. If some U(i,i) = 0, so that U is exactly singular, then the routine returns with info = i. Otherwise, the
   factored form of A is used to estimate the condition number of the matrix A. If the reciprocal of the
   condition number is less than machine precision, info = n + 1 is returned as a warning, but the
   routine still goes on to solve for X and compute error bounds as described below.
4. The system of equations is solved for X using the factored form of A.
5. Iterative refinement is applied to improve the computed solution matrix and calculate error bounds and
   backward error estimates for it.
6. If equilibration was used, the matrix X is premultiplied by diag(c) (if trans = 'N') or diag(r) (if
   trans = 'T' or 'C') so that it solves the original system before equilibration.
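The steps above can be exercised with a short calling sketch for dgesvx with fact = 'E'. The matrix and right-hand side are made up for illustration (the exact solution is x = (1, 1)), the workspace sizes follow the FORTRAN 77 requirements for real flavors, and the program must be linked against Intel MKL or another LAPACK implementation.

```fortran
program gesvx_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2, nrhs = 1
  integer, parameter :: lda = n, ldaf = n, ldb = n, ldx = n
  real(dp) :: a(lda, n), af(ldaf, n), b(ldb, nrhs), x(ldx, nrhs)
  real(dp) :: r(n), c(n), ferr(nrhs), berr(nrhs), work(4*n), rcond
  integer  :: ipiv(n), iwork(n), info
  character(len=1) :: equed
  external :: dgesvx

  ! Badly row-scaled system A*x = b with A = [1 2; 1000 4000], b = A*(1,1).
  a = reshape([1.0_dp, 1000.0_dp, 2.0_dp, 4000.0_dp], [lda, n])
  b(:, 1) = [3.0_dp, 5000.0_dp]

  ! fact = 'E': equilibrate if necessary, factor, solve, refine.
  call dgesvx('E', 'N', n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, &
              b, ldb, x, ldx, rcond, ferr, berr, work, iwork, info)

  if (info > 0 .and. info <= n) then
    print *, 'U is exactly singular, info = ', info
    stop 1
  else if (info < 0) then
    print *, 'illegal argument, info = ', info
    stop 2
  end if
  if (info == n + 1) print *, 'warning: matrix is singular to working precision'

  print *, 'equed = ', equed
  print *, 'rcond = ', rcond
  print *, 'x     = ', x(:, 1)
  print *, 'ferr  = ', ferr, ' berr = ', berr
  print *, 'reciprocal pivot growth = ', work(1)
end program gesvx_sketch
```

Because fact = 'E' may overwrite a and b with their equilibrated forms, keep copies if the original data is needed after the call. The returned x always refers to the original system.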
Input Parameters
fact CHARACTER*1. Must be 'F', 'N', or 'E'.
Specifies whether or not the factored form of the matrix A is supplied
on entry, and if not, whether the matrix A should be equilibrated
before it is factored.
If fact = 'F': on entry, af and ipiv contain the factored form of A. If
equed is not 'N', the matrix A has been equilibrated with scaling
factors given by r and c.
a, af, and ipiv are not modified.
If fact = 'N', the matrix A will be copied to af and factored.
If fact = 'E', the matrix A will be equilibrated if necessary, then
copied to af and factored.
trans CHARACTER*1. Must be 'N', 'T', or 'C'.
Specifies the form of the system of equations:
If trans = 'N', the system has the form A*X = B (No transpose).
If trans = 'T', the system has the form A^T*X = B (Transpose).
If trans = 'C', the system has the form A^H*X = B (Transpose for
real flavors, conjugate transpose for complex flavors).
n INTEGER. The number of linear equations; the order of the matrix A;
n ≥ 0.
nrhs INTEGER. The number of right hand sides; the number of columns of
the matrices B and X; nrhs ≥ 0.
a REAL for sgesvx
DOUBLE PRECISION for dgesvx
COMPLEX for cgesvx
DOUBLE COMPLEX for zgesvx.
The array a (size lda by *) contains the matrix A. If fact = 'F' and
equed is not 'N', then A must have been equilibrated by the scaling
factors in r and/or c. The second dimension of a must be at least
max(1,n).
af REAL for sgesvx
DOUBLE PRECISION for dgesvx
COMPLEX for cgesvx
DOUBLE COMPLEX for zgesvx.
The array af (size ldaf by *) is an input argument if fact =
'F'. It contains the factored form of the matrix A, that is, the factors
L and U from the factorization A = P*L*U as computed by ?getrf. If
equed is not 'N', then af is the factored form of the equilibrated
matrix A. The second dimension of af must be at least max(1,n).
b REAL for sgesvx
DOUBLE PRECISION for dgesvx
COMPLEX for cgesvx
DOUBLE COMPLEX for zgesvx.
The array b (size ldb by *) contains the matrix B whose columns
are the right-hand sides for the systems of equations. The second
dimension of b must be at least max(1, nrhs).
work REAL for sgesvx
DOUBLE PRECISION for dgesvx
COMPLEX for cgesvx
DOUBLE COMPLEX for zgesvx.
Workspace array, size at least
max(1,4*n) for real flavors, and at least max(1,2*n) for complex
flavors.
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv is an input argument if
fact = 'F'. It contains the pivot indices from the factorization A =
P*L*U as computed by ?getrf; row i of the matrix was interchanged
with row ipiv(i).

equed CHARACTER*1. Must be 'N', 'R', 'C', or 'B'.
equed is an input argument if fact = 'F'. It specifies the form of
equilibration that was done:
If equed = 'N', no equilibration was done (always true if fact =
'N').
If equed = 'R', row equilibration was done, that is, A has been
premultiplied by diag(r).
If equed = 'C', column equilibration was done, that is, A has been
postmultiplied by diag(c).
If equed = 'B', both row and column equilibration was done, that is,
A has been replaced by diag(r)*A*diag(c).
r, c REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Arrays: r (size n), c (size n). The array r contains the row scale
factors for A, and the array c contains the column scale factors for A.
These arrays are input arguments if fact = 'F' only; otherwise they
are output arguments.
If equed = 'R' or 'B', A is multiplied on the left by diag(r); if equed
= 'N' or 'C', r is not accessed.
If fact = 'F' and equed = 'R' or 'B', each element of r must be
positive.
If equed = 'C' or 'B', A is multiplied on the right by diag(c); if
equed = 'N' or 'R', c is not accessed.
If fact = 'F' and equed = 'C' or 'B', each element of c must be
positive.
ldx INTEGER. The leading dimension of the output array x; ldx ≥ max(1,
n).
iwork INTEGER. Workspace array, size at least max(1, n); used in real
flavors only.
rwork REAL for cgesvx
DOUBLE PRECISION for zgesvx.
Workspace array, size at least max(1, 2*n); used in complex flavors
only.
Output Parameters
x REAL for sgesvx
DOUBLE PRECISION for dgesvx
COMPLEX for cgesvx
DOUBLE COMPLEX for zgesvx.
If info = 0 or info = n+1, the array x contains the solution matrix
X to the original system of equations. Note that A and B are modified
on exit if equed ≠ 'N', and the solution to the equilibrated system is:
inv(diag(c))*X, if trans = 'N' and equed = 'C' or 'B';
inv(diag(r))*X, if trans = 'T' or 'C' and equed = 'R' or 'B'. The
second dimension of x must be at least max(1,nrhs).
a Array a is not modified on exit if fact = 'F' or 'N', or if fact =
'E' and equed = 'N'. If equed ≠ 'N', A is scaled on exit as follows:
equed = 'R': A = diag(r)*A
equed = 'C': A = A*diag(c)
equed = 'B': A = diag(r)*A*diag(c).
af If fact = 'N' or 'E', then af is an output argument and on exit
returns the factors L and U from the factorization A = P*L*U of the
original matrix A (if fact = 'N') or of the equilibrated matrix A (if
fact = 'E'). See the description of a for the form of the equilibrated
matrix.
b Overwritten by diag(r)*B if trans = 'N' and equed = 'R' or 'B';
overwritten by diag(c)*B if trans = 'T' or 'C' and equed = 'C'
or 'B';
not changed if equed = 'N'.
r, c These arrays are output arguments if fact ≠ 'F'. See the description
of r, c in Input Arguments section.
rcond REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
An estimate of the reciprocal condition number of the matrix A after
equilibration (if done). If rcond is less than the machine precision, in
particular, if rcond = 0, the matrix is singular to working precision.
This condition is indicated by a return code of info > 0.
ferr REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Array, size at least max(1, nrhs). Contains the estimated forward
error bound for each solution vector x(j) (the j-th column of the
solution matrix X). If xtrue is the true solution corresponding to x(j),
ferr(j) is an estimated upper bound for the magnitude of the largest
element in (x(j) - xtrue) divided by the magnitude of the largest
element in x(j). The estimate is as reliable as the estimate for
rcond, and is almost always a slight overestimate of the true error.
berr REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Array, size at least max(1, nrhs). Contains the component-wise
relative backward error for each solution vector x(j), that is, the
smallest relative change in any element of A or B that makes x(j) an
exact solution.
ipiv If fact = 'N' or 'E', then ipiv is an output argument and on exit
contains the pivot indices from the factorization A = P*L*U of the
original matrix A (if fact = 'N') or of the equilibrated matrix A (if
fact = 'E').
equed If fact ≠ 'F', then equed is an output argument. It specifies the form
of equilibration that was done (see the description of equed in Input
Arguments section).
work, rwork On exit, work(1) for real flavors, or rwork(1) for complex flavors
(the Fortran interface) contains the reciprocal pivot growth factor
norm(A)/norm(U). The "max absolute element" norm is used. If
work(1) for real flavors, or rwork(1) for complex flavors is much
less than 1, then the stability of the LU factorization of the
(equilibrated) matrix A could be poor. This also means that the
solution x, condition estimator rcond, and forward error bound ferr
could be unreliable. If factorization fails with 0 < info ≤ n, then
work(1) for real flavors, or rwork(1) for complex flavors contains
the reciprocal pivot growth factor for the leading info columns of A.
info INTEGER.
If info = i, and i ≤ n, then U(i, i) is exactly zero. The factorization
has been completed, but the factor U is exactly singular, so the
solution and error bounds could not be computed; rcond = 0 is
returned.
If info = n + 1, then U is nonsingular, but rcond is less than
machine precision, meaning that the matrix is singular to working
precision. Nevertheless, the solution and error bounds are computed
because there are a number of situations where the computed
solution can be more accurate than the value of rcond would suggest.

Specific details for the routine gesvx interface are as follows:
r Holds the vector of length n. Default value for each element is r(i) =
1.0_WP.
c Holds the vector of length n. Default value for each element is c(i) =
1.0_WP.
fact Must be 'N', 'E', or 'F'. The default value is 'N'. If fact = 'F',
then both arguments af and ipiv must be present; otherwise, an error
is returned.
equed Must be 'N', 'B', 'C', or 'R'. The default value is 'N'.
rpvgrw Real value that contains the reciprocal pivot growth factor norm(A)/
norm(U).
See Also
?gesvxx
Uses extra precise iterative refinement to compute the
solution to the system of linear equations with a
square coefficient matrix A and multiple right-hand
sides
Syntax
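call sgesvxx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params,
work, iwork, info )
call dgesvxx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params,
work, iwork, info )
call cgesvxx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params,
work, rwork, info )
call zgesvxx( fact, trans, n, nrhs, a, lda, af, ldaf, ipiv, equed, r, c, b, ldb, x,
ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params,
work, rwork, info )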
Include Files
mkl.fi, lapack.f90
Description
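The routine uses the LU factorization to compute the solution to a real or complex system of linear equations
A*X = B, where A is an n-by-n matrix, the columns of the matrix B are individual right-hand sides, and the
columns of X are the corresponding solutions.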
Both normwise and maximum componentwise error bounds are also provided on request. The routine returns
a solution with a small guaranteed error (O(eps), where eps is the working machine precision) unless the
matrix is very ill-conditioned, in which case a warning is returned. Relevant condition numbers are also
calculated and returned.
The routine accepts user-provided factorizations and equilibration factors; see definitions of the fact and
equed options. Solving with refinement and using a factorization from a previous call of the routine also
produces a solution with O(eps) errors or warnings but that may not be true for general user-provided
factorizations and equilibration factors if they differ from what the routine would itself produce.
The routine ?gesvxx performs the following steps:
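1. If fact = 'E', scaling factors r and c are computed to equilibrate the system:
trans = 'N': diag(r)*A*diag(c)*inv(diag(c))*X = diag(r)*B
trans = 'T': (diag(r)*A*diag(c))^T*inv(diag(r))*X = diag(c)*B
trans = 'C': (diag(r)*A*diag(c))^H*inv(diag(r))*X = diag(c)*B
Whether the system will be equilibrated depends on the scaling of the matrix A, but if equilibration is
used, A is overwritten by diag(r)*A*diag(c) and B by diag(r)*B (if trans = 'N') or diag(c)*B (if
trans = 'T' or 'C').
2. If fact = 'N' or 'E', the LU decomposition is used to factor the matrix A (after equilibration if fact =
'E') as A = P*L*U, where P is a permutation matrix, L is a unit lower triangular matrix, and U is
upper triangular.
3. If some U(i,i) = 0, so that U is exactly singular, then the routine returns with info = i. Otherwise, the
factored form of A is used to estimate the condition number of the matrix A (see the rcond parameter).
If the reciprocal of the condition number is less than machine precision, the routine still goes on to
solve for X and compute error bounds.
4. The system of equations is solved for X using the factored form of A.
5. By default, unless params(la_linrx_itref_i) is set to zero, the routine applies iterative refinement
to improve the computed solution matrix and calculate error bounds. Refinement calculates the residual
to at least twice the working precision.
6. If equilibration was used, the matrix X is premultiplied by diag(c) (if trans = 'N') or diag(r) (if
trans = 'T' or 'C') so that it solves the original system before equilibration.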
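The equilibration step can be pictured with a short Python sketch that computes row factors r and column factors c and forms diag(r)*A*diag(c). It is a simplified illustration under stated assumptions, not the routine's actual algorithm: the function name equilibrate is hypothetical, the matrix is assumed to have no zero row or column, and the factors are not rounded to powers of the radix as the routine's output factors are.

```python
def equilibrate(a):
    """Return (r, c, scaled) with scaled = diag(r) * a * diag(c).

    r(i) is the reciprocal of the largest magnitude in row i; c(j) is the
    reciprocal of the largest magnitude in column j of the row-scaled matrix.
    """
    n = len(a)
    r = [1.0 / max(abs(v) for v in row) for row in a]
    c = [1.0 / max(r[i] * abs(a[i][j]) for i in range(n)) for j in range(n)]
    scaled = [[r[i] * a[i][j] * c[j] for j in range(n)] for i in range(n)]
    return r, c, scaled
```

After this scaling the largest magnitude in every row and every column is 1 up to rounding, which is the goal of equilibration: entries of comparable size before the matrix is factored.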
Input Parameters

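fact CHARACTER*1. Must be 'F', 'N', or 'E'.
Specifies whether or not the factored form of the matrix A is supplied on
entry, and if not, whether the matrix A should be equilibrated before it is
factored.
If fact = 'F', on entry, af and ipiv contain the factored form of A. If
equed is not 'N', the matrix A has been equilibrated with scaling factors
given by r and c. Parameters a, af, and ipiv are not modified.
If fact = 'N', the matrix A will be copied to af and factored.
If fact = 'E', the matrix A will be equilibrated, if necessary, copied to af
and factored.
trans CHARACTER*1. Must be 'N', 'T', or 'C'.
Specifies the form of the system of equations:
If trans = 'N', the system has the form A*X = B (No transpose).
If trans = 'T', the system has the form A^T*X = B (Transpose).
If trans = 'C', the system has the form A^H*X = B (Conjugate Transpose
= Transpose for real flavors, Conjugate Transpose for complex flavors).
n INTEGER. The number of linear equations; the order of the matrix A; n ≥ 0.
nrhs INTEGER. The number of right-hand sides; the number of columns of the
matrices B and X; nrhs ≥ 0.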
a, af, b, work REAL for sgesvxx

DOUBLE PRECISION for dgesvxx
COMPLEX for cgesvxx
DOUBLE COMPLEX for zgesvxx.
Arrays: a(size lda by *), af(size ldaf by *), b(size ldb by *), work(*).
The array a contains the matrix A. If fact = 'F' and equed is not 'N',
then A must have been equilibrated by the scaling factors in r and/or c.
The array af is an input argument if fact = 'F'. It contains the factored
form of the matrix A, that is, the factors L and U from the factorization A =
P*L*U as computed by ?getrf. If equed is not 'N', then af is the factored
form of the equilibrated matrix A. The second dimension of af must be at
least max(1,n).
The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations. The second dimension of b must be at least
max(1,nrhs).
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv is an input argument if fact
= 'F'. It contains the pivot indices from the factorization A = P*L*U as
computed by ?getrf; row i of the matrix was interchanged with row
ipiv(i).

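equed CHARACTER*1. Must be 'N', 'R', 'C', or 'B'.
equed is an input argument if fact = 'F'. It specifies the form of
equilibration that was done:
If equed = 'N', no equilibration was done (always true if fact = 'N').
If equed = 'R', row equilibration was done, that is, A has been
premultiplied by diag(r).
If equed = 'C', column equilibration was done, that is, A has been
postmultiplied by diag(c).
If equed = 'B', both row and column equilibration was done, that is, A has
been replaced by diag(r)*A*diag(c).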
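r, c REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Arrays: r (size n), c (size n). The array r contains the row scale factors
for A, and the array c contains the column scale factors for A. These arrays
are input arguments if fact = 'F' only; otherwise they are output
arguments.
If equed = 'R' or 'B', A is multiplied on the left by diag(r); if equed =
'N' or 'C', r is not accessed.
If fact = 'F' and equed = 'R' or 'B', each element of r must be
positive.
If equed = 'C' or 'B', A is multiplied on the right by diag(c); if equed =
'N' or 'R', c is not accessed.
If fact = 'F' and equed = 'C' or 'B', each element of c must be
positive.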


Array, size max(1, nparams). Specifies algorithm parameters. If an entry is
less than 0.0, that entry is filled with the default value used for that
parameter. Only positions up to nparams are accessed; defaults are used
for higher-numbered parameters. If defaults are acceptable, you can pass
nparams = 0, which prevents the source code from accessing the params
argument.

are computed.

double precision.

refinement.
Default 10.0


convergence).
only.

Output Parameters
x REAL for sgesvxx

DOUBLE PRECISION for dgesvxx
COMPLEX for cgesvxx
DOUBLE COMPLEX for zgesvxx.
If info = 0, the array x contains the solution n-by-nrhs matrix X to the
original system of equations. Note that A and B are modified on exit if
equed ≠ 'N', and the solution to the equilibrated system is:
inv(diag(c))*X, if trans = 'N' and equed = 'C' or 'B'; or
inv(diag(r))*X, if trans = 'T' or 'C' and equed = 'R' or 'B'. The
second dimension of x must be at least max(1,nrhs).
a Array a is not modified on exit if fact = 'F' or 'N', or if fact = 'E' and
equed = 'N'.
If equed ≠ 'N', A is scaled on exit as follows:
equed = 'R': A = diag(r)*A
equed = 'C': A = A*diag(c)
equed = 'B': A = diag(r)*A*diag(c).
af If fact = 'N' or 'E', then af is an output argument and on exit returns
the factors L and U from the factorization A = P*L*U of the original matrix A
(if fact = 'N') or of the equilibrated matrix A (if fact = 'E'). See the
description of a for the form of the equilibrated matrix.
b Overwritten by diag(r)*B if trans = 'N' and equed = 'R' or 'B';
overwritten by diag(c)*B if trans = 'T' or 'C' and equed = 'C' or 'B';
not changed if equed = 'N'.
r, c These arrays are output arguments if fact ≠ 'F'. Each element of these
arrays is a power of the radix. See the description of r, c in Input
Arguments section.

rpvgrw REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Contains the reciprocal pivot growth factor norm(A)/norm(U).
If this is much less than 1, the stability of the LU factorization of the
(equilibrated) matrix A could be poor. This also means that the solution X,
estimated condition numbers, and error bounds could be unreliable. If
factorization fails with 0 < info ≤ n, this parameter contains the reciprocal
pivot growth factor for the leading info columns of A. In ?gesvx, this
quantity is returned in work(1).



side.

fields:



are:

approximately 1.

follows:

side.
fields:



are:

ipiv If fact = 'N' or 'E', then ipiv is an output argument and on exit
contains the pivot indices from the factorization A = P*L*U of the original
matrix A (if fact = 'N') or of the equilibrated matrix A (if fact = 'E').
equed If fact ≠ 'F', then equed is an output argument. It specifies the form of
equilibration that was done (see the description of equed in Input
Arguments section).
params If an entry is less than 0.0, that entry is filled with the default value used
for that parameter, otherwise the entry is not modified

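If 0 < info ≤ n: U(info,info) is exactly zero. The factorization has been
completed, but the factor U is exactly singular, so the solution and error
bounds could not be computed; rcond = 0 is returned.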

See Also
?gbsv
Computes the solution to the system of linear
equations with a band coefficient matrix A and
multiple right-hand sides.
Syntax
call sgbsv( n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call dgbsv( n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call cgbsv( n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call zgbsv( n, kl, ku, nrhs, ab, ldab, ipiv, b, ldb, info )
call gbsv( ab, b [,kl] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the real or complex system of linear equations A*X = B, where A is an n-by-n band
matrix with kl subdiagonals and ku superdiagonals, the columns of matrix B are individual right-hand sides,
and the columns of X are the corresponding solutions.
The LU decomposition with partial pivoting and row interchanges is used to factor A as A = L*U, where L is a
product of permutation and unit lower triangular matrices with kl subdiagonals, and U is upper triangular
with kl+ku superdiagonals. The factored form of A is then used to solve the system of equations A*X = B.
Input Parameters
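n INTEGER. The order of A. The number of rows in B; n ≥ 0.
kl INTEGER. The number of subdiagonals within the band of A; kl ≥ 0.
ku INTEGER. The number of superdiagonals within the band of A; ku ≥ 0.
nrhs INTEGER. The number of right-hand sides. The number of columns in
B; nrhs ≥ 0.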
ab, b REAL for sgbsv
DOUBLE PRECISION for dgbsv
COMPLEX for cgbsv
DOUBLE COMPLEX for zgbsv.
Arrays: ab(size ldab by *), b(size ldb by *).
The array ab contains the matrix A in band storage (see Matrix
Storage Schemes). The second dimension of ab must be at least
max(1, n).
ldab INTEGER. The leading dimension of the array ab; ldab ≥ 2*kl + ku + 1.
Output Parameters
ab Overwritten by L and U. The diagonal and kl + ku superdiagonals of U
are stored in the first 1 + kl + ku rows of ab. The multipliers used to
form L are stored in the next kl rows.
b Overwritten by the solution matrix X.
ipiv INTEGER.
Array, size at least max(1, n). The pivot indices: row i was
interchanged with row ipiv(i).
info INTEGER. If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, U(i, i) is exactly zero. The factorization has been
completed, but the factor U is exactly singular, so the solution could
not be computed.

Specific details for the routine gbsv interface are as follows:
See Also
?gbsvx
Computes the solution to the real or complex system
of linear equations with a band coefficient matrix A
and multiple right-hand sides, and provides error
bounds on the solution.
Syntax
call sgbsvx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c, b,
ldb, x, ldx, rcond, ferr, berr, work, iwork, info )
call dgbsvx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c, b,
ldb, x, ldx, rcond, ferr, berr, work, iwork, info )
call cgbsvx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c, b,
ldb, x, ldx, rcond, ferr, berr, work, rwork, info )
call zgbsvx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c, b,
ldb, x, ldx, rcond, ferr, berr, work, rwork, info )
call gbsvx( ab, b, x [,kl] [,afb] [,ipiv] [,fact] [,trans] [,equed] [,r] [,c] [,ferr]
[,berr] [,rcond] [,rpvgrw] [,info] )
Include Files
mkl.fi, lapack.f90
Description
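The routine uses the LU factorization to compute the solution to a real or complex system of linear equations
A*X = B, A^T*X = B, or A^H*X = B, where A is a band matrix of order n with kl subdiagonals and ku
superdiagonals, the columns of matrix B are individual right-hand sides, and the columns of X are the
corresponding solutions. Error bounds on the solution and a condition estimate are also provided.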
The routine ?gbsvx performs the following steps:
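1. If fact = 'E', real scaling factors r and c are computed to equilibrate the system:
trans = 'N': diag(r)*A*diag(c)*inv(diag(c))*X = diag(r)*B
trans = 'T': (diag(r)*A*diag(c))^T*inv(diag(r))*X = diag(c)*B
trans = 'C': (diag(r)*A*diag(c))^H*inv(diag(r))*X = diag(c)*B
Whether the system will be equilibrated depends on the scaling of the matrix A, but if equilibration is
used, A is overwritten by diag(r)*A*diag(c) and B by diag(r)*B (if trans = 'N') or diag(c)*B (if
trans = 'T' or 'C').
2. If fact = 'N' or 'E', the LU decomposition is used to factor the matrix A (after equilibration if fact =
'E') as A = L*U, where L is a product of permutation and unit lower triangular matrices with kl
subdiagonals, and U is upper triangular with kl+ku superdiagonals.
3. If some U(i,i) = 0, so that U is exactly singular, then the routine returns with info = i. Otherwise, the
factored form of A is used to estimate the condition number of the matrix A. If the reciprocal of the
condition number is less than machine precision, info = n + 1 is returned as a warning, but the routine
still goes on to solve for X and compute error bounds as described below.
4. The system of equations is solved for X using the factored form of A.
5. Iterative refinement is applied to improve the computed solution matrix and calculate error bounds
and backward error estimates for it.
6. If equilibration was used, the matrix X is premultiplied by diag(c) (if trans = 'N') or diag(r) (if
trans = 'T' or 'C') so that it solves the original system before equilibration.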
Input Parameters

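fact CHARACTER*1. Must be 'F', 'N', or 'E'.
Specifies whether the factored form of the matrix A is supplied on
entry, and if not, whether the matrix A should be equilibrated before it
is factored.
If fact = 'F': on entry, afb and ipiv contain the factored form of A.
If equed is not 'N', the matrix A is equilibrated with scaling factors
given by r and c.
ab, afb, and ipiv are not modified.
If fact = 'N', the matrix A will be copied to afb and factored.
If fact = 'E', the matrix A will be equilibrated, if necessary, then
copied to afb and factored.
trans CHARACTER*1. Must be 'N', 'T', or 'C'.
Specifies the form of the system of equations:
If trans = 'N', the system has the form A*X = B (No transpose).
If trans = 'T', the system has the form A^T*X = B (Transpose).
If trans = 'C', the system has the form A^H*X = B (Transpose for
real flavors, conjugate transpose for complex flavors).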
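n INTEGER. The number of linear equations, the order of the matrix A;
n ≥ 0.
kl INTEGER. The number of subdiagonals within the band of A; kl ≥ 0.
ku INTEGER. The number of superdiagonals within the band of A; ku ≥ 0.
nrhs INTEGER. The number of right-hand sides, the number of columns of
the matrices B and X; nrhs ≥ 0.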
ab, afb, b, work REAL for sgbsvx

DOUBLE PRECISION for dgbsvx
COMPLEX for cgbsvx
DOUBLE COMPLEX for zgbsvx.
Arrays: ab(ldab,*), afb(ldafb,*), b(size ldb by *), work(*).
The array ab contains the matrix A in band storage (see Matrix
Storage Schemes). The second dimension of ab must be at least
max(1, n). If fact = 'F' and equed is not 'N', then A must have
been equilibrated by the scaling factors in r and/or c.
The array afb is an input argument if fact = 'F'. The second
dimension of afb must be at least max(1,n). It contains the factored
form of the matrix A, that is, the factors L and U from the factorization
A = P*L*U as computed by ?gbtrf. U is stored as an upper
triangular band matrix with kl + ku superdiagonals in the first 1 + kl
+ ku rows of afb. The multipliers used during the factorization are
stored in the next kl rows. If equed is not 'N', then afb is the factored
form of the equilibrated matrix A.

flavors.
ldafb INTEGER. The leading dimension of afb; ldafb ≥ 2*kl+ku+1.
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv is an input argument if
fact = 'F'. It contains the pivot indices from the factorization A =
P*L*U as computed by ?gbtrf; row i of the matrix was interchanged
with row ipiv(i).

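equed CHARACTER*1. Must be 'N', 'R', 'C', or 'B'.
equed is an input argument if fact = 'F'. It specifies the form of
equilibration that was done:
If equed = 'N', no equilibration was done (always true if fact =
'N').
If equed = 'R', row equilibration was done, that is, A has been
premultiplied by diag(r).
If equed = 'C', column equilibration was done, that is, A has been
postmultiplied by diag(c).
If equed = 'B', both row and column equilibration was done, that is,
A has been replaced by diag(r)*A*diag(c).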

Arrays: r (size n), c (size n).
The array r contains the row scale factors for A, and the array c
contains the column scale factors for A. These arrays are input
arguments if fact = 'F' only; otherwise they are output arguments.
If equed = 'R' or 'B', A is multiplied on the left by diag(r); if
equed = 'N' or 'C', r is not accessed.
If fact = 'F' and equed = 'R' or 'B', each element of r must be
positive.
If equed = 'C' or 'B', A is multiplied on the right by diag(c); if
equed = 'N' or 'R', c is not accessed.
If fact = 'F' and equed = 'C' or 'B', each element of c must be
positive.
n).
flavors only.

Workspace array, size at least max(1, n); used in complex flavors
only.
Output Parameters
x REAL for sgbsvx

DOUBLE PRECISION for dgbsvx
COMPLEX for cgbsvx
DOUBLE COMPLEX for zgbsvx.
If info = 0 or info = n+1, the array x contains the solution matrix
X to the original system of equations. Note that A and B are modified
on exit if equed ≠ 'N', and the solution to the equilibrated system is:
inv(diag(c))*X, if trans = 'N' and equed = 'C' or 'B';
inv(diag(r))*X, if trans = 'T' or 'C' and equed = 'R' or 'B'.
The second dimension of x must be at least max(1,nrhs).
ab Array ab is not modified on exit if fact = 'F' or 'N', or if fact =
'E' and equed = 'N'.
afb If fact = 'N' or 'E', then afb is an output argument and on exit
returns details of the LU factorization of the original matrix A (if fact
= 'N') or of the equilibrated matrix A (if fact = 'E'). See the
description of ab for the form of the equilibrated matrix.
b Overwritten by diag(r)*B if trans = 'N' and equed = 'R' or 'B';
overwritten by diag(c)*B if trans = 'T' or 'C' and equed = 'C'
or 'B';
not changed if equed = 'N'.
r, c These arrays are output arguments if fact ≠ 'F'. See the description
of r, c in Input Arguments section.
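rcond REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
An estimate of the reciprocal condition number of the matrix A after
equilibration (if done).
If rcond is less than the machine precision (in particular, if rcond = 0),
the matrix is singular to working precision. This condition is indicated
by a return code of info > 0.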


exact solution.
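ipiv If fact = 'N' or 'E', then ipiv is an output argument and on exit
contains the pivot indices from the factorization A = L*U of the
original matrix A (if fact = 'N') or of the equilibrated matrix A (if
fact = 'E').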


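If info = i, and i ≤ n, then U(i, i) is exactly zero. The factorization
has been completed, but the factor U is exactly singular, so the
solution and error bounds could not be computed; rcond = 0 is
returned. If info = i, and i = n+1, then U is nonsingular, but rcond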
is less than machine precision, meaning that the matrix is singular to
working precision. Nevertheless, the solution and error bounds are
computed because there are a number of situations where the
computed solution can be more accurate than the value of rcond
would suggest.

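equed If fact ≠ 'F', then equed is an output argument. It specifies the
form of equilibration that was done (see the description of equed in Input
Arguments section).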
work, rwork On exit, work(1) for real flavors, or rwork(1) for complex flavors,
contains the reciprocal pivot growth factor norm(A)/norm(U). The
"max absolute element" norm is used. If work(1) for real flavors, or
rwork(1) for complex flavors is much less than 1, then the stability of
the LU factorization of the (equilibrated) matrix A could be poor. This
also means that the solution x, condition estimator rcond, and forward
error bound ferr could be unreliable. If factorization fails with 0 <
info ≤ n, then work(1) for real flavors, or rwork(1) for complex
flavors contains the reciprocal pivot growth factor for the leading info
columns of A.



Specific details for the routine gbsvx interface are as follows:
afb Holds the array AF of size (2*kl+ku+1,n).
r Holds the vector of length n. Default value for each element is r(i) =
1.0_WP.
c Holds the vector of length n. Default value for each element is c(i) =
1.0_WP.
equed Must be 'N', 'B', 'C', or 'R'. The default value is 'N'.
fact Must be 'N', 'E', or 'F'. The default value is 'N'. If fact = 'F',
then both arguments afb and ipiv must be present; otherwise, an error
is returned.
rpvgrw Real value that contains the reciprocal pivot growth factor norm(A)/
norm(U).
See Also
?gbsvxx
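Uses extra precise iterative refinement to compute the
solution to the system of linear equations with a
banded coefficient matrix A and multiple right-hand
sides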
Syntax
call sgbsvxx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c,
b, ldb, x, ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp,
nparams, params, work, iwork, info )
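call dgbsvxx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c,
b, ldb, x, ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp,
nparams, params, work, iwork, info )
call cgbsvxx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c,
b, ldb, x, ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp,
nparams, params, work, rwork, info )
call zgbsvxx( fact, trans, n, kl, ku, nrhs, ab, ldab, afb, ldafb, ipiv, equed, r, c,
b, ldb, x, ldx, rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp,
nparams, params, work, rwork, info )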
Include Files
mkl.fi, lapack.f90
Description
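The routine uses the LU factorization to compute the solution to a real or complex system of linear equations
A*X = B, A^T*X = B, or A^H*X = B, where A is an n-by-n banded matrix, the columns of the matrix B are
individual right-hand sides, and the columns of X are the corresponding solutions.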
The routine ?gbsvxx performs the following steps:
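1. If fact = 'E', scaling factors r and c are computed to equilibrate the system:
trans = 'N': diag(r)*A*diag(c)*inv(diag(c))*X = diag(r)*B
trans = 'T': (diag(r)*A*diag(c))^T*inv(diag(r))*X = diag(c)*B
trans = 'C': (diag(r)*A*diag(c))^H*inv(diag(r))*X = diag(c)*B
Whether the system will be equilibrated depends on the scaling of the matrix A, but if equilibration is
used, A is overwritten by diag(r)*A*diag(c) and B by diag(r)*B (if trans = 'N') or diag(c)*B (if
trans = 'T' or 'C').
2. If fact = 'N' or 'E', the LU decomposition is used to factor the matrix A (after equilibration if fact =
'E') as A = P*L*U, where P is a permutation matrix, L is a unit lower triangular matrix, and U is
upper triangular.
3. If some U(i,i) = 0, so that U is exactly singular, then the routine returns with info = i. Otherwise, the
factored form of A is used to estimate the condition number of the matrix A (see the rcond parameter).
If the reciprocal of the condition number is less than machine precision, the routine still goes on to
solve for X and compute error bounds.
4. The system of equations is solved for X using the factored form of A.
5. By default, unless params(1) is set to zero, the routine applies iterative refinement to improve the
computed solution matrix and calculate error bounds. Refinement calculates the residual to at least
twice the working precision.
6. If equilibration was used, the matrix X is premultiplied by diag(c) (if trans = 'N') or diag(r) (if
trans = 'T' or 'C') so that it solves the original system before equilibration.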
Input Parameters

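fact CHARACTER*1. Must be 'F', 'N', or 'E'.
Specifies whether or not the factored form of the matrix A is supplied on
entry, and if not, whether the matrix A should be equilibrated before it is
factored.
If fact = 'F', on entry, afb and ipiv contain the factored form of A. If
equed is not 'N', the matrix A has been equilibrated with scaling factors
given by r and c. Parameters ab, afb, and ipiv are not modified.
If fact = 'N', the matrix A will be copied to afb and factored.
If fact = 'E', the matrix A will be equilibrated, if necessary, copied to afb
and factored.
trans CHARACTER*1. Must be 'N', 'T', or 'C'.
Specifies the form of the system of equations:
If trans = 'N', the system has the form A*X = B (No transpose).
If trans = 'T', the system has the form A^T*X = B (Transpose).
If trans = 'C', the system has the form A^H*X = B (Conjugate Transpose
= Transpose for real flavors, Conjugate Transpose for complex flavors).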
ab, afb, b, work REAL for sgbsvxx
DOUBLE PRECISION for dgbsvxx
COMPLEX for cgbsvxx
DOUBLE COMPLEX for zgbsvxx.
Arrays: ab(ldab,*), afb(ldafb,*), b(size ldb by *), work(*).
The array ab contains the matrix A in band storage, in rows 1 to kl+ku+1.
The j-th column of A is stored in the j-th column of the array ab as follows:
ab(ku+1+i-j,j) = A(i,j) for max(1,j-ku) ≤ i ≤ min(n,j+kl).
If fact = 'F' and equed is not 'N', then ab must have been equilibrated
by the scaling factors in r and/or c. The second dimension of ab must be at
least max(1,n).
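The band storage mapping ab(ku+1+i-j,j) = A(i,j) can be checked with a short Python sketch. This is only an illustration with a hypothetical helper name (to_band); it uses 0-based indices, so the 1-based formula becomes ab[ku + i - j][j] = a[i][j], and it builds the (kl+ku+1)-row layout described for this routine rather than the taller layout with extra rows for fill-in.

```python
def to_band(a, kl, ku):
    """Pack a dense n-by-n matrix with kl sub- and ku superdiagonals.

    Returns a (kl + ku + 1)-by-n list of rows; unused corners stay 0.0.
    """
    n = len(a)
    ab = [[0.0] * n for _ in range(kl + ku + 1)]
    for j in range(n):
        # 0-based version of max(1, j-ku) <= i <= min(n, j+kl).
        for i in range(max(0, j - ku), min(n - 1, j + kl) + 1):
            ab[ku + i - j][j] = a[i][j]
    return ab
```

With kl = 1 and ku = 2, the main diagonal lands in row ku (0-based) of ab, the superdiagonals above it, and the subdiagonal below it, each kept in its original column.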
The array afb is an input argument if fact = 'F'. It contains the factored
form of the banded matrix A, that is, the factors L and U from the
factorization A = P*L*U as computed by ?gbtrf. U is stored as an upper
triangular banded matrix with kl + ku superdiagonals in rows 1 to kl + ku
+ 1. The multipliers used during the factorization are stored in rows kl + ku
+ 2 to 2*kl + ku + 1. If equed is not 'N', then afb is the factored form of
the equilibrated matrix A.
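The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations. The second dimension of b must be at least
max(1,nrhs).
ldab INTEGER. The leading dimension of the array ab; ldab ≥ kl+ku+1.
ldafb INTEGER. The leading dimension of the array afb; ldafb ≥ 2*kl+ku+1.
ipiv INTEGER.
Array, size at least max(1, n). The array ipiv is an input argument if fact
= 'F'. It contains the pivot indices from the factorization A = P*L*U as
computed by ?gbtrf; row i of the matrix was interchanged with row
ipiv(i).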

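equed CHARACTER*1. Must be 'N', 'R', 'C', or 'B'.
equed is an input argument if fact = 'F'. It specifies the form of
equilibration that was done:
If equed = 'N', no equilibration was done (always true if fact = 'N').
If equed = 'R', row equilibration was done, that is, A has been
premultiplied by diag(r).
If equed = 'C', column equilibration was done, that is, A has been
postmultiplied by diag(c).
If equed = 'B', both row and column equilibration was done, that is, A has
been replaced by diag(r)*A*diag(c).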
Arrays: r (size n), c (size n). The array r contains the row scale factors for
A, and the array c contains the column scale factors for A. These arrays are
input arguments if fact = 'F' only; otherwise they are output arguments.

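If equed = 'R' or 'B', A is multiplied on the left by diag(r); if equed =
'N' or 'C', r is not accessed.
If fact = 'F' and equed = 'R' or 'B', each element of r must be
positive.
If equed = 'C' or 'B', A is multiplied on the right by diag(c); if equed =
'N' or 'R', c is not accessed.
If fact = 'F' and equed = 'C' or 'B', each element of c must be
positive.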




are computed.
=1.0 Use the extra-precise refinement algorithm.

refinement.
Default 10.0


convergence).
only.

Output Parameters
x REAL for sgbsvxx

DOUBLE PRECISION for dgbsvxx
COMPLEX for cgbsvxx
DOUBLE COMPLEX for zgbsvxx.
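If info = 0, the array x contains the solution n-by-nrhs matrix X to the
original system of equations. Note that A and B are modified on exit if
equed ≠ 'N', and the solution to the equilibrated system is:
inv(diag(c))*X, if trans = 'N' and equed = 'C' or 'B'; or
inv(diag(r))*X, if trans = 'T' or 'C' and equed = 'R' or 'B'. The
second dimension of x must be at least max(1,nrhs).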
ab Array ab is not modified on exit if fact = 'F' or 'N', or if fact = 'E'
and equed = 'N'.

afb If fact = 'N' or 'E', then afb is an output argument and on exit returns
the factors L and U from the factorization A = P*L*U of the original matrix A
(if fact = 'N') or of the equilibrated matrix A (if fact = 'E').
b Overwritten by diag(r)*B if trans = 'N' and equed = 'R' or 'B';
overwritten by diag(c)*B if trans = 'T' or 'C' and equed = 'C' or 'B';
not changed if equed = 'N'.
r, c These arrays are output arguments if fact ≠ 'F'. Each element of these
arrays is a power of the radix. See the description of r, c in Input
Arguments section.



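rpvgrw REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Contains the reciprocal pivot growth factor norm(A)/norm(U).
If this is much less than 1, the stability of the LU factorization of the
(equilibrated) matrix A could be poor. This also means that the solution X,
estimated condition numbers, and error bounds could be unreliable. If
factorization fails with 0 < info ≤ n, this parameter contains the reciprocal
pivot growth factor for the leading info columns of A. In ?gbsvx, this
quantity is returned in work(1).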



side.
fields:



are:

approximately 1.

follows:


side.
fields:



are:

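ipiv If fact = 'N' or 'E', then ipiv is an output argument and on exit
contains the pivot indices from the factorization A = P*L*U of the original
matrix A (if fact = 'N') or of the equilibrated matrix A (if fact = 'E').
equed If fact ≠ 'F', then equed is an output argument. It specifies the form of
equilibration that was done (see the description of equed in Input
Arguments section).
params If an entry is less than 0.0, that entry is filled with the default value used
for that parameter, otherwise the entry is not modified.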


See Also
?gtsv
Computes the solution to the system of linear
equations with a tridiagonal coefficient matrix A and
multiple right-hand sides.
Syntax
call sgtsv( n, nrhs, dl, d, du, b, ldb, info )
call dgtsv( n, nrhs, dl, d, du, b, ldb, info )
call cgtsv( n, nrhs, dl, d, du, b, ldb, info )
call zgtsv( n, nrhs, dl, d, du, b, ldb, info )
call gtsv( dl, d, du, b [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B, where A is an n-by-n tridiagonal matrix, the
columns of matrix B are individual right-hand sides, and the columns of X are the corresponding solutions.
The routine uses Gaussian elimination with partial pivoting.
Note that the equation AT*X = B may be solved by interchanging the order of the arguments du and dl.
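To make the description concrete, here is a small Python sketch of Gaussian elimination with partial pivoting on a tridiagonal system with one right-hand side. It is an independent illustration, not the MKL code: solve_tridiagonal is a hypothetical name, the local list du2 is an assumption used to hold the second superdiagonal that row interchanges can create, and the inputs are copied rather than overwritten.

```python
def solve_tridiagonal(dl, d, du, b):
    """Solve A*x = b for tridiagonal A given by sub (dl), main (d), super (du)."""
    n = len(d)
    dl, d, du, x = list(dl), list(d), list(du), list(b)
    du2 = [0.0] * max(n - 2, 0)  # fill-in created by row interchanges
    for i in range(n - 1):
        if abs(d[i]) >= abs(dl[i]):
            # No interchange: eliminate dl[i] using row i.
            if d[i] == 0.0:
                raise ZeroDivisionError("U(%d,%d) is exactly zero" % (i + 1, i + 1))
            m = dl[i] / d[i]
            d[i + 1] -= m * du[i]
            x[i + 1] -= m * x[i]
        else:
            # Interchange rows i and i+1, then eliminate.
            m = d[i] / dl[i]
            old_d1, old_du = d[i + 1], du[i]
            d[i], du[i] = dl[i], old_d1
            d[i + 1] = old_du - m * old_d1
            if i < n - 2:
                du2[i] = du[i + 1]
                du[i + 1] = -m * du[i + 1]
            x[i], x[i + 1] = x[i + 1], x[i] - m * x[i + 1]
    if d[n - 1] == 0.0:
        raise ZeroDivisionError("U(%d,%d) is exactly zero" % (n, n))
    # Back substitution with U stored in d, du, du2.
    x[n - 1] /= d[n - 1]
    if n > 1:
        x[n - 2] = (x[n - 2] - du[n - 2] * x[n - 1]) / d[n - 2]
    for i in range(n - 3, -1, -1):
        x[i] = (x[i] - du[i] * x[i + 1] - du2[i] * x[i + 2]) / d[i]
    return x
```

Calling solve_tridiagonal(du, d, dl, b) with the two off-diagonal arguments interchanged solves the transposed system, mirroring the note above.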
Input Parameters
n INTEGER. The order of A, the number of rows in B; n ≥ 0.
nrhs INTEGER. The number of right-hand sides, the number of columns in
B; nrhs ≥ 0.
dl REAL for sgtsv

DOUBLE PRECISION for dgtsv
COMPLEX for cgtsv
DOUBLE COMPLEX for zgtsv.
The array dl (size n - 1) contains the (n - 1) subdiagonal elements
of A.
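d REAL for sgtsv
DOUBLE PRECISION for dgtsv
COMPLEX for cgtsv
DOUBLE COMPLEX for zgtsv.
The array d (size n) contains the diagonal elements of A.
du REAL for sgtsv
DOUBLE PRECISION for dgtsv
COMPLEX for cgtsv
DOUBLE COMPLEX for zgtsv.
The array du (size n - 1) contains the (n - 1) superdiagonal
elements of A.
b REAL for sgtsv
DOUBLE PRECISION for dgtsv
COMPLEX for cgtsv
DOUBLE COMPLEX for zgtsv.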
The array b(size ldb by *) contains the matrix B whose columns are
the right-hand sides for the systems of equations. The second
dimension of b must be at least max(1,nrhs).
Output Parameters
dl Overwritten by the (n-2) elements of the second superdiagonal of the
upper triangular matrix U from the LU factorization of A. These
elements are stored in dl(1), ..., dl(n - 2).
d Overwritten by the n diagonal elements of U.
du Overwritten by the (n-1) elements of the first superdiagonal of U.

If info = i, U(i, i) is exactly zero, and the solution has not been
computed. The factorization has not been completed unless i = n.

Specific details for the routine gtsv interface are as follows:
See Also
?gtsvx
Computes the solution to the real or complex system
of linear equations with a tridiagonal coefficient matrix
A and multiple right-hand sides, and provides error
bounds on the solution.
Syntax
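call sgtsvx( fact, trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x,
ldx, rcond, ferr, berr, work, iwork, info )
call dgtsvx( fact, trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x,
ldx, rcond, ferr, berr, work, iwork, info )
call cgtsvx( fact, trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x,
ldx, rcond, ferr, berr, work, rwork, info )
call zgtsvx( fact, trans, n, nrhs, dl, d, du, dlf, df, duf, du2, ipiv, b, ldb, x,
ldx, rcond, ferr, berr, work, rwork, info )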
call gtsvx( dl, d, du, b, x [,dlf] [,df] [,duf] [,du2] [,ipiv] [,fact] [,trans] [,ferr]
[,berr] [,rcond] [,info] )
Include Files
mkl.fi, lapack.f90
Description
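The routine uses the LU factorization to compute the solution to a real or complex system of linear equations
A*X = B, A^T*X = B, or A^H*X = B, where A is a tridiagonal matrix of order n, the columns of matrix B are
individual right-hand sides, and the columns of X are the corresponding solutions. Error bounds on the
solution and a condition estimate are also provided.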
The routine ?gtsvx performs the following steps:
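1. If fact = 'N', the LU decomposition is used to factor the matrix A as A = L*U, where L is a product
of permutation and unit lower bidiagonal matrices and U is an upper triangular matrix with nonzeroes in
only the main diagonal and first two superdiagonals.
2. If some U(i,i) = 0, so that U is exactly singular, then the routine returns with info = i. Otherwise, the
factored form of A is used to estimate the condition number of the matrix A. If the reciprocal of the
condition number is less than machine precision, info = n + 1 is returned as a warning, but the routine
still goes on to solve for X and compute error bounds as described below.
3. The system of equations is solved for X using the factored form of A.
4. Iterative refinement is applied to improve the computed solution matrix and calculate error bounds
and backward error estimates for it.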
Input Parameters
fact CHARACTER*1. Must be 'F' or 'N'.

Specifies whether or not the factored form of the matrix A has been
supplied on entry.
If fact = 'F': on entry, dlf, df, duf, du2, and ipiv contain the
factored form of A; arrays dl, d, du, dlf, df, duf, du2, and ipiv will not
be modified.
If fact = 'N', the matrix A will be copied to dlf, df, and duf and
factored.

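trans CHARACTER*1. Must be 'N', 'T', or 'C'.
Specifies the form of the system of equations:
If trans = 'N', the system has the form A*X = B (No transpose).
If trans = 'T', the system has the form A^T*X = B (Transpose).
If trans = 'C', the system has the form A^H*X = B (Conjugate
transpose).
n INTEGER. The number of linear equations, the order of the matrix A;
n ≥ 0.
nrhs INTEGER. The number of right-hand sides, the number of columns of
the matrices B and X; nrhs ≥ 0.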
dl, d, du, dlf, df, duf, du2, b, work REAL for sgtsvx
DOUBLE PRECISION for dgtsvx
COMPLEX for cgtsvx
DOUBLE COMPLEX for zgtsvx.
Arrays:
dl, size (n -1), contains the subdiagonal elements of A.
d, size (n), contains the diagonal elements of A.
du, size (n -1), contains the superdiagonal elements of A.
dlf, size (n -1). If fact = 'F', then dlf is an input argument and on
entry contains the (n -1) multipliers that define the matrix L from the
LU factorization of A as computed by ?gttrf.
df, size (n). If fact = 'F', then df is an input argument and on
entry contains the n diagonal elements of the upper triangular matrix
U from the LU factorization of A.
duf, size (n -1). If fact = 'F', then duf is an input argument and on
entry contains the (n -1) elements of the first superdiagonal of U.
du2, size (n -2). If fact = 'F', then du2 is an input argument and
on entry contains the (n-2) elements of the second superdiagonal of
U.
b(ldb, *) contains the right-hand side matrix B. The second
dimension of b must be at least max(1, nrhs).
work(*) is a workspace array. The size of work must be at least
max(1, 3*n) for real flavors and max(1, 2*n) for complex flavors.
ipiv INTEGER.
Array, size at least max(1, n). If fact = 'F', then ipiv is an input
argument and on entry contains the pivot indices, as returned by
?gttrf.
rwork REAL for cgtsvx

DOUBLE PRECISION for zgtsvx.
Workspace array, size (n). Used for complex flavors only.
Output Parameters
x REAL for sgtsvx

DOUBLE PRECISION for dgtsvx
COMPLEX for cgtsvx
DOUBLE COMPLEX for zgtsvx.
If info = 0 or info = n+1, the array x contains the solution matrix
X. The second dimension of x must be at least max(1, nrhs).
dlf If fact = 'N', then dlf is an output argument and on exit contains
the (n-1) multipliers that define the matrix L from the LU
factorization of A.
df If fact = 'N', then df is an output argument and on exit contains

the n diagonal elements of the upper triangular matrix U from the LU
factorization of A.
duf If fact = 'N', then duf is an output argument and on exit contains
the (n-1) elements of the first superdiagonal of U.
du2 If fact = 'N', then du2 is an output argument and on exit contains
the (n-2) elements of the second superdiagonal of U.
ipiv The array ipiv is an output argument if fact = 'N' and, on exit,
contains the pivot indices from the factorization A = L*U; row i of
the matrix was interchanged with row ipiv(i). The value of ipiv(i) will
always be i or i+1; ipiv(i) = i indicates a row interchange was not
required.

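rcond REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
An estimate of the reciprocal condition number of the matrix A. If
rcond is less than the machine precision (in particular, if rcond = 0),
the matrix is singular to working precision. This condition is indicated
by a return code of info > 0.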


exact solution.


If info = i, and i ≤ n, then U(i, i) is exactly zero. The factorization
has not been completed unless i = n, but the factor U is exactly
singular, so the solution and error bounds could not be computed;
rcond = 0 is returned. If info = i, and i = n + 1, then U is
nonsingular, but rcond is less than machine precision, meaning that
the matrix is singular to working precision. Nevertheless, the solution
and error bounds are computed because there are a number of
situations where the computed solution can be more accurate than the
value of rcond would suggest.

Specific details for the routine gtsvx interface are as follows:
dlf Holds the vector of length (n-1).
duf Holds the vector of length (n-1).
fact Must be 'N' or 'F'. The default value is 'N'. If fact = 'F', then
the arguments dlf, df, duf, du2, and ipiv must be present; otherwise,
an error is returned.
See Also
?dtsvb
Computes the solution to the system of linear
equations with a diagonally dominant tridiagonal
coefficient matrix A and multiple right-hand sides.
Syntax
call sdtsvb( n, nrhs, dl, d, du, b, ldb, info )
call ddtsvb( n, nrhs, dl, d, du, b, ldb, info )
call cdtsvb( n, nrhs, dl, d, du, b, ldb, info )
call zdtsvb( n, nrhs, dl, d, du, b, ldb, info )
call dtsvb( dl, d, du, b [, info])
Include Files
mkl.fi, lapack.f90
Description
The ?dtsvb routine solves a system of linear equations A*X = B for X, where A is an n-by-n diagonally
dominant tridiagonal matrix, the columns of matrix B are individual right-hand sides, and the columns of X
are the corresponding solutions. The routine uses the BABE (Burning At Both Ends) algorithm.
Note that the equation AT*X = B may be solved by interchanging the order of the arguments du and dl.
Input Parameters
n INTEGER. The order of A, the number of rows in B; n ≥ 0.
nrhs INTEGER. The number of right-hand sides, the number of columns in
B; nrhs ≥ 0.
dl, d, du, b REAL for sdtsvb
DOUBLE PRECISION for ddtsvb
COMPLEX for cdtsvb
DOUBLE COMPLEX for zdtsvb.
Arrays: dl (size n - 1), d (size n), du (size n - 1), b(size ldb,*).
The array dl contains the (n - 1) subdiagonal elements of A.
The array d contains the diagonal elements of A.
The array du contains the (n - 1) superdiagonal elements of A.
Output Parameters
dl Overwritten by the (n-1) elements of the subdiagonal of the lower
triangular matrices L1, L2 from the factorization of A (see dttrfb).
d Overwritten by the n diagonal element reciprocals of U.
If info = i, u(i,i) is exactly zero, and the solution has not been
computed.
Application Notes
A diagonally dominant tridiagonal system is defined such that |d(i)| > |dl(i-1)| + |du(i)| for any i:
1 < i < n, and |d(1)| > |du(1)|, |d(n)| > |dl(n-1)|.
The underlying BABE algorithm is designed for diagonally dominant systems. For such systems it has no
numerical stability issues, so it does not need the partial pivoting used by the general tridiagonal solver
(see ?gtsv). As a result, the diagonally dominant solver is much faster than the general one.
NOTE
The current implementation of BABE has a potential accuracy issue on very small or large data
close to the underflow or overflow threshold respectively. Scale the matrix before applying the
solver in the case of such input data.
Applying the ?dtsvb factorization to non-diagonally dominant systems may lead to an accuracy
loss, or false singularity detected due to no pivoting.
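Before choosing the diagonally dominant solver, the condition from the Application Notes can be tested directly. The following Python sketch is an illustrative check only; the function name is hypothetical, indices are 0-based, and treating a 1-by-1 system as dominant when d(1) is nonzero is an assumption of this sketch.

```python
def is_diagonally_dominant(dl, d, du):
    """Check |d(i)| > |dl(i-1)| + |du(i)| on every row of a tridiagonal matrix.

    dl and du hold the n-1 sub- and superdiagonal elements, d the n diagonal
    elements. The first row has no subdiagonal term and the last row has no
    superdiagonal term.
    """
    n = len(d)
    for i in range(n):
        below = abs(dl[i - 1]) if i > 0 else 0.0
        above = abs(du[i]) if i < n - 1 else 0.0
        if not abs(d[i]) > below + above:
            return False
    return True
```

If the check fails, the general solver with partial pivoting is the safer choice, in line with the note above about accuracy loss and false singularity detection.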
?posv
Computes the solution to the system of linear
equations with a symmetric or Hermitian positive-definite
coefficient matrix A and multiple right-hand
sides.
Syntax
call sposv( uplo, n, nrhs, a, lda, b, ldb, info )
call dposv( uplo, n, nrhs, a, lda, b, ldb, info )
call cposv( uplo, n, nrhs, a, lda, b, ldb, info )
call zposv( uplo, n, nrhs, a, lda, b, ldb, info )
call dsposv( uplo, n, nrhs, a, lda, b, ldb, x, ldx, work, swork, iter, info )
call zcposv( uplo, n, nrhs, a, lda, b, ldb, x, ldx, work, swork, rwork, iter, info )
call posv( a, b [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the real or complex system of linear equations A*X = B, where A is an n-by-n
symmetric/Hermitian positive-definite matrix, the columns of matrix B are individual right-hand sides, and
the columns of X are the corresponding solutions.
The Cholesky decomposition is used to factor A as
A = U^T*U (real flavors) and A = U^H*U (complex flavors), if uplo = 'U'
or A = L*L^T (real flavors) and A = L*L^H (complex flavors), if uplo = 'L',
where U is an upper triangular matrix and L is a lower triangular matrix. The factored form of A is then used
to solve the system of equations A*X = B.
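The factor-then-solve idea for the real case with the lower triangle stored (uplo = 'L') can be sketched in a few lines of Python. This is a textbook illustration under stated assumptions, not the library routine: the names cholesky_lower and solve_spd are hypothetical, only the lower triangle of a is read, and a single right-hand side is handled.

```python
import math


def cholesky_lower(a):
    """Return L with A = L*L^T, reading only the lower triangle of a."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("leading minor of order %d is not positive definite" % (j + 1))
        l[j][j] = math.sqrt(s)
        for i in range(j + 1, n):
            l[i][j] = (a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))) / l[j][j]
    return l


def solve_spd(a, b):
    """Solve A*x = b via L*y = b (forward) and L^T*x = y (backward)."""
    l = cholesky_lower(a)
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x
```

The ValueError raised for a non-positive pivot plays the role of a positive info value in this sketch: the matrix is not positive definite and no solution is produced.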
The dsposv and zcposv are mixed precision iterative refinement subroutines for exploiting fast single
precision hardware. They first attempt to factorize the matrix in single precision (dsposv) or single complex
precision (zcposv) and use this factorization within an iterative refinement procedure to produce a solution
with double precision (dsposv) / double complex precision (zcposv) normwise backward error quality (see
below). If the approach fails, the method switches to a double precision or double complex precision
factorization respectively and computes the solution.
Iterative refinement does not pay off if the ratio of single precision (single complex)
performance to double precision (double complex) performance is too small. A reasonable strategy should
take the number of right-hand sides and the size of the matrix into account. This might be done with a call to
ilaenv in the future. At present, iterative refinement is always attempted.
The iterative refinement process is stopped if
iter > itermax
or for all the right-hand sides:
rnrm < sqrt(n)*xnrm*anrm*eps*bwdmax,
where
iter is the number of the current iteration in the iterative refinement process
rnrm is the infinity-norm of the residual
xnrm is the infinity-norm of the solution
anrm is the infinity-operator-norm of the matrix A
eps is the machine epsilon returned by dlamch('Epsilon').
The values itermax and bwdmax are fixed to 30 and 1.0d+00 respectively.
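The refinement loop and its stopping test can be sketched as follows, for one right-hand side. This is a simplified illustration under stated assumptions, not the MKL code: the single precision factorization is imitated by rounding a double precision Cholesky factor to single precision, sys.float_info.epsilon stands in for the value returned by dlamch, and the helper names (to_single, cholesky_lower, chol_solve, refine) are hypothetical.

```python
import math
import struct
import sys


def to_single(v):
    """Round a Python float to the nearest IEEE single precision value."""
    return struct.unpack("f", struct.pack("f", v))[0]


def cholesky_lower(a):
    """Lower Cholesky factor of a positive-definite matrix (no error checks)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        l[j][j] = math.sqrt(a[j][j] - sum(l[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, n):
            l[i][j] = (a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))) / l[j][j]
    return l


def chol_solve(l, b):
    """Solve (L * L^T) x = b."""
    n = len(l)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x


def refine(a, b, itermax=30, bwdmax=1.0):
    """Return (x, iter); iter < 0 means the full precision fallback was used."""
    n = len(a)
    eps = sys.float_info.epsilon  # assumption: stand-in for dlamch('Epsilon')
    anrm = max(sum(abs(v) for v in row) for row in a)  # infinity-operator-norm
    # Assumption: imitate the single precision factor by rounding a double one.
    l_low = [[to_single(v) for v in row] for row in cholesky_lower(a)]
    x = chol_solve(l_low, b)
    it = 0
    while True:
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        rnrm = max(abs(v) for v in r)
        xnrm = max(abs(v) for v in x)
        if rnrm < math.sqrt(n) * xnrm * anrm * eps * bwdmax:
            return x, it  # refinement succeeded
        if it == itermax:
            break
        d = chol_solve(l_low, r)  # correction from the low precision factor
        x = [x[i] + d[i] for i in range(n)]
        it += 1
    # Refinement did not converge: solve with the full precision factor.
    return chol_solve(cholesky_lower(a), b), -(itermax + 1)


a = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [6.0, 10.0, 8.0]  # exact solution is [1, 2, 3]
x, iters = refine(a, b)
```

The residual is always computed in double precision from the original matrix, so each correction step recovers accuracy lost by the low precision factor.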
Input Parameters


nrhs INTEGER. The number of right-hand sides; the number of columns in B; nrhs ≥ 0.
a, b REAL for sposv

DOUBLE PRECISION for dposv and dsposv.
COMPLEX for cposv
DOUBLE COMPLEX for zposv and zcposv.
Arrays: a(size lda by *), b(size ldb by *). The array a contains the upper or
the lower triangular part of the matrix A (see uplo). The second
dimension of a must be at least max(1, n).
Note that in the case of zcposv the imaginary parts of the diagonal
elements need not be set and are assumed to be zero.
work DOUBLE PRECISION for dsposv

DOUBLE COMPLEX for zcposv.
Workspace array, size (n*nrhs). This array is used to hold the
residual vectors.
swork REAL for dsposv

COMPLEX for zcposv.
Workspace array, size (n*(n+nrhs)). This array is used to store the
single precision matrix and the right-hand sides or solutions in single
precision.
rwork DOUBLE PRECISION. Workspace array, size (n).
Output Parameters
a If info = 0, the upper or lower triangular part of a is overwritten by

the Cholesky factor U or L, as specified by uplo.
If iterative refinement has been successfully used (info = 0 and
iter ≥ 0), then A is unchanged.
If double precision factorization has been used (info = 0 and iter <
0), then the array A contains the factors L or U from the Cholesky
factorization.
ipiv INTEGER.
Array, size at least max(1, n). The pivot indices that define the
permutation matrix P; row i of the matrix was interchanged with row
ipiv(i). Corresponds to the single precision factorization (if info = 0
and iter ≥ 0) or the double precision factorization (if info = 0 and
iter < 0).
x DOUBLE PRECISION for dsposv

DOUBLE COMPLEX for zcposv.
Array, size ldx by nrhs. If info = 0, contains the n-by-nrhs solution
matrix X.
iter INTEGER.
If iter < 0: iterative refinement has failed, double precision
factorization has been performed
If iter = -1: the routine fell back to full precision for

implementation- or machine-specific reason
If iter = -2: narrowing the precision induced an overflow, the
routine fell back to full precision
If iter = -3: failure of spotrf for dsposv, or cpotrf for zcposv
If iter = -31: stop the iterative refinement after the 30th
iteration.
If iter > 0: iterative refinement has been successfully used. Returns

the number of iterations.


itself) is not positive definite, so the factorization could not be
completed, and the solution has not been computed.

Specific details for the routine posv interface are as follows:
See Also
?posvx
Uses the Cholesky factorization to compute the
solution to the system of linear equations with a
symmetric or Hermitian positive-definite coefficient
matrix A, and provides error bounds on the solution.
Syntax
call sposvx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
ferr, berr, work, iwork, info )
call dposvx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
ferr, berr, work, iwork, info )
call cposvx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
ferr, berr, work, rwork, info )
call zposvx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
ferr, berr, work, rwork, info )
call posvx( a, b, x [,uplo] [,af] [,fact] [,equed] [,s] [,ferr] [,berr] [,rcond]
[,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the Cholesky factorization A = U^T*U (real flavors) / A = U^H*U (complex flavors) or A = L*L^T (real
flavors) / A = L*L^H (complex flavors) to compute the solution to a real or complex system of linear equations
A*X = B, where A is an n-by-n real symmetric/Hermitian positive definite matrix, the columns of matrix B are
individual right-hand sides, and the columns of X are the corresponding solutions.
The routine ?posvx performs the following steps:
1. If fact = 'E', real scaling factors s are computed to equilibrate the system:
diag(s)*A*diag(s)*inv(diag(s))*X = diag(s)*B.
Whether or not the system will be equilibrated depends on the scaling of the matrix A, but if
equilibration is used, A is overwritten by diag(s)*A*diag(s) and B by diag(s)*B.
2. If fact = 'N' or 'E', the Cholesky decomposition is used to factor the matrix A (after equilibration if
fact = 'E') as
A = U^T*U (real), A = U^H*U (complex), if uplo = 'U',
or A = L*L^T (real), A = L*L^H (complex), if uplo = 'L',
where U is an upper triangular matrix and L is a lower triangular matrix.
3. If the leading i-by-i principal minor is not positive-definite, then the routine returns with info = i.
Otherwise, the factored form of A is used to estimate the condition number of the matrix A. If the
reciprocal of the condition number is less than machine precision, info = n + 1 is returned as a
warning, but the routine still goes on to solve for X and compute error bounds as described below.
4. The system of equations is solved for X using the factored form of A.
5. Iterative refinement is applied to improve the computed solution matrix and calculate error bounds
and backward error estimates for it.
6. If equilibration was used, the matrix X is premultiplied by diag(s) so that it solves the original system
before equilibration.
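The effect of equilibration in the steps above can be illustrated with a small Python sketch. It assumes the scale factors s(i) = 1/sqrt(A(i,i)), which give the scaled matrix a unit diagonal; the actual routine decides from the scaling of A whether to equilibrate at all, and the helpers equilibrate and solve_2x2 are hypothetical.

```python
import math


def equilibrate(a, b):
    """Return (s, diag(s)*A*diag(s), diag(s)*b) with s[i] = 1/sqrt(a[i][i])."""
    n = len(a)
    s = [1.0 / math.sqrt(a[i][i]) for i in range(n)]
    a_eq = [[s[i] * a[i][j] * s[j] for j in range(n)] for i in range(n)]
    b_eq = [s[i] * b[i] for i in range(n)]
    return s, a_eq, b_eq


def solve_2x2(a, b):
    """Solve a 2-by-2 system by Cramer's rule (keeps the sketch short)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]


a = [[65536.0, 64.0],
     [64.0, 1.0]]          # positive definite, but badly scaled
b = [65664.0, 66.0]        # exact solution of A*x = b is [1, 2]

s, a_eq, b_eq = equilibrate(a, b)
y = solve_2x2(a_eq, b_eq)              # y = inv(diag(s)) * x
x = [s[i] * y[i] for i in range(2)]    # premultiply by diag(s) to recover x
```

Here the equilibrated matrix is [[1, 0.25], [0.25, 1]], and the final multiplication by diag(s) corresponds to step 6.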
Input Parameters

If fact = 'F': on entry, af contains the factored form of A. If equed
= 'Y', the matrix A has been equilibrated with scaling factors given
by s.
a and af will not be modified.

copied to af and factored.


nrhs INTEGER. The number of right-hand sides; the number of columns in B; nrhs ≥ 0.
a, af, b, work REAL for sposvx

DOUBLE PRECISION for dposvx
COMPLEX for cposvx
DOUBLE COMPLEX for zposvx.
Arrays: a(size lda by *), af(size ldaf by *), b( size ldb by *),
work(*).
The array a contains the matrix A as specified by uplo. If fact = 'F'
and equed = 'Y', then A must have been equilibrated by the scaling
factors in s, and a must contain the equilibrated matrix
diag(s)*A*diag(s). The second dimension of a must be at least
max(1,n).
The array af is an input argument if fact = 'F'. It contains the
triangular factor U or L from the Cholesky factorization of A in the
same storage format as A. If equed is not 'N', then af is the factored
form of the equilibrated matrix diag(s)*A*diag(s). The second
dimension of af must be at least max(1,n).

flavors.

if equed = 'N', no equilibration was done (always true if fact =
'N');
if equed = 'Y', equilibration was done, that is, A has been replaced
by diag(s)*A*diag(s).

Array, size (n). The array s contains the scale factors for A. This array
is an input argument if fact = 'F' only; otherwise it is an output
argument.
If fact = 'F' and equed = 'Y', each element of s must be positive.
n).
flavors only.
rwork REAL for cposvx

DOUBLE PRECISION for zposvx.
only.
Output Parameters
x REAL for sposvx

DOUBLE PRECISION for dposvx
COMPLEX for cposvx

DOUBLE COMPLEX for zposvx.
X to the original system of equations. Note that if equed = 'Y', A
and B are modified on exit, and the solution to the equilibrated system
is inv(diag(s))*X. The second dimension of x must be at least
max(1,nrhs).
a Array a is not modified on exit if fact = 'F' or 'N', or if fact =

'E' and equed = 'N'.
If fact = 'E' and equed = 'Y', A is overwritten by
diag(s)*A*diag(s).
af If fact = 'N' or 'E', then af is an output argument and on exit

returns the triangular factor U or L from the Cholesky factorization
A = U^T*U or A = L*L^T (real routines), A = U^H*U or A = L*L^H (complex
routines) of the original matrix A (if fact = 'N'), or of the
equilibrated matrix A (if fact = 'E'). See the description of a for the
form of the equilibrated matrix.
b Overwritten by diag(s)*B, if equed = 'Y'; not changed if equed =

'N'.
s This array is an output argument if fact ≠ 'F'. See the description of s

in Input Arguments section.

equilibration (if done). If rcond is less than the machine precision (in
particular, if rcond =0), the matrix is singular to working precision.
This condition is indicated by a return code of info>0.

error bound for each solution vector xj (the j-th column of the solution
matrix X). If xtrue is the true solution corresponding to xj, ferr(j) is
an estimated upper bound for the magnitude of the largest element in
(xj - xtrue) divided by the magnitude of the largest element in xj. The
estimate is as reliable as the estimate for rcond, and is almost always
a slight overestimate of the true error.

relative backward error for each solution vector xj, that is, the
smallest relative change in any element of A or B that makes xj an
exact solution.

Arguments section).

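If info = i, and i ≤ n, the leading minor of order i (and therefore the
matrix A itself) is not positive-definite, so the factorization could not
be completed, and the solution and error bounds could not be
computed; rcond = 0 is returned.
If info = i, and i = n + 1, then U is nonsingular, but rcond is less
than machine precision, meaning that the matrix is singular to
working precision. Nevertheless, the solution and error bounds are
computed because there are a number of situations where the
computed solution can be more accurate than the value of rcond
would suggest.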

Specific details for the routine posvx interface are as follows:
s Holds the vector of length n. Default value for each element is s(i) =
1.0_WP.
then af must be present; otherwise, an error is returned.
equed Must be 'N' or 'Y'. The default value is 'N'.
See Also
?posvxx
symmetric or Hermitian positive-definite coefficient
matrix A applying the Cholesky factorization.
Syntax
call sposvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork,
info )
call dposvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, iwork,
info )
call cposvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork,
info )
call zposvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, equed, s, b, ldb, x, ldx, rcond,
rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work, rwork,
info )
Include Files
mkl.fi, lapack.f90
Description
A*X = B, where A is an n-by-n real symmetric/Hermitian positive definite matrix, the columns of matrix B
are individual right-hand sides, and the columns of X are the corresponding solutions.
The routine ?posvxx performs the following steps:
1. If fact = 'E', scaling factors are computed to equilibrate the system:
diag(s)*A*diag(s) *inv(diag(s))*X = diag(s)*B

fact = 'E') as
3. If the leading i-by-i principal minor is not positive-definite, the routine returns with info = i.
Otherwise, the factored form of A is used to estimate the condition number of the matrix A (see the
rcond parameter). If the reciprocal of the condition number is less than machine precision, the routine
still goes on to solve for X and compute error bounds.
5. By default, unless params(1) is set to zero, the routine applies iterative refinement to get a small error
and error bounds. Refinement calculates the residual to at least twice the working precision.
Input Parameters

factored.
If fact = 'F', on entry, af contains the factored form of A. If equed is not
'N', the matrix A has been equilibrated with scaling factors given by s.
Parameters a and af are not modified.

and factored.

a, af, b, work REAL for sposvxx

DOUBLE PRECISION for dposvxx
COMPLEX for cposvxx
DOUBLE COMPLEX for zposvxx.
Arrays: a(size lda by *), af(size ldaf by *), b(size ldb by *), work(*).
The array a contains the matrix A as specified by uplo. If fact = 'F' and
equed = 'Y', then A must have been equilibrated by the scaling factors in
s, and a must contain the equilibrated matrix diag(s)*A*diag(s). The
second dimension of a must be at least max(1,n).
The array af is an input argument if fact = 'F'. It contains the triangular

factor U or L from the Cholesky factorization of A in the same storage
format as A. If equed is not 'N', then af is the factored form of the
equilibrated matrix diag(s)*A*diag(s). The second dimension of af must
be at least max(1,n).
max(1,nrhs).
lda INTEGER. The leading dimension of the array a; lda ≥ max(1,n).
ldaf INTEGER. The leading dimension of the array af; ldaf ≥ max(1,n).

if equed = 'Y', both row and column equilibration was done, that is, A has
been replaced by diag(s)*A*diag(s).

Array, size (n). The array s contains the scale factors for A. This array is an
input argument if fact = 'F' only; otherwise it is an output argument.

err_bnds_comp descriptions in the Output Arguments section below.



are computed.

refinement.
Default 10.0


convergence).
only.

Output Parameters
x REAL for sposvxx

DOUBLE PRECISION for dposvxx
COMPLEX for cposvxx
DOUBLE COMPLEX for zposvxx.
inv(diag(s))*X.
a Array a is not modified on exit if fact = 'F' or 'N', or if fact = 'E' and
equed = 'N'.
If fact = 'E' and equed = 'Y', A is overwritten by diag(s)*A*diag(s).
af If fact = 'N' or 'E', then af is an output argument and on exit returns

the triangular factor U or L from the Cholesky factorization A = U^T*U or
A = L*L^T (real routines), A = U^H*U or A = L*L^H (complex routines) of the original
matrix A (if fact = 'N'), or of the equilibrated matrix A (if fact = 'E').
See the description of a for the form of the equilibrated matrix.
b If equed = 'N', B is not modified.
If equed = 'Y', B is overwritten by diag(s)*B.
s This array is an output argument if fact ≠ 'F'. Each element of this array is

a power of the radix. See the description of s in Input Arguments section.



pivot growth factor for the leading info columns of A.


side.
fields:



are:

approximately 1.

follows:


side.
fields:



are:


Arguments section).

See Also
?ppsv
Computes the solution to the system of linear
equations with a symmetric (Hermitian) positive
definite packed coefficient matrix A and multiple
right-hand sides.
Syntax
call sppsv( uplo, n, nrhs, ap, b, ldb, info )
call dppsv( uplo, n, nrhs, ap, b, ldb, info )
call cppsv( uplo, n, nrhs, ap, b, ldb, info )
call zppsv( uplo, n, nrhs, ap, b, ldb, info )
call ppsv( ap, b [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the real or complex system of linear equations A*X = B, where A is an n-by-n real
symmetric/Hermitian positive-definite matrix stored in packed format, the columns of matrix B are individual
right-hand sides, and the columns of X are the corresponding solutions.
where U is an upper triangular matrix and L is a lower triangular matrix. The factored form of A is then used
to solve the system of equations A*X = B.
Input Parameters


nrhs INTEGER. The number of right-hand sides; the number of columns in B; nrhs ≥ 0.
ap, b REAL for sppsv

DOUBLE PRECISION for dppsv
COMPLEX for cppsv
DOUBLE COMPLEX for zppsv.
Arrays: ap(size *), b(size ldb, *). The array ap contains the upper or
the lower triangular part of the matrix A (as specified by uplo) in
packed storage (see Matrix Storage Schemes). The dimension of ap
must be at least max(1, n*(n+1)/2).
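As an illustration of the packed layout referred to above, the following Python sketch maps an element A(i, j) to its position in ap using the standard LAPACK column-wise packed scheme. The helper names packed_index and pack are hypothetical; indices i and j are 1-based as in the Fortran text, while the returned position is 0-based for Python lists.

```python
def packed_index(i, j, n, uplo="U"):
    """0-based position in ap of A(i, j), with 1-based i and j."""
    if uplo == "U":
        if not 1 <= i <= j <= n:
            raise IndexError("need 1 <= i <= j <= n for uplo = 'U'")
        # Fortran: ap(i + j*(j-1)/2) = A(i, j)
        return i + j * (j - 1) // 2 - 1
    if not 1 <= j <= i <= n:
        raise IndexError("need 1 <= j <= i <= n for uplo = 'L'")
    # Fortran: ap(i + (2*n-j)*(j-1)/2) = A(i, j)
    return i + (2 * n - j) * (j - 1) // 2 - 1


def pack(a, uplo="U"):
    """Pack the chosen triangle of a symmetric matrix column by column."""
    n = len(a)
    ap = [0.0] * (n * (n + 1) // 2)
    for j in range(1, n + 1):
        rows = range(1, j + 1) if uplo == "U" else range(j, n + 1)
        for i in rows:
            ap[packed_index(i, j, n, uplo)] = a[i - 1][j - 1]
    return ap


a = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 5.0],
     [3.0, 5.0, 6.0]]
ap_upper = pack(a, "U")
ap_lower = pack(a, "L")
```

Either triangle occupies n*(n+1)/2 elements, which is where the minimum size of ap comes from.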
Output Parameters
ap If info = 0, the upper or lower triangular part of A in packed storage is

overwritten by the Cholesky factor U or L, as specified by uplo.


itself) is not positive-definite, so the factorization could not be

Specific details for the routine ppsv interface are as follows:
See Also
?ppsvx
Uses the Cholesky factorization to compute the
solution to the system of linear equations with a
symmetric (Hermitian) positive definite packed
coefficient matrix A, and provides error bounds on the
solution.
Syntax
call sppsvx( fact, uplo, n, nrhs, ap, afp, equed, s, b, ldb, x, ldx, rcond, ferr,
berr, work, iwork, info )
call dppsvx( fact, uplo, n, nrhs, ap, afp, equed, s, b, ldb, x, ldx, rcond, ferr,
berr, work, iwork, info )
call cppsvx( fact, uplo, n, nrhs, ap, afp, equed, s, b, ldb, x, ldx, rcond, ferr,
berr, work, rwork, info )
call zppsvx( fact, uplo, n, nrhs, ap, afp, equed, s, b, ldb, x, ldx, rcond, ferr,
berr, work, rwork, info )
call ppsvx( ap, b, x [,uplo] [,af] [,fact] [,equed] [,s] [,ferr] [,berr] [,rcond]
[,info] )
Include Files
mkl.fi, lapack.f90
Description
A*X = B, where A is a n-by-n symmetric or Hermitian positive-definite matrix stored in packed format, the
columns of matrix B are individual right-hand sides, and the columns of X are the corresponding solutions.
The routine ?ppsvx performs the following steps:
fact = 'E') as

reciprocal of the condition number is less than machine precision, info = n+1 is returned as a
Input Parameters

If fact = 'F': on entry, afp contains the factored form of A. If
equed = 'Y', the matrix A has been equilibrated with scaling factors
given by s.
ap and afp will not be modified.
If fact = 'N', the matrix A will be copied to afp and factored.

copied to afp and factored.

nrhs INTEGER. The number of right-hand sides; the number of columns in

B; nrhs ≥ 0.
ap, afp, b, work REAL for sppsvx

DOUBLE PRECISION for dppsvx
COMPLEX for cppsvx
DOUBLE COMPLEX for zppsvx.
Arrays: ap(size *), afp(size *), b(size ldb by *), work(*).
The array ap contains the upper or lower triangle of the original
symmetric/Hermitian matrix A in packed storage (see Matrix Storage
Schemes). If fact = 'F' and equed = 'Y', ap must
contain the equilibrated matrix diag(s)*A*diag(s).
The array afp is an input argument if fact = 'F' and contains the
triangular factor U or L from the Cholesky factorization of A in the
same storage format as A. If equed is not 'N', then afp is the
factored form of the equilibrated matrix A.
+1)/2); the second dimension of b must be at least max(1,nrhs);
the dimension of work must be at least max(1, 3*n) for real flavors
'N');
by diag(s)*A*diag(s).

argument.
n).
flavors only.
rwork REAL for cppsvx;

DOUBLE PRECISION for zppsvx.
only.
Output Parameters
x REAL for sppsvx

DOUBLE PRECISION for dppsvx
COMPLEX for cppsvx
DOUBLE COMPLEX for zppsvx.
X to the original system of equations. Note that if equed = 'Y', A
and B are modified on exit, and the solution to the equilibrated system
is inv(diag(s))*X. The second dimension of x must be at least
max(1,nrhs).
ap Array ap is not modified on exit if fact = 'F' or 'N', or if fact =

'E' and equed = 'N'.
If fact = 'E' and equed = 'Y', ap is overwritten by
diag(s)*A*diag(s).
afp If fact = 'N' or 'E', then afp is an output argument and on exit
equilibrated matrix A (if fact = 'E'). See the description of ap for
the form of the equilibrated matrix.

'N'.


particular, if rcond = 0), the matrix is singular to working precision.

solution matrix X). If xtrue is the true solution corresponding to
x(j) , ferr(j) is an estimated upper bound for the magnitude of the
largest element in (x(j) - xtrue) divided by the magnitude of the
largest element in x(j) . The estimate is as reliable as the estimate
for rcond, and is almost always a slight overestimate of the true
error.

relative backward error for each solution vector x(j) , that is, the
exact solution.

Arguments section).

computed; rcond = 0 is returned.
would suggest.

Specific details for the routine ppsvx interface are as follows:
afp Holds the matrix AF of size (n*(n+1)/2).
s Holds the vector of length n. Default value for each element is s(i) =
1.0_WP.
See Also
?pbsv
Computes the solution to the system of linear
equations with a symmetric or Hermitian positive-definite
band coefficient matrix A and multiple right-hand
sides.
Syntax
call spbsv( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call dpbsv( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call cpbsv( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call zpbsv( uplo, n, kd, nrhs, ab, ldab, b, ldb, info )
call pbsv( ab, b [,uplo] [,info] )
Include Files
mkl.fi, lapack.f90
Description
symmetric/Hermitian positive definite band matrix, the columns of matrix B are individual right-hand sides,
and the columns of X are the corresponding solutions.
where U is an upper triangular band matrix and L is a lower triangular band matrix, with the same number of
superdiagonals or subdiagonals as A. The factored form of A is then used to solve the system of equations
A*X = B.
Input Parameters

kd INTEGER. The number of superdiagonals of the matrix A if uplo =

'U', or the number of subdiagonals if uplo = 'L'; kd ≥ 0.

nrhs INTEGER. The number of right-hand sides; the number of columns in B; nrhs ≥ 0.
ab, b REAL for spbsv

DOUBLE PRECISION for dpbsv
COMPLEX for cpbsv
DOUBLE COMPLEX for zpbsv.
Arrays: ab(size ldab by *), b(size ldb by *). The array ab contains the
uplo) in band storage (see Matrix Storage Schemes). The second
dimension of ab must be at least max(1, n).
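The band layout referred to above can be sketched in Python as follows, using the standard LAPACK symmetric band scheme: ab(kd+1+i-j, j) = A(i, j) for uplo = 'U' and ab(1+i-j, j) = A(i, j) for uplo = 'L'. The helper name to_band is hypothetical; the result has kd+1 rows and n columns, and entries that correspond to no matrix element are left as zero here.

```python
def to_band(a, kd, uplo="U"):
    """Return ab (kd+1 rows, n columns) holding one triangle of the band of a."""
    n = len(a)
    ab = [[0.0] * n for _ in range(kd + 1)]
    for j in range(1, n + 1):  # 1-based indices, as in the Fortran text
        if uplo == "U":
            for i in range(max(1, j - kd), j + 1):
                # Fortran: ab(kd+1+i-j, j) = A(i, j)
                ab[kd + i - j][j - 1] = a[i - 1][j - 1]
        else:
            for i in range(j, min(n, j + kd) + 1):
                # Fortran: ab(1+i-j, j) = A(i, j)
                ab[i - j][j - 1] = a[i - 1][j - 1]
    return ab


# Symmetric tridiagonal example, so kd = 1.
a = [[4.0, 1.0, 0.0, 0.0],
     [1.0, 5.0, 2.0, 0.0],
     [0.0, 2.0, 6.0, 3.0],
     [0.0, 0.0, 3.0, 7.0]]
ab_upper = to_band(a, 1, "U")
ab_lower = to_band(a, 1, "L")
```

With uplo = 'U' the diagonal sits in the last row of ab; with uplo = 'L' it sits in the first row.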
Output Parameters
ab The upper or lower triangular part of A (in band storage) is

overwritten by the Cholesky factor U or L, as specified by uplo, in the
same storage format as A.


itself) is not positive-definite, so the factorization could not be

Specific details for the routine pbsv interface are as follows:
See Also
?pbsvx
symmetric (Hermitian) positive-definite band
coefficient matrix A, and provides error bounds on the
solution.
Syntax
call spbsvx( fact, uplo, n, kd, nrhs, ab, ldab, afb, ldafb, equed, s, b, ldb, x, ldx,
rcond, ferr, berr, work, iwork, info )
call dpbsvx( fact, uplo, n, kd, nrhs, ab, ldab, afb, ldafb, equed, s, b, ldb, x, ldx,
rcond, ferr, berr, work, iwork, info )
call cpbsvx( fact, uplo, n, kd, nrhs, ab, ldab, afb, ldafb, equed, s, b, ldb, x, ldx,
rcond, ferr, berr, work, rwork, info )
call zpbsvx( fact, uplo, n, kd, nrhs, ab, ldab, afb, ldafb, equed, s, b, ldb, x, ldx,
rcond, ferr, berr, work, rwork, info )
call pbsvx( ab, b, x [,uplo] [,afb] [,fact] [,equed] [,s] [,ferr] [,berr] [,rcond]
[,info] )
Include Files
mkl.fi, lapack.f90
Description
A*X = B, where A is a n-by-n symmetric or Hermitian positive definite band matrix, the columns of matrix B
The routine ?pbsvx performs the following steps:
fact = 'E') as
where U is an upper triangular band matrix and L is a lower triangular band matrix.
3. If the leading i-by-i principal minor is not positive definite, then the routine returns with info = i.
Input Parameters

If fact = 'F': on entry, afb contains the factored form of A. If
equed = 'Y', the matrix A has been equilibrated with scaling factors
given by s.
ab and afb will not be modified.

copied to afb and factored.


A; kd ≥ 0.

nrhs INTEGER. The number of right-hand sides; the number of columns in B; nrhs ≥ 0.
ab, afb, b, work REAL for spbsvx

DOUBLE PRECISION for dpbsvx
COMPLEX for cpbsvx
DOUBLE COMPLEX for zpbsvx.
Arrays: ab(size ldab by *), afb(size ldafb by *), b(size ldb by *),
work(*).
The array ab contains the upper or lower triangle of the matrix A in
band storage (see Matrix Storage Schemes).
If fact = 'F' and equed = 'Y', then ab must contain the
equilibrated matrix diag(s)*A*diag(s). The second dimension of ab
must be at least max(1, n).
The array afb is an input argument if fact = 'F'. It contains the

triangular factor U or L from the Cholesky factorization of the band
matrix A in the same storage format as A. If equed = 'Y', then afb
is the factored form of the equilibrated matrix A. The second
dimension of afb must be at least max(1, n).

The dimension of work must be at least max(1,3*n) for real flavors,
and at least max(1,2*n) for complex flavors.
ldab INTEGER. The leading dimension of ab; ldab ≥ kd+1.
ldafb INTEGER. The leading dimension of afb; ldafb ≥ kd+1.

'N')
by diag(s)*A*diag(s).

argument.
n).
flavors only.
rwork REAL for cpbsvx

DOUBLE PRECISION for zpbsvx.
only.
Output Parameters
x REAL for spbsvx

DOUBLE PRECISION for dpbsvx
COMPLEX for cpbsvx
DOUBLE COMPLEX for zpbsvx.
If info = 0 or info = n+1, the array x contains the solution matrix X
to the original system of equations. Note that if equed = 'Y', A and
B are modified on exit, and the solution to the equilibrated system is
inv(diag(s))*X. The second dimension of x must be at least
max(1,nrhs).
ab On exit, if fact = 'E' and equed = 'Y', A is overwritten by

diag(s)*A*diag(s).
afb If fact = 'N' or 'E', then afb is an output argument and on exit
equilibrated matrix A (if fact = 'E'). See the description of ab for
the form of the equilibrated matrix.

'N'.




exact solution.

Arguments section).

matrix A itself) is not positive definite, so the factorization could not
computed; rcond =0 is returned. If info = i, and i = n + 1, then U
is nonsingular, but rcond is less than machine precision, meaning that
the matrix is singular to working precision. Nevertheless, the solution
and error bounds are computed because there are a number of
situations where the computed solution can be more accurate than the
value of rcond would suggest.

Specific details for the routine pbsvx interface are as follows:
afb Holds the array AF of size (kd+1,n).
s Holds the vector with the number of elements n. Default value for
each element is s(i) = 1.0_WP.
ferr Holds the vector with the number of elements nrhs.
berr Holds the vector with the number of elements nrhs.
See Also
?ptsv
Computes the solution to the system of linear
equations with a symmetric or Hermitian positive
definite tridiagonal coefficient matrix A and multiple
right-hand sides.
Syntax
call sptsv( n, nrhs, d, e, b, ldb, info )
call dptsv( n, nrhs, d, e, b, ldb, info )
call cptsv( n, nrhs, d, e, b, ldb, info )
call zptsv( n, nrhs, d, e, b, ldb, info )
call ptsv( d, e, b [,info] )
Include Files
mkl.fi, lapack.f90
Description
symmetric/Hermitian positive-definite tridiagonal matrix, the columns of matrix B are individual right-hand
sides, and the columns of X are the corresponding solutions.
A is factored as A = L*D*L^T (real flavors) or A = L*D*L^H (complex flavors), and the factored form of A is
then used to solve the system of equations A*X = B.
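For the real case, the factorization and the solve described above can be sketched in Python as follows, for a single right-hand side. This is an illustration of the recurrence only, not the MKL code; the function name ptsv_sketch is hypothetical, and d and e hold the diagonal and subdiagonal of A as in the parameter descriptions.

```python
def ptsv_sketch(d, e, b):
    """Factor A = L*D*L^T for a symmetric positive-definite tridiagonal A
    and solve A*x = b. Returns (D diagonal, L subdiagonal, x)."""
    n = len(d)
    if n == 0:
        return [], [], []
    d = list(d)   # becomes the diagonal of D
    l = list(e)   # becomes the subdiagonal of the unit bidiagonal L
    x = list(b)
    for i in range(n - 1):
        if d[i] <= 0.0:
            raise ValueError(f"leading minor of order {i + 1} is not positive definite")
        ei = l[i]
        l[i] = ei / d[i]
        d[i + 1] -= l[i] * ei
    if d[n - 1] <= 0.0:
        raise ValueError(f"leading minor of order {n} is not positive definite")
    # Forward substitution: L * z = b
    for i in range(1, n):
        x[i] -= l[i - 1] * x[i - 1]
    # Diagonal scaling and backward substitution: D * L^T * x = z
    x[n - 1] /= d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = x[i] / d[i] - l[i] * x[i + 1]
    return d, l, x


d_in = [4.0, 5.0, 5.0]    # diagonal of A
e_in = [2.0, 2.0]         # subdiagonal of A
b_in = [8.0, 18.0, 19.0]  # exact solution is [1, 2, 3]
d_out, l_out, x = ptsv_sketch(d_in, e_in, b_in)
```

Both the factorization and the solve take a number of operations proportional to n, which is the main benefit of exploiting the tridiagonal structure.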
Input Parameters

nrhs INTEGER. The number of right-hand sides; the number of columns in B; nrhs ≥ 0.
d REAL for single precision flavors
Array, dimension at least max(1, n). Contains the diagonal elements
of the tridiagonal matrix A.
e, b REAL for sptsv

DOUBLE PRECISION for dptsv
COMPLEX for cptsv
DOUBLE COMPLEX for zptsv.
Arrays: e (size n - 1), b(size ldb by *). The array e contains the (n -
1) subdiagonal elements of A.
Output Parameters
d Overwritten by the n diagonal elements of the diagonal matrix D from

the L*D*L^T (real)/ L*D*L^H (complex) factorization of A.
e Overwritten by the (n - 1) subdiagonal elements of the unit

bidiagonal factor L from the factorization of A.


itself) is not positive-definite, and the solution has not been

Specific details for the routine ptsv interface are as follows:
See Also
?ptsvx
Uses factorization to compute the solution to the
system of linear equations with a symmetric
(Hermitian) positive definite tridiagonal coefficient
matrix A, and provides error bounds on the solution.
Syntax
call sptsvx( fact, n, nrhs, d, e, df, ef, b, ldb, x, ldx, rcond, ferr, berr, work,
info )
call dptsvx( fact, n, nrhs, d, e, df, ef, b, ldb, x, ldx, rcond, ferr, berr, work,
info )
call cptsvx( fact, n, nrhs, d, e, df, ef, b, ldb, x, ldx, rcond, ferr, berr, work,
rwork, info )
call zptsvx( fact, n, nrhs, d, e, df, ef, b, ldb, x, ldx, rcond, ferr, berr, work,
rwork, info )
call ptsvx( d, e, b, x [,df] [,ef] [,fact] [,ferr] [,berr] [,rcond] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the Cholesky factorization A = L*D*L^T (real)/A = L*D*L^H (complex) to compute the
solution to a real or complex system of linear equations A*X = B, where A is an n-by-n symmetric or
Hermitian positive definite tridiagonal matrix, the columns of matrix B are individual right-hand sides, and
the columns of X are the corresponding solutions.
The routine ?ptsvx performs the following steps:
1. If fact = 'N', the matrix A is factored as A = L*D*L^T (real flavors)/A = L*D*L^H (complex flavors),
where L is a unit lower bidiagonal matrix and D is diagonal. The factorization can also be regarded as
having the form A = U^T*D*U (real flavors)/A = U^H*D*U (complex flavors).
Input Parameters

on entry.
If fact = 'F': on entry, df and ef contain the factored form of A.
Arrays d, e, df, and ef will not be modified.
If fact = 'N', the matrix A will be copied to df and ef, and factored.

nrhs INTEGER. The number of right-hand sides; the number of columns in B; nrhs ≥ 0.
d, df, rwork REAL for single precision flavors

Arrays: d (size n), df (size n), rwork(n).
The array d contains the n diagonal elements of the tridiagonal matrix

A.
The array df is an input argument if fact = 'F' and on entry
contains the n diagonal elements of the diagonal matrix D from the
L*D*L^T (real)/ L*D*L^H (complex) factorization of A.
The array rwork is a workspace array used for complex flavors only.
e,ef,b,work REAL for sptsvx

DOUBLE PRECISION for dptsvx
COMPLEX for cptsvx
DOUBLE COMPLEX for zptsvx.
Arrays: e (size n -1), ef (size n -1), b(size ldb, *), work(*). The
array e contains the (n - 1) subdiagonal elements of the tridiagonal
matrix A.
The array ef is an input argument if fact = 'F' and on entry
contains the (n - 1) subdiagonal elements of the unit bidiagonal
factor L from the L*D*L^T (real)/ L*D*L^H (complex) factorization of A.
The array work is a workspace array. The dimension of work must be
at least 2*n for real flavors, and at least n for complex flavors.
Output Parameters
x REAL for sptsvx

DOUBLE PRECISION for dptsvx
COMPLEX for cptsvx
DOUBLE COMPLEX for zptsvx.
X to the system of equations. The second dimension of x must be at
least max(1,nrhs).
df, ef These arrays are output arguments if fact = 'N'. See the
description of df, ef in Input Arguments section.



exact solution.

computed; rcond =0 is returned.
would suggest.

Specific details for the routine ptsvx interface are as follows:
ef Holds the vector of length (n-1).
both arguments af and ipiv must be present; otherwise, an error is
returned.
See Also
?sysv
Computes the solution to the system of linear
equations with a real or complex symmetric coefficient
matrix A and multiple right-hand sides.
Syntax
call ssysv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call dsysv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call csysv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call zsysv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call sysv( a, b [,uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
symmetric matrix, the columns of matrix B are individual right-hand sides, and the columns of X are the
The diagonal pivoting method is used to factor A as A = U*D*U^T or A = L*D*L^T, where U (or L) is a product
of permutation and unit upper (lower) triangular matrices, and D is symmetric and block diagonal with 1-
by-1 and 2-by-2 diagonal blocks.
The factored form of A is then used to solve the system of equations A*X = B.
Input Parameters


nrhs INTEGER. The number of right-hand sides; the number of columns in B; nrhs ≥ 0.
a, b, work REAL for ssysv

DOUBLE PRECISION for dsysv
COMPLEX for csysv
DOUBLE COMPLEX for zsysv.
Arrays: a(size lda by *), b(size ldb by *), work(*).
symmetric matrix A (see uplo). The second dimension of a must be at
least max(1, n).
work is a workspace array, dimension at least max(1,lwork).
lwork INTEGER. The size of the work array; lwork ≥ 1.

issued by xerbla. See Application Notes below for details and for the
suggested value of lwork.
Output Parameters
a If info = 0, a is overwritten by the block-diagonal matrix D and the

multipliers used to obtain the factor U (or L) from the factorization of
A as computed by ?sytrf.
b If info = 0, b is overwritten by the solution matrix X.
ipiv INTEGER.
and the block structure of D, as determined by ?sytrf.
If ipiv(i) = k > 0, then D(i,i) is a 1-by-1 diagonal block, and the i-th
row and column of A were interchanged with the k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
If uplo = 'L'and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
runs.


exactly singular, so the solution could not be computed.

Specific details for the routine sysv interface are as follows:
Application Notes
lwork = -1.
See Also
?sysv_rook
matrix A and multiple right-hand sides.
Syntax
call ssysv_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call dsysv_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call csysv_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call zsysv_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call sysv_rook( a, b [,uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
symmetric matrix, the columns of matrix B are individual right-hand sides, and the columns of X are the
The ?sysv_rook routine is called to compute the factorization of a complex symmetric matrix A using the
bounded Bunch-Kaufman ("rook") diagonal pivoting method.
Input Parameters


B; nrhs ≥ 0.
a, b, work REAL for ssysv_rook

DOUBLE PRECISION for dsysv_rook
COMPLEX for csysv_rook
DOUBLE COMPLEX for zsysv_rook.
least max(1, n).
Output Parameters

A as computed by sytrf_rook.
ipiv INTEGER.
and the block structure of D, as determined by sytrf_rook.
If uplo = 'U' and ipiv(k) < 0 and ipiv(k - 1) < 0, then rows
and columns k and -ipiv(k) were interchanged, rows and columns k - 1
and -ipiv(k - 1) were interchanged, and D(k-1:k, k-1:k) is a 2-by-2
diagonal block.
If uplo = 'L' and ipiv(k) < 0 and ipiv(k + 1) < 0, then rows
and columns k and -ipiv(k) were interchanged, rows and columns k + 1
and -ipiv(k + 1) were interchanged, and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.

runs.



Specific details for the routine sysv_rook interface are as follows:
See Also
?sysvx
Uses the diagonal pivoting factorization to compute
the solution to the system of linear equations with a
real or complex symmetric coefficient matrix A, and
provides error bounds on the solution.
Syntax
call ssysvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, rcond, ferr,
berr, work, lwork, iwork, info )
call dsysvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, rcond, ferr,
berr, work, lwork, iwork, info )
call csysvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, rcond, ferr,
berr, work, lwork, rwork, info )
call zsysvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, rcond, ferr,
berr, work, lwork, rwork, info )
call sysvx( a, b, x [,uplo] [,af] [,ipiv] [,fact] [,ferr] [,berr] [,rcond] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the diagonal pivoting factorization to compute the solution to a real or complex system of
linear equations A*X = B, where A is an n-by-n symmetric matrix, the columns of matrix B are individual
right-hand sides, and the columns of X are the corresponding solutions.
The routine ?sysvx performs the following steps:
1. If fact = 'N', the diagonal pivoting method is used to factor the matrix A. The form of the
factorization is A = U*D*U^T or A = L*D*L^T, where U (or L) is a product of permutation and unit upper
(lower) triangular matrices, and D is symmetric and block diagonal with 1-by-1 and 2-by-2 diagonal
blocks.
2. If some d_ii = 0, so that D is exactly singular, then the routine returns with info = i. Otherwise, the
factored form of A is used to estimate the condition number of the matrix A. If the reciprocal of the
condition number is less than machine precision, info = n+1 is returned as a warning, but the routine
still goes on to solve for X and compute error bounds as described below.
Input Parameters

supplied on entry.
If fact = 'F': on entry, af and ipiv contain the factored form of A.
Arrays a, af, and ipiv will not be modified.


B; nrhs ≥ 0.
a, af, b, work REAL for ssysvx

DOUBLE PRECISION for dsysvx
COMPLEX for csysvx
DOUBLE COMPLEX for zsysvx.
Arrays: a(size lda by *), af(size ldaf by *), b(size ldb by *), work(*).
least max(1,n).
The array af is an input argument if fact = 'F'. It contains the block

diagonal matrix D and the multipliers used to obtain the factor U or L
from the factorization A = U*D*U^T or A = L*D*L^T as computed by
?sytrf. The second dimension of af must be at least max(1,n).
work(*) is a workspace array, dimension at least max(1,lwork).
ipiv INTEGER.
fact = 'F'. It contains details of the interchanges and the block
structure of D, as determined by ?sytrf.
If ipiv(i) = k > 0, then d_ii is a 1-by-1 diagonal block, and the i-th
row and column of A was interchanged with the k-th row and column.
If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a
2-by-2 block in rows/columns i and i-1, and (i-1)-th row and column
of A was interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a
2-by-2 block in rows/columns i and i+1, and (i+1)-th row and column
of A was interchanged with the m-th row and column.
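The encoding of the block structure in ipiv can be decoded with a single forward scan. The sketch below is an illustration under stated assumptions: the module ipiv_blocks and the subroutine count_blocks are hypothetical names, and ipiv is assumed to be a valid pivot vector produced by ?sytrf, so that negative entries always occur in adjacent equal pairs for either value of uplo.

```fortran
module ipiv_blocks
  implicit none
contains
  ! Hypothetical helper: counts the 1-by-1 (n1) and 2-by-2 (n2) diagonal
  ! blocks of D described by a pivot vector ipiv returned by ?sytrf.
  subroutine count_blocks(n, ipiv, n1, n2)
    integer, intent(in) :: n, ipiv(n)
    integer, intent(out) :: n1, n2
    integer :: i

    n1 = 0
    n2 = 0
    i = 1
    do while (i <= n)
      if (ipiv(i) > 0) then
        ! Positive entry: d_ii is a 1-by-1 block.
        n1 = n1 + 1
        i = i + 1
      else
        ! Negative entry: rows/columns i and i+1 form one 2-by-2 block.
        n2 = n2 + 1
        i = i + 2
      end if
    end do
  end subroutine count_blocks
end module ipiv_blocks
```

For example, ipiv = (1, -1, -1, 4) describes two 1-by-1 blocks (positions 1 and 4) and one 2-by-2 block (positions 2 and 3).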
n).
lwork INTEGER. The size of the work array.

flavors only.
rwork REAL for csysvx;

DOUBLE PRECISION for zsysvx.
only.
Output Parameters
x REAL for ssysvx

DOUBLE PRECISION for dsysvx
COMPLEX for csysvx
DOUBLE COMPLEX for zsysvx.
least max(1,nrhs).
af, ipiv These arrays are output arguments if fact = 'N'.
See the description of af, ipiv in Input Parameters section.

rcond is less than the machine precision (in particular, if rcond = 0),
by a return code of info > 0.

exact solution.

runs.

If info = i, and i ≤ n, then d_ii is exactly zero. The factorization has
been completed, but the block diagonal matrix D is exactly singular,
so the solution and error bounds could not be computed; rcond = 0 is
returned.
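If info = i, and i = n + 1, then D is nonsingular, but rcond is less
than machine precision, meaning that the matrix is singular to working
precision. Nevertheless, the solution and error bounds are computed
because there are a number of situations where the computed solution
can be more accurate than the value of rcond would suggest.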

Specific details for the routine sysvx interface are as follows:
returned.
Application Notes
The value of lwork must be at least max(1,m*n), where for real flavors m = 3 and for complex flavors m =
2. For better performance, try using lwork = max(1, m*n, n*blocksize), where blocksize is the optimal
block size for ?sytrf.
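The sizing rule above can be written as a small helper. This is only a sketch: the module sysvx_lwork and the function sysvx_lwork_size are hypothetical names, and the block size is passed in by the caller because this section does not say how to obtain it. Passing blocksize = 0 yields the minimum legal value max(1, m*n).

```fortran
module sysvx_lwork
  implicit none
contains
  ! Hypothetical helper: returns max(1, m*n, n*blocksize), where m = 3 for
  ! real flavors (ssysvx, dsysvx) and m = 2 for complex flavors (csysvx, zsysvx).
  pure integer function sysvx_lwork_size(n, is_complex, blocksize) result(lwork)
    integer, intent(in) :: n, blocksize
    logical, intent(in) :: is_complex
    integer :: m

    m = merge(2, 3, is_complex)
    lwork = max(1, m*n, n*blocksize)
  end function sysvx_lwork_size
end module sysvx_lwork
```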
lwork = -1.
See Also
?sysvxx
symmetric indefinite coefficient matrix A applying the
diagonal pivoting factorization.
Syntax
call ssysvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, equed, s, b, ldb, x, ldx,
rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
iwork, info )
call dsysvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, equed, s, b, ldb, x, ldx,
rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
iwork, info )
call csysvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, equed, s, b, ldb, x, ldx,
rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
rwork, info )
call zsysvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, equed, s, b, ldb, x, ldx,
rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
rwork, info )
Include Files
mkl.fi, lapack.f90
Description
linear equations A*X = B, where A is an n-by-n real symmetric/Hermitian matrix, the columns of matrix B
The routine ?sysvxx performs the following steps:

= 'E') as
A = U*D*U^T, if uplo = 'U',
or A = L*D*L^T, if uplo = 'L',
where U or L is a product of permutation and unit upper (lower) triangular matrices, and D is
symmetric and block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
3. If some D(i,i)=0, so that D is exactly singular, the routine returns with info = i. Otherwise, the
6. If equilibration was used, the matrix X is premultiplied by diag(s) so that it solves the original system
before equilibration.
Input Parameters

factored.
given by s. Parameters a, af, and ipiv are not modified.

and factored.

a, af, b, work REAL for ssysvxx

DOUBLE PRECISION for dsysvxx
COMPLEX for csysvxx
DOUBLE COMPLEX for zsysvxx.
Arrays: a(size lda by *), af(size ldaf by *), b(size ldb by *), work(*).
The array a contains the symmetric matrix A as specified by uplo. If uplo

= 'U', the leading n-by-n upper triangular part of a contains the upper
least max(1,n).

diagonal matrix D and the multipliers used to obtain the factor U or L from
the factorization A = U*D*U^T or A = L*D*L^T as computed by ?sytrf. The
second dimension of af must be at least max(1,n).
max(1,nrhs).
ipiv INTEGER.
= 'F'. It contains details of the interchanges and the block structure of D
as determined by ?sytrf. If ipiv(k) > 0, rows and columns k and
ipiv(k) were interchanged and D(k,k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) = ipiv(k-1) < 0, rows and columns k-1
and -ipiv(k) were interchanged and D(k-1:k, k-1:k) is a 2-by-2
diagonal block.
If uplo = 'L' and ipiv(k) = ipiv(k+1) < 0, rows and columns k+1
and -ipiv(k) were interchanged and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.


Array, size (n). The array s contains the scale factors for A. If equed =
'Y', A is multiplied on the left and right by diag(s).
This array is an input argument if fact = 'F' only; otherwise it is an
output argument.





are computed.

refinement.
Default 10.0


convergence).
only.

Output Parameters
x REAL for ssysvxx

DOUBLE PRECISION for dsysvxx
COMPLEX for csysvxx
DOUBLE COMPLEX for zsysvxx.
Array, size ldx by nrhs.
inv(diag(s))*X.
a If fact = 'E' and equed = 'Y', overwritten by diag(s)*A*diag(s).
af If fact = 'N', af is an output argument and on exit returns the block

diagonal matrix D and the multipliers used to obtain the factor U or L from
the factorization A = U*D*U^T or A = L*D*L^T.






The array is indexed by the type of error information as described below. Up

to three pieces of information are returned.
side.
fields:




are:

approximately 1.

follows:


side.
fields:


are:

ipiv If fact = 'N', ipiv is an output argument and on exit contains details of
the interchanges and the block structure of D, as determined by ssytrf for
single precision flavors and dsytrf for double precision flavors.

Arguments section).

If 0 < info ≤ n: U(info,info) is exactly zero. The factorization has been

See Also
?hesv
equations with a Hermitian matrix A and multiple
right-hand sides.
Syntax
call chesv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call zhesv( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call hesv( a, b [,uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the complex system of linear equations A*X = B, where A is an n-by-n Hermitian
matrix, the columns of matrix B are individual right-hand sides, and the columns of X are the corresponding
solutions.
The diagonal pivoting method is used to factor A as A = U*D*U^H or A = L*D*L^H, where U (or L) is a product
of permutation and unit upper (lower) triangular matrices, and D is Hermitian and block diagonal with 1-by-1
and 2-by-2 diagonal blocks.
Input Parameters

how A is factored:

B; nrhs ≥ 0.
a, b, work COMPLEX for chesv

DOUBLE COMPLEX for zhesv.
Arrays: a(size lda by *), b(size ldb by *), work(*). The array a
contains the upper or the lower triangular part of the Hermitian matrix
A (see uplo). The second dimension of a must be at least max(1, n).
lwork INTEGER. The size of the work array (lwork ≥ 1).
Output Parameters

A as computed by ?hetrf.
ipiv INTEGER.
and the block structure of D, as determined by ?hetrf.
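If uplo = 'U' and ipiv(i) = ipiv(i-1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i-1, and (i-1)-th row and column of A
was interchanged with the m-th row and column.
If uplo = 'L' and ipiv(i) = ipiv(i+1) = -m < 0, then D has a 2-by-2
block in rows/columns i and i+1, and (i+1)-th row and column of A
was interchanged with the m-th row and column.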

runs.



Specific details for the routine hesv interface are as follows:
Application Notes
lwork = -1.
See Also
?hesv_rook
equations for Hermitian matrices using the bounded
Bunch-Kaufman diagonal pivoting method.
Syntax
call chesv_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call zhesv_rook( uplo, n, nrhs, a, lda, ipiv, b, ldb, work, lwork, info )
call hesv_rook( a, b [,uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the complex system of linear equations A*X = B, where A is an n-by-n Hermitian
matrix, and X and B are n-by-nrhs matrices.
The bounded Bunch-Kaufman ("rook") diagonal pivoting method is used to factor A as

A = U*D*U^H if uplo = 'U', or
A = L*D*L^H if uplo = 'L',
where U (or L) is a product of permutation and unit upper (lower) triangular matrices, and D is Hermitian and
block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
The routine ?hetrf_rook is called to compute the factorization of a complex Hermitian matrix A using the
bounded Bunch-Kaufman ("rook") diagonal pivoting method.
The factored form of A is then used to solve the system of equations A*X = B by calling ?hetrs_rook,
which uses BLAS level 2 routines.
Input Parameters

matrix A.
matrix A.
n INTEGER. The number of linear equations, which is the order of matrix

A; n ≥ 0.

B; nrhs ≥ 0.
a, b, work COMPLEX for chesv_rook

COMPLEX*16 for zhesv_rook.
The array a contains the Hermitian matrix A. If uplo = 'U', the

leading n-by-n upper triangular part of a contains the upper triangular
part of the matrix A, and the strictly lower triangular part of a is not
referenced. If uplo = 'L', the leading n-by-n lower triangular part of a
contains the lower triangular part of the matrix A, and the strictly
upper triangular part of a is not referenced. The second dimension of
a must be at least max(1, n).
The array b contains the n-by-nrhs right hand side matrix B. The
second dimension of b must be at least max(1,nrhs).
lwork INTEGER. The size of the work array (lwork ≥ 1).

Output Parameters

A as computed by hetrf_rook.
b If info = 0, b is overwritten by the n-by-nrhs solution matrix X.
ipiv INTEGER.
and the block structure of D, as determined by ?hetrf_rook.
If uplo = 'U':
If ipiv(k) > 0, rows and columns k and ipiv(k) were
interchanged and D(k,k) is a 1-by-1 diagonal block.

If ipiv(k) < 0 and ipiv(k - 1) < 0, rows and columns k and
-ipiv(k) were interchanged, rows and columns k - 1 and
-ipiv(k - 1) were interchanged, and D(k-1:k, k-1:k) is a 2-by-2
diagonal block.
If uplo = 'L':
If ipiv(k) > 0, rows and columns k and ipiv(k) were
interchanged and D(k,k) is a 1-by-1 diagonal block.
If ipiv(k) < 0 and ipiv(k + 1) < 0, rows and columns k and
-ipiv(k) were interchanged, rows and columns k + 1 and
-ipiv(k + 1) were interchanged, and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.

runs.



Specific details for the routine hesv_rook interface are as follows:
See Also
?hesvx
the solution to the complex system of linear equations
with a Hermitian coefficient matrix A, and provides
error bounds on the solution.
Syntax
call chesvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, rcond, ferr,
berr, work, lwork, rwork, info )
call zhesvx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, b, ldb, x, ldx, rcond, ferr,
berr, work, lwork, rwork, info )
call hesvx( a, b, x [,uplo] [,af] [,ipiv] [,fact] [,ferr] [,berr] [,rcond] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the diagonal pivoting factorization to compute the solution to a complex system of linear
equations A*X = B, where A is an n-by-n Hermitian matrix, the columns of matrix B are individual right-hand
sides, and the columns of X are the corresponding solutions.
The routine ?hesvx performs the following steps:
factorization is A = U*D*U^H or A = L*D*L^H, where U (or L) is a product of permutation and unit upper
(lower) triangular matrices, and D is Hermitian and block diagonal with 1-by-1 and 2-by-2 diagonal
blocks.
Input Parameters

supplied on entry.
If fact = 'F': on entry, af and ipiv contain the factored form of A.
Arrays a, af, and ipiv are not modified.
If fact = 'N', the matrix A is copied to af and factored.

how A is factored:
Hermitian matrix A, and A is factored as U*D*U^H.
Hermitian matrix A; A is factored as L*D*L^H.

B; nrhs ≥ 0.
a, af, b, work COMPLEX for chesvx

DOUBLE COMPLEX for zhesvx.
Arrays: a(size lda by *), af(size ldaf by *), b(size ldb by *), work(*).
Hermitian matrix A (see uplo). The second dimension of a must be at
least max(1,n).
The array af is an input argument if fact = 'F'. It contains the block

diagonal matrix D and the multipliers used to obtain the factor U or L
from the factorization A = U*D*U^H or A = L*D*L^H as computed by
?hetrf. The second dimension of af must be at least max(1,n).
ipiv INTEGER.
structure of D, as determined by ?hetrf.
n).

rwork REAL for chesvx

DOUBLE PRECISION for zhesvx.
Output Parameters
x COMPLEX for chesvx

DOUBLE COMPLEX for zhesvx.
least max(1,nrhs).
af, ipiv These arrays are output arguments if fact = 'N'. See the
description of af, ipiv in Input Parameters section.
rcond REAL for chesvx

ferr REAL for chesvx

element in x(j). The estimate is as reliable as the estimate for rcond,
and is almost always a slight overestimate of the true error.
berr REAL for chesvx

exact solution.

runs.

returned.
would suggest.

Specific details for the routine hesvx interface are as follows:
returned.
Application Notes
The value of lwork must be at least 2*n. For better performance, try using lwork = n*blocksize, where
blocksize is the optimal block size for ?hetrf.
lwork = -1.
See Also
?hesvxx
Hermitian indefinite coefficient matrix A applying the
diagonal pivoting factorization.
Syntax
call chesvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, equed, s, b, ldb, x, ldx,
rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
rwork, info )
call zhesvxx( fact, uplo, n, nrhs, a, lda, af, ldaf, ipiv, equed, s, b, ldb, x, ldx,
rcond, rpvgrw, berr, n_err_bnds, err_bnds_norm, err_bnds_comp, nparams, params, work,
rwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the diagonal pivoting factorization to compute the solution to a complex/double complex
system of linear equations A*X = B, where A is an n-by-n Hermitian matrix, the columns of matrix B are
The routine ?hesvxx performs the following steps:

= 'E') as
A = U*D*U^H, if uplo = 'U',
or A = L*D*L^H, if uplo = 'L',
where U or L is a product of permutation and unit upper (lower) triangular matrices, and D is
Hermitian and block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
3. If some D(i,i)=0, so that D is exactly singular, the routine returns with info = i. Otherwise, the
6. If equilibration was used, the matrix X is premultiplied by diag(s) so that it solves the original system
before equilibration.
Input Parameters

factored.

given by s. Parameters a, af, and ipiv are not modified.

and factored.

a, af, b, work COMPLEX for chesvxx

DOUBLE COMPLEX for zhesvxx.
Arrays: a(size lda by *), af(size ldaf by *), b(size ldb by *), work(*).
The array a contains the Hermitian matrix A as specified by uplo. If uplo =

'U', the leading n-by-n upper triangular part of a contains the upper
least max(1,n).

diagonal matrix D and the multipliers used to obtain the factor U or L from
the factorization A = U*D*U^H or A = L*D*L^H as computed by ?hetrf. The
second dimension of af must be at least max(1,n).
max(1,nrhs).
max(1,5*n).
ipiv INTEGER.
= 'F'. It contains details of the interchanges and the block structure of D
as determined by ?hetrf.
If ipiv(k) > 0, rows and columns k and ipiv(k) were interchanged and
D(k,k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) = ipiv(k-1) < 0, rows and columns k-1
and -ipiv(k) were interchanged and D(k-1:k, k-1:k) is a 2-by-2
diagonal block.
If uplo = 'L' and ipiv(k) = ipiv(k+1) < 0, rows and columns k+1
and -ipiv(k) were interchanged and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.

s REAL for chesvxx

DOUBLE PRECISION for zhesvxx.
Array, size (n). The array s contains the scale factors for A. If equed =
'Y', A is multiplied on the left and right by diag(s).
This array is an input argument if fact = 'F' only; otherwise it is an
output argument.





are computed.

refinement.
Default 10
Aggressive Set to 100 to permit convergence using


convergence).
rwork REAL for chesvxx

Output Parameters
x COMPLEX for chesvxx

DOUBLE COMPLEX for zhesvxx.
Array, size ldx by nrhs.
inv(diag(s))*X.
a If fact = 'E' and equed = 'Y', overwritten by diag(s)*A*diag(s).
af If fact = 'N', af is an output argument and on exit returns the block

diagonal matrix D and the multipliers used to obtain the factor U or L from
the factorization A = U*D*U^H or A = L*D*L^H.

rcond REAL for chesvxx
rpvgrw REAL for chesvxx


berr REAL for chesvxx

Array, size at least max(1, nrhs). Contains the component-wise relative
err_bnds_norm REAL for chesvxx


side.
fields:

threshold sqrt(n)*slamch() for chesvxx and
sqrt(n)*dlamch() for zhesvxx.

for chesvxx and sqrt(n)*dlamch() for
zhesvxx. This error bound should only be

sqrt(n)*slamch() for chesvxx and
sqrt(n)*dlamch() for zhesvxx to determine if
the error estimate is "guaranteed". These
reciprocal condition numbers for some
appropriately scaled matrix Z are:

approximately 1.
err_bnds_comp REAL for chesvxx

follows:


side.
fields:

threshold sqrt(n)*slamch() for chesvxx and
sqrt(n)*dlamch() for zhesvxx.
for chesvxx and sqrt(n)*dlamch() for
zhesvxx. This error bound should only be

sqrt(n)*slamch() for chesvxx and
sqrt(n)*dlamch() for zhesvxx to determine
if the error estimate is "guaranteed". These
reciprocal condition numbers for some
appropriately scaled matrix Z are:

ipiv If fact = 'N', ipiv is an output argument and on exit contains details of
the interchanges and the block structure of D, as determined by chetrf for
single precision flavors and zhetrf for double precision flavors.

Arguments section).

If 0 < info ≤ n: U(info,info) is exactly zero. The factorization has been

See Also
?spsv
matrix A stored in packed format, and multiple right-
hand sides.
Syntax
call sspsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call dspsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call cspsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call zspsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call spsv( ap, b [,uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
symmetric matrix stored in packed format, the columns of matrix B are individual right-hand sides, and the
Input Parameters


B; nrhs ≥ 0.
ap, b REAL for sspsv

DOUBLE PRECISION for dspsv
COMPLEX for cspsv
DOUBLE COMPLEX for zspsv.
Arrays: ap(size *), b(size ldb by *).
The dimension of ap must be at least max(1,n(n+1)/2). The array ap

contains the factor U or L, as specified by uplo, in packed storage
(see Matrix Storage Schemes).
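As an illustration of the packed layout, the sketch below maps an element (i, j) of the stored triangle to its position in ap, assuming the standard LAPACK column-wise packed scheme. The module packed_index and the function packed_pos are hypothetical names introduced for this example; for uplo = 'U' the caller must pass i ≤ j, and for uplo = 'L' it must pass i ≥ j.

```fortran
module packed_index
  implicit none
contains
  ! Hypothetical helper: position in ap of element (i, j) of an n-by-n
  ! symmetric or Hermitian matrix stored column by column in packed form.
  pure integer function packed_pos(uplo, n, i, j) result(k)
    character, intent(in) :: uplo
    integer, intent(in) :: n, i, j

    if (uplo == 'U' .or. uplo == 'u') then
      ! Upper triangle: column j holds elements 1..j (requires i <= j).
      k = i + (j - 1)*j/2
    else
      ! Lower triangle: column j holds elements j..n (requires i >= j).
      k = i + (j - 1)*(2*n - j)/2
    end if
  end function packed_pos
end module packed_index
```

With either layout the last element (n, n) lands at position n(n+1)/2, which is why ap needs at least max(1, n(n+1)/2) elements.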
Output Parameters
ap The block-diagonal matrix D and the multipliers used to obtain the

factor U (or L) from the factorization of A as computed by ?sptrf,
stored as a packed triangular matrix in the same storage format as A.
ipiv INTEGER.
and the block structure of D, as determined by ?sptrf.
If ipiv(i) = k > 0, then d_ii is a 1-by-1 block, and the i-th row and
column of A was interchanged with the k-th row and column.



Specific details for the routine spsv interface are as follows:
ipiv Holds the vector with the number of elements n.
See Also
?spsvx
real or complex symmetric coefficient matrix A stored
in packed format, and provides error bounds on the
solution.
Syntax
call sspsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr, berr,
work, iwork, info )
call dspsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr, berr,
work, iwork, info )
call cspsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr, berr,
work, rwork, info )
call zspsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr, berr,
work, rwork, info )
call spsvx( ap, b, x [,uplo] [,afp] [,ipiv] [,fact] [,ferr] [,berr] [,rcond] [,info] )
Include Files
mkl.fi, lapack.f90
Description
linear equations A*X = B, where A is an n-by-n symmetric matrix stored in packed format, the columns of
matrix B are individual right-hand sides, and the columns of X are the corresponding solutions.
The routine ?spsvx performs the following steps:
factorization is A = U*D*U^T or A = L*D*L^T, where U (or L) is a product of permutation and unit upper
(lower) triangular matrices, and D is symmetric and block diagonal with 1-by-1 and 2-by-2 diagonal
blocks.
Input Parameters

supplied on entry.
If fact = 'F': on entry, afp and ipiv contain the factored form of A.
Arrays ap, afp, and ipiv are not modified.
If fact = 'N', the matrix A is copied to afp and factored.
how A is factored:
symmetric matrix A, and A is factored as U*D*U^T.
symmetric matrix A; A is factored as L*D*L^T.

B; nrhs ≥ 0.
ap, afp, b, work REAL for sspsvx

DOUBLE PRECISION for dspsvx
COMPLEX for cspsvx
DOUBLE COMPLEX for zspsvx.
Arrays: ap(size *), afp(size *), b(size ldb by *), work(*).
The array ap contains the upper or lower triangle of the symmetric

matrix A in packed storage (see Matrix Storage Schemes).
The array afp is an input argument if fact = 'F'. It contains the
block diagonal matrix D and the multipliers used to obtain the factor U
or L from the factorization A = U*D*U^T or A = L*D*L^T as computed
by ?sptrf, in the same storage format as A.
The dimension of work must be at least max(1,3*n) for real flavors
and max(1,2*n) for complex flavors.
ipiv INTEGER.
structure of D, as determined by ?sptrf.
n).
flavors only.
rwork REAL for cspsvx

DOUBLE PRECISION for zspsvx.
only.
Output Parameters
x REAL for sspsvx

DOUBLE PRECISION for dspsvx
COMPLEX for cspsvx
DOUBLE COMPLEX for zspsvx.
least max(1,nrhs).
afp, ipiv These arrays are output arguments if fact = 'N'. See the
description of afp, ipiv in Input Parameters section.


forward and relative backward errors, respectively, for each solution
vector.

returned.
would suggest.

Specific details for the routine spsvx interface are as follows:
returned.
See Also
?hpsv
equations with a Hermitian coefficient matrix A stored
in packed format, and multiple right-hand sides.
Syntax
call chpsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call zhpsv( uplo, n, nrhs, ap, ipiv, b, ldb, info )
call hpsv( ap, b [,uplo] [,ipiv] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine solves for X the system of linear equations A*X = B, where A is an n-by-n Hermitian matrix
stored in packed format, the columns of matrix B are individual right-hand sides, and the columns of X are
the corresponding solutions.
The diagonal pivoting method is used to factor A as A = U*D*U^H or A = L*D*L^H, where U (or L) is a product
of permutation and unit upper (lower) triangular matrices, and D is Hermitian and block diagonal with 1-by-1
and 2-by-2 diagonal blocks.
Input Parameters


B; nrhs ≥ 0.
ap, b COMPLEX for chpsv

DOUBLE COMPLEX for zhpsv.
Arrays: ap(size *), b(size ldb by *).
The dimension of ap must be at least max(1,n(n+1)/2). The array ap
contains the factor U or L, as specified by uplo, in packed storage
(see Matrix Storage Schemes).
Output Parameters
ap The block-diagonal matrix D and the multipliers used to obtain the

factor U (or L) from the factorization of A as computed by ?hptrf,
stored as a packed triangular matrix in the same storage format as A.
ipiv INTEGER.
and the block structure of D, as determined by ?hptrf.


Specific details for the routine hpsv interface are as follows:
See Also
?hpsvx
Hermitian coefficient matrix A stored in packed
format, and provides error bounds on the solution.
Syntax
call chpsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr, berr,
work, rwork, info )
call zhpsvx( fact, uplo, n, nrhs, ap, afp, ipiv, b, ldb, x, ldx, rcond, ferr, berr,
work, rwork, info )
call hpsvx( ap, b, x [,uplo] [,afp] [,ipiv] [,fact] [,ferr] [,berr] [,rcond] [,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine uses the diagonal pivoting factorization to compute the solution to a complex system of linear
equations A*X = B, where A is an n-by-n Hermitian matrix stored in packed format, the columns of matrix B
The routine ?hpsvx performs the following steps:
factorization is A = U*D*U^H or A = L*D*L^H, where U (or L) is a product of permutation and unit upper
(lower) triangular matrices, and D is Hermitian and block diagonal with 1-by-1 and 2-by-2 diagonal
blocks.
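2. If some d_ii = 0, so that D is exactly singular, then the routine returns with info = i. Otherwise, the
factored form of A is used to estimate the condition number of the matrix A. If the reciprocal of the
condition number is less than machine precision, info = n+1 is returned as a warning, but the routine
still goes on to solve for X and compute error bounds.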
Input Parameters

supplied on entry.
If fact = 'F': on entry, afp and ipiv contain the factored form of A.
Arrays ap, afp, and ipiv are not modified.
If fact = 'N', the matrix A is copied to afp and factored.

how A is factored:
Hermitian matrix A, and A is factored as U*D*U^H.
Hermitian matrix A, and A is factored as L*D*L^H.

B; nrhs ≥ 0.
ap, afp, b, work COMPLEX for chpsvx

DOUBLE COMPLEX for zhpsvx.
Arrays: ap(size *), afp(size *), b(size ldb by *), work(*).
The array ap contains the upper or lower triangle of the Hermitian

matrix A in packed storage (see Matrix Storage Schemes).
The array afp is an input argument if fact = 'F'. It contains the
block diagonal matrix D and the multipliers used to obtain the factor U
or L from the factorization A = U*D*U^H or A = L*D*L^H as computed
by ?hptrf, in the same storage format as A.
The dimension of work must be at least max(1,2*n).
ipiv INTEGER.
structure of D, as determined by ?hptrf.
n).
rwork REAL for chpsvx

DOUBLE PRECISION for zhpsvx.
Output Parameters
x COMPLEX for chpsvx

DOUBLE COMPLEX for zhpsvx.
least max(1,nrhs).
afp, ipiv These arrays are output arguments if fact = 'N'. See the
description of afp, ipiv in Input Parameters section.
rcond REAL for chpsvx

ferr REAL for chpsvx

berr REAL for chpsvx

exact solution.

returned.
would suggest.

Specific details for the routine hpsvx interface are as follows:
returned.
See Also
LAPACK Least Squares and Eigenvalue Problem Routines
This section includes descriptions of LAPACK computational routines and driver routines for solving linear
least squares problems, eigenvalue and singular value problems, and performing a number of related
computational tasks. For a full reference on LAPACK routines and related information see [LUG].
Least Squares Problems. A typical least squares problem is as follows: given a matrix A and a vector b,
find the vector x that minimizes the sum of squares Σ_i ((Ax)_i - b_i)^2 or, equivalently, find the vector x that
minimizes the 2-norm ||Ax - b||_2.
In the most usual case, A is an m-by-n matrix with m ≥ n and rank(A) = n. This problem is also referred to
as finding the least squares solution to an overdetermined system of linear equations (here we have more
equations than unknowns). To solve this problem, you can use the QR factorization of the matrix A (see QR
Factorization).
If m < n and rank(A) = m, there exist an infinite number of solutions x which exactly satisfy Ax = b, and
thus minimize the norm ||Ax - b||_2. In this case it is often useful to find the unique solution that
minimizes ||x||_2. This problem is referred to as finding the minimum-norm solution to an
underdetermined system of linear equations (here we have more unknowns than equations). To solve this
problem, you can use the LQ factorization of the matrix A (see LQ Factorization).
In the general case you may have a rank-deficient least squares problem, with rank(A) < min(m, n): find
the minimum-norm least squares solution that minimizes both ||x||_2 and ||Ax - b||_2. In this case (or
when the rank of A is in doubt) you can use the QR factorization with pivoting or singular value
decomposition (see Singular Value Decomposition).
Eigenvalue Problems. The eigenvalue problems (from German eigen "own") are stated as follows: given a
matrix A, find the eigenvalues λ and the corresponding eigenvectors z that satisfy the equation
Az = λz (right eigenvectors z)
or the equation
z^H A = λ z^H (left eigenvectors z).
If A is a real symmetric or complex Hermitian matrix, the above two equations are equivalent, and the
problem is called a symmetric eigenvalue problem. Routines for solving this type of problems are described
in the section Symmetric Eigenvalue Problems.
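The equivalence of the right and left eigenvector equations for a Hermitian matrix can be checked in one line. The sketch below assumes A^H = A and uses the fact that such a matrix has real eigenvalues; taking the conjugate transpose of the right eigenvector equation gives:

```latex
(Az)^H = (\lambda z)^H
\;\Longrightarrow\; z^H A^H = \bar{\lambda}\, z^H
\;\Longrightarrow\; z^H A = \lambda\, z^H
\qquad (A^H = A,\ \bar{\lambda} = \lambda)
```

For a real symmetric A the same argument holds with the transpose in place of the conjugate transpose.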
Routines for solving eigenvalue problems with nonsymmetric or non-Hermitian matrices are described in the
section Nonsymmetric Eigenvalue Problems.
The library also includes routines that handle generalized symmetric-definite eigenvalue problems: find
the eigenvalues λ and the corresponding eigenvectors z that satisfy one of the following equations:
Az = λBz, ABz = λz, or BAz = λz,
where A is symmetric or Hermitian, and B is symmetric positive-definite or Hermitian positive-definite.
Routines for reducing these problems to standard symmetric eigenvalue problems are described in the
section Generalized Symmetric-Definite Eigenvalue Problems.
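One standard way to carry out such a reduction, shown here as a sketch and not necessarily the exact sequence the library routines follow, uses the Cholesky factorization B = L*L^H. The names C and y are introduced only for this illustration:

```latex
\begin{aligned}
Az = \lambda Bz &\iff Cy = \lambda y, & C &= L^{-1} A L^{-H}, & y &= L^{H} z, \\
ABz = \lambda z &\iff Cy = \lambda y, & C &= L^{H} A L,       & y &= L^{H} z, \\
BAz = \lambda z &\iff Cy = \lambda y, & C &= L^{H} A L,       & y &= L^{-1} z.
\end{aligned}
```

In each case C is Hermitian (symmetric for real data), so the reduced problem is a standard symmetric eigenvalue problem, and the eigenvectors z of the original problem are recovered from y by a triangular solve or multiply with L.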
To solve a particular problem, you usually call several computational routines. Sometimes you need to
combine the routines of this chapter with other LAPACK routines described in "LAPACK Routines: Linear
Equations" as well as with BLAS routines described in "BLAS and Sparse BLAS Routines".
For example, to solve a set of least squares problems minimizing ||Ax - b||_2 for all columns b of a given
matrix B (where A and B are real matrices), you can call ?geqrf to form the factorization A = QR, then call
?ormqr to compute C = Q^H B and finally call the BLAS routine ?trsm to solve for X the system of equations
RX = C.
Another way is to call an appropriate driver routine that performs several tasks in one call. For example, to
solve the least squares problem the driver routine ?gels can be used.
LAPACK Least Squares and Eigenvalue Problem Computational Routines

In the sections that follow, the descriptions of LAPACK computational routines are given. These routines
perform distinct computational tasks that can be used for:
Orthogonal Factorizations
Singular Value Decomposition
Symmetric Eigenvalue Problems
Generalized Symmetric-Definite Eigenvalue Problems
Nonsymmetric Eigenvalue Problems
Generalized Nonsymmetric Eigenvalue Problems
Generalized Singular Value Decomposition
See also the respective driver routines.
Orthogonal Factorizations: LAPACK Computational Routines

This section describes the LAPACK routines for the QR (RQ) and LQ (QL) factorization of matrices. Routines
for the RZ factorization as well as for generalized QR and RQ factorizations are also included.
QR Factorization. Assume that A is an m-by-n matrix to be factored.
If m ≥ n, the QR factorization is given by
$$A = Q \begin{pmatrix} R \\ 0 \end{pmatrix} = (Q_1, Q_2) \begin{pmatrix} R \\ 0 \end{pmatrix} = Q_1 R$$
where R is an n-by-n upper triangular matrix with real diagonal elements, and Q is an m-by-m orthogonal (or
unitary) matrix. Here Q1 consists of the first n columns of Q.
You can use the QR factorization for solving the following least squares problem: minimize ||Ax - b||_2
where A is a full-rank m-by-n matrix (m ≥ n). After factoring the matrix, compute the solution x by solving
Rx = (Q1)^T b.
If m < n, the QR factorization is given by
A = QR = Q (R1 R2)
where R is trapezoidal, R1 is upper triangular and R2 is rectangular.
Q is represented as a product of min(m, n) elementary reflectors. Routines are provided to work with Q in
this representation.
LQ Factorization. The LQ factorization of an m-by-n matrix A is as follows. If m ≤ n,
$$A = (L, 0)\, Q = (L, 0) \begin{pmatrix} Q_1 \\ Q_2 \end{pmatrix} = L Q_1$$
where L is an m-by-m lower triangular matrix with real diagonal elements, and Q is an n-by-n orthogonal (or
unitary) matrix.
If m > n, the LQ factorization is
$$A = \begin{pmatrix} L_1 \\ L_2 \end{pmatrix} Q$$
where L1 is an n-by-n lower triangular matrix, L2 is rectangular, and Q is an n-by-n orthogonal (or unitary)
matrix.
LAPACK Routines 3
You can use the LQ factorization to find the minimum-norm solution of an underdetermined system of linear
equations Ax = b where A is an m-by-n matrix of rank m (m < n). After factoring the matrix, compute the
solution vector x as follows: solve Ly = b for y, and then compute x = (Q1)^H y.
Table "Computational Routines for Orthogonal Factorization" lists LAPACK routines that perform orthogonal
factorization of matrices.
Computational Routines for Orthogonal Factorization
Matrix type, factorization                               Factorize without pivoting   Factorize with pivoting   Generate matrix Q   Apply matrix Q
general matrices, QR factorization                       geqrf, geqr, geqrfp          geqpf, geqp3              orgqr, ungqr        ormqr, unmqr, gemqr
general matrices, blocked QR factorization               geqrt                        -                         -                   gemqrt
general matrices, RQ factorization                       gerqf                        -                         orgrq, ungrq        ormrq, unmrq
general matrices, LQ factorization                       gelqf, gelq                  -                         orglq, unglq        ormlq, unmlq, gemlq
general matrices, QL factorization                       geqlf                        -                         orgql, ungql        ormql, unmql
trapezoidal matrices, RZ factorization                   tzrzf                        -                         -                   ormrz, unmrz
pair of matrices, generalized QR factorization           ggqrf                        -                         -                   -
pair of matrices, generalized RQ factorization           ggrqf                        -                         -                   -
triangular-pentagonal matrices, blocked QR factorization tpqrt                        -                         -                   tpmqrt
?geqrf
Computes the QR factorization of a general m-by-n
matrix.
Syntax
call sgeqrf(m, n, a, lda, tau, work, lwork, info)
call dgeqrf(m, n, a, lda, tau, work, lwork, info)
call cgeqrf(m, n, a, lda, tau, work, lwork, info)
call zgeqrf(m, n, a, lda, tau, work, lwork, info)
call geqrf(a [, tau] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine forms the QR factorization of a general m-by-n matrix A (see Orthogonal Factorizations). No
pivoting is performed.
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of min(m, n)
elementary reflectors. Routines are provided to work with Q in this representation.
NOTE
This routine supports the Progress Routine feature. See Progress Function section for details.
Input Parameters
m INTEGER. The number of rows in the matrix A (m ≥ 0).
n INTEGER. The number of columns in A (n ≥ 0).
a, work REAL for sgeqrf

DOUBLE PRECISION for dgeqrf
COMPLEX for cgeqrf
DOUBLE COMPLEX for zgeqrf.
Arrays: a(lda,*) contains the matrix A. The second dimension of a must be
at least max(1, n).
work is a workspace array, its dimension max(1, lwork).
lda INTEGER. The leading dimension of a; at least max(1, m).
lwork INTEGER. The size of the work array.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
Output Parameters
a Overwritten by the factorization data as follows:

The elements on and above the diagonal of the array contain the min(m,n)-by-n
upper trapezoidal matrix R (R is upper triangular if m ≥ n); the elements
below the diagonal, with the array tau, represent the orthogonal matrix Q as
a product of min(m,n) elementary reflectors (see Orthogonal Factorizations).
tau REAL for sgeqrf

DOUBLE PRECISION for dgeqrf
COMPLEX for cgeqrf
DOUBLE COMPLEX for zgeqrf.
Array, size at least max(1, min(m, n)). Contains scalars that define
elementary reflectors for the matrix Q in its decomposition in a product of
elementary reflectors (see Orthogonal Factorizations).
work(1) If info = 0, on exit work(1) contains the minimum value of lwork
required for optimum performance. Use this lwork for subsequent runs.
info INTEGER.

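LAPACK 95 Interface Notes
Routines in Fortran 95 interface have fewer arguments in the calling sequence than their FORTRAN 77
counterparts. For general conventions applied to skip redundant or restorable arguments, see LAPACK 95
Interface Conventions.
Specific details for the routine geqrf interface are the following:
tau Holds the vector of length min(m,n).
Application Notes
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set
lwork = -1.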
The computed factorization is the exact factorization of a matrix A + E, where
||E||_2 = O(ε)||A||_2.
The approximate number of floating-point operations for real flavors is
(4/3)n^3 if m = n,
(2/3)n^2(3m-n) if m > n,
(2/3)m^2(3n-m) if m < n.
The number of operations for complex flavors is 4 times greater.

To solve a set of least squares problems minimizing ||A*x - b||_2 for all columns b of a given matrix B, you
can call the following:
?geqrf (this routine) to factorize A = QR;
ormqr to compute C = Q^T*B (for real matrices);
unmqr to compute C = Q^H*B (for complex matrices);
trsm (a BLAS routine) to solve R*X = C.
(The columns of the computed X are the least squares solution vectors x.)
To compute the elements of Q explicitly, call
orgqr (for real matrices)
ungqr (for complex matrices).
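The least squares call sequence described above can be sketched as follows for the double precision real flavor. This is a minimal illustration, not a reference implementation: the program name, the 4-by-2 test data, and the use of the default-integer (LP64) FORTRAN 77 interface are assumptions made for the example. The data lie exactly on the line y = 1 + 2t, so the printed solution should be close to (1, 2). The program must be linked against Intel MKL or another LAPACK/BLAS implementation.

```fortran
program lsq_geqrf_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 4, n = 2, nrhs = 1
  real(dp) :: a(m,n), b(m,nrhs), tau(n), wq(1)
  real(dp), allocatable :: work(:)
  integer :: lwork, info
  external :: dgeqrf, dormqr, dtrsm

  ! Illustrative data: fit y = x1 + x2*t at t = 0, 1, 2, 3.
  a(:,1) = 1.0_dp
  a(:,2) = [0.0_dp, 1.0_dp, 2.0_dp, 3.0_dp]
  b(:,1) = [1.0_dp, 3.0_dp, 5.0_dp, 7.0_dp]

  ! Workspace queries (lwork = -1): the recommended size is returned in wq(1).
  call dgeqrf(m, n, a, m, tau, wq, -1, info)
  lwork = int(wq(1))
  call dormqr('L', 'T', m, nrhs, n, a, m, tau, b, m, wq, -1, info)
  lwork = max(1, lwork, int(wq(1)))
  allocate(work(lwork))

  ! Step 1: factorize A = Q*R (R overwrites the upper triangle of a).
  call dgeqrf(m, n, a, m, tau, work, lwork, info)
  if (info /= 0) error stop 'dgeqrf failed'

  ! Step 2: compute C = Q^T*B in place of b.
  call dormqr('L', 'T', m, nrhs, n, a, m, tau, b, m, work, lwork, info)
  if (info /= 0) error stop 'dormqr failed'

  ! Step 3: solve R*X = C using the leading n rows of b.
  call dtrsm('L', 'U', 'N', 'N', n, nrhs, 1.0_dp, a, m, b, m)

  print '(a,2f10.6)', 'x = ', b(1:n,1)
end program lsq_geqrf_sketch
```

Only the first n rows of b hold the solution on exit; the remaining rows contain the components of Q^T*b that determine the residual.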
See Also
mkl_progress
?geqr
Computes a QR factorization of a general matrix, with
best performance for tall and skinny matrices.
call sgeqr(m, n, a, lda, t, tsize, work, lwork, info)
call dgeqr(m, n, a, lda, t, tsize, work, lwork, info)
call cgeqr(m, n, a, lda, t, tsize, work, lwork, info)
call zgeqr(m, n, a, lda, t, tsize, work, lwork, info)
Description
The ?geqr routine computes a QR factorization of an m-by-n matrix A. If the matrix is tall and skinny (m is
substantially larger than n), a highly scalable algorithm is used to avoid communication overhead.
NOTE
The internal format of the elementary reflectors generated by ?geqr is only compatible with the
?gemqr routine and not any other QR routines.
Input Parameters
m INTEGER. The number of rows of the matrix A. m ≥ 0.
n INTEGER. The number of columns of the matrix A. n ≥ 0.
a REAL for sgeqr

DOUBLE PRECISION for dgeqr
COMPLEX for cgeqr
COMPLEX*16 for zgeqr
Array of size (lda,n). On entry, the m-by-n matrix A.
lda INTEGER. The leading dimension of the array a. lda ≥ max(1,m).
tsize INTEGER. If tsize ≥ 5, the size of the array t. If tsize = -1 or tsize = -2,
then the routine performs a workspace query. The routine calculates the
sizes required for the t and work arrays and returns these values as the
first entries of the t and work arrays, without issuing any error message
related to t or work by xerbla.
If tsize = -1, the routine calculates the optimal size of t for optimum
performance and returns this value in t(1).
If tsize = -2, the routine calculates the minimum size required for t and
returns this value in t(1).
lwork INTEGER. The size of the array work. If lwork = -1 or lwork = -2, then the
routine performs a workspace query. The routine only calculates the sizes of
the t and work arrays and returns these values as the first entries of the t
and work arrays, without issuing any error message related to t or work by
xerbla.
If lwork = -1, the routine calculates the optimal size of work for optimum
performance and returns this value in work(1).
If lwork = -2, the routine calculates the minimum size required for work
and returns this value in work(1).
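The workspace query described for tsize and lwork can be sketched as follows. This is an illustration under stated assumptions: the program name, the 1000-by-8 random matrix, and the default-integer (LP64) interface are choices made for the example. The query array for t is given five elements because the t argument is documented as having size max(5, tsize).

```fortran
program geqr_query_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 1000, n = 8        ! tall and skinny test size (assumed)
  real(dp), allocatable :: a(:,:), t(:), work(:)
  real(dp) :: tq(5), wq(1)
  integer :: tsize, lwork, info
  external :: dgeqr

  allocate(a(m,n))
  call random_number(a)

  ! Query call: tsize = -1 and lwork = -1 request the optimal sizes,
  ! which are returned in tq(1) and wq(1). No factorization is done here.
  call dgeqr(m, n, a, m, tq, -1, wq, -1, info)
  if (info /= 0) error stop 'dgeqr workspace query failed'
  tsize = max(5, int(tq(1)))
  lwork = max(1, int(wq(1)))
  allocate(t(tsize), work(lwork))

  ! Actual factorization with the sizes obtained from the query.
  call dgeqr(m, n, a, m, t, tsize, work, lwork, info)
  if (info /= 0) error stop 'dgeqr failed'

  print '(a,i0,a,i0)', 'tsize = ', tsize, ', lwork = ', lwork
end program geqr_query_sketch
```

Passing -2 instead of -1 in the query call would request the minimum sizes rather than the optimal ones. Both a and t must be retained for a later call to ?gemqr.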
Output Parameters
a REAL for sgeqr

DOUBLE PRECISION for dgeqr
COMPLEX for cgeqr
COMPLEX*16 for zgeqr
On exit, the elements on and above the diagonal of the array contain the
(min(m,n))-by-n upper trapezoidal matrix R (R is upper triangular if m ≥ n);
the elements below the diagonal represent Q.
t REAL for sgeqr

DOUBLE PRECISION for dgeqr
COMPLEX for cgeqr
COMPLEX*16 for zgeqr
Array, size (max(5,tsize)).
If info = 0, t(1) returns the optimal value for tsize. You can specify that
it return the minimum required value for tsize instead - see the tsize
description for details. The remaining entries of t contain part of the data
structure used to represent Q. To apply or construct Q, you need to retain a
and t and pass them to other routines.
work REAL for sgeqr

DOUBLE PRECISION for dgeqr
COMPLEX for cgeqr
COMPLEX*16 for zgeqr
Array, size (max(1,lwork)).
If info = 0, work(1) contains the optimal value for lwork. You can specify
that it return the minimum required value for lwork instead - see the
lwork description for details.
info INTEGER.
info = 0 indicates a successful exit.
info < 0: if info = -i, the i-th argument had an illegal value.
See Also
?gemqr Multiplies a matrix C by a real orthogonal or complex unitary matrix Q, as computed by
?geqr, with best performance for tall and skinny matrices.
?geqrfp
Computes the QR factorization of a general m-by-n
matrix with non-negative diagonal elements.
Syntax
call sgeqrfp(m, n, a, lda, tau, work, lwork, info)
call dgeqrfp(m, n, a, lda, tau, work, lwork, info)
call cgeqrfp(m, n, a, lda, tau, work, lwork, info)
call zgeqrfp(m, n, a, lda, tau, work, lwork, info)
Include Files
mkl.fi
Description
The routine forms the QR factorization of a general m-by-n matrix A (see Orthogonal Factorizations). No
pivoting is performed. The diagonal entries of R are real and nonnegative.
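NOTE
This routine supports the Progress Routine feature. See Progress Function section for details.
Input Parameters
a, work REAL for sgeqrfp

DOUBLE PRECISION for dgeqrfp
COMPLEX for cgeqrfp
DOUBLE COMPLEX for zgeqrfp.
Arrays: a(lda,*) contains the matrix A. The second dimension of a must be
at least max(1, n).
lwork INTEGER. The size of the work array.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.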
Output Parameters

The diagonal elements of the matrix R are real and non-negative.
tau REAL for sgeqrfp

DOUBLE PRECISION for dgeqrfp
COMPLEX for cgeqrfp
DOUBLE COMPLEX for zgeqrfp.
Array, size at least max(1, min(m, n)). Contains scalars that define
elementary reflectors for the matrix Q in its decomposition in a product of
elementary reflectors (see Orthogonal Factorizations).

info INTEGER.

Specific details for the routine geqrfp interface are the following:
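Application Notes
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set
lwork = -1.
The computed factorization is the exact factorization of a matrix A + E, where
||E||_2 = O(ε)||A||_2.
The approximate number of floating-point operations for real flavors is
(4/3)n^3 if m = n,
(2/3)n^2(3m-n) if m > n,
(2/3)m^2(3n-m) if m < n.

To solve a set of least squares problems minimizing ||A*x - b||_2 for all columns b of a given matrix B, you
can call the following:
?geqrfp (this routine) to factorize A = QR;
ormqr to compute C = Q^T*B (for real matrices);
unmqr to compute C = Q^H*B (for complex matrices);
trsm (a BLAS routine) to solve R*X = C.
(The columns of the computed X are the least squares solution vectors x.)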
See Also
mkl_progress
?geqrt
Computes a blocked QR factorization of a general real
or complex matrix using the compact WY
representation of Q.
Syntax
call sgeqrt(m, n, nb, a, lda, t, ldt, work, info)
call dgeqrt(m, n, nb, a, lda, t, ldt, work, info)
call cgeqrt(m, n, nb, a, lda, t, ldt, work, info)
call zgeqrt(m, n, nb, a, lda, t, ldt, work, info)
call geqrt(a, t, nb[, info])
Include Files
mkl.fi, lapack.f90
Description
The strictly lower triangular matrix V contains the elementary reflectors H(i) in the ith column below the
diagonal. For example, if m=5 and n=3, the matrix V is
$$V = \begin{pmatrix} 1 & & \\ v_1 & 1 & \\ v_1 & v_2 & 1 \\ v_1 & v_2 & v_3 \\ v_1 & v_2 & v_3 \end{pmatrix}$$
where vi represents one of the vectors that define H(i). The vectors are returned in the lower triangular part
of array a.
NOTE
The 1s along the diagonal of V are not stored in a.
Let k = min(m,n). The number of blocks is b = ceiling(k/nb), where each block is of order nb except for
the last block, which is of order ib = k - (b-1)*nb. For each of the b blocks, an upper triangular block
reflector factor is computed: t1, t2, ..., tb. The nb-by-nb (and ib-by-ib for the last block) factors are stored
in the nb-by-n array t as
t = (t1 t2 ... tb).
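As a concrete instance of the blocking above, with m = 6, n = 4 and nb = 2 there are k = 4 reflectors in b = 2 blocks, each of order 2. The sketch below factors such a matrix with dgeqrt and then uses dgemqrt to check that Q^T*A reproduces R. The program name, sizes, random data, and default-integer (LP64) interface are assumptions made for illustration; the array shapes follow the parameter descriptions of ?geqrt and ?gemqrt.

```fortran
program geqrt_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 6, n = 4, nb = 2
  integer, parameter :: k = min(m, n)          ! number of elementary reflectors
  real(dp) :: a(m,n), c(m,n), t(nb,k), work(nb*n)
  real(dp) :: err
  integer :: i, j, info
  external :: dgeqrt, dgemqrt

  call random_number(a)
  c = a                                        ! keep a copy of the original A

  ! Blocked QR: R in the upper triangle of a, V below it, block factors in t.
  call dgeqrt(m, n, nb, a, m, t, nb, work, info)
  if (info /= 0) error stop 'dgeqrt failed'

  ! Apply Q^T from the left to the copy of A; the result should equal R
  ! on and above the diagonal and be close to zero below it.
  call dgemqrt('L', 'T', m, n, k, nb, a, m, t, nb, c, m, work, info)
  if (info /= 0) error stop 'dgemqrt failed'

  err = 0.0_dp
  do j = 1, n
    do i = 1, m
      if (i <= j) then
        err = max(err, abs(c(i,j) - a(i,j)))
      else
        err = max(err, abs(c(i,j)))
      end if
    end do
  end do
  print '(a,es10.2)', 'max |Q^T*A - R| = ', err
end program geqrt_sketch
```

The same nb must be passed to both routines, because it determines how the block factors are laid out in t.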
Input Parameters
nb INTEGER. The block size to be used in the blocked QR (min(m, n) ≥ nb ≥ 1).
a, work REAL for sgeqrt

DOUBLE PRECISION for dgeqrt
COMPLEX for cgeqrt
COMPLEX*16 for zgeqrt.
Arrays: a, DIMENSION (lda, n), contains the m-by-n matrix A.
work, DIMENSION (nb, n), is a workspace array.
ldt INTEGER. The leading dimension of t; at least nb.
Output Parameters

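a Overwritten by the factorization data as follows:
The elements on and above the diagonal of the array contain the min(m,n)-by-n
upper trapezoidal matrix R (R is upper triangular if m ≥ n); the elements
below the diagonal, with the array t, represent the orthogonal matrix Q as a
product of min(m,n) elementary reflectors (see Orthogonal Factorizations).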
t REAL for sgeqrt

DOUBLE PRECISION for dgeqrt
COMPLEX for cgeqrt
COMPLEX*16 for zgeqrt.
Array, DIMENSION (ldt, min(m, n)).
The upper triangular factors of the block reflectors, stored as a sequence of upper
triangular blocks.
info INTEGER.
If info = -i, the ith argument had an illegal value.
?gemqrt
Multiplies a general matrix by the orthogonal/unitary
matrix Q of the QR factorization formed by ?geqrt.
Syntax
call sgemqrt(side, trans, m, n, k, nb, v, ldv, t, ldt, c, ldc, work, info)
call dgemqrt(side, trans, m, n, k, nb, v, ldv, t, ldt, c, ldc, work, info)
call cgemqrt(side, trans, m, n, k, nb, v, ldv, t, ldt, c, ldc, work, info)
call zgemqrt(side, trans, m, n, k, nb, v, ldv, t, ldt, c, ldc, work, info)
call gemqrt( v, t, c, k, nb[, trans][, side][, info])
Include Files
mkl.fi, lapack.f90
Description
The ?gemqrt routine overwrites the general real or complex m-by-n matrix C with
               side = 'L'    side = 'R'
trans = 'N':   Q*C           C*Q
trans = 'T':   Q^T*C         C*Q^T
trans = 'C':   Q^H*C         C*Q^H
where Q is a real orthogonal (complex unitary) matrix defined as the product of k elementary reflectors
Q = H(1) H(2)... H(k) = I - V*T*V^T for real flavors, and
Q = H(1) H(2)... H(k) = I - V*T*V^H for complex flavors,
generated using the compact WY representation as returned by geqrt. Q is of order m if side = 'L' and of
order n if side = 'R'.
Input Parameters
side CHARACTER
='L': apply Q, Q^T, or Q^H from the left.
='R': apply Q, Q^T, or Q^H from the right.
trans CHARACTER
='N', no transpose, apply Q.
='T', transpose, apply Q^T.
='C', conjugate transpose, apply Q^H.
m INTEGER. The number of rows in the matrix C, (m ≥ 0).
n INTEGER. The number of columns in the matrix C, (n ≥ 0).
k INTEGER. The number of elementary reflectors whose product defines the

matrix Q. Constraints:
If side = 'L', m ≥ k ≥ 0.
If side = 'R', n ≥ k ≥ 0.
nb INTEGER.
The block size used for the storage of t, k ≥ nb ≥ 1. This must be the same
value of nb used to generate t in geqrt.
v REAL for sgemqrt

DOUBLE PRECISION for dgemqrt
COMPLEX for cgemqrt
COMPLEX*16 for zgemqrt.
Array, DIMENSION (ldv, k).
The ith column must contain the vector which defines the elementary
reflector H(i), for i = 1,2,...,k, as returned by geqrt in the first k columns of
its array argument a.
ldv INTEGER. The leading dimension of the array v.

if side = 'L', ldv must be at least max(1,m);
if side = 'R', ldv must be at least max(1,n).
t REAL for sgemqrt

DOUBLE PRECISION for dgemqrt
COMPLEX for cgemqrt
COMPLEX*16 for zgemqrt.
Array, DIMENSION (ldt, k).
The upper triangular factors of the block reflectors as returned by geqrt.
ldt INTEGER. The leading dimension of the array t. ldt must be at least nb.
c REAL for sgemqrt

DOUBLE PRECISION for dgemqrt
COMPLEX for cgemqrt
COMPLEX*16 for zgemqrt.
The m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c. ldc must be at least
max(1, m).
work REAL for sgemqrt

DOUBLE PRECISION for dgemqrt
COMPLEX for cgemqrt
COMPLEX*16 for zgemqrt.
Workspace array.
If side = 'L', DIMENSION n*nb.
If side = 'R', DIMENSION m*nb.
Output Parameters
c Overwritten by the product Q*C, C*Q, Q^T*C, C*Q^T, Q^H*C, or C*Q^H as
specified by side and trans.
info INTEGER.
= 0: the execution is successful.
< 0: if info = -i, the ith argument had an illegal value.
?geqpf
Computes the QR factorization of a general m-by-n
matrix with pivoting.
Syntax
call sgeqpf(m, n, a, lda, jpvt, tau, work, info)
call dgeqpf(m, n, a, lda, jpvt, tau, work, info)
call cgeqpf(m, n, a, lda, jpvt, tau, work, rwork, info)
call zgeqpf(m, n, a, lda, jpvt, tau, work, rwork, info)
call geqpf(a, jpvt [,tau] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine is deprecated and has been replaced by routine geqp3.
The routine ?geqpf forms the QR factorization of a general m-by-n matrix A with column pivoting: A*P =
Q*R (see Orthogonal Factorizations). Here P denotes an n-by-n permutation matrix.
Input Parameters
a, work REAL for sgeqpf

DOUBLE PRECISION for dgeqpf
COMPLEX for cgeqpf
DOUBLE COMPLEX for zgeqpf.
Arrays: a (lda,*) contains the matrix A. The second dimension of a must be
at least max(1, n).
work (lwork) is a workspace array. The size of the work array must be at
least max(1, 3*n) for real flavors and at least max(1, n) for complex
flavors.
jpvt INTEGER. Array, size at least max(1, n).

On entry, if jpvt(i) > 0, the i-th column of A is moved to the beginning of
A*P before the computation, and fixed in place during the computation.
If jpvt(i) = 0, the ith column of A is a free column (that is, it may be
interchanged during the computation with any other free column).
rwork REAL for cgeqpf

DOUBLE PRECISION for zgeqpf.
A workspace array, DIMENSION at least max(1, 2*n).
Output Parameters

tau REAL for sgeqpf

DOUBLE PRECISION for dgeqpf
COMPLEX for cgeqpf
DOUBLE COMPLEX for zgeqpf.
Array, size at least max(1, min(m, n)). Contains additional information on
the matrix Q.
jpvt Overwritten by details of the permutation matrix P in the factorization A*P
= Q*R. More precisely, the columns of A*P are the columns of A in the
following order:
jpvt(1), jpvt(2), ..., jpvt(n).
info INTEGER.

Specific details for the routine geqpf interface are the following:
jpvt Holds the vector of length n.
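Application Notes
The computed factorization is the exact factorization of a matrix A + E, where
||E||_2 = O(ε)||A||_2.
The approximate number of floating-point operations for real flavors is
(4/3)n^3 if m = n,
(2/3)n^2(3m-n) if m > n,
(2/3)m^2(3n-m) if m < n.

To solve a set of least squares problems minimizing ||A*x - b||_2 for all columns b of a given matrix B, you
can call the following:
?geqpf (this routine) to factorize A*P = Q*R;
ormqr to compute C = Q^T*B (for real matrices);
unmqr to compute C = Q^H*B (for complex matrices);
trsm (a BLAS routine) to solve R*X = C.
(The columns of the computed X are the permuted least squares solution vectors x; the output array jpvt
specifies the permutation order.)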
?geqp3
Computes the QR factorization of a general m-by-n
matrix with column pivoting using level 3 BLAS.
Syntax
call sgeqp3(m, n, a, lda, jpvt, tau, work, lwork, info)
call dgeqp3(m, n, a, lda, jpvt, tau, work, lwork, info)
call cgeqp3(m, n, a, lda, jpvt, tau, work, lwork, rwork, info)
call zgeqp3(m, n, a, lda, jpvt, tau, work, lwork, rwork, info)
call geqp3(a, jpvt [,tau] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine forms the QR factorization of a general m-by-n matrix A with column pivoting: A*P = Q*R (see
Orthogonal Factorizations) using Level 3 BLAS. Here P denotes an n-by-n permutation matrix. Use this
routine instead of geqpf for better performance.
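One common use of the pivoted factorization A*P = Q*R is a numerical rank estimate read off the diagonal of R, since column pivoting orders the diagonal by non-increasing magnitude. The following is a minimal sketch of that idea, not part of the routine's specification: the program name, the 4-by-3 test matrix (whose third column is the sum of the first two), and the threshold tol are all assumptions. With these choices the program should report a rank of 2.

```fortran
program geqp3_rank_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 4, n = 3
  real(dp), parameter :: tol = 1.0e-10_dp      ! assumed relative threshold
  real(dp) :: a(m,n), tau(min(m,n)), wq(1)
  real(dp), allocatable :: work(:)
  integer :: jpvt(n), i, nrank, lwork, info
  external :: dgeqp3

  ! Illustrative rank-2 matrix: column 3 = column 1 + column 2.
  a(:,1) = [1.0_dp, 0.0_dp, 1.0_dp, 0.0_dp]
  a(:,2) = [0.0_dp, 1.0_dp, 0.0_dp, 1.0_dp]
  a(:,3) = a(:,1) + a(:,2)

  jpvt = 0                                     ! all columns are free columns

  ! Workspace query, then the pivoted factorization itself.
  call dgeqp3(m, n, a, m, jpvt, tau, wq, -1, info)
  lwork = max(1, int(wq(1)))
  allocate(work(lwork))
  call dgeqp3(m, n, a, m, jpvt, tau, work, lwork, info)
  if (info /= 0) error stop 'dgeqp3 failed'

  ! Count diagonal entries of R that are not negligible relative to R(1,1).
  nrank = 0
  do i = 1, min(m, n)
    if (abs(a(i,i)) > tol*abs(a(1,1))) nrank = nrank + 1
  end do

  print '(a,i0)', 'estimated rank = ', nrank
  print '(a,3i3)', 'column order jpvt = ', jpvt
end program geqp3_rank_sketch
```

Initializing jpvt to zero matters: any nonzero entry on input pins that column to the front of A*P and removes it from pivoting.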
Input Parameters
a, work REAL for sgeqp3

DOUBLE PRECISION for dgeqp3
COMPLEX for cgeqp3
DOUBLE COMPLEX for zgeqp3.
Arrays:
a (lda,*) contains the matrix A.
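lwork INTEGER. The size of the work array; must be at least max(1, 3*n+1) for
real flavors, and at least max(1, n+1) for complex flavors.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla. See Application Notes below for details.
jpvt INTEGER. Array, size at least max(1, n).
On entry, if jpvt(i) ≠ 0, the i-th column of A is moved to the beginning of
A*P before the computation, and fixed in place during the computation.
If jpvt(i) = 0, the i-th column of A is a free column (that is, it may be
interchanged during the computation with any other free column).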
rwork REAL for cgeqp3

DOUBLE PRECISION for zgeqp3.
A workspace array, size at least max(1, 2*n). Used in complex flavors only.
Output Parameters

tau REAL for sgeqp3

DOUBLE PRECISION for dgeqp3
COMPLEX for cgeqp3
DOUBLE COMPLEX for zgeqp3.
Array, size at least max(1, min(m, n)). Contains scalar factors of the
elementary reflectors for the matrix Q.
jpvt Overwritten by details of the permutation matrix P in the factorization A*P
= Q*R. More precisely, the columns of A*P are the columns of A in the
following order:
jpvt(1), jpvt(2), ..., jpvt(n).
info INTEGER.

Specific details for the routine geqp3 interface are the following:
jpvt Holds the vector of length n.
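Application Notes
To solve a set of least squares problems minimizing ||A*x - b||_2 for all columns b of a given matrix B, you
can call the following:
?geqp3 (this routine) to factorize A*P = Q*R;
ormqr to compute C = Q^T*B (for real matrices);
unmqr to compute C = Q^H*B (for complex matrices);
trsm (a BLAS routine) to solve R*X = C.
(The columns of the computed X are the permuted least squares solution vectors x; the output array jpvt
specifies the permutation order.)
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set
lwork = -1.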
?orgqr
Generates the real orthogonal matrix Q of the QR
factorization formed by ?geqrf.
Syntax
call sorgqr(m, n, k, a, lda, tau, work, lwork, info)
call dorgqr(m, n, k, a, lda, tau, work, lwork, info)
call orgqr(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates the whole or part of m-by-m orthogonal matrix Q of the QR factorization formed by
the routines geqrf or geqpf. Use this routine after a call to sgeqrf/dgeqrf or sgeqpf/dgeqpf.
Usually Q is determined from the QR factorization of an m by p matrix A with m ≥ p. To compute the whole
matrix Q, use:
call ?orgqr(m, m, p, a, lda, tau, work, lwork, info)
To compute the leading p columns of Q (which form an orthonormal basis in the space spanned by the
columns of A):
call ?orgqr(m, p, p, a, lda, tau, work, lwork, info)
To compute the matrix Qk of the QR factorization of the leading k columns of the matrix A:
call ?orgqr(m, m, k, a, lda, tau, work, lwork, info)
To compute the leading k columns of Qk (which form an orthonormal basis in the space spanned by the leading k
columns of the matrix A):
call ?orgqr(m, k, k, a, lda, tau, work, lwork, info)
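The second calling pattern above (leading p columns of Q) can be sketched as follows for the double precision flavor. The program name, the 5-by-3 random matrix, and the default-integer (LP64) interface are assumptions made for illustration. The program checks orthonormality by forming Q1^T*Q1 - I, whose largest entry should be on the order of the machine precision.

```fortran
program orgqr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 5, p = 3
  real(dp) :: a(m,p), tau(p), wq(1), g(p,p)
  real(dp), allocatable :: work(:)
  integer :: i, lwork, info
  external :: dgeqrf, dorgqr

  call random_number(a)

  ! Workspace queries for both routines; use the larger recommendation.
  call dgeqrf(m, p, a, m, tau, wq, -1, info)
  lwork = int(wq(1))
  call dorgqr(m, p, p, a, m, tau, wq, -1, info)
  lwork = max(1, lwork, int(wq(1)))
  allocate(work(lwork))

  ! Factorize A = Q*R, then overwrite a with the leading p columns of Q.
  call dgeqrf(m, p, a, m, tau, work, lwork, info)
  if (info /= 0) error stop 'dgeqrf failed'
  call dorgqr(m, p, p, a, m, tau, work, lwork, info)
  if (info /= 0) error stop 'dorgqr failed'

  ! Orthonormality check: G = Q1^T*Q1 - I should be close to zero.
  g = matmul(transpose(a), a)
  do i = 1, p
    g(i,i) = g(i,i) - 1.0_dp
  end do
  print '(a,es10.2)', 'max |Q1^T*Q1 - I| = ', maxval(abs(g))
end program orgqr_sketch
```

Because dorgqr overwrites a, copy R out of the upper triangle beforehand if it is still needed.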
Input Parameters
m INTEGER. The order of the orthogonal matrix Q (m ≥ 0).
n INTEGER. The number of columns of Q to be computed

(0 ≤ n ≤ m).
k INTEGER. The number of elementary reflectors whose product defines the
matrix Q (0 ≤ k ≤ n).
a, tau, work REAL for sorgqr

DOUBLE PRECISION for dorgqr
Arrays:
a(lda,*) and tau(*) are the arrays returned by sgeqrf / dgeqrf or
sgeqpf / dgeqpf.
The size of tau must be at least max(1, k).

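lwork INTEGER. The size of the work array.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.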
Output Parameters
a Overwritten by n leading columns of the m-by-m orthogonal matrix Q.

info INTEGER.
Specific details for the routine orgqr interface are the following:
tau Holds the vector of length (k).
Application Notes
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set
lwork = -1.
The computed Q differs from an exactly orthogonal matrix by a matrix E such that
||E||_2 = O(ε)*||A||_2, where ε is the machine precision.
The total number of floating-point operations is approximately 4*m*n*k - 2*(m + n)*k^2 + (4/3)*k^3.
If n = k, the number is approximately (2/3)*n^2*(3m - n).
The complex counterpart of this routine is ungqr.
?ormqr
Multiplies a real matrix by the orthogonal matrix Q of
the QR factorization formed by ?geqrf or ?geqpf.
Syntax
call sormqr(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call dormqr(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call ormqr(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a real matrix C by Q or Q^T, where Q is the orthogonal matrix Q of the QR factorization
formed by the routines geqrf or geqpf.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, Q^T*C,
C*Q, or C*Q^T (overwriting the result on C).
Input Parameters
side CHARACTER*1. Must be either 'L' or 'R'.

If side='L', Q or Q^T is applied to C from the left.
If side='R', Q or Q^T is applied to C from the right.
trans CHARACTER*1. Must be either 'N' or 'T'.

If trans='N', the routine multiplies C by Q.
If trans='T', the routine multiplies C by Q^T.
m INTEGER. The number of rows in the matrix C (m ≥ 0).
n INTEGER. The number of columns in C (n ≥ 0).
k INTEGER. The number of elementary reflectors whose product defines the
matrix Q. Constraints:
0 ≤ k ≤ m if side='L';
0 ≤ k ≤ n if side='R'.
a, tau, c, work REAL for sormqr

DOUBLE PRECISION for dormqr.
Arrays:
a(lda,*) and tau(*) are the arrays returned by sgeqrf / dgeqrf or
sgeqpf / dgeqpf.
The second dimension of a must be at least max(1, k).
c(ldc,*) contains the matrix C.
The second dimension of c must be at least max(1, n)
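lda INTEGER. The leading dimension of a. Constraints:

if side = 'L', lda ≥ max(1, m);
if side = 'R', lda ≥ max(1, n).
ldc INTEGER. The leading dimension of c. Constraint:

ldc ≥ max(1, m).
lwork INTEGER. The size of the work array. Constraints:

lwork ≥ max(1, n) if side = 'L';
lwork ≥ max(1, m) if side = 'R'.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.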
Output Parameters
c Overwritten by the product Q*C, Q^T*C, C*Q, or C*Q^T (as specified by side
and trans).

info INTEGER.

Specific details for the routine ormqr interface are the following:
a Holds the matrix A of size (r,k).

r = m if side = 'L'.
r = n if side = 'R'.
tau Holds the vector of length (k).
trans Must be 'N' or 'T'. The default value is 'N'.
Application Notes
For better performance, try using lwork = n*blocksize (if side = 'L') or lwork = m*blocksize (if
side = 'R') where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum
performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set
lwork = -1.
The complex counterpart of this routine is unmqr.
?gemqr
Multiplies a matrix C by a real orthogonal or complex
unitary matrix Q, as computed by ?geqr, with best
performance for tall and skinny matrices.
call sgemqr(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
call dgemqr(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
call cgemqr(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
call zgemqr(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
Description
The ?gemqr routine multiplies an m-by-n matrix C by Op(Q), where matrix Q is the factor from the QR
factorization of matrix A formed by ?geqr, and
Op(Q) = Q, or
Op(Q) = Q^T, or
Op(Q) = Q^H.
NOTE
You must use ?geqr for QR factorization before calling ?gemqr. ?gemqr is not compatible with QR
factorization routines other than ?geqr.
For real flavors, C is real and Q is real orthogonal.

For complex flavors, C is complex and Q is complex unitary.
If matrix A is tall and skinny, a highly scalable algorithm is used to avoid communication overhead.
Otherwise, ?ormqr or ?unmqr is used.
Input Parameters
side CHARACTER*1.
If side = 'L': apply Op(Q) from the left.
If side = 'R': apply Op(Q) from the right.
trans CHARACTER*1.
If trans = 'N': No transpose, Op(Q) = Q.
If trans = 'T': Transpose, Op(Q) = Q^T.
If trans = 'C': Conjugate transpose, Op(Q) = Q^H.
m INTEGER. The number of rows of the matrix C. m ≥ 0.
n INTEGER. The number of columns of the matrix C. n ≥ 0.
k INTEGER. The number of elementary reflectors whose product defines the
matrix Q.
If side = 'L', m ≥ k ≥ 0.
If side = 'R', n ≥ k ≥ 0.
a REAL for sgemqr

DOUBLE PRECISION for dgemqr
COMPLEX for cgemqr
COMPLEX*16 for zgemqr
Array, size (lda,k).
Part of the data structure to represent Q as returned by ?geqr.
lda INTEGER. The leading dimension of the array a.

If side = 'L', lda ≥ max(1,m).
If side = 'R', lda ≥ max(1,n).
t REAL for sgemqr

DOUBLE PRECISION for dgemqr
COMPLEX for cgemqr
COMPLEX*16 for zgemqr
Array, size (max(5,tsize)). Part of the data structure to represent Q as
returned by ?geqr.
tsize INTEGER. The size of the array t. tsize ≥ 5.
c REAL for sgemqr

DOUBLE PRECISION for dgemqr
COMPLEX for cgemqr
COMPLEX*16 for zgemqr
Array of size (ldc,n). On entry, contains the m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c. ldc ≥ max(1,m).
lwork INTEGER. The size of the array work.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array and returns this value as
work(1); no error message related to lwork is issued by xerbla.
Output Parameters
c On exit, c is overwritten by Op(Q)*C or C*Op(Q).
work REAL for sgemqr

DOUBLE PRECISION for dgemqr
COMPLEX for cgemqr
COMPLEX*16 for zgemqr
Workspace array of size (max(1,lwork)).
info INTEGER.
?ungqr
Generates the complex unitary matrix Q of the QR
factorization formed by ?geqrf.
Syntax
call cungqr(m, n, k, a, lda, tau, work, lwork, info)
call zungqr(m, n, k, a, lda, tau, work, lwork, info)
call ungqr(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates the whole or part of m-by-m unitary matrix Q of the QR factorization formed by the
routines geqrf or geqpf. Use this routine after a call to cgeqrf/zgeqrf or cgeqpf/zgeqpf.
Usually Q is determined from the QR factorization of an m by p matrix A with m ≥ p. To compute the whole
matrix Q, use:
call ?ungqr(m, m, p, a, lda, tau, work, lwork, info)
To compute the leading p columns of Q (which form an orthonormal basis in the space spanned by the
columns of A):
call ?ungqr(m, p, p, a, lda, tau, work, lwork, info)
To compute the matrix Qk of the QR factorization of the leading k columns of the matrix A:
call ?ungqr(m, m, k, a, lda, tau, work, lwork, info)
To compute the leading k columns of Qk (which form an orthonormal basis in the space spanned by the
leading k columns of the matrix A):
call ?ungqr(m, k, k, a, lda, tau, work, lwork, info)
Input Parameters
m INTEGER. The order of the unitary matrix Q (m ≥ 0).
n INTEGER. The number of columns of Q to be computed

(0 ≤ n ≤ m).
k INTEGER. The number of elementary reflectors whose product defines the
matrix Q (0 ≤ k ≤ n).
a, tau, work COMPLEX for cungqr

DOUBLE COMPLEX for zungqr
Arrays: a(lda,*) and tau(*) are the arrays returned by cgeqrf/zgeqrf or
cgeqpf/zgeqpf.
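lwork INTEGER. The size of the work array.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.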
Output Parameters
a Overwritten by n leading columns of the m-by-m unitary matrix Q.

info INTEGER.

Specific details for the routine ungqr interface are the following:
Application Notes
For better performance, try using lwork = n*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.
In the first case the routine completes the task, though probably not so fast as with a recommended workspace,
and provides the recommended workspace in the first element of the corresponding array work on exit. Use
this value (work(1)) for subsequent runs.
If lwork = -1, then the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array (work). This operation is called a workspace query.
Note that if lwork is less than the minimal required value and is not equal to -1, then the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The computed Q differs from an exactly unitary matrix by a matrix E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
The total number of floating-point operations is approximately 16*m*n*k - 8*(m + n)*k^2 + (16/3)*k^3.
If n = k, the number is approximately (8/3)*n^2*(3m - n).
The real counterpart of this routine is orgqr.
?unmqr
Multiplies a complex matrix by the unitary matrix Q of
the QR factorization formed by ?geqrf.
Syntax
call cunmqr(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call zunmqr(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call unmqr(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a rectangular complex matrix C by Q or Q^H, where Q is the unitary matrix Q of the QR
factorization formed by the routines geqrf or geqpf.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, Q^H*C,
C*Q, or C*Q^H (overwriting the result on C).
Input Parameters

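side CHARACTER*1. Must be either 'L' or 'R'.

If side = 'L', Q or Q^H is applied to C from the left.
If side = 'R', Q or Q^H is applied to C from the right.
trans CHARACTER*1. Must be either 'N' or 'C'.

If trans = 'N', the routine multiplies C by Q.
If trans = 'C', the routine multiplies C by Q^H.
m INTEGER. The number of rows in the matrix C (m ≥ 0).
n INTEGER. The number of columns in C (n ≥ 0).
k INTEGER. The number of elementary reflectors whose product defines the
matrix Q. Constraints:
0 ≤ k ≤ m if side = 'L';
0 ≤ k ≤ n if side = 'R'.
a, c, tau, work COMPLEX for cunmqr

DOUBLE COMPLEX for zunmqr.
Arrays:
a(lda,*) and tau(*) are the arrays returned by cgeqrf / zgeqrf or
cgeqpf / zgeqpf.
c(ldc,*) contains the m-by-n matrix C.
lda INTEGER. The leading dimension of a. Constraints:

lda ≥ max(1, m) if side = 'L';
lda ≥ max(1, n) if side = 'R'.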
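ldc INTEGER. The leading dimension of c. Constraint:

ldc ≥ max(1, m).
lwork INTEGER. The size of the work array.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
See Application Notes for the suggested value of lwork.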
Output Parameters
c Overwritten by the product Q*C, Q^H*C, C*Q, or C*Q^H (as specified by side
and trans).

info INTEGER.

Specific details for the routine unmqr interface are the following:

Application Notes
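For better performance, try using lwork = n*blocksize (if side = 'L') or lwork = m*blocksize (if side
= 'R') where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum
performance of the blocked algorithm.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run, or set lwork
= -1.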
The real counterpart of this routine is ormqr.
?gelqf
Computes the LQ factorization of a general m-by-n
matrix.
Syntax
call sgelqf(m, n, a, lda, tau, work, lwork, info)
call dgelqf(m, n, a, lda, tau, work, lwork, info)
call cgelqf(m, n, a, lda, tau, work, lwork, info)
call zgelqf(m, n, a, lda, tau, work, lwork, info)
call gelqf(a [, tau] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine forms the LQ factorization of a general m-by-n matrix A (see Orthogonal Factorizations). No
pivoting is performed.
NOTE
This routine supports the Progress Routine feature. See Progress Function section for details.
Input Parameters
a, work REAL for sgelqf

DOUBLE PRECISION for dgelqf
COMPLEX for cgelqf
DOUBLE COMPLEX for zgelqf.
Arrays:
Array a(lda,*) contains the matrix A.
lwork INTEGER. The size of the work array; at least max(1, m).
xerbla.
Output Parameters

a Overwritten by the factorization data as follows:
the elements on and below the diagonal of the array contain the m-by-min(m,n)
lower trapezoidal matrix L (L is lower triangular if m ≤ n); the
elements above the diagonal, with the array tau, represent the orthogonal
matrix Q as a product of elementary reflectors.
tau REAL for sgelqf

DOUBLE PRECISION for dgelqf
COMPLEX for cgelqf
DOUBLE COMPLEX for zgelqf.
Array, size at least max(1, min(m, n)).
Contains scalars that define elementary reflectors for the matrix Q (see
Orthogonal Factorizations).

info INTEGER.

Specific details for the routine gelqf interface are the following:
tau Holds the vector of length min(m,n).
Application Notes
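For better performance, try using lwork = m*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
To query the recommended workspace size, set lwork = -1.
The computed factorization is the exact factorization of a matrix A + E, where
||E||_2 = O(ε) ||A||_2 and ε is the machine precision.
The approximate number of floating-point operations for real flavors is:
(4/3)n^3 if m = n,
(2/3)n^2(3m-n) if m > n,
(2/3)m^2(3n-m) if m < n.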

To find the minimum-norm solution of an underdetermined least squares problem minimizing ||A*x - b||_2
for all columns b of a given matrix B, you can call the following:
?gelqf (this routine) to factorize A = L*Q;
trsm (a BLAS routine) to solve L*Y = B for Y;
ormlq to compute X = (Q1)^T*Y (for real matrices);
unmlq to compute X = (Q1)^H*Y (for complex matrices).
(The columns of the computed X are the minimum-norm solution vectors x. Here A is an m-by-n matrix with
m < n; Q1 denotes the first m rows of Q).
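The call sequence above can be sketched for the double-precision real case as follows. This is a minimal illustration, not a reference implementation: the 2-by-3 matrix, the right-hand side, and all variable names are made up for the example, and the program assumes it is linked against a library that provides the FORTRAN 77 LAPACK and BLAS interfaces (for example, Intel MKL). The solution X is obtained by applying Q^T to Y padded with n - m zero rows, which is equivalent to (Q1)^T*Y.

```fortran
program min_norm_lq
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Illustrative sizes (assumption): m < n, one right-hand side.
  integer, parameter :: m = 2, n = 3, nrhs = 1
  real(dp) :: a(m, n), a0(m, n), b(m, nrhs), x(n, nrhs)
  real(dp) :: tau(m), wq(1)
  real(dp), allocatable :: work(:)
  integer :: lwork, info
  external :: dgelqf, dormlq, dtrsm

  ! Made-up full-row-rank test data.
  a(1, :) = [1.0_dp, 1.0_dp, 0.0_dp]
  a(2, :) = [0.0_dp, 1.0_dp, 1.0_dp]
  a0 = a
  b(:, 1) = [1.0_dp, 1.0_dp]

  ! Workspace queries (lwork = -1) for both LAPACK routines.
  call dgelqf(m, n, a, m, tau, wq, -1, info)
  lwork = max(1, int(wq(1)))
  call dormlq('L', 'T', n, nrhs, m, a, m, tau, x, n, wq, -1, info)
  lwork = max(lwork, int(wq(1)))
  allocate(work(lwork))

  ! Step 1: factorize A = L*Q.
  call dgelqf(m, n, a, m, tau, work, lwork, info)
  if (info /= 0) stop 'dgelqf failed'

  ! Step 2: solve L*Y = B; Y is stored in the first m rows of x.
  x = 0.0_dp
  x(1:m, :) = b
  call dtrsm('L', 'L', 'N', 'N', m, nrhs, 1.0_dp, a, m, x, n)

  ! Step 3: X = Q^T * [Y; 0], which equals (Q1)^T * Y.
  call dormlq('L', 'T', n, nrhs, m, a, m, tau, x, n, work, lwork, info)
  if (info /= 0) stop 'dormlq failed'

  print '(a, 3f10.6)', 'x            = ', x(:, 1)
  print '(a, es10.2)', 'max |A*x - b| = ', maxval(abs(matmul(a0, x) - b))
end program min_norm_lq
```

For a consistent system such as this one, the printed residual should be at the level of rounding error.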
To compute the elements of Q explicitly, call:
orglq (for real matrices)
unglq (for complex matrices).
See Also
mkl_progress
?gelq
Computes an LQ factorization of a general matrix.
call sgelq(m, n, a, lda, t, tsize, work, lwork, info)
call dgelq(m, n, a, lda, t, tsize, work, lwork, info)
call cgelq(m, n, a, lda, t, tsize, work, lwork, info)
call zgelq(m, n, a, lda, t, tsize, work, lwork, info)
Description
The ?gelq routine computes an LQ factorization of an m-by-n matrix A. If the matrix is short and wide (n is
substantially larger than m), a highly scalable algorithm is used to avoid communication overhead.
NOTE
The internal format of the elementary reflectors generated by ?gelq is only compatible with the
?gemlq routine and not any other LQ routines.
NOTE
An optimized version of ?gelq is not available.
Input Parameters
n INTEGER. The number of columns of the matrix A. n ≥ 0.
a REAL for sgelq
DOUBLE PRECISION for dgelq
COMPLEX for cgelq
DOUBLE COMPLEX for zgelq
Array, size (lda, n). Contains the m-by-n matrix A.
lda INTEGER. The leading dimension of the array a. lda ≥ max(1, m).
tsize INTEGER. If tsize ≥ 5, the size of the array t. If tsize = -1 or tsize = -2,
then the routine performs a workspace query. The routine calculates the
sizes required for the t and work arrays and returns these values as the
first entries of the t and work arrays, without issuing any error message
related to t or work by xerbla.
If tsize = -1, the routine calculates the optimal size of t for optimum
performance and returns this value in t(1).
If tsize = -2, the routine calculates the minimum size required for t and
returns this value in t(1).
lwork INTEGER. The size of the array work. If lwork = -1 or lwork = -2, then the
routine performs a workspace query. The routine only calculates the sizes of
the t and work arrays and returns these values as the first entries of the t
and work arrays, without issuing any error message related to t or work by
xerbla.
If lwork = -1, the routine calculates the optimal size of work for optimum
performance and returns this value in work(1).
If lwork = -2, the routine calculates the minimum size required for work
and returns this value in work(1).
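The two-step pattern described for tsize and lwork can be sketched as below for dgelq. This is an illustrative outline under stated assumptions: the matrix sizes and variable names are invented, the matrix is filled with random numbers, and the program must be linked against a LAPACK implementation that provides dgelq. The first call only queries the sizes; the second call performs the factorization.

```fortran
program gelq_workspace_query
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Illustrative short-and-wide shape (assumption).
  integer, parameter :: m = 4, n = 64
  real(dp) :: a(m, n), tq(5), wq(1)
  real(dp), allocatable :: t(:), work(:)
  integer :: tsize, lwork, info
  external :: dgelq

  call random_number(a)

  ! Query call: tsize = -1 and lwork = -1 request the optimal sizes,
  ! which come back in tq(1) and wq(1). The t argument must still
  ! have at least 5 entries.
  call dgelq(m, n, a, m, tq, -1, wq, -1, info)
  if (info /= 0) stop 'dgelq workspace query failed'
  tsize = max(5, int(tq(1)))
  lwork = max(1, int(wq(1)))
  allocate(t(tsize), work(lwork))

  ! Actual factorization using the sizes reported by the query.
  call dgelq(m, n, a, m, t, tsize, work, lwork, info)
  print *, 'tsize =', tsize, ' lwork =', lwork, ' info =', info
end program gelq_workspace_query
```

Passing -2 instead of -1 in the query call would request the minimum sizes rather than the optimal ones. Keep both a and t after the factorization, because ?gemlq needs them to apply Q.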
Output Parameters
a REAL for sgelq
DOUBLE PRECISION for dgelq
COMPLEX for cgelq
DOUBLE COMPLEX for zgelq
The elements on and below the diagonal of the array contain the lower
trapezoidal matrix L (L is lower triangular if m ≤ n). The elements above the
diagonal are used to store part of the internal data structure representing
Q.
t REAL for sgelq
DOUBLE PRECISION for dgelq
COMPLEX for cgelq
DOUBLE COMPLEX for zgelq
Array, size (max(5,tsize)).
If info = 0, t(1) returns the optimal value for tsize. You can specify that
it return the minimum required value for tsize instead - see the tsize
description for details. The remaining entries of t contain part of the data
structure used to represent Q. To apply or construct Q, you need to retain a
and t and pass them to other routines.
work REAL for sgelq
DOUBLE PRECISION for dgelq
COMPLEX for cgelq
DOUBLE COMPLEX for zgelq
If info = 0, work(1) contains the optimal value for lwork. You can specify
that it return the minimum required value for lwork instead - see the
lwork description for details.
info INTEGER.
See Also
?gemlq Multiplies a matrix C by a real orthogonal or complex unitary matrix Q, as computed by
?gelq.
?orglq
Generates the real orthogonal matrix Q of the LQ
factorization formed by ?gelqf.
Syntax
call sorglq(m, n, k, a, lda, tau, work, lwork, info)
call dorglq(m, n, k, a, lda, tau, work, lwork, info)
call orglq(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates the whole or part of the n-by-n orthogonal matrix Q of the LQ factorization formed by the
routine gelqf. Use this routine after a call to sgelqf/dgelqf.
Usually Q is determined from the LQ factorization of a p-by-n matrix A with n ≥ p. To compute the whole
matrix Q, use:
call ?orglq(n, n, p, a, lda, tau, work, lwork, info)
To compute the leading p rows of Q, which form an orthonormal basis in the space spanned by the rows of A,
use:
call ?orglq(p, n, p, a, lda, tau, work, lwork, info)
To compute the matrix Qk of the LQ factorization of the leading k rows of A, use:
call ?orglq(n, n, k, a, lda, tau, work, lwork, info)
To compute the leading k rows of Qk, which form an orthonormal basis in the space spanned by the leading k
rows of A, use:
call ?orglq(k, n, k, a, lda, tau, work, lwork, info)
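As an illustration of the second call pattern above (the leading p rows of Q), the following sketch factorizes a random p-by-n matrix with dgelqf, forms the p rows of Q with dorglq, and checks that they are orthonormal. The sizes, the random test matrix, and the variable names are assumptions made for the example; the program needs to be linked against a LAPACK implementation.

```fortran
program orglq_leading_rows
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Illustrative sizes (assumption): p <= n.
  integer, parameter :: p = 3, n = 5
  real(dp) :: a(p, n), qqt(p, p), tau(p), wq(1)
  real(dp), allocatable :: work(:)
  integer :: i, lwork, info
  external :: dgelqf, dorglq

  call random_number(a)

  ! Workspace queries; use the larger of the two recommended sizes.
  call dgelqf(p, n, a, p, tau, wq, -1, info)
  lwork = max(1, int(wq(1)))
  call dorglq(p, n, p, a, p, tau, wq, -1, info)
  lwork = max(lwork, int(wq(1)))
  allocate(work(lwork))

  ! LQ factorization: reflectors are stored above the diagonal of a.
  call dgelqf(p, n, a, p, tau, work, lwork, info)
  if (info /= 0) stop 'dgelqf failed'

  ! Overwrite a with the leading p rows of Q.
  call dorglq(p, n, p, a, p, tau, work, lwork, info)
  if (info /= 0) stop 'dorglq failed'

  ! Orthonormal rows imply Q*Q^T = I (p-by-p).
  qqt = matmul(a, transpose(a))
  do i = 1, p
    qqt(i, i) = qqt(i, i) - 1.0_dp
  end do
  print '(a, es10.2)', 'max |Q*Q^T - I| = ', maxval(abs(qqt))
end program orglq_leading_rows
```

Note that dorglq overwrites the factor L stored in a, so copy L out first if it is still needed.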
Input Parameters
m INTEGER. The number of rows of Q to be computed
(0 ≤ m ≤ n).
n INTEGER. The order of the orthogonal matrix Q (n ≥ m).
k INTEGER. The number of elementary reflectors whose product defines the
matrix Q (0 ≤ k ≤ m).
a, tau, work REAL for sorglq

DOUBLE PRECISION for dorglq
Arrays: a(lda,*) and tau(*) are the arrays returned by sgelqf/dgelqf.

xerbla.
Output Parameters
a Overwritten by m leading rows of the n-by-n orthogonal matrix Q.

info INTEGER.

Specific details for the routine orglq interface are the following:
Application Notes
To query the recommended workspace size, set lwork = -1.
The computed Q differs from an exactly orthogonal matrix by a matrix E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
If m = k, the number of floating-point operations is approximately (2/3)*m^2*(3n - m).
The complex counterpart of this routine is unglq.
?ormlq
Multiplies a real matrix by the orthogonal matrix Q of
the LQ factorization formed by ?gelqf.
Syntax
call sormlq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call dormlq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call ormlq(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a real m-by-n matrix C by Q or Q^T, where Q is the orthogonal matrix of the LQ
factorization formed by the routine gelqf.
Input Parameters

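side CHARACTER*1. Must be either 'L' or 'R'.
If side = 'L', Q or Q^T is applied to C from the left.
If side = 'R', Q or Q^T is applied to C from the right.
trans CHARACTER*1. Must be either 'N' or 'T'.
If trans = 'N', the routine multiplies C by Q.
If trans = 'T', the routine multiplies C by Q^T.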

0 ≤ k ≤ m if side = 'L';
0 ≤ k ≤ n if side = 'R'.
a, c, tau, work REAL for sormlq

DOUBLE PRECISION for dormlq.
Arrays:
a(lda,*) and tau(*) are arrays returned by ?gelqf.
The second dimension of a must be:

at least max(1, m) if side = 'L';
at least max(1, n) if side = 'R'.
The dimension of tau must be at least max(1, k).


lda INTEGER. The leading dimension of a; lda ≥ max(1, k).
ldc INTEGER. The leading dimension of c; ldc ≥ max(1, m).

xerbla.
Output Parameters
c Overwritten by the product Q*C, Q^T*C, C*Q, or C*Q^T (as specified by side
and trans).

info INTEGER.

Specific details for the routine ormlq interface are the following:
a Holds the matrix A of size (k,m).
Application Notes
If you set lwork = -1, the routine returns immediately and provides the recommended workspace in the first
element of the array work.
The complex counterpart of this routine is unmlq.
?gemlq
Multiplies a matrix C by a real orthogonal or complex
unitary matrix Q, as computed by ?gelq.
call sgemlq(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
call dgemlq(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
call cgemlq(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
call zgemlq(side, trans, m, n, k, a, lda, t, tsize, c, ldc, work, lwork, info)
Description
The ?gemlq routine multiplies an m-by-n matrix C by Op(Q), where matrix Q is the factor from the LQ
factorization of matrix A formed by ?gelq, and
Op(Q) = Q, or
Op(Q) = Q^T, or
Op(Q) = Q^H.
NOTE
You must use ?gelq for LQ factorization before calling ?gemlq. ?gemlq is not compatible with LQ
factorization routines other than ?gelq.
For real flavors, C is real and Q is real orthogonal.

For complex flavors, C is complex and Q is complex unitary.
If matrix A is short and wide, a highly scalable algorithm is used to avoid communication overhead.
Otherwise, ?ormlq or ?unmlq is used.
NOTE
An optimized version of ?gemlq is not available.
Input Parameters
side CHARACTER*1.
If side = 'L': apply Op(Q) from the left.
If side = 'R': apply Op(Q) from the right.
trans CHARACTER*1.
If trans = 'N': No transpose, Op(Q) = Q.
If trans = 'T': Transpose, Op(Q) = Q^T.
If trans = 'C': Conjugate transpose, Op(Q) = Q^H.
m INTEGER. The number of rows of the matrix A. m ≥ 0.
n INTEGER. The number of columns of the matrix C. n ≥ 0.

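k INTEGER. The number of elementary reflectors whose product defines the
matrix Q.
If side = 'L', m ≥ k ≥ 0.
If side = 'R', n ≥ k ≥ 0.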
a REAL for sgemlq
DOUBLE PRECISION for dgemlq
COMPLEX for cgemlq
DOUBLE COMPLEX for zgemlq
Array of size (lda,m) if side = 'L', or (lda,n) if side = 'R'.
Part of the data structure to represent Q as returned by ?gelq.
lda INTEGER. The leading dimension of the array a. lda ≥ max(1,k).
t REAL for sgemlq
DOUBLE PRECISION for dgemlq
COMPLEX for cgemlq
DOUBLE COMPLEX for zgemlq
Array, size (max(5,tsize)). Part of the data structure to represent Q as
returned by ?gelq.
tsize INTEGER. The size of the array t. tsize ≥ 5.
c REAL for sgemlq
DOUBLE PRECISION for dgemlq
COMPLEX for cgemlq
DOUBLE COMPLEX for zgemlq
Array, size (ldc,n). On entry, c contains the m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c. ldc ≥ max(1,m).
lwork INTEGER. The size of the array work. If lwork = -1, the routine performs a workspace query: it only
calculates the optimal size of the work array and returns this value as
work(1); no error message related to lwork is issued by xerbla.
Output Parameters
c On exit, c is overwritten by Op(Q)*C or C*Op(Q).
work REAL for sgemlq
DOUBLE PRECISION for dgemlq
COMPLEX for cgemlq
DOUBLE COMPLEX for zgemlq
Workspace array of size (max(1,lwork)).
info INTEGER.
?unglq
Generates the complex unitary matrix Q of the LQ
factorization formed by ?gelqf.
Syntax
call cunglq(m, n, k, a, lda, tau, work, lwork, info)
call zunglq(m, n, k, a, lda, tau, work, lwork, info)
call unglq(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates the whole or part of the n-by-n unitary matrix Q of the LQ factorization formed by the
routine gelqf. Use this routine after a call to cgelqf/zgelqf.
Usually Q is determined from the LQ factorization of a p-by-n matrix A with n ≥ p. To compute the whole
matrix Q, use:
call ?unglq(n, n, p, a, lda, tau, work, lwork, info)
To compute the leading p rows of Q, which form an orthonormal basis in the space spanned by the rows of A,
use:
call ?unglq(p, n, p, a, lda, tau, work, lwork, info)
To compute the matrix Qk of the LQ factorization of the leading k rows of A, use:
call ?unglq(n, n, k, a, lda, tau, work, lwork, info)
To compute the leading k rows of Qk, which form an orthonormal basis in the space spanned by the leading k
rows of A, use:
call ?unglq(k, n, k, a, lda, tau, work, lwork, info)
Input Parameters
m INTEGER. The number of rows of Q to be computed (0 ≤ m ≤ n).
n INTEGER. The order of the unitary matrix Q (n ≥ m).

k INTEGER. The number of elementary reflectors whose product defines the
matrix Q (0 ≤ k ≤ m).
a, tau, work COMPLEX for cunglq

DOUBLE COMPLEX for zunglq
Arrays: a(lda,*) and tau(*) are the arrays returned by cgelqf/zgelqf.

The dimension of tau must be at least max(1, k).
xerbla.
Output Parameters
a Overwritten by m leading rows of the n-by-n unitary matrix Q.

info INTEGER.

Specific details for the routine unglq interface are the following:
Application Notes
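For better performance, try using lwork = m*blocksize, where blocksize is a machine-dependent value
(typically, 16 to 64) required for optimum performance of the blocked algorithm.
To query the recommended workspace size, set lwork = -1.
The computed Q differs from an exactly unitary matrix by a matrix E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
If m = k, the number of floating-point operations is approximately (8/3)*m^2*(3n - m).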
The real counterpart of this routine is orglq.
?unmlq
Multiplies a complex matrix by the unitary matrix Q of
the LQ factorization formed by ?gelqf.
Syntax
call cunmlq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call zunmlq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call unmlq(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a complex m-by-n matrix C by Q or Q^H, where Q is the unitary matrix of the LQ
factorization formed by the routine gelqf.
Input Parameters



0 ≤ k ≤ m if side = 'L';
0 ≤ k ≤ n if side = 'R'.
a, c, tau, work COMPLEX for cunmlq

DOUBLE COMPLEX for zunmlq.
Arrays:
a(lda,*) and tau(*) are arrays returned by ?gelqf.
The second dimension of a must be:

at least max(1, m) if side = 'L';



xerbla.
Output Parameters
c Overwritten by the product Q*C, Q^H*C, C*Q, or C*Q^H (as specified by side
and trans).

info INTEGER.
Specific details for the routine unmlq interface are the following:
Application Notes
To query the recommended workspace size, set lwork = -1.
The real counterpart of this routine is ormlq.
?geqlf
Computes the QL factorization of a general m-by-n
matrix.
Syntax
call sgeqlf(m, n, a, lda, tau, work, lwork, info)
call dgeqlf(m, n, a, lda, tau, work, lwork, info)
call cgeqlf(m, n, a, lda, tau, work, lwork, info)
call zgeqlf(m, n, a, lda, tau, work, lwork, info)
call geqlf(a [, tau] [,info])
Include Files
mkl.fi, lapack.f90
Description
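The routine forms the QL factorization of a general m-by-n matrix A (see Orthogonal Factorizations). No
pivoting is performed.
NOTE
This routine supports the Progress Routine feature. See Progress Function section for details.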
Input Parameters
a, work REAL for sgeqlf

DOUBLE PRECISION for dgeqlf
COMPLEX for cgeqlf
DOUBLE COMPLEX for zgeqlf.
Arrays:
lwork INTEGER. The size of the work array; at least max(1, n).
xerbla.
Output Parameters
a Overwritten on exit by the factorization data as follows:

if m ≥ n, the lower triangle of the subarray a(m-n+1:m, 1:n) contains the
n-by-n lower triangular matrix L; if m ≤ n, the elements on and below the
(n-m)-th superdiagonal contain the m-by-n lower trapezoidal matrix L; in both
cases, the remaining elements, with the array tau, represent the
orthogonal/unitary matrix Q as a product of elementary reflectors.
tau REAL for sgeqlf

DOUBLE PRECISION for dgeqlf
COMPLEX for cgeqlf
DOUBLE COMPLEX for zgeqlf.
Array, size at least max(1, min(m, n)). Contains scalar factors of the
elementary reflectors for the matrix Q (see Orthogonal Factorizations).
work(1) If info = 0, on exit work(1) contains the minimum value of lwork
required for optimum performance.
info INTEGER.

Specific details for the routine geqlf interface are the following:
Application Notes
To query the recommended workspace size, set lwork = -1.
Related routines include:
orgql to generate matrix Q (for real matrices);
ungql to generate matrix Q (for complex matrices);
ormql to apply matrix Q (for real matrices);
unmql to apply matrix Q (for complex matrices).
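A possible way to combine ?geqlf with orgql is sketched below for the double-precision real case with m ≥ n. The example factorizes a random matrix, copies L out of the subarray a(m-n+1:m, 1:n), generates the last n columns of Q, and checks that their product reproduces the original matrix. Sizes, names, and the random test data are assumptions for illustration only; link against a LAPACK implementation to run it.

```fortran
program geqlf_orgql_check
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Illustrative sizes (assumption): m >= n.
  integer, parameter :: m = 5, n = 3
  real(dp) :: a(m, n), a0(m, n), lmat(n, n), tau(n), wq(1)
  real(dp), allocatable :: work(:)
  integer :: i, j, lwork, info
  external :: dgeqlf, dorgql

  call random_number(a)
  a0 = a

  ! Workspace queries; use the larger of the two recommended sizes.
  call dgeqlf(m, n, a, m, tau, wq, -1, info)
  lwork = max(1, int(wq(1)))
  call dorgql(m, n, n, a, m, tau, wq, -1, info)
  lwork = max(lwork, int(wq(1)))
  allocate(work(lwork))

  call dgeqlf(m, n, a, m, tau, work, lwork, info)
  if (info /= 0) stop 'dgeqlf failed'

  ! For m >= n, L is the lower triangle of a(m-n+1:m, 1:n).
  lmat = 0.0_dp
  do j = 1, n
    do i = j, n
      lmat(i, j) = a(m - n + i, j)
    end do
  end do

  ! Overwrite a with the last n columns of Q (k = n reflectors).
  call dorgql(m, n, n, a, m, tau, work, lwork, info)
  if (info /= 0) stop 'dorgql failed'

  ! The last n columns of Q times L should reproduce the input matrix.
  print '(a, es10.2)', 'max |Q*L - A| = ', maxval(abs(matmul(a, lmat) - a0))
end program geqlf_orgql_check
```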
See Also
mkl_progress
?orgql
Generates the real matrix Q of the QL factorization
formed by ?geqlf.
Syntax
call sorgql(m, n, k, a, lda, tau, work, lwork, info)
call dorgql(m, n, k, a, lda, tau, work, lwork, info)

call orgql(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates an m-by-n real matrix Q with orthonormal columns, which is defined as the last n
columns of a product of k elementary reflectors H(i) of order m: Q = H(k) *...* H(2)*H(1) as returned
by the routines geqlf. Use this routine after a call to sgeqlf/dgeqlf.
Input Parameters
m INTEGER. The number of rows of the matrix Q (m ≥ 0).
n INTEGER. The number of columns of the matrix Q (m ≥ n ≥ 0).

k INTEGER. The number of elementary reflectors whose product defines the
matrix Q (n ≥ k ≥ 0).
a, tau, work REAL for sorgql

DOUBLE PRECISION for dorgql
Arrays: a(lda,*), tau(*).
On entry, the (n - k + i)th column of a must contain the vector which
defines the elementary reflector H(i), for i = 1,2,...,k, as returned by
sgeqlf/dgeqlf in the last k columns of its array argument a; tau(i) must
contain the scalar factor of the elementary reflector H(i), as returned by
sgeqlf/dgeqlf;
xerbla.
Output Parameters
a Overwritten by the last n columns of the m-by-m orthogonal matrix Q.

info INTEGER.

Specific details for the routine orgql interface are the following:
Application Notes
To query the recommended workspace size, set lwork = -1.
The complex counterpart of this routine is ungql.
?ungql
Generates the complex matrix Q of the QL
factorization formed by ?geqlf.
Syntax
call cungql(m, n, k, a, lda, tau, work, lwork, info)
call zungql(m, n, k, a, lda, tau, work, lwork, info)
call ungql(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates an m-by-n complex matrix Q with orthonormal columns, which is defined as the last n
columns of a product of k elementary reflectors H(i) of order m: Q = H(k) *...* H(2)*H(1) as returned
by the routine geqlf. Use this routine after a call to cgeqlf/zgeqlf.
Input Parameters
m INTEGER. The number of rows of the matrix Q (m ≥ 0).
n INTEGER. The number of columns of the matrix Q (m ≥ n ≥ 0).

k INTEGER. The number of elementary reflectors whose product defines the
matrix Q (n ≥ k ≥ 0).
a, tau, work COMPLEX for cungql

DOUBLE COMPLEX for zungql
Arrays: a(lda,*), tau(*), work(lwork).
On entry, the (n - k + i)th column of a must contain the vector which
defines the elementary reflector H(i), for i = 1,2,...,k, as returned by
cgeqlf/zgeqlf in the last k columns of its array argument a;
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by cgeqlf/zgeqlf;

xerbla.
Output Parameters
a Overwritten by the last n columns of the m-by-m unitary matrix Q.

info INTEGER.

Specific details for the routine ungql interface are the following:
Application Notes
To query the recommended workspace size, set lwork = -1.
The real counterpart of this routine is orgql.
?ormql
Multiplies a real matrix by the orthogonal matrix Q of
the QL factorization formed by ?geqlf.
Syntax
call sormql(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call dormql(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call ormql(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a real m-by-n matrix C by Q or Q^T, where Q is the orthogonal matrix of the QL
factorization formed by the routine geqlf.
Depending on the parameters side and trans, the routine ormql can form one of the matrix products Q*C,
Q^T*C, C*Q, or C*Q^T (overwriting the result over C).
Input Parameters



0 ≤ k ≤ m if side = 'L';
0 ≤ k ≤ n if side = 'R'.
a, tau, c, work REAL for sormql

DOUBLE PRECISION for dormql.
Arrays: a(lda,*), tau(*), c(ldc,*).
On entry, the ith column of a must contain the vector which defines the
elementary reflector Hi, for i = 1,2,...,k, as returned by sgeqlf/dgeqlf in
the last k columns of its array argument a.
tau(i) must contain the scalar factor of the elementary reflector Hi, as
returned by sgeqlf/dgeqlf.

lda INTEGER. The leading dimension of a;

if side = 'L', lda ≥ max(1, m);
if side = 'R', lda ≥ max(1, n).

xerbla.
Output Parameters
c Overwritten by the product Q*C, Q^T*C, C*Q, or C*Q^T (as specified by side
and trans).

info INTEGER.

Specific details for the routine ormql interface are the following:

Application Notes
To query the recommended workspace size, set lwork = -1.
The complex counterpart of this routine is unmql.
?unmql
Multiplies a complex matrix by the unitary matrix Q of
the QL factorization formed by ?geqlf.
Syntax
call cunmql(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call zunmql(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call unmql(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a complex m-by-n matrix C by Q or Q^H, where Q is the unitary matrix of the QL
factorization formed by the routine geqlf.
Depending on the parameters side and trans, the routine unmql can form one of the matrix products Q*C,
Q^H*C, C*Q, or C*Q^H (overwriting the result over C).
Input Parameters



0 ≤ k ≤ m if side = 'L';
0 ≤ k ≤ n if side = 'R'.
a, tau, c, work COMPLEX for cunmql

DOUBLE COMPLEX for zunmql.
Arrays: a(lda,*), tau(*), c(ldc,*), work(lwork).
On entry, the i-th column of a must contain the vector which defines the
elementary reflector H(i), for i = 1,2,...,k, as returned by cgeqlf/zgeqlf
in the last k columns of its array argument a.
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by cgeqlf/zgeqlf.


xerbla.
Output Parameters
c Overwritten by the product Q*C, Q^H*C, C*Q, or C*Q^H (as specified by side
and trans).

info INTEGER.

Specific details for the routine unmql interface are the following:

side Must be 'L' or 'R'. The default value is 'L'.
Application Notes
To query the recommended workspace size, set lwork = -1.
The real counterpart of this routine is ormql.
?gerqf
Computes the RQ factorization of a general m-by-n
matrix.
Syntax
call sgerqf(m, n, a, lda, tau, work, lwork, info)
call dgerqf(m, n, a, lda, tau, work, lwork, info)
call cgerqf(m, n, a, lda, tau, work, lwork, info)
call zgerqf(m, n, a, lda, tau, work, lwork, info)
call gerqf(a [, tau] [,info])
Include Files
mkl.fi, lapack.f90
Description
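The routine forms the RQ factorization of a general m-by-n matrix A (see Orthogonal Factorizations). No
pivoting is performed.
NOTE
This routine supports the Progress Routine feature. See Progress Function section for details.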
Input Parameters
a, work REAL for sgerqf

DOUBLE PRECISION for dgerqf
COMPLEX for cgerqf
DOUBLE COMPLEX for zgerqf.
Arrays:
Array a(lda,*) contains the m-by-n matrix A.
lwork INTEGER. The size of the work array;

lwork ≥ max(1, m).
xerbla.
Output Parameters

a Overwritten on exit by the factorization data as follows:
if m ≤ n, the upper triangle of the subarray
a(1:m, n-m+1:n) contains the m-by-m upper triangular matrix R;
if m ≥ n, the elements on and above the (m-n)th subdiagonal contain the
m-by-n upper trapezoidal matrix R;
in both cases, the remaining elements, with the array tau, represent the
orthogonal/unitary matrix Q as a product of min(m,n) elementary
reflectors.
tau REAL for sgerqf

DOUBLE PRECISION for dgerqf
COMPLEX for cgerqf
DOUBLE COMPLEX for zgerqf.
Array, size at least max(1, min(m, n)). (See Orthogonal Factorizations.)
Contains scalar factors of the elementary reflectors for the matrix Q.

info INTEGER.

Specific details for the routine gerqf interface are the following:
Application Notes
To query the recommended workspace size, set lwork = -1.
Related routines include:
orgrq to generate matrix Q (for real matrices);
ungrq to generate matrix Q (for complex matrices);
ormrq to apply matrix Q (for real matrices);
unmrq to apply matrix Q (for complex matrices).
See Also
mkl_progress
?orgrq
Generates the real matrix Q of the RQ factorization
formed by ?gerqf.
Syntax
call sorgrq(m, n, k, a, lda, tau, work, lwork, info)
call dorgrq(m, n, k, a, lda, tau, work, lwork, info)
call orgrq(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates an m-by-n real matrix with orthonormal rows, which is defined as the last m rows of a
product of k elementary reflectors H(i) of order n: Q = H(1)*H(2)*...*H(k) as returned by the routines
gerqf. Use this routine after a call to sgerqf/dgerqf.
Input Parameters
m INTEGER. The number of rows of the matrix Q (m ≥ 0).
n INTEGER. The number of columns of the matrix Q (n ≥ m).

k INTEGER. The number of elementary reflectors whose product defines the
matrix Q (m ≥ k ≥ 0).
a, tau, work REAL for sorgrq

DOUBLE PRECISION for dorgrq
Arrays: a(lda,*), tau(*).
On entry, the (m - k + i)-th row of a must contain the vector which defines
the elementary reflector H(i), for i = 1,2,...,k, as returned by sgerqf/
dgerqf in the last k rows of its array argument a;
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by sgerqf/dgerqf.

xerbla.
Output Parameters
a Overwritten by the last m rows of the n-by-n orthogonal matrix Q.

info INTEGER.

Specific details for the routine orgrq interface are the following:
Application Notes
To query the recommended workspace size, set lwork = -1.
The complex counterpart of this routine is ungrq.
?ungrq
Generates the complex matrix Q of the RQ
factorization formed by ?gerqf.
Syntax
call cungrq(m, n, k, a, lda, tau, work, lwork, info)
call zungrq(m, n, k, a, lda, tau, work, lwork, info)
call ungrq(a, tau [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates an m-by-n complex matrix with orthonormal rows, which is defined as the last m rows
of a product of k elementary reflectors H(i) of order n: Q = H(1)^H*H(2)^H*...*H(k)^H as returned by the
routines gerqf. Use this routine after a call to cgerqf/zgerqf.
Input Parameters
m INTEGER. The number of rows of the matrix Q (m ≥ 0).
n INTEGER. The number of columns of the matrix Q (n ≥ m).

k INTEGER. The number of elementary reflectors whose product defines the
matrix Q (m ≥ k ≥ 0).
a, tau, work COMPLEX for cungrq

DOUBLE COMPLEX for zungrq
Arrays: a(lda,*), tau(*), work(lwork).
On entry, the (m - k + i)th row of a must contain the vector which defines
the elementary reflector H(i), for i = 1,2,...,k, as returned by cgerqf/
zgerqf in the last k rows of its array argument a;
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by cgerqf/zgerqf.

xerbla.
Output Parameters
a Overwritten by the m last rows of the n-by-n unitary matrix Q.

info INTEGER.

Specific details for the routine ungrq interface are the following:
Application Notes
To query the recommended workspace size, set lwork = -1.
The real counterpart of this routine is orgrq.
?ormrq
Multiplies a real matrix by the orthogonal matrix Q of
the RQ factorization formed by ?gerqf.
Syntax
call sormrq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call dormrq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call ormrq(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a real m-by-n matrix C by Q or Q^T, where Q is the real orthogonal matrix defined as a
product of k elementary reflectors Hi: Q = H1 H2 ... Hk as returned by the RQ factorization routine gerqf.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, Q^T*C,
C*Q, or C*Q^T (overwriting the result over C).
Input Parameters



0 ≤ k ≤ m, if side = 'L';
0 ≤ k ≤ n, if side = 'R'.
a, tau, c, work REAL for sormrq

DOUBLE PRECISION for dormrq.
On entry, the ith row of a must contain the vector which defines the
elementary reflector Hi, for i = 1,2,...,k, as returned by sgerqf/dgerqf in
the last k rows of its array argument a.
The second dimension of a must be at least max(1, m) if side = 'L', and
at least max(1, n) if side = 'R'.
tau(i) must contain the scalar factor of the elementary reflector Hi, as
returned by sgerqf/dgerqf.

xerbla.
Output Parameters
c Overwritten by the product Q*C, Q^T*C, C*Q, or C*Q^T (as specified by side
and trans).

info INTEGER.

Specific details for the routine ormrq interface are the following:
Application Notes
To query the recommended workspace size, set lwork = -1.
The complex counterpart of this routine is unmrq.
?unmrq
Multiplies a complex matrix by the unitary matrix Q of
the RQ factorization formed by ?gerqf.
Syntax
call cunmrq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call zunmrq(side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call unmrq(a, tau, c [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a complex m-by-n matrix C by Q or Q^H, where Q is the complex unitary matrix defined
as a product of k elementary reflectors H(i) of order n: Q = H(1)^H*H(2)^H*...*H(k)^H as returned by the
RQ factorization routine gerqf.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, Q^H*C,
C*Q, or C*Q^H (overwriting the result over C).
Input Parameters



a, tau, c, work COMPLEX for cunmrq

DOUBLE COMPLEX for zunmrq.
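On entry, the ith row of a must contain the vector which defines the
elementary reflector H(i), for i = 1,2,...,k, as returned by cgerqf/zgerqf in
the last k rows of its array argument a.
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by cgerqf/zgerqf.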

lda INTEGER. The leading dimension of a; lda ≥ max(1, k).

xerbla.
Output Parameters
c Overwritten by the product Q*C, Q^H*C, C*Q, or C*Q^H (as specified by side
and trans).

info INTEGER.

Specific details for the routine unmrq interface are the following:
Application Notes
To query the recommended workspace size, set lwork = -1.
The real counterpart of this routine is ormrq.
?tzrzf
Reduces the upper trapezoidal matrix A to upper
triangular form.
Syntax
call stzrzf(m, n, a, lda, tau, work, lwork, info)
call dtzrzf(m, n, a, lda, tau, work, lwork, info)
call ctzrzf(m, n, a, lda, tau, work, lwork, info)
call ztzrzf(m, n, a, lda, tau, work, lwork, info)
call tzrzf(a [, tau] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces the m-by-n (m ≤ n) real/complex upper trapezoidal matrix A to upper triangular form by
means of orthogonal/unitary transformations. The upper trapezoidal matrix
A = [A1 A2] = [A(1:m, 1:m), A(1:m, m+1:n)] is factored as
A = [R 0]*Z,
where Z is an n-by-n orthogonal/unitary matrix, R is an m-by-m upper triangular matrix, and 0 is the m-by-
(n-m) zero matrix.
See larz that applies an elementary reflector returned by ?tzrzf to a general matrix.
The ?tzrzf routine replaces the deprecated ?tzrqf routine.
Input Parameters
n INTEGER. The number of columns in A (n ≥ m).
a, work REAL for stzrzf

DOUBLE PRECISION for dtzrzf
COMPLEX for ctzrzf
DOUBLE COMPLEX for ztzrzf.
Arrays: a(lda,*),work(lwork).
The leading m-by-n upper trapezoidal part of the array a contains the
matrix A to be factorized.

lwork INTEGER. The size of the work array; lwork ≥ max(1, m).
xerbla.
Output Parameters

a Overwritten on exit by the factorization data as follows:
the leading m-by-m upper triangular part of a contains the upper triangular
matrix R, and elements m+1 to n of the first m rows of a, with the array
tau, represent the orthogonal matrix Z as a product of m elementary
reflectors.
tau REAL for stzrzf

DOUBLE PRECISION for dtzrzf
COMPLEX for ctzrzf
DOUBLE COMPLEX for ztzrzf.
Array, size at least max(1, m). Contains scalar factors of the elementary
reflectors for the matrix Z.

info INTEGER.

Specific details for the routine tzrzf interface are the following:
tau Holds the vector of length (m).
Application Notes
The factorization is obtained by Householder's method. The k-th transformation matrix, Z(k), which is used
to introduce zeros into the (m - k + 1)-th row of A, is given in the form
Z(k) = [ I 0 ; 0 T(k) ] (semicolons separate block rows),
where for real flavors
T(k) = I - tau*u(k)*u(k)^T, u(k) = [ 1 ; 0 ; z(k) ],
and for complex flavors
T(k) = I - tau*u(k)*u(k)^H, u(k) = [ 1 ; 0 ; z(k) ],
tau is a scalar and z(k) is an l-element vector. tau and z(k) are chosen to annihilate the elements of the k-th
row of A2.
The scalar tau is returned in the k-th element of tau and the vector u(k) in the k-th row of A, such that the
elements of z(k) are stored in a(k, m+1), ..., a(k, n).
The elements of R are returned in the upper triangular part of A.

The matrix Z is given by
Z = Z(1)*Z(2)*...*Z(m).
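The relation A = [R 0]*Z can be checked numerically with a sketch like the one below, which reduces a random upper trapezoidal matrix with dtzrzf and then multiplies [R 0] by Z from the right using dormrz. It assumes that the matrix applied by dormrz with trans = 'N' is the same Z that dtzrzf produces, with k = m reflectors and l = n - m meaningful columns. Sizes, names, and test data are invented for the example, and a LAPACK implementation must be linked.

```fortran
program tzrzf_check
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Illustrative sizes (assumption): m <= n.
  integer, parameter :: m = 3, n = 6
  real(dp) :: a(m, n), a0(m, n), c(m, n), tau(m), wq(1)
  real(dp), allocatable :: work(:)
  integer :: i, j, lwork, info
  external :: dtzrzf, dormrz

  ! Build an upper trapezoidal test matrix (zeros below the diagonal).
  call random_number(a)
  do j = 1, m
    do i = j + 1, m
      a(i, j) = 0.0_dp
    end do
  end do
  a0 = a

  ! Workspace queries; use the larger of the two recommended sizes.
  call dtzrzf(m, n, a, m, tau, wq, -1, info)
  lwork = max(1, int(wq(1)))
  call dormrz('R', 'N', m, n, m, n - m, a, m, tau, c, m, wq, -1, info)
  lwork = max(lwork, int(wq(1)))
  allocate(work(lwork))

  call dtzrzf(m, n, a, m, tau, work, lwork, info)
  if (info /= 0) stop 'dtzrzf failed'

  ! c = [R 0]: copy the upper triangle of the leading m-by-m block.
  c = 0.0_dp
  do j = 1, m
    c(1:j, j) = a(1:j, j)
  end do

  ! c := c*Z, using the reflectors stored in a(:, m+1:n) and tau.
  call dormrz('R', 'N', m, n, m, n - m, a, m, tau, c, m, work, lwork, info)
  if (info /= 0) stop 'dormrz failed'

  print '(a, es10.2)', 'max |[R 0]*Z - A| = ', maxval(abs(c - a0))
end program tzrzf_check
```

If the assumption about Z holds, the printed difference should be at the level of rounding error.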
To query the recommended workspace size, set lwork = -1.
Related routines include:
ormrz to apply matrix Q (for real matrices)
unmrz to apply matrix Q (for complex matrices).
?ormrz
Multiplies a real matrix by the orthogonal matrix
defined from the factorization formed by ?tzrzf.
Syntax
call sormrz(side, trans, m, n, k, l, a, lda, tau, c, ldc, work, lwork, info)
call dormrz(side, trans, m, n, k, l, a, lda, tau, c, ldc, work, lwork, info)
call ormrz(a, tau, c, l [, side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The ?ormrz routine multiplies a real m-by-n matrix C by Q or Q^T, where Q is the real orthogonal matrix
defined as a product of k elementary reflectors H(i) of order n: Q = H(1)*H(2)*...*H(k) as returned by
the factorization routine tzrzf.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, Q^T*C,
C*Q, or C*Q^T (overwriting the result over C).
The matrix Q is of order m if side = 'L' and of order n if side = 'R'.
The ?ormrz routine replaces the deprecated ?latzm routine.
Input Parameters



l INTEGER.
The number of columns of the matrix A containing the meaningful part of
the Householder reflectors. Constraints:
0 ≤ l ≤ m, if side = 'L';
0 ≤ l ≤ n, if side = 'R'.
a, tau, c, work REAL for sormrz

DOUBLE PRECISION for dormrz.
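On entry, the ith row of a must contain the vector which defines the
elementary reflector H(i), for i = 1,2,...,k, as returned by stzrzf/dtzrzf
in the last k rows of its array argument a.
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by stzrzf/dtzrzf.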

lda INTEGER. The leading dimension of a; lda ≥ max(1, k).

xerbla.
Output Parameters
c Overwritten by the product Q*C, Q^T*C, C*Q, or C*Q^T (as specified by side
and trans).

info INTEGER.

Specific details for the routine ormrz interface are the following:
Application Notes
To query the recommended workspace size, set lwork = -1.
The complex counterpart of this routine is unmrz.
?unmrz
Multiplies a complex matrix by the unitary matrix
defined from the factorization formed by ?tzrzf.
Syntax
call cunmrz(side, trans, m, n, k, l, a, lda, tau, c, ldc, work, lwork, info)
call zunmrz(side, trans, m, n, k, l, a, lda, tau, c, ldc, work, lwork, info)
call unmrz(a, tau, c, l [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a complex m-by-n matrix C by Q or Q^H, where Q is the unitary matrix defined as a
product of k elementary reflectors H(i):
Q = H(1)^H*H(2)^H*...*H(k)^H as returned by the factorization routine tzrzf.
Depending on the parameters side and trans, the routine can form one of the matrix products Q*C, Q^H*C,
C*Q, or C*Q^H (overwriting the result over C).
The matrix Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters



l INTEGER.
The number of columns of the matrix A containing the meaningful part of
the Householder reflectors. Constraints:
0 ≤ l ≤ m, if side = 'L';
0 ≤ l ≤ n, if side = 'R'.
a, tau, c, work COMPLEX for cunmrz

DOUBLE COMPLEX for zunmrz.
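On entry, the ith row of a must contain the vector which defines the
elementary reflector H(i), for i = 1,2,...,k, as returned by ctzrzf/ztzrzf
in the last k rows of its array argument a.
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by ctzrzf/ztzrzf.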


xerbla.
Output Parameters
and trans).

info INTEGER.
If info = -i, the ith parameter had an illegal value.

Specific details for the routine unmrz interface are the following:
Application Notes
= -1.
The real counterpart of this routine is ormrz.
?ggqrf
Computes the generalized QR factorization of two
matrices.
Syntax
call sggqrf(n, m, p, a, lda, taua, b, ldb, taub, work, lwork, info)
call dggqrf(n, m, p, a, lda, taua, b, ldb, taub, work, lwork, info)
call cggqrf(n, m, p, a, lda, taua, b, ldb, taub, work, lwork, info)
call zggqrf(n, m, p, a, lda, taua, b, ldb, taub, work, lwork, info)
call ggqrf(a, b [,taua] [,taub] [,info])
Include Files
mkl.fi, lapack.f90
Description
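The routine forms the generalized QR factorization of an n-by-m matrix A and an n-by-p matrix B as A =
Q*R, B = Q*T*Z, where Q is an n-by-n orthogonal/unitary matrix, Z is a p-by-p orthogonal/unitary matrix,
and R and T assume one of the forms:
$$R = \begin{pmatrix} R_{11} \\ 0 \end{pmatrix} \ \text{if } n \ge m, \quad \text{or} \quad R = \begin{pmatrix} R_{11} & R_{12} \end{pmatrix} \ \text{if } n < m,$$
where R11 is upper triangular, and
$$T = \begin{pmatrix} 0 & T_{12} \end{pmatrix} \ \text{if } n \le p, \quad \text{or} \quad T = \begin{pmatrix} T_{11} \\ T_{21} \end{pmatrix} \ \text{if } n > p,$$
where T12 (n-by-n) or T21 (p-by-p) is an upper triangular matrix.
In particular, if B is square and nonsingular, the GQR factorization of A and B implicitly gives the QR
factorization of B^{-1}*A as:
B^{-1}*A = Z^T*(T^{-1}*R) (for real flavors) or B^{-1}*A = Z^H*(T^{-1}*R) (for complex flavors).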
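The identity for the square, nonsingular case follows from substituting the two factor equations A = Q*R and B = Q*T*Z. A short derivation for real flavors is sketched below; it assumes only that Q and Z are orthogonal and that T is invertible (for complex flavors, replace the transpose by the conjugate transpose):

```latex
\begin{aligned}
B^{-1}A &= (Q\,T\,Z)^{-1}\,(Q\,R) \\
        &= Z^{-1}\,T^{-1}\,Q^{-1}\,Q\,R \\
        &= Z^{T}\,(T^{-1}R), \qquad \text{since } Z^{-1} = Z^{T}.
\end{aligned}
```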
Input Parameters
n INTEGER. The number of rows of the matrices A and B (n ≥ 0).
m INTEGER. The number of columns in A (m ≥ 0).
p INTEGER. The number of columns in B (p ≥ 0).
a, b, work REAL for sggqrf

DOUBLE PRECISION for dggqrf
COMPLEX for cggqrf
DOUBLE COMPLEX for zggqrf.
Arrays: a(lda,*) contains the matrix A.
The second dimension of a must be at least max(1, m).
b(ldb,*) contains the matrix B.
The second dimension of b must be at least max(1, p).
ldb INTEGER. The leading dimension of b; at least max(1, n).
lwork INTEGER. The size of the work array; must be at least max(1, n, m, p).
xerbla.
Output Parameters
a, b Overwritten by the factorization data as follows:

on exit, the elements on and above the diagonal of the array a contain the
min(n,m)-by-m upper trapezoidal matrix R (R is upper triangular if n ≥ m); the
elements below the diagonal, with the array taua, represent the orthogonal/
unitary matrix Q as a product of min(n,m) elementary reflectors;
if n ≤ p, the upper triangle of the subarray b(1:n, p-n+1:p) contains the
n-by-n upper triangular matrix T;
if n > p, the elements on and above the (n-p)th subdiagonal contain the
n-by-p upper trapezoidal matrix T; the remaining elements, with the array
taub, represent the orthogonal/unitary matrix Z as a product of elementary
reflectors.
taua, taub REAL for sggqrf

DOUBLE PRECISION for dggqrf
COMPLEX for cggqrf
DOUBLE COMPLEX for zggqrf.
Arrays, size at least max (1, min(n, m)) for taua and at least max (1,
min(n, p)) for taub. The array taua contains the scalar factors of the
elementary reflectors which represent the orthogonal/unitary matrix Q.
The array taub contains the scalar factors of the elementary reflectors
which represent the orthogonal/unitary matrix Z.

info INTEGER.

Specific details for the routine ggqrf interface are the following:
a Holds the matrix A of size (n,m).
b Holds the matrix B of size (n,p).
taua Holds the vector of length min(n,m).
taub Holds the vector of length min(n,p).
Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(1)H(2)...H(k), where k = min(n,m).
Each H(i) has the form
H(i) = I - taua*v*v^T for real flavors, or
H(i) = I - taua*v*v^H for complex flavors,
where taua is a real/complex scalar, and v is a real/complex vector with v_j = 0 for 1 ≤ j ≤ i - 1, v_i = 1.
On exit, for i + 1 ≤ j ≤ n, v_j is stored in a(i+1:n, i) and taua is stored in taua(i).
The matrix Z is represented as a product of elementary reflectors
Z = H(1)H(2)...H(k), where k = min(n,p).
Each H(i) has the form
H(i) = I - taub*v*v^T for real flavors, or
H(i) = I - taub*v*v^H for complex flavors,
where taub is a real/complex scalar, and v is a real/complex vector with v_{p-k+i} = 1, v_j = 0 for
p - k + i + 1 ≤ j ≤ p.
On exit, for 1 ≤ j ≤ p - k + i - 1, v_j is stored in b(n-k+i, 1:p-k+i-1) and taub is stored in taub(i).
For better performance, try using lwork ≥ max(n, m, p)*max(nb1, nb2, nb3), where nb1 is the optimal
blocksize for the QR factorization of an n-by-m matrix, nb2 is the optimal blocksize for the RQ factorization of
an n-by-p matrix, and nb3 is the optimal blocksize for a call of ormqr/unmqr.
lwork = -1.
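The reflector form H(i) = I - tau*v*v^T can be applied to a vector without ever forming H(i). The sketch below is plain Python rather than an MKL call; the vector v and the choice tau = 2/(v^T*v) are assumptions made for illustration (that value of tau makes a real reflector orthogonal), not values produced by ?ggqrf.

```python
import math


def apply_reflector(tau, v, x):
    """Return H*x for H = I - tau*v*v^T without forming H explicitly."""
    vtx = sum(vj * xj for vj, xj in zip(v, x))          # scalar v^T x
    return [xj - tau * vtx * vj for vj, xj in zip(v, x)]


def norm2(y):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(t * t for t in y))


# Hypothetical reflector H(2) for n = 4: v_1 = 0, v_2 = 1, v_3 and v_4 are free.
v = [0.0, 1.0, 0.5, -0.5]
tau = 2.0 / sum(vj * vj for vj in v)    # assumed value; makes H orthogonal (real case)

x = [3.0, 1.0, 4.0, 1.0]
hx = apply_reflector(tau, v, x)
hhx = apply_reflector(tau, v, hx)       # a reflection applied twice restores x

print(hx, norm2(x), norm2(hx))
```

Because v_1 = 0, the first entry of x is left untouched, which is why the leading zeros of v need not be stored.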
?ggrqf
Computes the generalized RQ factorization of two
matrices.
Syntax
call sggrqf (m, p, n, a, lda, taua, b, ldb, taub, work, lwork, info)
call dggrqf (m, p, n, a, lda, taua, b, ldb, taub, work, lwork, info)
call cggrqf (m, p, n, a, lda, taua, b, ldb, taub, work, lwork, info)
call zggrqf (m, p, n, a, lda, taua, b, ldb, taub, work, lwork, info)
call ggrqf(a, b [,taua] [,taub] [,info])
Include Files
mkl.fi, lapack.f90
Description
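The routine forms the generalized RQ factorization of an m-by-n matrix A and a p-by-n matrix B as A =
R*Q, B = Z*T*Q, where Q is an n-by-n orthogonal/unitary matrix, Z is a p-by-p orthogonal/unitary matrix,
and R and T assume one of the forms:
$$R = \begin{pmatrix} 0 & R_{12} \end{pmatrix} \ \text{if } m \le n, \quad \text{or} \quad R = \begin{pmatrix} R_{11} \\ R_{21} \end{pmatrix} \ \text{if } m > n,$$
where R12 or R21 is upper triangular, and
$$T = \begin{pmatrix} T_{11} \\ 0 \end{pmatrix} \ \text{if } p \ge n, \quad \text{or} \quad T = \begin{pmatrix} T_{11} & T_{12} \end{pmatrix} \ \text{if } p < n,$$
where T11 is upper triangular.
In particular, if B is square and nonsingular, the GRQ factorization of A and B implicitly gives the RQ
factorization of A*B^{-1} as:
A*B^{-1} = (R*T^{-1})*Z^T (for real flavors) or A*B^{-1} = (R*T^{-1})*Z^H (for complex flavors).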
Input Parameters
m INTEGER. The number of rows of the matrix A (m ≥ 0).
p INTEGER. The number of rows in B (p ≥ 0).
n INTEGER. The number of columns of the matrices A and B (n ≥ 0).
a, b, work REAL for sggrqf

DOUBLE PRECISION for dggrqf
COMPLEX for cggrqf
DOUBLE COMPLEX for zggrqf.
Arrays:
a(lda,*) contains the m-by-n matrix A.
b(ldb,*) contains the p-by-n matrix B.
The second dimension of b must be at least max(1, n).
ldb INTEGER. The leading dimension of b; at least max(1, p).
lwork INTEGER. The size of the work array; must be at least max(1, n, m, p).
xerbla.
Output Parameters
a, b Overwritten by the factorization data as follows:

on exit, if m ≤ n, the upper triangle of the subarray a(1:m, n-m+1:n)
contains the m-by-m upper triangular matrix R;
if m > n, the elements on and above the (m-n)th subdiagonal contain the
m-by-n upper trapezoidal matrix R;
the remaining elements, with the array taua, represent the orthogonal/
unitary matrix Q as a product of elementary reflectors.
The elements on and above the diagonal of the array b contain the
min(p,n)-by-n upper trapezoidal matrix T (T is upper triangular if p ≥ n); the
elements below the diagonal, with the array taub, represent the orthogonal/
unitary matrix Z as a product of elementary reflectors.
taua, taub REAL for sggrqf

DOUBLE PRECISION for dggrqf
COMPLEX for cggrqf
DOUBLE COMPLEX for zggrqf.
Arrays, size at least max (1, min(m, n)) for taua and at least max (1,
min(p, n)) for taub.
The array taua contains the scalar factors of the elementary reflectors
which represent the orthogonal/unitary matrix Q.
The array taub contains the scalar factors of the elementary reflectors
which represent the orthogonal/unitary matrix Z.

info INTEGER.

Specific details for the routine ggrqf interface are the following:
a Holds the matrix A of size (m,n).
b Holds the matrix B of size (p,n).
taua Holds the vector of length min(m,n).
taub Holds the vector of length min(p,n).
Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(1)H(2)...H(k), where k = min(m,n).
Each H(i) has the form
H(i) = I - taua*v*v^T for real flavors, or
H(i) = I - taua*v*v^H for complex flavors,
where taua is a real/complex scalar, and v is a real/complex vector with v_{n-k+i} = 1, v_{n-k+i+1:n} = 0.
On exit, v_{1:n-k+i-1} is stored in a(m-k+i, 1:n-k+i-1) and taua is stored in taua(i).
The matrix Z is represented as a product of elementary reflectors
Z = H(1)H(2)...H(k), where k = min(p,n).
Each H(i) has the form
H(i) = I - taub*v*v^T for real flavors, or
H(i) = I - taub*v*v^H for complex flavors,
where taub is a real/complex scalar, and v is a real/complex vector with v_{1:i-1} = 0, v_i = 1.
On exit, v_{i+1:p} is stored in b(i+1:p, i) and taub is stored in taub(i).
For better performance, try using
lwork ≥ max(n, m, p)*max(nb1, nb2, nb3),
where nb1 is the optimal blocksize for the RQ factorization of an m-by-n matrix, nb2 is the optimal blocksize
for the QR factorization of a p-by-n matrix, and nb3 is the optimal blocksize for a call of ?ormrq/?unmrq.
lwork= -1.
If you set lwork= -1, the routine returns immediately and provides the recommended workspace in the first
?tpqrt
Computes a blocked QR factorization of a real or
complex "triangular-pentagonal" matrix, which is
composed of a triangular block and a pentagonal
block, using the compact WY representation for Q.
Syntax
call stpqrt(m, n, l, nb, a, lda, b, ldb, t, ldt, work, info)
call dtpqrt(m, n, l, nb, a, lda, b, ldb, t, ldt, work, info)
call ctpqrt(m, n, l, nb, a, lda, b, ldb, t, ldt, work, info)
call ztpqrt(m, n, l, nb, a, lda, b, ldb, t, ldt, work, info)
call tpqrt(a, b, t, nb[, info])
Include Files
mkl.fi, lapack.f90
Description
The input matrix C is an (n+m)-by-n matrix
$$C = \begin{pmatrix} A \\ B \end{pmatrix},$$
where A is an n-by-n upper triangular matrix, and B is an m-by-n pentagonal matrix consisting of an
(m-l)-by-n rectangular matrix B1 on top of an l-by-n upper trapezoidal matrix B2:
$$B = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix}.$$
The upper trapezoidal matrix B2 consists of the first l rows of an n-by-n upper triangular matrix, where
0 ≤ l ≤ min(m,n). If l=0, B is an m-by-n rectangular matrix. If m=l=n, B is upper triangular. The elementary
reflectors H(i) are stored in the ith column below the diagonal in the (n+m)-by-n input matrix C. The
structure of vectors defining the elementary reflectors is illustrated by:
$$C = \begin{pmatrix} I \\ V \end{pmatrix}.$$
The elements of the unit matrix I are not stored. Thus, V contains all of the necessary information, and is
returned in array b.
NOTE
Note that V has the same form as B:
$$V = \begin{pmatrix} V_1 \\ V_2 \end{pmatrix},$$
where V1 is (m-l)-by-n rectangular and V2 is l-by-n upper trapezoidal.
The columns of V represent the vectors which define the H(i)s.
The number of blocks is k = ceiling(n/nb), where each block is of order nb except for the last block, which is
of order ib = n - (k-1)*nb. For each of the k blocks, an upper triangular block reflector factor is computed:
T1, T2, ..., Tk. The nb-by-nb (ib-by-ib for the last block) blocks Ti are stored in the nb-by-n array t as
t = [T1 T2 ... Tk].
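The block bookkeeping described above (k blocks of order nb, with a last block of order ib) can be checked with a few lines of arithmetic. The following is a sketch in plain Python, not part of MKL; the helper name t_block_layout is hypothetical, and the column ranges assume the blocks Ti are laid side by side in t exactly as the formula for t states.

```python
def t_block_layout(n, nb):
    """Return (first_col, last_col, order) for each block T_i in the nb-by-n array t.

    Columns are 1-based to match Fortran indexing.
    """
    if not (n >= nb >= 1):
        raise ValueError("expected n >= nb >= 1")
    k = (n + nb - 1) // nb                  # k = ceiling(n/nb), integer arithmetic
    layout = []
    for i in range(1, k + 1):
        first = (i - 1) * nb + 1
        # every block has order nb except the last one, which has order ib
        order = nb if i < k else n - (k - 1) * nb
        layout.append((first, first + order - 1, order))
    return layout


print(t_block_layout(10, 4))
```

For n = 10 and nb = 4 this gives three blocks, the last of order ib = 10 - 2*4 = 2.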
Input Parameters
m INTEGER. The total number of rows in the matrix B (m ≥ 0).
n INTEGER. The number of columns in B and the order of the triangular matrix A (n ≥ 0).
l INTEGER. The number of rows of the upper trapezoidal part of B (min(m, n) ≥ l ≥ 0).
nb INTEGER. The block size to use in the blocked QR factorization (n ≥ nb ≥ 1).
a, b, work REAL for stpqrt

DOUBLE PRECISION for dtpqrt
COMPLEX for ctpqrt
COMPLEX*16 for ztpqrt.
Arrays: a size (lda, n) contains the n-by-n upper triangular matrix A.
b size (ldb, n), the pentagonal m-by-n matrix B. The first (m-l) rows
contain the rectangular B1 matrix, and the next l rows contain the upper
trapezoidal B2 matrix.
work size (nb, n) is a workspace array.
ldb INTEGER. The leading dimension of b; at least max(1, m).
ldt INTEGER. The leading dimension of t; at least nb.
Output Parameters
a The elements on and above the diagonal of the array contain the upper
triangular matrix R.
b The pentagonal matrix V.
t REAL for stpqrt

DOUBLE PRECISION for dtpqrt
COMPLEX for ctpqrt
COMPLEX*16 for ztpqrt.
Array, size (ldt, n).
The upper triangular block reflectors stored in compact form as a sequence
of upper triangular blocks.
info INTEGER.
?tpmqrt
Applies a real or complex orthogonal matrix obtained
from a "triangular-pentagonal" complex block reflector
to a general real or complex matrix, which consists of
two blocks.
Syntax
call stpmqrt(side, trans, m, n, k, l, nb, v, ldv, t, ldt, a, lda, b, ldb, work, info)
call dtpmqrt(side, trans, m, n, k, l, nb, v, ldv, t, ldt, a, lda, b, ldb, work, info)
call ctpmqrt(side, trans, m, n, k, l, nb, v, ldv, t, ldt, a, lda, b, ldb, work, info)
call ztpmqrt(side, trans, m, n, k, l, nb, v, ldv, t, ldt, a, lda, b, ldb, work, info)
call tpmqrt( v, t, a, b, k, nb[, trans][, side][, info])
Include Files
mkl.fi, lapack.f90
Description
The columns of the pentagonal matrix V contain the elementary reflectors H(1), H(2), ..., H(k); V is
composed of a rectangular block V1 and a trapezoidal block V2:
$$V = \begin{pmatrix} V_1 \\ V_2 \end{pmatrix}.$$
The size of the trapezoidal block V2 is determined by the parameter l, where 0 ≤ l ≤ k. V2 is upper trapezoidal,
consisting of the first l rows of a k-by-k upper triangular matrix.
If l=k, V2 is upper triangular;
If l=0, there is no trapezoidal block, so V = V1 is rectangular.
If side = 'L':
$$C = \begin{pmatrix} A \\ B \end{pmatrix},$$
where A is k-by-n, B is m-by-n and V is m-by-k.
If side = 'R':
$$C = \begin{pmatrix} A & B \end{pmatrix},$$
where A is m-by-k, B is m-by-n and V is n-by-k.
The real/complex orthogonal matrix Q is formed from V and T.
If trans='N' and side='L', C contains Q * C on exit.
If trans='T' and side='L', C contains Q^T * C on exit.
If trans='C' and side='L', C contains Q^H * C on exit.
If trans='N' and side='R', C contains C * Q on exit.
If trans='T' and side='R', C contains C * Q^T on exit.
If trans='C' and side='R', C contains C * Q^H on exit.
Input Parameters
side CHARACTER*1
='L': apply Q, Q^T, or Q^H from the left.
='R': apply Q, Q^T, or Q^H from the right.
trans CHARACTER*1
='N', no transpose, apply Q.
='T', transpose, apply Q^T.
='C', conjugate transpose, apply Q^H.
m INTEGER. The number of rows in the matrix B (m ≥ 0).
n INTEGER. The number of columns in the matrix B (n ≥ 0).
k INTEGER. The number of elementary reflectors whose product defines the matrix Q (k ≥ 0).
l INTEGER. The order of the trapezoidal part of V (k ≥ l ≥ 0).
nb INTEGER.
The block size used for the storage of t, k ≥ nb ≥ 1. This must be the same
value of nb used to generate t in tpqrt.
v REAL for stpmqrt

DOUBLE PRECISION for dtpmqrt
COMPLEX for ctpmqrt
COMPLEX*16 for ztpmqrt.
Size (ldv, k)
The ith column must contain the vector which defines the elementary
reflector H(i), for i = 1,2,...,k, as returned by tpqrt in array argument b.

If side = 'L', ldv must be at least max(1,m);
If side = 'R', ldv must be at least max(1,n).
t REAL for stpmqrt
DOUBLE PRECISION for dtpmqrt
COMPLEX for ctpmqrt
COMPLEX*16 for ztpmqrt.
Array, size (ldt, k).
The upper triangular factors of the block reflectors as returned by tpqrt,
stored as an nb-by-k matrix.
ldt INTEGER. The leading dimension of the array t. ldt must be at least nb.
a REAL for stpmqrt
DOUBLE PRECISION for dtpmqrt
COMPLEX for ctpmqrt
COMPLEX*16 for ztpmqrt.
If side = 'L', size (lda, n).
If side = 'R', size (lda, k).
The k-by-n or m-by-k matrix A.

If side = 'L', lda must be at least max(1,k).
If side = 'R', lda must be at least max(1,m).
b REAL for stpmqrt
DOUBLE PRECISION for dtpmqrt
COMPLEX for ctpmqrt
COMPLEX*16 for ztpmqrt.
Size (ldb, n).
The m-by-n matrix B.
ldb INTEGER. The leading dimension of the array b. ldb must be at least
max(1,m).
work REAL for stpmqrt
DOUBLE PRECISION for dtpmqrt
COMPLEX for ctpmqrt
COMPLEX*16 for ztpmqrt.
Workspace array. If side = 'L', size n*nb. If side = 'R', size m*nb.
Output Parameters
a Overwritten by the corresponding block of the product Q*C, C*Q, Q^T*C,
C*Q^T, Q^H*C, or C*Q^H.
b Overwritten by the corresponding block of the product Q*C, C*Q, Q^T*C,
C*Q^T, Q^H*C, or C*Q^H.
info (global) INTEGER.

< 0: if info = -i, the ith argument had an illegal value.
Singular Value Decomposition: LAPACK Computational Routines

This section describes LAPACK routines for computing the singular value decomposition (SVD) of a general
m-by-n matrix A:
$A = U \Sigma V^H$.
In this decomposition, U and V are unitary (for complex A) or orthogonal (for real A); $\Sigma$ is an m-by-n
diagonal matrix with real diagonal elements $\sigma_i$:
$\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_{\min(m,n)} \ge 0$.
The diagonal elements $\sigma_i$ are singular values of A. The first min(m, n) columns of the matrices U and V are,
respectively, left and right singular vectors of A. The singular values and singular vectors satisfy
$A v_i = \sigma_i u_i$ and $A^H u_i = \sigma_i v_i$
where $u_i$ and $v_i$ are the i-th columns of U and V, respectively.
To find the SVD of a general matrix A, call the LAPACK routine ?gebrd or ?gbbrd for reducing A to a
bidiagonal matrix B by a unitary (orthogonal) transformation: $A = Q B P^H$. Then call ?bdsqr, which forms the
SVD of a bidiagonal matrix: $B = U_1 \Sigma V_1^H$.
Thus, the sought-for SVD of A is given by $A = U \Sigma V^H = (Q U_1) \Sigma (V_1^H P^H)$.
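The final formula is obtained by substituting the SVD of the bidiagonal matrix B into the reduction of A. Written out as a chain of equalities (using only the symbols defined in this section):

```latex
\begin{aligned}
A &= Q\,B\,P^{H}
   = Q\,\bigl(U_1\,\Sigma\,V_1^{H}\bigr)\,P^{H}
   = (Q\,U_1)\,\Sigma\,(P\,V_1)^{H}, \\
U &= Q\,U_1, \qquad V = P\,V_1 .
\end{aligned}
```

Since products of unitary (orthogonal) matrices are unitary (orthogonal), U and V satisfy the requirements of the decomposition.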
Table "Computational Routines for Singular Value Decomposition (SVD)" lists LAPACK routines (FORTRAN 77
interface) that perform singular value decomposition of matrices. The corresponding routine names in the
Fortran 95 interface are the same except that the first character is removed.
Computational Routines for Singular Value Decomposition (SVD)
Operation                                                             Real matrices     Complex matrices
Reduce A to a bidiagonal matrix B: A = QBP^H (full storage)           ?gebrd            ?gebrd
Reduce A to a bidiagonal matrix B: A = QBP^H (band storage)           ?gbbrd            ?gbbrd
Generate the orthogonal (unitary) matrix Q or P                       ?orgbr            ?ungbr
Apply the orthogonal (unitary) matrix Q or P                          ?ormbr            ?unmbr
Form singular value decomposition of the bidiagonal matrix B:         ?bdsqr ?bdsdc     ?bdsqr
B = UΣV^H
Decision Tree: Singular Value Decomposition
Figure "Decision Tree: Singular Value Decomposition" presents a decision tree that helps you choose the right
sequence of routines for SVD, depending on whether you need singular values only or singular vectors as
well, whether A is real or complex, and so on.
You can use the SVD to find a minimum-norm solution to a (possibly) rank-deficient least squares problem of
minimizing $\|Ax - b\|_2$. The effective rank k of the matrix A can be determined as the number of singular
values which exceed a suitable threshold. The minimum-norm solution is
$x = V_k \Sigma_k^{-1} c$
where $\Sigma_k$ is the leading k-by-k submatrix of $\Sigma$, the matrix $V_k$ consists of the first k columns of $V = P V_1$, and
the vector c consists of the first k elements of $U^H b = U_1^H Q^H b$.
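The minimum-norm formula can be traced on a small example. The sketch below is plain Python for the real case and assumes the SVD factors are already available (for instance from the routines in this section); the function name min_norm_solution, the data layout (one list per singular vector), and the threshold value are illustrative assumptions, not part of the library.

```python
def min_norm_solution(u_cols, sigma, v_cols, b, threshold):
    """Minimum-norm least squares solution from precomputed real SVD factors.

    u_cols[i] and v_cols[i] hold the i-th left and right singular vectors,
    sigma[i] the matching singular value, in decreasing order.
    """
    k = sum(1 for s in sigma if s > threshold)      # effective rank
    n = len(v_cols[0])
    x = [0.0] * n
    for i in range(k):
        c_i = sum(uj * bj for uj, bj in zip(u_cols[i], b))   # i-th entry of U^T b
        coeff = c_i / sigma[i]                               # entry of Sigma_k^{-1} c
        for j in range(n):
            x[j] += coeff * v_cols[i][j]                     # accumulate V_k * (...)
    return k, x


# Hypothetical 3-by-2 matrix whose singular vectors are unit vectors and whose
# second singular value is negligible, so the effective rank is 1.
u_cols = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
v_cols = [[1.0, 0.0], [0.0, 1.0]]
sigma = [2.0, 1e-12]
k, x = min_norm_solution(u_cols, sigma, v_cols, b=[4.0, 1.0, 5.0], threshold=1e-8)
print(k, x)
```

Dropping the component along the negligible singular value keeps x small; dividing by 1e-12 instead would produce an entry of order 1e12.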
?gebrd
Reduces a general matrix to bidiagonal form.
Syntax
call sgebrd(m, n, a, lda, d, e, tauq, taup, work, lwork, info)
call dgebrd(m, n, a, lda, d, e, tauq, taup, work, lwork, info)
call cgebrd(m, n, a, lda, d, e, tauq, taup, work, lwork, info)
call zgebrd(m, n, a, lda, d, e, tauq, taup, work, lwork, info)
call gebrd(a [, d] [,e] [,tauq] [,taup] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a general m-by-n matrix A to a bidiagonal matrix B by an orthogonal (unitary)
transformation.
If m ≥ n, the reduction is given by
$$A = Q B P^H = Q \begin{pmatrix} B_1 \\ 0 \end{pmatrix} P^H = Q_1 B_1 P^H,$$
where B1 is an n-by-n upper bidiagonal matrix, Q and P are orthogonal or, for a complex A, unitary matrices;
Q1 consists of the first n columns of Q.
If m < n, the reduction is given by
$$A = Q B P^H = Q \begin{pmatrix} B_1 & 0 \end{pmatrix} P^H = Q B_1 P_1^H,$$
where B1 is an m-by-m lower bidiagonal matrix, Q and P are orthogonal or, for a complex A, unitary matrices;
P1 consists of the first m columns of P.
The routine does not form the matrices Q and P explicitly, but represents them as products of elementary
reflectors. Routines are provided to work with the matrices Q and P in this representation:
If the matrix A is real,
to compute Q and P explicitly, call orgbr.

to multiply a general matrix by Q or P, call ormbr.
If the matrix A is complex,
to compute Q and P explicitly, call ungbr.

to multiply a general matrix by Q or P, call unmbr.
Input Parameters
a, work REAL for sgebrd

DOUBLE PRECISION for dgebrd
COMPLEX for cgebrd
DOUBLE COMPLEX for zgebrd.
Arrays:
a(lda,*) contains the matrix A.
lwork INTEGER.
The dimension of work; at least max(1, m, n).

xerbla.
Output Parameters
a If m ≥ n, the diagonal and first super-diagonal of a are overwritten by the
upper bidiagonal matrix B. The elements below the diagonal, with the array
tauq, represent the orthogonal matrix Q as a product of elementary
reflectors, and the elements above the first superdiagonal, with the array
taup, represent the orthogonal matrix P as a product of elementary
reflectors.
If m < n, the diagonal and first sub-diagonal of a are overwritten by the
lower bidiagonal matrix B. The elements below the first subdiagonal, with
the array tauq, represent the orthogonal matrix Q as a product of
elementary reflectors, and the elements above the diagonal, with the array
taup, represent the orthogonal matrix P as a product of elementary
reflectors.
d REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Array, size at least max(1, min(m, n)). Contains the diagonal elements of B.
e REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Array, size at least max(1, min(m, n) - 1). Contains the off-diagonal
elements of B.
tauq, taup REAL for sgebrd
DOUBLE PRECISION for dgebrd
COMPLEX for cgebrd
DOUBLE COMPLEX for zgebrd.
Arrays, size at least max(1, min(m, n)). The scalar factors of the
elementary reflectors which represent the orthogonal or unitary matrices Q
and P.

info INTEGER.

Specific details for the routine gebrd interface are the following:
d Holds the vector of length min(m,n).
e Holds the vector of length min(m,n)-1.
tauq Holds the vector of length min(m,n).
taup Holds the vector of length min(m,n).
Application Notes
For better performance, try using lwork = (m + n)*blocksize, where blocksize is a machine-dependent
value (typically, 16 to 64) required for optimum performance of the blocked algorithm.
lwork = -1.
The computed matrices Q, B, and P satisfy Q*B*P^H = A + E, where ||E||_2 = c(n)*ε*||A||_2, c(n) is a
modestly increasing function of n, and ε is the machine precision.
The approximate number of floating-point operations for real flavors is
(4/3)*n^2*(3*m - n) for m ≥ n,
(4/3)*m^2*(3*n - m) for m < n.
If n is much less than m, it can be more efficient to first form the QR factorization of A by calling geqrf and
then reduce the factor R to bidiagonal form. This requires approximately 2*n^2*(m + n) floating-point
operations.
If m is much less than n, it can be more efficient to first form the LQ factorization of A by calling gelqf and
then reduce the factor L to bidiagonal form. This requires approximately 2*m^2*(m + n) floating-point
operations.
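To see when the two-stage approach pays off, the operation counts quoted above can be compared directly. This is a back-of-the-envelope sketch in plain Python; the function names are hypothetical, and the counts are only the leading-order estimates from these notes, so actual run times also depend on blocking and memory traffic.

```python
def gebrd_flops(m, n):
    """Approximate operation count for reducing an m-by-n matrix directly."""
    if m >= n:
        return (4.0 / 3.0) * n**2 * (3 * m - n)
    return (4.0 / 3.0) * m**2 * (3 * n - m)


def two_stage_flops(m, n):
    """Approximate count for QR (m >= n) or LQ (m < n) followed by bidiagonalization."""
    small = min(m, n)
    return 2.0 * small**2 * (m + n)


def prefer_two_stage(m, n):
    """True when the factor-first route has the lower estimated count."""
    return two_stage_flops(m, n) < gebrd_flops(m, n)


for m, n in [(1000, 1000), (2000, 1000), (5000, 100)]:
    print(m, n, gebrd_flops(m, n), two_stage_flops(m, n), prefer_two_stage(m, n))
```

Equating the two estimates for m ≥ n gives m = (5/3)*n, so under these counts the QR-first route starts to win once m exceeds roughly 1.67*n.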
?gbbrd
Reduces a general band matrix to bidiagonal form.
Syntax
call sgbbrd(vect, m, n, ncc, kl, ku, ab, ldab, d, e, q, ldq, pt, ldpt, c, ldc, work, info)
call dgbbrd(vect, m, n, ncc, kl, ku, ab, ldab, d, e, q, ldq, pt, ldpt, c, ldc, work, info)
call cgbbrd(vect, m, n, ncc, kl, ku, ab, ldab, d, e, q, ldq, pt, ldpt, c, ldc, work, rwork, info)
call zgbbrd(vect, m, n, ncc, kl, ku, ab, ldab, d, e, q, ldq, pt, ldpt, c, ldc, work, rwork, info)
call gbbrd(ab [, c] [,d] [,e] [,q] [,pt] [,kl] [,m] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces an m-by-n band matrix A to upper bidiagonal matrix B: A = Q*B*P^H. Here the matrices
Q and P are orthogonal (for real A) or unitary (for complex A). They are determined as products of Givens
rotation matrices, and may be formed explicitly by the routine if required. The routine can also update a
matrix C as follows: C = Q^H*C.
Input Parameters
vect CHARACTER*1. Must be 'N' or 'Q' or 'P' or 'B'.
If vect = 'N', neither Q nor P^H is generated.
If vect = 'Q', the routine generates the matrix Q.
If vect = 'P', the routine generates the matrix P^H.
If vect = 'B', the routine generates both Q and P^H.
ncc INTEGER. The number of columns in C (ncc ≥ 0).
kl INTEGER. The number of sub-diagonals within the band of A (kl ≥ 0).
ku INTEGER. The number of super-diagonals within the band of A (ku ≥ 0).
ab, c, work REAL for sgbbrd

DOUBLE PRECISION for dgbbrd
COMPLEX for cgbbrd
DOUBLE COMPLEX for zgbbrd.
Arrays:
ab(ldab,*) contains the matrix A in band storage (see Matrix Storage
Schemes).
c(ldc,*) contains an m-by-ncc matrix C.
If ncc = 0, the array c is not referenced.
The second dimension of c must be at least max(1, ncc).

The dimension of work must be at least 2*max(m, n) for real flavors, or
max(m, n) for complex flavors.
ldab INTEGER. The leading dimension of the array ab (ldab ≥ kl + ku + 1).
ldq INTEGER. The leading dimension of the output array q.
ldq ≥ max(1, m) if vect = 'Q' or 'B', ldq ≥ 1 otherwise.
ldpt INTEGER. The leading dimension of the output array pt.
ldpt ≥ max(1, n) if vect = 'P' or 'B', ldpt ≥ 1 otherwise.
ldc INTEGER. The leading dimension of the array c.
ldc ≥ max(1, m) if ncc > 0; ldc ≥ 1 if ncc = 0.
rwork REAL for cgbbrd DOUBLE PRECISION for zgbbrd.

A workspace array, size at least max(m, n).
Output Parameters
ab Overwritten by values generated during the reduction.
d Array, size at least max(1, min(m, n)). Contains the diagonal elements of
the matrix B.
e Array, size at least max(1, min(m, n) - 1).
Contains the off-diagonal elements of B.
q, pt REAL for sgbbrd
DOUBLE PRECISION for dgbbrd
COMPLEX for cgbbrd
DOUBLE COMPLEX for zgbbrd.
Arrays:
q(ldq,*) contains the output m-by-m matrix Q.
The second dimension of q must be at least max(1, m).
pt(ldpt,*) contains the output n-by-n matrix P^T.
The second dimension of pt must be at least max(1, n).
c Overwritten by the product Q^H*C.
c is not referenced if ncc = 0.
info INTEGER.
Specific details for the routine gbbrd interface are the following:
c Holds the matrix C of size (m,ncc).
d Holds the vector with the number of elements min(m,n).
e Holds the vector with the number of elements min(m,n)-1.
q Holds the matrix Q of size (m,m).
pt Holds the matrix P^T of size (n,n).
m If omitted, assumed m = n.
vect Restored based on the presence of arguments q and pt as follows:
vect = 'B', if both q and pt are present,
vect = 'Q', if q is present and pt omitted,
vect = 'P', if q is omitted and pt present,
vect = 'N', if both q and pt are omitted.
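The rule for restoring vect is a four-way decision on which optional outputs were supplied. The following plain-Python sketch only mirrors that decision table; it is not the source of the Fortran 95 wrapper, and the function name restore_vect is hypothetical.

```python
def restore_vect(q_present, pt_present):
    """Derive the vect argument from which optional outputs the caller passed."""
    if q_present and pt_present:
        return 'B'      # both Q and P^H are generated
    if q_present:
        return 'Q'      # only Q is generated
    if pt_present:
        return 'P'      # only P^H is generated
    return 'N'          # neither matrix is generated


for q_present in (True, False):
    for pt_present in (True, False):
        print(q_present, pt_present, restore_vect(q_present, pt_present))
```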
Application Notes
The computed matrices Q, B, and P satisfy Q*B*P^H = A + E, where ||E||_2 = c(n)*ε*||A||_2, c(n) is a
modestly increasing function of n, and ε is the machine precision.
If m = n, the total number of floating-point operations for real flavors is approximately the sum of:
6*n^2*(kl + ku) if vect = 'N' and ncc = 0,
3*n^2*ncc*(kl + ku - 1)/(kl + ku) if C is updated, and
3*n^3*(kl + ku - 1)/(kl + ku) if either Q or P^H is generated (double this if both).
To estimate the number of operations for complex flavors, use the same formulas with the coefficients 20
and 10 (instead of 6 and 3).
?orgbr
Generates the real orthogonal matrix Q or P^T
determined by ?gebrd.
Syntax
call sorgbr(vect, m, n, k, a, lda, tau, work, lwork, info)
call dorgbr(vect, m, n, k, a, lda, tau, work, lwork, info)
call orgbr(a, tau [,vect] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates the whole or part of the orthogonal matrices Q and P^T formed by the routine
gebrd. Use this routine after a call to sgebrd/dgebrd. All valid combinations of arguments are described in
Input Parameters. In most cases you need the following:
To compute the whole m-by-m matrix Q:
call ?orgbr('Q', m, m, n, a ... )
(note that the array a must have at least m columns).
To form the n leading columns of Q if m > n:
call ?orgbr('Q', m, n, n, a ... )
To compute the whole n-by-n matrix P^T:
call ?orgbr('P', n, n, m, a ... )
(note that the array a must have at least n rows).
To form the m leading rows of P^T if m < n:
call ?orgbr('P', m, n, m, a ... )
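The four call patterns differ only in the first four arguments, which is easy to get wrong because m and n change meaning between ?gebrd and ?orgbr. The sketch below, in plain Python, tabulates them for an m-by-n matrix that was reduced by ?gebrd; the function name orgbr_args and the target labels are hypothetical and exist only for this illustration.

```python
def orgbr_args(target, m, n):
    """Return (vect, rows, cols, k) for ?orgbr, given the m-by-n shape passed to ?gebrd.

    The target labels are hypothetical names used only in this sketch.
    """
    cases = {
        "Q_full": ("Q", m, m, n),        # whole m-by-m Q
        "Q_leading": ("Q", m, n, n),     # n leading columns of Q, needs m > n
        "PT_full": ("P", n, n, m),       # whole n-by-n P^T
        "PT_leading": ("P", m, n, m),    # m leading rows of P^T, needs m < n
    }
    if target not in cases:
        raise ValueError("unknown target: " + target)
    if target == "Q_leading" and not m > n:
        raise ValueError("Q_leading requires m > n")
    if target == "PT_leading" and not m < n:
        raise ValueError("PT_leading requires m < n")
    return cases[target]


print(orgbr_args("Q_full", 5, 3))
print(orgbr_args("PT_leading", 3, 5))
```

In each case the last value is the dimension of the original matrix that does not appear in the requested output: its column count for Q and its row count for P^T.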
Input Parameters
vect CHARACTER*1. Must be 'Q' or 'P'.
If vect = 'Q', the routine generates the matrix Q.
If vect = 'P', the routine generates the matrix P^T.
m, n INTEGER. The number of rows (m) and columns (n) in the matrix Q or P^T to
be returned (m ≥ 0, n ≥ 0).
If vect = 'Q', m ≥ n ≥ min(m, k).
If vect = 'P', n ≥ m ≥ min(n, k).
k INTEGER. If vect = 'Q', the number of columns in the original m-by-k matrix
reduced by gebrd.
If vect = 'P', the number of rows in the original k-by-n matrix reduced
by gebrd.
a REAL for sorgbr
DOUBLE PRECISION for dorgbr
The vectors which define the elementary reflectors, as returned by gebrd.
lda INTEGER. The leading dimension of the array a. lda ≥ max(1, m).
tau REAL for sorgbr
DOUBLE PRECISION for dorgbr
Array, size min(m,k) if vect = 'Q', min(n,k) if vect = 'P'.
Scalar factor of the elementary reflector H(i) or G(i), which determines Q
and P^T as returned by gebrd in the array tauq or taup.
work REAL for sorgbr
DOUBLE PRECISION for dorgbr
Workspace array, size max(1, lwork).
lwork INTEGER. Dimension of the array work. See Application Notes for the
suggested value of lwork.
If lwork = -1 then the routine performs a workspace query and calculates
the optimal size of the work array, returns this value as the first entry of the
work array, and no error message related to lwork is issued by xerbla.
Output Parameters
a Overwritten by the orthogonal matrix Q or P^T (or the leading rows or
columns thereof) as specified by vect, m, and n.

info INTEGER.

Specific details for the routine orgbr interface are the following:
tau Holds the vector of length min(m,k) where

k = m, if vect = 'P',
k = n, if vect = 'Q'.
vect Must be 'Q' or 'P'. The default value is 'Q'.
Application Notes
For better performance, try using lwork = min(m,n)*blocksize, where blocksize is a machine-dependent
value (typically, 16 to 64) required for optimum performance of the blocked algorithm.
lwork = -1.
The computed matrix Q differs from an exactly orthogonal matrix by a matrix E such that ||E||_2 = O(ε).
The approximate numbers of floating-point operations for the cases listed in Description are as follows:
To form the whole of Q:
(4/3)*n*(3m^2 - 3m*n + n^2) if m > n;
(4/3)*m^3 if m ≤ n.
To form the n leading columns of Q when m > n:
(2/3)*n^2*(3m - n).
To form the whole of P^T:
(4/3)*n^3 if m ≥ n;
(4/3)*m*(3n^2 - 3m*n + m^2) if m < n.
To form the m leading rows of P^T when m < n:
(2/3)*m^2*(3n - m).

The complex counterpart of this routine is ungbr.
?ormbr
Multiplies an arbitrary real matrix by the real
orthogonal matrix Q or P^T determined by ?gebrd.
Syntax
call sormbr(vect, side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call dormbr(vect, side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call ormbr(a, tau, c [,vect] [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
Given an arbitrary real matrix C, this routine forms one of the matrix products Q*C, Q^T*C, C*Q, C*Q^T, P*C,
P^T*C, C*P, C*P^T, where Q and P are orthogonal matrices computed by a call to gebrd. The routine overwrites
the product on C.
Input Parameters
In the descriptions below, r denotes the order of Q or P^T:
If side = 'L', r = m; if side = 'R', r = n.
vect CHARACTER*1. Must be 'Q' or 'P'.
If vect = 'Q', then Q or Q^T is applied to C.
If vect = 'P', then P or P^T is applied to C.
side CHARACTER*1. Must be 'L' or 'R'.
If side = 'L', multipliers are applied to C from the left.
If side = 'R', they are applied to C from the right.
trans CHARACTER*1. Must be 'N' or 'T'.

If trans = 'N', then Q or P is applied to C.
If trans = 'T', then Q^T or P^T is applied to C.
m INTEGER. The number of rows in C.
n INTEGER. The number of columns in C.
k INTEGER. One of the dimensions of A in ?gebrd:

If vect = 'Q', the number of columns in A;
If vect = 'P', the number of rows in A.
Constraints: m ≥ 0, n ≥ 0, k ≥ 0.
a, c, work REAL for sormbr

DOUBLE PRECISION for dormbr.
Arrays:
a(lda,*) is the array a as returned by ?gebrd.
Its second dimension must be at least max(1, min(r,k)) for vect = 'Q', or
max(1, r) for vect = 'P'.
c(ldc,*) holds the matrix C.

Its second dimension must be at least max(1, n).

lda INTEGER. The leading dimension of a. Constraints:
lda ≥ max(1, r) if vect = 'Q';
lda ≥ max(1, min(r,k)) if vect = 'P'.
ldc INTEGER. The leading dimension of c; ldc ≥ max(1, m).
tau REAL for sormbr

DOUBLE PRECISION for dormbr.
Array, size at least max (1, min(r, k)).
For vect = 'Q', the array tauq as returned by ?gebrd. For vect = 'P',
the array taup as returned by ?gebrd.

xerbla.
Output Parameters
c Overwritten by the product Q*C, Q^T*C, C*Q, C*Q^T, P*C, P^T*C, C*P, or C*P^T,
as specified by vect, side, and trans.

info INTEGER.

Specific details for the routine ormbr interface are the following:
a Holds the matrix A of size (r,min(nq,k)) where

r = nq, if vect = 'Q',
r = min(nq,k), if vect = 'P',
nq = m, if side = 'L',
nq = n, if side = 'R',
tau Holds the vector of length min(nq,k).
Application Notes
For better performance, try using
lwork = n*blocksize for side = 'L', or
lwork = m*blocksize for side = 'R',
where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum performance of the
blocked algorithm.
lwork = -1.
The computed product differs from the exact product by a matrix E such that ||E||_2 = O(ε)*||C||_2.
The total number of floating-point operations is approximately
2*n*k(2*m - k) if side = 'L' and m ≥ k;
2*m*k(2*n - k) if side = 'R' and n ≥ k;
2*m^2*n if side = 'L' and m < k;
2*n^2*m if side = 'R' and n < k.
The complex counterpart of this routine is unmbr.
?ungbr
Generates the complex unitary matrix Q or P^H
determined by ?gebrd.
Syntax
call cungbr(vect, m, n, k, a, lda, tau, work, lwork, info)
call zungbr(vect, m, n, k, a, lda, tau, work, lwork, info)
call ungbr(a, tau [,vect] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine generates the whole or part of the unitary matrices Q and PH formed by the routines gebrd/
gebrd. Use this routine after a call to cgebrd/zgebrd. All valid combinations of arguments are described in
Input Parameters; in most cases you need the following:
To compute the whole m-by-m matrix Q, use:
call ?ungbr('Q', m, m, n, a ... )
(note that the array a must have at least m columns).

To form the n leading columns of Q if m > n, use:
call ?ungbr('Q', m, n, n, a ... )
To compute the whole n-by-n matrix PH, use:
call ?ungbr('P', n, n, m, a ... )
(note that the array a must have at least n rows).

To form the m leading rows of PH if m < n, use:
call ?ungbr('P', m, n, m, a ... )
Input Parameters

If vect = 'P', the routine generates the matrix PH.
m INTEGER. The number of required rows of Q or PH.
n INTEGER. The number of required columns of Q or PH.

For vect = 'Q': k ≤ n ≤ m if m > k, or m = n if m ≤ k.
For vect = 'P': k ≤ m ≤ n if n > k, or m = n if n ≤ k.
a, work COMPLEX for cungbr

DOUBLE COMPLEX for zungbr.
Arrays:

tau COMPLEX for cungbr

DOUBLE COMPLEX for zungbr.
The dimension of tau must be at least max(1, min(m, k)) for vect = 'Q',
or max(1, min(n, k)) for vect = 'P'.

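lwork INTEGER. The size of the work array.
Constraint: lwork ≥ max(1, min(m, n)).
If lwork = -1, a workspace query is assumed: the routine only calculates the optimal size of the work
array, returns this value as the first entry of the work array, and no error message related to lwork
is issued by xerbla.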
Output Parameters
a Overwritten by the unitary matrix Q or P^H (or the leading rows or
columns thereof) as specified by vect, m, and n.
info INTEGER.

Specific details for the routine ungbr interface are the following:
tau Holds the vector of length min(m,k) where

Application Notes
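For better performance, try using lwork = min(m,n)*blocksize, where blocksize is a machine-dependent
value (typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
The computed matrix Q differs from an exactly unitary matrix by a matrix E such that ||E||_2 = O(ε), where ε
is the machine precision.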
The approximate numbers of possible floating-point operations are listed below:

To compute the whole matrix Q:
(16/3)n(3m^2 - 3m*n + n^2) if m > n;
(16/3)m^3 if m ≤ n.
To form the n leading columns of Q when m > n:
(8/3)n^2(3m - n).
To compute the whole matrix P^H:
(16/3)n^3 if m ≥ n;
(16/3)m(3n^2 - 3m*n + m^2) if m < n.
To form the m leading rows of P^H when m < n:
(8/3)m^2(3n - m).

The real counterpart of this routine is orgbr.
?unmbr
Multiplies an arbitrary complex matrix by the unitary
matrix Q or P determined by ?gebrd.
Syntax
call cunmbr(vect, side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call zunmbr(vect, side, trans, m, n, k, a, lda, tau, c, ldc, work, lwork, info)
call unmbr(a, tau, c [,vect] [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
Given an arbitrary complex matrix C, this routine forms one of the matrix products Q*C, QH*C, C*Q, C*QH,
P*C, PH*C, C*P, or C*PH, where Q and P are unitary matrices computed by a call to cgebrd/zgebrd. The routine
overwrites the product on C.
Input Parameters
In the descriptions below, r denotes the order of Q or PH:

If vect = 'Q', then Q or QH is applied to C.
If vect = 'P', then P or PH is applied to C.

If side = 'L', multipliers are applied to C from the left.
If side = 'R', they are applied to C from the right.
trans CHARACTER*1. Must be 'N' or 'C'.

If trans = 'N', then Q or P is applied to C.
If trans = 'C', then QH or PH is applied to C.
m INTEGER. The number of rows in C.
n INTEGER. The number of columns in C.

a, c, work COMPLEX for cunmbr
DOUBLE COMPLEX for zunmbr.
Arrays:
a(lda,*) is the array a as returned by ?gebrd.
Its second dimension must be at least max(1, min(r,k)) for vect = 'Q', or
max(1, r) for vect = 'P'.
c(ldc,*) holds the matrix C.

Its second dimension must be at least max(1, n).

lda ≥ max(1, r) if vect = 'Q';
lda ≥ max(1, min(r,k)) if vect = 'P'.
tau COMPLEX for cunmbr

DOUBLE COMPLEX for zunmbr.
Array, size at least max (1, min(r, k)).

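lwork ≥ 1 if n=0 or m=0.
For optimum performance lwork ≥ max(1,n*nb) if side = 'L', and
lwork ≥ max(1,m*nb) if side = 'R', where nb is the optimal blocksize.
(nb = 0 if m = 0 or n = 0.)
If lwork = -1, a workspace query is assumed: the routine only calculates the optimal size of the work
array, returns this value as the first entry of the work array, and no error message related to lwork
is issued by xerbla.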
Output Parameters
c Overwritten by the product Q*C, QH*C, C*Q, C*QH, P*C, PH*C, C*P, or
C*PH, as specified by vect, side, and trans.

info INTEGER.

Specific details for the routine unmbr interface are the following:
a Holds the matrix A of size (r,min(nq,k)) where

r = nq, if vect = 'Q',
r = min(nq,k), if vect = 'P',
nq = m, if side = 'L',
nq = n, if side = 'R',
tau Holds the vector of length min(nq,k).
Application Notes
For better performance, use
lwork = n*blocksize for side = 'L', or
lwork = m*blocksize for side = 'R',
where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum performance of the
blocked algorithm.
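If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.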
The total number of floating-point operations is approximately

8*n*k(2*m - k) if side = 'L' and m ≥ k;
8*m*k(2*n - k) if side = 'R' and n ≥ k;
8*m^2*n if side = 'L' and m < k;
8*n^2*m if side = 'R' and n < k.
The real counterpart of this routine is ormbr.
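The operation counts given in the Application Notes of ?ormbr and ?unmbr can be wrapped in a small helper, for example to compare the cost of side = 'L' and side = 'R' before calling the routine. The following is a minimal sketch only: the module name mbr_flops and the function mbr_flop_estimate are hypothetical and are not part of Intel MKL. It assumes the leading factor 2 for the real routine and 8 for the complex routine, and the four cases selected by side and by the comparison of m or n with k.

```fortran
module mbr_flops
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Approximate floating-point operation count for ?ormbr (is_complex = .false.)
  ! or ?unmbr (is_complex = .true.). Hypothetical helper, not an MKL routine.
  pure function mbr_flop_estimate(side, m, n, k, is_complex) result(flops)
    character(len=1), intent(in) :: side
    integer, intent(in) :: m, n, k
    logical, intent(in) :: is_complex
    real(dp) :: flops
    real(dp) :: rm, rn, rk, factor

    rm = real(m, dp)
    rn = real(n, dp)
    rk = real(k, dp)
    ! Factor 2 for the real routine, 8 for the complex routine.
    factor = merge(8.0_dp, 2.0_dp, is_complex)

    if (side == 'L' .or. side == 'l') then
      if (m >= k) then
        flops = factor*rn*rk*(2.0_dp*rm - rk)
      else
        flops = factor*rm*rm*rn
      end if
    else
      ! Any other value is treated as side = 'R' in this sketch.
      if (n >= k) then
        flops = factor*rm*rk*(2.0_dp*rn - rk)
      else
        flops = factor*rn*rn*rm
      end if
    end if
  end function mbr_flop_estimate
end module mbr_flops
```

For example, mbr_flop_estimate('L', 4, 3, 2, .false.) evaluates 2*3*2*(2*4 - 2). The result is only an estimate of the arithmetic work and says nothing about memory traffic or threading.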
?bdsqr
Computes the singular value decomposition of a
general matrix that has been reduced to bidiagonal
form.
Syntax
call sbdsqr(uplo, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, work, info)
call dbdsqr(uplo, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, work, info)
call cbdsqr(uplo, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, rwork, info)
call zbdsqr(uplo, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, rwork, info)
call rbdsqr(d, e [,vt] [,u] [,c] [,uplo] [,info])
call bdsqr(d, e [,vt] [,u] [,c] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the singular values and, optionally, the right and/or left singular vectors from the
Singular Value Decomposition (SVD) of a real n-by-n (upper or lower) bidiagonal matrix B using the implicit
zero-shift QR algorithm. The SVD of B has the form B = Q*S*PH where S is the diagonal matrix of singular
values, Q is an orthogonal matrix of left singular vectors, and P is an orthogonal matrix of right singular
vectors. If left singular vectors are requested, this subroutine actually returns U *Q instead of Q, and, if right
singular vectors are requested, this subroutine returns PH *VT instead of PH, for given real/complex input
matrices U and VT. When U and VT are the orthogonal/unitary matrices that reduce a general matrix A to
bidiagonal form: A = U*B*VT, as computed by ?gebrd, then
A = (U*Q)*S*(PH*VT)
is the SVD of A. Optionally, the subroutine may also compute QH *C for a given real/complex input matrix C.
See also lasq1, lasq2, lasq3, lasq4, lasq5, lasq6 used by this routine.
Input Parameters

If uplo = 'U', B is an upper bidiagonal matrix.
If uplo = 'L', B is a lower bidiagonal matrix.
n INTEGER. The order of the matrix B (n ≥ 0).
ncvt INTEGER. The number of columns of the matrix VT, that is, the number of
right singular vectors (ncvt ≥ 0).
Set ncvt = 0 if no right singular vectors are required.
nru INTEGER. The number of rows in U, that is, the number of left singular
vectors (nru ≥ 0).
Set nru = 0 if no left singular vectors are required.
ncc INTEGER. The number of columns in the matrix C used for computing the
product QH*C (ncc ≥ 0). Set ncc = 0 if no matrix C is supplied.
d, e REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Arrays:
d(*) contains the diagonal elements of B.
The size of d must be at least max(1, n).
e(*) contains the (n-1) off-diagonal elements of B.
The size of e must be at least max(1, n - 1).
work REAL for sbdsqr

DOUBLE PRECISION for dbdsqr.
The size of work must be at least max(1, 4*n).
rwork REAL for cbdsqr

DOUBLE PRECISION for zbdsqr.
rwork(*) is a workspace array.
The size of rwork must be at least max(1, 4*n).
vt, u, c REAL for sbdsqr

DOUBLE PRECISION for dbdsqr
COMPLEX for cbdsqr
DOUBLE COMPLEX for zbdsqr.
Arrays:
vt(ldvt,*) contains an n-by-ncvt matrix VT.
The second dimension of vt must be at least max(1, ncvt).
vt is not referenced if ncvt = 0.
u(ldu,*) contains an nru by n matrix U.
The second dimension of u must be at least max(1, n).
u is not referenced if nru = 0.
c(ldc,*) contains the n-by-ncc matrix C for computing the product QH*C.
The second dimension of c must be at least max(1, ncc). The array is not
referenced if ncc = 0.
ldvt INTEGER. The leading dimension of vt. Constraints:

ldvt ≥ max(1, n) if ncvt > 0;
ldvt ≥ 1 if ncvt = 0.
ldu INTEGER. The leading dimension of u. Constraint:
ldu ≥ max(1, nru).
ldc INTEGER. The leading dimension of c. Constraints:

ldc ≥ max(1, n) if ncc > 0; ldc ≥ 1 otherwise.
Output Parameters
d On exit, if info = 0, overwritten by the singular values in decreasing order
(see info).
e On exit, if info = 0, e is destroyed. See also info below.
c Overwritten by the product QH*C.
vt On exit, this array is overwritten by PH *VT. Not referenced if ncvt = 0.
u On exit, this array is overwritten by U *Q. Not referenced if nru = 0.
info INTEGER.
If info > 0,
If ncvt = nru = ncc = 0,
info = 1, a split was marked by a positive value in e

info = 2, the current block of z not diagonalized after 100*n iterations
(in the inner while loop)
info = 3, termination criterion of the outer while loop is not met (the
program created more than n unreduced blocks).
In all other cases when ncvt, nru, or ncc > 0, the algorithm did not
converge; d and e contain the elements of a bidiagonal matrix that is
orthogonally similar to the input matrix B; if info = i, i elements of e
have not converged to zero.

Specific details for the routine bdsqr interface are the following:
d Holds the vector of length (n).
e Holds the vector of length (n).
vt Holds the matrix VT of size (n, ncvt).
u Holds the matrix U of size (nru,n).
c Holds the matrix C of size (n,ncc).
ncvt If argument vt is present, then ncvt is equal to the number of columns in matrix
VT; otherwise, ncvt is set to zero.
nru If argument u is present, then nru is equal to the number of rows in matrix U;
otherwise, nru is set to zero.
ncc If argument c is present, then ncc is equal to the number of columns in matrix C;
otherwise, ncc is set to zero.
Note that two variants of the Fortran 95 interface for the bdsqr routine are needed because the choice
between real and complex cases becomes ambiguous when vt, u, and c are omitted. Thus, the name rbdsqr is
used in real cases (single or double precision), and the name bdsqr is used in complex cases (single or
double precision).
Application Notes
Each singular value and singular vector is computed to high relative accuracy. However, the reduction to
bidiagonal form (prior to calling the routine) may decrease the relative accuracy in the small singular values
of the original matrix if its singular values vary widely in magnitude.
If σ_i is an exact singular value of B, and s_i is the corresponding computed value, then
|s_i - σ_i| ≤ p(m,n)*ε*σ_i
where p(m, n) is a modestly increasing function of m and n, and ε is the machine precision.
If only singular values are computed, they are computed more accurately than when some singular vectors
are also computed (that is, the function p(m, n) is smaller).
If u_i is the corresponding exact left singular vector of B, and w_i is the corresponding computed left singular
vector, then the angle θ(u_i, w_i) between them is bounded as follows:
θ(u_i, w_i) ≤ p(m,n)*ε / min_{i≠j}(|σ_i - σ_j|/|σ_i + σ_j|).
Here min_{i≠j}(|σ_i - σ_j|/|σ_i + σ_j|) is the relative gap between σ_i and the other singular values. A similar
error bound holds for the right singular vectors.
The total number of real floating-point operations is roughly proportional to n^2 if only the singular values are
computed. About 6n^2*nru additional operations (12n^2*nru for complex flavors) are required to compute the
left singular vectors and about 6n^2*ncvt operations (12n^2*ncvt for complex flavors) to compute the right
singular vectors.
?bdsdc
Computes the singular value decomposition of a real
bidiagonal matrix using a divide and conquer method.
Syntax
call sbdsdc(uplo, compq, n, d, e, u, ldu, vt, ldvt, q, iq, work, iwork, info)
call dbdsdc(uplo, compq, n, d, e, u, ldu, vt, ldvt, q, iq, work, iwork, info)
call bdsdc(d, e [,u] [,vt] [,q] [,iq] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the Singular Value Decomposition (SVD) of a real n-by-n (upper or lower) bidiagonal
matrix B: B = U*Σ*V^T, using a divide and conquer method, where Σ is a diagonal matrix with non-negative
diagonal elements (the singular values of B), and U and V are orthogonal matrices of left and right singular
vectors, respectively. ?bdsdc can be used to compute all singular values, and optionally, singular vectors or
singular vectors in compact form.
This routine uses ?lasd0, ?lasd1, ?lasd2, ?lasd3, ?lasd4, ?lasd5, ?lasd6, ?lasd7, ?lasd8, ?lasd9,
?lasda, ?lasdq, ?lasdt.
Input Parameters

If uplo = 'U', B is an upper bidiagonal matrix.
If uplo = 'L', B is a lower bidiagonal matrix.
compq CHARACTER*1. Must be 'N', 'P', or 'I'.

If compq = 'N', compute singular values only.
If compq = 'P', compute singular values and compute singular vectors in

compact form.
If compq = 'I', compute singular values and singular vectors.
d, e, work REAL for sbdsdc

DOUBLE PRECISION for dbdsdc.
Arrays:
d(*) contains the n diagonal elements of the bidiagonal matrix B. The size
of d must be at least max(1, n).
e(*) contains the off-diagonal elements of the bidiagonal matrix B. The size
of e must be at least max(1, n).
The dimension of work must be at least:
max(1, 4*n), if compq = 'N' or compq = 'P';
max(1, 3*n^2+4*n), if compq = 'I'.
ldu INTEGER. The leading dimension of the output array u; ldu ≥ 1.

If singular vectors are desired, then ldu ≥ max(1, n).
ldvt INTEGER. The leading dimension of the output array vt; ldvt ≥ 1.
If singular vectors are desired, then ldvt ≥ max(1, n).
iwork INTEGER. Workspace array, dimension at least max(1, 8*n).
Output Parameters
d If info = 0, overwritten by the singular values of B.
e On exit, e is overwritten.
u, vt, q REAL for sbdsdc

DOUBLE PRECISION for dbdsdc.
Arrays: u(ldu,*), vt(ldvt,*), q(*).
If compq = 'I', then on exit u contains the left singular vectors of the
bidiagonal matrix B, unless info ≠ 0 (see info). For other values of compq, u
is not referenced.
The second dimension of u must be at least max(1,n).
If compq = 'I', then on exit vt^T contains the right singular vectors of the
bidiagonal matrix B, unless info ≠ 0 (see info). For other values of compq,
vt is not referenced. The second dimension of vt must be at least max(1,n).
If compq = 'P', then on exit, if info = 0, q and iq contain the left and
right singular vectors in a compact form. Specifically, q contains all the
REAL (for sbdsdc) or DOUBLE PRECISION (for dbdsdc) data for singular
vectors. For other values of compq, q is not referenced.
iq INTEGER.
Array: iq(*).
If compq = 'P', then on exit, if info = 0, q and iq contain the left and
right singular vectors in a compact form. Specifically, iq contains all the
INTEGER data for singular vectors. For other values of compq, iq is not
referenced.
info INTEGER.
If info = i, the algorithm failed to compute a singular value. The update
process of divide and conquer failed.

Specific details for the routine bdsdc interface are the following:
e Holds the vector of length n.
u Holds the matrix U of size (n,n).
vt Holds the matrix VT of size (n,n).
q Holds the vector of length (ldq), where
ldq ≥ n*(11 + 2*smlsiz + 8*int(log_2(n/(smlsiz + 1)))) and smlsiz is
returned by ilaenv and is equal to the maximum size of the subproblems at the
bottom of the computation tree (usually about 25).
compq Restored based on the presence of arguments u, vt, q, and iq as follows:
compq = 'N', if none of u, vt, q, and iq are present,
compq = 'I', if both u and vt are present. Arguments u and vt must either be
both present or both omitted,
compq = 'P', if both q and iq are present. Arguments q and iq must either be
both present or both omitted.
Note that there will be an error condition if all of u, vt, q, and iq arguments are
present simultaneously.
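The lower bound on the length of q in the Fortran 95 interface can be evaluated before allocating the array. The sketch below is illustrative only: the module bdsdc_sizes and the function bdsdc_min_ldq are hypothetical names, and smlsiz is passed in by the caller (the text above says it is returned by ilaenv and is usually about 25). Returning 0 for n ≤ 0 is an assumption of this sketch, because the logarithm in the formula is undefined there. The base-2 logarithm is computed in floating point, so values of n/(smlsiz + 1) that are exact powers of two are subject to rounding.

```fortran
module bdsdc_sizes
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Lower bound n*(11 + 2*smlsiz + 8*int(log_2(n/(smlsiz + 1)))) on the
  ! length of q for compq = 'P'. Hypothetical helper, not an MKL routine.
  pure function bdsdc_min_ldq(n, smlsiz) result(ldq)
    integer, intent(in) :: n, smlsiz
    integer :: ldq
    real(dp) :: ratio

    if (n <= 0) then
      ldq = 0          ! assumption: nothing to store for an empty problem
      return
    end if
    ratio = real(n, dp) / real(smlsiz + 1, dp)
    ! int() truncates toward zero, as in the formula quoted in the text.
    ldq = n*(11 + 2*smlsiz + 8*int(log(ratio)/log(2.0_dp)))
  end function bdsdc_min_ldq
end module bdsdc_sizes
```

A caller would typically allocate q with at least bdsdc_min_ldq(n, smlsiz) elements before calling bdsdc with the q and iq arguments present.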
See Also
?lasd0
?lasd1
?lasd2
?lasd3
?lasd4
?lasd5
?lasd6
?lasd7
?lasd8
?lasd9
?lasda
?lasdq
?lasdt
Symmetric Eigenvalue Problems: LAPACK Computational Routines

Symmetric eigenvalue problems are posed as follows: given an n-by-n real symmetric or complex
Hermitian matrix A, find the eigenvalues λ and the corresponding eigenvectors z that satisfy the equation
A*z = λ*z (or, equivalently, z^H*A = λ*z^H).
In such eigenvalue problems, all n eigenvalues are real not only for real symmetric but also for complex
Hermitian matrices A, and there exists an orthonormal system of n eigenvectors. If A is a symmetric or
Hermitian positive-definite matrix, all eigenvalues are positive.
To solve a symmetric eigenvalue problem with LAPACK, you usually need to reduce the matrix to tridiagonal
form and then solve the eigenvalue problem with the tridiagonal matrix obtained. LAPACK includes routines
for reducing the matrix to a tridiagonal form by an orthogonal (or unitary) similarity transformation A =
Q*T*Q^H as well as for solving tridiagonal symmetric eigenvalue problems. These routines (for FORTRAN 77
interface) are listed in Table "Computational Routines for Solving Symmetric Eigenvalue Problems". The
corresponding routine names in the Fortran 95 interface are without the first symbol.
There are different routines for symmetric eigenvalue problems, depending on whether you need all
eigenvectors or only some of them or eigenvalues only, whether the matrix A is positive-definite or not, and
so on.
These routines are based on three primary algorithms for computing eigenvalues and eigenvectors of
symmetric problems: the divide and conquer algorithm, the QR algorithm, and bisection followed by inverse
iteration. The divide and conquer algorithm is generally more efficient and is recommended for computing all
eigenvalues and eigenvectors. Furthermore, to solve an eigenvalue problem using the divide and conquer
algorithm, you need to call only one routine. In general, more than one routine has to be called if the QR
algorithm or bisection followed by inverse iteration is used.
The decision tree in Figure "Decision Tree: Real Symmetric Eigenvalue Problems" will help you choose the
right routine or sequence of routines for eigenvalue problems with real symmetric matrices. Figure "Decision
Tree: Complex Hermitian Eigenvalue Problems" presents a similar decision tree for complex Hermitian
matrices.
Figure: Decision Tree: Real Symmetric Eigenvalue Problems
Figure: Decision Tree: Complex Hermitian Eigenvalue Problems
Computational Routines for Solving Symmetric Eigenvalue Problems

Operation                                           Real symmetric    Complex Hermitian
                                                    matrices          matrices
Reduce to tridiagonal form A = Q*T*Q^H              sytrd syrdb       hetrd herdb
  (full storage)
Reduce to tridiagonal form A = Q*T*Q^H              sptrd             hptrd
  (packed storage)
Reduce to tridiagonal form A = Q*T*Q^H              sbtrd             hbtrd
  (band storage)
Generate matrix Q (full storage)                    orgtr             ungtr
Generate matrix Q (packed storage)                  opgtr             upgtr
Apply matrix Q (full storage)                       ormtr             unmtr
Multiply a general matrix by an                     orm22             unm22
  orthogonal/unitary matrix with a 2x2
  structure
Apply matrix Q (packed storage)                     opmtr             upmtr
Find all eigenvalues of a tridiagonal matrix T      sterf
Find all eigenvalues and eigenvectors of a          steqr stedc       steqr stedc
  tridiagonal matrix T
Find all eigenvalues and eigenvectors of a          pteqr             pteqr
  tridiagonal positive-definite matrix T
Find selected eigenvalues of a tridiagonal          stebz stegr       stegr
  matrix T
Find selected eigenvectors of a tridiagonal         stein stegr       stein stegr
  matrix T
Find selected eigenvalues and eigenvectors          stemr             stemr
  of a real symmetric tridiagonal matrix T
Compute the reciprocal condition numbers            disna             disna
  for the eigenvectors
?sytrd
Reduces a real symmetric matrix to tridiagonal form.
Syntax
call ssytrd(uplo, n, a, lda, d, e, tau, work, lwork, info)
call dsytrd(uplo, n, a, lda, d, e, tau, work, lwork, info)
call sytrd(a, tau [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a real symmetric matrix A to symmetric tridiagonal form T by an orthogonal similarity
transformation: A = Q*T*QT. The orthogonal matrix Q is not formed explicitly but is represented as a
product of n-1 elementary reflectors. Routines are provided for working with Q in this representation (see
Application Notes below).
Input Parameters

If uplo = 'U', a stores the upper triangular part of A.
If uplo = 'L', a stores the lower triangular part of A.
n INTEGER. The order of the matrix A (n ≥ 0).
a, work REAL for ssytrd

DOUBLE PRECISION for dsytrd.
a(lda,*) is an array containing either upper or lower triangular part of the
matrix A, as specified by uplo. If uplo = 'U', the leading n-by-n upper
triangular part of a contains the upper triangular part of the matrix A, and
the strictly lower triangular part of A is not referenced. If uplo = 'L', the
leading n-by-n lower triangular part of a contains the lower triangular part
of the matrix A, and the strictly upper triangular part of A is not referenced.

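lwork INTEGER. The size of the work array. If lwork = -1, a workspace query is assumed: the routine
only calculates the optimal size of the work array, returns this value as the first entry of the work
array, and no error message related to lwork is issued by xerbla.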
Output Parameters
a On exit,
if uplo = 'U', the diagonal and first superdiagonal of A are overwritten by
the corresponding elements of the tridiagonal matrix T, and the elements
above the first superdiagonal, with the array tau, represent the orthogonal
matrix Q as a product of elementary reflectors;
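if uplo = 'L', the diagonal and first subdiagonal of A are overwritten by
the corresponding elements of the tridiagonal matrix T, and the elements
below the first subdiagonal, with the array tau, represent the orthogonal
matrix Q as a product of elementary reflectors.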
d, e, tau REAL for ssytrd

DOUBLE PRECISION for dsytrd.
Arrays:
d(*) contains the diagonal elements of the matrix T.
e(*) contains the off-diagonal elements of T.
The size of e must be at least max(1, n-1).
tau(*) stores (n-1) scalars that define elementary reflectors in
decomposition of the orthogonal matrix Q in a product of n-1 elementary
reflectors. tau(n) is used as workspace.
The size of tau must be at least max(1, n).
work(1) If info=0, on exit work(1) contains the minimum value of lwork required
for optimum performance. Use this lwork for subsequent runs.
info INTEGER.

Specific details for the routine sytrd interface are the following:
tau Holds the vector of length (n-1).
Note that diagonal (d) and off-diagonal (e) elements of the matrix T are omitted because they are kept in the
matrix A on exit.
Application Notes
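If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
The computed matrix T is exactly similar to a matrix A+E, where ||E||_2 = c(n)*ε*||A||_2, c(n) is a
modestly increasing function of n, and ε is the machine precision.
The approximate number of floating-point operations is (4/3)n^3.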
After calling this routine, you can call the following:
orgtr to form the computed matrix Q explicitly
ormtr to multiply a real matrix by Q.
The complex counterpart of this routine is ?hetrd.
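The calling sequence described above (reduce with ?sytrd, then form Q with ?orgtr) can be sketched as follows for the double-precision FORTRAN 77 interface. This is an illustrative sketch, not a reference implementation: the module sytrd_demo and the subroutine tridiagonalize_with_q are hypothetical names, the routine assumes n ≥ 1 and lda = n, and it always uses uplo = 'U'. The workspace size is obtained with a query (lwork = -1), which returns the recommended value in the first element of the work array. The external declarations stand in for the interfaces that mkl.fi provides, so the code should also link against other LAPACK implementations.

```fortran
module sytrd_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Reduce the symmetric matrix stored in the upper triangle of a to
  ! tridiagonal form and overwrite a with the orthogonal matrix Q.
  ! On exit, d holds the diagonal and e the off-diagonal of T.
  ! Hypothetical helper; assumes n >= 1.
  subroutine tridiagonalize_with_q(n, a, d, e, info)
    integer, intent(in) :: n
    real(dp), intent(inout) :: a(n, n)
    real(dp), intent(out) :: d(n), e(max(1, n - 1))
    integer, intent(out) :: info
    real(dp) :: tau(max(1, n)), query(1)
    real(dp), allocatable :: work(:)
    integer :: lwork
    external :: dsytrd, dorgtr

    ! Workspace query for dsytrd: lwork = -1 only returns the optimal size.
    call dsytrd('U', n, a, n, d, e, tau, query, -1, info)
    if (info /= 0) return
    lwork = max(1, int(query(1)))
    allocate(work(lwork))
    call dsytrd('U', n, a, n, d, e, tau, work, lwork, info)
    deallocate(work)
    if (info /= 0) return

    ! Same pattern for dorgtr, using the same uplo as supplied to dsytrd.
    call dorgtr('U', n, a, n, tau, query, -1, info)
    if (info /= 0) return
    lwork = max(1, int(query(1)))
    allocate(work(lwork))
    call dorgtr('U', n, a, n, tau, work, lwork, info)
    deallocate(work)
  end subroutine tridiagonalize_with_q
end module sytrd_demo
```

Because ?orgtr overwrites a with Q, a caller that still needs the original matrix should keep a copy before the call. When Q is only needed as an operator on another matrix, ?ormtr avoids forming it explicitly.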
?syrdb
Reduces a real symmetric matrix to tridiagonal form
with Successive Bandwidth Reduction approach.
Syntax
call ssyrdb(jobz, uplo, n, kd, a, lda, d, e, tau, z, ldz, work, lwork, info)
call dsyrdb(jobz, uplo, n, kd, a, lda, d, e, tau, z, ldz, work, lwork, info)
Include Files
mkl.fi
Description
The routine reduces a real symmetric matrix A to symmetric tridiagonal form T by an orthogonal similarity
transformation: A = Q*T*QT and optionally multiplies matrix Z by Q, or simply forms Q.
This routine reduces a full symmetric matrix A to the banded symmetric matrix B, and then to the tridiagonal
symmetric matrix T with a Successive Bandwidth Reduction approach after C. Bischof's works (see for
instance, [Bischof00]). ?syrdb is functionally close to the ?sytrd routine, but the tridiagonal form may differ
from that obtained by ?sytrd. Unlike ?sytrd, the orthogonal matrix Q cannot be restored from the details
of matrix A on exit.
Input Parameters
jobz CHARACTER*1. Must be 'N', 'V', or 'U'.

If jobz = 'N', then only A is reduced to T.
If jobz = 'V', then A is reduced to T and A contains Q on exit.
If jobz = 'U', then A is reduced to T and Z contains Z*Q on exit.

kd INTEGER. The bandwidth of the banded matrix B (kd ≥ 1, kd ≤ n-1).
a,z, work REAL for ssyrdb.

DOUBLE PRECISION for dsyrdb.
a(lda,*) is an array containing either upper or lower triangular part of the
matrix A, as specified by uplo.
z(ldz,*), the second dimension of z must be at least max(1, n).

If jobz = 'U', then the matrix z is multiplied by Q.
If jobz = 'N' or 'V', then z is not referenced.
work(lwork) is a workspace array.
ldz INTEGER. The leading dimension of z; at least max(1, n). Not referenced if
jobz = 'N'
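lwork INTEGER. The size of the work array (lwork ≥ (2kd+1)n+kd).
If lwork = -1, a workspace query is assumed: the routine only calculates the optimal size of the work
array, returns this value as the first entry of the work array, and no error message related to lwork
is issued by xerbla.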
Output Parameters
a If jobz = 'V', then overwritten by Q matrix.
If jobz = 'N' or 'U', then overwritten by the banded matrix B and details
of the orthogonal matrix QB to reduce A to B as specified by uplo.
z On exit,
if jobz = 'U', then the matrix z is overwritten by Z*Q.
d, e, tau REAL for ssyrdb.
DOUBLE PRECISION for dsyrdb.
Arrays:
The dimension of d must be at least max(1, n).
The dimension of e must be at least max(1, n-1).
tau(*) stores further details of the orthogonal matrix Q.
The dimension of tau must be at least max(1, n-kd-1).
info INTEGER.
Application Notes
For better performance, try using lwork = n*(3*kd+3).
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
For better performance, try using kd equal to 40 if n ≤ 2000 and 64 otherwise.
Try using ?syrdb instead of ?sytrd on large matrices when only eigenvalues are needed and no eigenvectors
are required, especially in a multi-threaded environment. ?syrdb becomes faster beginning approximately
with n = 1000, and much faster for larger matrices, with better scalability than ?sytrd.
Avoid applying ?syrdb for computing eigenvectors due to the two-step reduction, that is, the number of
operations needed to apply orthogonal transformations to Z is doubled compared to the traditional one-step
reduction. In that case it is better to apply ?sytrd and ?ormtr/?orgtr to obtain tridiagonal form along with
the orthogonal transformation matrix Q.
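The tuning advice in these notes can be collected in a few helper functions. The sketch below is a hedged illustration: the module syrdb_tuning and its functions are hypothetical names, not MKL routines. It assumes the threshold reads "kd = 40 if n ≤ 2000 and 64 otherwise" (the comparison sign is not legible in the extracted text) and clamps the result to the documented range 1 ≤ kd ≤ n-1. The same values apply to ?herdb, whose notes give the same recommendations.

```fortran
module syrdb_tuning
  implicit none
contains
  ! Suggested bandwidth: 40 for n <= 2000 (assumed reading of the note),
  ! 64 otherwise, clamped to the documented range 1 <= kd <= n-1.
  pure function syrdb_suggest_kd(n) result(kd)
    integer, intent(in) :: n
    integer :: kd
    if (n <= 2000) then
      kd = 40
    else
      kd = 64
    end if
    kd = max(1, min(kd, n - 1))
  end function syrdb_suggest_kd

  ! Minimum workspace size stated for lwork: (2kd+1)n + kd.
  pure function syrdb_min_lwork(n, kd) result(lwork)
    integer, intent(in) :: n, kd
    integer :: lwork
    lwork = (2*kd + 1)*n + kd
  end function syrdb_min_lwork

  ! Workspace size suggested for better performance: n*(3*kd+3).
  pure function syrdb_suggest_lwork(n, kd) result(lwork)
    integer, intent(in) :: n, kd
    integer :: lwork
    lwork = n*(3*kd + 3)
  end function syrdb_suggest_lwork
end module syrdb_tuning
```

For n ≥ 1 and kd ≥ 1 the suggested size is never smaller than the minimum, since n*(3*kd+3) - ((2kd+1)n + kd) = kd*n + 2n - kd > 0. Default integers are used, so very large n may require a wider integer kind.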
?herdb
Reduces a complex Hermitian matrix to tridiagonal
form with Successive Bandwidth Reduction approach.
Syntax
call cherdb(jobz, uplo, n, kd, a, lda, d, e, tau, z, ldz, work, lwork, info)
call zherdb(jobz, uplo, n, kd, a, lda, d, e, tau, z, ldz, work, lwork, info)
Include Files
mkl.fi
Description
The routine reduces a complex Hermitian matrix A to symmetric tridiagonal form T by a unitary similarity
transformation: A = Q*T*Q^H and optionally multiplies matrix Z by Q, or simply forms Q.
This routine reduces a full Hermitian matrix A to the banded Hermitian matrix B, and then to the tridiagonal
symmetric matrix T with a Successive Bandwidth Reduction approach after C. Bischof's works (see for
instance, [Bischof00]). ?herdb is functionally close to the ?hetrd routine, but the tridiagonal form may differ
from that obtained by ?hetrd. Unlike ?hetrd, the unitary matrix Q cannot be restored from the details
of matrix A on exit.
Input Parameters

If jobz = 'N', then only A is reduced to T.
If jobz = 'V', then A is reduced to T and A contains Q on exit.
If jobz = 'U', then A is reduced to T and Z contains Z*Q on exit.

kd INTEGER. The bandwidth of the banded matrix B (kd ≥ 1, kd ≤ n-1).
a,z, work COMPLEX for cherdb.

DOUBLE COMPLEX for zherdb.
z(ldz,*), the second dimension of z must be at least max(1, n).

If jobz = 'U', then the matrix z is multiplied by Q.
ldz INTEGER. The leading dimension of z; at least max(1, n). Not referenced if
jobz = 'N'
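lwork INTEGER. The size of the work array (lwork ≥ (2kd+1)n+kd).
If lwork = -1, a workspace query is assumed: the routine only calculates the optimal size of the work
array, returns this value as the first entry of the work array, and no error message related to lwork
is issued by xerbla.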
Output Parameters
a If jobz = 'V', then overwritten by Q matrix.
If jobz = 'N' or 'U', then overwritten by the banded matrix B and details
of the unitary matrix QB to reduce A to B as specified by uplo.
z On exit,
if jobz = 'U', then the matrix z is overwritten by Z*Q .
d, e REAL for cherdb.

DOUBLE PRECISION for zherdb.
Arrays:
d(*) contains the diagonal elements of the matrix T.
The dimension of d must be at least max(1, n).
e(*) contains the off-diagonal elements of T.
The dimension of e must be at least max(1, n-1).
tau COMPLEX for cherdb.

DOUBLE COMPLEX for zherdb.
Array, size at least max(1, n-1)
Stores further details of the unitary matrix QB. The dimension of tau must
be at least max(1, n-kd-1).
info INTEGER.
Application Notes
For better performance, try using lwork = n*(3*kd+3).
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
For better performance, try using kd equal to 40 if n ≤ 2000 and 64 otherwise.
Try using ?herdb instead of ?hetrd on large matrices when only eigenvalues are needed and no eigenvectors
are required, especially in a multi-threaded environment. ?herdb becomes faster beginning approximately
with n = 1000, and much faster for larger matrices, with better scalability than ?hetrd.
Avoid applying ?herdb for computing eigenvectors due to the two-step reduction, that is, the number of
operations needed to apply orthogonal transformations to Z is doubled compared to the traditional one-step
reduction. In that case it is better to apply ?hetrd and ?unmtr/?ungtr to obtain tridiagonal form along with
the unitary transformation matrix Q.
?orgtr
Generates the real orthogonal matrix Q determined
by ?sytrd.
Syntax
call sorgtr(uplo, n, a, lda, tau, work, lwork, info)
call dorgtr(uplo, n, a, lda, tau, work, lwork, info)
call orgtr(a, tau [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine explicitly generates the n-by-n orthogonal matrix Q formed by sytrd when reducing a real
symmetric matrix A to tridiagonal form: A = Q*T*QT. Use this routine after a call to ?sytrd.
Input Parameters

Use the same uplo as supplied to ?sytrd.
n INTEGER. The order of the matrix Q (n ≥ 0).
a, tau, work REAL for sorgtr
DOUBLE PRECISION for dorgtr.

Arrays:
a(lda,*) is the array a as returned by ?sytrd.

tau(*) is the array tau as returned by ?sytrd.
The size of tau must be at least max(1, n-1).


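lwork INTEGER. The size of the work array. If lwork = -1, a workspace query is assumed: the routine
only calculates the optimal size of the work array, returns this value as the first entry of the work
array, and no error message related to lwork is issued by xerbla.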
Output Parameters
a Overwritten by the orthogonal matrix Q.

info INTEGER.

Specific details for the routine orgtr interface are the following:
Application Notes
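For better performance, try using lwork = (n-1)*blocksize, where blocksize is a machine-dependent
value (typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
The computed matrix Q differs from an exactly orthogonal matrix by a matrix E such that ||E||_2 = O(ε),
where ε is the machine precision.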
The complex counterpart of this routine is ungtr.
?ormtr
Multiplies a real matrix by the real orthogonal matrix
Q determined by ?sytrd.
Syntax
call sormtr(side, uplo, trans, m, n, a, lda, tau, c, ldc, work, lwork, info)
call dormtr(side, uplo, trans, m, n, a, lda, tau, c, ldc, work, lwork, info)
call ormtr(a, tau, c [,side] [,uplo] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a real matrix C by Q or QT, where Q is the orthogonal matrix Q formed by sytrd when
reducing a real symmetric matrix A to tridiagonal form: A = Q*T*QT. Use this routine after a call to ?sytrd.
Input Parameters
In the descriptions below, r denotes the order of Q:


Use the same uplo as supplied to ?sytrd.

a, c, tau, work REAL for sormtr

DOUBLE PRECISION for dormtr
a(lda,*) and tau are the arrays returned by ?sytrd.
The second dimension of a must be at least max(1, r).

The size of tau must be at least max(1, r-1).
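lda INTEGER. The leading dimension of a; lda ≥ max(1, r).
lwork INTEGER. The size of the work array. If lwork = -1, a workspace query is assumed: the routine
only calculates the optimal size of the work array, returns this value as the first entry of the work
array, and no error message related to lwork is issued by xerbla.
Output Parameters
c Overwritten by the product Q*C, Q^T*C, C*Q, or C*Q^T (as specified by side
and trans).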

info INTEGER.

Specific details for the routine ormtr interface are the following:
a Holds the matrix A of size (r,r).

tau Holds the vector of length (r-1).
Application Notes
For better performance, try using lwork = n*blocksize for side = 'L', or lwork = m*blocksize for
side = 'R', where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum
performance of the blocked algorithm.
If you are in doubt how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
The total number of floating-point operations is approximately 2*m^2*n, if side = 'L', or 2*n^2*m, if side =
'R'.
The complex counterpart of this routine is unmtr.
?hetrd
Reduces a complex Hermitian matrix to tridiagonal
form.
Syntax
call chetrd(uplo, n, a, lda, d, e, tau, work, lwork, info)
call zhetrd(uplo, n, a, lda, d, e, tau, work, lwork, info)
call hetrd(a, tau [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a complex Hermitian matrix A to symmetric tridiagonal form T by a unitary similarity
transformation: A = Q*T*QH. The unitary matrix Q is not formed explicitly but is represented as a product of
n-1 elementary reflectors. Routines are provided to work with Q in this representation. (They are described
later in this section .)
Input Parameters

a, work COMPLEX for chetrd

DOUBLE COMPLEX for zhetrd.
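a(lda,*) is an array containing either upper or lower triangular part of the
matrix A, as specified by uplo. If uplo = 'U', the leading n-by-n upper
triangular part of a contains the upper triangular part of the matrix A, and
the strictly lower triangular part of A is not referenced. If uplo = 'L', the
leading n-by-n lower triangular part of a contains the lower triangular part
of the matrix A, and the strictly upper triangular part of A is not referenced.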

xerbla.
Output Parameters
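a On exit,
if uplo = 'U', the diagonal and first superdiagonal of A are overwritten by
the corresponding elements of the tridiagonal matrix T, and the elements
above the first superdiagonal, with the array tau, represent the unitary
matrix Q as a product of elementary reflectors;
if uplo = 'L', the diagonal and first subdiagonal of A are overwritten by
the corresponding elements of the tridiagonal matrix T, and the elements
below the first subdiagonal, with the array tau, represent the unitary
matrix Q as a product of elementary reflectors.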
d, e REAL for chetrd

DOUBLE PRECISION for zhetrd.
Arrays:
tau COMPLEX for chetrd
DOUBLE COMPLEX for zhetrd.
Array, size at least max(1, n-1). Stores the (n-1) scalars that define the elementary
reflectors in the decomposition of the unitary matrix Q into a product of n-1
elementary reflectors.

info INTEGER.

Specific details for the routine hetrd interface are the following:
matrix A on exit.
Application Notes
lwork = -1.
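The computed matrix T is exactly similar to a matrix A + E, where ||E||_2 = c(n)*ε*||A||_2, c(n) is a
modestly increasing function of n, and ε is the machine precision.
After calling this routine, you can call the following routines:
ungtr to form the computed matrix Q explicitly
unmtr to multiply a complex matrix by Q.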
The real counterpart of this routine is ?sytrd.
?ungtr
Generates the complex unitary matrix Q determined
by ?hetrd.
Syntax
call cungtr(uplo, n, a, lda, tau, work, lwork, info)
call zungtr(uplo, n, a, lda, tau, work, lwork, info)
call ungtr(a, tau [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine explicitly generates the n-by-n unitary matrix Q formed by ?hetrd when reducing a complex
Hermitian matrix A to tridiagonal form: A = Q*T*Q^H. Use this routine after a call to ?hetrd.
Input Parameters

Use the same uplo as supplied to ?hetrd.
a, tau, work COMPLEX for cungtr

DOUBLE COMPLEX for zungtr.
Arrays:
a(lda,*) is the array a as returned by ?hetrd.

tau(*) is the array tau as returned by ?hetrd.
The dimension of tau must be at least max(1, n-1).


xerbla.
Output Parameters
a Overwritten by the unitary matrix Q.

info INTEGER.

Specific details for the routine ungtr interface are the following:
Application Notes
For better performance, try using lwork = (n-1)*blocksize, where blocksize is a machine-dependent
= -1.
The computed matrix Q differs from an exactly unitary matrix by a matrix E such that ||E||_2 = O(ε), where
ε is the machine precision.
The real counterpart of this routine is orgtr.
?unmtr
Multiplies a complex matrix by the complex unitary
matrix Q determined by ?hetrd.
Syntax
call cunmtr(side, uplo, trans, m, n, a, lda, tau, c, ldc, work, lwork, info)
call zunmtr(side, uplo, trans, m, n, a, lda, tau, c, ldc, work, lwork, info)
call unmtr(a, tau, c [,side] [,uplo] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a complex matrix C by Q or Q^H, where Q is the unitary matrix formed by ?hetrd when
reducing a complex Hermitian matrix A to tridiagonal form: A = Q*T*Q^H. Use this routine after a call to
?hetrd.
Input Parameters


Use the same uplo as supplied to ?hetrd.

a, c, tau, work COMPLEX for cunmtr

DOUBLE COMPLEX for zunmtr.
a(lda,*) and tau are the arrays returned by ?hetrd.
The second dimension of a must be at least max(1, r).

The dimension of tau must be at least max(1, r-1).
lda INTEGER. The leading dimension of a; lda ≥ max(1, r).
ldc INTEGER. The leading dimension of c; ldc ≥ max(1, m).

xerbla.
Output Parameters
and trans).

info INTEGER.

Specific details for the routine unmtr interface are the following:

Application Notes
For better performance, try using lwork = n*blocksize (for side = 'L') or lwork = m*blocksize (for
= -1.
The computed product differs from the exact product by a matrix E such that ||E||_2 = O(ε)*||C||_2, where
ε is the machine precision.
The total number of floating-point operations is approximately 8*m^2*n if side = 'L' or 8*n^2*m if side =
'R'.
The real counterpart of this routine is ormtr.
?orm22/?unm22
Multiplies a general matrix by an orthogonal/unitary
matrix with a 2x2 structure.
Syntax
call sorm22 (side, trans, m, n, n1, n2, q, ldq, c, ldc, work, lwork, info )
call dorm22 (side, trans, m, n, n1, n2, q, ldq, c, ldc, work, lwork, info )
call cunm22(side, trans, m, n, n1, n2, q, ldq, c, ldc, work, lwork, info)
call zunm22(side, trans, m, n, n1, n2, q, ldq, c, ldc, work, lwork, info)
Include Files
mkl.fi
Description
?orm22/?unm22 overwrites the general real/complex m-by-n matrix C with

              side = 'L'   side = 'R'
trans = 'N'   Q*C          C*Q
trans = 'T'   Q^T*C        C*Q^T       (applies to sorm22 and dorm22 only)
trans = 'C'   Q^H*C        C*Q^H       (applies to cunm22 and zunm22 only)

where Q is a real orthogonal/complex unitary matrix of order nq, with nq = m if side = 'L' and nq = n if side
= 'R'.
The orthogonal/unitary matrix Q possesses a 2-by-2 block structure:

    Q = | Q11  Q12 |
        | Q21  Q22 |

where Q12 is an n1-by-n1 lower triangular matrix and Q21 is an n2-by-n2 upper triangular matrix.
Input Parameters
side CHARACTER*1. = 'L': apply Q, Q^T, or Q^H from the left;

= 'R': apply Q, Q^T, or Q^H from the right.
trans CHARACTER*1. = 'N': apply Q (no transpose);

= 'T': apply Q^T (transpose) - sorm22 and dorm22 only;
= 'C': apply Q^H (conjugate transpose) - cunm22 and zunm22 only.
m INTEGER. The number of rows of the matrix C.

m ≥ 0.
n INTEGER. The number of columns of the matrix C.
n ≥ 0.
n1 INTEGER. The dimension of Q12.

n1 ≥ 0.
The following requirement must be satisfied: n1 + n2 = m if side = 'L' and
n1 + n2 = n if side = 'R'.
n2 INTEGER. The dimension of Q21.

n2 ≥ 0.
The following requirement must be satisfied: n1 + n2 = m if side = 'L' and
n1 + n2 = n if side = 'R'.
q REAL for sorm22

DOUBLE PRECISION for dorm22
COMPLEX for cunm22
DOUBLE COMPLEX for zunm22
Array, size (ldq,m) if side = 'L' and (ldq,n) if side = 'R'.
ldq INTEGER. The leading dimension of the array q.
ldq ≥ max(1,m) if side = 'L';

ldq ≥ max(1,n) if side = 'R'.
c REAL for sorm22

DOUBLE PRECISION for dorm22
COMPLEX for cunm22
DOUBLE COMPLEX for zunm22
Array, size (ldc,n)
On entry, the m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c.
ldc ≥ max(1,m).
lwork INTEGER. The dimension of the array work.
If side = 'L', lwork ≥ max(1,n);
if side = 'R', lwork ≥ max(1,m).
For optimum performance lwork ≥ m*n.

xerbla.
Output Parameters
c On exit, c is overwritten by the product Q*C, Q^T*C, Q^H*C, C*Q^T, C*Q^H, or C*Q.
work REAL for sorm22

DOUBLE PRECISION for dorm22
COMPLEX for cunm22
DOUBLE COMPLEX for zunm22
Array, size (max(1,lwork))
On exit, if info = 0, work(1) returns the optimal lwork.

?sptrd
Reduces a real symmetric matrix to tridiagonal form
using packed storage.
Syntax
call ssptrd(uplo, n, ap, d, e, tau, info)
call dsptrd(uplo, n, ap, d, e, tau, info)
call sptrd(ap, tau [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a packed real symmetric matrix A to symmetric tridiagonal form T by an orthogonal
similarity transformation: A = Q*T*Q^T. The orthogonal matrix Q is not formed explicitly but is represented as
a product of n-1 elementary reflectors. Routines are provided for working with Q in this representation. See
Application Notes below for details.
Input Parameters

If uplo = 'U', ap stores the packed upper triangle of A.
If uplo = 'L', ap stores the packed lower triangle of A.
ap REAL for ssptrd

DOUBLE PRECISION for dsptrd.
Array, size at least max(1, n(n+1)/2). Contains either upper or lower
triangle of A (as specified by uplo) in the packed form described in Matrix
Storage Schemes.
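As an illustration of the packed form, the sketch below packs the upper triangle of a small symmetric matrix column by column, so that A(i,j) with i ≤ j lands in ap(i + (j-1)*j/2). The matrix values and the program name are hypothetical; the index formula is the standard LAPACK packed-storage convention for uplo = 'U' (for uplo = 'L' the corresponding position is i + (j-1)*(2*n-j)/2 with i ≥ j).

```fortran
program packed_storage_demo
  implicit none
  integer, parameter :: n = 3
  double precision :: a(n,n), ap(n*(n+1)/2)
  integer :: i, j

  ! Hypothetical symmetric test matrix; only the upper triangle is read below.
  a = reshape([4.d0, 1.d0, 2.d0, &
               1.d0, 5.d0, 3.d0, &
               2.d0, 3.d0, 6.d0], [n, n])

  ! uplo = 'U': the columns of the upper triangle are stored one after another.
  do j = 1, n
     do i = 1, j
        ap(i + (j-1)*j/2) = a(i,j)
     end do
  end do

  ! ap now holds a11, a12, a22, a13, a23, a33.
  print '(6f6.1)', ap
end program packed_storage_demo
```

An array packed this way has exactly n(n+1)/2 elements, which is the minimum size required for ap.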
Output Parameters
ap Overwritten by the tridiagonal matrix T and details of the orthogonal matrix

Q, as specified by uplo.
d, e, tau REAL for ssptrd

DOUBLE PRECISION for dsptrd.
Arrays:
tau(*) Stores the (n-1) scalars that define the elementary reflectors in the
decomposition of the matrix Q into a product of n-1 reflectors.
info INTEGER.

Specific details for the routine sptrd interface are the following:
tau Holds the vector with the number of elements n-1.
matrix A on exit.
Application Notes
The matrix Q is represented as a product of n-1 elementary reflectors, as follows:
If uplo = 'U', Q = H(n-1) ... H(2)H(1)

H(i) = I - tau*v*v^T
where tau is a real scalar and v is a real vector with v(i+1:n) = 0 and v(i) = 1.
On exit, tau is stored in tau(i), and v(1:i-1) is stored in AP, overwriting A(1:i-1, i+1).
If uplo = 'L', Q = H(1)H(2) ... H(n-1)

H(i) = I - tau*v*v^T
where tau is a real scalar and v is a real vector with v(1:i) = 0 and v(i+1) = 1.
On exit, tau is stored in tau(i), and v(i+2:n) is stored in AP, overwriting A(i+2:n, i).
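To make the form H = I - tau*v*v^T concrete, the following sketch builds one reflector for a two-element vector and applies it without forming H. It is an illustration only: the way tau and v are chosen here follows the usual convention of the LAPACK auxiliary routine ?larfg (v(1) = 1, H*x = beta*e1), which is an assumption and is not described in this section; the data and names are hypothetical.

```fortran
program reflector_demo
  implicit none
  integer, parameter :: n = 2
  double precision :: x(n), v(n), y(n), alpha, beta, tau

  x = [3.d0, 4.d0]

  ! Choose beta with the sign opposite to x(1) to avoid cancellation.
  alpha = x(1)
  beta  = -sign(norm2(x), alpha)
  tau   = (beta - alpha) / beta
  v(1)  = 1.d0
  v(2:) = x(2:) / (alpha - beta)

  ! Apply H = I - tau*v*v^T to x as a rank-one update.
  y = x - tau * v * dot_product(v, x)

  ! y is beta*e1: the reflector annihilates every entry below the first.
  print '(2f8.3)', y
end program reflector_demo
```

For x = (3, 4) this gives beta = -5, tau = 1.6, and v = (1, 0.5), so H*x = (-5, 0).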
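The computed matrix T is exactly similar to a matrix A + E, where ||E||_2 = c(n)*ε*||A||_2, c(n) is a
modestly increasing function of n, and ε is the machine precision. The approximate number of floating-point
operations is (4/3)n^3.
After calling this routine, you can call the following routines:
opgtr to form the computed matrix Q explicitly
opmtr to multiply a real matrix by Q.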
The complex counterpart of this routine is hptrd.
?opgtr
Generates the real orthogonal matrix Q determined
by ?sptrd.
Syntax
call sopgtr(uplo, n, ap, tau, q, ldq, work, info)
call dopgtr(uplo, n, ap, tau, q, ldq, work, info)
call opgtr(ap, tau, q [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine explicitly generates the n-by-n orthogonal matrix Q formed by ?sptrd when reducing a packed real
symmetric matrix A to tridiagonal form: A = Q*T*Q^T. Use this routine after a call to ?sptrd.
Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'. Use the same uplo as supplied to
?sptrd.
ap, tau REAL for sopgtr

DOUBLE PRECISION for dopgtr.
Arrays ap and tau, as returned by ?sptrd.
The size of ap must be at least max(1, n(n+1)/2).

The size of tau must be at least max(1, n-1).
ldq INTEGER. The leading dimension of the output array q; at least max(1, n).
work REAL for sopgtr

DOUBLE PRECISION for dopgtr.
Workspace array, size at least max(1, n-1).
Output Parameters
q REAL for sopgtr

DOUBLE PRECISION for dopgtr.
Array, size (ldq,*).
Contains the computed matrix Q.
The second dimension of q must be at least max(1, n).
info INTEGER.

Specific details for the routine opgtr interface are the following:
tau Holds the vector with the number of elements n - 1.
q Holds the matrix Q of size (n,n).
Application Notes
The complex counterpart of this routine is upgtr.
?opmtr
Multiplies a real matrix by the real orthogonal matrix
Q determined by ?sptrd.
Syntax
call sopmtr(side, uplo, trans, m, n, ap, tau, c, ldc, work, info)
call dopmtr(side, uplo, trans, m, n, ap, tau, c, ldc, work, info)
call opmtr(ap, tau, c [,side] [,uplo] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a real matrix C by Q or Q^T, where Q is the orthogonal matrix formed by ?sptrd when
reducing a packed real symmetric matrix A to tridiagonal form: A = Q*T*Q^T. Use this routine after a call to
?sptrd.
Input Parameters


Use the same uplo as supplied to ?sptrd.

ap, tau, c, work REAL for sopmtr

DOUBLE PRECISION for dopmtr.
ap and tau are the arrays returned by ?sptrd.
The dimension of ap must be at least max(1, r(r+1)/2).

The dimension of tau must be at least max(1, r-1).
The dimension of work must be at least
max(1, n) if side = 'L';
max(1, m) if side = 'R'.
ldc INTEGER. The leading dimension of c; ldc ≥ max(1, m).
Output Parameters
and trans).
info INTEGER.

Specific details for the routine opmtr interface are the following:
ap Holds the array A of size (r*(r+1)/2), where r = m if side = 'L', and r = n if side = 'R'.

tau Holds the vector with the number of elements r - 1.
Application Notes
The computed product differs from the exact product by a matrix E such that ||E||_2 = O(ε)*||C||_2, where
ε is the machine precision.
The total number of floating-point operations is approximately 2*m^2*n if side = 'L', or 2*n^2*m if side =
'R'.
The complex counterpart of this routine is upmtr.
?hptrd
Reduces a complex Hermitian matrix to tridiagonal
form using packed storage.
Syntax
call chptrd(uplo, n, ap, d, e, tau, info)
call zhptrd(uplo, n, ap, d, e, tau, info)
call hptrd(ap, tau [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a packed complex Hermitian matrix A to symmetric tridiagonal form T by a unitary
similarity transformation: A = Q*T*Q^H. The unitary matrix Q is not formed explicitly but is represented as a
product of n-1 elementary reflectors. Routines are provided for working with Q in this representation (see
Application Notes below).
Input Parameters

If uplo = 'U', ap stores the packed upper triangle of A.
If uplo = 'L', ap stores the packed lower triangle of A.
ap COMPLEX for chptrd

DOUBLE COMPLEX for zhptrd.
Array, size at least max(1, n(n+1)/2). Contains either upper or lower
triangle of A (as specified by uplo) in the packed form described in Matrix
Storage Schemes.
Output Parameters
ap Overwritten by the tridiagonal matrix T and details of the unitary matrix Q,

as specified by uplo.
d, e REAL for chptrd

DOUBLE PRECISION for zhptrd.
Arrays:
tau COMPLEX for chptrd

DOUBLE COMPLEX for zhptrd.
Array, size at least max(1, n-1). Stores the (n-1) scalars that define the elementary
reflectors in the decomposition of the unitary matrix Q into a product of
reflectors.
info INTEGER.

Specific details for the routine hptrd interface are the following:
matrix A on exit.
Application Notes
After calling this routine, you can call the following routines:
upgtr to form the computed matrix Q explicitly
upmtr to multiply a complex matrix by Q.
The real counterpart of this routine is sptrd.
?upgtr
Generates the complex unitary matrix Q determined
by ?hptrd.
Syntax
call cupgtr(uplo, n, ap, tau, q, ldq, work, info)
call zupgtr(uplo, n, ap, tau, q, ldq, work, info)
call upgtr(ap, tau, q [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine explicitly generates the n-by-n unitary matrix Q formed by ?hptrd when reducing a packed
complex Hermitian matrix A to tridiagonal form: A = Q*T*Q^H. Use this routine after a call to ?hptrd.
Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'. Use the same uplo as supplied to
?hptrd.
ap, tau COMPLEX for cupgtr

DOUBLE COMPLEX for zupgtr.
Arrays ap and tau, as returned by ?hptrd.
The dimension of ap must be at least max(1, n(n+1)/2).

ldq INTEGER. The leading dimension of the output array q;

at least max(1, n).
work COMPLEX for cupgtr

DOUBLE COMPLEX for zupgtr.
Workspace array, size at least max(1, n-1).
Output Parameters
q COMPLEX for cupgtr

DOUBLE COMPLEX for zupgtr.
Array, size (ldq,*).
Contains the computed matrix Q.
info INTEGER.

Specific details for the routine upgtr interface are the following:
Application Notes
The real counterpart of this routine is opgtr.
?upmtr
Multiplies a complex matrix by the unitary matrix Q
determined by ?hptrd.
Syntax
call cupmtr(side, uplo, trans, m, n, ap, tau, c, ldc, work, info)
call zupmtr(side, uplo, trans, m, n, ap, tau, c, ldc, work, info)
call upmtr(ap, tau, c [,side] [,uplo] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a complex matrix C by Q or Q^H, where Q is the unitary matrix formed by ?hptrd when
reducing a packed complex Hermitian matrix A to tridiagonal form: A = Q*T*Q^H. Use this routine after a call
to ?hptrd.
Input Parameters


Use the same uplo as supplied to ?hptrd.

If trans = 'C', the routine multiplies C by Q^H.
ap, tau, c, work COMPLEX for cupmtr

DOUBLE COMPLEX for zupmtr.
ap and tau are the arrays returned by ?hptrd.
The size of ap must be at least max(1, r(r+1)/2).

The size of tau must be at least max(1, r-1).
The dimension of work must be at least
max(1, n) if side = 'L';
max(1, m) if side = 'R'.
Output Parameters
and trans).
info INTEGER.

Specific details for the routine upmtr interface are the following:
ap Holds the array A of size (r*(r+1)/2), where r = m if side = 'L', and r = n if side = 'R'.

uplo Must be 'U' or 'L'. The default value is 'U'.
Application Notes
The computed product differs from the exact product by a matrix E such that ||E||_2 = O(ε)*||C||_2, where
ε is the machine precision.
The total number of floating-point operations is approximately 8*m^2*n if side = 'L' or 8*n^2*m if side =
'R'.
The real counterpart of this routine is opmtr.
?sbtrd
Reduces a real symmetric band matrix to tridiagonal
form.
Syntax
call ssbtrd(vect, uplo, n, kd, ab, ldab, d, e, q, ldq, work, info)
call dsbtrd(vect, uplo, n, kd, ab, ldab, d, e, q, ldq, work, info)
call sbtrd(ab[, q] [,vect] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a real symmetric band matrix A to symmetric tridiagonal form T by an orthogonal
similarity transformation: A = Q*T*Q^T. The orthogonal matrix Q is determined as a product of Givens
rotations.
If required, the routine can also form the matrix Q explicitly.
Input Parameters
vect CHARACTER*1. Must be 'V', 'N', or 'U'.

If vect = 'V', the routine returns the explicit matrix Q.
If vect = 'N', the routine does not return Q.
If vect = 'U', the routine updates matrix X by forming X*Q.

If uplo = 'U', ab stores the upper triangular part of A.
If uplo = 'L', ab stores the lower triangular part of A.
n INTEGER. The order of the matrix A (n ≥ 0).
kd INTEGER. The number of super- or sub-diagonals in A

(kd ≥ 0).
ab, q, work REAL for ssbtrd

DOUBLE PRECISION for dsbtrd.
ab(ldab,*) is an array containing either upper or lower triangular part of the
matrix A (as specified by uplo) in band storage format.
q(ldq,*) is an array.
If vect = 'U', the q array must contain an n-by-n matrix X.
If vect = 'N' or 'V', the q parameter need not be set.

The dimension of work must be at least max(1, n).
ldab INTEGER. The leading dimension of ab; at least kd+1 .
ldq INTEGER. The leading dimension of q. Constraints:

ldq ≥ max(1, n) if vect = 'V' or 'U';
ldq ≥ 1 if vect = 'N'.
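The band storage format expected in ab can be sketched as follows for uplo = 'U': element A(i,j) of the upper triangle is stored in ab(kd+1+i-j, j) for max(1, j-kd) ≤ i ≤ j, which is why ldab must be at least kd+1. The test matrix and program name below are hypothetical; the index mapping is the standard LAPACK band-storage convention (for uplo = 'L' the position is ab(1+i-j, j) with j ≤ i ≤ min(n, j+kd)).

```fortran
program band_storage_demo
  implicit none
  integer, parameter :: n = 4, kd = 1, ldab = kd + 1
  double precision :: a(n,n), ab(ldab,n)
  integer :: i, j

  ! Hypothetical symmetric tridiagonal test matrix (2 on the diagonal, -1 beside it).
  a = 0.d0
  do i = 1, n
     a(i,i) = 2.d0
     if (i < n) then
        a(i,i+1) = -1.d0
        a(i+1,i) = -1.d0
     end if
  end do

  ! uplo = 'U': row kd+1 of ab holds the diagonal, row kd the first superdiagonal.
  ab = 0.d0
  do j = 1, n
     do i = max(1, j-kd), j
        ab(kd+1+i-j, j) = a(i,j)
     end do
  end do

  do i = 1, ldab
     print '(4f6.1)', ab(i,:)
  end do
end program band_storage_demo
```

The element ab(1,1) has no counterpart in A and is left at zero here.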
Output Parameters
ab On exit, the diagonal elements of the array ab are overwritten by the

diagonal elements of the tridiagonal matrix T. If kd > 0, the elements on
the first superdiagonal (if uplo = 'U') or the first subdiagonal (if uplo =
'L') are overwritten by the off-diagonal elements of T. The rest of ab is
overwritten by values generated during the reduction.
d, e, q REAL for ssbtrd

DOUBLE PRECISION for dsbtrd.
Arrays:
q(ldq,*) is not referenced if vect = 'N'.
If vect = 'V', q contains the n-by-n matrix Q.
If vect = 'U', q contains the product X*Q.
The second dimension of q must be:

at least max(1, n) if vect = 'V';
at least 1 if vect = 'N'.
info INTEGER.

Specific details for the routine sbtrd interface are the following:
vect If omitted, this argument is restored based on the presence of argument q as

follows: vect = 'V', if q is present, vect = 'N', if q is omitted.
If present, vect must be equal to 'V' or 'U' and the argument q must also be
present. Note that there will be an error condition if vect is present and q
omitted.
matrix A on exit.
Application Notes
The computed matrix T is exactly similar to a matrix A + E, where ||E||_2 = c(n)*ε*||A||_2, c(n) is a
modestly increasing function of n, and ε is the machine precision. The computed matrix Q differs from an
exactly orthogonal matrix by a matrix E such that ||E||_2 = O(ε).
The total number of floating-point operations is approximately 6n^2*kd if vect = 'N', with 3n^3*(kd-1)/kd
additional operations if vect = 'V'.
The complex counterpart of this routine is hbtrd.
?hbtrd
Reduces a complex Hermitian band matrix to
tridiagonal form.
Syntax
call chbtrd(vect, uplo, n, kd, ab, ldab, d, e, q, ldq, work, info)
call zhbtrd(vect, uplo, n, kd, ab, ldab, d, e, q, ldq, work, info)
call hbtrd(ab [, q] [,vect] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a complex Hermitian band matrix A to symmetric tridiagonal form T by a unitary
similarity transformation: A = Q*T*Q^H. The unitary matrix Q is determined as a product of Givens rotations.
If required, the routine can also form the matrix Q explicitly.
Input Parameters
vect CHARACTER*1. Must be 'V', 'N', or 'U'.

If vect = 'V', the routine returns the explicit matrix Q.
If vect = 'N', the routine does not return Q.
If vect = 'U', the routine updates matrix X by forming X*Q.


(kd ≥ 0).
ab, work COMPLEX for chbtrd

DOUBLE COMPLEX for zhbtrd.
ab(ldab,*) is an array containing either upper or lower triangular part of the
matrix A (as specified by uplo) in band storage format.

q COMPLEX for chbtrd

DOUBLE COMPLEX for zhbtrd.
q(ldq,*) is an array.
If vect = 'U', the q array must contain an n-by-n matrix X.
If vect = 'N' or 'V', the q parameter need not be set.
ldab INTEGER. The leading dimension of ab; at least kd+1.
ldq INTEGER. The leading dimension of q. Constraints:

ldq ≥ max(1, n) if vect = 'V' or 'U';
ldq ≥ 1 if vect = 'N'.
Output Parameters
ab On exit, the diagonal elements of the array ab are overwritten by the

diagonal elements of the tridiagonal matrix T. If kd > 0, the elements on
the first superdiagonal (if uplo = 'U') or the first subdiagonal (if uplo =
'L') are overwritten by the off-diagonal elements of T. The rest of ab is
overwritten by values generated during the reduction.
d, e REAL for chbtrd

DOUBLE PRECISION for zhbtrd.
Arrays:
q If vect = 'N', q is not referenced.
If vect = 'V', q contains the n-by-n matrix Q.
If vect = 'U', q contains the product X*Q.
The second dimension of q must be:

at least max(1, n) if vect = 'V';
at least 1 if vect = 'N'.
info INTEGER.
Specific details for the routine hbtrd interface are the following:
vect If omitted, this argument is restored based on the presence of argument q as

follows: vect = 'V', if q is present, vect = 'N', if q is omitted.
If present, vect must be equal to 'V' or 'U' and the argument q must also be
present. Note that there will be an error condition if vect is present and q
omitted.
matrix A on exit.
Application Notes
The computed matrix T is exactly similar to a matrix A + E, where ||E||_2 = c(n)*ε*||A||_2, c(n) is a
modestly increasing function of n, and ε is the machine precision. The computed matrix Q differs from an
exactly unitary matrix by a matrix E such that ||E||_2 = O(ε).
The total number of floating-point operations is approximately 20n^2*kd if vect = 'N', with
10n^3*(kd-1)/kd additional operations if vect = 'V'.
The real counterpart of this routine is sbtrd.
?sterf
Computes all eigenvalues of a real symmetric
tridiagonal matrix using QR algorithm.
Syntax
call ssterf(n, d, e, info)
call dsterf(n, d, e, info)
call sterf(d, e [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues of a real symmetric tridiagonal matrix T (which can be obtained by
reducing a symmetric or Hermitian matrix to tridiagonal form). The routine uses a square-root-free variant of
the QR algorithm.
If you need not only the eigenvalues but also the eigenvectors, call steqr.
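A minimal sketch of an eigenvalues-only call is shown below. It assumes a hypothetical 3-by-3 tridiagonal matrix with 2 on the diagonal and -1 off the diagonal, whose exact eigenvalues are 2 - sqrt(2), 2, and 2 + sqrt(2), and a program linked against Intel MKL or another LAPACK implementation.

```fortran
program sterf_demo
  implicit none
  integer, parameter :: n = 3
  double precision :: d(n), e(n-1)
  integer :: info
  external :: dsterf

  ! Hypothetical test matrix T: diagonal d, off-diagonal e.
  d = 2.d0
  e = -1.d0

  ! On exit, d holds the eigenvalues in ascending order; e is overwritten.
  call dsterf(n, d, e, info)
  if (info /= 0) error stop 'dsterf did not converge'

  print '(3f12.8)', d
end program sterf_demo
```

Because both d and e are overwritten, keep copies if the tridiagonal matrix is needed afterwards.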
Input Parameters
n INTEGER. The order of the matrix T (n ≥ 0).
d, e REAL for ssterf

DOUBLE PRECISION for dsterf.
Arrays:
d(*) contains the diagonal elements of T.
Output Parameters
d The n eigenvalues in ascending order, unless info > 0.

See also info.
e On exit, the array is overwritten; see info.
info INTEGER.
If info = i, the algorithm failed to find all the eigenvalues after 30n
iterations:
i off-diagonal elements have not converged to zero. On exit, d and e
contain, respectively, the diagonal and off-diagonal elements of a
tridiagonal matrix orthogonally similar to T.

counterparts. For general conventions applied to skip redundant or restorable arguments, see Fortran 95
Specific details for the routine sterf interface are the following:
Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix T+E such that ||E||_2 = O(ε)*||T||_2,
where ε is the machine precision.
If λ_i is an exact eigenvalue, and μ_i is the corresponding computed value, then
|μ_i - λ_i| ≤ c(n)*ε*||T||_2
where c(n) is a modestly increasing function of n.
The total number of floating-point operations depends on how rapidly the algorithm converges. Typically, it is
about 14n^2.
?steqr
Computes all eigenvalues and eigenvectors of a
symmetric or Hermitian matrix reduced to tridiagonal
form (QR algorithm).
Syntax
call ssteqr(compz, n, d, e, z, ldz, work, info)
call dsteqr(compz, n, d, e, z, ldz, work, info)
call csteqr(compz, n, d, e, z, ldz, work, info)
call zsteqr(compz, n, d, e, z, ldz, work, info)
call rsteqr(d, e [,z] [,compz] [,info])
call steqr(d, e [,z] [,compz] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues and (optionally) all the eigenvectors of a real symmetric tridiagonal
matrix T. In other words, the routine can compute the spectral factorization: T = Z*Λ*Z^T. Here Λ is a
diagonal matrix whose diagonal elements are the eigenvalues λ_i; Z is an orthogonal matrix whose columns
are eigenvectors. Thus,
T*z_i = λ_i*z_i for i = 1, 2, ..., n.
The routine normalizes the eigenvectors so that ||z_i||_2 = 1.
You can also use the routine for computing the eigenvalues and eigenvectors of an arbitrary real symmetric
(or complex Hermitian) matrix A reduced to tridiagonal form T: A = Q*T*Q^H. In this case, the spectral
factorization is as follows: A = Q*T*Q^H = (Q*Z)*Λ*(Q*Z)^H. Before calling ?steqr, you must reduce A to
tridiagonal form and generate the explicit matrix Q by calling the following routines:

                 for real matrices:   for complex matrices:
full storage     ?sytrd, ?orgtr       ?hetrd, ?ungtr
packed storage   ?sptrd, ?opgtr       ?hptrd, ?upgtr
band storage     ?sbtrd(vect='V')     ?hbtrd(vect='V')
If you need eigenvalues only, it's more efficient to call sterf. If T is positive-definite, pteqr can compute small
eigenvalues more accurately than ?steqr.
To solve the problem by a single call, use one of the divide and conquer routines stevd, syevd, spevd, or
sbevd for real symmetric matrices or heevd, hpevd, or hbevd for complex Hermitian matrices.
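The full-storage real sequence from the table above (?sytrd, ?orgtr, then ?steqr with compz = 'V') can be sketched as follows. The 3-by-3 test matrix, the program name, and the fixed workspace size are hypothetical choices for illustration; the program is assumed to be linked against Intel MKL or another LAPACK implementation.

```fortran
program steqr_pipeline_demo
  implicit none
  integer, parameter :: n = 3, lwork = 64*n
  double precision :: a(n,n), d(n), e(n-1), tau(n-1), work(lwork)
  integer :: info
  external :: dsytrd, dorgtr, dsteqr

  ! Hypothetical symmetric test matrix; its eigenvalues are 2-sqrt(2), 2, 2+sqrt(2).
  a = reshape([2.d0, 1.d0, 0.d0, &
               1.d0, 2.d0, 1.d0, &
               0.d0, 1.d0, 2.d0], [n, n])

  ! Step 1: reduce A to tridiagonal form T (d, e); reflectors are left in a and tau.
  call dsytrd('U', n, a, n, d, e, tau, work, lwork, info)
  if (info /= 0) error stop 'dsytrd failed'

  ! Step 2: overwrite a with the explicit orthogonal matrix Q.
  call dorgtr('U', n, a, n, tau, work, lwork, info)
  if (info /= 0) error stop 'dorgtr failed'

  ! Step 3: compz = 'V' takes Q in z (here a) and returns the eigenvectors Q*Z of A.
  ! work needs at least max(1, 2*n-2) elements for this call.
  call dsteqr('V', n, d, e, a, n, work, info)
  if (info /= 0) error stop 'dsteqr failed'

  print '(a,3f12.8)', 'eigenvalues: ', d
end program steqr_pipeline_demo
```

On exit, d holds the eigenvalues in ascending order and column i of a holds the eigenvector for d(i).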
Input Parameters
compz CHARACTER*1. Must be 'N' or 'I' or 'V'.

If compz = 'N', the routine computes eigenvalues only.
If compz = 'I', the routine computes the eigenvalues and eigenvectors of
the tridiagonal matrix T.
If compz = 'V', the routine computes the eigenvalues and eigenvectors of
the original symmetric matrix. On entry, z must contain the orthogonal
matrix used to reduce the original matrix to tridiagonal form.
d, e, work REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Arrays:
The size of work must be:
at least 1 if compz = 'N';
at least max(1, 2*n-2) if compz = 'V' or 'I'.
z REAL for ssteqr

DOUBLE PRECISION for dsteqr
COMPLEX for csteqr
DOUBLE COMPLEX for zsteqr.
Array, size (ldz, *).
If compz = 'N' or 'I', z need not be set.
If compz = 'V', z must contain the orthogonal matrix used in the reduction
to tridiagonal form.
The second dimension of z must be:
at least max(1, n) if compz = 'V' or 'I'.
work (lwork) is a workspace array.
ldz INTEGER. The leading dimension of z. Constraints:

ldz ≥ 1 if compz = 'N';
ldz ≥ max(1, n) if compz = 'V' or 'I'.
Output Parameters
d The n eigenvalues in ascending order, unless info > 0.

See also info.
z If info = 0, contains the n-by-n matrix the columns of which are
orthonormal eigenvectors (the i-th column corresponds to the i-th
eigenvalue).
info INTEGER.
If info = i, the algorithm failed to find all the eigenvalues after 30n
iterations: i off-diagonal elements have not converged to zero. On exit, d
and e contain, respectively, the diagonal and off-diagonal elements of a
tridiagonal matrix orthogonally similar to T.

Specific details for the routine steqr interface are the following:
z Holds the matrix Z of size (n,n).
compz If omitted, this argument is restored based on the presence of argument z as

follows:
compz = 'I', if z is present,
compz = 'N', if z is omitted.
If present, compz must be equal to 'I' or 'V' and the argument z must also be
present. Note that there will be an error condition if compz is present and z
omitted.
Note that two variants of the Fortran 95 interface for the steqr routine are needed
because the choice between the real and complex cases is ambiguous when z
is omitted. Thus, the name rsteqr is used in real cases (single or double
precision), and the name steqr is used in complex cases (single or double
precision).
Application Notes
If λ_i is an exact eigenvalue, and μ_i is the corresponding computed value, then
|μ_i - λ_i| ≤ c(n)*ε*||T||_2
where c(n) is a modestly increasing function of n and ε is the machine precision.
If z_i is the corresponding exact eigenvector, and w_i is the corresponding computed vector, then the angle
θ(z_i, w_i) between them is bounded as follows:
θ(z_i, w_i) ≤ c(n)*ε*||T||_2 / min_{i≠j}|λ_i - λ_j|.
The total number of floating-point operations depends on how rapidly the algorithm converges. Typically, it is
about
24n^2 if compz = 'N';
7n^3 (for complex flavors, 14n^3) if compz = 'V' or 'I'.
?stemr
Computes selected eigenvalues and eigenvectors of a
real symmetric tridiagonal matrix.
Syntax
call sstemr(jobz, range, n, d, e, vl, vu, il, iu, m, w, z, ldz, nzc, isuppz, tryrac,
work, lwork, iwork, liwork, info)
call dstemr(jobz, range, n, d, e, vl, vu, il, iu, m, w, z, ldz, nzc, isuppz, tryrac,
work, lwork, iwork, liwork, info)
call cstemr(jobz, range, n, d, e, vl, vu, il, iu, m, w, z, ldz, nzc, isuppz, tryrac,
work, lwork, iwork, liwork, info)
call zstemr(jobz, range, n, d, e, vl, vu, il, iu, m, w, z, ldz, nzc, isuppz, tryrac,
work, lwork, iwork, liwork, info)
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric tridiagonal
matrix T. Any such unreduced matrix has a well-defined set of pairwise different real eigenvalues; the
corresponding real eigenvectors are pairwise orthogonal.
The spectrum may be computed either completely or partially by specifying either an interval (vl,vu] or a
range of indices il:iu for the desired eigenvalues.
Depending on the number of desired eigenvalues, these are computed either by bisection or the dqds
algorithm. Numerically orthogonal eigenvectors are computed by the use of various suitable L*D*L^T
factorizations near clusters of close eigenvalues (referred to as RRRs, Relatively Robust Representations). An
informal sketch of the algorithm follows.
For each unreduced block (submatrix) of T,
a. Compute T - sigma*I = L*D*L^T, so that L and D define all the wanted eigenvalues to high relative
accuracy. This means that small relative changes in the entries of L and D cause only small relative
changes in the eigenvalues and eigenvectors. The standard (unfactored) representation of the
tridiagonal matrix T does not have this property in general.
b. Compute the eigenvalues to suitable accuracy. If the eigenvectors are desired, the algorithm attains full
accuracy of the computed eigenvalues only right before the corresponding vectors have to be
computed, see steps c and d.
c. For each cluster of close eigenvalues, select a new shift close to the cluster, find a new factorization,
and refine the shifted eigenvalues to suitable accuracy.
d. For each eigenvalue with a large enough relative separation compute the corresponding eigenvector by
forming a rank revealing twisted factorization. Go back to step c for any clusters that remain.
For more details, see: [Dhillon04], [Dhillon04-02], [Dhillon97]
Input Parameters

jobz CHARACTER*1. Must be 'N' or 'V'.
If jobz = 'N', then only eigenvalues are computed.
If jobz = 'V', then eigenvalues and eigenvectors are computed.
range CHARACTER*1. Must be 'A' or 'V' or 'I'.

If range = 'A', the routine computes all eigenvalues.
If range = 'V', the routine computes all eigenvalues in the half-open

interval: (vl, vu].
If range = 'I', the routine computes eigenvalues with indices il to iu.
n INTEGER. The order of the matrix T (n ≥ 0).
d REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Array, size (n).
Contains n diagonal elements of the tridiagonal matrix T.
e REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Array, size n.
Contains (n-1) off-diagonal elements of the tridiagonal matrix T in
elements 1 to n-1 of e. e(n) need not be set on input, but is used
internally as workspace.
vl, vu REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
If range = 'V', the lower and upper bounds of the interval to be searched
for eigenvalues. Constraint: vl<vu.
If range = 'A' or 'I', vl and vu are not referenced.
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0.
If range = 'A' or 'V', il and iu are not referenced.
ldz INTEGER. The leading dimension of the output array z.

if jobz = 'V', then ldz ≥ max(1, n);
ldz ≥ 1 otherwise.
nzc INTEGER. The number of eigenvectors to be held in the array z.

If range = 'A', then nzc ≥ max(1, n);
If range = 'V', then nzc is greater than or equal to the number of
eigenvalues in the half-open interval: (vl, vu].
If range = 'I', then nzc ≥ iu-il+1.
If nzc = -1, then a workspace query is assumed; the routine calculates the
number of columns of the array z that are needed to hold the eigenvectors.
This value is returned as the first entry of the array z, and no error
message related to nzc is issued by the routine xerbla.
tryrac LOGICAL.
If tryrac = .TRUE., the code should check whether
the tridiagonal matrix defines its eigenvalues to high relative accuracy. If
so, the code uses relative-accuracy preserving algorithms that might be (a
bit) slower depending on the matrix. If the matrix does not define its
eigenvalues to high relative accuracy, the code can use possibly faster
algorithms.
If tryrac = .FALSE., the code is not required to guarantee
relatively accurate eigenvalues and can use the fastest possible techniques.
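work REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Workspace array, size (lwork).
lwork INTEGER.
The dimension of the array work,
lwork ≥ max(1, 18*n).
If lwork=-1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.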
iwork INTEGER.
Workspace array, size (liwork).
liwork INTEGER.
The dimension of the array iwork.
liwork ≥ max(1, 10*n) if the eigenvectors are desired, and liwork ≥ max(1,
8*n) if only the eigenvalues are to be computed.
If liwork=-1, then a workspace query is assumed; the routine only
calculates the optimal size of the iwork array, returns this value as the first
entry of the iwork array, and no error message related to liwork is issued by
xerbla.
Output Parameters
d On exit, the array d is overwritten.
e On exit, the array e is overwritten.
m INTEGER.
The total number of eigenvalues found, 0 ≤ m ≤ n.
If range = 'A', then m=n, and if range = 'I', then m=iu-il+1.
w REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Array, size (n).
The first m elements contain the selected eigenvalues in ascending order.
z REAL for sstemr

DOUBLE PRECISION for dstemr
COMPLEX for cstemr
DOUBLE COMPLEX for zstemr.
Array z(ldz,*), the second dimension of z must be at least max(1, m).
If jobz = 'V', and info = 0, then the first m columns of z contain the
orthonormal eigenvectors of the matrix T corresponding to the selected
eigenvalues, with the i-th column of z holding the eigenvector associated
with w(i).
If jobz = 'N', then z is not referenced.
Note: you must ensure that at least max(1,m) columns are supplied in the
array z ; if range = 'V', the exact value of m is not known in advance and
can be computed with a workspace query by setting nzc=-1, see
description of the parameter nzc.
isuppz INTEGER.
Array, size (2*max(1, m)).
The support of the eigenvectors in z, that is the indices indicating the
nonzero elements in z. The i-th computed eigenvector is nonzero only in
elements isuppz(2*i-1) through isuppz(2*i). This is relevant in the
case when the matrix is split. isuppz is only accessed when jobz = 'V'
and n>0.
tryrac On exit, a .TRUE. tryrac is set to .FALSE. if the matrix does not define its
eigenvalues to high relative accuracy.
work(1) On exit, if info = 0, then work(1) returns the optimal (and minimal) size
of lwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the optimal size of liwork.
info INTEGER.
If info = 0, the execution is successful.
If info = 1, internal error in ?larre occurred;
if info = 2, internal error in ?larrv occurred.
?stedc
Computes all eigenvalues and eigenvectors of a
symmetric tridiagonal matrix using the divide and
conquer method.
Syntax
call sstedc(compz, n, d, e, z, ldz, work, lwork, iwork, liwork, info)
call dstedc(compz, n, d, e, z, ldz, work, lwork, iwork, liwork, info)
call cstedc(compz, n, d, e, z, ldz, work, lwork, rwork, lrwork, iwork, liwork, info)
call zstedc(compz, n, d, e, z, ldz, work, lwork, rwork, lrwork, iwork, liwork, info)
call rstedc(d, e [,z] [,compz] [,info])
call stedc(d, e [,z] [,compz] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues and (optionally) all the eigenvectors of a symmetric tridiagonal
matrix using the divide and conquer method. The eigenvectors of a full or band real symmetric or complex
Hermitian matrix can also be found if sytrd/hetrd or sptrd/hptrd or sbtrd/hbtrd has been used to reduce this
matrix to tridiagonal form.
See also laed0, laed1, laed2, laed3, laed4, laed5, laed6, laed7, laed8, laed9, and laeda used by this function.
Input Parameters


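compz CHARACTER*1. Must be 'N' or 'I' or 'V'.
If compz = 'N', the routine computes eigenvalues only.
If compz = 'I', the routine computes the eigenvalues and eigenvectors of
the tridiagonal matrix.
If compz = 'V', the routine computes the eigenvalues and eigenvectors of the
original symmetric/Hermitian matrix. On entry, the array z must contain the
orthogonal/unitary matrix used to reduce the original matrix to tridiagonal
form.
n INTEGER. The order of the symmetric tridiagonal matrix (n ≥ 0).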
d, e, rwork REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Arrays:
d(*) contains the diagonal elements of the tridiagonal matrix.
e(*) contains the subdiagonal elements of the tridiagonal matrix.
rwork is a workspace array, its dimension max(1, lrwork).
z, work REAL for sstedc
DOUBLE PRECISION for dstedc
COMPLEX for cstedc
DOUBLE COMPLEX for zstedc.
Arrays: z(ldz, *), work(*).
If compz = 'V', then, on entry, z must contain the orthogonal/unitary
matrix used to reduce the original matrix to tridiagonal form.
The second dimension of z must be at least max(1, n).


lwork INTEGER. The dimension of the array work.
For real functions sstedc and dstedc:
If compz = 'N' or n ≤ 1, lwork must be at least 1.
If compz = 'V' and n > 1, lwork must be at least 1 + 3*n
+ 2*n*log2(n) + 4*n^2, where log2(n) is the smallest integer k such
that 2^k ≥ n.
If compz = 'I' and n > 1, lwork must be at least 1 + 4*n + n^2.
Note that for compz = 'I' or 'V' and if n is less than or equal to the
minimum divide size, usually 25, then lwork need only be max(1,
2*(n-1)).
For complex functions cstedc and zstedc:
If compz = 'N' or 'I', or n ≤ 1, lwork must be at least 1.
If compz = 'V' and n > 1, lwork must be at least n^2.
Note that for compz = 'V', and if n is less than or equal to the
minimum divide size, usually 25, then lwork need only be 1.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work, rwork and iwork arrays, returns
these values as the first entries of the work, rwork and iwork arrays, and no
error message related to lwork or lrwork or liwork is issued by xerbla. See
Application Notes for the required value of lwork.
lrwork INTEGER. The dimension of the array rwork (used for complex flavors only).
If compz = 'N', or n ≤ 1, lrwork must be at least 1.
If compz = 'V' and n > 1, lrwork must be at least (1 + 3*n
+ 2*n*log2(n) + 4*n^2), where log2(n) is the smallest integer k such that
2^k ≥ n.
If compz = 'I' and n > 1, lrwork must be at least (1 + 4*n + 2*n^2).
Note that for compz = 'V' or 'I', and if n is less than or equal to the
minimum divide size, usually 25, then lrwork need only be max(1,
2*(n-1)).
If lrwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work, rwork and iwork arrays, returns
these values as the first entries of the work, rwork and iwork arrays, and no
error message related to lwork or lrwork or liwork is issued by xerbla. See
Application Notes for the required value of lrwork.
iwork INTEGER. Workspace array, its dimension max(1, liwork).
liwork INTEGER. The dimension of the array iwork.

If compz = 'N', or n ≤ 1, liwork must be at least 1.
If compz = 'V' and n > 1, liwork must be at least (6 + 6*n
+ 5*n*log2(n)), where log2(n) is the smallest integer k such that 2^k ≥ n.
If compz = 'I' and n > 1, liwork must be at least (3 + 5*n).
Note that for compz = 'V' or 'I', and if n is less than or equal to the
minimum divide size, usually 25, then liwork need only be 1.
If liwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work, rwork and iwork arrays, returns
these values as the first entries of the work, rwork and iwork arrays, and no
error message related to lwork or lrwork or liwork is issued by xerbla. See
Application Notes for the required value of liwork.
Output Parameters
d The n eigenvalues in ascending order, unless info ≠ 0.
See also info.
z If info = 0, then if compz = 'V', z contains the orthonormal eigenvectors

of the original symmetric/Hermitian matrix, and if compz = 'I', z contains
the orthonormal eigenvectors of the symmetric tridiagonal matrix. If compz
= 'N', z is not referenced.
work(1) On exit, if info = 0, then work(1) returns the optimal lwork.
rwork(1) On exit, if info = 0, then rwork(1) returns the optimal lrwork (for complex
flavors only).
iwork(1) On exit, if info = 0, then iwork(1) returns the optimal liwork.
info INTEGER.
If info = i, the algorithm failed to compute an eigenvalue while working
on the submatrix lying in rows and columns i/(n+1) through mod(i, n+1).

Specific details for the routine stedc interface are the following:
compz If omitted, this argument is restored based on the presence of the argument z as
follows: compz = 'I', if z is present, compz = 'N', if z is omitted.
Note that there will be an error condition if compz is present and z is
omitted.
Note that two variants of the Fortran 95 interface for the stedc routine are needed because the choice
between real and complex cases is ambiguous when z and work are omitted. Thus, the name rstedc is used in
real cases (single or double precision), and the name stedc is used in complex cases (single or double
precision).
Application Notes
The required size of workspace arrays must be as follows.
For sstedc/dstedc:
If compz = 'N' or n ≤ 1 then lwork must be at least 1.
If compz = 'V' and n > 1 then lwork must be at least (1 + 3n + 2n*log2(n) + 4n^2), where log2(n) =
smallest integer k such that 2^k ≥ n.
If compz = 'I' and n > 1 then lwork must be at least (1 + 4n + n^2).
If compz = 'N' or n ≤ 1 then liwork must be at least 1.
If compz = 'V' and n > 1 then liwork must be at least (6 + 6n + 5n*log2(n)).
If compz = 'I' and n > 1 then liwork must be at least (3 + 5n).
For cstedc/zstedc:
If compz = 'N' or 'I', or n ≤ 1, lwork must be at least 1.
If compz = 'V' and n > 1, lwork must be at least n^2.
If compz = 'N' or n ≤ 1, lrwork must be at least 1.
If compz = 'V' and n > 1, lrwork must be at least (1 + 3n + 2n*log2(n) + 4n^2), where log2(n) = smallest
integer k such that 2^k ≥ n.
If compz = 'I' and n > 1, lrwork must be at least (1 + 4n + 2n^2).
The required value of liwork for complex flavors is the same as for real flavors.
If lwork (or liwork or lrwork, if supplied) is equal to -1, then the routine returns immediately and provides the
recommended workspace in the first element of the corresponding array (work, iwork, rwork). This operation
is called a workspace query.
Note that if lwork (liwork, lrwork) is less than the minimal required value and is not equal to -1, the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
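The minimal workspace formulas above for sstedc/dstedc can be evaluated directly. The following is a minimal sketch, not part of the library: the names stedc_workspace, ceil_log2, and real_stedc_sizes are hypothetical helpers introduced here for illustration, and the sketch ignores the smaller requirement that applies when n does not exceed the minimum divide size.

```fortran
program stedc_workspace
  implicit none
  integer :: lwork, liwork

  ! For n = 100, the smallest k with 2**k >= n is 7 (2**7 = 128).
  call real_stedc_sizes('V', 100, lwork, liwork)
  print '(a,i0,a,i0)', 'compz = V: lwork >= ', lwork, ', liwork >= ', liwork
  call real_stedc_sizes('I', 100, lwork, liwork)
  print '(a,i0,a,i0)', 'compz = I: lwork >= ', lwork, ', liwork >= ', liwork

contains

  ! Smallest integer k such that 2**k >= n (assumes n >= 1).
  integer function ceil_log2(n) result(k)
    integer, intent(in) :: n
    k = 0
    do while (2**k < n)
      k = k + 1
    end do
  end function ceil_log2

  ! Minimal lwork and liwork for the real flavors, following the
  ! formulas quoted in the Application Notes (hypothetical helper).
  subroutine real_stedc_sizes(compz, n, lwork, liwork)
    character(len=1), intent(in) :: compz
    integer, intent(in) :: n
    integer, intent(out) :: lwork, liwork
    integer :: lgn

    lwork = 1
    liwork = 1
    if (compz == 'N' .or. n <= 1) return

    if (compz == 'V') then
      lgn = ceil_log2(n)
      lwork = 1 + 3*n + 2*n*lgn + 4*n**2
      liwork = 6 + 6*n + 5*n*lgn
    else if (compz == 'I') then
      lwork = 1 + 4*n + n**2
      liwork = 3 + 5*n
    end if
  end subroutine real_stedc_sizes

end program stedc_workspace
```

For n = 100 the formulas give lwork = 41701 and liwork = 4106 when compz = 'V', and lwork = 10401 and liwork = 503 when compz = 'I'. In practice a workspace query (lwork = -1) is usually the simpler way to obtain sizes, since it returns the recommended rather than the minimal values.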
?stegr
Syntax
call sstegr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz, work,
lwork, iwork, liwork, info)
call dstegr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz, work,
lwork, iwork, liwork, info)
call cstegr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz, work,
lwork, iwork, liwork, info)
call zstegr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz, work,
lwork, iwork, liwork, info)
call rstegr(d, e, w [,z] [,vl] [,vu] [,il] [,iu] [,m] [,isuppz] [,abstol] [,info])
call stegr(d, e, w [,z] [,vl] [,vu] [,il] [,iu] [,m] [,isuppz] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
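The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric tridiagonal
matrix T.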
The spectrum may be computed either completely or partially by specifying either an interval (vl,vu] or a
range of indices il:iu for the desired eigenvalues.
?stegr is a compatibility wrapper around the improved stemr routine. See its description for further details.
Note that the abstol parameter no longer provides any benefit and hence is no longer used.
See also the auxiliary routines lasq2, lasq5, and lasq6, used by this routine.
Input Parameters

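jobz CHARACTER*1. Must be 'N' or 'V'.
If jobz = 'N', then only eigenvalues are computed.
If jobz = 'V', then eigenvalues and eigenvectors are computed.
range CHARACTER*1. Must be 'A' or 'V' or 'I'.
If range = 'A', the routine computes all eigenvalues.
If range = 'V', the routine computes eigenvalues w(i) in the half-open
interval:
vl < w(i) ≤ vu.
If range = 'I', the routine computes eigenvalues with indices il to iu.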
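d, e, work REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Arrays:
d(*) contains the diagonal elements of T.
e(*) contains the subdiagonal elements of T in elements 1 to n-1; e(n)
need not be set on input, but it is used as a workspace.
The dimension of e must be at least max(1, n).
vl, vu REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
If range = 'V', the lower and upper bounds of the interval to be searched
for eigenvalues.
Constraint: vl < vu.
il, iu INTEGER.
If range = 'I', the indices in ascending order of the smallest and largest
eigenvalues to be returned.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0.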
abstol REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Unused. Was the absolute error tolerance for the eigenvalues/eigenvectors
in previous versions.
ldz INTEGER. The leading dimension of the output array z. Constraints:
ldz ≥ 1 if jobz = 'N';
ldz ≥ max(1, n) if jobz = 'V'.
lwork INTEGER.
The dimension of the array work,
lwork ≥ max(1, 18*n) if jobz = 'V', and
lwork ≥ max(1, 12*n) if jobz = 'N'.
iwork INTEGER.
Workspace array, size (liwork).
liwork INTEGER.
The dimension of the array iwork, liwork ≥ max(1, 10*n) if the
eigenvectors are desired, and liwork ≥ max(1, 8*n) if only the
eigenvalues are to be computed.

Output Parameters
d, e On exit, d and e are overwritten.
m INTEGER. The total number of eigenvalues found,
0 ≤ m ≤ n.
If range = 'A', m = n, and if range = 'I', m = iu-il+1.
w REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
The selected eigenvalues in ascending order, stored in w(1) to w(m).
z REAL for sstegr

DOUBLE PRECISION for dstegr
COMPLEX for cstegr
DOUBLE COMPLEX for zstegr.
Array z(ldz, *), the second dimension of z must be at least max(1, m).
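If jobz = 'V', and if info = 0, the first m columns of z contain the
orthonormal eigenvectors of the matrix T corresponding to the selected
eigenvalues, with the i-th column of z holding the eigenvector associated
with w(i).
Note: if range = 'V', the exact value of m is not known in advance and an
upper bound must be used. Supplying n columns is always safe.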
isuppz INTEGER.
Array, size at least (2*max(1, m)).
The support of the eigenvectors in z, that is the indices indicating the
nonzero elements in z. The i-th computed eigenvector is nonzero only in
elements isuppz(2*i-1) through isuppz(2*i). This is relevant in the
case when the matrix is split. isuppz is only accessed when jobz = 'V',
and n > 0.
work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = 1x, internal error in ?larre occurred,
If info = 2x, internal error in ?larrv occurred. Here the digit x =
abs(iinfo) < 10, where iinfo is the non-zero error code returned by
?larre or ?larrv, respectively.

Specific details for the routine stegr interface are the following:
w Holds the vector of length n.
z Holds the matrix Z of size (n,m).
isuppz Holds the vector of length (2*m).
vl Default value for this argument is vl = -HUGE(vl) where HUGE(a) means the
largest machine number of the same precision as argument a.
vu Default value for this argument is vu = HUGE(vl).
il Default value for this argument is il = 1.
iu Default value for this argument is iu = n.
abstol Default value for this argument is abstol = 0.0_WP.
jobz Restored based on the presence of the argument z as follows:

jobz = 'V', if z is present,
jobz = 'N', if z is omitted.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present,
Note that there will be an error condition if one of or both vl and vu are present
and at the same time one of or both il and iu are present.
Note that two variants of the Fortran 95 interface for the stegr routine are needed because the choice
between real and complex cases is ambiguous when z is omitted. Thus, the name rstegr is used in real cases
(single or double precision), and the name stegr is used in complex cases (single or double precision).
Application Notes
?stegr works only on machines which follow IEEE-754 floating-point standard in their handling of infinities
and NaNs. Normal execution of ?stegr may create NaNs and infinities and hence may abort due to a floating
point exception in environments which do not conform to the IEEE-754 standard.
If it is not clear how much workspace to supply, use a generous value of lwork (or liwork) for the first run, or
set lwork = -1 (liwork = -1).
If lwork (or liwork) has any admissible size, that is, no less than the minimal value described, then the
routine completes the task, though probably not as fast as with the recommended workspace, and provides the
recommended workspace in the first element of the corresponding array (work, iwork) on exit. Use this value
(work(1), iwork(1)) for subsequent runs.
If lwork = -1 (liwork = -1), then the routine returns immediately and provides the recommended
workspace in the first element of the corresponding array (work, iwork). This operation is called a workspace
query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, then the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
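The workspace query described in these notes can be sketched as follows. This is an illustrative example rather than reference code: the program name, the 5-by-5 test matrix with diagonal 2 and off-diagonals -1, and the variable names work_query and iwork_query are assumptions made here. The call sequence follows the dstegr syntax given above, and the program must be linked against a LAPACK implementation.

```fortran
program stegr_workspace_query
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 5
  real(dp) :: d(n), e(n), w(n), z(n, n), work_query(1)
  real(dp), allocatable :: work(:)
  integer :: isuppz(2*n), iwork_query(1)
  integer, allocatable :: iwork(:)
  integer :: m, info, lwork, liwork
  external :: dstegr

  ! Example matrix (an assumption for this sketch): T = tridiag(-1, 2, -1).
  ! e has n elements; e(n) is used by the routine as workspace.
  d = 2.0_dp
  e = -1.0_dp

  ! Step 1: workspace query with lwork = -1 and liwork = -1.
  call dstegr('V', 'A', n, d, e, 0.0_dp, 0.0_dp, 0, 0, 0.0_dp, m, w, z, n, &
              isuppz, work_query, -1, iwork_query, -1, info)
  if (info /= 0) stop 'workspace query failed'
  lwork = int(work_query(1))
  liwork = iwork_query(1)
  allocate(work(lwork), iwork(liwork))

  ! Step 2: the actual computation with the recommended workspace.
  ! vl, vu, il, iu are not referenced for range = 'A'; abstol is unused.
  call dstegr('V', 'A', n, d, e, 0.0_dp, 0.0_dp, 0, 0, 0.0_dp, m, w, z, n, &
              isuppz, work, lwork, iwork, liwork, info)
  if (info == 0) then
    print '(a,i0)', 'eigenvalues found: ', m
    print '(5f10.6)', w(1:m)
  else
    print '(a,i0)', 'dstegr returned info = ', info
  end if
end program stegr_workspace_query
```

Querying first avoids hard-coding the 18*n and 10*n bounds and lets the library report the size it prefers.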
?pteqr
Computes all eigenvalues and (optionally) all
eigenvectors of a real symmetric positive-definite
tridiagonal matrix.
Syntax
call spteqr(compz, n, d, e, z, ldz, work, info)
call dpteqr(compz, n, d, e, z, ldz, work, info)
call cpteqr(compz, n, d, e, z, ldz, work, info)
call zpteqr(compz, n, d, e, z, ldz, work, info)
call rpteqr(d, e [,z] [,compz] [,info])
call pteqr(d, e [,z] [,compz] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues and (optionally) all the eigenvectors of a real symmetric
positive-definite tridiagonal matrix T. In other words, the routine can compute the spectral factorization:
T = Z*Λ*Z^T.
Here Λ is a diagonal matrix whose diagonal elements are the eigenvalues λ_i; Z is an orthogonal matrix whose
columns are eigenvectors. Thus,
T*z_i = λ_i*z_i for i = 1, 2, ..., n.
(The routine normalizes the eigenvectors so that ||z_i||_2 = 1.)
You can also use the routine for computing the eigenvalues and eigenvectors of real symmetric (or complex
Hermitian) positive-definite matrices A reduced to tridiagonal form T: A = Q*T*Q^H. In this case, the spectral
factorization is as follows: A = Q*T*Q^H = (Q*Z)*Λ*(Q*Z)^H. Before calling ?pteqr, you must reduce A to
tridiagonal form and generate the explicit matrix Q by calling the following routines:
                  for real matrices:    for complex matrices:
full storage      ?sytrd, ?orgtr        ?hetrd, ?ungtr
packed storage    ?sptrd, ?opgtr        ?hptrd, ?upgtr
band storage      ?sbtrd(vect='V')      ?hbtrd(vect='V')
The routine first factorizes T as L*D*L^H where L is a unit lower bidiagonal matrix, and D is a diagonal matrix.
Then it forms the bidiagonal matrix B = L*D^(1/2) and calls ?bdsqr to compute the singular values of B, which
are the square roots of the eigenvalues of T.
Input Parameters


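compz CHARACTER*1. Must be 'N' or 'I' or 'V'.
If compz = 'N', the routine computes eigenvalues only.
If compz = 'I', the routine computes the eigenvalues and eigenvectors of the
tridiagonal matrix T.
If compz = 'V', the routine computes the eigenvalues and eigenvectors of
A (and the array z must contain the matrix Q on entry).
d, e, work REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.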
Arrays:
The dimension of work must be:
at least max(1, 4*n-4) if compz = 'V' or 'I'.
z REAL for spteqr

DOUBLE PRECISION for dpteqr
COMPLEX for cpteqr
DOUBLE COMPLEX for zpteqr.
Array, size (ldz,*).
If compz = 'N' or 'I', z need not be set.
If compz = 'V', z must contain the orthogonal matrix used in the
reduction to tridiagonal form.
The second dimension of z must be
at least max(1, n) if compz = 'V' or 'I'.

Output Parameters
d The n eigenvalues in descending order, unless info > 0.
See also info.
e On exit, the array is overwritten.
z If info = 0, contains an n-by-n matrix the columns of which are
orthonormal eigenvectors. (The i-th column corresponds to the i-th
eigenvalue.)
info INTEGER.
If info = i, the leading minor of order i (and hence T itself) is not
positive-definite.
If info = n + i, the algorithm for computing singular values failed to
converge; i off-diagonal elements have not converged to zero.

Specific details for the routine pteqr interface are the following:

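compz If omitted, this argument is restored based on the presence of the argument z as
follows: compz = 'I', if z is present, compz = 'N', if z is omitted.
Note that there will be an error condition if compz is present and z is
omitted.
Note that two variants of the Fortran 95 interface for the pteqr routine are needed because the choice
between real and complex cases is ambiguous when z is omitted. Thus, the name rpteqr is used in real cases
(single or double precision), and the name pteqr is used in complex cases (single or double precision).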
Application Notes
If λ_i is an exact eigenvalue, and μ_i is the corresponding computed value, then
|μ_i - λ_i| ≤ c(n)*ε*K*λ_i
where c(n) is a modestly increasing function of n, ε is the machine precision, and
K = ||D*T*D||_2 * ||(D*T*D)^(-1)||_2, where D is diagonal with d_ii = t_ii^(-1/2).
If z_i is the corresponding exact eigenvector, and w_i is the corresponding computed vector, then the angle
θ(z_i, w_i) between them is bounded as follows:
θ(z_i, w_i) ≤ c(n)*ε*K / min_{i≠j}(|λ_i - λ_j|/|λ_i + λ_j|).
Here min_{i≠j}(|λ_i - λ_j|/|λ_i + λ_j|) is the relative gap between λ_i and the other eigenvalues.
The total number of floating-point operations depends on how rapidly the algorithm converges.
Typically, it is about
30n^2 if compz = 'N';
6n^3 (for complex flavors, 12n^3) if compz = 'V' or 'I'.
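A call to dpteqr and the handling of its info codes can be sketched as follows. This is a minimal example, not vendor sample code: the program name and the 4-by-4 test matrix with diagonal 2 and off-diagonals -1 (which is positive-definite) are assumptions made here, and the work array is sized 4*n, which satisfies the max(1, 4*n-4) requirement. The program must be linked against a LAPACK implementation.

```fortran
program pteqr_example
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: d(n), e(n-1), z(n, n), work(4*n)
  integer :: info
  external :: dpteqr

  ! Example matrix (an assumption for this sketch): T = tridiag(-1, 2, -1).
  d = 2.0_dp
  e = -1.0_dp

  ! compz = 'I': eigenvalues and eigenvectors of the tridiagonal matrix itself.
  call dpteqr('I', n, d, e, z, n, work, info)

  if (info == 0) then
    ! On exit d holds the eigenvalues in descending order.
    print '(a,4f10.6)', 'eigenvalues: ', d
  else if (info < 0) then
    print '(a,i0,a)', 'argument ', -info, ' had an illegal value'
  else if (info <= n) then
    print '(a,i0,a)', 'leading minor of order ', info, ' is not positive-definite'
  else
    print '(i0,a)', info - n, ' off-diagonal elements did not converge to zero'
  end if
end program pteqr_example
```

The branch on info mirrors the two positive ranges described in the output parameters: values up to n report a failed factorization, and values above n report non-convergence of the singular value iteration.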
?stebz
Computes selected eigenvalues of a real symmetric
tridiagonal matrix by bisection.
Syntax
call sstebz (range, order, n, vl, vu, il, iu, abstol, d, e, m, nsplit, w, iblock,
isplit, work, iwork, info)
call dstebz (range, order, n, vl, vu, il, iu, abstol, d, e, m, nsplit, w, iblock,
isplit, work, iwork, info)
call stebz(d, e, m, nsplit, w, iblock, isplit [, order] [,vl] [,vu] [,il] [,iu]
[,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes some (or all) of the eigenvalues of a real symmetric tridiagonal matrix T by bisection.
The routine searches for zero or negligible off-diagonal elements to see if T splits into block-diagonal form T
= diag(T1, T2, ...). Then it performs bisection on each of the blocks Ti and returns the block index of
each computed eigenvalue, so that a subsequent call to stein can also take advantage of the block structure.
See also laebz.
Input Parameters


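range CHARACTER*1. Must be 'A' or 'V' or 'I'.
If range = 'A', the routine computes all eigenvalues.
If range = 'V', the routine computes eigenvalues w(i) in the half-open
interval: vl < w(i) ≤ vu.
If range = 'I', the routine computes eigenvalues with indices il to iu.
order CHARACTER*1. Must be 'B' or 'E'.
If order = 'B', the eigenvalues are to be ordered from smallest to largest
within each split-off block.
If order = 'E', the eigenvalues for the entire matrix are to be ordered
from smallest to largest.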
vl, vu REAL for sstebz
DOUBLE PRECISION for dstebz.
If range = 'V', the routine computes eigenvalues w(i) in the half-open
interval:
vl < w(i) ≤ vu.
il, iu INTEGER. Constraint: 1 ≤ il ≤ iu ≤ n.
If range = 'I', the routine computes eigenvalues w(i) such that il ≤ i ≤ iu
(assuming that the eigenvalues w(i) are in ascending order).
abstol REAL for sstebz
DOUBLE PRECISION for dstebz.
The absolute tolerance to which each eigenvalue is required. An eigenvalue
(or cluster) is considered to have converged if it lies in an interval of width
abstol.
If abstol ≤ 0.0, then the tolerance is taken as eps*|T|, where eps is the
machine precision, and |T| is the 1-norm of the matrix T.
d, e, work REAL for sstebz

Arrays:
The dimension of work must be at least max(1, 4n).
iwork INTEGER. Workspace.

Array, size at least max(1, 3n).
Output Parameters
m INTEGER. The actual number of eigenvalues found.
nsplit INTEGER. The number of diagonal blocks detected in T.
w REAL for sstebz
DOUBLE PRECISION for dstebz.
Array, size at least max(1, n). The computed eigenvalues, stored in w(1) to
w(m).
iblock, isplit INTEGER.

Arrays, size at least max(1, n).
A positive value iblock(i) is the block number of the eigenvalue stored in
w(i) (see also info).
The leading nsplit elements of isplit contain points at which T splits into
blocks Ti as follows: the block T1 contains rows/columns 1 to isplit(1);
the block T2 contains rows/columns isplit(1)+1 to isplit(2), and so
on.
info INTEGER.
If info = 1, for range = 'A' or 'V', the algorithm failed to compute
some of the required eigenvalues to the desired accuracy; iblock(i)<0
indicates that the eigenvalue stored in w(i) failed to converge.
If info = 2, for range = 'I', the algorithm failed to compute some of
the required eigenvalues. Try calling the routine again with range = 'A'.
If info = 3:
for range = 'A' or 'V', same as info = 1;
for range = 'I', same as info = 2.
If info = 4, no eigenvalues have been computed. The floating-point
arithmetic on the computer is not behaving as expected.

Specific details for the routine stebz interface are the following:
iblock Holds the vector of length n.
isplit Holds the vector of length n.
order Must be 'B' or 'E'. The default value is 'B'.
vl Default value for this argument is vl = -HUGE(vl) where HUGE(a) means the
largest machine number of the same precision as argument a.
vu Default value for this argument is vu = HUGE(vl).
abstol Default value for this argument is abstol = 0.0_WP.
range Restored based on the presence of arguments vl, vu, il, iu as follows:
range = 'V', if one of or both vl and vu are present,
range = 'I', if one of or both il and iu are present,
range = 'A', if none of vl, vu, il, iu is present.
Note that there will be an error condition if one of or both vl and vu
are present and at the same time one of or both il and iu are present.
Application Notes
The eigenvalues of T are computed to high relative accuracy which means that if they vary widely in
magnitude, then any small eigenvalues will be computed more accurately than, for example, with the
standard QR method. However, the reduction to tridiagonal form (prior to calling the routine) may exclude
the possibility of obtaining high relative accuracy in the small eigenvalues of the original matrix if its
eigenvalues vary widely in magnitude.
?stein
Computes the eigenvectors corresponding to specified
eigenvalues of a real symmetric tridiagonal matrix.
Syntax
call sstein(n, d, e, m, w, iblock, isplit, z, ldz, work, iwork, ifailv, info)
call dstein(n, d, e, m, w, iblock, isplit, z, ldz, work, iwork, ifailv, info)
call cstein(n, d, e, m, w, iblock, isplit, z, ldz, work, iwork, ifailv, info)
call zstein(n, d, e, m, w, iblock, isplit, z, ldz, work, iwork, ifailv, info)
call stein(d, e, w, iblock, isplit, z [,ifailv] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the eigenvectors of a real symmetric tridiagonal matrix T corresponding to specified
eigenvalues, by inverse iteration. It is designed to be used in particular after the specified eigenvalues have
been computed by ?stebz with order = 'B', but may also be used when the eigenvalues have been
computed by other routines.
If you use this routine after ?stebz, it can take advantage of the block structure by performing inverse
iteration on each block Ti separately, which is more efficient than using the whole matrix T.
If T has been formed by reduction of a full symmetric or Hermitian matrix A to tridiagonal form, you can
transform eigenvectors of T to eigenvectors of A by calling ?ormtr or ?opmtr (for real flavors) or by calling ?
unmtr or ?upmtr (for complex flavors).
Input Parameters
m INTEGER. The number of eigenvectors to be returned.
d, e, w REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Arrays:
d(*) contains the diagonal elements of T.
e(*) contains the sub-diagonal elements of T stored in elements 1 to n-1.
w(*) contains the eigenvalues of T, stored in w(1) to w(m) (as returned by
stebz). Eigenvalues of T1 must be supplied first, in non-decreasing order;
then those of T2, again in non-decreasing order, and so on. Constraint:
if iblock(i) = iblock(i+1), w(i) ≤ w(i+1).
The size of w must be at least max(1, n).
iblock, isplit INTEGER.

Arrays, size at least max(1, n). The arrays iblock and isplit, as returned
by ?stebz with order = 'B'.
If you did not call ?stebz with order = 'B', set all elements of iblock to
1, and isplit(1) to n.
ldz INTEGER. The leading dimension of the output array z; ldz ≥ max(1, n).
work REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Workspace array, size at least max(1, 5n).
iwork INTEGER.
Workspace array, size at least max(1, n).
Output Parameters
z REAL for sstein

DOUBLE PRECISION for dstein
COMPLEX for cstein
DOUBLE COMPLEX for zstein.
Array, size (ldz, *).
If info = 0, z contains an n-by-n matrix the columns of which are
orthonormal eigenvectors. (The i-th column corresponds to the i-th
eigenvalue.)
ifailv INTEGER.
Array, size at least max(1, m).
If info = i > 0, the first i elements of ifailv contain the indices of any
eigenvectors that failed to converge.
info INTEGER.
If info = i, then i eigenvectors (as indicated by the parameter ifailv) each
failed to converge in 5 iterations. The current iterates are stored in the
corresponding columns of the array z.

Specific details for the routine stein interface are the following:
iblock Holds the vector of length n.
isplit Holds the vector of length n.
z Holds the matrix Z of size (n,m).
ifailv Holds the vector of length (m).
Application Notes
Each computed eigenvector z_i is an exact eigenvector of a matrix T+E_i, where ||E_i||_2 = O(ε)*||T||_2.
However, a set of eigenvectors computed by this routine may not be orthogonal to so high a degree of
accuracy as those computed by ?steqr.
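The intended pairing of ?stebz (with order = 'B') and ?stein can be sketched as follows. This is an illustrative example under stated assumptions: the program name, the 6-by-6 test matrix with diagonal 2 and off-diagonals -1, and the choice of the two smallest eigenvalues are made up for the sketch. The call sequences follow the dstebz and dstein syntax given in this chapter, and the program must be linked against a LAPACK implementation.

```fortran
program stebz_stein_example
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 6, il = 1, iu = 2
  real(dp) :: d(n), e(n-1), w(n), z(n, iu-il+1), work(5*n)
  integer :: iblock(n), isplit(n), iwork(3*n), ifailv(n)
  integer :: m, nsplit, info, j
  external :: dstebz, dstein

  ! Example matrix (an assumption for this sketch): T = tridiag(-1, 2, -1).
  d = 2.0_dp
  e = -1.0_dp

  ! Eigenvalues il..iu by bisection. order = 'B' groups them by block,
  ! which is the ordering ?stein expects. abstol = 0 selects the default
  ! tolerance; vl and vu are placeholders since range = 'I'.
  call dstebz('I', 'B', n, 0.0_dp, 0.0_dp, il, iu, 0.0_dp, d, e, m, nsplit, &
              w, iblock, isplit, work, iwork, info)
  if (info /= 0) stop 'dstebz failed'

  ! Eigenvectors for the m eigenvalues by inverse iteration.
  ! work (5*n) and iwork (3*n) are large enough for both routines.
  call dstein(n, d, e, m, w, iblock, isplit, z, n, work, iwork, ifailv, info)
  if (info > 0) then
    print '(i0,a)', info, ' eigenvectors failed to converge; see ifailv'
  else if (info < 0) then
    print '(a,i0,a)', 'argument ', -info, ' had an illegal value'
  end if

  print '(a,2f10.6)', 'eigenvalues: ', w(1:m)
  do j = 1, m
    print '(a,i0,a,6f9.5)', 'z(:,', j, ') =', z(:, j)
  end do
end program stebz_stein_example
```

Reusing iblock and isplit exactly as returned by dstebz is what lets dstein work block by block; if the eigenvalues came from another routine, the text above says to set every element of iblock to 1 and isplit(1) to n instead.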
?disna
Computes the reciprocal condition numbers for the
eigenvectors of a symmetric/ Hermitian matrix or for
the left or right singular vectors of a general matrix.
Syntax
call sdisna(job, m, n, d, sep, info)
call ddisna(job, m, n, d, sep, info)
call disna(d, sep [,job] [,minmn] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the reciprocal condition numbers for the eigenvectors of a real symmetric or complex
Hermitian matrix or for the left or right singular vectors of a general m-by-n matrix.
The reciprocal condition number is the 'gap' between the corresponding eigenvalue or singular value and the
nearest other one.
The bound on the error, measured by angle in radians, in the i-th computed vector is given by
?lamch('E')*(anorm/sep(i))
where anorm = ||A||_2 = max( |d(j)| ). sep(i) is not allowed to be smaller than ?lamch('E')*anorm in
order to limit the size of the error bound.
?disna may also be used to compute error bounds for eigenvectors of the generalized symmetric definite
eigenproblem.
Input Parameters
job CHARACTER*1. Must be 'E','L', or 'R'. Specifies for which problem the
reciprocal condition numbers should be computed:
job = 'E': for the eigenvectors of a symmetric/Hermitian matrix;
job = 'L': for the left singular vectors of a general matrix;
job = 'R': for the right singular vectors of a general matrix.
m INTEGER. The number of rows of the matrix (m ≥ 0).
n INTEGER.
If job = 'L', or 'R', the number of columns of the matrix (n ≥ 0). Ignored
if job = 'E'.
d REAL for sdisna

DOUBLE PRECISION for ddisna.
Array, dimension at least max(1,m) if job = 'E', and at least max(1,
min(m,n)) if job = 'L' or 'R'.
This array must contain the eigenvalues (if job = 'E') or singular values
(if job = 'L' or 'R') of the matrix, in either increasing or decreasing
order.
If singular values, they must be non-negative.
Output Parameters
sep REAL for sdisna

DOUBLE PRECISION for ddisna.
Array, dimension at least max(1,m) if job = 'E', and at least max(1,
min(m,n)) if job = 'L' or 'R'. The reciprocal condition numbers of the
vectors.
info INTEGER.

Specific details for the routine disna interface are the following:
d Holds the vector of length min(m,n).
sep Holds the vector of length min(m,n).
job Must be 'E', 'L', or 'R'. The default value is 'E'.
minmn Indicates which of the values m or n is smaller. Must be either 'M' or 'N', the
default is 'M'.
If job = 'E', this argument is superfluous, If job = 'L' or 'R', this argument
is used by the routine.
Generalized Symmetric-Definite Eigenvalue Problems: LAPACK Computational Routines

Generalized symmetric-definite eigenvalue problems are as follows: find the eigenvalues λ and the
corresponding eigenvectors z that satisfy one of these equations:
Az = λBz, ABz = λz, or BAz = λz,
where A is an n-by-n symmetric or Hermitian matrix, and B is an n-by-n symmetric positive-definite or
Hermitian positive-definite matrix.
In these problems, there exist n real eigenvectors corresponding to real eigenvalues (even for complex
Hermitian matrices A and B).
Routines described in this section allow you to reduce the above generalized problems to the standard symmetric
eigenvalue problem Cy = λy, which you can solve by calling LAPACK routines described earlier in this
chapter (see Symmetric Eigenvalue Problems).
Different routines allow the matrices to be stored either conventionally or in packed storage. Prior to
reduction, the positive-definite matrix B must first be factorized using either potrf or pptrf.
The reduction routine for the banded matrices A and B uses a split Cholesky factorization for which a specific
routine pbstf is provided. This refinement halves the amount of work required to form matrix C.
Table "Computational Routines for Reducing Generalized Eigenproblems to Standard Problems" lists LAPACK
routines that can be used to solve generalized symmetric-definite eigenvalue problems. The corresponding
routine names in the Fortran 95 interface are without the first symbol.
Computational Routines for Reducing Generalized Eigenproblems to Standard Problems
Matrix type                  Reduce to standard   Reduce to standard   Reduce to standard   Factorize
                             problems (full       problems (packed     problems (band       band matrix
                             storage)             storage)             matrices)
real symmetric matrices      sygst                spgst                sbgst                pbstf
complex Hermitian matrices   hegst                hpgst                hbgst                pbstf
?sygst
Reduces a real symmetric-definite generalized
eigenvalue problem to the standard form.
Syntax
call ssygst(itype, uplo, n, a, lda, b, ldb, info)
call dsygst(itype, uplo, n, a, lda, b, ldb, info)
call sygst(a, b [,itype] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces real symmetric-definite generalized eigenproblems
A*z = λ*B*z, A*B*z = λ*z, or B*A*z = λ*z
to the standard form C*y = λ*y. Here A is a real symmetric matrix, and B is a real symmetric
positive-definite matrix. Before calling this routine, call ?potrf to compute the Cholesky factorization:
B = U^T*U or B = L*L^T.
Input Parameters
itype INTEGER. Must be 1 or 2 or 3.

If itype = 1, the generalized eigenproblem is A*z = lambda*B*z
for uplo = 'U': C = inv(UT)*A*inv(U), z = inv(U)*y;
for uplo = 'L': C = inv(L)*A*inv(LT), z = inv(LT)*y.
If itype = 2, the generalized eigenproblem is A*B*z = lambda*z
for uplo = 'U': C = U*A*UT, z = inv(U)*y;
for uplo = 'L': C = LT*A*L, z = inv(LT)*y.
If itype = 3, the generalized eigenproblem is B*A*z = lambda*z
for uplo = 'U': C = U*A*UT, z = UT*y;
for uplo = 'L': C = LT*A*L, z = L*y.

If uplo = 'U', the array a stores the upper triangle of A; you must supply
B in the factored form B = UT*U.
If uplo = 'L', the array a stores the lower triangle of A; you must supply
B in the factored form B = L*LT.
n INTEGER. The order of the matrices A and B (n ≥ 0).
a, b REAL for ssygst

DOUBLE PRECISION for dsygst.
Arrays:
a(lda,*) contains the upper or lower triangle of A.
b(ldb,*) contains the Cholesky-factored matrix B:

B = UT*U or B = L*LT (as returned by ?potrf).
Output Parameters
a The upper or lower triangle of A is overwritten by the upper or lower

triangle of C, as specified by the arguments itype and uplo.
info INTEGER.

Specific details for the routine sygst interface are the following:
b Holds the matrix B of size (n,n).
itype Must be 1, 2, or 3. The default value is 1.
Application Notes
Forming the reduced matrix C is a stable procedure. However, it involves implicit multiplication by inv(B) (if
itype = 1) or B (if itype = 2 or 3). When the routine is used as a step in the computation of eigenvalues
and eigenvectors of the original problem, there may be a significant loss of accuracy if B is ill-conditioned
with respect to inversion.
The approximate number of floating-point operations is n^3.
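As a numerical illustration of the itype = 2 case with uplo = 'L', the following plain-Python sketch forms C = LT*A*L and compares its trace and determinant with those of A*B. It does not call MKL; cholesky_lower, matmul and transpose are hypothetical helpers written only for this example.

```python
import math

def cholesky_lower(b):
    """Return lower-triangular L with B = L*L^T for a symmetric positive-definite B."""
    n = len(b)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        l[j][j] = math.sqrt(b[j][j] - sum(l[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (b[i][j] - s) / l[j][j]
    return l

def matmul(x, y):
    """Dense matrix product X*Y for lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def transpose(x):
    return [list(col) for col in zip(*x)]

def trace2(m):
    return m[0][0] + m[1][1]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

A = [[2.0, 1.0], [1.0, 3.0]]
B = [[4.0, 2.0], [2.0, 5.0]]
L = cholesky_lower(B)

C = matmul(transpose(L), matmul(A, L))  # itype = 2, uplo = 'L': C = L^T*A*L
AB = matmul(A, B)                       # matrix of the original problem A*B*z = lambda*z

# C is symmetric, and it shares trace and determinant (hence both eigenvalues) with A*B.
print(C, trace2(C), trace2(AB), det2(C), det2(AB))
```

Because C = LT*(A*B)*inv(LT), the two matrices are similar, which is why the eigenvalues of the symmetric matrix C are those of the nonsymmetric product A*B.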
?hegst
Reduces a complex Hermitian positive-definite
generalized eigenvalue problem to the standard form.
Syntax
call chegst(itype, uplo, n, a, lda, b, ldb, info)
call zhegst(itype, uplo, n, a, lda, b, ldb, info)
call hegst(a, b [,itype] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a complex Hermitian positive-definite generalized eigenvalue problem to standard form.
itype   Problem             Result
1       A*x = lambda*B*x    A overwritten by inv(UH)*A*inv(U) or inv(L)*A*inv(LH)
2       A*B*x = lambda*x    A overwritten by U*A*UH or LH*A*L
3       B*A*x = lambda*x    A overwritten by U*A*UH or LH*A*L
Before calling this routine, you must call ?potrf to compute the Cholesky factorization: B = UH*U or B =
L*LH.
Input Parameters

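itype INTEGER. Must be 1 or 2 or 3.

If itype = 1, the generalized eigenproblem is A*x = lambda*B*x
for uplo = 'U': C = inv(UH)*A*inv(U);
for uplo = 'L': C = inv(L)*A*inv(LH).
If itype = 2, the generalized eigenproblem is A*B*x = lambda*x
for uplo = 'U': C = U*A*UH;
for uplo = 'L': C = LH*A*L.
If itype = 3, the generalized eigenproblem is B*A*x = lambda*x
for uplo = 'U': C = U*A*UH;
for uplo = 'L': C = LH*A*L.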

If uplo = 'U', the array a stores the upper triangle of A; you must supply
B in the factored form B = UH*U.
If uplo = 'L', the array a stores the lower triangle of A; you must supply
B in the factored form B = L*LH.
a, b COMPLEX for chegst

DOUBLE COMPLEX for zhegst.
Arrays:
a(lda,*) contains the upper or lower triangle of A.
b(ldb,*) contains the Cholesky-factored matrix B:
B = UH*U or B = L*LH (as returned by ?potrf).
Output Parameters
a The upper or lower triangle of A is overwritten by the upper or lower
triangle of C, as specified by the arguments itype and uplo.

info INTEGER.

Specific details for the routine hegst interface are the following:
Application Notes
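Forming the reduced matrix C is a stable procedure. However, it involves implicit multiplication by inv(B) (if
itype = 1) or B (if itype = 2 or 3). When the routine is used as a step in the computation of eigenvalues
and eigenvectors of the original problem, there may be a significant loss of accuracy if B is ill-conditioned
with respect to inversion.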
?spgst
Reduces a real symmetric-definite generalized
eigenvalue problem to the standard form using packed
storage.
Syntax
call sspgst(itype, uplo, n, ap, bp, info)
call dspgst(itype, uplo, n, ap, bp, info)
call spgst(ap, bp [,itype] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces real symmetric-definite generalized eigenproblems

A*x = lambda*B*x, A*B*x = lambda*x, or B*A*x = lambda*x
to the standard form C*y = lambda*y, using packed matrix storage. Here A is a real symmetric matrix, and B is a
real symmetric positive-definite matrix. Before calling this routine, call ?pptrf to compute the Cholesky
factorization: B = UT*U or B = L*LT.
Input Parameters

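itype INTEGER. Must be 1 or 2 or 3.

If itype = 1, the generalized eigenproblem is A*z = lambda*B*z
for uplo = 'U': C = inv(UT)*A*inv(U), z = inv(U)*y;
for uplo = 'L': C = inv(L)*A*inv(LT), z = inv(LT)*y.
If itype = 2, the generalized eigenproblem is A*B*z = lambda*z
for uplo = 'U': C = U*A*UT, z = inv(U)*y;
for uplo = 'L': C = LT*A*L, z = inv(LT)*y.
If itype = 3, the generalized eigenproblem is B*A*z = lambda*z
for uplo = 'U': C = U*A*UT, z = UT*y;
for uplo = 'L': C = LT*A*L, z = L*y.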

If uplo = 'U', ap stores the packed upper triangle of A;
you must supply B in the factored form B = UT*U.
If uplo = 'L', ap stores the packed lower triangle of A;
you must supply B in the factored form B = L*LT.
ap, bp REAL for sspgst

DOUBLE PRECISION for dspgst.
Arrays:
ap(*) contains the packed upper or lower triangle of A.
The dimension of ap must be at least max(1, n*(n+1)/2).

bp(*) contains the packed Cholesky factor of B (as returned by ?pptrf
with the same uplo value).
The dimension of bp must be at least max(1, n*(n+1)/2).
Output Parameters
ap The upper or lower triangle of A is overwritten by the upper or lower
triangle of C, as specified by the arguments itype and uplo.

info INTEGER.

Specific details for the routine spgst interface are the following:
bp Holds the array B of size (n*(n+1)/2).
Application Notes
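The packed arrays ap and bp hold one triangle of the matrix column by column in n*(n+1)/2 elements. The following plain-Python sketch shows the usual LAPACK packed layout with 0-based indices; pack_upper and pack_lower are hypothetical helpers for illustration, not MKL routines. In 1-based Fortran terms the same mapping is ap(i + (j-1)*j/2) = A(i,j) for uplo = 'U' and ap(i + (j-1)*(2*n-j)/2) = A(i,j) for uplo = 'L'.

```python
def pack_upper(a):
    """Pack the upper triangle column by column (uplo = 'U')."""
    n = len(a)
    ap = [0.0] * (n * (n + 1) // 2)
    for j in range(n):
        for i in range(j + 1):              # rows 0..j of column j
            ap[i + j * (j + 1) // 2] = a[i][j]
    return ap

def pack_lower(a):
    """Pack the lower triangle column by column (uplo = 'L')."""
    n = len(a)
    ap = [0.0] * (n * (n + 1) // 2)
    for j in range(n):
        for i in range(j, n):               # rows j..n-1 of column j
            ap[i + j * (2 * n - j - 1) // 2] = a[i][j]
    return ap

A = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 5.0],
     [3.0, 5.0, 6.0]]
ap_u = pack_upper(A)
ap_l = pack_lower(A)
print(ap_u)
print(ap_l)
```

The two layouts generally differ even for a symmetric matrix, which is why the factor in bp must come from ?pptrf called with the same uplo value.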
?hpgst
Reduces a generalized eigenvalue problem with a
Hermitian matrix to a standard eigenvalue problem
using packed storage.
Syntax
call chpgst(itype, uplo, n, ap, bp, info)
call zhpgst(itype, uplo, n, ap, bp, info)
call hpgst(ap, bp [,itype] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces generalized eigenproblems with Hermitian matrices

A*z = lambda*B*z, A*B*z = lambda*z, or B*A*z = lambda*z
to standard eigenproblems C*y = lambda*y, using packed matrix storage. Here A is a complex Hermitian matrix,
and B is a complex Hermitian positive-definite matrix. Before calling this routine, you must call ?pptrf to
compute the Cholesky factorization: B = UH*U or B = L*LH.
Input Parameters

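itype INTEGER. Must be 1 or 2 or 3.

If itype = 1, the generalized eigenproblem is A*z = lambda*B*z
for uplo = 'U': C = inv(UH)*A*inv(U), z = inv(U)*y;
for uplo = 'L': C = inv(L)*A*inv(LH), z = inv(LH)*y.
If itype = 2, the generalized eigenproblem is A*B*z = lambda*z
for uplo = 'U': C = U*A*UH, z = inv(U)*y;
for uplo = 'L': C = LH*A*L, z = inv(LH)*y.
If itype = 3, the generalized eigenproblem is B*A*z = lambda*z
for uplo = 'U': C = U*A*UH, z = UH*y;
for uplo = 'L': C = LH*A*L, z = L*y.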

If uplo = 'U', ap stores the packed upper triangle of A; you must supply
B in the factored form B = UH*U.
If uplo = 'L', ap stores the packed lower triangle of A; you must supply B
in the factored form B = L*LH.
ap, bp COMPLEX for chpgst

DOUBLE COMPLEX for zhpgst.
Arrays:
ap(*) contains the packed upper or lower triangle of A.
The dimension of ap must be at least max(1, n*(n+1)/2).

bp(*) contains the packed Cholesky factor of B (as returned by ?pptrf
with the same uplo value).
The dimension of bp must be at least max(1, n*(n+1)/2).
Output Parameters
ap The upper or lower triangle of A is overwritten by the upper or lower
triangle of C, as specified by the arguments itype and uplo.

info INTEGER.

Specific details for the routine hpgst interface are the following:
Application Notes
?sbgst
Reduces a real symmetric-definite generalized
eigenproblem for banded matrices to the standard
form using the factorization performed by ?pbstf.
Syntax
call ssbgst(vect, uplo, n, ka, kb, ab, ldab, bb, ldbb, x, ldx, work, info)
call dsbgst(vect, uplo, n, ka, kb, ab, ldab, bb, ldbb, x, ldx, work, info)
call sbgst(ab, bb [,x] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
To reduce the real symmetric-definite generalized eigenproblem A*z = lambda*B*z to the standard form
C*y = lambda*y, where A, B and C are banded, this routine must be preceded by a call to ?pbstf, which computes
the split Cholesky factorization of the positive-definite matrix B: B = ST*S. The split Cholesky factorization,
compared with the ordinary Cholesky factorization, allows the work to be approximately halved.
This routine overwrites A with C = XT*A*X, where X = inv(S)*Q and Q is an orthogonal matrix chosen
(implicitly) to preserve the bandwidth of A. The routine also has an option to allow the accumulation of X,
and then, if z is an eigenvector of C, X*z is an eigenvector of the original system.
Input Parameters
vect CHARACTER*1. Must be 'N' or 'V'.

If vect = 'N', then matrix X is not returned;
If vect = 'V', then matrix X is returned.

ka INTEGER. The number of super- or sub-diagonals in A

(ka ≥ 0).
kb INTEGER. The number of super- or sub-diagonals in B

(ka ≥ kb ≥ 0).
ab, bb, work REAL for ssbgst
DOUBLE PRECISION for dsbgst
ab(ldab,*) is an array containing either upper or lower triangular part of the
symmetric matrix A (as specified by uplo) in band storage format.
The second dimension of the array ab must be at least max(1, n).
bb(ldbb,*) is an array containing the banded split Cholesky factor of B as
specified by uplo, n and kb and returned by ?pbstf.
The second dimension of the array bb must be at least max(1, n).
work(*) is a workspace array, dimension at least max(1, 2*n).
ldab INTEGER. The leading dimension of the array ab; must be at least ka+1.
ldbb INTEGER. The leading dimension of the array bb; must be at least kb+1.
ldx INTEGER. The leading dimension of the output array x. Constraints:
if vect = 'N', then ldx ≥ 1;
if vect = 'V', then ldx ≥ max(1, n).
Output Parameters
ab On exit, this array is overwritten by the upper or lower triangle of C as

specified by uplo.
x REAL for ssbgst

DOUBLE PRECISION for dsbgst
Array.
If vect = 'V', then x(ldx,*) contains the n-by-n matrix X = inv(S)*Q.
If vect = 'N', then x is not referenced.
The second dimension of x must be:

at least max(1, n), if vect = 'V';
at least 1, if vect = 'N'.
info INTEGER.

Specific details for the routine sbgst interface are the following:
ab Holds the array A of size (ka+1,n).
bb Holds the array B of size (kb+1,n).
x Holds the matrix X of size (n,n).
vect Restored based on the presence of the argument x as follows:

vect = 'V', if x is present,
vect = 'N', if x is omitted.
Application Notes
Forming the reduced matrix C involves implicit multiplication by inv(B). When the routine is used as a step
in the computation of eigenvalues and eigenvectors of the original problem, there may be a significant loss of
accuracy if B is ill-conditioned with respect to inversion.
If ka and kb are much less than n, then the total number of floating-point operations is approximately
6n^2*kb when vect = 'N'. An additional (3/2)n^3*(kb/ka) operations are required when vect = 'V'.
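The arrays ab and bb use band storage, in which only ka+1 (or kb+1) diagonals are kept. The following plain-Python sketch shows the usual LAPACK band layout with 0-based indices; to_band_upper and to_band_lower are hypothetical helpers for illustration, not MKL routines. Row r of the Python list corresponds to row r+1 of the Fortran array, so in Fortran terms ab(ka+1+i-j, j) = A(i,j) for uplo = 'U' and ab(1+i-j, j) = A(i,j) for uplo = 'L'.

```python
def to_band_upper(a, ka):
    """Store the upper triangle of a band matrix in (ka+1)-by-n band form (uplo = 'U')."""
    n = len(a)
    ab = [[0.0] * n for _ in range(ka + 1)]
    for j in range(n):
        for i in range(max(0, j - ka), j + 1):
            ab[ka + i - j][j] = a[i][j]   # the diagonal lands in the last row
    return ab

def to_band_lower(a, ka):
    """Store the lower triangle of a band matrix in (ka+1)-by-n band form (uplo = 'L')."""
    n = len(a)
    ab = [[0.0] * n for _ in range(ka + 1)]
    for j in range(n):
        for i in range(j, min(n, j + ka + 1)):
            ab[i - j][j] = a[i][j]        # the diagonal lands in the first row
    return ab

# Symmetric tridiagonal example: ka = 1.
A = [[4.0, 1.0, 0.0, 0.0],
     [1.0, 5.0, 2.0, 0.0],
     [0.0, 2.0, 6.0, 3.0],
     [0.0, 0.0, 3.0, 7.0]]
ab_u = to_band_upper(A, 1)
ab_l = to_band_lower(A, 1)
print(ab_u)
print(ab_l)
```

Positions of the band array that fall outside the matrix (the first entry of the superdiagonal row, the last entry of the subdiagonal row) are left at zero in this sketch.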
?hbgst
Reduces a complex Hermitian positive-definite
generalized eigenproblem for banded matrices to the
standard form using the factorization performed by
?pbstf.
Syntax
call chbgst(vect, uplo, n, ka, kb, ab, ldab, bb, ldbb, x, ldx, work, rwork, info)
call zhbgst(vect, uplo, n, ka, kb, ab, ldab, bb, ldbb, x, ldx, work, rwork, info)
call hbgst(ab, bb [,x] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
To reduce the complex Hermitian positive-definite generalized eigenproblem A*z = lambda*B*z to the standard
form C*y = lambda*y, where A, B and C are banded, this routine must be preceded by a call to ?pbstf, which
computes the split Cholesky factorization of the positive-definite matrix B: B = SH*S. The split Cholesky
factorization, compared with the ordinary Cholesky factorization, allows the work to be approximately halved.
This routine overwrites A with C = XH*A*X, where X = inv(S)*Q, and Q is a unitary matrix chosen
(implicitly) to preserve the bandwidth of A. The routine also has an option to allow the accumulation of X,
and then, if z is an eigenvector of C, X*z is an eigenvector of the original system.
Input Parameters
vect CHARACTER*1. Must be 'N' or 'V'.

If vect = 'N', then matrix X is not returned;
If vect = 'V', then matrix X is returned.

ka INTEGER. The number of super- or sub-diagonals in A
(ka ≥ 0).
kb INTEGER. The number of super- or sub-diagonals in B
(ka ≥ kb ≥ 0).
ab, bb, work COMPLEX for chbgst
DOUBLE COMPLEX for zhbgst
ab(ldab,*) is an array containing either upper or lower triangular part of the
Hermitian matrix A (as specified by uplo) in band storage format.
bb(ldbb,*) is an array containing the banded split Cholesky factor of B as
specified by uplo, n and kb and returned by ?pbstf.
work(*) is a workspace array, dimension at least max(1, n).
ldx INTEGER. The leading dimension of the output array x. Constraints:

if vect = 'N', then ldx ≥ 1;
if vect = 'V', then ldx ≥ max(1, n).
rwork REAL for chbgst

DOUBLE PRECISION for zhbgst
Workspace array, dimension at least max(1, n)
Output Parameters
ab On exit, this array is overwritten by the upper or lower triangle of C as

specified by uplo.
x COMPLEX for chbgst

DOUBLE COMPLEX for zhbgst
Array.
If vect = 'V', then x(ldx,*) contains the n-by-n matrix X = inv(S)*Q.
If vect = 'N', then x is not referenced.
The second dimension of x must be:

at least max(1, n), if vect = 'V';
at least 1, if vect = 'N'.
info INTEGER.

Specific details for the routine hbgst interface are the following:
x Holds the matrix X of size (n,n).
vect Restored based on the presence of the argument x as follows: vect = 'V', if x
is present, vect = 'N', if x is omitted.
Application Notes
Forming the reduced matrix C involves implicit multiplication by inv(B). When the routine is used as a step
in the computation of eigenvalues and eigenvectors of the original problem, there may be a significant loss of
accuracy if B is ill-conditioned with respect to inversion. The total number of floating-point operations is
approximately 20n^2*kb when vect = 'N'. An additional 5n^3*(kb/ka) operations are required when vect =
'V'. All these estimates assume that both ka and kb are much less than n.
?pbstf
Computes a split Cholesky factorization of a real
symmetric or complex Hermitian positive-definite
banded matrix used in ?sbgst/?hbgst.
Syntax
call spbstf(uplo, n, kb, bb, ldbb, info)
call dpbstf(uplo, n, kb, bb, ldbb, info)
call cpbstf(uplo, n, kb, bb, ldbb, info)
call zpbstf(uplo, n, kb, bb, ldbb, info)
call pbstf(bb [, uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes a split Cholesky factorization of a real symmetric or complex Hermitian positive-
definite band matrix B. It is to be used in conjunction with sbgst/hbgst.
The factorization has the form B = ST*S (or B = SH*S for complex flavors), where S is a band matrix of the
same bandwidth as B and the following structure: S is upper triangular in the first (n+kb)/2 rows and lower
triangular in the remaining rows.
Input Parameters
uplo CHARACTER*1. Must be 'U' or 'L'.
If uplo = 'U', bb stores the upper triangular part of B.
If uplo = 'L', bb stores the lower triangular part of B.
kb INTEGER. The number of super- or sub-diagonals in B
(kb ≥ 0).
bb REAL for spbstf

DOUBLE PRECISION for dpbstf
COMPLEX for cpbstf
DOUBLE COMPLEX for zpbstf.
bb(ldbb,*) is an array containing either upper or lower triangular part of the
matrix B (as specified by uplo) in band storage format.
ldbb INTEGER. The leading dimension of bb; must be at least kb+1.
Output Parameters
bb On exit, this array is overwritten by the elements of the split Cholesky

factor S.
info INTEGER.
If info = i, then the factorization could not be completed, because the
updated element b(i,i) would be the square root of a negative number; hence
the matrix B is not positive-definite.

Specific details for the routine pbstf interface are the following:
Application Notes
The computed factor S is the exact factor of a perturbed matrix B + E, where
The total number of floating-point operations for real flavors is approximately n(kb+1)^2. The number of
operations for complex flavors is 4 times greater. All these estimates assume that kb is much less than n.
After calling this routine, you can call sbgst/hbgst to solve the generalized eigenproblem Az = λBz, where A
and B are banded and B is positive-definite.
Nonsymmetric Eigenvalue Problems: LAPACK Computational Routines

This section describes LAPACK routines for solving nonsymmetric eigenvalue problems, computing the Schur
factorization of general matrices, as well as performing a number of related computational tasks.
A nonsymmetric eigenvalue problem is as follows: given a nonsymmetric (or non-Hermitian) matrix A, find
the eigenvalues λ and the corresponding eigenvectors z that satisfy the equation
Az = λz (right eigenvectors z)
or the equation
z^H A = λ z^H (left eigenvectors z).
Nonsymmetric eigenvalue problems have the following properties:
The number of eigenvectors may be less than the matrix order (but is not less than the number of
distinct eigenvalues of A).
Eigenvalues may be complex even for a real matrix A.
If a real nonsymmetric matrix has a complex eigenvalue a+bi corresponding to an eigenvector z, then
a-bi is also an eigenvalue. The eigenvalue a-bi corresponds to the eigenvector whose elements are
complex conjugate to the elements of z.
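The conjugate-pair property can be checked on a small example. The sketch below, in plain Python, computes the eigenpairs of a real 2-by-2 matrix from its characteristic polynomial; eig2 and residual are hypothetical helpers for this illustration only, and the approach assumes the (1,2) entry of the matrix is nonzero.

```python
import cmath

def eig2(m):
    """Eigenpairs of a real 2-by-2 matrix [[a, b], [c, d]] with b != 0."""
    (a, b), (c, d) = m
    s = cmath.sqrt((a + d) ** 2 - 4.0 * (a * d - b * c))  # root of the discriminant
    pairs = []
    for lam in ((a + d + s) / 2.0, (a + d - s) / 2.0):
        z = (complex(b), lam - a)  # satisfies the first row of (A - lam*I)*z = 0
        pairs.append((lam, z))
    return pairs

def residual(m, lam, z):
    """Largest component of A*z - lam*z in absolute value."""
    return max(abs(m[i][0] * z[0] + m[i][1] * z[1] - lam * z[i]) for i in range(2))

A = [[1.0, -2.0], [3.0, 1.0]]
(lam1, z1), (lam2, z2) = eig2(A)
print(lam1, z1, residual(A, lam1, z1))
print(lam2, z2, residual(A, lam2, z2))
```

The matrix is real, yet both eigenvalues are complex; the second eigenvalue is the conjugate of the first, and the second eigenvector is the elementwise conjugate of the first.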
To solve a nonsymmetric eigenvalue problem with LAPACK, you usually need to reduce the matrix to the
upper Hessenberg form and then solve the eigenvalue problem with the Hessenberg matrix obtained. Table
"Computational Routines for Solving Nonsymmetric Eigenvalue Problems" lists LAPACK routines to reduce the
matrix to the upper Hessenberg form by an orthogonal (or unitary) similarity transformation A = Q*H*Q^H as
well as routines to solve eigenvalue problems with Hessenberg matrices, forming the Schur factorization of
such matrices and computing the corresponding condition numbers. The corresponding routine names in the
Fortran 95 interface are without the first symbol.
The decision tree in Figure "Decision Tree: Real Nonsymmetric Eigenvalue Problems" helps you choose the
right routine or sequence of routines for an eigenvalue problem with a real nonsymmetric matrix. If you need
to solve an eigenvalue problem with a complex non-Hermitian matrix, use the decision tree shown in Figure
"Decision Tree: Complex Non-Hermitian Eigenvalue Problems".
Computational Routines for Solving Nonsymmetric Eigenvalue Problems
Operation performed                                            Routines for real matrices   Routines for complex matrices
Reduce to Hessenberg form A = Q*H*Q^H                          ?gehrd                       ?gehrd
Generate the matrix Q                                          ?orghr                       ?unghr
Apply the matrix Q                                             ?ormhr                       ?unmhr
Balance matrix                                                 ?gebal                       ?gebal
Transform eigenvectors of balanced matrix to those of          ?gebak                       ?gebak
the original matrix
Find eigenvalues and Schur factorization (QR algorithm)        ?hseqr                       ?hseqr
Find eigenvectors from Hessenberg form (inverse iteration)     ?hsein                       ?hsein
Find eigenvectors from Schur factorization                     ?trevc                       ?trevc
Estimate sensitivities of eigenvalues and eigenvectors         ?trsna                       ?trsna
Reorder Schur factorization                                    ?trexc                       ?trexc
Reorder Schur factorization, find the invariant subspace       ?trsen                       ?trsen
and estimate sensitivities
Solve Sylvester's equation                                     ?trsyl                       ?trsyl
Decision Tree: Real Nonsymmetric Eigenvalue Problems
Decision Tree: Complex Non-Hermitian Eigenvalue Problems
?gehrd
Reduces a general matrix to upper Hessenberg form.
Syntax
call sgehrd(n, ilo, ihi, a, lda, tau, work, lwork, info)
call dgehrd(n, ilo, ihi, a, lda, tau, work, lwork, info)
call cgehrd(n, ilo, ihi, a, lda, tau, work, lwork, info)
call zgehrd(n, ilo, ihi, a, lda, tau, work, lwork, info)
call gehrd(a [, tau] [,ilo] [,ihi] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a general matrix A to upper Hessenberg form H by an orthogonal or unitary similarity
transformation A = Q*H*QH. Here H has real subdiagonal elements.
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of elementary
reflectors. Routines are provided to work with Q in this representation.
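The idea of reducing a matrix to upper Hessenberg form by elementary (Householder) reflectors can be sketched in plain Python. This is a textbook unblocked version for real matrices, not the MKL implementation: the hypothetical helper hessenberg applies each reflector to the matrix directly and does not store the reflectors in a and tau the way ?gehrd does.

```python
import math

def hessenberg(a):
    """Return H = Q^T*A*Q in upper Hessenberg form (Q is applied, not stored)."""
    n = len(a)
    h = [row[:] for row in a]
    for k in range(n - 2):
        x = [h[i][k] for i in range(k + 1, n)]       # column k below the diagonal
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue                                 # nothing to eliminate
        v = x[:]
        v[0] += math.copysign(norm_x, x[0])          # sign chosen to avoid cancellation
        norm_v = math.sqrt(sum(t * t for t in v))
        v = [t / norm_v for t in v]
        m = len(v)
        for j in range(n):                           # rows k+1.. : H = (I - 2vv^T)*H
            s = sum(v[i] * h[k + 1 + i][j] for i in range(m))
            for i in range(m):
                h[k + 1 + i][j] -= 2.0 * v[i] * s
        for i in range(n):                           # cols k+1.. : H = H*(I - 2vv^T)
            s = sum(h[i][k + 1 + j] * v[j] for j in range(m))
            for j in range(m):
                h[i][k + 1 + j] -= 2.0 * s * v[j]
    return h

A = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0],
     [9.0, 10.0, 11.0, 13.0],
     [2.0, 0.0, 1.0, 3.0]]
H = hessenberg(A)
below = max(abs(H[i][j]) for i in range(4) for j in range(4) if i > j + 1)
print(below, sum(H[i][i] for i in range(4)))
```

Since the transformation is an orthogonal similarity, the entries below the subdiagonal vanish up to rounding while the trace and the Frobenius norm of the matrix are preserved.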
Input Parameters
ilo, ihi INTEGER. If A is output by ?gebal, then ilo and ihi must contain the
values returned by that routine. Otherwise ilo = 1 and ihi = n. (If n >
0, then 1 ≤ ilo ≤ ihi ≤ n; if n = 0, ilo = 1 and ihi = 0.)
a, work REAL for sgehrd

DOUBLE PRECISION for dgehrd
COMPLEX for cgehrd
DOUBLE COMPLEX for zgehrd.
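Arrays:
work (lwork) is a workspace array.
lwork INTEGER. The size of the work array. If lwork = -1, then a workspace query is assumed; the routine
only calculates the optimal size of the work array, returns this value as the first entry of the work
array, and no error message related to lwork is issued by xerbla.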
Output Parameters
a The elements on and above the subdiagonal contain the upper Hessenberg
matrix H. The subdiagonal elements of H are real. The elements below the
subdiagonal, with the array tau, represent the orthogonal matrix Q as a
product of n elementary reflectors.
tau REAL for sgehrd

DOUBLE PRECISION for dgehrd
COMPLEX for cgehrd
DOUBLE COMPLEX for zgehrd.
Array, size at least max (1, n-1).
Contains scalars that define elementary reflectors for the matrix Q.

info INTEGER.

Specific details for the routine gehrd interface are the following:
ilo Default value for this argument is ilo = 1.
ihi Default value for this argument is ihi = n.
Application Notes
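If you are not sure how much workspace to supply, set lwork = -1 to perform a workspace query: the routine
returns immediately and provides the recommended workspace size in the first element of the array work.
The computed Hessenberg matrix H is exactly similar to a nearby matrix A + E, where ||E||_2 < c(n)*ε*||A||_2,
c(n) is a modestly increasing function of n, and ε is the machine precision.
The approximate number of floating-point operations for real flavors is (2/3)*(ihi - ilo)^2*(2ihi + 2ilo
+ 3n); for complex flavors it is 4 times greater.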
?orghr
Generates the real orthogonal matrix Q determined
by ?gehrd.
Syntax
call sorghr(n, ilo, ihi, a, lda, tau, work, lwork, info)
call dorghr(n, ilo, ihi, a, lda, tau, work, lwork, info)
call orghr(a, tau [,ilo] [,ihi] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine explicitly generates the orthogonal matrix Q that has been determined by a preceding call to
sgehrd/dgehrd. (The routine ?gehrd reduces a real general matrix A to upper Hessenberg form H by an
orthogonal similarity transformation, A = Q*H*Q^T, and represents the matrix Q as a product of ihi-ilo
elementary reflectors. Here ilo and ihi are values determined by sgebal/dgebal when balancing the
matrix; if the matrix has not been balanced, ilo = 1 and ihi = n.)
The matrix Q generated by ?orghr has the structure:

    [ I   0    0 ]
Q = [ 0   Q22  0 ]
    [ 0   0    I ]

where Q22 occupies rows and columns ilo to ihi.
Input Parameters
ilo, ihi INTEGER. These must be the same parameters ilo and ihi, respectively, as
supplied to ?gehrd. (If n > 0, then 1 ≤ ilo ≤ ihi ≤ n; if n = 0, ilo = 1 and
ihi = 0.)
a, tau, work REAL for sorghr

DOUBLE PRECISION for dorghr
Arrays: a(lda,*) contains details of the vectors which define the elementary
reflectors, as returned by ?gehrd.

tau(*) contains further details of the elementary reflectors, as returned
by ?gehrd.
The dimension of tau must be at least max (1, n-1).


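lwork INTEGER. The size of the work array; lwork ≥ max(1, ihi-ilo).
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the work
array, returns this value as the first entry of the work array, and no error message related to lwork is
issued by xerbla.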
Output Parameters
a Overwritten by the n-by-n orthogonal matrix Q.

info INTEGER.

Specific details for the routine orghr interface are the following:
Application Notes
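For better performance, try using lwork = (ihi-ilo)*blocksize, where blocksize is a machine-dependent
value (typically, 16 to 64) required for optimum performance of the blocked algorithm.
If you are not sure how much workspace to supply, set lwork = -1 to perform a workspace query: the routine
returns immediately and provides the recommended workspace size in the first element of the array work.
The computed matrix Q differs from the exact result by a matrix E such that ||E||_2 = O(ε), where ε is the
machine precision.
The approximate number of floating-point operations is (4/3)(ihi-ilo)^3.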
The complex counterpart of this routine is unghr.
?ormhr
Multiplies an arbitrary real matrix C by the real
orthogonal matrix Q determined by ?gehrd.
Syntax
call sormhr(side, trans, m, n, ilo, ihi, a, lda, tau, c, ldc, work, lwork, info)
call dormhr(side, trans, m, n, ilo, ihi, a, lda, tau, c, ldc, work, lwork, info)
call ormhr(a, tau, c [,ilo] [,ihi] [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a matrix C by the orthogonal matrix Q that has been determined by a preceding call to
sgehrd/dgehrd. (The routine ?gehrd reduces a real general matrix A to upper Hessenberg form H by an
orthogonal similarity transformation, A = Q*H*Q^T, and represents the matrix Q as a product of ihi-ilo
elementary reflectors. Here ilo and ihi are values determined by sgebal/dgebal when balancing the
matrix; if the matrix has not been balanced, ilo = 1 and ihi = n.)
With ?ormhr, you can form one of the matrix products Q*C, QT*C, C*Q, or C*QT, overwriting the result on C
(which may be any real rectangular matrix).
A common application of ?ormhr is to transform a matrix V of eigenvectors of H to the matrix QV of
eigenvectors of A.
Input Parameters

side CHARACTER*1. Must be 'L' or 'R'.
If side = 'L', then the routine forms Q*C or QT*C.
If side = 'R', then the routine forms C*Q or C*QT.
trans CHARACTER*1. Must be 'N' or 'T'.

If trans = 'N', then Q is applied to C.
If trans = 'T', then QT is applied to C.
m INTEGER. The number of rows in C (m ≥ 0).
n INTEGER. The number of columns in C (n ≥ 0).
ilo, ihi INTEGER. These must be the same parameters ilo and ihi, respectively, as
supplied to ?gehrd.
If m > 0 and side = 'L', then 1 ≤ ilo ≤ ihi ≤ m.
If m = 0 and side = 'L', then ilo = 1 and ihi = 0.
If n > 0 and side = 'R', then 1 ≤ ilo ≤ ihi ≤ n.
If n = 0 and side = 'R', then ilo = 1 and ihi = 0.
a, tau, c, work REAL for sormhr

DOUBLE PRECISION for dormhr
Arrays:
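a(lda,*) contains details of the vectors which define the elementary
reflectors, as returned by ?gehrd.
The second dimension of a must be at least max(1, m) if side = 'L' and
at least max(1, n) if side = 'R'.
tau(*) contains further details of the elementary reflectors, as returned
by ?gehrd.
The dimension of tau must be at least max(1, m-1) if side = 'L' and at
least max(1, n-1) if side = 'R'.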
c(ldc,*) contains the m by n matrix C.

The second dimension of c must be at least max(1, n).
lda INTEGER. The leading dimension of a; at least max(1, m) if side = 'L'

and at least max (1, n) if side = 'R'.
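ldc INTEGER. The leading dimension of c; at least max(1, m).

lwork INTEGER. The size of the work array.
If side = 'L', lwork ≥ max(1, n).
If side = 'R', lwork ≥ max(1, m).
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the work
array, returns this value as the first entry of the work array, and no error message related to lwork is
issued by xerbla.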
Output Parameters
c C is overwritten by product Q*C, QT*C, C*Q, or C*QT as specified by side and

trans.

info INTEGER.

Specific details for the routine ormhr interface are the following:

Application Notes
For better performance, lwork should be at least n*blocksize if side = 'L' and at least m*blocksize if side
= 'R', where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum performance
of the blocked algorithm.
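If you are not sure how much workspace to supply, set lwork = -1 to perform a workspace query: the routine
returns immediately and provides the recommended workspace size in the first element of the array work.
The computed matrix Q differs from the exact result by a matrix E such that ||E||_2 = O(ε)*||C||_2, where
ε is the machine precision.
The approximate number of floating-point operations is
2n(ihi-ilo)^2 if side = 'L';
2m(ihi-ilo)^2 if side = 'R'.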
The complex counterpart of this routine is unmhr.
?unghr
Generates the complex unitary matrix Q determined
by ?gehrd.
Syntax
call cunghr(n, ilo, ihi, a, lda, tau, work, lwork, info)
call zunghr(n, ilo, ihi, a, lda, tau, work, lwork, info)
call unghr(a, tau [,ilo] [,ihi] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine is intended to be used following a call to cgehrd/zgehrd, which reduces a complex matrix A to
upper Hessenberg form H by a unitary similarity transformation: A = Q*H*Q^H. ?gehrd represents the matrix
Q as a product of ihi-ilo elementary reflectors. Here ilo and ihi are values determined by cgebal/zgebal
when balancing the matrix; if the matrix has not been balanced, ilo = 1 and ihi = n.
Use the routine unghr to generate Q explicitly as a square matrix. The matrix Q has the structure:

    [ I   0    0 ]
Q = [ 0   Q22  0 ]
    [ 0   0    I ]

where Q22 occupies rows and columns ilo to ihi.
Input Parameters
ilo, ihi INTEGER. These must be the same parameters ilo and ihi, respectively, as
supplied to ?gehrd. (If n > 0, then 1 ≤ ilo ≤ ihi ≤ n. If n = 0, then ilo =
1 and ihi = 0.)
a, tau, work COMPLEX for cunghr

DOUBLE COMPLEX for zunghr.
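Arrays:
a(lda,*) contains details of the vectors which define the elementary
reflectors, as returned by ?gehrd.
tau(*) contains further details of the elementary reflectors, as returned
by ?gehrd.
The dimension of tau must be at least max(1, n-1).

lwork INTEGER. The size of the work array; lwork ≥ max(1, ihi-ilo).
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the work
array, returns this value as the first entry of the work array, and no error message related to lwork is
issued by xerbla.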
Output Parameters
a Overwritten by the n-by-n unitary matrix Q.

info INTEGER.
Specific details for the routine unghr interface are the following:
Application Notes
For better performance, try using lwork = (ihi-ilo)*blocksize, where blocksize is a machine-
dependent value (typically, 16 to 64) required for optimum performance of the blocked algorithm.
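If you are not sure how much workspace to supply, set lwork = -1 to perform a workspace query: the routine
returns immediately and provides the recommended workspace size in the first element of the array work.
The computed matrix Q differs from the exact result by a matrix E such that ||E||_2 = O(ε), where ε is the
machine precision.
The approximate number of real floating-point operations is (16/3)(ihi-ilo)^3.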
The real counterpart of this routine is orghr.
?unmhr
Multiplies an arbitrary complex matrix C by the
complex unitary matrix Q determined by ?gehrd.
Syntax
call cunmhr(side, trans, m, n, ilo, ihi, a, lda, tau, c, ldc, work, lwork, info)
call zunmhr(side, trans, m, n, ilo, ihi, a, lda, tau, c, ldc, work, lwork, info)
call unmhr(a, tau, c [,ilo] [,ihi] [,side] [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine multiplies a matrix C by the unitary matrix Q that has been determined by a preceding call to
cgehrd/zgehrd. (The routine ?gehrd reduces a complex general matrix A to upper Hessenberg form H by a
unitary similarity transformation, A = Q*H*Q^H, and represents the matrix Q as a product of ihi-ilo
elementary reflectors. Here ilo and ihi are values determined by cgebal/zgebal when balancing the
matrix; if the matrix has not been balanced, ilo = 1 and ihi = n.)
With ?unmhr, you can form one of the matrix products Q*C, QH*C, C*Q, or C*QH, overwriting the result on C
(which may be any complex rectangular matrix). A common application of this routine is to transform a
matrix V of eigenvectors of H to the matrix QV of eigenvectors of A.
Input Parameters

side CHARACTER*1. Must be 'L' or 'R'.
If side = 'L', then the routine forms Q*C or QH*C.
If side = 'R', then the routine forms C*Q or C*QH.
trans CHARACTER*1. Must be 'N' or 'C'.

If trans = 'N', then Q is applied to C.
If trans = 'C', then QH is applied to C.
m INTEGER. The number of rows in C (m ≥ 0).
n INTEGER. The number of columns in C (n ≥ 0).
ilo, ihi INTEGER. These must be the same parameters ilo and ihi, respectively, as
supplied to ?gehrd.
If m > 0 and side = 'L', then 1 ≤ ilo ≤ ihi ≤ m.
If m = 0 and side = 'L', then ilo = 1 and ihi = 0.
If n > 0 and side = 'R', then 1 ≤ ilo ≤ ihi ≤ n.
If n = 0 and side = 'R', then ilo = 1 and ihi = 0.
a, tau, c, work COMPLEX for cunmhr

DOUBLE COMPLEX for zunmhr.
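Arrays:
a(lda,*) contains details of the vectors which define the elementary
reflectors, as returned by ?gehrd.
The second dimension of a must be at least max(1, m) if side = 'L' and
at least max(1, n) if side = 'R'.
tau(*) contains further details of the elementary reflectors, as returned
by ?gehrd.
The dimension of tau must be at least max(1, m-1)
if side = 'L' and at least max(1, n-1) if side = 'R'.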
lda INTEGER. The leading dimension of a; at least max(1, m) if side = 'L'
and at least max (1, n) if side = 'R'.
ldc INTEGER. The leading dimension of c; at least max(1, m).

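lwork INTEGER. The size of the work array.
If side = 'L', lwork ≥ max(1, n).
If side = 'R', lwork ≥ max(1, m).
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the work
array, returns this value as the first entry of the work array, and no error message related to lwork is
issued by xerbla.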
Output Parameters
c C is overwritten by Q*C, or QH*C, or C*QH, or C*Q as specified by side and

trans.

info INTEGER.

Specific details for the routine unmhr interface are the following:

Application Notes
For better performance, lwork should be at least n*blocksize if side = 'L' and at least m*blocksize if side
= 'R', where blocksize is a machine-dependent value (typically, 16 to 64) required for optimum performance
of the blocked algorithm.
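If you are not sure how much workspace to supply, set lwork = -1 to perform a workspace query: the routine
returns immediately and provides the recommended workspace size in the first element of the array work.
The computed matrix Q differs from the exact result by a matrix E such that ||E||_2 = O(ε)*||C||_2, where
ε is the machine precision.
The approximate number of floating-point operations is
8n(ihi-ilo)^2 if side = 'L';
8m(ihi-ilo)^2 if side = 'R'.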
The real counterpart of this routine is ormhr.
?gebal
Balances a general matrix to improve the accuracy of
computed eigenvalues and eigenvectors.
Syntax
call sgebal(job, n, a, lda, ilo, ihi, scale, info)
call dgebal(job, n, a, lda, ilo, ihi, scale, info)
call cgebal(job, n, a, lda, ilo, ihi, scale, info)
call zgebal(job, n, a, lda, ilo, ihi, scale, info)
call gebal(a [, scale] [,ilo] [,ihi] [,job] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine balances a matrix A by performing either or both of the following two similarity transformations:
(1) The routine first attempts to permute A to block upper triangular form:

               [ A'11  A'12  A'13 ]
P*A*P^T = A' = [ 0     A'22  A'23 ]
               [ 0     0     A'33 ]

where P is a permutation matrix, and A'11 and A'33 are upper triangular. The diagonal elements of A'11 and
A'33 are eigenvalues of A. The rest of the eigenvalues of A are the eigenvalues of the central diagonal block
A'22, in rows and columns ilo to ihi. Subsequent operations to compute the eigenvalues of A (or its Schur
factorization) need only be applied to these rows and columns; this can save a significant amount of work if
ilo > 1 and ihi < n.
If no suitable permutation exists (as is often the case), the routine sets ilo = 1 and ihi = n, and A'22 is
the whole of A.
(2) The routine applies a diagonal similarity transformation to A', to make the rows and columns of A'22 as
close in norm as possible:
This scaling can reduce the norm of the matrix (that is, ||A''22|| < ||A'22||), and hence reduce the
effect of rounding errors on the accuracy of computed eigenvalues and eigenvectors.
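The effect of the scaling step can be illustrated in plain Python. The sketch assumes the convention inv(D)*A*D for the diagonal similarity transformation (with the opposite convention the roles of d and 1/d are swapped), and the scaling factors are supplied by hand rather than chosen by an algorithm; scale_similarity and frobenius are hypothetical helpers, not MKL routines.

```python
import math

def scale_similarity(a, d):
    """Return inv(D)*A*D for D = diag(d): entry (i, j) becomes a[i][j]*d[j]/d[i]."""
    n = len(a)
    return [[a[i][j] * d[j] / d[i] for j in range(n)] for i in range(n)]

def frobenius(a):
    """Frobenius norm of a matrix given as a list of rows."""
    return math.sqrt(sum(t * t for row in a for t in row))

def trace2(m):
    return m[0][0] + m[1][1]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Badly scaled matrix: the off-diagonal entries differ by six orders of magnitude.
A = [[1.0, 1000.0],
     [0.001, 1.0]]
d = [1024.0, 1.0]  # power-of-two scaling factors, so the scaling adds no rounding error
A_bal = scale_similarity(A, d)

print(A_bal)
print(frobenius(A), frobenius(A_bal))
```

The balanced matrix has off-diagonal entries close to 1 and a much smaller norm, while the trace and determinant (and therefore the eigenvalues) are unchanged by the similarity transformation.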
Input Parameters
job CHARACTER*1. Must be 'N' or 'P' or 'S' or 'B'.

If job = 'N', then A is neither permuted nor scaled (but ilo, ihi, and scale
get their values).
If job = 'P', then A is permuted but not scaled.
If job = 'S', then A is scaled but not permuted.
If job = 'B', then A is both scaled and permuted.
a REAL for sgebal

DOUBLE PRECISION for dgebal
COMPLEX for cgebal
DOUBLE COMPLEX for zgebal.
The second dimension of a must be at least max(1, n). a is not referenced if
job = 'N'.
Output Parameters
a Overwritten by the balanced matrix (a is not referenced if job = 'N').
ilo, ihi INTEGER. The values ilo and ihi such that on exit a(i,j) is zero if i > j and
1 ≤ j < ilo or ihi < j ≤ n.
If job = 'N' or 'S', then ilo = 1 and ihi = n.
scale REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors
Array, size at least max(1, n).
Contains details of the permutations and scaling factors.
More precisely, if pj is the index of the row and column interchanged with
row and column j, and dj is the scaling factor used to balance row and
column j, then
scale(j) = pj for j = 1, 2,..., ilo-1, ihi+1,..., n;
scale(j) = dj for j = ilo, ilo + 1,..., ihi.
The order in which the interchanges are made is n to ihi+1, then 1 to ilo-1.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.

Specific details for the routine gebal interface are the following:
scale Holds the vector of length n.
job Must be 'B', 'S', 'P', or 'N'. The default value is 'B'.
Application Notes
The errors are negligible, compared with those in subsequent computations.
If the matrix A is balanced by this routine, then any eigenvectors computed subsequently are eigenvectors of
the matrix A'' and hence you must call gebak to transform them back to eigenvectors of A.
If the Schur vectors of A are required, do not call this routine with job = 'S' or 'B', because then the
balancing transformation is not orthogonal (not unitary for complex flavors).
If you call this routine with job = 'P', then any Schur vectors computed subsequently are Schur vectors of
the matrix A'', and you need to call gebak (with side = 'R') to transform them back to Schur vectors of A.
The total number of floating-point operations is proportional to n^2.
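Example (an illustrative sketch, not part of the original reference). It assumes the program is linked against Intel MKL or another LAPACK implementation; the program name, the test matrix, and the variable name scale_fac are hypothetical. The call follows the dgebal syntax given above and prints ilo, ihi, the scale array, and the balanced matrix.

```fortran
program gebal_demo
  implicit none
  integer, parameter :: n = 4, lda = n
  double precision :: a(lda,n), scale_fac(n)
  integer :: ilo, ihi, info, i
  external :: dgebal

  ! Illustrative test matrix, stored column by column.
  ! It is badly scaled, and its last row is zero off the diagonal.
  a = reshape([ 1.0d0, 1.0d-3, 0.0d0,  0.0d0, &
                1.0d3, 2.0d0,  1.0d-3, 0.0d0, &
                0.0d0, 1.0d3,  3.0d0,  0.0d0, &
                1.0d0, 1.0d0,  1.0d0,  4.0d0 ], [lda, n])

  ! job = 'B': both permute and scale.
  call dgebal('B', n, a, lda, ilo, ihi, scale_fac, info)
  if (info /= 0) then
     print *, 'dgebal returned info = ', info
     stop 1
  end if

  print '(a,i0,a,i0)', 'ilo = ', ilo, ', ihi = ', ihi
  print '(a,4es12.4)', 'scale = ', scale_fac
  print '(a)', 'balanced matrix:'
  do i = 1, n
     print '(4es12.4)', a(i,:)
  end do
end program gebal_demo
```

The values in scale_fac should be passed unchanged to ?gebak together with ilo and ihi.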
?gebak
Transforms eigenvectors of a balanced matrix to those
of the original nonsymmetric matrix.
Syntax
call sgebak(job, side, n, ilo, ihi, scale, m, v, ldv, info)
call dgebak(job, side, n, ilo, ihi, scale, m, v, ldv, info)
call cgebak(job, side, n, ilo, ihi, scale, m, v, ldv, info)
call zgebak(job, side, n, ilo, ihi, scale, m, v, ldv, info)
call gebak(v, scale [,ilo] [,ihi] [,job] [,side] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine is intended to be used after a matrix A has been balanced by a call to ?gebal, and eigenvectors
of the balanced matrix A''22 have subsequently been computed. For a description of balancing, see ?gebal. The
balanced matrix A'' is obtained as A'' = D*P*A*P^T*inv(D), where P is a permutation matrix and D is a
diagonal scaling matrix. This routine transforms the eigenvectors as follows:
if x is a right eigenvector of A'', then P^T*inv(D)*x is a right eigenvector of A; if y is a left eigenvector of A'',
then P^T*D*y is a left eigenvector of A.
Input Parameters
job CHARACTER*1. Must be 'N' or 'P' or 'S' or 'B'. The same parameter job
as supplied to ?gebal.
side CHARACTER*1. Must be 'L' or 'R'.
If side = 'L', then left eigenvectors are transformed.
If side = 'R', then right eigenvectors are transformed.
n INTEGER. The number of rows of the matrix of eigenvectors (n ≥ 0).
ilo, ihi INTEGER. The values ilo and ihi, as returned by ?gebal. (If n > 0, then
1 ≤ ilo ≤ ihi ≤ n;
if n = 0, then ilo = 1 and ihi = 0.)
scale REAL for single-precision flavors

DOUBLE PRECISION for double-precision flavors
Contains details of the permutations and/or the scaling factors used to
balance the original general matrix, as returned by ?gebal.
m INTEGER. The number of columns of the matrix of eigenvectors (m ≥ 0).
v REAL for sgebak

DOUBLE PRECISION for dgebak
COMPLEX for cgebak
DOUBLE COMPLEX for zgebak.
Arrays:
v(ldv,*) contains the matrix of left or right eigenvectors to be transformed.
The second dimension of v must be at least max(1, m).
ldv INTEGER. The leading dimension of v; at least max(1, n).
Output Parameters
v Overwritten by the transformed eigenvectors.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.

Specific details for the routine gebak interface are the following:
v Holds the matrix V of size (n,m).
Application Notes
The errors in this routine are negligible.
The number of floating-point operations is approximately proportional to m*n.
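Example (an illustrative sketch, not part of the original reference). It assumes linking against Intel MKL or another LAPACK implementation; the program name, matrix, and variable names are hypothetical. Back-transforming the identity matrix with side = 'R' yields the matrix G that ?gebak applies to right eigenvectors, so the similarity A*G = G*A'' can be checked numerically.

```fortran
program gebak_demo
  implicit none
  integer, parameter :: n = 3
  double precision :: a(n,n), b(n,n), g(n,n), scale_fac(n)
  integer :: ilo, ihi, info, i
  external :: dgebal, dgebak

  ! Illustrative badly scaled matrix, stored column by column.
  a = reshape([ 1.0d0, 1.0d-4, 0.0d0,  &
                1.0d4, 2.0d0,  1.0d-4, &
                0.0d0, 1.0d4,  3.0d0 ], [n, n])

  b = a                               ! b is overwritten by the balanced matrix A''
  call dgebal('B', n, b, n, ilo, ihi, scale_fac, info)
  if (info /= 0) stop 'dgebal failed'

  g = 0.0d0                           ! start from the identity matrix
  do i = 1, n
     g(i,i) = 1.0d0
  end do

  ! Same job as passed to dgebal; side = 'R' treats the columns of g
  ! as right eigenvectors (here: the unit vectors).
  call dgebak('B', 'R', n, ilo, ihi, scale_fac, n, g, n, info)
  if (info /= 0) stop 'dgebak failed'

  ! A*G and G*A'' should agree up to rounding.
  print '(a,es10.2)', 'max |A*G - G*B| = ', &
        maxval(abs(matmul(a, g) - matmul(g, b)))
end program gebak_demo
```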
?hseqr
Computes all eigenvalues and (optionally) the Schur
factorization of a matrix reduced to Hessenberg form.
Syntax
call shseqr(job, compz, n, ilo, ihi, h, ldh, wr, wi, z, ldz, work, lwork, info)
call dhseqr(job, compz, n, ilo, ihi, h, ldh, wr, wi, z, ldz, work, lwork, info)
call chseqr(job, compz, n, ilo, ihi, h, ldh, w, z, ldz, work, lwork, info)
call zhseqr(job, compz, n, ilo, ihi, h, ldh, w, z, ldz, work, lwork, info)
call hseqr(h, wr, wi [,ilo] [,ihi] [,z] [,job] [,compz] [,info])
call hseqr(h, w [,ilo] [,ihi] [,z] [,job] [,compz] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally the Schur factorization, of an upper Hessenberg
matrix H: H = Z*T*Z^H, where T is an upper triangular (or, for real flavors, quasi-triangular) matrix (the
Schur form of H), and Z is the unitary or orthogonal matrix whose columns are the Schur vectors zi.
You can also use this routine to compute the Schur factorization of a general matrix A which has been
reduced to upper Hessenberg form H:
A = Q*H*Q^H, where Q is unitary (orthogonal for real flavors);
A = (Q*Z)*T*(Q*Z)^H.
In this case, after reducing A to Hessenberg form by ?gehrd, call ?orghr (?unghr for complex flavors) to
form Q explicitly and then pass Q to ?hseqr with compz = 'V'.
You can also call ?gebal to balance the original matrix before reducing it to Hessenberg form by ?gehrd, so
that the Hessenberg matrix H will have the structure:
    [ H11  H12  H13 ]
    [  0   H22  H23 ]
    [  0    0   H33 ]
where H11 and H33 are upper triangular.
If so, only the central diagonal block H22 (in rows and columns ilo to ihi) needs to be further reduced to
Schur form (the blocks H12 and H23 are also affected). Therefore the values of ilo and ihi can be supplied to
?hseqr directly. Also, after calling this routine you must call ?gebak to permute the Schur vectors of the
balanced matrix to those of the original matrix.
If ?gebal has not been called, however, then ilo must be set to 1 and ihi to n. Note that if the Schur
factorization of A is required, ?gebal must not be called with job = 'S' or 'B', because the balancing
transformation is not unitary (for real flavors, it is not orthogonal).
?hseqr uses a multishift form of the upper Hessenberg QR algorithm. The Schur vectors are normalized so
that ||zi||2 = 1, but are determined only to within a complex factor of absolute value 1 (for the real
flavors, to within a factor ±1).
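Example (an illustrative sketch, not part of the original reference). It assumes linking against Intel MKL or another LAPACK implementation and uses the standard LAPACK calling sequences of dgehrd and dorghr, which are documented in their own sections; the program name, the matrix, the fixed workspace size 64*n, and the helper check are hypothetical. The sequence permutes A with ?gebal (job = 'P'), reduces it to Hessenberg form, forms Q, computes the Schur factorization, and back-transforms the Schur vectors with ?gebak.

```fortran
program schur_pipeline
  implicit none
  integer, parameter :: n = 4, lwork = 64*n
  double precision :: a(n,n), h(n,n), q(n,n), scale_fac(n), tau(n-1)
  double precision :: wr(n), wi(n), work(lwork)
  integer :: ilo, ihi, info, j
  external :: dgebal, dgehrd, dorghr, dhseqr, dgebak

  ! Illustrative matrix whose last row is zero off the diagonal.
  a = reshape([ 4.0d0, 1.0d0, 2.0d0, 0.0d0, &
                3.0d0, 2.0d0, 1.0d0, 0.0d0, &
                2.0d0, 1.0d0, 3.0d0, 0.0d0, &
                1.0d0, 2.0d0, 1.0d0, 5.0d0 ], [n, n])
  h = a

  ! Permute only: scaling would make the transformation non-orthogonal.
  call dgebal('P', n, h, n, ilo, ihi, scale_fac, info)
  call check('dgebal', info)
  call dgehrd(n, ilo, ihi, h, n, tau, work, lwork, info)
  call check('dgehrd', info)

  q = h                                ! reflectors are stored below the subdiagonal
  call dorghr(n, ilo, ihi, q, n, tau, work, lwork, info)
  call check('dorghr', info)

  do j = 1, n - 2                      ! keep only the Hessenberg part of h
     h(j+2:n, j) = 0.0d0
  end do

  ! job = 'S', compz = 'V': on exit h holds T and q holds Q*Z.
  call dhseqr('S', 'V', n, ilo, ihi, h, n, wr, wi, q, n, work, lwork, info)
  call check('dhseqr', info)

  ! Undo the permutation applied by dgebal.
  call dgebak('P', 'R', n, ilo, ihi, scale_fac, n, q, n, info)
  call check('dgebak', info)

  ! A*Q and Q*T should agree up to rounding.
  print '(a,es10.2)', 'max |A*Q - Q*T| = ', &
        maxval(abs(matmul(a, q) - matmul(q, h)))

contains

  subroutine check(name, info)
    character(*), intent(in) :: name
    integer, intent(in) :: info
    if (info /= 0) then
       print *, name, ' returned info = ', info
       stop 1
    end if
  end subroutine check

end program schur_pipeline
```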
Input Parameters
job CHARACTER*1. Must be 'E' or 'S'.

If job = 'E', then eigenvalues only are required.
If job = 'S', then the Schur form T is required.
compz CHARACTER*1. Must be 'N' or 'I' or 'V'.
If compz = 'N', then no Schur vectors are computed (and the array z is not
referenced).
If compz = 'I', then the Schur vectors of H are computed (and the array z
is initialized by the routine).
If compz = 'V', then the Schur vectors of A are computed (and the array z
must contain the matrix Q on entry).
n INTEGER. The order of the matrix H (n ≥ 0).
ilo, ihi INTEGER. If A has been balanced by ?gebal, then ilo and ihi must contain
the values returned by ?gebal. Otherwise, ilo must be set to 1 and ihi to n.
h, z, work REAL for shseqr
DOUBLE PRECISION for dhseqr
COMPLEX for chseqr
DOUBLE COMPLEX for zhseqr.
Arrays:
h(ldh,*) The n-by-n upper Hessenberg matrix H.
The second dimension of h must be at least max(1, n).
z(ldz,*)
If compz = 'V', then z must contain the matrix Q from the reduction to
Hessenberg form.
If compz = 'I', then z need not be set.
If compz = 'N', then z is not referenced.
The second dimension of z must be
at least max(1, n) if compz = 'V' or 'I';
at least 1 if compz = 'N'.
work(lwork) is a workspace array.
The dimension of work must be at least max (1, n).
ldh INTEGER. The leading dimension of h; at least max(1, n).
ldz INTEGER. The leading dimension of z;
If compz = 'N', then ldz ≥ 1.
If compz = 'V' or 'I', then ldz ≥ max(1, n).
lwork INTEGER. The dimension of the array work.
lwork ≥ max(1, n) is sufficient and delivers very good and sometimes
optimal performance. However, lwork as large as 11*n may be required for
optimal performance. A workspace query is recommended to determine the
optimal workspace size.
If lwork = -1, then a workspace query is assumed; the routine only
estimates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla. See Application Notes for details.
Output Parameters
w COMPLEX for chseqr

DOUBLE COMPLEX for zhseqr.
Array, size at least max (1, n). Contains the computed eigenvalues, unless
info>0. The eigenvalues are stored in the same order as on the diagonal of
the Schur form T (if computed).
wr, wi REAL for shseqr

DOUBLE PRECISION for dhseqr
Arrays, size at least max (1, n) each.
Contain the real and imaginary parts, respectively, of the computed
eigenvalues, unless info > 0. Complex conjugate pairs of eigenvalues
appear consecutively with the eigenvalue having positive imaginary part
first. The eigenvalues are stored in the same order as on the diagonal of the
Schur form T (if computed).
h If info = 0 and job = 'S', h contains the upper quasi-triangular matrix T
from the Schur decomposition (the Schur form).
If info = 0 and job = 'E', the contents of h are unspecified on exit. (The
output value of h when info > 0 is given under the description of info
below.)
z If compz = 'V' and info = 0, then z contains Q*Z.
If compz = 'I' and info = 0, then z contains the unitary or orthogonal
matrix Z of the Schur vectors of H.
work(1) On exit, if info = 0, then work(1) returns the optimal lwork.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i > 0, ?hseqr failed to compute all of the eigenvalues. Elements
1, 2, ..., ilo-1 and i+1, i+2, ..., n of the eigenvalue arrays (wr and wi for real
flavors and w for complex flavors) contain the real and imaginary parts of
those eigenvalues that have been successfully found.
If info > 0, and job = 'E', then on exit, the remaining unconverged
eigenvalues are the eigenvalues of the upper Hessenberg matrix rows and
columns ilo through info of the final output value of H.
If info > 0, and job = 'S', then on exit (initial value of H)*U = U*(final
value of H), where U is a unitary matrix. The final value of H is upper
Hessenberg and triangular in rows and columns info+1 through ihi.
If info > 0, and compz = 'V', then on exit (final value of Z) = (initial
value of Z)*U, where U is the unitary matrix (regardless of the value of
job).
If info > 0, and compz = 'I', then on exit (final value of Z) = U, where
U is the unitary matrix (regardless of the value of job).
If info > 0, and compz = 'N', then Z is not accessed.

Specific details for the routine hseqr interface are the following:
h Holds the matrix H of size (n,n).
wr Holds the vector of length n. Used in real flavors only.
wi Holds the vector of length n. Used in real flavors only.
w Holds the vector of length n. Used in complex flavors only.
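job Must be 'E' or 'S'. The default value is 'E'.
compz If omitted, this argument is restored based on the presence of argument z as follows:
compz = 'I', if z is present,
compz = 'N', if z is omitted.
If present, compz must be equal to 'I' or 'V' and the argument z must also be present.
Note that there will be an error condition if compz is present and z is omitted.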
Application Notes
The computed Schur factorization is the exact factorization of a nearby matrix H + E, where ||E||2 < O(ε)
||H||2/si, and ε is the machine precision.
If λi is an exact eigenvalue, and μi is the corresponding computed value, then |λi - μi| ≤ c(n)*ε*||H||2/si,
where c(n) is a modestly increasing function of n, and si is the reciprocal condition number of λi. The
condition numbers si may be computed by calling ?trsna.
The total number of floating-point operations depends on how rapidly the algorithm converges; typical
numbers are as follows.
If only eigenvalues are computed: 7n^3 for real flavors, 25n^3 for complex flavors.
If the Schur form is computed: 10n^3 for real flavors.
If the full Schur factorization is computed: 20n^3 for real flavors.
If you are unsure how much workspace to supply, run a workspace query by calling the routine with
lwork = -1; on exit, work(1) contains the optimal value of lwork.
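Example (an illustrative sketch, not part of the original reference). It assumes linking against Intel MKL or another LAPACK implementation; the program name, the Hessenberg test matrix, and the variable wquery are hypothetical. The first call is a workspace query with lwork = -1; the second call computes the eigenvalues only.

```fortran
program hseqr_demo
  implicit none
  integer, parameter :: n = 4
  double precision :: h(n,n), wr(n), wi(n), z(1,1), wquery(1)
  double precision, allocatable :: work(:)
  integer :: lwork, info, i
  external :: dhseqr

  ! Illustrative upper Hessenberg matrix, stored column by column
  ! (all entries below the first subdiagonal are zero).
  h = reshape([ 4.0d0, 1.0d0, 0.0d0, 0.0d0, &
                3.0d0, 2.0d0, 1.0d0, 0.0d0, &
                2.0d0, 1.0d0, 3.0d0, 1.0d0, &
                1.0d0, 2.0d0, 1.0d0, 1.0d0 ], [n, n])

  ! Workspace query: only the optimal lwork is returned, in wquery(1).
  call dhseqr('E', 'N', n, 1, n, h, n, wr, wi, z, 1, wquery, -1, info)
  if (info /= 0) stop 'workspace query failed'
  lwork = max(1, int(wquery(1)))
  allocate(work(lwork))

  ! Eigenvalues only. ?gebal was not called, so ilo = 1 and ihi = n.
  ! With compz = 'N' the array z is not referenced and ldz = 1 suffices.
  call dhseqr('E', 'N', n, 1, n, h, n, wr, wi, z, 1, work, lwork, info)
  if (info /= 0) then
     print *, 'dhseqr returned info = ', info
     stop 1
  end if

  do i = 1, n
     print '(a,i0,a,2es14.6)', 'eigenvalue ', i, ': ', wr(i), wi(i)
  end do
end program hseqr_demo
```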
?hsein
Computes selected eigenvectors of an upper
Hessenberg matrix that correspond to specified
eigenvalues.
Syntax
call shsein(side, eigsrc, initv, select, n, h, ldh, wr, wi, vl, ldvl, vr, ldvr, mm,
m, work, ifaill, ifailr, info)
call dhsein(side, eigsrc, initv, select, n, h, ldh, wr, wi, vl, ldvl, vr, ldvr, mm,
m, work, ifaill, ifailr, info)
call chsein(side, eigsrc, initv, select, n, h, ldh, w, vl, ldvl, vr, ldvr, mm, m,
work, rwork, ifaill, ifailr, info)
call zhsein(side, eigsrc, initv, select, n, h, ldh, w, vl, ldvl, vr, ldvr, mm, m,
work, rwork, ifaill, ifailr, info)
call hsein(h, wr, wi, select [, vl] [,vr] [,ifaill] [,ifailr] [,initv] [,eigsrc] [,m]
[,info])
call hsein(h, w, select [,vl] [,vr] [,ifaill] [,ifailr] [,initv] [,eigsrc] [,m] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes left and/or right eigenvectors of an upper Hessenberg matrix H, corresponding to
selected eigenvalues.
The right eigenvector x and the left eigenvector y, corresponding to an eigenvalue λ, are defined by: H*x =
λ*x and y^H*H = λ*y^H (or H^H*y = conj(λ)*y). Here conj(λ) denotes the conjugate of λ.
The eigenvectors are computed by inverse iteration. They are scaled so that, for a real eigenvector x,
max|xi| = 1, and for a complex eigenvector, max(|Re xi| + |Im xi|) = 1.
If H has been formed by reduction of a general matrix A to upper Hessenberg form, then eigenvectors of H
may be transformed to eigenvectors of A by ormhr or unmhr.
Input Parameters
side CHARACTER*1. Must be 'R' or 'L' or 'B'.

If side = 'R', then only right eigenvectors are computed.
If side = 'L', then only left eigenvectors are computed.
If side = 'B', then all eigenvectors are computed.
eigsrc CHARACTER*1. Must be 'Q' or 'N'.

If eigsrc = 'Q', then the eigenvalues of H were found using hseqr; thus if
H has any zero sub-diagonal elements (and so is block triangular), then the
j-th eigenvalue can be assumed to be an eigenvalue of the block containing
the j-th row/column. This property allows the routine to perform inverse
iteration on just one diagonal block. If eigsrc = 'N', then no such
assumption is made and the routine performs inverse iteration using the
whole matrix.
initv CHARACTER*1. Must be 'N' or 'U'.

If initv = 'N', then no initial estimates for the selected eigenvectors are
supplied.
If initv = 'U', then initial estimates for the selected eigenvectors are
supplied in vl and/or vr.
select LOGICAL.
Array, size at least max (1, n). Specifies which eigenvectors are to be
computed.
For real flavors:
To obtain the real eigenvector corresponding to the real eigenvalue wr(j),
set select(j) to .TRUE.
To select the complex eigenvector corresponding to the complex eigenvalue
(wr(j), wi(j)) with complex conjugate (wr(j+1), wi(j+1)), set select(j)
and/or select(j+1) to .TRUE.; the eigenvector corresponding to the first
eigenvalue in the pair is computed.
For complex flavors:
To select the eigenvector corresponding to the eigenvalue w(j), set select(j)
to .TRUE.
h, vl, vr, work REAL for shsein

DOUBLE PRECISION for dhsein
COMPLEX for chsein
DOUBLE COMPLEX for zhsein.
Arrays:
h(ldh,*) The n-by-n upper Hessenberg matrix H. If a NaN value is detected
in h, the routine returns with info = -6.
vl(ldvl,*)
If initv = 'U' and side = 'L' or 'B', then vl must contain starting
vectors for inverse iteration for the left eigenvectors. Each starting vector
must be stored in the same column or columns as will be used to store the
corresponding eigenvector.
If initv = 'N', then vl need not be set.
The second dimension of vl must be at least max(1, mm) if side = 'L' or
'B' and at least 1 if side = 'R'.
The array vl is not referenced if side = 'R'.
vr(ldvr,*)
If initv = 'U' and side = 'R' or 'B', then vr must contain starting
vectors for inverse iteration for the right eigenvectors. Each starting vector
must be stored in the same column or columns as will be used to store the
corresponding eigenvector.
If initv = 'N', then vr need not be set.
The second dimension of vr must be at least max(1, mm) if side = 'R' or
'B' and at least 1 if side = 'L'.
The array vr is not referenced if side = 'L'.
work(*) is a workspace array,
size at least max (1, n*(n+2)) for real flavors and at least max (1, n*n) for
complex flavors.
w COMPLEX for chsein

DOUBLE COMPLEX for zhsein.
Array, size at least max (1, n).
Contains the eigenvalues of the matrix H.
If eigsrc = 'Q', the array must be exactly as returned by ?hseqr.
wr, wi REAL for shsein
DOUBLE PRECISION for dhsein
Arrays, size at least max (1, n) each.
Contain the real and imaginary parts, respectively, of the eigenvalues of the
matrix H. Complex conjugate pairs of values must be stored in consecutive
elements of the arrays. If eigsrc = 'Q', the arrays must be exactly as
returned by ?hseqr.
ldvl INTEGER. The leading dimension of vl.
If side = 'L' or 'B', ldvl ≥ max(1,n).
If side = 'R', ldvl ≥ 1.
ldvr INTEGER. The leading dimension of vr.
If side = 'R' or 'B', ldvr ≥ max(1,n).
If side = 'L', ldvr ≥ 1.
mm INTEGER. The number of columns in vl and/or vr.
Must be at least m, the actual number of columns required (see Output
Parameters below).
For real flavors, m is obtained by counting 1 for each selected real
eigenvector and 2 for each selected complex eigenvector (see select).
For complex flavors, m is the number of selected eigenvectors (see select).
Constraint:
0 ≤ mm ≤ n.
rwork REAL for chsein
DOUBLE PRECISION for zhsein.
Array, size at least max (1, n).
Output Parameters
select Overwritten for real flavors only.

If a complex eigenvector was selected as specified above, then select(j) is
set to .TRUE. and select(j + 1) to .FALSE.
w The real parts of some elements of w may be modified, as close eigenvalues
are perturbed slightly in searching for independent eigenvectors.
wr Some elements of wr may be modified, as close eigenvalues are perturbed
slightly in searching for independent eigenvectors.
vl, vr If side = 'L' or 'B', vl contains the computed left eigenvectors (as
specified by select).
If side = 'R' or 'B', vr contains the computed right eigenvectors (as
specified by select).
The eigenvectors treated column-wise form a rectangular n-by-mm matrix.
For real flavors: a real eigenvector corresponding to a real eigenvalue
occupies one column of the matrix; a complex eigenvector corresponding to
a complex eigenvalue occupies two columns: the first column holds the real
part of the eigenvector and the second column holds the imaginary part of
the eigenvector.
m INTEGER. For real flavors: the number of columns of vl and/or vr required
to store the selected eigenvectors.
For complex flavors: the number of selected eigenvectors.
ifaill, ifailr INTEGER.

Arrays, size at least max(1, mm) each.
ifaill(i) = 0 if the ith column of vl converged;
ifaill(i) = j > 0 if the eigenvector stored in the i-th column of vl
(corresponding to the jth eigenvalue) failed to converge.
ifailr(i) = 0 if the ith column of vr converged;
ifailr(i) = j > 0 if the eigenvector stored in the i-th column of vr
(corresponding to the jth eigenvalue) failed to converge.
For real flavors: if the ith and (i+1)th columns of vl contain a selected
complex eigenvector, then ifaill(i) and ifaill(i + 1) are set to the same value.
A similar rule holds for vr and ifailr.
The array ifaill is not referenced if side = 'R'. The array ifailr is not
referenced if side = 'L'.
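info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i > 0, then i eigenvectors (as indicated by the parameters ifaill
and/or ifailr above) failed to converge. The corresponding columns of vl
and/or vr contain no useful information.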

Specific details for the routine hsein interface are the following:
select Holds the vector of length n.
vl Holds the matrix VL of size (n,mm).
vr Holds the matrix VR of size (n,mm).
ifaill Holds the vector of length (mm). Note that there will be an error condition if ifaill
is present and vl is omitted.
ifailr Holds the vector of length (mm). Note that there will be an error condition if ifailr
is present and vr is omitted.
initv Must be 'N' or 'U'. The default value is 'N'.
eigsrc Must be 'N' or 'Q'. The default value is 'N'.
side Restored based on the presence of arguments vl and vr as follows:

side = 'B', if both vl and vr are present,
side = 'L', if vl is present and vr omitted,
side = 'R', if vl is omitted and vr present,
Note that there will be an error condition if both vl and vr are omitted.
Application Notes
Each computed right eigenvector xi is the exact eigenvector of a nearby matrix A + Ei, such that ||Ei|| <
O(ε)||A||. Hence the residual is small:
||A*xi - λi*xi|| = O(ε)||A||.
However, eigenvectors corresponding to close or coincident eigenvalues may not accurately span the relevant
subspaces.
Similar remarks apply to computed left eigenvectors.
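Example (an illustrative sketch, not part of the original reference). It assumes linking against Intel MKL or another LAPACK implementation; the program name, the test matrix, and the variable names (sel is used for the select argument) are hypothetical. Eigenvalues are first computed by dhseqr on a copy of H, then all right eigenvectors are computed by inverse iteration with eigsrc = 'Q'. Because every eigenvector is selected, the eigenvector for eigenvalue j starts in column j of vr.

```fortran
program hsein_demo
  implicit none
  integer, parameter :: n = 4, mm = n
  double precision :: h(n,n), t(n,n), wr(n), wi(n)
  double precision :: vl(1,1), vr(n,mm), zdum(1,1), work(n*(n+2))
  logical :: sel(n)
  integer :: ifaill(mm), ifailr(mm), m, info, j
  external :: dhseqr, dhsein

  ! Illustrative upper Hessenberg matrix, stored column by column.
  h = reshape([ 4.0d0, 1.0d0, 0.0d0, 0.0d0, &
                3.0d0, 2.0d0, 1.0d0, 0.0d0, &
                2.0d0, 1.0d0, 3.0d0, 1.0d0, &
                1.0d0, 2.0d0, 1.0d0, 1.0d0 ], [n, n])

  t = h                        ! dhseqr overwrites its input; keep h intact
  call dhseqr('E', 'N', n, 1, n, t, n, wr, wi, zdum, 1, &
              work, size(work), info)
  if (info /= 0) stop 'dhseqr failed'

  sel = .true.                 ! select every eigenvector
  ! side = 'R': right eigenvectors only, so vl is not referenced.
  ! initv = 'N': no starting vectors are supplied.
  call dhsein('R', 'Q', 'N', sel, n, h, n, wr, wi, vl, 1, vr, n, &
              mm, m, work, ifaill, ifailr, info)
  if (info < 0) stop 'dhsein: illegal argument'
  print '(a,i0,a,i0)', 'columns used m = ', m, ', not converged = ', info

  ! Residual check for the real eigenvalues (one column each).
  do j = 1, n
     if (wi(j) == 0.0d0 .and. ifailr(j) == 0) then
        print '(a,i0,a,es10.2)', 'residual for eigenvalue ', j, ': ', &
              maxval(abs(matmul(h, vr(:,j)) - wr(j)*vr(:,j)))
     end if
  end do
end program hsein_demo
```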
?trevc
Computes selected eigenvectors of an upper (quasi-)
triangular matrix computed by ?hseqr.
Syntax
call strevc(side, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, mm, m, work, info)
call dtrevc(side, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, mm, m, work, info)
call ctrevc(side, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, mm, m, work, rwork,
info)
call ztrevc(side, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, mm, m, work, rwork,
info)
call trevc(t [, howmny] [,select] [,vl] [,vr] [,m] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes some or all of the right and/or left eigenvectors of an upper triangular matrix T (or, for
real flavors, an upper quasi-triangular matrix T). Matrices of this type are produced by the Schur
factorization of a general matrix: A = Q*T*Q^H, as computed by ?hseqr.
The right eigenvector x and the left eigenvector y of T corresponding to an eigenvalue w, are defined by:
T*x = w*x, y^H*T = w*y^H, where y^H denotes the conjugate transpose of y.
The eigenvalues are not input to this routine, but are read directly from the diagonal blocks of T.
This routine returns the matrices X and/or Y of right and left eigenvectors of T, or the products Q*X and/or
Q*Y, where Q is an input matrix.
If Q is the orthogonal/unitary factor that reduces a matrix A to Schur form T, then Q*X and Q*Y are the
matrices of right and left eigenvectors of A.
Input Parameters
side CHARACTER*1. Must be 'R' or 'L' or 'B'.

If side = 'R', then only right eigenvectors are computed.
If side = 'L', then only left eigenvectors are computed.
If side = 'B', then all eigenvectors are computed.
howmny CHARACTER*1. Must be 'A' or 'B' or 'S'.

If howmny = 'A', then all eigenvectors (as specified by side) are
computed.
If howmny = 'B', then all eigenvectors (as specified by side) are computed
and backtransformed by the matrices supplied in vl and vr.
If howmny = 'S', then selected eigenvectors (as specified by side and
select) are computed.
select LOGICAL.
Array, size at least max (1, n).
If howmny = 'S', select specifies which eigenvectors are to be computed.
If howmny = 'A' or 'B', select is not referenced.
For real flavors:
If omega(j) is a real eigenvalue, the corresponding real eigenvector is
computed if select(j) is .TRUE.
If omega(j) and omega(j + 1) are the real and imaginary parts of a
complex eigenvalue, the corresponding complex eigenvector is computed if
either select(j) or select(j + 1) is .TRUE., and on exit select(j) is set
to .TRUE. and select(j + 1) is set to .FALSE.
For complex flavors:
The eigenvector corresponding to the j-th eigenvalue is computed if
select(j) is .TRUE.
t, vl, vr REAL for strevc

DOUBLE PRECISION for dtrevc
COMPLEX for ctrevc
DOUBLE COMPLEX for ztrevc.
Arrays:
t(ldt,*) contains the n-by-n matrix T in Schur canonical form. For complex
flavors ctrevc and ztrevc, contains the upper triangular matrix T.
The second dimension of t must be at least max(1, n).
vl(ldvl,*)
If howmny = 'B' and side = 'L' or 'B', then vl must contain an n-by-n
matrix Q (usually the matrix of Schur vectors returned by ?hseqr).
If howmny = 'A' or 'S', then vl need not be set.
The second dimension of vl must be at least max(1, mm) if side = 'L' or
'B' and at least 1 if side = 'R'.
The array vl is not referenced if side = 'R'.
vr(ldvr,*)
If howmny = 'B' and side = 'R' or 'B', then vr must contain an n-by-n
matrix Q (usually the matrix of Schur vectors returned by ?hseqr).
If howmny = 'A' or 'S', then vr need not be set.
The second dimension of vr must be at least max(1, mm) if side = 'R' or
'B' and at least 1 if side = 'L'.
The array vr is not referenced if side = 'L'.
work(*) is a workspace array,
size at least max (1, 3*n) for real flavors and at least max (1, 2*n) for
complex flavors.
ldt INTEGER. The leading dimension of t; at least max(1, n).
ldvl INTEGER. The leading dimension of vl.
If side = 'L' or 'B', ldvl ≥ n.
If side = 'R', ldvl ≥ 1.
ldvr INTEGER. The leading dimension of vr.
If side = 'R' or 'B', ldvr ≥ n.
If side = 'L', ldvr ≥ 1.
mm INTEGER. The number of columns in the arrays vl and/or vr. Must be at
least m (the precise number of columns required).
If howmny = 'A' or 'B', mm = n.
If howmny = 'S': for real flavors, mm is obtained by counting 1 for each
selected real eigenvector and 2 for each selected complex eigenvector;
for complex flavors, mm is the number of selected eigenvectors (see
select).
Constraint: 0 ≤ mm ≤ n.
rwork REAL for ctrevc

DOUBLE PRECISION for ztrevc.
Workspace array, size at least max (1, n).
Output Parameters
select If a complex eigenvector of a real matrix was selected as specified above,

then select(j) is set to .TRUE. and select(j + 1) to .FALSE.
t COMPLEX for ctrevc

DOUBLE COMPLEX for ztrevc.
ctrevc/ztrevc modify the t(ldt,*) array, which is restored on exit.
vl, vr If side = 'L' or 'B', vl contains the computed left eigenvectors (as
specified by howmny and select).
If side = 'R' or 'B', vr contains the computed right eigenvectors (as
specified by howmny and select).
The eigenvectors treated column-wise form a rectangular n-by-mm matrix.
For real flavors: a real eigenvector corresponding to a real eigenvalue
occupies one column of the matrix; a complex eigenvector corresponding to
a complex eigenvalue occupies two columns: the first column holds the real
part of the eigenvector and the second column holds the imaginary part of
the eigenvector.
m INTEGER.
For complex flavors: the number of selected eigenvectors.
If howmny = 'A' or 'B', m is set to n.
For real flavors: the number of columns of vl and/or vr actually used to

store the selected eigenvectors.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.

Specific details for the routine trevc interface are the following:
t Holds the matrix T of size (n,n).
side If omitted, this argument is restored based on the presence of arguments vl and
vr as follows:
side = 'B', if both vl and vr are present,
side = 'L', if vr is omitted,
side = 'R', if vl is omitted.
Note that there will be an error condition if both vl and vr are omitted.
howmny If omitted, this argument is restored based on the presence of argument select
as follows:
howmny = 'S', if select is present,
howmny = 'A', if select is omitted.
If present, howmny must be 'A' or 'B' and the argument select must be omitted.
Note that there will be an error condition if both select and howmny are present.
Application Notes
If xi is an exact right eigenvector and yi is the corresponding computed eigenvector, then the angle θ(yi,
xi) between them is bounded as follows: θ(yi,xi) ≤ (c(n)*ε*||T||2)/sepi, where sepi is the reciprocal
condition number of xi. The condition number sepi may be computed by calling ?trsna.
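Example (an illustrative sketch, not part of the original reference). It assumes linking against Intel MKL or another LAPACK implementation; the program name, the test matrix, and the variable names (sel is used for the select argument) are hypothetical. The Schur form T and Schur vectors Z of a Hessenberg matrix H are computed by dhseqr; dtrevc with howmny = 'B' then overwrites Z with Z*X, the right eigenvectors of H.

```fortran
program trevc_demo
  implicit none
  integer, parameter :: n = 4
  double precision :: h(n,n), t(n,n), z(n,n), wr(n), wi(n)
  double precision :: vl(1,1), work(3*n)
  logical :: sel(n)
  integer :: m, info, j
  external :: dhseqr, dtrevc

  ! Illustrative upper Hessenberg matrix, stored column by column.
  h = reshape([ 4.0d0, 1.0d0, 0.0d0, 0.0d0, &
                3.0d0, 2.0d0, 1.0d0, 0.0d0, &
                2.0d0, 1.0d0, 3.0d0, 1.0d0, &
                1.0d0, 2.0d0, 1.0d0, 1.0d0 ], [n, n])

  t = h
  ! job = 'S', compz = 'I': t becomes the Schur form, z the Schur vectors of H.
  call dhseqr('S', 'I', n, 1, n, t, n, wr, wi, z, n, work, size(work), info)
  if (info /= 0) stop 'dhseqr failed'

  sel = .false.                ! not referenced when howmny = 'B'
  ! side = 'R', howmny = 'B': z (passed as vr) holds Q on entry and the
  ! back-transformed right eigenvectors on exit; vl is not referenced.
  call dtrevc('R', 'B', sel, n, t, n, vl, 1, z, n, n, m, work, info)
  if (info /= 0) stop 'dtrevc failed'
  print '(a,i0)', 'columns used m = ', m

  ! Residual check for the real eigenvalues (one column each).
  do j = 1, n
     if (wi(j) == 0.0d0) then
        print '(a,i0,a,es10.2)', 'residual for eigenvalue ', j, ': ', &
              maxval(abs(matmul(h, z(:,j)) - wr(j)*z(:,j)))
     end if
  end do
end program trevc_demo
```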
?trsna
Estimates condition numbers for specified eigenvalues
and right eigenvectors of an upper (quasi-) triangular
matrix.
Syntax
call strsna(job, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, s, sep, mm, m, work,
ldwork, iwork, info)
call dtrsna(job, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, s, sep, mm, m, work,
ldwork, iwork, info)
call ctrsna(job, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, s, sep, mm, m, work,
ldwork, rwork, info)
call ztrsna(job, howmny, select, n, t, ldt, vl, ldvl, vr, ldvr, s, sep, mm, m, work,
ldwork, rwork, info)
call trsna(t [, s] [,sep] [,vl] [,vr] [,select] [,m] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine estimates condition numbers for specified eigenvalues and/or right eigenvectors of an upper
triangular matrix T (or, for real flavors, upper quasi-triangular matrix T in canonical Schur form). These are
the same as the condition numbers of the eigenvalues and right eigenvectors of an original matrix A =
Z*T*Z^H (with unitary or, for real flavors, orthogonal Z), from which T may have been derived.
The routine computes the reciprocal of the condition number of an eigenvalue λi as
si = |v^T*u|/(||u||E*||v||E) for real flavors and si = |v^H*u|/(||u||E*||v||E) for complex flavors,
where:
u and v are the right and left eigenvectors of T, respectively, corresponding to λi.
v^T/v^H denote transpose/conjugate transpose of v, respectively.
This reciprocal condition number always lies between zero (ill-conditioned) and one (well-conditioned).
An approximate error estimate for a computed eigenvalue λi is then given by ε*||T||/si, where ε is the
machine precision.
To estimate the reciprocal of the condition number of the right eigenvector corresponding to λi, the routine
first calls ?trexc to reorder the diagonal elements of matrix T so that λi is in the leading position:
    T = Q * [ λi  c^H ] * Q^H
            [ 0   T22 ]
The reciprocal condition number of the eigenvector is then estimated as sepi, the smallest singular value of
the matrix T22 - λi*I.
An approximate error estimate for a computed right eigenvector u corresponding to λi is then given by
ε*||T||/sepi.
Input Parameters
job CHARACTER*1. Must be 'E' or 'V' or 'B'.

If job = 'E', then condition numbers for eigenvalues only are computed.
If job = 'V', then condition numbers for eigenvectors only are computed.
If job = 'B', then condition numbers for both eigenvalues and

eigenvectors are computed.
howmny CHARACTER*1. Must be 'A' or 'S'.

If howmny = 'A', then the condition numbers for all eigenpairs are
computed.
If howmny = 'S', then condition numbers for selected eigenpairs (as
specified by select) are computed.
select LOGICAL.
Array, size at least max (1, n) if howmny = 'S' and at least 1 otherwise.
Specifies the eigenpairs for which condition numbers are to be computed if
howmny = 'S'.
For real flavors:
To select condition numbers for the eigenpair corresponding to the real
eigenvalue λj, select(j) must be set .TRUE.;
to select condition numbers for the eigenpair corresponding to a complex
conjugate pair of eigenvalues λj and λj+1, select(j) and/or select(j + 1)
must be set .TRUE.
For complex flavors:
To select condition numbers for the eigenpair corresponding to the
eigenvalue λj, select(j) must be set .TRUE.
select is not referenced if howmny = 'A'.
t, vl, vr, work REAL for strsna

DOUBLE PRECISION for dtrsna
COMPLEX for ctrsna
DOUBLE COMPLEX for ztrsna.
Arrays:
t(ldt,*) contains the n-by-n matrix T.
vl(ldvl,*)
If job = 'E' or 'B', then vl must contain the left eigenvectors of T (or of
any matrix Q*T*QH with Q unitary or orthogonal) corresponding to the
eigenpairs specified by howmny and select. The eigenvectors must be
stored in consecutive columns of vl, as returned by trevc or hsein.
The second dimension of vl must be at least max(1, mm) if job = 'E' or
'B' and at least 1 if job = 'V'.
The array vl is not referenced if job = 'V'.
vr(ldvr,*)
If job = 'E' or 'B', then vr must contain the right eigenvectors of T (or of
any matrix Q*T*QH with Q unitary or orthogonal) corresponding to the
eigenpairs specified by howmny and select. The eigenvectors must be
stored in consecutive columns of vr, as returned by trevc or hsein.
The second dimension of vr must be at least max(1, mm) if job = 'E' or
'B' and at least 1 if job = 'V'.
The array vr is not referenced if job = 'V'.
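work is a workspace array of dimension (ldwork, n+6).
The array work is not referenced if job = 'E'.
ldt INTEGER. The leading dimension of t; at least max(1, n).
ldvl INTEGER. The leading dimension of vl.
If job = 'E' or 'B', ldvl ≥ max(1,n).
If job = 'V', ldvl ≥ 1.
ldvr INTEGER. The leading dimension of vr.
If job = 'E' or 'B', ldvr ≥ max(1,n).
If job = 'V', ldvr ≥ 1.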
mm INTEGER. The number of elements in the arrays s and sep, and the number
of columns in vl and vr (if used). Must be at least m (the precise number
required).
If howmny = 'A', mm = n;
if howmny = 'S', for real flavors mm is obtained by counting 1 for each
selected real eigenvalue and 2 for each selected complex conjugate pair of
eigenvalues;
for complex flavors mm is the number of selected eigenpairs (see select).
Constraint:
0 ≤ mm ≤ n.
ldwork INTEGER. The leading dimension of work.
If job = 'V' or 'B', ldwork ≥ max(1,n).
If job = 'E', ldwork ≥ 1.
rwork REAL for ctrsna
DOUBLE PRECISION for ztrsna.
Array, size at least max (1, n). The array is not referenced if job = 'E'.
iwork INTEGER for strsna, dtrsna.
Array, size at least max (1, 2*(n - 1)). The array is not referenced if job =
'E'.
Output Parameters
s REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Array, size at least max(1, mm) if job = 'E' or 'B' and at least 1 if job =
'V'.
Contains the reciprocal condition numbers of the selected eigenvalues if job
= 'E' or 'B', stored in consecutive elements of the array. Thus s(j), sep(j)
and the j-th columns of vl and vr all correspond to the same eigenpair (but
not in general the j-th eigenpair unless all eigenpairs have been selected).
For real flavors: for a complex conjugate pair of eigenvalues, two
consecutive elements of s are set to the same value. The array s is not
referenced if job = 'V'.
sep REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.

Array, size at least max(1, mm) if job = 'V' or 'B' and at least 1 if job =
'E'. Contains the estimated reciprocal condition numbers of the selected
right eigenvectors if job = 'V' or 'B', stored in consecutive elements of
the array.
For real flavors: for a complex eigenvector, two consecutive elements of sep
are set to the same value; if the eigenvalues cannot be reordered to
compute sep(j), then sep(j) is set to zero; this can only occur when the true
value would be very small anyway. The array sep is not referenced if job =
'E'.
m INTEGER.
For complex flavors: the number of selected eigenpairs.
If howmny = 'A', m is set to n.
For real flavors: the number of elements of s and/or sep actually used to
store the estimated condition numbers.
info INTEGER.

Specific details for the routine trsna interface are the following:
s Holds the vector of length (mm).
sep Holds the vector of length (mm).
job Restored based on the presence of arguments s and sep as follows:
job = 'B', if both s and sep are present,

job = 'E', if s is present and sep omitted,
job = 'V', if s is omitted and sep present.
Note an error condition if both s and sep are omitted.
howmny Restored based on the presence of the argument select as follows:

howmny = 'S', if select is present,
howmny = 'A', if select is omitted.
Note that the arguments s, vl, and vr must either be all present or all omitted.
Otherwise, an error condition is observed.
Application Notes
The computed values sep(i) may overestimate the true value, but seldom by a factor of more than 3.
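As an illustration of the rule for mm when howmny = 'S' (real flavors), the following sketch counts 1 for each selected 1-by-1 diagonal block and 2 for each selected 2-by-2 diagonal block of T. The function name count_mm and the test matrix are hypothetical and are not part of the library. Detecting a 2-by-2 block through a nonzero subdiagonal entry assumes that T is in Schur canonical form.

```fortran
program count_mm_demo
  implicit none
  integer, parameter :: n = 4
  double precision :: t(n,n)
  logical :: select(n)

  ! Hypothetical quasi-triangular T: a 1-by-1 block, a 2-by-2 block in
  ! rows 2:3 (nonzero subdiagonal entry t(3,2)), and another 1-by-1 block.
  t = 0.0d0
  t(1,1) = 1.0d0
  t(2,2) = 2.0d0;  t(2,3) = 3.0d0
  t(3,2) = -3.0d0; t(3,3) = 2.0d0
  t(4,4) = 5.0d0

  ! Select eigenvalue 1 and the complex pair (through its second member).
  select = (/ .true., .false., .true., .false. /)

  print '(a,i0)', 'mm = ', count_mm(n, t, select)   ! prints mm = 3

contains

  ! Illustrative helper (an assumption, not an MKL routine).
  integer function count_mm(n, t, select) result(mm)
    integer, intent(in) :: n
    double precision, intent(in) :: t(n,n)
    logical, intent(in) :: select(n)
    integer :: k
    logical :: pair

    mm = 0
    k = 1
    do while (k <= n)
      pair = .false.
      if (k < n) pair = (t(k+1,k) /= 0.0d0)
      if (pair) then
        ! A complex conjugate pair counts 2 if either member is selected.
        if (select(k) .or. select(k+1)) mm = mm + 2
        k = k + 2
      else
        if (select(k)) mm = mm + 1
        k = k + 1
      end if
    end do
  end function count_mm

end program count_mm_demo
```

The value obtained this way is the smallest admissible mm, and therefore the minimum length of s and sep for the given selection.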
?trexc
Reorders the Schur factorization of a general matrix.
Syntax
call strexc(compq, n, t, ldt, q, ldq, ifst, ilst, work, info)
call dtrexc(compq, n, t, ldt, q, ldq, ifst, ilst, work, info)
call ctrexc(compq, n, t, ldt, q, ldq, ifst, ilst, info)
call ztrexc(compq, n, t, ldt, q, ldq, ifst, ilst, info)
call trexc(t, ifst, ilst [,q] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reorders the Schur factorization of a general matrix A = Q*T*Q^H, so that the diagonal element or
block of T with row index ifst is moved to row ilst.
The reordered Schur form S is computed by a unitary (or, for real flavors, orthogonal) similarity
transformation: S = Z^H*T*Z. Optionally the updated matrix P of Schur vectors is computed as P = Q*Z,
giving A = P*S*P^H.
Input Parameters
compq CHARACTER*1. Must be 'V' or 'N'.

If compq = 'V', then the Schur vectors (Q) are updated.
If compq = 'N', then no Schur vectors are updated.
t, q REAL for strexc

DOUBLE PRECISION for dtrexc
COMPLEX for ctrexc
DOUBLE COMPLEX for ztrexc.
Arrays:
t(ldt,*) contains the n-by-n matrix T.
q(ldq,*)
If compq = 'V', then q must contain Q (Schur vectors).
If compq = 'N', then q is not referenced.
The second dimension of q must be at least max(1, n) if compq = 'V' and

at least 1 if compq = 'N'.
ldq INTEGER. The leading dimension of q;

If compq = 'N', then ldq ≥ 1.
If compq = 'V', then ldq ≥ max(1, n).
ifst, ilst INTEGER. 1 ≤ ifst ≤ n; 1 ≤ ilst ≤ n.

Must specify the reordering of the diagonal elements (or blocks, which is
possible for real flavors) of the matrix T. The element (or block) with row
index ifst is moved to row ilst by a sequence of exchanges between
adjacent elements (or blocks).
work REAL for strexc

DOUBLE PRECISION for dtrexc.
Output Parameters
t Overwritten by the updated matrix S.
q If compq = 'V', q contains the updated matrix of Schur vectors.
ifst, ilst Overwritten for real flavors only.

If ifst pointed to the second row of a 2 by 2 block on entry, it is changed to
point to the first row; ilst always points to the first row of the block in its
final position (which may differ from its input value by 1).
info INTEGER.

Specific details for the routine trexc interface are the following:
compq Restored based on the presence of the argument q as follows:

compq = 'V', if q is present,
compq = 'N', if q is omitted.
Application Notes
The computed matrix S is exactly similar to a matrix T+E, where ||E||_2 = O(ε)*||T||_2, and ε is the
machine precision.
Note that if a 2 by 2 diagonal block is involved in the re-ordering, its off-diagonal elements are in general
changed; the diagonal elements and the eigenvalues of the block are unchanged unless the block is
sufficiently ill-conditioned, in which case they may be noticeably altered. It is possible for a 2 by 2 block to
break into two 1 by 1 blocks, that is, for a pair of complex eigenvalues to become purely real.
The approximate number of floating-point operations is:
for real flavors: 6n(ifst-ilst) if compq = 'N';
12n(ifst-ilst) if compq = 'V';
for complex flavors: 20n(ifst-ilst) if compq = 'N';
40n(ifst-ilst) if compq = 'V'.
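A minimal sketch of a call to dtrexc through the FORTRAN 77 interface follows. It assumes the program is linked against Intel MKL (or another LAPACK implementation); the 3-by-3 matrix is hypothetical test data, and the work array is given size n as an assumption. The diagonal element in row 3 is moved to row 1, so the diagonal of T should come out as 3, 1, 2 up to rounding.

```fortran
program trexc_demo
  implicit none
  integer, parameter :: n = 3
  double precision :: t(n,n), q(n,n), work(n)
  integer :: ifst, ilst, info, i
  external :: dtrexc

  ! Hypothetical upper triangular T with eigenvalues 1, 2, 3 on the diagonal.
  t = 0.0d0
  t(1,1) = 1.0d0; t(1,2) = 4.0d0; t(1,3) = 5.0d0
  t(2,2) = 2.0d0; t(2,3) = 6.0d0
  t(3,3) = 3.0d0

  ! Q = I on entry, so on exit q holds the accumulated transformation Z.
  q = 0.0d0
  do i = 1, n
    q(i,i) = 1.0d0
  end do

  ifst = 3   ! row index of the diagonal element to move
  ilst = 1   ! destination row index
  call dtrexc('V', n, t, n, q, n, ifst, ilst, work, info)
  if (info /= 0) stop 'dtrexc failed'

  print '(a,3f8.3)', 'diag(T) after reordering: ', (t(i,i), i = 1, n)
end program trexc_demo
```

Starting from Q = I is a convenience for the example; in practice q would hold the Schur vectors returned by an earlier factorization.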
?trsen
Reorders the Schur factorization of a matrix and
(optionally) computes the reciprocal condition
numbers for the selected cluster of eigenvalues and
respective invariant subspace.
Syntax
call strsen(job, compq, select, n, t, ldt, q, ldq, wr, wi, m, s, sep, work, lwork,
iwork, liwork, info)
call dtrsen(job, compq, select, n, t, ldt, q, ldq, wr, wi, m, s, sep, work, lwork,
iwork, liwork, info)
call ctrsen(job, compq, select, n, t, ldt, q, ldq, w, m, s, sep, work, lwork, info)
call ztrsen(job, compq, select, n, t, ldt, q, ldq, w, m, s, sep, work, lwork, info)
call trsen(t, select [,wr] [,wi] [,m] [,s] [,sep] [,q] [,info])
call trsen(t, select [,w] [,m] [,s] [,sep] [,q] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reorders the Schur factorization of a general matrix A = Q*T*Q^T (for real flavors) or A = Q*T*Q^H
(for complex flavors) so that a selected cluster of eigenvalues appears in the leading diagonal elements (or,
for real flavors, diagonal blocks) of the Schur form. The reordered Schur form R is computed by a unitary
(orthogonal) similarity transformation: R = Z^H*T*Z. Optionally the updated matrix P of Schur vectors is
computed as P = Q*Z, giving A = P*R*P^H.
Let
    R = ( T11  T12 )
        (  0   T22 )
where the selected eigenvalues are precisely the eigenvalues of the leading m-by-m submatrix T11. Let P be
correspondingly partitioned as (Q1 Q2), where Q1 consists of the first m columns of P. Then A*Q1 = Q1*T11,
and so the m columns of Q1 form an orthonormal basis for the invariant subspace corresponding to the
selected cluster of eigenvalues.
Optionally the routine also computes estimates of the reciprocal condition numbers of the average of the
cluster of eigenvalues and of the invariant subspace.
Input Parameters
job CHARACTER*1. Must be 'N' or 'E' or 'V' or 'B'.

If job = 'N', then no condition numbers are required.
If job = 'E', then only the condition number for the cluster of eigenvalues
is computed.
If job = 'V', then only the condition number for the invariant subspace is
computed.
If job = 'B', then condition numbers for both the cluster and the invariant
subspace are computed.
compq CHARACTER*1. Must be 'V' or 'N'.

If compq = 'V', then Q of the Schur vectors is updated.
If compq = 'N', then no Schur vectors are updated.
select LOGICAL.
Specifies the eigenvalues in the selected cluster. To select an eigenvalue j,
select(j) must be .TRUE.
For real flavors: to select a complex conjugate pair of eigenvalues j and j + 1
(corresponding to a 2 by 2 diagonal block), select(j) and/or select(j + 1)
must be .TRUE.; the complex conjugate pair j and j + 1 must be either both
included in the cluster or both excluded.
t, q, work REAL for strsen

DOUBLE PRECISION for dtrsen
COMPLEX for ctrsen
DOUBLE COMPLEX for ztrsen.
Arrays:
t(ldt,*) The upper quasi-triangular n-by-n matrix T, in Schur canonical form.
q(ldq,*)
If compq = 'V', then q must contain the matrix Q of Schur vectors.
The second dimension of q must be at least max(1, n) if compq = 'V' and

at least 1 if compq = 'N'.
ldq INTEGER. The leading dimension of q.
If compq = 'V', then ldq ≥ max(1, n).
lwork INTEGER. The dimension of the array work.
If job = 'V' or 'B', lwork ≥ max(1,2*m*(n-m)).
If job = 'E', then lwork ≥ max(1, m*(n-m)).
If job = 'N', then lwork ≥ 1 for complex flavors and lwork ≥ max(1,n)
for real flavors.
iwork INTEGER. iwork(liwork) is a workspace array. The array iwork is not
referenced if job = 'N' or 'E'.
The actual amount of workspace required cannot exceed n^2/2 if job = 'V'
or 'B'.
liwork INTEGER. The dimension of the array iwork.
If job = 'V' or 'B', liwork ≥ max(1,2m(n-m)).
If job = 'N' or 'E', liwork ≥ 1.

Output Parameters
t Overwritten by the reordered matrix R in Schur canonical form with the

selected eigenvalues in the leading diagonal blocks.
q If compq = 'V', q contains the updated matrix of Schur vectors; the first
m columns of Q form an orthonormal basis for the specified invariant
subspace.
w COMPLEX for ctrsen

DOUBLE COMPLEX for ztrsen.
Array, size at least max(1, n). The reordered eigenvalues of R. The
eigenvalues are stored in the same order as on the diagonal of R.
wr, wi REAL for strsen

DOUBLE PRECISION for dtrsen
Arrays, size at least max(1, n). Contain the real and imaginary parts,
respectively, of the reordered eigenvalues of R. The eigenvalues are stored
in the same order as on the diagonal of R. Note that if a complex
eigenvalue is sufficiently ill-conditioned, then its value may differ
significantly from its value before reordering.
m INTEGER.
For complex flavors: the dimension of the specified invariant subspaces,
which is the same as the number of selected eigenvalues (see select).
For real flavors: the dimension of the specified invariant subspace. The
value of m is obtained by counting 1 for each selected real eigenvalue and 2
for each selected complex conjugate pair of eigenvalues (see select).
Constraint: 0 ≤ m ≤ n.
s REAL for single-precision flavors DOUBLE PRECISION for double-precision
flavors.
If job = 'E' or 'B', s is a lower bound on the reciprocal condition number
of the average of the selected cluster of eigenvalues.
If m = 0 or n, then s = 1.
For real flavors: if info = 1, then s is set to zero.
s is not referenced if job = 'N' or 'V'.
sep REAL for single-precision flavors DOUBLE PRECISION for double-precision

flavors.
If job = 'V' or 'B', sep is the estimated reciprocal condition number of
the specified invariant subspace.
If m = 0 or n, then sep = ||T||.
For real flavors: if info = 1, then sep is set to zero.
sep is not referenced if job = 'N' or 'E'.
work(1) On exit, if info = 0, then work(1) returns the optimal size of lwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the optimal size of liwork.
info INTEGER.
If info = 1, the reordering of T failed because some eigenvalues are too

close to separate (the problem is very ill-conditioned); T may have been
partially reordered, and wr and wi contain the eigenvalues in the same
order as in T; s and sep (if requested) are set to zero.

Specific details for the routine trsen interface are the following:
compq Restored based on the presence of the argument q as follows: compq = 'V', if q
is present, compq = 'N', if q is omitted.
job Restored based on the presence of arguments s and sep as follows:
job = 'B', if both s and sep are present,

job = 'E', if s is present and sep omitted,
job = 'V', if s is omitted and sep present,
job = 'N', if both s and sep are omitted.
Application Notes
The computed matrix R is exactly similar to a matrix T+E, where ||E||_2 = O(ε)*||T||_2, and ε is the
machine precision. The computed s cannot underestimate the true reciprocal condition number by more than
a factor of (min(m, n-m))^(1/2); sep may differ from the true value by (m*n-m^2)^(1/2). The angle between the
computed invariant subspace and the true subspace is O(ε)*||A||_2/sep. Note that if a 2-by-2 diagonal
block is involved in the re-ordering, its off-diagonal elements are in general changed; the diagonal elements
and the eigenvalues of the block are unchanged unless the block is sufficiently ill-conditioned, in which case
they may be noticeably altered. It is possible for a 2-by-2 block to break into two 1-by-1 blocks, that is, for a
pair of complex eigenvalues to become purely real.
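If it is not clear how much workspace to supply, use a generous value of lwork (or liwork) for the first run or
set lwork = -1 (liwork = -1).
If lwork (or liwork) has any of admissible sizes, which is no less than the minimal value described, the
routine completes the task, though probably not so fast as with a recommended workspace, and provides the
recommended workspace in the first element of the corresponding array (work, iwork) on exit. Use this value
(work(1), iwork(1)) for subsequent runs.
If lwork = -1 (liwork = -1), the routine returns immediately and provides the recommended workspace
in the first element of the corresponding array (work, iwork). This operation is called a workspace query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.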
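The workspace query can be sketched as follows for dtrsen. This is an illustration under stated assumptions, not a definitive usage pattern: the program is assumed to be linked against Intel MKL or another LAPACK implementation, the 3-by-3 matrix is hypothetical test data, and the one-element arrays wquery and iquery are names chosen for the example.

```fortran
program trsen_query_demo
  implicit none
  integer, parameter :: n = 3
  double precision :: t(n,n), q(n,n), wr(n), wi(n), s, sep, wquery(1)
  double precision, allocatable :: work(:)
  integer, allocatable :: iwork(:)
  integer :: iquery(1), lwork, liwork, m, info, i
  logical :: select(n)
  external :: dtrsen

  ! Hypothetical upper triangular T with eigenvalues 1, 2, 3; Q = I.
  t = 0.0d0
  t(1,1) = 1.0d0; t(1,2) = 4.0d0; t(1,3) = 5.0d0
  t(2,2) = 2.0d0; t(2,3) = 6.0d0
  t(3,3) = 3.0d0
  q = 0.0d0
  do i = 1, n
    q(i,i) = 1.0d0
  end do

  ! Select only the third eigenvalue for the leading cluster.
  select = (/ .false., .false., .true. /)

  ! Workspace query: lwork = -1 and liwork = -1.
  call dtrsen('B', 'V', select, n, t, n, q, n, wr, wi, m, s, sep, &
              wquery, -1, iquery, -1, info)
  if (info /= 0) stop 'workspace query failed'
  lwork  = max(1, int(wquery(1)))
  liwork = max(1, iquery(1))
  allocate(work(lwork), iwork(liwork))

  ! Actual reordering with the recommended workspace sizes.
  call dtrsen('B', 'V', select, n, t, n, q, n, wr, wi, m, s, sep, &
              work, lwork, iwork, liwork, info)
  if (info /= 0) stop 'dtrsen failed'

  print '(a,i0)', 'm = ', m
  print '(a,3f8.3)', 'wr = ', wr
  print '(a,es10.3,a,es10.3)', 's = ', s, '  sep = ', sep
end program trsen_query_demo
```

Querying first avoids hard-coding the workspace formulas, which depend on m and therefore on the selection.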
?trsyl
Solves Sylvester equation for real quasi-triangular or
complex triangular matrices.
Syntax
call strsyl(trana, tranb, isgn, m, n, a, lda, b, ldb, c, ldc, scale, info)
call dtrsyl(trana, tranb, isgn, m, n, a, lda, b, ldb, c, ldc, scale, info)
call ctrsyl(trana, tranb, isgn, m, n, a, lda, b, ldb, c, ldc, scale, info)
call ztrsyl(trana, tranb, isgn, m, n, a, lda, b, ldb, c, ldc, scale, info)
call trsyl(a, b, c, scale [, trana] [,tranb] [,isgn] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine solves the Sylvester matrix equation op(A)*X ± X*op(B) = alpha*C, where op(A) = A or A^H, and the
matrices A and B are upper triangular (or, for real flavors, upper quasi-triangular in canonical Schur form);
alpha ≤ 1 is a scale factor determined by the routine to avoid overflow in X; A is m-by-m, B is n-by-n, and C and X
are both m-by-n. The matrix X is obtained by a straightforward process of back substitution.
The equation has a unique solution if and only if λ_i ± μ_j ≠ 0 for all i and j, where {λ_i} and {μ_j} are the
eigenvalues of A and B, respectively, and the sign (+ or -) is the same as that used in the equation to be solved.
Input Parameters
trana CHARACTER*1. Must be 'N' or 'T' or 'C'.

If trana = 'N', then op(A) = A.
If trana = 'T', then op(A) = A^T (real flavors only).
If trana = 'C', then op(A) = A^H.
tranb CHARACTER*1. Must be 'N' or 'T' or 'C'.

If tranb = 'N', then op(B) = B.
If tranb = 'T', then op(B) = B^T (real flavors only).
If tranb = 'C', then op(B) = B^H.
isgn INTEGER. Indicates the form of the Sylvester equation.

If isgn = +1, op(A)*X + X*op(B) = alpha*C.
If isgn = -1, op(A)*X - X*op(B) = alpha*C.
m INTEGER. The order of A, and the number of rows in X and C (m ≥ 0).
n INTEGER. The order of B, and the number of columns in X and C (n ≥ 0).
a, b, c REAL for strsyl

DOUBLE PRECISION for dtrsyl
COMPLEX for ctrsyl
DOUBLE COMPLEX for ztrsyl.
Arrays:
Output Parameters
c Overwritten by the solution matrix X.

scale REAL for single-precision flavors, DOUBLE PRECISION for double-precision flavors.
The value of the scale factor alpha.
info INTEGER.
If info = 1, A and B have common or close eigenvalues; perturbed values

were used to solve the equation.

Specific details for the routine trsyl interface are the following:
a Holds the matrix A of size (m,m).
trana Must be 'N', 'C', or 'T'. The default value is 'N'.
tranb Must be 'N', 'C', or 'T'. The default value is 'N'.
isgn Must be +1 or -1. The default value is +1.
Application Notes
Let X be the exact, Y the corresponding computed solution, and R the residual matrix: R = C - (A*Y ± Y*B).
Then the residual is always small:
||R||_F = O(ε)*(||A||_F + ||B||_F)*||Y||_F,
where ε is the machine precision.
However, Y is not necessarily the exact solution of a slightly perturbed equation; in other words, the solution
is not backward stable.
For the forward error, the following bound holds:
||Y - X||_F ≤ ||R||_F/sep(A,B)
but this may be a considerable overestimate. See [Golub96] for a definition of sep(A, B).
The approximate number of floating-point operations for real flavors is m*n*(m + n). For complex flavors it
is 4 times greater.
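The following sketch solves A*X + X*B = scale*C with dtrsyl and then checks the residual described above. It assumes linking against Intel MKL or another LAPACK implementation; the 2-by-2 matrices and the names c0 and r are hypothetical and chosen only for the illustration.

```fortran
program trsyl_demo
  implicit none
  integer, parameter :: m = 2, n = 2
  double precision :: a(m,m), b(n,n), c(m,n), c0(m,n), r(m,n), scale
  integer :: info
  external :: dtrsyl

  ! Hypothetical upper triangular matrices (column-major order):
  ! A = [1 2; 0 3], B = [4 1; 0 5], so no eigenvalue sum vanishes.
  a  = reshape((/ 1.0d0, 0.0d0, 2.0d0, 3.0d0 /), (/ m, m /))
  b  = reshape((/ 4.0d0, 0.0d0, 1.0d0, 5.0d0 /), (/ n, n /))
  c0 = reshape((/ 1.0d0, 2.0d0, 3.0d0, 4.0d0 /), (/ m, n /))

  ! dtrsyl overwrites c with the solution X, so keep a copy of C in c0.
  c = c0
  call dtrsyl('N', 'N', 1, m, n, a, m, b, n, c, m, scale, info)
  if (info /= 0) stop 'dtrsyl reported perturbed or illegal input'

  ! Residual of the scaled equation: scale*C - (A*X + X*B).
  r = scale*c0 - (matmul(a, c) + matmul(c, b))

  print '(a,es10.3)', 'scale          = ', scale
  print '(a,es10.3)', 'max |residual| = ', maxval(abs(r))
end program trsyl_demo
```

For well-scaled data such as this, scale is expected to be 1 and the residual to be on the order of the machine precision.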
Generalized Nonsymmetric Eigenvalue Problems: LAPACK Computational Routines
This section describes LAPACK routines for solving generalized nonsymmetric eigenvalue problems,
reordering the generalized Schur factorization of a pair of matrices, as well as performing a number of
related computational tasks.
A generalized nonsymmetric eigenvalue problem is as follows: given a pair of nonsymmetric (or non-
Hermitian) n-by-n matrices A and B, find the generalized eigenvalues λ and the corresponding generalized
eigenvectors x and y that satisfy the equations
Ax = λBx (right generalized eigenvectors x)
and
y^H A = λ y^H B (left generalized eigenvectors y).
Table "Computational Routines for Solving Generalized Nonsymmetric Eigenvalue Problems" lists LAPACK
routines (FORTRAN 77 interface) used to solve the generalized nonsymmetric eigenvalue problems and the
generalized Sylvester equation. The corresponding routine names in the Fortran 95 interface are without the
first symbol.
Computational Routines for Solving Generalized Nonsymmetric Eigenvalue Problems
Routine name   Operation performed
gghrd          Reduces a pair of matrices to generalized upper Hessenberg form using
               orthogonal/unitary transformations.
ggbal          Balances a pair of general real or complex matrices.
ggbak          Forms the right or left eigenvectors of a generalized eigenvalue problem.
gghd3          Reduces a pair of matrices to generalized upper Hessenberg form.
hgeqz          Implements the QZ method for finding the generalized eigenvalues of the
               matrix pair (H,T).
tgevc          Computes some or all of the right and/or left generalized eigenvectors of a
               pair of upper triangular matrices.
tgexc          Reorders the generalized Schur decomposition of a pair of matrices (A,B) so
               that one diagonal block of (A,B) moves to another row index.
tgsen          Reorders the generalized Schur decomposition of a pair of matrices (A,B) so
               that a selected cluster of eigenvalues appears in the leading diagonal blocks
               of (A,B).
tgsyl          Solves the generalized Sylvester equation.
tgsna          Estimates reciprocal condition numbers for specified eigenvalues and/or
               eigenvectors of a pair of matrices in generalized real Schur canonical form.
?gghrd
Reduces a pair of matrices to generalized upper
Hessenberg form using orthogonal/unitary
transformations.
Syntax
call sgghrd(compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, info)
call dgghrd(compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, info)
call cgghrd(compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, info)
call zgghrd(compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, info)
call gghrd(a, b [,ilo] [,ihi] [,q] [,z] [,compq] [,compz] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reduces a pair of real/complex matrices (A,B) to generalized upper Hessenberg form using
orthogonal/unitary transformations, where A is a general matrix and B is upper triangular. The form of the
generalized eigenvalue problem is A*x = λ*B*x, and B is typically made upper triangular by computing its
QR factorization and moving the orthogonal matrix Q to the left side of the equation.
This routine simultaneously reduces A to a Hessenberg matrix H:
Q^H*A*Z = H
and transforms B to another upper triangular matrix T:
Q^H*B*Z = T
in order to reduce the problem to its standard form H*y = λ*T*y, where y = Z^H*x.
The orthogonal/unitary matrices Q and Z are determined as products of Givens rotations. They may either be
formed explicitly, or they may be postmultiplied into input matrices Q1 and Z1, so that
Q1*A*Z1^H = (Q1*Q)*H*(Z1*Z)^H
Q1*B*Z1^H = (Q1*Q)*T*(Z1*Z)^H
If Q1 is the orthogonal/unitary matrix from the QR factorization of B in the original equation A*x = λ*B*x,
then the routine ?gghrd reduces the original problem to generalized Hessenberg form.
Input Parameters
compq CHARACTER*1. Must be 'N', 'I', or 'V'.

If compq = 'N', matrix Q is not computed.
If compq = 'I', Q is initialized to the unit matrix, and the orthogonal/

unitary matrix Q is returned;
If compq = 'V', Q must contain an orthogonal/unitary matrix Q1 on entry,
and the product Q1*Q is returned.
compz CHARACTER*1. Must be 'N', 'I', or 'V'.

If compz = 'N', matrix Z is not computed.
If compz = 'I', Z is initialized to the unit matrix, and the orthogonal/

unitary matrix Z is returned;
If compz = 'V', Z must contain an orthogonal/unitary matrix Z1 on entry,
and the product Z1*Z is returned.
ilo, ihi INTEGER. ilo and ihi mark the rows and columns of A which are to be
reduced. It is assumed that A is already upper triangular in rows and
columns 1:ilo-1 and ihi+1:n. Values of ilo and ihi are normally set by a
previous call to ggbal; otherwise they should be set to 1 and n respectively.
Constraint:
If n > 0, then 1 ≤ ilo ≤ ihi ≤ n;
if n = 0, then ilo = 1 and ihi = 0.
a, b, q, z REAL for sgghrd
DOUBLE PRECISION for dgghrd
COMPLEX for cgghrd
DOUBLE COMPLEX for zgghrd.
Arrays:
a(lda,*) contains the n-by-n general matrix A.
b(ldb,*) contains the n-by-n upper triangular matrix B.
q(ldq,*)
If compq = 'V', then q must contain the orthogonal/unitary matrix Q1,

typically from the QR factorization of B.
z(ldz,*)
If compz = 'V', then z must contain the orthogonal/unitary matrix Z1.

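ldq INTEGER. The leading dimension of q.
If compq = 'N', then ldq ≥ 1.
If compq = 'I' or 'V', then ldq ≥ max(1, n).
ldz INTEGER. The leading dimension of z.
If compz = 'N', then ldz ≥ 1.
If compz = 'I' or 'V', then ldz ≥ max(1, n).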
Output Parameters
a On exit, the upper triangle and the first subdiagonal of A are overwritten
with the upper Hessenberg matrix H, and the rest is set to zero.
b On exit, overwritten by the upper triangular matrix T = Q^H*B*Z. The
elements below the diagonal are set to zero.
q If compq = 'I', then q contains the orthogonal/unitary matrix Q;
If compq = 'V', then q is overwritten by the product Q1*Q.
z If compz = 'I', then z contains the orthogonal/unitary matrix Z;
If compz = 'V', then z is overwritten by the product Z1*Z.
info INTEGER.

Specific details for the routine gghrd interface are the following:
compq If omitted, this argument is restored based on the presence of argument q as

follows: compq = 'I', if q is present, compq = 'N', if q is omitted.
If present, compq must be equal to 'I' or 'V' and the argument q must also be
present. Note that there will be an error condition if compq is present and q
omitted.

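compz If omitted, this argument is restored based on the presence of argument z as
follows: compz = 'I', if z is present, compz = 'N', if z is omitted.
If present, compz must be equal to 'I' or 'V' and the argument z must also be
present. Note that there will be an error condition if compz is present and z
omitted.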
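A minimal sketch of the reduction with dgghrd is given below, assuming linking against Intel MKL or another LAPACK implementation. The 4-by-4 test matrices are hypothetical; B is built upper triangular directly, so no preliminary QR factorization is needed, and ilo = 1, ihi = n are used because no balancing was done. The program checks the structure stated for the output: A upper Hessenberg and B upper triangular.

```fortran
program gghrd_demo
  implicit none
  integer, parameter :: n = 4
  double precision :: a(n,n), b(n,n), q(n,n), z(n,n), amax, bmax
  integer :: i, j, info
  external :: dgghrd

  ! Hypothetical data: a dense A and an upper triangular B.
  do j = 1, n
    do i = 1, n
      a(i,j) = 1.0d0 / dble(i + j)
      if (i <= j) then
        b(i,j) = dble(i + j)
      else
        b(i,j) = 0.0d0
      end if
    end do
  end do

  ! compq = compz = 'I': Q and Z are initialized to the unit matrix
  ! inside the routine and returned in q and z.
  call dgghrd('I', 'I', n, 1, n, a, n, b, n, q, n, z, n, info)
  if (info /= 0) stop 'dgghrd failed'

  ! Largest entry below the first subdiagonal of H and below the diagonal of T.
  amax = 0.0d0
  bmax = 0.0d0
  do j = 1, n
    do i = j + 1, n
      bmax = max(bmax, abs(b(i,j)))
      if (i > j + 1) amax = max(amax, abs(a(i,j)))
    end do
  end do

  print '(a,es10.3)', 'max |H(i,j)|, i > j+1: ', amax
  print '(a,es10.3)', 'max |T(i,j)|, i > j  : ', bmax
end program gghrd_demo
```

Both printed values are expected to be zero, since the routine sets those elements to zero on exit.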
?ggbal
Balances a pair of general real or complex matrices.
Syntax
call sggbal(job, n, a, lda, b, ldb, ilo, ihi, lscale, rscale, work, info)
call dggbal(job, n, a, lda, b, ldb, ilo, ihi, lscale, rscale, work, info)
call cggbal(job, n, a, lda, b, ldb, ilo, ihi, lscale, rscale, work, info)
call zggbal(job, n, a, lda, b, ldb, ilo, ihi, lscale, rscale, work, info)
call ggbal(a, b [,ilo] [,ihi] [,lscale] [,rscale] [,job] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine balances a pair of general real/complex matrices (A,B). This involves, first, permuting A and B by
similarity transformations to isolate eigenvalues in the first 1 to ilo-1 and last ihi+1 to n elements on the
diagonal; and second, applying a diagonal similarity transformation to rows and columns ilo to ihi to make the
rows and columns as close in norm as possible. Both steps are optional. Balancing may reduce the 1-norm of
the matrices, and improve the accuracy of the computed eigenvalues and/or eigenvectors in the generalized
eigenvalue problem A*x = λ*B*x.
Input Parameters
job CHARACTER*1. Specifies the operations to be performed on A and B. Must

be 'N' or 'P' or 'S' or 'B'.
If job = 'N', then no operations are done; simply set ilo =1, ihi=n,
lscale(i) =1.0 and rscale(i)=1.0 for
i = 1,..., n.
If job = 'P', then permute only.
If job = 'S', then scale only.
If job = 'B', then both permute and scale.
a, b REAL for sggbal

DOUBLE PRECISION for dggbal
COMPLEX for cggbal
DOUBLE COMPLEX for zggbal.
Arrays:
a(lda,*) contains the matrix A. The second dimension of a must be at least
max(1, n).
b(ldb,*) contains the matrix B. The second dimension of b must be at least
max(1, n).
If job = 'N', a and b are not referenced.
work REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Workspace array, size at least max(1, 6n) when job = 'S' or 'B', or at
least 1 when job = 'N' or 'P'.
Output Parameters
a, b Overwritten by the balanced matrices A and B, respectively.
ilo, ihi INTEGER. ilo and ihi are set to integers such that on exit A(i,j) = 0 and B(i,j) =
0 if i>j and j=1,...,ilo-1 or i=ihi+1,..., n.
If job = 'N' or 'S', then ilo = 1 and ihi = n.
lscale, rscale REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.
Arrays, size at least max(1, n).
lscale contains details of the permutations and scaling factors applied to the
left side of A and B.
If P(j) is the index of the row interchanged with row j, and D(j) is the scaling
factor applied to row j, then
lscale(j) = P(j), for j = 1,..., ilo-1
          = D(j), for j = ilo,..., ihi
          = P(j), for j = ihi+1,..., n.
rscale contains details of the permutations and scaling factors applied to the
right side of A and B.
If P(j) is the index of the column interchanged with column j, and D(j) is the
scaling factor applied to column j, then
rscale(j) = P(j), for j = 1,..., ilo-1
          = D(j), for j = ilo,..., ihi
          = P(j), for j = ihi+1,..., n.
info INTEGER.

Specific details for the routine ggbal interface are the following:
lscale Holds the vector of length (n).
rscale Holds the vector of length (n).
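A call to dggbal with job = 'B' can be sketched as follows, assuming linking against Intel MKL or another LAPACK implementation. The badly scaled 3-by-3 test pair is hypothetical, and the workspace is sized 6n as required for job = 'S' or 'B'.

```fortran
program ggbal_demo
  implicit none
  integer, parameter :: n = 3
  double precision :: a(n,n), b(n,n), lscale(n), rscale(n), work(6*n)
  integer :: ilo, ihi, info, i, j
  external :: dggbal

  ! Hypothetical badly scaled pair: entries of A span several orders of
  ! magnitude, B is the identity plus a small dense perturbation.
  do j = 1, n
    do i = 1, n
      a(i,j) = 10.0d0**(2*(i - j))
      b(i,j) = 1.0d-3
    end do
    b(j,j) = 1.0d0
  end do

  ! job = 'B': both permute and scale.
  call dggbal('B', n, a, n, b, n, ilo, ihi, lscale, rscale, work, info)
  if (info /= 0) stop 'dggbal failed'

  print '(a,i0,a,i0)', 'ilo = ', ilo, ', ihi = ', ihi
  print '(a,3es11.3)', 'lscale = ', lscale
  print '(a,3es11.3)', 'rscale = ', rscale
end program ggbal_demo
```

The values of ilo, ihi, lscale, and rscale should be kept: they are the inputs expected later by ?gghrd and ?ggbak.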
?ggbak
Forms the right or left eigenvectors of a generalized
eigenvalue problem.
Syntax
call sggbak(job, side, n, ilo, ihi, lscale, rscale, m, v, ldv, info)
call dggbak(job, side, n, ilo, ihi, lscale, rscale, m, v, ldv, info)
call cggbak(job, side, n, ilo, ihi, lscale, rscale, m, v, ldv, info)
call zggbak(job, side, n, ilo, ihi, lscale, rscale, m, v, ldv, info)
call ggbak(v [, ilo] [,ihi] [,lscale] [,rscale] [,job] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine forms the right or left eigenvectors of a real/complex generalized eigenvalue problem
A*x = λ*B*x
by backward transformation on the computed eigenvectors of the balanced pair of matrices output by ggbal.
Input Parameters
job CHARACTER*1. Specifies the type of backward transformation required. Must

be 'N', 'P', 'S', or 'B'.
If job = 'N', then no operations are done; return.
If job = 'P', then do backward transformation for permutation only.
If job = 'S', then do backward transformation for scaling only.
If job = 'B', then do backward transformation for both permutation and

scaling. This argument must be the same as the argument job supplied to
?ggbal.
side CHARACTER*1. Must be 'L' or 'R'.
If side = 'L', then v contains left eigenvectors.
If side = 'R', then v contains right eigenvectors.
n INTEGER. The number of rows of the matrix V (n ≥ 0).
ilo, ihi INTEGER. The integers ilo and ihi determined by ?ggbal. Constraint:
If n > 0, then 1 ≤ ilo ≤ ihi ≤ n;
if n = 0, then ilo = 1 and ihi = 0.
lscale, rscale REAL for single precision flavors
DOUBLE PRECISION for double precision flavors.

The array lscale contains details of the permutations and/or scaling factors
applied to the left side of A and B, as returned by ?ggbal.
The array rscale contains details of the permutations and/or scaling factors
applied to the right side of A and B, as returned by ?ggbal.
m INTEGER. The number of columns of the matrix V (m ≥ 0).
v REAL for sggbak

DOUBLE PRECISION for dggbak
COMPLEX for cggbak
DOUBLE COMPLEX for zggbak.
Array v(ldv,*). Contains the matrix of right or left eigenvectors to be
transformed, as returned by tgevc.
ldv INTEGER. The leading dimension of v; at least max(1, n).
Output Parameters
v Overwritten by the transformed eigenvectors.
info INTEGER.

Specific details for the routine ggbak interface are the following:
v Holds the matrix V of size (n,m).
lscale Holds the vector of length n.
rscale Holds the vector of length n.
side If omitted, this argument is restored based on the presence of arguments lscale
and rscale as follows:
side = 'L', if lscale is present and rscale omitted,
side = 'R', if lscale is omitted and rscale present.
Note that there will be an error condition if both lscale and rscale are present or
if they both are omitted.
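The pairing of ?ggbal and ?ggbak can be sketched as follows, assuming linking against Intel MKL or another LAPACK implementation. The test matrices are hypothetical, and the identity matrix stands in for the right eigenvectors that would normally come from ?tgevc; the point of the sketch is only that job, ilo, ihi, lscale, and rscale are passed unchanged from the balancing step to the back transformation.

```fortran
program ggbak_demo
  implicit none
  integer, parameter :: n = 3
  double precision :: a(n,n), b(n,n), v(n,n)
  double precision :: lscale(n), rscale(n), work(6*n)
  integer :: ilo, ihi, info, i, j
  external :: dggbal, dggbak

  ! Hypothetical badly scaled matrix pair.
  do j = 1, n
    do i = 1, n
      a(i,j) = 10.0d0**(2*(i - j))
      b(i,j) = 1.0d-3
    end do
    b(j,j) = 1.0d0
  end do

  ! Step 1: balance the pair (permute and scale).
  call dggbal('B', n, a, n, b, n, ilo, ihi, lscale, rscale, work, info)
  if (info /= 0) stop 'dggbal failed'

  ! Placeholder for the right eigenvectors of the balanced pair.
  v = 0.0d0
  do i = 1, n
    v(i,i) = 1.0d0
  end do

  ! Step 2: back-transform with the same job and the saved balancing data.
  call dggbak('B', 'R', n, ilo, ihi, lscale, rscale, n, v, n, info)
  if (info /= 0) stop 'dggbak failed'

  do i = 1, n
    print '(3es11.3)', v(i,:)
  end do
end program ggbak_demo
```

For left eigenvectors the same call would use side = 'L'.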
?gghd3
Reduces a pair of matrices to generalized upper
Hessenberg form.
Syntax
call sgghd3 (compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, work, lwork,
info )
call dgghd3 (compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, work, lwork,
info )
call cgghd3 (compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, work, lwork,
info )
call zgghd3 (compq, compz, n, ilo, ihi, a, lda, b, ldb, q, ldq, z, ldz, work, lwork,
info )
Include Files
mkl.fi
Description
?gghd3 reduces a pair of real or complex matrices (A, B) to generalized upper Hessenberg form using
orthogonal/unitary transformations, where A is a general matrix and B is upper triangular. The form of the
generalized eigenvalue problem is
A*x = λ*B*x,
and B is typically made upper triangular by computing its QR factorization and moving the orthogonal/unitary
matrix Q to the left side of the equation.
This subroutine simultaneously reduces A to a Hessenberg matrix H:
Q^T*A*Z = H for real flavors
or
Q^H*A*Z = H for complex flavors
and transforms B to another upper triangular matrix T:
Q^T*B*Z = T for real flavors
or
Q^H*B*Z = T for complex flavors
in order to reduce the problem to its standard form
H*y = λ*T*y
where y = Z^T*x for real flavors
or
y = Z^H*x for complex flavors.
The orthogonal/unitary matrices Q and Z are determined as products of Givens rotations. They may either be
formed explicitly, or they may be postmultiplied into input matrices Q1 and Z1, so that
for real flavors:
Q1 * A * Z1^T = (Q1*Q) * H * (Z1*Z)^T
Q1 * B * Z1^T = (Q1*Q) * T * (Z1*Z)^T
for complex flavors:
Q1 * A * Z1^H = (Q1*Q) * H * (Z1*Z)^H
Q1 * B * Z1^H = (Q1*Q) * T * (Z1*Z)^H
If Q1 is the orthogonal/unitary matrix from the QR factorization of B in the original equation A*x = λ*B*x,
then ?gghd3 reduces the original problem to generalized Hessenberg form.
This is a blocked variant of ?gghrd, using matrix-matrix multiplications for parts of the computation to
enhance performance.
Input Parameters
compq CHARACTER*1. = 'N': do not compute q;
= 'I': q is initialized to the unit matrix, and the orthogonal/unitary matrix Q

is returned;
= 'V': q must contain an orthogonal/unitary matrix Q1 on entry, and the
product Q1*Q is returned.
compz CHARACTER*1. = 'N': do not compute z;
= 'I': z is initialized to the unit matrix, and the orthogonal/unitary matrix Z

is returned;
= 'V': z must contain an orthogonal/unitary matrix Z1 on entry, and the
product Z1*Z is returned.
n INTEGER. The order of the matrices A and B.

n ≥ 0.
ilo, ihi INTEGER. ilo and ihi mark the rows and columns of a which are to be
reduced. It is assumed that a is already upper triangular in rows and
columns 1:ilo - 1 and ihi + 1:n. ilo and ihi are normally set by a
previous call to ?ggbal; otherwise they should be set to 1 and n,
respectively.
1 ≤ ilo ≤ ihi ≤ n, if n > 0; ilo=1 and ihi=0, if n=0.
a REAL for sgghd3

DOUBLE PRECISION for dgghd3
COMPLEX for cgghd3
DOUBLE COMPLEX for zgghd3
Array, size (lda, n).
On entry, the n-by-n general matrix to be reduced.
lda INTEGER. The leading dimension of the array a.
lda ≥ max(1,n).
b REAL for sgghd3

DOUBLE PRECISION for dgghd3
COMPLEX for cgghd3
DOUBLE COMPLEX for zgghd3
Array, size (ldb, n).
On entry, the n-by-n upper triangular matrix B.
ldb INTEGER. The leading dimension of the array b.
ldb ≥ max(1,n).
q REAL for sgghd3
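DOUBLE PRECISION for dgghd3
COMPLEX for cgghd3
DOUBLE COMPLEX for zgghd3
Array, size (ldq, n).
On entry, if compq = 'V', the orthogonal/unitary matrix Q1, typically from
the QR factorization of b.
Not referenced if compq='N'.
ldq INTEGER. The leading dimension of the array q. ldq ≥ n if compq='V' or 'I';
ldq ≥ 1 otherwise.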
z REAL for sgghd3

DOUBLE PRECISION for dgghd3
COMPLEX for cgghd3
DOUBLE COMPLEX for zgghd3
Array, size (ldz, n).
On entry, if compz = 'V', the orthogonal/unitary matrix Z1.
Not referenced if compz='N'.
ldz INTEGER. The leading dimension of the array z. ldz ≥ n if compz='V' or 'I';
ldz ≥ 1 otherwise.
work REAL for sgghd3

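DOUBLE PRECISION for dgghd3
COMPLEX for cgghd3
DOUBLE COMPLEX for zgghd3
Array, size (lwork)
lwork INTEGER. The length of the array work.

lwork ≥ 1.
For optimum performance lwork ≥ 6*n*NB, where NB is the optimal
blocksize.
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the work
array, returns this value as the first entry of the work array, and no error message related to lwork is issued by
xerbla.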
Output Parameters
a On exit, the upper triangle and the first subdiagonal of a are overwritten
with the upper Hessenberg matrix H, and the rest is set to zero.
b On exit, the upper triangular matrix T = Q^T*B*Z for real flavors or T = Q^H*B*Z
for complex flavors. The elements below the diagonal are set to zero.
q On exit, if compq='I', the orthogonal/unitary matrix Q, and if compq = 'V',

the product Q1*Q.
Not referenced if compq='N'.
z On exit, if compz='I', the orthogonal/unitary matrix Z, and if compz = 'V',

the product Z1*Z.
Not referenced if compz='N'.
work On exit, if info = 0, work(1) returns the optimal lwork.

Application Notes
This routine reduces A to Hessenberg form and maintains B in triangular form using a blocked variant of Moler and Stewart's
original algorithm, as described by Kagstrom, Kressner, Quintana-Orti, and Quintana-Orti (BIT 2008).
?hgeqz
Implements the QZ method for finding the generalized
eigenvalues of the matrix pair (H,T).
Syntax
call shgeqz(job, compq, compz, n, ilo, ihi, h, ldh, t, ldt, alphar, alphai, beta, q,
ldq, z, ldz, work, lwork, info)
call dhgeqz(job, compq, compz, n, ilo, ihi, h, ldh, t, ldt, alphar, alphai, beta, q,
ldq, z, ldz, work, lwork, info)
call chgeqz(job, compq, compz, n, ilo, ihi, h, ldh, t, ldt, alpha, beta, q, ldq, z,
ldz, work, lwork, rwork, info)
call zhgeqz(job, compq, compz, n, ilo, ihi, h, ldh, t, ldt, alpha, beta, q, ldq, z,
ldz, work, lwork, rwork, info)
call hgeqz(h, t [,ilo] [,ihi] [,alphar] [,alphai] [,beta] [,q] [,z] [,job] [,compq]
[,compz] [,info])
call hgeqz(h, t [,ilo] [,ihi] [,alpha] [,beta] [,q] [,z] [,job] [,compq] [, compz]
[,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the eigenvalues of a real/complex matrix pair (H,T), where H is an upper Hessenberg
matrix and T is upper triangular, using the double-shift version (for real flavors) or single-shift version (for
complex flavors) of the QZ method. Matrix pairs of this type are produced by the reduction to generalized
upper Hessenberg form of a real/complex matrix pair (A,B):
A = Q1*H*Z1^H, B = Q1*T*Z1^H,
as computed by ?gghrd.
For real flavors:

If job = 'S', then the Hessenberg-triangular pair (H,T) is reduced to generalized Schur form,
H = Q*S*Z^T, T = Q*P*Z^T,
where Q and Z are orthogonal matrices, P is an upper triangular matrix, and S is a quasi-triangular matrix
with 1-by-1 and 2-by-2 diagonal blocks. The 1-by-1 blocks correspond to real eigenvalues of the matrix pair
(H,T) and the 2-by-2 blocks correspond to complex conjugate pairs of eigenvalues.
Additionally, the 2-by-2 upper triangular diagonal blocks of P corresponding to 2-by-2 blocks of S are reduced
to positive diagonal form, that is, if S(j+1,j) is non-zero, then P(j+1,j) = P(j,j+1) = 0, P(j,j) > 0, and
P(j+1,j+1) > 0.
For complex flavors:
If job = 'S', then the Hessenberg-triangular pair (H,T) is reduced to generalized Schur form,
H = Q*S*Z^H, T = Q*P*Z^H,
where Q and Z are unitary matrices, and S and P are upper triangular.
For all function flavors:
Optionally, the orthogonal/unitary matrix Q from the generalized Schur factorization may be post-multiplied
into an input matrix Q1 (giving Q1*Q), and the orthogonal/unitary matrix Z may be post-multiplied into an
input matrix Z1 (giving Z1*Z).
If Q1 and Z1 are the orthogonal/unitary matrices from ?gghrd that reduced the matrix pair (A,B) to
generalized upper Hessenberg form, then the output matrices Q1*Q and Z1*Z are the orthogonal/unitary
factors from the generalized Schur factorization of (A,B):
A = (Q1*Q)*S*(Z1*Z)^H, B = (Q1*Q)*P*(Z1*Z)^H.
To avoid overflow, eigenvalues of the matrix pair (H,T) (equivalently, of (A,B)) are computed as a pair of
values (alpha,beta). For chgeqz/zhgeqz, alpha and beta are complex, and for shgeqz/dhgeqz, alpha is
complex and beta real. If beta is nonzero, λ = alpha/beta is an eigenvalue of the generalized
nonsymmetric eigenvalue problem (GNEP)
A*x = λ*B*x
and if alpha is nonzero, μ = beta/alpha is an eigenvalue of the alternate form of the GNEP
μ*A*y = B*y.
Real eigenvalues (for real flavors) or the values of alpha and beta for the i-th eigenvalue (for complex
flavors) can be read directly from the generalized Schur form:
alpha = S(i,i), beta = P(i,i).
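The (alpha, beta) representation described above can be handled along the following lines. This is only a sketch for the double-precision real flavor: the module name, the classification codes, and the subroutine classify_eig are hypothetical helpers, not part of MKL. The quotient is formed only when beta is nonzero, and even then it may overflow when beta is very small, so production code would typically keep the pair.

```fortran
module gnep_eig_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Hypothetical classification codes (illustrative only, not defined by MKL).
  integer, parameter :: EIG_FINITE = 0, EIG_INFINITE = 1, EIG_UNDEFINED = 2
contains
  ! Classify one eigenvalue from its (alphar, alphai, beta) triple and
  ! form lambda = alpha/beta only when beta is nonzero.
  subroutine classify_eig(alphar, alphai, beta, kind_code, lambda)
    real(dp), intent(in)     :: alphar, alphai, beta
    integer, intent(out)     :: kind_code
    complex(dp), intent(out) :: lambda
    lambda = (0.0_dp, 0.0_dp)
    if (beta /= 0.0_dp) then
      kind_code = EIG_FINITE
      lambda = cmplx(alphar, alphai, kind=dp) / beta
    else if (alphar /= 0.0_dp .or. alphai /= 0.0_dp) then
      kind_code = EIG_INFINITE     ! mu = beta/alpha = 0 in the alternate form
    else
      kind_code = EIG_UNDEFINED    ! alpha = beta = 0: neither quotient exists
    end if
  end subroutine classify_eig
end module gnep_eig_sketch
```

A caller would loop over j = 1, ..., n and pass alphar(j), alphai(j), and beta(j) as returned by dhgeqz.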
Input Parameters
job CHARACTER*1. Specifies the operations to be performed. Must be 'E' or

'S'.
If job = 'E', then compute eigenvalues only;
If job = 'S', then compute eigenvalues and the Schur form.
compq CHARACTER*1. Must be 'N', 'I', or 'V'.

If compq = 'N', left Schur vectors (q) are not computed;
If compq = 'I', q is initialized to the unit matrix and the matrix of left
Schur vectors of (H,T) is returned;
If compq = 'V', q must contain an orthogonal/unitary matrix Q1 on entry
and the product Q1*Q is returned.
compz CHARACTER*1. Must be 'N', 'I', or 'V'.

If compz = 'N', right Schur vectors (z) are not computed;
If compz = 'I', z is initialized to the unit matrix and the matrix of right
Schur vectors of (H,T) is returned;
If compz = 'V', z must contain an orthogonal/unitary matrix Z1 on entry
and the product Z1*Z is returned.
n INTEGER. The order of the matrices H, T, Q, and Z

(n ≥ 0).
ilo, ihi INTEGER. ilo and ihi mark the rows and columns of H which are in
Hessenberg form. It is assumed that H is already upper triangular in rows
and columns 1:ilo-1 and ihi+1:n.
Constraint: if n > 0, then 1 ≤ ilo ≤ ihi ≤ n; if n = 0, then ilo = 1 and ihi = 0.
h, t, q, z, work REAL for shgeqz

DOUBLE PRECISION for dhgeqz
COMPLEX for chgeqz
DOUBLE COMPLEX for zhgeqz.
Arrays:
On entry, h(ldh,*) contains the n-by-n upper Hessenberg matrix H.
On entry, t(ldt,*) contains the n-by-n upper triangular matrix T.
q(ldq,*) :
On entry, if compq = 'V', this array contains the orthogonal/unitary matrix
Q1 used in the reduction of (A,B) to generalized Hessenberg form.

z(ldz,*) :
On entry, if compz = 'V', this array contains the orthogonal/unitary matrix
Z1 used in the reduction of (A,B) to generalized Hessenberg form.


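ldh INTEGER. The leading dimension of h; at least max(1, n).
ldt INTEGER. The leading dimension of t; at least max(1, n).
ldq INTEGER. The leading dimension of q.
If compq = 'N', then ldq ≥ 1.
If compq = 'I' or 'V', then ldq ≥ max(1, n).
ldz INTEGER. The leading dimension of z.
If compz = 'N', then ldz ≥ 1.
If compz = 'I' or 'V', then ldz ≥ max(1, n).
lwork INTEGER. The dimension of the array work.
lwork ≥ max(1, n).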
rwork REAL for chgeqz

DOUBLE PRECISION for zhgeqz.
Workspace array, size at least max(1, n). Used in complex flavors only.
Output Parameters
h For real flavors:

If job = 'S', then on exit h contains the upper quasi-triangular matrix S
from the generalized Schur factorization.
If job = 'E', then on exit the diagonal blocks of h match those of S, but
the rest of h is unspecified.
For complex flavors:
If job = 'S', then, on exit, h contains the upper triangular matrix S from
the generalized Schur factorization.
If job = 'E', then on exit the diagonal of h matches that of S, but the rest
of h is unspecified.
t If job = 'S', then, on exit, t contains the upper triangular matrix P from
the generalized Schur factorization.
For real flavors:
2-by-2 diagonal blocks of P corresponding to 2-by-2 blocks of S are reduced
to positive diagonal form, that is, if h(j+1,j) is non-zero, then
t(j+1,j) = t(j,j+1) = 0 and t(j,j) and t(j+1,j+1) will be positive.
If job = 'E', then on exit the diagonal blocks of t match those of P, but
the rest of t is unspecified.
For complex flavors:
If job = 'E', then on exit the diagonal of t matches that of P, but the rest
of t is unspecified.
alphar, alphai REAL for shgeqz;

DOUBLE PRECISION for dhgeqz.
Arrays, size at least max(1, n). The real and imaginary parts, respectively,
of each scalar alpha defining an eigenvalue of GNEP.
If alphai(j) is zero, then the j-th eigenvalue is real; if positive, then the j-th
and (j+1)-th eigenvalues are a complex conjugate pair, with
alphai(j+1) = -alphai(j).
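The sign convention on alphai can be used to walk the eigenvalue list. The following is a minimal sketch, assuming alphai comes from a successful real-flavor call; the module and subroutine names are hypothetical. A positive alphai(j) is treated as the first member of a conjugate pair and its partner is skipped; any other entry is counted as a single eigenvalue.

```fortran
module alphai_pairs_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Count real eigenvalues and complex conjugate pairs from alphai, using
  ! the convention alphai(j) > 0 followed by alphai(j+1) = -alphai(j).
  subroutine count_eigs(alphai, nreal, npairs)
    real(dp), intent(in) :: alphai(:)
    integer, intent(out) :: nreal, npairs
    integer :: j, n
    n = size(alphai)
    nreal = 0
    npairs = 0
    j = 1
    do while (j <= n)
      if (alphai(j) > 0.0_dp .and. j < n) then
        npairs = npairs + 1
        j = j + 2              ! skip the conjugate partner
      else
        nreal = nreal + 1      ! alphai(j) is zero: real eigenvalue
        j = j + 1
      end if
    end do
  end subroutine count_eigs
end module alphai_pairs_sketch
```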
alpha COMPLEX for chgeqz;
DOUBLE COMPLEX for zhgeqz.
Array, size at least max(1, n).
The complex scalars alpha that define the eigenvalues of GNEP. alpha(i)
= S(i,i) in the generalized Schur factorization.
beta REAL for shgeqz
DOUBLE PRECISION for dhgeqz
COMPLEX for chgeqz
DOUBLE COMPLEX for zhgeqz.
Array, size at least max(1, n).
For real flavors:
The scalars beta that define the eigenvalues of GNEP.
Together, the quantities alpha = (alphar(j), alphai(j)) and beta =
beta(j) represent the j-th eigenvalue of the matrix pair (A,B), in one of
the forms lambda = alpha/beta or mu = beta/alpha. Since either
lambda or mu may overflow, they should not, in general, be computed.
For complex flavors:
The real non-negative scalars beta that define the eigenvalues of GNEP.
beta(i) = P(i,i) in the generalized Schur factorization. Together, the
quantities alpha = alpha(j) and beta = beta(j) represent the j-th
eigenvalue of the matrix pair (A,B), in one of the forms lambda = alpha/beta
or mu = beta/alpha. Since either lambda or mu may overflow, they
should not, in general, be computed.
q On exit, if compq = 'I', q is overwritten by the orthogonal/unitary matrix

of left Schur vectors of the pair (H,T), and if compq = 'V', q is overwritten
by the orthogonal/unitary matrix of left Schur vectors of (A,B).
z On exit, if compz = 'I', z is overwritten by the orthogonal/unitary matrix

of right Schur vectors of the pair (H,T), and if compz = 'V', z is
overwritten by the orthogonal/unitary matrix of right Schur vectors of
(A,B).
work(1) If info ≥ 0, on exit, work(1) contains the minimum value of lwork required
for optimum performance.
info INTEGER.
If info = 1,..., n, the QZ iteration did not converge.
(H,T) is not in Schur form, but alphar(i), alphai(i) (for real flavors), alpha(i)
(for complex flavors), and beta(i), i=info+1,..., n should be correct.
If info = n+1,...,2n, the shift calculation failed.
(H,T) is not in Schur form, but alphar(i), alphai(i) (for real flavors), alpha(i)
(for complex flavors), and beta(i), i =info-n+1,..., n should be correct.
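The two failure ranges of info translate into a starting index for the eigenvalues that should still be correct. A minimal sketch follows; the function first_valid_index is a hypothetical helper, and treating info = 0 as "all n eigenvalues valid" is an assumption based on the usual LAPACK convention for a successful exit.

```fortran
module hgeqz_info_sketch
  implicit none
contains
  ! Return the first index i for which alphar(i), alphai(i) (or alpha(i))
  ! and beta(i) should be correct. Returns 0 when info lies outside the
  ! ranges described above (for example, a negative info).
  pure integer function first_valid_index(info, n) result(i)
    integer, intent(in) :: info, n
    if (info == 0) then
      i = 1                      ! assumed: successful exit
    else if (info >= 1 .and. info <= n) then
      i = info + 1               ! QZ iteration did not converge
    else if (info > n .and. info <= 2*n) then
      i = info - n + 1           ! shift calculation failed
    else
      i = 0
    end if
  end function first_valid_index
end module hgeqz_info_sketch
```

When the returned index exceeds n, no eigenvalues are available.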

Specific details for the routine hgeqz interface are the following:
alphar Holds the vector of length n. Used in real flavors only.
alphai Holds the vector of length n. Used in real flavors only.
alpha Holds the vector of length n. Used in complex flavors only.
beta Holds the vector of length n.
job Must be 'E' or 'S'. The default value is 'E'.
compq If omitted, this argument is restored based on the presence of argument q as

follows:
compq = 'I', if q is present,
compq = 'N', if q is omitted.
If present, compq must be equal to 'I' or 'V' and the argument q must also be
present.
Note that there will be an error condition if compq is present and q omitted.

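compz If omitted, this argument is restored based on the presence of argument z as
follows:
compz = 'I', if z is present,
compz = 'N', if z is omitted.
If present, compz must be equal to 'I' or 'V' and the argument z must also be
present.
Note that there will be an error condition if compz is present and z omitted.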
Application Notes
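If it is not clear how much workspace to supply, set lwork = -1 to perform a workspace query: the routine
returns immediately and provides the recommended workspace size in work(1).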
?tgevc
Computes some or all of the right and/or left
generalized eigenvectors of a pair of upper triangular
matrices.
Syntax
call stgevc(side, howmny, select, n, s, lds, p, ldp, vl, ldvl, vr, ldvr, mm, m, work,
info)
call dtgevc(side, howmny, select, n, s, lds, p, ldp, vl, ldvl, vr, ldvr, mm, m, work,
info)
call ctgevc(side, howmny, select, n, s, lds, p, ldp, vl, ldvl, vr, ldvr, mm, m, work,
rwork, info)
call ztgevc(side, howmny, select, n, s, lds, p, ldp, vl, ldvl, vr, ldvr, mm, m, work,
rwork, info)
call tgevc(s, p [,howmny] [,select] [,vl] [,vr] [,m] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes some or all of the right and/or left eigenvectors of a pair of real/complex matrices
(S,P), where S is quasi-triangular (for real flavors) or upper triangular (for complex flavors) and P is upper
triangular.
Matrix pairs of this type are produced by the generalized Schur factorization of a real/complex matrix pair
(A,B):
A = Q*S*Z^H, B = Q*P*Z^H
as computed by ?gghrd plus ?hgeqz.
The right eigenvector x and the left eigenvector y of (S,P) corresponding to an eigenvalue w are defined by:
S*x = w*P*x, y^H*S = w*y^H*P
The eigenvalues are not input to this routine, but are computed directly from the diagonal blocks or diagonal
elements of S and P.
This routine returns the matrices X and/or Y of right and left eigenvectors of (S,P), or the products Z*X
and/or Q*Y, where Z and Q are input matrices.
If Q and Z are the orthogonal/unitary factors from the generalized Schur factorization of a matrix pair (A,B),
then Z*X and Q*Y are the matrices of right and left eigenvectors of (A,B).
Input Parameters
side CHARACTER*1. Must be 'R', 'L', or 'B'.

If side = 'R', compute right eigenvectors only.
If side = 'L', compute left eigenvectors only.
If side = 'B', compute both right and left eigenvectors.
howmny CHARACTER*1. Must be 'A', 'B', or 'S'.

If howmny = 'A', compute all right and/or left eigenvectors.
If howmny = 'B', compute all right and/or left eigenvectors,

backtransformed by the matrices in vr and/or vl.
If howmny = 'S', compute selected right and/or left eigenvectors, specified
by the logical array select.
select LOGICAL.
If howmny = 'S', select specifies the eigenvectors to be computed.
If howmny = 'A' or 'B', select is not referenced.
For real flavors:
If w(j) is a real eigenvalue, the corresponding real eigenvector is computed
if select(j) is .TRUE..
If w(j) and w(j + 1) are the real and imaginary parts of a complex
eigenvalue, the corresponding complex eigenvector is computed if either
select(j) or select(j + 1) is .TRUE., and on exit select(j) is set
to .TRUE. and select(j + 1) is set to .FALSE..
For complex flavors:
The eigenvector corresponding to the j-th eigenvalue is computed if
select(j) is .TRUE..
n INTEGER. The order of the matrices S and P (n ≥ 0).
s, p, vl, vr, work REAL for stgevc

DOUBLE PRECISION for dtgevc
COMPLEX for ctgevc
DOUBLE COMPLEX for ztgevc.
Arrays:
s(lds,*) contains the matrix S from a generalized Schur factorization as
computed by ?hgeqz. This matrix is upper quasi-triangular for real flavors,
and upper triangular for complex flavors.
The second dimension of s must be at least max(1, n).
p(ldp,*) contains the upper triangular matrix P from a generalized Schur
factorization as computed by ?hgeqz.
For real flavors, 2-by-2 diagonal blocks of P corresponding to 2-by-2 blocks

of S must be in positive diagonal form.
For complex flavors, P must have real diagonal elements. The second
dimension of p must be at least max(1, n).
If side = 'L' or 'B' and howmny = 'B', vl(ldvl,*) must contain an n-by-
n matrix Q (usually the orthogonal/unitary matrix Q of left Schur vectors
returned by ?hgeqz). The second dimension of vl must be at least max(1,
mm).
If side = 'R', vl is not referenced.
If side = 'R' or 'B' and howmny = 'B', vr(ldvr,*) must contain an n-by-
n matrix Z (usually the orthogonal/unitary matrix Z of right Schur vectors
returned by ?hgeqz). The second dimension of vr must be at least max(1,
mm).
If side = 'L', vr is not referenced.

work(*) is a workspace array, size at least max(1, 6*n) for real flavors and at least
max(1, 2*n) for complex flavors.
lds INTEGER. The leading dimension of s; at least max(1, n).
ldp INTEGER. The leading dimension of p; at least max(1, n).
ldvl INTEGER. The leading dimension of vl;

If side = 'L' or 'B', then ldvl ≥ n.
If side = 'R', then ldvl ≥ 1.
ldvr INTEGER. The leading dimension of vr;

If side = 'R' or 'B', then ldvr ≥ n.
If side = 'L', then ldvr ≥ 1.
mm INTEGER. The number of columns in the arrays vl and/or vr (mm ≥ m).
rwork REAL for ctgevc DOUBLE PRECISION for ztgevc. Workspace array, size at
least max (1, 2*n). Used in complex flavors only.
Output Parameters
vl On exit, if side = 'L' or 'B', vl contains:
if howmny = 'A', the matrix Y of left eigenvectors of (S,P);
if howmny = 'B', the matrix Q*Y;
if howmny = 'S', the left eigenvectors of (S,P) specified by select, stored

consecutively in the columns of vl, in the same order as their eigenvalues.
For real flavors:
A complex eigenvector corresponding to a complex eigenvalue is stored in
two consecutive columns, the first holding the real part, and the second the
imaginary part.
vr On exit, if side = 'R' or 'B', vr contains:
if howmny = 'A', the matrix X of right eigenvectors of (S,P);
if howmny = 'B', the matrix Z*X;
if howmny = 'S', the right eigenvectors of (S,P) specified by select, stored

consecutively in the columns of vr, in the same order as their eigenvalues.
For real flavors:
A complex eigenvector corresponding to a complex eigenvalue is stored in
two consecutive columns, the first holding the real part, and the second the
imaginary part.
m INTEGER. The number of columns in the arrays vl and/or vr actually used to

store the eigenvectors.
For real flavors:

Each selected real eigenvector occupies one column and each selected
complex eigenvector occupies two columns.
For complex flavors:
Each selected eigenvector occupies one column.
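For real flavors with howmny = 'S', the number of columns needed (and therefore a safe value for mm) can be counted before the call. The sketch below assumes s holds the quasi-triangular matrix S, so that a nonzero s(j+1,j) marks a 2-by-2 block; the function columns_needed is a hypothetical helper, not an MKL routine.

```fortran
module tgevc_count_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Count the columns that the selected eigenvectors occupy: one per
  ! selected real eigenvalue, two per selected complex conjugate pair.
  pure integer function columns_needed(s, select) result(m)
    real(dp), intent(in) :: s(:,:)
    logical, intent(in)  :: select(:)
    integer :: j, n
    n = size(select)
    m = 0
    j = 1
    do while (j <= n)
      if (j < n) then
        if (s(j+1, j) /= 0.0_dp) then
          ! 2-by-2 block: the pair is selected if either flag is set
          if (select(j) .or. select(j+1)) m = m + 2
          j = j + 2
          cycle
        end if
      end if
      if (select(j)) m = m + 1
      j = j + 1
    end do
  end function columns_needed
end module tgevc_count_sketch
```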
info INTEGER.
For real flavors:

if info = i>0, the 2-by-2 block (i:i+1) does not have a complex
eigenvalue.

Specific details for the routine tgevc interface are the following:
s Holds the matrix S of size (n,n).
p Holds the matrix P of size (n,n).
side Restored based on the presence of arguments vl and vr as follows:

side = 'B', if both vl and vr are present,
side = 'L', if vl is present and vr omitted,
side = 'R', if vl is omitted and vr present,
howmny If omitted, this argument is restored based on the presence of argument select
as follows:
howmny = 'S', if select is present,
howmny = 'A', if select is omitted.
If present, howmny must be equal to 'A' or 'B' and the argument select must
be omitted.
Note that there will be an error condition if both howmny and select are present.
?tgexc
Reorders the generalized Schur decomposition of a
pair of matrices (A,B) so that one diagonal block of
(A,B) moves to another row index.
Syntax
call stgexc(wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, ifst, ilst, work, lwork,
info)
call dtgexc(wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, ifst, ilst, work, lwork,
info)
call ctgexc(wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, ifst, ilst, info)
call ztgexc(wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, ifst, ilst, info)
call tgexc(a, b [,ifst] [,ilst] [,z] [,q] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reorders the generalized real-Schur/Schur decomposition of a real/complex matrix pair (A,B)
using an orthogonal/unitary equivalence transformation
(A,B) = Q*(A,B)*Z^H,
so that the diagonal block of (A, B) with row index ifst is moved to row ilst. Matrix pair (A, B) must be in a
generalized real-Schur/Schur canonical form (as returned by gges), that is, A is block upper triangular with
1-by-1 and 2-by-2 diagonal blocks and B is upper triangular. Optionally, the matrices Q and Z of generalized
Schur vectors are updated.
Q_in*A_in*Z_in^T = Q_out*A_out*Z_out^T
Q_in*B_in*Z_in^T = Q_out*B_out*Z_out^T.
Input Parameters
wantq, wantz LOGICAL.

If wantq = .TRUE., update the left transformation matrix Q;
If wantq = .FALSE., do not update Q;
If wantz = .TRUE., update the right transformation matrix Z;
If wantz = .FALSE., do not update Z.
a, b, q, z REAL for stgexc

DOUBLE PRECISION for dtgexc
COMPLEX for ctgexc
DOUBLE COMPLEX for ztgexc.
Arrays:
a(lda,*) contains the matrix A. The second dimension of a must be at least
max(1, n).
b(ldb,*) contains the matrix B. The second dimension of b must be at least
max(1, n).
q(ldq,*)
If wantq = .FALSE., then q is not referenced.
If wantq = .TRUE., then q must contain the orthogonal/unitary matrix Q.

z(ldz,*)
If wantz = .FALSE., then z is not referenced.
If wantz = .TRUE., then z must contain the orthogonal/unitary matrix Z.

ldq INTEGER. The leading dimension of q.
If wantq = .FALSE., then ldq ≥ 1.
If wantq = .TRUE., then ldq ≥ max(1, n).
ldz INTEGER. The leading dimension of z.
If wantz = .FALSE., then ldz ≥ 1.
If wantz = .TRUE., then ldz ≥ max(1, n).
ifst, ilst INTEGER. Specify the reordering of the diagonal blocks of (A, B). The block
with row index ifst is moved to row ilst, by a sequence of swapping between
adjacent blocks. Constraint: 1 ≤ ifst, ilst ≤ n.
work REAL for stgexc;

DOUBLE PRECISION for dtgexc.
Workspace array, size (lwork). Used in real flavors only.
lwork INTEGER. The dimension of work; must be at least 4*n + 16.

Output Parameters
a, b, q, z Overwritten by the updated matrices A,B, Q, and Z respectively.
ifst, ilst Overwritten for real flavors only.

If ifst pointed to the second row of a 2 by 2 block on entry, it is changed to
point to the first row; ilst always points to the first row of the block in its
final position (which may differ from its input value by 1).
info INTEGER.
If info = 1, the transformed matrix pair (A, B) would be too far from
generalized Schur form; the problem is ill-conditioned. (A, B) may have
been partially reordered, and ilst points to the first row of the current
position of the block being moved.

Specific details for the routine tgexc interface are the following:
wantq Restored based on the presence of the argument q as follows:

wantq = .TRUE., if q is present,
wantq = .FALSE., if q is omitted.
wantz Restored based on the presence of the argument z as follows:

wantz = .TRUE., if z is present,
wantz = .FALSE., if z is omitted.
Application Notes
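For real flavors, if it is not clear how much workspace to supply, set lwork = -1 to perform a workspace
query: the routine returns immediately and provides the recommended workspace size in work(1).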
?tgsen
Reorders the generalized Schur decomposition of a
pair of matrices (A,B) so that a selected cluster of
eigenvalues appears in the leading diagonal blocks of
(A,B).
Syntax
call stgsen(ijob, wantq, wantz, select, n, a, lda, b, ldb, alphar, alphai, beta, q,
ldq, z, ldz, m, pl, pr, dif, work, lwork, iwork, liwork, info)
call dtgsen(ijob, wantq, wantz, select, n, a, lda, b, ldb, alphar, alphai, beta, q,
ldq, z, ldz, m, pl, pr, dif, work, lwork, iwork, liwork, info)
call ctgsen(ijob, wantq, wantz, select, n, a, lda, b, ldb, alpha, beta, q, ldq, z,
ldz, m, pl, pr, dif, work, lwork, iwork, liwork, info)
call ztgsen(ijob, wantq, wantz, select, n, a, lda, b, ldb, alpha, beta, q, ldq, z,
ldz, m, pl, pr, dif, work, lwork, iwork, liwork, info)
call tgsen(a, b, select [,alphar] [,alphai] [,beta] [,ijob] [,q] [,z] [,pl] [,pr] [,dif]
[,m] [,info])
call tgsen(a, b, select [,alpha] [,beta] [,ijob] [,q] [,z] [,pl] [,pr] [, dif] [,m]
[,info])
Include Files
mkl.fi, lapack.f90
Description
The routine reorders the generalized real-Schur/Schur decomposition of a real/complex matrix pair (A, B) (in
terms of an orthogonal/unitary equivalence transformation Q^T*(A,B)*Z for real flavors or Q^H*(A,B)*Z for
complex flavors), so that a selected cluster of eigenvalues appears in the leading diagonal blocks of the pair
(A, B). The leading columns of Q and Z form orthonormal/unitary bases of the corresponding left and right
eigenspaces (deflating subspaces).
(A, B) must be in generalized real-Schur/Schur canonical form (as returned by ?gges), that is, B is upper
triangular and A is upper quasi-triangular with 1-by-1 and 2-by-2 diagonal blocks (real flavors) or upper
triangular (complex flavors).
?tgsen also computes the generalized eigenvalues
w(j) = (alphar(j) + alphai(j)*i)/beta(j) (for real flavors)
w(j) = alpha(j)/beta(j) (for complex flavors)
of the reordered matrix pair (A, B).
Optionally, the routine computes the estimates of reciprocal condition numbers for eigenvalues and
eigenspaces. These are Difu[(A11, B11), (A22, B22)] and Difl[(A11, B11), (A22, B22)], that is, the
separation(s) between the matrix pairs (A11, B11) and (A22, B22) that correspond to the selected cluster and
the eigenvalues outside the cluster, respectively, and norms of "projections" onto left and right eigenspaces
with respect to the selected cluster in the (1,1)-block.
Input Parameters
ijob INTEGER. Specifies whether condition numbers are required for the cluster
of eigenvalues (pl and pr) or the deflating subspaces Difu and Difl.
If ijob = 0, only reorder with respect to select;
If ijob = 1, compute reciprocals of norms of "projections" onto left and right
eigenspaces with respect to the selected cluster (pl and pr);
If ijob = 2, compute upper bounds on Difu and Difl, using F-norm-based
estimate (dif(1:2));
If ijob = 3, compute estimates of Difu and Difl, using 1-norm-based
estimate (dif(1:2)). This option is about 5 times as expensive as ijob = 2;
If ijob = 4, compute pl, pr and dif (that is, options 0, 1 and 2 above). This is
an economical way to get all of them;
If ijob = 5, compute pl, pr and dif (that is, options 0, 1 and 3 above).
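The ijob codes above combine two independent requests (pl/pr and dif) with a choice of norm. As a sketch only, a caller could decode them as follows; the subroutine decode_ijob and its flag names are hypothetical and are not part of the MKL interface.

```fortran
module tgsen_ijob_sketch
  implicit none
contains
  ! Decode ijob into the quantities it requests.
  !   want_p   : pl and pr are computed          (ijob = 1, 4, 5)
  !   want_dif : dif(1:2) is computed            (ijob = 2, 3, 4, 5)
  !   one_norm : dif uses the 1-norm estimate    (ijob = 3, 5)
  pure subroutine decode_ijob(ijob, want_p, want_dif, one_norm)
    integer, intent(in)  :: ijob
    logical, intent(out) :: want_p, want_dif, one_norm
    want_p   = (ijob == 1 .or. ijob == 4 .or. ijob == 5)
    want_dif = (ijob >= 2 .and. ijob <= 5)
    one_norm = (ijob == 3 .or. ijob == 5)
  end subroutine decode_ijob
end module tgsen_ijob_sketch
```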
wantq, wantz LOGICAL.

If wantq = .TRUE., update the left transformation matrix Q;
If wantq = .FALSE., do not update Q;
If wantz = .TRUE., update the right transformation matrix Z;
If wantz = .FALSE., do not update Z.
select LOGICAL.
Array, size at least max (1, n). Specifies the eigenvalues in the selected
cluster.
To select an eigenvalue w(j), select(j) must be .TRUE.. For real flavors: to
select a complex conjugate pair of eigenvalues w(j) and w(j + 1) (corresponding
to a 2-by-2 diagonal block), select(j) and/or select(j + 1) must be set
to .TRUE.; the complex conjugate pair w(j) and w(j + 1) must be either both
included in the cluster or both excluded.
a, b, q, z, work REAL for stgsen

DOUBLE PRECISION for dtgsen
COMPLEX for ctgsen
DOUBLE COMPLEX for ztgsen.
Arrays:
a(lda,*) contains the matrix A.
For real flavors: A is upper quasi-triangular, with (A, B) in generalized real
Schur canonical form.
For complex flavors: A is upper triangular, in generalized Schur canonical
form.
b(ldb,*) contains the matrix B.
For real flavors: B is upper triangular, with (A, B) in generalized real Schur
canonical form.
For complex flavors: B is upper triangular, in generalized Schur canonical
form. The second dimension of b must be at least max(1, n).
q(ldq,*)
If wantq = .TRUE., then q is an n-by-n matrix;
If wantq = .FALSE., then q is not referenced.

z(ldz,*)
If wantz = .TRUE., then z is an n-by-n matrix;
If wantz = .FALSE., then z is not referenced.

ldq INTEGER. The leading dimension of q; ldq ≥ 1.
If wantq = .TRUE., then ldq ≥ max(1, n).
ldz INTEGER. The leading dimension of z; ldz ≥ 1.
If wantz = .TRUE., then ldz ≥ max(1, n).

lwork INTEGER. The dimension of the array work.
For real flavors:
If ijob = 1, 2, or 4, lwork ≥ max(4*n+16, 2*m*(n-m)).
If ijob = 3 or 5, lwork ≥ max(4*n+16, 4*m*(n-m)).
For complex flavors:
If ijob = 1, 2, or 4, lwork ≥ max(1, 2*m*(n-m)).
If ijob = 3 or 5, lwork ≥ max(1, 4*m*(n-m)).

liwork INTEGER. The dimension of the array iwork.

For real flavors:
If ijob = 1, 2, or 4, liwork ≥ n+6.
If ijob = 3 or 5, liwork ≥ max(n+6, 2*m*(n-m)).
For complex flavors:
If ijob = 1, 2, or 4, liwork ≥ n+2.
If ijob = 3 or 5, liwork ≥ max(n+2, 2*m*(n-m)).
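The workspace bounds for lwork and liwork can be evaluated ahead of the call when the cluster size m is known (for example, by counting the selected eigenvalues). The following sketch encodes only the bounds stated for ijob = 1 to 5; the subroutine tgsen_min_sizes is a hypothetical helper, and it returns -1 for any ijob not covered by those bounds rather than guessing.

```fortran
module tgsen_workspace_sketch
  implicit none
contains
  ! is_real = .true. for stgsen/dtgsen, .false. for ctgsen/ztgsen.
  pure subroutine tgsen_min_sizes(is_real, ijob, n, m, lwork, liwork)
    logical, intent(in)  :: is_real
    integer, intent(in)  :: ijob, n, m
    integer, intent(out) :: lwork, liwork
    integer :: base_w, base_iw
    if (is_real) then
      base_w  = 4*n + 16
      base_iw = n + 6
    else
      base_w  = 1
      base_iw = n + 2
    end if
    select case (ijob)
    case (1, 2, 4)
      lwork  = max(base_w, 2*m*(n - m))
      liwork = base_iw
    case (3, 5)
      lwork  = max(base_w, 4*m*(n - m))
      liwork = max(base_iw, 2*m*(n - m))
    case default
      lwork  = -1     ! not covered by the bounds listed above
      liwork = -1
    end select
  end subroutine tgsen_min_sizes
end module tgsen_workspace_sketch
```

For n = 10 and m = 4 with a real flavor and ijob = 3, this gives lwork = 96 and liwork = 48.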

Output Parameters
a, b Overwritten by the reordered matrices A and B, respectively.
alphar, alphai REAL for stgsen;

DOUBLE PRECISION for dtgsen.
Arrays, size at least max(1, n). Contain values that form generalized
eigenvalues in real flavors.
See beta.
alpha COMPLEX for ctgsen;
DOUBLE COMPLEX for ztgsen.
Array, size at least max(1, n). Contains values that form generalized
eigenvalues in complex flavors.
See beta.
beta REAL for stgsen
DOUBLE PRECISION for dtgsen
COMPLEX for ctgsen
DOUBLE COMPLEX for ztgsen.
Array, size at least max(1, n).
For real flavors:
On exit, (alphar(j) + alphai(j)*i)/beta(j), j=1,..., n, will be the
generalized eigenvalues.
alphar(j) + alphai(j)*i and beta(j), j=1,..., n are the diagonals of
the complex Schur form (S,T) that would result if the 2-by-2 diagonal
blocks of the real generalized Schur form of (A,B) were further reduced to
triangular form using complex unitary transformations.
If alphai(j) is zero, then the j-th eigenvalue is real; if positive, then the
j-th and (j + 1)-st eigenvalues are a complex conjugate pair, with
alphai(j + 1) negative.
For complex flavors:
The diagonal elements of A and B, respectively, when the pair (A,B) has
been reduced to generalized Schur form. alpha(i)/beta(i), i=1,..., n are
the generalized eigenvalues.
q If wantq = .TRUE., then, on exit, Q has been postmultiplied by the left
orthogonal transformation matrix which reorders (A, B). The leading m
columns of Q form an orthonormal basis for the specified left eigenspace
(deflating subspace).
z If wantz = .TRUE., then, on exit, Z has been postmultiplied by the right
orthogonal transformation matrix which reorders (A, B). The leading m
columns of Z form an orthonormal basis for the specified right eigenspace
(deflating subspace).
m INTEGER.
The dimension of the specified pair of left and right eigen-spaces (deflating
subspaces); 0 ≤ m ≤ n.
pl, pr REAL for single precision flavors;
DOUBLE PRECISION for double precision flavors.
If ijob = 1, 4, or 5, pl and pr are lower bounds on the reciprocal of the
norm of "projections" onto left and right eigenspaces with respect to the
selected cluster.
0 < pl, pr ≤ 1. If m = 0 or m = n, pl = pr = 1.
If ijob = 0, 2 or 3, pl and pr are not referenced.
dif REAL for single precision flavors; DOUBLE PRECISION for double precision
flavors.
Array, size (2).
If ijob ≥ 2, dif(1:2) store the estimates of Difu and Difl.
If ijob = 2 or 4, dif(1:2) are F-norm-based upper bounds on Difu and

Difl.
If ijob = 3 or 5, dif(1:2) are 1-norm-based estimates of Difu and Difl.
If m = 0 or m = n, dif(1:2) = F-norm([A, B]).
If ijob = 0 or 1, dif is not referenced.
work(1) If ijob is not 0 and info = 0, on exit, work(1) contains the minimum
value of lwork required for optimum performance. Use this lwork for
subsequent runs.
iwork(1) If ijob is not 0 and info = 0, on exit, iwork(1) contains the minimum
value of liwork required for optimum performance. Use this liwork for
subsequent runs.
info INTEGER.
If info = 1, Reordering of (A, B) failed because the transformed matrix

pair (A, B) would be too far from generalized Schur form; the problem is
very ill-conditioned. (A, B) may have been partially reordered.
If ijob > 0, 0 is returned in dif, pl and pr.

Specific details for the routine tgsen interface are the following:
dif Holds the vector of length (2).
ijob Must be 0, 1, 2, 3, 4, or 5. The default value is 0.
wantq Restored based on the presence of the argument q as follows:

wantq = .TRUE., if q is present,
wantq = .FALSE., if q is omitted.
wantz Restored based on the presence of the argument z as follows:

wantz = .TRUE., if z is present,
wantz = .FALSE., if z is omitted.
Application Notes
?tgsyl
Solves the generalized Sylvester equation.
Syntax
call stgsyl(trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, scale,
dif, work, lwork, iwork, info)
call dtgsyl(trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, scale,
dif, work, lwork, iwork, info)
call ctgsyl(trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, scale,
dif, work, lwork, iwork, info)
call ztgsyl(trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, scale,
dif, work, lwork, iwork, info)
call tgsyl(a, b, c, d, e, f [,ijob] [,trans] [,scale] [,dif] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine solves the generalized Sylvester equation:

A*R-L*B = scale*C
D*R-L*E = scale*F
where R and L are unknown m-by-n matrices, (A, D), (B, E) and (C, F) are given matrix pairs of size m-by-
m, n-by-n and m-by-n, respectively, with real/complex entries. (A, D) and (B, E) must be in generalized real-
Schur/Schur canonical form, that is, A, B are upper quasi-triangular/triangular and D, E are upper triangular.
The solution (R, L) overwrites (C, F). The factor scale, 0 ≤ scale ≤ 1, is an output scaling factor chosen to avoid
overflow.
In matrix notation the above equation is equivalent to the following: solve Z*x = scale*b, where Z is
defined as
Z = [ kron(In, A)   -kron(B^T, Im) ]
    [ kron(In, D)   -kron(E^T, Im) ]
Here Ik is the identity matrix of size k and X^T is the transpose/conjugate-transpose of X. kron(X, Y) is the
Kronecker product between the matrices X and Y.
If trans = 'T' (for real flavors), or trans = 'C' (for complex flavors), the routine ?tgsyl solves the
transposed/conjugate-transposed system Z^T*y = scale*b, which is equivalent to solving for R and L in
A^T*R + D^T*L = scale*C
R*B^T + L*E^T = scale*(-F)
This case (trans = 'T' for stgsyl/dtgsyl or trans = 'C' for ctgsyl/ztgsyl) is used to compute a
one-norm-based estimate of Dif[(A, D), (B, E)], the separation between the matrix pairs (A,D) and
(B,E), using ?lacon.
If ijob ≥ 1, ?tgsyl computes a Frobenius norm-based estimate of Dif[(A, D), (B,E)]. That is, the
reciprocal of a lower bound on the reciprocal of the smallest singular value of Z. This is a level 3 BLAS
algorithm.
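For intuition, the scalar case m = n = 1 of the generalized Sylvester equation reduces to a 2-by-2 linear system in r and l with coefficient matrix [a -b; d -e]. The sketch below solves it directly with scale = 1; it illustrates the equations only, does not reproduce the algorithm ?tgsyl uses, and the subroutine solve_scalar is a hypothetical helper.

```fortran
module tgsyl_scalar_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Solve a*r - l*b = c and d*r - l*e = f for the scalars r and l.
  ! ok is .false. when the 2-by-2 system is singular, which happens when
  ! a*e = b*d, that is, when the pairs (a,d) and (b,e) share an eigenvalue.
  subroutine solve_scalar(a, b, c, d, e, f, r, l, ok)
    real(dp), intent(in)  :: a, b, c, d, e, f
    real(dp), intent(out) :: r, l
    logical, intent(out)  :: ok
    real(dp) :: det
    det = b*d - a*e            ! determinant of [a -b; d -e]
    ok = (det /= 0.0_dp)
    r = 0.0_dp
    l = 0.0_dp
    if (ok) then
      r = (b*f - c*e) / det    ! Cramer's rule, first unknown
      l = (a*f - c*d) / det    ! Cramer's rule, second unknown
    end if
  end subroutine solve_scalar
end module tgsyl_scalar_sketch
```

With a = 2, b = 1, d = 1, e = 3, c = 1, f = 2, the solution is r = 0.2 and l = -0.6, which can be checked by substitution into both equations.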
Input Parameters

trans CHARACTER*1. Must be 'N', 'T', or 'C'.
If trans = 'N', solve the generalized Sylvester equation.
If trans = 'T', solve the 'transposed' system (for real flavors only).
If trans = 'C', solve the 'conjugate transposed' system (for complex
flavors only).
ijob INTEGER. Specifies what kind of functionality to be performed:

If ijob =0, solve the generalized Sylvester equation only;
If ijob =1, perform the functionality of ijob =0 and ijob =3;
If ijob =2, perform the functionality of ijob =0 and ijob =4;
If ijob =3, only an estimate of Dif[(A, D), (B, E)] is computed (look ahead
strategy is used);
If ijob =4, only an estimate of Dif[(A, D), (B,E)] is computed (?gecon on
sub-systems is used). If trans = 'T' or 'C', ijob is not referenced.
m INTEGER. The order of the matrices A and D, and the row dimension of the
matrices C, F, R and L.
n INTEGER. The order of the matrices B and E, and the column dimension of
the matrices C, F, R and L.
a, b, c, d, e, f, work REAL for stgsyl

DOUBLE PRECISION for dtgsyl
COMPLEX for ctgsyl
DOUBLE COMPLEX for ztgsyl.
Arrays:
a(lda,*) contains the upper quasi-triangular (for real flavors) or upper
triangular (for complex flavors) matrix A.
b(ldb,*) contains the upper quasi-triangular (for real flavors) or upper
triangular (for complex flavors) matrix B. The second dimension of b must
be at least max(1, n).
c(ldc,*) contains the right-hand-side of the first matrix equation in the
generalized Sylvester equation (as defined by trans)
d(ldd,*) contains the upper triangular matrix D.
The second dimension of d must be at least max(1, m).
e(lde,*) contains the upper triangular matrix E.
The second dimension of e must be at least max(1, n).
f(ldf,*) contains the right-hand-side of the second matrix equation in the
generalized Sylvester equation (as defined by trans)
The second dimension of f must be at least max(1, n).
LAPACK Routines 3
ldd INTEGER. The leading dimension of d; at least max(1, m).
lde INTEGER. The leading dimension of e; at least max(1, n).
ldf INTEGER. The leading dimension of f; at least max(1, m) .
lwork INTEGER.
The dimension of the array work. lwork ≥ 1.
If ijob = 1 or 2 and trans = 'N', lwork ≥ max(1, 2*m*n).

iwork INTEGER. Workspace array, size at least (m+n+6) for real flavors, and at
least (m+n+2) for complex flavors.
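The workspace rules above can be sketched as two small helper functions. This is only an illustration of the stated lower bounds; the module and function names (tgsyl_sizes, tgsyl_min_lwork, tgsyl_min_liwork) are hypothetical and are not part of the library.

```fortran
module tgsyl_sizes
  implicit none
contains
  ! Smallest lwork allowed by the rules quoted above:
  ! lwork >= 1, and lwork >= max(1, 2*m*n) if ijob = 1 or 2 and trans = 'N'.
  pure integer function tgsyl_min_lwork(trans, ijob, m, n) result(lw)
    character(len=1), intent(in) :: trans
    integer, intent(in) :: ijob, m, n
    lw = 1
    if (trans == 'N' .and. (ijob == 1 .or. ijob == 2)) lw = max(1, 2*m*n)
  end function tgsyl_min_lwork

  ! Smallest iwork size: m+n+6 for real flavors, m+n+2 for complex flavors.
  pure integer function tgsyl_min_liwork(is_complex, m, n) result(liw)
    logical, intent(in) :: is_complex
    integer, intent(in) :: m, n
    liw = merge(m + n + 2, m + n + 6, is_complex)
  end function tgsyl_min_liwork
end module tgsyl_sizes

program demo_tgsyl_sizes
  use tgsyl_sizes
  implicit none
  print *, tgsyl_min_lwork('N', 1, 3, 4)     ! 2*3*4 = 24
  print *, tgsyl_min_lwork('T', 1, 3, 4)     ! 1
  print *, tgsyl_min_liwork(.false., 3, 4)   ! 3+4+6 = 13
  print *, tgsyl_min_liwork(.true., 3, 4)    ! 3+4+2 = 9
end program demo_tgsyl_sizes
```

These values are minimum sizes only; a larger work array may give better performance.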
Output Parameters
c If ijob=0, 1, or 2, overwritten by the solution R.
If ijob=3 or 4 and trans = 'N', c holds R, the solution achieved during the
computation of the Dif-estimate.
f If ijob=0, 1, or 2, overwritten by the solution L.
If ijob=3 or 4 and trans = 'N', f holds L, the solution achieved during

the computation of the Dif-estimate.
dif REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
On exit, dif is the reciprocal of a lower bound of the reciprocal of the Dif-
function, that is, dif is an upper bound of Dif[(A, D), (B, E)] =
sigma_min(Z), where Z is as defined in the description.
If ijob = 0, or trans = 'T' (for real flavors), or trans = 'C' (for
complex flavors), dif is not touched.
scale REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
On exit, scale is the scaling factor in the generalized Sylvester equation.
If 0 < scale < 1, c and f hold the solutions R and L, respectively, to a
slightly perturbed system but the input matrices A, B, D and E have not
been changed.
If scale = 0, c and f hold the solutions R and L, respectively, to the
homogeneous system with C = F = 0. Normally, scale = 1.
work(1) If info = 0, work(1) contains the minimum value of lwork required for
optimum performance. Use this lwork for subsequent runs.
info INTEGER.

If info > 0, (A, D) and (B, E) have common or close eigenvalues.

Specific details for the routine tgsyl interface are the following:
a Holds the matrix A of size (m,m).
d Holds the matrix D of size (m,m).
e Holds the matrix E of size (n,n).
f Holds the matrix F of size (m,n).
ijob Must be 0, 1, 2, 3, or 4. The default value is 0.
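Application Notes
If you are not sure how much workspace to supply, set lwork = -1. The routine then performs a workspace
query: it only calculates the optimal size of the work array and returns this value in work(1); no error
message related to lwork is issued by xerbla.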
?tgsna
Estimates reciprocal condition numbers for specified
eigenvalues and/or eigenvectors of a pair of matrices
in generalized real Schur canonical form.
Syntax
call stgsna(job, howmny, select, n, a, lda, b, ldb, vl, ldvl, vr, ldvr, s, dif, mm,
m, work, lwork, iwork, info)
call dtgsna(job, howmny, select, n, a, lda, b, ldb, vl, ldvl, vr, ldvr, s, dif, mm,
m, work, lwork, iwork, info)
call ctgsna(job, howmny, select, n, a, lda, b, ldb, vl, ldvl, vr, ldvr, s, dif, mm,
m, work, lwork, iwork, info)
call ztgsna(job, howmny, select, n, a, lda, b, ldb, vl, ldvl, vr, ldvr, s, dif, mm,
m, work, lwork, iwork, info)
call tgsna(a, b [,s] [,dif] [,vl] [,vr] [,select] [,m] [,info])
Include Files
mkl.fi, lapack.f90
Description
The real flavors stgsna/dtgsna of this routine estimate reciprocal condition numbers for specified
eigenvalues and/or eigenvectors of a matrix pair (A, B) in generalized real Schur canonical form (or of any
matrix pair (Q*A*Z^T, Q*B*Z^T) with orthogonal matrices Q and Z).
(A, B) must be in generalized real Schur form (as returned by ?gges), that is, A is block upper triangular
with 1-by-1 and 2-by-2 diagonal blocks, and B is upper triangular.
The complex flavors ctgsna/ztgsna estimate reciprocal condition numbers for specified eigenvalues and/or
eigenvectors of a matrix pair (A, B). (A, B) must be in generalized Schur canonical form, that is, A and B are
both upper triangular.
Input Parameters
job CHARACTER*1. Specifies whether condition numbers are required for

eigenvalues or eigenvectors. Must be 'E' or 'V' or 'B'.
If job = 'E', for eigenvalues only (compute s ).
If job = 'V', for eigenvectors only (compute dif ).
If job = 'B', for both eigenvalues and eigenvectors (compute both s and
dif).
howmny CHARACTER*1. Must be 'A' or 'S'.

If howmny = 'A', compute condition numbers for all eigenpairs.
If howmny = 'S', compute condition numbers for selected eigenpairs

specified by the logical array select.
select LOGICAL.
If howmny = 'S', select specifies the eigenpairs for which condition
numbers are required.
If howmny = 'A', select is not referenced.
For real flavors:
To select condition numbers for the eigenpair corresponding to a real
eigenvalue with index j, select(j) must be set to .TRUE.; to select condition numbers
corresponding to a complex conjugate pair of eigenvalues with indices j and j + 1,
either select(j) or select(j + 1) must be set to .TRUE.
For complex flavors:
To select condition numbers for the corresponding j-th eigenvalue and/or
eigenvector, select(j) must be set to .TRUE.
n INTEGER. The order of the square matrix pair (A, B)
(n ≥ 0).
a, b, vl, vr, work REAL for stgsna

DOUBLE PRECISION for dtgsna
COMPLEX for ctgsna
DOUBLE COMPLEX for ztgsna.
Arrays:
a(lda,*) contains the upper quasi-triangular (for real flavors) or upper
triangular (for complex flavors) matrix A in the pair (A, B).
b(ldb,*) contains the upper triangular matrix B in the pair (A, B). The
second dimension of b must be at least max(1, n).
If job = 'E' or 'B', vl(ldvl,*) must contain left eigenvectors of (A, B),
corresponding to the eigenpairs specified by howmny and select. The
eigenvectors must be stored in consecutive columns of vl, as returned by
?tgevc.
If job = 'V', vl is not referenced. The second dimension of vl must be at
least max(1, m).
If job = 'E' or 'B', vr(ldvr,*) must contain right eigenvectors of (A, B),
corresponding to the eigenpairs specified by howmny and select. The
eigenvectors must be stored in consecutive columns of vr, as returned by
?tgevc.
If job = 'V', vr is not referenced. The second dimension of vr must be at
least max(1, m).
If job = 'E', work is not referenced.
ldvl INTEGER. The leading dimension of vl; ldvl ≥ 1.
If job = 'E' or 'B', then ldvl ≥ max(1, n).
ldvr INTEGER. The leading dimension of vr; ldvr ≥ 1.
If job = 'E' or 'B', then ldvr ≥ max(1, n).
mm INTEGER. The number of elements in the arrays s and dif (mm ≥ m).
lwork INTEGER. The dimension of the array work.
lwork ≥ max(1, n).
If job = 'V' or 'B', lwork ≥ 2*n*(n+2)+16 for real flavors, and lwork ≥
max(1, 2*n*n) for complex flavors.
iwork INTEGER. Workspace array, size at least (n+6) for real flavors, and at least
(n+2) for complex flavors.
If job = 'E', iwork is not referenced.
Output Parameters

s REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Array, size (mm).
If job = 'E' or 'B', contains the reciprocal condition numbers of the
selected eigenvalues, stored in consecutive elements of the array.
If job = 'V', s is not referenced.
For real flavors:

For a complex conjugate pair of eigenvalues two consecutive elements of s
are set to the same value. Thus, s(j), dif(j), and the j-th columns of vl and
vr all correspond to the same eigenpair (but not in general the j-th
eigenpair, unless all eigenpairs are selected).
dif REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Array, size (mm).
If job = 'V' or 'B', contains the estimated reciprocal condition numbers
of the selected eigenvectors, stored in consecutive elements of the array.
If the eigenvalues cannot be reordered to compute dif(j), dif(j) is set to 0;
this can only occur when the true value would be very small anyway.
If job = 'E', dif is not referenced.
For real flavors:

For a complex eigenvector, two consecutive elements of dif are set to the
same value.
For each eigenvalue/vector specified by select, dif stores a Frobenius norm-
based estimate of Difl.
m INTEGER. The number of elements in the arrays s and dif used to store the
specified condition numbers; for each selected eigenvalue one element is
used.
work(1) If job is not 'E' and info = 0, on exit, work(1) contains the minimum
value of lwork required for optimum performance. Use this lwork for
subsequent runs.
info INTEGER.
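For real flavors with howmny = 'S', the number m of elements used in s and dif follows from the selection rule described for select: a real eigenvalue uses one element, a selected complex conjugate pair uses two. A minimal sketch of that count is given below. It assumes that a nonzero subdiagonal entry a(j+1,j) marks a 2-by-2 diagonal block of A; the module and function names are hypothetical, and the logical array is called sel here instead of select.

```fortran
module tgsna_count
  implicit none
contains
  ! Number of s/dif elements needed for the selected eigenpairs (real flavors).
  pure integer function count_selected(a, sel) result(m)
    double precision, intent(in) :: a(:, :)   ! upper quasi-triangular A
    logical, intent(in) :: sel(:)             ! plays the role of select
    integer :: j, n
    n = size(sel)
    m = 0
    j = 1
    do while (j <= n)
      if (j < n) then
        if (a(j+1, j) /= 0.0d0) then
          ! 2-by-2 block: a complex conjugate pair, counted twice if selected
          if (sel(j) .or. sel(j+1)) m = m + 2
          j = j + 2
          cycle
        end if
      end if
      ! 1-by-1 block: a real eigenvalue
      if (sel(j)) m = m + 1
      j = j + 1
    end do
  end function count_selected
end module tgsna_count

program demo_tgsna_count
  use tgsna_count
  implicit none
  double precision :: a(4, 4)
  logical :: sel(4)
  a = 0.0d0
  a(1, 1) = 1.0d0
  a(2, 2) = 2.0d0; a(2, 3) = 1.0d0
  a(3, 2) = -1.0d0; a(3, 3) = 2.0d0   ! 2-by-2 block in rows/columns 2 and 3
  a(4, 4) = 3.0d0
  sel = [.true., .false., .true., .false.]
  print *, count_selected(a, sel)     ! 1 (real) + 2 (pair) = 3
end program demo_tgsna_count
```

A value computed this way can be used to choose mm, since mm must be at least m.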

Specific details for the routine tgsna interface are the following:
s Holds the vector of length (mm).
dif Holds the vector of length (mm).
howmny Restored based on the presence of the argument select as follows: howmny =
'S', if select is present, howmny = 'A', if select is omitted.
job Restored based on the presence of arguments s and dif as follows: job = 'B',
if both s and dif are present; job = 'E', if s is present and dif omitted; job =
'V', if s is omitted and dif present. Note that there will be an error condition if
both s and dif are omitted.
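The way job is restored from the presence of the optional arguments s and dif can be illustrated with the Fortran intrinsic present. The sketch below is not the source of the tgsna wrapper; the function name restored_job is hypothetical, and a blank character stands in for the error condition.

```fortran
module f95_job_restore
  implicit none
contains
  ! Mimics the rule: 'B' if both s and dif are present, 'E' if only s,
  ! 'V' if only dif, and a blank (error condition) if both are omitted.
  function restored_job(s, dif) result(job)
    real, intent(in), optional :: s(:), dif(:)
    character(len=1) :: job
    if (present(s) .and. present(dif)) then
      job = 'B'
    else if (present(s)) then
      job = 'E'
    else if (present(dif)) then
      job = 'V'
    else
      job = ' '
    end if
  end function restored_job
end module f95_job_restore

program demo_job_restore
  use f95_job_restore
  implicit none
  real :: x(3)
  x = 0.0
  print *, restored_job(s=x, dif=x)   ! B
  print *, restored_job(s=x)          ! E
  print *, restored_job(dif=x)        ! V
end program demo_job_restore
```

The same pattern applies to howmny, which depends only on whether select is present.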
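Application Notes
If you are not sure how much workspace to supply, set lwork = -1. The routine then performs a workspace
query: it only calculates the optimal size of the work array and returns this value in work(1); no error
message related to lwork is issued by xerbla.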
Generalized Singular Value Decomposition: LAPACK Computational Routines

This section describes LAPACK computational routines used for finding the generalized singular value
decomposition (GSVD) of two matrices A and B as
U^H*A*Q = D1*(0 R),
V^H*B*Q = D2*(0 R),
where U, V, and Q are orthogonal/unitary matrices, R is a nonsingular upper triangular matrix, and D1, D2
are diagonal matrices of the structure detailed in the routines description section.
Table "Computational Routines for Generalized Singular Value Decomposition" lists LAPACK routines
(FORTRAN 77 interface) that perform generalized singular value decomposition of matrices.
Computational Routines for Generalized Singular Value Decomposition
Routine name    Operation performed
?ggsvp          Computes the preprocessing decomposition for the generalized SVD.
?ggsvp3         Performs preprocessing for a generalized SVD.
?ggsvd3         Computes generalized SVD.
?tgsja          Computes the generalized SVD of two upper triangular or trapezoidal matrices.
You can use routines listed in the above table as well as the driver routine ?ggsvd to find the GSVD of a pair of
general rectangular matrices.
?ggsvp
Computes the preprocessing decomposition for the
generalized SVD (deprecated).
Syntax
call sggsvp(jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, tau, work, info)
call dggsvp(jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, tau, work, info)
call cggsvp(jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, rwork, tau, work, info)
call zggsvp(jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, rwork, tau, work, info)
call ggsvp(a, b, tola, tolb [, k] [,l] [,u] [,v] [,q] [,info])
Include Files
mkl.fi, lapack.f90
Description
This routine is deprecated; use ggsvp3.
The routine computes orthogonal matrices U, V and Q such that
where the k-by-k matrix A12 and l-by-l matrix B13 are nonsingular upper triangular; A23 is l-by-l upper
triangular if m-k-l ≥ 0, otherwise A23 is (m-k)-by-l upper trapezoidal. The sum k+l is equal to the effective
numerical rank of the (m+p)-by-n matrix (A^H, B^H)^H.
This decomposition is the preprocessing step for computing the Generalized Singular Value Decomposition
(GSVD), see subroutine ?tgsja.
Input Parameters
jobu CHARACTER*1. Must be 'U' or 'N'.

If jobu = 'U', orthogonal/unitary matrix U is computed.
If jobu = 'N', U is not computed.
jobv CHARACTER*1. Must be 'V' or 'N'.

If jobv = 'V', orthogonal/unitary matrix V is computed.
If jobv = 'N', V is not computed.
jobq CHARACTER*1. Must be 'Q' or 'N'.

If jobq = 'Q', orthogonal/unitary matrix Q is computed.
If jobq = 'N', Q is not computed.
p INTEGER. The number of rows of the matrix B (p ≥ 0).
a, b, tau, work REAL for sggsvp

DOUBLE PRECISION for dggsvp
COMPLEX for cggsvp
DOUBLE COMPLEX for zggsvp.
Arrays:
tau(*) is a workspace array.
The dimension of tau must be at least max(1, n).
The dimension of work must be at least max(1, 3n, m, p).
tola, tolb REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
tola and tolb are the thresholds to determine the effective numerical rank of
matrix B and a subblock of A. Generally, they are set to
tola = max(m, n)*||A||*MACHEPS,
tolb = max(p, n)*||B||*MACHEPS.
The size of tola and tolb may affect the size of backward errors of the
decomposition.
ldu INTEGER. The leading dimension of the output array u. ldu ≥ max(1, m) if
jobu = 'U'; ldu ≥ 1 otherwise.
ldv INTEGER. The leading dimension of the output array v. ldv ≥ max(1, p) if
jobv = 'V'; ldv ≥ 1 otherwise.
ldq INTEGER. The leading dimension of the output array q. ldq ≥ max(1, n) if
jobq = 'Q'; ldq ≥ 1 otherwise.
rwork REAL for cggsvp

DOUBLE PRECISION for zggsvp.
Workspace array, size at least max(1, 2n). Used in complex flavors only.
Output Parameters
a Overwritten by the triangular (or trapezoidal) matrix described in the

Description section.
b Overwritten by the triangular matrix described in the Description section.
k, l INTEGER. On exit, k and l specify the dimension of subblocks. The sum k + l
is equal to the effective numerical rank of (A^H, B^H)^H.
u, v, q REAL for sggsvp

DOUBLE PRECISION for dggsvp
COMPLEX for cggsvp
DOUBLE COMPLEX for zggsvp.

Arrays:
If jobu = 'U', u(ldu,*) contains the orthogonal/unitary matrix U.
The second dimension of u must be at least max(1, m).

If jobu = 'N', u is not referenced.
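If jobv = 'V', v(ldv,*) contains the orthogonal/unitary matrix V.
The second dimension of v must be at least max(1, p).
If jobv = 'N', v is not referenced.
If jobq = 'Q', q(ldq,*) contains the orthogonal/unitary matrix Q.
The second dimension of q must be at least max(1, n).
If jobq = 'N', q is not referenced.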
info INTEGER.

Specific details for the routine ggsvp interface are the following:
b Holds the matrix B of size (p,n).
u Holds the matrix U of size (m,m).
v Holds the matrix V of size (p,p).
jobu Restored based on the presence of the argument u as follows:

jobu = 'U', if u is present,
jobu = 'N', if u is omitted.
jobv Restored based on the presence of the argument v as follows:
jobv = 'V', if v is present,
jobv = 'N', if v is omitted.
jobq Restored based on the presence of the argument q as follows:
jobq = 'Q', if q is present,
jobq = 'N', if q is omitted.
?ggsvp3
Performs preprocessing for a generalized SVD.
Syntax
call sggsvp3 (jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, tau, work, lwork, info )
call dggsvp3 (jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, tau, work, lwork, info )
call cggsvp3 (jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, rwork, tau, work, lwork, info )
call zggsvp3 (jobu, jobv, jobq, m, p, n, a, lda, b, ldb, tola, tolb, k, l, u, ldu, v,
ldv, q, ldq, iwork, rwork, tau, work, lwork, info )
Include Files
mkl.fi
Description
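?ggsvp3 computes orthogonal or unitary matrices U, V, and Q such that
for real flavors:
$$
U^T A Q = \begin{array}{r|ccc}
 & n-k-l & k & l \\ \hline
k & 0 & A_{12} & A_{13} \\
l & 0 & 0 & A_{23} \\
m-k-l & 0 & 0 & 0
\end{array}
\quad \text{if } m-k-l \ge 0;
$$
$$
U^T A Q = \begin{array}{r|ccc}
 & n-k-l & k & l \\ \hline
k & 0 & A_{12} & A_{13} \\
m-k & 0 & 0 & A_{23}
\end{array}
\quad \text{if } m-k-l < 0;
$$
$$
V^T B Q = \begin{array}{r|ccc}
 & n-k-l & k & l \\ \hline
l & 0 & 0 & B_{13} \\
p-l & 0 & 0 & 0
\end{array}
$$
for complex flavors, the same block forms hold with U^H*A*Q and V^H*B*Q in place of U^T*A*Q and V^T*B*Q,
where the k-by-k matrix A12 and l-by-l matrix B13 are nonsingular upper triangular; A23 is l-by-l upper
triangular if m-k-l ≥ 0, otherwise A23 is (m-k)-by-l upper trapezoidal. k + l = the effective numerical rank of
the (m + p)-by-n matrix (A^T, B^T)^T for real flavors or (A^H, B^H)^H for complex flavors.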
This decomposition is the preprocessing step for computing the Generalized Singular Value Decomposition
(GSVD), see ?ggsvd3.
Input Parameters
jobu CHARACTER*1. = 'U': Orthogonal/unitary matrix U is computed;

= 'N': U is not computed.
jobv CHARACTER*1. = 'V': Orthogonal/unitary matrix V is computed;

= 'N': V is not computed.
jobq CHARACTER*1. = 'Q': Orthogonal/unitary matrix Q is computed;

= 'N': Q is not computed.
m INTEGER. The number of rows of the matrix A.
m ≥ 0.
p INTEGER. The number of rows of the matrix B.
p ≥ 0.
n INTEGER. The number of columns of the matrices A and B.
n ≥ 0.
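a REAL for sggsvp3
DOUBLE PRECISION for dggsvp3
COMPLEX for cggsvp3
DOUBLE COMPLEX for zggsvp3
On entry, the m-by-n matrix A.
lda INTEGER. The leading dimension of the array a.
lda ≥ max(1,m).
b REAL for sggsvp3
DOUBLE PRECISION for dggsvp3
COMPLEX for cggsvp3
DOUBLE COMPLEX for zggsvp3
On entry, the p-by-n matrix B.
ldb INTEGER. The leading dimension of the array b.
ldb ≥ max(1,p).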
tola, tolb REAL for sggsvp3
DOUBLE PRECISION for dggsvp3
REAL for cggsvp3
DOUBLE PRECISION for zggsvp3
tola and tolb are the thresholds to determine the effective numerical rank
of matrix B and a subblock of A. Generally, they are set to
tola = max(m,n)*norm(a)*MACHEPS,
tolb = max(p,n)*norm(b)*MACHEPS.
The size of tola and tolb may affect the size of backward errors of the
decomposition.
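ldu INTEGER. The leading dimension of the array u.
ldu ≥ max(1,m) if jobu = 'U'; ldu ≥ 1 otherwise.
ldv INTEGER. The leading dimension of the array v.
ldv ≥ max(1,p) if jobv = 'V'; ldv ≥ 1 otherwise.
ldq INTEGER. The leading dimension of the array q.
ldq ≥ max(1,n) if jobq = 'Q'; ldq ≥ 1 otherwise.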
iwork INTEGER. Array, size (n).
rwork REAL for cggsvp3
DOUBLE PRECISION for zggsvp3
Array, size (2*n). Used in complex flavors only.
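tau REAL for sggsvp3
DOUBLE PRECISION for dggsvp3
COMPLEX for cggsvp3
DOUBLE COMPLEX for zggsvp3
Array, size (n).
The scalar factors of the elementary reflectors.
work REAL for sggsvp3
DOUBLE PRECISION for dggsvp3
COMPLEX for cggsvp3
DOUBLE COMPLEX for zggsvp3
Array, size (MAX(1,lwork)).
lwork INTEGER. The dimension of the array work.
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the work
array, returns this value as the first entry of the work array, and no error message related to lwork is issued by
xerbla.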
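The suggested thresholds tola = max(m,n)*norm(a)*MACHEPS and tolb = max(p,n)*norm(b)*MACHEPS can be computed as in the following sketch. Two choices here are assumptions, not statements from this reference: the norm is taken to be the one-norm (largest absolute column sum), and MACHEPS is taken to be the value of the Fortran intrinsic epsilon. The names gsvd_tol, one_norm, and rank_tol are hypothetical.

```fortran
module gsvd_tol
  implicit none
contains
  ! One-norm of a matrix: the largest sum of absolute values in a column.
  pure function one_norm(a) result(nrm)
    double precision, intent(in) :: a(:, :)
    double precision :: nrm
    integer :: j
    nrm = 0.0d0
    do j = 1, size(a, 2)
      nrm = max(nrm, sum(abs(a(:, j))))
    end do
  end function one_norm

  ! max(rows, cols) * norm * MACHEPS for the matrix passed in.
  ! Call with A to obtain tola and with B to obtain tolb.
  pure function rank_tol(a) result(tol)
    double precision, intent(in) :: a(:, :)
    double precision :: tol
    tol = max(size(a, 1), size(a, 2)) * one_norm(a) * epsilon(1.0d0)
  end function rank_tol
end module gsvd_tol

program demo_gsvd_tol
  use gsvd_tol
  implicit none
  double precision :: a(2, 3), b(4, 3)
  a = reshape([1.0d0, -2.0d0, 3.0d0, 4.0d0, 0.0d0, -1.0d0], [2, 3])
  b = 1.0d0
  print *, 'tola =', rank_tol(a)   ! 3 * 7 * epsilon
  print *, 'tolb =', rank_tol(b)   ! 4 * 4 * epsilon
end program demo_gsvd_tol
```

Larger thresholds treat more small entries as zero and so tend to lower the detected rank k + l.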
Output Parameters
a On exit, a contains the triangular (or trapezoidal) matrix described in the

Description section.
b On exit, b contains the triangular matrix described in the Description

section.
k, l INTEGER. On exit, k and l specify the dimension of the subblocks described
in the Description section.
k + l = effective numerical rank of (A^T, B^T)^T for real flavors or (A^H, B^H)^H for
complex flavors.
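u REAL for sggsvp3
DOUBLE PRECISION for dggsvp3
COMPLEX for cggsvp3
DOUBLE COMPLEX for zggsvp3
Array, size (ldu, m).
If jobu = 'U', u contains the orthogonal/unitary matrix U.
v REAL for sggsvp3
DOUBLE PRECISION for dggsvp3
COMPLEX for cggsvp3
DOUBLE COMPLEX for zggsvp3
Array, size (ldv, p).
If jobv = 'V', v contains the orthogonal/unitary matrix V.
q REAL for sggsvp3
DOUBLE PRECISION for dggsvp3
COMPLEX for cggsvp3
DOUBLE COMPLEX for zggsvp3
Array, size (ldq, n).
If jobq = 'Q', q contains the orthogonal/unitary matrix Q.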

Application Notes
The subroutine uses LAPACK subroutine ?geqp3 for the QR factorization with column pivoting to detect the
effective numerical rank of the A matrix. It may be replaced by a better rank determination strategy.
?ggsvp3 replaces the deprecated subroutine ?ggsvp.
?ggsvd3
Computes generalized SVD.
Syntax
call sggsvd3(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, lwork, iwork, info)
call dggsvd3(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, lwork, iwork, info)
call cggsvd3(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, lwork, rwork, iwork, info)
call zggsvd3(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, lwork, rwork, iwork, info)
Include Files
mkl.fi
Description
?ggsvd3 computes the generalized singular value decomposition (GSVD) of an m-by-n real or complex matrix
A and p-by-n real or complex matrix B:
U^T*A*Q = D1*( 0 R ), V^T*B*Q = D2*( 0 R ) for real flavors
or
U^H*A*Q = D1*( 0 R ), V^H*B*Q = D2*( 0 R ) for complex flavors
where U, V and Q are orthogonal/unitary matrices.
Let k+l = the effective numerical rank of the matrix (A^T, B^T)^T for real flavors or the matrix (A^H, B^H)^H for
complex flavors, then R is a (k + l)-by-(k + l) nonsingular upper triangular matrix, and D1 and D2 are m-by-(k +
l) and p-by-(k + l) "diagonal" matrices of the following structures, respectively:
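If m-k-l ≥ 0,
$$
D_1 = \begin{array}{r|cc}
 & k & l \\ \hline
k & I & 0 \\
l & 0 & C \\
m-k-l & 0 & 0
\end{array}
\qquad
D_2 = \begin{array}{r|cc}
 & k & l \\ \hline
l & 0 & S \\
p-l & 0 & 0
\end{array}
$$
$$
\begin{pmatrix} 0 & R \end{pmatrix} = \begin{array}{r|ccc}
 & n-k-l & k & l \\ \hline
k & 0 & R_{11} & R_{12} \\
l & 0 & 0 & R_{22}
\end{array}
$$
where
C = diag( alpha(k+1), ... , alpha(k+l) ),
S = diag( beta(k+1), ... , beta(k+l) ),
C^2 + S^2 = I.
If m - k - l < 0,
$$
D_1 = \begin{array}{r|ccc}
 & k & m-k & k+l-m \\ \hline
k & I & 0 & 0 \\
m-k & 0 & C & 0
\end{array}
\qquad
D_2 = \begin{array}{r|ccc}
 & k & m-k & k+l-m \\ \hline
m-k & 0 & S & 0 \\
k+l-m & 0 & 0 & I \\
p-l & 0 & 0 & 0
\end{array}
$$
$$
\begin{pmatrix} 0 & R \end{pmatrix} = \begin{array}{r|cccc}
 & n-k-l & k & m-k & k+l-m \\ \hline
k & 0 & R_{11} & R_{12} & R_{13} \\
m-k & 0 & 0 & R_{22} & R_{23} \\
k+l-m & 0 & 0 & 0 & R_{33}
\end{array}
$$
where
C = diag(alpha(k + 1), ... , alpha(m)),
S = diag(beta(k + 1), ... , beta(m)),
C^2 + S^2 = I.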
The routine computes C, S, R, and optionally the orthogonal/unitary transformation matrices U, V and Q.
In particular, if B is an n-by-n nonsingular matrix, then the GSVD of A and B implicitly gives the SVD of
A*inv(B):
A*inv(B) = U*(D1*inv(D2))*V^T for real flavors
or
A*inv(B) = U*(D1*inv(D2))*V^H for complex flavors.
If (A^T, B^T)^T for real flavors or (A^H, B^H)^H for complex flavors has orthonormal columns, then the GSVD of A and
B is also equal to the CS decomposition of A and B. Furthermore, the GSVD can be used to derive the
solution of the eigenvalue problem:
A^T*A*x = λ*B^T*B*x for real flavors
or
A^H*A*x = λ*B^H*B*x for complex flavors
In some literature, the GSVD of A and B is presented in the form
U^T*A*X = ( 0 D1 ), V^T*B*X = ( 0 D2 ) for real (A, B)
or
U^H*A*X = ( 0 D1 ), V^H*B*X = ( 0 D2 ) for complex (A, B)
where U and V are orthogonal and X is nonsingular, D1 and D2 are "diagonal". The former GSVD form can be
converted to the latter form by taking the nonsingular matrix X as
$$
X = Q \begin{pmatrix} I & 0 \\ 0 & R^{-1} \end{pmatrix}
$$
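The conversion between the two GSVD forms can be checked by direct substitution. The short derivation below covers the real case for A; it assumes the identity block in X has order n-k-l, so that the block sizes match those of ( 0 R ). The case of B is identical with V and D2, and the complex case replaces the transpose with the conjugate transpose.

```latex
\begin{aligned}
X &= Q \begin{pmatrix} I_{n-k-l} & 0 \\ 0 & R^{-1} \end{pmatrix}, \\
U^T A X &= \bigl(U^T A Q\bigr) \begin{pmatrix} I_{n-k-l} & 0 \\ 0 & R^{-1} \end{pmatrix}
         = D_1 \begin{pmatrix} 0 & R \end{pmatrix}
           \begin{pmatrix} I_{n-k-l} & 0 \\ 0 & R^{-1} \end{pmatrix} \\
        &= D_1 \begin{pmatrix} 0 & R R^{-1} \end{pmatrix}
         = D_1 \begin{pmatrix} 0 & I_{k+l} \end{pmatrix}
         = \begin{pmatrix} 0 & D_1 \end{pmatrix}.
\end{aligned}
```

X is nonsingular because Q is orthogonal/unitary and R is nonsingular.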
Input Parameters
jobu CHARACTER*1. = 'U': Orthogonal/unitary matrix U is computed;

= 'N': U is not computed.
jobv CHARACTER*1. = 'V': Orthogonal/unitary matrix V is computed;

= 'N': V is not computed.
jobq CHARACTER*1. = 'Q': Orthogonal/unitary matrix Q is computed;
= 'N': Q is not computed.
m INTEGER. The number of rows of the matrix A.
m ≥ 0.
n INTEGER. The number of columns of the matrices A and B.
n ≥ 0.
p INTEGER. The number of rows of the matrix B.
p ≥ 0.
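a REAL for sggsvd3
DOUBLE PRECISION for dggsvd3
COMPLEX for cggsvd3
DOUBLE COMPLEX for zggsvd3
On entry, the m-by-n matrix A.
lda INTEGER. The leading dimension of the array a.
lda ≥ max(1,m).
b REAL for sggsvd3
DOUBLE PRECISION for dggsvd3
COMPLEX for cggsvd3
DOUBLE COMPLEX for zggsvd3
On entry, the p-by-n matrix B.
ldb INTEGER. The leading dimension of the array b.
ldb ≥ max(1,p).
ldu INTEGER. The leading dimension of the array u.
ldu ≥ max(1,m) if jobu = 'U'; ldu ≥ 1 otherwise.
ldv INTEGER. The leading dimension of the array v.
ldv ≥ max(1,p) if jobv = 'V'; ldv ≥ 1 otherwise.
ldq INTEGER. The leading dimension of the array q.
ldq ≥ max(1,n) if jobq = 'Q'; ldq ≥ 1 otherwise.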
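work REAL for sggsvd3
DOUBLE PRECISION for dggsvd3
COMPLEX for cggsvd3
DOUBLE COMPLEX for zggsvd3
Array, size (MAX(1,lwork)).
lwork INTEGER. The dimension of the array work.
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the work
array, returns this value as the first entry of the work array, and no error message related to lwork is issued by
xerbla.
rwork REAL for cggsvd3
DOUBLE PRECISION for zggsvd3
Array, size (2*n). Used in complex flavors only.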
iwork INTEGER. Array, size (n).
Output Parameters
k, l INTEGER. On exit, k and l specify the dimension of the subblocks described
in the Description section.
k + l = effective numerical rank of (A^T, B^T)^T for real flavors or (A^H, B^H)^H for
complex flavors.
a On exit, a contains the triangular matrix R, or part of R.
If m-k-l ≥ 0, R is stored in a(1: k + l,n - k - l + 1:n).
If m - k - l < 0, the matrix
$$
\begin{pmatrix} R_{11} & R_{12} & R_{13} \\ 0 & R_{22} & R_{23} \end{pmatrix}
$$
is stored in a(1:m, n - k - l + 1:n), and
R33 is stored in b(m - k + 1:l,n + m - k - l + 1:n) on exit.
b On exit, b contains part of the triangular matrix R if m - k - l < 0.
See Description for details.
alpha REAL for sggsvd3
DOUBLE PRECISION for dggsvd3
REAL for cggsvd3
DOUBLE PRECISION for zggsvd3
Array, size (n)
beta REAL for sggsvd3
DOUBLE PRECISION for dggsvd3
REAL for cggsvd3
DOUBLE PRECISION for zggsvd3
Array, size (n)
On exit, alpha and beta contain the generalized singular value pairs of a
and b;
alpha(1: k) = 1,
beta(1: k) = 0,
and if m - k - l ≥ 0,
alpha(k + 1:k + l) = C,
beta(k + 1:k + l) = S,
or if m - k - l < 0,
alpha(k + 1:m) = C, alpha(m + 1:k + l) = 0
beta(k + 1: m) = S, beta(m + 1: k + l) = 1
and
alpha(k + l + 1: n) = 0
beta(k + l + 1: n) = 0
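u REAL for sggsvd3
DOUBLE PRECISION for dggsvd3
COMPLEX for cggsvd3
DOUBLE COMPLEX for zggsvd3
Array, size (ldu, m).
If jobu = 'U', u contains the m-by-m orthogonal/unitary matrix U.
v REAL for sggsvd3
DOUBLE PRECISION for dggsvd3
COMPLEX for cggsvd3
DOUBLE COMPLEX for zggsvd3
Array, size (ldv, p).
If jobv = 'V', v contains the p-by-p orthogonal/unitary matrix V.
q REAL for sggsvd3
DOUBLE PRECISION for dggsvd3
COMPLEX for cggsvd3
DOUBLE COMPLEX for zggsvd3
Array, size (ldq, n).
If jobq = 'Q', q contains the n-by-n orthogonal/unitary matrix Q.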
iwork On exit, iwork stores the sorting information. More precisely, the following
loop uses iwork to sort alpha:
for I = k+1, min(m,k + l)
swap alpha(I) and alpha(iwork(I))
endfor
such that alpha(1) ≥ alpha(2) ≥ ... ≥ alpha(n).
info INTEGER.
= 0: successful exit.
< 0: if info = -i, the i-th argument had an illegal value.
> 0: if info = 1, the Jacobi-type procedure failed to converge.
For further details, see subroutine ?tgsja.
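The sorting loop described for iwork can be written in Fortran as shown below. The subroutine apply_iwork is a direct transcription of that loop. The subroutine make_iwork is a hypothetical stand-in that builds a compatible iwork by a selection sort which records pivot positions; it is included only to make the example self-contained and is not claimed to be how the routine fills iwork.

```fortran
module gsvd_sort
  implicit none
contains
  ! The documented loop: swap alpha(i) with alpha(iwork(i)) for i = k+1..min(m,k+l).
  subroutine apply_iwork(alpha, iwork, k, l, m)
    double precision, intent(inout) :: alpha(:)
    integer, intent(in) :: iwork(:), k, l, m
    double precision :: tmp
    integer :: i
    do i = k + 1, min(m, k + l)
      tmp = alpha(i)
      alpha(i) = alpha(iwork(i))
      alpha(iwork(i)) = tmp
    end do
  end subroutine apply_iwork

  ! Hypothetical helper: record selection-sort pivots on a copy of alpha.
  subroutine make_iwork(alpha, iwork, k, l, m)
    double precision, intent(in) :: alpha(:)
    integer, intent(out) :: iwork(:)
    integer, intent(in) :: k, l, m
    double precision :: w(size(alpha)), tmp
    integer :: i, j, isub
    w = alpha
    iwork = 0
    do i = k + 1, min(m, k + l)
      isub = i
      do j = i + 1, min(m, k + l)
        if (w(j) > w(isub)) isub = j   ! position of the largest remaining value
      end do
      tmp = w(i); w(i) = w(isub); w(isub) = tmp
      iwork(i) = isub
    end do
  end subroutine make_iwork
end module gsvd_sort

program demo_gsvd_sort
  use gsvd_sort
  implicit none
  double precision :: alpha(5)
  integer :: iwork(5)
  alpha = [1.0d0, 0.3d0, 0.9d0, 0.5d0, 0.0d0]   ! k = 1, l = 3, m = 5
  call make_iwork(alpha, iwork, 1, 3, 5)
  call apply_iwork(alpha, iwork, 1, 3, 5)
  print *, alpha   ! entries 2..4 are now in decreasing order
end program demo_gsvd_sort
```

Only positions k+1 through min(m, k+l) are touched; the leading ones and trailing zeros of alpha are already in place.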
Application Notes
?ggsvd3 replaces the deprecated subroutine ?ggsvd.
?tgsja
Computes the generalized SVD of two upper triangular
or trapezoidal matrices.
Syntax
call stgsja(jobu, jobv, jobq, m, p, n, k, l, a, lda, b, ldb, tola, tolb, alpha, beta,
u, ldu, v, ldv, q, ldq, work, ncycle, info)
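call dtgsja(jobu, jobv, jobq, m, p, n, k, l, a, lda, b, ldb, tola, tolb, alpha, beta,
u, ldu, v, ldv, q, ldq, work, ncycle, info)
call ctgsja(jobu, jobv, jobq, m, p, n, k, l, a, lda, b, ldb, tola, tolb, alpha, beta,
u, ldu, v, ldv, q, ldq, work, ncycle, info)
call ztgsja(jobu, jobv, jobq, m, p, n, k, l, a, lda, b, ldb, tola, tolb, alpha, beta,
u, ldu, v, ldv, q, ldq, work, ncycle, info)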
call tgsja(a, b, tola, tolb, k, l [,u] [,v] [,q] [,jobu] [,jobv] [,jobq] [,alpha]
[,beta] [,ncycle] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the generalized singular value decomposition (GSVD) of two real/complex upper
triangular (or trapezoidal) matrices A and B. On entry, it is assumed that matrices A and B have the following
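forms, which may be obtained by the preprocessing subroutine ?ggsvp from a general m-by-n matrix A and
p-by-n matrix B:
$$
A = \begin{array}{r|ccc}
 & n-k-l & k & l \\ \hline
k & 0 & A_{12} & A_{13} \\
l & 0 & 0 & A_{23} \\
m-k-l & 0 & 0 & 0
\end{array}
\quad \text{if } m-k-l \ge 0;
\qquad
A = \begin{array}{r|ccc}
 & n-k-l & k & l \\ \hline
k & 0 & A_{12} & A_{13} \\
m-k & 0 & 0 & A_{23}
\end{array}
\quad \text{if } m-k-l < 0;
$$
$$
B = \begin{array}{r|ccc}
 & n-k-l & k & l \\ \hline
l & 0 & 0 & B_{13} \\
p-l & 0 & 0 & 0
\end{array}
$$
where the k-by-k matrix A12 and l-by-l matrix B13 are nonsingular upper triangular; A23 is l-by-l upper
triangular if m-k-l ≥ 0, otherwise A23 is (m-k)-by-l upper trapezoidal.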
On exit,
UH*A*Q = D1*(0 R), VH*B*Q = D2*(0 R),
where U, V and Q are orthogonal/unitary matrices, R is a nonsingular upper triangular matrix, and D1 and D2
are "diagonal" matrices, which are of the following structures:
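If m-k-l ≥ 0,
$$
D_1 = \begin{array}{r|cc}
 & k & l \\ \hline
k & I & 0 \\
l & 0 & C \\
m-k-l & 0 & 0
\end{array}
\qquad
D_2 = \begin{array}{r|cc}
 & k & l \\ \hline
l & 0 & S \\
p-l & 0 & 0
\end{array}
$$
where
C = diag(alpha(k+1),...,alpha(k+l))
S = diag(beta(k+1),...,beta(k+l))
C^2 + S^2 = I
R is stored in a(1:k+l, n-k-l+1:n) on exit.
If m-k-l < 0,
$$
D_1 = \begin{array}{r|ccc}
 & k & m-k & k+l-m \\ \hline
k & I & 0 & 0 \\
m-k & 0 & C & 0
\end{array}
\qquad
D_2 = \begin{array}{r|ccc}
 & k & m-k & k+l-m \\ \hline
m-k & 0 & S & 0 \\
k+l-m & 0 & 0 & I \\
p-l & 0 & 0 & 0
\end{array}
$$
where
C = diag(alpha(k+1),...,alpha(m)),
S = diag(beta(k+1),...,beta(m)),
C^2 + S^2 = I
On exit, the matrix
$$
\begin{pmatrix} R_{11} & R_{12} & R_{13} \\ 0 & R_{22} & R_{23} \end{pmatrix}
$$
is stored in a(1:m, n-k-l+1:n) and R33 is stored
in b(m-k+1:l, n+m-k-l+1:n).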
The computation of the orthogonal/unitary transformation matrices U, V or Q is optional. These matrices may
either be formed explicitly, or they may be postmultiplied into input matrices U1, V1, or Q1.
Input Parameters
jobu CHARACTER*1. Must be 'U', 'I', or 'N'.

If jobu = 'U', u must contain an orthogonal/unitary matrix U1 on entry.
If jobu = 'I', u is initialized to the unit matrix.
If jobu = 'N', u is not computed.
jobv CHARACTER*1. Must be 'V', 'I', or 'N'.
If jobv = 'V', v must contain an orthogonal/unitary matrix V1 on entry.
If jobv = 'I', v is initialized to the unit matrix.
If jobv = 'N', v is not computed.
jobq CHARACTER*1. Must be 'Q', 'I', or 'N'.

If jobq = 'Q', q must contain an orthogonal/unitary matrix Q1 on entry.
If jobq = 'I', q is initialized to the unit matrix.
If jobq = 'N', q is not computed.
k, l INTEGER. Specify the subblocks in the input matrices A and B, whose GSVD
is computed.
a, b, u, v, q, work REAL for stgsja

DOUBLE PRECISION for dtgsja
COMPLEX for ctgsja
DOUBLE COMPLEX for ztgsja.
Arrays:
If jobu = 'U', u(ldu,*) must contain a matrix U1 (usually the orthogonal/
unitary matrix returned by ?ggsvp).
The second dimension of u must be at least max(1, m).
If jobv = 'V', v(ldv,*) must contain a matrix V1 (usually the orthogonal/
unitary matrix returned by ?ggsvp).
The second dimension of v must be at least max(1, p).
If jobq = 'Q', q(ldq,*) must contain a matrix Q1 (usually the orthogonal/
unitary matrix returned by ?ggsvp).
The second dimension of q must be at least max(1, n).
ldu INTEGER. The leading dimension of the array u.
ldu ≥ max(1, m) if jobu = 'U'; ldu ≥ 1 otherwise.
ldv INTEGER. The leading dimension of the array v.
ldv ≥ max(1, p) if jobv = 'V'; ldv ≥ 1 otherwise.
ldq INTEGER. The leading dimension of the array q.
ldq ≥ max(1, n) if jobq = 'Q'; ldq ≥ 1 otherwise.
tola, tolb REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
tola and tolb are the convergence criteria for the Jacobi-Kogbetliantz
iteration procedure. Generally, they are the same as used in ?ggsvp:
tola = max(m, n)*||A||*MACHEPS,
tolb = max(p, n)*||B||*MACHEPS.
Output Parameters
a On exit, a(n-k+1:n, 1:min(k+l, m)) contains the triangular matrix R or part

of R.
b On exit, if necessary, b(m-k+1: l, n+m-k-l+1: n) contains a part of R.
alpha, beta REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Arrays, size at least max(1, n). Contain the generalized singular value pairs
of A and B:
alpha(1:k) = 1,
beta(1:k) = 0,
and if m-k-l ≥ 0,
alpha(k+1:k+l) = diag(C),
beta(k+1:k+l) = diag(S),
or if m-k-l < 0,
alpha(k+1:m)= diag(C), alpha(m+1:k+l)=0

beta(k+1:m) = diag(S),
beta(m+1:k+l) = 1.
Furthermore, if k+l < n,
alpha(k+l+1:n)= 0 and
beta(k+l+1:n) = 0.
u If jobu = 'I', u contains the orthogonal/unitary matrix U.
If jobu = 'U', u contains the product U1*U.
v If jobv = 'I', v contains the orthogonal/unitary matrix V.
If jobv = 'V', v contains the product V1*V.
q If jobq = 'I', q contains the orthogonal/unitary matrix Q.
If jobq = 'Q', q contains the product Q1*Q.
ncycle INTEGER. The number of cycles required for convergence.
info INTEGER.
If info = 1, the procedure does not converge after MAXIT cycles.

Specific details for the routine tgsja interface are the following:
u Holds the matrix U of size (m,m).
v Holds the matrix V of size (p,p).
alpha Holds the vector of length n.
jobu If omitted, this argument is restored based on the presence of argument u as

follows:
jobu = 'U', if u is present,
jobu = 'N', if u is omitted.
If present, jobu must be equal to 'I' or 'U' and the argument u must also be
present.
Note that there will be an error condition if jobu is present and u omitted.
jobv If omitted, this argument is restored based on the presence of argument v as

follows:
jobv = 'V', if v is present,
jobv = 'N', if v is omitted.
If present, jobv must be equal to 'I' or 'V' and the argument v must also be
present.
Note that there will be an error condition if jobv is present and v omitted.
jobq If omitted, this argument is restored based on the presence of argument q as

follows:
jobq = 'Q', if q is present,
jobq = 'N', if q is omitted.
If present, jobq must be equal to 'I' or 'Q' and the argument q must also be
present.
Note that there will be an error condition if jobq is present and q omitted.
Cosine-Sine Decomposition: LAPACK Computational Routines

This section describes LAPACK computational routines for computing the cosine-sine decomposition (CS
decomposition) of a partitioned unitary/orthogonal matrix. The algorithm computes a complete 2-by-2 CS
decomposition, which requires simultaneous diagonalization of all the four blocks of a unitary/orthogonal
matrix partitioned into a 2-by-2 block structure.
The computation has the following phases:
1. The matrix is reduced to a bidiagonal block form.
2. The blocks are simultaneously diagonalized using techniques from the bidiagonal SVD algorithms.
Table "Computational Routines for Cosine-Sine Decomposition (CSD)" lists LAPACK routines that perform CS
decomposition of matrices. The corresponding routine names in the Fortran 95 interface are without the first
symbol.
Computational Routines for Cosine-Sine Decomposition (CSD)
Routine name    Operation performed
?bbcsd          Computes the CS decomposition of an orthogonal/unitary matrix in bidiagonal-block form.
?orbdb          Simultaneously bidiagonalizes the blocks of a partitioned orthogonal matrix.
?unbdb          Simultaneously bidiagonalizes the blocks of a partitioned unitary matrix.
See Also
Cosine-Sine Decomposition: LAPACK Driver Routines
?bbcsd
Computes the CS decomposition of an orthogonal/
unitary matrix in bidiagonal-block form.
Syntax
call sbbcsd( jobu1, jobu2, jobv1t, jobv2t, trans, m, p, q, theta, phi, u1, ldu1, u2,
ldu2, v1t, ldv1t, v2t, ldv2t, b11d, b11e, b12d, b12e, b21d, b21e, b22d, b22e, work,
lwork, info )
call dbbcsd( jobu1, jobu2, jobv1t, jobv2t, trans, m, p, q, theta, phi, u1, ldu1, u2,
ldu2, v1t, ldv1t, v2t, ldv2t, b11d, b11e, b12d, b12e, b21d, b21e, b22d, b22e, work,
lwork, info )
call cbbcsd( jobu1, jobu2, jobv1t, jobv2t, trans, m, p, q, theta, phi, u1, ldu1, u2,
ldu2, v1t, ldv1t, v2t, ldv2t, b11d, b11e, b12d, b12e, b21d, b21e, b22d, b22e, rwork,
rlwork, info )
call zbbcsd( jobu1, jobu2, jobv1t, jobv2t, trans, m, p, q, theta, phi, u1, ldu1, u2,
ldu2, v1t, ldv1t, v2t, ldv2t, b11d, b11e, b12d, b12e, b21d, b21e, b22d, b22e, rwork,
rlwork, info )
call bbcsd( theta,phi,u1,u2,v1t,v2t[,b11d][,b11e][,b12d][,b12e][,b21d][,b21e][,b22d]
[,b22e][,jobu1][,jobu2][,jobv1t][,jobv2t][,trans][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routine ?bbcsd computes the CS decomposition of an orthogonal or unitary matrix in
bidiagonal-block form:
or
respectively.
x is m-by-m with the top-left block p-by-q. Note that q must not be larger than p, m-p, or m-q. If q is not
the smallest index, x must be transposed and/or permuted in constant time using the trans option. See
?orcsd/?uncsd for details.
The bidiagonal matrices b11, b12, b21, and b22 are represented implicitly by angles theta(1:q) and
phi(1:q-1).
The orthogonal/unitary matrices u1, u2, v1t, and v2t are input/output. The input matrices are pre- or post-
multiplied by the appropriate singular vector matrices.
Input Parameters
jobu1 CHARACTER. If jobu1 = 'Y', then u1 is updated. Otherwise, u1 is not updated.
jobu2 CHARACTER. If jobu2 = 'Y', then u2 is updated. Otherwise, u2 is not updated.
jobv1t CHARACTER. If jobv1t = 'Y', then v1t is updated. Otherwise, v1t is not updated.
jobv2t CHARACTER. If jobv2t = 'Y', then v2t is updated. Otherwise, v2t is not updated.
trans CHARACTER
= 'T': x, u1, u2, v1t, v2t are stored in row-major order;
otherwise x, u1, u2, v1t, v2t are stored in column-major order.
m INTEGER. The number of rows and columns of the orthogonal/unitary matrix X in bidiagonal-block form.
p INTEGER. The number of rows in the top-left block of x. 0 ≤ p ≤ m.
q INTEGER. The number of columns in the top-left block of x. 0 ≤ q ≤ min(p, m-p, m-q).
theta REAL for sbbcsd

DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
Array, size (q).
On entry, the angles theta(1), ..., theta(q) that, along with phi(1), ...,
phi(q-1), define the matrix in bidiagonal-block form as returned by
orbdb/unbdb.
phi REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
Array, size (q-1).
The angles phi(1), ..., phi(q-1) that, along with theta(1), ...,
theta(q), define the matrix in bidiagonal-block form as returned by
orbdb/unbdb.
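u1 REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
Array, size (ldu1,p).
On entry, a p-by-p matrix.
ldu1 INTEGER. The leading dimension of the array u1, ldu1 ≥ max(1, p).
u2 REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
Array, size (ldu2,m-p).
On entry, an (m-p)-by-(m-p) matrix.
ldu2 INTEGER. The leading dimension of the array u2, ldu2 ≥ max(1, m-p).
v1t REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
Array, size (ldv1t,q).
On entry, a q-by-q matrix.
ldv1t INTEGER. The leading dimension of the array v1t, ldv1t ≥ max(1, q).
v2t REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
Array, size (ldv2t,m-q).
On entry, an (m-q)-by-(m-q) matrix.
ldv2t INTEGER. The leading dimension of the array v2t, ldv2t ≥ max(1, m-q).
work REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
Workspace array, size (max(1,lwork)).
lwork INTEGER. The size of the work array. lwork ≥ max(1,8*q)
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the work array, returns this value as the first entry of the work array, and no error message related to lwork is issued by xerbla.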
Output Parameters
theta REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
On exit, the angles whose cosines and sines define the diagonal blocks in the CS decomposition.
u1 REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
On exit, u1 is postmultiplied by the left singular vector matrix common to [ b11 ; 0 ] and [ b12 0 0 ; 0 -I 0 ].
u2 REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
On exit, u2 is postmultiplied by the left singular vector matrix common to [ b21 ; 0 ] and [ b22 0 0 ; 0 0 I ].
v1t REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
Array, size (ldv1t,q).
On exit, v1t is premultiplied by the transpose of the right singular vector matrix common to [ b11 ; 0 ] and [ b21 ; 0 ].
v2t REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
On exit, v2t is premultiplied by the transpose of the right singular vector matrix common to [ b12 0 0 ; 0 -I 0 ] and [ b22 0 0 ; 0 0 I ].
b11d REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
Array, size (q).
When ?bbcsd converges, b11d contains the cosines of theta(1), ..., theta(q). If ?bbcsd fails to converge, b11d contains the diagonal of the partially reduced top left block.
b11e REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
Array, size (q-1).
When ?bbcsd converges, b11e contains zeros. If ?bbcsd fails to converge, b11e contains the superdiagonal of the partially reduced top left block.
b12d REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
Array, size (q).
When ?bbcsd converges, b12d contains the negative sines of theta(1), ..., theta(q). If ?bbcsd fails to converge, b12d contains the diagonal of the partially reduced top right block.
b12e REAL for sbbcsd
DOUBLE PRECISION for dbbcsd
COMPLEX for cbbcsd
DOUBLE COMPLEX for zbbcsd
Array, size (q-1).
When ?bbcsd converges, b12e contains zeros. If ?bbcsd fails to converge, b12e contains the superdiagonal of the partially reduced top right block.
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument has an illegal value
> 0: if ?bbcsd did not converge, info specifies the number of nonzero
entries in phi, and b11d, b11e, etc. contain the partially reduced matrix.

Specific details for the routine ?bbcsd interface are as follows:
theta Holds the vector of length q.
phi Holds the vector of length q-1.
u1 Holds the matrix of size (p,p).
u2 Holds the matrix of size (m-p,m-p).
v1t Holds the matrix of size (q,q).
v2t Holds the matrix of size (m-q,m-q).
b11d Holds the vector of length q.
b11e Holds the vector of length q-1.
jobu1 Indicates whether u1 is computed. Must be 'Y' or 'O'.
jobv1t Indicates whether v1t is computed. Must be 'Y' or 'O'.
trans Must be 'N' or 'T'.
See Also
?orcsd/?uncsd
xerbla
?orbdb/?unbdb
Simultaneously bidiagonalizes the blocks of a
partitioned orthogonal/unitary matrix.
Syntax
call sorbdb( trans, signs, m, p, q, x11, ldx11, x12, ldx12, x21, ldx21, x22, ldx22,
theta, phi, taup1, taup2, tauq1, tauq2, work, lwork, info )
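call dorbdb( trans, signs, m, p, q, x11, ldx11, x12, ldx12, x21, ldx21, x22, ldx22,
theta, phi, taup1, taup2, tauq1, tauq2, work, lwork, info )
call cunbdb( trans, signs, m, p, q, x11, ldx11, x12, ldx12, x21, ldx21, x22, ldx22,
theta, phi, taup1, taup2, tauq1, tauq2, work, lwork, info )
call zunbdb( trans, signs, m, p, q, x11, ldx11, x12, ldx12, x21, ldx21, x22, ldx22,
theta, phi, taup1, taup2, tauq1, tauq2, work, lwork, info )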
call orbdb( x11,x12,x21,x22,theta,phi,taup1,taup2,tauq1,tauq2[,trans][,signs][,info] )
call unbdb( x11,x12,x21,x22,theta,phi,taup1,taup2,tauq1,tauq2[,trans][,signs][,info] )
Include Files
mkl.fi, lapack.f90
Description
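The routines ?orbdb/?unbdb simultaneously bidiagonalize the blocks of an m-by-m partitioned orthogonal
matrix X:
$$X = \begin{pmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{pmatrix} = \begin{pmatrix} P_1 & \\ & P_2 \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} & 0 & 0 \\ 0 & 0 & -I & 0 \\ B_{21} & B_{22} & 0 & 0 \\ 0 & 0 & 0 & I \end{pmatrix} \begin{pmatrix} Q_1 & \\ & Q_2 \end{pmatrix}^T$$
or unitary matrix:
$$X = \begin{pmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{pmatrix} = \begin{pmatrix} P_1 & \\ & P_2 \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} & 0 & 0 \\ 0 & 0 & -I & 0 \\ B_{21} & B_{22} & 0 & 0 \\ 0 & 0 & 0 & I \end{pmatrix} \begin{pmatrix} Q_1 & \\ & Q_2 \end{pmatrix}^H$$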
x11 is p-by-q. q must not be larger than p, m-p, or m-q. Otherwise, X must be transposed and/or permuted
in constant time using the trans and signs options.
The orthogonal/unitary matrices p1, p2, q1, and q2 are p-by-p, (m-p)-by-(m-p), q-by-q, and (m-q)-by-(m-q),
respectively. They are represented implicitly by Householder vectors.
The bidiagonal matrices b11, b12, b21, and b22 are q-by-q bidiagonal matrices represented implicitly by angles
theta(1), ..., theta(q) and phi(1), ..., phi(q-1). b11 and b12 are upper bidiagonal, while b21 and b22 are
lower bidiagonal. Every entry in each bidiagonal band is a product of a sine or cosine of theta with a sine or
cosine of phi. See [Sutton09] for details.
p1, p2, q1, and q2 are represented as products of elementary reflectors.
Input Parameters
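trans CHARACTER
= 'T': x11, x12, x21, x22 are stored in row-major order.
otherwise: x11, x12, x21, x22 are stored in column-major order.
signs CHARACTER
= 'O': The lower-left block is made nonpositive (the "other" convention).
otherwise: The upper-right block is made nonpositive (the "default" convention).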
m INTEGER. The number of rows and columns of the matrix X.
p INTEGER. The number of rows in x11 and x12. 0 ≤ p ≤ m.
q INTEGER. The number of columns in x11 and x21. 0 ≤ q ≤ min(p,m-p,m-q).
x11 REAL for sorbdb
DOUBLE PRECISION for dorbdb
COMPLEX for cunbdb
DOUBLE COMPLEX for zunbdb
Array, size (ldx11,*).
On entry, the top-left block of the orthogonal/unitary matrix to be reduced.
ldx11 INTEGER. The leading dimension of the array x11. If trans = 'N', ldx11 ≥ p. Otherwise, ldx11 ≥ q.
x12 REAL for sorbdb
DOUBLE PRECISION for dorbdb
COMPLEX for cunbdb
DOUBLE COMPLEX for zunbdb
Array, size (ldx12,m-q).
On entry, the top-right block of the orthogonal/unitary matrix to be reduced.
ldx12 INTEGER. The leading dimension of the array x12. If trans = 'N', ldx12 ≥ p. Otherwise, ldx12 ≥ m-q.
x21 REAL for sorbdb
DOUBLE PRECISION for dorbdb
COMPLEX for cunbdb
DOUBLE COMPLEX for zunbdb
Array, size (ldx21,q).
On entry, the bottom-left block of the orthogonal/unitary matrix to be reduced.
ldx21 INTEGER. The leading dimension of the array x21. If trans = 'N', ldx21 ≥ m-p. Otherwise, ldx21 ≥ q.
x22 REAL for sorbdb
DOUBLE PRECISION for dorbdb
COMPLEX for cunbdb
DOUBLE COMPLEX for zunbdb
Array, size (ldx22,m-q).
On entry, the bottom-right block of the orthogonal/unitary matrix to be reduced.
ldx22 INTEGER. The leading dimension of the array x22. If trans = 'N', ldx22 ≥ m-p. Otherwise, ldx22 ≥ m-q.
work REAL for sorbdb
DOUBLE PRECISION for dorbdb
COMPLEX for cunbdb
DOUBLE COMPLEX for zunbdb
Workspace array, size (lwork).
lwork INTEGER. The size of the work array. lwork ≥ m-q
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the work array, returns this value as the first entry of the work array, and no error message related to lwork is issued by xerbla.
Output Parameters
x11 On exit, the form depends on trans:
If trans='N', the columns of the lower triangle of x11 specify reflectors for p1, and the rows of the upper triangle of x11(1:q - 1, q:q - 1) specify reflectors for q1.
Otherwise (trans='T'), the rows of the upper triangle of x11 specify reflectors for p1, and the columns of the lower triangle of x11(1:q - 1, q:q - 1) specify reflectors for q1.
x12 On exit, the form depends on trans:
If trans='N', the columns of the upper triangle of x12 specify the first p reflectors for q2.
Otherwise (trans='T'), the columns of the lower triangle of x12 specify the first p reflectors for q2.
x21 On exit, the form depends on trans:
If trans='N', the columns of the lower triangle of x21 specify the reflectors for p2.
Otherwise (trans='T'), the columns of the upper triangle of x21 specify the reflectors for p2.
x22 On exit, the form depends on trans:
If trans='N', the rows of the upper triangle of x22(q+1:m-p,p+1:m-q) specify the last m-p-q reflectors for q2.
Otherwise (trans='T'), the columns of the lower triangle of x22(p+1:m-q,q+1:m-p) specify the last m-p-q reflectors for p2.
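theta REAL for sorbdb
DOUBLE PRECISION for dorbdb
COMPLEX for cunbdb
DOUBLE COMPLEX for zunbdb
Array, size (q). The entries of bidiagonal blocks b11, b12, b21, and b22 can be computed from the angles theta and phi. See the Description section for details.
phi REAL for sorbdb
DOUBLE PRECISION for dorbdb
COMPLEX for cunbdb
DOUBLE COMPLEX for zunbdb
Array, size (q-1). The entries of bidiagonal blocks b11, b12, b21, and b22 can be computed from the angles theta and phi. See the Description section for details.
taup1 REAL for sorbdb
DOUBLE PRECISION for dorbdb
COMPLEX for cunbdb
DOUBLE COMPLEX for zunbdb
Array, size (p).
Scalar factors of the elementary reflectors that define p1.
taup2 REAL for sorbdb
DOUBLE PRECISION for dorbdb
COMPLEX for cunbdb
DOUBLE COMPLEX for zunbdb
Array, size (m-p).
Scalar factors of the elementary reflectors that define p2.
tauq1 REAL for sorbdb
DOUBLE PRECISION for dorbdb
COMPLEX for cunbdb
DOUBLE COMPLEX for zunbdb
Array, size (q).
Scalar factors of the elementary reflectors that define q1.
tauq2 REAL for sorbdb
DOUBLE PRECISION for dorbdb
COMPLEX for cunbdb
DOUBLE COMPLEX for zunbdb
Array, size (m-q).
Scalar factors of the elementary reflectors that define q2.
info INTEGER.
= 0: successful exit
< 0: if info = -i, the i-th argument has an illegal value.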

Specific details for the routine ?orbdb/?unbdb interface are as follows:
x11 Holds the block of matrix X of size (p, q).
x12 Holds the block of matrix X of size (p, m-q).
x21 Holds the block of matrix X of size (m-p, q).
x22 Holds the block of matrix X of size (m-p, m-q).
theta Holds the vector of length q.
phi Holds the vector of length q-1.
taup1 Holds the vector of length p.
taup2 Holds the vector of length m-p.
tauq1 Holds the vector of length q.
tauq2 Holds the vector of length m-q.
signs Must be 'O' or 'D'.
See Also
?orcsd/?uncsd
?orgqr
?ungqr
?orglq
?unglq
xerbla
LAPACK Least Squares and Eigenvalue Problem Driver Routines

Each of the LAPACK driver routines solves a complete problem. To arrive at the solution, driver routines
typically call a sequence of appropriate computational routines.
Driver routines are described in the following sections:
Linear Least Squares (LLS) Problems
Generalized LLS Problems
Symmetric Eigenproblems
Nonsymmetric Eigenproblems
Cosine-Sine Decomposition
Generalized Symmetric Definite Eigenproblems
Generalized Nonsymmetric Eigenproblems
Linear Least Squares (LLS) Problems: LAPACK Driver Routines

This section describes LAPACK driver routines used for solving linear least squares problems. Table "Driver
Routines for Solving LLS Problems" lists all such routines for the FORTRAN 77 interface. The corresponding
routine names in the Fortran 95 interface are without the first symbol.
Driver Routines for Solving LLS Problems
Routine Name Operation performed
gels Uses QR or LQ factorization to solve an overdetermined or underdetermined linear system with full rank matrix.
gelsy Computes the minimum-norm solution to a linear least squares problem using a complete orthogonal factorization of A.
gelss Computes the minimum-norm solution to a linear least squares problem using the singular value decomposition of A.
gelsd Computes the minimum-norm solution to a linear least squares problem using the singular value decomposition of A and a divide and conquer method.
getsls Solves overdetermined or underdetermined real linear systems involving a matrix or its transpose using a tall skinny QR or short wide LQ factorization.
?gels
Uses QR or LQ factorization to solve an overdetermined
or underdetermined linear system with full rank
matrix.
Syntax
call sgels(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call dgels(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call cgels(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call zgels(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call gels(a, b [,trans] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine solves overdetermined or underdetermined real/complex linear systems involving an m-by-n
matrix A, or its transpose/conjugate-transpose, using a QR or LQ factorization of A. It is assumed that A has
full rank.
The following options are provided:
1. If trans = 'N' and m ≥ n: find the least squares solution of an overdetermined system, that is, solve the
least squares problem
minimize ||b - A*x||_2
2. If trans = 'N' and m < n: find the minimum norm solution of an underdetermined system A*X = B.
3. If trans = 'T' or 'C' and m ≥ n: find the minimum norm solution of an underdetermined system A^H*X = B.
4. If trans = 'T' or 'C' and m < n: find the least squares solution of an overdetermined system, that is,
solve the least squares problem
minimize ||b - A^H*x||_2
Several right hand side vectors b and solution vectors x can be handled in a single call; they are formed by
the columns of the right hand side matrix B and the solution matrix X (when the coefficient matrix is A, B is
m-by-nrhs and X is n-by-nrhs; if the coefficient matrix is A^T or A^H, B is n-by-nrhs and X is m-by-nrhs).
Input Parameters

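trans CHARACTER*1. Must be 'N', 'T', or 'C'.
If trans = 'N', the linear system involves matrix A;
If trans = 'T', the linear system involves the transposed matrix A^T (for real flavors only);
If trans = 'C', the linear system involves the conjugate-transposed matrix A^H (for complex flavors only).
m INTEGER. The number of rows of the matrix A (m ≥ 0).
n INTEGER. The number of columns of the matrix A (n ≥ 0).
nrhs INTEGER. The number of right-hand sides; the number of columns in B (nrhs ≥ 0).
a, b, work REAL for sgels
DOUBLE PRECISION for dgels
COMPLEX for cgels
DOUBLE COMPLEX for zgels.
Arrays:
a(lda,*) contains the m-by-n matrix A.
b(ldb,*) contains the matrix B of right hand side vectors.
lda INTEGER. The leading dimension of a; must be at least max(1, m).
ldb INTEGER. The leading dimension of b; must be at least max(1, m, n).
lwork INTEGER. The size of the work array; must be at least min (m, n)+max(1, m, n, nrhs).
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the work array, returns this value as the first entry of the work array, and no error message related to lwork is issued by xerbla.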
Output Parameters
a On exit, overwritten by the factorization data as follows:
if m ≥ n, array a contains the details of the QR factorization of the matrix A as returned by ?geqrf;
if m < n, array a contains the details of the LQ factorization of the matrix A as returned by ?gelqf.
b If info = 0, b is overwritten by the solution vectors, stored columnwise:
if trans = 'N' and m ≥ n, rows 1 to n of b contain the least squares solution vectors; the residual sum of squares for the solution in each column is given by the sum of squares of modulus of elements n+1 to m in that column;
if trans = 'N' and m < n, rows 1 to n of b contain the minimum norm solution vectors;
if trans = 'T' or 'C' and m ≥ n, rows 1 to m of b contain the minimum norm solution vectors;
if trans = 'T' or 'C' and m < n, rows 1 to m of b contain the least squares solution vectors; the residual sum of squares for the solution in each column is given by the sum of squares of modulus of elements m+1 to n in that column.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, the i-th diagonal element of the triangular factor of A is zero, so that A does not have full rank; the least squares solution could not be computed.

Specific details for the routine gels interface are the following:
b Holds the matrix of size max(m,n)-by-nrhs.

If trans = 'N', then, on entry, the size of b is m-by-nrhs.
If trans = 'T', then, on entry, the size of b is n-by-nrhs.
Application Notes
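For better performance, try using lwork = min (m, n)+max(1, m, n, nrhs)*blocksize, where
blocksize is a machine-dependent value (typically, 16 to 64) required for optimum performance of the
blocked algorithm.
If you are in doubt how much workspace to supply, set lwork = -1. The routine then returns immediately and
provides the recommended workspace in the first element of the corresponding array (work). This operation
is called a workspace query.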
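As an illustration of the FORTRAN 77 calling sequence, the following sketch fits a straight line to three points with dgels, using a workspace query (lwork = -1) before the actual solve. The program name, the data, and the helper variable wquery are assumptions made for this example only; the program must be linked against Intel MKL or another LAPACK implementation.

```fortran
program gels_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Illustrative sizes: 3 equations, 2 unknowns, 1 right-hand side.
  integer, parameter :: m = 3, n = 2, nrhs = 1
  integer, parameter :: lda = m, ldb = max(m, n)
  real(dp) :: a(lda, n), b(ldb, nrhs), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: lwork, info
  external :: dgels

  ! Fit y = x1 + x2*t through (t,y) = (1,1), (2,2), (3,2).
  ! a is filled column by column: first column ones, second column t.
  a = reshape([1.0_dp, 1.0_dp, 1.0_dp, 1.0_dp, 2.0_dp, 3.0_dp], [lda, n])
  b(:, 1) = [1.0_dp, 2.0_dp, 2.0_dp]

  ! Workspace query: with lwork = -1 the suggested size is returned in wquery(1).
  call dgels('N', m, n, nrhs, a, lda, b, ldb, wquery, -1, info)
  lwork = max(1, int(wquery(1)))
  allocate(work(lwork))

  ! Actual solve; a and b are overwritten.
  call dgels('N', m, n, nrhs, a, lda, b, ldb, work, lwork, info)
  if (info /= 0) then
    print *, 'dgels failed, info = ', info
  else
    ! m >= n and trans = 'N': rows 1..n of b hold the solution,
    ! rows n+1..m hold the components whose squares sum to the residual.
    print *, 'least squares solution: ', b(1:n, 1)
    print *, 'residual sum of squares:', sum(b(n+1:m, 1)**2)
  end if
end program gels_sketch
```

For this data the normal equations give x = (2/3, 1/2) with a residual sum of squares of 1/6, so the printed values should be close to those numbers.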
?gelsy
Computes the minimum-norm solution to a linear least
squares problem using a complete orthogonal
factorization of A.
Syntax
call sgelsy(m, n, nrhs, a, lda, b, ldb, jpvt, rcond, rank, work, lwork, info)
call dgelsy(m, n, nrhs, a, lda, b, ldb, jpvt, rcond, rank, work, lwork, info)
call cgelsy(m, n, nrhs, a, lda, b, ldb, jpvt, rcond, rank, work, lwork, rwork, info)
call zgelsy(m, n, nrhs, a, lda, b, ldb, jpvt, rcond, rank, work, lwork, rwork, info)
call gelsy(a, b [,rank] [,jpvt] [,rcond] [,info])
Include Files
mkl.fi, lapack.f90
Description
The ?gelsy routine computes the minimum-norm solution to a real/complex linear least squares problem:
minimize ||b - A*x||_2
using a complete orthogonal factorization of A. A is an m-by-n matrix which may be rank-deficient. Several
right hand side vectors b and solution vectors x can be handled in a single call; they are stored as the
columns of the m-by-nrhs right hand side matrix B and the n-by-nrhs solution matrix X.
The routine first computes a QR factorization with column pivoting:
$$A P = Q \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix}$$
with R11 defined as the largest leading submatrix whose estimated condition number is less than 1/rcond.
The order of R11, rank, is the effective rank of A. Then, R22 is considered to be negligible, and R12 is
annihilated by orthogonal/unitary transformations from the right, arriving at the complete orthogonal
factorization:
$$A P = Q \begin{pmatrix} T_{11} & 0 \\ 0 & 0 \end{pmatrix} Z$$
The minimum-norm solution is then
$$x = P Z^T \begin{pmatrix} T_{11}^{-1} Q_1^T b \\ 0 \end{pmatrix}$$
for real flavors and
$$x = P Z^H \begin{pmatrix} T_{11}^{-1} Q_1^H b \\ 0 \end{pmatrix}$$
for complex flavors,
where Q1 consists of the first rank columns of Q.
The ?gelsy routine is identical to the original deprecated ?gelsx routine except for the following
differences:
The call to the subroutine ?geqpf has been substituted by the call to the subroutine ?geqp3, which is a
BLAS-3 version of the QR factorization with column pivoting.
The matrix B (the right hand side) is updated with BLAS-3.
The permutation of the matrix B (the right hand side) is faster and simpler.
Input Parameters

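m INTEGER. The number of rows of the matrix A (m ≥ 0).
n INTEGER. The number of columns of the matrix A (n ≥ 0).
nrhs INTEGER. The number of right-hand sides; the number of columns in B (nrhs ≥ 0).
a, b, work REAL for sgelsy
DOUBLE PRECISION for dgelsy
COMPLEX for cgelsy
DOUBLE COMPLEX for zgelsy.
Arrays:
a(lda,*) contains the m-by-n matrix A.
b(ldb,*) contains the m-by-nrhs right hand side matrix B.
jpvt INTEGER.
On entry, if jpvt(i) ≠ 0, the i-th column of A is permuted to the front of AP, otherwise the i-th column of A is a free column.
rcond REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
rcond is used to determine the effective rank of A, which is defined as the order of the largest leading triangular submatrix R11 in the QR factorization with pivoting of A, whose estimated condition number < 1/rcond.
lwork INTEGER. The size of the work array.
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the work array, returns this value as the first entry of the work array, and no error message related to lwork is issued by xerbla. See Application Notes for the suggested value of lwork.
rwork REAL for cgelsy
DOUBLE PRECISION for zgelsy.
Workspace array, size at least max(1, 2*n). Used in complex flavors only.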
Output Parameters
a On exit, overwritten by the details of the complete orthogonal factorization

of A.
b Overwritten by the n-by-nrhs solution matrix X.
jpvt On exit, if jpvt(i)= k, then the i-th column of AP was the k-th column of A.
rank INTEGER. The effective rank of A, that is, the order of the submatrix R11.
This is the same as the order of the submatrix T11 in the complete
orthogonal factorization of A.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
Specific details for the routine gelsy interface are the following:
b Holds the matrix of size max(m,n)-by-nrhs. On entry, contains the m-by-nrhs right hand side matrix B. On exit, overwritten by the n-by-nrhs solution matrix X.
jpvt Holds the vector of length n. Default value for this element is jpvt(i) = 0.
rcond Default value for this element is rcond = 100*EPSILON(1.0_WP).
Application Notes
For real flavors:
The unblocked strategy requires that:
lwork ≥ max( mn+3*n+1, 2*mn + nrhs ),
where mn = min( m, n ).
The block algorithm requires that:
lwork ≥ max( mn+2*n+nb*(n+1), 2*mn+nb*nrhs ),
where nb is an upper bound on the blocksize returned by ilaenv for the routines sgeqp3/dgeqp3, stzrzf/
dtzrzf, stzrqf/dtzrqf, sormqr/dormqr, and sormrz/dormrz.
For complex flavors:
The unblocked strategy requires that:
lwork ≥ mn + max( 2*mn, n+1, mn + nrhs ),
where mn = min( m, n ).
The block algorithm requires that:
lwork ≥ mn + max( 2*mn, nb*(n+1), mn+mn*nb, mn+nb*nrhs ),
where nb is an upper bound on the blocksize returned by ilaenv for the routines cgeqp3/zgeqp3, ctzrzf/
ztzrzf, ctzrqf/ztzrqf, cunmqr/zunmqr, and cunmrz/zunmrz.
If you are in doubt how much workspace to supply, set lwork = -1. The routine then returns immediately and
provides the recommended workspace in the first element of the corresponding array (work). This operation
is called a workspace query.
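The unblocked workspace bounds above can be evaluated directly. The sketch below does so for one illustrative problem size; the function names min_lwork_real and min_lwork_complex and the chosen values of m, n, and nrhs are hypothetical and are not part of the library.

```fortran
program gelsy_lwork_sketch
  implicit none
  ! Hypothetical problem size, chosen only for illustration.
  integer, parameter :: m = 5, n = 3, nrhs = 2

  print *, 'real flavors, unblocked:   ', min_lwork_real(m, n, nrhs)
  print *, 'complex flavors, unblocked:', min_lwork_complex(m, n, nrhs)

contains

  ! Unblocked bound for sgelsy/dgelsy: max(mn + 3*n + 1, 2*mn + nrhs).
  pure integer function min_lwork_real(m, n, nrhs) result(lw)
    integer, intent(in) :: m, n, nrhs
    integer :: mn
    mn = min(m, n)
    lw = max(mn + 3*n + 1, 2*mn + nrhs)
  end function min_lwork_real

  ! Unblocked bound for cgelsy/zgelsy: mn + max(2*mn, n + 1, mn + nrhs).
  pure integer function min_lwork_complex(m, n, nrhs) result(lw)
    integer, intent(in) :: m, n, nrhs
    integer :: mn
    mn = min(m, n)
    lw = mn + max(2*mn, n + 1, mn + nrhs)
  end function min_lwork_complex

end program gelsy_lwork_sketch
```

With m = 5, n = 3, nrhs = 2 (so mn = 3), the two bounds evaluate to 13 and 9. In practice a workspace query is usually the simpler way to size work, since it also accounts for the blocked algorithm.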
?gelss
Computes the minimum-norm solution to a linear least
squares problem using the singular value
decomposition of A.
Syntax
call sgelss(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, info)
call dgelss(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, info)
call cgelss(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, rwork, info)
call zgelss(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, rwork, info)
call gelss(a, b [,rank] [,s] [,rcond] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the minimum norm solution to a real linear least squares problem:
minimize ||b - A*x||_2
using the singular value decomposition (SVD) of A. A is an m-by-n matrix which may be rank-deficient.
Several right hand side vectors b and solution vectors x can be handled in a single call; they are stored as
the columns of the m-by-nrhs right hand side matrix B and the n-by-nrhs solution matrix X. The effective
rank of A is determined by treating as zero those singular values which are less than rcond times the largest
singular value.
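The effective-rank rule can be sketched in a few lines. The singular values and the rcond value below are made-up illustrative data, not output of ?gelss; the variable eff_rank is a name chosen for this example.

```fortran
program effective_rank_sketch
  implicit none
  ! Made-up singular values in decreasing order, as ?gelss would return them in s.
  real :: s(4) = [10.0, 2.0, 1.0e-3, 1.0e-9]
  real :: rcond = 1.0e-6
  integer :: eff_rank

  ! Singular values not greater than rcond*s(1) are treated as zero,
  ! so the effective rank counts the values above that threshold.
  eff_rank = count(s > rcond * s(1))
  print *, 'threshold      =', rcond * s(1)
  print *, 'effective rank =', eff_rank
end program effective_rank_sketch
```

Here the threshold is 1.0e-5, so the first three values count and the effective rank is 3.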
Input Parameters

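m INTEGER. The number of rows of the matrix A (m ≥ 0).
n INTEGER. The number of columns of the matrix A (n ≥ 0).
nrhs INTEGER. The number of right-hand sides; the number of columns in B (nrhs ≥ 0).
a, b, work REAL for sgelss
DOUBLE PRECISION for dgelss
COMPLEX for cgelss
DOUBLE COMPLEX for zgelss.
Arrays:
a(lda,*) contains the m-by-n matrix A.
b(ldb,*) contains the m-by-nrhs right hand side matrix B.
rcond REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
rcond is used to determine the effective rank of A. Singular values s(i) ≤ rcond*s(1) are treated as zero.
If rcond < 0, machine precision is used instead.
lwork INTEGER. The size of the work array.
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the work array, returns this value as the first entry of the work array, and no error message related to lwork is issued by xerbla.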
rwork REAL for cgelss

DOUBLE PRECISION for zgelss.
Workspace array used in complex flavors only. size at least max(1,
5*min(m, n)).
Output Parameters
a On exit, the first min(m, n) rows of a are overwritten with the matrix of right singular vectors of A, stored row-wise.
b Overwritten by the n-by-nrhs solution matrix X.
If m ≥ n and rank = n, the residual sum-of-squares for the solution in the i-th column is given by the sum of squares of modulus of elements n+1:m in that column.
s REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Array, size at least max(1, min(m, n)). The singular values of A in decreasing order. The condition number of A in the 2-norm is
k2(A) = s(1)/s(min(m, n)).
rank INTEGER. The effective rank of A, that is, the number of singular values which are greater than rcond*s(1).
work(1) If info = 0, on exit, work(1) contains the minimum value of lwork required for optimum performance.
info INTEGER.
If info = i, then the algorithm for computing the SVD failed to converge; i
indicates the number of off-diagonal elements of an intermediate bidiagonal
form which did not converge to zero.

Specific details for the routine gelss interface are the following:

s Holds the vector of length min(m,n).
Application Notes
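For real flavors:
lwork ≥ 3*min(m, n) + max( 2*min(m, n), max(m, n), nrhs)
For complex flavors:
lwork ≥ 2*min(m, n) + max(m, n, nrhs)
For good performance, lwork should generally be larger.
If you are in doubt how much workspace to supply, set lwork = -1. The routine then returns immediately and
provides the recommended workspace in the first element of the corresponding array (work). This operation
is called a workspace query.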
?gelsd
Computes the minimum-norm solution to a linear least
squares problem using the singular value
decomposition of A and a divide and conquer method.
Syntax
call sgelsd(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, iwork, info)
call dgelsd(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, iwork, info)
call cgelsd(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, rwork, iwork,
info)
call zgelsd(m, n, nrhs, a, lda, b, ldb, s, rcond, rank, work, lwork, rwork, iwork,
info)
call gelsd(a, b [,rank] [,s] [,rcond] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the minimum-norm solution to a real linear least squares problem:
minimize ||b - A*x||_2
using the singular value decomposition (SVD) of A. A is an m-by-n matrix which may be rank-deficient.
Several right hand side vectors b and solution vectors x can be handled in a single call; they are stored as
the columns of the m-by-nrhs right hand side matrix B and the n-by-nrhs solution matrix X.
The problem is solved in three steps:
1. Reduce the coefficient matrix A to bidiagonal form with Householder transformations, reducing the
original problem into a "bidiagonal least squares problem" (BLS).
2. Solve the BLS using a divide and conquer approach.
3. Apply back all the Householder transformations to solve the original least squares problem.
The effective rank of A is determined by treating as zero those singular values which are less than rcond
times the largest singular value.
The routine uses auxiliary routines lals0 and lalsa.
Input Parameters

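m INTEGER. The number of rows of the matrix A (m ≥ 0).
n INTEGER. The number of columns of the matrix A (n ≥ 0).
nrhs INTEGER. The number of right-hand sides; the number of columns in B (nrhs ≥ 0).
a, b, work REAL for sgelsd
DOUBLE PRECISION for dgelsd
COMPLEX for cgelsd
DOUBLE COMPLEX for zgelsd.
Arrays:
a(lda,*) contains the m-by-n matrix A.
b(ldb,*) contains the m-by-nrhs right hand side matrix B.
rcond REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
rcond is used to determine the effective rank of A. Singular values s(i) ≤ rcond*s(1) are treated as zero. If rcond ≤ 0, machine precision is used instead.
lwork INTEGER. The size of the work array.
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the array work and the minimum sizes of the arrays rwork and iwork, and returns these values as the first entries of the work, rwork and iwork arrays, and no error message related to lwork is issued by xerbla.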
iwork INTEGER. Workspace array. See Application Notes for the suggested
dimension of iwork.
rwork REAL for cgelsd

DOUBLE PRECISION for zgelsd.
Workspace array, used in complex flavors only. See Application Notes for
the suggested dimension of rwork.
Output Parameters
a On exit, A has been overwritten.
b Overwritten by the n-by-nrhs solution matrix X.
If m ≥ n and rank = n, the residual sum-of-squares for the solution in the i-th column is given by the sum of squares of modulus of elements n+1:m in that column.
s REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Array, size at least max(1, min(m, n)). The singular values of A in decreasing order. The condition number of A in the 2-norm is
k2(A) = s(1)/s(min(m, n)).
rank INTEGER. The effective rank of A, that is, the number of singular values which are greater than rcond*s(1).
work(1) If info = 0, on exit, work(1) contains the minimum value of lwork required for optimum performance.
rwork(1) If info = 0, on exit, rwork(1) returns the minimum size of the workspace array rwork required for optimum performance.
iwork(1) If info = 0, on exit, iwork(1) returns the minimum size of the workspace array iwork required for optimum performance.
info INTEGER.
If info = i, then the algorithm for computing the SVD failed to converge; i
indicates the number of off-diagonal elements of an intermediate bidiagonal
form that did not converge to zero.

Specific details for the routine gelsd interface are the following:

s Holds the vector of length min(m,n).
Application Notes
The divide and conquer algorithm makes very mild assumptions about floating point arithmetic. It will work
on machines with a guard digit in add/subtract. It could conceivably fail on hexadecimal or decimal machines
without guard digits, but we know of none.
The exact minimum amount of workspace needed depends on m, n and nrhs. The size lwork of the
workspace array work must be as given below.
For real flavors:
If m ≥ n,
lwork ≥ 12*n + 2*n*smlsiz + 8*n*nlvl + n*nrhs + (smlsiz+1)^2;
If m < n,
lwork ≥ 12*m + 2*m*smlsiz + 8*m*nlvl + m*nrhs + (smlsiz+1)^2;
For complex flavors:
If m ≥ n,
lwork ≥ 2*n + n*nrhs;
If m < n,
lwork ≥ 2*m + m*nrhs;
where smlsiz is returned by ilaenv and is equal to the maximum size of the subproblems at the bottom of
the computation tree (usually about 25), and
nlvl = INT( log2( min( m, n )/(smlsiz+1)) ) + 1.
For good performance, lwork should generally be larger.
If you are in doubt how much workspace to supply, set lwork = -1. The routine then returns immediately and
provides the recommended workspace in the first element of the corresponding array (work). This operation
is called a workspace query.
The dimension of the workspace array iwork must be at least
3*min( m, n )*nlvl + 11*min( m, n ).
The dimension of the workspace array rwork (for complex flavors) must be at least max(1, lrwork), where
lrwork ≥ 10*n + 2*n*smlsiz + 8*n*nlvl + 3*smlsiz*nrhs + (smlsiz+1)^2 if m ≥ n, and
lrwork ≥ 10*m + 2*m*smlsiz + 8*m*nlvl + 3*smlsiz*nrhs + (smlsiz+1)^2 if m < n.
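As a worked example of the real-flavor bounds, assume m = 1000, n = 100, nrhs = 1, and smlsiz = 25. These values are chosen only for illustration; the actual smlsiz is whatever ilaenv returns on a given system.

```latex
\begin{aligned}
\mathit{nlvl} &= \left\lfloor \log_2\!\frac{\min(m,n)}{\mathit{smlsiz}+1} \right\rfloor + 1
               = \left\lfloor \log_2\!\frac{100}{26} \right\rfloor + 1
               = \lfloor 1.94\ldots \rfloor + 1 = 2, \\
\mathit{lwork} &\ge 12n + 2n\,\mathit{smlsiz} + 8n\,\mathit{nlvl} + n\,\mathit{nrhs} + (\mathit{smlsiz}+1)^2 \\
               &= 1200 + 5000 + 1600 + 100 + 676 = 8576, \\
\mathit{liwork} &\ge 3\min(m,n)\,\mathit{nlvl} + 11\min(m,n) = 600 + 1100 = 1700.
\end{aligned}
```

Here liwork is a name used in this example for the dimension of iwork. These are minimum sizes under the stated assumptions; a workspace query returns the sizes to use for good performance.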
?getsls
Uses QR or LQ factorization to solve an
overdetermined or underdetermined linear system
with full rank matrix, with best performance for tall
and skinny matrices.
Syntax
call sgetsls(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call dgetsls(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call cgetsls(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
call zgetsls(trans, m, n, nrhs, a, lda, b, ldb, work, lwork, info)
Description
The routine solves overdetermined or underdetermined real/complex linear systems involving an m-by-n
matrix A, or its transpose/conjugate-transpose, using a ?geqr or ?gelq factorization of A. It is assumed that
A has full rank.
The following options are provided:
1. If trans = 'N' and m ≥ n: find the least squares solution of an overdetermined system, that is, solve the
least squares problem
minimize ||b - A*x||_2
2. If trans = 'N' and m < n: find the minimum norm solution of an underdetermined system A*X = B.
3. If trans = 'T' or 'C' and m ≥ n: find the minimum norm solution of an underdetermined system A^H*X = B.
4. If trans = 'T' or 'C' and m < n: find the least squares solution of an overdetermined system, that is,
solve the least squares problem
minimize ||b - A^H*x||_2
Several right hand side vectors b and solution vectors x can be handled in a single call; they are formed by
the columns of the right hand side matrix B and the solution matrix X (when the coefficient matrix is A, B is
m-by-nrhs and X is n-by-nrhs; if the coefficient matrix is A^T or A^H, B is n-by-nrhs and X is m-by-nrhs).
Input Parameters

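trans CHARACTER*1. Must be 'N', 'T', or 'C'.
If trans = 'N', the linear system involves matrix A;
If trans = 'T', the linear system involves the transposed matrix A^T (for real flavors only);
If trans = 'C', the linear system involves the conjugate-transposed matrix A^H (for complex flavors only).
m INTEGER. The number of rows of the matrix A (m ≥ 0).
n INTEGER. The number of columns of the matrix A (n ≥ 0).
nrhs INTEGER. The number of right-hand sides; the number of columns in B (nrhs ≥ 0).
a REAL for sgetsls
DOUBLE PRECISION for dgetsls
COMPLEX for cgetsls
COMPLEX*16 for zgetsls
Array a(lda,*) contains the m-by-n matrix A.
lda INTEGER. The leading dimension of the array a. lda ≥ max(1,m).
b REAL for sgetsls
DOUBLE PRECISION for dgetsls
COMPLEX for cgetsls
COMPLEX*16 for zgetsls
Array b(ldb,*) contains the matrix B of right hand side vectors.
ldb INTEGER. The leading dimension of the array b. ldb ≥ max(1,m,n).
lwork INTEGER. The size of the work array; must be at least min (m, n)+max(1, m, n, nrhs).
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the work array, returns this value as the first entry of the work array, and no error message related to lwork is issued by xerbla.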
Output Parameters
a On exit, overwritten by the factorization data as follows:
if m ≥ n, array a contains the details of the QR factorization of the matrix A as returned by ?geqr;
if m < n, array a contains the details of the LQ factorization of the matrix A as returned by ?gelq.
b If info = 0, b is overwritten by the solution vectors, stored columnwise:
if trans = 'N' and m ≥ n, rows 1 to n of b contain the least squares solution vectors; the residual sum of squares for the solution in each column is given by the sum of squares of modulus of elements n+1 to m in that column;
if trans = 'N' and m < n, rows 1 to n of b contain the minimum norm solution vectors;
if trans = 'T' or 'C' and m ≥ n, rows 1 to m of b contain the minimum norm solution vectors;
if trans = 'T' or 'C' and m < n, rows 1 to m of b contain the least squares solution vectors; the residual sum of squares for the solution in each column is given by the sum of squares of modulus of elements m+1 to n in that column.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, the i-th diagonal element of the triangular factor of A is zero, so that A does not have full rank; the least squares solution could not be computed.
Generalized Linear Least Squares (LLS) Problems: LAPACK Driver Routines

This section describes LAPACK driver routines used for solving generalized linear least squares problems.
Table "Driver Routines for Solving Generalized LLS Problems" lists all such routines. The corresponding
routine names in the Fortran 95 interface are without the first symbol.
Driver Routines for Solving Generalized LLS Problems
Routine Name Operation performed
gglse Solves the linear equality-constrained least squares problem using a generalized RQ factorization.
ggglm Solves a general Gauss-Markov linear model problem using a generalized QR factorization.
?gglse
Solves the linear equality-constrained least squares
problem using a generalized RQ factorization.
Syntax
call sgglse(m, n, p, a, lda, b, ldb, c, d, x, work, lwork, info)
call dgglse(m, n, p, a, lda, b, ldb, c, d, x, work, lwork, info)
call cgglse(m, n, p, a, lda, b, ldb, c, d, x, work, lwork, info)
call zgglse(m, n, p, a, lda, b, ldb, c, d, x, work, lwork, info)
call gglse(a, b, c, d, x [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine solves the linear equality-constrained least squares (LSE) problem:
minimize ||c - A*x||_2 subject to B*x = d
where A is an m-by-n matrix, B is a p-by-n matrix, c is a given m-vector, and d is a given p-vector. It is
assumed that p ≤ n ≤ m+p, and
$$\operatorname{rank}(B) = p \quad\text{and}\quad \operatorname{rank}\begin{pmatrix} A \\ B \end{pmatrix} = n.$$
These conditions ensure that the LSE problem has a unique solution, which is obtained using a generalized
RQ factorization of the matrices (B, A) given by
B=(0 R)*Q, A=Z*T*Q
Input Parameters
p INTEGER. The number of rows of the matrix B

(0 pnm+p).
a, b, c, d, work REAL for sgglse

DOUBLE PRECISION for dgglse
COMPLEX for cgglse
DOUBLE COMPLEX for zgglse.
Arrays:
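b(ldb,*) contains the p-by-n matrix B.
c(*), size at least max(1, m), contains the right hand side vector for the
least squares part of the LSE problem.
d(*), size at least max(1, p), contains the right hand side vector for the
constrained equation.

lwork INTEGER. The size of the work array; lwork ≥ max(1, m+n+p).
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the
work array, returns this value as the first entry of the work array, and no error message related to
lwork is issued by xerbla.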
Output Parameters
a The elements on and above the diagonal contain the min(m, n)-by-n upper
trapezoidal matrix T as returned by ?ggrqf.
x REAL for sgglse

The solution of the LSE problem.
b On exit, the upper right triangle of the subarray b(1:p, n-p+1:n) contains
the p-by-p upper triangular matrix R as returned by ?ggrqf.
d On exit, d is destroyed.
c On exit, the residual sum-of-squares for the solution is given by the sum of
squares of elements n-p+1 to m of vector c.

info INTEGER.
If info = 1, the upper triangular factor R associated with B in the
generalized RQ factorization of the pair (B, A) is singular, so that rank(B)
< p; the least squares solution could not be computed.
If info = 2, the (n-p)-by-(n-p) part of the upper trapezoidal factor T
associated with A in the generalized RQ factorization of the pair (B, A) is
singular, so that rank([A; B]) < n; the least squares solution could not be computed.

Specific details for the routine gglse interface are the following:
c Holds the vector of length (m).
d Holds the vector of length (p).
x Holds the vector of length n.
Application Notes
For optimum performance, use
lwork ≥ p + min(m, n) + max(m, n)*nb,
where nb is an upper bound for the optimal blocksizes for ?geqrf, ?gerqf, ?ormqr/?unmqr and ?ormrq/?unmrq.
You may set lwork to -1. The routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
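The workspace query and the call sequence described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the double-precision FORTRAN 77 interface (dgglse) and a program linked against the library; the program name, the variable wquery, and the data are hypothetical. The data are chosen so that the exact fit x = (1, -1, 1) also satisfies the constraint.

```fortran
program gglse_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 4, n = 3, p = 1      ! p <= n <= m+p
  real(dp) :: a(m,n), b(p,n), c(m), d(p), x(n), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: lwork, info
  external :: dgglse

  ! Illustrative data: fit c ~ x1 + x2*t + x3*t**2 at t = 1..4,
  ! subject to the single constraint x1 + x2 + x3 = 1.
  a(:,1) = 1.0_dp
  a(:,2) = [1.0_dp, 2.0_dp, 3.0_dp, 4.0_dp]
  a(:,3) = a(:,2)**2
  c      = [1.0_dp, 3.0_dp, 7.0_dp, 13.0_dp]
  b(1,:) = 1.0_dp
  d(1)   = 1.0_dp

  ! Workspace query: lwork = -1 returns the recommended size in wquery(1).
  lwork = -1
  call dgglse(m, n, p, a, m, b, p, c, d, x, wquery, lwork, info)
  lwork = max(1, m + n + p, int(wquery(1)))
  allocate(work(lwork))

  ! Actual solve; a, b, c and d are overwritten on exit.
  call dgglse(m, n, p, a, m, b, p, c, d, x, work, lwork, info)
  if (info /= 0) then
     print *, 'dgglse failed, info =', info
  else
     print *, 'x =', x
     ! Residual sum of squares from elements n-p+1 to m of c.
     print *, 'residual sum of squares =', sum(c(n-p+1:m)**2)
  end if
end program gglse_sketch
```

Taking the maximum of the queried size and the documented minimum is a defensive choice for the sketch; the queried value alone is normally sufficient.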
?ggglm
Solves a general Gauss-Markov linear model problem
using a generalized QR factorization.
Syntax
call sggglm(n, m, p, a, lda, b, ldb, d, x, y, work, lwork, info)
call dggglm(n, m, p, a, lda, b, ldb, d, x, y, work, lwork, info)
call cggglm(n, m, p, a, lda, b, ldb, d, x, y, work, lwork, info)
call zggglm(n, m, p, a, lda, b, ldb, d, x, y, work, lwork, info)
call ggglm(a, b, d, x, y [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine solves a general Gauss-Markov linear model (GLM) problem:
minimize_x ||y||_2 subject to d = A*x + B*y

where A is an n-by-m matrix, B is an n-by-p matrix, and d is a given n-vector. It is assumed that m ≤ n ≤ m+p,
rank(A) = m, and rank([A B]) = n, where [A B] denotes A and B placed side by side.
Under these assumptions, the constrained equation is always consistent, and there is a unique solution x and
a minimal 2-norm solution y, which is obtained using a generalized QR factorization of the matrices (A, B)
given by
A = Q*[R; 0], B = Q*T*Z
In particular, if matrix B is square nonsingular, then the problem GLM is equivalent to the following weighted
linear least squares problem
minimize_x ||B^(-1)*(d - A*x)||_2.
Input Parameters
n INTEGER. The number of rows of the matrices A and B (n ≥ 0).
m INTEGER. The number of columns in A (m ≥ 0).
p INTEGER. The number of columns in B (p ≥ n - m).
a, b, d, work REAL for sggglm

DOUBLE PRECISION for dggglm
COMPLEX for cggglm
DOUBLE COMPLEX for zggglm.
Arrays:
a(lda,*) contains the n-by-m matrix A.
b(ldb,*) contains the n-by-p matrix B.
The second dimension of b must be at least max(1, p).
d(*), size at least max(1, n), contains the left hand side of the GLM
equation.
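lwork INTEGER. The size of the work array; lwork ≥ max(1, n+m+p).
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the
work array, returns this value as the first entry of the work array, and no error message related to
lwork is issued by xerbla.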
Output Parameters
x, y REAL for sggglm

DOUBLE PRECISION for dggglm
COMPLEX for cggglm
DOUBLE COMPLEX for zggglm.
Arrays x(*), y(*), of size at least max(1, m) for x and at least max(1, p) for y.
On exit, x and y are the solutions of the GLM problem.
a On exit, the upper triangular part of the array a contains the m-by-m upper
triangular matrix R.
b On exit, if n ≤ p, the upper right triangle of the subarray b(1:n,p-n+1:p)
contains the n-by-n upper triangular matrix T as returned by ?ggqrf; if n >
p, the elements on and above the (n-p)-th subdiagonal contain the n-by-p
upper trapezoidal matrix T.
d On exit, d is destroyed.

info INTEGER.
If info = 1, the upper triangular factor R associated with A in the
generalized QR factorization of the pair (A, B) is singular, so that rank(A)
< m; the least squares solution could not be computed.
If info = 2, the bottom (n-m)-by-(n-m) part of the upper trapezoidal
factor T associated with B in the generalized QR factorization of the pair (A,
B) is singular, so that rank([A B]) < n; the least squares solution could not
be computed.

Specific details for the routine ggglm interface are the following:
a Holds the matrix A of size (n,m).
b Holds the matrix B of size (n,p).
x Holds the vector of length (m).
y Holds the vector of length (p).
Application Notes
For optimum performance, use
lwork ≥ m + min(n, p) + max(n, p)*nb,
where nb is an upper bound for the optimal blocksizes for ?geqrf, ?gerqf, ?ormqr/?unmqr and ?ormrq/?unmrq.
You may set lwork to -1. The routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work). This operation is called a workspace query.
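The special case noted in the Description, where B is square and nonsingular and the GLM problem reduces to a weighted least squares problem, can be sketched as follows. This is a hedged illustration assuming the double-precision FORTRAN 77 interface (dggglm); the program name, the variable wquery, the diagonal choice of B, and the data are hypothetical.

```fortran
program ggglm_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, m = 2, p = 3      ! m <= n <= m+p
  real(dp) :: a(n,m), b(n,p), d(n), x(m), y(p), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: i, lwork, info
  external :: dggglm

  ! Illustrative model d = A*x + B*y with a straight-line design matrix A.
  a(:,1) = 1.0_dp
  a(:,2) = [1.0_dp, 2.0_dp, 3.0_dp]
  d      = [1.0_dp, 2.0_dp, 4.0_dp]

  ! B is diagonal and nonsingular, so minimizing ||y||_2 is the same as
  ! minimizing ||B^(-1)*(d - A*x)||_2 (a weighted least squares problem).
  b = 0.0_dp
  do i = 1, n
     b(i,i) = real(i, dp)
  end do

  ! Workspace query, then the actual solve.
  lwork = -1
  call dggglm(n, m, p, a, n, b, n, d, x, y, wquery, lwork, info)
  lwork = max(1, n + m + p, int(wquery(1)))
  allocate(work(lwork))

  call dggglm(n, m, p, a, n, b, n, d, x, y, work, lwork, info)
  if (info /= 0) then
     print *, 'dggglm failed, info =', info
  else
     print *, 'x =', x
     print *, 'y =', y
     print *, '||y||_2 =', sqrt(sum(y**2))
  end if
end program ggglm_sketch
```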
Symmetric Eigenvalue Problems: LAPACK Driver Routines

This section describes LAPACK driver routines used for solving symmetric eigenvalue problems. See also
computational routines that can be called to solve these problems. Table "Driver Routines for Solving
Symmetric Eigenproblems" lists all such driver routines for the FORTRAN 77 interface. The corresponding
routine names in the Fortran 95 interface are without the first symbol.
Driver Routines for Solving Symmetric Eigenproblems
syev/heev Computes all eigenvalues and, optionally, eigenvectors of a real symmetric / Hermitian matrix.
syevd/heevd Computes all eigenvalues and (optionally) all eigenvectors of a real symmetric / Hermitian matrix using divide and conquer algorithm.
syevx/heevx Computes selected eigenvalues and, optionally, eigenvectors of a symmetric / Hermitian matrix.
syevr/heevr Computes selected eigenvalues and, optionally, eigenvectors of a real symmetric / Hermitian matrix using the Relatively Robust Representations.
spev/hpev Computes all eigenvalues and, optionally, eigenvectors of a real symmetric / Hermitian matrix in packed storage.
spevd/hpevd Uses divide and conquer algorithm to compute all eigenvalues and (optionally) all eigenvectors of a real symmetric / Hermitian matrix held in packed storage.
spevx/hpevx Computes selected eigenvalues and, optionally, eigenvectors of a real symmetric / Hermitian matrix in packed storage.
sbev/hbev Computes all eigenvalues and, optionally, eigenvectors of a real symmetric / Hermitian band matrix.
sbevd/hbevd Computes all eigenvalues and (optionally) all eigenvectors of a real symmetric / Hermitian band matrix using divide and conquer algorithm.
sbevx/hbevx Computes selected eigenvalues and, optionally, eigenvectors of a real symmetric / Hermitian band matrix.
stev Computes all eigenvalues and, optionally, eigenvectors of a real symmetric tridiagonal matrix.
stevd Computes all eigenvalues and (optionally) all eigenvectors of a real symmetric tridiagonal matrix using divide and conquer algorithm.
stevx Computes selected eigenvalues and eigenvectors of a real symmetric tridiagonal matrix.
stevr Computes selected eigenvalues and, optionally, eigenvectors of a real symmetric tridiagonal matrix using the Relatively Robust Representations.
?syev
Computes all eigenvalues and, optionally,
eigenvectors of a real symmetric matrix.
Syntax
call ssyev(jobz, uplo, n, a, lda, w, work, lwork, info)
call dsyev(jobz, uplo, n, a, lda, w, work, lwork, info)
call syev(a, w [,jobz] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a real symmetric matrix A.
Note that for most cases of real symmetric eigenvalue problems the default choice should be the syevr
function, as its underlying algorithm is faster and uses less workspace.
Input Parameters


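a, work REAL for ssyev
DOUBLE PRECISION for dsyev
Arrays:
a(lda,*) contains either the upper or the lower triangular part of the
symmetric matrix A, as specified by uplo.
lda INTEGER. The leading dimension of the array a.
Must be at least max(1, n).
lwork INTEGER.
The dimension of the array work.
Constraint: lwork ≥ max(1, 3n-1).
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the
work array, returns this value as the first entry of the work array, and no error message related to
lwork is issued by xerbla.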
Output Parameters
a On exit, if jobz = 'V', then if info = 0, array a contains the orthonormal

eigenvectors of the matrix A.
If jobz = 'N', then on exit the lower triangle
(if uplo = 'L') or the upper triangle (if uplo = 'U') of A, including the
diagonal, is overwritten.
w REAL for ssyev

DOUBLE PRECISION for dsyev
If info = 0, contains the eigenvalues of the matrix A in ascending order.
work(1) On exit, if lwork > 0, then work(1) returns the required minimal size of
lwork.
info INTEGER.
If info = i, then the algorithm failed to converge; i indicates the number
of off-diagonal elements of an intermediate tridiagonal form which did not
converge to zero.

Specific details for the routine syev interface are the following:
jobz Must be 'N' or 'V'. The default value is 'N'.
Application Notes
For optimum performance set lwork ≥ (nb+2)*n, where nb is the blocksize for ?sytrd returned by ilaenv.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If lwork has any of the admissible sizes, that is, no less than the minimal value described, then the routine
completes the task, though probably not as fast as with the recommended workspace, and provides the
recommended workspace in the first element of the corresponding array on exit. Use this value (work(1))
for subsequent runs.
If lwork = -1, the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array work. This operation is called a workspace query.
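As a hedged sketch of the calling sequence and the workspace query, the following program computes all eigenvalues and eigenvectors of a small matrix. It assumes the double-precision FORTRAN 77 interface (dsyev); the program name, the variable wquery, and the test matrix are illustrative choices.

```fortran
program syev_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n,n), w(n), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: j, lwork, info
  external :: dsyev

  ! Upper triangle of the symmetric tridiagonal matrix with 2 on the
  ! diagonal and -1 next to it; the strictly lower part is not referenced.
  a = 0.0_dp
  a(1,1) = 2.0_dp;  a(1,2) = -1.0_dp
  a(2,2) = 2.0_dp;  a(2,3) = -1.0_dp
  a(3,3) = 2.0_dp

  ! Workspace query: the recommended size is returned in wquery(1).
  lwork = -1
  call dsyev('V', 'U', n, a, n, w, wquery, lwork, info)
  lwork = max(1, 3*n - 1, int(wquery(1)))
  allocate(work(lwork))

  ! jobz = 'V': on exit a holds the orthonormal eigenvectors column by column.
  call dsyev('V', 'U', n, a, n, w, work, lwork, info)
  if (info /= 0) then
     print *, 'dsyev failed, info =', info
  else
     print *, 'eigenvalues (ascending):', w
     do j = 1, n
        print *, 'eigenvector', j, ':', a(:,j)
     end do
  end if
end program syev_sketch
```

For this matrix the exact eigenvalues are 2 - sqrt(2), 2 and 2 + sqrt(2), which gives a simple check on the computed values.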
?heev
Computes all eigenvalues and, optionally,
eigenvectors of a Hermitian matrix.
Syntax
call cheev(jobz, uplo, n, a, lda, w, work, lwork, rwork, info)
call zheev(jobz, uplo, n, a, lda, w, work, lwork, rwork, info)
call heev(a, w [,jobz] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a complex Hermitian matrix A.
Note that for most cases of complex Hermitian eigenvalue problems the default choice should be heevr
function as its underlying algorithm is faster and uses less workspace.
Input Parameters


a, work COMPLEX for cheev

DOUBLE COMPLEX for zheev
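Arrays:
a(lda,*) contains either the upper or the lower triangular part of the
Hermitian matrix A, as specified by uplo.
lda INTEGER. The leading dimension of the array a. Must be at least max(1, n).
lwork INTEGER.
The dimension of the array work.
Constraint: lwork ≥ max(1, 2n-1).
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the
work array, returns this value as the first entry of the work array, and no error message related to
lwork is issued by xerbla.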
rwork REAL for cheev

DOUBLE PRECISION for zheev.
Workspace array, size at least max(1, 3n-2).
Output Parameters
a On exit, if jobz = 'V', then if info = 0, array a contains the orthonormal

eigenvectors of the matrix A.
If jobz = 'N', then on exit the lower triangle
(if uplo = 'L') or the upper triangle (if uplo = 'U') of A, including the
diagonal, is overwritten.
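w REAL for cheev
DOUBLE PRECISION for zheev
Array, size at least max(1, n).
If info = 0, contains the eigenvalues of the matrix A in ascending order.
work(1) On exit, if lwork > 0, then work(1) returns the required minimal size of
lwork.
info INTEGER.
If info = i, then the algorithm failed to converge; i indicates the number
of off-diagonal elements of an intermediate tridiagonal form which did not
converge to zero.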
Specific details for the routine heev interface are the following:
jobz Must be 'N' or 'V'. The default value is 'N'.
Application Notes
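For optimum performance use
lwork ≥ (nb+1)*n,
where nb is the blocksize for ?hetrd returned by ilaenv.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.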
?syevd
Computes all eigenvalues and, optionally, all
eigenvectors of a real symmetric matrix using divide
and conquer algorithm.
Syntax
call ssyevd(jobz, uplo, n, a, lda, w, work, lwork, iwork, liwork, info)
call dsyevd(jobz, uplo, n, a, lda, w, work, lwork, iwork, liwork, info)
call syevd(a, w [,jobz] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a real symmetric matrix A.
In other words, it can compute the spectral factorization of A as: A = Z*Λ*Z^T.
Here Λ is a diagonal matrix whose diagonal elements are the eigenvalues λ_i, and Z is the orthogonal matrix
whose columns are the eigenvectors z_i. Thus,
A*z_i = λ_i*z_i for i = 1, 2, ..., n.
If the eigenvectors are requested, then this routine uses a divide and conquer algorithm to compute
eigenvalues and eigenvectors. However, if only eigenvalues are required, then it uses the Pal-Walker-Kahan
variant of the QL or QR algorithm.
Note that for most cases of real symmetric eigenvalue problems the default choice should be the syevr function,
as its underlying algorithm is faster and uses less workspace. ?syevd requires more workspace but is faster
in some cases, especially for large matrices.
Input Parameters


a REAL for ssyevd

DOUBLE PRECISION for dsyevd
Array, size (lda, *).

work REAL for ssyevd

DOUBLE PRECISION for dsyevd.
Workspace array, size at least lwork.
lwork INTEGER.
The dimension of the array work.
Constraints:
if n ≤ 1, then lwork ≥ 1;
if jobz = 'N' and n > 1, then lwork ≥ 2*n + 1;
if jobz = 'V' and n > 1, then lwork ≥ 2*n^2 + 6*n + 1.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the required sizes of the work and iwork arrays, returns these
values as the first entries of the work and iwork arrays, and no error
message related to lwork or liwork is issued by xerbla. See Application
Notes for details.
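iwork INTEGER.
Workspace array, its dimension max(1, liwork).
liwork INTEGER.
The dimension of the array iwork.
Constraints:
if n ≤ 1, then liwork ≥ 1;
if jobz = 'N' and n > 1, then liwork ≥ 1;
if jobz = 'V' and n > 1, then liwork ≥ 5*n + 3.
If liwork = -1, then a workspace query is assumed; the routine only
calculates the required sizes of the work and iwork arrays, returns these
values as the first entries of the work and iwork arrays, and no error
message related to lwork or liwork is issued by xerbla. See Application
Notes for details.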
Output Parameters
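w REAL for ssyevd
DOUBLE PRECISION for dsyevd
Array, size at least max(1, n).
If info = 0, contains the eigenvalues of the matrix A in ascending order.
See also info.
a If jobz = 'V', then on exit this array is overwritten by the orthogonal
matrix Z which contains the eigenvectors of A.
work(1) On exit, if lwork > 0, then work(1) returns the required minimal size of
lwork.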
iwork(1) On exit, if liwork > 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = i, and jobz = 'N', then the algorithm failed to converge; i

indicates the number of off-diagonal elements of an intermediate tridiagonal
form which did not converge to zero.
If info = i, and jobz = 'V', then the algorithm failed to compute an
eigenvalue while working on the submatrix lying in rows and columns
info/(n+1) through mod(info,n+1).

Specific details for the routine syevd interface are the following:
jobz Must be 'N' or 'V'. The default value is 'N'.
Application Notes
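The computed eigenvalues and eigenvectors are exact for a matrix A+E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
If lwork = -1 (liwork = -1), the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work, iwork). This operation is called a workspace query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
The complex analogue of this routine is heevd.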
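Because ?syevd has two workspace arrays, a workspace query sets both lwork and liwork to -1. A minimal sketch, assuming the double-precision FORTRAN 77 interface (dsyevd); the program name, the variables wquery and iquery, and the test matrix are hypothetical. It also shows how a positive info could be decoded when jobz = 'V', following the description of info above.

```fortran
program syevd_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: a(n,n), w(n), wquery(1)
  real(dp), allocatable :: work(:)
  integer, allocatable :: iwork(:)
  integer :: i, lwork, liwork, iquery(1), info
  external :: dsyevd

  ! Illustrative symmetric matrix; only the lower triangle is referenced.
  a = 0.0_dp
  do i = 1, n
     a(i,i) = real(i, dp)
     if (i < n) a(i+1,i) = 0.5_dp
  end do

  ! Workspace query for both arrays at once.
  lwork  = -1
  liwork = -1
  call dsyevd('V', 'L', n, a, n, w, wquery, lwork, iquery, liwork, info)
  lwork  = max(2*n*n + 6*n + 1, int(wquery(1)))
  liwork = max(5*n + 3, iquery(1))
  allocate(work(lwork), iwork(liwork))

  call dsyevd('V', 'L', n, a, n, w, work, lwork, iwork, liwork, info)
  if (info > 0) then
     ! jobz = 'V': failure while working on the submatrix lying in
     ! rows and columns info/(n+1) through mod(info, n+1).
     print *, 'failed on rows/columns', info/(n+1), 'through', mod(info, n+1)
  else if (info == 0) then
     print *, 'eigenvalues (ascending):', w
  else
     print *, 'illegal argument number', -info
  end if
end program syevd_sketch
```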
?heevd
Computes all eigenvalues and, optionally, all
eigenvectors of a complex Hermitian matrix using
divide and conquer algorithm.
Syntax
call cheevd(jobz, uplo, n, a, lda, w, work, lwork, rwork, lrwork, iwork, liwork, info)
call zheevd(jobz, uplo, n, a, lda, w, work, lwork, rwork, lrwork, iwork, liwork, info)
call heevd(a, w [,job] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a complex Hermitian matrix
A. In other words, it can compute the spectral factorization of A as: A = Z*Λ*Z^H.
Here Λ is a real diagonal matrix whose diagonal elements are the eigenvalues λ_i, and Z is the (complex)
unitary matrix whose columns are the eigenvectors z_i. Thus,
A*z_i = λ_i*z_i for i = 1, 2, ..., n.
Note that for most cases of complex Hermitian eigenvalue problems the default choice should be the heevr
function, as its underlying algorithm is faster and uses less workspace. ?heevd requires more workspace but
is faster in some cases, especially for large matrices.
Input Parameters


a COMPLEX for cheevd

DOUBLE COMPLEX for zheevd
work COMPLEX for cheevd

DOUBLE COMPLEX for zheevd.
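lwork INTEGER.
The dimension of the array work. Constraints:
if n ≤ 1, then lwork ≥ 1;
if jobz = 'N' and n > 1, then lwork ≥ n+1;
if jobz = 'V' and n > 1, then lwork ≥ n^2 + 2*n.
If lwork = -1, then a workspace query is assumed; the routine only calculates the required sizes of
the work, rwork and iwork arrays, returns these values as the first entries of the work, rwork and
iwork arrays, and no error message related to lwork or lrwork or liwork is issued by xerbla. See
Application Notes for details.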
rwork REAL for cheevd

DOUBLE PRECISION for zheevd
Workspace array, size at least lrwork.
lrwork INTEGER.
The dimension of the array rwork. Constraints:
if n ≤ 1, then lrwork ≥ 1;
if jobz = 'N' and n > 1, then lrwork ≥ n;
if jobz = 'V' and n > 1, then lrwork ≥ 2*n^2 + 5*n + 1.

liwork INTEGER.
The dimension of the array iwork. Constraints:
if n ≤ 1, then liwork ≥ 1;
if jobz = 'N' and n > 1, then liwork ≥ 1;
if jobz = 'V' and n > 1, then liwork ≥ 5*n+3.

Output Parameters
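w REAL for cheevd
DOUBLE PRECISION for zheevd
Array, size at least max(1, n).
If info = 0, contains the eigenvalues of the matrix A in ascending order.
See also info.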
a If jobz = 'V', then on exit this array is overwritten by the unitary matrix
Z which contains the eigenvectors of A.
work(1) On exit, if lwork > 0, then the real part of work(1) returns the required
minimal size of lwork.
rwork(1) On exit, if lrwork > 0, then rwork(1) returns the required minimal size of
lrwork.
iwork(1) On exit, if liwork > 0, then iwork(1) returns the required minimal size of
liwork.
info INTEGER.
If info = i, and jobz = 'N', then the algorithm failed to converge; i
off-diagonal elements of an intermediate tridiagonal form did not converge to
zero;
if info = i, and jobz = 'V', then the algorithm failed to compute an
eigenvalue while working on the submatrix lying in rows and columns
info/(n+1) through mod(info, n+1).

Specific details for the routine heevd interface are the following:
w Holds the vector of length (n).
Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix A + E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
If you are in doubt how much workspace to supply, use a generous value of lwork (liwork or lrwork) for the
first run or set lwork = -1 (liwork = -1, lrwork = -1).
If you choose the first option and set lwork (liwork or lrwork) to any admissible size, that is, no less than
the minimal value described, the routine completes the task, though probably not as fast as with the
recommended workspace, and provides the recommended workspace in the first element of the
corresponding array (work, iwork, rwork) on exit. Use this value (work(1), iwork(1), rwork(1)) for
subsequent runs.
If you set lwork = -1 (liwork = -1, lrwork = -1), the routine returns immediately and provides the
recommended workspace in the first element of the corresponding array (work, iwork, rwork). This
operation is called a workspace query.
Note that if you set lwork (liwork, lrwork) to less than the minimal required value and not -1, the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.
The real analogue of this routine is syevd. See also hpevd for matrices held in packed storage, and hbevd for
banded matrices.
?syevx
Computes selected eigenvalues and, optionally,
eigenvectors of a symmetric matrix.
Syntax
call ssyevx(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz, work,
lwork, iwork, ifail, info)
call dsyevx(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz, work,
lwork, iwork, ifail, info)
call syevx(a, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric matrix A.
Eigenvalues and eigenvectors can be selected by specifying either a range of values or a range of indices for
the desired eigenvalues.
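Note that for most cases of real symmetric eigenvalue problems the default choice should be the syevr function,
as its underlying algorithm is faster and uses less workspace. ?syevx is faster for a few selected
eigenvalues.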
Input Parameters

range CHARACTER*1. Must be 'A', 'V', or 'I'.

If range = 'A', all eigenvalues will be found.
If range = 'V', all eigenvalues in the half-open interval (vl, vu] will be
found.
If range = 'I', the eigenvalues with indices il through iu will be found.

a, work REAL for ssyevx

DOUBLE PRECISION for dsyevx.
Arrays:
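lda INTEGER. The leading dimension of the array a. Must be at least max(1, n).
vl, vu REAL for ssyevx
DOUBLE PRECISION for dsyevx.
If range = 'V', the lower and upper bounds of the interval to be searched
for eigenvalues; vl ≤ vu. Not referenced if range = 'A' or 'I'.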
il, iu INTEGER.
If range = 'I', the indices of the smallest and largest eigenvalues to be
returned.
Constraints: 1 ≤ il ≤ iu ≤ n, if n > 0;
il = 1 and iu = 0, if n = 0.
Not referenced if range = 'A' or 'V'.
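abstol REAL for ssyevx
DOUBLE PRECISION for dsyevx.
The absolute error tolerance for the eigenvalues. See Application Notes for
more information.
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1.
If jobz = 'V', then ldz ≥ max(1, n).
lwork INTEGER.
The dimension of the array work.
If n ≤ 1 then lwork ≥ 1, otherwise lwork ≥ 8*n.
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the
work array, returns this value as the first entry of the work array, and no error message related to
lwork is issued by xerbla.
iwork INTEGER. Workspace array, size at least max(1, 5n).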
Output Parameters
a On exit, the lower triangle (if uplo = 'L') or the upper triangle (if uplo =
'U') of A, including the diagonal, is overwritten.
m INTEGER. The total number of eigenvalues found;
0 ≤ m ≤ n.
w REAL for ssyevx

DOUBLE PRECISION for dsyevx
Array, size at least max(1, n). The first m elements contain the selected
eigenvalues of the matrix A in ascending order.
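z REAL for ssyevx
DOUBLE PRECISION for dsyevx.
Array z(ldz,*) contains eigenvectors.
The second dimension of z must be at least max(1, m).
If jobz = 'V', then if info = 0, the first m columns of z contain the
orthonormal eigenvectors of the matrix A corresponding to the selected
eigenvalues, with the i-th column of z holding the eigenvector associated
with w(i).
If an eigenvector fails to converge, then that column of z contains the latest
approximation to the eigenvector, and the index of the eigenvector is
returned in ifail.
If jobz = 'N', then z is not referenced. Note: you must ensure that at
least max(1,m) columns are supplied in the
array z; if range = 'V', the exact value of m is not known in advance and
an upper bound must be used.
work(1) On exit, if lwork > 0, then work(1) returns the required minimal size of
lwork.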
ifail INTEGER.
If jobz = 'V', then if info = 0, the first m elements of ifail are zero; if
info > 0, then ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.
If info = i, then i eigenvectors failed to converge; their indices are stored

in the array ifail.

Specific details for the routine syevx interface are the following:
ifail Holds the vector of length n.
vl Default value for this element is vl = -HUGE(vl).
vu Default value for this element is vu = HUGE(vl).
abstol Default value for this element is abstol = 0.0_WP.
jobz Restored based on the presence of the argument z as follows: jobz = 'V', if z
is present; jobz = 'N', if z is omitted. Note that there will be an error condition
if ifail is present and z is omitted.
range Restored based on the presence of arguments vl, vu, il, iu as follows: range =
'V', if one of or both vl and vu are present; range = 'I', if one of or both il
and iu are present; range = 'A', if none of vl, vu, il, iu is present. Note that
there will be an error condition if one of or both vl and vu are present and at the
same time one of or both il and iu are present.
Application Notes
For optimum performance use lwork ≥ (nb+3)*n, where nb is the maximum of the blocksize for ?sytrd
and ?ormtr returned by ilaenv.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run or set lwork
= -1.
If lwork has any of the admissible sizes, that is, no less than the minimal value described, then the routine
completes the task, though probably not as fast as with the recommended workspace, and provides the
recommended workspace in the first element of the corresponding array work on exit. Use this value
(work(1)) for subsequent runs.
If lwork = -1, the routine returns immediately and provides the recommended workspace in the first
element of the corresponding array work. This operation is called a workspace query.
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a,b] of width
less than or equal to abstol + ε*max(|a|,|b|), where ε is the machine precision.
If abstol is less than or equal to zero, then ε*||T|| is used as tolerance, where ||T|| is the 1-norm of the
tridiagonal matrix obtained by reducing A to tridiagonal form. Eigenvalues are computed most accurately
when abstol is set to twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, try setting abstol
to 2*?lamch('S').
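Selecting eigenvalues by index and setting abstol as recommended above can be sketched as follows. This is a minimal illustration assuming the double-precision FORTRAN 77 interface (dsyevx and dlamch); the program name, the variable nfound (standing in for the output argument m), and the test matrix are hypothetical.

```fortran
program syevx_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: a(n,n), w(n), z(n,n), work(8*n)
  real(dp) :: vl, vu, abstol
  integer :: iwork(5*n), ifail(n)
  integer :: i, il, iu, nfound, info
  real(dp), external :: dlamch
  external :: dsyevx

  ! Upper triangle of the symmetric tridiagonal matrix (-1, 2, -1).
  a = 0.0_dp
  do i = 1, n
     a(i,i) = 2.0_dp
     if (i < n) a(i,i+1) = -1.0_dp
  end do

  ! range = 'I': request the two smallest eigenvalues (indices 1..2).
  il = 1
  iu = 2
  vl = 0.0_dp               ! not referenced when range = 'I'
  vu = 0.0_dp               ! not referenced when range = 'I'
  abstol = 2.0_dp*dlamch('S')

  ! lwork = 8*n is the documented minimum for n > 1.
  call dsyevx('V', 'I', 'U', n, a, n, vl, vu, il, iu, abstol, nfound, &
              w, z, n, work, 8*n, iwork, ifail, info)

  if (info > 0) then
     print *, info, 'eigenvectors failed to converge, indices:', ifail(1:info)
  else if (info == 0) then
     print *, 'found', nfound, 'eigenvalues:', w(1:nfound)
  end if
end program syevx_sketch
```

With range = 'I' the number of eigenvalues found is known in advance (iu - il + 1), so z could be declared with only that many columns; the sketch keeps n columns for simplicity.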
?heevx
Computes selected eigenvalues and, optionally,
eigenvectors of a Hermitian matrix.
Syntax
call cheevx(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz, work,
lwork, rwork, iwork, ifail, info)
call zheevx(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz, work,
lwork, rwork, iwork, ifail, info)
call heevx(a, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a complex Hermitian matrix A.
Note that for most cases of complex Hermitian eigenvalue problems the default choice should be the heevr
function, as its underlying algorithm is faster and uses less workspace. ?heevx is faster for a few selected
eigenvalues.
Input Parameters

range CHARACTER*1. Must be 'A', 'V', or 'I'.

If range = 'A', all eigenvalues will be found.
If range = 'V', all eigenvalues in the half-open interval (vl, vu] will be
found.
If range = 'I', the eigenvalues with indices il through iu will be found.

a, work COMPLEX for cheevx

DOUBLE COMPLEX for zheevx.
Arrays:
vl, vu REAL for cheevx

DOUBLE PRECISION for zheevx.
If range = 'V', the lower and upper bounds of the interval to be searched
for eigenvalues; vl ≤ vu. Not referenced if range = 'A' or 'I'.
il, iu INTEGER.
If range = 'I', the indices of the smallest and largest eigenvalues to be
returned. Constraints:
1 ≤ il ≤ iu ≤ n, if n > 0; il = 1 and iu = 0, if n = 0. Not referenced if range =
'A' or 'V'.
abstol REAL for cheevx

DOUBLE PRECISION for zheevx. The absolute error tolerance for the
eigenvalues. See Application Notes for more information.
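ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1.
If jobz = 'V', then ldz ≥ max(1, n).
lwork INTEGER.
The dimension of the array work.
lwork ≥ 1 if n ≤ 1; otherwise at least 2*n.
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal size of the
work array, returns this value as the first entry of the work array, and no error message related to
lwork is issued by xerbla.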
rwork REAL for cheevx

DOUBLE PRECISION for zheevx.
Output Parameters
m INTEGER. The total number of eigenvalues found; 0 ≤ m ≤ n.

w REAL for cheevx

DOUBLE PRECISION for zheevx
Array, size max(1, n). The first m elements contain the selected eigenvalues
of the matrix A in ascending order.
z COMPLEX for cheevx

DOUBLE COMPLEX for zheevx.
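Array z(ldz,*) contains eigenvectors.
The second dimension of z must be at least max(1, m).
If jobz = 'V', then if info = 0, the first m columns of z contain the
orthonormal eigenvectors of the matrix A corresponding to the selected
eigenvalues, with the i-th column of z holding the eigenvector associated
with w(i).
If an eigenvector fails to converge, then that column of z contains the latest
approximation to the eigenvector, and the index of the eigenvector is
returned in ifail.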
If jobz = 'N', then z is not referenced. Note: you must ensure that at
least max(1,m) columns are supplied in the array z; if range = 'V', the
exact value of m is not known in advance and an upper bound must be
used.
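work(1) On exit, if lwork > 0, then work(1) returns the required minimal size of
lwork.
ifail INTEGER.
If jobz = 'V', then if info = 0, the first m elements of ifail are zero; if
info > 0, then ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.
If info = i, then i eigenvectors failed to converge; their indices are stored
in the array ifail.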

Specific details for the routine heevx interface are the following:
z Holds the matrix Z of size (n, n).
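jobz Restored based on the presence of the argument z as follows: jobz = 'V', if z
is present; jobz = 'N', if z is omitted. Note that there will be an error condition
if ifail is present and z is omitted.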
Application Notes
For optimum performance use lwork ≥ (nb+1)*n, where nb is the maximum of the blocksize for ?hetrd
and ?unmtr returned by ilaenv.
If it is not clear how much workspace to supply, use a generous value of lwork for the first run or set
lwork = -1.
If abstol is less than or equal to zero, then ε*||T|| will be used in its place, where ε is the machine
precision and ||T|| is the 1-norm of the tridiagonal matrix obtained by reducing A to tridiagonal form.
Eigenvalues will be computed most accurately when abstol is set to twice the underflow threshold
2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, try setting abstol
to 2*?lamch('S').
?syevr
Computes selected eigenvalues and, optionally,
eigenvectors of a real symmetric matrix using the
Relatively Robust Representations.
Syntax
call ssyevr(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz,
isuppz, work, lwork, iwork, liwork, info)
call dsyevr(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz,
isuppz, work, lwork, iwork, liwork, info)
call syevr(a, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,isuppz] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric matrix A.
The routine first reduces the matrix A to tridiagonal form T. Then, whenever possible, ?syevr calls stemr to
compute the eigenspectrum using Relatively Robust Representations. stemr computes eigenvalues by the
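dqds algorithm, while orthogonal eigenvectors are computed from various "good" L*D*L^T representations
(also known as Relatively Robust Representations). Gram-Schmidt orthogonalization is avoided as far as
possible. More specifically, the various steps of the algorithm are as follows. For each unreduced block of
T:
a. Compute T - σ*I = L*D*L^T, so that L and D define all the wanted eigenvalues to high relative
accuracy. This means that small relative changes in the entries of D and L cause only small relative
changes in the eigenvalues and eigenvectors. The standard (unfactored) representation of the
tridiagonal matrix T does not have this property in general.
b. Compute the eigenvalues to suitable accuracy. If the eigenvectors are desired, the algorithm attains
full accuracy of the computed eigenvalues only right before the corresponding vectors have to be
computed, see Steps c) and d).
c. For each cluster of close eigenvalues, select a new shift close to the cluster, find a new factorization,
and refine the shifted eigenvalues to suitable accuracy.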
d. For each eigenvalue with a large enough relative separation, compute the corresponding eigenvector by
forming a rank revealing twisted factorization. Go back to Step c) for any clusters that remain.
The desired accuracy of the output can be specified by the input parameter abstol.
The routine ?syevr calls stemr when the full spectrum is requested on machines that conform to the
IEEE-754 floating point standard. ?syevr calls stebz and stein on non-IEEE machines and when partial
spectrum requests are made.
Note that ?syevr is preferable for most cases of real symmetric eigenvalue problems as its underlying
algorithm is fast and uses less workspace.
Input Parameters



If range = 'V', the routine computes eigenvalues w(i) in the half-open
interval:
vl < w(i) ≤ vu.
For range = 'V' or 'I' and iu-il < n-1, sstebz/dstebz and sstein/dstein are called.

a, work REAL for ssyevr

DOUBLE PRECISION for dsyevr.
Arrays:
vl, vu REAL for ssyevr
DOUBLE PRECISION for dsyevr.
If range = 'V', the lower and upper bounds of the interval to be searched
for eigenvalues.
Constraint: vl < vu.
il, iu INTEGER.
Constraint:
1 ≤ il ≤ iu ≤ n, if n > 0;
il=1 and iu=0, if n = 0.
abstol REAL for ssyevr

DOUBLE PRECISION for dsyevr. The absolute error tolerance to which each
eigenvalue/eigenvector is required.
If jobz = 'V', the eigenvalues and eigenvectors output have residual
norms bounded by abstol, and the dot products between different
eigenvectors are bounded by abstol.
If abstol < n *eps*||T||, then n *eps*||T|| is used instead, where
eps is the machine precision, and ||T|| is the 1-norm of the matrix T. The
eigenvalues are computed to an accuracy of eps*||T|| irrespective of
abstol.
If high relative accuracy is important, set abstol to ?lamch('S').

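ldz INTEGER. The leading dimension of the output array z.
Constraints:
ldz ≥ 1 if jobz = 'N' and
ldz ≥ max(1, n) if jobz = 'V'.
lwork INTEGER.
The dimension of the array work.
Constraint: lwork ≥ max(1, 26n).
If lwork = -1, then a workspace query is assumed; the routine only calculates the optimal sizes of
the work and iwork arrays, returns these values as the first entries of the work and iwork arrays,
and no error message related to lwork or liwork is issued by xerbla.
liwork INTEGER.
The dimension of the array iwork, liwork ≥ max(1, 10n).
If liwork = -1, then a workspace query is assumed; the routine only calculates the optimal sizes of
the work and iwork arrays, returns these values as the first entries of the work and iwork arrays,
and no error message related to lwork or liwork is issued by xerbla.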
Output Parameters
m INTEGER. The total number of eigenvalues found, 0 ≤ m ≤ n.

If range = 'A', m = n, if range = 'I', m = iu-il+1, and if range =
'V' the exact value of m is not known in advance.
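w, z REAL for ssyevr
DOUBLE PRECISION for dsyevr.
Arrays:
w(*), size at least max(1, n), contains the selected eigenvalues in
ascending order, stored in w(1) to w(m);
z(ldz,*), the second dimension of z must be at least max(1, m).
If jobz = 'V', then if info = 0, the first m columns of z contain the
orthonormal eigenvectors of the matrix A corresponding to the selected
eigenvalues, with the i-th column of z holding the eigenvector associated
with w(i).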
If jobz = 'N', then z is not referenced. Note that you must ensure that at
least max(1, m) columns are supplied in the array z ; if range = 'V', the
exact value of m is not known in advance and an upper bound must be
used.
isuppz INTEGER.
Array, size at least 2*max(1, m).
The support of the eigenvectors in z, i.e., the indices indicating the nonzero
elements in z. The i-th eigenvector is nonzero only in elements
isuppz(2i-1) through isuppz(2i). Referenced only if eigenvectors
are needed (jobz = 'V') and all eigenvalues are needed, that is, range =
'A' or range = 'I' and il = 1 and iu = n.
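work(1) On exit, if lwork > 0, then work(1) returns the required minimal size of
lwork.
iwork(1) On exit, if liwork > 0, then iwork(1) returns the required minimal size of
liwork.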
info INTEGER.
If info = i, an internal error has occurred.
Specific details for the routine syevr interface are the following:
z Holds the matrix Z of size (n, n), where the values n and m are significant.
isuppz Holds the vector of length (2*m), where the values (2*m) are significant.
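jobz Restored based on the presence of the argument z as follows: jobz = 'V', if z
is present; jobz = 'N', if z is omitted. Note that there will be an error condition
if isuppz is present and z is omitted.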
Application Notes
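For optimum performance use lwork ≥ (nb+6)*n, where nb is the maximum of the blocksize for ?sytrd
and ?ormtr returned by ilaenv.
If lwork = -1 (liwork = -1), the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work, iwork). This operation is called a workspace query.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, the routine
returns immediately with an error exit and does not provide any information on the recommended
workspace.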
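A value-range request with ?syevr can be sketched as follows. This is a hedged illustration assuming the double-precision FORTRAN 77 interface (dsyevr and dlamch); the program name, the variables wquery, iquery and nfound (standing in for the output argument m), the interval, and the test matrix are hypothetical. Because the number of eigenvalues in the interval is not known in advance when range = 'V', z is given n columns as an upper bound.

```fortran
program syevr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: a(n,n), w(n), z(n,n), wquery(1)
  real(dp) :: vl, vu, abstol
  real(dp), allocatable :: work(:)
  integer, allocatable :: iwork(:)
  integer :: isuppz(2*n), iquery(1)
  integer :: i, il, iu, nfound, lwork, liwork, info
  real(dp), external :: dlamch
  external :: dsyevr

  ! Upper triangle of the symmetric tridiagonal matrix (-1, 2, -1).
  a = 0.0_dp
  do i = 1, n
     a(i,i) = 2.0_dp
     if (i < n) a(i,i+1) = -1.0_dp
  end do

  ! range = 'V': eigenvalues in the half-open interval (vl, vu].
  vl = 0.0_dp
  vu = 2.0_dp
  il = 1                    ! not used for range = 'V'
  iu = n                    ! not used for range = 'V'
  abstol = dlamch('S')      ! suggested when high relative accuracy matters

  ! Workspace query for work and iwork.
  lwork  = -1
  liwork = -1
  call dsyevr('V', 'V', 'U', n, a, n, vl, vu, il, iu, abstol, nfound, &
              w, z, n, isuppz, wquery, lwork, iquery, liwork, info)
  lwork  = max(1, 26*n, int(wquery(1)))
  liwork = max(1, 10*n, iquery(1))
  allocate(work(lwork), iwork(liwork))

  call dsyevr('V', 'V', 'U', n, a, n, vl, vu, il, iu, abstol, nfound, &
              w, z, n, isuppz, work, lwork, iwork, liwork, info)
  if (info /= 0) then
     print *, 'dsyevr returned info =', info
  else
     print *, 'found', nfound, 'eigenvalues in (vl, vu]:', w(1:nfound)
  end if
end program syevr_sketch
```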
?heevr
Computes selected eigenvalues and, optionally,
eigenvectors of a Hermitian matrix using the
Relatively Robust Representations.
Syntax
call cheevr(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz,
isuppz, work, lwork, rwork, lrwork, iwork, liwork, info)
call zheevr(jobz, range, uplo, n, a, lda, vl, vu, il, iu, abstol, m, w, z, ldz,
isuppz, work, lwork, rwork, lrwork, iwork, liwork, info)
call heevr(a, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,isuppz] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a complex Hermitian matrix A.
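The routine first reduces the matrix A to tridiagonal form T with a call to hetrd. Then, whenever possible,
?heevr calls stegr to compute the eigenspectrum using Relatively Robust Representations. ?stegr computes
eigenvalues by the dqds algorithm, while orthogonal eigenvectors are computed from various "good" L*D*L^T
representations (also known as Relatively Robust Representations). Gram-Schmidt orthogonalization is
avoided as far as possible. More specifically, the various steps of the algorithm are as follows. For each
unreduced block (submatrix) of T:
a. Compute T - σ*I = L*D*L^T, so that L and D define all the wanted eigenvalues to high relative
accuracy. This means that small relative changes in the entries of D and L cause only small relative
changes in the eigenvalues and eigenvectors. The standard (unfactored) representation of the
tridiagonal matrix T does not have this property in general.
b. Compute the eigenvalues to suitable accuracy. If the eigenvectors are desired, the algorithm attains
full accuracy of the computed eigenvalues only right before the corresponding vectors have to be
computed, see Steps c) and d).
c. For each cluster of close eigenvalues, select a new shift close to the cluster, find a new factorization,
and refine the shifted eigenvalues to suitable accuracy.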
d. For each eigenvalue with a large enough relative separation, compute the corresponding eigenvector by
forming a rank revealing twisted factorization. Go back to Step c) for any clusters that remain.
The routine ?heevr calls stemr when the full spectrum is requested on machines which conform to the
IEEE-754 floating point standard, or stebz and stein on non-IEEE machines and when partial spectrum
requests are made.
Note that the routine ?heevr is preferable for most cases of complex Hermitian eigenvalue problems as its
underlying algorithm is fast and uses less workspace.
Input Parameters

LAPACK Routines 3
If range = 'V', the routine computes eigenvalues lambda(i) in the half-open
interval: vl < lambda(i) ≤ vu.
For range = 'V' or 'I', sstebz/dstebz and cstein/zstein are called.

a, work COMPLEX for cheevr

DOUBLE COMPLEX for zheevr.
Arrays:

vl, vu REAL for cheevr

DOUBLE PRECISION for zheevr.
for eigenvalues.
Constraint: vl < vu.
il, iu INTEGER.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il=1 and iu=0 if n = 0.
abstol REAL for cheevr

The absolute error tolerance to which each eigenvalue/eigenvector is
required.
eigenvectors are bounded by abstol.
If abstol < n*eps*||T||, then n*eps*||T|| is used instead, where
eps is the machine precision, and ||T|| is the 1-norm of the matrix T. The
eigenvalues are computed to an accuracy of eps*||T|| irrespective of
abstol.

lwork INTEGER.
Constraint: lwork ≥ max(1, 2n).

error message related to lwork or lrwork or liwork is issued by xerbla.
rwork REAL for cheevr

lrwork INTEGER.
The dimension of the array rwork;
lrwork ≥ max(1, 24n).
liwork INTEGER.
The dimension of the array iwork,
liwork ≥ max(1, 10n).
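As an illustration only, the minimal workspace sizes stated for ?heevr (lwork ≥ max(1, 2n), lrwork ≥ max(1, 24n), liwork ≥ max(1, 10n)) can be tabulated with a small helper. Python is used here purely for the arithmetic; the function name heevr_min_workspace is hypothetical and is not part of Intel MKL. These are lower bounds, not the sizes recommended for best performance.

```python
def heevr_min_workspace(n):
    """Return the minimal (lwork, lrwork, liwork) for ?heevr.

    Hypothetical helper: it only evaluates the documented lower bounds
    lwork >= max(1, 2n), lrwork >= max(1, 24n), liwork >= max(1, 10n).
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    lwork = max(1, 2 * n)     # complex workspace array work
    lrwork = max(1, 24 * n)   # real workspace array rwork
    liwork = max(1, 10 * n)   # integer workspace array iwork
    return lwork, lrwork, liwork


if __name__ == "__main__":
    print(heevr_min_workspace(100))
```

A caller would allocate work, rwork, and iwork with at least these lengths before calling cheevr or zheevr.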
Output Parameters
0 ≤ m ≤ n.
'V' the exact value of m is not known in advance.
w REAL for cheevr

Array, size at least max(1, n), contains the selected eigenvalues in
ascending order, stored in w(1) to w(m).
z COMPLEX for cheevr

DOUBLE COMPLEX for zheevr.
Array z(ldz,*), the second dimension of z must be at least max(1, m).
with w(i).
isuppz INTEGER.
isuppz( 2i-1) through isuppz( 2i ). Referenced only if eigenvectors
are needed (jobz = 'V') and all eigenvalues are needed, that is, range =
'A' or range = 'I' and il = 1 and iu = n.
lwork.
rwork(1) On exit, if info = 0, then rwork(1) returns the required minimal size of
lrwork.
liwork.
info INTEGER.

Specific details for the routine heevr interface are the following:
isuppz Holds the vector of length (2*n), where the values (2*m) are significant.
if isuppz is present and z is omitted.
Application Notes
For optimum performance use lwork ≥ (nb+1)*n, where nb is the maximum of the block sizes for ?hetrd
and ?unmtr returned by ilaenv.
If you are in doubt how much workspace to supply, use a generous value of lwork (or lrwork, or liwork) for
the first run or set lwork = -1 (lrwork = -1, liwork = -1).
If you choose the first option and set any of admissible lwork (or lrwork, liwork) sizes, which is no less than
corresponding array (work, rwork, iwork) on exit. Use this value (work(1), rwork(1), iwork(1)) for
subsequent runs.
first element of the corresponding array (work, rwork, iwork). This operation is called a workspace query.
Note that if you set lwork (lrwork, liwork) to less than the minimal required value and not -1, the routine
workspace.
Normal execution of ?stemr may create NaNs and infinities and hence may abort due to a floating point
exception in environments which do not handle NaNs and infinities in the IEEE standard default manner.
For more details, see ?stemr and these references:

Inderjit S. Dhillon and Beresford N. Parlett: "Multiple representations to compute orthogonal eigenvectors
of symmetric tridiagonal matrices," Linear Algebra and its Applications, 387(1), pp. 1-28, August 2004.
Inderjit Dhillon and Beresford Parlett: "Orthogonal Eigenvectors and Relative Gaps," SIAM Journal on
Matrix Analysis and Applications, Vol. 25, 2004. Also LAPACK Working Note 154.
Inderjit Dhillon: "A new O(n^2) algorithm for the symmetric tridiagonal eigenvalue/eigenvector problem",
Computer Science Division Technical Report No. UCB/CSD-97-971, UC Berkeley, May 1997.
?spev
Computes all eigenvalues and, optionally,
eigenvectors of a real symmetric matrix in packed
storage.
Syntax
call sspev(jobz, uplo, n, ap, w, z, ldz, work, info)
call dspev(jobz, uplo, n, ap, w, z, ldz, work, info)
call spev(ap, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues and, optionally, eigenvectors of a real symmetric matrix A in
packed storage.
Input Parameters


If uplo = 'U', ap stores the packed upper triangular part of A.
If uplo = 'L', ap stores the packed lower triangular part of A.
ap, work REAL for sspev

DOUBLE PRECISION for dspev
Arrays:
Array ap(*) contains the packed upper or lower triangle of the symmetric matrix A, as specified by uplo.
The size of ap must be at least max(1, n*(n+1)/2).
work(*) is a workspace array, size at least max(1, 3n).

if jobz = 'N', then ldz ≥ 1;
if jobz = 'V', then ldz ≥ max(1, n).
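The packed array ap of length n*(n+1)/2 stores one triangle of A column by column. A minimal sketch of that mapping, assuming the usual LAPACK packed layout (for uplo = 'U', a(i,j) goes to ap(i + j*(j-1)/2) for i ≤ j; for uplo = 'L', a(i,j) goes to ap(i + (2n-j)*(j-1)/2) for i ≥ j, all indices 1-based). Python is used for illustration and the helper name pack_symmetric is hypothetical.

```python
def pack_symmetric(a, uplo="U"):
    """Pack one triangle of a symmetric n-by-n matrix (list of rows) into ap.

    Hypothetical helper illustrating the column-wise packed layout.
    Indices i, j below are 1-based to mirror the Fortran convention.
    """
    n = len(a)
    ap = [0.0] * (n * (n + 1) // 2)
    for j in range(1, n + 1):
        if uplo.upper() == "U":
            # Upper triangle: rows 1..j of column j.
            for i in range(1, j + 1):
                ap[i + j * (j - 1) // 2 - 1] = a[i - 1][j - 1]
        else:
            # Lower triangle: rows j..n of column j.
            for i in range(j, n + 1):
                ap[i + (2 * n - j) * (j - 1) // 2 - 1] = a[i - 1][j - 1]
    return ap


if __name__ == "__main__":
    A = [[1.0, 2.0, 4.0],
         [2.0, 3.0, 5.0],
         [4.0, 5.0, 6.0]]
    print(pack_symmetric(A, "U"))
    print(pack_symmetric(A, "L"))
```

For the 3-by-3 example, the upper packing lists a11, a12, a22, a13, a23, a33, and the lower packing lists a11, a21, a31, a22, a32, a33.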
Output Parameters
w, z REAL for sspev
DOUBLE PRECISION for dspev

Arrays:
w(*), size at least max(1, n).
If info = 0, w contains the eigenvalues of the matrix A in ascending order.
z(ldz,*). The second dimension of z must be at least max(1, n).

If jobz = 'V', then if info = 0, z contains the orthonormal eigenvectors
of the matrix A, with the i-th column of z holding the eigenvector associated
with w(i).
ap On exit, this array is overwritten by the values generated during the
reduction to tridiagonal form. The elements of the diagonal and the
off-diagonal of the tridiagonal matrix overwrite the corresponding elements of
A.
info INTEGER.

zero.

Specific details for the routine spev interface are the following:
w Holds the vector with the number of elements n.
is present, jobz = 'N', if z is omitted.
?hpev
Computes all eigenvalues and, optionally,
eigenvectors of a Hermitian matrix in packed storage.
Syntax
call chpev(jobz, uplo, n, ap, w, z, ldz, work, rwork, info)
call zhpev(jobz, uplo, n, ap, w, z, ldz, work, rwork, info)
call hpev(ap, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues and, optionally, eigenvectors of a complex Hermitian matrix A in
packed storage.
Input Parameters


ap COMPLEX for chpev

DOUBLE COMPLEX for zhpev.
Array ap(*) contains the packed upper or lower triangle of the Hermitian matrix A, as specified by uplo.
work COMPLEX for chpev

work(*) is a workspace array, size at least max(1, 2n-1).

Constraints:
if jobz = 'V', then ldz ≥ max(1, n).
rwork REAL for chpev

DOUBLE PRECISION for zhpev.
Output Parameters
w REAL for chpev

DOUBLE PRECISION for zhpev.
If info = 0, w contains the eigenvalues of the matrix A in ascending order.
z COMPLEX for chpev

Array z(ldz,*).

with w(i).

A.
info INTEGER.

zero.

Specific details for the routine hpev interface are the following:

?spevd
Uses divide and conquer algorithm to compute all
eigenvalues and (optionally) all eigenvectors of a real
symmetric matrix held in packed storage.
Syntax
call sspevd(jobz, uplo, n, ap, w, z, ldz, work, lwork, iwork, liwork, info)
call dspevd(jobz, uplo, n, ap, w, z, ldz, work, lwork, iwork, liwork, info)
call spevd(ap, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a real symmetric matrix A
(held in packed storage). In other words, it can compute the spectral factorization of A as:
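A = Z*Λ*Z^T.
Here Λ is the diagonal matrix of the eigenvalues λi, and the columns zi of the orthogonal matrix Z are the
corresponding eigenvectors, so that
A*zi = λi*zi for i = 1, 2, ..., n.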
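The spectral factorization can be checked by hand on a small case. The sketch below, written in Python for illustration only, uses the 2-by-2 matrix A = [[2, 1], [1, 2]], whose eigenvalues are 1 and 3 with normalized eigenvectors (1, -1)/√2 and (1, 1)/√2, and verifies that multiplying Z, the diagonal eigenvalue matrix, and the transpose of Z reproduces A. The helper names are hypothetical; no MKL routine is called.

```python
import math


def matmul(x, y):
    """Plain dense matrix product of two lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def transpose(x):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*x)]


A = [[2.0, 1.0],
     [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
w = [1.0, 3.0]                      # eigenvalues in ascending order
Z = [[s, s],
     [-s, s]]                       # i-th column is the eigenvector for w[i]
Lam = [[w[0], 0.0],
       [0.0, w[1]]]                 # diagonal matrix of eigenvalues

# Reconstruct A from its factors: Z * Lam * Z^T.
A_rebuilt = matmul(matmul(Z, Lam), transpose(Z))

if __name__ == "__main__":
    for row in A_rebuilt:
        print(row)
```

Up to rounding, A_rebuilt equals A, and the product of the transpose of Z with Z is the identity, which is what "orthogonal matrix Z" means here.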
Input Parameters


ap, work REAL for sspevd

DOUBLE PRECISION for dspevd
Arrays:
ap(*) contains the packed upper or lower triangle of symmetric matrix A,
as specified by uplo.
The dimension of ap must be at least max(1, n*(n+1)/2).

Constraints:
lwork INTEGER.
Constraints:
if jobz = 'N' and n > 1, then lwork ≥ 2*n;
if jobz = 'V' and n > 1, then
lwork ≥ n^2 + 6*n + 1.
Notes for details.
liwork INTEGER.
Constraints:

Notes for details.
Output Parameters
w, z REAL for sspevd

DOUBLE PRECISION for dspevd
Arrays:

See also info.
z(ldz,*).
The second dimension of z must be: at least 1 if jobz = 'N'; at least
max(1, n) if jobz = 'V'.
If jobz = 'V', then this array is overwritten by the orthogonal matrix Z

which contains the eigenvectors of A. If jobz = 'N', then z is not
referenced.

A.
work(1) On exit, if info = 0, then work(1) returns the required lwork.
iwork(1) On exit, if info = 0, then iwork(1) returns the required liwork.
info INTEGER.

zero.

Specific details for the routine spevd interface are the following:

Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix A+E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
The complex analogue of this routine is hpevd.
See also syevd for matrices held in full storage, and sbevd for banded matrices.
?hpevd
Uses divide and conquer algorithm to compute all
eigenvalues and, optionally, all eigenvectors of a
complex Hermitian matrix held in packed storage.
Syntax
call chpevd(jobz, uplo, n, ap, w, z, ldz, work, lwork, rwork, lrwork, iwork, liwork,
info)
call zhpevd(jobz, uplo, n, ap, w, z, ldz, work, lwork, rwork, lrwork, iwork, liwork,
info)
call hpevd(ap, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a complex Hermitian matrix
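A (held in packed storage). In other words, it can compute the spectral factorization of A as: A = Z*Λ*Z^H.
Here Λ is the real diagonal matrix of the eigenvalues λi, and the columns zi of the unitary matrix Z are the
corresponding eigenvectors, so that
A*zi = λi*zi for i = 1, 2, ..., n.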
Input Parameters


ap, work COMPLEX for chpevd

DOUBLE COMPLEX for zhpevd
Arrays:
ap(*) contains the packed upper or lower triangle of Hermitian matrix A, as
specified by uplo.

Constraints:
lwork INTEGER.
Constraints:
if jobz = 'N' and n > 1, then lwork ≥ n;
if jobz = 'V' and n > 1, then lwork ≥ 2*n.

rwork REAL for chpevd

DOUBLE PRECISION for zhpevd
Workspace array, its dimension max(1, lrwork).
lrwork INTEGER.
The dimension of the array rwork. Constraints:
if jobz = 'N' and n > 1, then lrwork ≥ n;
if jobz = 'V' and n > 1, then lrwork ≥ 2*n^2 + 5*n + 1.

liwork INTEGER.
Constraints:

Output Parameters
w REAL for chpevd

DOUBLE PRECISION for zhpevd
See also info.
z COMPLEX for chpevd

DOUBLE COMPLEX for zhpevd
Array, size (ldz,*).
at least 1 if jobz = 'N';
at least max(1, n) if jobz = 'V'.
If jobz = 'V', then this array is overwritten by the unitary matrix Z which
contains the eigenvectors of A.

A.
lwork.
lrwork.
liwork.
info INTEGER.

zero.

Specific details for the routine hpevd interface are the following:

Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix A + E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
subsequent runs.
workspace.
The real analogue of this routine is spevd.
See also heevd for matrices held in full storage, and hbevd for banded matrices.
?spevx
Computes selected eigenvalues and, optionally,
eigenvectors of a real symmetric matrix in packed
storage.
Syntax
call sspevx(jobz, range, uplo, n, ap, vl, vu, il, iu, abstol, m, w, z, ldz, work,
iwork, ifail, info)
call dspevx(jobz, range, uplo, n, ap, vl, vu, il, iu, abstol, m, w, z, ldz, work,
iwork, ifail, info)
call spevx(ap, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric matrix A in
packed storage. Eigenvalues and eigenvectors can be selected by specifying either a range of values or a
range of indices for the desired eigenvalues.
Input Parameters



interval: vl < w(i) ≤ vu.

ap, work REAL for sspevx

DOUBLE PRECISION for dspevx
Arrays:
Array ap(*) contains the packed upper or lower triangle of the symmetric matrix A, as specified by uplo.
vl, vu REAL for sspevx

for eigenvalues.
Constraint: vl < vu.
il, iu INTEGER.
Constraint: 1 ≤ il ≤ iu ≤ n, if n > 0; il=1 and iu=0
if n = 0.
abstol REAL for sspevx

The absolute error tolerance to which each eigenvalue is required. See
Application notes for details on error tolerance.

Constraints:
Output Parameters

A.

0 ≤ m ≤ n. If range = 'A', m = n; if range = 'I', m = iu-il+1; and if
range = 'V', the exact value of m is not known in advance.
w, z REAL for sspevx

Arrays:
If info = 0, contains the selected eigenvalues of the matrix A in ascending
order.
z(ldz,*).
with w(i).

returned in ifail.
ifail INTEGER.
info > 0, ifail contains the indices of the eigenvectors that failed to
converge.
If jobz = 'N', then ifail is not referenced.
info INTEGER.

in the array ifail.

Specific details for the routine spevx interface are the following:
ifail Holds the vector with the number of elements n.

jobz = 'N', if z is omitted
Note that there will be an error condition if ifail is present and z is omitted.
Application Notes
If abstol is less than or equal to zero, then ε*||T||_1 will be used in its place, where ε is the machine
precision and T is the tridiagonal matrix obtained by reducing A to tridiagonal form. Eigenvalues will be
computed most accurately when abstol is set to twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, set abstol
to 2*?lamch('S').
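The substitution rule for a non-positive abstol can be written out as a short sketch. It assumes T is given by its diagonal d and off-diagonal e, computes the 1-norm of T as the largest absolute column sum, and returns ε*||T||_1 when abstol ≤ 0. Python is used for illustration; the function names are hypothetical, and sys.float_info.epsilon is only a stand-in for the machine precision that the library obtains from ?lamch, which may differ by a factor of two.

```python
import sys


def tridiag_one_norm(d, e):
    """1-norm (max absolute column sum) of a symmetric tridiagonal matrix.

    d holds the n diagonal entries, e the n-1 off-diagonal entries.
    """
    n = len(d)
    norm = 0.0
    for j in range(n):
        col = abs(d[j])
        if j > 0:
            col += abs(e[j - 1])   # entry above the diagonal in column j
        if j < n - 1:
            col += abs(e[j])       # entry below the diagonal in column j
        norm = max(norm, col)
    return norm


def effective_abstol(abstol, d, e, eps=sys.float_info.epsilon):
    """Tolerance actually used: abstol if positive, else eps * ||T||_1.

    Hypothetical helper; eps is an assumed stand-in for the machine precision.
    """
    if abstol <= 0.0:
        return eps * tridiag_one_norm(d, e)
    return abstol


if __name__ == "__main__":
    d = [2.0, 2.0, 2.0]
    e = [-1.0, -1.0]
    print(effective_abstol(0.0, d, e))
    print(effective_abstol(1e-8, d, e))
```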
?hpevx
Computes selected eigenvalues and, optionally,
eigenvectors of a Hermitian matrix in packed storage.
Syntax
call chpevx(jobz, range, uplo, n, ap, vl, vu, il, iu, abstol, m, w, z, ldz, work,
rwork, iwork, ifail, info)
call zhpevx(jobz, range, uplo, n, ap, vl, vu, il, iu, abstol, m, w, z, ldz, work,
rwork, iwork, ifail, info)
call hpevx(ap, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a complex Hermitian matrix A in
packed storage. Eigenvalues and eigenvectors can be selected by specifying either a range of values or a
range of indices for the desired eigenvalues.
Input Parameters




ap, work COMPLEX for chpevx

DOUBLE COMPLEX for zhpevx
Arrays:
Array ap(*) contains the packed upper or lower triangle of the Hermitian matrix A, as specified by uplo.
vl, vu REAL for chpevx

DOUBLE PRECISION for zhpevx
for eigenvalues.
Constraint: vl < vu.
il, iu INTEGER.
abstol REAL for chpevx


Constraints:
rwork REAL for chpevx

Output Parameters

A.

range = 'V', the exact value of m is not known in advance.
w REAL for chpevx

If info = 0, contains the selected eigenvalues of the matrix A in ascending
order.
z COMPLEX for chpevx

DOUBLE COMPLEX for zhpevx
Array z(ldz,*).
with w(i).
returned in ifail.
ifail INTEGER.
converge.
info INTEGER.

in the array ifail.

Specific details for the routine hpevx interface are the following:

Application Notes
to 2*?lamch('S').
?sbev
Computes all eigenvalues and, optionally,
eigenvectors of a real symmetric band matrix.
Syntax
call ssbev(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, info)
call dsbev(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, info)
call sbev(ab, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a real symmetric band matrix A.
Input Parameters


(kd ≥ 0).
ab, work REAL for ssbev

DOUBLE PRECISION for dsbev.
Arrays:
ab(ldab,*) is an array containing either the upper or lower triangular part of the symmetric matrix A
(as specified by uplo) in band storage format.
work (*) is a workspace array.
The dimension of work must be at least max(1, 3n-2).
ldab INTEGER. The leading dimension of ab; must be at least kd+1.
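The requirement ldab ≥ kd+1 follows from the band storage layout, in which each diagonal of the stored triangle occupies one row of ab. A minimal sketch of that layout, assuming the standard LAPACK convention (for uplo = 'U', a(i,j) is stored in ab(kd+1+i-j, j) for max(1, j-kd) ≤ i ≤ j; for uplo = 'L', a(i,j) is stored in ab(1+i-j, j) for j ≤ i ≤ min(n, j+kd), with 1-based indices). Python is used for illustration and to_band_storage is a hypothetical name.

```python
def to_band_storage(a, kd, uplo="U"):
    """Copy one triangle of a symmetric band matrix into (kd+1)-by-n band form.

    Hypothetical helper. The result is a list of kd+1 rows, so
    ab[r][c] here corresponds to the Fortran element ab(r+1, c+1).
    """
    n = len(a)
    ab = [[0.0] * n for _ in range(kd + 1)]
    for j in range(1, n + 1):              # 1-based column index
        if uplo.upper() == "U":
            for i in range(max(1, j - kd), j + 1):
                ab[kd + i - j][j - 1] = a[i - 1][j - 1]
        else:
            for i in range(j, min(n, j + kd) + 1):
                ab[i - j][j - 1] = a[i - 1][j - 1]
    return ab


if __name__ == "__main__":
    # Symmetric tridiagonal example: kd = 1.
    A = [[4.0, 1.0, 0.0, 0.0],
         [1.0, 5.0, 2.0, 0.0],
         [0.0, 2.0, 6.0, 3.0],
         [0.0, 0.0, 3.0, 7.0]]
    print(to_band_storage(A, 1, "U"))
    print(to_band_storage(A, 1, "L"))
```

With uplo = 'U' the main diagonal lands in the last row of ab; with uplo = 'L' it lands in the first row. Unused corner entries are left at zero in this sketch.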

Constraints:
Output Parameters
w, z REAL for ssbev

DOUBLE PRECISION for dsbev
Arrays:
z(ldz,*).
with w(i).
ab On exit, this array is overwritten by the values generated during the

reduction to tridiagonal form (see the description of ?sbtrd).
info INTEGER.

zero.

Specific details for the routine sbev interface are the following:

?hbev
Computes all eigenvalues and, optionally,
eigenvectors of a Hermitian band matrix.
Syntax
call chbev(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, rwork, info)
call zhbev(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, rwork, info)
call hbev(ab, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a complex Hermitian band matrix A.
Input Parameters


(kd ≥ 0).
ab, work COMPLEX for chbev

DOUBLE COMPLEX for zhbev.
Arrays:

Constraints:
rwork REAL for chbev

DOUBLE PRECISION for zhbev
Output Parameters
w REAL for chbev

DOUBLE PRECISION for zhbev
If info = 0, contains the eigenvalues in ascending order.
z COMPLEX for chbev

DOUBLE COMPLEX for zhbev.
Array z(ldz,*).

with w(i).

reduction to tridiagonal form (see the description of hbtrd).
info INTEGER.
If info = i, then the algorithm failed to converge;
i indicates the number of elements of an intermediate tridiagonal form
which did not converge to zero.

Specific details for the routine hbev interface are the following:

?sbevd
Computes all eigenvalues and, optionally, all
eigenvectors of a real symmetric band matrix using
divide and conquer algorithm.
Syntax
call ssbevd(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, lwork, iwork, liwork, info)
call dsbevd(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, lwork, iwork, liwork, info)
call sbevd(ab, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a real symmetric band
matrix A. In other words, it can compute the spectral factorization of A as:
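A = Z*Λ*Z^T.
Here Λ is the diagonal matrix of the eigenvalues λi, and the columns zi of the orthogonal matrix Z are the
corresponding eigenvectors, so that
A*zi = λi*zi for i = 1, 2, ..., n.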
Input Parameters



(kd ≥ 0).
ab, work REAL for ssbevd

DOUBLE PRECISION for dsbevd.
Arrays:
ldab INTEGER. The leading dimension of ab; must be at least kd+1.

Constraints:
lwork INTEGER.
Constraints:
if jobz = 'N' and n > 1, then lwork ≥ 2n;
if jobz = 'V' and n > 1, then lwork ≥ 2*n^2 + 5*n + 1.

calculates the optimal size of the work and iwork arrays, returns these
Notes for details.
liwork INTEGER.
The dimension of the array iwork. Constraints: if n ≤ 1, then liwork ≥ 1; if
jobz = 'N' and n > 1, then liwork ≥ 1; if jobz = 'V' and n > 1, then
liwork ≥ 5*n+3.
Notes for details.
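The case analysis for the ?sbevd workspace sizes can be summarized in one function. This is a sketch of the documented lower bounds only (lwork ≥ 1 for n ≤ 1, 2n for jobz = 'N', 2*n^2 + 5*n + 1 for jobz = 'V'; liwork ≥ 1 unless jobz = 'V' and n > 1, in which case liwork ≥ 5*n+3). Python is used for the arithmetic and sbevd_min_workspace is a hypothetical name, not an MKL routine.

```python
def sbevd_min_workspace(jobz, n):
    """Return the minimal (lwork, liwork) for ?sbevd.

    Hypothetical helper that only evaluates the documented constraints.
    """
    jobz = jobz.upper()
    if jobz not in ("N", "V"):
        raise ValueError("jobz must be 'N' or 'V'")
    if n < 0:
        raise ValueError("n must be non-negative")
    if n <= 1:
        return 1, 1                       # trivial problem sizes
    if jobz == "N":
        return 2 * n, 1                   # eigenvalues only
    return 2 * n * n + 5 * n + 1, 5 * n + 3   # eigenvalues and eigenvectors


if __name__ == "__main__":
    print(sbevd_min_workspace("N", 10))
    print(sbevd_min_workspace("V", 10))
```

The quadratic growth of lwork for jobz = 'V' is the price of the divide and conquer algorithm; a workspace query (lwork = -1, liwork = -1) returns the sizes the library itself recommends.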
Output Parameters
w, z REAL for ssbevd

DOUBLE PRECISION for dsbevd
Arrays:
See also info.
z(ldz,*).
at least 1 if jobz = 'N';
at least max(1, n) if jobz = 'V'.
If jobz = 'V', then this array is overwritten by the orthogonal matrix Z
which contains the eigenvectors of A. The i-th column of Z contains the
eigenvector which corresponds to the eigenvalue w(i).
If jobz = 'N', then z is not referenced.

reduction to tridiagonal form.
lwork.
liwork.
info INTEGER.

zero.

Specific details for the routine sbevd interface are the following:

Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix A+E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
If any of admissible lwork (or liwork) has any of admissible sizes, which is no less than the minimal value
and provides the recommended workspace in the first element of the corresponding array (work, iwork) on
exit. Use this value (work(1), iwork(1)) for subsequent runs.
Note that if lwork (liwork) is less than the minimal required value and is not equal to -1, the routine returns
The complex analogue of this routine is hbevd.
See also syevd for matrices held in full storage, and spevd for matrices held in packed storage.
?hbevd
Computes all eigenvalues and, optionally, all
eigenvectors of a complex Hermitian band matrix
using divide and conquer algorithm.
Syntax
call chbevd(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, lwork, rwork, lrwork, iwork,
liwork, info)
call zhbevd(jobz, uplo, n, kd, ab, ldab, w, z, ldz, work, lwork, rwork, lrwork, iwork,
liwork, info)
call hbevd(ab, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a complex Hermitian band
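matrix A. In other words, it can compute the spectral factorization of A as: A = Z*Λ*Z^H.
Here Λ is the real diagonal matrix of the eigenvalues λi, and the columns zi of the unitary matrix Z are the
corresponding eigenvectors, so that
A*zi = λi*zi for i = 1, 2, ..., n.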

Input Parameters



(kd ≥ 0).
ab, work COMPLEX for chbevd

DOUBLE COMPLEX for zhbevd.
Arrays:
work (*) is a workspace array, its dimension max(1, lwork).
ldab INTEGER. The leading dimension of ab; must be at least kd+1.

Constraints:
lwork INTEGER.
Constraints:
if jobz = 'N' and n > 1, then lwork ≥ n;
if jobz = 'V' and n > 1, then lwork ≥ 2*n^2.

rwork REAL for chbevd
DOUBLE PRECISION for zhbevd
Workspace array, size at least lrwork.
lrwork INTEGER.
The dimension of the array rwork.
Constraints:
if jobz = 'N' and n > 1, then lrwork ≥ n;
if jobz = 'V' and n > 1, then lrwork ≥ 2*n^2 + 5*n + 1.

iwork INTEGER. Workspace array, size max(1, liwork).
liwork INTEGER.
Constraints:
if jobz = 'N' or n ≤ 1, then liwork ≥ 1;

Output Parameters
w REAL for chbevd

DOUBLE PRECISION for zhbevd
See also info.
z COMPLEX for chbevd

DOUBLE COMPLEX for zhbevd
If jobz = 'V', then this array is overwritten by the unitary matrix Z which
contains the eigenvectors of A. The i-th column of Z contains the
eigenvector which corresponds to the eigenvalue w(i).

work(1) On exit, if lwork > 0, then the real part of work(1) returns the required
minimal size of lwork.
rwork(1) On exit, if lrwork > 0, then rwork(1) returns the required minimal size of
lrwork.
liwork.
info INTEGER.

zero.

Specific details for the routine hbevd interface are the following:

Application Notes
The computed eigenvalues and eigenvectors are exact for a matrix A + E such that ||E||_2 = O(ε)*||A||_2,
where ε is the machine precision.
subsequent runs.
workspace.
The real analogue of this routine is sbevd.
See also heevd for matrices held in full storage, and hpevd for matrices held in packed storage.
?sbevx
Computes selected eigenvalues and, optionally,
eigenvectors of a real symmetric band matrix.
Syntax
call ssbevx(jobz, range, uplo, n, kd, ab, ldab, q, ldq, vl, vu, il, iu, abstol, m, w,
z, ldz, work, iwork, ifail, info)
call dsbevx(jobz, range, uplo, n, kd, ab, ldab, q, ldq, vl, vu, il, iu, abstol, m, w,
z, ldz, work, iwork, ifail, info)
call sbevx(ab, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,q] [,abstol]
[,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric band matrix A.
Input Parameters



interval: vl < w(i) ≤ vu.
If range = 'I', the routine computes eigenvalues with indices in range il
to iu.

(kd ≥ 0).
ab, work REAL for ssbevx

DOUBLE PRECISION for dsbevx.
Arrays:
Array ab(ldab,*) contains either the upper or lower triangular part of the symmetric matrix A
(as specified by uplo) in band storage format.
vl, vu REAL for ssbevx

DOUBLE PRECISION for dsbevx.
for eigenvalues.
Constraint: vl < vu.
il, iu INTEGER.
if n = 0.
abstol REAL for ssbevx

ldq, ldz INTEGER. The leading dimensions of the output arrays q and z, respectively.
Constraints:
ldq ≥ 1, ldz ≥ 1;
If jobz = 'V', then ldq ≥ max(1, n) and ldz ≥ max(1, n).
Output Parameters
q REAL for ssbevx
DOUBLE PRECISION for dsbevx.
Array, size (ldq,n).
If jobz = 'V', contains the n-by-n orthogonal matrix used in the reduction to
tridiagonal form.
If jobz = 'N', the array q is not referenced.

'V', the exact value of m is not known in advance.
w, z REAL for ssbevx

DOUBLE PRECISION for dsbevx
Arrays:
w(*), size at least max(1, n). The first m elements of w contain the selected
z(ldz,*).
with w(i).

returned in ifail.

If uplo = 'U', the first superdiagonal and the diagonal of the tridiagonal
matrix T are returned in rows kd and kd+1 of ab, and if uplo = 'L', the
diagonal and first subdiagonal of T are returned in the first two rows of ab.
ifail INTEGER.
converge.
info INTEGER.

in the array ifail.

Specific details for the routine sbevx interface are the following:
q Holds the matrix Q of size (n, n).

Note that there will be an error condition if either ifail or q is present and z is
omitted.
Application Notes
If abstol is less than or equal to zero, then ε*||T||_1 is used as tolerance, where ε is the machine
precision and T is the tridiagonal matrix obtained by reducing A to tridiagonal form. Eigenvalues will be
computed most accurately when abstol is set to twice the underflow threshold 2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, set abstol
to 2*?lamch('S').
?hbevx
Computes selected eigenvalues and, optionally,
eigenvectors of a Hermitian band matrix.
Syntax
call chbevx(jobz, range, uplo, n, kd, ab, ldab, q, ldq, vl, vu, il, iu, abstol, m, w,
z, ldz, work, rwork, iwork, ifail, info)
call zhbevx(jobz, range, uplo, n, kd, ab, ldab, q, ldq, vl, vu, il, iu, abstol, m, w,
z, ldz, work, rwork, iwork, ifail, info)
call hbevx(ab, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,q] [,abstol]
[,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a complex Hermitian band matrix
A. Eigenvalues and eigenvectors can be selected by specifying either a range of values or a range of indices
for the desired eigenvalues.
Input Parameters





(kd ≥ 0).
ab, work COMPLEX for chbevx
DOUBLE COMPLEX for zhbevx.

Arrays:
vl, vu REAL for chbevx

DOUBLE PRECISION for zhbevx.
for eigenvalues.
Constraint: vl < vu.
il, iu INTEGER.
abstol REAL for chbevx

DOUBLE PRECISION for zhbevx.
ldq, ldz INTEGER. The leading dimensions of the output arrays q and z, respectively.
Constraints:
ldq ≥ 1, ldz ≥ 1;
If jobz = 'V', then ldq ≥ max(1, n) and ldz ≥ max(1, n).
rwork REAL for chbevx

DOUBLE PRECISION for zhbevx
Output Parameters
q COMPLEX for chbevx
DOUBLE COMPLEX for zhbevx.
Array, size (ldq,n).
If jobz = 'V', contains the n-by-n unitary matrix used in the reduction to
tridiagonal form.
If jobz = 'N', the array q is not referenced.

0 ≤ m ≤ n.
'V', the exact value of m is not known in advance.
w REAL for chbevx

DOUBLE PRECISION for zhbevx
Array, size at least max(1, n). The first m elements contain the selected
z COMPLEX for chbevx

DOUBLE COMPLEX for zhbevx.
Array z(ldz,*).

with w(i).

returned in ifail.

If uplo = 'U', the first superdiagonal and the diagonal of the tridiagonal
matrix T are returned in rows kd and kd+1 of ab, and if uplo = 'L', the
diagonal and first subdiagonal of T are returned in the first two rows of ab.
ifail INTEGER.
info > 0, ifail contains the indices of the eigenvectors that failed to
converge.
info INTEGER.

in the array ifail.

Specific details for the routine hbevx interface are the following:

Note that there will be an error condition if either ifail or q is present and z is
omitted.
Application Notes
less than or equal to abstol + ε * max( |a|,|b| ), where ε is the machine precision.
to 2*?lamch('S').
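The acceptance test for an eigenvalue interval [a, b] can be expressed directly: the interval counts as converged once its width is at most abstol + ε * max(|a|, |b|). The following Python sketch only illustrates that inequality; the function name interval_converged is hypothetical, and eps is passed in explicitly rather than taken from ?lamch.

```python
def interval_converged(a, b, abstol, eps):
    """Return True if the interval [a, b] is narrow enough to be accepted.

    Hypothetical helper: checks  b - a <= abstol + eps * max(|a|, |b|).
    """
    if b < a:
        raise ValueError("expected a <= b")
    return (b - a) <= abstol + eps * max(abs(a), abs(b))


if __name__ == "__main__":
    eps = 1e-16  # assumed value standing in for the machine precision
    print(interval_converged(1.0, 1.0 + 1e-10, 1e-9, eps))
    print(interval_converged(1.0, 1.1, 1e-9, eps))
```

The relative term ε * max(|a|, |b|) keeps the test meaningful for large eigenvalues, where an absolute tolerance alone could be smaller than the spacing of representable numbers.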
?stev
Computes all eigenvalues and, optionally,
eigenvectors of a real symmetric tridiagonal matrix.
Syntax
call sstev(jobz, n, d, e, z, ldz, work, info)
call dstev(jobz, n, d, e, z, ldz, work, info)
call stev(d, e [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all eigenvalues and, optionally, eigenvectors of a real symmetric tridiagonal matrix A.
Input Parameters

d, e, work REAL for sstev

DOUBLE PRECISION for dstev.
Arrays:
Array d(*) contains the n diagonal elements of the tridiagonal matrix A.

Array e(*) contains the n-1 subdiagonal elements of the tridiagonal matrix
A.
The size of e must be at least max(1, n). The n-th element of this array is
used as workspace.
The dimension of work must be at least max(1, 2n-2).
If jobz = 'N', work is not referenced.
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1. If jobz =
'V' then ldz ≥ max(1, n).
Output Parameters
d On exit, if info = 0, contains the eigenvalues of the matrix A in ascending

order.
z REAL for sstev

DOUBLE PRECISION for dstev

with the eigenvalue returned in d(i).
If jobz = 'N', then z is not referenced.
e On exit, this array is overwritten with intermediate results.
info INTEGER.
If info = i, then the algorithm failed to converge;
i elements of e did not converge to zero.

Specific details for the routine stev interface are the following:

?stevd
Computes all eigenvalues and, optionally, all
eigenvectors of a real symmetric tridiagonal matrix
using divide and conquer algorithm.
Syntax
call sstevd(jobz, n, d, e, z, ldz, work, lwork, iwork, liwork, info)
call dstevd(jobz, n, d, e, z, ldz, work, lwork, iwork, liwork, info)
call stevd(d, e [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally all the eigenvectors, of a real symmetric tridiagonal
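matrix T. In other words, the routine can compute the spectral factorization of T as: T = Z*Λ*Z^T.
Here Λ is the diagonal matrix of the eigenvalues λi, and the columns zi of the orthogonal matrix Z are the
corresponding eigenvectors, so that
T*zi = λi*zi for i = 1, 2, ..., n.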
There is no complex analogue of this routine.
Input Parameters

d, e, work REAL for sstevd

DOUBLE PRECISION for dstevd.
Arrays:
d(*) contains the n diagonal elements of the tridiagonal matrix T.
e(*) contains the n-1 off-diagonal elements of T.
The dimension of e must be at least max(1, n). The n-th element of this
array is used as workspace.
The dimension of work must be at least lwork.

ldz ≥ 1 if jobz = 'N';
ldz ≥ max(1, n) if jobz = 'V'.
lwork INTEGER.
Constraints:
if jobz = 'N' or n ≤ 1, then lwork ≥ 1;
if jobz = 'V' and n > 1, then lwork ≥ n^2 + 4*n + 1.

Notes for details.
liwork INTEGER.
Constraints:
if jobz = 'N' or n ≤ 1, then liwork ≥ 1;

Notes for details.
Output Parameters
d On exit, if info = 0, contains the eigenvalues of the matrix T in ascending

order.
See also info.
z REAL for sstevd

DOUBLE PRECISION for dstevd
Array, size (ldz,*) .
If jobz = 'V', then this array is overwritten by the orthogonal matrix Z

which contains the eigenvectors of T.
e On exit, this array is overwritten with intermediate results.
lwork.
liwork.
info INTEGER.

zero.

Specific details for the routine stevd interface are the following:

Application Notes
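If λi is an exact eigenvalue and μi is the corresponding computed value, then
|μi - λi| ≤ c(n)*ε*||T||_2,
where c(n) is a modestly increasing function of n and ε is the machine precision.
If zi is the corresponding exact eigenvector, and wi is the corresponding computed vector, then the angle
θ(zi, wi) between them is bounded as follows:
θ(zi, wi) ≤ c(n)*ε*||T||_2 / min i≠j |λi - λj|.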
Thus the accuracy of a computed eigenvector depends on the gap between its eigenvalue and all the other
eigenvalues.
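To make the gap dependence concrete, the two bounds can be evaluated for a given spectrum. The sketch below, in Python for illustration, takes a list of distinct eigenvalues, uses the fact that the 2-norm of a symmetric matrix equals its largest eigenvalue magnitude, and returns the eigenvalue bound together with one angle bound per eigenvalue. The function name is hypothetical, and c is an assumed placeholder for c(n), whose actual value the text does not specify.

```python
def eigen_error_bounds(eigs, eps, c=1.0):
    """Evaluate the eigenvalue and eigenvector-angle bounds for a spectrum.

    Hypothetical helper. eigs must contain at least two distinct values;
    c stands in for the unspecified factor c(n).
    Returns (value_bound, [angle_bound for each eigenvalue]).
    """
    if len(eigs) < 2:
        raise ValueError("need at least two eigenvalues to form a gap")
    norm2 = max(abs(x) for x in eigs)      # ||T||_2 for symmetric T
    value_bound = c * eps * norm2
    angle_bounds = []
    for i, li in enumerate(eigs):
        # Gap between this eigenvalue and its nearest neighbour.
        gap = min(abs(li - lj) for j, lj in enumerate(eigs) if j != i)
        if gap == 0.0:
            raise ValueError("repeated eigenvalue: gap is zero")
        angle_bounds.append(value_bound / gap)
    return value_bound, angle_bounds


if __name__ == "__main__":
    vb, ab = eigen_error_bounds([1.0, 1.001, 3.0], eps=1e-16)
    print(vb)
    print(ab)
```

In this example the eigenvectors for the two close eigenvalues 1.0 and 1.001 get a much weaker angle bound than the eigenvector for the well-separated eigenvalue 3.0.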
query.
workspace.
?stevx
Syntax
call sstevx(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, work, iwork,
ifail, info)
call dstevx(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, work, iwork,
ifail, info)
call stevx(d, e, w [, z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric tridiagonal
matrix A. Eigenvalues and eigenvectors can be selected by specifying either a range of values or a range of
indices for the desired eigenvalues.
Input Parameters



interval: vl < w(i) ≤ vu.
d, e, work REAL for sstevx

DOUBLE PRECISION for dstevx.
Arrays:
d(*) contains the n diagonal elements of the tridiagonal matrix A.
e(*) contains the n-1 subdiagonal elements of A.
The dimension of e must be at least max(1, n-1). The n-th element of this
vl, vu REAL for sstevx

for eigenvalues.
Constraint: vl < vu.
il, iu INTEGER.
abstol REAL for sstevx

DOUBLE PRECISION for dstevx. The absolute error tolerance to which each
eigenvalue is required. See Application notes for details on error tolerance.
ldz INTEGER. The leading dimension of the output array z; ldz ≥ 1. If jobz =
'V', then ldz ≥ max(1, n).
Output Parameters

0 ≤ m ≤ n.
'V' the exact value of m is unknown.
w, z REAL for sstevx

Arrays:
The first m elements of w contain the selected eigenvalues of the matrix A
in ascending order.
z(ldz,*) .
with w(i).

returned in ifail.
d, e On exit, these arrays may be multiplied by a constant factor chosen to

avoid overflow or underflow in computing the eigenvalues.
ifail INTEGER.
converge.
info INTEGER.

in the array ifail.

Specific details for the routine stevx interface are the following:

Application Notes
If abstol is less than or equal to zero, then ε*||A||_1 is used instead, where ε is the machine precision.
Eigenvalues are computed most accurately when abstol is set to twice the underflow threshold
2*?lamch('S'), not zero.
If this routine returns with info > 0, indicating that some eigenvectors did not converge, set abstol to
2*?lamch('S').
?stevr
Computes selected eigenvalues and, optionally,
eigenvectors of a real symmetric tridiagonal matrix
using the Relatively Robust Representations.
Syntax
call sstevr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz, work,
lwork, iwork, liwork, info)
call dstevr(jobz, range, n, d, e, vl, vu, il, iu, abstol, m, w, z, ldz, isuppz, work,
lwork, iwork, liwork, info)
call stevr(d, e, w [, z] [,vl] [,vu] [,il] [,iu] [,m] [,isuppz] [,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
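The routine computes selected eigenvalues and, optionally, eigenvectors of a real symmetric tridiagonal
matrix T. Eigenvalues and eigenvectors can be selected by specifying either a range of values or a range of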
indices for the desired eigenvalues.
Whenever possible, the routine calls stemr to compute the eigenspectrum using Relatively Robust
Representations. stegr computes eigenvalues by the dqds algorithm, while orthogonal eigenvectors are
computed from various "good" L*D*LT representations (also known as Relatively Robust Representations).
Gram-Schmidt orthogonalization is avoided as far as possible. More specifically, the various steps of the
algorithm are as follows. For the i-th unreduced block of T:
a. Compute T - σ_i = L_i*D_i*L_i^T, such that L_i*D_i*L_i^T is a relatively robust representation.
b. Compute the eigenvalues, λ_j, of L_i*D_i*L_i^T to high relative accuracy by the dqds algorithm.
c. If there is a cluster of close eigenvalues, "choose" σ_i close to the cluster, and go to Step (a).
d. Given the approximate eigenvalue λ_j of L_i*D_i*L_i^T, compute the corresponding eigenvector by forming a
rank-revealing twisted factorization.
The routine ?stevr calls stemr when the full spectrum is requested on machines which conform to the
IEEE-754 floating point standard. ?stevr calls stebz and stein on non-IEEE machines and when partial
spectrum requests are made.
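The following is a minimal sketch of a FORTRAN 77 interface call to dstevr, not a definitive usage pattern. The matrix values, the program name stevr_sketch, and the variables wquery and iwquery are assumptions made for illustration. The sketch first issues a workspace query (lwork = liwork = -1) and then computes all eigenvalues and eigenvectors (range = 'A'). It declares dstevr as external instead of including mkl.fi and must be linked against Intel MKL or another LAPACK implementation.

```fortran
program stevr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: d(n), e(n), w(n), z(n, n), wquery(1)
  real(dp), allocatable :: work(:)
  integer, allocatable :: iwork(:)
  integer :: isuppz(2*n), iwquery(1), lwork, liwork, m, info
  external :: dstevr

  ! Illustrative matrix (an assumption): 2 on the diagonal, -1 beside it.
  d = 2.0_dp
  e = -1.0_dp                      ! only e(1:n-1) carries matrix data

  ! Workspace query: lwork = liwork = -1 returns the recommended sizes
  ! in wquery(1) and iwquery(1) without computing the eigenvalues.
  call dstevr('V', 'A', n, d, e, 0.0_dp, 0.0_dp, 0, 0, 0.0_dp, m, w, z, n, &
              isuppz, wquery, -1, iwquery, -1, info)
  if (info /= 0) error stop 'dstevr workspace query failed'

  ! Never go below the documented minimum sizes 20*n and 10*n.
  lwork  = max(1, 20*n, int(wquery(1)))
  liwork = max(1, 10*n, iwquery(1))
  allocate(work(lwork), iwork(liwork))

  ! range = 'A': all eigenvalues; vl, vu, il, iu are not referenced.
  call dstevr('V', 'A', n, d, e, 0.0_dp, 0.0_dp, 0, 0, 0.0_dp, m, w, z, n, &
              isuppz, work, lwork, iwork, liwork, info)
  if (info /= 0) error stop 'dstevr failed'

  print '(a,i0)', 'eigenvalues found: ', m
  print '(4f12.6)', w(1:m)
end program stevr_sketch
```

Because d and e may be scaled on exit, keep a copy of them if the original matrix is needed after the call.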
Input Parameters


If range = 'V', the routine computes eigenvalues w(i) in the half-open
interval:
vl < w(i) ≤ vu.
For range = 'V' or 'I' and iu-il < n-1, sstebz/dstebz and
sstein/dstein are called.
d, e, work REAL for sstevr

DOUBLE PRECISION for dstevr.
Arrays:
d(*) contains the n diagonal elements of the tridiagonal matrix T.
e(*) contains the n-1 subdiagonal elements of T.
The dimension of e must be at least max(1, n-1). The n-th element of this
vl, vu REAL for sstevr

for eigenvalues.
Constraint: vl < vu.
il, iu INTEGER.
abstol REAL for sstevr

The absolute error tolerance to which each eigenvalue/eigenvector is
required.
eigenvectors are bounded by abstol. If abstol < n*eps*||T||, then
n*eps*||T|| will be used in its place, where eps is the machine precision,
and ||T|| is the 1-norm of the matrix T. The eigenvalues are computed to
an accuracy of eps*||T|| irrespective of abstol.

Constraints:
lwork INTEGER.
The dimension of the array work. Constraint:
lwork ≥ max(1, 20*n).
Notes for details.
iwork INTEGER.
liwork INTEGER.
The dimension of the array iwork,
liwork ≥ max(1, 10*n).
Notes for details.
Output Parameters

range = 'V' the exact value of m is unknown.
w, z REAL for sstevr

Arrays:
The first m elements of w contain the selected eigenvalues of the matrix T
in ascending order.
z(ldz,*).
with w(i).
d, e On exit, these arrays may be multiplied by a constant factor chosen to

avoid overflow or underflow in computing the eigenvalues.
isuppz INTEGER.
isuppz(2i-1) through isuppz(2i).
Implemented only for range = 'A' or 'I' and iu-il = n-1.
lwork.
liwork.
info INTEGER.

Specific details for the routine stevr interface are the following:
isuppz Holds the vector of length (2*n), where the values (2*m) are significant.

Application Notes
Normal execution of the routine ?stemr may create NaNs and infinities and hence may abort due to a floating
point exception in environments which do not handle NaNs and infinities in the IEEE standard default manner.
query.
workspace.
Nonsymmetric Eigenvalue Problems: LAPACK Driver Routines

This section describes LAPACK driver routines used for solving nonsymmetric eigenproblems. See also
computational routines that can be called to solve these problems.
Table "Driver Routines for Solving Nonsymmetric Eigenproblems" lists all such driver routines for the
FORTRAN 77 interface. The corresponding routine names in the Fortran 95 interface are the same except that
the first character is removed.
Driver Routines for Solving Nonsymmetric Eigenproblems
?gees Computes the eigenvalues and Schur factorization of a general matrix, and orders
the factorization so that selected eigenvalues are at the top left of the Schur form.
?geesx Computes the eigenvalues and Schur factorization of a general matrix, orders the
factorization and computes reciprocal condition numbers.
?geev Computes the eigenvalues and left and right eigenvectors of a general matrix.
?geevx Computes the eigenvalues and left and right eigenvectors of a general matrix, with
preliminary matrix balancing, and computes reciprocal condition numbers for the
eigenvalues and right eigenvectors.
?gees
Computes the eigenvalues and Schur factorization of a
general matrix, and orders the factorization so that
selected eigenvalues are at the top left of the Schur
form.
Syntax
call sgees(jobvs, sort, select, n, a, lda, sdim, wr, wi, vs, ldvs, work, lwork, bwork,
info)
call dgees(jobvs, sort, select, n, a, lda, sdim, wr, wi, vs, ldvs, work, lwork, bwork,
info)
call cgees(jobvs, sort, select, n, a, lda, sdim, w, vs, ldvs, work, lwork, rwork,
bwork, info)
call zgees(jobvs, sort, select, n, a, lda, sdim, w, vs, ldvs, work, lwork, rwork,
bwork, info)
call gees(a, wr, wi [,vs] [,select] [,sdim] [,info])
call gees(a, w [,vs] [,select] [,sdim] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes for an n-by-n real/complex nonsymmetric matrix A, the eigenvalues, the real-Schur/
Schur form T, and, optionally, the matrix of Schur vectors Z. This gives the Schur factorization A = Z*T*Z^H.
Optionally, it also orders the eigenvalues on the diagonal of the real-Schur/Schur form so that selected
eigenvalues are at the top left. The leading columns of Z then form an orthonormal basis for the invariant
subspace corresponding to the selected eigenvalues.
A real matrix is in real-Schur form if it is upper quasi-triangular with 1-by-1 and 2-by-2 blocks. 2-by-2 blocks
will be standardized in the form
    [ a  b ]
    [ c  a ]
where b*c < 0. The eigenvalues of such a block are a ± sqrt(b*c).
A complex matrix is in Schur form if it is upper triangular.
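As an illustration of the standardized 2-by-2 block, the sketch below evaluates its eigenvalues, assuming the usual LAPACK convention that the block has the form [a b; c a] with b*c < 0, so that the eigenvalues are a ± sqrt(b*c), a complex conjugate pair. The module schur_block_sketch and the helper block_eigenvalues are hypothetical names introduced here; they are not part of Intel MKL.

```fortran
module schur_block_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Eigenvalues of the standardized block [a b; c a] with b*c < 0.
  ! block_eigenvalues is a hypothetical helper, not an Intel MKL routine.
  subroutine block_eigenvalues(a, b, c, lam1, lam2)
    real(dp), intent(in) :: a, b, c
    complex(dp), intent(out) :: lam1, lam2
    real(dp) :: im
    if (b*c >= 0.0_dp) error stop 'block is not in standardized form'
    ! sqrt(b*c) is purely imaginary; take the roots separately so the
    ! product b*c is not formed for large entries.
    im = sqrt(abs(b)) * sqrt(abs(c))
    lam1 = cmplx(a,  im, kind=dp)   ! positive imaginary part first
    lam2 = cmplx(a, -im, kind=dp)
  end subroutine block_eigenvalues
end module schur_block_sketch

program schur_block_demo
  use schur_block_sketch
  implicit none
  complex(dp) :: lam1, lam2
  ! Block [1 2; -8 1]: b*c = -16, so the eigenvalues are 1 + 4i and 1 - 4i.
  call block_eigenvalues(1.0_dp, 2.0_dp, -8.0_dp, lam1, lam2)
  if (abs(lam1 - cmplx(1.0_dp, 4.0_dp, kind=dp)) > 1.0e-12_dp) &
    error stop 'unexpected eigenvalue'
  print *, lam1, lam2
end program schur_block_demo
```

The ordering of the pair matches the way wr and wi report complex conjugate eigenvalues: the one with positive imaginary part comes first.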
Input Parameters
jobvs CHARACTER*1. Must be 'N' or 'V'.

If jobvs = 'N', then Schur vectors are not computed.
If jobvs = 'V', then Schur vectors are computed.
sort CHARACTER*1. Must be 'N' or 'S'. Specifies whether or not to order the
eigenvalues on the diagonal of the Schur form.
If sort = 'N', then eigenvalues are not ordered.
If sort = 'S', eigenvalues are ordered (see select).
select LOGICAL FUNCTION of two REAL arguments for real flavors.

LOGICAL FUNCTION of one COMPLEX argument for complex flavors.
select must be declared EXTERNAL in the calling subroutine.
If sort = 'S', select is used to select eigenvalues to sort to the top left of
the Schur form.
If sort = 'N', select is not referenced.
For real flavors:

An eigenvalue wr(j)+sqrt(-1)*wi(j) is selected if select(wr(j), wi(j)) is
true; that is, if either one of a complex conjugate pair of eigenvalues is
selected, then both complex eigenvalues are selected.
For complex flavors:
An eigenvalue w(j) is selected if select(w(j)) is true.
Note that a selected complex eigenvalue may no longer satisfy select(wr(j),
wi(j)) = .TRUE. after ordering, since ordering may change the value of
complex eigenvalues (especially if the eigenvalue is ill-conditioned); in this
case info may be set to n+2 (see info below).
a, work REAL for sgees

DOUBLE PRECISION for dgees
COMPLEX for cgees
DOUBLE COMPLEX for zgees.
Arrays:
a(lda,*) is an array containing the n-by-n matrix A.
ldvs INTEGER. The leading dimension of the output array vs. Constraints:
ldvs ≥ 1;
ldvs ≥ max(1, n) if jobvs = 'V'.
lwork INTEGER.
Constraint:
lwork ≥ max(1, 3n) for real flavors;
lwork ≥ max(1, 2n) for complex flavors.
xerbla.
rwork REAL for cgees

DOUBLE PRECISION for zgees
bwork LOGICAL. Workspace array, size at least max(1, n). Not referenced if sort
= 'N'.
Output Parameters
a On exit, this array is overwritten by the real-Schur/Schur form T.
sdim INTEGER.
If sort = 'N', sdim = 0.
If sort = 'S', sdim is equal to the number of eigenvalues (after sorting)

for which select is true.
Note that for real flavors complex conjugate pairs for which select is true for
either eigenvalue count as 2.
wr, wi REAL for sgees

Arrays, size at least max(1, n) each. Contain the real and imaginary parts,
respectively, of the computed eigenvalues, in the same order that they
appear on the diagonal of the output real-Schur form T. Complex conjugate
pairs of eigenvalues appear consecutively with the eigenvalue having
positive imaginary part first.
w COMPLEX for cgees

Array, size at least max(1, n). Contains the computed eigenvalues. The
eigenvalues are stored in the same order as they appear on the diagonal of
the output Schur form T.
vs REAL for sgees

COMPLEX for cgees
Array vs(ldvs,*);the second dimension of vs must be at least max(1, n).
If jobvs = 'V', vs contains the orthogonal/unitary matrix Z of Schur
vectors.
If jobvs = 'N', vs is not referenced.
lwork.
info INTEGER.
If info = i, and
i ≤ n:
the QR algorithm failed to compute all the eigenvalues; elements 1:ilo-1
and i+1:n of wr and wi (for real flavors) or w (for complex flavors) contain
those eigenvalues which have converged; if jobvs = 'V', vs contains the
matrix which reduces A to its partially converged Schur form;
i = n+1:
the eigenvalues could not be reordered because some eigenvalues were too
close to separate (the problem is very ill-conditioned);
i = n+2:
after reordering, round-off changed values of some complex eigenvalues so
that leading eigenvalues in the Schur form no longer satisfy select
= .TRUE. This could also be caused by underflow due to scaling.

Specific details for the routine gees interface are the following:
vs Holds the matrix VS of size (n, n).
jobvs Restored based on the presence of the argument vs as follows:

jobvs = 'V', if vs is present,
jobvs = 'N', if vs is omitted.
sort Restored based on the presence of the argument select as follows:

sort = 'S', if select is present,
sort = 'N', if select is omitted.
Application Notes
lwork = -1.
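The sketch below shows one possible way to combine a select function with a workspace query in a FORTRAN 77 interface call to dgees. The selector in_left_half_plane, the module and program names, the variable wquery, and the matrix values are assumptions made for illustration. The selector is a module function, so its interface is explicit and no separate EXTERNAL declaration is needed for it; dgees itself is declared external instead of including mkl.fi. The program must be linked against Intel MKL or another LAPACK implementation.

```fortran
module gees_select_sketch
  implicit none
contains
  ! Hypothetical selector: true for eigenvalues with negative real part.
  logical function in_left_half_plane(wr, wi)
    double precision, intent(in) :: wr, wi   ! wi is unused by this rule
    in_left_half_plane = (wr < 0.0d0)
  end function in_left_half_plane
end module gees_select_sketch

program gees_sketch
  use gees_select_sketch
  implicit none
  integer, parameter :: n = 3
  double precision :: a(n, n), wr(n), wi(n), vs(n, n), wquery(1)
  double precision, allocatable :: work(:)
  logical :: bwork(n)
  integer :: lwork, sdim, info
  external :: dgees

  ! Illustrative upper triangular matrix (column-major) with
  ! eigenvalues 2, -1 and -3 on its diagonal.
  a = reshape([2.0d0,  0.0d0,  0.0d0, &
               1.0d0, -1.0d0,  0.0d0, &
               0.0d0,  1.0d0, -3.0d0], [n, n])

  ! Workspace query: with lwork = -1 only wquery(1) is set.
  call dgees('V', 'S', in_left_half_plane, n, a, n, sdim, wr, wi, vs, n, &
             wquery, -1, bwork, info)
  if (info /= 0) error stop 'dgees workspace query failed'
  lwork = max(1, 3*n, int(wquery(1)))
  allocate(work(lwork))

  ! sort = 'S': selected eigenvalues are moved to the top left of T.
  call dgees('V', 'S', in_left_half_plane, n, a, n, sdim, wr, wi, vs, n, &
             work, lwork, bwork, info)
  if (info /= 0) error stop 'dgees failed'

  print '(a,i0)', 'sdim = ', sdim
  print '(a,3f10.4)', 'wr   = ', wr
end program gees_sketch
```

On exit a holds the real-Schur form T and vs holds the Schur vectors Z; the first sdim columns of vs span the invariant subspace of the selected eigenvalues.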
?geesx
Computes the eigenvalues and Schur factorization of a
general matrix, orders the factorization and computes
reciprocal condition numbers.
Syntax
call sgeesx(jobvs, sort, select, sense, n, a, lda, sdim, wr, wi, vs, ldvs, rconde,
rcondv, work, lwork, iwork, liwork, bwork, info)
call dgeesx(jobvs, sort, select, sense, n, a, lda, sdim, wr, wi, vs, ldvs, rconde,
rcondv, work, lwork, iwork, liwork, bwork, info)
call cgeesx(jobvs, sort, select, sense, n, a, lda, sdim, w, vs, ldvs, rconde, rcondv,
work, lwork, rwork, bwork, info)
call zgeesx(jobvs, sort, select, sense, n, a, lda, sdim, w, vs, ldvs, rconde, rcondv,
work, lwork, rwork, bwork, info)
call geesx(a, wr, wi [,vs] [,select] [,sdim] [,rconde] [,rcondv] [,info])
call geesx(a, w [,vs] [,select] [,sdim] [,rconde] [,rcondv] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes for an n-by-n real/complex nonsymmetric matrix A, the eigenvalues, the real-Schur/
Schur form T, and, optionally, the matrix of Schur vectors Z. This gives the Schur factorization A = Z*T*Z^H.
Optionally, it also orders the eigenvalues on the diagonal of the real-Schur/Schur form so that selected
eigenvalues are at the top left; computes a reciprocal condition number for the average of the selected
eigenvalues (rconde); and computes a reciprocal condition number for the right invariant subspace
corresponding to the selected eigenvalues (rcondv). The leading columns of Z form an orthonormal basis for
this invariant subspace.
For further explanation of the reciprocal condition numbers rconde and rcondv, see [LUG], Section 4.10
(where these quantities are called s and sep respectively).
A real matrix is in real-Schur form if it is upper quasi-triangular with 1-by-1 and 2-by-2 blocks. 2-by-2 blocks
will be standardized in the form
    [ a  b ]
    [ c  a ]
where b*c < 0. The eigenvalues of such a block are a ± sqrt(b*c).
A complex matrix is in Schur form if it is upper triangular.
Input Parameters
jobvs CHARACTER*1. Must be 'N' or 'V'.

If jobvs = 'N', then Schur vectors are not computed.
If jobvs = 'V', then Schur vectors are computed.
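sort CHARACTER*1. Must be 'N' or 'S'. Specifies whether or not to order the
eigenvalues on the diagonal of the Schur form.
If sort = 'N', then eigenvalues are not ordered.
If sort = 'S', eigenvalues are ordered (see select).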
select LOGICAL FUNCTION of two REAL arguments for real flavors.

LOGICAL FUNCTION of one COMPLEX argument for complex flavors.
select must be declared EXTERNAL in the calling subroutine.
If sort = 'S', select is used to select eigenvalues to sort to the top left of
the Schur form.
If sort = 'N', select is not referenced.
For real flavors:

An eigenvalue wr(j)+sqrt(-1)*wi(j) is selected if select(wr(j), wi(j)) is
true; that is, if either one of a complex conjugate pair of eigenvalues is
selected, then both complex eigenvalues are selected.
For complex flavors:
An eigenvalue w(j) is selected if select(w(j)) is true.
Note that a selected complex eigenvalue may no longer satisfy select(wr(j),
wi(j)) = .TRUE. after ordering, since ordering may change the value of
complex eigenvalues (especially if the eigenvalue is ill-conditioned); in this
case info may be set to n+2 (see info below).
sense CHARACTER*1. Must be 'N', 'E', 'V', or 'B'. Determines which reciprocal
condition numbers are computed.
If sense = 'N', none are computed;
If sense = 'E', computed for average of selected eigenvalues only;
If sense = 'V', computed for selected right invariant subspace only;
If sense = 'B', computed for both.
If sense is 'E', 'V', or 'B', then sort must equal 'S'.
a, work REAL for sgeesx

DOUBLE PRECISION for dgeesx
COMPLEX for cgeesx
DOUBLE COMPLEX for zgeesx.
Arrays:
ldvs INTEGER. The leading dimension of the output array vs. Constraints:
ldvs ≥ 1;
ldvs ≥ max(1, n) if jobvs = 'V'.
lwork INTEGER.
The dimension of the array work. Constraint:
lwork ≥ max(1, 3n) for real flavors;
Also, if sense = 'E', 'V', or 'B', then
lwork ≥ n+2*sdim*(n-sdim) for real flavors;
lwork ≥ 2*sdim*(n-sdim) for complex flavors;
where sdim is the number of selected eigenvalues computed by this routine.
Note that 2*sdim*(n-sdim) ≤ n*n/2. Note also that an error is only
returned if lwork < max(1, 2*n), but if sense = 'E', or 'V', or 'B' this
may not be large enough.
For good performance, lwork must generally be larger.
If lwork = -1, then a workspace query is assumed; the routine only
calculates an upper bound on the optimal size of the array work, returns this
value as the first entry of the work array, and no error message related to
lwork is issued by xerbla.
iwork INTEGER.
Workspace array, size (liwork). Used in real flavors only. Not referenced if
sense = 'N' or 'E'.
liwork INTEGER.
The dimension of the array iwork. Used in real flavors only.
Constraint:
liwork ≥ 1;
if sense = 'V' or 'B', liwork ≥ sdim*(n-sdim).
rwork REAL for cgeesx

DOUBLE PRECISION for zgeesx
bwork LOGICAL. Workspace array, size at least max(1, n). Not referenced if sort
= 'N'.
Output Parameters
a On exit, this array is overwritten by the real-Schur/Schur form T.
sdim INTEGER.

for which select is true.
Note that for real flavors complex conjugate pairs for which select is true for
either eigenvalue count as 2.
wr, wi REAL for sgeesx

respectively, of the computed eigenvalues, in the same order that they
appear on the diagonal of the output real-Schur form T. Complex conjugate
pairs of eigenvalues appear consecutively with the eigenvalue having
positive imaginary part first.
w COMPLEX for cgeesx

Array, size at least max(1, n). Contains the computed eigenvalues. The
eigenvalues are stored in the same order as they appear on the diagonal of
the output Schur form T.
vs REAL for sgeesx

COMPLEX for cgeesx
Array vs(ldvs,*); the second dimension of vs must be at least max(1, n).
If jobvs = 'V', vs contains the orthogonal/unitary matrix Z of Schur
vectors.
If jobvs = 'N', vs is not referenced.
rconde, rcondv REAL for single precision flavors DOUBLE PRECISION for double precision
flavors.
If sense = 'E' or 'B', rconde contains the reciprocal condition number for
the average of the selected eigenvalues.
If sense = 'N' or 'V', rconde is not referenced.
If sense = 'V' or 'B', rcondv contains the reciprocal condition number for
the selected right invariant subspace.
If sense = 'N' or 'E', rcondv is not referenced.
lwork.
info INTEGER.
If info = i, and
i ≤ n:
the QR algorithm failed to compute all the eigenvalues; elements 1:ilo-1
and i+1:n of wr and wi (for real flavors) or w (for complex flavors) contain
those eigenvalues which have converged; if jobvs = 'V', vs contains the
transformation which reduces A to its partially converged Schur form;
i = n+1:
the eigenvalues could not be reordered because some eigenvalues were too
close to separate (the problem is very ill-conditioned);
i = n+2:
after reordering, roundoff changed values of some complex eigenvalues so
that leading eigenvalues in the Schur form no longer satisfy select
= .TRUE. This could also be caused by underflow due to scaling.

Specific details for the routine geesx interface are the following:
wr Holds the vector of length (n). Used in real flavors only.
wi Holds the vector of length (n). Used in real flavors only.
w Holds the vector of length (n). Used in complex flavors only.
vs Holds the matrix VS of size (n, n).
jobvs Restored based on the presence of the argument vs as follows:

jobvs = 'V', if vs is present,
jobvs = 'N', if vs is omitted.

sense Restored based on the presence of arguments rconde and rcondv as follows:
sense = 'B', if both rconde and rcondv are present,

sense = 'E', if rconde is present and rcondv omitted,
sense = 'V', if rconde is omitted and rcondv present,
sense = 'N', if both rconde and rcondv are omitted.
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork (or liwork) for the first run
or set lwork = -1 (liwork = -1).
If you choose the first option and set lwork (or liwork) to any admissible size, that is, no less than the
minimal value described, the routine completes the task, though probably not as fast as with a recommended
workspace, and provides the recommended workspace in the first element of the corresponding array (work,
iwork) on exit. Use this value (work(1), iwork(1)) for subsequent runs.
If you set lwork = -1 (liwork = -1), the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work, iwork). This operation is called a workspace query.
Note that if you set lwork (liwork) to less than the minimal required value and not -1, the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
?geev
Computes the eigenvalues and left and right
eigenvectors of a general matrix.
Syntax
call sgeev(jobvl, jobvr, n, a, lda, wr, wi, vl, ldvl, vr, ldvr, work, lwork, info)
call dgeev(jobvl, jobvr, n, a, lda, wr, wi, vl, ldvl, vr, ldvr, work, lwork, info)
call cgeev(jobvl, jobvr, n, a, lda, w, vl, ldvl, vr, ldvr, work, lwork, rwork, info)
call zgeev(jobvl, jobvr, n, a, lda, w, vl, ldvl, vr, ldvr, work, lwork, rwork, info)
call geev(a, wr, wi [,vl] [,vr] [,info])
call geev(a, w [,vl] [,vr] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes for an n-by-n real/complex nonsymmetric matrix A, the eigenvalues and, optionally,
the left and/or right eigenvectors. The right eigenvector v of A satisfies
A*v = λ*v
where λ is its eigenvalue.
The left eigenvector u of A satisfies
u^H*A = λ*u^H
where uH denotes the conjugate transpose of u. The computed eigenvectors are normalized to have
Euclidean norm equal to 1 and largest component real.
Input Parameters
jobvl CHARACTER*1. Must be 'N' or 'V'.

If jobvl = 'N', then left eigenvectors of A are not computed.
If jobvl = 'V', then left eigenvectors of A are computed.
jobvr CHARACTER*1. Must be 'N' or 'V'.

If jobvr = 'N', then right eigenvectors of A are not computed.
If jobvr = 'V', then right eigenvectors of A are computed.
a, work REAL for sgeev

DOUBLE PRECISION for dgeev
COMPLEX for cgeev
DOUBLE COMPLEX for zgeev.
Arrays:
lda INTEGER. The leading dimension of the array a. Must be at least max(1,
n).
ldvl, ldvr INTEGER. The leading dimensions of the output arrays vl and vr,
respectively.
Constraints:
ldvl ≥ 1; ldvr ≥ 1.
If jobvl = 'V', ldvl ≥ max(1, n);
If jobvr = 'V', ldvr ≥ max(1, n).
lwork INTEGER.
Constraint for real flavors: lwork ≥ max(1, 3n). If computing eigenvectors
(jobvl = 'V' or jobvr = 'V'), lwork ≥ max(1, 4n).
Constraint for complex flavors: lwork ≥ max(1, 2n).

xerbla.
rwork REAL for cgeev

DOUBLE PRECISION for zgeev
Output Parameters
a On exit, this array is overwritten.
wr, wi REAL for sgeev

Contain the real and imaginary parts, respectively, of the computed
eigenvalues. Complex conjugate pairs of eigenvalues appear consecutively
with the eigenvalue having positive imaginary part first.
w COMPLEX for cgeev

Contains the computed eigenvalues.
vl, vr REAL for sgeev

COMPLEX for cgeev
Arrays:
vl(ldvl,*); the second dimension of vl must be at least max(1, n).
If jobvl = 'N', vl is not referenced.
For real flavors:

If the j-th eigenvalue is real, then u_j = vl(:,j), the j-th column of vl.
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then for
i = sqrt(-1), u_j = vl(:,j) + i*vl(:,j+1) and u_(j+1) = vl(:,j) -
i*vl(:,j+1).
For complex flavors:
u_j = vl(:,j), the j-th column of vl.
vr(ldvr,*); the second dimension of vr must be at least max(1, n).
If jobvr = 'N', vr is not referenced.
For real flavors:

If the j-th eigenvalue is real, then v_j = vr(:,j), the j-th column of vr.
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then for
i = sqrt(-1), v_j = vr(:,j) + i*vr(:,j+1) and v_(j+1) = vr(:,j) -
i*vr(:,j+1).
For complex flavors:
v_j = vr(:,j), the j-th column of vr.
lwork.
info INTEGER.
If info = i, the QR algorithm failed to compute all the eigenvalues, and no

eigenvectors have been computed; elements i+1:n of wr and wi (for real
flavors) or w (for complex flavors) contain those eigenvalues which have
converged.

Specific details for the routine geev interface are the following:
vl Holds the matrix VL of size (n, n).
vr Holds the matrix VR of size (n, n).
jobvl Restored based on the presence of the argument vl as follows:
jobvl = 'V', if vl is present,

jobvl = 'N', if vl is omitted.
jobvr Restored based on the presence of the argument vr as follows:
jobvr = 'V', if vr is present,

jobvr = 'N', if vr is omitted.
Application Notes
lwork = -1.
Note that if you set lwork to less than the minimal required value and not -1, the routine exits immediately
with an error and does not provide any information on the recommended workspace.
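For real flavors, vl and vr store each complex conjugate pair of eigenvectors as two real columns, as described under Output Parameters. The sketch below shows one way to expand that storage into complex vectors. The module eigvec_unpack_sketch and the helper unpack_eigenvectors are hypothetical names, not Intel MKL routines, and the demo uses hand-built values in place of actual sgeev/dgeev output so that it runs without linking a LAPACK library.

```fortran
module eigvec_unpack_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Expand the real-flavor storage of vl or vr into complex eigenvectors.
  ! unpack_eigenvectors is a hypothetical helper, not an Intel MKL routine.
  subroutine unpack_eigenvectors(n, wi, v, vc)
    integer, intent(in) :: n
    real(dp), intent(in) :: wi(n), v(n, n)
    complex(dp), intent(out) :: vc(n, n)
    integer :: j
    j = 1
    do while (j <= n)
      if (wi(j) /= 0.0_dp .and. j < n) then
        ! Columns j and j+1 hold the real and imaginary parts of a pair.
        vc(:, j)     = cmplx(v(:, j),  v(:, j + 1), kind=dp)
        vc(:, j + 1) = cmplx(v(:, j), -v(:, j + 1), kind=dp)
        j = j + 2
      else
        ! Real eigenvalue: the column is the eigenvector itself.
        vc(:, j) = cmplx(v(:, j), 0.0_dp, kind=dp)
        j = j + 1
      end if
    end do
  end subroutine unpack_eigenvectors
end module eigvec_unpack_sketch

program eigvec_unpack_demo
  use eigvec_unpack_sketch
  implicit none
  real(dp) :: a(2, 2), wr(2), wi(2), vr(2, 2), s, resid
  complex(dp) :: vc(2, 2), lam(2)
  integer :: j

  ! Hand-built data standing in for dgeev output (an assumption):
  ! A = [0 -1; 1 0] has eigenvalues +i and -i.
  a  = reshape([0.0_dp, 1.0_dp, -1.0_dp, 0.0_dp], [2, 2])
  wr = [0.0_dp, 0.0_dp]
  wi = [1.0_dp, -1.0_dp]
  s  = 1.0_dp / sqrt(2.0_dp)
  vr(:, 1) = [s, 0.0_dp]       ! real part of v_1
  vr(:, 2) = [0.0_dp, -s]      ! imaginary part of v_1

  call unpack_eigenvectors(2, wi, vr, vc)
  lam = cmplx(wr, wi, kind=dp)
  do j = 1, 2
    ! Check the defining relation A*v = lambda*v for each vector.
    resid = maxval(abs(matmul(a, vc(:, j)) - lam(j) * vc(:, j)))
    if (resid > 1.0e-12_dp) error stop 'A*v /= lambda*v'
    print *, j, resid
  end do
end program eigvec_unpack_demo
```

The same helper applies to vl, since left eigenvectors use the same column layout.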
?geevx
Computes the eigenvalues and left and right
eigenvectors of a general matrix, with preliminary
matrix balancing, and computes reciprocal condition
numbers for the eigenvalues and right eigenvectors.
Syntax
call sgeevx(balanc, jobvl, jobvr, sense, n, a, lda, wr, wi, vl, ldvl, vr, ldvr, ilo,
ihi, scale, abnrm, rconde, rcondv, work, lwork, iwork, info)
call dgeevx(balanc, jobvl, jobvr, sense, n, a, lda, wr, wi, vl, ldvl, vr, ldvr, ilo,
ihi, scale, abnrm, rconde, rcondv, work, lwork, iwork, info)
call cgeevx(balanc, jobvl, jobvr, sense, n, a, lda, w, vl, ldvl, vr, ldvr, ilo, ihi,
scale, abnrm, rconde, rcondv, work, lwork, rwork, info)
call zgeevx(balanc, jobvl, jobvr, sense, n, a, lda, w, vl, ldvl, vr, ldvr, ilo, ihi,
scale, abnrm, rconde, rcondv, work, lwork, rwork, info)
call geevx(a, wr, wi [,vl] [,vr] [,balanc] [,ilo] [,ihi] [,scale] [,abnrm] [, rconde]
[,rcondv] [,info])
call geevx(a, w [,vl] [,vr] [,balanc] [,ilo] [,ihi] [,scale] [,abnrm] [,rconde] [,
rcondv] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes for an n-by-n real/complex nonsymmetric matrix A, the eigenvalues and, optionally,
the left and/or right eigenvectors.
Optionally also, it computes a balancing transformation to improve the conditioning of the eigenvalues and
eigenvectors (ilo, ihi, scale, and abnrm), reciprocal condition numbers for the eigenvalues (rconde), and
reciprocal condition numbers for the right eigenvectors (rcondv).
The right eigenvector v of A satisfies
A*v = λ*v
where λ is its eigenvalue.
The left eigenvector u of A satisfies
u^H*A = λ*u^H
where uH denotes the conjugate transpose of u. The computed eigenvectors are normalized to have Euclidean
norm equal to 1 and largest component real.
Balancing a matrix means permuting the rows and columns to make it more nearly upper triangular, and
applying a diagonal similarity transformation D*A*inv(D), where D is a diagonal matrix, to make its rows and
columns closer in norm and the condition numbers of its eigenvalues and eigenvectors smaller. The computed
reciprocal condition numbers correspond to the balanced matrix. Permuting rows and columns will not
change the condition numbers (in exact arithmetic) but diagonal scaling will. For further explanation of
balancing, see [LUG], Section 4.10.
Input Parameters
balanc CHARACTER*1. Must be 'N', 'P', 'S', or 'B'. Indicates how the input
matrix should be diagonally scaled and/or permuted to improve the
conditioning of its eigenvalues.
If balanc = 'N', do not diagonally scale or permute;
If balanc = 'P', perform permutations to make the matrix more nearly

upper triangular. Do not diagonally scale;
If balanc = 'S', diagonally scale the matrix, i.e. replace A by

D*A*inv(D), where D is a diagonal matrix chosen to make the rows and
columns of A more equal in norm. Do not permute;
If balanc = 'B', both diagonally scale and permute A.
Computed reciprocal condition numbers will be for the matrix after

balancing and/or permuting. Permuting does not change condition numbers
(in exact arithmetic), but balancing does.

jobvl CHARACTER*1. Must be 'N' or 'V'.
If jobvl = 'N', left eigenvectors of A are not computed;
If jobvl = 'V', left eigenvectors of A are computed.
If sense = 'E' or 'B', then jobvl must be 'V'.

jobvr CHARACTER*1. Must be 'N' or 'V'.
If jobvr = 'N', right eigenvectors of A are not computed;
If jobvr = 'V', right eigenvectors of A are computed.
If sense = 'E' or 'B', then jobvr must be 'V'.
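sense CHARACTER*1. Must be 'N', 'E', 'V', or 'B'. Determines which reciprocal
condition numbers are computed.
If sense = 'N', none are computed;
If sense = 'E', computed for eigenvalues only;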
If sense = 'V', computed for right eigenvectors only;
If sense = 'B', computed for eigenvalues and right eigenvectors.
If sense is 'E' or 'B', both left and right eigenvectors must also be
computed (jobvl = 'V' and jobvr = 'V').
a, work REAL for sgeevx

DOUBLE PRECISION for dgeevx
COMPLEX for cgeevx
DOUBLE COMPLEX for zgeevx.
Arrays:
ldvl, ldvr INTEGER. The leading dimensions of the output arrays vl and vr,
respectively.
Constraints:
ldvl ≥ 1; ldvr ≥ 1.
If jobvl = 'V', ldvl ≥ max(1, n);
If jobvr = 'V', ldvr ≥ max(1, n).
lwork INTEGER.
For real flavors:
If sense = 'N' or 'E', lwork ≥ max(1, 2n), and if jobvl = 'V' or
jobvr = 'V', lwork ≥ 3n;
If sense = 'V' or 'B', lwork ≥ n*(n+6).
For complex flavors:
If sense = 'N' or 'E', lwork ≥ max(1, 2n);
If sense = 'V' or 'B', lwork ≥ n^2+2n.
For good performance, lwork must generally be larger.
xerbla.
rwork REAL for cgeevx

DOUBLE PRECISION for zgeevx
iwork INTEGER.
Workspace array, size at least max(1, 2n-2). Used in real flavors only. Not
referenced if sense = 'N' or 'E'.
Output Parameters
a On exit, this array is overwritten.

If jobvl = 'V' or jobvr = 'V', it contains the real-Schur/Schur form of
the balanced version of the input matrix A.
wr, wi REAL for sgeevx

respectively, of the computed eigenvalues. Complex conjugate pairs of
eigenvalues appear consecutively with the eigenvalue having positive
imaginary part first.
w COMPLEX for cgeevx

Array, size at least max(1, n). Contains the computed eigenvalues.
vl, vr REAL for sgeevx

COMPLEX for cgeevx

Arrays:
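For real flavors:

If the j-th eigenvalue is real, then u_j = vl(:,j), the j-th column of vl.
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then for
i = sqrt(-1), u_j = vl(:,j) + i*vl(:,j+1) and u_(j+1) = vl(:,j) -
i*vl(:,j+1).
For complex flavors:
u_j = vl(:,j), the j-th column of vl.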
For real flavors:

If the j-th eigenvalue is real, then v_j = vr(:,j), the j-th column of vr.
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then for
i = sqrt(-1), v_j = vr(:,j) + i*vr(:,j+1) and v_(j+1) = vr(:,j) -
i*vr(:,j+1).
For complex flavors:
v_j = vr(:,j), the j-th column of vr.
ilo, ihi INTEGER. ilo and ihi are integer values determined when A was balanced.
The balanced A(i,j) = 0 if i > j and j = 1,..., ilo-1 or
i = ihi+1,..., n.
If balanc = 'N' or 'S', ilo = 1 and ihi = n.

Array, size at least max(1, n). Details of the permutations and scaling
factors applied when balancing A.
If P(j) is the index of the row and column interchanged with row and
column j, and D(j) is the scaling factor applied to row and column j, then
scale(j) = P(j), for j = 1,..., ilo-1;
scale(j) = D(j), for j = ilo,..., ihi;
scale(j) = P(j), for j = ihi+1,..., n.
abnrm REAL for single-precision flavors

The one-norm of the balanced matrix (the maximum of the sum of absolute
values of elements of any column).
flavors.
Arrays, size at least max(1, n) each.
rconde(j) is the reciprocal condition number of the j-th eigenvalue.
rcondv(j) is the reciprocal condition number of the j-th right eigenvector.
lwork.
info INTEGER.
If info = i, the QR algorithm failed to compute all the eigenvalues, and no

eigenvectors or condition numbers have been computed; elements 1:ilo-1
and i+1:n of wr and wi (for real flavors) or w (for complex flavors) contain
eigenvalues which have converged.

Specific details for the routine geevx interface are the following:
rconde Holds the vector of length n.
rcondv Holds the vector of length n.
balanc Must be 'N', 'B', 'P' or 'S'. The default value is 'N'.



Application Notes
lwork = -1.
Singular Value Decomposition: LAPACK Driver Routines

Table "Driver Routines for Singular Value Decomposition" lists the LAPACK driver routines that perform
singular value decomposition for the FORTRAN 77 interface. The corresponding routine names in the Fortran
95 interface are the same except that the first character is removed.
Driver Routines for Singular Value Decomposition
?gesvd Computes the singular value decomposition of a general rectangular matrix.
?gesdd Computes the singular value decomposition of a general rectangular matrix using a
divide and conquer method.
?gejsv Computes the singular value decomposition of a real matrix using a preconditioned
Jacobi SVD method.
?gesvj Computes the singular value decomposition of a real matrix using Jacobi plane
rotations.
?ggsvd Computes the generalized singular value decomposition of a pair of general

rectangular matrices.
?gesvdx Computes the SVD and left and right singular vectors for a matrix.
?bdsvdx Computes the SVD of a bidiagonal matrix.
Singular Value Decomposition: LAPACK Computational Routines
?gesvd
Computes the singular value decomposition of a
general rectangular matrix.
Syntax
call sgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, info)
call dgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, info)
call cgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, rwork, info)
call zgesvd(jobu, jobvt, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, rwork, info)
call gesvd(a, s [,u] [,vt] [,ww] [,job] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the singular value decomposition (SVD) of a real/complex m-by-n matrix A, optionally
computing the left and/or right singular vectors. The SVD is written as
A = U*Σ*V^T for real routines
A = U*Σ*V^H for complex routines
where Σ is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m
orthogonal/unitary matrix, and V is an n-by-n orthogonal/unitary matrix. The diagonal elements of Σ are the
singular values of A; they are real and non-negative, and are returned in descending order. The first min(m,
n) columns of U and V are the left and right singular vectors of A.
Note that the routine returns V^T (for real flavors) or V^H (for complex flavors), not V.
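The sketch below illustrates the decomposition with a FORTRAN 77 interface call to dgesvd, using jobu = jobvt = 'S' to obtain the first min(m, n) singular vectors, and then rebuilds A from the returned factors. The matrix values, the program name gesvd_sketch, and the variables a0 and wquery are assumptions made for illustration. The sketch declares dgesvd as external instead of including mkl.fi and must be linked against Intel MKL or another LAPACK implementation.

```fortran
program gesvd_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 3, n = 2
  integer, parameter :: k = min(m, n)
  real(dp) :: a(m, n), a0(m, n), s(k), u(m, k), vt(k, n), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: lwork, info, i
  external :: dgesvd

  ! Illustrative 3-by-2 matrix (an assumption), stored column by column.
  a0 = reshape([1.0_dp, 2.0_dp, 3.0_dp, 4.0_dp, 5.0_dp, 6.0_dp], [m, n])
  a  = a0                      ! the contents of a are destroyed by dgesvd

  ! Workspace query: with lwork = -1 only wquery(1) is set.
  call dgesvd('S', 'S', m, n, a, m, s, u, m, vt, k, wquery, -1, info)
  if (info /= 0) error stop 'dgesvd workspace query failed'
  lwork = max(1, int(wquery(1)))
  allocate(work(lwork))

  ! jobu = jobvt = 'S': u is m-by-k and vt is k-by-n, so ldvt = k.
  call dgesvd('S', 'S', m, n, a, m, s, u, m, vt, k, work, lwork, info)
  if (info /= 0) error stop 'dgesvd failed'
  print *, 'singular values:', s

  ! Rebuild A as (U*diag(s))*vt; vt already holds V^T, so no transpose.
  do i = 1, k
    u(:, i) = s(i) * u(:, i)
  end do
  print *, 'max |A - U*S*VT| =', maxval(abs(a0 - matmul(u, vt)))
end program gesvd_sketch
```

The reconstruction multiplies by vt directly because the routine returns V^T rather than V.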
Input Parameters
jobu CHARACTER*1. Must be 'A', 'S', 'O', or 'N'. Specifies options for
computing all or part of the matrix U.
If jobu = 'A', all m columns of U are returned in the array u;
if jobu = 'S', the first min(m, n) columns of U (the left singular vectors)
are returned in the array u;
if jobu = 'O', the first min(m, n) columns of U (the left singular vectors)
are overwritten on the array a;
if jobu = 'N', no columns of U (no left singular vectors) are computed.
jobvt CHARACTER*1. Must be 'A', 'S', 'O', or 'N'. Specifies options for
computing all or part of the matrix V^T/V^H.
If jobvt = 'A', all n rows of V^T/V^H are returned in the array vt;
if jobvt = 'S', the first min(m,n) rows of V^T/V^H (the right singular
vectors) are returned in the array vt;
if jobvt = 'O', the first min(m,n) rows of V^T/V^H (the right singular
vectors) are overwritten on the array a;
if jobvt = 'N', no rows of V^T/V^H (no right singular vectors) are computed.
jobvt and jobu cannot both be 'O'.
a, work REAL for sgesvd

DOUBLE PRECISION for dgesvd
COMPLEX for cgesvd
DOUBLE COMPLEX for zgesvd.
Arrays:
a(lda,*) is an array containing the m-by-n matrix A.

Must be at least max(1, m) .
ldu, ldvt INTEGER. The leading dimensions of the output arrays u and vt,
respectively.
Constraints:
ldu ≥ 1; ldvt ≥ 1.
If jobu = 'A' or 'S', ldu ≥ m;
If jobvt = 'A', ldvt ≥ n;
If jobvt = 'S', ldvt ≥ min(m, n).
lwork INTEGER.
Constraints:
lwork ≥ 1
lwork ≥ max(3*min(m, n)+max(m, n), 5*min(m,n)) (for real flavors);
lwork ≥ 2*min(m, n)+max(m, n) (for complex flavors).
rwork REAL for cgesvd

DOUBLE PRECISION for zgesvd
Workspace array, size at least max(1, 5*min(m, n)). Used in complex
flavors only.
Output Parameters
a On exit,
If jobu = 'O', a is overwritten with the first min(m,n) columns of U (the
left singular vectors stored columnwise);
If jobvt = 'O', a is overwritten with the first min(m, n) rows of VT/VH (the
right singular vectors stored rowwise);
If jobu ≠ 'O' and jobvt ≠ 'O', the contents of a are destroyed.
s REAL for single precision flavors DOUBLE PRECISION for double precision
flavors.
Array, size at least max(1, min(m,n)). Contains the singular values of A
sorted so that s(i) ≥ s(i+1).
u, vt REAL for sgesvd

DOUBLE PRECISION for dgesvd
COMPLEX for cgesvd
DOUBLE COMPLEX for zgesvd.
Arrays:
u(ldu,*); the second dimension of u must be at least max(1, m) if jobu =
'A', and at least max(1, min(m, n)) if jobu = 'S'.
If jobu = 'A', u contains the m-by-m orthogonal/unitary matrix U.
If jobu = 'S', u contains the first min(m, n) columns of U (the left
singular vectors stored column-wise).
If jobu = 'N' or 'O', u is not referenced.
vt(ldvt,*); the second dimension of vt must be at least max(1, n).

If jobvt = 'A', vt contains the n-by-n orthogonal/unitary matrix VT/VH.
If jobvt = 'S', vt contains the first min(m, n) rows of VT/VH (the right
singular vectors stored row-wise).
If jobvt = 'N' or 'O', vt is not referenced.
work On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
For real flavors:
If info > 0, work(2:min(m,n)) contains the unconverged superdiagonal
elements of an upper bidiagonal matrix B whose diagonal is in s (not
necessarily sorted). B satisfies A=u*B*vt, so it has the same singular
values as A, and singular vectors related by u and vt.
rwork On exit (for complex flavors), if info > 0, rwork(1:min(m,n)-1)

contains the unconverged superdiagonal elements of an upper bidiagonal
matrix B whose diagonal is in s (not necessarily sorted). B satisfies A =
u*B*vt, so it has the same singular values as A, and singular vectors
related by u and vt.
info INTEGER.
If info = i and i > 0, ?bdsqr did not converge, and i specifies how many
superdiagonals of the intermediate bidiagonal form B did not converge to
zero (see the description of the work and rwork parameters for details).

Specific details for the routine gesvd interface are the following:
s Holds the vector of length min(m, n).
u If present and is a square m-by-m matrix, on exit contains the m-by-m
orthogonal/unitary matrix U.
Otherwise, if present, on exit contains the first min(m,n) columns of the matrix
U (left singular vectors stored column-wise).
vt If present and is a square n-by-n matrix, on exit contains the n-by-n orthogonal/
unitary matrix VT/VH.
Otherwise, if present, on exit contains the first min(m,n) rows of the matrix
VT/VH (right singular vectors stored row-wise).
ww Holds the vector of length min(m, n)-1. ww contains the unconverged

superdiagonal elements of an upper bidiagonal matrix B whose diagonal is in s
(not necessarily sorted). B satisfies A = U*B*VT, so it has the same singular
values as A, and singular vectors related by U and VT.
job Must be either 'N', or 'U', or 'V'. The default value is 'N'.
If job = 'U', and u is not present, then u is returned in the array a.
If job = 'V', and vt is not present, then vt is returned in the array a.
Application Notes
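If you set lwork = -1, the routine performs a workspace query: it only calculates the size of the work array
and returns this value in work(1), and no error message related to lwork is issued by xerbla.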
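The two-call pattern implied by lwork = -1 can be sketched as follows. This is a minimal illustration, not a definitive usage: the program name, the test matrix, and the variable wquery are assumptions introduced for the example, and the program must be linked against a library that provides dgesvd.

```fortran
program gesvd_query_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 4, n = 3, k = min(m, n)
  real(dp) :: a(m, n), s(k), u(m, k), vt(k, n), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: lwork, info, i, j
  external :: dgesvd

  ! Illustrative input only (hypothetical data): a(i,j) = 1/(i+j-1).
  do j = 1, n
    do i = 1, m
      a(i, j) = 1.0_dp / real(i + j - 1, dp)
    end do
  end do

  ! First call: lwork = -1 requests only the workspace size, returned in wquery(1).
  call dgesvd('S', 'S', m, n, a, m, s, u, m, vt, k, wquery, -1, info)
  if (info /= 0) stop 'dgesvd workspace query failed'
  lwork = max(1, int(wquery(1)))
  allocate(work(lwork))

  ! Second call: the decomposition itself. With jobu = jobvt = 'S',
  ! ldu >= m and ldvt >= min(m, n); the contents of a are destroyed on exit.
  call dgesvd('S', 'S', m, n, a, m, s, u, m, vt, k, work, lwork, info)
  if (info /= 0) then
    print '(a, i0)', 'dgesvd returned info = ', info
  else
    print '(a, 3es12.4)', 'singular values: ', s
  end if
end program gesvd_query_sketch
```

Note that vt holds the rows of VT, not V, so the right singular vectors are read row-wise.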
?gesdd
Computes the singular value decomposition of a
general rectangular matrix using a divide and conquer
method.
Syntax
call sgesdd(jobz, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, iwork, info)
call dgesdd(jobz, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, iwork, info)
call cgesdd(jobz, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, rwork, iwork, info)
call zgesdd(jobz, m, n, a, lda, s, u, ldu, vt, ldvt, work, lwork, rwork, iwork, info)
call gesdd(a, s [,u] [,vt] [,jobz] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes the singular value decomposition (SVD) of a real/complex m-by-n matrix A, optionally
computing the left and/or right singular vectors.
If singular vectors are desired, it uses a divide-and-conquer algorithm. The SVD is written
A = U*Σ*VT for real routines,
A = U*Σ*VH for complex routines,
where Σ is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m
orthogonal/unitary matrix, and V is an n-by-n orthogonal/unitary matrix. The diagonal elements of Σ are the
singular values of A; they are real and non-negative, and are returned in descending order. The first min(m,
n) columns of U and V are the left and right singular vectors of A.
Note that the routine returns vt = VT (for real flavors) or vt = VH (for complex flavors), not V.
Input Parameters
jobz CHARACTER*1. Must be 'A', 'S', 'O', or 'N'.

Specifies options for computing all or part of the matrices U and V.
If jobz = 'A', all m columns of U and all n rows of VT or VH are returned
in the arrays u and vt;
if jobz = 'S', the first min(m, n) columns of U and the first min(m, n)
rows of VT or VH are returned in the arrays u and vt;
if jobz = 'O', then
if m ≥ n, the first n columns of U are overwritten in the array a and all rows
of VT or VH are returned in the array vt;
if m<n, all columns of U are returned in the array u and the first m rows of
VT or VH are overwritten in the array a;
if jobz = 'N', no columns of U or rows of VT or VH are computed.
a, work REAL for sgesdd

DOUBLE PRECISION for dgesdd
COMPLEX for cgesdd
DOUBLE COMPLEX for zgesdd.
Arrays:
a(lda,*) is an array containing the m-by-n matrix A.
lda INTEGER. The leading dimension of the array a. Must be at least max(1, m).
ldu, ldvt INTEGER. The leading dimensions of the output arrays u and vt,
respectively.
Constraints:
ldu ≥ 1; ldvt ≥ 1.
If jobz = 'S' or 'A', or jobz = 'O' and m < n,
then ldu ≥ m;
If jobz = 'A' or jobz = 'O' and m ≥ n,
then ldvt ≥ n;
If jobz = 'S', ldvt ≥ min(m, n).
lwork INTEGER.
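The dimension of the array work; lwork ≥ 1.
If lwork = -1, a workspace query is assumed. The optimal size for the work array is calculated and stored in
work(1), and no error message related to lwork is issued by xerbla.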
See Application Notes for the theoretical minimum value of lwork when a
workspace query is not performed.
rwork REAL for cgesdd

DOUBLE PRECISION for zgesdd
Workspace array, size at least max(1, 7*min(m,n)) if jobz = 'N'.
Otherwise, the dimension of rwork must be at least

max(1,min(m,n)*max(5*min(m,n)+7,2*max(m,n)+2*min(m,n)+1)).
This array is used in complex flavors only.
iwork INTEGER. Workspace array, size at least max(1, 8 *min(m, n)).
Output Parameters
a On exit:
If jobz = 'O', then if m ≥ n, a is overwritten with the first n columns of U
(the left singular vectors, stored columnwise). If m < n, a is overwritten
with the first m rows of VT (the right singular vectors, stored rowwise);
If jobz ≠ 'O', the contents of a are destroyed.
s REAL for single precision flavors DOUBLE PRECISION for double precision
flavors.
Array, size at least max(1, min(m,n)). Contains the singular values of A
sorted so that s(i) ≥ s(i+1).
u, vt REAL for sgesdd

DOUBLE PRECISION for dgesdd
COMPLEX for cgesdd
DOUBLE COMPLEX for zgesdd.
Arrays:
u(ldu,*); the second dimension of u must be at least max(1, m) if jobz =
'A' or jobz = 'O' and m < n.
If jobz = 'S', the second dimension of u must be at least max(1, min(m,
n)).
If jobz = 'A' or jobz = 'O' and m < n, u contains the m-by-m
orthogonal/unitary matrix U.
If jobz = 'S', u contains the first min(m, n) columns of U (the left
singular vectors, stored columnwise).
If jobz = 'O' and m ≥ n, or jobz = 'N', u is not referenced.
vt(ldvt,*); the second dimension of vt must be at least max(1, n).
If jobz = 'A' or jobz = 'O' and m ≥ n, vt contains the n-by-n orthogonal/
unitary matrix VT.
If jobz = 'S', vt contains the first min(m, n) rows of VT (the right singular
vectors, stored rowwise).
If jobz = 'O' and m < n, or jobz = 'N', vt is not referenced.
work(1) On exit, if info = 0, then work(1) returns the optimal size of lwork.
info INTEGER.
If info = i, then ?bdsdc did not converge, updating process failed.

Specific details for the routine gesdd interface are the following:
s Holds the vector of length min(m, n).
u Holds the matrix U of size
(m,m) if jobz='A' or jobz='O' and m < n
(m,min(m, n)) if jobz='S'
u is not referenced if jobz is not supplied or if jobz='N' or jobz='O' and m ≥ n.
vt Holds the matrix VT of size
(n,n) if jobz='A' or jobz='O' and m ≥ n
(min(m, n), n) if jobz='S'
vt is not referenced if jobz is not supplied or if jobz='N' or jobz='O' and m <
n.
jobz Must be 'N', 'A', 'S', or 'O'. The default value is 'N'.
Application Notes
The theoretical minimum value for lwork depends on the flavor of the routine.
For real flavors:
If jobz = 'N', lwork = 3*min(m, n) + max(max(m, n), 6*min(m, n));
If jobz = 'O', lwork = 3*min(m, n)**2 + max(max(m, n), 5*min(m, n)**2 + 4*min(m, n));
If jobz = 'S' or 'A', lwork = min(m, n)*(6 + 4*min(m, n)) + max(m, n).
For complex flavors:
If jobz = 'N', lwork = 2*min(m, n) + max(m, n);
If jobz = 'O', lwork = 2*min(m, n)**2 + max(m, n) + 2*min(m, n);
If jobz = 'S' or 'A', lwork = min(m, n)**2 + max(m, n) + 2*min(m, n).
The optimal value of lwork returned by a workspace query generally provides better performance than the
theoretical minimum value. The value of lwork returned by a workspace query is generally larger than the
theoretical minimum value, but for very small matrices it can be smaller. The absolute minimum value of
lwork is the minimum of the workspace query result and the theoretical minimum.
If you set lwork to a value less than the absolute minimum value and not equal to -1, the routine returns
immediately with an error exit and does not provide information on the recommended workspace size.
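The relationship between the theoretical minimum and the workspace query can be sketched for the real double precision flavor as follows. This is an illustration under stated assumptions: the helper gesdd_min_lwork_real, the test matrix, and the choice to allocate the larger of the two values are introduced for the example only, and the program must be linked against a library that provides dgesdd.

```fortran
program gesdd_lwork_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 5, n = 3, k = min(m, n)
  real(dp) :: a(m, n), s(k), u(m, k), vt(k, n), wquery(1)
  real(dp), allocatable :: work(:)
  integer :: iwork(8*k), lmin, lquery, lwork, info, i, j
  external :: dgesdd

  ! Illustrative input only (hypothetical data): a(i,j) = 1/(i+j-1).
  do j = 1, n
    do i = 1, m
      a(i, j) = 1.0_dp / real(i + j - 1, dp)
    end do
  end do

  ! Theoretical minimum from the formulas listed for real flavors.
  lmin = gesdd_min_lwork_real('S', m, n)

  ! Workspace query: lwork = -1 stores the optimal size in wquery(1).
  call dgesdd('S', m, n, a, m, s, u, m, vt, k, wquery, -1, iwork, info)
  if (info /= 0) stop 'dgesdd workspace query failed'
  lquery = int(wquery(1))
  print '(a, i0, a, i0)', 'theoretical minimum = ', lmin, ', query result = ', lquery

  ! Allocating the larger value satisfies both bounds (an illustrative choice).
  lwork = max(1, lmin, lquery)
  allocate(work(lwork))
  call dgesdd('S', m, n, a, m, s, u, m, vt, k, work, lwork, iwork, info)
  if (info /= 0) then
    print '(a, i0)', 'dgesdd returned info = ', info
  else
    print '(a, 3es12.4)', 'singular values: ', s
  end if

contains

  ! Hypothetical helper: theoretical minimum lwork for sgesdd/dgesdd.
  pure integer function gesdd_min_lwork_real(jobz, rows, cols) result(lw)
    character(len=1), intent(in) :: jobz
    integer, intent(in) :: rows, cols
    integer :: mn, mx
    mn = min(rows, cols)
    mx = max(rows, cols)
    select case (jobz)
    case ('N')
      lw = 3*mn + max(mx, 6*mn)
    case ('O')
      lw = 3*mn**2 + max(mx, 5*mn**2 + 4*mn)
    case default   ! 'S' or 'A'
      lw = mn*(6 + 4*mn) + mx
    end select
    lw = max(1, lw)
  end function gesdd_min_lwork_real

end program gesdd_lwork_sketch
```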
?gejsv
Computes the singular value decomposition using a
preconditioned Jacobi SVD method.
Syntax
call sgejsv(joba, jobu, jobv, jobr, jobt, jobp, m, n, a, lda, sva, u, ldu, v, ldv,
work, lwork, iwork, info)
call dgejsv(joba, jobu, jobv, jobr, jobt, jobp, m, n, a, lda, sva, u, ldu, v, ldv,
work, lwork, iwork, info)
call cgejsv(joba, jobu, jobv, jobr, jobt, jobp, m, n, a, lda, sva, u, ldu, v, ldv,
cwork, lwork, rwork, lrwork, iwork, info)
call zgejsv(joba, jobu, jobv, jobr, jobt, jobp, m, n, a, lda, sva, u, ldu, v, ldv,
cwork, lwork, rwork, lrwork, iwork, info)
Include Files
mkl.fi
Description
The routine computes the singular value decomposition (SVD) of a real/complex m-by-n matrix A, where m ≥ n.
The SVD is written as
A = U*Σ*VT, for real routines
A = U*Σ*VH, for complex routines
where Σ is an m-by-n matrix which is zero except for its n diagonal elements, U is an m-by-n (or m-by-m)
orthonormal matrix, and V is an n-by-n orthogonal matrix. The diagonal elements of Σ are the singular values
of A; the columns of U and V are the left and right singular vectors of A, respectively. The matrices U and V
are computed and stored in the arrays u and v, respectively. The diagonal of Σ is computed and stored in the
array sva.
The ?gejsv routine can sometimes compute tiny singular values and their singular vectors much more
accurately than other SVD routines.
The routine implements a preconditioned Jacobi SVD algorithm. It uses ?geqp3, ?geqrf, and ?gelqf as
preprocessors and preconditioners. Optionally, an additional row pivoting can be used as a preprocessor,
which in some cases results in much higher accuracy. An example is matrix A with the structure A = D1 * C
* D2, where D1, D2 are arbitrarily ill-conditioned diagonal matrices and C is a well-conditioned matrix. In that
case, complete pivoting in the first QR factorizations provides accuracy dependent on the condition number
of C, and independent of D1, D2. Such higher accuracy is not completely understood theoretically, but it
works well in practice. If A can be written as A = B*D, with well-conditioned B and some diagonal D, then the high accuracy is
guaranteed, both theoretically and in software, independent of D. For more details see [Drmac08-1],
[Drmac08-2].
The computational range for the singular values can be the full range ( UNDERFLOW,OVERFLOW ), provided
that the machine arithmetic and the BLAS and LAPACK routines called by ?gejsv are implemented to work in
that range. If that is not the case, the restriction for safe computation with the singular values in the range
of normalized IEEE numbers is that the spectral condition number kappa(A)=sigma_max(A)/sigma_min(A)
does not overflow. This code (?gejsv) is best used in this restricted range, meaning that singular values of
magnitude below ||A||_2 / slamch('O') (for single precision) or ||A||_2 / dlamch('O') (for double
precision) are returned as zeros. See jobr for details on this.
This implementation is slower than the one described in [Drmac08-1], [Drmac08-2] due to replacement of
some non-LAPACK components, and because the choice of some tuning parameters in the iterative part
(?gesvj) is left to the implementer on a particular machine.
The rank revealing QR factorization (in this code: ?geqp3) should be implemented as in [Drmac08-3].
If m is much larger than n, it is obvious that the initial QRF with column pivoting can be preprocessed by the
QRF without pivoting. That well known trick is not used in ?gejsv because in some cases heavy row
weighting can be treated with complete pivoting. The overhead when m is much larger than n is then only
due to pivoting, but the benefits in accuracy have prevailed. You can incorporate this extra QRF step easily
and also improve data movement (matrix transpose, matrix copy, matrix transposed copy) - this
implementation of ?gejsv uses only the simplest, naive data movement.
Input Parameters
joba CHARACTER*1. Must be 'C', 'E', 'F', 'G', 'A', or 'R'.

Specifies the level of accuracy:
If joba = 'C', high relative accuracy is achieved if A = B*D with well-
conditioned B and arbitrary diagonal matrix D. The accuracy cannot be
spoiled by column scaling. The accuracy of the computed output depends
on the condition of B, and the procedure aims at the best theoretical
accuracy. The relative error max_{i=1:N}|d sigma_i| / sigma_i is
bounded by f(M,N)*epsilon* cond(B), independent of D. The input
matrix is preprocessed with the QRF with column pivoting. This initial
preprocessing and preconditioning by a rank revealing QR factorization is
common for all values of joba. Additional actions are specified as follows:
If joba = 'E', computation as with 'C' with an additional estimate of the

condition number of B. It provides a realistic error bound.
If joba = 'F', accuracy higher than in the 'C' option is achieved, if A =
D1*C*D2 with ill-conditioned diagonal scalings D1, D2, and a well-
conditioned matrix C. This option is advisable, if the structure of the input
matrix is not known and relative accuracy is desirable. The input matrix A is
preprocessed with QR factorization with full (row and column) pivoting.
If joba = 'G', computation as with 'F' with an additional estimate of the
condition number of B, where A = B*D. If A has heavily weighted rows,
using this condition number gives too pessimistic an error bound.
If joba = 'A', small singular values are regarded as noise and the matrix is treated
as numerically rank deficient. The error in the computed singular values is
bounded by f(m,n)*epsilon*||A||. The computed SVD A = U*S*V**t
(for real flavors) or A = U*S*V**H (for complex flavors) restores A up to
f(m,n)*epsilon*||A||. This enables the procedure to set all singular
values below n*epsilon*||A|| to zero.
If joba = 'R', the procedure is similar to the 'A' option. Rank revealing
property of the initial QR factorization is used to reveal (using triangular
factor) a gap sigma_{r+1} < epsilon * sigma_r, in which case the
numerical rank is declared to be r. The SVD is computed with absolute error
bounds, but more accurately than with 'A'.
jobu CHARACTER*1. Must be 'U', 'F', 'W', or 'N'.

Specifies whether to compute the columns of the matrix U:
If jobu = 'U', n columns of U are returned in the array u.
If jobu = 'F', a full set of m left singular vectors is returned in the array u.
If jobu = 'W', u may be used as workspace of length m*n. See the
description of u.
If jobu = 'N', u is not computed.
jobv CHARACTER*1. Must be 'V', 'J', 'W', or 'N'.

Specifies whether to compute the matrix V:
If jobv = 'V', n columns of V are returned in the array v; Jacobi rotations
are not explicitly accumulated.
If jobv = 'J', n columns of V are returned in the array v but they are
computed as the product of Jacobi rotations. This option is allowed only if
jobu ≠ 'N'.
If jobv = 'W', v may be used as workspace of length n*n. See the
description of v.
If jobv = 'N', v is not computed.
jobr CHARACTER*1. Must be 'N' or 'R'.

Specifies the range for the singular values. If small positive singular values
are outside the specified range, they may be set to zero. If A is scaled so
that the largest singular value of the scaled matrix is around sqrt(big),
big = ?lamch('O'), the function can remove columns of A whose norm in
the scaled matrix is less than sqrt(?lamch('S')) (for jobr = 'R'), or
less than small = ?lamch('S')/?lamch('E').
If jobr = 'N', the function does not remove small columns of the scaled
matrix. This option assumes that BLAS and QR factorizations and triangular
solvers are implemented to work in that range. If the condition of A is
greater than big, use ?gesvj.
If jobr = 'R', restricted range for singular values of the scaled matrix A is
[sqrt(?lamch('S')), sqrt(big)], roughly as described above. This
option is recommended.
For computing the singular values in the full range [?lamch('S'),big],
use ?gesvj.
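The thresholds named in the description of jobr can be inspected directly with ?lamch. The following is a small sketch for double precision; the program and variable names are assumptions made for the example, and the program must be linked against a library that provides dlamch.

```fortran
program gejsv_range_sketch
  implicit none
  double precision :: sfmin, big, eps, small
  double precision, external :: dlamch

  sfmin = dlamch('S')   ! safe minimum
  big   = dlamch('O')   ! overflow threshold
  eps   = dlamch('E')   ! relative machine precision
  small = sfmin / eps   ! column-norm threshold quoted for the non-'R' case

  ! Restricted range for singular values of the scaled matrix (jobr = 'R').
  print '(a, es12.4, a, es12.4, a)', 'restricted range: [', sqrt(sfmin), ', ', sqrt(big), ']'
  ! Full range, for which the text recommends ?gesvj.
  print '(a, es12.4, a, es12.4, a)', 'full range:       [', sfmin, ', ', big, ']'
  print '(a, es12.4)', 'small = dlamch(S)/dlamch(E): ', small
end program gejsv_range_sketch
```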
jobt CHARACTER*1. Must be 'T' or 'N'.

If the matrix is square, the procedure may determine to use a transposed A
if AT (for real flavors) or AH (for complex flavors) seems to be better with
respect to convergence. If the matrix is not square, jobt is ignored. This is
subject to changes in the future.
The decision is based on two values of entropy over the adjoint orbit of AT *
A (for real flavors) or AH * A (for complex flavors). See the descriptions of
work(6) and work(7).
If jobt = 'T', the function performs transposition if the entropy test
indicates possibly faster convergence of the Jacobi process, if A is taken as
input. If A is replaced with AT or AH, the row pivoting is included
automatically.
If jobt = 'N', the function attempts no speculation. This option can be
used to compute only the singular values, or the full SVD (u, sigma, and v).
For only one set of singular vectors (u or v), the caller should provide both
u and v, as one of the arrays is used as workspace if the matrix A is
transposed. The implementer can easily remove this constraint and make
the code more complicated. See the descriptions of u and v.
jobp CHARACTER*1. Must be 'P' or 'N'.

Enables structured perturbations of denormalized numbers. This option
should be active if the denormals are poorly implemented, causing slow
computation, especially in cases of fast convergence. For details, see
[Drmac08-1], [Drmac08-2]. For simplicity, such perturbations are included
only when the full SVD or only the singular values are requested. You can
add the perturbation for the cases of computing one set of singular vectors.
If jobp = 'P', the function introduces perturbation.
If jobp = 'N', the function introduces no perturbation.
m INTEGER. The number of rows of the input matrix A; m ≥ 0.
n INTEGER. The number of columns in the input matrix A; m ≥ n ≥ 0.
a, u, v REAL for sgejsv
DOUBLE PRECISION for dgejsv
COMPLEX for cgejsv
DOUBLE COMPLEX for zgejsv
Array a(lda,n) is an array containing the m-by-n matrix A.
u is a workspace array, its size is (ldu,*); the second dimension of u must
be m if jobu = 'F' and n otherwise. When jobt = 'T' and m = n, u must be
provided even though jobu = 'N'.
v is a workspace array, its size is (ldv,n). When jobt = 'T' and m = n, v
must be provided even though jobv = 'N'.
lda INTEGER. The leading dimension of the array a. Must be at least max(1,
m).
sva REAL for sgejsv
DOUBLE PRECISION for dgejsv
REAL for cgejsv
DOUBLE PRECISION for zgejsv
sva is a workspace array, its size is n.
ldu INTEGER. The leading dimension of the array u; ldu ≥ 1.
If jobu = 'U' or 'F' or 'W', ldu ≥ m for column major layout.
ldv INTEGER. The leading dimension of the array v; ldv ≥ 1.
If jobv = 'V' or 'J' or 'W', ldv ≥ n.
work REAL for sgejsv
DOUBLE PRECISION for dgejsv.
lwork INTEGER.
For real flavors:
Length of work to confirm proper allocation of work space. lwork depends
on the task performed:
If only sigma is needed (jobu = 'N', jobv = 'N') and
... no scaled condition estimate is required, then lwork ≥ max(2*m+n,
4*n+1, 7). This is the minimal requirement. For optimal performance
(blocked code) the optimal value is lwork ≥ max(2*m+n, 3*n+(n+1)*nb,
7). Here nb is the optimal block size for ?geqp3/?geqrf.
In general, the optimal length lwork is computed as
lwork ≥ max(2*m+n, n+lwork(sgeqp3), n+lwork(sgeqrf), 7) for
sgejsv
lwork ≥ max(2*m+n, n+lwork(dgeqp3), n+lwork(dgeqrf), 7) for
dgejsv
... an estimate of the scaled condition number of A is required (joba =
'E', 'G'). In this case, lwork is the maximum of the above and n*n
+4*n, that is, lwork ≥ max(2*m+n, n*n+4*n, 7). For optimal
performance (blocked code) the optimal value is lwork ≥ max(2*m+n,
3*n+(n+1)*nb, n*n+4*n, 7).
In general, the optimal length lwork is computed as
lwork ≥ max(2*m+n, n+lwork(sgeqp3), n+lwork(sgeqrf), n+n*n
+lwork(spocon), 7) for sgejsv
lwork ≥ max(2*m+n, n+lwork(dgeqp3), n+lwork(dgeqrf), n+n*n
+lwork(dpocon), 7) for dgejsv
If sigma and the right singular vectors are needed (jobv = 'V'),
the minimal requirement is lwork ≥ max(2*m+n, 4*n+1, 7).
For optimal performance, lwork ≥ max(2*m+n, 3*n+(n+1)*nb, 7), where
nb is the optimal block size for ?geqp3, ?geqrf, ?gelqf, ?ormlq. In
general, the optimal length lwork is computed as
lwork ≥ max(2*m+n, n+lwork(sgeqp3), n+lwork(spocon), n
+lwork(sgelqf), 2*n+lwork(sgeqrf), n+lwork(sormlq)) for
sgejsv
lwork ≥ max(2*m+n, n+lwork(dgeqp3), n+lwork(dpocon), n
+lwork(dgelqf), 2*n+lwork(dgeqrf), n+lwork(dormlq)) for
dgejsv
If sigma and the left singular vectors are needed,
the minimal requirement is lwork ≥ max(2*n+m, 4*n+1, 7).
For optimal performance,
if jobu = 'U' :: lwork ≥ max(2*m+n, 3*n+(n+1)*nb, 7),
if jobu = 'F' :: lwork ≥ max(2*m+n, 3*n+(n+1)*nb, n+m*nb, 7),
where nb is the optimal block size for ?geqp3, ?geqrf, ?ormlq. In
general, the optimal length lwork is computed as
lwork ≥ max(2*m+n, n+lwork(sgeqp3), n+lwork(spocon), 2*n
+lwork(sgeqrf), n+lwork(sormlq)) for sgejsv
lwork ≥ max(2*m+n, n+lwork(dgeqp3), n+lwork(dpocon), 2*n
+lwork(dgeqrf), n+lwork(dormlq)) for dgejsv
Here lwork(?ormlq) equals n*nb (for jobu = 'U') or m*nb (for jobu =
'F').
If full SVD is needed (jobu = 'U' or 'F') and
if jobv = 'V',
the minimal requirement is lwork ≥ max(2*m+n, 6*n+2*n*n);
if jobv = 'J',
the minimal requirement is lwork ≥ max(2*m+n, 4*n+n*n, 2*n+n*n
+6).
For optimal performance, lwork should be additionally larger than n
+m*nb, where nb is the optimal block size for ?ormlq.

For complex flavors:
Length of cwork to confirm proper allocation of workspace. lwork depends
on the job:
1. If only sigma is needed (jobu.EQ.'N', jobv.EQ.'N') and
... no scaled condition estimate required: lwork ≥ 2*n+1. This is the
minimal requirement. For optimal performance (blocked code) the
optimal value is lwork ≥ n + (n+1)*nb. Here nb is the optimal block size
for ?geqp3 and ?geqrf. In general, optimal lwork is computed as
lwork ≥ max(n+lwork(?geqp3), n+lwork(?geqrf)).
... an estimate of the scaled condition number of A is required
(joba = 'E' or 'G'). In this case, the minimal requirement is
lwork ≥ n*n + 3*n. For optimal performance (blocked code) the optimal
value is lwork ≥ max(n+(n+1)*nb, n*n+3*n). In general, the optimal
length lwork is computed as lwork ≥ max(n+lwork(?geqp3), n
+lwork(?geqrf), n+n*n+lwork(?pocon)).
2. If sigma and the right singular vectors are needed (jobv.EQ.'V',
jobu.EQ.'N'), then the minimal requirement is lwork ≥ 3*n. For optimal
performance, lwork ≥ max(n+(n+1)*nb, 3*n, 2*n+n*nb), where nb is the
optimal block size for ?geqp3, ?geqrf, ?gelqf, ?unmlq. In general, the
optimal length lwork is computed as lwork ≥ max(n+lwork(?geqp3), n
+lwork(?pocon), n+lwork(?gesvj), n+lwork(?gelqf), 2*n
+lwork(?geqrf), n+lwork(?unmlq)).
3. If sigma and the left singular vectors are needed, then the minimal
requirement is lwork ≥ 3*n. For optimal performance, if
jobu.EQ.'U' :: lwork ≥ max(3*n, n+(n+1)*nb, 2*n+n*nb), where nb
is the optimal block size for ?geqp3, ?geqrf, ?unmqr. In general, the
optimal length lwork is computed as lwork ≥ max(n+lwork(?geqp3), n
+lwork(?pocon), 2*n+lwork(?geqrf), n+lwork(?unmqr)).
4. If the full SVD is needed (jobu.EQ.'U' or jobu.EQ.'F') and
if jobv.EQ.'V', the minimal requirement is lwork ≥ 5*n+2*n*n;
if jobv.EQ.'J', the minimal requirement is lwork ≥ 4*n+n*n. In both
cases, the allocated cwork can accommodate blocked runs of ?geqp3,
?geqrf, ?gelqf, ?unmqr, ?unmlq.
cwork COMPLEX for cgejsv

DOUBLE COMPLEX for zgejsv
cwork is a workspace array, dimension is at least lwork.
rwork REAL for cgejsv
DOUBLE PRECISION for zgejsv
rwork is an array, dimension is at least max(7, lrwork).
lrwork INTEGER. Length of rwork to confirm proper allocation of workspace.

lrwork depends on the job:
1. If only singular values are requested i.e. if lsame(jobu,'N') .AND.
lsame(jobv,'N') then:
If lsame(jobt,'T') .OR. lsame(joba,'F') .OR.

lsame(joba,'G'), then lrwork = max( 7, n + 2 * m ).
Otherwise, lrwork = max( 7, 2 * n ).
2. If singular values with the right singular vectors are requested i.e. if
(lsame(jobv,'V').OR.lsame(jobv,'J')) .AND. .NOT.
(lsame(jobu,'U').OR.lsame(jobu,'F')) then:

3. If singular values with the left singular vectors are requested, i.e. if
(lsame(jobu,'U').OR.lsame(jobu,'F')) .AND. .NOT.
(lsame(jobv,'V').OR.lsame(jobv,'J')) then:

4. If singular values with both the left and the right singular vectors are
requested, i.e. if (lsame(jobu,'U').OR.lsame(jobu,'F')) .AND.
(lsame(jobv,'V').OR.lsame(jobv,'J')) then:

iwork INTEGER. Workspace array, size:
For real flavors:
max(3, m+3*n).
For complex flavors:
If lsame(joba,'F') .OR. lsame(joba,'G'), then the dimension of
iwork is max( 3, 2 * n + m ).
Otherwise, the dimension of iwork is
max( 3, 2*n ) for full SVD
max( 3, n ) for singular values only or singular values with one set
of singular vectors (left or right).
Output Parameters
sva On exit:
For work(1)/work(2) = one: the singular values of A. During the
computation sva contains Euclidean column norms of the iterated matrices
in the array a.
For work(1) ≠ work(2): the singular values of A are (work(1)/work(2)) *
sva(1:n). This factored form is used if sigma_max(A) overflows or if small
singular values have been saved from underflow by scaling the input matrix
A.
If jobr = 'R', some of the singular values may be returned as exact zeros
obtained by 'setting to zero' because they are below the numerical rank
threshold or are denormalized numbers.
u On exit:
If jobu = 'U', contains the m-by-n matrix of the left singular vectors.
If jobu = 'F', contains the m-by-m matrix of the left singular vectors,
including an orthonormal basis of the orthogonal complement of the range
of A.
If jobu = 'W' and jobv = 'V', jobt = 'T', and m = n, then u is used
as workspace if the procedure replaces A with AT (for real flavors) or AH (for
complex flavors). In that case, v is computed in u as left singular vectors of
AT or AH and copied back to the v array. This 'W' option is just a reminder
to the caller that in this case u is reserved as workspace of length n*n.
v On exit:
If jobv = 'V' or 'J', contains the n-by-n matrix of the right singular
vectors.
If jobv = 'W' and jobu = 'U', jobt = 'T', and m = n, then v is used
as workspace if the procedure replaces A with AT (for real flavors) or AH (for
complex flavors). In that case, u is computed in v as right singular vectors
of AT or AH and copied back to the u array. This 'W' option is just a
reminder to the caller that in this case v is reserved as workspace of length
n*n.
work On exit,
work(1) and work(2) define the scaling factor scale = work(1)/work(2) such that
scale*sva(1:n) are the computed singular values of A. See the
description of sva.
work(2) = see the description of work(1).

work(3) = sconda is an estimate for the condition number of column
equilibrated A. If joba = 'E' or 'G', sconda is an estimate of sqrt(||
(R**t * R)**(-1)||_1). It is computed using ?pocon. It holds
n**(-1/4) * sconda ≤ ||R**(-1)||_2 ≤ n**(1/4) * sconda, where R
is the triangular factor from the QRF of A. However, if R is truncated and the
numerical rank is determined to be strictly smaller than n, sconda is
returned as -1, indicating that the smallest singular values might be lost.
If full SVD is needed, the following two condition numbers are useful for the
analysis of the algorithm. They are provided for a user who is familiar with
the details of the method.
work(4) = an estimate of the scaled condition number of the triangular
factor in the first QR factorization.
work(5) = an estimate of the scaled condition number of the triangular
factor in the second QR factorization.
The following two parameters are computed if jobt = 'T'. They are
provided for a user who is familiar with the details of the method.
work(6) = the entropy of A**t*A :: this is the Shannon entropy of
diag(A**t*A) / Trace(A**t*A) taken as point in the probability
simplex.
work(7) = the entropy of A*A**t.
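The quantity described for work(6) can be illustrated with a short, self-contained computation. This is a sketch of the plain Shannon entropy of diag(A**t*A)/Trace(A**t*A) only; whether the library applies any further normalization to the reported value is not stated here, and the program name and test matrix are assumptions made for the example.

```fortran
program column_entropy_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 3, n = 3
  real(dp) :: a(m, n), p(n), h
  integer :: j

  ! Illustrative matrix (hypothetical data), stored column by column.
  a = reshape([1.0_dp, 0.0_dp, 0.0_dp, &
               0.0_dp, 2.0_dp, 0.0_dp, &
               1.0_dp, 1.0_dp, 1.0_dp], [m, n])

  ! diag(A**t*A) holds the squared Euclidean norms of the columns of A.
  do j = 1, n
    p(j) = sum(a(:, j)**2)
  end do

  ! Dividing by Trace(A**t*A) gives a point in the probability simplex.
  p = p / sum(p)

  ! Shannon entropy; zero entries contribute nothing to the sum.
  h = 0.0_dp
  do j = 1, n
    if (p(j) > 0.0_dp) h = h - p(j)*log(p(j))
  end do

  print '(a, f10.6)', 'entropy of diag(A**t*A)/Trace(A**t*A): ', h
end program column_entropy_sketch
```

The entropy of A*A**t is obtained in the same way from the squared row norms.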
rwork On exit,
rwork(1) determines the scaling factor scale = rwork(1) / rwork(2)
such that scale*sva(1:n) are the computed singular values of a. (See the
description of sva().)
rwork(2) = see the description of rwork(1).

rwork(3) = sconda is an estimate for the condition number of column
equilibrated A. If joba = 'E' or 'G', sconda is an estimate of SQRT(||
(R^* * R)^(-1)||_1). It is computed using ?pocon. It holds n^(-1/4) *
sconda ≤ ||R^(-1)||_2 ≤ n^(1/4) * sconda where R is the triangular
factor from the QRF of A. However, if R is truncated and the numerical rank
is determined to be strictly smaller than n, sconda is returned as -1, thus
indicating that the smallest singular values might be lost.
If full SVD is needed, the following two condition numbers are useful for the
analysis of the algorithm. They are provided for a user who is familiar with
the details of the method.
rwork(4) = an estimate of the scaled condition number of the triangular
factor in the first QR factorization.
rwork(5) = an estimate of the scaled condition number of the triangular
factor in the second QR factorization.
The following two parameters are computed if jobt = 'T'. They are
provided for a user who is familiar with the details of the method.
rwork(6) = the entropy of A^* * A :: this is the Shannon entropy of
diag(A^* * A) / Trace(A^* * A) taken as point in the probability
simplex.
rwork(7) = the entropy of A * A^*. (See the description of rwork(6).)
iwork INTEGER. On exit,

iwork(1) = the numerical rank determined after the initial QR factorization
with pivoting. See the descriptions of joba and jobr.
iwork(2) = the number of the computed nonzero singular values.
iwork(3) = if nonzero, a warning message. If iwork(3)=1, some of the
column norms of A were denormalized floats. The requested high accuracy
is not warranted by the data.
info INTEGER.
If info > 0, the function did not converge in the maximal number of
sweeps. The computed values may be inaccurate.
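A full-SVD call to the real double precision flavor can be sketched as follows. This is an illustration under stated assumptions: the program name, the test matrix, and the array sigma are introduced for the example; the workspace sizes use the minimal requirements listed for jobu = 'U', jobv = 'V'; the singular values are recovered as (work(1)/work(2))*sva(1:n), following the description of sva. The program must be linked against a library that provides dgejsv.

```fortran
program gejsv_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 4, n = 3
  ! Minimal lwork for the full SVD with jobv = 'V'.
  integer, parameter :: lwork = max(2*m + n, 6*n + 2*n*n)
  real(dp) :: a(m, n), sva(n), u(m, n), v(n, n), work(lwork), sigma(n)
  integer :: iwork(max(3, m + 3*n)), info, i, j
  external :: dgejsv

  ! Illustrative input only (hypothetical data): a(i,j) = 1/(i+j-1).
  do j = 1, n
    do i = 1, m
      a(i, j) = 1.0_dp / real(i + j - 1, dp)
    end do
  end do

  ! joba = 'C', jobu = 'U', jobv = 'V', jobr = 'R' (recommended range),
  ! jobt = 'N' (no transposition test), jobp = 'N' (no perturbation).
  call dgejsv('C', 'U', 'V', 'R', 'N', 'N', m, n, a, m, sva, u, m, v, n, &
              work, lwork, iwork, info)
  if (info /= 0) then
    print '(a, i0)', 'dgejsv returned info = ', info
    stop
  end if

  ! Factored form of the result: when work(1) equals work(2) the factor is one.
  sigma = (work(1)/work(2)) * sva

  print '(a, 3es12.4)', 'singular values:        ', sigma
  print '(a, i0)', 'numerical rank iwork(1): ', iwork(1)
  print '(a, i0)', 'warning flag iwork(3):   ', iwork(3)
end program gejsv_sketch
```

Unlike ?gesvd and ?gesdd, the routine returns V itself in v, not its transpose.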
See Also
?geqp3
?geqrf
?gelqf
?gesvj
?lamch
?pocon
?ormlq
?gesvj
Computes the singular value decomposition of a real or complex
matrix using Jacobi plane rotations.
Syntax
call sgesvj(joba, jobu, jobv, m, n, a, lda, sva, mv, v, ldv, work, lwork, info)
call dgesvj(joba, jobu, jobv, m, n, a, lda, sva, mv, v, ldv, work, lwork, info)
call cgesvj(joba, jobu, jobv, m, n, a, lda, sva, mv, v, ldv, cwork, lwork, rwork,
lrwork, info)
call zgesvj(joba, jobu, jobv, m, n, a, lda, sva, mv, v, ldv, cwork, lwork, rwork,
lrwork, info)
Include Files
mkl.fi
Description
The routine computes the singular value decomposition (SVD) of a real or complex m-by-n matrix A, where
m ≥ n.
The SVD of A is written as
A = U*Σ*VT for real flavors, or
A = U*Σ*VH for complex flavors,
where Σ is an m-by-n diagonal matrix, U is an m-by-n orthonormal matrix, and V is an n-by-n orthogonal/
unitary matrix. The diagonal elements of Σ are the singular values of A; the columns of U and V are the left
and right singular vectors of A, respectively. The matrices U and V are computed and stored in the arrays u
and v, respectively. The diagonal of Σ is computed and stored in the array sva.
The ?gesvj routine can sometimes compute tiny singular values and their singular vectors much more
accurately than other SVD routines.
The n-by-n orthogonal matrix V is obtained as a product of Jacobi plane rotations. The rotations are
implemented as fast scaled rotations of Anda and Park [AndaPark94]. In the case of underflow of the Jacobi
angle, a modified Jacobi transformation of Drmac ([Drmac08-4]) is used. Pivot strategy uses column
interchanges of de Rijk ([deRijk98]). The relative accuracy of the computed singular values and the accuracy
of the computed singular vectors (in angle metric) is as guaranteed by the theory of Demmel and Veselic
[Demmel92]. The condition number that determines the accuracy in the full rank case is essentially
min_{D=diag} κ(A*D),
where κ(.) is the spectral condition number. The best performance of this Jacobi SVD procedure is achieved if
used in an accelerated version of Drmac and Veselic [Drmac08-1], [Drmac08-2].
The computational range for the nonzero singular values is the machine number interval
(UNDERFLOW, OVERFLOW). In extreme cases, even denormalized singular values can be computed with the
corresponding gradual loss of accurate digits.
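The idea of building V as a product of plane rotations can be sketched in plain Python. This is a textbook one-sided (Hestenes) Jacobi iteration, not the algorithm used by ?gesvj: it uses ordinary rotations instead of the fast scaled rotations, has no pivoting or scaling, and the function name, tolerance, and sweep limit are assumptions made for illustration.

```python
import math

def one_sided_jacobi_svd(a, max_sweeps=30, tol=1e-12):
    """Orthogonalize the columns of a real m-by-n matrix (m >= n) by rotations.

    Returns (g, sva, v) with g = A*V having mutually orthogonal columns,
    sva[j] = Euclidean norm of column j of g, and v the accumulated rotations.
    """
    m, n = len(a), len(a[0])
    g = [row[:] for row in a]  # working copy of A
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(g[i][p] ** 2 for i in range(m))
                beta = sum(g[i][q] ** 2 for i in range(m))
                gamma = sum(g[i][p] * g[i][q] for i in range(m))
                # Skip pairs that are already numerically orthogonal.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                # Apply the same rotation to columns p, q of g and of v.
                for mat, rows in ((g, m), (v, n)):
                    for i in range(rows):
                        x, y = mat[i][p], mat[i][q]
                        mat[i][p] = c * x - s * y
                        mat[i][q] = s * x + c * y
        if not rotated:
            break  # a full sweep without rotations: converged
    sva = [math.sqrt(sum(g[i][j] ** 2 for i in range(m))) for j in range(n)]
    return g, sva, v
```

In this sketch the singular values are the column norms of g and are not sorted; dividing each nonzero column of g by its norm would give the corresponding left singular vector.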
Input Parameters
joba CHARACTER*1. Must be 'L', 'U' or 'G'.

Specifies the structure of A:
If joba = 'L', the input matrix A is lower triangular.
If joba = 'U', the input matrix A is upper triangular.
If joba = 'G', the input matrix A is a general m-by-n matrix, m ≥ n.
jobu CHARACTER*1. Must be 'U', 'C' or 'N'.

Specifies whether to compute the left singular vectors (columns of U):
If jobu = 'U', the left singular vectors corresponding to the nonzero
singular values are computed and returned in the leading columns of A. See
more details in the description of a. The default numerical orthogonality
threshold is set to approximately TOL=CTOL*EPS, where CTOL=sqrt(m) and
EPS = ?lamch('E').
If jobu = 'C', analogous to jobu = 'U', except that you can control the
level of numerical orthogonality of the computed left singular vectors. TOL
can be set to TOL=CTOL*EPS, where CTOL is given on input in the array
work. No CTOL smaller than ONE is allowed. CTOL greater than 1 / EPS is
meaningless. The option 'C' can be used if m*EPS is a satisfactory level of
orthogonality for the computed left singular vectors, so CTOL=m could save a
few sweeps of Jacobi rotations. See the descriptions of a and work(1).
If jobu = 'N', U is not computed. However, see the description of a.
jobv CHARACTER*1. Must be 'V', 'A' or 'N'.

Specifies whether to compute the right singular vectors, that is, the matrix
V:
If jobv = 'V', the matrix V is computed and returned in the array v.
If jobv = 'A', the Jacobi rotations are applied to the mv-by-n array v. In
other words, the right singular vector matrix V is not computed explicitly;
instead it is applied to an mv-by-n matrix initially stored in the first mv rows
of v.
If jobv = 'N', the matrix V is not computed and the array v is not
referenced.
m INTEGER. The number of rows of the input matrix A.

1/slamch('E') > m ≥ 0 for sgesvj.
1/dlamch('E') > m ≥ 0 for dgesvj.
n INTEGER. The number of columns in the input matrix A; m ≥ n ≥ 0.
a, v REAL for sgesvj

DOUBLE PRECISION for dgesvj.
COMPLEX for cgesvj
DOUBLE COMPLEX for zgesvj
Array a(lda,n) contains the m-by-n matrix A.
Array v is a workspace array of dimension (ldv,*); the second
dimension of v must be at least max(1, n).
lda INTEGER. The leading dimension of the array a; lda ≥ max(1, m).
mv INTEGER.
If jobv = 'A', the product of Jacobi rotations in ?gesvj is applied to the
first mv rows of v. See the description of jobv. 0 ≤ mv ≤ ldv.
ldv INTEGER. The leading dimension of the array v; ldv ≥ 1.
If jobv = 'V', ldv ≥ max(1, n).
If jobv = 'A', ldv ≥ max(1, mv).
work REAL for sgesvj
DOUBLE PRECISION for dgesvj.
work is a workspace array of dimension max(6, m+n) (see lwork).
If jobu = 'C', work(1)=CTOL, where CTOL defines the threshold for
convergence. The process stops if all columns of A are mutually orthogonal
up to CTOL*EPS, EPS=?lamch('E'). It is required that CTOL ≥ 1, that is, it
is not allowed to force the routine to obtain orthogonality below EPS.
cwork COMPLEX for cgesvj

DOUBLE COMPLEX for zgesvj
cwork is a workspace array of dimension m+n.
lwork INTEGER.
Length of work for real flavors or cwork for complex flavors, lwork ≥
max(6,m+n).
rwork REAL for cgesvj
DOUBLE PRECISION for zgesvj
rwork is a workspace array of dimension max(6,m+n).
If jobu = 'C', rwork(1) = CTOL, where CTOL defines the threshold for
convergence. The process stops if all columns of A are mutually orthogonal
up to CTOL*EPS, EPS=?lamch('E'). It is required that CTOL ≥ 1, that is, it
is not allowed to force the routine to obtain orthogonality below EPS.
lrwork INTEGER.
Length of rwork, lrwork ≥ max(6,n).
Output Parameters
a On exit:
If jobu = 'U' or jobu = 'C':
if info = 0, the leading columns of a contain the left singular vectors
corresponding to the computed singular values of A that are above the
underflow threshold ?lamch('S'), that is, the nonzero singular values. The
number of computed nonzero singular values is returned in
work(2) for real flavors or rwork(2) for complex flavors. Also see the
descriptions of sva and work for real flavors or rwork for complex
flavors. The computed columns of U are mutually numerically orthogonal
up to approximately TOL=sqrt(m)*EPS (default), or
TOL=CTOL*EPS if jobu = 'C'; see the description of jobu.
if info > 0, the procedure ?gesvj did not converge in the given
number of iterations (sweeps). In that case, the computed columns of U
may not be orthogonal up to TOL. The output U (stored in a), sigma
(given by the computed singular values in sva(1:n)), and V still form a
decomposition of the input matrix A in the sense that the residual
||A - scale*U*sigma*V^T||_2 / ||A||_2 for real flavors or
||A - scale*U*sigma*V^H||_2 / ||A||_2 for complex flavors (where scale =
work(1) for real flavors or rwork(1) for complex flavors) is small.
If jobu = 'N':
if info = 0, note that the left singular vectors are 'for free' in the one-
sided Jacobi SVD algorithm. However, if only the singular values are
needed, the level of numerical orthogonality of u is not an issue and
iterations are stopped when the columns of the iterated matrix are
numerically orthogonal up to approximately m*EPS. Thus, on exit, a
contains the columns of u scaled with the corresponding singular values.
if info > 0, the procedure ?gesvj did not converge in the given
number of iterations (sweeps).
sva REAL for sgesvj
DOUBLE PRECISION for dgesvj
REAL for cgesvj
DOUBLE PRECISION for zgesvj
Array size n.
If info = 0, depending on the value scale = work(1) for real flavors or
rwork(1) for complex flavors, where scale is the scaling factor:
if scale = 1, sva(1:n) contains the computed singular values of A.
During the computation, sva contains the Euclidean column norms of
the iterated matrices in the array a.
if scale ≠ 1, the singular values of A are scale*sva(1:n); this
factored representation is used because some of the singular
values of A might underflow or overflow.
If info > 0, the procedure ?gesvj did not converge in the given number
of iterations (sweeps) and scale*sva(1:n) may not be accurate.
v On exit:
If jobv = 'V', contains the n-by-n matrix of the right singular vectors.
If jobv = 'A', then v contains the product of the computed right singular
vector matrix and the initial matrix in the array v.
work On exit,
work(1) = scale is the scaling factor such that scale*sva(1:n) are the
computed singular values of A. See the description of sva().
work(2) is the number of the computed nonzero singular values.

work(3) is the number of the computed singular values that are larger than
the underflow threshold.
work(4) is the number of sweeps of Jacobi rotations needed for numerical
convergence.
work(5) = max_{i≠j} |COS(A(:,i),A(:,j))| in the last sweep. This is
useful information in cases when ?gesvj did not converge, as it can be
used to estimate whether the output is still useful and for post festum
analysis.
work(6) is the largest absolute value over all sines of the Jacobi rotation
angles in the last sweep. It can be useful in a post festum analysis.
rwork On exit,
rwork(1) = scale is the scaling factor such that scale*sva(1:n) are the
computed singular values of A. See description of sva().
rwork(2) is the number of the computed nonzero singular values.

rwork(3) is the number of the computed singular values that are larger
than the underflow threshold.
rwork(4) is the number of sweeps of Jacobi rotations needed for numerical
convergence.
rwork(5) = max_{i≠j} |COS(A(:,i),A(:,j))| in the last sweep. This
is useful information in cases when ?gesvj did not converge, as it can be
used to estimate whether the output is still useful and for post festum
analysis.
rwork(6) is the largest absolute value over all sines of the Jacobi rotation
angles in the last sweep. It can be useful for a post festum analysis.
info INTEGER.
If info > 0, the function did not converge in the maximal number (30) of
sweeps. The output may still be useful. See the description of work or
rwork.
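The six diagnostic entries returned in work (real flavors) or rwork (complex flavors) can be unpacked as in the following Python sketch. The function name and dictionary keys are hypothetical; the inputs are assumed to be the first six workspace entries and the array sva, copied into Python lists after the call.

```python
def summarize_gesvj_work(work, sva):
    """Interpret work(1:6) (or rwork(1:6)) and sva as described above.

    Python lists are 0-based, so work[0] corresponds to work(1).
    """
    scale = work[0]
    return {
        # The computed singular values are scale*sva(1:n).
        "singular_values": [scale * s for s in sva],
        "nonzero_count": int(work[1]),
        "above_underflow_count": int(work[2]),
        "sweeps": int(work[3]),
        # Useful for judging the output when info > 0.
        "max_abs_cosine": work[4],
        "max_abs_sine": work[5],
    }
```

When info > 0, a small max_abs_cosine suggests that the columns were already close to orthogonal in the last sweep, so the output may still be usable.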
See Also
?lamch
?ggsvd
Computes the generalized singular value
decomposition of a pair of general rectangular
matrices (deprecated).
Syntax
call sggsvd(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, iwork, info)
call dggsvd(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, iwork, info)
call cggsvd(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, rwork, iwork, info)
call zggsvd(jobu, jobv, jobq, m, n, p, k, l, a, lda, b, ldb, alpha, beta, u, ldu, v,
ldv, q, ldq, work, rwork, iwork, info)
call ggsvd(a, b, alpha, beta [, k] [,l] [,u] [,v] [,q] [,iwork] [,info])
Include Files
mkl.fi, lapack.f90
Description
This routine is deprecated; use ggsvd3.
The routine computes the generalized singular value decomposition (GSVD) of an m-by-n real/complex
matrix A and p-by-n real/complex matrix B:
U'*A*Q = D1*(0 R), V'*B*Q = D2*(0 R),
where U, V and Q are orthogonal/unitary matrices and U', V' mean transpose/conjugate transpose of U and V
respectively.
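Let k+l be the effective numerical rank of the matrix (A', B')'. Then R is a (k+l)-by-(k+l) nonsingular upper
triangular matrix, and D1 and D2 are m-by-(k+l) and p-by-(k+l) "diagonal" matrices of the following
structures, respectively:
If m-k-l ≥ 0,

                  k  l
    D1 =      k ( I  0 )
              l ( 0  C )
          m-k-l ( 0  0 )

                k  l
    D2 =    l ( 0  S )
          p-l ( 0  0 )

                n-k-l  k    l
    (0 R) = k (   0   R11  R12 )
            l (   0    0   R22 )

where
C = diag(alpha(k+1),..., alpha(k+l))
S = diag(beta(k+1),...,beta(k+l))
C^2 + S^2 = I
R is stored in a(1:k+l, n-k-l+1:n) on exit.
If m-k-l < 0,

                k  m-k  k+l-m
    D1 =    k ( I   0     0   )
          m-k ( 0   C     0   )

                  k  m-k  k+l-m
    D2 =    m-k ( 0   S     0   )
          k+l-m ( 0   0     I   )
            p-l ( 0   0     0   )

                    n-k-l  k   m-k  k+l-m
    (0 R) =     k (   0   R11  R12   R13  )
              m-k (   0    0   R22   R23  )
            k+l-m (   0    0    0    R33  )

where
C = diag(alpha(k+1),..., alpha(m)),
S = diag(beta(k+1),...,beta(m)),
C^2 + S^2 = I
On exit, (R11 R12 R13; 0 R22 R23) is stored in a(1:m, n-k-l+1:n) and R33 is stored in
b(m-k+1:l, n+m-k-l+1:n).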
The routine computes C, S, R, and optionally the orthogonal/unitary transformation matrices U, V and Q.
In particular, if B is an n-by-n nonsingular matrix, then the GSVD of A and B implicitly gives the SVD of
A*B^(-1):
A*B^(-1) = U*(D1*D2^(-1))*V'.
If (A', B')' has orthonormal columns, then the GSVD of A and B is also equal to the CS decomposition of A
and B. Furthermore, the GSVD can be used to derive the solution of the eigenvalue problem:
A'*A*x = λ*B'*B*x.
Input Parameters
jobu CHARACTER*1. Must be 'U' or 'N'.

If jobu = 'U', orthogonal/unitary matrix U is computed.
If jobu = 'N', U is not computed.
jobv CHARACTER*1. Must be 'V' or 'N'.

If jobv = 'V', orthogonal/unitary matrix V is computed.
If jobv = 'N', V is not computed.
jobq CHARACTER*1. Must be 'Q' or 'N'.

If jobq = 'Q', orthogonal/unitary matrix Q is computed.
If jobq = 'N', Q is not computed.
a, b, work REAL for sggsvd
DOUBLE PRECISION for dggsvd
COMPLEX for cggsvd
DOUBLE COMPLEX for zggsvd.
Arrays:
The dimension of work must be at least max(3n, m, p)+n.
ldu INTEGER. The leading dimension of the array u.
ldu ≥ max(1, m) if jobu = 'U'; ldu ≥ 1 otherwise.
ldv INTEGER. The leading dimension of the array v.
ldv ≥ max(1, p) if jobv = 'V'; ldv ≥ 1 otherwise.
ldq INTEGER. The leading dimension of the array q.
ldq ≥ max(1, n) if jobq = 'Q'; ldq ≥ 1 otherwise.
iwork INTEGER.
rwork REAL for cggsvd DOUBLE PRECISION for zggsvd.

Output Parameters
k, l INTEGER. On exit, k and l specify the dimension of the subblocks. The sum
k+l is equal to the effective numerical rank of (A', B')'.
a On exit, a contains the triangular matrix R or part of R.
b On exit, b contains part of the triangular matrix R if m-k-l < 0.
alpha, beta REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Contain the generalized singular value pairs of A and B:
alpha(1:k) = 1,
beta(1:k) = 0,
and if m-k-l ≥ 0,
alpha(k+1:k+l) = C,
beta(k+1:k+l) = S,
or if m-k-l < 0,
alpha(k+1:m) = C, alpha(m+1:k+l) = 0
beta(k+1:m) = S, beta(m+1:k+l) = 1
and
alpha(k+l+1:n) = 0
beta(k+l+1:n) = 0.
u, v, q REAL for sggsvd

DOUBLE PRECISION for dggsvd
COMPLEX for cggsvd
DOUBLE COMPLEX for zggsvd.
Arrays:
u(ldu,*); the second dimension of u must be at least max(1, m).
If jobu = 'U', u contains the m-by-m orthogonal/unitary matrix U.
v(ldv,*); the second dimension of v must be at least max(1, p).

If jobv = 'V', v contains the p-by-p orthogonal/unitary matrix V.
q(ldq,*); the second dimension of q must be at least max(1, n).

If jobq = 'Q', q contains the n-by-n orthogonal/unitary matrix Q.
iwork On exit, iwork stores the sorting information.
info INTEGER.

If info = 1, the Jacobi-type procedure failed to converge. For further
details, see subroutine tgsja.
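The layout of alpha and beta described above can be made concrete with a short Python sketch. It only rebuilds the two arrays from given diagonals of C and S; it does not compute a GSVD. The function name and argument order are hypothetical, and Python lists are 0-based while the text uses 1-based indices.

```python
def gsv_pairs_layout(m, n, k, l, c, s):
    """Fill alpha(1:n), beta(1:n) from the diagonals c of C and s of S."""
    alpha = [0.0] * n  # alpha(k+l+1:n) = 0 by default
    beta = [0.0] * n   # beta(k+l+1:n) = 0 by default
    for i in range(k):
        alpha[i], beta[i] = 1.0, 0.0  # alpha(1:k) = 1, beta(1:k) = 0
    if m - k - l >= 0:
        count = l          # C and S are l-by-l
    else:
        count = m - k      # C and S are (m-k)-by-(m-k)
        for i in range(m, k + l):
            alpha[i], beta[i] = 0.0, 1.0  # alpha(m+1:k+l) = 0, beta(m+1:k+l) = 1
    if len(c) != count or len(s) != count:
        raise ValueError("c and s must each have %d entries" % count)
    for j in range(count):
        alpha[k + j], beta[k + j] = c[j], s[j]
    return alpha, beta
```

For example, with m = 2, n = 4, k = 1, l = 2 the case m-k-l < 0 applies, so only one (C, S) pair is stored, followed by one (0, 1) pair and a trailing (0, 0) pair.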

Specific details for the routine ggsvd interface are the following:
b Holds the matrix B of size (p, n).
alpha Holds the vector of length n.
u Holds the matrix U of size (m, m).
v Holds the matrix V of size (p, p).
iwork Holds the vector of length n.
jobu Restored based on the presence of the argument u as follows:

jobu = 'U', if u is present, jobu = 'N', if u is omitted.
jobv Restored based on the presence of the argument v as follows:
jobv = 'V', if v is present,
jobv = 'N', if v is omitted.
jobq Restored based on the presence of the argument q as follows:
jobq = 'Q', if q is present,
jobq = 'N', if q is omitted.
?gesvdx
Computes the SVD and left and right singular vectors
for a matrix.
Syntax
call sgesvdx(jobu, jobvt, range, m, n, a, lda, vl, vu, il, iu, ns, s, u, ldu, vt,
ldvt, work, lwork, iwork, info)
call dgesvdx(jobu, jobvt, range, m, n, a, lda, vl, vu, il, iu, ns, s, u, ldu, vt,
ldvt, work, lwork, iwork, info)
call cgesvdx(jobu, jobvt, range, m, n, a, lda, vl, vu, il, iu, ns, s, u, ldu, vt,
ldvt, work, lwork, rwork, iwork, info)
call zgesvdx(jobu, jobvt, range, m, n, a, lda, vl, vu, il, iu, ns, s, u, ldu, vt,
ldvt, work, lwork, rwork, iwork, info)
Include Files
mkl.fi
Description
?gesvdx computes the singular value decomposition (SVD) of a real or complex m-by-n matrix A, optionally
computing the left and right singular vectors. The SVD is written
A = U * Σ * transpose(V)
where Σ is an m-by-n matrix which is zero except for its min(m,n) diagonal elements, U is an m-by-m matrix,
and V is an n-by-n matrix. The matrices U and V are orthogonal for real A, and unitary for complex A. The
diagonal elements of Σ are the singular values of A; they are real and non-negative, and are returned in
descending order. The first min(m,n) columns of U and V are the left and right singular vectors of A.
?gesvdx uses an eigenvalue problem for obtaining the SVD, which allows for the computation of a subset of
singular values and vectors. See ?bdsvdx for details.
Note that the routine returns VT, not V.
Input Parameters
jobu CHARACTER*1. Specifies options for computing all or part of the matrix U:
= 'V': the first min(m,n) columns of U (the left singular vectors) or as
specified by range are returned in the array u;
= 'N': no columns of U (no left singular vectors) are computed.
jobvt CHARACTER*1. Specifies options for computing all or part of the matrix VT:
= 'V': the first min(m,n) rows of VT (the right singular vectors) or as
specified by range are returned in the array vt;
= 'N': no rows of VT (no right singular vectors) are computed.
range CHARACTER*1. = 'A': find all singular values.

= 'V': all singular values in the half-open interval (vl,vu] are found.
= 'I': the il-th through iu-th singular values are found.
m INTEGER. The number of rows of the input matrix A. m ≥ 0.
n INTEGER. The number of columns of the input matrix A. n ≥ 0.
a REAL for sgesvdx

DOUBLE PRECISION for dgesvdx
COMPLEX for cgesvdx
DOUBLE COMPLEX for zgesvdx
Array, size (lda,n)
lda INTEGER. The leading dimension of the array a. lda ≥ max(1,m).
vl REAL for sgesvdx
DOUBLE PRECISION for dgesvdx
REAL for cgesvdx
DOUBLE PRECISION for zgesvdx
vl ≥ 0.
vu REAL for sgesvdx
DOUBLE PRECISION for dgesvdx
REAL for cgesvdx
DOUBLE PRECISION for zgesvdx
If range='V', the lower and upper bounds of the interval to be searched for
singular values. vu > vl. Not referenced if range = 'A' or 'I'.
il INTEGER.
iu INTEGER. If range='I', the indices (in ascending order) of the smallest and
largest singular values to be returned. 1 ≤ il ≤ iu ≤ min(m,n), if min(m,n) > 0.
Not referenced if range = 'A' or 'V'.
ldu INTEGER. The leading dimension of the array u. ldu ≥ 1; if jobu = 'V',
ldu ≥ m.
ldvt INTEGER. The leading dimension of the array vt. ldvt ≥ 1; if jobvt = 'V',
ldvt ≥ ns (see above).
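work REAL for sgesvdx
DOUBLE PRECISION for dgesvdx
COMPLEX for cgesvdx
DOUBLE COMPLEX for zgesvdx
On exit, if info = 0, work(1) returns the optimal lwork.
lwork INTEGER. The size of the work array.
lwork ≥ max(1, min(m, n)*(min(m, n) + 4)) for the paths (see comments
inside the code):
PATH 1 (m much larger than n)
PATH 1t (n much larger than m)
lwork ≥ max(1, min(m, n)*2 + max(m, n)) for the other paths. For good
performance, lwork should generally be larger.
If lwork = -1, a workspace query is assumed: the routine only calculates the
optimal size of the work array, returns this value as the first entry of the
work array, and no error message related to lwork is issued by xerbla.
rwork REAL for cgesvdx
DOUBLE PRECISION for zgesvdx
Array, size (max(1, lrwork)).
lrwork ≥ min(m, n)*(min(m, n)*2 + 15*min(m, n)).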
Output Parameters
a On exit, the contents of a are destroyed.
ns INTEGER. The total number of singular values found,
0 ≤ ns ≤ min(m, n).
If range = 'A', ns = min(m, n); if range = 'I', ns = iu - il + 1.
s REAL for sgesvdx
DOUBLE PRECISION for dgesvdx
REAL for cgesvdx
DOUBLE PRECISION for zgesvdx
Array, size (min(m,n))
The singular values of A, sorted so that s(i) ≥ s(i + 1).
u REAL for sgesvdx

COMPLEX for cgesvdx
If jobu = 'V', u contains columns of U (the left singular vectors, stored
columnwise) as specified by range; if jobu = 'N', u is not referenced.
NOTE
Make sure that ucol ≥ ns, where ucol is the number of columns of u; if range = 'V', the exact value of ns is not
known in advance and an upper bound must be used.
vt REAL for sgesvdx

COMPLEX for cgesvdx
Array, size (ldvt,n)
If jobvt = 'V', vt contains the rows of VT (the right singular vectors, stored
rowwise) as specified by range; if jobvt = 'N', vt is not referenced.
NOTE
Make sure that ldvt ≥ ns; if range = 'V', the exact value of ns is not
known in advance and an upper bound must be used.
iwork INTEGER. Array, size (12*min(m, n)).
If info = 0, the first ns elements of iwork are zero. If info > 0, then
iwork contains the indices of the eigenvectors that failed to converge in
?bdsvdx/?stevx.
info INTEGER.
= 0: successful exit.
> 0: if info = i, then i eigenvectors failed to converge in ?bdsvdx/?stevx. If
info = n*2 + 1, an internal error occurred in ?bdsvdx.
?bdsvdx
Computes the SVD of a bidiagonal matrix.
Syntax
call sbdsvdx (uplo, jobz, range, n, d, e, vl, vu, il, iu, ns, s, z, ldz, work, iwork,
info )
call dbdsvdx (uplo, jobz, range, n, d, e, vl, vu, il, iu, ns, s, z, ldz, work, iwork,
info )
Include Files
mkl.fi
Description
?bdsvdx computes the singular value decomposition (SVD) of a real n-by-n (upper or lower) bidiagonal
matrix B, B = U * S * VT, where S is a diagonal matrix with non-negative diagonal elements (the singular
values of B), and U and VT are orthogonal matrices of left and right singular vectors, respectively.
Given an upper bidiagonal B with diagonal d = [d1 d2 ... dn] and superdiagonal e = [e1 e2 ... e(n-1)], ?bdsvdx
computes the singular value decomposition of B through the eigenvalues and eigenvectors of the n*2-by-n*2
tridiagonal matrix

          |  0  d1              |
          | d1   0  e1          |
    TGK = |     e1   0  d2      |
          |         d2   .   .  |
          |              .   .  |

If (s,u,v) is a singular triplet of B with ||u|| = ||v|| = 1, then (±s,q), ||q|| = 1, are eigenpairs of TGK, with
q = P * (u' ±v')' / sqrt(2) = (v1 u1 v2 u2 ... vn un)' / sqrt(2), and P = [e_{n+1} e_1 e_{n+2} e_2 ...]
(here e_i denotes the i-th unit vector of length n*2, not an element of the superdiagonal e).
Given a TGK matrix, one can either
1. compute -s, -v and change signs so that the singular values (and corresponding vectors) are already in
descending order (as in ?gesvd/?gesdd) or
2. compute s, v and reorder the values (and corresponding vectors).
?bdsvdx implements (1) by calling ?stevx (bisection plus inverse iteration, to be replaced with a version of
the Multiple Relative Robust Representation algorithm; see P. Willems and B. Lang, A framework for the
MR^3 algorithm: theory and implementation, SIAM J. Sci. Comput., 35:740-766, 2013).
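The relation between a singular triplet of B and an eigenpair of TGK can be checked with a few lines of standard-library Python. This sketch only builds the dense TGK matrix and the interleaved vector q for the +s case; it performs no eigenvalue computation, and the function names are hypothetical.

```python
import math

def tgk_matrix(d, e):
    """Dense 2n-by-2n TGK matrix for an upper bidiagonal B with diagonal d
    and superdiagonal e: zero diagonal, off-diagonals d1, e1, d2, e2, ..., dn."""
    n = len(d)
    off = []
    for i in range(n):
        off.append(d[i])
        if i < n - 1:
            off.append(e[i])
    t = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i, val in enumerate(off):
        t[i][i + 1] = val
        t[i + 1][i] = val
    return t

def tgk_eigenvector(u, v):
    """Interleave a singular pair as q = (v1, u1, v2, u2, ...) / sqrt(2)."""
    q = []
    for ui, vi in zip(u, v):
        q.extend([vi / math.sqrt(2.0), ui / math.sqrt(2.0)])
    return q
```

For B with d = [1, 1] and e = [1], the largest singular value is the golden ratio, and multiplying tgk_matrix(d, e) by the corresponding q reproduces q scaled by that value.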
Input Parameters
uplo CHARACTER*1. = 'U': B is upper bidiagonal;

= 'L': B is lower bidiagonal.
jobz CHARACTER*1. = 'N': Compute singular values only;

= 'V': Compute singular values and singular vectors.
range CHARACTER*1. = 'A': Find all singular values.

= 'V': all singular values in the half-open interval [vl,vu) are found.
= 'I': the il-th through iu-th singular values are found.
n INTEGER. The order of the bidiagonal matrix.

n >= 0.
d REAL for sbdsvdx
DOUBLE PRECISION for dbdsvdx
Array, size n.
The n diagonal elements of the bidiagonal matrix B.
e REAL for sbdsvdx
DOUBLE PRECISION for dbdsvdx
Array, size (max(1,n - 1))
The (n - 1) superdiagonal elements of the bidiagonal matrix B in elements 1
to n - 1.
vl REAL for sbdsvdx
DOUBLE PRECISION for dbdsvdx
vl ≥ 0.
vu REAL for sbdsvdx
DOUBLE PRECISION for dbdsvdx
If range='V', the lower and upper bounds of the interval to be searched for
singular values. vu > vl.
Not referenced if range = 'A' or 'I'.
il, iu INTEGER. If range='I', the indices (in ascending order) of the smallest and
largest singular values to be returned.
1 ≤ il ≤ iu ≤ n, if n > 0.
Not referenced if range = 'A' or 'V'.
ldz INTEGER. The leading dimension of the array z.
ldz ≥ 1, and if jobz = 'V', ldz ≥ max(2,n*2).
Output Parameters
ns INTEGER. The total number of singular values found. 0 ≤ ns ≤ n.
If range = 'A', ns = n, and if range = 'I', ns = iu - il + 1.
s REAL for sbdsvdx
DOUBLE PRECISION for dbdsvdx
Array, size (n)
The first ns elements contain the selected singular values in ascending
order.
z REAL for sbdsvdx
DOUBLE PRECISION for dbdsvdx
Array, size (2*n, k)
If jobz = 'V', then if info = 0 the first ns columns of z contain the singular
vectors of the matrix B corresponding to the selected singular values, with
U in rows 1 to n and V in rows n+1 to n*2, that is, z = (U; V) with U stacked
above V.
NOTE
Make sure that at least k = ns+1 columns are supplied in the array z;
if range = 'V', the exact value of ns is not known in advance and an
upper bound must be used.
work REAL for sbdsvdx
DOUBLE PRECISION for dbdsvdx
Array, size (14*n)
iwork INTEGER. Array, size (12*n).
If jobz = 'V', then if info = 0, the first ns elements of iwork are zero. If
info > 0, then iwork contains the indices of the eigenvectors that failed to
converge in ?stevx.
info INTEGER.
> 0:
if info = i, then i eigenvectors failed to converge in ?stevx. The indices of
the eigenvectors (as returned by ?stevx) are stored in the array iwork.
if info = n*2 + 1, an internal error occurred.
Cosine-Sine Decomposition: LAPACK Driver Routines

This section describes LAPACK driver routines for computing the cosine-sine decomposition (CS
decomposition). You can also call the corresponding computational routines to perform the same task.
The computation has the following phases:
1. The matrix is reduced to a bidiagonal block form.

2. The blocks are simultaneously diagonalized using techniques from the bidiagonal SVD algorithms.
Table "Driver Routines for Cosine-Sine Decomposition (CSD)" lists LAPACK routines (FORTRAN 77 interface)
that perform CS decomposition of matrices. The corresponding routine names in the Fortran 95 interface are
without the first symbol.
Driver Routines for Cosine-Sine Decomposition (CSD)
Compute the CS decomposition of a block-partitioned orthogonal matrix: orcsd, uncsd, orcsd2by1, uncsd2by1
Compute the CS decomposition of a block-partitioned unitary matrix: orcsd, uncsd
See Also
Cosine-Sine Decomposition: LAPACK Computational Routines
?orcsd/?uncsd
Computes the CS decomposition of a block-partitioned
orthogonal/unitary matrix.
Syntax
call sorcsd( jobu1, jobu2, jobv1t, jobv2t, trans, signs, m, p, q, x11, ldx11, x12,
ldx12, x21, ldx21, x22, ldx22, theta, u1, ldu1, u2, ldu2, v1t, ldv1t, v2t, ldv2t,
work, lwork, iwork, info )
call dorcsd( jobu1, jobu2, jobv1t, jobv2t, trans, signs, m, p, q, x11, ldx11, x12,
ldx12, x21, ldx21, x22, ldx22, theta, u1, ldu1, u2, ldu2, v1t, ldv1t, v2t, ldv2t,
work, lwork, iwork, info )
call cuncsd( jobu1, jobu2, jobv1t, jobv2t, trans, signs, m, p, q, x11, ldx11, x12,
ldx12, x21, ldx21, x22, ldx22, theta, u1, ldu1, u2, ldu2, v1t, ldv1t, v2t, ldv2t,
work, lwork, rwork, lrwork, iwork, info )
call zuncsd( jobu1, jobu2, jobv1t, jobv2t, trans, signs, m, p, q, x11, ldx11, x12,
ldx12, x21, ldx21, x22, ldx22, theta, u1, ldu1, u2, ldu2, v1t, ldv1t, v2t, ldv2t,
work, lwork, rwork, lrwork, iwork, info )
call orcsd( x11,x12,x21,x22,theta,u1,u2,v1t,v2t[,jobu1][,jobu2][,jobv1t][,jobv2t]
[,trans][,signs][,info] )
call uncsd( x11,x12,x21,x22,theta,u1,u2,v1t,v2t[,jobu1][,jobu2][,jobv1t][,jobv2t]
[,trans][,signs][,info] )
Include Files
mkl.fi, lapack.f90
Description
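The routines ?orcsd/?uncsd compute the CS decomposition of an m-by-m partitioned orthogonal matrix X
or unitary matrix (with the conjugate transpose ^H in place of the transpose ^T):

                                    [  I  0  0 |  0  0  0 ]
                                    [  0  C  0 |  0 -S  0 ]
        [ x11 | x12 ]   [ u1 |    ] [  0  0  0 |  0  0 -I ] [ v1 |    ]^T
    X = [-----------] = [---------] [---------------------] [---------]
        [ x21 | x22 ]   [    | u2 ] [  0  0  0 |  I  0  0 ] [    | v2 ]
                                    [  0  S  0 |  0  C  0 ]
                                    [  0  0  I |  0  0  0 ]

x11 is p-by-q. The orthogonal/unitary matrices u1, u2, v1, and v2 are p-by-p, (m-p)-by-(m-p), q-by-q, and
(m-q)-by-(m-q), respectively. C and S are r-by-r nonnegative diagonal matrices satisfying C^2 + S^2 = I, in which r
= min(p,m-p,q,m-q).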
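The block sizes and the relation between theta, C, and S can be summarized in a short Python sketch. It does not compute a CS decomposition; the function names and dictionary keys are hypothetical and only mirror the dimensions stated above.

```python
import math

def csd_shapes(m, p, q):
    """Return the block and factor dimensions for an m-by-m matrix X
    partitioned so that x11 is p-by-q, together with r."""
    if not (0 <= p <= m and 0 <= q <= m):
        raise ValueError("require 0 <= p <= m and 0 <= q <= m")
    return {
        "x11": (p, q), "x12": (p, m - q),
        "x21": (m - p, q), "x22": (m - p, m - q),
        "u1": (p, p), "u2": (m - p, m - p),
        "v1t": (q, q), "v2t": (m - q, m - q),
        "r": min(p, m - p, q, m - q),
    }

def cs_diagonals(theta):
    """Diagonals of C and S from theta: C = diag(cos), S = diag(sin)."""
    c = [math.cos(t) for t in theta]
    s = [math.sin(t) for t in theta]
    return c, s
```

Because C and S are built from cosines and sines of the same angles, C^2 + S^2 = I holds entry by entry; angles in [0, pi/2] are assumed here so that both diagonals are nonnegative.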
Input Parameters
jobu1 CHARACTER. If equal to 'Y', then u1 is computed. Otherwise, u1 is not
computed.
jobu2 CHARACTER. If equal to 'Y', then u2 is computed. Otherwise, u2 is not
computed.
jobv1t CHARACTER. If equal to 'Y', then v1t is computed. Otherwise, v1t is not
computed.
jobv2t CHARACTER. If equal to 'Y', then v2t is computed. Otherwise, v2t is not
computed.
trans CHARACTER
= 'T': x, u1, u2, v1t, and v2t are stored in row-major order.
otherwise x, u1, u2, v1t, and v2t are stored in column-major order.
signs CHARACTER
= 'O': The lower-left block is made nonpositive (the
"other" convention).
otherwise The upper-right block is made nonpositive (the
"default" convention).
p INTEGER. The number of rows in x11 and x12. 0 ≤ p ≤ m.
q INTEGER. The number of columns in x11 and x21. 0 ≤ q ≤ m.
x11, x12, x21, x22 REAL for sorcsd

DOUBLE PRECISION for dorcsd
COMPLEX for cuncsd
DOUBLE COMPLEX for zuncsd
Arrays of size x11 (ldx11,q), x12 (ldx12,m - q), x21 (ldx21,q), and x22
(ldx22,m - q).
Contain the parts of the orthogonal/unitary matrix whose CSD is desired.
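ldx11, ldx12, ldx21, ldx22 INTEGER. The leading dimensions of the parts of array X. ldx11 ≥ max(1,
p), ldx12 ≥ max(1, p), ldx21 ≥ max(1, m - p), ldx22 ≥ max(1, m - p).
ldu1 INTEGER. The leading dimension of the array u1. If jobu1 = 'Y', ldu1 ≥
max(1,p).
ldu2 INTEGER. The leading dimension of the array u2. If jobu2 = 'Y', ldu2 ≥
max(1,m-p).
ldv1t INTEGER. The leading dimension of the array v1t. If jobv1t = 'Y', ldv1t ≥
max(1,q).
ldv2t INTEGER. The leading dimension of the array v2t. If jobv2t = 'Y', ldv2t ≥
max(1,m-q).
work REAL for sorcsd
DOUBLE PRECISION for dorcsd
COMPLEX for cuncsd
DOUBLE COMPLEX for zuncsd
Workspace array, size (max(1,lwork)).
lwork INTEGER. The size of the work array. Constraints:
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.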
rwork REAL for cuncsd

DOUBLE PRECISION for zuncsd
Workspace array, size (max(1,lrwork)).
lrwork INTEGER. The size of the rwork array. Constraints:
If lrwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the rwork array, returns this value as the first
entry of the rwork array, and no error message related to lrwork is issued
by xerbla.
iwork INTEGER. Workspace array, dimension m.
Output Parameters
theta REAL for sorcsd
DOUBLE PRECISION for dorcsd
REAL for cuncsd
DOUBLE PRECISION for zuncsd
Array, size (r), in which r = min(p,m-p,q,m-q).
C = diag( cos(theta(1)), ..., cos(theta(r)) ), and
S = diag( sin(theta(1)), ..., sin(theta(r)) ).
u1 REAL for sorcsd
DOUBLE PRECISION for dorcsd
COMPLEX for cuncsd
DOUBLE COMPLEX for zuncsd
Array, size (ldu1,p).
If jobu1 = 'Y', u1 contains the p-by-p orthogonal/unitary matrix u1.
u2 REAL for sorcsd
DOUBLE PRECISION for dorcsd
COMPLEX for cuncsd
DOUBLE COMPLEX for zuncsd
Array, size (ldu2,m-p).
If jobu2 = 'Y', u2 contains the (m-p)-by-(m-p) orthogonal/unitary matrix
u2.
v1t REAL for sorcsd
DOUBLE PRECISION for dorcsd
COMPLEX for cuncsd
DOUBLE COMPLEX for zuncsd
Array, size (ldv1t,*).
If jobv1t = 'Y', v1t contains the q-by-q orthogonal matrix v1^T or unitary
matrix v1^H.
v2t REAL for sorcsd
DOUBLE PRECISION for dorcsd
COMPLEX for cuncsd
DOUBLE COMPLEX for zuncsd
Array, size (ldv2t,m-q).
If jobv2t = 'Y', v2t contains the (m-q)-by-(m-q) orthogonal matrix v2^T or
unitary matrix v2^H.
work On exit,
If info = 0, work(1) returns the optimal lwork.
If info > 0, For ?orcsd, work(2:r) contains the values

phi(1), ..., phi(r-1) that, together with
theta(1), ..., theta(r) define the matrix in
intermediate bidiagonal-block form remaining
after nonconvergence. info specifies the number
of nonzero phi's.
rwork On exit,
If info = 0, rwork(1) returns the optimal lrwork.
If info > 0, For ?uncsd, rwork(2:r) contains the values

phi(1), ..., phi(r-1) that, together with
theta(1), ..., theta(r) define the matrix in
intermediate bidiagonal-block form remaining
after nonconvergence. info specifies the number
of nonzero phi's.
info INTEGER.
> 0: ?orcsd/?uncsd did not converge. See the description of work above
for details.

Specific details for the routine ?orcsd/?uncsd interface are as follows:
x11 Holds the block of matrix X of size (p, q).
x12 Holds the block of matrix X of size (p, m-q).
x21 Holds the block of matrix X of size (m-p, q).
x22 Holds the block of matrix X of size (m-p, m-q).
theta Holds the vector of length r = min(p,m-p,q,m-q).
u1 Holds the matrix of size (p,p).
u2 Holds the matrix of size (m-p,m-p).
v1t Holds the matrix of size (q,q).
v2t Holds the matrix of size (m-q,m-q).
signs Must be 'O' or 'D'.
See Also
?bbcsd
xerbla
?orcsd2by1/?uncsd2by1
Computes the CS decomposition of a block-partitioned
orthogonal/unitary matrix.
Syntax
call sorcsd2by1( jobu1, jobu2, jobv1t, m, p, q, x11, ldx11, x21, ldx21, theta, u1,
ldu1, u2, ldu2, v1t, ldv1t, work, lwork, iwork, info )
call dorcsd2by1( jobu1, jobu2, jobv1t, m, p, q, x11, ldx11, x21, ldx21, theta, u1,
ldu1, u2, ldu2, v1t, ldv1t, work, lwork, iwork, info )
call cuncsd2by1( jobu1, jobu2, jobv1t, m, p, q, x11, ldx11, x21, ldx21, theta, u1,
ldu1, u2, ldu2, v1t, ldv1t, work, lwork, rwork, lrwork, iwork, info )
call zuncsd2by1( jobu1, jobu2, jobv1t, m, p, q, x11, ldx11, x21, ldx21, theta, u1,
ldu1, u2, ldu2, v1t, ldv1t, work, lwork, rwork, lrwork, iwork, info )
call orcsd2by1( x11,x21,theta,u1,u2,v1t[,jobu1][,jobu2][,jobv1t][,info] )
call uncsd2by1( x11,x21,theta,u1,u2,v1t[,jobu1][,jobu2][,jobv1t][,info] )
Include Files
mkl.fi, lapack.f90
Description
The routines ?orcsd2by1/?uncsd2by1 compute the CS decomposition of an m-by-q matrix X with
orthonormal columns that has been partitioned into a 2-by-1 block structure:

                              [  I  0  0 ]
                              [  0  C  0 ]
        [ x11 ]   [ u1 |    ] [  0  0  0 ]
    X = [-----] = [---------] [----------] v1^T
        [ x21 ]   [    | u2 ] [  0  0  0 ]
                              [  0  S  0 ]
                              [  0  0  I ]

(with v1^H in place of v1^T for complex flavors).
x11 is p-by-q. The orthogonal/unitary matrices u1, u2, and v1 are p-by-p, (m-p)-by-(m-p), and q-by-q,
respectively. C and S are r-by-r nonnegative diagonal matrices satisfying C^2 + S^2 = I, in which r
= min(p,m-p,q,m-q).
Input Parameters
jobu1 CHARACTER. If equal to 'Y', then u1 is computed. Otherwise, u1 is not

computed.
jobu2 CHARACTER. If equal to 'Y', then u2 is computed. Otherwise, u2 is not

computed.
jobv1t CHARACTER. If equal to 'Y', then v1t is computed. Otherwise, v1t is not
computed.
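p INTEGER. The number of rows in x11. 0 ≤ p ≤ m.
q INTEGER. The number of columns in x11. 0 ≤ q ≤ m.
x11 REAL for sorcsd2by1
DOUBLE PRECISION for dorcsd2by1
COMPLEX for cuncsd2by1
DOUBLE COMPLEX for zuncsd2by1
On entry, the part of the orthogonal matrix whose CSD is desired.
ldx11 INTEGER. The leading dimension of the array x11. ldx11 ≥ max(1,p).
x21 REAL for sorcsd2by1
DOUBLE PRECISION for dorcsd2by1
COMPLEX for cuncsd2by1
DOUBLE COMPLEX for zuncsd2by1
On entry, the part of the orthogonal matrix whose CSD is desired.
ldx21 INTEGER. The leading dimension of the array x21. ldx21 ≥ max(1,m - p).
ldu1 INTEGER. The leading dimension of the array u1. If jobu1 = 'Y', ldu1 ≥
max(1,p).
ldu2 INTEGER. The leading dimension of the array u2. If jobu2 = 'Y', ldu2 ≥
max(1,m-p).
ldv1t INTEGER. The leading dimension of the array v1t. If jobv1t = 'Y', ldv1t ≥
max(1,q).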
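work REAL for sorcsd2by1
DOUBLE PRECISION for dorcsd2by1
COMPLEX for cuncsd2by1
DOUBLE COMPLEX for zuncsd2by1
Workspace array, size (max(1,lwork)).
lwork INTEGER. The size of the work array. Constraints:
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the first
entry of the work array, and no error message related to lwork is issued by
xerbla.
rwork REAL for cuncsd2by1
DOUBLE PRECISION for zuncsd2by1
Workspace array, size (max(1,lrwork)).
lrwork INTEGER. The size of the rwork array. Constraints:
If lrwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the rwork array, returns this value as the first
entry of the rwork array, and no error message related to lrwork is issued
by xerbla.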
iwork INTEGER. Workspace array, dimension m - min(p, m - p, q, m - q).
Output Parameters
theta REAL for sorcsd2by1

Array, size (r), in which r = min(p,m-p,q,m-q).
C = diag( cos(theta(1)), ..., cos(theta(r)) ), and

S = diag( sin(theta(1)), ..., sin(theta(r)) ).
u1 REAL for sorcsd2by1

Array, size (ldu1,p) .
If jobu1 = 'Y', u1 contains the p-by-p orthogonal/unitary matrix u1.
u2 REAL for sorcsd2by1

Array, size (ldu2,m - p) .
If jobu2 = 'Y', u2 contains the (m-p)-by-(m-p) orthogonal/unitary matrix
u2.
v1t REAL for sorcsd2by1

Array, size (ldv1t,q) .
If jobv1t = 'Y', v1t contains the q-by-q orthogonal matrix v1T or unitary
matrix v1H.
work On exit,
If info = 0, work(1) returns the optimal lwork.
If info > 0, work(2:r) contains the values phi(1), ...,
phi(r-1) that, together with theta(1), ...,
theta(r) define the matrix in intermediate
bidiagonal-block form remaining after
nonconvergence. info specifies the number of
nonzero phi's.
rwork On exit,
If info = 0, rwork(1) returns the optimal lrwork.
If info > 0, rwork(2:r) contains the values phi(1), ...,

phi(r-1) that, together with theta(1), ...,
theta(r) define the matrix in intermediate
bidiagonal-block form remaining after
nonconvergence. info specifies the number of
nonzero phi's.
info INTEGER.
> 0: ?orcsd2by1/?uncsd2by1 did not converge. See the description of

work above for details.

Specific details for the routine ?orcsd2by1/?uncsd2by1 interface are as follows:
x11 Holds the block of matrix X11 of size (p, q).
x21 Holds the block of matrix X21 of size (m-p, q).
theta Holds the vector of length r = min(p,m-p,q,m-q).
u1 Holds the matrix u1 of size (p,p).
u2 Holds the matrix u2 of size (m-p,m-p).
v1t Holds the matrix v1T or v1H of size (q,q).
jobu1 Indicates whether u1 is computed. Must be 'Y' or 'O'.
jobu2 Indicates whether u2 is computed. Must be 'Y' or 'O'.
See Also
?bbcsd
xerbla
Generalized Symmetric Definite Eigenvalue Problems: LAPACK Driver Routines

This section describes LAPACK driver routines used for solving generalized symmetric definite eigenproblems.
See also computational routines that can be called to solve these problems. Table "Driver Routines for
Solving Generalized Symmetric Definite Eigenproblems" lists all such driver routines for the FORTRAN 77
interface. The corresponding routine names in the Fortran 95 interface are without the first symbol.
Driver Routines for Solving Generalized Symmetric Definite Eigenproblems
sygv/hegv Computes all eigenvalues and, optionally, eigenvectors of a real / complex
generalized symmetric / Hermitian positive-definite eigenproblem.
sygvd/hegvd Computes all eigenvalues and, optionally, eigenvectors of a real / complex
generalized symmetric / Hermitian positive-definite eigenproblem. If eigenvectors
are desired, it uses a divide and conquer method.
sygvx/hegvx Computes selected eigenvalues and, optionally, eigenvectors of a real / complex
generalized symmetric / Hermitian positive-definite eigenproblem.
spgv/hpgv Computes all eigenvalues and, optionally, eigenvectors of a real / complex
generalized symmetric / Hermitian positive-definite eigenproblem with matrices in
packed storage.
spgvd/hpgvd Computes all eigenvalues and, optionally, eigenvectors of a real / complex
generalized symmetric / Hermitian positive-definite eigenproblem with matrices in
packed storage. If eigenvectors are desired, it uses a divide and conquer method.
spgvx/hpgvx Computes selected eigenvalues and, optionally, eigenvectors of a real / complex
generalized symmetric / Hermitian positive-definite eigenproblem with matrices in
packed storage.
sbgv/hbgv Computes all eigenvalues and, optionally, eigenvectors of a real / complex
generalized symmetric / Hermitian positive-definite eigenproblem with banded
matrices.
sbgvd/hbgvd Computes all eigenvalues and, optionally, eigenvectors of a real / complex
generalized symmetric / Hermitian positive-definite eigenproblem with banded
matrices. If eigenvectors are desired, it uses a divide and conquer method.
sbgvx/hbgvx Computes selected eigenvalues and, optionally, eigenvectors of a real / complex
generalized symmetric / Hermitian positive-definite eigenproblem with banded
matrices.
?sygv
Computes all eigenvalues and, optionally, eigenvectors of a real generalized symmetric definite
eigenproblem.
Syntax
call ssygv(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, info)
call dsygv(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, info)
call sygv(a, b, w [,itype] [,jobz] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a real generalized symmetric-
definite eigenproblem, of the form
A*x = lambda*B*x, A*B*x = lambda*x, or B*A*x = lambda*x.
Here A and B are assumed to be symmetric and B is also positive definite.
Input Parameters

Specifies the problem type to be solved:
if itype = 1, the problem type is A*x = lambda*B*x;
if itype = 2, the problem type is A*B*x = lambda*x;
if itype = 3, the problem type is B*A*x = lambda*x.

If jobz = 'N', then compute eigenvalues only.
If jobz = 'V', then compute eigenvalues and eigenvectors.

If uplo = 'U', arrays a and b store the upper triangles of A and B;
If uplo = 'L', arrays a and b store the lower triangles of A and B.
a, b, work REAL for ssygv

DOUBLE PRECISION for dsygv.
Arrays:
a(lda,*) contains the upper or lower triangle of the symmetric matrix A, as
specified by uplo.
b(ldb,*) contains the upper or lower triangle of the symmetric positive
definite matrix B, as specified by uplo.
lwork INTEGER.
The dimension of the array work;
lwork ≥ max(1, 3n-1).
xerbla.
Output Parameters
a On exit, if jobz = 'V', then if info = 0, a contains the matrix Z of

eigenvectors. The eigenvectors are normalized as follows:
if itype = 1 or 2, Z^T*B*Z = I;
if itype = 3, Z^T*inv(B)*Z = I;
If jobz = 'N', then on exit the upper triangle (if uplo = 'U') or the
lower triangle (if uplo = 'L') of A, including the diagonal, is destroyed.
b On exit, if info ≤ n, the part of b containing the matrix is overwritten by the
triangular factor U or L from the Cholesky factorization B = U^T*U or B =
L*L^T.
w REAL for ssygv

DOUBLE PRECISION for dsygv.
lwork.
info INTEGER.
If info = -i, the i-th argument had an illegal value.
If info > 0, spotrf/dpotrf or ssyev/dsyev returned an error code:
If info = i ≤ n, ssyev/dsyev failed to converge, and i off-diagonal
elements of an intermediate tridiagonal did not converge to zero;
If info = n + i, for 1 ≤ i ≤ n, then the leading minor of order i of B is not
positive-definite. The factorization of B could not be completed and no
eigenvalues or eigenvectors were computed.
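
The info convention described above can be decoded mechanically. The following is a minimal sketch, not part of the library: the module name sygv_info_sketch, the subroutine classify_sygv_info, and the stage codes are hypothetical helpers introduced only for illustration.

```fortran
module sygv_info_sketch
  implicit none
contains
  ! Classify the info value returned by ?sygv for a problem of order n.
  ! stage:  0 = success
  !        -1 = an argument had an illegal value
  !         1 = the eigensolver (?syev) failed to converge
  !         2 = the Cholesky factorization of B failed
  ! idx:    the argument position, the number of unconverged off-diagonal
  !         elements, or the order of the failing leading minor of B.
  subroutine classify_sygv_info(info, n, stage, idx)
    integer, intent(in)  :: info, n
    integer, intent(out) :: stage, idx
    if (info == 0) then
      stage = 0
      idx = 0
    else if (info < 0) then
      stage = -1
      idx = -info
    else if (info <= n) then
      stage = 1
      idx = info
    else
      stage = 2
      idx = info - n
    end if
  end subroutine classify_sygv_info
end module sygv_info_sketch
```

For example, with n = 5 a return value info = 7 gives stage = 2 and idx = 2, which means that the leading minor of order 2 of B is not positive definite.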

Specific details for the routine sygv interface are the following:
b Holds the matrix B of size (n, n).
Application Notes
For optimum performance use lwork ≥ (nb+2)*n, where nb is the blocksize for ssytrd/dsytrd returned by
ilaenv.
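
A minimal sketch of calling dsygv through the FORTRAN 77 interface, assuming the code is linked against Intel MKL (LP64) or another LAPACK implementation. The module name, the subroutine name, and the 2-by-2 matrices are illustrative assumptions. The workspace query with lwork = -1 follows the usual LAPACK convention, in which the optimal size is returned in the first element of the work array.

```fortran
module dsygv_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Solves A*x = lambda*B*x (itype = 1) for an illustrative 2-by-2 pair.
  ! On exit, w holds the eigenvalues in ascending order and z the eigenvectors.
  subroutine solve_small_gsep(w, z, info)
    real(dp), intent(out) :: w(2), z(2,2)
    integer,  intent(out) :: info
    integer, parameter :: n = 2
    real(dp) :: b(n,n), query(1)
    real(dp), allocatable :: work(:)
    integer :: lwork
    external :: dsygv

    ! z starts as the symmetric matrix A; dsygv overwrites it with Z.
    z = reshape([2.0_dp, 1.0_dp, 1.0_dp, 2.0_dp], [n, n])
    ! B is symmetric positive definite; it is overwritten by its Cholesky factor.
    b = reshape([2.0_dp, 0.0_dp, 0.0_dp, 1.0_dp], [n, n])

    ! Workspace query: lwork = -1 returns the optimal size in query(1).
    call dsygv(1, 'V', 'U', n, z, n, b, n, w, query, -1, info)
    if (info /= 0) return
    lwork = max(3*n - 1, int(query(1)))
    allocate(work(lwork))

    ! Actual solve with the queried workspace size.
    call dsygv(1, 'V', 'U', n, z, n, b, n, w, work, lwork, info)
  end subroutine solve_small_gsep
end module dsygv_sketch
```

For these matrices the characteristic equation is 2*lambda^2 - 6*lambda + 3 = 0, so a successful call (info = 0) should return eigenvalues close to (3 - sqrt(3))/2 and (3 + sqrt(3))/2.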
?hegv
Computes all eigenvalues and, optionally, eigenvectors of a complex generalized Hermitian
positive-definite eigenproblem.
Syntax
call chegv(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, rwork, info)
call zhegv(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, rwork, info)
call hegv(a, b, w [,itype] [,jobz] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes all the eigenvalues, and optionally, the eigenvectors of a complex generalized
Hermitian positive-definite eigenproblem, of the form
A*x = lambda*B*x, A*B*x = lambda*x, or B*A*x = lambda*x.
Here A and B are assumed to be Hermitian and B is also positive definite.
Input Parameters
itype INTEGER. Must be 1 or 2 or 3. Specifies the problem type to be solved:



a, b, work COMPLEX for chegv

DOUBLE COMPLEX for zhegv.
Arrays:
a(lda,*) contains the upper or lower triangle of the Hermitian matrix A, as
specified by uplo.
b(ldb,*) contains the upper or lower triangle of the Hermitian positive
lwork INTEGER.
The dimension of the array work; lwork ≥ max(1, 2n-1).

xerbla.
rwork REAL for chegv

DOUBLE PRECISION for zhegv.
Output Parameters

if itype = 1 or 2, Z^H*B*Z = I;
if itype = 3, Z^H*inv(B)*Z = I;

triangular factor U or L from the Cholesky factorization B = U^H*U or B =
L*L^H.
w REAL for chegv

DOUBLE PRECISION for zhegv.
lwork.
info INTEGER.
If info = -i, the i-th argument has an illegal value.
If info > 0, cpotrf/zpotrf or cheev/zheev return an error code:
If info = i ≤ n, cheev/zheev fails to converge, and i off-diagonal elements

of an intermediate tridiagonal do not converge to zero;
positive-definite. The factorization of B can not be completed and no
eigenvalues or eigenvectors are computed.

Specific details for the routine hegv interface are the following:
Application Notes
For optimum performance use lwork ≥ (nb+1)*n, where nb is the blocksize for chetrd/zhetrd returned by
ilaenv.
lwork = -1.
?sygvd
eigenproblem using a divide and conquer method.
Syntax
call ssygvd(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, iwork, liwork, info)
call dsygvd(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, iwork, liwork, info)
call sygvd(a, b, w [,itype] [,jobz] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
A*x = lambda*B*x, A*B*x = lambda*x, or B*A*x = lambda*x.
Here A and B are assumed to be symmetric and B is also positive definite.
It uses a divide and conquer algorithm.
Input Parameters



a, b, work REAL for ssygvd

DOUBLE PRECISION for dsygvd.
Arrays:
specified by uplo.
lwork INTEGER.
Constraints:
If n ≤ 1, lwork ≥ 1;
If jobz = 'N' and n>1, lwork ≥ 2n+1;
If jobz = 'V' and n>1, lwork ≥ 2n^2+6n+1.

Notes for details.
iwork INTEGER.
Workspace array, size max(1, liwork).
liwork INTEGER.
Constraints:
If n ≤ 1, liwork ≥ 1;
If jobz = 'N' and n>1, liwork ≥ 1;
If jobz = 'V' and n>1, liwork ≥ 5n+3.

Notes for details.
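
The lwork and liwork constraints for ?sygvd can be collected into a small helper. This is a sketch under the constraints listed above; the module and subroutine names are hypothetical and do not belong to the library.

```fortran
module sygvd_workspace_sketch
  implicit none
contains
  ! Returns the minimum lwork and liwork accepted by ssygvd/dsygvd
  ! for a problem of order n, following the documented constraints.
  subroutine sygvd_min_workspace(jobz, n, lwork, liwork)
    character(len=1), intent(in)  :: jobz
    integer,          intent(in)  :: n
    integer,          intent(out) :: lwork, liwork
    if (n <= 1) then
      lwork  = 1
      liwork = 1
    else if (jobz == 'V' .or. jobz == 'v') then
      ! Eigenvalues and eigenvectors.
      lwork  = 2*n*n + 6*n + 1
      liwork = 5*n + 3
    else
      ! Eigenvalues only.
      lwork  = 2*n + 1
      liwork = 1
    end if
  end subroutine sygvd_min_workspace
end module sygvd_workspace_sketch
```

For n = 10 and jobz = 'V' the helper returns lwork = 261 and liwork = 53. These are minimum sizes; a workspace query generally returns a size better suited for performance.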
Output Parameters


L*L^T.
w REAL for ssygvd
DOUBLE PRECISION for dsygvd.

lwork.
liwork.
info INTEGER.
If info > 0, an error code is returned as specified below.
For info ≤ n:
If info = i and jobz = 'N', then the algorithm failed to converge;

i off-diagonal elements of an intermediate tridiagonal form did not
converge to zero.
If jobz = 'V', then the algorithm failed to compute an eigenvalue
while working on the submatrix lying in rows and columns info/(n+1)
through mod(info,n+1).
For info > n:
If info = n + i, for 1 ≤ i ≤ n, then the leading minor of order i of B

is not positive-definite. The factorization of B could not be completed
and no eigenvalues or eigenvectors were computed.

Specific details for the routine sygvd interface are the following:
Application Notes
?hegvd
positive-definite eigenproblem using a divide and
conquer method.
Syntax
call chegvd(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, rwork, lrwork,
iwork, liwork, info)
call zhegvd(itype, jobz, uplo, n, a, lda, b, ldb, w, work, lwork, rwork, lrwork,
iwork, liwork, info)
call hegvd(a, b, w [,itype] [,jobz] [,uplo] [,info])
Include Files
mkl.fi, lapack.f90
Description
A*x = lambda*B*x, A*B*x = lambda*x, or B*A*x = lambda*x.
Here A and B are assumed to be Hermitian and B is also positive definite.
It uses a divide and conquer algorithm.
Input Parameters



a, b, work COMPLEX for chegvd

DOUBLE COMPLEX for zhegvd.
Arrays:
specified by uplo.
lwork INTEGER.
Constraints:
If n ≤ 1, lwork ≥ 1;
If jobz = 'N' and n>1, lwork ≥ n+1;
If jobz = 'V' and n>1, lwork ≥ n^2+2n.

rwork REAL for chegvd

DOUBLE PRECISION for zhegvd.
Workspace array, size max(1, lrwork).
lrwork INTEGER.
Constraints:
If n ≤ 1, lrwork ≥ 1;
If jobz = 'N' and n>1, lrwork ≥ n;
If jobz = 'V' and n>1, lrwork ≥ 2n^2+5n+1.

iwork INTEGER.
Workspace array, size max(1, liwork).
liwork INTEGER.
Constraints:
If n ≤ 1, liwork ≥ 1;

Output Parameters

if itype = 1 or 2, Z^H*B*Z = I;

L*L^H.
w REAL for chegvd

DOUBLE PRECISION for zhegvd.
lwork.
lrwork.
liwork.
info INTEGER.
If info = i, and jobz = 'N', then the algorithm failed to converge; i off-
diagonal elements of an intermediate tridiagonal form did not converge to
zero;
if info = i, and jobz = 'V', then the algorithm failed to compute an
info/(n+1) through mod(info, n+1).

Specific details for the routine hegvd interface are the following:
Application Notes
subsequent runs.
workspace.
?sygvx
eigenproblem.
Syntax
call ssygvx(itype, jobz, range, uplo, n, a, lda, b, ldb, vl, vu, il, iu, abstol, m,
w, z, ldz, work, lwork, iwork, ifail, info)
call dsygvx(itype, jobz, range, uplo, n, a, lda, b, ldb, vl, vu, il, iu, abstol, m,
w, z, ldz, work, lwork, iwork, ifail, info)
call sygvx(a, b, w [,itype] [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol]
[,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues, and optionally, the eigenvectors of a real generalized symmetric-
A*x = lambda*B*x, A*B*x = lambda*x, or B*A*x = lambda*x.
Here A and B are assumed to be symmetric and B is also positive definite. Eigenvalues and eigenvectors can
be selected by specifying either a range of values or a range of indices for the desired eigenvalues.
Input Parameters

if itype = 1, the problem type is A*x = lambda*B*x;
if itype = 2, the problem type is A*B*x = lambda*x;
if itype = 3, the problem type is B*A*x = lambda*x.



open interval:
vl < lambda(i) ≤ vu.

a, b, work REAL for ssygvx

DOUBLE PRECISION for dsygvx.
Arrays:
specified by uplo.

vl, vu REAL for ssygvx

for eigenvalues.
Constraint: vl< vu.
il, iu INTEGER.
if n = 0.
abstol REAL for ssygvx

DOUBLE PRECISION for dsygvx. The absolute error tolerance for the
eigenvalues. See Application Notes for more information.

ldz ≥ 1; if jobz = 'V', ldz ≥ max(1, n).
lwork INTEGER.
The dimension of the array work;
lwork ≥ max(1, 8n).
xerbla.
iwork INTEGER.
Output Parameters
a On exit, the upper triangle (if uplo = 'U') or the lower triangle (if uplo =
'L') of A, including the diagonal, is overwritten.
L*L^T.

0 ≤ m ≤ n. If range = 'A', m = n, and if range = 'I',
m = iu-il+1.
w, z REAL for ssygvx

Arrays:
The first m elements of w contain the selected eigenvalues in ascending
order.
z(ldz,*) .
with w(i). The eigenvectors are normalized as follows:

returned in ifail.
lwork.
ifail INTEGER.
converge.
info INTEGER.
If info = -i, the ith argument had an illegal value.
If info > 0, spotrf/dpotrf and ssyevx/dsyevx returned an error code:
If info = i ≤ n, ssyevx/dsyevx failed to converge, and i eigenvectors failed

to converge. Their indices are stored in the array ifail;

Specific details for the routine sygvx interface are the following:

Application Notes
An approximate eigenvalue is accepted as converged when it is determined to lie in an interval [a,b] of
width less than or equal to abstol + eps*max(|a|,|b|), where eps is the machine precision.
obtained by reducing C to tridiagonal form, where C is the symmetric matrix of the standard symmetric
problem to which the generalized problem is transformed. Eigenvalues will be computed most accurately
lamch('S').
For optimum performance use lwork ≥ (nb+3)*n, where nb is the blocksize for ssytrd/dsytrd returned by
ilaenv.
= -1.
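
The acceptance test for an eigenvalue interval can be written out as a short function. This is an illustrative sketch only: the module and function names are hypothetical, and eps is taken from the Fortran intrinsic epsilon(), which is an assumption here because the library uses its own machine-precision constant.

```fortran
module abstol_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! True when the interval [a,b] is narrow enough to accept an eigenvalue:
  !   b - a <= abstol + eps*max(|a|,|b|)
  ! eps comes from the intrinsic epsilon(); this is an assumption made
  ! for the sketch, not the constant used inside the library.
  logical function interval_converged(a, b, abstol)
    real(dp), intent(in) :: a, b, abstol
    real(dp) :: eps
    eps = epsilon(1.0_dp)
    interval_converged = (b - a) <= abstol + eps*max(abs(a), abs(b))
  end function interval_converged
end module abstol_sketch
```

With abstol = 0 the test reduces to a purely relative criterion, so an interval of width 1.0e-10 around 1.0 is rejected, while the same interval is accepted with abstol = 1.0e-9.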
?hegvx
positive-definite eigenproblem.
Syntax
call chegvx(itype, jobz, range, uplo, n, a, lda, b, ldb, vl, vu, il, iu, abstol, m,
w, z, ldz, work, lwork, rwork, iwork, ifail, info)
call zhegvx(itype, jobz, range, uplo, n, a, lda, b, ldb, vl, vu, il, iu, abstol, m,
w, z, ldz, work, lwork, rwork, iwork, ifail, info)
call hegvx(a, b, w [,itype] [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,abstol]
[,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes selected eigenvalues, and optionally, the eigenvectors of a complex generalized
A*x = lambda*B*x, A*B*x = lambda*x, or B*A*x = lambda*x.
Here A and B are assumed to be Hermitian and B is also positive definite. Eigenvalues and eigenvectors can
be selected by specifying either a range of values or a range of indices for the desired eigenvalues.
Input Parameters
if itype = 1, the problem type is A*x = lambda*B*x;
if itype = 2, the problem type is A*B*x = lambda*x;
if itype = 3, the problem type is B*A*x = lambda*x.



open interval:
vl < lambda(i) ≤ vu.

a, b, work COMPLEX for chegvx

DOUBLE COMPLEX for zhegvx.
Arrays:
specified by uplo.
vl, vu REAL for chegvx

DOUBLE PRECISION for zhegvx.
for eigenvalues.
Constraint: vl< vu.
il, iu INTEGER.
if n = 0.
abstol REAL for chegvx

more information.

ldz ≥ 1; if jobz = 'V', ldz ≥ max(1, n).
lwork INTEGER.
The dimension of the array work; lwork ≥ max(1, 2n).

xerbla.
rwork REAL for chegvx

iwork INTEGER.
Output Parameters
a On exit, the upper triangle (if uplo = 'U') or the lower triangle (if uplo =
'L') of A, including the diagonal, is overwritten.

L*L^H.

m = iu-il+1.
w REAL for chegvx

The first m elements of w contain the selected eigenvalues in ascending

order.
z COMPLEX for chegvx

DOUBLE COMPLEX for zhegvx.
Array z(ldz,*). The second dimension of z must be at least max(1, m).


returned in ifail.
lwork.
ifail INTEGER.
converge.
info INTEGER.
If info = -i, the ith argument had an illegal value.
If info > 0, cpotrf/zpotrf and cheevx/zheevx returned an error code:
If info = i ≤ n, cheevx/zheevx failed to converge, and i eigenvectors failed


Specific details for the routine hegvx interface are the following:

Application Notes
matrix obtained by reducing C to tridiagonal form, where C is the symmetric matrix of the standard
symmetric problem to which the generalized problem is transformed. Eigenvalues will be computed most
accurately when abstol is set to twice the underflow threshold 2*?lamch('S'), not zero.
to 2*?lamch('S').
For optimum performance use lwork ≥ (nb+1)*n, where nb is the blocksize for chetrd/zhetrd returned by
ilaenv.
lwork = -1.
?spgv
eigenproblem with matrices in packed storage.
Syntax
call sspgv(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, info)
call dspgv(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, info)
call spgv(ap, bp, w [,itype] [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
A*x = lambda*B*x, A*B*x = lambda*x, or B*A*x = lambda*x.
Here A and B are assumed to be symmetric, stored in packed format, and B is also positive definite.
Input Parameters



If uplo = 'U', arrays ap and bp store the upper triangles of A and B;
If uplo = 'L', arrays ap and bp store the lower triangles of A and B.
ap, bp, work REAL for sspgv

DOUBLE PRECISION for dspgv.
Arrays:
ap(*) contains the packed upper or lower triangle of the symmetric matrix
A, as specified by uplo.
bp(*) contains the packed upper or lower triangle of the symmetric matrix
B, as specified by uplo.
'V', ldz ≥ max(1, n).
Output Parameters
ap On exit, the contents of ap are overwritten.
bp On exit, contains the triangular factor U or L from the Cholesky factorization

B = U^T*U or B = L*L^T, in the same storage format as B.
w, z REAL for sspgv

Arrays:
z(ldz,*) .
If jobz = 'V', then if info = 0, z contains the matrix Z of eigenvectors.
The eigenvectors are normalized as follows:
info INTEGER.
If info > 0, spptrf/dpptrf and sspev/dspev returned an error code:
If info = i ≤ n, sspev/dspev failed to converge, and i off-diagonal


Specific details for the routine spgv interface are the following:

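
The packed arrays ap and bp expected by ?spgv can be built from a full matrix. The sketch below assumes the standard LAPACK packed layout for uplo = 'U', in which the upper triangle is stored column by column so that A(i,j) with i ≤ j lands at position i + (j-1)*j/2. The module and function names are hypothetical.

```fortran
module packed_storage_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Packs the upper triangle of the n-by-n symmetric matrix a into a
  ! one-dimensional array of length n*(n+1)/2, column by column:
  !   ap(i + (j-1)*j/2) = a(i,j)  for 1 <= i <= j <= n.
  function pack_upper(a) result(ap)
    real(dp), intent(in) :: a(:,:)
    real(dp) :: ap(size(a,1)*(size(a,1)+1)/2)
    integer :: i, j
    do j = 1, size(a,1)
      do i = 1, j
        ap(i + (j-1)*j/2) = a(i,j)
      end do
    end do
  end function pack_upper
end module packed_storage_sketch
```

For a 3-by-3 matrix the result holds a(1,1), a(1,2), a(2,2), a(1,3), a(2,3), a(3,3) in that order, and it can be passed as ap or bp together with uplo = 'U'.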
?hpgv
positive-definite eigenproblem with matrices in packed
storage.
Syntax
call chpgv(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, rwork, info)
call zhpgv(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, rwork, info)
call hpgv(ap, bp, w [,itype] [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
A*x = lambda*B*x, A*B*x = lambda*x, or B*A*x = lambda*x.
Here A and B are assumed to be Hermitian, stored in packed format, and B is also positive definite.
Input Parameters


ap, bp, work COMPLEX for chpgv

DOUBLE COMPLEX for zhpgv.
Arrays:
ap(*) contains the packed upper or lower triangle of the Hermitian matrix
bp(*) contains the packed upper or lower triangle of the Hermitian matrix
work(*) is a workspace array, size at least max(1, 2n-1).
'V', ldz ≥ max(1, n).
rwork REAL for chpgv

DOUBLE PRECISION for zhpgv.
Output Parameters

B = U^H*U or B = L*L^H, in the same storage format as B.
w REAL for chpgv

DOUBLE PRECISION for zhpgv.
z COMPLEX for chpgv

DOUBLE COMPLEX for zhpgv.
Array z(ldz,*).

info INTEGER.
If info > 0, cpptrf/zpptrf and chpev/zhpev returned an error code:
If info = i ≤ n, chpev/zhpev failed to converge, and i off-diagonal


Specific details for the routine hpgv interface are the following:

?spgvd
eigenproblem with matrices in packed storage using a
divide and conquer method.
Syntax
call sspgvd(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, lwork, iwork, liwork, info)
call dspgvd(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, lwork, iwork, liwork, info)
call spgvd(ap, bp, w [,itype] [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
A*x = lambda*B*x, A*B*x = lambda*x, or B*A*x = lambda*x.
If eigenvectors are desired, it uses a divide and conquer algorithm.
Input Parameters



ap, bp, work REAL for sspgvd

DOUBLE PRECISION for dspgvd.
Arrays:
'V', ldz ≥ max(1, n).
lwork INTEGER.
Constraints:
If n ≤ 1, lwork ≥ 1;
If jobz = 'N' and n>1, lwork ≥ 2n;
If jobz = 'V' and n>1, lwork ≥ 2n^2+6n+1.

Notes for details.
iwork INTEGER.
Workspace array, dimension max(1, liwork).
liwork INTEGER.
Constraints:
If n ≤ 1, liwork ≥ 1;

Notes for details.
Output Parameters

w, z REAL for sspgvd

Arrays:
z(ldz,*).
lwork.
liwork.
info INTEGER.
If info > 0, spptrf/dpptrf and sspevd/dspevd returned an error code:
If info = i ≤ n, sspevd/dspevd failed to converge, and i off-diagonal


Specific details for the routine spgvd interface are the following:

Application Notes
query.
workspace.
?hpgvd
positive-definite eigenproblem with matrices in packed
storage using a divide and conquer method.
Syntax
call chpgvd(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, lwork, rwork, lrwork,
iwork, liwork, info)
call zhpgvd(itype, jobz, uplo, n, ap, bp, w, z, ldz, work, lwork, rwork, lrwork,
iwork, liwork, info)
call hpgvd(ap, bp, w [,itype] [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
A*x = lambda*B*x, A*B*x = lambda*x, or B*A*x = lambda*x.
Input Parameters



ap, bp, work COMPLEX for chpgvd

DOUBLE COMPLEX for zhpgvd.
Arrays:
'V', ldz ≥ max(1, n).
lwork INTEGER.
Constraints:
If n ≤ 1, lwork ≥ 1;
If jobz = 'N' and n>1, lwork ≥ n;
If jobz = 'V' and n>1, lwork ≥ 2n.

rwork REAL for chpgvd

DOUBLE PRECISION for zhpgvd.
Workspace array, size max(1, lrwork).
lrwork INTEGER.
Constraints:
If n ≤ 1, lrwork ≥ 1;
If jobz = 'V' and n>1, lrwork ≥ 2n^2+5n+1.

iwork INTEGER.
liwork INTEGER.
Constraints:
If n ≤ 1, liwork ≥ 1;

Output Parameters

w REAL for chpgvd

DOUBLE PRECISION for zhpgvd.
z COMPLEX for chpgvd

DOUBLE COMPLEX for zhpgvd.
Array z(ldz,*).

lwork.
lrwork.
liwork.
info INTEGER.
If info > 0, cpptrf/zpptrf and chpevd/zhpevd returned an error code:
If info = i ≤ n, chpevd/zhpevd failed to converge, and i off-diagonal

Specific details for the routine hpgvd interface are the following:

Application Notes
subsequent runs.
workspace.
?spgvx
eigenproblem with matrices in packed storage.
Syntax
call sspgvx(itype, jobz, range, uplo, n, ap, bp, vl, vu, il, iu, abstol, m, w, z,
ldz, work, iwork, ifail, info)
call dspgvx(itype, jobz, range, uplo, n, ap, bp, vl, vu, il, iu, abstol, m, w, z,
ldz, work, iwork, ifail, info)
call spgvx(ap, bp, w [,itype] [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail]
[,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
A*x = lambda*B*x, A*B*x = lambda*x, or B*A*x = lambda*x.
Input Parameters




open interval:
vl < lambda(i) ≤ vu.

ap, bp, work REAL for sspgvx

DOUBLE PRECISION for dspgvx.
Arrays:
The size of bp must be at least max(1, n*(n+1)/2).
vl, vu REAL for sspgvx

for eigenvalues.
Constraint: vl< vu.
il, iu INTEGER.
if n = 0.
abstol REAL for sspgvx

more information.

ldz ≥ 1; if jobz = 'V', ldz ≥ max(1, n).
iwork INTEGER.
Output Parameters


m = iu-il+1.
w, z REAL for sspgvx

Arrays:
z(ldz,*) .


returned in ifail.
ifail INTEGER.
converge.
info INTEGER.
If info > 0, spptrf/dpptrf and sspevx/dspevx returned an error code:
If info = i ≤ n, sspevx/dspevx failed to converge, and i eigenvectors failed


Specific details for the routine spgvx interface are the following:

Application Notes
If abstol is less than or equal to zero, then eps*||T||_1 is used instead, where T is the tridiagonal matrix
obtained by reducing A to tridiagonal form. Eigenvalues are computed most accurately when abstol is set to
twice the underflow threshold 2*?lamch('S'), not zero.
lamch('S').
?hpgvx
eigenvectors of a generalized Hermitian positive-
definite eigenproblem with matrices in packed
storage.
Syntax
call chpgvx(itype, jobz, range, uplo, n, ap, bp, vl, vu, il, iu, abstol, m, w, z,
ldz, work, rwork, iwork, ifail, info)
call zhpgvx(itype, jobz, range, uplo, n, ap, bp, vl, vu, il, iu, abstol, m, w, z,
ldz, work, rwork, iwork, ifail, info)
call hpgvx(ap, bp, w [,itype] [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail]
[,abstol] [,info])
Include Files
mkl.fi, lapack.f90
Description
A*x = lambda*B*x, A*B*x = lambda*x, or B*A*x = lambda*x.
Input Parameters




open interval:
vl < lambda(i) ≤ vu.

ap, bp, work COMPLEX for chpgvx

DOUBLE COMPLEX for zhpgvx.
Arrays:
vl, vu REAL for chpgvx

DOUBLE PRECISION for zhpgvx.
for eigenvalues.
Constraint: vl< vu.
il, iu INTEGER.
if n = 0.
abstol REAL for chpgvx

The absolute error tolerance for the eigenvalues.
See Application Notes for more information.
'V', ldz ≥ max(1, n).
rwork REAL for chpgvx

iwork INTEGER.
Output Parameters


m = iu-il+1.
w REAL for chpgvx

z COMPLEX for chpgvx

DOUBLE COMPLEX for zhpgvx.
Array z(ldz,*).


returned in ifail.
ifail INTEGER.
converge.
info INTEGER.
If info > 0, cpptrf/zpptrf and chpevx/zhpevx returned an error code:
If info = i ≤ n, chpevx/zhpevx failed to converge, and i eigenvectors failed


Specific details for the routine hpgvx interface are the following:

Application Notes
to 2*?lamch('S').
?sbgv
eigenproblem with banded matrices.
Syntax
call ssbgv(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, info)
call dsbgv(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, info)
call sbgv(ab, bb, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
definite banded eigenproblem, of the form A*x = lambda*B*x. Here A and B are assumed to be symmetric and
banded, and B is also positive definite.
Input Parameters


If uplo = 'U', arrays ab and bb store the upper triangles of A and B;
If uplo = 'L', arrays ab and bb store the lower triangles of A and B.

(ka ≥ 0).
kb INTEGER. The number of super- or sub-diagonals in B (kb ≥ 0).
ab, bb, work REAL for ssbgv

DOUBLE PRECISION for dsbgv
Arrays:
symmetric matrix B (as specified by uplo) in band storage format.
work(*) is a workspace array, dimension at least max(1, 3n)
ldab INTEGER. The leading dimension of the array ab; must be at least ka+1 .
'V', ldz ≥ max(1, n).
Output Parameters
ab On exit, the contents of ab are overwritten.
bb On exit, contains the factor S from the split Cholesky factorization B =
S^T*S, as returned by spbstf/dpbstf.
w, z REAL for ssbgv

DOUBLE PRECISION for dsbgv
Arrays:
z(ldz,*) .
If jobz = 'V', then if info = 0, z contains the matrix Z of eigenvectors,
with the i-th column of z holding the eigenvector associated with w(i). The
eigenvectors are normalized so that Z^T*B*Z = I.
info INTEGER.
If info > 0, and
if info = i ≤ n, the algorithm failed to converge, and i off-diagonal elements of an

intermediate tridiagonal did not converge to zero;
if info = n + i, for 1 ≤ i ≤ n, then spbstf/dpbstf returned info = i and B is
not positive-definite. The factorization of B could not be completed and no

Specific details for the routine sbgv interface are the following:

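
The arrays ab and bb hold the matrices in band storage format. The sketch below assumes the standard LAPACK layout for uplo = 'U', in which element A(i,j) of the upper triangle is stored in ab(ka+1+i-j, j) for max(1, j-ka) ≤ i ≤ j. The module and function names are hypothetical helpers, not library routines.

```fortran
module band_storage_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Packs the upper triangle of a symmetric band matrix a (n-by-n, with ka
  ! superdiagonals) into an array of shape (ka+1, n):
  !   ab(ka+1+i-j, j) = a(i,j)  for max(1, j-ka) <= i <= j.
  ! Entries of ab that correspond to no matrix element are set to zero.
  function pack_upper_band(a, ka) result(ab)
    real(dp), intent(in) :: a(:,:)
    integer,  intent(in) :: ka
    real(dp) :: ab(ka+1, size(a,2))
    integer :: i, j
    ab = 0.0_dp
    do j = 1, size(a,2)
      do i = max(1, j-ka), j
        ab(ka+1+i-j, j) = a(i,j)
      end do
    end do
  end function pack_upper_band
end module band_storage_sketch
```

With this layout the main diagonal of A occupies row ka+1 of ab, and the result can be passed with ldab = ka+1, which is the minimum leading dimension.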
?hbgv
positive-definite eigenproblem with banded matrices.
Syntax
call chbgv(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, rwork, info)
call zhbgv(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, rwork, info)
call hbgv(ab, bb, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
Hermitian positive-definite banded eigenproblem, of the form A*x = lambda*B*x. Here A and B are Hermitian and
banded matrices, and matrix B is also positive definite.
Input Parameters



(ka ≥ 0).
ab, bb, work COMPLEX for chbgv

DOUBLE COMPLEX for zhbgv
Arrays:
Hermitian matrix B (as specified by uplo) in band storage format.
work(*) is a workspace array, dimension at least max(1, n).
'V', ldz ≥ max(1, n).
rwork REAL for chbgv

DOUBLE PRECISION for zhbgv.
Output Parameters

S^H*S, as returned by cpbstf/zpbstf.
w REAL for chbgv

DOUBLE PRECISION for zhbgv.
z COMPLEX for chbgv

DOUBLE COMPLEX for zhbgv
Array z(ldz,*).

eigenvectors are normalized so that Z^H*B*Z = I.
info INTEGER.
If info > 0, and


Specific details for the routine hbgv interface are the following:

?sbgvd
eigenproblem with banded matrices. If eigenvectors
are desired, it uses a divide and conquer method.
Syntax
call ssbgvd(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, lwork, iwork,
liwork, info)
call dsbgvd(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, lwork, iwork,
liwork, info)
call sbgvd(ab, bb, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
banded, and B is also positive definite.
Input Parameters


(ka ≥ 0).
ab, bb, work REAL for ssbgvd

DOUBLE PRECISION for dsbgvd
Arrays:
'V', ldz ≥ max(1, n).
lwork INTEGER.
Constraints:
If n ≤ 1, lwork ≥ 1;
If jobz = 'N' and n>1, lwork ≥ 3n;
If jobz = 'V' and n>1, lwork ≥ 2n^2+5n+1.

Notes for details.
iwork INTEGER.
liwork INTEGER.
Constraints:
If n ≤ 1, liwork ≥ 1;

Notes for details.
Output Parameters

w, z REAL for ssbgvd

DOUBLE PRECISION for dsbgvd
Arrays:
z(ldz,*).
lwork.
liwork.
info INTEGER.
If info > 0, and


Specific details for the routine sbgvd interface are the following:

Application Notes
?hbgvd
If eigenvectors are desired, it uses a divide and
conquer method.
Syntax
call chbgvd(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, lwork, rwork,
lrwork, iwork, liwork, info)
call zhbgvd(jobz, uplo, n, ka, kb, ab, ldab, bb, ldbb, w, z, ldz, work, lwork, rwork,
lrwork, iwork, liwork, info)
call hbgvd(ab, bb, w [,uplo] [,z] [,info])
Include Files
mkl.fi, lapack.f90
Description
Hermitian positive-definite banded eigenproblem, of the form A*x = lambda*B*x. Here A and B are assumed to be
Hermitian and banded, and B is also positive definite.
Input Parameters



(ka ≥ 0).
ab, bb, work COMPLEX for chbgvd

DOUBLE COMPLEX for zhbgvd
Arrays:
'V', ldz ≥ max(1, n).
lwork INTEGER.
Constraints:
If n ≤ 1, lwork ≥ 1;
If jobz = 'N' and n>1, lwork ≥ n;
If jobz = 'V' and n>1, lwork ≥ 2n^2.

rwork REAL for chbgvd
DOUBLE PRECISION for zhbgvd.
Workspace array, size max(1, lrwork).
lrwork INTEGER.
Constraints:
If n ≤ 1, lrwork ≥ 1;
If jobz = 'V' and n>1, lrwork ≥ 2n^2+5n+1.

iwork INTEGER.
liwork INTEGER.
Constraints:
If n ≤ 1, liwork ≥ 1;

Output Parameters

w REAL for chbgvd

DOUBLE PRECISION for zhbgvd.
Array, size at least max(1, n) .
z COMPLEX for chbgvd

DOUBLE COMPLEX for zhbgvd
Array z(ldz,*).

lwork.
lrwork.
liwork.
info INTEGER.
If info > 0, and


Specific details for the routine hbgvd interface are the following:

Application Notes
subsequent runs.
workspace.
?sbgvx
eigenproblem with banded matrices.
Syntax
call ssbgvx(jobz, range, uplo, n, ka, kb, ab, ldab, bb, ldbb, q, ldq, vl, vu, il, iu,
abstol, m, w, z, ldz, work, iwork, ifail, info)
call dsbgvx(jobz, range, uplo, n, ka, kb, ab, ldab, bb, ldbb, q, ldq, vl, vu, il, iu,
abstol, m, w, z, ldz, work, iwork, ifail, info)
call sbgvx(ab, bb, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,q] [,abstol]
[,info])
Include Files
mkl.fi, lapack.f90
Description
banded, and B is also positive definite. Eigenvalues and eigenvectors can be selected by specifying either all
eigenvalues, a range of values or a range of indices for the desired eigenvalues.
Input Parameters



open interval:
vl < lambda(i) ≤ vu.
If range = 'I', the routine computes eigenvalues in range il to iu.

(ka ≥ 0).
ab, bb, work REAL for ssbgvx

DOUBLE PRECISION for dsbgvx
Arrays:
work(*) is a workspace array, size (7*n).
vl, vu REAL for ssbgvx

DOUBLE PRECISION for dsbgvx.
for eigenvalues.
Constraint: vl< vu.
il, iu INTEGER.
if n = 0.
abstol REAL for ssbgvx

DOUBLE PRECISION for dsbgvx.
more information.
'V', ldz ≥ max(1, n).
ldq INTEGER. The leading dimension of the output array q; ldq ≥ 1.
If jobz = 'V', ldq ≥ max(1, n).
iwork INTEGER.
Workspace array, size (5*n).
Output Parameters


m = iu-il+1.
w, z, q REAL for ssbgvx

DOUBLE PRECISION for dsbgvx
Arrays:
w(*), size at least max(1, n) .
z(ldz,*) .
q(ldq,*) .
If jobz = 'V', then q contains the n-by-n matrix used in the reduction of
A*x = lambda*B*x to standard form, that is, C*x= lambda*x and
consequently C to tridiagonal form.
If jobz = 'N', then q is not referenced.
ifail INTEGER.
Array, size (m).
converge.
info INTEGER.
If info > 0, and


Specific details for the routine sbgvx interface are the following:

Note that there will be an error condition if ifail or q is present and z is omitted.
Application Notes
to 2*?lamch('S').
?hbgvx
Syntax
call chbgvx(jobz, range, uplo, n, ka, kb, ab, ldab, bb, ldbb, q, ldq, vl, vu, il, iu,
abstol, m, w, z, ldz, work, rwork, iwork, ifail, info)
call zhbgvx(jobz, range, uplo, n, ka, kb, ab, ldab, bb, ldbb, q, ldq, vl, vu, il, iu,
abstol, m, w, z, ldz, work, rwork, iwork, ifail, info)
call hbgvx(ab, bb, w [,uplo] [,z] [,vl] [,vu] [,il] [,iu] [,m] [,ifail] [,q] [,abstol]
[,info])
Include Files
mkl.fi, lapack.f90
Description
Hermitian positive-definite banded eigenproblem, of the form A*x = lambda*B*x. Here A and B are assumed to be
Hermitian and banded, and B is also positive definite. Eigenvalues and eigenvectors can be selected by
specifying either all eigenvalues, a range of values or a range of indices for the desired eigenvalues.
Input Parameters



half-open interval:
vl < lambda(i) <= vu.

(ka >= 0).
ab, bb, work COMPLEX for chbgvx

DOUBLE COMPLEX for zhbgvx
Arrays:
work(*) is a workspace array, size at least max(1, n).
vl, vu REAL for chbgvx

DOUBLE PRECISION for zhbgvx.
for eigenvalues.
Constraint: vl< vu.
il, iu INTEGER.
if n = 0.
abstol REAL for chbgvx

more information.
'V', ldz >= max(1, n).
ldq INTEGER. The leading dimension of the output array q; ldq >= 1. If jobz =
'V', ldq >= max(1, n).
rwork REAL for chbgvx

iwork INTEGER.
Output Parameters


m = iu-il+1.
w REAL for chbgvx

Array w(*), size at least max(1, n).
z, q COMPLEX for chbgvx

DOUBLE COMPLEX for zhbgvx
Arrays:
z(ldz,*).
q(ldq,*).
If jobz = 'V', then q contains the n-by-n matrix used in the reduction of
A*x = lambda*B*x to standard form, that is, C*x = lambda*x, and
consequently C to tridiagonal form.
If jobz = 'N', then q is not referenced.
ifail INTEGER.
converge.
info INTEGER.
If info > 0, and


Specific details for the routine hbgvx interface are the following:

Note that there will be an error condition if ifail or q is present and z is omitted.
Application Notes
to 2*?lamch('S').
Generalized Nonsymmetric Eigenvalue Problems: LAPACK Driver Routines

This section describes LAPACK driver routines used for solving generalized nonsymmetric eigenproblems. See
also computational routines that can be called to solve these problems. Table "Driver Routines for Solving
Generalized Nonsymmetric Eigenproblems" lists all such driver routines for the FORTRAN 77 interface.

Driver Routines for Solving Generalized Nonsymmetric Eigenproblems
gges    Computes the generalized eigenvalues, Schur form, and the left and/or right Schur
        vectors for a pair of nonsymmetric matrices.
ggesx   Computes the generalized eigenvalues, Schur form, and, optionally, the left and/or
        right matrices of Schur vectors.
gges3   Computes generalized Schur factorization for a pair of matrices.
ggev    Computes the generalized eigenvalues, and the left and/or right generalized
        eigenvectors for a pair of nonsymmetric matrices.
ggevx   Computes the generalized eigenvalues, and, optionally, the left and/or right
        generalized eigenvectors.
ggev3   Computes the generalized eigenvalues, and the left and/or right generalized
        eigenvectors for a pair of matrices.
?gges
Computes the generalized eigenvalues, Schur form,
and the left and/or right Schur vectors for a pair of
nonsymmetric matrices.
Syntax
call sgges(jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alphar, alphai,
beta, vsl, ldvsl, vsr, ldvsr, work, lwork, bwork, info)
call dgges(jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alphar, alphai,
beta, vsl, ldvsl, vsr, ldvsr, work, lwork, bwork, info)
call cgges(jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alpha, beta, vsl,
ldvsl, vsr, ldvsr, work, lwork, rwork, bwork, info)
call zgges(jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alpha, beta, vsl,
ldvsl, vsr, ldvsr, work, lwork, rwork, bwork, info)
call gges(a, b, alphar, alphai, beta [,vsl] [,vsr] [,select] [,sdim] [,info])
call gges(a, b, alpha, beta [, vsl] [,vsr] [,select] [,sdim] [,info])
Include Files
mkl.fi, lapack.f90
Description
The ?gges routine computes the generalized eigenvalues, the generalized real/complex Schur form (S,T), and,
optionally, the left and/or right matrices of Schur vectors (vsl and vsr) for a pair of n-by-n real/complex
nonsymmetric matrices (A,B). This gives the generalized Schur factorization
(A,B) = (vsl*S*vsr^H, vsl*T*vsr^H)
Optionally, it also orders the eigenvalues so that a selected cluster of eigenvalues appears in the leading
diagonal blocks of the upper quasi-triangular matrix S and the upper triangular matrix T. The leading
columns of vsl and vsr then form an orthonormal/unitary basis for the corresponding left and right
eigenspaces (deflating subspaces).
If only the generalized eigenvalues are needed, use the driver ?ggev instead, which is faster.
A generalized eigenvalue for a pair of matrices (A,B) is a scalar w or a ratio alpha / beta = w, such that A -
w*B is singular. It is usually represented as the pair (alpha, beta), as there is a reasonable interpretation
for beta=0 or for both being zero. A pair of matrices (S,T) is in the generalized real Schur form if T is upper
triangular with non-negative diagonal and S is block upper triangular with 1-by-1 and 2-by-2 blocks. 1-by-1
blocks correspond to real generalized eigenvalues, while 2-by-2 blocks of S are "standardized" by making the
corresponding elements of T have the form:
a 0
0 b
and the pair of corresponding 2-by-2 blocks in S and T will have a complex conjugate pair of generalized
eigenvalues. A pair of matrices (S,T) is in generalized complex Schur form if S and T are upper triangular
and, in addition, the diagonal elements of T are non-negative real numbers.
The ?gges routine replaces the deprecated ?gegs routine.
Input Parameters
jobvsl CHARACTER*1. Must be 'N' or 'V'.

If jobvsl = 'N', then the left Schur vectors are not computed.
If jobvsl = 'V', then the left Schur vectors are computed.
jobvsr CHARACTER*1. Must be 'N' or 'V'.

If jobvsr = 'N', then the right Schur vectors are not computed.
If jobvsr = 'V', then the right Schur vectors are computed.
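sort CHARACTER*1. Must be 'N' or 'S'. Specifies whether or not to order the
eigenvalues on the diagonal of the generalized Schur form.
If sort = 'N', then eigenvalues are not ordered.
If sort = 'S', eigenvalues are ordered (see selctg).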
selctg LOGICAL FUNCTION of three REAL arguments for real flavors.

LOGICAL FUNCTION of two COMPLEX arguments for complex flavors.
selctg must be declared EXTERNAL in the calling subroutine.
If sort = 'S', selctg is used to select eigenvalues to sort to the top left
of the Schur form.
If sort = 'N', selctg is not referenced.
For real flavors:

An eigenvalue (alphar(j) + alphai(j)*i)/beta(j) is selected if selctg(alphar(j),
alphai(j), beta(j)) is true; that is, if either one of a complex conjugate pair
of eigenvalues is selected, then both complex eigenvalues are selected.
Note that in the ill-conditioned case, a selected complex eigenvalue may no
longer satisfy selctg(alphar(j), alphai(j), beta(j)) = .TRUE.
after ordering. In this case info is set to n+2.
For complex flavors:
An eigenvalue alpha(j) / beta(j) is selected if selctg(alpha(j), beta(j))
is true.
Note that a selected complex eigenvalue may no longer satisfy
selctg(alpha(j), beta(j)) = .TRUE. after ordering, since ordering
may change the value of complex eigenvalues (especially if the eigenvalue
is ill-conditioned); in this case info is set to n+2 (see info below).
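As an illustration of the real-flavor selctg interface described above, the following sketch defines a selection function that picks eigenvalues with negative real part. The name select_stable and the selection criterion are assumptions made for this example; they are not part of the library. The function compares signs instead of forming alphar/beta, so it never divides by a zero beta.

```fortran
! Hypothetical selection function for the real flavors (for example, dgges).
! Returns .TRUE. for eigenvalues (alphar + alphai*i)/beta with negative real part.
logical function select_stable(alphar, alphai, beta)
  implicit none
  double precision, intent(in) :: alphar, alphai, beta
  ! alphai does not affect the real part; it is present only because the
  ! real-flavor interface passes three arguments.
  ! beta = 0 denotes an infinite eigenvalue, which is never selected here.
  select_stable = (alphar < 0.0d0 .and. beta > 0.0d0) .or. &
                  (alphar > 0.0d0 .and. beta < 0.0d0)
end function select_stable

program demo_selctg
  implicit none
  logical, external :: select_stable
  ! Stand-alone check of the selection logic; no LAPACK call is made here.
  print *, select_stable(-1.0d0, 2.0d0, 0.5d0)   ! negative real part: T
  print *, select_stable( 3.0d0, 0.0d0, 1.0d0)   ! positive real part: F
  print *, select_stable(-1.0d0, 0.0d0, 0.0d0)   ! infinite eigenvalue: F
end program demo_selctg
```

With the FORTRAN 77 interface, such a function would be declared EXTERNAL in the caller and passed as selctg together with sort = 'S'.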
n INTEGER. The order of the matrices A, B, vsl, and vsr (n >= 0).
a, b, work REAL for sgges

DOUBLE PRECISION for dgges
COMPLEX for cgges
DOUBLE COMPLEX for zgges.
Arrays:
a(lda,*) is an array containing the n-by-n matrix A (first of the pair of
matrices).
b(ldb,*) is an array containing the n-by-n matrix B (second of the pair of
matrices).
ldb INTEGER. The leading dimension of the array b. Must be at least max(1, n).
ldvsl, ldvsr INTEGER. The leading dimensions of the output matrices vsl and vsr,
respectively. Constraints:
ldvsl >= 1. If jobvsl = 'V', ldvsl >= max(1, n).

ldvsr >= 1. If jobvsr = 'V', ldvsr >= max(1, n).
lwork INTEGER.
lwork >= max(1, 8n+16) for real flavors;
xerbla.
rwork REAL for cgges

DOUBLE PRECISION for zgges
bwork LOGICAL.
Not referenced if sort = 'N'.
Output Parameters
a On exit, this array has been overwritten by its generalized Schur form S.
b On exit, this array has been overwritten by its generalized Schur form T.
sdim INTEGER.

for which selctg is true.
Note that for real flavors complex conjugate pairs for which selctg is true
for either eigenvalue count as 2.
alphar, alphai REAL for sgges;

DOUBLE PRECISION for dgges.
Arrays, size at least max(1, n) each. Contain values that form generalized
See beta.
alpha COMPLEX for cgges;

eigenvalues in complex flavors. See beta.
beta REAL for sgges
COMPLEX for cgges
For real flavors:
On exit, (alphar(j) + alphai(j)*i)/beta(j), j=1,..., n, will be the generalized
eigenvalues.
alphar(j) + alphai(j)*i and beta(j), j=1,..., n are the diagonals of the
complex Schur form (S,T) that would result if the 2-by-2 diagonal blocks of
the real generalized Schur form of (A,B) were further reduced to triangular
form using complex unitary transformations. If alphai(j) is zero, then the
j-th eigenvalue is real; if positive, then the j-th and (j+1)-st eigenvalues are
a complex conjugate pair, with alphai(j+1) negative.
For complex flavors:
On exit, alpha(j)/beta(j), j=1,..., n, will be the generalized eigenvalues.
alpha(j) and beta(j), j=1,..., n are the diagonals of the complex Schur
form (S,T) output by cgges/zgges. The beta(j) will be non-negative real.
See also Application Notes below.

vsl, vsr REAL for sgges
COMPLEX for cgges
Arrays:
vsl(ldvsl,*), the second dimension of vsl must be at least max(1, n).
If jobvsl = 'V', this array will contain the left Schur vectors.
If jobvsl = 'N', vsl is not referenced.
vsr(ldvsr,*), the second dimension of vsr must be at least max(1, n).

If jobvsr = 'V', this array will contain the right Schur vectors.
If jobvsr = 'N', vsr is not referenced.
lwork.
info INTEGER.
If info = i, and
i <= n:
the QZ iteration failed. (A, B) is not in Schur form, but alphar(j), alphai(j)
(for real flavors), or alpha(j) (for complex flavors), and beta(j),
j=info+1,..., n should be correct.
i > n: errors that usually indicate LAPACK problems:

i = n+1: other than QZ iteration failed in ?hgeqz;
i = n+2: after reordering, roundoff changed values of some complex
eigenvalues so that leading eigenvalues in the generalized Schur form no
longer satisfy selctg = .TRUE. (this could also be caused by scaling);
i = n+3: reordering failed in ?tgsen.

Specific details for the routine gges interface are the following:
vsl Holds the matrix VSL of size (n, n).
vsr Holds the matrix VSR of size (n, n).
jobvsl Restored based on the presence of the argument vsl as follows:

jobvsl = 'V', if vsl is present,
jobvsl = 'N', if vsl is omitted.
jobvsr Restored based on the presence of the argument vsr as follows:

jobvsr = 'V', if vsr is present,
jobvsr = 'N', if vsr is omitted.

Application Notes
lwork = -1.
The quotients alphar(j)/beta(j) and alphai(j)/beta(j) may easily over- or underflow, and beta(j) may even be
zero. Thus, you should avoid simply computing the ratio. However, alphar and alphai will always be less than
and usually comparable with norm(A) in magnitude, and beta will always be less than and usually comparable with
norm(B).
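One way to follow this advice is to inspect beta(j) before dividing. The sketch below is only an illustration, not a library facility: the arrays hold made-up sample values standing in for the output of a real-flavor ?gges call, and the overflow guard is one possible choice that does not address underflow.

```fortran
program classify_eigenvalues
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: alphar(n), alphai(n), beta(n), amax
  integer :: j

  ! Made-up sample values standing in for the output of dgges.
  alphar = [1.0_dp, 2.0_dp,  2.0_dp, 0.0_dp]
  alphai = [0.0_dp, 3.0_dp, -3.0_dp, 0.0_dp]
  beta   = [0.0_dp, 4.0_dp,  4.0_dp, 0.0_dp]

  do j = 1, n
    amax = max(abs(alphar(j)), abs(alphai(j)))
    if (beta(j) == 0.0_dp .and. amax == 0.0_dp) then
      ! Both members of the pair are zero: the eigenvalue is indeterminate.
      print *, j, ': alpha and beta are both zero'
    else if (beta(j) == 0.0_dp) then
      print *, j, ': infinite eigenvalue'
    else if (abs(beta(j)) < 1.0_dp .and. &
             amax > abs(beta(j))*huge(1.0_dp)) then
      ! Dividing would exceed the largest representable number.
      print *, j, ': quotient would overflow'
    else
      print *, j, ': lambda =', cmplx(alphar(j), alphai(j), kind=dp)/beta(j)
    end if
  end do
end program classify_eigenvalues
```

Keeping the eigenvalue as the pair (alpha, beta) and dividing only at the point of use avoids losing the infinite and indeterminate cases.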
?ggesx
Computes the generalized eigenvalues, Schur form,
and, optionally, the left and/or right matrices of Schur
vectors.
Syntax
call sggesx (jobvsl, jobvsr, sort, selctg, sense, n, a, lda, b, ldb, sdim, alphar,
alphai, beta, vsl, ldvsl, vsr, ldvsr, rconde, rcondv, work, lwork, iwork, liwork,
bwork, info)
call dggesx (jobvsl, jobvsr, sort, selctg, sense, n, a, lda, b, ldb, sdim, alphar,
alphai, beta, vsl, ldvsl, vsr, ldvsr, rconde, rcondv, work, lwork, iwork, liwork,
bwork, info)
call cggesx (jobvsl, jobvsr, sort, selctg, sense, n, a, lda, b, ldb, sdim, alpha,
beta, vsl, ldvsl, vsr, ldvsr, rconde, rcondv, work, lwork, rwork, iwork, liwork,
bwork, info)
call zggesx (jobvsl, jobvsr, sort, selctg, sense, n, a, lda, b, ldb, sdim, alpha,
beta, vsl, ldvsl, vsr, ldvsr, rconde, rcondv, work, lwork, rwork, iwork, liwork,
bwork, info)
call ggesx(a, b, alphar, alphai, beta [,vsl] [,vsr] [,select] [,sdim] [,rconde] [,
rcondv] [,info])
call ggesx(a, b, alpha, beta [, vsl] [,vsr] [,select] [,sdim] [,rconde] [,rcondv] [,
info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes, for a pair of n-by-n real/complex nonsymmetric matrices (A,B), the generalized
eigenvalues, the generalized real/complex Schur form (S,T), and, optionally, the left and/or right matrices of
Schur vectors (vsl and vsr). This gives the generalized Schur factorization
(A,B) = (vsl*S*vsr^H, vsl*T*vsr^H)
Optionally, it also orders the eigenvalues so that a selected cluster of eigenvalues appears in the leading
diagonal blocks of the upper quasi-triangular matrix S and the upper triangular matrix T; computes a
reciprocal condition number for the average of the selected eigenvalues (rconde); and computes a reciprocal
condition number for the right and left deflating subspaces corresponding to the selected eigenvalues
(rcondv). The leading columns of vsl and vsr then form an orthonormal/unitary basis for the corresponding
left and right eigenspaces (deflating subspaces).
A generalized eigenvalue for a pair of matrices (A,B) is a scalar w or a ratio alpha / beta = w, such that A
- w*B is singular. It is usually represented as the pair (alpha, beta), as there is a reasonable interpretation
for beta=0 or for both being zero. A pair of matrices (S,T) is in generalized real Schur form if T is upper
triangular with non-negative diagonal and S is block upper triangular with 1-by-1 and 2-by-2 blocks. 1-by-1
blocks correspond to real generalized eigenvalues, while 2-by-2 blocks of S will be "standardized" by making
the corresponding elements of T have the form:
a 0
0 b
and the pair of corresponding 2-by-2 blocks in S and T will have a complex conjugate pair of generalized
eigenvalues. A pair of matrices (S,T) is in generalized complex Schur form if S and T are upper triangular
and, in addition, the diagonal elements of T are non-negative real numbers.
Input Parameters
jobvsl CHARACTER*1. Must be 'N' or 'V'.

If jobvsl = 'N', then the left Schur vectors are not computed.
If jobvsl = 'V', then the left Schur vectors are computed.
jobvsr CHARACTER*1. Must be 'N' or 'V'.

If jobvsr = 'N', then the right Schur vectors are not computed.
If jobvsr = 'V', then the right Schur vectors are computed.
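sort CHARACTER*1. Must be 'N' or 'S'. Specifies whether or not to order the
eigenvalues on the diagonal of the generalized Schur form.
If sort = 'N', then eigenvalues are not ordered.
If sort = 'S', eigenvalues are ordered (see selctg).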
selctg LOGICAL FUNCTION of three REAL arguments for real flavors.

LOGICAL FUNCTION of two COMPLEX arguments for complex flavors.
selctg must be declared EXTERNAL in the calling subroutine.
If sort = 'S', selctg is used to select eigenvalues to sort to the top left
of the Schur form.
If sort = 'N', selctg is not referenced.
For real flavors:

An eigenvalue (alphar(j) + alphai(j)*i)/beta(j) is selected if selctg(alphar(j),
alphai(j), beta(j)) is true; that is, if either one of a complex conjugate pair
of eigenvalues is selected, then both complex eigenvalues are selected.
Note that in the ill-conditioned case, a selected complex eigenvalue may no
longer satisfy selctg(alphar(j), alphai(j), beta(j)) = .TRUE.
after ordering. In this case info is set to n+2.
For complex flavors:
An eigenvalue alpha(j) / beta(j) is selected if selctg(alpha(j), beta(j))
is true.
Note that a selected complex eigenvalue may no longer satisfy
selctg(alpha(j), beta(j)) = .TRUE. after ordering, since ordering
may change the value of complex eigenvalues (especially if the eigenvalue
is ill-conditioned); in this case info is set to n+2 (see info below).
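sense CHARACTER*1. Must be 'N', 'E', 'V', or 'B'. Determines which reciprocal
condition numbers are computed.
If sense = 'N', none are computed;
If sense = 'E', computed for average of selected eigenvalues only;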
If sense = 'V', computed for selected deflating subspaces only;
If sense = 'B', computed for both.
If sense is 'E', 'V', or 'B', then sort must equal 'S'.
n INTEGER. The order of the matrices A, B, vsl, and vsr (n >= 0).
a, b, work REAL for sggesx

DOUBLE PRECISION for dggesx
COMPLEX for cggesx
DOUBLE COMPLEX for zggesx.
Arrays:
matrices).
matrices).


ldvsl, ldvsr INTEGER. The leading dimensions of the output matrices vsl and vsr,
respectively. Constraints:
ldvsl >= 1. If jobvsl = 'V', ldvsl >= max(1, n).
ldvsr >= 1. If jobvsr = 'V', ldvsr >= max(1, n).
lwork INTEGER.
For real flavors:
If n=0 then lwork >= 1.
If n>0 and sense = 'N', then lwork >= max(8*n, 6*n+16).
If n>0 and sense = 'E', 'V', or 'B', then lwork >= max(8*n, 6*n+16,
2*sdim*(n-sdim));
For complex flavors:
If n=0 then lwork >= 1.
If n>0 and sense = 'N', then lwork >= max(1, 2*n);
If n>0 and sense = 'E', 'V', or 'B', then lwork >= max(1, 2*n,
2*sdim*(n-sdim)).
Note that 2*sdim*(n-sdim) <= n*n/2.
An error is only returned if lwork < max(8*n, 6*n+16) for real flavors, and

lwork < max(1, 2*n) for complex flavors, but if sense = 'E', 'V', or
'B', this may not be large enough.
If lwork=-1, then a workspace query is assumed; the routine only
calculates the bound on the optimal size of the work array and the
minimum size of the iwork array, returns these values as the first entries of
the work and iwork arrays, and no error message related to lwork or
liwork is issued by xerbla.
rwork REAL for cggesx

DOUBLE PRECISION for zggesx
iwork INTEGER.
liwork INTEGER.
If sense = 'N', or n=0, then liwork >= 1,
otherwise liwork >= (n+6) for real flavors, and liwork >= (n+2) for complex
flavors.
If liwork=-1, then a workspace query is assumed; the routine only
calculates the bound on the optimal size of the work array and the
minimum size of the iwork array, returns these values as the first entries of
the work and iwork arrays, and no error message related to lwork or
liwork is issued by xerbla.
bwork LOGICAL.
Not referenced if sort = 'N'.
Output Parameters
a On exit, this array has been overwritten by its generalized Schur form S.
b On exit, this array has been overwritten by its generalized Schur form T.
sdim INTEGER.

for which selctg is true.
Note that for real flavors complex conjugate pairs for which selctg is true
for either eigenvalue count as 2.
alphar, alphai REAL for sggesx;

DOUBLE PRECISION for dggesx.
See beta.
alpha COMPLEX for cggesx;

beta REAL for sggesx

COMPLEX for cggesx
For real flavors:
On exit, (alphar(j) + alphai(j)*i)/beta(j), j=1,..., n will be the generalized
eigenvalues.
alphar(j) + alphai(j)*i and beta(j), j=1,..., n are the diagonals of the
complex Schur form (S,T) that would result if the 2-by-2 diagonal blocks of
the real generalized Schur form of (A,B) were further reduced to triangular
form using complex unitary transformations. If alphai(j) is zero, then the
j-th eigenvalue is real; if positive, then the j-th and (j+1)-st eigenvalues are
a complex conjugate pair, with alphai(j+1) negative.
For complex flavors:
On exit, alpha(j)/beta(j), j=1,..., n will be the generalized eigenvalues.
alpha(j) and beta(j), j=1,..., n are the diagonals of the complex Schur form
(S,T) output by cggesx/zggesx. The beta(j) will be non-negative real.

vsl, vsr REAL for sggesx
COMPLEX for cggesx
Arrays:
vsl(ldvsl,*), the second dimension of vsl must be at least max(1, n).
If jobvsl = 'V', this array will contain the left Schur vectors.
If jobvsl = 'N', vsl is not referenced.
vsr(ldvsr,*), the second dimension of vsr must be at least max(1, n).

If jobvsr = 'V', this array will contain the right Schur vectors.
If jobvsr = 'N', vsr is not referenced.
rconde, rcondv REAL for single precision flavors

Arrays, size (2) each
If sense = 'E' or 'B', rconde(1) and rconde(2) contain the reciprocal
condition numbers for the average of the selected eigenvalues.
Not referenced if sense = 'N' or 'V'.
If sense = 'V' or 'B', rcondv(1) and rcondv(2) contain the reciprocal

condition numbers for the selected deflating subspaces.
Not referenced if sense = 'N' or 'E'.
lwork.
liwork.
info INTEGER.
If info = i, and
i <= n:
the QZ iteration failed. (A, B) is not in Schur form, but alphar(j), alphai(j)
(for real flavors), or alpha(j) (for complex flavors), and beta(j),
j=info+1,..., n should be correct.
i > n: errors that usually indicate LAPACK problems:
i = n+1: other than QZ iteration failed in ?hgeqz;
i = n+2: after reordering, roundoff changed values of some complex
eigenvalues so that leading eigenvalues in the generalized Schur form no
longer satisfy selctg = .TRUE. (this could also be caused by scaling);
i = n+3: reordering failed in ?tgsen.

Specific details for the routine ggesx interface are the following:
vsl Holds the matrix VSL of size (n, n).
vsr Holds the matrix VSR of size (n, n).
rconde Holds the vector of length (2).
rcondv Holds the vector of length (2).
jobvsl Restored based on the presence of the argument vsl as follows:

jobvsl = 'V', if vsl is present,
jobvsl = 'N', if vsl is omitted.
jobvsr Restored based on the presence of the argument vsr as follows:

jobvsr = 'V', if vsr is present,
jobvsr = 'N', if vsr is omitted.


Note that there will be an error condition if rconde or rcondv are present and select is omitted.
Application Notes
If you are in doubt how much workspace to supply, use a generous value of lwork (or liwork) for the first run
or set lwork = -1 (liwork = -1).
If you choose the first option and set lwork (or liwork) to any admissible size, that is, no less than the
minimal value described, the routine completes the task, though probably not as fast as with a recommended
workspace, and provides the recommended workspace in the first element of the corresponding array (work,
iwork) on exit. Use this value (work(1), iwork(1)) for subsequent runs.
If you set lwork = -1 (liwork = -1), the routine returns immediately and provides the recommended workspace in the
first element of the corresponding array (work, iwork). This operation is called a workspace query.
Note that if you set lwork (liwork) to less than the minimal required value and not -1, the routine returns
immediately with an error exit and does not provide any information on the recommended workspace.
The quotients alphar(j)/beta(j) and alphai(j)/beta(j) may easily over- or underflow, and beta(j) may even be
zero. Thus, you should avoid simply computing the ratio. However, alphar and alphai will always be less than
and usually comparable with norm(A) in magnitude, and beta will always be less than and usually comparable with
norm(B).
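The workspace query described in these notes might be used as in the following sketch. It assumes the FORTRAN 77 calling sequence for dggesx given in the Syntax section, default (LP64) integers, and a program linked against the library. The matrices are arbitrary random test data, and no_select is a hypothetical placeholder function that is needed only to fill the selctg argument slot.

```fortran
program ggesx_workspace_query
  implicit none
  integer, parameter :: n = 4
  double precision :: a(n,n), b(n,n), vsl(n,n), vsr(n,n)
  double precision :: alphar(n), alphai(n), beta(n)
  double precision :: rconde(2), rcondv(2), wquery(1)
  double precision, allocatable :: work(:)
  integer, allocatable :: iwork(:)
  integer :: iquery(1), sdim, info, lwork, liwork
  logical :: bwork(n)
  logical, external :: no_select
  external :: dggesx

  call random_number(a)   ! arbitrary test data
  call random_number(b)

  ! Workspace query: lwork = liwork = -1, so no factorization is computed.
  call dggesx('V', 'V', 'N', no_select, 'N', n, a, n, b, n, sdim, &
              alphar, alphai, beta, vsl, n, vsr, n, rconde, rcondv, &
              wquery, -1, iquery, -1, bwork, info)
  if (info /= 0) stop 'workspace query failed'

  ! The recommended sizes are returned in the first array elements.
  lwork  = max(1, int(wquery(1)))
  liwork = max(1, iquery(1))
  allocate(work(lwork), iwork(liwork))

  ! Actual computation with the recommended workspace.
  call dggesx('V', 'V', 'N', no_select, 'N', n, a, n, b, n, sdim, &
              alphar, alphai, beta, vsl, n, vsr, n, rconde, rcondv, &
              work, lwork, iwork, liwork, bwork, info)
  print *, 'info =', info, ' lwork =', lwork, ' liwork =', liwork
end program ggesx_workspace_query

! Hypothetical placeholder; not referenced by the routine when sort = 'N'.
logical function no_select(ar, ai, be)
  implicit none
  double precision, intent(in) :: ar, ai, be
  no_select = .false.
end function no_select
```

Because sense = 'N' and sort = 'N' here, rconde, rcondv, and bwork are not referenced, but the arguments must still be supplied.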
?gges3
Computes generalized Schur factorization for a pair of
matrices.
Syntax
call sgges3 (jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alphar, alphai,
beta, vsl, ldvsl, vsr, ldvsr, work, lwork, bwork, info )
call dgges3 (jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alphar, alphai,
beta, vsl, ldvsl, vsr, ldvsr, work, lwork, bwork, info )
call cgges3 (jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alpha, beta, vsl,
ldvsl, vsr, ldvsr, work, lwork, rwork, bwork, info )
call zgges3 (jobvsl, jobvsr, sort, selctg, n, a, lda, b, ldb, sdim, alpha, beta, vsl,
ldvsl, vsr, ldvsr, work, lwork, rwork, bwork, info )
Include Files
mkl.fi
Description
For a pair of n-by-n real or complex nonsymmetric matrices (A,B), ?gges3 computes the generalized
eigenvalues, the generalized real or complex Schur form (S,T), and optionally the left and/or right matrices of
Schur vectors (VSL and VSR). This gives the generalized Schur factorization
(A,B) = ( (VSL)*S*(VSR)^T, (VSL)*T*(VSR)^T ) for real (A,B)
or
(A,B) = ( (VSL)*S*(VSR)^H, (VSL)*T*(VSR)^H ) for complex (A,B)
where (VSR)^H is the conjugate-transpose of VSR.
Optionally, it also orders the eigenvalues so that a selected cluster of eigenvalues appears in the leading
diagonal blocks of the upper quasi-triangular matrix S and the upper triangular matrix T. The leading
columns of VSL and VSR then form an orthonormal basis for the corresponding left and right eigenspaces
(deflating subspaces).
NOTE
If only the generalized eigenvalues are needed, use the driver ?ggev instead, which is faster.
A generalized eigenvalue for a pair of matrices (A,B) is a scalar w or a ratio alpha/beta = w, such that A -
w*B is singular. It is usually represented as the pair (alpha,beta), as there is a reasonable interpretation for
beta=0 or both being zero.
For real flavors:
A pair of matrices (S,T) is in generalized real Schur form if T is upper triangular with non-negative diagonal
and S is block upper triangular with 1-by-1 and 2-by-2 blocks. 1-by-1 blocks correspond to real generalized
eigenvalues, while 2-by-2 blocks of S will be "standardized" by making the corresponding elements of T have
the form:
a 0
0 b
and the pair of corresponding 2-by-2 blocks in S and T have a complex conjugate pair of generalized
eigenvalues.
A pair of matrices (S,T) is in generalized complex Schur form if S and T are upper triangular and, in addition,
the diagonal elements of T are non-negative real numbers.
Input Parameters
jobvsl CHARACTER*1. = 'N': do not compute the left Schur vectors;

= 'V': compute the left Schur vectors.
jobvsr CHARACTER*1. = 'N': do not compute the right Schur vectors;

= 'V': compute the right Schur vectors.
sort CHARACTER*1. Specifies whether or not to order the eigenvalues on the

diagonal of the generalized Schur form.
= 'N': Eigenvalues are not ordered;
= 'S': Eigenvalues are ordered (see selctg).
selctg LOGICAL. selctg is a function of three arguments for real flavors or two
arguments for complex flavors. selctg must be declared EXTERNAL in the
calling subroutine. If sort = 'N', selctg is not referenced. If sort = 'S',
selctg is used to select eigenvalues to sort to the top left of the Schur
form.
For real flavors:
An eigenvalue (alphar(j) + alphai(j)*i)/beta(j) is selected if
selctg(alphar(j),alphai(j),beta(j)) is true. In other words, if either
one of a complex conjugate pair of eigenvalues is selected, then both
complex eigenvalues are selected.
Note that in the ill-conditioned case, a selected complex eigenvalue may no
longer satisfy selctg(alphar(j),alphai(j), beta(j)) = .TRUE. after
ordering. info is set to n+2 in this case.
For complex flavors:
An eigenvalue alpha(j)/beta(j) is selected if
selctg(alpha(j),beta(j)) is true.
Note that a selected complex eigenvalue may no longer satisfy
selctg(alpha(j),beta(j)) = .TRUE. after ordering, since ordering may
change the value of complex eigenvalues (especially if the eigenvalue is
ill-conditioned); in this case info is set to n+2 (see info below).
n INTEGER. The order of the matrices A, B, VSL, and VSR. n >= 0.
a REAL for sgges3

DOUBLE PRECISION for dgges3
COMPLEX for cgges3
DOUBLE COMPLEX for zgges3
Array, size (lda, n). On entry, the first of the pair of matrices.
lda INTEGER. The leading dimension of a. lda >= max(1,n).
b REAL for sgges3

COMPLEX for cgges3
Array, size (ldb, n). On entry, the second of the pair of matrices.
ldb INTEGER. The leading dimension of b. ldb >= max(1,n).
ldvsl INTEGER. The leading dimension of the matrix VSL. ldvsl >= 1, and if
jobvsl = 'V', ldvsl >= n.
ldvsr INTEGER. The leading dimension of the matrix VSR. ldvsr >= 1, and if
jobvsr = 'V', ldvsr >= n.
lwork INTEGER. The size of the array work. If lwork = -1, then a workspace
query is assumed; the routine only calculates the optimal size of the work
array, returns this value as the first entry of the work array, and no error
message related to lwork is issued by xerbla.
work REAL for sgges3

COMPLEX for cgges3
Array, size (MAX(1,lwork)).
rwork REAL for cgges3

DOUBLE PRECISION for zgges3
Array, size (8*n).
bwork LOGICAL. Array, size (n). Not referenced if sort = 'N'.
Output Parameters
a On exit, a is overwritten by its generalized Schur form S.
b On exit, b is overwritten by its generalized Schur form T.
sdim INTEGER. If sort = 'N', sdim = 0. If sort = 'S', sdim = number of

eigenvalues (after sorting) for which selctg is true.
alpha COMPLEX for cgges3

Array, size (n).
alphar REAL for sgges3

Array, size (n).
alphai REAL for sgges3

Array, size (n).
beta REAL for sgges3

COMPLEX for cgges3
Array, size (n).
For real flavors:

On exit, (alphar(j) + alphai(j)*i)/beta(j), j=1,...,n, are the
generalized eigenvalues. alphar(j) + alphai(j)*i, and beta(j),j=1,...,n
are the diagonals of the complex Schur form (S,T) that would result if the
2-by-2 diagonal blocks of the real Schur form of (a,b) were further reduced
to triangular form using 2-by-2 complex unitary transformations. If
alphai(j) is zero, then the j-th eigenvalue is real; if positive, then the j-th
and (j+1)-st eigenvalues are a complex conjugate pair, with alphai(j+1)
negative.
Note: the quotients alphar(j)/beta(j) and alphai(j)/beta(j) can
easily over- or underflow, and beta(j) might even be zero. Thus, you
should avoid computing the ratio alpha/beta by simply dividing alpha by
beta. However, alphar and alphai are always less than and usually
comparable with norm(a) in magnitude, and beta is always less than and
usually comparable with norm(b).
For complex flavors:
On exit, alpha(j)/beta(j), j=1,...,n, are the generalized eigenvalues.
alpha(j), j=1,...,n and beta(j), j=1,...,n are the diagonals of the
complex Schur form (a,b) output by ?gges3. The beta(j) are non-negative
real.
Note: the quotient alpha(j)/beta(j) can easily over- or underflow, and
beta(j) might even be zero. Thus, you should avoid computing the ratio
alpha/beta by simply dividing alpha by beta. However, alpha is always
less than and usually comparable with norm(a) in magnitude, and beta is
always less than and usually comparable with norm(b).
vsl REAL for sgges3

COMPLEX for cgges3
Array, size (ldvsl, n).
If jobvsl = 'V', vsl contains the left Schur vectors. Not referenced if
jobvsl = 'N'.
vsr REAL for sgges3

COMPLEX for cgges3
Array, size (ldvsr, n).
If jobvsr = 'V', vsr contains the right Schur vectors. Not referenced if
jobvsr = 'N'.
info INTEGER.
= 0: successful exit.
< 0: if info = -i, the i-th argument had an illegal value.
= 1,...,n:
for real flavors:
The QZ iteration failed. (a,b) are not in Schur form, but alphar(j),
alphai(j) and beta(j) should be correct for j=info+1,...,n.
for complex flavors:
The QZ iteration failed. (a,b) are not in Schur form, but alpha(j) and
beta(j) should be correct for j=info+1,...,n.
> n:
= n+1: other than QZ iteration failed in ?hgeqz.
= n+2: after reordering, roundoff changed values of some complex
eigenvalues so that leading eigenvalues in the generalized Schur form
no longer satisfy selctg = .TRUE. This could also be caused by
scaling.
= n+3: reordering failed in ?tgsen.
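The info values listed above can be turned into diagnostics with a small helper. The sketch below is an assumption-laden illustration: the subroutine report_info and the message wording are invented for this example, and the info values passed to it are samples rather than results of an actual ?gges3 call.

```fortran
program decode_gges3_info
  implicit none
  integer, parameter :: order = 5   ! sample matrix order n

  ! Sample info values; a real program would pass the value returned by ?gges3.
  call report_info(0, order)
  call report_info(-3, order)
  call report_info(2, order)
  call report_info(order + 2, order)

contains

  ! Hypothetical helper that prints the meaning of info for matrix order n.
  subroutine report_info(info, n)
    integer, intent(in) :: info, n
    if (info == 0) then
      print *, 'successful exit'
    else if (info < 0) then
      print *, 'argument', -info, 'had an illegal value'
    else if (info <= n) then
      print *, 'QZ iteration failed; eigenvalues', info + 1, 'to', n, &
               'should be correct'
    else if (info == n + 1) then
      print *, 'failure other than QZ iteration in ?hgeqz'
    else if (info == n + 2) then
      print *, 'reordered eigenvalues no longer satisfy selctg'
    else if (info == n + 3) then
      print *, 'reordering failed in ?tgsen'
    else
      print *, 'unexpected info value', info
    end if
  end subroutine report_info

end program decode_gges3_info
```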
?ggev
Computes the generalized eigenvalues, and the left
and/or right generalized eigenvectors for a pair of
nonsymmetric matrices.
Syntax
call sggev(jobvl, jobvr, n, a, lda, b, ldb, alphar, alphai, beta, vl, ldvl, vr, ldvr,
work, lwork, info)
call dggev(jobvl, jobvr, n, a, lda, b, ldb, alphar, alphai, beta, vl, ldvl, vr, ldvr,
work, lwork, info)
call cggev(jobvl, jobvr, n, a, lda, b, ldb, alpha, beta, vl, ldvl, vr, ldvr, work,
lwork, rwork, info)
call zggev(jobvl, jobvr, n, a, lda, b, ldb, alpha, beta, vl, ldvl, vr, ldvr, work,
lwork, rwork, info)
call ggev(a, b, alphar, alphai, beta [,vl] [,vr] [,info])
call ggev(a, b, alpha, beta [, vl] [,vr] [,info])
Include Files
mkl.fi, lapack.f90
Description
The ?ggev routine computes the generalized eigenvalues, and optionally, the left and/or right generalized
eigenvectors for a pair of n-by-n real/complex nonsymmetric matrices (A,B).
A generalized eigenvalue for a pair of matrices (A,B) is a scalar lambda or a ratio alpha / beta = lambda, such that A -
lambda*B is singular. It is usually represented as the pair (alpha, beta), as there is a reasonable interpretation for
beta = 0 and even for both being zero.
The right generalized eigenvector v(j) corresponding to the generalized eigenvalue lambda(j) of (A,B) satisfies
A*v(j) = lambda(j)*B*v(j).
The left generalized eigenvector u(j) corresponding to the generalized eigenvalue lambda(j) of (A,B) satisfies
u(j)^H*A = lambda(j)*u(j)^H*B
where u(j)^H denotes the conjugate transpose of u(j).
The ?ggev routine replaces the deprecated ?gegv routine.
Input Parameters

jobvl CHARACTER*1. Must be 'N' or 'V'.
If jobvl = 'N', the left generalized eigenvectors are not computed;
If jobvl = 'V', the left generalized eigenvectors are computed.
jobvr CHARACTER*1. Must be 'N' or 'V'.
If jobvr = 'N', the right generalized eigenvectors are not computed;
If jobvr = 'V', the right generalized eigenvectors are computed.
n INTEGER. The order of the matrices A, B, vl, and vr (n >= 0).
a, b, work REAL for sggev

DOUBLE PRECISION for dggev
COMPLEX for cggev
DOUBLE COMPLEX for zggev.
Arrays:
matrices).

matrices).
ldb INTEGER. The leading dimension of the array b. Must be at least max(1, n).
ldvl, ldvr INTEGER. The leading dimensions of the output matrices vl and vr,
respectively.
Constraints:
ldvl >= 1. If jobvl = 'V', ldvl >= max(1, n).
ldvr >= 1. If jobvr = 'V', ldvr >= max(1, n).
lwork INTEGER.
lwork >= max(1, 8n+16) for real flavors;
xerbla.
rwork REAL for cggev

DOUBLE PRECISION for zggev
Output Parameters
a, b On exit, these arrays have been overwritten.
alphar, alphai REAL for sggev;

DOUBLE PRECISION for dggev.
See beta.
alpha COMPLEX for cggev;
DOUBLE COMPLEX for zggev.
beta REAL for sggev
DOUBLE PRECISION for dggev
COMPLEX for cggev
DOUBLE COMPLEX for zggev.
For real flavors:
On exit, (alphar(j) + alphai(j)*i)/beta(j), j=1,..., n, are the generalized
eigenvalues.
If alphai(j) is zero, then the j-th eigenvalue is real; if positive, then the j-th
and (j+1)-st eigenvalues are a complex conjugate pair, with alphai(j+1)
negative.
For complex flavors:
On exit, alpha(j)/beta(j), j=1,..., n, are the generalized eigenvalues.
vl, vr REAL for sggev
DOUBLE PRECISION for dggev
COMPLEX for cggev
DOUBLE COMPLEX for zggev.
Arrays:
vl(ldvl,*); the second dimension of vl must be at least max(1, n). Contains
the matrix of left generalized eigenvectors VL.
If jobvl = 'V', the left generalized eigenvectors u(j) are stored one after
another in the columns of VL, in the same order as their eigenvalues. Each
eigenvector is scaled so the largest component has abs(Re) + abs(Im) = 1.
For real flavors:
If the j-th eigenvalue is real, then u(j) = VL(:,j), the j-th column of VL.
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then for
i = sqrt(-1), u(j) = VL(:,j) + i*VL(:,j+1) and u(j+1) = VL(:,j) - i*VL(:,j+1).
For complex flavors:
u(j) = VL(:,j), the j-th column of VL.
vr(ldvr,*); the second dimension of vr must be at least max(1, n). Contains
the matrix of right generalized eigenvectors VR.
If jobvr = 'V', the right generalized eigenvectors v(j) are stored one after
another in the columns of VR, in the same order as their eigenvalues. Each
eigenvector is scaled so the largest component has abs(Re) + abs(Im) = 1.
For real flavors:
If the j-th eigenvalue is real, then v(j) = VR(:,j), the j-th column of VR.
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then
v(j) = VR(:,j) + i*VR(:,j+1) and v(j+1) = VR(:,j) - i*VR(:,j+1).
For complex flavors:
v(j) = VR(:,j), the j-th column of VR.
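work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, and
i ≤ n: the QZ iteration failed. No eigenvectors have been calculated, but
alphar(j), alphai(j) (for real flavors), or alpha(j) (for complex flavors), and
beta(j), j=info+1,..., n should be correct.
i > n: errors that usually indicate LAPACK problems:
i = n+1: other than QZ iteration failed in hgeqz;
i = n+2: error return from tgevc.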

Specific details for the routine ggev interface are the following:


Application Notes
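If you are in doubt how much workspace to supply, use a generous value of lwork for the first run, or set
lwork = -1 to request a workspace query.
The quotients alphar(j)/beta(j) and alphai(j)/beta(j) may easily over- or underflow, and beta(j) may even be
zero. Thus, you should avoid simply computing the ratio. However, alphar and alphai (for real flavors) or
alpha (for complex flavors) will always be less than and usually comparable with norm(A) in magnitude, and
beta always less than and usually comparable with norm(B).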
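The advice above about not dividing alpha by beta blindly can be sketched as follows for the real flavors. This is a minimal illustration, not part of MKL or LAPACK: the module name ggev_ratio_sketch, the status codes, and the routines ratio_overflows and classify_eig are all hypothetical names chosen for this example. The sketch forms the quotient only after checking that it is representable. Underflow is not checked, because an underflowing quotient simply rounds toward zero.

```fortran
module ggev_ratio_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Hypothetical status codes for this sketch (not defined by MKL or LAPACK).
  integer, parameter :: EIG_FINITE = 0, EIG_INFINITE = 1, EIG_UNDEFINED = 2
contains
  ! True when amag/|beta| would exceed the largest representable number.
  ! A zero beta with nonzero amag also lands here.
  pure logical function ratio_overflows(amag, beta)
    real(dp), intent(in) :: amag, beta
    ratio_overflows = .false.
    if (abs(beta) < 1.0_dp) ratio_overflows = amag > abs(beta) * huge(1.0_dp)
  end function ratio_overflows

  ! Classify (alphar + i*alphai)/beta and form the quotient only when safe.
  subroutine classify_eig(alphar, alphai, beta, status, lambda)
    real(dp), intent(in) :: alphar, alphai, beta
    integer, intent(out) :: status
    complex(dp), intent(out) :: lambda
    real(dp) :: amag

    amag = max(abs(alphar), abs(alphai))
    lambda = cmplx(0.0_dp, 0.0_dp, kind=dp)
    if (amag == 0.0_dp .and. beta == 0.0_dp) then
      status = EIG_UNDEFINED          ! alpha and beta are both zero
    else if (ratio_overflows(amag, beta)) then
      status = EIG_INFINITE           ! beta is zero or far too small
    else
      status = EIG_FINITE
      ! Divide the parts separately so that no intermediate product is formed.
      lambda = cmplx(alphar / beta, alphai / beta, kind=dp)
    end if
  end subroutine classify_eig
end module ggev_ratio_sketch
```

A caller would loop over j = 1,..., n and pass alphar(j), alphai(j), and beta(j) as returned by dggev; the sggev case would need the same logic in single precision.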
?ggevx
Computes the generalized eigenvalues, and,
optionally, the left and/or right generalized
eigenvectors.
Syntax
call sggevx(balanc, jobvl, jobvr, sense, n, a, lda, b, ldb, alphar, alphai, beta, vl,
ldvl, vr, ldvr, ilo, ihi, lscale, rscale, abnrm, bbnrm, rconde, rcondv, work, lwork,
iwork, bwork, info)
call dggevx(balanc, jobvl, jobvr, sense, n, a, lda, b, ldb, alphar, alphai, beta, vl,
ldvl, vr, ldvr, ilo, ihi, lscale, rscale, abnrm, bbnrm, rconde, rcondv, work, lwork,
iwork, bwork, info)
call cggevx(balanc, jobvl, jobvr, sense, n, a, lda, b, ldb, alpha, beta, vl, ldvl, vr,
ldvr, ilo, ihi, lscale, rscale, abnrm, bbnrm, rconde, rcondv, work, lwork, rwork,
iwork, bwork, info)
call zggevx(balanc, jobvl, jobvr, sense, n, a, lda, b, ldb, alpha, beta, vl, ldvl, vr,
ldvr, ilo, ihi, lscale, rscale, abnrm, bbnrm, rconde, rcondv, work, lwork, rwork,
iwork, bwork, info)
call ggevx(a, b, alphar, alphai, beta [,vl] [,vr] [,balanc] [,ilo] [,ihi] [, lscale]
[,rscale] [,abnrm] [,bbnrm] [,rconde] [,rcondv] [,info])
call ggevx(a, b, alpha, beta [, vl] [,vr] [,balanc] [,ilo] [,ihi] [,lscale] [, rscale]
[,abnrm] [,bbnrm] [,rconde] [,rcondv] [,info])
Include Files
mkl.fi, lapack.f90
Description
The routine computes for a pair of n-by-n real/complex nonsymmetric matrices (A,B), the generalized
eigenvalues, and optionally, the left and/or right generalized eigenvectors.
Optionally also, it computes a balancing transformation to improve the conditioning of the eigenvalues and
eigenvectors (ilo, ihi, lscale, rscale, abnrm, and bbnrm), reciprocal condition numbers for the eigenvalues
(rconde), and reciprocal condition numbers for the right eigenvectors (rcondv).
A generalized eigenvalue for a pair of matrices (A,B) is a scalar λ or a ratio alpha / beta = λ, such that
A - λ*B is singular. It is usually represented as the pair (alpha, beta), as there is a reasonable
interpretation for beta = 0 and even for both being zero. The right generalized eigenvector v(j)
corresponding to the generalized eigenvalue λ(j) of (A,B) satisfies
A*v(j) = λ(j)*B*v(j).
The left generalized eigenvector u(j) corresponding to the generalized eigenvalue λ(j) of (A,B) satisfies
u(j)^H*A = λ(j)*u(j)^H*B
where u(j)^H denotes the conjugate transpose of u(j).
Input Parameters
balanc CHARACTER*1. Must be 'N', 'P', 'S', or 'B'. Specifies the balance option
to be performed.
If balanc = 'N', do not diagonally scale or permute;
If balanc = 'P', permute only;
If balanc = 'S', scale only;
If balanc = 'B', both permute and scale.
Computed reciprocal condition numbers will be for the matrices after
balancing and/or permuting. Permuting does not change condition numbers
(in exact arithmetic), but balancing does.
jobvl CHARACTER*1. Must be 'N' or 'V'.
If jobvl = 'N', the left generalized eigenvectors are not computed;
If jobvl = 'V', the left generalized eigenvectors are computed.
jobvr CHARACTER*1. Must be 'N' or 'V'.
If jobvr = 'N', the right generalized eigenvectors are not computed;
If jobvr = 'V', the right generalized eigenvectors are computed.
sense CHARACTER*1. Must be 'N', 'E', 'V', or 'B'. Determines which reciprocal
condition numbers are computed.
If sense = 'N', none are computed;
If sense = 'E', computed for eigenvalues only;
If sense = 'V', computed for eigenvectors only;
If sense = 'B', computed for eigenvalues and eigenvectors.
n INTEGER. The order of the matrices A, B, vl, and vr (n ≥ 0).
a, b, work REAL for sggevx

DOUBLE PRECISION for dggevx
COMPLEX for cggevx
DOUBLE COMPLEX for zggevx.
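Arrays:
a(lda,*) is an array containing the n-by-n matrix A (first of the pair of
matrices).
The second dimension of a must be at least max(1, n).
b(ldb,*) is an array containing the n-by-n matrix B (second of the pair of
matrices).
The second dimension of b must be at least max(1, n).
work is a workspace array, its dimension max(1, lwork).
lda INTEGER. The leading dimension of the array a. Must be at least max(1, n).
ldb INTEGER. The leading dimension of the array b. Must be at least max(1, n).
ldvl, ldvr INTEGER. The leading dimensions of the output matrices vl and vr,
respectively.
Constraints:
ldvl ≥ 1. If jobvl = 'V', ldvl ≥ max(1, n).
ldvr ≥ 1. If jobvr = 'V', ldvr ≥ max(1, n).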
lwork INTEGER.
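The dimension of the array work. lwork ≥ max(1, 2*n);
For real flavors:
If balanc = 'S', or 'B', or jobvl = 'V', or jobvr = 'V', then lwork ≥
max(1, 6*n);
if sense = 'E', or 'B', then lwork ≥ max(1, 10*n);
if sense = 'V', or 'B', lwork ≥ (2*n^2 + 8*n + 16).
For complex flavors:
if sense = 'E', lwork ≥ max(1, 4*n);
if sense = 'V', or 'B', lwork ≥ max(1, 2*n^2 + 2*n).
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the
first entry of the work array, and no error message related to lwork is
issued by xerbla.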
rwork REAL for cggevx

DOUBLE PRECISION for zggevx
Workspace array, size at least max(1, 6*n) if balanc = 'S', or 'B', and
at least max(1, 2*n) otherwise.
iwork INTEGER.
Workspace array, size at least (n+6) for real flavors and at least (n+2) for
complex flavors.
Not referenced if sense = 'E'.
bwork LOGICAL. Workspace array, size at least max(1, n).

Not referenced if sense = 'N'.
Output Parameters
a, b On exit, these arrays have been overwritten.

If jobvl = 'V' or jobvr = 'V' or both, then a contains the first part of
the real Schur form of the "balanced" versions of the input A and B, and b
contains its second part.
alphar, alphai REAL for sggevx;

DOUBLE PRECISION for dggevx.
See beta.
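alpha COMPLEX for cggevx;
DOUBLE COMPLEX for zggevx.
beta REAL for sggevx
DOUBLE PRECISION for dggevx
COMPLEX for cggevx
DOUBLE COMPLEX for zggevx.
For real flavors:
On exit, (alphar(j) + alphai(j)*i)/beta(j), j=1,..., n, will be the generalized
eigenvalues.
If alphai(j) is zero, then the j-th eigenvalue is real; if positive, then the j-th
and (j+1)-st eigenvalues are a complex conjugate pair, with alphai(j+1)
negative.
For complex flavors:
On exit, alpha(j)/beta(j), j=1,..., n, will be the generalized eigenvalues.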
vl, vr REAL for sggevx
DOUBLE PRECISION for dggevx
COMPLEX for cggevx
DOUBLE COMPLEX for zggevx.
Arrays:
vl(ldvl,*); the second dimension of vl must be at least max(1, n).
If jobvl = 'V', the left generalized eigenvectors u(j) are stored one after
another in the columns of vl, in the same order as their eigenvalues. Each
eigenvector will be scaled so the largest component has abs(Re) +
abs(Im) = 1.
For real flavors:
If the j-th eigenvalue is real, then u(j) = vl(:,j), the j-th column of vl.
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then for
i = sqrt(-1), u(j) = vl(:,j) + i*vl(:,j+1) and u(j+1) =
vl(:,j) - i*vl(:,j+1).
For complex flavors:
u(j) = vl(:,j), the j-th column of vl.
vr(ldvr,*); the second dimension of vr must be at least max(1, n).
If jobvr = 'V', the right generalized eigenvectors v(j) are stored one after
another in the columns of vr, in the same order as their eigenvalues. Each
eigenvector will be scaled so the largest component has abs(Re) +
abs(Im) = 1.
For real flavors:
If the j-th eigenvalue is real, then v(j) = vr(:,j), the j-th column of vr.
If the j-th and (j+1)-st eigenvalues form a complex conjugate pair, then
v(j) = vr(:,j) + i*vr(:,j+1) and v(j+1) = vr(:,j) - i*vr(:,j+1).
For complex flavors:
v(j) = vr(:,j), the j-th column of vr.
ilo, ihi INTEGER. ilo and ihi are integer values such that on exit A(i,j) = 0 and
B(i,j) = 0 if i > j and j = 1,..., ilo-1 or i = ihi+1,..., n.
If balanc = 'N' or 'S', ilo = 1 and ihi = n.
lscale, rscale REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Arrays, size at least max(1, n) each.
lscale contains details of the permutations and scaling factors applied to the
left side of A and B.
If PL(j) is the index of the row interchanged with row j, and DL(j) is the
scaling factor applied to row j, then
lscale(j) = PL(j), for j = 1,..., ilo-1
= DL(j), for j = ilo,...,ihi
= PL(j) for j = ihi+1,..., n.
rscale contains details of the permutations and scaling factors applied to the
right side of A and B.
If PR(j) is the index of the column interchanged with column j, and DR(j)
is the scaling factor applied to column j, then
rscale(j) = PR(j), for j = 1,..., ilo-1
= DR(j), for j = ilo,...,ihi
= PR(j) for j = ihi+1,..., n.
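abnrm, bbnrm REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
The one-norms of the balanced matrices A and B, respectively.
rconde, rcondv REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors.
Arrays, size at least max(1, n) each.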
If sense = 'E', or 'B', rconde contains the reciprocal condition numbers
of the eigenvalues, stored in consecutive elements of the array. For a
complex conjugate pair of eigenvalues two consecutive elements of rconde
are set to the same value. Thus rconde(j), rcondv(j), and the j-th columns
of vl and vr all correspond to the same eigenpair (but not in general the j-th
eigenpair, unless all eigenpairs are selected).
If sense = 'N', or 'V', rconde is not referenced.
If sense = 'V', or 'B', rcondv contains the estimated reciprocal condition
numbers of the eigenvectors, stored in consecutive elements of the array.
For a complex eigenvector two consecutive elements of rcondv are set to
the same value.
If the eigenvalues cannot be reordered to compute rcondv(j),
rcondv(j) is set to 0; this can only occur when the true value would be very
small anyway.
If sense = 'N', or 'E', rcondv is not referenced.
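work(1) On exit, if info = 0, then work(1) returns the required minimal size of
lwork.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = i, and
i ≤ n:
the QZ iteration failed. No eigenvectors have been calculated, but alphar(j),
alphai(j) (for real flavors), or alpha(j) (for complex flavors), and beta(j),
j=info+1,..., n should be correct.
i > n: errors that usually indicate LAPACK problems:
i = n+1: other than QZ iteration failed in hgeqz;
i = n+2: error return from tgevc.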

Specific details for the routine ggevx interface are the following:
lscale Holds the vector of length n.
rscale Holds the vector of length n.
rconde Holds the vector of length n.
rcondv Holds the vector of length n.
balanc Must be 'N', 'B', or 'P'. The default value is 'N'.



Application Notes
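If you are in doubt how much workspace to supply, use a generous value of lwork for the first run, or set
lwork = -1 to request a workspace query.
The quotients alphar(j)/beta(j) and alphai(j)/beta(j) may easily over- or underflow, and beta(j) may even be
zero. Thus, you should avoid simply computing the ratio. However, alphar and alphai (for real flavors) or
alpha (for complex flavors) will always be less than and usually comparable with norm(A) in magnitude, and
beta always less than and usually comparable with norm(B).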
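The minimum workspace rules listed under lwork for the real flavors of ?ggevx can be sketched as a small helper. This is an illustration only: the module ggevx_lwork_sketch and the function min_lwork_real are hypothetical names, not MKL routines, and the sketch assumes the option characters are passed in upper case. In practice a workspace query with lwork = -1 is the more reliable way to size work, since it returns the optimal size rather than the minimum.

```fortran
module ggevx_lwork_sketch
  implicit none
contains
  ! Smallest lwork accepted by sggevx/dggevx according to the documented
  ! rules for real flavors. Option characters must be upper case here.
  pure integer function min_lwork_real(balanc, jobvl, jobvr, sense, n) result(lw)
    character(len=1), intent(in) :: balanc, jobvl, jobvr, sense
    integer, intent(in) :: n

    ! Baseline requirement that always applies.
    lw = max(1, 2*n)
    ! Scaling or any eigenvector computation raises the bound to 6*n.
    if (balanc == 'S' .or. balanc == 'B' .or. &
        jobvl == 'V' .or. jobvr == 'V') lw = max(lw, 6*n)
    ! Condition numbers for eigenvalues need 10*n.
    if (sense == 'E' .or. sense == 'B') lw = max(lw, 10*n)
    ! Condition numbers for eigenvectors need 2*n**2 + 8*n + 16.
    if (sense == 'V' .or. sense == 'B') lw = max(lw, 2*n*n + 8*n + 16)
  end function min_lwork_real
end module ggevx_lwork_sketch
```

For example, with n = 4, jobvl = 'V', and all other options 'N', the helper returns 24; with sense = 'B' it returns 80.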
?ggev3
Computes the generalized eigenvalues and the left
and right generalized eigenvectors for a pair of
matrices.
Syntax
call sggev3 (jobvl, jobvr, n, a, lda, b, ldb, alphar, alphai, beta, vl, ldvl, vr,
ldvr, work, lwork, info )
call dggev3 (jobvl, jobvr, n, a, lda, b, ldb, alphar, alphai, beta, vl, ldvl, vr,
ldvr, work, lwork, info )
call cggev3 (jobvl, jobvr, n, a, lda, b, ldb, alpha, beta, vl, ldvl, vr, ldvr, work,
lwork, rwork, info )
call zggev3 (jobvl, jobvr, n, a, lda, b, ldb, alpha, beta, vl, ldvl, vr, ldvr, work,
lwork, rwork, info )
Include Files
mkl.fi
Description
For a pair of n-by-n real or complex nonsymmetric matrices (A, B), ?ggev3 computes the generalized
eigenvalues, and optionally, the left and right generalized eigenvectors.
A generalized eigenvalue for a pair of matrices (A, B) is a scalar λ or a ratio alpha/beta = λ, such that
A - λ*B is singular. It is usually represented as the pair (alpha,beta), as there is a reasonable
interpretation for beta=0, and even for both being zero.
For real flavors:
The right eigenvector v(j) corresponding to the eigenvalue λ(j) of (A, B) satisfies
A * v(j) = λ(j) * B * v(j).
The left eigenvector u(j) corresponding to the eigenvalue λ(j) of (A, B) satisfies
u(j)^H * A = λ(j) * u(j)^H * B
where u(j)^H is the conjugate-transpose of u(j).
For complex flavors:
The right generalized eigenvector v(j) corresponding to the generalized eigenvalue λ(j) of (A, B) satisfies
A * v(j) = λ(j) * B * v(j).
The left generalized eigenvector u(j) corresponding to the generalized eigenvalue λ(j) of (A, B) satisfies
u(j)^H * A = λ(j) * u(j)^H * B
where u(j)^H is the conjugate-transpose of u(j).
Input Parameters
jobvl CHARACTER*1. = 'N': do not compute the left generalized eigenvectors;

= 'V': compute the left generalized eigenvectors.
jobvr CHARACTER*1. = 'N': do not compute the right generalized eigenvectors;

= 'V': compute the right generalized eigenvectors.
n INTEGER. The order of the matrices A, B, VL, and VR.
n ≥ 0.
a REAL for sggev3

DOUBLE PRECISION for dggev3
COMPLEX for cggev3
DOUBLE COMPLEX for zggev3
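Array, size (lda, n).
On entry, the matrix A in the pair (A, B).
lda INTEGER. The leading dimension of a.
lda ≥ max(1,n).
b REAL for sggev3
DOUBLE PRECISION for dggev3
COMPLEX for cggev3
DOUBLE COMPLEX for zggev3
Array, size (ldb, n).
On entry, the matrix B in the pair (A, B).
ldb INTEGER. The leading dimension of b.
ldb ≥ max(1,n).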
ldvl INTEGER. The leading dimension of the matrix VL.
ldvl ≥ 1, and if jobvl = 'V', ldvl ≥ n.
ldvr INTEGER. The leading dimension of the matrix VR.
ldvr ≥ 1, and if jobvr = 'V', ldvr ≥ n.
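work REAL for sggev3
DOUBLE PRECISION for dggev3
COMPLEX for cggev3
DOUBLE COMPLEX for zggev3
Array, size (MAX(1,lwork))
lwork INTEGER. The dimension of the array work.
If lwork = -1, then a workspace query is assumed; the routine only
calculates the optimal size of the work array, returns this value as the
first entry of the work array, and no error message related to lwork is
issued by xerbla.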
rwork REAL for cggev3

DOUBLE PRECISION for zggev3
Array, size (8*n).
Output Parameters
a On exit, a is overwritten.
b On exit, b is overwritten.
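alphar REAL for sggev3
DOUBLE PRECISION for dggev3
Array, size (n).
alphai REAL for sggev3
DOUBLE PRECISION for dggev3
Array, size (n).
alpha COMPLEX for cggev3
DOUBLE COMPLEX for zggev3
Array, size (n).
beta REAL for sggev3
DOUBLE PRECISION for dggev3
COMPLEX for cggev3
DOUBLE COMPLEX for zggev3
Array, size (n).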
For real flavors:

On exit, (alphar(j) + alphai(j)*i)/beta(j), j=1,...,n, are the generalized
eigenvalues. If alphai(j) is zero, then the j-th eigenvalue is real; if
positive, then the j-th and (j+1)-st eigenvalues are a complex conjugate
pair, with alphai(j + 1) negative.
Note: the quotients alphar(j)/beta(j) and alphai(j)/beta(j) can
easily over- or underflow, and beta(j) might even be zero. Thus, you
should avoid computing the ratio alpha/beta by simply dividing alpha by
beta. However, alphar and alphai are always less than and usually
comparable with norm(A) in magnitude, and beta is always less than and
usually comparable with norm(B).
For complex flavors:
On exit, alpha(j)/beta(j), j=1,...,n, are the generalized eigenvalues.
Note: the quotients alpha(j)/beta(j) may easily over- or underflow, and
beta(j) can even be zero. Thus, you should avoid computing the ratio
alpha/beta by simply dividing alpha by beta. However, alpha is always
less than and usually comparable with norm(A) in magnitude, and beta is
always less than and usually comparable with norm(B).
vl REAL for sggev3
DOUBLE PRECISION for dggev3
COMPLEX for cggev3
DOUBLE COMPLEX for zggev3
Array, size (ldvl, n).
For real flavors:
If jobvl = 'V', the left eigenvectors u(j) are stored one after another in the
columns of vl, in the same order as their eigenvalues. If the j-th eigenvalue
is real, then u(j) = vl(:,j), the j-th column of vl. If the j-th and (j+1)-st
eigenvalues form a complex conjugate pair, then u(j) = vl(:,j) + i*vl(:,j+1)
and u(j+1) = vl(:,j) - i*vl(:,j+1).
Each eigenvector is scaled so the largest component has abs(real part) +
abs(imag. part) = 1.
Not referenced if jobvl = 'N'.
For complex flavors:
If jobvl = 'V', the left generalized eigenvectors u(j) are stored one after
another in the columns of vl, in the same order as their eigenvalues.
Each eigenvector is scaled so the largest component has abs(real part) +
abs(imag. part) = 1.
Not referenced if jobvl = 'N'.
vr REAL for sggev3
DOUBLE PRECISION for dggev3
COMPLEX for cggev3
DOUBLE COMPLEX for zggev3
Array, size (ldvr, n).
For real flavors:
If jobvr = 'V', the right eigenvectors v(j) are stored one after another in the
columns of vr, in the same order as their eigenvalues. If the j-th eigenvalue
is real, then v(j) = vr(:,j), the j-th column of vr. If the j-th and (j+1)-st
eigenvalues form a complex conjugate pair, then v(j) = vr(:,j) + i*vr(:,j+1)
and v(j+1) = vr(:,j) - i*vr(:,j+1).
Each eigenvector is scaled so the largest component has abs(real part) +
abs(imag. part) = 1.
Not referenced if jobvr = 'N'.
For complex flavors:
If jobvr = 'V', the right generalized eigenvectors v(j) are stored one after
another in the columns of vr, in the same order as their eigenvalues. Each
eigenvector is scaled so the largest component has abs(real part) +
abs(imag. part) = 1.
Not referenced if jobvr = 'N'.

info INTEGER.
= 0: successful exit.
< 0: if info = -i, the i-th argument had an illegal value.
= 1,...,n:
For real flavors:
The QZ iteration failed. No eigenvectors have been calculated, but
alphar(j), alphai(j) and beta(j) should be correct for j = info + 1,...,n.
For complex flavors:
The QZ iteration failed. No eigenvectors have been calculated, but
alpha(j) and beta(j) should be correct for j = info + 1,...,n.
> n:
= n + 1: other than QZ iteration failed in ?hgeqz,
= n + 2: error return from ?tgevc.
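The storage convention for the real flavors, where a complex conjugate pair of eigenvectors occupies two consecutive real columns, can be sketched as an unpacking helper. This is a minimal illustration under stated assumptions: the module ggev_unpack_sketch and the subroutine unpack_eigvecs are hypothetical names, not MKL routines, and the inputs are assumed to come from a successful call to dggev3 (or dggev) with jobvl = 'V' or jobvr = 'V'.

```fortran
module ggev_unpack_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Expand the real eigenvector storage v(ldv, n) into complex columns z(n, n),
  ! using the sign of alphai(j) to detect complex conjugate pairs.
  subroutine unpack_eigvecs(n, alphai, v, ldv, z)
    integer, intent(in) :: n, ldv
    real(dp), intent(in) :: alphai(n), v(ldv, n)
    complex(dp), intent(out) :: z(n, n)
    integer :: j

    j = 1
    do while (j <= n)
      if (alphai(j) > 0.0_dp .and. j < n) then
        ! Columns j and j+1 hold the real and imaginary parts of the pair.
        z(:, j)     = cmplx(v(1:n, j),  v(1:n, j+1), kind=dp)
        z(:, j + 1) = cmplx(v(1:n, j), -v(1:n, j+1), kind=dp)
        j = j + 2
      else
        ! Real eigenvalue: the column is the eigenvector itself.
        z(:, j) = cmplx(v(1:n, j), 0.0_dp, kind=dp)
        j = j + 1
      end if
    end do
  end subroutine unpack_eigvecs
end module ggev_unpack_sketch
```

The same helper applies to vl and vr alike, because both arrays follow the same column layout.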
LAPACK Auxiliary Routines

Routine naming conventions, mathematical notation, and matrix storage schemes used for LAPACK auxiliary
routines are the same as for the driver and computational routines described in previous chapters.
The table below summarizes information about the available LAPACK auxiliary routines.
LAPACK Auxiliary Routines
Routine Name    Data Types    Description
?lacgv c, z Conjugates a complex vector.
?lacrm c, z Multiplies a complex matrix by a square real matrix.
?lacrt c, z Performs a linear transformation of a pair of complex vectors.
?laesy c, z Computes the eigenvalues and eigenvectors of a 2-by-2 complex

symmetric matrix.
?rot c, z Applies a plane rotation with real cosine and complex sine to a pair
of complex vectors.
?spmv c, z Computes a matrix-vector product for complex vectors using a
complex symmetric packed matrix.
?spr c, z Performs the symmetrical rank-1 update of a complex symmetric

packed matrix.
?syconv s, c, d, z Converts a symmetric matrix given by a triangular matrix

factorization into two matrices and vice versa.
?symv c, z Computes a matrix-vector product for a complex symmetric

matrix.
?syr c, z Performs the symmetric rank-1 update of a complex symmetric

matrix.
i?max1 c, z Finds the index of the vector element whose real part has
maximum absolute value.
?sum1 sc, dz Forms the 1-norm of the complex vector using the true absolute
value.
?gbtf2 s, d, c, z Computes the LU factorization of a general band matrix using the

unblocked version of the algorithm.
?gebd2 s, d, c, z Reduces a general matrix to bidiagonal form using an unblocked

algorithm.
?gehd2 s, d, c, z Reduces a general square matrix to upper Hessenberg form using

an unblocked algorithm.
?gelq2 s, d, c, z Computes the LQ factorization of a general rectangular matrix

using an unblocked algorithm.
?geql2 s, d, c, z Computes the QL factorization of a general rectangular matrix
using an unblocked algorithm.
?geqr2 s, d, c, z Computes the QR factorization of a general rectangular matrix
using an unblocked algorithm.

?geqr2p s, d, c, z Computes the QR factorization of a general rectangular matrix with

non-negative diagonal elements using an unblocked algorithm.
?geqrt2 s, d, c, z Computes a QR factorization of a general real or complex matrix

using the compact WY representation of Q.
?geqrt3 s, d, c, z Recursively computes a QR factorization of a general real or

complex matrix using the compact WY representation of Q.
?gerq2 s, d, c, z Computes the RQ factorization of a general rectangular matrix
using an unblocked algorithm.

?gesc2 s, d, c, z Solves a system of linear equations using the LU factorization with

complete pivoting computed by ?getc2.
?getc2 s, d, c, z Computes the LU factorization with complete pivoting of the

general n-by-n matrix.
?getf2 s, d, c, z Computes the LU factorization of a general m-by-n matrix using

partial pivoting with row interchanges (unblocked algorithm).
?gtts2 s, d, c, z Solves a system of linear equations with a tridiagonal matrix using

the LU factorization computed by ?gttrf.
?isnan s, d Tests input for NaN.
?laisnan s, d Tests input for NaN by comparing two arguments for inequality.
?labrd s, d, c, z Reduces the first nb rows and columns of a general matrix to a

bidiagonal form.
?lacn2 s, d, c, z Estimates the 1-norm of a square matrix, using reverse

communication for evaluating matrix-vector products.
?lacon s, d, c, z Estimates the 1-norm of a square matrix, using reverse

communication for evaluating matrix-vector products.
?lacpy s, d, c, z Copies all or part of one two-dimensional array to another.
?ladiv s, d, c, z Performs complex division in real arithmetic, avoiding unnecessary

overflow.
?lae2 s, d Computes the eigenvalues of a 2-by-2 symmetric matrix.
?laebz s, d Computes the number of eigenvalues of a real symmetric

tridiagonal matrix which are less than or equal to a given value,
and performs other tasks required by the routine ?stebz.
?laed0 s, d, c, z Used by ?stedc. Computes all eigenvalues and corresponding

eigenvectors of an unreduced symmetric tridiagonal matrix using
the divide and conquer method.
?laed1 s, d Used by sstedc/dstedc. Computes the updated eigensystem of a

diagonal matrix after modification by a rank-one symmetric
matrix. Used when the original matrix is tridiagonal.
?laed2 s, d Used by sstedc/dstedc. Merges eigenvalues and deflates secular

equation. Used when the original matrix is tridiagonal.
?laed3 s, d Used by sstedc/dstedc. Finds the roots of the secular equation

and updates the eigenvectors. Used when the original matrix is
tridiagonal.
?laed4 s, d Used by sstedc/dstedc. Finds a single root of the secular

equation.
?laed5 s, d Used by sstedc/dstedc. Solves the 2-by-2 secular equation.
?laed6 s, d Used by sstedc/dstedc. Computes one Newton step in solution of

the secular equation.
?laed7 s, d, c, z Used by ?stedc. Computes the updated eigensystem of a diagonal

matrix after modification by a rank-one symmetric matrix. Used
when the original matrix is dense.
?laed8 s, d, c, z Used by ?stedc. Merges eigenvalues and deflates secular

equation. Used when the original matrix is dense.
?laed9 s, d Used by sstedc/dstedc. Finds the roots of the secular equation

and updates the eigenvectors. Used when the original matrix is
dense.
?laeda s, d Used by ?stedc. Computes the Z vector determining the rank-one

modification of the diagonal matrix. Used when the original matrix
is dense.
?laein s, d, c, z Computes a specified right or left eigenvector of an upper

Hessenberg matrix by inverse iteration.
?laev2 s, d, c, z Computes the eigenvalues and eigenvectors of a 2-by-2

symmetric/Hermitian matrix.
?laexc s, d Swaps adjacent diagonal blocks of a real upper quasi-triangular

matrix in Schur canonical form, by an orthogonal similarity
transformation.
?lag2 s, d Computes the eigenvalues of a 2-by-2 generalized eigenvalue

problem, with scaling as necessary to avoid over-/underflow.
?lags2 s, d Computes 2-by-2 orthogonal matrices U, V, and Q, and applies

them to matrices A and B such that the rows of the transformed A
and B are parallel.
?lagtf s, d Computes an LU factorization of a matrix T - λ*I, where T is a
general tridiagonal matrix, and λ is a scalar, using partial pivoting
with row interchanges.
?lagtm s, d, c, z Performs a matrix-matrix product of the form C = α*A*B + β*C, where
A is a tridiagonal matrix, B and C are rectangular matrices, and
α and β are scalars, which may be 0, 1, or -1.
?lagts s, d Solves the system of equations (T - λ*I)*x = y or (T - λ*I)^T*x = y, where T
is a general tridiagonal matrix and λ is a scalar, using the LU
factorization computed by ?lagtf.
?lagv2 s, d Computes the Generalized Schur factorization of a real 2-by-2

matrix pencil (A,B) where B is upper triangular.
?lahqr s, d, c, z Computes the eigenvalues and Schur factorization of an upper

Hessenberg matrix, using the double-shift/single-shift QR
algorithm.
?lahrd s, d, c, z Reduces the first nb columns of a general rectangular matrix A so

that elements below the k-th subdiagonal are zero, and returns
auxiliary matrices which are needed to apply the transformation to
the unreduced part of A.
?lahr2 s, d, c, z Reduces the specified number of first columns of a general

rectangular matrix A so that elements below the specified
subdiagonal are zero, and returns auxiliary matrices which are
needed to apply the transformation to the unreduced part of A.
?laic1 s, d, c, z Applies one step of incremental condition estimation.
?lakf2 s, d, c, z Forms a matrix containing Kronecker products between the given

matrices.
?lals0 s, d, c, z Applies back multiplying factors in solving the least squares

problem using divide and conquer SVD approach. Used by ?gelsd.
?lalsa s, d, c, z Computes the SVD of the coefficient matrix in compact form. Used
by ?gelsd.
?lalsd s, d, c, z Uses the singular value decomposition of A to solve the least

squares problem.
?lamrg s, d Creates a permutation list to merge the entries of two

independently sorted sets into a single set sorted in ascending
order.
?laneg s, d Computes the Sturm count.
?langb s, d, c, z Returns the value of the 1-norm, Frobenius norm, infinity-norm, or

the largest absolute value of any element of general band matrix.
?lange s, d, c, z Returns the value of the 1-norm, Frobenius norm, infinity-norm, or

the largest absolute value of any element of a general rectangular
matrix.
?langt s, d, c, z Returns the value of the 1-norm, Frobenius norm, infinity-norm, or

the largest absolute value of any element of a general tridiagonal
matrix.
?lanhs s, d, c, z Returns the value of the 1-norm, Frobenius norm, infinity-norm, or

the largest absolute value of any element of an upper Hessenberg
matrix.
?lansb s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
symmetric band matrix.
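?lanhb c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
Hermitian band matrix.
?lansp s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
symmetric matrix supplied in packed form.
?lanhp c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
complex Hermitian matrix supplied in packed form.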
?lanst/?lanht s, d/c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a real
symmetric or complex Hermitian tridiagonal matrix.
?lansy s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a real/
complex symmetric matrix.
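?lanhe c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
complex Hermitian matrix.
?lantb s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
triangular band matrix.
?lantp s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
triangular matrix supplied in packed form.
?lantr s, d, c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
trapezoidal or triangular matrix.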
?lanv2 s, d Computes the Schur factorization of a real 2-by-2 nonsymmetric

matrix in standard form.
?lapll s, d, c, z Measures the linear dependence of two vectors.
?lapmr s, d, c, z Rearranges rows of a matrix as specified by a permutation vector.
?lapmt s, d, c, z Performs a forward or backward permutation of the columns of a

matrix.
?lapy2 s, d Returns sqrt(x^2 + y^2).
?lapy3 s, d Returns sqrt(x^2 + y^2 + z^2).
?laqgb s, d, c, z Scales a general band matrix, using row and column scaling
factors computed by ?gbequ.
?laqge s, d, c, z Scales a general rectangular matrix, using row and column scaling
factors computed by ?geequ.
?laqhb c, z Scales a Hermitian band matrix, using scaling factors computed

by ?pbequ.
?laqp2 s, d, c, z Computes a QR factorization with column pivoting of the matrix

block.
?laqps s, d, c, z Computes a step of QR factorization with column pivoting of a real

m-by-n matrix A by using BLAS level 3.
?laqr0 s, d, c, z Computes the eigenvalues of a Hessenberg matrix, and optionally

the matrices from the Schur decomposition.
?laqr1 s, d, c, z Sets a scalar multiple of the first column of the product of 2-by-2
or 3-by-3 matrix H and specified shifts.
?laqr2 s, d, c, z Performs the orthogonal/unitary similarity transformation of a

Hessenberg matrix to detect and deflate fully converged
eigenvalues from a trailing principal submatrix (aggressive early
deflation).
?laqr3 s, d, c, z Performs the orthogonal/unitary similarity transformation of a

Hessenberg matrix to detect and deflate fully converged
eigenvalues from a trailing principal submatrix (aggressive early
deflation).
?laqr4 s, d, c, z Computes the eigenvalues of a Hessenberg matrix, and optionally

the matrices from the Schur decomposition.
?laqr5 s, d, c, z Performs a single small-bulge multi-shift QR sweep.
?laqsb s, d, c, z Scales a symmetric/Hermitian band matrix, using scaling factors

computed by ?pbequ.
?laqsp s, d, c, z Scales a symmetric/Hermitian matrix in packed storage, using

scaling factors computed by ?ppequ.
?laqsy s, d, c, z Scales a symmetric/Hermitian matrix, using scaling factors

computed by ?poequ.
?laqtr s, d Solves a real quasi-triangular system of equations, or a complex

quasi-triangular system of special form, in real arithmetic.
?lar1v s, d, c, z Computes the (scaled) r-th column of the inverse of the submatrix
in rows b1 through bn of the tridiagonal matrix L*D*L^T - σ*I.
?lar2v s, d, c, z Applies a vector of plane rotations with real cosines and real/
complex sines from both sides to a sequence of 2-by-2 symmetric/
Hermitian matrices.
?laran s, d Returns a random real number from a uniform distribution.
?larf s, d, c, z Applies an elementary reflector to a general rectangular matrix.
?larfb s, d, c, z Applies a block reflector or its transpose/conjugate-transpose to a

general rectangular matrix.
?larfg s, d, c, z Generates an elementary reflector (Householder matrix).
?larfgp s, d, c, z Generates an elementary reflector (Householder matrix) with
non-negative beta.
?larft s, d, c, z Forms the triangular factor T of a block reflector H = I - V*T*V^H.
?larfx s, d, c, z Applies an elementary reflector to a general rectangular matrix,
with loop unrolling when the reflector has order ≤ 10.
?large s, d, c, z Pre- and post-multiplies a real general matrix with a random

orthogonal matrix.
?larnd s, d, c, z Returns a random real number from a uniform or normal

distribution.
?largv s, d, c, z Generates a vector of plane rotations with real cosines and real/
complex sines.
?larnv s, d, c, z Returns a vector of random numbers from a uniform or normal

distribution.
?laror s, d, c, z Pre- or post-multiplies an m-by-n matrix by a random orthogonal/

unitary matrix.
?larot s, d, c, z Applies a Givens rotation to two adjacent rows or columns.
?larra s, d Computes the splitting points with the specified threshold.
?larrb s, d Provides limited bisection to locate eigenvalues for more accuracy.
?larrc s, d Computes the number of eigenvalues of the symmetric tridiagonal

matrix.
?larrd s, d Computes the eigenvalues of a symmetric tridiagonal matrix to

suitable accuracy.
?larre s, d Given the tridiagonal matrix T, sets small off-diagonal elements to

zero and for each unreduced block Ti, finds base representations
and eigenvalues.
?larrf s, d Finds a new relatively robust representation such that at least one
of the eigenvalues is relatively isolated.
?larrj s, d Performs refinement of the initial estimates of the eigenvalues of

the matrix T.
?larrk s, d Computes one eigenvalue of a symmetric tridiagonal matrix T to

suitable accuracy.
?larrr s, d Performs tests to decide whether the symmetric tridiagonal matrix

T warrants expensive computations which guarantee high relative
accuracy in the eigenvalues.
?larrv s, d, c, z Computes the eigenvectors of the tridiagonal matrix T = L*D*L^T
given L, D and the eigenvalues of L*D*L^T.
?lartg s, d, c, z Generates a plane rotation with real cosine and real/complex sine.
?lartgp s, d Generates a plane rotation so that the diagonal is nonnegative.
?lartgs s, d Generates a plane rotation designed to introduce a bulge in

implicit QR iteration for the bidiagonal SVD problem.
?lartv s, d, c, z Applies a vector of plane rotations with real cosines and real/
complex sines to the elements of a pair of vectors.
?laruv s, d Returns a vector of n random real numbers from a uniform

distribution.
?larz s, d, c, z Applies an elementary reflector (as returned by ?tzrzf) to a

general matrix.
?larzb s, d, c, z Applies a block reflector or its transpose/conjugate-transpose to a

general matrix.
?larzt s, d, c, z Forms the triangular factor T of a block reflector H = I - V*T*V^H.
?las2 s, d Computes singular values of a 2-by-2 triangular matrix.
?lascl s, d, c, z Multiplies a general rectangular matrix by a real scalar defined as

cto/cfrom.
?lasd0 s, d Computes the singular values of a real upper bidiagonal n-by-m

matrix B with diagonal d and off-diagonal e. Used by ?bdsdc.
?lasd1 s, d Computes the SVD of an upper bidiagonal matrix B of the specified

size. Used by ?bdsdc.
?lasd2 s, d Merges the two sets of singular values together into a single
sorted set. Used by ?bdsdc.
?lasd3 s, d Finds all square roots of the roots of the secular equation, as
defined by the values in D and Z, and then updates the singular
vectors by matrix multiplication. Used by ?bdsdc.
?lasd4 s, d Computes the square root of the i-th updated eigenvalue of a

positive symmetric rank-one modification to a positive diagonal
matrix. Used by ?bdsdc.
?lasd5 s, d Computes the square root of the i-th eigenvalue of a positive

symmetric rank-one modification of a 2-by-2 diagonal matrix.
Used by ?bdsdc.
?lasd6 s, d Computes the SVD of an updated upper bidiagonal matrix obtained
by merging two smaller ones by appending a row. Used by ?bdsdc.
?lasd7 s, d Merges the two sets of singular values together into a single
sorted set. Then it tries to deflate the size of the problem. Used
by ?bdsdc.
?lasd8 s, d Finds the square roots of the roots of the secular equation, and
stores, for each element in D, the distance to its two nearest
poles. Used by ?bdsdc.
?lasda s, d Computes the singular value decomposition (SVD) of a real upper
bidiagonal matrix with diagonal d and off-diagonal e. Used by ?bdsdc.
?lasdq s, d Computes the SVD of a real bidiagonal matrix with diagonal d and
off-diagonal e. Used by ?bdsdc.
?lasdt s, d Creates a tree of subproblems for bidiagonal divide and conquer.

Used by ?bdsdc.
?laset s, d, c, z Initializes the off-diagonal elements and the diagonal elements of

a matrix to given values.
?lasq1 s, d Computes the singular values of a real square bidiagonal matrix.

Used by ?bdsqr.
?lasq2 s, d Computes all the eigenvalues of the symmetric positive definite

tridiagonal matrix associated with the qd Array Z to high relative
accuracy. Used by ?bdsqr and ?stegr.
?lasq3 s, d Checks for deflation, computes a shift and calls dqds. Used by ?bdsqr.
?lasq4 s, d Computes an approximation to the smallest eigenvalue using

values of d from the previous transform. Used by ?bdsqr.
?lasq5 s, d Computes one dqds transform in ping-pong form. Used by ?bdsqr

and ?stegr.
?lasq6 s, d Computes one dqd transform in ping-pong form. Used by ?bdsqr

and ?stegr.
?lasr s, d, c, z Applies a sequence of plane rotations to a general rectangular

matrix.
?lasrt s, d Sorts numbers in increasing or decreasing order.
?lassq s, d, c, z Updates a sum of squares represented in scaled form.
?lasv2 s, d Computes the singular value decomposition of a 2-by-2 triangular

matrix.
?laswp s, d, c, z Performs a series of row interchanges on a general rectangular

matrix.
?lasy2 s, d Solves the Sylvester matrix equation where the matrices are of
order 1 or 2.
?lasyf s, d, c, z Computes a partial factorization of a real/complex symmetric

matrix, using the diagonal pivoting method.
?lasyf_rook s, d, c, z Computes a partial factorization of a real/complex symmetric

matrix, using the bounded Bunch-Kaufman diagonal pivoting
method.
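?lahef c, z Computes a partial factorization of a complex Hermitian indefinite
matrix, using the diagonal pivoting method.
?lahef_rook c, z Computes a partial factorization of a complex Hermitian indefinite
matrix, using the bounded Bunch-Kaufman diagonal pivoting method.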
?latbs s, d, c, z Solves a triangular banded system of equations.
?latdf s, d, c, z Uses the LU factorization of the n-by-n matrix computed by ?getc2
and computes a contribution to the reciprocal Dif-estimate.
?latm1 s, d, c, z Computes the entries of a matrix as specified.
?latm2 s, d, c, z Returns an entry of a random matrix.
?latm3 s, d, c, z Returns set entry of a random matrix.
?latm5 s, d, c, z Generates matrices involved in the Generalized Sylvester equation.
?latm6 s, d, c, z Generates test matrices for the generalized eigenvalue problem,
their corresponding right and left eigenvector matrices, and also
reciprocal condition numbers for all eigenvalues and the reciprocal
condition numbers of eigenvectors corresponding to the 1st and
5th eigenvalues.
?latme s, d, c, z Generates random non-symmetric square matrices with specified

eigenvalues.
?latmr s, d, c, z Generates random matrices of various types.
?latps s, d, c, z Solves a triangular system of equations with the matrix held in

packed storage.
?latrd s, d, c, z Reduces the first nb rows and columns of a symmetric/Hermitian

matrix A to real tridiagonal form by an orthogonal/unitary
similarity transformation.
?latrs s, d, c, z Solves a triangular system of equations with the scale factor set to
prevent overflow.
?latrz s, d, c, z Factors an upper trapezoidal matrix by means of orthogonal/

unitary transformations.
?lauu2 s, d, c, z Computes the product U*U^H or L^H*L, where U and L are upper or
lower triangular matrices (unblocked algorithm).
?lauum s, d, c, z Computes the product U*U^H or L^H*L, where U and L are upper or
lower triangular matrices (blocked algorithm).
?orbdb1/?unbdb1 s, d, c, z Simultaneously bidiagonalizes the blocks of a tall and skinny

matrix with orthonormal columns.
?orbdb2/?unbdb2
?orbdb3/?unbdb3
?orbdb4/?unbdb4
?orbdb5/?unbdb5 s, d, c, z Orthogonalizes a column vector with respect to the orthonormal

basis matrix.
?orbdb6/?unbdb6
?org2l/?ung2l s, d/c, z Generates all or part of the orthogonal/unitary matrix Q from a QL

factorization determined by ?geqlf (unblocked algorithm).
?org2r/?ung2r s, d/c, z Generates all or part of the orthogonal/unitary matrix Q from a QR

factorization determined by ?geqrf (unblocked algorithm).
?orgl2/?ungl2 s, d/c, z Generates all or part of the orthogonal/unitary matrix Q from an

LQ factorization determined by ?gelqf (unblocked algorithm).
?orgr2/?ungr2 s, d/c, z Generates all or part of the orthogonal/unitary matrix Q from an

RQ factorization determined by ?gerqf (unblocked algorithm).
?orm2l/?unm2l s, d/c, z Multiplies a general matrix by the orthogonal/unitary matrix from

a QL factorization determined by ?geqlf (unblocked algorithm).
?orm2r/?unm2r s, d/c, z Multiplies a general matrix by the orthogonal/unitary matrix from

a QR factorization determined by ?geqrf (unblocked algorithm).
?orml2/?unml2 s, d/c, z Multiplies a general matrix by the orthogonal/unitary matrix from

a LQ factorization determined by ?gelqf (unblocked algorithm).
?ormr2/?unmr2 s, d/c, z Multiplies a general matrix by the orthogonal/unitary matrix from

a RQ factorization determined by ?gerqf (unblocked algorithm).
?ormr3/?unmr3 s, d/c, z Multiplies a general matrix by the orthogonal/unitary matrix from

a RZ factorization determined by ?tzrzf (unblocked algorithm).
?pbtf2 s, d, c, z Computes the Cholesky factorization of a symmetric/Hermitian
positive definite band matrix (unblocked algorithm).
?potf2 s, d, c, z Computes the Cholesky factorization of a symmetric/Hermitian

positive definite matrix (unblocked algorithm).
?ptts2 s, d, c, z Solves a tridiagonal system of the form A*X = B using the L*D*L^H
factorization computed by ?pttrf.
?rscl s, d, cs, zd Multiplies a vector by the reciprocal of a real scalar.
?syswapr s, d, c, z Applies an elementary permutation on the rows and columns of a

symmetric matrix.
?heswapr c, z Applies an elementary permutation on the rows and columns of a

Hermitian matrix.
?sygs2/?hegs2 s, d/c, z Reduces a symmetric/Hermitian positive-definite generalized

eigenproblem to standard form, using the factorization results
obtained from ?potrf (unblocked algorithm).
?sytd2/?hetd2 s, d/c, z Reduces a symmetric/Hermitian matrix to real symmetric

tridiagonal form by an orthogonal/unitary similarity transformation
(unblocked algorithm).
?sytf2 s, d, c, z Computes the factorization of a real/complex symmetric indefinite

matrix, using the diagonal pivoting method (unblocked algorithm).
?sytf2_rook s, d, c, z Computes the factorization of a real/complex symmetric indefinite
matrix, using the bounded Bunch-Kaufman diagonal pivoting
method (unblocked algorithm).
?hetf2 c, z Computes the factorization of a complex Hermitian matrix, using

the diagonal pivoting method (unblocked algorithm).
?hetf2_rook c, z Computes the factorization of a complex Hermitian matrix, using

the bounded Bunch-Kaufman diagonal pivoting method (unblocked
algorithm).
?tgex2 s, d, c, z Swaps adjacent diagonal blocks in an upper (quasi) triangular

matrix pair by an orthogonal/unitary equivalence transformation.
?tgsy2 s, d, c, z Solves the generalized Sylvester equation (unblocked algorithm).
?trti2 s, d, c, z Computes the inverse of a triangular matrix (unblocked

algorithm).
clag2z cz Converts a complex single precision matrix to a complex double

precision matrix.
dlag2s ds Converts a double precision matrix to a single precision matrix.
slag2d sd Converts a single precision matrix to a double precision matrix.
zlag2c zc Converts a complex double precision matrix to a complex single

precision matrix.
?larfp s, d, c, z Generates a real or complex elementary reflector.
ila?lc s, d, c, z Scans a matrix for its last non-zero column.
ila?lr s, d, c, z Scans a matrix for its last non-zero row.
?gsvj0 s, d Pre-processor for the routine ?gesvj.
?gsvj1 s, d Pre-processor for the routine ?gesvj, applies Jacobi rotations

targeting only particular pivots.
?sfrk s, d Performs a symmetric rank-k operation for matrix in RFP format.
?hfrk c, z Performs a Hermitian rank-k operation for matrix in RFP format.
?tfsm s, d, c, z Solves a matrix equation (one operand is a triangular matrix in

RFP format).
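?lansf s, d Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
symmetric matrix in RFP format.
?lanhf c, z Returns the value of the 1-norm, or the Frobenius norm, or the
infinity norm, or the element of largest absolute value of a
Hermitian matrix in RFP format.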
?tfttp s, d, c, z Copies a triangular matrix from the rectangular full packed format
(TF) to the standard packed format (TP).
?tfttr s, d, c, z Copies a triangular matrix from the rectangular full packed format
(TF) to the standard full format (TR).
?tpqrt2 s, d, c, z Computes a QR factorization of a real or complex "triangular-

pentagonal" matrix, which is composed of a triangular block and a
pentagonal block, using the compact WY representation for Q.
?tprfb s, d, c, z Applies a real or complex "triangular-pentagonal" blocked reflector

to a real or complex matrix, which is composed of two blocks.
?tpttf s, d, c, z Copies a triangular matrix from the standard packed format (TP)
to the rectangular full packed format (TF).
?tpttr s, d, c, z Copies a triangular matrix from the standard packed format (TP)
to the standard full format (TR).
?trttf s, d, c, z Copies a triangular matrix from the standard full format (TR) to
the rectangular full packed format (TF).
?trttp s, d, c, z Copies a triangular matrix from the standard full format (TR) to
the standard packed format (TP).
?pstf2 s, d, c, z Computes the Cholesky factorization with complete pivoting of a

real symmetric or complex Hermitian positive semi-definite matrix.
dlat2s ds Converts a double-precision triangular matrix to a single-precision

triangular matrix.
zlat2c zc Converts a double complex triangular matrix to a complex

triangular matrix.
?lacp2 c, z Copies all or part of a real two-dimensional array to a complex

array.
?la_gbamv s, d, c, z Performs a matrix-vector operation to calculate error bounds.
?la_gbrcond s, d Estimates the Skeel condition number for a general banded matrix.
?la_gbrcond_c c, z Computes the infinity norm condition number of

op(A)*inv(diag(c)) for general banded matrices.
?la_gbrcond_x c, z Computes the infinity norm condition number of op(A)*diag(x) for

general banded matrices.
?la_gbrfsx_extended s, d, c, z Improves the computed solution to a system of linear equations
for general banded matrices by performing extra-precise iterative
refinement and provides error bounds and backward error
estimates for the solution.
?la_gbrpvgrw s, d, c, z Computes the reciprocal pivot growth factor norm(A)/norm(U) for

a general banded matrix.
?la_geamv s, d, c, z Computes a matrix-vector product using a general matrix to

calculate error bounds.
?la_gercond s, d Estimates the Skeel condition number for a general matrix.
?la_gercond_c c, z Computes the infinity norm condition number of

op(A)*inv(diag(c)) for general matrices.
?la_gercond_x c, z Computes the infinity norm condition number of op(A)*diag(x) for

general matrices.
?la_gerfsx_extended s, d Improves the computed solution to a system of linear equations
for general matrices by performing extra-precise iterative
refinement and provides error bounds and backward error
estimates for the solution.
?la_heamv c, z Computes a matrix-vector product using a Hermitian indefinite

matrix to calculate error bounds.
?la_hercond_c c, z Computes the infinity norm condition number of

op(A)*inv(diag(c)) for Hermitian indefinite matrices.
?la_hercond_x c, z Computes the infinity norm condition number of op(A)*diag(x) for

Hermitian indefinite matrices.
?la_herfsx_extended c, z Improves the computed solution to a system of linear equations
for Hermitian indefinite matrices by performing extra-precise
iterative refinement and provides error bounds and backward error
estimates for the solution.
?la_lin_berr s, d, c, z Computes a component-wise relative backward error.
?la_porcond s, d Estimates the Skeel condition number for a symmetric positive-

definite matrix.
?la_porcond_c c, z Computes the infinity norm condition number of

op(A)*inv(diag(c)) for Hermitian positive-definite matrices.
?la_porcond_x c, z Computes the infinity norm condition number of op(A)*diag(x) for

Hermitian positive-definite matrices.

?la_porfsx_extended s, d, c, z Improves the computed solution to a system of linear equations
for symmetric or Hermitian positive-definite matrices by
performing extra-precise iterative refinement and provides error
bounds and backward error estimates for the solution.
?la_porpvgrw s, d, c, z Computes the reciprocal pivot growth factor norm(A)/norm(U) for

a symmetric or Hermitian positive-definite matrix.
?laqhe c, z Scales a Hermitian matrix.
?laqhp c, z Scales a Hermitian matrix stored in packed form.
?larcm c, z Multiplies a square real matrix by a complex matrix.
?la_rpvgrw c, z Computes the reciprocal pivot growth factor norm(A)/norm(U) for
a general matrix.
?larscl2 s, d, c, z Performs reciprocal diagonal scaling on a vector.
?lascl2 s, d, c, z Performs diagonal scaling on a vector.
?la_syamv s, d, c, z Computes a matrix-vector product using a symmetric indefinite
matrix to calculate error bounds.
?la_syrcond s, d Estimates the Skeel condition number for a symmetric indefinite

matrix.
?la_syrcond_c c, z Computes the infinity norm condition number of

op(A)*inv(diag(c)) for symmetric indefinite matrices.
?la_syrcond_x c, z Computes the infinity norm condition number of op(A)*diag(x) for

symmetric indefinite matrices.

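?la_syrfsx_extended s, d, c, z Improves the computed solution to a system of linear equations
for symmetric indefinite matrices by performing extra-precise
iterative refinement and provides error bounds and backward error
estimates for the solution.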
?la_syrpvgrw s, d, c, z Computes the reciprocal pivot growth factor norm(A)/norm(U) for

a symmetric indefinite matrix.
?la_wwaddw s, d, c, z Adds a vector into a doubled-single vector.
mkl_?tppack s, d, c, z Copies a triangular/symmetric matrix or submatrix from standard

full format to standard packed format.
mkl_?tpunpack s, d, c, z Copies a triangular/symmetric matrix or submatrix from standard

packed format to full format.
?lacgv
Conjugates a complex vector.
Syntax
call clacgv( n, x, incx )
call zlacgv( n, x, incx )
Include Files
mkl.fi
Description
The routine conjugates a complex vector x of length n and increment incx (see "Vector Arguments in BLAS"
in Appendix B).
Input Parameters
n INTEGER. The length of the vector x (n ≥ 0).
x COMPLEX for clacgv
DOUBLE COMPLEX for zlacgv.
Array, dimension (1+(n-1)*|incx|).
Contains the vector of length n to be conjugated.
incx INTEGER. The spacing between successive elements of x.
Output Parameters
x On exit, overwritten with conjg(x).
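The call sequence above can be illustrated with a minimal sketch. It assumes the program is linked against MKL (LP64 interface, so default INTEGER arguments) or another LAPACK implementation; the program name and the test values are illustrative only.

```fortran
program lacgv_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)   ! DOUBLE COMPLEX / DOUBLE PRECISION kind
  integer, parameter :: n = 3
  complex(dp) :: x(n)
  external :: zlacgv

  ! Illustrative input values (not taken from the reference text).
  x = [ (1.0_dp, 2.0_dp), (3.0_dp, -4.0_dp), (0.0_dp, 5.0_dp) ]

  ! Conjugate all n elements in place, stepping through x with increment 1.
  call zlacgv(n, x, 1)

  ! x now holds (1,-2), (3,4), (0,-5): only the imaginary parts change sign.
  print *, x
end program lacgv_demo
```

An increment other than 1 lets the same call conjugate, for example, one row of a column-major matrix by passing the row's first element and incx equal to the leading dimension.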
?lacrm
Multiplies a complex matrix by a square real matrix.
Syntax
call clacrm( m, n, a, lda, b, ldb, c, ldc, rwork )
call zlacrm( m, n, a, lda, b, ldb, c, ldc, rwork )
Include Files
mkl.fi
Description
The routine performs a simple matrix-matrix multiplication of the form

C = A*B,
where A is m-by-n and complex, B is n-by-n and real, C is m-by-n and complex.
Input Parameters
m INTEGER. The number of rows of the matrix A and of the matrix C (m ≥ 0).
n INTEGER. The number of columns and rows of the matrix B and the number
of columns of the matrix C (n ≥ 0).
a COMPLEX for clacrm
DOUBLE COMPLEX for zlacrm
Array, DIMENSION(lda, n). Contains the m-by-n matrix A.
lda INTEGER. The leading dimension of the array a, lda ≥ max(1, m).
b REAL for clacrm
DOUBLE PRECISION for zlacrm
Array, DIMENSION(ldb, n). Contains the n-by-n matrix B.
ldb INTEGER. The leading dimension of the array b, ldb ≥ max(1, n).
ldc INTEGER. The leading dimension of the output array c, ldc ≥ max(1, n).
rwork REAL for clacrm
DOUBLE PRECISION for zlacrm
Workspace array, DIMENSION(2*m*n).
Output Parameters
c COMPLEX for clacrm

DOUBLE COMPLEX for zlacrm
Array, DIMENSION (ldc, n). Contains the m-by-n matrix C.
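As a rough usage sketch, the following program multiplies a 3-by-2 complex matrix by a 2-by-2 real matrix with zlacrm and compares the result with the Fortran intrinsic matmul. The matrix values and program name are illustrative assumptions, and linking against MKL (LP64 interface) or another LAPACK implementation is assumed.

```fortran
program lacrm_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 3, n = 2
  complex(dp) :: a(m, n), c(m, n)
  real(dp)    :: b(n, n), rwork(2*m*n)   ! workspace of size 2*m*n
  external :: zlacrm

  ! Illustrative data, given column by column (Fortran column-major order).
  a = reshape([ (1.0_dp, 1.0_dp), (0.0_dp, 2.0_dp), (3.0_dp, 0.0_dp), &
                (2.0_dp, -1.0_dp), (1.0_dp, 0.0_dp), (0.0_dp, -3.0_dp) ], [m, n])
  b = reshape([ 1.0_dp, 3.0_dp, 2.0_dp, 4.0_dp ], [n, n])

  ! C = A*B with complex A and real B; the leading dimensions are the
  ! declared first extents of a, b, and c.
  call zlacrm(m, n, a, m, b, n, c, m, rwork)

  ! Cross-check against the intrinsic; the difference should be at rounding level.
  print *, 'max |C - matmul(A, B)| =', maxval(abs(c - matmul(a, b)))
end program lacrm_demo
```

Keeping B real avoids promoting it to complex before the multiplication, which is the reason to prefer this routine over a general complex matrix product when one factor is known to be real.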
?lacrt
Performs a linear transformation of a pair of complex
vectors.
Syntax
call clacrt( n, cx, incx, cy, incy, c, s )
call zlacrt( n, cx, incx, cy, incy, c, s )
Include Files
mkl.fi
Description
The routine performs the following transformation:
x := c*x + s*y
y := -s*x + c*y
where c, s are complex scalars and x, y are complex vectors. Both right-hand sides
use the values of x and y on entry.
Input Parameters
n INTEGER. The number of elements in the vectors cx and cy (n ≥ 0).
cx, cy COMPLEX for clacrt

DOUBLE COMPLEX for zlacrt
Arrays, dimension (n).
Contain input vectors x and y, respectively.
incx INTEGER. The increment between successive elements of cx.
incy INTEGER. The increment between successive elements of cy.
c, s COMPLEX for clacrt

DOUBLE COMPLEX for zlacrt
Complex scalars that define the transform matrix
Output Parameters
cx On exit, overwritten with c*x + s*y .
cy On exit, overwritten with -s*x + c*y .
?laesy
Computes the eigenvalues and eigenvectors of a 2-
by-2 complex symmetric matrix, and checks that the
norm of the matrix of eigenvectors is larger than a
threshold value.
Syntax
call claesy( a, b, c, rt1, rt2, evscal, cs1, sn1 )
call zlaesy( a, b, c, rt1, rt2, evscal, cs1, sn1 )
Include Files
mkl.fi
Description
The routine performs the eigendecomposition of a 2-by-2 symmetric matrix
[ a  b ]
[ b  c ]
provided the norm of the matrix of eigenvectors is larger than some threshold value.
rt1 is the eigenvalue of larger absolute value, and rt2 of smaller absolute value. If the eigenvectors are
computed, then on return (cs1, sn1) is the unit eigenvector for rt1, hence
[  cs1  sn1 ]   [ a  b ]   [ cs1  -sn1 ]   [ rt1   0  ]
[ -sn1  cs1 ] * [ b  c ] * [ sn1   cs1 ] = [  0   rt2 ]
Input Parameters
a, b, c COMPLEX for claesy

DOUBLE COMPLEX for zlaesy
Elements of the input matrix.
Output Parameters
rt1, rt2 COMPLEX for claesy
DOUBLE COMPLEX for zlaesy
Eigenvalues of larger and smaller modulus, respectively.
evscal COMPLEX for claesy
DOUBLE COMPLEX for zlaesy
The complex value by which the eigenvector matrix was scaled to make it
orthonormal. If evscal is zero, the eigenvectors were not computed. This
means one of two things: the 2-by-2 matrix could not be diagonalized, or
the norm of the matrix of eigenvectors before scaling was larger than the
threshold value thresh (set to 0.1E0).
cs1, sn1 COMPLEX for claesy
DOUBLE COMPLEX for zlaesy
If evscal is not zero, then (cs1, sn1) is the unit right eigenvector for rt1.
?rot
Applies a plane rotation with real cosine and complex
sine to a pair of complex vectors.
Syntax
call crot( n, cx, incx, cy, incy, c, s )
call zrot( n, cx, incx, cy, incy, c, s )
Include Files
mkl.fi
Description
The routine applies a plane rotation, where the cosine (c) is real and the sine (s) is complex, and the vectors
cx and cy are complex. This routine has its real equivalents in BLAS (see ?rot in Chapter 2).
Input Parameters
n INTEGER. The number of elements in the vectors cx and cy.
cx, cy COMPLEX for crot
DOUBLE COMPLEX for zrot
Arrays of dimension (n), contain input vectors x and y, respectively.
incx INTEGER. The increment between successive elements of cx.
incy INTEGER. The increment between successive elements of cy.
c REAL for crot
DOUBLE PRECISION for zrot
s COMPLEX for crot
DOUBLE COMPLEX for zrot
Values that define a rotation
[     c      s ]
[ -conjg(s)  c ]
where c*c + s*conjg(s) = 1.0.
Output Parameters
cx On exit, overwritten with c*x + s*y.
cy On exit, overwritten with -conjg(s)*x + c*y.
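A small sketch of the rotation, under the assumption that the program is linked against MKL (LP64 interface) or another LAPACK implementation. The values of c and s are illustrative and chosen so that c*c + s*conjg(s) = 1; the program name and input vectors are likewise hypothetical.

```fortran
program rot_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2
  complex(dp) :: cx(n), cy(n), s
  real(dp)    :: c, norm_before, norm_after
  external :: zrot

  ! Illustrative input vectors.
  cx = [ (1.0_dp, 0.0_dp), (2.0_dp, -1.0_dp) ]
  cy = [ (0.0_dp, 1.0_dp), (1.0_dp,  1.0_dp) ]

  c = 0.6_dp                  ! real cosine
  s = (0.0_dp, 0.8_dp)        ! complex sine: 0.36 + 0.64 = 1

  norm_before = sum(abs(cx)**2) + sum(abs(cy)**2)
  call zrot(n, cx, 1, cy, 1, c, s)
  norm_after  = sum(abs(cx)**2) + sum(abs(cy)**2)

  ! For the first pair, x = 1 and y = i on entry, so
  ! cx(1) = 0.6*1 + (0.8i)*i = -0.2 and cy(1) = -(-0.8i)*1 + 0.6*i = 1.4i.
  print *, 'first pair after rotation:', cx(1), cy(1)
  ! A rotation with c*c + s*conjg(s) = 1 preserves the combined squared norm.
  print *, 'change in squared norm   :', norm_after - norm_before
end program rot_demo
```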
?spmv
Computes a matrix-vector product for complex vectors
using a complex symmetric packed matrix.
Syntax
call cspmv( uplo, n, alpha, ap, x, incx, beta, y, incy )
call zspmv( uplo, n, alpha, ap, x, incx, beta, y, incy )
Include Files
mkl.fi
Description
The ?spmv routines perform a matrix-vector operation defined as
y := alpha*a*x + beta*y,
where:
alpha and beta are complex scalars,
x and y are n-element complex vectors,
a is an n-by-n complex symmetric matrix, supplied in packed form.
These routines have their real equivalents in BLAS (see ?spmv in Chapter 2 ).
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
matrix a is supplied in the packed array ap.
If uplo = 'U' or 'u', the upper triangular part of the matrix a is supplied
in the array ap.
If uplo = 'L' or 'l', the lower triangular part of the matrix a is supplied
in the array ap.
n INTEGER.
Specifies the order of the matrix a.
alpha, beta COMPLEX for cspmv

DOUBLE COMPLEX for zspmv
Specify complex scalars alpha and beta. When beta is supplied as zero,
then y need not be set on input.
ap COMPLEX for cspmv
DOUBLE COMPLEX for zspmv
Array, DIMENSION at least ((n*(n + 1))/2). Before entry, with uplo =
'U' or 'u', the array ap must contain the upper triangular part of the
symmetric matrix packed sequentially, column-by-column, so that ap(1)
contains a(1, 1), ap(2) and ap(3) contain a(1, 2) and a(2, 2)
respectively, and so on. Before entry, with uplo = 'L' or 'l', the array ap
must contain the lower triangular part of the symmetric matrix packed
sequentially, column-by-column, so that ap(1) contains a(1, 1), ap(2)
and ap(3) contain a(2, 1) and a(3, 1) respectively, and so on.
x COMPLEX for cspmv
DOUBLE COMPLEX for zspmv
Array, DIMENSION at least (1 + (n - 1)*abs(incx)). Before entry, the
incremented array x must contain the n-element vector x.
incx INTEGER. Specifies the increment for the elements of x. The value of incx
must not be zero.
y COMPLEX for cspmv
DOUBLE COMPLEX for zspmv
Array, DIMENSION at least (1 + (n - 1)*abs(incy)). Before entry, the
incremented array y must contain the n-element vector y.
incy INTEGER. Specifies the increment for the elements of y. The value of incy
must not be zero.
Output Parameters
y Overwritten by the updated vector y.
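The packed layout described for ap (uplo = 'U') can be sketched as follows. This is an illustrative example only: the matrix values, the program name, and the packing loop are assumptions for demonstration, and the program is assumed to be linked against MKL (LP64 interface) or another LAPACK implementation.

```fortran
program spmv_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  complex(dp), parameter :: one = (1.0_dp, 0.0_dp), zero = (0.0_dp, 0.0_dp)
  complex(dp) :: a(n, n), ap(n*(n+1)/2), x(n), y(n)
  integer :: i, j, k
  external :: zspmv

  ! Build a complex symmetric (not Hermitian) matrix: a(i,j) = a(j,i).
  do j = 1, n
    do i = 1, j
      a(i, j) = cmplx(i + j, i*j, kind=dp)
      a(j, i) = a(i, j)
    end do
  end do
  x = [ (1.0_dp, 0.0_dp), (0.0_dp, 1.0_dp), (2.0_dp, -1.0_dp) ]

  ! Pack the upper triangle column by column:
  ! ap(1) = a(1,1), ap(2) = a(1,2), ap(3) = a(2,2), ap(4) = a(1,3), ...
  k = 0
  do j = 1, n
    do i = 1, j
      k = k + 1
      ap(k) = a(i, j)
    end do
  end do

  ! y := 1*a*x + 0*y; with beta = 0, y need not be set on input.
  call zspmv('U', n, one, ap, x, 1, zero, y, 1)

  ! Cross-check against the full-storage product; expect rounding-level error.
  print *, 'max |y - matmul(a, x)| =', maxval(abs(y - matmul(a, x)))
end program spmv_demo
```

For uplo = 'L' the inner loop would run over i = j, n instead, which yields the lower-triangle ordering given in the description of ap.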
?spr
Performs the symmetric rank-1 update of a complex
symmetric packed matrix.
Syntax
call cspr( uplo, n, alpha, x, incx, ap )
call zspr( uplo, n, alpha, x, incx, ap )
Include Files
mkl.fi
Description
The ?spr routines perform a symmetric rank-1 operation defined as
a := alpha*x*x^T + a,
where:
alpha is a complex scalar
x is an n-element complex vector
a is an n-by-n complex symmetric matrix, supplied in packed form.
These routines have their real equivalents in BLAS (see ?spr in Chapter 2).
Input Parameters
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
matrix a is supplied in the packed array ap, as follows:
If uplo = 'U' or 'u', the upper triangular part of the matrix a is supplied
in the array ap.
If uplo = 'L' or 'l', the lower triangular part of the matrix a is supplied
in the array ap.
n INTEGER.
Specifies the order of the matrix a.
alpha COMPLEX for cspr

DOUBLE COMPLEX for zspr
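x COMPLEX for cspr
DOUBLE COMPLEX for zspr
Array, DIMENSION at least (1 + (n - 1)*abs(incx)). Before entry, the
incremented array x must contain the n-element vector x.
incx INTEGER. Specifies the increment for the elements of x. The value of incx
must not be zero.
ap COMPLEX for cspr
DOUBLE COMPLEX for zspr
Array, DIMENSION at least ((n*(n + 1))/2). Before entry, with uplo = 'U'
or 'u', the array ap must contain the upper triangular part of the
symmetric matrix packed sequentially, column-by-column, so that ap(1)
contains a(1,1), ap(2) and ap(3) contain a(1,2) and a(2,2)
respectively, and so on.
Before entry, with uplo = 'L' or 'l', the array ap must contain the lower
triangular part of the symmetric matrix packed sequentially, column-by-
column, so that ap(1) contains a(1,1), ap(2) and ap(3) contain a(2,1)
and a(3,1) respectively, and so on.
Note that the imaginary parts of the diagonal elements need not be set,
they are assumed to be zero, and on exit they are set to zero.
Output Parameters
ap With uplo = 'U' or 'u', overwritten by the upper triangular part of the
updated matrix.
With uplo = 'L' or 'l', overwritten by the lower triangular part of the
updated matrix.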
?syconv
Converts a symmetric matrix given by a triangular
matrix factorization into two matrices and vice versa.
Syntax
call ssyconv( uplo, way, n, a, lda, ipiv, e, info )
call dsyconv( uplo, way, n, a, lda, ipiv, e, info )
call csyconv( uplo, way, n, a, lda, ipiv, e, info )
call zsyconv( uplo, way, n, a, lda, ipiv, e, info )
call syconv( a[,uplo][,way][,ipiv][,info][,e] )
Include Files
mkl.fi, lapack.f90
Description
The routine converts matrix A, which results from a triangular matrix factorization, into matrices L and D and
vice versa. The routine returns non-diagonalized elements of D and applies or reverses permutation done
with the triangular matrix factorization.
Input Parameters

uplo CHARACTER*1. Must be 'U' or 'L'.
Indicates whether the details of the factorization are stored as an upper or
lower triangular matrix:
If uplo = 'U': the upper triangular, A = U*D*U^T.
If uplo = 'L': the lower triangular, A = L*D*L^T.
way CHARACTER*1. Must be 'C' or 'R'.
a REAL for ssyconv

DOUBLE PRECISION for dsyconv
COMPLEX for csyconv
DOUBLE COMPLEX for zsyconv
Array of size lda by n.
The block diagonal matrix D and the multipliers used to obtain the factor U
or L as computed by ?sytrf.

ipiv INTEGER. Array of size max(1, n).
Details of the interchanges and the block structure of D, as returned by ?sytrf.
Output Parameters
e REAL for ssyconv

DOUBLE PRECISION for dsyconv
COMPLEX for csyconv
DOUBLE COMPLEX for zsyconv
Array of size max(1, n) containing the superdiagonal/subdiagonal of the
symmetric 1-by-1 or 2-by-2 block diagonal matrix D in L*D*L^T.
info INTEGER. If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
If info = -1011, memory allocation error occurred.

Specific details for the routine syconv interface are as follows:
way Must be 'C' or 'R'.
See Also
?sytrf
?symv
Computes a matrix-vector product for a complex
symmetric matrix.
Syntax
call csymv( uplo, n, alpha, a, lda, x, incx, beta, y, incy )
call zsymv( uplo, n, alpha, a, lda, x, incx, beta, y, incy )
Include Files
mkl.fi
Description
The routine performs the matrix-vector operation defined as

y := alpha*a*x + beta*y,
where:
alpha and beta are complex scalars
x and y are n-element complex vectors
a is an n-by-n symmetric complex matrix.
These routines have their real equivalents in BLAS (see ?symv in Chapter 2).
Input Parameters
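uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used:
If uplo = 'U' or 'u', then the upper triangular part of the array a is used.
If uplo = 'L' or 'l', then the lower triangular part of the array a is used.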
n INTEGER. Specifies the order of the matrix a. The value of n must be at

least zero.
alpha, beta COMPLEX for csymv

DOUBLE COMPLEX for zsymv
Specify the scalars alpha and beta. When beta is supplied as zero, then y
need not be set on input.
a COMPLEX for csymv
DOUBLE COMPLEX for zsymv
Array, DIMENSION (lda, n). Before entry with uplo = 'U' or 'u', the
leading n-by-n upper triangular part of the array a must contain the upper
triangular part of the symmetric matrix and the strictly lower triangular part
of a is not referenced. Before entry with uplo = 'L' or 'l', the leading n-
by-n lower triangular part of the array a must contain the lower triangular
part of the symmetric matrix and the strictly upper triangular part of a is
not referenced.
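lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program. The value of lda must be at least max(1,n).
x COMPLEX for csymv
DOUBLE COMPLEX for zsymv
Array, DIMENSION at least (1 + (n - 1)*abs(incx)). Before entry, the
incremented array x must contain the n-element vector x.
incx INTEGER. Specifies the increment for the elements of x. The value of incx
must not be zero.
y COMPLEX for csymv
DOUBLE COMPLEX for zsymv
Array, DIMENSION at least (1 + (n - 1)*abs(incy)). Before entry, the
incremented array y must contain the n-element vector y.
incy INTEGER. Specifies the increment for the elements of y. The value of incy
must not be zero.
Output Parameters
y Overwritten by the updated vector y.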
?syr
Performs the symmetric rank-1 update of a complex
symmetric matrix.
Syntax
call csyr( uplo, n, alpha, x, incx, a, lda )
call zsyr( uplo, n, alpha, x, incx, a, lda )
Include Files
mkl.fi
Description
The routine performs the symmetric rank 1 operation defined as

a := alpha*x*x^T + a,
where:
alpha is a complex scalar.

x is an n-element complex vector.
a is an n-by-n complex symmetric matrix.
These routines have their real equivalents in BLAS (see ?syr in Chapter 2).
Input Parameters
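uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array a is used:
If uplo = 'U' or 'u', then the upper triangular part of the array a is used.
If uplo = 'L' or 'l', then the lower triangular part of the array a is used.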
n INTEGER. Specifies the order of the matrix a. The value of n must be at

least zero.
alpha COMPLEX for csyr

DOUBLE COMPLEX for zsyr
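x COMPLEX for csyr
DOUBLE COMPLEX for zsyr
Array, DIMENSION at least (1 + (n - 1)*abs(incx)). Before entry, the
incremented array x must contain the n-element vector x.
incx INTEGER. Specifies the increment for the elements of x. The value of incx
must not be zero.
a COMPLEX for csyr
DOUBLE COMPLEX for zsyr
Array, size (lda, n). Before entry with uplo = 'U' or 'u', the leading n-
by-n upper triangular part of the array a must contain the upper triangular
part of the symmetric matrix and the strictly lower triangular part of a is
not referenced.
Before entry with uplo = 'L' or 'l', the leading n-by-n lower triangular part
of the array a must contain the lower triangular part of the symmetric matrix
and the strictly upper triangular part of a is not referenced.
lda INTEGER. Specifies the leading dimension of a as declared in the calling
(sub)program. The value of lda must be at least max(1,n).
Output Parameters
a With uplo = 'U' or 'u', the upper triangular part of the array a is
overwritten by the upper triangular part of the updated matrix.
With uplo = 'L' or 'l', the lower triangular part of the array a is
overwritten by the lower triangular part of the updated matrix.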

i?max1
Finds the index of the vector element whose real part
has maximum absolute value.
Syntax
index = icmax1( n, cx, incx )
index = izmax1( n, cx, incx )
Include Files
mkl.fi
Description
Given a complex vector cx, the i?max1 functions return the index of the first vector element of maximum
absolute value. These functions are based on the BLAS functions icamax/izamax, but using the absolute
value of components. They are designed for use with clacon/zlacon.
Input Parameters
n INTEGER. Specifies the number of elements in the vector cx.
cx COMPLEX for icmax1

DOUBLE COMPLEX for izmax1
Contains the input vector.
incx INTEGER. Specifies the spacing between successive elements of cx.
Output Parameters
index INTEGER. Index of the vector element of maximum absolute value.
?sum1
Forms the 1-norm of the complex vector using the
true absolute value.
Syntax
res = scsum1( n, cx, incx )
res = dzsum1( n, cx, incx )
Include Files
mkl.fi
Description
Given a complex vector cx, scsum1/dzsum1 functions take the sum of the absolute values of vector elements
and return a single/double precision result, respectively. These functions are based on scasum/dzasum from
Level 1 BLAS, but use the true absolute value and were designed for use with clacon/zlacon.
Input Parameters
n INTEGER. Specifies the number of elements in the vector cx.
cx COMPLEX for scsum1

DOUBLE COMPLEX for dzsum1
Contains the input vector whose elements will be summed.
incx INTEGER. Specifies the spacing between successive elements of cx
(incx > 0).
Output Parameters
res REAL for scsum1

DOUBLE PRECISION for dzsum1
Sum of absolute values.
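The difference between the true absolute value used here and the sum computed by the Level 1 BLAS function dzasum (which adds |real part| + |imaginary part| for each element) can be seen in a short sketch. The input values and program name are illustrative, and linking against MKL (LP64 interface) or another BLAS/LAPACK implementation is assumed.

```fortran
program sum1_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  complex(dp) :: cx(2)
  real(dp), external :: dzsum1, dzasum

  ! Illustrative values with easy moduli: |3+4i| = 5 and |-6+8i| = 10.
  cx = [ (3.0_dp, 4.0_dp), (-6.0_dp, 8.0_dp) ]

  ! True absolute values: 5 + 10 = 15.
  print *, 'dzsum1:', dzsum1(2, cx, 1)

  ! |re| + |im| per element: (3 + 4) + (6 + 8) = 21.
  print *, 'dzasum:', dzasum(2, cx, 1)
end program sum1_demo
```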
?gbtf2
Computes the LU factorization of a general band
matrix using the unblocked version of the algorithm.
Syntax
call sgbtf2( m, n, kl, ku, ab, ldab, ipiv, info )
call dgbtf2( m, n, kl, ku, ab, ldab, ipiv, info )
call cgbtf2( m, n, kl, ku, ab, ldab, ipiv, info )
call zgbtf2( m, n, kl, ku, ab, ldab, ipiv, info )
Include Files
mkl.fi
Description
The routine forms the LU factorization of a general real/complex m-by-n band matrix A with kl sub-diagonals
and ku super-diagonals. The routine uses partial pivoting with row interchanges and implements the
unblocked version of the algorithm, calling Level 2 BLAS. See also ?gbtrf.
Input Parameters
kl INTEGER. The number of sub-diagonals within the band of A (kl ≥ 0).
ku INTEGER. The number of super-diagonals within the band of A (ku ≥ 0).
ab REAL for sgbtf2

DOUBLE PRECISION for dgbtf2
COMPLEX for cgbtf2
DOUBLE COMPLEX for zgbtf2.
Array, DIMENSION (ldab,*).
The array ab contains the matrix A in band storage (see Matrix Arguments).
ldab INTEGER. The leading dimension of the array ab
(ldab ≥ 2*kl + ku + 1).
Output Parameters
ab Overwritten by details of the factorization. The diagonal and kl + ku
super-diagonals of U are stored in the first 1 + kl + ku rows of ab. The multipliers
used during the factorization are stored in the next kl rows.
ipiv INTEGER.
Array, DIMENSION at least max(1,min(m,n)).
The pivot indices: row i was interchanged with row ipiv(i).
info INTEGER. If info = 0, the execution is successful.
If info = i, u(i,i) is 0. The factorization has been completed, but U is exactly
singular. Division by 0 will occur if you use the factor U for solving a system
of linear equations.
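The band storage expected in ab can be sketched in Python as follows. This is an illustration only: it assumes the usual LAPACK layout for band LU routines, in which A(i,j) is stored in ab(kl+ku+1+i-j, j) and the first kl rows of ab are left free for fill-in. The helper name to_band_storage is hypothetical.

```python
def to_band_storage(a, kl, ku):
    """Pack a dense m-by-n matrix (list of rows) into a (2*kl+ku+1)-by-n band array."""
    m, n = len(a), len(a[0])
    ldab = 2 * kl + ku + 1
    ab = [[0.0] * n for _ in range(ldab)]
    for j in range(1, n + 1):                      # 1-based indices, as in Fortran
        for i in range(max(1, j - ku), min(m, j + kl) + 1):
            # ab(kl+ku+1+i-j, j) = A(i, j), shifted to 0-based Python indices
            ab[kl + ku + i - j][j - 1] = a[i - 1][j - 1]
    return ab


a = [[1.0, 2.0, 0.0],
     [3.0, 4.0, 5.0],
     [0.0, 6.0, 7.0]]
for row in to_band_storage(a, kl=1, ku=1):
    print(row)
```

For this tridiagonal example, the first row of the result is the free row reserved for fill-in, followed by the super-diagonal, the diagonal, and the sub-diagonal.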
?gebd2
Reduces a general matrix to bidiagonal form using an
unblocked algorithm.
Syntax
call sgebd2( m, n, a, lda, d, e, tauq, taup, work, info )
call dgebd2( m, n, a, lda, d, e, tauq, taup, work, info )
call cgebd2( m, n, a, lda, d, e, tauq, taup, work, info )
call zgebd2( m, n, a, lda, d, e, tauq, taup, work, info )
Include Files
mkl.fi
Description
The routine reduces a general m-by-n matrix A to upper or lower bidiagonal form B by an orthogonal
(unitary) transformation: Q^T*A*P = B (for real flavors) or Q^H*A*P = B (for complex flavors).
If m ≥ n, B is upper bidiagonal; if m < n, B is lower bidiagonal.
The routine does not form the matrices Q and P explicitly, but represents them as products of elementary
reflectors.
If m ≥ n,
Q = H(1)*H(2)*...*H(n), and P = G(1)*G(2)*...*G(n-1)
If m < n,
Q = H(1)*H(2)*...*H(m-1), and P = G(1)*G(2)*...*G(m)
Each H(i) and G(i) has the form
H(i) = I - tauq*v*v^T and G(i) = I - taup*u*u^T for real flavors, or
H(i) = I - tauq*v*v^H and G(i) = I - taup*u*u^H for complex flavors
where tauq and taup are scalars (real for sgebd2/dgebd2, complex for cgebd2/zgebd2), and v and u are
vectors (real for sgebd2/dgebd2, complex for cgebd2/zgebd2).
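A single real elementary reflector of the form I - tau*v*v^T can be sketched in Python as below. The sketch assumes the common LAPACK convention of normalizing v so that its first element is 1 and choosing the sign of beta opposite to the leading element; the helper names make_reflector and apply_reflector are hypothetical and are not MKL routines.

```python
import math


def make_reflector(x):
    """Return (beta, v, tau) with v[0] == 1 so that (I - tau*v*v^T) x = (beta, 0, ..., 0)."""
    alpha = x[0]
    xnorm = math.sqrt(sum(t * t for t in x[1:]))
    if xnorm == 0.0:
        # Nothing to annihilate: the reflector is the identity (tau = 0).
        return alpha, [1.0] + [0.0] * (len(x) - 1), 0.0
    beta = -math.copysign(math.hypot(alpha, xnorm), alpha)
    tau = (beta - alpha) / beta
    v = [1.0] + [t / (alpha - beta) for t in x[1:]]
    return beta, v, tau


def apply_reflector(v, tau, x):
    """Compute (I - tau*v*v^T) x without forming the matrix."""
    w = sum(vi * xi for vi, xi in zip(v, x))
    return [xi - tau * w * vi for vi, xi in zip(v, x)]


beta, v, tau = make_reflector([3.0, 4.0])
print(beta, v, tau)
print(apply_reflector(v, tau, [3.0, 4.0]))
```

Because only v and tau are needed to apply the reflector, a routine can keep v in the zeroed part of the matrix and tau in a separate array.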
Input Parameters
a, work REAL for sgebd2

DOUBLE PRECISION for dgebd2
COMPLEX for cgebd2
DOUBLE COMPLEX for zgebd2.
Arrays:
a(lda,*) contains the m-by-n general matrix A to be reduced. The second
dimension of a must be at least max(1, n).
work(*) is a workspace array; the dimension of work must be at least
max(1, m, n).
Output Parameters
a If m ≥ n, the diagonal and first super-diagonal of a are overwritten with the
upper bidiagonal matrix B. Elements below the diagonal, with the array
tauq, represent the orthogonal/unitary matrix Q as a product of elementary
reflectors, and elements above the first super-diagonal, with the array taup,
represent the orthogonal/unitary matrix P as a product of elementary
reflectors.
If m < n, the diagonal and first sub-diagonal of a are overwritten by the
lower bidiagonal matrix B. Elements below the first sub-diagonal, with the
array tauq, represent the orthogonal/unitary matrix Q as a product of
elementary reflectors, and elements above the diagonal, with the array
taup, represent the orthogonal/unitary matrix P as a product of elementary
reflectors.

d REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors. Array, DIMENSION at least
max(1, min(m, n)).
Contains the diagonal elements of the bidiagonal matrix B: d(i) = a(i, i).
e REAL for single-precision flavors
DOUBLE PRECISION for double-precision flavors. Array, DIMENSION at least
max(1, min(m, n) - 1).
Contains the off-diagonal elements of the bidiagonal matrix B:
if m ≥ n, e(i) = a(i, i+1) for i = 1,2,..., n-1;
if m < n, e(i) = a(i+1, i) for i = 1,2,..., m-1.
tauq, taup REAL for sgebd2

DOUBLE PRECISION for dgebd2
COMPLEX for cgebd2
DOUBLE COMPLEX for zgebd2.
Arrays, DIMENSION at least max (1, min(m, n)).
Contain scalar factors of the elementary reflectors which represent
orthogonal/unitary matrices Q and P, respectively.
info INTEGER.
?gehd2
Reduces a general square matrix to upper Hessenberg
form using an unblocked algorithm.
Syntax
call sgehd2( n, ilo, ihi, a, lda, tau, work, info )
call dgehd2( n, ilo, ihi, a, lda, tau, work, info )
call cgehd2( n, ilo, ihi, a, lda, tau, work, info )
call zgehd2( n, ilo, ihi, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine reduces a real/complex general matrix A to upper Hessenberg form H by an orthogonal or
unitary similarity transformation Q^T*A*Q = H (for real flavors) or Q^H*A*Q = H (for complex flavors).
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of elementary
reflectors.
Input Parameters
n INTEGER. The order of the matrix A (n ≥ 0).
ilo, ihi INTEGER. It is assumed that A is already upper triangular in rows and
columns 1:ilo-1 and ihi+1:n.
If A has been output by ?gebal, then
ilo and ihi must contain the values returned by that routine. Otherwise they
should be set to ilo = 1 and ihi = n. Constraint: 1 ≤ ilo ≤ ihi ≤ max(1, n).
a, work REAL for sgehd2

DOUBLE PRECISION for dgehd2
COMPLEX for cgehd2
DOUBLE COMPLEX for zgehd2.
Arrays:
a(lda,*) contains the n-by-n matrix A to be reduced. The second
dimension of a must be at least max(1, n).
work(n) is a workspace array.
Output Parameters
a On exit, the upper triangle and the first subdiagonal of A are overwritten
with the upper Hessenberg matrix H and the elements below the first
subdiagonal, with the array tau, represent the orthogonal/unitary matrix Q
as a product of elementary reflectors. See Application Notes below.
tau REAL for sgehd2

DOUBLE PRECISION for dgehd2
COMPLEX for cgehd2
DOUBLE COMPLEX for zgehd2.
Array, DIMENSION at least max (1, n-1).
Contains the scalar factors of elementary reflectors. See Application Notes
below.
info INTEGER.
Application Notes
The matrix Q is represented as a product of (ihi - ilo) elementary reflectors
Q = H(ilo)*H(ilo+1)*...*H(ihi-1)
Each H(i) has the form
H(i) = I - tau*v*v^T for real flavors, or
H(i) = I - tau*v*v^H for complex flavors
where tau is a real/complex scalar, and v is a real/complex vector with v(1:i) = 0, v(i+1) = 1 and
v(ihi+1:n) = 0.
On exit, v(i+2:ihi) is stored in a(i+2:ihi, i) and tau in tau(i).
The contents of a are illustrated by the following example, with n = 7, ilo = 2 and ihi = 6:
where a denotes an element of the original matrix A, h denotes a modified element of the upper Hessenberg
matrix H, and vi denotes an element of the vector defining H(i).
?gelq2
Computes the LQ factorization of a general
rectangular matrix using an unblocked algorithm.
Syntax
call sgelq2( m, n, a, lda, tau, work, info )
call dgelq2( m, n, a, lda, tau, work, info )
call cgelq2( m, n, a, lda, tau, work, info )
call zgelq2( m, n, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine computes an LQ factorization of a real/complex m-by-n matrix A as A = L*Q.
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of min(m, n)
elementary reflectors:
Q = H(k) ... H(2) H(1) (or Q = H(k)^H ... H(2)^H H(1)^H for complex flavors), where k = min(m, n).
Each H(i) has the form
H(i) = I - tau*v*v^T for real flavors, or
H(i) = I - tau*v*v^H for complex flavors,
where tau is a real/complex scalar stored in tau(i), and v is a real/complex vector with v(1:i-1) = 0 and
v(i) = 1.
On exit, v(i+1:n) (for real functions) and conjg(v(i+1:n)) (for complex functions) are stored in a(i, i+1:n).
Input Parameters
a, work REAL for sgelq2

DOUBLE PRECISION for dgelq2
COMPLEX for cgelq2
DOUBLE COMPLEX for zgelq2.
Arrays: a(lda,*) contains the m-by-n matrix A. The second dimension of a
must be at least max(1, n).
work(m) is a workspace array.
lda INTEGER. The leading dimension of a; at least max(1, m).
Output Parameters

a On exit, the elements on and below the diagonal of the array a contain the
m-by-min(n,m) lower trapezoidal matrix L (L is lower triangular if n ≥ m); the
elements above the diagonal, with the array tau, represent the orthogonal/
unitary matrix Q as a product of min(n,m) elementary reflectors.
tau REAL for sgelq2

DOUBLE PRECISION for dgelq2
COMPLEX for cgelq2
DOUBLE COMPLEX for zgelq2.
Contains scalar factors of the elementary reflectors.
info INTEGER.
?geql2
Computes the QL factorization of a general
rectangular matrix using an unblocked algorithm.
Syntax
call sgeql2( m, n, a, lda, tau, work, info )
call dgeql2( m, n, a, lda, tau, work, info )
call cgeql2( m, n, a, lda, tau, work, info )
call zgeql2( m, n, a, lda, tau, work, info )
Include Files
mkl.fi
Description
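The routine computes a QL factorization of a real/complex m-by-n matrix A as A = Q*L.
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of elementary
reflectors:
Q = H(k)* ... *H(2)*H(1), where k = min(m, n).
Each H(i) has the form
H(i) = I - tau*v*v^T for real flavors, or
H(i) = I - tau*v*v^H for complex flavors,
where tau is a real/complex scalar stored in tau(i), and v is a real/complex vector with v(m-k+i+1:m) = 0
and v(m-k+i) = 1.
On exit, v(1:m-k+i-1) is stored in a(1:m-k+i-1, n-k+i).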
Input Parameters
a, work REAL for sgeql2

DOUBLE PRECISION for dgeql2
COMPLEX for cgeql2
DOUBLE COMPLEX for zgeql2.
Arrays:
Output Parameters

a On exit, if m ≥ n, the lower triangle of the subarray a(m-n+1:m, 1:n)
contains the n-by-n lower triangular matrix L; if m < n, the elements on
and below the (n-m)th super-diagonal contain the m-by-n lower trapezoidal
matrix L; the remaining elements, with the array tau, represent the
orthogonal/unitary matrix Q as a product of elementary reflectors.
tau REAL for sgeql2

DOUBLE PRECISION for dgeql2
COMPLEX for cgeql2
DOUBLE COMPLEX for zgeql2.
info INTEGER.
?geqr2
Computes the QR factorization of a general
rectangular matrix using an unblocked algorithm.
Syntax
call sgeqr2( m, n, a, lda, tau, work, info )
call dgeqr2( m, n, a, lda, tau, work, info )
call cgeqr2( m, n, a, lda, tau, work, info )
call zgeqr2( m, n, a, lda, tau, work, info )
Include Files
mkl.fi
Description
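The routine computes a QR factorization of a real/complex m-by-n matrix A as A = Q*R.
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of elementary
reflectors:
Q = H(1)*H(2)* ... *H(k), where k = min(m, n).
Each H(i) has the form
H(i) = I - tau*v*v^T for real flavors, or
H(i) = I - tau*v*v^H for complex flavors,
where tau is a real/complex scalar stored in tau(i), and v is a real/complex vector with v(1:i-1) = 0 and
v(i) = 1.
On exit, v(i+1:m) is stored in a(i+1:m, i).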
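The storage scheme described above (R on and above the diagonal, the reflector vectors below it, and the scalars in tau) can be sketched for real matrices in Python. This is an unblocked Householder QR written for illustration under the usual LAPACK sign convention, not the MKL implementation; the function name householder_qr is hypothetical.

```python
import math


def householder_qr(a):
    """Unblocked Householder QR of a real m-by-n matrix given as a list of rows.

    Returns (f, tau): R on and above the diagonal of f, v(i+1:m) below it.
    """
    m, n = len(a), len(a[0])
    f = [row[:] for row in a]
    tau = []
    for i in range(min(m, n)):
        alpha = f[i][i]
        xnorm = math.sqrt(sum(f[r][i] ** 2 for r in range(i + 1, m)))
        if xnorm == 0.0:
            tau.append(0.0)            # H(i) is the identity
            continue
        beta = -math.copysign(math.hypot(alpha, xnorm), alpha)
        t = (beta - alpha) / beta
        for r in range(i + 1, m):      # store v(i+1:m); v(i) = 1 is implicit
            f[r][i] /= (alpha - beta)
        f[i][i] = beta
        for j in range(i + 1, n):      # apply H(i) to the trailing columns
            w = f[i][j] + sum(f[r][i] * f[r][j] for r in range(i + 1, m))
            f[i][j] -= t * w
            for r in range(i + 1, m):
                f[r][j] -= t * w * f[r][i]
        tau.append(t)
    return f, tau


f, tau = householder_qr([[3.0, 1.0], [4.0, 2.0]])
print(f)
print(tau)
```

In this 2-by-2 example the entry below the diagonal holds the second component of v for H(1), while the first row and the diagonal hold R.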
Input Parameters
a, work REAL for sgeqr2

DOUBLE PRECISION for dgeqr2
COMPLEX for cgeqr2
DOUBLE COMPLEX for zgeqr2.
Arrays:
work(n) is a workspace array.
lda INTEGER. The leading dimension of a; at least max(1, m).
Output Parameters

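a On exit, the elements on and above the diagonal of the array a contain the
min(n,m)-by-n upper trapezoidal matrix R (R is upper triangular if m ≥ n); the
elements below the diagonal, with the array tau, represent the orthogonal/
unitary matrix Q as a product of elementary reflectors.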
tau REAL for sgeqr2

DOUBLE PRECISION for dgeqr2
COMPLEX for cgeqr2
DOUBLE COMPLEX for zgeqr2.
info INTEGER.
?geqr2p
Computes the QR factorization of a general
rectangular matrix with non-negative diagonal
elements using an unblocked algorithm.
Syntax
call sgeqr2p( m, n, a, lda, tau, work, info )
call dgeqr2p( m, n, a, lda, tau, work, info )
call cgeqr2p( m, n, a, lda, tau, work, info )
call zgeqr2p( m, n, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine computes a QR factorization of a real/complex m-by-n matrix A as A = Q*R. The diagonal entries
of R are real and nonnegative.
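The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of elementary
reflectors:
Q = H(1)*H(2)* ... *H(k), where k = min(m, n).
Each H(i) has the form
H(i) = I - tau*v*v^T for real flavors, or
H(i) = I - tau*v*v^H for complex flavors,
where tau is a real/complex scalar stored in tau(i), and v is a real/complex vector with v(1:i-1) = 0 and
v(i) = 1.
On exit, v(i+1:m) is stored in a(i+1:m, i).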
Input Parameters
a, work REAL for sgeqr2p

DOUBLE PRECISION for dgeqr2p
COMPLEX for cgeqr2p
DOUBLE COMPLEX for zgeqr2p.
Arrays:
work(n) is a workspace array.
Output Parameters

a On exit, the elements on and above the diagonal of the array a contain the
min(n,m)-by-n upper trapezoidal matrix R (R is upper triangular if m ≥ n).
The diagonal entries of R are real and nonnegative; the elements below the
diagonal, with the array tau, represent the orthogonal/unitary matrix Q as a
product of elementary reflectors.
tau REAL for sgeqr2p

DOUBLE PRECISION for dgeqr2p
COMPLEX for cgeqr2p
DOUBLE COMPLEX for zgeqr2p.
info INTEGER.
?geqrt2
Computes a QR factorization of a general real or
complex matrix using the compact WY representation
of Q.
Syntax
call sgeqrt2(m, n, a, lda, t, ldt, info)
call dgeqrt2(m, n, a, lda, t, ldt, info)
call cgeqrt2(m, n, a, lda, t, ldt, info)
call zgeqrt2(m, n, a, lda, t, ldt, info)
call geqrt2(a, t [, info])
Include Files
mkl.fi, lapack.f90
Description
where vi represents the vector that defines H(i). The vectors are returned in the lower triangular part of array
a.
NOTE
The block reflector H is then given by

H = I - V*T*V^T for real flavors, and
H = I - V*T*V^H for complex flavors,
where V^T is the transpose and V^H is the conjugate transpose of V.
Input Parameters
m INTEGER. The number of rows in the matrix A (m ≥ n).
a REAL for sgeqrt2

DOUBLE PRECISION for dgeqrt2
COMPLEX for cgeqrt2
DOUBLE COMPLEX for zgeqrt2.
Array, size lda by n. Array a contains the m-by-n matrix A.
Output Parameters

a The elements on and above the diagonal of the array contain the n-by-n
upper triangular matrix R. The elements below the diagonal are the
columns of V.
t REAL for sgeqrt2
DOUBLE PRECISION for dgeqrt2
COMPLEX for cgeqrt2
DOUBLE COMPLEX for zgeqrt2.
Array, size (ldt, min(m, n)).
The n-by-n upper triangular factor of the block reflector. The elements on
and above the diagonal contain the block reflector T. The elements below
the diagonal are not used.
info INTEGER.
?geqrt3
Recursively computes a QR factorization of a general
real or complex matrix using the compact WY
representation of Q.
Syntax
call sgeqrt3(m, n, a, lda, t, ldt, info)
call dgeqrt3(m, n, a, lda, t, ldt, info)
call cgeqrt3(m, n, a, lda, t, ldt, info)
call zgeqrt3(m, n, a, lda, t, ldt, info)
call geqrt3(a, t [, info])
Include Files
mkl.fi, lapack.f90
Description
where vi represents one of the vectors that define H(i). The vectors are returned in the lower
triangular part of array a.
NOTE
The block reflector H is then given by

H = I - V*T*V^T for real flavors, and
H = I - V*T*V^H for complex flavors,
where V^T is the transpose and V^H is the conjugate transpose of V.
Input Parameters
m INTEGER. The number of rows in the matrix A (m ≥ n).
a REAL for sgeqrt3
DOUBLE PRECISION for dgeqrt3
COMPLEX for cgeqrt3
DOUBLE COMPLEX for zgeqrt3.
Array, size (lda, n). Array a contains the m-by-n matrix A.
Output Parameters
a The elements on and above the diagonal of the array contain the n-by-n
upper triangular matrix R. The elements below the diagonal are the
columns of V.
t REAL for sgeqrt3
DOUBLE PRECISION for dgeqrt3
COMPLEX for cgeqrt3
DOUBLE COMPLEX for zgeqrt3.
Array, size ldt by n.
The n-by-n upper triangular factor of the block reflector. The elements on
and above the diagonal contain the block reflector T. The elements below
the diagonal are not used.
info INTEGER.
?gerq2
Computes the RQ factorization of a general
rectangular matrix using an unblocked algorithm.
Syntax
call sgerq2( m, n, a, lda, tau, work, info )
call dgerq2( m, n, a, lda, tau, work, info )
call cgerq2( m, n, a, lda, tau, work, info )
call zgerq2( m, n, a, lda, tau, work, info )
Include Files
mkl.fi
Description
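The routine computes an RQ factorization of a real/complex m-by-n matrix A as A = R*Q.
The routine does not form the matrix Q explicitly. Instead, Q is represented as a product of elementary
reflectors:
Q = H(1)*H(2)* ... *H(k) for real flavors, or
Q = H(1)^H*H(2)^H* ... *H(k)^H for complex flavors
where k = min(m, n).
Each H(i) has the form
H(i) = I - tau*v*v^T for real flavors, or
H(i) = I - tau*v*v^H for complex flavors,
where tau is a real/complex scalar stored in tau(i), and v is a real/complex vector with v(n-k+i+1:n) = 0
and v(n-k+i) = 1.
On exit, v(1:n-k+i-1) is stored in a(m-k+i, 1:n-k+i-1).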
Input Parameters
a, work REAL for sgerq2

DOUBLE PRECISION for dgerq2
COMPLEX for cgerq2
DOUBLE COMPLEX for zgerq2.
Arrays:
Output Parameters

a On exit, if m ≤ n, the upper triangle of the subarray a(1:m, n-m+1:n)
contains the m-by-m upper triangular matrix R; if m > n, the elements on
and above the (m-n)-th sub-diagonal contain the m-by-n upper trapezoidal
matrix R; the remaining elements, with the array tau, represent the
orthogonal/unitary matrix Q as a product of elementary reflectors.
tau REAL for sgerq2

DOUBLE PRECISION for dgerq2
COMPLEX for cgerq2
DOUBLE COMPLEX for zgerq2.
info INTEGER.
?gesc2
Solves a system of linear equations using the LU
factorization with complete pivoting computed by
?getc2.
Syntax
call sgesc2( n, a, lda, rhs, ipiv, jpiv, scale )
call dgesc2( n, a, lda, rhs, ipiv, jpiv, scale )
call cgesc2( n, a, lda, rhs, ipiv, jpiv, scale )
call zgesc2( n, a, lda, rhs, ipiv, jpiv, scale )
Include Files
mkl.fi
Description
The routine solves a system of linear equations

A*X = scale*RHS
with a general n-by-n matrix A using the LU factorization with complete pivoting computed by ?getc2.
Input Parameters
a, rhs REAL for sgesc2

DOUBLE PRECISION for dgesc2
COMPLEX for cgesc2
DOUBLE COMPLEX for zgesc2.
Arrays:
a(lda,*) contains the LU part of the factorization of the n-by-n matrix A
computed by ?getc2:
A = P*L*U*Q.
The second dimension of a must be at least max(1, n);
rhs(n) contains on entry the right hand side vector for the system of
equations.
ipiv INTEGER.
Array, DIMENSION at least max(1,n).
The pivot indices: for 1 ≤ i ≤ n, row i of the matrix has been interchanged
with row ipiv(i).
jpiv INTEGER.
Array, DIMENSION at least max(1,n).
The pivot indices: for 1 ≤ j ≤ n, column j of the matrix has been
interchanged with column jpiv(j).
Output Parameters
rhs On exit, overwritten with the solution vector X.
scale REAL for sgesc2/cgesc2

DOUBLE PRECISION for dgesc2/zgesc2
Contains the scale factor. scale is chosen in the range 0 ≤ scale ≤ 1 to
prevent overflow in the solution.
?getc2
Computes the LU factorization with complete pivoting
of the general n-by-n matrix.
Syntax
call sgetc2( n, a, lda, ipiv, jpiv, info )
call dgetc2( n, a, lda, ipiv, jpiv, info )
call cgetc2( n, a, lda, ipiv, jpiv, info )
call zgetc2( n, a, lda, ipiv, jpiv, info )
Include Files
mkl.fi
Description
The routine computes an LU factorization with complete pivoting of the n-by-n matrix A. The factorization
has the form A = P*L*U*Q, where P and Q are permutation matrices, L is lower triangular with unit diagonal
elements and U is upper triangular.
The LU factorization computed by this routine is used by ?latdf to compute a contribution to the reciprocal
Dif-estimate.
Input Parameters
a REAL for sgetc2

DOUBLE PRECISION for dgetc2
COMPLEX for cgetc2
DOUBLE COMPLEX for zgetc2.
Array a(lda,*) contains the n-by-n matrix A to be factored. The second
dimension of a must be at least max(1, n);
Output Parameters
a On exit, the factors L and U from the factorization A = P*L*U*Q; the unit
diagonal elements of L are not stored. If U(k, k) appears to be less than
smin, U(k, k) is given the value of smin, which gives a nonsingular
perturbed system.
ipiv INTEGER.
The pivot indices: for 1 ≤ i ≤ n, row i of the matrix has been interchanged
with row ipiv(i).
jpiv INTEGER.
The pivot indices: for 1 ≤ j ≤ n, column j of the matrix has been
interchanged with column jpiv(j).
info INTEGER.
If info = k > 0, U(k, k) is likely to produce overflow when solving for x
in A*x = b, so U is perturbed to avoid the overflow.
?getf2
Computes the LU factorization of a general m-by-n
matrix using partial pivoting with row interchanges.
Syntax
call sgetf2( m, n, a, lda, ipiv, info )
call dgetf2( m, n, a, lda, ipiv, info )
call cgetf2( m, n, a, lda, ipiv, info )
call zgetf2( m, n, a, lda, ipiv, info )
Include Files
mkl.fi
Description
The routine computes the LU factorization of a general m-by-n matrix A using partial pivoting with row
interchanges. The factorization has the form
A = P*L*U
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m >
n) and U is upper triangular (upper trapezoidal if m < n).
Input Parameters
a REAL for sgetf2

DOUBLE PRECISION for dgetf2
COMPLEX for cgetf2
DOUBLE COMPLEX for zgetf2.

Array, size (lda,*). Contains the matrix A to be factored. The second
dimension of a must be at least max(1, n).
Output Parameters
a Overwritten by L and U. The unit diagonal elements of L are not stored.
ipiv INTEGER.
Array, size at least max(1,min(m,n)).
The pivot indices: for 1 ≤ i ≤ n, row i was interchanged with row ipiv(i).
info INTEGER.
If info = i > 0, u(i,i) is 0. The factorization has been completed, but U is
exactly singular. Division by 0 will occur if you use the factor U for solving a
system of linear equations.
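The factorization A = P*L*U with row interchanges can be sketched in Python as a right-looking, column-by-column elimination. The sketch below is an illustration of the technique under the stated conventions (1-based pivot indices, unit diagonal of L not stored, info reporting the first zero pivot); it is not the MKL code, and the name lu_partial_pivot is hypothetical.

```python
def lu_partial_pivot(a):
    """LU factorization with partial pivoting of an m-by-n matrix (list of rows).

    Returns (f, ipiv, info): L (below the diagonal) and U packed in f,
    1-based pivot rows in ipiv, and info > 0 if a zero pivot was met.
    """
    m, n = len(a), len(a[0])
    f = [row[:] for row in a]
    ipiv, info = [], 0
    for j in range(min(m, n)):
        # Pivot search: the row with the largest |entry| in column j.
        p = max(range(j, m), key=lambda r: abs(f[r][j]))
        ipiv.append(p + 1)
        if f[p][j] != 0.0:
            if p != j:
                f[j], f[p] = f[p], f[j]
            for r in range(j + 1, m):      # multipliers of L
                f[r][j] /= f[j][j]
        elif info == 0:
            info = j + 1                   # U(j,j) is exactly zero
        for r in range(j + 1, m):          # rank-1 update of the trailing block
            for c in range(j + 1, n):
                f[r][c] -= f[r][j] * f[j][c]
    return f, ipiv, info


f, ipiv, info = lu_partial_pivot([[1.0, 2.0], [3.0, 4.0]])
print(f, ipiv, info)
```

Here the second row is chosen as the first pivot, so ipiv records an interchange of rows 1 and 2.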
?gtts2
Solves a system of linear equations with a tridiagonal
matrix using the LU factorization computed by
?gttrf.
Syntax
call sgtts2( itrans, n, nrhs, dl, d, du, du2, ipiv, b, ldb )
call dgtts2( itrans, n, nrhs, dl, d, du, du2, ipiv, b, ldb )
call cgtts2( itrans, n, nrhs, dl, d, du, du2, ipiv, b, ldb )
call zgtts2( itrans, n, nrhs, dl, d, du, du2, ipiv, b, ldb )
Include Files
mkl.fi
Description
The routine solves for X one of the following systems of linear equations with multiple right hand sides:
A*X = B, A^T*X = B, or A^H*X = B (for complex matrices only), with a tridiagonal matrix A using the LU
factorization computed by ?gttrf.
Input Parameters
itrans INTEGER. Must be 0, 1, or 2.

Indicates the form of the equations to be solved:
If itrans = 0, then A*X = B (no transpose).
If itrans = 1, then A^T*X = B (transpose).
If itrans = 2, then A^H*X = B (conjugate transpose).
nrhs INTEGER. The number of right-hand sides, that is, the number of columns in B
(nrhs ≥ 0).
dl,d,du,du2,b REAL for sgtts2

DOUBLE PRECISION for dgtts2
COMPLEX for cgtts2
DOUBLE COMPLEX for zgtts2.
Arrays: dl(n - 1), d(n), du(n - 1), du2(n - 2), b(ldb,nrhs).
The array dl contains the (n - 1) multipliers that define the matrix L from
the LU factorization of A.
The array d contains the n diagonal elements of the upper triangular matrix
U from the LU factorization of A.
The array du contains the (n - 1) elements of the first super-diagonal of U.
The array du2 contains the (n - 2) elements of the second super-diagonal of
U.
The array b contains the matrix B whose columns are the right-hand sides
for the systems of equations.
ldb INTEGER. The leading dimension of b; must be ldb ≥ max(1, n).
ipiv INTEGER.
Array, DIMENSION (n).
The pivot indices array, as returned by ?gttrf.
Output Parameters
?isnan
Tests input for NaN.
Syntax
val = sisnan( sin )
val = disnan( din )
Include Files
mkl.fi
Description
This logical routine returns .TRUE. if its argument is NaN, and .FALSE. otherwise.
Input Parameters
sin REAL for sisnan

Input to test for NaN.
din DOUBLE PRECISION for disnan

Input to test for NaN.
Output Parameters
val Logical. Result of the test.
?laisnan
Tests input for NaN.
Syntax
val = slaisnan( sin1, sin2 )
val = dlaisnan( din1, din2 )
Include Files
mkl.fi
Description
This logical routine checks for NaNs (NaN stands for 'Not A Number') by comparing its two arguments for
inequality. NaN is the only floating-point value where NaN ≠ NaN returns .TRUE. To check for NaNs, pass the
same variable as both arguments.
This routine is not for general use. It exists solely to avoid over-optimization in ?isnan.
Input Parameters
sin1, sin2 REAL for slaisnan

Two numbers to compare for inequality.
din1, din2 DOUBLE PRECISION for dlaisnan

Two numbers to compare for inequality.
Output Parameters
val Logical. Result of the comparison.
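The comparison trick can be illustrated with a short Python sketch. The function names below mirror the routine names for readability only; they are plain Python helpers, not bindings to MKL.

```python
def laisnan(x1, x2):
    """Return True when the two arguments compare unequal."""
    return x1 != x2


def isnan(x):
    """NaN test: a value is NaN exactly when it is unequal to itself."""
    return laisnan(x, x)


print(isnan(float("nan")))
print(isnan(1.0))
print(isnan(float("inf")))
```

Only the first call reports a NaN; ordinary values and infinities compare equal to themselves.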
?labrd
Reduces the first nb rows and columns of a general
matrix to a bidiagonal form.
Syntax
call slabrd( m, n, nb, a, lda, d, e, tauq, taup, x, ldx, y, ldy )
call dlabrd( m, n, nb, a, lda, d, e, tauq, taup, x, ldx, y, ldy )
call clabrd( m, n, nb, a, lda, d, e, tauq, taup, x, ldx, y, ldy )
call zlabrd( m, n, nb, a, lda, d, e, tauq, taup, x, ldx, y, ldy )
Include Files
mkl.fi
Description
The routine reduces the first nb rows and columns of a general m-by-n matrix A to upper or lower bidiagonal
form by an orthogonal/unitary transformation Q'*A*P, and returns the matrices X and Y which are needed to
apply the transformation to the unreduced part of A.
If m ≥ n, A is reduced to upper bidiagonal form; if m < n, to lower bidiagonal form.
The matrices Q and P are represented as products of elementary reflectors: Q = H(1)*H(2)* ...*H(nb),
and P = G(1)*G(2)* ...*G(nb)
Each H(i) and G(i) has the form

H(i) = I - tauq*v*v' and G(i) = I - taup*u*u'
where tauq and taup are scalars, and v and u are vectors.
The elements of the vectors v and u together form the m-by-nb matrix V and the nb-by-n matrix U' which
are needed, with X and Y, to apply the transformation to the unreduced part of the matrix, using a block
update of the form: A := A - V*Y' - X*U'.
This is an auxiliary routine called by ?gebrd.
Input Parameters
nb INTEGER. The number of leading rows and columns of A to be reduced.
a REAL for slabrd

DOUBLE PRECISION for dlabrd
COMPLEX for clabrd
DOUBLE COMPLEX for zlabrd.
Array a(lda,*) contains the matrix A to be reduced. The second dimension
of a must be at least max(1, n).
ldx INTEGER. The leading dimension of the output array x; must be at least
max(1, m).
ldy INTEGER. The leading dimension of the output array y; must be at least
max(1, n).
Output Parameters
a On exit, the first nb rows and columns of the matrix are overwritten; the
rest of the array is unchanged.
If m ≥ n, elements on and below the diagonal in the first nb columns, with the
array tauq, represent the orthogonal/unitary matrix Q as a product of
elementary reflectors; and elements above the diagonal in the first nb rows,
with the array taup, represent the orthogonal/unitary matrix P as a product
of elementary reflectors.
If m < n, elements below the diagonal in the first nb columns, with the
array tauq, represent the orthogonal/unitary matrix Q as a product of
elementary reflectors, and elements on and above the diagonal in the first
nb rows, with the array taup, represent the orthogonal/unitary matrix P as
a product of elementary reflectors.
d, e REAL for single-precision flavors

DOUBLE PRECISION for double-precision flavors. Arrays, DIMENSION (nb)
each. The array d contains the diagonal elements of the first nb rows and
columns of the reduced matrix:
d(i) = a(i,i).
The array e contains the off-diagonal elements of the first nb rows and
columns of the reduced matrix.
tauq, taup REAL for slabrd
DOUBLE PRECISION for dlabrd
COMPLEX for clabrd
DOUBLE COMPLEX for zlabrd.
Arrays, DIMENSION (nb) each. Contain scalar factors of the elementary
reflectors which represent the orthogonal/unitary matrices Q and P,
respectively.
x, y REAL for slabrd
DOUBLE PRECISION for dlabrd
COMPLEX for clabrd
DOUBLE COMPLEX for zlabrd.
Arrays, dimension x(ldx, nb), y(ldy, nb).
The array x contains the m-by-nb matrix X required to update the
unreduced part of A.
The array y contains the n-by-nb matrix Y required to update the unreduced
part of A.
Application Notes
If m ≥ n, then for the elementary reflectors H(i) and G(i),
v(1:i-1) = 0, v(i) = 1, and v(i:m) is stored on exit in a(i:m, i); u(1:i) = 0, u(i+1) = 1, and
u(i+1:n) is stored on exit in a(i, i+1:n);
tauq is stored in tauq(i) and taup in taup(i).
If m < n,
v(1:i) = 0, v(i+1) = 1, and v(i+1:m) is stored on exit in a(i+2:m, i); u(1:i-1) = 0, u(i) = 1,
and u(i:n) is stored on exit in a(i, i+1:n); tauq is stored in tauq(i) and taup in taup(i).
The contents of a on exit are illustrated by the following examples with nb = 2:
where a denotes an element of the original matrix which is unchanged, vi denotes an element of the vector
defining H(i), and ui an element of the vector defining G(i).
?lacn2
Estimates the 1-norm of a square matrix, using
reverse communication for evaluating matrix-vector
products.
Syntax
call slacn2( n, v, x, isgn, est, kase, isave )
call dlacn2( n, v, x, isgn, est, kase, isave )
call clacn2( n, v, x, est, kase, isave )
call zlacn2( n, v, x, est, kase, isave )
Include Files
mkl.fi
Description
The routine estimates the 1-norm of a square, real or complex matrix A. Reverse communication is used for
evaluating matrix-vector products.
Input Parameters
v, x REAL for slacn2

DOUBLE PRECISION for dlacn2
COMPLEX for clacn2
DOUBLE COMPLEX for zlacn2.
Arrays, size (n) each.
v is a workspace array.
x is used as input after an intermediate return.
isgn INTEGER.
Workspace array, size (n), used with real flavors only.
est REAL for slacn2/clacn2

DOUBLE PRECISION for dlacn2/zlacn2
On entry with kase set to 1 or 2, and isave(1) = 1, est must be
unchanged from the previous call to the routine.
kase INTEGER.
On the initial call to the routine, kase must be set to 0.
isave INTEGER. Array, size (3).

Contains variables from the previous call to the routine.
Output Parameters
est An estimate (a lower bound) for norm(A).
kase On an intermediate return, kase is set to 1 or 2, indicating whether x is

overwritten by A*x or A^T*x for real flavors and A*x or A^H*x for complex
flavors.
On the final return, kase is set to 0.
v On the final return, v = A*w, where est = norm(v)/norm(w) (w is not

returned).
x On an intermediate return, x is overwritten by

A*x, if kase = 1,
A^T*x, if kase = 2 (for real flavors),
A^H*x, if kase = 2 (for complex flavors),
and the routine must be re-called with all the other parameters unchanged.
isave This parameter is used to save variables between calls to the routine.
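The reverse communication pattern (the estimator returns to the caller whenever it needs a matrix-vector product, and the caller re-enters it with the result) can be sketched in Python with a generator. The sketch below uses a simplified Hager-style iteration for real matrices and is an assumption made for illustration; it is not the algorithm as implemented in MKL, and the names norm1_estimator and estimate_norm1 are hypothetical.

```python
def norm1_estimator(n, max_iter=5):
    """Generator that yields (kase, x) and expects A*x (kase 1) or A^T*x (kase 2) back."""
    x = [1.0 / n] * n
    est = 0.0
    for _ in range(max_iter):
        y = yield (1, x)                       # caller supplies A*x
        est = sum(abs(v) for v in y)           # ||A*x||_1 with ||x||_1 = 1
        xi = [1.0 if v >= 0.0 else -1.0 for v in y]
        z = yield (2, xi)                      # caller supplies A^T*xi
        j = max(range(n), key=lambda k: abs(z[k]))
        if abs(z[j]) <= sum(zk * xk for zk, xk in zip(z, x)):
            break                              # no further increase expected
        x = [0.0] * n
        x[j] = 1.0                             # restart from the j-th unit vector
    return est


def estimate_norm1(a):
    """Driver: performs the products requested by the estimator."""
    n = len(a)
    gen = norm1_estimator(n)
    kase, x = next(gen)
    try:
        while True:
            if kase == 1:
                prod = [sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
            else:
                prod = [sum(a[i][j] * x[i] for i in range(n)) for j in range(n)]
            kase, x = gen.send(prod)
    except StopIteration as stop:
        return stop.value


print(estimate_norm1([[1.0, -2.0], [3.0, 4.0]]))
```

The estimator never sees the matrix itself, only the products the driver sends back, which is the property that lets the caller use any matrix representation. Keeping all state inside the generator plays the role that the isave array plays in the Fortran interface.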
?lacon
Estimates the 1-norm of a square matrix, using
reverse communication for evaluating matrix-vector
products.
Syntax
call slacon( n, v, x, isgn, est, kase )
call dlacon( n, v, x, isgn, est, kase )
call clacon( n, v, x, est, kase )
call zlacon( n, v, x, est, kase )
Include Files
mkl.fi
Description
The routine estimates the 1-norm of a square, real/complex matrix A. Reverse communication is used for
evaluating matrix-vector products.
WARNING
The ?lacon routine is not thread-safe. It is deprecated and retained for backward compatibility
only. Use the thread-safe ?lacn2 routine instead.
Input Parameters
v, x REAL for slacon

DOUBLE PRECISION for dlacon
COMPLEX for clacon
DOUBLE COMPLEX for zlacon.
Arrays, DIMENSION (n) each.
v is a workspace array.
x is used as input after an intermediate return.
isgn INTEGER.
Workspace array, DIMENSION (n), used with real flavors only.
est REAL for slacon/clacon

DOUBLE PRECISION for dlacon/zlacon
An estimate that with kase=1 or 2 should be unchanged from the previous
call to ?lacon.
kase INTEGER.
On the initial call to ?lacon, kase should be 0.
Output Parameters
est REAL for slacon/clacon

DOUBLE PRECISION for dlacon/zlacon
An estimate (a lower bound) for norm(A).
kase On an intermediate return, kase will be 1 or 2, indicating whether x should

be overwritten by A*x or A^T*x for real flavors and A*x or A^H*x for complex
flavors. On the final return from ?lacon, kase will again be 0.
v On the final return, v = A*w, where est = norm(v)/norm(w) (w is not

returned).
x On an intermediate return, x should be overwritten by

A*x, if kase = 1,
A^T*x, if kase = 2 (for real flavors),
A^H*x, if kase = 2 (for complex flavors),
and ?lacon must be re-called with all the other parameters unchanged.
?lacpy
Copies all or part of one two-dimensional array to
another.
Syntax
call slacpy( uplo, m, n, a, lda, b, ldb )
call dlacpy( uplo, m, n, a, lda, b, ldb )
call clacpy( uplo, m, n, a, lda, b, ldb )
call zlacpy( uplo, m, n, a, lda, b, ldb )
Include Files
mkl.fi
Description
The routine copies all or part of a two-dimensional matrix A to another matrix B.
Input Parameters
uplo CHARACTER*1.
Specifies the part of the matrix A to be copied to B.
If uplo = 'U', the upper triangular part of A;
if uplo = 'L', the lower triangular part of A.
Otherwise, all of the matrix A is copied.
a REAL for slacpy

DOUBLE PRECISION for dlacpy
COMPLEX for clacpy
DOUBLE COMPLEX for zlacpy.
Array a(lda,*), contains the m-by-n matrix A.
If uplo = 'U', only the upper triangle or trapezoid is accessed; if uplo = 'L',
only the lower triangle or trapezoid is accessed.
lda INTEGER. The leading dimension of a; lda ≥ max(1, m).
ldb INTEGER. The leading dimension of the output array b; ldb ≥ max(1, m).
Output Parameters
b REAL for slacpy

DOUBLE PRECISION for dlacpy
COMPLEX for clacpy
DOUBLE COMPLEX for zlacpy.
Array b(ldb,*), contains the m-by-n matrix B.
The second dimension of b must be at least max(1,n).
On exit, B = A in the locations specified by uplo.
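The effect of uplo can be sketched in Python on matrices stored as lists of rows. This helper only mimics the copy semantics described above (upper triangle or trapezoid, lower triangle or trapezoid, or the whole matrix); the name lacpy is reused for readability and the function is not a binding to MKL.

```python
def lacpy(uplo, a, b):
    """Copy all or a triangular part of the m-by-n matrix a into b, in place."""
    m, n = len(a), len(a[0])
    u = uplo.upper()
    for j in range(n):
        if u == "U":
            rows = range(0, min(j + 1, m))   # on and above the diagonal
        elif u == "L":
            rows = range(j, m)               # on and below the diagonal
        else:
            rows = range(m)                  # any other value: whole column
        for i in rows:
            b[i][j] = a[i][j]
    return b


a = [[1, 2, 3],
     [4, 5, 6]]
print(lacpy("U", a, [[0] * 3 for _ in range(2)]))
print(lacpy("L", a, [[0] * 3 for _ in range(2)]))
```

Entries of b outside the selected part keep whatever values they had before the call.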
?ladiv
Performs complex division in real arithmetic, avoiding
unnecessary overflow.
Syntax
call sladiv( a, b, c, d, p, q )
call dladiv( a, b, c, d, p, q )
res = cladiv( x, y )
res = zladiv( x, y )
Include Files
mkl.fi
Description
The routines sladiv/dladiv perform complex division in real arithmetic as
p + i*q = (a + i*b) / (c + i*d).
Complex functions cladiv/zladiv compute the result as
res = x/y,
where x and y are complex. The computation of x / y will not overflow on an intermediary step unless the
result overflows.
The algorithm used is due to [Baudin12].
Input Parameters
a, b, c, d REAL for sladiv

DOUBLE PRECISION for dladiv
The scalars a, b, c, and d in the above expression (for real flavors only).
x, y COMPLEX for cladiv

DOUBLE COMPLEX for zladiv
The complex scalars x and y (for complex flavors only).
Output Parameters
p, q REAL for sladiv

DOUBLE PRECISION for dladiv
The scalars p and q in the above expression (for real flavors only).
res COMPLEX for cladiv

DOUBLE COMPLEX for zladiv
Contains the result of division x / y.
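The idea of dividing complex numbers in real arithmetic without forming c*c + d*d can be sketched in Python with Smith's classical scaling. Note that this is a simpler technique than the algorithm the routine itself uses, and it is shown only to illustrate how scaling by the larger of |c| and |d| avoids intermediate overflow; the name ladiv_smith is hypothetical.

```python
def ladiv_smith(a, b, c, d):
    """Return (p, q) with p + i*q = (a + i*b) / (c + i*d), using Smith's scaling."""
    if abs(d) < abs(c):
        e = d / c
        f = c + d * e            # (c*c + d*d) / c without squaring c or d
        p = (a + b * e) / f
        q = (b - a * e) / f
    else:
        e = c / d
        f = d + c * e            # (c*c + d*d) / d without squaring c or d
        p = (b + a * e) / f
        q = (-a + b * e) / f
    return p, q


print(ladiv_smith(1.0, 2.0, 3.0, 4.0))
print(ladiv_smith(1e200, 1e200, 1e200, 1e200))
```

In the second call the textbook formula would need c*c + d*d, which overflows in double precision, while the scaled form works with ratios of modest size.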
?lae2
Computes the eigenvalues of a 2-by-2 symmetric
matrix.
Syntax
call slae2( a, b, c, rt1, rt2 )
call dlae2( a, b, c, rt1, rt2 )
Include Files
mkl.fi
Description
The routines slae2/dlae2 compute the eigenvalues of a 2-by-2 symmetric matrix
[ a b ]
[ b c ].
On return, rt1 is the eigenvalue of larger absolute value, and rt2 is the eigenvalue of smaller absolute value.
Input Parameters
a, b, c REAL for slae2

DOUBLE PRECISION for dlae2
The elements a, b, and c of the 2-by-2 matrix above.
Output Parameters
rt1, rt2 REAL for slae2
DOUBLE PRECISION for dlae2
The computed eigenvalues of larger and smaller absolute value,
respectively.
Application Notes
rt1 is accurate to a few ulps barring over/underflow. rt2 may be inaccurate if there is massive cancellation in
the determinant a*c-b*b; higher precision or correctly rounded or correctly truncated arithmetic would be
needed to compute rt2 accurately in all cases.
Overflow is possible only if rt1 is within a factor of 5 of overflow. Underflow is harmless if the input data is 0
or exceeds
underflow_threshold / macheps.
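The accuracy remarks above can be made concrete with a short Python sketch of a 2-by-2 symmetric eigenvalue computation. It follows the scaling idea commonly used in reference LAPACK implementations (form rt1 first, then obtain rt2 from the determinant divided by rt1); it is an illustration under that assumption, not the MKL code, and the name lae2_sketch is hypothetical.

```python
import math


def lae2_sketch(a, b, c):
    """Eigenvalues (rt1, rt2) of [[a, b], [b, c]] with |rt1| >= |rt2|."""
    sm = a + c
    adf = abs(a - c)
    ab = abs(b + b)
    acmx, acmn = (a, c) if abs(a) > abs(c) else (c, a)
    # rt = sqrt(adf**2 + ab**2), formed without squaring the larger term
    if adf > ab:
        rt = adf * math.sqrt(1.0 + (ab / adf) ** 2)
    elif adf < ab:
        rt = ab * math.sqrt(1.0 + (adf / ab) ** 2)
    else:
        rt = ab * math.sqrt(2.0)
    if sm == 0.0:
        return 0.5 * rt, -0.5 * rt
    rt1 = 0.5 * (sm - rt) if sm < 0.0 else 0.5 * (sm + rt)
    # rt2 = (a*c - b*b) / rt1; this is where cancellation can hurt accuracy
    rt2 = (acmx / rt1) * acmn - (b / rt1) * b
    return rt1, rt2
```

For a = c = 2 and b = 1 the sketch returns approximately (3, 1).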
?laebz
Computes the number of eigenvalues of a real
symmetric tridiagonal matrix which are less than or
equal to a given value, and performs other tasks
required by the routine ?stebz.
Syntax
call slaebz( ijob, nitmax, n, mmax, minp, nbmin, abstol, reltol, pivmin, d, e, e2,
nval, ab, c, mout, nab, work, iwork, info )
call dlaebz( ijob, nitmax, n, mmax, minp, nbmin, abstol, reltol, pivmin, d, e, e2,
nval, ab, c, mout, nab, work, iwork, info )
Include Files
mkl.fi
Description
The routine ?laebz contains the iteration loops which compute and use the function n(w), which is the count
of eigenvalues of a symmetric tridiagonal matrix T less than or equal to its argument w. It performs a choice
of two types of loops:
ijob = 1, followed by
ijob = 2: It takes as input a list of intervals and returns a list of sufficiently small
intervals whose union contains the same eigenvalues as the union of the original
intervals. The input intervals are (ab(j,1),ab(j,2)], j=1,...,minp. The
output interval (ab(j,1),ab(j,2)] will contain eigenvalues
nab(j,1)+1,...,nab(j,2), where 1 ≤ j ≤ mout.
ijob = 3: It performs a binary search in each input interval (ab(j,1),ab(j,2)] for a
point w(j) such that n(w(j))=nval(j), and uses c(j) as the starting point of the
search. If such a w(j) is found, then on output ab(j,1)=ab(j,2)=w(j). If no such
w(j) is found, then on output (ab(j,1),ab(j,2)] will be a small interval
containing the point where n(w) jumps through nval(j), unless that point lies
outside the initial interval.
Note that the intervals are in all cases half-open intervals, that is, of the form (a,b], which includes b but
not a.
To avoid underflow, the matrix should be scaled so that its largest element is no greater than
overflow^(1/2) * underflow^(1/4) in absolute value. To assure the most accurate computation of small eigenvalues, the matrix
should be scaled to be not much smaller than that, either.
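The counting function n(w) can be sketched in Python as a Sturm sequence recurrence on the diagonal d and the squared off-diagonal elements e2, with small pivots replaced by -pivmin. This is a simplified illustration of the idea, not the MKL implementation; the name sturm_count is hypothetical and the arrays are 0-based.

```python
def sturm_count(d, e2, w, pivmin):
    """Number of eigenvalues of the tridiagonal matrix T that are <= w.

    d  : diagonal elements of T
    e2 : squares of the off-diagonal elements (len(d) - 1 entries are used)
    """
    count = 0
    t = d[0] - w
    for j in range(len(d)):
        if j > 0:
            t = d[j] - e2[j - 1] / t - w
        if abs(t) < pivmin:
            t = -pivmin  # keep the pivot away from zero
        if t <= 0.0:
            count += 1
    return count
```

For d = (2, 2, 2) and off-diagonal elements -1 the eigenvalues are 2 - sqrt(2), 2, and 2 + sqrt(2), so the count is 1 at w = 1 and 2 at w = 2.5.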
NOTE
In general, the arguments are not checked for unreasonable values.
Input Parameters
ijob INTEGER. Specifies what is to be done:

= 1: Compute nab for the initial intervals.
= 2: Perform bisection iteration to find eigenvalues of T.
= 3: Perform bisection iteration to invert n(w), i.e., to find a point which
has a specified number of eigenvalues of T to its left. Other values will
cause ?laebz to return with info=-1.
nitmax INTEGER. The maximum number of "levels" of bisection to be performed,

i.e., an interval of width W will not be made smaller than 2^(-nitmax)*W. If
not all intervals have converged after nitmax iterations, then info is set to
the number of non-converged intervals.
n INTEGER. The dimension n of the tridiagonal matrix T. It must be at least 1.
mmax INTEGER. The maximum number of intervals. If more than mmax intervals
are generated, then ?laebz will quit with info=mmax+1.
minp INTEGER. The initial number of intervals. It may not be greater than mmax.
nbmin INTEGER. The smallest number of intervals that should be processed using
a vector loop. If zero, then only the scalar loop will be used.
abstol REAL for slaebz

DOUBLE PRECISION for dlaebz.
The minimum (absolute) width of an interval. When an interval is narrower
than abstol, or than reltol times the larger (in magnitude) endpoint, then it
is considered to be sufficiently small, i.e., converged. This must be at least
zero.
reltol REAL for slaebz

DOUBLE PRECISION for dlaebz.
The minimum relative width of an interval. When an interval is narrower
than abstol, or than reltol times the larger (in magnitude) endpoint, then it
is considered to be sufficiently small, i.e., converged. Note: this should
always be at least radix*machine epsilon.
pivmin REAL for slaebz

DOUBLE PRECISION for dlaebz.
The minimum absolute value of a "pivot" in the Sturm sequence loop. This
value must be at least (max |e(j)**2|*safe_min) and at least safe_min,
where safe_min is at least the smallest number that can divide one without
overflow.
d, e, e2 REAL for slaebz

DOUBLE PRECISION for dlaebz.
Arrays, dimension (n) each. The array d contains the diagonal elements of
the tridiagonal matrix T.
The array e contains the off-diagonal elements of the tridiagonal matrix T in
positions 1 through n-1. e(n) is arbitrary.
The array e2 contains the squares of the off-diagonal elements of the
tridiagonal matrix T. e2(n) is ignored.
nval INTEGER.
Array, dimension (minp).
If ijob=1 or 2, not referenced.
If ijob=3, the desired values of n(w).
ab REAL for slaebz

DOUBLE PRECISION for dlaebz.
Array, dimension (mmax,2). The endpoints of the intervals. ab(j,1) is a(j),
the left endpoint of the j-th interval, and ab(j,2) is b(j), the right endpoint
of the j-th interval.
c REAL for slaebz

DOUBLE PRECISION for dlaebz.
Array, dimension (mmax)
If ijob=1, ignored.
If ijob=2, workspace.
If ijob=3, then on input c(j) should be initialized to the first search point in
the binary search.
nab INTEGER.
Array, dimension (mmax,2)
If ijob=2, then on input, nab(i,j) should be set. It must satisfy the
condition:
n(ab(i,1)) ≤ nab(i,1) ≤ nab(i,2) ≤ n(ab(i,2)), which means that in
interval i only eigenvalues nab(i,1)+1,...,nab(i,2) are considered.
Usually, nab(i,j)=n(ab(i,j)), from a previous call to ?laebz with
ijob=1.
If ijob=3, normally, nab should be set to some distinctive value(s) before
?laebz is called.
work REAL for slaebz

DOUBLE PRECISION for dlaebz.
Workspace array, dimension (mmax).
iwork INTEGER.
Workspace array, dimension (mmax).
Output Parameters
nval The elements of nval will be reordered to correspond with the intervals in
ab. Thus, nval(j) on output will not, in general be the same as nval(j) on
input, but it will correspond with the interval (ab(j,1),ab(j,2)] on
output.
ab The input intervals will, in general, be modified, split, and reordered by the
calculation.
mout INTEGER.
If ijob=1, the number of eigenvalues in the intervals.
If ijob=2 or 3, the number of intervals output.
If ijob=3, mout will equal minp.
nab If ijob=1, then on output nab(i,j) will be set to N(ab(i,j)).

If ijob=2, then on output, nab(i,j) will contain max(na(k), min(nb(k),
N(ab(i,j)))), where k is the index of the input interval that the output
interval (ab(j,1),ab(j,2)] came from, and na(k) and nb(k) are the input
values of nab(k,1) and nab(k,2).
If ijob=3, then on output, nab(i,j) contains N(ab(i,j)), unless N(w) > nval(i)
for all search points w, in which case nab(i,1) will not be modified, i.e., the
output value will be the same as the input value (modulo reorderings, see
nval and ab), or unless N(w) < nval(i) for all search points w, in which case
nab(i,2) will not be modified.
info INTEGER.
If info = 0, all intervals converged.
If info = 1, ..., mmax, the last info intervals did not converge.
If info = mmax+1, more than mmax intervals were generated.
Application Notes
This routine is intended to be called only by other LAPACK routines, thus the interface is less user-friendly. It
is intended for two purposes:
(a) finding eigenvalues. In this case, ?laebz should have one or more initial intervals set up in ab, and
?laebz should be called with ijob=1. This sets up nab, and also counts the eigenvalues. Intervals with no
eigenvalues would usually be thrown out at this point. Also, if not all the eigenvalues in an interval i are
desired, nab(i,1) can be increased or nab(i,2) decreased. For example, set nab(i,1)=nab(i,2)-1 to get the
largest eigenvalue. ?laebz is then called with ijob=2 and mmax no smaller than the value of mout returned
by the call with ijob=1. After this (ijob=2) call, eigenvalues nab(i,1)+1 through nab(i,2) are approximately
ab(i,1) (or ab(i,2)) to the tolerance specified by abstol and reltol.
(b) finding an interval (a',b'] containing eigenvalues w(f),...,w(l). In this case, start with a Gershgorin
interval (a,b). Set up ab to contain 2 search intervals, both initially (a,b). One nval element should contain
f-1 and the other should contain l, while c should contain a and b, respectively. nab(i,1) should be -1 and
nab(i,2) should be n+1, to flag an error if the desired interval does not lie in (a,b). ?laebz is then called with
ijob=3. On exit, if w(f-1) < w(f), then one of the intervals -- j -- will have ab(j,1)=ab(j,2) and
nab(j,1)=nab(j,2)=f-1, while if, to the specified tolerance, w(f-k)=...=w(f+r), k > 0 and r ≥ 0, then the
interval will have n(ab(j,1))=nab(j,1)=f-k and n(ab(j,2))=nab(j,2)=f+r. The cases w(l) < w(l+1)
and w(l-r)=...=w(l+k) are handled similarly.
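Purpose (a) can be sketched in Python as plain bisection on a Gershgorin interval, driven by a count of the eigenvalues less than or equal to the midpoint. The sketch handles a single eigenvalue and a single interval, uses only an absolute tolerance, and does not reproduce the interval bookkeeping (ab, nab, mout) of ?laebz; the names count_le and kth_eigenvalue are hypothetical.

```python
def count_le(d, e2, w, pivmin=1e-300):
    """Count of eigenvalues <= w (Sturm sequence with a pivot floor)."""
    count, t = 0, 1.0
    for j in range(len(d)):
        t = d[j] - w - (e2[j - 1] / t if j > 0 else 0.0)
        if abs(t) < pivmin:
            t = -pivmin
        if t <= 0.0:
            count += 1
    return count


def kth_eigenvalue(d, e, k, abstol=1e-12, nitmax=200):
    """k-th smallest eigenvalue (k = 1..n) of the tridiagonal matrix (d, e)."""
    n = len(d)
    e2 = [x * x for x in e]
    off = [0.0] + [abs(x) for x in e] + [0.0]
    # Gershgorin bounds enclose every eigenvalue
    lo = min(d[j] - off[j] - off[j + 1] for j in range(n))
    hi = max(d[j] + off[j] + off[j + 1] for j in range(n))
    for _ in range(nitmax):
        if hi - lo <= abstol:
            break
        mid = 0.5 * (lo + hi)
        if count_le(d, e2, mid) >= k:
            hi = mid  # eigenvalue k lies in (lo, mid]
        else:
            lo = mid  # eigenvalue k lies in (mid, hi]
    return 0.5 * (lo + hi)
```

Each pass halves the half-open interval (lo, hi], which mirrors the way nitmax bounds the number of bisection levels.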
?laed0
Used by ?stedc. Computes all eigenvalues and
corresponding eigenvectors of an unreduced
symmetric tridiagonal matrix using the divide and
conquer method.
Syntax
call slaed0( icompq, qsiz, n, d, e, q, ldq, qstore, ldqs, work, iwork, info )
call dlaed0( icompq, qsiz, n, d, e, q, ldq, qstore, ldqs, work, iwork, info )
call claed0( qsiz, n, d, e, q, ldq, qstore, ldqs, rwork, iwork, info )
call zlaed0( qsiz, n, d, e, q, ldq, qstore, ldqs, rwork, iwork, info )
Include Files
mkl.fi
Description
Real flavors of this routine compute all eigenvalues and (optionally) corresponding eigenvectors of a
symmetric tridiagonal matrix using the divide and conquer method.
Complex flavors claed0/zlaed0 compute all eigenvalues of a symmetric tridiagonal matrix which is one
diagonal block of those from reducing a dense or band Hermitian matrix and corresponding eigenvectors of
the dense or band matrix.
Input Parameters
icompq INTEGER. Used with real flavors only.

If icompq = 0, compute eigenvalues only.
If icompq = 1, compute eigenvectors of original dense symmetric matrix
also.
On entry, the array q must contain the orthogonal matrix used to reduce
the original matrix to tridiagonal form.
If icompq = 2, compute eigenvalues and eigenvectors of the tridiagonal
matrix.
qsiz INTEGER.
The dimension of the orthogonal/unitary matrix used to reduce the full
matrix to tridiagonal form; qsiz ≥ n (for real flavors, qsiz ≥ n if icompq = 1).
n INTEGER. The dimension of the symmetric tridiagonal matrix (n ≥ 0).
d, e, rwork REAL for single-precision flavors

DOUBLE PRECISION for double-precision flavors. Arrays:
d(*) contains the main diagonal of the tridiagonal matrix. The dimension of
d must be at least max(1, n).
e(*) contains the off-diagonal elements of the tridiagonal matrix. The
dimension of e must be at least max(1, n-1).
rwork(*) is a workspace array used in complex flavors only. The dimension
of rwork must be at least (1 + 3n + 2n*log2(n) + 3n^2), where log2(n) =
smallest integer k such that 2^k ≥ n.
q, qstore REAL for slaed0

DOUBLE PRECISION for dlaed0
COMPLEX for claed0
DOUBLE COMPLEX for zlaed0.
Arrays: q(ldq, *), qstore(ldqs, *). The second dimension of these arrays
must be at least max(1, n).
For real flavors:

If icompq = 0, array q is not referenced.
If icompq = 1, on entry, q is a subset of the columns of the orthogonal
matrix used to reduce the full matrix to tridiagonal form corresponding to
the subset of the full matrix which is being decomposed at this time.
If icompq = 2, on entry, q will be the identity matrix. The array qstore is a
workspace array referenced only when icompq = 1. It is used to store parts of
the eigenvector matrix when the updating matrix multiplies take place.
For complex flavors:
On entry, q must contain a qsiz-by-n matrix whose columns are unitarily
orthonormal. It is a part of the unitary matrix that reduces the full dense
Hermitian matrix to a (reducible) symmetric tridiagonal matrix. The array
qstore is a workspace array used to store parts of the eigenvector matrix
when the updating matrix multiplies take place.
ldq INTEGER. The leading dimension of the array q; ldq ≥ max(1, n).
ldqs INTEGER. The leading dimension of the array qstore; ldqs ≥ max(1, n).
work REAL for slaed0

DOUBLE PRECISION for dlaed0.
Workspace array, used in real flavors only.
If icompq = 0 or 1, the dimension of work must be at least (1 + 3n
+ 2n*log2(n) + 3n^2), where log2(n) = smallest integer k such that
2^k ≥ n.
If icompq = 2, the dimension of work must be at least (4n + n^2).
iwork INTEGER.
Workspace array.
For real flavors, if icompq = 0 or 1, and for complex flavors, the dimension
of iwork must be at least (6 + 6n + 5n*log2(n)).
For real flavors, if icompq = 2, the dimension of iwork must be at least
(3+5n).
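The workspace requirements for the real flavors can be evaluated with a small Python helper. It encodes the minimum sizes 1 + 3n + 2n*lg(n) + 3n^2 for work and 6 + 6n + 5n*lg(n) for iwork when icompq is 0 or 1, and 4n + n^2 and 3 + 5n when icompq is 2, where lg(n) is the smallest integer k such that 2^k ≥ n. The names lg and laed0_workspace are hypothetical.

```python
def lg(n):
    """Smallest integer k such that 2**k >= n."""
    k = 0
    while 2 ** k < n:
        k += 1
    return k


def laed0_workspace(n, icompq):
    """Minimum (work, iwork) lengths for slaed0/dlaed0."""
    if icompq in (0, 1):
        return 1 + 3 * n + 2 * n * lg(n) + 3 * n * n, 6 + 6 * n + 5 * n * lg(n)
    if icompq == 2:
        return 4 * n + n * n, 3 + 5 * n
    raise ValueError("icompq must be 0, 1, or 2")
```

For n = 8 and icompq = 1 this gives 265 elements for work and 174 for iwork.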
Output Parameters
d On exit, contains eigenvalues in ascending order.
e On exit, the array is destroyed.
q If icompq = 2, on exit, q contains the eigenvectors of the tridiagonal

matrix.
info INTEGER.
If info = i >0, the algorithm failed to compute an eigenvalue while

working on the submatrix lying in rows and columns i/(n+1) through
mod(i, n+1).
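A positive info value packs two indices into one integer. A minimal Python sketch of the decoding, assuming the integer division and mod(i, n+1) convention stated above (the name failing_submatrix is hypothetical):

```python
def failing_submatrix(info, n):
    """Return (first, last) rows/columns of the submatrix named by info > 0."""
    first, last = divmod(info, n + 1)
    return first, last
```

For example, with n = 10, info = 40 refers to rows and columns 3 through 7.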
?laed1
Used by sstedc/dstedc. Computes the updated
eigensystem of a diagonal matrix after modification by
a rank-one symmetric matrix. Used when the original
matrix is tridiagonal.
Syntax
call slaed1( n, d, q, ldq, indxq, rho, cutpnt, work, iwork, info )
call dlaed1( n, d, q, ldq, indxq, rho, cutpnt, work, iwork, info )
Include Files
mkl.fi
Description
The routine ?laed1 computes the updated eigensystem of a diagonal matrix after modification by a rank-one
symmetric matrix. This routine is used only for the eigenproblem which requires all eigenvalues and
eigenvectors of a tridiagonal matrix. ?laed7 handles the case in which eigenvalues only or eigenvalues and
eigenvectors of a full symmetric matrix (which was reduced to tridiagonal form) are desired.
T = Q(in)*(D(in) + rho*Z*Z^T)*Q^T(in) = Q(out)*D(out)*Q^T(out)
where Z = Q^T*u, u is a vector of length n with ones in the cutpnt and (cutpnt+1)-th elements and zeros
elsewhere. The eigenvectors of the original matrix are stored in Q, and the eigenvalues are in D. The
algorithm consists of three stages:
The first stage consists of deflating the size of the problem when there are multiple eigenvalues or if there is
a zero in the z vector. For each such occurrence the dimension of the secular equation problem is reduced by
one. This stage is performed by the routine ?laed2.
The second stage consists of calculating the updated eigenvalues. This is done by finding the roots of the
secular equation via the routine ?laed4 (as called by ?laed3). This routine also calculates the eigenvectors
of the current problem.
The final stage consists of computing the updated eigenvectors directly using the updated eigenvalues. The
eigenvectors for the current problem are multiplied with the eigenvectors from the overall problem.
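The rank-one structure behind this routine can be checked with a small Python sketch: a tridiagonal matrix is written as two independent diagonal blocks plus rho*u*u^T, where u has ones in positions cutpnt and cutpnt+1. The sketch takes rho equal to the off-diagonal element at the cut, which is a simplification (an actual implementation may treat the sign of that element differently); the names tear and dense are hypothetical.

```python
def tear(d, e, cutpnt):
    """Split the tridiagonal matrix (d, e) after row cutpnt (1-based).

    Returns (d2, e2, rho, u) such that T = T_blocks + rho * u * u^T, where
    T_blocks is the tridiagonal matrix (d2, e2) with a zero at the cut.
    """
    rho = e[cutpnt - 1]
    u = [0] * len(d)
    u[cutpnt - 1] = u[cutpnt] = 1
    d2 = list(d)
    d2[cutpnt - 1] -= rho  # remove the rank-one contribution from the
    d2[cutpnt] -= rho      # two diagonal entries next to the cut
    e2 = list(e)
    e2[cutpnt - 1] = 0     # decouple the two blocks
    return d2, e2, rho, u


def dense(d, e):
    """Dense n-by-n form of the symmetric tridiagonal matrix (d, e)."""
    n = len(d)
    t = [[0] * n for _ in range(n)]
    for i in range(n):
        t[i][i] = d[i]
    for i in range(n - 1):
        t[i][i + 1] = t[i + 1][i] = e[i]
    return t
```

The eigensystems of the two blocks supply Q(in) and D(in); the routine then updates them for the rank-one term.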
Input Parameters
d, q, work REAL for slaed1

DOUBLE PRECISION for dlaed1.
Arrays:
d(*) contains the eigenvalues of the rank-1-perturbed matrix. The
dimension of d must be at least max(1, n).
q(ldq, *) contains the eigenvectors of the rank-1-perturbed matrix. The
second dimension of q must be at least max(1, n).
work(*) is a workspace array, dimension at least (4n + n^2).
indxq INTEGER. Array, dimension (n).

On entry, the permutation which separately sorts the two subproblems in d
into ascending order.
rho REAL for slaed1

DOUBLE PRECISION for dlaed1.
The subdiagonal entry used to create the rank-1 modification. This
parameter can be modified by ?laed2, where it is input/output.
cutpnt INTEGER.
The location of the last eigenvalue in the leading sub-matrix. min(1,n) ≤
cutpnt ≤ n/2.
iwork INTEGER.
Workspace array, dimension (4n).
Output Parameters
d On exit, contains the eigenvalues of the repaired matrix.
q On exit, q contains the eigenvectors of the repaired tridiagonal matrix.
indxq On exit, contains the permutation which will reintegrate the subproblems
back into sorted order, that is, d( indxq(i = 1, n )) will be in ascending
order.
info INTEGER.
If info = -i, the i-th parameter had an illegal value. If info = 1, an

eigenvalue did not converge.
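The role of indxq can be illustrated with a Python sketch that merges two runs of d, each already in ascending order, into one 1-based permutation such that d(indxq(i)) is ascending. It is an illustration of the idea only; the name merge_permutation and the argument split (the length of the first run) are hypothetical.

```python
def merge_permutation(d, split):
    """1-based permutation that sorts d, given d[:split] and d[split:] ascending."""
    i, j, n = 0, split, len(d)
    indxq = []
    while i < split and j < n:
        if d[i] <= d[j]:
            indxq.append(i + 1)
            i += 1
        else:
            indxq.append(j + 1)
            j += 1
    indxq.extend(range(i + 1, split + 1))  # rest of the first run
    indxq.extend(range(j + 1, n + 1))      # rest of the second run
    return indxq
```

For d = (1, 4, 6, 2, 3, 7) with a first run of length 3, the permutation is (1, 4, 5, 2, 3, 6).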
?laed2
Used by sstedc/dstedc. Merges eigenvalues and
deflates secular equation. Used when the original
matrix is tridiagonal.
Syntax
call slaed2( k, n, n1, d, q, ldq, indxq, rho, z, dlamda, w, q2, indx, indxc, indxp,
coltyp, info )
call dlaed2( k, n, n1, d, q, ldq, indxq, rho, z, dlamda, w, q2, indx, indxc, indxp,
coltyp, info )
Include Files
mkl.fi
Description
The routine ?laed2 merges the two sets of eigenvalues together into a single sorted set. Then it tries to
deflate the size of the problem. There are two ways in which deflation can occur: when two or more
eigenvalues are close together or if there is a tiny entry in the z vector. For each such occurrence the order of
the related secular equation problem is reduced by one.
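The first kind of deflation, a tiny entry in the z vector, can be sketched in Python as follows. The tolerance 8*eps*max(max|d(j)|, max|z(j)|) is an assumption modeled on reference LAPACK implementations, and the second kind of deflation (close eigenvalues, which involves a plane rotation) is not shown. The name split_deflated is hypothetical and indices are 0-based.

```python
import sys


def split_deflated(d, z, rho):
    """Return (kept, deflated) index lists; an entry deflates when rho*|z(j)| is negligible."""
    eps = sys.float_info.epsilon
    tol = 8.0 * eps * max(max(abs(x) for x in d), max(abs(x) for x in z))
    kept, deflated = [], []
    for j in range(len(d)):
        if rho * abs(z[j]) <= tol:
            deflated.append(j)  # d[j] is accepted as an eigenvalue unchanged
        else:
            kept.append(j)      # stays in the secular equation
    return kept, deflated
```

The length of kept corresponds to k, the order of the remaining secular equation.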
Input Parameters
k INTEGER. The number of non-deflated eigenvalues, and the order of the
related secular equation (0 ≤ k ≤ n).
n1 INTEGER. The location of the last eigenvalue in the leading sub-matrix;
min(1,n) ≤ n1 ≤ n/2.
d, q, z REAL for slaed2

DOUBLE PRECISION for dlaed2.
Arrays:
d(*) contains the eigenvalues of the two submatrices to be combined. The
dimension of d must be at least max(1, n).
q(ldq, *) contains the eigenvectors of the two submatrices in the two
square blocks with corners at (1,1), (n1,n1) and (n1+1,n1+1), (n,n). The
second dimension of q must be at least max(1, n).
z(*) contains the updating vector (the last row of the first sub-eigenvector
matrix and the first row of the second sub-eigenvector matrix).
indxq INTEGER. Array, dimension (n).
On entry, the permutation which separately sorts the two subproblems in d
into ascending order. Note that elements in the second half of this
permutation must first have n1 added to their values.
rho REAL for slaed2

DOUBLE PRECISION for dlaed2.
On entry, the off-diagonal element associated with the rank-1 cut which
originally split the two submatrices which are now being recombined.
indx, indxp INTEGER.

Workspace arrays, dimension (n) each. Array indx contains the permutation
used to sort the contents of dlamda into ascending order.
Array indxp contains the permutation used to place deflated values of d at
the end of the array.
indxp(1:k) points to the nondeflated d-values and indxp(k+1:n) points to
the deflated eigenvalues.
coltyp INTEGER.
Workspace array, dimension (n).
During execution, a label which will indicate which of the following types a
column in the q2 matrix is:
1 : non-zero in the upper half only;
2 : dense;
3 : non-zero in the lower half only;
4 : deflated.
Output Parameters
d On exit, d contains the trailing (n-k) updated eigenvalues (those which were
deflated) sorted into increasing order.
q On exit, q contains the trailing (n-k) updated eigenvectors (those which

were deflated) in its last n-k columns.
z On exit, z content is destroyed by the updating process.
indxq Destroyed on exit.
rho On exit, rho has been modified to the value required by ?laed3.
dlamda, w, q2 REAL for slaed2

DOUBLE PRECISION for dlaed2.
Arrays: dlamda(n), w(n), q2(n1^2 + (n-n1)^2).
The array dlamda contains a copy of the first k eigenvalues which is used
by ?laed3 to form the secular equation.
The array w contains the first k values of the final deflation-altered z-vector
which is passed to ?laed3.
The array q2 contains a copy of the first k eigenvectors which is used by
?laed3 in a matrix multiply (sgemm/dgemm) to solve for the new
eigenvectors.
indxc INTEGER. Array, dimension (n).
The permutation used to arrange the columns of the deflated q matrix into
three groups: the first group contains non-zero elements only at and above
n1, the second contains non-zero elements only below n1, and the third is
dense.
coltyp On exit, coltyp(i) is the number of columns of type i, for i=1 to 4 only (see
the definition of types in the description of coltyp in Input Parameters).
info INTEGER.
?laed3
Used by sstedc/dstedc. Finds the roots of the
secular equation and updates the eigenvectors. Used
when the original matrix is tridiagonal.
Syntax
call slaed3( k, n, n1, d, q, ldq, rho, dlamda, q2, indx, ctot, w, s, info )
call dlaed3( k, n, n1, d, q, ldq, rho, dlamda, q2, indx, ctot, w, s, info )
Include Files
mkl.fi
Description
The routine ?laed3 finds the roots of the secular equation, as defined by the values in d, w, and rho,
between 1 and k.
It makes the appropriate calls to ?laed4 and then updates the eigenvectors by multiplying the matrix of
eigenvectors of the pair of eigensystems being combined by the matrix of eigenvectors of the k-by-k system
which is solved here.
This code makes very mild assumptions about floating point arithmetic. It will work on machines with a guard
digit in add/subtract, or on those binary machines without guard digits which subtract like the Cray X-MP,
Cray Y-MP, Cray C-90, or Cray-2. It could conceivably fail on hexadecimal or decimal machines without guard
digits, but none are known.
Input Parameters
k INTEGER. The number of terms in the rational function to be solved by
?laed4 (k ≥ 0).
n INTEGER. The number of rows and columns in the q matrix. n ≥ k (deflation
may result in n > k).
n1 INTEGER. The location of the last eigenvalue in the leading sub-matrix;
min(1,n) ≤ n1 ≤ n/2.
q REAL for slaed3

DOUBLE PRECISION for dlaed3.
Array q(ldq, *). The second dimension of q must be at least max(1, n).
Initially, the first k columns of this array are used as workspace.
rho REAL for slaed3

DOUBLE PRECISION for dlaed3.
The value of the parameter in the rank one update equation. rho ≥ 0
required.
dlamda, q2, w REAL for slaed3

DOUBLE PRECISION for dlaed3.
Arrays: dlamda(k), q2(ldq2, *), w(k).
The first k elements of the array dlamda contain the old roots of the
deflated updating problem. These are the poles of the secular equation.
The first k columns of the array q2 contain the non-deflated eigenvectors
for the split problem. The second dimension of q2 must be at least max(1,
n).
The first k elements of the array w contain the components of the deflation-
adjusted updating vector.
indx INTEGER. Array, dimension (n).

The permutation used to arrange the columns of the deflated q matrix into
three groups (see ?laed2).
The rows of the eigenvectors found by ?laed4 must be likewise permuted

before the matrix multiply can take place.
ctot INTEGER. Array, dimension (4).

A count of the total number of the various types of columns in q, as
described in indx. The fourth column type is any column which has been
deflated.
s REAL for slaed3

DOUBLE PRECISION for dlaed3.
Workspace array, dimension (n1+1)*k .
Will contain the eigenvectors of the repaired matrix which will be multiplied
by the previously accumulated eigenvectors to update the system.
Output Parameters
d REAL for slaed3

DOUBLE PRECISION for dlaed3.
Array, dimension at least max(1, n).
d(i) contains the updated eigenvalues for 1 ≤ i ≤ k.
q On exit, the columns 1 to k of q contain the updated eigenvectors.
dlamda May be changed on output by having lowest order bit set to zero on Cray X-MP,
Cray Y-MP, Cray-2, or Cray C-90, as described above.
w Destroyed on exit.
info INTEGER.
If info = 1, an eigenvalue did not converge.
?laed4
Used by sstedc/dstedc. Finds a single root of the
secular equation.
Syntax
call slaed4( n, i, d, z, delta, rho, dlam, info )
call dlaed4( n, i, d, z, delta, rho, dlam, info )
Include Files
mkl.fi
Description
This routine computes the i-th updated eigenvalue of a symmetric rank-one modification to a diagonal matrix
whose elements are given in the array d. It is assumed that
D(i) < D(j) for i < j
and that rho > 0. This is arranged by the calling routine, and is no loss in generality. The rank-one modified
system is thus
diag(D) + rho*Z * transpose(Z).
where we assume the Euclidean norm of Z is 1.
The method consists of approximating the rational functions in the secular equation by simpler interpolating
rational functions.
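The secular equation for diag(D) + rho*Z*transpose(Z) can be written as f(lambda) = 1 + rho * sum_j z(j)^2 / (d(j) - lambda) = 0. With d ascending, rho > 0, and all z(j) nonzero, the i-th root lies between d(i) and d(i+1), and the last root lies between d(n) and d(n) + rho*||z||^2. The Python sketch below finds a root by plain bisection on that interval; this replaces the rational interpolation used by the routine with a much simpler technique, purely for illustration. The names secular and updated_eigenvalue are hypothetical.

```python
def secular(d, z, rho, lam):
    """f(lam) = 1 + rho * sum_j z_j**2 / (d_j - lam)."""
    return 1.0 + rho * sum(zj * zj / (dj - lam) for dj, zj in zip(d, z))


def updated_eigenvalue(i, d, z, rho, iters=200):
    """i-th (1-based) eigenvalue of diag(d) + rho*z*z^T and delta(j) = d(j) - lambda."""
    lo = d[i - 1]
    hi = d[i] if i < len(d) else d[-1] + rho * sum(zj * zj for zj in z)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:
            break  # interval cannot be split further
        if secular(d, z, rho, mid) < 0.0:
            lo = mid  # f increases on the interval, so the root is to the right
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return lam, [dj - lam for dj in d]
```

For d = (1, 3), z = (1/sqrt(2), 1/sqrt(2)) and rho = 2, the modified matrix is [[2, 1], [1, 4]] and the sketch returns 3 - sqrt(2) and 3 + sqrt(2).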
Input Parameters
n INTEGER. The length of all arrays.
i INTEGER. The index of the eigenvalue to be computed; 1 ≤ i ≤ n.
d, z REAL for slaed4

DOUBLE PRECISION for dlaed4.
Arrays, dimension (n) each. The array d contains the original eigenvalues. It
is assumed that they are in order, d(i) < d(j) for i < j.
The array z contains the components of the updating vector Z.
rho REAL for slaed4

DOUBLE PRECISION for dlaed4.
The scalar in the symmetric updating formula.
Output Parameters
delta REAL for slaed4

DOUBLE PRECISION for dlaed4.
Array, dimension (n).
If n ≠ 1, delta contains (d(j) - lambda_i) in its j-th component. If n = 1,
then delta(1) = 1. The vector delta contains the information necessary to
construct the eigenvectors.
dlam REAL for slaed4

DOUBLE PRECISION for dlaed4.
The computed lambda_i, the i-th updated eigenvalue.
info INTEGER.
If info = 1, the updating process failed.
?laed5
Used by sstedc/dstedc. Solves the 2-by-2 secular
equation.
Syntax
call slaed5( i, d, z, delta, rho, dlam )
call dlaed5( i, d, z, delta, rho, dlam )
Include Files
mkl.fi
Description
The routine computes the i-th eigenvalue of a symmetric rank-one modification of a 2-by-2 diagonal matrix
diag(D) + rho*Z * transpose(Z).
The diagonal elements in the array D are assumed to satisfy
D(i) < D(j) for i < j.
We also assume rho > 0 and that the Euclidean norm of the vector Z is one.
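For the 2-by-2 case the eigenvalues have a closed form. The Python sketch below forms the entries of diag(D) + rho*Z*transpose(Z) explicitly and applies the trace and discriminant formula; it is a simplified illustration, is not arranged for accuracy the way a library routine would be, and does not compute delta. The name laed5_sketch is hypothetical.

```python
import math


def laed5_sketch(i, d, z, rho):
    """i-th (1 or 2) eigenvalue of diag(d) + rho*z*z^T for 2-element d and z."""
    m11 = d[0] + rho * z[0] * z[0]
    m22 = d[1] + rho * z[1] * z[1]
    m12 = rho * z[0] * z[1]
    mean = 0.5 * (m11 + m22)
    rad = math.hypot(0.5 * (m11 - m22), m12)
    return mean - rad if i == 1 else mean + rad
```

With d(1) < d(2), rho > 0 and nonzero z, the results interlace with the original values: d(1) < lambda_1 < d(2) < lambda_2.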
Input Parameters
i INTEGER. The index of the eigenvalue to be computed;
1 ≤ i ≤ 2.
d, z REAL for slaed5

DOUBLE PRECISION for dlaed5.
Arrays, dimension (2) each. The array d contains the original eigenvalues. It
is assumed that d(1) < d(2).
The array z contains the components of the updating vector.
rho REAL for slaed5

DOUBLE PRECISION for dlaed5.
The scalar in the symmetric updating formula.
Output Parameters
delta REAL for slaed5

DOUBLE PRECISION for dlaed5.
Array, dimension (2).
The vector delta contains the information necessary to construct the
eigenvectors.
dlam REAL for slaed5

DOUBLE PRECISION for dlaed5.
The computed lambda_i, the i-th updated eigenvalue.
?laed6
Used by sstedc/dstedc. Computes one Newton step
in solution of the secular equation.
Syntax
call slaed6( kniter, orgati, rho, d, z, finit, tau, info )
call dlaed6( kniter, orgati, rho, d, z, finit, tau, info )
Include Files
mkl.fi
Description
The routine computes the positive or negative root (closest to the origin) of
f(x) = rho + z(1)/(d(1)-x) + z(2)/(d(2)-x) + z(3)/(d(3)-x).
It is assumed that if orgati = .TRUE. the root is between d(2) and d(3); otherwise it is between d(1) and
d(2). This routine is called by ?laed4 when necessary. In most cases, the root sought is the smallest in
magnitude, though it might not be in some extremely rare situations.
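A root of f(x) = rho + z(1)/(d(1)-x) + z(2)/(d(2)-x) + z(3)/(d(3)-x) in the interval selected by orgati can be located with a Newton iteration safeguarded by bisection, as in the Python sketch below. Because all z(i) are positive, f increases between consecutive poles, so each such interval brackets exactly one root. The sketch ignores kniter and finit and is not the iteration used by the routine; the names f, fprime, and secular_root are hypothetical.

```python
def f(x, rho, d, z):
    return rho + sum(zi / (di - x) for di, zi in zip(d, z))


def fprime(x, d, z):
    return sum(zi / (di - x) ** 2 for di, zi in zip(d, z))


def secular_root(orgati, rho, d, z, iters=100):
    """Root of f in (d[1], d[2]) if orgati is true, else in (d[0], d[1])."""
    lo, hi = (d[1], d[2]) if orgati else (d[0], d[1])
    x = 0.5 * (lo + hi)
    for _ in range(iters):
        fx = f(x, rho, d, z)
        if fx == 0.0:
            return x
        if fx < 0.0:
            lo = x  # f is increasing, so the root is to the right
        else:
            hi = x
        x_new = x - fx / fprime(x, d, z)  # Newton step
        if x_new == x:
            return x                      # step below rounding level
        if not (lo < x_new < hi):
            x_new = 0.5 * (lo + hi)       # fall back to bisection
            if not (lo < x_new < hi):
                return x
        x = x_new
    return x
```

For d = (-3, -1, 1), z = (1, 1, 1) and rho = 0 the two roots are -1 + 2/sqrt(3) and -1 - 2/sqrt(3), one in each interval.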
Input Parameters
kniter INTEGER.
Refer to ?laed4 for its significance.
orgati LOGICAL.
If orgati = .TRUE., the needed root is between d(2) and d(3); otherwise
it is between d(1) and d(2). See ?laed4 for further details.
rho REAL for slaed6

DOUBLE PRECISION for dlaed6.
Refer to the equation for f(x) above.
d, z REAL for slaed6

DOUBLE PRECISION for dlaed6.
Arrays, dimension (3) each.
The array d satisfies d(1) < d(2) < d(3).
Each of the elements in the array z must be positive.
finit REAL for slaed6

DOUBLE PRECISION for dlaed6.
The value of f(x) at 0. It is more accurate than the one evaluated inside this
routine (if someone wants to do so).
Output Parameters
tau REAL for slaed6

DOUBLE PRECISION for dlaed6.
The root of the equation for f(x).
info INTEGER.
If info = 1, failure to converge.
?laed7
Used by ?stedc. Computes the updated eigensystem
of a diagonal matrix after modification by a rank-one
symmetric matrix. Used when the original matrix is
dense.
Syntax
call slaed7( icompq, n, qsiz, tlvls, curlvl, curpbm, d, q, ldq, indxq, rho, cutpnt,
qstore, qptr, prmptr, perm, givptr, givcol, givnum, work, iwork, info )
call dlaed7( icompq, n, qsiz, tlvls, curlvl, curpbm, d, q, ldq, indxq, rho, cutpnt,
qstore, qptr, prmptr, perm, givptr, givcol, givnum, work, iwork, info )
call claed7( n, cutpnt, qsiz, tlvls, curlvl, curpbm, d, q, ldq, rho, indxq, qstore,
qptr, prmptr, perm, givptr, givcol, givnum, work, rwork, iwork, info )
call zlaed7( n, cutpnt, qsiz, tlvls, curlvl, curpbm, d, q, ldq, rho, indxq, qstore,
qptr, prmptr, perm, givptr, givcol, givnum, work, rwork, iwork, info )
Include Files
mkl.fi
Description
The routine ?laed7 computes the updated eigensystem of a diagonal matrix after modification by a rank-one
symmetric matrix. This routine is used only for the eigenproblem which requires all eigenvalues and
optionally eigenvectors of a dense symmetric/Hermitian matrix that has been reduced to tridiagonal form.
For real flavors, slaed1/dlaed1 handles the case in which all eigenvalues and eigenvectors of a symmetric
tridiagonal matrix are desired.
T = Q(in)*(D(in) + rho*Z*Z^T)*Q^T(in) = Q(out)*D(out)*Q^T(out) for real flavors, or
T = Q(in)*(D(in) + rho*Z*Z^H)*Q^H(in) = Q(out)*D(out)*Q^H(out) for complex flavors
where Z = Q^T*u for real flavors and Z = Q^H*u for complex flavors, u is a vector of length n with ones in the
cutpnt and (cutpnt + 1)-th elements and zeros elsewhere. The eigenvectors of the original matrix are
stored in Q, and the eigenvalues are in D. The algorithm consists of three stages:
The first stage consists of deflating the size of the problem when there are multiple eigenvalues or if there is
a zero in the z vector. For each such occurrence the dimension of the secular equation problem is reduced by
one. This stage is performed by the routine slaed8/dlaed8 (for real flavors) or by the routine slaed2/
dlaed2 (for complex flavors).
The second stage consists of calculating the updated eigenvalues. This is done by finding the roots of the
secular equation via the routine ?laed4 (as called by ?laed9 or ?laed3). This routine also calculates the
eigenvectors of the current problem.
The final stage consists of computing the updated eigenvectors directly using the updated eigenvalues. The
eigenvectors for the current problem are multiplied with the eigenvectors from the overall problem.
Input Parameters


icompq INTEGER. Used with real flavors only.

If icompq = 0, compute eigenvalues only.
If icompq = 1, compute eigenvectors of original dense symmetric matrix
also. On entry, the array q must contain the orthogonal matrix used to
reduce the original matrix to tridiagonal form.
cutpnt INTEGER. The location of the last eigenvalue in the leading sub-matrix.
min(1,n) ≤ cutpnt ≤ n.
qsiz INTEGER.
The dimension of the orthogonal/unitary matrix used to reduce the full
matrix to tridiagonal form; qsiz ≥ n (for real flavors, qsiz ≥ n if icompq =
1).
tlvls INTEGER. The total number of merging levels in the overall divide and
conquer tree.
curlvl INTEGER. The current level in the overall merge routine, 0 ≤
curlvl ≤ tlvls.
curpbm INTEGER. The current problem in the current level in the overall merge
routine (counting from upper left to lower right).
d REAL for slaed7/claed7

DOUBLE PRECISION for dlaed7/zlaed7.
Array, dimension at least max(1, n).
Array d(*) contains the eigenvalues of the rank-1-perturbed matrix.
q, work REAL for slaed7

DOUBLE PRECISION for dlaed7
COMPLEX for claed7
DOUBLE COMPLEX for zlaed7.
Arrays:
q(ldq, *) contains the eigenvectors of the rank-1-perturbed matrix. The
second dimension of q must be at least max(1, n).
work(*) is a workspace array, dimension at least (3n+2*qsiz*n) for real
flavors and at least (qsiz*n) for complex flavors.
indxq INTEGER. Array, dimension (n).
Contains the permutation that separately sorts the two sub-problems in d
into ascending order.
rho REAL for slaed7/claed7

DOUBLE PRECISION for dlaed7/zlaed7.
The subdiagonal element used to create the rank-1 modification.
qstore REAL for slaed7/claed7

Array, dimension (n^2+1). Serves also as output parameter.
Stores eigenvectors of submatrices encountered during divide and conquer,

packed together. qptr points to beginning of the submatrices.
qptr INTEGER. Array, dimension (n+2). Serves also as output parameter. List of
indices pointing to beginning of submatrices stored in qstore. The
submatrices are numbered starting at the bottom left of the divide and
conquer tree, from left to right and bottom to top.
prmptr, perm, givptr INTEGER. Arrays, dimension (n*log2(n)) each.

The array prmptr(*) contains a list of pointers which indicate where in perm
a level's permutation is stored. prmptr(i+1) - prmptr(i) indicates the
size of the permutation and also the size of the full, non-deflated problem.
The array perm(*) contains the permutations (from deflation and sorting)
to be applied to each eigenblock. This parameter can be modified by
?laed8, where it is output.
The array givptr(*) contains a list of pointers which indicate where in givcol
a level's Givens rotations are stored. givptr(i+1) - givptr(i) indicates
the number of Givens rotations.
givcol INTEGER. Array, dimension (2, n*log2(n)).

Each pair of numbers indicates a pair of columns to take place in a Givens
rotation.
givnum REAL for slaed7/claed7

DOUBLE PRECISION for dlaed7/zlaed7.
Array, dimension (2, n*log2(n)).
Each number indicates the S value to be used in the corresponding Givens
rotation.
iwork INTEGER.
Workspace array, dimension (4n).
rwork REAL for claed7

DOUBLE PRECISION for zlaed7.
Workspace array, dimension (3n+2*qsiz*n). Used in complex flavors only.
Output Parameters
d On exit, contains the eigenvalues of the repaired matrix.
q On exit, q contains the eigenvectors of the repaired tridiagonal matrix.

indxq INTEGER. Array, dimension (n).
Contains the permutation that reintegrates the subproblems back into a
sorted order, that is,
d(indxq(i = 1, n)) will be in the ascending order.
rho This parameter can be modified by ?laed8, where it is input/output.
prmptr, perm, givptr INTEGER. Arrays, dimension (n*log2(n)) each.

The array prmptr contains an updated list of pointers.
The array perm contains an updated permutation.
The array givptr contains an updated list of pointers.
givcol This parameter can be modified by ?laed8, where it is output.
givnum This parameter can be modified by ?laed8, where it is output.
info INTEGER.
If info = 1, an eigenvalue did not converge.
?laed8
Used by ?stedc. Merges eigenvalues and deflates
secular equation. Used when the original matrix is
dense.
Syntax
call slaed8( icompq, k, n, qsiz, d, q, ldq, indxq, rho, cutpnt, z, dlamda, q2, ldq2,
w, perm, givptr, givcol, givnum, indxp, indx, info )
call dlaed8( icompq, k, n, qsiz, d, q, ldq, indxq, rho, cutpnt, z, dlamda, q2, ldq2,
w, perm, givptr, givcol, givnum, indxp, indx, info )
call claed8( k, n, qsiz, q, ldq, d, rho, cutpnt, z, dlamda, q2, ldq2, w, indxp, indx,
indxq, perm, givptr, givcol, givnum, info )
call zlaed8( k, n, qsiz, q, ldq, d, rho, cutpnt, z, dlamda, q2, ldq2, w, indxp, indx,
indxq, perm, givptr, givcol, givnum, info )
Include Files
mkl.fi
Description
The routine merges the two sets of eigenvalues together into a single sorted set. Then it tries to deflate the
size of the problem. There are two ways in which deflation can occur: when two or more eigenvalues are
close together or if there is a tiny element in the z vector. For each such occurrence the order of the related
secular equation problem is reduced by one.
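Input Parameters
icompq INTEGER. Used with real flavors only.
If icompq = 0, compute eigenvalues only.
If icompq = 1, compute eigenvectors of the original dense symmetric matrix
also.
On entry, the array q must contain the orthogonal matrix used to reduce
the original matrix to tridiagonal form.
n INTEGER. The order of the symmetric tridiagonal matrix (n ≥ 0).
cutpnt INTEGER. The location of the last eigenvalue in the leading sub-matrix.
min(1,n) ≤ cutpnt ≤ n.
qsiz INTEGER. The dimension of the orthogonal/unitary matrix used to reduce the full
matrix to tridiagonal form; qsiz ≥ n (for real flavors, qsiz ≥ n if icompq =
1).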
d, z REAL for slaed8/claed8
DOUBLE PRECISION for dlaed8/zlaed8.
Arrays, dimension at least max(1, n) each. The array d(*) contains the
eigenvalues of the two submatrices to be combined.
On entry, z(*) contains the updating vector (the last row of the first sub-
eigenvector matrix and the first row of the second sub-eigenvector matrix).
The contents of z are destroyed by the updating process.
q REAL for slaed8
DOUBLE PRECISION for dlaed8
COMPLEX for claed8
DOUBLE COMPLEX for zlaed8.
Array q(ldq, *). The second dimension of q must be at least max(1, n). On
entry, q contains the eigenvectors of the partially solved system which has
been previously updated in matrix multiplies with other partially solved
eigensystems.
For real flavors, if icompq = 0, q is not referenced.
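ldq INTEGER. The leading dimension of the array q; ldq ≥ max(1, n).
ldq2 INTEGER. The leading dimension of the output array q2; ldq2 ≥ max(1,
n).
indxq INTEGER. Array, dimension (n).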
The permutation that separately sorts the two sub-problems in d into
ascending order. Note that elements in the second half of this permutation
must first have cutpnt added to their values in order to be accurate.
rho REAL for slaed8/claed8
DOUBLE PRECISION for dlaed8/zlaed8.
On entry, the off-diagonal element associated with the rank-1 cut which
originally split the two submatrices which are now being recombined.
Output Parameters
k INTEGER. The number of non-deflated eigenvalues, and the order of the

related secular equation.
d On exit, contains the trailing (n-k) updated eigenvalues (those which were
deflated) sorted into increasing order.
z On exit, the updating process destroys the contents of z.
q On exit, q contains the trailing (n-k) updated eigenvectors (those which

were deflated) in its last (n-k) columns.

The permutation of merged eigenvalues set.
rho On exit, rho has been modified to the value required by ?laed3.
dlamda, w REAL for slaed8/claed8
DOUBLE PRECISION for dlaed8/zlaed8.
Arrays, dimension (n) each. The array dlamda(*) contains a copy of the
first k eigenvalues which will be used by ?laed3 to form the secular
equation.
The array w(*) will hold the first k values of the final deflation-altered z-
vector and will be passed to ?laed3.
q2 REAL for slaed8
DOUBLE PRECISION for dlaed8
COMPLEX for claed8
DOUBLE COMPLEX for zlaed8.
Array q2(ldq2, *). The second dimension of q2 must be at least max(1, n).
Contains a copy of the first k eigenvectors which will be used by
slaed7/dlaed7 in a matrix multiply (sgemm/dgemm) to update the new
eigenvectors. For real flavors, if icompq = 0, q2 is not referenced.
indxp, indx INTEGER. Workspace arrays, dimension (n) each.

The array indxp(*) will contain the permutation used to place deflated
values of d at the end of the array. On output, indxp(1:k) points to the
nondeflated d-values and indxp(k+1:n) points to the deflated eigenvalues.
The array indx(*) will contain the permutation used to sort the contents of
d into ascending order.
perm INTEGER. Array, dimension (n).

Contains the permutations (from deflation and sorting) to be applied to
each eigenblock.
givptr INTEGER. Contains the number of Givens rotations which took place in this
subproblem.
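givcol INTEGER. Array, dimension (2, n).
Each pair of numbers indicates a pair of columns to take place in a Givens
rotation.
givnum REAL for slaed8/claed8
DOUBLE PRECISION for dlaed8/zlaed8.
Array, dimension (2, n).
Each number indicates the S value to be used in the corresponding Givens
rotation.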
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
?laed9
Used by sstedc/dstedc. Finds the roots of the
secular equation and updates the eigenvectors. Used
when the original matrix is dense.
Syntax
call slaed9( k, kstart, kstop, n, d, q, ldq, rho, dlamda, w, s, lds, info )
call dlaed9( k, kstart, kstop, n, d, q, ldq, rho, dlamda, w, s, lds, info )
Include Files
mkl.fi
Description
The routine finds the roots of the secular equation, as defined by the values in d, z, and rho, between kstart
and kstop. It makes the appropriate calls to slaed4/dlaed4 and then stores the new matrix of eigenvectors
for use in calculating the next level of z vectors.
Input Parameters
k INTEGER. The number of terms in the rational function to be solved by
slaed4/dlaed4 (k ≥ 0).
kstart, kstop INTEGER. The updated eigenvalues lambda(i),
kstart ≤ i ≤ kstop, are to be computed.
1 ≤ kstart ≤ kstop ≤ k.
n INTEGER. The number of rows and columns in the Q matrix. n ≥ k (deflation
may result in n > k).
q REAL for slaed9
DOUBLE PRECISION for dlaed9.
Workspace array, dimension (ldq, *).
ldq INTEGER. The leading dimension of the array q; ldq ≥ max(1, n).
rho REAL for slaed9
DOUBLE PRECISION for dlaed9.
The value of the parameter in the rank one update equation. rho ≥ 0
required.
dlamda, w REAL for slaed9
DOUBLE PRECISION for dlaed9.
Arrays, dimension (k) each. The first k elements of the array dlamda(*)
contain the old roots of the deflated updating problem. These are the poles
of the secular equation.
The first k elements of the array w(*) contain the components of the
deflation-adjusted updating vector.
lds INTEGER. The leading dimension of the output array s; lds ≥ max(1, k).
Output Parameters
d REAL for slaed9
DOUBLE PRECISION for dlaed9.
Array, dimension (n). Elements in d(i) are not referenced for 1 ≤ i <
kstart or kstop < i ≤ n.
s REAL for slaed9
DOUBLE PRECISION for dlaed9.
Array, dimension (lds, *).
The second dimension of s must be at least max(1, k). Will contain the
eigenvectors of the repaired matrix which will be stored for subsequent z
vector calculation and multiplied by the previously accumulated
eigenvectors to update the system.
dlamda On exit, the value is modified to make sure all dlamda(i) - dlamda(j)
can be computed with high relative accuracy, barring overflow and
underflow.
w Destroyed on exit.
info INTEGER.
If info = -i, the i-th parameter had an illegal value. If info = 1, the
eigenvalue did not converge.
?laeda
Used by ?stedc. Computes the Z vector determining
the rank-one modification of the diagonal matrix. Used
when the original matrix is dense.
Syntax
call slaeda( n, tlvls, curlvl, curpbm, prmptr, perm, givptr, givcol, givnum, q, qptr,
z, ztemp, info )
call dlaeda( n, tlvls, curlvl, curpbm, prmptr, perm, givptr, givcol, givnum, q, qptr,
z, ztemp, info )
Include Files
mkl.fi
Description
The routine ?laeda computes the Z vector corresponding to the merge step in the curlvl-th step of the
merge process with tlvls steps for the curpbm-th problem.
Input Parameters
tlvls INTEGER. The total number of merging levels in the overall divide and
conquer tree.
curlvl INTEGER. The current level in the overall merge routine,
0 ≤ curlvl ≤ tlvls.
curpbm INTEGER. The current problem in the current level in the overall merge
routine (counting from upper left to lower right).
prmptr, perm, givptr INTEGER. Arrays, dimension (n*log2(n)) each.

The array prmptr(*) contains a list of pointers which indicate where in perm
a level's permutation is stored. prmptr(i+1) - prmptr(i) indicates the
size of the permutation and also the size of the full, non-deflated problem.
The array perm(*) contains the permutations (from deflation and sorting)
to be applied to each eigenblock.
The array givptr(*) contains a list of pointers which indicate where in givcol
a level's Givens rotations are stored. givptr(i+1) - givptr(i) indicates
the number of Givens rotations.
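givcol INTEGER. Array, dimension (2, n*log2(n)).
Each pair of numbers indicates a pair of columns to take place in a Givens
rotation.
givnum REAL for slaeda
DOUBLE PRECISION for dlaeda.
Array, dimension (2, n*log2(n)).
Each number indicates the S value to be used in the corresponding Givens
rotation.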
q REAL for slaeda
DOUBLE PRECISION for dlaeda.
Array, dimension (n^2).
Contains the square eigenblocks from previous levels, the starting positions
for blocks are given by qptr.
qptr INTEGER. Array, dimension (n+2). Contains a list of pointers which indicate
where in q an eigenblock is stored. sqrt( qptr(i+1) - qptr(i))
indicates the size of the block.
ztemp REAL for slaeda
DOUBLE PRECISION for dlaeda.
Workspace array, dimension (n).
Output Parameters
z REAL for slaeda
DOUBLE PRECISION for dlaeda.
Array, dimension (n). Contains the updating vector (the last row of the first
sub-eigenvector matrix and the first row of the second sub-eigenvector
matrix).
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
?laein
Computes a specified right or left eigenvector of an
upper Hessenberg matrix by inverse iteration.
Syntax
call slaein( rightv, noinit, n, h, ldh, wr, wi, vr, vi, b, ldb, work, eps3, smlnum,
bignum, info )
call dlaein( rightv, noinit, n, h, ldh, wr, wi, vr, vi, b, ldb, work, eps3, smlnum,
bignum, info )
call claein( rightv, noinit, n, h, ldh, w, v, b, ldb, rwork, eps3, smlnum, info )
call zlaein( rightv, noinit, n, h, ldh, w, v, b, ldb, rwork, eps3, smlnum, info )
Include Files
mkl.fi
Description
The routine ?laein uses inverse iteration to find a right or left eigenvector corresponding to the eigenvalue
(wr,wi) of a real upper Hessenberg matrix H (for real flavors slaein/dlaein) or to the eigenvalue w of a
complex upper Hessenberg matrix H (for complex flavors claein/zlaein).
Input Parameters
rightv LOGICAL.
If rightv = .TRUE., compute right eigenvector;
if rightv = .FALSE., compute left eigenvector.
noinit LOGICAL.
If noinit = .TRUE., no initial vector is supplied in (vr,vi) or in v (for
complex flavors);
if noinit = .FALSE., initial vector is supplied in (vr,vi) or in v (for
complex flavors).
h REAL for slaein

DOUBLE PRECISION for dlaein
COMPLEX for claein
DOUBLE COMPLEX for zlaein.
Array h(ldh, *).
The second dimension of h must be at least max(1, n). Contains the upper
Hessenberg matrix H.
ldh INTEGER. The leading dimension of the array h; ldh ≥ max(1, n).
wr, wi REAL for slaein

DOUBLE PRECISION for dlaein.
The real and imaginary parts of the eigenvalue of H whose corresponding
right or left eigenvector is to be computed (for real flavors of the routine).
w COMPLEX for claein
DOUBLE COMPLEX for zlaein.
The eigenvalue of H whose corresponding right or left eigenvector is to be
computed (for complex flavors of the routine).
vr, vi REAL for slaein
DOUBLE PRECISION for dlaein.
Arrays, dimension (n) each. Used for real flavors only. On entry, if noinit
= .FALSE. and wi = 0.0, vr must contain a real starting vector for inverse
iteration using the real eigenvalue wr;
if noinit = .FALSE. and wi ≠ 0.0, vr and vi must contain the real and
imaginary parts of a complex starting vector for inverse iteration using the
complex eigenvalue (wr,wi); otherwise vr and vi need not be set.
v COMPLEX for claein
DOUBLE COMPLEX for zlaein.
Array, dimension (n). Used for complex flavors only. On entry, if noinit
= .FALSE., v must contain a starting vector for inverse iteration; otherwise
v need not be set.
b REAL for slaein
DOUBLE PRECISION for dlaein
COMPLEX for claein
DOUBLE COMPLEX for zlaein.
Workspace array b(ldb, *). The second dimension of b must be at least
max(1, n).
ldb INTEGER. The leading dimension of the array b;
ldb ≥ n+1 for real flavors;
ldb ≥ max(1, n) for complex flavors.
work REAL for slaein
DOUBLE PRECISION for dlaein.
Workspace array, dimension (n).
Used for real flavors only.
rwork REAL for claein
DOUBLE PRECISION for zlaein.
Workspace array, dimension (n).
Used for complex flavors only.
eps3, smlnum REAL for slaein/claein

DOUBLE PRECISION for dlaein/zlaein.
eps3 is a small machine-dependent value which is used to perturb close
eigenvalues, and to replace zero pivots.
smlnum is a machine-dependent value close to underflow threshold. A
suggested value for smlnum is slamch('s') * (n/slamch('p')) for
slaein/claein or dlamch('s') * (n/dlamch('p')) for dlaein/zlaein.
See lamch.
bignum REAL for slaein
DOUBLE PRECISION for dlaein.
bignum is a machine-dependent value close to overflow threshold. Used for
real flavors only. A suggested value for bignum is 1 / slamch('s') for
slaein or 1 / dlamch('s') for dlaein.
Output Parameters
vr, vi On exit, if wi = 0.0 (real eigenvalue), vr contains the computed real
eigenvector; if wi ≠ 0.0 (complex eigenvalue), vr and vi contain the real
and imaginary parts of the computed complex eigenvector. The eigenvector
is normalized so that the component of largest magnitude has magnitude 1;
here the magnitude of a complex number (x,y) is taken to be |x| + |y|.
vi is not referenced if wi = 0.0.
v On exit, v contains the computed eigenvector, normalized so that the

component of largest magnitude has magnitude 1; here the magnitude of a
complex number (x,y) is taken to be |x| + |y|.
info INTEGER.
If info = 1, inverse iteration did not converge. For real flavors, vr is set to
the last iterate, and so is vi, if wi ≠ 0.0. For complex flavors, v is set to the
last iterate.
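The following is a minimal sketch of a dlaein call for a real eigenvalue, not a definitive usage pattern. It assumes the program is linked against Intel MKL or another LAPACK-compatible library, and it declares the routines with external statements instead of including mkl.fi. The test matrix, the choice eps3 = (infinity norm of H) * dlamch('p'), and the program name are illustrative assumptions; smlnum and bignum follow the suggested values given above.

```fortran
program laein_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, ldb = n + 1   ! ldb >= n+1 for real flavors
  real(dp) :: h(n,n), vr(n), vi(n), b(ldb,n), work(n)
  real(dp) :: wr, wi, ulp, eps3, smlnum, bignum, resid
  integer  :: info
  real(dp), external :: dlamch
  external :: dlaein

  ! Symmetric tridiagonal matrix (a special case of upper Hessenberg form).
  ! Its eigenvalues are 2 and 2 +/- sqrt(2).
  h = 0.0_dp
  h(1,1) = 2.0_dp; h(2,2) = 2.0_dp; h(3,3) = 2.0_dp
  h(1,2) = 1.0_dp; h(2,1) = 1.0_dp
  h(2,3) = 1.0_dp; h(3,2) = 1.0_dp

  wr = 2.0_dp          ! real eigenvalue whose eigenvector is wanted
  wi = 0.0_dp

  ulp    = dlamch('p')
  eps3   = maxval(sum(abs(h), dim=2)) * ulp   ! assumed choice of eps3
  smlnum = dlamch('s') * (real(n, dp) / ulp)
  bignum = 1.0_dp / dlamch('s')

  ! rightv = .true.: right eigenvector; noinit = .true.: no starting vector.
  call dlaein(.true., .true., n, h, n, wr, wi, vr, vi, b, ldb, work, &
              eps3, smlnum, bignum, info)

  resid = maxval(abs(matmul(h, vr) - wr*vr))
  print '(a,i0)',     'info        = ', info
  print '(a,3f10.6)', 'vr          = ', vr
  print '(a,es10.2)', 'max|H*v-w*v| = ', resid
end program laein_sketch
```

If the call succeeds, info is 0 and the printed residual should be close to machine precision; the sign of the returned vector is not fixed by the normalization.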
?laev2
Computes the eigenvalues and eigenvectors of a 2-
by-2 symmetric/Hermitian matrix.
Syntax
call slaev2( a, b, c, rt1, rt2, cs1, sn1 )
call dlaev2( a, b, c, rt1, rt2, cs1, sn1 )
call claev2( a, b, c, rt1, rt2, cs1, sn1 )
call zlaev2( a, b, c, rt1, rt2, cs1, sn1 )
Include Files
mkl.fi
Description
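The routine performs the eigendecomposition of a 2-by-2 symmetric matrix
   [ a  b ]
   [ b  c ]
(for slaev2/dlaev2) or Hermitian matrix
   [ a         b ]
   [ conjg(b)  c ]
(for claev2/zlaev2).
On return, rt1 is the eigenvalue of larger absolute value, rt2 of smaller absolute value, and (cs1, sn1) is the
unit right eigenvector for rt1, giving the decomposition
   [  cs1  sn1 ] [ a  b ] [ cs1  -sn1 ]   [ rt1   0  ]
   [ -sn1  cs1 ] [ b  c ] [ sn1   cs1 ] = [  0   rt2 ]
(for slaev2/dlaev2),
or
   [  cs1  conjg(sn1) ] [ a         b ] [ cs1  -conjg(sn1) ]   [ rt1   0  ]
   [ -sn1  cs1        ] [ conjg(b)  c ] [ sn1   cs1        ] = [  0   rt2 ]
(for claev2/zlaev2).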
Input Parameters
a, b, c REAL for slaev2

DOUBLE PRECISION for dlaev2
COMPLEX for claev2
DOUBLE COMPLEX for zlaev2.
Elements of the input matrix.
Output Parameters
rt1, rt2 REAL for slaev2/claev2

DOUBLE PRECISION for dlaev2/zlaev2.
Eigenvalues of larger and smaller absolute value, respectively.
cs1 REAL for slaev2/claev2

DOUBLE PRECISION for dlaev2/zlaev2.
sn1 REAL for slaev2

DOUBLE PRECISION for dlaev2
COMPLEX for claev2
DOUBLE COMPLEX for zlaev2.
The vector (cs1, sn1) is the unit right eigenvector for rt1.
Application Notes
rt1 is accurate to a few ulps barring over/underflow. rt2 may be inaccurate if there is massive cancellation in
the determinant a*c-b*b; higher precision or correctly rounded or correctly truncated arithmetic would be
needed to compute rt2 accurately in all cases. cs1 and sn1 are accurate to a few ulps barring over/underflow.
Overflow is possible only if rt1 is within a factor of 5 of overflow. Underflow is harmless if the input data is 0
or exceeds underflow_threshold / macheps.
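As an illustration, the sketch below calls dlaev2 for the symmetric matrix with a = 2, b = 1, c = 2, whose eigenvalues are 3 and 1, and checks the eigenvector residual. It assumes linking against Intel MKL or another LAPACK-compatible library; the external declaration stands in for including mkl.fi, and the program name and input values are made up for the example.

```fortran
program laev2_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: a, b, c, rt1, rt2, cs1, sn1, r1, r2
  external :: dlaev2

  ! Symmetric matrix [ a b ; b c ] = [ 2 1 ; 1 2 ].
  a = 2.0_dp
  b = 1.0_dp
  c = 2.0_dp

  call dlaev2(a, b, c, rt1, rt2, cs1, sn1)

  ! Components of A*v - rt1*v for v = (cs1, sn1); both should be near zero.
  r1 = a*cs1 + b*sn1 - rt1*cs1
  r2 = b*cs1 + c*sn1 - rt1*sn1

  print '(a,2f10.6)', 'rt1, rt2     = ', rt1, rt2
  print '(a,2f10.6)', 'cs1, sn1     = ', cs1, sn1
  print '(a,es10.2)', 'max residual = ', max(abs(r1), abs(r2))
end program laev2_sketch
```

Checking the residual rather than the individual values of cs1 and sn1 avoids depending on the sign convention of the returned eigenvector.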
?laexc
Swaps adjacent diagonal blocks of a real upper quasi-
triangular matrix in Schur canonical form, by an
orthogonal similarity transformation.
Syntax
call slaexc( wantq, n, t, ldt, q, ldq, j1, n1, n2, work, info )
call dlaexc( wantq, n, t, ldt, q, ldq, j1, n1, n2, work, info )
Include Files
mkl.fi
Description
The routine swaps adjacent diagonal blocks T11 and T22 of order 1 or 2 in an upper quasi-triangular matrix T
by an orthogonal similarity transformation.
T must be in Schur canonical form, that is, block upper triangular with 1-by-1 and 2-by-2 diagonal blocks;
each 2-by-2 diagonal block has its diagonal elements equal and its off-diagonal elements of opposite sign.
Input Parameters
wantq LOGICAL.
If wantq = .TRUE., accumulate the transformation in the matrix Q;
If wantq = .FALSE., do not accumulate the transformation.
t, q REAL for slaexc

DOUBLE PRECISION for dlaexc
Arrays:
t(ldt,*) contains on entry the upper quasi-triangular matrix T, in Schur
canonical form.
q(ldq,*) contains on entry, if wantq = .TRUE., the orthogonal matrix Q.

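If wantq = .FALSE., q is not referenced. The second dimension of q must
be at least max(1, n).
ldt INTEGER. The leading dimension of t; at least max(1, n).
ldq INTEGER. The leading dimension of q;
If wantq = .FALSE., then ldq ≥ 1.
If wantq = .TRUE., then ldq ≥ max(1,n).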
j1 INTEGER. The index of the first row of the first block T11.
n1 INTEGER. The order of the first block T11

(n1 = 0, 1, or 2).
n2 INTEGER. The order of the second block T22

(n2 = 0, 1, or 2).
work REAL for slaexc;

DOUBLE PRECISION for dlaexc.
Workspace array, DIMENSION (n).
Output Parameters
t On exit, the updated matrix T, again in Schur canonical form.
q On exit, if wantq = .TRUE., the updated matrix Q.
info INTEGER.
If info = 1, the transformed matrix T would be too far from Schur form;
the blocks are not swapped and T and Q are unchanged.
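A minimal sketch of the simplest case, swapping two 1-by-1 diagonal blocks of a 2-by-2 upper triangular matrix, is shown below. It assumes linking against Intel MKL or another LAPACK-compatible library and uses an external declaration in place of mkl.fi; the matrix entries and the program name are illustrative only.

```fortran
program laexc_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2
  real(dp) :: t(n,n), t0(n,n), q(n,n), work(n), err
  integer  :: info
  external :: dlaexc

  ! T = [ 1 2 ; 0 3 ] (column-major fill); both diagonal blocks are 1-by-1.
  t  = reshape([1.0_dp, 0.0_dp, 2.0_dp, 3.0_dp], [n, n])
  t0 = t
  ! Q starts as the identity so that it accumulates only this swap.
  q  = reshape([1.0_dp, 0.0_dp, 0.0_dp, 1.0_dp], [n, n])

  ! wantq = .true., j1 = 1, n1 = 1, n2 = 1
  call dlaexc(.true., n, t, n, q, n, 1, 1, 1, work, info)

  ! The accumulated Q should satisfy Q*T*Q^T = original T.
  err = maxval(abs(matmul(q, matmul(t, transpose(q))) - t0))

  print '(a,i0)',     'info               = ', info
  print '(a,2f10.6)', 'diag(T) after swap = ', t(1,1), t(2,2)
  print '(a,es10.2)', 'max|Q*T*Q^T - T0|  = ', err
end program laexc_sketch
```

After a successful swap the diagonal entries 1 and 3 appear in the opposite order, while the similarity check confirms that the eigenvalues and the represented matrix are unchanged.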
?lag2
Computes the eigenvalues of a 2-by-2 generalized
eigenvalue problem, with scaling as necessary to
avoid over-/underflow.
Syntax
call slag2( a, lda, b, ldb, safmin, scale1, scale2, wr1, wr2, wi )
call dlag2( a, lda, b, ldb, safmin, scale1, scale2, wr1, wr2, wi )
Include Files
mkl.fi
Description
The routine computes the eigenvalues of a 2 x 2 generalized eigenvalue problem A - w*B, with scaling as
necessary to avoid over-/underflow. The scaling factor, s, results in a modified eigenvalue equation
s*A - w*B,
where s is a non-negative scaling factor chosen so that w, w*B, and s*A do not overflow and, if possible, do
not underflow, either.
Input Parameters
a, b REAL for slag2

DOUBLE PRECISION for dlag2
Arrays:
a(lda,2) contains, on entry, the 2 x 2 matrix A. It is assumed that its 1-
norm is less than 1/safmin. Entries less than sqrt(safmin)*norm(A) are
subject to being treated as zero.
b(ldb,2) contains, on entry, the 2 x 2 upper triangular matrix B. It is
assumed that the one-norm of B is less than 1/safmin. The diagonals
should be at least sqrt(safmin) times the largest element of B (in absolute
value); if a diagonal is smaller than that, then +/- sqrt(safmin) will be
used instead of that diagonal.
lda INTEGER. The leading dimension of a; lda ≥ 2.
ldb INTEGER. The leading dimension of b; ldb ≥ 2.
safmin REAL for slag2;

DOUBLE PRECISION for dlag2.
The smallest positive number such that 1/safmin does not overflow. (This
should always be ?lamch('S') - it is an argument in order to avoid having to
call ?lamch frequently.)
Output Parameters
scale1 REAL for slag2;
DOUBLE PRECISION for dlag2.
A scaling factor used to avoid over-/underflow in the eigenvalue equation
which defines the first eigenvalue. If the eigenvalues are complex, then the
eigenvalues are (wr1 +/- wi*i)/scale1 (which may lie outside the
exponent range of the machine), scale1=scale2, and scale1 will always be
positive.
If the eigenvalues are real, then the first (real) eigenvalue is wr1/scale1,
but this may overflow or underflow, and in fact, scale1 may be zero or less
than the underflow threshold if the exact eigenvalue is sufficiently large.
scale2 REAL for slag2;
DOUBLE PRECISION for dlag2.
A scaling factor used to avoid over-/underflow in the eigenvalue equation
which defines the second eigenvalue. If the eigenvalues are complex, then
scale2=scale1. If the eigenvalues are real, then the second (real)
eigenvalue is wr2/scale2, but this may overflow or underflow, and in fact,
scale2 may be zero or less than the underflow threshold if the exact
eigenvalue is sufficiently large.
wr1 REAL for slag2;
DOUBLE PRECISION for dlag2.
If the eigenvalue is real, then wr1 is scale1 times the eigenvalue closest to
the (2,2) element of A*inv(B).
If the eigenvalue is complex, then wr1=wr2 is scale1 times the real part of
the eigenvalues.
wr2 REAL for slag2;
DOUBLE PRECISION for dlag2.
If the eigenvalue is real, then wr2 is scale2 times the other eigenvalue. If
the eigenvalue is complex, then wr1=wr2 is scale1 times the real part of the
eigenvalues.
wi REAL for slag2;
DOUBLE PRECISION for dlag2.
If the eigenvalue is real, then wi is zero. If the eigenvalue is complex, then
wi is scale1 times the imaginary part of the eigenvalues. wi will always be
non-negative.
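The sketch below shows one way the scaled outputs might be turned back into eigenvalues for a small pencil with real eigenvalues. It assumes linking against Intel MKL or another LAPACK-compatible library, with external declarations in place of mkl.fi. The matrices are made up for the example: for A = [ 4 1 ; 2 3 ] and B = [ 2 0 ; 0 1 ], det(A - w*B) = 2*w^2 - 10*w + 10, so the eigenvalues are the roots of w^2 - 5*w + 5.

```fortran
program lag2_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: a(2,2), b(2,2), safmin
  real(dp) :: scale1, scale2, wr1, wr2, wi, w1, w2
  real(dp), external :: dlamch
  external :: dlag2

  ! Column-major fill: A = [ 4 1 ; 2 3 ], B = [ 2 0 ; 0 1 ] (upper triangular).
  a = reshape([4.0_dp, 2.0_dp, 1.0_dp, 3.0_dp], [2, 2])
  b = reshape([2.0_dp, 0.0_dp, 0.0_dp, 1.0_dp], [2, 2])

  safmin = dlamch('S')
  call dlag2(a, 2, b, 2, safmin, scale1, scale2, wr1, wr2, wi)

  if (wi == 0.0_dp) then
     ! Real eigenvalues: undo the scaling (safe here because the
     ! eigenvalues are of moderate size).
     w1 = wr1 / scale1
     w2 = wr2 / scale2
     print '(a,2f12.8)', 'eigenvalues: ', w1, w2
  else
     print '(a,2f12.8)', 'complex pair, real and imaginary parts: ', &
           wr1 / scale1, wi / scale1
  end if
end program lag2_sketch
```

In general the divisions by scale1 and scale2 can overflow or underflow, which is why the routine returns the scaled values separately.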
?lags2
Computes 2-by-2 orthogonal matrices U, V, and Q,
and applies them to matrices A and B such that the
rows of the transformed A and B are parallel.
Syntax
call slags2( upper, a1, a2, a3, b1, b2, b3, csu, snu, csv, snv, csq, snq)
call dlags2( upper, a1, a2, a3, b1, b2, b3, csu, snu, csv, snv, csq, snq)
call clags2( upper, a1, a2, a3, b1, b2, b3, csu, snu, csv, snv, csq, snq)
call zlags2( upper, a1, a2, a3, b1, b2, b3, csu, snu, csv, snv, csq, snq)
Include Files
mkl.fi
Description
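For real flavors, the routine computes 2-by-2 orthogonal matrices U, V and Q, such that if upper = .TRUE.,
then
   U^T*A*Q = U^T * [ a1  a2 ] * Q = [ x  0 ]
                   [ 0   a3 ]       [ x  x ]
and
   V^T*B*Q = V^T * [ b1  b2 ] * Q = [ x  0 ]
                   [ 0   b3 ]       [ x  x ]
or if upper = .FALSE., then
   U^T*A*Q = U^T * [ a1  0  ] * Q = [ x  x ]
                   [ a2  a3 ]       [ 0  x ]
and
   V^T*B*Q = V^T * [ b1  0  ] * Q = [ x  x ]
                   [ b2  b3 ]       [ 0  x ]
The rows of the transformed A and B are parallel, where
   U = [  csu  snu ],  V = [  csv  snv ],  Q = [  csq  snq ]
       [ -snu  csu ]       [ -snv  csv ]       [ -snq  csq ]
Here Z^T denotes the transpose of Z.

For complex flavors, the routine computes 2-by-2 unitary matrices U, V and Q, such that if upper = .TRUE.,
then
   U^H*A*Q = U^H * [ a1  a2 ] * Q = [ x  0 ]
                   [ 0   a3 ]       [ x  x ]
and
   V^H*B*Q = V^H * [ b1  b2 ] * Q = [ x  0 ]
                   [ 0   b3 ]       [ x  x ]
or if upper = .FALSE., then
   U^H*A*Q = U^H * [ a1  0  ] * Q = [ x  x ]
                   [ a2  a3 ]       [ 0  x ]
and
   V^H*B*Q = V^H * [ b1  0  ] * Q = [ x  x ]
                   [ b2  b3 ]       [ 0  x ]
The rows of the transformed A and B are parallel, where
   U = [  csu         snu ],  V = [  csv         snv ],  Q = [  csq         snq ]
       [ -conjg(snu)  csu ]       [ -conjg(snv)  csv ]       [ -conjg(snq)  csq ]
Here Z^H denotes the conjugate transpose of Z.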
Input Parameters
upper LOGICAL.
If upper = .TRUE., the input matrices A and B are upper triangular; If
upper = .FALSE., the input matrices A and B are lower triangular.
a1, a3 REAL for slags2 and clags2

DOUBLE PRECISION for dlags2 and zlags2
a2 REAL for slags2

DOUBLE PRECISION for dlags2
COMPLEX for clags2
COMPLEX*16 for zlags2
On entry, a1, a2 and a3 are elements of the input 2-by-2 upper (lower)
triangular matrix A.
b1, b3 REAL for slags2 and clags2
DOUBLE PRECISION for dlags2 and zlags2
b2 REAL for slags2
DOUBLE PRECISION for dlags2
COMPLEX for clags2
COMPLEX*16 for zlags2
On entry, b1, b2 and b3 are elements of the input 2-by-2 upper (lower)
triangular matrix B.
Output Parameters
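csu REAL for slags2 and clags2
DOUBLE PRECISION for dlags2 and zlags2
Element of the desired orthogonal matrix U.
snu REAL for slags2
DOUBLE PRECISION for dlags2
COMPLEX for clags2
COMPLEX*16 for zlags2
Element of the desired orthogonal matrix U.
csv REAL for slags2 and clags2
DOUBLE PRECISION for dlags2 and zlags2
Element of the desired orthogonal matrix V.
snv REAL for slags2
DOUBLE PRECISION for dlags2
COMPLEX for clags2
COMPLEX*16 for zlags2
Element of the desired orthogonal matrix V.
csq REAL for slags2 and clags2
DOUBLE PRECISION for dlags2 and zlags2
Element of the desired orthogonal matrix Q.
snq REAL for slags2
DOUBLE PRECISION for dlags2
COMPLEX for clags2
COMPLEX*16 for zlags2
Element of the desired orthogonal matrix Q.
For complex flavors, the matrices U, V, and Q are unitary.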
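The sketch below calls dlags2 for a pair of upper triangular 2-by-2 matrices and checks that the (1,2) elements of the transformed matrices vanish. It assumes linking against Intel MKL or another LAPACK-compatible library, and it assumes the rotation layout used in the reference LAPACK documentation, U = [ csu snu ; -snu csu ] and likewise for V and Q; the input values and the program name are illustrative.

```fortran
program lags2_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: a1, a2, a3, b1, b2, b3
  real(dp) :: csu, snu, csv, snv, csq, snq
  real(dp) :: a(2,2), b(2,2), u(2,2), v(2,2), q(2,2), ta(2,2), tb(2,2)
  external :: dlags2

  ! Upper triangular A = [ 4 1 ; 0 2 ] and B = [ 3 2 ; 0 1 ].
  a1 = 4.0_dp; a2 = 1.0_dp; a3 = 2.0_dp
  b1 = 3.0_dp; b2 = 2.0_dp; b3 = 1.0_dp

  call dlags2(.true., a1, a2, a3, b1, b2, b3, csu, snu, csv, snv, csq, snq)

  ! Column-major fills of the 2-by-2 matrices.
  a = reshape([a1, 0.0_dp, a2, a3], [2, 2])
  b = reshape([b1, 0.0_dp, b2, b3], [2, 2])
  u = reshape([csu, -snu, snu, csu], [2, 2])   ! assumed layout of U
  v = reshape([csv, -snv, snv, csv], [2, 2])   ! assumed layout of V
  q = reshape([csq, -snq, snq, csq], [2, 2])   ! assumed layout of Q

  ta = matmul(transpose(u), matmul(a, q))      ! U^T*A*Q
  tb = matmul(transpose(v), matmul(b, q))      ! V^T*B*Q

  print '(a,es10.2)', '(U^T*A*Q)(1,2) = ', ta(1,2)
  print '(a,es10.2)', '(V^T*B*Q)(1,2) = ', tb(1,2)
end program laegs2_placeholder
```

Under the assumed layout both printed elements should be at roundoff level, which is what makes the first rows of the transformed A and B parallel.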
?lagtf
Computes an LU factorization of a matrix T - lambda*I, where
T is a general tridiagonal matrix, and lambda is a scalar,
using partial pivoting with row interchanges.
Syntax
call slagtf( n, a, lambda, b, c, tol, d, in, info )
call dlagtf( n, a, lambda, b, c, tol, d, in, info )
Include Files
mkl.fi
Description
The routine factorizes the matrix (T - lambda*I), where T is an n-by-n tridiagonal matrix and lambda is a
scalar, as
T - lambda*I = P*L*U,
where P is a permutation matrix, L is a unit lower tridiagonal matrix with at most one non-zero sub-diagonal
element per column and U is an upper triangular matrix with at most two non-zero super-diagonal elements
per column. The factorization is obtained by Gaussian elimination with partial pivoting and implicit row
scaling. The parameter lambda is included in the routine so that ?lagtf may be used, in conjunction with
?lagts, to obtain eigenvectors of T by inverse iteration.
Input Parameters
a, b, c REAL for slagtf

DOUBLE PRECISION for dlagtf
Arrays, dimension a(n), b(n-1), c(n-1):
On entry, a(*) must contain the diagonal elements of the matrix T.
On entry, b(*) must contain the (n-1) super-diagonal elements of T.
On entry, c(*) must contain the (n-1) sub-diagonal elements of T.
tol REAL for slagtf
DOUBLE PRECISION for dlagtf.
On entry, a relative tolerance used to indicate whether or not the matrix (T
- lambda*I) is nearly singular. tol should normally be chosen as
approximately the largest relative error in the elements of T. For example, if
the elements of T are correct to about 4 significant figures, then tol should
be set to about 5*10^(-4). If tol is supplied as less than eps, where eps is the
relative machine precision, then the value eps is used in place of tol.
Output Parameters
a On exit, a is overwritten by the n diagonal elements of the upper triangular

matrix U of the factorization of T.
b On exit, b is overwritten by the n-1 super-diagonal elements of the matrix

U of the factorization of T.
c On exit, c is overwritten by the n-1 sub-diagonal elements of the matrix L

of the factorization of T.
d REAL for slagtf
DOUBLE PRECISION for dlagtf.
Array, dimension (n-2).
On exit, d is overwritten by the n-2 second super-diagonal elements of the
matrix U of the factorization of T.
in INTEGER.
Array, dimension (n).
On exit, in contains details of the permutation matrix P. If an interchange
occurred at the k-th step of the elimination, then in(k) = 1, otherwise
in(k) = 0. The element in(n) returns the smallest positive integer j such
that
abs(u(j,j)) ≤ norm((T - lambda*I)(j))*tol,
where norm(A(j)) denotes the sum of the absolute values of the j-th row
of the matrix A.
If no such j exists then in(n) is returned as zero. If in(n) is returned as
positive, then a diagonal element of U is small, indicating that (T -
lambda*I) is singular or nearly singular.
info INTEGER.
If info = -k, the k-th parameter had an illegal value.
?lagtm
Performs a matrix-matrix product of the form C =
alpha*A*B+beta*C, where A is a tridiagonal matrix,
B and C are rectangular matrices, and alpha and
beta are scalars, which may be 0, 1, or -1.
Syntax
call slagtm( trans, n, nrhs, alpha, dl, d, du, x, ldx, beta, b, ldb )
call dlagtm( trans, n, nrhs, alpha, dl, d, du, x, ldx, beta, b, ldb )
call clagtm( trans, n, nrhs, alpha, dl, d, du, x, ldx, beta, b, ldb )
call zlagtm( trans, n, nrhs, alpha, dl, d, du, x, ldx, beta, b, ldb )
Include Files
mkl.fi
Description
The routine performs a matrix-matrix product of the form:

B := alpha*A*X + beta*B
where A is a tridiagonal matrix of order n, B and X are n-by-nrhs matrices, and alpha and beta are real
scalars, each of which may be 0., 1., or -1.
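Input Parameters
trans CHARACTER*1. Must be 'N' or 'T' or 'C'.
Indicates the form of the equations:
If trans = 'N', then B := alpha*A*X + beta*B (no transpose);
If trans = 'T', then B := alpha*A^T*X + beta*B (transpose);
If trans = 'C', then B := alpha*A^H*X + beta*B (conjugate transpose).
n INTEGER. The order of the matrix A (n ≥ 0).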
nrhs INTEGER. The number of right-hand sides, i.e., the number of columns in X
and B (nrhs ≥ 0).
alpha, beta REAL for slagtm/clagtm

DOUBLE PRECISION for dlagtm/zlagtm
Specify the scalars alpha and beta respectively. alpha must be 0., 1., or
-1.; otherwise, it is assumed to be 0. beta must be 0., 1., or -1.;
otherwise, it is assumed to be 1.
dl, d, du REAL for slagtm

DOUBLE PRECISION for dlagtm
COMPLEX for clagtm
DOUBLE COMPLEX for zlagtm.
Arrays: dl(n - 1), d(n), du(n - 1).
The array dl contains the (n - 1) sub-diagonal elements of A.
The array d contains the n diagonal elements of A.
The array du contains the (n - 1) super-diagonal elements of A.
x, b REAL for slagtm

DOUBLE PRECISION for dlagtm
COMPLEX for clagtm
DOUBLE COMPLEX for zlagtm.
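Arrays:
x(ldx,*) contains the n-by-nrhs matrix X. The second dimension of x must
be at least max(1, nrhs).
b(ldb,*) contains the n-by-nrhs matrix B. The second dimension of b must
be at least max(1, nrhs).
ldx INTEGER. The leading dimension of the array x; ldx ≥ max(1, n).
ldb INTEGER. The leading dimension of the array b; ldb ≥ max(1, n).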
Output Parameters
b Overwritten by the matrix expression B := alpha*A*X + beta*B.
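A small worked case may help with the storage of the three diagonals. The sketch below assumes linking against Intel MKL or another LAPACK-compatible library, with an external declaration in place of mkl.fi; the values and program name are illustrative. With dl = (1, 1), d = (2, 2, 2), du = (3, 3), the matrix is A = [ 2 3 0 ; 1 2 3 ; 0 1 2 ], and for X = (1, 1, 1), alpha = 1, beta = 0 the product A*X is (5, 6, 3).

```fortran
program lagtm_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1
  real(dp) :: dl(n-1), d(n), du(n-1), x(n,nrhs), b(n,nrhs)
  external :: dlagtm

  dl = 1.0_dp        ! sub-diagonal
  d  = 2.0_dp        ! diagonal
  du = 3.0_dp        ! super-diagonal
  x  = 1.0_dp        ! single right-hand side of ones
  b  = 0.0_dp

  ! B := 1*A*X + 0*B, no transpose.
  call dlagtm('N', n, nrhs, 1.0_dp, dl, d, du, x, n, 0.0_dp, b, n)

  print '(a,3f8.3)', 'A*X = ', b(:,1)
end program lagtm_sketch
```

Note that alpha and beta are restricted to 0, 1, or -1; for general scaling factors a separate scaling step would be needed.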
?lagts
Solves the system of equations (T - lambda*I)*x = y
or (T - lambda*I)^T*x = y, where T is a general
tridiagonal matrix and lambda is a scalar, using the LU
factorization computed by ?lagtf.
Syntax
call slagts( job, n, a, b, c, d, in, y, tol, info )
call dlagts( job, n, a, b, c, d, in, y, tol, info )
Include Files
mkl.fi
Description
The routine may be used to solve for x one of the systems of equations:
(T - lambda*I)*x = y or (T - lambda*I)^T*x = y,
where T is an n-by-n tridiagonal matrix, following the factorization of (T - lambda*I) as
T - lambda*I = P*L*U,
computed by the routine ?lagtf.
The choice of equation to be solved is controlled by the argument job, and in each case there is an option to
perturb zero or very small diagonal elements of U, this option being intended for use in applications such as
inverse iteration.
Input Parameters
job INTEGER. Specifies the job to be performed by ?lagts as follows:

= 1: The equations (T - lambda*I)*x = y are to be solved, but diagonal
elements of U are not to be perturbed.
= -1: The equations (T - lambda*I)*x = y are to be solved and, if
overflow would otherwise occur, the diagonal elements of U are to be
perturbed. See argument tol below.
= 2: The equations (T - lambda*I)^T*x = y are to be solved, but diagonal
elements of U are not to be perturbed.
= -2: The equations (T - lambda*I)^T*x = y are to be solved and, if
overflow would otherwise occur, the diagonal elements of U are to be
perturbed. See argument tol below.
a, b, c, d REAL for slagts

DOUBLE PRECISION for dlagts
Arrays, dimension a(n), b(n-1), c(n-1), d(n-2):
On entry, a(*) must contain the diagonal elements of U as returned from
?lagtf.
On entry, b(*) must contain the first super-diagonal elements of U as
returned from ?lagtf.
On entry, c(*) must contain the sub-diagonal elements of L as returned
from ?lagtf.
On entry, d(*) must contain the second super-diagonal elements of U as
returned from ?lagtf.
in INTEGER.
Array, dimension (n).
On entry, in(*) must contain details of the matrix P as returned from
?lagtf.
y REAL for slagts

DOUBLE PRECISION for dlagts
Array, dimension (n). On entry, the right hand side vector y.
tol REAL for slagts
DOUBLE PRECISION for dlagts.
On entry, with job < 0, tol should be the minimum perturbation to be
made to very small diagonal elements of U. tol should normally be chosen
as about eps*norm(U), where eps is the relative machine precision, but if
tol is supplied as non-positive, then it is reset to
eps*max( abs( u(i,j)) ). If job > 0 then tol is not referenced.
Output Parameters
y On exit, y is overwritten by the solution vector x.
tol On exit, tol is changed as described in Input Parameters section above, only
if tol is non-positive on entry. Otherwise tol is unchanged.
info INTEGER.
If info = -i, the i-th parameter had an illegal value. If info = i > 0,
overflow would occur when computing the i-th element of the solution vector
x. This can only occur when job is supplied as positive and either means
that a diagonal element of U is very small, or that the elements of the right-
hand side vector y are very large.
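The two routines are meant to be used as a pair, and the sketch below shows one possible sequence: factor T - lambda*I with dlagtf, then solve with dlagts using job = 1. It assumes linking against Intel MKL or another LAPACK-compatible library, with external declarations in place of mkl.fi; the matrix, shift, and program name are illustrative. Here T has diagonal 4 and off-diagonals 1, lambda = 1, and y is chosen as (T - lambda*I)*x for x = (1, 1, 1, 1), that is, y = (4, 5, 5, 4).

```fortran
program lagtf_lagts_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: a(n), b(n-1), c(n-1), d(n-2), y(n), lambda, tol
  integer  :: in(n), info
  external :: dlagtf, dlagts

  a = 4.0_dp         ! diagonal of T
  b = 1.0_dp         ! super-diagonal of T
  c = 1.0_dp         ! sub-diagonal of T
  lambda = 1.0_dp
  tol = 0.0_dp       ! values below eps are replaced by eps inside dlagtf

  ! Factor T - lambda*I = P*L*U; a, b, c are overwritten, d and in are filled.
  call dlagtf(n, a, lambda, b, c, tol, d, in, info)
  if (info /= 0) error stop 'dlagtf failed'
  print '(a,i0)', 'in(n) (0 means no small pivot detected) = ', in(n)

  ! Right-hand side for the exact solution x = (1, 1, 1, 1).
  y = [4.0_dp, 5.0_dp, 5.0_dp, 4.0_dp]

  ! job = 1: solve (T - lambda*I)*x = y without perturbing U.
  call dlagts(1, n, a, b, c, d, in, y, tol, info)

  print '(a,i0)',    'info = ', info
  print '(a,4f10.6)', 'x    = ', y
end program lagtf_lagts_sketch
```

For inverse iteration on a nearly singular T - lambda*I, a negative job value would be the more natural choice, since it lets the routine perturb tiny diagonal elements of U instead of reporting overflow.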
?lagv2
Computes the Generalized Schur factorization of a real
2-by-2 matrix pencil (A,B) where B is upper
triangular.
Syntax
call slagv2( a, lda, b, ldb, alphar, alphai, beta, csl, snl, csr, snr )
call dlagv2( a, lda, b, ldb, alphar, alphai, beta, csl, snl, csr, snr )
Include Files
mkl.fi
Description
The routine computes the Generalized Schur factorization of a real 2-by-2 matrix pencil (A,B) where B is
upper triangular. The routine computes orthogonal (rotation) matrices given by csl, snl and csr, snr such
that:
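1) if the pencil (A,B) has two real eigenvalues (include 0/0 or 1/0 types), then
   [ a11  a12 ] := [  csl  snl ] [ a11  a12 ] [ csr  -snr ]
   [  0   a22 ]    [ -snl  csl ] [ a21  a22 ] [ snr   csr ]
   [ b11  b12 ] := [  csl  snl ] [ b11  b12 ] [ csr  -snr ]
   [  0   b22 ]    [ -snl  csl ] [  0   b22 ] [ snr   csr ]
2) if the pencil (A,B) has a pair of complex conjugate eigenvalues, then
   [ a11  a12 ] := [  csl  snl ] [ a11  a12 ] [ csr  -snr ]
   [ a21  a22 ]    [ -snl  csl ] [ a21  a22 ] [ snr   csr ]
   [ b11   0  ] := [  csl  snl ] [ b11  b12 ] [ csr  -snr ]
   [  0   b22 ]    [ -snl  csl ] [  0   b22 ] [ snr   csr ]
where b11 ≥ b22 > 0.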
Input Parameters
a, b REAL for slagv2

DOUBLE PRECISION for dlagv2
Arrays:
a(lda,2) contains the 2-by-2 matrix A;
b(ldb,2) contains the upper triangular 2-by-2 matrix B.
lda INTEGER. The leading dimension of the array a;
lda ≥ 2.
ldb INTEGER. The leading dimension of the array b;
ldb ≥ 2.
Output Parameters
a On exit, a is overwritten by the "A-part" of the generalized Schur form.
b On exit, b is overwritten by the "B-part" of the generalized Schur form.
alphar, alphai, beta REAL for slagv2

DOUBLE PRECISION for dlagv2.
Arrays, dimension (2) each.
(alphar(k) + i*alphai(k))/beta(k) are the eigenvalues of the pencil
(A,B), k=1,2 and i = sqrt(-1).
Note that beta(k) may be zero.
csl, snl REAL for slagv2
DOUBLE PRECISION for dlagv2.
The cosine and sine of the left rotation matrix, respectively.
csr, snr REAL for slagv2
DOUBLE PRECISION for dlagv2.
The cosine and sine of the right rotation matrix, respectively.
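The sketch below applies dlagv2 to a pencil with two real eigenvalues and recovers them from alphar and beta. It assumes linking against Intel MKL or another LAPACK-compatible library, with an external declaration in place of mkl.fi; the matrices and program name are illustrative. For A = [ 1 2 ; 3 4 ] and B = [ 1 1 ; 0 1 ], det(A - w*B) = w^2 - 2*w - 2, so the eigenvalues are 1 + sqrt(3) and 1 - sqrt(3).

```fortran
program lagv2_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: a(2,2), b(2,2)
  real(dp) :: alphar(2), alphai(2), beta(2)
  real(dp) :: csl, snl, csr, snr, w(2)
  integer  :: k
  external :: dlagv2

  ! Column-major fill: A = [ 1 2 ; 3 4 ], B = [ 1 1 ; 0 1 ] (upper triangular).
  a = reshape([1.0_dp, 3.0_dp, 2.0_dp, 4.0_dp], [2, 2])
  b = reshape([1.0_dp, 0.0_dp, 1.0_dp, 1.0_dp], [2, 2])

  call dlagv2(a, 2, b, 2, alphar, alphai, beta, csl, snl, csr, snr)

  ! Real eigenvalues: alphai is zero and A is reduced to upper triangular form.
  do k = 1, 2
     w(k) = alphar(k) / beta(k)   ! safe here because B is nonsingular
     print '(a,i0,a,f12.8,a,f12.8)', 'eigenvalue ', k, ': ', w(k), &
           '   imaginary part of alpha: ', alphai(k)
  end do
  print '(a,es10.2)', 'a(2,1) after the call = ', a(2,1)
end program lagv2_sketch
```

In general beta(k) may be zero, so production code would test it before dividing.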
?lahqr
Computes the eigenvalues and Schur factorization of
an upper Hessenberg matrix, using the double-shift/
single-shift QR algorithm.
Syntax
call slahqr( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz, info )
call dlahqr( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz, info )
call clahqr( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, info )
call zlahqr( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, info )
Include Files
mkl.fi
Description
The routine is an auxiliary routine called by ?hseqr to update the eigenvalues and Schur decomposition
already computed by ?hseqr, by dealing with the Hessenberg submatrix in rows and columns ilo to ihi.
Input Parameters
wantt LOGICAL.
If wantt = .TRUE., the full Schur form T is required;
If wantt = .FALSE., eigenvalues only are required.
wantz LOGICAL.
If wantz = .TRUE., the matrix of Schur vectors Z is required;
If wantz = .FALSE., Schur vectors are not required.
ilo, ihi INTEGER.

It is assumed that h is already upper quasi-triangular in rows and columns
ihi+1:n, and that h(ilo,ilo-1) = 0 (unless ilo = 1). The routine ?lahqr
works primarily with the Hessenberg submatrix in rows and columns
ilo to ihi, but applies transformations to all of h if wantt = .TRUE..
Constraints:
1 ≤ ilo ≤ max(1,ihi); ihi ≤ n.
h, z REAL for slahqr

DOUBLE PRECISION for dlahqr
COMPLEX for clahqr
DOUBLE COMPLEX for zlahqr.
Arrays:
h(ldh,*) contains the upper Hessenberg matrix H.
z(ldz,*)
If wantz = .TRUE., then, on entry, z must contain the current matrix Z of
transformations accumulated by ?hseqr. The second dimension of z must
be at least max(1, n).
If wantz = .FALSE., then z is not referenced.
ldz INTEGER. The leading dimension of z; at least max(1, n).
iloz, ihiz INTEGER. Specify the rows of z to which transformations must be applied if
wantz = .TRUE..
1 ≤ iloz ≤ ilo; ihi ≤ ihiz ≤ n.
Output Parameters
h On exit, if info= 0 and wantt = .TRUE., then,
for slahqr/dlahqr, h is upper quasi-triangular in rows and columns

ilo:ihi with any 2-by-2 diagonal blocks in standard form.
for clahqr/zlahqr, h is upper triangular in rows and columns ilo:ihi.
If info= 0 and wantt = .FALSE., the contents of h are unspecified on exit.

If info is positive, see description of info for the output state of h.
wr, wi REAL for slahqr

DOUBLE PRECISION for dlahqr
Arrays, DIMENSION at least max(1, n) each. Used with real flavors only.
The real and imaginary parts, respectively, of the computed eigenvalues ilo
to ihi are stored in the corresponding elements of wr and wi. If two
eigenvalues are computed as a complex conjugate pair, they are stored in
consecutive elements of wr and wi, say the i-th and (i+1)-th, with wi(i) >
0 and wi(i+1) < 0.
If wantt = .TRUE., the eigenvalues are stored in the same order as on the
diagonal of the Schur form returned in h, with wr(i) = h(i,i), and, if
h(i:i+1, i:i+1) is a 2-by-2 diagonal block, wi(i) = sqrt(-h(i+1,i)*h(i,i+1))
and wi(i+1) = -wi(i).
w COMPLEX for clahqr
DOUBLE COMPLEX for zlahqr.
Array, DIMENSION at least max(1, n). Used with complex flavors only. The
computed eigenvalues ilo to ihi are stored in the corresponding elements of
w.
If wantt = .TRUE., the eigenvalues are stored in the same order as on the
diagonal of the Schur form returned in h, with w(i) = h(i,i).
z If wantz = .TRUE., then, on exit z has been updated; transformations are

applied only to the submatrix z(iloz:ihiz, ilo:ihi).
info INTEGER.
With info > 0,
if info = i, ?lahqr failed to compute all the eigenvalues ilo to ihi in a
total of 30 iterations per eigenvalue; elements i+1:ihi of wr and wi (for
slahqr/dlahqr) or w (for clahqr/zlahqr) contain those eigenvalues
which have been successfully computed.
If wantt is .FALSE., then on exit the remaining unconverged eigenvalues
are the eigenvalues of the upper Hessenberg matrix in rows and columns
ilo through info of the final output value of h.
If wantt is .TRUE., then on exit
(initial value of h)*u = u*(final value of h), (*)
where u is an orthogonal/unitary matrix. The final value of h is upper Hessenberg
and triangular in rows and columns info+1 through ihi.
If wantz is .TRUE., then on exit
(final value of z) = (initial value of z)*u,
where u is the orthogonal/unitary matrix in (*), regardless of the value of wantt.
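As a hedged illustration of the calling sequence, the sketch below runs dlahqr on a small upper Hessenberg matrix over the full range ilo = 1, ihi = n. It assumes linkage against Intel MKL or another LAPACK library. The matrix entries are made up, and z is initialized to the identity on the assumption that no earlier transformations have been accumulated.

```fortran
program lahqr_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 3
  external dlahqr
  real(dp) :: h(n,n), z(n,n), wr(n), wi(n)
  integer :: i, info

  ! Illustrative upper Hessenberg matrix (h(3,1) = 0), filled column by column.
  h = reshape([2.0_dp, 1.0_dp, 0.0_dp, &
               1.0_dp, 3.0_dp, 1.0_dp, &
               0.5_dp, 1.0_dp, 4.0_dp], [n, n])

  ! Assumption: no earlier transformations exist, so Z starts as the identity.
  z = 0.0_dp
  do i = 1, n
    z(i,i) = 1.0_dp
  end do

  ! wantt = wantz = .true.: request the full Schur form and the Schur vectors.
  ! ilo = iloz = 1 and ihi = ihiz = n cover the whole matrix.
  call dlahqr(.true., .true., n, 1, n, h, n, wr, wi, 1, n, z, n, info)

  if (info == 0) then
    do i = 1, n
      print '(a,i0,a,2es13.5)', 'eigenvalue ', i, ' (re, im): ', wr(i), wi(i)
    end do
  else
    ! Only elements info+1:ihi of wr and wi are valid in this case.
    print '(a,i0)', 'dlahqr did not converge, info = ', info
  end if
end program lahqr_sketch
```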
?lahrd
Reduces the first nb columns of a general rectangular
matrix A so that elements below the k-th subdiagonal
are zero, and returns auxiliary matrices which are
needed to apply the transformation to the unreduced
part of A (deprecated).
Syntax
call slahrd( n, k, nb, a, lda, tau, t, ldt, y, ldy )
call dlahrd( n, k, nb, a, lda, tau, t, ldt, y, ldy )
call clahrd( n, k, nb, a, lda, tau, t, ldt, y, ldy )
call zlahrd( n, k, nb, a, lda, tau, t, ldt, y, ldy )
Include Files
mkl.fi
Description
This routine is deprecated; use ?lahr2 instead.
The routine reduces the first nb columns of a real/complex general n-by-(n-k+1) matrix A so that elements
below the k-th subdiagonal are zero. The reduction is performed by an orthogonal/unitary similarity
transformation Q**T*A*Q for real flavors, or Q**H*A*Q for complex flavors. The routine returns the matrices V
and T which determine Q as a block reflector I - V*T*V**T (for real flavors) or I - V*T*V**H (for complex flavors),
and also the matrix Y = A*V*T.
The matrix Q is represented as a product of nb elementary reflectors:
Q = H(1)*H(2)*... *H(nb)
Each H(i) has the form
H(i) = I - tau*v*v**T for real flavors, or
H(i) = I - tau*v*v**H for complex flavors,
where tau is a real/complex scalar, and v is a real/complex vector.
Input Parameters
k INTEGER. The offset for the reduction. Elements below the k-th subdiagonal
in the first nb columns are reduced to zero.
nb INTEGER. The number of columns to be reduced.
a REAL for slahrd

DOUBLE PRECISION for dlahrd
COMPLEX for clahrd
DOUBLE COMPLEX for zlahrd.
Array a(lda, n-k+1) contains the n-by-(n-k+1) general matrix A to be
reduced.
ldt INTEGER. The leading dimension of the output array t; must be at least
max(1, nb).
ldy INTEGER. The leading dimension of the output array y; must be at least
max(1, n).
Output Parameters
a On exit, the elements on and above the k-th subdiagonal in the first nb
columns are overwritten with the corresponding elements of the reduced
matrix; the elements below the k-th subdiagonal, with the array tau,
represent the matrix Q as a product of elementary reflectors. The other
columns of a are unchanged. See Application Notes below.
tau REAL for slahrd
DOUBLE PRECISION for dlahrd
COMPLEX for clahrd
DOUBLE COMPLEX for zlahrd.
Array, DIMENSION (nb).
t, y REAL for slahrd
DOUBLE PRECISION for dlahrd
COMPLEX for clahrd
DOUBLE COMPLEX for zlahrd.
Arrays, dimension t(ldt, nb), y(ldy, nb).
The array t contains the upper triangular matrix T.
The array y contains the n-by-nb matrix Y.
Application Notes
For the elementary reflector H(i),
v(1:i+k-1) = 0, v(i+k) = 1; v(i+k+1:n) is stored on exit in a(i+k+1:n, i) and tau is stored in tau(i).
The elements of the vectors v together form the (n-k+1)-by-nb matrix V which is needed, with T and Y, to
apply the transformation to the unreduced part of the matrix, using an update of the form:
A := (I - V*T*V**T) * (A - Y*V**T) for real flavors, or
A := (I - V*T*V**H) * (A - Y*V**H) for complex flavors.
The contents of A on exit are illustrated by the following example with n = 7, k = 3 and nb = 2:
See Also
?lahr2
?lahr2
Reduces the specified number of first columns of a
general rectangular matrix A so that elements below
the specified subdiagonal are zero, and returns
auxiliary matrices which are needed to apply the
transformation to the unreduced part of A.
Syntax
call slahr2( n, k, nb, a, lda, tau, t, ldt, y, ldy )
call dlahr2( n, k, nb, a, lda, tau, t, ldt, y, ldy )

call clahr2( n, k, nb, a, lda, tau, t, ldt, y, ldy )
call zlahr2( n, k, nb, a, lda, tau, t, ldt, y, ldy )
Include Files
mkl.fi
Description
The routine reduces the first nb columns of a real/complex general n-by-(n-k+1) matrix A so that elements
below the k-th subdiagonal are zero. The reduction is performed by an orthogonal/unitary similarity
transformation Q**T*A*Q for real flavors, or Q**H*A*Q for complex flavors. The routine returns the matrices V
and T which determine Q as a block reflector I - V*T*V**T (for real flavors) or I - V*T*V**H (for complex flavors), and
also the matrix Y = A*V*T.
The matrix Q is represented as a product of nb elementary reflectors:
Q = H(1)*H(2)*... *H(nb)
Each H(i) has the form
H(i) = I - tau*v*v**T for real flavors, or
H(i) = I - tau*v*v**H for complex flavors,
where tau is a real/complex scalar, and v is a real/complex vector.
This is an auxiliary routine called by ?gehrd.
Input Parameters
k INTEGER. The offset for the reduction. Elements below the k-th subdiagonal
in the first nb columns are reduced to zero (k< n).
nb INTEGER. The number of columns to be reduced.
a REAL for slahr2

DOUBLE PRECISION for dlahr2
COMPLEX for clahr2
DOUBLE COMPLEX for zlahr2.
Array, DIMENSION (lda, n-k+1) contains the n-by-(n-k+1) general matrix A
to be reduced.
lda INTEGER. The leading dimension of the array a; lda ≥ max(1, n).
ldt INTEGER. The leading dimension of the output array t; ldt ≥ nb.
ldy INTEGER. The leading dimension of the output array y; ldy ≥ n.
Output Parameters
a On exit, the elements on and above the k-th subdiagonal in the first nb
columns are overwritten with the corresponding elements of the reduced
matrix; the elements below the k-th subdiagonal, with the array tau,
represent the matrix Q as a product of elementary reflectors. The other
columns of a are unchanged. See Application Notes below.
tau REAL for slahr2
DOUBLE PRECISION for dlahr2
COMPLEX for clahr2
DOUBLE COMPLEX for zlahr2.
Array, DIMENSION (nb).
t, y REAL for slahr2
DOUBLE PRECISION for dlahr2
COMPLEX for clahr2
DOUBLE COMPLEX for zlahr2.
Arrays, dimension t(ldt, nb), y(ldy, nb).
The array t contains the upper triangular matrix T.
The array y contains the n-by-nb matrix Y.
Application Notes
For the elementary reflector H(i),
v(1:i+k-1) = 0, v(i+k) = 1; v(i+k+1:n) is stored on exit in a(i+k+1:n, i) and tau is stored in tau(i).
The elements of the vectors v together form the (n-k+1)-by-nb matrix V which is needed, with T and Y, to
apply the transformation to the unreduced part of the matrix, using an update of the form:
A := (I - V*T*V**T) * (A - Y*V**T) for real flavors, or
A := (I - V*T*V**H) * (A - Y*V**H) for complex flavors.
The contents of A on exit are illustrated by the following example with n = 7, k = 3 and nb = 2:
?laic1
Applies one step of incremental condition estimation.
Syntax
call slaic1( job, j, x, sest, w, gamma, sestpr, s, c )
call dlaic1( job, j, x, sest, w, gamma, sestpr, s, c )
call claic1( job, j, x, sest, w, gamma, sestpr, s, c )
call zlaic1( job, j, x, sest, w, gamma, sestpr, s, c )
Include Files
mkl.fi
Description
The routine ?laic1 applies one step of incremental condition estimation in its simplest version.
Let x, ||x||2 = 1 (where ||a||2 denotes the 2-norm of a), be an approximate singular vector of a j-by-j
lower triangular matrix L, such that
||L*x||2 = sest
Then ?laic1 computes sestpr, s, c such that the vector
       [ s*x ]
xhat = [  c  ]
is an approximate singular vector of
       [ L      0    ]
Lhat = [ w**H  gamma ]   (for complex flavors), or
       [ L      0    ]
Lhat = [ w**T  gamma ]   (for real flavors), in the sense that
||Lhat*xhat||2 = sestpr.
Depending on job, an estimate for the largest or smallest singular value is computed.
For real flavors, [s c]**T and sestpr**2 is an eigenpair of the system
diag(sest*sest, 0) + [alpha gamma] * [ alpha ]
                                     [ gamma ]
where alpha = x**T*w.
For complex flavors, [s c]**H and sestpr**2 is an eigenpair of the system
diag(sest*sest, 0) + [alpha gamma] * [ conjg(alpha) ]
                                     [ conjg(gamma) ]
where alpha = x**H*w.
Input Parameters
job INTEGER.
If job =1, an estimate for the largest singular value is computed;
If job =2, an estimate for the smallest singular value is computed;
j INTEGER. Length of x and w.
x, w REAL for slaic1

DOUBLE PRECISION for dlaic1
COMPLEX for claic1
DOUBLE COMPLEX for zlaic1.
Arrays, dimension (j) each. Contain vectors x and w, respectively.
sest REAL for slaic1/claic1;

DOUBLE PRECISION for dlaic1/zlaic1.
Estimated singular value of j-by-j matrix L.
gamma REAL for slaic1
DOUBLE PRECISION for dlaic1
COMPLEX for claic1
DOUBLE COMPLEX for zlaic1.
The diagonal element gamma.
Output Parameters
sestpr REAL for slaic1/claic1;

DOUBLE PRECISION for dlaic1/zlaic1.
Estimated singular value of (j+1)-by-(j+1) matrix Lhat.
s, c REAL for slaic1
DOUBLE PRECISION for dlaic1
COMPLEX for claic1
DOUBLE COMPLEX for zlaic1.
Sine and cosine needed in forming xhat.
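The sketch below shows one way a single dlaic1 step might be exercised. It assumes linkage against Intel MKL or another LAPACK library, and all numeric values are illustrative. It also assumes that Lhat is L with the row (w, gamma) appended and that xhat stacks s*x on top of c, then prints the 2-norm of Lhat*xhat next to sestpr so that the two can be compared. The variable gam stands for the gamma argument.

```fortran
program laic1_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), j = 2
  external dlaic1
  real(dp) :: l(j,j), x(j), w(j), lhat(j+1,j+1), xhat(j+1)
  real(dp) :: sest, gam, sestpr, s, c

  ! Illustrative lower triangular L (filled column by column) and unit vector x.
  l = reshape([3.0_dp, 1.0_dp, 0.0_dp, 2.0_dp], [j, j])
  x = [0.6_dp, 0.8_dp]
  sest = sqrt(sum(matmul(l, x)**2))        ! sest = ||L*x||2

  ! Row (w, gam) that extends L to the (j+1)-by-(j+1) matrix Lhat.
  w = [0.5_dp, -1.0_dp]
  gam = 2.5_dp

  ! job = 1: estimate the largest singular value.
  call dlaic1(1, j, x, sest, w, gam, sestpr, s, c)

  ! Assumed layout: Lhat = [L 0; w**T gam], xhat = [s*x; c].
  lhat = 0.0_dp
  lhat(1:j,1:j) = l
  lhat(j+1,1:j) = w
  lhat(j+1,j+1) = gam
  xhat(1:j) = s*x
  xhat(j+1) = c

  print '(a,es13.5)', 'sestpr          = ', sestpr
  print '(a,es13.5)', '||Lhat*xhat||2  = ', sqrt(sum(matmul(lhat, xhat)**2))
  print '(a,2es13.5)', 's, c            = ', s, c
end program laic1_sketch
```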
?lakf2
Forms a matrix containing Kronecker products
between the given matrices.
Syntax
call slakf2( m, n, a, lda, b, d, e, z, ldz )
call dlakf2( m, n, a, lda, b, d, e, z, ldz )
call clakf2( m, n, a, lda, b, d, e, z, ldz )
call zlakf2( m, n, a, lda, b, d, e, z, ldz )
Include Files
mkl.fi
Description
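The routine ?lakf2 forms the 2*m*n by 2*m*n matrix Z:
    [ kron(In, A)  -kron(B**T, Im) ]
Z = [ kron(In, D)  -kron(E**T, Im) ],
where In is the identity matrix of size n, Im is the identity matrix of size m, and X**T is the transpose of X.
kron(X, Y) is the Kronecker product between the matrices X and Y.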
Input Parameters
m INTEGER. Size of matrix, m ≥ 1.
n INTEGER. Size of matrix, n ≥ 1.
a REAL for slakf2,
DOUBLE PRECISION for dlakf2,
COMPLEX for clakf2,
DOUBLE COMPLEX for zlakf2,
Array, size lda-by-m. The matrix A in the output matrix Z.
lda INTEGER. The leading dimension of a, b, d, and e. lda ≥ m+n.
b REAL for slakf2,
DOUBLE PRECISION for dlakf2,
COMPLEX for clakf2,
DOUBLE COMPLEX for zlakf2,
Array, size lda-by-n. Matrix used in forming the output matrix Z.
d REAL for slakf2,
DOUBLE PRECISION for dlakf2,
COMPLEX for clakf2,
DOUBLE COMPLEX for zlakf2,
Array, size lda-by-m. Matrix used in forming the output matrix Z.
e REAL for slakf2,
DOUBLE PRECISION for dlakf2,
COMPLEX for clakf2,
DOUBLE COMPLEX for zlakf2,
Array, size lda-by-n. Matrix used in forming the output matrix Z.
ldz INTEGER. The leading dimension of z. ldz ≥ 2*m*n.
Output Parameters
z REAL for slakf2,
DOUBLE PRECISION for dlakf2,
COMPLEX for clakf2,
DOUBLE COMPLEX for zlakf2,
Array, size ldz-by-2*m*n. The resultant Kronecker 2*m*n-by-2*m*n
matrix.
?laln2
Solves a 1-by-1 or 2-by-2 linear system of equations
of the specified form.
Syntax
call slaln2( ltrans, na, nw, smin, ca, a, lda, d1, d2, b, ldb, wr, wi, x, ldx, scale,
xnorm, info )
call dlaln2( ltrans, na, nw, smin, ca, a, lda, d1, d2, b, ldb, wr, wi, x, ldx, scale,
xnorm, info )
Include Files
mkl.fi
Description
The routine solves a system of the form

(ca*A - w*D)*X = s*B, or (ca*A**T - w*D)*X = s*B
with possible scaling (s) and perturbation of A.
A is an na-by-na real matrix, ca is a real scalar, D is an na-by-na real diagonal matrix, w is a real or complex
value, and X and B are na-by-1 matrices: real if w is real, complex if w is complex. The parameter na may be
1 or 2.
If w is complex, X and B are represented as na-by-2 matrices, the first column of each being the real part
and the second being the imaginary part.
The routine computes the scaling factor s (≤ 1), chosen so that X can be computed without overflow. X is
further scaled if necessary to ensure that norm(ca*A - w*D)*norm(X) is less than overflow.
If both singular values of (ca*A - w*D) are less than smin, smin*I (where I stands for identity) will be used
instead of (ca*A - w*D). If only one singular value is less than smin, one element of (ca*A - w*D) will be
perturbed enough to make the smallest singular value roughly smin.
If both singular values are at least smin, (ca*A - w*D) will not be perturbed. In any case, the perturbation
will be at most some small multiple of max(smin, ulp*norm(ca*A - w*D)).
The singular values are computed by infinity-norm approximations, and thus will only be correct to a factor of
2 or so.
NOTE
All input quantities are assumed to be smaller than overflow by a reasonable factor (see bignum).
Input Parameters
ltrans LOGICAL.
If ltrans = .TRUE., the transpose of A will be used.
If ltrans = .FALSE., A will be used (not transposed).
na INTEGER. The size of the matrix A, possible values 1 or 2.
nw INTEGER. This parameter must be 1 if w is real, and 2 if w is complex.

Possible values 1 or 2.
smin REAL for slaln2

DOUBLE PRECISION for dlaln2.
The desired lower bound on the singular values of A.
This should be a safe distance away from underflow or overflow, for
example, between (underflow/machine_precision) and
(machine_precision * overflow). (See bignum and ulp).
ca REAL for slaln2
DOUBLE PRECISION for dlaln2.
The coefficient by which A is multiplied.
a REAL for slaln2
DOUBLE PRECISION for dlaln2.
Array, DIMENSION (lda,na).
The na-by-na matrix A.
lda INTEGER. The leading dimension of a. Must be at least na.
d1, d2 REAL for slaln2
DOUBLE PRECISION for dlaln2.
The (1,1) and (2,2) elements in the diagonal matrix D, respectively. d2 is
not used if na = 1.
b REAL for slaln2
DOUBLE PRECISION for dlaln2.
Array, DIMENSION (ldb,nw). The na-by-nw matrix B (right-hand side). If nw
= 2 (w is complex), column 1 contains the real part of B and column 2
contains the imaginary part.
ldb INTEGER. The leading dimension of b. Must be at least na.
wr, wi REAL for slaln2
DOUBLE PRECISION for dlaln2.
The real and imaginary part of the scalar w, respectively.
wi is not used if nw = 1.
ldx INTEGER. The leading dimension of the output array x. Must be at least na.
Output Parameters
x REAL for slaln2
DOUBLE PRECISION for dlaln2.
Array, DIMENSION (ldx,nw). The na-by-nw matrix X (unknowns), as
computed by the routine. If nw = 2 (w is complex), on exit, column 1 will
contain the real part of X and column 2 will contain the imaginary part.
scale REAL for slaln2
DOUBLE PRECISION for dlaln2.
The scale factor that B must be multiplied by to ensure that overflow does
not occur when computing X. Thus (ca*A - w*D)*X will be scale*B, not B
(ignoring perturbations of A). It will be at most 1.
xnorm REAL for slaln2
DOUBLE PRECISION for dlaln2.
The infinity-norm of X, when X is regarded as an na-by-nw real matrix.
info INTEGER.
An error flag. It will be zero if no error occurs, a negative number if an
argument is in error, or a positive number if (ca*A - w*D) had to be
perturbed. The possible values are:
If info = 0: no error occurred, and (ca*A - w*D) did not have to be
perturbed.
If info = 1: (ca*A - w*D) had to be perturbed to make its smallest (or
only) singular value greater than smin.
NOTE
For higher speed, this routine does not check the inputs for errors.
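A minimal sketch of a real 2-by-2 solve with dlaln2 follows, assuming linkage against Intel MKL or another LAPACK library. The matrix, right-hand side, and the choice smin = epsilon(1.0d0) are illustrative assumptions rather than recommended settings. The variable scal stands for the scale argument, and the printed residual is (ca*A - w*D)*X - scale*B.

```fortran
program laln2_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  external dlaln2
  real(dp) :: a(2,2), b(2,1), x(2,1), res(2)
  real(dp) :: smin, ca, d1, d2, wr, wi, scal, xnorm
  integer :: info

  ! Illustrative data, filled column by column.
  a = reshape([4.0_dp, 1.0_dp, 2.0_dp, 3.0_dp], [2, 2])
  b(:,1) = [1.0_dp, 2.0_dp]
  ca = 1.0_dp
  d1 = 1.0_dp
  d2 = 1.0_dp
  wr = 0.5_dp
  wi = 0.0_dp                  ! w is real, so nw = 1 and wi is not used
  smin = epsilon(1.0_dp)       ! assumed lower bound on the singular values

  ! ltrans = .false., na = 2, nw = 1; lda = ldb = ldx = 2.
  call dlaln2(.false., 2, 1, smin, ca, a, 2, d1, d2, b, 2, wr, wi, &
              x, 2, scal, xnorm, info)

  ! Residual of the scaled system; D = diag(d1, d2) acts elementwise on x.
  res = ca*matmul(a, x(:,1)) - wr*[d1, d2]*x(:,1) - scal*b(:,1)

  print '(a,2es13.5)', 'x        = ', x(:,1)
  print '(a,es13.5)',  'scale    = ', scal
  print '(a,es13.5)',  'xnorm    = ', xnorm
  print '(a,i0)',      'info     = ', info
  print '(a,2es13.5)', 'residual = ', res
end program laln2_sketch
```

A return value of info = 1 would indicate that (ca*A - w*D) had to be perturbed, in which case the residual is taken against the perturbed matrix rather than the one formed here.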
?lals0
Applies back multiplying factors in solving the least
squares problem using divide and conquer SVD
approach. Used by ?gelsd.
Syntax
call slals0( icompq, nl, nr, sqre, nrhs, b, ldb, bx, ldbx, perm, givptr, givcol,
ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, work, info )
call dlals0( icompq, nl, nr, sqre, nrhs, b, ldb, bx, ldbx, perm, givptr, givcol,
ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, work, info )
call clals0( icompq, nl, nr, sqre, nrhs, b, ldb, bx, ldbx, perm, givptr, givcol,
ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, rwork, info )
call zlals0( icompq, nl, nr, sqre, nrhs, b, ldb, bx, ldbx, perm, givptr, givcol,
ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, rwork, info )
Include Files
mkl.fi
Description
The routine applies back the multiplying factors of either the left or right singular vector matrix of a diagonal
matrix appended by a row to the right hand side matrix B in solving the least squares problem using the
divide-and-conquer SVD approach.
For the left singular vector matrix, three types of orthogonal matrices are involved:
(1L) Givens rotations: the number of such rotations is givptr; the pairs of columns/rows they were applied to
are stored in givcol; and the c- and s-values of these rotations are stored in givnum.
(2L) Permutation. The (nl+1)-st row of B is to be moved to the first row, and for j=2:n, perm(j)-th row of B
is to be moved to the j-th row.
(3L) The left singular vector matrix of the remaining matrix.
For the right singular vector matrix, four types of orthogonal matrices are involved:
(1R) The right singular vector matrix of the remaining matrix.
(2R) If sqre = 1, one extra Givens rotation to generate the right null space.
(3R) The inverse transformation of (2L).

(4R) The inverse transformation of (1L).
Input Parameters
icompq INTEGER. Specifies whether singular vectors are to be computed in factored

form:
If icompq = 0: Left singular vector matrix.
If icompq = 1: Right singular vector matrix.
nl INTEGER. The row dimension of the upper block.
nl ≥ 1.
nr INTEGER. The row dimension of the lower block.
nr ≥ 1.
sqre INTEGER.
If sqre = 0: the lower block is an nr-by-nr square matrix.
If sqre = 1: the lower block is an nr-by-(nr+1) rectangular matrix. The

bidiagonal matrix has row dimension n = nl + nr + 1, and column
dimension m = n + sqre.
nrhs INTEGER. The number of columns of B and bx.

Must be at least 1.
b REAL for slals0

DOUBLE PRECISION for dlals0
COMPLEX for clals0
DOUBLE COMPLEX for zlals0.
Array, DIMENSION ( ldb, nrhs ).
Contains the right hand sides of the least squares problem in rows 1
through m.
ldb INTEGER. The leading dimension of b.
Must be at least max(1,max( m, n )).
bx REAL for slals0
DOUBLE PRECISION for dlals0
COMPLEX for clals0
DOUBLE COMPLEX for zlals0.
Workspace array, DIMENSION ( ldbx, nrhs ).
ldbx INTEGER. The leading dimension of bx.
perm INTEGER. Array, DIMENSION (n).

The permutations (from deflation and sorting) applied to the two blocks.
givptr INTEGER. The number of Givens rotations which took place in this
subproblem.
givcol INTEGER. Array, DIMENSION ( ldgcol, 2 ). Each pair of numbers indicates a

pair of rows/columns involved in a Givens rotation.
ldgcol INTEGER. The leading dimension of givcol, must be at least n.
givnum REAL for slals0/clals0

DOUBLE PRECISION for dlals0/zlals0
Array, DIMENSION ( ldgnum, 2 ). Each number indicates the c or s value
used in the corresponding Givens rotation.
ldgnum INTEGER. The leading dimension of arrays difr, poles and givnum, must be
at least k.
poles REAL for slals0/clals0
DOUBLE PRECISION for dlals0/zlals0
Array, DIMENSION ( ldgnum, 2 ). On entry, poles(1:k, 1) contains the new
singular values obtained from solving the secular equation, and poles(1:k,
2) is an array containing the poles in the secular equation.
difl REAL for slals0/clals0
DOUBLE PRECISION for dlals0/zlals0
Array, DIMENSION ( k ). On entry, difl(i) is the distance between i-th
updated (undeflated) singular value and the i-th (undeflated) old singular
value.
difr REAL for slals0/clals0
DOUBLE PRECISION for dlals0/zlals0
Array, DIMENSION ( ldgnum, 2 ). On entry, difr(i, 1) contains the distances
between i-th updated (undeflated) singular value and the i+1-th
(undeflated) old singular value. And difr(i, 2) is the normalizing factor for
the i-th right singular vector.
z REAL for slals0/clals0
DOUBLE PRECISION for dlals0/zlals0
Array, DIMENSION ( k ). Contains the components of the deflation-adjusted
updating row vector.
k INTEGER. Contains the dimension of the non-deflated matrix. This is the
order of the related secular equation. 1 ≤ k ≤ n.
c REAL for slals0/clals0
DOUBLE PRECISION for dlals0/zlals0
Contains garbage if sqre =0 and the c value of a Givens rotation related to
the right null space if sqre = 1.
s REAL for slals0/clals0
DOUBLE PRECISION for dlals0/zlals0
Contains garbage if sqre =0 and the s value of a Givens rotation related to
the right null space if sqre = 1.
work REAL for slals0
DOUBLE PRECISION for dlals0
Workspace array, DIMENSION ( k ). Used with real flavors only.
rwork REAL for clals0
DOUBLE PRECISION for zlals0
Workspace array, DIMENSION (k*(1+nrhs) + 2*nrhs). Used with complex
flavors only.
Output Parameters
b On exit, contains the solution X in rows 1 through n.
info INTEGER.
If info = 0: successful exit.
If info = -i < 0, the i-th argument had an illegal value.
?lalsa
Computes the SVD of the coefficient matrix in
compact form. Used by ?gelsd.
Syntax
call slalsa( icompq, smlsiz, n, nrhs, b, ldb, bx, ldbx, u, ldu, vt, k, difl, difr, z,
poles, givptr, givcol, ldgcol, perm, givnum, c, s, work, iwork, info )
call dlalsa( icompq, smlsiz, n, nrhs, b, ldb, bx, ldbx, u, ldu, vt, k, difl, difr, z,
poles, givptr, givcol, ldgcol, perm, givnum, c, s, work, iwork, info )
call clalsa( icompq, smlsiz, n, nrhs, b, ldb, bx, ldbx, u, ldu, vt, k, difl, difr, z,
poles, givptr, givcol, ldgcol, perm, givnum, c, s, rwork, iwork, info )
call zlalsa( icompq, smlsiz, n, nrhs, b, ldb, bx, ldbx, u, ldu, vt, k, difl, difr, z,
poles, givptr, givcol, ldgcol, perm, givnum, c, s, rwork, iwork, info )
Include Files
mkl.fi
Description
The routine is an intermediate step in solving the least squares problem by computing the SVD of the
coefficient matrix in compact form. The singular vectors are computed as products of simple orthogonal
matrices.
If icompq = 0, ?lalsa applies the inverse of the left singular vector matrix of an upper bidiagonal matrix to
the right hand side; and if icompq = 1, the routine applies the right singular vector matrix to the right hand
side. The singular vector matrices were generated in the compact form by ?lalsa.
Input Parameters
icompq INTEGER. Specifies whether the left or the right singular vector matrix is
involved. If icompq = 0: left singular vector matrix is used
If icompq = 1: right singular vector matrix is used.
smlsiz INTEGER. The maximum size of the subproblems at the bottom of the
computation tree.
n INTEGER. The row and column dimensions of the upper bidiagonal matrix.
nrhs INTEGER. The number of columns of b and bx. Must be at least 1.
b REAL for slalsa

DOUBLE PRECISION for dlalsa
COMPLEX for clalsa
DOUBLE COMPLEX for zlalsa
Array, DIMENSION (ldb, nrhs). Contains the right hand sides of the least
squares problem in rows 1 through m.
ldb INTEGER. The leading dimension of b in the calling subprogram. Must be at

least max(1,max( m, n )).
ldbx INTEGER. The leading dimension of the output array bx.
u REAL for slalsa/clalsa

DOUBLE PRECISION for dlalsa/zlalsa
Array, DIMENSION (ldu, smlsiz). On entry, u contains the left singular vector
matrices of all subproblems at the bottom level.
ldu INTEGER, ldu ≥ n. The leading dimension of arrays u, vt, difl, difr, poles,
givnum, and z.
vt REAL for slalsa/clalsa
DOUBLE PRECISION for dlalsa/zlalsa
Array, DIMENSION (ldu, smlsiz+1). On entry, vt**T (for real flavors) or vt**H
(for complex flavors) contains the right singular vector matrices of all
subproblems at the bottom level.
k INTEGER array, DIMENSION ( n ).
difl REAL for slalsa/clalsa
DOUBLE PRECISION for dlalsa/zlalsa
Array, DIMENSION (ldu, nlvl), where nlvl = int(log2(n /(smlsiz+1)))
+ 1.
difr REAL for slalsa/clalsa
DOUBLE PRECISION for dlalsa/zlalsa
Array, DIMENSION(ldu, 2*nlvl). On entry, difl(*, i) and difr(*, 2i -1)
record distances between singular values on the i-th level and singular
values on the (i -1)-th level, and difr(*, 2i) record the normalizing factors of
the right singular vectors matrices of subproblems on i-th level.
z REAL for slalsa/clalsa
DOUBLE PRECISION for dlalsa/zlalsa
Array, DIMENSION (ldu, nlvl). On entry, z(1, i) contains the components of
the deflation-adjusted updating row vector for subproblems on the i-th
level.
poles REAL for slalsa/clalsa
DOUBLE PRECISION for dlalsa/zlalsa
Array, DIMENSION (ldu, 2*nlvl).
On entry, poles(*, 2i-1: 2i) contains the new and old singular values
involved in the secular equations on the i-th level.
givptr INTEGER. Array, DIMENSION ( n ).

On entry, givptr( i ) records the number of Givens rotations performed on
the i-th problem on the computation tree.
givcol INTEGER. Array, DIMENSION ( ldgcol, 2*nlvl ). On entry, for each i, givcol(*,
2i-1: 2i) records the locations of Givens rotations performed on the i-th
level on the computation tree.
ldgcol INTEGER, ldgcol ≥ n. The leading dimension of arrays givcol and perm.
perm INTEGER. Array, DIMENSION ( ldgcol, nlvl ). On entry, perm(*, i) records

permutations done on the i-th level of the computation tree.
givnum REAL for slalsa/clalsa
DOUBLE PRECISION for dlalsa/zlalsa
Array, DIMENSION (ldu, 2*nlvl). On entry, givnum(*, 2i-1 : 2i) records the c
and s values of Givens rotations performed on the i-th level on the
computation tree.
c REAL for slalsa/clalsa
DOUBLE PRECISION for dlalsa/zlalsa
Array, DIMENSION ( n ). On entry, if the i-th subproblem is not square, c( i )
contains the c value of a Givens rotation related to the right null space of
the i-th subproblem.
s REAL for slalsa/clalsa
DOUBLE PRECISION for dlalsa/zlalsa
Array, DIMENSION ( n ). On entry, if the i-th subproblem is not square, s( i )
contains the s-value of a Givens rotation related to the right null space of
the i-th subproblem.
work REAL for slalsa
DOUBLE PRECISION for dlalsa
Workspace array, DIMENSION at least (n). Used with real flavors only.
rwork REAL for clalsa

DOUBLE PRECISION for zlalsa
Workspace array, DIMENSION at least max(n, (smlsiz+1)*nrhs*3). Used
with complex flavors only.
iwork INTEGER.
Workspace array, DIMENSION at least (3n).
Output Parameters
b On exit, contains the solution X in rows 1 through n.
bx REAL for slalsa
DOUBLE PRECISION for dlalsa
COMPLEX for clalsa
DOUBLE COMPLEX for zlalsa
Array, DIMENSION (ldbx, nrhs). On exit, the result of applying the left or
right singular vector matrix to b.
info INTEGER. If info = 0: successful exit.
If info = -i < 0, the i-th argument had an illegal value.
?lalsd
Uses the singular value decomposition of A to solve
the least squares problem.
Syntax
call slalsd( uplo, smlsiz, n, nrhs, d, e, b, ldb, rcond, rank, work, iwork, info )
call dlalsd( uplo, smlsiz, n, nrhs, d, e, b, ldb, rcond, rank, work, iwork, info )
call clalsd( uplo, smlsiz, n, nrhs, d, e, b, ldb, rcond, rank, work, rwork, iwork,
info )
call zlalsd( uplo, smlsiz, n, nrhs, d, e, b, ldb, rcond, rank, work, rwork, iwork,
info )
Include Files
mkl.fi
Description
The routine uses the singular value decomposition of A to solve the least squares problem of finding X to
minimize the Euclidean norm of each column of A*X-B, where A is n-by-n upper bidiagonal, and X and B are
n-by-nrhs. The solution X overwrites B.
The singular values of A smaller than rcond times the largest singular value are treated as zero in solving the
least squares problem; in this case a minimum norm solution is returned. The actual singular values are
returned in d in ascending order.
This code makes very mild assumptions about floating point arithmetic. It will work on machines with a guard
digit in add/subtract, or on those binary machines without guard digits which subtract like the Cray XMP, Cray
YMP, Cray C 90, or Cray 2.
It could conceivably fail on hexadecimal or decimal machines without guard digits, but we know of none.
Input Parameters
uplo CHARACTER*1.
If uplo = 'U', d and e define an upper bidiagonal matrix.
If uplo = 'L', d and e define a lower bidiagonal matrix.
smlsiz INTEGER. The maximum size of the subproblems at the bottom of the
computation tree.
n INTEGER. The dimension of the bidiagonal matrix.
n ≥ 0.
nrhs INTEGER. The number of columns of B. Must be at least 1.
d REAL for slalsd/clalsd

DOUBLE PRECISION for dlalsd/zlalsd
Array, DIMENSION (n). On entry, d contains the main diagonal of the
bidiagonal matrix.
e REAL for slalsd/clalsd
DOUBLE PRECISION for dlalsd/zlalsd
Array, DIMENSION (n-1). Contains the super-diagonal entries of the
bidiagonal matrix. On exit, e is destroyed.
b REAL for slalsd

DOUBLE PRECISION for dlalsd
COMPLEX for clalsd
DOUBLE COMPLEX for zlalsd
Array, DIMENSION (ldb,nrhs).
On input, b contains the right hand sides of the least squares problem. On
output, b contains the solution X.
ldb INTEGER. The leading dimension of b in the calling subprogram. Must be at

least max(1,n).
rcond REAL for slalsd/clalsd
DOUBLE PRECISION for dlalsd/zlalsd
The singular values of A less than or equal to rcond times the largest
singular value are treated as zero in solving the least squares problem. If
rcond is negative, machine precision is used instead. For example, for the
least squares problem diag(S)*X=B, where diag(S) is a diagonal matrix
of singular values, the solution is X(i)=B(i)/S(i) if S(i) is greater than
rcond *max(S), and X(i)=0 if S(i) is less than or equal to rcond
*max(S).
rank INTEGER. The number of singular values of A greater than rcond times the
largest singular value.
work REAL for slalsd

DOUBLE PRECISION for dlalsd
COMPLEX for clalsd
DOUBLE COMPLEX for zlalsd
Workspace array.
DIMENSION for real flavors at least
(9n + 2n*smlsiz + 8n*nlvl + n*nrhs + (smlsiz+1)**2),
where
nlvl = max(0, int(log2(n/(smlsiz+1))) + 1).
DIMENSION for complex flavors at least (n*nrhs).
rwork REAL for clalsd

DOUBLE PRECISION for zlalsd
Workspace array, used with complex flavors only.
DIMENSION at least (9n + 2n*smlsiz + 8n*nlvl + 3*smlsiz*nrhs +
(smlsiz+1)**2),
where
nlvl = max(0, int(log2(n/(smlsiz+1))) + 1).
iwork INTEGER.
Workspace array of DIMENSION(3n*nlvl + 11n).
Output Parameters
d On exit, if info = 0, d contains singular values of the bidiagonal matrix.
e On exit, destroyed.
b On exit, b contains the solution X.
info INTEGER.
If info > 0: The algorithm failed to compute a singular value while

working on the submatrix lying in rows and columns info/(n+1) through
mod(info,n+1).
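The workspace formulas above can be turned into allocation sizes as in the following sketch for dlalsd. It assumes linkage against Intel MKL or another LAPACK library. The bidiagonal entries, the right-hand side, and smlsiz = 25 are illustrative assumptions, and rnk stands for the rank argument.

```fortran
program lalsd_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, nrhs = 1, smlsiz = 25
  external dlalsd
  real(dp) :: d(n), e(n-1), b(n,nrhs), rcond
  real(dp), allocatable :: work(:)
  integer, allocatable :: iwork(:)
  integer :: nlvl, rnk, info

  ! Illustrative upper bidiagonal matrix: diagonal d, superdiagonal e.
  d = [3.0_dp, 2.0_dp, 1.0_dp]
  e = [1.0_dp, 1.0_dp]
  b(:,1) = [1.0_dp, 2.0_dp, 3.0_dp]
  rcond = -1.0_dp            ! negative: machine precision is used instead

  ! Workspace sizes for real flavors, following the documented formulas.
  nlvl = max(0, int(log(real(n, dp)/real(smlsiz + 1, dp))/log(2.0_dp)) + 1)
  allocate(work(9*n + 2*n*smlsiz + 8*n*nlvl + n*nrhs + (smlsiz + 1)**2))
  allocate(iwork(3*n*nlvl + 11*n))

  ! On exit b holds the solution X, d the singular values; e is destroyed.
  call dlalsd('U', smlsiz, n, nrhs, d, e, b, n, rcond, rnk, work, iwork, info)

  if (info == 0) then
    print '(a,i0)',      'rank            = ', rnk
    print '(a,3es13.5)', 'solution X      = ', b(:,1)
    print '(a,3es13.5)', 'singular values = ', d
  else
    print '(a,i0)', 'dlalsd failed, info = ', info
  end if
end program lalsd_sketch
```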
?lamrg
Creates a permutation list to merge the entries of two
independently sorted sets into a single set sorted in
ascending order.
Syntax
call slamrg( n1, n2, a, strd1, strd2, index )
call dlamrg( n1, n2, a, strd1, strd2, index )
Include Files
mkl.fi
Description
The routine creates a permutation list which will merge the elements of a (which is composed of two
independently sorted sets) into a single set which is sorted in ascending order.
Input Parameters
n1, n2 INTEGER. These arguments contain the respective lengths of the two sorted
lists to be merged.
a REAL for slamrg

DOUBLE PRECISION for dlamrg.
Array, DIMENSION (n1+n2).
The first n1 elements of a contain a list of numbers which are sorted in

either ascending or descending order. Likewise for the final n2 elements.
strd1, strd2 INTEGER.

These are the strides to be taken through the array a. Allowable strides are
1 and -1. They indicate whether a subset of a is sorted in ascending (strdx
= 1) or descending (strdx = -1) order.
Output Parameters
index INTEGER. Array, DIMENSION (n1+n2).

On exit, this array will contain a permutation such that if b(i) =
a(index(i)) for i=1, n1+n2, then b will be sorted in ascending order.
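The merge described above can be sketched in pure Python. This is an illustration of the documented behavior, not the library implementation; the function name lamrg_sketch is hypothetical, and the returned indices are 1-based to match the Fortran convention.

```python
def lamrg_sketch(n1, n2, a, strd1, strd2):
    """Return a 1-based permutation that sorts a (two sorted runs) ascending."""
    # Start each run at its smallest element: the front if ascending, the back if descending
    ind1 = 1 if strd1 > 0 else n1
    ind2 = n1 + 1 if strd2 > 0 else n1 + n2
    left1, left2 = n1, n2
    index = []
    while left1 > 0 and left2 > 0:
        if a[ind1 - 1] <= a[ind2 - 1]:
            index.append(ind1)
            ind1 += strd1
            left1 -= 1
        else:
            index.append(ind2)
            ind2 += strd2
            left2 -= 1
    # Copy whatever remains of the run that is not yet exhausted
    while left1 > 0:
        index.append(ind1)
        ind1 += strd1
        left1 -= 1
    while left2 > 0:
        index.append(ind2)
        ind2 += strd2
        left2 -= 1
    return index

a = [1, 4, 7, 9, 5, 2]            # first run ascending, second run descending
index = lamrg_sketch(3, 3, a, 1, -1)
b = [a[i - 1] for i in index]     # b(i) = a(index(i))
print(index, b)
```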
?laneg
Computes the Sturm count, the number of negative
pivots encountered while factoring tridiagonal
T - sigma*I = L*D*L**T.
Syntax
value = slaneg( n, d, lld, sigma, pivmin, r )
value = dlaneg( n, d, lld, sigma, pivmin, r )
Include Files
mkl.fi
Description
The routine computes the Sturm count, the number of negative pivots encountered while factoring
tridiagonal T - sigma*I = L*D*L**T. This implementation works directly on the factors without forming the
tridiagonal matrix T. The Sturm count is also the number of eigenvalues of T less than sigma. This routine is
called from ?larrb. The current routine does not use the pivmin parameter but rather requires IEEE-754
propagation of infinities and NaNs (NaN stands for 'Not A Number'). This routine also has no input range
restrictions but does require default exception handling such that x/0 produces Inf when x is non-zero, and
Inf/Inf produces NaN. (For more information see [Marques06]).
Input Parameters
n INTEGER. The order of the matrix.
d REAL for slaneg

DOUBLE PRECISION for dlaneg
Array, DIMENSION (n).
Contains n diagonal elements of the matrix D.
lld REAL for slaneg

DOUBLE PRECISION for dlaneg
Array, DIMENSION (n-1).
Contains (n-1) elements L(i)*L(i)*D(i).
sigma REAL for slaneg

DOUBLE PRECISION for dlaneg
Shift amount in T - sigma*I = L*D*L**T.
pivmin REAL for slaneg

DOUBLE PRECISION for dlaneg
The minimum pivot in the Sturm sequence. May be used when zero pivots
are encountered on non-IEEE-754 architectures.
r INTEGER.
The twist index for the twisted factorization that is used for the negcount.
Output Parameters
value INTEGER. The number of negative pivots encountered while factoring.
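To make the role of d, lld, sigma, and r concrete, here is a minimal Python sketch of a Sturm count evaluated through a twisted factorization. It is an assumption-laden illustration, not the library code: the name sturm_count is hypothetical, the NaN and infinity handling that the routine relies on is omitted, and Python raises ZeroDivisionError on an exactly zero pivot instead of producing Inf.

```python
def sturm_count(d, lld, sigma, r):
    """Count negative pivots of T - sigma*I = L*D*L**T.

    d     : the n diagonal entries of D
    lld   : the n-1 products L(i)*L(i)*D(i)
    r     : 1-based twist index, 1 <= r <= n
    """
    n = len(d)
    negcnt = 0
    # Top-down sweep: pivots for rows 1 .. r-1
    t = -sigma
    for j in range(r - 1):
        dplus = d[j] + t
        if dplus < 0:
            negcnt += 1
        t = (t / dplus) * lld[j] - sigma
    # Bottom-up sweep: pivots for rows n .. r+1
    p = d[n - 1] - sigma
    for j in range(n - 2, r - 2, -1):
        dminus = lld[j] + p
        if dminus < 0:
            negcnt += 1
        p = (p / dminus) * d[j] - sigma
    # Pivot at the twist row r
    gamma = (t + sigma) + p
    if gamma < 0:
        negcnt += 1
    return negcnt

# D = diag(2, 3), L(1) = 0.5, so T = [[2, 1], [1, 3.5]] with eigenvalues 1.5 and 4
print(sturm_count([2.0, 3.0], [0.5], 3.0, 2))
```

In this example the count does not depend on the twist index, which is consistent with the count being the number of eigenvalues of T below sigma.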
?langb
Returns the value of the 1-norm, Frobenius norm,
infinity-norm, or the largest absolute value of any
element of general band matrix.
Syntax
val = slangb( norm, n, kl, ku, ab, ldab, work )
val = dlangb( norm, n, kl, ku, ab, ldab, work )
val = clangb( norm, n, kl, ku, ab, ldab, work )
val = zlangb( norm, n, kl, ku, ab, ldab, work )
Include Files
mkl.fi
Description
The function returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the element of
largest absolute value of an n-by-n band matrix A, with kl sub-diagonals and ku super-diagonals.
Input Parameters
norm CHARACTER*1. Specifies the value to be returned by the routine:

= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix
A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix A
(maximum column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix A (maximum
row sum),
= 'F', 'f', 'E' or 'e': val = normF(A), Frobenius norm of the matrix
A (square root of sum of squares).
n INTEGER. The order of the matrix A. n ≥ 0. When n = 0, ?langb is set to zero.
kl INTEGER. The number of sub-diagonals of the matrix A. kl ≥ 0.
ku INTEGER. The number of super-diagonals of the matrix A. ku ≥ 0.
ab REAL for slangb

DOUBLE PRECISION for dlangb
COMPLEX for clangb
DOUBLE COMPLEX for zlangb
Array, DIMENSION (ldab,n).
The band matrix A, stored in rows 1 to kl+ku+1. The j-th column of A is
stored in the j-th column of the array ab as follows:
ab(ku+1+i-j,j) = a(i,j)
for max(1,j-ku) ≤ i ≤ min(n,j+kl).
ldab INTEGER. The leading dimension of the array ab.

ldab ≥ kl+ku+1.
work REAL for slangb/clangb

DOUBLE PRECISION for dlangb/zlangb
Workspace array, DIMENSION (max(1,lwork)), where
lwork ≥ n when norm = 'I'; otherwise, work is not referenced.
Output Parameters
val REAL for slangb/clangb

DOUBLE PRECISION for dlangb/zlangb
Value returned by the function.
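The band storage rule ab(ku+1+i-j,j) = a(i,j) can be checked with a short Python sketch that expands band storage into a dense matrix and then takes the 1-norm and infinity norm. The helper names are hypothetical, and lists are 0-based, so the 1-based row ku+1+i-j becomes ku+i-j.

```python
def band_to_dense(n, kl, ku, ab):
    """Expand an n-by-n band matrix from band storage (list of rows) to dense form."""
    a = [[0.0] * n for _ in range(n)]
    for j in range(1, n + 1):
        for i in range(max(1, j - ku), min(n, j + kl) + 1):
            a[i - 1][j - 1] = ab[ku + i - j][j - 1]
    return a

def one_norm(a):
    """Maximum column sum of absolute values."""
    n = len(a[0]) if a else 0
    return max((sum(abs(row[j]) for row in a) for j in range(n)), default=0.0)

def inf_norm(a):
    """Maximum row sum of absolute values."""
    return max((sum(abs(x) for x in row) for row in a), default=0.0)

# Tridiagonal example (kl = ku = 1); unused corner entries of ab are set to 0
ab = [[0.0, -2.0, -5.0],   # super-diagonal
      [1.0,  4.0,  7.0],   # diagonal
      [3.0,  6.0,  0.0]]   # sub-diagonal
a = band_to_dense(3, 1, 1, ab)
print(a, one_norm(a), inf_norm(a))
```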
?lange
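Returns the value of the 1-norm, Frobenius norm,
infinity-norm, or the largest absolute value of any
element of a general rectangular matrix.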
Syntax
val = slange( norm, m, n, a, lda, work )
val = dlange( norm, m, n, a, lda, work )
val = clange( norm, m, n, a, lda, work )
val = zlange( norm, m, n, a, lda, work )
Include Files
mkl.fi
Description
The function ?lange returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a real/complex matrix A.
Input Parameters

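norm CHARACTER*1. Specifies the value to be returned by the routine:

= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix
A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix A
(maximum column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix A (maximum
row sum),
= 'F', 'f', 'E' or 'e': val = normF(A), Frobenius norm of the matrix
A (square root of sum of squares).
m INTEGER. The number of rows of the matrix A.

m ≥ 0. When m = 0, ?lange is set to zero.
n INTEGER. The number of columns of the matrix A.

n ≥ 0. When n = 0, ?lange is set to zero.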
a REAL for slange

DOUBLE PRECISION for dlange
COMPLEX for clange
DOUBLE COMPLEX for zlange
Array, DIMENSION (lda,n).
Array a contains the m-by-n matrix A.
lda INTEGER. The leading dimension of the array a.

lda ≥ max(m,1).
work REAL for slange and clange.

DOUBLE PRECISION for dlange and zlange.
Workspace array, DIMENSION (max(1,lwork)), where lwork ≥ m when norm
= 'I'; otherwise, work is not referenced.
Output Parameters
val REAL for slange/clange

DOUBLE PRECISION for dlange/zlange
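For reference, the four quantities selected by norm can be written out in a few lines of Python. This is a sketch of the definitions only, with a hypothetical function name; the matrix is a plain list of rows and may hold real or complex entries.

```python
import math

def dense_norm(norm, a):
    """Evaluate the quantity selected by norm for an m-by-n matrix a (list of rows)."""
    m = len(a)
    n = len(a[0]) if m else 0
    if m == 0 or n == 0:
        return 0.0                      # mirrors "set to zero" for empty matrices
    key = norm.upper()
    if key == 'M':                      # largest absolute value
        return max(abs(x) for row in a for x in row)
    if key in ('1', 'O'):               # maximum column sum
        return max(sum(abs(a[i][j]) for i in range(m)) for j in range(n))
    if key == 'I':                      # maximum row sum
        return max(sum(abs(x) for x in row) for row in a)
    if key in ('F', 'E'):               # square root of sum of squares
        return math.sqrt(sum(abs(x) ** 2 for row in a for x in row))
    raise ValueError("unknown norm selector: " + norm)

a = [[1.0, -2.0], [3.0, 4.0], [-5.0, 6.0]]
print([dense_norm(c, a) for c in "M1IF"])
```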
?langt
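Returns the value of the 1-norm, Frobenius norm,
infinity-norm, or the largest absolute value of any
element of a general tridiagonal matrix.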
Syntax
val = slangt( norm, n, dl, d, du )
val = dlangt( norm, n, dl, d, du )
val = clangt( norm, n, dl, d, du )
val = zlangt( norm, n, dl, d, du )
Include Files
mkl.fi
Description
The routine returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the element of
largest absolute value of a real/complex tridiagonal matrix A.
Input Parameters

norm CHARACTER*1. Specifies the value to be returned by the routine:

= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix A (maximum
column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix A (maximum
row sum),
= 'F', 'f', 'E' or 'e': val = normF(A), Frobenius norm of the matrix A
(square root of sum of squares).
n INTEGER. The order of the matrix A. n ≥ 0. When n = 0, ?langt is set to zero.
dl, d, du REAL for slangt
DOUBLE PRECISION for dlangt

COMPLEX for clangt
DOUBLE COMPLEX for zlangt
Arrays: dl (n-1), d (n), du (n-1).
The array dl contains the (n-1) sub-diagonal elements of A.
The array d contains the diagonal elements of A.
The array du contains the (n-1) super-diagonal elements of A.
Output Parameters
val REAL for slangt/clangt

DOUBLE PRECISION for dlangt/zlangt
?lanhs
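Returns the value of the 1-norm, Frobenius norm,
infinity-norm, or the largest absolute value of any
element of an upper Hessenberg matrix.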
Syntax
val = slanhs( norm, n, a, lda, work )
val = dlanhs( norm, n, a, lda, work )
val = clanhs( norm, n, a, lda, work )
val = zlanhs( norm, n, a, lda, work )
Include Files
mkl.fi
Description
The function ?lanhs returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a Hessenberg matrix A.
The value val returned by the function is:
val = max(abs(Aij)), if norm = 'M' or 'm'
= norm1(A), if norm = '1' or 'O' or 'o'
= normI(A), if norm = 'I' or 'i'
= normF(A), if norm = 'F', 'f', 'E' or 'e'
where norm1 denotes the 1-norm of a matrix (maximum column sum), normI denotes the infinity norm of a
matrix (maximum row sum) and normF denotes the Frobenius norm of a matrix (square root of sum of
squares). Note that max(abs(Aij)) is not a consistent matrix norm.
Input Parameters
norm CHARACTER*1. Specifies the value to be returned by the routine as

described above.

n INTEGER. The order of the matrix A.

n ≥ 0. When n = 0, ?lanhs is set to zero.
a REAL for slanhs

DOUBLE PRECISION for dlanhs
COMPLEX for clanhs
DOUBLE COMPLEX for zlanhs
Array, DIMENSION (lda,n). The n-by-n upper Hessenberg matrix A; the part
of A below the first sub-diagonal is not referenced.

lda INTEGER. The leading dimension of the array a.

lda ≥ max(n,1).
work REAL for slanhs and clanhs.

DOUBLE PRECISION for dlanhs and zlanhs.
Workspace array, DIMENSION(max(1,lwork)), where lwork ≥ n when norm
= 'I'; otherwise, work is not referenced.
Output Parameters
val REAL for slanhs/clanhs

DOUBLE PRECISION for dlanhs/zlanhs
?lansb
Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a symmetric band matrix.
Syntax
val = slansb( norm, uplo, n, k, ab, ldab, work )
val = dlansb( norm, uplo, n, k, ab, ldab, work )
val = clansb( norm, uplo, n, k, ab, ldab, work )
val = zlansb( norm, uplo, n, k, ab, ldab, work )
Include Files
mkl.fi
Description
The function ?lansb returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of an n-by-n real/complex symmetric band matrix A, with k super-
diagonals.
Input Parameters


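norm CHARACTER*1. Specifies the value to be returned by the routine:

= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix A (maximum
column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix A (maximum
row sum),
= 'F', 'f', 'E' or 'e': val = normF(A), Frobenius norm of the matrix A
(square root of sum of squares).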
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the band matrix A is
supplied. If uplo = 'U': upper triangular part is supplied; If uplo = 'L':
lower triangular part is supplied.
n INTEGER. The order of the matrix A. n ≥ 0.

When n = 0, ?lansb is set to zero.
k INTEGER. The number of super-diagonals or sub-diagonals of the band
matrix A. k ≥ 0.
ab REAL for slansb

DOUBLE PRECISION for dlansb
COMPLEX for clansb
DOUBLE COMPLEX for zlansb
Array, DIMENSION (ldab,n).
The upper or lower triangle of the symmetric band matrix A, stored in the
first k+1 rows of ab. The j-th column of A is stored in the j-th column of the
array ab as follows:
if uplo = 'U', ab(k+1+i-j,j) = a(i,j)
for max(1,j-k) ≤ i ≤ j;
if uplo = 'L', ab(1+i-j,j) = a(i,j) for j ≤ i ≤ min(n,j+k).
ldab INTEGER. The leading dimension of the array ab.

ldab ≥ k+1.
work REAL for slansb and clansb.

DOUBLE PRECISION for dlansb and zlansb.
Workspace array, DIMENSION(max(1,lwork)), where
lwork ≥ n when norm = 'I' or '1' or 'O'; otherwise, work is not
referenced.
Output Parameters
val REAL for slansb/clansb

DOUBLE PRECISION for dlansb/zlansb
?lanhb
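Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a Hermitian band matrix.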
Syntax
val = clanhb( norm, uplo, n, k, ab, ldab, work )
val = zlanhb( norm, uplo, n, k, ab, ldab, work )
Include Files
mkl.fi
Description
The routine returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the element of
largest absolute value of an n-by-n Hermitian band matrix A, with k super-diagonals.
Input Parameters


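norm CHARACTER*1. Specifies the value to be returned by the routine:

= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix A (maximum
column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix A (maximum
row sum),
= 'F', 'f', 'E' or 'e': val = normF(A), Frobenius norm of the matrix A
(square root of sum of squares).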
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the band matrix A is
supplied.
If uplo = 'U': upper triangular part is supplied;
If uplo = 'L': lower triangular part is supplied.
n INTEGER. The order of the matrix A. n ≥ 0. When n = 0, ?lanhb is set to zero.
k INTEGER. The number of super-diagonals or sub-diagonals of the band
matrix A.
k ≥ 0.
ab COMPLEX for clanhb.

DOUBLE COMPLEX for zlanhb.
Array, DIMENSION (ldab,n). The upper or lower triangle of the Hermitian
band matrix A, stored in the first k+1 rows of ab. The j-th column of A is
stored in the j-th column of the array ab as follows:
if uplo = 'U', ab(k+1+i-j,j) = a(i,j)
for max(1,j-k) ≤ i ≤ j;
if uplo = 'L', ab(1+i-j,j) = a(i,j) for j ≤ i ≤ min(n,j+k).
Note that the imaginary parts of the diagonal elements need not be set and
are assumed to be zero.
ldab INTEGER. The leading dimension of the array ab. ldab ≥ k+1.
work REAL for clanhb.

DOUBLE PRECISION for zlanhb.
Workspace array, DIMENSION (max(1,lwork)), where
lwork ≥ n when norm = 'I' or '1' or 'O'; otherwise, work is not
referenced.
Output Parameters
val REAL for clanhb.

DOUBLE PRECISION for zlanhb.
?lansp
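Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a symmetric matrix supplied in
packed form.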
Syntax
val = slansp( norm, uplo, n, ap, work )
val = dlansp( norm, uplo, n, ap, work )
val = clansp( norm, uplo, n, ap, work )
val = zlansp( norm, uplo, n, ap, work )
Include Files
mkl.fi
Description
The function ?lansp returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a real/complex symmetric matrix A, supplied in packed form.
Input Parameters


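norm CHARACTER*1. Specifies the value to be returned by the routine:

= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix A (maximum
column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix A (maximum
row sum),
= 'F', 'f', 'E' or 'e': val = normF(A), Frobenius norm of the matrix A
(square root of sum of squares).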
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is supplied.
If uplo = 'U': Upper triangular part of A is supplied
If uplo = 'L': Lower triangular part of A is supplied.
n INTEGER. The order of the matrix A. n ≥ 0. When
n = 0, ?lansp is set to zero.
ap REAL for slansp

DOUBLE PRECISION for dlansp
COMPLEX for clansp
DOUBLE COMPLEX for zlansp
Array, DIMENSION (n(n+1)/2).
The upper or lower triangle of the symmetric matrix A, packed columnwise

in a linear array. The j-th column of A is stored in the array ap as follows:
if uplo = 'U', ap(i + (j-1)j/2) = A(i,j) for 1 ≤ i ≤ j;
if uplo = 'L', ap(i + (j-1)(2n-j)/2) = A(i,j) for j ≤ i ≤ n.
work REAL for slansp and clansp.

DOUBLE PRECISION for dlansp and zlansp.

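Workspace array, DIMENSION(max(1,lwork)), where
lwork ≥ n when norm = 'I' or '1' or 'O'; otherwise, work is not
referenced.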
Output Parameters
val REAL for slansp/clansp

DOUBLE PRECISION for dlansp/zlansp
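The packed storage rule can be illustrated with a small Python sketch that maps (i,j) to a position in ap and accumulates column sums from the stored triangle alone. The helper names are hypothetical. Because the matrix is symmetric, each off-diagonal entry contributes to two column sums, and the 1-norm equals the infinity norm.

```python
def packed_index(uplo, i, j, n):
    """1-based position in ap of A(i,j), for (i,j) inside the stored triangle."""
    if uplo.upper() == 'U':
        return i + (j - 1) * j // 2            # valid for 1 <= i <= j
    return i + (j - 1) * (2 * n - j) // 2      # valid for j <= i <= n

def sym_packed_one_norm(uplo, n, ap):
    """1-norm (equal to the infinity norm) of a symmetric matrix in packed form."""
    sums = [0.0] * n
    for j in range(1, n + 1):
        rows = range(1, j + 1) if uplo.upper() == 'U' else range(j, n + 1)
        for i in rows:
            v = abs(ap[packed_index(uplo, i, j, n) - 1])
            sums[j - 1] += v
            if i != j:
                sums[i - 1] += v               # mirrored entry A(j,i)
    return max(sums, default=0.0)

# A = [[1, 2, 3], [2, 4, 5], [3, 5, 6]] packed columnwise
ap_upper = [1.0, 2.0, 4.0, 3.0, 5.0, 6.0]
ap_lower = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(sym_packed_one_norm('U', 3, ap_upper), sym_packed_one_norm('L', 3, ap_lower))
```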
?lanhp
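Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a complex Hermitian matrix supplied
in packed form.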
Syntax
val = clanhp( norm, uplo, n, ap, work )
val = zlanhp( norm, uplo, n, ap, work )
Include Files
mkl.fi
Description
The function ?lanhp returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a complex Hermitian matrix A, supplied in packed form.
Input Parameters


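norm CHARACTER*1. Specifies the value to be returned by the routine:

= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix A (maximum
column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix A (maximum
row sum),
= 'F', 'f', 'E' or 'e': val = normF(A), Frobenius norm of the matrix A
(square root of sum of squares).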
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the Hermitian matrix
A is supplied.
If uplo = 'U': Upper triangular part of A is supplied
If uplo = 'L': Lower triangular part of A is supplied.

n INTEGER. The order of the matrix A.

n ≥ 0. When n = 0, ?lanhp is set to zero.
ap COMPLEX for clanhp.

DOUBLE COMPLEX for zlanhp.
Array, DIMENSION (n(n+1)/2). The upper or lower triangle of the Hermitian
matrix A, packed columnwise in a linear array. The j-th column of A is
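stored in the array ap as follows:
if uplo = 'U', ap(i + (j-1)j/2) = A(i,j) for 1 ≤ i ≤ j;
if uplo = 'L', ap(i + (j-1)(2n-j)/2) = A(i,j) for j ≤ i ≤ n.
work REAL for clanhp.

DOUBLE PRECISION for zlanhp.
Workspace array, DIMENSION(max(1,lwork)), where
lwork ≥ n when norm = 'I' or '1' or 'O'; otherwise, work is not
referenced.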
Output Parameters
val REAL for clanhp.

DOUBLE PRECISION for zlanhp.
?lanst/?lanht
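Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a real symmetric or complex
Hermitian tridiagonal matrix.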
Syntax
val = slanst( norm, n, d, e )
val = dlanst( norm, n, d, e )
val = clanht( norm, n, d, e )
val = zlanht( norm, n, d, e )
Include Files
mkl.fi
Description
The functions ?lanst/?lanht return the value of the 1-norm, or the Frobenius norm, or the infinity norm, or
the element of largest absolute value of a real symmetric or a complex Hermitian tridiagonal matrix A.
Input Parameters

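norm CHARACTER*1. Specifies the value to be returned by the routine:

= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix
A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix A
(maximum column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix A (maximum
row sum),
= 'F', 'f', 'E' or 'e': val = normF(A), Frobenius norm of the matrix
A (square root of sum of squares).
n INTEGER. The order of the matrix A.

n ≥ 0. When n = 0, ?lanst/?lanht is set to zero.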
d REAL for slanst/clanht
DOUBLE PRECISION for dlanst/zlanht

Array, DIMENSION (n). The diagonal elements of A.
e REAL for slanst

DOUBLE PRECISION for dlanst
COMPLEX for clanht
DOUBLE COMPLEX for zlanht
Array, DIMENSION (n-1).
The (n-1) sub-diagonal or super-diagonal elements of A.
Output Parameters
val REAL for slanst/clanht

DOUBLE PRECISION for dlanst/zlanht
?lansy
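Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a real/complex symmetric matrix.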
Syntax
val = slansy( norm, uplo, n, a, lda, work )
val = dlansy( norm, uplo, n, a, lda, work )
val = clansy( norm, uplo, n, a, lda, work )
val = zlansy( norm, uplo, n, a, lda, work )
Include Files
mkl.fi
Description
The function ?lansy returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a real/complex symmetric matrix A.
Input Parameters

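norm CHARACTER*1. Specifies the value to be returned by the routine:

= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix
A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix A
(maximum column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix A (maximum
row sum),
= 'F', 'f', 'E' or 'e': val = normF(A), Frobenius norm of the matrix
A (square root of sum of squares).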
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is to be referenced.
= 'U': Upper triangular part of A is referenced.
= 'L': Lower triangular part of A is referenced.
n INTEGER. The order of the matrix A. n ≥ 0. When n = 0, ?lansy is set to zero.
a REAL for slansy

DOUBLE PRECISION for dlansy
COMPLEX for clansy
DOUBLE COMPLEX for zlansy
Array, size (lda,n). The symmetric matrix A.

lda INTEGER. The leading dimension of the array a.

lda ≥ max(n,1).
work REAL for slansy and clansy.

DOUBLE PRECISION for dlansy and zlansy.

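Workspace array, DIMENSION(max(1,lwork)), where
lwork ≥ n when norm = 'I' or '1' or 'O'; otherwise, work is not
referenced.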
Output Parameters
val REAL for slansy/clansy

DOUBLE PRECISION for dlansy/zlansy
?lanhe
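Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a complex Hermitian matrix.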
Syntax
val = clanhe( norm, uplo, n, a, lda, work )
val = zlanhe( norm, uplo, n, a, lda, work )
Include Files
mkl.fi
Description
The function ?lanhe returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a complex Hermitian matrix A.
Input Parameters


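norm CHARACTER*1. Specifies the value to be returned by the routine:

= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix A (maximum
column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix A (maximum
row sum),
= 'F', 'f', 'E' or 'e': val = normF(A), Frobenius norm of the matrix A
(square root of sum of squares).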
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the Hermitian matrix
A is to be referenced.
= 'U': Upper triangular part of A is referenced.
= 'L': Lower triangular part of A is referenced.
n INTEGER. The order of the matrix A. n ≥ 0. When n = 0, ?lanhe is set to zero.
a COMPLEX for clanhe.

DOUBLE COMPLEX for zlanhe.
Array, size (lda,n). The Hermitian matrix A.

lda INTEGER. The leading dimension of the array a.

lda ≥ max(n,1).
work REAL for clanhe.

DOUBLE PRECISION for zlanhe.
Workspace array, DIMENSION(max(1,lwork)), where
lwork ≥ n when norm = 'I' or '1' or 'O'; otherwise, work is not
referenced.
Output Parameters
val REAL for clanhe.

DOUBLE PRECISION for zlanhe.
?lantb
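Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a triangular band matrix.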
Syntax
val = slantb( norm, uplo, diag, n, k, ab, ldab, work )
val = dlantb( norm, uplo, diag, n, k, ab, ldab, work )
val = clantb( norm, uplo, diag, n, k, ab, ldab, work )
val = zlantb( norm, uplo, diag, n, k, ab, ldab, work )
Include Files
mkl.fi
Description
The function ?lantb returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of an n-by-n triangular band matrix A, with ( k + 1 ) diagonals.
Input Parameters

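norm CHARACTER*1. Specifies the value to be returned by the routine:

= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix
A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix A
(maximum column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix A (maximum
row sum),
= 'F', 'f', 'E' or 'e': val = normF(A), Frobenius norm of the matrix
A (square root of sum of squares).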
uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular.
= 'U': Upper triangular
= 'L': Lower triangular.
diag CHARACTER*1.
Specifies whether or not the matrix A is unit triangular.

= 'N': Non-unit triangular
= 'U': Unit triangular.
n INTEGER. The order of the matrix A. n ≥ 0. When n = 0, ?lantb is set to zero.
k INTEGER. The number of super-diagonals of the matrix A if uplo = 'U', or
the number of sub-diagonals of the matrix A if uplo = 'L'. k ≥ 0.
ab REAL for slantb

DOUBLE PRECISION for dlantb
COMPLEX for clantb
DOUBLE COMPLEX for zlantb
Array, DIMENSION (ldab,n). The upper or lower triangular band matrix A,
stored in the first k+1 rows of ab.
The j-th column of A is stored in the j-th column of the array ab as follows:
if uplo = 'U', ab(k+1+i-j,j) = a(i,j) for max(1,j-k) ≤ i ≤ j;
if uplo = 'L', ab(1+i-j,j) = a(i,j) for j ≤ i ≤ min(n,j+k).
Note that when diag = 'U', the elements of the array ab corresponding to
the diagonal elements of the matrix A are not referenced, but are assumed
to be one.
ldab INTEGER. The leading dimension of the array ab.

ldab ≥ k+1.
work REAL for slantb and clantb.

DOUBLE PRECISION for dlantb and zlantb.
Workspace array, DIMENSION (max(1,lwork)), where
lwork ≥ n when norm = 'I'; otherwise, work is not referenced.
Output Parameters
val REAL for slantb/clantb.

DOUBLE PRECISION for dlantb/zlantb.
?lantp
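Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a triangular matrix supplied in
packed form.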
Syntax
val = slantp( norm, uplo, diag, n, ap, work )
val = dlantp( norm, uplo, diag, n, ap, work )
val = clantp( norm, uplo, diag, n, ap, work )
val = zlantp( norm, uplo, diag, n, ap, work )
Include Files
mkl.fi
Description
The function ?lantp returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a triangular matrix A, supplied in packed form.
Input Parameters

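norm CHARACTER*1. Specifies the value to be returned by the routine:

= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix
A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix A
(maximum column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix A (maximum
row sum),
= 'F', 'f', 'E' or 'e': val = normF(A), Frobenius norm of the matrix
A (square root of sum of squares).
uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular.
= 'U': Upper triangular
= 'L': Lower triangular.
diag CHARACTER*1.
Specifies whether or not the matrix A is unit triangular.
= 'N': Non-unit triangular
= 'U': Unit triangular.
n INTEGER. The order of the matrix A.

n ≥ 0. When n = 0, ?lantp is set to zero.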
ap REAL for slantp

DOUBLE PRECISION for dlantp
COMPLEX for clantp
DOUBLE COMPLEX for zlantp
Array, DIMENSION (n(n+1)/2).
The upper or lower triangular matrix A, packed columnwise in a linear array.

The j-th column of A is stored in the array ap as follows:
if uplo = 'U', ap(i + (j-1)j/2) = a(i,j) for 1 ≤ i ≤ j;
if uplo = 'L', ap(i + (j-1)(2n-j)/2) = a(i,j) for j ≤ i ≤ n.
Note that when diag = 'U', the elements of the array ap corresponding to
the diagonal elements of the matrix A are not referenced, but are assumed
to be one.
work REAL for slantp and clantp.

DOUBLE PRECISION for dlantp and zlantp.
Workspace array, DIMENSION(max(1,lwork)), where lwork ≥ n when norm
= 'I'; otherwise, work is not referenced.
Output Parameters
val REAL for slantp/clantp.

DOUBLE PRECISION for dlantp/zlantp.
?lantr
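Returns the value of the 1-norm, or the Frobenius
norm, or the infinity norm, or the element of largest
absolute value of a trapezoidal or triangular matrix.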
Syntax
val = slantr( norm, uplo, diag, m, n, a, lda, work )
val = dlantr( norm, uplo, diag, m, n, a, lda, work )
val = clantr( norm, uplo, diag, m, n, a, lda, work )
val = zlantr( norm, uplo, diag, m, n, a, lda, work )
Include Files
mkl.fi
Description
The function ?lantr returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of a trapezoidal or triangular matrix A.
Input Parameters

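norm CHARACTER*1. Specifies the value to be returned by the routine:

= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix
A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix A
(maximum column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix A (maximum
row sum),
= 'F', 'f', 'E' or 'e': val = normF(A), Frobenius norm of the matrix
A (square root of sum of squares).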
uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower trapezoidal.
= 'U': Upper trapezoidal
= 'L': Lower trapezoidal.
Note that A is triangular instead of trapezoidal if m = n.
diag CHARACTER*1.
Specifies whether or not the matrix A has unit diagonal.
= 'N': Non-unit diagonal
= 'U': Unit diagonal.
m INTEGER. The number of rows of the matrix A. m ≥ 0, and if uplo = 'U',
m ≤ n.
When m = 0, ?lantr is set to zero.
n INTEGER. The number of columns of the matrix A. n ≥ 0, and if uplo =
'L', n ≤ m.
When n = 0, ?lantr is set to zero.
a REAL for slantr

DOUBLE PRECISION for dlantr
COMPLEX for clantr
DOUBLE COMPLEX for zlantr
Array, DIMENSION (lda,n).
The trapezoidal matrix A (A is triangular if m = n).
If uplo = 'U', the leading m-by-n upper trapezoidal part of the array a
contains the upper trapezoidal matrix, and the strictly lower triangular part
of A is not referenced.
If uplo = 'L', the leading m-by-n lower trapezoidal part of the array a
contains the lower trapezoidal matrix, and the strictly upper triangular part
of A is not referenced. Note that when diag = 'U', the diagonal elements
of A are not referenced and are assumed to be one.

lda INTEGER. The leading dimension of the array a.

lda ≥ max(m,1).
work REAL for slantr/clantr.

DOUBLE PRECISION for dlantr/zlantr.
Workspace array, DIMENSION (max(1,lwork)), where
lwork ≥ m when norm = 'I'; otherwise, work is not referenced.
Output Parameters
val REAL for slantr/clantr.

DOUBLE PRECISION for dlantr/zlantr.

?lanv2
Computes the Schur factorization of a real 2-by-2
nonsymmetric matrix in standard form.
Syntax
call slanv2( a, b, c, d, rt1r, rt1i, rt2r, rt2i, cs, sn )
call dlanv2( a, b, c, d, rt1r, rt1i, rt2r, rt2i, cs, sn )
Include Files
mkl.fi
Description
The routine computes the Schur factorization of a real 2-by-2 nonsymmetric matrix in standard form:
[ a  b ]   [ cs  -sn ] [ aa  bb ] [  cs  sn ]
[ c  d ] = [ sn   cs ] [ cc  dd ] [ -sn  cs ]
where either
1. cc = 0 so that aa and dd are real eigenvalues of the matrix, or

2. aa = dd and bb*cc < 0, so that aa ± sqrt(bb*cc) are complex conjugate eigenvalues.
The routine was adjusted to reduce the risk of cancellation errors, when computing real eigenvalues, and to
ensure, if possible, that abs(rt1r) ≥ abs(rt2r).
Input Parameters
a, b, c, d REAL for slanv2

DOUBLE PRECISION for dlanv2.
On entry, elements of the input matrix.
Output Parameters
a, b, c, d On exit, overwritten by the elements of the standardized Schur form.
rt1r, rt1i, rt2r, rt2i REAL for slanv2

DOUBLE PRECISION for dlanv2.
The real and imaginary parts of the eigenvalues.
If the eigenvalues are a complex conjugate pair, rt1i > 0.
cs, sn REAL for slanv2

DOUBLE PRECISION for dlanv2.
Parameters of the rotation matrix.
?lapll
Measures the linear dependence of two vectors.
Syntax
call slapll( n, x, incx, y, incy, ssmin )
call dlapll( n, x, incx, y, incy, ssmin )
call clapll( n, x, incx, y, incy, ssmin )
call zlapll( n, x, incx, y, incy, ssmin )
Include Files
mkl.fi
Description
Given two column vectors x and y of length n, let

A = (x y) be the n-by-2 matrix.
The routine ?lapll first computes the QR factorization of A as A = Q*R and then computes the SVD of the
2-by-2 upper triangular matrix R. The smaller singular value of R is returned in ssmin, which is used as the
measurement of the linear dependency of the vectors x and y.
Input Parameters
n INTEGER. The length of the vectors x and y.
x REAL for slapll

DOUBLE PRECISION for dlapll
COMPLEX for clapll
DOUBLE COMPLEX for zlapll
Array, DIMENSION(1+(n-1)incx).
On entry, x contains the n-vector x.
y REAL for slapll

DOUBLE PRECISION for dlapll
COMPLEX for clapll
DOUBLE COMPLEX for zlapll
Array, DIMENSION (1+(n-1)incy).
On entry, y contains the n-vector y.
incx INTEGER. The increment between successive elements of x; incx > 0.
incy INTEGER. The increment between successive elements of y; incy > 0.
Output Parameters
x On exit, x is overwritten.
y On exit, y is overwritten.
ssmin REAL for slapll/clapll

DOUBLE PRECISION for dlapll/zlapll
The smallest singular value of the n-by-2 matrix A = (x y).
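The two-step computation (QR factorization, then SVD of the 2-by-2 triangular factor) can be sketched in Python for real vectors. This is an illustration under stated assumptions, not the library code: it uses classical Gram-Schmidt for the QR step and a closed-form expression for the singular values, and the function name is hypothetical.

```python
import math

def lapll_sketch(x, y):
    """Smaller singular value of the n-by-2 real matrix A = (x y)."""
    # QR of the two columns by Gram-Schmidt: R = [[f, g], [0, h]]
    f = math.sqrt(sum(v * v for v in x))
    if f == 0.0:
        return 0.0
    g = sum(a * b for a, b in zip(x, y)) / f
    resid = [b - g * a / f for a, b in zip(x, y)]
    h = math.sqrt(sum(v * v for v in resid))
    # For R, the squared singular values sum to f^2 + g^2 + h^2
    # and the singular values multiply to |f*h|
    s = f * f + g * g + h * h
    disc = max(s * s - 4.0 * (f * h) ** 2, 0.0)
    ssmax = math.sqrt((s + math.sqrt(disc)) / 2.0)
    return abs(f * h) / ssmax

print(lapll_sketch([3.0, 0.0], [0.0, 4.0]))   # independent vectors
print(lapll_sketch([1.0, 2.0], [2.0, 4.0]))   # parallel vectors, result near zero
```

A value of ssmin close to zero relative to the vector norms indicates that x and y are nearly linearly dependent.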
?lapmr
Rearranges rows of a matrix as specified by a
permutation vector.
Syntax
call slapmr( forwrd, m, n, x, ldx, k )
call dlapmr( forwrd, m, n, x, ldx, k )
call clapmr( forwrd, m, n, x, ldx, k )
call zlapmr( forwrd, m, n, x, ldx, k )
call lapmr( x,k[,forwrd] )
Include Files
mkl.fi
Description
The ?lapmr routine rearranges the rows of the m-by-n matrix X as specified by the permutation
k(1),k(2),...,k(m) of the integers 1,...,m.
If forwrd = .TRUE., forward permutation:
X(k(i),*) is moved to X(i,*) for i = 1,2,...,m.

If forwrd = .FALSE., backward permutation:
X(i,*) is moved to X(k(i),*) for i = 1,2,...,m.
Input Parameters
forwrd LOGICAL.
If forwrd = .TRUE., forward permutation.
If forwrd = .FALSE., backward permutation.
m INTEGER. The number of rows of the matrix X. m ≥ 0.
n INTEGER. The number of columns of the matrix X. n ≥ 0.
x REAL for slapmr

DOUBLE PRECISION for dlapmr
COMPLEX for clapmr
DOUBLE COMPLEX for zlapmr
Array, size (ldx,n). On entry, the m-by-n matrix X.
ldx INTEGER. The leading dimension of the array x, ldx ≥ max(1,m).
k INTEGER. Array, size (m). On entry, k contains the permutation vector and
is used as internal workspace.
Output Parameters
x On exit, x contains the permuted matrix X.
k On exit, k is reset to its original value.

Specific details for the routine ?lapmr interface are as follows:
x Holds the matrix X of size (m, n).
k Holds the vector of length m.
forwrd Specifies the permutation. Must be '.TRUE.' or '.FALSE.'.
See Also
?lapmt
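The difference between the forward and backward row permutations can be seen in a short Python sketch. It returns a new matrix rather than permuting in place as the routine does; the function name is hypothetical, and k holds 1-based indices as in the Fortran interface.

```python
def permute_rows(forwrd, x, k):
    """Return a copy of x (list of rows) with rows rearranged according to k."""
    m = len(x)
    out = [None] * m
    for i in range(m):
        if forwrd:
            out[i] = list(x[k[i] - 1])     # X(k(i),*) is moved to X(i,*)
        else:
            out[k[i] - 1] = list(x[i])     # X(i,*) is moved to X(k(i),*)
    return out

x = [[10, 11], [20, 21], [30, 31]]
k = [3, 1, 2]
fwd = permute_rows(True, x, k)
print(fwd)
print(permute_rows(False, fwd, k))         # backward permutation undoes the forward one
```

The same pattern applied to columns instead of rows describes the column permutation performed by ?lapmt.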
?lapmt
Performs a forward or backward permutation of the
columns of a matrix.
Syntax
call slapmt( forwrd, m, n, x, ldx, k )
call dlapmt( forwrd, m, n, x, ldx, k )
call clapmt( forwrd, m, n, x, ldx, k )
call zlapmt( forwrd, m, n, x, ldx, k )
Include Files
mkl.fi
Description
The routine ?lapmt rearranges the columns of the m-by-n matrix X as specified by the permutation
k(1),k(2),...,k(n) of the integers 1,...,n.
If forwrd= .TRUE., forward permutation:
X(*,k(j)) is moved to X(*,j) for j=1,2,...,n.

If forwrd = .FALSE., backward permutation:
X(*,j) is moved to X(*,k(j)) for j = 1,2,...,n.
Input Parameters
forwrd LOGICAL.
If forwrd= .TRUE., forward permutation
If forwrd = .FALSE., backward permutation
m INTEGER. The number of rows of the matrix X. m ≥ 0.
n INTEGER. The number of columns of the matrix X. n ≥ 0.
x REAL for slapmt

DOUBLE PRECISION for dlapmt
COMPLEX for clapmt
DOUBLE COMPLEX for zlapmt
Array, size (ldx,n). On entry, the m-by-n matrix X.
ldx INTEGER. The leading dimension of the array x, ldx ≥ max(1,m).
k INTEGER. Array, size (n). On entry, k contains the permutation vector and is
used as internal workspace.
Output Parameters
x On exit, x contains the permuted matrix X.
k On exit, k is reset to its original value.
See Also
?lapmr
?lapy2
Returns sqrt(x**2 + y**2).
Syntax
val = slapy2( x, y )
val = dlapy2( x, y )
Include Files
mkl.fi
Description
The function ?lapy2 returns sqrt(x**2 + y**2), avoiding unnecessary overflow or harmful underflow.
Input Parameters
x, y REAL for slapy2
DOUBLE PRECISION for dlapy2
Specify the input values x and y.
Output Parameters
val REAL for slapy2

DOUBLE PRECISION for dlapy2.
If val=-1D0, the first argument was NaN.
If val=-2D0, the second argument was NaN.
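One common way to avoid overflow when forming sqrt(x**2 + y**2) is to factor out the larger magnitude before squaring. The Python sketch below shows that scaling idea; it is an assumption about the technique, not a statement of how the library implements it, and it does not reproduce the NaN return codes listed above.

```python
import math

def lapy2_sketch(x, y):
    """sqrt(x**2 + y**2) computed by scaling with the larger magnitude."""
    w = max(abs(x), abs(y))
    z = min(abs(x), abs(y))
    if z == 0.0:
        return w
    # z/w <= 1, so the squared ratio cannot overflow
    return w * math.sqrt(1.0 + (z / w) ** 2)

print(lapy2_sketch(3.0, 4.0))
print(lapy2_sketch(1e200, 1e200))   # squaring 1e200 directly would overflow
```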
?lapy3
Returns sqrt(x**2 + y**2 + z**2).
Syntax
val = slapy3( x, y, z )
val = dlapy3( x, y, z )
Include Files
mkl.fi
Description
The function ?lapy3 returns sqrt(x**2 + y**2 + z**2), avoiding unnecessary overflow or harmful underflow.
Input Parameters
x, y, z REAL for slapy3

DOUBLE PRECISION for dlapy3
Specify the input values x, y and z.
Output Parameters
val REAL for slapy3

DOUBLE PRECISION for dlapy3.
If val = -1D0, the first argument was NaN.
If val = -2D0, the second argument was NaN.
If val = -3D0, the third argument was NaN.
?laqgb
Scales a general band matrix, using row and column
scaling factors computed by ?gbequ.
Syntax
call slaqgb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, equed )
call dlaqgb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, equed )
call claqgb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, equed )
call zlaqgb( m, n, kl, ku, ab, ldab, r, c, rowcnd, colcnd, amax, equed )
Include Files
mkl.fi
Description
The routine equilibrates a general m-by-n band matrix A with kl subdiagonals and ku superdiagonals using
the row and column scaling factors in the vectors r and c.
Input Parameters
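m INTEGER. The number of rows of the matrix A. m ≥ 0.
n INTEGER. The number of columns of the matrix A. n ≥ 0.
kl INTEGER. The number of subdiagonals within the band of A. kl ≥ 0.
ku INTEGER. The number of superdiagonals within the band of A. ku ≥ 0.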
ab REAL for slaqgb

DOUBLE PRECISION for dlaqgb
COMPLEX for claqgb
DOUBLE COMPLEX for zlaqgb
Array, DIMENSION (ldab,n). On entry, the matrix A in band storage, in rows
1 to kl+ku+1. The j-th column of A is stored in the j-th column of the array
ab as follows: ab(ku+1+i-j,j) = A(i,j) for max(1,j-ku) ≤ i ≤
min(m,j+kl).
ldab INTEGER. The leading dimension of the array ab.

ldab ≥ kl+ku+1.
amax REAL for slaqgb/claqgb

DOUBLE PRECISION for dlaqgb/zlaqgb
Absolute value of largest matrix entry.
r, c REAL for slaqgb/claqgb

DOUBLE PRECISION for dlaqgb/zlaqgb
Arrays r (m), c (n). Contain the row and column scale factors for A,
respectively.
rowcnd REAL for slaqgb/claqgb

DOUBLE PRECISION for dlaqgb/zlaqgb
Ratio of the smallest r(i) to the largest r(i).
colcnd REAL for slaqgb/claqgb

DOUBLE PRECISION for dlaqgb/zlaqgb
Ratio of the smallest c(i) to the largest c(i).
Output Parameters
ab On exit, the equilibrated matrix, in the same storage format as A.

See equed for the form of the equilibrated matrix.
equed CHARACTER*1.
Specifies the form of equilibration that was done.
If equed = 'N': No equilibration
If equed = 'R': Row equilibration, that is, A has been premultiplied by

diag(r).
If equed = 'C': Column equilibration, that is, A has been postmultiplied by
diag(c).
If equed = 'B': Both row and column equilibration, that is, A has been
replaced by diag(r)*A*diag(c).
Application Notes
The routine uses internal parameters thresh, large, and small, which have the following meaning. thresh is a
threshold value used to decide if row or column scaling should be done based on the ratio of the row or
column scaling factors. If rowcnd < thresh, row scaling is done, and if colcnd < thresh, column scaling
is done. large and small are threshold values used to decide if row scaling should be done based on the
absolute size of the largest matrix element. If amax > large or amax < small, row scaling is done.
?laqge
Scales a general rectangular matrix, using row and
column scaling factors computed by ?geequ.
Syntax
call slaqge( m, n, a, lda, r, c, rowcnd, colcnd, amax, equed )
call dlaqge( m, n, a, lda, r, c, rowcnd, colcnd, amax, equed )
call claqge( m, n, a, lda, r, c, rowcnd, colcnd, amax, equed )
call zlaqge( m, n, a, lda, r, c, rowcnd, colcnd, amax, equed )
Include Files
mkl.fi
Description
The routine equilibrates a general m-by-n matrix A using the row and column scaling factors in the vectors r
and c.
Input Parameters
m INTEGER. The number of rows of the matrix A.

m ≥ 0.
n INTEGER. The number of columns of the matrix A.

n ≥ 0.
a REAL for slaqge

DOUBLE PRECISION for dlaqge
COMPLEX for claqge
DOUBLE COMPLEX for zlaqge
Array, DIMENSION (lda,n). On entry, the m-by-n matrix A.

lda INTEGER. The leading dimension of the array a.

lda ≥ max(m,1).
r REAL for slaqge/claqge

DOUBLE PRECISION for dlaqge/zlaqge
Array, DIMENSION (m). The row scale factors for A.
c REAL for slaqge/claqge

DOUBLE PRECISION for dlaqge/zlaqge
Array, DIMENSION (n). The column scale factors for A.
rowcnd REAL for slaqge/claqge

DOUBLE PRECISION for dlaqge/zlaqge
Ratio of the smallest r(i) to the largest r(i).
colcnd REAL for slaqge/claqge

DOUBLE PRECISION for dlaqge/zlaqge
Ratio of the smallest c(i) to the largest c(i).
amax REAL for slaqge/claqge

DOUBLE PRECISION for dlaqge/zlaqge
Absolute value of largest matrix entry.
Output Parameters
a On exit, the equilibrated matrix.

See equed for the form of the equilibrated matrix.
equed CHARACTER*1.
Specifies the form of equilibration that was done.
If equed = 'N': No equilibration
If equed = 'R': Row equilibration, that is, A has been premultiplied by

diag(r).
If equed = 'C': Column equilibration, that is, A has been postmultiplied by
diag(c).
If equed = 'B': Both row and column equilibration, that is, A has been
replaced by diag(r)*A*diag(c).
Application Notes
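The routine uses internal parameters thresh, large, and small, which have the following meaning. thresh is a
threshold value used to decide if row or column scaling should be done based on the ratio of the row or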
column scaling factors. If rowcnd < thresh, row scaling is done, and if colcnd < thresh, column scaling
is done. large and small are threshold values used to decide if row scaling should be done based on the
absolute size of the largest matrix element. If amax > large or amax < small, row scaling is done.
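The decision logic in these notes can be sketched in Python for a dense matrix stored as a list of rows. The values used for thresh, small, and large are assumptions made for illustration (thresh = 0.1, small taken as the smallest normalized double divided by machine epsilon, large = 1/small); the actual internal values are not stated in this reference. The function name is hypothetical.

```python
import sys

def laqge_sketch(a, r, c, rowcnd, colcnd, amax, thresh=0.1):
    """Scale a in place by diag(r) and/or diag(c); return the equed code."""
    small = sys.float_info.min / sys.float_info.epsilon   # assumed value
    large = 1.0 / small                                   # assumed value
    row_scale = rowcnd < thresh or amax > large or amax < small
    col_scale = colcnd < thresh
    for i, row in enumerate(a):
        for j in range(len(row)):
            if row_scale:
                row[j] *= r[i]      # premultiply by diag(r)
            if col_scale:
                row[j] *= c[j]      # postmultiply by diag(c)
    if row_scale and col_scale:
        return 'B'
    if row_scale:
        return 'R'
    if col_scale:
        return 'C'
    return 'N'

a = [[1.0, 2.0], [3.0, 4.0]]
print(laqge_sketch(a, [2.0, 0.5], [1.0, 10.0], rowcnd=0.05, colcnd=0.5, amax=4.0), a)
```

Callers typically inspect equed afterwards to know which scale factors must be applied to the right-hand side and the solution.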
?laqhb
Scales a Hermitian band matrix, using scaling factors
computed by ?pbequ.
Syntax
call claqhb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
call zlaqhb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
Include Files
mkl.fi
Description
The routine equilibrates a Hermitian band matrix A using the scaling factors in the vector s.
Input Parameters
uplo CHARACTER*1.
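Specifies whether the upper or lower triangular part of the band matrix A is
stored.
If uplo = 'U': upper triangular.
If uplo = 'L': lower triangular.
n INTEGER. The order of the matrix A.

n ≥ 0.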
kd INTEGER. The number of super-diagonals of the matrix A if uplo = 'U', or

the number of sub-diagonals if uplo = 'L'.
kd ≥ 0.
ab COMPLEX for claqhb

DOUBLE COMPLEX for zlaqhb
Array, DIMENSION (ldab,n). On entry, the upper or lower triangle of the
band matrix A, stored in the first kd+1 rows of the array. The j-th column of
A is stored in the j-th column of the array ab as follows:
if uplo = 'U', ab(kd+1+i-j,j) = A(i,j) for max(1,j-kd) ≤ i ≤ j;
if uplo = 'L', ab(1+i-j,j) = A(i,j) for j ≤ i ≤ min(n,j+kd).
ldab INTEGER. The leading dimension of the array ab. ldab ≥ kd+1.
scond REAL for claqhb

DOUBLE PRECISION for zlaqhb
Ratio of the smallest s(i) to the largest s(i).
amax REAL for claqhb

DOUBLE PRECISION for zlaqhb
Absolute value of the largest matrix entry.
Output Parameters
ab On exit, if equed = 'Y', the equilibrated matrix diag(s)*A*diag(s), in the same
storage format as A.
s REAL for claqhb

DOUBLE PRECISION for zlaqhb
Array, DIMENSION (n). The scale factors for A.
equed CHARACTER*1.
Specifies whether or not equilibration was done.
If equed = 'N': No equilibration.
If equed = 'Y': Equilibration was done, that is, A has been replaced by
diag(s)*A*diag(s).
Application Notes
The routine uses internal parameters thresh, large, and small. thresh is a threshold value used to decide if
scaling should be done based on the ratio of the scaling factors. If scond < thresh, scaling is done.
The values large and small are threshold values used to decide if scaling should be done based on the
absolute size of the largest matrix element. If amax > large or amax < small, scaling is done.
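To make the combination of band storage and scaling concrete, the following Python sketch applies diag(s)*A*diag(s) directly to an array held in the upper band layout described for ab (uplo = 'U'). It is an illustration under stated assumptions, not the library code: scale_upper_band is a hypothetical name, indices are 0-based (so ab[kd+i-j][j] holds A(i,j)), and the example uses real values, so no special treatment of the diagonal of a complex Hermitian matrix is shown.

```python
def scale_upper_band(ab, s, kd):
    """Scale upper band storage in place so that it holds diag(s)*A*diag(s).

    ab has kd+1 rows and n columns; entry A(i,j) of the band lives at
    ab[kd + i - j][j] for max(0, j-kd) <= i <= j (0-based indices).
    """
    n = len(s)
    for j in range(n):
        for i in range(max(0, j - kd), j + 1):
            ab[kd + i - j][j] *= s[i] * s[j]
    return ab


# A = [[4, 1, 0], [1, 9, 2], [0, 2, 16]] with one super-diagonal (kd = 1).
ab = [[0.0, 1.0, 2.0],     # super-diagonal; ab[0][0] is unused padding
      [4.0, 9.0, 16.0]]    # main diagonal
scale_upper_band(ab, [0.5, 0.25, 2.0], kd=1)
```

Only the stored triangle is touched, which is why the unused padding entry keeps its original value.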
?laqp2
Computes a QR factorization with column pivoting of
the matrix block.
Syntax
call slaqp2( m, n, offset, a, lda, jpvt, tau, vn1, vn2, work )
call dlaqp2( m, n, offset, a, lda, jpvt, tau, vn1, vn2, work )
call claqp2( m, n, offset, a, lda, jpvt, tau, vn1, vn2, work )
call zlaqp2( m, n, offset, a, lda, jpvt, tau, vn1, vn2, work )
Include Files
mkl.fi
Description
The routine computes a QR factorization with column pivoting of the block A(offset+1:m,1:n). The block
A(1:offset,1:n) is accordingly pivoted, but not factorized.
Input Parameters
offset INTEGER. The number of rows of the matrix A that must be pivoted but not
factorized. offset ≥ 0.
a REAL for slaqp2

DOUBLE PRECISION for dlaqp2
COMPLEX for claqp2
DOUBLE COMPLEX for zlaqp2
Array, DIMENSION (lda,n). On entry, the m-by-n matrix A.
jpvt INTEGER. Array, DIMENSION (n).
On entry, if jpvt(i) ≠ 0, the i-th column of A is permuted to the front of

A*P (a leading column); if jpvt(i) = 0, the i-th column of A is a free
column.
vn1, vn2 REAL for slaqp2/claqp2

DOUBLE PRECISION for dlaqp2/zlaqp2
Arrays, DIMENSION (n) each. Contain the vectors with the partial and exact
column norms, respectively.
work REAL for slaqp2

DOUBLE PRECISION for dlaqp2
COMPLEX for claqp2
DOUBLE COMPLEX for zlaqp2
Workspace array, DIMENSION (n).
Output Parameters
a On exit, the upper triangle of block A(offset+1:m,1:n) is the triangular

factor obtained; the elements in block A(offset+1:m,1:n) below the
diagonal, together with the array tau, represent the orthogonal matrix Q as
a product of elementary reflectors. Block A(1:offset,1:n) has been
accordingly pivoted, but not factorized.
jpvt On exit, if jpvt(i) = k, then the i-th column of A*P was the k-th column
of A.
tau REAL for slaqp2

DOUBLE PRECISION for dlaqp2
COMPLEX for claqp2
DOUBLE COMPLEX for zlaqp2
Array, DIMENSION (min(m,n)). The scalar factors of the elementary reflectors.
vn1, vn2 Contain the vectors with the partial and exact column norms, respectively.
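The meaning of jpvt on exit can be illustrated with a short Python sketch that rebuilds A*P from A. This is a reading aid rather than part of the library: apply_column_pivots is a hypothetical helper, the matrix is a list of rows, and jpvt keeps the 1-based values that the Fortran routine returns.

```python
def apply_column_pivots(a, jpvt):
    """Return A*P, where column i of A*P is column jpvt(i) of A (1-based jpvt)."""
    return [[row[k - 1] for k in jpvt] for row in a]


a = [[1, 2, 3],
     [4, 5, 6]]
# jpvt = (3, 1, 2): the third column of A was moved to the front.
ap = apply_column_pivots(a, [3, 1, 2])
```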
?laqps
Computes a step of QR factorization with column
pivoting of a real m-by-n matrix A by using BLAS level
3.
Syntax
call slaqps( m, n, offset, nb, kb, a, lda, jpvt, tau, vn1, vn2, auxv, f, ldf )
call dlaqps( m, n, offset, nb, kb, a, lda, jpvt, tau, vn1, vn2, auxv, f, ldf )
call claqps( m, n, offset, nb, kb, a, lda, jpvt, tau, vn1, vn2, auxv, f, ldf )
call zlaqps( m, n, offset, nb, kb, a, lda, jpvt, tau, vn1, vn2, auxv, f, ldf )
Include Files
mkl.fi
Description
The routine computes a step of QR factorization with column pivoting of a real m-by-n matrix A by using
BLAS level 3. The routine tries to factorize nb columns from A starting from the row offset+1, and updates
all of the matrix with BLAS level 3 routine ?gemm.
In some cases, due to catastrophic cancellations, ?laqps cannot factorize nb columns. Hence, the actual
number of factorized columns is returned in kb.
Block A(1:offset,1:n) is accordingly pivoted, but not factorized.
Input Parameters
offset INTEGER. The number of rows of A that have been factorized in previous
steps.
nb INTEGER. The number of columns to factorize.
a REAL for slaqps

DOUBLE PRECISION for dlaqps
COMPLEX for claqps
DOUBLE COMPLEX for zlaqps
Array, DIMENSION (lda,n). On entry, the m-by-n matrix A.
lda INTEGER. The leading dimension of the array a. lda ≥ max(1,m).
jpvt INTEGER. Array, DIMENSION (n).

If jpvt(i) = k then column k of the full matrix A has been permuted into
position i in A*P.
vn1, vn2 REAL for slaqps/claqps

DOUBLE PRECISION for dlaqps/zlaqps
Arrays, DIMENSION (n) each. Contain the vectors with the partial and exact
column norms, respectively.
auxv REAL for slaqps

DOUBLE PRECISION for dlaqps
COMPLEX for claqps
DOUBLE COMPLEX for zlaqps
Array, DIMENSION (nb). Auxiliary vector.
f REAL for slaqps

DOUBLE PRECISION for dlaqps
COMPLEX for claqps
DOUBLE COMPLEX for zlaqps
Array, DIMENSION (ldf,nb). For real flavors, matrix F^T = L*Y^T*A. For
complex flavors, matrix F^H = L*Y^H*A.
ldf INTEGER. The leading dimension of the array f.

ldf ≥ max(1,n).
Output Parameters
kb INTEGER. The number of columns actually factorized.
a On exit, block A(offset+1:m,1:kb) is the triangular factor obtained and

block A(1:offset,1:n) has been accordingly pivoted, but not factorized. The
rest of the matrix, block A(offset+1:m,kb+1:n) has been updated.
jpvt INTEGER array, DIMENSION (n). If jpvt(i) = k then column k of the full
matrix A has been permuted into position i in A*P.
tau REAL for slaqps

DOUBLE PRECISION for dlaqps
COMPLEX for claqps
DOUBLE COMPLEX for zlaqps
Array, DIMENSION (kb). The scalar factors of the elementary reflectors.
vn1, vn2 The vectors with the partial and exact column norms, respectively.
auxv Auxiliary vector.
f Matrix F' = L*Y'*A.
?laqr0
Computes the eigenvalues of a Hessenberg matrix,
and optionally the matrices from the Schur
decomposition.
Syntax
call slaqr0( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz, work,
lwork, info )
call dlaqr0( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz, work,
lwork, info )
call claqr0( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, work, lwork,
info )
call zlaqr0( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, work, lwork,
info )
Include Files
mkl.fi
Description
The routine computes the eigenvalues of a Hessenberg matrix H, and, optionally, the matrices T and Z from
the Schur decomposition H = Z*T*Z^H, where T is an upper quasi-triangular/triangular matrix (the Schur form),
and Z is the orthogonal/unitary matrix of Schur vectors.
Optionally Z may be postmultiplied into an input orthogonal/unitary matrix Q so that this routine can give the
Schur factorization of a matrix A which has been reduced to the Hessenberg form H by the orthogonal/unitary
matrix Q: A = Q*H*Q^H = (QZ)*T*(QZ)^H.
Input Parameters
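wantt LOGICAL.
If wantt = .TRUE., the full Schur form T is required;
If wantt = .FALSE., only eigenvalues are required.
wantz LOGICAL.
If wantz = .TRUE., the matrix of Schur vectors Z is required;
If wantz = .FALSE., Schur vectors are not required.
n INTEGER. The order of the Hessenberg matrix H. (n ≥ 0).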
ilo, ihi INTEGER.
It is assumed that H is already upper triangular in rows and columns
1:ilo-1 and ihi+1:n, and if ilo > 1 then H(ilo, ilo-1) = 0.
ilo and ihi are normally set by a previous call to ?gebal, and then
passed to ?gehrd when the matrix output by ?gebal is reduced to
Hessenberg form. Otherwise, ilo and ihi should be set to 1 and n,
respectively.
If n > 0, then 1 ≤ ilo ≤ ihi ≤ n.
If n=0, then ilo=1 and ihi=0.
h REAL for slaqr0

DOUBLE PRECISION for dlaqr0
COMPLEX for claqr0
DOUBLE COMPLEX for zlaqr0.
Array, DIMENSION (ldh, n), contains the upper Hessenberg matrix H.
ldh INTEGER. The leading dimension of the array h. ldh ≥ max(1, n).
iloz, ihiz INTEGER. Specify the rows of Z to which transformations must be applied if
wantz is .TRUE., 1 ≤ iloz ≤ ilo; ihi ≤ ihiz ≤ n.
z REAL for slaqr0

COMPLEX for claqr0
Array, DIMENSION (ldz, ihi), contains the matrix Z if wantz is .TRUE.. If
wantz is .FALSE., z is not referenced.

ldz INTEGER. The leading dimension of the array z.
If wantz is .TRUE., then ldz ≥ max(1, ihiz). Otherwise, ldz ≥ 1.
work REAL for slaqr0

COMPLEX for claqr0
Workspace array with dimension lwork.
lwork INTEGER. The dimension of the array work.
lwork ≥ max(1,n) is sufficient, but for the optimal performance a greater
workspace may be required, typically as large as 6*n.
It is recommended to use the workspace query to determine the optimal
workspace size. If lwork=-1, then the routine performs a workspace query:
it estimates the optimal workspace size for the given values of the input
parameters n, ilo, and ihi. The estimate is returned in work(1). No error
messages related to lwork are issued by xerbla. Neither H nor Z is
accessed.
Output Parameters
h If info=0, and wantt is .TRUE., then h contains the upper quasi-triangular/

triangular matrix T from the Schur decomposition (the Schur form).
If info=0, and wantt is .FALSE., then the contents of h are unspecified on
exit.
(The output values of h when info > 0 are given under the description of
the info parameter below.)
The routine may explicitly set h(i,j) for i>j and j=1,2,...ilo-1 or
j=ihi+1, ihi+2,...n.
work(1) On exit work(1) contains the minimum value of lwork required for optimum
performance.
w COMPLEX for claqr0

Array, DIMENSION(n). The computed eigenvalues of h(ilo:ihi, ilo:ihi)
are stored in w(ilo:ihi). If wantt is .TRUE., then the eigenvalues are
stored in the same order as on the diagonal of the Schur form returned in
h, with w(i) = h(i,i).
wr, wi REAL for slaqr0

Arrays, DIMENSION(ihi) each. The real and imaginary parts, respectively, of
the computed eigenvalues of h(ilo:ihi, ilo:ihi) are stored in
wr(ilo:ihi) and wi(ilo:ihi). If two eigenvalues are computed as a
complex conjugate pair, they are stored in consecutive elements of wr and
wi, say the i-th and (i+1)-th, with wi(i)> 0 and wi(i+1) < 0. If wantt
is .TRUE., then the eigenvalues are stored in the same order as on the
diagonal of the Schur form returned in h, with wr(i) = h(i,i), and if
h(i:i+1,i:i+1) is a 2-by-2 diagonal block, then
wi(i)=sqrt(-h(i+1,i)*h(i,i+1)).
z If wantz is .TRUE., then z(ilo:ihi, iloz:ihiz) is replaced by z(ilo:ihi,

iloz:ihiz)*U, where U is the orthogonal/unitary Schur factor of
h(ilo:ihi, ilo:ihi).
If wantz is .FALSE., z is not referenced.
(The output values of z when info > 0 are given under the description of
the info parameter below.)
info INTEGER.
= 0: the execution is successful.
> 0: if info = i, then the routine failed to compute all the eigenvalues.
Elements 1:ilo-1 and i+1:n of wr and wi contain those eigenvalues which
have been successfully computed.
> 0: if wantt is .FALSE., then the remaining unconverged eigenvalues are
the eigenvalues of the upper Hessenberg matrix rows and columns ilo
through info of the final output value of h.
> 0: if wantt is .TRUE., then (initial value of h)*U = U*(final value of h),

where U is an orthogonal/unitary matrix. The final value of h is upper
Hessenberg and quasi-triangular/triangular in rows and columns info+1
through ihi.
> 0: if wantz is .TRUE., then (final value of z(ilo:ihi,

iloz:ihiz))=(initial value of z(ilo:ihi, iloz:ihiz))*U, where U is the
orthogonal/unitary matrix in the previous expression (regardless of the
value of wantt).
> 0: if wantz is .FALSE., then z is not accessed.
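The relation between the Schur form returned in h and the arrays wr and wi (real flavors) can be sketched as follows. This Python fragment is only an illustration of the storage convention described above, not the routine itself: eigenvalues_from_schur is a hypothetical name, indices are 0-based, and each 2-by-2 diagonal block is assumed to be in the standardized form with equal diagonal entries and off-diagonal entries of opposite sign.

```python
import math


def eigenvalues_from_schur(t):
    """Read (wr, wi) off a real upper quasi-triangular matrix t (list of rows)."""
    n = len(t)
    wr = [0.0] * n
    wi = [0.0] * n
    i = 0
    while i < n:
        if i + 1 < n and t[i + 1][i] != 0.0:
            # 2-by-2 diagonal block: a complex conjugate pair, positive part first.
            im = math.sqrt(-t[i + 1][i] * t[i][i + 1])
            wr[i], wr[i + 1] = t[i][i], t[i + 1][i + 1]
            wi[i], wi[i + 1] = im, -im
            i += 2
        else:
            # 1-by-1 block: a real eigenvalue on the diagonal.
            wr[i] = t[i][i]
            i += 1
    return wr, wi


t = [[1.0, 4.0, 5.0],
     [-1.0, 1.0, 6.0],
     [0.0, 0.0, 7.0]]
wr, wi = eigenvalues_from_schur(t)
```

Here the leading block contributes the pair 1 ± 2i and the trailing entry contributes the real eigenvalue 7.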
?laqr1
Sets a scalar multiple of the first column of the
product of 2-by-2 or 3-by-3 matrix H and specified
shifts.
Syntax
call slaqr1( n, h, ldh, sr1, si1, sr2, si2, v )
call dlaqr1( n, h, ldh, sr1, si1, sr2, si2, v )
call claqr1( n, h, ldh, s1, s2, v )
call zlaqr1( n, h, ldh, s1, s2, v )
Include Files
mkl.fi
Description
Given a 2-by-2 or 3-by-3 matrix H, this routine sets v to a scalar multiple of the first column of the product
K = (H - s1*I)*(H - s2*I), or K = (H - (sr1 + i*si1)*I)*(H - (sr2 + i*si2)*I)
scaling to avoid overflows and most underflows.
It is assumed that either 1) sr1 = sr2 and si1 = -si2, or 2) si1 = si2 = 0.
This is useful for starting double implicit shift bulges in the QR algorithm.
Input Parameters
n INTEGER.
The order of the matrix H. n must be equal to 2 or 3.
sr1, si1, sr2, si2 REAL for slaqr1

Shift values that define K in the formula above.
s1, s2 COMPLEX for claqr1

Shift values that define K in the formula above.

h REAL for slaqr1
COMPLEX for claqr1
Array, DIMENSION (ldh, n), contains 2-by-2 or 3-by-3 matrix H in the
formula above.
ldh INTEGER.
The leading dimension of the array h just as declared in the calling routine.
ldh ≥ n.
Output Parameters
v REAL for slaqr1

COMPLEX for claqr1
Array with dimension (n).
A scalar multiple of the first column of the matrix K in the formula above.
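The quantity returned in v can be checked by hand for small cases. The Python sketch below forms the first column of K = (H - s1*I)*(H - s2*I) directly. It is an unscaled illustration, so it corresponds to a scalar multiple equal to 1 and does not reproduce the overflow and underflow safeguards of the routine; first_column_of_k is a hypothetical name and the shifts are passed as Python complex numbers.

```python
def first_column_of_k(h, s1, s2):
    """Return the first column of (H - s1*I)*(H - s2*I) for a small square H."""
    n = len(h)
    # First column of (H - s2*I): column 0 of H with s2 subtracted on the diagonal.
    col = [h[i][0] - (s2 if i == 0 else 0) for i in range(n)]
    # Multiply that column by (H - s1*I).
    return [
        sum((h[i][k] - (s1 if i == k else 0)) * col[k] for k in range(n))
        for i in range(n)
    ]


# A complex conjugate pair of shifts (sr1 = sr2 = 1, si1 = -si2 = 1) gives a real result.
v = first_column_of_k([[1, 2], [3, 4]], 1 + 1j, 1 - 1j)
```

With conjugate shifts the imaginary parts cancel, which is consistent with the assumption stated in the description that either the shifts form a conjugate pair or both are real.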
?laqr2
Performs the orthogonal/unitary similarity
transformation of a Hessenberg matrix to detect and
deflate fully converged eigenvalues from a trailing
principal submatrix (aggressive early deflation).
Syntax
call slaqr2( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sr,
si, v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
call dlaqr2( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sr,
si, v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
call claqr2( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sh,
v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
call zlaqr2( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sh,
v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
Include Files
mkl.fi
Description
The routine accepts as input an upper Hessenberg matrix H and performs an orthogonal/unitary similarity
transformation designed to detect and deflate fully converged eigenvalues from a trailing principal submatrix.
On output H has been overwritten by a new Hessenberg matrix that is a perturbation of an orthogonal/
unitary similarity transformation of H. It is to be hoped that the final version of H has many zero subdiagonal
entries.
This subroutine is identical to ?laqr3 except that it avoids recursion by calling ?lahqr instead of ?laqr4.
Input Parameters
wantt LOGICAL.
If wantt = .TRUE., then the Hessenberg matrix H is fully updated so that
the quasi-triangular/triangular Schur factor may be computed (in
cooperation with the calling subroutine).
If wantt = .FALSE., then only enough of H is updated to preserve the
eigenvalues.
wantz LOGICAL.
If wantz = .TRUE., then the orthogonal/unitary matrix Z is updated so
that the orthogonal/unitary Schur factor may be computed (in cooperation
with the calling subroutine).
If wantz = .FALSE., then Z is not referenced.
n INTEGER. The order of the Hessenberg matrix H and (if wantz = .TRUE.)
the order of the orthogonal/unitary matrix Z.
ktop INTEGER.
It is assumed that either ktop=1 or h(ktop,ktop-1)=0. ktop and kbot
together determine an isolated block along the diagonal of the Hessenberg
matrix.
kbot INTEGER.
It is assumed without a check that either kbot=n or h(kbot+1,kbot)=0.
ktop and kbot together determine an isolated block along the diagonal of
the Hessenberg matrix.
nw INTEGER.
Size of the deflation window. 1 ≤ nw ≤ (kbot-ktop+1).
h REAL for slaqr2

COMPLEX for claqr2
Array, DIMENSION (ldh, n), on input the initial n-by-n section of h stores the
Hessenberg matrix H undergoing aggressive early deflation.
ldh INTEGER. The leading dimension of the array h just as declared in the
calling subroutine. ldh ≥ n.
iloz, ihiz INTEGER. Specify the rows of Z to which transformations must be applied if
wantz is .TRUE.. 1 ≤ iloz ≤ ihiz ≤ n.
z REAL for slaqr2

COMPLEX for claqr2
Array, DIMENSION (ldz, n), contains the matrix Z if wantz is .TRUE.. If
wantz is .FALSE., then z is not referenced.
ldz INTEGER. The leading dimension of the array z just as declared in the
calling subroutine. ldz ≥ 1.
v REAL for slaqr2

COMPLEX for claqr2
Workspace array with dimension (ldv, nw). An nw-by-nw work array.
ldv INTEGER. The leading dimension of the array v just as declared in the
calling subroutine. ldv ≥ nw.
nh INTEGER. The number of columns of t. nh ≥ nw.
t REAL for slaqr2

COMPLEX for claqr2
Workspace array with dimension (ldt, nw).
ldt INTEGER. The leading dimension of the array t just as declared in the
calling subroutine. ldt ≥ nw.
nv INTEGER. The number of rows of work array wv available for workspace.

nv ≥ nw.
wv REAL for slaqr2

COMPLEX for claqr2
Workspace array with dimension (ldwv, nw).
ldwv INTEGER. The leading dimension of the array wv just as declared in the
calling subroutine. ldwv ≥ nw.
work REAL for slaqr2

COMPLEX for claqr2
Workspace array with dimension lwork.
lwork INTEGER. The dimension of the array work.
lwork=2*nw is sufficient, but for the optimal performance a greater
workspace may be required.
If lwork=-1, then the routine performs a workspace query: it estimates the
optimal workspace size for the given values of the input parameters n, nw,
ktop, and kbot. The estimate is returned in work(1). No error messages
related to lwork are issued by xerbla. Neither H nor Z is accessed.
Output Parameters
h On output h has been transformed by an orthogonal/unitary similarity

transformation, perturbed, and then returned to Hessenberg form that (it is
to be hoped) has some zero subdiagonal entries.
work(1) On exit work(1) is set to an estimate of the optimal value of lwork for the
given values of the input parameters n, nw, ktop, and kbot.
z If wantz is .TRUE., then the orthogonal/unitary similarity transformation is

accumulated into z(iloz:ihiz, ilo:ihi) from the right.
If wantz is .FALSE., then z is unreferenced.
nd INTEGER. The number of converged eigenvalues uncovered by the routine.
ns INTEGER. The number of unconverged (that is, approximate) eigenvalues

returned in sr, si or in sh that may be used as shifts by the calling
subroutine.
sh COMPLEX for claqr2

Array, DIMENSION (kbot).
The approximate eigenvalues that may be used for shifts are stored in
sh(kbot-nd-ns+1) through sh(kbot-nd).
The converged eigenvalues are stored in sh(kbot-nd+1) through
sh(kbot).
sr, si REAL for slaqr2

Arrays, DIMENSION (kbot) each.
The real and imaginary parts of the approximate eigenvalues that may be
used for shifts are stored in sr(kbot-nd-ns+1) through sr(kbot-nd),
and si(kbot-nd-ns+1) through si(kbot-nd), respectively.
The real and imaginary parts of converged eigenvalues are stored in
sr(kbot-nd+1) through sr(kbot), and si(kbot-nd+1) through
si(kbot), respectively.
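The index ranges used for the shift arrays are easy to misread, so the following Python sketch splits such an array into its two parts. It is a bookkeeping aid only: split_shifts is a hypothetical helper, the list sh stands for the Fortran elements sh(1:kbot), and the same slicing would apply to sr and si for the real flavors.

```python
def split_shifts(sh, kbot, nd, ns):
    """Split sh(1:kbot) into (approximate eigenvalues usable as shifts, converged)."""
    # Fortran sh(a:b), 1-based and inclusive, corresponds to Python sh[a-1:b].
    shifts = sh[kbot - nd - ns:kbot - nd]   # sh(kbot-nd-ns+1 : kbot-nd)
    converged = sh[kbot - nd:kbot]          # sh(kbot-nd+1 : kbot)
    return shifts, converged


shifts, converged = split_shifts([10, 20, 30, 40, 50, 60], kbot=6, nd=2, ns=3)
```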
?laqr3
Performs the orthogonal/unitary similarity
transformation of a Hessenberg matrix to detect and
deflate fully converged eigenvalues from a trailing
principal submatrix (aggressive early deflation).
Syntax
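call slaqr3( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sr,
si, v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
call dlaqr3( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sr,
si, v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
call claqr3( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sh,
v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )
call zlaqr3( wantt, wantz, n, ktop, kbot, nw, h, ldh, iloz, ihiz, z, ldz, ns, nd, sh,
v, ldv, nh, t, ldt, nv, wv, ldwv, work, lwork )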
Include Files
mkl.fi
Description
The routine accepts as input an upper Hessenberg matrix H and performs an orthogonal/unitary similarity
transformation designed to detect and deflate fully converged eigenvalues from a trailing principal submatrix.
On output H has been overwritten by a new Hessenberg matrix that is a perturbation of an orthogonal/
unitary similarity transformation of H. It is to be hoped that the final version of H has many zero subdiagonal
entries.
Input Parameters
wantt LOGICAL.
If wantt = .TRUE., then the Hessenberg matrix H is fully updated so that
the quasi-triangular/triangular Schur factor may be computed (in
cooperation with the calling subroutine).
If wantt = .FALSE., then only enough of H is updated to preserve the
eigenvalues.
wantz LOGICAL.
If wantz = .TRUE., then the orthogonal/unitary matrix Z is updated so
that the orthogonal/unitary Schur factor may be computed (in cooperation
with the calling subroutine).
If wantz = .FALSE., then Z is not referenced.
n INTEGER. The order of the Hessenberg matrix H and (if wantz = .TRUE.)
the order of the orthogonal/unitary matrix Z.
ktop INTEGER.
It is assumed that either ktop=1 or h(ktop,ktop-1)=0. ktop and kbot
together determine an isolated block along the diagonal of the Hessenberg
matrix.
kbot INTEGER.
It is assumed without a check that either kbot=n or h(kbot+1,kbot)=0.
ktop and kbot together determine an isolated block along the diagonal of
the Hessenberg matrix.
nw INTEGER.
Size of the deflation window. 1 ≤ nw ≤ (kbot-ktop+1).
h REAL for slaqr3

COMPLEX for claqr3
Array, DIMENSION (ldh, n), on input the initial n-by-n section of h stores the
Hessenberg matrix H undergoing aggressive early deflation.
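ldh INTEGER. The leading dimension of the array h just as declared in the
calling subroutine. ldh ≥ n.
iloz, ihiz INTEGER. Specify the rows of Z to which transformations must be applied if
wantz is .TRUE.. 1 ≤ iloz ≤ ihiz ≤ n.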
z REAL for slaqr3

COMPLEX for claqr3
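Array, DIMENSION (ldz, n), contains the matrix Z if wantz is .TRUE.. If
wantz is .FALSE., then z is not referenced.
ldz INTEGER. The leading dimension of the array z just as declared in the
calling subroutine. ldz ≥ 1.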
v REAL for slaqr3

COMPLEX for claqr3
Workspace array with dimension (ldv, nw). An nw-by-nw work array.
ldv INTEGER. The leading dimension of the array v just as declared in the
calling subroutine. ldv ≥ nw.
nh INTEGER. The number of columns of t. nh ≥ nw.
t REAL for slaqr3

COMPLEX for claqr3
Workspace array with dimension (ldt, nw).
ldt INTEGER. The leading dimension of the array t just as declared in the
calling subroutine. ldt ≥ nw.
nv INTEGER. The number of rows of work array wv available for workspace.

nv ≥ nw.
wv REAL for slaqr3

COMPLEX for claqr3
Workspace array with dimension (ldwv, nw).
ldwv INTEGER. The leading dimension of the array wv just as declared in the
calling subroutine. ldwv ≥ nw.
work REAL for slaqr3

COMPLEX for claqr3
Workspace array with dimension lwork.
lwork INTEGER. The dimension of the array work.
lwork=2*nw is sufficient, but for the optimal performance a greater
workspace may be required.
If lwork=-1, then the routine performs a workspace query: it estimates the
optimal workspace size for the given values of the input parameters n, nw,
ktop, and kbot. The estimate is returned in work(1). No error messages
related to lwork are issued by xerbla. Neither H nor Z is accessed.
Output Parameters
h On output h has been transformed by an orthogonal/unitary similarity

transformation, perturbed, and then returned to Hessenberg form that (it is
to be hoped) has some zero subdiagonal entries.
work(1) On exit work(1) is set to an estimate of the optimal value of lwork for the
given values of the input parameters n, nw, ktop, and kbot.
z If wantz is .TRUE., then the orthogonal/unitary similarity transformation is

accumulated into z(iloz:ihiz, ilo:ihi) from the right.
nd INTEGER. The number of converged eigenvalues uncovered by the routine.
ns INTEGER. The number of unconverged (that is, approximate) eigenvalues

returned in sr, si or in sh that may be used as shifts by the calling
subroutine.
sh COMPLEX for claqr3

Array, DIMENSION (kbot).
The approximate eigenvalues that may be used for shifts are stored in
sh(kbot-nd-ns+1) through sh(kbot-nd).
The converged eigenvalues are stored in sh(kbot-nd+1) through
sh(kbot).
sr, si REAL for slaqr3

Arrays, DIMENSION (kbot) each.
The real and imaginary parts of the approximate eigenvalues that may be
used for shifts are stored in sr(kbot-nd-ns+1) through sr(kbot-nd),
and si(kbot-nd-ns+1) through si(kbot-nd), respectively.
The real and imaginary parts of converged eigenvalues are stored in
sr(kbot-nd+1) through sr(kbot), and si(kbot-nd+1) through
si(kbot), respectively.
?laqr4
Computes the eigenvalues of a Hessenberg matrix,
and optionally the matrices from the Schur
decomposition.
Syntax
call slaqr4( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz, work,
lwork, info )
call dlaqr4( wantt, wantz, n, ilo, ihi, h, ldh, wr, wi, iloz, ihiz, z, ldz, work,
lwork, info )
call claqr4( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, work, lwork,
info )
call zlaqr4( wantt, wantz, n, ilo, ihi, h, ldh, w, iloz, ihiz, z, ldz, work, lwork,
info )
Include Files
mkl.fi
Description
The routine computes the eigenvalues of a Hessenberg matrix H, and, optionally, the matrices T and Z from
the Schur decomposition H = Z*T*Z^H, where T is an upper quasi-triangular/triangular matrix (the Schur form),
and Z is the orthogonal/unitary matrix of Schur vectors.
Optionally Z may be postmultiplied into an input orthogonal/unitary matrix Q so that this routine can give the
Schur factorization of a matrix A which has been reduced to the Hessenberg form H by the orthogonal/unitary
matrix Q: A = Q*H*Q^H = (QZ)*T*(QZ)^H.
This routine implements one level of recursion for ?laqr0. It is a complete implementation of the small bulge
multi-shift QR algorithm. It may be called by ?laqr0 and, for large enough deflation window size, it may be
called by ?laqr3. This routine is identical to ?laqr0 except that it calls ?laqr2 instead of ?laqr3.
Input Parameters
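wantt LOGICAL.
If wantt = .TRUE., the full Schur form T is required;
If wantt = .FALSE., only eigenvalues are required.
wantz LOGICAL.
If wantz = .TRUE., the matrix of Schur vectors Z is required;
If wantz = .FALSE., Schur vectors are not required.
n INTEGER. The order of the Hessenberg matrix H. (n ≥ 0).
ilo, ihi INTEGER.
It is assumed that H is already upper triangular in rows and columns
1:ilo-1 and ihi+1:n, and if ilo > 1 then h(ilo, ilo-1) = 0.
ilo and ihi are normally set by a previous call to ?gebal, and then
passed to ?gehrd when the matrix output by ?gebal is reduced to
Hessenberg form. Otherwise, ilo and ihi should be set to 1 and n,
respectively.
If n > 0, then 1 ≤ ilo ≤ ihi ≤ n.
If n=0, then ilo=1 and ihi=0.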
h REAL for slaqr4

COMPLEX for claqr4
Array, DIMENSION (ldh, n), contains the upper Hessenberg matrix H.
ldh INTEGER. The leading dimension of the array h. ldh ≥ max(1, n).
iloz, ihiz INTEGER. Specify the rows of Z to which transformations must be applied if
wantz is .TRUE., 1 ≤ iloz ≤ ilo; ihi ≤ ihiz ≤ n.
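z REAL for slaqr4

COMPLEX for claqr4
Array, DIMENSION (ldz, ihi), contains the matrix Z if wantz is .TRUE.. If
wantz is .FALSE., z is not referenced.
ldz INTEGER. The leading dimension of the array z.
If wantz is .TRUE., then ldz ≥ max(1, ihiz). Otherwise, ldz ≥ 1.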

work REAL for slaqr4

COMPLEX for claqr4
Workspace array with dimension lwork.
lwork INTEGER. The dimension of the array work.
lwork ≥ max(1,n) is sufficient, but for the optimal performance a greater
workspace may be required, typically as large as 6*n.
It is recommended to use the workspace query to determine the optimal
workspace size. If lwork=-1, then the routine performs a workspace query:
it estimates the optimal workspace size for the given values of the input
parameters n, ilo, and ihi. The estimate is returned in work(1). No error
messages related to lwork are issued by xerbla. Neither H nor Z is
accessed.
Output Parameters
h If info=0, and wantt is .TRUE., then h contains the upper quasi-triangular/

triangular matrix T from the Schur decomposition (the Schur form).
If info=0, and wantt is .FALSE., then the contents of h are unspecified on
exit.
(The output values of h when info > 0 are given under the description of
the info parameter below.)
The routine may explicitly set h(i,j) for i>j and j=1,2,...ilo-1 or
j=ihi+1, ihi+2,...n.
work(1) On exit work(1) contains the minimum value of lwork required for optimum
performance.
w COMPLEX for claqr4

Array, DIMENSION(n). The computed eigenvalues of h(ilo:ihi, ilo:ihi)
are stored in w(ilo:ihi). If wantt is .TRUE., then the eigenvalues are
stored in the same order as on the diagonal of the Schur form returned in
h, with w(i) = h(i,i).
wr, wi REAL for slaqr4

Arrays, DIMENSION(ihi) each. The real and imaginary parts, respectively, of
the computed eigenvalues of h(ilo:ihi, ilo:ihi) are stored in the
wr(ilo:ihi) and wi(ilo:ihi). If two eigenvalues are computed as a
complex conjugate pair, they are stored in consecutive elements of wr and
wi, say the i-th and (i+1)-th, with wi(i)> 0 and wi(i+1) < 0. If wantt
is .TRUE., then the eigenvalues are stored in the same order as on the
diagonal of the Schur form returned in h, with wr(i) = h(i,i), and if
h(i:i+1,i:i+1) is a 2-by-2 diagonal block, then
wi(i)=sqrt(-h(i+1,i)*h(i,i+1)).
z If wantz is .TRUE., then z(ilo:ihi, iloz:ihiz) is replaced by z(ilo:ihi,

iloz:ihiz)*U, where U is the orthogonal/unitary Schur factor of
h(ilo:ihi, ilo:ihi).
If wantz is .FALSE., z is not referenced.
(The output values of z when info > 0 are given under the description of
the info parameter below.)
info INTEGER.
= 0: the execution is successful.
> 0: if info = i, then the routine failed to compute all the eigenvalues.
Elements 1:ilo-1 and i+1:n of wr and wi contain those eigenvalues which
have been successfully computed.
> 0: if wantt is .FALSE., then the remaining unconverged eigenvalues are
the eigenvalues of the upper Hessenberg matrix rows and columns ilo
through info of the final output value of h.
> 0: if wantt is .TRUE., then (initial value of h)*U = U*(final value of h),

where U is an orthogonal/unitary matrix. The final value of h is upper
Hessenberg and quasi-triangular/triangular in rows and columns info+1
through ihi.
> 0: if wantz is .TRUE., then (final value of z(ilo:ihi,

iloz:ihiz))=(initial value of z(ilo:ihi, iloz:ihiz))*U, where U is the
orthogonal/unitary matrix in the previous expression (regardless of the
value of wantt).
> 0: if wantz is .FALSE., then z is not accessed.
?laqr5
Performs a single small-bulge multi-shift QR sweep.
Syntax
call slaqr5( wantt, wantz, kacc22, n, ktop, kbot, nshfts, sr, si, h, ldh, iloz, ihiz,
z, ldz, v, ldv, u, ldu, nv, wv, ldwv, nh, wh, ldwh )
call dlaqr5( wantt, wantz, kacc22, n, ktop, kbot, nshfts, sr, si, h, ldh, iloz, ihiz,
z, ldz, v, ldv, u, ldu, nv, wv, ldwv, nh, wh, ldwh )
call claqr5( wantt, wantz, kacc22, n, ktop, kbot, nshfts, s, h, ldh, iloz, ihiz, z,
ldz, v, ldv, u, ldu, nv, wv, ldwv, nh, wh, ldwh )
call zlaqr5( wantt, wantz, kacc22, n, ktop, kbot, nshfts, s, h, ldh, iloz, ihiz, z,
ldz, v, ldv, u, ldu, nv, wv, ldwv, nh, wh, ldwh )
Include Files
mkl.fi
Description
This auxiliary routine called by ?laqr0 performs a single small-bulge multi-shift QR sweep.
Input Parameters
wantt LOGICAL.
wantt = .TRUE. if the quasi-triangular/triangular Schur factor is
computed.
wantt is set to .FALSE. otherwise.
wantz LOGICAL.
wantz = .TRUE. if the orthogonal/unitary Schur factor is computed.
wantz is set to .FALSE. otherwise.
kacc22 INTEGER. Possible values are 0, 1, or 2.

Specifies the computation mode of far-from-diagonal orthogonal updates.
= 0: the routine does not accumulate reflections and does not use matrix-
matrix multiply to update far-from-diagonal matrix entries.
= 1: the routine accumulates reflections and uses matrix-matrix multiply to
update the far-from-diagonal matrix entries.
= 2: the routine accumulates reflections, uses matrix-matrix multiply to
update the far-from-diagonal matrix entries, and takes advantage of 2-by-2
block structure during matrix multiplies.
n INTEGER. The order of the Hessenberg matrix H upon which the routine
operates.
ktop, kbot INTEGER.

It is assumed without a check that either ktop=1 or h(ktop,ktop-1)=0,
and either kbot=n or h(kbot+1,kbot)=0.
nshfts INTEGER.
Number of simultaneous shifts, must be positive and even.

sr, si REAL for slaqr5

Arrays, DIMENSION (nshfts) each.
sr contains the real parts and si contains the imaginary parts of the
nshfts shifts of origin that define the multi-shift QR sweep.
s COMPLEX for claqr5

Array, DIMENSION (nshfts).
s contains the shifts of origin that define the multi-shift QR sweep.
h REAL for slaqr5

COMPLEX for claqr5
Array, DIMENSION (ldh, n), on input contains the Hessenberg matrix.
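ldh INTEGER. The leading dimension of the array h just as declared in the
calling routine. ldh ≥ max(1, n).
iloz, ihiz INTEGER. Specify the rows of Z to which transformations must be applied if
wantz is .TRUE.. 1 ≤ iloz ≤ ihiz ≤ n.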
z REAL for slaqr5

COMPLEX for claqr5
ldz INTEGER. The leading dimension of the array z just as declared in the
calling routine. ldz ≥ n.
v REAL for slaqr5

COMPLEX for claqr5
Workspace array with dimension (ldv, nshfts/2).
ldv INTEGER. The leading dimension of the array v just as declared in the
calling routine. ldv ≥ 3.
u REAL for slaqr5

COMPLEX for claqr5
Workspace array with dimension (ldu, 3*nshfts-3).
ldu INTEGER. The leading dimension of the array u just as declared in the
calling routine. ldu ≥ 3*nshfts-3.
nh INTEGER. The number of columns in the array wh available for workspace.

nh ≥ 1.
wh REAL for slaqr5

COMPLEX for claqr5
Workspace array with dimension (ldwh, nh)
ldwh INTEGER. The leading dimension of the array wh just as declared in the
calling routine. ldwh ≥ 3*nshfts-3.
nv INTEGER. The number of rows of the array wv available for workspace. nv ≥ 1.
wv REAL for slaqr5

COMPLEX for claqr5
Workspace array with dimension (ldwv, 3*nshfts-3).
ldwv INTEGER. The leading dimension of the array wv just as declared in the
calling routine. ldwv ≥ nv.
Output Parameters
sr, si On output, may be reordered.
h On output a multi-shift QR sweep with shifts sr(j)+i*si(j) or s(j) is

applied to the isolated diagonal block in rows and columns ktop through
kbot.
z If wantz is .TRUE., then the QR sweep orthogonal/unitary similarity

transformation is accumulated into z(iloz:ihiz, ilo:ihi) from the right.
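The workspace arrays of this routine all depend on nshfts, nv, and nh, so it can help to compute their smallest admissible shapes in one place. The Python sketch below does that from the constraints listed in the parameter descriptions. It is a planning aid under stated assumptions: laqr5_workspace_shapes is a hypothetical helper, each leading dimension is set to its documented lower bound, and the routine itself performs no such check on behalf of the caller.

```python
def laqr5_workspace_shapes(nshfts, nv, nh):
    """Return the smallest (leading dimension, columns) shape of v, u, wh and wv."""
    if nshfts <= 0 or nshfts % 2 != 0:
        raise ValueError("nshfts must be positive and even")
    if nv < 1 or nh < 1:
        raise ValueError("nv and nh must be at least 1")
    k = 3 * nshfts - 3
    return {
        "v": (3, nshfts // 2),   # ldv >= 3
        "u": (k, k),             # ldu >= 3*nshfts-3
        "wh": (k, nh),           # ldwh >= 3*nshfts-3
        "wv": (nv, k),           # ldwv >= nv
    }


shapes = laqr5_workspace_shapes(nshfts=4, nv=5, nh=7)
```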
?laqsb
Scales a symmetric band matrix, using scaling factors
computed by ?pbequ.
Syntax
call slaqsb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
call dlaqsb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
call claqsb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
call zlaqsb( uplo, n, kd, ab, ldab, s, scond, amax, equed )
Include Files
mkl.fi
Description
The routine equilibrates a symmetric band matrix A using the scaling factors in the vector s.
Input Parameters
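uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is stored.
If uplo = 'U': upper triangular.
If uplo = 'L': lower triangular.
n INTEGER. The order of the matrix A.
n ≥ 0.
kd INTEGER. The number of super-diagonals of the matrix A if uplo = 'U', or the
number of sub-diagonals if uplo = 'L'.
kd ≥ 0.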
ab REAL for slaqsb

DOUBLE PRECISION for dlaqsb
COMPLEX for claqsb
DOUBLE COMPLEX for zlaqsb
Array, DIMENSION (ldab,n). On entry, the upper or lower triangle of the
symmetric band matrix A, stored in the first kd+1 rows of the array. The
j-th column of A is stored in the j-th column of the array ab as follows:
if uplo = 'U', ab(kd+1+i-j,j) = A(i,j) for max(1,j-kd) ≤ i ≤ j;
if uplo = 'L', ab(1+i-j,j) = A(i,j) for j ≤ i ≤ min(n,j+kd).
ldab INTEGER. The leading dimension of the array ab.
ldab ≥ kd+1.
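The band storage rule for ab can be sketched in plain Fortran. This is an illustrative sketch only, assuming the index ranges max(1,j-kd) ≤ i ≤ j for uplo = 'U' and j ≤ i ≤ min(n,j+kd) for uplo = 'L'; the program name, the test matrix, and the arrays abu and abl are hypothetical and no MKL routine is called.

```fortran
program band_storage_sketch
  implicit none
  integer, parameter :: n = 5, kd = 2, ldab = kd + 1
  real :: a(n,n), abu(ldab,n), abl(ldab,n)
  integer :: i, j

  ! Hypothetical symmetric band matrix: nonzero only for |i-j| <= kd.
  a = 0.0
  do j = 1, n
     do i = max(1, j-kd), min(n, j+kd)
        a(i,j) = real(10*min(i,j) + max(i,j))
     end do
  end do

  abu = 0.0
  abl = 0.0
  do j = 1, n
     ! uplo = 'U': ab(kd+1+i-j, j) = A(i,j) for max(1,j-kd) <= i <= j
     do i = max(1, j-kd), j
        abu(kd+1+i-j, j) = a(i,j)
     end do
     ! uplo = 'L': ab(1+i-j, j) = A(i,j) for j <= i <= min(n,j+kd)
     do i = j, min(n, j+kd)
        abl(1+i-j, j) = a(i,j)
     end do
  end do

  ! By symmetry, entry (i,j) of the upper layout must equal
  ! entry (j,i) of the lower layout.
  do j = 1, n
     do i = max(1, j-kd), j
        if (abu(kd+1+i-j, j) /= abl(1+j-i, i)) error stop 'layouts disagree'
     end do
  end do

  print '(a)', 'uplo = U (diagonal in row kd+1):'
  print '(5f6.1)', (abu(i,:), i = 1, ldab)
  print '(a)', 'uplo = L (diagonal in row 1):'
  print '(5f6.1)', (abl(i,:), i = 1, ldab)
end program band_storage_sketch
```

Positions of abu and abl that fall outside the band are left at zero here; the routine itself is not expected to reference them.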
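s REAL for slaqsb/claqsb

DOUBLE PRECISION for dlaqsb/zlaqsb
Array, DIMENSION (n). The scale factors for A.
scond REAL for slaqsb/claqsb

DOUBLE PRECISION for dlaqsb/zlaqsb
Ratio of the smallest s(i) to the largest s(i).
amax REAL for slaqsb/claqsb

DOUBLE PRECISION for dlaqsb/zlaqsb
Absolute value of the largest matrix entry.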
Output Parameters
ab On exit, if equed = 'Y', the equilibrated matrix diag(s)*A*diag(s), in the same
storage format as A.
equed CHARACTER*1.
Specifies whether or not equilibration was done.
If equed = 'N': No equilibration.
If equed = 'Y': Equilibration was done, i.e., A has been replaced by
diag(s)*A*diag(s).
Application Notes
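The routine uses internal parameters thresh, large, and small. thresh is a threshold value used to decide if
scaling should be done based on the ratio of the scaling factors. If scond < thresh, scaling is done.
large and small are threshold values used to decide if scaling should be done based
on the absolute size of the largest matrix element. If amax > large or amax < small, scaling is done.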
?laqsp
Scales a symmetric/Hermitian matrix in packed
storage, using scaling factors computed by ?ppequ.
Syntax
call slaqsp( uplo, n, ap, s, scond, amax, equed )
call dlaqsp( uplo, n, ap, s, scond, amax, equed )
call claqsp( uplo, n, ap, s, scond, amax, equed )
call zlaqsp( uplo, n, ap, s, scond, amax, equed )
Include Files
mkl.fi
Description
The routine ?laqsp equilibrates a symmetric matrix A using the scaling factors in the vector s.
Input Parameters
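uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is stored.
If uplo = 'U': upper triangular.
If uplo = 'L': lower triangular.
n INTEGER. The order of the matrix A. n ≥ 0.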
ap REAL for slaqsp

DOUBLE PRECISION for dlaqsp
COMPLEX for claqsp
DOUBLE COMPLEX for zlaqsp
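Array, DIMENSION (n*(n+1)/2).
On entry, the upper or lower triangle of the symmetric matrix A, packed
columnwise in a linear array. The j-th column of A is stored in the array ap
as follows:
if uplo = 'U', ap(i+(j-1)*j/2) = A(i,j) for 1 ≤ i ≤ j;
if uplo = 'L', ap(i+(j-1)*(2n-j)/2) = A(i,j) for j ≤ i ≤ n.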
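s REAL for slaqsp/claqsp

DOUBLE PRECISION for dlaqsp/zlaqsp
Array, DIMENSION (n). The scale factors for A.
scond REAL for slaqsp/claqsp

DOUBLE PRECISION for dlaqsp/zlaqsp
Ratio of the smallest s(i) to the largest s(i).
amax REAL for slaqsp/claqsp

DOUBLE PRECISION for dlaqsp/zlaqsp
Absolute value of the largest matrix entry.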
Output Parameters
ap On exit, the equilibrated matrix: diag(s)*A*diag(s), in the same storage

format as A.
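equed CHARACTER*1.
Specifies whether or not equilibration was done.
If equed = 'N': No equilibration.
If equed = 'Y': Equilibration was done, i.e., A has been replaced by
diag(s)*A*diag(s).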
Application Notes
The routine uses internal parameters thresh, large, and small. thresh is a threshold value used to decide if
scaling should be done based on the ratio of the scaling factors. If scond < thresh, scaling is done.
large and small are threshold values used to decide if scaling should be done based
on the absolute size of the largest matrix element. If amax > large or amax < small, scaling is done.
?laqsy
Scales a symmetric/Hermitian matrix, using scaling
factors computed by ?poequ.
Syntax
call slaqsy( uplo, n, a, lda, s, scond, amax, equed )
call dlaqsy( uplo, n, a, lda, s, scond, amax, equed )
call claqsy( uplo, n, a, lda, s, scond, amax, equed )
call zlaqsy( uplo, n, a, lda, s, scond, amax, equed )
Include Files
mkl.fi
Description
The routine equilibrates a symmetric matrix A using the scaling factors in the vector s.
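Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is stored.
If uplo = 'U': upper triangular.
If uplo = 'L': lower triangular.
n INTEGER. The order of the matrix A.
n ≥ 0.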
a REAL for slaqsy

DOUBLE PRECISION for dlaqsy
COMPLEX for claqsy
DOUBLE COMPLEX for zlaqsy
Array, DIMENSION (lda,n). On entry, the symmetric matrix A.
lda INTEGER. The leading dimension of the array a.
lda ≥ max(n,1).
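s REAL for slaqsy/claqsy

DOUBLE PRECISION for dlaqsy/zlaqsy
Array, DIMENSION (n). The scale factors for A.
scond REAL for slaqsy/claqsy

DOUBLE PRECISION for dlaqsy/zlaqsy
Ratio of the smallest s(i) to the largest s(i).
amax REAL for slaqsy/claqsy

DOUBLE PRECISION for dlaqsy/zlaqsy
Absolute value of the largest matrix entry.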
Output Parameters
a On exit, if equed = 'Y', the equilibrated matrix: diag(s)*A*diag(s).
equed CHARACTER*1.
Specifies whether or not equilibration was done.
If equed = 'N': No equilibration.
If equed = 'Y': Equilibration was done, i.e., A has been replaced by
diag(s)*A*diag(s).
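A possible calling sequence is sketched below: ?poequ computes the scaling factors and ?laqsy applies them. This is a minimal sketch, assuming the program is linked against Intel MKL or another LAPACK library; the 2-by-2 test matrix and the program name are hypothetical, and only the upper triangle is expected to be scaled because uplo = 'U'.

```fortran
program laqsy_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 2
  real(dp) :: a(n,n), a0(n,n), s(n), scond, amax, expected
  character(len=1) :: equed
  integer :: info, i, j
  external :: dpoequ, dlaqsy

  ! Hypothetical badly scaled symmetric positive definite matrix.
  a  = reshape([1.0e10_dp, 1.0e4_dp, 1.0e4_dp, 1.0_dp], [n, n])
  a0 = a

  ! Scaling factors s(i) = 1/sqrt(A(i,i)), plus scond and amax.
  call dpoequ(n, a, n, s, scond, amax, info)
  if (info /= 0) error stop 'dpoequ reported an error'

  ! Equilibrate the upper triangle in place if the routine decides to.
  call dlaqsy('U', n, a, n, s, scond, amax, equed)
  print *, 'equed = ', equed, '  scond = ', scond, '  amax = ', amax

  if (equed == 'Y') then
     ! Each stored entry should now equal s(i)*A(i,j)*s(j).
     do j = 1, n
        do i = 1, j
           expected = s(i)*a0(i,j)*s(j)
           if (abs(a(i,j) - expected) > 1.0e-12_dp*abs(expected)) &
              error stop 'unexpected scaling result'
        end do
     end do
  end if
end program laqsy_sketch
```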
Application Notes
The routine uses internal parameters thresh, large, and small. thresh is a threshold value used to decide if
scaling should be done based on the ratio of the scaling factors. If scond < thresh, scaling is done.
large and small are threshold values used to decide if scaling should be done based
on the absolute size of the largest matrix element. If amax > large or amax < small, scaling is done.
?laqtr
Solves a real quasi-triangular system of equations, or
a complex quasi-triangular system of special form, in
real arithmetic.
Syntax
call slaqtr( ltran, lreal, n, t, ldt, b, w, scale, x, work, info )
call dlaqtr( ltran, lreal, n, t, ldt, b, w, scale, x, work, info )
Include Files
mkl.fi
Description
The routine ?laqtr solves the real quasi-triangular system
op(T) * p = scale*c, if lreal = .TRUE.
or the complex quasi-triangular systems
op(T + iB)*(p+iq) = scale*(c+id), if lreal = .FALSE.
in real arithmetic, where T is upper quasi-triangular.
If lreal = .FALSE., then the first diagonal block of T must be 1-by-1, B is the specially structured matrix
B = ( b(1) b(2) ... b(n) )
    (       w            )
    (          w         )
    (            .       )
    (               w    )
op(A) = A or A^T, where A^T denotes the transpose of matrix A.
On input, x = (c; d); on output, x = (p; q), with the n elements of c (or p) stored first and the n elements
of d (or q) stored after them.
This routine is designed for the condition number estimation in routine ?trsna.
Input Parameters
ltran LOGICAL.
On entry, ltran specifies the option of conjugate transpose:
= .FALSE., op(T + iB) = T + iB,
= .TRUE., op(T + iB) = (T + iB)^T.
lreal LOGICAL.
On entry, lreal specifies the input matrix structure:
= .FALSE., the input is complex
= .TRUE., the input is real.
n INTEGER.
On entry, n specifies the order of T + iB. n ≥ 0.
t REAL for slaqtr

DOUBLE PRECISION for dlaqtr
Array, dimension (ldt,n). On entry, t contains a matrix in Schur canonical
form. If lreal = .FALSE., then the first diagonal block of t must be
1-by-1.
ldt INTEGER. The leading dimension of the matrix T.
ldt ≥ max(1,n).
b REAL for slaqtr

DOUBLE PRECISION for dlaqtr
Array, dimension (n). On entry, b contains the elements to form the matrix
B as described above. If lreal = .TRUE., b is not referenced.
w REAL for slaqtr

DOUBLE PRECISION for dlaqtr
On entry, w is the diagonal element of the matrix B.
If lreal = .TRUE., w is not referenced.
x REAL for slaqtr

DOUBLE PRECISION for dlaqtr
Array, dimension (2*n). On entry, x contains the right hand side of the
system.
work REAL for slaqtr

DOUBLE PRECISION for dlaqtr
Workspace array, dimension (n).
Output Parameters
scale REAL for slaqtr

DOUBLE PRECISION for dlaqtr
On exit, scale is the scale factor.
x On exit, x is overwritten by the solution.
info INTEGER.
If info = 0: successful exit.
If info = 1: some diagonal 1-by-1 block has been perturbed by a
small number smin to keep nonsingularity.
If info = 2: some diagonal 2-by-2 block has been perturbed by a
small number in ?laln2 to keep nonsingularity.
NOTE
For speed, this routine does not check the inputs for errors.
?lar1v
Computes the (scaled) r-th column of the inverse of
the submatrix in rows b1 through bn of tridiagonal
matrix.
Syntax
call slar1v( n, b1, bn, lambda, d, l, ld, lld, pivmin, gaptol, z, wantnc, negcnt, ztz,
mingma, r, isuppz, nrminv, resid, rqcorr, work )
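call dlar1v( n, b1, bn, lambda, d, l, ld, lld, pivmin, gaptol, z, wantnc, negcnt, ztz,
mingma, r, isuppz, nrminv, resid, rqcorr, work )
call clar1v( n, b1, bn, lambda, d, l, ld, lld, pivmin, gaptol, z, wantnc, negcnt, ztz,
mingma, r, isuppz, nrminv, resid, rqcorr, work )
call zlar1v( n, b1, bn, lambda, d, l, ld, lld, pivmin, gaptol, z, wantnc, negcnt, ztz,
mingma, r, isuppz, nrminv, resid, rqcorr, work )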
Include Files
mkl.fi
Description
The routine ?lar1v computes the (scaled) r-th column of the inverse of the submatrix in rows b1 through bn
of the tridiagonal matrix L*D*L^T - lambda*I. When lambda is close to an eigenvalue, the computed vector is an
accurate eigenvector. Usually, r corresponds to the index where the eigenvector is largest in magnitude.
The following steps accomplish this computation:
1. Stationary qd transform, L*D*L^T - lambda*I = L(+)*D(+)*L(+)^T.
2. Progressive qd transform, L*D*L^T - lambda*I = U(-)*D(-)*U(-)^T.
3. Computation of the diagonal elements of the inverse of L*D*L^T - lambda*I by combining the above
transforms, and choosing r as the index where the diagonal of the inverse is (one of the) largest in
magnitude.
4. Computation of the (scaled) r-th column of the inverse using the twisted factorization obtained by
combining the top part of the stationary and the bottom part of the progressive transform.
Input Parameters
n INTEGER. The order of the matrix L*D*L^T.
b1 INTEGER. First index of the submatrix of L*D*L^T.
bn INTEGER. Last index of the submatrix of L*D*L^T.
lambda REAL for slar1v/clar1v

DOUBLE PRECISION for dlar1v/zlar1v
The shift. To compute an accurate eigenvector, lambda should be a good
approximation to an eigenvalue of L*D*L^T.
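l REAL for slar1v/clar1v

DOUBLE PRECISION for dlar1v/zlar1v
Array, DIMENSION (n-1). The (n-1) subdiagonal elements of the unit bidiagonal matrix L, in elements
1 to n-1.
d REAL for slar1v/clar1v

DOUBLE PRECISION for dlar1v/zlar1v
Array, DIMENSION (n). The n diagonal elements of the diagonal matrix D.
ld REAL for slar1v/clar1v

DOUBLE PRECISION for dlar1v/zlar1v
Array, DIMENSION (n-1). The n-1 elements L(i)*D(i).
lld REAL for slar1v/clar1v

DOUBLE PRECISION for dlar1v/zlar1v
Array, DIMENSION (n-1). The n-1 elements L(i)*L(i)*D(i).
pivmin REAL for slar1v/clar1v

DOUBLE PRECISION for dlar1v/zlar1v
The minimum pivot in the Sturm sequence.
gaptol REAL for slar1v/clar1v

DOUBLE PRECISION for dlar1v/zlar1v
Tolerance that indicates when eigenvector entries are negligible with respect
to their contribution to the residual.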
z REAL for slar1v

DOUBLE PRECISION for dlar1v
COMPLEX for clar1v
DOUBLE COMPLEX for zlar1v
Array, DIMENSION (n). All entries of z must be set to 0.
wantnc LOGICAL.
Specifies whether negcnt has to be computed.
r INTEGER.
The twist index for the twisted factorization used to compute z. On input,
0 ≤ r ≤ n. If r is input as 0, r is set to the index where
(L*D*L^T - lambda*I)^(-1) is largest in magnitude. If 1 ≤ r ≤ n, r is unchanged.
work REAL for slar1v/clar1v

DOUBLE PRECISION for dlar1v/zlar1v
Workspace array, DIMENSION (4*n).
Output Parameters
z REAL for slar1v

DOUBLE PRECISION for dlar1v
COMPLEX for clar1v
DOUBLE COMPLEX for zlar1v
Array, DIMENSION (n). The (scaled) r-th column of the inverse. z(r) is
returned to be 1.
negcnt INTEGER. If wantnc is .TRUE. then negcnt = the number of pivots <
pivmin in the matrix factorization L*D*L^T, and negcnt = -1 otherwise.
ztz REAL for slar1v/clar1v

DOUBLE PRECISION for dlar1v/zlar1v
The square of the 2-norm of z.
mingma REAL for slar1v/clar1v

DOUBLE PRECISION for dlar1v/zlar1v
The reciprocal of the largest (in magnitude) diagonal element of the inverse
of L*D*L^T - lambda*I.
r On output, r is the twist index used to compute z. Ideally, r designates the

position of the maximum entry in the eigenvector.
isuppz INTEGER. Array, DIMENSION (2). The support of the vector in z, that is, the
vector z is nonzero only in elements isuppz(1) through isuppz(2).
nrminv REAL for slar1v/clar1v

DOUBLE PRECISION for dlar1v/zlar1v
Equals 1/sqrt( ztz ).
resid REAL for slar1v/clar1v

DOUBLE PRECISION for dlar1v/zlar1v
The residual of the FP vector.
resid = ABS( mingma )/sqrt( ztz ).
rqcorr REAL for slar1v/clar1v

DOUBLE PRECISION for dlar1v/zlar1v
The Rayleigh Quotient correction to lambda.
rqcorr = mingma/ztz.
?lar2v
Applies a vector of plane rotations with real cosines
and real/complex sines from both sides to a sequence
of 2-by-2 symmetric/Hermitian matrices.
Syntax
call slar2v( n, x, y, z, incx, c, s, incc )
call dlar2v( n, x, y, z, incx, c, s, incc )
call clar2v( n, x, y, z, incx, c, s, incc )
call zlar2v( n, x, y, z, incx, c, s, incc )
Include Files
mkl.fi
Description
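The routine ?lar2v applies a vector of real/complex plane rotations with real cosines from both sides to a
sequence of 2-by-2 real symmetric or complex Hermitian matrices, defined by the elements of the vectors x,
y and z. For i = 1,2,...,n, the real flavors compute
( x(i)  z(i) ) := (  c(i)  s(i) ) ( x(i)  z(i) ) ( c(i) -s(i) )
( z(i)  y(i) )    ( -s(i)  c(i) ) ( z(i)  y(i) ) ( s(i)  c(i) )
and the complex flavors compute
(       x(i)  z(i) ) := (  c(i) conjg(s(i)) ) (       x(i)  z(i) ) ( c(i) -conjg(s(i)) )
( conjg(z(i)) y(i) )    ( -s(i)       c(i)  ) ( conjg(z(i)) y(i) ) ( s(i)        c(i)  )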
Input Parameters
n INTEGER. The number of plane rotations to be applied.
x, y, z REAL for slar2v

DOUBLE PRECISION for dlar2v
COMPLEX for clar2v
DOUBLE COMPLEX for zlar2v
Arrays, DIMENSION (1+(n-1)*incx) each. Contain the vectors x, y and z,
respectively. For all flavors of ?lar2v, elements of x and y are assumed to
be real.
incx INTEGER. The increment between elements of x, y, and z. incx > 0.
c REAL for slar2v/clar2v

DOUBLE PRECISION for dlar2v/zlar2v
Array, DIMENSION (1+(n-1)*incc). The cosines of the plane rotations.
s REAL for slar2v

DOUBLE PRECISION for dlar2v
COMPLEX for clar2v
DOUBLE COMPLEX for zlar2v
Array, DIMENSION (1+(n-1)*incc). The sines of the plane rotations.
incc INTEGER. The increment between elements of c and s. incc > 0.
Output Parameters
x, y, z Vectors x, y and z, containing the results of transform.
?laran
Returns a random real number from a uniform
distribution.
Syntax
res = slaran (iseed)
res = dlaran (iseed)
Description
The ?laran routine returns a random real number from a uniform (0,1) distribution. This routine uses a
multiplicative congruential method with modulus 2^48 and multiplier 33952834046453. 48-bit integers are
stored in four integer array elements with 12 bits per element. Hence the routine is portable across machines
with integers of 32 bits or more.
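One step of such a generator can be sketched with four 12-bit limbs and default integers. This is an illustration of the technique described above, not the ?laran source: the program name, the seed values, and the way the 48-bit state is mapped to a number in (0,1) are assumptions made for the example.

```fortran
program lcg48_sketch
  use iso_fortran_env, only: int64, real64
  implicit none
  integer(int64), parameter :: mult = 33952834046453_int64  ! multiplier quoted in the text
  integer, parameter :: base = 4096                          ! 2**12, one limb
  integer :: m(4), iseed(4), t(4), i
  integer(int64) :: rest
  real(real64) :: u

  ! Split the multiplier into four 12-bit limbs, most significant limb first.
  rest = mult
  do i = 4, 1, -1
     m(i) = int(mod(rest, int(base, int64)))
     rest = rest / base
  end do
  if (rest /= 0) error stop 'multiplier does not fit in 48 bits'

  iseed = [1, 2, 3, 5]   ! hypothetical seed: limbs in 0..4095, iseed(4) odd

  ! Schoolbook product seed*mult, keeping only the low four limbs (mod 2**48).
  ! Every partial sum stays below 2**31, so default integers suffice.
  t(4) = iseed(4)*m(4)
  t(3) = t(4)/base + iseed(3)*m(4) + iseed(4)*m(3)
  t(2) = t(3)/base + iseed(2)*m(4) + iseed(3)*m(3) + iseed(4)*m(2)
  t(1) = t(2)/base + iseed(1)*m(4) + iseed(2)*m(3) + iseed(3)*m(2) + iseed(4)*m(1)
  iseed = mod(t, base)

  ! Assumed mapping of the 48-bit state to a fraction: state / 2**48.
  u = real(iseed(4), real64)
  do i = 3, 1, -1
     u = u/base + iseed(i)
  end do
  u = u/base

  if (mod(iseed(4), 2) /= 1) error stop 'last limb should stay odd'
  if (u <= 0.0_real64 .or. u >= 1.0_real64) error stop 'value outside (0,1)'
  print *, 'updated seed:', iseed
  print *, 'uniform value:', u
end program lcg48_sketch
```

Because the multiplier is odd, an odd iseed(4) stays odd after every step, which is consistent with the requirement that iseed(4) be odd on entry.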
Input Parameters
iseed INTEGER. Array, size 4. On entry, the seed of the random number
generator. The array elements must be between 0 and 4095, and iseed(4)
must be odd.
Output Parameters
iseed INTEGER.
On exit, the seed is updated.
res REAL for slaran,
DOUBLE PRECISION for dlaran,
Random number.
?larf
Applies an elementary reflector to a general
rectangular matrix.
Syntax
call slarf( side, m, n, v, incv, tau, c, ldc, work )
call dlarf( side, m, n, v, incv, tau, c, ldc, work )
call clarf( side, m, n, v, incv, tau, c, ldc, work )
call zlarf( side, m, n, v, incv, tau, c, ldc, work )
Include Files
mkl.fi
Description
The routine applies a real/complex elementary reflector H to a real/complex m-by-n matrix C, from either the
left or the right. H is represented in one of the following forms:
H = I - tau*v*v^T
where tau is a real scalar and v is a real vector.
If tau = 0, then H is taken to be the unit matrix.
H = I - tau*v*v^H
where tau is a complex scalar and v is a complex vector.
If tau = 0, then H is taken to be the unit matrix. For clarf/zlarf, to apply H^H (the conjugate transpose
of H), supply conjg(tau) instead of tau.
Input Parameters
side CHARACTER*1.
If side = 'L': form H*C
If side = 'R': form C*H.
m INTEGER. The number of rows of the matrix C.
n INTEGER. The number of columns of the matrix C.
v REAL for slarf

DOUBLE PRECISION for dlarf
COMPLEX for clarf
DOUBLE COMPLEX for zlarf
Array, DIMENSION
(1 + (m-1)*abs(incv)) if side = 'L' or
(1 + (n-1)*abs(incv)) if side = 'R'. The vector v in the
representation of H. v is not used if tau = 0.
incv INTEGER. The increment between elements of v.
incv ≠ 0.
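tau REAL for slarf

DOUBLE PRECISION for dlarf
COMPLEX for clarf
DOUBLE COMPLEX for zlarf
The value tau in the representation of H.
c REAL for slarf

DOUBLE PRECISION for dlarf
COMPLEX for clarf
DOUBLE COMPLEX for zlarf
Array, DIMENSION (ldc,n).
On entry, the m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c.
ldc ≥ max(1,m).
work REAL for slarf

DOUBLE PRECISION for dlarf
COMPLEX for clarf
DOUBLE COMPLEX for zlarf
Workspace array, DIMENSION
(n) if side = 'L' or
(m) if side = 'R'.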
Output Parameters
c On exit, C is overwritten by the matrix H*C if side = 'L', or C*H if side = 'R'.
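The effect of a left application can be checked against the explicit formula H*C = C - tau*v*(v^T*C). The following is a minimal sketch, assuming the program is linked against Intel MKL or another LAPACK library; the vector, the matrix, and the program name are hypothetical.

```fortran
program larf_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 3, n = 2
  real(dp) :: v(m), c(m,n), cref(m,n), work(n), tau
  external :: dlarf

  v   = [1.0_dp, 2.0_dp, 2.0_dp]
  tau = 2.0_dp / dot_product(v, v)   ! this choice makes H orthogonal
  c   = reshape([1.0_dp, 0.0_dp, 0.0_dp, 0.0_dp, 1.0_dp, 0.0_dp], [m, n])

  ! Reference result: element (i,j) of v*(v^T*C) is v(i)*w(j), w = v^T*C.
  cref = c - tau * spread(v, 2, n) * spread(matmul(v, c), 1, m)

  ! side = 'L' forms H*C in place; work needs n elements in this case.
  call dlarf('L', m, n, v, 1, tau, c, m, work)

  print *, 'max deviation from explicit formula:', maxval(abs(c - cref))
  if (maxval(abs(c - cref)) > 1.0e-12_dp) error stop 'dlarf result differs'
end program larf_sketch
```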
?larfb
Applies a block reflector or its transpose/conjugate-
transpose to a general rectangular matrix.
Syntax
call slarfb( side, trans, direct, storev, m, n, k, v, ldv, t, ldt, c, ldc, work,
ldwork )
call dlarfb( side, trans, direct, storev, m, n, k, v, ldv, t, ldt, c, ldc, work,
ldwork )
call clarfb( side, trans, direct, storev, m, n, k, v, ldv, t, ldt, c, ldc, work,
ldwork )
call zlarfb( side, trans, direct, storev, m, n, k, v, ldv, t, ldt, c, ldc, work,
ldwork )
Include Files
mkl.fi
Description
The real flavors of the routine ?larfb apply a real block reflector H or its transpose H^T to a real m-by-n
matrix C from either left or right.
The complex flavors of the routine ?larfb apply a complex block reflector H or its conjugate transpose H^H to
a complex m-by-n matrix C from either left or right.
Input Parameters
side CHARACTER*1.
If side = 'L': apply H or H^T for real flavors and H or H^H for complex
flavors from the left.
If side = 'R': apply H or H^T for real flavors and H or H^H for complex
flavors from the right.
trans CHARACTER*1.
If trans = 'N': apply H (No transpose).
If trans = 'C': apply H^H (Conjugate transpose).
If trans = 'T': apply H^T (Transpose).
direct CHARACTER*1.
Indicates how H is formed from a product of elementary reflectors:
If direct = 'F': H = H(1)*H(2)*...*H(k) (forward)
If direct = 'B': H = H(k)*...*H(2)*H(1) (backward)
storev CHARACTER*1.
Indicates how the vectors which define the elementary reflectors are
stored:
If storev = 'C': Column-wise
If storev = 'R': Row-wise
m INTEGER. The number of rows of the matrix C.
n INTEGER. The number of columns of the matrix C.
k INTEGER. The order of the matrix T (equal to the number of elementary
reflectors whose product defines the block reflector).
v REAL for slarfb

DOUBLE PRECISION for dlarfb
COMPLEX for clarfb
DOUBLE COMPLEX for zlarfb
Array, DIMENSION
(ldv, k) if storev = 'C'
(ldv, m) if storev = 'R' and side = 'L'
(ldv, n) if storev = 'R' and side = 'R'
The matrix V. See Application Notes below.
ldv INTEGER. The leading dimension of the array v.
If storev = 'C' and side = 'L', ldv ≥ max(1,m);
if storev = 'C' and side = 'R', ldv ≥ max(1,n);
if storev = 'R', ldv ≥ k.
t REAL for slarfb

DOUBLE PRECISION for dlarfb
COMPLEX for clarfb
DOUBLE COMPLEX for zlarfb
Array, size (ldt,k).
Contains the triangular k-by-k matrix T in the representation of the block
reflector.
ldt INTEGER. The leading dimension of the array t.
ldt ≥ k.
c REAL for slarfb

DOUBLE PRECISION for dlarfb
COMPLEX for clarfb
DOUBLE COMPLEX for zlarfb
Array, size (ldc,n).
On entry, the m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c.
ldc ≥ max(1,m).
work REAL for slarfb

DOUBLE PRECISION for dlarfb
COMPLEX for clarfb
DOUBLE COMPLEX for zlarfb
Workspace array, DIMENSION (ldwork, k).
ldwork INTEGER. The leading dimension of the array work.
If side = 'L', ldwork ≥ max(1, n);
if side = 'R', ldwork ≥ max(1, m).
Output Parameters
c On exit, c is overwritten by the product of the following:
H*C, or H^T*C, or C*H, or C*H^T for real flavors
H*C, or H^H*C, or C*H, or C*H^H for complex flavors
Application Notes
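The shape of the matrix V and the storage of the vectors which define the H(i) is best illustrated by the
following example with n = 5 and k = 3. The elements equal to 1 are not stored; the corresponding array
elements are modified but restored on exit. The rest of the array is not used.
direct = 'F' and storev = 'C':        direct = 'F' and storev = 'R':
V = (  1       )                      V = (  1 v1 v1 v1 v1 )
    ( v1  1    )                          (     1 v2 v2 v2 )
    ( v1 v2  1 )                          (        1 v3 v3 )
    ( v1 v2 v3 )
    ( v1 v2 v3 )
direct = 'B' and storev = 'C':        direct = 'B' and storev = 'R':
V = ( v1 v2 v3 )                      V = ( v1 v1  1       )
    ( v1 v2 v3 )                          ( v2 v2 v2  1    )
    (  1 v2 v3 )                          ( v3 v3 v3 v3  1 )
    (     1 v3 )
    (        1 )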
?larfg
Generates an elementary reflector (Householder
matrix).
Syntax
call slarfg( n, alpha, x, incx, tau )
call dlarfg( n, alpha, x, incx, tau )
call clarfg( n, alpha, x, incx, tau )
call zlarfg( n, alpha, x, incx, tau )
Include Files
mkl.fi
Description
The routine ?larfg generates a real/complex elementary reflector H of order n, such that
H * ( alpha ) = ( beta ),   H^T * H = I   (for real flavors),
    (   x   )   (   0  )
H^H * ( alpha ) = ( beta ),   H^H * H = I   (for complex flavors),
      (   x   )   (   0  )
where alpha and beta are scalars (with beta real for all flavors), and x is an (n-1)-element real/complex
vector. H is represented in the form
H = I - tau * ( 1 ) * ( 1 v^T )   (for real flavors),
              ( v )
H = I - tau * ( 1 ) * ( 1 v^H )   (for complex flavors),
              ( v )
where tau is a real/complex scalar and v is a real/complex (n-1)-element vector, respectively. Note that for
clarfg/zlarfg, H is not Hermitian.
If the elements of x are all zero (and, for complex flavors, alpha is real), then tau = 0 and H is taken to be
the unit matrix.
Otherwise, 1 ≤ tau ≤ 2 (for real flavors), or
1 ≤ Re(tau) ≤ 2 and abs(tau-1) ≤ 1 (for complex flavors).
Input Parameters
n INTEGER. The order of the elementary reflector.
alpha REAL for slarfg

DOUBLE PRECISION for dlarfg
COMPLEX for clarfg
DOUBLE COMPLEX for zlarfg
On entry, the value alpha.
x REAL for slarfg

DOUBLE PRECISION for dlarfg
COMPLEX for clarfg
DOUBLE COMPLEX for zlarfg
Array, size (1+(n-2)*abs(incx)).
On entry, the vector x.
incx INTEGER.
The increment between elements of x. incx > 0.
Output Parameters
alpha On exit, it is overwritten with the value beta.
x On exit, it is overwritten with the vector v.
tau REAL for slarfg

DOUBLE PRECISION for dlarfg
COMPLEX for clarfg
DOUBLE COMPLEX for zlarfg
The value tau.
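The defining property of the generated reflector can be verified numerically for a real flavor. The sketch below assumes linking against Intel MKL or another LAPACK library and assumes the representation H = I - tau*u*u^T with u = (1; v); the input vector and the program name are hypothetical.

```fortran
program larfg_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 3
  real(dp), parameter :: tol = 1.0e-12_dp
  real(dp) :: alpha, x(n-1), tau, w(n), u(n), hw(n)
  external :: dlarfg

  w     = [3.0_dp, 4.0_dp, 12.0_dp]   ! hypothetical vector (alpha; x), 2-norm 13
  alpha = w(1)
  x     = w(2:n)

  ! On exit: alpha holds beta, x holds v.
  call dlarfg(n, alpha, x, 1, tau)

  ! Rebuild u = (1; v) and apply H = I - tau*u*u^T to the original vector.
  u  = [1.0_dp, x]
  hw = w - tau * u * dot_product(u, w)

  print *, 'beta =', alpha, '  tau =', tau
  print *, 'H*(alpha; x) =', hw

  if (abs(hw(1) - alpha) > tol)           error stop 'first entry is not beta'
  if (maxval(abs(hw(2:n))) > tol)         error stop 'trailing entries not zero'
  if (abs(abs(alpha) - 13.0_dp) > tol)    error stop 'abs(beta) is not the 2-norm'
  if (tau < 1.0_dp .or. tau > 2.0_dp)     error stop 'tau outside [1,2]'
end program larfg_sketch
```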
?larfgp
Generates an elementary reflector (Householder
matrix) with non-negative beta .
Syntax
call slarfgp( n, alpha, x, incx, tau )
call dlarfgp( n, alpha, x, incx, tau )
call clarfgp( n, alpha, x, incx, tau )
call zlarfgp( n, alpha, x, incx, tau )
Include Files
mkl.fi
Description
The routine ?larfgp generates a real/complex elementary reflector H of order n, such that
H * ( alpha ) = ( beta ),   H^T * H = I   (for real flavors),
    (   x   )   (   0  )
H^H * ( alpha ) = ( beta ),   H^H * H = I   (for complex flavors),
      (   x   )   (   0  )
where alpha and beta are scalars (with beta real and non-negative for all flavors), and x is an (n-1)-element
real/complex vector. H is represented in the form
H = I - tau * ( 1 ) * ( 1 v^T )   (for real flavors),
              ( v )
H = I - tau * ( 1 ) * ( 1 v^H )   (for complex flavors),
              ( v )
where tau is a real/complex scalar and v is a real/complex (n-1)-element vector. Note that for c/zlarfgp, H
is not Hermitian.
If the elements of x are all zero (and, for complex flavors, alpha is real), then tau = 0 and H is taken to be
the unit matrix.
Otherwise, 1 ≤ tau ≤ 2 (for real flavors), or
1 ≤ Re(tau) ≤ 2 and abs(tau-1) ≤ 1 (for complex flavors).
Input Parameters
n INTEGER. The order of the elementary reflector.
alpha REAL for slarfgp

DOUBLE PRECISION for dlarfgp
COMPLEX for clarfgp
DOUBLE COMPLEX for zlarfgp
On entry, the value alpha.
x REAL for slarfgp

DOUBLE PRECISION for dlarfgp
COMPLEX for clarfgp
DOUBLE COMPLEX for zlarfgp
Array, DIMENSION (1+(n-2)*abs(incx)).
On entry, the vector x.
incx INTEGER.
The increment between elements of x. incx > 0.
Output Parameters
alpha On exit, it is overwritten with the value beta.
x On exit, it is overwritten with the vector v.
tau REAL for slarfgp

DOUBLE PRECISION for dlarfgp
COMPLEX for clarfgp
DOUBLE COMPLEX for zlarfgp
The value tau.
?larft
Forms the triangular factor T of a block reflector H = I
- V*T*V**H.
Syntax
call slarft( direct, storev, n, k, v, ldv, tau, t, ldt )
call dlarft( direct, storev, n, k, v, ldv, tau, t, ldt )
call clarft( direct, storev, n, k, v, ldv, tau, t, ldt )
call zlarft( direct, storev, n, k, v, ldv, tau, t, ldt )
Include Files
mkl.fi
Description
The routine ?larft forms the triangular factor T of a real/complex block reflector H of order n, which is
defined as a product of k elementary reflectors.
If direct = 'F', H = H(1)*H(2)*...*H(k) and T is upper triangular;
If direct = 'B', H = H(k)*...*H(2)*H(1) and T is lower triangular.
If storev = 'C', the vector which defines the elementary reflector H(i) is stored in the i-th column of the
array v, and H = I - V*T*V^T (for real flavors) or H = I - V*T*V^H (for complex flavors).
If storev = 'R', the vector which defines the elementary reflector H(i) is stored in the i-th row of the array
v, and H = I - V^T*T*V (for real flavors) or H = I - V^H*T*V (for complex flavors).
Input Parameters
direct CHARACTER*1.
Specifies the order in which the elementary reflectors are multiplied to form
the block reflector:
= 'F': H = H(1)*H(2)*. . . *H(k) (forward)
= 'B': H = H(k)*. . .*H(2)*H(1) (backward)
storev CHARACTER*1.
Specifies how the vectors which define the elementary reflectors are stored
(see also Application Notes below):
= 'C': column-wise
= 'R': row-wise.
n INTEGER. The order of the block reflector H. n ≥ 0.
k INTEGER. The order of the triangular factor T (equal to the number of
elementary reflectors). k ≥ 1.
v REAL for slarft

DOUBLE PRECISION for dlarft
COMPLEX for clarft
DOUBLE COMPLEX for zlarft
Array, DIMENSION
(ldv, k) if storev = 'C' or
(ldv, n) if storev = 'R'.
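The matrix V.
ldv INTEGER. The leading dimension of the array v.
If storev = 'C', ldv ≥ max(1,n);
if storev = 'R', ldv ≥ k.
tau REAL for slarft

DOUBLE PRECISION for dlarft
COMPLEX for clarft
DOUBLE COMPLEX for zlarft
Array, size (k). tau(i) must contain the scalar factor of the elementary
reflector H(i).
ldt INTEGER. The leading dimension of the output array t. ldt ≥ k.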
Output Parameters
t REAL for slarft

DOUBLE PRECISION for dlarft
COMPLEX for clarft
DOUBLE COMPLEX for zlarft
Array, size ldt by k. The k-by-k triangular factor T of the block reflector. If
direct = 'F', T is upper triangular; if direct = 'B', T is lower
triangular. The rest of the array is not used.
v The matrix V.
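Application Notes
The shape of the matrix V and the storage of the vectors which define the H(i) is best illustrated by the
following example with n = 5 and k = 3. The elements equal to 1 are not stored.
direct = 'F' and storev = 'C':        direct = 'F' and storev = 'R':
V = (  1       )                      V = (  1 v1 v1 v1 v1 )
    ( v1  1    )                          (     1 v2 v2 v2 )
    ( v1 v2  1 )                          (        1 v3 v3 )
    ( v1 v2 v3 )
    ( v1 v2 v3 )
direct = 'B' and storev = 'C':        direct = 'B' and storev = 'R':
V = ( v1 v2 v3 )                      V = ( v1 v1  1       )
    ( v1 v2 v3 )                          ( v2 v2 v2  1    )
    (  1 v2 v3 )                          ( v3 v3 v3 v3  1 )
    (     1 v3 )
    (        1 )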
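The relation between the triangular factor T and the individual reflectors can be checked by applying H = H(1)*H(2) in two ways: one reflector at a time with ?larf, and in blocked form with ?larft followed by ?larfb. This is a minimal sketch, assuming linking against Intel MKL or another LAPACK library; the reflector vectors, the matrix C, and the program name are hypothetical.

```fortran
program larft_larfb_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, k = 2, nc = 3
  real(dp) :: v(n,k), tau(k), t(k,k), c(n,nc), cref(n,nc)
  real(dp) :: work(nc,k), w1(nc)
  integer :: i, j
  external :: dlarf, dlarft, dlarfb

  ! Hypothetical reflector vectors, stored column-wise (storev = 'C').
  ! The unit diagonal and the zero above it are written out so that the
  ! same columns can also be passed to dlarf.
  v(:,1) = [1.0_dp, 0.5_dp, 0.5_dp,  0.5_dp]
  v(:,2) = [0.0_dp, 1.0_dp, 0.5_dp, -0.5_dp]
  do i = 1, k
     tau(i) = 2.0_dp / dot_product(v(:,i), v(:,i))   ! makes each H(i) orthogonal
  end do

  do j = 1, nc
     do i = 1, n
        c(i,j) = real(i + 10*j, dp)
     end do
  end do
  cref = c

  ! Reference: H*C = H(1)*(H(2)*C), one reflector at a time.
  call dlarf('L', n, nc, v(:,2), 1, tau(2), cref, n, w1)
  call dlarf('L', n, nc, v(:,1), 1, tau(1), cref, n, w1)

  ! Blocked form: build T (direct = 'F'), then apply H = I - V*T*V^T.
  call dlarft('F', 'C', n, k, v, n, tau, t, k)
  call dlarfb('L', 'N', 'F', 'C', n, nc, k, v, n, t, k, c, n, work, nc)

  print *, 'max deviation between the two results:', maxval(abs(c - cref))
  if (maxval(abs(c - cref)) > 1.0e-10_dp) error stop 'blocked result differs'
end program larft_larfb_sketch
```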
?larfx
Applies an elementary reflector to a general
rectangular matrix, with loop unrolling when the
reflector has order less than or equal to 10.
Syntax
call slarfx( side, m, n, v, tau, c, ldc, work )
call dlarfx( side, m, n, v, tau, c, ldc, work )
call clarfx( side, m, n, v, tau, c, ldc, work )
call zlarfx( side, m, n, v, tau, c, ldc, work )
Include Files
mkl.fi
Description
The routine ?larfx applies a real/complex elementary reflector H to a real/complex m-by-n matrix C, from
either the left or the right.
H is represented in the following forms:
H = I - tau*v*v^T, where tau is a real scalar and v is a real vector.
H = I - tau*v*v^H, where tau is a complex scalar and v is a complex vector.
Input Parameters
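side CHARACTER*1.
If side = 'L': form H*C
If side = 'R': form C*H.
m INTEGER. The number of rows of the matrix C.
n INTEGER. The number of columns of the matrix C.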
v REAL for slarfx

DOUBLE PRECISION for dlarfx
COMPLEX for clarfx
DOUBLE COMPLEX for zlarfx
Array, size
(m) if side = 'L' or
(n) if side = 'R'.
The vector v in the representation of H.
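tau REAL for slarfx

DOUBLE PRECISION for dlarfx
COMPLEX for clarfx
DOUBLE COMPLEX for zlarfx
The value tau in the representation of H.
c REAL for slarfx

DOUBLE PRECISION for dlarfx
COMPLEX for clarfx
DOUBLE COMPLEX for zlarfx
Array, size ldc by n. On entry, the m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c. ldc ≥ max(1,m).
work REAL for slarfx

DOUBLE PRECISION for dlarfx
COMPLEX for clarfx
DOUBLE COMPLEX for zlarfx
Workspace array, size
(n) if side = 'L' or
(m) if side = 'R'.
work is not referenced if H has order < 11.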
Output Parameters
c On exit, C is overwritten by the matrix H*C if side = 'L', or C*H if side = 'R'.
?large
Pre- and post-multiplies a real general matrix with a
random orthogonal matrix.
Syntax
call slarge( n, a, lda, iseed, work, info )
call dlarge( n, a, lda, iseed, work, info )
call clarge( n, a, lda, iseed, work, info )
call zlarge( n, a, lda, iseed, work, info )
Include Files
mkl.fi
Description
The routine ?large pre- and post-multiplies a general n-by-n matrix A with a random orthogonal or unitary
matrix U: A is overwritten by U*A*U', where U' denotes the transpose (conjugate transpose for unitary U) of U.
Input Parameters
n INTEGER. The order of the matrix A. n ≥ 0.
a REAL for slarge,

DOUBLE PRECISION for dlarge,
COMPLEX for clarge,
DOUBLE COMPLEX for zlarge,
Array, size lda by n.
On entry, the original n-by-n matrix A.
lda INTEGER. The leading dimension of the array a. lda ≥ n.
iseed INTEGER. Array, size 4.

On entry, the seed of the random number generator. The array elements
must be between 0 and 4095, and iseed(4) must be odd.
work REAL for slarge,

DOUBLE PRECISION for dlarge,
COMPLEX for clarge,
DOUBLE COMPLEX for zlarge,
Workspace array, size 2*n.
Output Parameters
a On exit, A is overwritten by U*A*U' for some random orthogonal matrix U.
iseed On exit, the seed is updated.
info INTEGER.
If info = 0, the execution is successful.
If info = -i, the i-th parameter had an illegal value.
?largv
Generates a vector of plane rotations with real cosines
and real/complex sines.
Syntax
call slargv( n, x, incx, y, incy, c, incc )
call dlargv( n, x, incx, y, incy, c, incc )
call clargv( n, x, incx, y, incy, c, incc )
call zlargv( n, x, incx, y, incy, c, incc )
Include Files
mkl.fi
Description
The routine generates a vector of real/complex plane rotations with real cosines, determined by elements of
the real/complex vectors x and y.
For slargv/dlargv:
(  c(i)  s(i) ) ( x(i) ) = ( a(i) )
( -s(i)  c(i) ) ( y(i) )   (   0  )
For clargv/zlargv:
(        c(i)   s(i) ) ( x(i) ) = ( r(i) )
( -conjg(s(i))  c(i) ) ( y(i) )   (   0  )
where c(i)^2 + abs(s(i))^2 = 1 and the following conventions are used (these are the same as in clartg/
zlartg but differ from the BLAS Level 1 routine crotg/zrotg):
If y(i) = 0, then c(i) = 1 and s(i) = 0;
If x(i) = 0, then c(i) = 0 and s(i) is chosen so that r(i) is real.
Input Parameters
n INTEGER. The number of plane rotations to be generated.
x, y REAL for slargv

DOUBLE PRECISION for dlargv
COMPLEX for clargv
DOUBLE COMPLEX for zlargv
Arrays, DIMENSION (1+(n-1)*incx) and (1+(n-1)*incy), respectively. On
entry, the vectors x and y.
incx INTEGER. The increment between elements of x.

incx > 0.
incy INTEGER. The increment between elements of y.

incy > 0.
incc INTEGER. The increment between elements of the output array c.
incc > 0.
Output Parameters
x On exit, x(i) is overwritten by a(i) (for real flavors), or by r(i) (for complex
flavors), for i = 1,...,n.
y On exit, the sines s(i) of the plane rotations.
c REAL for slargv/clargv

DOUBLE PRECISION for dlargv/zlargv
Array, DIMENSION (1+(n-1)*incc). The cosines of the plane rotations.
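The output conventions can be checked for a real flavor by testing that each generated rotation maps (x(i); y(i)) to (a(i); 0). This is a minimal sketch, assuming linking against Intel MKL or another LAPACK library and assuming the real-flavor relation c(i)*x(i) + s(i)*y(i) = a(i), -s(i)*x(i) + c(i)*y(i) = 0; the input data and the program name are hypothetical.

```fortran
program largv_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 3
  real(dp), parameter :: tol = 1.0e-12_dp
  real(dp) :: x(n), y(n), c(n), x0(n), y0(n)
  integer :: i
  external :: dlargv

  ! Hypothetical data, including the x(i) = 0 and y(i) = 0 special cases.
  x  = [3.0_dp, 0.0_dp, 1.0_dp]
  y  = [4.0_dp, 2.0_dp, 0.0_dp]
  x0 = x
  y0 = y

  ! On exit: x(i) holds a(i), y(i) holds s(i), c(i) holds the cosine.
  call dlargv(n, x, 1, y, 1, c, 1)

  do i = 1, n
     print '(a,i0,3(a,f8.4))', 'i = ', i, '  c = ', c(i), '  s = ', y(i), '  a = ', x(i)
     if (abs(c(i)**2 + y(i)**2 - 1.0_dp) > tol)        error stop 'c^2 + s^2 /= 1'
     if (abs( c(i)*x0(i) + y(i)*y0(i) - x(i)) > tol)   error stop 'first row mismatch'
     if (abs(-y(i)*x0(i) + c(i)*y0(i)) > tol)          error stop 'second row not zero'
  end do
end program largv_sketch
```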
?larnd
Returns a random real number from a uniform or
normal distribution.
Syntax
res = slarnd( idist, iseed )
res = dlarnd( idist, iseed )
res = clarnd( idist, iseed )
res = zlarnd( idist, iseed )
Include Files
mkl.fi
Description
The routine ?larnd returns a random number from a uniform or normal distribution.
Input Parameters
idist INTEGER. Specifies the distribution of the random numbers. For slarnd and
dlarnd:
= 1: uniform (0,1)
= 2: uniform (-1,1)
= 3: normal (0,1).
For clarnd and zlarnd:
= 1: real and imaginary parts each uniform (0,1)
= 2: real and imaginary parts each uniform (-1,1)
= 3: real and imaginary parts each normal (0,1)
= 4: uniformly distributed on the disc abs(z) ≤ 1
= 5: uniformly distributed on the circle abs(z) = 1
iseed INTEGER. Array, size 4.
On entry, the seed of the random number generator. The array elements
must be between 0 and 4095, and iseed(4) must be odd.
Output Parameters
iseed INTEGER.
On exit, the seed is updated.
res REAL for slarnd,

DOUBLE PRECISION for dlarnd,
COMPLEX for clarnd,
DOUBLE COMPLEX for zlarnd,
Random number.
?larnv
Returns a vector of random numbers from a uniform
or normal distribution.
Syntax
call slarnv( idist, iseed, n, x )
call dlarnv( idist, iseed, n, x )
call clarnv( idist, iseed, n, x )
call zlarnv( idist, iseed, n, x )
Include Files
mkl.fi
Description
The routine ?larnv returns a vector of n random real/complex numbers from a uniform or normal
distribution.
This routine calls the auxiliary routine ?laruv to generate random real numbers from a uniform (0,1)
distribution, in batches of up to 128 using vectorisable code. The Box-Muller method is used to transform
numbers from a uniform to a normal distribution.
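The Box-Muller transformation mentioned above can be sketched in plain Fortran. This is an illustration of the method only, not the ?larnv implementation: the intrinsic random_number stands in for ?laruv, and the program name, sample size, and acceptance thresholds are hypothetical.

```fortran
program box_muller_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0), n = 200000
  real(dp), parameter :: twopi = 8.0_dp*atan(1.0_dp)
  real(dp), allocatable :: u1(:), u2(:), z(:)
  real(dp) :: mean, var

  allocate(u1(n), u2(n), z(n))
  call random_number(u1)    ! stand-in for the uniform batches from ?laruv
  call random_number(u2)
  u1 = 1.0_dp - u1          ! move [0,1) to (0,1] so that log(u1) is finite

  ! Box-Muller: two independent uniforms give one normal (0,1) deviate.
  z = sqrt(-2.0_dp*log(u1)) * cos(twopi*u2)

  mean = sum(z)/n
  var  = sum((z - mean)**2)/(n - 1)
  print '(a,f8.4,a,f8.4)', 'sample mean = ', mean, '  sample variance = ', var

  ! Loose statistical sanity check, far wider than the sampling error.
  if (abs(mean) > 0.05_dp .or. abs(var - 1.0_dp) > 0.05_dp) &
     error stop 'sample does not look like normal (0,1)'
end program box_muller_sketch
```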
Input Parameters
idist INTEGER. Specifies the distribution of the random numbers: for slarnv and
dlarnv:
= 1: uniform (0,1)
= 2: uniform (-1,1)
= 3: normal (0,1).
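for clarnv and zlarnv:
= 1: real and imaginary parts each uniform (0,1)
= 2: real and imaginary parts each uniform (-1,1)
= 3: real and imaginary parts each normal (0,1)
= 4: uniformly distributed on the disc abs(z) < 1
= 5: uniformly distributed on the circle abs(z) = 1
iseed INTEGER. Array, size (4).
On entry, the seed of the random number generator; the array elements
must be between 0 and 4095, and iseed(4) must be odd.
n INTEGER. The number of random numbers to be generated.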
Output Parameters
x REAL for slarnv

DOUBLE PRECISION for dlarnv
COMPLEX for clarnv
DOUBLE COMPLEX for zlarnv
Array, size (n). The generated random numbers.
iseed On exit, the seed is updated.
?laror
Pre- or post-multiplies an m-by-n matrix by a random
orthogonal/unitary matrix.
Syntax
call slaror( side, init, m, n, a, lda, iseed, x, info )
call dlaror( side, init, m, n, a, lda, iseed, x, info )
call claror( side, init, m, n, a, lda, iseed, x, info )
call zlaror( side, init, m, n, a, lda, iseed, x, info )
Include Files
mkl.fi
Description
The routine ?laror pre- or post-multiplies an m-by-n matrix A by a random orthogonal or unitary matrix U,
overwriting A. A may optionally be initialized to the identity matrix before multiplying by U. U is generated
using the method of G.W. Stewart (SIAM J. Numer. Anal. 17, 1980, 403-409).
Input Parameters
side CHARACTER*1. Specifies whether A is multiplied by U on the left or right.

for slaror and dlaror:
If side = 'L', multiply A on the left (premultiply) by U.
If side = 'R', multiply A on the right (postmultiply) by U^T.
If side = 'C' or 'T', multiply A on the left by U and the right by U^T.
for claror and zlaror:
If side = 'L', multiply A on the left (premultiply) by U.
If side = 'R', multiply A on the right (postmultiply) by U^H.
If side = 'C', multiply A on the left by U and the right by U^H.
If side = 'T', multiply A on the left by U and the right by U^T.
init CHARACTER*1. Specifies whether or not a should be initialized to the

identity matrix.
If init = 'I', initialize a to (a section of) the identity matrix before
applying U.
If init = 'N', no initialization. Apply U to the input matrix A.
init = 'I' generates square or rectangular orthogonal matrices:

For m = n and side = 'L' or 'R', the rows and the columns are
orthogonal to each other.
For rectangular matrices where m < n:
If side = 'R', ?laror produces a dense matrix in which rows are

orthogonal and columns are not.
If side= 'L', ?laror produces a matrix in which rows are orthogonal,
first m columns are orthogonal, and remaining columns are zero.
For rectangular matrices where m > n:
If side = 'L', ?laror produces a dense matrix in which columns are

orthogonal and rows are not.
If side = 'R', ?laror produces a matrix in which columns are
orthogonal, first m rows are orthogonal, and remaining rows are zero.
m INTEGER. The number of rows of A.
n INTEGER. The number of columns of A.
a REAL for slaror,
DOUBLE PRECISION for dlaror,
COMPLEX for claror,
DOUBLE COMPLEX for zlaror,
Array, size lda by n.
lda INTEGER. The leading dimension of the array a.
lda ≥ max(1, m).
iseed INTEGER. Array, size (4).
On entry, specifies the seed of the random number generator. The array
elements must be between 0 and 4095; if not they are reduced mod 4096.
Also, iseed(4) must be odd.
x REAL for slaror,

DOUBLE PRECISION for dlaror,
COMPLEX for claror,
DOUBLE COMPLEX for zlaror,
Workspace array, size (3*max( m, n )).
Value of side    Length of workspace
'L'              2*m + n
'R'              2*n + m
'C' or 'T'       3*n
Output Parameters
a On exit, overwritten
by U*A (if side = 'L'),
by A*U (if side = 'R'),
by U*A*U^T (if side = 'C' or 'T').
iseed INTEGER. Array, size (4).
The values of iseed are changed on exit, and can be used in the next call
to continue the same random number sequence.
info INTEGER.
For slaror and dlaror:
If info = -i, the i-th parameter had an illegal value.
If info = 1, the random numbers generated by ?laror are bad.
For claror and zlaror:
If info = -1, side is not 'L', 'R', 'C', or 'T'.
If info = -3, m is negative.
If info = -4, n is negative, or side is 'C' or 'T' and n is not equal to
m.
If info = -6, lda is less than m.
?larot
Applies a Givens rotation to two adjacent rows or
columns.
Syntax
call slarot( lrows, lleft, lright, nl, c, s, a, lda, xleft, xright )
call dlarot( lrows, lleft, lright, nl, c, s, a, lda, xleft, xright )
call clarot( lrows, lleft, lright, nl, c, s, a, lda, xleft, xright )
call zlarot( lrows, lleft, lright, nl, c, s, a, lda, xleft, xright )
Include Files
mkl.fi
Description
The routine ?larot applies a Givens rotation to two adjacent rows or columns, where one element of the
first or last column or row is stored in some format other than GE so that elements of the matrix may be
used or modified for which no array element is provided.
One example is a symmetric matrix in SB format (bandwidth = 4), for which uplo = 'L'. Two adjacent rows
will have the format:
row j : C > C > C > C > C > . . . .

row j + 1 : C > C > C > C > C > . . . .
'*' indicates elements for which storage is provided.

'.' indicates elements for which no storage is provided, but are not necessarily zero; their values are
determined by symmetry.
' ' indicates elements which are required to be zero, and have no storage provided.
Those columns which have two '*' entries can be handled by srot (for slarot and clarot) or by
drot (for dlarot and zlarot).
Those columns which have no '*' entries can be ignored, since as long as the Givens rotations are carefully
applied to preserve symmetry, their values are determined.
Those columns which have one '*' have to be handled separately, by using separate variables p and q :
row j     :  *  *  *  *  *  p  .  .  .
row j + 1 :  q  *  *  *  *  *  .  .  .  .
If element p is set correctly, ?larot rotates the column and sets p to its new value. The next call to ?larot
rotates columns j and j+1 and restores symmetry. The element q is zero at the beginning and non-zero
after the rotation. Later rotations would presumably be chosen to zero q out.
Example
Typical Calling Sequences

These examples rotate the i -th and (i +1)-st rows.
General dense matrix:
call dlarot( .TRUE., .FALSE., .FALSE., n, c, s,
a(i,1), lda, dummy, dummy )
General banded matrix in GB format:
j = max(1, i-kl )
nl = min( n, i+ku+1 ) + 1-j
call dlarot( .TRUE., i-kl.GE.1, i+ku.LT.n, nl, c,s,
a(ku+i+1-j,j),lda-1, xleft, xright )
NOTE
i + 1 - j is just min(i, kl + 1).
Symmetric banded matrix in SY format, bandwidth K, lower triangle only:
j = max(1, i-k )
nl = min( k+1, i ) + 1
call dlarot( .TRUE., i-k.GE.1, .TRUE., nl, c,s,
a(i,j), lda, xleft, xright )
Same, but upper triangle only:
nl = min( k+1, n-i ) + 1

call dlarot( .TRUE., .TRUE., i+k.LT.n, nl, c,s,
a(i,i), lda, xleft, xright )
Symmetric banded matrix in SB format, bandwidth K, lower triangle only: [ same as for SY, except:]
. . . .
a(i+1-j,j), lda, xleft, xright )
NOTE
i+1-j is just min(i,k+1)
Same, but upper triangle only:
. . . .
a(k+1,i), lda-1, xleft, xright )
Rotating columns is just the transpose of rotating rows, except for GB and SB: (rotating columns i and i+1)
GB:
j = max(1, i-ku )
nl = min( n, i+kl+1 ) + 1-j
call dlarot( .TRUE., i-ku.GE.1, i+kl.LT.n, nl, c,s,
a(ku+j+1-i,i),lda-1, xtop, xbottm )
NOTE
ku+j+1-i is just max(1,ku+2-i)
SB: (upper triangle)
. . . . . .
a(k+j+1-i,i),lda-1, xtop, xbottm )
SB: (lower triangle)
. . . . . .
a(1,i),lda-1, xtop, xbottm )
Input Parameters
lrows LOGICAL.
If lrows = .TRUE., ?larot rotates two rows.
If lrows = .FALSE., ?larot rotates two columns.
lleft LOGICAL.
If lleft = .TRUE., xleft is used instead of the corresponding element of
a for the first element in the second row (if lrows = .FALSE.) or column
(if lrows=.TRUE.).
If lleft = .FALSE., the corresponding element of a is used.
lright LOGICAL.
If lright = .TRUE., xright is used instead of the corresponding element
of a for the last element in the first row (if lrows = .FALSE.) or
column (if lrows=.TRUE.).
If lright = .FALSE., the corresponding element of a is used.
nl INTEGER. The length of the rows (if lrows=.TRUE.) or columns (if
lrows=.FALSE.) to be rotated.
If xleft or xright are used, the columns or rows they are in should be
included in nl, e.g., if lleft = lright = .TRUE., then nl must be at
least 2.
The number of rows or columns to be rotated exclusive of those involving
xleft and/or xright may not be negative, i.e., nl minus how many of
lleft and lright are .TRUE. must be at least zero; if not, xerbla is
called.
c, s REAL for slarot,

DOUBLE PRECISION for dlarot,
COMPLEX for clarot,
DOUBLE COMPLEX for zlarot,
Specify the Givens rotation to be applied.
If lrows = .TRUE., then the matrix
(  c  s )
( -s  c )
is applied from the left.
If lrows = .FALSE., then the transpose thereof is applied from the right.
a REAL for slarot,
DOUBLE PRECISION for dlarot,
COMPLEX for clarot,
DOUBLE COMPLEX for zlarot,
The array containing the rows or columns to be rotated. The first element of
a should be the upper left element to be rotated.
lda INTEGER. The "effective" leading dimension of a.

If a contains a matrix stored in GE or SY format, then this is just the
leading dimension of A.
If a contains a matrix stored in band (GB or SB) format, then this should be
one less than the leading dimension used in the calling routine. Thus, if a is
dimensioned a(lda,*) in ?larot, then a(1,j) would be the j-th element
in the first of the two rows to be rotated, and a(2,j) would be the j-th in
the second, regardless of how the array may be stored in the calling
routine. a cannot actually be dimensioned this way, because for band format the row number
may exceed lda, which is not legal FORTRAN.
If lrows = .TRUE., then lda must be at least 1, otherwise it must be at
least nl minus the number of .TRUE. values in lleft and lright.
xleft REAL for slarot,
DOUBLE PRECISION for dlarot,
COMPLEX for clarot,
DOUBLE COMPLEX for zlarot,
If lleft = .TRUE., xleft is used and modified instead of a(2,1) (if
lrows = .TRUE.) or a(1,2) (if lrows = .FALSE.).
xright REAL for slarot,
DOUBLE PRECISION for dlarot,
COMPLEX for clarot,
DOUBLE COMPLEX for zlarot,
If lright = .TRUE., xright is used and modified instead of a(1,nl) (if
lrows = .TRUE.) or a(nl,1) (if lrows = .FALSE.).
Output Parameters
a On exit, modified array A.
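For the general dense case, where neither xleft nor xright is involved, the operation reduces to an ordinary two-row plane rotation. The following is a minimal sketch of that update, not the MKL implementation. The module and subroutine names (givens_sketch, rotate_two_rows) are hypothetical, and the sketch assumes real double-precision data with the rotation rows (c, s) and (-s, c) applied from the left.

```fortran
module givens_sketch
  implicit none
contains
  ! Apply the rotation with rows (c, s) and (-s, c) from the left
  ! to two rows x and y of length nl (hypothetical helper).
  subroutine rotate_two_rows(nl, c, s, x, y)
    integer, intent(in) :: nl
    double precision, intent(in) :: c, s
    double precision, intent(inout) :: x(nl), y(nl)
    double precision :: temp
    integer :: j
    do j = 1, nl
      temp = c*x(j) + s*y(j)      ! new first-row entry
      y(j) = -s*x(j) + c*y(j)     ! new second-row entry
      x(j) = temp
    end do
  end subroutine rotate_two_rows
end module givens_sketch
```

In the banded cases, the single-'*' columns at either end would be passed separately (the role xleft and xright play above) rather than read from the array.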
?larra
Computes the splitting points with the specified
threshold.
Syntax
call slarra( n, d, e, e2, spltol, tnrm, nsplit, isplit, info )
call dlarra( n, d, e, e2, spltol, tnrm, nsplit, isplit, info )
Include Files
mkl.fi
Description
The routine computes the splitting points with the specified threshold and sets any "small" off-diagonal
elements to zero.
Input Parameters
n INTEGER. The order of the matrix (n> 1).
d REAL for slarra

DOUBLE PRECISION for dlarra
e REAL for slarra

First (n-1) entries contain the subdiagonal elements of the tridiagonal

matrix T; e(n) need not be set.
e2 REAL for slarra

First (n-1) entries contain the squares of the subdiagonal elements of the
tridiagonal matrix T; e2(n) need not be set.
spltol REAL for slarra

The threshold for splitting. Two criteria can be used: spltol<0 : criterion
based on absolute off-diagonal value; spltol>0 : criterion that preserves
relative accuracy.
tnrm REAL for slarra

The norm of the matrix.
Output Parameters
e On exit, the entries e(isplit(i)), 1 ≤ i ≤ nsplit, are set to zero; the
other entries of e are untouched.
e2 On exit, the entries e2(isplit(i)), 1 ≤ i ≤ nsplit, are set to zero.
nsplit INTEGER.
The number of blocks the matrix T splits into. 1 ≤ nsplit ≤ n
isplit INTEGER.
The splitting points, at which T breaks up into blocks. The first block
consists of rows/columns 1 to isplit(1), the second of rows/columns
isplit(1)+1 through isplit(2), and so on, and the nsplit-th consists
of rows/columns isplit(nsplit-1)+1 through isplit(nsplit)=n.
info INTEGER.
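The two splitting criteria selected by the sign of spltol can be sketched as follows. This is an illustration only: the exact thresholds (|spltol|*tnrm for the absolute test, spltol*sqrt(|d(i)|)*sqrt(|d(i+1)|) for the relative test) follow the style of reference LAPACK and are an assumption about what MKL does internally. The names split_sketch and split_points are hypothetical.

```fortran
module split_sketch
  implicit none
contains
  ! Zero "small" off-diagonals of a tridiagonal matrix and record the splits.
  subroutine split_points(n, d, e, e2, spltol, tnrm, nsplit, isplit)
    integer, intent(in) :: n
    double precision, intent(in) :: d(n), spltol, tnrm
    double precision, intent(inout) :: e(n), e2(n)
    integer, intent(out) :: nsplit, isplit(n)
    double precision :: thresh
    integer :: i
    nsplit = 1
    do i = 1, n - 1
      if (spltol < 0d0) then
        ! absolute criterion (assumed form): compare with |spltol|*norm(T)
        thresh = abs(spltol)*tnrm
      else
        ! relative criterion (assumed form): scale by neighboring diagonals
        thresh = spltol*sqrt(abs(d(i)))*sqrt(abs(d(i+1)))
      end if
      if (abs(e(i)) <= thresh) then
        e(i) = 0d0
        e2(i) = 0d0
        isplit(nsplit) = i
        nsplit = nsplit + 1
      end if
    end do
    isplit(nsplit) = n            ! last block always ends at row n
  end subroutine split_points
end module split_sketch
```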
?larrb
Provides limited bisection to locate eigenvalues for
more accuracy.
Syntax
call slarrb( n, d, lld, ifirst, ilast, rtol1, rtol2, offset, w, wgap, werr, work,
iwork, pivmin, spdiam, twist, info )
call dlarrb( n, d, lld, ifirst, ilast, rtol1, rtol2, offset, w, wgap, werr, work,
iwork, pivmin, spdiam, twist, info )
Include Files
mkl.fi
Description
Given the relatively robust representation (RRR) L*D*L^T, the routine does "limited" bisection to refine the
eigenvalues of L*D*L^T, w( ifirst-offset ) through w( ilast-offset ), to more accuracy. Initial guesses for these
eigenvalues are input in w. The corresponding estimate of the error in these guesses and their gaps are input
in werr and wgap, respectively. During bisection, intervals [left, right] are maintained by storing their
mid-points and semi-widths in the arrays w and werr respectively.
Input Parameters
n INTEGER. The order of the matrix.
d REAL for slarrb

DOUBLE PRECISION for dlarrb
Array, DIMENSION (n). The n diagonal elements of the diagonal matrix D.
lld REAL for slarrb

The n-1 elements L(i)*L(i)*D(i).
ifirst INTEGER. The index of the first eigenvalue to be computed.
ilast INTEGER. The index of the last eigenvalue to be computed.
rtol1, rtol2 REAL for slarrb

Tolerance for the convergence of the bisection intervals. An interval [left, right]
has converged if RIGHT-LEFT.LT.MAX( rtol1*gap, rtol2*max(|left|,|right|) ),
where gap is the (estimated) distance to the nearest eigenvalue.
offset INTEGER. Offset for the arrays w, wgap and werr, that is, the ifirst-offset
through ilast-offset elements of these arrays are to be used.
w REAL for slarrb

Array, DIMENSION (n). On input, w( ifirst-offset ) through w( ilast-offset )
are estimates of the eigenvalues of L*D*LT indexed ifirst through ilast.
wgap REAL for slarrb

Array, DIMENSION (n-1). The estimated gaps between consecutive
eigenvalues of L*D*L^T, that is, wgap(i-offset) is the gap between
eigenvalues i and i+1. Note that if IFIRST.EQ.ILAST then
wgap(ifirst-offset) must be set to 0.
werr REAL for slarrb

Array, DIMENSION (n). On input, werr(ifirst-offset) through werr(ilast-offset)
are the errors in the estimates of the corresponding elements in w.
work REAL for slarrb

pivmin REAL for slarrb

The minimum pivot in the Sturm sequence.
spdiam REAL for slarrb

The spectral diameter of the matrix.
twist INTEGER. The twist index for the twisted factorization that is used for the
negcount.
twist = n: Compute negcount from L*D*L^T - lambda*I = L+*D+*L+^T
twist = 1: Compute negcount from L*D*L^T - lambda*I = U-*D-*U-^T
twist = r: Compute negcount from L*D*L^T - lambda*I = N(r)*D(r)*N(r)
iwork INTEGER.
Output Parameters
w On output, the estimates of the eigenvalues are "refined".
wgap On output, the gaps are refined.
werr On output, "refined" errors in the estimates of w.
info INTEGER.
Error flag.
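The midpoint/semi-width storage and the convergence test quoted under rtol1, rtol2 can be written out as a short sketch. The module and procedure names (interval_sketch, unpack_interval, converged) are hypothetical helpers for illustration and are not part of MKL.

```fortran
module interval_sketch
  implicit none
contains
  ! Recover [left, right] from a stored midpoint w and semi-width werr.
  subroutine unpack_interval(w, werr, left, right)
    double precision, intent(in) :: w, werr
    double precision, intent(out) :: left, right
    left = w - werr
    right = w + werr
  end subroutine unpack_interval

  ! right-left < max( rtol1*gap, rtol2*max(|left|,|right|) )
  logical function converged(left, right, gap, rtol1, rtol2)
    double precision, intent(in) :: left, right, gap, rtol1, rtol2
    converged = (right - left) < max(rtol1*gap, rtol2*max(abs(left), abs(right)))
  end function converged
end module interval_sketch
```

The first term lets a well-separated eigenvalue stop early relative to its gap, while the second term bounds the interval width relative to the eigenvalue magnitude.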
?larrc
Computes the number of eigenvalues of the
symmetric tridiagonal matrix.
Syntax
call slarrc( jobt, n, vl, vu, d, e, pivmin, eigcnt, lcnt, rcnt, info )
call dlarrc( jobt, n, vl, vu, d, e, pivmin, eigcnt, lcnt, rcnt, info )
Include Files
mkl.fi
Description
The routine finds the number of eigenvalues of the symmetric tridiagonal matrix T or of its factorization
L*D*LT in the specified interval.
Input Parameters
jobt CHARACTER*1.
= 'T': computes Sturm count for matrix T.
= 'L': computes Sturm count for matrix L*D*LT.
n INTEGER.
The order of the matrix. (n > 1).
vl,vu REAL for slarrc

DOUBLE PRECISION for dlarrc
The lower and upper bounds for the eigenvalues.
d REAL for slarrc
If jobt= 'T': contains the n diagonal elements of the tridiagonal matrix T.
If jobt= 'L': contains the n diagonal elements of the diagonal matrix D.
e REAL for slarrc

If jobt= 'T': contains the (n-1)offdiagonal elements of the matrix T.
If jobt= 'L': contains the (n-1)offdiagonal elements of the matrix L.
pivmin REAL for slarrc

The minimum pivot in the Sturm sequence for the matrix T.
Output Parameters
eigcnt INTEGER.
The number of eigenvalues of the symmetric tridiagonal matrix T that are in
the half-open interval (vl,vu].
lcnt,rcnt INTEGER.
The left and right negcounts of the interval.
info INTEGER.
Currently not used; always set to 0.
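For jobt = 'T', the count over the half-open interval (vl, vu] can be obtained as the difference of two Sturm counts. The sketch below illustrates the idea under simplifying assumptions: it has no safeguard against a zero pivot (production code protects the recurrence), and the names sturm_sketch, negcount, and count_in_interval are hypothetical.

```fortran
module sturm_sketch
  implicit none
contains
  ! Count the nonpositive pivots of T - x*I for symmetric tridiagonal T
  ! (diagonal d, off-diagonal e). Unguarded: assumes no pivot is exactly zero.
  function negcount(n, d, e, x) result(cnt)
    integer, intent(in) :: n
    double precision, intent(in) :: d(n), e(n), x
    integer :: cnt, i
    double precision :: pivot
    pivot = d(1) - x
    cnt = 0
    if (pivot <= 0d0) cnt = 1
    do i = 1, n - 1
      pivot = (d(i+1) - x) - e(i)**2/pivot
      if (pivot <= 0d0) cnt = cnt + 1
    end do
  end function negcount

  ! Eigenvalues in (vl, vu] = (right negcount) - (left negcount).
  function count_in_interval(n, d, e, vl, vu) result(eigcnt)
    integer, intent(in) :: n
    double precision, intent(in) :: d(n), e(n), vl, vu
    integer :: eigcnt
    eigcnt = negcount(n, d, e, vu) - negcount(n, d, e, vl)
  end function count_in_interval
end module sturm_sketch
```

The two intermediate counts correspond to the roles of rcnt and lcnt described above.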
?larrd
Computes the eigenvalues of a symmetric tridiagonal
matrix to suitable accuracy.
Syntax
call slarrd( range, order, n, vl, vu, il, iu, gers, reltol, d, e, e2, pivmin, nsplit,
isplit, m, w, werr, wl, wu, iblock, indexw, work, iwork, info )
call dlarrd( range, order, n, vl, vu, il, iu, gers, reltol, d, e, e2, pivmin, nsplit,
isplit, m, w, werr, wl, wu, iblock, indexw, work, iwork, info )
Include Files
mkl.fi
Description
The routine computes the eigenvalues of a symmetric tridiagonal matrix T to suitable accuracy. This is an
auxiliary code to be called from ?stemr. The user may ask for all eigenvalues, all eigenvalues in the half-
open interval (vl, vu], or the il-th through iu-th eigenvalues.
To avoid overflow, the matrix must be scaled so that its largest element is no greater than
overflow^(1/2) * underflow^(1/4) in absolute value, and for greatest accuracy, it should not be much smaller
than that. For more details see [Kahan66].
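As a rough illustration of the size of this scaling bound in double precision, the sketch below evaluates it with Fortran intrinsics. Treating "overflow" as huge() and "underflow" as tiny() is an assumption made here; the library may use slightly different machine constants. The names scale_bound_sketch and scale_bound are hypothetical.

```fortran
module scale_bound_sketch
  implicit none
contains
  ! sqrt(overflow) * underflow**(1/4), with huge()/tiny() standing in
  ! for the overflow and underflow thresholds (assumption).
  double precision function scale_bound()
    scale_bound = sqrt(huge(1d0)) * sqrt(sqrt(tiny(1d0)))
  end function scale_bound
end module scale_bound_sketch
```

For IEEE double precision this evaluates to a value on the order of 1e77.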
Input Parameters
range CHARACTER.
= 'A': ("All") all eigenvalues will be found.
= 'V': ("Value") all eigenvalues in the half-open interval (vl, vu] will be
found.
= 'I': ("Index") the il-th through iu-th eigenvalues will be found.
order CHARACTER.
= 'B': ("By block") the eigenvalues will be grouped by split-off block (see
iblock, isplit below) and ordered from smallest to largest within the
block.
= 'E': ("Entire matrix") the eigenvalues for the entire matrix will be
ordered from smallest to largest.
n INTEGER. The order of the tridiagonal matrix T (n ≥ 1).
vl,vu REAL for slarrd

DOUBLE PRECISION for dlarrd
If range = 'V': the lower and upper bounds of the interval to be
searched for eigenvalues. Eigenvalues less than or equal to vl, or greater
than vu, will not be returned. vl < vu.
If range = 'A' or 'I': not referenced.
il,iu INTEGER.
If range = 'I': the indices (in ascending order) of the smallest and
largest eigenvalues to be returned. 1 ≤ il ≤ iu ≤ n, if n > 0; il=1 and
iu=0 if n=0.
If range = 'A' or 'V': not referenced.
gers REAL for slarrd

Array, DIMENSION (2*n).
The n Gerschgorin intervals (the i-th Gerschgorin interval is
(gers(2*i-1), gers(2*i))).
reltol REAL for slarrd
DOUBLE PRECISION for dlarrd
The minimum relative width of an interval. When an interval is narrower
than reltol times the larger (in magnitude) endpoint, then it is considered
to be sufficiently small, that is, converged. Note: this should always be at
least radix*machine epsilon.
d REAL for slarrd

e REAL for slarrd

Contains (n-1) off-diagonal elements of the tridiagonal matrix T.
e2 REAL for slarrd

Contains (n-1) squared off-diagonal elements of the tridiagonal matrix T.
pivmin REAL for slarrd

nsplit INTEGER.
The number of diagonal blocks in the matrix T. 1 ≤ nsplit ≤ n
isplit INTEGER.
Array, DIMENSION (n).
The splitting points, at which T breaks up into submatrices. The first

submatrix consists of rows/columns 1 to isplit(1), the second of rows/
columns isplit(1)+1 through isplit(2), and so on, and the nsplit-th
consists of rows/columns isplit(nsplit-1)+1 through
isplit(nsplit)=n.
(Only the first nsplit elements are actually used, but since the user cannot
know the value of nsplit a priori, n words must be reserved for isplit.)
work REAL for slarrd

iwork INTEGER.
Output Parameters
m INTEGER.
The actual number of eigenvalues found. 0 ≤ m ≤ n. (See also the description
of info=2,3.)
w REAL for slarrd

The first m elements of w contain the eigenvalue approximations. ?larrd
computes an interval I_j = (a_j, b_j] that includes eigenvalue j. The
eigenvalue approximation is given as the interval midpoint w(j) = (a_j + b_j)/2.
The corresponding error is bounded by werr(j) = abs(a_j - b_j)/2.
werr REAL for slarrd

The error bound on the corresponding eigenvalue approximation in w.
wl, wu REAL for slarrd

The interval (wl, wu] contains all the wanted eigenvalues.
If range = 'V': then wl=vl and wu=vu.
If range = 'A': then wl and wu are the global Gerschgorin bounds on the
spectrum.
If range = 'I': then wl and wu are computed by ?laebz from the index
range specified.
iblock INTEGER.
At each row/column j where e(j) is zero or small, the matrix T is

considered to split into a block diagonal matrix.
If info = 0, then iblock(i) specifies to which block (from 1 to the
number of blocks) the eigenvalue w(i) belongs. (The routine may use the
remaining n-m elements as workspace.)
indexw INTEGER.
The indices of the eigenvalues within each block (submatrix); for example,
indexw(i)= j and iblock(i)=k imply that the i-th eigenvalue w(i) is
the j-th eigenvalue in block k.
info INTEGER.
> 0: some or all of the eigenvalues fail to converge or are not computed:
=1 or 3: bisection failed to converge for some eigenvalues; these eigenvalues
are flagged by a negative block number. The effect is that the eigenvalues
may not be as accurate as the absolute and relative tolerances.
=2 or 3: range='I' only: not all of the eigenvalues il:iu were found.
=4: range='I', and the Gershgorin interval initially used is too small. No
eigenvalues are computed.
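The layout of gers and the global Gerschgorin bounds mentioned for range = 'A' can be illustrated with a small sketch. It builds the intervals directly from d and e using the standard Gerschgorin circle bound; in practice the gers array is normally supplied by a preceding routine, and any widening for rounding errors is omitted here. The names gersch_sketch and gerschgorin are hypothetical.

```fortran
module gersch_sketch
  implicit none
contains
  ! Fill gers(2*i-1), gers(2*i) with the i-th Gerschgorin interval of a
  ! symmetric tridiagonal matrix and return the global bounds gl, gu.
  subroutine gerschgorin(n, d, e, gers, gl, gu)
    integer, intent(in) :: n
    double precision, intent(in) :: d(n), e(n)
    double precision, intent(out) :: gers(2*n), gl, gu
    double precision :: eold, eabs
    integer :: i
    eold = 0d0
    do i = 1, n
      if (i < n) then
        eabs = abs(e(i))
      else
        eabs = 0d0                ! e(n) is not part of the matrix
      end if
      gers(2*i-1) = d(i) - (eold + eabs)
      gers(2*i)   = d(i) + (eold + eabs)
      eold = eabs
    end do
    gl = minval(gers(1:2*n-1:2))  ! smallest left endpoint
    gu = maxval(gers(2:2*n:2))    ! largest right endpoint
  end subroutine gerschgorin
end module gersch_sketch
```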
?larre
Given the tridiagonal matrix T, sets small off-diagonal
elements to zero and for each unreduced block Ti,
finds base representations and eigenvalues.
Syntax
call slarre( range, n, vl, vu, il, iu, d, e, e2, rtol1, rtol2, spltol, nsplit, isplit,
m, w, werr, wgap, iblock, indexw, gers, pivmin, work, iwork, info )
call dlarre( range, n, vl, vu, il, iu, d, e, e2, rtol1, rtol2, spltol, nsplit, isplit,
m, w, werr, wgap, iblock, indexw, gers, pivmin, work, iwork, info )
Include Files
mkl.fi
Description
To find the desired eigenvalues of a given real symmetric tridiagonal matrix T, the routine sets any "small"
off-diagonal elements to zero, and for each unreduced block T_i, it finds
a suitable shift at one end of the block spectrum,
the base representation, T_i - sigma_i*I = L_i*D_i*L_i^T, and
eigenvalues of each L_i*D_i*L_i^T.
The representations and eigenvalues found are then used by ?stemr to compute the eigenvectors of a
symmetric tridiagonal matrix. The accuracy varies depending on whether bisection is used to find a few
eigenvalues or the dqds algorithm (subroutine ?lasq2) to compute all and discard any unwanted ones. As an
added benefit, ?larre also outputs the n Gerschgorin intervals for the matrices L_i*D_i*L_i^T.
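A base representation of the form T - sigma*I = L*D*L^T for a single unreduced block can be computed with the usual tridiagonal factorization recurrence. The sketch below shows that recurrence only. It does not choose the shift, does not check for element growth, and is not the MKL implementation; the names ldl_sketch and shifted_ldl are hypothetical.

```fortran
module ldl_sketch
  implicit none
contains
  ! Factor T - sigma*I = L*D*L^T for symmetric tridiagonal T with
  ! diagonal d(1:n) and subdiagonal e(1:n-1). dd holds D, l holds
  ! the subdiagonal of the unit bidiagonal L.
  subroutine shifted_ldl(n, d, e, sigma, dd, l)
    integer, intent(in) :: n
    double precision, intent(in) :: d(n), e(n-1), sigma
    double precision, intent(out) :: dd(n), l(n-1)
    integer :: i
    dd(1) = d(1) - sigma
    do i = 1, n - 1
      l(i) = e(i)/dd(i)
      dd(i+1) = (d(i+1) - sigma) - l(i)*e(i)
    end do
  end subroutine shifted_ldl
end module ldl_sketch
```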
Input Parameters
range CHARACTER.
= 'A': ("All") all eigenvalues will be found.
= 'V': ("Value") all eigenvalues in the half-open interval (vl, vu] will be
found.
= 'I': ("Index") the il-th through iu-th eigenvalues of the entire matrix
will be found.
n INTEGER. The order of the matrix. n > 0.
vl, vu REAL for slarre

DOUBLE PRECISION for dlarre
If range='V', the lower and upper bounds for the eigenvalues. Eigenvalues
less than or equal to vl, or greater than vu, are not returned. vl < vu.
il, iu INTEGER.
If range='I', the indices (in ascending order) of the smallest and largest
eigenvalues to be returned. 1 ≤ il ≤ iu ≤ n.
d REAL for slarre

The n diagonal elements of the tridiagonal matrix T.
e REAL for slarre

Array, DIMENSION (n). The first (n-1) entries contain the subdiagonal
elements of the tridiagonal matrix T; e(n) need not be set.
e2 REAL for slarre

Array, DIMENSION (n). The first (n-1) entries contain the squares of the
subdiagonal elements of the tridiagonal matrix T; e2(n) need not be set.
rtol1, rtol2 REAL for slarre

Parameters for bisection. An interval [LEFT,RIGHT] has converged if
RIGHT-LEFT.LT.MAX( rtol1*gap, rtol2*max(|LEFT|,|RIGHT|) ).
spltol REAL for slarre

The threshold for splitting.
work REAL for slarre

iwork INTEGER.
Output Parameters
vl, vu On exit, if range='I' or ='A', contain the bounds on the desired part of
the spectrum.
d On exit, the n diagonal elements of the diagonal matrices Di .
e On exit, the subdiagonal elements of the unit bidiagonal matrices Li . The

entries e( isplit( i) ), 1 ≤ i ≤ nsplit, contain the base points sigma_i on
output.
e2 On exit, the entries e2( isplit( i) ), 1 ≤ i ≤ nsplit, have been set to zero.
nsplit INTEGER. The number of blocks T splits into. 1 ≤ nsplit ≤ n.
isplit INTEGER. Array, DIMENSION (n). The splitting points, at which T breaks up
into blocks. The first block consists of rows/columns 1 to isplit(1), the
second of rows/columns isplit(1)+1 through isplit(2), etc., and the
nsplit-th consists of rows/columns isplit(nsplit-1)+1 through
isplit(nsplit)=n.
m INTEGER. The total number of eigenvalues (of all the L_i*D_i*L_i^T) found.
w REAL for slarre

Array, DIMENSION (n). The first m elements contain the eigenvalues. The
eigenvalues of each of the blocks, L_i*D_i*L_i^T, are sorted in ascending order.
The routine may use the remaining n-m elements as workspace.
werr REAL for slarre

Array, DIMENSION (n). The error bound on the corresponding eigenvalue in
w.
wgap REAL for slarre

Array, DIMENSION (n). The separation from the right neighbor eigenvalue in
w. The gap is only with respect to the eigenvalues of the same block as
each block has its own representation tree. Exception: at the right end of a
block the left gap is stored.
iblock INTEGER. Array, DIMENSION (n).

The indices of the blocks (submatrices) associated with the corresponding
eigenvalues in w; iblock(i)=1 if eigenvalue w(i) belongs to the first block
from the top, =2 if w(i) belongs to the second block, etc.
indexw INTEGER. Array, DIMENSION (n).

indexw(i)= 10 and iblock(i)=2 imply that the i-th eigenvalue w(i) is the
10-th eigenvalue in the second block.
gers REAL for slarre

Array, DIMENSION (2*n). The n Gerschgorin intervals (the i-th Gerschgorin
interval is (gers(2*i-1), gers(2*i))).
pivmin REAL for slarre

The minimum pivot in the Sturm sequence for T .
info INTEGER.
If info = 0: successful exit
If info > 0: A problem occurred in ?larre. If info = 5, the Rayleigh
Quotient Iteration failed to converge to full accuracy.
If info < 0: One of the called subroutines signaled an internal problem.
Inspection of the corresponding parameter info for further information is
required.
If info = -1, there is a problem in ?larrd.
If info = -2, no base representation could be found in maxtry
iterations. Increasing maxtry and recompilation might be a remedy.
If info = -3, there is a problem in ?larrb when computing the refined
root representation for ?lasq2.
If info = -4, there is a problem in ?larrb when performing bisection
on the desired part of the spectrum.
If info = -5, there is a problem in ?lasq2.
If info = -6, there is a problem in ?lasq2.
See Also
?stemr
?lasq2
?larrb
?larrd
?larrf
Finds a new relatively robust representation such that
at least one of the eigenvalues is relatively isolated.
Syntax
call slarrf( n, d, l, ld, clstrt, clend, w, wgap, werr, spdiam, clgapl, clgapr,
pivmin, sigma, dplus, lplus, work, info )
call dlarrf( n, d, l, ld, clstrt, clend, w, wgap, werr, spdiam, clgapl, clgapr,
pivmin, sigma, dplus, lplus, work, info )
Include Files
mkl.fi
Description
Given the initial representation L*D*LT and its cluster of close eigenvalues (in a relative measure), w(clstrt),
w(clstrt+1), ... w(clend), the routine ?larrf finds a new relatively robust representation
L*D*L^T - sigma*I = L(+)*D(+)*L(+)^T

such that at least one of the eigenvalues of L(+)*D(+)*L(+)^T is relatively isolated.
Input Parameters
n INTEGER. The order of the matrix (subblock, if the matrix is split).
d REAL for slarrf

DOUBLE PRECISION for dlarrf
Array, DIMENSION (n). The n diagonal elements of the diagonal matrix D.
l REAL for slarrf

The (n-1) subdiagonal elements of the unit bidiagonal matrix L.
ld REAL for slarrf

The n-1 elements L(i)*D(i).
clstrt INTEGER. The index of the first eigenvalue in the cluster.
clend INTEGER. The index of the last eigenvalue in the cluster.
w REAL for slarrf

Array, DIMENSION (clend -clstrt+1). The eigenvalue approximations of
L*D*LT in ascending order. w(clstrt) through w(clend) form the cluster of
relatively close eigenvalues.
wgap REAL for slarrf

Array, DIMENSION (clend -clstrt+1). The separation from the right
neighbor eigenvalue in w.
werr REAL for slarrf

Array, DIMENSION (clend -clstrt+1). On input, werr contains the
semiwidth of the uncertainty interval of the corresponding eigenvalue
approximation in w.
spdiam REAL for slarrf

Estimate of the spectral diameter obtained from the Gerschgorin intervals.
clgapl, clgapr REAL for slarrf

Absolute gap on each end of the cluster. Set by the calling routine to protect
against shifts too close to eigenvalues outside the cluster.
pivmin REAL for slarrf

The minimum pivot allowed in the Sturm sequence.
work REAL for slarrf

Output Parameters
wgap On output, the gaps are refined.
sigma REAL for slarrf
The shift used to form L(+)*D(+)*L(+)^T.
dplus REAL for slarrf

Array, DIMENSION (n). The n diagonal elements of the diagonal matrix
D(+).
lplus REAL for slarrf

Array, DIMENSION (n). The first (n-1) elements of lplus contain the
subdiagonal elements of the unit bidiagonal matrix L(+).
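Given a shift sigma, the factors D(+) and L(+) in L*D*L^T - sigma*I = L(+)*D(+)*L(+)^T can be obtained from d, l, and ld with a stationary qd-type recurrence. The sketch below shows one such recurrence, in the style of reference LAPACK. Whether MKL uses exactly this form is an assumption, and the search for a good shift (the main job of ?larrf) is not shown. The names stqds_sketch and shift_rrr are hypothetical.

```fortran
module stqds_sketch
  implicit none
contains
  ! Compute dplus, lplus with L*D*L^T - sigma*I = L(+)*D(+)*L(+)^T,
  ! where ld(i) = l(i)*d(i).
  subroutine shift_rrr(n, d, l, ld, sigma, dplus, lplus)
    integer, intent(in) :: n
    double precision, intent(in) :: d(n), l(n-1), ld(n-1), sigma
    double precision, intent(out) :: dplus(n), lplus(n-1)
    double precision :: s
    integer :: i
    s = -sigma
    do i = 1, n - 1
      dplus(i) = d(i) + s
      lplus(i) = ld(i)/dplus(i)
      s = s*lplus(i)*l(i) - sigma   ! carry the shift to the next row
    end do
    dplus(n) = d(n) + s
  end subroutine shift_rrr
end module stqds_sketch
```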
?larrj
Performs refinement of the initial estimates of the
eigenvalues of the matrix T.
Syntax
call slarrj( n, d, e2, ifirst, ilast, rtol, offset, w, werr, work, iwork, pivmin,
spdiam, info )
call dlarrj( n, d, e2, ifirst, ilast, rtol, offset, w, werr, work, iwork, pivmin,
spdiam, info )
Include Files
mkl.fi
Description
Given the initial eigenvalue approximations of T, this routine does bisection to refine the eigenvalues of T,
w(ifirst-offset) through w(ilast-offset), to more accuracy. Initial guesses for these eigenvalues are
input in w, the corresponding estimate of the error in these guesses in werr. During bisection, intervals
[a,b] are maintained by storing their mid-points and semi-widths in the arrays w and werr respectively.
Input Parameters
n INTEGER. The order of the matrix T.
d REAL for slarrj

DOUBLE PRECISION for dlarrj
Contains n diagonal elements of the matrix T.
e2 REAL for slarrj

Contains (n-1) squared sub-diagonal elements of T.
ifirst INTEGER.
The index of the first eigenvalue to be computed.
ilast INTEGER.
The index of the last eigenvalue to be computed.
rtol REAL for slarrj
Tolerance for the convergence of the bisection intervals. An interval [a,b]
is considered to be converged if (b-a) ≤ rtol*max(|a|,|b|).
offset INTEGER.
Offset for the arrays w and werr, that is the ifirst-offset through
ilast-offset elements of these arrays are to be used.
w REAL for slarrj
On input, w(ifirst-offset) through w(ilast-offset) are estimates of

the eigenvalues of L*D*LT indexed ifirst through ilast.
werr REAL for slarrj

On input, werr(ifirst-offset) through werr(ilast-offset) are the

errors in the estimates of the corresponding elements in w.
work REAL for slarrj

iwork INTEGER.
pivmin REAL for slarrj

spdiam REAL for slarrj

The spectral diameter of the matrix T.
Output Parameters
w On exit, contains the refined estimates of the eigenvalues.
werr On exit, contains the refined errors in the estimates of the corresponding
elements in w.
info INTEGER.
Currently not used; always set to 0.
?larrk
Computes one eigenvalue of a symmetric tridiagonal
matrix T to suitable accuracy.
Syntax
call slarrk( n, iw, gl, gu, d, e2, pivmin, reltol, w, werr, info )
call dlarrk( n, iw, gl, gu, d, e2, pivmin, reltol, w, werr, info )
Include Files
mkl.fi
Description
The routine computes one eigenvalue of a symmetric tridiagonal matrix T to suitable accuracy. This is an
auxiliary code to be called from ?stemr.
To avoid overflow, the matrix must be scaled so that its largest element is no greater than
overflow^(1/2) * underflow^(1/4) in absolute value, and for greatest accuracy, it should not be much smaller
than that. For more details see [Kahan66].
Input Parameters
n INTEGER. The order of the matrix T. (n ≥ 1).
iw INTEGER.
The index of the eigenvalue to be returned.
gl, gu REAL for slarrk

DOUBLE PRECISION for dlarrk
A lower and an upper bound on the eigenvalue.
d REAL for slarrk
e2 REAL for slarrk

Contains (n-1) squared off-diagonal elements of T.
pivmin REAL for slarrk

reltol REAL for slarrk
DOUBLE PRECISION for dlarrk
The minimum relative width of an interval. When an interval is narrower
than reltol times the larger (in magnitude) endpoint, then it is considered
to be sufficiently small, that is, converged. Note: this should always be at
least radix*machine epsilon.
Output Parameters
w REAL for slarrk

Contains the eigenvalue approximation.
werr REAL for slarrk

Contains the error bound on the corresponding eigenvalue approximation in
w.
info INTEGER.
= 0: Eigenvalue converges
= -1: Eigenvalue does not converge
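The computation described in this section can be pictured as bisection driven by a Sturm count. The following sketch is a simplified illustration under stated assumptions: it starts from [gl, gu] without widening the interval, uses a fixed iteration cap of 200, and replaces pivots smaller than pivmin in magnitude by -pivmin. It is not the MKL implementation, and the names bisect_sketch and bisect_one are hypothetical.

```fortran
module bisect_sketch
  implicit none
contains
  ! Bisection for the iw-th smallest eigenvalue of a symmetric
  ! tridiagonal T (diagonal d, squared off-diagonals e2) in [gl, gu].
  subroutine bisect_one(n, iw, gl, gu, d, e2, pivmin, reltol, w, werr, info)
    integer, intent(in) :: n, iw
    double precision, intent(in) :: gl, gu, d(n), e2(n), pivmin, reltol
    double precision, intent(out) :: w, werr
    integer, intent(out) :: info
    double precision :: left, right, mid, piv
    integer :: it, i, negcnt
    left = gl
    right = gu
    info = -1                              ! pessimistic until converged
    do it = 1, 200                         ! iteration cap chosen for this sketch
      if (right - left <= reltol*max(abs(left), abs(right))) then
        info = 0
        exit
      end if
      mid = 0.5d0*(left + right)
      ! Sturm count: number of eigenvalues not exceeding mid
      negcnt = 0
      piv = d(1) - mid
      if (abs(piv) < pivmin) piv = -pivmin
      if (piv <= 0d0) negcnt = 1
      do i = 2, n
        piv = d(i) - e2(i-1)/piv - mid
        if (abs(piv) < pivmin) piv = -pivmin
        if (piv <= 0d0) negcnt = negcnt + 1
      end do
      if (negcnt >= iw) then
        right = mid
      else
        left = mid
      end if
    end do
    w = 0.5d0*(left + right)               ! midpoint approximation
    werr = 0.5d0*abs(right - left)         ! semi-width error bound
  end subroutine bisect_one
end module bisect_sketch
```

The returned pair (w, werr) uses the same midpoint and semi-width convention as the w and werr output parameters above.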
?larrr
Performs tests to decide whether the symmetric
tridiagonal matrix T warrants expensive computations
which guarantee high relative accuracy in the
eigenvalues.
Syntax
call slarrr( n, d, e, info )
call dlarrr( n, d, e, info )
Include Files
mkl.fi
Description
The routine performs tests to decide whether the symmetric tridiagonal matrix T warrants expensive
computations which guarantee high relative accuracy in the eigenvalues.
Input Parameters
n INTEGER. The order of the matrix T. (n> 0).
d REAL for slarrr

DOUBLE PRECISION for dlarrr
e REAL for slarrr

DOUBLE PRECISION for dlarrr
The first (n-1) entries contain sub-diagonal elements of the tridiagonal

matrix T; e(n) is set to 0.
Output Parameters
info INTEGER.
= 0: the matrix warrants computations preserving relative accuracy
(default value).
= -1: the matrix warrants computations guaranteeing only absolute
accuracy.
?larrv
Computes the eigenvectors of the tridiagonal matrix T
= L*D*LT given L, D and the eigenvalues of L*D*LT.
Syntax
call slarrv( n, vl, vu, d, l, pivmin, isplit, m, dol, dou, minrgp, rtol1, rtol2, w,
werr, wgap, iblock, indexw, gers, z, ldz, isuppz, work, iwork, info )
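call dlarrv( n, vl, vu, d, l, pivmin, isplit, m, dol, dou, minrgp, rtol1, rtol2, w,
werr, wgap, iblock, indexw, gers, z, ldz, isuppz, work, iwork, info )
call clarrv( n, vl, vu, d, l, pivmin, isplit, m, dol, dou, minrgp, rtol1, rtol2, w,
werr, wgap, iblock, indexw, gers, z, ldz, isuppz, work, iwork, info )
call zlarrv( n, vl, vu, d, l, pivmin, isplit, m, dol, dou, minrgp, rtol1, rtol2, w,
werr, wgap, iblock, indexw, gers, z, ldz, isuppz, work, iwork, info )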
Include Files
mkl.fi
Description
The routine ?larrv computes the eigenvectors of the tridiagonal matrix T = L*D*LT given L, D and
approximations to the eigenvalues of L*D*LT.
The input eigenvalues should have been computed by slarre for real flavors (slarrv/clarrv) and by
dlarre for double precision flavors (dlarrv/zlarrv).
Input Parameters
n INTEGER. The order of the matrix. n ≥ 0.
vl, vu REAL for slarrv/clarrv

DOUBLE PRECISION for dlarrv/zlarrv
Lower and upper bounds respectively of the interval that contains the
desired eigenvalues. vl < vu. Needed to compute gaps on the left or right
end of the extremal eigenvalues in the desired range.
d REAL for slarrv/clarrv

Array, DIMENSION (n). On entry, the n diagonal elements of the diagonal
matrix D.
l REAL for slarrv/clarrv

On entry, the (n-1) subdiagonal elements of the unit bidiagonal matrix L are
contained in elements 1 to n-1 of L if the matrix is not split. At the end
of each block the corresponding shift is stored as given by slarre for real
flavors and by dlarre for double precision flavors.
pivmin REAL for slarrv/clarrv

The minimum pivot allowed in the Sturm sequence.
isplit INTEGER. Array, DIMENSION (n).

The splitting points, at which T breaks up into blocks. The first block
consists of rows/columns 1 to isplit(1), the second of rows/columns
isplit(1)+1 through isplit(2), etc.
m INTEGER. The total number of eigenvalues found.

0 ≤ m ≤ n. If range = 'A', m = n, and if range = 'I', m = iu - il +1.
dol, dou INTEGER.

If you want to compute only selected eigenvectors from all the eigenvalues
supplied, specify an index range dol:dou. Or else apply the setting dol=1,
dou=m. Note that dol and dou refer to the order in which the eigenvalues
are stored in w.
If you want to compute only selected eigenpairs, then the columns dol-1 to
dou+1 of the eigenvector space Z contain the computed eigenvectors. All
other columns of Z are set to zero.
minrgp, rtol1, rtol2 REAL for slarrv/clarrv

Parameters for bisection. An interval [LEFT,RIGHT] has converged if
RIGHT-LEFT.LT.MAX( rtol1*gap, rtol2*max(|LEFT|,|RIGHT|) ).
w REAL for slarrv/clarrv

Array, DIMENSION (n). The first m elements of w contain the approximate
eigenvalues for which eigenvectors are to be computed. The eigenvalues
should be grouped by split-off block and ordered from smallest to largest
within the block (the output array w from ?larre is expected here). These
eigenvalues are set with respect to the shift of the corresponding root
representation for their block.
werr REAL for slarrv/clarrv

Array, DIMENSION (n). The first m elements contain the semiwidth of the
uncertainty interval of the corresponding eigenvalue in w.
wgap REAL for slarrv/clarrv

Array, DIMENSION (n). The separation from the right neighbor eigenvalue in
w.
iblock INTEGER. Array, DIMENSION (n).

The indices of the blocks (submatrices) associated with the corresponding
eigenvalues in w; iblock(i)=1 if eigenvalue w(i) belongs to the first block
from the top, =2 if w(i) belongs to the second block, etc.
indexw INTEGER. Array, DIMENSION (n).

indexw(i)= 10 and iblock(i)=2 imply that the i-th eigenvalue w(i) is the
10-th eigenvalue in the second block.
gers REAL for slarrv/clarrv

Array, DIMENSION (2*n). The n Gerschgorin intervals (the i-th Gerschgorin
interval is (gers(2*i-1), gers(2*i))). The Gerschgorin intervals should
be computed from the original unshifted matrix.
ldz INTEGER. The leading dimension of the output array Z. ldz ≥ 1, and if jobz
= 'V', ldz ≥ max(1,n).
work REAL for slarrv/clarrv

iwork INTEGER.
Output Parameters
d On exit, d may be overwritten.
l On exit, l is overwritten.
w On exit, w holds the eigenvalues of the unshifted matrix.
werr On exit, werr contains refined values of its input approximations.
wgap On exit, wgap contains refined values of its input approximations. Very
small gaps are changed.
z REAL for slarrv

DOUBLE PRECISION for dlarrv
COMPLEX for clarrv
DOUBLE COMPLEX for zlarrv
Array, DIMENSION (ldz, max(1,m) ).
If info = 0, the first m columns of z contain the orthonormal eigenvectors of
the matrix T corresponding to the input eigenvalues, with the i-th column of
z holding the eigenvector associated with w(i).
NOTE
The user must ensure that at least max(1,m) columns are supplied in
the array z.
isuppz INTEGER.
Array, DIMENSION (2*max(1,m)). The support of the eigenvectors in z, that
is, the indices indicating the nonzero elements in z. The i-th eigenvector is
nonzero only in elements isuppz(2*i-1) through isuppz(2*i).
info INTEGER.
If info = 0: successful exit
If info > 0: A problem occurred in ?larrv. If info = 5, the Rayleigh
Quotient Iteration failed to converge to full accuracy.
If info < 0: One of the called subroutines signaled an internal problem.
Inspect the corresponding parameter info for further information.
If info = -1, there is a problem in ?larrb when refining a child
eigenvalue;
If info = -2, there is a problem in ?larrf when computing the
relatively robust representation (RRR) of a child. When a child is inside a
tight cluster, it can be difficult to find an RRR. A partial remedy from the
user's point of view is to make the parameter minrgp smaller and
recompile. However, as the orthogonality of the computed vectors is
proportional to 1/minrgp, you should be aware that you might be
trading away precision when you decrease minrgp.
If info = -3, there is a problem in ?larrb when refining a single
eigenvalue after the Rayleigh correction was rejected.
See Also
?larrb
?larre
?larrf
?lartg
Generates a plane rotation with real cosine and
real/complex sine.
Syntax
call slartg( f, g, cs, sn, r )
call dlartg( f, g, cs, sn, r )
call clartg( f, g, cs, sn, r )
call zlartg( f, g, cs, sn, r )
Include Files
mkl.fi
Description
The routine generates a plane rotation so that
[  cs          sn ] . [ f ] = [ r ]
[ -conjg(sn)   cs ]   [ g ]   [ 0 ]
where cs^2 + |sn|^2 = 1. For real flavors, conjg(sn) = sn.
This is a slower, more accurate version of the BLAS Level 1 routine ?rotg, except for the following
differences.
For slartg/dlartg:
f and g are unchanged on return;
If g=0, then cs=1 and sn=0;
If f=0 and g ≠ 0, then cs=0 and sn=1 without doing any floating point operations (saves work in ?bdsqr
when there are zeros on the diagonal);
If f exceeds g in magnitude, cs will be positive.
For clartg/zlartg:
f and g are unchanged on return;
If g=0, then cs=1 and sn=0;
If f=0, then cs=0 and sn is chosen so that r is real.
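The conventions listed for the real flavors can be illustrated with a short pure-Python sketch. It is a naive illustration of the rotation and its special cases, assuming real inputs; it is not the MKL implementation, and the function name plane_rotation is hypothetical.

```python
import math


def plane_rotation(f, g):
    """Return (cs, sn, r) with cs*f + sn*g = r and -sn*f + cs*g = 0 (real case)."""
    if g == 0.0:
        return 1.0, 0.0, f          # nothing to annihilate
    if f == 0.0:
        return 0.0, 1.0, g          # swap components, no arithmetic needed
    r = math.hypot(f, g)
    cs, sn = f / r, g / r
    # Convention from the text: cs is positive when |f| exceeds |g|.
    if abs(f) > abs(g) and cs < 0.0:
        cs, sn, r = -cs, -sn, -r
    return cs, sn, r


cs, sn, r = plane_rotation(3.0, 4.0)
print(cs, sn, r)
```

Note that r may be negative under this convention, for example when f is negative and larger in magnitude than g.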
Input Parameters
f, g REAL for slartg

DOUBLE PRECISION for dlartg
COMPLEX for clartg
DOUBLE COMPLEX for zlartg
The first and second component of vector to be rotated.
Output Parameters
cs REAL for slartg/clartg

DOUBLE PRECISION for dlartg/zlartg
The cosine of the rotation.
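sn REAL for slartg

DOUBLE PRECISION for dlartg
COMPLEX for clartg
DOUBLE COMPLEX for zlartg
The sine of the rotation.
r REAL for slartg

DOUBLE PRECISION for dlartg
COMPLEX for clartg
DOUBLE COMPLEX for zlartg
The nonzero component of the rotated vector.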
?lartgp
Generates a plane rotation.
Syntax
call slartgp( f, g, cs, sn, r )
call dlartgp( f, g, cs, sn, r )
call lartgp( f,g,cs,sn,r )
Include Files
mkl.fi
Description
The routine generates a plane rotation so that
[  cs  sn ] . [ f ] = [ r ]
[ -sn  cs ]   [ g ]   [ 0 ]
where cs^2 + sn^2 = 1
This is a slower, more accurate version of the BLAS Level 1 routine ?rotg, except for the following
differences:
f and g are unchanged on return.
If g=0, then cs=(+/-)1 and sn=0.
If f=0 and g ≠ 0, then cs=0 and sn=(+/-)1.
The sign is chosen so that r ≥ 0.
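The sign rule can be made concrete with a small pure-Python sketch, assuming real inputs. It only illustrates how the signs of cs and sn are picked so that r is nonnegative; it is not MKL code, and plane_rotation_nonneg is a hypothetical name.

```python
import math


def plane_rotation_nonneg(f, g):
    """Return (cs, sn, r) with cs*f + sn*g = r >= 0 and -sn*f + cs*g = 0."""
    if g == 0.0:
        return math.copysign(1.0, f), 0.0, abs(f)   # cs = +/-1
    if f == 0.0:
        return 0.0, math.copysign(1.0, g), abs(g)   # sn = +/-1
    r = math.hypot(f, g)                            # always positive here
    return f / r, g / r, r


print(plane_rotation_nonneg(-3.0, 0.0))
print(plane_rotation_nonneg(-3.0, 4.0))
```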
Input Parameters
f, g REAL for slartgp

DOUBLE PRECISION for dlartgp
The first and second component of the vector to be rotated.
Output Parameters
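cs REAL for slartgp

DOUBLE PRECISION for dlartgp
The cosine of the rotation.
sn REAL for slartgp

DOUBLE PRECISION for dlartgp
The sine of the rotation.
r REAL for slartgp

DOUBLE PRECISION for dlartgp
The nonzero component of the rotated vector.

If info = -1, f is NaN.
If info = -2, g is NaN.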

Specific details for the routine ?lartgp interface are as follows:
f Holds the first component of the vector to be rotated.
g Holds the second component of the vector to be rotated.
cs Holds the cosine of the rotation.
sn Holds the sine of the rotation.
r Holds the nonzero component of the rotated vector.
See Also
?rotg
?lartg
?lartgs
?lartgs
Generates a plane rotation designed to introduce a
bulge in implicit QR iteration for the bidiagonal SVD
problem.
Syntax
call slartgs( x, y, sigma, cs, sn )
call dlartgs( x, y, sigma, cs, sn )
call lartgs( x,y,sigma,cs,sn )
Include Files
mkl.fi
Description
The routine generates a plane rotation designed to introduce a bulge in Golub-Reinsch-style implicit QR
iteration for the bidiagonal SVD problem. x and y are the top-row entries, and sigma is the shift. The
computed cs and sn define a plane rotation that satisfies the following:
[  cs  sn ] . [ x^2 - sigma ] = [ r ]
[ -sn  cs ]   [    x * y    ]   [ 0 ]
with r nonnegative.
If x^2 - sigma and x * y are 0, the rotation is by π/2.
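As a rough illustration, the relation above can be evaluated directly in pure Python. The sketch forms the two-component vector exactly as written in this description and rotates it so that the second component vanishes; it performs no scaling against overflow, is not MKL code, and bulge_rotation is a hypothetical name.

```python
import math


def bulge_rotation(x, y, sigma):
    """Return (cs, sn) that rotate (x*x - sigma, x*y) onto (r, 0) with r >= 0."""
    a = x * x - sigma
    b = x * y
    if a == 0.0 and b == 0.0:
        return 0.0, 1.0             # degenerate case: rotation by pi/2
    r = math.hypot(a, b)
    return a / r, b / r


cs, sn = bulge_rotation(2.0, 1.5, 0.0)
print(cs, sn)
```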
Input Parameters
x, y REAL for slartgs

DOUBLE PRECISION for dlartgs
The (1,1) and (1,2) entries of an upper bidiagonal matrix, respectively.
sigma REAL for slartgs

Shift
Output Parameters
cs REAL for slartgs

DOUBLE PRECISION for dlartgs
The cosine of the rotation.
sn REAL for slartgs

DOUBLE PRECISION for dlartgs
The sine of the rotation.

Specific details for the routine ?lartgs interface are as follows:
x Holds the (1,1) entry of an upper bidiagonal matrix.
y Holds the (1,2) entry of an upper bidiagonal matrix.
sigma Holds the shift.
cs Holds the cosine of the rotation.
sn Holds the sine of the rotation.
See Also
?lartg
?lartgp
?lartv
Applies a vector of plane rotations with real cosines
and real/complex sines to the elements of a pair of
vectors.
Syntax
call slartv( n, x, incx, y, incy, c, s, incc )
call dlartv( n, x, incx, y, incy, c, s, incc )
call clartv( n, x, incx, y, incy, c, s, incc )
call zlartv( n, x, incx, y, incy, c, s, incc )
Include Files
mkl.fi
Description
The routine applies a vector of real/complex plane rotations with real cosines to elements of the real/complex
vectors x and y. For i = 1,2,...,n
( x(i) ) := (  c(i)           s(i) ) ( x(i) )
( y(i) )    ( -conjg(s(i))    c(i) ) ( y(i) )
Input Parameters
n INTEGER. The number of plane rotations to be applied.
x, y REAL for slartv

DOUBLE PRECISION for dlartv
COMPLEX for clartv
DOUBLE COMPLEX for zlartv
Arrays, DIMENSION (1+(n-1)*incx) and (1+(n-1)*incy), respectively. The
input vectors x and y.
incx INTEGER. The increment between elements of x. incx > 0.
incy INTEGER. The increment between elements of y. incy > 0.
c REAL for slartv/clartv

DOUBLE PRECISION for dlartv/zlartv
Array, DIMENSION (1+(n-1)*incc).
The cosines of the plane rotations.
s REAL for slartv

DOUBLE PRECISION for dlartv
COMPLEX for clartv
DOUBLE COMPLEX for zlartv
Array, DIMENSION (1+(n-1)*incc).
The sines of the plane rotations.
incc INTEGER. The increment between elements of c and s. incc > 0.
Output Parameters
x, y The rotated vectors x and y.
?laruv
Returns a vector of n random real numbers from a
uniform distribution.
Syntax
call slaruv( iseed, n, x )
call dlaruv( iseed, n, x )
Include Files
mkl.fi
Description
The routine ?laruv returns a vector of n random real numbers from a uniform (0,1) distribution (n ≤ 128).
This is an auxiliary routine called by ?larnv.
Input Parameters
iseed INTEGER. Array, DIMENSION (4). On entry, the seed of the random number
generator; the array elements must be between 0 and 4095, and iseed(4)
must be odd.
n INTEGER. The number of random numbers to be generated. n ≤ 128.
Output Parameters
x REAL for slaruv

DOUBLE PRECISION for dlaruv
Array, DIMENSION (n). The generated random numbers.
iseed On exit, the seed is updated.
?larz
Applies an elementary reflector (as returned by
?tzrzf) to a general matrix.
Syntax
call slarz( side, m, n, l, v, incv, tau, c, ldc, work )
call dlarz( side, m, n, l, v, incv, tau, c, ldc, work )
call clarz( side, m, n, l, v, incv, tau, c, ldc, work )
call zlarz( side, m, n, l, v, incv, tau, c, ldc, work )
Include Files
mkl.fi
Description
The routine ?larz applies a real/complex elementary reflector H to a real/complex m-by-n matrix C, from
either the left or the right. H is represented in the form
H = I-tau*v*v^T for real flavors and H = I-tau*v*v^H for complex flavors,
where tau is a real/complex scalar and v is a real/complex vector, respectively.
For complex flavors, to apply H^H (the conjugate transpose of H), supply conjg(tau) instead of tau.
H is a product of k elementary reflectors as returned by ?tzrzf.
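To make the formula H = I - tau*v*v^T concrete, the following pure-Python sketch applies such a reflector from the left to a small real matrix. It assumes v is given as a full-length vector of m entries, whereas ?larz works with the compact form produced by ?tzrzf in which only l entries are meaningful; the function name apply_reflector_left is hypothetical and this is not MKL code.

```python
def apply_reflector_left(v, tau, c):
    """Overwrite c (a list of rows) with (I - tau*v*v^T) * c for real data."""
    m, n = len(c), len(c[0])
    for j in range(n):
        # w = v^T * C(:, j), the projection of column j onto v.
        w = sum(v[i] * c[i][j] for i in range(m))
        for i in range(m):
            c[i][j] -= tau * v[i] * w
    return c


# With v = (1, 1) and tau = 2 / (v^T v) = 1, H is the reflection [[0, -1], [-1, 0]].
result = apply_reflector_left([1.0, 1.0], 1.0, [[1.0, 2.0], [3.0, 4.0]])
print(result)
```

Forming H explicitly is never needed: each column costs one dot product and one scaled vector update.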
Input Parameters
side CHARACTER*1.
If side = 'L': form H*C
If side = 'R': form C*H
l INTEGER. The number of entries of the vector v containing the meaningful
part of the Householder vectors.
If side = 'L', m ≥ l ≥ 0,
if side = 'R', n ≥ l ≥ 0.
v REAL for slarz

DOUBLE PRECISION for dlarz
COMPLEX for clarz
DOUBLE COMPLEX for zlarz
Array, DIMENSION (1+(l-1)*abs(incv)).
The vector v in the representation of H as returned by ?tzrzf.
v is not used if tau = 0.
incv INTEGER. The increment between elements of v.
incv ≠ 0.
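tau REAL for slarz

DOUBLE PRECISION for dlarz
COMPLEX for clarz
DOUBLE COMPLEX for zlarz
The value tau in the representation of H.
c REAL for slarz

DOUBLE PRECISION for dlarz
COMPLEX for clarz
DOUBLE COMPLEX for zlarz
Array, DIMENSION (ldc,n). On entry, the m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c.
ldc ≥ max(1,m).
work REAL for slarz

DOUBLE PRECISION for dlarz
COMPLEX for clarz
DOUBLE COMPLEX for zlarz
Workspace array, DIMENSION
(n) if side = 'L', or
(m) if side = 'R'.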
Output Parameters
c On exit, C is overwritten by the matrix H*C if side = 'L', or C*H if side

= 'R'.
?larzb
Applies a block reflector or its transpose/conjugate-
transpose to a general matrix.
Syntax
call slarzb( side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, c, ldc, work,
ldwork )
call dlarzb( side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, c, ldc, work,
ldwork )
call clarzb( side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, c, ldc, work,
ldwork )
call zlarzb( side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, c, ldc, work,
ldwork )
Include Files
mkl.fi
Description
The routine applies a real/complex block reflector H or its transpose H^T (or the conjugate transpose H^H for
complex flavors) to a real/complex m-by-n matrix C from the left or the right. Currently, only
storev = 'R' and direct = 'B' are supported.
Input Parameters
side CHARACTER*1.
If side = 'L': apply H or H^T/H^H from the left
If side = 'R': apply H or H^T/H^H from the right
trans CHARACTER*1.
If trans = 'N': apply H (no transpose)
If trans = 'C': apply H^H (conjugate transpose)
If trans = 'T': apply H^T (transpose)
direct CHARACTER*1.
Indicates how H is formed from a product of elementary reflectors
= 'F': H = H(1)*H(2)*...*H(k) (forward, not supported)
= 'B': H = H(k)*...*H(2)*H(1) (backward)
storev CHARACTER*1.
Indicates how the vectors which define the elementary reflectors are
stored:
= 'C': Column-wise (not supported)
= 'R': Row-wise.
k INTEGER. The order of the matrix T (equal to the number of elementary
reflectors whose product defines the block reflector).
l INTEGER. The number of columns of the matrix V containing the meaningful
part of the Householder reflectors.
If side = 'L', m ≥ l ≥ 0, if side = 'R', n ≥ l ≥ 0.
v REAL for slarzb

DOUBLE PRECISION for dlarzb
COMPLEX for clarzb
DOUBLE COMPLEX for zlarzb
Array, DIMENSION (ldv, nv).
If storev = 'C', nv = k;
if storev = 'R', nv = l.

ldv INTEGER. The leading dimension of the array v.
If storev = 'C', ldv ≥ l; if storev = 'R', ldv ≥ k.
t REAL for slarzb

DOUBLE PRECISION for dlarzb
COMPLEX for clarzb
DOUBLE COMPLEX for zlarzb
Array, DIMENSION (ldt,k). The triangular k-by-k matrix T in the
representation of the block reflector.
ldt INTEGER. The leading dimension of the array t.
ldt ≥ k.
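c REAL for slarzb

DOUBLE PRECISION for dlarzb
COMPLEX for clarzb
DOUBLE COMPLEX for zlarzb
Array, DIMENSION (ldc,n). On entry, the m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c.
ldc ≥ max(1,m).
work REAL for slarzb
DOUBLE PRECISION for dlarzb
COMPLEX for clarzb
DOUBLE COMPLEX for zlarzb
Workspace array, DIMENSION (ldwork, k).
ldwork INTEGER. The leading dimension of the array work.
If side = 'L', ldwork ≥ max(1, n);
if side = 'R', ldwork ≥ max(1, m).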
Output Parameters
c On exit, C is overwritten by H*C, or H^T/H^H*C, or C*H, or C*H^T/H^H.
?larzt
Forms the triangular factor T of a block reflector
H = I - V*T*V^H.
Syntax
call slarzt( direct, storev, n, k, v, ldv, tau, t, ldt )
call dlarzt( direct, storev, n, k, v, ldv, tau, t, ldt )
call clarzt( direct, storev, n, k, v, ldv, tau, t, ldt )
call zlarzt( direct, storev, n, k, v, ldv, tau, t, ldt )
Include Files
mkl.fi
Description
The routine forms the triangular factor T of a real/complex block reflector H of order > n, which is defined as
a product of k elementary reflectors.
If direct = 'F', H = H(1)*H(2)*...*H(k), and T is upper triangular.
If direct = 'B', H = H(k)*...*H(2)*H(1), and T is lower triangular.
If storev = 'C', the vector which defines the elementary reflector H(i) is stored in the i-th column of the
array v, and H = I-V*T*V^T (for real flavors) or H = I-V*T*V^H (for complex flavors).
If storev = 'R', the vector which defines the elementary reflector H(i) is stored in the i-th row of the array
v, and H = I-V^T*T*V (for real flavors) or H = I-V^H*T*V (for complex flavors).
Currently, only storev = 'R' and direct = 'B' are supported.
Input Parameters
direct CHARACTER*1.
Specifies the order in which the elementary reflectors are multiplied to form
the block reflector:
If direct = 'F': H = H(1)*H(2)*...*H(k) (forward, not supported)
If direct = 'B': H = H(k)*...*H(2)*H(1) (backward)
storev CHARACTER*1.
Specifies how the vectors which define the elementary reflectors are stored
(see also Application Notes below):
If storev = 'C': column-wise (not supported)
If storev = 'R': row-wise
n INTEGER. The order of the block reflector H. n ≥ 0.
k INTEGER. The order of the triangular factor T (equal to the number of
elementary reflectors). k ≥ 1.
v REAL for slarzt

DOUBLE PRECISION for dlarzt
COMPLEX for clarzt
DOUBLE COMPLEX for zlarzt
Array, DIMENSION
(ldv, k) if storev = 'C'
(ldv, n) if storev = 'R'
The matrix V.
ldv INTEGER. The leading dimension of the array v.
If storev = 'C', ldv ≥ max(1,n); if storev = 'R', ldv ≥ k.
tau REAL for slarzt

DOUBLE PRECISION for dlarzt
COMPLEX for clarzt
DOUBLE COMPLEX for zlarzt
Array, DIMENSION (k). tau(i) must contain the scalar factor of the
elementary reflector H(i).
ldt INTEGER. The leading dimension of the output array t.
ldt ≥ k.
Output Parameters
t REAL for slarzt

DOUBLE PRECISION for dlarzt
COMPLEX for clarzt
DOUBLE COMPLEX for zlarzt
Array, DIMENSION (ldt,k). The k-by-k triangular factor T of the block
reflector. If direct = 'F', T is upper triangular; if direct = 'B', T is
lower triangular. The rest of the array is not used.
v The matrix V. See Application Notes below.
Application Notes
?las2
Computes singular values of a 2-by-2 triangular
matrix.
Syntax
call slas2( f, g, h, ssmin, ssmax )
call dlas2( f, g, h, ssmin, ssmax )
Include Files
mkl.fi
Description
The routine ?las2 computes the singular values of the 2-by-2 matrix
[ f  g ]
[ 0  h ]
On return, ssmin is the smaller singular value and ssmax is the larger singular value.
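For intuition, the two singular values can be obtained from the textbook identities ssmax^2 + ssmin^2 = f^2 + g^2 + h^2 and ssmax*ssmin = |f*h|. The pure-Python sketch below uses those identities directly; it is a naive formula that can overflow or lose accuracy for extreme inputs, so it does not reproduce the accuracy properties of ?las2, and the name singular_values_2x2 is hypothetical.

```python
import math


def singular_values_2x2(f, g, h):
    """Return (ssmin, ssmax) of the upper triangular matrix [[f, g], [0, h]]."""
    s = f * f + g * g + h * h                # ssmax^2 + ssmin^2
    p = abs(f * h)                           # ssmax * ssmin
    disc = math.sqrt(max(s * s - 4.0 * p * p, 0.0))
    ssmax = math.sqrt(0.5 * (s + disc))
    ssmin = p / ssmax if ssmax > 0.0 else 0.0
    return ssmin, ssmax


print(singular_values_2x2(3.0, 0.0, 4.0))
print(singular_values_2x2(1.0, 1.0, 1.0))
```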
Input Parameters
f, g, h REAL for slas2

DOUBLE PRECISION for dlas2
The (1,1), (1,2) and (2,2) elements of the 2-by-2 matrix, respectively.
Output Parameters
ssmin, ssmax REAL for slas2

DOUBLE PRECISION for dlas2
The smaller and the larger singular values, respectively.
Application Notes
Barring over/underflow, all output quantities are correct to within a few units in the last place (ulps), even in
the absence of a guard digit in addition/subtraction. In IEEE arithmetic, the code works correctly if one matrix
element is infinite. Overflow will not occur unless the largest singular value itself overflows, or is within a few
ulps of overflow. (On machines with partial overflow, like the Cray, overflow may occur if the largest singular
value is within a factor of 2 of overflow.) Underflow is harmless if underflow is gradual. Otherwise, results
may correspond to a matrix modified by perturbations of size near the underflow threshold.
?lascl
Multiplies a general rectangular matrix by a real scalar
defined as cto/cfrom.
Syntax
call slascl( type, kl, ku, cfrom, cto, m, n, a, lda, info )
call dlascl( type, kl, ku, cfrom, cto, m, n, a, lda, info )
call clascl( type, kl, ku, cfrom, cto, m, n, a, lda, info )
call zlascl( type, kl, ku, cfrom, cto, m, n, a, lda, info )
Include Files
mkl.fi
Description
The routine ?lascl multiplies the m-by-n real/complex matrix A by the real scalar cto/cfrom. The operation
is performed without over/underflow as long as the final result cto*A(i,j)/cfrom does not over/underflow.
type specifies that A may be full, upper triangular, lower triangular, upper Hessenberg, or banded.
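One way to honor the no-overflow guarantee is to apply the factor cto/cfrom in several safe stages rather than forming the ratio at once. The pure-Python sketch below follows that staged idea for a full matrix stored as a list of rows; it is an illustration under that assumption, not the MKL implementation, and scale_matrix is a hypothetical name.

```python
import sys


def scale_matrix(a, cfrom, cto):
    """Multiply every entry of a (list of rows) by cto/cfrom in safe stages."""
    smlnum = sys.float_info.min          # smallest positive normal number
    bignum = 1.0 / smlnum
    cfromc, ctoc = cfrom, cto
    done = False
    while not done:
        cfrom1 = cfromc * smlnum
        cto1 = ctoc / bignum
        if abs(cfrom1) > abs(ctoc) and ctoc != 0.0:
            mul, cfromc = smlnum, cfrom1      # ratio too small: shrink in steps
        elif abs(cto1) > abs(cfromc):
            mul, ctoc = bignum, cto1          # ratio too large: grow in steps
        else:
            mul, done = ctoc / cfromc, True   # remaining ratio is representable
        for row in a:
            for j in range(len(row)):
                row[j] *= mul
    return a


print(scale_matrix([[1.0, 2.0], [3.0, 4.0]], 2.0, 6.0))
print(scale_matrix([[1e-300]], 1e-200, 1e200))
```

In the second call the ratio cto/cfrom is about 1e400, which is not representable in double precision, yet the scaled entry (about 1e100) is.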
Input Parameters
type CHARACTER*1. This parameter specifies the storage type of the input
matrix.
= 'G': A is a full matrix.
= 'L': A is a lower triangular matrix.
= 'U': A is an upper triangular matrix.
= 'H': A is an upper Hessenberg matrix.
= 'B': A is a symmetric band matrix with lower bandwidth kl and upper
bandwidth ku and with only the lower half stored.
= 'Q': A is a symmetric band matrix with lower bandwidth kl and upper
bandwidth ku and with only the upper half stored.
= 'Z': A is a band matrix with lower bandwidth kl and upper bandwidth ku.
See description of the ?gbtrf function for storage details.
kl INTEGER. The lower bandwidth of A. Referenced only if type = 'B', 'Q' or

'Z'.
ku INTEGER. The upper bandwidth of A. Referenced only if type = 'B', 'Q'

or 'Z'.
cfrom, cto REAL for slascl/clascl

DOUBLE PRECISION for dlascl/zlascl
The matrix A is multiplied by cto/cfrom. A(i,j) is computed without over/
underflow if the final result cto*A(i,j)/cfrom can be represented without
over/underflow. cfrom must be nonzero.
a REAL for slascl

DOUBLE PRECISION for dlascl
COMPLEX for clascl
DOUBLE COMPLEX for zlascl
Array, size (lda, n). The matrix to be multiplied by cto/cfrom. See type for
the storage type.

lda INTEGER. The leading dimension of the array a.
lda ≥ max(1,m).
Output Parameters
a The multiplied matrix A.
info INTEGER.
If info = 0 - successful exit
See Also
?gbtrf
?lasd0
Computes the singular values of a real upper
bidiagonal n-by-m matrix B with diagonal d and off-
diagonal e. Used by ?bdsdc.
Syntax
call slasd0( n, sqre, d, e, u, ldu, vt, ldvt, smlsiz, iwork, work, info )
call dlasd0( n, sqre, d, e, u, ldu, vt, ldvt, smlsiz, iwork, work, info )
Include Files
mkl.fi
Description
Using a divide and conquer approach, the routine ?lasd0 computes the singular value decomposition (SVD)
of a real upper bidiagonal n-by-m matrix B with diagonal d and off-diagonal e, where m = n + sqre.
The algorithm computes orthogonal matrices U and VT such that B = U*S*VT. The singular values S are
overwritten on d.
The related subroutine ?lasda computes only the singular values, and optionally, the singular vectors in
compact form.
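The relationship between d, e, sqre and the shape of B can be shown with a short pure-Python sketch that assembles the dense n-by-m matrix. It only illustrates the storage convention described above (d of length n, e of length m-1, m = n + sqre); it does not compute the SVD, and the name bidiagonal is hypothetical.

```python
def bidiagonal(d, e, sqre):
    """Build the dense n-by-m upper bidiagonal matrix with m = n + sqre."""
    n = len(d)
    m = n + sqre
    b = [[0.0] * m for _ in range(n)]
    for i in range(n):
        b[i][i] = d[i]              # main diagonal
        if i + 1 < m:
            b[i][i + 1] = e[i]      # entry just right of the diagonal
    return b


print(bidiagonal([1.0, 2.0], [3.0], 0))
print(bidiagonal([1.0, 2.0], [3.0, 4.0], 1))
```

With sqre = 1 the last row gains one extra off-diagonal entry, which is why e then holds n values instead of n-1.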
Input Parameters
n INTEGER. On entry, the row dimension of the upper bidiagonal matrix. This
is also the dimension of the main diagonal array d.
sqre INTEGER. Specifies the column dimension of the bidiagonal matrix.

If sqre = 0: the bidiagonal matrix has column dimension m = n.
If sqre = 1: the bidiagonal matrix has column dimension m = n+1.
d REAL for slasd0

DOUBLE PRECISION for dlasd0
Array, DIMENSION (n). On entry, d contains the main diagonal of the
bidiagonal matrix.
e REAL for slasd0

DOUBLE PRECISION for dlasd0
Array, DIMENSION (m-1). Contains the subdiagonal entries of the bidiagonal
matrix. On exit, e is destroyed.
ldu INTEGER. On entry, leading dimension of the output array u.
ldvt INTEGER. On entry, leading dimension of the output array vt.
smlsiz INTEGER. On entry, maximum size of the subproblems at the bottom of the
computation tree.
iwork INTEGER.
Workspace array, dimension must be at least (8*n).
work REAL for slasd0

DOUBLE PRECISION for dlasd0
Workspace array, dimension must be at least (3*m^2 + 2*m).
Output Parameters
d On exit, if info = 0, d contains the singular values of the bidiagonal matrix.
u REAL for slasd0

DOUBLE PRECISION for dlasd0
Array, DIMENSION at least (ldu, n). On exit, u contains the left singular
vectors.
vt REAL for slasd0

DOUBLE PRECISION for dlasd0
Array, DIMENSION at least (ldvt, m). On exit, vt^T contains the right singular
vectors.
info INTEGER.
If info = 0: successful exit.
If info = 1, a singular value did not converge.
?lasd1
Computes the SVD of an upper bidiagonal matrix B of
the specified size. Used by ?bdsdc.
Syntax
call slasd1( nl, nr, sqre, d, alpha, beta, u, ldu, vt, ldvt, idxq, iwork, work, info )
call dlasd1( nl, nr, sqre, d, alpha, beta, u, ldu, vt, ldvt, idxq, iwork, work, info )
Include Files
mkl.fi
Description
The routine computes the SVD of an upper bidiagonal n-by-m matrix B, where n = nl + nr + 1 and m = n
+ sqre.
The routine ?lasd1 is called from ?lasd0.
A related subroutine ?lasd7 handles the case in which the singular values (and the singular vectors in
factored form) are desired.
?lasd1 computes the SVD as follows:
B = U(in) * ( D1(in)   0    0       0 ) * VT(in)
            (  Z1^T    a   Z2^T     b )
            (   0      0   D2(in)   0 )
  = U(out) * ( D(out) 0 ) * VT(out)
where Z^T = (Z1^T a Z2^T b) = u^T*VT^T, and u is a vector of dimension m with alpha and beta in the nl+1 and
nl+2-th entries and zeros elsewhere; and the entry b is empty if sqre = 0.
The left singular vectors of the original matrix are stored in u, the transpose of the right singular vectors
is stored in vt, and the singular values are in d. The algorithm consists of three stages:
1. The first stage consists of deflating the size of the problem when there are multiple singular values or
when there are zeros in the Z vector. For each such occurrence the dimension of the secular equation
problem is reduced by one. This stage is performed by the routine ?lasd2.
2. The second stage consists of calculating the updated singular values. This is done by finding the square
roots of the roots of the secular equation via the routine ?lasd4 (as called by ?lasd3). This routine
also calculates the singular vectors of the current problem.
3. The final stage consists of computing the updated singular vectors directly using the updated singular
values. The singular vectors for the current problem are multiplied with the singular vectors from the
overall problem.
Input Parameters

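nl INTEGER. The row dimension of the upper block.
nl ≥ 1.
nr INTEGER. The row dimension of the lower block.
nr ≥ 1.
sqre INTEGER.
If sqre = 0: the lower block is an nr-by-nr square matrix.
If sqre = 1: the lower block is an nr-by-(nr+1) rectangular matrix. The
bidiagonal matrix has row dimension n = nl + nr + 1, and column
dimension m = n + sqre.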
d REAL for slasd1

DOUBLE PRECISION for dlasd1
Array, DIMENSION (nl+nr+1). n = nl+nr+1. On entry d(1:nl)
contains the singular values of the upper block; and d(nl+2:n) contains
the singular values of the lower block.
alpha REAL for slasd1

Contains the diagonal element associated with the added row.
beta REAL for slasd1

Contains the off-diagonal element associated with the added row.
u REAL for slasd1

Array, DIMENSION (ldu, n). On entry u(1:nl, 1:nl) contains the left
singular vectors of the upper block; u(nl+2:n, nl+2:n) contains the left
singular vectors of the lower block.
ldu INTEGER. The leading dimension of the array u.
ldu ≥ max(1, n).
vt REAL for slasd1

DOUBLE PRECISION for dlasd1
Array, DIMENSION (ldvt, m), where m = n + sqre.
On entry vt(1:nl+1, 1:nl+1)^T contains the right singular vectors of the
upper block; vt(nl+2:m, nl+2:m)^T contains the right singular vectors of
the lower block.
ldvt INTEGER. The leading dimension of the array vt.
ldvt ≥ max(1, m).
iwork INTEGER.
Workspace array, DIMENSION (4*n).
work REAL for slasd1

DOUBLE PRECISION for dlasd1
Workspace array, DIMENSION (3*m^2 + 2*m).
Output Parameters
d On exit d(1:n) contains the singular values of the modified matrix.
alpha On exit, the diagonal element associated with the added row deflated by
max( abs( alpha ), abs( beta ), abs( D(I) ) ), I = 1,n.
beta On exit, the off-diagonal element associated with the added row deflated by
max( abs( alpha ), abs( beta ), abs( D(I) ) ), I = 1,n.
u On exit u contains the left singular vectors of the bidiagonal matrix.
vt On exit vt^T contains the right singular vectors of the bidiagonal matrix.
idxq INTEGER
Array, DIMENSION (n). Contains the permutation which will reintegrate the
subproblem just solved back into sorted order, that is, d(idxq(i)),
i = 1, ..., n, will be in ascending order.
info INTEGER.
If info = 1, a singular value did not converge.
?lasd2
Merges the two sets of singular values together into a
single sorted set. Used by ?bdsdc.
Syntax
call slasd2( nl, nr, sqre, k, d, z, alpha, beta, u, ldu, vt, ldvt, dsigma, u2, ldu2,
vt2, ldvt2, idxp, idx, idxc, idxq, coltyp, info )
call dlasd2( nl, nr, sqre, k, d, z, alpha, beta, u, ldu, vt, ldvt, dsigma, u2, ldu2,
vt2, ldvt2, idxp, idx, idxc, idxq, coltyp, info )
Include Files
mkl.fi
Description
The routine ?lasd2 merges the two sets of singular values together into a single sorted set. Then it tries to
deflate the size of the problem. There are two ways in which deflation can occur: when two or more singular
values are close together or if there is a tiny entry in the Z vector. For each such occurrence the order of the
related secular equation problem is reduced by one.
Input Parameters

nl INTEGER. The row dimension of the upper block.
nl ≥ 1.
nr INTEGER. The row dimension of the lower block.
nr ≥ 1.
sqre INTEGER.
If sqre = 0: the lower block is an nr-by-nr square matrix.
If sqre = 1: the lower block is an nr-by-(nr+1) rectangular matrix. The
bidiagonal matrix has n = nl + nr + 1 rows and m = n + sqre ≥ n
columns.
d REAL for slasd2

Array, DIMENSION (n). On entry d contains the singular values of the two
submatrices to be combined.


u REAL for slasd2

Array, DIMENSION (ldu, n). On entry u contains the left singular vectors of
two submatrices in the two square blocks with corners at (1,1), (nl, nl), and
(nl+2, nl+2), (n,n).
ldu INTEGER. The leading dimension of the array u. ldu ≥ n.
ldu2 INTEGER. The leading dimension of the output array u2. ldu2 ≥ n.
vt REAL for slasd2

Array, DIMENSION (ldvt, m). On entry, vt^T contains the right singular
vectors of two submatrices in the two square blocks with corners at (1,1),
(nl+1, nl+1), and (nl+2, nl+2), (m, m).
ldvt INTEGER. The leading dimension of the array vt. ldvt ≥ m.
ldvt2 INTEGER. The leading dimension of the output array vt2. ldvt2 ≥ m.
idxp INTEGER.
Workspace array, DIMENSION (n). This will contain the permutation used to
place deflated values of D at the end of the array. On output idxp(2:k)
points to the nondeflated d-values and idxp(k+1:n) points to the deflated
singular values.
idx INTEGER.
Workspace array, DIMENSION (n). This will contain the permutation used to
sort the contents of d into ascending order.
coltyp INTEGER.
Workspace array, DIMENSION (n). As workspace, this array contains a label
that indicates which of the following types a column in the u2 matrix or a
row in the vt2 matrix is:
1 : non-zero in the upper half only
2 : non-zero in the lower half only
3 : dense
4 : deflated.
idxq INTEGER. Array, DIMENSION (n). This parameter contains the permutation
that separately sorts the two sub-problems in D in the ascending order.
Note that entries in the first half of this permutation must first be moved
one position backwards and entries in the second half must have nl+1
added to their values.
Output Parameters
k INTEGER. Contains the dimension of the non-deflated matrix. This is the
order of the related secular equation. 1 ≤ k ≤ n.
d On exit d contains the trailing (n-k) updated singular values (those which
were deflated) sorted into increasing order.
u On exit u contains the trailing (n-k) updated left singular vectors (those
which were deflated) in its last n-k columns.
z REAL for slasd2
Array, DIMENSION (n). On exit, z contains the updating row vector in the
secular equation.
dsigma REAL for slasd2

Array, DIMENSION (n). Contains a copy of the diagonal elements (k-1
singular values and one zero) in the secular equation.
u2 REAL for slasd2

Array, DIMENSION (ldu2, n). Contains a copy of the first k-1 left singular
vectors which will be used by ?lasd3 in a matrix multiply (?gemm) to solve
for the new left singular vectors. u2 is arranged into four blocks. The first
block contains a column with 1 at nl+1 and zero everywhere else; the
second block contains non-zero entries only at and above nl; the third
contains non-zero entries only below nl+1; and the fourth is dense.
vt On exit, vt^T contains the trailing (n-k) updated right singular vectors (those
which were deflated) in its last n-k columns. In case sqre =1, the last row
of vt spans the right null space.
vt2 REAL for slasd2

Array, DIMENSION (ldvt2, n). vt2^T contains a copy of the first k right
singular vectors which will be used by ?lasd3 in a matrix multiply (?gemm)
to solve for the new right singular vectors. vt2 is arranged into three blocks.
The first block contains a row that corresponds to the special 0 diagonal
element in sigma; the second block contains non-zeros only at and before
nl +1; the third block contains non-zeros only at and after nl +2.
idxc INTEGER. Array, DIMENSION (n). This will contain the permutation used to
arrange the columns of the deflated u matrix into three groups: the first
group contains non-zero entries only at and above nl, the second contains
non-zero entries only below nl+2, and the third is dense.
coltyp On exit, it is an array of dimension 4, with coltyp(i) being the dimension of
the i-th type columns.
info INTEGER.
If info = 0: successful exit
?lasd3
Finds all square roots of the roots of the secular
equation, as defined by the values in D and Z, and
then updates the singular vectors by matrix
multiplication. Used by ?bdsdc.
Syntax
call slasd3( nl, nr, sqre, k, d, q, ldq, dsigma, u, ldu, u2, ldu2, vt, ldvt, vt2,
ldvt2, idxc, ctot, z, info )
call dlasd3( nl, nr, sqre, k, d, q, ldq, dsigma, u, ldu, u2, ldu2, vt, ldvt, vt2,
ldvt2, idxc, ctot, z, info )
Include Files
mkl.fi
Description
The routine ?lasd3 finds all the square roots of the roots of the secular equation, as defined by the values in
D and Z.
It makes the appropriate calls to ?lasd4 and then updates the singular vectors by matrix multiplication.
Input Parameters

nl INTEGER. The row dimension of the upper block.
nl ≥ 1.
nr INTEGER. The row dimension of the lower block.
nr ≥ 1.
sqre INTEGER.
If sqre = 0: the lower block is an nr-by-nr square matrix.
If sqre = 1: the lower block is an nr-by-(nr+1) rectangular matrix. The
bidiagonal matrix has n = nl + nr + 1 rows and m = n + sqre ≥ n
columns.
k INTEGER. The size of the secular equation, 1 ≤ k ≤ n.
q REAL for slasd3

Workspace array, DIMENSION at least (ldq, k).
ldq INTEGER. The leading dimension of the array q.
ldq ≥ k.
dsigma REAL for slasd3

DOUBLE PRECISION for dlasd3
Array, DIMENSION (k). The first k elements of this array contain the old
roots of the deflated updating problem. These are the poles of the secular
equation.
ldu INTEGER. The leading dimension of the array u.
ldu ≥ n.
u2 REAL for slasd3
Array, DIMENSION (ldu2, n).
The first k columns of this matrix contain the non-deflated left singular
vectors for the split problem.
ldu2 INTEGER. The leading dimension of the array u2.
ldu2 ≥ n.
ldvt INTEGER. The leading dimension of the array vt.
ldvt ≥ n.
vt2 REAL for slasd3

DOUBLE PRECISION for dlasd3
Array, DIMENSION (ldvt2, n).
The first k columns of vt2^T contain the non-deflated right singular vectors
for the split problem.
ldvt2 INTEGER. The leading dimension of the array vt2.
ldvt2 ≥ n.
idxc INTEGER. Array, DIMENSION (n).

The permutation used to arrange the columns of u (and rows of vt) into
three groups: the first group contains non-zero entries only at and above
(or before) nl +1; the second contains non-zero entries only at and below
(or after) nl+2; and the third is dense. The first column of u and the row of
vt are treated separately, however. The rows of the singular vectors found
by ?lasd4 must be likewise permuted before the matrix multiplies can take
place.
ctot INTEGER. Array, DIMENSION (4). A count of the total number of the various
types of columns in u (or rows in vt), as described in idxc.
The fourth column type is any column which has been deflated.
z REAL for slasd3

Array, DIMENSION (k). The first k elements of this array contain the
components of the deflation-adjusted updating row vector.
Output Parameters
d REAL for slasd3

Array, DIMENSION (k). On exit the square roots of the roots of the secular
equation, in ascending order.
u REAL for slasd3

Array, DIMENSION (ldu, n).
The last n - k columns of this matrix contain the deflated left singular
vectors.
vt REAL for slasd3

Array, DIMENSION (ldvt, m).
The last m - k columns of vt^T contain the deflated right singular vectors.
vt2 Destroyed on exit.
z Destroyed on exit.
info INTEGER.
If info = 0: successful exit.
If info = 1, a singular value did not converge.
Application Notes
This code makes very mild assumptions about floating point arithmetic. It will work on machines with a guard
digit in add/subtract, or on those binary machines without guard digits which subtract like the Cray XMP, Cray
YMP, Cray C 90, or Cray 2. It could conceivably fail on hexadecimal or decimal machines without guard digits,
but we know of none.
?lasd4
Computes the square root of the i-th updated
eigenvalue of a positive symmetric rank-one
modification to a positive diagonal matrix. Used by
?bdsdc.
Syntax
call slasd4( n, i, d, z, delta, rho, sigma, work, info)
call dlasd4( n, i, d, z, delta, rho, sigma, work, info )
Include Files
mkl.fi
Description
The routine computes the square root of the i-th updated eigenvalue of a positive symmetric rank-one
modification to a positive diagonal matrix whose entries are given as the squares of the corresponding
entries in the array d. It is assumed that 0 ≤ d(i) < d(j) for i < j and that rho > 0. This is arranged by the
calling routine, and is no loss in generality. The rank-one modified system is thus
diag(d)*diag(d) + rho*Z*Z^T,
where the Euclidean norm of Z is equal to 1. The method consists of approximating the rational functions in
the secular equation by simpler interpolating rational functions.
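As a rough illustration of what the routine returns, the sketch below locates the i-th root of the secular
function f(x) = 1 + rho * sum_j z(j)^2 / (d(j)^2 - x) by plain bisection and takes its square root. This is
not the interpolation scheme described above; the function name secular_root, the bisection itself, and the
iteration limit are assumptions made for illustration only. It assumes n > 1 and nonzero z(i).

```python
import math

def secular_root(i, d, z, rho, iters=200):
    """Return (sigma, delta, work) for the i-th (1-based) updated value.

    Illustrative bisection only; the library routine uses rational
    interpolation rather than bisection.
    """
    n = len(d)

    def f(x):
        # Secular function of diag(d)^2 + rho * z * z^T.
        return 1.0 + rho * sum(zj * zj / (dj * dj - x) for dj, zj in zip(d, z))

    # The i-th root lies strictly between consecutive poles d(i)^2 and
    # d(i+1)^2; the last root lies below d(n)^2 + rho * ||z||^2.
    lo = d[i - 1] ** 2
    hi = d[i] ** 2 if i < n else d[-1] ** 2 + rho * sum(zj * zj for zj in z)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:
            break  # interval can no longer be split in floating point
        if f(mid) < 0.0:
            lo = mid  # f is increasing on the interval, so the root is above mid
        else:
            hi = mid
    sigma = math.sqrt(0.5 * (lo + hi))
    delta = [dj - sigma for dj in d]  # d(j) - sigma_i
    work = [dj + sigma for dj in d]   # d(j) + sigma_i
    return sigma, delta, work
```

For d = (1, 2), z = (0.6, 0.8) and rho = 1, the two returned values interlace with d: the first lies between
d(1) and d(2), and the second lies above d(2).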
Input Parameters
n INTEGER. The length of all arrays.
i INTEGER.
The index of the eigenvalue to be computed. 1 ≤ i ≤ n.
d REAL for slasd4
DOUBLE PRECISION for dlasd4.
Array, DIMENSION (n).
The original eigenvalues. They must be in order, 0 ≤ d(i) < d(j) for i < j.
z REAL for slasd4

The components of the updating vector.
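rho REAL for slasd4
DOUBLE PRECISION for dlasd4.
The scalar in the symmetric updating formula.
work REAL for slasd4
DOUBLE PRECISION for dlasd4.
Workspace array, DIMENSION (n).
If n ≠ 1, work contains (d(j) + sigma_i) in its j-th component.
If n = 1, then work(1) = 1.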
Output Parameters
delta REAL for slasd4
DOUBLE PRECISION for dlasd4.
Array, DIMENSION (n).
If n ≠ 1, delta contains (d(j) - sigma_i) in its j-th component.
If n = 1, then delta(1) = 1. The vector delta contains the information
necessary to construct the (singular) eigenvectors.
sigma REAL for slasd4

The computed sigma_i, the i-th updated eigenvalue.
info INTEGER.
= 0: successful exit;
> 0: if info = 1, the updating process failed.
?lasd5
Computes the square root of the i-th eigenvalue of a
positive symmetric rank-one modification of a 2-by-2
diagonal matrix. Used by ?bdsdc.
Syntax
call slasd5( i, d, z, delta, rho, dsigma, work )
call dlasd5( i, d, z, delta, rho, dsigma, work )
Include Files
mkl.fi
Description
The routine computes the square root of the i-th eigenvalue of a positive symmetric rank-one modification of
a 2-by-2 diagonal matrix diag(d)*diag(d) + rho*Z*Z^T.
The diagonal entries in the array d must satisfy 0 ≤ d(i) < d(j) for i < j, rho must be greater than 0, and
the Euclidean norm of the vector Z must be equal to 1.
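For the 2-by-2 case the eigenvalues can be written in closed form from the trace and determinant, which
makes a convenient cross-check. The sketch below uses the textbook quadratic formula; it is an assumption
for illustration, not the numerically careful formulation a library routine would use, and the name
rank_one_2x2_sigma is hypothetical.

```python
import math

def rank_one_2x2_sigma(i, d, z, rho):
    """Square root of the i-th (1-based) eigenvalue of diag(d)^2 + rho*z*z^T.

    Textbook closed form for a 2-by-2 symmetric matrix; illustrative only.
    Returns (sigma, delta, work) with delta(j) = d(j) - sigma and
    work(j) = d(j) + sigma.
    """
    d1sq, d2sq = d[0] ** 2, d[1] ** 2
    z1sq, z2sq = z[0] ** 2, z[1] ** 2
    trace = d1sq + d2sq + rho * (z1sq + z2sq)
    det = d1sq * d2sq + rho * (z1sq * d2sq + z2sq * d1sq)
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    # i = 1 selects the smaller eigenvalue, i = 2 the larger one.
    lam = 0.5 * (trace - disc) if i == 1 else 0.5 * (trace + disc)
    sigma = math.sqrt(lam)
    delta = [dj - sigma for dj in d]
    work = [dj + sigma for dj in d]
    return sigma, delta, work
```

With d = (1, 2), z = (0.6, 0.8) and rho = 1 the trace is 6 and the determinant is 6.08, so the squared
results are 3 - sqrt(2.92) and 3 + sqrt(2.92).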
Input Parameters
i INTEGER. The index of the eigenvalue to be computed. i = 1 or i = 2.
d REAL for slasd5
DOUBLE PRECISION for dlasd5.
Array, dimension (2).
The original eigenvalues, 0 ≤ d(1) < d(2).
z REAL for slasd5

Array, dimension ( 2 ).
The components of the updating vector.
rho REAL for slasd5
DOUBLE PRECISION for dlasd5.
The scalar in the symmetric updating formula.
work REAL for slasd5
DOUBLE PRECISION for dlasd5.
Workspace array, dimension (2). Contains (d(j) + sigma_i) in its j-th
component.
Output Parameters
delta REAL for slasd5
DOUBLE PRECISION for dlasd5.
Array, dimension (2).
Contains (d(j) - sigma_i) in its j-th component. The vector delta
contains the information necessary to construct the eigenvectors.
dsigma REAL for slasd5
DOUBLE PRECISION for dlasd5.
The computed sigma_i, the i-th updated eigenvalue.
?lasd6
Computes the SVD of an updated upper bidiagonal
matrix obtained by merging two smaller ones by
appending a row. Used by ?bdsdc.
Syntax
call slasd6( icompq, nl, nr, sqre, d, vf, vl, alpha, beta, idxq, perm, givptr, givcol,
ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, work, iwork, info )
call dlasd6( icompq, nl, nr, sqre, d, vf, vl, alpha, beta, idxq, perm, givptr, givcol,
ldgcol, givnum, ldgnum, poles, difl, difr, z, k, c, s, work, iwork, info )
Include Files
mkl.fi
Description
The routine ?lasd6 computes the SVD of an updated upper bidiagonal matrix B obtained by merging two
smaller ones by appending a row. This routine is used only for the problem which requires all singular values
and optionally singular vector matrices in factored form. B is an n-by-m matrix with n = nl + nr + 1 and m
= n + sqre. A related subroutine, ?lasd1, handles the case in which all singular values and singular vectors
of the bidiagonal matrix are desired. ?lasd6 computes the SVD as follows:
            ( D1(in)   0    0       0 )
B = U(in) * (   Z1'    a    Z2'     b ) * VT(in)
            (   0      0    D2(in)  0 )
  = U(out) * ( D(out) 0 ) * VT(out)
where Z' = (Z1' a Z2' b) = u'*VT', and u is a vector of dimension m with alpha and beta in the (nl+1)-th
and (nl+2)-th entries and zeros elsewhere; the entry b is empty if sqre = 0.
The singular values of B can be computed using D1, D2, the first components of all the right singular vectors
of the lower block, and the last components of all the right singular vectors of the upper block. These
components are stored and updated in vf and vl, respectively, in ?lasd6. Hence U and VT are not explicitly
referenced.
The singular values are stored in D. The algorithm consists of two stages:
1. The first stage consists of deflating the size of the problem when there are multiple singular values or if
there is a zero in the Z vector. For each such occurrence the dimension of the secular equation problem
is reduced by one. This stage is performed by the routine ?lasd7.
2. The second stage consists of calculating the updated singular values. This is done by finding the roots
of the secular equation via the routine ?lasd4 (as called by ?lasd8). This routine also updates vf and
vl and computes the distances between the updated singular values and the old singular values.
?lasd6 is called from ?lasda.
Input Parameters

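icompq INTEGER. Specifies whether singular vectors are to be computed in factored
form:
= 0: Compute singular values only.
= 1: Compute singular vectors in factored form as well.
nl INTEGER. The row dimension of the upper block.
nl ≥ 1.
nr INTEGER. The row dimension of the lower block.
nr ≥ 1.
sqre INTEGER.
= 0: the lower block is an nr-by-nr square matrix.
= 1: the lower block is an nr-by-(nr+1) rectangular matrix.
The bidiagonal matrix has row dimension n = nl + nr + 1, and column
dimension m = n + sqre.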
d REAL for slasd6
DOUBLE PRECISION for dlasd6.
Array, dimension (nl+nr+1). On entry d(1:nl) contains the singular
values of the upper block, and d(nl+2:n) contains the singular values of the
lower block.
vf REAL for slasd6
DOUBLE PRECISION for dlasd6.
Array, dimension (m).
On entry, vf(1:nl+1) contains the first components of all right singular
vectors of the upper block; and vf(nl+2:m) contains the first components
of all right singular vectors of the lower block.
vl REAL for slasd6
DOUBLE PRECISION for dlasd6.
Array, dimension (m).
On entry, vl(1:nl+1) contains the last components of all right singular
vectors of the upper block; and vl(nl+2:m) contains the last components of
all right singular vectors of the lower block.

ldgcol INTEGER. The leading dimension of the output array givcol, must be at least
n.
ldgnum INTEGER.
The leading dimension of the output arrays givnum and poles, must be at
least n.

work REAL for slasd6
DOUBLE PRECISION for dlasd6.
Workspace array, dimension (4m).
iwork INTEGER.
Workspace array, dimension ( 3n ).
Output Parameters
d On exit d(1:n) contains the singular values of the modified matrix.
vf On exit, vf contains the first components of all right singular vectors of the
bidiagonal matrix.
vl On exit, vl contains the last components of all right singular vectors of the
bidiagonal matrix.
alpha On exit, the diagonal element associated with the added row deflated by
max(abs(alpha), abs(beta), abs(D(I))), I = 1,n.
beta On exit, the off-diagonal element associated with the added row deflated by
max(abs(alpha), abs(beta), abs(D(I))), I = 1,n.
idxq INTEGER.
Array, dimension (n). This contains the permutation which will reintegrate
the subproblem just solved back into sorted order, that is, d( idxq( i =
1, n ) ) will be in ascending order.
perm INTEGER.
Array, dimension (n). The permutations (from deflation and sorting) to be
applied to each block. Not referenced if icompq = 0.
givptr INTEGER. The number of Givens rotations which took place in this
subproblem. Not referenced if icompq = 0.
givcol INTEGER.
Array, dimension ( ldgcol, 2 ). Each pair of numbers indicates a pair of
columns to take place in a Givens rotation. Not referenced if icompq = 0.
givnum REAL for slasd6

Array, dimension ( ldgnum, 2 ). Each number indicates the C or S value to
be used in the corresponding Givens rotation. Not referenced if icompq =
0.
poles REAL for slasd6

Array, dimension ( ldgnum, 2 ). On exit, poles(1,*) is an array containing
the new singular values obtained from solving the secular equation, and
poles(2,*) is an array containing the poles in the secular equation. Not
referenced if icompq = 0.
difl REAL for slasd6

Array, dimension (n). On exit, difl(i) is the distance between i-th updated
(undeflated) singular value and the i-th (undeflated) old singular value.
difr REAL for slasd6

Array, dimension (ldgnum, 2 ) if icompq = 1 and dimension (n) if
icompq = 0.
On exit, difr(i, 1) is the distance between i-th updated (undeflated) singular
value and the i+1-th (undeflated) old singular value. If icompq = 1,
difr(1: k, 2) is an array containing the normalizing factors for the right
singular vector matrix.
See ?lasd8 for details on difl and difr.
z REAL for slasd6

The first elements of this array contain the components of the deflation-
adjusted updating row vector.
k INTEGER. Contains the dimension of the non-deflated matrix. This is the
order of the related secular equation. 1 ≤ k ≤ n.
c REAL for slasd6

c contains garbage if sqre =0 and the C-value of a Givens rotation related
to the right null space if sqre = 1.
s REAL for slasd6

s contains garbage if sqre =0 and the S-value of a Givens rotation related
to the right null space if sqre = 1.
info INTEGER.
= 0: successful exit;
< 0: if info = -i, the i-th argument had an illegal value;
> 0: if info = 1, a singular value did not converge.
?lasd7
Merges the two sets of singular values together into a
single sorted set. Then it tries to deflate the size of
the problem. Used by ?bdsdc.
Syntax
call slasd7( icompq, nl, nr, sqre, k, d, z, zw, vf, vfw, vl, vlw, alpha, beta, dsigma,
idx, idxp, idxq, perm, givptr, givcol, ldgcol, givnum, ldgnum, c, s, info )
call dlasd7( icompq, nl, nr, sqre, k, d, z, zw, vf, vfw, vl, vlw, alpha, beta, dsigma,
idx, idxp, idxq, perm, givptr, givcol, ldgcol, givnum, ldgnum, c, s, info )
Include Files
mkl.fi
Description
The routine ?lasd7 merges the two sets of singular values together into a single sorted set. Then it tries to
deflate the size of the problem. There are two ways in which deflation can occur: when two or more singular
values are close together or if there is a tiny entry in the Z vector. For each such occurrence the order of the
related secular equation problem is reduced by one. ?lasd7 is called from ?lasd6.
Input Parameters
icompq INTEGER. Specifies whether singular vectors are to be computed in compact
form, as follows:
= 0: Compute singular values only.
= 1: Compute singular vectors of upper bidiagonal matrix in compact form.
nl INTEGER. The row dimension of the upper block.
nl ≥ 1.
nr INTEGER. The row dimension of the lower block.
nr ≥ 1.
sqre INTEGER.
= 0: the lower block is an nr-by-nr square matrix.
= 1: the lower block is an nr-by-(nr+1) rectangular matrix. The bidiagonal
matrix has n = nl + nr + 1 rows and m = n + sqre columns.
d REAL for slasd7

Array, DIMENSION (n). On entry d contains the singular values of the two
submatrices to be combined.
zw REAL for slasd7

Array, DIMENSION ( m ).
Workspace for z.
vf REAL for slasd7

Array, DIMENSION ( m ). On entry, vf(1:nl+1) contains the first
components of all right singular vectors of the upper block; and vf(nl
+2:m) contains the first components of all right singular vectors of the
lower block.
vfw REAL for slasd7

Workspace for vf.
vl REAL for slasd7
DOUBLE PRECISION for dlasd7.
Array, DIMENSION (m).
On entry, vl(1:nl+1) contains the last components of all right singular
vectors of the upper block; and vl(nl+2:m) contains the last components
of all right singular vectors of the lower block.
vlw REAL for slasd7
DOUBLE PRECISION for dlasd7.
Array, DIMENSION (m).
Workspace for vl.


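idx INTEGER.
Array, DIMENSION (n). This will contain the permutation used to
sort the contents of d into ascending order.
idxp INTEGER.
Array, DIMENSION (n). This will contain the permutation used to
place deflated values of d at the end of the array.
idxq INTEGER.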
This contains the permutation which separately sorts the two sub-problems
in d into ascending order. Note that entries in the first half of this
permutation must first be moved one position backward; and entries in the
second half must first have nl+1 added to their values.
ldgcol INTEGER. The leading dimension of the output array givcol, must be at least
n.
ldgnum INTEGER. The leading dimension of the output array givnum, must be at
least n.
Output Parameters
k INTEGER. Contains the dimension of the non-deflated matrix; this is the
order of the related secular equation.
1 ≤ k ≤ n.
d On exit, d contains the trailing (n-k) updated singular values (those which
were deflated) sorted into increasing order.
z REAL for slasd7

Array, DIMENSION (m).
On exit, Z contains the updating row vector in the secular equation.
vf On exit, vf contains the first components of all right singular vectors of the
bidiagonal matrix.
vl On exit, vl contains the last components of all right singular vectors of the
bidiagonal matrix.

dsigma REAL for slasd7
DOUBLE PRECISION for dlasd7.
Array, DIMENSION (n). Contains a copy of the diagonal elements (k-1
singular values and one zero) in the secular equation.
idxp On output, idxp(2: k) points to the nondeflated d-values and idxp( k+1:n)
points to the deflated singular values.
perm INTEGER.
The permutations (from deflation and sorting) to be applied to each singular
block. Not referenced if icompq = 0.
givptr INTEGER.
The number of Givens rotations which took place in this subproblem. Not
referenced if icompq = 0.
givcol INTEGER.
Array, DIMENSION ( ldgcol, 2 ). Each pair of numbers indicates a pair of
columns to take place in a Givens rotation. Not referenced if icompq = 0.
givnum REAL for slasd7

Array, DIMENSION ( ldgnum, 2 ). Each number indicates the C or S value to
be used in the corresponding Givens rotation. Not referenced if icompq =
0.
c REAL for slasd7
DOUBLE PRECISION for dlasd7.
If sqre = 0, then c contains garbage, and if sqre = 1, then c contains the
C-value of a Givens rotation related to the right null space.
s REAL for slasd7
DOUBLE PRECISION for dlasd7.
If sqre = 0, then s contains garbage, and if sqre = 1, then s contains the
S-value of a Givens rotation related to the right null space.
info INTEGER.
= 0: successful exit;
< 0: if info = -i, the i-th argument had an illegal value.
?lasd8
Finds the square roots of the roots of the secular
equation, and stores, for each element in D, the
distance to its two nearest poles. Used by ?bdsdc.
Syntax
call slasd8( icompq, k, d, z, vf, vl, difl, difr, lddifr, dsigma, work, info )
call dlasd8( icompq, k, d, z, vf, vl, difl, difr, lddifr, dsigma, work, info )
Include Files
mkl.fi
Description
The routine ?lasd8 finds the square roots of the roots of the secular equation, as defined by the values in
dsigma and z. It makes the appropriate calls to ?lasd4, and stores, for each element in d, the distance to its
two nearest poles (elements in dsigma). It also updates the arrays vf and vl, the first and last components of
all the right singular vectors of the original bidiagonal matrix. ?lasd8 is called from ?lasd6.
Input Parameters

icompq INTEGER. Specifies whether singular vectors are to be computed in factored
form in the calling routine:
= 0: Compute singular values only.
= 1: Compute singular vectors in factored form as well.
k INTEGER. The number of terms in the rational function to be solved by
?lasd4. k ≥ 1.
z REAL for slasd8

Array, DIMENSION ( k ).
The first k elements of this array contain the components of the deflation-
adjusted updating row vector.
vf REAL for slasd8

On entry, vf contains information passed through dbede8.
vl REAL for slasd8

Array, DIMENSION ( k ). On entry, vl contains information passed through
dbede8.
lddifr INTEGER. The leading dimension of the output array difr, must be at least
k.

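dsigma REAL for slasd8
DOUBLE PRECISION for dlasd8.
Array, DIMENSION (k).
The first k elements of this array contain the old roots of the deflated
updating problem. These are the poles of the secular equation.
work REAL for slasd8
DOUBLE PRECISION for dlasd8.
Workspace array, DIMENSION at least (3k).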
Output Parameters
d REAL for slasd8
DOUBLE PRECISION for dlasd8.
Array, DIMENSION (k).
On output, d contains the updated singular values.
z Updated on exit.
vf On exit, vf contains the first k components of the first components of all
right singular vectors of the bidiagonal matrix.
vl On exit, vl contains the first k components of the last components of all
right singular vectors of the bidiagonal matrix.
difl REAL for slasd8
DOUBLE PRECISION for dlasd8.
Array, DIMENSION (k). On exit, difl(i) = d(i) - dsigma(i).
difr REAL for slasd8
DOUBLE PRECISION for dlasd8.
Array,
DIMENSION (lddifr, 2) if icompq = 1 and
DIMENSION (k) if icompq = 0.
On exit, difr(i,1) = d(i) - dsigma(i+1), difr(k,1) is not defined
and will not be referenced. If icompq = 1, difr(1:k,2) is an array
containing the normalizing factors for the right singular vector matrix.
dsigma The elements of this array may be very slightly altered in value.
info INTEGER.
> 0: If info = 1, a singular value did not converge.
?lasd9
Finds the square roots of the roots of the secular
equation, and stores, for each element in D, the
distance to its two nearest poles. Used by ?bdsdc.
Syntax
call slasd9( icompq, ldu, k, d, z, vf, vl, difl, difr, dsigma, work, info )
call dlasd9( icompq, ldu, k, d, z, vf, vl, difl, difr, dsigma, work, info )
Include Files
mkl.fi
Description
The routine ?lasd9 finds the square roots of the roots of the secular equation, as defined by the values in
dsigma and z. It makes the appropriate calls to ?lasd4, and stores, for each element in d, the distance to its
two nearest poles (elements in dsigma). It also updates the arrays vf and vl, the first and last components of
all the right singular vectors of the original bidiagonal matrix. ?lasd9 is called from ?lasd7.
Input Parameters

icompq INTEGER. Specifies whether singular vectors are to be computed in factored
form in the calling routine:
If icompq = 0, compute singular values only;
If icompq = 1, compute singular vector matrices in factored form also.
k INTEGER. The number of terms in the rational function to be solved by
?lasd4. k ≥ 1.

dsigma REAL for slasd9
DOUBLE PRECISION for dlasd9.
Array, DIMENSION(k).
The first k elements of this array contain the old roots of the deflated
updating problem. These are the poles of the secular equation.
z REAL for slasd9

Array, DIMENSION (k). The first k elements of this array contain the
components of the deflation-adjusted updating row vector.
vf REAL for slasd9

Array, DIMENSION(k). On entry, vf contains information passed through
sbede8.
vl REAL for slasd9

Array, DIMENSION(k). On entry, vl contains information passed through
sbede8.

work REAL for slasd9
DOUBLE PRECISION for dlasd9.
Workspace array, DIMENSION at least (3k).
Output Parameters
d REAL for slasd9

Array, DIMENSION(k). d(i) contains the updated singular values.
vf On exit, vf contains the first k components of the first components of all
right singular vectors of the bidiagonal matrix.
vl On exit, vl contains the first k components of the last components of all
right singular vectors of the bidiagonal matrix.
difl REAL for slasd9
DOUBLE PRECISION for dlasd9.
Array, DIMENSION (k).
On exit, difl(i) = d(i) - dsigma(i).
difr REAL for slasd9
DOUBLE PRECISION for dlasd9.
Array,
DIMENSION (ldu, 2) if icompq = 1 and
DIMENSION (k) if icompq = 0.
On exit, difr(i, 1) = d(i) - dsigma(i+1), difr(k, 1) is not defined
and will not be referenced.
If icompq = 1, difr(1:k, 2) is an array containing the normalizing
factors for the right singular vector matrix.
info INTEGER.
> 0: If info = 1, a singular value did not converge.
?lasda
Computes the singular value decomposition (SVD) of a
real upper bidiagonal matrix with diagonal d and off-
diagonal e. Used by ?bdsdc.
Syntax
call slasda( icompq, smlsiz, n, sqre, d, e, u, ldu, vt, k, difl, difr, z, poles,
givptr, givcol, ldgcol, perm, givnum, c, s, work, iwork, info )
call dlasda( icompq, smlsiz, n, sqre, d, e, u, ldu, vt, k, difl, difr, z, poles,
givptr, givcol, ldgcol, perm, givnum, c, s, work, iwork, info )
Include Files
mkl.fi
Description
Using a divide and conquer approach, ?lasda computes the singular value decomposition (SVD) of a real
upper bidiagonal n-by-m matrix B with diagonal d and off-diagonal e, where m = n + sqre.
The algorithm computes the singular values in the SVD B = U*S*VT. The orthogonal matrices U and VT are
optionally computed in compact form. A related subroutine ?lasd0 computes the singular values and the
singular vectors in explicit form.
Input Parameters
icompq INTEGER.
Specifies whether singular vectors are to be computed in compact form, as
follows:
= 0: Compute singular values only.
= 1: Compute singular vectors of upper bidiagonal matrix in compact form.
smlsiz INTEGER.
The maximum size of the subproblems at the bottom of the computation
tree.
n INTEGER. The row dimension of the upper bidiagonal matrix. This is also the
dimension of the main diagonal array d.
sqre INTEGER. Specifies the column dimension of the bidiagonal matrix.

If sqre = 0: the bidiagonal matrix has column dimension m = n
If sqre = 1: the bidiagonal matrix has column dimension m = n + 1.
d REAL for slasda
DOUBLE PRECISION for dlasda.
Array, DIMENSION (n). On entry, d contains the main diagonal of the
bidiagonal matrix.
e REAL for slasda

Array, DIMENSION ( m - 1 ). Contains the subdiagonal entries of the
bidiagonal matrix. On exit, e is destroyed.
ldu INTEGER. The leading dimension of arrays u, vt, difl, difr, poles, givnum,
and z. ldu ≥ n.
ldgcol INTEGER. The leading dimension of arrays givcol and perm. ldgcol ≥ n.
work REAL for slasda
DOUBLE PRECISION for dlasda.
Workspace array, DIMENSION (6n + (smlsiz+1)^2).
iwork INTEGER.
Workspace array, Dimension must be at least (7n).
Output Parameters
d On exit, if info = 0, d contains the singular values of the bidiagonal
matrix.
u REAL for slasda

Array, DIMENSION (ldu, smlsiz) if icompq =1.
Not referenced if icompq = 0.
If icompq = 1, on exit, u contains the left singular vector matrices of all
subproblems at the bottom level.
vt REAL for slasda
DOUBLE PRECISION for dlasda.
Array, DIMENSION (ldu, smlsiz+1) if icompq = 1, and not referenced if
icompq = 0. If icompq = 1, on exit, vt' contains the right singular vector
matrices of all subproblems at the bottom level.
k INTEGER.
Array, DIMENSION (n) if icompq = 1 and
DIMENSION (1) if icompq = 0.

If icompq = 1, on exit, k(i) is the dimension of the i-th secular equation on
the computation tree.
difl REAL for slasda

Array, DIMENSION ( ldu, nlvl ),
where nlvl = floor(log2(n/smlsiz)).
difr REAL for slasda

Array,
DIMENSION (ldu, 2*nlvl) if icompq = 1 and
DIMENSION (n) if icompq = 0.
If icompq = 1, on exit, difl(1:n, i) and difr(1:n, 2i-1) record distances
between singular values on the i-th level and singular values on the
(i-1)-th level, and difr(1:n, 2i) contains the normalizing factors for the right
singular vector matrix. See ?lasd8 for details.
z REAL for slasda

Array,
DIMENSION ( ldu, nlvl ) if icompq = 1 and
DIMENSION (n) if icompq = 0. The first k elements of z(1, i) contain the
components of the deflation-adjusted updating row vector for subproblems
on the i-th level.
poles REAL for slasda

DOUBLE PRECISION for dlasda
Array, DIMENSION(ldu, 2*nlvl)
if icompq = 1, and not referenced if icompq = 0. If icompq = 1, on exit,

poles(1, 2i - 1) and poles(1, 2i) contain the new and old singular values
involved in the secular equations on the i-th level.
givptr INTEGER. Array, DIMENSION (n) if icompq = 1, and not referenced if

icompq = 0. If icompq = 1, on exit, givptr( i ) records the number of
Givens rotations performed on the i-th problem on the computation tree.
givcol INTEGER .
Array, DIMENSION(ldgcol, 2*nlvl) if icompq = 1, and not referenced if
icompq = 0. If icompq = 1, on exit, for each i, givcol(1, 2 i - 1) and
givcol(1, 2 i) record the locations of Givens rotations performed on the i-th
level on the computation tree.
perm INTEGER . Array, DIMENSION ( ldgcol, nlvl ) if icompq = 1, and not

referenced if icompq = 0. If icompq = 1, on exit, perm (1, i) records
permutations done on the i-th level of the computation tree.
givnum REAL for slasda

Array DIMENSION ( ldu, 2*nlvl ) if icompq = 1, and not referenced if
icompq = 0. If icompq = 1, on exit, for each i, givnum(1, 2 i - 1) and
givnum(1, 2 i) record the C- and S-values of Givens rotations performed on
the i-th level on the computation tree.
c REAL for slasda
DOUBLE PRECISION for dlasda.
Array,
DIMENSION (n) if icompq = 1, and
DIMENSION (1) if icompq = 0.
If icompq = 1 and the i-th subproblem is not square, on exit, c(i) contains
the C-value of a Givens rotation related to the right null space of the i-th
subproblem.
s REAL for slasda
DOUBLE PRECISION for dlasda.
Array,
DIMENSION (n) if icompq = 1, and
DIMENSION (1) if icompq = 0.
If icompq = 1 and the i-th subproblem is not square, on exit, s(i) contains
the S-value of a Givens rotation related to the right null space of the i-th
subproblem.
info INTEGER.
> 0: If info = 1, a singular value did not converge.
?lasdq
Computes the SVD of a real bidiagonal matrix with
diagonal d and off-diagonal e. Used by ?bdsdc.
Syntax
call slasdq( uplo, sqre, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, work,
info )
call dlasdq( uplo, sqre, n, ncvt, nru, ncc, d, e, vt, ldvt, u, ldu, c, ldc, work,
info )
Include Files
mkl.fi
Description
The routine ?lasdq computes the singular value decomposition (SVD) of a real (upper or lower) bidiagonal
matrix with diagonal d and off-diagonal e, accumulating the transformations if desired. If B is the input
bidiagonal matrix, the algorithm computes orthogonal matrices Q and P such that B = Q*S*P^T. The singular
values S are overwritten on d.
The input matrix U is changed to U*Q if desired.
The input matrix VT is changed to P^T*VT if desired.
The input matrix C is changed to Q^T*C if desired.
Input Parameters
uplo CHARACTER*1. On entry, uplo specifies whether the input bidiagonal matrix
is upper or lower bidiagonal.
If uplo = 'U' or 'u', B is upper bidiagonal;
If uplo = 'L' or 'l', B is lower bidiagonal.
sqre INTEGER.
= 0: then the input matrix is n-by-n.
= 1: then the input matrix is n-by-(n+1) if uplo = 'U' and (n+1)-by-n if
uplo = 'L'. The bidiagonal matrix has n = nl + nr + 1 rows and m = n +
sqre columns.
n INTEGER. On entry, n specifies the number of rows and columns in the

matrix. n must be at least 0.
ncvt INTEGER. On entry, ncvt specifies the number of columns of the matrix VT.
ncvt must be at least 0.
nru INTEGER. On entry, nru specifies the number of rows of the matrix U. nru
must be at least 0.
ncc INTEGER. On entry, ncc specifies the number of columns of the matrix C.
ncc must be at least 0.
d REAL for slasdq

DOUBLE PRECISION for dlasdq.
Array, DIMENSION (n). On entry, d contains the diagonal entries of the
bidiagonal matrix.
e REAL for slasdq
DOUBLE PRECISION for dlasdq.
Array, DIMENSION is (n-1) if sqre = 0 and n if sqre = 1. On entry, the
entries of e contain the off-diagonal entries of the bidiagonal matrix.
vt REAL for slasdq

Array, DIMENSION (ldvt, ncvt). On entry, contains a matrix which on exit
has been premultiplied by PT, dimension n-by-ncvt if sqre = 0 and (n+1)-
by-ncvt if sqre = 1 (not referenced if ncvt=0).
ldvt INTEGER. On entry, ldvt specifies the leading dimension of vt as declared in

the calling (sub) program. ldvt must be at least 1. If ncvt is nonzero, ldvt
must also be at least n.
u REAL for slasdq

Array, DIMENSION (ldu, n). On entry, contains a matrix which on exit has
been postmultiplied by Q, dimension nru-by-n if sqre = 0 and nru-by-(n+1)
if sqre = 1 (not referenced if nru=0).
ldu INTEGER. On entry, ldu specifies the leading dimension of u as declared in

the calling (sub) program. ldu must be at least max(1, nru ) .
c REAL for slasdq

Array, DIMENSION (ldc, ncc). On entry, contains an n-by-ncc matrix which
on exit has been premultiplied by Q', dimension n-by-ncc if sqre = 0 and
(n+1)-by-ncc if sqre = 1 (not referenced if ncc=0).
ldc INTEGER. On entry, ldc specifies the leading dimension of C as declared in

the calling (sub) program. ldc must be at least 1. If ncc is non-zero, ldc
must also be at least n.
work REAL for slasdq

Array, DIMENSION (4n). This is a workspace array. Only referenced if one of
ncvt, nru, or ncc is nonzero, and if n is at least 2.
Output Parameters
d On normal exit, d contains the singular values in ascending order.
e On normal exit, e will contain 0. If the algorithm does not converge, d and e
will contain the diagonal and superdiagonal entries of a bidiagonal matrix
orthogonally equivalent to the one given as input.
vt On exit, the matrix has been premultiplied by P'.
u On exit, the matrix has been postmultiplied by Q.
c On exit, the matrix has been premultiplied by Q'.
info INTEGER. On exit, a value of 0 indicates a successful exit. If info < 0,
argument number -info is illegal. If info > 0, the algorithm did not
converge, and info specifies how many superdiagonals did not converge.
?lasdt
Creates a tree of subproblems for bidiagonal divide
and conquer. Used by ?bdsdc.
Syntax
call slasdt( n, lvl, nd, inode, ndiml, ndimr, msub )
call dlasdt( n, lvl, nd, inode, ndiml, ndimr, msub )
Include Files
mkl.fi
Description
The routine creates a tree of subproblems for bidiagonal divide and conquer.
Input Parameters
n INTEGER. On entry, the number of diagonal elements of the bidiagonal

matrix.
msub INTEGER. On entry, the maximum row dimension that each subproblem at the
bottom of the tree can have.
Output Parameters
lvl INTEGER. On exit, the number of levels on the computation tree.
nd INTEGER. On exit, the number of nodes on the tree.
inode INTEGER.
Array, DIMENSION (n). On exit, centers of subproblems.
ndiml INTEGER .
Array, DIMENSION (n). On exit, row dimensions of left children.
ndimr INTEGER .
Array, DIMENSION (n). On exit, row dimensions of right children.
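To make the outputs concrete, the sketch below builds one such tree by repeatedly splitting each
subproblem around a center row. It is patterned on the publicly available reference LAPACK routine as best
recalled and is an assumption for illustration, not a statement of how the library implements it; the name
build_subproblem_tree and the restriction n > msub are also assumptions. Node numbers and centers are
1-based, as in the parameter descriptions.

```python
import math

def build_subproblem_tree(n, msub):
    """Return (lvl, nd, inode, ndiml, ndimr) for a divide-and-conquer tree.

    Each returned list has nd entries; entry k-1 describes node k.
    Illustrative sketch only; assumes n > msub.
    """
    if n <= msub:
        raise ValueError("this sketch assumes n > msub")
    lvl = int(math.log(n / (msub + 1)) / math.log(2.0)) + 1
    size = 2 ** lvl                      # nodes are numbered 1 .. 2**lvl - 1
    inode = [0] * size
    ndiml = [0] * size
    ndimr = [0] * size
    half = n // 2
    inode[1] = half + 1                  # center row of the root problem
    ndiml[1] = half                      # rows above the center
    ndimr[1] = n - half - 1              # rows below the center
    il, ir, llst = 0, 1, 1
    for _ in range(lvl - 1):
        # Split every node on the current level into two children.
        for k in range(llst):
            il += 2
            ir += 2
            parent = llst + k
            ndiml[il] = ndiml[parent] // 2
            ndimr[il] = ndiml[parent] - ndiml[il] - 1
            inode[il] = inode[parent] - ndimr[il] - 1
            ndiml[ir] = ndimr[parent] // 2
            ndimr[ir] = ndimr[parent] - ndiml[ir] - 1
            inode[ir] = inode[parent] + ndiml[ir] + 1
        llst *= 2
    nd = 2 * llst - 1
    return lvl, nd, inode[1:nd + 1], ndiml[1:nd + 1], ndimr[1:nd + 1]
```

For n = 15 and msub = 3 this gives two levels and three nodes: the root is centered at row 8 with 7 rows on
each side, and its children are centered at rows 4 and 12 with 3 rows on each side.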
?laset
Initializes the off-diagonal elements and the diagonal
elements of a matrix to given values.
Syntax
call slaset( uplo, m, n, alpha, beta, a, lda )
call dlaset( uplo, m, n, alpha, beta, a, lda )
call claset( uplo, m, n, alpha, beta, a, lda )
call zlaset( uplo, m, n, alpha, beta, a, lda )
Include Files
mkl.fi
Description
The routine initializes an m-by-n matrix A to beta on the diagonal and alpha on the off-diagonals.
Input Parameters
uplo CHARACTER*1. Specifies the part of the matrix A to be set.

If uplo = 'U', upper triangular part is set; the strictly lower triangular
part of A is not changed.
If uplo = 'L', lower triangular part is set; the strictly upper triangular
part of A is not changed.
Otherwise: All of the matrix A is set.

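m INTEGER. The number of rows of the matrix A. m ≥ 0.
n INTEGER. The number of columns of the matrix A. n ≥ 0.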
alpha, beta REAL for slaset

DOUBLE PRECISION for dlaset
COMPLEX for claset
DOUBLE COMPLEX for zlaset.
The constants to which the off-diagonal and diagonal elements are to be
set, respectively.
a REAL for slaset

DOUBLE PRECISION for dlaset
COMPLEX for claset
DOUBLE COMPLEX for zlaset.
Array, DIMENSION (lda, n).
The array a contains the m-by-n matrix A.

lda INTEGER. The leading dimension of the array a.
lda ≥ max(1,m).
Output Parameters
a On exit, the leading m-by-n submatrix of A is set as follows:
if uplo = 'U', A(i,j) = alpha, 1 ≤ i ≤ j-1, 1 ≤ j ≤ n,
if uplo = 'L', A(i,j) = alpha, j+1 ≤ i ≤ m, 1 ≤ j ≤ n,
otherwise, A(i,j) = alpha, 1 ≤ i ≤ m, 1 ≤ j ≤ n, i ≠ j,
and, for all uplo, A(i,i) = beta, 1 ≤ i ≤ min(m, n).
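The effect on the array can be modeled in a few lines. The sketch below is a plain reference model of the
rules above on a list-of-rows matrix, not a binding to the library; the name laset_model and the use of
0-based Python indices are assumptions for illustration.

```python
def laset_model(uplo, m, n, alpha, beta, a):
    """Set the leading m-by-n part of a (list of rows) in place and return it.

    Off-diagonal entries in the selected part become alpha, diagonal entries
    become beta; entries outside the selected triangle are left unchanged.
    """
    part = uplo.upper()
    for i in range(m):
        for j in range(n):
            if i == j:
                a[i][j] = beta            # diagonal is set for every uplo
            elif part == 'U':
                if i < j:
                    a[i][j] = alpha       # strictly upper triangle only
            elif part == 'L':
                if i > j:
                    a[i][j] = alpha       # strictly lower triangle only
            else:
                a[i][j] = alpha           # whole matrix
    return a
```

For example, calling laset_model('U', 3, 3, 0, 1, a) on a 3-by-3 matrix filled with 9 leaves the strictly
lower triangle at 9, while any uplo other than 'U' or 'L' with alpha = 0 and beta = 1 produces the leading
part of an identity matrix.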
?lasq1
Computes the singular values of a real square
bidiagonal matrix. Used by ?bdsqr.
Syntax
call slasq1( n, d, e, work, info )
call dlasq1( n, d, e, work, info )
Include Files
mkl.fi
Description
The routine ?lasq1 computes the singular values of a real n-by-n bidiagonal matrix Z with diagonal d and
off-diagonal e. The singular values are computed to high relative accuracy, in the absence of
denormalization, underflow and overflow.
Input Parameters
n INTEGER. The number of rows and columns in the matrix. n ≥ 0.
d REAL for slasq1
DOUBLE PRECISION for dlasq1.
Array, DIMENSION (n).
On entry, d contains the diagonal elements of the bidiagonal matrix whose
SVD is desired.
e REAL for slasq1
DOUBLE PRECISION for dlasq1.
Array, DIMENSION (n).
On entry, elements e(1:n-1) contain the off-diagonal elements of the
bidiagonal matrix whose SVD is desired.
work REAL for slasq1
DOUBLE PRECISION for dlasq1.
Workspace array, DIMENSION(4n).
Output Parameters
d On normal exit, d contains the singular values in decreasing order.
e On exit, e is overwritten.
info INTEGER.
= 0: successful exit;
< 0: if info = -i, the i-th argument had an illegal value;
> 0: the algorithm failed:

= 1, a split was marked by a positive value in e;
= 2, current block of Z not diagonalized after 100n iterations (in inner while
loop) - on exit the current contents of d and e represent a matrix with the
same singular values as the matrix with which ?lasq1 was originally called,
and which the calling subroutine could use to finish the computation, or
even feed back into ?lasq1;
= 3, termination criterion of outer while loop not met (program created
more than n unreduced blocks).
?lasq2
Computes all the eigenvalues of the symmetric
positive definite tridiagonal matrix associated with the
quotient difference array z to high relative accuracy.
Used by ?bdsqr and ?stegr.
Syntax
call slasq2( n, z, info )
call dlasq2( n, z, info )
Include Files
mkl.fi
Description
The routine ?lasq2 computes all the eigenvalues of the symmetric positive definite tridiagonal matrix
associated with the quotient difference array z to high relative accuracy, in the absence of denormalization,
underflow and overflow.
To see the relation of z to the tridiagonal matrix, let L be a unit lower bidiagonal matrix with subdiagonals
z(2,4,6,...) and let U be an upper bidiagonal matrix with 1's above and diagonal z(1,3,5,...). The tridiagonal
is LU or, if you prefer, the symmetric tridiagonal to which it is similar.
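The product LU described above can be formed directly. The sketch below takes the diagonal of U and the
subdiagonal of L as two separate lists, q and e, and returns the tridiagonal matrix L*U. How these values
are laid out inside the 4n-element array z is not assumed here; the name tridiagonal_from_qd and the
argument names are hypothetical.

```python
def tridiagonal_from_qd(q, e):
    """Return L*U as a dense list of rows.

    L is unit lower bidiagonal with subdiagonal e (length n-1);
    U is upper bidiagonal with diagonal q (length n) and 1's above it.
    """
    n = len(q)
    t = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # (L*U)(i,i) = q(i) + e(i-1) * 1, with no e term in the first row.
        t[i][i] = q[i] + (e[i - 1] if i > 0 else 0.0)
        if i + 1 < n:
            t[i][i + 1] = 1.0            # superdiagonal comes from U
            t[i + 1][i] = e[i] * q[i]    # subdiagonal is e(i) * q(i)
    return t
```

With q = (4, 3) and e = (2) the result is the 2-by-2 matrix with rows (4, 1) and (8, 5), which matches
multiplying L = [[1, 0], [2, 1]] by U = [[4, 1], [0, 3]] by hand.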
Input Parameters
n INTEGER. The number of rows and columns in the matrix. n ≥ 0.
z REAL for slasq2

Array, DIMENSION (4 * n).
On entry, z holds the quotient difference array.
Output Parameters
z On exit, entries 1 to n hold the eigenvalues in decreasing order, z(2n+1)
holds the trace, and z(2n+2) holds the sum of the eigenvalues. If n > 2,
then z(2n+3) holds the iteration count, z(2n+4) holds ndivs/nin^2, and
z(2n+5) holds the percentage of shifts that failed.
info INTEGER.
= 0: successful exit;
< 0: if the i-th argument is a scalar and had an illegal value, then info =
-i, if the i-th argument is an array and the j-entry had an illegal value,
then info = -(i*100+ j);
> 0: the algorithm failed:

= 1, a split was marked by a positive value in e;
= 2, current block of z not diagonalized after 100*n iterations (in inner
while loop) - On exit z holds a quotient difference array with the same
eigenvalues as the z array on entry;
= 3, termination criterion of outer while loop not met (program created
more than n unreduced blocks).
Application Notes
The routine ?lasq2 defines a logical variable, ieee, which is .TRUE. on machines that follow the IEEE-754
floating-point standard in their handling of infinities and NaNs, and .FALSE. otherwise. This variable is
passed to ?lasq3.
?lasq3
Checks for deflation, computes a shift and calls dqds.
Used by ?bdsqr.
Syntax
call slasq3( i0, n0, z, pp, dmin, sigma, desig, qmax, nfail, iter, ndiv, ieee, ttype,
dmin1, dmin2, dn, dn1, dn2, g, tau )
call dlasq3( i0, n0, z, pp, dmin, sigma, desig, qmax, nfail, iter, ndiv, ieee, ttype,
dmin1, dmin2, dn, dn1, dn2, g, tau )
Include Files
mkl.fi
Description
The routine ?lasq3 checks for deflation, computes a shift tau, and calls dqds. In case of failure, it changes
shifts, and tries again until output is positive.
Input Parameters
i0 INTEGER. First index.
n0 INTEGER. Last index.
z REAL for slasq3
DOUBLE PRECISION for dlasq3.
Array, DIMENSION (4n). z holds the qd array.
pp INTEGER. pp=0 for ping, pp=1 for pong. pp=2 indicates that flipping was
applied to the z array and that the initial tests for deflation should not be
performed.
desig REAL for slasq3
DOUBLE PRECISION for dlasq3.
Lower order part of sigma.
qmax REAL for slasq3
DOUBLE PRECISION for dlasq3.
Maximum value of q.
ieee LOGICAL.
Flag for IEEE or non-IEEE arithmetic (passed to ?lasq5).
ttype INTEGER.
Shift type.
dmin1, dmin2, dn, dn1, dn2, g, tau REAL for slasq3
DOUBLE PRECISION for dlasq3.
These scalars are passed as arguments in order to save their values
between calls to ?lasq3.
Output Parameters
dmin REAL for slasq3
DOUBLE PRECISION for dlasq3.
Minimum value of d.
pp INTEGER. pp=0 for ping, pp=1 for pong. pp=2 indicates that flipping was
applied to the z array and that the initial tests for deflation should not be
performed.
sigma REAL for slasq3
DOUBLE PRECISION for dlasq3.
Sum of shifts used in the current segment.
desig Lower order part of sigma.
nfail INTEGER. Number of times shift was too big.
iter INTEGER. Number of iterations.
ndiv INTEGER. Number of divisions.
ttype INTEGER.
Shift type.
dmin1, dmin2, dn, dn1, dn2, g, tau REAL for slasq3
DOUBLE PRECISION for dlasq3.
These scalars are passed as arguments in order to save their values
between calls to ?lasq3.
?lasq4
Computes an approximation to the smallest
eigenvalue using values of d from the previous
transform. Used by ?bdsqr.
Syntax
call slasq4( i0, n0, z, pp, n0in, dmin, dmin1, dmin2, dn, dn1, dn2, tau, ttype, g )
call dlasq4( i0, n0, z, pp, n0in, dmin, dmin1, dmin2, dn, dn1, dn2, tau, ttype, g )
Include Files
mkl.fi
Description
The routine computes an approximation tau to the smallest eigenvalue using values of d from the previous
transform.
Input Parameters
z REAL for slasq4
DOUBLE PRECISION for dlasq4.
Array, DIMENSION (4n).
z holds the qd array.
pp INTEGER. pp=0 for ping, pp=1 for pong.
n0in INTEGER. The value of n0 at start of eigtest.
dmin REAL for slasq4
DOUBLE PRECISION for dlasq4.
Minimum value of d.
dmin1 REAL for slasq4
DOUBLE PRECISION for dlasq4.
Minimum value of d, excluding d(n0).
dmin2 REAL for slasq4
DOUBLE PRECISION for dlasq4.
Minimum value of d, excluding d(n0) and d(n0-1).
dn REAL for slasq4
DOUBLE PRECISION for dlasq4. Contains d(n).
dn1 REAL for slasq4
DOUBLE PRECISION for dlasq4. Contains d(n-1).
dn2 REAL for slasq4
DOUBLE PRECISION for dlasq4. Contains d(n-2).
g REAL for slasq4
DOUBLE PRECISION for dlasq4.
A scalar passed as an argument in order to save its value between calls to
?lasq4.
Output Parameters
tau REAL for slasq4
DOUBLE PRECISION for dlasq4.
Shift.
ttype INTEGER. Shift type.
g REAL for slasq4
DOUBLE PRECISION for dlasq4.
A scalar passed as an argument in order to save its value between calls to
?lasq4.
?lasq5
Computes one dqds transform in ping-pong form.
Used by ?bdsqr and ?stegr.
Syntax
call slasq5( i0, n0, z, pp, tau, sigma, dmin, dmin1, dmin2, dn, dnm1, dnm2, ieee,
eps )
call dlasq5( i0, n0, z, pp, tau, sigma, dmin, dmin1, dmin2, dn, dnm1, dnm2, ieee,
eps )
Include Files
mkl.fi
Description
The routine computes one dqds transform in ping-pong form: one version for IEEE machines, another for
non-IEEE machines.
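For orientation, the sketch below shows one textbook dqds transform on separate q and e arrays. It is an illustration under stated assumptions rather than the MKL implementation: the function name dqds_step is hypothetical, the ping-pong storage inside z is not reproduced, and no special handling of IEEE infinities or NaNs is attempted.

```python
def dqds_step(q, e, tau):
    """One textbook dqds transform with shift tau (illustrative, not ?lasq5).

    q has n entries and e has n-1 entries. Returns (q_new, e_new, dmin),
    where dmin is the smallest value taken by the auxiliary variable d.
    """
    n = len(q)
    q_new = [0.0] * n
    e_new = [0.0] * (n - 1)
    d = q[0] - tau
    dmin = d
    for i in range(n - 1):
        q_new[i] = d + e[i]
        ratio = q[i + 1] / q_new[i]
        e_new[i] = e[i] * ratio
        d = d * ratio - tau
        dmin = min(dmin, d)
    q_new[n - 1] = d
    return q_new, e_new, dmin


q1, e1, dmin = dqds_step([4.0, 3.0], [1.0], 0.5)
```

In exact arithmetic the sum of all q and e entries drops by n*tau after one step, which gives a quick sanity check. Calling the same function with tau = 0 gives the zero-shift (dqd) form of the recurrence.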
Input Parameters
z REAL for slasq5
DOUBLE PRECISION for dlasq5.
Array, DIMENSION (4n). z holds the qd array. emin is stored in z(4*n0)
to avoid an extra argument.
tau REAL for slasq5
DOUBLE PRECISION for dlasq5.
This is the shift.
sigma REAL for slasq5
DOUBLE PRECISION for dlasq5.
This is the accumulated shift up to the current point.
ieee LOGICAL. Flag for IEEE or non-IEEE arithmetic.
eps REAL for slasq5
DOUBLE PRECISION for dlasq5.
This is the value of epsilon used.
Output Parameters

dmin REAL for slasq5
DOUBLE PRECISION for dlasq5.
Minimum value of d.
dn REAL for slasq5
DOUBLE PRECISION for dlasq5. Contains d(n0), the last value of d.
dnm1 REAL for slasq5
DOUBLE PRECISION for dlasq5. Contains d(n0-1).

?lasq6
Computes one dqd transform in ping-pong form. Used
by ?bdsqr and ?stegr.
Syntax
call slasq6( i0, n0, z, pp, dmin, dmin1, dmin2, dn, dnm1, dnm2 )
call dlasq6( i0, n0, z, pp, dmin, dmin1, dmin2, dn, dnm1, dnm2 )
Include Files
mkl.fi
Description
The routine ?lasq6 computes one dqd (shift equal to zero) transform in ping-pong form, with protection
against underflow and overflow.
Input Parameters
z REAL for slasq6
DOUBLE PRECISION for dlasq6.
Array, DIMENSION (4n). z holds the qd array. emin is stored in z(4*n0) to
avoid an extra argument.
Output Parameters

dmin REAL for slasq6
DOUBLE PRECISION for dlasq6.
Minimum value of d.
dn REAL for slasq6
DOUBLE PRECISION for dlasq6. Contains d(n0), the last value of d.


?lasr
Applies a sequence of plane rotations to a general
rectangular matrix.
Syntax
call slasr( side, pivot, direct, m, n, c, s, a, lda )
call dlasr( side, pivot, direct, m, n, c, s, a, lda )
call clasr( side, pivot, direct, m, n, c, s, a, lda )
call zlasr( side, pivot, direct, m, n, c, s, a, lda )
Include Files
mkl.fi
Description
The routine applies a sequence of plane rotations to a real/complex matrix A, from the left or the right.
A := P*A, when side = 'L' ( Left-hand side )
A := A*P', when side = 'R' ( Right-hand side )
where P is an orthogonal matrix consisting of a sequence of plane rotations with z = m when side = 'L'
and z = n when side = 'R'.
When direct = 'F' (Forward sequence), then
P = P(z-1)*...*P(2)*P(1),
and when direct = 'B' (Backward sequence), then
P = P(1)*P(2)*...*P(z-1),
where P(k) is a plane rotation matrix defined by the 2-by-2 plane rotation
R(k) = (  c(k)  s(k) )
       ( -s(k)  c(k) )
When pivot = 'V' (Variable pivot), the rotation is performed for the plane (k, k+1), that is, P(k) is the
identity matrix with R(k) appearing as a rank-2 modification in rows and columns k and k+1.
When pivot = 'T' (Top pivot), the rotation is performed for the plane (1, k+1), so P(k) is the identity
matrix with R(k) appearing in rows and columns 1 and k+1.
Similarly, when pivot = 'B' (Bottom pivot), the rotation is performed for the plane (k, z), so P(k) is the
identity matrix with R(k) appearing in rows and columns k and z. The rotations are performed without ever
forming P(k) explicitly.
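The left-hand case (side = 'L') can be sketched in pure Python as follows. This is an illustration only: the function name apply_rotations_left is hypothetical, rows are stored as nested lists, and the sketch assumes the convention R(k) = [[c(k), s(k)], [-s(k), c(k)]] used by the reference LAPACK implementation.

```python
def apply_rotations_left(pivot, direct, c, s, a):
    """Return P*A for side = 'L' (illustrative sketch, not the MKL routine).

    a is a list of m rows; c and s hold m-1 cosines and sines.
    Assumes R(k) = [[c(k), s(k)], [-s(k), c(k)]].
    """
    m = len(a)
    out = [list(row) for row in a]
    if direct == "F":
        order = range(1, m)            # P(1) is applied first
    elif direct == "B":
        order = range(m - 1, 0, -1)    # P(m-1) is applied first
    else:
        raise ValueError("direct must be 'F' or 'B'")
    for k in order:                    # k is the 1-based rotation index
        if pivot == "V":
            p, q = k, k + 1
        elif pivot == "T":
            p, q = 1, k + 1
        elif pivot == "B":
            p, q = k, m
        else:
            raise ValueError("pivot must be 'V', 'T' or 'B'")
        ck, sk = c[k - 1], s[k - 1]
        row_p, row_q = out[p - 1], out[q - 1]
        # Rows p and q are replaced by R(k) applied to the pair (row_p, row_q).
        out[p - 1] = [ck * x + sk * y for x, y in zip(row_p, row_q)]
        out[q - 1] = [-sk * x + ck * y for x, y in zip(row_p, row_q)]
    return out


rotated = apply_rotations_left("V", "F", [0.0], [1.0], [[1.0, 2.0], [3.0, 4.0]])
```

As in the routine itself, P(k) is never formed; each rotation touches only two rows of the matrix.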
Input Parameters
side CHARACTER*1. Specifies whether the plane rotation matrix P is applied to A

on the left or the right.
= 'L': left, compute A := P*A
= 'R': right, compute A:= A*P'
direct CHARACTER*1. Specifies whether P is a forward or backward sequence of

plane rotations.
= 'F': forward, P = P(z-1)*...*P(2)*P(1)
= 'B': backward, P = P(1)*P(2)*...*P(z-1)
pivot CHARACTER*1. Specifies the plane for which P(k) is a plane rotation matrix.
= 'V': Variable pivot, the plane (k, k+1)
= 'T': Top pivot, the plane (1, k+1)
= 'B': Bottom pivot, the plane (k, z)

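m INTEGER. The number of rows of the matrix A.
If m ≤ 1, an immediate return is effected.
n INTEGER. The number of columns of the matrix A.
If n ≤ 1, an immediate return is effected.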
c, s REAL for slasr/clasr

DOUBLE PRECISION for dlasr/zlasr.
Arrays, DIMENSION
(m-1) if side = 'L',
(n-1) if side = 'R' .
c(k) and s(k) contain the cosine and sine of the plane rotations respectively
that define the 2-by-2 plane rotation part (R(k)) of the P(k) matrix as
described above in Description.
a REAL for slasr

DOUBLE PRECISION for dlasr
COMPLEX for clasr
DOUBLE COMPLEX for zlasr.
The m-by-n matrix A.

lda INTEGER. The leading dimension of the array a. lda ≥ max(1,m).
Output Parameters
a On exit, A is overwritten by P*A if side = 'L', or by A*P' if side = 'R'.
?lasrt
Sorts numbers in increasing or decreasing order.
Syntax
call slasrt( id, n, d, info )
call dlasrt( id, n, d, info )
Include Files
mkl.fi
Description
The routine ?lasrt sorts the numbers in d in increasing order (if id = 'I') or in decreasing order (if id =
'D'). It uses Quick Sort, reverting to Insertion Sort on arrays of size ≤ 20. The dimension of the stack limits
n to about 2^32.
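The strategy described above (quicksort with an explicit stack, falling back to insertion sort on short subarrays) can be sketched as follows. This is a minimal illustration, not the MKL code: the name lasrt_sketch, the median-of-three pivot choice, and the Python list used as a stack are assumptions made for the example.

```python
def lasrt_sketch(id_, d, cutoff=20):
    """Sort d in increasing ('I') or decreasing ('D') order (illustrative)."""
    if id_ not in ("I", "D"):
        raise ValueError("id must be 'I' or 'D'")
    out = list(d)
    increasing = id_ == "I"

    def before(x, y):
        # True when x must be placed strictly before y in the requested order.
        return x < y if increasing else x > y

    stack = [(0, len(out) - 1)]        # explicit stack of subarray bounds
    while stack:
        lo, hi = stack.pop()
        if hi - lo < cutoff:
            # Insertion sort on subarrays of at most `cutoff` elements.
            for i in range(lo + 1, hi + 1):
                x = out[i]
                j = i - 1
                while j >= lo and before(x, out[j]):
                    out[j + 1] = out[j]
                    j -= 1
                out[j + 1] = x
        else:
            # Median-of-three pivot followed by a two-pointer partition.
            pivot = sorted((out[lo], out[(lo + hi) // 2], out[hi]))[1]
            i, j = lo, hi
            while i <= j:
                while before(out[i], pivot):
                    i += 1
                while before(pivot, out[j]):
                    j -= 1
                if i <= j:
                    out[i], out[j] = out[j], out[i]
                    i += 1
                    j -= 1
            stack.append((lo, j))
            stack.append((i, hi))
    return out
```

Unlike ?lasrt, which sorts d in place, this sketch returns a new list and leaves its argument unchanged.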
Input Parameters
id CHARACTER*1.
= 'I': sort d in increasing order;
= 'D': sort d in decreasing order.
n INTEGER. The length of the array d.
d REAL for slasrt

DOUBLE PRECISION for dlasrt.
On entry, the array to be sorted.
Output Parameters
d On exit, d has been sorted into increasing order
(d(1) ≤ d(2) ≤ ... ≤ d(n)) or into decreasing order
(d(1) ≥ d(2) ≥ ... ≥ d(n)), depending on id.

?lassq
Updates a sum of squares represented in scaled form.
Syntax
call slassq( n, x, incx, scale, sumsq )
call dlassq( n, x, incx, scale, sumsq )
call classq( n, x, incx, scale, sumsq )
call zlassq( n, x, incx, scale, sumsq )
Include Files
mkl.fi
Description
The real routines slassq/dlassq return the values scl and smsq such that
scl^2 * smsq = x(1)^2 + ... + x(n)^2 + scale^2 * sumsq,
where x(i) = x(1 + (i - 1)*incx).
The value of sumsq is assumed to be non-negative and scl returns the value
scl = max( scale, abs(x(i))).
Values scale and sumsq must be supplied in scale and sumsq, and scl and smsq are overwritten on scale and
sumsq, respectively.
The complex routines classq/zlassq return the values scl and ssq such that
scl^2 * ssq = x(1)^2 + ... + x(n)^2 + scale^2 * sumsq,
where x(i) = abs(x(1 + (i - 1)*incx)).
The value of sumsq is assumed to be at least unity and the value of ssq will then satisfy
1.0 ≤ ssq ≤ sumsq + 2n.
scale is assumed to be non-negative and scl returns the value
scl = max( scale, abs(real(x(i))), abs(aimag(x(i)))).
Values scale and sumsq must be supplied in scale and sumsq, and scl and ssq are overwritten on scale and
sumsq, respectively.
All routines ?lassq make only one pass through the vector x.
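A one-pass update of this kind can be sketched for the real case as follows. The code follows the classic scaled-sum recurrence from the reference LAPACK sources and is given as an illustration only; the function name lassq_real is hypothetical and the MKL implementation may differ in detail.

```python
def lassq_real(n, x, incx, scale, sumsq):
    """One-pass update of a scaled sum of squares (sketch of the real case).

    Returns (scl, smsq) such that
    scl**2 * smsq = x(1)**2 + ... + x(n)**2 + scale**2 * sumsq.
    """
    for i in range(n):
        xi = x[i * incx]               # x(1 + (i-1)*incx) in 1-based notation
        if xi != 0.0:
            absxi = abs(xi)
            if scale < absxi:
                # Rescale the running sum to the new, larger scale.
                sumsq = 1.0 + sumsq * (scale / absxi) ** 2
                scale = absxi
            else:
                sumsq += (absxi / scale) ** 2
    return scale, sumsq


scl, smsq = lassq_real(2, [3.0, 4.0], 1, 0.0, 1.0)
```

Starting from scale = 0 and sumsq = 1, the Euclidean norm of x can then be recovered as scl*sqrt(smsq) without squaring any element of x directly, which is what protects the computation from avoidable overflow.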
Input Parameters
n INTEGER. The number of elements to be used from the vector x.
x REAL for slassq

DOUBLE PRECISION for dlassq
COMPLEX for classq
DOUBLE COMPLEX for zlassq.
The vector for which a scaled sum of squares is computed: x(i) = x(1+
(i-1)*incx), 1 ≤ i ≤ n.
incx INTEGER. The increment between successive values of the vector x. incx >
0.
scale REAL for slassq/classq

DOUBLE PRECISION for dlassq/zlassq.
On entry, the value scale in the equation above.
sumsq REAL for slassq/classq
DOUBLE PRECISION for dlassq/zlassq.
On entry, the value sumsq in the equation above.
Output Parameters
scale On exit, scale is overwritten with scl, the scaling factor for the sum of
squares.
sumsq For real flavors:
On exit, sumsq is overwritten with the value smsq in the equation above.
For complex flavors:
On exit, sumsq is overwritten with the value ssq in the equation above.
?lasv2
Computes the singular value decomposition of a 2-
by-2 triangular matrix.
Syntax
call slasv2( f, g, h, ssmin, ssmax, snr, csr, snl, csl )
call dlasv2( f, g, h, ssmin, ssmax, snr, csr, snl, csl )
Include Files
mkl.fi
Description
The routine ?lasv2 computes the singular value decomposition of a 2-by-2 triangular matrix
( f  g )
( 0  h )
On return, abs(ssmax) is the larger singular value, abs(ssmin) is the smaller singular value, and (csl,snl) and
(csr,snr) are the left and right singular vectors for abs(ssmax), giving the decomposition
(  csl  snl ) ( f  g ) ( csr  -snr )   ( ssmax  0     )
( -snl  csl ) ( 0  h ) ( snr   csr ) = ( 0      ssmin )
Input Parameters
f, g, h REAL for slasv2

DOUBLE PRECISION for dlasv2.
The (1,1), (1,2) and (2,2) elements of the 2-by-2 matrix, respectively.
Output Parameters
ssmin, ssmax REAL for slasv2
DOUBLE PRECISION for dlasv2.
abs(ssmin) and abs(ssmax) are the smaller and the larger singular values,
respectively.
snl, csl REAL for slasv2
DOUBLE PRECISION for dlasv2.
The vector (csl, snl) is a unit left singular vector for the singular value
abs(ssmax).
snr, csr REAL for slasv2
DOUBLE PRECISION for dlasv2.
The vector (csr, snr) is a unit right singular vector for the singular value
abs(ssmax).
Application Notes
Any input parameter may be aliased with any output parameter.
Barring over/underflow and assuming a guard digit in subtraction, all output quantities are correct to within a
few units in the last place (ulps).
In IEEE arithmetic, the code works correctly if one matrix element is infinite. Overflow will not occur unless
the largest singular value itself overflows or is within a few ulps of overflow. (On machines with partial
overflow, like the Cray, overflow may occur if the largest singular value is within a factor of 2 of overflow.)
Underflow is harmless if underflow is gradual. Otherwise, results may correspond to a matrix modified by
perturbations of size near the underflow threshold.
?laswp
Performs a series of row interchanges on a general
rectangular matrix.
Syntax
call slaswp( n, a, lda, k1, k2, ipiv, incx )
call dlaswp( n, a, lda, k1, k2, ipiv, incx )
call claswp( n, a, lda, k1, k2, ipiv, incx )
call zlaswp( n, a, lda, k1, k2, ipiv, incx )
Include Files
mkl.fi
Description
The routine performs a series of row interchanges on the matrix A. One row interchange is initiated for each
of rows k1 through k2 of A.
Input Parameters
a REAL for slaswp

DOUBLE PRECISION for dlaswp
COMPLEX for claswp
DOUBLE COMPLEX for zlaswp.
Array a contains the m-by-n matrix A.
k1 INTEGER. The first element of ipiv for which a row interchange will be done.
k2 INTEGER. The last element of ipiv for which a row interchange will be done.
ipiv INTEGER.
Array, size (k1+(k2-k1)*|incx|).
The vector of pivot indices. Only the elements in positions k1 through k2 of
ipiv are accessed.
ipiv(k) = l implies rows k and l are to be interchanged.
incx INTEGER. The increment between successive values of ipiv. If incx is
negative, the pivots are applied in reverse order.
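The effect of the interchanges can be sketched in pure Python for a pivot vector with unit stride. This is an illustration only: the function name apply_row_interchanges and its reverse flag are hypothetical, and the indexing of ipiv for a general incx is not modelled.

```python
def apply_row_interchanges(a, k1, k2, ipiv, reverse=False):
    """Swap rows of a as described by ipiv (sketch for unit stride only).

    k1, k2 and the entries of ipiv are 1-based, as in the text above.
    reverse=True applies the same interchanges in the opposite order.
    """
    out = [list(row) for row in a]
    ks = range(k2, k1 - 1, -1) if reverse else range(k1, k2 + 1)
    for k in ks:
        l = ipiv[k - 1]                # ipiv(k) = l: interchange rows k and l
        if l != k:
            out[k - 1], out[l - 1] = out[l - 1], out[k - 1]
    return out


permuted = apply_row_interchanges([[1.0], [2.0], [3.0]], 1, 2, [3, 3, 3])
```

Applying the same interchanges in reverse order to the permuted matrix restores the original row order, which is a typical reason for requesting the reverse sequence.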
Output Parameters
a On exit, the permuted matrix.
?lasy2
Solves the Sylvester matrix equation where the
matrices are of order 1 or 2.
Syntax
call slasy2( ltranl, ltranr, isgn, n1, n2, tl, ldtl, tr, ldtr, b, ldb, scale, x, ldx,
xnorm, info )
call dlasy2( ltranl, ltranr, isgn, n1, n2, tl, ldtl, tr, ldtr, b, ldb, scale, x, ldx,
xnorm, info )
Include Files
mkl.fi
Description
The routine solves for the n1-by-n2 matrix X, 1 ≤ n1, n2 ≤ 2, in
op(TL)*X + isgn*X*op(TR) = scale*B,
where
TL is n1-by-n1,
TR is n2-by-n2,
B is n1-by-n2,
and isgn = 1 or -1. op(T) = T or T^T, where T^T denotes the transpose of T.
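To make the equation concrete, the sketch below rewrites it as a small linear system in the entries of X and solves that system by Gaussian elimination with scale fixed at 1. It is an illustration under stated assumptions, not the algorithm used by ?lasy2: the function name solve_small_sylvester is hypothetical, and neither the overflow-avoiding scaling nor the perturbation of nearly singular systems is modelled.

```python
def solve_small_sylvester(ltranl, ltranr, isgn, tl, tr, b):
    """Solve op(TL)*X + isgn*X*op(TR) = B for small dense blocks (sketch)."""
    n1, n2 = len(tl), len(tr)
    # Element (i, k) of op(TL) and element (l, j) of op(TR).
    opl = (lambda i, k: tl[k][i]) if ltranl else (lambda i, k: tl[i][k])
    opr = (lambda l, j: tr[j][l]) if ltranr else (lambda l, j: tr[l][j])

    size = n1 * n2
    # Augmented matrix of the linear system; unknown X(i, j) has index i*n2 + j.
    m = [[0.0] * (size + 1) for _ in range(size)]
    for i in range(n1):
        for j in range(n2):
            row = i * n2 + j
            for k in range(n1):
                m[row][k * n2 + j] += opl(i, k)
            for l in range(n2):
                m[row][i * n2 + l] += isgn * opr(l, j)
            m[row][size] = b[i][j]

    # Gaussian elimination with partial pivoting.
    for col in range(size):
        piv = max(range(col, size), key=lambda r: abs(m[r][col]))
        if m[piv][col] == 0.0:
            raise ZeroDivisionError("singular system")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, size):
            f = m[r][col] / m[col][col]
            for cc in range(col, size + 1):
                m[r][cc] -= f * m[col][cc]

    # Back substitution.
    sol = [0.0] * size
    for r in range(size - 1, -1, -1):
        acc = m[r][size] - sum(m[r][cc] * sol[cc] for cc in range(r + 1, size))
        sol[r] = acc / m[r][r]
    return [[sol[i * n2 + j] for j in range(n2)] for i in range(n1)]


X = solve_small_sylvester(False, False, 1,
                          [[1.0, 2.0], [0.0, 3.0]], [[4.0]], [[9.0], [14.0]])
```

The system is singular exactly when an eigenvalue of op(TL) equals -isgn times an eigenvalue of op(TR); the sketch simply raises an error in that case.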
Input Parameters
ltranl LOGICAL.
On entry, ltranl specifies the op(TL):
= .FALSE., op(TL) = TL,
= .TRUE., op(TL) = (TL)^T.
ltranr LOGICAL.
On entry, ltranr specifies the op(TR):
= .FALSE., op(TR) = TR,
= .TRUE., op(TR) = (TR)^T.
isgn INTEGER. On entry, isgn specifies the sign of the equation as described
before. isgn may only be 1 or -1.
n1 INTEGER. On entry, n1 specifies the order of matrix TL.

n1 may only be 0, 1 or 2.
n2 INTEGER. On entry, n2 specifies the order of matrix TR.

n2 may only be 0, 1 or 2.
tl REAL for slasy2
DOUBLE PRECISION for dlasy2.
Array, DIMENSION (ldtl,2).
On entry, tl contains an n1-by-n1 matrix TL.
ldtl INTEGER. The leading dimension of the matrix TL.
ldtl ≥ max(1,n1).
tr REAL for slasy2
DOUBLE PRECISION for dlasy2.
Array, DIMENSION (ldtr,2). On entry, tr contains an n2-by-n2 matrix TR.
ldtr INTEGER. The leading dimension of the matrix TR.
ldtr ≥ max(1,n2).
b REAL for slasy2
DOUBLE PRECISION for dlasy2.
Array, DIMENSION (ldb,2). On entry, the n1-by-n2 matrix B contains the
right-hand side of the equation.
ldb INTEGER. The leading dimension of the matrix B.
ldb ≥ max(1,n1).
ldx INTEGER. The leading dimension of the output matrix X.
ldx ≥ max(1,n1).
Output Parameters
scale REAL for slasy2
DOUBLE PRECISION for dlasy2.
On exit, scale contains the scale factor.
scale is chosen less than or equal to 1 to prevent the solution overflowing.
x REAL for slasy2
DOUBLE PRECISION for dlasy2.
Array, DIMENSION (ldx,2). On exit, x contains the n1-by-n2 solution.
xnorm REAL for slasy2
DOUBLE PRECISION for dlasy2.
On exit, xnorm is the infinity-norm of the solution.
info INTEGER. On exit, info is set to
= 0: successful exit;
= 1: TL and TR have eigenvalues that are too close, so TL or TR is perturbed to get a nonsingular equation.
NOTE
?lasyf
Computes a partial factorization of a symmetric matrix, using the Bunch-Kaufman diagonal pivoting method.
Syntax
call slasyf( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call dlasyf( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call clasyf( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call zlasyf( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
Include Files
mkl.fi
Description
The routine ?lasyf computes a partial factorization of a symmetric matrix A using the Bunch-Kaufman
diagonal pivoting method. The partial factorization has the form:
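If uplo = 'U':
A = ( I  U12 ) ( A11  0 ) ( I      0     )
    ( 0  U22 ) ( 0    D ) ( U12^T  U22^T )
If uplo = 'L':
A = ( L11  0 ) ( D  0   ) ( L11^T  L21^T )
    ( L21  I ) ( 0  A22 ) ( 0      I     )
where the order of D is at most nb.
The actual order is returned in the argument kb, and is either nb or nb-1, or n if n ≤ nb.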
This is an auxiliary routine called by ?sytrf. It uses blocked code (calling Level 3 BLAS) to update the
submatrix A11 (if uplo = 'U') or A22 (if uplo = 'L').
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is stored:
= 'U': Upper triangular
= 'L': Lower triangular
nb INTEGER. The maximum number of columns of the matrix A that should be
factored. nb should be at least 2 to allow for 2-by-2 pivot blocks.
a REAL for slasyf
DOUBLE PRECISION for dlasyf
COMPLEX for clasyf
DOUBLE COMPLEX for zlasyf.
Array, DIMENSION (lda, n). If uplo = 'U', the leading n-by-n upper
triangular part of a contains the upper triangular part of the matrix A, and
the strictly lower triangular part of a is not referenced. If uplo = 'L', the
leading n-by-n lower triangular part of a contains the lower triangular part
of the matrix A, and the strictly upper triangular part of a is not referenced.
lda INTEGER. The leading dimension of the array a. lda ≥ max(1,n).
w REAL for slasyf
DOUBLE PRECISION for dlasyf
COMPLEX for clasyf
DOUBLE COMPLEX for zlasyf.
Workspace array, DIMENSION (ldw, nb).
ldw INTEGER. The leading dimension of the array w. ldw ≥ max(1,n).
Output Parameters
kb INTEGER. The number of columns of A that were actually factored. kb is
either nb-1 or nb, or n if n ≤ nb.
a On exit, a contains details of the partial factorization.
ipiv INTEGER. Array, DIMENSION (n ). Details of the interchanges and the block
structure of D.
If uplo = 'U', only the last kb elements of ipiv are set;
if uplo = 'L', only the first kb elements are set.
If ipiv(k) > 0, then rows and columns k and ipiv(k) were interchanged
and D(k, k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) = ipiv(k-1) < 0, then rows and columns
k-1 and -ipiv(k) were interchanged and D(k-1:k, k-1:k) is a 2-by-2
diagonal block.
If uplo = 'L' and ipiv(k) = ipiv(k+1) < 0, then rows and columns k
+1 and -ipiv(k) were interchanged and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.
info INTEGER.
= 0: successful exit;
> 0: if info = k, D(k, k) is exactly zero. The factorization has been
completed, but the block diagonal matrix D is exactly singular.
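The encoding of the block structure of D in ipiv can be read back with a short scan. The sketch below covers uplo = 'L' and assumes that all n entries of ipiv have been set, as after a complete factorization; the function name diagonal_blocks_lower is hypothetical and the code is an illustration only.

```python
def diagonal_blocks_lower(ipiv):
    """List the diagonal blocks of D encoded in ipiv for uplo = 'L' (sketch).

    ipiv holds 1-based values, as in the description above. Returns tuples
    (k, size, swapped_with): the block starts at row/column k, has order
    size (1 or 2), and swapped_with is the 1-based interchange index.
    """
    blocks = []
    n = len(ipiv)
    k = 1
    while k <= n:
        p = ipiv[k - 1]
        if p > 0:
            blocks.append((k, 1, p))       # rows/columns k and ipiv(k)
            k += 1
        elif k < n and ipiv[k] == p:
            blocks.append((k, 2, -p))      # rows/columns k+1 and -ipiv(k)
            k += 2
        else:
            raise ValueError("inconsistent pivot vector at k = %d" % k)
    return blocks


blocks = diagonal_blocks_lower([1, -3, -3, 4])
```

For uplo = 'U' the same idea applies with the scan running from k = n downwards and the pair ipiv(k) = ipiv(k-1) < 0 marking a 2-by-2 block.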
?lasyf_rook
Computes a partial factorization of a real/complex
symmetric matrix, using the bounded Bunch-Kaufman
diagonal pivoting method.
Syntax
call slasyf_rook( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call dlasyf_rook( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call clasyf_rook( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call zlasyf_rook( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
Include Files
mkl.fi
Description
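The routine ?lasyf_rook computes a partial factorization of a real/complex symmetric matrix A using the bounded
Bunch-Kaufman ("rook") diagonal pivoting method. The partial factorization has the form:
If uplo = 'U':
A = ( I  U12 ) ( A11  0 ) ( I      0     )
    ( 0  U22 ) ( 0    D ) ( U12^T  U22^T )
If uplo = 'L':
A = ( L11  0 ) ( D  0   ) ( L11^T  L21^T )
    ( L21  I ) ( 0  A22 ) ( 0      I     )
This is an auxiliary routine called by ?sytrf_rook. It uses blocked code (calling Level 3 BLAS) to update the
submatrix A11 (if uplo = 'U') or A22 (if uplo = 'L').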
Input Parameters
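uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is stored:
= 'U': Upper triangular
= 'L': Lower triangular
a REAL for slasyf_rook
DOUBLE PRECISION for dlasyf_rook
COMPLEX for clasyf_rook
DOUBLE COMPLEX for zlasyf_rook.
Array, DIMENSION (lda, n). If uplo = 'U', the leading n-by-n upper
triangular part of a contains the upper triangular part of the matrix A, and
the strictly lower triangular part of a is not referenced. If uplo = 'L', the
leading n-by-n lower triangular part of a contains the lower triangular part
of the matrix A, and the strictly upper triangular part of a is not referenced.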
w REAL for slasyf_rook

DOUBLE PRECISION for dlasyf_rook
COMPLEX for clasyf_rook
DOUBLE COMPLEX for zlasyf_rook.
Output Parameters

a On exit, a contains details of the partial factorization.
ipiv INTEGER. Array, DIMENSION (n). Details of the interchanges and the block
structure of D.
If ipiv(k) > 0, then rows and columns k and ipiv(k) were interchanged
and D(k, k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) < 0 and ipiv(k - 1) < 0, then rows and
columns k and -ipiv(k) were interchanged, rows and columns k - 1 and
-ipiv(k - 1) were interchanged, and D(k-1:k, k-1:k) is a 2-by-2 diagonal block.
If uplo = 'L' and ipiv(k) < 0 and ipiv(k + 1) < 0, then rows and
columns k and -ipiv(k) were interchanged, rows and columns k + 1 and
-ipiv(k + 1) were interchanged, and D(k:k+1, k:k+1) is a 2-by-2 diagonal block.
info INTEGER.
?lahef
Computes a partial factorization of a complex
Hermitian indefinite matrix, using the diagonal
pivoting method.
Syntax
call clahef( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call zlahef( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
Include Files
mkl.fi
Description
The routine ?lahef computes a partial factorization of a complex Hermitian matrix A, using the Bunch-
Kaufman diagonal pivoting method. The partial factorization has the form:
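If uplo = 'U':
A = ( I  U12 ) ( A11  0 ) ( I      0     )
    ( 0  U22 ) ( 0    D ) ( U12^H  U22^H )
If uplo = 'L':
A = ( L11  0 ) ( D  0   ) ( L11^H  L21^H )
    ( L21  I ) ( 0  A22 ) ( 0      I     )
Note that U^H denotes the conjugate transpose of U.
This is an auxiliary routine called by ?hetrf. It uses blocked code (calling Level 3 BLAS) to update the
submatrix A11 (if uplo = 'U') or A22 (if uplo = 'L').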
Input Parameters
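uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the Hermitian matrix
A is stored:
= 'U': upper triangular
= 'L': lower triangular
a COMPLEX for clahef
DOUBLE COMPLEX for zlahef.
On entry, the Hermitian matrix A.
If uplo = 'U', the leading n-by-n upper triangular part of A contains the
upper triangular part of the matrix A, and the strictly lower triangular part of A is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of A contains the
lower triangular part of the matrix A, and the strictly upper triangular part of A is not referenced.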
w COMPLEX for clahef

DOUBLE COMPLEX for zlahef.
Output Parameters

a On exit, A contains details of the partial factorization.
ipiv INTEGER.
Array, DIMENSION (n). Details of the interchanges and the block structure
of D.
If ipiv(k) > 0, then rows and columns k and ipiv(k) are interchanged and
D(k, k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) = ipiv(k-1) < 0, then rows and columns
k-1 and -ipiv(k) are interchanged and D(k-1:k, k-1:k) is a 2-by-2
diagonal block.
If uplo = 'L' and ipiv(k) = ipiv(k+1) < 0, then rows and columns
k+1 and -ipiv(k) are interchanged and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.
info INTEGER.
?lahef_rook
Computes a partial factorization of a complex
Hermitian indefinite matrix, using the bounded
Bunch-Kaufman diagonal pivoting method.
Syntax
call clahef_rook( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
call zlahef_rook( uplo, n, nb, kb, a, lda, ipiv, w, ldw, info )
Include Files
mkl.fi
Description
The routine ?lahef_rook computes a partial factorization of a complex Hermitian matrix A, using the
bounded Bunch-Kaufman ("rook") diagonal pivoting method. The partial factorization has the form:
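If uplo = 'U':
A = ( I  U12 ) ( A11  0 ) ( I      0     )
    ( 0  U22 ) ( 0    D ) ( U12^H  U22^H )
If uplo = 'L':
A = ( L11  0 ) ( D  0   ) ( L11^H  L21^H )
    ( L21  I ) ( 0  A22 ) ( 0      I     )
Note that U^H denotes the conjugate transpose of U.
This is an auxiliary routine called by ?hetrf_rook. It uses blocked code (calling Level 3 BLAS) to update the
submatrix A11 (if uplo = 'U') or A22 (if uplo = 'L').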
Input Parameters
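uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the Hermitian matrix
A is stored:
= 'U': upper triangular
= 'L': lower triangular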

a COMPLEX for clahef_rook

DOUBLE COMPLEX for zlahef_rook.

w COMPLEX for clahef_rook

DOUBLE COMPLEX for zlahef_rook.
Output Parameters

a On exit, A contains details of the partial factorization.
ipiv INTEGER.
Array, DIMENSION (n). Details of the interchanges and the block structure
of D.
If ipiv(k) > 0, then rows and columns k and ipiv(k) are interchanged and
D(k, k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) < 0 and ipiv(k-1) < 0, then rows and
columns k and -ipiv(k) are interchanged, rows and columns k - 1 and
-ipiv(k - 1) are interchanged, and D(k-1:k, k-1:k) is a 2-by-2 diagonal block.
If uplo = 'L' and ipiv(k) < 0 and ipiv(k+1) < 0, then rows and
columns k and -ipiv(k) are interchanged, rows and columns k + 1 and
-ipiv(k + 1) are interchanged, and D(k:k+1, k:k+1) is a 2-by-2 diagonal block.
info INTEGER.
?latbs
Solves a triangular banded system of equations.
Syntax
call slatbs( uplo, trans, diag, normin, n, kd, ab, ldab, x, scale, cnorm, info )
call dlatbs( uplo, trans, diag, normin, n, kd, ab, ldab, x, scale, cnorm, info )
call clatbs( uplo, trans, diag, normin, n, kd, ab, ldab, x, scale, cnorm, info )
call zlatbs( uplo, trans, diag, normin, n, kd, ab, ldab, x, scale, cnorm, info )
Include Files
mkl.fi
Description
The routine solves one of the triangular systems
A*x = s*b, or A^T*x = s*b, or A^H*x = s*b (for complex flavors)
with scaling to prevent overflow, where A is an upper or lower triangular band matrix. Here A^T denotes the
transpose of A, A^H denotes the conjugate transpose of A, x and b are n-element vectors, and s is a scaling
factor, usually less than or equal to 1, chosen so that the components of x will be less than the overflow
threshold. If the unscaled problem will not cause overflow, the Level 2 BLAS routine ?tbsv is called. If the
matrix A is singular (A(j, j)=0 for some j), then s is set to 0 and a non-trivial solution to A*x = 0 is
returned.
Input Parameters
uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular.
= 'U': upper triangular
= 'L': lower triangular
trans CHARACTER*1.
Specifies the operation applied to A.
= 'N': solve A*x = s*b (no transpose)
= 'T': solve A^T*x = s*b (transpose)
= 'C': solve A^H*x = s*b (conjugate transpose)
diag CHARACTER*1.
Specifies whether the matrix A is unit triangular.
= 'N': non-unit triangular
= 'U': unit triangular
normin CHARACTER*1.
Specifies whether cnorm is set.
= 'Y': cnorm contains the column norms on entry;
= 'N': cnorm is not set on entry. On exit, the norms are computed and
stored in cnorm.
kd INTEGER. The number of subdiagonals or superdiagonals in the triangular
matrix A. kd ≥ 0.
ab REAL for slatbs
DOUBLE PRECISION for dlatbs
COMPLEX for clatbs
DOUBLE COMPLEX for zlatbs.
Array, DIMENSION (ldab, n).
The upper or lower triangular band matrix A, stored in the first kd+1 rows
of the array. The j-th column of A is stored in the j-th column of the array
ab as follows:
if uplo = 'U', ab(kd+1+i-j,j) = A(i,j) for max(1, j-kd) ≤ i ≤ j;
if uplo = 'L', ab(1+i-j,j) = A(i,j) for j ≤ i ≤ min(n, j+kd).
ldab INTEGER. The leading dimension of the array ab. ldab ≥ kd+1.
x REAL for slatbs
DOUBLE PRECISION for dlatbs
COMPLEX for clatbs
DOUBLE COMPLEX for zlatbs.
On entry, the right hand side b of the triangular system.
cnorm REAL for slatbs/clatbs
DOUBLE PRECISION for dlatbs/zlatbs.
If normin = 'Y', cnorm is an input argument and cnorm(j) contains the
norm of the off-diagonal part of the j-th column of A.
If trans = 'N', cnorm(j) must be greater than or equal to the infinity-norm,
and if trans = 'T' or 'C', cnorm(j) must be greater than or equal
to the 1-norm.
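The band storage rule for ab can be sketched in pure Python as follows: for uplo = 'U', ab(kd+1+i-j, j) = A(i, j) with max(1, j-kd) ≤ i ≤ j, and for uplo = 'L', ab(1+i-j, j) = A(i, j) with j ≤ i ≤ min(n, j+kd). The helper name pack_triangular_band and the nested-list storage are assumptions made for this illustration.

```python
def pack_triangular_band(uplo, a, kd):
    """Pack a triangular band matrix into band storage (illustrative helper).

    a is a full n-by-n list of rows; returns ab with kd+1 rows and n columns.
    Positions of ab that correspond to no element of A are left as 0.0.
    """
    n = len(a)
    ab = [[0.0] * n for _ in range(kd + 1)]
    for j in range(1, n + 1):                      # 1-based column index
        if uplo == "U":
            for i in range(max(1, j - kd), j + 1):
                ab[kd + i - j][j - 1] = a[i - 1][j - 1]   # ab(kd+1+i-j, j)
        elif uplo == "L":
            for i in range(j, min(n, j + kd) + 1):
                ab[i - j][j - 1] = a[i - 1][j - 1]        # ab(1+i-j, j)
        else:
            raise ValueError("uplo must be 'U' or 'L'")
    return ab


upper = [[1.0, 2.0, 0.0],
         [0.0, 3.0, 4.0],
         [0.0, 0.0, 5.0]]
ab_upper = pack_triangular_band("U", upper, 1)
```

With uplo = 'U' the diagonal of A lands in the last row of ab, while with uplo = 'L' it lands in the first row.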
Output Parameters
scale REAL for slatbs/clatbs

DOUBLE PRECISION for dlatbs/zlatbs.
The scaling factor s for the triangular system as described above. If scale
= 0, the matrix A is singular or badly scaled, and the vector x is an exact or
approximate solution to Ax = 0.
cnorm If normin = 'N', cnorm is an output argument and cnorm(j) returns the
1-norm of the off-diagonal part of the j-th column of A.
info INTEGER.
= 0: successful exit;
< 0: if info = -k, the k-th argument had an illegal value.
?latm1
Computes the entries of a matrix as specified.
Syntax
call slatm1( mode, cond, irsign, idist, iseed, d, n, info )
call dlatm1( mode, cond, irsign, idist, iseed, d, n, info )
call clatm1( mode, cond, irsign, idist, iseed, d, n, info )
call zlatm1( mode, cond, irsign, idist, iseed, d, n, info )
Include Files
mkl.fi
Description
The ?latm1 routine computes the entries of D(1..n) as specified by mode, cond and irsign. idist and
iseed determine the generation of random numbers.
?latm1 is called by slatmr (for slatm1 and dlatm1), and by clatmr(for clatm1 and zlatm1) to generate
random test matrices for LAPACK programs.
Input Parameters
mode INTEGER. On entry describes how d is to be computed:

mode = 0 means do not change d.
mode = 1 sets d(1) = 1 and d(2:n) = 1.0/cond
mode = 2 sets d(1:n-1) = 1 and d(n) = 1.0/cond
mode = 3 sets d(i) = cond**(-(i-1)/(n-1))
mode = 4 sets d(i) = 1 - (i-1)/(n-1)*(1 - 1/cond)
mode = 5 sets d to random numbers in the range ( 1/cond , 1 ) such
that their logarithms are uniformly distributed.
mode = 6 sets d to random numbers from same distribution as the rest of
the matrix.
mode < 0 has the same meaning as abs(mode), except that the order of
the elements of d is reversed.
Thus if mode is positive, d has entries ranging from 1 to 1/cond, if
negative, from 1/cond to 1.
cond REAL for slatm1,
DOUBLE PRECISION for dlatm1,
REAL for clatm1,
DOUBLE PRECISION for zlatm1,
On entry, used as described under mode above. If used, it must be ≥ 1.
irsign INTEGER.
On entry, if mode is not -6, 0, or 6, determines sign of entries of d.
If irsign = 0, entries of d are unchanged.
If irsign = 1, each entry of d is multiplied by a random complex number

uniformly distributed with absolute value 1.
idist INTEGER. Specifies the distribution of the random numbers.

For slatm1 and dlatm1:
= 1: uniform (0,1)
= 2: uniform (-1,1)
= 3: normal (0,1)
For clatm1 and zlatm1:
= 4: complex number uniform in disk(0, 1)
iseed INTEGER.
Array, size 4.
Specifies the seed of the random number generator. The random number
generator uses a linear congruential sequence limited to small integers, and
so should produce machine independent random numbers. The values of
iseed(4) are changed on exit, and can be used in the next call to ?latm1
to continue the same random number sequence.
d REAL for slatm1,
DOUBLE PRECISION for dlatm1,
COMPLEX for clatm1,
DOUBLE COMPLEX for zlatm1,
Array, size n.
n INTEGER. Number of entries of d.
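The deterministic settings of mode can be sketched in pure Python as follows. This is an illustration only: the function name latm1_deterministic is hypothetical, and the random modes (5 and 6), the sign handling controlled by irsign, and the seed are not modelled.

```python
def latm1_deterministic(mode, cond, n):
    """Entries of d for modes 1 to 4 and -1 to -4 (illustrative sketch)."""
    kind = abs(mode)
    if kind not in (1, 2, 3, 4):
        raise ValueError("only modes 1..4 and -1..-4 are covered here")
    if cond < 1.0:
        raise ValueError("cond must be at least 1")
    if n == 0:
        return []
    if kind == 1:
        # d(1) = 1 and d(2:n) = 1/cond
        d = [1.0] + [1.0 / cond] * (n - 1)
    elif kind == 2:
        # d(1:n-1) = 1 and d(n) = 1/cond
        d = [1.0] * (n - 1) + [1.0 / cond]
    elif kind == 3:
        # d(i) = cond**(-(i-1)/(n-1)); the loop index i below is i-1 in the text
        d = [cond ** (-(i / (n - 1))) if n > 1 else 1.0 for i in range(n)]
    else:
        # d(i) = 1 - (i-1)/(n-1)*(1 - 1/cond)
        d = [1.0 - (i / (n - 1)) * (1.0 - 1.0 / cond) if n > 1 else 1.0
             for i in range(n)]
    if mode < 0:
        d.reverse()                    # negative mode reverses the order
    return d


d4 = latm1_deterministic(4, 4.0, 3)
```

For positive modes the entries run from 1 down to 1/cond, so the ratio of the largest to the smallest entry equals cond.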
Output Parameters
d On exit, d is updated, unless mode = 0.
info INTEGER.
If info = -1, mode is not in range -6 to 6.
If info = -2, mode is neither -6, 0 nor 6, and irsign is neither 0 nor 1.
If info = -3, mode is neither -6, 0 nor 6 and cond is less than 1.
If info = -4, mode equals 6 or -6 and idist is not in range 1 to 4.
If info = -7, n is negative.
?latm2
Returns an entry of a random matrix.
Syntax
res = slatm2( m, n, i, j, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng, iwork,
sparse )
res = dlatm2( m, n, i, j, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng, iwork,
sparse )
res = clatm2( m, n, i, j, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng, iwork,
sparse )
res = zlatm2( m, n, i, j, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng, iwork,
sparse )
Include Files
mkl.fi
Description
The ?latm2 routine returns entry (i , j ) of a random matrix of dimension (m, n). It is called by the ?latmr
routine in order to build random test matrices. No error checking on parameters is done, because this routine
is called in a tight loop by ?latmr which has already checked the parameters.
Use of ?latm2 differs from ?latm3 in the order in which the random number generator is called to fill in
random matrix entries. With ?latm2, the generator is called to fill in the pivoted matrix columnwise. With
?latm3, the generator is called to fill in the matrix columnwise, after which it is pivoted. Thus, ?latm3 can be
used to construct random matrices which differ only in their order of rows and/or columns. ?latm2 is used to
construct band matrices while avoiding calling the random number generator for entries outside the band
(and therefore generating random numbers).
The matrix whose (i, j) entry is returned is constructed as follows (this routine only computes one entry):
If i is outside (1..m) or j is outside (1..n), returns zero (this is convenient for generating matrices in
band format).
Generate a matrix A with random entries of distribution idist.
Set the diagonal to D.
Grade the matrix, if desired, from the left (by dl) and/or from the right (by dr or dl) as specified by
igrade.
Permute, if desired, the rows and/or columns as specified by ipvtng and iwork.
Band the matrix to have lower bandwidth kl and upper bandwidth ku.
Set random entries to zero as specified by sparse.
Input Parameters
m INTEGER. Number of rows of the matrix.
n INTEGER. Number of columns of the matrix.
i INTEGER. Row of the entry to be returned.
j INTEGER. Column of the entry to be returned.
kl INTEGER. Lower bandwidth.
ku INTEGER. Upper bandwidth.
idist INTEGER. On entry, idist specifies the type of distribution to be used to
generate a random matrix.
for slatm2 and dlatm2:
= 1: uniform (0,1)
= 2: uniform (-1,1)
= 3: normal (0,1)
for clatm2 and zlatm2:
= 4: complex number uniform in disk (0, 1)
iseed INTEGER. Array, size (4). Seed for the random number generator.
d REAL for slatm2,
DOUBLE PRECISION for dlatm2,
COMPLEX for clatm2,
DOUBLE COMPLEX for zlatm2,
Array, size (min(i, j)). Diagonal entries of matrix.
igrade INTEGER. Specifies grading of matrix as follows:
= 0: no grading
= 1: matrix premultiplied by diag( dl )
= 2: matrix postmultiplied by diag( dr )
= 3: matrix premultiplied by diag( dl ) and postmultiplied by diag( dr )
= 4: matrix premultiplied by diag( dl ) and postmultiplied by
inv( diag( dl ) )
For slatm2 and dlatm2:
= 5: matrix premultiplied by diag( dl ) and postmultiplied by diag( dl )
For clatm2 and zlatm2:
= 5: matrix premultiplied by diag( dl ) and postmultiplied by
diag( conjg( dl ) )
= 6: matrix premultiplied by diag( dl ) and postmultiplied by diag( dl )
dl REAL for slatm2,
DOUBLE PRECISION for dlatm2,
COMPLEX for clatm2,
DOUBLE COMPLEX for zlatm2,
Array, size (i or j), as appropriate.
Left scale factors for grading matrix.
dr REAL for slatm2,
DOUBLE PRECISION for dlatm2,
COMPLEX for clatm2,
DOUBLE COMPLEX for zlatm2,
Array, size (i or j), as appropriate.
Right scale factors for grading matrix.
ipvtng INTEGER. On entry specifies pivoting permutations as follows:

= 0: none
= 1: row pivoting
= 2: column pivoting
= 3: full pivoting, i.e., on both sides
iwork INTEGER.
Array, size (i or j), as appropriate. This array specifies the permutation
used. The row (or column) in position k was originally in position
iwork( k ). This differs from iwork for ?latm3.
sparse REAL for slatm2,
DOUBLE PRECISION for dlatm2,
REAL for clatm2,
DOUBLE PRECISION for zlatm2,
Specifies the sparsity of the matrix. If a sparse matrix is to be generated,
sparse should lie between 0 and 1. A uniform (0, 1) random number x is
generated and compared to sparse. If x is larger, the matrix entry is
unchanged, and if x is smaller, the entry is set to zero. Thus, on average,
a fraction sparse of the entries will be set to zero.
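The sparse rule and the banding step described above can be sketched in a few lines of Fortran. This is only an illustration, not the ?latm2 implementation: the module latm_sparse_sketch and the functions apply_sparse and in_band are hypothetical names, the uniform sample x is passed in by the caller rather than drawn from iseed, and the band test assumes that entry (i, j) lies inside the band when i - j <= kl and j - i <= ku.

```fortran
module latm_sparse_sketch
  implicit none
contains

  ! Hypothetical helper (not an MKL routine): apply the sparse rule to one
  ! entry. x is a uniform (0,1) sample supplied by the caller. The entry is
  ! set to zero when x is smaller than sparse and is left unchanged otherwise.
  pure function apply_sparse(val, x, sparse) result(res)
    real, intent(in) :: val, x, sparse
    real :: res
    if (x < sparse) then
      res = 0.0
    else
      res = val
    end if
  end function apply_sparse

  ! Hypothetical helper: .true. when (i, j) is inside a band with lower
  ! bandwidth kl and upper bandwidth ku (assumed convention, see lead-in).
  pure function in_band(i, j, kl, ku) result(inside)
    integer, intent(in) :: i, j, kl, ku
    logical :: inside
    inside = (i - j <= kl) .and. (j - i <= ku)
  end function in_band

end module latm_sparse_sketch
```

With sparse = 0.3, a sample x = 0.1 zeroes the entry and a sample x = 0.9 keeps it, so over many entries roughly 30 percent end up zero.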
Output Parameters
iseed INTEGER.
res REAL for slatm2,
DOUBLE PRECISION for dlatm2,
COMPLEX for clatm2,
DOUBLE COMPLEX for zlatm2,
Entry of a random matrix.
?latm3
Returns a specified entry of a random matrix.
Syntax
res = slatm3( m, n, i, j, isub, jsub, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng,
iwork, sparse )
res = dlatm3( m, n, i, j, isub, jsub, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng,
iwork, sparse )
res = clatm3( m, n, i, j, isub, jsub, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng,
iwork, sparse )
res = zlatm3( m, n, i, j, isub, jsub, kl, ku, idist, iseed, d, igrade, dl, dr, ipvtng,
iwork, sparse )
Include Files
mkl.fi
Description
The ?latm3 routine returns the (isub, jsub) entry of a random matrix of dimension (m, n) described by the
other parameters. (isub, jsub) is the final position of the (i, j) entry after pivoting according to ipvtng and
iwork. ?latm3 is called by the ?latmr routine in order to build random test matrices. No error checking on
parameters is done, because this routine is called in a tight loop by ?latmr, which has already checked the
parameters.
Use of ?latm3 differs from ?latm2 in the order in which the random number generator is called to fill in
random matrix entries. With ?latm2, the generator is called to fill in the pivoted matrix columnwise. With
?latm3, the generator is called to fill in the matrix columnwise, after which it is pivoted. Thus, ?latm3 can be
used to construct random matrices which differ only in their order of rows and/or columns. ?latm2 is used to
construct band matrices while avoiding calling the random number generator for entries outside the band
(and therefore generating random numbers in different orders for different pivot orders).
The matrix whose (isub, jsub) entry is returned is constructed as follows (this routine only computes one
entry):
1. If isub is outside (1..m) or jsub is outside (1..n), returns zero (this is convenient for generating
matrices in band format).
2. Generate a matrix A with random entries of distribution idist.
3. Set the diagonal to D.
4. Grade the matrix, if desired, from the left (by dl) and/or from the right (by dr or dl) as specified by
igrade.
5. Permute, if desired, the rows and/or columns as specified by ipvtng and iwork.
6. Band the matrix to have lower bandwidth kl and upper bandwidth ku.
7. Set random entries to zero as specified by sparse.
Input Parameters
m INTEGER. Number of rows of matrix.
n INTEGER. Number of columns of matrix.
i INTEGER. Row of unpivoted entry to be returned.
j INTEGER. Column of unpivoted entry to be returned.
isub INTEGER. Row of pivoted entry to be returned.
jsub INTEGER. Column of pivoted entry to be returned.
kl INTEGER. Lower bandwidth.
ku INTEGER. Upper bandwidth.
idist INTEGER. On entry, idist specifies the type of distribution to be used to
generate a random matrix.
for slatm3 and dlatm3:
= 1: uniform (0,1)
= 2: uniform (-1,1)
= 3: normal (0,1)
for clatm3 and zlatm3:
= 4: complex number uniform in disk (0, 1)
iseed INTEGER. Array, size (4). Seed for random number generator.
d REAL for slatm3,
DOUBLE PRECISION for dlatm3,
COMPLEX for clatm3,
DOUBLE COMPLEX for zlatm3,
Array, size (min(i, j)). Diagonal entries of matrix.
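igrade INTEGER. Specifies grading of matrix as follows:
= 0: no grading
= 1: matrix premultiplied by diag( dl )
= 2: matrix postmultiplied by diag( dr )
= 3: matrix premultiplied by diag( dl ) and postmultiplied by diag( dr )
= 4: matrix premultiplied by diag( dl ) and postmultiplied by
inv( diag( dl ) )
For slatm3 and dlatm3:
= 5: matrix premultiplied by diag( dl ) and postmultiplied by diag( dl )
For clatm3 and zlatm3:
= 5: matrix premultiplied by diag( dl ) and postmultiplied by
diag( conjg( dl ) )
= 6: matrix premultiplied by diag( dl ) and postmultiplied by diag( dl )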
dl REAL for slatm3,
DOUBLE PRECISION for dlatm3,
COMPLEX for clatm3,
DOUBLE COMPLEX for zlatm3,
Array, size (i or j, as appropriate).
Left scale factors for grading matrix.
dr REAL for slatm3,
DOUBLE PRECISION for dlatm3,
COMPLEX for clatm3,
DOUBLE COMPLEX for zlatm3,
Array, size (i or j, as appropriate).
Right scale factors for grading matrix.
ipvtng INTEGER. On entry specifies pivoting permutations as follows:

If ipvtng = 0: none.
If ipvtng = 1: row pivoting.
If ipvtng = 2: column pivoting.
If ipvtng = 3: full pivoting, i.e., on both sides.
sparse REAL for slatm3,
DOUBLE PRECISION for dlatm3,
REAL for clatm3,
DOUBLE PRECISION for zlatm3,
On entry, specifies the sparsity of the matrix if a sparse matrix is to be
generated. sparse should lie between 0 and 1. A uniform (0, 1) random
number x is generated and compared to sparse; if x is larger, the matrix
entry is unchanged, and if x is smaller, the entry is set to zero. Thus, on
average, a fraction sparse of the entries will be set to zero.
iwork INTEGER.
Array, size (i or j, as appropriate). This array specifies the permutation
used. The row (or column) originally in position k is in position iwork( k )
after pivoting. This differs from iwork for ?latm2.
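The two iwork conventions describe the same reordering from opposite directions: for ?latm2, iwork(k) is the original position of the row (or column) now in position k, while for ?latm3, iwork(k) is the final position of the row (or column) originally in position k. For one and the same reordering, the two arrays are therefore inverse permutations of each other. The following sketch converts between them; the module latm_iwork_sketch and the function invert_perm are hypothetical names, not MKL routines, and the input is assumed to be a valid permutation of 1..n.

```fortran
module latm_iwork_sketch
  implicit none
contains

  ! Hypothetical helper (not an MKL routine): invert a permutation of 1..n.
  ! If p follows the ?latm2 convention (p(k) = original position of the row
  ! now in position k), the result follows the ?latm3 convention
  ! (q(k) = final position of the row originally in position k), and vice versa.
  pure function invert_perm(p) result(q)
    integer, intent(in) :: p(:)
    integer :: q(size(p))
    integer :: k
    do k = 1, size(p)
      q(p(k)) = k
    end do
  end function invert_perm

end module latm_iwork_sketch
```

For example, iwork = (3, 1, 2) in the ?latm2 convention says that position 2 holds original row 1; the inverted array (2, 3, 1) states the same fact in the ?latm3 convention, namely that original row 1 ends up in position 2.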
Output Parameters
isub On exit, row of pivoted entry is updated.
jsub On exit, column of pivoted entry is updated.
res REAL for slatm3,
DOUBLE PRECISION for dlatm3,
COMPLEX for clatm3,
DOUBLE COMPLEX for zlatm3,
Entry of a random matrix.
?latm5
Generates matrices involved in the Generalized
Sylvester equation.
Syntax
call slatm5( prtype, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, r, ldr, l,
ldl, alpha, qblcka, qblckb )
call dlatm5( prtype, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, r, ldr, l,
ldl, alpha, qblcka, qblckb )
call clatm5( prtype, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, r, ldr, l,
ldl, alpha, qblcka, qblckb )
call zlatm5( prtype, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf, r, ldr, l,
ldl, alpha, qblcka, qblckb )
Include Files
mkl.fi
Description
The ?latm5 routine generates matrices involved in the Generalized Sylvester equation:
A * R - L * B = C
D * R - L * E = F
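They also satisfy the diagonalization condition:
[ I  -L ]   ( [ A  -C ]   [ D  -F ] )   [ I   R ]   ( [ A   0 ]   [ D   0 ] )
[ 0   I ] * ( [ 0   B ] , [ 0   E ] ) * [ 0   I ] = ( [ 0   B ] , [ 0   E ] )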
Input Parameters
prtype INTEGER. Specifies the type of matrices to generate.
If prtype = 1, A and B are Jordan blocks, D and E are identity matrices.
A:
If (i == j) then A(i, j) = 1.0.
If (j == i + 1) then A(i, j) = -1.0.
Otherwise A(i, j) = 0.0, i, j = 1...m.
B:
If (i == j) then B(i, j) = 1.0 - alpha.
If (j == i + 1) then B(i, j) = 1.0.
Otherwise B(i, j) = 0.0, i, j = 1...n.
D:
If (i == j) then D(i, j) = 1.0.
Otherwise D(i, j) = 0.0, i, j = 1...m.
E:
If (i == j) then E(i, j) = 1.0.
Otherwise E(i, j) = 0.0, i, j = 1...n.
L = R are chosen from [-10...10], which specifies the right hand sides
(C, F).
If prtype = 2 or 3, the matrices are triangular and/or quasi-triangular.
A:
If (i <= j) then A(i, j) = [-1...1].
Otherwise A(i, j) = 0.0, i, j = 1...m.
If (prtype = 3) then A(k + 1, k + 1) = A(k, k);
A(k + 1, k) = [-1...1];
sign(A(k, k + 1)) = -sign(A(k + 1, k)),
k = 1, m - 1, qblcka.
B:
If (i <= j) then B(i, j) = [-1...1].
Otherwise B(i, j) = 0.0, i, j = 1...n.
If (prtype = 3) then B(k + 1, k + 1) = B(k, k);
B(k + 1, k) = [-1...1];
sign(B(k, k + 1)) = -sign(B(k + 1, k)),
k = 1, n - 1, qblckb.
D:
If (i <= j) then D(i, j) = [-1...1].
Otherwise D(i, j) = 0.0, i, j = 1...m.
E:
If (i <= j) then E(i, j) = [-1...1].
Otherwise E(i, j) = 0.0, i, j = 1...n.
L, R are chosen from [-10...10], which specifies the right hand sides (C,
F).
If prtype = 4, the matrices are full:
A(i, j) = [-10...10]
D(i, j) = [-1...1], i, j = 1...m
B(i, j) = [-10...10]
E(i, j) = [-1...1], i, j = 1...n
R(i, j) = [-10...10]
L(i, j) = [-1...1], i = 1...m, j = 1...n
L and R specify the right hand sides (C, F).
If prtype = 5, special case with common and/or close eigenvalues.
m INTEGER. Specifies the order of A and D and the number of rows in C, F, R
and L.
n INTEGER. Specifies the order of B and E and the number of columns in C, F,
R and L.
lda INTEGER. The leading dimension of a.
ldb INTEGER. The leading dimension of b.
ldc INTEGER. The leading dimension of c.
ldd INTEGER. The leading dimension of d.
lde INTEGER. The leading dimension of e.
ldf INTEGER. The leading dimension of f.
ldr INTEGER. The leading dimension of r.
ldl INTEGER. The leading dimension of l.
alpha REAL for slatm5,
DOUBLE PRECISION for dlatm5,
REAL for clatm5,
DOUBLE PRECISION for zlatm5,
Parameter used in generating prtype = 1 and 5 matrices.
qblcka INTEGER. When prtype = 3, specifies the distance between 2-by-2 blocks
on the diagonal in A. Otherwise, qblcka is not referenced. qblcka > 1.
qblckb INTEGER. When prtype = 3, specifies the distance between 2-by-2 blocks
on the diagonal in B. Otherwise, qblckb is not referenced. qblckb > 1.
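The prtype = 1 case described under prtype can be reproduced by hand, which is a convenient way to see how the right hand sides follow from the Generalized Sylvester equation. The sketch below is not the ?latm5 implementation: the module latm5_prtype1_sketch, the function bidiag, and the subroutine sylvester_rhs are hypothetical names, and R and L are supplied by the caller instead of being chosen from [-10...10] by the routine.

```fortran
module latm5_prtype1_sketch
  implicit none
contains

  ! Hypothetical helper (not an MKL routine): n-by-n matrix with the value
  ! diag on the main diagonal, sup on the first superdiagonal, zero elsewhere.
  ! bidiag(m, 1.0, -1.0) gives A and bidiag(n, 1.0 - alpha, 1.0) gives B for
  ! prtype = 1; bidiag(k, 1.0, 0.0) gives the identity used for D and E.
  pure function bidiag(n, diag, sup) result(t)
    integer, intent(in) :: n
    real, intent(in) :: diag, sup
    real :: t(n, n)
    integer :: i
    t = 0.0
    do i = 1, n
      t(i, i) = diag
      if (i < n) t(i, i + 1) = sup
    end do
  end function bidiag

  ! Right hand sides of the Generalized Sylvester equation:
  !   C = A*R - L*B,  F = D*R - L*E
  subroutine sylvester_rhs(a, b, d, e, r, l, c, f)
    real, intent(in) :: a(:, :), b(:, :), d(:, :), e(:, :), r(:, :), l(:, :)
    real, intent(out) :: c(:, :), f(:, :)
    c = matmul(a, r) - matmul(l, b)
    f = matmul(d, r) - matmul(l, e)
  end subroutine sylvester_rhs

end module latm5_prtype1_sketch
```

Because D and E are identity matrices and L = R when prtype = 1, the second equation reduces to F = R - L = 0 in this sketch, so only C carries information about the chosen R.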
Output Parameters
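a REAL for slatm5,
DOUBLE PRECISION for dlatm5,
COMPLEX for clatm5,
DOUBLE COMPLEX for zlatm5,
Array, size (lda, m). On exit a contains the m-by-m array A initialized
according to prtype.
b REAL for slatm5,
DOUBLE PRECISION for dlatm5,
COMPLEX for clatm5,
DOUBLE COMPLEX for zlatm5,
Array, size (ldb, n). On exit b contains the n-by-n array B initialized
according to prtype.
c REAL for slatm5,
DOUBLE PRECISION for dlatm5,
COMPLEX for clatm5,
DOUBLE COMPLEX for zlatm5,
Array, size (ldc, n). On exit c contains the m-by-n array C initialized
according to prtype.
d REAL for slatm5,
DOUBLE PRECISION for dlatm5,
COMPLEX for clatm5,
DOUBLE COMPLEX for zlatm5,
Array, size (ldd, m). On exit d contains the m-by-m array D initialized
according to prtype.
e REAL for slatm5,
DOUBLE PRECISION for dlatm5,
COMPLEX for clatm5,
DOUBLE COMPLEX for zlatm5,
Array, size (lde, n). On exit e contains the n-by-n array E initialized
according to prtype.
f REAL for slatm5,
DOUBLE PRECISION for dlatm5,
COMPLEX for clatm5,
DOUBLE COMPLEX for zlatm5,
Array, size (ldf, n). On exit f contains the m-by-n array F initialized
according to prtype.
r REAL for slatm5,
DOUBLE PRECISION for dlatm5,
COMPLEX for clatm5,
DOUBLE COMPLEX for zlatm5,
Array, size (ldr, n). On exit r contains the m-by-n array R initialized
according to prtype.
l REAL for slatm5,
DOUBLE PRECISION for dlatm5,
COMPLEX for clatm5,
DOUBLE COMPLEX for zlatm5,
Array, size (ldl, n). On exit l contains the m-by-n array L initialized
according to prtype.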
?latm6
Generates test matrices for the generalized eigenvalue
problem, their corresponding right and left
eigenvector matrices, and also reciprocal condition
numbers for all eigenvalues and the reciprocal
condition numbers of eigenvectors corresponding to
the 1st and 5th eigenvalues.
Syntax
call slatm6( type, n, a, lda, b, x, ldx, y, ldy, alpha, beta, wx, wy, s, dif )
call dlatm6( type, n, a, lda, b, x, ldx, y, ldy, alpha, beta, wx, wy, s, dif )
call clatm6( type, n, a, lda, b, x, ldx, y, ldy, alpha, beta, wx, wy, s, dif )
call zlatm6( type, n, a, lda, b, x, ldx, y, ldy, alpha, beta, wx, wy, s, dif )
Include Files
mkl.fi
Description
The ?latm6 routine generates test matrices for the generalized eigenvalue problem, their corresponding right
and left eigenvector matrices, and also reciprocal condition numbers for all eigenvalues and the reciprocal
condition numbers of eigenvectors corresponding to the 1st and 5th eigenvalues.
There are two kinds of test matrix pairs:
(A, B) = inverse(YH) * (Da, Db) * inverse(X)
Type 1:
Type 2:
In both cases the same inverse(YH) and inverse(X) are used to compute (A, B), giving the exact eigenvectors
to (A, B) as (YH, X):
where a, b, x and y can take any values independently of each other.
Input Parameters
type INTEGER. Specifies the problem type.
n INTEGER. Size of the matrices A and B.
lda INTEGER. The leading dimension of a and of b.
ldx INTEGER. The leading dimension of x.
ldy INTEGER. The leading dimension of y.
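alpha, beta REAL for slatm6,
DOUBLE PRECISION for dlatm6,
COMPLEX for clatm6,
DOUBLE COMPLEX for zlatm6,
Weighting constants for matrix A.
wx REAL for slatm6,
DOUBLE PRECISION for dlatm6,
COMPLEX for clatm6,
DOUBLE COMPLEX for zlatm6,
Constant for right eigenvector matrix.
wy REAL for slatm6,
DOUBLE PRECISION for dlatm6,
COMPLEX for clatm6,
DOUBLE COMPLEX for zlatm6,
Constant for left eigenvector matrix.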
Output Parameters
a REAL for slatm6,
DOUBLE PRECISION for dlatm6,
COMPLEX for clatm6,
DOUBLE COMPLEX for zlatm6,
Array, size (lda, n). On exit, a contains the n-by-n matrix initialized
according to type.
b REAL for slatm6,
DOUBLE PRECISION for dlatm6,
COMPLEX for clatm6,
DOUBLE COMPLEX for zlatm6,
Array, size (lda, n). On exit, b contains the n-by-n matrix initialized
according to type.
x REAL for slatm6,
DOUBLE PRECISION for dlatm6,
COMPLEX for clatm6,
DOUBLE COMPLEX for zlatm6,
Array, size (ldx, n). On exit, x contains the n-by-n matrix of right
eigenvectors.
y REAL for slatm6,
DOUBLE PRECISION for dlatm6,
COMPLEX for clatm6,
DOUBLE COMPLEX for zlatm6,
Array, size (ldy, n). On exit, y is the n-by-n matrix of left eigenvectors.
s REAL for slatm6,
DOUBLE PRECISION for dlatm6,
REAL for clatm6,
DOUBLE PRECISION for zlatm6,
Array, size (n). s(i) is the reciprocal condition number for eigenvalue i.
dif REAL for slatm6,
DOUBLE PRECISION for dlatm6,
REAL for clatm6,
DOUBLE PRECISION for zlatm6,
Array, size (n). dif(i) is the reciprocal condition number for eigenvector
i.
?latme
Generates random non-symmetric square matrices
with specified eigenvalues.
Syntax
call slatme( n, dist, iseed, d, mode, cond, dmax, ei, rsign, upper, sim, ds, modes,
conds, kl, ku, anorm, a, lda, work, info )
call dlatme( n, dist, iseed, d, mode, cond, dmax, ei, rsign, upper, sim, ds, modes,
conds, kl, ku, anorm, a, lda, work, info )
call clatme( n, dist, iseed, d, mode, cond, dmax, ei, rsign, upper, sim, ds, modes,
conds, kl, ku, anorm, a, lda, work, info )
call zlatme( n, dist, iseed, d, mode, cond, dmax, ei, rsign, upper, sim, ds, modes,
conds, kl, ku, anorm, a, lda, work, info )
Include Files
mkl.fi
Description
The ?latme routine generates random non-symmetric square matrices with specified eigenvalues. ?latme
operates by applying the following sequence of operations:
1. Set the diagonal to d, where d may be input or computed according to mode, cond, dmax, and rsign as
described below.
2. If upper = 'T', the upper triangle of a is set to random values out of distribution dist.
3. If sim='T', a is multiplied on the left by a random matrix X, whose singular values are specified by ds,
modes, and conds, and on the right by X inverse.
4. If kl < n-1, the lower bandwidth is reduced to kl using Householder transformations. If ku < n-1,
the upper bandwidth is reduced to ku.
5. If anorm is not negative, the matrix is scaled to have maximum-element-norm anorm.
NOTE
Since the matrix cannot be reduced beyond Hessenberg form, no packing options are available.
Input Parameters
n INTEGER. The number of columns (or rows) of A.
dist CHARACTER*1. On entry, dist specifies the type of distribution to be used
to generate the random eigen-/singular values, and the upper triangle
(see upper).
If dist = 'U': uniform( 0, 1 )
If dist = 'S': uniform( -1, 1 )
If dist = 'N': normal( 0, 1 )
If dist = 'D': uniform on the complex disc |z| < 1.
iseed INTEGER. Array, size (4).
On entry, iseed specifies the seed of the random number generator. The
elements should lie between 0 and 4095 inclusive, and iseed(4) should be
odd. The random number generator uses a linear congruential sequence
limited to small integers, and so should produce machine independent
random numbers.
d REAL for slatme,

DOUBLE PRECISION for dlatme,
COMPLEX for clatme,
DOUBLE COMPLEX for zlatme,
Array, size (n). This array is used to specify the eigenvalues of A.
If mode = 0, then d is assumed to contain the eigenvalues. Otherwise they
are computed according to mode, cond, dmax, and rsign and placed in d.
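mode INTEGER. On entry mode describes how the eigenvalues are to be specified:
mode = 0 means use d (with ei for slatme and dlatme) as input.
mode = 1 sets d(1) = 1 and d(2:n) = 1.0/cond.
mode = 2 sets d(1:n-1) = 1 and d(n) = 1.0/cond.
mode = 3 sets d(i) = cond**(-(i-1)/(n-1)).
mode = 4 sets d(i) = 1 - (i-1)/(n-1)*(1 - 1/cond).
mode = 5 sets d to random numbers in the range (1/cond, 1) such that their
logarithms are uniformly distributed.
mode = 6 sets d to random numbers from the same distribution as the rest of
the matrix.
mode < 0 has the same meaning as abs(mode), except that the order of the
elements of d is reversed.
Thus if mode is between 1 and 4, d has entries ranging from 1 to 1/cond; if
between -1 and -4, d has entries ranging from 1/cond to 1.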
cond REAL for slatme,
DOUBLE PRECISION for dlatme,
REAL for clatme,
DOUBLE PRECISION for zlatme,
On entry, this is used as described under mode above. If used, it must be
>= 1.
dmax REAL for slatme,
DOUBLE PRECISION for dlatme,
COMPLEX for clatme,
DOUBLE COMPLEX for zlatme,
If mode is not -6, 0 or 6, the contents of d as computed according to mode
and cond are scaled by dmax / max(abs(d(i))). Note that dmax need
not be positive or real: if dmax is negative or complex (or zero), d will be
scaled by a negative or complex number (or zero). If rsign = 'F' then the
largest (absolute) eigenvalue will be equal to dmax.
ei CHARACTER*1. Used by slatme and dlatme only.
Array, size (n).
If mode = 0, and ei(1) is not ' ' (space character), this array specifies
which elements of d (on input) are real eigenvalues and which are the real
and imaginary parts of a complex conjugate pair of eigenvalues. The
elements of ei may then only have the values 'R' and 'I'.
If ei(j) = 'R' and ei(j + 1) = 'I', then the j-th eigenvalue is
cmplx( d(j), d(j + 1) ), and the (j + 1)-th is the complex conjugate
thereof.
If ei(j) = ei(j + 1) = 'R', then the j-th eigenvalue is d(j) (i.e., real).
ei(1) may not be 'I', nor may two adjacent elements of ei both have the
value 'I'.
If mode is not 0, then ei is ignored. If mode is 0 and ei(1) = ' ', then
the eigenvalues will all be real.
rsign CHARACTER*1. If mode is not 0, 6, or -6, and rsign = 'T', then the
elements of d, as computed according to mode and cond, are multiplied by
a random sign (+1 or -1) for slatme and dlatme or by a complex number
from the unit circle |z| = 1 for clatme and zlatme.
If rsign = 'F', the elements of d are not multiplied. rsign may only have
the values 'T' or 'F'.
upper CHARACTER*1. If upper = 'T', then the elements of a above the diagonal
will be set to random numbers out of dist.
If upper = 'F', they will not. upper may only have the values 'T' or 'F'.
sim CHARACTER*1. If sim = 'T', then a will be operated on by a "similarity

transform", i.e., multiplied on the left by a matrix X and on the right by X
inverse. X = USV, where U and V are random unitary matrices and S is a
(diagonal) matrix of singular values specified by ds, modes, and conds.
If sim = 'F', then a will not be transformed.
ds REAL for slatme,
DOUBLE PRECISION for dlatme,
REAL for clatme,
DOUBLE PRECISION for zlatme,
Array, size (n).
This array is used to specify the singular values of X, in the same way that
d specifies the eigenvalues of a. If modes = 0, then ds contains the singular
values, which may not be zero.
modes INTEGER.
Similar to mode, but for specifying the diagonal of S. modes = -6 and +6
are not allowed (since they would result in randomly ill-conditioned
eigenvalues.)
conds REAL for slatme,
DOUBLE PRECISION for dlatme,
REAL for clatme,
DOUBLE PRECISION for zlatme,
Similar to cond, but for specifying the diagonal of S.
kl INTEGER. This specifies the lower bandwidth of the matrix. kl = 1 specifies
upper Hessenberg form. If kl is at least n-1, then a will have full lower
bandwidth.
ku INTEGER. This specifies the upper bandwidth of the matrix. ku = 1
specifies lower Hessenberg form.
If ku is at least n-1, then a will have full upper bandwidth.
If kl and ku are both at least n-1, then a will be dense. Only one of ku and
kl may be less than n-1.
anorm REAL for slatme,
DOUBLE PRECISION for dlatme,
REAL for clatme,
DOUBLE PRECISION for zlatme,
If anorm is not negative, then a is scaled by a non-negative real number to
make the maximum-element-norm of a equal to anorm.
lda INTEGER. Number of rows of matrix A.
work REAL for slatme,
DOUBLE PRECISION for dlatme,
COMPLEX for clatme,
DOUBLE COMPLEX for zlatme,
Array, size (3*n). Workspace.
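The deterministic settings of the mode parameter (1 through 4) are simple closed-form formulas, so the diagonal they produce can be previewed before calling ?latme. The sketch below is not an MKL routine: the module latm_mode_sketch and the function mode_diagonal are hypothetical names. It assumes n >= 2, works in single precision only, and assumes that a negative mode gives the same values as abs(mode) in reversed order, which is consistent with the statement above that modes -1 to -4 range from 1/cond to 1.

```fortran
module latm_mode_sketch
  implicit none
contains

  ! Hypothetical helper (not an MKL routine): diagonal values for
  ! mode = 1..4 and, under the reversal assumption, mode = -1..-4.
  ! Requires n >= 2 and cond >= 1. Other modes return zeros because they
  ! either use d as input or depend on random numbers.
  function mode_diagonal(mode, cond, n) result(d)
    integer, intent(in) :: mode, n
    real, intent(in) :: cond
    real :: d(n)
    integer :: i
    select case (abs(mode))
    case (1)                       ! d(1) = 1, d(2:n) = 1/cond
      d = 1.0 / cond
      d(1) = 1.0
    case (2)                       ! d(1:n-1) = 1, d(n) = 1/cond
      d = 1.0
      d(n) = 1.0 / cond
    case (3)                       ! geometric decay from 1 to 1/cond
      do i = 1, n
        d(i) = cond**(-real(i - 1) / real(n - 1))
      end do
    case (4)                       ! linear decay from 1 to 1/cond
      do i = 1, n
        d(i) = 1.0 - real(i - 1) / real(n - 1) * (1.0 - 1.0 / cond)
      end do
    case default
      d = 0.0
    end select
    if (mode < 0) d = d(n:1:-1)    ! assumed: negative mode reverses the order
  end function mode_diagonal

end module latm_mode_sketch
```

For cond = 4 and n = 3, mode = 4 gives (1, 0.625, 0.25) and mode = -2 gives (0.25, 1, 1). The dmax scaling and the random signs controlled by rsign are applied afterwards and are not part of this sketch.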
Output Parameters
iseed INTEGER.
d Modified if mode is nonzero.
ds Modified if modes is nonzero.
a REAL for slatme,
DOUBLE PRECISION for dlatme,
COMPLEX for clatme,
DOUBLE COMPLEX for zlatme,
Array, size (lda, n). On exit, a is the desired test matrix.
info INTEGER.
If info = 0, execution is successful.
If info = -1, n is negative.
If info = -2, dist is an illegal string.
If info = -6, cond is less than 1.0, and mode is not -6, 0, or 6.
If info = -9, rsign is not 'T' or 'F'.
If info = -10, upper is not 'T' or 'F'.
If info = -11, sim is not 'T' or 'F'.
If info = -12, modes = 0 and ds has a zero singular value.
If info = -13, modes is not in the range -5 to 5.
If info = -14, modes is nonzero and conds is less than 1.
If info = -15, kl is less than 1.
If info = -16, ku is less than 1, or kl and ku are both less than n-1.
If info = -19, lda is less than n.
If info = 1, error return from ?latm1 (computing d).
If info = 2, cannot scale to dmax (max. eigenvalue is 0).
If info = 3, error return from slatm1 (for slatme and clatme), dlatm1
(for dlatme and zlatme).
If info = 4, error return from ?large.
If info = 5, zero singular value from slatm1 (for slatme and clatme),
dlatm1 (for dlatme and zlatme).
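For slatme and dlatme with mode = 0, the pair (d, ei) encodes a mix of real eigenvalues and complex conjugate pairs. The sketch below decodes that pair into explicit complex eigenvalues, following the rules given under ei. It is not an MKL routine: the module latme_ei_sketch and the function decode_eigenvalues are hypothetical names, and the input is assumed to be valid (ei(1) is not 'I' and no two adjacent elements are 'I').

```fortran
module latme_ei_sketch
  implicit none
contains

  ! Hypothetical helper (not an MKL routine): expand (d, ei) into complex
  ! eigenvalues for the mode = 0 case of slatme/dlatme.
  function decode_eigenvalues(d, ei) result(lambda)
    real, intent(in) :: d(:)
    character(len=1), intent(in) :: ei(:)
    complex :: lambda(size(d))
    integer :: j, n
    n = size(d)
    ! ei(1) = ' ' means every eigenvalue is real.
    if (ei(1) == ' ') then
      lambda = cmplx(d, 0.0)
      return
    end if
    j = 1
    do while (j <= n)
      if (j < n) then
        ! 'R' followed by 'I': d(j) and d(j+1) are the real and imaginary
        ! parts of eigenvalue j, and eigenvalue j+1 is its conjugate.
        if (ei(j + 1) == 'I') then
          lambda(j) = cmplx(d(j), d(j + 1))
          lambda(j + 1) = conjg(lambda(j))
          j = j + 2
          cycle
        end if
      end if
      lambda(j) = cmplx(d(j), 0.0)   ! plain real eigenvalue
      j = j + 1
    end do
  end function decode_eigenvalues

end module latme_ei_sketch
```

With d = (1, 2, 3, 4) and ei = ('R', 'R', 'I', 'R'), the decoded eigenvalues are 1, 2 + 3i, 2 - 3i, and 4.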
?latmr
Generates random matrices of various types.
Syntax
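call slatmr (m, n, dist, iseed, sym, d, mode, cond, dmax, rsign, grade, dl, model,
condl, dr, moder, condr, pivtng, ipivot, kl, ku, sparse, anorm, pack, a, lda, iwork,
info)
call dlatmr (m, n, dist, iseed, sym, d, mode, cond, dmax, rsign, grade, dl, model,
condl, dr, moder, condr, pivtng, ipivot, kl, ku, sparse, anorm, pack, a, lda, iwork,
info)
call clatmr (m, n, dist, iseed, sym, d, mode, cond, dmax, rsign, grade, dl, model,
condl, dr, moder, condr, pivtng, ipivot, kl, ku, sparse, anorm, pack, a, lda, iwork,
info)
call zlatmr (m, n, dist, iseed, sym, d, mode, cond, dmax, rsign, grade, dl, model,
condl, dr, moder, condr, pivtng, ipivot, kl, ku, sparse, anorm, pack, a, lda, iwork,
info)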
Description
The ?latmr routine operates by applying the following sequence of operations:
1. Generate a matrix A with random entries of distribution dist:
If sym = 'S', the matrix is symmetric,
If sym = 'H', the matrix is Hermitian,
If sym = 'N', the matrix is nonsymmetric.

2. Set the diagonal to D, where D may be input or computed according to mode, cond, dmax and rsign as
described below.
3. Grade the matrix, if desired, from the left or right as specified by grade. The inputs dl, model, condl,
dr, moder and condr also determine the grading as described below.
4. Permute, if desired, the rows and/or columns as specified by pivtng and ipivot.
5. Set random entries to zero, if desired, to get a random sparse matrix as specified by sparse.
6. Make A a band matrix, if desired, by zeroing out the matrix outside a band of lower bandwidth kl and
upper bandwidth ku.
7. Scale A, if desired, to have maximum entry anorm.
8. Pack the matrix if desired. See options specified by the pack parameter.
NOTE
If two calls to ?latmr differ only in the pack parameter, they generate mathematically equivalent
matrices. If two calls to ?latmr both have full bandwidth (kl = m-1 and ku = n-1), and differ only in
the pivtng and pack parameters, then the matrices generated differ only in the order of the rows and
columns, and otherwise contain the same data. This consistency cannot be and is not maintained with
less than full bandwidth.
Input Parameters
m INTEGER. Number of rows of A.
n INTEGER. Number of columns of A.
dist CHARACTER. On entry, dist specifies the type of distribution to be used to
generate a random matrix.
If dist = 'U', real and imaginary parts are independent uniform( 0, 1 ).
If dist = 'S', real and imaginary parts are independent uniform( -1, 1 ).
If dist = 'N', real and imaginary parts are independent normal( 0, 1 ).
If dist = 'D', distribution is uniform on interior of unit disk.
iseed INTEGER. Array, size (4).
On entry, iseed specifies the seed of the random number generator. The
elements should lie between 0 and 4095 inclusive, and iseed(4) should be odd. The
random number generator uses a linear congruential sequence limited to
small integers, and so should produce machine independent random
numbers.
sym CHARACTER. If sym = 'S', generated matrix is symmetric.
If sym = 'H', generated matrix is Hermitian.
If sym = 'N', generated matrix is nonsymmetric.
d REAL for slatmr,
DOUBLE PRECISION for dlatmr,
COMPLEX for clatmr,
DOUBLE COMPLEX for zlatmr,
Array, size (min(m, n)). On entry, this array specifies the diagonal entries
of A. d may either be specified on entry, or set according to mode and cond as
described below. If the matrix is Hermitian, the real part of d is taken. May
be changed on exit if mode is nonzero.
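mode INTEGER. On entry describes how d is to be used:
mode = 0 means use d as input.
mode = 1 sets d(1) = 1 and d(2:n) = 1.0/cond.
mode = 2 sets d(1:n-1) = 1 and d(n) = 1.0/cond.
mode = 3 sets d(i) = cond**(-(i-1)/(n-1)).
mode = 4 sets d(i) = 1 - (i-1)/(n-1)*(1 - 1/cond).
mode = 5 sets d to random numbers in the range (1/cond, 1) such that their
logarithms are uniformly distributed.
mode = 6 sets d to random numbers from the same distribution as the rest of
the matrix.
mode < 0 has the same meaning as abs(mode), except that the order of the
elements of d is reversed.
Thus if mode is between 1 and 4, d has entries ranging from 1 to 1/cond; if
between -1 and -4, d has entries ranging from 1/cond to 1.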
cond REAL for slatmr,
DOUBLE PRECISION for dlatmr,
REAL for clatmr,
DOUBLE PRECISION for zlatmr,
On entry, used as described under mode above. If used, cond must be >= 1.
dmax REAL for slatmr,
DOUBLE PRECISION for dlatmr,
COMPLEX for clatmr,
DOUBLE COMPLEX for zlatmr,
If mode is not -6, 0, or 6, the diagonal is scaled by dmax /
max(abs(d(i))), so that the maximum absolute entry of the diagonal is
abs(dmax). If dmax is complex (or zero), the diagonal is scaled by a
complex number (or zero).
rsign CHARACTER. If mode is not -6, 0, or 6, specifies the sign of the diagonal as
follows:
For slatmr and dlatmr, if rsign = 'T', diagonal entries are multiplied by 1
or -1 with a probability of 0.5.
For clatmr and zlatmr, if rsign = 'T', diagonal entries are multiplied by
a random complex number uniformly distributed with absolute value 1.
If rsign = 'F', diagonal entries are unchanged.
grade CHARACTER. Specifies grading of matrix as follows:
If grade = 'N', there is no grading.
If grade = 'L', matrix is premultiplied by diag( dl ) (only if matrix is
nonsymmetric).
If grade = 'R', matrix is postmultiplied by diag( dr ) (only if matrix is
nonsymmetric).
If grade = 'B', matrix is premultiplied by diag( dl ) and postmultiplied by
diag( dr ) (only if matrix is nonsymmetric).
If grade = 'H', matrix is premultiplied by diag( dl ) and postmultiplied by
diag( conjg(dl) ) (only if matrix is Hermitian or nonsymmetric).
If grade = 'S', matrix is premultiplied by diag( dl ) and postmultiplied by
diag( dl ) (only if matrix is symmetric or nonsymmetric).
If grade = 'E', matrix is premultiplied by diag( dl ) and postmultiplied by
inv( diag( dl ) ) (only if matrix is nonsymmetric).
NOTE
If grade = 'E', then m must equal n.
dl REAL for slatmr,
DOUBLE PRECISION for dlatmr,
COMPLEX for clatmr,
DOUBLE COMPLEX for zlatmr,
Array, size (m).
If model = 0, then on entry this array specifies the diagonal entries of a
diagonal matrix used as described under grade above.
If model is not zero, then dl is set according to model and condl,
analogous to the way d is set according to mode and cond (except there is
no dmax parameter for dl).
If grade = 'E', then dl cannot have zero entries.
Not referenced if grade = 'N' or 'R'. Changed on exit.
model INTEGER. This specifies how the diagonal array dl is computed, just as
mode specifies how D is computed.
condl REAL for slatmr,
DOUBLE PRECISION for dlatmr,
REAL for clatmr,
DOUBLE PRECISION for zlatmr,
When model is not zero, this specifies the condition number of the
computed dl.
dr REAL for slatmr,
DOUBLE PRECISION for dlatmr,
COMPLEX for clatmr,
DOUBLE COMPLEX for zlatmr,
Array, size (n).
If moder = 0, then on entry this array specifies the diagonal entries of a
diagonal matrix used as described under grade above.
If moder is not zero, then dr is set according to moder and condr,
analogous to the way d is set according to mode and cond (except there is
no dmax parameter for dr).
Not referenced if grade = 'N', 'L', 'H', 'S' or 'E'.
moder INTEGER. This specifies how the diagonal array dr is to be computed, just
as mode specifies how d is to be computed.
condr REAL for slatmr and clatmr,

DOUBLE PRECISION for dlatmr and zlatmr,
When moder is not zero, this specifies the condition number of the
computed dr.
pivtng CHARACTER. On entry specifies pivoting permutations as follows:
If pivtng = 'N' or ' ': no pivoting permutation.
If pivtng = 'L': left or row pivoting (matrix must be nonsymmetric).
If pivtng = 'R': right or column pivoting (matrix must be nonsymmetric).
If pivtng = 'B' or 'F': both or full pivoting, i.e., on both sides. In this
case, m must equal n.
If two calls to ?latmr both have full bandwidth (kl = m-1 and ku =
n-1), and differ only in the pivtng and pack parameters, then the matrices
generated differ only in the order of the rows and columns, and otherwise
contain the same data. This consistency cannot be maintained with less
than full bandwidth.
ipivot INTEGER. Array, size (n or m). This array specifies the permutation used.
After the basic matrix is generated, the rows, columns, or both are
permuted.
If row pivoting is selected, ?latmr starts with the last row and interchanges
row m and row ipivot(m), then moves to the next-to-last row,
interchanging row (m-1) and row ipivot(m-1), and so on. In terms of
"2-cycles", the permutation is (1 ipivot(1)) (2 ipivot(2)) ...
(m ipivot(m)) where the rightmost cycle is applied first. This is the inverse
of the effect of pivoting in LINPACK. The idea is that factoring (with
pivoting) an identity matrix which has been inverse-pivoted in this way
should result in a pivot vector identical to ipivot. Not referenced if pivtng
= 'N'.
sparse REAL for slatmr,
DOUBLE PRECISION for dlatmr,
REAL for clatmr,
DOUBLE PRECISION for zlatmr,
On entry, specifies the sparsity of the matrix if a sparse matrix is to be
generated. sparse should lie between 0 and 1. To generate a sparse
matrix, for each matrix entry a uniform (0, 1) random number x is
generated and compared to sparse; if x is larger, the matrix entry is
unchanged, and if x is smaller, the entry is set to zero. Thus, on average,
a fraction sparse of the entries is set to zero.
kl INTEGER. On entry, specifies the lower bandwidth of the matrix. For
example, kl = 0 implies upper triangular, kl = 1 implies upper
Hessenberg, and kl at least m-1 implies the matrix is not banded. Must
equal ku if matrix is symmetric or Hermitian.
ku INTEGER. On entry, specifies the upper bandwidth of the matrix. For
example, ku = 0 implies lower triangular, ku = 1 implies lower
Hessenberg, and ku at least n-1 implies the matrix is not banded. Must
equal kl if matrix is symmetric or Hermitian.
anorm REAL for slatmr,
DOUBLE PRECISION for dlatmr,
REAL for clatmr,
DOUBLE PRECISION for zlatmr,
On entry, specifies the maximum entry of the output matrix (the output matrix is
multiplied by a constant so that its largest absolute entry equals anorm) if
anorm is nonnegative. If anorm is negative, no scaling is done.
pack CHARACTER. On entry, specifies packing of matrix as follows:
If pack = 'N': no packing
If pack = 'U': zero out all subdiagonal entries (if symmetric or Hermitian)
If pack = 'L': zero out all superdiagonal entries (if symmetric or
Hermitian)
If pack = 'C': store the upper triangle columnwise (only if matrix
symmetric or Hermitian or square upper triangular)
If pack = 'R': store the lower triangle columnwise (only if matrix
symmetric or Hermitian or square lower triangular) (same as upper half
rowwise if symmetric) (same as conjugate upper half rowwise if Hermitian)
If pack = 'B': store the lower triangle in band storage scheme (only if
matrix symmetric or Hermitian)
If pack = 'Q': store the upper triangle in band storage scheme (only if
matrix symmetric or Hermitian)
If pack = 'Z': store the entire matrix in band storage scheme (pivoting
can be provided for by using this option to store A in the trailing rows of the
allocated storage)
Using these options, the various LAPACK packed and banded storage
schemes can be obtained:
LAPACK storage scheme    Value of pack
GB                       'Z'
PB, HB or TB             'B' or 'Q'
PP, HP or TP             'C' or 'R'
If two calls to ?latmr differ only in the pack parameter, they generate
mathematically equivalent matrices.
lda INTEGER. On entry, lda specifies the first dimension of a as declared in the
calling program.
If pack = 'N', 'U' or 'L', lda must be at least max( 1, m ).
If pack = 'C' or 'R', lda must be at least 1.
If pack = 'B' or 'Q', lda must be min( ku + 1, n ).
If pack = 'Z', lda must be at least kuu + kll + 1, where kuu =
min( ku, n-1 ) and kll = min( kl, n-1 ).
iwork INTEGER. Array, size (n or m). Workspace. Not referenced if pivtng = 'N'.
Changed on exit.
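The row interchange order described under ipivot (last row first, then upward) can be sketched as follows. This is only an illustration of the stated order of 2-cycles, not the ?latmr implementation, which generates entries one at a time: the module latmr_pivot_sketch and the subroutine apply_row_pivots are hypothetical names, and ipivot is assumed to hold valid row indices.

```fortran
module latmr_pivot_sketch
  implicit none
contains

  ! Hypothetical helper (not an MKL routine): apply the row interchanges
  ! (k ipivot(k)) for k = m, m-1, ..., 1 to an m-by-n matrix in place.
  subroutine apply_row_pivots(a, ipivot)
    real, intent(inout) :: a(:, :)
    integer, intent(in) :: ipivot(:)
    real :: tmp(size(a, 2))
    integer :: k
    do k = size(a, 1), 1, -1
      tmp = a(k, :)                  ! swap row k with row ipivot(k)
      a(k, :) = a(ipivot(k), :)
      a(ipivot(k), :) = tmp
    end do
  end subroutine apply_row_pivots

end module latmr_pivot_sketch
```

Applied to a 3-by-3 identity matrix with ipivot = (2, 3, 3), the result has its ones at positions (1,3), (2,1), and (3,2). Factoring that matrix with partial pivoting selects row 2 for the first column and row 3 for the second, which reproduces the pivot vector (2, 3, 3), as the description of ipivot suggests.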
Output Parameters
iseed On exit, the seed is changed.
d May be changed on exit if mode is nonzero.
dl On exit, array is changed.
dr On exit, array is changed.
a REAL for slatmr,
DOUBLE PRECISION for dlatmr,
COMPLEX for clatmr,
DOUBLE COMPLEX for zlatmr,
On exit, a is the desired test matrix. Only those entries of a which are
significant on output are referenced (even if a is in packed or band storage
format). The unoccupied corners of a in band format are zeroed out.
info INTEGER.
If info = -1, m is negative or unequal to n and sym = 'S' or 'H'.
If info = -2, n is negative.
If info = -3, dist is an illegal string.
If info = -5, sym is an illegal string.
If info = -8, cond is less than 1.0, and mode is neither -6, 0 nor 6.
If info = -10, mode is neither -6, 0 nor 6 and rsign is an illegal string.
If info = -11, grade is an illegal string, or grade = 'E' and m is not
equal to n, or grade = 'L', 'R', 'B', 'S' or 'E' and sym = 'H', or
grade = 'L', 'R', 'B', 'H' or 'E' and sym = 'S'.
If info = -12, grade = 'E' and dl contains zero.
If info = -13, model is not in range -6 to 6 and grade = 'L', 'B',
'H', 'S' or 'E'.
If info = -14, condl is less than 1.0, grade = 'L', 'B', 'H', 'S' or
'E', and model is neither -6, 0 nor 6.
If info = -16, moder is not in range -6 to 6 and grade = 'R' or 'B'.
If info = -17, condr is less than 1.0, grade = 'R' or 'B', and moder
is neither -6, 0 nor 6.
If info = -18, pivtng is an illegal string, or pivtng = 'B' or 'F' and m
is not equal to n, or pivtng = 'L' or 'R' and sym = 'S' or 'H'.
If info = -19, ipivot contains out of range number and pivtng is not
equal to 'N'.
If info = -20, kl is negative.
If info = -21, ku is negative, or sym = 'S' or 'H' and ku not equal to
kl.
If info = -22, sparse is not in range 0 to 1.
If info = -24, pack is an illegal string, or pack = 'U', 'L', 'B' or
'Q' and sym = 'N', or pack = 'C' and sym = 'N' and either kl is not
equal to 0 or n is not equal to m, or pack = 'R' and sym = 'N', and either
ku is not equal to 0 or n is not equal to m.
If info = -26, lda is too small.
If info = 1, error return from ?latm1 (computing D).
If info = 2, cannot scale to dmax (max. entry is 0).
If info = 3, error return from ?latm1 (computing dl).
If info = 4, error return from ?latm1 (computing dr).
If info = 5, anorm is positive, but matrix constructed prior to attempting
to scale it to have norm anorm, is zero.
?latdf
Uses the LU factorization of the n-by-n matrix
computed by ?getc2 and computes a contribution to
the reciprocal Dif-estimate.
Syntax
call slatdf( ijob, n, z, ldz, rhs, rdsum, rdscal, ipiv, jpiv )
call dlatdf( ijob, n, z, ldz, rhs, rdsum, rdscal, ipiv, jpiv )
call clatdf( ijob, n, z, ldz, rhs, rdsum, rdscal, ipiv, jpiv )
call zlatdf( ijob, n, z, ldz, rhs, rdsum, rdscal, ipiv, jpiv )
Include Files
mkl.fi
Description
The routine ?latdf uses the LU factorization of the n-by-n matrix Z computed by ?getc2 and computes a
contribution to the reciprocal Dif-estimate by solving Z*x = b for x, and choosing the right-hand side b such
that the norm of x is as large as possible. On entry rhs = b holds the contribution from earlier solved sub-
systems, and on return rhs = x.
The factorization of Z returned by ?getc2 has the form Z = P*L*U*Q, where P and Q are permutation
matrices. L is lower triangular with unit diagonal elements and U is upper triangular.
Input Parameters
ijob INTEGER.
ijob = 2: First compute an approximative null-vector e of Z using ?gecon,
e is normalized, and solve for Z*x = ±e - f with the sign giving the greater
value of 2-norm(x). This option is about 5 times as expensive as default.
ijob ≠ 2 (default): Local look ahead strategy where all entries of the right-
hand side b are chosen as either +1 or -1.
n INTEGER. The number of columns of the matrix Z.
z REAL for slatdf
DOUBLE PRECISION for dlatdf
COMPLEX for clatdf
DOUBLE COMPLEX for zlatdf.
Array, DIMENSION (ldz, n)
On entry, the LU part of the factorization of the n-by-n matrix Z computed
by ?getc2: Z = P*L*U*Q.
ldz INTEGER. The leading dimension of the array z. ldz ≥ max(1, n).
rhs REAL for slatdf
DOUBLE PRECISION for dlatdf
COMPLEX for clatdf
DOUBLE COMPLEX for zlatdf.
On entry, rhs contains contributions from other subsystems.
rdsum REAL for slatdf/clatdf
On entry, the sum of squares of computed contributions to the Dif-estimate
under computation by ?tgsyl, where the scaling factor rdscal has been
factored out. If trans = 'T', rdsum is not touched.
Note that rdsum only makes sense when ?tgsy2 is called by ?tgsyl.
rdscal REAL for slatdf/clatdf
On entry, scaling factor used to prevent overflow in rdsum.
If trans = 'T', rdscal is not touched.
Note that rdscal only makes sense when ?tgsy2 is called by ?tgsyl.
ipiv INTEGER.
The pivot indices; for 1 ≤ i ≤ n, row i of the matrix has been interchanged
with row ipiv(i).
jpiv INTEGER.
The pivot indices; for 1 ≤ j ≤ n, column j of the matrix has been interchanged
with column jpiv(j).
Output Parameters
rhs On exit, rhs contains the solution of the subsystem with entries according to
the value of ijob.
rdsum On exit, the corresponding sum of squares updated with the contributions
from the current sub-system.
If trans = 'T', rdsum is not touched.
rdscal On exit, rdscal is updated with respect to the current contributions in
rdsum.
If trans = 'T', rdscal is not touched.
?latps
Solves a triangular system of equations with the
matrix held in packed storage.
Syntax
call slatps( uplo, trans, diag, normin, n, ap, x, scale, cnorm, info )
call dlatps( uplo, trans, diag, normin, n, ap, x, scale, cnorm, info )
call clatps( uplo, trans, diag, normin, n, ap, x, scale, cnorm, info )
call zlatps( uplo, trans, diag, normin, n, ap, x, scale, cnorm, info )
Include Files
mkl.fi
Description
The routine ?latps solves one of the triangular systems
A*x = s*b, A^T*x = s*b, or A^H*x = s*b (for complex flavors)
with scaling to prevent overflow, where A is an upper or lower triangular matrix stored in packed form. Here
A^T denotes the transpose of A, A^H denotes the conjugate transpose of A, x and b are n-element vectors, and
s is a scaling factor, usually less than or equal to 1, chosen so that the components of x will be less than the
overflow threshold. If the unscaled problem does not cause overflow, the Level 2 BLAS routine ?tpsv is
called. If the matrix A is singular (A(j, j) = 0 for some j), then s is set to 0 and a non-trivial solution to
A*x = 0 is returned.
Input Parameters
uplo CHARACTER*1.
= 'L': lower triangular
trans CHARACTER*1.
diag CHARACTER*1.
Specifies whether the matrix A is unit triangular.
= 'U': unit triangular
normin CHARACTER*1.
Specifies whether cnorm is set.
= 'N': cnorm is not set on entry. On exit, the norms will be computed and
stored in cnorm.
ap REAL for slatps
DOUBLE PRECISION for dlatps
COMPLEX for clatps
DOUBLE COMPLEX for zlatps.
The upper or lower triangular matrix A, packed columnwise in a linear array.
The j-th column of A is stored in the array ap as follows:
if uplo = 'U', ap(i + (j-1)*j/2) = A(i, j) for 1 ≤ i ≤ j;
if uplo = 'L', ap(i + (j-1)*(2n-j)/2) = A(i, j) for j ≤ i ≤ n.
x REAL for slatps
DOUBLE PRECISION for dlatps
COMPLEX for clatps
DOUBLE COMPLEX for zlatps.
Array, DIMENSION (n)
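cnorm REAL for slatps/clatps
DOUBLE PRECISION for dlatps/zlatps.
If normin = 'Y', cnorm is an input argument and cnorm(j) contains the
norm of the off-diagonal part of the j-th column of A.
If trans = 'N', cnorm(j) must be greater than or equal to the infinity-norm,
and if trans = 'T' or 'C', cnorm(j) must be greater than or equal
to the 1-norm.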
Output Parameters
x On exit, x is overwritten by the solution vector x.
scale REAL for slatps/clatps

DOUBLE PRECISION for dlatps/zlatps.
The scaling factor s for the triangular system as described above.
If scale = 0, the matrix A is singular or badly scaled, and the vector x is
an exact or approximate solution to A*x = 0.
info INTEGER.
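The column-wise packed layout described for ap can be sketched in plain Python. This is not MKL code: the helper names packed_index and pack_triangle are hypothetical, real MKL calls are not involved, and indices are 1-based to match the formulas given for ap.

```python
def packed_index(uplo, i, j, n):
    """Return the 1-based position in ap of A(i, j) for column-wise packed storage."""
    if uplo == 'U':
        if not (1 <= i <= j <= n):
            raise ValueError("need 1 <= i <= j <= n for uplo = 'U'")
        return i + (j - 1) * j // 2
    if uplo == 'L':
        if not (1 <= j <= i <= n):
            raise ValueError("need 1 <= j <= i <= n for uplo = 'L'")
        return i + (j - 1) * (2 * n - j) // 2
    raise ValueError("uplo must be 'U' or 'L'")


def pack_triangle(uplo, a):
    """Pack the chosen triangle of a square list-of-lists matrix column by column."""
    n = len(a)
    ap = [None] * (n * (n + 1) // 2)
    for j in range(1, n + 1):
        rows = range(1, j + 1) if uplo == 'U' else range(j, n + 1)
        for i in rows:
            ap[packed_index(uplo, i, j, n) - 1] = a[i - 1][j - 1]
    return ap
```

For a 3-by-3 matrix the upper triangle occupies positions 1 to 6 in the order A(1,1), A(1,2), A(2,2), A(1,3), A(2,3), A(3,3), which is the order the sketch produces.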
?latrd
Reduces the first nb rows and columns of a
symmetric/Hermitian matrix A to real tridiagonal form
by an orthogonal/unitary similarity transformation.
Syntax
call slatrd( uplo, n, nb, a, lda, e, tau, w, ldw )
call dlatrd( uplo, n, nb, a, lda, e, tau, w, ldw )
call clatrd( uplo, n, nb, a, lda, e, tau, w, ldw )
call zlatrd( uplo, n, nb, a, lda, e, tau, w, ldw )
Include Files
mkl.fi
Description
The routine ?latrd reduces nb rows and columns of a real symmetric or complex Hermitian matrix A to
symmetric/Hermitian tridiagonal form by an orthogonal/unitary similarity transformation Q^T*A*Q for real
flavors, Q^H*A*Q for complex flavors, and returns the matrices V and W which are needed to apply the
transformation to the unreduced part of A.
If uplo = 'U', ?latrd reduces the last nb rows and columns of a matrix, of which the upper triangle is
supplied;
if uplo = 'L', ?latrd reduces the first nb rows and columns of a matrix, of which the lower triangle is
supplied.
This is an auxiliary routine called by ?sytrd/?hetrd.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric/
Hermitian matrix A is stored:
nb INTEGER. The number of rows and columns to be reduced.
a REAL for slatrd

DOUBLE PRECISION for dlatrd
COMPLEX for clatrd
DOUBLE COMPLEX for zlatrd.
On entry, the symmetric/Hermitian matrix A.
lda INTEGER. The leading dimension of the array a. lda ≥ max(1, n).
ldw INTEGER.
The leading dimension of the output array w. ldw ≥ max(1, n).
Output Parameters
a On exit, if uplo = 'U', the last nb columns have been reduced to
tridiagonal form, with the diagonal elements overwriting the diagonal
elements of a; the elements above the diagonal with the array tau,
represent the orthogonal/unitary matrix Q as a product of elementary
reflectors;
if uplo = 'L', the first nb columns have been reduced to tridiagonal form,
with the diagonal elements overwriting the diagonal elements of a; the
elements below the diagonal with the array tau, represent the orthogonal/
unitary matrix Q as a product of elementary reflectors.
e REAL for slatrd/clatrd

DOUBLE PRECISION for dlatrd/zlatrd.
If uplo = 'U', e(n-nb:n-1) contains the superdiagonal elements of the
last nb columns of the reduced matrix;
if uplo = 'L', e(1:nb) contains the subdiagonal elements of the first nb
columns of the reduced matrix.
tau REAL for slatrd

COMPLEX for clatrd
The scalar factors of the elementary reflectors, stored in tau(n-nb:n-1) if

uplo = 'U', and in tau(1:nb) if uplo = 'L'.
w REAL for slatrd

COMPLEX for clatrd
Array, DIMENSION (ldw, n).
The n-by-nb matrix W required to update the unreduced part of A.
Application Notes
If uplo = 'U', the matrix Q is represented as a product of elementary reflectors
Q = H(n)*H(n-1)*...*H(n-nb+1)
Each H(i) has the form H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(i:n) = 0 and v(i-1) = 1;
v(1:i-1) is stored on exit in a(1:i-1, i), and tau in tau(i-1).
If uplo = 'L', the matrix Q is represented as a product of elementary reflectors
Q = H(1)*H(2)*...*H(nb)
Each H(i) has the form H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(1:i) = 0 and v(i+1) = 1;
v(i+1:n) is stored on exit in a(i+1:n, i), and tau in tau(i).
The elements of the vectors v together form the n-by-nb matrix V which is needed, with W, to apply the
transformation to the unreduced part of the matrix, using a symmetric/Hermitian rank-2k update of the
form:
A := A - V*W' - W*V'.
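The contents of a on exit are illustrated by the following examples with n = 5 and nb = 2:
if uplo = 'U':                 if uplo = 'L':
( a   a   a   v4  v5 )         ( d                  )
(     a   a   v4  v5 )         ( 1   d              )
(         a   1   v5 )         ( v1  1   a          )
(             d   1  )         ( v1  v2  a   a      )
(                 d  )         ( v1  v2  a   a   a  )
where d denotes a diagonal element of the reduced matrix, a denotes an element of the original matrix that
is unchanged, and vi denotes an element of the vector defining H(i).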
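As a rough illustration of the storage convention for uplo = 'L', the sketch below reads the vector v of H(i) back out of a and applies H(i) = I - tau*v*v' to a vector. It is plain Python for real data only, not MKL code; the helper names reflector_vector_lower and apply_reflector are hypothetical, and a is assumed to be a list of rows holding the contents left by the routine.

```python
def reflector_vector_lower(a, i):
    """Return v for H(i) (1-based i) from a as described for uplo = 'L'."""
    n = len(a)
    v = [0.0] * n                      # v(1:i) = 0
    for r in range(i + 1, n + 1):      # v(i+1:n) is read from a(i+1:n, i)
        v[r - 1] = a[r - 1][i - 1]
    return v


def apply_reflector(tau, v, x):
    """Return H*x for H = I - tau*v*v' (real data)."""
    dot = sum(vi * xi for vi, xi in zip(v, x))
    return [xi - tau * vi * dot for vi, xi in zip(v, x)]
```

With tau = 0 the reflector is the identity, and with v equal to a unit coordinate vector and tau = 2 it simply flips the sign of one component.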
?latrs
Solves a triangular system of equations with the scale
factor set to prevent overflow.
Syntax
call slatrs( uplo, trans, diag, normin, n, a, lda, x, scale, cnorm, info )
call dlatrs( uplo, trans, diag, normin, n, a, lda, x, scale, cnorm, info )
call clatrs( uplo, trans, diag, normin, n, a, lda, x, scale, cnorm, info )
call zlatrs( uplo, trans, diag, normin, n, a, lda, x, scale, cnorm, info )
Include Files
mkl.fi
Description
The routine solves one of the triangular systems
A*x = s*b, A^T*x = s*b, or A^H*x = s*b (for complex flavors)
with scaling to prevent overflow. Here A is an upper or lower triangular matrix, A^T denotes the transpose of
A, A^H denotes the conjugate transpose of A, x and b are n-element vectors, and s is a scaling factor, usually
less than or equal to 1, chosen so that the components of x will be less than the overflow threshold. If the
unscaled problem will not cause overflow, the Level 2 BLAS routine ?trsv is called. If the matrix A is singular
(A(j,j) = 0 for some j), then s is set to 0 and a non-trivial solution to A*x = 0 is returned.
Input Parameters
uplo CHARACTER*1.
trans CHARACTER*1.
diag CHARACTER*1.
normin CHARACTER*1.
Specifies whether cnorm has been set or not.
= 'N': cnorm is not set on entry. On exit, the norms will be computed and stored in cnorm.
n INTEGER. The order of the matrix A. n ≥ 0.
a REAL for slatrs

DOUBLE PRECISION for dlatrs
COMPLEX for clatrs
DOUBLE COMPLEX for zlatrs.
Array, DIMENSION (lda, n). Contains the triangular matrix A.
If uplo = 'U', the leading n-by-n upper triangular part of the array a
contains the upper triangular matrix, and the strictly lower triangular part
of a is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of the array a
contains the lower triangular matrix, and the strictly upper triangular part
of a is not referenced.
If diag = 'U', the diagonal elements of a are also not referenced and are
assumed to be 1.
lda INTEGER. The leading dimension of the array a. lda ≥ max(1, n).
x REAL for slatrs

DOUBLE PRECISION for dlatrs
COMPLEX for clatrs
DOUBLE COMPLEX for zlatrs.
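cnorm REAL for slatrs/clatrs
DOUBLE PRECISION for dlatrs/zlatrs.
If normin = 'Y', cnorm is an input argument and cnorm(j) contains the
norm of the off-diagonal part of the j-th column of A.
If trans = 'N', cnorm(j) must be greater than or equal to the infinity-norm,
and if trans = 'T' or 'C', cnorm(j) must be greater than or equal
to the 1-norm.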
Output Parameters
x On exit, x is overwritten by the solution vector x.
scale REAL for slatrs/clatrs
DOUBLE PRECISION for dlatrs/zlatrs.
The scaling factor s for the triangular system as described above.
If scale = 0, the matrix A is singular or badly scaled, and the vector x is
an exact or approximate solution to A*x = 0.
info INTEGER.
Application Notes
A rough bound on x is computed; if that is less than overflow, ?trsv is called, otherwise, specific code is
used which checks for possible overflow or divide-by-zero at every operation.
A columnwise scheme is used for solving A*x = b. The basic algorithm if A is lower triangular is
x[1:n] := b[1:n]
for j = 1, ..., n
    x(j) := x(j) / A(j,j)
    x[j+1:n] := x[j+1:n] - x(j)*A[j+1:n,j]
end
Define bounds on the components of x after j iterations of the loop:
M(j) = bound on x[1:j]
G(j) = bound on x[j+1:n]
Initially, let M(0) = 0 and G(0) = max{x(i), i=1,...,n}.
Then for iteration j+1 we have
M(j+1) ≤ G(j) / |A(j+1,j+1)|
G(j+1) ≤ G(j) + M(j+1)*|A[j+2:n,j+1]|
       ≤ G(j)*(1 + cnorm(j+1) / |A(j+1,j+1)|),
where cnorm(j+1) is greater than or equal to the infinity-norm of column j+1 of A, not counting the diagonal.
Hence
G(j) ≤ G(0) * product over 1 ≤ i ≤ j of (1 + cnorm(i) / |A(i,i)|)
and
|x(j)| ≤ (G(0) / |A(j,j)|) * product over 1 ≤ i < j of (1 + cnorm(i) / |A(i,i)|).
Since |x(j)| ≤ M(j), we use the Level 2 BLAS routine ?trsv if the reciprocal of the largest M(j),
j=1,..,n, is larger than max(underflow, 1/overflow).
The bound on x(j) is also used to determine when a step in the columnwise method can be performed
without fear of overflow. If the computed bound is greater than a large constant, x is scaled to prevent
overflow, but if the bound overflows, x is set to 0, x(j) to 1, and scale to 0, and a non-trivial solution to Ax =
0 is found.
Similarly, a row-wise scheme is used to solve A^T*x = b or A^H*x = b. The basic algorithm for A upper triangular
is
for j = 1, ..., n
    x(j) := ( b(j) - A[1:j-1,j]' x[1:j-1] ) / A(j,j)
end
We simultaneously compute two bounds
G(j) = bound on ( b(i) - A[1:i-1,i]'*x[1:i-1] ), 1 ≤ i ≤ j
M(j) = bound on x(i), 1 ≤ i ≤ j
The initial values are G(0) = 0, M(0) = max{ b(i), i=1,..,n}, and we add the constraint G(j) ≥
G(j-1) and M(j) ≥ M(j-1) for j ≥ 1.
Then the bound on x(j) is
M(j) ≤ M(j-1)*(1 + cnorm(j)) / |A(j,j)|
and we can safely call ?trsv if 1/M(n) and 1/G(n) are both greater than max(underflow, 1/overflow).
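The columnwise scheme and its growth bounds for a lower triangular A can be sketched as follows. This is a minimal plain-Python illustration for real data with a nonzero diagonal, not the MKL implementation; the function names are hypothetical, and the sketch performs no scaling, it only evaluates the recurrences for M(j) and G(j) next to the unscaled solve.

```python
def solve_lower_columnwise(a, b):
    """Solve A*x = b for lower triangular A with the columnwise scheme (no scaling)."""
    n = len(a)
    x = list(b)
    for j in range(n):
        x[j] = x[j] / a[j][j]
        for i in range(j + 1, n):
            x[i] -= x[j] * a[i][j]
    return x


def growth_bounds_lower(a, b):
    """Return ([M(1), ..., M(n)], [G(0), ..., G(n)]) from the recurrences in the text."""
    n = len(a)
    # cnorm[j]: infinity-norm of column j of A below the diagonal (0-based j)
    cnorm = [max((abs(a[i][j]) for i in range(j + 1, n)), default=0.0)
             for j in range(n)]
    g = [max(abs(v) for v in b)]                   # G(0)
    m = []
    for j in range(n):
        diag = abs(a[j][j])                        # assumed nonzero
        m.append(g[-1] / diag)                     # M(j+1) <= G(j) / |A(j+1,j+1)|
        g.append(g[-1] * (1.0 + cnorm[j] / diag))  # G(j+1) bound
    return m, g
```

Comparing 1/max(M) with max(underflow, 1/overflow) would then decide whether the plain solve is safe, which is the test the text attributes to the routine.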
?latrz
Factors an upper trapezoidal matrix by means of
orthogonal/unitary transformations.
Syntax
call slatrz( m, n, l, a, lda, tau, work )
call dlatrz( m, n, l, a, lda, tau, work )
call clatrz( m, n, l, a, lda, tau, work )
call zlatrz( m, n, l, a, lda, tau, work )
Include Files
mkl.fi
Description
The routine ?latrz factors the m-by-(m+l) real/complex upper trapezoidal matrix
[A1 A2] = [A(1:m, 1:m) A(1:m, n-l+1:n)]
as ( R 0 )*Z, by means of orthogonal/unitary transformations. Z is an (m+l)-by-(m+l) orthogonal/unitary
matrix and R and A1 are m-by-m upper triangular matrices.
Input Parameters
l INTEGER. The number of columns of the matrix A containing the meaningful
part of the Householder vectors.
n-m ≥ l ≥ 0.
a REAL for slatrz

DOUBLE PRECISION for dlatrz
COMPLEX for clatrz
DOUBLE COMPLEX for zlatrz.
On entry, the leading m-by-n upper trapezoidal part of the array a must
contain the matrix to be factorized.
work REAL for slatrz

COMPLEX for clatrz
Workspace array, DIMENSION (m).
Output Parameters
a On exit, the leading m-by-m upper triangular part of a contains the upper
triangular matrix R, and elements n-l+1 to n of the first m rows of a, with
the array tau, represent the orthogonal/unitary matrix Z as a product of m
elementary reflectors.
tau REAL for slatrz

COMPLEX for clatrz
Application Notes
The factorization is obtained by Householder's method. The k-th transformation matrix, Z(k), which is used to
introduce zeros into the (m - k + 1)-th row of A, is given in the form
Z(k) = ( I    0   )
       ( 0   T(k) )
where for real flavors
T(k) = I - tau*u(k)*u(k)^T,  u(k) = ( 1; 0; z(k) ),
and for complex flavors
T(k) = I - tau*u(k)*u(k)^H,  u(k) = ( 1; 0; z(k) ).
Here ( 1; 0; z(k) ) denotes the column vector formed by stacking 1, a zero block, and z(k);
tau is a scalar and z(k) is an l-element vector. tau and z(k) are chosen to annihilate the elements of the k-th
row of A2.
The scalar tau is returned in the k-th element of tau and the vector u(k) in the k-th row of A2, such that the
elements of z(k) are in a(k, l+1), ..., a(k, n).
The elements of R are returned in the upper triangular part of A1.
Z is given by
Z = Z(1)*Z(2)*...*Z(m).
?lauu2
Computes the product U*U^T (U*U^H) or L^T*L (L^H*L),
where U and L are upper or lower triangular matrices
Syntax
call slauu2( uplo, n, a, lda, info )
call dlauu2( uplo, n, a, lda, info )
call clauu2( uplo, n, a, lda, info )
call zlauu2( uplo, n, a, lda, info )
Include Files
mkl.fi
Description
The routine ?lauu2 computes the product U*U^T or L^T*L for real flavors, and U*U^H or L^H*L for complex
flavors. Here the triangular factor U or L is stored in the upper or lower triangular part of the array a.
If uplo = 'U' or 'u', then the upper triangle of the result is stored, overwriting the factor U in A.
If uplo = 'L' or 'l', then the lower triangle of the result is stored, overwriting the factor L in A.
This is the unblocked form of the algorithm, calling BLAS Level 2 Routines.
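What the uplo = 'U' case computes can be sketched in plain Python for real data. This is an illustration only, not the MKL algorithm: the helper name lauu2_upper is hypothetical, and unlike the Fortran routine it returns a new matrix instead of overwriting its argument.

```python
def lauu2_upper(u):
    """Return a copy of u whose upper triangle holds the upper triangle of U*U^T."""
    n = len(u)
    a = [row[:] for row in u]
    for i in range(n):
        for j in range(i, n):
            # (U*U^T)(i,j) = sum over k >= j of U(i,k)*U(j,k), since U(j,k) = 0 for k < j
            a[i][j] = sum(u[i][k] * u[j][k] for k in range(j, n))
    return a
```

The strictly lower triangle of the result is left exactly as it was in the input, mirroring the statement that only the upper triangle of the result is stored.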
Input Parameters
uplo CHARACTER*1.
Specifies whether the triangular factor stored in the array a is upper or
lower triangular:
= 'U': upper triangular
= 'L': lower triangular
n INTEGER. The order of the triangular factor U or L. n ≥ 0.
a REAL for slauu2
DOUBLE PRECISION for dlauu2
COMPLEX for clauu2
DOUBLE COMPLEX for zlauu2.
Array, DIMENSION (lda, n). On entry, the triangular factor U or L.
Output Parameters
a On exit,
if uplo = 'U', then the upper triangle of a is overwritten with the upper
triangle of the product U*U^T (U*U^H);
if uplo = 'L', then the lower triangle of a is overwritten with the lower
triangle of the product L^T*L (L^H*L).
info INTEGER.
?lauum
Computes the product U*U^T (U*U^H) or L^T*L (L^H*L),
where U and L are upper or lower triangular matrices
(blocked algorithm).
Syntax
call slauum( uplo, n, a, lda, info )
call dlauum( uplo, n, a, lda, info )
call clauum( uplo, n, a, lda, info )
call zlauum( uplo, n, a, lda, info )
Include Files
mkl.fi
Description
The routine ?lauum computes the product U*U^T or L^T*L for real flavors, and U*U^H or L^H*L for complex
flavors. Here the triangular factor U or L is stored in the upper or lower triangular part of the array a.
If uplo = 'U' or 'u', then the upper triangle of the result is stored, overwriting the factor U in A.
If uplo = 'L' or 'l', then the lower triangle of the result is stored, overwriting the factor L in A.
This is the blocked form of the algorithm, calling BLAS Level 3 Routines.
Input Parameters
uplo CHARACTER*1.
Specifies whether the triangular factor stored in the array a is upper or
lower triangular:
= 'U': upper triangular
= 'L': lower triangular
n INTEGER. The order of the triangular factor U or L. n ≥ 0.
a REAL for slauum

DOUBLE PRECISION for dlauum
COMPLEX for clauum
DOUBLE COMPLEX for zlauum.
Array of size (lda, n).
On entry, the triangular factor U or L.
Output Parameters
a On exit,
if uplo = 'U', then the upper triangle of a is overwritten with the upper
triangle of the product U*U^T (U*U^H);
if uplo = 'L', then the lower triangle of a is overwritten with the lower
triangle of the product L^T*L (L^H*L).
info INTEGER.
If info = -k, the k-th parameter had an illegal value.
?orbdb1/?unbdb1
Simultaneously bidiagonalizes the blocks of a tall and
skinny matrix with orthonormal columns.
Syntax
call sorbdb1( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call dorbdb1( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call cunbdb1( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call zunbdb1( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
Include Files
mkl.fi, lapack.f90
Description
The routines ?orbdb1/?unbdb1 simultaneously bidiagonalize the blocks of a tall and skinny matrix X with
orthonormal columns, where X consists of the top block x11 and the bottom block x21.
The size of x11 is p by q, and x21 is (m - p) by q. q must not be larger than p, m-p, or m-q.
Tall and Skinny Matrix Routines
q ≤ min(p, m - p, m - q)        ?orbdb1/?unbdb1
p ≤ min(q, m - p, m - q)        ?orbdb2/?unbdb2
m - p ≤ min(p, q, m - q)        ?orbdb3/?unbdb3
m - q ≤ min(p, q, m - p)        ?orbdb4/?unbdb4
The orthogonal/unitary matrices p1, p2, and q1 are p-by-p, (m-p)-by-(m-p), (m-q)-by-(m-q), respectively.
p1, p2, and q1 are represented as products of elementary reflectors. See the description of ?orcsd2by1/?
uncsd2by1 for details on generating p1, p2, and q1 using ?orgqr and ?orglq.
The upper-bidiagonal matrices b11 and b21 of size q by q are represented implicitly by angles theta(1), ...,
theta(q) and phi(1), ..., phi(q-1). Every entry in each bidiagonal band is a product of a sine or cosine of
theta with a sine or cosine of phi. See [Sutton09] or the description of ?orcsd/?uncsd for details.
Input Parameters
m INTEGER. The number of rows in x11 plus the number of rows in x21.
q INTEGER. The number of columns in x11 and x21. 0 ≤ q ≤ min(p, m-p, m-q).
x11 REAL for sorbdb1

DOUBLE PRECISION for dorbdb1
COMPLEX for cunbdb1
DOUBLE COMPLEX for zunbdb1
Array, DIMENSION (ldx11,q).
On entry, the top block of the orthogonal/unitary matrix to be reduced.
ldx11 INTEGER. The leading dimension of the array x11. ldx11 ≥ p.
x21 REAL for sorbdb1
COMPLEX for cunbdb1
On entry, the bottom block of the orthogonal/unitary matrix to be reduced.
ldx21 INTEGER. The leading dimension of the array x21. ldx21 ≥ m-p.
work REAL for sorbdb1

COMPLEX for cunbdb1
Workspace array, DIMENSION (lwork).

xerbla.
Output Parameters
x11 The columns of tril(x11) specify reflectors for p1 and the rows of
triu(x11,1) specify reflectors for q1, where tril(A) denotes the lower
triangle of A, and triu(A) denotes the upper triangle of A.
x21 On exit, the columns of tril(x21) specify the reflectors for p2
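theta REAL for sorbdb1
COMPLEX for cunbdb1
Array, DIMENSION (q). The entries of bidiagonal blocks b11 and b21 can be
computed from the angles theta and phi. See the Description section for
details.
phi REAL for sorbdb1
COMPLEX for cunbdb1
Array, DIMENSION (q-1). The entries of bidiagonal blocks b11 and b21 can be
computed from the angles theta and phi. See the Description section for
details.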
taup1 REAL for sorbdb1
COMPLEX for cunbdb1
Array, DIMENSION (p).
taup2 REAL for sorbdb1
COMPLEX for cunbdb1
Array, DIMENSION (m-p).
tauq1 REAL for sorbdb1
COMPLEX for cunbdb1
Array, DIMENSION (q).
info INTEGER.
See Also
?orcsd/?uncsd Computes the CS decomposition of a block-partitioned orthogonal/unitary matrix.
?orcsd2by1/?uncsd2by1 Computes the CS decomposition of a block-partitioned orthogonal/unitary
matrix.
?orbdb2/?unbdb2 Simultaneously bidiagonalizes the blocks of a tall and skinny matrix with
orthonormal columns.
?orbdb5/?unbdb5 Orthogonalizes a column vector with respect to the orthonormal basis matrix.
xerbla
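?orbdb2/?unbdb2
Simultaneously bidiagonalizes the blocks of a tall and
skinny matrix with orthonormal columns.
Syntax
call sorbdb2( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call dorbdb2( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call cunbdb2( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call zunbdb2( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )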
Include Files
mkl.fi, lapack.f90
Description
The size of x11 is p by q, and x21 is (m - p) by q. p must not be larger than q, m-p, or m-q.
The upper-bidiagonal matrices b11 and b21 of size p by p are represented implicitly by angles theta(1), ...,
theta(q) and phi(1), ..., phi(q-1). Every entry in each bidiagonal band is a product of a sine or cosine of
theta with a sine or cosine of phi. See [Sutton09] or the description of ?orcsd/?uncsd for details.
Input Parameters
p INTEGER. The number of rows in x11. 0 ≤ p ≤ min(q, m-p, m-q).

COMPLEX for cunbdb2
COMPLEX for cunbdb2

COMPLEX for cunbdb2

xerbla.
Output Parameters
x11 On exit: the columns of tril(x11) specify reflectors for p1 and the rows of
triu(x11,1) specify reflectors for q1.

COMPLEX for cunbdb2
details.

COMPLEX for cunbdb2
details.

COMPLEX for cunbdb2


COMPLEX for cunbdb2

COMPLEX for cunbdb2

info INTEGER.
See Also
?orcsd/?uncsd
matrix.
xerbla
?orbdb3/?unbdb3
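Syntax
call sorbdb3( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call dorbdb3( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call cunbdb3( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )
call zunbdb3( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1, work,
lwork, info )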
Include Files
mkl.fi, lapack.f90
Description
The size of x11 is p by q, and x21 is (m - p) by q. m-p must not be larger than p, q, or m-q.
The upper-bidiagonal matrices b11 and b21 of size (m-p) by (m-p) are represented implicitly by angles
theta(1), ..., theta(q) and phi(1), ..., phi(q-1). Every entry in each bidiagonal band is a product of a
sine or cosine of theta with a sine or cosine of phi. See [Sutton09] or the description of ?orcsd/?uncsd for
details.
Input Parameters
p INTEGER. The number of rows in x11. 0 ≤ p ≤ m, m-p ≤ min(p, q, m-q).

COMPLEX for cunbdb3

COMPLEX for cunbdb3

COMPLEX for cunbdb3

xerbla.
Output Parameters

COMPLEX for cunbdb3
details.

COMPLEX for cunbdb3
details.
COMPLEX for cunbdb3

COMPLEX for cunbdb3

COMPLEX for cunbdb3

info INTEGER.
See Also
?orcsd/?uncsd
matrix.
xerbla
?orbdb4/?unbdb4
Syntax
call sorbdb4( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1,
phantom, work, lwork, info )
call dorbdb4( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1,
phantom, work, lwork, info )
call cunbdb4( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1,
phantom, work, lwork, info )
call zunbdb4( m, p, q, x11, ldx11, x21, ldx21, theta, phi, taup1, taup2, tauq1,
phantom, work, lwork, info )
Include Files
mkl.fi, lapack.f90
Description
The size of x11 is p by q, and x21 is (m - p) by q. m-q must not be larger than q, p, or m-p.
The upper-bidiagonal matrices b11 and b21 of size (m-q) by (m-q) are represented implicitly by angles
theta(1), ..., theta(q) and phi(1), ..., phi(q-1). Every entry in each bidiagonal band is a product of a
sine or cosine of theta with a sine or cosine of phi. See [Sutton09] or the description of ?orcsd/?uncsd for
details.
Input Parameters
q INTEGER. The number of columns in x11 and x21. 0 ≤ q ≤ m and 0 ≤ m-q ≤ min(p, m-p, q).

COMPLEX for cunbdb4

COMPLEX for cunbdb4

COMPLEX for cunbdb4

xerbla.
Output Parameters

COMPLEX for cunbdb4
details.

COMPLEX for cunbdb4
details.

COMPLEX for cunbdb4

COMPLEX for cunbdb4

COMPLEX for cunbdb4

phantom REAL for sorbdb4
COMPLEX for cunbdb4
The routine computes an m-by-1 column vector y that is orthogonal to the

columns of [ x11; x21 ]. phantom(1:p) and phantom(p+1:m) contain
Householder vectors for y(1:p) and y(p+1:m), respectively.
info INTEGER.
See Also
?orcsd/?uncsd
matrix.
xerbla
?orbdb5/?unbdb5
Orthogonalizes a column vector with respect to the
orthonormal basis matrix.
Syntax
call sorbdb5( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
call dorbdb5( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
call cunbdb5( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
call zunbdb5( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
Include Files
mkl.fi, lapack.f90
Description
The ?orbdb5/?unbdb5 routines orthogonalize the column vector
X = [ x1; x2 ]
with respect to the columns of
Q = [ q1; q2 ].
The columns of Q must be orthonormal.

If the projection is zero according to Kahan's "twice is enough" criterion, then some other vector from the
orthogonal complement is returned. This vector is chosen in an arbitrary but deterministic way.
Input Parameters
m1 INTEGER
The dimension of x1 and the number of rows in q1. 0 ≤ m1.
m2 INTEGER
n INTEGER
The number of columns in q1 and q2. 0 ≤ n.
x1 REAL for sorbdb5
DOUBLE PRECISION for dorbdb5
COMPLEX for cunbdb5
COMPLEX*16 for zunbdb5
Array of size m1.
The top part of the vector to be orthogonalized.
incx1 INTEGER
Increment for entries of x1.
x2 REAL for sorbdb5
COMPLEX for cunbdb5
Array of size m2.
The bottom part of the vector to be orthogonalized.
incx2 INTEGER
q1 REAL for sorbdb5
COMPLEX for cunbdb5
Array of size (ldq1, n).
The top part of the orthonormal basis matrix.
ldq1 INTEGER
The leading dimension of q1. ldq1 ≥ m1.
q2 REAL for sorbdb5
COMPLEX for cunbdb5
The bottom part of the orthonormal basis matrix.
ldq2 INTEGER
work REAL for sorbdb5
COMPLEX for cunbdb5
Workspace array of size lwork.
lwork INTEGER
The size of the array work. lwork ≥ n.
Output Parameters
x1 The top part of the projected vector.
x2 The bottom part of the projected vector.
info INTEGER.
See Also
?orcsd/?uncsd
matrix.
xerbla
?orbdb6/?unbdb6
Orthogonalizes a column vector with respect to the
orthonormal basis matrix.
Syntax
call sorbdb6( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
call dorbdb6( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
call cunbdb6( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
call zunbdb6( m1, m2, n, x1, incx1, x2, incx2, q1, ldq1, q2, ldq2, work, lwork, info )
Include Files
mkl.fi, lapack.f90
Description
The ?orbdb6/?unbdb6 routines orthogonalize the column vector
X = [ x1; x2 ]
with respect to the columns of
Q = [ q1; q2 ].
The columns of Q must be orthonormal.

If the projection is zero according to Kahan's "twice is enough" criterion, then the zero vector is returned.
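The idea of projecting a vector onto the orthogonal complement of the columns of Q, with a second pass when the first one loses most of the norm, can be sketched in plain Python for real data. This is only an illustration under stated assumptions, not the MKL algorithm: the function name orthogonalize is hypothetical, x stands for the stacked vector, q_cols is a list of the stacked orthonormal columns, and the threshold alphasq = 0.01 and the exact acceptance rule are assumptions chosen for illustration that this section does not specify.

```python
def orthogonalize(x, q_cols, alphasq=0.01):
    """Project x away from the orthonormal columns in q_cols, using at most two passes."""
    def project(v):
        # v - Q*(Q^T*v): remove the components of v along each column of Q
        coeffs = [sum(c_i * v_i for c_i, v_i in zip(col, v)) for col in q_cols]
        return [v_i - sum(c * col[i] for c, col in zip(coeffs, q_cols))
                for i, v_i in enumerate(v)]

    def normsq(v):
        return sum(t * t for t in v)

    before = normsq(x)
    y = project(x)
    after = normsq(y)
    if after >= alphasq * before or after == 0:
        return y
    # the first pass lost most of the norm, so repeat the projection once
    before = after
    y = project(y)
    after = normsq(y)
    if after < alphasq * before:
        # assumed rule: treat the projection as zero
        return [0.0] * len(y)
    return y
```

A vector that already lies in the span of the columns of Q comes back as the zero vector, while a vector with a substantial component outside that span is returned after a single pass.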
Input Parameters
m1 INTEGER
m2 INTEGER
n INTEGER
The number of columns in q1 and q2. 0 ≤ n.
x1 REAL for sorbdb6
COMPLEX for cunbdb6
Array of size m1.
The top part of the vector to be orthogonalized.
incx1 INTEGER
x2 REAL for sorbdb6
COMPLEX for cunbdb6
Array of size m2.
The bottom part of the vector to be orthogonalized.
incx2 INTEGER
q1 REAL for sorbdb6
COMPLEX for cunbdb6
The top part of the orthonormal basis matrix.
ldq1 INTEGER
q2 REAL for sorbdb6
COMPLEX for cunbdb6
The bottom part of the orthonormal basis matrix.
ldq2 INTEGER
work REAL for sorbdb6
COMPLEX for cunbdb6
lwork INTEGER
The size of the array work. lwork ≥ n.
Output Parameters
x1 The top part of the projected vector.
x2 The bottom part of the projected vector.
info INTEGER.
See Also
?orcsd/?uncsd
matrix.
xerbla
?org2l/?ung2l
Generates all or part of the orthogonal/unitary matrix
Q from a QL factorization determined by ?geqlf
Syntax
call sorg2l( m, n, k, a, lda, tau, work, info )
call dorg2l( m, n, k, a, lda, tau, work, info )
call cung2l( m, n, k, a, lda, tau, work, info )
call zung2l( m, n, k, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine ?org2l/?ung2l generates an m-by-n real/complex matrix Q with orthonormal columns, which is
defined as the last n columns of a product of k elementary reflectors of order m:
Q = H(k)*...*H(2)*H(1) as returned by ?geqlf.
Input Parameters
m INTEGER. The number of rows of the matrix Q. m ≥ 0.
n INTEGER. The number of columns of the matrix Q. m ≥ n ≥ 0.
k INTEGER. The number of elementary reflectors whose product defines the
matrix Q. n ≥ k ≥ 0.
a REAL for sorg2l

DOUBLE PRECISION for dorg2l
COMPLEX for cung2l
DOUBLE COMPLEX for zung2l.
On entry, the (n-k+i)-th column must contain the vector which defines the
elementary reflector H(i), for i = 1,2,..., k, as returned by ?geqlf in the
last k columns of its array argument A.
tau REAL for sorg2l
COMPLEX for cung2l
tau(i) must contain the scalar factor of the elementary reflector H(i), as
returned by ?geqlf.
work REAL for sorg2l
COMPLEX for cung2l
Output Parameters
a On exit, the m-by-n matrix Q.
info INTEGER.
?org2r/?ung2r
Generates all or part of the orthogonal/unitary matrix
Q from a QR factorization determined by ?geqrf
Syntax
call sorg2r( m, n, k, a, lda, tau, work, info )
call dorg2r( m, n, k, a, lda, tau, work, info )
call cung2r( m, n, k, a, lda, tau, work, info )
call zung2r( m, n, k, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine ?org2r/?ung2r generates an m-by-n real/complex matrix Q with orthonormal columns, which is
defined as the first n columns of a product of k elementary reflectors of order m
Q = H(1)*H(2)*...*H(k)
as returned by ?geqrf.
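How the first n columns of Q = H(1)*H(2)*...*H(k) can be accumulated is sketched below in plain Python for real data. It is an illustration, not the MKL algorithm: the function name form_q is hypothetical, each reflector is assumed to have the form H(i) = I - tau_i*v_i*v_i^T, and the vectors v_i are passed explicitly at full length m instead of being read from the compact storage in a.

```python
def form_q(m, n, vs, taus):
    """Return the first n columns of H(1)*H(2)*...*H(k), with H(i) = I - tau_i*v_i*v_i^T."""
    # start from the first n columns of the m-by-m identity
    q = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(m)]
    # apply the reflectors from the left in reverse order:
    # Q = H(1)*(H(2)*(...*(H(k)*I)))
    for v, tau in zip(reversed(vs), reversed(taus)):
        for c in range(n):
            dot = sum(v[r] * q[r][c] for r in range(m))
            for r in range(m):
                q[r][c] -= tau * v[r] * dot
    return q
```

For a single reflector with v = (1, 0.5) and tau = 1.6, which satisfies tau = 2/(v^T*v), the result is an orthogonal 2-by-2 matrix, so its columns are orthonormal as the description requires.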
Input Parameters
n INTEGER. The number of columns of the matrix Q. m ≥ n ≥ 0.
k INTEGER. The number of elementary reflectors whose product defines the
matrix Q. n ≥ k ≥ 0.
a REAL for sorg2r

DOUBLE PRECISION for dorg2r
1699
COMPLEX for cung2r

DOUBLE COMPLEX for zung2r.
On entry, the i-th column must contain the vector which defines the
elementary reflector H(i), for i = 1,2,..., k, as returned by ?geqrf in the
first k columns of its array argument a.
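lda INTEGER. The leading dimension of the array a. lda ≥ max(1,m).
tau REAL for sorg2r
DOUBLE PRECISION for dorg2r
COMPLEX for cung2r
DOUBLE COMPLEX for zung2r.
Array, DIMENSION (k). tau(i) must contain the scalar factor of the
elementary reflector H(i), as returned by ?geqrf.
work REAL for sorg2r
DOUBLE PRECISION for dorg2r
COMPLEX for cung2r
DOUBLE COMPLEX for zung2r.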
Output Parameters
a On exit, the m-by-n matrix Q.
info INTEGER.
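The following is a minimal sketch of how the Q factor might be formed explicitly with the double precision flavor. It assumes a small test matrix with arbitrary values, a prior call to dgeqrf with the minimal workspace size lwork = n, and explicit external declarations in place of mkl.fi so that the same source also builds against a reference LAPACK. The program name and the final orthogonality check are illustrative additions.

```fortran
program org2r_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 4, n = 3
  real(dp) :: a(m,n), tau(n), work(n), qtq(n,n)
  integer :: info, i, j
  external :: dgeqrf, dorg2r

  ! Fill a small full-rank test matrix (values are arbitrary).
  do j = 1, n
    do i = 1, m
      a(i,j) = 1.0_dp / real(i + j - 1, dp)
    end do
    a(j,j) = a(j,j) + 1.0_dp
  end do

  ! QR factorization: reflector vectors are stored below the diagonal
  ! of a, their scalar factors in tau. lwork = n is the minimal size.
  call dgeqrf(m, n, a, m, tau, work, n, info)
  if (info /= 0) stop 'dgeqrf failed'

  ! Overwrite a with the first n columns of Q = H(1)*H(2)*...*H(n).
  call dorg2r(m, n, n, a, m, tau, work, info)
  if (info /= 0) stop 'dorg2r failed'

  ! Q has orthonormal columns, so Q^T*Q should be close to the identity.
  qtq = matmul(transpose(a), a)
  do i = 1, n
    qtq(i,i) = qtq(i,i) - 1.0_dp
  end do
  print *, 'max |Q^T*Q - I| =', maxval(abs(qtq))
end program org2r_sketch
```

The printed value is expected to be on the order of machine precision. Passing k smaller than n would generate Q from only the first k reflectors.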
?orgl2/?ungl2
Generates all or part of the orthogonal/unitary matrix
Q from an LQ factorization determined by ?gelqf
Syntax
call sorgl2( m, n, k, a, lda, tau, work, info )
call dorgl2( m, n, k, a, lda, tau, work, info )
call cungl2( m, n, k, a, lda, tau, work, info )
call zungl2( m, n, k, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine ?orgl2/?ungl2 generates an m-by-n real/complex matrix Q with orthonormal rows, which is
defined as the first m rows of a product of k elementary reflectors of order n:
Q = H(k)*...*H(2)*H(1) for real flavors, or Q = (H(k))^H*...*(H(2))^H*(H(1))^H for complex flavors, as
returned by ?gelqf.
Input Parameters
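m INTEGER. The number of rows of the matrix Q. m ≥ 0.
n INTEGER. The number of columns of the matrix Q. n ≥ m.
k INTEGER. The number of elementary reflectors whose product defines the
matrix Q. m ≥ k ≥ 0.
a REAL for sorgl2
DOUBLE PRECISION for dorgl2
COMPLEX for cungl2
DOUBLE COMPLEX for zungl2.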
Array, DIMENSION (lda, n). On entry, the i-th row must contain the vector
which defines the elementary reflector H(i), for i = 1,2,..., k, as
returned by ?gelqf in the first k rows of its array argument a.
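tau REAL for sorgl2
DOUBLE PRECISION for dorgl2
COMPLEX for cungl2
DOUBLE COMPLEX for zungl2.
Array, DIMENSION (k). tau(i) must contain the scalar factor of the
elementary reflector H(i), as returned by ?gelqf.
work REAL for sorgl2
DOUBLE PRECISION for dorgl2
COMPLEX for cungl2
DOUBLE COMPLEX for zungl2.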
Output Parameters
a On exit, the m-by-n matrix Q.
info INTEGER.
?orgr2/?ungr2
Generates all or part of the orthogonal/unitary matrix
Q from an RQ factorization determined by ?gerqf
Syntax
call sorgr2( m, n, k, a, lda, tau, work, info )
call dorgr2( m, n, k, a, lda, tau, work, info )
call cungr2( m, n, k, a, lda, tau, work, info )
call zungr2( m, n, k, a, lda, tau, work, info )
Include Files
mkl.fi
Description
The routine ?orgr2/?ungr2 generates an m-by-n real/complex matrix Q with orthonormal rows, which is defined as
the last m rows of a product of k elementary reflectors of order n:
Q = H(1)*H(2)*...*H(k) for real flavors, or Q = (H(1))^H*(H(2))^H*...*(H(k))^H for complex flavors, as
returned by ?gerqf.
Input Parameters
m INTEGER. The number of rows of the matrix Q. m ≥ 0.
n INTEGER. The number of columns of the matrix Q. n ≥ m.
k INTEGER.
The number of elementary reflectors whose product defines the matrix Q.
m ≥ k ≥ 0.
a REAL for sorgr2
DOUBLE PRECISION for dorgr2
COMPLEX for cungr2
DOUBLE COMPLEX for zungr2.
On entry, the (m-k+i)-th row must contain the vector which defines the
elementary reflector H(i), for i = 1,2,..., k, as returned by ?gerqf in
the last k rows of its array argument a.
tau REAL for sorgr2
DOUBLE PRECISION for dorgr2
COMPLEX for cungr2
DOUBLE COMPLEX for zungr2.
Array, DIMENSION (k). tau(i) must contain the scalar factor of the
elementary reflector H(i), as returned by ?gerqf.
work REAL for sorgr2
DOUBLE PRECISION for dorgr2
COMPLEX for cungr2
DOUBLE COMPLEX for zungr2.
Output Parameters
a On exit, the m-by-n matrix Q.
info INTEGER.
?orm2l/?unm2l
Multiplies a general matrix by the orthogonal/unitary
matrix from a QL factorization determined by ?geqlf
Syntax
call sorm2l( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call dorm2l( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call cunm2l( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call zunm2l( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
Include Files
mkl.fi
Description
The routine ?orm2l/?unm2l overwrites the general real/complex m-by-n matrix C with
Q*C if side = 'L' and trans = 'N', or
Q^T*C / Q^H*C if side = 'L' and trans = 'T' (for real flavors) or trans = 'C' (for complex flavors), or
C*Q if side = 'R' and trans = 'N', or
C*Q^T / C*Q^H if side = 'R' and trans = 'T' (for real flavors) or trans = 'C' (for complex flavors).
Here Q is a real orthogonal or complex unitary matrix defined as the product of k elementary reflectors
Q = H(k)*...*H(2)*H(1) as returned by ?geqlf.
Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
side CHARACTER*1.
= 'L': apply Q or Q^T / Q^H from the left
= 'R': apply Q or Q^T / Q^H from the right
trans CHARACTER*1.
= 'N': apply Q (no transpose)
= 'T': apply Q^T (transpose, for real flavors)
= 'C': apply Q^H (conjugate transpose, for complex flavors)
m INTEGER. The number of rows of the matrix C. m ≥ 0.
n INTEGER. The number of columns of the matrix C. n ≥ 0.
k INTEGER. The number of elementary reflectors whose product defines the
matrix Q.
If side = 'L', m ≥ k ≥ 0;
if side = 'R', n ≥ k ≥ 0.
a REAL for sorm2l
DOUBLE PRECISION for dorm2l
COMPLEX for cunm2l
DOUBLE COMPLEX for zunm2l.
Array, DIMENSION (lda,k).
The i-th column must contain the vector which defines the elementary
reflector H(i), for i = 1,2,..., k, as returned by ?geqlf in the last k
columns of its array argument a. The array a is modified by the routine but
restored on exit.
lda INTEGER. The leading dimension of the array a.
If side = 'L', lda ≥ max(1, m);
if side = 'R', lda ≥ max(1, n).
tau REAL for sorm2l
DOUBLE PRECISION for dorm2l
COMPLEX for cunm2l
DOUBLE COMPLEX for zunm2l.
Array, DIMENSION (k). tau(i) must contain the scalar factor of the
elementary reflector H(i), as returned by ?geqlf.
c REAL for sorm2l
DOUBLE PRECISION for dorm2l
COMPLEX for cunm2l
DOUBLE COMPLEX for zunm2l.
Array, DIMENSION (ldc, n).
On entry, the m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c. ldc ≥ max(1,m).
work REAL for sorm2l
DOUBLE PRECISION for dorm2l
COMPLEX for cunm2l
DOUBLE COMPLEX for zunm2l.
Workspace array, DIMENSION:
(n) if side = 'L',
(m) if side = 'R'.
Output Parameters
c On exit, c is overwritten by Q*C or Q^T*C / Q^H*C, or C*Q, or C*Q^T / C*Q^H.
info INTEGER.
?orm2r/?unm2r
Multiplies a general matrix by the orthogonal/unitary
matrix from a QR factorization determined by ?geqrf
Syntax
call sorm2r( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call dorm2r( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call cunm2r( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call zunm2r( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
Include Files
mkl.fi
Description
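The routine ?orm2r/?unm2r overwrites the general real/complex m-by-n matrix C with
Q*C if side = 'L' and trans = 'N', or
Q^T*C / Q^H*C if side = 'L' and trans = 'T' (for real flavors) or trans = 'C' (for complex flavors), or
C*Q if side = 'R' and trans = 'N', or
C*Q^T / C*Q^H if side = 'R' and trans = 'T' (for real flavors) or trans = 'C' (for complex flavors).
Here Q is a real orthogonal or complex unitary matrix defined as the product of k elementary reflectors
Q = H(1)*H(2)*...*H(k) as returned by ?geqrf.
Q is of order m if side = 'L' and of order n if side = 'R'.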
Input Parameters
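side CHARACTER*1.
= 'L': apply Q or Q^T / Q^H from the left
= 'R': apply Q or Q^T / Q^H from the right
trans CHARACTER*1.
= 'N': apply Q (no transpose)
= 'T': apply Q^T (transpose, for real flavors)
= 'C': apply Q^H (conjugate transpose, for complex flavors)
m INTEGER. The number of rows of the matrix C. m ≥ 0.
n INTEGER. The number of columns of the matrix C. n ≥ 0.
k INTEGER. The number of elementary reflectors whose product defines the
matrix Q.
If side = 'L', m ≥ k ≥ 0;
if side = 'R', n ≥ k ≥ 0.
a REAL for sorm2r
DOUBLE PRECISION for dorm2r
COMPLEX for cunm2r
DOUBLE COMPLEX for zunm2r.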
Array, DIMENSION (lda,k).
The i-th column must contain the vector which defines the elementary
reflector H(i), for i = 1,2,..., k, as returned by ?geqrf in the first k
columns of its array argument a. The array a is modified by the routine but
restored on exit.

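lda INTEGER. The leading dimension of the array a.
If side = 'L', lda ≥ max(1, m);
if side = 'R', lda ≥ max(1, n).
tau REAL for sorm2r
DOUBLE PRECISION for dorm2r
COMPLEX for cunm2r
DOUBLE COMPLEX for zunm2r.
Array, DIMENSION (k). tau(i) must contain the scalar factor of the
elementary reflector H(i), as returned by ?geqrf.
c REAL for sorm2r
DOUBLE PRECISION for dorm2r
COMPLEX for cunm2r
DOUBLE COMPLEX for zunm2r.
Array, DIMENSION (ldc, n).
On entry, the m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c. ldc ≥ max(1,m).
work REAL for sorm2r
DOUBLE PRECISION for dorm2r
COMPLEX for cunm2r
DOUBLE COMPLEX for zunm2r.
Workspace array, DIMENSION:
(n) if side = 'L',
(m) if side = 'R'.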
Output Parameters
c On exit, c is overwritten by Q*C or Q^T*C / Q^H*C, or C*Q, or C*Q^T / C*Q^H.
info INTEGER.
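As an illustration of the side and trans arguments, the sketch below applies Q^T from the left to a copy of the original matrix, which should reproduce the triangular factor R. It assumes the double precision flavor, an arbitrary test matrix, a prior dgeqrf call with lwork = n, and explicit external declarations instead of mkl.fi. The program name and the residual check are illustrative.

```fortran
program orm2r_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 4, n = 3
  real(dp) :: a(m,n), c(m,n), tau(n), work(n), err
  integer :: info, i, j
  external :: dgeqrf, dorm2r

  ! Arbitrary test matrix; c keeps a copy of the unfactored data.
  do j = 1, n
    do i = 1, m
      a(i,j) = 1.0_dp / real(i + j - 1, dp)
    end do
  end do
  c = a

  ! After dgeqrf, the upper triangle of a holds R and the part below
  ! the diagonal holds the reflector vectors.
  call dgeqrf(m, n, a, m, tau, work, n, info)
  if (info /= 0) stop 'dgeqrf failed'

  ! side = 'L', trans = 'T': c := Q^T*c, with k = n reflectors.
  ! For side = 'L' the workspace needs n elements and lda >= m.
  call dorm2r('L', 'T', m, n, n, a, m, tau, c, m, work, info)
  if (info /= 0) stop 'dorm2r failed'

  ! Q^T*A should equal R on and above the diagonal and zero below it.
  err = 0.0_dp
  do j = 1, n
    do i = 1, m
      if (i <= j) then
        err = max(err, abs(c(i,j) - a(i,j)))
      else
        err = max(err, abs(c(i,j)))
      end if
    end do
  end do
  print *, 'max |Q^T*A - R| =', err
end program orm2r_sketch
```

The array a is passed unchanged from the factorization, so the same reflectors can be reused for further calls with other side/trans combinations.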
?orml2/?unml2
Multiplies a general matrix by the orthogonal/unitary
matrix from an LQ factorization determined by ?gelqf
Syntax
call sorml2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call dorml2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call cunml2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call zunml2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
Include Files
mkl.fi
Description
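The routine ?orml2/?unml2 overwrites the general real/complex m-by-n matrix C with
Q*C if side = 'L' and trans = 'N', or
Q^T*C / Q^H*C if side = 'L' and trans = 'T' (for real flavors) or trans = 'C' (for complex flavors), or
C*Q if side = 'R' and trans = 'N', or
C*Q^T / C*Q^H if side = 'R' and trans = 'T' (for real flavors) or trans = 'C' (for complex flavors).
Here Q is a real orthogonal or complex unitary matrix defined as the product of k elementary reflectors
Q = H(k)*...*H(2)*H(1) for real flavors, or Q = (H(k))^H*...*(H(2))^H*(H(1))^H for complex flavors, as
returned by ?gelqf.
Q is of order m if side = 'L' and of order n if side = 'R'.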
Input Parameters
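side CHARACTER*1.
= 'L': apply Q or Q^T / Q^H from the left
= 'R': apply Q or Q^T / Q^H from the right
trans CHARACTER*1.
= 'N': apply Q (no transpose)
= 'T': apply Q^T (transpose, for real flavors)
= 'C': apply Q^H (conjugate transpose, for complex flavors)
m INTEGER. The number of rows of the matrix C. m ≥ 0.
n INTEGER. The number of columns of the matrix C. n ≥ 0.
k INTEGER. The number of elementary reflectors whose product defines the
matrix Q.
If side = 'L', m ≥ k ≥ 0;
if side = 'R', n ≥ k ≥ 0.
a REAL for sorml2
DOUBLE PRECISION for dorml2
COMPLEX for cunml2
DOUBLE COMPLEX for zunml2.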
Array, DIMENSION
(lda, m) if side = 'L',
(lda, n) if side = 'R'
The i-th row must contain the vector which defines the elementary reflector
H(i), for i = 1,2,..., k, as returned by ?gelqf in the first k rows of its
array argument a. The array a is modified by the routine but restored on
exit.
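lda INTEGER. The leading dimension of the array a. lda ≥ max(1,k).
tau REAL for sorml2
DOUBLE PRECISION for dorml2
COMPLEX for cunml2
DOUBLE COMPLEX for zunml2.
Array, DIMENSION (k). tau(i) must contain the scalar factor of the
elementary reflector H(i), as returned by ?gelqf.
c REAL for sorml2
DOUBLE PRECISION for dorml2
COMPLEX for cunml2
DOUBLE COMPLEX for zunml2.
Array, DIMENSION (ldc, n). On entry, the m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c. ldc ≥ max(1,m).
work REAL for sorml2
DOUBLE PRECISION for dorml2
COMPLEX for cunml2
DOUBLE COMPLEX for zunml2.
Workspace array, DIMENSION:
(n) if side = 'L',
(m) if side = 'R'.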
Output Parameters
c On exit, c is overwritten by Q*C or Q^T*C / Q^H*C, or C*Q, or C*Q^T / C*Q^H.
info INTEGER.
?ormr2/?unmr2
Multiplies a general matrix by the orthogonal/unitary
matrix from an RQ factorization determined by ?gerqf
Syntax
call sormr2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call dormr2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call cunmr2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
call zunmr2( side, trans, m, n, k, a, lda, tau, c, ldc, work, info )
Include Files
mkl.fi
Description
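The routine ?ormr2/?unmr2 overwrites the general real/complex m-by-n matrix C with
Q*C if side = 'L' and trans = 'N', or
Q^T*C / Q^H*C if side = 'L' and trans = 'T' (for real flavors) or trans = 'C' (for complex flavors), or
C*Q if side = 'R' and trans = 'N', or
C*Q^T / C*Q^H if side = 'R' and trans = 'T' (for real flavors) or trans = 'C' (for complex flavors).
Here Q is a real orthogonal or complex unitary matrix defined as the product of k elementary reflectors
Q = H(1)*H(2)*...*H(k) for real flavors, or Q = (H(1))^H*(H(2))^H*...*(H(k))^H for complex flavors, as
returned by ?gerqf.
Q is of order m if side = 'L' and of order n if side = 'R'.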
Input Parameters
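side CHARACTER*1.
= 'L': apply Q or Q^T / Q^H from the left
= 'R': apply Q or Q^T / Q^H from the right
trans CHARACTER*1.
= 'N': apply Q (no transpose)
= 'T': apply Q^T (transpose, for real flavors)
= 'C': apply Q^H (conjugate transpose, for complex flavors)
m INTEGER. The number of rows of the matrix C. m ≥ 0.
n INTEGER. The number of columns of the matrix C. n ≥ 0.
k INTEGER. The number of elementary reflectors whose product defines the
matrix Q.
If side = 'L', m ≥ k ≥ 0;
if side = 'R', n ≥ k ≥ 0.
a REAL for sormr2
DOUBLE PRECISION for dormr2
COMPLEX for cunmr2
DOUBLE COMPLEX for zunmr2.
Array, DIMENSION
(lda, m) if side = 'L',
(lda, n) if side = 'R'
The i-th row must contain the vector which defines the elementary reflector
H(i), for i = 1,2,...,k, as returned by ?gerqf in the last k rows of its
array argument a. The array a is modified by the routine but restored on
exit.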
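lda INTEGER.
The leading dimension of the array a. lda ≥ max(1,k).
tau REAL for sormr2
DOUBLE PRECISION for dormr2
COMPLEX for cunmr2
DOUBLE COMPLEX for zunmr2.
Array, DIMENSION (k). tau(i) must contain the scalar factor of the
elementary reflector H(i), as returned by ?gerqf.
c REAL for sormr2
DOUBLE PRECISION for dormr2
COMPLEX for cunmr2
DOUBLE COMPLEX for zunmr2.
Array, DIMENSION (ldc, n). On entry, the m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c. ldc ≥ max(1,m).
work REAL for sormr2
DOUBLE PRECISION for dormr2
COMPLEX for cunmr2
DOUBLE COMPLEX for zunmr2.
Workspace array, DIMENSION:
(n) if side = 'L',
(m) if side = 'R'.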
Output Parameters
c On exit, c is overwritten by Q*C or Q^T*C / Q^H*C, or C*Q, or C*Q^T / C*Q^H.
info INTEGER.
?ormr3/?unmr3
Multiplies a general matrix by the orthogonal/unitary
matrix from an RZ factorization determined by ?tzrzf
Syntax
call sormr3( side, trans, m, n, k, l, a, lda, tau, c, ldc, work, info )
call dormr3( side, trans, m, n, k, l, a, lda, tau, c, ldc, work, info )
call cunmr3( side, trans, m, n, k, l, a, lda, tau, c, ldc, work, info )
call zunmr3( side, trans, m, n, k, l, a, lda, tau, c, ldc, work, info )
Include Files
mkl.fi
Description
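The routine ?ormr3/?unmr3 overwrites the general real/complex m-by-n matrix C with
Q*C if side = 'L' and trans = 'N', or
Q^T*C / Q^H*C if side = 'L' and trans = 'T' (for real flavors) or trans = 'C' (for complex flavors), or
C*Q if side = 'R' and trans = 'N', or
C*Q^T / C*Q^H if side = 'R' and trans = 'T' (for real flavors) or trans = 'C' (for complex flavors).
Here Q is a real orthogonal or complex unitary matrix defined as the product of k elementary reflectors
Q = H(1)*H(2)*...*H(k) as returned by ?tzrzf.
Q is of order m if side = 'L' and of order n if side = 'R'.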
Input Parameters
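side CHARACTER*1.
= 'L': apply Q or Q^T / Q^H from the left
= 'R': apply Q or Q^T / Q^H from the right
trans CHARACTER*1.
= 'N': apply Q (no transpose)
= 'T': apply Q^T (transpose, for real flavors)
= 'C': apply Q^H (conjugate transpose, for complex flavors)
m INTEGER. The number of rows of the matrix C. m ≥ 0.
n INTEGER. The number of columns of the matrix C. n ≥ 0.
k INTEGER. The number of elementary reflectors whose product defines the
matrix Q.
If side = 'L', m ≥ k ≥ 0;
if side = 'R', n ≥ k ≥ 0.
l INTEGER. The number of columns of the matrix A containing the meaningful
part of the Householder reflectors.
If side = 'L', m ≥ l ≥ 0,
if side = 'R', n ≥ l ≥ 0.
a REAL for sormr3
DOUBLE PRECISION for dormr3
COMPLEX for cunmr3
DOUBLE COMPLEX for zunmr3.
Array, DIMENSION
(lda, m) if side = 'L',
(lda, n) if side = 'R'
The i-th row must contain the vector which defines the elementary reflector
H(i), for i = 1,2,...,k, as returned by ?tzrzf in the last k rows of its
array argument a. The array a is modified by the routine but restored on
exit.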
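lda INTEGER.
The leading dimension of the array a. lda ≥ max(1,k).
tau REAL for sormr3
DOUBLE PRECISION for dormr3
COMPLEX for cunmr3
DOUBLE COMPLEX for zunmr3.
Array, DIMENSION (k). tau(i) must contain the scalar factor of the
elementary reflector H(i), as returned by ?tzrzf.
c REAL for sormr3
DOUBLE PRECISION for dormr3
COMPLEX for cunmr3
DOUBLE COMPLEX for zunmr3.
Array, DIMENSION (ldc, n). On entry, the m-by-n matrix C.
ldc INTEGER. The leading dimension of the array c. ldc ≥ max(1,m).
work REAL for sormr3
DOUBLE PRECISION for dormr3
COMPLEX for cunmr3
DOUBLE COMPLEX for zunmr3.
Workspace array, DIMENSION:
(n) if side = 'L',
(m) if side = 'R'.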
Output Parameters
c On exit, c is overwritten by Q*C or Q^T*C / Q^H*C, or C*Q, or C*Q^T / C*Q^H.
info INTEGER.
?pbtf2
Computes the Cholesky factorization of a symmetric/
Hermitian positive-definite band matrix (unblocked
algorithm).
Syntax
call spbtf2( uplo, n, kd, ab, ldab, info )
call dpbtf2( uplo, n, kd, ab, ldab, info )
call cpbtf2( uplo, n, kd, ab, ldab, info )
call zpbtf2( uplo, n, kd, ab, ldab, info )
Include Files
mkl.fi
Description
The routine computes the Cholesky factorization of a real symmetric or complex Hermitian positive definite
band matrix A. The factorization has the form
A = U^T*U for real flavors, A = U^H*U for complex flavors if uplo = 'U', or
A = L*L^T for real flavors, A = L*L^H for complex flavors if uplo = 'L',
where U is an upper triangular matrix, and L is lower triangular. This is the unblocked version of the
algorithm, calling BLAS Level 2 Routines.
Input Parameters
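uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric/
Hermitian matrix A is stored:
= 'U': upper triangular
= 'L': lower triangular
n INTEGER. The order of the matrix A. n ≥ 0.
kd INTEGER. The number of superdiagonals of the matrix A if uplo = 'U', or
the number of subdiagonals if uplo = 'L'.
kd ≥ 0.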
ab REAL for spbtf2
DOUBLE PRECISION for dpbtf2
COMPLEX for cpbtf2
DOUBLE COMPLEX for zpbtf2.
Array, DIMENSION (ldab, n).
On entry, the upper or lower triangle of the symmetric/Hermitian band
matrix A, stored in the first kd+1 rows of the array. The j-th column of A is
stored in the j-th column of the array ab as follows:
if uplo = 'U', ab(kd+1+i-j,j) = A(i,j) for max(1, j-kd) ≤ i ≤ j;
if uplo = 'L', ab(1+i-j,j) = A(i,j) for j ≤ i ≤ min(n, j+kd).
ldab INTEGER. The leading dimension of the array ab. ldab ≥ kd+1.
Output Parameters
ab On exit, if info = 0, the triangular factor U or L from the Cholesky
factorization A = U^T*U (A = U^H*U), or A = L*L^T (A = L*L^H) of the band matrix
A, in the same storage format as A.
info INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument has an illegal value
> 0: if info = k, the leading minor of order k is not positive definite, and
the factorization could not be completed.
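The band storage mapping for uplo = 'U' can be sketched as follows. The example packs the symmetric positive-definite tridiagonal matrix with 2 on the diagonal and -1 on the off-diagonals (so kd = 1) and factors it with the double precision flavor. The matrix, the program name, and the use of an explicit external declaration instead of mkl.fi are assumptions made for illustration.

```fortran
program pbtf2_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 5, kd = 1, ldab = kd + 1
  real(dp) :: ab(ldab, n)
  integer :: i, j, info
  external :: dpbtf2

  ! Pack the upper triangle: ab(kd+1+i-j, j) = A(i,j)
  ! for max(1, j-kd) <= i <= j. Unused entries stay zero.
  ab = 0.0_dp
  do j = 1, n
    do i = max(1, j - kd), j
      if (i == j) then
        ab(kd + 1 + i - j, j) = 2.0_dp    ! diagonal of A
      else
        ab(kd + 1 + i - j, j) = -1.0_dp   ! superdiagonal of A
      end if
    end do
  end do

  ! Factor A = U^T*U; U overwrites ab in the same band layout.
  call dpbtf2('U', n, kd, ab, ldab, info)
  print *, 'info =', info
  print *, 'diagonal of U:', ab(kd + 1, :)
end program pbtf2_sketch
```

With this layout the diagonal of A always sits in row kd+1 of ab, which is why the last print statement reads that row.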
?potf2
Computes the Cholesky factorization of a symmetric/
Hermitian positive-definite matrix (unblocked
algorithm).
Syntax
call spotf2( uplo, n, a, lda, info )
call dpotf2( uplo, n, a, lda, info )
call cpotf2( uplo, n, a, lda, info )
call zpotf2( uplo, n, a, lda, info )
Include Files
mkl.fi
Description
The routine ?potf2 computes the Cholesky factorization of a real symmetric or complex Hermitian positive
definite matrix A. The factorization has the form
A = U^T*U for real flavors, A = U^H*U for complex flavors if uplo = 'U', or
A = L*L^T for real flavors, A = L*L^H for complex flavors if uplo = 'L',
where U is an upper triangular matrix, and L is lower triangular.
This is the unblocked version of the algorithm, calling BLAS Level 2 Routines.
Input Parameters
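uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric/
Hermitian matrix A is stored.
= 'U': upper triangular
= 'L': lower triangular
n INTEGER. The order of the matrix A. n ≥ 0.
a REAL for spotf2
DOUBLE PRECISION for dpotf2
COMPLEX for cpotf2
DOUBLE COMPLEX for zpotf2.
Array, DIMENSION (lda, n).
On entry, the symmetric/Hermitian matrix A.
lda INTEGER. The leading dimension of the array a.
lda ≥ max(1,n).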
Output Parameters
a On exit, if info = 0, the factor U or L from the Cholesky factorization
A = U^T*U (A = U^H*U), or A = L*L^T (A = L*L^H).
info INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument has an illegal value
> 0: if info = k, the leading minor of order k is not positive definite, and
the factorization could not be completed.
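A minimal sketch of the uplo = 'U' case with the double precision flavor is shown below. The 3-by-3 matrix is an arbitrary symmetric positive-definite example, the routine is declared external rather than taken from mkl.fi, and the final check that U^T*U reproduces A is an illustrative addition.

```fortran
program potf2_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n,n), a0(n,n), u(n,n)
  integer :: info, i, j
  external :: dpotf2

  ! Symmetric positive-definite test matrix (values chosen for illustration).
  a0 = reshape([4.0_dp, 2.0_dp, 2.0_dp, &
                2.0_dp, 5.0_dp, 3.0_dp, &
                2.0_dp, 3.0_dp, 6.0_dp], [n, n])
  a = a0

  ! uplo = 'U': only the upper triangle of a is read and overwritten by U.
  call dpotf2('U', n, a, n, info)
  if (info /= 0) then
    print *, 'dpotf2 returned info =', info
    stop
  end if

  ! Copy the upper triangle into u; the strictly lower part of a still
  ! holds the input values and is not part of the factor.
  u = 0.0_dp
  do j = 1, n
    do i = 1, j
      u(i,j) = a(i,j)
    end do
  end do
  print *, 'max |U^T*U - A| =', maxval(abs(matmul(transpose(u), u) - a0))
end program potf2_sketch
```

A positive info value would identify the order of the first leading minor that is not positive definite, so callers typically test info before using the factor.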
?ptts2
Solves a tridiagonal system of the form A*X=B using
the L*D*L^T/L*D*L^H factorization computed by ?pttrf.
Syntax
call sptts2( n, nrhs, d, e, b, ldb )
call dptts2( n, nrhs, d, e, b, ldb )
call cptts2( iuplo, n, nrhs, d, e, b, ldb )
call zptts2( iuplo, n, nrhs, d, e, b, ldb )
Include Files
mkl.fi
Description
The routine ?ptts2 solves a tridiagonal system of the form
A*X = B
Real flavors sptts2/dptts2 use the L*D*L^T factorization of A computed by spttrf/dpttrf, and complex
flavors cptts2/zptts2 use the U^H*D*U or L*D*L^H factorization of A computed by cpttrf/zpttrf.
D is a diagonal matrix specified in the vector d, U (or L) is a unit bidiagonal matrix whose superdiagonal
(subdiagonal) is specified in the vector e, and X and B are n-by-nrhs matrices.
Input Parameters
iuplo INTEGER. Used with complex flavors only.
Specifies the form of the factorization, and whether the vector e is the
superdiagonal of the upper bidiagonal factor U or the subdiagonal of the
lower bidiagonal factor L.
= 1: A = U^H*D*U, e is the superdiagonal of U;
= 0: A = L*D*L^H, e is the subdiagonal of L.
n INTEGER. The order of the tridiagonal matrix A. n ≥ 0.
nrhs INTEGER. The number of right hand sides, that is, the number of columns
of the matrix B. nrhs ≥ 0.
d REAL for sptts2/cptts2
DOUBLE PRECISION for dptts2/zptts2.
The n diagonal elements of the diagonal matrix D from the factorization of A.
e REAL for sptts2
DOUBLE PRECISION for dptts2
COMPLEX for cptts2
DOUBLE COMPLEX for zptts2.
Contains the (n-1) subdiagonal elements of the unit bidiagonal factor L from
the L*D*L^T (for real flavors) or L*D*L^H (for complex flavors when iuplo =
0) factorization of A.
For complex flavors when iuplo = 1, e contains the (n-1) superdiagonal
elements of the unit bidiagonal factor U from the factorization A = U^H*D*U.
b REAL for sptts2
DOUBLE PRECISION for dptts2
COMPLEX for cptts2
DOUBLE COMPLEX for zptts2.
Array, DIMENSION (ldb, nrhs).
On entry, the right hand side vectors B for the system of linear equations.
ldb INTEGER. The leading dimension of the array b. ldb ≥ max(1,n).
Output Parameters
b On exit, the solution vectors, X.
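The factor-then-solve sequence for the real flavors can be sketched as below. The example assumes the double precision routines dpttrf and dptts2, the tridiagonal matrix with 2 on the diagonal and -1 on the off-diagonals, and a right-hand side constructed so that the exact solution is a vector of ones. The program name and the error check are illustrative.

```fortran
program ptts2_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4, nrhs = 1
  real(dp) :: d(n), e(n-1), b(n,nrhs)
  integer :: info
  external :: dpttrf, dptts2

  ! A = tridiag(-1, 2, -1), which is symmetric positive definite.
  d = 2.0_dp
  e = -1.0_dp

  ! Right-hand side chosen so that the exact solution is x = (1, 1, 1, 1).
  b(:,1) = [1.0_dp, 0.0_dp, 0.0_dp, 1.0_dp]

  ! Factor A = L*D*L^T; d and e are overwritten by D and the
  ! subdiagonal of L.
  call dpttrf(n, d, e, info)
  if (info /= 0) stop 'dpttrf failed'

  ! Solve with the computed factors. The real flavors take no iuplo
  ! argument, and the routine has no info argument.
  call dptts2(n, nrhs, d, e, b, n)
  print *, 'max |x - 1| =', maxval(abs(b(:,1) - 1.0_dp))
end program ptts2_sketch
```

Because ?ptts2 performs no argument checking of its own, the arguments are validated here only through the preceding ?pttrf call.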
?rscl
Multiplies a vector by the reciprocal of a real scalar.
Syntax
call srscl( n, sa, sx, incx )
call drscl( n, sa, sx, incx )
call csrscl( n, sa, sx, incx )
call zdrscl( n, sa, sx, incx )
Include Files
mkl.fi
Description
The routine ?rscl multiplies an n-element real/complex vector x by the real scalar 1/a. This is done without
overflow or underflow as long as the final result x/a does not overflow or underflow.
Input Parameters
n INTEGER. The number of components of the vector x.
sa REAL for srscl/csrscl
DOUBLE PRECISION for drscl/zdrscl.
The scalar a which is used to divide each component of the vector x. sa
must not be 0, or the subroutine will divide by zero.
sx REAL for srscl
DOUBLE PRECISION for drscl
COMPLEX for csrscl
DOUBLE COMPLEX for zdrscl.
Array, DIMENSION (1+(n-1)*|incx|).
The n-element vector x.
incx INTEGER. The increment between successive values of the vector sx.
If incx > 0, sx(1)=x(1), and sx(1+(i-1)*incx)=x(i), 1 < i ≤ n.
Output Parameters
sx On exit, the result x/a.
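The role of incx can be illustrated with a short sketch. It assumes the double precision flavor drscl, a stride of 2, and arbitrary padding values between the vector elements. The program name and the chosen numbers are illustrative.

```fortran
program rscl_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3, incx = 2
  real(dp) :: sx(1 + (n-1)*incx)
  external :: drscl

  ! x(i) is stored at sx(1+(i-1)*incx); the entries in between are
  ! padding that the routine should leave untouched.
  sx = [2.0_dp, -99.0_dp, 4.0_dp, -99.0_dp, 6.0_dp]

  ! Divide each element of x by a = 2.
  call drscl(n, 2.0_dp, sx, incx)

  print *, 'x/a     =', sx(1::incx)
  print *, 'padding =', sx(2::incx)
end program rscl_sketch
```

Only the strided elements are scaled, so the padding entries keep their original values.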
?syswapr
Applies an elementary permutation on the rows and
columns of a symmetric matrix.
Syntax
call ssyswapr( uplo, n, a, lda, i1, i2 )
call dsyswapr( uplo, n, a, lda, i1, i2 )
call csyswapr( uplo, n, a, lda, i1, i2 )
call zsyswapr( uplo, n, a, lda, i1, i2 )
call syswapr( a,i1,i2[,uplo] )
Include Files
mkl.fi, lapack.f90
Description
The routine applies an elementary permutation on the rows and columns of a symmetric matrix.
Input Parameters

a REAL for ssyswapr
DOUBLE PRECISION for dsyswapr
COMPLEX for csyswapr
DOUBLE COMPLEX for zsyswapr
The array a contains the block diagonal matrix D and the multipliers used to
obtain the factor U or L as computed by ?sytrf.
lda INTEGER. The leading dimension of the array a. lda ≥ max(1,n).
i1 INTEGER. Index of the first row to swap.
i2 INTEGER. Index of the second row to swap.
Output Parameters
a If uplo = 'U', the upper triangular part of the inverse is formed and the part of
A below the diagonal is not referenced.
If uplo = 'L', the lower triangular part of the inverse is formed and the part of
A above the diagonal is not referenced.

Specific details for the routine syswapr interface are as follows:
i1 Holds the index for swap.
See Also
?sytrf
?heswapr
Applies an elementary permutation on the rows and
columns of a Hermitian matrix.
Syntax
call cheswapr( uplo, n, a, lda, i1, i2 )
call zheswapr( uplo, n, a, lda, i1, i2 )
call heswapr( a, i1, i2 [,uplo] )
Include Files
mkl.fi, lapack.f90
Description
The routine applies an elementary permutation on the rows and columns of a Hermitian matrix.
Input Parameters

a COMPLEX for cheswapr
DOUBLE COMPLEX for zheswapr
The array a contains the block diagonal matrix D and the multipliers used to
obtain the factor U or L as computed by ?hetrf.
Output Parameters
a If info = 0, the inverse of the original matrix.

Specific details for the routine heswapr interface are as follows:
See Also
?hetrf
?syswapr1
?syswapr1
Applies an elementary permutation on the rows and
columns of a symmetric matrix.
Syntax
call ssyswapr1( uplo, n, a, lda, i1, i2 )
call dsyswapr1( uplo, n, a, lda, i1, i2 )
call csyswapr1( uplo, n, a, lda, i1, i2 )
call zsyswapr1( uplo, n, a, lda, i1, i2 )
call syswapr1( a,i1,i2[,uplo] )
Include Files
mkl.fi, lapack.f90
Description
The routine applies an elementary permutation on the rows and columns of a symmetric matrix.
Input Parameters

a REAL for ssyswapr1
DOUBLE PRECISION for dsyswapr1
COMPLEX for csyswapr1
DOUBLE COMPLEX for zsyswapr1
Array of dimension (lda,n).
The array a contains the block diagonal matrix D and the multipliers used to
obtain the factor U or L as computed by ?sytrf.
Output Parameters

Specific details for the routine syswapr1 interface are as follows:
See Also
?sytrf
?sygs2/?hegs2
Reduces a symmetric/Hermitian positive-definite
generalized eigenproblem to standard form, using the
factorization results obtained from ?potrf (unblocked
algorithm).
Syntax
call ssygs2( itype, uplo, n, a, lda, b, ldb, info )
call dsygs2( itype, uplo, n, a, lda, b, ldb, info )
call chegs2( itype, uplo, n, a, lda, b, ldb, info )
call zhegs2( itype, uplo, n, a, lda, b, ldb, info )
Include Files
mkl.fi
Description
The routine ?sygs2/?hegs2 reduces a real symmetric-definite or a complex Hermitian positive-definite
generalized eigenproblem to standard form.
If itype = 1, the problem is
A*x = λ*B*x
and A is overwritten by inv(U^H)*A*inv(U) or inv(L)*A*inv(L^H) for complex flavors and by
inv(U^T)*A*inv(U) or inv(L)*A*inv(L^T) for real flavors.
If itype = 2 or 3, the problem is
A*B*x = λ*x, or B*A*x = λ*x,
and A is overwritten by U*A*U^H or L^H*A*L for complex flavors and by U*A*U^T or L^T*A*L for real flavors. Here
U^T and L^T are the transposes, while U^H and L^H are the conjugate transposes of U and L.
B must be previously factorized by ?potrf as follows:
U^H*U or L*L^H for complex flavors
U^T*U or L*L^T for real flavors
Input Parameters
itype INTEGER.
For complex flavors:
= 1: compute inv(U^H)*A*inv(U) or inv(L)*A*inv(L^H);
= 2 or 3: compute U*A*U^H or L^H*A*L.
For real flavors:
= 1: compute inv(U^T)*A*inv(U) or inv(L)*A*inv(L^T);
= 2 or 3: compute U*A*U^T or L^T*A*L.
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the
symmetric/Hermitian matrix A is stored, and how B has been factorized.
= 'U': upper triangular
= 'L': lower triangular
n INTEGER. The order of the matrices A and B. n ≥ 0.
a REAL for ssygs2
DOUBLE PRECISION for dsygs2
COMPLEX for chegs2
DOUBLE COMPLEX for zhegs2.
Array, DIMENSION (lda, n).
lda INTEGER.
The leading dimension of the array a. lda ≥ max(1,n).
b REAL for ssygs2
DOUBLE PRECISION for dsygs2
COMPLEX for chegs2
DOUBLE COMPLEX for zhegs2.
Array, DIMENSION (ldb, n).
The triangular factor from the Cholesky factorization of B as returned by
?potrf.
ldb INTEGER. The leading dimension of the array b. ldb ≥ max(1,n).
Output Parameters
a On exit, if info = 0, the transformed matrix, stored in the same format as A.
info INTEGER.
?sytd2/?hetd2
Reduces a symmetric/Hermitian matrix to real
symmetric tridiagonal form by an orthogonal/unitary
similarity transformation (unblocked algorithm).
Syntax
call ssytd2( uplo, n, a, lda, d, e, tau, info )
call dsytd2( uplo, n, a, lda, d, e, tau, info )
call chetd2( uplo, n, a, lda, d, e, tau, info )
call zhetd2( uplo, n, a, lda, d, e, tau, info )
Include Files
mkl.fi
Description
The routine ?sytd2/?hetd2 reduces a real symmetric/complex Hermitian matrix A to real symmetric
tridiagonal form T by an orthogonal/unitary similarity transformation: Q^T*A*Q = T (Q^H*A*Q = T).
Input Parameters
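uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric/
Hermitian matrix A is stored:
= 'U': upper triangular
= 'L': lower triangular
n INTEGER. The order of the matrix A. n ≥ 0.
a REAL for ssytd2
DOUBLE PRECISION for dsytd2
COMPLEX for chetd2
DOUBLE COMPLEX for zhetd2.
Array, DIMENSION (lda, n).
lda INTEGER. The leading dimension of the array a. lda ≥ max(1,n).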
Output Parameters
a On exit, if uplo = 'U', the diagonal and first superdiagonal of a are
overwritten by the corresponding elements of the tridiagonal matrix T, and
the elements above the first superdiagonal, with the array tau, represent
the orthogonal/unitary matrix Q as a product of elementary reflectors;
if uplo = 'L', the diagonal and first subdiagonal of a are overwritten by
the corresponding elements of the tridiagonal matrix T, and the elements
below the first subdiagonal, with the array tau, represent the orthogonal/
unitary matrix Q as a product of elementary reflectors.
d REAL for ssytd2/chetd2
DOUBLE PRECISION for dsytd2/zhetd2.
The diagonal elements of the tridiagonal matrix T:
d(i) = a(i,i).
e REAL for ssytd2/chetd2
DOUBLE PRECISION for dsytd2/zhetd2.
The off-diagonal elements of the tridiagonal matrix T:
e(i) = a(i,i+1) if uplo = 'U',
e(i) = a(i+1,i) if uplo = 'L'.
tau REAL for ssytd2
DOUBLE PRECISION for dsytd2
COMPLEX for chetd2
DOUBLE COMPLEX for zhetd2.
The first n-1 elements contain scalar factors of the elementary reflectors.
tau(n) is used as workspace.
info INTEGER.
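A short sketch of the reduction with the double precision flavor follows. It assumes an arbitrary symmetric test matrix and sizes tau with n elements, since the text above says tau(n) is used as workspace. Because the transformation is an orthogonal similarity, the trace and the Frobenius norm of A should carry over to T; this check and the program name are illustrative additions.

```fortran
program sytd2_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 4
  real(dp) :: a(n,n), d(n), e(n-1), tau(n), tr, fro2
  integer :: info, i, j
  external :: dsytd2

  ! Arbitrary symmetric test matrix.
  do j = 1, n
    do i = 1, n
      a(i,j) = 1.0_dp / real(i + j - 1, dp)
    end do
  end do

  ! Record invariants of A before it is overwritten.
  tr = 0.0_dp
  do i = 1, n
    tr = tr + a(i,i)
  end do
  fro2 = sum(a**2)

  ! Reduce A to tridiagonal T: d holds the diagonal, e the off-diagonal.
  call dsytd2('U', n, a, n, d, e, tau, info)
  if (info /= 0) stop 'dsytd2 failed'

  ! An orthogonal similarity preserves the trace and the Frobenius norm.
  print *, '|trace(T) - trace(A)|   =', abs(sum(d) - tr)
  print *, '| ||T||_F^2 - ||A||_F^2 | =', abs(sum(d**2) + 2.0_dp*sum(e**2) - fro2)
end program sytd2_sketch
```

Both printed differences are expected to be close to machine precision.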
?sytf2
Computes the factorization of a real/complex
symmetric indefinite matrix, using the diagonal
pivoting method (unblocked algorithm).
Syntax
call ssytf2( uplo, n, a, lda, ipiv, info )
call dsytf2( uplo, n, a, lda, ipiv, info )
call csytf2( uplo, n, a, lda, ipiv, info )
call zsytf2( uplo, n, a, lda, ipiv, info )
Include Files
mkl.fi
Description
The routine ?sytf2 computes the factorization of a real/complex symmetric matrix A using the Bunch-
Kaufman diagonal pivoting method:
A = U*D*U^T, or A = L*D*L^T,
where U (or L) is a product of permutation and unit upper (lower) triangular matrices, and D is symmetric
and block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
This is the unblocked version of the algorithm, calling BLAS Level 2 Routines.
Input Parameters
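uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is stored:
= 'U': upper triangular
= 'L': lower triangular
n INTEGER. The order of the matrix A. n ≥ 0.
a REAL for ssytf2
DOUBLE PRECISION for dsytf2
COMPLEX for csytf2
DOUBLE COMPLEX for zsytf2.
Array, DIMENSION (lda, n).
lda INTEGER. The leading dimension of the array a. lda ≥ max(1,n).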
Output Parameters
a On exit, the block diagonal matrix D and the multipliers used to obtain the
factor U or L.
ipiv INTEGER. Array, DIMENSION (n).
Details of the interchanges and the block structure of D.
If ipiv(k) > 0, then rows and columns k and ipiv(k) were interchanged and
D(k,k) is a 1-by-1 diagonal block.
If uplo = 'U' and ipiv(k) = ipiv(k-1) < 0, then rows and columns
k-1 and -ipiv(k) were interchanged and D(k-1:k, k-1:k) is a 2-by-2
diagonal block.
If uplo = 'L' and ipiv(k) = ipiv(k+1) < 0, then rows and columns
k+1 and -ipiv(k) were interchanged and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.
info INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument has an illegal value
> 0: if info = k, D(k,k) is exactly zero. The factorization is completed, but
the block diagonal matrix D is exactly singular, and division by zero will
occur if it is used to solve a system of equations.
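The ipiv encoding can be decoded with a simple loop, sketched here for uplo = 'L' with the double precision flavor. The test matrix is an arbitrary symmetric indefinite example with a zero leading entry, and the program name and printed messages are illustrative. The loop relies only on the rules stated above: a positive entry marks a 1-by-1 block, and a negative pair ipiv(k) = ipiv(k+1) marks a 2-by-2 block.

```fortran
program sytf2_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 3
  real(dp) :: a(n,n)
  integer :: ipiv(n), info, k
  external :: dsytf2

  ! Symmetric indefinite test matrix (values chosen for illustration).
  a = reshape([0.0_dp, 1.0_dp, 2.0_dp, &
               1.0_dp, 0.0_dp, 3.0_dp, &
               2.0_dp, 3.0_dp, 1.0_dp], [n, n])

  call dsytf2('L', n, a, n, ipiv, info)
  print *, 'info =', info

  ! Walk ipiv from the top for uplo = 'L'.
  k = 1
  do while (k <= n)
    if (ipiv(k) > 0) then
      ! 1-by-1 block: rows and columns k and ipiv(k) were interchanged.
      print *, '1-by-1 block at', k, ', interchanged with', ipiv(k)
      k = k + 1
    else
      ! 2-by-2 block: rows and columns k+1 and -ipiv(k) were interchanged.
      print *, '2-by-2 block at', k, k + 1, ', row', k + 1, &
               'interchanged with', -ipiv(k)
      k = k + 2
    end if
  end do
end program sytf2_sketch
```

For uplo = 'U' the same idea applies with the loop running from k = n downward and 2-by-2 blocks occupying positions k-1 and k.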
?sytf2_rook
Computes the factorization of a real/complex
symmetric indefinite matrix, using the bounded
Bunch-Kaufman diagonal pivoting method (unblocked
algorithm).
Syntax
call ssytf2_rook( uplo, n, a, lda, ipiv, info )
call dsytf2_rook( uplo, n, a, lda, ipiv, info )
call csytf2_rook( uplo, n, a, lda, ipiv, info )
call zsytf2_rook( uplo, n, a, lda, ipiv, info )
Include Files
mkl.fi
Description
The routine ?sytf2_rook computes the factorization of a real/complex symmetric matrix A using the
bounded Bunch-Kaufman ("rook") diagonal pivoting method:
A = U*D*U^T, or A = L*D*L^T,
where U (or L) is a product of permutation and unit upper (lower) triangular matrices, and D is symmetric
and block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
Input Parameters
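uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric
matrix A is stored:
= 'U': upper triangular
= 'L': lower triangular
n INTEGER. The order of the matrix A. n ≥ 0.
a REAL for ssytf2_rook
DOUBLE PRECISION for dsytf2_rook
COMPLEX for csytf2_rook
DOUBLE COMPLEX for zsytf2_rook.
Array, DIMENSION (lda, n).
lda INTEGER. The leading dimension of the array a. lda ≥ max(1,n).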
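Output Parameters
a On exit, the block diagonal matrix D and the multipliers used to obtain the
factor U or L.
ipiv INTEGER. Array, DIMENSION (n).
Details of the interchanges and the block structure of D.
If ipiv(k) > 0, then rows and columns k and ipiv(k) were interchanged
and D(k,k) is a 1-by-1 diagonal block.
info INTEGER.
= 0: successful exit
< 0: if info = -k, the k-th argument has an illegal value
> 0: if info = k, D(k,k) is exactly zero. The factorization is completed, but
the block diagonal matrix D is exactly singular, and division by zero will
occur if it is used to solve a system of equations.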
?hetf2
Computes the factorization of a complex Hermitian
matrix, using the diagonal pivoting method (unblocked
algorithm).
Syntax
call chetf2( uplo, n, a, lda, ipiv, info )
call zhetf2( uplo, n, a, lda, ipiv, info )
Include Files
mkl.fi
Description
The routine computes the factorization of a complex Hermitian matrix A using the Bunch-Kaufman diagonal
pivoting method:
A = U*D*U^H or A = L*D*L^H,
where U (or L) is a product of permutation and unit upper (lower) triangular matrices, U^H is the conjugate
transpose of U, and D is Hermitian and block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
Input Parameters
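uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the Hermitian matrix
A is stored:
= 'U': upper triangular
= 'L': lower triangular
a COMPLEX for chetf2
DOUBLE COMPLEX for zhetf2.
Output Parameters
a On exit, the block diagonal matrix D and the multipliers used to obtain the
factor U or L.
ipiv INTEGER. Array, DIMENSION (n).
Details of the interchanges and the block structure of D.
If ipiv(k) > 0, then rows and columns k and ipiv(k) were interchanged
and D(k,k) is a 1-by-1 diagonal block.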
If uplo = 'U' and ipiv(k) = ipiv( k-1) < 0, then rows and columns
k-1 and -ipiv(k) were interchanged and D(k-1:k,k-1:k ) is a 2-by-2 diagonal
block.
If uplo = 'L' and ipiv(k) = ipiv( k+1) < 0, then rows and columns
k+1 and -ipiv(k) were interchanged and D(k:k+1, k:k+1) is a 2-by-2
diagonal block.
info INTEGER.
> 0: if info = k, D(k,k) is exactly zero. The factorization has been
completed, but the block diagonal matrix D is exactly singular, and division
by zero will occur if it is used to solve a system of equations.
?hetf2_rook
Computes the factorization of a complex Hermitian
matrix, using the bounded Bunch-Kaufman diagonal
pivoting method (unblocked algorithm).
Syntax
call chetf2_rook( uplo, n, a, lda, ipiv, info )
call zhetf2_rook( uplo, n, a, lda, ipiv, info )
Include Files
mkl.fi
Description
The routine computes the factorization of a complex Hermitian matrix A using the bounded Bunch-Kaufman
("rook") diagonal pivoting method:
A = U*D*U^H or A = L*D*L^H,
where U (or L) is a product of permutation and unit upper (lower) triangular matrices, U^H is the conjugate
transpose of U, and D is Hermitian and block diagonal with 1-by-1 and 2-by-2 diagonal blocks.
Input Parameters
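uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the Hermitian matrix
A is stored:
= 'U': upper triangular
= 'L': lower triangular
a COMPLEX for chetf2_rook
DOUBLE COMPLEX for zhetf2_rook.
Output Parameters
a On exit, the block diagonal matrix D and the multipliers used to obtain the
factor U or L.
ipiv INTEGER. Array, DIMENSION (n).
Details of the interchanges and the block structure of D.
If ipiv(k) > 0, then rows and columns k and ipiv(k) were interchanged
and D(k,k) is a 1-by-1 diagonal block.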
info INTEGER.
> 0: if info = k, D(k,k) is exactly zero. The factorization has been
completed, but the block diagonal matrix D is exactly singular, and division
by zero will occur if it is used to solve a system of equations.
?tgex2
Swaps adjacent diagonal blocks in an upper (quasi)
triangular matrix pair by an orthogonal/unitary
equivalence transformation.
Syntax
call stgex2( wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, j1, n1, n2, work,
lwork, info )
call dtgex2( wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, j1, n1, n2, work,
lwork, info )
call ctgex2( wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, j1, info )
call ztgex2( wantq, wantz, n, a, lda, b, ldb, q, ldq, z, ldz, j1, info )
Include Files
mkl.fi
Description
The real routines stgex2/dtgex2 swap adjacent diagonal blocks (A11, B11) and (A22, B22) of size 1-by-1 or
2-by-2 in an upper (quasi) triangular matrix pair (A, B) by an orthogonal equivalence transformation. (A, B)
must be in generalized real Schur canonical form (as returned by sgges/dgges), that is, A is block upper
triangular with 1-by-1 and 2-by-2 diagonal blocks. B is upper triangular.
The complex routines ctgex2/ztgex2 swap adjacent diagonal 1-by-1 blocks (A11, B11) and (A22, B22) in
an upper triangular matrix pair (A, B) by a unitary equivalence transformation.
(A, B) must be in generalized Schur canonical form, that is, A and B are both upper triangular.
All routines optionally update the matrices Q and Z of generalized Schur vectors:
For real flavors,
Q(in)*A(in)*Z(in)^T = Q(out)*A(out)*Z(out)^T
Q(in)*B(in)*Z(in)^T = Q(out)*B(out)*Z(out)^T.
For complex flavors,
Q(in)*A(in)*Z(in)^H = Q(out)*A(out)*Z(out)^H
Q(in)*B(in)*Z(in)^H = Q(out)*B(out)*Z(out)^H.
Input Parameters
wantq LOGICAL.
If wantq = .TRUE. : update the left transformation matrix Q;
If wantq = .FALSE. : do not update Q.
wantz LOGICAL.
If wantz = .TRUE. : update the right transformation matrix Z;
If wantz = .FALSE.: do not update Z.
n INTEGER. The order of the matrices A and B. n ≥ 0.
a, b REAL for stgex2
DOUBLE PRECISION for dtgex2
COMPLEX for ctgex2
DOUBLE COMPLEX for ztgex2.
Arrays, DIMENSION (lda, n) and (ldb, n), respectively.
On entry, the matrices A and B in the pair (A, B).
lda INTEGER. The leading dimension of the array a. lda ≥ max(1,n).
ldb INTEGER. The leading dimension of the array b. ldb ≥ max(1,n).
q, z REAL for stgex2
DOUBLE PRECISION for dtgex2
COMPLEX for ctgex2
DOUBLE COMPLEX for ztgex2.
Arrays, DIMENSION (ldq, n) and (ldz, n), respectively.
On entry, if wantq = .TRUE., q contains the orthogonal/unitary matrix Q,
and if wantz = .TRUE., z contains the orthogonal/unitary matrix Z.
ldq INTEGER. The leading dimension of the array q. ldq ≥ 1.
If wantq = .TRUE., ldq ≥ n.
ldz INTEGER. The leading dimension of the array z. ldz ≥ 1.
If wantz = .TRUE., ldz ≥ n.
j1 INTEGER.
The index to the first block (A11, B11). 1 ≤ j1 ≤ n.
n1 INTEGER. Used with real flavors only. The order of the first block (A11,
B11). n1 = 0, 1 or 2.
n2 INTEGER. Used with real flavors only. The order of the second block (A22,
B22). n2 = 0, 1 or 2.
work REAL for stgex2
DOUBLE PRECISION for dtgex2.
Workspace array, DIMENSION (max(1,lwork)). Used with real flavors only.
lwork INTEGER. The dimension of the array work. Used with real flavors only.
lwork ≥ max(n*(n2+n1), 2*(n2+n1)^2).
Output Parameters
a On exit, the updated matrix A.
b On exit, the updated matrix B.
q On exit, the updated matrix Q.
Not referenced if wantq = .FALSE.
z On exit, the updated matrix Z.
Not referenced if wantz = .FALSE.
info INTEGER.
= 0: successful exit.
For stgex2/dtgex2:
If info = 1, the transformed matrix (A, B) would be too far from generalized
Schur form; the blocks are not swapped and (A, B) and (Q, Z) are unchanged.
The problem of swapping is too ill-conditioned.
If info = -16, lwork is too small. An appropriate value for lwork is
returned in work(1).
For ctgex2/ztgex2:
If info = 1, the transformed matrix pair (A, B) would be too far from
generalized Schur form; the problem is ill-conditioned.
?tgsy2
Solves the generalized Sylvester equation (unblocked
algorithm).
Syntax
call stgsy2( trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf,
scale, rdsum, rdscal, iwork, pq, info )
call dtgsy2( trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf,
scale, rdsum, rdscal, iwork, pq, info )
call ctgsy2( trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf,
scale, rdsum, rdscal, info )
call ztgsy2( trans, ijob, m, n, a, lda, b, ldb, c, ldc, d, ldd, e, lde, f, ldf,
scale, rdsum, rdscal, info )
Include Files
mkl.fi
Description
The routine ?tgsy2 solves the generalized Sylvester equation:
A*R-L*B=scale*C (1)
D*R-L*E=scale*F
using Level 1 and 2 BLAS, where R and L are unknown m-by-n matrices, (A, D), (B, E) and (C, F) are given
matrix pairs of size m-by-m, n-by-n and m-by-n, respectively. For stgsy2/dtgsy2, pairs (A, D) and (B, E)
must be in generalized Schur canonical form, that is, A, B are upper quasi triangular and D, E are upper
triangular. For ctgsy2/ztgsy2, matrices A, B, D and E are upper triangular (that is, (A, D) and (B, E) in
generalized Schur form).
The solution (R, L) overwrites (C, F).
0 ≤ scale ≤ 1 is an output scaling factor chosen to avoid overflow.
In matrix notation, solving equation (1) corresponds to solving
Z*x = scale*b
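where Z is defined for real flavors as
Z = [ kron(In, A)  -kron(BT, Im) ]     (2)
    [ kron(In, D)  -kron(ET, Im) ]
and for complex flavors as
Z = [ kron(In, A)  -kron(BH, Im) ]     (3)
    [ kron(In, D)  -kron(EH, Im) ]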
Here Ik is the identity matrix of size k and XT (XH) is the transpose (conjugate transpose) of X. kron(X, Y)
denotes the Kronecker product between the matrices X and Y.
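The correspondence between equation (1) and the linear system Z*x = scale*b can be checked numerically on a small case. The sketch below is written in Python rather than Fortran purely for illustration and does not call MKL. It assumes the real-flavor form Z = [kron(In, A), -kron(BT, Im); kron(In, D), -kron(ET, Im)], x = [vec(R); vec(L)] with column-major vec, and scale = 1. All helper names are hypothetical.

```python
def matmul(x, y):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def transpose(x):
    return [list(col) for col in zip(*x)]

def identity(k):
    return [[1 if i == j else 0 for j in range(k)] for i in range(k)]

def kron(x, y):
    """Kronecker product: block (i, j) of the result is x[i][j] * y."""
    return [[x[i][j] * y[k][l]
             for j in range(len(x[0])) for l in range(len(y[0]))]
            for i in range(len(x)) for k in range(len(y))]

def vec(x):
    """Stack the columns of x into one list (column-major order)."""
    return [x[i][j] for j in range(len(x[0])) for i in range(len(x))]

def build_z(a, b, d, e):
    """Assumed real-flavor Z for m-by-m (a, d) and n-by-n (b, e)."""
    m, n = len(a), len(b)
    top = [left + [-v for v in right] for left, right in
           zip(kron(identity(n), a), kron(transpose(b), identity(m)))]
    bottom = [left + [-v for v in right] for left, right in
              zip(kron(identity(n), d), kron(transpose(e), identity(m)))]
    return top + bottom

def sylvester_sides(a, b, d, e, r, l):
    """Return (Z*x, [vec(A*R - L*B); vec(D*R - L*E)]) for comparison."""
    x = [[v] for v in vec(r) + vec(l)]
    zx = [row[0] for row in matmul(build_z(a, b, d, e), x)]
    diff = lambda p, q: [[s - t for s, t in zip(rp, rq)]
                         for rp, rq in zip(p, q)]
    c = diff(matmul(a, r), matmul(l, b))
    f = diff(matmul(d, r), matmul(l, e))
    return zx, vec(c) + vec(f)
```

For any conforming integer test matrices the two returned lists should agree entry by entry, which is the sense in which solving (1) is equivalent to solving a (2*m*n)-by-(2*m*n) linear system.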
For real flavors, if trans = 'T', solve the transposed system
ZT*y = scale*b
for y, which is equivalent to solving for R and L in
AT*R+DT*L=scale*C (4)
R*BT+L*ET=scale*(-F)
For complex flavors, if trans = 'C', solve the conjugate transposed system
ZH*y = scale*b
for y, which is equivalent to solving for R and L in
AH*R+DH*L=scale*C (5)
R*BH+L*EH=scale*(-F)
These cases are used to compute an estimate of Dif[(A,D),(B,E)] = sigma_min(Z) using reverse
communication with ?lacon.
?tgsy2 also (for ijob ≥ 1) contributes to the computation in ?tgsyl of an upper bound on the separation
between two matrix pairs. Then the input (A, D), (B, E) are sub-pencils of the matrix pair (two matrix pairs)
in ?tgsyl. See ?tgsyl for details.
Input Parameters
trans CHARACTER*1.
If trans = 'N', solve the generalized Sylvester equation (1);
If trans = 'T': solve the transposed system (4).
If trans = 'C': solve the conjugate transposed system (5).
ijob INTEGER. Specifies what kind of functionality is to be performed.

If ijob = 0: solve (1) only.
If ijob = 1: a contribution from this subsystem to a Frobenius norm-based
estimate of the separation between two matrix pairs is computed (look
ahead strategy is used);
If ijob = 2: a contribution from this subsystem to a Frobenius norm-based
estimate of the separation between two matrix pairs is computed (?gecon
on sub-systems is used).
Not referenced if trans = 'T'.
m INTEGER. On entry, m specifies the order of A and D, and the row
dimension of C, F, R and L.
n INTEGER. On entry, n specifies the order of B and E, and the column
dimension of C, F, R and L.
a, b REAL for stgsy2

DOUBLE PRECISION for dtgsy2
COMPLEX for ctgsy2
DOUBLE COMPLEX for ztgsy2.
Arrays, DIMENSION (lda, m) and (ldb, n), respectively. On entry, a contains
an upper (quasi) triangular matrix A, and b contains an upper (quasi)
triangular matrix B.
lda INTEGER. The leading dimension of the array a. lda ≥ max(1, m).
ldb INTEGER.
The leading dimension of the array b. ldb ≥ max(1, n).
c, f REAL for stgsy2

DOUBLE PRECISION for dtgsy2
COMPLEX for ctgsy2
DOUBLE COMPLEX for ztgsy2.
Arrays, DIMENSION (ldc, n) and (ldf, n), respectively. On entry, c contains
the right-hand-side of the first matrix equation in (1), and f contains the
right-hand-side of the second matrix equation in (1).
ldc INTEGER. The leading dimension of the array c. ldc ≥ max(1, m).
d, e REAL for stgsy2

DOUBLE PRECISION for dtgsy2
COMPLEX for ctgsy2
DOUBLE COMPLEX for ztgsy2.
Arrays, DIMENSION (ldd, m) and (lde, n), respectively. On entry, d contains
an upper triangular matrix D, and e contains an upper triangular matrix E.
ldd INTEGER. The leading dimension of the array d. ldd ≥ max(1, m).
lde INTEGER. The leading dimension of the array e. lde ≥ max(1, n).
ldf INTEGER. The leading dimension of the array f. ldf ≥ max(1, m).
rdsum REAL for stgsy2/ctgsy2

DOUBLE PRECISION for dtgsy2/ztgsy2.
On entry, the sum of squares of computed contributions to the Dif-estimate
under computation by ?tgsyl, where the scaling factor rdscal has
been factored out.
rdscal REAL for stgsy2/ctgsy2

DOUBLE PRECISION for dtgsy2/ztgsy2.
On entry, scaling factor used to prevent overflow in rdsum.
iwork INTEGER. Used with real flavors only.

Workspace array, DIMENSION (m+n+2).
Output Parameters
c On exit, if ijob = 0, c is overwritten by the solution R.
f On exit, if ijob = 0, f is overwritten by the solution L.
scale REAL for stgsy2/ctgsy2

DOUBLE PRECISION for dtgsy2/ztgsy2.
On exit, 0 ≤ scale ≤ 1. If 0 < scale < 1, the solutions R and L (C and F
on entry) hold the solutions to a slightly perturbed system, but the input
matrices A, B, D and E are not changed. If scale = 0, R and L hold the
solutions to the homogeneous system with C = F = 0. Normally scale = 1.
rdsum On exit, the corresponding sum of squares updated with the contributions
from the current sub-system.
If trans = 'T', rdsum is not touched.
Note that rdsum only makes sense when ?tgsy2 is called by ?tgsyl.
rdscal On exit, rdscal is updated with respect to the current contributions in
rdsum.
If trans = 'T', rdscal is not touched.
Note that rdscal only makes sense when ?tgsy2 is called by ?tgsyl.
pq INTEGER. Used with real flavors only.

On exit, the number of subsystems (of size 2-by-2, 4-by-4 and 8-by-8)
solved by the routine stgsy2/dtgsy2.
info INTEGER. On exit, if info is set to

= 0: Successful exit
< 0: If info = -i, the i-th argument has an illegal value.
> 0: The matrix pairs (A, D) and (B, E) have common or very close
eigenvalues.
?trti2
Computes the inverse of a triangular matrix
Syntax
call strti2( uplo, diag, n, a, lda, info )
call dtrti2( uplo, diag, n, a, lda, info )
call ctrti2( uplo, diag, n, a, lda, info )
call ztrti2( uplo, diag, n, a, lda, info )
Include Files
mkl.fi
Description
The routine ?trti2 computes the inverse of a real/complex upper or lower triangular matrix.
This is the Level 2 BLAS version of the algorithm.
Input Parameters
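uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular:
= 'U': A is upper triangular,
= 'L': A is lower triangular.
diag CHARACTER*1.
Specifies whether or not the matrix A is unit triangular:
= 'N': A is non-unit triangular,
= 'U': A is unit triangular.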
a REAL for strti2

DOUBLE PRECISION for dtrti2
COMPLEX for ctrti2
DOUBLE COMPLEX for ztrti2.
On entry, the triangular matrix A.

If uplo = 'U', the leading n-by-n upper triangular part of the array a
contains the upper triangular matrix, and the strictly lower triangular part
of a is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of the array a
contains the lower triangular matrix, and the strictly upper triangular part
of a is not referenced.
If diag = 'U', the diagonal elements of a are also not referenced and are
assumed to be 1.
Output Parameters
a On exit, the (triangular) inverse of the original matrix, in the same storage
format.
info INTEGER.
clag2z
Converts a complex single precision matrix to a
complex double precision matrix.
Syntax
call clag2z( m, n, sa, ldsa, a, lda, info )
Include Files
mkl.fi
Description
This routine converts a complex single precision matrix SA to a complex double precision matrix A.
Note that while it is possible to overflow while converting from double to single, it is not possible to overflow
when converting from single to double.
This is an auxiliary routine so there is no argument checking.
Input Parameters
m INTEGER. The number of rows of the matrix A (m ≥ 0).
n INTEGER. The number of columns in the matrix A (n ≥ 0).
sa COMPLEX array, DIMENSION (ldsa, n).

On entry, contains the m-by-n coefficient matrix SA.
ldsa INTEGER. The leading dimension of the array sa; ldsa ≥ max(1, m).
lda INTEGER. The leading dimension of the array a; lda ≥ max(1, m).
Output Parameters
a DOUBLE COMPLEX array, DIMENSION (lda, n).

On exit, contains the m-by-n coefficient matrix A.
info INTEGER.
dlag2s
Converts a double precision matrix to a single
precision matrix.
Syntax
call dlag2s( m, n, a, lda, sa, ldsa, info )
Include Files
mkl.fi
Description
This routine converts a double precision matrix A to a single precision matrix SA.
RMAX is the overflow threshold for single precision arithmetic. dlag2s checks that all the entries of A are
between -RMAX and RMAX. If not, the conversion is aborted and a flag is raised.
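The range check described above can be sketched in a few lines. The example is in Python rather than Fortran, for illustration only, and does not call MKL. It assumes IEEE 754 single precision, so that RMAX is (2 - 2^-23) * 2^127. The helper names to_single and lag2s_sketch are hypothetical, and returning None for sa on failure is only a stand-in for "content unspecified".

```python
import struct

# Largest finite IEEE 754 single precision value (plays the role of RMAX).
RMAX = (2.0 - 2.0 ** -23) * 2.0 ** 127

def to_single(x):
    """Round a double precision value to the nearest single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def lag2s_sketch(a):
    """Return (sa, info): info = 1 if any entry of a lies outside [-RMAX, RMAX]."""
    for row in a:
        for value in row:
            if value < -RMAX or value > RMAX:
                return None, 1      # conversion aborted, flag raised
    sa = [[to_single(v) for v in row] for row in a]
    return sa, 0
```

A caller would test info before using sa, mirroring the documented contract that sa is meaningful only when info = 0.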
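Input Parameters
m INTEGER. The number of rows of the matrix A (m ≥ 0).
n INTEGER. The number of columns in the matrix A (n ≥ 0).
a DOUBLE PRECISION array, DIMENSION (lda, n).

On entry, contains the m-by-n coefficient matrix A.
lda INTEGER. The leading dimension of the array a; lda ≥ max(1, m).
ldsa INTEGER. The leading dimension of the array sa; ldsa ≥ max(1, m).
Output Parameters
sa REAL array, DIMENSION (ldsa, n).

On exit, if info = 0, contains the m-by-n coefficient matrix SA; if info >
0, the content of sa is unspecified.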
info INTEGER.
If info = 1, an entry of the matrix A is greater than the single precision
overflow threshold; in this case, the content of sa on exit is unspecified.
slag2d
Converts a single precision matrix to a double
precision matrix.
Syntax
call slag2d( m, n, sa, ldsa, a, lda, info )
Include Files
mkl.fi
Description
The routine converts a single precision matrix SA to a double precision matrix A.
Note that while it is possible to overflow while converting from double to single, it is not possible to overflow
when converting from single to double.
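Input Parameters
m INTEGER. The number of rows of the matrix A (m ≥ 0).
n INTEGER. The number of columns in the matrix A (n ≥ 0).
sa REAL array, DIMENSION (ldsa, n).

On entry, contains the m-by-n coefficient matrix SA.
ldsa INTEGER. The leading dimension of the array sa; ldsa ≥ max(1, m).
lda INTEGER. The leading dimension of the array a; lda ≥ max(1, m).
Output Parameters
a DOUBLE PRECISION array, DIMENSION (lda, n).

On exit, contains the m-by-n coefficient matrix A.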
info INTEGER.
zlag2c
Converts a complex double precision matrix to a
complex single precision matrix.
Syntax
call zlag2c( m, n, a, lda, sa, ldsa, info )
Include Files
mkl.fi
Description
The routine converts a double precision complex matrix A to a single precision complex matrix SA.
RMAX is the overflow threshold for single precision arithmetic. zlag2c checks that all the entries of A are
between -RMAX and RMAX. If not, the conversion is aborted and a flag is raised.
Input Parameters
a DOUBLE COMPLEX array, DIMENSION (lda, n).

On entry, contains the m-by-n coefficient matrix A.

Output Parameters
sa COMPLEX array, DIMENSION (ldsa, n).

On exit, if info = 0, contains the m-by-n coefficient matrix SA; if info >
0, the content of sa is unspecified.
info INTEGER.
If info = 1, an entry of the matrix A is greater than the single precision
overflow threshold; in this case, the content of sa on exit is unspecified.
?larfp
Generates a real or complex elementary reflector.
Syntax
call slarfp(n, alpha, x, incx, tau)
call dlarfp(n, alpha, x, incx, tau)
call clarfp(n, alpha, x, incx, tau)
call zlarfp(n, alpha, x, incx, tau)
Include Files
mkl.fi
Description
The ?larfp routines generate a real or complex elementary reflector H of order n, such that
H * (alpha) = (beta),
( x ) ( 0 )
and H'*H =I for real flavors, conjg(H)'*H =I for complex flavors.
Here
alpha and beta are scalars, beta is real and non-negative,
x is (n-1)-element vector.
H is represented in the form
H = I - tau*( 1 )* (1 v'),
( v )
where tau is scalar, and v is (n-1)-element vector.
For real flavors if the elements of x are all zero, then tau = 0 and H is taken to be the unit matrix. Otherwise
1 ≤ tau ≤ 2.
For complex flavors if the elements of x are all zero and alpha is real, then tau = 0 and H is taken to be the
unit matrix. Otherwise 1 ≤ real(tau) ≤ 2, and abs(tau-1) ≤ 1.
Input Parameters
n INTEGER. Specifies the order of the elementary reflector.
alpha REAL for slarfp

DOUBLE PRECISION for dlarfp
COMPLEX for clarfp
DOUBLE COMPLEX for zlarfp
x REAL for slarfp

DOUBLE PRECISION for dlarfp
COMPLEX for clarfp
DOUBLE COMPLEX for zlarfp
Array, DIMENSION at least (1 + (n - 1)*abs(incx)). It contains the
vector x.

Output Parameters
alpha Overwritten by the value beta.
x Overwritten by the vector v.
tau REAL for slarfp

DOUBLE PRECISION for dlarfp
COMPLEX for clarfp
DOUBLE COMPLEX for zlarfp
Contains the scalar tau.
ila?lc
Scans a matrix for its last non-zero column.
Syntax
value = ilaslc(m, n, a, lda)
value = iladlc(m, n, a, lda)
value = ilaclc(m, n, a, lda)
value = ilazlc(m, n, a, lda)
Include Files
mkl.fi
Description
The ila?lc routines scan a matrix A for its last non-zero column.
Input Parameters
m INTEGER. Specifies number of rows in the matrix A.
n INTEGER. Specifies number of columns in the matrix A.
a REAL for ilaslc

DOUBLE PRECISION for iladlc
COMPLEX for ilaclc
DOUBLE COMPLEX for ilazlc
Array, DIMENSION(lda, *). The second dimension of a must be at least
max(1, n).
Before entry the leading m-by-n part of the array a must contain the matrix
A.

Output Parameters
value INTEGER
Number of the last non-zero column.
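The scan can be pictured with a short sketch. It is written in Python for illustration only and is not the MKL implementation. The function name is hypothetical, and returning 0 when every entry is zero is an assumption of this sketch.

```python
def last_nonzero_column(a):
    """Return the 1-based index of the last column of a that holds a
    non-zero entry, or 0 if there is none (assumed convention)."""
    if not a or not a[0]:
        return 0
    n = len(a[0])
    # Walk the columns from right to left and stop at the first hit.
    for j in range(n - 1, -1, -1):
        if any(row[j] != 0 for row in a):
            return j + 1
    return 0
```

Scanning from the last column backwards lets the search stop as soon as a non-zero entry is found, which is cheap when trailing zero columns are few.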
ila?lr
Scans a matrix for its last non-zero row.
Syntax
value = ilaslr(m, n, a, lda)
value = iladlr(m, n, a, lda)
value = ilaclr(m, n, a, lda)
value = ilazlr(m, n, a, lda)
Include Files
mkl.fi
Description
The ila?lr routines scan a matrix A for its last non-zero row.
Input Parameters
m INTEGER. Specifies number of rows in the matrix A.
n INTEGER. Specifies number of columns in the matrix A.
a REAL for ilaslr

DOUBLE PRECISION for iladlr
COMPLEX for ilaclr
DOUBLE COMPLEX for ilazlr
Array, DIMENSION(lda, *). The second dimension of a must be at least
max(1, n).
Before entry the leading m-by-n part of the array a must contain the matrix
A.

Output Parameters
value INTEGER
Number of the last non-zero row.
?gsvj0
Pre-processor for the routine ?gesvj.
Syntax
call sgsvj0(jobv, m, n, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep, work,
lwork, info)
call dgsvj0(jobv, m, n, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep, work,
lwork, info)
call cgsvj0(jobv, m, n, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep, work,
lwork, info)
call zgsvj0(jobv, m, n, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep, work,
lwork, info)
Include Files
mkl.fi
Description
This routine is called from ?gesvj as a pre-processor and that is its main purpose. It applies Jacobi rotations
in the same way as ?gesvj does, but it does not check convergence (stopping criterion).
The routine ?gsvj0 enables ?gesvj to use a simplified version of itself to work on a submatrix of the original
matrix.
Input Parameters
jobv CHARACTER*1. Must be 'V', 'A', or 'N'.

Specifies whether the output from this routine is used to compute the
matrix V.
If jobv = 'V', the product of the Jacobi rotations is accumulated by post-
multiplying the n-by-n array v.
If jobv = 'A', the product of the Jacobi rotations is accumulated by post-

multiplying the mv-by-n array v.
If jobv = 'N', the Jacobi rotations are not accumulated.
m INTEGER. The number of rows of the input matrix A (m ≥ 0).
n INTEGER. The number of columns of the input matrix A (m ≥ n ≥ 0).
a REAL for sgsvj0

DOUBLE PRECISION for dgsvj0.
COMPLEX for cgsvj0
DOUBLE COMPLEX for zgsvj0
Array, DIMENSION (lda, n). Contains the m-by-n matrix A, such that
A*diag(D) represents the input matrix.
d REAL for sgsvj0

DOUBLE PRECISION for dgsvj0.
COMPLEX for cgsvj0
DOUBLE COMPLEX for zgsvj0
Array, DIMENSION (n). Contains the diagonal matrix D that accumulates the
scaling factors from the fast scaled Jacobi rotations. On entry A*diag(D)
represents the input matrix.
sva REAL for sgsvj0

DOUBLE PRECISION for dgsvj0.
REAL for cgsvj0
DOUBLE PRECISION for zgsvj0.
Array, DIMENSION (n). Contains the Euclidean norms of the columns of the
matrix A*diag(D).
mv INTEGER.

If jobv = 'A', then mv rows of v are post-multiplied by a sequence of
Jacobi rotations.
If jobv = 'N', then mv is not referenced.
v REAL for sgsvj0

DOUBLE PRECISION for dgsvj0.
COMPLEX for cgsvj0
DOUBLE COMPLEX for zgsvj0
Array, DIMENSION (ldv, n).
If jobv = 'V', then n rows of v are post-multiplied by a sequence of
Jacobi rotations.
If jobv = 'A', then mv rows of v are post-multiplied by a sequence of
Jacobi rotations.
If jobv = 'N', then v is not referenced.
ldv INTEGER. The leading dimension of the array v; ldv ≥ 1

ldv ≥ n if jobv = 'V';
ldv ≥ mv if jobv = 'A'.
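eps REAL for sgsvj0

DOUBLE PRECISION for dgsvj0.
REAL for cgsvj0
DOUBLE PRECISION for zgsvj0.
The relative machine precision (epsilon) returned by the routine ?lamch.
sfmin REAL for sgsvj0

DOUBLE PRECISION for dgsvj0.
REAL for cgsvj0
DOUBLE PRECISION for zgsvj0.
Value of safe minimum returned by the routine ?lamch.
tol REAL for sgsvj0

DOUBLE PRECISION for dgsvj0.
REAL for cgsvj0
DOUBLE PRECISION for zgsvj0.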
The threshold for Jacobi rotations. For a pair A(:,p), A(:,q) of pivot
columns, the Jacobi rotation is applied only if
abs(cos(angle(A(:,p),A(:,q))))> tol.
nsweep INTEGER.
The number of sweeps of Jacobi rotations to be performed.
work REAL for sgsvj0

DOUBLE PRECISION for dgsvj0.
COMPLEX for cgsvj0
DOUBLE COMPLEX for zgsvj0
Workspace array, DIMENSION (lwork).
lwork INTEGER. The size of the array work; at least max(1, m).
Output Parameters
a On exit, A*diag(D) represents the input matrix post-multiplied by a
sequence of Jacobi rotations, where the rotation threshold and the total
number of sweeps are given in tol and nsweep, respectively.
d On exit, A*diag(D) represents the input matrix post-multiplied by a
sequence of Jacobi rotations, where the rotation threshold and the total
number of sweeps are given in tol and nsweep, respectively.
sva On exit, contains the Euclidean norms of the columns of the output matrix
A*diag(D).
v If jobv = 'V', then n rows of v are post-multiplied by a sequence of
Jacobi rotations.
If jobv = 'A', then mv rows of v are post-multiplied by a sequence of
Jacobi rotations.
info INTEGER.
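The role of the tol parameter can be illustrated with a small sketch of the rotation test abs(cos(angle(A(:,p),A(:,q)))) > tol. The code is Python, for illustration only, and is not the MKL implementation. The function name is hypothetical, and treating a zero column as "no rotation" is an assumption of this sketch.

```python
import math

def rotation_needed(col_p, col_q, tol):
    """Return True if the cosine of the angle between two real columns
    exceeds tol in absolute value, i.e. the pair is not yet orthogonal
    to within the threshold."""
    dot = sum(x * y for x, y in zip(col_p, col_q))
    norm_p = math.sqrt(sum(x * x for x in col_p))
    norm_q = math.sqrt(sum(y * y for y in col_q))
    if norm_p == 0.0 or norm_q == 0.0:
        return False            # assumed: nothing to rotate against
    return abs(dot) / (norm_p * norm_q) > tol
```

A larger tol skips more pivot pairs per sweep, so fewer rotations are applied and the columns are left less orthogonal.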
?gsvj1
Pre-processor for the routine ?gesvj, applies Jacobi
rotations targeting only particular pivots.
Syntax
call sgsvj1(jobv, m, n, n1, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep,
work, lwork, info)
call dgsvj1(jobv, m, n, n1, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep,
work, lwork, info)
call cgsvj1(jobv, m, n, n1, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep,
work, lwork, info)
call zgsvj1(jobv, m, n, n1, a, lda, d, sva, mv, v, ldv, eps, sfmin, tol, nsweep,
work, lwork, info)
Include Files
mkl.fi
Description
This routine is called from ?gesvj as a pre-processor and that is its main purpose. It applies Jacobi rotations
in the same way as ?gesvj does, but it targets only particular pivots and it does not check convergence
(stopping criterion).
The routine ?gsvj1 applies a few sweeps of Jacobi rotations in the column space of the input m-by-n matrix A.
The pivot pairs are taken from the (1,2) off-diagonal block in the corresponding n-by-n Gram matrix A'*A.
The block-entries (tiles) of the (1,2) off-diagonal block are marked by the [x]'s in the following scheme:
* * * x x x
* * * x x x
* * * x x x
x x x * * *
x x x * * *
x x x * * *
row-cycling in the nblr-by-nblc[x] blocks, row-cyclic pivoting inside each [x] block
In terms of the columns of the matrix A, the first n1 columns are rotated 'against' the remaining n-n1
columns, trying to increase the angle between the corresponding subspaces. The off-diagonal block is n1-by-
(n-n1) and it is tiled using quadratic tiles. The number of sweeps is specified by nsweep, and the
orthogonality threshold is set by tol.
Input Parameters
jobv CHARACTER*1. Must be 'V', 'A', or 'N'.

Specifies whether the output from this routine is used to compute the
matrix V.
If jobv = 'V', the product of the Jacobi rotations is accumulated by post-

multiplying the n-by-n array v.
If jobv = 'A', the product of the Jacobi rotations is accumulated by post-

multiplying the mv-by-n array v.
If jobv = 'N', the Jacobi rotations are not accumulated.
m INTEGER. The number of rows of the input matrix A (m ≥ 0).
n INTEGER. The number of columns of the input matrix A (m ≥ n ≥ 0).
n1 INTEGER. Specifies the 2-by-2 block partition. The first n1 columns are
rotated 'against' the remaining n-n1 columns of the matrix A.
a REAL for sgsvj1

DOUBLE PRECISION for dgsvj1.
COMPLEX for cgsvj1
DOUBLE COMPLEX for zgsvj1.
Array, DIMENSION (lda, n). Contains the m-by-n matrix A, such that
A*diag(D) represents the input matrix.
d REAL for sgsvj1

DOUBLE PRECISION for dgsvj1.
COMPLEX for cgsvj1
DOUBLE COMPLEX for zgsvj1.
Array, DIMENSION (n). Contains the diagonal matrix D that accumulates
the scaling factors from the fast scaled Jacobi rotations. On entry
A*diag(D) represents the input matrix.
sva REAL for sgsvj1

DOUBLE PRECISION for dgsvj1.
REAL for cgsvj1
DOUBLE PRECISION for zgsvj1.
Array, DIMENSION (n). Contains the Euclidean norms of the columns of the
matrix A*diag(D).
mv INTEGER.

If jobv = 'A', then mv rows of v are post-multiplied by a sequence of
Jacobi rotations.
If jobv = 'N', then mv is not referenced.
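v REAL for sgsvj1

DOUBLE PRECISION for dgsvj1.
COMPLEX for cgsvj1
DOUBLE COMPLEX for zgsvj1.
Array, DIMENSION (ldv, n).
If jobv = 'V', then n rows of v are post-multiplied by a sequence of
Jacobi rotations.
If jobv = 'A', then mv rows of v are post-multiplied by a sequence of
Jacobi rotations.
If jobv = 'N', then v is not referenced.
ldv INTEGER. The leading dimension of the array v; ldv ≥ 1

ldv ≥ n if jobv = 'V';
ldv ≥ mv if jobv = 'A'.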
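eps REAL for sgsvj1

DOUBLE PRECISION for dgsvj1.
REAL for cgsvj1
DOUBLE PRECISION for zgsvj1.
The relative machine precision (epsilon) returned by the routine ?lamch.
sfmin REAL for sgsvj1

DOUBLE PRECISION for dgsvj1.
REAL for cgsvj1
DOUBLE PRECISION for zgsvj1.
Value of safe minimum returned by the routine ?lamch.
tol REAL for sgsvj1

DOUBLE PRECISION for dgsvj1.
REAL for cgsvj1
DOUBLE PRECISION for zgsvj1.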
The threshold for Jacobi rotations. For a pair A(:,p), A(:,q) of pivot
columns, the Jacobi rotation is applied only if
abs(cos(angle(A(:,p),A(:,q))))> tol.
nsweep INTEGER.
The number of sweeps of Jacobi rotations to be performed.
work REAL for sgsvj1

DOUBLE PRECISION for dgsvj1.
COMPLEX for cgsvj1
DOUBLE COMPLEX for zgsvj1.
Workspace array, DIMENSION (lwork).
lwork INTEGER. The size of the array work; at least max(1, m).
Output Parameters
a On exit, A*diag(D) represents the input matrix post-multiplied by a
sequence of Jacobi rotations, where the rotation threshold and the total
number of sweeps are given in tol and nsweep, respectively.
d On exit, A*diag(D) represents the input matrix post-multiplied by a
sequence of Jacobi rotations, where the rotation threshold and the total
number of sweeps are given in tol and nsweep, respectively.
sva On exit, contains the Euclidean norms of the columns of the output matrix
A*diag(D).
v If jobv = 'V', then n rows of v are post-multiplied by a sequence of
Jacobi rotations.
If jobv = 'A', then mv rows of v are post-multiplied by a sequence of
Jacobi rotations.
info INTEGER.
?sfrk
Performs a symmetric rank-k operation for matrix in
RFP format.
Syntax
call ssfrk(transr, uplo, trans, n, k, alpha, a, lda, beta, c)
call dsfrk(transr, uplo, trans, n, k, alpha, a, lda, beta, c)
Include Files
mkl.fi
Description
The ?sfrk routines perform a matrix-matrix operation using symmetric matrices. The operation is defined as
C := alpha*A*AT + beta*C,
or
C := alpha*AT*A + beta*C,
where:
alpha and beta are scalars,
C is an n-by-n symmetric matrix in rectangular full packed (RFP) format,
A is an n-by-k matrix in the first case and a k-by-n matrix in the second case.
Input Parameters
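transr CHARACTER*1.
if transr = 'N' or 'n', the normal form of RFP C is stored;
if transr = 'T' or 't', the transpose form of RFP C is stored.
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array c is used.
trans CHARACTER*1. Specifies the operation:

if trans = 'N' or 'n', then C := alpha*A*AT + beta*C;
if trans = 'T' or 't', then C := alpha*AT*A + beta*C.
n INTEGER. Specifies the order of the matrix C. The value of n must be at
least zero.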
k INTEGER. On entry with trans = 'N' or 'n', k specifies the number of

columns of the matrix A, and on entry with trans = 'T' or 't', k
specifies the number of rows of the matrix A.
alpha REAL for ssfrk

DOUBLE PRECISION for dsfrk
a REAL for ssfrk

DOUBLE PRECISION for dsfrk
Array, DIMENSION(lda,ka), where ka is k when trans = 'N' or 'n', and
is n otherwise. Before entry with trans = 'N' or 'n', the leading n-by-k
part of the array a must contain the matrix A, otherwise the leading k-by-n
part of the array a must contain the matrix A.

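lda INTEGER. Specifies the first dimension of a as declared in the calling
(sub)program. When trans = 'N' or 'n', then lda must be at least
max(1,n), otherwise lda must be at least max(1, k).
beta REAL for ssfrk

DOUBLE PRECISION for dsfrk
c REAL for ssfrk

DOUBLE PRECISION for dsfrk
Array, size (n*(n+1)/2). Before entry contains the symmetric matrix C in
RFP format.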
Output Parameters
c If trans = 'N' or 'n', then c contains C := alpha*A*AT + beta*C;
if trans = 'T' or 't', then c contains C := alpha*AT*A + beta*C.
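The arithmetic of the rank-k update can be sketched on ordinary full storage. The example is Python, for illustration only. It does not use the RFP layout and does not call MKL, and the function name is hypothetical.

```python
def rank_k_update_full(trans, alpha, a, beta, c):
    """Return alpha*A*AT + beta*C (trans = 'N', A is n-by-k) or
    alpha*AT*A + beta*C (trans = 'T', A is k-by-n) as a full n-by-n
    matrix stored as a list of rows."""
    n = len(c)
    if trans in ("N", "n"):
        k = len(a[0])
        prod = lambda i, j: sum(a[i][p] * a[j][p] for p in range(k))
    else:
        k = len(a)
        prod = lambda i, j: sum(a[p][i] * a[p][j] for p in range(k))
    return [[alpha * prod(i, j) + beta * c[i][j] for j in range(n)]
            for i in range(n)]
```

Because the result is symmetric whenever C is, only n*(n+1)/2 of its entries are independent, which is why the routine can keep C in a packed layout.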
?hfrk
Performs a Hermitian rank-k operation for matrix in
RFP format.
Syntax
call chfrk(transr, uplo, trans, n, k, alpha, a, lda, beta, c)
call zhfrk(transr, uplo, trans, n, k, alpha, a, lda, beta, c)
Include Files
mkl.fi
Description
The ?hfrk routines perform a matrix-matrix operation using Hermitian matrices. The operation is defined as
C := alpha*A*AH + beta*C,
or
C := alpha*AH*A + beta*C,
where:
alpha and beta are real scalars,
C is an n-by-n Hermitian matrix in RFP format,
A is an n-by-k matrix in the first case and a k-by-n matrix in the second case.
Input Parameters
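transr CHARACTER*1.
if transr = 'N' or 'n', the normal form of RFP C is stored;
if transr = 'C' or 'c', the conjugate-transpose form of RFP C is stored.
uplo CHARACTER*1. Specifies whether the upper or lower triangular part of the
array c is used.
trans CHARACTER*1. Specifies the operation:

if trans = 'N' or 'n', then C := alpha*A*AH + beta*C;
if trans = 'C' or 'c', then C := alpha*AH*A + beta*C.
n INTEGER. Specifies the order of the matrix C. The value of n must be at
least zero.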
k INTEGER. On entry with trans = 'N' or 'n', k specifies the number of
columns of the matrix A, and on entry with trans = 'C' or 'c', k
specifies the number of rows of the matrix A.
alpha REAL for chfrk

DOUBLE PRECISION for zhfrk
a COMPLEX for chfrk

DOUBLE COMPLEX for zhfrk
Array, DIMENSION(lda,ka), where ka is k when trans = 'N' or 'n', and
is n otherwise. Before entry with trans = 'N' or 'n', the leading n-by-k
part of the array a must contain the matrix A, otherwise the leading k-by-n
part of the array a must contain the matrix A.

lda INTEGER. Specifies the first dimension of a as declared in the calling
(sub)program. When trans = 'N' or 'n', then lda must be at least
max(1,n), otherwise lda must be at least max(1, k).
beta REAL for chfrk

DOUBLE PRECISION for zhfrk
c COMPLEX for chfrk

DOUBLE COMPLEX for zhfrk
Array, size (n*(n+1)/2). Before entry contains the Hermitian matrix C in
RFP format.
Output Parameters
c If trans = 'N' or 'n', then c contains C := alpha*A*AH + beta*C;
if trans = 'C' or 'c', then c contains C := alpha*AH*A + beta*C.
?tfsm
Solves a matrix equation (one operand is a triangular
matrix in RFP format).
Syntax
call stfsm(transr, side, uplo, trans, diag, m, n, alpha, a, b, ldb)
call dtfsm(transr, side, uplo, trans, diag, m, n, alpha, a, b, ldb)
call ctfsm(transr, side, uplo, trans, diag, m, n, alpha, a, b, ldb)
call ztfsm(transr, side, uplo, trans, diag, m, n, alpha, a, b, ldb)
Include Files
mkl.fi
Description
The ?tfsm routines solve one of the following matrix equations:
op(A)*X = alpha*B,
or
X*op(A) = alpha*B,
where:
alpha is a scalar,
X and B are m-by-n matrices,
A is a unit, or non-unit, upper or lower triangular matrix in rectangular full packed (RFP) format.
op(A) can be one of the following:
op(A) = A or op(A) = AT for real flavors

op(A) = A or op(A) = AH for complex flavors
The matrix B is overwritten by the solution matrix X.
Input Parameters
transr CHARACTER*1.
if transr = 'N' or 'n', the normal form of RFP A is stored;
if transr = 'T' or 't', the transpose form of RFP A is stored;
if transr = 'C' or 'c', the conjugate-transpose form of RFP A is stored.
side CHARACTER*1. Specifies whether op(A) appears on the left or right of X in

the equation:
if side = 'L' or 'l', then op(A)*X = alpha*B;
if side = 'R' or 'r', then X*op(A) = alpha*B.
uplo CHARACTER*1. Specifies whether the RFP matrix A is upper or lower

triangular:
if uplo = 'U' or 'u', then the matrix is upper triangular;
if uplo = 'L' or 'l', then the matrix is lower triangular.
trans CHARACTER*1. Specifies the form of op(A) used in the matrix

multiplication:
if trans = 'N' or 'n', then op(A) = A;
if trans = 'T' or 't', then op(A) = A';
if trans = 'C' or 'c', then op(A) = conjg(A').
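diag CHARACTER*1. Specifies whether the RFP matrix A is unit triangular:

if diag = 'U' or 'u', then the matrix is unit triangular;
if diag = 'N' or 'n', then the matrix is not unit triangular.
m INTEGER. Specifies the number of rows of B. The value of m must be at
least zero.
n INTEGER. Specifies the number of columns of B. The value of n must be at
least zero.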
alpha REAL for stfsm

DOUBLE PRECISION for dtfsm
COMPLEX for ctfsm
DOUBLE COMPLEX for ztfsm
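Specifies the scalar alpha. When alpha is zero, then a is not referenced
and b need not be set before entry.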
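a REAL for stfsm

DOUBLE PRECISION for dtfsm
COMPLEX for ctfsm
DOUBLE COMPLEX for ztfsm
Array, size (n*(n+1)/2). Contains the matrix A in RFP format.
b REAL for stfsm

DOUBLE PRECISION for dtfsm
COMPLEX for ctfsm
DOUBLE COMPLEX for ztfsm
Array, DIMENSION (ldb, n).
Before entry, the leading m-by-n part of the array b must contain the
right-hand side matrix B.
ldb INTEGER. Specifies the first dimension of b as declared in the calling
(sub)program. The value of ldb must be at least max(1, m).
Output Parameters
b Overwritten by the solution matrix X.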
?lansf
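Returns the value of the 1-norm, or the Frobenius norm, or the infinity
norm, or the element of largest absolute value of a symmetric matrix in
RFP format.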
Syntax
val = slansf(norm, transr, uplo, n, a, work)
val = dlansf(norm, transr, uplo, n, a, work)
Include Files
mkl.fi
Description
The function ?lansf returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of an n-by-n real symmetric matrix A in the rectangular full packed (RFP)
format.
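Input Parameters
norm CHARACTER*1.
Specifies the value to be returned by the routine:
= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix A (maximum
column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix A (maximum
row sum),
= 'F' or 'f' or 'E' or 'e': val = normF(A), Frobenius norm of the matrix A
(square root of sum of squares).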
transr CHARACTER*1.
Specifies whether the RFP format of matrix A is normal or transposed
format.
If transr = 'N': RFP format is normal;
if transr = 'T': RFP format is transposed.
uplo CHARACTER*1.
Specifies whether the RFP matrix A came from upper or lower triangular
matrix.
If uplo = 'U': RFP matrix A came from an upper triangular matrix;
if uplo = 'L': RFP matrix A came from a lower triangular matrix.

n INTEGER. The order of the matrix A. n ≥ 0.
When n = 0, ?lansf is set to zero.
a REAL for slansf

DOUBLE PRECISION for dlansf
Array, DIMENSION (n*(n+1)/2).
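The upper (if uplo = 'U') or lower (if uplo = 'L') part of the symmetric
matrix A stored in RFP format.
work REAL for slansf.

DOUBLE PRECISION for dlansf.
Workspace array, DIMENSION (max(1,lwork)), where lwork ≥ n when
norm = 'I' or '1' or 'O'; otherwise, work is not referenced.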
Output Parameters
val REAL for slansf

DOUBLE PRECISION for dlansf
?lanhf
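Returns the value of the 1-norm, or the Frobenius norm, or the infinity
norm, or the element of largest absolute value of a Hermitian matrix in
RFP format.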
Syntax
val = clanhf(norm, transr, uplo, n, a, work)
val = zlanhf(norm, transr, uplo, n, a, work)
Include Files
mkl.fi
Description
The function ?lanhf returns the value of the 1-norm, or the Frobenius norm, or the infinity norm, or the
element of largest absolute value of an n-by-n complex Hermitian matrix A in the rectangular full packed
(RFP) format.
Input Parameters
norm CHARACTER*1.
Specifies the value to be returned by the routine:

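= 'M' or 'm': val = max(abs(Aij)), largest absolute value of the matrix A.
= '1' or 'O' or 'o': val = norm1(A), 1-norm of the matrix A (maximum
column sum),
= 'I' or 'i': val = normI(A), infinity norm of the matrix A (maximum
row sum),
= 'F' or 'f' or 'E' or 'e': val = normF(A), Frobenius norm of the matrix A
(square root of sum of squares).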
transr CHARACTER*1.
Specifies whether the RFP format of matrix A is normal or conjugate-
transposed format.
If transr = 'N': RFP format is normal;
if transr = 'C': RFP format is conjugate-transposed.
uplo CHARACTER*1.
Specifies whether the RFP matrix A came from upper or lower triangular
matrix.
If uplo = 'U': RFP matrix A came from an upper triangular matrix;
if uplo = 'L': RFP matrix A came from a lower triangular matrix.

n INTEGER. The order of the matrix A. n ≥ 0.
When n = 0, ?lanhf is set to zero.
a COMPLEX for clanhf

DOUBLE COMPLEX for zlanhf
Array, DIMENSION (n*(n+1)/2).
The upper (if uplo = 'U') or lower (if uplo = 'L') part of the Hermitian
matrix A stored in RFP format.
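work REAL for clanhf.

DOUBLE PRECISION for zlanhf.
Workspace array, DIMENSION (max(1,lwork)), where lwork ≥ n when
norm = 'I' or '1' or 'O'; otherwise, work is not referenced.
Output Parameters
val REAL for clanhf

DOUBLE PRECISION for zlanhf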
?tfttp
Copies a triangular matrix from the rectangular full
packed format (TF) to the standard packed format
(TP).
Syntax
call stfttp( transr, uplo, n, arf, ap, info )
call dtfttp( transr, uplo, n, arf, ap, info )
call ctfttp( transr, uplo, n, arf, ap, info )
call ztfttp( transr, uplo, n, arf, ap, info )
Include Files
mkl.fi
Description
The routine copies a triangular matrix A from the Rectangular Full Packed (RFP) format to the standard
packed format. For the description of the RFP format, see Matrix Storage Schemes.
Input Parameters
transr CHARACTER*1.
= 'N': arf is in the Normal format,
= 'T': arf is in the Transpose format (for stfttp and dtfttp),
= 'C': arf is in the Conjugate-transpose format (for ctfttp and ztfttp).
uplo CHARACTER*1.
Specifies whether A is upper or lower triangular:
= 'U': A is upper triangular,
= 'L': A is lower triangular.
arf REAL for stfttp,

DOUBLE PRECISION for dtfttp,
COMPLEX for ctfttp,
DOUBLE COMPLEX for ztfttp.
Array, size at least max (1, n*(n+1)/2).
On entry, the upper or lower triangular matrix A stored in the RFP format.
Output Parameters
ap REAL for stfttp,

DOUBLE PRECISION for dtfttp,
COMPLEX for ctfttp,
DOUBLE COMPLEX for ztfttp.
On exit, the upper or lower triangular matrix A, packed columnwise in a
linear array.
if uplo = 'U', ap(i + (j-1)*j/2) = A(i,j) for 1 ≤ i ≤ j,
if uplo = 'L', ap(i + (j-1)*(2*n-j)/2) = A(i,j) for j ≤ i ≤ n.
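The columnwise packed indexing can be checked with a short sketch. It is written in Python for illustration only and does not call MKL. It uses the 1-based formulas ap(i + (j-1)*j/2) = A(i,j) for the upper triangle (1 ≤ i ≤ j) and ap(i + (j-1)*(2*n-j)/2) = A(i,j) for the lower triangle (j ≤ i ≤ n). The function name is hypothetical.

```python
def pack_triangle(a, uplo):
    """Pack the upper ('U') or lower ('L') triangle of the square matrix a
    column by column into a linear list, using the 1-based index formulas
    of the standard packed format."""
    n = len(a)
    ap = [None] * (n * (n + 1) // 2)
    for j in range(1, n + 1):
        if uplo == "U":
            rows = range(1, j + 1)
        else:
            rows = range(j, n + 1)
        for i in rows:
            if uplo == "U":
                k = i + (j - 1) * j // 2
            else:
                k = i + (j - 1) * (2 * n - j) // 2
            ap[k - 1] = a[i - 1][j - 1]   # shift to 0-based storage
    return ap
```

For a 3-by-3 matrix the upper triangle is packed in the order A(1,1), A(1,2), A(2,2), A(1,3), A(2,3), A(3,3), and the lower triangle in the order A(1,1), A(2,1), A(3,1), A(2,2), A(3,2), A(3,3).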

?tfttr
Copies a triangular matrix from the rectangular full
packed format (TF) to the standard full format (TR).
Syntax
call stfttr( transr, uplo, n, arf, a, lda, info )
call dtfttr( transr, uplo, n, arf, a, lda, info )
call ctfttr( transr, uplo, n, arf, a, lda, info )
call ztfttr( transr, uplo, n, arf, a, lda, info )
Include Files
mkl.fi
Description
The routine copies a triangular matrix A from the Rectangular Full Packed (RFP) format to the standard full
format. For the description of the RFP format, see Matrix Storage Schemes.
Input Parameters
transr CHARACTER*1.
= 'N': arf is in the Normal format,
= 'T': arf is in the Transpose format (for stfttr and dtfttr),
= 'C': arf is in the Conjugate-transpose format (for ctfttr and ztfttr).
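uplo CHARACTER*1.
Specifies whether A is upper or lower triangular:
= 'U': A is upper triangular,
= 'L': A is lower triangular.
n INTEGER. The order of the matrices arf and a. n ≥ 0.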
arf REAL for stfttr,

DOUBLE PRECISION for dtfttr,
COMPLEX for ctfttr,
DOUBLE COMPLEX for ztfttr.
On entry, the upper or lower triangular matrix A stored in the RFP
format.
Output Parameters
a REAL for stfttr,

DOUBLE PRECISION for dtfttr,
COMPLEX for ctfttr,
DOUBLE COMPLEX for ztfttr.
On exit, the triangular matrix A. If uplo = 'U', the leading n-by-n upper
triangular part of the array a contains the upper triangular matrix, and the
strictly lower triangular part of a is not referenced. If uplo = 'L', the leading
n-by-n lower triangular part of the array a contains the lower triangular
matrix, and the strictly upper triangular part of a is not referenced.

?tpqrt2
Computes a QR factorization of a real or complex
"triangular-pentagonal" matrix, which is composed of
a triangular block and a pentagonal block, using the
compact WY representation for Q.
Syntax
call stpqrt2(m, n, l, a, lda, b, ldb, t, ldt, info)
call dtpqrt2(m, n, l, a, lda, b, ldb, t, ldt, info)
call ctpqrt2(m, n, l, a, lda, b, ldb, t, ldt, info)
call ztpqrt2(m, n, l, a, lda, b, ldb, t, ldt, info)
call tpqrt2(a, b, t [, info])
Include Files
mkl.fi, lapack.f90
Description
The input matrix C is an (n+m)-by-n matrix
    C = [ A ]
        [ B ]
where A is an n-by-n upper triangular matrix, and B is an m-by-n pentagonal matrix consisting of an (m-l)-
by-n rectangular matrix B1 on top of an l-by-n upper trapezoidal matrix B2:
    B = [ B1 ]
        [ B2 ]
The upper trapezoidal matrix B2 consists of the first l rows of an n-by-n upper triangular matrix, where
0 ≤ l ≤ min(m,n). If l=0, B is an m-by-n rectangular matrix. If m=l=n, B is upper triangular. The matrix W
contains the elementary reflectors H(i) in the ith column below the diagonal (of A) in the (n+m)-by-n input
matrix C so that W can be represented as
    W = [ I ]
        [ V ]
Thus, V contains all of the information needed for W, and is returned in array b.
NOTE
V has the same form as B:
    V = [ V1 ]
        [ V2 ]
The columns of V represent the vectors which define the H(i)s.

The (m+n)-by-(m+n) block reflector H is then given by
H = I - W*T*W^T for real flavors, and
H = I - W*T*W^H for complex flavors
where W^T is the transpose of W, W^H is the conjugate transpose of W, and T is the upper triangular factor of
the block reflector.
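The formula H = I - W*T*W^T can be checked by hand in the smallest case. The following is a minimal sketch, assuming a single reflector (n = 1, m = 2), so that T reduces to a scalar tau; the choice tau = 2/(w^T*w) and all names in the program are illustrative assumptions, and the code does not call ?tpqrt2.

```fortran
program block_reflector_sketch
  implicit none
  integer, parameter :: n = 1, m = 2
  double precision :: w(n+m, n), t(n, n), h(n+m, n+m), eye(n+m, n+m)
  integer :: i

  ! W = [ I ; V ] with a single column: unit entry on top, V below
  w(:, 1) = [1.0d0, 0.5d0, -2.0d0]

  ! For one reflector, T is the scalar tau; tau = 2/(w'w) makes H orthogonal
  t(1, 1) = 2.0d0 / dot_product(w(:, 1), w(:, 1))

  ! Build the identity matrix of order n+m
  eye = 0.0d0
  do i = 1, n + m
     eye(i, i) = 1.0d0
  end do

  ! H = I - W*T*W^T
  h = eye - matmul(w, matmul(t, transpose(w)))

  ! An orthogonal H satisfies H^T * H = I up to rounding
  if (maxval(abs(matmul(transpose(h), h) - eye)) > 1.0d-12) &
       error stop 'H is not orthogonal'

  print '(3f10.5)', (h(i, :), i = 1, n + m)
end program block_reflector_sketch
```

With more than one reflector, T becomes a genuine n-by-n upper triangular matrix, but the same matrix expression applies.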
Input Parameters
n INTEGER. The number of columns in B and the order of the triangular
matrix A (n ≥ 0).
l INTEGER. The number of rows of the upper trapezoidal part of B (min(m, n)
≥ l ≥ 0).
a, b REAL for stpqrt2

DOUBLE PRECISION for dtpqrt2
COMPLEX for ctpqrt2
COMPLEX*16 for ztpqrt2.
Arrays: a, size (lda, n) contains the n-by-n upper triangular matrix A.
b, size (ldb, n), the pentagonal m-by-n matrix B. The first (m-l) rows
contain the rectangular B1 matrix, and the next l rows contain the upper
trapezoidal B2 matrix.
ldb INTEGER. The leading dimension of b; at least max(1, m).
Output Parameters
a The elements on and above the diagonal of the array contain the upper
triangular matrix R.
b The pentagonal matrix V.
t REAL for stpqrt2

DOUBLE PRECISION for dtpqrt2
COMPLEX for ctpqrt2
COMPLEX*16 for ztpqrt2.
Array, size (ldt, n).
The n-by-n upper triangular factor T of the block reflector.
info INTEGER.
?tprfb
Applies a real or complex "triangular-pentagonal"
blocked reflector to a real or complex matrix, which is
composed of two blocks.
Syntax
call stprfb(side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, a, lda, b, ldb,
work, ldwork)
call dtprfb(side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, a, lda, b, ldb,
work, ldwork)
call ctprfb(side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, a, lda, b, ldb,
work, ldwork)
call ztprfb(side, trans, direct, storev, m, n, k, l, v, ldv, t, ldt, a, lda, b, ldb,
work, ldwork)
call tprfb(t, v, a, b[, direct][, storev][, side][, trans])
Include Files
mkl.fi, lapack.f90
Description
The ?tprfb routine applies a real or complex "triangular-pentagonal" block reflector H, H^T, or H^H from either
the left or the right to a real or complex matrix C, which is composed of two blocks A and B.
The block B is m-by-n. If side = 'R', A is m-by-k, and if side = 'L', A is of size k-by-n.
The pentagonal matrix V is composed of a rectangular block V1 and a trapezoidal block V2. The size of the
trapezoidal block is determined by the parameter l, where 0 ≤ l ≤ k. If l = k, the V2 block of V is triangular; if
l = 0, there is no trapezoidal block, thus V = V1 is rectangular.

             direct='F'                           direct='B'
storev='C'   V2 is upper trapezoidal (first l     V2 is lower trapezoidal (last l
             rows of k-by-k upper triangular      rows of k-by-k lower triangular
             matrix)                              matrix)
storev='R'   V2 is lower trapezoidal (first l     V2 is upper trapezoidal (last l
             columns of k-by-k lower triangular   columns of k-by-k upper triangular
             matrix)                              matrix)

             side='L'        side='R'
storev='C'   V is m-by-k     V is n-by-k
             V2 is l-by-k    V2 is l-by-k
storev='R'   V is k-by-m     V is k-by-n
             V2 is k-by-l    V2 is k-by-l
Input Parameters
side CHARACTER*1.
= 'L': apply H, H^T, or H^H from the left,
= 'R': apply H, H^T, or H^H from the right.
trans CHARACTER*1.
= 'N': apply H (no transpose),
= 'T': apply H^T (transpose),
= 'C': apply H^H (conjugate transpose).
direct CHARACTER*1.
Indicates how H is formed from a product of elementary reflectors:
= 'F': H = H(1) H(2) . . . H(k) (Forward),
= 'B': H = H(k) . . . H(2) H(1) (Backward).
storev CHARACTER*1.
Indicates how the vectors that define the elementary reflectors are stored:
= 'C': Columns,
= 'R': Rows.
n INTEGER. The number of columns in B (n ≥ 0).
k INTEGER. The order of the matrix T, which is the number of elementary
reflectors whose product defines the block reflector (k ≥ 0).
l INTEGER. The order of the trapezoidal part of V (k ≥ l ≥ 0).
v REAL for stprfb

DOUBLE PRECISION for dtprfb
COMPLEX for ctprfb
COMPLEX*16 for ztprfb.
DIMENSION (ldv, k) if storev = 'C',
DIMENSION (ldv, m) if storev = 'R' and side = 'L',
DIMENSION (ldv, n) if storev = 'R' and side = 'R'.
The pentagonal matrix V, which contains the elementary reflectors H(1),
H(2), ..., H(k).

ldv INTEGER. The leading dimension of the array v.
If storev = 'C' and side = 'L', at least max(1, m).
If storev = 'C' and side = 'R', at least max(1, n).
If storev = 'R', at least k.
t REAL for stprfb

COMPLEX for ctprfb
Array size (ldt, k). The triangular k-by-k matrix T in the representation of
the block reflector.
ldt INTEGER. The leading dimension of the array t (ldt ≥ k).
a REAL for stprfb

COMPLEX for ctprfb
DIMENSION (lda, n) if side = 'L',
DIMENSION (lda, k) if side = 'R'.
The k-by-n or m-by-k matrix A.

lda INTEGER. The leading dimension of the array a.
If side = 'L', at least max(1, k).
If side = 'R', at least max(1, m).
b REAL for stprfb

COMPLEX for ctprfb

Array size (ldb, n), the m-by-n matrix B.
ldb INTEGER. The leading dimension of the array b (ldb ≥ max(1, m)).
work REAL for stprfb

COMPLEX for ctprfb
DIMENSION (ldwork, n) if side = 'L',
DIMENSION (ldwork, k) if side = 'R'.
Workspace array.

ldwork INTEGER. The leading dimension of the array work.
If side = 'L', at least k.
If side = 'R', at least m.
Output Parameters
a Contains the corresponding block of H*C, H^T*C, H^H*C, C*H, C*H^T, or C*H^H.
b Contains the corresponding block of H*C, H^T*C, H^H*C, C*H, C*H^T, or C*H^H.

?tpttf
Copies a triangular matrix from the standard packed
format (TP) to the rectangular full packed format (TF).
Syntax
call stpttf( transr, uplo, n, ap, arf, info )
call dtpttf( transr, uplo, n, ap, arf, info )
call ctpttf( transr, uplo, n, ap, arf, info )
call ztpttf( transr, uplo, n, ap, arf, info )
Include Files
mkl.fi
Description
The routine copies a triangular matrix A from the standard packed format to the Rectangular Full Packed
(RFP) format. For the description of the RFP format, see Matrix Storage Schemes.
Input Parameters
transr CHARACTER*1.
= 'N': arf must be in the Normal format,
= 'T': arf must be in the Transpose format (for stpttf and dtpttf),
= 'C': arf must be in the Conjugate-transpose format (for ctpttf and

ztpttf).
uplo CHARACTER*1.
ap REAL for stpttf,

DOUBLE PRECISION for dtpttf,
COMPLEX for ctpttf,
DOUBLE COMPLEX for ztpttf.
On entry, the upper or lower triangular matrix A, packed columnwise in a

linear array.
Output Parameters
arf REAL for stpttf,

DOUBLE PRECISION for dtpttf,
COMPLEX for ctpttf,
DOUBLE COMPLEX for ztpttf.
On exit, the upper or lower triangular matrix A stored in the RFP

format.
info INTEGER.
=0: successful exit,
< 0: if info = -i, the i-th parameter had an illegal value.
?tpttr
Copies a triangular matrix from the standard packed
format (TP) to the standard full format (TR).
Syntax
call stpttr( uplo, n, ap, a, lda, info )
call dtpttr( uplo, n, ap, a, lda, info )
call ctpttr( uplo, n, ap, a, lda, info )
call ztpttr( uplo, n, ap, a, lda, info )
Include Files
mkl.fi
Description
The routine copies a triangular matrix A from the standard packed format to the standard full format.
Input Parameters
uplo CHARACTER*1.
n INTEGER. The order of the matrices ap and a. n ≥ 0.
ap REAL for stpttr,

DOUBLE PRECISION for dtpttr,
COMPLEX for ctpttr,
DOUBLE COMPLEX for ztpttr.
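On entry, the upper or lower triangular matrix A, packed columnwise in a
linear array. The j-th column of A is stored in the array ap as follows:
if uplo = 'U', ap(i + (j-1)*j/2) = A(i,j) for 1 ≤ i ≤ j,
if uplo = 'L', ap(i + (j-1)*(2n-j)/2) = A(i,j) for j ≤ i ≤ n.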
Output Parameters
a REAL for stpttr,

DOUBLE PRECISION for dtpttr,
COMPLEX for ctpttr,
DOUBLE COMPLEX for ztpttr.
On exit, the triangular matrix A. If uplo = 'U', the leading n-by-n upper
triangular part of the array a contains the upper triangular part of the
matrix A, and the strictly lower triangular part of a is not referenced. If
uplo = 'L', the leading n-by-n lower triangular part of the array a contains
the lower triangular part of the matrix A, and the strictly upper triangular
part of a is not referenced.
info INTEGER.
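The unpacking that ?tpttr performs can be illustrated without the library. The following is a minimal sketch, assuming the standard LAPACK packed convention for uplo = 'U', in which A(i,j) is held at ap(i + (j-1)*j/2); the program and variable names are illustrative and the code does not call ?tpttr.

```fortran
program packed_to_full_sketch
  implicit none
  integer, parameter :: n = 3
  real :: ap(n*(n+1)/2), a(n, n)
  integer :: i, j, k

  ! Upper triangle of a 3-by-3 matrix, packed column by column:
  ! column 1: a11; column 2: a12, a22; column 3: a13, a23, a33
  ap = [11.0, 12.0, 22.0, 13.0, 23.0, 33.0]

  ! Zero the array only so that the printout is readable; the strictly
  ! lower triangle is simply not referenced by the copy loop below
  a = 0.0

  k = 0
  do j = 1, n
     do i = 1, j          ! uplo = 'U': rows 1..j of column j
        k = k + 1
        a(i, j) = ap(k)
     end do
  end do

  ! The running index k equals i + (j-1)*j/2 in the upper packed layout
  if (a(2, 3) /= ap(2 + (3 - 1)*3/2)) error stop 'index mismatch'

  do i = 1, n
     print '(3f6.1)', a(i, :)
  end do
end program packed_to_full_sketch
```

For uplo = 'L' the inner loop would run over rows j..n instead, again consuming ap sequentially.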
?trttf
Copies a triangular matrix from the standard full
format (TR) to the rectangular full packed format (TF).
Syntax
call strttf( transr, uplo, n, a, lda, arf, info )
call dtrttf( transr, uplo, n, a, lda, arf, info )
call ctrttf( transr, uplo, n, a, lda, arf, info )
call ztrttf( transr, uplo, n, a, lda, arf, info )
Include Files
mkl.fi
Description
The routine copies a triangular matrix A from the standard full format to the Rectangular Full Packed (RFP)
format. For the description of the RFP format, see Matrix Storage Schemes.
Input Parameters
transr CHARACTER*1.
= 'N': arf must be in the Normal format,
= 'T': arf must be in the Transpose format (for strttf and dtrttf),
= 'C': arf must be in the Conjugate-transpose format (for ctrttf and

ztrttf).
uplo CHARACTER*1.
a REAL for strttf,
DOUBLE PRECISION for dtrttf,

COMPLEX for ctrttf,
DOUBLE COMPLEX for ztrttf.
On entry, the triangular matrix A. If uplo = 'U', the leading n-by-n upper
Output Parameters
arf REAL for strttf,

DOUBLE PRECISION for dtrttf,
COMPLEX for ctrttf,
DOUBLE COMPLEX for ztrttf.
On exit, the upper or lower triangular matrix A stored in the RFP

format.
?trttp
Copies a triangular matrix from the standard full
format (TR) to the standard packed format (TP).
Syntax
call strttp( uplo, n, a, lda, ap, info )
call dtrttp( uplo, n, a, lda, ap, info )
call ctrttp( uplo, n, a, lda, ap, info )
call ztrttp( uplo, n, a, lda, ap, info )
Include Files
mkl.fi
Description
The routine copies a triangular matrix A from the standard full format to the standard packed format.
Input Parameters
uplo CHARACTER*1.
n INTEGER. The order of the matrix A, n ≥ 0.
a REAL for strttp,

DOUBLE PRECISION for dtrttp,
COMPLEX for ctrttp,
DOUBLE COMPLEX for ztrttp.
On entry, the triangular matrix A. If uplo = 'U', the leading n-by-n upper
Output Parameters
ap REAL for strttp,

DOUBLE PRECISION for dtrttp,
COMPLEX for ctrttp,
DOUBLE COMPLEX for ztrttp.
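On exit, the upper or lower triangular matrix A, packed columnwise in a
linear array. The j-th column of A is stored in the array ap as follows:
if uplo = 'U', ap(i + (j-1)*j/2) = A(i,j) for 1 ≤ i ≤ j,
if uplo = 'L', ap(i + (j-1)*(2n-j)/2) = A(i,j) for j ≤ i ≤ n.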

?pstf2
Computes the Cholesky factorization with complete
pivoting of a real symmetric or complex Hermitian
positive semi-definite matrix.
Syntax
call spstf2( uplo, n, a, lda, piv, rank, tol, work, info )
call dpstf2( uplo, n, a, lda, piv, rank, tol, work, info )
call cpstf2( uplo, n, a, lda, piv, rank, tol, work, info )
call zpstf2( uplo, n, a, lda, piv, rank, tol, work, info )
Include Files
mkl.fi
Description
The real flavors spstf2 and dpstf2 compute the Cholesky factorization with complete pivoting of a real
symmetric positive semi-definite matrix A. The complex flavors cpstf2 and zpstf2 compute the Cholesky
factorization with complete pivoting of a complex Hermitian positive semi-definite matrix A. The factorization
has the form:
P^T * A * P = U^T * U, if uplo = 'U' for real flavors,
P^T * A * P = U^H * U, if uplo = 'U' for complex flavors,
P^T * A * P = L * L^T, if uplo = 'L' for real flavors,
P^T * A * P = L * L^H, if uplo = 'L' for complex flavors,
where U is an upper triangular matrix and L is lower triangular, and P is stored as vector piv.
This algorithm does not check that A is positive semi-definite. This version of the algorithm calls level 2
BLAS.
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the symmetric or
= 'U': Upper triangular,
a REAL for spstf2,

DOUBLE PRECISION for dpstf2,
COMPLEX for cpstf2,
DOUBLE COMPLEX for zpstf2.
Array, DIMENSION (lda, *).
On entry, the symmetric matrix A. If uplo = 'U', the leading n-by-n upper
triangular part of the array a contains the upper triangular part of the
matrix A, and the strictly lower triangular part of a is not referenced. If
uplo = 'L', the leading n-by-n lower triangular part of the array a contains
the lower triangular part of the matrix A, and the strictly upper triangular
part of a is not referenced.
tol REAL for spstf2 and cpstf2,

DOUBLE PRECISION for dpstf2 and zpstf2.
A user-defined tolerance.
If tol < 0, n*ulp*max(A(k,k)) will be used (ulp is the Unit in the Last
Place, or Unit of Least Precision). The algorithm terminates at the (k - 1)-st
step if the pivot is not greater than tol.
lda INTEGER. The leading dimension of the matrix A. lda ≥ max(1,n).
work REAL for spstf2 and cpstf2,

DOUBLE PRECISION for dpstf2 and zpstf2.
Workspace array, DIMENSION at least max (1, 2*n).
Output Parameters
piv INTEGER. Array. DIMENSION at least max (1,n).

piv is such that the non-zero entries are P ( piv (k), k ) = 1.
a On exit, if info = 0, the factor U or L from the Cholesky factorization

stored the same way as the matrix A is stored on entry.
rank INTEGER.
The rank of A, determined by the number of steps the algorithm completed.
info INTEGER.
< 0: if info = -k, the k-th parameter had an illegal value,
=0: the algorithm completed successfully,

> 0: the matrix A is rank-deficient with the computed rank, returned in
rank, or indefinite.
dlat2s
Converts a double-precision triangular matrix to a
single-precision triangular matrix.
Syntax
call dlat2s( uplo, n, a, lda, sa, ldsa, info )
Include Files
mkl.fi
Description
This routine converts a double-precision triangular matrix A to a single-precision triangular matrix SA.
dlat2s checks that all the elements of A are between -RMAX and RMAX, where RMAX is the overflow threshold
of single-precision arithmetic. If this condition is not met, the conversion is aborted and a flag is raised. The
routine does no parameter checking.
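The range check described above can be sketched in plain Fortran. The following is a minimal illustration for uplo = 'U', assuming that RMAX corresponds to huge(1.0), the largest finite default-real value; the program and variable names are illustrative, and the code does not call dlat2s.

```fortran
program lat2s_sketch
  implicit none
  integer, parameter :: n = 2
  double precision :: a(n, n), rmax
  real :: sa(n, n)
  integer :: i, j, info

  ! Assumption: the single-precision overflow threshold is huge(1.0)
  rmax = dble(huge(1.0))

  ! Column-major fill; a(2,2) = 1.0d300 cannot be represented in single precision
  a = reshape([1.0d0, 0.0d0, 2.5d0, 1.0d300], [n, n])

  info = 0
  sa = 0.0
  outer: do j = 1, n
     do i = 1, j                       ! upper triangle only
        if (a(i, j) < -rmax .or. a(i, j) > rmax) then
           info = 1                    ! raise the flag and abort the conversion
           exit outer
        end if
        sa(i, j) = real(a(i, j))       ! narrow double precision to single
     end do
  end do outer

  if (info /= 1) error stop 'expected the out-of-range element to be flagged'
  print *, 'info =', info
end program lat2s_sketch
```

As in the routine itself, the contents of sa should be treated as unspecified once the flag is raised.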
Input Parameters
uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular:
n INTEGER. The number of rows and columns of the matrix A. n ≥ 0.
a DOUBLE PRECISION.
On entry, the n-by-n triangular matrix A.
ldsa INTEGER. The leading dimension of the array sa. ldsa ≥ max(1,n).
Output Parameters
sa REAL.
Array, DIMENSION (ldsa, *).
Only the part of sa determined by uplo is referenced. On exit,
if info = 0, the n-by-n triangular matrix SA,

if info > 0, the content of the part of sa determined by uplo is
unspecified.
info INTEGER.
> 0: an element of the matrix A is greater than the single-precision
overflow threshold; in this case, the content of the part of sa determined by
uplo is unspecified on exit.
zlat2c
Converts a double complex triangular matrix to a
complex triangular matrix.
Syntax
call zlat2c( uplo, n, a, lda, sa, ldsa, info )
Include Files
mkl.fi
Description
This routine is declared in mkl_lapack.fi.
The routine converts a DOUBLE COMPLEX triangular matrix A to a COMPLEX triangular matrix SA. zlat2c
checks that the real and imaginary parts of all the elements of A are between -RMAX and RMAX, where RMAX
is the overflow threshold of single-precision arithmetic. If this condition is not met, the conversion is aborted
and a flag is raised. The routine does no parameter checking.
Input Parameters
uplo CHARACTER*1.
Specifies whether the matrix A is upper or lower triangular:
n INTEGER. The number of rows and columns in the matrix A. n ≥ 0.
a DOUBLE COMPLEX.
On entry, the n-by-n triangular matrix A.
ldsa INTEGER. The leading dimension of the array sa. ldsa ≥ max(1,n).
Output Parameters
sa COMPLEX.
Array, DIMENSION (ldsa, *).
Only the part of sa determined by uplo is referenced. On exit,
if info = 0, the n-by-n triangular matrix sa,

if info > 0, the content of the part of sa determined by uplo is
unspecified.
info INTEGER.
> 0: the real or imaginary part of an element of the matrix A is greater than
the single-precision overflow threshold; in this case, the content of the part
of sa determined by uplo is unspecified on exit.
?lacp2
Copies all or part of a real two-dimensional array to a
complex array.
Syntax
call clacp2( uplo, m, n, a, lda, b, ldb )
call zlacp2( uplo, m, n, a, lda, b, ldb )
Include Files
mkl.fi
Description
The routine copies all or part of a real matrix A to a complex matrix B.
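The copy can be pictured as an element-by-element promotion from real to complex with a zero imaginary part. The following is a minimal sketch for uplo = 'U' on a 2-by-3 matrix; the loop bounds, the sentinel value, and all names are illustrative assumptions, and the code does not call clacp2.

```fortran
program real_to_complex_sketch
  implicit none
  integer, parameter :: m = 2, n = 3
  real :: a(m, n)
  complex :: b(m, n)
  integer :: i, j

  a = reshape([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [m, n])
  b = (-1.0, -1.0)                  ! sentinel showing which entries are left alone

  do j = 1, n
     do i = 1, min(j, m)            ! upper triangle or trapezoid only
        b(i, j) = cmplx(a(i, j), 0.0)
     end do
  end do

  if (b(1, 3) /= cmplx(5.0, 0.0)) error stop 'upper entry was not copied'
  if (b(2, 1) /= (-1.0, -1.0))    error stop 'lower entry was modified'
  print *, b
end program real_to_complex_sketch
```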
Input Parameters
uplo CHARACTER*1.
Specifies the part of the matrix A to be copied to B.
Otherwise, all of the matrix A is copied.
a REAL for clacp2

DOUBLE PRECISION for zlacp2
Array, size at least (lda, n), contains the m-by-n matrix A.
If uplo = 'U', only the upper triangle or trapezoid is accessed; if uplo =

'L', only the lower triangle or trapezoid is accessed.
lda INTEGER. The leading dimension of a; lda ≥ max(1, m).
ldb INTEGER. The leading dimension of the output array b; ldb ≥ max(1, m).
Output Parameters
b COMPLEX for clacp2

DOUBLE COMPLEX for zlacp2.
On exit, B = A in the locations specified by uplo.

?la_gbamv
Performs a matrix-vector operation to calculate error
bounds.
Syntax
call sla_gbamv(trans, m, n, kl, ku, alpha, ab, ldab, x, incx, beta, y, incy)
call dla_gbamv(trans, m, n, kl, ku, alpha, ab, ldab, x, incx, beta, y, incy)
call cla_gbamv(trans, m, n, kl, ku, alpha, ab, ldab, x, incx, beta, y, incy)
call zla_gbamv(trans, m, n, kl, ku, alpha, ab, ldab, x, incx, beta, y, incy)
Include Files
mkl.fi
Description
The ?la_gbamv function performs one of the matrix-vector operations defined as
y := alpha*abs(A)*abs(x) + beta*abs(y),
or
y := alpha*abs(A)^T*abs(x) + beta*abs(y),
where:
A is an m-by-n matrix, with kl sub-diagonals and ku super-diagonals.
This function is primarily used in calculating error bounds. To protect against underflow during evaluation,
the function perturbs components in the resulting vector away from zero by (n + 1) times the underflow
threshold. To prevent unnecessarily large errors for block structure embedded in general matrices, the
function does not perturb symbolically zero components. A zero entry is considered symbolic if all
multiplications involved in computing that entry have at least one zero multiplicand.
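The no-transpose operation can be sketched for a small tridiagonal matrix. The example below assumes the general band layout ab(ku+1+i-j, j) = A(i,j) and deliberately omits the perturbation away from zero and the transposed variants; the names are illustrative and the code does not call ?la_gbamv.

```fortran
program abs_band_matvec_sketch
  implicit none
  integer, parameter :: m = 3, n = 3, kl = 1, ku = 1
  real :: ab(kl+ku+1, n), x(n), y(m)
  real :: alpha, beta, s
  integer :: i, j

  ! Tridiagonal A = [ 1 -2  0 ; 3  4 -5 ; 0  6  7 ] in band storage
  ab = 0.0
  ab(1, 2) = -2.0; ab(1, 3) = -5.0                   ! super-diagonal
  ab(2, 1) =  1.0; ab(2, 2) =  4.0; ab(2, 3) = 7.0   ! main diagonal
  ab(3, 1) =  3.0; ab(3, 2) =  6.0                   ! sub-diagonal

  x = [1.0, -1.0, 2.0]
  y = [-1.0, 0.0, 1.0]
  alpha = 1.0
  beta  = 2.0

  ! y := alpha*abs(A)*abs(x) + beta*abs(y)
  do i = 1, m
     s = 0.0
     do j = max(1, i - kl), min(n, i + ku)           ! entries inside the band
        s = s + abs(ab(ku + 1 + i - j, j)) * abs(x(j))
     end do
     y(i) = alpha*s + beta*abs(y(i))
  end do

  if (any(abs(y - [5.0, 17.0, 22.0]) > 1.0e-5)) error stop 'unexpected result'
  print '(3f8.2)', y
end program abs_band_matvec_sketch
```

Because every term is an absolute value, the result is a componentwise bound rather than an ordinary product.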
Input Parameters
trans INTEGER. Specifies the operation to be performed:

If trans = 'BLAS_NO_TRANS', then y := alpha*abs(A)*abs(x) +
beta*abs(y)
If trans = 'BLAS_TRANS', then y := alpha*abs(A^T)*abs(x) +
beta*abs(y)
If trans = 'BLAS_CONJ_TRANS', then y := alpha*abs(A^T)*abs(x) +
beta*abs(y)
The parameter is unchanged on exit.

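m INTEGER. Specifies the number of rows of the matrix A.
The value of m must be at least zero. Unchanged on exit.
n INTEGER. Specifies the number of columns of the matrix A.
The value of n must be at least zero. Unchanged on exit.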
kl INTEGER. Specifies the number of sub-diagonals within the band of A.
kl ≥ 0.
ku INTEGER. Specifies the number of super-diagonals within the band of A.
ku ≥ 0.
alpha REAL for sla_gbamv and cla_gbamv

DOUBLE PRECISION for dla_gbamv and zla_gbamv
Specifies the scalar alpha. Unchanged on exit.
ab REAL for sla_gbamv

DOUBLE PRECISION for dla_gbamv
COMPLEX for cla_gbamv
DOUBLE COMPLEX for zla_gbamv
Array, DIMENSION(ldab, *).
Before entry, the leading m-by-n part of the array ab must contain the
matrix of coefficients. The second dimension of ab must be at least
max(1,n). Unchanged on exit.
ldab INTEGER. Specifies the leading dimension of ab as declared in the calling

(sub)program. The value of ldab must be at least max(1, m). Unchanged
on exit.
x REAL for sla_gbamv

DOUBLE PRECISION for dla_gbamv
COMPLEX for cla_gbamv
DOUBLE COMPLEX for zla_gbamv
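Array, DIMENSION at least
(1 + (n - 1)*abs(incx)) when trans = 'N' or 'n'
and at least
(1 + (m - 1)*abs(incx)) otherwise.
Before entry, the incremented array x must contain the vector x.
incx INTEGER. Specifies the increment for the elements of x. The value of incx
must not be zero.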
beta REAL for sla_gbamv and cla_gbamv

Specifies the scalar beta. When beta is zero, you do not need to set y on
input.
y REAL for sla_gbamv and cla_gbamv

Array, DIMENSION at least
(1 +(m - 1)*abs(incy)) when trans = 'N' or 'n'

and at least
(1 +(n - 1)*abs(incy)) otherwise.
Before entry with beta non-zero, the incremented array y must contain the
vector y.

The value of incy must not be zero. Unchanged on exit.
Output Parameters
y Updated vector y.
?la_gbrcond
Estimates the Skeel condition number for a general
banded matrix.
Syntax
call sla_gbrcond( trans, n, kl, ku, ab, ldab, afb, ldafb, ipiv, cmode, c, info, work,
iwork )
call dla_gbrcond( trans, n, kl, ku, ab, ldab, afb, ldafb, ipiv, cmode, c, info, work,
iwork )
Include Files
mkl.fi
Description
The function estimates the Skeel condition number of
op(A) * op2(C)
where
the cmode parameter determines op2 as follows:
cmode Value op2(C)
1 C
0 I
-1 inv(C)
The Skeel condition number

cond(A) = norminf(|inv(A)||A|)
is computed by computing scaling factors R such that
diag(R)*A*op2(C)
is row equilibrated and by computing the standard infinity-norm condition number.
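The definition cond(A) = norminf(|inv(A)||A|) can be evaluated directly for a tiny matrix. The following is a minimal sketch for a 2-by-2 matrix with op2(C) = I, forming the inverse explicitly, which a production routine would avoid; the names are illustrative and the code does not call ?la_gbrcond.

```fortran
program skeel_cond_sketch
  implicit none
  double precision :: a(2, 2), ainv(2, 2), p(2, 2), det, skeel

  ! A = [ 2 1 ; 1 3 ], stored column by column
  a = reshape([2.0d0, 1.0d0, 1.0d0, 3.0d0], [2, 2])

  ! Explicit 2-by-2 inverse, used here only because the matrix is tiny
  det = a(1, 1)*a(2, 2) - a(1, 2)*a(2, 1)
  ainv(1, 1) =  a(2, 2)/det
  ainv(1, 2) = -a(1, 2)/det
  ainv(2, 1) = -a(2, 1)/det
  ainv(2, 2) =  a(1, 1)/det

  p = matmul(abs(ainv), abs(a))      ! |inv(A)| * |A|
  skeel = maxval(sum(p, dim=2))      ! infinity norm = largest row sum

  if (abs(skeel - 2.6d0) > 1.0d-12) error stop 'unexpected value'
  print *, 'Skeel condition number:', skeel
end program skeel_cond_sketch
```

Unlike the standard condition number, this quantity is unchanged when the rows of A are rescaled.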
Input Parameters

n INTEGER. The number of linear equations, that is, the order of the matrix A;
n ≥ 0.
ab, afb, c, work REAL for sla_gbrcond

DOUBLE PRECISION for dla_gbrcond
Arrays:
ab(ldab,*) contains the original band matrix A stored in rows from 1 to kl + ku
+ 1. The j-th column of A is stored in the j-th column of the array ab as follows:
ab(ku+1+i-j,j) = A(i,j)
for
max(1,j-ku) ≤ i ≤ min(n,j+kl)
afb(ldafb,*) contains details of the LU factorization of the band matrix A, as
returned by ?gbtrf. U is stored as an upper triangular band matrix with kl+ku
superdiagonals in rows 1 to kl+ku+1, and the multipliers used during the
factorization are stored in rows kl+ku+2 to 2*kl+ku+1.
c, DIMENSION n. The vector C in the formula op(A) * op2(C).

work is a workspace array of DIMENSION (5*n).
The second dimension of ab and afb must be at least max(1, n).
ldab INTEGER. The leading dimension of the array ab. ldab ≥ kl+ku+1.
ldafb INTEGER. The leading dimension of afb. ldafb ≥ 2*kl+ku+1.
ipiv INTEGER.
Array with DIMENSION n. The pivot indices from the factorization A = P*L*U as
computed by ?gbtrf. Row i of the matrix was interchanged with row ipiv(i).
cmode INTEGER. Determines op2(C) in the formula op(A) * op2(C) as follows:

If cmode = 1, op2(C) = C.
If cmode = 0, op2(C) = I.
If cmode = -1, op2(C) = inv(C).
iwork INTEGER. Workspace array with DIMENSION n.
Output Parameters
info INTEGER.
If i > 0, the i-th parameter is invalid.
See Also
?gbtrf
?la_gbrcond_c
Computes the infinity norm condition number of
op(A)*inv(diag(c)) for general banded matrices.
Syntax
call cla_gbrcond_c( trans, n, kl, ku, ab, ldab, afb, ldafb, ipiv, c, capply, info,
work, rwork )
call zla_gbrcond_c( trans, n, kl, ku, ab, ldab, afb, ldafb, ipiv, c, capply, info,
work, rwork )
Include Files
mkl.fi
Description
The function computes the infinity norm condition number of
op(A) * inv(diag(c))
where the c is a REAL vector for cla_gbrcond_c and a DOUBLE PRECISION vector for zla_gbrcond_c.
Input Parameters

If trans = 'N', the system has the form A*X = B (No transpose)
If trans = 'T', the system has the form AT*X = B (Transpose)
If trans = 'C', the system has the form AH*X = B (Conjugate Transpose =
Transpose)
0.
ab, afb, work COMPLEX for cla_gbrcond_c

DOUBLE COMPLEX for zla_gbrcond_c
Arrays:
for

ipiv INTEGER.
c, rwork REAL for cla_gbrcond_c

DOUBLE PRECISION for zla_gbrcond_c
Array c with DIMENSION n. The vector c in the formula
op(A) * inv(diag(c)).
Array rwork with DIMENSION n is a workspace.
capply LOGICAL. If .TRUE., then the function uses the vector c from the formula
Output Parameters
info INTEGER.
See Also
?gbtrf
?la_gbrcond_x
op(A)*diag(x) for general banded matrices.
Syntax
call cla_gbrcond_x( trans, n, kl, ku, ab, ldab, afb, ldafb, ipiv, x, info, work,
rwork )
call zla_gbrcond_x( trans, n, kl, ku, ab, ldab, afb, ldafb, ipiv, x, info, work,
rwork )
Include Files
mkl.fi
Description
op(A) * diag(x)
where the x is a COMPLEX vector for cla_gbrcond_x and a DOUBLE COMPLEX vector for zla_gbrcond_x.
Input Parameters

Transpose)
0.
ab, afb, x, work COMPLEX for cla_gbrcond_x
DOUBLE COMPLEX for zla_gbrcond_x
Arrays:
for
x, DIMENSION n. The vector x in the formula op(A) * diag(x).

ipiv INTEGER.
rwork REAL for cla_gbrcond_x

DOUBLE PRECISION for zla_gbrcond_x
Output Parameters
info INTEGER.
See Also
?gbtrf
?la_gbrfsx_extended
Improves the computed solution to a system of linear
equations for general banded matrices by performing
extra-precise iterative refinement and provides error
bounds and backward error estimates for the solution.
Syntax
call sla_gbrfsx_extended( prec_type, trans_type, n, kl, ku, nrhs, ab, ldab, afb,
ldafb, ipiv, colequ, c, b, ldb, y, ldy, berr_out, n_norms, err_bnds_norm,
err_bnds_comp, res, ayb, dy, y_tail, rcond, ithresh, rthresh, dz_ub, ignore_cwise,
info )
call dla_gbrfsx_extended( prec_type, trans_type, n, kl, ku, nrhs, ab, ldab, afb,
info )
call cla_gbrfsx_extended( prec_type, trans_type, n, kl, ku, nrhs, ab, ldab, afb,
info )
call zla_gbrfsx_extended( prec_type, trans_type, n, kl, ku, nrhs, ab, ldab, afb,
info )
Include Files
mkl.fi
Description
The ?la_gbrfsx_extended subroutine improves the computed solution to a system of linear equations by
performing extra-precise iterative refinement and provides error bounds and backward error estimates for
the solution. The ?gbrfsx routine calls ?la_gbrfsx_extended to perform iterative refinement.
In addition to normwise error bound, the code provides maximum componentwise error bound, if possible.
See comments for err_bnds_norm and err_bnds_comp for details of the error bounds.
Use ?la_gbrfsx_extended to set only the second fields of err_bnds_norm and err_bnds_comp.
Input Parameters
prec_type INTEGER.
Specifies the intermediate precision to be used in refinement. The value is
defined by ilaprec(p), where p is a CHARACTER and:
If p = 'S': Single.
If p = 'D': Double.
If p = 'I': Indigenous.
If p = 'X', 'E': Extra.
trans_type INTEGER.
Specifies the transposition operation on A. The value is defined by
ilatrans(t), where t is a CHARACTER and:
If t = 'N': No transpose.
If t = 'T': Transpose.
If t = 'C': Conjugate Transpose.
matrix B.
ab, afb, b, y REAL for sla_gbrfsx_extended

DOUBLE PRECISION for dla_gbrfsx_extended
COMPLEX for cla_gbrfsx_extended
DOUBLE COMPLEX for zla_gbrfsx_extended.
Arrays: ab(ldab,*), afb(ldafb,*), b(ldb,*), y(ldy,*).
The array ab contains the original n-by-n matrix A. The second dimension
of ab must be at least max(1,n).
The array afb contains the factors L and U from the factorization A =
P*L*U as computed by ?gbtrf. The second dimension of afb must be at
least max(1,n).
max(1,nrhs).
The array y on entry contains the solution matrix X as computed by ?
gbtrs. The second dimension of y must be at least max(1,nrhs).
ldab INTEGER. The leading dimension of the array ab; ldab ≥ max(1,n).
ldafb INTEGER. The leading dimension of the array afb; ldafb ≥ max(1,n).
ipiv INTEGER.
Array, DIMENSION at least max(1, n). Contains the pivot indices from the
factorization A = P*L*U as computed by ?gbtrf; row i of the matrix was
interchanged with row ipiv(i).
colequ LOGICAL. If colequ = .TRUE., column equilibration was done to A before

calling this routine. This is needed to compute the solution and error
bounds correctly.
c REAL for single precision flavors

c contains the column scale factors for A. If colequ = .FALSE., c is not
accessed.
If c is input, each element of c should be a power of the radix to ensure a
reliable solution and error estimates. Scaling by power of the radix does not
cause rounding errors unless the result underflows or overflows. Rounding
errors during scaling lead to refining with a matrix that is not equivalent to
the input matrix, producing error estimates that may not be reliable.
ldy INTEGER. The leading dimension of the array y; ldy ≥ max(1, n).
n_norms INTEGER. Determines which error bounds to return. See err_bnds_norm

and err_bnds_comp descriptions in Output Arguments section below.
If n_norms ≥ 1, returns normwise error bounds.
If n_norms ≥ 2, returns componentwise error bounds.

Array, DIMENSION(nrhs,n_err_bnds). For each right-hand side, contains

side.
fields:



numbers are 1/(norm(1/
z,inf)*norm(z,inf)) for some appropriately
scaled matrix Z.
approximately 1.
Use this subroutine to set only the second field
above.

follows:

each right-hand side. If componentwise accuracy is not requested

side.
The second index in err_bnds_comp(:,err) contains the following three
fields:




scaled matrix Z.
above.
res, dy, y_tail REAL for sla_gbrfsx_extended

Workspace arrays of DIMENSION n.
res holds the intermediate residual.

dy holds the intermediate solution.
y_tail holds the trailing bits of the intermediate solution.
ayb REAL for single precision flavors

Workspace array, DIMENSION n.

ithresh INTEGER. The maximum number of residual computations allowed for

refinement. The default is 10. For 'aggressive', set to 100 to permit
convergence using approximate factorizations or factorizations other than
LU. If the factorization uses a technique other than Gaussian elimination,
the guarantees in err_bnds_norm and err_bnds_comp may no longer be
trustworthy.
rthresh REAL for single precision flavors

Determines when to stop refinement if the error estimate stops decreasing.
Refinement stops when the next solution no longer satisfies
norm(dx_{i+1}) < rthresh * norm(dx_i)
where norm(Z) is the infinity norm of Z.
rthresh satisfies
0 < rthresh ≤ 1.
The default value is 0.5. For 'aggressive' set to 0.9 to permit convergence
on extremely ill-conditioned matrices.
dz_ub REAL for single precision flavors

Determines when to start considering componentwise convergence.
Componentwise convergence is only considered after each
component of the solution y is stable, that is, the relative change in each
component is less than dz_ub. The default value is 0.25, requiring the first
bit to be stable.
ignore_cwise LOGICAL
If .TRUE., the function ignores componentwise convergence. Default value
is .FALSE.
Output Parameters
y REAL for sla_gbrfsx_extended

The improved solution matrix Y.
berr_out REAL for single precision flavors

Array, DIMENSION at least max(1, nrhs). Contains the componentwise
relative backward error for right-hand-side j from the formula
max(i) ( abs(res(i)) / ( abs(op(A))*abs(y) + abs(B) )(i) )
where abs(Z) is the componentwise absolute value of the matrix or vector
Z. This is computed by ?la_lin_berr.
err_bnds_norm, err_bnds_comp
Values of the corresponding input parameters improved after iterative
refinement and stored in the second column of the array ( 1:nrhs, 2 ).
The other elements are kept unchanged.

See Also
?gbrfsx
?gbtrf
?gbtrs
?lamch
ilaprec
ilatrans
?la_lin_berr
?la_gbrpvgrw
Computes the reciprocal pivot growth factor norm(A)/
norm(U) for a general band matrix.
Syntax
call sla_gbrpvgrw( n, kl, ku, ncols, ab, ldab, afb, ldafb )
call dla_gbrpvgrw( n, kl, ku, ncols, ab, ldab, afb, ldafb )
call cla_gbrpvgrw( n, kl, ku, ncols, ab, ldab, afb, ldafb )
call zla_gbrpvgrw( n, kl, ku, ncols, ab, ldab, afb, ldafb )
Include Files
mkl.fi
Description
The ?la_gbrpvgrw routine computes the reciprocal pivot growth factor norm(A)/norm(U). The max
absolute element norm is used. If this is much less than 1, the stability of the LU factorization of the
equilibrated matrix A could be poor. This also means that the solution X, estimated condition numbers, and
error bounds could be unreliable.
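The ratio norm(A)/norm(U) in the max absolute element norm can be worked through on a 2-by-2 example. The following is a minimal sketch using dense storage and a hand-computed LU factorization; the library routine works on band storage and may form the ratio differently (for example, column by column), so the names and the global-maximum shortcut here are illustrative assumptions.

```fortran
program rpvgrw_sketch
  implicit none
  double precision :: a(2, 2), l(2, 2), u(2, 2), rpvgrw

  ! A = [ 1 1 ; -1 1 ] factors without row interchanges as A = L*U with
  ! L = [ 1 0 ; -1 1 ] and U = [ 1 1 ; 0 2 ] (arrays are filled column by column)
  a = reshape([1.0d0, -1.0d0, 1.0d0, 1.0d0], [2, 2])
  l = reshape([1.0d0, -1.0d0, 0.0d0, 1.0d0], [2, 2])
  u = reshape([1.0d0,  0.0d0, 1.0d0, 2.0d0], [2, 2])

  ! Confirm that the hand-computed factors reproduce A
  if (maxval(abs(matmul(l, u) - a)) > 1.0d-14) error stop 'L*U does not equal A'

  ! Largest absolute entry of A divided by largest absolute entry of U
  rpvgrw = maxval(abs(a)) / maxval(abs(u))

  if (abs(rpvgrw - 0.5d0) > 1.0d-14) error stop 'unexpected growth factor'
  print *, 'reciprocal pivot growth factor:', rpvgrw
end program rpvgrw_sketch
```

Here the entries of U are at most twice those of A, so the factor is 0.5; values far below 1 are the ones that signal trouble.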
Input Parameters

n ≥ 0.
ncols INTEGER. The number of columns of the matrix A; ncols ≥ 0.
ab, afb REAL for sla_gbrpvgrw

DOUBLE PRECISION for dla_gbrpvgrw
COMPLEX for cla_gbrpvgrw
DOUBLE COMPLEX for zla_gbrpvgrw.
Arrays: ab(ldab,*), afb(ldafb,*).
ab contains the original band matrix A (see Matrix Storage Schemes)

stored in rows from 1 to kl + ku + 1. The j-th column of A is stored in
the j-th column of the array ab as follows:
ab(ku+1+i-j,j) = A(i,j)
for
max(1,j-ku) ≤ i ≤ min(n,j+kl)
afb contains details of the LU factorization of the band matrix A, as
returned by ?gbtrf. U is stored as an upper triangular band matrix
with kl+ku superdiagonals in rows 1 to kl+ku+1, and the multipliers
used during the factorization are stored in rows kl+ku+2 to 2*kl+ku+1.
ldafb INTEGER. The leading dimension of afb; ldafb ≥ 2*kl+ku+1.
See Also
?gbtrf
?la_geamv
Syntax
call sla_geamv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call dla_geamv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call cla_geamv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
call zla_geamv(trans, m, n, alpha, a, lda, x, incx, beta, y, incy)
Include Files
mkl.fi
Description
The ?la_geamv routines perform a matrix-vector operation defined as
y := alpha*abs(A)*abs(x) + beta*abs(y),
or
y := alpha*abs(A^T)*abs(x) + beta*abs(y),
where:
Input Parameters

if trans = BLAS_NO_TRANS, then y := alpha*abs(A)*abs(x) + beta*abs(y)
if trans = BLAS_TRANS, then y := alpha*abs(A^T)*abs(x) + beta*abs(y)
if trans = BLAS_CONJ_TRANS, then y := alpha*abs(A^T)*abs(x) + beta*abs(y).


alpha REAL for sla_geamv and for cla_geamv

DOUBLE PRECISION for dla_geamv and zla_geamv
a REAL for sla_geamv

DOUBLE PRECISION for dla_geamv
COMPLEX for cla_geamv
DOUBLE COMPLEX for zla_geamv
Array, DIMENSION(lda, *). Before entry, the leading m-by-n part of the
array a must contain the matrix of coefficients. The second dimension of a

x REAL for sla_geamv

DOUBLE PRECISION for dla_geamv
COMPLEX for cla_geamv
DOUBLE COMPLEX for zla_geamv
Array, DIMENSION at least (1+(n-1)*abs(incx)) when trans = 'N' or
'n' and at least (1+(m - 1)*abs(incx)) otherwise. Before entry, the
incremented array x must contain the vector X.

The value of incx must be non-zero.
beta REAL for sla_geamv and for cla_geamv

input.
y REAL for sla_geamv and for cla_geamv

Array, DIMENSION at least (1 +(m - 1)*abs(incy)) when trans = 'N'
or 'n' and at least (1 +(n - 1)*abs(incy)) otherwise. Before entry with
non-zero beta, the incremented array y must contain the vector Y.
The value of incy must be non-zero.
Output Parameters
y Updated vector Y.
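The operation y := alpha*abs(op(A))*abs(x) + beta*abs(y) can be sketched in plain Python as follows. This is only an illustration of the formula: the name `abs_gemv` and the boolean `trans` flag are assumptions for the example, strides (incx, incy) are taken as 1, and any additional safeguards applied by the library routine are not modeled.

```python
def abs_gemv(trans, alpha, a, x, beta, y):
    """Return alpha*|op(A)|*|x| + beta*|y| for unit strides (illustrative only)."""
    m, n = len(a), len(a[0])
    if trans:
        rows, cols = n, m
        entry = lambda i, j: a[j][i]  # op(A) = A^T
    else:
        rows, cols = m, n
        entry = lambda i, j: a[i][j]  # op(A) = A
    return [
        alpha * sum(abs(entry(i, j)) * abs(x[j]) for j in range(cols)) + beta * abs(y[i])
        for i in range(rows)
    ]
```

Because every term enters through its absolute value, the result is non-negative whenever alpha and beta are non-negative, which is what makes the product usable in error-bound computations.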
?la_gercond
Estimates the Skeel condition number for a general
matrix.
Syntax
call sla_gercond( trans, n, a, lda, af, ldaf, ipiv, cmode, c, info, work, iwork )
call dla_gercond( trans, n, a, lda, af, ldaf, ipiv, cmode, c, info, work, iwork )
Include Files
mkl.fi
Description
The function estimates the Skeel condition number of
op(A) * op2(C)
where
cmode Value    op2(C)
1              C
0              I
-1             inv(C)

cond(A) = norminf(|inv(A)||A|)
diag(R)*A*op2(C)
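To make the quantity norminf(|inv(A)||A|) concrete, the sketch below evaluates it directly for a nonsingular 2-by-2 matrix. This is not how the library routine works (it produces an estimate from the factors returned by ?getrf); the function name `skeel_cond_2x2`, the explicit inverse, and the choice cmode = 0 (op2(C) = I) are assumptions made for illustration.

```python
def skeel_cond_2x2(a):
    """Infinity norm of |inv(A)|*|A| for a nonsingular 2-by-2 matrix (illustrative)."""
    (p, q), (r, s) = a
    det = p * s - q * r
    inv = [[s / det, -q / det], [-r / det, p / det]]
    prod = [
        [sum(abs(inv[i][k]) * abs(a[k][j]) for k in range(2)) for j in range(2)]
        for i in range(2)
    ]
    # The infinity norm is the largest row sum of the non-negative product.
    return max(sum(row) for row in prod)
```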
Input Parameters

Transpose).
0.
a, af, c, work REAL for sla_gercond

DOUBLE PRECISION for dla_gercond
Arrays:
a(lda,*) contains the original general n-by-n matrix A.
af(ldaf,*) contains factors L and U from the factorization of the general matrix
A=P*L*U, as returned by ?getrf.

The second dimension of a and af must be at least max(1, n).
ldaf INTEGER. The leading dimension of af. ldaf ≥ max(1, n).
ipiv INTEGER.
computed by ?getrf. Row i of the matrix was interchanged with row ipiv(i).

Output Parameters
info INTEGER.
See Also
?getrf
?la_gercond_c
Computes the infinity norm condition number of
op(A)*inv(diag(c)) for general matrices.
Syntax
call cla_gercond_c( trans, n, a, lda, af, ldaf, ipiv, c, capply, info, work, rwork )
call zla_gercond_c( trans, n, a, lda, af, ldaf, ipiv, c, capply, info, work, rwork )
Include Files
mkl.fi
Description
op(A) * inv(diag(c))
where c is a REAL vector for cla_gercond_c and a DOUBLE PRECISION vector for zla_gercond_c.
Input Parameters

Transpose)
0.
a, af, work COMPLEX for cla_gercond_c

DOUBLE COMPLEX for zla_gercond_c
Arrays:
af(ldaf,*) contains the factors L and U from the factorization A=P*L*U as
returned by ?getrf.

ldaf INTEGER. The leading dimension of af. ldaf ≥ max(1,n).
ipiv INTEGER.
c, rwork REAL for cla_gercond_c

DOUBLE PRECISION for zla_gercond_c
capply LOGICAL. If capply=.TRUE., then the function uses the vector c from the
formula
Output Parameters
info INTEGER.
See Also
?getrf
?la_gercond_x
Computes the infinity norm condition number of
op(A)*diag(x) for general matrices.
Syntax
call cla_gercond_x( trans, n, a, lda, af, ldaf, ipiv, x, info, work, rwork )
call zla_gercond_x( trans, n, a, lda, af, ldaf, ipiv, x, info, work, rwork )
Include Files
mkl.fi
Description
op(A) * diag(x)
where the x is a COMPLEX vector for cla_gercond_x and a DOUBLE COMPLEX vector for zla_gercond_x.
Input Parameters

Transpose)
0.
a, af, x, work COMPLEX for cla_gercond_x

DOUBLE COMPLEX for zla_gercond_x
Arrays:
af(ldaf,*) contains the factors L and U from the factorization A=P*L*U as
returned by ?getrf.
x, DIMENSION n. The vector x in the formula op(A) * diag(x).
ipiv INTEGER.
rwork REAL for cla_gercond_x

DOUBLE PRECISION for zla_gercond_x
Output Parameters
info INTEGER.
See Also
?getrf
?la_gerfsx_extended
Improves the computed solution to a system of linear
equations for general matrices by performing extra-precise
iterative refinement and provides error bounds
and backward error estimates for the solution.
Syntax
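call sla_gerfsx_extended( prec_type, trans_type, n, nrhs, a, lda, af, ldaf, ipiv,
colequ, c, b, ldb, y, ldy, berr_out, n_norms, errs_n, errs_c, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call dla_gerfsx_extended( prec_type, trans_type, n, nrhs, a, lda, af, ldaf, ipiv,
colequ, c, b, ldb, y, ldy, berr_out, n_norms, errs_n, errs_c, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call cla_gerfsx_extended( prec_type, trans_type, n, nrhs, a, lda, af, ldaf, ipiv,
colequ, c, b, ldb, y, ldy, berr_out, n_norms, errs_n, errs_c, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call zla_gerfsx_extended( prec_type, trans_type, n, nrhs, a, lda, af, ldaf, ipiv,
colequ, c, b, ldb, y, ldy, berr_out, n_norms, errs_n, errs_c, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )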
Include Files
mkl.fi
Description
The ?la_gerfsx_extended subroutine improves the computed solution to a system of linear equations for
general matrices by performing extra-precise iterative refinement and provides error bounds and backward
error estimates for the solution. The ?gerfsx routine calls ?la_gerfsx_extended to perform iterative
refinement.
See comments for errs_n and errs_c for details of the error bounds.
Use ?la_gerfsx_extended to set only the second fields of errs_n and errs_c.
Input Parameters
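prec_type INTEGER.
Specifies the intermediate precision to be used in refinement. The value is
defined by ilaprec(p), where p is a CHARACTER and:
If p = 'S': Single.
If p = 'D': Double.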
trans_type INTEGER.
Specifies the transposition operation on A. The value is defined by
ilatrans(t), where t is a CHARACTER and:
If t = 'N': No transpose.
If t = 'T': Transpose.
If t = 'C': Conjugate Transpose.
matrix B.
a, af, b, y REAL for sla_gerfsx_extended

DOUBLE PRECISION for dla_gerfsx_extended
COMPLEX for cla_gerfsx_extended
DOUBLE COMPLEX for zla_gerfsx_extended.
Arrays: a(lda,*), af(ldaf,*), b(ldb,*), y(ldy,*).
The array a contains the original n-by-n matrix A. The second
dimension of a must be at least max(1,n).
The array af contains the factors L and U from the factorization A =
P*L*U as computed by ?getrf. The second dimension of af must be at
least max(1,n).
max(1,nrhs).
getrs. The second dimension of y must be at least max(1,nrhs).
ipiv INTEGER.
Array, DIMENSION at least max(1, n). Contains the pivot indices from the
factorization A = P*L*U as computed by ?getrf; row i of the matrix was
interchanged with row ipiv(i).
bounds correctly.
c REAL for single precision flavors (sla_gerfsx_extended,

cla_gerfsx_extended)
DOUBLE PRECISION for double precision flavors (dla_gerfsx_extended,
zla_gerfsx_extended).
used.
n_norms INTEGER. Determines which error bounds to return. See errs_n and
errs_c descriptions in Output Arguments section below.
errs_n REAL for single precision flavors


The first index in errs_n(i,:) corresponds to the i-th right-hand side.
The second index in errs_n(:,err) contains the following three fields:



scaled matrix Z.
approximately 1.
above.
errs_c REAL for single precision flavors

follows:

(params(3) = 0.0), then errs_c is not accessed. If n_err_bnds < 3, then
at most the first (:,n_err_bnds) entries are returned.
The first index in errs_c(i,:) corresponds to the i-th right-hand side.
The second index in errs_c(:,err) contains the following three fields:



scaled matrix Z.
above.
res, dy, y_tail REAL for sla_gerfsx_extended


ayb REAL for single precision flavors



the guarantees in errs_n and errs_c may no longer be trustworthy.
rthresh REAL for single precision flavors

rthresh satisfies
0 < rthresh ≤ 1.
dz_ub REAL for single precision flavors

bit to be stable.
is .FALSE.
Output Parameters
y REAL for sla_gerfsx_extended

berr_out REAL for single precision flavors

Array, DIMENSION at least max(1, nrhs). Contains the componentwise
relative backward error for right-hand-side j from the formula
errs_n, errs_c Values of the corresponding input parameters improved after iterative
refinement and stored in the second column of the array ( 1:nrhs, 2 ).

See Also
?gerfsx
?getrf
?getrs
?lamch
ilaprec
ilatrans
?la_lin_berr
?la_heamv
Computes a matrix-vector product using a Hermitian
indefinite matrix to calculate error bounds.
Syntax
call cla_heamv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call zla_heamv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
Include Files
mkl.fi
Description
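The ?la_heamv routines perform a matrix-vector operation defined as
y := alpha*abs(A)*abs(x) + beta*abs(y),
where alpha and beta are scalars, x and y are vectors, and A is an n-by-n Hermitian matrix.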
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the array A is to be
referenced:
If uplo = 'BLAS_UPPER', only the upper triangular part of A is to be
referenced,
If uplo = 'BLAS_LOWER', only the lower triangular part of A is to be
referenced.
n INTEGER. Specifies the number of rows and columns of the matrix A. The
value of n must be at least zero.
alpha REAL for cla_heamv

DOUBLE PRECISION for zla_heamv
a COMPLEX for cla_heamv

DOUBLE COMPLEX for zla_heamv

x COMPLEX for cla_heamv

DOUBLE COMPLEX for zla_heamv
Array, DIMENSION at least (1+(n-1)*abs(incx)). Before entry, the

beta REAL for cla_heamv

input.
y REAL for cla_heamv

Array, DIMENSION at least (1+(n-1)*abs(incy)). Before
entry with non-zero beta, the incremented array y must contain the vector
Y.

Output Parameters
y Updated vector Y.
?la_hercond_c
Computes the infinity norm condition number of
op(A)*inv(diag(c)) for Hermitian indefinite matrices.
Syntax
call cla_hercond_c( uplo, n, a, lda, af, ldaf, ipiv, c, capply, info, work, rwork )
call zla_hercond_c( uplo, n, a, lda, af, ldaf, ipiv, c, capply, info, work, rwork )
Include Files
mkl.fi
Description
op(A) * inv(diag(c))
where c is a REAL vector for cla_hercond_c and a DOUBLE PRECISION vector for zla_hercond_c.
Input Parameters

Specifies the triangle of A to store:
If uplo = 'U', the upper triangle of A is stored,
0.
a COMPLEX for cla_hercond_c

DOUBLE COMPLEX for zla_hercond_c
Array, DIMENSION(lda, *). On entry, the n-by-n matrix A. The second
af COMPLEX for cla_hercond_c

Array, DIMENSION(ldaf, *). The block diagonal matrix D and the multipliers
used to obtain the factor U or L as computed by ?hetrf. The second dimension
of af must be at least max(1,n).
ldaf INTEGER. The leading dimension of the array af. ldaf ≥ max(1,n).
ipiv INTEGER.
Array with DIMENSION n. Details of the interchanges and the block structure of D
as determined by ?hetrf.
c REAL for cla_hercond_c

DOUBLE PRECISION for zla_hercond_c
work COMPLEX for cla_hercond_c

Array DIMENSION 2*n. Workspace.
rwork REAL for cla_hercond_c

Array DIMENSION n. Workspace.
Output Parameters
info INTEGER.
See Also
?hetrf
?la_hercond_x
Computes the infinity norm condition number of
op(A)*diag(x) for Hermitian indefinite matrices.
Syntax
call cla_hercond_x( uplo, n, a, lda, af, ldaf, ipiv, x, info, work, rwork )
call zla_hercond_x( uplo, n, a, lda, af, ldaf, ipiv, x, info, work, rwork )
Include Files
mkl.fi
Description
op(A) * diag(x)
where the x is a COMPLEX vector for cla_hercond_x and a DOUBLE COMPLEX vector for zla_hercond_x.
Input Parameters

0.
a COMPLEX for cla_hercond_x

af COMPLEX for cla_hercond_x

used to obtain the factor U or L as computed by ?hetrf. The second dimension
ipiv INTEGER.
x COMPLEX for cla_hercond_x

Array x with DIMENSION n. The vector x in the formula
op(A) * diag(x).
work COMPLEX for cla_hercond_x

rwork REAL for cla_hercond_x

Output Parameters
info INTEGER.
See Also
?hetrf
?la_herfsx_extended
Improves the computed solution to a system of linear
equations for Hermitian indefinite matrices by
performing extra-precise iterative refinement and
provides error bounds and backward error estimates
for the solution.
Syntax
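call cla_herfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, ipiv, colequ, c,
b, ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call zla_herfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, ipiv, colequ, c,
b, ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )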
Include Files
mkl.fi
Description
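The ?la_herfsx_extended subroutine improves the computed solution to a system of linear equations by
performing extra-precise iterative refinement and provides error bounds and backward error estimates for
the solution. The ?herfsx routine calls ?la_herfsx_extended to perform iterative refinement.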
Use ?la_herfsx_extended to set only the second fields of err_bnds_norm and err_bnds_comp.
Input Parameters
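prec_type INTEGER.
Specifies the intermediate precision to be used in refinement. The value is
defined by ilaprec(p), where p is a CHARACTER and:
If p = 'S': Single.
If p = 'D': Double.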

matrix B.
a, af, b, y COMPLEX for cla_herfsx_extended

DOUBLE COMPLEX for zla_herfsx_extended.
The array a contains the original n-by-n matrix A. The second dimension of
a must be at least max(1,n).
to obtain the factor U or L as computed by ?hetrf. The second dimension
The array b contains the right-hand-side of the matrix B. The second


hetrs. The second dimension of y must be at least max(1,nrhs).
ipiv INTEGER.
Array, DIMENSION n. Details of the interchanges and the block structure of D

bounds correctly.
c REAL for cla_herfsx_extended

DOUBLE PRECISION for zla_herfsx_extended.
used.

err_bnds_norm REAL for cla_herfsx_extended

corresponding to the normwise relative error.
Normwise relative error in the i-th solution vector is defined as follows:

side.
fields:

threshold sqrt(n)*slamch() for
cla_herfsx_extended and
sqrt(n)*dlamch() for
zla_herfsx_extended.

for cla_herfsx_extended and
zla_herfsx_extended. This error bound
should only be trusted if the previous boolean is
true.

sqrt(n)*slamch() for
zla_herfsx_extended to determine if the error
estimate is "guaranteed". These reciprocal
condition numbers are 1/(norm(1/
scaled matrix Z.
approximately 1.
above.
err_bnds_comp REAL for cla_herfsx_extended

follows:


side.
fields:

zla_herfsx_extended.

for cla_herfsx_extended and
zla_herfsx_extended. This error bound
true.

zla_herfsx_extended to determine if the error
scaled matrix Z.
above.
res, dy, y_tail COMPLEX for cla_herfsx_extended


ayb REAL for cla_herfsx_extended

rcond REAL for cla_herfsx_extended


trustworthy.
rthresh REAL for cla_herfsx_extended

rthresh satisfies
0 < rthresh ≤ 1.
dz_ub REAL for cla_herfsx_extended

bit to be stable.
is .FALSE.
Output Parameters
y COMPLEX for cla_herfsx_extended

berr_out REAL for cla_herfsx_extended

Array, DIMENSION nrhs. berr_out(j) contains the componentwise relative
backward error for right-hand-side j from the formula


See Also
?herfsx
?hetrf
?hetrs
?lamch
ilaprec
ilatrans
?la_lin_berr
?la_herpvgrw
Computes the reciprocal pivot growth factor norm(A)/
norm(U) for a Hermitian indefinite matrix.
Syntax
call cla_herpvgrw( uplo, n, info, a, lda, af, ldaf, ipiv, work )
call zla_herpvgrw( uplo, n, info, a, lda, af, ldaf, ipiv, work )
Include Files
mkl.fi
Description
The ?la_herpvgrw routine computes the reciprocal pivot growth factor norm(A)/norm(U). The max
absolute element norm is used.
Input Parameters


n INTEGER. The number of linear equations, the order of the matrix A; n ≥ 0.
info INTEGER. The value of INFO returned from ?hetrf, that is, the pivot
in column info is exactly 0.
a, af COMPLEX for cla_herpvgrw

DOUBLE COMPLEX for zla_herpvgrw.
Arrays: a(lda,*), af(ldaf,*).
a contains the n-by-n matrix A. The second dimension of a must be at

least max(1,n).
af contains the block diagonal matrix D and the multipliers used to

obtain the factor U or L as computed by ?hetrf. The second
dimension of af must be at least max(1,n).
lda INTEGER. The leading dimension of array a; lda ≥ max(1,n).
ldaf INTEGER. The leading dimension of array af; ldaf ≥ max(1,n).
ipiv INTEGER.
Array, DIMENSION n. Details of the interchanges and the block
structure of D as determined by ?hetrf.
work REAL for cla_herpvgrw
DOUBLE PRECISION for zla_herpvgrw.
Array, DIMENSION 2*n. Workspace.
See Also
?hetrf
?la_lin_berr
Computes component-wise relative backward error.
Syntax
call sla_lin_berr(n, nz, nrhs, res, ayb, berr )
call dla_lin_berr(n, nz, nrhs, res, ayb, berr )
call cla_lin_berr(n, nz, nrhs, res, ayb, berr )
call zla_lin_berr(n, nz, nrhs, res, ayb, berr )
Include Files
mkl.fi
Description
The ?la_lin_berr routine computes a component-wise relative backward error from the formula:
max(i) ( abs(R(i))/( abs(op(A_s))*abs(Y) + abs(B_s) )(i) )

where abs(Z) is the component-wise value of the matrix or vector Z.
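The formula above can be sketched in plain Python. This is an illustration under stated assumptions rather than the library routine: the name `lin_berr` is hypothetical, `sys.float_info.min` stands in for ?lamch('Safe minimum') in double precision, the guard term (nz+1)*safe_minimum is added to the numerator as described for the nz parameter, and skipping rows whose denominator is exactly zero is an assumption borrowed from the reference LAPACK implementation.

```python
import sys

def lin_berr(nz, res, ayb):
    """Componentwise relative backward error per right-hand side (illustrative)."""
    # Assumption: sys.float_info.min plays the role of ?lamch('Safe minimum').
    safe1 = (nz + 1) * sys.float_info.min
    n, nrhs = len(res), len(res[0])
    berr = [0.0] * nrhs
    for j in range(nrhs):
        for i in range(n):
            if ayb[i][j] != 0.0:  # rows with a zero denominator are skipped
                berr[j] = max(berr[j], (safe1 + abs(res[i][j])) / ayb[i][j])
    return berr
```

Here res and ayb are n-by-nrhs lists of rows, and the result has one entry per right-hand side.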
Input Parameters
n INTEGER. The number of linear equations, the order of the matrix A; n ≥ 0.
nz INTEGER. The parameter for guarding against spuriously zero residuals.

(nz+1)*slamch( 'Safe minimum' ) is added to R(i) in the numerator
of the relative backward error formula. The default value is n.
nrhs INTEGER. Number of right-hand sides, the number of columns in the
matrices AYB, RES, and BERR; nrhs ≥ 0.
res, ayb REAL for sla_lin_berr, cla_lin_berr

DOUBLE PRECISION for dla_lin_berr, zla_lin_berr
Arrays, DIMENSION (n,nrhs).
res is the residual matrix, that is, the matrix R in the relative backward
error formula.
ayb is the denominator of that formula, that is, the matrix
abs(op(A_s))*abs(Y) + abs(B_s). The matrices A, Y, and B are from
iterative refinement. See description of ?la_gerfsx_extended.
Output Parameters
berr REAL for sla_lin_berr
DOUBLE PRECISION for dla_lin_berr

COMPLEX for cla_lin_berr
DOUBLE COMPLEX for zla_lin_berr
The component-wise relative backward error.
See Also
?lamch
?la_gerfsx_extended
?la_porcond
Estimates the Skeel condition number for a symmetric
positive-definite matrix.
Syntax
call sla_porcond( uplo, n, a, lda, af, ldaf, cmode, c, info, work, iwork )
call dla_porcond( uplo, n, a, lda, af, ldaf, cmode, c, info, work, iwork )
Include Files
mkl.fi
Description
The function estimates the Skeel condition number of
op(A) * op2(C)
where
cmode Value    op2(C)
1              C
0              I
-1             inv(C)

diag(R)*A*op2(C)
Input Parameters

0.
a, af, c, work REAL for sla_porcond

DOUBLE PRECISION for dla_porcond
Arrays:
a (lda,*) contains the n-by-n matrix A.
af (ldaf,*) contains the triangular factor L or U from the Cholesky factorization
A = U^T*U or A = L*L^T,
as computed by ?potrf.

lda INTEGER. The leading dimension of the array a. lda ≥ max(1,n).

Output Parameters
info INTEGER.
See Also
?potrf
?la_porcond_c
Computes the infinity norm condition number of
op(A)*inv(diag(c)) for Hermitian positive-definite
matrices.
Syntax
call cla_porcond_c( uplo, n, a, lda, af, ldaf, c, capply, info, work, rwork )
call zla_porcond_c( uplo, n, a, lda, af, ldaf, c, capply, info, work, rwork )
Include Files
mkl.fi
Description
op(A) * inv(diag(c))
where c is a REAL vector for cla_porcond_c and a DOUBLE PRECISION vector for zla_porcond_c.
Input Parameters

0.
a COMPLEX for cla_porcond_c

DOUBLE COMPLEX for zla_porcond_c
af COMPLEX for cla_porcond_c

Array, DIMENSION(ldaf, *). The triangular factor L or U from the Cholesky
factorization
A = U^H*U or A = L*L^H,
The second dimension of af must be at least max(1,n).
c REAL for cla_porcond_c

DOUBLE PRECISION for zla_porcond_c
work COMPLEX for cla_porcond_c

rwork REAL for cla_porcond_c

Output Parameters
info INTEGER.
See Also
?potrf
?la_porcond_x
Computes the infinity norm condition number of
op(A)*diag(x) for Hermitian positive-definite matrices.
Syntax
call cla_porcond_x( uplo, n, a, lda, af, ldaf, x, info, work, rwork )
call zla_porcond_x( uplo, n, a, lda, af, ldaf, x, info, work, rwork )
Include Files
mkl.fi
Description
op(A) * diag(x)
where the x is a COMPLEX vector for cla_porcond_x and a DOUBLE COMPLEX vector for zla_porcond_x.
Input Parameters

0.
a COMPLEX for cla_porcond_x

Array, DIMENSION(lda, *). On entry, the n-by-n matrix A.
af COMPLEX for cla_porcond_x

Array, DIMENSION(ldaf, *). The triangular factor L or U from the Cholesky
factorization
A = U^H*U or A = L*L^H,
x COMPLEX for cla_porcond_x

work COMPLEX for cla_porcond_x

rwork REAL for cla_porcond_x

Output Parameters
info INTEGER.
See Also
?potrf
?la_porfsx_extended
Improves the computed solution to a system of linear
equations for symmetric or Hermitian positive-definite
matrices by performing extra-precise iterative
refinement and provides error bounds and backward
error estimates for the solution.
Syntax
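call sla_porfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, colequ, c, b,
ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call dla_porfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, colequ, c, b,
ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call cla_porfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, colequ, c, b,
ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )
call zla_porfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, colequ, c, b,
ldb, y, ldy, berr_out, n_norms, err_bnds_norm, err_bnds_comp, res, ayb, dy, y_tail,
rcond, ithresh, rthresh, dz_ub, ignore_cwise, info )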
Include Files
mkl.fi
Description
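The ?la_porfsx_extended subroutine improves the computed solution to a system of linear equations by
performing extra-precise iterative refinement and provides error bounds and backward error estimates for
the solution. The ?porfsx routine calls ?la_porfsx_extended to perform iterative refinement.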
Use ?la_porfsx_extended to set only the second fields of err_bnds_norm and err_bnds_comp.
Input Parameters
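prec_type INTEGER.
Specifies the intermediate precision to be used in refinement. The value is
defined by ilaprec(p), where p is a CHARACTER and:
If p = 'S': Single.
If p = 'D': Double.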

matrix B.
a, af, b, y REAL for sla_porfsx_extended

DOUBLE PRECISION for dla_porfsx_extended
COMPLEX for cla_porfsx_extended
DOUBLE COMPLEX for zla_porfsx_extended.
factorization as computed by ?potrf:
A = U^T*U or A = L*L^T for real flavors,
A = U^H*U or A = L*L^H for complex flavors.


potrs. The second dimension of y must be at least max(1,nrhs).

bounds correctly.
c REAL for sla_porfsx_extended and cla_porfsx_extended

DOUBLE PRECISION for dla_porfsx_extended and
zla_porfsx_extended.
used.

err_bnds_norm REAL for sla_porfsx_extended and cla_porfsx_extended


side.
fields:

sla_porfsx_extended/
cla_porfsx_extended and
dla_porfsx_extended/

for sla_porfsx_extended/
zla_porfsx_extended. This error bound
true.

zla_porfsx_extended to determine if the error
scaled matrix Z.
approximately 1.
above.
err_bnds_comp REAL for sla_porfsx_extended and cla_porfsx_extended


follows:


side.
fields:


for sla_porfsx_extended/
zla_porfsx_extended. This error bound
true.

zla_porfsx_extended to determine if the error
scaled matrix Z.
above.
res, dy, y_tail REAL for sla_porfsx_extended


ayb REAL for sla_porfsx_extended and cla_porfsx_extended

rcond REAL for sla_porfsx_extended and cla_porfsx_extended


trustworthy.
rthresh REAL for sla_porfsx_extended and cla_porfsx_extended

rthresh satisfies
0 < rthresh ≤ 1.
dz_ub REAL for sla_porfsx_extended and cla_porfsx_extended

bit to be stable.
is .FALSE.
Output Parameters
y REAL for sla_porfsx_extended

berr_out REAL for sla_porfsx_extended and cla_porfsx_extended



See Also
?porfsx
?potrf
?potrs
?lamch
ilaprec
ilatrans
?la_lin_berr
?la_porpvgrw
Computes the reciprocal pivot growth factor norm(A)/
norm(U) for a symmetric or Hermitian positive-definite matrix.
Syntax
call sla_porpvgrw( uplo, ncols, a, lda, af, ldaf, work )
call dla_porpvgrw( uplo, ncols, a, lda, af, ldaf, work )
call cla_porpvgrw( uplo, ncols, a, lda, af, ldaf, work )
call zla_porpvgrw( uplo, ncols, a, lda, af, ldaf, work )
Include Files
mkl.fi
Description
The ?la_porpvgrw routine computes the reciprocal pivot growth factor norm(A)/norm(U). The max
absolute element norm is used.
Input Parameters

a, af REAL for sla_porpvgrw

DOUBLE PRECISION for dla_porpvgrw
COMPLEX for cla_porpvgrw
DOUBLE COMPLEX for zla_porpvgrw.
The array a contains the input n-by-n matrix A. The second dimension
of a must be at least max(1,n).

factorization as computed by ?potrf:
A = U^T*U or A = L*L^T for real flavors,
A = U^H*U or A = L*L^H for complex flavors.

ldaf INTEGER. The leading dimension of af; ldaf ≥ max(1,n).
work REAL for sla_porpvgrw and cla_porpvgrw

DOUBLE PRECISION for dla_porpvgrw and zla_porpvgrw.
Workspace array, dimension 2*n.
See Also
?potrf
?laqhe
Scales a Hermitian matrix.
Syntax
call claqhe( uplo, n, a, lda, s, scond, amax, equed )
call zlaqhe( uplo, n, a, lda, s, scond, amax, equed )
Include Files
mkl.fi
Description
The routine equilibrates a Hermitian matrix A using the scaling factors in the vector s.
Input Parameters
uplo CHARACTER*1.
Specifies whether to store the upper or lower part of the Hermitian matrix
A.

n INTEGER. The order of the matrix A; n ≥ 0.
a COMPLEX for claqhe

DOUBLE COMPLEX for zlaqhe
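Array, DIMENSION (lda,n). On entry, the Hermitian matrix A.
If uplo = 'U', the leading n-by-n upper triangular part of a contains the
upper triangular part of matrix A and the strictly lower triangular part of a is
not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of a contains the
lower triangular part of matrix A and the strictly upper triangular part of a is
not referenced.
lda INTEGER. The leading dimension of the array a; lda ≥ max(n,1).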
s REAL for claqhe

DOUBLE PRECISION for zlaqhe
scond REAL for claqhe

amax REAL for claqhe

Output Parameters
a If equed = 'Y', a contains the equilibrated matrix diag(s)*A*diag(s).
equed CHARACTER*1.
diag(s)*A*diag(s).
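The scaling diag(s)*A*diag(s) amounts to multiplying entry A(i,j) by s(i)*s(j). A minimal plain-Python sketch follows; the name `equilibrate` is hypothetical, the full matrix is scaled for clarity (the routine itself references only the triangle selected by uplo), and the decision whether to scale at all is not modeled.

```python
def equilibrate(a, s):
    """Return diag(s)*A*diag(s) for a square matrix given as a list of rows."""
    n = len(a)
    return [[s[i] * a[i][j] * s[j] for j in range(n)] for i in range(n)]
```

For example, choosing s(i) = 1/sqrt(A(i,i)), an assumption made only for this illustration, brings the diagonal of the scaled matrix to 1.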
Application Notes
The routine uses internal parameters thresh, large, and small. The parameter thresh is a threshold value
used to decide if scaling should be done based on the ratio of the scaling factors. If scond < thresh, scaling
is done.
The large and small parameters are threshold values used to decide if scaling should be done based on the
absolute size of the largest matrix element. If amax > large or amax < small, scaling is done.
?laqhp
Scales a Hermitian matrix stored in packed form.
Syntax
call claqhp( uplo, n, ap, s, scond, amax, equed )
call zlaqhp( uplo, n, ap, s, scond, amax, equed )
Include Files
mkl.fi
Description
The routine equilibrates a Hermitian matrix A using the scaling factors in the vector s.
Input Parameters
uplo CHARACTER*1.
Specifies whether to store the upper or lower part of the Hermitian matrix
A.

n INTEGER. The order of the matrix A; n ≥ 0.
ap COMPLEX for claqhp

DOUBLE COMPLEX for zlaqhp
Array, DIMENSION (n*(n+1)/2). The Hermitian matrix A.
If uplo = 'U', the upper triangular part of the Hermitian matrix A is
stored in the packed array ap as follows:
ap(i+(j-1)*j/2) = A(i,j) for 1 ≤ i ≤ j.
If uplo = 'L', the lower triangular part of Hermitian matrix A is stored
in the packed array ap as follows:
ap(i+(j-1)*(2n-j)/2) = A(i,j) for j ≤ i ≤ n.
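The two packed-storage index formulas can be checked with a small plain-Python helper. The name `packed_index` is hypothetical; indices are 1-based to match the formulas, with 1 ≤ i ≤ j for the upper case and j ≤ i ≤ n for the lower case.

```python
def packed_index(uplo, i, j, n):
    """1-based position of A(i,j) in the packed array ap (illustrative)."""
    if uplo == "U":
        assert 1 <= i <= j <= n
        return i + (j - 1) * j // 2
    assert 1 <= j <= i <= n
    # (j-1)*(2n-j) is always even, so integer division is exact here.
    return i + (j - 1) * (2 * n - j) // 2
```

For n = 3, both layouts enumerate positions 1 through 6 column by column, which matches the array length n*(n+1)/2.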
s REAL for claqhp

DOUBLE PRECISION for zlaqhp
scond REAL for claqhp

amax REAL for claqhp

Output Parameters
ap If equed = 'Y', ap contains the equilibrated matrix diag(s)*A*diag(s) in
the same storage format as on input.
equed CHARACTER*1.
diag(s)*A*diag(s).
Application Notes
The routine uses internal parameters thresh, large, and small. The parameter thresh is a threshold value
used to decide if scaling should be done based on the ratio of the scaling factors. If scond < thresh, scaling
is done.
The large and small parameters are threshold values used to decide if scaling should be done based on the
absolute size of the largest matrix element. If amax > large or amax < small, scaling is done.
?larcm
Multiplies a square real matrix by a complex matrix.
Syntax
call clarcm( m, n, a, lda, b, ldb, c, ldc, rwork )
call zlarcm( m, n, a, lda, b, ldb, c, ldc, rwork )
Include Files
mkl.fi
Description
The routine performs a simple matrix-matrix multiplication of the form

C = A*B,
where A is m-by-m and real, B is m-by-n and complex, C is m-by-n and complex.
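Because A is real, the product can be formed from two real matrix products, one for the real part of B and one for the imaginary part. The plain-Python sketch below illustrates this; the name `real_times_complex` is hypothetical, and the assumption that the 2*m*n workspace rwork exists to hold such separated parts is an inference, not something stated in this section.

```python
def real_times_complex(a, b):
    """C = A*B with real A (m-by-m) and complex B (m-by-n), via two real products."""
    m, n = len(a), len(b[0])
    re = [[sum(a[i][k] * b[k][j].real for k in range(m)) for j in range(n)] for i in range(m)]
    im = [[sum(a[i][k] * b[k][j].imag for k in range(m)) for j in range(n)] for i in range(m)]
    # Recombine the two real results into the complex product.
    return [[complex(re[i][j], im[i][j]) for j in range(n)] for i in range(m)]
```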
Input Parameters
m INTEGER. The number of rows and columns of the matrix A and the
number of rows of the matrix C (m ≥ 0).
n INTEGER. The number of columns of the matrix B and the number of
columns of the matrix C (n ≥ 0).
a REAL for clarcm

DOUBLE PRECISION for zlarcm
Array, DIMENSION(lda, m). Contains the m-by-m matrix A.
lda INTEGER. The leading dimension of the array a, lda ≥ max(1, m).
b COMPLEX for clarcm

DOUBLE COMPLEX for zlarcm
Array, DIMENSION(ldb, n). Contains the m-by-n matrix B.
ldb INTEGER. The leading dimension of the array b, ldb ≥ max(1, m).
ldc INTEGER. The leading dimension of the output array c, ldc ≥ max(1, m).
rwork REAL for clarcm

DOUBLE PRECISION for zlarcm
Workspace array, DIMENSION(2*m*n).
Output Parameters
c COMPLEX for clarcm

DOUBLE COMPLEX for zlarcm
Array, DIMENSION (ldc, n). Contains the m-by-n matrix C.
?la_gerpvgrw
Computes the reciprocal pivot growth factor norm(A)/
norm(U) for a general matrix.
Syntax
call sla_gerpvgrw( n, ncols, a, lda, af, ldaf )
call dla_gerpvgrw( n, ncols, a, lda, af, ldaf )
call cla_gerpvgrw( n, ncols, a, lda, af, ldaf )
call zla_gerpvgrw( n, ncols, a, lda, af, ldaf )
Include Files
mkl.fi
Description
The ?la_gerpvgrw routine computes the reciprocal pivot growth factor norm(A)/norm(U). The max
absolute element norm is used.
Input Parameters

n INTEGER. The number of linear equations, the order of the matrix A; n ≥ 0.
a, af REAL for sla_gerpvgrw

DOUBLE PRECISION for dla_gerpvgrw
COMPLEX for cla_gerpvgrw
DOUBLE COMPLEX for zla_gerpvgrw.
The array af contains the factors L and U from the factorization A = P*L*U as
computed by ?getrf. The second dimension of af must be at least
max(1,n).
See Also
?getrf
?larscl2
Performs reciprocal diagonal scaling on a vector.
Syntax
call slarscl2(m, n, d, x, ldx)
call dlarscl2(m, n, d, x, ldx)
call clarscl2(m, n, d, x, ldx)
call zlarscl2(m, n, d, x, ldx)
Include Files
mkl.fi
Description
The ?larscl2 routines perform reciprocal diagonal scaling on a vector
x := D^(-1)*x,
where:
x is a vector, and
D is a diagonal matrix.
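Since D is stored as a vector d of length m, applying D^(-1) divides row i of x by d(i). A minimal plain-Python sketch, with the hypothetical name `reciprocal_diag_scale` and x given as an m-by-n list of rows:

```python
def reciprocal_diag_scale(d, x):
    """Return inv(D)*x, dividing row i of the m-by-n array x by d[i]."""
    return [[xij / d[i] for xij in row] for i, row in enumerate(x)]
```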
Input Parameters
m INTEGER. Specifies the number of rows of the matrix D and the number of
elements of the vector x. The value of m must be at least zero.
n INTEGER. The number of columns of D and x. The value of n must be at

least zero.
d REAL for slarscl2 and clarscl2.

DOUBLE PRECISION for dlarscl2 and zlarscl2.
Array, DIMENSION m. Diagonal matrix D stored as a vector of length m.
x REAL for slarscl2.

DOUBLE PRECISION for dlarscl2.
COMPLEX for clarscl2.
DOUBLE COMPLEX for zlarscl2.
Array, DIMENSION(ldx,n). The vector x to scale by D.
ldx INTEGER.
The leading dimension of the vector x. The value of ldx must be at least
zero.
Output Parameters
x Scaled vector x.
?lascl2
Performs diagonal scaling on a vector.
Syntax
call slascl2(m, n, d, x, ldx)
call dlascl2(m, n, d, x, ldx)
call clascl2(m, n, d, x, ldx)
call zlascl2(m, n, d, x, ldx)
Include Files
mkl.fi
Description
The ?lascl2 routines perform diagonal scaling on a vector
x := D*x,
where:
x is a vector, and
D is a diagonal matrix.
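Since D is stored as a vector d of length m, applying D multiplies row i of x by d(i). A minimal plain-Python sketch, with the hypothetical name `diag_scale` and x given as an m-by-n list of rows:

```python
def diag_scale(d, x):
    """Return D*x, multiplying row i of the m-by-n array x by d[i]."""
    return [[d[i] * xij for xij in row] for i, row in enumerate(x)]
```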
Input Parameters
m INTEGER. Specifies the number of rows of the matrix D and the number of
elements of the vector x. The value of m must be at least zero.
n INTEGER. The number of columns of D and x. The value of n must be at

least zero.
d REAL for slascl2 and clascl2.

DOUBLE PRECISION for dlascl2 and zlascl2.
Array, DIMENSION m. Diagonal matrix D stored as a vector of length m.
x REAL for slascl2.

DOUBLE PRECISION for dlascl2.
COMPLEX for clascl2.
DOUBLE COMPLEX for zlascl2.
Array, DIMENSION(ldx,n). The vector x to scale by D.
ldx INTEGER.
The leading dimension of the vector x. The value of ldx must be at least
zero.
Output Parameters
x Scaled vector x.
?la_syamv
Computes a matrix-vector product using a symmetric
indefinite matrix to calculate error bounds.
Syntax
call sla_syamv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call dla_syamv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call cla_syamv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
call zla_syamv(uplo, n, alpha, a, lda, x, incx, beta, y, incy)
Include Files
mkl.fi
Description
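The ?la_syamv routines perform a matrix-vector operation defined as
y := alpha*abs(A)*abs(x) + beta*abs(y),
where alpha and beta are scalars, x and y are vectors, and A is an n-by-n symmetric matrix.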
Input Parameters
uplo CHARACTER*1.
Specifies whether the upper or lower triangular part of the array A is to be
referenced:
If uplo = 'BLAS_UPPER', only the upper triangular part of A is to be
referenced,
If uplo = 'BLAS_LOWER', only the lower triangular part of A is to be
referenced.
n INTEGER. Specifies the number of rows and columns of the matrix A. The
value of n must be at least zero.
alpha REAL for sla_syamv and cla_syamv

DOUBLE PRECISION for dla_syamv and zla_syamv.
a REAL for sla_syamv

DOUBLE PRECISION for dla_syamv
COMPLEX for cla_syamv
DOUBLE COMPLEX for zla_syamv.

x REAL for sla_syamv

DOUBLE PRECISION for dla_syamv
COMPLEX for cla_syamv
DOUBLE COMPLEX for zla_syamv.
Array, DIMENSION at least (1+(n-1)*abs(incx)). Before entry, the

beta REAL for sla_syamv and cla_syamv

DOUBLE PRECISION for dla_syamv and zla_syamv
input.
y REAL for sla_syamv and cla_syamv

DOUBLE PRECISION for dla_syamv and zla_syamv
Array, DIMENSION at least (1+(n-1)*abs(incy)). Before
entry with non-zero beta, the incremented array y must contain the vector
Y.

Output Parameters
y Updated vector Y.
?la_syrcond
Estimates the Skeel condition number for a symmetric
indefinite matrix.
Syntax
call sla_syrcond( uplo, n, a, lda, af, ldaf, ipiv, cmode, c, info, work, iwork )
call dla_syrcond( uplo, n, a, lda, af, ldaf, ipiv, cmode, c, info, work, iwork )
Include Files
mkl.fi
Description
The function estimates the Skeel condition number of
op(A) * op2(C)
where
cmode Value    op2(C)
1              C
0              I
-1             inv(C)

diag(R)*A*op2(C)
Input Parameters

0.
a, af, c, work REAL for sla_syrcond

DOUBLE PRECISION for dla_syrcond
Arrays:
a (lda,*) contains the n-by-n matrix A.
af (ldaf,*) contains the block diagonal matrix D and the multipliers used to
obtain the factor L or U as computed by ?sytrf.

lda INTEGER. The leading dimension of the array a. lda ≥ max(1,n).
ipiv INTEGER.

Output Parameters
info INTEGER.
See Also
?sytrf
?la_syrcond_c
Computes the infinity norm condition number of
op(A)*inv(diag(c)) for symmetric indefinite matrices.
Syntax
call cla_syrcond_c( uplo, n, a, lda, af, ldaf, ipiv, c, capply, info, work, rwork )
call zla_syrcond_c( uplo, n, a, lda, af, ldaf, ipiv, c, capply, info, work, rwork )
Include Files
mkl.fi
Description
op(A) * inv(diag(c))
where c is a REAL vector for cla_syrcond_c and a DOUBLE PRECISION vector for zla_syrcond_c.
Input Parameters
0.
a COMPLEX for cla_syrcond_c

DOUBLE COMPLEX for zla_syrcond_c
af COMPLEX for cla_syrcond_c

used to obtain the factor U or L as computed by ?sytrf. The second dimension
ipiv INTEGER.
c REAL for cla_syrcond_c

DOUBLE PRECISION for zla_syrcond_c
work COMPLEX for cla_syrcond_c

rwork REAL for cla_syrcond_c

Output Parameters
info INTEGER.
See Also
?sytrf
?la_syrcond_x
Computes the infinity norm condition number of
op(A)*diag(x) for symmetric indefinite matrices.
Syntax
call cla_syrcond_x( uplo, n, a, lda, af, ldaf, ipiv, x, info, work, rwork )
call zla_syrcond_x( uplo, n, a, lda, af, ldaf, ipiv, x, info, work, rwork )
Include Files
mkl.fi
Description
op(A) * diag(x)
where x is a COMPLEX vector for cla_syrcond_x and a DOUBLE COMPLEX vector for zla_syrcond_x.
Input Parameters

0.
a COMPLEX for cla_syrcond_x

af COMPLEX for cla_syrcond_x

used to obtain the factor U or L as computed by ?sytrf. The second dimension
ipiv INTEGER.
x COMPLEX for cla_syrcond_x

work COMPLEX for cla_syrcond_x

rwork REAL for cla_syrcond_x

Output Parameters
info INTEGER.
See Also
?sytrf
?la_syrfsx_extended
Improves the computed solution to a system of linear
equations for symmetric indefinite matrices by
performing extra-precise iterative refinement and
provides error bounds and backward error estimates
for the solution.
Syntax
call sla_syrfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, ipiv, colequ, c,
call dla_syrfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, ipiv, colequ, c,
call cla_syrfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, ipiv, colequ, c,
call zla_syrfsx_extended( prec_type, uplo, n, nrhs, a, lda, af, ldaf, ipiv, colequ, c,
Include Files
mkl.fi
Description
The ?la_syrfsx_extended subroutine improves the computed solution to a system of linear equations by
the solution. The ?syrfsx routine calls ?la_syrfsx_extended to perform iterative refinement.
Use ?la_syrfsx_extended to set only the second fields of err_bnds_norm and err_bnds_comp.
Input Parameters
prec_type INTEGER.
If p = 'S': Single.
If p = 'D': Double.

matrix B.
a, af, b, y REAL for sla_syrfsx_extended

DOUBLE PRECISION for dla_syrfsx_extended
COMPLEX for cla_syrfsx_extended
DOUBLE COMPLEX for zla_syrfsx_extended.
to obtain the factor U or L as computed by ?sytrf.

sytrs. The second dimension of y must be at least max(1,nrhs).
ipiv INTEGER.
Array with DIMENSION n. Details of the interchanges and the block structure
of D as determined by ?sytrf.

bounds correctly.
c REAL for sla_syrfsx_extended and cla_syrfsx_extended

DOUBLE PRECISION for dla_syrfsx_extended and
zla_syrfsx_extended.
used.

err_bnds_norm REAL for sla_syrfsx_extended and cla_syrfsx_extended



side.
fields:

sla_syrfsx_extended/
cla_syrfsx_extended and
dla_syrfsx_extended/

for sla_syrfsx_extended/
zla_syrfsx_extended. This error bound
true.

zla_syrfsx_extended to determine if the error
scaled matrix Z.
approximately 1.
above.
err_bnds_comp REAL for sla_syrfsx_extended and cla_syrfsx_extended

follows:


side.
fields:


for sla_syrfsx_extended/
zla_syrfsx_extended. This error bound
true.

zla_syrfsx_extended to determine if the error

scaled matrix Z.
above.
res, dy, y_tail REAL for sla_syrfsx_extended


ayb REAL for sla_syrfsx_extended and cla_syrfsx_extended

rcond REAL for sla_syrfsx_extended and cla_syrfsx_extended


trustworthy.
rthresh REAL for sla_syrfsx_extended and cla_syrfsx_extended

rthresh satisfies
0 < rthresh ≤ 1.
dz_ub REAL for sla_syrfsx_extended and cla_syrfsx_extended

bit to be stable.
is .FALSE.
Output Parameters
y REAL for sla_syrfsx_extended

berr_out REAL for sla_syrfsx_extended and cla_syrfsx_extended



See Also
?syrfsx
?sytrf
?sytrs
?lamch
ilaprec
ilatrans
?la_lin_berr
?la_syrpvgrw
Computes the reciprocal pivot growth factor norm(A)/
norm(U) for a symmetric indefinite matrix.
Syntax
call sla_syrpvgrw( uplo, n, info, a, lda, af, ldaf, ipiv, work )
call dla_syrpvgrw( uplo, n, info, a, lda, af, ldaf, ipiv, work )
call cla_syrpvgrw( uplo, n, info, a, lda, af, ldaf, ipiv, work )
call zla_syrpvgrw( uplo, n, info, a, lda, af, ldaf, ipiv, work )
Include Files
mkl.fi
Description
The ?la_syrpvgrw routine computes the reciprocal pivot growth factor norm(A)/norm(U). The max
Input Parameters


n 0.
info INTEGER. The value of INFO returned from ?sytrf, that is, the pivot
in column info is exactly 0.
a, af REAL for sla_syrpvgrw

DOUBLE PRECISION for dla_syrpvgrw
COMPLEX for cla_syrpvgrw
DOUBLE COMPLEX for zla_syrpvgrw.
The array af contains the block diagonal matrix D and the multipliers
used to obtain the factor U or L as computed by ?sytrf.

ipiv INTEGER.
Array, DIMENSION n. Details of the interchanges and the block
structure of D as determined by ?sytrf.
work REAL for sla_syrpvgrw and cla_syrpvgrw

DOUBLE PRECISION for dla_syrpvgrw and zla_syrpvgrw.
Workspace array, dimension 2*n.
See Also
?sytrf
?la_wwaddw
Adds a vector into a doubled-single vector.
Syntax
call sla_wwaddw( n, x, y, w )
call dla_wwaddw( n, x, y, w )
call cla_wwaddw( n, x, y, w )
call zla_wwaddw( n, x, y, w )
Include Files
mkl.fi
Description
The ?la_wwaddw routine adds a vector W into a doubled-single vector (X, Y). This works for all existing IBM
hex and binary floating-point arithmetics, but not for decimal.
Input Parameters
n INTEGER. The length of vectors X, Y, and W .
x, y, w REAL for sla_wwaddw

DOUBLE PRECISION for dla_wwaddw
COMPLEX for cla_wwaddw
DOUBLE COMPLEX for zla_wwaddw.
Arrays, DIMENSION n.
x and y contain the first and second parts of the doubled-single
accumulation vector, respectively.
w contains the vector W to be added.
Output Parameters
x, y Contain the first and second parts of the doubled-single accumulation vector,
respectively, after adding the vector W.
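The update described above can be sketched in plain Fortran. This is a minimal illustration modeled on the reference LAPACK implementation of sla_wwaddw as best recalled here, not the MKL source; the names wwaddw_sketch and wwaddw_ref are hypothetical, and the input values are chosen only for demonstration.

```fortran
program wwaddw_sketch
  implicit none
  integer, parameter :: n = 3
  real :: x(n), y(n), w(n)

  ! (x, y) is the doubled-single accumulator: x holds the leading part,
  ! y the trailing (low-order) part. w is the vector to add.
  x = [1.0, 2.0, 3.0]
  y = 0.0
  w = [1.0e-8, 2.0e-8, 3.0e-8]

  call wwaddw_ref(n, x, y, w)
  print *, 'x =', x
  print *, 'y =', y

contains

  ! Hypothetical stand-in for sla_wwaddw (assumption: mirrors reference LAPACK).
  subroutine wwaddw_ref(n, x, y, w)
    integer, intent(in)    :: n
    real,    intent(inout) :: x(n), y(n)
    real,    intent(in)    :: w(n)
    real    :: s
    integer :: i

    do i = 1, n
      s = x(i) + w(i)
      ! Forces s to be rounded to working precision; the parentheses
      ! must be honored by the compiler for this to have any effect.
      s = (s + s) - s
      ! Fold the part of w(i) that did not fit into s into the tail y.
      y(i) = ((x(i) - s) + w(i)) + y(i)
      x(i) = s
    end do
  end subroutine wwaddw_ref

end program wwaddw_sketch
```

With these inputs each w(i) is too small to change x(i) in single precision, so the contribution is expected to end up in y rather than being lost.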
mkl_?tppack
Copies a triangular/symmetric matrix or submatrix
from standard full format to standard packed format.
Syntax
call mkl_stppack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_dtppack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_ctppack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_ztppack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_tppack (ap, i, j, rows, cols, a[, uplo] [, trans] [, info])
Include Files
mkl.fi, lapack.f90
Description
The routine copies a triangular or symmetric matrix or its submatrix from standard full format to packed
format
AP(i:i+rows-1, j:j+cols-1) := op(A)
Standard packed formats include:
TP: triangular packed storage

SP: symmetric indefinite packed storage
HP: Hermitian indefinite packed storage
PP: symmetric or Hermitian positive definite packed storage
Full formats include:
GE: general
TR: triangular
SY: symmetric indefinite
HE: Hermitian indefinite
PO: symmetric or Hermitian positive definite
NOTE
Any elements of the copied rectangular submatrix that lie outside the triangular part of the matrix AP are
skipped.
Input Parameters
uplo CHARACTER*1. Specifies whether the matrix AP is upper or lower
triangular.
If uplo = 'U', AP is upper triangular.
If uplo = 'L', AP is lower triangular.
trans CHARACTER*1. Specifies whether the copied block of A is transposed.
If trans = 'N', no transpose: op(A) = A.
If trans = 'T', transpose: op(A) = A^T.
If trans = 'C', conjugate transpose: op(A) = A^H. For real data this is the
same as trans = 'T'.
n INTEGER. The order of the matrix AP; n ≥ 0.
i, j INTEGER. Coordinates of the upper left corner of the destination submatrix
in AP.
If uplo = 'U', 1 ≤ i ≤ j ≤ n.
If uplo = 'L', 1 ≤ j ≤ i ≤ n.
rows INTEGER. Number of rows in the destination submatrix. 0 ≤ rows ≤ n - i + 1.
cols INTEGER. Number of columns in the destination submatrix. 0 ≤ cols ≤ n - j + 1.
a REAL for mkl_stppack

DOUBLE PRECISION for mkl_dtppack
COMPLEX for mkl_ctppack
DOUBLE COMPLEX for mkl_ztppack
Pointer to the source submatrix.
Array a(lda, *) contains the rows-by-cols submatrix stored as unpacked
rows-by-columns if trans = 'N', or unpacked columns-by-rows if trans
= 'T' or trans = 'C'. The size of a must be at least lda*cols for trans =
'N' or lda*rows for trans = 'T' or trans = 'C'.
NOTE
If there are elements outside of the triangular part of AP, they are
skipped and are not copied from a.
lda INTEGER. The leading dimension of the array a.
lda ≥ max(1, rows) for trans = 'N' and lda ≥ max(1, cols) for
trans = 'T' or trans = 'C'.
Output Parameters
ap REAL for mkl_stppack

DOUBLE PRECISION for mkl_dtppack
COMPLEX for mkl_ctppack
DOUBLE COMPLEX for mkl_ztppack
Array of size at least max(1, n*(n+1)/2). The array ap contains either the
upper or the lower triangular part of the matrix AP (as specified by uplo) in
packed storage (see Matrix Storage Schemes). The submatrix of ap from
row i to row i + rows - 1 and column j to column j + cols - 1 is
overwritten with a copy of the source matrix.
info INTEGER. If info=0, the execution is successful. If info = -i, the i-th
parameter had an illegal value.
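To make the packed layout concrete, the following self-contained sketch packs the upper triangle of a small full matrix by hand. It does not call MKL. The index formulas are the usual LAPACK column-major packed-storage mappings (see Matrix Storage Schemes), and the helper names upper_index and lower_index are hypothetical.

```fortran
program packed_index_sketch
  implicit none
  integer, parameter :: n = 3
  double precision :: a(n, n), ap(n*(n+1)/2)
  integer :: i, j

  ! Fill a full matrix with recognizable values: a(i,j) = 10*i + j.
  do j = 1, n
    do i = 1, n
      a(i, j) = 10.0d0*i + j
    end do
  end do

  ! Pack the upper triangle column by column (the uplo = 'U',
  ! trans = 'N' case). Elements below the diagonal are skipped.
  do j = 1, n
    do i = 1, j
      ap(upper_index(i, j)) = a(i, j)
    end do
  end do

  ! ap now holds a(1,1), a(1,2), a(2,2), a(1,3), a(2,3), a(3,3) in order.
  print '(6f6.1)', ap

  ! For comparison, position of a(3,2) in lower packed storage.
  print *, 'lower_index(3, 2, n) =', lower_index(3, 2, n)

contains

  ! Position of A(i,j), i <= j, in upper packed storage.
  pure integer function upper_index(i, j)
    integer, intent(in) :: i, j
    upper_index = i + (j - 1)*j/2
  end function upper_index

  ! Position of A(i,j), j <= i, in lower packed storage of order "order".
  pure integer function lower_index(i, j, order)
    integer, intent(in) :: i, j, order
    lower_index = i + (j - 1)*(2*order - j)/2
  end function lower_index

end program packed_index_sketch
```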
mkl_?tpunpack
Copies a triangular/symmetric matrix or submatrix
from standard packed format to full format.
Syntax
call mkl_stpunpack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_dtpunpack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_ctpunpack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_ztpunpack (uplo, trans, n, ap, i, j, rows, cols, a, lda, info )
call mkl_tpunpack (ap, i, j, rows, cols, a[, uplo] [, trans] [, info])
Include Files
mkl.fi, lapack.f90
Description
The routine copies a triangular or symmetric matrix or its submatrix from standard packed format to full
format.
A := op(AP(i:i+rows-1, j:j+cols-1))
Standard packed formats include:
TP: triangular packed storage

SP: symmetric indefinite packed storage
HP: Hermitian indefinite packed storage
PP: symmetric or Hermitian positive definite packed storage
Full formats include:
GE: general
TR: triangular
SY: symmetric indefinite
HE: Hermitian indefinite
PO: symmetric or Hermitian positive definite
NOTE
Any elements of the copied rectangular submatrix that lie outside the triangular part of AP are skipped.
Input Parameters
uplo CHARACTER*1. Specifies whether matrix AP is upper or lower triangular.

If uplo = 'U', AP is upper triangular.
If uplo = 'L', AP is lower triangular.
trans CHARACTER*1. Specifies whether the copied block of AP is transposed.
If trans = 'N', no transpose: op(AP) = AP.
If trans = 'T', transpose: op(AP) = AP^T.
If trans = 'C', conjugate transpose: op(AP) = AP^H. For real data this is the
same as trans = 'T'.
n INTEGER. The order of the matrix AP; n ≥ 0.
ap REAL for mkl_stpunpack

DOUBLE PRECISION for mkl_dtpunpack
COMPLEX for mkl_ctpunpack
DOUBLE COMPLEX for mkl_ztpunpack
Array, size at least max(1, n*(n+1)/2). The array ap contains either the
upper or the lower triangular part of the matrix AP (as specified by uplo) in
packed storage (see Matrix Storage Schemes). It is the source for the
submatrix of AP from row i to row i + rows - 1 and column j to column j
+ cols - 1 to be copied.
i, j INTEGER. Coordinates of the upper left corner of the submatrix in AP to copy.

If uplo = 'U', 1 ≤ i ≤ j ≤ n.
If uplo = 'L', 1 ≤ j ≤ i ≤ n.
rows INTEGER. Number of rows to copy. 0 ≤ rows ≤ n - i + 1.
cols INTEGER. Number of columns to copy. 0 ≤ cols ≤ n - j + 1.
lda INTEGER. The leading dimension of array a.
lda ≥ max(1,rows) for trans = 'N' and lda ≥ max(1,cols) for
trans = 'T' or trans = 'C'.
Output Parameters
a REAL for mkl_stpunpack

DOUBLE PRECISION for mkl_dtpunpack
COMPLEX for mkl_ctpunpack
DOUBLE COMPLEX for mkl_ztpunpack
Pointer to the destination matrix. The size of a must be at least lda*cols
for trans = 'N' or lda*rows for trans = 'T' or trans = 'C'. On exit, array a is
overwritten with a copy of the rows-by-cols submatrix of ap, stored as
unpacked rows-by-columns if trans = 'N', or unpacked columns-by-rows if
trans = 'T' or trans = 'C'.
NOTE
If there are elements outside of the triangular part of ap indicated by
uplo, they are skipped and are not copied to a.
info INTEGER. If info=0, the execution is successful. If info = -i, the i-th
parameter had an illegal value.
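A possible round trip through both routines is sketched below, using the FORTRAN 77 call syntax listed above. It assumes the program is linked against Intel MKL; the program name and the test values are illustrative only.

```fortran
program tppack_roundtrip
  implicit none
  integer, parameter :: n = 3
  double precision :: a(n, n), b(n, n), ap(n*(n+1)/2)
  integer :: i, j, info
  external :: mkl_dtppack, mkl_dtpunpack   ! provided by Intel MKL

  ! Source matrix with recognizable entries: a(i,j) = 10*i + j.
  do j = 1, n
    do i = 1, n
      a(i, j) = 10.0d0*i + j
    end do
  end do
  ap = 0.0d0
  b  = 0.0d0

  ! Pack the whole n-by-n block starting at (1,1) into upper packed
  ! storage. Entries below the diagonal lie outside the triangle of AP
  ! and are skipped.
  call mkl_dtppack('U', 'N', n, ap, 1, 1, n, n, a, n, info)
  if (info /= 0) stop 'mkl_dtppack reported an illegal argument'

  ! Unpack the same block into b. Only the upper triangle is written;
  ! the strictly lower part of b keeps its initial zeros.
  call mkl_dtpunpack('U', 'N', n, ap, 1, 1, n, n, b, n, info)
  if (info /= 0) stop 'mkl_dtpunpack reported an illegal argument'

  do i = 1, n
    print '(3f6.1)', b(i, :)
  end do
end program tppack_roundtrip
```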
LAPACK Utility Functions and Routines

This section describes LAPACK utility functions and routines.
Summary information about these routines is given in the following table:
LAPACK Utility Routines
Routine Name | Data Types | Description
ilaver | | Returns the version of the LAPACK library.
ilaenv | | Environmental enquiry function which returns values for tuning algorithmic performance.
iparmq | | Environmental enquiry function which returns values for tuning algorithmic performance.
ieeeck | | Checks if the infinity and NaN arithmetic is safe. Called by ilaenv.
?labad | s, d | Returns the square root of the underflow and overflow thresholds if the exponent-range is very large.
?lamch | s, d | Determines machine parameters for floating-point arithmetic.
?lamc1 | s, d | Called from ?lamc2. Determines machine parameters given by beta, t, rnd, ieee1.
?lamc2 | s, d | Used by ?lamch. Determines machine parameters specified in its arguments list.
?lamc3 | s, d | Called from ?lamc1-?lamc5. Intended to force a and b to be stored prior to doing the addition of a and b.
?lamc4 | s, d | This is a service routine for ?lamc2.
?lamc5 | s, d | Called from ?lamc2. Attempts to compute the largest machine floating-point number, without overflow.
chla_transtype | | Translates a BLAST-specified integer constant to the character string specifying a transposition operation.
iladiag | | Translates a character string specifying whether a matrix has a unit diagonal or not to the relevant BLAST-specified integer constant.
ilaprec | | Translates a character string specifying an intermediate precision to the relevant BLAST-specified integer constant.
ilatrans | | Translates a character string specifying a transposition operation to the BLAST-specified integer constant.
ilauplo | | Translates a character string specifying an upper- or lower-triangular matrix to the relevant BLAST-specified integer constant.
xerbla_array | | Assists other languages in calling the xerbla function.
See Also
lsame Tests two characters for equality regardless of the case.
lsamen Tests two character strings for equality regardless of the case.
second/dsecnd Returns elapsed time in seconds. Use to estimate real time between two calls to
this function.
xerbla Error handling function called by BLAS, LAPACK, Vector Math, and Vector Statistics
functions.
ilaver
Returns the version of the LAPACK library.
Syntax
call ilaver( vers_major, vers_minor, vers_patch )
Include Files
mkl.fi
Description
This routine returns the version of the LAPACK library.
Output Parameters
vers_major INTEGER.
Returns the major version of the LAPACK library.
vers_minor INTEGER.
Returns the minor version from the major version of the LAPACK library.
vers_patch INTEGER.
Returns the patch version from the minor version of the LAPACK library.
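A minimal usage sketch follows. It assumes the program is linked against MKL (or another LAPACK library that provides ilaver); the program name is illustrative.

```fortran
program show_lapack_version
  implicit none
  integer :: vers_major, vers_minor, vers_patch
  external :: ilaver   ! provided by the LAPACK library

  ! All three arguments are outputs.
  call ilaver(vers_major, vers_minor, vers_patch)

  print '(a,i0,a,i0,a,i0)', 'LAPACK version ', vers_major, '.', &
        vers_minor, '.', vers_patch
end program show_lapack_version
```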
ilaenv
Environmental enquiry function that returns values for
tuning algorithmic performance.
Syntax
value = ilaenv( ispec, name, opts, n1, n2, n3, n4 )
Include Files
mkl.fi
Description
The enquiry function ilaenv is called from the LAPACK routines to choose problem-dependent parameters
for the local environment. See ispec below for a description of the parameters.
This version provides a set of parameters that should give good, but not optimal, performance on many of
the currently available computers.
Input Parameters
ispec INTEGER.
Specifies the parameter to be returned as the value of ilaenv:
= 1: the optimal block size; if this value is 1, an unblocked algorithm will
give the best performance.
= 2: the minimum block size for which the block routine should be used; if
the usable block size is less than this value, an unblocked routine should be
used.
= 3: the crossover point (in a block routine, for n less than this value, an
unblocked routine should be used)
= 4: the number of shifts, used in the nonsymmetric eigenvalue routines
(deprecated)
= 5: the minimum column dimension for blocking to be used; rectangular
blocks must have dimension at least k-by-m, where k is given by
ilaenv(2,...) and m by ilaenv(5,...)
= 6: the crossover point for the SVD (when reducing an m-by-n matrix to
bidiagonal form, if max(m,n)/min(m,n) exceeds this value, a QR
factorization is used first to reduce the matrix to a triangular form.)
= 7: the number of processors
= 8: the crossover point for the multishift QR and QZ methods for
nonsymmetric eigenvalue problems (deprecated).
= 9: maximum size of the subproblems at the bottom of the computation
tree in the divide-and-conquer algorithm (used by ?gelsd and ?gesdd)
= 10: IEEE NaN arithmetic can be trusted not to trap
= 11: infinity arithmetic can be trusted not to trap
12 ≤ ispec ≤ 16: ?hseqr or one of its subroutines, see iparmq for detailed
explanation.
name CHARACTER*(*). The name of the calling subroutine, in either upper case
or lower case.
opts CHARACTER*(*). The character options to the subroutine name,

concatenated into a single character string. For example, uplo = 'U',
trans = 'T', and diag = 'N' for a triangular routine would be specified
as opts = 'UTN'.
NOTE
Use only uppercase characters for the opts string.
n1, n2, n3, n4 INTEGER. Problem dimensions for the subroutine name; these may not all
be required.
Output Parameters
value INTEGER.
If value ≥ 0: the value of the parameter specified by ispec;
If value = -k < 0: the k-th argument had an illegal value.
Application Notes
The following conventions have been used when calling ilaenv from the LAPACK routines:
1. opts is a concatenation of all of the character options to subroutine name, in the same order that they
appear in the argument list for name, even if they are not used in determining the value of the
parameter specified by ispec.
2. The problem dimensions n1, n2, n3, n4 are specified in the order that they appear in the argument list
for name. n1 is used first, n2 second, and so on, and unused problem dimensions are passed a value of
-1.
3. The parameter value returned by ilaenv is checked for validity in the calling subroutine. For example,
ilaenv is used to retrieve the optimal blocksize for strtri as follows:
nb = ilaenv( 1, 'strtri', uplo // diag, n, -1, -1, -1 )
if( nb.le.1 ) nb = max( 1, n )
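The conventions above can be put together in a small stand-alone program. This is a sketch that assumes linking against MKL or another LAPACK library; the program name and the problem size n = 1000 are illustrative, and the value returned depends on the library and platform.

```fortran
program query_block_size
  implicit none
  integer, external :: ilaenv   ! provided by the LAPACK library
  character(len=1)  :: uplo, diag
  integer :: n, nb

  n    = 1000
  uplo = 'U'
  diag = 'N'

  ! ispec = 1 requests the optimal block size. opts concatenates the
  ! character options of strtri in argument order (uppercase, as the
  ! NOTE above requires). Unused problem dimensions are passed as -1.
  nb = ilaenv(1, 'STRTRI', uplo // diag, n, -1, -1, -1)

  ! Validity check in the caller, following the document's example.
  if (nb <= 1) nb = max(1, n)

  print *, 'Block size used for strtri with n =', n, ':', nb
end program query_block_size
```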
See Also
?hseqr
iparmq
iparmq
Environmental enquiry function which returns values
for tuning algorithmic performance.
Syntax
value = iparmq( ispec, name, opts, n, ilo, ihi, lwork )
Include Files
mkl.fi
Description
The function sets problem and machine dependent parameters useful for ?hseqr and its subroutines. It is
called whenever ilaenv is called with 12 ≤ ispec ≤ 16.
Input Parameters
ispec INTEGER.
Specifies the parameter to be returned as the value of iparmq:
= 12: (inmin) Matrices of order nmin or less are sent directly to ?lahqr,
the implicit double shift QR algorithm. nmin must be at least 11.
= 13: (inwin) Size of the deflation window. This is best set greater than or
equal to the number of simultaneous shifts ns. Larger matrices benefit from
larger deflation windows.
= 14: (inibl) Determines when to stop nibbling and invest in an
(expensive) multi-shift QR sweep. If the aggressive early deflation
subroutine finds ld converged eigenvalues from an order nw deflation
window and ld>(nw*nibble)/100, then the next QR sweep is skipped and
early deflation is applied immediately to the remaining active diagonal
block. Setting iparmq(ispec=14)=0 causes TTQRE to skip a multi-shift QR
sweep whenever early deflation finds a converged eigenvalue. Setting
iparmq(ispec=14) greater than or equal to 100 prevents TTQRE from
skipping a multi-shift QR sweep.
= 15: (nshfts) The number of simultaneous shifts in a multi-shift QR
iteration.
= 16: (iacc22) iparmq is set to 0, 1 or 2 with the following meanings.
0: During the multi-shift QR sweep, ?laqr5 does not accumulate

reflections and does not use matrix-matrix multiply to update the far-from-
diagonal matrix entries.
1: During the multi-shift QR sweep, ?laqr5 and/or ?laqr3 accumulates
reflections and uses matrix-matrix multiply to update the far-from-diagonal
matrix entries.
2: During the multi-shift QR sweep, ?laqr5 accumulates reflections and
takes advantage of 2-by-2 block structure during matrix-matrix multiplies.
(If ?trmm is slower than ?gemm, then iparmq(ispec=16)=1 may be more
efficient than iparmq(ispec=16)=2 despite the greater level of arithmetic
work implied by the latter choice.)
name CHARACTER*(*). The name of the calling subroutine.
opts CHARACTER*(*). This is a concatenation of the string arguments to TTQRE.
n INTEGER. n is the order of the Hessenberg matrix H.
ilo, ihi INTEGER.
It is assumed that H is already upper triangular in rows and columns
1:ilo-1 and ihi+1:n.
lwork INTEGER.
The amount of workspace available.
Output Parameters
value INTEGER.
If value ≥ 0: the value of the parameter specified by ispec;
If value = -k < 0: the k-th argument had an illegal value.
Application Notes
The following conventions have been used when calling ilaenv from the LAPACK routines:
1. opts is a concatenation of all of the character options to subroutine name, in the same order that they
appear in the argument list for name, even if they are not used in determining the value of the
parameter specified by ispec.
2. The problem dimensions n1, n2, n3, n4 are specified in the order that they appear in the argument list
for name. n1 is used first, n2 second, and so on, and unused problem dimensions are passed a value of
-1.
3. The parameter value returned by ilaenv is checked for validity in the calling subroutine. For example,
ilaenv is used to retrieve the optimal blocksize for strtri as follows:
nb = ilaenv( 1, 'strtri', uplo // diag, n, -1, -1, -1 )
if( nb.le.1 ) nb = max( 1, n )
ieeeck
Checks if the infinity and NaN arithmetic is safe.
Called by ilaenv.
Syntax
ival = ieeeck( ispec, zero, one )
Include Files
mkl.fi
Description
The function ieeeck is called from ilaenv to verify that infinity and possibly NaN arithmetic is safe, that is,
will not trap.
Input Parameters
ispec INTEGER.
Specifies whether to test just for infinity arithmetic or both for infinity and
NaN arithmetic:
If ispec = 0: Verify infinity arithmetic only.
If ispec = 1: Verify infinity and NaN arithmetic.
zero REAL. Must contain the value 0.0.
This is passed to prevent the compiler from optimizing away this code.
one REAL. Must contain the value 1.0.
This is passed to prevent the compiler from optimizing away this code.
Output Parameters
ival INTEGER.
If ival = 0: Arithmetic failed to produce the correct answers.
If ival = 1: Arithmetic produced the correct answers.
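A minimal call sketch is shown below, assuming the program is linked against MKL or another LAPACK library. The values 0.0 and 1.0 are passed through variables rather than literals, in keeping with the stated purpose of the zero and one arguments. The program name is illustrative.

```fortran
program check_ieee
  implicit none
  integer, external :: ieeeck   ! provided by the LAPACK library
  real    :: zero, one
  integer :: ival

  zero = 0.0
  one  = 1.0

  ! ispec = 1: verify both infinity and NaN arithmetic.
  ival = ieeeck(1, zero, one)

  if (ival == 1) then
    print *, 'Infinity and NaN arithmetic produced the correct answers.'
  else
    print *, 'Infinity or NaN arithmetic failed the checks.'
  end if
end program check_ieee
```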
?labad
Returns the square root of the underflow and overflow
thresholds if the exponent-range is very large.
Syntax
call slabad( small, large )
call dlabad( small, large )
Include Files
mkl.fi
Description
The routine takes as input the values computed by slamch/dlamch for underflow and overflow, and returns
the square root of each of these values if the log of large is sufficiently large. This subroutine is intended to
identify machines with a large exponent range, such as the Crays, and redefine the underflow and overflow
limits to be the square roots of the values computed by ?lamch. This subroutine is needed because ?lamch
does not compensate for poor arithmetic in the upper half of the exponent range, as is found on a Cray.
Input Parameters
small REAL for slabad

DOUBLE PRECISION for dlabad.
The underflow threshold as computed by ?lamch.
large REAL for slabad

DOUBLE PRECISION for dlabad.
The overflow threshold as computed by ?lamch.
Output Parameters
small On exit, if log10(large) is sufficiently large, the square root of small,
otherwise unchanged.
large On exit, if log10(large) is sufficiently large, the square root of large,
otherwise unchanged.
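The adjustment can be sketched without calling the library. The threshold of 2000 used below is an assumption taken from the reference LAPACK sources as recalled here; the document only says "sufficiently large", and MKL may differ. The routine name labad_ref is hypothetical, and tiny/huge serve as stand-ins for the thresholds that ?lamch would return.

```fortran
program labad_sketch
  implicit none
  double precision :: small, large

  ! Stand-ins for the underflow and overflow thresholds from dlamch.
  small = tiny(1.0d0)
  large = huge(1.0d0)

  call labad_ref(small, large)
  print *, 'small =', small
  print *, 'large =', large

contains

  ! Hypothetical re-implementation of the documented behavior.
  subroutine labad_ref(small, large)
    double precision, intent(inout) :: small, large

    ! Assumed threshold: only exponent ranges far beyond IEEE double
    ! precision trigger the square-root adjustment.
    if (log10(large) > 2000.0d0) then
      small = sqrt(small)
      large = sqrt(large)
    end if
  end subroutine labad_ref

end program labad_sketch
```

Under that assumption, IEEE double precision values pass through unchanged, because log10(large) is roughly 308.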
?lamch
Determines machine parameters for floating-point
arithmetic.
Syntax
val = slamch( cmach )
val = dlamch( cmach )
Include Files
mkl.fi
Description
The function ?lamch determines single precision and double precision machine parameters.
Input Parameters
cmach CHARACTER*1. Specifies the value to be returned by ?lamch:

= 'E' or 'e', val = eps
= 'S' or 's', val = sfmin
= 'B' or 'b', val = base
= 'P' or 'p', val = eps*base
= 'N' or 'n', val = t
= 'R' or 'r', val = rnd
= 'M' or 'm', val = emin
= 'U' or 'u', val = rmin
= 'L' or 'l', val = emax
= 'O' or 'o', val = rmax
where
eps = relative machine precision;
sfmin = safe minimum, such that 1/sfmin does not overflow;
base = base of the machine;
prec = eps*base;
t = number of (base) digits in the mantissa;
rnd = 1.0 when rounding occurs in addition, 0.0 otherwise;
emin = minimum exponent before (gradual) underflow;
rmin = underflow threshold, that is, base**(emin-1);
emax = largest exponent before overflow;
rmax = overflow threshold, that is, (base**emax)*(1-eps).
NOTE
You can use a character string for cmach instead of a single character
in order to make your code more readable. The first character of the
string determines the value to be returned. For example, 'Precision'
is interpreted as 'p'.
Output Parameters
val REAL for slamch

DOUBLE PRECISION for dlamch
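A short usage sketch follows, assuming linkage against MKL or another LAPACK library. It queries several parameters and shows the string form of cmach mentioned in the NOTE above. The program name is illustrative, and the printed values depend on the platform.

```fortran
program query_machine_params
  implicit none
  double precision, external :: dlamch   ! provided by the LAPACK library

  print *, 'eps   (relative machine precision) =', dlamch('E')
  print *, 'sfmin (safe minimum)               =', dlamch('S')
  print *, 'base  (base of the machine)        =', dlamch('B')
  print *, 't     (digits in the mantissa)     =', dlamch('N')
  print *, 'rmin  (underflow threshold)        =', dlamch('U')
  print *, 'rmax  (overflow threshold)         =', dlamch('O')

  ! Only the first character is examined, so a descriptive string
  ! selects the same value as 'P'.
  print *, 'prec  (eps*base)                   =', dlamch('Precision')
end program query_machine_params
```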
?lamc1
Called from ?lamc2. Determines machine parameters
given by beta, t, rnd, ieee1.
Syntax
call slamc1( beta, t, rnd, ieee1 )
call dlamc1( beta, t, rnd, ieee1 )
Include Files
mkl.fi
Description
The routine ?lamc1 determines machine parameters given by beta, t, rnd, ieee1.
Output Parameters
beta INTEGER. The base of the machine.
t INTEGER. The number of (beta) digits in the mantissa.
rnd LOGICAL.
Specifies whether proper rounding ( rnd = .TRUE. ) or chopping ( rnd
= .FALSE. ) occurs in addition. This may not be a reliable guide to the way
in which the machine performs its arithmetic.
ieee1 LOGICAL.
Specifies whether rounding appears to be done in the IEEE 'round to
nearest' style.
?lamc2
Used by ?lamch. Determines machine parameters
specified in its arguments list.
Syntax
call slamc2( beta, t, rnd, eps, emin, rmin, emax, rmax )
call dlamc2( beta, t, rnd, eps, emin, rmin, emax, rmax )
Include Files
mkl.fi
Description
The routine ?lamc2 determines machine parameters specified in its arguments list.
Output Parameters
beta INTEGER. The base of the machine.
t INTEGER. The number of (beta) digits in the mantissa.
rnd LOGICAL.
Specifies whether proper rounding (rnd = .TRUE.) or chopping (rnd
= .FALSE.) occurs in addition. This may not be a reliable guide to the way
in which the machine performs its arithmetic.
eps REAL for slamc2

DOUBLE PRECISION for dlamc2
The smallest positive number such that
fl(1.0 - eps) < 1.0,
where fl denotes the computed value.
emin INTEGER. The minimum exponent before (gradual) underflow occurs.
rmin REAL for slamc2

DOUBLE PRECISION for dlamc2
The smallest normalized number for the machine, given by
base**(emin-1),
where base is the floating point value of beta.
emax INTEGER. The maximum exponent before overflow occurs.
rmax REAL for slamc2

DOUBLE PRECISION for dlamc2
The largest positive number for the machine, given by (base**emax)*(1 - eps),
where base is the floating point value of beta.
?lamc3
Called from ?lamc1-?lamc5. Intended to force a and
b to be stored prior to doing the addition of a and b.
Syntax
val = slamc3( a, b )
val = dlamc3( a, b )
Include Files
mkl.fi
Description
The routine is intended to force A and B to be stored prior to doing the addition of A and B, for use in
situations where optimizers might hold one of these in a register.
Input Parameters
a, b REAL for slamc3

DOUBLE PRECISION for dlamc3
The values a and b.
Output Parameters
val REAL for slamc3

DOUBLE PRECISION for dlamc3
The result of adding values a and b.
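The idea can be illustrated with a small self-contained sketch that routes every addition through a function call, so that each sum is returned as a working-precision value. The names lamc3_sketch and eps_probe are hypothetical. Whether a given compiler inlines the internal function (and thereby defeats the purpose) depends on optimization settings, which is why such a helper is normally compiled as a separate routine.

```fortran
program eps_probe
  implicit none
  real :: eps

  ! Halve eps until adding half of it to 1.0 no longer changes the
  ! stored result.
  eps = 1.0
  do while (lamc3_sketch(1.0, eps/2.0) > 1.0)
    eps = eps/2.0
  end do

  print *, 'probed eps       =', eps
  print *, 'intrinsic epsilon =', epsilon(1.0)

contains

  ! Hypothetical stand-in for slamc3: returns a + b through a function
  ! result so that the sum is rounded to REAL before it is compared.
  real function lamc3_sketch(a, b)
    real, intent(in) :: a, b
    lamc3_sketch = a + b
  end function lamc3_sketch

end program eps_probe
```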
?lamc4
This is a service routine for ?lamc2.
Syntax
call slamc4( emin, start, base )
call dlamc4( emin, start, base )
Include Files
mkl.fi
Description
This is a service routine for ?lamc2.
Input Parameters
start REAL for slamc4

DOUBLE PRECISION for dlamc4
The starting point for determining emin.
base INTEGER. The base of the machine.
Output Parameters
emin INTEGER. The minimum exponent before (gradual) underflow, computed by
setting a = start and dividing by base until the previous a cannot be
recovered.
?lamc5
Called from ?lamc2. Attempts to compute the largest
machine floating-point number, without overflow.
Syntax
call slamc5( beta, p, emin, ieee, emax, rmax)
call dlamc5( beta, p, emin, ieee, emax, rmax)
Include Files
mkl.fi
Description
The routine ?lamc5 attempts to compute rmax, the largest machine floating-point number, without overflow.
It assumes that emax + abs(emin) sum approximately to a power of 2. It will fail on machines where this
assumption does not hold, for example, the Cyber 205 (emin = -28625, emax = 28718). It will also fail if
the value supplied for emin is too large (that is, too close to zero), probably with overflow.
Input Parameters
beta INTEGER. The base of floating-point arithmetic.
p INTEGER. The number of base beta digits in the mantissa of a floating-point
value.
emin INTEGER. The minimum exponent before (gradual) underflow.
ieee LOGICAL. A logical flag specifying whether or not the arithmetic system is
thought to comply with the IEEE standard.
Output Parameters
emax INTEGER. The largest exponent before overflow.
rmax REAL for slamc5

DOUBLE PRECISION for dlamc5
The largest machine floating-point number.
chla_transtype
Translates a BLAST-specified integer constant to the
character string specifying a transposition operation.
Syntax
val = chla_transtype( trans )
Include Files
mkl.fi
Description
The chla_transtype function translates a BLAST-specified integer constant to the character string
specifying a transposition operation.
The function returns a CHARACTER*1. If the input is not an integer indicating a transposition operator, then
val is 'X'. Otherwise, the function returns the constant value corresponding to trans.
Input Parameters
trans INTEGER.
If trans = BLAS_NO_TRANS = 111: No transpose.
If trans = BLAS_TRANS = 112: Transpose.
If trans = BLAS_CONJ_TRANS = 113: Conjugate Transpose.
Output Parameters
val CHARACTER*1
Character that specifies a transposition operation.
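A brief usage sketch follows, assuming linkage against MKL or another LAPACK library. The integer constants are the values listed under Input Parameters; the parameter names and the program name are illustrative.

```fortran
program transtype_demo
  implicit none
  character(len=1), external :: chla_transtype   ! provided by LAPACK
  integer, parameter :: blas_no_trans   = 111
  integer, parameter :: blas_trans      = 112
  integer, parameter :: blas_conj_trans = 113

  print *, 'BLAS_NO_TRANS   -> ', chla_transtype(blas_no_trans)
  print *, 'BLAS_TRANS      -> ', chla_transtype(blas_trans)
  print *, 'BLAS_CONJ_TRANS -> ', chla_transtype(blas_conj_trans)

  ! Any other integer is not a transposition constant; per the
  ! description above the function then returns 'X'.
  print *, '0               -> ', chla_transtype(0)
end program transtype_demo
```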
iladiag
Translates a character string specifying whether a
matrix has a unit diagonal to the relevant BLAST-
specified integer constant.
Syntax
val = iladiag( diag )
Include Files
mkl.fi
Description
The iladiag function translates a character string specifying whether a matrix has a unit diagonal or not to
the relevant BLAST-specified integer constant.
The function returns an INTEGER. If val < 0, the input is not a character indicating a unit or non-unit
diagonal. Otherwise, the function returns the constant value corresponding to diag.
Input Parameters
diag CHARACTER*1.
If diag = 'N': A is non-unit triangular.
If diag = 'U': A is unit triangular.
Output Parameters
val INTEGER
ilaprec
Translates a character string specifying an
intermediate precision to the relevant BLAST-specified
integer constant.
Syntax
val = ilaprec( prec )
Include Files
mkl.fi
Description
The ilaprec function translates a character string specifying an intermediate precision to the relevant
BLAST-specified integer constant.
The function returns an INTEGER. If val < 0, the input is not a character indicating a supported intermediate
precision. Otherwise, the function returns the constant value corresponding to prec.
Input Parameters
prec CHARACTER*1.
If prec = 'S': Single.
If prec = 'D': Double.
If prec = 'I': Indigenous.
If prec = 'X', 'E': Extra.
Output Parameters
val INTEGER
ilatrans
Translates a character string specifying a transposition
operation to the BLAST-specified integer constant.
Syntax
val = ilatrans( trans )
Include Files
mkl.fi
Description
The ilatrans function translates a character string specifying a transposition operation to the BLAST-
specified integer constant.
The function returns an INTEGER. If val < 0, the input is not a character indicating a transposition operator.
Otherwise, the function returns the constant value corresponding to trans.
Input Parameters
trans CHARACTER*1.
If trans = 'N': No transpose.
If trans = 'T': Transpose.
If trans = 'C': Conjugate Transpose.
Output Parameters
val INTEGER
Integer constant that specifies a transposition operation.
ilauplo
Translates a character string specifying an upper- or
lower-triangular matrix to the relevant BLAST-
specified integer constant.
Syntax
val = ilauplo( uplo )
Include Files
mkl.fi
Description
The ilauplo function translates a character string specifying an upper- or lower-triangular matrix to the
relevant BLAST-specified integer constant.
The function returns an INTEGER. If val < 0, the input is not a character indicating an upper- or lower-
triangular matrix. Otherwise, the function returns the constant value corresponding to uplo.
Input Parameters
uplo CHARACTER*1.
If uplo = 'U': A is upper triangular.
If uplo = 'L': A is lower triangular.
Output Parameters
val INTEGER
xerbla_array
Assists other languages in calling the xerbla function.
Syntax
call xerbla_array( srname_array, srname_len, info )
Include Files
mkl.fi
Description
The routine assists other languages in calling the error handling xerbla function. Rather than taking a
Fortran string argument as the function name, xerbla_array takes an array of single characters along with
the array length. The routine then copies up to 32 characters of that array into a Fortran string and passes
that to xerbla. If called with a non-positive srname_len, the routine will call xerbla with a string of all
blank characters.
If some macro or other device makes xerbla_array available to C99 by a name lapack_xerbla and with a
common Fortran calling convention, a C99 program could invoke xerbla via:
{
int flen = strlen(__func__);
lapack_xerbla(__func__, &flen, &info);
}
Providing xerbla_array is not necessary for intercepting LAPACK errors. xerbla_array calls xerbla.
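The copy step that xerbla_array performs before calling xerbla can be sketched as below. This is an illustration under stated assumptions: copy_name is a hypothetical helper that reproduces only the "copy up to 32 characters into a blank Fortran string" behavior described above, and it does not call xerbla.

```fortran
! Sketch only: the character-array-to-string copy described for xerbla_array.
program xerbla_array_sketch
  implicit none
  character(len=1)  :: name_chars(6)
  character(len=32) :: srname

  name_chars = ['d', 'g', 'e', 't', 'r', 'f']
  call copy_name(name_chars, size(name_chars), srname)
  print '(3a)', '[', srname, ']'

  ! A non-positive length yields a string of all blanks.
  call copy_name(name_chars, 0, srname)
  print '(3a)', '[', srname, ']'
contains
  subroutine copy_name(srname_array, srname_len, srname)
    character(len=1),  intent(in)  :: srname_array(*)
    integer,           intent(in)  :: srname_len
    character(len=32), intent(out) :: srname
    integer :: i
    srname = ' '                                ! start from all blanks
    do i = 1, min(srname_len, len(srname))      ! copy at most 32 characters
      srname(i:i) = srname_array(i)
    end do
  end subroutine copy_name
end program xerbla_array_sketch
```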
Input Parameters
srname_array CHARACTER(1).
Array, dimension (srname_len). The name of the routine that called
xerbla_array.
srname_len INTEGER.
The length of the name in srname_array.
info INTEGER.
Position of the invalid parameter in the parameter list of the calling routine.
LAPACK Test Functions and Routines

This section describes LAPACK test functions and routines.
Summary information about these routines is given in the following table:
LAPACK Test Routines
Routine Name   Data Types    Description
?lagge         s, d, c, z    Generates a general m-by-n matrix.
?laghe         c, z          Generates a complex Hermitian matrix.
?lagsy         s, d, c, z    Generates a symmetric matrix by pre- and post-multiplying a real
                             diagonal matrix with a random unitary matrix.
?latms         s, d, c, z    Generates a general m-by-n matrix with specific singular values.
?latms
Generates a general m-by-n matrix with specific
singular values.
Syntax
call slatms (m, n, dist, iseed, sym, d, mode, cond, dmax, kl, ku, pack, a, lda, work,
info)
call dlatms (m, n, dist, iseed, sym, d, mode, cond, dmax, kl, ku, pack, a, lda, work,
info)
call clatms (m, n, dist, iseed, sym, d, mode, cond, dmax, kl, ku, pack, a, lda, work,
info)
call zlatms (m, n, dist, iseed, sym, d, mode, cond, dmax, kl, ku, pack, a, lda, work,
info)
Description
The ?latms routine generates random matrices with specified singular values, or symmetric/Hermitian
matrices with specified eigenvalues for testing LAPACK programs.
It applies this sequence of operations:
1. Set the diagonal to d, where d is input or computed according to mode, cond, dmax, and sym as
described in Input Parameters.
2. Generate a matrix with the appropriate band structure, by one of two methods:
Method A: 1. Generate a dense m-by-n matrix by multiplying d on the left
          and the right by random unitary matrices.
          2. Reduce the bandwidth according to kl and ku, using
          Householder transformations.
Method B: Convert the bandwidth-0 (that is, diagonal) matrix to a bandwidth-1
          matrix using Givens rotations, "chasing" out-of-band elements
          back, much as in QR; then convert the bandwidth-1 matrix to a
          bandwidth-2 matrix, and so on.
          For reasonably small bandwidths (relative to m and n),
          this requires less storage, because a dense matrix is not generated.
          Also, for symmetric or Hermitian matrices, only one triangle is
          generated.
Method A is chosen if the bandwidth is a large fraction of the order of the matrix, and lda is at least m (so a
dense matrix can be stored). Method B is chosen if the bandwidth is small (less than (1/2)*n for symmetric
or Hermitian or less than .3*n+m for nonsymmetric), or lda is less than m and not less than the bandwidth.
3. Pack the matrix if desired, using one of the methods specified by the pack parameter.
If Method B is chosen and band format is specified, then the matrix is generated in the band format and no
repacking is necessary.
Input Parameters
m INTEGER. The number of rows of the matrix A (m ≥ 0).
n INTEGER. The number of columns of the matrix A (n ≥ 0).
dist CHARACTER*1. Specifies the type of distribution to be used to generate the

random singular values or eigenvalues:
'U': uniform distribution (0, 1)

'S': symmetric uniform distribution (-1, 1)
'N': normal distribution (0, 1)
iseed INTEGER. Array with size 4.

Specifies the seed of the random number generator. Values should lie
between 0 and 4095 inclusive, and iseed(4) should be odd. The random
number generator uses a linear congruential sequence limited to small
integers, and so should produce machine independent random numbers.
The values of the array are modified, and can be used in the next call to
?latms to continue the same random number sequence.
sym CHARACTER*1.
If sym='S' or 'H', the generated matrix is symmetric or Hermitian, with
eigenvalues specified by d, cond, mode, and dmax; they can be positive,
negative, or zero.
If sym='P', the generated matrix is symmetric or Hermitian, with
non-negative eigenvalues (which are also its singular values) specified by
d, cond, mode, and dmax.
If sym='N', the generated matrix is nonsymmetric, with non-negative
singular values specified by d, cond, mode, and dmax.
d REAL for slatms and clatms

DOUBLE PRECISION for dlatms and zlatms
Array, size (MIN(m , n))
This array is used to specify the singular values or eigenvalues of A (see the
description of sym). If mode=0, then d is assumed to contain the
eigenvalues or singular values, otherwise elements of d are computed
according to mode, cond, and dmax.
mode INTEGER. Describes how the singular/eigenvalues are specified.
mode = 0: use d as input
mode = 1: set d(1) = 1 and d(2:n) = 1.0/cond
mode = 2: set d(1:n - 1) = 1 and d(n) = 1.0/cond
mode = 3: set d(i) = cond**(-(i - 1)/(n - 1))
mode = 4: set d(i) = 1 - (i - 1)/(n - 1)*(1 - 1/cond)
mode = 5: set elements of d to random numbers in the range (1/cond, 1)
such that their logarithms are uniformly distributed.
mode = 6: set elements of d to random numbers from the same distribution
as the rest of the matrix.
mode < 0 has the same meaning as ABS(mode), except that the order of the
elements of d is reversed. Thus, if mode is positive, d has entries ranging
from 1 to 1/cond, if negative, from 1/cond to 1.
If sym='S' or 'H', and mode is not 0, 6, nor -6, then the elements of d are
also given a random sign (multiplied by +1 or -1).
cond REAL for slatms and clatms
DOUBLE PRECISION for dlatms and zlatms
Used in setting d as described for the mode parameter. If used, cond ≥ 1.
dmax REAL for slatms and clatms
DOUBLE PRECISION for dlatms and zlatms
If mode is not -6, 0, or 6, the contents of d, as computed according to mode
and cond, are scaled by dmax / max(abs(d(i))); thus, the maximum
absolute eigenvalue or singular value (the norm) is abs(dmax).
NOTE
dmax need not be positive: if dmax is negative (or zero), d will be
scaled by a negative number (or zero).
kl INTEGER. Specifies the lower bandwidth of the matrix. For example, kl=0
implies upper triangular, kl=1 implies upper Hessenberg, and kl being at
least m - 1 means that the matrix has full lower bandwidth. kl must equal
ku if the matrix is symmetric or Hermitian.
ku INTEGER. Specifies the upper bandwidth of the matrix. For example, ku=0
implies lower triangular, ku=1 implies lower Hessenberg, and ku being at
least n - 1 means that the matrix has full upper bandwidth. kl must equal
ku if the matrix is symmetric or Hermitian.
pack CHARACTER*1. Specifies packing of matrix:
'N': no packing
'U': zero out all subdiagonal entries (if symmetric or Hermitian)
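'L': zero out all superdiagonal entries (if symmetric or Hermitian)
'C': store the upper triangle columnwise (only if matrix symmetric,
Hermitian, or upper triangular)
'R': store the lower triangle columnwise (only if matrix symmetric,
Hermitian, or lower triangular)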
'B': store the lower triangle in band storage scheme (only if matrix
symmetric, Hermitian, or lower triangular)
'Q': store the upper triangle in band storage scheme (only if matrix
symmetric, Hermitian, or upper triangular)
'Z': store the entire matrix in band storage scheme (pivoting can be
provided for by using this option to store A in the trailing rows of the
allocated storage)
Using these options, the various LAPACK packed and banded storage
schemes can be obtained:
                                        'Z'  'B'  'Q'  'C'  'R'
GB: general band                         x
PB: symmetric positive definite band          x    x
SB: symmetric band                            x    x
HB: Hermitian band                            x    x
TB: triangular band                           x    x
PP: symmetric positive definite packed                  x    x
SP: symmetric packed                                    x    x
HP: Hermitian packed                                    x    x
TP: triangular packed                                   x    x
If two calls to ?latms differ only in the pack parameter, they generate
mathematically equivalent matrices.
lda INTEGER. lda specifies the first dimension of a as declared in the calling
program.
If pack='N', 'U', 'L', 'C', or 'R', then lda must be at least m.
If pack='B' or 'Q', then lda must be at least MIN(kl, m - 1) (which is
equal to MIN(ku, n - 1)).
If pack='Z', lda must be large enough to hold the packed array:
MIN(ku, n - 1) + MIN(kl, m - 1) + 1.
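The deterministic settings of the mode parameter (modes 1 through 4 and their negative counterparts) can be sketched as follows. This is an illustration only: set_d is a hypothetical helper, not part of the library; it assumes d has more than one element, and it omits the random sign applied when sym='S' or 'H' as well as the scaling by dmax.

```fortran
! Sketch only: how d is filled for mode = +/-1 .. +/-4, before any
! random sign or dmax scaling is applied.
program latms_mode_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: d(5)
  integer  :: mode

  do mode = 1, 4
    call set_d(mode, 100.0_dp, d)
    print '(a,i2,a,5f9.5)', 'mode =', mode, ':', d
  end do
  call set_d(-3, 100.0_dp, d)              ! negative mode reverses the order
  print '(a,i2,a,5f9.5)', 'mode =', -3, ':', d
contains
  subroutine set_d(mode, cond, d)
    integer,  intent(in)    :: mode
    real(dp), intent(in)    :: cond
    real(dp), intent(inout) :: d(:)        ! size(d) > 1 is assumed
    integer :: i, nd
    nd = size(d)
    select case (abs(mode))
    case (1)
      d = 1.0_dp / cond
      d(1) = 1.0_dp
    case (2)
      d = 1.0_dp
      d(nd) = 1.0_dp / cond
    case (3)
      do i = 1, nd
        d(i) = cond**(-real(i - 1, dp) / real(nd - 1, dp))
      end do
    case (4)
      do i = 1, nd
        d(i) = 1.0_dp - real(i - 1, dp) / real(nd - 1, dp) * (1.0_dp - 1.0_dp / cond)
      end do
    case default
      return                               ! modes 0, 5, 6 are not sketched here
    end select
    if (mode < 0) d = d(nd:1:-1)           ! reversed order for negative mode
  end subroutine set_d
end program latms_mode_sketch
```

For positive mode the entries run from 1 down to 1/cond, and for negative mode from 1/cond up to 1, which matches the description of the mode parameter.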
Output Parameters
iseed The array iseed contains the updated seed.
d The array d contains the singular values or eigenvalues computed according
to mode, cond, and dmax.
NOTE
The array d is not modified if mode = 0.
a REAL for slatms

DOUBLE PRECISION for dlatms
COMPLEX for clatms
DOUBLE COMPLEX for zlatms
Array of size lda by n.
The array a contains the generated m-by-n matrix A.

a is first generated in full (unpacked) form, and then packed, if so specified
by pack. Thus, the first m elements of the first n columns are always
modified. If pack specifies a packed or banded storage scheme, all lda
elements of the first n columns are modified; the elements of the array
which do not correspond to elements of the generated matrix are set to
zero.
work REAL for slatms

DOUBLE PRECISION for dlatms
COMPLEX for clatms
DOUBLE COMPLEX for zlatms
Array of size (3*MAX(n, m))
Workspace.

info INTEGER.
If info = 0, the execution is successful.
If info = 2, cannot scale to dmax (maximum singular value is 0).
If info = 3, error return from ?lagge, ?laghe, or ?lagsy.
Chapter 4: ScaLAPACK Routines
Intel Math Kernel Library implements routines from the ScaLAPACK package for distributed-memory
architectures. Routines are supported for both real and complex dense and band matrices to perform the
tasks of solving systems of linear equations, solving linear least-squares problems, eigenvalue and singular
value problems, as well as performing a number of related computational tasks.
Intel MKL ScaLAPACK routines are written in FORTRAN 77 with the exception of a few utility routines written in C
to exploit the IEEE arithmetic. All routines are available in all precision types: single precision, double
precision, complex, and double complex precision. See the mkl_scalapack.h header file for C declarations
of ScaLAPACK routines.
NOTE
ScaLAPACK routines are provided only for Intel 64 or Intel Many Integrated Core architectures.
See descriptions of ScaLAPACK computational routines that perform distinct computational tasks, as well as
driver routines for solving standard types of problems in one call. Additionally, Intel Math Kernel Library
implements ScaLAPACK Auxiliary Routines, Utility Functions and Routines, and Matrix Redistribution/Copy
Routines. The library includes routines for both real and complex data.
The <install_directory>/examples/scalapackf directory contains sample code demonstrating the use
of ScaLAPACK routines.
Generally, ScaLAPACK runs on a network of computers using MPI as a message-passing layer and a set of
prebuilt communication subprograms (BLACS), as well as a set of BLAS optimized for the target architecture.
Intel MKL version of ScaLAPACK is optimized for Intel processors. For the detailed system and environment
requirements, see Intel MKL Release Notes and Intel MKL Developer Guide.
For full reference on ScaLAPACK routines and related information, see [SLUG].
Overview of ScaLAPACK Routines

The model of the computing environment for ScaLAPACK is represented as a one-dimensional array of
processes (for operations on band or tridiagonal matrices) or a two-dimensional process grid (for
operations on dense matrices). To use ScaLAPACK, all global matrices or vectors must be distributed on this
array or grid before calling the ScaLAPACK routines.
ScaLAPACK is closely tied to other components, including BLAS, BLACS, LAPACK, and PBLAS.
ScaLAPACK Array Descriptors

ScaLAPACK uses two-dimensional block-cyclic data distribution as a layout for dense matrix computations.
This distribution provides good work balance between available processors, and also allows use of BLAS Level
3 routines for optimal local computations. Information about the data distribution that is required to establish
the mapping between each global matrix (array) and its corresponding process and memory location is
contained in the array called the array descriptor associated with each global matrix. The size of the array
descriptor is denoted as dlen_.
Let A be a two-dimensional block-cyclically distributed matrix with the array descriptor desca. The
meaning of each array descriptor element depends on the type of the matrix A. The tables "Array descriptor
for dense matrices" and "Array descriptor for narrow-band and tridiagonal matrices" describe the meaning of
each element for the different types of matrices.
Array descriptor for dense matrices (dlen_=9)
Element Name   Stored in        Description                                      Element Index Number
dtype_a        desca(dtype_)    Descriptor type (=1 for dense matrices).         1
ctxt_a         desca(ctxt_)     BLACS context handle for the process grid.       2
m_a            desca(m_)        Number of rows in the global matrix A.           3
n_a            desca(n_)        Number of columns in the global matrix A.        4
mb_a           desca(mb_)       Row blocking factor.                             5
nb_a           desca(nb_)       Column blocking factor.                          6
rsrc_a         desca(rsrc_)     Process row over which the first row of the      7
                                global matrix A is distributed.
csrc_a         desca(csrc_)     Process column over which the first column of    8
                                the global matrix A is distributed.
lld_a          desca(lld_)      Leading dimension of the local matrix A.         9
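The layout of the dense-matrix descriptor can be sketched by filling a 9-element array by hand. This is an illustration under stated assumptions: the named index constants mirror the element names in the table, the context handle is a placeholder (a real handle is returned by BLACS), and the matrix and block sizes are arbitrary. In a real program the ScaLAPACK tool routine descinit is the usual way to initialize a descriptor.

```fortran
! Sketch only: hand-filled dense array descriptor (dlen_ = 9).
program dense_descriptor_sketch
  implicit none
  integer, parameter :: dlen_ = 9
  integer, parameter :: dtype_ = 1, ctxt_ = 2, m_ = 3, n_ = 4, mb_ = 5, &
                        nb_ = 6, rsrc_ = 7, csrc_ = 8, lld_ = 9
  integer :: desca(dlen_)
  integer :: ictxt

  ictxt = 0                  ! placeholder; a real handle comes from BLACS

  desca(dtype_) = 1          ! dense matrix
  desca(ctxt_)  = ictxt      ! BLACS context handle
  desca(m_)     = 1000       ! rows of the global matrix (arbitrary example)
  desca(n_)     = 1000       ! columns of the global matrix
  desca(mb_)    = 64         ! row blocking factor
  desca(nb_)    = 64         ! column blocking factor
  desca(rsrc_)  = 0          ! process row holding the first row
  desca(csrc_)  = 0          ! process column holding the first column
  desca(lld_)   = 512        ! leading dimension of the local array (example value)

  print '(9i6)', desca
end program dense_descriptor_sketch
```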
Array descriptor for narrow-band and tridiagonal matrices (dlen_=7)
Element Name   Stored in        Description                                        Element Index Number
dtype_a        desca(dtype_)    Descriptor type:                                   1
                                dtype_a=501: 1-by-P grid,
                                dtype_a=502: P-by-1 grid.
ctxt_a         desca(ctxt_)     BLACS context handle indicating the BLACS          2
                                process grid over which the global matrix A is
                                distributed. The context itself is global, but the
                                handle (the integer value) can vary.
n_a            desca(n_)        The size of the matrix dimension being             3
                                distributed.
nb_a           desca(nb_)       The blocking factor used to distribute the         4
                                distributed dimension of the matrix A.
src_a          desca(src_)      The process row or column over which the first     5
                                row or column of the matrix A is distributed.
lld_a          desca(lld_)      The leading dimension of the local matrix          6
                                storing the local blocks of the distributed
                                matrix A. The minimum value of lld_a depends
                                on dtype_a.
                                dtype_a=501: lld_a ≥ max(size of
                                undistributed dimension, 1),
                                dtype_a=502: lld_a ≥ max(nb_a, 1).
Not applicable                  Reserved for future use.                           7
Similar notations are used for different matrices. For example: lld_b is the leading dimension of the local
matrix storing the local blocks of the distributed matrix B and dtype_z is the type of the global matrix Z.
The number of rows and columns of a global dense matrix that a particular process in a grid receives after
the data is distributed is denoted by LOCr() and LOCc(), respectively. To compute these numbers, you can use the
ScaLAPACK tool routine numroc.
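The quantity behind LOCr() and LOCc() can be sketched with a small stand-alone function. This is an illustration only: local_count is a hypothetical re-implementation of the block-cyclic counting rule, written so the example runs without the library; production code should call numroc itself.

```fortran
! Sketch only: number of rows (or columns) of a block-cyclically
! distributed dimension that land on one process.
program local_size_sketch
  implicit none
  integer :: p

  ! 10 rows, block size 3, 2 process rows, first block on process row 0:
  ! blocks 1-3 and 7-9 go to process row 0, blocks 4-6 and 10 to process row 1.
  do p = 0, 1
    print '(a,i0,a,i0,a)', 'process row ', p, ' holds ', &
          local_count(10, 3, p, 0, 2), ' rows'
  end do
contains
  integer function local_count(n, nb, iproc, isrcproc, nprocs) result(nloc)
    integer, intent(in) :: n, nb, iproc, isrcproc, nprocs
    integer :: mydist, nblocks, extrablks
    mydist    = mod(nprocs + iproc - isrcproc, nprocs)  ! distance from source process
    nblocks   = n / nb                                  ! number of full blocks
    nloc      = (nblocks / nprocs) * nb                 ! full rounds of blocks
    extrablks = mod(nblocks, nprocs)                    ! leftover full blocks
    if (mydist < extrablks) then
      nloc = nloc + nb                                  ! gets one extra full block
    else if (mydist == extrablks) then
      nloc = nloc + mod(n, nb)                          ! gets the partial last block
    end if
  end function local_count
end program local_size_sketch
```

In this example process row 0 holds 6 rows and process row 1 holds 4, which together account for all 10 global rows.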
After the block-cyclic distribution of global data is done, you may choose to perform an operation on a
submatrix sub(A) of the global matrix A defined by the following 6 values (for dense matrices):
m The number of rows of sub(A)
n The number of columns of sub(A)
a A pointer to the local matrix containing the entire global matrix A
ia The row index of sub(A) in the global matrix A
ja The column index of sub(A) in the global matrix A
desca The array descriptor for the global matrix A
Naming Conventions for ScaLAPACK Routines

For each routine introduced in this chapter, you can use the ScaLAPACK name. The naming convention for
ScaLAPACK routines is similar to that used for LAPACK routines. A general rule is that each routine name in
ScaLAPACK, which has an LAPACK equivalent, is simply the LAPACK name prefixed by initial letter p.
ScaLAPACK names have the structure p?yyzzz or p?yyzz, which is described below.
The initial letter p is a distinctive prefix of ScaLAPACK routines and is present in each such routine.
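The second symbol ? indicates the data type:
s real, single precision
d real, double precision
c complex, single precision
z complex, double precision
The third and fourth letters yy indicate the matrix type as: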
ge general
gb general band
gg a pair of general matrices (for a generalized problem)
dt general tridiagonal (diagonally dominant-like)
db general band (diagonally dominant-like)
po symmetric or Hermitian positive-definite
pb symmetric or Hermitian positive-definite band
pt symmetric or Hermitian positive-definite tridiagonal
sy symmetric
st symmetric tridiagonal (real)
he Hermitian
or orthogonal
tr triangular (or quasi-triangular)
tz trapezoidal
un unitary
For computational routines, the last three letters zzz indicate the computation performed and have the same
meaning as for LAPACK routines.
For driver routines, the last two letters zz or three letters zzz have the following meaning:
sv a simple driver for solving a linear system
svx an expert driver for solving a linear system
ls a driver for solving a linear least squares problem
ev a simple driver for solving a symmetric eigenvalue problem
evd a simple driver for solving an eigenvalue problem using a divide and conquer
algorithm
evx an expert driver for solving a symmetric eigenvalue problem
svd a driver for computing a singular value decomposition
gvx an expert driver for solving a generalized symmetric definite eigenvalue problem
Simple driver here means that the driver just solves the general problem, whereas an expert driver is more
versatile and can also optionally perform some related computations (for example, refining the
solution and computing error bounds after the linear system is solved).
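The p?yyzzz structure can be illustrated by splitting a routine name into its parts. This is a minimal sketch using the name pdgetrf as an example; the variable names are arbitrary.

```fortran
! Sketch only: splitting a ScaLAPACK routine name into its naming parts.
program name_parts_sketch
  implicit none
  character(len=*), parameter :: routine = 'pdgetrf'

  print '(2a)', 'prefix      : ', routine(1:1)   ! p   ScaLAPACK prefix
  print '(2a)', 'data type   : ', routine(2:2)   ! d   double precision real
  print '(2a)', 'matrix type : ', routine(3:4)   ! ge  general
  print '(2a)', 'computation : ', routine(5:)    ! trf same meaning as in LAPACK
end program name_parts_sketch
```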
ScaLAPACK Computational Routines

In the sections that follow, the descriptions of ScaLAPACK computational routines are given. These routines
perform distinct computational tasks that can be used for:
Solving Systems of Linear Equations
Orthogonal Factorizations and LLS Problems
Symmetric Eigenproblems
Nonsymmetric Eigenproblems
Generalized Symmetric-Definite Eigenproblems
See also the respective driver routines.
Systems of Linear Equations: ScaLAPACK Computational Routines

ScaLAPACK supports routines for the systems of equations with the following types of matrices:
general
general banded
general diagonally dominant-like banded (including general tridiagonal)
symmetric or Hermitian positive-definite
symmetric or Hermitian positive-definite banded
symmetric or Hermitian positive-definite tridiagonal
A diagonally dominant-like matrix is defined as a matrix for which it is known in advance that pivoting is not
required in the LU factorization of this matrix.
For the above matrix types, the library includes routines for performing the following computations: factoring
the matrix; equilibrating the matrix; solving a system of linear equations; estimating the condition number of
a matrix; refining the solution of linear equations and computing its error bounds; inverting the matrix. Note
that for some of the listed matrix types only part of the computational routines are provided (for example,
routines that refine the solution are not provided for band or tridiagonal matrices). See Table Computational
Routines for Systems of Linear Equations for full list of available routines.
To solve a particular problem, you can either call two or more computational routines or call a corresponding
driver routine that combines several tasks in one call. Thus, to solve a system of linear equations with a
general matrix, you can first call p?getrf (LU factorization) and then p?getrs (computing the solution).
Then, you might wish to call p?gerfs to refine the solution and get the error bounds. Alternatively, you can
just use the driver routine p?gesvx which performs all these tasks in one call.
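The two-step path (factorize with p?getrf, then solve with p?getrs) can be sketched as follows for double precision. This is a hedged outline, not a tested program: it assumes an MPI environment with the BLACS and ScaLAPACK libraries linked, and it uses the routines blacs_pinfo, blacs_get, blacs_gridinit, blacs_gridinfo, blacs_gridexit, blacs_exit, descinit, numroc, and indxl2g, which are not described in this section. The 1-by-nprocs grid, the matrix size, the block size, and the test matrix are arbitrary choices for illustration.

```fortran
! Sketch only: LU factorization followed by a solve on a 1-by-nprocs grid.
program lu_solve_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 8, nrhs = 1, nb = 2
  integer :: ictxt, iam, nprocs, nprow, npcol, myrow, mycol
  integer :: mloc, nloc, rloc, lld, info, i, j, ig, jg
  integer :: desca(9), descb(9)
  integer,  allocatable :: ipiv(:)
  real(dp), allocatable :: a(:,:), b(:,:)
  integer, external :: numroc, indxl2g

  ! Set up the process grid.
  call blacs_pinfo(iam, nprocs)
  nprow = 1
  npcol = nprocs
  call blacs_get(-1, 0, ictxt)
  call blacs_gridinit(ictxt, 'R', nprow, npcol)
  call blacs_gridinfo(ictxt, nprow, npcol, myrow, mycol)

  ! Local sizes and array descriptors.
  mloc = numroc(n, nb, myrow, 0, nprow)
  nloc = numroc(n, nb, mycol, 0, npcol)
  rloc = numroc(nrhs, nb, mycol, 0, npcol)
  lld  = max(1, mloc)
  call descinit(desca, n, n,    nb, nb, 0, 0, ictxt, lld, info)
  call descinit(descb, n, nrhs, nb, nb, 0, 0, ictxt, lld, info)

  allocate(a(lld, max(1, nloc)), b(lld, max(1, rloc)), ipiv(mloc + nb))

  ! Fill the local part of a diagonally dominant test matrix:
  ! n on the diagonal, 1 elsewhere.
  do j = 1, nloc
    jg = indxl2g(j, nb, mycol, 0, npcol)
    do i = 1, mloc
      ig = indxl2g(i, nb, myrow, 0, nprow)
      a(i, j) = merge(real(n, dp), 1.0_dp, ig == jg)
    end do
  end do
  b = 1.0_dp

  ! Step 1: factorize. Step 2: solve, only if the factorization succeeded.
  call pdgetrf(n, n, a, 1, 1, desca, ipiv, info)
  if (info == 0) then
    call pdgetrs('N', n, nrhs, a, 1, 1, desca, ipiv, b, 1, 1, descb, info)
  end if
  if (iam == 0) print *, 'info =', info

  call blacs_gridexit(ictxt)
  call blacs_exit(0)
end program lu_solve_sketch
```

Keeping the two calls separate lets the same factors be reused for several right-hand sides; the driver route trades that flexibility for a single call.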
Table Computational Routines for Systems of Linear Equations lists the ScaLAPACK computational routines
for factorizing, equilibrating, and inverting matrices, estimating their condition numbers, solving systems of
equations with real matrices, refining the solution, and estimating its error.
Computational Routines for Systems of Linear Equations
Matrix type, storage scheme               Factorize   Equilibrate Solve       Condition   Estimate    Invert
                                          matrix      matrix      system      number      error       matrix
general (partial pivoting)                p?getrf     p?geequ     p?getrs     p?gecon     p?gerfs     p?getri
general band (partial pivoting)           p?gbtrf                 p?gbtrs
general band (no pivoting)                p?dbtrf                 p?dbtrs
general tridiagonal (no pivoting)         p?dttrf                 p?dttrs
symmetric/Hermitian positive-definite     p?potrf     p?poequ     p?potrs     p?pocon     p?porfs     p?potri
symmetric/Hermitian positive-definite,    p?pbtrf                 p?pbtrs
band
symmetric/Hermitian positive-definite,    p?pttrf                 p?pttrs
tridiagonal
triangular                                                        p?trtrs     p?trcon     p?trrfs     p?trtri
In this table ? stands for s (single precision real), d (double precision real), c (single precision complex), or z
(double precision complex).
Matrix Factorization: ScaLAPACK Computational Routines

This section describes the ScaLAPACK routines for matrix factorization. The following factorizations are
supported:
LU factorization of general matrices

LU factorization of diagonally dominant-like matrices
Cholesky factorization of real symmetric or complex Hermitian positive-definite matrices
You can compute the factorizations using full and band storage of matrices.
p?getrf
Computes the LU factorization of a general m-by-n
distributed matrix.
Syntax
call psgetrf(m, n, a, ia, ja, desca, ipiv, info)
call pdgetrf(m, n, a, ia, ja, desca, ipiv, info)
call pcgetrf(m, n, a, ia, ja, desca, ipiv, info)
call pzgetrf(m, n, a, ia, ja, desca, ipiv, info)
Include Files
Description
The p?getrf routine forms the LU factorization of a general m-by-n distributed matrix sub(A) = A(ia:ia+m-1,
ja:ja+n-1) as
sub(A) = P*L*U
where P is a permutation matrix, L is lower triangular with unit diagonal elements (lower trapezoidal if m > n)
and U is upper triangular (upper trapezoidal if m < n). L and U are stored in sub(A).
The routine uses partial pivoting, with row interchanges.
NOTE
This routine supports the Progress Routine feature. See mkl_progress for details.
Input Parameters
m (global) INTEGER. The number of rows in the distributed matrix sub(A);
m ≥ 0.
n (global) INTEGER. The number of columns in the distributed matrix sub(A);
n ≥ 0.
a (local)
REAL for psgetrf
DOUBLE PRECISION for pdgetrf
COMPLEX for pcgetrf
DOUBLE COMPLEX for pzgetrf.
Pointer into the local memory to an array of local size (lld_a, LOCc(ja+n-1)).
Contains the local pieces of the distributed matrix sub(A) to be factored.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
Output Parameters
a Overwritten by local pieces of the factors L and U from the factorization
sub(A) = P*L*U. The unit diagonal elements of L are not stored.
ipiv (local) INTEGER array of size LOCr(m_a) + mb_a.
Contains the pivoting information: local row i was interchanged with global
row ipiv(i). This array is tied to the distributed matrix A.
info (global) INTEGER.
If info = 0, the execution is successful.
info < 0: if the i-th argument is an array and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and had an
illegal value, then info = -i.
If info = i > 0, U(ia+i-1, ja+i-1) is 0. The factorization has been completed,
but the factor U is exactly singular. Division by zero will occur if you use the
factor U for solving a system of linear equations.
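The info encoding described above can be decoded with a few integer operations. This is a minimal sketch: explain is a hypothetical helper, and it relies on the assumption that a routine has fewer than 100 arguments, so that any value below -99 denotes an array argument.

```fortran
! Sketch only: decoding the info value returned by p?getrf.
program info_decode_sketch
  implicit none
  call explain(-602)   ! entry 2 of argument 6 (desca in the p?getrf argument list)
  call explain(-3)     ! scalar argument 3
  call explain(0)
  call explain(4)
contains
  subroutine explain(info)
    integer, intent(in) :: info
    if (info == 0) then
      print '(a)', 'successful exit'
    else if (info < -99) then
      ! info = -(i*100 + j): array argument i, entry j
      print '(a,i0,a,i0,a)', 'entry ', mod(-info, 100), ' of array argument ', &
            (-info) / 100, ' had an illegal value'
    else if (info < 0) then
      ! info = -i: scalar argument i
      print '(a,i0,a)', 'scalar argument ', -info, ' had an illegal value'
    else
      print '(a,i0,a)', 'diagonal element ', info, ' of U is exactly zero'
    end if
  end subroutine explain
end program info_decode_sketch
```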
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?gbtrf
Computes the LU factorization of a general n-by-n
banded distributed matrix.
Syntax
call psgbtrf(n, bwl, bwu, a, ja, desca, ipiv, af, laf, work, lwork, info)
call pdgbtrf(n, bwl, bwu, a, ja, desca, ipiv, af, laf, work, lwork, info)
call pcgbtrf(n, bwl, bwu, a, ja, desca, ipiv, af, laf, work, lwork, info)
call pzgbtrf(n, bwl, bwu, a, ja, desca, ipiv, af, laf, work, lwork, info)
Include Files
Description
The p?gbtrf routine computes the LU factorization of a general n-by-n real/complex banded distributed
matrix A(1:n, ja:ja+n-1) using partial pivoting with row interchanges.
The resulting factorization is not the same factorization as returned from the LAPACK routine ?gbtrf.
Additional permutations are performed on the matrix for the sake of parallelism.
The factorization has the form:
A(1:n, ja:ja+n-1) = P*L*U*Q
where P and Q are permutation matrices, and L and U are banded lower and upper triangular matrices,
respectively. The matrix Q represents reordering of columns for the sake of parallelism, while P represents
reordering of rows for numerical stability using classic partial pivoting.
Input Parameters
n (global) INTEGER. The number of rows and columns in the distributed
submatrix A(1:n, ja:ja+n-1); n ≥ 0.
bwl (global) INTEGER. The number of sub-diagonals within the band of A
(0 ≤ bwl ≤ n-1).
bwu (global) INTEGER. The number of super-diagonals within the band of A
(0 ≤ bwu ≤ n-1).
a (local)
REAL for psgbtrf
DOUBLE PRECISION for pdgbtrf
COMPLEX for pcgbtrf
DOUBLE COMPLEX for pzgbtrf.
Pointer into the local memory to an array of local size (lld_a, LOCc(ja+n-1)), where
lld_a ≥ 2*bwl + 2*bwu + 1.
Contains the local pieces of the n-by-n distributed banded matrix A(1:n,
ja:ja+n-1) to be factored.
ja (global) INTEGER. The index in the global matrix A indicating the start of
the matrix to be operated on (which may be either all of A or a submatrix of
A).
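desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
If dtype_a = 501, then dlen_ ≥ 7;
else if dtype_a = 1, then dlen_ ≥ 9.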
laf (local) INTEGER. The size of the array af.
Must be laf ≥ (nb_a+bwu)*(bwl+bwu)+6*(bwl+bwu)*(bwl+2*bwu).
If laf is not large enough, an error code will be returned and the minimum
acceptable size will be returned in af(1).
work (local) Same type as a. Workspace array of size lwork.
lwork (local or global) INTEGER. The size of the work array (lwork ≥ 1). If lwork
is too small, the minimal acceptable size will be returned in work(1) and an
error code is returned.
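The size bounds stated for lld_a and laf can be evaluated directly before allocating the arrays. This is a small arithmetic sketch; the values of nb_a, bwl, and bwu are arbitrary examples.

```fortran
! Sketch only: minimum lld_a and laf for p?gbtrf from the stated bounds.
program gbtrf_sizes_sketch
  implicit none
  integer, parameter :: nb_a = 64, bwl = 2, bwu = 3
  integer :: lld_min, laf_min

  lld_min = 2*bwl + 2*bwu + 1
  laf_min = (nb_a + bwu)*(bwl + bwu) + 6*(bwl + bwu)*(bwl + 2*bwu)

  print '(a,i0)', 'minimum lld_a = ', lld_min
  print '(a,i0)', 'minimum laf   = ', laf_min
end program gbtrf_sizes_sketch
```

For these example values the bounds evaluate to lld_a ≥ 11 and laf ≥ 575.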
Output Parameters
a On exit, this array contains details of the factorization. Note that additional
permutations are performed on the matrix, so that the factors returned are
different from those returned by LAPACK.
ipiv (local) INTEGER array.
The size of ipiv must be nb_a.
Contains pivot indices for local factorizations. Note that you should not alter
the contents of this array between factorization and solve.
af (local)
REAL for psgbtrf
DOUBLE PRECISION for pdgbtrf
COMPLEX for pcgbtrf
DOUBLE COMPLEX for pzgbtrf.
Array of size laf.
Auxiliary fill-in space. The fill-in space is created in a call to the factorization
routine p?gbtrf and is stored in af.
Note that if a linear system is to be solved using p?gbtrs after the
factorization routine, af must not be altered after the factorization.
work(1) On exit, work(1) contains the minimum value of lwork required.
info INTEGER.
If info = 0, the execution is successful.
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
info > 0:
If info = k ≤ NPROCS, the submatrix stored on processor info and factored
locally was not nonsingular, and the factorization was not completed.
If info = k > NPROCS, the submatrix stored on processor info-NPROCS
representing interactions with other processors was not nonsingular, and
the factorization was not completed.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.
p?dbtrf
Computes the LU factorization of an n-by-n diagonally
dominant-like banded distributed matrix.
Syntax
call psdbtrf(n, bwl, bwu, a, ja, desca, af, laf, work, lwork, info)
call pddbtrf(n, bwl, bwu, a, ja, desca, af, laf, work, lwork, info)
call pcdbtrf(n, bwl, bwu, a, ja, desca, af, laf, work, lwork, info)
call pzdbtrf(n, bwl, bwu, a, ja, desca, af, laf, work, lwork, info)
Include Files
Description
The p?dbtrf routine computes the LU factorization of an n-by-n real/complex diagonally dominant-like banded
distributed matrix A(1:n, ja:ja+n-1) without pivoting.
NOTE
A matrix is called diagonally dominant-like if pivoting is not required for LU to be numerically stable.
Note that the resulting factorization is not the same factorization as returned from LAPACK. Additional
permutations are performed on the matrix for the sake of parallelism.
Input Parameters
n (global) INTEGER. The number of rows and columns in the distributed
submatrix A(1:n, ja:ja+n-1); n ≥ 0.
bwl (global) INTEGER. The number of sub-diagonals within the band of A
(0 ≤ bwl ≤ n-1).
bwu (global) INTEGER. The number of super-diagonals within the band of A
(0 ≤ bwu ≤ n-1).
a (local)
REAL for psdbtrf
DOUBLE PRECISION for pddbtrf
COMPLEX for pcdbtrf
DOUBLE COMPLEX for pzdbtrf.
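Pointer into the local memory to an array of local size (lld_a, LOCc(ja+n-1)).
Contains the local pieces of the n-by-n distributed banded matrix A(1:n,
ja:ja+n-1) to be factored.
ja (global) INTEGER. The index in the global matrix A indicating the start of
the matrix to be operated on (which may be either all of A or a submatrix of
A).
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
laf (local) INTEGER. The size of the array af.
Must be laf ≥ NB*(bwl+bwu)+6*(max(bwl,bwu))**2.
If laf is not large enough, an error code will be returned and the minimum
acceptable size will be returned in af(1).
work (local) Same type as a. Workspace array of size lwork.
lwork (local or global) INTEGER. The size of the work array, must be lwork ≥
(max(bwl,bwu))**2. If lwork is too small, the minimal acceptable size will
be returned in work(1) and an error code is returned.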
Output Parameters
a On exit, this array contains details of the factorization. Note that additional
permutations are performed on the matrix, so that the factors returned are
different from those returned by LAPACK.
af (local)
REAL for psdbtrf
DOUBLE PRECISION for pddbtrf
COMPLEX for pcdbtrf
DOUBLE COMPLEX for pzdbtrf.
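Array of size laf.
Auxiliary fill-in space. The fill-in space is created in a call to the factorization
routine p?dbtrf and is stored in af.
Note that if a linear system is to be solved using p?dbtrs after the
factorization routine, af must not be altered after the factorization.
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
info INTEGER.
If info = 0, the execution is successful.
info < 0:
If the i-th argument is an array and the j-th entry had an illegal value, then
info = -(i*100+j); if the i-th argument is a scalar and had an illegal value,
then info = -i.
info > 0:
If info = k ≤ NPROCS, the submatrix stored on processor info and factored
locally was not diagonally dominant-like, and the factorization was not
completed.
See Also
Overview of ScaLAPACK Routines for details of ScaLAPACK array descriptor structures and related
notations.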
p?dttrf
Computes the LU factorization of a diagonally
dominant-like tridiagonal distributed matrix.
Syntax
call psdttrf(n, dl, d, du, ja, desca, af, laf, work, lwork, info)
call pddttrf(n, dl, d, du, ja, desca, af, laf, work, lwork, info)
call pcdttrf(n, dl, d, du, ja, desca, af, laf, work, lwork, info)
call pzdttrf(n, dl, d, du, ja, desca, af, laf, work, lwork, info)
Include Files
Description
The p?dttrf routine computes the LU factorization of an n-by-n real/complex diagonally dominant-like
tridiagonal distributed matrix A(1:n, ja:ja+n-1) without pivoting for stability.
The resulting factorization is not the same factorization as returned from LAPACK. Additional permutations
are performed on the matrix for the sake of parallelism.
The factorization has the form:
A(1:n, ja:ja+n-1) = P*L*U*P^T,
where P is a permutation matrix, and L and U are banded lower and upper triangular matrices, respectively.
Input Parameters
n (global) INTEGER. The number of rows and columns to be operated on, that
is, the order of the distributed submatrix A(1:n, ja:ja+n-1) (n ≥ 0).
dl, d, du (local)
REAL for psdttrf
DOUBLE PRECISION for pddttrf
COMPLEX for pcdttrf
DOUBLE COMPLEX for pzdttrf.
Pointers to the local arrays of size nb_a each.
On entry, the array dl contains the local part of the global vector storing
the subdiagonal elements of the matrix. Globally, dl(1) is not referenced,
and dl must be aligned with d.
On entry, the array d contains the local part of the global vector storing the
diagonal elements of the matrix.
On entry, the array du contains the local part of the global vector storing
the super-diagonal elements of the matrix. du(n) is not referenced, and du
must be aligned with d.
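ja (global) INTEGER. The index in the global matrix A indicating the start of
the matrix to be operated on (which may be either all of A or a submatrix of
A).
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
laf (local) INTEGER. The size of the array af.
Must be laf ≥ 2*(NB+2).
If laf is not large enough, an error code will be returned and the minimum
acceptable size will be returned in af(1).
work (local) Same type as d. Workspace array of size lwork.
lwork (local or global) INTEGER. The size of the work array, must be
lwork ≥ 8*NPCOL.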
Output Parameters
dl, d, du On exit, overwritten by the information containing the factors of the matrix.
af (local)
REAL for psdttrf
DOUBLE PRECISION for pddttrf
COMPLEX for pcdttrf
DOUBLE COMPLEX for pzdttrf.
Array of size laf.
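Auxiliary fill-in space. The fill-in space is created in a call to the factorization
routine p?dttrf and is stored in af.
Note that if a linear system is to be solved using p?dttrs after the
factorization routine, af must not be altered.
info < 0:
if the i-th argument is an array and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and
had an illegal value, then info = -i.
info > 0: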
locally was not diagonally dominant-like, and the factorization was not
completed.
See Also
notations.
p?potrf
Computes the Cholesky factorization of a symmetric
(Hermitian) positive-definite distributed matrix.
Syntax
call pspotrf(uplo, n, a, ia, ja, desca, info)
call pdpotrf(uplo, n, a, ia, ja, desca, info)
call pcpotrf(uplo, n, a, ia, ja, desca, info)
call pzpotrf(uplo, n, a, ia, ja, desca, info)
Include Files
Description
The p?potrf routine computes the Cholesky factorization of a real symmetric or complex Hermitian positive-
definite distributed n-by-n matrix A(ia:ia+n-1, ja:ja+n-1), denoted below as sub(A).
The factorization has the form:
sub(A) = U^H*U if uplo='U', or
sub(A) = L*L^H if uplo='L'
Input Parameters
uplo (global) CHARACTER*1.
Indicates whether the upper or lower triangular part of sub(A) is stored.
Must be 'U' or 'L'.
If uplo = 'U', the array a stores the upper triangular part of the matrix
sub(A) that is factored as U^H*U.
If uplo = 'L', the array a stores the lower triangular part of the
matrix sub(A) that is factored as L*L^H.
n (global) INTEGER. The order of the distributed matrix sub(A) (n ≥ 0).
a (local)
REAL for pspotrf
DOUBLE PRECISION for pdpotrf
COMPLEX for pcpotrf
DOUBLE COMPLEX for pzpotrf.
Pointer into the local memory to an array of size (lld_a, LOCc(ja+n-1)).
On entry, this array contains the local pieces of the n-by-n symmetric/
Hermitian distributed matrix sub(A) to be factored.
Depending on uplo, the array a contains either the upper or the lower
triangular part of the matrix sub(A) (see uplo).
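ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.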
Output Parameters
a The upper or lower triangular part of a is overwritten by the Cholesky factor
U or L, as specified by uplo.
info (global) INTEGER.
If info=0, the execution is successful;
info < 0: if the i-th argument is an array and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and had an
illegal value, then info = -i.
If info = k > 0, the leading minor of order k, A(ia:ia+k-1, ja:ja+k-1), is
not positive-definite, and the factorization could not be completed.
See Also
notations.
p?pbtrf
Computes the Cholesky factorization of a symmetric
(Hermitian) positive-definite banded distributed
matrix.
Syntax
call pspbtrf(uplo, n, bw, a, ja, desca, af, laf, work, lwork, info)
call pdpbtrf(uplo, n, bw, a, ja, desca, af, laf, work, lwork, info)
call pcpbtrf(uplo, n, bw, a, ja, desca, af, laf, work, lwork, info)
call pzpbtrf(uplo, n, bw, a, ja, desca, af, laf, work, lwork, info)
Include Files
Description
The p?pbtrf routine computes the Cholesky factorization of an n-by-n real symmetric or complex Hermitian
positive-definite banded distributed matrix A(1:n, ja:ja+n-1).
The factorization has the form:
A(1:n, ja:ja+n-1) = P*U^H*U*P^T, if uplo='U', or
A(1:n, ja:ja+n-1) = P*L*L^H*P^T, if uplo='L',
where P is a permutation matrix and U and L are banded upper and lower triangular matrices, respectively.
Input Parameters
uplo (global) CHARACTER*1. Must be 'U' or 'L'.
If uplo = 'U', upper triangle of A(1:n, ja:ja+n-1) is stored;
If uplo = 'L', lower triangle of A(1:n, ja:ja+n-1) is stored.
n (global) INTEGER. The order of the distributed submatrix A(1:n, ja:ja+n-1)
(n ≥ 0).
bw (global) INTEGER.
The number of superdiagonals of the distributed matrix if uplo = 'U', or
the number of subdiagonals if uplo = 'L' (bw ≥ 0).
a (local)
REAL for pspbtrf
DOUBLE PRECISION for pdpbtrf
COMPLEX for pcpbtrf
DOUBLE COMPLEX for pzpbtrf.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+n-1)).
On entry, this array contains the local pieces of the upper or lower triangle
of the symmetric/Hermitian band distributed matrix A(1:n, ja:ja+n-1) to
be factored.
A).
Must be laf ≥ (NB+2*bw)*bw.
lwork (local or global) INTEGER. The size of the work array, must be lwork ≥ bw^2.
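The two size bounds for p?pbtrf can be evaluated before allocating af and work. The following is a minimal sketch, assuming the bounds read laf ≥ (NB+2*bw)*bw and lwork ≥ bw*bw; the module and function names are hypothetical helpers and are not part of the library, and nb stands for the column block size NB of the distribution.

```fortran
module pbtrf_workspace
  implicit none
contains

  ! Smallest laf satisfying laf >= (NB+2*bw)*bw (assumed bound).
  ! nb: column block size of the distribution, bw: number of
  ! superdiagonals (uplo='U') or subdiagonals (uplo='L').
  pure integer function pbtrf_min_laf(nb, bw) result(laf)
    integer, intent(in) :: nb, bw
    laf = (nb + 2*bw)*bw
  end function pbtrf_min_laf

  ! Smallest lwork satisfying lwork >= bw**2 (assumed bound).
  pure integer function pbtrf_min_lwork(bw) result(lwork)
    integer, intent(in) :: bw
    lwork = bw*bw
  end function pbtrf_min_lwork

end module pbtrf_workspace
```

A caller would typically allocate af(pbtrf_min_laf(nb, bw)) and work(pbtrf_min_lwork(bw)) and pass the same two values as laf and lwork.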
Output Parameters
a On exit, if info=0, contains the permuted triangular factor U or L from the

Cholesky factorization of the band matrix A(1:n, ja:ja+n-1), as specified
by uplo.
af (local)
REAL for pspbtrf
DOUBLE PRECISION for pdpbtrf
COMPLEX for pcpbtrf
DOUBLE COMPLEX for pzpbtrf.
Array of size laf. Auxiliary fill-in space. The fill-in space is created in a call
to the factorization routine p?pbtrf and stored in af. Note that if a linear
system is to be solved using p?pbtrs after the factorization routine, af
must not be altered.
info < 0:
if the i-th argument is an array and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and
had an illegal value, then info = -i.
info > 0:
locally was not positive definite, and the factorization was not completed.
See Also
notations.
p?pttrf
Computes the Cholesky factorization of a symmetric
(Hermitian) positive-definite tridiagonal distributed
matrix.
Syntax
call pspttrf(n, d, e, ja, desca, af, laf, work, lwork, info)
call pdpttrf(n, d, e, ja, desca, af, laf, work, lwork, info)
call pcpttrf(n, d, e, ja, desca, af, laf, work, lwork, info)
call pzpttrf(n, d, e, ja, desca, af, laf, work, lwork, info)
Include Files
Description
The p?pttrf routine computes the Cholesky factorization of an n-by-n real symmetric or complex Hermitian
positive-definite tridiagonal distributed matrix A(1:n, ja:ja+n-1).
The factorization has the form:
A(1:n, ja:ja+n-1) = P*L*D*L^H*P^T, or
A(1:n, ja:ja+n-1) = P*U^H*D*U*P^T,
where P is a permutation matrix, and U and L are tridiagonal upper and lower triangular matrices,
respectively.
Input Parameters
n (global) INTEGER. The order of the distributed submatrix A(1:n, ja:ja+n-1)
(n ≥ 0).
d, e (local)
REAL for pspttrf
DOUBLE PRECISION for pdpttrf
COMPLEX for pcpttrf
DOUBLE COMPLEX for pzpttrf.
Pointers into the local memory to arrays of size nb_a each.
On entry, the array d contains the local part of the global vector storing the
main diagonal of the distributed matrix A.
On entry, the array e contains the local part of the global vector storing the
upper diagonal of the distributed matrix A.
A).
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.
Must be laf ≥ nb_a+2.
work (local) Same type as d and e. Workspace array of size lwork.
lwork ≥ 8*NPCOL.
Output Parameters
d, e On exit, overwritten by the details of the factorization.
af (local)
REAL for pspttrf
DOUBLE PRECISION for pdpttrf
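COMPLEX for pcpttrf
DOUBLE COMPLEX for pzpttrf.
Array of size laf.
Auxiliary fill-in space. The fill-in space is created in a call to the factorization
routine p?pttrf and stored in af.
Note that if a linear system is to be solved using p?pttrs after the
factorization routine, af must not be altered.
info < 0:
if the i-th argument is an array and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and
had an illegal value, then info = -i.
info > 0: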
locally was not positive definite, and the factorization was not completed.
See Also
notations.
Solving Systems of Linear Equations: ScaLAPACK Computational Routines

This section describes the ScaLAPACK routines for solving systems of linear equations. Before calling most of
these routines, you need to factorize the matrix of your system of equations (see Routines for Matrix
Factorization in this chapter). However, the factorization is not necessary if your system of equations has a
triangular matrix.
p?getrs
Solves a system of distributed linear equations with a
general square matrix, using the LU factorization
computed by p?getrf.
Syntax
call psgetrs(trans, n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
call pdgetrs(trans, n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
call pcgetrs(trans, n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
call pzgetrs(trans, n, nrhs, a, ia, ja, desca, ipiv, b, ib, jb, descb, info)
Include Files
Description
The p?getrs routine solves a system of distributed linear equations with a general n-by-n distributed matrix
sub(A) = A(ia:ia+n-1, ja:ja+n-1) using the LU factorization computed by p?getrf.
The system has one of the following forms specified by trans:
sub(A)*X = sub(B) (no transpose),
sub(A)^T*X = sub(B) (transpose),
sub(A)^H*X = sub(B) (conjugate transpose),
where sub(B) = B(ib:ib+n-1, jb:jb+nrhs-1).
Before calling this routine, you must call p?getrf to compute the LU factorization of sub(A).
Input Parameters
trans (global) CHARACTER*1. Must be 'N' or 'T' or 'C'.

If trans = 'N', then sub(A)*X = sub(B) is solved for X.
If trans = 'T', then sub(A)^T*X = sub(B) is solved for X.
If trans = 'C', then sub(A)^H*X = sub(B) is solved for X.
n (global) INTEGER. The number of linear equations; the order of the matrix
sub(A) (n ≥ 0).
nrhs (global) INTEGER. The number of right hand sides; the number of columns
of the distributed matrix sub(B) (nrhs ≥ 0).
a, b (local)
REAL for psgetrs
DOUBLE PRECISION for pdgetrs
COMPLEX for pcgetrs
DOUBLE COMPLEX for pzgetrs.
Pointers into the local memory to arrays of local sizes (lld_a,LOCc(ja+n-1))
and (lld_b,LOCc(jb+nrhs-1)), respectively.
On entry, the array a contains the local pieces of the factors L and U from
the factorization sub(A) = P*L*U; the unit diagonal elements of L are not
stored. On entry, the array b contains the right hand sides sub(B).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
ipiv (local) INTEGER array of size LOCr(m_a) + mb_a. Contains the pivoting
information: local row i of the matrix was interchanged with the global row
ipiv(i).
This array is tied to the distributed matrix A.
ib, jb (global) INTEGER. The row and column indices in the global matrix B
indicating the first row and the first column of the matrix sub(B),
respectively.
descb (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix B.
Output Parameters
b On exit, overwritten by the solution distributed matrix X.
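info INTEGER. If info=0, the execution is successful.
info < 0:
if the i-th argument is an array and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and
had an illegal value, then info = -i.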
See Also
notations.
p?gbtrs
Solves a system of distributed linear equations with a
general band matrix, using the LU factorization
computed by p?gbtrf.
Syntax
call psgbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, af, laf,
work, lwork, info)
call pdgbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, af, laf,
work, lwork, info)
call pcgbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, af, laf,
work, lwork, info)
call pzgbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, ipiv, b, ib, descb, af, laf,
work, lwork, info)
Include Files
Description
The p?gbtrs routine solves a system of distributed linear equations with a general band distributed matrix
sub(A) = A(1:n, ja:ja+n-1) using the LU factorization computed by p?gbtrf.
The system has one of the following forms specified by trans:
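sub(A)*X = sub(B) (no transpose),
sub(A)^T*X = sub(B) (transpose),
sub(A)^H*X = sub(B) (conjugate transpose),
where sub(B) = B(ib:ib+n-1, 1:nrhs).
Before calling this routine, you must call p?gbtrf to compute the LU factorization of sub(A).
Input Parameters
trans (global) CHARACTER*1. Must be 'N' or 'T' or 'C'.
If trans = 'N', then sub(A)*X = sub(B) is solved for X.
If trans = 'T', then sub(A)^T*X = sub(B) is solved for X.
If trans = 'C', then sub(A)^H*X = sub(B) is solved for X.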
n (global) INTEGER. The number of linear equations; the order of the
distributed matrix sub(A) (n ≥ 0).
bwl (global) INTEGER. The number of sub-diagonals within the band of A
(0 ≤ bwl ≤ n-1).
bwu (global) INTEGER. The number of super-diagonals within the band of A
(0 ≤ bwu ≤ n-1).
nrhs (global) INTEGER. The number of right hand sides; the number of columns
of the distributed matrix sub(B) (nrhs ≥ 0).
a, b (local)
REAL for psgbtrs
DOUBLE PRECISION for pdgbtrs
COMPLEX for pcgbtrs
DOUBLE COMPLEX for pzgbtrs.
Pointers into the local memory to arrays of local sizes (lld_a,LOCc(ja+n-1))
and (lld_b,LOCc(nrhs)), respectively.
The array a contains details of the LU factorization of the distributed band
matrix A.
On entry, the array b contains the local pieces of the right hand sides
B(ib:ib+n-1, 1:nrhs).
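ja (global) INTEGER. The index in the global matrix A indicating the start of
the matrix to be operated on (which may be either all of A or a submatrix
of A).
ib (global) INTEGER. The row index in the global matrix B indicating the first
row of the matrix to be operated on (which may be either all of B or a
submatrix of B).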
If dtype_b = 502, then dlen_ ≥ 7;
else if dtype_b = 1, then dlen_ ≥ 9.
Must be laf ≥ nb_a*(bwl+bwu)+6*(bwl+bwu)*(bwl+2*bwu).
lwork ≥ nrhs*(nb_a+2*bwl+4*bwu).
Output Parameters
ipiv (local) INTEGER array.
The size of ipiv must be nb_a.
Contains pivot indices for local factorizations. Note that you should not alter
the contents of this array between factorization and solve.
b On exit, overwritten by the local pieces of the solution distributed matrix X.
af (local)
REAL for psgbtrs
DOUBLE PRECISION for pdgbtrs
COMPLEX for pcgbtrs
DOUBLE COMPLEX for pzgbtrs.
Array of size laf.
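Auxiliary fill-in space. The fill-in space is created in a call to the
factorization routine p?gbtrf and is stored in af.
Note that if a linear system is to be solved using p?gbtrs after the
factorization routine, af must not be altered.
info < 0:
if the i-th argument is an array and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and
had an illegal value, then info = -i.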
See Also
notations.
p?dbtrs
Solves a system of linear equations with a diagonally
dominant-like banded distributed matrix using the
factorization computed by p?dbtrf.
Syntax
call psdbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pddbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pcdbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pzdbtrs(trans, n, bwl, bwu, nrhs, a, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
Include Files
Description
The p?dbtrs routine solves for X one of the systems of equations:
sub(A)*X = sub(B),
(sub(A))^T*X = sub(B), or
(sub(A))^H*X = sub(B),
where sub(A) = A(1:n, ja:ja+n-1) is a diagonally dominant-like banded distributed matrix, and sub(B)
denotes the distributed matrix B(ib:ib+n-1, 1:nrhs).
This routine uses the LU factorization computed by p?dbtrf.
Input Parameters

trans (global) CHARACTER*1. Must be 'N' or 'T' or 'C'.
If trans = 'N', then sub(A)*X = sub(B) is solved for X.
If trans = 'T', then (sub(A))^T*X = sub(B) is solved for X.
If trans = 'C', then (sub(A))^H*X = sub(B) is solved for X.
n (global) INTEGER. The order of the distributed matrix sub(A) (n ≥ 0).
bwl (global) INTEGER. The number of subdiagonals within the band of A
(0 ≤ bwl ≤ n-1).
bwu (global) INTEGER. The number of superdiagonals within the band of A
(0 ≤ bwu ≤ n-1).
a, b (local)
REAL for psdbtrs
DOUBLE PRECISION for pddbtrs
COMPLEX for pcdbtrs
DOUBLE COMPLEX for pzdbtrs.
Pointers into the local memory to arrays of local sizes (lld_a,LOCc(ja+n-1))
and (lld_b,LOCc(nrhs)), respectively.
On entry, the array a contains details of the LU factorization of the band
matrix A, as computed by p?dbtrf.
On entry, the array b contains the local pieces of the right hand side
distributed matrix sub(B).
A).
ib (global) INTEGER. The row index in the global matrix B indicating the first
row of the matrix to be operated on (which may be either all of B or a
submatrix of B).
af, work (local)

REAL for psdbtrs
DOUBLE PRECISION for pddbtrs
COMPLEX for pcdbtrs
DOUBLE COMPLEX for pzdbtrs.
Arrays of size laf and lwork, respectively. The array af contains auxiliary
fill-in space. The fill-in space is created in a call to the factorization routine
p?dbtrf and is stored in af.
The array work is a workspace array.
Must be laf ≥ NB*(bwl+bwu)+6*(max(bwl,bwu))^2.
lwork (local or global) INTEGER. The size of the array work, must be at least
lwork ≥ (max(bwl,bwu))^2.
Output Parameters
b On exit, this array contains the local pieces of the solution distributed
matrix X.
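info INTEGER. If info=0, the execution is successful.
info < 0:
if the i-th argument is an array and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and
had an illegal value, then info = -i.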
See Also
notations.
p?dttrs
Solves a system of linear equations with a diagonally
dominant-like tridiagonal distributed matrix using the
factorization computed by p?dttrf.
Syntax
call psdttrs(trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pddttrs(trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pcdttrs(trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
call pzdttrs(trans, n, nrhs, dl, d, du, ja, desca, b, ib, descb, af, laf, work,
lwork, info)
Include Files
Description
The p?dttrs routine solves for X one of the systems of equations:
sub(A)*X = sub(B),
(sub(A))^T*X = sub(B), or
(sub(A))^H*X = sub(B),
where sub(A) = A(1:n, ja:ja+n-1) is a diagonally dominant-like tridiagonal distributed matrix, and sub(B)
denotes the distributed matrix B(ib:ib+n-1, 1:nrhs).
This routine uses the LU factorization computed by p?dttrf.
Input Parameters

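trans (global) CHARACTER*1. Must be 'N' or 'T' or 'C'.
If trans = 'N', then sub(A)*X = sub(B) is solved for X.
If trans = 'T', then (sub(A))^T*X = sub(B) is solved for X.
If trans = 'C', then (sub(A))^H*X = sub(B) is solved for X.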
dl, d, du (local)
REAL for psdttrs
DOUBLE PRECISION for pddttrs
COMPLEX for pcdttrs
DOUBLE COMPLEX for pzdttrs.
Pointers to the local arrays of size nb_a each.
On entry, these arrays contain details of the factorization. Globally, dl(1)
and du(n) are not referenced; dl and du must be aligned with d.
A).
If dtype_a = 501 or dtype_a = 502, then dlen_ ≥ 7;
b (local) Same type as d.
Pointer into the local memory to an array of local size
(lld_b,LOCc(nrhs)).
On entry, the array b contains the local pieces of the n-by-nrhs right hand
side distributed matrix sub(B).
submatrix of B).
af, work (local) REAL for psdttrs
DOUBLE PRECISION for pddttrs

COMPLEX for pcdttrs
DOUBLE COMPLEX for pzdttrs.
Arrays of size laf and (lwork), respectively.
The array af contains auxiliary fill-in space. The fill-in space is created in a
call to the factorization routine p?dttrf and is stored in af. If a linear
system is to be solved using p?dttrs after the factorization routine, af
must not be altered.
Must be laf ≥ NB*(bwl+bwu)+6*(bwl+bwu)*(bwl+2*bwu).
lwork ≥ 10*NPCOL+4*nrhs.
Output Parameters
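b On exit, this array contains the local pieces of the solution distributed
matrix X.
info INTEGER. If info=0, the execution is successful.
info < 0:
if the i-th argument is an array and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and
had an illegal value, then info = -i.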
See Also
notations.
p?potrs
Solves a system of linear equations with a Cholesky-
factored symmetric/Hermitian distributed positive-
definite matrix.
Syntax
call pspotrs(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pdpotrs(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pcpotrs(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pzpotrs(uplo, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
Include Files
Description
The p?potrs routine solves for X a system of distributed linear equations in the form:
sub(A)*X = sub(B),
where sub(A) = A(ia:ia+n-1, ja:ja+n-1) is an n-by-n real symmetric or complex Hermitian positive
definite distributed matrix, and sub(B) denotes the distributed matrix B(ib:ib+n-1, jb:jb+nrhs-1).
This routine uses Cholesky factorization
sub(A) = U^H*U, or sub(A) = L*L^H
computed by p?potrf.
Input Parameters
uplo (global) CHARACTER*1. Must be 'U' or 'L'.
If uplo = 'U', upper triangle of sub(A) is stored;
If uplo = 'L', lower triangle of sub(A) is stored.
a, b (local)
REAL for pspotrs
DOUBLE PRECISION for pdpotrs
COMPLEX for pcpotrs
DOUBLE COMPLEX for pzpotrs.
Pointers into the local memory to arrays of local sizes
(lld_a,LOCc(ja+n-1)) and (lld_b,LOCc(jb+nrhs-1)), respectively.
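The array a contains the factors L or U from the Cholesky factorization
sub(A) = L*L^H or sub(A) = U^H*U, as computed by p?potrf.
On entry, the array b contains the local pieces of the right hand sides
sub(B).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
ib, jb (global) INTEGER. The row and column indices in the global matrix B
indicating the first row and the first column of the matrix sub(B),
respectively.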
descb (local) INTEGER array of size dlen_. The array descriptor for the distributed
matrix B.
Output Parameters
b Overwritten by the local pieces of the solution matrix X.

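info INTEGER. If info=0, the execution is successful.
info < 0: if the i-th argument is an array and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and had an
illegal value, then info = -i.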
See Also
notations.
p?pbtrs
Solves a system of linear equations with a Cholesky-
factored symmetric/Hermitian positive-definite band
matrix.
Syntax
call pspbtrs(uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf, work, lwork,
info)
call pdpbtrs(uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf, work, lwork,
info)
call pcpbtrs(uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf, work, lwork,
info)
call pzpbtrs(uplo, n, bw, nrhs, a, ja, desca, b, ib, descb, af, laf, work, lwork,
info)
Include Files
Description
The p?pbtrs routine solves for X a system of distributed linear equations in the form:
sub(A)*X = sub(B),
where sub(A) = A(1:n, ja:ja+n-1) is an n-by-n real symmetric or complex Hermitian positive definite
distributed band matrix, and sub(B) denotes the distributed matrix B(ib:ib+n-1, 1:nrhs).
This routine uses Cholesky factorization
sub(A) = P*U^H*U*P^T, or sub(A) = P*L*L^H*P^T
computed by p?pbtrf.
Input Parameters
bw (global) INTEGER. The number of superdiagonals of the distributed matrix if
uplo = 'U', or the number of subdiagonals if uplo = 'L' (bw ≥ 0).
a, b (local)
REAL for pspbtrs
DOUBLE PRECISION for pdpbtrs
COMPLEX for pcpbtrs
DOUBLE COMPLEX for pzpbtrs.
Pointers into the local memory to arrays of local sizes (lld_a,LOCc(ja+n-1))
and (lld_b,LOCc(nrhs-1)), respectively.
The array a contains the permuted triangular factor U or L from the
Cholesky factorization sub(A) = P*U^H*U*P^T, or sub(A) = P*L*L^H*P^T of the
band matrix A, as returned by p?pbtrf.
On entry, the array b contains the local pieces of the n-by-nrhs right hand
side distributed matrix sub(B).
A).
row of the matrix sub(B).
af, work (local) Arrays, same type as a.
The array af is of size laf. It contains auxiliary fill-in space. The fill-in
space is created in a call to the factorization routine p?pbtrf and is stored
in af.
The array work is a workspace array of size lwork.
Must be laf ≥ nrhs*bw.
lwork ≥ bw^2.
Output Parameters
b On exit, if info=0, this array contains the local pieces of the n-by-nrhs
solution distributed matrix X.

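info < 0:
if the i-th argument is an array and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and
had an illegal value, then info = -i.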
See Also
notations.
p?pttrs
Solves a system of linear equations with a symmetric
(Hermitian) positive-definite tridiagonal distributed
matrix using the factorization computed by p?pttrf.
Syntax
call pspttrs(n, nrhs, d, e, ja, desca, b, ib, descb, af, laf, work, lwork, info)
call pdpttrs(n, nrhs, d, e, ja, desca, b, ib, descb, af, laf, work, lwork, info)
call pcpttrs(uplo, n, nrhs, d, e, ja, desca, b, ib, descb, af, laf, work, lwork, info)
call pzpttrs(uplo, n, nrhs, d, e, ja, desca, b, ib, descb, af, laf, work, lwork, info)
Include Files
Description
The p?pttrs routine solves for X a system of distributed linear equations in the form:
sub(A)*X = sub(B),
where sub(A) = A(1:n, ja:ja+n-1) is an n-by-n real symmetric or complex Hermitian positive definite
tridiagonal distributed matrix, and sub(B) denotes the distributed matrix B(ib:ib+n-1, 1:nrhs).
This routine uses the factorization
sub(A) = P*L*D*L^H*P^T, or sub(A) = P*U^H*D*U*P^T
computed by p?pttrf.
Input Parameters
uplo (global, used in complex flavors only)

CHARACTER*1. Must be 'U' or 'L'.
d, e (local)
REAL for pspttrs
DOUBLE PRECISION for pdpttrs
COMPLEX for pcpttrs
DOUBLE COMPLEX for pzpttrs.
Pointers into the local memory to arrays of size nb_a each.
These arrays contain details of the factorization as returned by p?pttrf.
A).
If dtype_a = 501 or dtype_a = 502, then dlen_ ≥ 7;
b (local) Same type as d, e.
Pointer into the local memory to an array of local size
(lld_b,LOCc(nrhs)).
On entry, the array b contains the local pieces of the n-by-nrhs right hand
side distributed matrix sub(B).
submatrix of B).
af, work (local) REAL for pspttrs
DOUBLE PRECISION for pdpttrs

COMPLEX for pcpttrs
DOUBLE COMPLEX for pzpttrs.
Arrays of size laf and (lwork), respectively. The array af contains
auxiliary fill-in space. The fill-in space is created in a call to the factorization
routine p?pttrf and is stored in af.
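Must be laf ≥ nb_a+2.
If laf is not large enough, an error code is returned and the minimum
acceptable size is returned in af(1).
lwork ≥ (10+2*min(100,nrhs))*NPCOL+4*nrhs.
Output Parameters
b On exit, this array contains the local pieces of the solution distributed
matrix X.
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.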
info < 0:
if the i-th argument is an array and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and
had an illegal value, then info = -i.
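The negative info convention described above can be decoded mechanically. The following is a minimal sketch, assuming that scalar arguments always have an index below 100 and that array entry indices j lie between 1 and 99, so the two encodings never collide; the module and function names are hypothetical and are not part of the library.

```fortran
module scalapack_info_sketch
  implicit none
contains

  ! Position i of the offending argument for info < 0; 0 when info >= 0.
  pure integer function info_argument(info) result(i)
    integer, intent(in) :: info
    if (info >= 0) then
      i = 0
    else if (-info >= 100) then
      i = (-info)/100        ! array case: info = -(i*100+j)
    else
      i = -info              ! scalar case: info = -i
    end if
  end function info_argument

  ! Entry j of the offending array argument; 0 when the argument
  ! was a scalar or when info >= 0.
  pure integer function info_entry(info) result(j)
    integer, intent(in) :: info
    if (info <= -100) then
      j = mod(-info, 100)
    else
      j = 0
    end if
  end function info_entry

end module scalapack_info_sketch
```

For example, info = -602 decodes to argument 6, entry 2, while info = -3 decodes to scalar argument 3.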
See Also
notations.
p?trtrs
Solves a system of linear equations with a triangular
distributed matrix.
Syntax
call pstrtrs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pdtrtrs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pctrtrs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
call pztrtrs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, info)
Include Files
Description
The p?trtrs routine solves for X one of the following systems of linear equations:
sub(A)*X = sub(B),
(sub(A))^T*X = sub(B), or
(sub(A))^H*X = sub(B),
where sub(A) = A(ia:ia+n-1, ja:ja+n-1) is a triangular distributed matrix of order n, and sub(B) denotes
the distributed matrix B(ib:ib+n-1, jb:jb+nrhs-1).
A check is made to verify that sub(A) is nonsingular.
Input Parameters
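uplo (global) CHARACTER*1. Must be 'U' or 'L'.
Indicates whether sub(A) is upper or lower triangular:
If uplo = 'U', then sub(A) is upper triangular.
If uplo = 'L', then sub(A) is lower triangular.
trans (global) CHARACTER*1. Must be 'N' or 'T' or 'C'.
If trans = 'N', then sub(A)*X = sub(B) is solved for X.
If trans = 'T', then sub(A)^T*X = sub(B) is solved for X.
If trans = 'C', then sub(A)^H*X = sub(B) is solved for X.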
diag (global) CHARACTER*1. Must be 'N' or 'U'.
If diag = 'N', then sub(A) is not a unit triangular matrix.
If diag = 'U', then sub(A) is unit triangular.
nrhs (global) INTEGER. The number of right-hand sides; i.e., the number of
columns of the distributed matrix sub(B) (nrhs ≥ 0).
a, b (local)
REAL for pstrtrs
DOUBLE PRECISION for pdtrtrs
COMPLEX for pctrtrs
DOUBLE COMPLEX for pztrtrs.
Pointers into the local memory to arrays of local sizes (lld_a,LOCc(ja+n-1))
and (lld_b,LOCc(jb+nrhs-1)), respectively.
The array a contains the local pieces of the distributed triangular matrix
sub(A).
If uplo = 'U', the leading n-by-n upper triangular part of sub(A) contains
the upper triangular matrix, and the strictly lower triangular part of sub(A)
is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of sub(A) contains
the lower triangular matrix, and the strictly upper triangular part of sub(A)
is not referenced.
If diag = 'U', the diagonal elements of sub(A) are also not referenced
and are assumed to be 1.
On entry, the array b contains the local pieces of the right hand side
distributed matrix sub(B).
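ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
ib, jb (global) INTEGER. The row and column indices in the global matrix B
indicating the first row and the first column of the matrix sub(B),
respectively.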
Output Parameters
b On exit, if info=0, sub(B) is overwritten by the solution matrix X.

info < 0:
if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal
value, then info = -i.
info > 0:
if info = i, the i-th diagonal element of sub(A) is zero, indicating that the
submatrix is singular and the solutions X have not been computed.
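The singularity check and the meaning of an upper triangular, non-unit solve can be illustrated on a single process. The following is a serial sketch only, not the distributed algorithm used by p?trtrs: it handles the case uplo = 'U', trans = 'N', diag = 'N' with one right-hand side, and the module and function names are hypothetical.

```fortran
module tri_solve_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains

  ! Mirrors the info > 0 convention: index of the first diagonal
  ! element that is exactly zero, or 0 if there is none.
  pure integer function first_zero_diagonal(a) result(info)
    real(dp), intent(in) :: a(:,:)
    integer :: i
    info = 0
    do i = 1, min(size(a, 1), size(a, 2))
      if (a(i, i) == 0.0_dp) then
        info = i
        return
      end if
    end do
  end function first_zero_diagonal

  ! Back substitution for an upper triangular system A*x = b.
  ! Only the upper triangle of a is referenced. The caller should
  ! verify first_zero_diagonal(a) == 0 before calling.
  pure function solve_upper(a, b) result(x)
    real(dp), intent(in) :: a(:,:), b(:)
    real(dp) :: x(size(b))
    integer :: i, n
    n = size(b)
    do i = n, 1, -1
      x(i) = (b(i) - dot_product(a(i, i+1:n), x(i+1:n)))/a(i, i)
    end do
  end function solve_upper

end module tri_solve_sketch
```

Checking the diagonal first and solving only when the result is 0 matches the documented behavior that the solutions are not computed when a zero diagonal element is found.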
See Also
notations.
Estimating the Condition Number: ScaLAPACK Computational Routines

This section describes the ScaLAPACK routines for estimating the condition number of a matrix. The condition
number is used for analyzing the errors in the solution of a system of linear equations. Since the condition
number may be arbitrarily large when the matrix is nearly singular, the routines actually compute the
reciprocal condition number.
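As a small worked illustration of the quantity these routines estimate, the sketch below computes the 1-norm of a matrix and forms the reciprocal condition number from the norms of the matrix and of its inverse. It is a serial sketch with hypothetical names; the ScaLAPACK routines estimate the norm of the inverse from a factorization rather than forming the inverse, and returning 0 for a zero norm is a choice made here for the sketch.

```fortran
module rcond_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains

  ! 1-norm of a matrix: the maximum absolute column sum.
  pure function one_norm(a) result(v)
    real(dp), intent(in) :: a(:,:)
    real(dp) :: v
    v = maxval(sum(abs(a), dim=1))
  end function one_norm

  ! Reciprocal condition number 1/(||A|| * ||A^-1||).
  ! Returns 0 when either norm is 0 (choice made for this sketch).
  pure function reciprocal_condition(anorm, ainv_norm) result(rcond)
    real(dp), intent(in) :: anorm, ainv_norm
    real(dp) :: rcond
    if (anorm == 0.0_dp .or. ainv_norm == 0.0_dp) then
      rcond = 0.0_dp
    else
      rcond = (1.0_dp/anorm)/ainv_norm
    end if
  end function reciprocal_condition

end module rcond_sketch
```

For the 2-by-2 matrix with rows (1, 2) and (3, 4), the 1-norm is 6 and the 1-norm of the inverse is 3.5, so the reciprocal condition number is 1/21. A value close to 1 indicates a well-conditioned matrix, and a value near machine precision indicates a nearly singular one.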
p?gecon
Estimates the reciprocal of the condition number of a
general distributed matrix in either the 1-norm or the
infinity-norm.
Syntax
call psgecon(norm, n, a, ia, ja, desca, anorm, rcond, work, lwork, iwork, liwork,
info)
call pdgecon(norm, n, a, ia, ja, desca, anorm, rcond, work, lwork, iwork, liwork,
info)
call pcgecon(norm, n, a, ia, ja, desca, anorm, rcond, work, lwork, rwork, lrwork,
info)
call pzgecon(norm, n, a, ia, ja, desca, anorm, rcond, work, lwork, rwork, lrwork,
info)
Include Files
Description
The p?gecon routine estimates the reciprocal of the condition number of a general distributed real/complex
matrix sub(A) = A(ia:ia+n-1, ja:ja+n-1) in either the 1-norm or infinity-norm, using the LU factorization
computed by p?getrf.
An estimate is obtained for ||(sub(A))^-1||, and the reciprocal of the condition number is computed as
rcond = 1 / (||sub(A)|| * ||(sub(A))^-1||).
Input Parameters
norm (global) CHARACTER*1. Must be '1' or 'O' or 'I'.
Specifies whether the 1-norm condition number or the infinity-norm

condition number is required.
If norm = '1' or 'O', then the 1-norm is used;
If norm = 'I', then the infinity-norm is used.
a (local)
REAL for psgecon
DOUBLE PRECISION for pdgecon
COMPLEX for pcgecon
DOUBLE COMPLEX for pzgecon.
The array a contains the local pieces of the factors L and U from the
factorization sub(A) = P*L*U; the unit diagonal elements of L are not
stored.
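ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.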
anorm (global) REAL for single precision flavors, DOUBLE PRECISION for double
precision flavors.
If norm = '1' or 'O', the 1-norm of the original distributed matrix sub(A);
If norm = 'I', the infinity-norm of the original distributed matrix sub(A).
work (local)
REAL for psgecon
DOUBLE PRECISION for pdgecon
COMPLEX for pcgecon
DOUBLE COMPLEX for pzgecon.
The array work of size lwork is a workspace array.
lwork (local or global) INTEGER. The size of the array work.
For real flavors:
lwork must be at least
lwork ≥ 2*LOCr(n+mod(ia-1,mb_a))+2*LOCc(n+mod(ja-1,nb_a))
+max(2, max(nb_a*max(1, iceil(NPROW-1, NPCOL)), LOCc(n
+mod(ja-1,nb_a)) + nb_a*max(1, iceil(NPCOL-1, NPROW)))).
For complex flavors:
lwork must be at least
lwork ≥ 2*LOCr(n+mod(ia-1,mb_a))+max(2,
max(nb_a*iceil(NPROW-1, NPCOL), LOCc(n+mod(ja-1,nb_a))+
nb_a*iceil(NPCOL-1, NPROW))).
LOCr and LOCc values can be computed using the ScaLAPACK tool function
numroc; NPROW and NPCOL can be determined by calling the subroutine
blacs_gridinfo.
NOTE
iceil(x,y) is the ceiling of x/y, and mod(x,y) is the integer
remainder of x/y.
iwork (local) INTEGER. Workspace array of size liwork. Used in real flavors only.
liwork (local or global) INTEGER. The size of the array iwork; used in real flavors
only. Must be at least
liwork ≥ LOCr(n+mod(ia-1,mb_a)).
rwork (local) REAL for pcgecon
DOUBLE PRECISION for pzgecon

Workspace array of size lrwork. Used in complex flavors only.
lrwork (local or global) INTEGER. The size of the array rwork; used in complex
flavors only. Must be at least
lrwork ≥ max(1, 2*LOCc(n+mod(ja-1,nb_a))).
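The workspace bounds for p?gecon can be evaluated once the local sizes are known. The following is a minimal sketch with hypothetical names: locr and locc stand for LOCr(n+mod(ia-1,mb_a)) and LOCc(n+mod(ja-1,nb_a)), which the caller is assumed to have obtained from numroc, and the second lwork bound listed above is assumed here to be the one for the complex flavors.

```fortran
module gecon_workspace
  implicit none
contains

  ! Ceiling of x/y for x >= 0 and y > 0, as described in the NOTE.
  pure integer function iceil(x, y) result(c)
    integer, intent(in) :: x, y
    c = (x + y - 1)/y
  end function iceil

  ! Minimum lwork for the real flavors.
  pure integer function gecon_lwork_real(locr, locc, nb_a, nprow, npcol) result(lwork)
    integer, intent(in) :: locr, locc, nb_a, nprow, npcol
    lwork = 2*locr + 2*locc + max(2, max(nb_a*max(1, iceil(nprow-1, npcol)), &
            locc + nb_a*max(1, iceil(npcol-1, nprow))))
  end function gecon_lwork_real

  ! Minimum lwork for the complex flavors (assumed bound, see lead-in).
  pure integer function gecon_lwork_complex(locr, locc, nb_a, nprow, npcol) result(lwork)
    integer, intent(in) :: locr, locc, nb_a, nprow, npcol
    lwork = 2*locr + max(2, max(nb_a*iceil(nprow-1, npcol), &
            locc + nb_a*iceil(npcol-1, nprow)))
  end function gecon_lwork_complex

  ! Minimum liwork (real flavors only).
  pure integer function gecon_liwork(locr) result(liwork)
    integer, intent(in) :: locr
    liwork = locr
  end function gecon_liwork

  ! Minimum lrwork (complex flavors only).
  pure integer function gecon_lrwork(locc) result(lrwork)
    integer, intent(in) :: locc
    lrwork = max(1, 2*locc)
  end function gecon_lrwork

end module gecon_workspace
```

With locr = locc = 50, nb_a = 8 and a 2-by-2 process grid, the real bound evaluates to 258 and the complex bound to 158.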
Output Parameters
rcond (global) REAL for single precision flavors.
DOUBLE PRECISION for double precision flavors.
The reciprocal of the condition number of the distributed matrix sub(A). See
Description.
iwork(1) On exit, iwork(1) contains the minimum value of liwork required for
optimum performance (for real flavors).
rwork(1) On exit, rwork(1) contains the minimum value of lrwork required for
optimum performance (for complex flavors).
info (global) INTEGER. If info=0, the execution is successful.
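info < 0:
if the i-th argument is an array and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and
had an illegal value, then info = -i.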
See Also
notations.
p?pocon
Estimates the reciprocal of the condition number (in
the 1-norm) of a symmetric/Hermitian positive-
definite distributed matrix.
Syntax
call pspocon(uplo, n, a, ia, ja, desca, anorm, rcond, work, lwork, iwork, liwork,
info)
call pdpocon(uplo, n, a, ia, ja, desca, anorm, rcond, work, lwork, iwork, liwork,
info)
call pcpocon(uplo, n, a, ia, ja, desca, anorm, rcond, work, lwork, rwork, lrwork,
info)
call pzpocon(uplo, n, a, ia, ja, desca, anorm, rcond, work, lwork, rwork, lrwork,
info)
Include Files
Description
The p?pocon routine estimates the reciprocal of the condition number (in the 1-norm) of a real symmetric
or complex Hermitian positive definite distributed matrix sub(A) = A(ia:ia+n-1, ja:ja+n-1), using the
Cholesky factorization sub(A) = U^H*U or sub(A) = L*L^H computed by p?potrf.
An estimate is obtained for ||(sub(A))^-1||, and the reciprocal of the condition number is computed as
rcond = 1 / (||sub(A)|| * ||(sub(A))^-1||).
Input Parameters
uplo (global) CHARACTER*1. Must be 'U' or 'L'.
Specifies whether the factor stored in sub(A) is upper or lower triangular.
If uplo = 'U', sub(A) stores the upper triangular factor U of the Cholesky
factorization sub(A) = U^H*U.
If uplo = 'L', sub(A) stores the lower triangular factor L of the Cholesky
factorization sub(A) = L*L^H.
a (local)
REAL for pspocon
DOUBLE PRECISION for pdpocon
COMPLEX for pcpocon
DOUBLE COMPLEX for pzpocon.
The array a contains the local pieces of the factors L or U from the Cholesky
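factorization sub(A) = U^H*U, or sub(A) = L*L^H, as computed by p?potrf.
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the matrix sub(A),
respectively.
anorm (global) REAL for single precision flavors,
DOUBLE PRECISION for double precision flavors.
The 1-norm of the symmetric/Hermitian distributed matrix sub(A).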
work (local)
REAL for pspocon
DOUBLE PRECISION for pdpocon
COMPLEX for pcpocon
DOUBLE COMPLEX for pzpocon.
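The array work of size lwork is a workspace array.
lwork (local or global) INTEGER. The size of the array work.
For real flavors:
lwork must be at least
lwork ≥ 2*LOCr(n+mod(ia-1,mb_a))+2*LOCc(n+mod(ja-1,nb_a))
+max(2, max(nb_a*iceil(NPROW-1, NPCOL), LOCc(n
+mod(ja-1,nb_a))+nb_a*iceil(NPCOL-1, NPROW))).
For complex flavors:
lwork must be at least
lwork ≥ 2*LOCr(n+mod(ia-1,mb_a))+max(2,
max(nb_a*max(1,iceil(NPROW-1, NPCOL)), LOCc(n+mod(ja-1,nb_a))
+nb_a*max(1,iceil(NPCOL-1, NPROW)))).
NOTE
iceil(x,y) is the ceiling of x/y, and mod(x,y) is the integer
remainder of x/y.
iwork (local) INTEGER. Workspace array of size liwork. Used in real flavors only.
liwork (local or global) INTEGER. The size of the array iwork; used in real flavors
only. Must be at least liwork ≥ LOCr(n+mod(ia-1,mb_a)).
rwork (local) REAL for pcpocon
DOUBLE PRECISION for pzpocon
Workspace array of size lrwork. Used in complex flavors only.
lrwork (local or global) INTEGER. The size of the array rwork; used in complex
flavors only. Must be at least lrwork ≥ 2*LOCc(n+mod(ja-1,nb_a)).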
Output Parameters

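rcond (global) REAL for single precision flavors.
DOUBLE PRECISION for double precision flavors.
The reciprocal of the condition number of the distributed matrix sub(A).
info (global) INTEGER. If info=0, the execution is successful.
info < 0:
if the i-th argument is an array and the j-th entry had an illegal
value, then info = -(i*100+j); if the i-th argument is a scalar and
had an illegal value, then info = -i.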
See Also
notations.
p?trcon
Estimates the reciprocal of the condition number of a
triangular distributed matrix in either 1-norm or
infinity-norm.
Syntax
call pstrcon(norm, uplo, diag, n, a, ia, ja, desca, rcond, work, lwork, iwork, liwork,
info)
call pdtrcon(norm, uplo, diag, n, a, ia, ja, desca, rcond, work, lwork, iwork, liwork,
info)
call pctrcon(norm, uplo, diag, n, a, ia, ja, desca, rcond, work, lwork, rwork, lrwork,
info)
call pztrcon(norm, uplo, diag, n, a, ia, ja, desca, rcond, work, lwork, rwork, lrwork,
info)
Include Files
Description
The p?trcon routine estimates the reciprocal of the condition number of a triangular distributed matrix
sub(A) = A(ia:ia+n-1, ja:ja+n-1), in either the 1-norm or the infinity-norm.
The norm of sub(A) is computed and an estimate is obtained for ||(sub(A))^-1||, then the reciprocal of the
condition number is computed as
rcond = 1 / (||sub(A)|| * ||(sub(A))^-1||).
Input Parameters
norm (global) CHARACTER*1. Must be '1' or 'O' or 'I'.
Specifies whether the 1-norm condition number or the infinity-norm

condition number is required.
If norm = '1' or 'O', then the 1-norm is used;
If norm = 'I', then the infinity-norm is used.
uplo (global) CHARACTER*1. Must be 'U' or 'L'.
If uplo = 'U', sub(A) is upper triangular. If uplo = 'L', sub(A) is lower

triangular.
diag (global) CHARACTER*1. Must be 'N' or 'U'.
If diag = 'N', sub(A) is non-unit triangular. If diag = 'U', sub(A) is unit

triangular.
n (global) INTEGER. The order of the distributed matrix sub(A), (n ≥ 0).
a (local)
REAL for pstrcon
DOUBLE PRECISION for pdtrcon
COMPLEX for pctrcon
DOUBLE COMPLEX for pztrcon.
The array a contains the local pieces of the triangular distributed matrix
sub(A).
If uplo = 'U', the leading n-by-n upper triangular part of this distributed
matrix contains the upper triangular matrix, and its strictly lower triangular
part is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of this distributed
matrix contains the lower triangular matrix, and its strictly upper triangular
part is not referenced.
respectively.
work (local)
REAL for pstrcon
DOUBLE PRECISION for pdtrcon
COMPLEX for pctrcon
DOUBLE COMPLEX for pztrcon.
For real flavors:

lwork ≥ 2*LOCr(n+mod(ia-1,mb_a))+LOCc(n+mod(ja-1,nb_a))
+max(2, max(nb_a*max(1,iceil(NPROW-1, NPCOL)),
LOCc(n+mod(ja-1,nb_a))+nb_a*max(1,iceil(NPCOL-1, NPROW)))).
max(nb_a*iceil(NPROW-1, NPCOL),
LOCc(n+mod(ja-1,nb_a))+nb_a*iceil(NPCOL-1, NPROW))).
NOTE
iceil(x,y) is the ceiling of x/y, and mod(x,y) is the integer remainder of x/y.
liwork ≥ LOCr(n+mod(ia-1,mb_a)).
rwork (local) REAL for pctrcon
DOUBLE PRECISION for pztrcon

flavors only. Must be at least
lrwork ≥ LOCc(n+mod(ja-1,nb_a)).
Output Parameters

The reciprocal of the condition number of the distributed matrix sub(A).
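info < 0: if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal
value, then info = -i.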
See Also
notations.
Refining the Solution and Estimating Its Error: ScaLAPACK Computational Routines
This section describes the ScaLAPACK routines for refining the computed solution of a system of linear
equations and estimating the solution error. You can call these routines after factorizing the matrix of the
system of equations and computing the solution (see Routines for Matrix Factorization and Solving Systems
of Linear Equations).
p?gerfs
Improves the computed solution to a system of linear
equations and provides error bounds and backward
error estimates for the solution.
Syntax
call psgerfs(trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, ipiv, b, ib, jb,
descb, x, ix, jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pdgerfs(trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, ipiv, b, ib, jb,
descb, x, ix, jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pcgerfs(trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, ipiv, b, ib, jb,
descb, x, ix, jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
call pzgerfs(trans, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, ipiv, b, ib, jb,
descb, x, ix, jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
Include Files
Description
The p?gerfs routine improves the computed solution to one of the systems of linear equations
sub(A)*sub(X) = sub(B),
sub(A)^T*sub(X) = sub(B), or
sub(A)^H*sub(X) = sub(B)
and provides error bounds and backward error estimates for the solution.
Here sub(A) = A(ia:ia+n-1, ja:ja+n-1), sub(B) = B(ib:ib+n-1, jb:jb+nrhs-1), and
sub(X) = X(ix:ix+n-1, jx:jx+nrhs-1).
Input Parameters

trans (global) CHARACTER*1. Must be 'N' or 'T' or 'C'.
Specifies the form of the system of equations:
If trans = 'N', the system has the form sub(A)*sub(X) = sub(B) (No
transpose);
If trans = 'T', the system has the form sub(A)^T*sub(X) = sub(B)
(Transpose);
If trans = 'C', the system has the form sub(A)^H*sub(X) = sub(B)
(Conjugate transpose).
nrhs (global) INTEGER. The number of right-hand sides, i.e., the number of
columns of the matrices sub(B) and sub(X) (nrhs ≥ 0).
a, af, b, x (local)
REAL for psgerfs
DOUBLE PRECISION for pdgerfs
COMPLEX for pcgerfs
DOUBLE COMPLEX for pzgerfs.
Pointers into the local memory to arrays of local sizes a(lld_a,LOCc(ja+n-1)),
af(lld_af,LOCc(jaf+n-1)), b(lld_b,LOCc(jb+nrhs-1)),
and x(lld_x,LOCc(jx+nrhs-1)), respectively.
The array a contains the local pieces of the distributed matrix sub(A).
The array af contains the local pieces of the distributed factors of the
matrix sub(A) = P*L*U as computed by p?getrf.
The array b contains the local pieces of the distributed matrix of right hand
sides sub(B).
On entry, the array x contains the local pieces of the distributed solution
matrix sub(X).
respectively.
iaf, jaf (global) INTEGER. The row and column indices in the global matrix AF
indicating the first row and the first column of the matrix sub(AF),
respectively.
descaf (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix AF.
respectively.
ix, jx (global) INTEGER. The row and column indices in the global matrix X
indicating the first row and the first column of the matrix sub(X),
respectively.
descx (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix X.
ipiv (local) INTEGER.
Array of size LOCr(m_af) + mb_af.
This array contains pivoting information as computed by p?getrf. If

ipiv(i)=j, then the local row i was swapped with the global row j.
work (local)
REAL for psgerfs
DOUBLE PRECISION for pdgerfs
COMPLEX for pcgerfs
DOUBLE COMPLEX for pzgerfs.
For real flavors:

lwork ≥ 3*LOCr(n+mod(ia-1,mb_a))
NOTE
mod(x,y) is the integer remainder of x/y.
iwork (local) INTEGER. Workspace array, size liwork. Used in real flavors only.
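liwork (local or global) INTEGER. The size of the array iwork; used in real
flavors only. Must be at least liwork ≥ LOCr(n+mod(ib-1,mb_b)).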
rwork (local) REAL for pcgerfs
DOUBLE PRECISION for pzgerfs

Workspace array, size lrwork. Used in complex flavors only.
lrwork (local or global) INTEGER. The size of the array rwork; used in complex
flavors only. Must be at least lrwork ≥ LOCr(n+mod(ib-1,mb_b)).
Output Parameters
x On exit, contains the improved solution vectors.

ferr, berr Arrays of size LOCc(jb+nrhs-1) each.
The array ferr contains the estimated forward error bound for each
solution vector of sub(X).
If XTRUE is the true solution corresponding to sub(X), ferr is an estimated
upper bound for the magnitude of the largest element in (sub(X) - XTRUE)
divided by the magnitude of the largest element in sub(X). The estimate is
as reliable as the estimate for rcond, and is almost always a slight
overestimate of the true error.
This array is tied to the distributed matrix X.
The array berr contains the component-wise relative backward error of
each solution vector (that is, the smallest relative change in any entry of
sub(A) or sub(B) that makes sub(X) an exact solution). This array is tied to
the distributed matrix X.
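To make the two error measures concrete, here is a minimal pure-Python sketch for a single right-hand side on a small, non-distributed system. It assumes the usual component-wise backward error formula max_i |r_i| / (|A|*|x| + |b|)_i with r = b - A*x, and it computes the forward error from a known true solution, which the routine itself does not have (it returns an estimated bound instead). The function names are hypothetical, and the safeguards that library routines apply to tiny denominators are omitted.

```python
def backward_error(a, x, b):
    """Component-wise relative backward error of x as a solution of a*x = b."""
    n = len(a)
    berr = 0.0
    for i in range(n):
        residual = b[i] - sum(a[i][j] * x[j] for j in range(n))
        scale = sum(abs(a[i][j]) * abs(x[j]) for j in range(n)) + abs(b[i])
        if scale > 0.0:                      # skip rows that are entirely zero
            berr = max(berr, abs(residual) / scale)
    return berr


def relative_forward_error(x, x_true):
    """Largest element of (x - x_true) divided by the largest element of x."""
    return max(abs(xi - ti) for xi, ti in zip(x, x_true)) / max(abs(xi) for xi in x)


a = [[2.0, 0.0],
     [0.0, 4.0]]
b = [2.0, 4.0]
x_true = [1.0, 1.0]
x = [1.0, 1.01]                              # slightly perturbed solution

print(backward_error(a, x, b))
print(relative_forward_error(x, x_true))
```

An exact solution gives a backward error of zero; the perturbed solution above gives a small positive value for both measures.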
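info < 0: if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal
value, then info = -i.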
See Also
notations.
p?porfs
Improves the computed solution to a system of linear
equations with symmetric/Hermitian positive definite
distributed matrix and provides error bounds and
backward error estimates for the solution.
Syntax
call psporfs(uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, b, ib, jb, descb,
x, ix, jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pdporfs(uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, b, ib, jb, descb,
x, ix, jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pcporfs(uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, b, ib, jb, descb,
x, ix, jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
call pzporfs(uplo, n, nrhs, a, ia, ja, desca, af, iaf, jaf, descaf, b, ib, jb, descb,
x, ix, jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
Include Files
Description
The p?porfs routine improves the computed solution to the system of linear equations
sub(A)*sub(X) = sub(B),
where sub(A) = A(ia:ia+n-1, ja:ja+n-1) is a real symmetric or complex Hermitian positive definite
distributed matrix and
sub(B) = B(ib:ib+n-1, jb:jb+nrhs-1),
sub(X) = X(ix:ix+n-1, jx:jx+nrhs-1)
are right-hand side and solution submatrices, respectively. This routine also provides error bounds and
backward error estimates for the solution.
Input Parameters

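uplo (global) CHARACTER*1. Must be 'U' or 'L'.
Specifies whether the upper or lower triangular part of the symmetric/
Hermitian matrix sub(A) is stored.
If uplo = 'U', sub(A) is upper triangular. If uplo = 'L', sub(A) is lower
triangular.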
nrhs (global) INTEGER. The number of right-hand sides, i.e., the number of
columns of the matrices sub(B) and sub(X) (nrhs ≥ 0).
a, af, b, x (local)
REAL for psporfs
DOUBLE PRECISION for pdporfs
COMPLEX for pcporfs
DOUBLE COMPLEX for pzporfs.
Pointers into the local memory to arrays of local sizes
a(lld_a, LOCc(ja+n-1)), af(lld_af,LOCc(jaf+n-1)),
b(lld_b,LOCc(jb+nrhs-1)), and x(lld_x,LOCc(jx+nrhs-1)),
respectively.
The array a contains the local pieces of the n-by-n symmetric/Hermitian
distributed matrix sub(A).
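If uplo = 'U', the leading n-by-n upper triangular part of sub(A) contains
the upper triangular part of the matrix, and its strictly lower triangular part
is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of sub(A) contains
the lower triangular part of the distributed matrix, and its strictly upper
triangular part is not referenced.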
The array af contains the factors L or U from the Cholesky factorization
sub(A) = L*L^H or sub(A) = U^H*U, as computed by p?potrf.
On entry, the array b contains the local pieces of the distributed matrix of
right hand sides sub(B).
On entry, the array x contains the local pieces of the solution vectors
sub(X).
respectively.
iaf, jaf (global) INTEGER. The row and column indices in the global matrix AF
indicating the first row and the first column of the matrix sub(AF),
respectively.
descaf (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix AF.
respectively.
respectively.
work (local)
REAL for psporfs
DOUBLE PRECISION for pdporfs
COMPLEX for pcporfs
DOUBLE COMPLEX for pzporfs.

lwork (local) INTEGER. The size of the array work.
For real flavors:

NOTE
rwork (local) REAL for pcporfs
DOUBLE PRECISION for pzporfs

Output Parameters
x On exit, contains the improved solution vectors.

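info < 0: if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal
value, then info = -i.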
See Also
notations.
p?trrfs
Provides error bounds and backward error estimates
for the solution to a system of linear equations with a
distributed triangular coefficient matrix.
Syntax
call pstrrfs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, x, ix,
jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pdtrrfs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, x, ix,
jx, descx, ferr, berr, work, lwork, iwork, liwork, info)
call pctrrfs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, x, ix,
jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
call pztrrfs(uplo, trans, diag, n, nrhs, a, ia, ja, desca, b, ib, jb, descb, x, ix,
jx, descx, ferr, berr, work, lwork, rwork, lrwork, info)
Include Files
Description
The p?trrfs routine provides error bounds and backward error estimates for the solution to one of the
systems of linear equations
sub(A)*sub(X) = sub(B),
sub(A)^T*sub(X) = sub(B), or
sub(A)^H*sub(X) = sub(B),
where sub(A) = A(ia:ia+n-1, ja:ja+n-1) is a triangular matrix,
sub(B) = B(ib:ib+n-1, jb:jb+nrhs-1), and
sub(X) = X(ix:ix+n-1, jx:jx+nrhs-1).
The solution matrix X must be computed by p?trtrs or some other means before entering this routine. The
routine p?trrfs does not do iterative refinement because doing so cannot improve the backward error.
Input Parameters

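uplo (global) CHARACTER*1. Must be 'U' or 'L'.
If uplo = 'U', sub(A) is upper triangular. If uplo = 'L', sub(A) is lower
triangular.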

trans (global) CHARACTER*1. Must be 'N' or 'T' or 'C'.
Specifies the form of the system of equations:
If trans = 'N', the system has the form sub(A)*sub(X) = sub(B) (No
transpose);
If trans = 'T', the system has the form sub(A)^T*sub(X) = sub(B)
(Transpose);
If trans = 'C', the system has the form sub(A)^H*sub(X) = sub(B)
(Conjugate transpose).

diag (global) CHARACTER*1. Must be 'N' or 'U'.
If diag = 'N', then sub(A) is non-unit triangular.
If diag = 'U', then sub(A) is unit triangular.
nrhs (global) INTEGER. The number of right-hand sides, that is, the number of
columns of the matrices sub(B) and sub(X) (nrhs ≥ 0).
a, b, x (local)
REAL for pstrrfs
DOUBLE PRECISION for pdtrrfs
COMPLEX for pctrrfs
DOUBLE COMPLEX for pztrrfs.
Pointers into the local memory to arrays of local sizes a(lld_a,LOCc(ja+n-1)),
b(lld_b,LOCc(jb+nrhs-1)), and x(lld_x,LOCc(jx+nrhs-1)),
respectively.
The array a contains the local pieces of the original triangular distributed
matrix sub(A).
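If uplo = 'U', the leading n-by-n upper triangular part of sub(A) contains
the upper triangular matrix, and its strictly lower triangular part
is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of sub(A) contains
the lower triangular part of the distributed matrix, and its strictly upper
triangular part is not referenced.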
On entry, the array b contains the local pieces of the distributed matrix of
right hand sides sub(B).
On entry, the array x contains the local pieces of the solution vectors
sub(X).
respectively.
respectively.
respectively.
work (local)
REAL for pstrrfs
DOUBLE PRECISION for pdtrrfs
COMPLEX for pctrrfs
DOUBLE COMPLEX for pztrrfs.
lwork (local) INTEGER. The size of the array work.
For real flavors:

lwork must be at least lwork ≥ 3*LOCr(n+mod(ia-1,mb_a))
NOTE
rwork (local) REAL for pctrrfs
DOUBLE PRECISION for pztrrfs

Output Parameters

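info < 0: if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal
value, then info = -i.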
See Also
notations.
Matrix Inversion: ScaLAPACK Computational Routines

This section describes ScaLAPACK routines that compute the inverse of a matrix based on the previously
obtained factorization. Note that it is not recommended to solve a system of equations Ax = b by first
computing A^(-1) and then forming the matrix-vector product x = A^(-1)*b. Call a solver routine instead (see
Solving Systems of Linear Equations); this is more efficient and more accurate.
p?getri
Computes the inverse of an LU-factored distributed
matrix.
Syntax
call psgetri(n, a, ia, ja, desca, ipiv, work, lwork, iwork, liwork, info)
call pdgetri(n, a, ia, ja, desca, ipiv, work, lwork, iwork, liwork, info)
call pcgetri(n, a, ia, ja, desca, ipiv, work, lwork, iwork, liwork, info)
call pzgetri(n, a, ia, ja, desca, ipiv, work, lwork, iwork, liwork, info)
Include Files
Description
The p?getri routine computes the inverse of a general distributed matrix sub(A) = A(ia:ia+n-1,
ja:ja+n-1) using the LU factorization computed by p?getrf. This method inverts U and then computes the
inverse of sub(A) by solving the system
inv(sub(A))*L = inv(U)
for inv(sub(A)).
Input Parameters
n (global) INTEGER. The number of rows and columns to be operated on, that
is, the order of the distributed matrix sub(A) (n ≥ 0).
a (local)
REAL for psgetri
DOUBLE PRECISION for pdgetri
COMPLEX for pcgetri
DOUBLE COMPLEX for pzgetri.
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
On entry, the array a contains the local pieces of the L and U obtained by
the factorization sub(A) = P*L*U computed by p?getrf.
respectively.
work (local)
REAL for psgetri
DOUBLE PRECISION for pdgetri
COMPLEX for pcgetri
DOUBLE COMPLEX for pzgetri.
lwork (local) INTEGER. The size of the array work. lwork must be at least
lwork ≥ LOCr(n+mod(ia-1,mb_a))*nb_a.
NOTE
The array work is used to keep at most an entire column block of sub(A).
iwork (local) INTEGER. Workspace array used for physically transposing the
pivots, size liwork.
liwork (local or global) INTEGER. The size of the array iwork.
The minimal value of liwork is determined by the following code:
if NPROW == NPCOL then
    liwork = LOCc(n_a + mod(ja-1,nb_a)) + nb_a
else
    liwork = LOCc(n_a + mod(ja-1,nb_a)) +
             max(ceil(ceil(LOCr(m_a)/mb_a)/(lcm/NPROW)), nb_a)
end if
where lcm is the least common multiple of process rows and columns
(NPROW and NPCOL).
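The rule above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the local extents LOCc(n_a + mod(ja-1,nb_a)) and LOCr(m_a) are passed in as already computed integers (hypothetical arguments locc_cols and locr_rows), and the function name is not part of ScaLAPACK.

```python
import math


def ceil_div(x, y):
    """Ceiling of x/y for non-negative x and positive y, using integers only."""
    return -(-x // y)


def getri_min_liwork(nprow, npcol, locc_cols, locr_rows, mb_a, nb_a):
    """Minimal liwork following the rule quoted in the text.

    locc_cols stands for LOCc(n_a + mod(ja-1, nb_a)) and
    locr_rows stands for LOCr(m_a); both are assumed to be known.
    """
    if nprow == npcol:
        return locc_cols + nb_a
    lcm = nprow * npcol // math.gcd(nprow, npcol)
    blocks = ceil_div(locr_rows, mb_a)        # ceil(LOCr(m_a)/mb_a)
    return locc_cols + max(ceil_div(blocks, lcm // nprow), nb_a)


print(getri_min_liwork(2, 2, 6, 6, 2, 2))     # square process grid
print(getri_min_liwork(2, 3, 4, 100, 2, 2))   # rectangular process grid
```

In practice the local extents would come from the ScaLAPACK tool function numroc on each process.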
Output Parameters
ipiv (local) INTEGER.
Array of size LOCr(m_a)+ mb_a.
This array contains the pivoting information.

If ipiv(i)=j, then the local row i was swapped with the global row j.
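info < 0: if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal
value, then info = -i.
info > 0: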
If info = i, U(i,i) is exactly zero. The factorization has been completed, but
the factor U is exactly singular, and division by zero will occur if it is used to
solve a system of equations.
See Also
notations.
p?potri
Computes the inverse of a symmetric/Hermitian
positive definite distributed matrix.
Syntax
call pspotri(uplo, n, a, ia, ja, desca, info)
call pdpotri(uplo, n, a, ia, ja, desca, info)
call pcpotri(uplo, n, a, ia, ja, desca, info)
call pzpotri(uplo, n, a, ia, ja, desca, info)
Include Files
Description
The p?potri routine computes the inverse of a real symmetric or complex Hermitian positive definite
distributed matrix sub(A) = A(ia:ia+n-1, ja:ja+n-1) using the Cholesky factorization sub(A) = U^H*U or
sub(A) = L*L^H computed by p?potrf.
Input Parameters

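uplo (global) CHARACTER*1. Must be 'U' or 'L'.
Specifies whether the upper or lower triangular part of the symmetric/
Hermitian matrix sub(A) is stored.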
If uplo = 'U', upper triangle of sub(A) is stored. If uplo = 'L', lower
triangle of sub(A) is stored.
a (local)
REAL for pspotri
DOUBLE PRECISION for pdpotri
COMPLEX for pcpotri
DOUBLE COMPLEX for pzpotri.
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
On entry, the array a contains the local pieces of the triangular factor U or L
from the Cholesky factorization sub(A) = U^H*U, or sub(A) = L*L^H, as
computed by p?potrf.
respectively.
Output Parameters
a On exit, overwritten by the local pieces of the upper or lower triangle of the
(symmetric/Hermitian) inverse of sub(A).
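info < 0: if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal
value, then info = -i.
info > 0: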
If info = i, the element (i, i) of the factor U or L is zero, and the inverse
could not be computed.
See Also
notations.
p?trtri
Computes the inverse of a triangular distributed
matrix.
Syntax
call pstrtri(uplo, diag, n, a, ia, ja, desca, info)
call pdtrtri(uplo, diag, n, a, ia, ja, desca, info)
call pctrtri(uplo, diag, n, a, ia, ja, desca, info)
call pztrtri(uplo, diag, n, a, ia, ja, desca, info)
Include Files
Description
The p?trtri routine computes the inverse of a real or complex upper or lower triangular distributed matrix
sub(A) = A(ia:ia+n-1, ja:ja+n-1).
Input Parameters
uplo (global) CHARACTER*1. Must be 'U' or 'L'.
Specifies whether the distributed matrix sub(A) is upper or lower triangular.
If uplo = 'U', sub(A) is upper triangular.
If uplo = 'L', sub(A) is lower triangular.
diag (global) CHARACTER*1. Must be 'N' or 'U'.
Specifies whether or not the distributed matrix sub(A) is unit triangular.
If diag = 'N', then sub(A) is non-unit triangular.
If diag = 'U', then sub(A) is unit triangular.
a (local)
REAL for pstrtri
DOUBLE PRECISION for pdtrtri
COMPLEX for pctrtri
DOUBLE COMPLEX for pztrtri.
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
The array a contains the local pieces of the triangular distributed matrix
sub(A).
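If uplo = 'U', the leading n-by-n upper triangular part of sub(A) contains
the upper triangular matrix to be inverted, and the strictly lower triangular
part of sub(A) is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of sub(A) contains
the lower triangular matrix, and the strictly upper triangular part of sub(A)
is not referenced.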
respectively.
Output Parameters
a On exit, overwritten by the (triangular) inverse of the original matrix.
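info < 0: if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal
value, then info = -i.
info > 0: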
If info = k, A(ia+k-1, ja+k-1) is exactly zero. The triangular matrix
sub(A) is singular and its inverse cannot be computed.
See Also
notations.
Matrix Equilibration: ScaLAPACK Computational Routines

ScaLAPACK routines described in this section are used to compute scaling factors needed to equilibrate a
matrix. Note that these routines do not actually scale the matrices.
p?geequ
Computes row and column scaling factors intended to
equilibrate a general rectangular distributed matrix
and reduce its condition number.
Syntax
call psgeequ(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, info)
call pdgeequ(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, info)
call pcgeequ(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, info)
call pzgeequ(m, n, a, ia, ja, desca, r, c, rowcnd, colcnd, amax, info)
Include Files
Description
The p?geequ routine computes row and column scalings intended to equilibrate an m-by-n distributed matrix
sub(A) = A(ia:ia+m-1, ja:ja+n-1) and reduce its condition number. The output array r returns the row
scale factors r_i, and the array c returns the column scale factors c_j. These factors are chosen to try to make
the largest element in each row and column of the matrix B with elements b_ij = r_i*a_ij*c_j have absolute value 1.
r_i and c_j are restricted to be between SMLNUM = smallest safe number and BIGNUM = largest safe number.
Use of these scaling factors is not guaranteed to reduce the condition number of sub(A) but works well in
practice.

BIGNUM = 1 / SMLNUM
The auxiliary function p?laqge uses scaling factors computed by p?geequ to scale a general rectangular
matrix.
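The scaling idea described above can be sketched in pure Python for a small, non-distributed real matrix. This is a simplified illustration rather than the library algorithm: it assumes that no row or column is entirely zero, it omits the SMLNUM/BIGNUM clamping, and the function name is hypothetical. Row factors are taken as the reciprocal of the largest magnitude in each row, and column factors as the reciprocal of the largest row-scaled magnitude in each column.

```python
def equilibration_factors(a):
    """Return (r, c, rowcnd, colcnd, amax) for a dense matrix given as rows."""
    m, n = len(a), len(a[0])
    # Row factors: make the largest entry of every row have magnitude 1.
    r = [1.0 / max(abs(v) for v in row) for row in a]
    # Column factors: computed on the row-scaled matrix.
    c = [1.0 / max(r[i] * abs(a[i][j]) for i in range(m)) for j in range(n)]
    rowcnd = min(r) / max(r)        # ratio of smallest to largest row factor
    colcnd = min(c) / max(c)        # ratio of smallest to largest column factor
    amax = max(abs(v) for row in a for v in row)
    return r, c, rowcnd, colcnd, amax


a = [[1.0, 100.0],
     [0.01, 1.0]]
r, c, rowcnd, colcnd, amax = equilibration_factors(a)
# Scaled matrix B with b_ij = r_i * a_ij * c_j
b = [[r[i] * a[i][j] * c[j] for j in range(2)] for i in range(2)]
print(r, c, rowcnd, colcnd, amax)
print(b)
```

Here both ratios are well below 0.1, so by the criteria given for rowcnd and colcnd this matrix would be worth scaling.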
Input Parameters
m (global) INTEGER. The number of rows to be operated on, that is, the
number of rows of the distributed matrix sub(A) (m ≥ 0).
n (global) INTEGER. The number of columns to be operated on, that is, the
number of columns of the distributed matrix sub(A) (n ≥ 0).
a (local)
REAL for psgeequ
DOUBLE PRECISION for pdgeequ
COMPLEX for pcgeequ
DOUBLE COMPLEX for pzgeequ .
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
The array a contains the local pieces of the m-by-n distributed matrix whose
equilibration factors are to be computed.
respectively.
Output Parameters
r, c (local) REAL for single precision flavors;
DOUBLE PRECISION for double precision flavors.
Arrays of sizes LOCr(m_a) and LOCc(n_a), respectively.
If info = 0, or info > ia+m-1, the array r(ia:ia+m-1) contains the row

scale factors for sub(A). r is aligned with the distributed matrix A, and
replicated across every process column. r is tied to the distributed matrix
A.
If info = 0, the array c (ja:ja+n-1) contains the column scale factors for
sub(A). c is aligned with the distributed matrix A, and replicated down
every process row. c is tied to the distributed matrix A.
rowcnd, colcnd (global) REAL for single precision flavors;
DOUBLE PRECISION for double precision flavors.
If info = 0 or info > ia+m-1, rowcnd contains the ratio of the smallest r(i)
to the largest r(i) (ia ≤ i ≤ ia+m-1). If rowcnd ≥ 0.1 and amax is neither too
large nor too small, it is not worth scaling by r(ia:ia+m-1).
If info = 0, colcnd contains the ratio of the smallest c(j) to the largest
c(j) (ja ≤ j ≤ ja+n-1).
If colcnd ≥ 0.1, it is not worth scaling by c(ja:ja+n-1).
amax (global) REAL for single precision flavors;
DOUBLE PRECISION for double precision flavors.
Absolute value of the largest matrix element. If amax is very close to
overflow or very close to underflow, the matrix should be scaled.
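info < 0: if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal
value, then info = -i.
info > 0:
If info = i and
i ≤ m, the i-th row of the distributed matrix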
sub(A) is exactly zero;

i>m, the (i - m)-th column of the distributed
matrix sub(A) is exactly zero.
See Also
notations.
p?poequ
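Computes row and column scaling factors intended to
equilibrate a symmetric (Hermitian) positive definite
distributed matrix and reduce its condition number.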
Syntax
call pspoequ(n, a, ia, ja, desca, sr, sc, scond, amax, info)
call pdpoequ(n, a, ia, ja, desca, sr, sc, scond, amax, info)
call pcpoequ(n, a, ia, ja, desca, sr, sc, scond, amax, info)
call pzpoequ(n, a, ia, ja, desca, sr, sc, scond, amax, info)
Include Files
Description
The p?poequ routine computes row and column scalings intended to equilibrate a real symmetric or complex
Hermitian positive definite distributed matrix sub(A) = A(ia:ia+n-1, ja:ja+n-1) and reduce its condition
number (with respect to the two-norm). The output arrays sr and sc return the row and column scale
factors
s(i) = 1/sqrt(a_ii).
These factors are chosen so that the scaled distributed matrix B with elements b_ij = s(i)*a_ij*s(j) has ones on
the diagonal.
This choice of sr and sc puts the condition number of B within a factor n of the smallest possible condition
number over all possible diagonal scalings.
The auxiliary function p?laqsy uses scaling factors computed by p?poequ to scale a symmetric/Hermitian
matrix.
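For a small, non-distributed symmetric positive definite matrix, the diagonal scaling described above can be sketched in pure Python. The sketch assumes the scale factors are s(i) = 1/sqrt(a_ii), which is what makes the diagonal of the scaled matrix equal to one; the function name is hypothetical, and amax is taken from the diagonal because the largest-magnitude entry of a positive definite matrix lies on its diagonal.

```python
import math


def spd_scale_factors(a):
    """Return (s, scond, amax) using only the diagonal of a."""
    diag = [a[i][i] for i in range(len(a))]
    if min(diag) <= 0.0:
        raise ValueError("a diagonal entry is nonpositive")
    s = [1.0 / math.sqrt(d) for d in diag]
    scond = min(s) / max(s)         # ratio of smallest to largest scale factor
    amax = max(diag)
    return s, scond, amax


a = [[4.0, 2.0],
     [2.0, 16.0]]
s, scond, amax = spd_scale_factors(a)
scaled_diag = [s[i] * a[i][i] * s[i] for i in range(2)]
print(s, scond, amax, scaled_diag)
```

With scond at or above 0.1 and a moderate amax, scaling this particular matrix would not be worthwhile according to the criterion given for scond.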
Input Parameters
a (local)
REAL for pspoequ
DOUBLE PRECISION for pdpoequ
COMPLEX for pcpoequ
DOUBLE COMPLEX for pzpoequ.
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
The array a contains the n-by-n symmetric/Hermitian positive definite
distributed matrix sub(A) whose scaling factors are to be computed. Only
the diagonal elements of sub(A) are referenced.
respectively.
Output Parameters
sr, sc (local)
REAL for single precision flavors;
DOUBLE PRECISION for double precision flavors.
Arrays of sizes LOCr(m_a) and LOCc(n_a), respectively.
If info = 0, the array sr(ia:ia+n-1) contains the row scale factors for
sub(A). sr is aligned with the distributed matrix A, and replicated across
every process column. sr is tied to the distributed matrix A.
If info = 0, the array sc(ja:ja+n-1) contains the column scale factors

for sub(A). sc is aligned with the distributed matrix A, and replicated down
every process row. sc is tied to the distributed matrix A.
scond (global)

If info = 0, scond contains the ratio of the smallest sr(i) (or sc(j)) to
the largest sr(i) (or sc(j)), with
ia ≤ i ≤ ia+n-1 and ja ≤ j ≤ ja+n-1.
If scond ≥ 0.1 and amax is neither too large nor too small, it is not worth
scaling by sr (or sc).
amax (global)
Absolute value of the largest matrix element. If amax is very close to
overflow or very close to underflow, the matrix should be scaled.
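info < 0: if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal
value, then info = -i.
info > 0: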
If info = k, the k-th diagonal entry of sub(A) is nonpositive.
See Also
notations.
Orthogonal Factorizations: ScaLAPACK Computational Routines

This section describes the ScaLAPACK routines for the QR(RQ) and LQ(QL) factorization of matrices. Routines
for the RZ factorization as well as for generalized QR and RQ factorizations are also included. For the
mathematical definition of the factorizations, see the respective LAPACK sections or refer to [SLUG].
Table "Computational Routines for Orthogonal Factorizations" lists ScaLAPACK routines that perform
orthogonal factorization of matrices.
Computational Routines for Orthogonal Factorizations
Matrix type, factorization              Factorize          Factorize       Generate           Apply
                                        without pivoting   with pivoting   matrix Q           matrix Q
general matrices, QR factorization      p?geqrf            p?geqpf         p?orgqr, p?ungqr   p?ormqr, p?unmqr
general matrices, RQ factorization      p?gerqf                            p?orgrq, p?ungrq   p?ormrq, p?unmrq
general matrices, LQ factorization      p?gelqf                            p?orglq, p?unglq   p?ormlq, p?unmlq
general matrices, QL factorization      p?geqlf                            p?orgql, p?ungql   p?ormql, p?unmql
trapezoidal matrices, RZ factorization  p?tzrzf                                               p?ormrz, p?unmrz
pair of matrices, generalized QR        p?ggqrf
factorization
pair of matrices, generalized RQ        p?ggrqf
factorization
p?geqrf
Computes the QR factorization of a general m-by-n
matrix.
Syntax
call psgeqrf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgeqrf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgeqrf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgeqrf(m, n, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?geqrf routine forms the QR factorization of a general m-by-n distributed matrix
sub(A) = A(ia:ia+m-1, ja:ja+n-1) as
A = Q*R.
Input Parameters
m (global) INTEGER. The number of rows in the distributed matrix sub(A); (m ≥ 0).
n (global) INTEGER. The number of columns in the distributed matrix sub(A); (n ≥ 0).
a (local)
REAL for psgeqrf
DOUBLE PRECISION for pdgeqrf
COMPLEX for pcgeqrf
DOUBLE COMPLEX for pzgeqrf.
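Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
ia, ja (global) INTEGER. The row and column indices in the global matrix A
indicating the first row and the first column of the submatrix A(ia:ia+m-1,
ja:ja+n-1), respectively.
desca (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix A.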
work (local).
REAL for psgeqrf
DOUBLE PRECISION for pdgeqrf.
COMPLEX for pcgeqrf.
DOUBLE COMPLEX for pzgeqrf
lwork (local or global) INTEGER, size of work, must be at least
lwork ≥ nb_a * (mp0+nq0+nb_a), where
iroff = mod(ia-1, mb_a), icoff = mod(ja-1, nb_a),
iarow = indxg2p(ia, mb_a, MYROW, rsrc_a, NPROW),
iacol = indxg2p(ja, nb_a, MYCOL, csrc_a, NPCOL),
mp0 = numroc(m+iroff, mb_a, MYROW, iarow, NPROW),
nq0 = numroc(n+icoff, nb_a, MYCOL, iacol, NPCOL), and numroc,
indxg2p are ScaLAPACK tool functions; MYROW, MYCOL, NPROW and NPCOL
can be determined by calling the subroutine blacs_gridinfo.
If lwork = -1, then lwork is global input and a workspace query is

assumed; the routine only calculates the minimum and optimal size for all
work arrays. Each of these values is returned in the first entry of the
corresponding work array, and no error message is issued by pxerbla.
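To show how the workspace bound for p?geqrf is assembled, the following Python sketch re-implements the logic of the tool functions numroc and indxg2p and evaluates nb_a*(mp0+nq0+nb_a) for one process. It is an illustrative translation made for this example, not library code: process coordinates are zero-based as in BLACS, the wrapper geqrf_min_lwork is a hypothetical name, and in a real program a workspace query with lwork = -1 is the more robust way to size work.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Local count of rows/columns owned by process iproc for a dimension of
    global size n distributed block-cyclically with block size nb."""
    mydist = (nprocs + iproc - isrcproc) % nprocs
    nblocks = n // nb
    count = (nblocks // nprocs) * nb
    extra = nblocks % nprocs
    if mydist < extra:
        count += nb                 # this process owns one more full block
    elif mydist == extra:
        count += n % nb             # this process owns the trailing partial block
    return count


def indxg2p(indxglob, nb, iproc, isrcproc, nprocs):
    """Process coordinate owning the one-based global index indxglob
    (iproc is unused; it is kept to mirror the Fortran argument list)."""
    return (isrcproc + (indxglob - 1) // nb) % nprocs


def geqrf_min_lwork(m, n, ia, ja, mb_a, nb_a, rsrc_a, csrc_a,
                    myrow, mycol, nprow, npcol):
    """Evaluate nb_a*(mp0 + nq0 + nb_a) as defined in the text."""
    iroff = (ia - 1) % mb_a
    icoff = (ja - 1) % nb_a
    iarow = indxg2p(ia, mb_a, myrow, rsrc_a, nprow)
    iacol = indxg2p(ja, nb_a, mycol, csrc_a, npcol)
    mp0 = numroc(m + iroff, mb_a, myrow, iarow, nprow)
    nq0 = numroc(n + icoff, nb_a, mycol, iacol, npcol)
    return nb_a * (mp0 + nq0 + nb_a)


# 10-by-10 matrix, 2-by-2 blocks, 2-by-2 process grid, process (0, 0)
print(geqrf_min_lwork(10, 10, 1, 1, 2, 2, 0, 0, 0, 0, 2, 2))
```

Because mp0 and nq0 depend on the process coordinates, each process may obtain a different minimum.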
Output Parameters
a The elements on and above the diagonal of sub(A) contain the min(m,n)-by-n
upper trapezoidal matrix R (R is upper triangular if m ≥ n); the elements
below the diagonal, with the array tau, represent the orthogonal/unitary
matrix Q as a product of elementary reflectors (see Application Notes
below).
tau (local)
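REAL for psgeqrf
DOUBLE PRECISION for pdgeqrf
COMPLEX for pcgeqrf
DOUBLE COMPLEX for pzgeqrf.
Array of size LOCc(ja+min(m,n)-1).
Contains the scalar factor of elementary reflectors. tau is tied to the
distributed matrix A.
info (global) INTEGER.
= 0, the execution is successful.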

< 0, if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal
value, then info = -i.
Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(ja)*H(ja+1)*...*H(ja+k-1),
where k = min(m,n).
Each H(i) has the form
H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is
stored on exit in A(ia+i:ia+m-1, ja+i-1), and tau in tau(ja+i-1).
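The reflector form H = I - tau*v*v' with v(1) = 1 can be illustrated with a short pure-Python sketch for a real vector. It follows the common LAPACK-style convention for choosing tau and v so that H maps a nonzero vector x onto a multiple of the first unit vector; the function names are hypothetical, x[0] is assumed nonzero or accompanied by other nonzero entries, and no scaling safeguards are included.

```python
import math


def make_reflector(x):
    """Return (tau, v, beta) with v[0] = 1 such that (I - tau*v*v^T) x = beta*e1."""
    alpha = x[0]
    norm = math.sqrt(sum(value * value for value in x))
    beta = -math.copysign(norm, alpha)       # sign chosen to avoid cancellation
    tau = (beta - alpha) / beta
    v = [1.0] + [value / (alpha - beta) for value in x[1:]]
    return tau, v, beta


def apply_reflector(tau, v, x):
    """Compute (I - tau*v*v^T) x without forming the matrix."""
    dot = sum(vi * xi for vi, xi in zip(v, x))
    return [xi - tau * vi * dot for vi, xi in zip(v, x)]


x = [3.0, 4.0]
tau, v, beta = make_reflector(x)
print(tau, v, beta)
print(apply_reflector(tau, v, x))
```

Storing only tau and the entries of v below the leading 1 is what allows the factorization routine to keep Q in the part of the array below the diagonal.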
See Also
notations.
p?geqpf
Computes the QR factorization of a general m-by-n
matrix with pivoting.
Syntax
call psgeqpf(m, n, a, ia, ja, desca, ipiv, tau, work, lwork, info)
call pdgeqpf(m, n, a, ia, ja, desca, ipiv, tau, work, lwork, info)
call pcgeqpf(m, n, a, ia, ja, desca, ipiv, tau, work, lwork, rwork, lrwork, info)
call pzgeqpf(m, n, a, ia, ja, desca, ipiv, tau, work, lwork, rwork, lrwork, info)
Include Files
Description
The p?geqpf routine forms the QR factorization with column pivoting of a general m-by-n distributed matrix
sub(A)= A(ia:ia+m-1, ja:ja+n-1) as
sub(A)*P=Q*R.
Input Parameters
m (global) INTEGER. The number of rows in the matrix sub(A) (m ≥ 0).
n (global) INTEGER. The number of columns in the matrix sub(A) (n ≥ 0).
a (local)
REAL for psgeqpf
DOUBLE PRECISION for pdgeqpf
COMPLEX for pcgeqpf
DOUBLE COMPLEX for pzgeqpf.
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
work (local).
REAL for psgeqpf
DOUBLE PRECISION for pdgeqpf.
COMPLEX for pcgeqpf.

DOUBLE COMPLEX for pzgeqpf
lwork (local or global) INTEGER, size of work. Must be at least:
For real flavors:
lwork ≥ max(3,mp0+nq0) + LOCc(ja+n-1) + nq0.
For complex flavors:
lwork ≥ max(3,mp0+nq0).
Here
mp0 = numroc(m+iroff, mb_a, MYROW, iarow, NPROW ),
nq0 = numroc(n+icoff, nb_a, MYCOL, iacol, NPCOL),
LOCc (ja+n-1) = numroc(ja+n-1, nb_a, MYCOL,csrc_a, NPCOL),
and numroc, indxg2p are ScaLAPACK tool functions.
You can determine MYROW, MYCOL, NPROW and NPCOL by calling the
blacs_gridinfo subroutine.
rwork (local).
REAL for pcgeqpf.
DOUBLE PRECISION for pzgeqpf.
Workspace array of size lrwork (complex flavors only).
lrwork (local or global) INTEGER, size of rwork (complex flavors only). The value
of lrwork must be at least
lrwork ≥ LOCc(ja+n-1) + nq0.

Here
mp0 = numroc(m+iroff, mb_a, MYROW, iarow, NPROW ),
nq0 = numroc(n+icoff, nb_a, MYCOL, iacol, NPCOL),
LOCc (ja+n-1) = numroc(ja+n-1, nb_a, MYCOL,csrc_a, NPCOL),
and numroc, indxg2p are ScaLAPACK tool functions.
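You can determine MYROW, MYCOL, NPROW and NPCOL by calling the
blacs_gridinfo subroutine.
If lrwork = -1, then lrwork is global input and a workspace query is
assumed; the routine only calculates the minimum and optimal size for all
work arrays. Each of these values is returned in the first entry of the
corresponding work array, and no error message is issued by pxerbla.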
Output Parameters
a The elements on and above the diagonal of sub(A) contain the min(m,n)-by-n
upper trapezoidal matrix R (R is upper triangular if m ≥ n); the elements
below the diagonal, with the array tau, represent the orthogonal/unitary
matrix Q as a product of elementary reflectors (see Application Notes
below).
ipiv (local) INTEGER. Array of size LOCc(ja+n-1).
If ipiv(i) = k, the local i-th column of sub(A)*P was the global k-th column of
sub(A). ipiv is tied to the distributed matrix A.
tau (local)
REAL for psgeqpf
DOUBLE PRECISION for pdgeqpf
COMPLEX for pcgeqpf
DOUBLE COMPLEX for pzgeqpf.
Array of size LOCc(ja+min(m, n)-1).
Contains the scalar factor tau of elementary reflectors. tau is tied to the
distributed matrix A.
Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(1)*H(2)*...*H(k)
where k = min(m,n).
Each H(i) has the form
H = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:m) is
stored on exit in A(ia+i:ia+m-1, ja+i-1).
The matrix P is represented in ipiv as follows: if ipiv(j)= i then the j-th column of P is the i-th canonical
unit vector.
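The encoding of P in ipiv can be checked on a small, non-distributed example. The Python sketch below builds P from a one-based pivot list so that the j-th column of P is the ipiv(j)-th canonical unit vector, and then forms A*P; the function names and the sample data are hypothetical and chosen only for illustration.

```python
def permutation_from_ipiv(ipiv):
    """Build P (list of rows) whose j-th column is the ipiv[j]-th unit vector.
    ipiv holds one-based indices, as in the Fortran interface."""
    n = len(ipiv)
    p = [[0] * n for _ in range(n)]
    for j, i in enumerate(ipiv):
        p[i - 1][j] = 1
    return p


def matmul(a, b):
    """Plain dense matrix product of two matrices given as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]


a = [[10, 20, 30],
     [40, 50, 60]]
ipiv = [3, 1, 2]
p = permutation_from_ipiv(ipiv)
print(matmul(a, p))     # column j of the product is column ipiv[j] of a
```

Multiplying by P on the right therefore reorders the columns of the matrix into the pivoted order used by the factorization.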
See Also
notations.
p?orgqr
Generates the orthogonal matrix Q of the QR
factorization formed by p?geqrf.
Syntax
call psorgqr(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorgqr(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?orgqr routine generates the whole or part of m-by-n real distributed matrix Q denoting A(ia:ia+m-1,
ja:ja+n-1) with orthonormal columns, which is defined as the first n columns of a product of k elementary
reflectors of order m
Q= H(1)*H(2)*...*H(k)
as returned by p?geqrf.
Input Parameters
m (global) INTEGER. The number of rows in the matrix sub(Q) (m ≥ 0).
n (global) INTEGER. The number of columns in the matrix sub(Q) (m ≥ n ≥ 0).
k (global) INTEGER. The number of elementary reflectors whose product
defines the matrix Q (n ≥ k ≥ 0).
a (local)
REAL for psorgqr
DOUBLE PRECISION for pdorgqr
Pointer into the local memory to an array of local size (lld_a,LOCc(ja+n-1)).
The j-th column must contain the vector that defines the
elementary reflector H(j), ja ≤ j ≤ ja+k-1, as returned by p?geqrf in the k
columns of its distributed matrix argument A(ia:*, ja:ja+k-1).
tau (local)
REAL for psorgqr
DOUBLE PRECISION for pdorgqr
Array of size LOCc(ja+k-1).
Contains the scalar factor tau(j) of elementary reflectors H(j) as returned

by p?geqrf. tau is tied to the distributed matrix A.
work (local)
REAL for psorgqr
DOUBLE PRECISION for pdorgqr
Workspace array of size lwork.
lwork (local or global) INTEGER, size of work.
Must be at least lwork ≥ nb_a*(nqa0 + mpa0 + nb_a), where
iroffa = mod(ia-1, mb_a), icoffa = mod(ja-1, nb_a),
iarow = indxg2p(ia, mb_a, MYROW, rsrc_a, NPROW),
iacol = indxg2p(ja, nb_a, MYCOL, csrc_a, NPCOL),
mpa0 = numroc(m+iroffa, mb_a, MYROW, iarow, NPROW),
nqa0 = numroc(n+icoffa, nb_a, MYCOL, iacol, NPCOL);
indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL, NPROW
and NPCOL can be determined by calling the subroutine blacs_gridinfo.

Output Parameters
a Contains the local pieces of the m-by-n distributed matrix Q.
work(1) On exit, work(1) contains the minimum value of lwork required for optimum
performance.

< 0: if the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal value, then info = -i.
See Also
notations.
p?ungqr
Generates the complex unitary matrix Q of the QR
factorization formed by p?geqrf.
Syntax
call pcungqr(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzungqr(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
This routine generates the whole or part of m-by-n complex distributed matrix Q denoting A(ia:ia+m-1,
ja:ja+n-1) with orthonormal columns, which is defined as the first n columns of a product of k elementary
reflectors of order m
Q = H(1)*H(2)*...*H(k)
as returned by p?geqrf.
Input Parameters
m (global) INTEGER. The number of rows in the matrix sub(Q) (m ≥ 0).
n (global) INTEGER. The number of columns in the matrix sub(Q) (m ≥ n ≥ 0).
k (global) INTEGER. The number of elementary reflectors whose product
defines the matrix Q (n ≥ k ≥ 0).
a (local)
COMPLEX for pcungqr
DOUBLE COMPLEX for pzungqr
The j-th column must contain the vector that defines the elementary
reflector H(j), ja ≤ j ≤ ja+k-1, as returned by p?geqrf in the k columns of
its distributed matrix argument A(ia:*, ja:ja+k-1).
indicating the first row and the first column of the submatrix A, respectively.
tau (local)
COMPLEX for pcungqr

work (local)
COMPLEX for pcungqr
lwork ≥ nb_a*(nqa0 + mpa0 + nb_a), where
iroffa = mod(ia-1, mb_a),
icoffa = mod(ja-1, nb_a),
nqa0 = numroc(n+icoffa, nb_a, MYCOL, iacol, NPCOL)

Output Parameters
work(1) On exit, work(1) contains the minimum value of lwork required for
optimum performance.
See Also
notations.
p?ormqr
Multiplies a general matrix by the orthogonal matrix Q
of the QR factorization formed by p?geqrf.
Syntax
call psormqr(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pdormqr(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
The p?ormqr routine overwrites the general real m-by-n distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1)
with
              side = 'L'      side = 'R'
trans = 'N':  Q*sub(C)        sub(C)*Q
trans = 'T':  Q^T*sub(C)      sub(C)*Q^T
where Q is a real orthogonal distributed matrix defined as the product of k elementary reflectors
Q = H(1) H(2)... H(k)
as returned by p?geqrf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
side (global) CHARACTER
='L': Q or Q^T is applied from the left.
='R': Q or Q^T is applied from the right.
trans (global) CHARACTER
='N', no transpose, Q is applied.
='T', transpose, Q^T is applied.
m (global) INTEGER. The number of rows in the distributed matrix sub(C)
(m ≥ 0).
n (global) INTEGER. The number of columns in the distributed matrix sub(C)
(n ≥ 0).
k (global) INTEGER. The number of elementary reflectors whose product
defines the matrix Q. Constraints:
If side = 'L', m ≥ k ≥ 0
If side = 'R', n ≥ k ≥ 0.
a (local)
REAL for psormqr
DOUBLE PRECISION for pdormqr.
reflector H(j), ja ≤ j ≤ ja+k-1, as returned by p?geqrf in the k columns of its
distributed matrix argument A(ia:*, ja:ja+k-1). A(ia:*, ja:ja+k-1) is
modified by the routine but restored on exit.
If side = 'L', lld_a ≥ max(1, LOCr(ia+m-1))
If side = 'R', lld_a ≥ max(1, LOCr(ia+n-1))
tau (local)
REAL for psormqr
DOUBLE PRECISION for pdormqr

c (local)
REAL for psormqr
Pointer into the local memory to an array of local size (lld_c, LOCc(jc+n-1)).
Contains the local pieces of the distributed matrix sub(C) to be factored.
ic, jc (global) INTEGER. The row and column indices in the global matrix C
indicating the first row and the first column of the matrix sub(C),
respectively.
descc (global and local) INTEGER array of size dlen_. The array descriptor for the
distributed matrix C.
work (local)
REAL for psormqr
lwork (local or global) INTEGER, size of work, must be at least:
if side = 'L',
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0+mpc0)*nb_a) + nb_a*nb_a
else if side = 'R',
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0+max(npa0+numroc(numroc(n
+icoffc, nb_a, 0, 0, NPCOL), nb_a, 0, 0, lcmq), mpc0))*nb_a)
+ nb_a*nb_a
end if
where
lcmq = lcm/NPCOL with lcm = ilcm(NPROW, NPCOL),
npa0= numroc(n+iroffa, mb_a, MYROW, iarow, NPROW),
iroffc = mod(ic-1, mb_c),
icoffc = mod(jc-1, nb_c),
icrow = indxg2p(ic, mb_c, MYROW, rsrc_c, NPROW),
iccol = indxg2p(jc, nb_c, MYCOL, csrc_c, NPCOL),
mpc0= numroc(m+iroffc, mb_c, MYROW, icrow, NPROW),
nqc0= numroc(n+icoffc, nb_c, MYCOL, iccol, NPCOL),

ilcm, indxg2p and numroc are ScaLAPACK tool functions; MYROW, MYCOL,
NPROW and NPCOL can be determined by calling the subroutine
blacs_gridinfo.
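The two branches of the p?ormqr workspace bound can be evaluated as in the following sketch. It is an illustration under stated assumptions, not library code: numroc_sketch and ilcm_sketch are hypothetical stand-ins for the ScaLAPACK tool functions numroc and ilcm, the indxg2p results are computed inline from the block-cyclic rule, and iroffa and iarow (which the text above uses but does not define) are assumed to follow the same pattern as iroffc and icrow.

```fortran
! Sketch: lower bound on lwork for p?ormqr, for side = 'L' or side = 'R'.
! All *_sketch routines are illustrative stand-ins (assumptions).
module ormqr_lwork_sketch
  implicit none
contains

  ! Local length of a block-cyclically distributed dimension (cf. numroc).
  pure integer function numroc_sketch(n, nb, iproc, isrcproc, nprocs) result(nloc)
    integer, intent(in) :: n, nb, iproc, isrcproc, nprocs
    integer :: mydist, nblocks, extrablks
    mydist    = mod(nprocs + iproc - isrcproc, nprocs)
    nblocks   = n / nb
    nloc      = (nblocks / nprocs) * nb
    extrablks = mod(nblocks, nprocs)
    if (mydist < extrablks) then
      nloc = nloc + nb
    else if (mydist == extrablks) then
      nloc = nloc + mod(n, nb)
    end if
  end function numroc_sketch

  ! Least common multiple of two positive integers (cf. ilcm).
  pure integer function ilcm_sketch(a, b) result(l)
    integer, intent(in) :: a, b
    integer :: x, y, t
    x = a
    y = b
    do while (y /= 0)          ! Euclid's algorithm leaves gcd(a, b) in x
      t = mod(x, y)
      x = y
      y = t
    end do
    l = (a / x) * b
  end function ilcm_sketch

  pure integer function ormqr_min_lwork(side, m, n, ia, ic, jc, mb_a, nb_a, rsrc_a, &
                                        mb_c, nb_c, rsrc_c, csrc_c, &
                                        myrow, mycol, nprow, npcol) result(lwork)
    character(len=1), intent(in) :: side
    integer, intent(in) :: m, n, ia, ic, jc, mb_a, nb_a, rsrc_a
    integer, intent(in) :: mb_c, nb_c, rsrc_c, csrc_c, myrow, mycol, nprow, npcol
    integer :: iroffa, iroffc, icoffc, iarow, icrow, iccol
    integer :: npa0, mpc0, nqc0, lcmq, inner
    iroffc = mod(ic - 1, mb_c)
    icoffc = mod(jc - 1, nb_c)
    icrow  = mod(rsrc_c + (ic - 1) / mb_c, nprow)    ! value indxg2p would return
    iccol  = mod(csrc_c + (jc - 1) / nb_c, npcol)
    mpc0   = numroc_sketch(m + iroffc, mb_c, myrow, icrow, nprow)
    nqc0   = numroc_sketch(n + icoffc, nb_c, mycol, iccol, npcol)
    if (side == 'L' .or. side == 'l') then
      lwork = max((nb_a*(nb_a-1))/2, (nqc0 + mpc0)*nb_a) + nb_a*nb_a
    else
      iroffa = mod(ia - 1, mb_a)                     ! assumed by analogy with iroffc
      iarow  = mod(rsrc_a + (ia - 1) / mb_a, nprow)  ! assumed by analogy with icrow
      npa0   = numroc_sketch(n + iroffa, mb_a, myrow, iarow, nprow)
      lcmq   = ilcm_sketch(nprow, npcol) / npcol
      inner  = numroc_sketch(numroc_sketch(n + icoffc, nb_a, 0, 0, npcol), nb_a, 0, 0, lcmq)
      lwork  = max((nb_a*(nb_a-1))/2, (nqc0 + max(npa0 + inner, mpc0))*nb_a) + nb_a*nb_a
    end if
  end function ormqr_min_lwork

end module ormqr_lwork_sketch
```

The side = 'R' branch is usually the larger of the two because it adds room for a redistributed copy of the reflectors. With m = 4, n = 8, all starting indices equal to 1, 2-by-2 blocks, and a 2-by-2 grid, this sketch gives 16 for side = 'L' and 28 for side = 'R' on process (0,0).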
Output Parameters
c Overwritten by the product Q*sub(C), or Q^T*sub(C), or sub(C)*Q^T, or
sub(C)*Q.

See Also
notations.
p?unmqr
Multiplies a general matrix by the unitary matrix Q of
the QR factorization formed by p?geqrf.
Syntax
call pcunmqr(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pzunmqr(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
This routine overwrites the general complex m-by-n distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1)
with
              side = 'L'      side = 'R'
trans = 'N':  Q*sub(C)        sub(C)*Q
trans = 'C':  Q^H*sub(C)      sub(C)*Q^H
where Q is a complex unitary distributed matrix defined as the product of k elementary reflectors
Q = H(1) H(2)... H(k)
as returned by p?geqrf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
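side (global) CHARACTER
='L': Q or Q^H is applied from the left.
='R': Q or Q^H is applied from the right.
trans (global) CHARACTER
='N', no transpose, Q is applied.
='C', conjugate transpose, Q^H is applied.
m (global) INTEGER. The number of rows in the distributed matrix sub(C)
(m ≥ 0).
n (global) INTEGER. The number of columns in the distributed matrix sub(C)
(n ≥ 0).
k (global) INTEGER. The number of elementary reflectors whose product
defines the matrix Q. Constraints:
If side = 'L', m ≥ k ≥ 0
If side = 'R', n ≥ k ≥ 0.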
a (local)
COMPLEX for pcunmqr
DOUBLE COMPLEX for pzunmqr.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+k-1)).
reflector H(j), ja ≤ j ≤ ja+k-1, as returned by p?geqrf in the k columns of its
distributed matrix argument A(ia:*, ja:ja+k-1).
If side = 'L', lld_a ≥ max(1, LOCr(ia+m-1))
If side = 'R', lld_a ≥ max(1, LOCr(ia+n-1))
tau (local)
COMPLEX for pcunmqr
DOUBLE COMPLEX for pzunmqr

c (local)
COMPLEX for pcunmqr

+n-1)).
indicating the first row and the first column of the submatrix C, respectively.
work (local)
COMPLEX for pcunmqr
If side = 'L',
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0 + mpc0)*nb_a) + nb_a*nb_a
else if side = 'R',
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0 + max(npa0 +
numroc(numroc(n+icoffc, nb_a, 0, 0, NPCOL), nb_a, 0, 0,
lcmq), mpc0))*nb_a) + nb_a*nb_a
end if
where
lcmq = lcm/NPCOL with lcm = ilcm (NPROW, NPCOL),
npa0 = numroc(n+iroffa, mb_a, MYROW, iarow, NPROW),
mpc0 = numroc(m+iroffc, mb_c, MYROW, icrow, NPROW),
nqc0 = numroc(n+icoffc, nb_c, MYCOL, iccol, NPCOL),
blacs_gridinfo.
Output Parameters
c Overwritten by the product Q*sub(C), or Q^H*sub(C), or sub(C)*Q^H, or
sub(C)*Q.

See Also
notations.
p?gelqf
Computes the LQ factorization of a general
rectangular matrix.
Syntax
call psgelqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgelqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgelqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgelqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?gelqf routine computes the LQ factorization of a real/complex distributed m-by-n matrix sub(A)=
A(ia:ia+m-1,ja:ja+n-1) = L*Q.
Input Parameters
m (global) INTEGER. The number of rows in the distributed submatrix sub(A)
(m ≥ 0).
n (global) INTEGER. The number of columns in the distributed submatrix
sub(A) (n ≥ 0).
a (local)
REAL for psgelqf
DOUBLE PRECISION for pdgelqf
COMPLEX for pcgelqf
DOUBLE COMPLEX for pzgelqf
+n-1)).
ia, ja (global) INTEGER. The row and column indices in the global array A
indicating the first row and the first column of the submatrix A(ia:ia+m-1,
ja:ja+n-1), respectively.
work (local)
REAL for psgelqf
COMPLEX for pcgelqf
lwork (local or global) INTEGER, size of work, must be at least
lwork ≥ mb_a*(mp0 + nq0 + mb_a), where
iroff = mod(ia-1, mb_a),
icoff = mod(ja-1, nb_a),
nq0 = numroc(n+icoff, nb_a, MYCOL, iacol, NPCOL)
NOTE

Output Parameters
a The elements on and below the diagonal of sub(A) contain the m-by-
min(m,n) lower trapezoidal matrix L (L is lower triangular if m ≤ n); the
elements above the diagonal, with the array tau, represent the orthogonal/
unitary matrix Q as a product of elementary reflectors (see Application
Notes below).
tau (local)
REAL for psgelqf
COMPLEX for pcgelqf
Array of size LOCr(ia+min(m, n)-1).
Contains the scalar factors of elementary reflectors. tau is tied to the
distributed matrix A.
Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(ia+k-1)*H(ia+k-2)*...*H(ia),
where k = min(m,n).
Each H(i) has the form
H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:n) is
stored on exit in A(ia+i-1,ja+i:ja+n-1), and tau in tau(ia+i-1).
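To make the reflector form concrete, the sketch below builds H = I - tau*v*v' for a small real vector v with v(1:i-1) = 0 and v(i) = 1, and measures how far H*H' is from the identity. It is a serial illustration only and does not call p?gelqf: the module, the function names, and the sample vector are hypothetical. It relies on the fact that, for a real vector v, the choice tau = 2/(v'*v) makes H orthogonal; in the complex case the transpose becomes a conjugate transpose.

```fortran
! Sketch: form a real elementary reflector H = I - tau*v*v' and check that
! it is orthogonal. Names and data are illustrative assumptions.
module reflector_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains

  ! Dense n-by-n matrix I - tau*v*v' for a real vector v of length n.
  pure function reflector(tau, v) result(h)
    real(dp), intent(in) :: tau, v(:)
    real(dp) :: h(size(v), size(v))
    integer :: i
    ! spread(v, 2, n) has entries v(i); spread(v, 1, n) has entries v(j),
    ! so their elementwise product is the outer product v*v'.
    h = -tau * spread(v, 2, size(v)) * spread(v, 1, size(v))
    do i = 1, size(v)
      h(i, i) = h(i, i) + 1.0_dp
    end do
  end function reflector

  ! Largest absolute entry of H*H' - I; close to zero when H is orthogonal.
  pure function orthogonality_error(h) result(err)
    real(dp), intent(in) :: h(:, :)
    real(dp) :: err
    real(dp) :: r(size(h, 1), size(h, 1))
    integer :: i
    r = matmul(h, transpose(h))
    do i = 1, size(h, 1)
      r(i, i) = r(i, i) - 1.0_dp
    end do
    err = maxval(abs(r))
  end function orthogonality_error

end module reflector_sketch
```

A caller could, for instance, take v = (0, 1, 0.5, -2), which has the shape described above for i = 2, set tau = 2/dot_product(v, v), and confirm that orthogonality_error(reflector(tau, v)) is at rounding level. The same H maps v to -v, which is the defining property of a Householder reflection.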
See Also
notations.
p?orglq
Generates the real orthogonal matrix Q of the LQ
factorization formed by p?gelqf.
Syntax
call psorglq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorglq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?orglq routine generates the whole or part of m-by-n real distributed matrix Q denoting A(ia:ia
+m-1,ja:ja+n-1) with orthonormal rows, which is defined as the first m rows of a product of k elementary
reflectors of order n
Q = H(k)*...* H(2)* H(1)
as returned by p?gelqf.
Input Parameters
n (global) INTEGER. The number of columns in the matrix sub(Q) (n ≥ m ≥ 0).
k (global) INTEGER. The number of elementary reflectors whose product
defines the matrix Q (m ≥ k ≥ 0).
a (local)
REAL for psorglq
DOUBLE PRECISION for pdorglq
+n-1)). On entry, the i-th row must contain the vector that defines the
elementary reflector H(i), ia ≤ i ≤ ia+k-1, as returned by p?gelqf in the k
rows of its distributed matrix argument A(ia:ia+k-1, ja:*).
work (local)
REAL for psorglq

lwork ≥ mb_a*(mpa0+nqa0+mb_a), where
NOTE

Output Parameters
a Contains the local pieces of the m-by-n distributed matrix Q.
tau (local)
REAL for psorglq
Array of size LOCr(ia+k-1).
Contains the scalar factors tau(j) of elementary reflectors H(j). tau is tied
to the distributed matrix A.

See Also
notations.
p?unglq
Generates the unitary matrix Q of the LQ factorization
formed by p?gelqf.
Syntax
call pcunglq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzunglq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
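This routine generates the whole or part of the m-by-n complex distributed matrix Q denoting A(ia:ia
+m-1,ja:ja+n-1) with orthonormal rows, which is defined as the first m rows of a product of k elementary
reflectors of order n
Q = (H(k))^H*...*(H(2))^H*(H(1))^H
as returned by p?gelqf.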
Input Parameters
m (global) INTEGER. The number of rows in the matrix sub(Q) (m ≥ 0).

a (local)
COMPLEX for pcunglq

DOUBLE COMPLEX for pzunglq
+n-1)). On entry, the i-th row must contain the vector that defines the
elementary reflector H(i), ia ≤ i ≤ ia+k-1, as returned by p?gelqf in the k
rows of its distributed matrix argument A(ia:ia+k-1, ja:*).
tau (local)
COMPLEX for pcunglq
Array of size LOCr(ia+k-1).
work (local)
COMPLEX for pcunglq

lwork ≥ mb_a*(mpa0+nqa0+mb_a), where
NOTE

Output Parameters

See Also
notations.
p?ormlq
Multiplies a general matrix by the orthogonal matrix Q
of the LQ factorization formed by p?gelqf.
Syntax
call psormlq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pdormlq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
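The p?ormlq routine overwrites the general real m-by-n distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1)
with
              side = 'L'      side = 'R'
trans = 'N':  Q*sub(C)        sub(C)*Q
trans = 'T':  Q^T*sub(C)      sub(C)*Q^T
where Q is a real orthogonal distributed matrix defined as the product of k elementary reflectors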
Q = H(k)...H(2) H(1)
as returned by p?gelqf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
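side (global) CHARACTER
='L': Q or Q^T is applied from the left.
='R': Q or Q^T is applied from the right.
trans (global) CHARACTER
='N', no transpose, Q is applied.
='T', transpose, Q^T is applied.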

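m (global) INTEGER. The number of rows in the distributed matrix sub(C)
(m ≥ 0).
n (global) INTEGER. The number of columns in the distributed matrix sub(C)
(n ≥ 0).
k (global) INTEGER. The number of elementary reflectors whose product
defines the matrix Q. Constraints:
If side = 'L', m ≥ k ≥ 0
If side = 'R', n ≥ k ≥ 0.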
a (local)
REAL for psormlq
DOUBLE PRECISION for pdormlq.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)),
if side = 'L' and (lld_a,LOCc(ja+n-1)), if side = 'R'. The i-th row
must contain the vector that defines the elementary reflector H(i), ia ≤ i ≤ ia+k-1,
as returned by p?gelqf in the k rows of its distributed matrix
argument A(ia:ia+k-1, ja:*).
A(ia:ia+k-1, ja:*) is modified by the routine but restored on exit.
tau (local)
REAL for psormlq
DOUBLE PRECISION for pdormlq

by p?gelqf. tau is tied to the distributed matrix A.
c (local)
REAL for psormlq
DOUBLE PRECISION for pdormlq
+n-1)).
work (local)
REAL for psormlq
DOUBLE PRECISION for pdormlq.
lwork (local or global) INTEGER, size of the array work; must be at least:
If side = 'L',
lwork ≥ max((mb_a*(mb_a-1))/2, (mpc0 + max(mqa0 + numroc(numroc(m
+ iroffc, mb_a, 0, 0, NPROW), mb_a, 0, 0, lcmp), nqc0))*mb_a)
+ mb_a*mb_a
else if side = 'R',
lwork ≥ max((mb_a*(mb_a-1))/2, (mpc0+nqc0)*mb_a) + mb_a*mb_a

end if
where
lcmp = lcm/NPROW with lcm = ilcm (NPROW, NPCOL),
mqa0 = numroc(m+icoffa, nb_a, MYCOL, iacol, NPCOL),
NOTE
blacs_gridinfo.
Output Parameters
c Overwritten by the product Q*sub(C), or Q'*sub(C), or sub(C)*Q', or
sub(C)*Q.

See Also
notations.
p?unmlq
Multiplies a general matrix by the unitary matrix Q of
the LQ factorization formed by p?gelqf.
Syntax
call pcunmlq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pzunmlq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
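This routine overwrites the general complex m-by-n distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1)
with
              side = 'L'      side = 'R'
trans = 'N':  Q*sub(C)        sub(C)*Q
trans = 'C':  Q^H*sub(C)      sub(C)*Q^H
where Q is a complex unitary distributed matrix defined as the product of k elementary reflectors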
Q = H(k)' ... H(2)' H(1)'
as returned by p?gelqf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters


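m (global) INTEGER. The number of rows in the distributed matrix sub(C)
(m ≥ 0).
n (global) INTEGER. The number of columns in the distributed matrix sub(C)
(n ≥ 0).
k (global) INTEGER. The number of elementary reflectors whose product
defines the matrix Q. Constraints:
If side = 'L', m ≥ k ≥ 0
If side = 'R', n ≥ k ≥ 0.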
a (local)
COMPLEX for pcunmlq
DOUBLE COMPLEX for pzunmlq.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)),
if side = 'L' and (lld_a,LOCc(ja+n-1)), if side = 'R', where lld_a ≥
max(1, LOCr(ia+k-1)). The i-th row must contain the vector that
defines the elementary reflector H(i), ia ≤ i ≤ ia+k-1, as returned by p?gelqf
in the k rows of its distributed matrix argument A(ia:ia+k-1, ja:*).
A( ia:ia+k-1, ja:*) is modified by the routine but restored on exit.
tau (local)
COMPLEX for pcunmlq
DOUBLE COMPLEX for pzunmlq
Array of size LOCc(ia+k-1).

by p?gelqf. tau is tied to the distributed matrix A.
c (local)
COMPLEX for pcunmlq
+n-1)).
work (local)
COMPLEX for pcunmlq

lwork (local or global) INTEGER, size of the array work; must be at least:
If side = 'L',
lwork ≥ max((mb_a*(mb_a-1))/2, (mpc0 + max(mqa0 +
numroc(numroc(m + iroffc, mb_a, 0, 0, NPROW), mb_a, 0, 0,
lcmp), nqc0))*mb_a) + mb_a*mb_a
else if side = 'R',
lwork ≥ max((mb_a*(mb_a-1))/2, (mpc0 + nqc0)*mb_a) + mb_a*mb_a

end if
where
mqa0 = numroc(m + icoffa, nb_a, MYCOL, iacol, NPCOL),
NOTE
blacs_gridinfo.
Output Parameters
c Overwritten by the product Q*sub(C), or Q'*sub(C), or sub(C)*Q', or
sub(C)*Q.

See Also
notations.
p?geqlf
Computes the QL factorization of a general matrix.
Syntax
call psgeqlf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgeqlf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgeqlf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgeqlf(m, n, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?geqlf routine forms the QL factorization of a real/complex distributed m-by-n matrix sub(A) = A(ia:ia+m-1,
ja:ja+n-1) = Q*L.
Input Parameters
m (global) INTEGER. The number of rows in the matrix sub(A) (m ≥ 0).
n (global) INTEGER. The number of columns in the matrix sub(A) (n ≥ 0).
a (local)
REAL for psgeqlf
DOUBLE PRECISION for pdgeqlf
COMPLEX for pcgeqlf
DOUBLE COMPLEX for pzgeqlf
+n-1)). Contains the local pieces of the distributed matrix sub(A) to be
factored.
work (local)
REAL for psgeqlf

COMPLEX for pcgeqlf
lwork (local or global) INTEGER, size of work, must be at least
lwork ≥ nb_a*(mp0 + nq0 + nb_a), where
nq0 = numroc(n+icoff, nb_a, MYCOL, iacol, NPCOL)
NOTE
numroc and indxg2p are ScaLAPACK tool functions; MYROW, MYCOL, NPROW
and NPCOL can be determined by calling the subroutine blacs_gridinfo.
Output Parameters
a On exit, if m ≥ n, the lower triangle of the distributed submatrix A(ia+m-n:ia+m-1,
ja:ja+n-1) contains the n-by-n lower triangular matrix L; if m ≤ n, the
elements on and below the (n - m)-th superdiagonal contain the m-by-n
lower trapezoidal matrix L; the remaining elements, with the array tau,
represent the orthogonal/unitary matrix Q as a product of elementary
reflectors (see Application Notes below).
tau (local)
REAL for psgeqlf
COMPLEX for pcgeqlf
Array of size LOCc(ja+n-1).
Contains the scalar factors of elementary reflectors. tau is tied to the
distributed matrix A.

Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(ja+k-1)*...*H(ja+1)*H(ja),
where k = min(m,n).
Each H(i) has the form
H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(m-k+i+1:m) = 0 and v(m-k+i) = 1;
v(1:m-k+i-1) is stored on exit in A(ia:ia+m-k+i-2, ja+n-k+i-1), and tau in tau(ja+n-k+i-1).
See Also
notations.
p?orgql
Generates the orthogonal matrix Q of the QL
factorization formed by p?geqlf.
Syntax
call psorgql(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorgql(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
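The p?orgql routine generates the whole or part of the m-by-n real distributed matrix Q denoting A(ia:ia
+m-1,ja:ja+n-1) with orthonormal columns, which is defined as the last n columns of a product of k
elementary reflectors of order m
Q = H(k)*...*H(2)*H(1)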
as returned by p?geqlf.
Input Parameters
m (global) INTEGER. The number of rows in the matrix sub(Q) (m ≥ 0).
n (global) INTEGER. The number of columns in the matrix sub(Q) (m ≥ n ≥ 0).
k (global) INTEGER. The number of elementary reflectors whose product
defines the matrix Q (n ≥ k ≥ 0).
a (local)
REAL for psorgql
DOUBLE PRECISION for pdorgql
+n-1)). On entry, the j-th column must contain the vector that defines the
elementary reflector H(j), ja+n-k ≤ j ≤ ja+n-1, as returned by p?geqlf in the
k columns of its distributed matrix argument A(ia:*,ja+n-k:ja+n-1).
tau (local)
REAL for psorgql
work (local)
REAL for psorgql

lwork ≥ nb_a*(nqa0+mpa0+nb_a), where
NOTE

Output Parameters

See Also
notations.
p?ungql
Generates the unitary matrix Q of the QL factorization
formed by p?geqlf.
Syntax
call pcungql(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzungql(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
This routine generates the whole or part of the m-by-n complex distributed matrix Q denoting A(ia:ia
+m-1,ja:ja+n-1) with orthonormal columns, which is defined as the last n columns of a product of k
elementary reflectors of order m
Q = (H(k))^H*...*(H(2))^H*(H(1))^H
as returned by p?geqlf.
Input Parameters
m (global) INTEGER. The number of rows in the matrix sub(Q) (m ≥ 0).
n (global) INTEGER. The number of columns in the matrix sub(Q) (m ≥ n ≥ 0).
k (global) INTEGER. The number of elementary reflectors whose product
defines the matrix Q (n ≥ k ≥ 0).
a (local)
COMPLEX for pcungql
DOUBLE COMPLEX for pzungql
(lld_a,LOCc(ja+n-1)). On entry, the j-th column must contain the
vector that defines the elementary reflector H(j), ja+n-k ≤ j ≤ ja+n-1,
as returned by p?geqlf in the k columns of its distributed matrix
argument A(ia:*, ja+n-k: ja+n-1).
tau (local)
COMPLEX for pcungql
Array of size LOCc(ja+n-1).
work (local)
COMPLEX for pcungql

lwork ≥ nb_a*(nqa0 + mpa0 + nb_a), where

Output Parameters

See Also
notations.
p?ormql
Multiplies a general matrix by the orthogonal matrix Q
of the QL factorization formed by p?geqlf.
Syntax
call psormql(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pdormql(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
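The p?ormql routine overwrites the general real m-by-n distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1)
with
              side = 'L'      side = 'R'
trans = 'N':  Q*sub(C)        sub(C)*Q
trans = 'T':  Q^T*sub(C)      sub(C)*Q^T
where Q is a real orthogonal distributed matrix defined as the product of k elementary reflectors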
Q = H(k)' ... H(2)' H(1)'
as returned by p?geqlf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters


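m (global) INTEGER. The number of rows in the distributed matrix sub(C)
(m ≥ 0).
n (global) INTEGER. The number of columns in the distributed matrix sub(C)
(n ≥ 0).
k (global) INTEGER. The number of elementary reflectors whose product
defines the matrix Q. Constraints:
If side = 'L', m ≥ k ≥ 0
If side = 'R', n ≥ k ≥ 0.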
a (local)
REAL for psormql

DOUBLE PRECISION for pdormql.
reflector H(j), ja ≤ j ≤ ja+k-1, as returned by p?geqlf in the k columns of its
distributed matrix argument A(ia:*, ja:ja+k-1).
If side = 'L', lld_a ≥ max(1, LOCr(ia+m-1)),
If side = 'R', lld_a ≥ max(1, LOCr(ia+n-1)).
tau (local)
REAL for psormql

by p?geqlf. tau is tied to the distributed matrix A.
c (local)
REAL for psormql
+n-1)).
work (local)
REAL for psormql
lwork (local or global) INTEGER, dimension of work, must be at least:
If side = 'L',
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0+mpc0)*nb_a) + nb_a*nb_a
else if side = 'R',
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0+max(npa0 +
numroc(numroc(n+icoffc, nb_a, 0, 0, NPCOL), nb_a, 0, 0,
lcmq), mpc0))*nb_a) + nb_a*nb_a
end if
where
lcmq = lcm/NPCOL with lcm = ilcm (NPROW, NPCOL),
npa0= numroc(n + iroffa, mb_a, MYROW, iarow, NPROW),
NOTE
blacs_gridinfo.
Output Parameters
c Overwritten by the product Q*sub(C), or Q'*sub(C), or sub(C)*Q', or
sub(C)*Q.

See Also
notations.
p?unmql
Multiplies a general matrix by the unitary matrix Q of
the QL factorization formed by p?geqlf.
Syntax
call pcunmql(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pzunmql(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
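This routine overwrites the general complex m-by-n distributed matrix sub(C) = C(ic:ic+m-1, jc:jc+n-1)
with
              side = 'L'      side = 'R'
trans = 'N':  Q*sub(C)        sub(C)*Q
trans = 'C':  Q^H*sub(C)      sub(C)*Q^H
where Q is a complex unitary distributed matrix defined as the product of k elementary reflectors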
Q = H(k)' ... H(2)' H(1)'
as returned by p?geqlf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters



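m (global) INTEGER. The number of rows in the distributed matrix sub(C)
(m ≥ 0).
n (global) INTEGER. The number of columns in the distributed matrix sub(C)
(n ≥ 0).
k (global) INTEGER. The number of elementary reflectors whose product
defines the matrix Q. Constraints:
If side = 'L', m ≥ k ≥ 0
If side = 'R', n ≥ k ≥ 0.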
a (local)
COMPLEX for pcunmql
DOUBLE COMPLEX for pzunmql.
reflector H(j), ja ≤ j ≤ ja+k-1, as returned by p?geqlf in the k columns of its
distributed matrix argument A(ia:*, ja:ja+k-1).
If side = 'L', lld_a ≥ max(1, LOCr(ia+m-1)),
If side = 'R', lld_a ≥ max(1, LOCr(ia+n-1)).
tau (local)
COMPLEX for pcunmql
DOUBLE COMPLEX for pzunmql
Array of size LOCc(ia+n-1).

by p?geqlf. tau is tied to the distributed matrix A.
c (local)
COMPLEX for pcunmql
+n-1)).
work (local)
COMPLEX for pcunmql
If side = 'L',
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0+mpc0)*nb_a) + nb_a*nb_a
else if side = 'R',
lwork ≥ max((nb_a*(nb_a-1))/2, (nqc0+max(npa0 + numroc(numroc(n
+icoffc, nb_a, 0, 0, NPCOL), nb_a, 0, 0, lcmq), mpc0))*nb_a)
+ nb_a*nb_a
end if
where
lcmq = lcm/NPCOL with lcm = ilcm(NPROW, NPCOL),
npa0 = numroc (n + iroffa, mb_a, MYROW, iarow, NPROW),
NOTE
blacs_gridinfo.
NOTE

Output Parameters
c Overwritten by the product Q*sub(C), or Q'*sub(C), or sub(C)*Q', or
sub(C)*Q.

See Also
notations.
p?gerqf
Computes the RQ factorization of a general
rectangular matrix.
Syntax
call psgerqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdgerqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pcgerqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pzgerqf(m, n, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?gerqf routine forms the RQ factorization of a general m-by-n distributed matrix sub(A) = A(ia:ia+m-1,
ja:ja+n-1) as
A = R*Q
Input Parameters
m (global) INTEGER. The number of rows in the distributed matrix sub(A);

(m ≥ 0).
n (global) INTEGER. The number of columns in the distributed matrix sub(A);
(n ≥ 0).
a (local)
REAL for psgerqf
DOUBLE PRECISION for pdgerqf
COMPLEX for pcgerqf
DOUBLE COMPLEX for pzgerqf
+n-1)).
distributed matrix A
work (local).
REAL for psgerqf
DOUBLE PRECISION for pdgerqf
COMPLEX for pcgerqf
DOUBLE COMPLEX for pzgerqf

lwork ≥ mb_a*(mp0+nq0+mb_a), where
NOTE
nq0 = numroc(n+icoff, nb_a, MYCOL, iacol, NPCOL) and numroc,

indxg2p are ScaLAPACK tool functions; MYROW, MYCOL, NPROW and NPCOL
can be determined by calling the subroutine blacs_gridinfo.

Output Parameters
a On exit, if m ≤ n, the upper triangle of A(ia:ia+m-1, ja:ja+n-1) contains the
m-by-m upper triangular matrix R; if m ≥ n, the elements on and above the (m
- n)-th subdiagonal contain the m-by-n upper trapezoidal matrix R; the
remaining elements, with the array tau, represent the orthogonal/unitary
matrix Q as a product of elementary reflectors (see Application Notes
below).
tau (local)
REAL for psgerqf
DOUBLE PRECISION for pdgerqf
COMPLEX for pcgerqf
DOUBLE COMPLEX for pzgerqf
Array of size LOCr(ia+m-1).

Application Notes
The matrix Q is represented as a product of elementary reflectors
Q = H(ia)*H(ia+1)*...*H(ia+k-1),
where k = min(m,n).
Each H(i) has the form
H(i) = I - tau*v*v'
where tau is a real/complex scalar, and v is a real/complex vector with v(n-k+i+1:n) = 0 and v(n-k+i) = 1;
v(1:n-k+i-1) is stored on exit in A(ia+m-k+i-1,ja:ja+n-k+i-2), and tau in tau(ia+m-k+i-1).
See Also
notations.
p?orgrq
Generates the orthogonal matrix Q of the RQ
factorization formed by p?gerqf.
Syntax
call psorgrq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pdorgrq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?orgrq routine generates the whole or part of the m-by-n real distributed matrix Q denoting A(ia:ia
+m-1,ja:ja+n-1) with orthonormal rows that is defined as the last m rows of a product of k elementary
reflectors of order n
Q = H(1)*H(2)*...*H(k)
as returned by p?gerqf.
Input Parameters
m (global) INTEGER. The number of rows in the matrix sub(Q) (m ≥ 0).
n (global) INTEGER. The number of columns in the matrix sub(Q) (n ≥ m ≥ 0).

a (local)
REAL for psorgrq
DOUBLE PRECISION for pdorgrq

+n-1)). The i-th row must contain the vector that defines the elementary
reflector H(i), ia+m-k ≤ i ≤ ia+m-1, as returned by p?gerqf in the k rows of its
distributed matrix argument A(ia+m-k:ia+m-1, ja:*).
tau (local)
REAL for psorgrq
Contains the scalar factor tau(i) of elementary reflectors H(i) as returned

by p?gerqf. tau is tied to the distributed matrix A.
work (local)
REAL for psorgrq

lwork ≥ mb_a*(mpa0 + nqa0 + mb_a), where
NOTE

Output Parameters

See Also
notations.
p?ungrq
Generates the unitary matrix Q of the RQ factorization
formed by p?gerqf.
Syntax
call pcungrq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
call pzungrq(m, n, k, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
This routine generates the m-by-n complex distributed matrix Q denoting A(ia:ia+m-1,ja:ja+n-1) with
orthonormal rows, which is defined as the last m rows of a product of k elementary reflectors of order n
Q = (H(1))^H*(H(2))^H*...*(H(k))^H
as returned by p?gerqf.
Input Parameters

a (local)
COMPLEX for pcungrq
DOUBLE COMPLEX for pzungrq
The i-th row must contain the vector that defines the elementary reflector
H(i), ia+m-k ≤ i ≤ ia+m-1, as returned by p?gerqf in the k rows of its
distributed matrix argument A(ia+m-k:ia+m-1, ja:*).
tau (local)
COMPLEX for pcungrq
DOUBLE COMPLEX for pzungrq

work (local)
COMPLEX for pcungrq
DOUBLE COMPLEX for pzungrq

lwork ≥ mb_a*(mpa0 + nqa0 + mb_a), where
NOTE

Output Parameters

See Also
notations.
p?ormr3
Applies an orthogonal distributed matrix to a general
m-by-n distributed matrix.
Syntax
call psormr3 (side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info )
call pdormr3 (side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info )
Description
p?ormr3 overwrites the general real m-by-n distributed matrix sub( C ) = C(ic:ic+m-1,jc:jc+n-1) with
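              side = 'L'        side = 'R'
trans = 'N':  Q * sub( C )      sub( C ) * Q
trans = 'T':  Q^T * sub( C )    sub( C ) * Q^T
where Q is a real orthogonal distributed matrix defined as the product of k elementary reflectors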
Q = H(1) H(2) . . . H(k)
as returned by p?tzrzf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters
side (global)
CHARACTER.
= 'L': apply Q or Q^T from the Left;
= 'R': apply Q or Q^T from the Right.
trans (global)
CHARACTER.
= 'N': No transpose, apply Q;
= 'T': Transpose, apply Q^T.
m (global)
INTEGER.
The number of rows to be operated on, i.e., the number of rows of the
distributed submatrix sub( C ). m >= 0.
n (global)
INTEGER.
The number of columns to be operated on, i.e., the number of columns of the
distributed submatrix sub( C ). n >= 0.
k (global)
INTEGER.
If side = 'L', m >= k >= 0,
if side = 'R', n >= k >= 0.
l (global)
INTEGER.
The columns of the distributed submatrix sub( A ) containing the
meaningful part of the Householder reflectors.
If side = 'L', m >= l >= 0,
if side = 'R', n >= l >= 0.
a (local)
REAL for psormr3
DOUBLE PRECISION for pdormr3
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)) if
side='L', and (lld_a,LOCc(ja+n-1)) if side='R', where lld_a >=
MAX(1,LOCr(ia+k-1));
On entry, the i-th row must contain the vector which defines the elementary
reflector H(i), ia <= i <= ia+k-1, as returned by p?tzrzf in the k rows of
its distributed matrix argument A(ia:ia+k-1,ja:*).
A(ia:ia+k-1,ja:*) is modified by the routine but restored on exit.
ia (global)
INTEGER.
The row index in the global array a indicating the first row of sub( A ).
ja (global)
INTEGER.
The column index in the global array a indicating the first column of
sub( A ).
desca (global and local)

INTEGER.
Array of size dlen_.
The array descriptor for the distributed matrix A.
tau (local)
REAL for psormr3
Array, size LOCc(ia+k-1).
This array contains the scalar factors tau(i) of the elementary reflectors
H(i) as returned by p?tzrzf. tau is tied to the distributed matrix A.
c (local)
REAL for psormr3
Pointer into the local memory to an array of size (lld_c,LOCc(jc+n-1)) .
On entry, the local pieces of the distributed matrix sub( C ).
ic (global)
INTEGER.
The row index in the global array c indicating the first row of sub( C ).
jc (global)
INTEGER.
The column index in the global array c indicating the first column of
sub( C ).
descc (global and local)

INTEGER.
The array descriptor for the distributed matrix C.
work (local)
REAL for psormr3
Array, size (lwork)
lwork (local)
INTEGER.
The size of the array work.
lwork is local input and must be at least

If side = 'L', lwork >= MpC0 + MAX( MAX( 1, NqC0 ), numroc( numroc( m
+IROFFC,mb_a,0,0,NPROW ),mb_a,0,0,LCMP ) );
if side = 'R', lwork >= NqC0 + MAX( 1, MpC0 );
where LCMP = LCM / NPROW,
LCM = ilcm( NPROW, NPCOL ),
IROFFC = MOD( ic-1, mb_c ),
ICOFFC = MOD( jc-1, nb_c),
ICROW = indxg2p( ic, mb_c, MYROW, rsrc_c, NPROW ),
ICCOL = indxg2p( jc, nb_c, MYCOL, csrc_c, NPCOL ),
MpC0 = numroc( m+IROFFC, mb_c, MYROW, ICROW, NPROW ),
NqC0 = numroc( n+ICOFFC, nb_c, MYCOL, ICCOL, NPCOL ),
ilcm, indxg2p, and numroc are ScaLAPACK tool functions;

MYROW, MYCOL, NPROW and NPCOL can be determined by calling the
subroutine blacs_gridinfo.

Output Parameters
c On exit, sub( C ) is overwritten by Q*sub( C ) or Q'*sub( C ) or sub( C )*Q'

or sub( C )*Q.
work On exit, work(1) returns the minimal and optimal lwork.
info (local)
INTEGER.
< 0: If the i-th argument is an array and the j-th entry had an illegal value,
then info = -(i*100+j); if the i-th argument is a scalar and had an illegal
value, then info = -i.
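The encoding of negative info values can be unpacked as in the sketch below. It is illustrative only: decode_info and its module are hypothetical names, and the decoding assumes that scalar argument positions and array entry positions are both below 100, so that any magnitude of 100 or more denotes the array form -(i*100+j).

```fortran
! Sketch: split a negative info value into (argument number, entry number).
! decode_info is a hypothetical helper, not a library routine.
module info_decode_sketch
  implicit none
contains

  pure subroutine decode_info(info, iarg, ientry)
    integer, intent(in)  :: info
    integer, intent(out) :: iarg    ! position of the offending argument, 0 if none
    integer, intent(out) :: ientry  ! offending entry of an array argument, 0 for a scalar
    iarg   = 0
    ientry = 0
    if (info >= 0) return           ! non-negative values do not flag an argument
    if (-info >= 100) then
      iarg   = (-info) / 100        ! array argument: info = -(i*100+j)
      ientry = mod(-info, 100)
    else
      iarg = -info                  ! scalar argument: info = -i
    end if
  end subroutine decode_info

end module info_decode_sketch
```

With this convention, info = -5 points at the fifth argument, while info = -1002 points at entry 2 of the tenth argument.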
Application Notes
Alignment requirements
The distributed submatrices A(ia:*, ja:*) and C(ic:ic+m-1,jc:jc+n-1) must verify some alignment
properties, namely the following expressions should be true:
If side = 'L',
( nb_a = mb_c .AND. ICOFFA = IROFFC )

If side = 'R',
( nb_a = nb_c .AND. ICOFFA = ICOFFC .AND. IACOL = ICCOL )
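The alignment conditions above can be checked before the call, as in the following sketch. It is a hedged illustration rather than library code: ormr3_aligned is a hypothetical helper, ICOFFA is assumed to be mod(ja-1, nb_a) by analogy with ICOFFC, and IACOL and ICCOL are computed inline from the block-cyclic rule that indxg2p implements, with csrc_a and csrc_c assumed to be the source-column entries of desca and descc.

```fortran
! Sketch: test the p?ormr3 alignment requirements for sub(A) and sub(C).
! ormr3_aligned is a hypothetical helper; see the assumptions in the text.
module ormr3_alignment_sketch
  implicit none
contains

  pure logical function ormr3_aligned(side, ja, nb_a, csrc_a, ic, jc, &
                                      mb_c, nb_c, csrc_c, npcol) result(ok)
    character(len=1), intent(in) :: side
    integer, intent(in) :: ja, nb_a, csrc_a, ic, jc, mb_c, nb_c, csrc_c, npcol
    integer :: icoffa, iroffc, icoffc, iacol, iccol
    icoffa = mod(ja - 1, nb_a)                      ! assumed by analogy with ICOFFC
    iroffc = mod(ic - 1, mb_c)
    icoffc = mod(jc - 1, nb_c)
    iacol  = mod(csrc_a + (ja - 1) / nb_a, npcol)   ! value indxg2p would return
    iccol  = mod(csrc_c + (jc - 1) / nb_c, npcol)
    if (side == 'L' .or. side == 'l') then
      ok = (nb_a == mb_c) .and. (icoffa == iroffc)
    else
      ok = (nb_a == nb_c) .and. (icoffa == icoffc) .and. (iacol == iccol)
    end if
  end function ormr3_aligned

end module ormr3_alignment_sketch
```

A caller might evaluate this on every process and stop with a diagnostic when the result is false, instead of relying on a negative info value from the routine.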
p?unmr3
Applies a unitary distributed matrix to a general
m-by-n distributed matrix.
Syntax
call pcunmr3 (side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info )
call pzunmr3 (side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info )
Description
p?unmr3 overwrites the general complex m-by-n distributed matrix sub( C ) = C(ic:ic+m-1,jc:jc+n-1) with
              side = 'L'        side = 'R'
trans = 'N':  Q * sub( C )      sub( C ) * Q
trans = 'C':  Q^H * sub( C )    sub( C ) * Q^H
Q = H(1)' H(2)' . . . H(k)'
Input Parameters
side (global)
CHARACTER.
= 'L': apply Q or Q^H from the Left;
= 'R': apply Q or Q^H from the Right.
trans (global)
CHARACTER.
= 'N': No transpose, apply Q;
= 'C': Conjugate transpose, apply Q^H.
m (global)
INTEGER.
The number of rows to be operated on, i.e., the number of rows of the
distributed submatrix sub( C ). m >= 0.
n (global)
INTEGER.
The number of columns to be operated on, i.e., the number of columns of the
distributed submatrix sub( C ). n >= 0.
k (global)
INTEGER.
If side = 'L', m >= k >= 0, if side = 'R', n >= k >= 0.
l (global)
INTEGER.
The columns of the distributed submatrix sub( A ) containing the
meaningful part of the Householder reflectors.
If side = 'L', m >= l >= 0, if side = 'R', n >= l >= 0.
a (local)
COMPLEX for pcunmr3
DOUBLE COMPLEX for pzunmr3
side='L', and (lld_a,LOCc(ja+n-1)) if side='R', where lld_a >=
MAX(1,LOCr(ia+k-1));
On entry, the i-th row must contain the vector which defines the elementary
reflector H(i), ia <= i <= ia+k-1, as returned by p?tzrzf in the k rows of
its distributed matrix argument A(ia:ia+k-1,ja:*).
A(ia:ia+k-1,ja:*) is modified by the routine but restored on exit.
ia (global)
INTEGER.
ja (global)
INTEGER.
sub( A ).

INTEGER.
tau (local)
COMPLEX for pcunmr3
Array, size LOCc(ia+k-1).
This array contains the scalar factors tau(i) of the elementary reflectors
H(i) as returned by p?tzrzf. tau is tied to the distributed matrix A.
c (local)
COMPLEX for pcunmr3
Pointer into the local memory to an array of size (lld_c,LOCc(jc+n-1)) .
On entry, the local pieces of the distributed matrix sub( C ).
ic (global)
INTEGER.
The row index in the global array c indicating the first row of sub( C ).
jc (global)
INTEGER.
The column index in the global array c indicating the first column of
sub( C ).
descc (global and local)

INTEGER.
The array descriptor for the distributed matrix C.
work (local)
COMPLEX for pcunmr3
Array, size (lwork)
On exit, work(1) returns the minimal and optimal lwork.
lwork (local or global)

INTEGER.
lwork is local input and must be at least
If side = 'L', lwork >= MpC0 + MAX( MAX( 1, NqC0 ), numroc( numroc( m+IROFFC, mb_a, 0, 0, NPROW ), mb_a, 0, 0, LCMP ) );
if side = 'R', lwork >= NqC0 + MAX( 1, MpC0 );
where LCMP = LCM / NPROW with LCM = ilcm( NPROW, NPCOL ),
IROFFC = MOD( ic-1, MB_C ), ICOFFC = MOD( jc-1, nb_c ),
ICROW = indxg2p( ic, MB_C, MYROW, rsrc_c, NPROW ),
ICCOL = indxg2p( jc, nb_c, MYCOL, csrc_c, NPCOL ),
MpC0 = numroc( m+IROFFC, MB_C, MYROW, ICROW, NPROW ),
NqC0 = numroc( n+ICOFFC, nb_c, MYCOL, ICCOL, NPCOL ),
ilcm, indxg2p, and numroc are ScaLAPACK tool functions;
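The p?unmr3 workspace bound can be evaluated before the call. The following is a minimal sketch in Python, chosen so that it runs without an MPI/BLACS environment. The helpers numroc, indxg2p, and ilcm below are plain re-implementations of the block-cyclic arithmetic that the ScaLAPACK tool functions of the same names are documented to perform, not the library routines themselves; the name unmr3_min_lwork, its argument order, and the default source coordinates rsrc_c = csrc_c = 0 are assumptions made for illustration.

```python
from math import gcd


def numroc(n, nb, iproc, isrcproc, nprocs):
    """Number of rows/columns of an n-long dimension, distributed in
    blocks of nb, that land on process coordinate iproc."""
    mydist = (nprocs + iproc - isrcproc) % nprocs
    nblocks = n // nb
    count = (nblocks // nprocs) * nb
    extra = nblocks % nprocs
    if mydist < extra:
        count += nb
    elif mydist == extra:
        count += n % nb
    return count


def indxg2p(indxglob, nb, iproc, isrcproc, nprocs):
    """Process coordinate owning the 1-based global index indxglob
    (iproc is unused and kept only to mirror the tool-function call)."""
    return (isrcproc + (indxglob - 1) // nb) % nprocs


def ilcm(m, n):
    """Least common multiple of two positive integers."""
    return m * n // gcd(m, n)


def unmr3_min_lwork(side, m, n, ic, jc, mb_a, mb_c, nb_c,
                    myrow, mycol, nprow, npcol, rsrc_c=0, csrc_c=0):
    """Evaluate the documented lower bound on lwork for p?unmr3."""
    iroffc = (ic - 1) % mb_c
    icoffc = (jc - 1) % nb_c
    icrow = indxg2p(ic, mb_c, myrow, rsrc_c, nprow)
    iccol = indxg2p(jc, nb_c, mycol, csrc_c, npcol)
    mpc0 = numroc(m + iroffc, mb_c, myrow, icrow, nprow)
    nqc0 = numroc(n + icoffc, nb_c, mycol, iccol, npcol)
    if side == 'L':
        lcmp = ilcm(nprow, npcol) // nprow
        inner = numroc(numroc(m + iroffc, mb_a, 0, 0, nprow),
                       mb_a, 0, 0, lcmp)
        return mpc0 + max(max(1, nqc0), inner)
    return nqc0 + max(1, mpc0)
```

Each process evaluates the bound with its own MYROW and MYCOL, so the result can differ between processes. In practice the value returned in work(1) remains the authoritative size; the sketch is only meant to make the formula concrete.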


Output Parameters
c On exit, sub( C ) is overwritten by Q*sub( C ) or Q'*sub( C ) or sub( C )*Q'
or sub( C )*Q.
work (local)
COMPLEX for pcunmr3
Array, size (lwork)
info (local)
INTEGER.
Application Notes
The distributed submatrices A(ia:*, ja:*) and C(ic:ic+m-1,jc:jc+n-1) must satisfy some alignment
properties, namely the following expressions should be true:
If side = 'L', ( nb_a = MB_C and ICOFFA = IROFFC )
If side = 'R', ( nb_a = nb_c and ICOFFA = ICOFFC and IACOL = ICCOL )
p?ormrq
Multiplies a general matrix by the orthogonal matrix Q
of the RQ factorization formed by p?gerqf.
Syntax
call psormrq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pdormrq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
The p?ormrq routine overwrites the general real m-by-n distributed matrix sub(C) = C(ic:ic+m-1,jc:jc+n-1)
with
side ='L' side ='R'

Q = H(1) H(2)... H(k)
as returned by p?gerqf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters



(m >= 0).
(n >= 0).

If side = 'L', m >= k >= 0
If side = 'R', n >= k >= 0.
a (local)
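REAL for psormrq
DOUBLE PRECISION for pdormrq.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)) if
side = 'L', and (lld_a,LOCc(ja+n-1)) if side = 'R'.
The i-th row must contain the vector that defines the elementary reflector
H(i), ia <= i <= ia+k-1, as returned by p?gerqf in the k rows of its distributed
matrix argument A(ia:ia+k-1, ja:*). A(ia:ia+k-1, ja:*) is modified by
the routine but restored on exit.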
tau (local)
REAL for psormrq
DOUBLE PRECISION for pdormrq

c (local)
REAL for psormrq
DOUBLE PRECISION for pdormrq
Pointer into the local memory to an array of size (lld_c,LOCc(jc+n-1)).
indicating the first row and the first column of the matrix sub(C),
respectively.
work (local)
REAL for psormrq
DOUBLE PRECISION for pdormrq.

If side = 'L',
lwork >= max((mb_a*(mb_a-1))/2, (mpc0 + max(mqa0 +
numroc(numroc(n+iroffc, mb_a, 0, 0, NPROW), mb_a, 0, 0,
lcmp), nqc0))*mb_a) + mb_a*mb_a
else if side ='R',
lwork >= max((mb_a*(mb_a-1))/2, (mpc0 + nqc0)*mb_a) + mb_a*mb_a
end if
where
mqa0 = numroc(n+icoffa, nb_a, MYCOL, iacol, NPCOL),
NOTE
blacs_gridinfo.
Output Parameters
c Overwritten by the product Q*sub(C), or Q'*sub(C), or sub(C)*Q', or
sub(C)*Q
See Also
notations.
p?unmrq
Multiplies a general matrix by the unitary matrix Q of
the RQ factorization formed by p?gerqf.
Syntax
call pcunmrq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pzunmrq(side, trans, m, n, k, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
with
side ='L' side ='R'

Q = H(1)' H(2)'... H(k)'
as returned by p?gerqf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters


m (global) INTEGER. The number of rows in the distributed matrix sub(C)
(m >= 0).

(n >= 0).

If side = 'L', m >= k >= 0
If side = 'R', n >= k >= 0.
a (local)
COMPLEX for pcunmrq
DOUBLE COMPLEX for pzunmrq.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)) if
side = 'L', and (lld_a,LOCc(ja+n-1)) if side = 'R'. The i-th row
must contain the vector that defines the elementary reflector H(i),
ia <= i <= ia+k-1, as returned by p?gerqf in the k rows of its distributed matrix
argument A(ia:ia+k-1, ja:*). A(ia:ia+k-1, ja:*) is modified by the
routine but restored on exit.
tau (local)
COMPLEX for pcunmrq
DOUBLE COMPLEX for pzunmrq

c (local)
COMPLEX for pcunmrq
Pointer into the local memory to an array of size (lld_c,LOCc(jc+n-1)).
work (local)
COMPLEX for pcunmrq
If side = 'L',
lwork >= max((mb_a*(mb_a-1))/2, (mpc0 +
max(mqa0+numroc(numroc(n+iroffc, mb_a, 0, 0, NPROW), mb_a,
0, 0, lcmp), nqc0))*mb_a) + mb_a*mb_a
else if side = 'R',
lwork >= max((mb_a*(mb_a-1))/2, (mpc0 + nqc0)*mb_a) + mb_a*mb_a
end if
where
lcmp = lcm/NPROW with lcm = ilcm(NPROW, NPCOL),
NOTE
blacs_gridinfo.
Output Parameters
c Overwritten by the product Q*sub(C), or Q'*sub(C), or sub(C)*Q', or
sub(C)*Q
See Also
notations.
p?tzrzf
Reduces the upper trapezoidal matrix A to upper
triangular form.
Syntax
call pstzrzf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pdtzrzf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pctzrzf(m, n, a, ia, ja, desca, tau, work, lwork, info)
call pztzrzf(m, n, a, ia, ja, desca, tau, work, lwork, info)
Include Files
Description
The p?tzrzf routine reduces the m-by-n (m <= n) real/complex upper trapezoidal matrix sub(A) = A(ia:ia+m-1,
ja:ja+n-1) to upper triangular form by means of orthogonal/unitary transformations. The upper trapezoidal
matrix A is factored as
A = (R 0)*Z,
where Z is an n-by-n orthogonal/unitary matrix and R is an m-by-m upper triangular matrix.
Input Parameters
m (global) INTEGER. The number of rows in the matrix sub(A) (m >= 0).
n (global) INTEGER. The number of columns in the matrix sub(A) (n >= 0).
a (local)
REAL for pstzrzf
DOUBLE PRECISION for pdtzrzf.
COMPLEX for pctzrzf.
DOUBLE COMPLEX for pztzrzf.
Contains the local pieces of the m-by-n distributed matrix sub (A) to be
factored.
work (local)
REAL for pstzrzf

lwork >= mb_a*(mp0+nq0+mb_a), where
mp0 = numroc (m+iroff, mb_a, MYROW, iarow, NPROW),
nq0 = numroc (n+icoff, nb_a, MYCOL, iacol, NPCOL)
NOTE

Output Parameters
a On exit, the leading m-by-m upper triangular part of sub(A) contains the
upper triangular matrix R, and elements m+1 to n of the first m rows of sub
(A), with the array tau, represent the orthogonal/unitary matrix Z as a
product of m elementary reflectors.
tau (local)
REAL for pstzrzf


< 0:if the i-th argument is an array and the j-th entry had an illegal value,
Application Notes
The factorization is obtained by Householder's method. The k-th transformation matrix, Z(k), which is or
whose conjugate transpose is used to introduce zeros into the (m - k + 1)-th row of sub(A), is given in the
form
Z(k) = ( I    0   )
       ( 0   T(k) )
where
T(k) = I - tau*u(k)*u(k)',
u(k) is the column vector formed by stacking the scalar 1, a zero vector, and z(k),
tau is a scalar and z(k) is an (n - m) element vector. tau and z(k) are chosen to annihilate the elements of
the k-th row of sub(A). The scalar tau is returned in the k-th element of tau and the vector u(k) in the k-th
row of sub(A), such that the elements of z(k) are in a(k, m + 1),..., a(k, n). The elements of R are
returned in the upper triangular part of sub(A). Z is given by
Z = Z(1) * Z(2) *... * Z(m).
See Also
notations.
p?ormrz
Multiplies a general matrix by the orthogonal matrix
from a reduction to upper triangular form formed by
p?tzrzf.
Syntax
call psormrz(side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pdormrz(side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
This routine overwrites the general real m-by-n distributed matrix sub(C) = C(ic:ic+m-1,jc:jc+n-1) with
side ='L' side ='R'
Q = H(1) H(2)... H(k)
Input Parameters



(m >= 0).

(n >= 0).

If side = 'L', m >= k >= 0
If side = 'R', n >= k >= 0.
l (global)
INTEGER. The columns of the distributed matrix sub(A) containing the meaningful
part of the Householder reflectors.
If side = 'L', m >= l >= 0
If side = 'R', n >= l >= 0.
a (local)
REAL for psormrz
DOUBLE PRECISION for pdormrz.
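Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)) if
side = 'L', and (lld_a,LOCc(ja+n-1)) if side = 'R', where
lld_a >= max(1,LOCr(ia+k-1)).
The i-th row must contain the vector that defines the elementary reflector
H(i), ia <= i <= ia+k-1, as returned by p?tzrzf in the k rows of its distributed
matrix argument A(ia:ia+k-1, ja:*). A(ia:ia+k-1, ja:*) is modified by
the routine but restored on exit.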
tau (local)
REAL for psormrz
DOUBLE PRECISION for pdormrz

by p?tzrzf. tau is tied to the distributed matrix A.
c (local)
REAL for psormrz
DOUBLE PRECISION for pdormrz
Pointer into the local memory to an array of size (lld_c,LOCc(jc+n-1)).
work (local)
REAL for psormrz
DOUBLE PRECISION for pdormrz.
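If side = 'L',
lwork >= max((mb_a*(mb_a-1))/2, (mpc0 + max(mqa0 +
numroc(numroc(n+iroffc, mb_a, 0, 0, NPROW), mb_a, 0, 0,
lcmp), nqc0))*mb_a) + mb_a*mb_a
else if side ='R',
lwork >= max((mb_a*(mb_a-1))/2, (mpc0 + nqc0)*mb_a) + mb_a*mb_a
end if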
where
iroffa = mod(ia-1, mb_a), icoffa = mod(ja-1, nb_a),
mqa0 = numroc(n+icoffa, nb_a, MYCOL, iacol, NPCOL),
NOTE
blacs_gridinfo.
Output Parameters
c Overwritten by the product Q*sub(C), or Q'*sub(C), or sub(C)*Q', or
sub(C)*Q

See Also
notations.
p?unmrz
Multiplies a general matrix by the unitary
transformation matrix from a reduction to upper
triangular form determined by p?tzrzf.
Syntax
call pcunmrz(side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
call pzunmrz(side, trans, m, n, k, l, a, ia, ja, desca, tau, c, ic, jc, descc, work,
lwork, info)
Include Files
Description
with
side ='L' side ='R'

Q = H(1)' H(2)'... H(k)'
as returned by pctzrzf/pztzrzf. Q is of order m if side = 'L' and of order n if side = 'R'.
Input Parameters


m (global) INTEGER. The number of rows in the distributed matrix sub(C)
(m >= 0).

(n >= 0).

If side = 'L', m >= k >= 0
If side = 'R', n >= k >= 0.
l (global) INTEGER. The columns of the distributed matrix sub(A) containing
the meaningful part of the Householder reflectors.
If side = 'L', m >= l >= 0
If side = 'R', n >= l >= 0.
a (local)
COMPLEX for pcunmrz
DOUBLE COMPLEX for pzunmrz.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)) if
side = 'L', and (lld_a,LOCc(ja+n-1)) if side = 'R', where
lld_a >= max(1, LOCr(ia+k-1)). The i-th row must contain the vector that
defines the elementary reflector H(i), ia <= i <= ia+k-1, as returned by p?tzrzf
in the k rows of its distributed matrix argument A(ia:ia+k-1, ja:*).
A(ia:ia+k-1, ja:*) is modified by the routine but restored on exit.
tau (local)
COMPLEX for pcunmrz
DOUBLE COMPLEX for pzunmrz

c (local)
COMPLEX for pcunmrz
Pointer into the local memory to an array of size (lld_c,LOCc(jc+n-1)).
work (local)
COMPLEX for pcunmrz
If side = 'L',
lwork >= max((mb_a*(mb_a-1))/2, (mpc0+max(mqa0+numroc(numroc(n+iroffc,
mb_a, 0, 0, NPROW), mb_a, 0, 0, lcmp), nqc0))*mb_a)
+ mb_a*mb_a
else if side ='R',
lwork >= max((mb_a*(mb_a-1))/2, (mpc0+nqc0)*mb_a) + mb_a*mb_a
end if
where
lcmp = lcm/NPROW with lcm = ilcm(NPROW, NPCOL),
NOTE
blacs_gridinfo.
Output Parameters
c Overwritten by the product Q*sub(C), or Q'*sub(C), or sub(C)*Q', or
sub(C)*Q

See Also
notations.
p?ggqrf
Computes the generalized QR factorization.
Syntax
call psggqrf(n, m, p, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
call pdggqrf(n, m, p, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
call pcggqrf(n, m, p, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
call pzggqrf(n, m, p, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
Include Files
Description
The p?ggqrf routine forms the generalized QR factorization of an n-by-m matrix
sub(A) = A(ia:ia+n-1, ja:ja+m-1)
and an n-by-p matrix
sub(B) = B(ib:ib+n-1, jb:jb+p-1):
as
sub(A) = Q*R, sub(B) = Q*T*Z,
where Q is an n-by-n orthogonal/unitary matrix, Z is a p-by-p orthogonal/unitary matrix, and R and T
assume one of the forms:
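If n >= m,
R = ( R11 )    (R11 is m-by-m; the zero block is (n-m)-by-m)
    (  0  )
or if n < m,
R = ( R11 R12 )    (R11 is n-by-n; R12 is n-by-(m-n))
where R11 is upper triangular, and
if n <= p,
T = ( 0 T12 )    (the zero block is n-by-(p-n); T12 is n-by-n)
or if n > p,
T = ( T11 )    (T11 is (n-p)-by-p; T21 is p-by-p)
    ( T21 )
where T12 or T21 is an upper triangular matrix.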
In particular, if sub(B) is square and nonsingular, the GQR factorization of sub(A) and sub(B) implicitly gives
the QR factorization of inv(sub(B))*sub(A):
inv(sub(B))*sub(A) = Z^H*(inv(T)*R)
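The identity follows directly from the two factorizations sub(A) = Q*R and sub(B) = Q*T*Z. The short derivation below assumes only that Q and Z are orthogonal/unitary and that T is invertible, which holds when sub(B) is square and nonsingular:

```latex
\begin{aligned}
\operatorname{sub}(B)^{-1}\,\operatorname{sub}(A)
  &= (Q\,T\,Z)^{-1}\,(Q\,R) \\
  &= Z^{-1}\,T^{-1}\,Q^{-1}\,Q\,R \\
  &= Z^{H}\,\bigl(T^{-1}R\bigr).
\end{aligned}
```

Because Z^H is orthogonal/unitary and inv(T)*R is a product of upper triangular (or trapezoidal) factors, the right-hand side has the form of a QR factorization, and inv(sub(B)) never has to be formed explicitly.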
Input Parameters
n (global) INTEGER. The number of rows in the distributed matrices sub(A)
and sub(B) (n >= 0).
m (global) INTEGER. The number of columns in the distributed matrix sub(A)
(m >= 0).
p INTEGER. The number of columns in the distributed matrix sub(B) (p >= 0).
a (local)
REAL for psggqrf
DOUBLE PRECISION for pdggqrf
COMPLEX for pcggqrf
DOUBLE COMPLEX for pzggqrf.
Pointer into the local memory to an array of size (lld_a,LOCc(ja+m-1)).
Contains the local pieces of the n-by-m matrix sub(A) to be factored.
b (local)
REAL for psggqrf
COMPLEX for pcggqrf
Pointer into the local memory to an array of size (lld_b,LOCc(jb+p-1)).
Contains the local pieces of the n-by-p matrix sub(B) to be factored.
indicating the first row and the first column of the submatrix B,
respectively.
work (local)
REAL for psggqrf
COMPLEX for pcggqrf
lwork (local or global) INTEGER. Size of work, must be at least
lwork >= max(nb_a*(npa0+mqa0+nb_a), max((nb_a*(nb_a-1))/2,
(pqb0+npb0)*nb_a)+nb_a*nb_a, mb_b*(npb0+pqb0+mb_b)),
where
iroffa = mod(ia-1, mb_a),
npa0 = numroc (n+iroffa, mb_a, MYROW, iarow, NPROW),
mqa0 = numroc (m+icoffa, nb_a, MYCOL, iacol, NPCOL)
iroffb = mod(ib-1, mb_b),
icoffb = mod(jb-1, nb_b),
ibrow = indxg2p(ib, mb_b, MYROW, rsrc_b, NPROW),
ibcol = indxg2p(jb, nb_b, MYCOL, csrc_b, NPCOL),
npb0 = numroc (n+iroffb, mb_b, MYROW, ibrow, NPROW),
pqb0 = numroc (p+icoffb, nb_b, MYCOL, ibcol, NPCOL)
NOTE
and numroc, indxg2p are ScaLAPACK tool functions; MYROW, MYCOL, NPROW

Output Parameters
a On exit, the elements on and above the diagonal of sub (A) contain the
min(n, m)-by-m upper trapezoidal matrix R (R is upper triangular if n >= m); the
elements below the diagonal, with the array taua, represent the
orthogonal/unitary matrix Q as a product of min(n, m) elementary
reflectors. (See Application Notes below).
taua, taub (local)

REAL for psggqrf
COMPLEX for pcggqrf
Arrays of size LOCc(ja+min(n,m)-1) for taua and LOCr(ib+n-1) for taub.
which represent the orthogonal/unitary matrix Q. taua is tied to the
distributed matrix A. (See Application Notes below).
which represent the orthogonal/unitary matrix Z. taub is tied to the
distributed matrix B. (See Application Notes below).

Application Notes
Q = H(ja)*H(ja+1)*...*H(ja+k-1),
where k= min(n,m).

H(i) = I - taua*v*v'
where taua is a real/complex scalar, and v is a real/complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:n)
is stored on exit in A(ia+i:ia+n-1, ja+i-1), and taua in taua(ja+i-1). To form Q explicitly, use ScaLAPACK
subroutine p?orgqr/p?ungqr. To use Q to update another matrix, use ScaLAPACK subroutine p?ormqr/p?unmqr.
Z = H(ib)*H(ib+1)*...*H(ib+k-1), where k= min(n,p).

H(i) = I - taub*v*v'
where taub is a real/complex scalar, and v is a real/complex vector with v(p-k+i+1:p) = 0 and v(p-k+i) = 1;
v(1:p-k+i-1) is stored on exit in B(ib+n-k+i-1,jb:jb+p-k+i-2), and taub in taub(ib+n-k+i-1). To form Z
explicitly, use ScaLAPACK subroutine p?orgrq/p?ungrq. To use Z to update another matrix, use ScaLAPACK
subroutine p?ormrq/p?unmrq.
See Also
notations.
p?ggrqf
Computes the generalized RQ factorization.
Syntax
call psggrqf(m, p, n, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
call pdggrqf(m, p, n, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
call pcggrqf(m, p, n, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
call pzggrqf(m, p, n, a, ia, ja, desca, taua, b, ib, jb, descb, taub, work, lwork,
info)
Include Files
Description
The p?ggrqf routine forms the generalized RQ factorization of an m-by-n matrix sub(A) = A(ia:ia+m-1,
ja:ja+n-1) and a p-by-n matrix sub(B) = B(ib:ib+p-1, jb:jb+n-1):
sub(A) = R*Q, sub(B) = Z*T*Q,
where Q is an n-by-n orthogonal/unitary matrix, Z is a p-by-p orthogonal/unitary matrix, and R and T
assume one of the forms:
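If m <= n,
R = ( 0 R12 )    (the zero block is m-by-(n-m); R12 is m-by-m)
or if m > n,
R = ( R11 )    (R11 is (m-n)-by-n; R21 is n-by-n)
    ( R21 )
where R12 or R21 is upper triangular, and
if p >= n,
T = ( T11 )    (T11 is n-by-n; the zero block is (p-n)-by-n)
    (  0  )
or if p < n,
T = ( T11 T12 )    (T11 is p-by-p; T12 is p-by-(n-p))
where T11 is upper triangular.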
In particular, if sub(B) is square and nonsingular, the GRQ factorization of sub(A) and sub(B) implicitly gives
the RQ factorization of sub (A)*inv(sub(B)):
sub(A)*inv(sub(B))= (R*inv(T))*Z'
where inv(sub(B)) denotes the inverse of the matrix sub(B), and Z' denotes the transpose (conjugate
transpose) of matrix Z.
Input Parameters
m (global) INTEGER. The number of rows in the distributed matrix sub(A)
(m >= 0).
p INTEGER. The number of rows in the distributed matrix sub(B) (p >= 0).
n (global) INTEGER. The number of columns in the distributed matrices
sub(A) and sub(B) (n >= 0).
a (local)
REAL for psggrqf
DOUBLE PRECISION for pdggrqf
COMPLEX for pcggrqf
DOUBLE COMPLEX for pzggrqf.
Contains the local pieces of the m-by-n distributed matrix sub(A) to be
factored.
b (local)
REAL for psggrqf
COMPLEX for pcggrqf
Pointer into the local memory to an array of size (lld_b,LOCc(jb+n-1)).
Contains the local pieces of the p-by-n matrix sub(B) to be factored.
indicating the first row and the first column of the submatrix B, respectively.
work (local)
REAL for psggrqf
COMPLEX for pcggrqf
lwork (local or global) INTEGER.
Size of work, must be at least lwork >= max(mb_a*(mpa0+nqa0+mb_a),
max((mb_a*(mb_a-1))/2, (ppb0+nqb0)*mb_a) + mb_a*mb_a,
nb_b*(ppb0+nqb0+nb_b)), where
iroffa = mod(ia-1, mb_a),
mpa0 = numroc (m+iroffa, mb_a, MYROW, iarow, NPROW),
nqa0 = numroc (n+icoffa, nb_a, MYCOL, iacol, NPCOL)
iroffb = mod(ib-1, mb_b),
icoffb = mod(jb-1, nb_b),
ibrow = indxg2p(ib, mb_b, MYROW, rsrc_b, NPROW),
ibcol = indxg2p(jb, nb_b, MYCOL, csrc_b, NPCOL),
ppb0 = numroc (p+iroffb, mb_b, MYROW, ibrow, NPROW),
nqb0 = numroc (n+icoffb, nb_b, MYCOL, ibcol, NPCOL)
NOTE
and numroc, indxg2p are ScaLAPACK tool functions; MYROW, MYCOL, NPROW

Output Parameters
a On exit, if m <= n, the upper triangle of A(ia:ia+m-1, ja+n-m:ja+n-1)
contains the m-by-m upper triangular matrix R; if m >= n, the elements on and
above the (m-n)-th subdiagonal contain the m-by-n upper trapezoidal matrix
R; the remaining elements, with the array taua, represent the orthogonal/
unitary matrix Q as a product of min(n,m) elementary reflectors (see
taua, taub (local)

REAL for psggrqf
COMPLEX for pcggrqf
Arrays of size LOCr(ia+m-1) for taua and LOCc(jb+min(p,n)-1) for taub.
which represent the orthogonal/unitary matrix Q. taua is tied to the
distributed matrix A. (See Application Notes below).
which represent the orthogonal/unitary matrix Z. taub is tied to the
distributed matrix B. (See Application Notes below).

Application Notes
Q = H(ia)*H(ia+1)*...*H(ia+k-1),
where k= min(m,n).

H(i) = I - taua*v*v'
where taua is a real/complex scalar, and v is a real/complex vector with v(n-k+i+1:n) = 0 and v(n-k+i) = 1;
v(1:n-k+i-1) is stored on exit in A(ia+m-k+i-1, ja:ja+n-k+i-2), and taua in taua(ia+m-k+i-1). To form Q
explicitly, use ScaLAPACK subroutine p?orgrq/p?ungrq. To use Q to update another matrix, use ScaLAPACK
subroutine p?ormrq/p?unmrq.

Z = H(jb)*H(jb+1)*...*H(jb+k-1), where k= min(p,n).

H(i) = I - taub*v*v'
where taub is a real/complex scalar, and v is a real/complex vector with v(1:i-1) = 0 and v(i) = 1; v(i+1:p) is
stored on exit in B(ib+i:ib+p-1,jb+i-1), and taub in taub(jb+i-1). To form Z explicitly, use ScaLAPACK
subroutine p?orgqr/p?ungqr. To use Z to update another matrix, use ScaLAPACK subroutine p?ormqr/p?unmqr.
See Also
notations.
Symmetric Eigenvalue Problems: ScaLAPACK Computational Routines
To solve a symmetric eigenproblem with ScaLAPACK, you usually need to reduce the matrix to real
tridiagonal form T and then find the eigenvalues and eigenvectors of the tridiagonal matrix T. ScaLAPACK
includes routines for reducing the matrix to a tridiagonal form by an orthogonal (or unitary) similarity
transformation A = Q*T*Q^H as well as for solving tridiagonal symmetric eigenvalue problems. These routines
are listed in Table "Computational Routines for Solving Symmetric Eigenproblems".
There are different routines for symmetric eigenproblems, depending on whether you need eigenvalues only
or eigenvectors as well, and on the algorithm used (either the QTQ algorithm, or bisection followed by
inverse iteration).
Computational Routines for Solving Symmetric Eigenproblems
Operation                                Dense symmetric/   Orthogonal/unitary   Symmetric
                                         Hermitian matrix   matrix               tridiagonal matrix
Reduce to tridiagonal form               p?sytrd/p?hetrd
  A = Q*T*Q^H
Multiply matrix after reduction                             p?ormtr/p?unmtr
Find all eigenvalues and eigenvectors                                            steqr2*
  of a tridiagonal matrix T by a QTQ
  method
Find selected eigenvalues of a                                                   p?stebz
  tridiagonal matrix T via bisection
Find selected eigenvectors of a                                                  p?stein
  tridiagonal matrix T by inverse
  iteration
* This routine is described as part of auxiliary ScaLAPACK routines.
p?syngst
Reduces a real symmetric-definite generalized
eigenproblem to standard form.
Syntax
call pssyngst (ibtype, uplo, n, a, ia, ja, desca, b, ib, jb, descb, scale, work,
lwork, info )
call pdsyngst (ibtype, uplo, n, a, ia, ja, desca, b, ib, jb, descb, scale, work,
lwork, info )
Description
p?syngst reduces a real symmetric-definite generalized eigenproblem to standard form.
p?syngst performs the same function as p?sygst, but is based on rank 2K updates, which are faster and
more scalable than triangular solves (the basis of p?sygst).
p?syngst calls p?sygst when uplo='U', hence p?syngst provides improved performance only when
uplo='L', ibtype=1.
p?syngst also calls p?sygst when insufficient workspace is provided, hence p?syngst provides improved
performance only when lwork >= 2 * NP0 * NB + NQ0 * NB + NB * NB
In the following sub( A ) denotes A( ia:ia+n-1, ja:ja+n-1 ) and sub( B ) denotes B( ib:ib+n-1, jb:jb+n-1 ).
If ibtype = 1, the problem is sub( A )*x = lambda*sub( B )*x, and sub( A ) is overwritten by
inv(U^H)*sub( A )*inv(U) or inv(L)*sub( A )*inv(L^H)
If ibtype = 2 or 3, the problem is sub( A )*sub( B )*x = lambda*x or sub( B )*sub( A )*x = lambda*x, and
sub( A ) is overwritten by U*sub( A )*U^H or L^H*sub( A )*L.
sub( B ) must have been previously factorized as U^H*U or L*L^H by p?potrf.
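To make the ibtype = 1, uplo = 'L' case concrete, the sketch below reduces a small real symmetric-definite problem A*x = lambda*B*x to standard form by computing inv(L)*A*inv(L') with B = L*L'. It is a serial, dense illustration in Python using two triangular solves; it is not how p?syngst is implemented (the routine is distributed and based on rank 2K updates), and the function names and the 2-by-2 test matrices are assumptions chosen for illustration.

```python
import math


def cholesky_lower(b):
    """Return lower-triangular L with L * L' = b (b symmetric positive definite)."""
    n = len(b)
    low = [[0.0] * n for _ in range(n)]
    for j in range(n):
        diag = b[j][j] - sum(low[j][k] ** 2 for k in range(j))
        low[j][j] = math.sqrt(diag)
        for i in range(j + 1, n):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = (b[i][j] - s) / low[j][j]
    return low


def solve_lower(low, rhs):
    """Solve low * X = rhs for the matrix X by forward substitution."""
    n, ncols = len(low), len(rhs[0])
    x = [[0.0] * ncols for _ in range(n)]
    for c in range(ncols):
        for i in range(n):
            s = sum(low[i][k] * x[k][c] for k in range(i))
            x[i][c] = (rhs[i][c] - s) / low[i][i]
    return x


def transpose(mat):
    return [list(col) for col in zip(*mat)]


def reduce_to_standard(a, b):
    """Return inv(L) * a * inv(L') where b = L * L' (ibtype = 1, uplo = 'L')."""
    low = cholesky_lower(b)
    x = solve_lower(low, a)                            # x = inv(L) * a
    return transpose(solve_lower(low, transpose(x)))   # x * inv(L')


a = [[2.0, 1.0], [1.0, 3.0]]
b = [[4.0, 2.0], [2.0, 5.0]]
c = reduce_to_standard(a, b)
```

The eigenvalues of the reduced matrix c are the eigenvalues lambda of the original generalized problem; eigenvectors x of the original problem are recovered from eigenvectors y of c by solving L'*x = y.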
Input Parameters
ibtype (global)
INTEGER.
= 1: compute inv(U^H)*sub( A )*inv(U) or inv(L)*sub( A )*inv(L^H);
= 2 or 3: compute U*sub( A )*U^H or L^H*sub( A )*L.
uplo (global)
CHARACTER.
= 'U': Upper triangle of sub( A ) is stored and sub( B ) is factored as U^H*U;
= 'L': Lower triangle of sub( A ) is stored and sub( B ) is factored as L*L^H.
n (global)
INTEGER.
The order of the matrices sub( A ) and sub( B ). n >= 0.
a (local)
REAL for pssyngst
DOUBLE PRECISION for pdsyngst
On entry, this array contains the local pieces of the n-by-n Hermitian
distributed matrix sub( A ). If uplo = 'U', the leading n-by-n upper
triangular part of sub( A ) contains the upper triangular part of the matrix,
and its strictly lower triangular part is not referenced. If uplo = 'L', the
leading n-by-n lower triangular part of sub( A ) contains the lower
triangular part of the matrix, and its strictly upper triangular part is not
referenced.
ia (global)
INTEGER.
A's global row index, which points to the beginning of the submatrix which
is to be operated on.
ja (global)
INTEGER.
A's global column index, which points to the beginning of the submatrix
which is to be operated on.

INTEGER.
b (local)
REAL for pssyngst
Pointer into the local memory to an array of size (lld_b,LOCc(jb+n-1)).
On entry, this array contains the local pieces of the triangular factor from
the Cholesky factorization of sub( B ), as returned by p?potrf.
ib (global)
INTEGER.
B's global row index, which points to the beginning of the submatrix which
is to be operated on.
jb (global)
INTEGER.
B's global column index, which points to the beginning of the submatrix
which is to be operated on.
descb (global and local)

INTEGER.
The array descriptor for the distributed matrix B.
work (local)
REAL for pssyngst
Array, size (lwork)

INTEGER.
lwork is local input and must be at least lwork >= MAX( NB * ( NP0 +1 ),
3 * NB )
When ibtype = 1 and uplo = 'L', p?syngst provides improved
performance when lwork >= 2 * NP0 * NB + NQ0 * NB + NB * NB,
where NB = mb_a = nb_a,

NP0 = numroc( n, NB, 0, 0, NPROW ),
NQ0 = numroc( n, NB, 0, 0, NPROW ),
numroc is a ScaLAPACK tool function.


If lwork = -1, then lwork is global input and a workspace query is
assumed; the routine only calculates the optimal size for all work arrays.
Each of these values is returned in the first entry of the corresponding work
array, and no error message is issued by pxerbla.
Output Parameters
a On exit, if info = 0, the transformed matrix, stored in the same format as
sub( A ).
scale (global)
REAL for pssyngst
Amount by which the eigenvalues should be scaled to compensate for the
scaling performed in this routine. At present, scale is always returned as
1.0; it is returned here to allow for future enhancement.
work (local)
REAL for pssyngst
Array, size (lwork)
info (global)
INTEGER.
p?syntrd
Reduces a real symmetric matrix to symmetric
tridiagonal form.
Syntax
call pssyntrd (uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info )
call pdsyntrd (uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info )
Description
p?syntrd is a prototype version of p?sytrd which uses tailored codes (either the serial, ?sytrd, or the
parallel code, p?syttrd) when the workspace provided by the user is adequate.
p?syntrd reduces a real symmetric matrix sub( A ) to symmetric tridiagonal form T by an orthogonal
similarity transformation:
Q' * sub( A ) * Q = T, where sub( A ) = A(ia:ia+n-1,ja:ja+n-1).
Features
p?syntrd is faster than p?sytrd on almost all matrices, particularly small ones (i.e. n < 500 * sqrt(P) ),
provided that enough workspace is available to use the tailored codes.
The tailored codes provide performance that is essentially independent of the input data layout.
The tailored codes place no restrictions on ia, ja, MB or NB. At present, ia, ja, MB and NB are restricted to
those values allowed by p?hetrd to keep the interface simple (see the Application Notes section for more
information about the restrictions).
Input Parameters
uplo (global)
CHARACTER.
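Specifies whether the upper or lower triangular part of the symmetric
matrix sub( A ) is stored:
= 'U': Upper triangular
= 'L': Lower triangular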
n (global)
INTEGER.
The number of rows and columns to be operated on, i.e. the order of the
distributed submatrix sub( A ). n >= 0.
a (local)
REAL for pssyntrd
DOUBLE PRECISION for pdsyntrd
On entry, this array contains the local pieces of the symmetric distributed
matrix sub( A ). If uplo = 'U', the leading n-by-n upper triangular part of
sub( A ) contains the upper triangular part of the matrix, and its strictly
lower triangular part is not referenced. If uplo = 'L', the leading n-by-n
lower triangular part of sub( A ) contains the lower triangular part of the
matrix, and its strictly upper triangular part is not referenced.
ia (global)
INTEGER.
ja (global)
INTEGER.
sub( A ).

INTEGER.
work (local)
REAL for pssyntrd
Array, size (lwork)

INTEGER.
lwork is local input and must be at least lwork >= MAX( NB * ( NP +1 ), 3 * NB )
For optimal performance, greater workspace is needed, i.e.
lwork >= 2*( ANB+1 )*( 4*NPS+2 ) + ( NPS + 4 ) * NPS
ANB = pjlaenv( ICTXT, 3, 'p?syttrd', 'L', 0, 0, 0, 0 )
ICTXT = desca( ctxt_ )
SQNPC = INT( sqrt( REAL( NPROW * NPCOL ) ) )
numroc is a ScaLAPACK tool function.

pjlaenv is a ScaLAPACK environmental inquiry function.
blacs_gridinfo.
Output Parameters
a On exit, if uplo = 'U', the diagonal and first superdiagonal of sub( A ) are
overwritten by the corresponding elements of the tridiagonal matrix T, and the
elements above the first superdiagonal, with the array tau, represent
the orthogonal matrix Q as a product of elementary reflectors; if uplo = 'L',
the diagonal and first subdiagonal of sub( A ) are overwritten by the
corresponding elements of the tridiagonal matrix T, and the elements below
the first subdiagonal, with the array tau, represent the orthogonal matrix Q
as a product of elementary reflectors. See Application Notes.
d (local)
REAL for pssyntrd
Array, size LOCc(ja+n-1)
The diagonal elements of the tridiagonal matrix T: d(i) = A(i,i). d is tied to
the distributed matrix A.
e (local)
REAL for pssyntrd
Array, size LOCc(ja+n-1) if uplo = 'U', LOCc(ja+n-2) otherwise.
The off-diagonal elements of the tridiagonal matrix T: e(i) = A(i,i+1) if
uplo = 'U', e(i) = A(i+1,i) if uplo = 'L'. e is tied to the distributed matrix A.
tau (local)
REAL for pssyntrd
Array, size LOCc(ja+n-1).
This array contains the scalar factors tau of the elementary reflectors. tau
is tied to the distributed matrix A.
work (local)
REAL for pssyntrd
Array, size (lwork)
On exit, work(1) returns the optimal lwork.
info (global)
INTEGER.
Application Notes
If uplo = 'U', the matrix Q is represented as a product of elementary reflectors
Q = H(n-1) . . . H(2) H(1).
Each H(i) has the form
H(i) = I - tau * v * v', where tau is a real scalar, and v is a real vector with v(i+1:n) = 0 and v(i) =
1; v(1:i-1) is stored on exit in A(ia:ia+i-2,ja+i), and tau in tau(ja+i-1).
If uplo = 'L', the matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2) . . . H(n-1).
Each H(i) has the form
H(i) = I - tau * v * v', where tau is a real scalar, and v is a real vector with v(1:i) = 0 and v(i+1) =
1; v(i+2:n) is stored on exit in A(ia+i+1:ia+n-1,ja+i-1), and tau in tau(ja+i-1).
The contents of sub( A ) on exit are illustrated by the following examples with n = 5:
if uplo = 'U':
( d   e   v2  v3  v4 )
(     d   e   v3  v4 )
(         d   e   v4 )
(             d   e  )
(                 d  )
if uplo = 'L':
( d                  )
( e   d              )
( v1  e   d          )
( v1  v2  e   d      )
( v1  v2  v3  e   d  )
where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector
defining H(i).
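The uplo = 'L' storage scheme can be illustrated with a small serial computation. The sketch below, in Python, applies n-1 Householder reflectors H(i) = I - tau*v*v' (with v(1:i) = 0 and v(i+1) = 1) to a dense symmetric matrix and returns the resulting tridiagonal matrix together with the tau values. It forms each H(i) explicitly for clarity, which a production code would not do; it is not the distributed algorithm used by p?syntrd, and the function names and the 4-by-4 test matrix are assumptions chosen for illustration.

```python
import math


def matmul(x, y):
    """Dense matrix product of two lists of lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def tridiagonalize_lower(a):
    """Serial Householder reduction of symmetric a; returns (T, taus)."""
    n = len(a)
    t = [row[:] for row in a]
    taus = []
    for i in range(n - 1):                  # reflectors H(1) ... H(n-1)
        alpha = t[i + 1][i]                 # entry that becomes e(i)
        tail = [t[r][i] for r in range(i + 2, n)]
        xnorm = math.sqrt(sum(x * x for x in tail))
        v = [0.0] * n                       # leading entries of v stay zero
        v[i + 1] = 1.0                      # unit entry of v
        tau = 0.0                           # H(i) = I if nothing to annihilate
        if xnorm != 0.0:
            beta = -math.copysign(math.hypot(alpha, xnorm), alpha)
            tau = (beta - alpha) / beta
            for r, x in zip(range(i + 2, n), tail):
                v[r] = x / (alpha - beta)   # trailing entries of v
        taus.append(tau)
        h = [[(1.0 if r == c else 0.0) - tau * v[r] * v[c]
              for c in range(n)] for r in range(n)]
        t = matmul(h, matmul(t, h))         # H(i) is symmetric: H' * T * H
    return t, taus


a = [[4.0, 1.0, 2.0, 2.0],
     [1.0, 2.0, 0.0, 1.0],
     [2.0, 0.0, 3.0, -2.0],
     [2.0, 1.0, -2.0, -1.0]]
t, taus = tridiagonalize_lower(a)
```

After the call, the entries of t outside the three central diagonals are zero up to rounding, the diagonal of t corresponds to d, and the first subdiagonal corresponds to e. In the real case the last reflector has no entries to annihilate, so its tau is zero.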
The distributed submatrix sub( A ) must satisfy some alignment properties, namely the following expression
should be true:
( mb_a = nb_a and IROFFA = ICOFFA and IROFFA = 0 ) with IROFFA = mod( ia-1, mb_a), and ICOFFA =
mod( ja-1, nb_a ).
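The alignment condition can be checked before the routine is called. The following Python sketch evaluates the expression above for given global indices and block sizes; the function name syntrd_alignment_ok is hypothetical and is not part of ScaLAPACK or Intel MKL.

```python
def syntrd_alignment_ok(ia, ja, mb_a, nb_a):
    """True when sub(A) starts on a block boundary of a square blocking."""
    iroffa = (ia - 1) % mb_a   # row offset of sub(A) inside its block
    icoffa = (ja - 1) % nb_a   # column offset of sub(A) inside its block
    return mb_a == nb_a and iroffa == icoffa and iroffa == 0
```

For example, with 64-by-64 blocks the submatrix may start at global index 1, 65, 129, and so on, in each dimension, but not at index 2.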
p?sytrd
Reduces a symmetric matrix to real symmetric
tridiagonal form by an orthogonal similarity
transformation.
Syntax
call pssytrd(uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info)
call pdsytrd(uplo, n, a, ia, ja, desca, d, e, tau, work, lwork, info)
Include Files
Description
The p?sytrd routine reduces a real symmetric matrix sub(A) to symmetric tridiagonal form T by an
orthogonal similarity transformation:
Q'*sub(A)*Q = T,
where sub(A) = A(ia:ia+n-1,ja:ja+n-1).
Input Parameters
uplo (global) CHARACTER.
Specifies whether the upper or lower triangular part of the symmetric
matrix sub(A) is stored:
If uplo = 'U', upper triangular
If uplo = 'L', lower triangular
a (local)
REAL for pssytrd
DOUBLE PRECISION for pdsytrd.
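On entry, this array contains the local pieces of the symmetric distributed
matrix sub(A).
If uplo = 'U', the leading n-by-n upper triangular part of sub(A) contains
the upper triangular part of the matrix, and its strictly lower triangular part
is not referenced.
If uplo = 'L', the leading n-by-n lower triangular part of sub(A) contains
the lower triangular part of the matrix, and its strictly upper triangular part
is not referenced. See Application Notes below.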
work (local)
REAL for pssytrd
lwork >= max(NB*(np +1), 3*NB),

where NB = mb_a = nb_a,
np = numroc(n, NB, MYROW, iarow, NPROW),

iarow = indxg2p(ia, NB, MYROW, rsrc_a, NPROW).

Output Parameters
a On exit, if uplo = 'U', the diagonal and first superdiagonal of sub(A) are
overwritten by the corresponding elements of the tridiagonal matrix T, and the
elements above the first superdiagonal, with the array tau, represent
the orthogonal matrix Q as a product of elementary reflectors; if uplo =
'L', the diagonal and first subdiagonal of sub(A) are overwritten by the
corresponding elements of the tridiagonal matrix T, and the elements below
the first subdiagonal, with the array tau, represent the orthogonal matrix Q
as a product of elementary reflectors. See Application Notes below.
d (local)
REAL for pssytrd
Array of size LOCc(ja+n-1). The diagonal elements of the tridiagonal
matrix T:
d(i) = A(i,i).
d is tied to the distributed matrix A.
e (local)
REAL for pssytrd
Arrays of size LOCc(ja+n-1) if uplo = 'U', LOCc(ja+n-2) otherwise.
The off-diagonal elements of the tridiagonal matrix T:

e(i)= A(i,i+1) if uplo = 'U',
e(i) = A(i+1,i) if uplo = 'L'.
e is tied to the distributed matrix A.
tau (local)
REAL for pssytrd
Arrays of size LOCc(ja+n-1). This array contains the scalar factors of the
elementary reflectors. tau is tied to the distributed matrix A.

Application Notes
If uplo = 'U', the matrix Q is represented as a product of elementary reflectors
Q = H(n-1)... H(2) H(1).
Each H(i) has the form
H(i) = I - tau * v * v',
where tau is a real scalar, and v is a real vector with v(i+1:n) = 0 and v(i) = 1; v(1:i-1) is stored on exit in
A(ia:ia+i-2, ja+i), and tau in tau(ja+i-1).
If uplo = 'L', the matrix Q is represented as a product of elementary reflectors
Q = H(1) H(2)... H(n-1).
Each H(i) has the form
H(i) = I - tau * v * v',
where tau is a real scalar, and v is a real vector with v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in
A(ia+i+1:ia+n-1,ja+i-1), and tau in tau(ja+i-1).
The contents of sub(A) on exit are illustrated by the following examples with n = 5:
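If uplo = 'U':
( d   e   v2  v3  v4 )
(     d   e   v3  v4 )
(         d   e   v4 )
(             d   e  )
(                 d  )
If uplo = 'L':
( d                  )
( e   d              )
( v1  e   d          )
( v1  v2  e   d      )
( v1  v2  v3  e   d  )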
where d and e denote diagonal and off-diagonal elements of T, and vi denotes an element of the vector
defining H(i).
See Also
Overview of ScaLAPACK Ro


sb symmetric band matrix

hp Hermitian matrix (packed storage)

hb Hermitian band matrix

tp triangular matrix (packed storage)

tb triangular band matrix.

g Givens rotation construction

m modified Givens rotation

sv solving a system of linear equations with a single unknown vector

r rank-1 update of a matrix

r2 rank-2 update of a matrix.