From f46ce64a4f7e1187047f1a527ffbca919e97fe40 Mon Sep 17 00:00:00 2001 From: julie Date: Thu, 28 Apr 2016 05:35:01 +0000 Subject: [PATCH] From Mark Gates (UTK) - Fix Bug 157 - Various bugs in SVD routines (*gesdd, *gesvd, and *bdsdc). Items are labelled (a) through (m), omitting (l). Several are not bugs, just suggestions. Most bugs are in *gesdd. There's one bug (g) in *bdsdc. This is the underlying cause of LAPACK bug #111. There's one bug (m) in [cz]gesvd. I also added an INT() cast in these assignments to silence compiler warnings. Changed: LWORK_ZGEQRF=CDUM(1) to: LWORK_ZGEQRF = INT( CDUM(1) ) Where possible, I ran a test showing the wrong behavior, then a test showing the corrected behavior. These use a modified version of the MAGMA SVD tester (calling LAPACK), because I could adjust the lwork as needed. The last 3 columns are the lwork type, the lwork size, and the lwork formula. The lwork types are: doc_old as documented in LAPACK 3.6. doc as in the attached, updated documentation. min_old minwrk, as computed in LAPACK 3.6. min minwrk, as computed in the attached, updated code. min-1 minimum - 1; this should cause gesdd to return an error. opt optimal size. max the maximum size LAPACK will take advantage of; some cases, the optimal is n*n + work, while the max is m*n + work. query what gesdd returns for an lwork query; should equal opt or max. After the lwork, occasionally there is a ! or ? error code indicating: Error codes: ! error: lwork < min. For (min-1), this ought to appear. ? compatability issue: lwork < min_old, will fail for lapack <= 3.6. I also tested the update routines on a wide variety of sizes and jobz, with various lwork. Besides fixing the bugs below, I made two significant changes. 1) Changed *gesdd from computing each routine's workspace using, e.g.: N*ilaenv(...) to querying each routine for its LWORK, e.g.: CALL ZGEBRD( M, N, CDUM(1), M, CDUM(1), DUM(1), CDUM(1), $ CDUM(1), CDUM(1), -1, IERR ) LWORK_ZGEBRD_MN = INT( CDUM(1) ) This matches how *gesvd was changed in LAPACK 3.4.0. 2) Changed the Workspace: comments, which were incredibly deceptive. For instance, in Path 2 before dbdsdc, it said Workspace: need N + N*N + BDSPAC since dbdsdc needs the [e] vector, [U] matrix, and bdspac. However, that ignores that the [tauq, taup] vectors and [R] matrix are also already allocated, though dbdsdc doesn't need them. So the workspace needed at that point in the code is actually Workspace: need N*N [R] + 3*N [e, tauq, taup] + N*N [U] + BDSPAC For clarity, I added in [brackets] what matrices or vectors were allocated, and the order reflects their order in memory. I may do a similar change for *gesvd eventually. The workspace comments in MAGMA's *gesvd have already been updated as above. ================================================================================ a) Throughout, to simplify equations, let: mn = min( M, N ) mx = max( M, N ) ================================================================================ b) [sdcz]gesdd Path 4 (m >> n, job="all") has wrong minwrk formula in code: minwrk = bdspac + mn*mn + 2*mn + mx = 4*mn*mn + 6*mn + mx This is an overestimate, needlessly rejecting the documented formula: doc = 4*mn*mn + 7*mn In complex, the correct min fails, but the doc matches the wrong minwrk. Solution: fix code to: minwrk = mn*mn + max( 3*mn + bdspac, mn + mx ) = mn*mn + max( 3*mn*mn + 7*mn, mn + mx ) Test cases: m=40, ..., 100; n=20; jobz='A' See bug (d) showing documentation is also wrong. Also, see bug (i), complex [cz]gesdd should return -12 instead of -13. ================================================================================ bt) transposed case [sd]gesdd Path 4t (n >> m, job="all") has a different wrong minwrk; see bug (c). [cz]gesdd Path 4t exhibits same bug as Path 4. Test cases: m=20; n=40, ..., 100; jobz='A' ================================================================================ c) [sd]gesdd Path 4t (n >> m, job="all") has wrong minwrk formula in code, different than bug (b): minwrk = bdspac + m*m + 3*m = 4*mn*mn + 7*mn This formula lacks any dependence on N, so the code will fail (without setting info; orglq calls xerbla) if N is too large, N > 3*M*M + 6*M. Bug does not occur in complex. Test cases: m=20; n = 1320; jobz='A' ok with documented lwork m=20; n > 1320; jobz='A' fails with documented lwork Solution: as in bug (b), fix code to: minwrk = mn*mn + max( 3*mn + bdspac, mn + mx ) = mn*mn + max( 3*mn*mn + 7*mn, mn + mx ) See bug (d) showing documentation is also wrong. ================================================================================ d) [sd]gesdd documentation lists the same minimum size for jobz='S' and 'A': If JOBZ = 'S' or 'A', LWORK >= min(M,N)*(7 + 4*min(M,N)) However, jobz='A' actually also depends on max(M,N): minwrk = mn*mn + max( 3*mn*mn + 7*mn, mn + mx ) This causes the formula to fail for mx > 3*mn*mn + 6*mn. Test cases: m > 1320; n = 20; jobz='A' fails with document lwork, even after fixing bugs (b) and (c). m = 20; n > 1320; jobz='A' fails also. Solution: in docs, split these two cases. This fix uses an overestimate, so that codes using it will be backwards compatible with LAPACK <= 3.6. If JOBZ = 'S', LWORK >= 4*mn*mn + 7*mn. If JOBZ = 'A', LWORK >= 4*mn*mn + 6*mn + mx. ================================================================================ e) [sd]gesdd, Path 5, jobz='A' has wrong maxwrk formula in the code: MAXWRK = MAX( MAXWRK, BDSPAC + 3*N ) Should be: MAXWRK = MAX( WRKBL, BDSPAC + 3*N ) This causes the lwork query to ignore WRKBL, and return the minimum workspace size, BDSPAC + 3*N, instead of the optimal workspace size. However, it only affects the result for small sizes where min(M,N) < NB. Path 5t has the correct maxwrk formula. Complex is correct for both Path 5 and 5t. Test case: Compare lwork query with M = 30, N = 20, jobz='A', lwork query is 1340 M = 20, N = 30, jobz='A', lwork query is 3260 These should be the same. Solution: fix code as above. ================================================================================ f) Not a bug, just a suggestion. The lwork minimum sizes are not actually minimums, and can be larger than the queried lwork size. Solution: add a comment: These are not tight minimums in all cases; see comments inside code. ================================================================================ g) [sd]bdsdc segfaults due to too small workspace size. Its documentation claims: If COMPQ = 'N' then LWORK >= (4 * N). Based on this, in zgesdd, the rwork size >= 5*min(M,N). However, LAPACK bug 111 found that rwork size >= 7*min(M,N) was required. In dbdsdc, if uplo='L', then it rotates lower bidiagonal to upper bidiagonal, and saves 2 vectors of Givens rotations in work. It shifts WSTART from 1 to 2*N-1. Then it calls dlasdq( ..., work( wstart ), info ). As dlasdq requires 4*N, dbdsdc would now require 6*N in this case. This caused zgesdd to require rwork size >= 7*min(M,N) when N > M and jobz='N'. My preferred solution is to change WSTART to 1 in the DLASDQ call inside dbdsdc: IF( ICOMPQ.EQ.0 ) THEN CALL DLASDQ( 'U', 0, N, 0, 0, 0, D, E, VT, LDVT, U, LDU, U, $ LDU, WORK( WSTART ), INFO ) GO TO 40 END IF to: IF( ICOMPQ.EQ.0 ) THEN * Ignores WSTART, which is needed only for ICOMPQ = 1 or 2; * using WSTART would change required workspace to 6*N for uplo='L'. CALL DLASDQ( 'U', 0, N, 0, 0, 0, D, E, VT, LDVT, U, LDU, U, $ LDU, WORK( 1 ), INFO ) GO TO 40 END IF The [cz]gesdd documentation, which was changed to 7*min(M,N) in LAPACK 3.6, may be reverted to 5*min(M,N), if desired. ================================================================================ h) [sd]gesdd for jobz='N' requires bdspac = 7*n for the dbdsdc workspace. However, dbdsdc requires only 4*n, or 6*n before fixing bug (g). For backwards compatability, I did not change the code, but added a comment for clarification. ================================================================================ i) [cz]gesdd returns info = -13 instead of info = -12 for lwork errors. ================================================================================ j) In zgesdd, for computing maxwrk, these paths: Path 6, jobz=A Path 6t, jobz=S Path 6t, jobz=A query ilaenv( 1, zungbr, ... ) when the code actually calls zunmbr (twice). I corrected it. ================================================================================ k) In zgesdd documentation, currently lrwork >= max( 5*mn*mn + 7*mn, 2*mx*mn + 2*mn*mn + mn ) It doesn't need that much, particularly for (mx >> mn) case. If (mx >> mn), lrwork >= 5*mn*mn + 5*mn; else, lrwork >= max( 5*mn*mn + 5*mn, 2*mx*mn + 2*mn*mn + mn ). I changed this in the documentation. Feel free to revert if you prefer. ================================================================================ m) [cz]gesvd, Path 10 and 10t, have minwrk inside the wrong conditional: IF( .NOT.WNTVN ) THEN MAXWRK = MAX( MAXWRK, 2*N+LWORK_ZUNGBR_P ) MINWRK = 2*N + M END IF So Path 10 with jobvt='N', and Path 10t with jobu='N', have minwrk = 1, so an invalid lwork is not correctly rejected. ================================================================================ mt) transposed case broken: with old routine, Path 10t with jobu='N' doesn't enforce minwrk --- SRC/cgesdd.f | 1048 +++++++++++++++++++++++++++++--------------------- SRC/cgesvd.f | 40 +- SRC/dbdsdc.f | 21 +- SRC/dgesdd.f | 721 +++++++++++++++++++--------------- SRC/dgesvd.f | 608 ++++++++++++++--------------- SRC/sbdsdc.f | 7 +- SRC/sgesdd.f | 730 ++++++++++++++++++++--------------- SRC/sgesvd.f | 36 +- SRC/zgesdd.f | 1045 ++++++++++++++++++++++++++++--------------------- SRC/zgesvd.f | 40 +- 10 files changed, 2430 insertions(+), 1866 deletions(-) diff --git a/SRC/cgesdd.f b/SRC/cgesdd.f index 7f16b63b..370acd94 100644 --- a/SRC/cgesdd.f +++ b/SRC/cgesdd.f @@ -135,8 +135,8 @@ *> \param[in] LDU *> \verbatim *> LDU is INTEGER -*> The leading dimension of the array U. LDU >= 1; if -*> JOBZ = 'S' or 'A' or JOBZ = 'O' and M < N, LDU >= M. +*> The leading dimension of the array U. LDU >= 1; +*> if JOBZ = 'S' or 'A' or JOBZ = 'O' and M < N, LDU >= M. *> \endverbatim *> *> \param[out] VT @@ -152,8 +152,8 @@ *> \param[in] LDVT *> \verbatim *> LDVT is INTEGER -*> The leading dimension of the array VT. LDVT >= 1; if -*> JOBZ = 'A' or JOBZ = 'O' and M >= N, LDVT >= N; +*> The leading dimension of the array VT. LDVT >= 1; +*> if JOBZ = 'A' or JOBZ = 'O' and M >= N, LDVT >= N; *> if JOBZ = 'S', LDVT >= min(M,N). *> \endverbatim *> @@ -167,24 +167,28 @@ *> \verbatim *> LWORK is INTEGER *> The dimension of the array WORK. LWORK >= 1. -*> if JOBZ = 'N', LWORK >= 2*min(M,N)+max(M,N). -*> if JOBZ = 'O', -*> LWORK >= 2*min(M,N)*min(M,N)+2*min(M,N)+max(M,N). -*> if JOBZ = 'S' or 'A', -*> LWORK >= min(M,N)*min(M,N)+2*min(M,N)+max(M,N). -*> For good performance, LWORK should generally be larger. -*> *> If LWORK = -1, a workspace query is assumed. The optimal *> size for the WORK array is calculated and stored in WORK(1), *> and no other work except argument checking is performed. +*> +*> Let mx = max(M,N) and mn = min(M,N). +*> If JOBZ = 'N', LWORK >= 2*mn + mx. +*> If JOBZ = 'O', LWORK >= 2*mn*mn + 2*mn + mx. +*> If JOBZ = 'S', LWORK >= mn*mn + 3*mn. +*> If JOBZ = 'A', LWORK >= mn*mn + 2*mn + mx. +*> These are not tight minimums in all cases; see comments inside code. +*> For good performance, LWORK should generally be larger; +*> a query is recommended. *> \endverbatim *> *> \param[out] RWORK *> \verbatim *> RWORK is REAL array, dimension (MAX(1,LRWORK)) -*> If JOBZ = 'N', LRWORK >= 7*min(M,N). -*> Otherwise, -*> LRWORK >= min(M,N)*max(5*min(M,N)+7,2*max(M,N)+2*min(M,N)+1) +*> Let mx = max(M,N) and mn = min(M,N). +*> If JOBZ = 'N', LRWORK >= 5*mn (LAPACK <= 3.6 needs 7*mn); +*> else if mx >> mn, LRWORK >= 5*mn*mn + 5*mn; +*> else LRWORK >= max( 5*mn*mn + 5*mn, +*> 2*mx*mn + 2*mn*mn + mn ). *> \endverbatim *> *> \param[out] IWORK @@ -221,6 +225,7 @@ * ===================================================================== SUBROUTINE CGESDD( JOBZ, M, N, A, LDA, S, U, LDU, VT, LDVT, $ WORK, LWORK, RWORK, IWORK, INFO ) + implicit none * * -- LAPACK driver routine (version 3.6.0) -- * -- LAPACK is a software package provided by Univ. of Tennessee, -- @@ -241,8 +246,6 @@ * ===================================================================== * * .. Parameters .. - INTEGER LQUERV - PARAMETER ( LQUERV = -1 ) COMPLEX CZERO, CONE PARAMETER ( CZERO = ( 0.0E+0, 0.0E+0 ), $ CONE = ( 1.0E+0, 0.0E+0 ) ) @@ -250,16 +253,27 @@ PARAMETER ( ZERO = 0.0E+0, ONE = 1.0E+0 ) * .. * .. Local Scalars .. - LOGICAL WNTQA, WNTQAS, WNTQN, WNTQO, WNTQS + LOGICAL LQUERY, WNTQA, WNTQAS, WNTQN, WNTQO, WNTQS INTEGER BLK, CHUNK, I, IE, IERR, IL, IR, IRU, IRVT, $ ISCL, ITAU, ITAUP, ITAUQ, IU, IVT, LDWKVT, $ LDWRKL, LDWRKR, LDWRKU, MAXWRK, MINMN, MINWRK, $ MNTHR1, MNTHR2, NRWORK, NWORK, WRKBL - REAL ANRM, BIGNUM, EPS, SMLNUM + INTEGER LWORK_CGEBRD_MN, LWORK_CGEBRD_MM, + $ LWORK_CGEBRD_NN, LWORK_CGELQF_MN, + $ LWORK_CGEQRF_MN, + $ LWORK_CUNGBR_P_MN, LWORK_CUNGBR_P_NN, + $ LWORK_CUNGBR_Q_MN, LWORK_CUNGBR_Q_MM, + $ LWORK_CUNGLQ_MN, LWORK_CUNGLQ_NN, + $ LWORK_CUNGQR_MM, LWORK_CUNGQR_MN, + $ LWORK_CUNMBR_PRC_MM, LWORK_CUNMBR_QLN_MM, + $ LWORK_CUNMBR_PRC_MN, LWORK_CUNMBR_QLN_MN, + $ LWORK_CUNMBR_PRC_NN, LWORK_CUNMBR_QLN_NN + REAL ANRM, BIGNUM, EPS, SMLNUM * .. * .. Local Arrays .. INTEGER IDUM( 1 ) REAL DUM( 1 ) + COMPLEX CDUM( 1 ) * .. * .. External Subroutines .. EXTERNAL CGEBRD, CGELQF, CGEMM, CGEQRF, CLACP2, CLACPY, @@ -268,9 +282,8 @@ * .. * .. External Functions .. LOGICAL LSAME - INTEGER ILAENV - REAL CLANGE, SLAMCH - EXTERNAL CLANGE, SLAMCH, ILAENV, LSAME + REAL SLAMCH, CLANGE + EXTERNAL LSAME, SLAMCH, CLANGE * .. * .. Intrinsic Functions .. INTRINSIC INT, MAX, MIN, SQRT @@ -279,15 +292,16 @@ * * Test the input arguments * - INFO = 0 - MINMN = MIN( M, N ) - MNTHR1 = INT( MINMN*17.0 / 9.0 ) - MNTHR2 = INT( MINMN*5.0 / 3.0 ) - WNTQA = LSAME( JOBZ, 'A' ) - WNTQS = LSAME( JOBZ, 'S' ) + INFO = 0 + MINMN = MIN( M, N ) + MNTHR1 = INT( MINMN*17.0E0 / 9.0E0 ) + MNTHR2 = INT( MINMN*5.0E0 / 3.0E0 ) + WNTQA = LSAME( JOBZ, 'A' ) + WNTQS = LSAME( JOBZ, 'S' ) WNTQAS = WNTQA .OR. WNTQS - WNTQO = LSAME( JOBZ, 'O' ) - WNTQN = LSAME( JOBZ, 'N' ) + WNTQO = LSAME( JOBZ, 'O' ) + WNTQN = LSAME( JOBZ, 'N' ) + LQUERY = ( LWORK.EQ.-1 ) MINWRK = 1 MAXWRK = 1 * @@ -309,8 +323,8 @@ END IF * * Compute workspace -* (Note: Comments in the code beginning "Workspace:" describe the -* minimal amount of workspace needed at that point in the code, +* Note: Comments in the code beginning "Workspace:" describe the +* minimal amount of workspace allocated at that point in the code, * as well as the preferred amount for good performance. * CWorkspace refers to complex workspace, and RWorkspace to * real workspace. NB refers to the optimal block size for the @@ -320,233 +334,283 @@ IF( M.GE.N ) THEN * * There is no complex work space needed for bidiagonal SVD -* The real work space needed for bidiagonal SVD is BDSPAC -* for computing singular values and singular vectors; BDSPAN -* for computing singular values only. -* BDSPAC = 5*N*N + 7*N -* BDSPAN = MAX(7*N+4, 3*N+2+SMLSIZ*(SMLSIZ+8)) +* The real work space needed for bidiagonal SVD (sbdsdc) is +* BDSPAC = 3*N*N + 4*N for singular values and vectors; +* BDSPAC = 4*N for singular values only; +* not including e, RU, and RVT matrices. +* +* Compute space preferred for each routine + CALL CGEBRD( M, N, CDUM(1), M, CDUM(1), DUM(1), CDUM(1), + $ CDUM(1), CDUM(1), -1, IERR ) + LWORK_CGEBRD_MN = INT( CDUM(1) ) +* + CALL CGEBRD( N, N, CDUM(1), N, CDUM(1), DUM(1), CDUM(1), + $ CDUM(1), CDUM(1), -1, IERR ) + LWORK_CGEBRD_NN = INT( CDUM(1) ) +* + CALL CGEQRF( M, N, CDUM(1), M, CDUM(1), CDUM(1), -1, IERR ) + LWORK_CGEQRF_MN = INT( CDUM(1) ) +* + CALL CUNGBR( 'P', N, N, N, CDUM(1), N, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_CUNGBR_P_NN = INT( CDUM(1) ) +* + CALL CUNGBR( 'Q', M, M, N, CDUM(1), M, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_CUNGBR_Q_MM = INT( CDUM(1) ) +* + CALL CUNGBR( 'Q', M, N, N, CDUM(1), M, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_CUNGBR_Q_MN = INT( CDUM(1) ) +* + CALL CUNGQR( M, M, N, CDUM(1), M, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_CUNGQR_MM = INT( CDUM(1) ) +* + CALL CUNGQR( M, N, N, CDUM(1), M, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_CUNGQR_MN = INT( CDUM(1) ) +* + CALL CUNMBR( 'P', 'R', 'C', N, N, N, CDUM(1), N, CDUM(1), + $ CDUM(1), N, CDUM(1), -1, IERR ) + LWORK_CUNMBR_PRC_NN = INT( CDUM(1) ) +* + CALL CUNMBR( 'Q', 'L', 'N', M, M, N, CDUM(1), M, CDUM(1), + $ CDUM(1), M, CDUM(1), -1, IERR ) + LWORK_CUNMBR_QLN_MM = INT( CDUM(1) ) +* + CALL CUNMBR( 'Q', 'L', 'N', M, N, N, CDUM(1), M, CDUM(1), + $ CDUM(1), M, CDUM(1), -1, IERR ) + LWORK_CUNMBR_QLN_MN = INT( CDUM(1) ) +* + CALL CUNMBR( 'Q', 'L', 'N', N, N, N, CDUM(1), N, CDUM(1), + $ CDUM(1), N, CDUM(1), -1, IERR ) + LWORK_CUNMBR_QLN_NN = INT( CDUM(1) ) * IF( M.GE.MNTHR1 ) THEN IF( WNTQN ) THEN * -* Path 1 (M much larger than N, JOBZ='N') +* Path 1 (M >> N, JOBZ='N') * - MAXWRK = N + N*ILAENV( 1, 'CGEQRF', ' ', M, N, -1, - $ -1 ) - MAXWRK = MAX( MAXWRK, 2*N+2*N* - $ ILAENV( 1, 'CGEBRD', ' ', N, N, -1, -1 ) ) + MAXWRK = N + LWORK_CGEQRF_MN + MAXWRK = MAX( MAXWRK, 2*N + LWORK_CGEBRD_NN ) MINWRK = 3*N ELSE IF( WNTQO ) THEN * -* Path 2 (M much larger than N, JOBZ='O') -* - WRKBL = N + N*ILAENV( 1, 'CGEQRF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, N+N*ILAENV( 1, 'CUNGQR', ' ', M, - $ N, N, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+2*N* - $ ILAENV( 1, 'CGEBRD', ' ', N, N, -1, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+N* - $ ILAENV( 1, 'CUNMBR', 'QLN', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+N* - $ ILAENV( 1, 'CUNMBR', 'PRC', N, N, N, -1 ) ) +* Path 2 (M >> N, JOBZ='O') +* + WRKBL = N + LWORK_CGEQRF_MN + WRKBL = MAX( WRKBL, N + LWORK_CUNGQR_MN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_CGEBRD_NN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_CUNMBR_QLN_NN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_CUNMBR_PRC_NN ) MAXWRK = M*N + N*N + WRKBL MINWRK = 2*N*N + 3*N ELSE IF( WNTQS ) THEN * -* Path 3 (M much larger than N, JOBZ='S') -* - WRKBL = N + N*ILAENV( 1, 'CGEQRF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, N+N*ILAENV( 1, 'CUNGQR', ' ', M, - $ N, N, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+2*N* - $ ILAENV( 1, 'CGEBRD', ' ', N, N, -1, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+N* - $ ILAENV( 1, 'CUNMBR', 'QLN', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+N* - $ ILAENV( 1, 'CUNMBR', 'PRC', N, N, N, -1 ) ) +* Path 3 (M >> N, JOBZ='S') +* + WRKBL = N + LWORK_CGEQRF_MN + WRKBL = MAX( WRKBL, N + LWORK_CUNGQR_MN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_CGEBRD_NN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_CUNMBR_QLN_NN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_CUNMBR_PRC_NN ) MAXWRK = N*N + WRKBL MINWRK = N*N + 3*N ELSE IF( WNTQA ) THEN * -* Path 4 (M much larger than N, JOBZ='A') -* - WRKBL = N + N*ILAENV( 1, 'CGEQRF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, N+M*ILAENV( 1, 'CUNGQR', ' ', M, - $ M, N, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+2*N* - $ ILAENV( 1, 'CGEBRD', ' ', N, N, -1, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+N* - $ ILAENV( 1, 'CUNMBR', 'QLN', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+N* - $ ILAENV( 1, 'CUNMBR', 'PRC', N, N, N, -1 ) ) +* Path 4 (M >> N, JOBZ='A') +* + WRKBL = N + LWORK_CGEQRF_MN + WRKBL = MAX( WRKBL, N + LWORK_CUNGQR_MM ) + WRKBL = MAX( WRKBL, 2*N + LWORK_CGEBRD_NN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_CUNMBR_QLN_NN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_CUNMBR_PRC_NN ) MAXWRK = N*N + WRKBL - MINWRK = N*N + 2*N + M + MINWRK = N*N + MAX( 3*N, N + M ) END IF ELSE IF( M.GE.MNTHR2 ) THEN * -* Path 5 (M much larger than N, but not as much as MNTHR1) +* Path 5 (M >> N, but not as much as MNTHR1) * - MAXWRK = 2*N + ( M+N )*ILAENV( 1, 'CGEBRD', ' ', M, N, - $ -1, -1 ) + MAXWRK = 2*N + LWORK_CGEBRD_MN MINWRK = 2*N + M IF( WNTQO ) THEN - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'CUNGBR', 'P', N, N, N, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'CUNGBR', 'Q', M, N, N, -1 ) ) +* Path 5o (M >> N, JOBZ='O') + MAXWRK = MAX( MAXWRK, 2*N + LWORK_CUNGBR_P_NN ) + MAXWRK = MAX( MAXWRK, 2*N + LWORK_CUNGBR_Q_MN ) MAXWRK = MAXWRK + M*N MINWRK = MINWRK + N*N ELSE IF( WNTQS ) THEN - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'CUNGBR', 'P', N, N, N, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'CUNGBR', 'Q', M, N, N, -1 ) ) +* Path 5s (M >> N, JOBZ='S') + MAXWRK = MAX( MAXWRK, 2*N + LWORK_CUNGBR_P_NN ) + MAXWRK = MAX( MAXWRK, 2*N + LWORK_CUNGBR_Q_MN ) ELSE IF( WNTQA ) THEN - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'CUNGBR', 'P', N, N, N, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*N+M* - $ ILAENV( 1, 'CUNGBR', 'Q', M, M, N, -1 ) ) +* Path 5a (M >> N, JOBZ='A') + MAXWRK = MAX( MAXWRK, 2*N + LWORK_CUNGBR_P_NN ) + MAXWRK = MAX( MAXWRK, 2*N + LWORK_CUNGBR_Q_MM ) END IF ELSE * -* Path 6 (M at least N, but not much larger) +* Path 6 (M >= N, but not much larger) * - MAXWRK = 2*N + ( M+N )*ILAENV( 1, 'CGEBRD', ' ', M, N, - $ -1, -1 ) + MAXWRK = 2*N + LWORK_CGEBRD_MN MINWRK = 2*N + M IF( WNTQO ) THEN - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'CUNMBR', 'PRC', N, N, N, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'CUNMBR', 'QLN', M, N, N, -1 ) ) +* Path 6o (M >= N, JOBZ='O') + MAXWRK = MAX( MAXWRK, 2*N + LWORK_CUNMBR_PRC_NN ) + MAXWRK = MAX( MAXWRK, 2*N + LWORK_CUNMBR_QLN_MN ) MAXWRK = MAXWRK + M*N MINWRK = MINWRK + N*N ELSE IF( WNTQS ) THEN - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'CUNMBR', 'PRC', N, N, N, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'CUNMBR', 'QLN', M, N, N, -1 ) ) +* Path 6s (M >= N, JOBZ='S') + MAXWRK = MAX( MAXWRK, 2*N + LWORK_CUNMBR_QLN_MN ) + MAXWRK = MAX( MAXWRK, 2*N + LWORK_CUNMBR_PRC_NN ) ELSE IF( WNTQA ) THEN - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'CUNGBR', 'PRC', N, N, N, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*N+M* - $ ILAENV( 1, 'CUNGBR', 'QLN', M, M, N, -1 ) ) +* Path 6a (M >= N, JOBZ='A') + MAXWRK = MAX( MAXWRK, 2*N + LWORK_CUNMBR_QLN_MM ) + MAXWRK = MAX( MAXWRK, 2*N + LWORK_CUNMBR_PRC_NN ) END IF END IF ELSE * * There is no complex work space needed for bidiagonal SVD -* The real work space needed for bidiagonal SVD is BDSPAC -* for computing singular values and singular vectors; BDSPAN -* for computing singular values only. -* BDSPAC = 5*M*M + 7*M -* BDSPAN = MAX(7*M+4, 3*M+2+SMLSIZ*(SMLSIZ+8)) +* The real work space needed for bidiagonal SVD (sbdsdc) is +* BDSPAC = 3*M*M + 4*M for singular values and vectors; +* BDSPAC = 4*M for singular values only; +* not including e, RU, and RVT matrices. +* +* Compute space preferred for each routine + CALL CGEBRD( M, N, CDUM(1), M, CDUM(1), DUM(1), CDUM(1), + $ CDUM(1), CDUM(1), -1, IERR ) + LWORK_CGEBRD_MN = INT( CDUM(1) ) +* + CALL CGEBRD( M, M, CDUM(1), M, CDUM(1), DUM(1), CDUM(1), + $ CDUM(1), CDUM(1), -1, IERR ) + LWORK_CGEBRD_MM = INT( CDUM(1) ) +* + CALL CGELQF( M, N, CDUM(1), M, CDUM(1), CDUM(1), -1, IERR ) + LWORK_CGELQF_MN = INT( CDUM(1) ) +* + CALL CUNGBR( 'P', M, N, M, CDUM(1), M, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_CUNGBR_P_MN = INT( CDUM(1) ) +* + CALL CUNGBR( 'P', N, N, M, CDUM(1), N, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_CUNGBR_P_NN = INT( CDUM(1) ) +* + CALL CUNGBR( 'Q', M, M, N, CDUM(1), M, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_CUNGBR_Q_MM = INT( CDUM(1) ) +* + CALL CUNGLQ( M, N, M, CDUM(1), M, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_CUNGLQ_MN = INT( CDUM(1) ) +* + CALL CUNGLQ( N, N, M, CDUM(1), N, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_CUNGLQ_NN = INT( CDUM(1) ) +* + CALL CUNMBR( 'P', 'R', 'C', M, M, M, CDUM(1), M, CDUM(1), + $ CDUM(1), M, CDUM(1), -1, IERR ) + LWORK_CUNMBR_PRC_MM = INT( CDUM(1) ) +* + CALL CUNMBR( 'P', 'R', 'C', M, N, M, CDUM(1), M, CDUM(1), + $ CDUM(1), M, CDUM(1), -1, IERR ) + LWORK_CUNMBR_PRC_MN = INT( CDUM(1) ) +* + CALL CUNMBR( 'P', 'R', 'C', N, N, M, CDUM(1), N, CDUM(1), + $ CDUM(1), N, CDUM(1), -1, IERR ) + LWORK_CUNMBR_PRC_NN = INT( CDUM(1) ) +* + CALL CUNMBR( 'Q', 'L', 'N', M, M, M, CDUM(1), M, CDUM(1), + $ CDUM(1), M, CDUM(1), -1, IERR ) + LWORK_CUNMBR_QLN_MM = INT( CDUM(1) ) * IF( N.GE.MNTHR1 ) THEN IF( WNTQN ) THEN * -* Path 1t (N much larger than M, JOBZ='N') +* Path 1t (N >> M, JOBZ='N') * - MAXWRK = M + M*ILAENV( 1, 'CGELQF', ' ', M, N, -1, - $ -1 ) - MAXWRK = MAX( MAXWRK, 2*M+2*M* - $ ILAENV( 1, 'CGEBRD', ' ', M, M, -1, -1 ) ) + MAXWRK = M + LWORK_CGELQF_MN + MAXWRK = MAX( MAXWRK, 2*M + LWORK_CGEBRD_MM ) MINWRK = 3*M ELSE IF( WNTQO ) THEN * -* Path 2t (N much larger than M, JOBZ='O') -* - WRKBL = M + M*ILAENV( 1, 'CGELQF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, M+M*ILAENV( 1, 'CUNGLQ', ' ', M, - $ N, M, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+2*M* - $ ILAENV( 1, 'CGEBRD', ' ', M, M, -1, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+M* - $ ILAENV( 1, 'CUNMBR', 'PRC', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+M* - $ ILAENV( 1, 'CUNMBR', 'QLN', M, M, M, -1 ) ) +* Path 2t (N >> M, JOBZ='O') +* + WRKBL = M + LWORK_CGELQF_MN + WRKBL = MAX( WRKBL, M + LWORK_CUNGLQ_MN ) + WRKBL = MAX( WRKBL, 2*M + LWORK_CGEBRD_MM ) + WRKBL = MAX( WRKBL, 2*M + LWORK_CUNMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 2*M + LWORK_CUNMBR_PRC_MM ) MAXWRK = M*N + M*M + WRKBL MINWRK = 2*M*M + 3*M ELSE IF( WNTQS ) THEN * -* Path 3t (N much larger than M, JOBZ='S') -* - WRKBL = M + M*ILAENV( 1, 'CGELQF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, M+M*ILAENV( 1, 'CUNGLQ', ' ', M, - $ N, M, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+2*M* - $ ILAENV( 1, 'CGEBRD', ' ', M, M, -1, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+M* - $ ILAENV( 1, 'CUNMBR', 'PRC', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+M* - $ ILAENV( 1, 'CUNMBR', 'QLN', M, M, M, -1 ) ) +* Path 3t (N >> M, JOBZ='S') +* + WRKBL = M + LWORK_CGELQF_MN + WRKBL = MAX( WRKBL, M + LWORK_CUNGLQ_MN ) + WRKBL = MAX( WRKBL, 2*M + LWORK_CGEBRD_MM ) + WRKBL = MAX( WRKBL, 2*M + LWORK_CUNMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 2*M + LWORK_CUNMBR_PRC_MM ) MAXWRK = M*M + WRKBL MINWRK = M*M + 3*M ELSE IF( WNTQA ) THEN * -* Path 4t (N much larger than M, JOBZ='A') -* - WRKBL = M + M*ILAENV( 1, 'CGELQF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, M+N*ILAENV( 1, 'CUNGLQ', ' ', N, - $ N, M, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+2*M* - $ ILAENV( 1, 'CGEBRD', ' ', M, M, -1, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+M* - $ ILAENV( 1, 'CUNMBR', 'PRC', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+M* - $ ILAENV( 1, 'CUNMBR', 'QLN', M, M, M, -1 ) ) +* Path 4t (N >> M, JOBZ='A') +* + WRKBL = M + LWORK_CGELQF_MN + WRKBL = MAX( WRKBL, M + LWORK_CUNGLQ_NN ) + WRKBL = MAX( WRKBL, 2*M + LWORK_CGEBRD_MM ) + WRKBL = MAX( WRKBL, 2*M + LWORK_CUNMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 2*M + LWORK_CUNMBR_PRC_MM ) MAXWRK = M*M + WRKBL - MINWRK = M*M + 2*M + N + MINWRK = M*M + MAX( 3*M, M + N ) END IF ELSE IF( N.GE.MNTHR2 ) THEN * -* Path 5t (N much larger than M, but not as much as MNTHR1) +* Path 5t (N >> M, but not as much as MNTHR1) * - MAXWRK = 2*M + ( M+N )*ILAENV( 1, 'CGEBRD', ' ', M, N, - $ -1, -1 ) + MAXWRK = 2*M + LWORK_CGEBRD_MN MINWRK = 2*M + N IF( WNTQO ) THEN - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'CUNGBR', 'P', M, N, M, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'CUNGBR', 'Q', M, M, N, -1 ) ) +* Path 5to (N >> M, JOBZ='O') + MAXWRK = MAX( MAXWRK, 2*M + LWORK_CUNGBR_Q_MM ) + MAXWRK = MAX( MAXWRK, 2*M + LWORK_CUNGBR_P_MN ) MAXWRK = MAXWRK + M*N MINWRK = MINWRK + M*M ELSE IF( WNTQS ) THEN - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'CUNGBR', 'P', M, N, M, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'CUNGBR', 'Q', M, M, N, -1 ) ) +* Path 5ts (N >> M, JOBZ='S') + MAXWRK = MAX( MAXWRK, 2*M + LWORK_CUNGBR_Q_MM ) + MAXWRK = MAX( MAXWRK, 2*M + LWORK_CUNGBR_P_MN ) ELSE IF( WNTQA ) THEN - MAXWRK = MAX( MAXWRK, 2*M+N* - $ ILAENV( 1, 'CUNGBR', 'P', N, N, M, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'CUNGBR', 'Q', M, M, N, -1 ) ) +* Path 5ta (N >> M, JOBZ='A') + MAXWRK = MAX( MAXWRK, 2*M + LWORK_CUNGBR_Q_MM ) + MAXWRK = MAX( MAXWRK, 2*M + LWORK_CUNGBR_P_NN ) END IF ELSE * -* Path 6t (N greater than M, but not much larger) +* Path 6t (N > M, but not much larger) * - MAXWRK = 2*M + ( M+N )*ILAENV( 1, 'CGEBRD', ' ', M, N, - $ -1, -1 ) + MAXWRK = 2*M + LWORK_CGEBRD_MN MINWRK = 2*M + N IF( WNTQO ) THEN - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'CUNMBR', 'PRC', M, N, M, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'CUNMBR', 'QLN', M, M, N, -1 ) ) +* Path 6to (N > M, JOBZ='O') + MAXWRK = MAX( MAXWRK, 2*M + LWORK_CUNMBR_QLN_MM ) + MAXWRK = MAX( MAXWRK, 2*M + LWORK_CUNMBR_PRC_MN ) MAXWRK = MAXWRK + M*N MINWRK = MINWRK + M*M ELSE IF( WNTQS ) THEN - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'CUNGBR', 'PRC', M, N, M, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'CUNGBR', 'QLN', M, M, N, -1 ) ) +* Path 6ts (N > M, JOBZ='S') + MAXWRK = MAX( MAXWRK, 2*M + LWORK_CUNMBR_QLN_MM ) + MAXWRK = MAX( MAXWRK, 2*M + LWORK_CUNMBR_PRC_MN ) ELSE IF( WNTQA ) THEN - MAXWRK = MAX( MAXWRK, 2*M+N* - $ ILAENV( 1, 'CUNGBR', 'PRC', N, N, M, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'CUNGBR', 'QLN', M, M, N, -1 ) ) +* Path 6ta (N > M, JOBZ='A') + MAXWRK = MAX( MAXWRK, 2*M + LWORK_CUNMBR_QLN_MM ) + MAXWRK = MAX( MAXWRK, 2*M + LWORK_CUNMBR_PRC_NN ) END IF END IF END IF @@ -554,18 +618,20 @@ END IF IF( INFO.EQ.0 ) THEN WORK( 1 ) = MAXWRK - IF( LWORK.LT.MINWRK .AND. LWORK.NE.LQUERV ) - $ INFO = -13 + IF( LWORK.LT.MINWRK .AND. .NOT. LQUERY ) THEN + INFO = -12 + END IF END IF -* -* Quick returns * IF( INFO.NE.0 ) THEN CALL XERBLA( 'CGESDD', -INFO ) RETURN + ELSE IF( LQUERY ) THEN + RETURN END IF - IF( LWORK.EQ.LQUERV ) - $ RETURN +* +* Quick return if possible +* IF( M.EQ.0 .OR. N.EQ.0 ) THEN RETURN END IF @@ -598,15 +664,16 @@ * IF( WNTQN ) THEN * -* Path 1 (M much larger than N, JOBZ='N') +* Path 1 (M >> N, JOBZ='N') * No singular vectors to be computed * ITAU = 1 NWORK = ITAU + N * * Compute A=Q*R -* (CWorkspace: need 2*N, prefer N+N*NB) -* (RWorkspace: need 0) +* CWorkspace: need N [tau] + N [work] +* CWorkspace: prefer N [tau] + N*NB [work] +* RWorkspace: need 0 * CALL CGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) @@ -621,8 +688,9 @@ NWORK = ITAUP + N * * Bidiagonalize R in A -* (CWorkspace: need 3*N, prefer 2*N+2*N*NB) -* (RWorkspace: need N) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + 2*N*NB [work] +* RWorkspace: need N [e] * CALL CGEBRD( N, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -630,15 +698,15 @@ NRWORK = IE + N * * Perform bidiagonal SVD, compute singular values only -* (CWorkspace: 0) -* (RWorkspace: need BDSPAN) +* CWorkspace: need 0 +* RWorkspace: need N [e] + BDSPAC * - CALL SBDSDC( 'U', 'N', N, S, RWORK( IE ), DUM, 1, DUM, 1, + CALL SBDSDC( 'U', 'N', N, S, RWORK( IE ), DUM,1,DUM,1, $ DUM, IDUM, RWORK( NRWORK ), IWORK, INFO ) * ELSE IF( WNTQO ) THEN * -* Path 2 (M much larger than N, JOBZ='O') +* Path 2 (M >> N, JOBZ='O') * N left singular vectors to be overwritten on A and * N right singular vectors to be computed in VT * @@ -648,20 +716,21 @@ * LDWRKU = N IR = IU + LDWRKU*N - IF( LWORK.GE.M*N+N*N+3*N ) THEN + IF( LWORK .GE. M*N + N*N + 3*N ) THEN * * WORK(IR) is M by N * LDWRKR = M ELSE - LDWRKR = ( LWORK-N*N-3*N ) / N + LDWRKR = ( LWORK - N*N - 3*N ) / N END IF ITAU = IR + LDWRKR*N NWORK = ITAU + N * * Compute A=Q*R -* (CWorkspace: need N*N+2*N, prefer M*N+N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + N*N [R] + N [tau] + N [work] +* CWorkspace: prefer N*N [U] + N*N [R] + N [tau] + N*NB [work] +* RWorkspace: need 0 * CALL CGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) @@ -673,8 +742,9 @@ $ LDWRKR ) * * Generate Q in A -* (CWorkspace: need 2*N, prefer N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + N*N [R] + N [tau] + N [work] +* CWorkspace: prefer N*N [U] + N*N [R] + N [tau] + N*NB [work] +* RWorkspace: need 0 * CALL CUNGQR( M, N, N, A, LDA, WORK( ITAU ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) @@ -684,8 +754,9 @@ NWORK = ITAUP + N * * Bidiagonalize R in WORK(IR) -* (CWorkspace: need N*N+3*N, prefer M*N+2*N+2*N*NB) -* (RWorkspace: need N) +* CWorkspace: need N*N [U] + N*N [R] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [U] + N*N [R] + 2*N [tauq, taup] + 2*N*NB [work] +* RWorkspace: need N [e] * CALL CGEBRD( N, N, WORK( IR ), LDWRKR, S, RWORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), @@ -694,8 +765,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of R in WORK(IRU) and computing right singular vectors * of R in WORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * IRU = IE + N IRVT = IRU + N*N @@ -706,8 +777,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix WORK(IU) * Overwrite WORK(IU) by the left singular vectors of R -* (CWorkspace: need 2*N*N+3*N, prefer M*N+N*N+2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + N*N [R] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [U] + N*N [R] + 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL CLACP2( 'F', N, N, RWORK( IRU ), N, WORK( IU ), $ LDWRKU ) @@ -717,8 +789,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by the right singular vectors of R -* (CWorkspace: need N*N+3*N, prefer M*N+2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + N*N [R] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [U] + N*N [R] + 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL CLACP2( 'F', N, N, RWORK( IRVT ), N, VT, LDVT ) CALL CUNMBR( 'P', 'R', 'C', N, N, N, WORK( IR ), LDWRKR, @@ -727,8 +800,9 @@ * * Multiply Q in A by left singular vectors of R in * WORK(IU), storing result in WORK(IR) and copying to A -* (CWorkspace: need 2*N*N, prefer N*N+M*N) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + N*N [R] +* CWorkspace: prefer N*N [U] + M*N [R] +* RWorkspace: need 0 * DO 10 I = 1, M, LDWRKR CHUNK = MIN( M-I+1, LDWRKR ) @@ -741,7 +815,7 @@ * ELSE IF( WNTQS ) THEN * -* Path 3 (M much larger than N, JOBZ='S') +* Path 3 (M >> N, JOBZ='S') * N left singular vectors to be computed in U and * N right singular vectors to be computed in VT * @@ -754,8 +828,9 @@ NWORK = ITAU + N * * Compute A=Q*R -* (CWorkspace: need N*N+2*N, prefer N*N+N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [R] + N [tau] + N [work] +* CWorkspace: prefer N*N [R] + N [tau] + N*NB [work] +* RWorkspace: need 0 * CALL CGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) @@ -767,8 +842,9 @@ $ LDWRKR ) * * Generate Q in A -* (CWorkspace: need 2*N, prefer N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [R] + N [tau] + N [work] +* CWorkspace: prefer N*N [R] + N [tau] + N*NB [work] +* RWorkspace: need 0 * CALL CUNGQR( M, N, N, A, LDA, WORK( ITAU ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) @@ -778,8 +854,9 @@ NWORK = ITAUP + N * * Bidiagonalize R in WORK(IR) -* (CWorkspace: need N*N+3*N, prefer N*N+2*N+2*N*NB) -* (RWorkspace: need N) +* CWorkspace: need N*N [R] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [R] + 2*N [tauq, taup] + 2*N*NB [work] +* RWorkspace: need N [e] * CALL CGEBRD( N, N, WORK( IR ), LDWRKR, S, RWORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), @@ -788,8 +865,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * IRU = IE + N IRVT = IRU + N*N @@ -800,8 +877,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of R -* (CWorkspace: need N*N+3*N, prefer N*N+2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [R] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [R] + 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL CLACP2( 'F', N, N, RWORK( IRU ), N, U, LDU ) CALL CUNMBR( 'Q', 'L', 'N', N, N, N, WORK( IR ), LDWRKR, @@ -810,8 +888,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by right singular vectors of R -* (CWorkspace: need N*N+3*N, prefer N*N+2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [R] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [R] + 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL CLACP2( 'F', N, N, RWORK( IRVT ), N, VT, LDVT ) CALL CUNMBR( 'P', 'R', 'C', N, N, N, WORK( IR ), LDWRKR, @@ -820,8 +899,8 @@ * * Multiply Q in A by left singular vectors of R in * WORK(IR), storing result in U -* (CWorkspace: need N*N) -* (RWorkspace: 0) +* CWorkspace: need N*N [R] +* RWorkspace: need 0 * CALL CLACPY( 'F', N, N, U, LDU, WORK( IR ), LDWRKR ) CALL CGEMM( 'N', 'N', M, N, N, CONE, A, LDA, WORK( IR ), @@ -829,7 +908,7 @@ * ELSE IF( WNTQA ) THEN * -* Path 4 (M much larger than N, JOBZ='A') +* Path 4 (M >> N, JOBZ='A') * M left singular vectors to be computed in U and * N right singular vectors to be computed in VT * @@ -842,16 +921,18 @@ NWORK = ITAU + N * * Compute A=Q*R, copying result to U -* (CWorkspace: need 2*N, prefer N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + N [tau] + N [work] +* CWorkspace: prefer N*N [U] + N [tau] + N*NB [work] +* RWorkspace: need 0 * CALL CGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) CALL CLACPY( 'L', M, N, A, LDA, U, LDU ) * * Generate Q in U -* (CWorkspace: need N+M, prefer N+M*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + N [tau] + M [work] +* CWorkspace: prefer N*N [U] + N [tau] + M*NB [work] +* RWorkspace: need 0 * CALL CUNGQR( M, M, N, U, LDU, WORK( ITAU ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) @@ -866,8 +947,9 @@ NWORK = ITAUP + N * * Bidiagonalize R in A -* (CWorkspace: need 3*N, prefer 2*N+2*N*NB) -* (RWorkspace: need N) +* CWorkspace: need N*N [U] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [U] + 2*N [tauq, taup] + 2*N*NB [work] +* RWorkspace: need N [e] * CALL CGEBRD( N, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -879,8 +961,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * CALL SBDSDC( 'U', 'I', N, S, RWORK( IE ), RWORK( IRU ), $ N, RWORK( IRVT ), N, DUM, IDUM, @@ -888,8 +970,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix WORK(IU) * Overwrite WORK(IU) by left singular vectors of R -* (CWorkspace: need N*N+3*N, prefer N*N+2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [U] + 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL CLACP2( 'F', N, N, RWORK( IRU ), N, WORK( IU ), $ LDWRKU ) @@ -899,8 +982,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by right singular vectors of R -* (CWorkspace: need 3*N, prefer 2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [U] + 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL CLACP2( 'F', N, N, RWORK( IRVT ), N, VT, LDVT ) CALL CUNMBR( 'P', 'R', 'C', N, N, N, A, LDA, @@ -909,8 +993,8 @@ * * Multiply Q in U by left singular vectors of R in * WORK(IU), storing result in A -* (CWorkspace: need N*N) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] +* RWorkspace: need 0 * CALL CGEMM( 'N', 'N', M, N, N, CONE, U, LDU, WORK( IU ), $ LDWRKU, CZERO, A, LDA ) @@ -925,7 +1009,7 @@ * * MNTHR2 <= M < MNTHR1 * -* Path 5 (M much larger than N, but not as much as MNTHR1) +* Path 5 (M >> N, but not as much as MNTHR1) * Reduce to bidiagonal form without QR decomposition, use * CUNGBR and matrix multiplication to compute singular vectors * @@ -936,19 +1020,21 @@ NWORK = ITAUP + N * * Bidiagonalize A -* (CWorkspace: need 2*N+M, prefer 2*N+(M+N)*NB) -* (RWorkspace: need N) +* CWorkspace: need 2*N [tauq, taup] + M [work] +* CWorkspace: prefer 2*N [tauq, taup] + (M+N)*NB [work] +* RWorkspace: need N [e] * CALL CGEBRD( M, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, $ IERR ) IF( WNTQN ) THEN * +* Path 5n (M >> N, JOBZ='N') * Compute singular values only -* (Cworkspace: 0) -* (Rworkspace: need BDSPAN) +* CWorkspace: need 0 +* RWorkspace: need N [e] + BDSPAC * - CALL SBDSDC( 'U', 'N', N, S, RWORK( IE ), DUM, 1, DUM, 1, + CALL SBDSDC( 'U', 'N', N, S, RWORK( IE ), DUM, 1,DUM,1, $ DUM, IDUM, RWORK( NRWORK ), IWORK, INFO ) ELSE IF( WNTQO ) THEN IU = NWORK @@ -956,22 +1042,25 @@ IRVT = IRU + N*N NRWORK = IRVT + N*N * +* Path 5o (M >> N, JOBZ='O') * Copy A to VT, generate P**H -* (Cworkspace: need 2*N, prefer N+N*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL CLACPY( 'U', N, N, A, LDA, VT, LDVT ) CALL CUNGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Generate Q in A -* (CWorkspace: need 2*N, prefer N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL CUNGBR( 'Q', M, N, N, A, LDA, WORK( ITAUQ ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * - IF( LWORK.GE.M*N+3*N ) THEN + IF( LWORK .GE. M*N + 3*N ) THEN * * WORK( IU ) is M by N * @@ -980,15 +1069,15 @@ * * WORK(IU) is LDWRKU by N * - LDWRKU = ( LWORK-3*N ) / N + LDWRKU = ( LWORK - 3*N ) / N END IF NWORK = IU + LDWRKU*N * * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * CALL SBDSDC( 'U', 'I', N, S, RWORK( IE ), RWORK( IRU ), $ N, RWORK( IRVT ), N, DUM, IDUM, @@ -996,8 +1085,8 @@ * * Multiply real matrix RWORK(IRVT) by P**H in VT, * storing the result in WORK(IU), copying to VT -* (Cworkspace: need 0) -* (Rworkspace: need 3*N*N) +* CWorkspace: need 2*N [tauq, taup] + N*N [U] +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + 2*N*N [rwork] * CALL CLARCM( N, N, RWORK( IRVT ), N, VT, LDVT, $ WORK( IU ), LDWRKU, RWORK( NRWORK ) ) @@ -1005,8 +1094,10 @@ * * Multiply Q in A by real matrix RWORK(IRU), storing the * result in WORK(IU), copying to A -* (CWorkspace: need N*N, prefer M*N) -* (Rworkspace: need 3*N*N, prefer N*N+2*M*N) +* CWorkspace: need 2*N [tauq, taup] + N*N [U] +* CWorkspace: prefer 2*N [tauq, taup] + M*N [U] +* RWorkspace: need N [e] + N*N [RU] + 2*N*N [rwork] +* RWorkspace: prefer N [e] + N*N [RU] + 2*M*N [rwork] < N + 5*N*N since M < 2*N here * NRWORK = IRVT DO 20 I = 1, M, LDWRKU @@ -1019,17 +1110,20 @@ * ELSE IF( WNTQS ) THEN * +* Path 5s (M >> N, JOBZ='S') * Copy A to VT, generate P**H -* (Cworkspace: need 2*N, prefer N+N*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL CLACPY( 'U', N, N, A, LDA, VT, LDVT ) CALL CUNGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Copy A to U, generate Q -* (Cworkspace: need 2*N, prefer N+N*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL CLACPY( 'L', M, N, A, LDA, U, LDU ) CALL CUNGBR( 'Q', M, N, N, U, LDU, WORK( ITAUQ ), @@ -1038,8 +1132,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * IRU = NRWORK IRVT = IRU + N*N @@ -1050,8 +1144,8 @@ * * Multiply real matrix RWORK(IRVT) by P**H in VT, * storing the result in A, copying to VT -* (Cworkspace: need 0) -* (Rworkspace: need 3*N*N) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + 2*N*N [rwork] * CALL CLARCM( N, N, RWORK( IRVT ), N, VT, LDVT, A, LDA, $ RWORK( NRWORK ) ) @@ -1059,8 +1153,8 @@ * * Multiply Q in U by real matrix RWORK(IRU), storing the * result in A, copying to U -* (CWorkspace: need 0) -* (Rworkspace: need N*N+2*M*N) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + 2*M*N [rwork] < N + 5*N*N since M < 2*N here * NRWORK = IRVT CALL CLACRM( M, N, U, LDU, RWORK( IRU ), N, A, LDA, @@ -1068,17 +1162,20 @@ CALL CLACPY( 'F', M, N, A, LDA, U, LDU ) ELSE * +* Path 5a (M >> N, JOBZ='A') * Copy A to VT, generate P**H -* (Cworkspace: need 2*N, prefer N+N*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL CLACPY( 'U', N, N, A, LDA, VT, LDVT ) CALL CUNGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Copy A to U, generate Q -* (Cworkspace: need 2*N, prefer N+N*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + M [work] +* CWorkspace: prefer 2*N [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL CLACPY( 'L', M, N, A, LDA, U, LDU ) CALL CUNGBR( 'Q', M, M, N, U, LDU, WORK( ITAUQ ), @@ -1087,8 +1184,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * IRU = NRWORK IRVT = IRU + N*N @@ -1099,8 +1196,8 @@ * * Multiply real matrix RWORK(IRVT) by P**H in VT, * storing the result in A, copying to VT -* (Cworkspace: need 0) -* (Rworkspace: need 3*N*N) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + 2*N*N [rwork] * CALL CLARCM( N, N, RWORK( IRVT ), N, VT, LDVT, A, LDA, $ RWORK( NRWORK ) ) @@ -1108,8 +1205,8 @@ * * Multiply Q in U by real matrix RWORK(IRU), storing the * result in A, copying to U -* (CWorkspace: 0) -* (Rworkspace: need 3*N*N) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + 2*M*N [rwork] < N + 5*N*N since M < 2*N here * NRWORK = IRVT CALL CLACRM( M, N, U, LDU, RWORK( IRU ), N, A, LDA, @@ -1121,7 +1218,7 @@ * * M .LT. MNTHR2 * -* Path 6 (M at least N, but not much larger) +* Path 6 (M >= N, but not much larger) * Reduce to bidiagonal form without QR decomposition * Use CUNMBR to compute singular vectors * @@ -1132,26 +1229,28 @@ NWORK = ITAUP + N * * Bidiagonalize A -* (CWorkspace: need 2*N+M, prefer 2*N+(M+N)*NB) -* (RWorkspace: need N) +* CWorkspace: need 2*N [tauq, taup] + M [work] +* CWorkspace: prefer 2*N [tauq, taup] + (M+N)*NB [work] +* RWorkspace: need N [e] * CALL CGEBRD( M, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, $ IERR ) IF( WNTQN ) THEN * +* Path 6n (M >= N, JOBZ='N') * Compute singular values only -* (Cworkspace: 0) -* (Rworkspace: need BDSPAN) +* CWorkspace: need 0 +* RWorkspace: need N [e] + BDSPAC * - CALL SBDSDC( 'U', 'N', N, S, RWORK( IE ), DUM, 1, DUM, 1, + CALL SBDSDC( 'U', 'N', N, S, RWORK( IE ), DUM,1,DUM,1, $ DUM, IDUM, RWORK( NRWORK ), IWORK, INFO ) ELSE IF( WNTQO ) THEN IU = NWORK IRU = NRWORK IRVT = IRU + N*N NRWORK = IRVT + N*N - IF( LWORK.GE.M*N+3*N ) THEN + IF( LWORK .GE. M*N + 3*N ) THEN * * WORK( IU ) is M by N * @@ -1160,15 +1259,16 @@ * * WORK( IU ) is LDWRKU by N * - LDWRKU = ( LWORK-3*N ) / N + LDWRKU = ( LWORK - 3*N ) / N END IF NWORK = IU + LDWRKU*N * +* Path 6o (M >= N, JOBZ='O') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * CALL SBDSDC( 'U', 'I', N, S, RWORK( IE ), RWORK( IRU ), $ N, RWORK( IRVT ), N, DUM, IDUM, @@ -1176,21 +1276,24 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by right singular vectors of A -* (Cworkspace: need 2*N, prefer N+N*NB) -* (Rworkspace: need 0) +* CWorkspace: need 2*N [tauq, taup] + N*N [U] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*N [U] + N*NB [work] +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] * CALL CLACP2( 'F', N, N, RWORK( IRVT ), N, VT, LDVT ) CALL CUNMBR( 'P', 'R', 'C', N, N, N, A, LDA, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), $ LWORK-NWORK+1, IERR ) * - IF( LWORK.GE.M*N+3*N ) THEN + IF( LWORK .GE. M*N + 3*N ) THEN * -* Copy real matrix RWORK(IRU) to complex matrix WORK(IU) -* Overwrite WORK(IU) by left singular vectors of A, copying -* to A -* (Cworkspace: need M*N+2*N, prefer M*N+N+N*NB) -* (Rworkspace: need 0) +* Path 6o-fast +* Copy real matrix RWORK(IRU) to complex matrix WORK(IU) +* Overwrite WORK(IU) by left singular vectors of A, copying +* to A +* CWorkspace: need 2*N [tauq, taup] + M*N [U] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + M*N [U] + N*NB [work] +* RWorkspace: need N [e] + N*N [RU] * CALL CLASET( 'F', M, N, CZERO, CZERO, WORK( IU ), $ LDWRKU ) @@ -1202,17 +1305,21 @@ CALL CLACPY( 'F', M, N, WORK( IU ), LDWRKU, A, LDA ) ELSE * +* Path 6o-slow * Generate Q in A -* (Cworkspace: need 2*N, prefer N+N*NB) -* (Rworkspace: need 0) +* CWorkspace: need 2*N [tauq, taup] + N*N [U] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*N [U] + N*NB [work] +* RWorkspace: need 0 * CALL CUNGBR( 'Q', M, N, N, A, LDA, WORK( ITAUQ ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Multiply Q in A by real matrix RWORK(IRU), storing the * result in WORK(IU), copying to A -* (CWorkspace: need N*N, prefer M*N) -* (Rworkspace: need 3*N*N, prefer N*N+2*M*N) +* CWorkspace: need 2*N [tauq, taup] + N*N [U] +* CWorkspace: prefer 2*N [tauq, taup] + M*N [U] +* RWorkspace: need N [e] + N*N [RU] + 2*N*N [rwork] +* RWorkspace: prefer N [e] + N*N [RU] + 2*M*N [rwork] < N + 5*N*N since M < 2*N here * NRWORK = IRVT DO 30 I = 1, M, LDWRKU @@ -1227,11 +1334,12 @@ * ELSE IF( WNTQS ) THEN * +* Path 6s (M >= N, JOBZ='S') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * IRU = NRWORK IRVT = IRU + N*N @@ -1242,8 +1350,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of A -* (CWorkspace: need 3*N, prefer 2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] * CALL CLASET( 'F', M, N, CZERO, CZERO, U, LDU ) CALL CLACP2( 'F', N, N, RWORK( IRU ), N, U, LDU ) @@ -1253,8 +1362,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by right singular vectors of A -* (CWorkspace: need 3*N, prefer 2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] * CALL CLACP2( 'F', N, N, RWORK( IRVT ), N, VT, LDVT ) CALL CUNMBR( 'P', 'R', 'C', N, N, N, A, LDA, @@ -1262,11 +1372,12 @@ $ LWORK-NWORK+1, IERR ) ELSE * +* Path 6a (M >= N, JOBZ='A') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * IRU = NRWORK IRVT = IRU + N*N @@ -1285,8 +1396,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of A -* (CWorkspace: need 2*N+M, prefer 2*N+M*NB) -* (RWorkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + M [work] +* CWorkspace: prefer 2*N [tauq, taup] + M*NB [work] +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] * CALL CLACP2( 'F', N, N, RWORK( IRU ), N, U, LDU ) CALL CUNMBR( 'Q', 'L', 'N', M, M, N, A, LDA, @@ -1295,8 +1407,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by right singular vectors of A -* (CWorkspace: need 3*N, prefer 2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] * CALL CLACP2( 'F', N, N, RWORK( IRVT ), N, VT, LDVT ) CALL CUNMBR( 'P', 'R', 'C', N, N, N, A, LDA, @@ -1316,15 +1429,16 @@ * IF( WNTQN ) THEN * -* Path 1t (N much larger than M, JOBZ='N') +* Path 1t (N >> M, JOBZ='N') * No singular vectors to be computed * ITAU = 1 NWORK = ITAU + M * * Compute A=L*Q -* (CWorkspace: need 2*M, prefer M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M [tau] + M [work] +* CWorkspace: prefer M [tau] + M*NB [work] +* RWorkspace: need 0 * CALL CGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) @@ -1339,8 +1453,9 @@ NWORK = ITAUP + M * * Bidiagonalize L in A -* (CWorkspace: need 3*M, prefer 2*M+2*M*NB) -* (RWorkspace: need M) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + 2*M*NB [work] +* RWorkspace: need M [e] * CALL CGEBRD( M, M, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -1348,15 +1463,15 @@ NRWORK = IE + M * * Perform bidiagonal SVD, compute singular values only -* (CWorkspace: 0) -* (RWorkspace: need BDSPAN) +* CWorkspace: need 0 +* RWorkspace: need M [e] + BDSPAC * - CALL SBDSDC( 'U', 'N', M, S, RWORK( IE ), DUM, 1, DUM, 1, + CALL SBDSDC( 'U', 'N', M, S, RWORK( IE ), DUM,1,DUM,1, $ DUM, IDUM, RWORK( NRWORK ), IWORK, INFO ) * ELSE IF( WNTQO ) THEN * -* Path 2t (N much larger than M, JOBZ='O') +* Path 2t (N >> M, JOBZ='O') * M right singular vectors to be overwritten on A and * M left singular vectors to be computed in U * @@ -1366,7 +1481,7 @@ * WORK(IVT) is M by M * IL = IVT + LDWKVT*M - IF( LWORK.GE.M*N+M*M+3*M ) THEN + IF( LWORK .GE. M*N + M*M + 3*M ) THEN * * WORK(IL) M by N * @@ -1377,14 +1492,15 @@ * WORK(IL) is M by CHUNK * LDWRKL = M - CHUNK = ( LWORK-M*M-3*M ) / M + CHUNK = ( LWORK - M*M - 3*M ) / M END IF ITAU = IL + LDWRKL*CHUNK NWORK = ITAU + M * * Compute A=L*Q -* (CWorkspace: need 2*M, prefer M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + M*M [L] + M [tau] + M [work] +* CWorkspace: prefer M*M [VT] + M*M [L] + M [tau] + M*NB [work] +* RWorkspace: need 0 * CALL CGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) @@ -1396,8 +1512,9 @@ $ WORK( IL+LDWRKL ), LDWRKL ) * * Generate Q in A -* (CWorkspace: need M*M+2*M, prefer M*M+M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + M*M [L] + M [tau] + M [work] +* CWorkspace: prefer M*M [VT] + M*M [L] + M [tau] + M*NB [work] +* RWorkspace: need 0 * CALL CUNGLQ( M, N, M, A, LDA, WORK( ITAU ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) @@ -1407,8 +1524,9 @@ NWORK = ITAUP + M * * Bidiagonalize L in WORK(IL) -* (CWorkspace: need M*M+3*M, prefer M*M+2*M+2*M*NB) -* (RWorkspace: need M) +* CWorkspace: need M*M [VT] + M*M [L] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [VT] + M*M [L] + 2*M [tauq, taup] + 2*M*NB [work] +* RWorkspace: need M [e] * CALL CGEBRD( M, M, WORK( IL ), LDWRKL, S, RWORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), @@ -1417,8 +1535,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RU] + M*M [RVT] + BDSPAC * IRU = IE + M IRVT = IRU + M*M @@ -1429,8 +1547,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix WORK(IU) * Overwrite WORK(IU) by the left singular vectors of L -* (CWorkspace: need N*N+3*N, prefer M*N+2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + M*M [L] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [VT] + M*M [L] + 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL CLACP2( 'F', M, M, RWORK( IRU ), M, U, LDU ) CALL CUNMBR( 'Q', 'L', 'N', M, M, M, WORK( IL ), LDWRKL, @@ -1439,8 +1558,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix WORK(IVT) * Overwrite WORK(IVT) by the right singular vectors of L -* (CWorkspace: need N*N+3*N, prefer M*N+2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + M*M [L] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [VT] + M*M [L] + 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL CLACP2( 'F', M, M, RWORK( IRVT ), M, WORK( IVT ), $ LDWKVT ) @@ -1450,8 +1570,9 @@ * * Multiply right singular vectors of L in WORK(IL) by Q * in A, storing result in WORK(IL) and copying to A -* (CWorkspace: need 2*M*M, prefer M*M+M*N)) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + M*M [L] +* CWorkspace: prefer M*M [VT] + M*N [L] +* RWorkspace: need 0 * DO 40 I = 1, N, CHUNK BLK = MIN( N-I+1, CHUNK ) @@ -1464,9 +1585,9 @@ * ELSE IF( WNTQS ) THEN * -* Path 3t (N much larger than M, JOBZ='S') -* M right singular vectors to be computed in VT and -* M left singular vectors to be computed in U +* Path 3t (N >> M, JOBZ='S') +* M right singular vectors to be computed in VT and +* M left singular vectors to be computed in U * IL = 1 * @@ -1477,8 +1598,9 @@ NWORK = ITAU + M * * Compute A=L*Q -* (CWorkspace: need 2*M, prefer M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [L] + M [tau] + M [work] +* CWorkspace: prefer M*M [L] + M [tau] + M*NB [work] +* RWorkspace: need 0 * CALL CGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) @@ -1490,8 +1612,9 @@ $ WORK( IL+LDWRKL ), LDWRKL ) * * Generate Q in A -* (CWorkspace: need M*M+2*M, prefer M*M+M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [L] + M [tau] + M [work] +* CWorkspace: prefer M*M [L] + M [tau] + M*NB [work] +* RWorkspace: need 0 * CALL CUNGLQ( M, N, M, A, LDA, WORK( ITAU ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) @@ -1501,8 +1624,9 @@ NWORK = ITAUP + M * * Bidiagonalize L in WORK(IL) -* (CWorkspace: need M*M+3*M, prefer M*M+2*M+2*M*NB) -* (RWorkspace: need M) +* CWorkspace: need M*M [L] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [L] + 2*M [tauq, taup] + 2*M*NB [work] +* RWorkspace: need M [e] * CALL CGEBRD( M, M, WORK( IL ), LDWRKL, S, RWORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), @@ -1511,8 +1635,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RU] + M*M [RVT] + BDSPAC * IRU = IE + M IRVT = IRU + M*M @@ -1523,8 +1647,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of L -* (CWorkspace: need M*M+3*M, prefer M*M+2*M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [L] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [L] + 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL CLACP2( 'F', M, M, RWORK( IRU ), M, U, LDU ) CALL CUNMBR( 'Q', 'L', 'N', M, M, M, WORK( IL ), LDWRKL, @@ -1533,8 +1658,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by left singular vectors of L -* (CWorkspace: need M*M+3*M, prefer M*M+2*M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [L] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [L] + 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL CLACP2( 'F', M, M, RWORK( IRVT ), M, VT, LDVT ) CALL CUNMBR( 'P', 'R', 'C', M, M, M, WORK( IL ), LDWRKL, @@ -1543,8 +1669,8 @@ * * Copy VT to WORK(IL), multiply right singular vectors of L * in WORK(IL) by Q in A, storing result in VT -* (CWorkspace: need M*M) -* (RWorkspace: 0) +* CWorkspace: need M*M [L] +* RWorkspace: need 0 * CALL CLACPY( 'F', M, M, VT, LDVT, WORK( IL ), LDWRKL ) CALL CGEMM( 'N', 'N', M, N, M, CONE, WORK( IL ), LDWRKL, @@ -1552,7 +1678,7 @@ * ELSE IF( WNTQA ) THEN * -* Path 9t (N much larger than M, JOBZ='A') +* Path 4t (N >> M, JOBZ='A') * N right singular vectors to be computed in VT and * M left singular vectors to be computed in U * @@ -1565,16 +1691,18 @@ NWORK = ITAU + M * * Compute A=L*Q, copying result to VT -* (CWorkspace: need 2*M, prefer M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + M [tau] + M [work] +* CWorkspace: prefer M*M [VT] + M [tau] + M*NB [work] +* RWorkspace: need 0 * CALL CGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) CALL CLACPY( 'U', M, N, A, LDA, VT, LDVT ) * * Generate Q in VT -* (CWorkspace: need M+N, prefer M+N*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + M [tau] + N [work] +* CWorkspace: prefer M*M [VT] + M [tau] + N*NB [work] +* RWorkspace: need 0 * CALL CUNGLQ( N, N, M, VT, LDVT, WORK( ITAU ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) @@ -1589,8 +1717,9 @@ NWORK = ITAUP + M * * Bidiagonalize L in A -* (CWorkspace: need M*M+3*M, prefer M*M+2*M+2*M*NB) -* (RWorkspace: need M) +* CWorkspace: need M*M [VT] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [VT] + 2*M [tauq, taup] + 2*M*NB [work] +* RWorkspace: need M [e] * CALL CGEBRD( M, M, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -1599,8 +1728,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RU] + M*M [RVT] + BDSPAC * IRU = IE + M IRVT = IRU + M*M @@ -1611,8 +1740,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of L -* (CWorkspace: need 3*M, prefer 2*M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [VT] + 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL CLACP2( 'F', M, M, RWORK( IRU ), M, U, LDU ) CALL CUNMBR( 'Q', 'L', 'N', M, M, M, A, LDA, @@ -1621,8 +1751,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix WORK(IVT) * Overwrite WORK(IVT) by right singular vectors of L -* (CWorkspace: need M*M+3*M, prefer M*M+2*M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [VT] + 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL CLACP2( 'F', M, M, RWORK( IRVT ), M, WORK( IVT ), $ LDWKVT ) @@ -1632,11 +1763,11 @@ * * Multiply right singular vectors of L in WORK(IVT) by * Q in VT, storing result in A -* (CWorkspace: need M*M) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] +* RWorkspace: need 0 * - CALL CGEMM( 'N', 'N', M, N, M, CONE, WORK( IVT ), - $ LDWKVT, VT, LDVT, CZERO, A, LDA ) + CALL CGEMM( 'N', 'N', M, N, M, CONE, WORK( IVT ), LDWKVT, + $ VT, LDVT, CZERO, A, LDA ) * * Copy right singular vectors of A from A to VT * @@ -1648,10 +1779,9 @@ * * MNTHR2 <= N < MNTHR1 * -* Path 5t (N much larger than M, but not as much as MNTHR1) +* Path 5t (N >> M, but not as much as MNTHR1) * Reduce to bidiagonal form without QR decomposition, use * CUNGBR and matrix multiplication to compute singular vectors -* * IE = 1 NRWORK = IE + M @@ -1660,8 +1790,9 @@ NWORK = ITAUP + M * * Bidiagonalize A -* (CWorkspace: need 2*M+N, prefer 2*M+(M+N)*NB) -* (RWorkspace: M) +* CWorkspace: need 2*M [tauq, taup] + N [work] +* CWorkspace: prefer 2*M [tauq, taup] + (M+N)*NB [work] +* RWorkspace: need M [e] * CALL CGEBRD( M, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -1669,11 +1800,12 @@ * IF( WNTQN ) THEN * +* Path 5tn (N >> M, JOBZ='N') * Compute singular values only -* (Cworkspace: 0) -* (Rworkspace: need BDSPAN) +* CWorkspace: need 0 +* RWorkspace: need M [e] + BDSPAC * - CALL SBDSDC( 'L', 'N', M, S, RWORK( IE ), DUM, 1, DUM, 1, + CALL SBDSDC( 'L', 'N', M, S, RWORK( IE ), DUM,1,DUM,1, $ DUM, IDUM, RWORK( NRWORK ), IWORK, INFO ) ELSE IF( WNTQO ) THEN IRVT = NRWORK @@ -1681,23 +1813,26 @@ NRWORK = IRU + M*M IVT = NWORK * +* Path 5to (N >> M, JOBZ='O') * Copy A to U, generate Q -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL CLACPY( 'L', M, M, A, LDA, U, LDU ) CALL CUNGBR( 'Q', M, M, N, U, LDU, WORK( ITAUQ ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Generate P**H in A -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL CUNGBR( 'P', M, N, M, A, LDA, WORK( ITAUP ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * LDWKVT = M - IF( LWORK.GE.M*N+3*M ) THEN + IF( LWORK .GE. M*N + 3*M ) THEN * * WORK( IVT ) is M by N * @@ -1707,15 +1842,15 @@ * * WORK( IVT ) is M by CHUNK * - CHUNK = ( LWORK-3*M ) / M + CHUNK = ( LWORK - 3*M ) / M NWORK = IVT + LDWKVT*CHUNK END IF * * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + BDSPAC * CALL SBDSDC( 'L', 'I', M, S, RWORK( IE ), RWORK( IRU ), $ M, RWORK( IRVT ), M, DUM, IDUM, @@ -1723,8 +1858,8 @@ * * Multiply Q in U by real matrix RWORK(IRVT) * storing the result in WORK(IVT), copying to U -* (Cworkspace: need 0) -* (Rworkspace: need 2*M*M) +* CWorkspace: need 2*M [tauq, taup] + M*M [VT] +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + 2*M*M [rwork] * CALL CLACRM( M, M, U, LDU, RWORK( IRU ), M, WORK( IVT ), $ LDWKVT, RWORK( NRWORK ) ) @@ -1732,8 +1867,10 @@ * * Multiply RWORK(IRVT) by P**H in A, storing the * result in WORK(IVT), copying to A -* (CWorkspace: need M*M, prefer M*N) -* (Rworkspace: need 2*M*M, prefer 2*M*N) +* CWorkspace: need 2*M [tauq, taup] + M*M [VT] +* CWorkspace: prefer 2*M [tauq, taup] + M*N [VT] +* RWorkspace: need M [e] + M*M [RVT] + 2*M*M [rwork] +* RWorkspace: prefer M [e] + M*M [RVT] + 2*M*N [rwork] < M + 5*M*M since N < 2*M here * NRWORK = IRU DO 50 I = 1, N, CHUNK @@ -1745,17 +1882,20 @@ 50 CONTINUE ELSE IF( WNTQS ) THEN * +* Path 5ts (N >> M, JOBZ='S') * Copy A to U, generate Q -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL CLACPY( 'L', M, M, A, LDA, U, LDU ) CALL CUNGBR( 'Q', M, M, N, U, LDU, WORK( ITAUQ ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Copy A to VT, generate P**H -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL CLACPY( 'U', M, N, A, LDA, VT, LDVT ) CALL CUNGBR( 'P', M, N, M, VT, LDVT, WORK( ITAUP ), @@ -1764,8 +1904,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + BDSPAC * IRVT = NRWORK IRU = IRVT + M*M @@ -1776,8 +1916,8 @@ * * Multiply Q in U by real matrix RWORK(IRU), storing the * result in A, copying to U -* (CWorkspace: need 0) -* (Rworkspace: need 3*M*M) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + 2*M*M [rwork] * CALL CLACRM( M, M, U, LDU, RWORK( IRU ), M, A, LDA, $ RWORK( NRWORK ) ) @@ -1785,8 +1925,8 @@ * * Multiply real matrix RWORK(IRVT) by P**H in VT, * storing the result in A, copying to VT -* (Cworkspace: need 0) -* (Rworkspace: need M*M+2*M*N) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + 2*M*N [rwork] < M + 5*M*M since N < 2*M here * NRWORK = IRU CALL CLARCM( M, N, RWORK( IRVT ), M, VT, LDVT, A, LDA, @@ -1794,17 +1934,20 @@ CALL CLACPY( 'F', M, N, A, LDA, VT, LDVT ) ELSE * +* Path 5ta (N >> M, JOBZ='A') * Copy A to U, generate Q -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL CLACPY( 'L', M, M, A, LDA, U, LDU ) CALL CUNGBR( 'Q', M, M, N, U, LDU, WORK( ITAUQ ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Copy A to VT, generate P**H -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*M [tauq, taup] + N [work] +* CWorkspace: prefer 2*M [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL CLACPY( 'U', M, N, A, LDA, VT, LDVT ) CALL CUNGBR( 'P', N, N, M, VT, LDVT, WORK( ITAUP ), @@ -1813,8 +1956,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + BDSPAC * IRVT = NRWORK IRU = IRVT + M*M @@ -1825,8 +1968,8 @@ * * Multiply Q in U by real matrix RWORK(IRU), storing the * result in A, copying to U -* (CWorkspace: need 0) -* (Rworkspace: need 3*M*M) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + 2*M*M [rwork] * CALL CLACRM( M, M, U, LDU, RWORK( IRU ), M, A, LDA, $ RWORK( NRWORK ) ) @@ -1834,9 +1977,10 @@ * * Multiply real matrix RWORK(IRVT) by P**H in VT, * storing the result in A, copying to VT -* (Cworkspace: need 0) -* (Rworkspace: need M*M+2*M*N) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + 2*M*N [rwork] < M + 5*M*M since N < 2*M here * + NRWORK = IRU CALL CLARCM( M, N, RWORK( IRVT ), M, VT, LDVT, A, LDA, $ RWORK( NRWORK ) ) CALL CLACPY( 'F', M, N, A, LDA, VT, LDVT ) @@ -1846,7 +1990,7 @@ * * N .LT. MNTHR2 * -* Path 6t (N greater than M, but not much larger) +* Path 6t (N > M, but not much larger) * Reduce to bidiagonal form without LQ decomposition * Use CUNMBR to compute singular vectors * @@ -1857,24 +2001,27 @@ NWORK = ITAUP + M * * Bidiagonalize A -* (CWorkspace: need 2*M+N, prefer 2*M+(M+N)*NB) -* (RWorkspace: M) +* CWorkspace: need 2*M [tauq, taup] + N [work] +* CWorkspace: prefer 2*M [tauq, taup] + (M+N)*NB [work] +* RWorkspace: need M [e] * CALL CGEBRD( M, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, $ IERR ) IF( WNTQN ) THEN * +* Path 6tn (N > M, JOBZ='N') * Compute singular values only -* (Cworkspace: 0) -* (Rworkspace: need BDSPAN) +* CWorkspace: need 0 +* RWorkspace: need M [e] + BDSPAC * - CALL SBDSDC( 'L', 'N', M, S, RWORK( IE ), DUM, 1, DUM, 1, + CALL SBDSDC( 'L', 'N', M, S, RWORK( IE ), DUM,1,DUM,1, $ DUM, IDUM, RWORK( NRWORK ), IWORK, INFO ) ELSE IF( WNTQO ) THEN +* Path 6to (N > M, JOBZ='O') LDWKVT = M IVT = NWORK - IF( LWORK.GE.M*N+3*M ) THEN + IF( LWORK .GE. M*N + 3*M ) THEN * * WORK( IVT ) is M by N * @@ -1885,15 +2032,15 @@ * * WORK( IVT ) is M by CHUNK * - CHUNK = ( LWORK-3*M ) / M + CHUNK = ( LWORK - 3*M ) / M NWORK = IVT + LDWKVT*CHUNK END IF * * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + BDSPAC * IRVT = NRWORK IRU = IRVT + M*M @@ -1904,21 +2051,24 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of A -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: need 0) +* CWorkspace: need 2*M [tauq, taup] + M*M [VT] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*M [VT] + M*NB [work] +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] * CALL CLACP2( 'F', M, M, RWORK( IRU ), M, U, LDU ) CALL CUNMBR( 'Q', 'L', 'N', M, M, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), $ LWORK-NWORK+1, IERR ) * - IF( LWORK.GE.M*N+3*M ) THEN + IF( LWORK .GE. M*N + 3*M ) THEN * -* Copy real matrix RWORK(IRVT) to complex matrix WORK(IVT) -* Overwrite WORK(IVT) by right singular vectors of A, -* copying to A -* (Cworkspace: need M*N+2*M, prefer M*N+M+M*NB) -* (Rworkspace: need 0) +* Path 6to-fast +* Copy real matrix RWORK(IRVT) to complex matrix WORK(IVT) +* Overwrite WORK(IVT) by right singular vectors of A, +* copying to A +* CWorkspace: need 2*M [tauq, taup] + M*N [VT] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*N [VT] + M*NB [work] +* RWorkspace: need M [e] + M*M [RVT] * CALL CLACP2( 'F', M, M, RWORK( IRVT ), M, WORK( IVT ), $ LDWKVT ) @@ -1928,17 +2078,21 @@ CALL CLACPY( 'F', M, N, WORK( IVT ), LDWKVT, A, LDA ) ELSE * +* Path 6to-slow * Generate P**H in A -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: need 0) +* CWorkspace: need 2*M [tauq, taup] + M*M [VT] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*M [VT] + M*NB [work] +* RWorkspace: need 0 * CALL CUNGBR( 'P', M, N, M, A, LDA, WORK( ITAUP ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Multiply Q in A by real matrix RWORK(IRU), storing the * result in WORK(IU), copying to A -* (CWorkspace: need M*M, prefer M*N) -* (Rworkspace: need 3*M*M, prefer M*M+2*M*N) +* CWorkspace: need 2*M [tauq, taup] + M*M [VT] +* CWorkspace: prefer 2*M [tauq, taup] + M*N [VT] +* RWorkspace: need M [e] + M*M [RVT] + 2*M*M [rwork] +* RWorkspace: prefer M [e] + M*M [RVT] + 2*M*N [rwork] < M + 5*M*M since N < 2*M here * NRWORK = IRU DO 60 I = 1, N, CHUNK @@ -1952,11 +2106,12 @@ END IF ELSE IF( WNTQS ) THEN * +* Path 6ts (N > M, JOBZ='S') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + BDSPAC * IRVT = NRWORK IRU = IRVT + M*M @@ -1967,8 +2122,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of A -* (CWorkspace: need 3*M, prefer 2*M+M*NB) -* (RWorkspace: M*M) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] * CALL CLACP2( 'F', M, M, RWORK( IRU ), M, U, LDU ) CALL CUNMBR( 'Q', 'L', 'N', M, M, N, A, LDA, @@ -1977,8 +2133,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by right singular vectors of A -* (CWorkspace: need 3*M, prefer 2*M+M*NB) -* (RWorkspace: M*M) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need M [e] + M*M [RVT] * CALL CLASET( 'F', M, N, CZERO, CZERO, VT, LDVT ) CALL CLACP2( 'F', M, M, RWORK( IRVT ), M, VT, LDVT ) @@ -1987,11 +2144,12 @@ $ LWORK-NWORK+1, IERR ) ELSE * +* Path 6ta (N > M, JOBZ='A') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + BDSPAC * IRVT = NRWORK IRU = IRVT + M*M @@ -2003,8 +2161,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of A -* (CWorkspace: need 3*M, prefer 2*M+M*NB) -* (RWorkspace: M*M) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] * CALL CLACP2( 'F', M, M, RWORK( IRU ), M, U, LDU ) CALL CUNMBR( 'Q', 'L', 'N', M, M, N, A, LDA, @@ -2017,8 +2176,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by right singular vectors of A -* (CWorkspace: need 2*M+N, prefer 2*M+N*NB) -* (RWorkspace: M*M) +* CWorkspace: need 2*M [tauq, taup] + N [work] +* CWorkspace: prefer 2*M [tauq, taup] + N*NB [work] +* RWorkspace: need M [e] + M*M [RVT] * CALL CLACP2( 'F', M, M, RWORK( IRVT ), M, VT, LDVT ) CALL CUNMBR( 'P', 'R', 'C', N, N, M, A, LDA, diff --git a/SRC/cgesvd.f b/SRC/cgesvd.f index 0435a73e..33d26b6c 100644 --- a/SRC/cgesvd.f +++ b/SRC/cgesvd.f @@ -322,23 +322,23 @@ MNTHR = ILAENV( 6, 'CGESVD', JOBU // JOBVT, M, N, 0, 0 ) * Compute space needed for CGEQRF CALL CGEQRF( M, N, A, LDA, CDUM(1), CDUM(1), -1, IERR ) - LWORK_CGEQRF=CDUM(1) + LWORK_CGEQRF = INT( CDUM(1) ) * Compute space needed for CUNGQR CALL CUNGQR( M, N, N, A, LDA, CDUM(1), CDUM(1), -1, IERR ) - LWORK_CUNGQR_N=CDUM(1) + LWORK_CUNGQR_N = INT( CDUM(1) ) CALL CUNGQR( M, M, N, A, LDA, CDUM(1), CDUM(1), -1, IERR ) - LWORK_CUNGQR_M=CDUM(1) + LWORK_CUNGQR_M = INT( CDUM(1) ) * Compute space needed for CGEBRD CALL CGEBRD( N, N, A, LDA, S, DUM(1), CDUM(1), $ CDUM(1), CDUM(1), -1, IERR ) - LWORK_CGEBRD=CDUM(1) + LWORK_CGEBRD = INT( CDUM(1) ) * Compute space needed for CUNGBR CALL CUNGBR( 'P', N, N, N, A, LDA, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_CUNGBR_P=CDUM(1) + LWORK_CUNGBR_P = INT( CDUM(1) ) CALL CUNGBR( 'Q', N, N, N, A, LDA, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_CUNGBR_Q=CDUM(1) + LWORK_CUNGBR_Q = INT( CDUM(1) ) * MNTHR = ILAENV( 6, 'CGESVD', JOBU // JOBVT, M, N, 0, 0 ) IF( M.GE.MNTHR ) THEN @@ -446,24 +446,24 @@ * CALL CGEBRD( M, N, A, LDA, S, DUM(1), CDUM(1), $ CDUM(1), CDUM(1), -1, IERR ) - LWORK_CGEBRD=CDUM(1) + LWORK_CGEBRD = INT( CDUM(1) ) MAXWRK = 2*N + LWORK_CGEBRD IF( WNTUS .OR. WNTUO ) THEN CALL CUNGBR( 'Q', M, N, N, A, LDA, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_CUNGBR_Q=CDUM(1) + LWORK_CUNGBR_Q = INT( CDUM(1) ) MAXWRK = MAX( MAXWRK, 2*N+LWORK_CUNGBR_Q ) END IF IF( WNTUA ) THEN CALL CUNGBR( 'Q', M, M, N, A, LDA, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_CUNGBR_Q=CDUM(1) + LWORK_CUNGBR_Q = INT( CDUM(1) ) MAXWRK = MAX( MAXWRK, 2*N+LWORK_CUNGBR_Q ) END IF IF( .NOT.WNTVN ) THEN MAXWRK = MAX( MAXWRK, 2*N+LWORK_CUNGBR_P ) - MINWRK = 2*N + M END IF + MINWRK = 2*N + M END IF ELSE IF( MINMN.GT.0 ) THEN * @@ -472,25 +472,25 @@ MNTHR = ILAENV( 6, 'CGESVD', JOBU // JOBVT, M, N, 0, 0 ) * Compute space needed for CGELQF CALL CGELQF( M, N, A, LDA, CDUM(1), CDUM(1), -1, IERR ) - LWORK_CGELQF=CDUM(1) + LWORK_CGELQF = INT( CDUM(1) ) * Compute space needed for CUNGLQ CALL CUNGLQ( N, N, M, CDUM(1), N, CDUM(1), CDUM(1), -1, $ IERR ) - LWORK_CUNGLQ_N=CDUM(1) + LWORK_CUNGLQ_N = INT( CDUM(1) ) CALL CUNGLQ( M, N, M, A, LDA, CDUM(1), CDUM(1), -1, IERR ) - LWORK_CUNGLQ_M=CDUM(1) + LWORK_CUNGLQ_M = INT( CDUM(1) ) * Compute space needed for CGEBRD CALL CGEBRD( M, M, A, LDA, S, DUM(1), CDUM(1), $ CDUM(1), CDUM(1), -1, IERR ) - LWORK_CGEBRD=CDUM(1) + LWORK_CGEBRD = INT( CDUM(1) ) * Compute space needed for CUNGBR P CALL CUNGBR( 'P', M, M, M, A, N, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_CUNGBR_P=CDUM(1) + LWORK_CUNGBR_P = INT( CDUM(1) ) * Compute space needed for CUNGBR Q CALL CUNGBR( 'Q', M, M, M, A, N, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_CUNGBR_Q=CDUM(1) + LWORK_CUNGBR_Q = INT( CDUM(1) ) IF( N.GE.MNTHR ) THEN IF( WNTVN ) THEN * @@ -596,25 +596,25 @@ * CALL CGEBRD( M, N, A, LDA, S, DUM(1), CDUM(1), $ CDUM(1), CDUM(1), -1, IERR ) - LWORK_CGEBRD=CDUM(1) + LWORK_CGEBRD = INT( CDUM(1) ) MAXWRK = 2*M + LWORK_CGEBRD IF( WNTVS .OR. WNTVO ) THEN * Compute space needed for CUNGBR P CALL CUNGBR( 'P', M, N, M, A, N, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_CUNGBR_P=CDUM(1) + LWORK_CUNGBR_P = INT( CDUM(1) ) MAXWRK = MAX( MAXWRK, 2*M+LWORK_CUNGBR_P ) END IF IF( WNTVA ) THEN CALL CUNGBR( 'P', N, N, M, A, N, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_CUNGBR_P=CDUM(1) + LWORK_CUNGBR_P = INT( CDUM(1) ) MAXWRK = MAX( MAXWRK, 2*M+LWORK_CUNGBR_P ) END IF IF( .NOT.WNTUN ) THEN MAXWRK = MAX( MAXWRK, 2*M+LWORK_CUNGBR_Q ) - MINWRK = 2*M + N END IF + MINWRK = 2*M + N END IF END IF MAXWRK = MAX( MINWRK, MAXWRK ) diff --git a/SRC/dbdsdc.f b/SRC/dbdsdc.f index 2c572f12..3cb084c7 100644 --- a/SRC/dbdsdc.f +++ b/SRC/dbdsdc.f @@ -311,7 +311,7 @@ WSTART = 1 QSTART = 3 IF( ICOMPQ.EQ.1 ) THEN - CALL DCOPY( N, D, 1, Q( 1 ), 1 ) + CALL DCOPY( N, D, 1, Q( 1 ), 1 ) CALL DCOPY( N-1, E, 1, Q( N+1 ), 1 ) END IF IF( IUPLO.EQ.2 ) THEN @@ -335,8 +335,11 @@ * If ICOMPQ = 0, use DLASDQ to compute the singular values. * IF( ICOMPQ.EQ.0 ) THEN +* Ignore WSTART, instead using WORK( 1 ), since the two vectors +* for CS and -SN above are added only if ICOMPQ == 2, +* and adding them exceeds documented WORK size of 4*n. CALL DLASDQ( 'U', 0, N, 0, 0, 0, D, E, VT, LDVT, U, LDU, U, - $ LDU, WORK( WSTART ), INFO ) + $ LDU, WORK( 1 ), INFO ) GO TO 40 END IF * @@ -412,24 +415,24 @@ DO 30 I = 1, NM1 IF( ( ABS( E( I ) ).LT.EPS ) .OR. ( I.EQ.NM1 ) ) THEN * -* Subproblem found. First determine its size and then -* apply divide and conquer on it. +* Subproblem found. First determine its size and then +* apply divide and conquer on it. * IF( I.LT.NM1 ) THEN * -* A subproblem with E(I) small for I < NM1. +* A subproblem with E(I) small for I < NM1. * NSIZE = I - START + 1 ELSE IF( ABS( E( I ) ).GE.EPS ) THEN * -* A subproblem with E(NM1) not too small but I = NM1. +* A subproblem with E(NM1) not too small but I = NM1. * NSIZE = N - START + 1 ELSE * -* A subproblem with E(NM1) small. This implies an -* 1-by-1 subproblem at D(N). Solve this 1-by-1 problem -* first. +* A subproblem with E(NM1) small. This implies an +* 1-by-1 subproblem at D(N). Solve this 1-by-1 problem +* first. * NSIZE = I - START + 1 IF( ICOMPQ.EQ.2 ) THEN diff --git a/SRC/dgesdd.f b/SRC/dgesdd.f index 54e2652e..4bdc8a64 100644 --- a/SRC/dgesdd.f +++ b/SRC/dgesdd.f @@ -18,8 +18,8 @@ * Definition: * =========== * -* SUBROUTINE DGESDD( JOBZ, M, N, A, LDA, S, U, LDU, VT, LDVT, WORK, -* LWORK, IWORK, INFO ) +* SUBROUTINE DGESDD( JOBZ, M, N, A, LDA, S, U, LDU, VT, LDVT, +* WORK, LWORK, IWORK, INFO ) * * .. Scalar Arguments .. * CHARACTER JOBZ @@ -154,8 +154,8 @@ *> \param[in] LDVT *> \verbatim *> LDVT is INTEGER -*> The leading dimension of the array VT. LDVT >= 1; if -*> JOBZ = 'A' or JOBZ = 'O' and M >= N, LDVT >= N; +*> The leading dimension of the array VT. LDVT >= 1; +*> if JOBZ = 'A' or JOBZ = 'O' and M >= N, LDVT >= N; *> if JOBZ = 'S', LDVT >= min(M,N). *> \endverbatim *> @@ -169,16 +169,18 @@ *> \verbatim *> LWORK is INTEGER *> The dimension of the array WORK. LWORK >= 1. -*> If JOBZ = 'N', -*> LWORK >= 3*min(M,N) + max(max(M,N),7*min(M,N)). -*> If JOBZ = 'O', -*> LWORK >= 3*min(M,N) + -*> max(max(M,N),5*min(M,N)*min(M,N)+4*min(M,N)). -*> If JOBZ = 'S' or 'A' -*> LWORK >= min(M,N)*(7+4*min(M,N)) -*> For good performance, LWORK should generally be larger. -*> If LWORK = -1 but other input arguments are legal, WORK(1) -*> returns the optimal LWORK. +*> If LWORK = -1, a workspace query is assumed. The optimal +*> size for the WORK array is calculated and stored in WORK(1), +*> and no other work except argument checking is performed. +*> +*> Let mx = max(M,N) and mn = min(M,N). +*> If JOBZ = 'N', LWORK >= 3*mn + max( mx, 7*mn ). +*> If JOBZ = 'O', LWORK >= 3*mn + max( mx, 5*mn*mn + 4*mn ). +*> If JOBZ = 'S', LWORK >= 4*mn*mn + 7*mn. +*> If JOBZ = 'A', LWORK >= 4*mn*mn + 6*mn + mx. +*> These are not tight minimums in all cases; see comments inside code. +*> For good performance, LWORK should generally be larger; +*> a query is recommended. *> \endverbatim *> *> \param[out] IWORK @@ -212,9 +214,11 @@ *> Ming Gu and Huan Ren, Computer Science Division, University of *> California at Berkeley, USA *> +*> @precisions fortran d -> s * ===================================================================== - SUBROUTINE DGESDD( JOBZ, M, N, A, LDA, S, U, LDU, VT, LDVT, WORK, - $ LWORK, IWORK, INFO ) + SUBROUTINE DGESDD( JOBZ, M, N, A, LDA, S, U, LDU, VT, LDVT, + $ WORK, LWORK, IWORK, INFO ) + implicit none * * -- LAPACK driver routine (version 3.6.0) -- * -- LAPACK is a software package provided by Univ. of Tennessee, -- @@ -243,6 +247,15 @@ $ IR, ISCL, ITAU, ITAUP, ITAUQ, IU, IVT, LDWKVT, $ LDWRKL, LDWRKR, LDWRKU, MAXWRK, MINMN, MINWRK, $ MNTHR, NWORK, WRKBL + INTEGER LWORK_DGEBRD_MN, LWORK_DGEBRD_MM, + $ LWORK_DGEBRD_NN, LWORK_DGELQF_MN, + $ LWORK_DGEQRF_MN, + $ LWORK_DORGBR_P_MM, LWORK_DORGBR_Q_NN, + $ LWORK_DORGLQ_MN, LWORK_DORGLQ_NN, + $ LWORK_DORGQR_MM, LWORK_DORGQR_MN, + $ LWORK_DORMBR_PRT_MM, LWORK_DORMBR_QLN_MM, + $ LWORK_DORMBR_PRT_MN, LWORK_DORMBR_QLN_MN, + $ LWORK_DORMBR_PRT_NN, LWORK_DORMBR_QLN_NN DOUBLE PRECISION ANRM, BIGNUM, EPS, SMLNUM * .. * .. Local Arrays .. @@ -256,9 +269,8 @@ * .. * .. External Functions .. LOGICAL LSAME - INTEGER ILAENV DOUBLE PRECISION DLAMCH, DLANGE - EXTERNAL DLAMCH, DLANGE, ILAENV, LSAME + EXTERNAL DLAMCH, DLANGE, LSAME * .. * .. Intrinsic Functions .. INTRINSIC INT, MAX, MIN, SQRT @@ -267,13 +279,13 @@ * * Test the input arguments * - INFO = 0 - MINMN = MIN( M, N ) - WNTQA = LSAME( JOBZ, 'A' ) - WNTQS = LSAME( JOBZ, 'S' ) + INFO = 0 + MINMN = MIN( M, N ) + WNTQA = LSAME( JOBZ, 'A' ) + WNTQS = LSAME( JOBZ, 'S' ) WNTQAS = WNTQA .OR. WNTQS - WNTQO = LSAME( JOBZ, 'O' ) - WNTQN = LSAME( JOBZ, 'N' ) + WNTQO = LSAME( JOBZ, 'O' ) + WNTQN = LSAME( JOBZ, 'N' ) LQUERY = ( LWORK.EQ.-1 ) * IF( .NOT.( WNTQA .OR. WNTQS .OR. WNTQO .OR. WNTQN ) ) THEN @@ -294,115 +306,140 @@ END IF * * Compute workspace -* (Note: Comments in the code beginning "Workspace:" describe the -* minimal amount of workspace needed at that point in the code, +* Note: Comments in the code beginning "Workspace:" describe the +* minimal amount of workspace allocated at that point in the code, * as well as the preferred amount for good performance. * NB refers to the optimal block size for the immediately -* following subroutine, as returned by ILAENV.) +* following subroutine, as returned by ILAENV. * IF( INFO.EQ.0 ) THEN MINWRK = 1 MAXWRK = 1 + BDSPAC = 0 + MNTHR = INT( MINMN*11.0D0 / 6.0D0 ) IF( M.GE.N .AND. MINMN.GT.0 ) THEN * * Compute space needed for DBDSDC * - MNTHR = INT( MINMN*11.0D0 / 6.0D0 ) IF( WNTQN ) THEN +* dbdsdc needs only 4*N (or 6*N for uplo=L for LAPACK <= 3.6) +* keep 7*N for backwards compatability. BDSPAC = 7*N ELSE BDSPAC = 3*N*N + 4*N END IF +* +* Compute space preferred for each routine + CALL DGEBRD( M, N, DUM(1), M, DUM(1), DUM(1), DUM(1), + $ DUM(1), DUM(1), -1, IERR ) + LWORK_DGEBRD_MN = INT( DUM(1) ) +* + CALL DGEBRD( N, N, DUM(1), N, DUM(1), DUM(1), DUM(1), + $ DUM(1), DUM(1), -1, IERR ) + LWORK_DGEBRD_NN = INT( DUM(1) ) +* + CALL DGEQRF( M, N, DUM(1), M, DUM(1), DUM(1), -1, IERR ) + LWORK_DGEQRF_MN = INT( DUM(1) ) +* + CALL DORGBR( 'Q', N, N, N, DUM(1), N, DUM(1), DUM(1), -1, + $ IERR ) + LWORK_DORGBR_Q_NN = INT( DUM(1) ) +* + CALL DORGQR( M, M, N, DUM(1), M, DUM(1), DUM(1), -1, IERR ) + LWORK_DORGQR_MM = INT( DUM(1) ) +* + CALL DORGQR( M, N, N, DUM(1), M, DUM(1), DUM(1), -1, IERR ) + LWORK_DORGQR_MN = INT( DUM(1) ) +* + CALL DORMBR( 'P', 'R', 'T', N, N, N, DUM(1), N, + $ DUM(1), DUM(1), N, DUM(1), -1, IERR ) + LWORK_DORMBR_PRT_NN = INT( DUM(1) ) +* + CALL DORMBR( 'Q', 'L', 'N', N, N, N, DUM(1), N, + $ DUM(1), DUM(1), N, DUM(1), -1, IERR ) + LWORK_DORMBR_QLN_NN = INT( DUM(1) ) +* + CALL DORMBR( 'Q', 'L', 'N', M, N, N, DUM(1), M, + $ DUM(1), DUM(1), M, DUM(1), -1, IERR ) + LWORK_DORMBR_QLN_MN = INT( DUM(1) ) +* + CALL DORMBR( 'Q', 'L', 'N', M, M, N, DUM(1), M, + $ DUM(1), DUM(1), M, DUM(1), -1, IERR ) + LWORK_DORMBR_QLN_MM = INT( DUM(1) ) +* IF( M.GE.MNTHR ) THEN IF( WNTQN ) THEN * -* Path 1 (M much larger than N, JOBZ='N') +* Path 1 (M >> N, JOBZ='N') * - WRKBL = N + N*ILAENV( 1, 'DGEQRF', ' ', M, N, -1, - $ -1 ) - WRKBL = MAX( WRKBL, 3*N+2*N* - $ ILAENV( 1, 'DGEBRD', ' ', N, N, -1, -1 ) ) - MAXWRK = MAX( WRKBL, BDSPAC+N ) + WRKBL = N + LWORK_DGEQRF_MN + WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD_NN ) + MAXWRK = MAX( WRKBL, BDSPAC + N ) MINWRK = BDSPAC + N ELSE IF( WNTQO ) THEN * -* Path 2 (M much larger than N, JOBZ='O') -* - WRKBL = N + N*ILAENV( 1, 'DGEQRF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, N+N*ILAENV( 1, 'DORGQR', ' ', M, - $ N, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+2*N* - $ ILAENV( 1, 'DGEBRD', ' ', N, N, -1, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'DORMBR', 'QLN', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'DORMBR', 'PRT', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*N ) +* Path 2 (M >> N, JOBZ='O') +* + WRKBL = N + LWORK_DGEQRF_MN + WRKBL = MAX( WRKBL, N + LWORK_DORGQR_MN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD_NN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORMBR_QLN_NN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORMBR_PRT_NN ) + WRKBL = MAX( WRKBL, 3*N + BDSPAC ) MAXWRK = WRKBL + 2*N*N MINWRK = BDSPAC + 2*N*N + 3*N ELSE IF( WNTQS ) THEN * -* Path 3 (M much larger than N, JOBZ='S') -* - WRKBL = N + N*ILAENV( 1, 'DGEQRF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, N+N*ILAENV( 1, 'DORGQR', ' ', M, - $ N, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+2*N* - $ ILAENV( 1, 'DGEBRD', ' ', N, N, -1, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'DORMBR', 'QLN', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'DORMBR', 'PRT', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*N ) +* Path 3 (M >> N, JOBZ='S') +* + WRKBL = N + LWORK_DGEQRF_MN + WRKBL = MAX( WRKBL, N + LWORK_DORGQR_MN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD_NN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORMBR_QLN_NN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORMBR_PRT_NN ) + WRKBL = MAX( WRKBL, 3*N + BDSPAC ) MAXWRK = WRKBL + N*N MINWRK = BDSPAC + N*N + 3*N ELSE IF( WNTQA ) THEN * -* Path 4 (M much larger than N, JOBZ='A') -* - WRKBL = N + N*ILAENV( 1, 'DGEQRF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, N+M*ILAENV( 1, 'DORGQR', ' ', M, - $ M, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+2*N* - $ ILAENV( 1, 'DGEBRD', ' ', N, N, -1, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'DORMBR', 'QLN', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'DORMBR', 'PRT', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*N ) +* Path 4 (M >> N, JOBZ='A') +* + WRKBL = N + LWORK_DGEQRF_MN + WRKBL = MAX( WRKBL, N + LWORK_DORGQR_MM ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD_NN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORMBR_QLN_NN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORMBR_PRT_NN ) + WRKBL = MAX( WRKBL, 3*N + BDSPAC ) MAXWRK = WRKBL + N*N - MINWRK = BDSPAC + N*N + 2*N + M + MINWRK = N*N + MAX( 3*N + BDSPAC, N + M ) END IF ELSE * -* Path 5 (M at least N, but not much larger) +* Path 5 (M >= N, but not much larger) * - WRKBL = 3*N + ( M+N )*ILAENV( 1, 'DGEBRD', ' ', M, N, -1, - $ -1 ) + WRKBL = 3*N + LWORK_DGEBRD_MN IF( WNTQN ) THEN - MAXWRK = MAX( WRKBL, BDSPAC+3*N ) +* Path 5n (M >= N, jobz='N') + MAXWRK = MAX( WRKBL, 3*N + BDSPAC ) MINWRK = 3*N + MAX( M, BDSPAC ) ELSE IF( WNTQO ) THEN - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'DORMBR', 'QLN', M, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'DORMBR', 'PRT', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*N ) +* Path 5o (M >= N, jobz='O') + WRKBL = MAX( WRKBL, 3*N + LWORK_DORMBR_PRT_NN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORMBR_QLN_MN ) + WRKBL = MAX( WRKBL, 3*N + BDSPAC ) MAXWRK = WRKBL + M*N - MINWRK = 3*N + MAX( M, N*N+BDSPAC ) + MINWRK = 3*N + MAX( M, N*N + BDSPAC ) ELSE IF( WNTQS ) THEN - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'DORMBR', 'QLN', M, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'DORMBR', 'PRT', N, N, N, -1 ) ) - MAXWRK = MAX( WRKBL, BDSPAC+3*N ) +* Path 5s (M >= N, jobz='S') + WRKBL = MAX( WRKBL, 3*N + LWORK_DORMBR_QLN_MN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORMBR_PRT_NN ) + MAXWRK = MAX( WRKBL, 3*N + BDSPAC ) MINWRK = 3*N + MAX( M, BDSPAC ) ELSE IF( WNTQA ) THEN - WRKBL = MAX( WRKBL, 3*N+M* - $ ILAENV( 1, 'DORMBR', 'QLN', M, M, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'DORMBR', 'PRT', N, N, N, -1 ) ) - MAXWRK = MAX( MAXWRK, BDSPAC+3*N ) +* Path 5a (M >= N, jobz='A') + WRKBL = MAX( WRKBL, 3*N + LWORK_DORMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORMBR_PRT_NN ) + MAXWRK = MAX( WRKBL, 3*N + BDSPAC ) MINWRK = 3*N + MAX( M, BDSPAC ) END IF END IF @@ -410,106 +447,129 @@ * * Compute space needed for DBDSDC * - MNTHR = INT( MINMN*11.0D0 / 6.0D0 ) IF( WNTQN ) THEN +* dbdsdc needs only 4*N (or 6*N for uplo=L for LAPACK <= 3.6) +* keep 7*N for backwards compatability. BDSPAC = 7*M ELSE BDSPAC = 3*M*M + 4*M END IF +* +* Compute space preferred for each routine + CALL DGEBRD( M, N, DUM(1), M, DUM(1), DUM(1), DUM(1), + $ DUM(1), DUM(1), -1, IERR ) + LWORK_DGEBRD_MN = INT( DUM(1) ) +* + CALL DGEBRD( M, M, A, M, S, DUM(1), DUM(1), + $ DUM(1), DUM(1), -1, IERR ) + LWORK_DGEBRD_MM = INT( DUM(1) ) +* + CALL DGELQF( M, N, A, M, DUM(1), DUM(1), -1, IERR ) + LWORK_DGELQF_MN = INT( DUM(1) ) +* + CALL DORGLQ( N, N, M, DUM(1), N, DUM(1), DUM(1), -1, IERR ) + LWORK_DORGLQ_NN = INT( DUM(1) ) +* + CALL DORGLQ( M, N, M, A, M, DUM(1), DUM(1), -1, IERR ) + LWORK_DORGLQ_MN = INT( DUM(1) ) +* + CALL DORGBR( 'P', M, M, M, A, N, DUM(1), DUM(1), -1, IERR ) + LWORK_DORGBR_P_MM = INT( DUM(1) ) +* + CALL DORMBR( 'P', 'R', 'T', M, M, M, DUM(1), M, + $ DUM(1), DUM(1), M, DUM(1), -1, IERR ) + LWORK_DORMBR_PRT_MM = INT( DUM(1) ) +* + CALL DORMBR( 'P', 'R', 'T', M, N, M, DUM(1), M, + $ DUM(1), DUM(1), M, DUM(1), -1, IERR ) + LWORK_DORMBR_PRT_MN = INT( DUM(1) ) +* + CALL DORMBR( 'P', 'R', 'T', N, N, M, DUM(1), N, + $ DUM(1), DUM(1), N, DUM(1), -1, IERR ) + LWORK_DORMBR_PRT_NN = INT( DUM(1) ) +* + CALL DORMBR( 'Q', 'L', 'N', M, M, M, DUM(1), M, + $ DUM(1), DUM(1), M, DUM(1), -1, IERR ) + LWORK_DORMBR_QLN_MM = INT( DUM(1) ) +* IF( N.GE.MNTHR ) THEN IF( WNTQN ) THEN * -* Path 1t (N much larger than M, JOBZ='N') +* Path 1t (N >> M, JOBZ='N') * - WRKBL = M + M*ILAENV( 1, 'DGELQF', ' ', M, N, -1, - $ -1 ) - WRKBL = MAX( WRKBL, 3*M+2*M* - $ ILAENV( 1, 'DGEBRD', ' ', M, M, -1, -1 ) ) - MAXWRK = MAX( WRKBL, BDSPAC+M ) + WRKBL = M + LWORK_DGELQF_MN + WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD_MM ) + MAXWRK = MAX( WRKBL, BDSPAC + M ) MINWRK = BDSPAC + M ELSE IF( WNTQO ) THEN * -* Path 2t (N much larger than M, JOBZ='O') -* - WRKBL = M + M*ILAENV( 1, 'DGELQF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, M+M*ILAENV( 1, 'DORGLQ', ' ', M, - $ N, M, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+2*M* - $ ILAENV( 1, 'DGEBRD', ' ', M, M, -1, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'DORMBR', 'QLN', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'DORMBR', 'PRT', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*M ) +* Path 2t (N >> M, JOBZ='O') +* + WRKBL = M + LWORK_DGELQF_MN + WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_MN ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORMBR_PRT_MM ) + WRKBL = MAX( WRKBL, 3*M + BDSPAC ) MAXWRK = WRKBL + 2*M*M MINWRK = BDSPAC + 2*M*M + 3*M ELSE IF( WNTQS ) THEN * -* Path 3t (N much larger than M, JOBZ='S') -* - WRKBL = M + M*ILAENV( 1, 'DGELQF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, M+M*ILAENV( 1, 'DORGLQ', ' ', M, - $ N, M, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+2*M* - $ ILAENV( 1, 'DGEBRD', ' ', M, M, -1, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'DORMBR', 'QLN', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'DORMBR', 'PRT', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*M ) +* Path 3t (N >> M, JOBZ='S') +* + WRKBL = M + LWORK_DGELQF_MN + WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_MN ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORMBR_PRT_MM ) + WRKBL = MAX( WRKBL, 3*M + BDSPAC ) MAXWRK = WRKBL + M*M MINWRK = BDSPAC + M*M + 3*M ELSE IF( WNTQA ) THEN * -* Path 4t (N much larger than M, JOBZ='A') -* - WRKBL = M + M*ILAENV( 1, 'DGELQF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, M+N*ILAENV( 1, 'DORGLQ', ' ', N, - $ N, M, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+2*M* - $ ILAENV( 1, 'DGEBRD', ' ', M, M, -1, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'DORMBR', 'QLN', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'DORMBR', 'PRT', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*M ) +* Path 4t (N >> M, JOBZ='A') +* + WRKBL = M + LWORK_DGELQF_MN + WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_NN ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORMBR_PRT_MM ) + WRKBL = MAX( WRKBL, 3*M + BDSPAC ) MAXWRK = WRKBL + M*M - MINWRK = BDSPAC + M*M + 3*M + MINWRK = M*M + MAX( 3*M + BDSPAC, M + N ) END IF ELSE * -* Path 5t (N greater than M, but not much larger) +* Path 5t (N > M, but not much larger) * - WRKBL = 3*M + ( M+N )*ILAENV( 1, 'DGEBRD', ' ', M, N, -1, - $ -1 ) + WRKBL = 3*M + LWORK_DGEBRD_MN IF( WNTQN ) THEN - MAXWRK = MAX( WRKBL, BDSPAC+3*M ) +* Path 5tn (N > M, jobz='N') + MAXWRK = MAX( WRKBL, 3*M + BDSPAC ) MINWRK = 3*M + MAX( N, BDSPAC ) ELSE IF( WNTQO ) THEN - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'DORMBR', 'QLN', M, M, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'DORMBR', 'PRT', M, N, M, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*M ) +* Path 5to (N > M, jobz='O') + WRKBL = MAX( WRKBL, 3*M + LWORK_DORMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORMBR_PRT_MN ) + WRKBL = MAX( WRKBL, 3*M + BDSPAC ) MAXWRK = WRKBL + M*N - MINWRK = 3*M + MAX( N, M*M+BDSPAC ) + MINWRK = 3*M + MAX( N, M*M + BDSPAC ) ELSE IF( WNTQS ) THEN - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'DORMBR', 'QLN', M, M, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'DORMBR', 'PRT', M, N, M, -1 ) ) - MAXWRK = MAX( WRKBL, BDSPAC+3*M ) +* Path 5ts (N > M, jobz='S') + WRKBL = MAX( WRKBL, 3*M + LWORK_DORMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORMBR_PRT_MN ) + MAXWRK = MAX( WRKBL, 3*M + BDSPAC ) MINWRK = 3*M + MAX( N, BDSPAC ) ELSE IF( WNTQA ) THEN - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'DORMBR', 'QLN', M, M, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'DORMBR', 'PRT', N, N, M, -1 ) ) - MAXWRK = MAX( WRKBL, BDSPAC+3*M ) +* Path 5ta (N > M, jobz='A') + WRKBL = MAX( WRKBL, 3*M + LWORK_DORMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORMBR_PRT_NN ) + MAXWRK = MAX( WRKBL, 3*M + BDSPAC ) MINWRK = 3*M + MAX( N, BDSPAC ) END IF END IF END IF + MAXWRK = MAX( MAXWRK, MINWRK ) WORK( 1 ) = MAXWRK * @@ -559,17 +619,18 @@ * IF( WNTQN ) THEN * -* Path 1 (M much larger than N, JOBZ='N') +* Path 1 (M >> N, JOBZ='N') * No singular vectors to be computed * ITAU = 1 NWORK = ITAU + N * * Compute A=Q*R -* (Workspace: need 2*N, prefer N+N*NB) +* Workspace: need N [tau] + N [work] +* Workspace: prefer N [tau] + N*NB [work] * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Zero out below R * @@ -580,7 +641,8 @@ NWORK = ITAUP + N * * Bidiagonalize R in A -* (Workspace: need 4*N, prefer 3*N+2*N*NB) +* Workspace: need 3*N [e, tauq, taup] + N [work] +* Workspace: prefer 3*N [e, tauq, taup] + 2*N*NB [work] * CALL DGEBRD( N, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -588,14 +650,14 @@ NWORK = IE + N * * Perform bidiagonal SVD, computing singular values only -* (Workspace: need N+BDSPAC) +* Workspace: need N [e] + BDSPAC * CALL DBDSDC( 'U', 'N', N, S, WORK( IE ), DUM, 1, DUM, 1, $ DUM, IDUM, WORK( NWORK ), IWORK, INFO ) * ELSE IF( WNTQO ) THEN * -* Path 2 (M much larger than N, JOBZ = 'O') +* Path 2 (M >> N, JOBZ = 'O') * N left singular vectors to be overwritten on A and * N right singular vectors to be computed in VT * @@ -603,42 +665,45 @@ * * WORK(IR) is LDWRKR by N * - IF( LWORK.GE.LDA*N+N*N+3*N+BDSPAC ) THEN + IF( LWORK .GE. LDA*N + N*N + 3*N + BDSPAC ) THEN LDWRKR = LDA ELSE - LDWRKR = ( LWORK-N*N-3*N-BDSPAC ) / N + LDWRKR = ( LWORK - N*N - 3*N - BDSPAC ) / N END IF ITAU = IR + LDWRKR*N NWORK = ITAU + N * * Compute A=Q*R -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* Workspace: need N*N [R] + N [tau] + N [work] +* Workspace: prefer N*N [R] + N [tau] + N*NB [work] * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Copy R to WORK(IR), zeroing out below it * CALL DLACPY( 'U', N, N, A, LDA, WORK( IR ), LDWRKR ) - CALL DLASET( 'L', N-1, N-1, ZERO, ZERO, WORK( IR+1 ), + CALL DLASET( 'L', N - 1, N - 1, ZERO, ZERO, WORK(IR+1), $ LDWRKR ) * * Generate Q in A -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* Workspace: need N*N [R] + N [tau] + N [work] +* Workspace: prefer N*N [R] + N [tau] + N*NB [work] * CALL DORGQR( M, N, N, A, LDA, WORK( ITAU ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) IE = ITAU ITAUQ = IE + N ITAUP = ITAUQ + N NWORK = ITAUP + N * -* Bidiagonalize R in VT, copying result to WORK(IR) -* (Workspace: need N*N+4*N, prefer N*N+3*N+2*N*NB) +* Bidiagonalize R in WORK(IR) +* Workspace: need N*N [R] + 3*N [e, tauq, taup] + N [work] +* Workspace: prefer N*N [R] + 3*N [e, tauq, taup] + 2*N*NB [work] * CALL DGEBRD( N, N, WORK( IR ), LDWRKR, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * WORK(IU) is N by N * @@ -648,7 +713,7 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in WORK(IU) and computing right * singular vectors of bidiagonal matrix in VT -* (Workspace: need N+N*N+BDSPAC) +* Workspace: need N*N [R] + 3*N [e, tauq, taup] + N*N [U] + BDSPAC * CALL DBDSDC( 'U', 'I', N, S, WORK( IE ), WORK( IU ), N, $ VT, LDVT, DUM, IDUM, WORK( NWORK ), IWORK, @@ -656,21 +721,23 @@ * * Overwrite WORK(IU) by left singular vectors of R * and VT by right singular vectors of R -* (Workspace: need 2*N*N+3*N, prefer 2*N*N+2*N+N*NB) +* Workspace: need N*N [R] + 3*N [e, tauq, taup] + N*N [U] + N [work] +* Workspace: prefer N*N [R] + 3*N [e, tauq, taup] + N*N [U] + N*NB [work] * CALL DORMBR( 'Q', 'L', 'N', N, N, N, WORK( IR ), LDWRKR, $ WORK( ITAUQ ), WORK( IU ), N, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL DORMBR( 'P', 'R', 'T', N, N, N, WORK( IR ), LDWRKR, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Multiply Q in A by left singular vectors of R in * WORK(IU), storing result in WORK(IR) and copying to A -* (Workspace: need 2*N*N, prefer N*N+M*N) +* Workspace: need N*N [R] + 3*N [e, tauq, taup] + N*N [U] +* Workspace: prefer M*N [R] + 3*N [e, tauq, taup] + N*N [U] * DO 10 I = 1, M, LDWRKR - CHUNK = MIN( M-I+1, LDWRKR ) + CHUNK = MIN( M - I + 1, LDWRKR ) CALL DGEMM( 'N', 'N', CHUNK, N, N, ONE, A( I, 1 ), $ LDA, WORK( IU ), N, ZERO, WORK( IR ), $ LDWRKR ) @@ -680,7 +747,7 @@ * ELSE IF( WNTQS ) THEN * -* Path 3 (M much larger than N, JOBZ='S') +* Path 3 (M >> N, JOBZ='S') * N left singular vectors to be computed in U and * N right singular vectors to be computed in VT * @@ -693,38 +760,41 @@ NWORK = ITAU + N * * Compute A=Q*R -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* Workspace: need N*N [R] + N [tau] + N [work] +* Workspace: prefer N*N [R] + N [tau] + N*NB [work] * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Copy R to WORK(IR), zeroing out below it * CALL DLACPY( 'U', N, N, A, LDA, WORK( IR ), LDWRKR ) - CALL DLASET( 'L', N-1, N-1, ZERO, ZERO, WORK( IR+1 ), + CALL DLASET( 'L', N - 1, N - 1, ZERO, ZERO, WORK(IR+1), $ LDWRKR ) * * Generate Q in A -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* Workspace: need N*N [R] + N [tau] + N [work] +* Workspace: prefer N*N [R] + N [tau] + N*NB [work] * CALL DORGQR( M, N, N, A, LDA, WORK( ITAU ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) IE = ITAU ITAUQ = IE + N ITAUP = ITAUQ + N NWORK = ITAUP + N * * Bidiagonalize R in WORK(IR) -* (Workspace: need N*N+4*N, prefer N*N+3*N+2*N*NB) +* Workspace: need N*N [R] + 3*N [e, tauq, taup] + N [work] +* Workspace: prefer N*N [R] + 3*N [e, tauq, taup] + 2*N*NB [work] * CALL DGEBRD( N, N, WORK( IR ), LDWRKR, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Perform bidiagonal SVD, computing left singular vectors * of bidiagoal matrix in U and computing right singular * vectors of bidiagonal matrix in VT -* (Workspace: need N+BDSPAC) +* Workspace: need N*N [R] + 3*N [e, tauq, taup] + BDSPAC * CALL DBDSDC( 'U', 'I', N, S, WORK( IE ), U, LDU, VT, $ LDVT, DUM, IDUM, WORK( NWORK ), IWORK, @@ -732,19 +802,20 @@ * * Overwrite U by left singular vectors of R and VT * by right singular vectors of R -* (Workspace: need N*N+3*N, prefer N*N+2*N+N*NB) +* Workspace: need N*N [R] + 3*N [e, tauq, taup] + N [work] +* Workspace: prefer N*N [R] + 3*N [e, tauq, taup] + N*NB [work] * CALL DORMBR( 'Q', 'L', 'N', N, N, N, WORK( IR ), LDWRKR, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * CALL DORMBR( 'P', 'R', 'T', N, N, N, WORK( IR ), LDWRKR, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Multiply Q in A by left singular vectors of R in * WORK(IR), storing result in U -* (Workspace: need N*N) +* Workspace: need N*N [R] * CALL DLACPY( 'F', N, N, U, LDU, WORK( IR ), LDWRKR ) CALL DGEMM( 'N', 'N', M, N, N, ONE, A, LDA, WORK( IR ), @@ -752,7 +823,7 @@ * ELSE IF( WNTQA ) THEN * -* Path 4 (M much larger than N, JOBZ='A') +* Path 4 (M >> N, JOBZ='A') * M left singular vectors to be computed in U and * N right singular vectors to be computed in VT * @@ -765,16 +836,18 @@ NWORK = ITAU + N * * Compute A=Q*R, copying result to U -* (Workspace: need N*N+N+M, prefer N*N+N+M*NB) +* Workspace: need N*N [U] + N [tau] + N [work] +* Workspace: prefer N*N [U] + N [tau] + N*NB [work] * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL DLACPY( 'L', M, N, A, LDA, U, LDU ) * * Generate Q in U -* (Workspace: need N*N+N+M, prefer N*N+N+M*NB) +* Workspace: need N*N [U] + N [tau] + M [work] +* Workspace: prefer N*N [U] + N [tau] + M*NB [work] CALL DORGQR( M, M, N, U, LDU, WORK( ITAU ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Produce R in A, zeroing out other entries * @@ -785,7 +858,8 @@ NWORK = ITAUP + N * * Bidiagonalize R in A -* (Workspace: need N*N+4*N, prefer N*N+3*N+2*N*NB) +* Workspace: need N*N [U] + 3*N [e, tauq, taup] + N [work] +* Workspace: prefer N*N [U] + 3*N [e, tauq, taup] + 2*N*NB [work] * CALL DGEBRD( N, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -794,7 +868,7 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in WORK(IU) and computing right * singular vectors of bidiagonal matrix in VT -* (Workspace: need N+N*N+BDSPAC) +* Workspace: need N*N [U] + 3*N [e, tauq, taup] + BDSPAC * CALL DBDSDC( 'U', 'I', N, S, WORK( IE ), WORK( IU ), N, $ VT, LDVT, DUM, IDUM, WORK( NWORK ), IWORK, @@ -802,18 +876,19 @@ * * Overwrite WORK(IU) by left singular vectors of R and VT * by right singular vectors of R -* (Workspace: need N*N+3*N, prefer N*N+2*N+N*NB) +* Workspace: need N*N [U] + 3*N [e, tauq, taup] + N [work] +* Workspace: prefer N*N [U] + 3*N [e, tauq, taup] + N*NB [work] * CALL DORMBR( 'Q', 'L', 'N', N, N, N, A, LDA, $ WORK( ITAUQ ), WORK( IU ), LDWRKU, - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) CALL DORMBR( 'P', 'R', 'T', N, N, N, A, LDA, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Multiply Q in U by left singular vectors of R in * WORK(IU), storing result in A -* (Workspace: need N*N) +* Workspace: need N*N [U] * CALL DGEMM( 'N', 'N', M, N, N, ONE, U, LDU, WORK( IU ), $ LDWRKU, ZERO, A, LDA ) @@ -828,7 +903,7 @@ * * M .LT. MNTHR * -* Path 5 (M at least N, but not much larger) +* Path 5 (M >= N, but not much larger) * Reduce to bidiagonal form without QR decomposition * IE = 1 @@ -837,21 +912,24 @@ NWORK = ITAUP + N * * Bidiagonalize A -* (Workspace: need 3*N+M, prefer 3*N+(M+N)*NB) +* Workspace: need 3*N [e, tauq, taup] + M [work] +* Workspace: prefer 3*N [e, tauq, taup] + (M+N)*NB [work] * CALL DGEBRD( M, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, $ IERR ) IF( WNTQN ) THEN * +* Path 5n (M >= N, JOBZ='N') * Perform bidiagonal SVD, only computing singular values -* (Workspace: need N+BDSPAC) +* Workspace: need 3*N [e, tauq, taup] + BDSPAC * CALL DBDSDC( 'U', 'N', N, S, WORK( IE ), DUM, 1, DUM, 1, $ DUM, IDUM, WORK( NWORK ), IWORK, INFO ) ELSE IF( WNTQO ) THEN +* Path 5o (M >= N, JOBZ='O') IU = NWORK - IF( LWORK.GE.M*N+3*N+BDSPAC ) THEN + IF( LWORK .GE. M*N + 3*N + BDSPAC ) THEN * * WORK( IU ) is M by N * @@ -859,6 +937,8 @@ NWORK = IU + LDWRKU*N CALL DLASET( 'F', M, N, ZERO, ZERO, WORK( IU ), $ LDWRKU ) +* IR is unused; silence compile warnings + IR = -1 ELSE * * WORK( IU ) is N by N @@ -869,53 +949,59 @@ * WORK(IR) is LDWRKR by N * IR = NWORK - LDWRKR = ( LWORK-N*N-3*N ) / N + LDWRKR = ( LWORK - N*N - 3*N ) / N END IF NWORK = IU + LDWRKU*N * * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in WORK(IU) and computing right * singular vectors of bidiagonal matrix in VT -* (Workspace: need N+N*N+BDSPAC) +* Workspace: need 3*N [e, tauq, taup] + N*N [U] + BDSPAC * CALL DBDSDC( 'U', 'I', N, S, WORK( IE ), WORK( IU ), $ LDWRKU, VT, LDVT, DUM, IDUM, WORK( NWORK ), $ IWORK, INFO ) * * Overwrite VT by right singular vectors of A -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* Workspace: need 3*N [e, tauq, taup] + N*N [U] + N [work] +* Workspace: prefer 3*N [e, tauq, taup] + N*N [U] + N*NB [work] * CALL DORMBR( 'P', 'R', 'T', N, N, N, A, LDA, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * - IF( LWORK.GE.M*N+3*N+BDSPAC ) THEN + IF( LWORK .GE. M*N + 3*N + BDSPAC ) THEN * +* Path 5o-fast * Overwrite WORK(IU) by left singular vectors of A -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* Workspace: need 3*N [e, tauq, taup] + M*N [U] + N [work] +* Workspace: prefer 3*N [e, tauq, taup] + M*N [U] + N*NB [work] * CALL DORMBR( 'Q', 'L', 'N', M, N, N, A, LDA, $ WORK( ITAUQ ), WORK( IU ), LDWRKU, - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Copy left singular vectors of A from WORK(IU) to A * CALL DLACPY( 'F', M, N, WORK( IU ), LDWRKU, A, LDA ) ELSE * +* Path 5o-slow * Generate Q in A -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* Workspace: need 3*N [e, tauq, taup] + N*N [U] + N [work] +* Workspace: prefer 3*N [e, tauq, taup] + N*N [U] + N*NB [work] * CALL DORGBR( 'Q', M, N, N, A, LDA, WORK( ITAUQ ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Multiply Q in A by left singular vectors of * bidiagonal matrix in WORK(IU), storing result in * WORK(IR) and copying to A -* (Workspace: need 2*N*N, prefer N*N+M*N) +* Workspace: need 3*N [e, tauq, taup] + N*N [U] + NB*N [R] +* Workspace: prefer 3*N [e, tauq, taup] + N*N [U] + M*N [R] * DO 20 I = 1, M, LDWRKR - CHUNK = MIN( M-I+1, LDWRKR ) + CHUNK = MIN( M - I + 1, LDWRKR ) CALL DGEMM( 'N', 'N', CHUNK, N, N, ONE, A( I, 1 ), $ LDA, WORK( IU ), LDWRKU, ZERO, $ WORK( IR ), LDWRKR ) @@ -926,10 +1012,11 @@ * ELSE IF( WNTQS ) THEN * +* Path 5s (M >= N, JOBZ='S') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U and computing right singular * vectors of bidiagonal matrix in VT -* (Workspace: need N+BDSPAC) +* Workspace: need 3*N [e, tauq, taup] + BDSPAC * CALL DLASET( 'F', M, N, ZERO, ZERO, U, LDU ) CALL DBDSDC( 'U', 'I', N, S, WORK( IE ), U, LDU, VT, @@ -938,20 +1025,22 @@ * * Overwrite U by left singular vectors of A and VT * by right singular vectors of A -* (Workspace: need 3*N, prefer 2*N+N*NB) +* Workspace: need 3*N [e, tauq, taup] + N [work] +* Workspace: prefer 3*N [e, tauq, taup] + N*NB [work] * CALL DORMBR( 'Q', 'L', 'N', M, N, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL DORMBR( 'P', 'R', 'T', N, N, N, A, LDA, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) ELSE IF( WNTQA ) THEN * +* Path 5a (M >= N, JOBZ='A') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U and computing right singular * vectors of bidiagonal matrix in VT -* (Workspace: need N+BDSPAC) +* Workspace: need 3*N [e, tauq, taup] + BDSPAC * CALL DLASET( 'F', M, M, ZERO, ZERO, U, LDU ) CALL DBDSDC( 'U', 'I', N, S, WORK( IE ), U, LDU, VT, @@ -961,20 +1050,21 @@ * Set the right corner of U to identity matrix * IF( M.GT.N ) THEN - CALL DLASET( 'F', M-N, M-N, ZERO, ONE, U( N+1, N+1 ), + CALL DLASET( 'F', M - N, M - N, ZERO, ONE, U(N+1,N+1), $ LDU ) END IF * * Overwrite U by left singular vectors of A and VT * by right singular vectors of A -* (Workspace: need N*N+2*N+M, prefer N*N+2*N+M*NB) +* Workspace: need 3*N [e, tauq, taup] + M [work] +* Workspace: prefer 3*N [e, tauq, taup] + M*NB [work] * CALL DORMBR( 'Q', 'L', 'N', M, M, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL DORMBR( 'P', 'R', 'T', N, N, M, A, LDA, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) END IF * END IF @@ -989,17 +1079,18 @@ * IF( WNTQN ) THEN * -* Path 1t (N much larger than M, JOBZ='N') +* Path 1t (N >> M, JOBZ='N') * No singular vectors to be computed * ITAU = 1 NWORK = ITAU + M * * Compute A=L*Q -* (Workspace: need 2*M, prefer M+M*NB) +* Workspace: need M [tau] + M [work] +* Workspace: prefer M [tau] + M*NB [work] * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Zero out above L * @@ -1010,7 +1101,8 @@ NWORK = ITAUP + M * * Bidiagonalize L in A -* (Workspace: need 4*M, prefer 3*M+2*M*NB) +* Workspace: need 3*M [e, tauq, taup] + M [work] +* Workspace: prefer 3*M [e, tauq, taup] + 2*M*NB [work] * CALL DGEBRD( M, M, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -1018,68 +1110,69 @@ NWORK = IE + M * * Perform bidiagonal SVD, computing singular values only -* (Workspace: need M+BDSPAC) +* Workspace: need M [e] + BDSPAC * CALL DBDSDC( 'U', 'N', M, S, WORK( IE ), DUM, 1, DUM, 1, $ DUM, IDUM, WORK( NWORK ), IWORK, INFO ) * ELSE IF( WNTQO ) THEN * -* Path 2t (N much larger than M, JOBZ='O') +* Path 2t (N >> M, JOBZ='O') * M right singular vectors to be overwritten on A and * M left singular vectors to be computed in U * IVT = 1 * -* IVT is M by M +* WORK(IVT) is M by M +* WORK(IL) is M by M; it is later resized to M by chunk for gemm * IL = IVT + M*M - IF( LWORK.GE.M*N+M*M+3*M+BDSPAC ) THEN -* -* WORK(IL) is M by N -* + IF( LWORK .GE. M*N + M*M + 3*M + BDSPAC ) THEN LDWRKL = M CHUNK = N ELSE LDWRKL = M - CHUNK = ( LWORK-M*M ) / M + CHUNK = ( LWORK - M*M ) / M END IF ITAU = IL + LDWRKL*M NWORK = ITAU + M * * Compute A=L*Q -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need M*M [VT] + M*M [L] + M [tau] + M [work] +* Workspace: prefer M*M [VT] + M*M [L] + M [tau] + M*NB [work] * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Copy L to WORK(IL), zeroing about above it * CALL DLACPY( 'L', M, M, A, LDA, WORK( IL ), LDWRKL ) - CALL DLASET( 'U', M-1, M-1, ZERO, ZERO, - $ WORK( IL+LDWRKL ), LDWRKL ) + CALL DLASET( 'U', M - 1, M - 1, ZERO, ZERO, + $ WORK( IL + LDWRKL ), LDWRKL ) * * Generate Q in A -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need M*M [VT] + M*M [L] + M [tau] + M [work] +* Workspace: prefer M*M [VT] + M*M [L] + M [tau] + M*NB [work] * CALL DORGLQ( M, N, M, A, LDA, WORK( ITAU ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) IE = ITAU ITAUQ = IE + M ITAUP = ITAUQ + M NWORK = ITAUP + M * * Bidiagonalize L in WORK(IL) -* (Workspace: need M*M+4*M, prefer M*M+3*M+2*M*NB) +* Workspace: need M*M [VT] + M*M [L] + 3*M [e, tauq, taup] + M [work] +* Workspace: prefer M*M [VT] + M*M [L] + 3*M [e, tauq, taup] + 2*M*NB [work] * CALL DGEBRD( M, M, WORK( IL ), LDWRKL, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U, and computing right singular * vectors of bidiagonal matrix in WORK(IVT) -* (Workspace: need M+M*M+BDSPAC) +* Workspace: need M*M [VT] + M*M [L] + 3*M [e, tauq, taup] + BDSPAC * CALL DBDSDC( 'U', 'I', M, S, WORK( IE ), U, LDU, $ WORK( IVT ), M, DUM, IDUM, WORK( NWORK ), @@ -1087,21 +1180,24 @@ * * Overwrite U by left singular vectors of L and WORK(IVT) * by right singular vectors of L -* (Workspace: need 2*M*M+3*M, prefer 2*M*M+2*M+M*NB) +* Workspace: need M*M [VT] + M*M [L] + 3*M [e, tauq, taup] + M [work] +* Workspace: prefer M*M [VT] + M*M [L] + 3*M [e, tauq, taup] + M*NB [work] * CALL DORMBR( 'Q', 'L', 'N', M, M, M, WORK( IL ), LDWRKL, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL DORMBR( 'P', 'R', 'T', M, M, M, WORK( IL ), LDWRKL, $ WORK( ITAUP ), WORK( IVT ), M, - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Multiply right singular vectors of L in WORK(IVT) by Q * in A, storing result in WORK(IL) and copying to A -* (Workspace: need 2*M*M, prefer M*M+M*N) +* Workspace: need M*M [VT] + M*M [L] +* Workspace: prefer M*M [VT] + M*N [L] +* At this point, L is resized as M by chunk. * DO 30 I = 1, N, CHUNK - BLK = MIN( N-I+1, CHUNK ) + BLK = MIN( N - I + 1, CHUNK ) CALL DGEMM( 'N', 'N', M, BLK, M, ONE, WORK( IVT ), M, $ A( 1, I ), LDA, ZERO, WORK( IL ), LDWRKL ) CALL DLACPY( 'F', M, BLK, WORK( IL ), LDWRKL, @@ -1110,7 +1206,7 @@ * ELSE IF( WNTQS ) THEN * -* Path 3t (N much larger than M, JOBZ='S') +* Path 3t (N >> M, JOBZ='S') * M right singular vectors to be computed in VT and * M left singular vectors to be computed in U * @@ -1123,38 +1219,41 @@ NWORK = ITAU + M * * Compute A=L*Q -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need M*M [L] + M [tau] + M [work] +* Workspace: prefer M*M [L] + M [tau] + M*NB [work] * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Copy L to WORK(IL), zeroing out above it * CALL DLACPY( 'L', M, M, A, LDA, WORK( IL ), LDWRKL ) - CALL DLASET( 'U', M-1, M-1, ZERO, ZERO, - $ WORK( IL+LDWRKL ), LDWRKL ) + CALL DLASET( 'U', M - 1, M - 1, ZERO, ZERO, + $ WORK( IL + LDWRKL ), LDWRKL ) * * Generate Q in A -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need M*M [L] + M [tau] + M [work] +* Workspace: prefer M*M [L] + M [tau] + M*NB [work] * CALL DORGLQ( M, N, M, A, LDA, WORK( ITAU ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) IE = ITAU ITAUQ = IE + M ITAUP = ITAUQ + M NWORK = ITAUP + M * -* Bidiagonalize L in WORK(IU), copying result to U -* (Workspace: need M*M+4*M, prefer M*M+3*M+2*M*NB) +* Bidiagonalize L in WORK(IU). +* Workspace: need M*M [L] + 3*M [e, tauq, taup] + M [work] +* Workspace: prefer M*M [L] + 3*M [e, tauq, taup] + 2*M*NB [work] * CALL DGEBRD( M, M, WORK( IL ), LDWRKL, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U and computing right singular * vectors of bidiagonal matrix in VT -* (Workspace: need M+BDSPAC) +* Workspace: need M*M [L] + 3*M [e, tauq, taup] + BDSPAC * CALL DBDSDC( 'U', 'I', M, S, WORK( IE ), U, LDU, VT, $ LDVT, DUM, IDUM, WORK( NWORK ), IWORK, @@ -1162,18 +1261,19 @@ * * Overwrite U by left singular vectors of L and VT * by right singular vectors of L -* (Workspace: need M*M+3*M, prefer M*M+2*M+M*NB) +* Workspace: need M*M [L] + 3*M [e, tauq, taup] + M [work] +* Workspace: prefer M*M [L] + 3*M [e, tauq, taup] + M*NB [work] * CALL DORMBR( 'Q', 'L', 'N', M, M, M, WORK( IL ), LDWRKL, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL DORMBR( 'P', 'R', 'T', M, M, M, WORK( IL ), LDWRKL, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Multiply right singular vectors of L in WORK(IL) by * Q in A, storing result in VT -* (Workspace: need M*M) +* Workspace: need M*M [L] * CALL DLACPY( 'F', M, M, VT, LDVT, WORK( IL ), LDWRKL ) CALL DGEMM( 'N', 'N', M, N, M, ONE, WORK( IL ), LDWRKL, @@ -1181,7 +1281,7 @@ * ELSE IF( WNTQA ) THEN * -* Path 4t (N much larger than M, JOBZ='A') +* Path 4t (N >> M, JOBZ='A') * N right singular vectors to be computed in VT and * M left singular vectors to be computed in U * @@ -1194,17 +1294,19 @@ NWORK = ITAU + M * * Compute A=L*Q, copying result to VT -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need M*M [VT] + M [tau] + M [work] +* Workspace: prefer M*M [VT] + M [tau] + M*NB [work] * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT ) * * Generate Q in VT -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need M*M [VT] + M [tau] + N [work] +* Workspace: prefer M*M [VT] + M [tau] + N*NB [work] * CALL DORGLQ( N, N, M, VT, LDVT, WORK( ITAU ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Produce L in A, zeroing out other entries * @@ -1215,7 +1317,8 @@ NWORK = ITAUP + M * * Bidiagonalize L in A -* (Workspace: need M*M+4*M, prefer M*M+3*M+2*M*NB) +* Workspace: need M*M [VT] + 3*M [e, tauq, taup] + M [work] +* Workspace: prefer M*M [VT] + 3*M [e, tauq, taup] + 2*M*NB [work] * CALL DGEBRD( M, M, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -1224,7 +1327,7 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U and computing right singular * vectors of bidiagonal matrix in WORK(IVT) -* (Workspace: need M+M*M+BDSPAC) +* Workspace: need M*M [VT] + 3*M [e, tauq, taup] + BDSPAC * CALL DBDSDC( 'U', 'I', M, S, WORK( IE ), U, LDU, $ WORK( IVT ), LDWKVT, DUM, IDUM, @@ -1232,18 +1335,19 @@ * * Overwrite U by left singular vectors of L and WORK(IVT) * by right singular vectors of L -* (Workspace: need M*M+3*M, prefer M*M+2*M+M*NB) +* Workspace: need M*M [VT] + 3*M [e, tauq, taup]+ M [work] +* Workspace: prefer M*M [VT] + 3*M [e, tauq, taup]+ M*NB [work] * CALL DORMBR( 'Q', 'L', 'N', M, M, M, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL DORMBR( 'P', 'R', 'T', M, M, M, A, LDA, $ WORK( ITAUP ), WORK( IVT ), LDWKVT, - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Multiply right singular vectors of L in WORK(IVT) by * Q in VT, storing result in A -* (Workspace: need M*M) +* Workspace: need M*M [VT] * CALL DGEMM( 'N', 'N', M, N, M, ONE, WORK( IVT ), LDWKVT, $ VT, LDVT, ZERO, A, LDA ) @@ -1258,7 +1362,7 @@ * * N .LT. MNTHR * -* Path 5t (N greater than M, but not much larger) +* Path 5t (N > M, but not much larger) * Reduce to bidiagonal form without LQ decomposition * IE = 1 @@ -1267,28 +1371,33 @@ NWORK = ITAUP + M * * Bidiagonalize A -* (Workspace: need 3*M+N, prefer 3*M+(M+N)*NB) +* Workspace: need 3*M [e, tauq, taup] + N [work] +* Workspace: prefer 3*M [e, tauq, taup] + (M+N)*NB [work] * CALL DGEBRD( M, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, $ IERR ) IF( WNTQN ) THEN * +* Path 5tn (N > M, JOBZ='N') * Perform bidiagonal SVD, only computing singular values -* (Workspace: need M+BDSPAC) +* Workspace: need 3*M [e, tauq, taup] + BDSPAC * CALL DBDSDC( 'L', 'N', M, S, WORK( IE ), DUM, 1, DUM, 1, $ DUM, IDUM, WORK( NWORK ), IWORK, INFO ) ELSE IF( WNTQO ) THEN +* Path 5to (N > M, JOBZ='O') LDWKVT = M IVT = NWORK - IF( LWORK.GE.M*N+3*M+BDSPAC ) THEN + IF( LWORK .GE. M*N + 3*M + BDSPAC ) THEN * * WORK( IVT ) is M by N * CALL DLASET( 'F', M, N, ZERO, ZERO, WORK( IVT ), $ LDWKVT ) NWORK = IVT + LDWKVT*N +* IL is unused; silence compile warnings + IL = -1 ELSE * * WORK( IVT ) is M by M @@ -1298,52 +1407,58 @@ * * WORK(IL) is M by CHUNK * - CHUNK = ( LWORK-M*M-3*M ) / M + CHUNK = ( LWORK - M*M - 3*M ) / M END IF * * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U and computing right singular * vectors of bidiagonal matrix in WORK(IVT) -* (Workspace: need M*M+BDSPAC) +* Workspace: need 3*M [e, tauq, taup] + M*M [VT] + BDSPAC * CALL DBDSDC( 'L', 'I', M, S, WORK( IE ), U, LDU, $ WORK( IVT ), LDWKVT, DUM, IDUM, $ WORK( NWORK ), IWORK, INFO ) * * Overwrite U by left singular vectors of A -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need 3*M [e, tauq, taup] + M*M [VT] + M [work] +* Workspace: prefer 3*M [e, tauq, taup] + M*M [VT] + M*NB [work] * CALL DORMBR( 'Q', 'L', 'N', M, M, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * - IF( LWORK.GE.M*N+3*M+BDSPAC ) THEN + IF( LWORK .GE. M*N + 3*M + BDSPAC ) THEN * +* Path 5to-fast * Overwrite WORK(IVT) by left singular vectors of A -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need 3*M [e, tauq, taup] + M*N [VT] + M [work] +* Workspace: prefer 3*M [e, tauq, taup] + M*N [VT] + M*NB [work] * CALL DORMBR( 'P', 'R', 'T', M, N, M, A, LDA, $ WORK( ITAUP ), WORK( IVT ), LDWKVT, - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Copy right singular vectors of A from WORK(IVT) to A * CALL DLACPY( 'F', M, N, WORK( IVT ), LDWKVT, A, LDA ) ELSE * +* Path 5to-slow * Generate P**T in A -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need 3*M [e, tauq, taup] + M*M [VT] + M [work] +* Workspace: prefer 3*M [e, tauq, taup] + M*M [VT] + M*NB [work] * CALL DORGBR( 'P', M, N, M, A, LDA, WORK( ITAUP ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Multiply Q in A by right singular vectors of * bidiagonal matrix in WORK(IVT), storing result in * WORK(IL) and copying to A -* (Workspace: need 2*M*M, prefer M*M+M*N) +* Workspace: need 3*M [e, tauq, taup] + M*M [VT] + M*NB [L] +* Workspace: prefer 3*M [e, tauq, taup] + M*M [VT] + M*N [L] * DO 40 I = 1, N, CHUNK - BLK = MIN( N-I+1, CHUNK ) + BLK = MIN( N - I + 1, CHUNK ) CALL DGEMM( 'N', 'N', M, BLK, M, ONE, WORK( IVT ), $ LDWKVT, A( 1, I ), LDA, ZERO, $ WORK( IL ), M ) @@ -1353,10 +1468,11 @@ END IF ELSE IF( WNTQS ) THEN * +* Path 5ts (N > M, JOBZ='S') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U and computing right singular * vectors of bidiagonal matrix in VT -* (Workspace: need M+BDSPAC) +* Workspace: need 3*M [e, tauq, taup] + BDSPAC * CALL DLASET( 'F', M, N, ZERO, ZERO, VT, LDVT ) CALL DBDSDC( 'L', 'I', M, S, WORK( IE ), U, LDU, VT, @@ -1365,20 +1481,22 @@ * * Overwrite U by left singular vectors of A and VT * by right singular vectors of A -* (Workspace: need 3*M, prefer 2*M+M*NB) +* Workspace: need 3*M [e, tauq, taup] + M [work] +* Workspace: prefer 3*M [e, tauq, taup] + M*NB [work] * CALL DORMBR( 'Q', 'L', 'N', M, M, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL DORMBR( 'P', 'R', 'T', M, N, M, A, LDA, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) ELSE IF( WNTQA ) THEN * +* Path 5ta (N > M, JOBZ='A') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U and computing right singular * vectors of bidiagonal matrix in VT -* (Workspace: need M+BDSPAC) +* Workspace: need 3*M [e, tauq, taup] + BDSPAC * CALL DLASET( 'F', N, N, ZERO, ZERO, VT, LDVT ) CALL DBDSDC( 'L', 'I', M, S, WORK( IE ), U, LDU, VT, @@ -1388,20 +1506,21 @@ * Set the right corner of VT to identity matrix * IF( N.GT.M ) THEN - CALL DLASET( 'F', N-M, N-M, ZERO, ONE, VT( M+1, M+1 ), + CALL DLASET( 'F', N-M, N-M, ZERO, ONE, VT(M+1,M+1), $ LDVT ) END IF * * Overwrite U by left singular vectors of A and VT * by right singular vectors of A -* (Workspace: need 2*M+N, prefer 2*M+N*NB) +* Workspace: need 3*M [e, tauq, taup] + N [work] +* Workspace: prefer 3*M [e, tauq, taup] + N*NB [work] * CALL DORMBR( 'Q', 'L', 'N', M, M, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL DORMBR( 'P', 'R', 'T', N, N, M, A, LDA, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) END IF * END IF diff --git a/SRC/dgesvd.f b/SRC/dgesvd.f index f3034042..f20350e5 100644 --- a/SRC/dgesvd.f +++ b/SRC/dgesvd.f @@ -175,7 +175,7 @@ *> LWORK >= MAX(1,5*MIN(M,N)) for the paths (see comments inside code): *> - PATH 1 (M much larger than N, JOBU='N') *> - PATH 1t (N much larger than M, JOBVT='N') -*> LWORK >= MAX(1,3*MIN(M,N)+MAX(M,N),5*MIN(M,N)) for the other paths +*> LWORK >= MAX(1,3*MIN(M,N) + MAX(M,N),5*MIN(M,N)) for the other paths *> For good performance, LWORK should generally be larger. *> *> If LWORK = -1, then a workspace query is assumed; the routine @@ -314,24 +314,24 @@ BDSPAC = 5*N * Compute space needed for DGEQRF CALL DGEQRF( M, N, A, LDA, DUM(1), DUM(1), -1, IERR ) - LWORK_DGEQRF=DUM(1) + LWORK_DGEQRF = INT( DUM(1) ) * Compute space needed for DORGQR CALL DORGQR( M, N, N, A, LDA, DUM(1), DUM(1), -1, IERR ) - LWORK_DORGQR_N=DUM(1) + LWORK_DORGQR_N = INT( DUM(1) ) CALL DORGQR( M, M, N, A, LDA, DUM(1), DUM(1), -1, IERR ) - LWORK_DORGQR_M=DUM(1) + LWORK_DORGQR_M = INT( DUM(1) ) * Compute space needed for DGEBRD CALL DGEBRD( N, N, A, LDA, S, DUM(1), DUM(1), $ DUM(1), DUM(1), -1, IERR ) - LWORK_DGEBRD=DUM(1) + LWORK_DGEBRD = INT( DUM(1) ) * Compute space needed for DORGBR P CALL DORGBR( 'P', N, N, N, A, LDA, DUM(1), $ DUM(1), -1, IERR ) - LWORK_DORGBR_P=DUM(1) + LWORK_DORGBR_P = INT( DUM(1) ) * Compute space needed for DORGBR Q CALL DORGBR( 'Q', N, N, N, A, LDA, DUM(1), $ DUM(1), -1, IERR ) - LWORK_DORGBR_Q=DUM(1) + LWORK_DORGBR_Q = INT( DUM(1) ) * IF( M.GE.MNTHR ) THEN IF( WNTUN ) THEN @@ -339,9 +339,9 @@ * Path 1 (M much larger than N, JOBU='N') * MAXWRK = N + LWORK_DGEQRF - MAXWRK = MAX( MAXWRK, 3*N+LWORK_DGEBRD ) + MAXWRK = MAX( MAXWRK, 3*N + LWORK_DGEBRD ) IF( WNTVO .OR. WNTVAS ) - $ MAXWRK = MAX( MAXWRK, 3*N+LWORK_DORGBR_P ) + $ MAXWRK = MAX( MAXWRK, 3*N + LWORK_DORGBR_P ) MAXWRK = MAX( MAXWRK, BDSPAC ) MINWRK = MAX( 4*N, BDSPAC ) ELSE IF( WNTUO .AND. WNTVN ) THEN @@ -349,97 +349,97 @@ * Path 2 (M much larger than N, JOBU='O', JOBVT='N') * WRKBL = N + LWORK_DGEQRF - WRKBL = MAX( WRKBL, N+LWORK_DORGQR_N ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DORGBR_Q ) + WRKBL = MAX( WRKBL, N + LWORK_DORGQR_N ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q ) WRKBL = MAX( WRKBL, BDSPAC ) - MAXWRK = MAX( N*N+WRKBL, N*N+M*N+N ) - MINWRK = MAX( 3*N+M, BDSPAC ) + MAXWRK = MAX( N*N + WRKBL, N*N + M*N + N ) + MINWRK = MAX( 3*N + M, BDSPAC ) ELSE IF( WNTUO .AND. WNTVAS ) THEN * * Path 3 (M much larger than N, JOBU='O', JOBVT='S' or * 'A') * WRKBL = N + LWORK_DGEQRF - WRKBL = MAX( WRKBL, N+LWORK_DORGQR_N ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DORGBR_Q ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DORGBR_P ) + WRKBL = MAX( WRKBL, N + LWORK_DORGQR_N ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_P ) WRKBL = MAX( WRKBL, BDSPAC ) - MAXWRK = MAX( N*N+WRKBL, N*N+M*N+N ) - MINWRK = MAX( 3*N+M, BDSPAC ) + MAXWRK = MAX( N*N + WRKBL, N*N + M*N + N ) + MINWRK = MAX( 3*N + M, BDSPAC ) ELSE IF( WNTUS .AND. WNTVN ) THEN * * Path 4 (M much larger than N, JOBU='S', JOBVT='N') * WRKBL = N + LWORK_DGEQRF - WRKBL = MAX( WRKBL, N+LWORK_DORGQR_N ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DORGBR_Q ) + WRKBL = MAX( WRKBL, N + LWORK_DORGQR_N ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q ) WRKBL = MAX( WRKBL, BDSPAC ) MAXWRK = N*N + WRKBL - MINWRK = MAX( 3*N+M, BDSPAC ) + MINWRK = MAX( 3*N + M, BDSPAC ) ELSE IF( WNTUS .AND. WNTVO ) THEN * * Path 5 (M much larger than N, JOBU='S', JOBVT='O') * WRKBL = N + LWORK_DGEQRF - WRKBL = MAX( WRKBL, N+LWORK_DORGQR_N ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DORGBR_Q ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DORGBR_P ) + WRKBL = MAX( WRKBL, N + LWORK_DORGQR_N ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_P ) WRKBL = MAX( WRKBL, BDSPAC ) MAXWRK = 2*N*N + WRKBL - MINWRK = MAX( 3*N+M, BDSPAC ) + MINWRK = MAX( 3*N + M, BDSPAC ) ELSE IF( WNTUS .AND. WNTVAS ) THEN * * Path 6 (M much larger than N, JOBU='S', JOBVT='S' or * 'A') * WRKBL = N + LWORK_DGEQRF - WRKBL = MAX( WRKBL, N+LWORK_DORGQR_N ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DORGBR_Q ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DORGBR_P ) + WRKBL = MAX( WRKBL, N + LWORK_DORGQR_N ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_P ) WRKBL = MAX( WRKBL, BDSPAC ) MAXWRK = N*N + WRKBL - MINWRK = MAX( 3*N+M, BDSPAC ) + MINWRK = MAX( 3*N + M, BDSPAC ) ELSE IF( WNTUA .AND. WNTVN ) THEN * * Path 7 (M much larger than N, JOBU='A', JOBVT='N') * WRKBL = N + LWORK_DGEQRF - WRKBL = MAX( WRKBL, N+LWORK_DORGQR_M ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DORGBR_Q ) + WRKBL = MAX( WRKBL, N + LWORK_DORGQR_M ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q ) WRKBL = MAX( WRKBL, BDSPAC ) MAXWRK = N*N + WRKBL - MINWRK = MAX( 3*N+M, BDSPAC ) + MINWRK = MAX( 3*N + M, BDSPAC ) ELSE IF( WNTUA .AND. WNTVO ) THEN * * Path 8 (M much larger than N, JOBU='A', JOBVT='O') * WRKBL = N + LWORK_DGEQRF - WRKBL = MAX( WRKBL, N+LWORK_DORGQR_M ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DORGBR_Q ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DORGBR_P ) + WRKBL = MAX( WRKBL, N + LWORK_DORGQR_M ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_P ) WRKBL = MAX( WRKBL, BDSPAC ) MAXWRK = 2*N*N + WRKBL - MINWRK = MAX( 3*N+M, BDSPAC ) + MINWRK = MAX( 3*N + M, BDSPAC ) ELSE IF( WNTUA .AND. WNTVAS ) THEN * * Path 9 (M much larger than N, JOBU='A', JOBVT='S' or * 'A') * WRKBL = N + LWORK_DGEQRF - WRKBL = MAX( WRKBL, N+LWORK_DORGQR_M ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DORGBR_Q ) - WRKBL = MAX( WRKBL, 3*N+LWORK_DORGBR_P ) + WRKBL = MAX( WRKBL, N + LWORK_DORGQR_M ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q ) + WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_P ) WRKBL = MAX( WRKBL, BDSPAC ) MAXWRK = N*N + WRKBL - MINWRK = MAX( 3*N+M, BDSPAC ) + MINWRK = MAX( 3*N + M, BDSPAC ) END IF ELSE * @@ -447,25 +447,25 @@ * CALL DGEBRD( M, N, A, LDA, S, DUM(1), DUM(1), $ DUM(1), DUM(1), -1, IERR ) - LWORK_DGEBRD=DUM(1) + LWORK_DGEBRD = INT( DUM(1) ) MAXWRK = 3*N + LWORK_DGEBRD IF( WNTUS .OR. WNTUO ) THEN CALL DORGBR( 'Q', M, N, N, A, LDA, DUM(1), $ DUM(1), -1, IERR ) - LWORK_DORGBR_Q=DUM(1) - MAXWRK = MAX( MAXWRK, 3*N+LWORK_DORGBR_Q ) + LWORK_DORGBR_Q = INT( DUM(1) ) + MAXWRK = MAX( MAXWRK, 3*N + LWORK_DORGBR_Q ) END IF IF( WNTUA ) THEN CALL DORGBR( 'Q', M, M, N, A, LDA, DUM(1), $ DUM(1), -1, IERR ) - LWORK_DORGBR_Q=DUM(1) - MAXWRK = MAX( MAXWRK, 3*N+LWORK_DORGBR_Q ) + LWORK_DORGBR_Q = INT( DUM(1) ) + MAXWRK = MAX( MAXWRK, 3*N + LWORK_DORGBR_Q ) END IF IF( .NOT.WNTVN ) THEN - MAXWRK = MAX( MAXWRK, 3*N+LWORK_DORGBR_P ) + MAXWRK = MAX( MAXWRK, 3*N + LWORK_DORGBR_P ) END IF MAXWRK = MAX( MAXWRK, BDSPAC ) - MINWRK = MAX( 3*N+M, BDSPAC ) + MINWRK = MAX( 3*N + M, BDSPAC ) END IF ELSE IF( MINMN.GT.0 ) THEN * @@ -475,33 +475,33 @@ BDSPAC = 5*M * Compute space needed for DGELQF CALL DGELQF( M, N, A, LDA, DUM(1), DUM(1), -1, IERR ) - LWORK_DGELQF=DUM(1) + LWORK_DGELQF = INT( DUM(1) ) * Compute space needed for DORGLQ CALL DORGLQ( N, N, M, DUM(1), N, DUM(1), DUM(1), -1, IERR ) - LWORK_DORGLQ_N=DUM(1) + LWORK_DORGLQ_N = INT( DUM(1) ) CALL DORGLQ( M, N, M, A, LDA, DUM(1), DUM(1), -1, IERR ) - LWORK_DORGLQ_M=DUM(1) + LWORK_DORGLQ_M = INT( DUM(1) ) * Compute space needed for DGEBRD CALL DGEBRD( M, M, A, LDA, S, DUM(1), DUM(1), $ DUM(1), DUM(1), -1, IERR ) - LWORK_DGEBRD=DUM(1) + LWORK_DGEBRD = INT( DUM(1) ) * Compute space needed for DORGBR P CALL DORGBR( 'P', M, M, M, A, N, DUM(1), $ DUM(1), -1, IERR ) - LWORK_DORGBR_P=DUM(1) + LWORK_DORGBR_P = INT( DUM(1) ) * Compute space needed for DORGBR Q CALL DORGBR( 'Q', M, M, M, A, N, DUM(1), $ DUM(1), -1, IERR ) - LWORK_DORGBR_Q=DUM(1) + LWORK_DORGBR_Q = INT( DUM(1) ) IF( N.GE.MNTHR ) THEN IF( WNTVN ) THEN * * Path 1t(N much larger than M, JOBVT='N') * MAXWRK = M + LWORK_DGELQF - MAXWRK = MAX( MAXWRK, 3*M+LWORK_DGEBRD ) + MAXWRK = MAX( MAXWRK, 3*M + LWORK_DGEBRD ) IF( WNTUO .OR. WNTUAS ) - $ MAXWRK = MAX( MAXWRK, 3*M+LWORK_DORGBR_Q ) + $ MAXWRK = MAX( MAXWRK, 3*M + LWORK_DORGBR_Q ) MAXWRK = MAX( MAXWRK, BDSPAC ) MINWRK = MAX( 4*M, BDSPAC ) ELSE IF( WNTVO .AND. WNTUN ) THEN @@ -509,97 +509,97 @@ * Path 2t(N much larger than M, JOBU='N', JOBVT='O') * WRKBL = M + LWORK_DGELQF - WRKBL = MAX( WRKBL, M+LWORK_DORGLQ_M ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DORGBR_P ) + WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_M ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P ) WRKBL = MAX( WRKBL, BDSPAC ) - MAXWRK = MAX( M*M+WRKBL, M*M+M*N+M ) - MINWRK = MAX( 3*M+N, BDSPAC ) + MAXWRK = MAX( M*M + WRKBL, M*M + M*N + M ) + MINWRK = MAX( 3*M + N, BDSPAC ) ELSE IF( WNTVO .AND. WNTUAS ) THEN * * Path 3t(N much larger than M, JOBU='S' or 'A', * JOBVT='O') * WRKBL = M + LWORK_DGELQF - WRKBL = MAX( WRKBL, M+LWORK_DORGLQ_M ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DORGBR_P ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DORGBR_Q ) + WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_M ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_Q ) WRKBL = MAX( WRKBL, BDSPAC ) - MAXWRK = MAX( M*M+WRKBL, M*M+M*N+M ) - MINWRK = MAX( 3*M+N, BDSPAC ) + MAXWRK = MAX( M*M + WRKBL, M*M + M*N + M ) + MINWRK = MAX( 3*M + N, BDSPAC ) ELSE IF( WNTVS .AND. WNTUN ) THEN * * Path 4t(N much larger than M, JOBU='N', JOBVT='S') * WRKBL = M + LWORK_DGELQF - WRKBL = MAX( WRKBL, M+LWORK_DORGLQ_M ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DORGBR_P ) + WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_M ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P ) WRKBL = MAX( WRKBL, BDSPAC ) MAXWRK = M*M + WRKBL - MINWRK = MAX( 3*M+N, BDSPAC ) + MINWRK = MAX( 3*M + N, BDSPAC ) ELSE IF( WNTVS .AND. WNTUO ) THEN * * Path 5t(N much larger than M, JOBU='O', JOBVT='S') * WRKBL = M + LWORK_DGELQF - WRKBL = MAX( WRKBL, M+LWORK_DORGLQ_M ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DORGBR_P ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DORGBR_Q ) + WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_M ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_Q ) WRKBL = MAX( WRKBL, BDSPAC ) MAXWRK = 2*M*M + WRKBL - MINWRK = MAX( 3*M+N, BDSPAC ) + MINWRK = MAX( 3*M + N, BDSPAC ) ELSE IF( WNTVS .AND. WNTUAS ) THEN * * Path 6t(N much larger than M, JOBU='S' or 'A', * JOBVT='S') * WRKBL = M + LWORK_DGELQF - WRKBL = MAX( WRKBL, M+LWORK_DORGLQ_M ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DORGBR_P ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DORGBR_Q ) + WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_M ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_Q ) WRKBL = MAX( WRKBL, BDSPAC ) MAXWRK = M*M + WRKBL - MINWRK = MAX( 3*M+N, BDSPAC ) + MINWRK = MAX( 3*M + N, BDSPAC ) ELSE IF( WNTVA .AND. WNTUN ) THEN * * Path 7t(N much larger than M, JOBU='N', JOBVT='A') * WRKBL = M + LWORK_DGELQF - WRKBL = MAX( WRKBL, M+LWORK_DORGLQ_N ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DORGBR_P ) + WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_N ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P ) WRKBL = MAX( WRKBL, BDSPAC ) MAXWRK = M*M + WRKBL - MINWRK = MAX( 3*M+N, BDSPAC ) + MINWRK = MAX( 3*M + N, BDSPAC ) ELSE IF( WNTVA .AND. WNTUO ) THEN * * Path 8t(N much larger than M, JOBU='O', JOBVT='A') * WRKBL = M + LWORK_DGELQF - WRKBL = MAX( WRKBL, M+LWORK_DORGLQ_N ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DORGBR_P ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DORGBR_Q ) + WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_N ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_Q ) WRKBL = MAX( WRKBL, BDSPAC ) MAXWRK = 2*M*M + WRKBL - MINWRK = MAX( 3*M+N, BDSPAC ) + MINWRK = MAX( 3*M + N, BDSPAC ) ELSE IF( WNTVA .AND. WNTUAS ) THEN * * Path 9t(N much larger than M, JOBU='S' or 'A', * JOBVT='A') * WRKBL = M + LWORK_DGELQF - WRKBL = MAX( WRKBL, M+LWORK_DORGLQ_N ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DGEBRD ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DORGBR_P ) - WRKBL = MAX( WRKBL, 3*M+LWORK_DORGBR_Q ) + WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_N ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P ) + WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_Q ) WRKBL = MAX( WRKBL, BDSPAC ) MAXWRK = M*M + WRKBL - MINWRK = MAX( 3*M+N, BDSPAC ) + MINWRK = MAX( 3*M + N, BDSPAC ) END IF ELSE * @@ -607,26 +607,26 @@ * CALL DGEBRD( M, N, A, LDA, S, DUM(1), DUM(1), $ DUM(1), DUM(1), -1, IERR ) - LWORK_DGEBRD=DUM(1) + LWORK_DGEBRD = INT( DUM(1) ) MAXWRK = 3*M + LWORK_DGEBRD IF( WNTVS .OR. WNTVO ) THEN * Compute space needed for DORGBR P CALL DORGBR( 'P', M, N, M, A, N, DUM(1), $ DUM(1), -1, IERR ) - LWORK_DORGBR_P=DUM(1) - MAXWRK = MAX( MAXWRK, 3*M+LWORK_DORGBR_P ) + LWORK_DORGBR_P = INT( DUM(1) ) + MAXWRK = MAX( MAXWRK, 3*M + LWORK_DORGBR_P ) END IF IF( WNTVA ) THEN CALL DORGBR( 'P', N, N, M, A, N, DUM(1), $ DUM(1), -1, IERR ) - LWORK_DORGBR_P=DUM(1) - MAXWRK = MAX( MAXWRK, 3*M+LWORK_DORGBR_P ) + LWORK_DORGBR_P = INT( DUM(1) ) + MAXWRK = MAX( MAXWRK, 3*M + LWORK_DORGBR_P ) END IF IF( .NOT.WNTUN ) THEN - MAXWRK = MAX( MAXWRK, 3*M+LWORK_DORGBR_Q ) + MAXWRK = MAX( MAXWRK, 3*M + LWORK_DORGBR_Q ) END IF MAXWRK = MAX( MAXWRK, BDSPAC ) - MINWRK = MAX( 3*M+N, BDSPAC ) + MINWRK = MAX( 3*M + N, BDSPAC ) END IF END IF MAXWRK = MAX( MAXWRK, MINWRK ) @@ -685,7 +685,7 @@ IWORK = ITAU + N * * Compute A=Q*R -* (Workspace: need 2*N, prefer N+N*NB) +* (Workspace: need 2*N, prefer N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( IWORK ), $ LWORK-IWORK+1, IERR ) @@ -702,7 +702,7 @@ IWORK = ITAUP + N * * Bidiagonalize R in A -* (Workspace: need 4*N, prefer 3*N+2*N*NB) +* (Workspace: need 4*N, prefer 3*N + 2*N*NB) * CALL DGEBRD( N, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( IWORK ), LWORK-IWORK+1, @@ -711,7 +711,7 @@ IF( WNTVO .OR. WNTVAS ) THEN * * If right singular vectors desired, generate P'. -* (Workspace: need 4*N-1, prefer 3*N+(N-1)*NB) +* (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB) * CALL DORGBR( 'P', N, N, N, A, LDA, WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -742,13 +742,13 @@ * Sufficient workspace for a fast algorithm * IR = 1 - IF( LWORK.GE.MAX( WRKBL, LDA*N+N )+LDA*N ) THEN + IF( LWORK.GE.MAX( WRKBL, LDA*N + N ) + LDA*N ) THEN * * WORK(IU) is LDA by N, WORK(IR) is LDA by N * LDWRKU = LDA LDWRKR = LDA - ELSE IF( LWORK.GE.MAX( WRKBL, LDA*N+N )+N*N ) THEN + ELSE IF( LWORK.GE.MAX( WRKBL, LDA*N + N ) + N*N ) THEN * * WORK(IU) is LDA by N, WORK(IR) is N by N * @@ -765,7 +765,7 @@ IWORK = ITAU + N * * Compute A=Q*R -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* (Workspace: need N*N + 2*N, prefer N*N + N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -777,7 +777,7 @@ $ LDWRKR ) * * Generate Q in A -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* (Workspace: need N*N + 2*N, prefer N*N + N + N*NB) * CALL DORGQR( M, N, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -787,14 +787,14 @@ IWORK = ITAUP + N * * Bidiagonalize R in WORK(IR) -* (Workspace: need N*N+4*N, prefer N*N+3*N+2*N*NB) +* (Workspace: need N*N + 4*N, prefer N*N + 3*N + 2*N*NB) * CALL DGEBRD( N, N, WORK( IR ), LDWRKR, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) * * Generate left vectors bidiagonalizing R -* (Workspace: need N*N+4*N, prefer N*N+3*N+N*NB) +* (Workspace: need N*N + 4*N, prefer N*N + 3*N + N*NB) * CALL DORGBR( 'Q', N, N, N, WORK( IR ), LDWRKR, $ WORK( ITAUQ ), WORK( IWORK ), @@ -803,7 +803,7 @@ * * Perform bidiagonal QR iteration, computing left * singular vectors of R in WORK(IR) -* (Workspace: need N*N+BDSPAC) +* (Workspace: need N*N + BDSPAC) * CALL DBDSQR( 'U', N, 0, N, 0, S, WORK( IE ), DUM, 1, $ WORK( IR ), LDWRKR, DUM, 1, @@ -812,7 +812,7 @@ * * Multiply Q in A by left singular vectors of R in * WORK(IR), storing result in WORK(IU) and copying to A -* (Workspace: need N*N+2*N, prefer N*N+M*N+N) +* (Workspace: need N*N + 2*N, prefer N*N + M*N + N) * DO 10 I = 1, M, LDWRKU CHUNK = MIN( M-I+1, LDWRKU ) @@ -833,14 +833,14 @@ IWORK = ITAUP + N * * Bidiagonalize A -* (Workspace: need 3*N+M, prefer 3*N+(M+N)*NB) +* (Workspace: need 3*N + M, prefer 3*N + (M + N)*NB) * CALL DGEBRD( M, N, A, LDA, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) * * Generate left vectors bidiagonalizing A -* (Workspace: need 4*N, prefer 3*N+N*NB) +* (Workspace: need 4*N, prefer 3*N + N*NB) * CALL DORGBR( 'Q', M, N, N, A, LDA, WORK( ITAUQ ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -866,13 +866,13 @@ * Sufficient workspace for a fast algorithm * IR = 1 - IF( LWORK.GE.MAX( WRKBL, LDA*N+N )+LDA*N ) THEN + IF( LWORK.GE.MAX( WRKBL, LDA*N + N ) + LDA*N ) THEN * * WORK(IU) is LDA by N and WORK(IR) is LDA by N * LDWRKU = LDA LDWRKR = LDA - ELSE IF( LWORK.GE.MAX( WRKBL, LDA*N+N )+N*N ) THEN + ELSE IF( LWORK.GE.MAX( WRKBL, LDA*N + N ) + N*N ) THEN * * WORK(IU) is LDA by N and WORK(IR) is N by N * @@ -889,7 +889,7 @@ IWORK = ITAU + N * * Compute A=Q*R -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* (Workspace: need N*N + 2*N, prefer N*N + N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -902,7 +902,7 @@ $ VT( 2, 1 ), LDVT ) * * Generate Q in A -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* (Workspace: need N*N + 2*N, prefer N*N + N + N*NB) * CALL DORGQR( M, N, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -912,7 +912,7 @@ IWORK = ITAUP + N * * Bidiagonalize R in VT, copying result to WORK(IR) -* (Workspace: need N*N+4*N, prefer N*N+3*N+2*N*NB) +* (Workspace: need N*N + 4*N, prefer N*N + 3*N + 2*N*NB) * CALL DGEBRD( N, N, VT, LDVT, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), @@ -920,14 +920,14 @@ CALL DLACPY( 'L', N, N, VT, LDVT, WORK( IR ), LDWRKR ) * * Generate left vectors bidiagonalizing R in WORK(IR) -* (Workspace: need N*N+4*N, prefer N*N+3*N+N*NB) +* (Workspace: need N*N + 4*N, prefer N*N + 3*N + N*NB) * CALL DORGBR( 'Q', N, N, N, WORK( IR ), LDWRKR, $ WORK( ITAUQ ), WORK( IWORK ), $ LWORK-IWORK+1, IERR ) * * Generate right vectors bidiagonalizing R in VT -* (Workspace: need N*N+4*N-1, prefer N*N+3*N+(N-1)*NB) +* (Workspace: need N*N + 4*N-1, prefer N*N + 3*N + (N-1)*NB) * CALL DORGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -936,7 +936,7 @@ * Perform bidiagonal QR iteration, computing left * singular vectors of R in WORK(IR) and computing right * singular vectors of R in VT -* (Workspace: need N*N+BDSPAC) +* (Workspace: need N*N + BDSPAC) * CALL DBDSQR( 'U', N, N, N, 0, S, WORK( IE ), VT, LDVT, $ WORK( IR ), LDWRKR, DUM, 1, @@ -945,7 +945,7 @@ * * Multiply Q in A by left singular vectors of R in * WORK(IR), storing result in WORK(IU) and copying to A -* (Workspace: need N*N+2*N, prefer N*N+M*N+N) +* (Workspace: need N*N + 2*N, prefer N*N + M*N + N) * DO 20 I = 1, M, LDWRKU CHUNK = MIN( M-I+1, LDWRKU ) @@ -964,7 +964,7 @@ IWORK = ITAU + N * * Compute A=Q*R -* (Workspace: need 2*N, prefer N+N*NB) +* (Workspace: need 2*N, prefer N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -977,7 +977,7 @@ $ VT( 2, 1 ), LDVT ) * * Generate Q in A -* (Workspace: need 2*N, prefer N+N*NB) +* (Workspace: need 2*N, prefer N + N*NB) * CALL DORGQR( M, N, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -987,21 +987,21 @@ IWORK = ITAUP + N * * Bidiagonalize R in VT -* (Workspace: need 4*N, prefer 3*N+2*N*NB) +* (Workspace: need 4*N, prefer 3*N + 2*N*NB) * CALL DGEBRD( N, N, VT, LDVT, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) * * Multiply Q in A by left vectors bidiagonalizing R -* (Workspace: need 3*N+M, prefer 3*N+M*NB) +* (Workspace: need 3*N + M, prefer 3*N + M*NB) * CALL DORMBR( 'Q', 'R', 'N', M, N, N, VT, LDVT, $ WORK( ITAUQ ), A, LDA, WORK( IWORK ), $ LWORK-IWORK+1, IERR ) * * Generate right vectors bidiagonalizing R in VT -* (Workspace: need 4*N-1, prefer 3*N+(N-1)*NB) +* (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB) * CALL DORGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1045,7 +1045,7 @@ IWORK = ITAU + N * * Compute A=Q*R -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* (Workspace: need N*N + 2*N, prefer N*N + N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1058,7 +1058,7 @@ $ WORK( IR+1 ), LDWRKR ) * * Generate Q in A -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* (Workspace: need N*N + 2*N, prefer N*N + N + N*NB) * CALL DORGQR( M, N, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1068,7 +1068,7 @@ IWORK = ITAUP + N * * Bidiagonalize R in WORK(IR) -* (Workspace: need N*N+4*N, prefer N*N+3*N+2*N*NB) +* (Workspace: need N*N + 4*N, prefer N*N + 3*N + 2*N*NB) * CALL DGEBRD( N, N, WORK( IR ), LDWRKR, S, $ WORK( IE ), WORK( ITAUQ ), @@ -1076,7 +1076,7 @@ $ LWORK-IWORK+1, IERR ) * * Generate left vectors bidiagonalizing R in WORK(IR) -* (Workspace: need N*N+4*N, prefer N*N+3*N+N*NB) +* (Workspace: need N*N + 4*N, prefer N*N + 3*N + N*NB) * CALL DORGBR( 'Q', N, N, N, WORK( IR ), LDWRKR, $ WORK( ITAUQ ), WORK( IWORK ), @@ -1085,7 +1085,7 @@ * * Perform bidiagonal QR iteration, computing left * singular vectors of R in WORK(IR) -* (Workspace: need N*N+BDSPAC) +* (Workspace: need N*N + BDSPAC) * CALL DBDSQR( 'U', N, 0, N, 0, S, WORK( IE ), DUM, $ 1, WORK( IR ), LDWRKR, DUM, 1, @@ -1106,14 +1106,14 @@ IWORK = ITAU + N * * Compute A=Q*R, copying result to U -* (Workspace: need 2*N, prefer N+N*NB) +* (Workspace: need 2*N, prefer N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) CALL DLACPY( 'L', M, N, A, LDA, U, LDU ) * * Generate Q in U -* (Workspace: need 2*N, prefer N+N*NB) +* (Workspace: need 2*N, prefer N + N*NB) * CALL DORGQR( M, N, N, U, LDU, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1130,14 +1130,14 @@ END IF * * Bidiagonalize R in A -* (Workspace: need 4*N, prefer 3*N+2*N*NB) +* (Workspace: need 4*N, prefer 3*N + 2*N*NB) * CALL DGEBRD( N, N, A, LDA, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) * * Multiply Q in U by left vectors bidiagonalizing R -* (Workspace: need 3*N+M, prefer 3*N+M*NB) +* (Workspace: need 3*N + M, prefer 3*N + M*NB) * CALL DORMBR( 'Q', 'R', 'N', M, N, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( IWORK ), @@ -1172,7 +1172,7 @@ LDWRKU = LDA IR = IU + LDWRKU*N LDWRKR = LDA - ELSE IF( LWORK.GE.WRKBL+( LDA+N )*N ) THEN + ELSE IF( LWORK.GE.WRKBL+( LDA + N )*N ) THEN * * WORK(IU) is LDA by N and WORK(IR) is N by N * @@ -1191,7 +1191,7 @@ IWORK = ITAU + N * * Compute A=Q*R -* (Workspace: need 2*N*N+2*N, prefer 2*N*N+N+N*NB) +* (Workspace: need 2*N*N + 2*N, prefer 2*N*N + N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1204,7 +1204,7 @@ $ WORK( IU+1 ), LDWRKU ) * * Generate Q in A -* (Workspace: need 2*N*N+2*N, prefer 2*N*N+N+N*NB) +* (Workspace: need 2*N*N + 2*N, prefer 2*N*N + N + N*NB) * CALL DORGQR( M, N, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1215,7 +1215,7 @@ * * Bidiagonalize R in WORK(IU), copying result to * WORK(IR) -* (Workspace: need 2*N*N+4*N, +* (Workspace: need 2*N*N + 4*N, * prefer 2*N*N+3*N+2*N*NB) * CALL DGEBRD( N, N, WORK( IU ), LDWRKU, S, @@ -1226,14 +1226,14 @@ $ WORK( IR ), LDWRKR ) * * Generate left bidiagonalizing vectors in WORK(IU) -* (Workspace: need 2*N*N+4*N, prefer 2*N*N+3*N+N*NB) +* (Workspace: need 2*N*N + 4*N, prefer 2*N*N + 3*N + N*NB) * CALL DORGBR( 'Q', N, N, N, WORK( IU ), LDWRKU, $ WORK( ITAUQ ), WORK( IWORK ), $ LWORK-IWORK+1, IERR ) * * Generate right bidiagonalizing vectors in WORK(IR) -* (Workspace: need 2*N*N+4*N-1, +* (Workspace: need 2*N*N + 4*N-1, * prefer 2*N*N+3*N+(N-1)*NB) * CALL DORGBR( 'P', N, N, N, WORK( IR ), LDWRKR, @@ -1244,7 +1244,7 @@ * Perform bidiagonal QR iteration, computing left * singular vectors of R in WORK(IU) and computing * right singular vectors of R in WORK(IR) -* (Workspace: need 2*N*N+BDSPAC) +* (Workspace: need 2*N*N + BDSPAC) * CALL DBDSQR( 'U', N, N, N, 0, S, WORK( IE ), $ WORK( IR ), LDWRKR, WORK( IU ), @@ -1271,14 +1271,14 @@ IWORK = ITAU + N * * Compute A=Q*R, copying result to U -* (Workspace: need 2*N, prefer N+N*NB) +* (Workspace: need 2*N, prefer N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) CALL DLACPY( 'L', M, N, A, LDA, U, LDU ) * * Generate Q in U -* (Workspace: need 2*N, prefer N+N*NB) +* (Workspace: need 2*N, prefer N + N*NB) * CALL DORGQR( M, N, N, U, LDU, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1295,21 +1295,21 @@ END IF * * Bidiagonalize R in A -* (Workspace: need 4*N, prefer 3*N+2*N*NB) +* (Workspace: need 4*N, prefer 3*N + 2*N*NB) * CALL DGEBRD( N, N, A, LDA, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) * * Multiply Q in U by left vectors bidiagonalizing R -* (Workspace: need 3*N+M, prefer 3*N+M*NB) +* (Workspace: need 3*N + M, prefer 3*N + M*NB) * CALL DORMBR( 'Q', 'R', 'N', M, N, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( IWORK ), $ LWORK-IWORK+1, IERR ) * * Generate right vectors bidiagonalizing R in A -* (Workspace: need 4*N-1, prefer 3*N+(N-1)*NB) +* (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB) * CALL DORGBR( 'P', N, N, N, A, LDA, WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1353,7 +1353,7 @@ IWORK = ITAU + N * * Compute A=Q*R -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* (Workspace: need N*N + 2*N, prefer N*N + N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1366,7 +1366,7 @@ $ WORK( IU+1 ), LDWRKU ) * * Generate Q in A -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* (Workspace: need N*N + 2*N, prefer N*N + N + N*NB) * CALL DORGQR( M, N, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1376,7 +1376,7 @@ IWORK = ITAUP + N * * Bidiagonalize R in WORK(IU), copying result to VT -* (Workspace: need N*N+4*N, prefer N*N+3*N+2*N*NB) +* (Workspace: need N*N + 4*N, prefer N*N + 3*N + 2*N*NB) * CALL DGEBRD( N, N, WORK( IU ), LDWRKU, S, $ WORK( IE ), WORK( ITAUQ ), @@ -1386,14 +1386,14 @@ $ LDVT ) * * Generate left bidiagonalizing vectors in WORK(IU) -* (Workspace: need N*N+4*N, prefer N*N+3*N+N*NB) +* (Workspace: need N*N + 4*N, prefer N*N + 3*N + N*NB) * CALL DORGBR( 'Q', N, N, N, WORK( IU ), LDWRKU, $ WORK( ITAUQ ), WORK( IWORK ), $ LWORK-IWORK+1, IERR ) * * Generate right bidiagonalizing vectors in VT -* (Workspace: need N*N+4*N-1, +* (Workspace: need N*N + 4*N-1, * prefer N*N+3*N+(N-1)*NB) * CALL DORGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ), @@ -1403,7 +1403,7 @@ * Perform bidiagonal QR iteration, computing left * singular vectors of R in WORK(IU) and computing * right singular vectors of R in VT -* (Workspace: need N*N+BDSPAC) +* (Workspace: need N*N + BDSPAC) * CALL DBDSQR( 'U', N, N, N, 0, S, WORK( IE ), VT, $ LDVT, WORK( IU ), LDWRKU, DUM, 1, @@ -1424,14 +1424,14 @@ IWORK = ITAU + N * * Compute A=Q*R, copying result to U -* (Workspace: need 2*N, prefer N+N*NB) +* (Workspace: need 2*N, prefer N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) CALL DLACPY( 'L', M, N, A, LDA, U, LDU ) * * Generate Q in U -* (Workspace: need 2*N, prefer N+N*NB) +* (Workspace: need 2*N, prefer N + N*NB) * CALL DORGQR( M, N, N, U, LDU, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1448,7 +1448,7 @@ IWORK = ITAUP + N * * Bidiagonalize R in VT -* (Workspace: need 4*N, prefer 3*N+2*N*NB) +* (Workspace: need 4*N, prefer 3*N + 2*N*NB) * CALL DGEBRD( N, N, VT, LDVT, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), @@ -1456,14 +1456,14 @@ * * Multiply Q in U by left bidiagonalizing vectors * in VT -* (Workspace: need 3*N+M, prefer 3*N+M*NB) +* (Workspace: need 3*N + M, prefer 3*N + M*NB) * CALL DORMBR( 'Q', 'R', 'N', M, N, N, VT, LDVT, $ WORK( ITAUQ ), U, LDU, WORK( IWORK ), $ LWORK-IWORK+1, IERR ) * * Generate right bidiagonalizing vectors in VT -* (Workspace: need 4*N-1, prefer 3*N+(N-1)*NB) +* (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB) * CALL DORGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1510,7 +1510,7 @@ IWORK = ITAU + N * * Compute A=Q*R, copying result to U -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* (Workspace: need N*N + 2*N, prefer N*N + N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1524,7 +1524,7 @@ $ WORK( IR+1 ), LDWRKR ) * * Generate Q in U -* (Workspace: need N*N+N+M, prefer N*N+N+M*NB) +* (Workspace: need N*N + N + M, prefer N*N + N + M*NB) * CALL DORGQR( M, M, N, U, LDU, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1534,7 +1534,7 @@ IWORK = ITAUP + N * * Bidiagonalize R in WORK(IR) -* (Workspace: need N*N+4*N, prefer N*N+3*N+2*N*NB) +* (Workspace: need N*N + 4*N, prefer N*N + 3*N + 2*N*NB) * CALL DGEBRD( N, N, WORK( IR ), LDWRKR, S, $ WORK( IE ), WORK( ITAUQ ), @@ -1542,7 +1542,7 @@ $ LWORK-IWORK+1, IERR ) * * Generate left bidiagonalizing vectors in WORK(IR) -* (Workspace: need N*N+4*N, prefer N*N+3*N+N*NB) +* (Workspace: need N*N + 4*N, prefer N*N + 3*N + N*NB) * CALL DORGBR( 'Q', N, N, N, WORK( IR ), LDWRKR, $ WORK( ITAUQ ), WORK( IWORK ), @@ -1551,7 +1551,7 @@ * * Perform bidiagonal QR iteration, computing left * singular vectors of R in WORK(IR) -* (Workspace: need N*N+BDSPAC) +* (Workspace: need N*N + BDSPAC) * CALL DBDSQR( 'U', N, 0, N, 0, S, WORK( IE ), DUM, $ 1, WORK( IR ), LDWRKR, DUM, 1, @@ -1576,14 +1576,14 @@ IWORK = ITAU + N * * Compute A=Q*R, copying result to U -* (Workspace: need 2*N, prefer N+N*NB) +* (Workspace: need 2*N, prefer N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) CALL DLACPY( 'L', M, N, A, LDA, U, LDU ) * * Generate Q in U -* (Workspace: need N+M, prefer N+M*NB) +* (Workspace: need N + M, prefer N + M*NB) * CALL DORGQR( M, M, N, U, LDU, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1600,7 +1600,7 @@ END IF * * Bidiagonalize R in A -* (Workspace: need 4*N, prefer 3*N+2*N*NB) +* (Workspace: need 4*N, prefer 3*N + 2*N*NB) * CALL DGEBRD( N, N, A, LDA, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), @@ -1608,7 +1608,7 @@ * * Multiply Q in U by left bidiagonalizing vectors * in A -* (Workspace: need 3*N+M, prefer 3*N+M*NB) +* (Workspace: need 3*N + M, prefer 3*N + M*NB) * CALL DORMBR( 'Q', 'R', 'N', M, N, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( IWORK ), @@ -1643,7 +1643,7 @@ LDWRKU = LDA IR = IU + LDWRKU*N LDWRKR = LDA - ELSE IF( LWORK.GE.WRKBL+( LDA+N )*N ) THEN + ELSE IF( LWORK.GE.WRKBL+( LDA + N )*N ) THEN * * WORK(IU) is LDA by N and WORK(IR) is N by N * @@ -1662,14 +1662,14 @@ IWORK = ITAU + N * * Compute A=Q*R, copying result to U -* (Workspace: need 2*N*N+2*N, prefer 2*N*N+N+N*NB) +* (Workspace: need 2*N*N + 2*N, prefer 2*N*N + N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) CALL DLACPY( 'L', M, N, A, LDA, U, LDU ) * * Generate Q in U -* (Workspace: need 2*N*N+N+M, prefer 2*N*N+N+M*NB) +* (Workspace: need 2*N*N + N + M, prefer 2*N*N + N + M*NB) * CALL DORGQR( M, M, N, U, LDU, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1687,7 +1687,7 @@ * * Bidiagonalize R in WORK(IU), copying result to * WORK(IR) -* (Workspace: need 2*N*N+4*N, +* (Workspace: need 2*N*N + 4*N, * prefer 2*N*N+3*N+2*N*NB) * CALL DGEBRD( N, N, WORK( IU ), LDWRKU, S, @@ -1698,14 +1698,14 @@ $ WORK( IR ), LDWRKR ) * * Generate left bidiagonalizing vectors in WORK(IU) -* (Workspace: need 2*N*N+4*N, prefer 2*N*N+3*N+N*NB) +* (Workspace: need 2*N*N + 4*N, prefer 2*N*N + 3*N + N*NB) * CALL DORGBR( 'Q', N, N, N, WORK( IU ), LDWRKU, $ WORK( ITAUQ ), WORK( IWORK ), $ LWORK-IWORK+1, IERR ) * * Generate right bidiagonalizing vectors in WORK(IR) -* (Workspace: need 2*N*N+4*N-1, +* (Workspace: need 2*N*N + 4*N-1, * prefer 2*N*N+3*N+(N-1)*NB) * CALL DORGBR( 'P', N, N, N, WORK( IR ), LDWRKR, @@ -1716,7 +1716,7 @@ * Perform bidiagonal QR iteration, computing left * singular vectors of R in WORK(IU) and computing * right singular vectors of R in WORK(IR) -* (Workspace: need 2*N*N+BDSPAC) +* (Workspace: need 2*N*N + BDSPAC) * CALL DBDSQR( 'U', N, N, N, 0, S, WORK( IE ), $ WORK( IR ), LDWRKR, WORK( IU ), @@ -1746,14 +1746,14 @@ IWORK = ITAU + N * * Compute A=Q*R, copying result to U -* (Workspace: need 2*N, prefer N+N*NB) +* (Workspace: need 2*N, prefer N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) CALL DLACPY( 'L', M, N, A, LDA, U, LDU ) * * Generate Q in U -* (Workspace: need N+M, prefer N+M*NB) +* (Workspace: need N + M, prefer N + M*NB) * CALL DORGQR( M, M, N, U, LDU, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1770,7 +1770,7 @@ END IF * * Bidiagonalize R in A -* (Workspace: need 4*N, prefer 3*N+2*N*NB) +* (Workspace: need 4*N, prefer 3*N + 2*N*NB) * CALL DGEBRD( N, N, A, LDA, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), @@ -1778,14 +1778,14 @@ * * Multiply Q in U by left bidiagonalizing vectors * in A -* (Workspace: need 3*N+M, prefer 3*N+M*NB) +* (Workspace: need 3*N + M, prefer 3*N + M*NB) * CALL DORMBR( 'Q', 'R', 'N', M, N, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( IWORK ), $ LWORK-IWORK+1, IERR ) * * Generate right bidiagonalizing vectors in A -* (Workspace: need 4*N-1, prefer 3*N+(N-1)*NB) +* (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB) * CALL DORGBR( 'P', N, N, N, A, LDA, WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1829,14 +1829,14 @@ IWORK = ITAU + N * * Compute A=Q*R, copying result to U -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* (Workspace: need N*N + 2*N, prefer N*N + N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) CALL DLACPY( 'L', M, N, A, LDA, U, LDU ) * * Generate Q in U -* (Workspace: need N*N+N+M, prefer N*N+N+M*NB) +* (Workspace: need N*N + N + M, prefer N*N + N + M*NB) * CALL DORGQR( M, M, N, U, LDU, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1853,7 +1853,7 @@ IWORK = ITAUP + N * * Bidiagonalize R in WORK(IU), copying result to VT -* (Workspace: need N*N+4*N, prefer N*N+3*N+2*N*NB) +* (Workspace: need N*N + 4*N, prefer N*N + 3*N + 2*N*NB) * CALL DGEBRD( N, N, WORK( IU ), LDWRKU, S, $ WORK( IE ), WORK( ITAUQ ), @@ -1863,14 +1863,14 @@ $ LDVT ) * * Generate left bidiagonalizing vectors in WORK(IU) -* (Workspace: need N*N+4*N, prefer N*N+3*N+N*NB) +* (Workspace: need N*N + 4*N, prefer N*N + 3*N + N*NB) * CALL DORGBR( 'Q', N, N, N, WORK( IU ), LDWRKU, $ WORK( ITAUQ ), WORK( IWORK ), $ LWORK-IWORK+1, IERR ) * * Generate right bidiagonalizing vectors in VT -* (Workspace: need N*N+4*N-1, +* (Workspace: need N*N + 4*N-1, * prefer N*N+3*N+(N-1)*NB) * CALL DORGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ), @@ -1880,7 +1880,7 @@ * Perform bidiagonal QR iteration, computing left * singular vectors of R in WORK(IU) and computing * right singular vectors of R in VT -* (Workspace: need N*N+BDSPAC) +* (Workspace: need N*N + BDSPAC) * CALL DBDSQR( 'U', N, N, N, 0, S, WORK( IE ), VT, $ LDVT, WORK( IU ), LDWRKU, DUM, 1, @@ -1905,14 +1905,14 @@ IWORK = ITAU + N * * Compute A=Q*R, copying result to U -* (Workspace: need 2*N, prefer N+N*NB) +* (Workspace: need 2*N, prefer N + N*NB) * CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) CALL DLACPY( 'L', M, N, A, LDA, U, LDU ) * * Generate Q in U -* (Workspace: need N+M, prefer N+M*NB) +* (Workspace: need N + M, prefer N + M*NB) * CALL DORGQR( M, M, N, U, LDU, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1929,7 +1929,7 @@ IWORK = ITAUP + N * * Bidiagonalize R in VT -* (Workspace: need 4*N, prefer 3*N+2*N*NB) +* (Workspace: need 4*N, prefer 3*N + 2*N*NB) * CALL DGEBRD( N, N, VT, LDVT, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), @@ -1937,14 +1937,14 @@ * * Multiply Q in U by left bidiagonalizing vectors * in VT -* (Workspace: need 3*N+M, prefer 3*N+M*NB) +* (Workspace: need 3*N + M, prefer 3*N + M*NB) * CALL DORMBR( 'Q', 'R', 'N', M, N, N, VT, LDVT, $ WORK( ITAUQ ), U, LDU, WORK( IWORK ), $ LWORK-IWORK+1, IERR ) * * Generate right bidiagonalizing vectors in VT -* (Workspace: need 4*N-1, prefer 3*N+(N-1)*NB) +* (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB) * CALL DORGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -1978,7 +1978,7 @@ IWORK = ITAUP + N * * Bidiagonalize A -* (Workspace: need 3*N+M, prefer 3*N+(M+N)*NB) +* (Workspace: need 3*N + M, prefer 3*N + (M + N)*NB) * CALL DGEBRD( M, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( IWORK ), LWORK-IWORK+1, @@ -1987,7 +1987,7 @@ * * If left singular vectors desired in U, copy result to U * and generate left bidiagonalizing vectors in U -* (Workspace: need 3*N+NCU, prefer 3*N+NCU*NB) +* (Workspace: need 3*N + NCU, prefer 3*N + NCU*NB) * CALL DLACPY( 'L', M, N, A, LDA, U, LDU ) IF( WNTUS ) @@ -2001,7 +2001,7 @@ * * If right singular vectors desired in VT, copy result to * VT and generate right bidiagonalizing vectors in VT -* (Workspace: need 4*N-1, prefer 3*N+(N-1)*NB) +* (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB) * CALL DLACPY( 'U', N, N, A, LDA, VT, LDVT ) CALL DORGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ), @@ -2011,7 +2011,7 @@ * * If left singular vectors desired in A, generate left * bidiagonalizing vectors in A -* (Workspace: need 4*N, prefer 3*N+N*NB) +* (Workspace: need 4*N, prefer 3*N + N*NB) * CALL DORGBR( 'Q', M, N, N, A, LDA, WORK( ITAUQ ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2020,7 +2020,7 @@ * * If right singular vectors desired in A, generate right * bidiagonalizing vectors in A -* (Workspace: need 4*N-1, prefer 3*N+(N-1)*NB) +* (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB) * CALL DORGBR( 'P', N, N, N, A, LDA, WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2082,7 +2082,7 @@ IWORK = ITAU + M * * Compute A=L*Q -* (Workspace: need 2*M, prefer M+M*NB) +* (Workspace: need 2*M, prefer M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), WORK( IWORK ), $ LWORK-IWORK+1, IERR ) @@ -2096,7 +2096,7 @@ IWORK = ITAUP + M * * Bidiagonalize L in A -* (Workspace: need 4*M, prefer 3*M+2*M*NB) +* (Workspace: need 4*M, prefer 3*M + 2*M*NB) * CALL DGEBRD( M, M, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( IWORK ), LWORK-IWORK+1, @@ -2104,7 +2104,7 @@ IF( WNTUO .OR. WNTUAS ) THEN * * If left singular vectors desired, generate Q -* (Workspace: need 4*M, prefer 3*M+M*NB) +* (Workspace: need 4*M, prefer 3*M + M*NB) * CALL DORGBR( 'Q', M, M, M, A, LDA, WORK( ITAUQ ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2137,14 +2137,14 @@ * Sufficient workspace for a fast algorithm * IR = 1 - IF( LWORK.GE.MAX( WRKBL, LDA*N+M )+LDA*M ) THEN + IF( LWORK.GE.MAX( WRKBL, LDA*N + M ) + LDA*M ) THEN * * WORK(IU) is LDA by N and WORK(IR) is LDA by M * LDWRKU = LDA CHUNK = N LDWRKR = LDA - ELSE IF( LWORK.GE.MAX( WRKBL, LDA*N+M )+M*M ) THEN + ELSE IF( LWORK.GE.MAX( WRKBL, LDA*N + M ) + M*M ) THEN * * WORK(IU) is LDA by N and WORK(IR) is M by M * @@ -2163,7 +2163,7 @@ IWORK = ITAU + M * * Compute A=L*Q -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* (Workspace: need M*M + 2*M, prefer M*M + M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2175,7 +2175,7 @@ $ WORK( IR+LDWRKR ), LDWRKR ) * * Generate Q in A -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* (Workspace: need M*M + 2*M, prefer M*M + M + M*NB) * CALL DORGLQ( M, N, M, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2185,14 +2185,14 @@ IWORK = ITAUP + M * * Bidiagonalize L in WORK(IR) -* (Workspace: need M*M+4*M, prefer M*M+3*M+2*M*NB) +* (Workspace: need M*M + 4*M, prefer M*M + 3*M + 2*M*NB) * CALL DGEBRD( M, M, WORK( IR ), LDWRKR, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) * * Generate right vectors bidiagonalizing L -* (Workspace: need M*M+4*M-1, prefer M*M+3*M+(M-1)*NB) +* (Workspace: need M*M + 4*M-1, prefer M*M + 3*M + (M-1)*NB) * CALL DORGBR( 'P', M, M, M, WORK( IR ), LDWRKR, $ WORK( ITAUP ), WORK( IWORK ), @@ -2201,7 +2201,7 @@ * * Perform bidiagonal QR iteration, computing right * singular vectors of L in WORK(IR) -* (Workspace: need M*M+BDSPAC) +* (Workspace: need M*M + BDSPAC) * CALL DBDSQR( 'U', M, M, 0, 0, S, WORK( IE ), $ WORK( IR ), LDWRKR, DUM, 1, DUM, 1, @@ -2210,7 +2210,7 @@ * * Multiply right singular vectors of L in WORK(IR) by Q * in A, storing result in WORK(IU) and copying to A -* (Workspace: need M*M+2*M, prefer M*M+M*N+M) +* (Workspace: need M*M + 2*M, prefer M*M + M*N + M) * DO 30 I = 1, N, CHUNK BLK = MIN( N-I+1, CHUNK ) @@ -2231,14 +2231,14 @@ IWORK = ITAUP + M * * Bidiagonalize A -* (Workspace: need 3*M+N, prefer 3*M+(M+N)*NB) +* (Workspace: need 3*M + N, prefer 3*M + (M + N)*NB) * CALL DGEBRD( M, N, A, LDA, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) * * Generate right vectors bidiagonalizing A -* (Workspace: need 4*M, prefer 3*M+M*NB) +* (Workspace: need 4*M, prefer 3*M + M*NB) * CALL DORGBR( 'P', M, N, M, A, LDA, WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2264,14 +2264,14 @@ * Sufficient workspace for a fast algorithm * IR = 1 - IF( LWORK.GE.MAX( WRKBL, LDA*N+M )+LDA*M ) THEN + IF( LWORK.GE.MAX( WRKBL, LDA*N + M ) + LDA*M ) THEN * * WORK(IU) is LDA by N and WORK(IR) is LDA by M * LDWRKU = LDA CHUNK = N LDWRKR = LDA - ELSE IF( LWORK.GE.MAX( WRKBL, LDA*N+M )+M*M ) THEN + ELSE IF( LWORK.GE.MAX( WRKBL, LDA*N + M ) + M*M ) THEN * * WORK(IU) is LDA by N and WORK(IR) is M by M * @@ -2290,7 +2290,7 @@ IWORK = ITAU + M * * Compute A=L*Q -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* (Workspace: need M*M + 2*M, prefer M*M + M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2302,7 +2302,7 @@ $ LDU ) * * Generate Q in A -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* (Workspace: need M*M + 2*M, prefer M*M + M + M*NB) * CALL DORGLQ( M, N, M, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2312,7 +2312,7 @@ IWORK = ITAUP + M * * Bidiagonalize L in U, copying result to WORK(IR) -* (Workspace: need M*M+4*M, prefer M*M+3*M+2*M*NB) +* (Workspace: need M*M + 4*M, prefer M*M + 3*M + 2*M*NB) * CALL DGEBRD( M, M, U, LDU, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), @@ -2320,14 +2320,14 @@ CALL DLACPY( 'U', M, M, U, LDU, WORK( IR ), LDWRKR ) * * Generate right vectors bidiagonalizing L in WORK(IR) -* (Workspace: need M*M+4*M-1, prefer M*M+3*M+(M-1)*NB) +* (Workspace: need M*M + 4*M-1, prefer M*M + 3*M + (M-1)*NB) * CALL DORGBR( 'P', M, M, M, WORK( IR ), LDWRKR, $ WORK( ITAUP ), WORK( IWORK ), $ LWORK-IWORK+1, IERR ) * * Generate left vectors bidiagonalizing L in U -* (Workspace: need M*M+4*M, prefer M*M+3*M+M*NB) +* (Workspace: need M*M + 4*M, prefer M*M + 3*M + M*NB) * CALL DORGBR( 'Q', M, M, M, U, LDU, WORK( ITAUQ ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2336,7 +2336,7 @@ * Perform bidiagonal QR iteration, computing left * singular vectors of L in U, and computing right * singular vectors of L in WORK(IR) -* (Workspace: need M*M+BDSPAC) +* (Workspace: need M*M + BDSPAC) * CALL DBDSQR( 'U', M, M, M, 0, S, WORK( IE ), $ WORK( IR ), LDWRKR, U, LDU, DUM, 1, @@ -2345,7 +2345,7 @@ * * Multiply right singular vectors of L in WORK(IR) by Q * in A, storing result in WORK(IU) and copying to A -* (Workspace: need M*M+2*M, prefer M*M+M*N+M)) +* (Workspace: need M*M + 2*M, prefer M*M + M*N + M)) * DO 40 I = 1, N, CHUNK BLK = MIN( N-I+1, CHUNK ) @@ -2364,7 +2364,7 @@ IWORK = ITAU + M * * Compute A=L*Q -* (Workspace: need 2*M, prefer M+M*NB) +* (Workspace: need 2*M, prefer M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2376,7 +2376,7 @@ $ LDU ) * * Generate Q in A -* (Workspace: need 2*M, prefer M+M*NB) +* (Workspace: need 2*M, prefer M + M*NB) * CALL DORGLQ( M, N, M, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2386,21 +2386,21 @@ IWORK = ITAUP + M * * Bidiagonalize L in U -* (Workspace: need 4*M, prefer 3*M+2*M*NB) +* (Workspace: need 4*M, prefer 3*M + 2*M*NB) * CALL DGEBRD( M, M, U, LDU, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) * * Multiply right vectors bidiagonalizing L by Q in A -* (Workspace: need 3*M+N, prefer 3*M+N*NB) +* (Workspace: need 3*M + N, prefer 3*M + N*NB) * CALL DORMBR( 'P', 'L', 'T', M, N, M, U, LDU, $ WORK( ITAUP ), A, LDA, WORK( IWORK ), $ LWORK-IWORK+1, IERR ) * * Generate left vectors bidiagonalizing L in U -* (Workspace: need 4*M, prefer 3*M+M*NB) +* (Workspace: need 4*M, prefer 3*M + M*NB) * CALL DORGBR( 'Q', M, M, M, U, LDU, WORK( ITAUQ ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2444,7 +2444,7 @@ IWORK = ITAU + M * * Compute A=L*Q -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* (Workspace: need M*M + 2*M, prefer M*M + M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2457,7 +2457,7 @@ $ WORK( IR+LDWRKR ), LDWRKR ) * * Generate Q in A -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* (Workspace: need M*M + 2*M, prefer M*M + M + M*NB) * CALL DORGLQ( M, N, M, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2467,7 +2467,7 @@ IWORK = ITAUP + M * * Bidiagonalize L in WORK(IR) -* (Workspace: need M*M+4*M, prefer M*M+3*M+2*M*NB) +* (Workspace: need M*M + 4*M, prefer M*M + 3*M + 2*M*NB) * CALL DGEBRD( M, M, WORK( IR ), LDWRKR, S, $ WORK( IE ), WORK( ITAUQ ), @@ -2476,7 +2476,7 @@ * * Generate right vectors bidiagonalizing L in * WORK(IR) -* (Workspace: need M*M+4*M, prefer M*M+3*M+(M-1)*NB) +* (Workspace: need M*M + 4*M, prefer M*M + 3*M + (M-1)*NB) * CALL DORGBR( 'P', M, M, M, WORK( IR ), LDWRKR, $ WORK( ITAUP ), WORK( IWORK ), @@ -2485,7 +2485,7 @@ * * Perform bidiagonal QR iteration, computing right * singular vectors of L in WORK(IR) -* (Workspace: need M*M+BDSPAC) +* (Workspace: need M*M + BDSPAC) * CALL DBDSQR( 'U', M, M, 0, 0, S, WORK( IE ), $ WORK( IR ), LDWRKR, DUM, 1, DUM, 1, @@ -2506,7 +2506,7 @@ IWORK = ITAU + M * * Compute A=L*Q -* (Workspace: need 2*M, prefer M+M*NB) +* (Workspace: need 2*M, prefer M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2516,7 +2516,7 @@ CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT ) * * Generate Q in VT -* (Workspace: need 2*M, prefer M+M*NB) +* (Workspace: need 2*M, prefer M + M*NB) * CALL DORGLQ( M, N, M, VT, LDVT, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2531,14 +2531,14 @@ $ LDA ) * * Bidiagonalize L in A -* (Workspace: need 4*M, prefer 3*M+2*M*NB) +* (Workspace: need 4*M, prefer 3*M + 2*M*NB) * CALL DGEBRD( M, M, A, LDA, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) * * Multiply right vectors bidiagonalizing L by Q in VT -* (Workspace: need 3*M+N, prefer 3*M+N*NB) +* (Workspace: need 3*M + N, prefer 3*M + N*NB) * CALL DORMBR( 'P', 'L', 'T', M, N, M, A, LDA, $ WORK( ITAUP ), VT, LDVT, @@ -2573,7 +2573,7 @@ LDWRKU = LDA IR = IU + LDWRKU*M LDWRKR = LDA - ELSE IF( LWORK.GE.WRKBL+( LDA+M )*M ) THEN + ELSE IF( LWORK.GE.WRKBL+( LDA + M )*M ) THEN * * WORK(IU) is LDA by M and WORK(IR) is M by M * @@ -2592,7 +2592,7 @@ IWORK = ITAU + M * * Compute A=L*Q -* (Workspace: need 2*M*M+2*M, prefer 2*M*M+M+M*NB) +* (Workspace: need 2*M*M + 2*M, prefer 2*M*M + M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2605,7 +2605,7 @@ $ WORK( IU+LDWRKU ), LDWRKU ) * * Generate Q in A -* (Workspace: need 2*M*M+2*M, prefer 2*M*M+M+M*NB) +* (Workspace: need 2*M*M + 2*M, prefer 2*M*M + M + M*NB) * CALL DORGLQ( M, N, M, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2616,7 +2616,7 @@ * * Bidiagonalize L in WORK(IU), copying result to * WORK(IR) -* (Workspace: need 2*M*M+4*M, +* (Workspace: need 2*M*M + 4*M, * prefer 2*M*M+3*M+2*M*NB) * CALL DGEBRD( M, M, WORK( IU ), LDWRKU, S, @@ -2627,7 +2627,7 @@ $ WORK( IR ), LDWRKR ) * * Generate right bidiagonalizing vectors in WORK(IU) -* (Workspace: need 2*M*M+4*M-1, +* (Workspace: need 2*M*M + 4*M-1, * prefer 2*M*M+3*M+(M-1)*NB) * CALL DORGBR( 'P', M, M, M, WORK( IU ), LDWRKU, @@ -2635,7 +2635,7 @@ $ LWORK-IWORK+1, IERR ) * * Generate left bidiagonalizing vectors in WORK(IR) -* (Workspace: need 2*M*M+4*M, prefer 2*M*M+3*M+M*NB) +* (Workspace: need 2*M*M + 4*M, prefer 2*M*M + 3*M + M*NB) * CALL DORGBR( 'Q', M, M, M, WORK( IR ), LDWRKR, $ WORK( ITAUQ ), WORK( IWORK ), @@ -2645,7 +2645,7 @@ * Perform bidiagonal QR iteration, computing left * singular vectors of L in WORK(IR) and computing * right singular vectors of L in WORK(IU) -* (Workspace: need 2*M*M+BDSPAC) +* (Workspace: need 2*M*M + BDSPAC) * CALL DBDSQR( 'U', M, M, M, 0, S, WORK( IE ), $ WORK( IU ), LDWRKU, WORK( IR ), @@ -2672,14 +2672,14 @@ IWORK = ITAU + M * * Compute A=L*Q, copying result to VT -* (Workspace: need 2*M, prefer M+M*NB) +* (Workspace: need 2*M, prefer M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT ) * * Generate Q in VT -* (Workspace: need 2*M, prefer M+M*NB) +* (Workspace: need 2*M, prefer M + M*NB) * CALL DORGLQ( M, N, M, VT, LDVT, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2694,21 +2694,21 @@ $ LDA ) * * Bidiagonalize L in A -* (Workspace: need 4*M, prefer 3*M+2*M*NB) +* (Workspace: need 4*M, prefer 3*M + 2*M*NB) * CALL DGEBRD( M, M, A, LDA, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) * * Multiply right vectors bidiagonalizing L by Q in VT -* (Workspace: need 3*M+N, prefer 3*M+N*NB) +* (Workspace: need 3*M + N, prefer 3*M + N*NB) * CALL DORMBR( 'P', 'L', 'T', M, N, M, A, LDA, $ WORK( ITAUP ), VT, LDVT, $ WORK( IWORK ), LWORK-IWORK+1, IERR ) * * Generate left bidiagonalizing vectors of L in A -* (Workspace: need 4*M, prefer 3*M+M*NB) +* (Workspace: need 4*M, prefer 3*M + M*NB) * CALL DORGBR( 'Q', M, M, M, A, LDA, WORK( ITAUQ ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2752,7 +2752,7 @@ IWORK = ITAU + M * * Compute A=L*Q -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* (Workspace: need M*M + 2*M, prefer M*M + M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2765,7 +2765,7 @@ $ WORK( IU+LDWRKU ), LDWRKU ) * * Generate Q in A -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* (Workspace: need M*M + 2*M, prefer M*M + M + M*NB) * CALL DORGLQ( M, N, M, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2775,7 +2775,7 @@ IWORK = ITAUP + M * * Bidiagonalize L in WORK(IU), copying result to U -* (Workspace: need M*M+4*M, prefer M*M+3*M+2*M*NB) +* (Workspace: need M*M + 4*M, prefer M*M + 3*M + 2*M*NB) * CALL DGEBRD( M, M, WORK( IU ), LDWRKU, S, $ WORK( IE ), WORK( ITAUQ ), @@ -2785,7 +2785,7 @@ $ LDU ) * * Generate right bidiagonalizing vectors in WORK(IU) -* (Workspace: need M*M+4*M-1, +* (Workspace: need M*M + 4*M-1, * prefer M*M+3*M+(M-1)*NB) * CALL DORGBR( 'P', M, M, M, WORK( IU ), LDWRKU, @@ -2793,7 +2793,7 @@ $ LWORK-IWORK+1, IERR ) * * Generate left bidiagonalizing vectors in U -* (Workspace: need M*M+4*M, prefer M*M+3*M+M*NB) +* (Workspace: need M*M + 4*M, prefer M*M + 3*M + M*NB) * CALL DORGBR( 'Q', M, M, M, U, LDU, WORK( ITAUQ ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2802,7 +2802,7 @@ * Perform bidiagonal QR iteration, computing left * singular vectors of L in U and computing right * singular vectors of L in WORK(IU) -* (Workspace: need M*M+BDSPAC) +* (Workspace: need M*M + BDSPAC) * CALL DBDSQR( 'U', M, M, M, 0, S, WORK( IE ), $ WORK( IU ), LDWRKU, U, LDU, DUM, 1, @@ -2823,14 +2823,14 @@ IWORK = ITAU + M * * Compute A=L*Q, copying result to VT -* (Workspace: need 2*M, prefer M+M*NB) +* (Workspace: need 2*M, prefer M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT ) * * Generate Q in VT -* (Workspace: need 2*M, prefer M+M*NB) +* (Workspace: need 2*M, prefer M + M*NB) * CALL DORGLQ( M, N, M, VT, LDVT, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2846,7 +2846,7 @@ IWORK = ITAUP + M * * Bidiagonalize L in U -* (Workspace: need 4*M, prefer 3*M+2*M*NB) +* (Workspace: need 4*M, prefer 3*M + 2*M*NB) * CALL DGEBRD( M, M, U, LDU, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), @@ -2854,14 +2854,14 @@ * * Multiply right bidiagonalizing vectors in U by Q * in VT -* (Workspace: need 3*M+N, prefer 3*M+N*NB) +* (Workspace: need 3*M + N, prefer 3*M + N*NB) * CALL DORMBR( 'P', 'L', 'T', M, N, M, U, LDU, $ WORK( ITAUP ), VT, LDVT, $ WORK( IWORK ), LWORK-IWORK+1, IERR ) * * Generate left bidiagonalizing vectors in U -* (Workspace: need 4*M, prefer 3*M+M*NB) +* (Workspace: need 4*M, prefer 3*M + M*NB) * CALL DORGBR( 'Q', M, M, M, U, LDU, WORK( ITAUQ ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2888,7 +2888,7 @@ * N right singular vectors to be computed in VT and * no left singular vectors to be computed * - IF( LWORK.GE.M*M+MAX( N+M, 4*M, BDSPAC ) ) THEN + IF( LWORK.GE.M*M+MAX( N + M, 4*M, BDSPAC ) ) THEN * * Sufficient workspace for a fast algorithm * @@ -2908,7 +2908,7 @@ IWORK = ITAU + M * * Compute A=L*Q, copying result to VT -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* (Workspace: need M*M + 2*M, prefer M*M + M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2922,7 +2922,7 @@ $ WORK( IR+LDWRKR ), LDWRKR ) * * Generate Q in VT -* (Workspace: need M*M+M+N, prefer M*M+M+N*NB) +* (Workspace: need M*M + M + N, prefer M*M + M + N*NB) * CALL DORGLQ( N, N, M, VT, LDVT, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2932,7 +2932,7 @@ IWORK = ITAUP + M * * Bidiagonalize L in WORK(IR) -* (Workspace: need M*M+4*M, prefer M*M+3*M+2*M*NB) +* (Workspace: need M*M + 4*M, prefer M*M + 3*M + 2*M*NB) * CALL DGEBRD( M, M, WORK( IR ), LDWRKR, S, $ WORK( IE ), WORK( ITAUQ ), @@ -2940,7 +2940,7 @@ $ LWORK-IWORK+1, IERR ) * * Generate right bidiagonalizing vectors in WORK(IR) -* (Workspace: need M*M+4*M-1, +* (Workspace: need M*M + 4*M-1, * prefer M*M+3*M+(M-1)*NB) * CALL DORGBR( 'P', M, M, M, WORK( IR ), LDWRKR, @@ -2950,7 +2950,7 @@ * * Perform bidiagonal QR iteration, computing right * singular vectors of L in WORK(IR) -* (Workspace: need M*M+BDSPAC) +* (Workspace: need M*M + BDSPAC) * CALL DBDSQR( 'U', M, M, 0, 0, S, WORK( IE ), $ WORK( IR ), LDWRKR, DUM, 1, DUM, 1, @@ -2975,14 +2975,14 @@ IWORK = ITAU + M * * Compute A=L*Q, copying result to VT -* (Workspace: need 2*M, prefer M+M*NB) +* (Workspace: need 2*M, prefer M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT ) * * Generate Q in VT -* (Workspace: need M+N, prefer M+N*NB) +* (Workspace: need M + N, prefer M + N*NB) * CALL DORGLQ( N, N, M, VT, LDVT, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -2997,7 +2997,7 @@ $ LDA ) * * Bidiagonalize L in A -* (Workspace: need 4*M, prefer 3*M+2*M*NB) +* (Workspace: need 4*M, prefer 3*M + 2*M*NB) * CALL DGEBRD( M, M, A, LDA, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), @@ -3005,7 +3005,7 @@ * * Multiply right bidiagonalizing vectors in A by Q * in VT -* (Workspace: need 3*M+N, prefer 3*M+N*NB) +* (Workspace: need 3*M + N, prefer 3*M + N*NB) * CALL DORMBR( 'P', 'L', 'T', M, N, M, A, LDA, $ WORK( ITAUP ), VT, LDVT, @@ -3028,7 +3028,7 @@ * N right singular vectors to be computed in VT and * M left singular vectors to be overwritten on A * - IF( LWORK.GE.2*M*M+MAX( N+M, 4*M, BDSPAC ) ) THEN + IF( LWORK.GE.2*M*M+MAX( N + M, 4*M, BDSPAC ) ) THEN * * Sufficient workspace for a fast algorithm * @@ -3040,7 +3040,7 @@ LDWRKU = LDA IR = IU + LDWRKU*M LDWRKR = LDA - ELSE IF( LWORK.GE.WRKBL+( LDA+M )*M ) THEN + ELSE IF( LWORK.GE.WRKBL+( LDA + M )*M ) THEN * * WORK(IU) is LDA by M and WORK(IR) is M by M * @@ -3059,14 +3059,14 @@ IWORK = ITAU + M * * Compute A=L*Q, copying result to VT -* (Workspace: need 2*M*M+2*M, prefer 2*M*M+M+M*NB) +* (Workspace: need 2*M*M + 2*M, prefer 2*M*M + M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT ) * * Generate Q in VT -* (Workspace: need 2*M*M+M+N, prefer 2*M*M+M+N*NB) +* (Workspace: need 2*M*M + M + N, prefer 2*M*M + M + N*NB) * CALL DORGLQ( N, N, M, VT, LDVT, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -3084,7 +3084,7 @@ * * Bidiagonalize L in WORK(IU), copying result to * WORK(IR) -* (Workspace: need 2*M*M+4*M, +* (Workspace: need 2*M*M + 4*M, * prefer 2*M*M+3*M+2*M*NB) * CALL DGEBRD( M, M, WORK( IU ), LDWRKU, S, @@ -3095,7 +3095,7 @@ $ WORK( IR ), LDWRKR ) * * Generate right bidiagonalizing vectors in WORK(IU) -* (Workspace: need 2*M*M+4*M-1, +* (Workspace: need 2*M*M + 4*M-1, * prefer 2*M*M+3*M+(M-1)*NB) * CALL DORGBR( 'P', M, M, M, WORK( IU ), LDWRKU, @@ -3103,7 +3103,7 @@ $ LWORK-IWORK+1, IERR ) * * Generate left bidiagonalizing vectors in WORK(IR) -* (Workspace: need 2*M*M+4*M, prefer 2*M*M+3*M+M*NB) +* (Workspace: need 2*M*M + 4*M, prefer 2*M*M + 3*M + M*NB) * CALL DORGBR( 'Q', M, M, M, WORK( IR ), LDWRKR, $ WORK( ITAUQ ), WORK( IWORK ), @@ -3113,7 +3113,7 @@ * Perform bidiagonal QR iteration, computing left * singular vectors of L in WORK(IR) and computing * right singular vectors of L in WORK(IU) -* (Workspace: need 2*M*M+BDSPAC) +* (Workspace: need 2*M*M + BDSPAC) * CALL DBDSQR( 'U', M, M, M, 0, S, WORK( IE ), $ WORK( IU ), LDWRKU, WORK( IR ), @@ -3143,14 +3143,14 @@ IWORK = ITAU + M * * Compute A=L*Q, copying result to VT -* (Workspace: need 2*M, prefer M+M*NB) +* (Workspace: need 2*M, prefer M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT ) * * Generate Q in VT -* (Workspace: need M+N, prefer M+N*NB) +* (Workspace: need M + N, prefer M + N*NB) * CALL DORGLQ( N, N, M, VT, LDVT, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -3165,7 +3165,7 @@ $ LDA ) * * Bidiagonalize L in A -* (Workspace: need 4*M, prefer 3*M+2*M*NB) +* (Workspace: need 4*M, prefer 3*M + 2*M*NB) * CALL DGEBRD( M, M, A, LDA, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), @@ -3173,14 +3173,14 @@ * * Multiply right bidiagonalizing vectors in A by Q * in VT -* (Workspace: need 3*M+N, prefer 3*M+N*NB) +* (Workspace: need 3*M + N, prefer 3*M + N*NB) * CALL DORMBR( 'P', 'L', 'T', M, N, M, A, LDA, $ WORK( ITAUP ), VT, LDVT, $ WORK( IWORK ), LWORK-IWORK+1, IERR ) * * Generate left bidiagonalizing vectors in A -* (Workspace: need 4*M, prefer 3*M+M*NB) +* (Workspace: need 4*M, prefer 3*M + M*NB) * CALL DORGBR( 'Q', M, M, M, A, LDA, WORK( ITAUQ ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -3204,7 +3204,7 @@ * N right singular vectors to be computed in VT and * M left singular vectors to be computed in U * - IF( LWORK.GE.M*M+MAX( N+M, 4*M, BDSPAC ) ) THEN + IF( LWORK.GE.M*M+MAX( N + M, 4*M, BDSPAC ) ) THEN * * Sufficient workspace for a fast algorithm * @@ -3224,14 +3224,14 @@ IWORK = ITAU + M * * Compute A=L*Q, copying result to VT -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* (Workspace: need M*M + 2*M, prefer M*M + M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT ) * * Generate Q in VT -* (Workspace: need M*M+M+N, prefer M*M+M+N*NB) +* (Workspace: need M*M + M + N, prefer M*M + M + N*NB) * CALL DORGLQ( N, N, M, VT, LDVT, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -3248,7 +3248,7 @@ IWORK = ITAUP + M * * Bidiagonalize L in WORK(IU), copying result to U -* (Workspace: need M*M+4*M, prefer M*M+3*M+2*M*NB) +* (Workspace: need M*M + 4*M, prefer M*M + 3*M + 2*M*NB) * CALL DGEBRD( M, M, WORK( IU ), LDWRKU, S, $ WORK( IE ), WORK( ITAUQ ), @@ -3258,14 +3258,14 @@ $ LDU ) * * Generate right bidiagonalizing vectors in WORK(IU) -* (Workspace: need M*M+4*M, prefer M*M+3*M+(M-1)*NB) +* (Workspace: need M*M + 4*M, prefer M*M + 3*M + (M-1)*NB) * CALL DORGBR( 'P', M, M, M, WORK( IU ), LDWRKU, $ WORK( ITAUP ), WORK( IWORK ), $ LWORK-IWORK+1, IERR ) * * Generate left bidiagonalizing vectors in U -* (Workspace: need M*M+4*M, prefer M*M+3*M+M*NB) +* (Workspace: need M*M + 4*M, prefer M*M + 3*M + M*NB) * CALL DORGBR( 'Q', M, M, M, U, LDU, WORK( ITAUQ ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -3274,7 +3274,7 @@ * Perform bidiagonal QR iteration, computing left * singular vectors of L in U and computing right * singular vectors of L in WORK(IU) -* (Workspace: need M*M+BDSPAC) +* (Workspace: need M*M + BDSPAC) * CALL DBDSQR( 'U', M, M, M, 0, S, WORK( IE ), $ WORK( IU ), LDWRKU, U, LDU, DUM, 1, @@ -3299,14 +3299,14 @@ IWORK = ITAU + M * * Compute A=L*Q, copying result to VT -* (Workspace: need 2*M, prefer M+M*NB) +* (Workspace: need 2*M, prefer M + M*NB) * CALL DGELQF( M, N, A, LDA, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT ) * * Generate Q in VT -* (Workspace: need M+N, prefer M+N*NB) +* (Workspace: need M + N, prefer M + N*NB) * CALL DORGLQ( N, N, M, VT, LDVT, WORK( ITAU ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -3322,7 +3322,7 @@ IWORK = ITAUP + M * * Bidiagonalize L in U -* (Workspace: need 4*M, prefer 3*M+2*M*NB) +* (Workspace: need 4*M, prefer 3*M + 2*M*NB) * CALL DGEBRD( M, M, U, LDU, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), @@ -3330,14 +3330,14 @@ * * Multiply right bidiagonalizing vectors in U by Q * in VT -* (Workspace: need 3*M+N, prefer 3*M+N*NB) +* (Workspace: need 3*M + N, prefer 3*M + N*NB) * CALL DORMBR( 'P', 'L', 'T', M, N, M, U, LDU, $ WORK( ITAUP ), VT, LDVT, $ WORK( IWORK ), LWORK-IWORK+1, IERR ) * * Generate left bidiagonalizing vectors in U -* (Workspace: need 4*M, prefer 3*M+M*NB) +* (Workspace: need 4*M, prefer 3*M + M*NB) * CALL DORGBR( 'Q', M, M, M, U, LDU, WORK( ITAUQ ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -3371,7 +3371,7 @@ IWORK = ITAUP + M * * Bidiagonalize A -* (Workspace: need 3*M+N, prefer 3*M+(M+N)*NB) +* (Workspace: need 3*M + N, prefer 3*M + (M + N)*NB) * CALL DGEBRD( M, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( IWORK ), LWORK-IWORK+1, @@ -3380,7 +3380,7 @@ * * If left singular vectors desired in U, copy result to U * and generate left bidiagonalizing vectors in U -* (Workspace: need 4*M-1, prefer 3*M+(M-1)*NB) +* (Workspace: need 4*M-1, prefer 3*M + (M-1)*NB) * CALL DLACPY( 'L', M, M, A, LDA, U, LDU ) CALL DORGBR( 'Q', M, M, N, U, LDU, WORK( ITAUQ ), @@ -3390,7 +3390,7 @@ * * If right singular vectors desired in VT, copy result to * VT and generate right bidiagonalizing vectors in VT -* (Workspace: need 3*M+NRVT, prefer 3*M+NRVT*NB) +* (Workspace: need 3*M + NRVT, prefer 3*M + NRVT*NB) * CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT ) IF( WNTVA ) @@ -3404,7 +3404,7 @@ * * If left singular vectors desired in A, generate left * bidiagonalizing vectors in A -* (Workspace: need 4*M-1, prefer 3*M+(M-1)*NB) +* (Workspace: need 4*M-1, prefer 3*M + (M-1)*NB) * CALL DORGBR( 'Q', M, M, N, A, LDA, WORK( ITAUQ ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) @@ -3413,7 +3413,7 @@ * * If right singular vectors desired in A, generate right * bidiagonalizing vectors in A -* (Workspace: need 4*M, prefer 3*M+M*NB) +* (Workspace: need 4*M, prefer 3*M + M*NB) * CALL DORGBR( 'P', M, N, M, A, LDA, WORK( ITAUP ), $ WORK( IWORK ), LWORK-IWORK+1, IERR ) diff --git a/SRC/sbdsdc.f b/SRC/sbdsdc.f index 261aa1c2..ebafecdc 100644 --- a/SRC/sbdsdc.f +++ b/SRC/sbdsdc.f @@ -311,7 +311,7 @@ WSTART = 1 QSTART = 3 IF( ICOMPQ.EQ.1 ) THEN - CALL SCOPY( N, D, 1, Q( 1 ), 1 ) + CALL SCOPY( N, D, 1, Q( 1 ), 1 ) CALL SCOPY( N-1, E, 1, Q( N+1 ), 1 ) END IF IF( IUPLO.EQ.2 ) THEN @@ -335,8 +335,11 @@ * If ICOMPQ = 0, use SLASDQ to compute the singular values. * IF( ICOMPQ.EQ.0 ) THEN +* Ignore WSTART, instead using WORK( 1 ), since the two vectors +* for CS and -SN above are added only if ICOMPQ == 2, +* and adding them exceeds documented WORK size of 4*n. CALL SLASDQ( 'U', 0, N, 0, 0, 0, D, E, VT, LDVT, U, LDU, U, - $ LDU, WORK( WSTART ), INFO ) + $ LDU, WORK( 1 ), INFO ) GO TO 40 END IF * diff --git a/SRC/sgesdd.f b/SRC/sgesdd.f index 1bc7e8a4..612ad8b5 100644 --- a/SRC/sgesdd.f +++ b/SRC/sgesdd.f @@ -18,8 +18,8 @@ * Definition: * =========== * -* SUBROUTINE SGESDD( JOBZ, M, N, A, LDA, S, U, LDU, VT, LDVT, WORK, -* LWORK, IWORK, INFO ) +* SUBROUTINE SGESDD( JOBZ, M, N, A, LDA, S, U, LDU, VT, LDVT, +* WORK, LWORK, IWORK, INFO ) * * .. Scalar Arguments .. * CHARACTER JOBZ @@ -27,7 +27,7 @@ * .. * .. Array Arguments .. * INTEGER IWORK( * ) -* REAL A( LDA, * ), S( * ), U( LDU, * ), +* REAL A( LDA, * ), S( * ), U( LDU, * ), * $ VT( LDVT, * ), WORK( * ) * .. * @@ -154,8 +154,8 @@ *> \param[in] LDVT *> \verbatim *> LDVT is INTEGER -*> The leading dimension of the array VT. LDVT >= 1; if -*> JOBZ = 'A' or JOBZ = 'O' and M >= N, LDVT >= N; +*> The leading dimension of the array VT. LDVT >= 1; +*> if JOBZ = 'A' or JOBZ = 'O' and M >= N, LDVT >= N; *> if JOBZ = 'S', LDVT >= min(M,N). *> \endverbatim *> @@ -169,16 +169,18 @@ *> \verbatim *> LWORK is INTEGER *> The dimension of the array WORK. LWORK >= 1. -*> If JOBZ = 'N', -*> LWORK >= 3*min(M,N) + max(max(M,N),6*min(M,N)). -*> If JOBZ = 'O', -*> LWORK >= 3*min(M,N) + -*> max(max(M,N),5*min(M,N)*min(M,N)+4*min(M,N)). -*> If JOBZ = 'S' or 'A' -*> LWORK >= min(M,N)*(7+4*min(M,N)) -*> For good performance, LWORK should generally be larger. -*> If LWORK = -1 but other input arguments are legal, WORK(1) -*> returns the optimal LWORK. +*> If LWORK = -1, a workspace query is assumed. The optimal +*> size for the WORK array is calculated and stored in WORK(1), +*> and no other work except argument checking is performed. +*> +*> Let mx = max(M,N) and mn = min(M,N). +*> If JOBZ = 'N', LWORK >= 3*mn + max( mx, 7*mn ). +*> If JOBZ = 'O', LWORK >= 3*mn + max( mx, 5*mn*mn + 4*mn ). +*> If JOBZ = 'S', LWORK >= 4*mn*mn + 7*mn. +*> If JOBZ = 'A', LWORK >= 4*mn*mn + 6*mn + mx. +*> These are not tight minimums in all cases; see comments inside code. +*> For good performance, LWORK should generally be larger; +*> a query is recommended. *> \endverbatim *> *> \param[out] IWORK @@ -213,8 +215,9 @@ *> California at Berkeley, USA *> * ===================================================================== - SUBROUTINE SGESDD( JOBZ, M, N, A, LDA, S, U, LDU, VT, LDVT, WORK, - $ LWORK, IWORK, INFO ) + SUBROUTINE SGESDD( JOBZ, M, N, A, LDA, S, U, LDU, VT, LDVT, + $ WORK, LWORK, IWORK, INFO ) + implicit none * * -- LAPACK driver routine (version 3.6.0) -- * -- LAPACK is a software package provided by Univ. of Tennessee, -- @@ -227,14 +230,14 @@ * .. * .. Array Arguments .. INTEGER IWORK( * ) - REAL A( LDA, * ), S( * ), U( LDU, * ), + REAL A( LDA, * ), S( * ), U( LDU, * ), $ VT( LDVT, * ), WORK( * ) * .. * * ===================================================================== * * .. Parameters .. - REAL ZERO, ONE + REAL ZERO, ONE PARAMETER ( ZERO = 0.0E0, ONE = 1.0E0 ) * .. * .. Local Scalars .. @@ -243,7 +246,16 @@ $ IR, ISCL, ITAU, ITAUP, ITAUQ, IU, IVT, LDWKVT, $ LDWRKL, LDWRKR, LDWRKU, MAXWRK, MINMN, MINWRK, $ MNTHR, NWORK, WRKBL - REAL ANRM, BIGNUM, EPS, SMLNUM + INTEGER LWORK_SGEBRD_MN, LWORK_SGEBRD_MM, + $ LWORK_SGEBRD_NN, LWORK_SGELQF_MN, + $ LWORK_SGEQRF_MN, + $ LWORK_SORGBR_P_MM, LWORK_SORGBR_Q_NN, + $ LWORK_SORGLQ_MN, LWORK_SORGLQ_NN, + $ LWORK_SORGQR_MM, LWORK_SORGQR_MN, + $ LWORK_SORMBR_PRT_MM, LWORK_SORMBR_QLN_MM, + $ LWORK_SORMBR_PRT_MN, LWORK_SORMBR_QLN_MN, + $ LWORK_SORMBR_PRT_NN, LWORK_SORMBR_QLN_NN + REAL ANRM, BIGNUM, EPS, SMLNUM * .. * .. Local Arrays .. INTEGER IDUM( 1 ) @@ -256,9 +268,8 @@ * .. * .. External Functions .. LOGICAL LSAME - INTEGER ILAENV REAL SLAMCH, SLANGE - EXTERNAL ILAENV, LSAME, SLAMCH, SLANGE + EXTERNAL SLAMCH, SLANGE, LSAME * .. * .. Intrinsic Functions .. INTRINSIC INT, MAX, MIN, SQRT @@ -267,13 +278,13 @@ * * Test the input arguments * - INFO = 0 - MINMN = MIN( M, N ) - WNTQA = LSAME( JOBZ, 'A' ) - WNTQS = LSAME( JOBZ, 'S' ) + INFO = 0 + MINMN = MIN( M, N ) + WNTQA = LSAME( JOBZ, 'A' ) + WNTQS = LSAME( JOBZ, 'S' ) WNTQAS = WNTQA .OR. WNTQS - WNTQO = LSAME( JOBZ, 'O' ) - WNTQN = LSAME( JOBZ, 'N' ) + WNTQO = LSAME( JOBZ, 'O' ) + WNTQN = LSAME( JOBZ, 'N' ) LQUERY = ( LWORK.EQ.-1 ) * IF( .NOT.( WNTQA .OR. WNTQS .OR. WNTQO .OR. WNTQN ) ) THEN @@ -294,222 +305,270 @@ END IF * * Compute workspace -* (Note: Comments in the code beginning "Workspace:" describe the -* minimal amount of workspace needed at that point in the code, +* Note: Comments in the code beginning "Workspace:" describe the +* minimal amount of workspace allocated at that point in the code, * as well as the preferred amount for good performance. * NB refers to the optimal block size for the immediately -* following subroutine, as returned by ILAENV.) +* following subroutine, as returned by ILAENV. * IF( INFO.EQ.0 ) THEN MINWRK = 1 MAXWRK = 1 + BDSPAC = 0 + MNTHR = INT( MINMN*11.0E0 / 6.0E0 ) IF( M.GE.N .AND. MINMN.GT.0 ) THEN * * Compute space needed for SBDSDC * - MNTHR = INT( MINMN*11.0E0 / 6.0E0 ) IF( WNTQN ) THEN +* sbdsdc needs only 4*N (or 6*N for uplo=L for LAPACK <= 3.6) +* keep 7*N for backwards compatability. BDSPAC = 7*N ELSE BDSPAC = 3*N*N + 4*N END IF +* +* Compute space preferred for each routine + CALL SGEBRD( M, N, DUM(1), M, DUM(1), DUM(1), DUM(1), + $ DUM(1), DUM(1), -1, IERR ) + LWORK_SGEBRD_MN = INT( DUM(1) ) +* + CALL SGEBRD( N, N, DUM(1), N, DUM(1), DUM(1), DUM(1), + $ DUM(1), DUM(1), -1, IERR ) + LWORK_SGEBRD_NN = INT( DUM(1) ) +* + CALL SGEQRF( M, N, DUM(1), M, DUM(1), DUM(1), -1, IERR ) + LWORK_SGEQRF_MN = INT( DUM(1) ) +* + CALL SORGBR( 'Q', N, N, N, DUM(1), N, DUM(1), DUM(1), -1, + $ IERR ) + LWORK_SORGBR_Q_NN = INT( DUM(1) ) +* + CALL SORGQR( M, M, N, DUM(1), M, DUM(1), DUM(1), -1, IERR ) + LWORK_SORGQR_MM = INT( DUM(1) ) +* + CALL SORGQR( M, N, N, DUM(1), M, DUM(1), DUM(1), -1, IERR ) + LWORK_SORGQR_MN = INT( DUM(1) ) +* + CALL SORMBR( 'P', 'R', 'T', N, N, N, DUM(1), N, + $ DUM(1), DUM(1), N, DUM(1), -1, IERR ) + LWORK_SORMBR_PRT_NN = INT( DUM(1) ) +* + CALL SORMBR( 'Q', 'L', 'N', N, N, N, DUM(1), N, + $ DUM(1), DUM(1), N, DUM(1), -1, IERR ) + LWORK_SORMBR_QLN_NN = INT( DUM(1) ) +* + CALL SORMBR( 'Q', 'L', 'N', M, N, N, DUM(1), M, + $ DUM(1), DUM(1), M, DUM(1), -1, IERR ) + LWORK_SORMBR_QLN_MN = INT( DUM(1) ) +* + CALL SORMBR( 'Q', 'L', 'N', M, M, N, DUM(1), M, + $ DUM(1), DUM(1), M, DUM(1), -1, IERR ) + LWORK_SORMBR_QLN_MM = INT( DUM(1) ) +* IF( M.GE.MNTHR ) THEN IF( WNTQN ) THEN * -* Path 1 (M much larger than N, JOBZ='N') +* Path 1 (M >> N, JOBZ='N') * - WRKBL = N + N*ILAENV( 1, 'SGEQRF', ' ', M, N, -1, - $ -1 ) - WRKBL = MAX( WRKBL, 3*N+2*N* - $ ILAENV( 1, 'SGEBRD', ' ', N, N, -1, -1 ) ) - MAXWRK = MAX( WRKBL, BDSPAC+N ) + WRKBL = N + LWORK_SGEQRF_MN + WRKBL = MAX( WRKBL, 3*N + LWORK_SGEBRD_NN ) + MAXWRK = MAX( WRKBL, BDSPAC + N ) MINWRK = BDSPAC + N ELSE IF( WNTQO ) THEN * -* Path 2 (M much larger than N, JOBZ='O') -* - WRKBL = N + N*ILAENV( 1, 'SGEQRF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, N+N*ILAENV( 1, 'SORGQR', ' ', M, - $ N, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+2*N* - $ ILAENV( 1, 'SGEBRD', ' ', N, N, -1, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'SORMBR', 'QLN', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'SORMBR', 'PRT', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*N ) +* Path 2 (M >> N, JOBZ='O') +* + WRKBL = N + LWORK_SGEQRF_MN + WRKBL = MAX( WRKBL, N + LWORK_SORGQR_MN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_SGEBRD_NN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_SORMBR_QLN_NN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_SORMBR_PRT_NN ) + WRKBL = MAX( WRKBL, 3*N + BDSPAC ) MAXWRK = WRKBL + 2*N*N MINWRK = BDSPAC + 2*N*N + 3*N ELSE IF( WNTQS ) THEN * -* Path 3 (M much larger than N, JOBZ='S') -* - WRKBL = N + N*ILAENV( 1, 'SGEQRF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, N+N*ILAENV( 1, 'SORGQR', ' ', M, - $ N, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+2*N* - $ ILAENV( 1, 'SGEBRD', ' ', N, N, -1, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'SORMBR', 'QLN', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'SORMBR', 'PRT', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*N ) +* Path 3 (M >> N, JOBZ='S') +* + WRKBL = N + LWORK_SGEQRF_MN + WRKBL = MAX( WRKBL, N + LWORK_SORGQR_MN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_SGEBRD_NN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_SORMBR_QLN_NN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_SORMBR_PRT_NN ) + WRKBL = MAX( WRKBL, 3*N + BDSPAC ) MAXWRK = WRKBL + N*N MINWRK = BDSPAC + N*N + 3*N ELSE IF( WNTQA ) THEN * -* Path 4 (M much larger than N, JOBZ='A') -* - WRKBL = N + N*ILAENV( 1, 'SGEQRF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, N+M*ILAENV( 1, 'SORGQR', ' ', M, - $ M, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+2*N* - $ ILAENV( 1, 'SGEBRD', ' ', N, N, -1, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'SORMBR', 'QLN', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'SORMBR', 'PRT', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*N ) +* Path 4 (M >> N, JOBZ='A') +* + WRKBL = N + LWORK_SGEQRF_MN + WRKBL = MAX( WRKBL, N + LWORK_SORGQR_MM ) + WRKBL = MAX( WRKBL, 3*N + LWORK_SGEBRD_NN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_SORMBR_QLN_NN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_SORMBR_PRT_NN ) + WRKBL = MAX( WRKBL, 3*N + BDSPAC ) MAXWRK = WRKBL + N*N - MINWRK = BDSPAC + N*N + 2*N + M + MINWRK = N*N + MAX( 3*N + BDSPAC, N + M ) END IF ELSE * -* Path 5 (M at least N, but not much larger) +* Path 5 (M >= N, but not much larger) * - WRKBL = 3*N + ( M+N )*ILAENV( 1, 'SGEBRD', ' ', M, N, -1, - $ -1 ) + WRKBL = 3*N + LWORK_SGEBRD_MN IF( WNTQN ) THEN - MAXWRK = MAX( WRKBL, BDSPAC+3*N ) +* Path 5n (M >= N, jobz='N') + MAXWRK = MAX( WRKBL, 3*N + BDSPAC ) MINWRK = 3*N + MAX( M, BDSPAC ) ELSE IF( WNTQO ) THEN - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'SORMBR', 'QLN', M, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'SORMBR', 'PRT', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*N ) +* Path 5o (M >= N, jobz='O') + WRKBL = MAX( WRKBL, 3*N + LWORK_SORMBR_PRT_NN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_SORMBR_QLN_MN ) + WRKBL = MAX( WRKBL, 3*N + BDSPAC ) MAXWRK = WRKBL + M*N - MINWRK = 3*N + MAX( M, N*N+BDSPAC ) + MINWRK = 3*N + MAX( M, N*N + BDSPAC ) ELSE IF( WNTQS ) THEN - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'SORMBR', 'QLN', M, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'SORMBR', 'PRT', N, N, N, -1 ) ) - MAXWRK = MAX( WRKBL, BDSPAC+3*N ) +* Path 5s (M >= N, jobz='S') + WRKBL = MAX( WRKBL, 3*N + LWORK_SORMBR_QLN_MN ) + WRKBL = MAX( WRKBL, 3*N + LWORK_SORMBR_PRT_NN ) + MAXWRK = MAX( WRKBL, 3*N + BDSPAC ) MINWRK = 3*N + MAX( M, BDSPAC ) ELSE IF( WNTQA ) THEN - WRKBL = MAX( WRKBL, 3*N+M* - $ ILAENV( 1, 'SORMBR', 'QLN', M, M, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*N+N* - $ ILAENV( 1, 'SORMBR', 'PRT', N, N, N, -1 ) ) - MAXWRK = MAX( MAXWRK, BDSPAC+3*N ) +* Path 5a (M >= N, jobz='A') + WRKBL = MAX( WRKBL, 3*N + LWORK_SORMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 3*N + LWORK_SORMBR_PRT_NN ) + MAXWRK = MAX( WRKBL, 3*N + BDSPAC ) MINWRK = 3*N + MAX( M, BDSPAC ) END IF END IF - ELSE IF ( MINMN.GT.0 ) THEN + ELSE IF( MINMN.GT.0 ) THEN * * Compute space needed for SBDSDC * - MNTHR = INT( MINMN*11.0E0 / 6.0E0 ) IF( WNTQN ) THEN +* sbdsdc needs only 4*N (or 6*N for uplo=L for LAPACK <= 3.6) +* keep 7*N for backwards compatability. BDSPAC = 7*M ELSE BDSPAC = 3*M*M + 4*M END IF +* +* Compute space preferred for each routine + CALL SGEBRD( M, N, DUM(1), M, DUM(1), DUM(1), DUM(1), + $ DUM(1), DUM(1), -1, IERR ) + LWORK_SGEBRD_MN = INT( DUM(1) ) +* + CALL SGEBRD( M, M, A, M, S, DUM(1), DUM(1), + $ DUM(1), DUM(1), -1, IERR ) + LWORK_SGEBRD_MM = INT( DUM(1) ) +* + CALL SGELQF( M, N, A, M, DUM(1), DUM(1), -1, IERR ) + LWORK_SGELQF_MN = INT( DUM(1) ) +* + CALL SORGLQ( N, N, M, DUM(1), N, DUM(1), DUM(1), -1, IERR ) + LWORK_SORGLQ_NN = INT( DUM(1) ) +* + CALL SORGLQ( M, N, M, A, M, DUM(1), DUM(1), -1, IERR ) + LWORK_SORGLQ_MN = INT( DUM(1) ) +* + CALL SORGBR( 'P', M, M, M, A, N, DUM(1), DUM(1), -1, IERR ) + LWORK_SORGBR_P_MM = INT( DUM(1) ) +* + CALL SORMBR( 'P', 'R', 'T', M, M, M, DUM(1), M, + $ DUM(1), DUM(1), M, DUM(1), -1, IERR ) + LWORK_SORMBR_PRT_MM = INT( DUM(1) ) +* + CALL SORMBR( 'P', 'R', 'T', M, N, M, DUM(1), M, + $ DUM(1), DUM(1), M, DUM(1), -1, IERR ) + LWORK_SORMBR_PRT_MN = INT( DUM(1) ) +* + CALL SORMBR( 'P', 'R', 'T', N, N, M, DUM(1), N, + $ DUM(1), DUM(1), N, DUM(1), -1, IERR ) + LWORK_SORMBR_PRT_NN = INT( DUM(1) ) +* + CALL SORMBR( 'Q', 'L', 'N', M, M, M, DUM(1), M, + $ DUM(1), DUM(1), M, DUM(1), -1, IERR ) + LWORK_SORMBR_QLN_MM = INT( DUM(1) ) +* IF( N.GE.MNTHR ) THEN IF( WNTQN ) THEN * -* Path 1t (N much larger than M, JOBZ='N') +* Path 1t (N >> M, JOBZ='N') * - WRKBL = M + M*ILAENV( 1, 'SGELQF', ' ', M, N, -1, - $ -1 ) - WRKBL = MAX( WRKBL, 3*M+2*M* - $ ILAENV( 1, 'SGEBRD', ' ', M, M, -1, -1 ) ) - MAXWRK = MAX( WRKBL, BDSPAC+M ) + WRKBL = M + LWORK_SGELQF_MN + WRKBL = MAX( WRKBL, 3*M + LWORK_SGEBRD_MM ) + MAXWRK = MAX( WRKBL, BDSPAC + M ) MINWRK = BDSPAC + M ELSE IF( WNTQO ) THEN * -* Path 2t (N much larger than M, JOBZ='O') -* - WRKBL = M + M*ILAENV( 1, 'SGELQF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, M+M*ILAENV( 1, 'SORGLQ', ' ', M, - $ N, M, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+2*M* - $ ILAENV( 1, 'SGEBRD', ' ', M, M, -1, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'SORMBR', 'QLN', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'SORMBR', 'PRT', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*M ) +* Path 2t (N >> M, JOBZ='O') +* + WRKBL = M + LWORK_SGELQF_MN + WRKBL = MAX( WRKBL, M + LWORK_SORGLQ_MN ) + WRKBL = MAX( WRKBL, 3*M + LWORK_SGEBRD_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_SORMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_SORMBR_PRT_MM ) + WRKBL = MAX( WRKBL, 3*M + BDSPAC ) MAXWRK = WRKBL + 2*M*M MINWRK = BDSPAC + 2*M*M + 3*M ELSE IF( WNTQS ) THEN * -* Path 3t (N much larger than M, JOBZ='S') -* - WRKBL = M + M*ILAENV( 1, 'SGELQF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, M+M*ILAENV( 1, 'SORGLQ', ' ', M, - $ N, M, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+2*M* - $ ILAENV( 1, 'SGEBRD', ' ', M, M, -1, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'SORMBR', 'QLN', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'SORMBR', 'PRT', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*M ) +* Path 3t (N >> M, JOBZ='S') +* + WRKBL = M + LWORK_SGELQF_MN + WRKBL = MAX( WRKBL, M + LWORK_SORGLQ_MN ) + WRKBL = MAX( WRKBL, 3*M + LWORK_SGEBRD_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_SORMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_SORMBR_PRT_MM ) + WRKBL = MAX( WRKBL, 3*M + BDSPAC ) MAXWRK = WRKBL + M*M MINWRK = BDSPAC + M*M + 3*M ELSE IF( WNTQA ) THEN * -* Path 4t (N much larger than M, JOBZ='A') -* - WRKBL = M + M*ILAENV( 1, 'SGELQF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, M+N*ILAENV( 1, 'SORGLQ', ' ', N, - $ N, M, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+2*M* - $ ILAENV( 1, 'SGEBRD', ' ', M, M, -1, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'SORMBR', 'QLN', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'SORMBR', 'PRT', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*M ) +* Path 4t (N >> M, JOBZ='A') +* + WRKBL = M + LWORK_SGELQF_MN + WRKBL = MAX( WRKBL, M + LWORK_SORGLQ_NN ) + WRKBL = MAX( WRKBL, 3*M + LWORK_SGEBRD_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_SORMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_SORMBR_PRT_MM ) + WRKBL = MAX( WRKBL, 3*M + BDSPAC ) MAXWRK = WRKBL + M*M - MINWRK = BDSPAC + M*M + 3*M + MINWRK = M*M + MAX( 3*M + BDSPAC, M + N ) END IF ELSE * -* Path 5t (N greater than M, but not much larger) +* Path 5t (N > M, but not much larger) * - WRKBL = 3*M + ( M+N )*ILAENV( 1, 'SGEBRD', ' ', M, N, -1, - $ -1 ) + WRKBL = 3*M + LWORK_SGEBRD_MN IF( WNTQN ) THEN - MAXWRK = MAX( WRKBL, BDSPAC+3*M ) +* Path 5tn (N > M, jobz='N') + MAXWRK = MAX( WRKBL, 3*M + BDSPAC ) MINWRK = 3*M + MAX( N, BDSPAC ) ELSE IF( WNTQO ) THEN - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'SORMBR', 'QLN', M, M, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'SORMBR', 'PRT', M, N, M, -1 ) ) - WRKBL = MAX( WRKBL, BDSPAC+3*M ) +* Path 5to (N > M, jobz='O') + WRKBL = MAX( WRKBL, 3*M + LWORK_SORMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_SORMBR_PRT_MN ) + WRKBL = MAX( WRKBL, 3*M + BDSPAC ) MAXWRK = WRKBL + M*N - MINWRK = 3*M + MAX( N, M*M+BDSPAC ) + MINWRK = 3*M + MAX( N, M*M + BDSPAC ) ELSE IF( WNTQS ) THEN - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'SORMBR', 'QLN', M, M, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'SORMBR', 'PRT', M, N, M, -1 ) ) - MAXWRK = MAX( WRKBL, BDSPAC+3*M ) +* Path 5ts (N > M, jobz='S') + WRKBL = MAX( WRKBL, 3*M + LWORK_SORMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_SORMBR_PRT_MN ) + MAXWRK = MAX( WRKBL, 3*M + BDSPAC ) MINWRK = 3*M + MAX( N, BDSPAC ) ELSE IF( WNTQA ) THEN - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'SORMBR', 'QLN', M, M, N, -1 ) ) - WRKBL = MAX( WRKBL, 3*M+M* - $ ILAENV( 1, 'SORMBR', 'PRT', N, N, M, -1 ) ) - MAXWRK = MAX( WRKBL, BDSPAC+3*M ) +* Path 5ta (N > M, jobz='A') + WRKBL = MAX( WRKBL, 3*M + LWORK_SORMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 3*M + LWORK_SORMBR_PRT_NN ) + MAXWRK = MAX( WRKBL, 3*M + BDSPAC ) MINWRK = 3*M + MAX( N, BDSPAC ) END IF END IF END IF + MAXWRK = MAX( MAXWRK, MINWRK ) WORK( 1 ) = MAXWRK * @@ -559,17 +618,18 @@ * IF( WNTQN ) THEN * -* Path 1 (M much larger than N, JOBZ='N') +* Path 1 (M >> N, JOBZ='N') * No singular vectors to be computed * ITAU = 1 NWORK = ITAU + N * * Compute A=Q*R -* (Workspace: need 2*N, prefer N+N*NB) +* Workspace: need N [tau] + N [work] +* Workspace: prefer N [tau] + N*NB [work] * CALL SGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Zero out below R * @@ -580,7 +640,8 @@ NWORK = ITAUP + N * * Bidiagonalize R in A -* (Workspace: need 4*N, prefer 3*N+2*N*NB) +* Workspace: need 3*N [e, tauq, taup] + N [work] +* Workspace: prefer 3*N [e, tauq, taup] + 2*N*NB [work] * CALL SGEBRD( N, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -588,14 +649,14 @@ NWORK = IE + N * * Perform bidiagonal SVD, computing singular values only -* (Workspace: need N+BDSPAC) +* Workspace: need N [e] + BDSPAC * CALL SBDSDC( 'U', 'N', N, S, WORK( IE ), DUM, 1, DUM, 1, $ DUM, IDUM, WORK( NWORK ), IWORK, INFO ) * ELSE IF( WNTQO ) THEN * -* Path 2 (M much larger than N, JOBZ = 'O') +* Path 2 (M >> N, JOBZ = 'O') * N left singular vectors to be overwritten on A and * N right singular vectors to be computed in VT * @@ -603,42 +664,45 @@ * * WORK(IR) is LDWRKR by N * - IF( LWORK.GE.LDA*N+N*N+3*N+BDSPAC ) THEN + IF( LWORK .GE. LDA*N + N*N + 3*N + BDSPAC ) THEN LDWRKR = LDA ELSE - LDWRKR = ( LWORK-N*N-3*N-BDSPAC ) / N + LDWRKR = ( LWORK - N*N - 3*N - BDSPAC ) / N END IF ITAU = IR + LDWRKR*N NWORK = ITAU + N * * Compute A=Q*R -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* Workspace: need N*N [R] + N [tau] + N [work] +* Workspace: prefer N*N [R] + N [tau] + N*NB [work] * CALL SGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Copy R to WORK(IR), zeroing out below it * CALL SLACPY( 'U', N, N, A, LDA, WORK( IR ), LDWRKR ) - CALL SLASET( 'L', N-1, N-1, ZERO, ZERO, WORK( IR+1 ), + CALL SLASET( 'L', N - 1, N - 1, ZERO, ZERO, WORK(IR+1), $ LDWRKR ) * * Generate Q in A -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* Workspace: need N*N [R] + N [tau] + N [work] +* Workspace: prefer N*N [R] + N [tau] + N*NB [work] * CALL SORGQR( M, N, N, A, LDA, WORK( ITAU ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) IE = ITAU ITAUQ = IE + N ITAUP = ITAUQ + N NWORK = ITAUP + N * -* Bidiagonalize R in VT, copying result to WORK(IR) -* (Workspace: need N*N+4*N, prefer N*N+3*N+2*N*NB) +* Bidiagonalize R in WORK(IR) +* Workspace: need N*N [R] + 3*N [e, tauq, taup] + N [work] +* Workspace: prefer N*N [R] + 3*N [e, tauq, taup] + 2*N*NB [work] * CALL SGEBRD( N, N, WORK( IR ), LDWRKR, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * WORK(IU) is N by N * @@ -648,7 +712,7 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in WORK(IU) and computing right * singular vectors of bidiagonal matrix in VT -* (Workspace: need N+N*N+BDSPAC) +* Workspace: need N*N [R] + 3*N [e, tauq, taup] + N*N [U] + BDSPAC * CALL SBDSDC( 'U', 'I', N, S, WORK( IE ), WORK( IU ), N, $ VT, LDVT, DUM, IDUM, WORK( NWORK ), IWORK, @@ -656,21 +720,23 @@ * * Overwrite WORK(IU) by left singular vectors of R * and VT by right singular vectors of R -* (Workspace: need 2*N*N+3*N, prefer 2*N*N+2*N+N*NB) +* Workspace: need N*N [R] + 3*N [e, tauq, taup] + N*N [U] + N [work] +* Workspace: prefer N*N [R] + 3*N [e, tauq, taup] + N*N [U] + N*NB [work] * CALL SORMBR( 'Q', 'L', 'N', N, N, N, WORK( IR ), LDWRKR, $ WORK( ITAUQ ), WORK( IU ), N, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL SORMBR( 'P', 'R', 'T', N, N, N, WORK( IR ), LDWRKR, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Multiply Q in A by left singular vectors of R in * WORK(IU), storing result in WORK(IR) and copying to A -* (Workspace: need 2*N*N, prefer N*N+M*N) +* Workspace: need N*N [R] + 3*N [e, tauq, taup] + N*N [U] +* Workspace: prefer M*N [R] + 3*N [e, tauq, taup] + N*N [U] * DO 10 I = 1, M, LDWRKR - CHUNK = MIN( M-I+1, LDWRKR ) + CHUNK = MIN( M - I + 1, LDWRKR ) CALL SGEMM( 'N', 'N', CHUNK, N, N, ONE, A( I, 1 ), $ LDA, WORK( IU ), N, ZERO, WORK( IR ), $ LDWRKR ) @@ -680,7 +746,7 @@ * ELSE IF( WNTQS ) THEN * -* Path 3 (M much larger than N, JOBZ='S') +* Path 3 (M >> N, JOBZ='S') * N left singular vectors to be computed in U and * N right singular vectors to be computed in VT * @@ -693,38 +759,41 @@ NWORK = ITAU + N * * Compute A=Q*R -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* Workspace: need N*N [R] + N [tau] + N [work] +* Workspace: prefer N*N [R] + N [tau] + N*NB [work] * CALL SGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Copy R to WORK(IR), zeroing out below it * CALL SLACPY( 'U', N, N, A, LDA, WORK( IR ), LDWRKR ) - CALL SLASET( 'L', N-1, N-1, ZERO, ZERO, WORK( IR+1 ), + CALL SLASET( 'L', N - 1, N - 1, ZERO, ZERO, WORK(IR+1), $ LDWRKR ) * * Generate Q in A -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* Workspace: need N*N [R] + N [tau] + N [work] +* Workspace: prefer N*N [R] + N [tau] + N*NB [work] * CALL SORGQR( M, N, N, A, LDA, WORK( ITAU ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) IE = ITAU ITAUQ = IE + N ITAUP = ITAUQ + N NWORK = ITAUP + N * * Bidiagonalize R in WORK(IR) -* (Workspace: need N*N+4*N, prefer N*N+3*N+2*N*NB) +* Workspace: need N*N [R] + 3*N [e, tauq, taup] + N [work] +* Workspace: prefer N*N [R] + 3*N [e, tauq, taup] + 2*N*NB [work] * CALL SGEBRD( N, N, WORK( IR ), LDWRKR, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Perform bidiagonal SVD, computing left singular vectors * of bidiagoal matrix in U and computing right singular * vectors of bidiagonal matrix in VT -* (Workspace: need N+BDSPAC) +* Workspace: need N*N [R] + 3*N [e, tauq, taup] + BDSPAC * CALL SBDSDC( 'U', 'I', N, S, WORK( IE ), U, LDU, VT, $ LDVT, DUM, IDUM, WORK( NWORK ), IWORK, @@ -732,19 +801,20 @@ * * Overwrite U by left singular vectors of R and VT * by right singular vectors of R -* (Workspace: need N*N+3*N, prefer N*N+2*N+N*NB) +* Workspace: need N*N [R] + 3*N [e, tauq, taup] + N [work] +* Workspace: prefer N*N [R] + 3*N [e, tauq, taup] + N*NB [work] * CALL SORMBR( 'Q', 'L', 'N', N, N, N, WORK( IR ), LDWRKR, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * CALL SORMBR( 'P', 'R', 'T', N, N, N, WORK( IR ), LDWRKR, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Multiply Q in A by left singular vectors of R in * WORK(IR), storing result in U -* (Workspace: need N*N) +* Workspace: need N*N [R] * CALL SLACPY( 'F', N, N, U, LDU, WORK( IR ), LDWRKR ) CALL SGEMM( 'N', 'N', M, N, N, ONE, A, LDA, WORK( IR ), @@ -752,7 +822,7 @@ * ELSE IF( WNTQA ) THEN * -* Path 4 (M much larger than N, JOBZ='A') +* Path 4 (M >> N, JOBZ='A') * M left singular vectors to be computed in U and * N right singular vectors to be computed in VT * @@ -765,16 +835,18 @@ NWORK = ITAU + N * * Compute A=Q*R, copying result to U -* (Workspace: need N*N+N+M, prefer N*N+N+M*NB) +* Workspace: need N*N [U] + N [tau] + N [work] +* Workspace: prefer N*N [U] + N [tau] + N*NB [work] * CALL SGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL SLACPY( 'L', M, N, A, LDA, U, LDU ) * * Generate Q in U -* (Workspace: need N*N+N+M, prefer N*N+N+M*NB) +* Workspace: need N*N [U] + N [tau] + M [work] +* Workspace: prefer N*N [U] + N [tau] + M*NB [work] CALL SORGQR( M, M, N, U, LDU, WORK( ITAU ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Produce R in A, zeroing out other entries * @@ -785,7 +857,8 @@ NWORK = ITAUP + N * * Bidiagonalize R in A -* (Workspace: need N*N+4*N, prefer N*N+3*N+2*N*NB) +* Workspace: need N*N [U] + 3*N [e, tauq, taup] + N [work] +* Workspace: prefer N*N [U] + 3*N [e, tauq, taup] + 2*N*NB [work] * CALL SGEBRD( N, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -794,7 +867,7 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in WORK(IU) and computing right * singular vectors of bidiagonal matrix in VT -* (Workspace: need N+N*N+BDSPAC) +* Workspace: need N*N [U] + 3*N [e, tauq, taup] + BDSPAC * CALL SBDSDC( 'U', 'I', N, S, WORK( IE ), WORK( IU ), N, $ VT, LDVT, DUM, IDUM, WORK( NWORK ), IWORK, @@ -802,18 +875,19 @@ * * Overwrite WORK(IU) by left singular vectors of R and VT * by right singular vectors of R -* (Workspace: need N*N+3*N, prefer N*N+2*N+N*NB) +* Workspace: need N*N [U] + 3*N [e, tauq, taup] + N [work] +* Workspace: prefer N*N [U] + 3*N [e, tauq, taup] + N*NB [work] * CALL SORMBR( 'Q', 'L', 'N', N, N, N, A, LDA, $ WORK( ITAUQ ), WORK( IU ), LDWRKU, - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) CALL SORMBR( 'P', 'R', 'T', N, N, N, A, LDA, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Multiply Q in U by left singular vectors of R in * WORK(IU), storing result in A -* (Workspace: need N*N) +* Workspace: need N*N [U] * CALL SGEMM( 'N', 'N', M, N, N, ONE, U, LDU, WORK( IU ), $ LDWRKU, ZERO, A, LDA ) @@ -828,7 +902,7 @@ * * M .LT. MNTHR * -* Path 5 (M at least N, but not much larger) +* Path 5 (M >= N, but not much larger) * Reduce to bidiagonal form without QR decomposition * IE = 1 @@ -837,21 +911,24 @@ NWORK = ITAUP + N * * Bidiagonalize A -* (Workspace: need 3*N+M, prefer 3*N+(M+N)*NB) +* Workspace: need 3*N [e, tauq, taup] + M [work] +* Workspace: prefer 3*N [e, tauq, taup] + (M+N)*NB [work] * CALL SGEBRD( M, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, $ IERR ) IF( WNTQN ) THEN * +* Path 5n (M >= N, JOBZ='N') * Perform bidiagonal SVD, only computing singular values -* (Workspace: need N+BDSPAC) +* Workspace: need 3*N [e, tauq, taup] + BDSPAC * CALL SBDSDC( 'U', 'N', N, S, WORK( IE ), DUM, 1, DUM, 1, $ DUM, IDUM, WORK( NWORK ), IWORK, INFO ) ELSE IF( WNTQO ) THEN +* Path 5o (M >= N, JOBZ='O') IU = NWORK - IF( LWORK.GE.M*N+3*N+BDSPAC ) THEN + IF( LWORK .GE. M*N + 3*N + BDSPAC ) THEN * * WORK( IU ) is M by N * @@ -859,6 +936,8 @@ NWORK = IU + LDWRKU*N CALL SLASET( 'F', M, N, ZERO, ZERO, WORK( IU ), $ LDWRKU ) +* IR is unused; silence compile warnings + IR = -1 ELSE * * WORK( IU ) is N by N @@ -869,53 +948,59 @@ * WORK(IR) is LDWRKR by N * IR = NWORK - LDWRKR = ( LWORK-N*N-3*N ) / N + LDWRKR = ( LWORK - N*N - 3*N ) / N END IF NWORK = IU + LDWRKU*N * * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in WORK(IU) and computing right * singular vectors of bidiagonal matrix in VT -* (Workspace: need N+N*N+BDSPAC) +* Workspace: need 3*N [e, tauq, taup] + N*N [U] + BDSPAC * CALL SBDSDC( 'U', 'I', N, S, WORK( IE ), WORK( IU ), $ LDWRKU, VT, LDVT, DUM, IDUM, WORK( NWORK ), $ IWORK, INFO ) * * Overwrite VT by right singular vectors of A -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* Workspace: need 3*N [e, tauq, taup] + N*N [U] + N [work] +* Workspace: prefer 3*N [e, tauq, taup] + N*N [U] + N*NB [work] * CALL SORMBR( 'P', 'R', 'T', N, N, N, A, LDA, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * - IF( LWORK.GE.M*N+3*N+BDSPAC ) THEN + IF( LWORK .GE. M*N + 3*N + BDSPAC ) THEN * +* Path 5o-fast * Overwrite WORK(IU) by left singular vectors of A -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* Workspace: need 3*N [e, tauq, taup] + M*N [U] + N [work] +* Workspace: prefer 3*N [e, tauq, taup] + M*N [U] + N*NB [work] * CALL SORMBR( 'Q', 'L', 'N', M, N, N, A, LDA, $ WORK( ITAUQ ), WORK( IU ), LDWRKU, - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Copy left singular vectors of A from WORK(IU) to A * CALL SLACPY( 'F', M, N, WORK( IU ), LDWRKU, A, LDA ) ELSE * +* Path 5o-slow * Generate Q in A -* (Workspace: need N*N+2*N, prefer N*N+N+N*NB) +* Workspace: need 3*N [e, tauq, taup] + N*N [U] + N [work] +* Workspace: prefer 3*N [e, tauq, taup] + N*N [U] + N*NB [work] * CALL SORGBR( 'Q', M, N, N, A, LDA, WORK( ITAUQ ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Multiply Q in A by left singular vectors of * bidiagonal matrix in WORK(IU), storing result in * WORK(IR) and copying to A -* (Workspace: need 2*N*N, prefer N*N+M*N) +* Workspace: need 3*N [e, tauq, taup] + N*N [U] + NB*N [R] +* Workspace: prefer 3*N [e, tauq, taup] + N*N [U] + M*N [R] * DO 20 I = 1, M, LDWRKR - CHUNK = MIN( M-I+1, LDWRKR ) + CHUNK = MIN( M - I + 1, LDWRKR ) CALL SGEMM( 'N', 'N', CHUNK, N, N, ONE, A( I, 1 ), $ LDA, WORK( IU ), LDWRKU, ZERO, $ WORK( IR ), LDWRKR ) @@ -926,10 +1011,11 @@ * ELSE IF( WNTQS ) THEN * +* Path 5s (M >= N, JOBZ='S') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U and computing right singular * vectors of bidiagonal matrix in VT -* (Workspace: need N+BDSPAC) +* Workspace: need 3*N [e, tauq, taup] + BDSPAC * CALL SLASET( 'F', M, N, ZERO, ZERO, U, LDU ) CALL SBDSDC( 'U', 'I', N, S, WORK( IE ), U, LDU, VT, @@ -938,20 +1024,22 @@ * * Overwrite U by left singular vectors of A and VT * by right singular vectors of A -* (Workspace: need 3*N, prefer 2*N+N*NB) +* Workspace: need 3*N [e, tauq, taup] + N [work] +* Workspace: prefer 3*N [e, tauq, taup] + N*NB [work] * CALL SORMBR( 'Q', 'L', 'N', M, N, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL SORMBR( 'P', 'R', 'T', N, N, N, A, LDA, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) ELSE IF( WNTQA ) THEN * +* Path 5a (M >= N, JOBZ='A') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U and computing right singular * vectors of bidiagonal matrix in VT -* (Workspace: need N+BDSPAC) +* Workspace: need 3*N [e, tauq, taup] + BDSPAC * CALL SLASET( 'F', M, M, ZERO, ZERO, U, LDU ) CALL SBDSDC( 'U', 'I', N, S, WORK( IE ), U, LDU, VT, @@ -961,20 +1049,21 @@ * Set the right corner of U to identity matrix * IF( M.GT.N ) THEN - CALL SLASET( 'F', M-N, M-N, ZERO, ONE, U( N+1, N+1 ), + CALL SLASET( 'F', M - N, M - N, ZERO, ONE, U(N+1,N+1), $ LDU ) END IF * * Overwrite U by left singular vectors of A and VT * by right singular vectors of A -* (Workspace: need N*N+2*N+M, prefer N*N+2*N+M*NB) +* Workspace: need 3*N [e, tauq, taup] + M [work] +* Workspace: prefer 3*N [e, tauq, taup] + M*NB [work] * CALL SORMBR( 'Q', 'L', 'N', M, M, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL SORMBR( 'P', 'R', 'T', N, N, M, A, LDA, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) END IF * END IF @@ -989,17 +1078,18 @@ * IF( WNTQN ) THEN * -* Path 1t (N much larger than M, JOBZ='N') +* Path 1t (N >> M, JOBZ='N') * No singular vectors to be computed * ITAU = 1 NWORK = ITAU + M * * Compute A=L*Q -* (Workspace: need 2*M, prefer M+M*NB) +* Workspace: need M [tau] + M [work] +* Workspace: prefer M [tau] + M*NB [work] * CALL SGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Zero out above L * @@ -1010,7 +1100,8 @@ NWORK = ITAUP + M * * Bidiagonalize L in A -* (Workspace: need 4*M, prefer 3*M+2*M*NB) +* Workspace: need 3*M [e, tauq, taup] + M [work] +* Workspace: prefer 3*M [e, tauq, taup] + 2*M*NB [work] * CALL SGEBRD( M, M, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -1018,68 +1109,69 @@ NWORK = IE + M * * Perform bidiagonal SVD, computing singular values only -* (Workspace: need M+BDSPAC) +* Workspace: need M [e] + BDSPAC * CALL SBDSDC( 'U', 'N', M, S, WORK( IE ), DUM, 1, DUM, 1, $ DUM, IDUM, WORK( NWORK ), IWORK, INFO ) * ELSE IF( WNTQO ) THEN * -* Path 2t (N much larger than M, JOBZ='O') +* Path 2t (N >> M, JOBZ='O') * M right singular vectors to be overwritten on A and * M left singular vectors to be computed in U * IVT = 1 * -* IVT is M by M +* WORK(IVT) is M by M +* WORK(IL) is M by M; it is later resized to M by chunk for gemm * IL = IVT + M*M - IF( LWORK.GE.M*N+M*M+3*M+BDSPAC ) THEN -* -* WORK(IL) is M by N -* + IF( LWORK .GE. M*N + M*M + 3*M + BDSPAC ) THEN LDWRKL = M CHUNK = N ELSE LDWRKL = M - CHUNK = ( LWORK-M*M ) / M + CHUNK = ( LWORK - M*M ) / M END IF ITAU = IL + LDWRKL*M NWORK = ITAU + M * * Compute A=L*Q -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need M*M [VT] + M*M [L] + M [tau] + M [work] +* Workspace: prefer M*M [VT] + M*M [L] + M [tau] + M*NB [work] * CALL SGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Copy L to WORK(IL), zeroing about above it * CALL SLACPY( 'L', M, M, A, LDA, WORK( IL ), LDWRKL ) - CALL SLASET( 'U', M-1, M-1, ZERO, ZERO, - $ WORK( IL+LDWRKL ), LDWRKL ) + CALL SLASET( 'U', M - 1, M - 1, ZERO, ZERO, + $ WORK( IL + LDWRKL ), LDWRKL ) * * Generate Q in A -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need M*M [VT] + M*M [L] + M [tau] + M [work] +* Workspace: prefer M*M [VT] + M*M [L] + M [tau] + M*NB [work] * CALL SORGLQ( M, N, M, A, LDA, WORK( ITAU ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) IE = ITAU ITAUQ = IE + M ITAUP = ITAUQ + M NWORK = ITAUP + M * * Bidiagonalize L in WORK(IL) -* (Workspace: need M*M+4*M, prefer M*M+3*M+2*M*NB) +* Workspace: need M*M [VT] + M*M [L] + 3*M [e, tauq, taup] + M [work] +* Workspace: prefer M*M [VT] + M*M [L] + 3*M [e, tauq, taup] + 2*M*NB [work] * CALL SGEBRD( M, M, WORK( IL ), LDWRKL, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U, and computing right singular * vectors of bidiagonal matrix in WORK(IVT) -* (Workspace: need M+M*M+BDSPAC) +* Workspace: need M*M [VT] + M*M [L] + 3*M [e, tauq, taup] + BDSPAC * CALL SBDSDC( 'U', 'I', M, S, WORK( IE ), U, LDU, $ WORK( IVT ), M, DUM, IDUM, WORK( NWORK ), @@ -1087,21 +1179,24 @@ * * Overwrite U by left singular vectors of L and WORK(IVT) * by right singular vectors of L -* (Workspace: need 2*M*M+3*M, prefer 2*M*M+2*M+M*NB) +* Workspace: need M*M [VT] + M*M [L] + 3*M [e, tauq, taup] + M [work] +* Workspace: prefer M*M [VT] + M*M [L] + 3*M [e, tauq, taup] + M*NB [work] * CALL SORMBR( 'Q', 'L', 'N', M, M, M, WORK( IL ), LDWRKL, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL SORMBR( 'P', 'R', 'T', M, M, M, WORK( IL ), LDWRKL, $ WORK( ITAUP ), WORK( IVT ), M, - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Multiply right singular vectors of L in WORK(IVT) by Q * in A, storing result in WORK(IL) and copying to A -* (Workspace: need 2*M*M, prefer M*M+M*N) +* Workspace: need M*M [VT] + M*M [L] +* Workspace: prefer M*M [VT] + M*N [L] +* At this point, L is resized as M by chunk. * DO 30 I = 1, N, CHUNK - BLK = MIN( N-I+1, CHUNK ) + BLK = MIN( N - I + 1, CHUNK ) CALL SGEMM( 'N', 'N', M, BLK, M, ONE, WORK( IVT ), M, $ A( 1, I ), LDA, ZERO, WORK( IL ), LDWRKL ) CALL SLACPY( 'F', M, BLK, WORK( IL ), LDWRKL, @@ -1110,7 +1205,7 @@ * ELSE IF( WNTQS ) THEN * -* Path 3t (N much larger than M, JOBZ='S') +* Path 3t (N >> M, JOBZ='S') * M right singular vectors to be computed in VT and * M left singular vectors to be computed in U * @@ -1123,38 +1218,41 @@ NWORK = ITAU + M * * Compute A=L*Q -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need M*M [L] + M [tau] + M [work] +* Workspace: prefer M*M [L] + M [tau] + M*NB [work] * CALL SGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Copy L to WORK(IL), zeroing out above it * CALL SLACPY( 'L', M, M, A, LDA, WORK( IL ), LDWRKL ) - CALL SLASET( 'U', M-1, M-1, ZERO, ZERO, - $ WORK( IL+LDWRKL ), LDWRKL ) + CALL SLASET( 'U', M - 1, M - 1, ZERO, ZERO, + $ WORK( IL + LDWRKL ), LDWRKL ) * * Generate Q in A -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need M*M [L] + M [tau] + M [work] +* Workspace: prefer M*M [L] + M [tau] + M*NB [work] * CALL SORGLQ( M, N, M, A, LDA, WORK( ITAU ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) IE = ITAU ITAUQ = IE + M ITAUP = ITAUQ + M NWORK = ITAUP + M * -* Bidiagonalize L in WORK(IU), copying result to U -* (Workspace: need M*M+4*M, prefer M*M+3*M+2*M*NB) +* Bidiagonalize L in WORK(IU). +* Workspace: need M*M [L] + 3*M [e, tauq, taup] + M [work] +* Workspace: prefer M*M [L] + 3*M [e, tauq, taup] + 2*M*NB [work] * CALL SGEBRD( M, M, WORK( IL ), LDWRKL, S, WORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U and computing right singular * vectors of bidiagonal matrix in VT -* (Workspace: need M+BDSPAC) +* Workspace: need M*M [L] + 3*M [e, tauq, taup] + BDSPAC * CALL SBDSDC( 'U', 'I', M, S, WORK( IE ), U, LDU, VT, $ LDVT, DUM, IDUM, WORK( NWORK ), IWORK, @@ -1162,18 +1260,19 @@ * * Overwrite U by left singular vectors of L and VT * by right singular vectors of L -* (Workspace: need M*M+3*M, prefer M*M+2*M+M*NB) +* Workspace: need M*M [L] + 3*M [e, tauq, taup] + M [work] +* Workspace: prefer M*M [L] + 3*M [e, tauq, taup] + M*NB [work] * CALL SORMBR( 'Q', 'L', 'N', M, M, M, WORK( IL ), LDWRKL, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL SORMBR( 'P', 'R', 'T', M, M, M, WORK( IL ), LDWRKL, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * * Multiply right singular vectors of L in WORK(IL) by * Q in A, storing result in VT -* (Workspace: need M*M) +* Workspace: need M*M [L] * CALL SLACPY( 'F', M, M, VT, LDVT, WORK( IL ), LDWRKL ) CALL SGEMM( 'N', 'N', M, N, M, ONE, WORK( IL ), LDWRKL, @@ -1181,7 +1280,7 @@ * ELSE IF( WNTQA ) THEN * -* Path 4t (N much larger than M, JOBZ='A') +* Path 4t (N >> M, JOBZ='A') * N right singular vectors to be computed in VT and * M left singular vectors to be computed in U * @@ -1194,17 +1293,19 @@ NWORK = ITAU + M * * Compute A=L*Q, copying result to VT -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need M*M [VT] + M [tau] + M [work] +* Workspace: prefer M*M [VT] + M [tau] + M*NB [work] * CALL SGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL SLACPY( 'U', M, N, A, LDA, VT, LDVT ) * * Generate Q in VT -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need M*M [VT] + M [tau] + N [work] +* Workspace: prefer M*M [VT] + M [tau] + N*NB [work] * CALL SORGLQ( N, N, M, VT, LDVT, WORK( ITAU ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Produce L in A, zeroing out other entries * @@ -1215,7 +1316,8 @@ NWORK = ITAUP + M * * Bidiagonalize L in A -* (Workspace: need M*M+4*M, prefer M*M+3*M+2*M*NB) +* Workspace: need M*M [VT] + 3*M [e, tauq, taup] + M [work] +* Workspace: prefer M*M [VT] + 3*M [e, tauq, taup] + 2*M*NB [work] * CALL SGEBRD( M, M, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -1224,7 +1326,7 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U and computing right singular * vectors of bidiagonal matrix in WORK(IVT) -* (Workspace: need M+M*M+BDSPAC) +* Workspace: need M*M [VT] + 3*M [e, tauq, taup] + BDSPAC * CALL SBDSDC( 'U', 'I', M, S, WORK( IE ), U, LDU, $ WORK( IVT ), LDWKVT, DUM, IDUM, @@ -1232,18 +1334,19 @@ * * Overwrite U by left singular vectors of L and WORK(IVT) * by right singular vectors of L -* (Workspace: need M*M+3*M, prefer M*M+2*M+M*NB) +* Workspace: need M*M [VT] + 3*M [e, tauq, taup]+ M [work] +* Workspace: prefer M*M [VT] + 3*M [e, tauq, taup]+ M*NB [work] * CALL SORMBR( 'Q', 'L', 'N', M, M, M, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL SORMBR( 'P', 'R', 'T', M, M, M, A, LDA, $ WORK( ITAUP ), WORK( IVT ), LDWKVT, - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Multiply right singular vectors of L in WORK(IVT) by * Q in VT, storing result in A -* (Workspace: need M*M) +* Workspace: need M*M [VT] * CALL SGEMM( 'N', 'N', M, N, M, ONE, WORK( IVT ), LDWKVT, $ VT, LDVT, ZERO, A, LDA ) @@ -1258,7 +1361,7 @@ * * N .LT. MNTHR * -* Path 5t (N greater than M, but not much larger) +* Path 5t (N > M, but not much larger) * Reduce to bidiagonal form without LQ decomposition * IE = 1 @@ -1267,28 +1370,33 @@ NWORK = ITAUP + M * * Bidiagonalize A -* (Workspace: need 3*M+N, prefer 3*M+(M+N)*NB) +* Workspace: need 3*M [e, tauq, taup] + N [work] +* Workspace: prefer 3*M [e, tauq, taup] + (M+N)*NB [work] * CALL SGEBRD( M, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, $ IERR ) IF( WNTQN ) THEN * +* Path 5tn (N > M, JOBZ='N') * Perform bidiagonal SVD, only computing singular values -* (Workspace: need M+BDSPAC) +* Workspace: need 3*M [e, tauq, taup] + BDSPAC * CALL SBDSDC( 'L', 'N', M, S, WORK( IE ), DUM, 1, DUM, 1, $ DUM, IDUM, WORK( NWORK ), IWORK, INFO ) ELSE IF( WNTQO ) THEN +* Path 5to (N > M, JOBZ='O') LDWKVT = M IVT = NWORK - IF( LWORK.GE.M*N+3*M+BDSPAC ) THEN + IF( LWORK .GE. M*N + 3*M + BDSPAC ) THEN * * WORK( IVT ) is M by N * CALL SLASET( 'F', M, N, ZERO, ZERO, WORK( IVT ), $ LDWKVT ) NWORK = IVT + LDWKVT*N +* IL is unused; silence compile warnings + IL = -1 ELSE * * WORK( IVT ) is M by M @@ -1298,52 +1406,58 @@ * * WORK(IL) is M by CHUNK * - CHUNK = ( LWORK-M*M-3*M ) / M + CHUNK = ( LWORK - M*M - 3*M ) / M END IF * * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U and computing right singular * vectors of bidiagonal matrix in WORK(IVT) -* (Workspace: need M*M+BDSPAC) +* Workspace: need 3*M [e, tauq, taup] + M*M [VT] + BDSPAC * CALL SBDSDC( 'L', 'I', M, S, WORK( IE ), U, LDU, $ WORK( IVT ), LDWKVT, DUM, IDUM, $ WORK( NWORK ), IWORK, INFO ) * * Overwrite U by left singular vectors of A -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need 3*M [e, tauq, taup] + M*M [VT] + M [work] +* Workspace: prefer 3*M [e, tauq, taup] + M*M [VT] + M*NB [work] * CALL SORMBR( 'Q', 'L', 'N', M, M, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) * - IF( LWORK.GE.M*N+3*M+BDSPAC ) THEN + IF( LWORK .GE. M*N + 3*M + BDSPAC ) THEN * +* Path 5to-fast * Overwrite WORK(IVT) by left singular vectors of A -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need 3*M [e, tauq, taup] + M*N [VT] + M [work] +* Workspace: prefer 3*M [e, tauq, taup] + M*N [VT] + M*NB [work] * CALL SORMBR( 'P', 'R', 'T', M, N, M, A, LDA, $ WORK( ITAUP ), WORK( IVT ), LDWKVT, - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Copy right singular vectors of A from WORK(IVT) to A * CALL SLACPY( 'F', M, N, WORK( IVT ), LDWKVT, A, LDA ) ELSE * +* Path 5to-slow * Generate P**T in A -* (Workspace: need M*M+2*M, prefer M*M+M+M*NB) +* Workspace: need 3*M [e, tauq, taup] + M*M [VT] + M [work] +* Workspace: prefer 3*M [e, tauq, taup] + M*M [VT] + M*NB [work] * CALL SORGBR( 'P', M, N, M, A, LDA, WORK( ITAUP ), - $ WORK( NWORK ), LWORK-NWORK+1, IERR ) + $ WORK( NWORK ), LWORK - NWORK + 1, IERR ) * * Multiply Q in A by right singular vectors of * bidiagonal matrix in WORK(IVT), storing result in * WORK(IL) and copying to A -* (Workspace: need 2*M*M, prefer M*M+M*N) +* Workspace: need 3*M [e, tauq, taup] + M*M [VT] + M*NB [L] +* Workspace: prefer 3*M [e, tauq, taup] + M*M [VT] + M*N [L] * DO 40 I = 1, N, CHUNK - BLK = MIN( N-I+1, CHUNK ) + BLK = MIN( N - I + 1, CHUNK ) CALL SGEMM( 'N', 'N', M, BLK, M, ONE, WORK( IVT ), $ LDWKVT, A( 1, I ), LDA, ZERO, $ WORK( IL ), M ) @@ -1353,10 +1467,11 @@ END IF ELSE IF( WNTQS ) THEN * +* Path 5ts (N > M, JOBZ='S') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U and computing right singular * vectors of bidiagonal matrix in VT -* (Workspace: need M+BDSPAC) +* Workspace: need 3*M [e, tauq, taup] + BDSPAC * CALL SLASET( 'F', M, N, ZERO, ZERO, VT, LDVT ) CALL SBDSDC( 'L', 'I', M, S, WORK( IE ), U, LDU, VT, @@ -1365,20 +1480,22 @@ * * Overwrite U by left singular vectors of A and VT * by right singular vectors of A -* (Workspace: need 3*M, prefer 2*M+M*NB) +* Workspace: need 3*M [e, tauq, taup] + M [work] +* Workspace: prefer 3*M [e, tauq, taup] + M*NB [work] * CALL SORMBR( 'Q', 'L', 'N', M, M, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL SORMBR( 'P', 'R', 'T', M, N, M, A, LDA, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) ELSE IF( WNTQA ) THEN * +* Path 5ta (N > M, JOBZ='A') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in U and computing right singular * vectors of bidiagonal matrix in VT -* (Workspace: need M+BDSPAC) +* Workspace: need 3*M [e, tauq, taup] + BDSPAC * CALL SLASET( 'F', N, N, ZERO, ZERO, VT, LDVT ) CALL SBDSDC( 'L', 'I', M, S, WORK( IE ), U, LDU, VT, @@ -1388,20 +1505,21 @@ * Set the right corner of VT to identity matrix * IF( N.GT.M ) THEN - CALL SLASET( 'F', N-M, N-M, ZERO, ONE, VT( M+1, M+1 ), + CALL SLASET( 'F', N-M, N-M, ZERO, ONE, VT(M+1,M+1), $ LDVT ) END IF * * Overwrite U by left singular vectors of A and VT * by right singular vectors of A -* (Workspace: need 2*M+N, prefer 2*M+N*NB) +* Workspace: need 3*M [e, tauq, taup] + N [work] +* Workspace: prefer 3*M [e, tauq, taup] + N*NB [work] * CALL SORMBR( 'Q', 'L', 'N', M, M, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) CALL SORMBR( 'P', 'R', 'T', N, N, M, A, LDA, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), - $ LWORK-NWORK+1, IERR ) + $ LWORK - NWORK + 1, IERR ) END IF * END IF diff --git a/SRC/sgesvd.f b/SRC/sgesvd.f index 91b27685..6e6234b1 100644 --- a/SRC/sgesvd.f +++ b/SRC/sgesvd.f @@ -314,24 +314,24 @@ BDSPAC = 5*N * Compute space needed for SGEQRF CALL SGEQRF( M, N, A, LDA, DUM(1), DUM(1), -1, IERR ) - LWORK_SGEQRF=DUM(1) + LWORK_SGEQRF = INT( DUM(1) ) * Compute space needed for SORGQR CALL SORGQR( M, N, N, A, LDA, DUM(1), DUM(1), -1, IERR ) - LWORK_SORGQR_N=DUM(1) + LWORK_SORGQR_N = INT( DUM(1) ) CALL SORGQR( M, M, N, A, LDA, DUM(1), DUM(1), -1, IERR ) - LWORK_SORGQR_M=DUM(1) + LWORK_SORGQR_M = INT( DUM(1) ) * Compute space needed for SGEBRD CALL SGEBRD( N, N, A, LDA, S, DUM(1), DUM(1), $ DUM(1), DUM(1), -1, IERR ) - LWORK_SGEBRD=DUM(1) + LWORK_SGEBRD = INT( DUM(1) ) * Compute space needed for SORGBR P CALL SORGBR( 'P', N, N, N, A, LDA, DUM(1), $ DUM(1), -1, IERR ) - LWORK_SORGBR_P=DUM(1) + LWORK_SORGBR_P = INT( DUM(1) ) * Compute space needed for SORGBR Q CALL SORGBR( 'Q', N, N, N, A, LDA, DUM(1), $ DUM(1), -1, IERR ) - LWORK_SORGBR_Q=DUM(1) + LWORK_SORGBR_Q = INT( DUM(1) ) * IF( M.GE.MNTHR ) THEN IF( WNTUN ) THEN @@ -447,18 +447,18 @@ * CALL SGEBRD( M, N, A, LDA, S, DUM(1), DUM(1), $ DUM(1), DUM(1), -1, IERR ) - LWORK_SGEBRD=DUM(1) + LWORK_SGEBRD = INT( DUM(1) ) MAXWRK = 3*N + LWORK_SGEBRD IF( WNTUS .OR. WNTUO ) THEN CALL SORGBR( 'Q', M, N, N, A, LDA, DUM(1), $ DUM(1), -1, IERR ) - LWORK_SORGBR_Q=DUM(1) + LWORK_SORGBR_Q = INT( DUM(1) ) MAXWRK = MAX( MAXWRK, 3*N+LWORK_SORGBR_Q ) END IF IF( WNTUA ) THEN CALL SORGBR( 'Q', M, M, N, A, LDA, DUM(1), $ DUM(1), -1, IERR ) - LWORK_SORGBR_Q=DUM(1) + LWORK_SORGBR_Q = INT( DUM(1) ) MAXWRK = MAX( MAXWRK, 3*N+LWORK_SORGBR_Q ) END IF IF( .NOT.WNTVN ) THEN @@ -475,24 +475,24 @@ BDSPAC = 5*M * Compute space needed for SGELQF CALL SGELQF( M, N, A, LDA, DUM(1), DUM(1), -1, IERR ) - LWORK_SGELQF=DUM(1) + LWORK_SGELQF = INT( DUM(1) ) * Compute space needed for SORGLQ CALL SORGLQ( N, N, M, DUM(1), N, DUM(1), DUM(1), -1, IERR ) - LWORK_SORGLQ_N=DUM(1) + LWORK_SORGLQ_N = INT( DUM(1) ) CALL SORGLQ( M, N, M, A, LDA, DUM(1), DUM(1), -1, IERR ) - LWORK_SORGLQ_M=DUM(1) + LWORK_SORGLQ_M = INT( DUM(1) ) * Compute space needed for SGEBRD CALL SGEBRD( M, M, A, LDA, S, DUM(1), DUM(1), $ DUM(1), DUM(1), -1, IERR ) - LWORK_SGEBRD=DUM(1) + LWORK_SGEBRD = INT( DUM(1) ) * Compute space needed for SORGBR P CALL SORGBR( 'P', M, M, M, A, N, DUM(1), $ DUM(1), -1, IERR ) - LWORK_SORGBR_P=DUM(1) + LWORK_SORGBR_P = INT( DUM(1) ) * Compute space needed for SORGBR Q CALL SORGBR( 'Q', M, M, M, A, N, DUM(1), $ DUM(1), -1, IERR ) - LWORK_SORGBR_Q=DUM(1) + LWORK_SORGBR_Q = INT( DUM(1) ) IF( N.GE.MNTHR ) THEN IF( WNTVN ) THEN * @@ -608,19 +608,19 @@ * CALL SGEBRD( M, N, A, LDA, S, DUM(1), DUM(1), $ DUM(1), DUM(1), -1, IERR ) - LWORK_SGEBRD=DUM(1) + LWORK_SGEBRD = INT( DUM(1) ) MAXWRK = 3*M + LWORK_SGEBRD IF( WNTVS .OR. WNTVO ) THEN * Compute space needed for SORGBR P CALL SORGBR( 'P', M, N, M, A, N, DUM(1), $ DUM(1), -1, IERR ) - LWORK_SORGBR_P=DUM(1) + LWORK_SORGBR_P = INT( DUM(1) ) MAXWRK = MAX( MAXWRK, 3*M+LWORK_SORGBR_P ) END IF IF( WNTVA ) THEN CALL SORGBR( 'P', N, N, M, A, N, DUM(1), $ DUM(1), -1, IERR ) - LWORK_SORGBR_P=DUM(1) + LWORK_SORGBR_P = INT( DUM(1) ) MAXWRK = MAX( MAXWRK, 3*M+LWORK_SORGBR_P ) END IF IF( .NOT.WNTUN ) THEN diff --git a/SRC/zgesdd.f b/SRC/zgesdd.f index ea08dbc6..5e086a2a 100644 --- a/SRC/zgesdd.f +++ b/SRC/zgesdd.f @@ -18,8 +18,8 @@ * Definition: * =========== * -* SUBROUTINE ZGESDD( JOBZ, M, N, A, LDA, S, U, LDU, VT, LDVT, WORK, -* LWORK, RWORK, IWORK, INFO ) +* SUBROUTINE ZGESDD( JOBZ, M, N, A, LDA, S, U, LDU, VT, LDVT, +* WORK, LWORK, RWORK, IWORK, INFO ) * * .. Scalar Arguments .. * CHARACTER JOBZ @@ -135,8 +135,8 @@ *> \param[in] LDU *> \verbatim *> LDU is INTEGER -*> The leading dimension of the array U. LDU >= 1; if -*> JOBZ = 'S' or 'A' or JOBZ = 'O' and M < N, LDU >= M. +*> The leading dimension of the array U. LDU >= 1; +*> if JOBZ = 'S' or 'A' or JOBZ = 'O' and M < N, LDU >= M. *> \endverbatim *> *> \param[out] VT @@ -152,8 +152,8 @@ *> \param[in] LDVT *> \verbatim *> LDVT is INTEGER -*> The leading dimension of the array VT. LDVT >= 1; if -*> JOBZ = 'A' or JOBZ = 'O' and M >= N, LDVT >= N; +*> The leading dimension of the array VT. LDVT >= 1; +*> if JOBZ = 'A' or JOBZ = 'O' and M >= N, LDVT >= N; *> if JOBZ = 'S', LDVT >= min(M,N). *> \endverbatim *> @@ -167,24 +167,28 @@ *> \verbatim *> LWORK is INTEGER *> The dimension of the array WORK. LWORK >= 1. -*> if JOBZ = 'N', LWORK >= 2*min(M,N)+max(M,N). -*> if JOBZ = 'O', -*> LWORK >= 2*min(M,N)*min(M,N)+2*min(M,N)+max(M,N). -*> if JOBZ = 'S' or 'A', -*> LWORK >= min(M,N)*min(M,N)+2*min(M,N)+max(M,N). -*> For good performance, LWORK should generally be larger. -*> *> If LWORK = -1, a workspace query is assumed. The optimal *> size for the WORK array is calculated and stored in WORK(1), *> and no other work except argument checking is performed. +*> +*> Let mx = max(M,N) and mn = min(M,N). +*> If JOBZ = 'N', LWORK >= 2*mn + mx. +*> If JOBZ = 'O', LWORK >= 2*mn*mn + 2*mn + mx. +*> If JOBZ = 'S', LWORK >= mn*mn + 3*mn. +*> If JOBZ = 'A', LWORK >= mn*mn + 2*mn + mx. +*> These are not tight minimums in all cases; see comments inside code. +*> For good performance, LWORK should generally be larger; +*> a query is recommended. *> \endverbatim *> *> \param[out] RWORK *> \verbatim *> RWORK is DOUBLE PRECISION array, dimension (MAX(1,LRWORK)) -*> If JOBZ = 'N', LRWORK >= 7*min(M,N). -*> Otherwise, -*> LRWORK >= min(M,N)*max(5*min(M,N)+7,2*max(M,N)+2*min(M,N)+1) +*> Let mx = max(M,N) and mn = min(M,N). +*> If JOBZ = 'N', LRWORK >= 5*mn (LAPACK <= 3.6 needs 7*mn); +*> else if mx >> mn, LRWORK >= 5*mn*mn + 5*mn; +*> else LRWORK >= max( 5*mn*mn + 5*mn, +*> 2*mx*mn + 2*mn*mn + mn ). *> \endverbatim *> *> \param[out] IWORK @@ -218,9 +222,11 @@ *> Ming Gu and Huan Ren, Computer Science Division, University of *> California at Berkeley, USA *> +*> @precisions fortran z -> c * ===================================================================== - SUBROUTINE ZGESDD( JOBZ, M, N, A, LDA, S, U, LDU, VT, LDVT, WORK, - $ LWORK, RWORK, IWORK, INFO ) + SUBROUTINE ZGESDD( JOBZ, M, N, A, LDA, S, U, LDU, VT, LDVT, + $ WORK, LWORK, RWORK, IWORK, INFO ) + implicit none * * -- LAPACK driver routine (version 3.6.0) -- * -- LAPACK is a software package provided by Univ. of Tennessee, -- @@ -241,8 +247,6 @@ * ===================================================================== * * .. Parameters .. - INTEGER LQUERV - PARAMETER ( LQUERV = -1 ) COMPLEX*16 CZERO, CONE PARAMETER ( CZERO = ( 0.0D+0, 0.0D+0 ), $ CONE = ( 1.0D+0, 0.0D+0 ) ) @@ -250,16 +254,27 @@ PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 ) * .. * .. Local Scalars .. - LOGICAL WNTQA, WNTQAS, WNTQN, WNTQO, WNTQS + LOGICAL LQUERY, WNTQA, WNTQAS, WNTQN, WNTQO, WNTQS INTEGER BLK, CHUNK, I, IE, IERR, IL, IR, IRU, IRVT, $ ISCL, ITAU, ITAUP, ITAUQ, IU, IVT, LDWKVT, $ LDWRKL, LDWRKR, LDWRKU, MAXWRK, MINMN, MINWRK, $ MNTHR1, MNTHR2, NRWORK, NWORK, WRKBL + INTEGER LWORK_ZGEBRD_MN, LWORK_ZGEBRD_MM, + $ LWORK_ZGEBRD_NN, LWORK_ZGELQF_MN, + $ LWORK_ZGEQRF_MN, + $ LWORK_ZUNGBR_P_MN, LWORK_ZUNGBR_P_NN, + $ LWORK_ZUNGBR_Q_MN, LWORK_ZUNGBR_Q_MM, + $ LWORK_ZUNGLQ_MN, LWORK_ZUNGLQ_NN, + $ LWORK_ZUNGQR_MM, LWORK_ZUNGQR_MN, + $ LWORK_ZUNMBR_PRC_MM, LWORK_ZUNMBR_QLN_MM, + $ LWORK_ZUNMBR_PRC_MN, LWORK_ZUNMBR_QLN_MN, + $ LWORK_ZUNMBR_PRC_NN, LWORK_ZUNMBR_QLN_NN DOUBLE PRECISION ANRM, BIGNUM, EPS, SMLNUM * .. * .. Local Arrays .. INTEGER IDUM( 1 ) DOUBLE PRECISION DUM( 1 ) + COMPLEX*16 CDUM( 1 ) * .. * .. External Subroutines .. EXTERNAL DBDSDC, DLASCL, XERBLA, ZGEBRD, ZGELQF, ZGEMM, @@ -268,9 +283,8 @@ * .. * .. External Functions .. LOGICAL LSAME - INTEGER ILAENV DOUBLE PRECISION DLAMCH, ZLANGE - EXTERNAL LSAME, ILAENV, DLAMCH, ZLANGE + EXTERNAL LSAME, DLAMCH, ZLANGE * .. * .. Intrinsic Functions .. INTRINSIC INT, MAX, MIN, SQRT @@ -279,15 +293,16 @@ * * Test the input arguments * - INFO = 0 - MINMN = MIN( M, N ) + INFO = 0 + MINMN = MIN( M, N ) MNTHR1 = INT( MINMN*17.0D0 / 9.0D0 ) MNTHR2 = INT( MINMN*5.0D0 / 3.0D0 ) - WNTQA = LSAME( JOBZ, 'A' ) - WNTQS = LSAME( JOBZ, 'S' ) + WNTQA = LSAME( JOBZ, 'A' ) + WNTQS = LSAME( JOBZ, 'S' ) WNTQAS = WNTQA .OR. WNTQS - WNTQO = LSAME( JOBZ, 'O' ) - WNTQN = LSAME( JOBZ, 'N' ) + WNTQO = LSAME( JOBZ, 'O' ) + WNTQN = LSAME( JOBZ, 'N' ) + LQUERY = ( LWORK.EQ.-1 ) MINWRK = 1 MAXWRK = 1 * @@ -309,8 +324,8 @@ END IF * * Compute workspace -* (Note: Comments in the code beginning "Workspace:" describe the -* minimal amount of workspace needed at that point in the code, +* Note: Comments in the code beginning "Workspace:" describe the +* minimal amount of workspace allocated at that point in the code, * as well as the preferred amount for good performance. * CWorkspace refers to complex workspace, and RWorkspace to * real workspace. NB refers to the optimal block size for the @@ -320,233 +335,283 @@ IF( M.GE.N ) THEN * * There is no complex work space needed for bidiagonal SVD -* The real work space needed for bidiagonal SVD is BDSPAC -* for computing singular values and singular vectors; BDSPAN -* for computing singular values only. -* BDSPAC = 5*N*N + 7*N -* BDSPAN = MAX(7*N+4, 3*N+2+SMLSIZ*(SMLSIZ+8)) +* The real work space needed for bidiagonal SVD (dbdsdc) is +* BDSPAC = 3*N*N + 4*N for singular values and vectors; +* BDSPAC = 4*N for singular values only; +* not including e, RU, and RVT matrices. +* +* Compute space preferred for each routine + CALL ZGEBRD( M, N, CDUM(1), M, CDUM(1), DUM(1), CDUM(1), + $ CDUM(1), CDUM(1), -1, IERR ) + LWORK_ZGEBRD_MN = INT( CDUM(1) ) +* + CALL ZGEBRD( N, N, CDUM(1), N, CDUM(1), DUM(1), CDUM(1), + $ CDUM(1), CDUM(1), -1, IERR ) + LWORK_ZGEBRD_NN = INT( CDUM(1) ) +* + CALL ZGEQRF( M, N, CDUM(1), M, CDUM(1), CDUM(1), -1, IERR ) + LWORK_ZGEQRF_MN = INT( CDUM(1) ) +* + CALL ZUNGBR( 'P', N, N, N, CDUM(1), N, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_ZUNGBR_P_NN = INT( CDUM(1) ) +* + CALL ZUNGBR( 'Q', M, M, N, CDUM(1), M, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_ZUNGBR_Q_MM = INT( CDUM(1) ) +* + CALL ZUNGBR( 'Q', M, N, N, CDUM(1), M, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_ZUNGBR_Q_MN = INT( CDUM(1) ) +* + CALL ZUNGQR( M, M, N, CDUM(1), M, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_ZUNGQR_MM = INT( CDUM(1) ) +* + CALL ZUNGQR( M, N, N, CDUM(1), M, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_ZUNGQR_MN = INT( CDUM(1) ) +* + CALL ZUNMBR( 'P', 'R', 'C', N, N, N, CDUM(1), N, CDUM(1), + $ CDUM(1), N, CDUM(1), -1, IERR ) + LWORK_ZUNMBR_PRC_NN = INT( CDUM(1) ) +* + CALL ZUNMBR( 'Q', 'L', 'N', M, M, N, CDUM(1), M, CDUM(1), + $ CDUM(1), M, CDUM(1), -1, IERR ) + LWORK_ZUNMBR_QLN_MM = INT( CDUM(1) ) +* + CALL ZUNMBR( 'Q', 'L', 'N', M, N, N, CDUM(1), M, CDUM(1), + $ CDUM(1), M, CDUM(1), -1, IERR ) + LWORK_ZUNMBR_QLN_MN = INT( CDUM(1) ) +* + CALL ZUNMBR( 'Q', 'L', 'N', N, N, N, CDUM(1), N, CDUM(1), + $ CDUM(1), N, CDUM(1), -1, IERR ) + LWORK_ZUNMBR_QLN_NN = INT( CDUM(1) ) * IF( M.GE.MNTHR1 ) THEN IF( WNTQN ) THEN * -* Path 1 (M much larger than N, JOBZ='N') +* Path 1 (M >> N, JOBZ='N') * - MAXWRK = N + N*ILAENV( 1, 'ZGEQRF', ' ', M, N, -1, - $ -1 ) - MAXWRK = MAX( MAXWRK, 2*N+2*N* - $ ILAENV( 1, 'ZGEBRD', ' ', N, N, -1, -1 ) ) + MAXWRK = N + LWORK_ZGEQRF_MN + MAXWRK = MAX( MAXWRK, 2*N + LWORK_ZGEBRD_NN ) MINWRK = 3*N ELSE IF( WNTQO ) THEN * -* Path 2 (M much larger than N, JOBZ='O') -* - WRKBL = N + N*ILAENV( 1, 'ZGEQRF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, N+N*ILAENV( 1, 'ZUNGQR', ' ', M, - $ N, N, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+2*N* - $ ILAENV( 1, 'ZGEBRD', ' ', N, N, -1, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+N* - $ ILAENV( 1, 'ZUNMBR', 'QLN', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+N* - $ ILAENV( 1, 'ZUNMBR', 'PRC', N, N, N, -1 ) ) +* Path 2 (M >> N, JOBZ='O') +* + WRKBL = N + LWORK_ZGEQRF_MN + WRKBL = MAX( WRKBL, N + LWORK_ZUNGQR_MN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_ZGEBRD_NN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_ZUNMBR_QLN_NN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_ZUNMBR_PRC_NN ) MAXWRK = M*N + N*N + WRKBL MINWRK = 2*N*N + 3*N ELSE IF( WNTQS ) THEN * -* Path 3 (M much larger than N, JOBZ='S') -* - WRKBL = N + N*ILAENV( 1, 'ZGEQRF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, N+N*ILAENV( 1, 'ZUNGQR', ' ', M, - $ N, N, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+2*N* - $ ILAENV( 1, 'ZGEBRD', ' ', N, N, -1, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+N* - $ ILAENV( 1, 'ZUNMBR', 'QLN', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+N* - $ ILAENV( 1, 'ZUNMBR', 'PRC', N, N, N, -1 ) ) +* Path 3 (M >> N, JOBZ='S') +* + WRKBL = N + LWORK_ZGEQRF_MN + WRKBL = MAX( WRKBL, N + LWORK_ZUNGQR_MN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_ZGEBRD_NN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_ZUNMBR_QLN_NN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_ZUNMBR_PRC_NN ) MAXWRK = N*N + WRKBL MINWRK = N*N + 3*N ELSE IF( WNTQA ) THEN * -* Path 4 (M much larger than N, JOBZ='A') -* - WRKBL = N + N*ILAENV( 1, 'ZGEQRF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, N+M*ILAENV( 1, 'ZUNGQR', ' ', M, - $ M, N, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+2*N* - $ ILAENV( 1, 'ZGEBRD', ' ', N, N, -1, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+N* - $ ILAENV( 1, 'ZUNMBR', 'QLN', N, N, N, -1 ) ) - WRKBL = MAX( WRKBL, 2*N+N* - $ ILAENV( 1, 'ZUNMBR', 'PRC', N, N, N, -1 ) ) +* Path 4 (M >> N, JOBZ='A') +* + WRKBL = N + LWORK_ZGEQRF_MN + WRKBL = MAX( WRKBL, N + LWORK_ZUNGQR_MM ) + WRKBL = MAX( WRKBL, 2*N + LWORK_ZGEBRD_NN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_ZUNMBR_QLN_NN ) + WRKBL = MAX( WRKBL, 2*N + LWORK_ZUNMBR_PRC_NN ) MAXWRK = N*N + WRKBL - MINWRK = N*N + 2*N + M + MINWRK = N*N + MAX( 3*N, N + M ) END IF ELSE IF( M.GE.MNTHR2 ) THEN * -* Path 5 (M much larger than N, but not as much as MNTHR1) +* Path 5 (M >> N, but not as much as MNTHR1) * - MAXWRK = 2*N + ( M+N )*ILAENV( 1, 'ZGEBRD', ' ', M, N, - $ -1, -1 ) + MAXWRK = 2*N + LWORK_ZGEBRD_MN MINWRK = 2*N + M IF( WNTQO ) THEN - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'ZUNGBR', 'P', N, N, N, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'ZUNGBR', 'Q', M, N, N, -1 ) ) +* Path 5o (M >> N, JOBZ='O') + MAXWRK = MAX( MAXWRK, 2*N + LWORK_ZUNGBR_P_NN ) + MAXWRK = MAX( MAXWRK, 2*N + LWORK_ZUNGBR_Q_MN ) MAXWRK = MAXWRK + M*N MINWRK = MINWRK + N*N ELSE IF( WNTQS ) THEN - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'ZUNGBR', 'P', N, N, N, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'ZUNGBR', 'Q', M, N, N, -1 ) ) +* Path 5s (M >> N, JOBZ='S') + MAXWRK = MAX( MAXWRK, 2*N + LWORK_ZUNGBR_P_NN ) + MAXWRK = MAX( MAXWRK, 2*N + LWORK_ZUNGBR_Q_MN ) ELSE IF( WNTQA ) THEN - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'ZUNGBR', 'P', N, N, N, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*N+M* - $ ILAENV( 1, 'ZUNGBR', 'Q', M, M, N, -1 ) ) +* Path 5a (M >> N, JOBZ='A') + MAXWRK = MAX( MAXWRK, 2*N + LWORK_ZUNGBR_P_NN ) + MAXWRK = MAX( MAXWRK, 2*N + LWORK_ZUNGBR_Q_MM ) END IF ELSE * -* Path 6 (M at least N, but not much larger) +* Path 6 (M >= N, but not much larger) * - MAXWRK = 2*N + ( M+N )*ILAENV( 1, 'ZGEBRD', ' ', M, N, - $ -1, -1 ) + MAXWRK = 2*N + LWORK_ZGEBRD_MN MINWRK = 2*N + M IF( WNTQO ) THEN - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'ZUNMBR', 'PRC', N, N, N, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'ZUNMBR', 'QLN', M, N, N, -1 ) ) +* Path 6o (M >= N, JOBZ='O') + MAXWRK = MAX( MAXWRK, 2*N + LWORK_ZUNMBR_PRC_NN ) + MAXWRK = MAX( MAXWRK, 2*N + LWORK_ZUNMBR_QLN_MN ) MAXWRK = MAXWRK + M*N MINWRK = MINWRK + N*N ELSE IF( WNTQS ) THEN - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'ZUNMBR', 'PRC', N, N, N, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'ZUNMBR', 'QLN', M, N, N, -1 ) ) +* Path 6s (M >= N, JOBZ='S') + MAXWRK = MAX( MAXWRK, 2*N + LWORK_ZUNMBR_QLN_MN ) + MAXWRK = MAX( MAXWRK, 2*N + LWORK_ZUNMBR_PRC_NN ) ELSE IF( WNTQA ) THEN - MAXWRK = MAX( MAXWRK, 2*N+N* - $ ILAENV( 1, 'ZUNGBR', 'PRC', N, N, N, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*N+M* - $ ILAENV( 1, 'ZUNGBR', 'QLN', M, M, N, -1 ) ) +* Path 6a (M >= N, JOBZ='A') + MAXWRK = MAX( MAXWRK, 2*N + LWORK_ZUNMBR_QLN_MM ) + MAXWRK = MAX( MAXWRK, 2*N + LWORK_ZUNMBR_PRC_NN ) END IF END IF ELSE * * There is no complex work space needed for bidiagonal SVD -* The real work space needed for bidiagonal SVD is BDSPAC -* for computing singular values and singular vectors; BDSPAN -* for computing singular values only. -* BDSPAC = 5*M*M + 7*M -* BDSPAN = MAX(7*M+4, 3*M+2+SMLSIZ*(SMLSIZ+8)) +* The real work space needed for bidiagonal SVD (dbdsdc) is +* BDSPAC = 3*M*M + 4*M for singular values and vectors; +* BDSPAC = 4*M for singular values only; +* not including e, RU, and RVT matrices. +* +* Compute space preferred for each routine + CALL ZGEBRD( M, N, CDUM(1), M, CDUM(1), DUM(1), CDUM(1), + $ CDUM(1), CDUM(1), -1, IERR ) + LWORK_ZGEBRD_MN = INT( CDUM(1) ) +* + CALL ZGEBRD( M, M, CDUM(1), M, CDUM(1), DUM(1), CDUM(1), + $ CDUM(1), CDUM(1), -1, IERR ) + LWORK_ZGEBRD_MM = INT( CDUM(1) ) +* + CALL ZGELQF( M, N, CDUM(1), M, CDUM(1), CDUM(1), -1, IERR ) + LWORK_ZGELQF_MN = INT( CDUM(1) ) +* + CALL ZUNGBR( 'P', M, N, M, CDUM(1), M, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_ZUNGBR_P_MN = INT( CDUM(1) ) +* + CALL ZUNGBR( 'P', N, N, M, CDUM(1), N, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_ZUNGBR_P_NN = INT( CDUM(1) ) +* + CALL ZUNGBR( 'Q', M, M, N, CDUM(1), M, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_ZUNGBR_Q_MM = INT( CDUM(1) ) +* + CALL ZUNGLQ( M, N, M, CDUM(1), M, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_ZUNGLQ_MN = INT( CDUM(1) ) +* + CALL ZUNGLQ( N, N, M, CDUM(1), N, CDUM(1), CDUM(1), + $ -1, IERR ) + LWORK_ZUNGLQ_NN = INT( CDUM(1) ) +* + CALL ZUNMBR( 'P', 'R', 'C', M, M, M, CDUM(1), M, CDUM(1), + $ CDUM(1), M, CDUM(1), -1, IERR ) + LWORK_ZUNMBR_PRC_MM = INT( CDUM(1) ) +* + CALL ZUNMBR( 'P', 'R', 'C', M, N, M, CDUM(1), M, CDUM(1), + $ CDUM(1), M, CDUM(1), -1, IERR ) + LWORK_ZUNMBR_PRC_MN = INT( CDUM(1) ) +* + CALL ZUNMBR( 'P', 'R', 'C', N, N, M, CDUM(1), N, CDUM(1), + $ CDUM(1), N, CDUM(1), -1, IERR ) + LWORK_ZUNMBR_PRC_NN = INT( CDUM(1) ) +* + CALL ZUNMBR( 'Q', 'L', 'N', M, M, M, CDUM(1), M, CDUM(1), + $ CDUM(1), M, CDUM(1), -1, IERR ) + LWORK_ZUNMBR_QLN_MM = INT( CDUM(1) ) * IF( N.GE.MNTHR1 ) THEN IF( WNTQN ) THEN * -* Path 1t (N much larger than M, JOBZ='N') +* Path 1t (N >> M, JOBZ='N') * - MAXWRK = M + M*ILAENV( 1, 'ZGELQF', ' ', M, N, -1, - $ -1 ) - MAXWRK = MAX( MAXWRK, 2*M+2*M* - $ ILAENV( 1, 'ZGEBRD', ' ', M, M, -1, -1 ) ) + MAXWRK = M + LWORK_ZGELQF_MN + MAXWRK = MAX( MAXWRK, 2*M + LWORK_ZGEBRD_MM ) MINWRK = 3*M ELSE IF( WNTQO ) THEN * -* Path 2t (N much larger than M, JOBZ='O') -* - WRKBL = M + M*ILAENV( 1, 'ZGELQF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, M+M*ILAENV( 1, 'ZUNGLQ', ' ', M, - $ N, M, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+2*M* - $ ILAENV( 1, 'ZGEBRD', ' ', M, M, -1, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+M* - $ ILAENV( 1, 'ZUNMBR', 'PRC', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+M* - $ ILAENV( 1, 'ZUNMBR', 'QLN', M, M, M, -1 ) ) +* Path 2t (N >> M, JOBZ='O') +* + WRKBL = M + LWORK_ZGELQF_MN + WRKBL = MAX( WRKBL, M + LWORK_ZUNGLQ_MN ) + WRKBL = MAX( WRKBL, 2*M + LWORK_ZGEBRD_MM ) + WRKBL = MAX( WRKBL, 2*M + LWORK_ZUNMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 2*M + LWORK_ZUNMBR_PRC_MM ) MAXWRK = M*N + M*M + WRKBL MINWRK = 2*M*M + 3*M ELSE IF( WNTQS ) THEN * -* Path 3t (N much larger than M, JOBZ='S') -* - WRKBL = M + M*ILAENV( 1, 'ZGELQF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, M+M*ILAENV( 1, 'ZUNGLQ', ' ', M, - $ N, M, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+2*M* - $ ILAENV( 1, 'ZGEBRD', ' ', M, M, -1, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+M* - $ ILAENV( 1, 'ZUNMBR', 'PRC', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+M* - $ ILAENV( 1, 'ZUNMBR', 'QLN', M, M, M, -1 ) ) +* Path 3t (N >> M, JOBZ='S') +* + WRKBL = M + LWORK_ZGELQF_MN + WRKBL = MAX( WRKBL, M + LWORK_ZUNGLQ_MN ) + WRKBL = MAX( WRKBL, 2*M + LWORK_ZGEBRD_MM ) + WRKBL = MAX( WRKBL, 2*M + LWORK_ZUNMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 2*M + LWORK_ZUNMBR_PRC_MM ) MAXWRK = M*M + WRKBL MINWRK = M*M + 3*M ELSE IF( WNTQA ) THEN * -* Path 4t (N much larger than M, JOBZ='A') -* - WRKBL = M + M*ILAENV( 1, 'ZGELQF', ' ', M, N, -1, -1 ) - WRKBL = MAX( WRKBL, M+N*ILAENV( 1, 'ZUNGLQ', ' ', N, - $ N, M, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+2*M* - $ ILAENV( 1, 'ZGEBRD', ' ', M, M, -1, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+M* - $ ILAENV( 1, 'ZUNMBR', 'PRC', M, M, M, -1 ) ) - WRKBL = MAX( WRKBL, 2*M+M* - $ ILAENV( 1, 'ZUNMBR', 'QLN', M, M, M, -1 ) ) +* Path 4t (N >> M, JOBZ='A') +* + WRKBL = M + LWORK_ZGELQF_MN + WRKBL = MAX( WRKBL, M + LWORK_ZUNGLQ_NN ) + WRKBL = MAX( WRKBL, 2*M + LWORK_ZGEBRD_MM ) + WRKBL = MAX( WRKBL, 2*M + LWORK_ZUNMBR_QLN_MM ) + WRKBL = MAX( WRKBL, 2*M + LWORK_ZUNMBR_PRC_MM ) MAXWRK = M*M + WRKBL - MINWRK = M*M + 2*M + N + MINWRK = M*M + MAX( 3*M, M + N ) END IF ELSE IF( N.GE.MNTHR2 ) THEN * -* Path 5t (N much larger than M, but not as much as MNTHR1) +* Path 5t (N >> M, but not as much as MNTHR1) * - MAXWRK = 2*M + ( M+N )*ILAENV( 1, 'ZGEBRD', ' ', M, N, - $ -1, -1 ) + MAXWRK = 2*M + LWORK_ZGEBRD_MN MINWRK = 2*M + N IF( WNTQO ) THEN - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'ZUNGBR', 'P', M, N, M, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'ZUNGBR', 'Q', M, M, N, -1 ) ) +* Path 5to (N >> M, JOBZ='O') + MAXWRK = MAX( MAXWRK, 2*M + LWORK_ZUNGBR_Q_MM ) + MAXWRK = MAX( MAXWRK, 2*M + LWORK_ZUNGBR_P_MN ) MAXWRK = MAXWRK + M*N MINWRK = MINWRK + M*M ELSE IF( WNTQS ) THEN - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'ZUNGBR', 'P', M, N, M, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'ZUNGBR', 'Q', M, M, N, -1 ) ) +* Path 5ts (N >> M, JOBZ='S') + MAXWRK = MAX( MAXWRK, 2*M + LWORK_ZUNGBR_Q_MM ) + MAXWRK = MAX( MAXWRK, 2*M + LWORK_ZUNGBR_P_MN ) ELSE IF( WNTQA ) THEN - MAXWRK = MAX( MAXWRK, 2*M+N* - $ ILAENV( 1, 'ZUNGBR', 'P', N, N, M, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'ZUNGBR', 'Q', M, M, N, -1 ) ) +* Path 5ta (N >> M, JOBZ='A') + MAXWRK = MAX( MAXWRK, 2*M + LWORK_ZUNGBR_Q_MM ) + MAXWRK = MAX( MAXWRK, 2*M + LWORK_ZUNGBR_P_NN ) END IF ELSE * -* Path 6t (N greater than M, but not much larger) +* Path 6t (N > M, but not much larger) * - MAXWRK = 2*M + ( M+N )*ILAENV( 1, 'ZGEBRD', ' ', M, N, - $ -1, -1 ) + MAXWRK = 2*M + LWORK_ZGEBRD_MN MINWRK = 2*M + N IF( WNTQO ) THEN - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'ZUNMBR', 'PRC', M, N, M, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'ZUNMBR', 'QLN', M, M, N, -1 ) ) +* Path 6to (N > M, JOBZ='O') + MAXWRK = MAX( MAXWRK, 2*M + LWORK_ZUNMBR_QLN_MM ) + MAXWRK = MAX( MAXWRK, 2*M + LWORK_ZUNMBR_PRC_MN ) MAXWRK = MAXWRK + M*N MINWRK = MINWRK + M*M ELSE IF( WNTQS ) THEN - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'ZUNGBR', 'PRC', M, N, M, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'ZUNGBR', 'QLN', M, M, N, -1 ) ) +* Path 6ts (N > M, JOBZ='S') + MAXWRK = MAX( MAXWRK, 2*M + LWORK_ZUNMBR_QLN_MM ) + MAXWRK = MAX( MAXWRK, 2*M + LWORK_ZUNMBR_PRC_MN ) ELSE IF( WNTQA ) THEN - MAXWRK = MAX( MAXWRK, 2*M+N* - $ ILAENV( 1, 'ZUNGBR', 'PRC', N, N, M, -1 ) ) - MAXWRK = MAX( MAXWRK, 2*M+M* - $ ILAENV( 1, 'ZUNGBR', 'QLN', M, M, N, -1 ) ) +* Path 6ta (N > M, JOBZ='A') + MAXWRK = MAX( MAXWRK, 2*M + LWORK_ZUNMBR_QLN_MM ) + MAXWRK = MAX( MAXWRK, 2*M + LWORK_ZUNMBR_PRC_NN ) END IF END IF END IF @@ -554,18 +619,20 @@ END IF IF( INFO.EQ.0 ) THEN WORK( 1 ) = MAXWRK - IF( LWORK.LT.MINWRK .AND. LWORK.NE.LQUERV ) - $ INFO = -13 + IF( LWORK.LT.MINWRK .AND. .NOT. LQUERY ) THEN + INFO = -12 + END IF END IF -* -* Quick returns * IF( INFO.NE.0 ) THEN CALL XERBLA( 'ZGESDD', -INFO ) RETURN + ELSE IF( LQUERY ) THEN + RETURN END IF - IF( LWORK.EQ.LQUERV ) - $ RETURN +* +* Quick return if possible +* IF( M.EQ.0 .OR. N.EQ.0 ) THEN RETURN END IF @@ -598,15 +665,16 @@ * IF( WNTQN ) THEN * -* Path 1 (M much larger than N, JOBZ='N') +* Path 1 (M >> N, JOBZ='N') * No singular vectors to be computed * ITAU = 1 NWORK = ITAU + N * * Compute A=Q*R -* (CWorkspace: need 2*N, prefer N+N*NB) -* (RWorkspace: need 0) +* CWorkspace: need N [tau] + N [work] +* CWorkspace: prefer N [tau] + N*NB [work] +* RWorkspace: need 0 * CALL ZGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) @@ -621,8 +689,9 @@ NWORK = ITAUP + N * * Bidiagonalize R in A -* (CWorkspace: need 3*N, prefer 2*N+2*N*NB) -* (RWorkspace: need N) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + 2*N*NB [work] +* RWorkspace: need N [e] * CALL ZGEBRD( N, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -630,15 +699,15 @@ NRWORK = IE + N * * Perform bidiagonal SVD, compute singular values only -* (CWorkspace: 0) -* (RWorkspace: need BDSPAN) +* CWorkspace: need 0 +* RWorkspace: need N [e] + BDSPAC * - CALL DBDSDC( 'U', 'N', N, S, RWORK( IE ), DUM, 1, DUM, 1, + CALL DBDSDC( 'U', 'N', N, S, RWORK( IE ), DUM,1,DUM,1, $ DUM, IDUM, RWORK( NRWORK ), IWORK, INFO ) * ELSE IF( WNTQO ) THEN * -* Path 2 (M much larger than N, JOBZ='O') +* Path 2 (M >> N, JOBZ='O') * N left singular vectors to be overwritten on A and * N right singular vectors to be computed in VT * @@ -648,20 +717,21 @@ * LDWRKU = N IR = IU + LDWRKU*N - IF( LWORK.GE.M*N+N*N+3*N ) THEN + IF( LWORK .GE. M*N + N*N + 3*N ) THEN * * WORK(IR) is M by N * LDWRKR = M ELSE - LDWRKR = ( LWORK-N*N-3*N ) / N + LDWRKR = ( LWORK - N*N - 3*N ) / N END IF ITAU = IR + LDWRKR*N NWORK = ITAU + N * * Compute A=Q*R -* (CWorkspace: need N*N+2*N, prefer M*N+N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + N*N [R] + N [tau] + N [work] +* CWorkspace: prefer N*N [U] + N*N [R] + N [tau] + N*NB [work] +* RWorkspace: need 0 * CALL ZGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) @@ -673,8 +743,9 @@ $ LDWRKR ) * * Generate Q in A -* (CWorkspace: need 2*N, prefer N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + N*N [R] + N [tau] + N [work] +* CWorkspace: prefer N*N [U] + N*N [R] + N [tau] + N*NB [work] +* RWorkspace: need 0 * CALL ZUNGQR( M, N, N, A, LDA, WORK( ITAU ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) @@ -684,8 +755,9 @@ NWORK = ITAUP + N * * Bidiagonalize R in WORK(IR) -* (CWorkspace: need N*N+3*N, prefer M*N+2*N+2*N*NB) -* (RWorkspace: need N) +* CWorkspace: need N*N [U] + N*N [R] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [U] + N*N [R] + 2*N [tauq, taup] + 2*N*NB [work] +* RWorkspace: need N [e] * CALL ZGEBRD( N, N, WORK( IR ), LDWRKR, S, RWORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), @@ -694,8 +766,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of R in WORK(IRU) and computing right singular vectors * of R in WORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * IRU = IE + N IRVT = IRU + N*N @@ -706,8 +778,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix WORK(IU) * Overwrite WORK(IU) by the left singular vectors of R -* (CWorkspace: need 2*N*N+3*N, prefer M*N+N*N+2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + N*N [R] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [U] + N*N [R] + 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL ZLACP2( 'F', N, N, RWORK( IRU ), N, WORK( IU ), $ LDWRKU ) @@ -717,8 +790,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by the right singular vectors of R -* (CWorkspace: need N*N+3*N, prefer M*N+2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + N*N [R] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [U] + N*N [R] + 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL ZLACP2( 'F', N, N, RWORK( IRVT ), N, VT, LDVT ) CALL ZUNMBR( 'P', 'R', 'C', N, N, N, WORK( IR ), LDWRKR, @@ -727,8 +801,9 @@ * * Multiply Q in A by left singular vectors of R in * WORK(IU), storing result in WORK(IR) and copying to A -* (CWorkspace: need 2*N*N, prefer N*N+M*N) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + N*N [R] +* CWorkspace: prefer N*N [U] + M*N [R] +* RWorkspace: need 0 * DO 10 I = 1, M, LDWRKR CHUNK = MIN( M-I+1, LDWRKR ) @@ -741,7 +816,7 @@ * ELSE IF( WNTQS ) THEN * -* Path 3 (M much larger than N, JOBZ='S') +* Path 3 (M >> N, JOBZ='S') * N left singular vectors to be computed in U and * N right singular vectors to be computed in VT * @@ -754,8 +829,9 @@ NWORK = ITAU + N * * Compute A=Q*R -* (CWorkspace: need N*N+2*N, prefer N*N+N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [R] + N [tau] + N [work] +* CWorkspace: prefer N*N [R] + N [tau] + N*NB [work] +* RWorkspace: need 0 * CALL ZGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) @@ -767,8 +843,9 @@ $ LDWRKR ) * * Generate Q in A -* (CWorkspace: need 2*N, prefer N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [R] + N [tau] + N [work] +* CWorkspace: prefer N*N [R] + N [tau] + N*NB [work] +* RWorkspace: need 0 * CALL ZUNGQR( M, N, N, A, LDA, WORK( ITAU ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) @@ -778,8 +855,9 @@ NWORK = ITAUP + N * * Bidiagonalize R in WORK(IR) -* (CWorkspace: need N*N+3*N, prefer N*N+2*N+2*N*NB) -* (RWorkspace: need N) +* CWorkspace: need N*N [R] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [R] + 2*N [tauq, taup] + 2*N*NB [work] +* RWorkspace: need N [e] * CALL ZGEBRD( N, N, WORK( IR ), LDWRKR, S, RWORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), @@ -788,8 +866,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * IRU = IE + N IRVT = IRU + N*N @@ -800,8 +878,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of R -* (CWorkspace: need N*N+3*N, prefer N*N+2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [R] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [R] + 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL ZLACP2( 'F', N, N, RWORK( IRU ), N, U, LDU ) CALL ZUNMBR( 'Q', 'L', 'N', N, N, N, WORK( IR ), LDWRKR, @@ -810,8 +889,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by right singular vectors of R -* (CWorkspace: need N*N+3*N, prefer N*N+2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [R] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [R] + 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL ZLACP2( 'F', N, N, RWORK( IRVT ), N, VT, LDVT ) CALL ZUNMBR( 'P', 'R', 'C', N, N, N, WORK( IR ), LDWRKR, @@ -820,8 +900,8 @@ * * Multiply Q in A by left singular vectors of R in * WORK(IR), storing result in U -* (CWorkspace: need N*N) -* (RWorkspace: 0) +* CWorkspace: need N*N [R] +* RWorkspace: need 0 * CALL ZLACPY( 'F', N, N, U, LDU, WORK( IR ), LDWRKR ) CALL ZGEMM( 'N', 'N', M, N, N, CONE, A, LDA, WORK( IR ), @@ -829,7 +909,7 @@ * ELSE IF( WNTQA ) THEN * -* Path 4 (M much larger than N, JOBZ='A') +* Path 4 (M >> N, JOBZ='A') * M left singular vectors to be computed in U and * N right singular vectors to be computed in VT * @@ -842,16 +922,18 @@ NWORK = ITAU + N * * Compute A=Q*R, copying result to U -* (CWorkspace: need 2*N, prefer N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + N [tau] + N [work] +* CWorkspace: prefer N*N [U] + N [tau] + N*NB [work] +* RWorkspace: need 0 * CALL ZGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) CALL ZLACPY( 'L', M, N, A, LDA, U, LDU ) * * Generate Q in U -* (CWorkspace: need N+M, prefer N+M*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + N [tau] + M [work] +* CWorkspace: prefer N*N [U] + N [tau] + M*NB [work] +* RWorkspace: need 0 * CALL ZUNGQR( M, M, N, U, LDU, WORK( ITAU ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) @@ -866,8 +948,9 @@ NWORK = ITAUP + N * * Bidiagonalize R in A -* (CWorkspace: need 3*N, prefer 2*N+2*N*NB) -* (RWorkspace: need N) +* CWorkspace: need N*N [U] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [U] + 2*N [tauq, taup] + 2*N*NB [work] +* RWorkspace: need N [e] * CALL ZGEBRD( N, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -879,8 +962,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * CALL DBDSDC( 'U', 'I', N, S, RWORK( IE ), RWORK( IRU ), $ N, RWORK( IRVT ), N, DUM, IDUM, @@ -888,8 +971,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix WORK(IU) * Overwrite WORK(IU) by left singular vectors of R -* (CWorkspace: need N*N+3*N, prefer N*N+2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [U] + 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL ZLACP2( 'F', N, N, RWORK( IRU ), N, WORK( IU ), $ LDWRKU ) @@ -899,8 +983,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by right singular vectors of R -* (CWorkspace: need 3*N, prefer 2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] + 2*N [tauq, taup] + N [work] +* CWorkspace: prefer N*N [U] + 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL ZLACP2( 'F', N, N, RWORK( IRVT ), N, VT, LDVT ) CALL ZUNMBR( 'P', 'R', 'C', N, N, N, A, LDA, @@ -909,8 +994,8 @@ * * Multiply Q in U by left singular vectors of R in * WORK(IU), storing result in A -* (CWorkspace: need N*N) -* (RWorkspace: 0) +* CWorkspace: need N*N [U] +* RWorkspace: need 0 * CALL ZGEMM( 'N', 'N', M, N, N, CONE, U, LDU, WORK( IU ), $ LDWRKU, CZERO, A, LDA ) @@ -925,7 +1010,7 @@ * * MNTHR2 <= M < MNTHR1 * -* Path 5 (M much larger than N, but not as much as MNTHR1) +* Path 5 (M >> N, but not as much as MNTHR1) * Reduce to bidiagonal form without QR decomposition, use * ZUNGBR and matrix multiplication to compute singular vectors * @@ -936,19 +1021,21 @@ NWORK = ITAUP + N * * Bidiagonalize A -* (CWorkspace: need 2*N+M, prefer 2*N+(M+N)*NB) -* (RWorkspace: need N) +* CWorkspace: need 2*N [tauq, taup] + M [work] +* CWorkspace: prefer 2*N [tauq, taup] + (M+N)*NB [work] +* RWorkspace: need N [e] * CALL ZGEBRD( M, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, $ IERR ) IF( WNTQN ) THEN * +* Path 5n (M >> N, JOBZ='N') * Compute singular values only -* (Cworkspace: 0) -* (Rworkspace: need BDSPAN) +* CWorkspace: need 0 +* RWorkspace: need N [e] + BDSPAC * - CALL DBDSDC( 'U', 'N', N, S, RWORK( IE ), DUM, 1, DUM, 1, + CALL DBDSDC( 'U', 'N', N, S, RWORK( IE ), DUM, 1,DUM,1, $ DUM, IDUM, RWORK( NRWORK ), IWORK, INFO ) ELSE IF( WNTQO ) THEN IU = NWORK @@ -956,22 +1043,25 @@ IRVT = IRU + N*N NRWORK = IRVT + N*N * +* Path 5o (M >> N, JOBZ='O') * Copy A to VT, generate P**H -* (Cworkspace: need 2*N, prefer N+N*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL ZLACPY( 'U', N, N, A, LDA, VT, LDVT ) CALL ZUNGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Generate Q in A -* (CWorkspace: need 2*N, prefer N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL ZUNGBR( 'Q', M, N, N, A, LDA, WORK( ITAUQ ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * - IF( LWORK.GE.M*N+3*N ) THEN + IF( LWORK .GE. M*N + 3*N ) THEN * * WORK( IU ) is M by N * @@ -980,15 +1070,15 @@ * * WORK(IU) is LDWRKU by N * - LDWRKU = ( LWORK-3*N ) / N + LDWRKU = ( LWORK - 3*N ) / N END IF NWORK = IU + LDWRKU*N * * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * CALL DBDSDC( 'U', 'I', N, S, RWORK( IE ), RWORK( IRU ), $ N, RWORK( IRVT ), N, DUM, IDUM, @@ -996,8 +1086,8 @@ * * Multiply real matrix RWORK(IRVT) by P**H in VT, * storing the result in WORK(IU), copying to VT -* (Cworkspace: need 0) -* (Rworkspace: need 3*N*N) +* CWorkspace: need 2*N [tauq, taup] + N*N [U] +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + 2*N*N [rwork] * CALL ZLARCM( N, N, RWORK( IRVT ), N, VT, LDVT, $ WORK( IU ), LDWRKU, RWORK( NRWORK ) ) @@ -1005,8 +1095,10 @@ * * Multiply Q in A by real matrix RWORK(IRU), storing the * result in WORK(IU), copying to A -* (CWorkspace: need N*N, prefer M*N) -* (Rworkspace: need 3*N*N, prefer N*N+2*M*N) +* CWorkspace: need 2*N [tauq, taup] + N*N [U] +* CWorkspace: prefer 2*N [tauq, taup] + M*N [U] +* RWorkspace: need N [e] + N*N [RU] + 2*N*N [rwork] +* RWorkspace: prefer N [e] + N*N [RU] + 2*M*N [rwork] < N + 5*N*N since M < 2*N here * NRWORK = IRVT DO 20 I = 1, M, LDWRKU @@ -1019,17 +1111,20 @@ * ELSE IF( WNTQS ) THEN * +* Path 5s (M >> N, JOBZ='S') * Copy A to VT, generate P**H -* (Cworkspace: need 2*N, prefer N+N*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL ZLACPY( 'U', N, N, A, LDA, VT, LDVT ) CALL ZUNGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Copy A to U, generate Q -* (Cworkspace: need 2*N, prefer N+N*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL ZLACPY( 'L', M, N, A, LDA, U, LDU ) CALL ZUNGBR( 'Q', M, N, N, U, LDU, WORK( ITAUQ ), @@ -1038,8 +1133,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * IRU = NRWORK IRVT = IRU + N*N @@ -1050,8 +1145,8 @@ * * Multiply real matrix RWORK(IRVT) by P**H in VT, * storing the result in A, copying to VT -* (Cworkspace: need 0) -* (Rworkspace: need 3*N*N) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + 2*N*N [rwork] * CALL ZLARCM( N, N, RWORK( IRVT ), N, VT, LDVT, A, LDA, $ RWORK( NRWORK ) ) @@ -1059,8 +1154,8 @@ * * Multiply Q in U by real matrix RWORK(IRU), storing the * result in A, copying to U -* (CWorkspace: need 0) -* (Rworkspace: need N*N+2*M*N) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + 2*M*N [rwork] < N + 5*N*N since M < 2*N here * NRWORK = IRVT CALL ZLACRM( M, N, U, LDU, RWORK( IRU ), N, A, LDA, @@ -1068,17 +1163,20 @@ CALL ZLACPY( 'F', M, N, A, LDA, U, LDU ) ELSE * +* Path 5a (M >> N, JOBZ='A') * Copy A to VT, generate P**H -* (Cworkspace: need 2*N, prefer N+N*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL ZLACPY( 'U', N, N, A, LDA, VT, LDVT ) CALL ZUNGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Copy A to U, generate Q -* (Cworkspace: need 2*N, prefer N+N*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + M [work] +* CWorkspace: prefer 2*N [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL ZLACPY( 'L', M, N, A, LDA, U, LDU ) CALL ZUNGBR( 'Q', M, M, N, U, LDU, WORK( ITAUQ ), @@ -1087,8 +1185,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * IRU = NRWORK IRVT = IRU + N*N @@ -1099,8 +1197,8 @@ * * Multiply real matrix RWORK(IRVT) by P**H in VT, * storing the result in A, copying to VT -* (Cworkspace: need 0) -* (Rworkspace: need 3*N*N) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + 2*N*N [rwork] * CALL ZLARCM( N, N, RWORK( IRVT ), N, VT, LDVT, A, LDA, $ RWORK( NRWORK ) ) @@ -1108,8 +1206,8 @@ * * Multiply Q in U by real matrix RWORK(IRU), storing the * result in A, copying to U -* (CWorkspace: 0) -* (Rworkspace: need 3*N*N) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + 2*M*N [rwork] < N + 5*N*N since M < 2*N here * NRWORK = IRVT CALL ZLACRM( M, N, U, LDU, RWORK( IRU ), N, A, LDA, @@ -1121,7 +1219,7 @@ * * M .LT. MNTHR2 * -* Path 6 (M at least N, but not much larger) +* Path 6 (M >= N, but not much larger) * Reduce to bidiagonal form without QR decomposition * Use ZUNMBR to compute singular vectors * @@ -1132,26 +1230,28 @@ NWORK = ITAUP + N * * Bidiagonalize A -* (CWorkspace: need 2*N+M, prefer 2*N+(M+N)*NB) -* (RWorkspace: need N) +* CWorkspace: need 2*N [tauq, taup] + M [work] +* CWorkspace: prefer 2*N [tauq, taup] + (M+N)*NB [work] +* RWorkspace: need N [e] * CALL ZGEBRD( M, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, $ IERR ) IF( WNTQN ) THEN * +* Path 6n (M >= N, JOBZ='N') * Compute singular values only -* (Cworkspace: 0) -* (Rworkspace: need BDSPAN) +* CWorkspace: need 0 +* RWorkspace: need N [e] + BDSPAC * - CALL DBDSDC( 'U', 'N', N, S, RWORK( IE ), DUM, 1, DUM, 1, + CALL DBDSDC( 'U', 'N', N, S, RWORK( IE ), DUM,1,DUM,1, $ DUM, IDUM, RWORK( NRWORK ), IWORK, INFO ) ELSE IF( WNTQO ) THEN IU = NWORK IRU = NRWORK IRVT = IRU + N*N NRWORK = IRVT + N*N - IF( LWORK.GE.M*N+3*N ) THEN + IF( LWORK .GE. M*N + 3*N ) THEN * * WORK( IU ) is M by N * @@ -1160,15 +1260,16 @@ * * WORK( IU ) is LDWRKU by N * - LDWRKU = ( LWORK-3*N ) / N + LDWRKU = ( LWORK - 3*N ) / N END IF NWORK = IU + LDWRKU*N * +* Path 6o (M >= N, JOBZ='O') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * CALL DBDSDC( 'U', 'I', N, S, RWORK( IE ), RWORK( IRU ), $ N, RWORK( IRVT ), N, DUM, IDUM, @@ -1176,21 +1277,24 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by right singular vectors of A -* (Cworkspace: need 2*N, prefer N+N*NB) -* (Rworkspace: need 0) +* CWorkspace: need 2*N [tauq, taup] + N*N [U] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*N [U] + N*NB [work] +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] * CALL ZLACP2( 'F', N, N, RWORK( IRVT ), N, VT, LDVT ) CALL ZUNMBR( 'P', 'R', 'C', N, N, N, A, LDA, $ WORK( ITAUP ), VT, LDVT, WORK( NWORK ), $ LWORK-NWORK+1, IERR ) * - IF( LWORK.GE.M*N+3*N ) THEN + IF( LWORK .GE. M*N + 3*N ) THEN * -* Copy real matrix RWORK(IRU) to complex matrix WORK(IU) -* Overwrite WORK(IU) by left singular vectors of A, copying -* to A -* (Cworkspace: need M*N+2*N, prefer M*N+N+N*NB) -* (Rworkspace: need 0) +* Path 6o-fast +* Copy real matrix RWORK(IRU) to complex matrix WORK(IU) +* Overwrite WORK(IU) by left singular vectors of A, copying +* to A +* CWorkspace: need 2*N [tauq, taup] + M*N [U] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + M*N [U] + N*NB [work] +* RWorkspace: need N [e] + N*N [RU] * CALL ZLASET( 'F', M, N, CZERO, CZERO, WORK( IU ), $ LDWRKU ) @@ -1202,17 +1306,21 @@ CALL ZLACPY( 'F', M, N, WORK( IU ), LDWRKU, A, LDA ) ELSE * +* Path 6o-slow * Generate Q in A -* (Cworkspace: need 2*N, prefer N+N*NB) -* (Rworkspace: need 0) +* CWorkspace: need 2*N [tauq, taup] + N*N [U] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*N [U] + N*NB [work] +* RWorkspace: need 0 * CALL ZUNGBR( 'Q', M, N, N, A, LDA, WORK( ITAUQ ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Multiply Q in A by real matrix RWORK(IRU), storing the * result in WORK(IU), copying to A -* (CWorkspace: need N*N, prefer M*N) -* (Rworkspace: need 3*N*N, prefer N*N+2*M*N) +* CWorkspace: need 2*N [tauq, taup] + N*N [U] +* CWorkspace: prefer 2*N [tauq, taup] + M*N [U] +* RWorkspace: need N [e] + N*N [RU] + 2*N*N [rwork] +* RWorkspace: prefer N [e] + N*N [RU] + 2*M*N [rwork] < N + 5*N*N since M < 2*N here * NRWORK = IRVT DO 30 I = 1, M, LDWRKU @@ -1227,11 +1335,12 @@ * ELSE IF( WNTQS ) THEN * +* Path 6s (M >= N, JOBZ='S') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * IRU = NRWORK IRVT = IRU + N*N @@ -1242,8 +1351,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of A -* (CWorkspace: need 3*N, prefer 2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] * CALL ZLASET( 'F', M, N, CZERO, CZERO, U, LDU ) CALL ZLACP2( 'F', N, N, RWORK( IRU ), N, U, LDU ) @@ -1253,8 +1363,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by right singular vectors of A -* (CWorkspace: need 3*N, prefer 2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] * CALL ZLACP2( 'F', N, N, RWORK( IRVT ), N, VT, LDVT ) CALL ZUNMBR( 'P', 'R', 'C', N, N, N, A, LDA, @@ -1262,11 +1373,12 @@ $ LWORK-NWORK+1, IERR ) ELSE * +* Path 6a (M >= N, JOBZ='A') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] + BDSPAC * IRU = NRWORK IRVT = IRU + N*N @@ -1285,8 +1397,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of A -* (CWorkspace: need 2*N+M, prefer 2*N+M*NB) -* (RWorkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + M [work] +* CWorkspace: prefer 2*N [tauq, taup] + M*NB [work] +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] * CALL ZLACP2( 'F', N, N, RWORK( IRU ), N, U, LDU ) CALL ZUNMBR( 'Q', 'L', 'N', M, M, N, A, LDA, @@ -1295,8 +1408,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by right singular vectors of A -* (CWorkspace: need 3*N, prefer 2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need 2*N [tauq, taup] + N [work] +* CWorkspace: prefer 2*N [tauq, taup] + N*NB [work] +* RWorkspace: need N [e] + N*N [RU] + N*N [RVT] * CALL ZLACP2( 'F', N, N, RWORK( IRVT ), N, VT, LDVT ) CALL ZUNMBR( 'P', 'R', 'C', N, N, N, A, LDA, @@ -1316,15 +1430,16 @@ * IF( WNTQN ) THEN * -* Path 1t (N much larger than M, JOBZ='N') +* Path 1t (N >> M, JOBZ='N') * No singular vectors to be computed * ITAU = 1 NWORK = ITAU + M * * Compute A=L*Q -* (CWorkspace: need 2*M, prefer M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M [tau] + M [work] +* CWorkspace: prefer M [tau] + M*NB [work] +* RWorkspace: need 0 * CALL ZGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) @@ -1339,8 +1454,9 @@ NWORK = ITAUP + M * * Bidiagonalize L in A -* (CWorkspace: need 3*M, prefer 2*M+2*M*NB) -* (RWorkspace: need M) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + 2*M*NB [work] +* RWorkspace: need M [e] * CALL ZGEBRD( M, M, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -1348,15 +1464,15 @@ NRWORK = IE + M * * Perform bidiagonal SVD, compute singular values only -* (CWorkspace: 0) -* (RWorkspace: need BDSPAN) +* CWorkspace: need 0 +* RWorkspace: need M [e] + BDSPAC * - CALL DBDSDC( 'U', 'N', M, S, RWORK( IE ), DUM, 1, DUM, 1, + CALL DBDSDC( 'U', 'N', M, S, RWORK( IE ), DUM,1,DUM,1, $ DUM, IDUM, RWORK( NRWORK ), IWORK, INFO ) * ELSE IF( WNTQO ) THEN * -* Path 2t (N much larger than M, JOBZ='O') +* Path 2t (N >> M, JOBZ='O') * M right singular vectors to be overwritten on A and * M left singular vectors to be computed in U * @@ -1366,7 +1482,7 @@ * WORK(IVT) is M by M * IL = IVT + LDWKVT*M - IF( LWORK.GE.M*N+M*M+3*M ) THEN + IF( LWORK .GE. M*N + M*M + 3*M ) THEN * * WORK(IL) M by N * @@ -1377,14 +1493,15 @@ * WORK(IL) is M by CHUNK * LDWRKL = M - CHUNK = ( LWORK-M*M-3*M ) / M + CHUNK = ( LWORK - M*M - 3*M ) / M END IF ITAU = IL + LDWRKL*CHUNK NWORK = ITAU + M * * Compute A=L*Q -* (CWorkspace: need 2*M, prefer M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + M*M [L] + M [tau] + M [work] +* CWorkspace: prefer M*M [VT] + M*M [L] + M [tau] + M*NB [work] +* RWorkspace: need 0 * CALL ZGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) @@ -1396,8 +1513,9 @@ $ WORK( IL+LDWRKL ), LDWRKL ) * * Generate Q in A -* (CWorkspace: need M*M+2*M, prefer M*M+M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + M*M [L] + M [tau] + M [work] +* CWorkspace: prefer M*M [VT] + M*M [L] + M [tau] + M*NB [work] +* RWorkspace: need 0 * CALL ZUNGLQ( M, N, M, A, LDA, WORK( ITAU ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) @@ -1407,8 +1525,9 @@ NWORK = ITAUP + M * * Bidiagonalize L in WORK(IL) -* (CWorkspace: need M*M+3*M, prefer M*M+2*M+2*M*NB) -* (RWorkspace: need M) +* CWorkspace: need M*M [VT] + M*M [L] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [VT] + M*M [L] + 2*M [tauq, taup] + 2*M*NB [work] +* RWorkspace: need M [e] * CALL ZGEBRD( M, M, WORK( IL ), LDWRKL, S, RWORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), @@ -1417,8 +1536,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RU] + M*M [RVT] + BDSPAC * IRU = IE + M IRVT = IRU + M*M @@ -1429,8 +1548,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix WORK(IU) * Overwrite WORK(IU) by the left singular vectors of L -* (CWorkspace: need N*N+3*N, prefer M*N+2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + M*M [L] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [VT] + M*M [L] + 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL ZLACP2( 'F', M, M, RWORK( IRU ), M, U, LDU ) CALL ZUNMBR( 'Q', 'L', 'N', M, M, M, WORK( IL ), LDWRKL, @@ -1439,8 +1559,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix WORK(IVT) * Overwrite WORK(IVT) by the right singular vectors of L -* (CWorkspace: need N*N+3*N, prefer M*N+2*N+N*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + M*M [L] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [VT] + M*M [L] + 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL ZLACP2( 'F', M, M, RWORK( IRVT ), M, WORK( IVT ), $ LDWKVT ) @@ -1450,8 +1571,9 @@ * * Multiply right singular vectors of L in WORK(IL) by Q * in A, storing result in WORK(IL) and copying to A -* (CWorkspace: need 2*M*M, prefer M*M+M*N)) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + M*M [L] +* CWorkspace: prefer M*M [VT] + M*N [L] +* RWorkspace: need 0 * DO 40 I = 1, N, CHUNK BLK = MIN( N-I+1, CHUNK ) @@ -1464,9 +1586,9 @@ * ELSE IF( WNTQS ) THEN * -* Path 3t (N much larger than M, JOBZ='S') -* M right singular vectors to be computed in VT and -* M left singular vectors to be computed in U +* Path 3t (N >> M, JOBZ='S') +* M right singular vectors to be computed in VT and +* M left singular vectors to be computed in U * IL = 1 * @@ -1477,8 +1599,9 @@ NWORK = ITAU + M * * Compute A=L*Q -* (CWorkspace: need 2*M, prefer M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [L] + M [tau] + M [work] +* CWorkspace: prefer M*M [L] + M [tau] + M*NB [work] +* RWorkspace: need 0 * CALL ZGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) @@ -1490,8 +1613,9 @@ $ WORK( IL+LDWRKL ), LDWRKL ) * * Generate Q in A -* (CWorkspace: need M*M+2*M, prefer M*M+M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [L] + M [tau] + M [work] +* CWorkspace: prefer M*M [L] + M [tau] + M*NB [work] +* RWorkspace: need 0 * CALL ZUNGLQ( M, N, M, A, LDA, WORK( ITAU ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) @@ -1501,8 +1625,9 @@ NWORK = ITAUP + M * * Bidiagonalize L in WORK(IL) -* (CWorkspace: need M*M+3*M, prefer M*M+2*M+2*M*NB) -* (RWorkspace: need M) +* CWorkspace: need M*M [L] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [L] + 2*M [tauq, taup] + 2*M*NB [work] +* RWorkspace: need M [e] * CALL ZGEBRD( M, M, WORK( IL ), LDWRKL, S, RWORK( IE ), $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ), @@ -1511,8 +1636,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RU] + M*M [RVT] + BDSPAC * IRU = IE + M IRVT = IRU + M*M @@ -1523,8 +1648,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of L -* (CWorkspace: need M*M+3*M, prefer M*M+2*M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [L] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [L] + 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL ZLACP2( 'F', M, M, RWORK( IRU ), M, U, LDU ) CALL ZUNMBR( 'Q', 'L', 'N', M, M, M, WORK( IL ), LDWRKL, @@ -1533,8 +1659,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by left singular vectors of L -* (CWorkspace: need M*M+3*M, prefer M*M+2*M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [L] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [L] + 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL ZLACP2( 'F', M, M, RWORK( IRVT ), M, VT, LDVT ) CALL ZUNMBR( 'P', 'R', 'C', M, M, M, WORK( IL ), LDWRKL, @@ -1543,8 +1670,8 @@ * * Copy VT to WORK(IL), multiply right singular vectors of L * in WORK(IL) by Q in A, storing result in VT -* (CWorkspace: need M*M) -* (RWorkspace: 0) +* CWorkspace: need M*M [L] +* RWorkspace: need 0 * CALL ZLACPY( 'F', M, M, VT, LDVT, WORK( IL ), LDWRKL ) CALL ZGEMM( 'N', 'N', M, N, M, CONE, WORK( IL ), LDWRKL, @@ -1552,7 +1679,7 @@ * ELSE IF( WNTQA ) THEN * -* Path 9t (N much larger than M, JOBZ='A') +* Path 4t (N >> M, JOBZ='A') * N right singular vectors to be computed in VT and * M left singular vectors to be computed in U * @@ -1565,16 +1692,18 @@ NWORK = ITAU + M * * Compute A=L*Q, copying result to VT -* (CWorkspace: need 2*M, prefer M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + M [tau] + M [work] +* CWorkspace: prefer M*M [VT] + M [tau] + M*NB [work] +* RWorkspace: need 0 * CALL ZGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ), $ LWORK-NWORK+1, IERR ) CALL ZLACPY( 'U', M, N, A, LDA, VT, LDVT ) * * Generate Q in VT -* (CWorkspace: need M+N, prefer M+N*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + M [tau] + N [work] +* CWorkspace: prefer M*M [VT] + M [tau] + N*NB [work] +* RWorkspace: need 0 * CALL ZUNGLQ( N, N, M, VT, LDVT, WORK( ITAU ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) @@ -1589,8 +1718,9 @@ NWORK = ITAUP + M * * Bidiagonalize L in A -* (CWorkspace: need M*M+3*M, prefer M*M+2*M+2*M*NB) -* (RWorkspace: need M) +* CWorkspace: need M*M [VT] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [VT] + 2*M [tauq, taup] + 2*M*NB [work] +* RWorkspace: need M [e] * CALL ZGEBRD( M, M, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -1599,8 +1729,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RU] + M*M [RVT] + BDSPAC * IRU = IE + M IRVT = IRU + M*M @@ -1611,8 +1741,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of L -* (CWorkspace: need 3*M, prefer 2*M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [VT] + 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL ZLACP2( 'F', M, M, RWORK( IRU ), M, U, LDU ) CALL ZUNMBR( 'Q', 'L', 'N', M, M, M, A, LDA, @@ -1621,8 +1752,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix WORK(IVT) * Overwrite WORK(IVT) by right singular vectors of L -* (CWorkspace: need M*M+3*M, prefer M*M+2*M+M*NB) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] + 2*M [tauq, taup] + M [work] +* CWorkspace: prefer M*M [VT] + 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL ZLACP2( 'F', M, M, RWORK( IRVT ), M, WORK( IVT ), $ LDWKVT ) @@ -1632,8 +1764,8 @@ * * Multiply right singular vectors of L in WORK(IVT) by * Q in VT, storing result in A -* (CWorkspace: need M*M) -* (RWorkspace: 0) +* CWorkspace: need M*M [VT] +* RWorkspace: need 0 * CALL ZGEMM( 'N', 'N', M, N, M, CONE, WORK( IVT ), LDWKVT, $ VT, LDVT, CZERO, A, LDA ) @@ -1648,10 +1780,9 @@ * * MNTHR2 <= N < MNTHR1 * -* Path 5t (N much larger than M, but not as much as MNTHR1) +* Path 5t (N >> M, but not as much as MNTHR1) * Reduce to bidiagonal form without QR decomposition, use * ZUNGBR and matrix multiplication to compute singular vectors -* * IE = 1 NRWORK = IE + M @@ -1660,8 +1791,9 @@ NWORK = ITAUP + M * * Bidiagonalize A -* (CWorkspace: need 2*M+N, prefer 2*M+(M+N)*NB) -* (RWorkspace: M) +* CWorkspace: need 2*M [tauq, taup] + N [work] +* CWorkspace: prefer 2*M [tauq, taup] + (M+N)*NB [work] +* RWorkspace: need M [e] * CALL ZGEBRD( M, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, @@ -1669,11 +1801,12 @@ * IF( WNTQN ) THEN * +* Path 5tn (N >> M, JOBZ='N') * Compute singular values only -* (Cworkspace: 0) -* (Rworkspace: need BDSPAN) +* CWorkspace: need 0 +* RWorkspace: need M [e] + BDSPAC * - CALL DBDSDC( 'L', 'N', M, S, RWORK( IE ), DUM, 1, DUM, 1, + CALL DBDSDC( 'L', 'N', M, S, RWORK( IE ), DUM,1,DUM,1, $ DUM, IDUM, RWORK( NRWORK ), IWORK, INFO ) ELSE IF( WNTQO ) THEN IRVT = NRWORK @@ -1681,23 +1814,26 @@ NRWORK = IRU + M*M IVT = NWORK * +* Path 5to (N >> M, JOBZ='O') * Copy A to U, generate Q -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL ZLACPY( 'L', M, M, A, LDA, U, LDU ) CALL ZUNGBR( 'Q', M, M, N, U, LDU, WORK( ITAUQ ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Generate P**H in A -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL ZUNGBR( 'P', M, N, M, A, LDA, WORK( ITAUP ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * LDWKVT = M - IF( LWORK.GE.M*N+3*M ) THEN + IF( LWORK .GE. M*N + 3*M ) THEN * * WORK( IVT ) is M by N * @@ -1707,15 +1843,15 @@ * * WORK( IVT ) is M by CHUNK * - CHUNK = ( LWORK-3*M ) / M + CHUNK = ( LWORK - 3*M ) / M NWORK = IVT + LDWKVT*CHUNK END IF * * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + BDSPAC * CALL DBDSDC( 'L', 'I', M, S, RWORK( IE ), RWORK( IRU ), $ M, RWORK( IRVT ), M, DUM, IDUM, @@ -1723,8 +1859,8 @@ * * Multiply Q in U by real matrix RWORK(IRVT) * storing the result in WORK(IVT), copying to U -* (Cworkspace: need 0) -* (Rworkspace: need 2*M*M) +* CWorkspace: need 2*M [tauq, taup] + M*M [VT] +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + 2*M*M [rwork] * CALL ZLACRM( M, M, U, LDU, RWORK( IRU ), M, WORK( IVT ), $ LDWKVT, RWORK( NRWORK ) ) @@ -1732,8 +1868,10 @@ * * Multiply RWORK(IRVT) by P**H in A, storing the * result in WORK(IVT), copying to A -* (CWorkspace: need M*M, prefer M*N) -* (Rworkspace: need 2*M*M, prefer 2*M*N) +* CWorkspace: need 2*M [tauq, taup] + M*M [VT] +* CWorkspace: prefer 2*M [tauq, taup] + M*N [VT] +* RWorkspace: need M [e] + M*M [RVT] + 2*M*M [rwork] +* RWorkspace: prefer M [e] + M*M [RVT] + 2*M*N [rwork] < M + 5*M*M since N < 2*M here * NRWORK = IRU DO 50 I = 1, N, CHUNK @@ -1745,17 +1883,20 @@ 50 CONTINUE ELSE IF( WNTQS ) THEN * +* Path 5ts (N >> M, JOBZ='S') * Copy A to U, generate Q -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL ZLACPY( 'L', M, M, A, LDA, U, LDU ) CALL ZUNGBR( 'Q', M, M, N, U, LDU, WORK( ITAUQ ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Copy A to VT, generate P**H -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL ZLACPY( 'U', M, N, A, LDA, VT, LDVT ) CALL ZUNGBR( 'P', M, N, M, VT, LDVT, WORK( ITAUP ), @@ -1764,8 +1905,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + BDSPAC * IRVT = NRWORK IRU = IRVT + M*M @@ -1776,8 +1917,8 @@ * * Multiply Q in U by real matrix RWORK(IRU), storing the * result in A, copying to U -* (CWorkspace: need 0) -* (Rworkspace: need 3*M*M) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + 2*M*M [rwork] * CALL ZLACRM( M, M, U, LDU, RWORK( IRU ), M, A, LDA, $ RWORK( NRWORK ) ) @@ -1785,8 +1926,8 @@ * * Multiply real matrix RWORK(IRVT) by P**H in VT, * storing the result in A, copying to VT -* (Cworkspace: need 0) -* (Rworkspace: need M*M+2*M*N) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + 2*M*N [rwork] < M + 5*M*M since N < 2*M here * NRWORK = IRU CALL ZLARCM( M, N, RWORK( IRVT ), M, VT, LDVT, A, LDA, @@ -1794,17 +1935,20 @@ CALL ZLACPY( 'F', M, N, A, LDA, VT, LDVT ) ELSE * +* Path 5ta (N >> M, JOBZ='A') * Copy A to U, generate Q -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need 0 * CALL ZLACPY( 'L', M, M, A, LDA, U, LDU ) CALL ZUNGBR( 'Q', M, M, N, U, LDU, WORK( ITAUQ ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Copy A to VT, generate P**H -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: 0) +* CWorkspace: need 2*M [tauq, taup] + N [work] +* CWorkspace: prefer 2*M [tauq, taup] + N*NB [work] +* RWorkspace: need 0 * CALL ZLACPY( 'U', M, N, A, LDA, VT, LDVT ) CALL ZUNGBR( 'P', N, N, M, VT, LDVT, WORK( ITAUP ), @@ -1813,8 +1957,8 @@ * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + BDSPAC * IRVT = NRWORK IRU = IRVT + M*M @@ -1825,8 +1969,8 @@ * * Multiply Q in U by real matrix RWORK(IRU), storing the * result in A, copying to U -* (CWorkspace: need 0) -* (Rworkspace: need 3*M*M) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + 2*M*M [rwork] * CALL ZLACRM( M, M, U, LDU, RWORK( IRU ), M, A, LDA, $ RWORK( NRWORK ) ) @@ -1834,9 +1978,10 @@ * * Multiply real matrix RWORK(IRVT) by P**H in VT, * storing the result in A, copying to VT -* (Cworkspace: need 0) -* (Rworkspace: need M*M+2*M*N) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + 2*M*N [rwork] < M + 5*M*M since N < 2*M here * + NRWORK = IRU CALL ZLARCM( M, N, RWORK( IRVT ), M, VT, LDVT, A, LDA, $ RWORK( NRWORK ) ) CALL ZLACPY( 'F', M, N, A, LDA, VT, LDVT ) @@ -1846,7 +1991,7 @@ * * N .LT. MNTHR2 * -* Path 6t (N greater than M, but not much larger) +* Path 6t (N > M, but not much larger) * Reduce to bidiagonal form without LQ decomposition * Use ZUNMBR to compute singular vectors * @@ -1857,24 +2002,27 @@ NWORK = ITAUP + M * * Bidiagonalize A -* (CWorkspace: need 2*M+N, prefer 2*M+(M+N)*NB) -* (RWorkspace: M) +* CWorkspace: need 2*M [tauq, taup] + N [work] +* CWorkspace: prefer 2*M [tauq, taup] + (M+N)*NB [work] +* RWorkspace: need M [e] * CALL ZGEBRD( M, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ), $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1, $ IERR ) IF( WNTQN ) THEN * +* Path 6tn (N > M, JOBZ='N') * Compute singular values only -* (Cworkspace: 0) -* (Rworkspace: need BDSPAN) +* CWorkspace: need 0 +* RWorkspace: need M [e] + BDSPAC * - CALL DBDSDC( 'L', 'N', M, S, RWORK( IE ), DUM, 1, DUM, 1, + CALL DBDSDC( 'L', 'N', M, S, RWORK( IE ), DUM,1,DUM,1, $ DUM, IDUM, RWORK( NRWORK ), IWORK, INFO ) ELSE IF( WNTQO ) THEN +* Path 6to (N > M, JOBZ='O') LDWKVT = M IVT = NWORK - IF( LWORK.GE.M*N+3*M ) THEN + IF( LWORK .GE. M*N + 3*M ) THEN * * WORK( IVT ) is M by N * @@ -1885,15 +2033,15 @@ * * WORK( IVT ) is M by CHUNK * - CHUNK = ( LWORK-3*M ) / M + CHUNK = ( LWORK - 3*M ) / M NWORK = IVT + LDWKVT*CHUNK END IF * * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + BDSPAC * IRVT = NRWORK IRU = IRVT + M*M @@ -1904,21 +2052,24 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of A -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: need 0) +* CWorkspace: need 2*M [tauq, taup] + M*M [VT] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*M [VT] + M*NB [work] +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] * CALL ZLACP2( 'F', M, M, RWORK( IRU ), M, U, LDU ) CALL ZUNMBR( 'Q', 'L', 'N', M, M, N, A, LDA, $ WORK( ITAUQ ), U, LDU, WORK( NWORK ), $ LWORK-NWORK+1, IERR ) * - IF( LWORK.GE.M*N+3*M ) THEN + IF( LWORK .GE. M*N + 3*M ) THEN * -* Copy real matrix RWORK(IRVT) to complex matrix WORK(IVT) -* Overwrite WORK(IVT) by right singular vectors of A, -* copying to A -* (Cworkspace: need M*N+2*M, prefer M*N+M+M*NB) -* (Rworkspace: need 0) +* Path 6to-fast +* Copy real matrix RWORK(IRVT) to complex matrix WORK(IVT) +* Overwrite WORK(IVT) by right singular vectors of A, +* copying to A +* CWorkspace: need 2*M [tauq, taup] + M*N [VT] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*N [VT] + M*NB [work] +* RWorkspace: need M [e] + M*M [RVT] * CALL ZLACP2( 'F', M, M, RWORK( IRVT ), M, WORK( IVT ), $ LDWKVT ) @@ -1928,17 +2079,21 @@ CALL ZLACPY( 'F', M, N, WORK( IVT ), LDWKVT, A, LDA ) ELSE * +* Path 6to-slow * Generate P**H in A -* (Cworkspace: need 2*M, prefer M+M*NB) -* (Rworkspace: need 0) +* CWorkspace: need 2*M [tauq, taup] + M*M [VT] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*M [VT] + M*NB [work] +* RWorkspace: need 0 * CALL ZUNGBR( 'P', M, N, M, A, LDA, WORK( ITAUP ), $ WORK( NWORK ), LWORK-NWORK+1, IERR ) * * Multiply Q in A by real matrix RWORK(IRU), storing the * result in WORK(IU), copying to A -* (CWorkspace: need M*M, prefer M*N) -* (Rworkspace: need 3*M*M, prefer M*M+2*M*N) +* CWorkspace: need 2*M [tauq, taup] + M*M [VT] +* CWorkspace: prefer 2*M [tauq, taup] + M*N [VT] +* RWorkspace: need M [e] + M*M [RVT] + 2*M*M [rwork] +* RWorkspace: prefer M [e] + M*M [RVT] + 2*M*N [rwork] < M + 5*M*M since N < 2*M here * NRWORK = IRU DO 60 I = 1, N, CHUNK @@ -1952,11 +2107,12 @@ END IF ELSE IF( WNTQS ) THEN * +* Path 6ts (N > M, JOBZ='S') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + BDSPAC * IRVT = NRWORK IRU = IRVT + M*M @@ -1967,8 +2123,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of A -* (CWorkspace: need 3*M, prefer 2*M+M*NB) -* (RWorkspace: M*M) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] * CALL ZLACP2( 'F', M, M, RWORK( IRU ), M, U, LDU ) CALL ZUNMBR( 'Q', 'L', 'N', M, M, N, A, LDA, @@ -1977,8 +2134,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by right singular vectors of A -* (CWorkspace: need 3*M, prefer 2*M+M*NB) -* (RWorkspace: M*M) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need M [e] + M*M [RVT] * CALL ZLASET( 'F', M, N, CZERO, CZERO, VT, LDVT ) CALL ZLACP2( 'F', M, M, RWORK( IRVT ), M, VT, LDVT ) @@ -1987,11 +2145,12 @@ $ LWORK-NWORK+1, IERR ) ELSE * +* Path 6ta (N > M, JOBZ='A') * Perform bidiagonal SVD, computing left singular vectors * of bidiagonal matrix in RWORK(IRU) and computing right * singular vectors of bidiagonal matrix in RWORK(IRVT) -* (CWorkspace: need 0) -* (RWorkspace: need BDSPAC) +* CWorkspace: need 0 +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] + BDSPAC * IRVT = NRWORK IRU = IRVT + M*M @@ -2003,8 +2162,9 @@ * * Copy real matrix RWORK(IRU) to complex matrix U * Overwrite U by left singular vectors of A -* (CWorkspace: need 3*M, prefer 2*M+M*NB) -* (RWorkspace: M*M) +* CWorkspace: need 2*M [tauq, taup] + M [work] +* CWorkspace: prefer 2*M [tauq, taup] + M*NB [work] +* RWorkspace: need M [e] + M*M [RVT] + M*M [RU] * CALL ZLACP2( 'F', M, M, RWORK( IRU ), M, U, LDU ) CALL ZUNMBR( 'Q', 'L', 'N', M, M, N, A, LDA, @@ -2017,8 +2177,9 @@ * * Copy real matrix RWORK(IRVT) to complex matrix VT * Overwrite VT by right singular vectors of A -* (CWorkspace: need 2*M+N, prefer 2*M+N*NB) -* (RWorkspace: M*M) +* CWorkspace: need 2*M [tauq, taup] + N [work] +* CWorkspace: prefer 2*M [tauq, taup] + N*NB [work] +* RWorkspace: need M [e] + M*M [RVT] * CALL ZLACP2( 'F', M, M, RWORK( IRVT ), M, VT, LDVT ) CALL ZUNMBR( 'P', 'R', 'C', N, N, M, A, LDA, diff --git a/SRC/zgesvd.f b/SRC/zgesvd.f index 73b489c9..39f5186d 100644 --- a/SRC/zgesvd.f +++ b/SRC/zgesvd.f @@ -322,23 +322,23 @@ MNTHR = ILAENV( 6, 'ZGESVD', JOBU // JOBVT, M, N, 0, 0 ) * Compute space needed for ZGEQRF CALL ZGEQRF( M, N, A, LDA, CDUM(1), CDUM(1), -1, IERR ) - LWORK_ZGEQRF=CDUM(1) + LWORK_ZGEQRF = INT( CDUM(1) ) * Compute space needed for ZUNGQR CALL ZUNGQR( M, N, N, A, LDA, CDUM(1), CDUM(1), -1, IERR ) - LWORK_ZUNGQR_N=CDUM(1) + LWORK_ZUNGQR_N = INT( CDUM(1) ) CALL ZUNGQR( M, M, N, A, LDA, CDUM(1), CDUM(1), -1, IERR ) - LWORK_ZUNGQR_M=CDUM(1) + LWORK_ZUNGQR_M = INT( CDUM(1) ) * Compute space needed for ZGEBRD CALL ZGEBRD( N, N, A, LDA, S, DUM(1), CDUM(1), $ CDUM(1), CDUM(1), -1, IERR ) - LWORK_ZGEBRD=CDUM(1) + LWORK_ZGEBRD = INT( CDUM(1) ) * Compute space needed for ZUNGBR CALL ZUNGBR( 'P', N, N, N, A, LDA, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_ZUNGBR_P=CDUM(1) + LWORK_ZUNGBR_P = INT( CDUM(1) ) CALL ZUNGBR( 'Q', N, N, N, A, LDA, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_ZUNGBR_Q=CDUM(1) + LWORK_ZUNGBR_Q = INT( CDUM(1) ) * IF( M.GE.MNTHR ) THEN IF( WNTUN ) THEN @@ -445,24 +445,24 @@ * CALL ZGEBRD( M, N, A, LDA, S, DUM(1), CDUM(1), $ CDUM(1), CDUM(1), -1, IERR ) - LWORK_ZGEBRD=CDUM(1) + LWORK_ZGEBRD = INT( CDUM(1) ) MAXWRK = 2*N + LWORK_ZGEBRD IF( WNTUS .OR. WNTUO ) THEN CALL ZUNGBR( 'Q', M, N, N, A, LDA, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_ZUNGBR_Q=CDUM(1) + LWORK_ZUNGBR_Q = INT( CDUM(1) ) MAXWRK = MAX( MAXWRK, 2*N+LWORK_ZUNGBR_Q ) END IF IF( WNTUA ) THEN CALL ZUNGBR( 'Q', M, M, N, A, LDA, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_ZUNGBR_Q=CDUM(1) + LWORK_ZUNGBR_Q = INT( CDUM(1) ) MAXWRK = MAX( MAXWRK, 2*N+LWORK_ZUNGBR_Q ) END IF IF( .NOT.WNTVN ) THEN MAXWRK = MAX( MAXWRK, 2*N+LWORK_ZUNGBR_P ) - MINWRK = 2*N + M END IF + MINWRK = 2*N + M END IF ELSE IF( MINMN.GT.0 ) THEN * @@ -471,25 +471,25 @@ MNTHR = ILAENV( 6, 'ZGESVD', JOBU // JOBVT, M, N, 0, 0 ) * Compute space needed for ZGELQF CALL ZGELQF( M, N, A, LDA, CDUM(1), CDUM(1), -1, IERR ) - LWORK_ZGELQF=CDUM(1) + LWORK_ZGELQF = INT( CDUM(1) ) * Compute space needed for ZUNGLQ CALL ZUNGLQ( N, N, M, CDUM(1), N, CDUM(1), CDUM(1), -1, $ IERR ) - LWORK_ZUNGLQ_N=CDUM(1) + LWORK_ZUNGLQ_N = INT( CDUM(1) ) CALL ZUNGLQ( M, N, M, A, LDA, CDUM(1), CDUM(1), -1, IERR ) - LWORK_ZUNGLQ_M=CDUM(1) + LWORK_ZUNGLQ_M = INT( CDUM(1) ) * Compute space needed for ZGEBRD CALL ZGEBRD( M, M, A, LDA, S, DUM(1), CDUM(1), $ CDUM(1), CDUM(1), -1, IERR ) - LWORK_ZGEBRD=CDUM(1) + LWORK_ZGEBRD = INT( CDUM(1) ) * Compute space needed for ZUNGBR P CALL ZUNGBR( 'P', M, M, M, A, N, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_ZUNGBR_P=CDUM(1) + LWORK_ZUNGBR_P = INT( CDUM(1) ) * Compute space needed for ZUNGBR Q CALL ZUNGBR( 'Q', M, M, M, A, N, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_ZUNGBR_Q=CDUM(1) + LWORK_ZUNGBR_Q = INT( CDUM(1) ) IF( N.GE.MNTHR ) THEN IF( WNTVN ) THEN * @@ -595,25 +595,25 @@ * CALL ZGEBRD( M, N, A, LDA, S, DUM(1), CDUM(1), $ CDUM(1), CDUM(1), -1, IERR ) - LWORK_ZGEBRD=CDUM(1) + LWORK_ZGEBRD = INT( CDUM(1) ) MAXWRK = 2*M + LWORK_ZGEBRD IF( WNTVS .OR. WNTVO ) THEN * Compute space needed for ZUNGBR P CALL ZUNGBR( 'P', M, N, M, A, N, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_ZUNGBR_P=CDUM(1) + LWORK_ZUNGBR_P = INT( CDUM(1) ) MAXWRK = MAX( MAXWRK, 2*M+LWORK_ZUNGBR_P ) END IF IF( WNTVA ) THEN CALL ZUNGBR( 'P', N, N, M, A, N, CDUM(1), $ CDUM(1), -1, IERR ) - LWORK_ZUNGBR_P=CDUM(1) + LWORK_ZUNGBR_P = INT( CDUM(1) ) MAXWRK = MAX( MAXWRK, 2*M+LWORK_ZUNGBR_P ) END IF IF( .NOT.WNTUN ) THEN MAXWRK = MAX( MAXWRK, 2*M+LWORK_ZUNGBR_Q ) - MINWRK = 2*M + N END IF + MINWRK = 2*M + N END IF END IF MAXWRK = MAX( MAXWRK, MINWRK ) -- 2.34.1