1 *> \brief <b> DGESVD computes the singular value decomposition (SVD) for GE matrices</b>
3 * =========== DOCUMENTATION ===========
5 * Online html documentation available at
6 * http://www.netlib.org/lapack/explore-html/
9 *> Download DGESVD + dependencies
10 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/dgesvd.f">
12 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/dgesvd.f">
14 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/dgesvd.f">
21 * SUBROUTINE DGESVD( JOBU, JOBVT, M, N, A, LDA, S, U, LDU, VT, LDVT,
24 * .. Scalar Arguments ..
25 * CHARACTER JOBU, JOBVT
26 * INTEGER INFO, LDA, LDU, LDVT, LWORK, M, N
28 * .. Array Arguments ..
29 * DOUBLE PRECISION A( LDA, * ), S( * ), U( LDU, * ),
30 * $ VT( LDVT, * ), WORK( * )
39 *> DGESVD computes the singular value decomposition (SVD) of a real
40 *> M-by-N matrix A, optionally computing the left and/or right singular
41 *> vectors. The SVD is written
43 *> A = U * SIGMA * transpose(V)
45 *> where SIGMA is an M-by-N matrix which is zero except for its
46 *> min(m,n) diagonal elements, U is an M-by-M orthogonal matrix, and
47 *> V is an N-by-N orthogonal matrix. The diagonal elements of SIGMA
48 *> are the singular values of A; they are real and non-negative, and
49 *> are returned in descending order. The first min(m,n) columns of
50 *> U and V are the left and right singular vectors of A.
52 *> Note that the routine returns V**T, not V.
60 *> JOBU is CHARACTER*1
61 *> Specifies options for computing all or part of the matrix U:
62 *> = 'A': all M columns of U are returned in array U:
63 *> = 'S': the first min(m,n) columns of U (the left singular
64 *> vectors) are returned in the array U;
65 *> = 'O': the first min(m,n) columns of U (the left singular
66 *> vectors) are overwritten on the array A;
67 *> = 'N': no columns of U (no left singular vectors) are
73 *> JOBVT is CHARACTER*1
74 *> Specifies options for computing all or part of the matrix
76 *> = 'A': all N rows of V**T are returned in the array VT;
77 *> = 'S': the first min(m,n) rows of V**T (the right singular
78 *> vectors) are returned in the array VT;
79 *> = 'O': the first min(m,n) rows of V**T (the right singular
80 *> vectors) are overwritten on the array A;
81 *> = 'N': no rows of V**T (no right singular vectors) are
84 *> JOBVT and JOBU cannot both be 'O'.
90 *> The number of rows of the input matrix A. M >= 0.
96 *> The number of columns of the input matrix A. N >= 0.
101 *> A is DOUBLE PRECISION array, dimension (LDA,N)
102 *> On entry, the M-by-N matrix A.
104 *> if JOBU = 'O', A is overwritten with the first min(m,n)
105 *> columns of U (the left singular vectors,
106 *> stored columnwise);
107 *> if JOBVT = 'O', A is overwritten with the first min(m,n)
108 *> rows of V**T (the right singular vectors,
110 *> if JOBU .ne. 'O' and JOBVT .ne. 'O', the contents of A
117 *> The leading dimension of the array A. LDA >= max(1,M).
122 *> S is DOUBLE PRECISION array, dimension (min(M,N))
123 *> The singular values of A, sorted so that S(i) >= S(i+1).
128 *> U is DOUBLE PRECISION array, dimension (LDU,UCOL)
129 *> (LDU,M) if JOBU = 'A' or (LDU,min(M,N)) if JOBU = 'S'.
130 *> If JOBU = 'A', U contains the M-by-M orthogonal matrix U;
131 *> if JOBU = 'S', U contains the first min(m,n) columns of U
132 *> (the left singular vectors, stored columnwise);
133 *> if JOBU = 'N' or 'O', U is not referenced.
139 *> The leading dimension of the array U. LDU >= 1; if
140 *> JOBU = 'S' or 'A', LDU >= M.
145 *> VT is DOUBLE PRECISION array, dimension (LDVT,N)
146 *> If JOBVT = 'A', VT contains the N-by-N orthogonal matrix
148 *> if JOBVT = 'S', VT contains the first min(m,n) rows of
149 *> V**T (the right singular vectors, stored rowwise);
150 *> if JOBVT = 'N' or 'O', VT is not referenced.
156 *> The leading dimension of the array VT. LDVT >= 1; if
157 *> JOBVT = 'A', LDVT >= N; if JOBVT = 'S', LDVT >= min(M,N).
162 *> WORK is DOUBLE PRECISION array, dimension (MAX(1,LWORK))
163 *> On exit, if INFO = 0, WORK(1) returns the optimal LWORK;
164 *> if INFO > 0, WORK(2:MIN(M,N)) contains the unconverged
165 *> superdiagonal elements of an upper bidiagonal matrix B
166 *> whose diagonal is in S (not necessarily sorted). B
167 *> satisfies A = U * B * VT, so it has the same singular values
168 *> as A, and singular vectors related by U and VT.
174 *> The dimension of the array WORK.
175 *> LWORK >= MAX(1,5*MIN(M,N)) for the paths (see comments inside code):
176 *> - PATH 1 (M much larger than N, JOBU='N')
177 *> - PATH 1t (N much larger than M, JOBVT='N')
178 *> LWORK >= MAX(1,3*MIN(M,N) + MAX(M,N),5*MIN(M,N)) for the other paths
179 *> For good performance, LWORK should generally be larger.
181 *> If LWORK = -1, then a workspace query is assumed; the routine
182 *> only calculates the optimal size of the WORK array, returns
183 *> this value as the first entry of the WORK array, and no error
184 *> message related to LWORK is issued by XERBLA.
190 *> = 0: successful exit.
191 *> < 0: if INFO = -i, the i-th argument had an illegal value.
192 *> > 0: if DBDSQR did not converge, INFO specifies how many
193 *> superdiagonals of an intermediate bidiagonal form B
194 *> did not converge to zero. See the description of WORK
195 *> above for details.
201 *> \author Univ. of Tennessee
202 *> \author Univ. of California Berkeley
203 *> \author Univ. of Colorado Denver
208 *> \ingroup doubleGEsing
210 * =====================================================================
211 SUBROUTINE DGESVD( JOBU, JOBVT, M, N, A, LDA, S, U, LDU,
212 $ VT, LDVT, WORK, LWORK, INFO )
214 * -- LAPACK driver routine (version 3.6.1) --
215 * -- LAPACK is a software package provided by Univ. of Tennessee, --
216 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
219 * .. Scalar Arguments ..
220 CHARACTER JOBU, JOBVT
221 INTEGER INFO, LDA, LDU, LDVT, LWORK, M, N
223 * .. Array Arguments ..
224 DOUBLE PRECISION A( LDA, * ), S( * ), U( LDU, * ),
225 $ VT( LDVT, * ), WORK( * )
228 * =====================================================================
231 DOUBLE PRECISION ZERO, ONE
232 PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0 )
234 * .. Local Scalars ..
235 LOGICAL LQUERY, WNTUA, WNTUAS, WNTUN, WNTUO, WNTUS,
236 $ WNTVA, WNTVAS, WNTVN, WNTVO, WNTVS
237 INTEGER BDSPAC, BLK, CHUNK, I, IE, IERR, IR, ISCL,
238 $ ITAU, ITAUP, ITAUQ, IU, IWORK, LDWRKR, LDWRKU,
239 $ MAXWRK, MINMN, MINWRK, MNTHR, NCU, NCVT, NRU,
241 INTEGER LWORK_DGEQRF, LWORK_DORGQR_N, LWORK_DORGQR_M,
242 $ LWORK_DGEBRD, LWORK_DORGBR_P, LWORK_DORGBR_Q,
243 $ LWORK_DGELQF, LWORK_DORGLQ_N, LWORK_DORGLQ_M
244 DOUBLE PRECISION ANRM, BIGNUM, EPS, SMLNUM
247 DOUBLE PRECISION DUM( 1 )
249 * .. External Subroutines ..
250 EXTERNAL DBDSQR, DGEBRD, DGELQF, DGEMM, DGEQRF, DLACPY,
251 $ DLASCL, DLASET, DORGBR, DORGLQ, DORGQR, DORMBR,
254 * .. External Functions ..
257 DOUBLE PRECISION DLAMCH, DLANGE
258 EXTERNAL LSAME, ILAENV, DLAMCH, DLANGE
260 * .. Intrinsic Functions ..
261 INTRINSIC MAX, MIN, SQRT
263 * .. Executable Statements ..
265 * Test the input arguments
269 WNTUA = LSAME( JOBU, 'A' )
270 WNTUS = LSAME( JOBU, 'S' )
271 WNTUAS = WNTUA .OR. WNTUS
272 WNTUO = LSAME( JOBU, 'O' )
273 WNTUN = LSAME( JOBU, 'N' )
274 WNTVA = LSAME( JOBVT, 'A' )
275 WNTVS = LSAME( JOBVT, 'S' )
276 WNTVAS = WNTVA .OR. WNTVS
277 WNTVO = LSAME( JOBVT, 'O' )
278 WNTVN = LSAME( JOBVT, 'N' )
279 LQUERY = ( LWORK.EQ.-1 )
281 IF( .NOT.( WNTUA .OR. WNTUS .OR. WNTUO .OR. WNTUN ) ) THEN
283 ELSE IF( .NOT.( WNTVA .OR. WNTVS .OR. WNTVO .OR. WNTVN ) .OR.
284 $ ( WNTVO .AND. WNTUO ) ) THEN
286 ELSE IF( M.LT.0 ) THEN
288 ELSE IF( N.LT.0 ) THEN
290 ELSE IF( LDA.LT.MAX( 1, M ) ) THEN
292 ELSE IF( LDU.LT.1 .OR. ( WNTUAS .AND. LDU.LT.M ) ) THEN
294 ELSE IF( LDVT.LT.1 .OR. ( WNTVA .AND. LDVT.LT.N ) .OR.
295 $ ( WNTVS .AND. LDVT.LT.MINMN ) ) THEN
300 * (Note: Comments in the code beginning "Workspace:" describe the
301 * minimal amount of workspace needed at that point in the code,
302 * as well as the preferred amount for good performance.
303 * NB refers to the optimal block size for the immediately
304 * following subroutine, as returned by ILAENV.)
309 IF( M.GE.N .AND. MINMN.GT.0 ) THEN
311 * Compute space needed for DBDSQR
313 MNTHR = ILAENV( 6, 'DGESVD', JOBU // JOBVT, M, N, 0, 0 )
315 * Compute space needed for DGEQRF
316 CALL DGEQRF( M, N, A, LDA, DUM(1), DUM(1), -1, IERR )
317 LWORK_DGEQRF = INT( DUM(1) )
318 * Compute space needed for DORGQR
319 CALL DORGQR( M, N, N, A, LDA, DUM(1), DUM(1), -1, IERR )
320 LWORK_DORGQR_N = INT( DUM(1) )
321 CALL DORGQR( M, M, N, A, LDA, DUM(1), DUM(1), -1, IERR )
322 LWORK_DORGQR_M = INT( DUM(1) )
323 * Compute space needed for DGEBRD
324 CALL DGEBRD( N, N, A, LDA, S, DUM(1), DUM(1),
325 $ DUM(1), DUM(1), -1, IERR )
326 LWORK_DGEBRD = INT( DUM(1) )
327 * Compute space needed for DORGBR P
328 CALL DORGBR( 'P', N, N, N, A, LDA, DUM(1),
330 LWORK_DORGBR_P = INT( DUM(1) )
331 * Compute space needed for DORGBR Q
332 CALL DORGBR( 'Q', N, N, N, A, LDA, DUM(1),
334 LWORK_DORGBR_Q = INT( DUM(1) )
336 IF( M.GE.MNTHR ) THEN
339 * Path 1 (M much larger than N, JOBU='N')
341 MAXWRK = N + LWORK_DGEQRF
342 MAXWRK = MAX( MAXWRK, 3*N + LWORK_DGEBRD )
343 IF( WNTVO .OR. WNTVAS )
344 $ MAXWRK = MAX( MAXWRK, 3*N + LWORK_DORGBR_P )
345 MAXWRK = MAX( MAXWRK, BDSPAC )
346 MINWRK = MAX( 4*N, BDSPAC )
347 ELSE IF( WNTUO .AND. WNTVN ) THEN
349 * Path 2 (M much larger than N, JOBU='O', JOBVT='N')
351 WRKBL = N + LWORK_DGEQRF
352 WRKBL = MAX( WRKBL, N + LWORK_DORGQR_N )
353 WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD )
354 WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q )
355 WRKBL = MAX( WRKBL, BDSPAC )
356 MAXWRK = MAX( N*N + WRKBL, N*N + M*N + N )
357 MINWRK = MAX( 3*N + M, BDSPAC )
358 ELSE IF( WNTUO .AND. WNTVAS ) THEN
360 * Path 3 (M much larger than N, JOBU='O', JOBVT='S' or
363 WRKBL = N + LWORK_DGEQRF
364 WRKBL = MAX( WRKBL, N + LWORK_DORGQR_N )
365 WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD )
366 WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q )
367 WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_P )
368 WRKBL = MAX( WRKBL, BDSPAC )
369 MAXWRK = MAX( N*N + WRKBL, N*N + M*N + N )
370 MINWRK = MAX( 3*N + M, BDSPAC )
371 ELSE IF( WNTUS .AND. WNTVN ) THEN
373 * Path 4 (M much larger than N, JOBU='S', JOBVT='N')
375 WRKBL = N + LWORK_DGEQRF
376 WRKBL = MAX( WRKBL, N + LWORK_DORGQR_N )
377 WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD )
378 WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q )
379 WRKBL = MAX( WRKBL, BDSPAC )
381 MINWRK = MAX( 3*N + M, BDSPAC )
382 ELSE IF( WNTUS .AND. WNTVO ) THEN
384 * Path 5 (M much larger than N, JOBU='S', JOBVT='O')
386 WRKBL = N + LWORK_DGEQRF
387 WRKBL = MAX( WRKBL, N + LWORK_DORGQR_N )
388 WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD )
389 WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q )
390 WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_P )
391 WRKBL = MAX( WRKBL, BDSPAC )
392 MAXWRK = 2*N*N + WRKBL
393 MINWRK = MAX( 3*N + M, BDSPAC )
394 ELSE IF( WNTUS .AND. WNTVAS ) THEN
396 * Path 6 (M much larger than N, JOBU='S', JOBVT='S' or
399 WRKBL = N + LWORK_DGEQRF
400 WRKBL = MAX( WRKBL, N + LWORK_DORGQR_N )
401 WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD )
402 WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q )
403 WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_P )
404 WRKBL = MAX( WRKBL, BDSPAC )
406 MINWRK = MAX( 3*N + M, BDSPAC )
407 ELSE IF( WNTUA .AND. WNTVN ) THEN
409 * Path 7 (M much larger than N, JOBU='A', JOBVT='N')
411 WRKBL = N + LWORK_DGEQRF
412 WRKBL = MAX( WRKBL, N + LWORK_DORGQR_M )
413 WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD )
414 WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q )
415 WRKBL = MAX( WRKBL, BDSPAC )
417 MINWRK = MAX( 3*N + M, BDSPAC )
418 ELSE IF( WNTUA .AND. WNTVO ) THEN
420 * Path 8 (M much larger than N, JOBU='A', JOBVT='O')
422 WRKBL = N + LWORK_DGEQRF
423 WRKBL = MAX( WRKBL, N + LWORK_DORGQR_M )
424 WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD )
425 WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q )
426 WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_P )
427 WRKBL = MAX( WRKBL, BDSPAC )
428 MAXWRK = 2*N*N + WRKBL
429 MINWRK = MAX( 3*N + M, BDSPAC )
430 ELSE IF( WNTUA .AND. WNTVAS ) THEN
432 * Path 9 (M much larger than N, JOBU='A', JOBVT='S' or
435 WRKBL = N + LWORK_DGEQRF
436 WRKBL = MAX( WRKBL, N + LWORK_DORGQR_M )
437 WRKBL = MAX( WRKBL, 3*N + LWORK_DGEBRD )
438 WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_Q )
439 WRKBL = MAX( WRKBL, 3*N + LWORK_DORGBR_P )
440 WRKBL = MAX( WRKBL, BDSPAC )
442 MINWRK = MAX( 3*N + M, BDSPAC )
446 * Path 10 (M at least N, but not much larger)
448 CALL DGEBRD( M, N, A, LDA, S, DUM(1), DUM(1),
449 $ DUM(1), DUM(1), -1, IERR )
450 LWORK_DGEBRD = INT( DUM(1) )
451 MAXWRK = 3*N + LWORK_DGEBRD
452 IF( WNTUS .OR. WNTUO ) THEN
453 CALL DORGBR( 'Q', M, N, N, A, LDA, DUM(1),
455 LWORK_DORGBR_Q = INT( DUM(1) )
456 MAXWRK = MAX( MAXWRK, 3*N + LWORK_DORGBR_Q )
459 CALL DORGBR( 'Q', M, M, N, A, LDA, DUM(1),
461 LWORK_DORGBR_Q = INT( DUM(1) )
462 MAXWRK = MAX( MAXWRK, 3*N + LWORK_DORGBR_Q )
464 IF( .NOT.WNTVN ) THEN
465 MAXWRK = MAX( MAXWRK, 3*N + LWORK_DORGBR_P )
467 MAXWRK = MAX( MAXWRK, BDSPAC )
468 MINWRK = MAX( 3*N + M, BDSPAC )
470 ELSE IF( MINMN.GT.0 ) THEN
472 * Compute space needed for DBDSQR
474 MNTHR = ILAENV( 6, 'DGESVD', JOBU // JOBVT, M, N, 0, 0 )
476 * Compute space needed for DGELQF
477 CALL DGELQF( M, N, A, LDA, DUM(1), DUM(1), -1, IERR )
478 LWORK_DGELQF = INT( DUM(1) )
479 * Compute space needed for DORGLQ
480 CALL DORGLQ( N, N, M, DUM(1), N, DUM(1), DUM(1), -1, IERR )
481 LWORK_DORGLQ_N = INT( DUM(1) )
482 CALL DORGLQ( M, N, M, A, LDA, DUM(1), DUM(1), -1, IERR )
483 LWORK_DORGLQ_M = INT( DUM(1) )
484 * Compute space needed for DGEBRD
485 CALL DGEBRD( M, M, A, LDA, S, DUM(1), DUM(1),
486 $ DUM(1), DUM(1), -1, IERR )
487 LWORK_DGEBRD = INT( DUM(1) )
488 * Compute space needed for DORGBR P
489 CALL DORGBR( 'P', M, M, M, A, N, DUM(1),
491 LWORK_DORGBR_P = INT( DUM(1) )
492 * Compute space needed for DORGBR Q
493 CALL DORGBR( 'Q', M, M, M, A, N, DUM(1),
495 LWORK_DORGBR_Q = INT( DUM(1) )
496 IF( N.GE.MNTHR ) THEN
499 * Path 1t(N much larger than M, JOBVT='N')
501 MAXWRK = M + LWORK_DGELQF
502 MAXWRK = MAX( MAXWRK, 3*M + LWORK_DGEBRD )
503 IF( WNTUO .OR. WNTUAS )
504 $ MAXWRK = MAX( MAXWRK, 3*M + LWORK_DORGBR_Q )
505 MAXWRK = MAX( MAXWRK, BDSPAC )
506 MINWRK = MAX( 4*M, BDSPAC )
507 ELSE IF( WNTVO .AND. WNTUN ) THEN
509 * Path 2t(N much larger than M, JOBU='N', JOBVT='O')
511 WRKBL = M + LWORK_DGELQF
512 WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_M )
513 WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD )
514 WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P )
515 WRKBL = MAX( WRKBL, BDSPAC )
516 MAXWRK = MAX( M*M + WRKBL, M*M + M*N + M )
517 MINWRK = MAX( 3*M + N, BDSPAC )
518 ELSE IF( WNTVO .AND. WNTUAS ) THEN
520 * Path 3t(N much larger than M, JOBU='S' or 'A',
523 WRKBL = M + LWORK_DGELQF
524 WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_M )
525 WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD )
526 WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P )
527 WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_Q )
528 WRKBL = MAX( WRKBL, BDSPAC )
529 MAXWRK = MAX( M*M + WRKBL, M*M + M*N + M )
530 MINWRK = MAX( 3*M + N, BDSPAC )
531 ELSE IF( WNTVS .AND. WNTUN ) THEN
533 * Path 4t(N much larger than M, JOBU='N', JOBVT='S')
535 WRKBL = M + LWORK_DGELQF
536 WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_M )
537 WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD )
538 WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P )
539 WRKBL = MAX( WRKBL, BDSPAC )
541 MINWRK = MAX( 3*M + N, BDSPAC )
542 ELSE IF( WNTVS .AND. WNTUO ) THEN
544 * Path 5t(N much larger than M, JOBU='O', JOBVT='S')
546 WRKBL = M + LWORK_DGELQF
547 WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_M )
548 WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD )
549 WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P )
550 WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_Q )
551 WRKBL = MAX( WRKBL, BDSPAC )
552 MAXWRK = 2*M*M + WRKBL
553 MINWRK = MAX( 3*M + N, BDSPAC )
554 ELSE IF( WNTVS .AND. WNTUAS ) THEN
556 * Path 6t(N much larger than M, JOBU='S' or 'A',
559 WRKBL = M + LWORK_DGELQF
560 WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_M )
561 WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD )
562 WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P )
563 WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_Q )
564 WRKBL = MAX( WRKBL, BDSPAC )
566 MINWRK = MAX( 3*M + N, BDSPAC )
567 ELSE IF( WNTVA .AND. WNTUN ) THEN
569 * Path 7t(N much larger than M, JOBU='N', JOBVT='A')
571 WRKBL = M + LWORK_DGELQF
572 WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_N )
573 WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD )
574 WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P )
575 WRKBL = MAX( WRKBL, BDSPAC )
577 MINWRK = MAX( 3*M + N, BDSPAC )
578 ELSE IF( WNTVA .AND. WNTUO ) THEN
580 * Path 8t(N much larger than M, JOBU='O', JOBVT='A')
582 WRKBL = M + LWORK_DGELQF
583 WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_N )
584 WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD )
585 WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P )
586 WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_Q )
587 WRKBL = MAX( WRKBL, BDSPAC )
588 MAXWRK = 2*M*M + WRKBL
589 MINWRK = MAX( 3*M + N, BDSPAC )
590 ELSE IF( WNTVA .AND. WNTUAS ) THEN
592 * Path 9t(N much larger than M, JOBU='S' or 'A',
595 WRKBL = M + LWORK_DGELQF
596 WRKBL = MAX( WRKBL, M + LWORK_DORGLQ_N )
597 WRKBL = MAX( WRKBL, 3*M + LWORK_DGEBRD )
598 WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_P )
599 WRKBL = MAX( WRKBL, 3*M + LWORK_DORGBR_Q )
600 WRKBL = MAX( WRKBL, BDSPAC )
602 MINWRK = MAX( 3*M + N, BDSPAC )
606 * Path 10t(N greater than M, but not much larger)
608 CALL DGEBRD( M, N, A, LDA, S, DUM(1), DUM(1),
609 $ DUM(1), DUM(1), -1, IERR )
610 LWORK_DGEBRD = INT( DUM(1) )
611 MAXWRK = 3*M + LWORK_DGEBRD
612 IF( WNTVS .OR. WNTVO ) THEN
613 * Compute space needed for DORGBR P
614 CALL DORGBR( 'P', M, N, M, A, N, DUM(1),
616 LWORK_DORGBR_P = INT( DUM(1) )
617 MAXWRK = MAX( MAXWRK, 3*M + LWORK_DORGBR_P )
620 CALL DORGBR( 'P', N, N, M, A, N, DUM(1),
622 LWORK_DORGBR_P = INT( DUM(1) )
623 MAXWRK = MAX( MAXWRK, 3*M + LWORK_DORGBR_P )
625 IF( .NOT.WNTUN ) THEN
626 MAXWRK = MAX( MAXWRK, 3*M + LWORK_DORGBR_Q )
628 MAXWRK = MAX( MAXWRK, BDSPAC )
629 MINWRK = MAX( 3*M + N, BDSPAC )
632 MAXWRK = MAX( MAXWRK, MINWRK )
635 IF( LWORK.LT.MINWRK .AND. .NOT.LQUERY ) THEN
641 CALL XERBLA( 'DGESVD', -INFO )
643 ELSE IF( LQUERY ) THEN
647 * Quick return if possible
649 IF( M.EQ.0 .OR. N.EQ.0 ) THEN
653 * Get machine constants
656 SMLNUM = SQRT( DLAMCH( 'S' ) ) / EPS
657 BIGNUM = ONE / SMLNUM
659 * Scale A if max element outside range [SMLNUM,BIGNUM]
661 ANRM = DLANGE( 'M', M, N, A, LDA, DUM )
663 IF( ANRM.GT.ZERO .AND. ANRM.LT.SMLNUM ) THEN
665 CALL DLASCL( 'G', 0, 0, ANRM, SMLNUM, M, N, A, LDA, IERR )
666 ELSE IF( ANRM.GT.BIGNUM ) THEN
668 CALL DLASCL( 'G', 0, 0, ANRM, BIGNUM, M, N, A, LDA, IERR )
673 * A has at least as many rows as columns. If A has sufficiently
674 * more rows than columns, first reduce using the QR
675 * decomposition (if sufficient workspace available)
677 IF( M.GE.MNTHR ) THEN
681 * Path 1 (M much larger than N, JOBU='N')
682 * No left singular vectors to be computed
688 * (Workspace: need 2*N, prefer N + N*NB)
690 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( IWORK ),
691 $ LWORK-IWORK+1, IERR )
696 CALL DLASET( 'L', N-1, N-1, ZERO, ZERO, A( 2, 1 ),
704 * Bidiagonalize R in A
705 * (Workspace: need 4*N, prefer 3*N + 2*N*NB)
707 CALL DGEBRD( N, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ),
708 $ WORK( ITAUP ), WORK( IWORK ), LWORK-IWORK+1,
711 IF( WNTVO .OR. WNTVAS ) THEN
713 * If right singular vectors desired, generate P'.
714 * (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB)
716 CALL DORGBR( 'P', N, N, N, A, LDA, WORK( ITAUP ),
717 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
722 * Perform bidiagonal QR iteration, computing right
723 * singular vectors of A in A if desired
724 * (Workspace: need BDSPAC)
726 CALL DBDSQR( 'U', N, NCVT, 0, 0, S, WORK( IE ), A, LDA,
727 $ DUM, 1, DUM, 1, WORK( IWORK ), INFO )
729 * If right singular vectors desired in VT, copy them there
732 $ CALL DLACPY( 'F', N, N, A, LDA, VT, LDVT )
734 ELSE IF( WNTUO .AND. WNTVN ) THEN
736 * Path 2 (M much larger than N, JOBU='O', JOBVT='N')
737 * N left singular vectors to be overwritten on A and
738 * no right singular vectors to be computed
740 IF( LWORK.GE.N*N+MAX( 4*N, BDSPAC ) ) THEN
742 * Sufficient workspace for a fast algorithm
745 IF( LWORK.GE.MAX( WRKBL, LDA*N + N ) + LDA*N ) THEN
747 * WORK(IU) is LDA by N, WORK(IR) is LDA by N
751 ELSE IF( LWORK.GE.MAX( WRKBL, LDA*N + N ) + N*N ) THEN
753 * WORK(IU) is LDA by N, WORK(IR) is N by N
759 * WORK(IU) is LDWRKU by N, WORK(IR) is N by N
761 LDWRKU = ( LWORK-N*N-N ) / N
768 * (Workspace: need N*N + 2*N, prefer N*N + N + N*NB)
770 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ),
771 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
773 * Copy R to WORK(IR) and zero out below it
775 CALL DLACPY( 'U', N, N, A, LDA, WORK( IR ), LDWRKR )
776 CALL DLASET( 'L', N-1, N-1, ZERO, ZERO, WORK( IR+1 ),
780 * (Workspace: need N*N + 2*N, prefer N*N + N + N*NB)
782 CALL DORGQR( M, N, N, A, LDA, WORK( ITAU ),
783 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
789 * Bidiagonalize R in WORK(IR)
790 * (Workspace: need N*N + 4*N, prefer N*N + 3*N + 2*N*NB)
792 CALL DGEBRD( N, N, WORK( IR ), LDWRKR, S, WORK( IE ),
793 $ WORK( ITAUQ ), WORK( ITAUP ),
794 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
796 * Generate left vectors bidiagonalizing R
797 * (Workspace: need N*N + 4*N, prefer N*N + 3*N + N*NB)
799 CALL DORGBR( 'Q', N, N, N, WORK( IR ), LDWRKR,
800 $ WORK( ITAUQ ), WORK( IWORK ),
801 $ LWORK-IWORK+1, IERR )
804 * Perform bidiagonal QR iteration, computing left
805 * singular vectors of R in WORK(IR)
806 * (Workspace: need N*N + BDSPAC)
808 CALL DBDSQR( 'U', N, 0, N, 0, S, WORK( IE ), DUM, 1,
809 $ WORK( IR ), LDWRKR, DUM, 1,
810 $ WORK( IWORK ), INFO )
813 * Multiply Q in A by left singular vectors of R in
814 * WORK(IR), storing result in WORK(IU) and copying to A
815 * (Workspace: need N*N + 2*N, prefer N*N + M*N + N)
817 DO 10 I = 1, M, LDWRKU
818 CHUNK = MIN( M-I+1, LDWRKU )
819 CALL DGEMM( 'N', 'N', CHUNK, N, N, ONE, A( I, 1 ),
820 $ LDA, WORK( IR ), LDWRKR, ZERO,
821 $ WORK( IU ), LDWRKU )
822 CALL DLACPY( 'F', CHUNK, N, WORK( IU ), LDWRKU,
828 * Insufficient workspace for a fast algorithm
836 * (Workspace: need 3*N + M, prefer 3*N + (M + N)*NB)
838 CALL DGEBRD( M, N, A, LDA, S, WORK( IE ),
839 $ WORK( ITAUQ ), WORK( ITAUP ),
840 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
842 * Generate left vectors bidiagonalizing A
843 * (Workspace: need 4*N, prefer 3*N + N*NB)
845 CALL DORGBR( 'Q', M, N, N, A, LDA, WORK( ITAUQ ),
846 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
849 * Perform bidiagonal QR iteration, computing left
850 * singular vectors of A in A
851 * (Workspace: need BDSPAC)
853 CALL DBDSQR( 'U', N, 0, M, 0, S, WORK( IE ), DUM, 1,
854 $ A, LDA, DUM, 1, WORK( IWORK ), INFO )
858 ELSE IF( WNTUO .AND. WNTVAS ) THEN
860 * Path 3 (M much larger than N, JOBU='O', JOBVT='S' or 'A')
861 * N left singular vectors to be overwritten on A and
862 * N right singular vectors to be computed in VT
864 IF( LWORK.GE.N*N+MAX( 4*N, BDSPAC ) ) THEN
866 * Sufficient workspace for a fast algorithm
869 IF( LWORK.GE.MAX( WRKBL, LDA*N + N ) + LDA*N ) THEN
871 * WORK(IU) is LDA by N and WORK(IR) is LDA by N
875 ELSE IF( LWORK.GE.MAX( WRKBL, LDA*N + N ) + N*N ) THEN
877 * WORK(IU) is LDA by N and WORK(IR) is N by N
883 * WORK(IU) is LDWRKU by N and WORK(IR) is N by N
885 LDWRKU = ( LWORK-N*N-N ) / N
892 * (Workspace: need N*N + 2*N, prefer N*N + N + N*NB)
894 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ),
895 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
897 * Copy R to VT, zeroing out below it
899 CALL DLACPY( 'U', N, N, A, LDA, VT, LDVT )
901 $ CALL DLASET( 'L', N-1, N-1, ZERO, ZERO,
905 * (Workspace: need N*N + 2*N, prefer N*N + N + N*NB)
907 CALL DORGQR( M, N, N, A, LDA, WORK( ITAU ),
908 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
914 * Bidiagonalize R in VT, copying result to WORK(IR)
915 * (Workspace: need N*N + 4*N, prefer N*N + 3*N + 2*N*NB)
917 CALL DGEBRD( N, N, VT, LDVT, S, WORK( IE ),
918 $ WORK( ITAUQ ), WORK( ITAUP ),
919 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
920 CALL DLACPY( 'L', N, N, VT, LDVT, WORK( IR ), LDWRKR )
922 * Generate left vectors bidiagonalizing R in WORK(IR)
923 * (Workspace: need N*N + 4*N, prefer N*N + 3*N + N*NB)
925 CALL DORGBR( 'Q', N, N, N, WORK( IR ), LDWRKR,
926 $ WORK( ITAUQ ), WORK( IWORK ),
927 $ LWORK-IWORK+1, IERR )
929 * Generate right vectors bidiagonalizing R in VT
930 * (Workspace: need N*N + 4*N-1, prefer N*N + 3*N + (N-1)*NB)
932 CALL DORGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ),
933 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
936 * Perform bidiagonal QR iteration, computing left
937 * singular vectors of R in WORK(IR) and computing right
938 * singular vectors of R in VT
939 * (Workspace: need N*N + BDSPAC)
941 CALL DBDSQR( 'U', N, N, N, 0, S, WORK( IE ), VT, LDVT,
942 $ WORK( IR ), LDWRKR, DUM, 1,
943 $ WORK( IWORK ), INFO )
946 * Multiply Q in A by left singular vectors of R in
947 * WORK(IR), storing result in WORK(IU) and copying to A
948 * (Workspace: need N*N + 2*N, prefer N*N + M*N + N)
950 DO 20 I = 1, M, LDWRKU
951 CHUNK = MIN( M-I+1, LDWRKU )
952 CALL DGEMM( 'N', 'N', CHUNK, N, N, ONE, A( I, 1 ),
953 $ LDA, WORK( IR ), LDWRKR, ZERO,
954 $ WORK( IU ), LDWRKU )
955 CALL DLACPY( 'F', CHUNK, N, WORK( IU ), LDWRKU,
961 * Insufficient workspace for a fast algorithm
967 * (Workspace: need 2*N, prefer N + N*NB)
969 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ),
970 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
972 * Copy R to VT, zeroing out below it
974 CALL DLACPY( 'U', N, N, A, LDA, VT, LDVT )
976 $ CALL DLASET( 'L', N-1, N-1, ZERO, ZERO,
980 * (Workspace: need 2*N, prefer N + N*NB)
982 CALL DORGQR( M, N, N, A, LDA, WORK( ITAU ),
983 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
989 * Bidiagonalize R in VT
990 * (Workspace: need 4*N, prefer 3*N + 2*N*NB)
992 CALL DGEBRD( N, N, VT, LDVT, S, WORK( IE ),
993 $ WORK( ITAUQ ), WORK( ITAUP ),
994 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
996 * Multiply Q in A by left vectors bidiagonalizing R
997 * (Workspace: need 3*N + M, prefer 3*N + M*NB)
999 CALL DORMBR( 'Q', 'R', 'N', M, N, N, VT, LDVT,
1000 $ WORK( ITAUQ ), A, LDA, WORK( IWORK ),
1001 $ LWORK-IWORK+1, IERR )
1003 * Generate right vectors bidiagonalizing R in VT
1004 * (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB)
1006 CALL DORGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ),
1007 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1010 * Perform bidiagonal QR iteration, computing left
1011 * singular vectors of A in A and computing right
1012 * singular vectors of A in VT
1013 * (Workspace: need BDSPAC)
1015 CALL DBDSQR( 'U', N, N, M, 0, S, WORK( IE ), VT, LDVT,
1016 $ A, LDA, DUM, 1, WORK( IWORK ), INFO )
1020 ELSE IF( WNTUS ) THEN
1024 * Path 4 (M much larger than N, JOBU='S', JOBVT='N')
1025 * N left singular vectors to be computed in U and
1026 * no right singular vectors to be computed
1028 IF( LWORK.GE.N*N+MAX( 4*N, BDSPAC ) ) THEN
1030 * Sufficient workspace for a fast algorithm
1033 IF( LWORK.GE.WRKBL+LDA*N ) THEN
1035 * WORK(IR) is LDA by N
1040 * WORK(IR) is N by N
1044 ITAU = IR + LDWRKR*N
1048 * (Workspace: need N*N + 2*N, prefer N*N + N + N*NB)
1050 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ),
1051 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1053 * Copy R to WORK(IR), zeroing out below it
1055 CALL DLACPY( 'U', N, N, A, LDA, WORK( IR ),
1057 CALL DLASET( 'L', N-1, N-1, ZERO, ZERO,
1058 $ WORK( IR+1 ), LDWRKR )
1061 * (Workspace: need N*N + 2*N, prefer N*N + N + N*NB)
1063 CALL DORGQR( M, N, N, A, LDA, WORK( ITAU ),
1064 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1070 * Bidiagonalize R in WORK(IR)
1071 * (Workspace: need N*N + 4*N, prefer N*N + 3*N + 2*N*NB)
1073 CALL DGEBRD( N, N, WORK( IR ), LDWRKR, S,
1074 $ WORK( IE ), WORK( ITAUQ ),
1075 $ WORK( ITAUP ), WORK( IWORK ),
1076 $ LWORK-IWORK+1, IERR )
1078 * Generate left vectors bidiagonalizing R in WORK(IR)
1079 * (Workspace: need N*N + 4*N, prefer N*N + 3*N + N*NB)
1081 CALL DORGBR( 'Q', N, N, N, WORK( IR ), LDWRKR,
1082 $ WORK( ITAUQ ), WORK( IWORK ),
1083 $ LWORK-IWORK+1, IERR )
1086 * Perform bidiagonal QR iteration, computing left
1087 * singular vectors of R in WORK(IR)
1088 * (Workspace: need N*N + BDSPAC)
1090 CALL DBDSQR( 'U', N, 0, N, 0, S, WORK( IE ), DUM,
1091 $ 1, WORK( IR ), LDWRKR, DUM, 1,
1092 $ WORK( IWORK ), INFO )
1094 * Multiply Q in A by left singular vectors of R in
1095 * WORK(IR), storing result in U
1096 * (Workspace: need N*N)
1098 CALL DGEMM( 'N', 'N', M, N, N, ONE, A, LDA,
1099 $ WORK( IR ), LDWRKR, ZERO, U, LDU )
1103 * Insufficient workspace for a fast algorithm
1108 * Compute A=Q*R, copying result to U
1109 * (Workspace: need 2*N, prefer N + N*NB)
1111 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ),
1112 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1113 CALL DLACPY( 'L', M, N, A, LDA, U, LDU )
1116 * (Workspace: need 2*N, prefer N + N*NB)
1118 CALL DORGQR( M, N, N, U, LDU, WORK( ITAU ),
1119 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1125 * Zero out below R in A
1128 CALL DLASET( 'L', N-1, N-1, ZERO, ZERO,
1132 * Bidiagonalize R in A
1133 * (Workspace: need 4*N, prefer 3*N + 2*N*NB)
1135 CALL DGEBRD( N, N, A, LDA, S, WORK( IE ),
1136 $ WORK( ITAUQ ), WORK( ITAUP ),
1137 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1139 * Multiply Q in U by left vectors bidiagonalizing R
1140 * (Workspace: need 3*N + M, prefer 3*N + M*NB)
1142 CALL DORMBR( 'Q', 'R', 'N', M, N, N, A, LDA,
1143 $ WORK( ITAUQ ), U, LDU, WORK( IWORK ),
1144 $ LWORK-IWORK+1, IERR )
1147 * Perform bidiagonal QR iteration, computing left
1148 * singular vectors of A in U
1149 * (Workspace: need BDSPAC)
1151 CALL DBDSQR( 'U', N, 0, M, 0, S, WORK( IE ), DUM,
1152 $ 1, U, LDU, DUM, 1, WORK( IWORK ),
1157 ELSE IF( WNTVO ) THEN
1159 * Path 5 (M much larger than N, JOBU='S', JOBVT='O')
1160 * N left singular vectors to be computed in U and
1161 * N right singular vectors to be overwritten on A
1163 IF( LWORK.GE.2*N*N+MAX( 4*N, BDSPAC ) ) THEN
1165 * Sufficient workspace for a fast algorithm
1168 IF( LWORK.GE.WRKBL+2*LDA*N ) THEN
1170 * WORK(IU) is LDA by N and WORK(IR) is LDA by N
1175 ELSE IF( LWORK.GE.WRKBL+( LDA + N )*N ) THEN
1177 * WORK(IU) is LDA by N and WORK(IR) is N by N
1184 * WORK(IU) is N by N and WORK(IR) is N by N
1190 ITAU = IR + LDWRKR*N
1194 * (Workspace: need 2*N*N + 2*N, prefer 2*N*N + N + N*NB)
1196 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ),
1197 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1199 * Copy R to WORK(IU), zeroing out below it
1201 CALL DLACPY( 'U', N, N, A, LDA, WORK( IU ),
1203 CALL DLASET( 'L', N-1, N-1, ZERO, ZERO,
1204 $ WORK( IU+1 ), LDWRKU )
1207 * (Workspace: need 2*N*N + 2*N, prefer 2*N*N + N + N*NB)
1209 CALL DORGQR( M, N, N, A, LDA, WORK( ITAU ),
1210 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1216 * Bidiagonalize R in WORK(IU), copying result to
1218 * (Workspace: need 2*N*N + 4*N,
1219 * prefer 2*N*N+3*N+2*N*NB)
1221 CALL DGEBRD( N, N, WORK( IU ), LDWRKU, S,
1222 $ WORK( IE ), WORK( ITAUQ ),
1223 $ WORK( ITAUP ), WORK( IWORK ),
1224 $ LWORK-IWORK+1, IERR )
1225 CALL DLACPY( 'U', N, N, WORK( IU ), LDWRKU,
1226 $ WORK( IR ), LDWRKR )
1228 * Generate left bidiagonalizing vectors in WORK(IU)
1229 * (Workspace: need 2*N*N + 4*N, prefer 2*N*N + 3*N + N*NB)
1231 CALL DORGBR( 'Q', N, N, N, WORK( IU ), LDWRKU,
1232 $ WORK( ITAUQ ), WORK( IWORK ),
1233 $ LWORK-IWORK+1, IERR )
1235 * Generate right bidiagonalizing vectors in WORK(IR)
1236 * (Workspace: need 2*N*N + 4*N-1,
1237 * prefer 2*N*N+3*N+(N-1)*NB)
1239 CALL DORGBR( 'P', N, N, N, WORK( IR ), LDWRKR,
1240 $ WORK( ITAUP ), WORK( IWORK ),
1241 $ LWORK-IWORK+1, IERR )
1244 * Perform bidiagonal QR iteration, computing left
1245 * singular vectors of R in WORK(IU) and computing
1246 * right singular vectors of R in WORK(IR)
1247 * (Workspace: need 2*N*N + BDSPAC)
1249 CALL DBDSQR( 'U', N, N, N, 0, S, WORK( IE ),
1250 $ WORK( IR ), LDWRKR, WORK( IU ),
1251 $ LDWRKU, DUM, 1, WORK( IWORK ), INFO )
1253 * Multiply Q in A by left singular vectors of R in
1254 * WORK(IU), storing result in U
1255 * (Workspace: need N*N)
1257 CALL DGEMM( 'N', 'N', M, N, N, ONE, A, LDA,
1258 $ WORK( IU ), LDWRKU, ZERO, U, LDU )
1260 * Copy right singular vectors of R to A
1261 * (Workspace: need N*N)
1263 CALL DLACPY( 'F', N, N, WORK( IR ), LDWRKR, A,
1268 * Insufficient workspace for a fast algorithm
1273 * Compute A=Q*R, copying result to U
1274 * (Workspace: need 2*N, prefer N + N*NB)
1276 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ),
1277 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1278 CALL DLACPY( 'L', M, N, A, LDA, U, LDU )
1281 * (Workspace: need 2*N, prefer N + N*NB)
1283 CALL DORGQR( M, N, N, U, LDU, WORK( ITAU ),
1284 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1290 * Zero out below R in A
1293 CALL DLASET( 'L', N-1, N-1, ZERO, ZERO,
1297 * Bidiagonalize R in A
1298 * (Workspace: need 4*N, prefer 3*N + 2*N*NB)
1300 CALL DGEBRD( N, N, A, LDA, S, WORK( IE ),
1301 $ WORK( ITAUQ ), WORK( ITAUP ),
1302 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1304 * Multiply Q in U by left vectors bidiagonalizing R
1305 * (Workspace: need 3*N + M, prefer 3*N + M*NB)
1307 CALL DORMBR( 'Q', 'R', 'N', M, N, N, A, LDA,
1308 $ WORK( ITAUQ ), U, LDU, WORK( IWORK ),
1309 $ LWORK-IWORK+1, IERR )
1311 * Generate right vectors bidiagonalizing R in A
1312 * (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB)
1314 CALL DORGBR( 'P', N, N, N, A, LDA, WORK( ITAUP ),
1315 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1318 * Perform bidiagonal QR iteration, computing left
1319 * singular vectors of A in U and computing right
1320 * singular vectors of A in A
1321 * (Workspace: need BDSPAC)
1323 CALL DBDSQR( 'U', N, N, M, 0, S, WORK( IE ), A,
1324 $ LDA, U, LDU, DUM, 1, WORK( IWORK ),
1329 ELSE IF( WNTVAS ) THEN
1331 * Path 6 (M much larger than N, JOBU='S', JOBVT='S'
1333 * N left singular vectors to be computed in U and
1334 * N right singular vectors to be computed in VT
1336 IF( LWORK.GE.N*N+MAX( 4*N, BDSPAC ) ) THEN
1338 * Sufficient workspace for a fast algorithm
1341 IF( LWORK.GE.WRKBL+LDA*N ) THEN
1343 * WORK(IU) is LDA by N
1348 * WORK(IU) is N by N
1352 ITAU = IU + LDWRKU*N
1356 * (Workspace: need N*N + 2*N, prefer N*N + N + N*NB)
1358 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ),
1359 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1361 * Copy R to WORK(IU), zeroing out below it
1363 CALL DLACPY( 'U', N, N, A, LDA, WORK( IU ),
1365 CALL DLASET( 'L', N-1, N-1, ZERO, ZERO,
1366 $ WORK( IU+1 ), LDWRKU )
1369 * (Workspace: need N*N + 2*N, prefer N*N + N + N*NB)
1371 CALL DORGQR( M, N, N, A, LDA, WORK( ITAU ),
1372 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1378 * Bidiagonalize R in WORK(IU), copying result to VT
1379 * (Workspace: need N*N + 4*N, prefer N*N + 3*N + 2*N*NB)
1381 CALL DGEBRD( N, N, WORK( IU ), LDWRKU, S,
1382 $ WORK( IE ), WORK( ITAUQ ),
1383 $ WORK( ITAUP ), WORK( IWORK ),
1384 $ LWORK-IWORK+1, IERR )
1385 CALL DLACPY( 'U', N, N, WORK( IU ), LDWRKU, VT,
1388 * Generate left bidiagonalizing vectors in WORK(IU)
1389 * (Workspace: need N*N + 4*N, prefer N*N + 3*N + N*NB)
1391 CALL DORGBR( 'Q', N, N, N, WORK( IU ), LDWRKU,
1392 $ WORK( ITAUQ ), WORK( IWORK ),
1393 $ LWORK-IWORK+1, IERR )
1395 * Generate right bidiagonalizing vectors in VT
1396 * (Workspace: need N*N + 4*N-1,
1397 * prefer N*N+3*N+(N-1)*NB)
1399 CALL DORGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ),
1400 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1403 * Perform bidiagonal QR iteration, computing left
1404 * singular vectors of R in WORK(IU) and computing
1405 * right singular vectors of R in VT
1406 * (Workspace: need N*N + BDSPAC)
1408 CALL DBDSQR( 'U', N, N, N, 0, S, WORK( IE ), VT,
1409 $ LDVT, WORK( IU ), LDWRKU, DUM, 1,
1410 $ WORK( IWORK ), INFO )
1412 * Multiply Q in A by left singular vectors of R in
1413 * WORK(IU), storing result in U
1414 * (Workspace: need N*N)
1416 CALL DGEMM( 'N', 'N', M, N, N, ONE, A, LDA,
1417 $ WORK( IU ), LDWRKU, ZERO, U, LDU )
1421 * Insufficient workspace for a fast algorithm
1426 * Compute A=Q*R, copying result to U
1427 * (Workspace: need 2*N, prefer N + N*NB)
1429 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ),
1430 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1431 CALL DLACPY( 'L', M, N, A, LDA, U, LDU )
1434 * (Workspace: need 2*N, prefer N + N*NB)
1436 CALL DORGQR( M, N, N, U, LDU, WORK( ITAU ),
1437 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1439 * Copy R to VT, zeroing out below it
1441 CALL DLACPY( 'U', N, N, A, LDA, VT, LDVT )
1443 $ CALL DLASET( 'L', N-1, N-1, ZERO, ZERO,
1444 $ VT( 2, 1 ), LDVT )
1450 * Bidiagonalize R in VT
1451 * (Workspace: need 4*N, prefer 3*N + 2*N*NB)
1453 CALL DGEBRD( N, N, VT, LDVT, S, WORK( IE ),
1454 $ WORK( ITAUQ ), WORK( ITAUP ),
1455 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1457 * Multiply Q in U by left bidiagonalizing vectors
1459 * (Workspace: need 3*N + M, prefer 3*N + M*NB)
1461 CALL DORMBR( 'Q', 'R', 'N', M, N, N, VT, LDVT,
1462 $ WORK( ITAUQ ), U, LDU, WORK( IWORK ),
1463 $ LWORK-IWORK+1, IERR )
1465 * Generate right bidiagonalizing vectors in VT
1466 * (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB)
1468 CALL DORGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ),
1469 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1472 * Perform bidiagonal QR iteration, computing left
1473 * singular vectors of A in U and computing right
1474 * singular vectors of A in VT
1475 * (Workspace: need BDSPAC)
1477 CALL DBDSQR( 'U', N, N, M, 0, S, WORK( IE ), VT,
1478 $ LDVT, U, LDU, DUM, 1, WORK( IWORK ),
1485 ELSE IF( WNTUA ) THEN
1489 * Path 7 (M much larger than N, JOBU='A', JOBVT='N')
1490 * M left singular vectors to be computed in U and
1491 * no right singular vectors to be computed
1493 IF( LWORK.GE.N*N+MAX( N+M, 4*N, BDSPAC ) ) THEN
1495 * Sufficient workspace for a fast algorithm
1498 IF( LWORK.GE.WRKBL+LDA*N ) THEN
1500 * WORK(IR) is LDA by N
1505 * WORK(IR) is N by N
1509 ITAU = IR + LDWRKR*N
1512 * Compute A=Q*R, copying result to U
1513 * (Workspace: need N*N + 2*N, prefer N*N + N + N*NB)
1515 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ),
1516 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1517 CALL DLACPY( 'L', M, N, A, LDA, U, LDU )
1519 * Copy R to WORK(IR), zeroing out below it
1521 CALL DLACPY( 'U', N, N, A, LDA, WORK( IR ),
1523 CALL DLASET( 'L', N-1, N-1, ZERO, ZERO,
1524 $ WORK( IR+1 ), LDWRKR )
1527 * (Workspace: need N*N + N + M, prefer N*N + N + M*NB)
1529 CALL DORGQR( M, M, N, U, LDU, WORK( ITAU ),
1530 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1536 * Bidiagonalize R in WORK(IR)
1537 * (Workspace: need N*N + 4*N, prefer N*N + 3*N + 2*N*NB)
1539 CALL DGEBRD( N, N, WORK( IR ), LDWRKR, S,
1540 $ WORK( IE ), WORK( ITAUQ ),
1541 $ WORK( ITAUP ), WORK( IWORK ),
1542 $ LWORK-IWORK+1, IERR )
1544 * Generate left bidiagonalizing vectors in WORK(IR)
1545 * (Workspace: need N*N + 4*N, prefer N*N + 3*N + N*NB)
1547 CALL DORGBR( 'Q', N, N, N, WORK( IR ), LDWRKR,
1548 $ WORK( ITAUQ ), WORK( IWORK ),
1549 $ LWORK-IWORK+1, IERR )
1552 * Perform bidiagonal QR iteration, computing left
1553 * singular vectors of R in WORK(IR)
1554 * (Workspace: need N*N + BDSPAC)
1556 CALL DBDSQR( 'U', N, 0, N, 0, S, WORK( IE ), DUM,
1557 $ 1, WORK( IR ), LDWRKR, DUM, 1,
1558 $ WORK( IWORK ), INFO )
1560 * Multiply Q in U by left singular vectors of R in
1561 * WORK(IR), storing result in A
1562 * (Workspace: need N*N)
1564 CALL DGEMM( 'N', 'N', M, N, N, ONE, U, LDU,
1565 $ WORK( IR ), LDWRKR, ZERO, A, LDA )
1567 * Copy left singular vectors of A from A to U
1569 CALL DLACPY( 'F', M, N, A, LDA, U, LDU )
1573 * Insufficient workspace for a fast algorithm
1578 * Compute A=Q*R, copying result to U
1579 * (Workspace: need 2*N, prefer N + N*NB)
1581 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ),
1582 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1583 CALL DLACPY( 'L', M, N, A, LDA, U, LDU )
1586 * (Workspace: need N + M, prefer N + M*NB)
1588 CALL DORGQR( M, M, N, U, LDU, WORK( ITAU ),
1589 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1595 * Zero out below R in A
1598 CALL DLASET( 'L', N-1, N-1, ZERO, ZERO,
1602 * Bidiagonalize R in A
1603 * (Workspace: need 4*N, prefer 3*N + 2*N*NB)
1605 CALL DGEBRD( N, N, A, LDA, S, WORK( IE ),
1606 $ WORK( ITAUQ ), WORK( ITAUP ),
1607 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1609 * Multiply Q in U by left bidiagonalizing vectors
1611 * (Workspace: need 3*N + M, prefer 3*N + M*NB)
1613 CALL DORMBR( 'Q', 'R', 'N', M, N, N, A, LDA,
1614 $ WORK( ITAUQ ), U, LDU, WORK( IWORK ),
1615 $ LWORK-IWORK+1, IERR )
1618 * Perform bidiagonal QR iteration, computing left
1619 * singular vectors of A in U
1620 * (Workspace: need BDSPAC)
1622 CALL DBDSQR( 'U', N, 0, M, 0, S, WORK( IE ), DUM,
1623 $ 1, U, LDU, DUM, 1, WORK( IWORK ),
1628 ELSE IF( WNTVO ) THEN
1630 * Path 8 (M much larger than N, JOBU='A', JOBVT='O')
1631 * M left singular vectors to be computed in U and
1632 * N right singular vectors to be overwritten on A
1634 IF( LWORK.GE.2*N*N+MAX( N+M, 4*N, BDSPAC ) ) THEN
1636 * Sufficient workspace for a fast algorithm
1639 IF( LWORK.GE.WRKBL+2*LDA*N ) THEN
1641 * WORK(IU) is LDA by N and WORK(IR) is LDA by N
1646 ELSE IF( LWORK.GE.WRKBL+( LDA + N )*N ) THEN
1648 * WORK(IU) is LDA by N and WORK(IR) is N by N
1655 * WORK(IU) is N by N and WORK(IR) is N by N
1661 ITAU = IR + LDWRKR*N
1664 * Compute A=Q*R, copying result to U
1665 * (Workspace: need 2*N*N + 2*N, prefer 2*N*N + N + N*NB)
1667 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ),
1668 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1669 CALL DLACPY( 'L', M, N, A, LDA, U, LDU )
1672 * (Workspace: need 2*N*N + N + M, prefer 2*N*N + N + M*NB)
1674 CALL DORGQR( M, M, N, U, LDU, WORK( ITAU ),
1675 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1677 * Copy R to WORK(IU), zeroing out below it
1679 CALL DLACPY( 'U', N, N, A, LDA, WORK( IU ),
1681 CALL DLASET( 'L', N-1, N-1, ZERO, ZERO,
1682 $ WORK( IU+1 ), LDWRKU )
1688 * Bidiagonalize R in WORK(IU), copying result to
1690 * (Workspace: need 2*N*N + 4*N,
1691 * prefer 2*N*N+3*N+2*N*NB)
1693 CALL DGEBRD( N, N, WORK( IU ), LDWRKU, S,
1694 $ WORK( IE ), WORK( ITAUQ ),
1695 $ WORK( ITAUP ), WORK( IWORK ),
1696 $ LWORK-IWORK+1, IERR )
1697 CALL DLACPY( 'U', N, N, WORK( IU ), LDWRKU,
1698 $ WORK( IR ), LDWRKR )
1700 * Generate left bidiagonalizing vectors in WORK(IU)
1701 * (Workspace: need 2*N*N + 4*N, prefer 2*N*N + 3*N + N*NB)
1703 CALL DORGBR( 'Q', N, N, N, WORK( IU ), LDWRKU,
1704 $ WORK( ITAUQ ), WORK( IWORK ),
1705 $ LWORK-IWORK+1, IERR )
1707 * Generate right bidiagonalizing vectors in WORK(IR)
1708 * (Workspace: need 2*N*N + 4*N-1,
1709 * prefer 2*N*N+3*N+(N-1)*NB)
1711 CALL DORGBR( 'P', N, N, N, WORK( IR ), LDWRKR,
1712 $ WORK( ITAUP ), WORK( IWORK ),
1713 $ LWORK-IWORK+1, IERR )
1716 * Perform bidiagonal QR iteration, computing left
1717 * singular vectors of R in WORK(IU) and computing
1718 * right singular vectors of R in WORK(IR)
1719 * (Workspace: need 2*N*N + BDSPAC)
1721 CALL DBDSQR( 'U', N, N, N, 0, S, WORK( IE ),
1722 $ WORK( IR ), LDWRKR, WORK( IU ),
1723 $ LDWRKU, DUM, 1, WORK( IWORK ), INFO )
1725 * Multiply Q in U by left singular vectors of R in
1726 * WORK(IU), storing result in A
1727 * (Workspace: need N*N)
1729 CALL DGEMM( 'N', 'N', M, N, N, ONE, U, LDU,
1730 $ WORK( IU ), LDWRKU, ZERO, A, LDA )
1732 * Copy left singular vectors of A from A to U
1734 CALL DLACPY( 'F', M, N, A, LDA, U, LDU )
1736 * Copy right singular vectors of R from WORK(IR) to A
1738 CALL DLACPY( 'F', N, N, WORK( IR ), LDWRKR, A,
1743 * Insufficient workspace for a fast algorithm
1748 * Compute A=Q*R, copying result to U
1749 * (Workspace: need 2*N, prefer N + N*NB)
1751 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ),
1752 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1753 CALL DLACPY( 'L', M, N, A, LDA, U, LDU )
1756 * (Workspace: need N + M, prefer N + M*NB)
1758 CALL DORGQR( M, M, N, U, LDU, WORK( ITAU ),
1759 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1765 * Zero out below R in A
1768 CALL DLASET( 'L', N-1, N-1, ZERO, ZERO,
1772 * Bidiagonalize R in A
1773 * (Workspace: need 4*N, prefer 3*N + 2*N*NB)
1775 CALL DGEBRD( N, N, A, LDA, S, WORK( IE ),
1776 $ WORK( ITAUQ ), WORK( ITAUP ),
1777 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1779 * Multiply Q in U by left bidiagonalizing vectors
1781 * (Workspace: need 3*N + M, prefer 3*N + M*NB)
1783 CALL DORMBR( 'Q', 'R', 'N', M, N, N, A, LDA,
1784 $ WORK( ITAUQ ), U, LDU, WORK( IWORK ),
1785 $ LWORK-IWORK+1, IERR )
1787 * Generate right bidiagonalizing vectors in A
1788 * (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB)
1790 CALL DORGBR( 'P', N, N, N, A, LDA, WORK( ITAUP ),
1791 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1794 * Perform bidiagonal QR iteration, computing left
1795 * singular vectors of A in U and computing right
1796 * singular vectors of A in A
1797 * (Workspace: need BDSPAC)
1799 CALL DBDSQR( 'U', N, N, M, 0, S, WORK( IE ), A,
1800 $ LDA, U, LDU, DUM, 1, WORK( IWORK ),
1805 ELSE IF( WNTVAS ) THEN
1807 * Path 9 (M much larger than N, JOBU='A', JOBVT='S'
1809 * M left singular vectors to be computed in U and
1810 * N right singular vectors to be computed in VT
1812 IF( LWORK.GE.N*N+MAX( N+M, 4*N, BDSPAC ) ) THEN
1814 * Sufficient workspace for a fast algorithm
1817 IF( LWORK.GE.WRKBL+LDA*N ) THEN
1819 * WORK(IU) is LDA by N
1824 * WORK(IU) is N by N
1828 ITAU = IU + LDWRKU*N
1831 * Compute A=Q*R, copying result to U
1832 * (Workspace: need N*N + 2*N, prefer N*N + N + N*NB)
1834 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ),
1835 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1836 CALL DLACPY( 'L', M, N, A, LDA, U, LDU )
1839 * (Workspace: need N*N + N + M, prefer N*N + N + M*NB)
1841 CALL DORGQR( M, M, N, U, LDU, WORK( ITAU ),
1842 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1844 * Copy R to WORK(IU), zeroing out below it
1846 CALL DLACPY( 'U', N, N, A, LDA, WORK( IU ),
1848 CALL DLASET( 'L', N-1, N-1, ZERO, ZERO,
1849 $ WORK( IU+1 ), LDWRKU )
1855 * Bidiagonalize R in WORK(IU), copying result to VT
1856 * (Workspace: need N*N + 4*N, prefer N*N + 3*N + 2*N*NB)
1858 CALL DGEBRD( N, N, WORK( IU ), LDWRKU, S,
1859 $ WORK( IE ), WORK( ITAUQ ),
1860 $ WORK( ITAUP ), WORK( IWORK ),
1861 $ LWORK-IWORK+1, IERR )
1862 CALL DLACPY( 'U', N, N, WORK( IU ), LDWRKU, VT,
1865 * Generate left bidiagonalizing vectors in WORK(IU)
1866 * (Workspace: need N*N + 4*N, prefer N*N + 3*N + N*NB)
1868 CALL DORGBR( 'Q', N, N, N, WORK( IU ), LDWRKU,
1869 $ WORK( ITAUQ ), WORK( IWORK ),
1870 $ LWORK-IWORK+1, IERR )
1872 * Generate right bidiagonalizing vectors in VT
1873 * (Workspace: need N*N + 4*N-1,
1874 * prefer N*N+3*N+(N-1)*NB)
1876 CALL DORGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ),
1877 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1880 * Perform bidiagonal QR iteration, computing left
1881 * singular vectors of R in WORK(IU) and computing
1882 * right singular vectors of R in VT
1883 * (Workspace: need N*N + BDSPAC)
1885 CALL DBDSQR( 'U', N, N, N, 0, S, WORK( IE ), VT,
1886 $ LDVT, WORK( IU ), LDWRKU, DUM, 1,
1887 $ WORK( IWORK ), INFO )
1889 * Multiply Q in U by left singular vectors of R in
1890 * WORK(IU), storing result in A
1891 * (Workspace: need N*N)
1893 CALL DGEMM( 'N', 'N', M, N, N, ONE, U, LDU,
1894 $ WORK( IU ), LDWRKU, ZERO, A, LDA )
1896 * Copy left singular vectors of A from A to U
1898 CALL DLACPY( 'F', M, N, A, LDA, U, LDU )
1902 * Insufficient workspace for a fast algorithm
1907 * Compute A=Q*R, copying result to U
1908 * (Workspace: need 2*N, prefer N + N*NB)
1910 CALL DGEQRF( M, N, A, LDA, WORK( ITAU ),
1911 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1912 CALL DLACPY( 'L', M, N, A, LDA, U, LDU )
1915 * (Workspace: need N + M, prefer N + M*NB)
1917 CALL DORGQR( M, M, N, U, LDU, WORK( ITAU ),
1918 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1920 * Copy R from A to VT, zeroing out below it
1922 CALL DLACPY( 'U', N, N, A, LDA, VT, LDVT )
1924 $ CALL DLASET( 'L', N-1, N-1, ZERO, ZERO,
1925 $ VT( 2, 1 ), LDVT )
1931 * Bidiagonalize R in VT
1932 * (Workspace: need 4*N, prefer 3*N + 2*N*NB)
1934 CALL DGEBRD( N, N, VT, LDVT, S, WORK( IE ),
1935 $ WORK( ITAUQ ), WORK( ITAUP ),
1936 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1938 * Multiply Q in U by left bidiagonalizing vectors
1940 * (Workspace: need 3*N + M, prefer 3*N + M*NB)
1942 CALL DORMBR( 'Q', 'R', 'N', M, N, N, VT, LDVT,
1943 $ WORK( ITAUQ ), U, LDU, WORK( IWORK ),
1944 $ LWORK-IWORK+1, IERR )
1946 * Generate right bidiagonalizing vectors in VT
1947 * (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB)
1949 CALL DORGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ),
1950 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
1953 * Perform bidiagonal QR iteration, computing left
1954 * singular vectors of A in U and computing right
1955 * singular vectors of A in VT
1956 * (Workspace: need BDSPAC)
1958 CALL DBDSQR( 'U', N, N, M, 0, S, WORK( IE ), VT,
1959 $ LDVT, U, LDU, DUM, 1, WORK( IWORK ),
1972 * Path 10 (M at least N, but not much larger)
1973 * Reduce to bidiagonal form without QR decomposition
1981 * (Workspace: need 3*N + M, prefer 3*N + (M + N)*NB)
1983 CALL DGEBRD( M, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ),
1984 $ WORK( ITAUP ), WORK( IWORK ), LWORK-IWORK+1,
1988 * If left singular vectors desired in U, copy result to U
1989 * and generate left bidiagonalizing vectors in U
1990 * (Workspace: need 3*N + NCU, prefer 3*N + NCU*NB)
1992 CALL DLACPY( 'L', M, N, A, LDA, U, LDU )
1997 CALL DORGBR( 'Q', M, NCU, N, U, LDU, WORK( ITAUQ ),
1998 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2002 * If right singular vectors desired in VT, copy result to
2003 * VT and generate right bidiagonalizing vectors in VT
2004 * (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB)
2006 CALL DLACPY( 'U', N, N, A, LDA, VT, LDVT )
2007 CALL DORGBR( 'P', N, N, N, VT, LDVT, WORK( ITAUP ),
2008 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2012 * If left singular vectors desired in A, generate left
2013 * bidiagonalizing vectors in A
2014 * (Workspace: need 4*N, prefer 3*N + N*NB)
2016 CALL DORGBR( 'Q', M, N, N, A, LDA, WORK( ITAUQ ),
2017 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2021 * If right singular vectors desired in A, generate right
2022 * bidiagonalizing vectors in A
2023 * (Workspace: need 4*N-1, prefer 3*N + (N-1)*NB)
2025 CALL DORGBR( 'P', N, N, N, A, LDA, WORK( ITAUP ),
2026 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2029 IF( WNTUAS .OR. WNTUO )
2033 IF( WNTVAS .OR. WNTVO )
2037 IF( ( .NOT.WNTUO ) .AND. ( .NOT.WNTVO ) ) THEN
2039 * Perform bidiagonal QR iteration, if desired, computing
2040 * left singular vectors in U and computing right singular
2042 * (Workspace: need BDSPAC)
2044 CALL DBDSQR( 'U', N, NCVT, NRU, 0, S, WORK( IE ), VT,
2045 $ LDVT, U, LDU, DUM, 1, WORK( IWORK ), INFO )
2046 ELSE IF( ( .NOT.WNTUO ) .AND. WNTVO ) THEN
2048 * Perform bidiagonal QR iteration, if desired, computing
2049 * left singular vectors in U and computing right singular
2051 * (Workspace: need BDSPAC)
2053 CALL DBDSQR( 'U', N, NCVT, NRU, 0, S, WORK( IE ), A, LDA,
2054 $ U, LDU, DUM, 1, WORK( IWORK ), INFO )
2057 * Perform bidiagonal QR iteration, if desired, computing
2058 * left singular vectors in A and computing right singular
2060 * (Workspace: need BDSPAC)
2062 CALL DBDSQR( 'U', N, NCVT, NRU, 0, S, WORK( IE ), VT,
2063 $ LDVT, A, LDA, DUM, 1, WORK( IWORK ), INFO )
2070 * A has more columns than rows. If A has sufficiently more
2071 * columns than rows, first reduce using the LQ decomposition (if
2072 * sufficient workspace available)
2074 IF( N.GE.MNTHR ) THEN
2078 * Path 1t(N much larger than M, JOBVT='N')
2079 * No right singular vectors to be computed
2085 * (Workspace: need 2*M, prefer M + M*NB)
2087 CALL DGELQF( M, N, A, LDA, WORK( ITAU ), WORK( IWORK ),
2088 $ LWORK-IWORK+1, IERR )
2092 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO, A( 1, 2 ), LDA )
2098 * Bidiagonalize L in A
2099 * (Workspace: need 4*M, prefer 3*M + 2*M*NB)
2101 CALL DGEBRD( M, M, A, LDA, S, WORK( IE ), WORK( ITAUQ ),
2102 $ WORK( ITAUP ), WORK( IWORK ), LWORK-IWORK+1,
2104 IF( WNTUO .OR. WNTUAS ) THEN
2106 * If left singular vectors desired, generate Q
2107 * (Workspace: need 4*M, prefer 3*M + M*NB)
2109 CALL DORGBR( 'Q', M, M, M, A, LDA, WORK( ITAUQ ),
2110 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2114 IF( WNTUO .OR. WNTUAS )
2117 * Perform bidiagonal QR iteration, computing left singular
2118 * vectors of A in A if desired
2119 * (Workspace: need BDSPAC)
2121 CALL DBDSQR( 'U', M, 0, NRU, 0, S, WORK( IE ), DUM, 1, A,
2122 $ LDA, DUM, 1, WORK( IWORK ), INFO )
2124 * If left singular vectors desired in U, copy them there
2127 $ CALL DLACPY( 'F', M, M, A, LDA, U, LDU )
2129 ELSE IF( WNTVO .AND. WNTUN ) THEN
2131 * Path 2t(N much larger than M, JOBU='N', JOBVT='O')
2132 * M right singular vectors to be overwritten on A and
2133 * no left singular vectors to be computed
2135 IF( LWORK.GE.M*M+MAX( 4*M, BDSPAC ) ) THEN
2137 * Sufficient workspace for a fast algorithm
2140 IF( LWORK.GE.MAX( WRKBL, LDA*N + M ) + LDA*M ) THEN
2142 * WORK(IU) is LDA by N and WORK(IR) is LDA by M
2147 ELSE IF( LWORK.GE.MAX( WRKBL, LDA*N + M ) + M*M ) THEN
2149 * WORK(IU) is LDA by N and WORK(IR) is M by M
2156 * WORK(IU) is M by CHUNK and WORK(IR) is M by M
2159 CHUNK = ( LWORK-M*M-M ) / M
2162 ITAU = IR + LDWRKR*M
2166 * (Workspace: need M*M + 2*M, prefer M*M + M + M*NB)
2168 CALL DGELQF( M, N, A, LDA, WORK( ITAU ),
2169 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2171 * Copy L to WORK(IR) and zero out above it
2173 CALL DLACPY( 'L', M, M, A, LDA, WORK( IR ), LDWRKR )
2174 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO,
2175 $ WORK( IR+LDWRKR ), LDWRKR )
2178 * (Workspace: need M*M + 2*M, prefer M*M + M + M*NB)
2180 CALL DORGLQ( M, N, M, A, LDA, WORK( ITAU ),
2181 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2187 * Bidiagonalize L in WORK(IR)
2188 * (Workspace: need M*M + 4*M, prefer M*M + 3*M + 2*M*NB)
2190 CALL DGEBRD( M, M, WORK( IR ), LDWRKR, S, WORK( IE ),
2191 $ WORK( ITAUQ ), WORK( ITAUP ),
2192 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2194 * Generate right vectors bidiagonalizing L
2195 * (Workspace: need M*M + 4*M-1, prefer M*M + 3*M + (M-1)*NB)
2197 CALL DORGBR( 'P', M, M, M, WORK( IR ), LDWRKR,
2198 $ WORK( ITAUP ), WORK( IWORK ),
2199 $ LWORK-IWORK+1, IERR )
2202 * Perform bidiagonal QR iteration, computing right
2203 * singular vectors of L in WORK(IR)
2204 * (Workspace: need M*M + BDSPAC)
2206 CALL DBDSQR( 'U', M, M, 0, 0, S, WORK( IE ),
2207 $ WORK( IR ), LDWRKR, DUM, 1, DUM, 1,
2208 $ WORK( IWORK ), INFO )
2211 * Multiply right singular vectors of L in WORK(IR) by Q
2212 * in A, storing result in WORK(IU) and copying to A
2213 * (Workspace: need M*M + 2*M, prefer M*M + M*N + M)
2215 DO 30 I = 1, N, CHUNK
2216 BLK = MIN( N-I+1, CHUNK )
2217 CALL DGEMM( 'N', 'N', M, BLK, M, ONE, WORK( IR ),
2218 $ LDWRKR, A( 1, I ), LDA, ZERO,
2219 $ WORK( IU ), LDWRKU )
2220 CALL DLACPY( 'F', M, BLK, WORK( IU ), LDWRKU,
2226 * Insufficient workspace for a fast algorithm
2234 * (Workspace: need 3*M + N, prefer 3*M + (M + N)*NB)
2236 CALL DGEBRD( M, N, A, LDA, S, WORK( IE ),
2237 $ WORK( ITAUQ ), WORK( ITAUP ),
2238 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2240 * Generate right vectors bidiagonalizing A
2241 * (Workspace: need 4*M, prefer 3*M + M*NB)
2243 CALL DORGBR( 'P', M, N, M, A, LDA, WORK( ITAUP ),
2244 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2247 * Perform bidiagonal QR iteration, computing right
2248 * singular vectors of A in A
2249 * (Workspace: need BDSPAC)
2251 CALL DBDSQR( 'L', M, N, 0, 0, S, WORK( IE ), A, LDA,
2252 $ DUM, 1, DUM, 1, WORK( IWORK ), INFO )
2256 ELSE IF( WNTVO .AND. WNTUAS ) THEN
2258 * Path 3t(N much larger than M, JOBU='S' or 'A', JOBVT='O')
2259 * M right singular vectors to be overwritten on A and
2260 * M left singular vectors to be computed in U
2262 IF( LWORK.GE.M*M+MAX( 4*M, BDSPAC ) ) THEN
2264 * Sufficient workspace for a fast algorithm
2267 IF( LWORK.GE.MAX( WRKBL, LDA*N + M ) + LDA*M ) THEN
2269 * WORK(IU) is LDA by N and WORK(IR) is LDA by M
2274 ELSE IF( LWORK.GE.MAX( WRKBL, LDA*N + M ) + M*M ) THEN
2276 * WORK(IU) is LDA by N and WORK(IR) is M by M
2283 * WORK(IU) is M by CHUNK and WORK(IR) is M by M
2286 CHUNK = ( LWORK-M*M-M ) / M
2289 ITAU = IR + LDWRKR*M
2293 * (Workspace: need M*M + 2*M, prefer M*M + M + M*NB)
2295 CALL DGELQF( M, N, A, LDA, WORK( ITAU ),
2296 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2298 * Copy L to U, zeroing about above it
2300 CALL DLACPY( 'L', M, M, A, LDA, U, LDU )
2301 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO, U( 1, 2 ),
2305 * (Workspace: need M*M + 2*M, prefer M*M + M + M*NB)
2307 CALL DORGLQ( M, N, M, A, LDA, WORK( ITAU ),
2308 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2314 * Bidiagonalize L in U, copying result to WORK(IR)
2315 * (Workspace: need M*M + 4*M, prefer M*M + 3*M + 2*M*NB)
2317 CALL DGEBRD( M, M, U, LDU, S, WORK( IE ),
2318 $ WORK( ITAUQ ), WORK( ITAUP ),
2319 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2320 CALL DLACPY( 'U', M, M, U, LDU, WORK( IR ), LDWRKR )
2322 * Generate right vectors bidiagonalizing L in WORK(IR)
2323 * (Workspace: need M*M + 4*M-1, prefer M*M + 3*M + (M-1)*NB)
2325 CALL DORGBR( 'P', M, M, M, WORK( IR ), LDWRKR,
2326 $ WORK( ITAUP ), WORK( IWORK ),
2327 $ LWORK-IWORK+1, IERR )
2329 * Generate left vectors bidiagonalizing L in U
2330 * (Workspace: need M*M + 4*M, prefer M*M + 3*M + M*NB)
2332 CALL DORGBR( 'Q', M, M, M, U, LDU, WORK( ITAUQ ),
2333 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2336 * Perform bidiagonal QR iteration, computing left
2337 * singular vectors of L in U, and computing right
2338 * singular vectors of L in WORK(IR)
2339 * (Workspace: need M*M + BDSPAC)
2341 CALL DBDSQR( 'U', M, M, M, 0, S, WORK( IE ),
2342 $ WORK( IR ), LDWRKR, U, LDU, DUM, 1,
2343 $ WORK( IWORK ), INFO )
2346 * Multiply right singular vectors of L in WORK(IR) by Q
2347 * in A, storing result in WORK(IU) and copying to A
2348 * (Workspace: need M*M + 2*M, prefer M*M + M*N + M))
2350 DO 40 I = 1, N, CHUNK
2351 BLK = MIN( N-I+1, CHUNK )
2352 CALL DGEMM( 'N', 'N', M, BLK, M, ONE, WORK( IR ),
2353 $ LDWRKR, A( 1, I ), LDA, ZERO,
2354 $ WORK( IU ), LDWRKU )
2355 CALL DLACPY( 'F', M, BLK, WORK( IU ), LDWRKU,
2361 * Insufficient workspace for a fast algorithm
2367 * (Workspace: need 2*M, prefer M + M*NB)
2369 CALL DGELQF( M, N, A, LDA, WORK( ITAU ),
2370 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2372 * Copy L to U, zeroing out above it
2374 CALL DLACPY( 'L', M, M, A, LDA, U, LDU )
2375 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO, U( 1, 2 ),
2379 * (Workspace: need 2*M, prefer M + M*NB)
2381 CALL DORGLQ( M, N, M, A, LDA, WORK( ITAU ),
2382 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2388 * Bidiagonalize L in U
2389 * (Workspace: need 4*M, prefer 3*M + 2*M*NB)
2391 CALL DGEBRD( M, M, U, LDU, S, WORK( IE ),
2392 $ WORK( ITAUQ ), WORK( ITAUP ),
2393 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2395 * Multiply right vectors bidiagonalizing L by Q in A
2396 * (Workspace: need 3*M + N, prefer 3*M + N*NB)
2398 CALL DORMBR( 'P', 'L', 'T', M, N, M, U, LDU,
2399 $ WORK( ITAUP ), A, LDA, WORK( IWORK ),
2400 $ LWORK-IWORK+1, IERR )
2402 * Generate left vectors bidiagonalizing L in U
2403 * (Workspace: need 4*M, prefer 3*M + M*NB)
2405 CALL DORGBR( 'Q', M, M, M, U, LDU, WORK( ITAUQ ),
2406 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2409 * Perform bidiagonal QR iteration, computing left
2410 * singular vectors of A in U and computing right
2411 * singular vectors of A in A
2412 * (Workspace: need BDSPAC)
2414 CALL DBDSQR( 'U', M, N, M, 0, S, WORK( IE ), A, LDA,
2415 $ U, LDU, DUM, 1, WORK( IWORK ), INFO )
2419 ELSE IF( WNTVS ) THEN
2423 * Path 4t(N much larger than M, JOBU='N', JOBVT='S')
2424 * M right singular vectors to be computed in VT and
2425 * no left singular vectors to be computed
2427 IF( LWORK.GE.M*M+MAX( 4*M, BDSPAC ) ) THEN
2429 * Sufficient workspace for a fast algorithm
2432 IF( LWORK.GE.WRKBL+LDA*M ) THEN
2434 * WORK(IR) is LDA by M
2439 * WORK(IR) is M by M
2443 ITAU = IR + LDWRKR*M
2447 * (Workspace: need M*M + 2*M, prefer M*M + M + M*NB)
2449 CALL DGELQF( M, N, A, LDA, WORK( ITAU ),
2450 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2452 * Copy L to WORK(IR), zeroing out above it
2454 CALL DLACPY( 'L', M, M, A, LDA, WORK( IR ),
2456 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO,
2457 $ WORK( IR+LDWRKR ), LDWRKR )
2460 * (Workspace: need M*M + 2*M, prefer M*M + M + M*NB)
2462 CALL DORGLQ( M, N, M, A, LDA, WORK( ITAU ),
2463 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2469 * Bidiagonalize L in WORK(IR)
2470 * (Workspace: need M*M + 4*M, prefer M*M + 3*M + 2*M*NB)
2472 CALL DGEBRD( M, M, WORK( IR ), LDWRKR, S,
2473 $ WORK( IE ), WORK( ITAUQ ),
2474 $ WORK( ITAUP ), WORK( IWORK ),
2475 $ LWORK-IWORK+1, IERR )
2477 * Generate right vectors bidiagonalizing L in
2479 * (Workspace: need M*M + 4*M, prefer M*M + 3*M + (M-1)*NB)
2481 CALL DORGBR( 'P', M, M, M, WORK( IR ), LDWRKR,
2482 $ WORK( ITAUP ), WORK( IWORK ),
2483 $ LWORK-IWORK+1, IERR )
2486 * Perform bidiagonal QR iteration, computing right
2487 * singular vectors of L in WORK(IR)
2488 * (Workspace: need M*M + BDSPAC)
2490 CALL DBDSQR( 'U', M, M, 0, 0, S, WORK( IE ),
2491 $ WORK( IR ), LDWRKR, DUM, 1, DUM, 1,
2492 $ WORK( IWORK ), INFO )
2494 * Multiply right singular vectors of L in WORK(IR) by
2495 * Q in A, storing result in VT
2496 * (Workspace: need M*M)
2498 CALL DGEMM( 'N', 'N', M, N, M, ONE, WORK( IR ),
2499 $ LDWRKR, A, LDA, ZERO, VT, LDVT )
2503 * Insufficient workspace for a fast algorithm
2509 * (Workspace: need 2*M, prefer M + M*NB)
2511 CALL DGELQF( M, N, A, LDA, WORK( ITAU ),
2512 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2516 CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT )
2519 * (Workspace: need 2*M, prefer M + M*NB)
2521 CALL DORGLQ( M, N, M, VT, LDVT, WORK( ITAU ),
2522 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2528 * Zero out above L in A
2530 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO, A( 1, 2 ),
2533 * Bidiagonalize L in A
2534 * (Workspace: need 4*M, prefer 3*M + 2*M*NB)
2536 CALL DGEBRD( M, M, A, LDA, S, WORK( IE ),
2537 $ WORK( ITAUQ ), WORK( ITAUP ),
2538 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2540 * Multiply right vectors bidiagonalizing L by Q in VT
2541 * (Workspace: need 3*M + N, prefer 3*M + N*NB)
2543 CALL DORMBR( 'P', 'L', 'T', M, N, M, A, LDA,
2544 $ WORK( ITAUP ), VT, LDVT,
2545 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2548 * Perform bidiagonal QR iteration, computing right
2549 * singular vectors of A in VT
2550 * (Workspace: need BDSPAC)
2552 CALL DBDSQR( 'U', M, N, 0, 0, S, WORK( IE ), VT,
2553 $ LDVT, DUM, 1, DUM, 1, WORK( IWORK ),
2558 ELSE IF( WNTUO ) THEN
2560 * Path 5t(N much larger than M, JOBU='O', JOBVT='S')
2561 * M right singular vectors to be computed in VT and
2562 * M left singular vectors to be overwritten on A
2564 IF( LWORK.GE.2*M*M+MAX( 4*M, BDSPAC ) ) THEN
2566 * Sufficient workspace for a fast algorithm
2569 IF( LWORK.GE.WRKBL+2*LDA*M ) THEN
2571 * WORK(IU) is LDA by M and WORK(IR) is LDA by M
2576 ELSE IF( LWORK.GE.WRKBL+( LDA + M )*M ) THEN
2578 * WORK(IU) is LDA by M and WORK(IR) is M by M
2585 * WORK(IU) is M by M and WORK(IR) is M by M
2591 ITAU = IR + LDWRKR*M
2595 * (Workspace: need 2*M*M + 2*M, prefer 2*M*M + M + M*NB)
2597 CALL DGELQF( M, N, A, LDA, WORK( ITAU ),
2598 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2600 * Copy L to WORK(IU), zeroing out below it
2602 CALL DLACPY( 'L', M, M, A, LDA, WORK( IU ),
2604 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO,
2605 $ WORK( IU+LDWRKU ), LDWRKU )
2608 * (Workspace: need 2*M*M + 2*M, prefer 2*M*M + M + M*NB)
2610 CALL DORGLQ( M, N, M, A, LDA, WORK( ITAU ),
2611 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2617 * Bidiagonalize L in WORK(IU), copying result to
2619 * (Workspace: need 2*M*M + 4*M,
2620 * prefer 2*M*M+3*M+2*M*NB)
2622 CALL DGEBRD( M, M, WORK( IU ), LDWRKU, S,
2623 $ WORK( IE ), WORK( ITAUQ ),
2624 $ WORK( ITAUP ), WORK( IWORK ),
2625 $ LWORK-IWORK+1, IERR )
2626 CALL DLACPY( 'L', M, M, WORK( IU ), LDWRKU,
2627 $ WORK( IR ), LDWRKR )
2629 * Generate right bidiagonalizing vectors in WORK(IU)
2630 * (Workspace: need 2*M*M + 4*M-1,
2631 * prefer 2*M*M+3*M+(M-1)*NB)
2633 CALL DORGBR( 'P', M, M, M, WORK( IU ), LDWRKU,
2634 $ WORK( ITAUP ), WORK( IWORK ),
2635 $ LWORK-IWORK+1, IERR )
2637 * Generate left bidiagonalizing vectors in WORK(IR)
2638 * (Workspace: need 2*M*M + 4*M, prefer 2*M*M + 3*M + M*NB)
2640 CALL DORGBR( 'Q', M, M, M, WORK( IR ), LDWRKR,
2641 $ WORK( ITAUQ ), WORK( IWORK ),
2642 $ LWORK-IWORK+1, IERR )
2645 * Perform bidiagonal QR iteration, computing left
2646 * singular vectors of L in WORK(IR) and computing
2647 * right singular vectors of L in WORK(IU)
2648 * (Workspace: need 2*M*M + BDSPAC)
2650 CALL DBDSQR( 'U', M, M, M, 0, S, WORK( IE ),
2651 $ WORK( IU ), LDWRKU, WORK( IR ),
2652 $ LDWRKR, DUM, 1, WORK( IWORK ), INFO )
2654 * Multiply right singular vectors of L in WORK(IU) by
2655 * Q in A, storing result in VT
2656 * (Workspace: need M*M)
2658 CALL DGEMM( 'N', 'N', M, N, M, ONE, WORK( IU ),
2659 $ LDWRKU, A, LDA, ZERO, VT, LDVT )
2661 * Copy left singular vectors of L to A
2662 * (Workspace: need M*M)
2664 CALL DLACPY( 'F', M, M, WORK( IR ), LDWRKR, A,
2669 * Insufficient workspace for a fast algorithm
2674 * Compute A=L*Q, copying result to VT
2675 * (Workspace: need 2*M, prefer M + M*NB)
2677 CALL DGELQF( M, N, A, LDA, WORK( ITAU ),
2678 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2679 CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT )
2682 * (Workspace: need 2*M, prefer M + M*NB)
2684 CALL DORGLQ( M, N, M, VT, LDVT, WORK( ITAU ),
2685 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2691 * Zero out above L in A
2693 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO, A( 1, 2 ),
2696 * Bidiagonalize L in A
2697 * (Workspace: need 4*M, prefer 3*M + 2*M*NB)
2699 CALL DGEBRD( M, M, A, LDA, S, WORK( IE ),
2700 $ WORK( ITAUQ ), WORK( ITAUP ),
2701 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2703 * Multiply right vectors bidiagonalizing L by Q in VT
2704 * (Workspace: need 3*M + N, prefer 3*M + N*NB)
2706 CALL DORMBR( 'P', 'L', 'T', M, N, M, A, LDA,
2707 $ WORK( ITAUP ), VT, LDVT,
2708 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2710 * Generate left bidiagonalizing vectors of L in A
2711 * (Workspace: need 4*M, prefer 3*M + M*NB)
2713 CALL DORGBR( 'Q', M, M, M, A, LDA, WORK( ITAUQ ),
2714 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2717 * Perform bidiagonal QR iteration, compute left
2718 * singular vectors of A in A and compute right
2719 * singular vectors of A in VT
2720 * (Workspace: need BDSPAC)
2722 CALL DBDSQR( 'U', M, N, M, 0, S, WORK( IE ), VT,
2723 $ LDVT, A, LDA, DUM, 1, WORK( IWORK ),
2728 ELSE IF( WNTUAS ) THEN
2730 * Path 6t(N much larger than M, JOBU='S' or 'A',
2732 * M right singular vectors to be computed in VT and
2733 * M left singular vectors to be computed in U
2735 IF( LWORK.GE.M*M+MAX( 4*M, BDSPAC ) ) THEN
2737 * Sufficient workspace for a fast algorithm
2740 IF( LWORK.GE.WRKBL+LDA*M ) THEN
2742 * WORK(IU) is LDA by N
2747 * WORK(IU) is LDA by M
2751 ITAU = IU + LDWRKU*M
2755 * (Workspace: need M*M + 2*M, prefer M*M + M + M*NB)
2757 CALL DGELQF( M, N, A, LDA, WORK( ITAU ),
2758 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2760 * Copy L to WORK(IU), zeroing out above it
2762 CALL DLACPY( 'L', M, M, A, LDA, WORK( IU ),
2764 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO,
2765 $ WORK( IU+LDWRKU ), LDWRKU )
2768 * (Workspace: need M*M + 2*M, prefer M*M + M + M*NB)
2770 CALL DORGLQ( M, N, M, A, LDA, WORK( ITAU ),
2771 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2777 * Bidiagonalize L in WORK(IU), copying result to U
2778 * (Workspace: need M*M + 4*M, prefer M*M + 3*M + 2*M*NB)
2780 CALL DGEBRD( M, M, WORK( IU ), LDWRKU, S,
2781 $ WORK( IE ), WORK( ITAUQ ),
2782 $ WORK( ITAUP ), WORK( IWORK ),
2783 $ LWORK-IWORK+1, IERR )
2784 CALL DLACPY( 'L', M, M, WORK( IU ), LDWRKU, U,
2787 * Generate right bidiagonalizing vectors in WORK(IU)
2788 * (Workspace: need M*M + 4*M-1,
2789 * prefer M*M+3*M+(M-1)*NB)
2791 CALL DORGBR( 'P', M, M, M, WORK( IU ), LDWRKU,
2792 $ WORK( ITAUP ), WORK( IWORK ),
2793 $ LWORK-IWORK+1, IERR )
2795 * Generate left bidiagonalizing vectors in U
2796 * (Workspace: need M*M + 4*M, prefer M*M + 3*M + M*NB)
2798 CALL DORGBR( 'Q', M, M, M, U, LDU, WORK( ITAUQ ),
2799 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2802 * Perform bidiagonal QR iteration, computing left
2803 * singular vectors of L in U and computing right
2804 * singular vectors of L in WORK(IU)
2805 * (Workspace: need M*M + BDSPAC)
2807 CALL DBDSQR( 'U', M, M, M, 0, S, WORK( IE ),
2808 $ WORK( IU ), LDWRKU, U, LDU, DUM, 1,
2809 $ WORK( IWORK ), INFO )
2811 * Multiply right singular vectors of L in WORK(IU) by
2812 * Q in A, storing result in VT
2813 * (Workspace: need M*M)
2815 CALL DGEMM( 'N', 'N', M, N, M, ONE, WORK( IU ),
2816 $ LDWRKU, A, LDA, ZERO, VT, LDVT )
2820 * Insufficient workspace for a fast algorithm
2825 * Compute A=L*Q, copying result to VT
2826 * (Workspace: need 2*M, prefer M + M*NB)
2828 CALL DGELQF( M, N, A, LDA, WORK( ITAU ),
2829 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2830 CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT )
2833 * (Workspace: need 2*M, prefer M + M*NB)
2835 CALL DORGLQ( M, N, M, VT, LDVT, WORK( ITAU ),
2836 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2838 * Copy L to U, zeroing out above it
2840 CALL DLACPY( 'L', M, M, A, LDA, U, LDU )
2841 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO, U( 1, 2 ),
2848 * Bidiagonalize L in U
2849 * (Workspace: need 4*M, prefer 3*M + 2*M*NB)
2851 CALL DGEBRD( M, M, U, LDU, S, WORK( IE ),
2852 $ WORK( ITAUQ ), WORK( ITAUP ),
2853 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2855 * Multiply right bidiagonalizing vectors in U by Q
2857 * (Workspace: need 3*M + N, prefer 3*M + N*NB)
2859 CALL DORMBR( 'P', 'L', 'T', M, N, M, U, LDU,
2860 $ WORK( ITAUP ), VT, LDVT,
2861 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2863 * Generate left bidiagonalizing vectors in U
2864 * (Workspace: need 4*M, prefer 3*M + M*NB)
2866 CALL DORGBR( 'Q', M, M, M, U, LDU, WORK( ITAUQ ),
2867 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2870 * Perform bidiagonal QR iteration, computing left
2871 * singular vectors of A in U and computing right
2872 * singular vectors of A in VT
2873 * (Workspace: need BDSPAC)
2875 CALL DBDSQR( 'U', M, N, M, 0, S, WORK( IE ), VT,
2876 $ LDVT, U, LDU, DUM, 1, WORK( IWORK ),
2883 ELSE IF( WNTVA ) THEN
2887 * Path 7t(N much larger than M, JOBU='N', JOBVT='A')
2888 * N right singular vectors to be computed in VT and
2889 * no left singular vectors to be computed
2891 IF( LWORK.GE.M*M+MAX( N + M, 4*M, BDSPAC ) ) THEN
2893 * Sufficient workspace for a fast algorithm
2896 IF( LWORK.GE.WRKBL+LDA*M ) THEN
2898 * WORK(IR) is LDA by M
2903 * WORK(IR) is M by M
2907 ITAU = IR + LDWRKR*M
2910 * Compute A=L*Q, copying result to VT
2911 * (Workspace: need M*M + 2*M, prefer M*M + M + M*NB)
2913 CALL DGELQF( M, N, A, LDA, WORK( ITAU ),
2914 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2915 CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT )
2917 * Copy L to WORK(IR), zeroing out above it
2919 CALL DLACPY( 'L', M, M, A, LDA, WORK( IR ),
2921 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO,
2922 $ WORK( IR+LDWRKR ), LDWRKR )
2925 * (Workspace: need M*M + M + N, prefer M*M + M + N*NB)
2927 CALL DORGLQ( N, N, M, VT, LDVT, WORK( ITAU ),
2928 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2934 * Bidiagonalize L in WORK(IR)
2935 * (Workspace: need M*M + 4*M, prefer M*M + 3*M + 2*M*NB)
2937 CALL DGEBRD( M, M, WORK( IR ), LDWRKR, S,
2938 $ WORK( IE ), WORK( ITAUQ ),
2939 $ WORK( ITAUP ), WORK( IWORK ),
2940 $ LWORK-IWORK+1, IERR )
2942 * Generate right bidiagonalizing vectors in WORK(IR)
2943 * (Workspace: need M*M + 4*M-1,
2944 * prefer M*M+3*M+(M-1)*NB)
2946 CALL DORGBR( 'P', M, M, M, WORK( IR ), LDWRKR,
2947 $ WORK( ITAUP ), WORK( IWORK ),
2948 $ LWORK-IWORK+1, IERR )
2951 * Perform bidiagonal QR iteration, computing right
2952 * singular vectors of L in WORK(IR)
2953 * (Workspace: need M*M + BDSPAC)
2955 CALL DBDSQR( 'U', M, M, 0, 0, S, WORK( IE ),
2956 $ WORK( IR ), LDWRKR, DUM, 1, DUM, 1,
2957 $ WORK( IWORK ), INFO )
2959 * Multiply right singular vectors of L in WORK(IR) by
2960 * Q in VT, storing result in A
2961 * (Workspace: need M*M)
2963 CALL DGEMM( 'N', 'N', M, N, M, ONE, WORK( IR ),
2964 $ LDWRKR, VT, LDVT, ZERO, A, LDA )
2966 * Copy right singular vectors of A from A to VT
2968 CALL DLACPY( 'F', M, N, A, LDA, VT, LDVT )
2972 * Insufficient workspace for a fast algorithm
2977 * Compute A=L*Q, copying result to VT
2978 * (Workspace: need 2*M, prefer M + M*NB)
2980 CALL DGELQF( M, N, A, LDA, WORK( ITAU ),
2981 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2982 CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT )
2985 * (Workspace: need M + N, prefer M + N*NB)
2987 CALL DORGLQ( N, N, M, VT, LDVT, WORK( ITAU ),
2988 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
2994 * Zero out above L in A
2996 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO, A( 1, 2 ),
2999 * Bidiagonalize L in A
3000 * (Workspace: need 4*M, prefer 3*M + 2*M*NB)
3002 CALL DGEBRD( M, M, A, LDA, S, WORK( IE ),
3003 $ WORK( ITAUQ ), WORK( ITAUP ),
3004 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3006 * Multiply right bidiagonalizing vectors in A by Q
3008 * (Workspace: need 3*M + N, prefer 3*M + N*NB)
3010 CALL DORMBR( 'P', 'L', 'T', M, N, M, A, LDA,
3011 $ WORK( ITAUP ), VT, LDVT,
3012 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3015 * Perform bidiagonal QR iteration, computing right
3016 * singular vectors of A in VT
3017 * (Workspace: need BDSPAC)
3019 CALL DBDSQR( 'U', M, N, 0, 0, S, WORK( IE ), VT,
3020 $ LDVT, DUM, 1, DUM, 1, WORK( IWORK ),
3025 ELSE IF( WNTUO ) THEN
3027 * Path 8t(N much larger than M, JOBU='O', JOBVT='A')
3028 * N right singular vectors to be computed in VT and
3029 * M left singular vectors to be overwritten on A
3031 IF( LWORK.GE.2*M*M+MAX( N + M, 4*M, BDSPAC ) ) THEN
3033 * Sufficient workspace for a fast algorithm
3036 IF( LWORK.GE.WRKBL+2*LDA*M ) THEN
3038 * WORK(IU) is LDA by M and WORK(IR) is LDA by M
3043 ELSE IF( LWORK.GE.WRKBL+( LDA + M )*M ) THEN
3045 * WORK(IU) is LDA by M and WORK(IR) is M by M
3052 * WORK(IU) is M by M and WORK(IR) is M by M
3058 ITAU = IR + LDWRKR*M
3061 * Compute A=L*Q, copying result to VT
3062 * (Workspace: need 2*M*M + 2*M, prefer 2*M*M + M + M*NB)
3064 CALL DGELQF( M, N, A, LDA, WORK( ITAU ),
3065 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3066 CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT )
3069 * (Workspace: need 2*M*M + M + N, prefer 2*M*M + M + N*NB)
3071 CALL DORGLQ( N, N, M, VT, LDVT, WORK( ITAU ),
3072 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3074 * Copy L to WORK(IU), zeroing out above it
3076 CALL DLACPY( 'L', M, M, A, LDA, WORK( IU ),
3078 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO,
3079 $ WORK( IU+LDWRKU ), LDWRKU )
3085 * Bidiagonalize L in WORK(IU), copying result to
3087 * (Workspace: need 2*M*M + 4*M,
3088 * prefer 2*M*M+3*M+2*M*NB)
3090 CALL DGEBRD( M, M, WORK( IU ), LDWRKU, S,
3091 $ WORK( IE ), WORK( ITAUQ ),
3092 $ WORK( ITAUP ), WORK( IWORK ),
3093 $ LWORK-IWORK+1, IERR )
3094 CALL DLACPY( 'L', M, M, WORK( IU ), LDWRKU,
3095 $ WORK( IR ), LDWRKR )
3097 * Generate right bidiagonalizing vectors in WORK(IU)
3098 * (Workspace: need 2*M*M + 4*M-1,
3099 * prefer 2*M*M+3*M+(M-1)*NB)
3101 CALL DORGBR( 'P', M, M, M, WORK( IU ), LDWRKU,
3102 $ WORK( ITAUP ), WORK( IWORK ),
3103 $ LWORK-IWORK+1, IERR )
3105 * Generate left bidiagonalizing vectors in WORK(IR)
3106 * (Workspace: need 2*M*M + 4*M, prefer 2*M*M + 3*M + M*NB)
3108 CALL DORGBR( 'Q', M, M, M, WORK( IR ), LDWRKR,
3109 $ WORK( ITAUQ ), WORK( IWORK ),
3110 $ LWORK-IWORK+1, IERR )
3113 * Perform bidiagonal QR iteration, computing left
3114 * singular vectors of L in WORK(IR) and computing
3115 * right singular vectors of L in WORK(IU)
3116 * (Workspace: need 2*M*M + BDSPAC)
3118 CALL DBDSQR( 'U', M, M, M, 0, S, WORK( IE ),
3119 $ WORK( IU ), LDWRKU, WORK( IR ),
3120 $ LDWRKR, DUM, 1, WORK( IWORK ), INFO )
3122 * Multiply right singular vectors of L in WORK(IU) by
3123 * Q in VT, storing result in A
3124 * (Workspace: need M*M)
3126 CALL DGEMM( 'N', 'N', M, N, M, ONE, WORK( IU ),
3127 $ LDWRKU, VT, LDVT, ZERO, A, LDA )
3129 * Copy right singular vectors of A from A to VT
3131 CALL DLACPY( 'F', M, N, A, LDA, VT, LDVT )
3133 * Copy left singular vectors of A from WORK(IR) to A
3135 CALL DLACPY( 'F', M, M, WORK( IR ), LDWRKR, A,
3140 * Insufficient workspace for a fast algorithm
3145 * Compute A=L*Q, copying result to VT
3146 * (Workspace: need 2*M, prefer M + M*NB)
3148 CALL DGELQF( M, N, A, LDA, WORK( ITAU ),
3149 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3150 CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT )
3153 * (Workspace: need M + N, prefer M + N*NB)
3155 CALL DORGLQ( N, N, M, VT, LDVT, WORK( ITAU ),
3156 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3162 * Zero out above L in A
3164 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO, A( 1, 2 ),
3167 * Bidiagonalize L in A
3168 * (Workspace: need 4*M, prefer 3*M + 2*M*NB)
3170 CALL DGEBRD( M, M, A, LDA, S, WORK( IE ),
3171 $ WORK( ITAUQ ), WORK( ITAUP ),
3172 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3174 * Multiply right bidiagonalizing vectors in A by Q
3176 * (Workspace: need 3*M + N, prefer 3*M + N*NB)
3178 CALL DORMBR( 'P', 'L', 'T', M, N, M, A, LDA,
3179 $ WORK( ITAUP ), VT, LDVT,
3180 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3182 * Generate left bidiagonalizing vectors in A
3183 * (Workspace: need 4*M, prefer 3*M + M*NB)
3185 CALL DORGBR( 'Q', M, M, M, A, LDA, WORK( ITAUQ ),
3186 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3189 * Perform bidiagonal QR iteration, computing left
3190 * singular vectors of A in A and computing right
3191 * singular vectors of A in VT
3192 * (Workspace: need BDSPAC)
3194 CALL DBDSQR( 'U', M, N, M, 0, S, WORK( IE ), VT,
3195 $ LDVT, A, LDA, DUM, 1, WORK( IWORK ),
3200 ELSE IF( WNTUAS ) THEN
3202 * Path 9t(N much larger than M, JOBU='S' or 'A',
3204 * N right singular vectors to be computed in VT and
3205 * M left singular vectors to be computed in U
3207 IF( LWORK.GE.M*M+MAX( N + M, 4*M, BDSPAC ) ) THEN
3209 * Sufficient workspace for a fast algorithm
3212 IF( LWORK.GE.WRKBL+LDA*M ) THEN
3214 * WORK(IU) is LDA by M
3219 * WORK(IU) is M by M
3223 ITAU = IU + LDWRKU*M
3226 * Compute A=L*Q, copying result to VT
3227 * (Workspace: need M*M + 2*M, prefer M*M + M + M*NB)
3229 CALL DGELQF( M, N, A, LDA, WORK( ITAU ),
3230 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3231 CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT )
3234 * (Workspace: need M*M + M + N, prefer M*M + M + N*NB)
3236 CALL DORGLQ( N, N, M, VT, LDVT, WORK( ITAU ),
3237 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3239 * Copy L to WORK(IU), zeroing out above it
3241 CALL DLACPY( 'L', M, M, A, LDA, WORK( IU ),
3243 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO,
3244 $ WORK( IU+LDWRKU ), LDWRKU )
3250 * Bidiagonalize L in WORK(IU), copying result to U
3251 * (Workspace: need M*M + 4*M, prefer M*M + 3*M + 2*M*NB)
3253 CALL DGEBRD( M, M, WORK( IU ), LDWRKU, S,
3254 $ WORK( IE ), WORK( ITAUQ ),
3255 $ WORK( ITAUP ), WORK( IWORK ),
3256 $ LWORK-IWORK+1, IERR )
3257 CALL DLACPY( 'L', M, M, WORK( IU ), LDWRKU, U,
3260 * Generate right bidiagonalizing vectors in WORK(IU)
3261 * (Workspace: need M*M + 4*M, prefer M*M + 3*M + (M-1)*NB)
3263 CALL DORGBR( 'P', M, M, M, WORK( IU ), LDWRKU,
3264 $ WORK( ITAUP ), WORK( IWORK ),
3265 $ LWORK-IWORK+1, IERR )
3267 * Generate left bidiagonalizing vectors in U
3268 * (Workspace: need M*M + 4*M, prefer M*M + 3*M + M*NB)
3270 CALL DORGBR( 'Q', M, M, M, U, LDU, WORK( ITAUQ ),
3271 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3274 * Perform bidiagonal QR iteration, computing left
3275 * singular vectors of L in U and computing right
3276 * singular vectors of L in WORK(IU)
3277 * (Workspace: need M*M + BDSPAC)
3279 CALL DBDSQR( 'U', M, M, M, 0, S, WORK( IE ),
3280 $ WORK( IU ), LDWRKU, U, LDU, DUM, 1,
3281 $ WORK( IWORK ), INFO )
3283 * Multiply right singular vectors of L in WORK(IU) by
3284 * Q in VT, storing result in A
3285 * (Workspace: need M*M)
3287 CALL DGEMM( 'N', 'N', M, N, M, ONE, WORK( IU ),
3288 $ LDWRKU, VT, LDVT, ZERO, A, LDA )
3290 * Copy right singular vectors of A from A to VT
3292 CALL DLACPY( 'F', M, N, A, LDA, VT, LDVT )
3296 * Insufficient workspace for a fast algorithm
3301 * Compute A=L*Q, copying result to VT
3302 * (Workspace: need 2*M, prefer M + M*NB)
3304 CALL DGELQF( M, N, A, LDA, WORK( ITAU ),
3305 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3306 CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT )
3309 * (Workspace: need M + N, prefer M + N*NB)
3311 CALL DORGLQ( N, N, M, VT, LDVT, WORK( ITAU ),
3312 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3314 * Copy L to U, zeroing out above it
3316 CALL DLACPY( 'L', M, M, A, LDA, U, LDU )
3317 CALL DLASET( 'U', M-1, M-1, ZERO, ZERO, U( 1, 2 ),
3324 * Bidiagonalize L in U
3325 * (Workspace: need 4*M, prefer 3*M + 2*M*NB)
3327 CALL DGEBRD( M, M, U, LDU, S, WORK( IE ),
3328 $ WORK( ITAUQ ), WORK( ITAUP ),
3329 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3331 * Multiply right bidiagonalizing vectors in U by Q
3333 * (Workspace: need 3*M + N, prefer 3*M + N*NB)
3335 CALL DORMBR( 'P', 'L', 'T', M, N, M, U, LDU,
3336 $ WORK( ITAUP ), VT, LDVT,
3337 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3339 * Generate left bidiagonalizing vectors in U
3340 * (Workspace: need 4*M, prefer 3*M + M*NB)
3342 CALL DORGBR( 'Q', M, M, M, U, LDU, WORK( ITAUQ ),
3343 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3346 * Perform bidiagonal QR iteration, computing left
3347 * singular vectors of A in U and computing right
3348 * singular vectors of A in VT
3349 * (Workspace: need BDSPAC)
3351 CALL DBDSQR( 'U', M, N, M, 0, S, WORK( IE ), VT,
3352 $ LDVT, U, LDU, DUM, 1, WORK( IWORK ),
3365 * Path 10t(N greater than M, but not much larger)
3366 * Reduce to bidiagonal form without LQ decomposition
3374 * (Workspace: need 3*M + N, prefer 3*M + (M + N)*NB)
3376 CALL DGEBRD( M, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ),
3377 $ WORK( ITAUP ), WORK( IWORK ), LWORK-IWORK+1,
3381 * If left singular vectors desired in U, copy result to U
3382 * and generate left bidiagonalizing vectors in U
3383 * (Workspace: need 4*M-1, prefer 3*M + (M-1)*NB)
3385 CALL DLACPY( 'L', M, M, A, LDA, U, LDU )
3386 CALL DORGBR( 'Q', M, M, N, U, LDU, WORK( ITAUQ ),
3387 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3391 * If right singular vectors desired in VT, copy result to
3392 * VT and generate right bidiagonalizing vectors in VT
3393 * (Workspace: need 3*M + NRVT, prefer 3*M + NRVT*NB)
3395 CALL DLACPY( 'U', M, N, A, LDA, VT, LDVT )
3400 CALL DORGBR( 'P', NRVT, N, M, VT, LDVT, WORK( ITAUP ),
3401 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3405 * If left singular vectors desired in A, generate left
3406 * bidiagonalizing vectors in A
3407 * (Workspace: need 4*M-1, prefer 3*M + (M-1)*NB)
3409 CALL DORGBR( 'Q', M, M, N, A, LDA, WORK( ITAUQ ),
3410 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3414 * If right singular vectors desired in A, generate right
3415 * bidiagonalizing vectors in A
3416 * (Workspace: need 4*M, prefer 3*M + M*NB)
3418 CALL DORGBR( 'P', M, N, M, A, LDA, WORK( ITAUP ),
3419 $ WORK( IWORK ), LWORK-IWORK+1, IERR )
3422 IF( WNTUAS .OR. WNTUO )
3426 IF( WNTVAS .OR. WNTVO )
3430 IF( ( .NOT.WNTUO ) .AND. ( .NOT.WNTVO ) ) THEN
3432 * Perform bidiagonal QR iteration, if desired, computing
3433 * left singular vectors in U and computing right singular
3435 * (Workspace: need BDSPAC)
3437 CALL DBDSQR( 'L', M, NCVT, NRU, 0, S, WORK( IE ), VT,
3438 $ LDVT, U, LDU, DUM, 1, WORK( IWORK ), INFO )
3439 ELSE IF( ( .NOT.WNTUO ) .AND. WNTVO ) THEN
3441 * Perform bidiagonal QR iteration, if desired, computing
3442 * left singular vectors in U and computing right singular
3444 * (Workspace: need BDSPAC)
3446 CALL DBDSQR( 'L', M, NCVT, NRU, 0, S, WORK( IE ), A, LDA,
3447 $ U, LDU, DUM, 1, WORK( IWORK ), INFO )
3450 * Perform bidiagonal QR iteration, if desired, computing
3451 * left singular vectors in A and computing right singular
3453 * (Workspace: need BDSPAC)
3455 CALL DBDSQR( 'L', M, NCVT, NRU, 0, S, WORK( IE ), VT,
3456 $ LDVT, A, LDA, DUM, 1, WORK( IWORK ), INFO )
3463 * If DBDSQR failed to converge, copy unconverged superdiagonals
3464 * to WORK( 2:MINMN )
3466 IF( INFO.NE.0 ) THEN
3468 DO 50 I = 1, MINMN - 1
3469 WORK( I+1 ) = WORK( I+IE-1 )
3473 DO 60 I = MINMN - 1, 1, -1
3474 WORK( I+1 ) = WORK( I+IE-1 )
3479 * Undo scaling if necessary
3481 IF( ISCL.EQ.1 ) THEN
3482 IF( ANRM.GT.BIGNUM )
3483 $ CALL DLASCL( 'G', 0, 0, BIGNUM, ANRM, MINMN, 1, S, MINMN,
3485 IF( INFO.NE.0 .AND. ANRM.GT.BIGNUM )
3486 $ CALL DLASCL( 'G', 0, 0, BIGNUM, ANRM, MINMN-1, 1, WORK( 2 ),
3488 IF( ANRM.LT.SMLNUM )
3489 $ CALL DLASCL( 'G', 0, 0, SMLNUM, ANRM, MINMN, 1, S, MINMN,
3491 IF( INFO.NE.0 .AND. ANRM.LT.SMLNUM )
3492 $ CALL DLASCL( 'G', 0, 0, SMLNUM, ANRM, MINMN-1, 1, WORK( 2 ),
3496 * Return optimal workspace in WORK(1)