1 *> \brief <b> ZGELSD computes the minimum-norm solution to a linear least squares problem for GE matrices</b>
3 * =========== DOCUMENTATION ===========
5 * Online html documentation available at
6 * http://www.netlib.org/lapack/explore-html/
9 *> Download ZGELSD + dependencies
10 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/zgelsd.f">
12 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/zgelsd.f">
14 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/zgelsd.f">
21 * SUBROUTINE ZGELSD( M, N, NRHS, A, LDA, B, LDB, S, RCOND, RANK,
22 * WORK, LWORK, RWORK, IWORK, INFO )
24 * .. Scalar Arguments ..
25 * INTEGER INFO, LDA, LDB, LWORK, M, N, NRHS, RANK
26 * DOUBLE PRECISION RCOND
28 * .. Array Arguments ..
30 * DOUBLE PRECISION RWORK( * ), S( * )
31 * COMPLEX*16 A( LDA, * ), B( LDB, * ), WORK( * )
40 *> ZGELSD computes the minimum-norm solution to a real linear least
42 *> minimize 2-norm(| b - A*x |)
43 *> using the singular value decomposition (SVD) of A. A is an M-by-N
44 *> matrix which may be rank-deficient.
46 *> Several right hand side vectors b and solution vectors x can be
47 *> handled in a single call; they are stored as the columns of the
48 *> M-by-NRHS right hand side matrix B and the N-by-NRHS solution
51 *> The problem is solved in three steps:
52 *> (1) Reduce the coefficient matrix A to bidiagonal form with
53 *> Householder transformations, reducing the original problem
54 *> into a "bidiagonal least squares problem" (BLS)
55 *> (2) Solve the BLS using a divide and conquer approach.
56 *> (3) Apply back all the Householder transformations to solve
57 *> the original least squares problem.
59 *> The effective rank of A is determined by treating as zero those
60 *> singular values which are less than RCOND times the largest singular
63 *> The divide and conquer algorithm makes very mild assumptions about
64 *> floating point arithmetic. It will work on machines with a guard
65 *> digit in add/subtract, or on those binary machines without guard
66 *> digits which subtract like the Cray X-MP, Cray Y-MP, Cray C-90, or
67 *> Cray-2. It could conceivably fail on hexadecimal or decimal machines
68 *> without guard digits, but we know of none.
77 *> The number of rows of the matrix A. M >= 0.
83 *> The number of columns of the matrix A. N >= 0.
89 *> The number of right hand sides, i.e., the number of columns
90 *> of the matrices B and X. NRHS >= 0.
95 *> A is COMPLEX*16 array, dimension (LDA,N)
96 *> On entry, the M-by-N matrix A.
97 *> On exit, A has been destroyed.
103 *> The leading dimension of the array A. LDA >= max(1,M).
108 *> B is COMPLEX*16 array, dimension (LDB,NRHS)
109 *> On entry, the M-by-NRHS right hand side matrix B.
110 *> On exit, B is overwritten by the N-by-NRHS solution matrix X.
111 *> If m >= n and RANK = n, the residual sum-of-squares for
112 *> the solution in the i-th column is given by the sum of
113 *> squares of the modulus of elements n+1:m in that column.
119 *> The leading dimension of the array B. LDB >= max(1,M,N).
124 *> S is DOUBLE PRECISION array, dimension (min(M,N))
125 *> The singular values of A in decreasing order.
126 *> The condition number of A in the 2-norm = S(1)/S(min(m,n)).
131 *> RCOND is DOUBLE PRECISION
132 *> RCOND is used to determine the effective rank of A.
133 *> Singular values S(i) <= RCOND*S(1) are treated as zero.
134 *> If RCOND < 0, machine precision is used instead.
140 *> The effective rank of A, i.e., the number of singular values
141 *> which are greater than RCOND*S(1).
146 *> WORK is COMPLEX*16 array, dimension (MAX(1,LWORK))
147 *> On exit, if INFO = 0, WORK(1) returns the optimal LWORK.
153 *> The dimension of the array WORK. LWORK must be at least 1.
154 *> The exact minimum amount of workspace needed depends on M,
155 *> N and NRHS. As long as LWORK is at least
157 *> if M is greater than or equal to N or
159 *> if M is less than N, the code will execute correctly.
160 *> For good performance, LWORK should generally be larger.
162 *> If LWORK = -1, then a workspace query is assumed; the routine
163 *> only calculates the optimal size of the array WORK and the
164 *> minimum sizes of the arrays RWORK and IWORK, and returns
165 *> these values as the first entries of the WORK, RWORK and
166 *> IWORK arrays, and no error message related to LWORK is issued
172 *> RWORK is DOUBLE PRECISION array, dimension (MAX(1,LRWORK))
174 *> 10*N + 2*N*SMLSIZ + 8*N*NLVL + 3*SMLSIZ*NRHS +
175 *> MAX( (SMLSIZ+1)**2, N*(1+NRHS) + 2*NRHS )
176 *> if M is greater than or equal to N or
177 *> 10*M + 2*M*SMLSIZ + 8*M*NLVL + 3*SMLSIZ*NRHS +
178 *> MAX( (SMLSIZ+1)**2, N*(1+NRHS) + 2*NRHS )
179 *> if M is less than N, the code will execute correctly.
180 *> SMLSIZ is returned by ILAENV and is equal to the maximum
181 *> size of the subproblems at the bottom of the computation
182 *> tree (usually about 25), and
183 *> NLVL = MAX( 0, INT( LOG_2( MIN( M,N )/(SMLSIZ+1) ) ) + 1 )
184 *> On exit, if INFO = 0, RWORK(1) returns the minimum LRWORK.
189 *> IWORK is INTEGER array, dimension (MAX(1,LIWORK))
190 *> LIWORK >= max(1, 3*MINMN*NLVL + 11*MINMN),
191 *> where MINMN = MIN( M,N ).
192 *> On exit, if INFO = 0, IWORK(1) returns the minimum LIWORK.
198 *> = 0: successful exit
199 *> < 0: if INFO = -i, the i-th argument had an illegal value.
200 *> > 0: the algorithm for computing the SVD failed to converge;
201 *> if INFO = i, i off-diagonal elements of an intermediate
202 *> bidiagonal form did not converge to zero.
208 *> \author Univ. of Tennessee
209 *> \author Univ. of California Berkeley
210 *> \author Univ. of Colorado Denver
213 *> \date November 2011
215 *> \ingroup complex16GEsolve
217 *> \par Contributors:
220 *> Ming Gu and Ren-Cang Li, Computer Science Division, University of
221 *> California at Berkeley, USA \n
222 *> Osni Marques, LBNL/NERSC, USA \n
224 * =====================================================================
225 SUBROUTINE ZGELSD( M, N, NRHS, A, LDA, B, LDB, S, RCOND, RANK,
226 $ WORK, LWORK, RWORK, IWORK, INFO )
228 * -- LAPACK driver routine (version 3.4.0) --
229 * -- LAPACK is a software package provided by Univ. of Tennessee, --
230 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
233 * .. Scalar Arguments ..
234 INTEGER INFO, LDA, LDB, LWORK, M, N, NRHS, RANK
235 DOUBLE PRECISION RCOND
237 * .. Array Arguments ..
239 DOUBLE PRECISION RWORK( * ), S( * )
240 COMPLEX*16 A( LDA, * ), B( LDB, * ), WORK( * )
243 * =====================================================================
246 DOUBLE PRECISION ZERO, ONE, TWO
247 PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0, TWO = 2.0D+0 )
249 PARAMETER ( CZERO = ( 0.0D+0, 0.0D+0 ) )
251 * .. Local Scalars ..
253 INTEGER IASCL, IBSCL, IE, IL, ITAU, ITAUP, ITAUQ,
254 $ LDWORK, LIWORK, LRWORK, MAXMN, MAXWRK, MINMN,
255 $ MINWRK, MM, MNTHR, NLVL, NRWORK, NWORK, SMLSIZ
256 DOUBLE PRECISION ANRM, BIGNUM, BNRM, EPS, SFMIN, SMLNUM
258 * .. External Subroutines ..
259 EXTERNAL DLABAD, DLASCL, DLASET, XERBLA, ZGEBRD, ZGELQF,
260 $ ZGEQRF, ZLACPY, ZLALSD, ZLASCL, ZLASET, ZUNMBR,
263 * .. External Functions ..
265 DOUBLE PRECISION DLAMCH, ZLANGE
266 EXTERNAL ILAENV, DLAMCH, ZLANGE
268 * .. Intrinsic Functions ..
269 INTRINSIC INT, LOG, MAX, MIN, DBLE
271 * .. Executable Statements ..
273 * Test the input arguments.
278 LQUERY = ( LWORK.EQ.-1 )
281 ELSE IF( N.LT.0 ) THEN
283 ELSE IF( NRHS.LT.0 ) THEN
285 ELSE IF( LDA.LT.MAX( 1, M ) ) THEN
287 ELSE IF( LDB.LT.MAX( 1, MAXMN ) ) THEN
292 * (Note: Comments in the code beginning "Workspace:" describe the
293 * minimal amount of workspace needed at that point in the code,
294 * as well as the preferred amount for good performance.
295 * NB refers to the optimal block size for the immediately
296 * following subroutine, as returned by ILAENV.)
303 IF( MINMN.GT.0 ) THEN
304 SMLSIZ = ILAENV( 9, 'ZGELSD', ' ', 0, 0, 0, 0 )
305 MNTHR = ILAENV( 6, 'ZGELSD', ' ', M, N, NRHS, -1 )
306 NLVL = MAX( INT( LOG( DBLE( MINMN ) / DBLE( SMLSIZ + 1 ) ) /
307 $ LOG( TWO ) ) + 1, 0 )
308 LIWORK = 3*MINMN*NLVL + 11*MINMN
310 IF( M.GE.N .AND. M.GE.MNTHR ) THEN
312 * Path 1a - overdetermined, with many more rows than
316 MAXWRK = MAX( MAXWRK, N*ILAENV( 1, 'ZGEQRF', ' ', M, N,
318 MAXWRK = MAX( MAXWRK, NRHS*ILAENV( 1, 'ZUNMQR', 'LC', M,
323 * Path 1 - overdetermined or exactly determined.
325 LRWORK = 10*N + 2*N*SMLSIZ + 8*N*NLVL + 3*SMLSIZ*NRHS +
326 $ MAX( (SMLSIZ+1)**2, N*(1+NRHS) + 2*NRHS )
327 MAXWRK = MAX( MAXWRK, 2*N + ( MM + N )*ILAENV( 1,
328 $ 'ZGEBRD', ' ', MM, N, -1, -1 ) )
329 MAXWRK = MAX( MAXWRK, 2*N + NRHS*ILAENV( 1, 'ZUNMBR',
330 $ 'QLC', MM, NRHS, N, -1 ) )
331 MAXWRK = MAX( MAXWRK, 2*N + ( N - 1 )*ILAENV( 1,
332 $ 'ZUNMBR', 'PLN', N, NRHS, N, -1 ) )
333 MAXWRK = MAX( MAXWRK, 2*N + N*NRHS )
334 MINWRK = MAX( 2*N + MM, 2*N + N*NRHS )
337 LRWORK = 10*M + 2*M*SMLSIZ + 8*M*NLVL + 3*SMLSIZ*NRHS +
338 $ MAX( (SMLSIZ+1)**2, N*(1+NRHS) + 2*NRHS )
339 IF( N.GE.MNTHR ) THEN
341 * Path 2a - underdetermined, with many more columns
344 MAXWRK = M + M*ILAENV( 1, 'ZGELQF', ' ', M, N, -1,
346 MAXWRK = MAX( MAXWRK, M*M + 4*M + 2*M*ILAENV( 1,
347 $ 'ZGEBRD', ' ', M, M, -1, -1 ) )
348 MAXWRK = MAX( MAXWRK, M*M + 4*M + NRHS*ILAENV( 1,
349 $ 'ZUNMBR', 'QLC', M, NRHS, M, -1 ) )
350 MAXWRK = MAX( MAXWRK, M*M + 4*M + ( M - 1 )*ILAENV( 1,
351 $ 'ZUNMLQ', 'LC', N, NRHS, M, -1 ) )
353 MAXWRK = MAX( MAXWRK, M*M + M + M*NRHS )
355 MAXWRK = MAX( MAXWRK, M*M + 2*M )
357 MAXWRK = MAX( MAXWRK, M*M + 4*M + M*NRHS )
358 ! XXX: Ensure the Path 2a case below is triggered. The workspace
359 ! calculation should use queries for all routines eventually.
360 MAXWRK = MAX( MAXWRK,
361 $ 4*M+M*M+MAX( M, 2*M-4, NRHS, N-3*M ) )
364 * Path 2 - underdetermined.
366 MAXWRK = 2*M + ( N + M )*ILAENV( 1, 'ZGEBRD', ' ', M,
368 MAXWRK = MAX( MAXWRK, 2*M + NRHS*ILAENV( 1, 'ZUNMBR',
369 $ 'QLC', M, NRHS, M, -1 ) )
370 MAXWRK = MAX( MAXWRK, 2*M + M*ILAENV( 1, 'ZUNMBR',
371 $ 'PLN', N, NRHS, M, -1 ) )
372 MAXWRK = MAX( MAXWRK, 2*M + M*NRHS )
374 MINWRK = MAX( 2*M + N, 2*M + M*NRHS )
377 MINWRK = MIN( MINWRK, MAXWRK )
382 IF( LWORK.LT.MINWRK .AND. .NOT.LQUERY ) THEN
388 CALL XERBLA( 'ZGELSD', -INFO )
390 ELSE IF( LQUERY ) THEN
394 * Quick return if possible.
396 IF( M.EQ.0 .OR. N.EQ.0 ) THEN
401 * Get machine parameters.
404 SFMIN = DLAMCH( 'S' )
406 BIGNUM = ONE / SMLNUM
407 CALL DLABAD( SMLNUM, BIGNUM )
409 * Scale A if max entry outside range [SMLNUM,BIGNUM].
411 ANRM = ZLANGE( 'M', M, N, A, LDA, RWORK )
413 IF( ANRM.GT.ZERO .AND. ANRM.LT.SMLNUM ) THEN
415 * Scale matrix norm up to SMLNUM
417 CALL ZLASCL( 'G', 0, 0, ANRM, SMLNUM, M, N, A, LDA, INFO )
419 ELSE IF( ANRM.GT.BIGNUM ) THEN
421 * Scale matrix norm down to BIGNUM.
423 CALL ZLASCL( 'G', 0, 0, ANRM, BIGNUM, M, N, A, LDA, INFO )
425 ELSE IF( ANRM.EQ.ZERO ) THEN
427 * Matrix all zero. Return zero solution.
429 CALL ZLASET( 'F', MAX( M, N ), NRHS, CZERO, CZERO, B, LDB )
430 CALL DLASET( 'F', MINMN, 1, ZERO, ZERO, S, 1 )
435 * Scale B if max entry outside range [SMLNUM,BIGNUM].
437 BNRM = ZLANGE( 'M', M, NRHS, B, LDB, RWORK )
439 IF( BNRM.GT.ZERO .AND. BNRM.LT.SMLNUM ) THEN
441 * Scale matrix norm up to SMLNUM.
443 CALL ZLASCL( 'G', 0, 0, BNRM, SMLNUM, M, NRHS, B, LDB, INFO )
445 ELSE IF( BNRM.GT.BIGNUM ) THEN
447 * Scale matrix norm down to BIGNUM.
449 CALL ZLASCL( 'G', 0, 0, BNRM, BIGNUM, M, NRHS, B, LDB, INFO )
453 * If M < N make sure B(M+1:N,:) = 0
456 $ CALL ZLASET( 'F', N-M, NRHS, CZERO, CZERO, B( M+1, 1 ), LDB )
458 * Overdetermined case.
462 * Path 1 - overdetermined or exactly determined.
465 IF( M.GE.MNTHR ) THEN
467 * Path 1a - overdetermined, with many more rows than columns
474 * (RWorkspace: need N)
475 * (CWorkspace: need N, prefer N*NB)
477 CALL ZGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ),
478 $ LWORK-NWORK+1, INFO )
480 * Multiply B by transpose(Q).
481 * (RWorkspace: need N)
482 * (CWorkspace: need NRHS, prefer NRHS*NB)
484 CALL ZUNMQR( 'L', 'C', M, NRHS, N, A, LDA, WORK( ITAU ), B,
485 $ LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
490 CALL ZLASET( 'L', N-1, N-1, CZERO, CZERO, A( 2, 1 ),
501 * Bidiagonalize R in A.
502 * (RWorkspace: need N)
503 * (CWorkspace: need 2*N+MM, prefer 2*N+(MM+N)*NB)
505 CALL ZGEBRD( MM, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ),
506 $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1,
509 * Multiply B by transpose of left bidiagonalizing vectors of R.
510 * (CWorkspace: need 2*N+NRHS, prefer 2*N+NRHS*NB)
512 CALL ZUNMBR( 'Q', 'L', 'C', MM, NRHS, N, A, LDA, WORK( ITAUQ ),
513 $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
515 * Solve the bidiagonal least squares problem.
517 CALL ZLALSD( 'U', SMLSIZ, N, NRHS, S, RWORK( IE ), B, LDB,
518 $ RCOND, RANK, WORK( NWORK ), RWORK( NRWORK ),
524 * Multiply B by right bidiagonalizing vectors of R.
526 CALL ZUNMBR( 'P', 'L', 'N', N, NRHS, N, A, LDA, WORK( ITAUP ),
527 $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
529 ELSE IF( N.GE.MNTHR .AND. LWORK.GE.4*M+M*M+
530 $ MAX( M, 2*M-4, NRHS, N-3*M ) ) THEN
532 * Path 2a - underdetermined, with many more columns than rows
533 * and sufficient workspace for an efficient algorithm.
536 IF( LWORK.GE.MAX( 4*M+M*LDA+MAX( M, 2*M-4, NRHS, N-3*M ),
537 $ M*LDA+M+M*NRHS ) )LDWORK = LDA
542 * (CWorkspace: need 2*M, prefer M+M*NB)
544 CALL ZGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ),
545 $ LWORK-NWORK+1, INFO )
548 * Copy L to WORK(IL), zeroing out above its diagonal.
550 CALL ZLACPY( 'L', M, M, A, LDA, WORK( IL ), LDWORK )
551 CALL ZLASET( 'U', M-1, M-1, CZERO, CZERO, WORK( IL+LDWORK ),
553 ITAUQ = IL + LDWORK*M
559 * Bidiagonalize L in WORK(IL).
560 * (RWorkspace: need M)
561 * (CWorkspace: need M*M+4*M, prefer M*M+4*M+2*M*NB)
563 CALL ZGEBRD( M, M, WORK( IL ), LDWORK, S, RWORK( IE ),
564 $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ),
565 $ LWORK-NWORK+1, INFO )
567 * Multiply B by transpose of left bidiagonalizing vectors of L.
568 * (CWorkspace: need M*M+4*M+NRHS, prefer M*M+4*M+NRHS*NB)
570 CALL ZUNMBR( 'Q', 'L', 'C', M, NRHS, M, WORK( IL ), LDWORK,
571 $ WORK( ITAUQ ), B, LDB, WORK( NWORK ),
572 $ LWORK-NWORK+1, INFO )
574 * Solve the bidiagonal least squares problem.
576 CALL ZLALSD( 'U', SMLSIZ, M, NRHS, S, RWORK( IE ), B, LDB,
577 $ RCOND, RANK, WORK( NWORK ), RWORK( NRWORK ),
583 * Multiply B by right bidiagonalizing vectors of L.
585 CALL ZUNMBR( 'P', 'L', 'N', M, NRHS, M, WORK( IL ), LDWORK,
586 $ WORK( ITAUP ), B, LDB, WORK( NWORK ),
587 $ LWORK-NWORK+1, INFO )
589 * Zero out below first M rows of B.
591 CALL ZLASET( 'F', N-M, NRHS, CZERO, CZERO, B( M+1, 1 ), LDB )
594 * Multiply transpose(Q) by B.
595 * (CWorkspace: need NRHS, prefer NRHS*NB)
597 CALL ZUNMLQ( 'L', 'C', N, NRHS, M, A, LDA, WORK( ITAU ), B,
598 $ LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
602 * Path 2 - remaining underdetermined cases.
611 * (RWorkspace: need M)
612 * (CWorkspace: need 2*M+N, prefer 2*M+(M+N)*NB)
614 CALL ZGEBRD( M, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ),
615 $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1,
618 * Multiply B by transpose of left bidiagonalizing vectors.
619 * (CWorkspace: need 2*M+NRHS, prefer 2*M+NRHS*NB)
621 CALL ZUNMBR( 'Q', 'L', 'C', M, NRHS, N, A, LDA, WORK( ITAUQ ),
622 $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
624 * Solve the bidiagonal least squares problem.
626 CALL ZLALSD( 'L', SMLSIZ, M, NRHS, S, RWORK( IE ), B, LDB,
627 $ RCOND, RANK, WORK( NWORK ), RWORK( NRWORK ),
633 * Multiply B by right bidiagonalizing vectors of A.
635 CALL ZUNMBR( 'P', 'L', 'N', N, NRHS, M, A, LDA, WORK( ITAUP ),
636 $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
642 IF( IASCL.EQ.1 ) THEN
643 CALL ZLASCL( 'G', 0, 0, ANRM, SMLNUM, N, NRHS, B, LDB, INFO )
644 CALL DLASCL( 'G', 0, 0, SMLNUM, ANRM, MINMN, 1, S, MINMN,
646 ELSE IF( IASCL.EQ.2 ) THEN
647 CALL ZLASCL( 'G', 0, 0, ANRM, BIGNUM, N, NRHS, B, LDB, INFO )
648 CALL DLASCL( 'G', 0, 0, BIGNUM, ANRM, MINMN, 1, S, MINMN,
651 IF( IBSCL.EQ.1 ) THEN
652 CALL ZLASCL( 'G', 0, 0, SMLNUM, BNRM, N, NRHS, B, LDB, INFO )
653 ELSE IF( IBSCL.EQ.2 ) THEN
654 CALL ZLASCL( 'G', 0, 0, BIGNUM, BNRM, N, NRHS, B, LDB, INFO )