1 *> \brief <b> CGELSD computes the minimum-norm solution to a linear least squares problem for GE matrices</b>
3 * =========== DOCUMENTATION ===========
5 * Online html documentation available at
6 * http://www.netlib.org/lapack/explore-html/
9 *> Download CGELSD + dependencies
10 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/cgelsd.f">
12 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/cgelsd.f">
14 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/cgelsd.f">
21 * SUBROUTINE CGELSD( M, N, NRHS, A, LDA, B, LDB, S, RCOND, RANK,
22 * WORK, LWORK, RWORK, IWORK, INFO )
24 * .. Scalar Arguments ..
25 * INTEGER INFO, LDA, LDB, LWORK, M, N, NRHS, RANK
28 * .. Array Arguments ..
30 * REAL RWORK( * ), S( * )
31 * COMPLEX A( LDA, * ), B( LDB, * ), WORK( * )
40 *> CGELSD computes the minimum-norm solution to a real linear least
42 *> minimize 2-norm(| b - A*x |)
43 *> using the singular value decomposition (SVD) of A. A is an M-by-N
44 *> matrix which may be rank-deficient.
46 *> Several right hand side vectors b and solution vectors x can be
47 *> handled in a single call; they are stored as the columns of the
48 *> M-by-NRHS right hand side matrix B and the N-by-NRHS solution
51 *> The problem is solved in three steps:
52 *> (1) Reduce the coefficient matrix A to bidiagonal form with
53 *> Householder transformations, reducing the original problem
54 *> into a "bidiagonal least squares problem" (BLS)
55 *> (2) Solve the BLS using a divide and conquer approach.
56 *> (3) Apply back all the Householder transformations to solve
57 *> the original least squares problem.
59 *> The effective rank of A is determined by treating as zero those
60 *> singular values which are less than RCOND times the largest singular
63 *> The divide and conquer algorithm makes very mild assumptions about
64 *> floating point arithmetic. It will work on machines with a guard
65 *> digit in add/subtract, or on those binary machines without guard
66 *> digits which subtract like the Cray X-MP, Cray Y-MP, Cray C-90, or
67 *> Cray-2. It could conceivably fail on hexadecimal or decimal machines
68 *> without guard digits, but we know of none.
77 *> The number of rows of the matrix A. M >= 0.
83 *> The number of columns of the matrix A. N >= 0.
89 *> The number of right hand sides, i.e., the number of columns
90 *> of the matrices B and X. NRHS >= 0.
95 *> A is COMPLEX array, dimension (LDA,N)
96 *> On entry, the M-by-N matrix A.
97 *> On exit, A has been destroyed.
103 *> The leading dimension of the array A. LDA >= max(1,M).
108 *> B is COMPLEX array, dimension (LDB,NRHS)
109 *> On entry, the M-by-NRHS right hand side matrix B.
110 *> On exit, B is overwritten by the N-by-NRHS solution matrix X.
111 *> If m >= n and RANK = n, the residual sum-of-squares for
112 *> the solution in the i-th column is given by the sum of
113 *> squares of the modulus of elements n+1:m in that column.
119 *> The leading dimension of the array B. LDB >= max(1,M,N).
124 *> S is REAL array, dimension (min(M,N))
125 *> The singular values of A in decreasing order.
126 *> The condition number of A in the 2-norm = S(1)/S(min(m,n)).
132 *> RCOND is used to determine the effective rank of A.
133 *> Singular values S(i) <= RCOND*S(1) are treated as zero.
134 *> If RCOND < 0, machine precision is used instead.
140 *> The effective rank of A, i.e., the number of singular values
141 *> which are greater than RCOND*S(1).
146 *> WORK is COMPLEX array, dimension (MAX(1,LWORK))
147 *> On exit, if INFO = 0, WORK(1) returns the optimal LWORK.
153 *> The dimension of the array WORK. LWORK must be at least 1.
154 *> The exact minimum amount of workspace needed depends on M,
155 *> N and NRHS. As long as LWORK is at least
157 *> if M is greater than or equal to N or
159 *> if M is less than N, the code will execute correctly.
160 *> For good performance, LWORK should generally be larger.
162 *> If LWORK = -1, then a workspace query is assumed; the routine
163 *> only calculates the optimal size of the array WORK and the
164 *> minimum sizes of the arrays RWORK and IWORK, and returns
165 *> these values as the first entries of the WORK, RWORK and
166 *> IWORK arrays, and no error message related to LWORK is issued
172 *> RWORK is REAL array, dimension (MAX(1,LRWORK))
174 *> 10*N + 2*N*SMLSIZ + 8*N*NLVL + 3*SMLSIZ*NRHS +
175 *> MAX( (SMLSIZ+1)**2, N*(1+NRHS) + 2*NRHS )
176 *> if M is greater than or equal to N or
177 *> 10*M + 2*M*SMLSIZ + 8*M*NLVL + 3*SMLSIZ*NRHS +
178 *> MAX( (SMLSIZ+1)**2, N*(1+NRHS) + 2*NRHS )
179 *> if M is less than N, the code will execute correctly.
180 *> SMLSIZ is returned by ILAENV and is equal to the maximum
181 *> size of the subproblems at the bottom of the computation
182 *> tree (usually about 25), and
183 *> NLVL = MAX( 0, INT( LOG_2( MIN( M,N )/(SMLSIZ+1) ) ) + 1 )
184 *> On exit, if INFO = 0, RWORK(1) returns the minimum LRWORK.
189 *> IWORK is INTEGER array, dimension (MAX(1,LIWORK))
190 *> LIWORK >= max(1, 3*MINMN*NLVL + 11*MINMN),
191 *> where MINMN = MIN( M,N ).
192 *> On exit, if INFO = 0, IWORK(1) returns the minimum LIWORK.
198 *> = 0: successful exit
199 *> < 0: if INFO = -i, the i-th argument had an illegal value.
200 *> > 0: the algorithm for computing the SVD failed to converge;
201 *> if INFO = i, i off-diagonal elements of an intermediate
202 *> bidiagonal form did not converge to zero.
208 *> \author Univ. of Tennessee
209 *> \author Univ. of California Berkeley
210 *> \author Univ. of Colorado Denver
213 *> \date November 2011
215 *> \ingroup complexGEsolve
217 *> \par Contributors:
220 *> Ming Gu and Ren-Cang Li, Computer Science Division, University of
221 *> California at Berkeley, USA \n
222 *> Osni Marques, LBNL/NERSC, USA \n
224 * =====================================================================
225 SUBROUTINE CGELSD( M, N, NRHS, A, LDA, B, LDB, S, RCOND, RANK,
226 $ WORK, LWORK, RWORK, IWORK, INFO )
228 * -- LAPACK driver routine (version 3.4.0) --
229 * -- LAPACK is a software package provided by Univ. of Tennessee, --
230 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
233 * .. Scalar Arguments ..
234 INTEGER INFO, LDA, LDB, LWORK, M, N, NRHS, RANK
237 * .. Array Arguments ..
239 REAL RWORK( * ), S( * )
240 COMPLEX A( LDA, * ), B( LDB, * ), WORK( * )
243 * =====================================================================
247 PARAMETER ( ZERO = 0.0E+0, ONE = 1.0E+0, TWO = 2.0E+0 )
249 PARAMETER ( CZERO = ( 0.0E+0, 0.0E+0 ) )
251 * .. Local Scalars ..
253 INTEGER IASCL, IBSCL, IE, IL, ITAU, ITAUP, ITAUQ,
254 $ LDWORK, LIWORK, LRWORK, MAXMN, MAXWRK, MINMN,
255 $ MINWRK, MM, MNTHR, NLVL, NRWORK, NWORK, SMLSIZ
256 REAL ANRM, BIGNUM, BNRM, EPS, SFMIN, SMLNUM
258 * .. External Subroutines ..
259 EXTERNAL CGEBRD, CGELQF, CGEQRF, CLACPY,
260 $ CLALSD, CLASCL, CLASET, CUNMBR,
261 $ CUNMLQ, CUNMQR, SLABAD, SLASCL,
264 * .. External Functions ..
267 EXTERNAL CLANGE, SLAMCH, ILAENV
269 * .. Intrinsic Functions ..
270 INTRINSIC INT, LOG, MAX, MIN, REAL
272 * .. Executable Statements ..
274 * Test the input arguments.
279 LQUERY = ( LWORK.EQ.-1 )
282 ELSE IF( N.LT.0 ) THEN
284 ELSE IF( NRHS.LT.0 ) THEN
286 ELSE IF( LDA.LT.MAX( 1, M ) ) THEN
288 ELSE IF( LDB.LT.MAX( 1, MAXMN ) ) THEN
293 * (Note: Comments in the code beginning "Workspace:" describe the
294 * minimal amount of workspace needed at that point in the code,
295 * as well as the preferred amount for good performance.
296 * NB refers to the optimal block size for the immediately
297 * following subroutine, as returned by ILAENV.)
304 IF( MINMN.GT.0 ) THEN
305 SMLSIZ = ILAENV( 9, 'CGELSD', ' ', 0, 0, 0, 0 )
306 MNTHR = ILAENV( 6, 'CGELSD', ' ', M, N, NRHS, -1 )
307 NLVL = MAX( INT( LOG( REAL( MINMN ) / REAL( SMLSIZ + 1 ) ) /
308 $ LOG( TWO ) ) + 1, 0 )
309 LIWORK = 3*MINMN*NLVL + 11*MINMN
311 IF( M.GE.N .AND. M.GE.MNTHR ) THEN
313 * Path 1a - overdetermined, with many more rows than
317 MAXWRK = MAX( MAXWRK, N*ILAENV( 1, 'CGEQRF', ' ', M, N,
319 MAXWRK = MAX( MAXWRK, NRHS*ILAENV( 1, 'CUNMQR', 'LC', M,
324 * Path 1 - overdetermined or exactly determined.
326 LRWORK = 10*N + 2*N*SMLSIZ + 8*N*NLVL + 3*SMLSIZ*NRHS +
327 $ MAX( (SMLSIZ+1)**2, N*(1+NRHS) + 2*NRHS )
328 MAXWRK = MAX( MAXWRK, 2*N + ( MM + N )*ILAENV( 1,
329 $ 'CGEBRD', ' ', MM, N, -1, -1 ) )
330 MAXWRK = MAX( MAXWRK, 2*N + NRHS*ILAENV( 1, 'CUNMBR',
331 $ 'QLC', MM, NRHS, N, -1 ) )
332 MAXWRK = MAX( MAXWRK, 2*N + ( N - 1 )*ILAENV( 1,
333 $ 'CUNMBR', 'PLN', N, NRHS, N, -1 ) )
334 MAXWRK = MAX( MAXWRK, 2*N + N*NRHS )
335 MINWRK = MAX( 2*N + MM, 2*N + N*NRHS )
338 LRWORK = 10*M + 2*M*SMLSIZ + 8*M*NLVL + 3*SMLSIZ*NRHS +
339 $ MAX( (SMLSIZ+1)**2, N*(1+NRHS) + 2*NRHS )
340 IF( N.GE.MNTHR ) THEN
342 * Path 2a - underdetermined, with many more columns
345 MAXWRK = M + M*ILAENV( 1, 'CGELQF', ' ', M, N, -1,
347 MAXWRK = MAX( MAXWRK, M*M + 4*M + 2*M*ILAENV( 1,
348 $ 'CGEBRD', ' ', M, M, -1, -1 ) )
349 MAXWRK = MAX( MAXWRK, M*M + 4*M + NRHS*ILAENV( 1,
350 $ 'CUNMBR', 'QLC', M, NRHS, M, -1 ) )
351 MAXWRK = MAX( MAXWRK, M*M + 4*M + ( M - 1 )*ILAENV( 1,
352 $ 'CUNMLQ', 'LC', N, NRHS, M, -1 ) )
354 MAXWRK = MAX( MAXWRK, M*M + M + M*NRHS )
356 MAXWRK = MAX( MAXWRK, M*M + 2*M )
358 MAXWRK = MAX( MAXWRK, M*M + 4*M + M*NRHS )
359 ! XXX: Ensure the Path 2a case below is triggered. The workspace
360 ! calculation should use queries for all routines eventually.
361 MAXWRK = MAX( MAXWRK,
362 $ 4*M+M*M+MAX( M, 2*M-4, NRHS, N-3*M ) )
365 * Path 2 - underdetermined.
367 MAXWRK = 2*M + ( N + M )*ILAENV( 1, 'CGEBRD', ' ', M,
369 MAXWRK = MAX( MAXWRK, 2*M + NRHS*ILAENV( 1, 'CUNMBR',
370 $ 'QLC', M, NRHS, M, -1 ) )
371 MAXWRK = MAX( MAXWRK, 2*M + M*ILAENV( 1, 'CUNMBR',
372 $ 'PLN', N, NRHS, M, -1 ) )
373 MAXWRK = MAX( MAXWRK, 2*M + M*NRHS )
375 MINWRK = MAX( 2*M + N, 2*M + M*NRHS )
378 MINWRK = MIN( MINWRK, MAXWRK )
383 IF( LWORK.LT.MINWRK .AND. .NOT.LQUERY ) THEN
389 CALL XERBLA( 'CGELSD', -INFO )
391 ELSE IF( LQUERY ) THEN
395 * Quick return if possible.
397 IF( M.EQ.0 .OR. N.EQ.0 ) THEN
402 * Get machine parameters.
405 SFMIN = SLAMCH( 'S' )
407 BIGNUM = ONE / SMLNUM
408 CALL SLABAD( SMLNUM, BIGNUM )
410 * Scale A if max entry outside range [SMLNUM,BIGNUM].
412 ANRM = CLANGE( 'M', M, N, A, LDA, RWORK )
414 IF( ANRM.GT.ZERO .AND. ANRM.LT.SMLNUM ) THEN
416 * Scale matrix norm up to SMLNUM
418 CALL CLASCL( 'G', 0, 0, ANRM, SMLNUM, M, N, A, LDA, INFO )
420 ELSE IF( ANRM.GT.BIGNUM ) THEN
422 * Scale matrix norm down to BIGNUM.
424 CALL CLASCL( 'G', 0, 0, ANRM, BIGNUM, M, N, A, LDA, INFO )
426 ELSE IF( ANRM.EQ.ZERO ) THEN
428 * Matrix all zero. Return zero solution.
430 CALL CLASET( 'F', MAX( M, N ), NRHS, CZERO, CZERO, B, LDB )
431 CALL SLASET( 'F', MINMN, 1, ZERO, ZERO, S, 1 )
436 * Scale B if max entry outside range [SMLNUM,BIGNUM].
438 BNRM = CLANGE( 'M', M, NRHS, B, LDB, RWORK )
440 IF( BNRM.GT.ZERO .AND. BNRM.LT.SMLNUM ) THEN
442 * Scale matrix norm up to SMLNUM.
444 CALL CLASCL( 'G', 0, 0, BNRM, SMLNUM, M, NRHS, B, LDB, INFO )
446 ELSE IF( BNRM.GT.BIGNUM ) THEN
448 * Scale matrix norm down to BIGNUM.
450 CALL CLASCL( 'G', 0, 0, BNRM, BIGNUM, M, NRHS, B, LDB, INFO )
454 * If M < N make sure B(M+1:N,:) = 0
457 $ CALL CLASET( 'F', N-M, NRHS, CZERO, CZERO, B( M+1, 1 ), LDB )
459 * Overdetermined case.
463 * Path 1 - overdetermined or exactly determined.
466 IF( M.GE.MNTHR ) THEN
468 * Path 1a - overdetermined, with many more rows than columns
475 * (RWorkspace: need N)
476 * (CWorkspace: need N, prefer N*NB)
478 CALL CGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ),
479 $ LWORK-NWORK+1, INFO )
481 * Multiply B by transpose(Q).
482 * (RWorkspace: need N)
483 * (CWorkspace: need NRHS, prefer NRHS*NB)
485 CALL CUNMQR( 'L', 'C', M, NRHS, N, A, LDA, WORK( ITAU ), B,
486 $ LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
491 CALL CLASET( 'L', N-1, N-1, CZERO, CZERO, A( 2, 1 ),
502 * Bidiagonalize R in A.
503 * (RWorkspace: need N)
504 * (CWorkspace: need 2*N+MM, prefer 2*N+(MM+N)*NB)
506 CALL CGEBRD( MM, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ),
507 $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1,
510 * Multiply B by transpose of left bidiagonalizing vectors of R.
511 * (CWorkspace: need 2*N+NRHS, prefer 2*N+NRHS*NB)
513 CALL CUNMBR( 'Q', 'L', 'C', MM, NRHS, N, A, LDA, WORK( ITAUQ ),
514 $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
516 * Solve the bidiagonal least squares problem.
518 CALL CLALSD( 'U', SMLSIZ, N, NRHS, S, RWORK( IE ), B, LDB,
519 $ RCOND, RANK, WORK( NWORK ), RWORK( NRWORK ),
525 * Multiply B by right bidiagonalizing vectors of R.
527 CALL CUNMBR( 'P', 'L', 'N', N, NRHS, N, A, LDA, WORK( ITAUP ),
528 $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
530 ELSE IF( N.GE.MNTHR .AND. LWORK.GE.4*M+M*M+
531 $ MAX( M, 2*M-4, NRHS, N-3*M ) ) THEN
533 * Path 2a - underdetermined, with many more columns than rows
534 * and sufficient workspace for an efficient algorithm.
537 IF( LWORK.GE.MAX( 4*M+M*LDA+MAX( M, 2*M-4, NRHS, N-3*M ),
538 $ M*LDA+M+M*NRHS ) )LDWORK = LDA
543 * (CWorkspace: need 2*M, prefer M+M*NB)
545 CALL CGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ),
546 $ LWORK-NWORK+1, INFO )
549 * Copy L to WORK(IL), zeroing out above its diagonal.
551 CALL CLACPY( 'L', M, M, A, LDA, WORK( IL ), LDWORK )
552 CALL CLASET( 'U', M-1, M-1, CZERO, CZERO, WORK( IL+LDWORK ),
554 ITAUQ = IL + LDWORK*M
560 * Bidiagonalize L in WORK(IL).
561 * (RWorkspace: need M)
562 * (CWorkspace: need M*M+4*M, prefer M*M+4*M+2*M*NB)
564 CALL CGEBRD( M, M, WORK( IL ), LDWORK, S, RWORK( IE ),
565 $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ),
566 $ LWORK-NWORK+1, INFO )
568 * Multiply B by transpose of left bidiagonalizing vectors of L.
569 * (CWorkspace: need M*M+4*M+NRHS, prefer M*M+4*M+NRHS*NB)
571 CALL CUNMBR( 'Q', 'L', 'C', M, NRHS, M, WORK( IL ), LDWORK,
572 $ WORK( ITAUQ ), B, LDB, WORK( NWORK ),
573 $ LWORK-NWORK+1, INFO )
575 * Solve the bidiagonal least squares problem.
577 CALL CLALSD( 'U', SMLSIZ, M, NRHS, S, RWORK( IE ), B, LDB,
578 $ RCOND, RANK, WORK( NWORK ), RWORK( NRWORK ),
584 * Multiply B by right bidiagonalizing vectors of L.
586 CALL CUNMBR( 'P', 'L', 'N', M, NRHS, M, WORK( IL ), LDWORK,
587 $ WORK( ITAUP ), B, LDB, WORK( NWORK ),
588 $ LWORK-NWORK+1, INFO )
590 * Zero out below first M rows of B.
592 CALL CLASET( 'F', N-M, NRHS, CZERO, CZERO, B( M+1, 1 ), LDB )
595 * Multiply transpose(Q) by B.
596 * (CWorkspace: need NRHS, prefer NRHS*NB)
598 CALL CUNMLQ( 'L', 'C', N, NRHS, M, A, LDA, WORK( ITAU ), B,
599 $ LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
603 * Path 2 - remaining underdetermined cases.
612 * (RWorkspace: need M)
613 * (CWorkspace: need 2*M+N, prefer 2*M+(M+N)*NB)
615 CALL CGEBRD( M, N, A, LDA, S, RWORK( IE ), WORK( ITAUQ ),
616 $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1,
619 * Multiply B by transpose of left bidiagonalizing vectors.
620 * (CWorkspace: need 2*M+NRHS, prefer 2*M+NRHS*NB)
622 CALL CUNMBR( 'Q', 'L', 'C', M, NRHS, N, A, LDA, WORK( ITAUQ ),
623 $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
625 * Solve the bidiagonal least squares problem.
627 CALL CLALSD( 'L', SMLSIZ, M, NRHS, S, RWORK( IE ), B, LDB,
628 $ RCOND, RANK, WORK( NWORK ), RWORK( NRWORK ),
634 * Multiply B by right bidiagonalizing vectors of A.
636 CALL CUNMBR( 'P', 'L', 'N', N, NRHS, M, A, LDA, WORK( ITAUP ),
637 $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
643 IF( IASCL.EQ.1 ) THEN
644 CALL CLASCL( 'G', 0, 0, ANRM, SMLNUM, N, NRHS, B, LDB, INFO )
645 CALL SLASCL( 'G', 0, 0, SMLNUM, ANRM, MINMN, 1, S, MINMN,
647 ELSE IF( IASCL.EQ.2 ) THEN
648 CALL CLASCL( 'G', 0, 0, ANRM, BIGNUM, N, NRHS, B, LDB, INFO )
649 CALL SLASCL( 'G', 0, 0, BIGNUM, ANRM, MINMN, 1, S, MINMN,
652 IF( IBSCL.EQ.1 ) THEN
653 CALL CLASCL( 'G', 0, 0, SMLNUM, BNRM, N, NRHS, B, LDB, INFO )
654 ELSE IF( IBSCL.EQ.2 ) THEN
655 CALL CLASCL( 'G', 0, 0, BIGNUM, BNRM, N, NRHS, B, LDB, INFO )