1 *> \brief <b> SGELSD computes the minimum-norm solution to a linear least squares problem for GE matrices</b>
3 * =========== DOCUMENTATION ===========
5 * Online html documentation available at
6 * http://www.netlib.org/lapack/explore-html/
9 *> Download SGELSD + dependencies
10 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/sgelsd.f">
12 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/sgelsd.f">
14 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/sgelsd.f">
21 * SUBROUTINE SGELSD( M, N, NRHS, A, LDA, B, LDB, S, RCOND,
22 * RANK, WORK, LWORK, IWORK, INFO )
24 * .. Scalar Arguments ..
25 * INTEGER INFO, LDA, LDB, LWORK, M, N, NRHS, RANK
28 * .. Array Arguments ..
30 * REAL A( LDA, * ), B( LDB, * ), S( * ), WORK( * )
39 *> SGELSD computes the minimum-norm solution to a real linear least
41 *> minimize 2-norm(| b - A*x |)
42 *> using the singular value decomposition (SVD) of A. A is an M-by-N
43 *> matrix which may be rank-deficient.
45 *> Several right hand side vectors b and solution vectors x can be
46 *> handled in a single call; they are stored as the columns of the
47 *> M-by-NRHS right hand side matrix B and the N-by-NRHS solution
50 *> The problem is solved in three steps:
51 *> (1) Reduce the coefficient matrix A to bidiagonal form with
52 *> Householder transformations, reducing the original problem
53 *> into a "bidiagonal least squares problem" (BLS)
54 *> (2) Solve the BLS using a divide and conquer approach.
55 *> (3) Apply back all the Householder transformations to solve
56 *> the original least squares problem.
58 *> The effective rank of A is determined by treating as zero those
59 *> singular values which are less than RCOND times the largest singular
62 *> The divide and conquer algorithm makes very mild assumptions about
63 *> floating point arithmetic. It will work on machines with a guard
64 *> digit in add/subtract, or on those binary machines without guard
65 *> digits which subtract like the Cray X-MP, Cray Y-MP, Cray C-90, or
66 *> Cray-2. It could conceivably fail on hexadecimal or decimal machines
67 *> without guard digits, but we know of none.
76 *> The number of rows of A. M >= 0.
82 *> The number of columns of A. N >= 0.
88 *> The number of right hand sides, i.e., the number of columns
89 *> of the matrices B and X. NRHS >= 0.
94 *> A is REAL array, dimension (LDA,N)
95 *> On entry, the M-by-N matrix A.
96 *> On exit, A has been destroyed.
102 *> The leading dimension of the array A. LDA >= max(1,M).
107 *> B is REAL array, dimension (LDB,NRHS)
108 *> On entry, the M-by-NRHS right hand side matrix B.
109 *> On exit, B is overwritten by the N-by-NRHS solution
110 *> matrix X. If m >= n and RANK = n, the residual
111 *> sum-of-squares for the solution in the i-th column is given
112 *> by the sum of squares of elements n+1:m in that column.
118 *> The leading dimension of the array B. LDB >= max(1,max(M,N)).
123 *> S is REAL array, dimension (min(M,N))
124 *> The singular values of A in decreasing order.
125 *> The condition number of A in the 2-norm = S(1)/S(min(m,n)).
131 *> RCOND is used to determine the effective rank of A.
132 *> Singular values S(i) <= RCOND*S(1) are treated as zero.
133 *> If RCOND < 0, machine precision is used instead.
139 *> The effective rank of A, i.e., the number of singular values
140 *> which are greater than RCOND*S(1).
145 *> WORK is REAL array, dimension (MAX(1,LWORK))
146 *> On exit, if INFO = 0, WORK(1) returns the optimal LWORK.
152 *> The dimension of the array WORK. LWORK must be at least 1.
153 *> The exact minimum amount of workspace needed depends on M,
154 *> N and NRHS. As long as LWORK is at least
155 *> 12*N + 2*N*SMLSIZ + 8*N*NLVL + N*NRHS + (SMLSIZ+1)**2,
156 *> if M is greater than or equal to N or
157 *> 12*M + 2*M*SMLSIZ + 8*M*NLVL + M*NRHS + (SMLSIZ+1)**2,
158 *> if M is less than N, the code will execute correctly.
159 *> SMLSIZ is returned by ILAENV and is equal to the maximum
160 *> size of the subproblems at the bottom of the computation
161 *> tree (usually about 25), and
162 *> NLVL = MAX( 0, INT( LOG_2( MIN( M,N )/(SMLSIZ+1) ) ) + 1 )
163 *> For good performance, LWORK should generally be larger.
165 *> If LWORK = -1, then a workspace query is assumed; the routine
166 *> only calculates the optimal size of the array WORK and the
167 *> minimum size of the array IWORK, and returns these values as
168 *> the first entries of the WORK and IWORK arrays, and no error
169 *> message related to LWORK is issued by XERBLA.
174 *> IWORK is INTEGER array, dimension (MAX(1,LIWORK))
175 *> LIWORK >= max(1, 3*MINMN*NLVL + 11*MINMN),
176 *> where MINMN = MIN( M,N ).
177 *> On exit, if INFO = 0, IWORK(1) returns the minimum LIWORK.
183 *> = 0: successful exit
184 *> < 0: if INFO = -i, the i-th argument had an illegal value.
185 *> > 0: the algorithm for computing the SVD failed to converge;
186 *> if INFO = i, i off-diagonal elements of an intermediate
187 *> bidiagonal form did not converge to zero.
193 *> \author Univ. of Tennessee
194 *> \author Univ. of California Berkeley
195 *> \author Univ. of Colorado Denver
198 *> \date November 2011
200 *> \ingroup realGEsolve
202 *> \par Contributors:
205 *> Ming Gu and Ren-Cang Li, Computer Science Division, University of
206 *> California at Berkeley, USA \n
207 *> Osni Marques, LBNL/NERSC, USA \n
209 * =====================================================================
210 SUBROUTINE SGELSD( M, N, NRHS, A, LDA, B, LDB, S, RCOND,
211 $ RANK, WORK, LWORK, IWORK, INFO )
213 * -- LAPACK driver routine (version 3.4.0) --
214 * -- LAPACK is a software package provided by Univ. of Tennessee, --
215 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
218 * .. Scalar Arguments ..
219 INTEGER INFO, LDA, LDB, LWORK, M, N, NRHS, RANK
222 * .. Array Arguments ..
224 REAL A( LDA, * ), B( LDB, * ), S( * ), WORK( * )
227 * =====================================================================
231 PARAMETER ( ZERO = 0.0E0, ONE = 1.0E0, TWO = 2.0E0 )
233 * .. Local Scalars ..
235 INTEGER IASCL, IBSCL, IE, IL, ITAU, ITAUP, ITAUQ,
236 $ LDWORK, LIWORK, MAXMN, MAXWRK, MINMN, MINWRK,
237 $ MM, MNTHR, NLVL, NWORK, SMLSIZ, WLALSD
238 REAL ANRM, BIGNUM, BNRM, EPS, SFMIN, SMLNUM
240 * .. External Subroutines ..
241 EXTERNAL SGEBRD, SGELQF, SGEQRF, SLABAD, SLACPY, SLALSD,
242 $ SLASCL, SLASET, SORMBR, SORMLQ, SORMQR, XERBLA
244 * .. External Functions ..
247 EXTERNAL SLAMCH, SLANGE, ILAENV
249 * .. Intrinsic Functions ..
250 INTRINSIC INT, LOG, MAX, MIN, REAL
252 * .. Executable Statements ..
254 * Test the input arguments.
259 LQUERY = ( LWORK.EQ.-1 )
262 ELSE IF( N.LT.0 ) THEN
264 ELSE IF( NRHS.LT.0 ) THEN
266 ELSE IF( LDA.LT.MAX( 1, M ) ) THEN
268 ELSE IF( LDB.LT.MAX( 1, MAXMN ) ) THEN
273 * (Note: Comments in the code beginning "Workspace:" describe the
274 * minimal amount of workspace needed at that point in the code,
275 * as well as the preferred amount for good performance.
276 * NB refers to the optimal block size for the immediately
277 * following subroutine, as returned by ILAENV.)
283 IF( MINMN.GT.0 ) THEN
284 SMLSIZ = ILAENV( 9, 'SGELSD', ' ', 0, 0, 0, 0 )
285 MNTHR = ILAENV( 6, 'SGELSD', ' ', M, N, NRHS, -1 )
286 NLVL = MAX( INT( LOG( REAL( MINMN ) / REAL( SMLSIZ + 1 ) ) /
287 $ LOG( TWO ) ) + 1, 0 )
288 LIWORK = 3*MINMN*NLVL + 11*MINMN
290 IF( M.GE.N .AND. M.GE.MNTHR ) THEN
292 * Path 1a - overdetermined, with many more rows than
296 MAXWRK = MAX( MAXWRK, N + N*ILAENV( 1, 'SGEQRF', ' ', M,
298 MAXWRK = MAX( MAXWRK, N + NRHS*ILAENV( 1, 'SORMQR', 'LT',
303 * Path 1 - overdetermined or exactly determined.
305 MAXWRK = MAX( MAXWRK, 3*N + ( MM + N )*ILAENV( 1,
306 $ 'SGEBRD', ' ', MM, N, -1, -1 ) )
307 MAXWRK = MAX( MAXWRK, 3*N + NRHS*ILAENV( 1, 'SORMBR',
308 $ 'QLT', MM, NRHS, N, -1 ) )
309 MAXWRK = MAX( MAXWRK, 3*N + ( N - 1 )*ILAENV( 1,
310 $ 'SORMBR', 'PLN', N, NRHS, N, -1 ) )
311 WLALSD = 9*N + 2*N*SMLSIZ + 8*N*NLVL + N*NRHS +
313 MAXWRK = MAX( MAXWRK, 3*N + WLALSD )
314 MINWRK = MAX( 3*N + MM, 3*N + NRHS, 3*N + WLALSD )
317 WLALSD = 9*M + 2*M*SMLSIZ + 8*M*NLVL + M*NRHS +
319 IF( N.GE.MNTHR ) THEN
321 * Path 2a - underdetermined, with many more columns
324 MAXWRK = M + M*ILAENV( 1, 'SGELQF', ' ', M, N, -1,
326 MAXWRK = MAX( MAXWRK, M*M + 4*M + 2*M*ILAENV( 1,
327 $ 'SGEBRD', ' ', M, M, -1, -1 ) )
328 MAXWRK = MAX( MAXWRK, M*M + 4*M + NRHS*ILAENV( 1,
329 $ 'SORMBR', 'QLT', M, NRHS, M, -1 ) )
330 MAXWRK = MAX( MAXWRK, M*M + 4*M + ( M - 1 )*ILAENV( 1,
331 $ 'SORMBR', 'PLN', M, NRHS, M, -1 ) )
333 MAXWRK = MAX( MAXWRK, M*M + M + M*NRHS )
335 MAXWRK = MAX( MAXWRK, M*M + 2*M )
337 MAXWRK = MAX( MAXWRK, M + NRHS*ILAENV( 1, 'SORMLQ',
338 $ 'LT', N, NRHS, M, -1 ) )
339 MAXWRK = MAX( MAXWRK, M*M + 4*M + WLALSD )
340 ! XXX: Ensure the Path 2a case below is triggered. The workspace
341 ! calculation should use queries for all routines eventually.
342 MAXWRK = MAX( MAXWRK,
343 $ 4*M+M*M+MAX( M, 2*M-4, NRHS, N-3*M ) )
346 * Path 2 - remaining underdetermined cases.
348 MAXWRK = 3*M + ( N + M )*ILAENV( 1, 'SGEBRD', ' ', M,
350 MAXWRK = MAX( MAXWRK, 3*M + NRHS*ILAENV( 1, 'SORMBR',
351 $ 'QLT', M, NRHS, N, -1 ) )
352 MAXWRK = MAX( MAXWRK, 3*M + M*ILAENV( 1, 'SORMBR',
353 $ 'PLN', N, NRHS, M, -1 ) )
354 MAXWRK = MAX( MAXWRK, 3*M + WLALSD )
356 MINWRK = MAX( 3*M + NRHS, 3*M + M, 3*M + WLALSD )
359 MINWRK = MIN( MINWRK, MAXWRK )
363 IF( LWORK.LT.MINWRK .AND. .NOT.LQUERY ) THEN
369 CALL XERBLA( 'SGELSD', -INFO )
371 ELSE IF( LQUERY ) THEN
375 * Quick return if possible.
377 IF( M.EQ.0 .OR. N.EQ.0 ) THEN
382 * Get machine parameters.
385 SFMIN = SLAMCH( 'S' )
387 BIGNUM = ONE / SMLNUM
388 CALL SLABAD( SMLNUM, BIGNUM )
390 * Scale A if max entry outside range [SMLNUM,BIGNUM].
392 ANRM = SLANGE( 'M', M, N, A, LDA, WORK )
394 IF( ANRM.GT.ZERO .AND. ANRM.LT.SMLNUM ) THEN
396 * Scale matrix norm up to SMLNUM.
398 CALL SLASCL( 'G', 0, 0, ANRM, SMLNUM, M, N, A, LDA, INFO )
400 ELSE IF( ANRM.GT.BIGNUM ) THEN
402 * Scale matrix norm down to BIGNUM.
404 CALL SLASCL( 'G', 0, 0, ANRM, BIGNUM, M, N, A, LDA, INFO )
406 ELSE IF( ANRM.EQ.ZERO ) THEN
408 * Matrix all zero. Return zero solution.
410 CALL SLASET( 'F', MAX( M, N ), NRHS, ZERO, ZERO, B, LDB )
411 CALL SLASET( 'F', MINMN, 1, ZERO, ZERO, S, 1 )
416 * Scale B if max entry outside range [SMLNUM,BIGNUM].
418 BNRM = SLANGE( 'M', M, NRHS, B, LDB, WORK )
420 IF( BNRM.GT.ZERO .AND. BNRM.LT.SMLNUM ) THEN
422 * Scale matrix norm up to SMLNUM.
424 CALL SLASCL( 'G', 0, 0, BNRM, SMLNUM, M, NRHS, B, LDB, INFO )
426 ELSE IF( BNRM.GT.BIGNUM ) THEN
428 * Scale matrix norm down to BIGNUM.
430 CALL SLASCL( 'G', 0, 0, BNRM, BIGNUM, M, NRHS, B, LDB, INFO )
434 * If M < N make sure certain entries of B are zero.
437 $ CALL SLASET( 'F', N-M, NRHS, ZERO, ZERO, B( M+1, 1 ), LDB )
439 * Overdetermined case.
443 * Path 1 - overdetermined or exactly determined.
446 IF( M.GE.MNTHR ) THEN
448 * Path 1a - overdetermined, with many more rows than columns.
455 * (Workspace: need 2*N, prefer N+N*NB)
457 CALL SGEQRF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ),
458 $ LWORK-NWORK+1, INFO )
460 * Multiply B by transpose(Q).
461 * (Workspace: need N+NRHS, prefer N+NRHS*NB)
463 CALL SORMQR( 'L', 'T', M, NRHS, N, A, LDA, WORK( ITAU ), B,
464 $ LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
469 CALL SLASET( 'L', N-1, N-1, ZERO, ZERO, A( 2, 1 ), LDA )
478 * Bidiagonalize R in A.
479 * (Workspace: need 3*N+MM, prefer 3*N+(MM+N)*NB)
481 CALL SGEBRD( MM, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ),
482 $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1,
485 * Multiply B by transpose of left bidiagonalizing vectors of R.
486 * (Workspace: need 3*N+NRHS, prefer 3*N+NRHS*NB)
488 CALL SORMBR( 'Q', 'L', 'T', MM, NRHS, N, A, LDA, WORK( ITAUQ ),
489 $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
491 * Solve the bidiagonal least squares problem.
493 CALL SLALSD( 'U', SMLSIZ, N, NRHS, S, WORK( IE ), B, LDB,
494 $ RCOND, RANK, WORK( NWORK ), IWORK, INFO )
499 * Multiply B by right bidiagonalizing vectors of R.
501 CALL SORMBR( 'P', 'L', 'N', N, NRHS, N, A, LDA, WORK( ITAUP ),
502 $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
504 ELSE IF( N.GE.MNTHR .AND. LWORK.GE.4*M+M*M+
505 $ MAX( M, 2*M-4, NRHS, N-3*M, WLALSD ) ) THEN
507 * Path 2a - underdetermined, with many more columns than rows
508 * and sufficient workspace for an efficient algorithm.
511 IF( LWORK.GE.MAX( 4*M+M*LDA+MAX( M, 2*M-4, NRHS, N-3*M ),
512 $ M*LDA+M+M*NRHS, 4*M+M*LDA+WLALSD ) )LDWORK = LDA
517 * (Workspace: need 2*M, prefer M+M*NB)
519 CALL SGELQF( M, N, A, LDA, WORK( ITAU ), WORK( NWORK ),
520 $ LWORK-NWORK+1, INFO )
523 * Copy L to WORK(IL), zeroing out above its diagonal.
525 CALL SLACPY( 'L', M, M, A, LDA, WORK( IL ), LDWORK )
526 CALL SLASET( 'U', M-1, M-1, ZERO, ZERO, WORK( IL+LDWORK ),
533 * Bidiagonalize L in WORK(IL).
534 * (Workspace: need M*M+5*M, prefer M*M+4*M+2*M*NB)
536 CALL SGEBRD( M, M, WORK( IL ), LDWORK, S, WORK( IE ),
537 $ WORK( ITAUQ ), WORK( ITAUP ), WORK( NWORK ),
538 $ LWORK-NWORK+1, INFO )
540 * Multiply B by transpose of left bidiagonalizing vectors of L.
541 * (Workspace: need M*M+4*M+NRHS, prefer M*M+4*M+NRHS*NB)
543 CALL SORMBR( 'Q', 'L', 'T', M, NRHS, M, WORK( IL ), LDWORK,
544 $ WORK( ITAUQ ), B, LDB, WORK( NWORK ),
545 $ LWORK-NWORK+1, INFO )
547 * Solve the bidiagonal least squares problem.
549 CALL SLALSD( 'U', SMLSIZ, M, NRHS, S, WORK( IE ), B, LDB,
550 $ RCOND, RANK, WORK( NWORK ), IWORK, INFO )
555 * Multiply B by right bidiagonalizing vectors of L.
557 CALL SORMBR( 'P', 'L', 'N', M, NRHS, M, WORK( IL ), LDWORK,
558 $ WORK( ITAUP ), B, LDB, WORK( NWORK ),
559 $ LWORK-NWORK+1, INFO )
561 * Zero out below first M rows of B.
563 CALL SLASET( 'F', N-M, NRHS, ZERO, ZERO, B( M+1, 1 ), LDB )
566 * Multiply transpose(Q) by B.
567 * (Workspace: need M+NRHS, prefer M+NRHS*NB)
569 CALL SORMLQ( 'L', 'T', N, NRHS, M, A, LDA, WORK( ITAU ), B,
570 $ LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
574 * Path 2 - remaining underdetermined cases.
582 * (Workspace: need 3*M+N, prefer 3*M+(M+N)*NB)
584 CALL SGEBRD( M, N, A, LDA, S, WORK( IE ), WORK( ITAUQ ),
585 $ WORK( ITAUP ), WORK( NWORK ), LWORK-NWORK+1,
588 * Multiply B by transpose of left bidiagonalizing vectors.
589 * (Workspace: need 3*M+NRHS, prefer 3*M+NRHS*NB)
591 CALL SORMBR( 'Q', 'L', 'T', M, NRHS, N, A, LDA, WORK( ITAUQ ),
592 $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
594 * Solve the bidiagonal least squares problem.
596 CALL SLALSD( 'L', SMLSIZ, M, NRHS, S, WORK( IE ), B, LDB,
597 $ RCOND, RANK, WORK( NWORK ), IWORK, INFO )
602 * Multiply B by right bidiagonalizing vectors of A.
604 CALL SORMBR( 'P', 'L', 'N', N, NRHS, M, A, LDA, WORK( ITAUP ),
605 $ B, LDB, WORK( NWORK ), LWORK-NWORK+1, INFO )
611 IF( IASCL.EQ.1 ) THEN
612 CALL SLASCL( 'G', 0, 0, ANRM, SMLNUM, N, NRHS, B, LDB, INFO )
613 CALL SLASCL( 'G', 0, 0, SMLNUM, ANRM, MINMN, 1, S, MINMN,
615 ELSE IF( IASCL.EQ.2 ) THEN
616 CALL SLASCL( 'G', 0, 0, ANRM, BIGNUM, N, NRHS, B, LDB, INFO )
617 CALL SLASCL( 'G', 0, 0, BIGNUM, ANRM, MINMN, 1, S, MINMN,
620 IF( IBSCL.EQ.1 ) THEN
621 CALL SLASCL( 'G', 0, 0, SMLNUM, BNRM, N, NRHS, B, LDB, INFO )
622 ELSE IF( IBSCL.EQ.2 ) THEN
623 CALL SLASCL( 'G', 0, 0, BIGNUM, BNRM, N, NRHS, B, LDB, INFO )