1 *> \brief \b ZGSVJ1 pre-processor for the routine zgesvj, applies Jacobi rotations targeting only particular pivots.
3 * =========== DOCUMENTATION ===========
5 * Online html documentation available at
6 * http://www.netlib.org/lapack/explore-html/
9 *> Download ZGSVJ1 + dependencies
10 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/dgsvj1.f">
12 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/dgsvj1.f">
14 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/dgsvj1.f">
21 * SUBROUTINE ZGSVJ1( JOBV, M, N, N1, A, LDA, D, SVA, MV, V, LDV,
22 * EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO )
24 * .. Scalar Arguments ..
25 * DOUBLE PRECISION EPS, SFMIN, TOL
26 * INTEGER INFO, LDA, LDV, LWORK, M, MV, N, N1, NSWEEP
29 * .. Array Arguments ..
30 * COMPLEX*16 A( LDA, * ), D( N ), V( LDV, * ), WORK( LWORK )
31 * DOUBLE PRECISION SVA( N )
40 *> ZGSVJ1 is called from ZGESVJ as a pre-processor and that is its main
41 *> purpose. It applies Jacobi rotations in the same way as ZGESVJ does, but
42 *> it targets only particular pivots and it does not check convergence
43 *> (stopping criterion). Few tunning parameters (marked by [TP]) are
44 *> available for the implementer.
48 *> ZGSVJ1 applies few sweeps of Jacobi rotations in the column space of
49 *> the input M-by-N matrix A. The pivot pairs are taken from the (1,2)
50 *> off-diagonal block in the corresponding N-by-N Gram matrix A^T * A. The
51 *> block-entries (tiles) of the (1,2) off-diagonal block are marked by the
52 *> [x]'s in the following scheme:
54 *> | * * * [x] [x] [x]|
55 *> | * * * [x] [x] [x]| Row-cycling in the nblr-by-nblc [x] blocks.
56 *> | * * * [x] [x] [x]| Row-cyclic pivoting inside each [x] block.
57 *> |[x] [x] [x] * * * |
58 *> |[x] [x] [x] * * * |
59 *> |[x] [x] [x] * * * |
61 *> In terms of the columns of A, the first N1 columns are rotated 'against'
62 *> the remaining N-N1 columns, trying to increase the angle between the
63 *> corresponding subspaces. The off-diagonal block is N1-by(N-N1) and it is
64 *> tiled using quadratic tiles of side KBL. Here, KBL is a tunning parmeter.
65 *> The number of sweeps is given in NSWEEP and the orthogonality threshold
74 *> JOBV is CHARACTER*1
75 *> Specifies whether the output from this procedure is used
76 *> to compute the matrix V:
77 *> = 'V': the product of the Jacobi rotations is accumulated
78 *> by postmulyiplying the N-by-N array V.
79 *> (See the description of V.)
80 *> = 'A': the product of the Jacobi rotations is accumulated
81 *> by postmulyiplying the MV-by-N array V.
82 *> (See the descriptions of MV and V.)
83 *> = 'N': the Jacobi rotations are not accumulated.
89 *> The number of rows of the input matrix A. M >= 0.
95 *> The number of columns of the input matrix A.
102 *> N1 specifies the 2 x 2 block partition, the first N1 columns are
103 *> rotated 'against' the remaining N-N1 columns of A.
108 *> A is COMPLEX*16 array, dimension (LDA,N)
109 *> On entry, M-by-N matrix A, such that A*diag(D) represents
112 *> A_onexit * D_onexit represents the input matrix A*diag(D)
113 *> post-multiplied by a sequence of Jacobi rotations, where the
114 *> rotation threshold and the total number of sweeps are given in
115 *> TOL and NSWEEP, respectively.
116 *> (See the descriptions of N1, D, TOL and NSWEEP.)
122 *> The leading dimension of the array A. LDA >= max(1,M).
127 *> D is COMPLEX*16 array, dimension (N)
128 *> The array D accumulates the scaling factors from the fast scaled
130 *> On entry, A*diag(D) represents the input matrix.
131 *> On exit, A_onexit*diag(D_onexit) represents the input matrix
132 *> post-multiplied by a sequence of Jacobi rotations, where the
133 *> rotation threshold and the total number of sweeps are given in
134 *> TOL and NSWEEP, respectively.
135 *> (See the descriptions of N1, A, TOL and NSWEEP.)
138 *> \param[in,out] SVA
140 *> SVA is DOUBLE PRECISION array, dimension (N)
141 *> On entry, SVA contains the Euclidean norms of the columns of
142 *> the matrix A*diag(D).
143 *> On exit, SVA contains the Euclidean norms of the columns of
144 *> the matrix onexit*diag(D_onexit).
150 *> If JOBV .EQ. 'A', then MV rows of V are post-multipled by a
151 *> sequence of Jacobi rotations.
152 *> If JOBV = 'N', then MV is not referenced.
157 *> V is COMPLEX*16 array, dimension (LDV,N)
158 *> If JOBV .EQ. 'V' then N rows of V are post-multipled by a
159 *> sequence of Jacobi rotations.
160 *> If JOBV .EQ. 'A' then MV rows of V are post-multipled by a
161 *> sequence of Jacobi rotations.
162 *> If JOBV = 'N', then V is not referenced.
168 *> The leading dimension of the array V, LDV >= 1.
169 *> If JOBV = 'V', LDV .GE. N.
170 *> If JOBV = 'A', LDV .GE. MV.
175 *> EPS is DOUBLE PRECISION
176 *> EPS = DLAMCH('Epsilon')
181 *> SFMIN is DOUBLE PRECISION
182 *> SFMIN = DLAMCH('Safe Minimum')
187 *> TOL is DOUBLE PRECISION
188 *> TOL is the threshold for Jacobi rotations. For a pair
189 *> A(:,p), A(:,q) of pivot columns, the Jacobi rotation is
190 *> applied only if ABS(COS(angle(A(:,p),A(:,q)))) .GT. TOL.
196 *> NSWEEP is the number of sweeps of Jacobi rotations to be
202 *> WORK is COMPLEX*16 array, dimension (LWORK)
208 *> LWORK is the dimension of WORK. LWORK .GE. M.
214 *> = 0 : successful exit.
215 *> < 0 : if INFO = -i, then the i-th argument had an illegal value
221 *> \author Univ. of Tennessee
222 *> \author Univ. of California Berkeley
223 *> \author Univ. of Colorado Denver
228 *> \ingroup complex16OTHERcomputational
230 *> \par Contributors:
233 *> Zlatko Drmac (Zagreb, Croatia) and Kresimir Veselic (Hagen, Germany)
235 * =====================================================================
236 SUBROUTINE ZGSVJ1( JOBV, M, N, N1, A, LDA, D, SVA, MV, V, LDV,
237 $ EPS, SFMIN, TOL, NSWEEP, WORK, LWORK, INFO )
239 * -- LAPACK computational routine (version 3.6.1) --
240 * -- LAPACK is a software package provided by Univ. of Tennessee, --
241 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
245 * .. Scalar Arguments ..
246 DOUBLE PRECISION EPS, SFMIN, TOL
247 INTEGER INFO, LDA, LDV, LWORK, M, MV, N, N1, NSWEEP
250 * .. Array Arguments ..
251 COMPLEX*16 A( LDA, * ), D( N ), V( LDV, * ), WORK( LWORK )
252 DOUBLE PRECISION SVA( N )
255 * =====================================================================
257 * .. Local Parameters ..
258 DOUBLE PRECISION ZERO, HALF, ONE
259 PARAMETER ( ZERO = 0.0D0, HALF = 0.5D0, ONE = 1.0D0)
261 * .. Local Scalars ..
262 COMPLEX*16 AAPQ, OMPQ
263 DOUBLE PRECISION AAPP, AAPP0, AAPQ1, AAQQ, APOAQ, AQOAP, BIG,
264 $ BIGTHETA, CS, LARGE, MXAAPQ, MXSINJ, ROOTBIG,
265 $ ROOTEPS, ROOTSFMIN, ROOTTOL, SMALL, SN, T,
266 $ TEMP1, THETA, THSIGN
267 INTEGER BLSKIP, EMPTSW, i, ibr, igl, IERR, IJBLSK,
268 $ ISWROT, jbc, jgl, KBL, MVL, NOTROT, nblc, nblr,
269 $ p, PSKIPPED, q, ROWSKIP, SWBAND
270 LOGICAL APPLV, ROTOK, RSVEC
273 * .. Intrinsic Functions ..
274 INTRINSIC ABS, DCONJG, DMAX1, DBLE, MIN0, DSIGN, DSQRT
276 * .. External Functions ..
277 DOUBLE PRECISION DZNRM2
281 EXTERNAL IDAMAX, LSAME, ZDOTC, DZNRM2
283 * .. External Subroutines ..
285 EXTERNAL ZCOPY, ZROT, ZSWAP
287 EXTERNAL ZLASCL, ZLASSQ, XERBLA
289 * .. Executable Statements ..
291 * Test the input parameters.
293 APPLV = LSAME( JOBV, 'A' )
294 RSVEC = LSAME( JOBV, 'V' )
295 IF( .NOT.( RSVEC .OR. APPLV .OR. LSAME( JOBV, 'N' ) ) ) THEN
297 ELSE IF( M.LT.0 ) THEN
299 ELSE IF( ( N.LT.0 ) .OR. ( N.GT.M ) ) THEN
301 ELSE IF( N1.LT.0 ) THEN
303 ELSE IF( LDA.LT.M ) THEN
305 ELSE IF( ( RSVEC.OR.APPLV ) .AND. ( MV.LT.0 ) ) THEN
307 ELSE IF( ( RSVEC.AND.( LDV.LT.N ) ).OR.
308 $ ( APPLV.AND.( LDV.LT.MV ) ) ) THEN
310 ELSE IF( TOL.LE.EPS ) THEN
312 ELSE IF( NSWEEP.LT.0 ) THEN
314 ELSE IF( LWORK.LT.M ) THEN
322 CALL XERBLA( 'ZGSVJ1', -INFO )
328 ELSE IF( APPLV ) THEN
331 RSVEC = RSVEC .OR. APPLV
333 ROOTEPS = DSQRT( EPS )
334 ROOTSFMIN = DSQRT( SFMIN )
337 ROOTBIG = ONE / ROOTSFMIN
338 LARGE = BIG / DSQRT( DBLE( M*N ) )
339 BIGTHETA = ONE / ROOTEPS
340 ROOTTOL = DSQRT( TOL )
342 * .. Initialize the right singular vector matrix ..
344 * RSVEC = LSAME( JOBV, 'Y' )
349 * .. Row-cyclic pivot strategy with de Rijk's pivoting ..
353 IF( ( NBLR*KBL ).NE.N1 )NBLR = NBLR + 1
355 * .. the tiling is nblr-by-nblc [tiles]
357 NBLC = ( N-N1 ) / KBL
358 IF( ( NBLC*KBL ).NE.( N-N1 ) )NBLC = NBLC + 1
359 BLSKIP = ( KBL**2 ) + 1
360 *[TP] BLKSKIP is a tuning parameter that depends on SWBAND and KBL.
362 ROWSKIP = MIN0( 5, KBL )
363 *[TP] ROWSKIP is a tuning parameter.
365 *[TP] SWBAND is a tuning parameter. It is meaningful and effective
366 * if ZGESVJ is used as a computational routine in the preconditioned
367 * Jacobi SVD algorithm ZGEJSV.
370 * | * * * [x] [x] [x]|
371 * | * * * [x] [x] [x]| Row-cycling in the nblr-by-nblc [x] blocks.
372 * | * * * [x] [x] [x]| Row-cyclic pivoting inside each [x] block.
373 * |[x] [x] [x] * * * |
374 * |[x] [x] [x] * * * |
375 * |[x] [x] [x] * * * |
378 DO 1993 i = 1, NSWEEP
389 * Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs
390 * 1 <= p < q <= N. This is the first step toward a blocked implementation
391 * of the rotations. New implementation, based on block transformations,
392 * is under development.
394 DO 2000 ibr = 1, NBLR
396 igl = ( ibr-1 )*KBL + 1
400 * ... go to the off diagonal blocks
402 igl = ( ibr-1 )*KBL + 1
404 * DO 2010 jbc = ibr + 1, NBL
405 DO 2010 jbc = 1, NBLC
407 jgl = ( jbc-1 )*KBL + N1 + 1
409 * doing the block at ( ibr, jbc )
412 DO 2100 p = igl, MIN0( igl+KBL-1, N1 )
415 IF( AAPP.GT.ZERO ) THEN
419 DO 2200 q = jgl, MIN0( jgl+KBL-1, N )
422 IF( AAQQ.GT.ZERO ) THEN
425 * .. M x 2 Jacobi SVD ..
427 * Safe Gram matrix computation
429 IF( AAQQ.GE.ONE ) THEN
430 IF( AAPP.GE.AAQQ ) THEN
431 ROTOK = ( SMALL*AAPP ).LE.AAQQ
433 ROTOK = ( SMALL*AAQQ ).LE.AAPP
435 IF( AAPP.LT.( BIG / AAQQ ) ) THEN
436 AAPQ = ( ZDOTC( M, A( 1, p ), 1,
437 $ A( 1, q ), 1 ) / AAQQ ) / AAPP
439 CALL ZCOPY( M, A( 1, p ), 1,
441 CALL ZLASCL( 'G', 0, 0, AAPP,
444 AAPQ = ZDOTC( M, WORK, 1,
445 $ A( 1, q ), 1 ) / AAQQ
448 IF( AAPP.GE.AAQQ ) THEN
449 ROTOK = AAPP.LE.( AAQQ / SMALL )
451 ROTOK = AAQQ.LE.( AAPP / SMALL )
453 IF( AAPP.GT.( SMALL / AAQQ ) ) THEN
454 AAPQ = ( ZDOTC( M, A( 1, p ), 1,
455 $ A( 1, q ), 1 ) / AAQQ ) / AAPP
457 CALL ZCOPY( M, A( 1, q ), 1,
459 CALL ZLASCL( 'G', 0, 0, AAQQ,
462 AAPQ = ZDOTC( M, A( 1, p ), 1,
467 OMPQ = AAPQ / ABS(AAPQ)
468 * AAPQ = AAPQ * DCONJG(CWORK(p))*CWORK(q)
470 MXAAPQ = DMAX1( MXAAPQ, -AAPQ1 )
472 * TO rotate or NOT to rotate, THAT is the question ...
474 IF( ABS( AAPQ1 ).GT.TOL ) THEN
476 *[RTD] ROTATED = ROTATED + 1
484 THETA = -HALF*ABS( AQOAP-APOAQ )/ AAPQ1
485 IF( AAQQ.GT.AAPP0 )THETA = -THETA
487 IF( ABS( THETA ).GT.BIGTHETA ) THEN
490 CALL ZROT( M, A(1,p), 1, A(1,q), 1,
491 $ CS, DCONJG(OMPQ)*T )
493 CALL ZROT( MVL, V(1,p), 1,
494 $ V(1,q), 1, CS, DCONJG(OMPQ)*T )
496 SVA( q ) = AAQQ*DSQRT( DMAX1( ZERO,
497 $ ONE+T*APOAQ*AAPQ1 ) )
498 AAPP = AAPP*DSQRT( DMAX1( ZERO,
499 $ ONE-T*AQOAP*AAPQ1 ) )
500 MXSINJ = DMAX1( MXSINJ, ABS( T ) )
503 * .. choose correct signum for THETA and rotate
505 THSIGN = -DSIGN( ONE, AAPQ1 )
506 IF( AAQQ.GT.AAPP0 )THSIGN = -THSIGN
507 T = ONE / ( THETA+THSIGN*
508 $ DSQRT( ONE+THETA*THETA ) )
509 CS = DSQRT( ONE / ( ONE+T*T ) )
511 MXSINJ = DMAX1( MXSINJ, ABS( SN ) )
512 SVA( q ) = AAQQ*DSQRT( DMAX1( ZERO,
513 $ ONE+T*APOAQ*AAPQ1 ) )
514 AAPP = AAPP*DSQRT( DMAX1( ZERO,
515 $ ONE-T*AQOAP*AAPQ1 ) )
517 CALL ZROT( M, A(1,p), 1, A(1,q), 1,
518 $ CS, DCONJG(OMPQ)*SN )
520 CALL ZROT( MVL, V(1,p), 1,
521 $ V(1,q), 1, CS, DCONJG(OMPQ)*SN )
527 * .. have to use modified Gram-Schmidt like transformation
528 IF( AAPP.GT.AAQQ ) THEN
529 CALL ZCOPY( M, A( 1, p ), 1,
531 CALL ZLASCL( 'G', 0, 0, AAPP, ONE,
534 CALL ZLASCL( 'G', 0, 0, AAQQ, ONE,
535 $ M, 1, A( 1, q ), LDA,
537 CALL ZAXPY( M, -AAPQ, WORK,
539 CALL ZLASCL( 'G', 0, 0, ONE, AAQQ,
540 $ M, 1, A( 1, q ), LDA,
542 SVA( q ) = AAQQ*DSQRT( DMAX1( ZERO,
543 $ ONE-AAPQ1*AAPQ1 ) )
544 MXSINJ = DMAX1( MXSINJ, SFMIN )
546 CALL ZCOPY( M, A( 1, q ), 1,
548 CALL ZLASCL( 'G', 0, 0, AAQQ, ONE,
551 CALL ZLASCL( 'G', 0, 0, AAPP, ONE,
552 $ M, 1, A( 1, p ), LDA,
554 CALL ZAXPY( M, -DCONJG(AAPQ),
555 $ WORK, 1, A( 1, p ), 1 )
556 CALL ZLASCL( 'G', 0, 0, ONE, AAPP,
557 $ M, 1, A( 1, p ), LDA,
559 SVA( p ) = AAPP*DSQRT( DMAX1( ZERO,
560 $ ONE-AAPQ1*AAPQ1 ) )
561 MXSINJ = DMAX1( MXSINJ, SFMIN )
564 * END IF ROTOK THEN ... ELSE
566 * In the case of cancellation in updating SVA(q), SVA(p)
567 * .. recompute SVA(q), SVA(p)
568 IF( ( SVA( q ) / AAQQ )**2.LE.ROOTEPS )
570 IF( ( AAQQ.LT.ROOTBIG ) .AND.
571 $ ( AAQQ.GT.ROOTSFMIN ) ) THEN
572 SVA( q ) = DZNRM2( M, A( 1, q ), 1)
576 CALL ZLASSQ( M, A( 1, q ), 1, T,
578 SVA( q ) = T*DSQRT( AAQQ )
581 IF( ( AAPP / AAPP0 )**2.LE.ROOTEPS ) THEN
582 IF( ( AAPP.LT.ROOTBIG ) .AND.
583 $ ( AAPP.GT.ROOTSFMIN ) ) THEN
584 AAPP = DZNRM2( M, A( 1, p ), 1 )
588 CALL ZLASSQ( M, A( 1, p ), 1, T,
590 AAPP = T*DSQRT( AAPP )
597 *[RTD] SKIPPED = SKIPPED + 1
598 PSKIPPED = PSKIPPED + 1
603 PSKIPPED = PSKIPPED + 1
607 IF( ( i.LE.SWBAND ) .AND. ( IJBLSK.GE.BLSKIP ) )
613 IF( ( i.LE.SWBAND ) .AND.
614 $ ( PSKIPPED.GT.ROWSKIP ) ) THEN
628 IF( AAPP.EQ.ZERO )NOTROT = NOTROT +
629 $ MIN0( jgl+KBL-1, N ) - jgl + 1
630 IF( AAPP.LT.ZERO )NOTROT = 0
637 * end of the jbc-loop
639 *2011 bailed out of the jbc-loop
640 DO 2012 p = igl, MIN0( igl+KBL-1, N )
641 SVA( p ) = ABS( SVA( p ) )
645 *2000 :: end of the ibr-loop
648 IF( ( SVA( N ).LT.ROOTBIG ) .AND. ( SVA( N ).GT.ROOTSFMIN ) )
650 SVA( N ) = DZNRM2( M, A( 1, N ), 1 )
654 CALL ZLASSQ( M, A( 1, N ), 1, T, AAPP )
655 SVA( N ) = T*DSQRT( AAPP )
658 * Additional steering devices
660 IF( ( i.LT.SWBAND ) .AND. ( ( MXAAPQ.LE.ROOTTOL ) .OR.
661 $ ( ISWROT.LE.N ) ) )SWBAND = i
663 IF( ( i.GT.SWBAND+1 ) .AND. ( MXAAPQ.LT.DSQRT( DBLE( N ) )*
664 $ TOL ) .AND. ( DBLE( N )*MXAAPQ*MXSINJ.LT.TOL ) ) THEN
668 IF( NOTROT.GE.EMPTSW )GO TO 1994
671 * end i=1:NSWEEP loop
673 * #:( Reaching this point means that the procedure has not converged.
678 * #:) Reaching this point means numerical convergence after the i-th
682 * #:) INFO = 0 confirms successful iterations.
685 * Sort the vector SVA() of column norms.
687 q = IDAMAX( N-p+1, SVA( p ), 1 ) + p - 1
695 CALL ZSWAP( M, A( 1, p ), 1, A( 1, q ), 1 )
696 IF( RSVEC )CALL ZSWAP( MVL, V( 1, p ), 1, V( 1, q ), 1 )