1 *> \brief \b SLAQR2 performs the orthogonal similarity transformation of a Hessenberg matrix to detect and deflate fully converged eigenvalues from a trailing principal submatrix (aggressive early deflation).
3 * =========== DOCUMENTATION ===========
5 * Online html documentation available at
6 * http://www.netlib.org/lapack/explore-html/
9 *> Download SLAQR2 + dependencies
10 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/slaqr2.f">
12 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/slaqr2.f">
14 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/slaqr2.f">
21 * SUBROUTINE SLAQR2( WANTT, WANTZ, N, KTOP, KBOT, NW, H, LDH, ILOZ,
22 * IHIZ, Z, LDZ, NS, ND, SR, SI, V, LDV, NH, T,
23 * LDT, NV, WV, LDWV, WORK, LWORK )
25 * .. Scalar Arguments ..
26 * INTEGER IHIZ, ILOZ, KBOT, KTOP, LDH, LDT, LDV, LDWV,
27 * $ LDZ, LWORK, N, ND, NH, NS, NV, NW
28 * LOGICAL WANTT, WANTZ
30 * .. Array Arguments ..
31 * REAL H( LDH, * ), SI( * ), SR( * ), T( LDT, * ),
32 * $ V( LDV, * ), WORK( * ), WV( LDWV, * ),
42 *> SLAQR2 is identical to SLAQR3 except that it avoids
43 *> recursion by calling SLAHQR instead of SLAQR4.
45 *> Aggressive early deflation:
47 *> This subroutine accepts as input an upper Hessenberg matrix
48 *> H and performs an orthogonal similarity transformation
49 *> designed to detect and deflate fully converged eigenvalues from
50 *> a trailing principal submatrix. On output H has been over-
51 *> written by a new Hessenberg matrix that is a perturbation of
52 *> an orthogonal similarity transformation of H. It is to be
53 *> hoped that the final version of H has many zero subdiagonal
63 *> If .TRUE., then the Hessenberg matrix H is fully updated
64 *> so that the quasi-triangular Schur factor may be
65 *> computed (in cooperation with the calling subroutine).
66 *> If .FALSE., then only enough of H is updated to preserve
73 *> If .TRUE., then the orthogonal matrix Z is updated so
74 *> so that the orthogonal Schur factor may be computed
75 *> (in cooperation with the calling subroutine).
76 *> If .FALSE., then Z is not referenced.
82 *> The order of the matrix H and (if WANTZ is .TRUE.) the
83 *> order of the orthogonal matrix Z.
89 *> It is assumed that either KTOP = 1 or H(KTOP,KTOP-1)=0.
90 *> KBOT and KTOP together determine an isolated block
91 *> along the diagonal of the Hessenberg matrix.
97 *> It is assumed without a check that either
98 *> KBOT = N or H(KBOT+1,KBOT)=0. KBOT and KTOP together
99 *> determine an isolated block along the diagonal of the
100 *> Hessenberg matrix.
106 *> Deflation window size. 1 .LE. NW .LE. (KBOT-KTOP+1).
111 *> H is REAL array, dimension (LDH,N)
112 *> On input the initial N-by-N section of H stores the
113 *> Hessenberg matrix undergoing aggressive early deflation.
114 *> On output H has been transformed by an orthogonal
115 *> similarity transformation, perturbed, and the returned
116 *> to Hessenberg form that (it is to be hoped) has some
117 *> zero subdiagonal entries.
123 *> Leading dimension of H just as declared in the calling
124 *> subroutine. N .LE. LDH
135 *> Specify the rows of Z to which transformations must be
136 *> applied if WANTZ is .TRUE.. 1 .LE. ILOZ .LE. IHIZ .LE. N.
141 *> Z is REAL array, dimension (LDZ,N)
142 *> IF WANTZ is .TRUE., then on output, the orthogonal
143 *> similarity transformation mentioned above has been
144 *> accumulated into Z(ILOZ:IHIZ,ILO:IHI) from the right.
145 *> If WANTZ is .FALSE., then Z is unreferenced.
151 *> The leading dimension of Z just as declared in the
152 *> calling subroutine. 1 .LE. LDZ.
158 *> The number of unconverged (ie approximate) eigenvalues
159 *> returned in SR and SI that may be used as shifts by the
160 *> calling subroutine.
166 *> The number of converged eigenvalues uncovered by this
172 *> SR is REAL array, dimension KBOT
177 *> SI is REAL array, dimension KBOT
178 *> On output, the real and imaginary parts of approximate
179 *> eigenvalues that may be used for shifts are stored in
180 *> SR(KBOT-ND-NS+1) through SR(KBOT-ND) and
181 *> SI(KBOT-ND-NS+1) through SI(KBOT-ND), respectively.
182 *> The real and imaginary parts of converged eigenvalues
183 *> are stored in SR(KBOT-ND+1) through SR(KBOT) and
184 *> SI(KBOT-ND+1) through SI(KBOT), respectively.
189 *> V is REAL array, dimension (LDV,NW)
190 *> An NW-by-NW work array.
195 *> LDV is integer scalar
196 *> The leading dimension of V just as declared in the
197 *> calling subroutine. NW .LE. LDV
202 *> NH is integer scalar
203 *> The number of columns of T. NH.GE.NW.
208 *> T is REAL array, dimension (LDT,NW)
214 *> The leading dimension of T just as declared in the
215 *> calling subroutine. NW .LE. LDT
221 *> The number of rows of work array WV available for
222 *> workspace. NV.GE.NW.
227 *> WV is REAL array, dimension (LDWV,NW)
233 *> The leading dimension of W just as declared in the
234 *> calling subroutine. NW .LE. LDV
239 *> WORK is REAL array, dimension LWORK.
240 *> On exit, WORK(1) is set to an estimate of the optimal value
241 *> of LWORK for the given values of N, NW, KTOP and KBOT.
247 *> The dimension of the work array WORK. LWORK = 2*NW
248 *> suffices, but greater efficiency may result from larger
251 *> If LWORK = -1, then a workspace query is assumed; SLAQR2
252 *> only estimates the optimal workspace size for the given
253 *> values of N, NW, KTOP and KBOT. The estimate is returned
254 *> in WORK(1). No error message related to LWORK is issued
255 *> by XERBLA. Neither H nor Z are accessed.
261 *> \author Univ. of Tennessee
262 *> \author Univ. of California Berkeley
263 *> \author Univ. of Colorado Denver
266 *> \date September 2012
268 *> \ingroup realOTHERauxiliary
270 *> \par Contributors:
273 *> Karen Braman and Ralph Byers, Department of Mathematics,
274 *> University of Kansas, USA
276 * =====================================================================
277 SUBROUTINE SLAQR2( WANTT, WANTZ, N, KTOP, KBOT, NW, H, LDH, ILOZ,
278 $ IHIZ, Z, LDZ, NS, ND, SR, SI, V, LDV, NH, T,
279 $ LDT, NV, WV, LDWV, WORK, LWORK )
281 * -- LAPACK auxiliary routine (version 3.4.2) --
282 * -- LAPACK is a software package provided by Univ. of Tennessee, --
283 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
286 * .. Scalar Arguments ..
287 INTEGER IHIZ, ILOZ, KBOT, KTOP, LDH, LDT, LDV, LDWV,
288 $ LDZ, LWORK, N, ND, NH, NS, NV, NW
291 * .. Array Arguments ..
292 REAL H( LDH, * ), SI( * ), SR( * ), T( LDT, * ),
293 $ V( LDV, * ), WORK( * ), WV( LDWV, * ),
297 * ================================================================
300 PARAMETER ( ZERO = 0.0e0, ONE = 1.0e0 )
302 * .. Local Scalars ..
303 REAL AA, BB, BETA, CC, CS, DD, EVI, EVK, FOO, S,
304 $ SAFMAX, SAFMIN, SMLNUM, SN, TAU, ULP
305 INTEGER I, IFST, ILST, INFO, INFQR, J, JW, K, KCOL,
306 $ KEND, KLN, KROW, KWTOP, LTOP, LWK1, LWK2,
308 LOGICAL BULGE, SORTED
310 * .. External Functions ..
314 * .. External Subroutines ..
315 EXTERNAL SCOPY, SGEHRD, SGEMM, SLABAD, SLACPY, SLAHQR,
316 $ SLANV2, SLARF, SLARFG, SLASET, SORMHR, STREXC
318 * .. Intrinsic Functions ..
319 INTRINSIC ABS, INT, MAX, MIN, REAL, SQRT
321 * .. Executable Statements ..
323 * ==== Estimate optimal workspace. ====
325 JW = MIN( NW, KBOT-KTOP+1 )
330 * ==== Workspace query call to SGEHRD ====
332 CALL SGEHRD( JW, 1, JW-1, T, LDT, WORK, WORK, -1, INFO )
333 LWK1 = INT( WORK( 1 ) )
335 * ==== Workspace query call to SORMHR ====
337 CALL SORMHR( 'R', 'N', JW, JW, 1, JW-1, T, LDT, WORK, V, LDV,
339 LWK2 = INT( WORK( 1 ) )
341 * ==== Optimal workspace ====
343 LWKOPT = JW + MAX( LWK1, LWK2 )
346 * ==== Quick return in case of workspace query. ====
348 IF( LWORK.EQ.-1 ) THEN
349 WORK( 1 ) = REAL( LWKOPT )
353 * ==== Nothing to do ...
354 * ... for an empty active block ... ====
360 * ... nor for an empty deflation window. ====
364 * ==== Machine constants ====
366 SAFMIN = SLAMCH( 'SAFE MINIMUM' )
367 SAFMAX = ONE / SAFMIN
368 CALL SLABAD( SAFMIN, SAFMAX )
369 ULP = SLAMCH( 'PRECISION' )
370 SMLNUM = SAFMIN*( REAL( N ) / ULP )
372 * ==== Setup deflation window ====
374 JW = MIN( NW, KBOT-KTOP+1 )
375 KWTOP = KBOT - JW + 1
376 IF( KWTOP.EQ.KTOP ) THEN
379 S = H( KWTOP, KWTOP-1 )
382 IF( KBOT.EQ.KWTOP ) THEN
384 * ==== 1-by-1 deflation window: not much to do ====
386 SR( KWTOP ) = H( KWTOP, KWTOP )
390 IF( ABS( S ).LE.MAX( SMLNUM, ULP*ABS( H( KWTOP, KWTOP ) ) ) )
395 $ H( KWTOP, KWTOP-1 ) = ZERO
401 * ==== Convert to spike-triangular form. (In case of a
402 * . rare QR failure, this routine continues to do
403 * . aggressive early deflation using that part of
404 * . the deflation window that converged using INFQR
405 * . here and there to keep track.) ====
407 CALL SLACPY( 'U', JW, JW, H( KWTOP, KWTOP ), LDH, T, LDT )
408 CALL SCOPY( JW-1, H( KWTOP+1, KWTOP ), LDH+1, T( 2, 1 ), LDT+1 )
410 CALL SLASET( 'A', JW, JW, ZERO, ONE, V, LDV )
411 CALL SLAHQR( .true., .true., JW, 1, JW, T, LDT, SR( KWTOP ),
412 $ SI( KWTOP ), 1, JW, V, LDV, INFQR )
414 * ==== STREXC needs a clean margin near the diagonal ====
421 $ T( JW, JW-2 ) = ZERO
423 * ==== Deflation detection loop ====
428 IF( ILST.LE.NS ) THEN
432 BULGE = T( NS, NS-1 ).NE.ZERO
435 * ==== Small spike tip test for deflation ====
437 IF( .NOT.BULGE ) THEN
439 * ==== Real eigenvalue ====
441 FOO = ABS( T( NS, NS ) )
444 IF( ABS( S*V( 1, NS ) ).LE.MAX( SMLNUM, ULP*FOO ) ) THEN
446 * ==== Deflatable ====
451 * ==== Undeflatable. Move it up out of the way.
452 * . (STREXC can not fail in this case.) ====
455 CALL STREXC( 'V', JW, T, LDT, V, LDV, IFST, ILST, WORK,
461 * ==== Complex conjugate pair ====
463 FOO = ABS( T( NS, NS ) ) + SQRT( ABS( T( NS, NS-1 ) ) )*
464 $ SQRT( ABS( T( NS-1, NS ) ) )
467 IF( MAX( ABS( S*V( 1, NS ) ), ABS( S*V( 1, NS-1 ) ) ).LE.
468 $ MAX( SMLNUM, ULP*FOO ) ) THEN
470 * ==== Deflatable ====
475 * ==== Undeflatable. Move them up out of the way.
476 * . Fortunately, STREXC does the right thing with
477 * . ILST in case of a rare exchange failure. ====
480 CALL STREXC( 'V', JW, T, LDT, V, LDV, IFST, ILST, WORK,
486 * ==== End deflation detection loop ====
491 * ==== Return to Hessenberg form ====
498 * ==== sorting diagonal blocks of T improves accuracy for
499 * . graded matrices. Bubble sort deals well with
500 * . exchange failures. ====
513 ELSE IF( T( I+1, I ).EQ.ZERO ) THEN
521 EVI = ABS( T( I, I ) )
523 EVI = ABS( T( I, I ) ) + SQRT( ABS( T( I+1, I ) ) )*
524 $ SQRT( ABS( T( I, I+1 ) ) )
528 EVK = ABS( T( K, K ) )
529 ELSE IF( T( K+1, K ).EQ.ZERO ) THEN
530 EVK = ABS( T( K, K ) )
532 EVK = ABS( T( K, K ) ) + SQRT( ABS( T( K+1, K ) ) )*
533 $ SQRT( ABS( T( K, K+1 ) ) )
536 IF( EVI.GE.EVK ) THEN
542 CALL STREXC( 'V', JW, T, LDT, V, LDV, IFST, ILST, WORK,
552 ELSE IF( T( I+1, I ).EQ.ZERO ) THEN
563 * ==== Restore shift/eigenvalue array from T ====
567 IF( I.GE.INFQR+1 ) THEN
568 IF( I.EQ.INFQR+1 ) THEN
569 SR( KWTOP+I-1 ) = T( I, I )
570 SI( KWTOP+I-1 ) = ZERO
572 ELSE IF( T( I, I-1 ).EQ.ZERO ) THEN
573 SR( KWTOP+I-1 ) = T( I, I )
574 SI( KWTOP+I-1 ) = ZERO
581 CALL SLANV2( AA, BB, CC, DD, SR( KWTOP+I-2 ),
582 $ SI( KWTOP+I-2 ), SR( KWTOP+I-1 ),
583 $ SI( KWTOP+I-1 ), CS, SN )
589 IF( NS.LT.JW .OR. S.EQ.ZERO ) THEN
590 IF( NS.GT.1 .AND. S.NE.ZERO ) THEN
592 * ==== Reflect spike back into lower triangle ====
594 CALL SCOPY( NS, V, LDV, WORK, 1 )
596 CALL SLARFG( NS, BETA, WORK( 2 ), 1, TAU )
599 CALL SLASET( 'L', JW-2, JW-2, ZERO, ZERO, T( 3, 1 ), LDT )
601 CALL SLARF( 'L', NS, JW, WORK, 1, TAU, T, LDT,
603 CALL SLARF( 'R', NS, NS, WORK, 1, TAU, T, LDT,
605 CALL SLARF( 'R', JW, NS, WORK, 1, TAU, V, LDV,
608 CALL SGEHRD( JW, 1, NS, T, LDT, WORK, WORK( JW+1 ),
612 * ==== Copy updated reduced window into place ====
615 $ H( KWTOP, KWTOP-1 ) = S*V( 1, 1 )
616 CALL SLACPY( 'U', JW, JW, T, LDT, H( KWTOP, KWTOP ), LDH )
617 CALL SCOPY( JW-1, T( 2, 1 ), LDT+1, H( KWTOP+1, KWTOP ),
620 * ==== Accumulate orthogonal matrix in order update
621 * . H and Z, if requested. ====
623 IF( NS.GT.1 .AND. S.NE.ZERO )
624 $ CALL SORMHR( 'R', 'N', JW, NS, 1, NS, T, LDT, WORK, V, LDV,
625 $ WORK( JW+1 ), LWORK-JW, INFO )
627 * ==== Update vertical slab in H ====
634 DO 70 KROW = LTOP, KWTOP - 1, NV
635 KLN = MIN( NV, KWTOP-KROW )
636 CALL SGEMM( 'N', 'N', KLN, JW, JW, ONE, H( KROW, KWTOP ),
637 $ LDH, V, LDV, ZERO, WV, LDWV )
638 CALL SLACPY( 'A', KLN, JW, WV, LDWV, H( KROW, KWTOP ), LDH )
641 * ==== Update horizontal slab in H ====
644 DO 80 KCOL = KBOT + 1, N, NH
645 KLN = MIN( NH, N-KCOL+1 )
646 CALL SGEMM( 'C', 'N', JW, KLN, JW, ONE, V, LDV,
647 $ H( KWTOP, KCOL ), LDH, ZERO, T, LDT )
648 CALL SLACPY( 'A', JW, KLN, T, LDT, H( KWTOP, KCOL ),
653 * ==== Update vertical slab in Z ====
656 DO 90 KROW = ILOZ, IHIZ, NV
657 KLN = MIN( NV, IHIZ-KROW+1 )
658 CALL SGEMM( 'N', 'N', KLN, JW, JW, ONE, Z( KROW, KWTOP ),
659 $ LDZ, V, LDV, ZERO, WV, LDWV )
660 CALL SLACPY( 'A', KLN, JW, WV, LDWV, Z( KROW, KWTOP ),
666 * ==== Return the number of deflations ... ====
670 * ==== ... and the number of shifts. (Subtracting
671 * . INFQR from the spike length takes care
672 * . of the case of a rare QR failure while
673 * . calculating eigenvalues of the deflation
678 * ==== Return optimal workspace. ====
680 WORK( 1 ) = REAL( LWKOPT )
682 * ==== End of SLAQR2 ====