1 *> \brief \b DLAQR2 performs the orthogonal similarity transformation of a Hessenberg matrix to detect and deflate fully converged eigenvalues from a trailing principal submatrix (aggressive early deflation).
3 * =========== DOCUMENTATION ===========
5 * Online html documentation available at
6 * http://www.netlib.org/lapack/explore-html/
9 *> Download DLAQR2 + dependencies
10 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/dlaqr2.f">
12 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/dlaqr2.f">
14 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/dlaqr2.f">
21 * SUBROUTINE DLAQR2( WANTT, WANTZ, N, KTOP, KBOT, NW, H, LDH, ILOZ,
22 * IHIZ, Z, LDZ, NS, ND, SR, SI, V, LDV, NH, T,
23 * LDT, NV, WV, LDWV, WORK, LWORK )
25 * .. Scalar Arguments ..
26 * INTEGER IHIZ, ILOZ, KBOT, KTOP, LDH, LDT, LDV, LDWV,
27 * $ LDZ, LWORK, N, ND, NH, NS, NV, NW
28 * LOGICAL WANTT, WANTZ
30 * .. Array Arguments ..
31 * DOUBLE PRECISION H( LDH, * ), SI( * ), SR( * ), T( LDT, * ),
32 * $ V( LDV, * ), WORK( * ), WV( LDWV, * ),
42 *> DLAQR2 is identical to DLAQR3 except that it avoids
43 *> recursion by calling DLAHQR instead of DLAQR4.
45 *> Aggressive early deflation:
47 *> This subroutine accepts as input an upper Hessenberg matrix
48 *> H and performs an orthogonal similarity transformation
49 *> designed to detect and deflate fully converged eigenvalues from
50 *> a trailing principal submatrix. On output H has been over-
51 *> written by a new Hessenberg matrix that is a perturbation of
52 *> an orthogonal similarity transformation of H. It is to be
53 *> hoped that the final version of H has many zero subdiagonal
63 *> If .TRUE., then the Hessenberg matrix H is fully updated
64 *> so that the quasi-triangular Schur factor may be
65 *> computed (in cooperation with the calling subroutine).
66 *> If .FALSE., then only enough of H is updated to preserve
73 *> If .TRUE., then the orthogonal matrix Z is updated so
74 *> so that the orthogonal Schur factor may be computed
75 *> (in cooperation with the calling subroutine).
76 *> If .FALSE., then Z is not referenced.
82 *> The order of the matrix H and (if WANTZ is .TRUE.) the
83 *> order of the orthogonal matrix Z.
89 *> It is assumed that either KTOP = 1 or H(KTOP,KTOP-1)=0.
90 *> KBOT and KTOP together determine an isolated block
91 *> along the diagonal of the Hessenberg matrix.
97 *> It is assumed without a check that either
98 *> KBOT = N or H(KBOT+1,KBOT)=0. KBOT and KTOP together
99 *> determine an isolated block along the diagonal of the
100 *> Hessenberg matrix.
106 *> Deflation window size. 1 .LE. NW .LE. (KBOT-KTOP+1).
111 *> H is DOUBLE PRECISION array, dimension (LDH,N)
112 *> On input the initial N-by-N section of H stores the
113 *> Hessenberg matrix undergoing aggressive early deflation.
114 *> On output H has been transformed by an orthogonal
115 *> similarity transformation, perturbed, and the returned
116 *> to Hessenberg form that (it is to be hoped) has some
117 *> zero subdiagonal entries.
123 *> Leading dimension of H just as declared in the calling
124 *> subroutine. N .LE. LDH
135 *> Specify the rows of Z to which transformations must be
136 *> applied if WANTZ is .TRUE.. 1 .LE. ILOZ .LE. IHIZ .LE. N.
141 *> Z is DOUBLE PRECISION array, dimension (LDZ,N)
142 *> IF WANTZ is .TRUE., then on output, the orthogonal
143 *> similarity transformation mentioned above has been
144 *> accumulated into Z(ILOZ:IHIZ,ILO:IHI) from the right.
145 *> If WANTZ is .FALSE., then Z is unreferenced.
151 *> The leading dimension of Z just as declared in the
152 *> calling subroutine. 1 .LE. LDZ.
158 *> The number of unconverged (ie approximate) eigenvalues
159 *> returned in SR and SI that may be used as shifts by the
160 *> calling subroutine.
166 *> The number of converged eigenvalues uncovered by this
172 *> SR is DOUBLE PRECISION array, dimension (KBOT)
177 *> SI is DOUBLE PRECISION array, dimension (KBOT)
178 *> On output, the real and imaginary parts of approximate
179 *> eigenvalues that may be used for shifts are stored in
180 *> SR(KBOT-ND-NS+1) through SR(KBOT-ND) and
181 *> SI(KBOT-ND-NS+1) through SI(KBOT-ND), respectively.
182 *> The real and imaginary parts of converged eigenvalues
183 *> are stored in SR(KBOT-ND+1) through SR(KBOT) and
184 *> SI(KBOT-ND+1) through SI(KBOT), respectively.
189 *> V is DOUBLE PRECISION array, dimension (LDV,NW)
190 *> An NW-by-NW work array.
195 *> LDV is integer scalar
196 *> The leading dimension of V just as declared in the
197 *> calling subroutine. NW .LE. LDV
202 *> NH is integer scalar
203 *> The number of columns of T. NH.GE.NW.
208 *> T is DOUBLE PRECISION array, dimension (LDT,NW)
214 *> The leading dimension of T just as declared in the
215 *> calling subroutine. NW .LE. LDT
221 *> The number of rows of work array WV available for
222 *> workspace. NV.GE.NW.
227 *> WV is DOUBLE PRECISION array, dimension (LDWV,NW)
233 *> The leading dimension of W just as declared in the
234 *> calling subroutine. NW .LE. LDV
239 *> WORK is DOUBLE PRECISION array, dimension (LWORK)
240 *> On exit, WORK(1) is set to an estimate of the optimal value
241 *> of LWORK for the given values of N, NW, KTOP and KBOT.
247 *> The dimension of the work array WORK. LWORK = 2*NW
248 *> suffices, but greater efficiency may result from larger
251 *> If LWORK = -1, then a workspace query is assumed; DLAQR2
252 *> only estimates the optimal workspace size for the given
253 *> values of N, NW, KTOP and KBOT. The estimate is returned
254 *> in WORK(1). No error message related to LWORK is issued
255 *> by XERBLA. Neither H nor Z are accessed.
261 *> \author Univ. of Tennessee
262 *> \author Univ. of California Berkeley
263 *> \author Univ. of Colorado Denver
266 *> \date September 2012
268 *> \ingroup doubleOTHERauxiliary
270 *> \par Contributors:
273 *> Karen Braman and Ralph Byers, Department of Mathematics,
274 *> University of Kansas, USA
276 * =====================================================================
277 SUBROUTINE DLAQR2( WANTT, WANTZ, N, KTOP, KBOT, NW, H, LDH, ILOZ,
278 $ IHIZ, Z, LDZ, NS, ND, SR, SI, V, LDV, NH, T,
279 $ LDT, NV, WV, LDWV, WORK, LWORK )
281 * -- LAPACK auxiliary routine (version 3.4.2) --
282 * -- LAPACK is a software package provided by Univ. of Tennessee, --
283 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
286 * .. Scalar Arguments ..
287 INTEGER IHIZ, ILOZ, KBOT, KTOP, LDH, LDT, LDV, LDWV,
288 $ LDZ, LWORK, N, ND, NH, NS, NV, NW
291 * .. Array Arguments ..
292 DOUBLE PRECISION H( LDH, * ), SI( * ), SR( * ), T( LDT, * ),
293 $ V( LDV, * ), WORK( * ), WV( LDWV, * ),
297 * ================================================================
299 DOUBLE PRECISION ZERO, ONE
300 PARAMETER ( ZERO = 0.0d0, ONE = 1.0d0 )
302 * .. Local Scalars ..
303 DOUBLE PRECISION AA, BB, BETA, CC, CS, DD, EVI, EVK, FOO, S,
304 $ SAFMAX, SAFMIN, SMLNUM, SN, TAU, ULP
305 INTEGER I, IFST, ILST, INFO, INFQR, J, JW, K, KCOL,
306 $ KEND, KLN, KROW, KWTOP, LTOP, LWK1, LWK2,
308 LOGICAL BULGE, SORTED
310 * .. External Functions ..
311 DOUBLE PRECISION DLAMCH
314 * .. External Subroutines ..
315 EXTERNAL DCOPY, DGEHRD, DGEMM, DLABAD, DLACPY, DLAHQR,
316 $ DLANV2, DLARF, DLARFG, DLASET, DORMHR, DTREXC
318 * .. Intrinsic Functions ..
319 INTRINSIC ABS, DBLE, INT, MAX, MIN, SQRT
321 * .. Executable Statements ..
323 * ==== Estimate optimal workspace. ====
325 JW = MIN( NW, KBOT-KTOP+1 )
330 * ==== Workspace query call to DGEHRD ====
332 CALL DGEHRD( JW, 1, JW-1, T, LDT, WORK, WORK, -1, INFO )
333 LWK1 = INT( WORK( 1 ) )
335 * ==== Workspace query call to DORMHR ====
337 CALL DORMHR( 'R', 'N', JW, JW, 1, JW-1, T, LDT, WORK, V, LDV,
339 LWK2 = INT( WORK( 1 ) )
341 * ==== Optimal workspace ====
343 LWKOPT = JW + MAX( LWK1, LWK2 )
346 * ==== Quick return in case of workspace query. ====
348 IF( LWORK.EQ.-1 ) THEN
349 WORK( 1 ) = DBLE( LWKOPT )
353 * ==== Nothing to do ...
354 * ... for an empty active block ... ====
360 * ... nor for an empty deflation window. ====
364 * ==== Machine constants ====
366 SAFMIN = DLAMCH( 'SAFE MINIMUM' )
367 SAFMAX = ONE / SAFMIN
368 CALL DLABAD( SAFMIN, SAFMAX )
369 ULP = DLAMCH( 'PRECISION' )
370 SMLNUM = SAFMIN*( DBLE( N ) / ULP )
372 * ==== Setup deflation window ====
374 JW = MIN( NW, KBOT-KTOP+1 )
375 KWTOP = KBOT - JW + 1
376 IF( KWTOP.EQ.KTOP ) THEN
379 S = H( KWTOP, KWTOP-1 )
382 IF( KBOT.EQ.KWTOP ) THEN
384 * ==== 1-by-1 deflation window: not much to do ====
386 SR( KWTOP ) = H( KWTOP, KWTOP )
390 IF( ABS( S ).LE.MAX( SMLNUM, ULP*ABS( H( KWTOP, KWTOP ) ) ) )
395 $ H( KWTOP, KWTOP-1 ) = ZERO
401 * ==== Convert to spike-triangular form. (In case of a
402 * . rare QR failure, this routine continues to do
403 * . aggressive early deflation using that part of
404 * . the deflation window that converged using INFQR
405 * . here and there to keep track.) ====
407 CALL DLACPY( 'U', JW, JW, H( KWTOP, KWTOP ), LDH, T, LDT )
408 CALL DCOPY( JW-1, H( KWTOP+1, KWTOP ), LDH+1, T( 2, 1 ), LDT+1 )
410 CALL DLASET( 'A', JW, JW, ZERO, ONE, V, LDV )
411 CALL DLAHQR( .true., .true., JW, 1, JW, T, LDT, SR( KWTOP ),
412 $ SI( KWTOP ), 1, JW, V, LDV, INFQR )
414 * ==== DTREXC needs a clean margin near the diagonal ====
421 $ T( JW, JW-2 ) = ZERO
423 * ==== Deflation detection loop ====
428 IF( ILST.LE.NS ) THEN
432 BULGE = T( NS, NS-1 ).NE.ZERO
435 * ==== Small spike tip test for deflation ====
437 IF( .NOT.BULGE ) THEN
439 * ==== Real eigenvalue ====
441 FOO = ABS( T( NS, NS ) )
444 IF( ABS( S*V( 1, NS ) ).LE.MAX( SMLNUM, ULP*FOO ) ) THEN
446 * ==== Deflatable ====
451 * ==== Undeflatable. Move it up out of the way.
452 * . (DTREXC can not fail in this case.) ====
455 CALL DTREXC( 'V', JW, T, LDT, V, LDV, IFST, ILST, WORK,
461 * ==== Complex conjugate pair ====
463 FOO = ABS( T( NS, NS ) ) + SQRT( ABS( T( NS, NS-1 ) ) )*
464 $ SQRT( ABS( T( NS-1, NS ) ) )
467 IF( MAX( ABS( S*V( 1, NS ) ), ABS( S*V( 1, NS-1 ) ) ).LE.
468 $ MAX( SMLNUM, ULP*FOO ) ) THEN
470 * ==== Deflatable ====
475 * ==== Undeflatable. Move them up out of the way.
476 * . Fortunately, DTREXC does the right thing with
477 * . ILST in case of a rare exchange failure. ====
480 CALL DTREXC( 'V', JW, T, LDT, V, LDV, IFST, ILST, WORK,
486 * ==== End deflation detection loop ====
491 * ==== Return to Hessenberg form ====
498 * ==== sorting diagonal blocks of T improves accuracy for
499 * . graded matrices. Bubble sort deals well with
500 * . exchange failures. ====
513 ELSE IF( T( I+1, I ).EQ.ZERO ) THEN
521 EVI = ABS( T( I, I ) )
523 EVI = ABS( T( I, I ) ) + SQRT( ABS( T( I+1, I ) ) )*
524 $ SQRT( ABS( T( I, I+1 ) ) )
528 EVK = ABS( T( K, K ) )
529 ELSE IF( T( K+1, K ).EQ.ZERO ) THEN
530 EVK = ABS( T( K, K ) )
532 EVK = ABS( T( K, K ) ) + SQRT( ABS( T( K+1, K ) ) )*
533 $ SQRT( ABS( T( K, K+1 ) ) )
536 IF( EVI.GE.EVK ) THEN
542 CALL DTREXC( 'V', JW, T, LDT, V, LDV, IFST, ILST, WORK,
552 ELSE IF( T( I+1, I ).EQ.ZERO ) THEN
563 * ==== Restore shift/eigenvalue array from T ====
567 IF( I.GE.INFQR+1 ) THEN
568 IF( I.EQ.INFQR+1 ) THEN
569 SR( KWTOP+I-1 ) = T( I, I )
570 SI( KWTOP+I-1 ) = ZERO
572 ELSE IF( T( I, I-1 ).EQ.ZERO ) THEN
573 SR( KWTOP+I-1 ) = T( I, I )
574 SI( KWTOP+I-1 ) = ZERO
581 CALL DLANV2( AA, BB, CC, DD, SR( KWTOP+I-2 ),
582 $ SI( KWTOP+I-2 ), SR( KWTOP+I-1 ),
583 $ SI( KWTOP+I-1 ), CS, SN )
589 IF( NS.LT.JW .OR. S.EQ.ZERO ) THEN
590 IF( NS.GT.1 .AND. S.NE.ZERO ) THEN
592 * ==== Reflect spike back into lower triangle ====
594 CALL DCOPY( NS, V, LDV, WORK, 1 )
596 CALL DLARFG( NS, BETA, WORK( 2 ), 1, TAU )
599 CALL DLASET( 'L', JW-2, JW-2, ZERO, ZERO, T( 3, 1 ), LDT )
601 CALL DLARF( 'L', NS, JW, WORK, 1, TAU, T, LDT,
603 CALL DLARF( 'R', NS, NS, WORK, 1, TAU, T, LDT,
605 CALL DLARF( 'R', JW, NS, WORK, 1, TAU, V, LDV,
608 CALL DGEHRD( JW, 1, NS, T, LDT, WORK, WORK( JW+1 ),
612 * ==== Copy updated reduced window into place ====
615 $ H( KWTOP, KWTOP-1 ) = S*V( 1, 1 )
616 CALL DLACPY( 'U', JW, JW, T, LDT, H( KWTOP, KWTOP ), LDH )
617 CALL DCOPY( JW-1, T( 2, 1 ), LDT+1, H( KWTOP+1, KWTOP ),
620 * ==== Accumulate orthogonal matrix in order update
621 * . H and Z, if requested. ====
623 IF( NS.GT.1 .AND. S.NE.ZERO )
624 $ CALL DORMHR( 'R', 'N', JW, NS, 1, NS, T, LDT, WORK, V, LDV,
625 $ WORK( JW+1 ), LWORK-JW, INFO )
627 * ==== Update vertical slab in H ====
634 DO 70 KROW = LTOP, KWTOP - 1, NV
635 KLN = MIN( NV, KWTOP-KROW )
636 CALL DGEMM( 'N', 'N', KLN, JW, JW, ONE, H( KROW, KWTOP ),
637 $ LDH, V, LDV, ZERO, WV, LDWV )
638 CALL DLACPY( 'A', KLN, JW, WV, LDWV, H( KROW, KWTOP ), LDH )
641 * ==== Update horizontal slab in H ====
644 DO 80 KCOL = KBOT + 1, N, NH
645 KLN = MIN( NH, N-KCOL+1 )
646 CALL DGEMM( 'C', 'N', JW, KLN, JW, ONE, V, LDV,
647 $ H( KWTOP, KCOL ), LDH, ZERO, T, LDT )
648 CALL DLACPY( 'A', JW, KLN, T, LDT, H( KWTOP, KCOL ),
653 * ==== Update vertical slab in Z ====
656 DO 90 KROW = ILOZ, IHIZ, NV
657 KLN = MIN( NV, IHIZ-KROW+1 )
658 CALL DGEMM( 'N', 'N', KLN, JW, JW, ONE, Z( KROW, KWTOP ),
659 $ LDZ, V, LDV, ZERO, WV, LDWV )
660 CALL DLACPY( 'A', KLN, JW, WV, LDWV, Z( KROW, KWTOP ),
666 * ==== Return the number of deflations ... ====
670 * ==== ... and the number of shifts. (Subtracting
671 * . INFQR from the spike length takes care
672 * . of the case of a rare QR failure while
673 * . calculating eigenvalues of the deflation
678 * ==== Return optimal workspace. ====
680 WORK( 1 ) = DBLE( LWKOPT )
682 * ==== End of DLAQR2 ====