1 *> \brief \b DLAQR3 performs the orthogonal similarity transformation of a Hessenberg matrix to detect and deflate fully converged eigenvalues from a trailing principal submatrix (aggressive early deflation).
3 * =========== DOCUMENTATION ===========
5 * Online html documentation available at
6 * http://www.netlib.org/lapack/explore-html/
9 *> Download DLAQR3 + dependencies
10 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/dlaqr3.f">
12 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/dlaqr3.f">
14 *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/dlaqr3.f">
21 * SUBROUTINE DLAQR3( WANTT, WANTZ, N, KTOP, KBOT, NW, H, LDH, ILOZ,
22 * IHIZ, Z, LDZ, NS, ND, SR, SI, V, LDV, NH, T,
23 * LDT, NV, WV, LDWV, WORK, LWORK )
25 * .. Scalar Arguments ..
26 * INTEGER IHIZ, ILOZ, KBOT, KTOP, LDH, LDT, LDV, LDWV,
27 * $ LDZ, LWORK, N, ND, NH, NS, NV, NW
28 * LOGICAL WANTT, WANTZ
30 * .. Array Arguments ..
31 * DOUBLE PRECISION H( LDH, * ), SI( * ), SR( * ), T( LDT, * ),
32 * $ V( LDV, * ), WORK( * ), WV( LDWV, * ),
42 *> Aggressive early deflation:
44 *> DLAQR3 accepts as input an upper Hessenberg matrix
45 *> H and performs an orthogonal similarity transformation
46 *> designed to detect and deflate fully converged eigenvalues from
47 *> a trailing principal submatrix. On output H has been over-
48 *> written by a new Hessenberg matrix that is a perturbation of
49 *> an orthogonal similarity transformation of H. It is to be
50 *> hoped that the final version of H has many zero subdiagonal
60 *> If .TRUE., then the Hessenberg matrix H is fully updated
61 *> so that the quasi-triangular Schur factor may be
62 *> computed (in cooperation with the calling subroutine).
63 *> If .FALSE., then only enough of H is updated to preserve
70 *> If .TRUE., then the orthogonal matrix Z is updated so
71 *> so that the orthogonal Schur factor may be computed
72 *> (in cooperation with the calling subroutine).
73 *> If .FALSE., then Z is not referenced.
79 *> The order of the matrix H and (if WANTZ is .TRUE.) the
80 *> order of the orthogonal matrix Z.
86 *> It is assumed that either KTOP = 1 or H(KTOP,KTOP-1)=0.
87 *> KBOT and KTOP together determine an isolated block
88 *> along the diagonal of the Hessenberg matrix.
94 *> It is assumed without a check that either
95 *> KBOT = N or H(KBOT+1,KBOT)=0. KBOT and KTOP together
96 *> determine an isolated block along the diagonal of the
103 *> Deflation window size. 1 .LE. NW .LE. (KBOT-KTOP+1).
108 *> H is DOUBLE PRECISION array, dimension (LDH,N)
109 *> On input the initial N-by-N section of H stores the
110 *> Hessenberg matrix undergoing aggressive early deflation.
111 *> On output H has been transformed by an orthogonal
112 *> similarity transformation, perturbed, and the returned
113 *> to Hessenberg form that (it is to be hoped) has some
114 *> zero subdiagonal entries.
120 *> Leading dimension of H just as declared in the calling
121 *> subroutine. N .LE. LDH
132 *> Specify the rows of Z to which transformations must be
133 *> applied if WANTZ is .TRUE.. 1 .LE. ILOZ .LE. IHIZ .LE. N.
138 *> Z is DOUBLE PRECISION array, dimension (LDZ,N)
139 *> IF WANTZ is .TRUE., then on output, the orthogonal
140 *> similarity transformation mentioned above has been
141 *> accumulated into Z(ILOZ:IHIZ,ILOZ:IHIZ) from the right.
142 *> If WANTZ is .FALSE., then Z is unreferenced.
148 *> The leading dimension of Z just as declared in the
149 *> calling subroutine. 1 .LE. LDZ.
155 *> The number of unconverged (ie approximate) eigenvalues
156 *> returned in SR and SI that may be used as shifts by the
157 *> calling subroutine.
163 *> The number of converged eigenvalues uncovered by this
169 *> SR is DOUBLE PRECISION array, dimension (KBOT)
174 *> SI is DOUBLE PRECISION array, dimension (KBOT)
175 *> On output, the real and imaginary parts of approximate
176 *> eigenvalues that may be used for shifts are stored in
177 *> SR(KBOT-ND-NS+1) through SR(KBOT-ND) and
178 *> SI(KBOT-ND-NS+1) through SI(KBOT-ND), respectively.
179 *> The real and imaginary parts of converged eigenvalues
180 *> are stored in SR(KBOT-ND+1) through SR(KBOT) and
181 *> SI(KBOT-ND+1) through SI(KBOT), respectively.
186 *> V is DOUBLE PRECISION array, dimension (LDV,NW)
187 *> An NW-by-NW work array.
192 *> LDV is integer scalar
193 *> The leading dimension of V just as declared in the
194 *> calling subroutine. NW .LE. LDV
199 *> NH is integer scalar
200 *> The number of columns of T. NH.GE.NW.
205 *> T is DOUBLE PRECISION array, dimension (LDT,NW)
211 *> The leading dimension of T just as declared in the
212 *> calling subroutine. NW .LE. LDT
218 *> The number of rows of work array WV available for
219 *> workspace. NV.GE.NW.
224 *> WV is DOUBLE PRECISION array, dimension (LDWV,NW)
230 *> The leading dimension of W just as declared in the
231 *> calling subroutine. NW .LE. LDV
236 *> WORK is DOUBLE PRECISION array, dimension (LWORK)
237 *> On exit, WORK(1) is set to an estimate of the optimal value
238 *> of LWORK for the given values of N, NW, KTOP and KBOT.
244 *> The dimension of the work array WORK. LWORK = 2*NW
245 *> suffices, but greater efficiency may result from larger
248 *> If LWORK = -1, then a workspace query is assumed; DLAQR3
249 *> only estimates the optimal workspace size for the given
250 *> values of N, NW, KTOP and KBOT. The estimate is returned
251 *> in WORK(1). No error message related to LWORK is issued
252 *> by XERBLA. Neither H nor Z are accessed.
258 *> \author Univ. of Tennessee
259 *> \author Univ. of California Berkeley
260 *> \author Univ. of Colorado Denver
265 *> \ingroup doubleOTHERauxiliary
267 *> \par Contributors:
270 *> Karen Braman and Ralph Byers, Department of Mathematics,
271 *> University of Kansas, USA
273 * =====================================================================
274 SUBROUTINE DLAQR3( WANTT, WANTZ, N, KTOP, KBOT, NW, H, LDH, ILOZ,
275 $ IHIZ, Z, LDZ, NS, ND, SR, SI, V, LDV, NH, T,
276 $ LDT, NV, WV, LDWV, WORK, LWORK )
278 * -- LAPACK auxiliary routine (version 3.6.1) --
279 * -- LAPACK is a software package provided by Univ. of Tennessee, --
280 * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
283 * .. Scalar Arguments ..
284 INTEGER IHIZ, ILOZ, KBOT, KTOP, LDH, LDT, LDV, LDWV,
285 $ LDZ, LWORK, N, ND, NH, NS, NV, NW
288 * .. Array Arguments ..
289 DOUBLE PRECISION H( LDH, * ), SI( * ), SR( * ), T( LDT, * ),
290 $ V( LDV, * ), WORK( * ), WV( LDWV, * ),
294 * ================================================================
296 DOUBLE PRECISION ZERO, ONE
297 PARAMETER ( ZERO = 0.0d0, ONE = 1.0d0 )
299 * .. Local Scalars ..
300 DOUBLE PRECISION AA, BB, BETA, CC, CS, DD, EVI, EVK, FOO, S,
301 $ SAFMAX, SAFMIN, SMLNUM, SN, TAU, ULP
302 INTEGER I, IFST, ILST, INFO, INFQR, J, JW, K, KCOL,
303 $ KEND, KLN, KROW, KWTOP, LTOP, LWK1, LWK2, LWK3,
305 LOGICAL BULGE, SORTED
307 * .. External Functions ..
308 DOUBLE PRECISION DLAMCH
310 EXTERNAL DLAMCH, ILAENV
312 * .. External Subroutines ..
313 EXTERNAL DCOPY, DGEHRD, DGEMM, DLABAD, DLACPY, DLAHQR,
314 $ DLANV2, DLAQR4, DLARF, DLARFG, DLASET, DORMHR,
317 * .. Intrinsic Functions ..
318 INTRINSIC ABS, DBLE, INT, MAX, MIN, SQRT
320 * .. Executable Statements ..
322 * ==== Estimate optimal workspace. ====
324 JW = MIN( NW, KBOT-KTOP+1 )
329 * ==== Workspace query call to DGEHRD ====
331 CALL DGEHRD( JW, 1, JW-1, T, LDT, WORK, WORK, -1, INFO )
332 LWK1 = INT( WORK( 1 ) )
334 * ==== Workspace query call to DORMHR ====
336 CALL DORMHR( 'R', 'N', JW, JW, 1, JW-1, T, LDT, WORK, V, LDV,
338 LWK2 = INT( WORK( 1 ) )
340 * ==== Workspace query call to DLAQR4 ====
342 CALL DLAQR4( .true., .true., JW, 1, JW, T, LDT, SR, SI, 1, JW,
343 $ V, LDV, WORK, -1, INFQR )
344 LWK3 = INT( WORK( 1 ) )
346 * ==== Optimal workspace ====
348 LWKOPT = MAX( JW+MAX( LWK1, LWK2 ), LWK3 )
351 * ==== Quick return in case of workspace query. ====
353 IF( LWORK.EQ.-1 ) THEN
354 WORK( 1 ) = DBLE( LWKOPT )
358 * ==== Nothing to do ...
359 * ... for an empty active block ... ====
365 * ... nor for an empty deflation window. ====
369 * ==== Machine constants ====
371 SAFMIN = DLAMCH( 'SAFE MINIMUM' )
372 SAFMAX = ONE / SAFMIN
373 CALL DLABAD( SAFMIN, SAFMAX )
374 ULP = DLAMCH( 'PRECISION' )
375 SMLNUM = SAFMIN*( DBLE( N ) / ULP )
377 * ==== Setup deflation window ====
379 JW = MIN( NW, KBOT-KTOP+1 )
380 KWTOP = KBOT - JW + 1
381 IF( KWTOP.EQ.KTOP ) THEN
384 S = H( KWTOP, KWTOP-1 )
387 IF( KBOT.EQ.KWTOP ) THEN
389 * ==== 1-by-1 deflation window: not much to do ====
391 SR( KWTOP ) = H( KWTOP, KWTOP )
395 IF( ABS( S ).LE.MAX( SMLNUM, ULP*ABS( H( KWTOP, KWTOP ) ) ) )
400 $ H( KWTOP, KWTOP-1 ) = ZERO
406 * ==== Convert to spike-triangular form. (In case of a
407 * . rare QR failure, this routine continues to do
408 * . aggressive early deflation using that part of
409 * . the deflation window that converged using INFQR
410 * . here and there to keep track.) ====
412 CALL DLACPY( 'U', JW, JW, H( KWTOP, KWTOP ), LDH, T, LDT )
413 CALL DCOPY( JW-1, H( KWTOP+1, KWTOP ), LDH+1, T( 2, 1 ), LDT+1 )
415 CALL DLASET( 'A', JW, JW, ZERO, ONE, V, LDV )
416 NMIN = ILAENV( 12, 'DLAQR3', 'SV', JW, 1, JW, LWORK )
417 IF( JW.GT.NMIN ) THEN
418 CALL DLAQR4( .true., .true., JW, 1, JW, T, LDT, SR( KWTOP ),
419 $ SI( KWTOP ), 1, JW, V, LDV, WORK, LWORK, INFQR )
421 CALL DLAHQR( .true., .true., JW, 1, JW, T, LDT, SR( KWTOP ),
422 $ SI( KWTOP ), 1, JW, V, LDV, INFQR )
425 * ==== DTREXC needs a clean margin near the diagonal ====
432 $ T( JW, JW-2 ) = ZERO
434 * ==== Deflation detection loop ====
439 IF( ILST.LE.NS ) THEN
443 BULGE = T( NS, NS-1 ).NE.ZERO
446 * ==== Small spike tip test for deflation ====
448 IF( .NOT. BULGE ) THEN
450 * ==== Real eigenvalue ====
452 FOO = ABS( T( NS, NS ) )
455 IF( ABS( S*V( 1, NS ) ).LE.MAX( SMLNUM, ULP*FOO ) ) THEN
457 * ==== Deflatable ====
462 * ==== Undeflatable. Move it up out of the way.
463 * . (DTREXC can not fail in this case.) ====
466 CALL DTREXC( 'V', JW, T, LDT, V, LDV, IFST, ILST, WORK,
472 * ==== Complex conjugate pair ====
474 FOO = ABS( T( NS, NS ) ) + SQRT( ABS( T( NS, NS-1 ) ) )*
475 $ SQRT( ABS( T( NS-1, NS ) ) )
478 IF( MAX( ABS( S*V( 1, NS ) ), ABS( S*V( 1, NS-1 ) ) ).LE.
479 $ MAX( SMLNUM, ULP*FOO ) ) THEN
481 * ==== Deflatable ====
486 * ==== Undeflatable. Move them up out of the way.
487 * . Fortunately, DTREXC does the right thing with
488 * . ILST in case of a rare exchange failure. ====
491 CALL DTREXC( 'V', JW, T, LDT, V, LDV, IFST, ILST, WORK,
497 * ==== End deflation detection loop ====
502 * ==== Return to Hessenberg form ====
509 * ==== sorting diagonal blocks of T improves accuracy for
510 * . graded matrices. Bubble sort deals well with
511 * . exchange failures. ====
524 ELSE IF( T( I+1, I ).EQ.ZERO ) THEN
532 EVI = ABS( T( I, I ) )
534 EVI = ABS( T( I, I ) ) + SQRT( ABS( T( I+1, I ) ) )*
535 $ SQRT( ABS( T( I, I+1 ) ) )
539 EVK = ABS( T( K, K ) )
540 ELSE IF( T( K+1, K ).EQ.ZERO ) THEN
541 EVK = ABS( T( K, K ) )
543 EVK = ABS( T( K, K ) ) + SQRT( ABS( T( K+1, K ) ) )*
544 $ SQRT( ABS( T( K, K+1 ) ) )
547 IF( EVI.GE.EVK ) THEN
553 CALL DTREXC( 'V', JW, T, LDT, V, LDV, IFST, ILST, WORK,
563 ELSE IF( T( I+1, I ).EQ.ZERO ) THEN
574 * ==== Restore shift/eigenvalue array from T ====
578 IF( I.GE.INFQR+1 ) THEN
579 IF( I.EQ.INFQR+1 ) THEN
580 SR( KWTOP+I-1 ) = T( I, I )
581 SI( KWTOP+I-1 ) = ZERO
583 ELSE IF( T( I, I-1 ).EQ.ZERO ) THEN
584 SR( KWTOP+I-1 ) = T( I, I )
585 SI( KWTOP+I-1 ) = ZERO
592 CALL DLANV2( AA, BB, CC, DD, SR( KWTOP+I-2 ),
593 $ SI( KWTOP+I-2 ), SR( KWTOP+I-1 ),
594 $ SI( KWTOP+I-1 ), CS, SN )
600 IF( NS.LT.JW .OR. S.EQ.ZERO ) THEN
601 IF( NS.GT.1 .AND. S.NE.ZERO ) THEN
603 * ==== Reflect spike back into lower triangle ====
605 CALL DCOPY( NS, V, LDV, WORK, 1 )
607 CALL DLARFG( NS, BETA, WORK( 2 ), 1, TAU )
610 CALL DLASET( 'L', JW-2, JW-2, ZERO, ZERO, T( 3, 1 ), LDT )
612 CALL DLARF( 'L', NS, JW, WORK, 1, TAU, T, LDT,
614 CALL DLARF( 'R', NS, NS, WORK, 1, TAU, T, LDT,
616 CALL DLARF( 'R', JW, NS, WORK, 1, TAU, V, LDV,
619 CALL DGEHRD( JW, 1, NS, T, LDT, WORK, WORK( JW+1 ),
623 * ==== Copy updated reduced window into place ====
626 $ H( KWTOP, KWTOP-1 ) = S*V( 1, 1 )
627 CALL DLACPY( 'U', JW, JW, T, LDT, H( KWTOP, KWTOP ), LDH )
628 CALL DCOPY( JW-1, T( 2, 1 ), LDT+1, H( KWTOP+1, KWTOP ),
631 * ==== Accumulate orthogonal matrix in order update
632 * . H and Z, if requested. ====
634 IF( NS.GT.1 .AND. S.NE.ZERO )
635 $ CALL DORMHR( 'R', 'N', JW, NS, 1, NS, T, LDT, WORK, V, LDV,
636 $ WORK( JW+1 ), LWORK-JW, INFO )
638 * ==== Update vertical slab in H ====
645 DO 70 KROW = LTOP, KWTOP - 1, NV
646 KLN = MIN( NV, KWTOP-KROW )
647 CALL DGEMM( 'N', 'N', KLN, JW, JW, ONE, H( KROW, KWTOP ),
648 $ LDH, V, LDV, ZERO, WV, LDWV )
649 CALL DLACPY( 'A', KLN, JW, WV, LDWV, H( KROW, KWTOP ), LDH )
652 * ==== Update horizontal slab in H ====
655 DO 80 KCOL = KBOT + 1, N, NH
656 KLN = MIN( NH, N-KCOL+1 )
657 CALL DGEMM( 'C', 'N', JW, KLN, JW, ONE, V, LDV,
658 $ H( KWTOP, KCOL ), LDH, ZERO, T, LDT )
659 CALL DLACPY( 'A', JW, KLN, T, LDT, H( KWTOP, KCOL ),
664 * ==== Update vertical slab in Z ====
667 DO 90 KROW = ILOZ, IHIZ, NV
668 KLN = MIN( NV, IHIZ-KROW+1 )
669 CALL DGEMM( 'N', 'N', KLN, JW, JW, ONE, Z( KROW, KWTOP ),
670 $ LDZ, V, LDV, ZERO, WV, LDWV )
671 CALL DLACPY( 'A', KLN, JW, WV, LDWV, Z( KROW, KWTOP ),
677 * ==== Return the number of deflations ... ====
681 * ==== ... and the number of shifts. (Subtracting
682 * . INFQR from the spike length takes care
683 * . of the case of a rare QR failure while
684 * . calculating eigenvalues of the deflation
689 * ==== Return optimal workspace. ====
691 WORK( 1 ) = DBLE( LWKOPT )
693 * ==== End of DLAQR3 ====