// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.
/*
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

                     Linear Scan Register Allocation
  - All register requirements are expressed in the code stream, either as destination
    registers of tree nodes, or as internal registers. These requirements are
    expressed in the TreeNodeInfo computed for each node, which includes:
    - The number of register sources and destinations.
    - The register restrictions (candidates) of the target register, both from itself,
      as producer of the value (dstCandidates), and from its consuming node (srcCandidates).
      Note that the srcCandidates field of TreeNodeInfo refers to the destination register
      (not to any of its sources).
    - The number (internalCount) of registers required, and their register restrictions (internalCandidates).
      These are neither inputs nor outputs of the node, but are used in the sequence of code generated for the tree.
    "Internal registers" are registers used during the code sequence generated for the node.
    The register lifetimes must obey the following lifetime model:
    - First, any internal registers are defined.
    - Next, any source registers are used (and are then freed if they are last use and are not
      identified as "delayRegFree").
    - Next, the internal registers are used (and are then freed).
    - Next, any registers in the kill set for the instruction are killed.
    - Next, the destination register(s) are defined (multiple destination registers are only supported on ARM).
    - Finally, any "delayRegFree" source registers are freed.
    There are several things to note about this order:
    - The internal registers will never overlap any use, but they may overlap a destination register.
    - Internal registers are never live beyond the node.
    - The "delayRegFree" annotation is used for instructions that are only available in a Read-Modify-Write form.
      That is, the destination register is one of the sources. In this case, we must not use the same register for
      the non-RMW operand as for the destination.
  Overview (doLinearScan):
    - Walk all blocks, building intervals and RefPositions (buildIntervals)
    - Allocate registers (allocateRegisters)
    - Annotate nodes with register assignments (resolveRegisters)
    - Add move nodes as needed to resolve conflicting register
      assignments across non-adjacent edges (resolveEdges, called from resolveRegisters)
  Postconditions:
    - GenTree::gtRegNum (and gtRegPair for ARM) is annotated with the register
      assignment for a node. If the node does not require a register, it is
      annotated as such (for single registers, gtRegNum = REG_NA; for register
      pair type, gtRegPair = REG_PAIR_NONE). For a variable definition or interior
      tree node (an "implicit" definition), this is the register to put the result.
      For an expression use, this is the place to find the value that has previously
      been computed.
      - In most cases, this register must satisfy the constraints specified by the TreeNodeInfo.
      - In some cases, this is difficult:
        - If a lclVar node currently lives in some register, it may not be desirable to move it
          (i.e. its current location may be desirable for future uses, e.g. if it's a callee-save register,
          but needs to be in a specific arg register for a call).
        - In other cases there may be conflicts on the restrictions placed by the defining node and the node
          which consumes it.
      - If such a node is constrained to a single fixed register (e.g. an arg register, or a return from a call),
        then LSRA is free to annotate the node with a different register. The code generator must issue the
        appropriate move.
      - However, if such a node is constrained to a set of registers, and its current location does not satisfy that
        requirement, LSRA must insert a GT_COPY node between the node and its parent. The gtRegNum on the GT_COPY node
        must satisfy the register requirement of the parent.
    - GenTree::gtRsvdRegs has a set of registers used for internal temps.
    - A tree node is marked GTF_SPILL if the tree node must be spilled by the code generator after it has been
      evaluated.
      - LSRA currently does not set GTF_SPILLED on such nodes, because it caused problems in the old code generator.
        In the new backend perhaps this should change (see also the note below under CodeGen).
    - A tree node is marked GTF_SPILLED if it is a lclVar that must be reloaded prior to use.
      - The register (gtRegNum) on the node indicates the register to which it must be reloaded.
      - For lclVar nodes, since the uses and defs are distinct tree nodes, it is always possible to annotate the node
        with the register to which the variable must be reloaded.
      - For other nodes, since they represent both the def and use, if the value must be reloaded to a different
        register, LSRA must insert a GT_RELOAD node in order to specify the register to which it should be reloaded.
  Local variable table (LclVarDsc):
    - LclVarDsc::lvRegister is set to true if a local variable has the
      same register assignment for its entire lifetime.
    - LclVarDsc::lvRegNum / lvOtherReg: these are initialized to their
      first value at the end of LSRA (it looks like lvOtherReg isn't?
      This is probably a bug (ARM)). Codegen will set them to their current value
      as it processes the trees, since a variable can (now) be assigned different
      registers over its lifetime.
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
*/
#ifndef LEGACY_BACKEND // This file is ONLY used for the RyuJIT backend that uses the linear scan register allocator

const char* LinearScan::resolveTypeName[] = {"Split", "Join", "Critical", "SharedCritical"};
/*XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XX                                                                           XX
XX                      Small Helper functions                               XX
XX                                                                           XX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
*/
//--------------------------------------------------------------
// lsraAssignRegToTree: Assign the given reg to tree node.
//
// Arguments:
//    tree    -    GenTree node
//    reg     -    register to be assigned
//    regIdx  -    register idx, if tree is a multi-reg call node.
//                 regIdx will be zero for single-reg result producing tree nodes.
//
void lsraAssignRegToTree(GenTree* tree, regNumber reg, unsigned regIdx)
{
    if (regIdx == 0)
    {
        tree->gtRegNum = reg;
    }
#if defined(_TARGET_ARM_)
    else if (tree->OperIsMultiRegOp())
    {
        GenTreeMultiRegOp* mul = tree->AsMultiRegOp();
        mul->gtOtherReg        = reg;
    }
    else if (tree->OperGet() == GT_COPY)
    {
        GenTreeCopyOrReload* copy = tree->AsCopyOrReload();
        copy->gtOtherRegs[0]      = (regNumberSmall)reg;
    }
    else if (tree->OperIsPutArgSplit())
    {
        GenTreePutArgSplit* putArg = tree->AsPutArgSplit();
        putArg->SetRegNumByIdx(reg, regIdx);
    }
#endif // _TARGET_ARM_
    else
    {
        assert(tree->IsMultiRegCall());
        GenTreeCall* call = tree->AsCall();
        call->SetRegNumByIdx(reg, regIdx);
    }
}
//-------------------------------------------------------------
// getWeight: Returns the weight of the RefPosition.
//
// Arguments:
//    refPos   -   ref position
//
// Returns:
//    Weight of ref position.
unsigned LinearScan::getWeight(RefPosition* refPos)
{
    unsigned weight;
    GenTree* treeNode = refPos->treeNode;

    if (treeNode != nullptr)
    {
        if (isCandidateLocalRef(treeNode))
        {
            // Tracked locals: use weighted ref cnt as the weight of the
            // ref position.
            GenTreeLclVarCommon* lclCommon = treeNode->AsLclVarCommon();
            LclVarDsc*           varDsc    = &(compiler->lvaTable[lclCommon->gtLclNum]);
            weight                         = varDsc->lvRefCntWtd;
            if (refPos->getInterval()->isSpilled)
            {
                // Decrease the weight if the interval has already been spilled.
                weight -= BB_UNITY_WEIGHT;
            }
        }
        else
        {
            // Non-candidate local ref or non-lcl tree node.
            // These are considered to have two references in the basic block:
            // a def and a use and hence weighted ref count would be 2 times
            // the basic block weight in which they appear.
            // However, it is generally more harmful to spill tree temps, so we
            // double that.
            const unsigned TREE_TEMP_REF_COUNT    = 2;
            const unsigned TREE_TEMP_BOOST_FACTOR = 2;
            weight = TREE_TEMP_REF_COUNT * TREE_TEMP_BOOST_FACTOR * blockInfo[refPos->bbNum].weight;
        }
    }
    else
    {
        // Non-tree node ref positions. These will have a single
        // reference in the basic block and hence their weighted
        // refcount is equal to the block weight in which they
        // appear.
        weight = blockInfo[refPos->bbNum].weight;
    }

    return weight;
}
// allRegs represents a set of registers that can
// be used to allocate the specified type in any point
// in time (more of a 'bank' of registers).
regMaskTP LinearScan::allRegs(RegisterType rt)
{
    if (rt == TYP_FLOAT)
    {
        return availableFloatRegs;
    }
    else if (rt == TYP_DOUBLE)
    {
        return availableDoubleRegs;
    }
#ifdef FEATURE_SIMD
    // TODO-Cleanup: Add an RBM_ALLSIMD
    else if (varTypeIsSIMD(rt))
    {
        return availableDoubleRegs;
    }
#endif // FEATURE_SIMD
    else
    {
        return availableIntRegs;
    }
}
//--------------------------------------------------------------------------
// allMultiRegCallNodeRegs: represents a set of registers that can be used
// to allocate a multi-reg call node.
//
// Arguments:
//    call   -  Multi-reg call node
//
// Return Value:
//    Mask representing the set of available registers for multi-reg call
//    node.
//
// Note:
//    Multi-reg call node available regs = Bitwise-OR(allRegs(GetReturnRegType(i)))
//    for all i=0..RetRegCount-1.
regMaskTP LinearScan::allMultiRegCallNodeRegs(GenTreeCall* call)
{
    assert(call->HasMultiRegRetVal());

    ReturnTypeDesc* retTypeDesc = call->GetReturnTypeDesc();
    regMaskTP       resultMask  = allRegs(retTypeDesc->GetReturnRegType(0));

    unsigned count = retTypeDesc->GetReturnRegCount();
    for (unsigned i = 1; i < count; ++i)
    {
        resultMask |= allRegs(retTypeDesc->GetReturnRegType(i));
    }

    return resultMask;
}
//--------------------------------------------------------------------------
// allRegs: returns the set of registers that can accommodate the type of
// the given node.
//
// Arguments:
//    tree   -  GenTree node
//
// Return Value:
//    Mask representing the set of available registers for given tree
//    node.
//
// Note: In case of multi-reg call node, the full set of registers must be
// determined by looking at types of individual return register types.
// In this case, the registers may include registers from different register
// sets and will not be limited to the actual ABI return registers.
regMaskTP LinearScan::allRegs(GenTree* tree)
{
    regMaskTP resultMask;

    // In case of multi-reg calls, allRegs is defined as
    // Bitwise-Or(allRegs(GetReturnRegType(i)) for i=0..ReturnRegCount-1)
    if (tree->IsMultiRegCall())
    {
        resultMask = allMultiRegCallNodeRegs(tree->AsCall());
    }
    else
    {
        resultMask = allRegs(tree->TypeGet());
    }

    return resultMask;
}
regMaskTP LinearScan::allSIMDRegs()
{
    return availableFloatRegs;
}
//------------------------------------------------------------------------
// internalFloatRegCandidates: Return the set of registers that are appropriate
//                             for use as internal float registers.
//
// Return Value:
//    The set of registers (as a regMaskTP).
//
// Notes:
//    compFloatingPointUsed is only required to be set if it is possible that we
//    will use floating point callee-save registers.
//    It is unlikely, if an internal register is the only use of floating point,
//    that it will select a callee-save register. But to be safe, we restrict
//    the set of candidates if compFloatingPointUsed is not already set.

regMaskTP LinearScan::internalFloatRegCandidates()
{
    if (compiler->compFloatingPointUsed)
    {
        return allRegs(TYP_FLOAT);
    }
    else
    {
        return RBM_FLT_CALLEE_TRASH;
    }
}
/*****************************************************************************
 * Inline functions for RegRecord
 *****************************************************************************/

bool RegRecord::isFree()
{
    return ((assignedInterval == nullptr || !assignedInterval->isActive) && !isBusyUntilNextKill);
}

/*****************************************************************************
 * Inline functions for LinearScan
 *****************************************************************************/

RegRecord* LinearScan::getRegisterRecord(regNumber regNum)
{
    assert((unsigned)regNum < ArrLen(physRegs));
    return &physRegs[regNum];
}
//----------------------------------------------------------------------------
// getConstrainedRegMask: Returns new regMask which is the intersection of
// regMaskActual and regMaskConstraint if the new regMask has at least
// minRegCount registers, otherwise returns regMaskActual.
//
// Arguments:
//     regMaskActual      -  regMask that needs to be constrained
//     regMaskConstraint  -  regMask constraint that needs to be
//                           applied to regMaskActual
//     minRegCount        -  Minimum number of regs that should be
//                           present in new regMask.
//
// Return Value:
//     New regMask that has minRegCount registers after intersection.
//     Otherwise returns regMaskActual.
regMaskTP LinearScan::getConstrainedRegMask(regMaskTP regMaskActual, regMaskTP regMaskConstraint, unsigned minRegCount)
{
    regMaskTP newMask = regMaskActual & regMaskConstraint;
    if (genCountBits(newMask) >= minRegCount)
    {
        return newMask;
    }
    return regMaskActual;
}
//------------------------------------------------------------------------
// stressLimitRegs: Given a set of registers, expressed as a register mask, reduce
//                  them based on the current stress options.
//
// Arguments:
//    mask      - The current mask of register candidates for a node
//
// Return Value:
//    A possibly-modified mask, based on the value of COMPlus_JitStressRegs.
//
// Notes:
//    This is the method used to implement the stress options that limit
//    the set of registers considered for allocation.

regMaskTP LinearScan::stressLimitRegs(RefPosition* refPosition, regMaskTP mask)
{
    if (getStressLimitRegs() != LSRA_LIMIT_NONE)
    {
        // The refPosition could be null, for example when called
        // by getTempRegForResolution().
        int minRegCount = (refPosition != nullptr) ? refPosition->minRegCandidateCount : 1;

        switch (getStressLimitRegs())
        {
            case LSRA_LIMIT_CALLEE:
                if (!compiler->opts.compDbgEnC)
                {
                    mask = getConstrainedRegMask(mask, RBM_CALLEE_SAVED, minRegCount);
                }
                break;

            case LSRA_LIMIT_CALLER:
                mask = getConstrainedRegMask(mask, RBM_CALLEE_TRASH, minRegCount);
                break;

            case LSRA_LIMIT_SMALL_SET:
                if ((mask & LsraLimitSmallIntSet) != RBM_NONE)
                {
                    mask = getConstrainedRegMask(mask, LsraLimitSmallIntSet, minRegCount);
                }
                else if ((mask & LsraLimitSmallFPSet) != RBM_NONE)
                {
                    mask = getConstrainedRegMask(mask, LsraLimitSmallFPSet, minRegCount);
                }
                break;

            default:
                unreached();
        }

        if (refPosition != nullptr && refPosition->isFixedRegRef)
        {
            mask |= refPosition->registerAssignment;
        }
    }

    return mask;
}
//------------------------------------------------------------------------
// conflictingFixedRegReference: Determine whether the current RegRecord has a
//                               fixed register use that conflicts with 'refPosition'
//
// Arguments:
//    refPosition - The RefPosition of interest
//
// Return Value:
//    Returns true iff the given RefPosition is NOT a fixed use of this register,
//    AND either:
//    - there is a RefPosition on this RegRecord at the nodeLocation of the given RefPosition, or
//    - the given RefPosition has a delayRegFree, and there is a RefPosition on this RegRecord at
//      the nodeLocation just past the given RefPosition.
//
// Assumptions:
//    'refPosition' is non-null.

bool RegRecord::conflictingFixedRegReference(RefPosition* refPosition)
{
    // Is this a fixed reference of this register? If so, there is no conflict.
    if (refPosition->isFixedRefOfRegMask(genRegMask(regNum)))
    {
        return false;
    }

    // Otherwise, check for conflicts.
    // There is a conflict if:
    // 1. There is a recent RefPosition on this RegRecord that is at this location,
    //    except in the case where it is a special "putarg" that is associated with this interval, OR
    // 2. There is an upcoming RefPosition at this location, or at the next location
    //    if refPosition is a delayed use (i.e. must be kept live through the next/def location).

    LsraLocation refLocation = refPosition->nodeLocation;
    if (recentRefPosition != nullptr && recentRefPosition->refType != RefTypeKill &&
        recentRefPosition->nodeLocation == refLocation &&
        (!isBusyUntilNextKill || assignedInterval != refPosition->getInterval()))
    {
        return true;
    }

    LsraLocation nextPhysRefLocation = getNextRefLocation();
    if (nextPhysRefLocation == refLocation || (refPosition->delayRegFree && nextPhysRefLocation == (refLocation + 1)))
    {
        return true;
    }

    return false;
}
/*****************************************************************************
 * Inline functions for Interval
 *****************************************************************************/

RefPosition* Referenceable::getNextRefPosition()
{
    if (recentRefPosition == nullptr)
    {
        return firstRefPosition;
    }
    else
    {
        return recentRefPosition->nextRefPosition;
    }
}

LsraLocation Referenceable::getNextRefLocation()
{
    RefPosition* nextRefPosition = getNextRefPosition();
    if (nextRefPosition == nullptr)
    {
        return MaxLocation;
    }
    else
    {
        return nextRefPosition->nodeLocation;
    }
}
// Iterate through all the registers of the given type
class RegisterIterator
{
    friend class Registers;

public:
    RegisterIterator(RegisterType type) : regType(type)
    {
        if (useFloatReg(regType))
        {
            currentRegNum = REG_FP_FIRST;
        }
        else
        {
            currentRegNum = REG_INT_FIRST;
        }
    }

protected:
    static RegisterIterator Begin(RegisterType regType)
    {
        return RegisterIterator(regType);
    }
    static RegisterIterator End(RegisterType regType)
    {
        RegisterIterator endIter = RegisterIterator(regType);
        // This assumes only integer and floating point register types;
        // if we target a processor with additional register types,
        // this would have to change.
        if (useFloatReg(regType))
        {
            // This just happens to work for both double & float
            endIter.currentRegNum = REG_NEXT(REG_FP_LAST);
        }
        else
        {
            endIter.currentRegNum = REG_NEXT(REG_INT_LAST);
        }
        return endIter;
    }

public:
    void operator++(int dummy) // int dummy is c++ for "this is postfix ++"
    {
        currentRegNum = REG_NEXT(currentRegNum);
#ifdef _TARGET_ARM_
        if (regType == TYP_DOUBLE)
            currentRegNum = REG_NEXT(currentRegNum);
#endif
    }
    void operator++() // prefix operator++
    {
        currentRegNum = REG_NEXT(currentRegNum);
#ifdef _TARGET_ARM_
        if (regType == TYP_DOUBLE)
            currentRegNum = REG_NEXT(currentRegNum);
#endif
    }
    regNumber operator*()
    {
        return currentRegNum;
    }
    bool operator!=(const RegisterIterator& other)
    {
        return other.currentRegNum != currentRegNum;
    }

private:
    regNumber    currentRegNum;
    RegisterType regType;
};
class Registers
{
public:
    friend class RegisterIterator;
    RegisterType type;
    Registers(RegisterType t)
    {
        type = t;
    }
    RegisterIterator begin()
    {
        return RegisterIterator::Begin(type);
    }
    RegisterIterator end()
    {
        return RegisterIterator::End(type);
    }
};
#ifdef DEBUG
void LinearScan::dumpVarToRegMap(VarToRegMap map)
{
    bool anyPrinted = false;
    for (unsigned varIndex = 0; varIndex < compiler->lvaTrackedCount; varIndex++)
    {
        unsigned varNum = compiler->lvaTrackedToVarNum[varIndex];
        if (map[varIndex] != REG_STK)
        {
            printf("V%02u=%s ", varNum, getRegName(map[varIndex]));
            anyPrinted = true;
        }
    }
    if (!anyPrinted)
    {
        printf("none");
    }
    printf("\n");
}

void LinearScan::dumpInVarToRegMap(BasicBlock* block)
{
    printf("Var=Reg beg of BB%02u: ", block->bbNum);
    VarToRegMap map = getInVarToRegMap(block->bbNum);
    dumpVarToRegMap(map);
}

void LinearScan::dumpOutVarToRegMap(BasicBlock* block)
{
    printf("Var=Reg end of BB%02u: ", block->bbNum);
    VarToRegMap map = getOutVarToRegMap(block->bbNum);
    dumpVarToRegMap(map);
}
#endif // DEBUG

LinearScanInterface* getLinearScanAllocator(Compiler* comp)
{
    return new (comp, CMK_LSRA) LinearScan(comp);
}
//------------------------------------------------------------------------
// LinearScan::LinearScan: Constructor.
//
// Notes:
//    The constructor takes care of initializing the data structures that are used
//    during Lowering, including (in DEBUG) getting the stress environment variables,
//    as they may affect the block ordering.
LinearScan::LinearScan(Compiler* theCompiler)
    : compiler(theCompiler)
#if MEASURE_MEM_ALLOC
    , lsraAllocator(nullptr)
#endif // MEASURE_MEM_ALLOC
    , intervals(LinearScanMemoryAllocatorInterval(theCompiler))
    , refPositions(LinearScanMemoryAllocatorRefPosition(theCompiler))
    , listNodePool(theCompiler)
{
    activeRefPosition  = nullptr;
    specialPutArgCount = 0;

    // Get the value of the environment variable that controls stress for register allocation
    lsraStressMask = JitConfig.JitStressRegs();

    if (lsraStressMask != 0)
    {
        // The code in this #if can be used to debug JitStressRegs issues according to
        // method hash. To use, simply set environment variables JitStressRegsHashLo and JitStressRegsHashHi.
        unsigned methHash = compiler->info.compMethodHash();
        char*    lostr    = getenv("JitStressRegsHashLo");
        unsigned methHashLo = 0;
        if (lostr != nullptr)
        {
            sscanf_s(lostr, "%x", &methHashLo);
        }
        char*    histr      = getenv("JitStressRegsHashHi");
        unsigned methHashHi = UINT32_MAX;
        if (histr != nullptr)
        {
            sscanf_s(histr, "%x", &methHashHi);
        }
        if (methHash < methHashLo || methHash > methHashHi)
        {
            lsraStressMask = 0;
        }
        else if (dump == true)
        {
            printf("JitStressRegs = %x for method %s, hash = 0x%x.\n", lsraStressMask, compiler->info.compFullName,
                   compiler->info.compMethodHash());
            printf(""); // in our logic this causes a flush
        }
    }
    // Assume that we will enregister local variables if it's not disabled. We'll reset it if we
    // have no tracked locals when we start allocating. Note that new tracked lclVars may be added
    // after the first liveness analysis - either by optimizations or by Lowering, and the tracked
    // set won't be recomputed until after Lowering (and this constructor is called prior to Lowering),
    // so we don't want to check that yet.
    enregisterLocalVars = ((compiler->opts.compFlags & CLFLG_REGVAR) != 0);
#ifdef _TARGET_ARM64_
    availableIntRegs = (RBM_ALLINT & ~(RBM_PR | RBM_FP | RBM_LR) & ~compiler->codeGen->regSet.rsMaskResvd);
#else
    availableIntRegs = (RBM_ALLINT & ~compiler->codeGen->regSet.rsMaskResvd);
#endif

#if ETW_EBP_FRAMED
    availableIntRegs &= ~RBM_FPBASE;
#endif // ETW_EBP_FRAMED

    availableFloatRegs  = RBM_ALLFLOAT;
    availableDoubleRegs = RBM_ALLDOUBLE;
#ifdef _TARGET_AMD64_
    if (compiler->opts.compDbgEnC)
    {
        // On x64 when the EnC option is set, we always save exactly RBP, RSI and RDI.
        // RBP is not available to the register allocator, so RSI and RDI are the only
        // callee-save registers available.
        availableIntRegs &= ~RBM_CALLEE_SAVED | RBM_RSI | RBM_RDI;
        availableFloatRegs &= ~RBM_CALLEE_SAVED;
        availableDoubleRegs &= ~RBM_CALLEE_SAVED;
    }
#endif // _TARGET_AMD64_

    compiler->rpFrameType           = FT_NOT_SET;
    compiler->rpMustCreateEBPCalled = false;

    compiler->codeGen->intRegState.rsIsFloat   = false;
    compiler->codeGen->floatRegState.rsIsFloat = true;
    // Block sequencing (the order in which we schedule).
    // Note that we don't initialize the bbVisitedSet until we do the first traversal
    // (currently during Lowering's second phase, where it sets the TreeNodeInfo).
    // This is so that any blocks that are added during the first phase of Lowering
    // are accounted for (and we don't have BasicBlockEpoch issues).
    blockSequencingDone   = false;
    blockSequence         = nullptr;
    blockSequenceWorkList = nullptr;

    // Information about each block, including predecessor blocks used for variable locations at block entry.
    blockInfo = nullptr;

    // Populate the register mask table.
    // The first two masks in the table are allint/allfloat.
    // The next N are the masks for each single register.
    // After that are the dynamically added ones.
    regMaskTable               = new (compiler, CMK_LSRA) regMaskTP[numMasks];
    regMaskTable[ALLINT_IDX]   = allRegs(TYP_INT);
    regMaskTable[ALLFLOAT_IDX] = allRegs(TYP_DOUBLE);

    regNumber reg;
    for (reg = REG_FIRST; reg < REG_COUNT; reg = REG_NEXT(reg))
    {
        regMaskTable[FIRST_SINGLE_REG_IDX + reg - REG_FIRST] = (reg == REG_STK) ? RBM_NONE : genRegMask(reg);
    }
    nextFreeMask = FIRST_SINGLE_REG_IDX + REG_COUNT;
    noway_assert(nextFreeMask <= numMasks);
}
// Return the reg mask corresponding to the given index.
regMaskTP LinearScan::GetRegMaskForIndex(RegMaskIndex index)
{
    assert(index < numMasks);
    assert(index < nextFreeMask);
    return regMaskTable[index];
}
// Given a reg mask, return the index it corresponds to. If it is not a 'well known' reg mask,
// add it at the end. This method has linear behavior in the worst cases but that is fairly rare.
// Most methods never use any but the well-known masks, and when they do use more
// it is only one or two more.
LinearScan::RegMaskIndex LinearScan::GetIndexForRegMask(regMaskTP mask)
{
    RegMaskIndex result;
    if (isSingleRegister(mask))
    {
        result = genRegNumFromMask(mask) + FIRST_SINGLE_REG_IDX;
    }
    else if (mask == allRegs(TYP_INT))
    {
        result = ALLINT_IDX;
    }
    else if (mask == allRegs(TYP_DOUBLE))
    {
        result = ALLFLOAT_IDX;
    }
    else
    {
        for (int i = FIRST_SINGLE_REG_IDX + REG_COUNT; i < nextFreeMask; i++)
        {
            if (regMaskTable[i] == mask)
            {
                return i;
            }
        }

        // We only allocate a fixed number of masks. Since we don't reallocate, we will throw a
        // noway_assert if we exceed this limit.
        noway_assert(nextFreeMask < numMasks);
        regMaskTable[nextFreeMask] = mask;
        result                     = nextFreeMask;
        nextFreeMask++;
    }
    assert(mask == regMaskTable[result]);
    return result;
}
// We've decided that we can't use one or more registers during register allocation (probably FPBASE),
// but we've already added it to the register masks. Go through the masks and remove it.
void LinearScan::RemoveRegistersFromMasks(regMaskTP removeMask)
{
    JITDUMP("Removing registers from LSRA register masks: ");
    INDEBUG(dumpRegMask(removeMask));
    JITDUMP("\n");

    regMaskTP mask = ~removeMask;
    for (int i = 0; i < nextFreeMask; i++)
    {
        regMaskTable[i] &= mask;
    }

    JITDUMP("After removing registers:\n");
    DBEXEC(VERBOSE, dspRegisterMaskTable());
}
void LinearScan::dspRegisterMaskTable()
{
    printf("LSRA register masks. Total allocated: %d, total used: %d\n", numMasks, nextFreeMask);
    for (int i = 0; i < nextFreeMask; i++)
    {
        printf("%2d: ", i);
        dspRegMask(regMaskTable[i]);
        printf("\n");
    }
}
//------------------------------------------------------------------------
// getNextCandidateFromWorkList: Get the next candidate for block sequencing
//
// Arguments:
//    None.
//
// Return Value:
//    The next block to be placed in the sequence.
//
// Notes:
//    This method currently always returns the next block in the list, and relies on having
//    blocks added to the list only when they are "ready", and on the
//    addToBlockSequenceWorkList() method to insert them in the proper order.
//    However, a block may be in the list and already selected, if it was subsequently
//    encountered as both a flow and layout successor of the most recently selected
//    block.

BasicBlock* LinearScan::getNextCandidateFromWorkList()
{
    BasicBlockList* nextWorkList = nullptr;
    for (BasicBlockList* workList = blockSequenceWorkList; workList != nullptr; workList = nextWorkList)
    {
        nextWorkList = workList->next;
        BasicBlock* candBlock = workList->block;
        removeFromBlockSequenceWorkList(workList, nullptr);
        if (!isBlockVisited(candBlock))
        {
            return candBlock;
        }
    }
    return nullptr;
}
//------------------------------------------------------------------------
// setBlockSequence: Determine the block order for register allocation.
//
// Arguments:
//    None
//
// Return Value:
//    None
//
// Notes:
//    On return, the blockSequence array contains the blocks, in the order in which they
//    will be allocated.
//    This method clears the bbVisitedSet on LinearScan, and when it returns the set
//    contains all the bbNums for the blocks.
//    This requires a traversal of the BasicBlocks, and could potentially be
//    combined with the first traversal (currently the one in Lowering that sets the
//    TreeNodeInfo).
void LinearScan::setBlockSequence()
{
    // Reset the "visited" flag on each block.
    compiler->EnsureBasicBlockEpoch();
    bbVisitedSet = BlockSetOps::MakeEmpty(compiler);
    BlockSet readySet(BlockSetOps::MakeEmpty(compiler));
    BlockSet predSet(BlockSetOps::MakeEmpty(compiler));

    assert(blockSequence == nullptr && bbSeqCount == 0);
    blockSequence            = new (compiler, CMK_LSRA) BasicBlock*[compiler->fgBBcount];
    bbNumMaxBeforeResolution = compiler->fgBBNumMax;
    blockInfo                = new (compiler, CMK_LSRA) LsraBlockInfo[bbNumMaxBeforeResolution + 1];

    assert(blockSequenceWorkList == nullptr);

    bool addedInternalBlocks = false;
    verifiedAllBBs           = false;
    hasCriticalEdges         = false;
    BasicBlock* nextBlock;
    // We use a bbNum of 0 for entry RefPositions.
    // The other information in blockInfo[0] will never be used.
    blockInfo[0].weight = BB_UNITY_WEIGHT;
    for (BasicBlock* block = compiler->fgFirstBB; block != nullptr; block = nextBlock)
    {
        blockSequence[bbSeqCount] = block;
        markBlockVisited(block);
        bbSeqCount++;
        nextBlock = nullptr;

        // Initialize the blockInfo.
        // predBBNum will be set later. 0 is never used as a bbNum.
        assert(block->bbNum != 0);
        blockInfo[block->bbNum].predBBNum = 0;
        // We check for critical edges below, but initialize to false.
        blockInfo[block->bbNum].hasCriticalInEdge  = false;
        blockInfo[block->bbNum].hasCriticalOutEdge = false;
        blockInfo[block->bbNum].weight             = block->getBBWeight(compiler);

#if TRACK_LSRA_STATS
        blockInfo[block->bbNum].spillCount         = 0;
        blockInfo[block->bbNum].copyRegCount       = 0;
        blockInfo[block->bbNum].resolutionMovCount = 0;
        blockInfo[block->bbNum].splitEdgeCount     = 0;
#endif // TRACK_LSRA_STATS

        if (block->GetUniquePred(compiler) == nullptr)
        {
            for (flowList* pred = block->bbPreds; pred != nullptr; pred = pred->flNext)
            {
                BasicBlock* predBlock = pred->flBlock;
                if (predBlock->NumSucc(compiler) > 1)
                {
                    blockInfo[block->bbNum].hasCriticalInEdge = true;
                    hasCriticalEdges                          = true;
                    break;
                }
                else if (predBlock->bbJumpKind == BBJ_SWITCH)
                {
                    assert(!"Switch with single successor");
                }
            }
        }
        // Determine which block to schedule next.

        // First, update the NORMAL successors of the current block, adding them to the worklist
        // according to the desired order. We will handle the EH successors below.
        bool checkForCriticalOutEdge = (block->NumSucc(compiler) > 1);
        if (!checkForCriticalOutEdge && block->bbJumpKind == BBJ_SWITCH)
        {
            assert(!"Switch with single successor");
        }

        const unsigned numSuccs = block->NumSucc(compiler);
        for (unsigned succIndex = 0; succIndex < numSuccs; succIndex++)
        {
            BasicBlock* succ = block->GetSucc(succIndex, compiler);
            if (checkForCriticalOutEdge && succ->GetUniquePred(compiler) == nullptr)
            {
                blockInfo[block->bbNum].hasCriticalOutEdge = true;
                hasCriticalEdges                           = true;
                // We can stop checking now.
                checkForCriticalOutEdge = false;
            }

            if (isTraversalLayoutOrder() || isBlockVisited(succ))
            {
                continue;
            }

            // We've now seen a predecessor, so add it to the work list and the "readySet".
            // It will be inserted in the worklist according to the specified traversal order
            // (i.e. pred-first or random, since layout order is handled above).
            if (!BlockSetOps::IsMember(compiler, readySet, succ->bbNum))
            {
                addToBlockSequenceWorkList(readySet, succ, predSet);
                BlockSetOps::AddElemD(compiler, readySet, succ->bbNum);
            }
        }

        // For layout order, simply use bbNext
        if (isTraversalLayoutOrder())
        {
            nextBlock = block->bbNext;
            continue;
        }
        while (nextBlock == nullptr)
        {
            nextBlock = getNextCandidateFromWorkList();

            // TODO-Throughput: We would like to bypass this traversal if we know we've handled all
            // the blocks - but fgBBcount does not appear to be updated when blocks are removed.
            if (nextBlock == nullptr /* && bbSeqCount != compiler->fgBBcount*/ && !verifiedAllBBs)
            {
                // If we don't encounter all blocks by traversing the regular successor links, do a full
                // traversal of all the blocks, and add them in layout order.
                // This may include:
                // - internal-only blocks (in the fgAddCodeList) which may not be in the flow graph
                //   (these are not even in the bbNext links).
                // - blocks that have become unreachable due to optimizations, but that are strongly
                //   connected (these are not removed).

                for (Compiler::AddCodeDsc* desc = compiler->fgAddCodeList; desc != nullptr; desc = desc->acdNext)
                {
                    if (!isBlockVisited(block))
                    {
                        addToBlockSequenceWorkList(readySet, block, predSet);
                        BlockSetOps::AddElemD(compiler, readySet, block->bbNum);
                    }
                }

                for (BasicBlock* block = compiler->fgFirstBB; block; block = block->bbNext)
                {
                    if (!isBlockVisited(block))
                    {
                        addToBlockSequenceWorkList(readySet, block, predSet);
                        BlockSetOps::AddElemD(compiler, readySet, block->bbNum);
                    }
                }
                verifiedAllBBs = true;
            }
            else
            {
                break;
            }
        }
    }
    blockSequencingDone = true;
1064 // Make sure that we've visited all the blocks.
1065 for (BasicBlock* block = compiler->fgFirstBB; block != nullptr; block = block->bbNext)
1067 assert(isBlockVisited(block));
1070 JITDUMP("LSRA Block Sequence: ");
1072 for (BasicBlock *block = startBlockSequence(); block != nullptr; ++i, block = moveToNextBlock())
1074 JITDUMP("BB%02u", block->bbNum);
1076 if (block->isMaxBBWeight())
1082 JITDUMP("(%6s) ", refCntWtd2str(block->getBBWeight(compiler)));
1094 //------------------------------------------------------------------------
1095 // compareBlocksForSequencing: Compare two basic blocks for sequencing order.
1098 // block1 - the first block for comparison
1099 // block2 - the second block for comparison
1100 // useBlockWeights - whether to use block weights for comparison
1103 // -1 if block1 is preferred.
1104 // 0 if the blocks are equivalent.
1105 // 1 if block2 is preferred.
1108 // See addToBlockSequenceWorkList.
1109 int LinearScan::compareBlocksForSequencing(BasicBlock* block1, BasicBlock* block2, bool useBlockWeights)
1111 if (useBlockWeights)
1113 unsigned weight1 = block1->getBBWeight(compiler);
1114 unsigned weight2 = block2->getBBWeight(compiler);
1116 if (weight1 > weight2)
1120 else if (weight1 < weight2)
1126 // If weights are the same, prefer the LOWER bbNum
1127 if (block1->bbNum < block2->bbNum)
1131 else if (block1->bbNum == block2->bbNum)
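The three-way contract documented above (-1 when the first block is preferred, 1 when the second is, with bbNum as the tie-breaker) can be sketched in isolation. This is a hedged, standalone miniature: `MiniBlock`, `num`, and `weight` are hypothetical stand-ins for the JIT's `BasicBlock`, `bbNum`, and `getBBWeight()`, not the real types.

```cpp
#include <cassert>

// Hypothetical stand-in for BasicBlock: only the fields the comparator reads.
struct MiniBlock
{
    unsigned num;    // stand-in for bbNum
    unsigned weight; // stand-in for getBBWeight()
};

// Mirrors the documented contract: -1 if block1 is preferred, 0 if the blocks
// are equivalent, 1 if block2 is preferred. Higher weight wins when weights
// are considered; otherwise the lower block number wins.
int compareMiniBlocks(const MiniBlock& b1, const MiniBlock& b2, bool useWeights)
{
    if (useWeights)
    {
        if (b1.weight > b2.weight)
        {
            return -1;
        }
        if (b1.weight < b2.weight)
        {
            return 1;
        }
    }
    // Weights are equal (or ignored): prefer the lower block number.
    if (b1.num < b2.num)
    {
        return -1;
    }
    if (b1.num == b2.num)
    {
        return 0;
    }
    return 1;
}
```

Note that the bbNum tie-break makes the comparison a total order over distinct blocks, which is what lets the work-list insertion below produce a deterministic sequence.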
1141 //------------------------------------------------------------------------
1142 // addToBlockSequenceWorkList: Add a BasicBlock to the work list for sequencing.
1145 // sequencedBlockSet - the set of blocks that are already sequenced
1146 // block - the new block to be added
1147 // predSet - a block set allocated by the caller, used here as a
1148 // temporary set for constructing the block's predecessor set. It is allocated by the caller to
1149 // avoid reallocating a new block set with every call to this function.
1155 // The first block in the list will be the next one to be sequenced, as soon
1156 // as we encounter a block whose successors have all been sequenced, in pred-first
1157 // order, or the very next block if we are traversing in random order (once implemented).
1158 // This method uses a comparison method to determine the order in which to place
1159 // the blocks in the list. This method queries whether all predecessors of the
1160 // block are sequenced at the time it is added to the list and if so uses block weights
1161 // for inserting the block. A block is never inserted ahead of its predecessors.
1162 // A block at the time of insertion may not have all its predecessors sequenced, in
1163 // which case it will be sequenced based on its block number. Once a block is inserted,
1164 // its priority/order will not be changed, even when its remaining predecessors are later
1165 // sequenced. This means that the work list may not be sorted entirely based on
1166 // block weights alone.
1168 // Note also that, when random traversal order is implemented, this method
1169 // should insert the blocks into the list in random order, so that we can always
1170 // simply select the first block in the list.
1171 void LinearScan::addToBlockSequenceWorkList(BlockSet sequencedBlockSet, BasicBlock* block, BlockSet& predSet)
1173 // The block that is being added is not already sequenced
1174 assert(!BlockSetOps::IsMember(compiler, sequencedBlockSet, block->bbNum));
1176 // Get predSet of block
1177 BlockSetOps::ClearD(compiler, predSet);
1179 for (flowList* pred = block->bbPreds; pred != nullptr; pred = pred->flNext)
1181 BlockSetOps::AddElemD(compiler, predSet, pred->flBlock->bbNum);
1184 // If either a rarely run block or all its preds are already sequenced, use block's weight to sequence
1185 bool useBlockWeight = block->isRunRarely() || BlockSetOps::IsSubset(compiler, sequencedBlockSet, predSet);
1187 BasicBlockList* prevNode = nullptr;
1188 BasicBlockList* nextNode = blockSequenceWorkList;
1190 while (nextNode != nullptr)
1194 if (nextNode->block->isRunRarely())
1196 // If the block that is yet to be sequenced is a rarely run block, always use block weights for sequencing
1197 seqResult = compareBlocksForSequencing(nextNode->block, block, true);
1199 else if (BlockSetOps::IsMember(compiler, predSet, nextNode->block->bbNum))
1201 // always prefer unsequenced pred blocks
1206 seqResult = compareBlocksForSequencing(nextNode->block, block, useBlockWeight);
1214 prevNode = nextNode;
1215 nextNode = nextNode->next;
1218 BasicBlockList* newListNode = new (compiler, CMK_LSRA) BasicBlockList(block, nextNode);
1219 if (prevNode == nullptr)
1221 blockSequenceWorkList = newListNode;
1225 prevNode->next = newListNode;
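The prevNode/nextNode walk above (advance until the comparator says the existing node sorts later, then splice in the new node, with a null prevNode meaning "new head") can be sketched on a plain singly linked list. This is a hedged illustration: `IntNode` and `insertSorted` are invented stand-ins for `BasicBlockList` and the work-list insertion, using plain ints in place of blocks.

```cpp
#include <cassert>

// Hypothetical node type; BasicBlockList plays this role in the JIT.
struct IntNode
{
    int      value;
    IntNode* next;
    IntNode(int v, IntNode* n) : value(v), next(n) {}
};

// Insert 'value' before the first node for which the three-way comparison
// returns > 0 (i.e. the existing node should come later), mirroring the
// prevNode/nextNode walk in addToBlockSequenceWorkList.
IntNode* insertSorted(IntNode* head, int value)
{
    IntNode* prevNode = nullptr;
    IntNode* nextNode = head;
    while (nextNode != nullptr)
    {
        int seqResult = (nextNode->value < value) ? -1 : (nextNode->value > value) ? 1 : 0;
        if (seqResult > 0)
        {
            break; // the new node sorts before 'nextNode'
        }
        prevNode = nextNode;
        nextNode = nextNode->next;
    }
    IntNode* newNode = new IntNode(value, nextNode);
    if (prevNode == nullptr)
    {
        head = newNode; // inserted at the head of the list
    }
    else
    {
        prevNode->next = newNode;
    }
    return head;
}
```

As in the real method, once a node is linked in its position never changes, so later insertions can only leave the list sorted with respect to the comparison results observed at insertion time.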
1229 void LinearScan::removeFromBlockSequenceWorkList(BasicBlockList* listNode, BasicBlockList* prevNode)
1231 if (listNode == blockSequenceWorkList)
1233 assert(prevNode == nullptr);
1234 blockSequenceWorkList = listNode->next;
1238 assert(prevNode != nullptr && prevNode->next == listNode);
1239 prevNode->next = listNode->next;
1241 // TODO-Cleanup: consider merging Compiler::BlockListNode and BasicBlockList
1242 // compiler->FreeBlockListNode(listNode);
1245 // Initialize the block order for allocation (called each time a new traversal begins).
1246 BasicBlock* LinearScan::startBlockSequence()
1248 if (!blockSequencingDone)
1252 BasicBlock* curBB = compiler->fgFirstBB;
1254 curBBNum = curBB->bbNum;
1255 clearVisitedBlocks();
1256 assert(blockSequence[0] == compiler->fgFirstBB);
1257 markBlockVisited(curBB);
1261 //------------------------------------------------------------------------
1262 // moveToNextBlock: Move to the next block in order for allocation or resolution.
1271 // This method is used when the next block is actually going to be handled.
1272 // It changes curBBNum.
1274 BasicBlock* LinearScan::moveToNextBlock()
1276 BasicBlock* nextBlock = getNextBlock();
1278 if (nextBlock != nullptr)
1280 curBBNum = nextBlock->bbNum;
1285 //------------------------------------------------------------------------
1286 // getNextBlock: Get the next block in order for allocation or resolution.
1295 // This method does not actually change the current block - it is used simply
1296 // to determine which block will be next.
1298 BasicBlock* LinearScan::getNextBlock()
1300 assert(blockSequencingDone);
1301 unsigned int nextBBSeqNum = curBBSeqNum + 1;
1302 if (nextBBSeqNum < bbSeqCount)
1304 return blockSequence[nextBBSeqNum];
1309 //------------------------------------------------------------------------
1310 // doLinearScan: The main method for register allocation.
1319 void LinearScan::doLinearScan()
1321 // Check to see whether we have any local variables to enregister.
1322 // We initialize this in the constructor based on opt settings,
1323 // but we don't want to spend time on the lclVar parts of LinearScan
1324 // if we have no tracked locals.
1325 if (enregisterLocalVars && (compiler->lvaTrackedCount == 0))
1327 enregisterLocalVars = false;
1330 unsigned lsraBlockEpoch = compiler->GetCurBasicBlockEpoch();
1332 splitBBNumToTargetBBNumMap = nullptr;
1334 // This is complicated by the fact that physical registers have refs associated
1335 // with locations where they are killed (e.g. calls), but we don't want to
1336 // count these as being touched.
1338 compiler->codeGen->regSet.rsClearRegsModified();
1342 DBEXEC(VERBOSE, TupleStyleDump(LSRA_DUMP_REFPOS));
1343 compiler->EndPhase(PHASE_LINEAR_SCAN_BUILD);
1345 DBEXEC(VERBOSE, lsraDumpIntervals("after buildIntervals"));
1347 clearVisitedBlocks();
1349 allocateRegisters();
1350 compiler->EndPhase(PHASE_LINEAR_SCAN_ALLOC);
1352 compiler->EndPhase(PHASE_LINEAR_SCAN_RESOLVE);
1354 #if TRACK_LSRA_STATS
1355 if ((JitConfig.DisplayLsraStats() != 0)
1361 dumpLsraStats(jitstdout);
1363 #endif // TRACK_LSRA_STATS
1365 DBEXEC(VERBOSE, TupleStyleDump(LSRA_DUMP_POST));
1367 compiler->compLSRADone = true;
1368 noway_assert(lsraBlockEpoch == compiler->GetCurBasicBlockEpoch());
1371 //------------------------------------------------------------------------
1372 // recordVarLocationsAtStartOfBB: Update live-in LclVarDscs with the appropriate
1373 // register location at the start of a block, during codegen.
1376 // bb - the block for which code is about to be generated.
1382 // CodeGen will take care of updating the reg masks and the current var liveness,
1383 // after calling this method.
1384 // This is because we need to kill off the dead registers before setting the newly live ones.
1386 void LinearScan::recordVarLocationsAtStartOfBB(BasicBlock* bb)
1388 if (!enregisterLocalVars)
1392 JITDUMP("Recording Var Locations at start of BB%02u\n", bb->bbNum);
1393 VarToRegMap map = getInVarToRegMap(bb->bbNum);
1396 VarSetOps::AssignNoCopy(compiler, currentLiveVars,
1397 VarSetOps::Intersection(compiler, registerCandidateVars, bb->bbLiveIn));
1398 VarSetOps::Iter iter(compiler, currentLiveVars);
1399 unsigned varIndex = 0;
1400 while (iter.NextElem(&varIndex))
1402 unsigned varNum = compiler->lvaTrackedToVarNum[varIndex];
1403 LclVarDsc* varDsc = &(compiler->lvaTable[varNum]);
1404 regNumber regNum = getVarReg(map, varIndex);
1406 regNumber oldRegNum = varDsc->lvRegNum;
1407 regNumber newRegNum = regNum;
1409 if (oldRegNum != newRegNum)
1411 JITDUMP(" V%02u(%s->%s)", varNum, compiler->compRegVarName(oldRegNum),
1412 compiler->compRegVarName(newRegNum));
1413 varDsc->lvRegNum = newRegNum;
1416 else if (newRegNum != REG_STK)
1418 JITDUMP(" V%02u(%s)", varNum, compiler->compRegVarName(newRegNum));
1425 JITDUMP(" <none>\n");
1431 void Interval::setLocalNumber(Compiler* compiler, unsigned lclNum, LinearScan* linScan)
1433 LclVarDsc* varDsc = &compiler->lvaTable[lclNum];
1434 assert(varDsc->lvTracked);
1435 assert(varDsc->lvVarIndex < compiler->lvaTrackedCount);
1437 linScan->localVarIntervals[varDsc->lvVarIndex] = this;
1439 assert(linScan->getIntervalForLocalVar(varDsc->lvVarIndex) == this);
1440 this->isLocalVar = true;
1441 this->varNum = lclNum;
1444 // Identify the candidates which we are not going to enregister, due to
1445 // being used in EH in a way we don't want to deal with.
1446 // This logic is cloned from fgInterBlockLocalVarLiveness.
1447 void LinearScan::identifyCandidatesExceptionDataflow()
1449 VARSET_TP exceptVars(VarSetOps::MakeEmpty(compiler));
1450 VARSET_TP filterVars(VarSetOps::MakeEmpty(compiler));
1451 VARSET_TP finallyVars(VarSetOps::MakeEmpty(compiler));
1454 foreach_block(compiler, block)
1456 if (block->bbCatchTyp != BBCT_NONE)
1458 // live on entry to handler
1459 VarSetOps::UnionD(compiler, exceptVars, block->bbLiveIn);
1462 if (block->bbJumpKind == BBJ_EHFILTERRET)
1464 // live on exit from filter
1465 VarSetOps::UnionD(compiler, filterVars, block->bbLiveOut);
1467 else if (block->bbJumpKind == BBJ_EHFINALLYRET)
1469 // live on exit from finally
1470 VarSetOps::UnionD(compiler, finallyVars, block->bbLiveOut);
1472 #if FEATURE_EH_FUNCLETS
1473 // Funclets are called and returned from, as such we can only count on the frame
1474 // pointer being restored, and thus everything live in or live out must be on the stack.
1476 if (block->bbFlags & BBF_FUNCLET_BEG)
1478 VarSetOps::UnionD(compiler, exceptVars, block->bbLiveIn);
1480 if ((block->bbJumpKind == BBJ_EHFINALLYRET) || (block->bbJumpKind == BBJ_EHFILTERRET) ||
1481 (block->bbJumpKind == BBJ_EHCATCHRET))
1483 VarSetOps::UnionD(compiler, exceptVars, block->bbLiveOut);
1485 #endif // FEATURE_EH_FUNCLETS
1488 // slam them all together (there was really no need to use more than 2 bitvectors here)
1489 VarSetOps::UnionD(compiler, exceptVars, filterVars);
1490 VarSetOps::UnionD(compiler, exceptVars, finallyVars);
1492 /* Mark all pointer variables live on exit from a 'finally'
1493 block as either volatile for non-GC ref types or as
1494 'explicitly initialized' (volatile and must-init) for GC-ref types */
1496 VarSetOps::Iter iter(compiler, exceptVars);
1497 unsigned varIndex = 0;
1498 while (iter.NextElem(&varIndex))
1500 unsigned varNum = compiler->lvaTrackedToVarNum[varIndex];
1501 LclVarDsc* varDsc = compiler->lvaTable + varNum;
1503 compiler->lvaSetVarDoNotEnregister(varNum DEBUGARG(Compiler::DNER_LiveInOutOfHandler));
1505 if (varTypeIsGC(varDsc))
1507 if (VarSetOps::IsMember(compiler, finallyVars, varIndex) && !varDsc->lvIsParam)
1509 varDsc->lvMustInit = true;
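The dataflow in the method above is just unions of per-block live sets, keyed by the block's EH role. A hedged miniature with `std::bitset` (the block kinds, set size, and all names here are invented for illustration, not the JIT's `VARSET_TP` machinery):

```cpp
#include <bitset>
#include <cassert>
#include <vector>

constexpr size_t kMaxTrackedVars = 64; // hypothetical tracked-variable budget

enum class HandlerKind
{
    None,
    Catch,      // stand-in for bbCatchTyp != BBCT_NONE
    FilterRet,  // stand-in for BBJ_EHFILTERRET
    FinallyRet  // stand-in for BBJ_EHFINALLYRET
};

struct MiniEHBlock
{
    HandlerKind                  kind = HandlerKind::None;
    std::bitset<kMaxTrackedVars> liveIn;
    std::bitset<kMaxTrackedVars> liveOut;
};

// Collect the variables that are live into a handler, or live out of a
// filter/finally -- these are the ones LSRA will refuse to enregister.
std::bitset<kMaxTrackedVars> collectExceptVars(const std::vector<MiniEHBlock>& blocks)
{
    std::bitset<kMaxTrackedVars> exceptVars, filterVars, finallyVars;
    for (const MiniEHBlock& block : blocks)
    {
        if (block.kind == HandlerKind::Catch)
        {
            exceptVars |= block.liveIn; // live on entry to handler
        }
        if (block.kind == HandlerKind::FilterRet)
        {
            filterVars |= block.liveOut; // live on exit from filter
        }
        else if (block.kind == HandlerKind::FinallyRet)
        {
            finallyVars |= block.liveOut; // live on exit from finally
        }
    }
    // "slam them all together", as the comment above puts it
    exceptVars |= filterVars;
    exceptVars |= finallyVars;
    return exceptVars;
}
```

The funclet-specific unions (FEATURE_EH_FUNCLETS) follow the same pattern, just with more block kinds feeding the same result set.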
1515 bool LinearScan::isRegCandidate(LclVarDsc* varDsc)
1517 // We shouldn't be called if opt settings do not permit register variables.
1518 assert((compiler->opts.compFlags & CLFLG_REGVAR) != 0);
1520 if (!varDsc->lvTracked)
1525 #if !defined(_TARGET_64BIT_)
1526 if (varDsc->lvType == TYP_LONG)
1528 // Long variables should not be register candidates.
1529 // Lowering will have split any candidate lclVars into lo/hi vars.
1532 #endif // !defined(_TARGET_64BIT_)
1534 // If we have JMP, reg args must be put on the stack
1536 if (compiler->compJmpOpUsed && varDsc->lvIsRegArg)
1541 // Don't allocate registers for dependently promoted struct fields
1542 if (compiler->lvaIsFieldOfDependentlyPromotedStruct(varDsc))
1549 // Identify locals & compiler temps that are register candidates
1550 // TODO-Cleanup: This was cloned from Compiler::lvaSortByRefCount() in lclvars.cpp in order
1551 // to avoid perturbation, but should be merged.
1553 void LinearScan::identifyCandidates()
1555 if (enregisterLocalVars)
1557 // Initialize the set of lclVars that are candidates for register allocation.
1558 VarSetOps::AssignNoCopy(compiler, registerCandidateVars, VarSetOps::MakeEmpty(compiler));
1560 // Initialize the sets of lclVars that are used to determine whether, and for which lclVars,
1561 // we need to perform resolution across basic blocks.
1562 // Note that we can't do this in the constructor because the number of tracked lclVars may
1563 // change between the constructor and the actual allocation.
1564 VarSetOps::AssignNoCopy(compiler, resolutionCandidateVars, VarSetOps::MakeEmpty(compiler));
1565 VarSetOps::AssignNoCopy(compiler, splitOrSpilledVars, VarSetOps::MakeEmpty(compiler));
1567 // We set enregisterLocalVars to true only if there are tracked lclVars
1568 assert(compiler->lvaCount != 0);
1570 else if (compiler->lvaCount == 0)
1572 // Nothing to do. Note that even if enregisterLocalVars is false, we still need to set the
1573 // lvLRACandidate field on all the lclVars to false if we have any.
1577 if (compiler->compHndBBtabCount > 0)
1579 identifyCandidatesExceptionDataflow();
1585 // While we build intervals for the candidate lclVars, we will determine the floating point
1586 // lclVars, if any, to consider for callee-save register preferencing.
1587 // We maintain two sets of FP vars - those that meet the first threshold of weighted ref Count,
1588 // and those that meet the second.
1589 // The first threshold is used for methods that are heuristically deemed either to have light
1590 // fp usage, or other factors that encourage conservative use of callee-save registers, such
1591 // as multiple exits (where there might be an early exit that would be excessively penalized by
1592 // lots of prolog/epilog saves & restores).
1593 // The second threshold is used where there are factors deemed to make it more likely that
1594 // fp callee save registers will be needed, such as loops or many fp vars.
1595 // We keep two sets of vars, since we collect some of the information to determine which set to
1596 // use as we iterate over the vars.
1597 // When we are generating AVX code on non-Unix (FEATURE_PARTIAL_SIMD_CALLEE_SAVE), we maintain an
1598 // additional set of LargeVectorType vars, and there is a separate threshold defined for those.
1599 // It is assumed that if we encounter these, we should consider this a "high use" scenario,
1600 // so we don't maintain two sets of these vars.
1601 // This is defined as thresholdLargeVectorRefCntWtd, as we are likely to use the same mechanism
1602 // for vectors on Arm64, though the actual value may differ.
1604 unsigned int floatVarCount = 0;
1605 unsigned int thresholdFPRefCntWtd = 4 * BB_UNITY_WEIGHT;
1606 unsigned int maybeFPRefCntWtd = 2 * BB_UNITY_WEIGHT;
1607 VARSET_TP fpMaybeCandidateVars(VarSetOps::UninitVal());
1608 #if FEATURE_PARTIAL_SIMD_CALLEE_SAVE
1609 unsigned int largeVectorVarCount = 0;
1610 unsigned int thresholdLargeVectorRefCntWtd = 4 * BB_UNITY_WEIGHT;
1611 #endif // FEATURE_PARTIAL_SIMD_CALLEE_SAVE
1612 if (enregisterLocalVars)
1614 VarSetOps::AssignNoCopy(compiler, fpCalleeSaveCandidateVars, VarSetOps::MakeEmpty(compiler));
1615 VarSetOps::AssignNoCopy(compiler, fpMaybeCandidateVars, VarSetOps::MakeEmpty(compiler));
1616 #if FEATURE_PARTIAL_SIMD_CALLEE_SAVE
1617 VarSetOps::AssignNoCopy(compiler, largeVectorVars, VarSetOps::MakeEmpty(compiler));
1618 VarSetOps::AssignNoCopy(compiler, largeVectorCalleeSaveCandidateVars, VarSetOps::MakeEmpty(compiler));
1619 #endif // FEATURE_PARTIAL_SIMD_CALLEE_SAVE
1622 unsigned refCntStk = 0;
1623 unsigned refCntReg = 0;
1624 unsigned refCntWtdReg = 0;
1625 unsigned refCntStkParam = 0; // sum of ref counts for all stack based parameters
1626 unsigned refCntWtdStkDbl = 0; // sum of wtd ref counts for stack based doubles
1627 doDoubleAlign = false;
1628 bool checkDoubleAlign = true;
1629 if (compiler->codeGen->isFramePointerRequired() || compiler->opts.MinOpts())
1631 checkDoubleAlign = false;
1635 switch (compiler->getCanDoubleAlign())
1637 case MUST_DOUBLE_ALIGN:
1638 doDoubleAlign = true;
1639 checkDoubleAlign = false;
1641 case CAN_DOUBLE_ALIGN:
1643 case CANT_DOUBLE_ALIGN:
1644 doDoubleAlign = false;
1645 checkDoubleAlign = false;
1651 #endif // DOUBLE_ALIGN
1653 // Check whether register variables are permitted.
1654 if (!enregisterLocalVars)
1656 localVarIntervals = nullptr;
1658 else if (compiler->lvaTrackedCount > 0)
1660 // initialize mapping from tracked local to interval
1661 localVarIntervals = new (compiler, CMK_LSRA) Interval*[compiler->lvaTrackedCount];
1664 INTRACK_STATS(regCandidateVarCount = 0);
1665 for (lclNum = 0, varDsc = compiler->lvaTable; lclNum < compiler->lvaCount; lclNum++, varDsc++)
1667 // Initialize all variables to REG_STK
1668 varDsc->lvRegNum = REG_STK;
1669 #ifndef _TARGET_64BIT_
1670 varDsc->lvOtherReg = REG_STK;
1671 #endif // _TARGET_64BIT_
1673 if (!enregisterLocalVars)
1675 varDsc->lvLRACandidate = false;
1680 if (checkDoubleAlign)
1682 if (varDsc->lvIsParam && !varDsc->lvIsRegArg)
1684 refCntStkParam += varDsc->lvRefCnt;
1686 else if (!isRegCandidate(varDsc) || varDsc->lvDoNotEnregister)
1688 refCntStk += varDsc->lvRefCnt;
1689 if ((varDsc->lvType == TYP_DOUBLE) ||
1690 ((varTypeIsStruct(varDsc) && varDsc->lvStructDoubleAlign &&
1691 (compiler->lvaGetPromotionType(varDsc) != Compiler::PROMOTION_TYPE_INDEPENDENT))))
1693 refCntWtdStkDbl += varDsc->lvRefCntWtd;
1698 refCntReg += varDsc->lvRefCnt;
1699 refCntWtdReg += varDsc->lvRefCntWtd;
1702 #endif // DOUBLE_ALIGN
1704 /* Track all locals that can be enregistered */
1706 if (!isRegCandidate(varDsc))
1708 varDsc->lvLRACandidate = 0;
1709 if (varDsc->lvTracked)
1711 localVarIntervals[varDsc->lvVarIndex] = nullptr;
1716 assert(varDsc->lvTracked);
1718 varDsc->lvLRACandidate = 1;
1720 // Start with lvRegister as false - set it true only if the variable gets
1721 // the same register assignment throughout
1722 varDsc->lvRegister = false;
1724 /* If the ref count is zero */
1725 if (varDsc->lvRefCnt == 0)
1727 /* Zero ref count, make this untracked */
1728 varDsc->lvRefCntWtd = 0;
1729 varDsc->lvLRACandidate = 0;
1732 // Variables that are address-exposed are never enregistered, or tracked.
1733 // A struct may be promoted, and a struct that fits in a register may be fully enregistered.
1734 // Pinned variables may not be tracked (a condition of the GCInfo representation)
1735 // or enregistered, on x86 -- it is believed that we can enregister pinned (more properly, "pinning")
1736 // references when using the general GC encoding.
1738 if (varDsc->lvAddrExposed || !varTypeIsEnregisterableStruct(varDsc))
1740 varDsc->lvLRACandidate = 0;
1742 Compiler::DoNotEnregisterReason dner = Compiler::DNER_AddrExposed;
1743 if (!varDsc->lvAddrExposed)
1745 dner = Compiler::DNER_IsStruct;
1748 compiler->lvaSetVarDoNotEnregister(lclNum DEBUGARG(dner));
1750 else if (varDsc->lvPinned)
1752 varDsc->lvTracked = 0;
1753 #ifdef JIT32_GCENCODER
1754 compiler->lvaSetVarDoNotEnregister(lclNum DEBUGARG(Compiler::DNER_PinningRef));
1755 #endif // JIT32_GCENCODER
1758 // Are we not optimizing and do we have exception handlers?
1759 // If so, mark all args and locals as volatile, so that they
1760 // won't ever get enregistered.
1762 if (compiler->opts.MinOpts() && compiler->compHndBBtabCount > 0)
1764 compiler->lvaSetVarDoNotEnregister(lclNum DEBUGARG(Compiler::DNER_LiveInOutOfHandler));
1767 if (varDsc->lvDoNotEnregister)
1769 varDsc->lvLRACandidate = 0;
1770 localVarIntervals[varDsc->lvVarIndex] = nullptr;
1774 var_types type = genActualType(varDsc->TypeGet());
1778 #if CPU_HAS_FP_SUPPORT
1781 if (compiler->opts.compDbgCode)
1783 varDsc->lvLRACandidate = 0;
1786 if (varDsc->lvIsParam && varDsc->lvIsRegArg)
1788 type = (type == TYP_DOUBLE) ? TYP_LONG : TYP_INT;
1790 #endif // ARM_SOFTFP
1792 #endif // CPU_HAS_FP_SUPPORT
1804 if (varDsc->lvPromoted)
1806 varDsc->lvLRACandidate = 0;
1810 // TODO-1stClassStructs: Move TYP_SIMD8 up with the other SIMD types, after handling the param issue
1811 // (passing & returning as TYP_LONG).
1813 #endif // FEATURE_SIMD
1817 varDsc->lvLRACandidate = 0;
1823 noway_assert(!"lvType not set correctly");
1824 varDsc->lvType = TYP_INT;
1829 varDsc->lvLRACandidate = 0;
1832 if (varDsc->lvLRACandidate)
1834 Interval* newInt = newInterval(type);
1835 newInt->setLocalNumber(compiler, lclNum, this);
1836 VarSetOps::AddElemD(compiler, registerCandidateVars, varDsc->lvVarIndex);
1838 // we will set this later when we have determined liveness
1839 varDsc->lvMustInit = false;
1841 if (varDsc->lvIsStructField)
1843 newInt->isStructField = true;
1846 INTRACK_STATS(regCandidateVarCount++);
1848 // We maintain two sets of FP vars - those that meet the first threshold of weighted ref Count,
1849 // and those that meet the second (see the definitions of thresholdFPRefCntWtd and maybeFPRefCntWtd above).
1851 CLANG_FORMAT_COMMENT_ANCHOR;
1853 #if FEATURE_PARTIAL_SIMD_CALLEE_SAVE
1854 // Additionally, when we are generating AVX on non-UNIX amd64, we keep a separate set of the LargeVectorType vars.
1856 if (varTypeNeedsPartialCalleeSave(varDsc->lvType))
1858 largeVectorVarCount++;
1859 VarSetOps::AddElemD(compiler, largeVectorVars, varDsc->lvVarIndex);
1860 unsigned refCntWtd = varDsc->lvRefCntWtd;
1861 if (refCntWtd >= thresholdLargeVectorRefCntWtd)
1863 VarSetOps::AddElemD(compiler, largeVectorCalleeSaveCandidateVars, varDsc->lvVarIndex);
1867 #endif // FEATURE_PARTIAL_SIMD_CALLEE_SAVE
1868 if (regType(type) == FloatRegisterType)
1871 unsigned refCntWtd = varDsc->lvRefCntWtd;
1872 if (varDsc->lvIsRegArg)
1874 // Don't count the initial reference for register params. In those cases,
1875 // using a callee-save causes an extra copy.
1876 refCntWtd -= BB_UNITY_WEIGHT;
1878 if (refCntWtd >= thresholdFPRefCntWtd)
1880 VarSetOps::AddElemD(compiler, fpCalleeSaveCandidateVars, varDsc->lvVarIndex);
1882 else if (refCntWtd >= maybeFPRefCntWtd)
1884 VarSetOps::AddElemD(compiler, fpMaybeCandidateVars, varDsc->lvVarIndex);
1890 localVarIntervals[varDsc->lvVarIndex] = nullptr;
1895 if (checkDoubleAlign)
1897 // TODO-CQ: Fine-tune this:
1898 // In the legacy reg predictor, this runs after allocation, and then demotes any lclVars
1899 // allocated to the frame pointer, which is probably the wrong order.
1900 // However, because it runs after allocation, it can determine the impact of demoting
1901 // the lclVars allocated to the frame pointer.
1902 // => Here, estimate of the EBP refCnt and weighted refCnt is a wild guess.
1904 unsigned refCntEBP = refCntReg / 8;
1905 unsigned refCntWtdEBP = refCntWtdReg / 8;
1908 compiler->shouldDoubleAlign(refCntStk, refCntEBP, refCntWtdEBP, refCntStkParam, refCntWtdStkDbl);
1910 #endif // DOUBLE_ALIGN
1912 // The factors we consider to determine which set of fp vars to use as candidates for callee save
1913 // registers currently include the number of fp vars, whether there are loops, and whether there are
1914 // multiple exits. These have been selected somewhat empirically, but there is probably room for tuning.
1916 CLANG_FORMAT_COMMENT_ANCHOR;
1921 printf("\nFP callee save candidate vars: ");
1922 if (enregisterLocalVars && !VarSetOps::IsEmpty(compiler, fpCalleeSaveCandidateVars))
1924 dumpConvertedVarSet(compiler, fpCalleeSaveCandidateVars);
1934 JITDUMP("floatVarCount = %d; hasLoops = %d, singleExit = %d\n", floatVarCount, compiler->fgHasLoops,
1935 (compiler->fgReturnBlocks == nullptr || compiler->fgReturnBlocks->next == nullptr));
1937 // Determine whether to use the 2nd, more aggressive, threshold for fp callee saves.
1938 if (floatVarCount > 6 && compiler->fgHasLoops &&
1939 (compiler->fgReturnBlocks == nullptr || compiler->fgReturnBlocks->next == nullptr))
1941 assert(enregisterLocalVars);
1945 printf("Adding additional fp callee save candidates: \n");
1946 if (!VarSetOps::IsEmpty(compiler, fpMaybeCandidateVars))
1948 dumpConvertedVarSet(compiler, fpMaybeCandidateVars);
1957 VarSetOps::UnionD(compiler, fpCalleeSaveCandidateVars, fpMaybeCandidateVars);
1964 // Frame layout is only pre-computed for ARM
1965 printf("\nlvaTable after IdentifyCandidates\n");
1966 compiler->lvaTableDump(Compiler::FrameLayoutState::PRE_REGALLOC_FRAME_LAYOUT);
1969 #endif // _TARGET_ARM_
1972 // TODO-Throughput: This mapping can surely be more efficiently done
1973 void LinearScan::initVarRegMaps()
1975 if (!enregisterLocalVars)
1977 inVarToRegMaps = nullptr;
1978 outVarToRegMaps = nullptr;
1981 assert(compiler->lvaTrackedFixed); // We should have already set this to prevent us from adding any new tracked variables.
1984 // The compiler memory allocator requires that the allocation be an
1985 // even multiple of int-sized objects
1986 unsigned int varCount = compiler->lvaTrackedCount;
1987 regMapCount = (unsigned int)roundUp(varCount, sizeof(int));
1989 // Blocks are numbered starting from 1 (bbNum 0 is not used).
1990 // So, if we want to index by bbNum we have to know the maximum value.
1991 unsigned int bbCount = compiler->fgBBNumMax + 1;
1993 inVarToRegMaps = new (compiler, CMK_LSRA) regNumberSmall*[bbCount];
1994 outVarToRegMaps = new (compiler, CMK_LSRA) regNumberSmall*[bbCount];
1998 // This VarToRegMap is used during the resolution of critical edges.
1999 sharedCriticalVarToRegMap = new (compiler, CMK_LSRA) regNumberSmall[regMapCount];
2001 for (unsigned int i = 0; i < bbCount; i++)
2003 VarToRegMap inVarToRegMap = new (compiler, CMK_LSRA) regNumberSmall[regMapCount];
2004 VarToRegMap outVarToRegMap = new (compiler, CMK_LSRA) regNumberSmall[regMapCount];
2006 for (unsigned int j = 0; j < regMapCount; j++)
2008 inVarToRegMap[j] = REG_STK;
2009 outVarToRegMap[j] = REG_STK;
2011 inVarToRegMaps[i] = inVarToRegMap;
2012 outVarToRegMaps[i] = outVarToRegMap;
2017 sharedCriticalVarToRegMap = nullptr;
2018 for (unsigned int i = 0; i < bbCount; i++)
2020 inVarToRegMaps[i] = nullptr;
2021 outVarToRegMaps[i] = nullptr;
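The allocation pattern above (one in-map and one out-map per block, bbNum-indexed from 1 so `fgBBNumMax + 1` slots are needed, every tracked variable starting at REG_STK) can be sketched with standard containers. This is a hedged miniature: `MiniVarRegMaps` and `REG_STK_SENTINEL` are invented names, and small ints stand in for `regNumberSmall`.

```cpp
#include <cassert>
#include <vector>

constexpr int REG_STK_SENTINEL = -1; // hypothetical stand-in for REG_STK

// Per-block in/out var-to-reg maps, every tracked variable initially "on the
// stack". bbNum indexing starts at 1, so maxBBNum + 1 entries are allocated
// (slot 0 is simply unused), mirroring bbCount = fgBBNumMax + 1 above.
struct MiniVarRegMaps
{
    std::vector<std::vector<int>> inMaps;
    std::vector<std::vector<int>> outMaps;

    MiniVarRegMaps(unsigned maxBBNum, unsigned trackedVarCount)
        : inMaps(maxBBNum + 1, std::vector<int>(trackedVarCount, REG_STK_SENTINEL))
        , outMaps(maxBBNum + 1, std::vector<int>(trackedVarCount, REG_STK_SENTINEL))
    {
    }
};
```

The real code additionally rounds the per-map entry count up to an int multiple for the allocator; that detail is orthogonal to the indexing scheme shown here.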
2026 void LinearScan::setInVarRegForBB(unsigned int bbNum, unsigned int varNum, regNumber reg)
2028 assert(enregisterLocalVars);
2029 assert(reg < UCHAR_MAX && varNum < compiler->lvaCount);
2030 inVarToRegMaps[bbNum][compiler->lvaTable[varNum].lvVarIndex] = (regNumberSmall)reg;
2033 void LinearScan::setOutVarRegForBB(unsigned int bbNum, unsigned int varNum, regNumber reg)
2035 assert(enregisterLocalVars);
2036 assert(reg < UCHAR_MAX && varNum < compiler->lvaCount);
2037 outVarToRegMaps[bbNum][compiler->lvaTable[varNum].lvVarIndex] = (regNumberSmall)reg;
2040 LinearScan::SplitEdgeInfo LinearScan::getSplitEdgeInfo(unsigned int bbNum)
2042 assert(enregisterLocalVars);
2043 SplitEdgeInfo splitEdgeInfo;
2044 assert(bbNum <= compiler->fgBBNumMax);
2045 assert(bbNum > bbNumMaxBeforeResolution);
2046 assert(splitBBNumToTargetBBNumMap != nullptr);
2047 splitBBNumToTargetBBNumMap->Lookup(bbNum, &splitEdgeInfo);
2048 assert(splitEdgeInfo.toBBNum <= bbNumMaxBeforeResolution);
2049 assert(splitEdgeInfo.fromBBNum <= bbNumMaxBeforeResolution);
2050 return splitEdgeInfo;
2053 VarToRegMap LinearScan::getInVarToRegMap(unsigned int bbNum)
2055 assert(enregisterLocalVars);
2056 assert(bbNum <= compiler->fgBBNumMax);
2057 // For the blocks inserted to split critical edges, the inVarToRegMap is
2058 // equal to the outVarToRegMap at the "from" block.
2059 if (bbNum > bbNumMaxBeforeResolution)
2061 SplitEdgeInfo splitEdgeInfo = getSplitEdgeInfo(bbNum);
2062 unsigned fromBBNum = splitEdgeInfo.fromBBNum;
2065 assert(splitEdgeInfo.toBBNum != 0);
2066 return inVarToRegMaps[splitEdgeInfo.toBBNum];
2070 return outVarToRegMaps[fromBBNum];
2074 return inVarToRegMaps[bbNum];
2077 VarToRegMap LinearScan::getOutVarToRegMap(unsigned int bbNum)
2079 assert(enregisterLocalVars);
2080 assert(bbNum <= compiler->fgBBNumMax);
2081 // For the blocks inserted to split critical edges, the outVarToRegMap is
2082 // equal to the inVarToRegMap at the target.
2083 if (bbNum > bbNumMaxBeforeResolution)
2085 // If this is an empty block, its in and out maps are both the same.
2086 // We identify this case by setting fromBBNum or toBBNum to 0, and using only the other.
2087 SplitEdgeInfo splitEdgeInfo = getSplitEdgeInfo(bbNum);
2088 unsigned toBBNum = splitEdgeInfo.toBBNum;
2091 assert(splitEdgeInfo.fromBBNum != 0);
2092 return outVarToRegMaps[splitEdgeInfo.fromBBNum];
2096 return inVarToRegMaps[toBBNum];
2099 return outVarToRegMaps[bbNum];
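The aliasing rule in the two accessors above (a resolution-inserted split block has no maps of its own: its in-map is the from-block's out-map and its out-map is the to-block's in-map, with a zero fromBBNum/toBBNum redirecting to the other side for empty blocks) can be modeled compactly. This is a hedged sketch: `MiniResolver`, `MiniSplitInfo`, and the int-vector maps are invented stand-ins for the LSRA types.

```cpp
#include <cassert>
#include <map>
#include <vector>

using MiniMap = std::vector<int>; // hypothetical VarToRegMap: one reg per tracked var

struct MiniSplitInfo
{
    unsigned fromBBNum; // 0 means "use toBBNum's in-map instead"
    unsigned toBBNum;   // 0 means "use fromBBNum's out-map instead"
};

struct MiniResolver
{
    std::map<unsigned, MiniMap>       inMaps;
    std::map<unsigned, MiniMap>       outMaps;
    std::map<unsigned, MiniSplitInfo> splitInfo; // only split blocks appear here
    unsigned                          maxBBNumBeforeResolution = 0;

    // In-map of a split block == out-map of its "from" block.
    const MiniMap& getInMap(unsigned bbNum)
    {
        if (bbNum > maxBBNumBeforeResolution)
        {
            const MiniSplitInfo& info = splitInfo.at(bbNum);
            return (info.fromBBNum == 0) ? inMaps.at(info.toBBNum) : outMaps.at(info.fromBBNum);
        }
        return inMaps.at(bbNum);
    }

    // Out-map of a split block == in-map of its "to" block.
    const MiniMap& getOutMap(unsigned bbNum)
    {
        if (bbNum > maxBBNumBeforeResolution)
        {
            const MiniSplitInfo& info = splitInfo.at(bbNum);
            return (info.toBBNum == 0) ? outMaps.at(info.fromBBNum) : inMaps.at(info.toBBNum);
        }
        return outMaps.at(bbNum);
    }
};
```

The sharing means resolution moves written "into" a split block are automatically visible as the from-block's outgoing state, with no extra copying.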
2102 //------------------------------------------------------------------------
2103 // setVarReg: Set the register associated with a variable in the given 'bbVarToRegMap'.
2106 // bbVarToRegMap - the map of interest
2107 // trackedVarIndex - the lvVarIndex for the variable
2108 // reg - the register to which it is being mapped
2113 void LinearScan::setVarReg(VarToRegMap bbVarToRegMap, unsigned int trackedVarIndex, regNumber reg)
2115 assert(trackedVarIndex < compiler->lvaTrackedCount);
2116 regNumberSmall regSmall = (regNumberSmall)reg;
2117 assert((regNumber)regSmall == reg);
2118 bbVarToRegMap[trackedVarIndex] = regSmall;
2121 //------------------------------------------------------------------------
2122 // getVarReg: Get the register associated with a variable in the given 'bbVarToRegMap'.
2125 // bbVarToRegMap - the map of interest
2126 // trackedVarIndex - the lvVarIndex for the variable
2129 // The register to which 'trackedVarIndex' is mapped
2131 regNumber LinearScan::getVarReg(VarToRegMap bbVarToRegMap, unsigned int trackedVarIndex)
2133 assert(enregisterLocalVars);
2134 assert(trackedVarIndex < compiler->lvaTrackedCount);
2135 return (regNumber)bbVarToRegMap[trackedVarIndex];
2138 // Initialize the incoming VarToRegMap to the given map values (generally a predecessor of
2140 VarToRegMap LinearScan::setInVarToRegMap(unsigned int bbNum, VarToRegMap srcVarToRegMap)
2142 assert(enregisterLocalVars);
2143 VarToRegMap inVarToRegMap = inVarToRegMaps[bbNum];
2144 memcpy(inVarToRegMap, srcVarToRegMap, (regMapCount * sizeof(regNumber)));
2145 return inVarToRegMap;
2148 //------------------------------------------------------------------------
2149 // checkLastUses: Check correctness of last use flags
2152 // The block for which we are checking last uses.
2155 // This does a backward walk of the RefPositions, starting from the liveOut set.
2156 // This method was previously used to set the last uses, which were computed by
2157 // liveness, but were not create in some cases of multiple lclVar references in the
2158 // same tree. However, now that last uses are computed as RefPositions are created,
2159 // that is no longer necessary, and this method is simply retained as a check.
2160 // The exception to the check-only behavior is when LSRA_EXTEND_LIFETIMES is set via
2161 // COMPlus_JitStressRegs. In that case, this method is required because, even though
2162 // the RefPositions will not be marked lastUse, we still need to correctly
2163 // mark the last uses on the tree nodes, which is done by this method.
2166 void LinearScan::checkLastUses(BasicBlock* block)
2170 JITDUMP("\n\nCHECKING LAST USES for block %u, liveout=", block->bbNum);
2171 dumpConvertedVarSet(compiler, block->bbLiveOut);
2172 JITDUMP("\n==============================\n");
2175 unsigned keepAliveVarNum = BAD_VAR_NUM;
2176 if (compiler->lvaKeepAliveAndReportThis())
2178 keepAliveVarNum = compiler->info.compThisArg;
2179 assert(compiler->info.compIsStatic == false);
2182 // find which uses are lastUses
2184 // Work backwards starting with live out.
2185 // 'computedLive' is updated to include any exposed use (including those in this
2186 // block that we've already seen). When we encounter a use, if it's
2187 // not in that set, then it's a last use.
2189 VARSET_TP computedLive(VarSetOps::MakeCopy(compiler, block->bbLiveOut));
2191 bool foundDiff = false;
2192 RefPositionReverseIterator reverseIterator = refPositions.rbegin();
2193 RefPosition* currentRefPosition;
2194 for (currentRefPosition = &reverseIterator; currentRefPosition->refType != RefTypeBB;
2195 reverseIterator++, currentRefPosition = &reverseIterator)
2197 // We should never see ParamDefs or ZeroInits within a basic block.
2198 assert(currentRefPosition->refType != RefTypeParamDef && currentRefPosition->refType != RefTypeZeroInit);
2199 if (currentRefPosition->isIntervalRef() && currentRefPosition->getInterval()->isLocalVar)
2201 unsigned varNum = currentRefPosition->getInterval()->varNum;
2202 unsigned varIndex = currentRefPosition->getInterval()->getVarIndex(compiler);
2204 LsraLocation loc = currentRefPosition->nodeLocation;
2206 // We should always have a tree node for a localVar, except for the "special" RefPositions.
2207 GenTree* tree = currentRefPosition->treeNode;
2208 assert(tree != nullptr || currentRefPosition->refType == RefTypeExpUse ||
2209 currentRefPosition->refType == RefTypeDummyDef);
2211 if (!VarSetOps::IsMember(compiler, computedLive, varIndex) && varNum != keepAliveVarNum)
2213 // There was no exposed use, so this is a "last use" (and we mark it thus even if it's a def)
2215 if (extendLifetimes())
2217 // NOTE: this is a bit of a hack. When extending lifetimes, the "last use" bit will be clear.
2218 // This bit, however, would normally be used during resolveLocalRef to set the value of
2219 // GTF_VAR_DEATH on the node for a ref position. If this bit is not set correctly even when
2220 // extending lifetimes, the code generator will assert as it expects to have accurate last
2221 // use information. To avoid these asserts, set the GTF_VAR_DEATH bit here.
2222 // Note also that extendLifetimes() is an LSRA stress mode, so it will only be true for
2223 // Checked or Debug builds, for which this method will be executed.
2224 if (tree != nullptr)
2226 tree->gtFlags |= GTF_VAR_DEATH;
2229 else if (!currentRefPosition->lastUse)
2231 JITDUMP("missing expected last use of V%02u @%u\n", compiler->lvaTrackedToVarNum[varIndex], loc);
2234 VarSetOps::AddElemD(compiler, computedLive, varIndex);
2236 else if (currentRefPosition->lastUse)
2238 JITDUMP("unexpected last use of V%02u @%u\n", compiler->lvaTrackedToVarNum[varIndex], loc);
2241 else if (extendLifetimes() && tree != nullptr)
2243 // NOTE: see the comment above re: the extendLifetimes hack.
2244 tree->gtFlags &= ~GTF_VAR_DEATH;
2247 if (currentRefPosition->refType == RefTypeDef || currentRefPosition->refType == RefTypeDummyDef)
2249 VarSetOps::RemoveElemD(compiler, computedLive, varIndex);
2253 assert(reverseIterator != refPositions.rend());
2256 VARSET_TP liveInNotComputedLive(VarSetOps::Diff(compiler, block->bbLiveIn, computedLive));
2258 VarSetOps::Iter liveInNotComputedLiveIter(compiler, liveInNotComputedLive);
2259 unsigned liveInNotComputedLiveIndex = 0;
2260 while (liveInNotComputedLiveIter.NextElem(&liveInNotComputedLiveIndex))
2262 unsigned varNum = compiler->lvaTrackedToVarNum[liveInNotComputedLiveIndex];
2263 if (compiler->lvaTable[varNum].lvLRACandidate)
2265 JITDUMP("BB%02u: V%02u is in LiveIn set, but not computed live.\n", block->bbNum, varNum);
2270 VarSetOps::DiffD(compiler, computedLive, block->bbLiveIn);
2271 const VARSET_TP& computedLiveNotLiveIn(computedLive); // reuse the buffer.
2272 VarSetOps::Iter computedLiveNotLiveInIter(compiler, computedLiveNotLiveIn);
2273 unsigned computedLiveNotLiveInIndex = 0;
2274 while (computedLiveNotLiveInIter.NextElem(&computedLiveNotLiveInIndex))
2276 unsigned varNum = compiler->lvaTrackedToVarNum[computedLiveNotLiveInIndex];
2277 if (compiler->lvaTable[varNum].lvLRACandidate)
2279 JITDUMP("BB%02u: V%02u is computed live, but not in LiveIn set.\n", block->bbNum, varNum);
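The backward walk performed by checkLastUses can be sketched in isolation. This is a hypothetical, self-contained model (the `Ref` struct, `markLastUses`, and the fixed-size bitset are illustration-only stand-ins, not JIT types): walking references in reverse from the live-out set, any reference to a variable not yet in the computed-live set is a last use (even a dead def), and a def removes its variable from the set.

```cpp
#include <bitset>
#include <vector>

// Illustration-only stand-in for the Interval/RefPosition state consulted
// during the backward walk.
struct Ref
{
    unsigned varIndex;
    bool     isDef;
    bool     lastUse;
};

// Walk refs in reverse, mirroring checkLastUses: a ref to a variable that
// is not in computedLive has no exposed use after it, so it is a last use
// (marked even for a def, i.e. a dead def). The ref then joins
// computedLive, and a def removes its variable again.
static void markLastUses(std::vector<Ref>& refs, std::bitset<64> liveOut)
{
    std::bitset<64> computedLive = liveOut;
    for (auto it = refs.rbegin(); it != refs.rend(); ++it)
    {
        if (!computedLive.test(it->varIndex))
        {
            it->lastUse = true; // no exposed use after this point
        }
        computedLive.set(it->varIndex);
        if (it->isDef)
        {
            computedLive.reset(it->varIndex);
        }
    }
}
```

With a def followed by two uses of the same variable and an empty live-out set, only the final use is marked as a last use.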
2288 //------------------------------------------------------------------------
2289 // findPredBlockForLiveIn: Determine which block should be used for the register locations of the live-in variables.
2292 // block - The block for which we're selecting a predecessor.
2293 // prevBlock - The previous block in allocation order.
2294 // pPredBlockIsAllocated - A debug-only argument that indicates whether any of the predecessors have been seen
2295 // in allocation order.
2298 // The selected predecessor.
2301 // in DEBUG, caller initializes *pPredBlockIsAllocated to false, and it will be set to true if the block
2302 // returned is in fact a predecessor.
2305 // This will select a predecessor based on the heuristics obtained by getLsraBlockBoundaryLocations(), which can be
2307 // LSRA_BLOCK_BOUNDARY_PRED - Use the register locations of a predecessor block (default)
2308 // LSRA_BLOCK_BOUNDARY_LAYOUT - Use the register locations of the previous block in layout order.
2309 // This is the only case where this actually returns a different block.
2310 // LSRA_BLOCK_BOUNDARY_ROTATE - Rotate the register locations from a predecessor.
2311 // For this case, the block returned is the same as for LSRA_BLOCK_BOUNDARY_PRED, but
2312 // the register locations will be "rotated" to stress the resolution and allocation
2315 BasicBlock* LinearScan::findPredBlockForLiveIn(BasicBlock* block,
2316 BasicBlock* prevBlock DEBUGARG(bool* pPredBlockIsAllocated))
2318 BasicBlock* predBlock = nullptr;
2320 assert(*pPredBlockIsAllocated == false);
2321 if (getLsraBlockBoundaryLocations() == LSRA_BLOCK_BOUNDARY_LAYOUT)
2323 if (prevBlock != nullptr)
2325 predBlock = prevBlock;
2330 if (block != compiler->fgFirstBB)
2332 predBlock = block->GetUniquePred(compiler);
2333 if (predBlock != nullptr)
2335 if (isBlockVisited(predBlock))
2337 if (predBlock->bbJumpKind == BBJ_COND)
2339 // Special handling to improve matching on backedges.
2340 BasicBlock* otherBlock = (block == predBlock->bbNext) ? predBlock->bbJumpDest : predBlock->bbNext;
2341 noway_assert(otherBlock != nullptr);
2342 if (isBlockVisited(otherBlock))
2344 // This is the case when we have a conditional branch where one target has already
2345 // been visited. It would be best to use the same incoming regs as that block,
2346 // so that we are less likely to have to move registers.
2347 // For example, in determining the block to use for the starting register locations for
2348 // "block" in the following example, we'd like to use the same predecessor for "block"
2349 // as for "otherBlock", so that both successors of predBlock have the same locations, reducing
2350 // the likelihood of needing a split block on a backedge:
2361 for (flowList* pred = otherBlock->bbPreds; pred != nullptr; pred = pred->flNext)
2363 BasicBlock* otherPred = pred->flBlock;
2364 if (otherPred->bbNum == blockInfo[otherBlock->bbNum].predBBNum)
2366 predBlock = otherPred;
2375 predBlock = nullptr;
2380 for (flowList* pred = block->bbPreds; pred != nullptr; pred = pred->flNext)
2382 BasicBlock* candidatePredBlock = pred->flBlock;
2383 if (isBlockVisited(candidatePredBlock))
2385 if (predBlock == nullptr || predBlock->bbWeight < candidatePredBlock->bbWeight)
2387 predBlock = candidatePredBlock;
2388 INDEBUG(*pPredBlockIsAllocated = true;)
2393 if (predBlock == nullptr)
2395 predBlock = prevBlock;
2396 assert(predBlock != nullptr);
2397 JITDUMP("\n\nNo allocated predecessor; ");
2404 void LinearScan::dumpVarRefPositions(const char* title)
2406 if (enregisterLocalVars)
2408 printf("\nVAR REFPOSITIONS %s\n", title);
2410 for (unsigned i = 0; i < compiler->lvaCount; i++)
2412 printf("--- V%02u\n", i);
2414 LclVarDsc* varDsc = compiler->lvaTable + i;
2415 if (varDsc->lvIsRegCandidate())
2417 Interval* interval = getIntervalForLocalVar(varDsc->lvVarIndex);
2418 for (RefPosition* ref = interval->firstRefPosition; ref != nullptr; ref = ref->nextRefPosition)
2430 // Set the default rpFrameType based upon codeGen->isFramePointerRequired()
2431 // This was lifted from the register predictor
2433 void LinearScan::setFrameType()
2435 FrameType frameType = FT_NOT_SET;
2437 compiler->codeGen->setDoubleAlign(false);
2440 frameType = FT_DOUBLE_ALIGN_FRAME;
2441 compiler->codeGen->setDoubleAlign(true);
2444 #endif // DOUBLE_ALIGN
2445 if (compiler->codeGen->isFramePointerRequired())
2447 frameType = FT_EBP_FRAME;
2451 if (compiler->rpMustCreateEBPCalled == false)
2456 compiler->rpMustCreateEBPCalled = true;
2457 if (compiler->rpMustCreateEBPFrame(INDEBUG(&reason)))
2459 JITDUMP("; Decided to create an EBP based frame for ETW stackwalking (%s)\n", reason);
2460 compiler->codeGen->setFrameRequired(true);
2464 if (compiler->codeGen->isFrameRequired())
2466 frameType = FT_EBP_FRAME;
2470 frameType = FT_ESP_FRAME;
2477 noway_assert(!compiler->codeGen->isFramePointerRequired());
2478 noway_assert(!compiler->codeGen->isFrameRequired());
2479 compiler->codeGen->setFramePointerUsed(false);
2482 compiler->codeGen->setFramePointerUsed(true);
2485 case FT_DOUBLE_ALIGN_FRAME:
2486 noway_assert(!compiler->codeGen->isFramePointerRequired());
2487 compiler->codeGen->setFramePointerUsed(false);
2489 #endif // DOUBLE_ALIGN
2491 noway_assert(!"rpFrameType not set correctly!");
2495 // If we are using FPBASE as the frame register, we cannot also use it for
2496 // a local var. Note that we may have already added it to the register masks,
2497 // which are computed when the LinearScan object is constructed, and
2498 // used during lowering. Luckily, the TreeNodeInfo only stores an index to
2499 // the masks stored in the LinearScan class, so we only need to walk the
2500 // unique masks and remove FPBASE.
2501 regMaskTP removeMask = RBM_NONE;
2502 if (frameType == FT_EBP_FRAME)
2504 removeMask |= RBM_FPBASE;
2507 compiler->rpFrameType = frameType;
2509 #ifdef _TARGET_ARMARCH_
2510 // Determine whether we need to reserve a register for large lclVar offsets.
2511 if (compiler->compRsvdRegCheck(Compiler::REGALLOC_FRAME_LAYOUT))
2513 // We reserve R10/IP1 in this case to hold the offsets in load/store instructions
2514 compiler->codeGen->regSet.rsMaskResvd |= RBM_OPT_RSVD;
2515 assert(REG_OPT_RSVD != REG_FP);
2516 JITDUMP(" Reserved REG_OPT_RSVD (%s) due to large frame\n", getRegName(REG_OPT_RSVD));
2517 removeMask |= RBM_OPT_RSVD;
2519 #endif // _TARGET_ARMARCH_
2521 if ((removeMask != RBM_NONE) && ((availableIntRegs & removeMask) != 0))
2523 RemoveRegistersFromMasks(removeMask);
2524 // We know that we're already in "read mode" for availableIntRegs. However,
2525 // we need to remove these registers, so subsequent users (like callers
2526 // to allRegs()) get the right thing. The RemoveRegistersFromMasks() code
2527 // fixes up everything that already took a dependency on the value that was
2528 // previously read, so this completes the picture.
2529 availableIntRegs.OverrideAssign(availableIntRegs & ~removeMask);
2533 //------------------------------------------------------------------------
2534 // copyOrMoveRegInUse: Is 'ref' a copyReg/moveReg that is still busy at the given location?
2537 // ref: The RefPosition of interest
2538 // loc: The LsraLocation at which we're determining whether it's busy.
2541 // true iff 'ref' is active at the given location
2543 bool copyOrMoveRegInUse(RefPosition* ref, LsraLocation loc)
2545 if (!ref->copyReg && !ref->moveReg)
2549 if (ref->getRefEndLocation() >= loc)
2553 Interval* interval = ref->getInterval();
2554 RefPosition* nextRef = interval->getNextRefPosition();
2555 if (nextRef != nullptr && nextRef->treeNode == ref->treeNode && nextRef->getRefEndLocation() >= loc)
2562 // Determine whether the register represented by "physRegRecord" is available at least
2563 // at the "currentLoc", and if so, return the next location at which it is in use in
2564 // "nextRefLocationPtr"
2566 bool LinearScan::registerIsAvailable(RegRecord* physRegRecord,
2567 LsraLocation currentLoc,
2568 LsraLocation* nextRefLocationPtr,
2569 RegisterType regType)
2571 *nextRefLocationPtr = MaxLocation;
2572 LsraLocation nextRefLocation = MaxLocation;
2573 regMaskTP regMask = genRegMask(physRegRecord->regNum);
2574 if (physRegRecord->isBusyUntilNextKill)
2579 RefPosition* nextPhysReference = physRegRecord->getNextRefPosition();
2580 if (nextPhysReference != nullptr)
2582 nextRefLocation = nextPhysReference->nodeLocation;
2583 // if (nextPhysReference->refType == RefTypeFixedReg) nextRefLocation--;
2585 else if (!physRegRecord->isCalleeSave)
2587 nextRefLocation = MaxLocation - 1;
2590 Interval* assignedInterval = physRegRecord->assignedInterval;
2592 if (assignedInterval != nullptr)
2594 RefPosition* recentReference = assignedInterval->recentRefPosition;
2596 // The only case where we have an assignedInterval, but recentReference is null
2597 // is where this interval is live at procedure entry (i.e. an arg register), in which
2598 // case it's still live and its assigned register is not available
2599 // (Note that the ParamDef will be recorded as a recentReference when we encounter
2600 // it, but we will be allocating registers, potentially to other incoming parameters,
2601 // as we process the ParamDefs.)
2603 if (recentReference == nullptr)
2608 // Is this a copyReg/moveReg? It is if the register assignment doesn't match.
2609 // (the recentReference may not be a copyReg/moveReg, because we could have seen another
2610 // reference since the copyReg/moveReg)
2612 if (!assignedInterval->isAssignedTo(physRegRecord->regNum))
2614 // If the recentReference is for a different register, it can be reassigned, but
2615 // otherwise don't reassign it if it's still in use.
2616 // (Note that it is unlikely that we have a recent copy or move to a different register,
2617 // where this physRegRecord is still pointing at an earlier copy or move, but it is possible,
2618 // especially in stress modes.)
2619 if ((recentReference->registerAssignment == regMask) && copyOrMoveRegInUse(recentReference, currentLoc))
2624 else if (!assignedInterval->isActive && assignedInterval->isConstant)
2626 // Treat this as unassigned, i.e. do nothing.
2627 // TODO-CQ: Consider adjusting the heuristics (probably in the caller of this method)
2628 // to avoid reusing these registers.
2630 // If this interval isn't active, it's available if it isn't referenced
2631 // at this location (or the previous location, if the recent RefPosition
2632 // is a delayRegFree).
2633 else if (!assignedInterval->isActive &&
2634 (recentReference->refType == RefTypeExpUse || recentReference->getRefEndLocation() < currentLoc))
2636 // This interval must have a next reference (otherwise it wouldn't be assigned to this register)
2637 RefPosition* nextReference = recentReference->nextRefPosition;
2638 if (nextReference != nullptr)
2640 if (nextReference->nodeLocation < nextRefLocation)
2642 nextRefLocation = nextReference->nodeLocation;
2647 assert(recentReference->copyReg && recentReference->registerAssignment != regMask);
2655 if (nextRefLocation < *nextRefLocationPtr)
2657 *nextRefLocationPtr = nextRefLocation;
2661 if (regType == TYP_DOUBLE)
2663 // Recurse, but check the other half this time (TYP_FLOAT)
2664 if (!registerIsAvailable(findAnotherHalfRegRec(physRegRecord), currentLoc, nextRefLocationPtr, TYP_FLOAT))
2666 nextRefLocation = *nextRefLocationPtr;
2668 #endif // _TARGET_ARM_
2670 return (nextRefLocation >= currentLoc);
2673 //------------------------------------------------------------------------
2674 // getRegisterType: Get the RegisterType to use for the given RefPosition
2677 // currentInterval: The interval for the current allocation
2678 // refPosition: The RefPosition of the current Interval for which a register is being allocated
2681 // The RegisterType that should be allocated for this RefPosition
2684 // This will nearly always be identical to the registerType of the interval, except in the case
2685 // of SIMD types of 8 bytes (currently only Vector2) when they are passed and returned in integer
2686 // registers, or copied to a return temp.
2687 // This method need only be called in situations where we may be dealing with the register requirements
2688 // of a RefTypeUse RefPosition (i.e. not when we are only looking at the type of an interval, nor when
2689 // we are interested in the "defining" type of the interval). This is because the situation of interest
2690 // only happens at the use (where it must be copied to an integer register).
2692 RegisterType LinearScan::getRegisterType(Interval* currentInterval, RefPosition* refPosition)
2694 assert(refPosition->getInterval() == currentInterval);
2695 RegisterType regType = currentInterval->registerType;
2696 regMaskTP candidates = refPosition->registerAssignment;
2698 assert((candidates & allRegs(regType)) != RBM_NONE);
2702 //------------------------------------------------------------------------
2703 // isMatchingConstant: Check to see whether a given register contains the constant referenced
2704 // by the given RefPosition
2707 // physRegRecord: The RegRecord for the register we're interested in.
2708 // refPosition: The RefPosition for a constant interval.
2711 // True iff the register was defined by an identical constant node as the current interval.
2713 bool LinearScan::isMatchingConstant(RegRecord* physRegRecord, RefPosition* refPosition)
2715 if ((physRegRecord->assignedInterval == nullptr) || !physRegRecord->assignedInterval->isConstant)
2719 noway_assert(refPosition->treeNode != nullptr);
2720 GenTree* otherTreeNode = physRegRecord->assignedInterval->firstRefPosition->treeNode;
2721 noway_assert(otherTreeNode != nullptr);
2723 if (refPosition->treeNode->OperGet() == otherTreeNode->OperGet())
2725 switch (otherTreeNode->OperGet())
2728 if ((refPosition->treeNode->AsIntCon()->IconValue() == otherTreeNode->AsIntCon()->IconValue()) &&
2729 (varTypeGCtype(refPosition->treeNode) == varTypeGCtype(otherTreeNode)))
2731 #ifdef _TARGET_64BIT_
2732 // If the constant is negative, only reuse registers of the same type.
2733 // This is because, on a 64-bit system, we do not sign-extend immediates in registers to
2734 // 64-bits unless they are actually longs, as this requires a longer instruction.
2735 // This doesn't apply to a 32-bit system, on which long values occupy multiple registers.
2736 // (We could sign-extend, but we would have to always sign-extend, because if we reuse more
2737 // than once, we won't have access to the instruction that originally defines the constant).
2738 if ((refPosition->treeNode->TypeGet() == otherTreeNode->TypeGet()) ||
2739 (refPosition->treeNode->AsIntCon()->IconValue() >= 0))
2740 #endif // _TARGET_64BIT_
2748 // For floating point constants, the values must be identical, not simply compare
2749 // equal. So we compare the bits.
2750 if (refPosition->treeNode->AsDblCon()->isBitwiseEqual(otherTreeNode->AsDblCon()) &&
2751 (refPosition->treeNode->TypeGet() == otherTreeNode->TypeGet()))
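The bitwise comparison above matters because operator== on doubles is not an identity test. A minimal standalone sketch (`bitwiseEqual` is a hypothetical helper, not the JIT's `isBitwiseEqual`): +0.0 and -0.0 compare equal yet encode differently, so reusing a register holding one constant for the other would silently change the sign bit.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper illustrating why floating-point constants are
// compared by their bit patterns: copy the 64-bit encodings out and
// compare them as integers.
static bool bitwiseEqual(double a, double b)
{
    uint64_t aBits;
    uint64_t bBits;
    std::memcpy(&aBits, &a, sizeof(aBits));
    std::memcpy(&bBits, &b, sizeof(bBits));
    return aBits == bBits;
}
```

Using memcpy (rather than a pointer cast) keeps the bit inspection well-defined in C++.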
2764 //------------------------------------------------------------------------
2765 // tryAllocateFreeReg: Find a free register that satisfies the requirements for refPosition,
2766 // taking into account the preferences for the given Interval
2769 // currentInterval: The interval for the current allocation
2770 // refPosition: The RefPosition of the current Interval for which a register is being allocated
2773 // The regNumber, if any, allocated to the RefPosition. Returns REG_NA if no free register is found.
2776 // TODO-CQ: Consider whether we need to use a different order for tree temps than for vars, as
2779 static const regNumber lsraRegOrder[] = {REG_VAR_ORDER};
2780 const unsigned lsraRegOrderSize = ArrLen(lsraRegOrder);
2781 static const regNumber lsraRegOrderFlt[] = {REG_VAR_ORDER_FLT};
2782 const unsigned lsraRegOrderFltSize = ArrLen(lsraRegOrderFlt);
2784 regNumber LinearScan::tryAllocateFreeReg(Interval* currentInterval, RefPosition* refPosition)
2786 regNumber foundReg = REG_NA;
2788 RegisterType regType = getRegisterType(currentInterval, refPosition);
2789 const regNumber* regOrder;
2790 unsigned regOrderSize;
2791 if (useFloatReg(regType))
2793 regOrder = lsraRegOrderFlt;
2794 regOrderSize = lsraRegOrderFltSize;
2798 regOrder = lsraRegOrder;
2799 regOrderSize = lsraRegOrderSize;
2802 LsraLocation currentLocation = refPosition->nodeLocation;
2803 RefPosition* nextRefPos = refPosition->nextRefPosition;
2804 LsraLocation nextLocation = (nextRefPos == nullptr) ? currentLocation : nextRefPos->nodeLocation;
2805 regMaskTP candidates = refPosition->registerAssignment;
2806 regMaskTP preferences = currentInterval->registerPreferences;
2808 if (RefTypeIsDef(refPosition->refType))
2810 if (currentInterval->hasConflictingDefUse)
2812 resolveConflictingDefAndUse(currentInterval, refPosition);
2813 candidates = refPosition->registerAssignment;
2815 // Otherwise, check for the case of a fixed-reg def of a reg that will be killed before the
2816 // use, or interferes at the point of use (which shouldn't happen, but Lower doesn't mark
2817 // the contained nodes as interfering).
2818 // Note that we may have a ParamDef RefPosition that is marked isFixedRegRef, but which
2819 // has had its registerAssignment changed to no longer be a single register.
2820 else if (refPosition->isFixedRegRef && nextRefPos != nullptr && RefTypeIsUse(nextRefPos->refType) &&
2821 !nextRefPos->isFixedRegRef && genMaxOneBit(refPosition->registerAssignment))
2823 regNumber defReg = refPosition->assignedReg();
2824 RegRecord* defRegRecord = getRegisterRecord(defReg);
2826 RefPosition* currFixedRegRefPosition = defRegRecord->recentRefPosition;
2827 assert(currFixedRegRefPosition != nullptr &&
2828 currFixedRegRefPosition->nodeLocation == refPosition->nodeLocation);
2830 // If there is another fixed reference to this register before the use, change the candidates
2831 // on this RefPosition to include that of nextRefPos.
2832 if (currFixedRegRefPosition->nextRefPosition != nullptr &&
2833 currFixedRegRefPosition->nextRefPosition->nodeLocation <= nextRefPos->getRefEndLocation())
2835 candidates |= nextRefPos->registerAssignment;
2836 if (preferences == refPosition->registerAssignment)
2838 preferences = candidates;
2844 preferences &= candidates;
2845 if (preferences == RBM_NONE)
2847 preferences = candidates;
2849 regMaskTP relatedPreferences = RBM_NONE;
2852 candidates = stressLimitRegs(refPosition, candidates);
2854 assert(candidates != RBM_NONE);
2856 // If the related interval has no further references, it is possible that it is a source of the
2857 // node that produces this interval. However, we don't want to use the relatedInterval for preferencing
2858 // if its next reference is not a new definition (as it either is or will become live).
2859 Interval* relatedInterval = currentInterval->relatedInterval;
2860 if (relatedInterval != nullptr)
2862 RefPosition* nextRelatedRefPosition = relatedInterval->getNextRefPosition();
2863 if (nextRelatedRefPosition != nullptr)
2865 // Don't use the relatedInterval for preferencing if its next reference is not a new definition,
2866 // or if it is only related because they are multi-reg targets of the same node.
2867 if (!RefTypeIsDef(nextRelatedRefPosition->refType) ||
2868 isMultiRegRelated(nextRelatedRefPosition, refPosition->nodeLocation))
2870 relatedInterval = nullptr;
2872 // Is the relatedInterval not assigned and simply a copy to another relatedInterval?
2873 else if ((relatedInterval->assignedReg == nullptr) && (relatedInterval->relatedInterval != nullptr) &&
2874 (nextRelatedRefPosition->nextRefPosition != nullptr) &&
2875 (nextRelatedRefPosition->nextRefPosition->nextRefPosition == nullptr) &&
2876 (nextRelatedRefPosition->nextRefPosition->nodeLocation <
2877 relatedInterval->relatedInterval->getNextRefLocation()))
2879 // The current relatedInterval has only two remaining RefPositions, both of which
2880 // occur prior to the next RefPosition for its relatedInterval.
2881 // It is likely a copy.
2882 relatedInterval = relatedInterval->relatedInterval;
2887 if (relatedInterval != nullptr)
2889 // If the related interval already has an assigned register, then use that
2890 // as the related preference. We'll take the related
2891 // interval preferences into account in the loop over all the registers.
2893 if (relatedInterval->assignedReg != nullptr)
2895 relatedPreferences = genRegMask(relatedInterval->assignedReg->regNum);
2899 relatedPreferences = relatedInterval->registerPreferences;
2903 bool preferCalleeSave = currentInterval->preferCalleeSave;
2905 // For floating point, we want to be less aggressive about using callee-save registers.
2906 // So in that case, we just need to ensure that the current RefPosition is covered.
2907 RefPosition* rangeEndRefPosition;
2908 RefPosition* lastRefPosition = currentInterval->lastRefPosition;
2909 if (useFloatReg(currentInterval->registerType))
2911 rangeEndRefPosition = refPosition;
2915 rangeEndRefPosition = currentInterval->lastRefPosition;
2916 // If we have a relatedInterval that is not currently occupying a register,
2917 // and whose lifetime begins after this one ends,
2918 // we want to try to select a register that will cover its lifetime.
2919 if ((relatedInterval != nullptr) && (relatedInterval->assignedReg == nullptr) &&
2920 (relatedInterval->getNextRefLocation() >= rangeEndRefPosition->nodeLocation))
2922 lastRefPosition = relatedInterval->lastRefPosition;
2923 preferCalleeSave = relatedInterval->preferCalleeSave;
2927 // If this has a delayed use (due to being used in a rmw position of a
2928 // non-commutative operator), its endLocation is delayed until the "def"
2929 // position, which is one location past the use (getRefEndLocation() takes care of this).
2930 LsraLocation rangeEndLocation = rangeEndRefPosition->getRefEndLocation();
2931 LsraLocation lastLocation = lastRefPosition->getRefEndLocation();
2932 regNumber prevReg = REG_NA;
2934 if (currentInterval->assignedReg)
2936 bool useAssignedReg = false;
2937 // This was an interval that was previously allocated to the given
2938 // physical register, and we should try to allocate it to that register
2939 // again, if possible and reasonable.
2940 // Use it preemptively (i.e. before checking other available regs)
2941 // only if it is preferred and available.
2943 RegRecord* regRec = currentInterval->assignedReg;
2944 prevReg = regRec->regNum;
2945 regMaskTP prevRegBit = genRegMask(prevReg);
2947 // Is it in the preferred set of regs?
2948 if ((prevRegBit & preferences) != RBM_NONE)
2950 // Is it currently available?
2951 LsraLocation nextPhysRefLoc;
2952 if (registerIsAvailable(regRec, currentLocation, &nextPhysRefLoc, currentInterval->registerType))
2954 // If the register is next referenced at this location, only use it if
2955 // this has a fixed reg requirement (i.e. this is the reference that caused
2956 // the FixedReg ref to be created)
2958 if (!regRec->conflictingFixedRegReference(refPosition))
2960 useAssignedReg = true;
2966 regNumber foundReg = prevReg;
2967 assignPhysReg(regRec, currentInterval);
2968 refPosition->registerAssignment = genRegMask(foundReg);
2973 // Don't keep trying to allocate to this register
2974 currentInterval->assignedReg = nullptr;
2978 //-------------------------------------------------------------------------
2979 // Register Selection
2981 RegRecord* availablePhysRegInterval = nullptr;
2982 bool unassignInterval = false;
2984 // Each register will receive a score which is the sum of the scoring criteria below.
2985 // These were selected on the assumption that they will have an impact on the "goodness"
2986 // of a register selection, and have been tuned to a certain extent by observing the impact
2987 // of the ordering on asmDiffs. However, there is probably much more room for tuning,
2988 // and perhaps additional criteria.
2990 // These are FLAGS (bits) so that we can easily order them and add them together.
2991 // If the scores are equal, but one covers more of the current interval's range,
2992 // then it wins. Otherwise, the one encountered earlier in the regOrder wins.
2996 VALUE_AVAILABLE = 0x40, // It is a constant value that is already in an acceptable register.
2997 COVERS = 0x20, // It is in the interval's preference set and it covers the entire lifetime.
2998 OWN_PREFERENCE = 0x10, // It is in the preference set of this interval.
2999 COVERS_RELATED = 0x08, // It is in the preference set of the related interval and covers the entire lifetime.
3000 RELATED_PREFERENCE = 0x04, // It is in the preference set of the related interval.
3001 CALLER_CALLEE = 0x02, // It is in the right "set" for the interval (caller or callee-save).
3002 UNASSIGNED = 0x01, // It is not currently assigned to an inactive interval.
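Because each criterion is a distinct power-of-two flag, the summed score compares candidates lexicographically by criterion priority: a single higher-priority flag outweighs every lower-priority flag combined. A standalone sketch (the enum name `RegisterScore` is illustrative; the values are copied from the listing above):

```cpp
// The scoring flags from the listing above: each criterion is a distinct
// power-of-two bit, ordered by priority, so adding them together yields a
// score whose integer comparison is a lexicographic comparison of the
// criteria.
enum RegisterScore
{
    VALUE_AVAILABLE    = 0x40,
    COVERS             = 0x20,
    OWN_PREFERENCE     = 0x10,
    COVERS_RELATED     = 0x08,
    RELATED_PREFERENCE = 0x04,
    CALLER_CALLEE      = 0x02,
    UNASSIGNED         = 0x01,
};
```

This is the same trick used by priority-encoded bit flags generally: ties on the high bits fall through to the lower bits, which is why equal scores are then broken by range coverage and register order.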
3007 // Compute the best possible score so we can stop looping early if we find it.
3008 // TODO-Throughput: At some point we may want to short-circuit the computation of each score, but
3009 // probably not until we've tuned the order of these criteria. At that point,
3010 // we'll need to avoid the short-circuit if we've got a stress option to reverse
3012 int bestPossibleScore = COVERS + UNASSIGNED + OWN_PREFERENCE + CALLER_CALLEE;
3013 if (relatedPreferences != RBM_NONE)
3015 bestPossibleScore |= RELATED_PREFERENCE + COVERS_RELATED;
3018 LsraLocation bestLocation = MinLocation;
3020 // In non-debug builds, this will simply get optimized away
3021 bool reverseSelect = false;
3023 reverseSelect = doReverseSelect();
3026 // An optimization for the common case where there is only one candidate -
3027 // avoid looping over all the other registers
3029 regNumber singleReg = REG_NA;
3031 if (genMaxOneBit(candidates))
3034 singleReg = genRegNumFromMask(candidates);
3035 regOrder = &singleReg;
3038 for (unsigned i = 0; i < regOrderSize && (candidates != RBM_NONE); i++)
3040 regNumber regNum = regOrder[i];
3041 regMaskTP candidateBit = genRegMask(regNum);
3043 if (!(candidates & candidateBit))
3048 candidates &= ~candidateBit;
3050 RegRecord* physRegRecord = getRegisterRecord(regNum);
3053 LsraLocation nextPhysRefLocation = MaxLocation;
3055 // By chance, is this register already holding this interval, as a copyReg or having
3056 // been restored as inactive after a kill?
3057 if (physRegRecord->assignedInterval == currentInterval)
3059 availablePhysRegInterval = physRegRecord;
3060 unassignInterval = false;
3064 // Find the next RefPosition of the physical register
3065 if (!registerIsAvailable(physRegRecord, currentLocation, &nextPhysRefLocation, regType))
3070 // If the register is next referenced at this location, only use it if
3071 // this has a fixed reg requirement (i.e. this is the reference that caused
3072 // the FixedReg ref to be created)
3074 if (physRegRecord->conflictingFixedRegReference(refPosition))
3079 // If this is a definition of a constant interval, check to see if its value is already in this register.
3080 if (currentInterval->isConstant && RefTypeIsDef(refPosition->refType) &&
3081 isMatchingConstant(physRegRecord, refPosition))
3083 score |= VALUE_AVAILABLE;
3086 // If the nextPhysRefLocation is a fixedRef for the rangeEndRefPosition, increment it so that
3087 // we don't mistakenly conclude that it fails to cover the live range.
3088 // This doesn't handle the case where earlier RefPositions for this Interval are also
3089 // FixedRefs of this regNum, but at least those are only interesting in the case where those
3090 // are "local last uses" of the Interval - otherwise the liveRange would interfere with the reg.
3091 if (nextPhysRefLocation == rangeEndLocation && rangeEndRefPosition->isFixedRefOfReg(regNum))
3093 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_INCREMENT_RANGE_END, currentInterval, regNum));
3094 nextPhysRefLocation++;
3097 if ((candidateBit & preferences) != RBM_NONE)
3099 score |= OWN_PREFERENCE;
3100 if (nextPhysRefLocation > rangeEndLocation)
3105 if (relatedInterval != nullptr && (candidateBit & relatedPreferences) != RBM_NONE)
3107 score |= RELATED_PREFERENCE;
3108 if (nextPhysRefLocation > relatedInterval->lastRefPosition->nodeLocation)
3110 score |= COVERS_RELATED;
3114 // If we had a fixed-reg def of a reg that will be killed before the use, prefer it to any other registers
3115 // with the same score. (Note that we haven't changed the original registerAssignment on the RefPosition).
3116 // Overload the RELATED_PREFERENCE value.
3117 else if (candidateBit == refPosition->registerAssignment)
3119 score |= RELATED_PREFERENCE;
3122 if ((preferCalleeSave && physRegRecord->isCalleeSave) || (!preferCalleeSave && !physRegRecord->isCalleeSave))
3124 score |= CALLER_CALLEE;
3127 // The register is considered unassigned if it has no assignedInterval, OR
3128 // if its next reference is beyond the range of this interval.
3129 if (!isAssigned(physRegRecord, lastLocation ARM_ARG(currentInterval->registerType)))
3131 score |= UNASSIGNED;
3134 bool foundBetterCandidate = false;
3136 if (score > bestScore)
3138 foundBetterCandidate = true;
3140 else if (score == bestScore)
3142 // Prefer a register that covers the range.
3143 if (bestLocation <= lastLocation)
3145 if (nextPhysRefLocation > bestLocation)
3147 foundBetterCandidate = true;
3150 // If both cover the range, prefer a register that is killed sooner (leaving the longer-range register
3151 // available). If both cover the range and are killed at the same location, prefer the one that
3152 // is the same as the previous assignment.
3153 else if (nextPhysRefLocation > lastLocation)
3155 if (nextPhysRefLocation < bestLocation)
3157 foundBetterCandidate = true;
3159 else if (nextPhysRefLocation == bestLocation && prevReg == regNum)
3161 foundBetterCandidate = true;
3167 if (doReverseSelect() && bestScore != 0)
3169 foundBetterCandidate = !foundBetterCandidate;
3173 if (foundBetterCandidate)
3175 bestLocation = nextPhysRefLocation;
3176 availablePhysRegInterval = physRegRecord;
3177 unassignInterval = true;
3181 // there is no way we can get a better score so break out
3182 if (!reverseSelect && score == bestPossibleScore && bestLocation == rangeEndLocation + 1)
3188 if (availablePhysRegInterval != nullptr)
3190 if (unassignInterval && isAssigned(availablePhysRegInterval ARM_ARG(currentInterval->registerType)))
3192 Interval* const intervalToUnassign = availablePhysRegInterval->assignedInterval;
3193 unassignPhysReg(availablePhysRegInterval ARM_ARG(currentInterval->registerType));
3195 if ((bestScore & VALUE_AVAILABLE) != 0 && intervalToUnassign != nullptr)
3197 assert(intervalToUnassign->isConstant);
3198 refPosition->treeNode->SetReuseRegVal();
3200 // If we considered this "unassigned" because this interval's lifetime ends before
3201 // the next ref, remember it.
3202 else if ((bestScore & UNASSIGNED) != 0 && intervalToUnassign != nullptr)
3204 updatePreviousInterval(availablePhysRegInterval, intervalToUnassign, intervalToUnassign->registerType);
3209 assert((bestScore & VALUE_AVAILABLE) == 0);
3211 assignPhysReg(availablePhysRegInterval, currentInterval);
3212 foundReg = availablePhysRegInterval->regNum;
3213 regMaskTP foundRegMask = genRegMask(foundReg);
3214 refPosition->registerAssignment = foundRegMask;
3215 if (relatedInterval != nullptr)
3217 relatedInterval->updateRegisterPreferences(foundRegMask);
3224 //------------------------------------------------------------------------
3225 // canSpillReg: Determine whether we can spill physRegRecord
3228 // physRegRecord - reg to spill
3229 // refLocation - Location of RefPosition where this register will be spilled
3230 // recentAssignedRefWeight - Weight of recent assigned RefPosition which will be determined in this function
3231 // farthestRefPosWeight - Current farthestRefPosWeight at allocateBusyReg()
3234 // True - if we can spill physRegRecord
3235 // False - otherwise
3237 // Note: This helper is designed to be used only from allocateBusyReg() and canSpillDoubleReg()
3239 bool LinearScan::canSpillReg(RegRecord* physRegRecord, LsraLocation refLocation, unsigned* recentAssignedRefWeight)
3241 assert(physRegRecord->assignedInterval != nullptr);
3242 RefPosition* recentAssignedRef = physRegRecord->assignedInterval->recentRefPosition;
3244 if (recentAssignedRef != nullptr)
3246 if (isRefPositionActive(recentAssignedRef, refLocation))
3248 // We can't spill a register that's active at the current location
3252 // We don't prefer to spill a register if the weight of recentAssignedRef > weight
3253 // of the spill candidate found so far. We would consider spilling a greater weight
3254 // ref position only if the refPosition being allocated requires a register.
3255 *recentAssignedRefWeight = getWeight(recentAssignedRef);
3261 //------------------------------------------------------------------------
3262 // canSpillDoubleReg: Determine whether we can spill physRegRecord
3265 // physRegRecord - reg to spill (must be a valid double register)
3266 // refLocation - Location of RefPosition where this register will be spilled
3267 // recentAssignedRefWeight - Weight of recent assigned RefPosition which will be determined in this function
3270 // True - if we can spill physRegRecord
3271 // False - otherwise
3274 // This helper is designed to be used only from allocateBusyReg().
3275 // The recentAssignedRefWeight is not updated if either register cannot be spilled.
3277 bool LinearScan::canSpillDoubleReg(RegRecord* physRegRecord,
3278 LsraLocation refLocation,
3279 unsigned* recentAssignedRefWeight)
3281 assert(genIsValidDoubleReg(physRegRecord->regNum));
3283 unsigned weight = BB_ZERO_WEIGHT;
3284 unsigned weight2 = BB_ZERO_WEIGHT;
3286 RegRecord* physRegRecord2 = findAnotherHalfRegRec(physRegRecord);
3288 if ((physRegRecord->assignedInterval != nullptr) && !canSpillReg(physRegRecord, refLocation, &weight))
3292 if (physRegRecord2->assignedInterval != nullptr)
3294 if (!canSpillReg(physRegRecord2, refLocation, &weight2))
3298 if (weight2 > weight)
3303 *recentAssignedRefWeight = weight;
3309 //------------------------------------------------------------------------
3310 // unassignDoublePhysReg: unassign a double register (pair)
3313 // doubleRegRecord - reg to unassign
3316 // The given RegRecord must be a valid (even numbered) double register.
3318 void LinearScan::unassignDoublePhysReg(RegRecord* doubleRegRecord)
3320 assert(genIsValidDoubleReg(doubleRegRecord->regNum));
3322 RegRecord* doubleRegRecordLo = doubleRegRecord;
3323 RegRecord* doubleRegRecordHi = findAnotherHalfRegRec(doubleRegRecordLo);
3324 // For a double register, we have the following four cases:
3325 // Case 1: doubleRegRecLo is assigned to TYP_DOUBLE interval
3326 // Case 2: doubleRegRecLo and doubleRegRecHi are assigned to different TYP_FLOAT intervals
3327 // Case 3: doubleRegRecLo is assigned to TYP_FLOAT interval and doubleRegRecHi is nullptr
3328 // Case 4: doubleRegRecordLo is nullptr, and doubleRegRecordHi is assigned to a TYP_FLOAT interval
3329 if (doubleRegRecordLo->assignedInterval != nullptr)
3331 if (doubleRegRecordLo->assignedInterval->registerType == TYP_DOUBLE)
3333 // Case 1: doubleRegRecLo is assigned to TYP_DOUBLE interval
3334 unassignPhysReg(doubleRegRecordLo, doubleRegRecordLo->assignedInterval->recentRefPosition);
3338 // Case 2: doubleRegRecLo and doubleRegRecHi are assigned to different TYP_FLOAT intervals
3340 // Case 3: doubleRegRecLo is assigned to TYP_FLOAT interval and doubleRegRecHi is nullptr
3340 assert(doubleRegRecordLo->assignedInterval->registerType == TYP_FLOAT);
3341 unassignPhysReg(doubleRegRecordLo, doubleRegRecordLo->assignedInterval->recentRefPosition);
3343 if (doubleRegRecordHi != nullptr)
3345 if (doubleRegRecordHi->assignedInterval != nullptr)
3347 assert(doubleRegRecordHi->assignedInterval->registerType == TYP_FLOAT);
3348 unassignPhysReg(doubleRegRecordHi, doubleRegRecordHi->assignedInterval->recentRefPosition);
3355 // Case 4: doubleRegRecordLo is nullptr, and doubleRegRecordHi is assigned to a TYP_FLOAT interval
3356 assert(doubleRegRecordHi->assignedInterval != nullptr);
3357 assert(doubleRegRecordHi->assignedInterval->registerType == TYP_FLOAT);
3358 unassignPhysReg(doubleRegRecordHi, doubleRegRecordHi->assignedInterval->recentRefPosition);
3362 #endif // _TARGET_ARM_
3364 //------------------------------------------------------------------------
3365 // isRefPositionActive: Determine whether a given RefPosition is active at the given location
3368 // refPosition - the RefPosition of interest
3369 // refLocation - the LsraLocation at which we want to know if it is active
3372 // True - if this RefPosition occurs at the given location, OR
3373 // if it occurs at the previous location and is marked delayRegFree.
3374 // False - otherwise
3376 bool LinearScan::isRefPositionActive(RefPosition* refPosition, LsraLocation refLocation)
3378 return (refPosition->nodeLocation == refLocation ||
3379 ((refPosition->nodeLocation + 1 == refLocation) && refPosition->delayRegFree));
3382 //----------------------------------------------------------------------------------------
3383 // isRegInUse: Test whether regRec is being used at the refPosition
3386 // regRec - A register to be tested
3387 // refPosition - RefPosition where regRec is tested
3390 // True - if regRec is being used
3391 // False - otherwise
3394 // This helper is designed to be used only from allocateBusyReg(), where:
3395 // - This register was *not* found when looking for a free register, and
3396 // - The caller must have already checked for the case where 'refPosition' is a fixed ref
3397 // (asserted at the beginning of this method).
3399 bool LinearScan::isRegInUse(RegRecord* regRec, RefPosition* refPosition)
3401 // We shouldn't reach this check if 'refPosition' is a FixedReg of this register.
3402 assert(!refPosition->isFixedRefOfReg(regRec->regNum));
3403 Interval* assignedInterval = regRec->assignedInterval;
3404 if (assignedInterval != nullptr)
3406 if (!assignedInterval->isActive)
3408 // This can only happen if we have a recentRefPosition active at this location that hasn't yet been freed.
3409 CLANG_FORMAT_COMMENT_ANCHOR;
3411 if (isRefPositionActive(assignedInterval->recentRefPosition, refPosition->nodeLocation))
3418 // In the case of TYP_DOUBLE, we may have the case where 'assignedInterval' is inactive,
3419 // but the other half register is active. If so, it must have an active recentRefPosition.
3421 if (refPosition->getInterval()->registerType == TYP_DOUBLE)
3423 RegRecord* otherHalfRegRec = findAnotherHalfRegRec(regRec);
3424 if (!otherHalfRegRec->assignedInterval->isActive)
3426 if (isRefPositionActive(otherHalfRegRec->assignedInterval->recentRefPosition,
3427 refPosition->nodeLocation))
3433 assert(!"Unexpected inactive assigned interval in isRegInUse");
3441 assert(!"Unexpected inactive assigned interval in isRegInUse");
3446 RefPosition* nextAssignedRef = assignedInterval->getNextRefPosition();
3448 // We should never spill a register that's occupied by an Interval with its next use at the current
3449 // location.
3450 // Normally this won't occur (unless we actually had more uses in a single node than there are registers),
3451 // because we'll always find something with a later nextLocation, but it can happen in stress when
3452 // we have LSRA_SELECT_NEAREST.
3453 if ((nextAssignedRef != nullptr) && isRefPositionActive(nextAssignedRef, refPosition->nodeLocation) &&
3454 nextAssignedRef->RequiresRegister())
3462 //------------------------------------------------------------------------
3463 // isSpillCandidate: Determine if a register is a spill candidate for a given RefPosition.
3466 // current The interval for the current allocation
3467 // refPosition The RefPosition of the current Interval for which a register is being allocated
3468 // physRegRecord The RegRecord for the register we're considering for spill
3469 // nextLocation An out (reference) parameter in which the next use location of the
3470 // given RegRecord will be returned.
3473 // True iff the given register can be spilled to accommodate the given RefPosition.
3475 bool LinearScan::isSpillCandidate(Interval* current,
3476 RefPosition* refPosition,
3477 RegRecord* physRegRecord,
3478 LsraLocation& nextLocation)
3480 regMaskTP candidateBit = genRegMask(physRegRecord->regNum);
3481 LsraLocation refLocation = refPosition->nodeLocation;
3482 if (physRegRecord->isBusyUntilNextKill)
3486 Interval* assignedInterval = physRegRecord->assignedInterval;
3487 if (assignedInterval != nullptr)
3489 nextLocation = assignedInterval->getNextRefLocation();
3492 RegRecord* physRegRecord2 = nullptr;
3493 Interval* assignedInterval2 = nullptr;
3495 // For ARM32, a double occupies a consecutive even/odd pair of float registers.
3496 if (current->registerType == TYP_DOUBLE)
3498 assert(genIsValidDoubleReg(physRegRecord->regNum));
3499 physRegRecord2 = findAnotherHalfRegRec(physRegRecord);
3500 if (physRegRecord2->isBusyUntilNextKill)
3504 assignedInterval2 = physRegRecord2->assignedInterval;
3505 if ((assignedInterval2 != nullptr) && (assignedInterval2->getNextRefLocation() > nextLocation))
3507 nextLocation = assignedInterval2->getNextRefLocation();
3512 // If there is a fixed reference at the same location (and it's not due to this reference),
3513 // we can't use this register.
3514 if (physRegRecord->conflictingFixedRegReference(refPosition))
3519 if (refPosition->isFixedRefOfRegMask(candidateBit))
3522 // - there is a fixed reference due to this node, OR
3523 // - there is a fixed use fed by a def at this node, OR
3524 // - we have restricted the set of registers for stress.
3525 // In any case, we must use this register as it's the only candidate
3526 // TODO-CQ: At the time we allocate a register to a fixed-reg def, if it's not going
3527 // to remain live until the use, we should set the candidates to allRegs(regType)
3528 // to avoid a spill - codegen can then insert the copy.
3529 // If this is marked as allocateIfProfitable, the caller will compare the weights
3530 // of this RefPosition and the RefPosition to which it is currently assigned.
3531 assert(refPosition->isFixedRegRef ||
3532 (refPosition->nextRefPosition != nullptr && refPosition->nextRefPosition->isFixedRegRef) ||
3533 candidatesAreStressLimited());
3537 // If this register is not assigned to an interval, either
3538 // - it has a FixedReg reference at the current location that is not this reference, OR
3539 // - this is the special case of a fixed loReg, where this interval has a use at the same location
3540 // In either case, we cannot use it
3541 CLANG_FORMAT_COMMENT_ANCHOR;
3544 if (assignedInterval == nullptr && assignedInterval2 == nullptr)
3546 if (assignedInterval == nullptr)
3549 RefPosition* nextPhysRegPosition = physRegRecord->getNextRefPosition();
3550 #ifdef _TARGET_ARM64_
3551 // On ARM64, we may need to actually allocate IP0 and IP1 in some cases, but we don't include them in
3552 // the allocation order for tryAllocateFreeReg.
3553 if ((physRegRecord->regNum != REG_IP0) && (physRegRecord->regNum != REG_IP1))
3554 #endif // _TARGET_ARM64_
3556 assert((nextPhysRegPosition != nullptr) && (nextPhysRegPosition->nodeLocation == refLocation) &&
3557 (candidateBit != refPosition->registerAssignment));
3562 if (isRegInUse(physRegRecord, refPosition))
3568 if (current->registerType == TYP_DOUBLE)
3570 if (isRegInUse(physRegRecord2, refPosition))
3579 //------------------------------------------------------------------------
3580 // allocateBusyReg: Find a busy register that satisfies the requirements for refPosition,
3581 // and that can be spilled.
3584 // current The interval for the current allocation
3585 // refPosition The RefPosition of the current Interval for which a register is being allocated
3586 // allocateIfProfitable If true, a reg may not be allocated if all other ref positions currently
3587 // occupying registers are more important than the 'refPosition'.
3590 // The regNumber allocated to the RefPosition. Returns REG_NA if no suitable register is found.
3592 // Note: Currently this routine uses weight and farthest distance of next reference
3593 // to select a ref position for spilling.
3594 // a) if allocateIfProfitable = false
3595 // The ref position chosen for spilling will be the lowest weight
3596 // of all, and if there is more than one ref position with the
3597 // same lowest weight, chooses among them the one with the farthest
3598 // distance to its next reference.
3600 // b) if allocateIfProfitable = true
3601 // The ref position chosen for spilling will not only be the lowest weight
3602 // of all, but will also have a weight lower than 'refPosition'. If there is
3603 // no such ref position, a reg will not be allocated.
3605 regNumber LinearScan::allocateBusyReg(Interval* current, RefPosition* refPosition, bool allocateIfProfitable)
3607 regNumber foundReg = REG_NA;
3609 RegisterType regType = getRegisterType(current, refPosition);
3610 regMaskTP candidates = refPosition->registerAssignment;
3611 regMaskTP preferences = (current->registerPreferences & candidates);
3612 if (preferences == RBM_NONE)
3614 preferences = candidates;
3616 if (candidates == RBM_NONE)
3618 // This assumes only integer and floating point register types;
3619 // if we target a processor with additional register types,
3620 // this would have to change.
3621 candidates = allRegs(regType);
3625 candidates = stressLimitRegs(refPosition, candidates);
3628 // TODO-CQ: Determine whether/how to take preferences into account in addition to
3629 // preferring the one with the furthest ref position when considering
3630 // a candidate to spill
3631 RegRecord* farthestRefPhysRegRecord = nullptr;
3633 RegRecord* farthestRefPhysRegRecord2 = nullptr;
3635 LsraLocation farthestLocation = MinLocation;
3636 LsraLocation refLocation = refPosition->nodeLocation;
3637 unsigned farthestRefPosWeight;
3638 if (allocateIfProfitable)
3640 // If allocating a reg is optional, we will consider those ref positions
3641 // whose weight is less than 'refPosition' for spilling.
3642 farthestRefPosWeight = getWeight(refPosition);
3646 // If allocating a reg is a must, we start off with max weight so
3647 // that the first spill candidate will be selected based on
3648 // farthest distance alone. Since we start off with farthestLocation
3649 // initialized to MinLocation, the first available ref position
3650 // will be selected as spill candidate and its weight as the
3651 // farthestRefPosWeight.
3652 farthestRefPosWeight = BB_MAX_WEIGHT;
3655 for (regNumber regNum : Registers(regType))
3657 regMaskTP candidateBit = genRegMask(regNum);
3658 if (!(candidates & candidateBit))
3662 RegRecord* physRegRecord = getRegisterRecord(regNum);
3663 RegRecord* physRegRecord2 = nullptr; // only used for _TARGET_ARM_
3664 LsraLocation nextLocation = MinLocation;
3665 LsraLocation physRegNextLocation;
3666 if (!isSpillCandidate(current, refPosition, physRegRecord, nextLocation))
3668 assert(candidates != candidateBit);
3672 // We've passed the preliminary checks for a spill candidate.
3673 // Now, if we have a recentAssignedRef, check that it is going to be OK to spill it.
3674 Interval* assignedInterval = physRegRecord->assignedInterval;
3675 unsigned recentAssignedRefWeight = BB_ZERO_WEIGHT;
3676 RefPosition* recentAssignedRef = nullptr;
3677 RefPosition* recentAssignedRef2 = nullptr;
3679 if (current->registerType == TYP_DOUBLE)
3681 recentAssignedRef = (assignedInterval == nullptr) ? nullptr : assignedInterval->recentRefPosition;
3682 physRegRecord2 = findAnotherHalfRegRec(physRegRecord);
3683 Interval* assignedInterval2 = physRegRecord2->assignedInterval;
3684 recentAssignedRef2 = (assignedInterval2 == nullptr) ? nullptr : assignedInterval2->recentRefPosition;
3685 if (!canSpillDoubleReg(physRegRecord, refLocation, &recentAssignedRefWeight))
3693 recentAssignedRef = assignedInterval->recentRefPosition;
3694 if (!canSpillReg(physRegRecord, refLocation, &recentAssignedRefWeight))
3699 if (recentAssignedRefWeight > farthestRefPosWeight)
3704 physRegNextLocation = physRegRecord->getNextRefLocation();
3705 if (nextLocation > physRegNextLocation)
3707 nextLocation = physRegNextLocation;
3710 bool isBetterLocation;
3713 if (doSelectNearest() && farthestRefPhysRegRecord != nullptr)
3715 isBetterLocation = (nextLocation <= farthestLocation);
3719 // This if-stmt is associated with the above else
3720 if (recentAssignedRefWeight < farthestRefPosWeight)
3722 isBetterLocation = true;
3726 // This would mean that the weight of the spill ref position we found so far is equal
3727 // to the weight of the ref position that is being evaluated. In this case
3728 // we prefer to spill the ref position whose distance to its next reference is the farthest.
3730 assert(recentAssignedRefWeight == farthestRefPosWeight);
3732 // If allocateIfProfitable=true, the first spill candidate selected
3733 // will be based on weight alone. After we have found a spill
3734 // candidate whose weight is less than the 'refPosition', we will
3735 // consider farthest distance when there is a tie in weights.
3736 // This is to ensure that we don't spill a ref position whose
3737 // weight is equal to weight of 'refPosition'.
3738 if (allocateIfProfitable && farthestRefPhysRegRecord == nullptr)
3740 isBetterLocation = false;
3744 isBetterLocation = (nextLocation > farthestLocation);
3746 if (nextLocation > farthestLocation)
3748 isBetterLocation = true;
3750 else if (nextLocation == farthestLocation)
3752 // Both weight and distance are equal.
3753 // Prefer that ref position which is marked both reload and
3754 // allocate if profitable. These ref positions don't need
3755 // to be spilled as they are already in memory and
3756 // codegen considers them as contained memory operands.
3757 CLANG_FORMAT_COMMENT_ANCHOR;
3759 // TODO-CQ-ARM: Just conservatively AND the two conditions. We may implement a better condition later.
3760 isBetterLocation = true;
3761 if (recentAssignedRef != nullptr)
3762 isBetterLocation &= (recentAssignedRef->reload && recentAssignedRef->AllocateIfProfitable());
3764 if (recentAssignedRef2 != nullptr)
3765 isBetterLocation &= (recentAssignedRef2->reload && recentAssignedRef2->AllocateIfProfitable());
3767 isBetterLocation = (recentAssignedRef != nullptr) && recentAssignedRef->reload &&
3768 recentAssignedRef->AllocateIfProfitable();
3773 isBetterLocation = false;
3778 if (isBetterLocation)
3780 farthestLocation = nextLocation;
3781 farthestRefPhysRegRecord = physRegRecord;
3783 farthestRefPhysRegRecord2 = physRegRecord2;
3785 farthestRefPosWeight = recentAssignedRefWeight;
3790 if (allocateIfProfitable)
3792 // There may not be a spill candidate; if one is found,
3793 // its weight must be less than the weight of 'refPosition'.
3794 assert((farthestRefPhysRegRecord == nullptr) || (farthestRefPosWeight < getWeight(refPosition)));
3798 // Must have found a spill candidate.
3799 assert(farthestRefPhysRegRecord != nullptr);
3801 if (farthestLocation == refLocation)
3803 // This must be a RefPosition that is constrained to use a single register, either directly,
3804 // or at the use, or by stress.
3805 bool isConstrained = (refPosition->isFixedRegRef || (refPosition->nextRefPosition != nullptr &&
3806 refPosition->nextRefPosition->isFixedRegRef) ||
3807 candidatesAreStressLimited());
3811 Interval* assignedInterval =
3812 (farthestRefPhysRegRecord == nullptr) ? nullptr : farthestRefPhysRegRecord->assignedInterval;
3813 Interval* assignedInterval2 =
3814 (farthestRefPhysRegRecord2 == nullptr) ? nullptr : farthestRefPhysRegRecord2->assignedInterval;
3815 RefPosition* nextRefPosition =
3816 (assignedInterval == nullptr) ? nullptr : assignedInterval->getNextRefPosition();
3817 RefPosition* nextRefPosition2 =
3818 (assignedInterval2 == nullptr) ? nullptr : assignedInterval2->getNextRefPosition();
3819 if (nextRefPosition != nullptr)
3821 if (nextRefPosition2 != nullptr)
3823 assert(!nextRefPosition->RequiresRegister() || !nextRefPosition2->RequiresRegister());
3827 assert(!nextRefPosition->RequiresRegister());
3832 assert(nextRefPosition2 != nullptr && !nextRefPosition2->RequiresRegister());
3834 #else // !_TARGET_ARM_
3835 Interval* assignedInterval = farthestRefPhysRegRecord->assignedInterval;
3836 RefPosition* nextRefPosition = assignedInterval->getNextRefPosition();
3837 assert(!nextRefPosition->RequiresRegister());
3838 #endif // !_TARGET_ARM_
3843 assert(farthestLocation > refLocation);
3848 if (farthestRefPhysRegRecord != nullptr)
3850 foundReg = farthestRefPhysRegRecord->regNum;
3853 if (current->registerType == TYP_DOUBLE)
3855 assert(genIsValidDoubleReg(foundReg));
3856 unassignDoublePhysReg(farthestRefPhysRegRecord);
3861 unassignPhysReg(farthestRefPhysRegRecord, farthestRefPhysRegRecord->assignedInterval->recentRefPosition);
3864 assignPhysReg(farthestRefPhysRegRecord, current);
3865 refPosition->registerAssignment = genRegMask(foundReg);
3870 refPosition->registerAssignment = RBM_NONE;
3876 // Grab a register to use for a copy, and then immediately use it.
3877 // This is called only for localVar intervals that already have a register
3878 // assignment that is not compatible with the current RefPosition.
3879 // This is not like regular assignment, because we don't want to change
3880 // any preferences or existing register assignments.
3881 // Prefer a free register that's got the earliest next use.
3882 // Otherwise, spill something with the farthest next use
3884 regNumber LinearScan::assignCopyReg(RefPosition* refPosition)
3886 Interval* currentInterval = refPosition->getInterval();
3887 assert(currentInterval != nullptr);
3888 assert(currentInterval->isActive);
3890 bool foundFreeReg = false;
3891 RegRecord* bestPhysReg = nullptr;
3892 LsraLocation bestLocation = MinLocation;
3893 regMaskTP candidates = refPosition->registerAssignment;
3895 // Save the relatedInterval, if any, so that it doesn't get modified during allocation.
3896 Interval* savedRelatedInterval = currentInterval->relatedInterval;
3897 currentInterval->relatedInterval = nullptr;
3899 // We don't really want to change the default assignment,
3900 // so 1) pretend this isn't active, and 2) remember the old reg
3901 regNumber oldPhysReg = currentInterval->physReg;
3902 RegRecord* oldRegRecord = currentInterval->assignedReg;
3903 assert(oldRegRecord->regNum == oldPhysReg);
3904 currentInterval->isActive = false;
3906 regNumber allocatedReg = tryAllocateFreeReg(currentInterval, refPosition);
3907 if (allocatedReg == REG_NA)
3909 allocatedReg = allocateBusyReg(currentInterval, refPosition, false);
3912 // Now restore the old info
3913 currentInterval->relatedInterval = savedRelatedInterval;
3914 currentInterval->physReg = oldPhysReg;
3915 currentInterval->assignedReg = oldRegRecord;
3916 currentInterval->isActive = true;
3918 refPosition->copyReg = true;
3919 return allocatedReg;
3922 //------------------------------------------------------------------------
3923 // isAssigned: Check whether the given RegRecord has an assignedInterval,
3924 // regardless of lastLocation.
3925 // This is equivalent to calling isAssigned() with MaxLocation.
3928 // regRec - The RegRecord to check that it is assigned.
3929 // newRegType - The RegisterType of the upcoming reference (used to check the other half of a TYP_DOUBLE RegRecord).
3932 // Returns true if the given RegRecord has an assignedInterval.
3935 // This checks whether the RegRecord has an assignedInterval, regardless of lastLocation.
3937 bool LinearScan::isAssigned(RegRecord* regRec ARM_ARG(RegisterType newRegType))
3939 return isAssigned(regRec, MaxLocation ARM_ARG(newRegType));
3942 //------------------------------------------------------------------------
3943 // isAssigned: Check whether the given RegRecord has an assignedInterval
3944 // that has a reference prior to the given location.
3947 // regRec - The RegRecord of interest
3948 // lastLocation - The LsraLocation up to which we want to check
3949 // newRegType - The `RegisterType` of interval we want to check
3950 // (this is for the purposes of checking the other half of a TYP_DOUBLE RegRecord)
3953 // Returns true if the given RegRecord (and its other half, if TYP_DOUBLE) has an assignedInterval
3954 // that is referenced prior to the given location
3957 // The register is not considered to be assigned if it has no assignedInterval, or if that Interval's
3958 // next reference is beyond lastLocation.
3960 bool LinearScan::isAssigned(RegRecord* regRec, LsraLocation lastLocation ARM_ARG(RegisterType newRegType))
3962 Interval* assignedInterval = regRec->assignedInterval;
3964 if ((assignedInterval == nullptr) || assignedInterval->getNextRefLocation() > lastLocation)
3967 if (newRegType == TYP_DOUBLE)
3969 RegRecord* anotherRegRec = findAnotherHalfRegRec(regRec);
3971 if ((anotherRegRec->assignedInterval == nullptr) ||
3972 (anotherRegRec->assignedInterval->getNextRefLocation() > lastLocation))
3974 // If newRegType is a double register, UNASSIGNED is set only if the other half
3975 // register is also not assigned.
3989 // Check whether the register is already assigned to another interval; if so, unassign the physical record,
3990 // then set the assignedInterval to 'interval'.
3992 void LinearScan::checkAndAssignInterval(RegRecord* regRec, Interval* interval)
3994 Interval* assignedInterval = regRec->assignedInterval;
3995 if (assignedInterval != nullptr && assignedInterval != interval)
3997 // This is allocated to another interval. Either it is inactive, or it was allocated as a
3998 // copyReg and is therefore not the "assignedReg" of the other interval. In the latter case,
3999 // we simply unassign it - in the former case we need to set the physReg on the interval to
4000 // REG_NA to indicate that it is no longer in that register.
4001 // The lack of checking for this case resulted in an assert in the retail version of System.dll,
4002 // in method SerialStream.GetDcbFlag.
4003 // Note that we can't check for the copyReg case, because we may have seen a more recent
4004 // RefPosition for the Interval that was NOT a copyReg.
4005 if (assignedInterval->assignedReg == regRec)
4007 assert(assignedInterval->isActive == false);
4008 assignedInterval->physReg = REG_NA;
4010 unassignPhysReg(regRec->regNum);
4013 // If 'interval' and 'assignedInterval' were both TYP_DOUBLE, then we have unassigned 'assignedInterval'
4014 // from both halves. Otherwise, if 'interval' is TYP_DOUBLE, we now need to unassign the other half.
4015 if ((interval->registerType == TYP_DOUBLE) &&
4016 ((assignedInterval == nullptr) || (assignedInterval->registerType == TYP_FLOAT)))
4018 RegRecord* otherRegRecord = getSecondHalfRegRec(regRec);
4019 assignedInterval = otherRegRecord->assignedInterval;
4020 if (assignedInterval != nullptr && assignedInterval != interval)
4022 if (assignedInterval->assignedReg == otherRegRecord)
4024 assert(assignedInterval->isActive == false);
4025 assignedInterval->physReg = REG_NA;
4027 unassignPhysReg(otherRegRecord->regNum);
4032 updateAssignedInterval(regRec, interval, interval->registerType);
// Assign the given physical register record to the given interval.
void LinearScan::assignPhysReg(RegRecord* regRec, Interval* interval)
{
    regMaskTP assignedRegMask = genRegMask(regRec->regNum);
    compiler->codeGen->regSet.rsSetRegsModified(assignedRegMask DEBUGARG(true));

    checkAndAssignInterval(regRec, interval);
    interval->assignedReg = regRec;
    interval->physReg     = regRec->regNum;
    interval->isActive    = true;
    if (interval->isLocalVar)
    {
        // Prefer this register for future references
        interval->updateRegisterPreferences(assignedRegMask);
    }
}
//------------------------------------------------------------------------
// setIntervalAsSplit: Set this Interval as being split
//
// Arguments:
//    interval - The Interval which is being split
//
// Return Value:
//    None.
//
// Notes:
//    The given Interval will be marked as split, and it will be added to the
//    set of splitOrSpilledVars.
//
// Assumptions:
//    "interval" must be a lclVar interval, as tree temps are never split.
//    This is asserted in the call to getVarIndex().
//
void LinearScan::setIntervalAsSplit(Interval* interval)
{
    if (interval->isLocalVar)
    {
        unsigned varIndex = interval->getVarIndex(compiler);
        if (!interval->isSplit)
        {
            VarSetOps::AddElemD(compiler, splitOrSpilledVars, varIndex);
        }
        else
        {
            assert(VarSetOps::IsMember(compiler, splitOrSpilledVars, varIndex));
        }
    }
    interval->isSplit = true;
}
//------------------------------------------------------------------------
// setIntervalAsSpilled: Set this Interval as being spilled
//
// Arguments:
//    interval - The Interval which is being spilled
//
// Return Value:
//    None.
//
// Notes:
//    The given Interval will be marked as spilled, and it will be added
//    to the set of splitOrSpilledVars.
//
void LinearScan::setIntervalAsSpilled(Interval* interval)
{
    if (interval->isLocalVar)
    {
        unsigned varIndex = interval->getVarIndex(compiler);
        if (!interval->isSpilled)
        {
            VarSetOps::AddElemD(compiler, splitOrSpilledVars, varIndex);
        }
        else
        {
            assert(VarSetOps::IsMember(compiler, splitOrSpilledVars, varIndex));
        }
    }
    interval->isSpilled = true;
}
//------------------------------------------------------------------------
// spillInterval: Spill the given Interval between "fromRefPosition" and "toRefPosition"
//
// Arguments:
//    interval        - The Interval to be spilled
//    fromRefPosition - The RefPosition at which the Interval is to be spilled
//    toRefPosition   - The RefPosition at which it must be reloaded
//
// Return Value:
//    None.
//
// Assumptions:
//    fromRefPosition and toRefPosition must not be null
//
void LinearScan::spillInterval(Interval* interval, RefPosition* fromRefPosition, RefPosition* toRefPosition)
{
    assert(fromRefPosition != nullptr && toRefPosition != nullptr);
    assert(fromRefPosition->getInterval() == interval && toRefPosition->getInterval() == interval);
    assert(fromRefPosition->nextRefPosition == toRefPosition);

    if (!fromRefPosition->lastUse)
    {
        // If not allocated a register, lclVar def/use RefPositions (even if reg-optional)
        // should be marked as spillAfter.
        if (!fromRefPosition->RequiresRegister() && !(interval->isLocalVar && fromRefPosition->IsActualRef()))
        {
            fromRefPosition->registerAssignment = RBM_NONE;
        }
        else
        {
            fromRefPosition->spillAfter = true;
        }
    }
    assert(toRefPosition != nullptr);

#ifdef DEBUG
    if (VERBOSE)
    {
        dumpLsraAllocationEvent(LSRA_EVENT_SPILL, interval);
    }
#endif // DEBUG

    INTRACK_STATS(updateLsraStat(LSRA_STAT_SPILL, fromRefPosition->bbNum));

    interval->isActive = false;
    setIntervalAsSpilled(interval);

    // If fromRefPosition occurs before the beginning of this block, mark this as living on the stack
    // on entry to this block.
    if (fromRefPosition->nodeLocation <= curBBStartLocation)
    {
        // This must be a lclVar interval
        assert(interval->isLocalVar);
        setInVarRegForBB(curBBNum, interval->varNum, REG_STK);
    }
}
//------------------------------------------------------------------------
// unassignPhysRegNoSpill: Unassign the given physical register record from
//                         an active interval, without spilling.
//
// Arguments:
//    regRec - the RegRecord to be unassigned
//
// Return Value:
//    None.
//
// Assumptions:
//    The assignedInterval must not be null, and must be active.
//
// Notes:
//    This method is used to unassign a register when an interval needs to be moved to a
//    different register, but not (yet) spilled.
//
void LinearScan::unassignPhysRegNoSpill(RegRecord* regRec)
{
    Interval* assignedInterval = regRec->assignedInterval;
    assert(assignedInterval != nullptr && assignedInterval->isActive);
    assignedInterval->isActive = false;
    unassignPhysReg(regRec, nullptr);
    assignedInterval->isActive = true;
}
//------------------------------------------------------------------------
// checkAndClearInterval: Clear the assignedInterval for the given
//                        physical register record
//
// Arguments:
//    regRec           - the physical RegRecord to be unassigned
//    spillRefPosition - The RefPosition at which the assignedInterval is to be spilled
//                       (or nullptr if we aren't spilling)
//
// Return Value:
//    None.
//
// Notes:
//    see unassignPhysReg
//
void LinearScan::checkAndClearInterval(RegRecord* regRec, RefPosition* spillRefPosition)
{
    Interval* assignedInterval = regRec->assignedInterval;
    assert(assignedInterval != nullptr);
    regNumber thisRegNum = regRec->regNum;

    if (spillRefPosition == nullptr)
    {
        // Note that we can't assert for the copyReg case
        if (assignedInterval->physReg == thisRegNum)
        {
            assert(assignedInterval->isActive == false);
        }
    }
    else
    {
        assert(spillRefPosition->getInterval() == assignedInterval);
    }

    updateAssignedInterval(regRec, nullptr, assignedInterval->registerType);
}
//------------------------------------------------------------------------
// unassignPhysReg: Unassign the given physical register record, and spill the
//                  assignedInterval at the given spillRefPosition, if any.
//
// Arguments:
//    regRec     - The RegRecord to be unassigned
//    newRegType - The RegisterType of the interval that would be assigned
//
// Return Value:
//    None.
//
// Notes:
//    On ARM, an interval must be unassigned in a way that takes into account
//    the register type of the interval that would next be assigned.
//
void LinearScan::unassignPhysReg(RegRecord* regRec ARM_ARG(RegisterType newRegType))
{
    RegRecord* regRecToUnassign = regRec;
#ifdef _TARGET_ARM_
    RegRecord* anotherRegRec = nullptr;

    if ((regRecToUnassign->assignedInterval != nullptr) &&
        (regRecToUnassign->assignedInterval->registerType == TYP_DOUBLE))
    {
        // If the register type of the interval (either the one being unassigned or the new one)
        // is TYP_DOUBLE, it must be a valid (i.e. even-numbered) double register.
        if (!genIsValidDoubleReg(regRecToUnassign->regNum))
        {
            regRecToUnassign = findAnotherHalfRegRec(regRec);
        }
    }
    else
    {
        if (newRegType == TYP_DOUBLE)
        {
            anotherRegRec = findAnotherHalfRegRec(regRecToUnassign);
        }
    }
#endif // _TARGET_ARM_

    if (regRecToUnassign->assignedInterval != nullptr)
    {
        unassignPhysReg(regRecToUnassign, regRecToUnassign->assignedInterval->recentRefPosition);
    }
#ifdef _TARGET_ARM_
    if ((anotherRegRec != nullptr) && (anotherRegRec->assignedInterval != nullptr))
    {
        unassignPhysReg(anotherRegRec, anotherRegRec->assignedInterval->recentRefPosition);
    }
#endif // _TARGET_ARM_
}
//------------------------------------------------------------------------
// unassignPhysReg: Unassign the given physical register record, and spill the
//                  assignedInterval at the given spillRefPosition, if any.
//
// Arguments:
//    regRec           - the RegRecord to be unassigned
//    spillRefPosition - The RefPosition at which the assignedInterval is to be spilled
//
// Return Value:
//    None.
//
// Assumptions:
//    The assignedInterval must not be null.
//    If spillRefPosition is null, the assignedInterval must be inactive, or not currently
//    assigned to this register (e.g. this is a copyReg for that Interval).
//    Otherwise, spillRefPosition must be associated with the assignedInterval.
//
void LinearScan::unassignPhysReg(RegRecord* regRec, RefPosition* spillRefPosition)
{
    Interval* assignedInterval = regRec->assignedInterval;
    assert(assignedInterval != nullptr);
    regNumber thisRegNum = regRec->regNum;

    // Is assignedInterval actually still assigned to this register?
    bool intervalIsAssigned = (assignedInterval->physReg == thisRegNum);

#ifdef _TARGET_ARM_
    RegRecord* anotherRegRec = nullptr;

    // Prepare the second half RegRecord of a double register for TYP_DOUBLE
    if (assignedInterval->registerType == TYP_DOUBLE)
    {
        assert(isFloatRegType(regRec->registerType));

        anotherRegRec = findAnotherHalfRegRec(regRec);

        // Both RegRecords should have been assigned to the same interval.
        assert(assignedInterval == anotherRegRec->assignedInterval);
        if (!intervalIsAssigned && (assignedInterval->physReg == anotherRegRec->regNum))
        {
            intervalIsAssigned = true;
        }
    }
#endif // _TARGET_ARM_

    checkAndClearInterval(regRec, spillRefPosition);

#ifdef _TARGET_ARM_
    if (assignedInterval->registerType == TYP_DOUBLE)
    {
        // Both RegRecords should have been unassigned together.
        assert(regRec->assignedInterval == nullptr);
        assert(anotherRegRec->assignedInterval == nullptr);
    }
#endif // _TARGET_ARM_

    RefPosition* nextRefPosition = nullptr;
    if (spillRefPosition != nullptr)
    {
        nextRefPosition = spillRefPosition->nextRefPosition;
    }

    if (!intervalIsAssigned && assignedInterval->physReg != REG_NA)
    {
        // This must have been a temporary copy reg, but we can't assert that because there
        // may have been intervening RefPositions that were not copyRegs.

        // regRec->assignedInterval has already been set to nullptr by checkAndClearInterval()
        assert(regRec->assignedInterval == nullptr);
        return;
    }

    regNumber victimAssignedReg = assignedInterval->physReg;
    assignedInterval->physReg   = REG_NA;

    bool spill = assignedInterval->isActive && nextRefPosition != nullptr;
    if (spill)
    {
        // If this is an active interval, it must have a recentRefPosition,
        // otherwise it would not be active
        assert(spillRefPosition != nullptr);

#if 0
        // TODO-CQ: Enable this and insert an explicit GT_COPY (otherwise there's no way to communicate
        // to codegen that we want the copyReg to be the new home location).
        // If the last reference was a copyReg, and we're spilling the register
        // it was copied from, then make the copyReg the new primary location
        // if possible
        if (spillRefPosition->copyReg)
        {
            regNumber copyFromRegNum = victimAssignedReg;
            regNumber copyRegNum     = genRegNumFromMask(spillRefPosition->registerAssignment);
            if (copyFromRegNum == thisRegNum &&
                getRegisterRecord(copyRegNum)->assignedInterval == assignedInterval)
            {
                assert(copyRegNum != thisRegNum);
                assignedInterval->physReg     = copyRegNum;
                assignedInterval->assignedReg = this->getRegisterRecord(copyRegNum);
                return;
            }
        }
#endif // 0
#ifdef DEBUG
        // With JitStressRegs == 0x80 (LSRA_EXTEND_LIFETIMES), we may have a RefPosition
        // that is not marked lastUse even though the treeNode is a lastUse. In that case
        // we must not mark it for spill because the register will have been immediately freed
        // after use. While we could conceivably add special handling for this case in codegen,
        // it would be messy and undesirably cause the "bleeding" of LSRA stress modes outside
        // of LSRA.
        if (extendLifetimes() && assignedInterval->isLocalVar && RefTypeIsUse(spillRefPosition->refType) &&
            spillRefPosition->treeNode != nullptr && (spillRefPosition->treeNode->gtFlags & GTF_VAR_DEATH) != 0)
        {
            dumpLsraAllocationEvent(LSRA_EVENT_SPILL_EXTENDED_LIFETIME, assignedInterval);
            assignedInterval->isActive = false;
            spill                      = false;
            // If the spillRefPosition occurs before the beginning of this block, it will have
            // been marked as living in this register on entry to this block, but we now need
            // to mark this as living on the stack.
            if (spillRefPosition->nodeLocation <= curBBStartLocation)
            {
                setInVarRegForBB(curBBNum, assignedInterval->varNum, REG_STK);
                if (spillRefPosition->nextRefPosition != nullptr)
                {
                    setIntervalAsSpilled(assignedInterval);
                }
            }
            else
            {
                // Otherwise, we need to mark spillRefPosition as lastUse, or the interval
                // will remain active beyond its allocated range during the resolution phase.
                spillRefPosition->lastUse = true;
            }
        }
        else
#endif // DEBUG
        {
            spillInterval(assignedInterval, spillRefPosition, nextRefPosition);
        }
    }
    // Maintain the association with the interval, if it has more references.
    // Or, if we "remembered" an interval assigned to this register, restore it.
    if (nextRefPosition != nullptr)
    {
        assignedInterval->assignedReg = regRec;
    }
    else if (canRestorePreviousInterval(regRec, assignedInterval))
    {
        regRec->assignedInterval = regRec->previousInterval;
        regRec->previousInterval = nullptr;

#ifdef _TARGET_ARM_
        // We cannot use updateAssignedInterval() and updatePreviousInterval() here,
        // because regRec may not be an even-numbered float register.

        // Update the second half RegRecord of a double register for TYP_DOUBLE
        if (regRec->assignedInterval->registerType == TYP_DOUBLE)
        {
            RegRecord* anotherHalfRegRec = findAnotherHalfRegRec(regRec);

            anotherHalfRegRec->assignedInterval = regRec->assignedInterval;
            anotherHalfRegRec->previousInterval = nullptr;
        }
#endif // _TARGET_ARM_

#ifdef DEBUG
        if (spill)
        {
            dumpLsraAllocationEvent(LSRA_EVENT_RESTORE_PREVIOUS_INTERVAL_AFTER_SPILL, regRec->assignedInterval,
                                    thisRegNum);
        }
        else
        {
            dumpLsraAllocationEvent(LSRA_EVENT_RESTORE_PREVIOUS_INTERVAL, regRec->assignedInterval, thisRegNum);
        }
#endif // DEBUG
    }
    else
    {
        updateAssignedInterval(regRec, nullptr, assignedInterval->registerType);
        updatePreviousInterval(regRec, nullptr, assignedInterval->registerType);
    }
}
//------------------------------------------------------------------------
// spillGCRefs: Spill any GC-type intervals that are currently in registers.
//
// Arguments:
//    killRefPosition - The RefPosition for the kill
//
// Return Value:
//    None.
//
void LinearScan::spillGCRefs(RefPosition* killRefPosition)
{
    // For each physical register that can hold a GC type,
    // if it is occupied by an interval of a GC type, spill that interval.
    regMaskTP candidateRegs = killRefPosition->registerAssignment;
    while (candidateRegs != RBM_NONE)
    {
        regMaskTP nextRegBit = genFindLowestBit(candidateRegs);
        candidateRegs &= ~nextRegBit;
        regNumber  nextReg          = genRegNumFromMask(nextRegBit);
        RegRecord* regRecord        = getRegisterRecord(nextReg);
        Interval*  assignedInterval = regRecord->assignedInterval;
        if (assignedInterval == nullptr || (assignedInterval->isActive == false) ||
            !varTypeIsGC(assignedInterval->registerType))
        {
            continue;
        }
        unassignPhysReg(regRecord, assignedInterval->recentRefPosition);
    }
    INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_DONE_KILL_GC_REFS, nullptr, REG_NA, nullptr));
}
//------------------------------------------------------------------------
// processBlockEndAllocation: Update var locations after 'currentBlock' has been allocated
//
// Arguments:
//    currentBlock - the BasicBlock we have just finished allocating registers for
//
// Return Value:
//    None.
//
// Notes:
//    Calls processBlockEndLocations() to set the outVarToRegMap, then gets the next block,
//    and sets the inVarToRegMap appropriately.
//
void LinearScan::processBlockEndAllocation(BasicBlock* currentBlock)
{
    assert(currentBlock != nullptr);
    if (enregisterLocalVars)
    {
        processBlockEndLocations(currentBlock);
    }
    markBlockVisited(currentBlock);

    // Get the next block to allocate.
    // When the last block in the method has successors, there will be a final "RefTypeBB" to
    // ensure that we get the varToRegMap set appropriately, but in that case we don't need
    // to worry about "nextBlock".
    BasicBlock* nextBlock = getNextBlock();
    if (nextBlock != nullptr)
    {
        processBlockStartLocations(nextBlock, true);
    }
}
//------------------------------------------------------------------------
// rotateBlockStartLocation: When in the LSRA_BLOCK_BOUNDARY_ROTATE stress mode, attempt to
//                           "rotate" the register assignment for a localVar to the next higher
//                           register that is available.
//
// Arguments:
//    interval      - the Interval for the variable whose register is getting rotated
//    targetReg     - its register assignment from the predecessor block being used for live-in
//    availableRegs - registers available for use
//
// Return Value:
//    The new register to use.
//
#ifdef DEBUG
regNumber LinearScan::rotateBlockStartLocation(Interval* interval, regNumber targetReg, regMaskTP availableRegs)
{
    if (targetReg != REG_STK && getLsraBlockBoundaryLocations() == LSRA_BLOCK_BOUNDARY_ROTATE)
    {
        // If we're rotating the register locations at block boundaries, try to use
        // the next higher register number of the appropriate register type.
        regMaskTP candidateRegs = allRegs(interval->registerType) & availableRegs;
        regNumber firstReg      = REG_NA;
        regNumber newReg        = REG_NA;
        while (candidateRegs != RBM_NONE)
        {
            regMaskTP nextRegBit = genFindLowestBit(candidateRegs);
            candidateRegs &= ~nextRegBit;
            regNumber nextReg = genRegNumFromMask(nextRegBit);
            if (nextReg > targetReg)
            {
                newReg = nextReg;
                break;
            }
            else if (firstReg == REG_NA)
            {
                firstReg = nextReg;
            }
        }
        if (newReg == REG_NA)
        {
            assert(firstReg != REG_NA);
            newReg = firstReg;
        }
        targetReg = newReg;
    }
    return targetReg;
}
#endif // DEBUG
//--------------------------------------------------------------------------------------
// isSecondHalfReg: Test if regRec is the second half of a double register
//                  which is assigned to an interval.
//
// Arguments:
//    regRec   - a register to be tested
//    interval - an interval which is assigned to some register
//
// Return Value:
//    True only if regRec is the second half of the assignedReg in interval
//
bool LinearScan::isSecondHalfReg(RegRecord* regRec, Interval* interval)
{
    RegRecord* assignedReg = interval->assignedReg;

    if (assignedReg != nullptr && interval->registerType == TYP_DOUBLE)
    {
        // The interval should have been allocated to a valid double register
        assert(genIsValidDoubleReg(assignedReg->regNum));

        // Find the second half RegRecord of the double register
        regNumber firstRegNum  = assignedReg->regNum;
        regNumber secondRegNum = REG_NEXT(firstRegNum);

        assert(genIsValidFloatReg(secondRegNum) && !genIsValidDoubleReg(secondRegNum));

        RegRecord* secondRegRec = getRegisterRecord(secondRegNum);

        return secondRegRec == regRec;
    }

    return false;
}
//------------------------------------------------------------------------------------------
// getSecondHalfRegRec: Get the second (odd) half of an ARM32 double register
//
// Arguments:
//    regRec - A float RegRecord
//
// Assumptions:
//    regRec must be a valid double register (i.e. even)
//
// Return Value:
//    The RegRecord for the second half of the double register
//
RegRecord* LinearScan::getSecondHalfRegRec(RegRecord* regRec)
{
    regNumber  secondHalfRegNum;
    RegRecord* secondHalfRegRec;

    assert(genIsValidDoubleReg(regRec->regNum));

    secondHalfRegNum = REG_NEXT(regRec->regNum);
    secondHalfRegRec = getRegisterRecord(secondHalfRegNum);

    return secondHalfRegRec;
}
//------------------------------------------------------------------------------------------
// findAnotherHalfRegRec: Find the other half RegRecord which forms the same ARM32 double register
//
// Arguments:
//    regRec - A float RegRecord
//
// Return Value:
//    The RegRecord that forms the same double register as regRec
//
RegRecord* LinearScan::findAnotherHalfRegRec(RegRecord* regRec)
{
    regNumber  anotherHalfRegNum;
    RegRecord* anotherHalfRegRec;

    assert(genIsValidFloatReg(regRec->regNum));

    // Find the other half register for a TYP_DOUBLE interval,
    // following the same logic as in canRestorePreviousInterval().
    if (genIsValidDoubleReg(regRec->regNum))
    {
        anotherHalfRegNum = REG_NEXT(regRec->regNum);
        assert(!genIsValidDoubleReg(anotherHalfRegNum));
    }
    else
    {
        anotherHalfRegNum = REG_PREV(regRec->regNum);
        assert(genIsValidDoubleReg(anotherHalfRegNum));
    }
    anotherHalfRegRec = getRegisterRecord(anotherHalfRegNum);

    return anotherHalfRegRec;
}
//--------------------------------------------------------------------------------------
// canRestorePreviousInterval: Test if we can restore the previous interval
//
// Arguments:
//    regRec           - a register which contains the previous interval to be restored
//    assignedInterval - an interval just unassigned
//
// Return Value:
//    True only if the previous interval of regRec can be restored
//
bool LinearScan::canRestorePreviousInterval(RegRecord* regRec, Interval* assignedInterval)
{
    bool retVal =
        (regRec->previousInterval != nullptr && regRec->previousInterval != assignedInterval &&
         regRec->previousInterval->assignedReg == regRec && regRec->previousInterval->getNextRefPosition() != nullptr);

#ifdef _TARGET_ARM_
    if (retVal && regRec->previousInterval->registerType == TYP_DOUBLE)
    {
        RegRecord* anotherHalfRegRec = findAnotherHalfRegRec(regRec);

        retVal = retVal && anotherHalfRegRec->assignedInterval == nullptr;
    }
#endif // _TARGET_ARM_

    return retVal;
}
bool LinearScan::isAssignedToInterval(Interval* interval, RegRecord* regRec)
{
    bool isAssigned = (interval->assignedReg == regRec);
#ifdef _TARGET_ARM_
    isAssigned |= isSecondHalfReg(regRec, interval);
#endif // _TARGET_ARM_
    return isAssigned;
}
void LinearScan::unassignIntervalBlockStart(RegRecord* regRecord, VarToRegMap inVarToRegMap)
{
    // Is there another interval currently assigned to this register? If so unassign it.
    Interval* assignedInterval = regRecord->assignedInterval;
    if (assignedInterval != nullptr)
    {
        if (isAssignedToInterval(assignedInterval, regRecord))
        {
            // Only localVars or constants should be assigned to registers at block boundaries.
            if (!assignedInterval->isLocalVar)
            {
                assert(assignedInterval->isConstant);
                // Don't need to update the VarToRegMap.
                inVarToRegMap = nullptr;
            }

            regNumber assignedRegNum = assignedInterval->assignedReg->regNum;

            // If the interval is active, it will be set to active when we reach its new
            // register assignment (which we must not yet have done, or it wouldn't still be
            // assigned to this register).
            assignedInterval->isActive = false;
            unassignPhysReg(assignedInterval->assignedReg, nullptr);
            if ((inVarToRegMap != nullptr) && inVarToRegMap[assignedInterval->getVarIndex(compiler)] == assignedRegNum)
            {
                inVarToRegMap[assignedInterval->getVarIndex(compiler)] = REG_STK;
            }
        }
        else
        {
            // This interval is no longer assigned to this register.
            updateAssignedInterval(regRecord, nullptr, assignedInterval->registerType);
        }
    }
}
//------------------------------------------------------------------------
// processBlockStartLocations: Update var locations on entry to 'currentBlock' and clear constant
//                             registers.
//
// Arguments:
//    currentBlock   - the BasicBlock we are about to allocate registers for
//    allocationPass - true if we are currently allocating registers (versus writing them back)
//
// Return Value:
//    None.
//
// Notes:
//    During the allocation pass, we use the outVarToRegMap of the selected predecessor to
//    determine the lclVar locations for the inVarToRegMap.
//    During the resolution (write-back) pass, we only modify the inVarToRegMap in cases where
//    a lclVar was spilled after the block had been completed.
//
void LinearScan::processBlockStartLocations(BasicBlock* currentBlock, bool allocationPass)
{
    // If we have no register candidates we should only call this method during allocation.
    assert(enregisterLocalVars || allocationPass);

    if (!enregisterLocalVars)
    {
        // Just clear any constant registers and return.
        for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
        {
            RegRecord* physRegRecord    = getRegisterRecord(reg);
            Interval*  assignedInterval = physRegRecord->assignedInterval;

            if (assignedInterval != nullptr)
            {
                assert(assignedInterval->isConstant);
                physRegRecord->assignedInterval = nullptr;
            }
        }
        INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_START_BB, nullptr, REG_NA, currentBlock));
        return;
    }

    unsigned    predBBNum         = blockInfo[currentBlock->bbNum].predBBNum;
    VarToRegMap predVarToRegMap   = getOutVarToRegMap(predBBNum);
    VarToRegMap inVarToRegMap     = getInVarToRegMap(currentBlock->bbNum);
    bool        hasCriticalInEdge = blockInfo[currentBlock->bbNum].hasCriticalInEdge;

    VarSetOps::AssignNoCopy(compiler, currentLiveVars,
                            VarSetOps::Intersection(compiler, registerCandidateVars, currentBlock->bbLiveIn));
#ifdef DEBUG
    if (getLsraExtendLifeTimes())
    {
        VarSetOps::AssignNoCopy(compiler, currentLiveVars, registerCandidateVars);
    }
    // If we are rotating register assignments at block boundaries, we want to make the
    // inactive registers available for the rotation.
    regMaskTP inactiveRegs = RBM_NONE;
#endif // DEBUG
    regMaskTP       liveRegs = RBM_NONE;
    VarSetOps::Iter iter(compiler, currentLiveVars);
    unsigned        varIndex = 0;
    while (iter.NextElem(&varIndex))
    {
        unsigned varNum = compiler->lvaTrackedToVarNum[varIndex];
        if (!compiler->lvaTable[varNum].lvLRACandidate)
        {
            continue;
        }
        regNumber    targetReg;
        Interval*    interval        = getIntervalForLocalVar(varIndex);
        RefPosition* nextRefPosition = interval->getNextRefPosition();
        assert(nextRefPosition != nullptr);

        if (allocationPass)
        {
            targetReg = getVarReg(predVarToRegMap, varIndex);
#ifdef DEBUG
            regNumber newTargetReg = rotateBlockStartLocation(interval, targetReg, (~liveRegs | inactiveRegs));
            if (newTargetReg != targetReg)
            {
                targetReg = newTargetReg;
                setIntervalAsSplit(interval);
            }
#endif // DEBUG
            setVarReg(inVarToRegMap, varIndex, targetReg);
        }
        else // !allocationPass (i.e. resolution/write-back pass)
        {
            targetReg = getVarReg(inVarToRegMap, varIndex);
            // There are four cases that we need to consider during the resolution pass:
            // 1. This variable had a register allocated initially, and it was not spilled in the RefPosition
            //    that feeds this block. In this case, both targetReg and predVarToRegMap[varIndex] will be targetReg.
            // 2. This variable had not been spilled prior to the end of predBB, but was later spilled, so
            //    predVarToRegMap[varIndex] will be REG_STK, but targetReg is its former allocated value.
            //    In this case, we will normally change it to REG_STK. We will update its "spilled" status when we
            //    encounter it in resolveLocalRef().
            // 2a. If the next RefPosition is marked as a copyReg, we need to retain the allocated register. This is
            //     because the copyReg RefPosition will not have recorded the "home" register, yet downstream
            //     RefPositions rely on the correct "home" register.
            // 3. This variable was spilled before we reached the end of predBB. In this case, both targetReg and
            //    predVarToRegMap[varIndex] will be REG_STK, and the next RefPosition will have been marked
            //    as reload during allocation time if necessary (note that by the time we actually reach the next
            //    RefPosition, we may be using a different predecessor, at which point it is still in a register).
            // 4. This variable was spilled during the allocation of this block, so targetReg is REG_STK
            //    (because we set inVarToRegMap at the time we spilled it), but predVarToRegMap[varIndex]
            //    is not REG_STK. We retain the REG_STK value in the inVarToRegMap.
            if (targetReg != REG_STK)
            {
                if (getVarReg(predVarToRegMap, varIndex) != REG_STK)
                {
                    // Case #1 above.
                    assert(getVarReg(predVarToRegMap, varIndex) == targetReg ||
                           getLsraBlockBoundaryLocations() == LSRA_BLOCK_BOUNDARY_ROTATE);
                }
                else if (!nextRefPosition->copyReg)
                {
                    // Case #2 above.
                    setVarReg(inVarToRegMap, varIndex, REG_STK);
                    targetReg = REG_STK;
                }
                // Else case 2a. - retain targetReg.
            }
            // Else case #3 or #4, we retain targetReg and nothing further to do or assert.
        }
        if (interval->physReg == targetReg)
        {
            if (interval->isActive)
            {
                assert(targetReg != REG_STK);
                assert(interval->assignedReg != nullptr && interval->assignedReg->regNum == targetReg &&
                       interval->assignedReg->assignedInterval == interval);
                liveRegs |= genRegMask(targetReg);
                continue;
            }
        }
        else if (interval->physReg != REG_NA)
        {
            // This can happen if we are using the locations from a basic block other than the
            // immediately preceding one - where the variable was in a different location.
            if (targetReg != REG_STK)
            {
                // Unassign it from the register (it will get a new register below).
                if (interval->assignedReg != nullptr && interval->assignedReg->assignedInterval == interval)
                {
                    interval->isActive = false;
                    unassignPhysReg(getRegisterRecord(interval->physReg), nullptr);
                }
                else
                {
                    // This interval was live in this register the last time we saw a reference to it,
                    // but has since been displaced.
                    interval->physReg = REG_NA;
                }
            }
            else if (allocationPass)
            {
                // Keep the register assignment - if another var has it, it will get unassigned.
                // Otherwise, resolution will fix it up later, and it will be more
                // likely to match other assignments this way.
                interval->isActive = true;
                liveRegs |= genRegMask(interval->physReg);
                INDEBUG(inactiveRegs |= genRegMask(interval->physReg));
                setVarReg(inVarToRegMap, varIndex, interval->physReg);
            }
            else
            {
                interval->physReg = REG_NA;
            }
        }
        if (targetReg != REG_STK)
        {
            RegRecord* targetRegRecord = getRegisterRecord(targetReg);
            liveRegs |= genRegMask(targetReg);
            if (!interval->isActive)
            {
                interval->isActive    = true;
                interval->physReg     = targetReg;
                interval->assignedReg = targetRegRecord;
            }
            if (targetRegRecord->assignedInterval != interval)
            {
#ifdef _TARGET_ARM_
                // If this is a TYP_DOUBLE interval, and the assigned interval is either null or is TYP_FLOAT,
                // we also need to unassign the other half of the register.
                // Note that if the assigned interval is TYP_DOUBLE, it will be unassigned below.
                if ((interval->registerType == TYP_DOUBLE) &&
                    ((targetRegRecord->assignedInterval == nullptr) ||
                     (targetRegRecord->assignedInterval->registerType == TYP_FLOAT)))
                {
                    assert(genIsValidDoubleReg(targetReg));
                    unassignIntervalBlockStart(findAnotherHalfRegRec(targetRegRecord),
                                               allocationPass ? inVarToRegMap : nullptr);
                }
#endif // _TARGET_ARM_
                unassignIntervalBlockStart(targetRegRecord, allocationPass ? inVarToRegMap : nullptr);
                assignPhysReg(targetRegRecord, interval);
            }
            if (interval->recentRefPosition != nullptr && !interval->recentRefPosition->copyReg &&
                interval->recentRefPosition->registerAssignment != genRegMask(targetReg))
            {
                interval->getNextRefPosition()->outOfOrder = true;
            }
        }
    }

    // Unassign any registers that are no longer live.
    for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
    {
        if ((liveRegs & genRegMask(reg)) == 0)
        {
            RegRecord* physRegRecord    = getRegisterRecord(reg);
            Interval*  assignedInterval = physRegRecord->assignedInterval;

            if (assignedInterval != nullptr)
            {
                assert(assignedInterval->isLocalVar || assignedInterval->isConstant);

                if (!assignedInterval->isConstant && assignedInterval->assignedReg == physRegRecord)
                {
                    assignedInterval->isActive = false;
                    if (assignedInterval->getNextRefPosition() == nullptr)
                    {
                        unassignPhysReg(physRegRecord, nullptr);
                    }
                    inVarToRegMap[assignedInterval->getVarIndex(compiler)] = REG_STK;
                }
                else
                {
                    // This interval may still be active, but was in another register in an
                    // intervening block.
                    updateAssignedInterval(physRegRecord, nullptr, assignedInterval->registerType);
                }

#ifdef _TARGET_ARM_
                // unassignPhysReg, above, may have restored a 'previousInterval', in which case we need to
                // get the value of 'physRegRecord->assignedInterval' rather than using 'assignedInterval'.
                if (physRegRecord->assignedInterval != nullptr)
                {
                    assignedInterval = physRegRecord->assignedInterval;
                }
                if (assignedInterval->registerType == TYP_DOUBLE)
                {
                    // Skip the next float register, because we already addressed a double register
                    assert(genIsValidDoubleReg(reg));
                    reg = REG_NEXT(reg);
                }
#endif // _TARGET_ARM_
            }
        }
#ifdef _TARGET_ARM_
        else
        {
            RegRecord* physRegRecord    = getRegisterRecord(reg);
            Interval*  assignedInterval = physRegRecord->assignedInterval;

            if (assignedInterval != nullptr && assignedInterval->registerType == TYP_DOUBLE)
            {
                // Skip the next float register, because we already addressed a double register
                assert(genIsValidDoubleReg(reg));
                reg = REG_NEXT(reg);
            }
        }
#endif // _TARGET_ARM_
    }
    INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_START_BB, nullptr, REG_NA, currentBlock));
}
5026 //------------------------------------------------------------------------
5027 // processBlockEndLocations: Record the variables occupying registers after completing the current block.
5030 // currentBlock - the block we have just completed.
5036 // This must be called both during the allocation and resolution (write-back) phases.
5037 // This is because we need to have the outVarToRegMap locations in order to set the locations
5038 // at successor blocks during allocation time, but if lclVars are spilled after a block has been
5039 // completed, we need to record the REG_STK location for those variables at resolution time.
5041 void LinearScan::processBlockEndLocations(BasicBlock* currentBlock)
5043 assert(currentBlock != nullptr && currentBlock->bbNum == curBBNum);
5044 VarToRegMap outVarToRegMap = getOutVarToRegMap(curBBNum);
5046 VarSetOps::AssignNoCopy(compiler, currentLiveVars,
5047 VarSetOps::Intersection(compiler, registerCandidateVars, currentBlock->bbLiveOut));
5049 if (getLsraExtendLifeTimes())
5051 VarSetOps::Assign(compiler, currentLiveVars, registerCandidateVars);
5054 regMaskTP liveRegs = RBM_NONE;
5055 VarSetOps::Iter iter(compiler, currentLiveVars);
5056 unsigned varIndex = 0;
5057 while (iter.NextElem(&varIndex))
5059 Interval* interval = getIntervalForLocalVar(varIndex);
5060 if (interval->isActive)
5062 assert(interval->physReg != REG_NA && interval->physReg != REG_STK);
5063 setVarReg(outVarToRegMap, varIndex, interval->physReg);
5067 outVarToRegMap[varIndex] = REG_STK;
5070 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_END_BB));
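    // Illustrative example (hypothetical variables and values, not from this code):
    // if a block ends with tracked var V01 still active in REG_ECX and V02 spilled,
    // the loop above records, roughly:
    //
    //     outVarToRegMap[v01Index] == REG_ECX   // still enregistered
    //     outVarToRegMap[v02Index] == REG_STK   // on the stack
    //
    // Resolution later compares each edge's outVarToRegMap against the successor's
    // inVarToRegMap to decide where moves or reloads are required.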
5074 void LinearScan::dumpRefPositions(const char* str)
5076 printf("------------\n");
5077 printf("REFPOSITIONS %s: \n", str);
5078 printf("------------\n");
5079 for (RefPosition& refPos : refPositions)
5086 bool LinearScan::registerIsFree(regNumber regNum, RegisterType regType)
5088 RegRecord* physRegRecord = getRegisterRecord(regNum);
5090 bool isFree = physRegRecord->isFree();
5093 if (isFree && regType == TYP_DOUBLE)
5095 isFree = getSecondHalfRegRec(physRegRecord)->isFree();
5097 #endif // _TARGET_ARM_
5102 // isMultiRegRelated: is this RefPosition defining part of a multi-reg value
5103 // at the given location?
5105 bool LinearScan::isMultiRegRelated(RefPosition* refPosition, LsraLocation location)
5107 #ifdef FEATURE_MULTIREG_ARGS_OR_RET
5108 return ((refPosition->nodeLocation == location) && refPosition->getInterval()->isMultiReg);
5114 //------------------------------------------------------------------------
5115 // LinearScan::freeRegister: Make a register available for use
5118 // physRegRecord - the RegRecord for the register to be freed.
5125 // It may be that the RegRecord has already been freed, e.g. due to a kill,
5126 // in which case this method has no effect.
5129 // If there is currently an Interval assigned to this register, and it has
5130 // more references (i.e. this is a local last-use, but more uses and/or
5131 // defs remain), it will remain assigned to the physRegRecord. However, since
5132 // it is marked inactive, the register will be available, albeit less desirable to allocate.
5134 void LinearScan::freeRegister(RegRecord* physRegRecord)
5136 Interval* assignedInterval = physRegRecord->assignedInterval;
5137 // It may have already been freed by a "Kill"
5138 if (assignedInterval != nullptr)
5140 assignedInterval->isActive = false;
5141 // If this is a constant interval (e.g. a constant node that we may encounter again),
5142 // don't unassign it until we need the register.
5143 if (!assignedInterval->isConstant)
5145 RefPosition* nextRefPosition = assignedInterval->getNextRefPosition();
5146 // Unassign the register only if there are no more RefPositions, or the next
5147 // one is a def. Note that the latter condition doesn't actually ensure that
5148 // there aren't subsequent uses that could be reached by a def in the assigned
5149 // register, but is merely a heuristic to avoid tying up the register (or using
5150 // it when it's non-optimal). A better alternative would be to use SSA, so that
5151 // we wouldn't unnecessarily link separate live ranges to the same register.
5152 if (nextRefPosition == nullptr || RefTypeIsDef(nextRefPosition->refType))
5155 assert((assignedInterval->registerType != TYP_DOUBLE) || genIsValidDoubleReg(physRegRecord->regNum));
5156 #endif // _TARGET_ARM_
5157 unassignPhysReg(physRegRecord, nullptr);
5163 void LinearScan::freeRegisters(regMaskTP regsToFree)
5165 if (regsToFree == RBM_NONE)
5170 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_FREE_REGS));
5171 while (regsToFree != RBM_NONE)
5173 regMaskTP nextRegBit = genFindLowestBit(regsToFree);
5174 regsToFree &= ~nextRegBit;
5175 regNumber nextReg = genRegNumFromMask(nextRegBit);
5176 freeRegister(getRegisterRecord(nextReg));
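    // Illustrative sketch (not part of the JIT): the loop above peels one register
    // at a time off the mask by isolating the lowest set bit, which is what
    // genFindLowestBit does for a regMaskTP. With a plain 64-bit mask the same
    // pattern looks like:
    //
    //     uint64_t mask = regsToFree;
    //     while (mask != 0)
    //     {
    //         uint64_t lowBit = mask & (0 - mask); // lowest set bit (two's complement)
    //         mask &= ~lowBit;                     // remove it from the mask
    //         // ... map lowBit back to a regNumber and free that RegRecord ...
    //     }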
5180 // Actual register allocation, accomplished by iterating over all of the previously
5181 // constructed Intervals
5182 // Loosely based on raAssignVars()
5184 void LinearScan::allocateRegisters()
5186 JITDUMP("*************** In LinearScan::allocateRegisters()\n");
5187 DBEXEC(VERBOSE, lsraDumpIntervals("before allocateRegisters"));
5189 // at start, nothing is active except for register args
5190 for (Interval& interval : intervals)
5192 Interval* currentInterval = &interval;
5193 currentInterval->recentRefPosition = nullptr;
5194 currentInterval->isActive = false;
5195 if (currentInterval->isLocalVar)
5197 LclVarDsc* varDsc = currentInterval->getLocalVar(compiler);
5198 if (varDsc->lvIsRegArg && currentInterval->firstRefPosition != nullptr)
5200 currentInterval->isActive = true;
5205 for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
5207 getRegisterRecord(reg)->recentRefPosition = nullptr;
5208 getRegisterRecord(reg)->isActive = false;
5212 regNumber lastAllocatedReg = REG_NA;
5215 dumpRefPositions("BEFORE ALLOCATION");
5216 dumpVarRefPositions("BEFORE ALLOCATION");
5218 printf("\n\nAllocating Registers\n"
5219 "--------------------\n");
5220 // Start with a small set of commonly used registers, so that we don't keep having to print a new title.
5221 registersToDump = LsraLimitSmallIntSet | LsraLimitSmallFPSet;
5222 dumpRegRecordHeader();
5223 // Now print an empty "RefPosition", since we complete the dump of the regs at the beginning of the loop.
5224 printf(indentFormat, "");
5228 BasicBlock* currentBlock = nullptr;
5230 LsraLocation prevLocation = MinLocation;
5231 regMaskTP regsToFree = RBM_NONE;
5232 regMaskTP delayRegsToFree = RBM_NONE;
5234 // This is the most recent RefPosition for which a register was allocated
5235 // - currently only used for DEBUG but maintained in non-debug, for clarity of code
5236 // (and will be optimized away because in non-debug spillAlways() unconditionally returns false)
5237 RefPosition* lastAllocatedRefPosition = nullptr;
5239 bool handledBlockEnd = false;
5241 for (RefPosition& refPositionIterator : refPositions)
5243 RefPosition* currentRefPosition = &refPositionIterator;
5246 // Set the activeRefPosition to null until we're done with any boundary handling.
5247 activeRefPosition = nullptr;
5250 // We're really dumping the RegRecords "after" the previous RefPosition, but it's more convenient
5251 // to do this here, since there are a number of "continue"s in this loop.
5256 // This is the previousRefPosition of the current Referent, if any
5257 RefPosition* previousRefPosition = nullptr;
5259 Interval* currentInterval = nullptr;
5260 Referenceable* currentReferent = nullptr;
5261 bool isInternalRef = false;
5262 RefType refType = currentRefPosition->refType;
5264 currentReferent = currentRefPosition->referent;
5266 if (spillAlways() && lastAllocatedRefPosition != nullptr && !lastAllocatedRefPosition->isPhysRegRef &&
5267 !lastAllocatedRefPosition->getInterval()->isInternal &&
5268 (RefTypeIsDef(lastAllocatedRefPosition->refType) || lastAllocatedRefPosition->getInterval()->isLocalVar))
5270 assert(lastAllocatedRefPosition->registerAssignment != RBM_NONE);
5271 RegRecord* regRecord = lastAllocatedRefPosition->getInterval()->assignedReg;
5272 unassignPhysReg(regRecord, lastAllocatedRefPosition);
5273 // Now set lastAllocatedRefPosition to null, so that we don't try to spill it again
5274 lastAllocatedRefPosition = nullptr;
5277 // We wait to free any registers until we've completed all the
5278 // uses for the current node.
5279 // This avoids reusing registers too soon.
5280 // We free before the last true def (after all the uses & internal
5281 // registers), and then again at the beginning of the next node.
5282 // This is made easier by assigning two LsraLocations per node - one
5283 // for all the uses, internal registers & all but the last def, and
5284 // another for the final def (if any).
5286 LsraLocation currentLocation = currentRefPosition->nodeLocation;
5288 if ((regsToFree | delayRegsToFree) != RBM_NONE)
5290 // Free at a new location, or at a basic block boundary
5291 if (refType == RefTypeBB)
5293 assert(currentLocation > prevLocation);
5295 if (currentLocation > prevLocation)
5297 freeRegisters(regsToFree);
5298 if ((currentLocation > (prevLocation + 1)) && (delayRegsToFree != RBM_NONE))
5300 // We should never see a delayReg that is delayed until a Location that has no RefPosition
5301 // (that would be the RefPosition that it was supposed to interfere with).
5302 assert(!"Found a delayRegFree associated with Location with no reference");
5303 // However, to be cautious for the Release build case, we will free them.
5304 freeRegisters(delayRegsToFree);
5305 delayRegsToFree = RBM_NONE;
5307 regsToFree = delayRegsToFree;
5308 delayRegsToFree = RBM_NONE;
5311 prevLocation = currentLocation;
5313 // get previous refposition, then current refpos is the new previous
5314 if (currentReferent != nullptr)
5316 previousRefPosition = currentReferent->recentRefPosition;
5317 currentReferent->recentRefPosition = currentRefPosition;
5321 assert((refType == RefTypeBB) || (refType == RefTypeKillGCRefs));
5325 activeRefPosition = currentRefPosition;
5328 // For the purposes of register resolution, we handle the DummyDefs before
5329 // the block boundary - so the RefTypeBB is after all the DummyDefs.
5330 // However, for the purposes of allocation, we want to handle the block
5331 // boundary first, so that we can free any registers occupied by lclVars
5332 // that aren't live in the next block and make them available for the
5335 if (!handledBlockEnd && (refType == RefTypeBB || refType == RefTypeDummyDef))
5337 // Free any delayed regs (now in regsToFree) before processing the block boundary
5338 freeRegisters(regsToFree);
5339 regsToFree = RBM_NONE;
5340 handledBlockEnd = true;
5341 curBBStartLocation = currentRefPosition->nodeLocation;
5342 if (currentBlock == nullptr)
5344 currentBlock = startBlockSequence();
5345 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_START_BB, nullptr, REG_NA, compiler->fgFirstBB));
5349 processBlockEndAllocation(currentBlock);
5350 currentBlock = moveToNextBlock();
5354 if (refType == RefTypeBB)
5356 handledBlockEnd = false;
5360 if (refType == RefTypeKillGCRefs)
5362 spillGCRefs(currentRefPosition);
5366 // If this is a FixedReg, disassociate any inactive constant interval from this register.
5367 // Otherwise, do nothing.
5368 if (refType == RefTypeFixedReg)
5370 RegRecord* regRecord = currentRefPosition->getReg();
5371 Interval* assignedInterval = regRecord->assignedInterval;
5373 if (assignedInterval != nullptr && !assignedInterval->isActive && assignedInterval->isConstant)
5375 regRecord->assignedInterval = nullptr;
5378 // Update overlapping floating point register for TYP_DOUBLE
5379 if (assignedInterval->registerType == TYP_DOUBLE)
5381 regRecord = findAnotherHalfRegRec(regRecord);
5382 assert(regRecord->assignedInterval == assignedInterval);
5383 regRecord->assignedInterval = nullptr;
5387 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_FIXED_REG, nullptr, currentRefPosition->assignedReg()));
5391 // If this is an exposed use, do nothing - this is merely a placeholder to attempt to
5392 // ensure that a register is allocated for the full lifetime. The resolution logic
5393 // will take care of moving to the appropriate register if needed.
5395 if (refType == RefTypeExpUse)
5397 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_EXP_USE));
5401 regNumber assignedRegister = REG_NA;
5403 if (currentRefPosition->isIntervalRef())
5405 currentInterval = currentRefPosition->getInterval();
5406 assignedRegister = currentInterval->physReg;
5408 // Identify the special cases where we decide up-front not to allocate
5409 bool allocate = true;
5410 bool didDump = false;
5412 if (refType == RefTypeParamDef || refType == RefTypeZeroInit)
5414 // For a ParamDef with a weighted refCount less than unity, don't enregister it at entry.
5415 // TODO-CQ: Consider doing this only for stack parameters, since otherwise we may be needlessly
5416 // inserting a store.
5417 LclVarDsc* varDsc = currentInterval->getLocalVar(compiler);
5418 assert(varDsc != nullptr);
5419 if (refType == RefTypeParamDef && varDsc->lvRefCntWtd <= BB_UNITY_WEIGHT)
5421 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_NO_ENTRY_REG_ALLOCATED, currentInterval));
5424 setIntervalAsSpilled(currentInterval);
5426 // If it has no actual references, mark it as "lastUse"; since it's not actually part
5427 // of any flow it won't have been marked during dataflow. Otherwise, if we allocate a
5428 // register we won't unassign it.
5429 else if (currentRefPosition->nextRefPosition == nullptr)
5431 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_ZERO_REF, currentInterval));
5432 currentRefPosition->lastUse = true;
5436 else if (refType == RefTypeUpperVectorSaveDef || refType == RefTypeUpperVectorSaveUse)
5438 Interval* lclVarInterval = currentInterval->relatedInterval;
5439 if (lclVarInterval->physReg == REG_NA)
5444 #endif // FEATURE_SIMD
5446 if (allocate == false)
5448 if (assignedRegister != REG_NA)
5450 unassignPhysReg(getRegisterRecord(assignedRegister), currentRefPosition);
5454 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_NO_REG_ALLOCATED, currentInterval));
5457 currentRefPosition->registerAssignment = RBM_NONE;
5461 if (currentInterval->isSpecialPutArg)
5463 assert(!currentInterval->isLocalVar);
5464 Interval* srcInterval = currentInterval->relatedInterval;
5465 assert(srcInterval->isLocalVar);
5466 if (refType == RefTypeDef)
5468 assert(srcInterval->recentRefPosition->nodeLocation == currentLocation - 1);
5469 RegRecord* physRegRecord = srcInterval->assignedReg;
5471 // For a putarg_reg to be special, its next use location has to be the same
5472 // as fixed reg's next kill location. Otherwise, if source lcl var's next use
5473 // is after the kill of fixed reg but before putarg_reg's next use, fixed reg's
5474 // kill would lead to spill of the source but not the putarg_reg if it were treated as special.
5476 if (srcInterval->isActive &&
5477 genRegMask(srcInterval->physReg) == currentRefPosition->registerAssignment &&
5478 currentInterval->getNextRefLocation() == physRegRecord->getNextRefLocation())
5480 assert(physRegRecord->regNum == srcInterval->physReg);
5482 // A special putarg_reg acts as a pass-through, since both the source lcl var
5483 // and the putarg_reg have the same register allocated. The physical reg
5484 // record continues to point to the source lcl var's interval
5485 // instead of to putarg_reg's interval. So if the register
5486 // allocated to the source lcl var were spilled and reallocated to another
5487 // tree node before its use at the call node, the lcl var
5488 // would be spilled instead of the putarg_reg, since the physical reg record
5489 // points to the lcl var's interval. As a result, the arg reg would get trashed,
5490 // leading to bad codegen. The assumption here is that the source lcl var of a
5491 // special putarg_reg doesn't get spilled and re-allocated prior to
5492 // its use at the call node. This is ensured by marking the physical reg
5493 // record as busy until the next kill.
5494 physRegRecord->isBusyUntilNextKill = true;
5498 currentInterval->isSpecialPutArg = false;
5501 // If this is still a SpecialPutArg, continue;
5502 if (currentInterval->isSpecialPutArg)
5504 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_SPECIAL_PUTARG, currentInterval,
5505 currentRefPosition->assignedReg()));
5510 if (assignedRegister == REG_NA && RefTypeIsUse(refType))
5512 currentRefPosition->reload = true;
5513 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_RELOAD, currentInterval, assignedRegister));
5517 regMaskTP assignedRegBit = RBM_NONE;
5518 bool isInRegister = false;
5519 if (assignedRegister != REG_NA)
5521 isInRegister = true;
5522 assignedRegBit = genRegMask(assignedRegister);
5523 if (!currentInterval->isActive)
5525 // If this is a use, it must have started the block on the stack, but the register
5526 // was available for use so we kept the association.
5527 if (RefTypeIsUse(refType))
5529 assert(enregisterLocalVars);
5530 assert(inVarToRegMaps[curBBNum][currentInterval->getVarIndex(compiler)] == REG_STK &&
5531 previousRefPosition->nodeLocation <= curBBStartLocation);
5532 isInRegister = false;
5536 currentInterval->isActive = true;
5539 assert(currentInterval->assignedReg != nullptr &&
5540 currentInterval->assignedReg->regNum == assignedRegister &&
5541 currentInterval->assignedReg->assignedInterval == currentInterval);
5544 // If this is a physical register, we unconditionally assign it to itself!
5545 if (currentRefPosition->isPhysRegRef)
5547 RegRecord* currentReg = currentRefPosition->getReg();
5548 Interval* assignedInterval = currentReg->assignedInterval;
5550 if (assignedInterval != nullptr)
5552 unassignPhysReg(currentReg, assignedInterval->recentRefPosition);
5554 currentReg->isActive = true;
5555 assignedRegister = currentReg->regNum;
5556 assignedRegBit = genRegMask(assignedRegister);
5557 if (refType == RefTypeKill)
5559 currentReg->isBusyUntilNextKill = false;
5562 else if (previousRefPosition != nullptr)
5564 assert(previousRefPosition->nextRefPosition == currentRefPosition);
5565 assert(assignedRegister == REG_NA || assignedRegBit == previousRefPosition->registerAssignment ||
5566 currentRefPosition->outOfOrder || previousRefPosition->copyReg ||
5567 previousRefPosition->refType == RefTypeExpUse || currentRefPosition->refType == RefTypeDummyDef);
5569 else if (assignedRegister != REG_NA)
5571 // Handle the case where this is a preassigned register (i.e. parameter).
5572 // We don't want to actually use the preassigned register if it's not
5573 // going to cover the lifetime - but we had to preallocate it to ensure
5574 // that it remained live.
5575 // TODO-CQ: At some point we may want to refine the analysis here, in case
5576 // it might be beneficial to keep it in this reg for PART of the lifetime
5577 if (currentInterval->isLocalVar)
5579 regMaskTP preferences = currentInterval->registerPreferences;
5580 bool keepAssignment = true;
5581 bool matchesPreferences = (preferences & genRegMask(assignedRegister)) != RBM_NONE;
5583 // Will the assigned register cover the lifetime? If not, does it at least
5584 // meet the preferences for the next RefPosition?
5585 RegRecord* physRegRecord = getRegisterRecord(currentInterval->physReg);
5586 RefPosition* nextPhysRegRefPos = physRegRecord->getNextRefPosition();
5587 if (nextPhysRegRefPos != nullptr &&
5588 nextPhysRegRefPos->nodeLocation <= currentInterval->lastRefPosition->nodeLocation)
5590 // Check to see if the existing assignment matches the preferences (e.g. callee save registers)
5591 // and ensure that the next use of this localVar does not occur after the nextPhysRegRefPos
5592 // There must be a next RefPosition, because we know that the Interval extends beyond the
5593 // nextPhysRegRefPos.
5594 RefPosition* nextLclVarRefPos = currentRefPosition->nextRefPosition;
5595 assert(nextLclVarRefPos != nullptr);
5596 if (!matchesPreferences || nextPhysRegRefPos->nodeLocation < nextLclVarRefPos->nodeLocation ||
5597 physRegRecord->conflictingFixedRegReference(nextLclVarRefPos))
5599 keepAssignment = false;
5602 else if (refType == RefTypeParamDef && !matchesPreferences)
5604 // Don't use the register, even if available, if it doesn't match the preferences.
5605 // Note that this case is only for ParamDefs, for which we haven't yet taken preferences
5606 // into account (we've just automatically got the initial location). In other cases,
5607 // we would already have put it in a preferenced register, if it was available.
5608 // TODO-CQ: Consider expanding this to check availability - that would duplicate
5609 // code here, but otherwise we may wind up in this register anyway.
5610 keepAssignment = false;
5613 if (keepAssignment == false)
5615 currentRefPosition->registerAssignment = allRegs(currentInterval->registerType);
5616 unassignPhysRegNoSpill(physRegRecord);
5618 // If the preferences are currently set to just this register, reset them to allRegs
5619 // of the appropriate type (just as we just reset the registerAssignment for this
5621 // Otherwise, simply remove this register from the preferences, if it's there.
5623 if (currentInterval->registerPreferences == assignedRegBit)
5625 currentInterval->registerPreferences = currentRefPosition->registerAssignment;
5629 currentInterval->registerPreferences &= ~assignedRegBit;
5632 assignedRegister = REG_NA;
5633 assignedRegBit = RBM_NONE;
5638 if (assignedRegister != REG_NA)
5640 // If there is a conflicting fixed reference, insert a copy.
5641 RegRecord* physRegRecord = getRegisterRecord(assignedRegister);
5642 if (physRegRecord->conflictingFixedRegReference(currentRefPosition))
5644 // We may have already reassigned the register to the conflicting reference.
5645 // If not, we need to unassign this interval.
5646 if (physRegRecord->assignedInterval == currentInterval)
5648 unassignPhysRegNoSpill(physRegRecord);
5650 currentRefPosition->moveReg = true;
5651 assignedRegister = REG_NA;
5652 setIntervalAsSplit(currentInterval);
5653 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_MOVE_REG, currentInterval, assignedRegister));
5655 else if ((genRegMask(assignedRegister) & currentRefPosition->registerAssignment) != 0)
5657 currentRefPosition->registerAssignment = assignedRegBit;
5658 if (!currentReferent->isActive)
5660 // If we've got an exposed use at the top of a block, the
5661 // interval might not have been active. Otherwise if it's a use,
5662 // the interval must be active.
5663 if (refType == RefTypeDummyDef)
5665 currentReferent->isActive = true;
5666 assert(getRegisterRecord(assignedRegister)->assignedInterval == currentInterval);
5670 currentRefPosition->reload = true;
5673 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_KEPT_ALLOCATION, currentInterval, assignedRegister));
5677 assert(currentInterval != nullptr);
5679 // It's already in a register, but not one we need.
5680 if (!RefTypeIsDef(currentRefPosition->refType))
5682 regNumber copyReg = assignCopyReg(currentRefPosition);
5683 assert(copyReg != REG_NA);
5684 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_COPY_REG, currentInterval, copyReg));
5685 lastAllocatedRefPosition = currentRefPosition;
5686 if (currentRefPosition->lastUse)
5688 if (currentRefPosition->delayRegFree)
5690 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_LAST_USE_DELAYED, currentInterval,
5692 delayRegsToFree |= (genRegMask(assignedRegister) | currentRefPosition->registerAssignment);
5696 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_LAST_USE, currentInterval, assignedRegister));
5697 regsToFree |= (genRegMask(assignedRegister) | currentRefPosition->registerAssignment);
5700 // If this is a tree temp (non-localVar) interval, we will need an explicit move.
5701 if (!currentInterval->isLocalVar)
5703 currentRefPosition->moveReg = true;
5704 currentRefPosition->copyReg = false;
5710 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_NEEDS_NEW_REG, nullptr, assignedRegister));
5711 regsToFree |= genRegMask(assignedRegister);
5712 // We want a new register, but we don't want this to be considered a spill.
5713 assignedRegister = REG_NA;
5714 if (physRegRecord->assignedInterval == currentInterval)
5716 unassignPhysRegNoSpill(physRegRecord);
5722 if (assignedRegister == REG_NA)
5724 bool allocateReg = true;
5726 if (currentRefPosition->AllocateIfProfitable())
5728 // We can avoid allocating a register if it is the last use and requires a reload.
5729 if (currentRefPosition->lastUse && currentRefPosition->reload)
5731 allocateReg = false;
5735 // Under stress mode, don't attempt to allocate a reg to
5736 // reg optional ref position.
5737 if (allocateReg && regOptionalNoAlloc())
5739 allocateReg = false;
5746 // Try to allocate a register
5747 assignedRegister = tryAllocateFreeReg(currentInterval, currentRefPosition);
5750 // If no register was found, and if the currentRefPosition must have a register,
5751 // then find a register to spill
5752 if (assignedRegister == REG_NA)
5754 #if FEATURE_PARTIAL_SIMD_CALLEE_SAVE
5755 if (refType == RefTypeUpperVectorSaveDef)
5757 // TODO-CQ: Determine whether copying to two integer callee-save registers would be profitable.
5758 // TODO-ARM64-CQ: Determine whether copying to one integer callee-save register would be profitable.
5761 // SaveDef position occurs after the Use of args and at the same location as Kill/Def
5762 // positions of a call node. But SaveDef position cannot use any of the arg regs as
5763 // they are needed for call node.
5764 currentRefPosition->registerAssignment =
5765 (allRegs(TYP_FLOAT) & RBM_FLT_CALLEE_TRASH & ~RBM_FLTARG_REGS);
5766 assignedRegister = tryAllocateFreeReg(currentInterval, currentRefPosition);
5768 // There MUST be caller-save registers available, because they have all just been killed.
5769 // Amd64 Windows: xmm4-xmm5 are guaranteed to be available as xmm0-xmm3 are used for passing args.
5770 // Amd64 Unix: xmm8-xmm15 are guaranteed to be available as xmm0-xmm7 are used for passing args.
5771 // X86 RyuJIT Windows: xmm4-xmm7 are guaranteed to be available.
5772 assert(assignedRegister != REG_NA);
5776 // i) The reason we have to spill is that SaveDef position is allocated after the Kill positions
5777 // of the call node are processed. Since callee-trash registers are killed by call node
5778 // we explicitly spill and unassign the register.
5779 // ii) These will look a bit backward in the dump, but it's a pain to dump the alloc before the
5781 unassignPhysReg(getRegisterRecord(assignedRegister), currentRefPosition);
5782 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_ALLOC_REG, currentInterval, assignedRegister));
5784 // Now set assignedRegister to REG_NA again so that we don't re-activate it.
5785 assignedRegister = REG_NA;
5788 #endif // FEATURE_PARTIAL_SIMD_CALLEE_SAVE
5789 if (currentRefPosition->RequiresRegister() || currentRefPosition->AllocateIfProfitable())
5793 assignedRegister = allocateBusyReg(currentInterval, currentRefPosition,
5794 currentRefPosition->AllocateIfProfitable());
5797 if (assignedRegister != REG_NA)
5800 dumpLsraAllocationEvent(LSRA_EVENT_ALLOC_SPILLED_REG, currentInterval, assignedRegister));
5804 // This can happen only for those ref positions that are to be allocated
5805 // only if profitable.
5806 noway_assert(currentRefPosition->AllocateIfProfitable());
5808 currentRefPosition->registerAssignment = RBM_NONE;
5809 currentRefPosition->reload = false;
5810 setIntervalAsSpilled(currentInterval);
5812 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_NO_REG_ALLOCATED, currentInterval));
5817 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_NO_REG_ALLOCATED, currentInterval));
5818 currentRefPosition->registerAssignment = RBM_NONE;
5819 currentInterval->isActive = false;
5820 setIntervalAsSpilled(currentInterval);
5828 if (currentInterval->isConstant && (currentRefPosition->treeNode != nullptr) &&
5829 currentRefPosition->treeNode->IsReuseRegVal())
5831 dumpLsraAllocationEvent(LSRA_EVENT_REUSE_REG, currentInterval, assignedRegister, currentBlock);
5835 dumpLsraAllocationEvent(LSRA_EVENT_ALLOC_REG, currentInterval, assignedRegister, currentBlock);
5841 if (refType == RefTypeDummyDef && assignedRegister != REG_NA)
5843 setInVarRegForBB(curBBNum, currentInterval->varNum, assignedRegister);
5846 // If we allocated a register, and this is a use of a spilled value,
5847 // it should have been marked for reload above.
5848 if (assignedRegister != REG_NA && RefTypeIsUse(refType) && !isInRegister)
5850 assert(currentRefPosition->reload);
5854 // If we allocated a register, record it
5855 if (currentInterval != nullptr && assignedRegister != REG_NA)
5857 assignedRegBit = genRegMask(assignedRegister);
5858 currentRefPosition->registerAssignment = assignedRegBit;
5859 currentInterval->physReg = assignedRegister;
5860 regsToFree &= ~assignedRegBit; // we'll set it again later if it's dead
5862 // If this interval is dead, free the register.
5863 // The interval could be dead if this is a user variable, or if the
5864 // node is being evaluated for side effects, or a call whose result
5865 // is not used, etc.
5866 if (currentRefPosition->lastUse || currentRefPosition->nextRefPosition == nullptr)
5868 assert(currentRefPosition->isIntervalRef());
5870 if (refType != RefTypeExpUse && currentRefPosition->nextRefPosition == nullptr)
5872 if (currentRefPosition->delayRegFree)
5874 delayRegsToFree |= assignedRegBit;
5876 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_LAST_USE_DELAYED));
5880 regsToFree |= assignedRegBit;
5882 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_LAST_USE));
5887 currentInterval->isActive = false;
5891 lastAllocatedRefPosition = currentRefPosition;
5895 // Free registers to clear associated intervals for resolution phase
5896 CLANG_FORMAT_COMMENT_ANCHOR;
5899 if (getLsraExtendLifeTimes())
5901 // If we have extended lifetimes, we need to make sure all the registers are freed.
5902 for (int regNumIndex = 0; regNumIndex <= REG_FP_LAST; regNumIndex++)
5904 RegRecord& regRecord = physRegs[regNumIndex];
5905 Interval* interval = regRecord.assignedInterval;
5906 if (interval != nullptr)
5908 interval->isActive = false;
5909 unassignPhysReg(&regRecord, nullptr);
5916 freeRegisters(regsToFree | delayRegsToFree);
5922 // Dump the RegRecords after the last RefPosition is handled.
5926 dumpRefPositions("AFTER ALLOCATION");
5927 dumpVarRefPositions("AFTER ALLOCATION");
5929 // Dump the intervals that remain active
5930 printf("Active intervals at end of allocation:\n");
5932 // We COULD just reuse the intervalIter from above, but ArrayListIterator doesn't
5933 // provide a Reset function (!) - we'll probably replace this so don't bother
5936 for (Interval& interval : intervals)
5938 if (interval.isActive)
5950 //-----------------------------------------------------------------------------
5951 // updateAssignedInterval: Update assigned interval of register.
5954 // reg - register to be updated
5955 // interval - interval to be assigned
5956 // regType - register type
5962 // For ARM32, when "regType" is TYP_DOUBLE, "reg" should be an even-numbered
5963 // float register, i.e. the lower half of the double register.
5966 // For ARM32, the two float registers constituting a double register are updated
5967 // together when "regType" is TYP_DOUBLE.
5969 void LinearScan::updateAssignedInterval(RegRecord* reg, Interval* interval, RegisterType regType)
5972 // Update overlapping floating point register for TYP_DOUBLE.
5973 Interval* oldAssignedInterval = reg->assignedInterval;
5974 if (regType == TYP_DOUBLE)
5976 RegRecord* anotherHalfReg = findAnotherHalfRegRec(reg);
5978 anotherHalfReg->assignedInterval = interval;
5980 else if ((oldAssignedInterval != nullptr) && (oldAssignedInterval->registerType == TYP_DOUBLE))
5982 RegRecord* anotherHalfReg = findAnotherHalfRegRec(reg);
5984 anotherHalfReg->assignedInterval = nullptr;
5987 reg->assignedInterval = interval;
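    // Illustrative note (assumed ARM32 numbering, not from this code): a double
    // register Dn overlays the float pair S(2n) and S(2n+1), so a TYP_DOUBLE
    // assignment must keep both halves in sync. For example, assigning an
    // interval to D1 conceptually means:
    //
    //     // reg            == S2 (the even/lower half, as required above)
    //     // anotherHalfReg == S3 (found via findAnotherHalfRegRec)
    //     // both RegRecords now have assignedInterval == interval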
5990 //-----------------------------------------------------------------------------
5991 // updatePreviousInterval: Update previous interval of register.
5994 // reg - register to be updated
5995 // interval - interval to be assigned
5996 // regType - register type
6002 // For ARM32, when "regType" is TYP_DOUBLE, "reg" should be an even-numbered
6003 // float register, i.e. the lower half of the double register.
6006 // For ARM32, the two float registers constituting a double register are updated
6007 // together when "regType" is TYP_DOUBLE.
6009 void LinearScan::updatePreviousInterval(RegRecord* reg, Interval* interval, RegisterType regType)
6011 reg->previousInterval = interval;
6014 // Update overlapping floating point register for TYP_DOUBLE
6015 if (regType == TYP_DOUBLE)
6017 RegRecord* anotherHalfReg = findAnotherHalfRegRec(reg);
6019 anotherHalfReg->previousInterval = interval;
// LinearScan::resolveLocalRef
//
// Update the graph for a local reference.
// Also, track the register (if any) that is currently occupied.
//
//    treeNode: the lclVar that's being resolved
//    currentRefPosition: the RefPosition associated with the treeNode
//
// This method is called for each local reference, during the resolveRegisters
// phase of LSRA. It is responsible for keeping the following in sync:
// - varDsc->lvRegNum (and lvOtherReg) contain the unique register location.
//   If it is not in the same register throughout its lifetime, it is set to REG_STK.
// - interval->physReg is set to the assigned register
//   (i.e. at the code location which is currently being handled by resolveRegisters())
// - interval->isActive is true iff the interval is live and occupying a register
// - interval->isSpilled should have already been set to true if the interval is EVER spilled
// - interval->isSplit is set to true if the interval does not occupy the same
//   register throughout the method
// - RegRecord->assignedInterval points to the interval which currently occupies
//   the register
// - For each lclVar node:
//   - gtRegNum/gtRegPair is set to the currently allocated register(s).
//   - GTF_SPILLED is set on a use if it must be reloaded prior to use.
//   - GTF_SPILL is set if it must be spilled after use.
//
// A copyReg is an ugly case where the variable must be in a specific (fixed) register,
// but it currently resides elsewhere. The register allocator must track the use of the
// fixed register, but it marks the lclVar node with the register it currently lives in
// and the code generator does the necessary move.
//
// Before beginning, the varDsc for each parameter must be set to its initial location.
//
// NICE: Consider tracking whether an Interval is always in the same location (register/stack)
// in which case it will require no resolution.
void LinearScan::resolveLocalRef(BasicBlock* block, GenTree* treeNode, RefPosition* currentRefPosition)
{
    assert((block == nullptr) == (treeNode == nullptr));
    assert(enregisterLocalVars);

    // Is this a tracked local? Or just a register allocated for loading
    // a non-tracked one?
    Interval* interval = currentRefPosition->getInterval();
    if (!interval->isLocalVar)
    {
        return;
    }

    interval->recentRefPosition = currentRefPosition;
    LclVarDsc* varDsc = interval->getLocalVar(compiler);

    // NOTE: we set the GTF_VAR_DEATH flag here unless we are extending lifetimes, in which case we write
    // this bit in checkLastUses. This is a bit of a hack, but is necessary because codegen requires
    // accurate last use info that is not reflected in the lastUse bit on ref positions when we are extending
    // lifetimes. See also the comments in checkLastUses.
    if ((treeNode != nullptr) && !extendLifetimes())
    {
        if (currentRefPosition->lastUse)
        {
            treeNode->gtFlags |= GTF_VAR_DEATH;
        }
        else
        {
            treeNode->gtFlags &= ~GTF_VAR_DEATH;
        }
    }

    if (currentRefPosition->registerAssignment == RBM_NONE)
    {
        assert(!currentRefPosition->RequiresRegister());
        assert(interval->isSpilled);

        varDsc->lvRegNum = REG_STK;
        if (interval->assignedReg != nullptr && interval->assignedReg->assignedInterval == interval)
        {
            updateAssignedInterval(interval->assignedReg, nullptr, interval->registerType);
        }
        interval->assignedReg = nullptr;
        interval->physReg     = REG_NA;
        if (treeNode != nullptr)
        {
            treeNode->SetContained();
        }

        return;
    }
    // In most cases, assigned and home registers will be the same
    // The exception is the copyReg case, where we've assigned a register
    // for a specific purpose, but will be keeping the register assignment
    regNumber assignedReg = currentRefPosition->assignedReg();
    regNumber homeReg     = assignedReg;

    // Undo any previous association with a physical register, UNLESS this
    // is a copyReg
    if (!currentRefPosition->copyReg)
    {
        regNumber oldAssignedReg = interval->physReg;
        if (oldAssignedReg != REG_NA && assignedReg != oldAssignedReg)
        {
            RegRecord* oldRegRecord = getRegisterRecord(oldAssignedReg);
            if (oldRegRecord->assignedInterval == interval)
            {
                updateAssignedInterval(oldRegRecord, nullptr, interval->registerType);
            }
        }
    }

    if (currentRefPosition->refType == RefTypeUse && !currentRefPosition->reload)
    {
        // Was this spilled after our predecessor was scheduled?
        if (interval->physReg == REG_NA)
        {
            assert(inVarToRegMaps[curBBNum][varDsc->lvVarIndex] == REG_STK);
            currentRefPosition->reload = true;
        }
    }

    bool reload     = currentRefPosition->reload;
    bool spillAfter = currentRefPosition->spillAfter;

    // In the reload case we either:
    // - Set the register to REG_STK if it will be referenced only from the home location, or
    // - Set the register to the assigned register and set GTF_SPILLED if it must be loaded into a register.
    if (reload)
    {
        assert(currentRefPosition->refType != RefTypeDef);
        assert(interval->isSpilled);
        varDsc->lvRegNum = REG_STK;
        if (!spillAfter)
        {
            interval->physReg = assignedReg;
        }

        // If there is no treeNode, this must be a RefTypeExpUse, in
        // which case we did the reload already
        if (treeNode != nullptr)
        {
            treeNode->gtFlags |= GTF_SPILLED;
            if (spillAfter)
            {
                if (currentRefPosition->AllocateIfProfitable())
                {
                    // This is a use of lclVar that is flagged as reg-optional
                    // by lower/codegen and marked for both reload and spillAfter.
                    // In this case we can avoid unnecessary reload and spill
                    // by setting reg on lclVar to REG_STK and reg on tree node
                    // to REG_NA. Codegen will generate the code by considering
                    // it as a contained memory operand.
                    //
                    // Note that varDsc->lvRegNum is already set to REG_STK above.
                    interval->physReg  = REG_NA;
                    treeNode->gtRegNum = REG_NA;
                    treeNode->gtFlags &= ~GTF_SPILLED;
                    treeNode->SetContained();
                }
                else
                {
                    treeNode->gtFlags |= GTF_SPILL;
                }
            }
        }
        else
        {
            assert(currentRefPosition->refType == RefTypeExpUse);
        }
    }
    else if (spillAfter && !RefTypeIsUse(currentRefPosition->refType))
    {
        // In the case of a pure def, don't bother spilling - just assign it to the
        // stack. However, we need to remember that it was spilled.

        assert(interval->isSpilled);
        varDsc->lvRegNum  = REG_STK;
        interval->physReg = REG_NA;
        if (treeNode != nullptr)
        {
            treeNode->gtRegNum = REG_NA;
        }
    }
    else
    {
        // Not reload and Not pure-def that's spillAfter

        if (currentRefPosition->copyReg || currentRefPosition->moveReg)
        {
            // For a copyReg or moveReg, we have two cases:
            //  - In the first case, we have a fixedReg - i.e. a register which the code
            //    generator is constrained to use.
            //    The code generator will generate the appropriate move to meet the requirement.
            //  - In the second case, we were forced to use a different register because of
            //    interference (or JitStressRegs).
            //    In this case, we generate a GT_COPY.
            // In either case, we annotate the treeNode with the register in which the value
            // currently lives. For moveReg, the homeReg is the new register (as assigned above).
            // But for copyReg, the homeReg remains unchanged.

            assert(treeNode != nullptr);
            treeNode->gtRegNum = interval->physReg;

            if (currentRefPosition->copyReg)
            {
                homeReg = interval->physReg;
            }
            else
            {
                assert(interval->isSplit);
                interval->physReg = assignedReg;
            }

            if (!currentRefPosition->isFixedRegRef || currentRefPosition->moveReg)
            {
                // This is the second case, where we need to generate a copy
                insertCopyOrReload(block, treeNode, currentRefPosition->getMultiRegIdx(), currentRefPosition);
            }
        }
        else
        {
            interval->physReg = assignedReg;

            if (!interval->isSpilled && !interval->isSplit)
            {
                if (varDsc->lvRegNum != REG_STK)
                {
                    // If the register assignments don't match, then this interval is split.
                    if (varDsc->lvRegNum != assignedReg)
                    {
                        setIntervalAsSplit(interval);
                        varDsc->lvRegNum = REG_STK;
                    }
                }
                else
                {
                    varDsc->lvRegNum = assignedReg;
                }
            }
        }
    }
    if (spillAfter)
    {
        if (treeNode != nullptr)
        {
            treeNode->gtFlags |= GTF_SPILL;
        }
        assert(interval->isSpilled);
        interval->physReg = REG_NA;
        varDsc->lvRegNum  = REG_STK;
    }

    // Update the physRegRecord for the register, so that we know what vars are in
    // regs at the block boundaries
    RegRecord* physRegRecord = getRegisterRecord(homeReg);
    if (spillAfter || currentRefPosition->lastUse)
    {
        interval->isActive    = false;
        interval->assignedReg = nullptr;
        interval->physReg     = REG_NA;

        updateAssignedInterval(physRegRecord, nullptr, interval->registerType);
    }
    else
    {
        interval->isActive    = true;
        interval->assignedReg = physRegRecord;

        updateAssignedInterval(physRegRecord, interval, interval->registerType);
    }
}
void LinearScan::writeRegisters(RefPosition* currentRefPosition, GenTree* tree)
{
    lsraAssignRegToTree(tree, currentRefPosition->assignedReg(), currentRefPosition->getMultiRegIdx());
}
//------------------------------------------------------------------------
// insertCopyOrReload: Insert a copy in the case where a tree node value must be moved
//   to a different register at the point of use (GT_COPY), or it is reloaded to a different register
//   than the one it was spilled from (GT_RELOAD).
//
//    block - basic block in which GT_COPY/GT_RELOAD is inserted.
//    tree - This is the node to copy or reload.
//           Insert copy or reload node between this node and its parent.
//    multiRegIdx - register position of tree node for which copy or reload is needed.
//    refPosition - The RefPosition at which copy or reload will take place.
//
// The GT_COPY or GT_RELOAD will be inserted in the proper spot in execution order where the reload is to occur.
//
// For example, for this tree (numbers are execution order, lower is earlier and higher is later):
//
//                  +---------+----------+
//                  |       GT_ADD (3)   |
//                  +---------+----------+
//                            |
//                          /   \
//                        /       \
//                      /           \
//   +-------------------+           +----------------------+
//   |         x (1)     | "tree"    |         y (2)        |
//   +-------------------+           +----------------------+
//
// generate this tree:
//
//                  +---------+----------+
//                  |       GT_ADD (4)   |
//                  +---------+----------+
//                            |
//                          /   \
//                        /       \
//                      /           \
//   +-------------------+           +----------------------+
//   |  GT_RELOAD (3)    |           |         y (2)        |
//   +-------------------+           +----------------------+
//             |
//   +-------------------+
//   |         x (1)     | "tree"
//   +-------------------+
//
// Note in particular that the GT_RELOAD node gets inserted in execution order immediately before the parent of "tree",
// which seems a bit weird since normally a node's parent (in this case, the parent of "x", GT_RELOAD in the "after"
// picture) immediately follows all of its children (that is, normally the execution ordering is postorder).
// The ordering must be this weird "out of normal order" way because the "x" node is being spilled, probably
// because the expression in the tree represented above by "y" has high register requirements. We don't want
// to reload immediately, of course. So we put GT_RELOAD where the reload should actually happen.
//
// Note that GT_RELOAD is required when we reload to a different register than the one we spilled to. It can also be
// used if we reload to the same register. Normally, though, in that case we just mark the node with GTF_SPILLED,
// and the unspilling code automatically reuses the same register, and does the reload when it notices that flag
// when considering a node's operands.
void LinearScan::insertCopyOrReload(BasicBlock* block, GenTree* tree, unsigned multiRegIdx, RefPosition* refPosition)
{
    LIR::Range& blockRange = LIR::AsRange(block);

    LIR::Use treeUse;
    bool foundUse = blockRange.TryGetUse(tree, &treeUse);
    assert(foundUse);

    GenTree* parent = treeUse.User();

    genTreeOps oper;
    if (refPosition->reload)
    {
        oper = GT_RELOAD;
    }
    else
    {
        oper = GT_COPY;

#if TRACK_LSRA_STATS
        updateLsraStat(LSRA_STAT_COPY_REG, block->bbNum);
#endif
    }

    // If the parent is a reload/copy node, then tree must be a multi-reg call node
    // that has already had one of its registers spilled. This is because multi-reg
    // call node is the only node whose RefTypeDef positions get independently
    // spilled or reloaded. It is possible that one of its RefTypeDef positions got
    // spilled and the next use of it requires it to be in a different register.
    //
    // In this case set the i'th position reg of reload/copy node to the reg allocated
    // for copy/reload refPosition. Essentially a copy/reload node will have a reg
    // for each multi-reg position of its child. If there is a valid reg in i'th
    // position of GT_COPY or GT_RELOAD node then the corresponding result of its
    // child needs to be copied or reloaded to that reg.
    if (parent->IsCopyOrReload())
    {
        noway_assert(parent->OperGet() == oper);
        noway_assert(tree->IsMultiRegCall());
        GenTreeCall*         call         = tree->AsCall();
        GenTreeCopyOrReload* copyOrReload = parent->AsCopyOrReload();
        noway_assert(copyOrReload->GetRegNumByIdx(multiRegIdx) == REG_NA);
        copyOrReload->SetRegNumByIdx(refPosition->assignedReg(), multiRegIdx);
    }
    else
    {
        // Create the new node, with "tree" as its only child.
        var_types treeType = tree->TypeGet();

        GenTreeCopyOrReload* newNode = new (compiler, oper) GenTreeCopyOrReload(oper, treeType, tree);
        assert(refPosition->registerAssignment != RBM_NONE);
        SetLsraAdded(newNode);
        newNode->SetRegNumByIdx(refPosition->assignedReg(), multiRegIdx);
        if (refPosition->copyReg)
        {
            // This is a TEMPORARY copy
            assert(isCandidateLocalRef(tree));
            newNode->gtFlags |= GTF_VAR_DEATH;
        }

        // Insert the copy/reload after the spilled node and replace the use of the original node with a use
        // of the copy/reload.
        blockRange.InsertAfter(tree, newNode);
        treeUse.ReplaceWith(compiler, newNode);
    }
}
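The two mechanical steps at the end of the copy/reload path (link the new node into execution order right after the spilled node, then redirect the parent's use to it) can be sketched with a toy linked-list IR. The `Node`, `insertAfter`, and `replaceUse` names below are invented for illustration; they are not the JIT's LIR API, only an analogue of `blockRange.InsertAfter` and `treeUse.ReplaceWith`:

```cpp
#include <cassert>

// Hypothetical toy IR: nodes linked in execution order, each user holding a
// single operand pointer standing in for a LIR::Use.
struct Node
{
    const char* name;
    Node*       next    = nullptr; // execution order
    Node*       operand = nullptr; // the node this one consumes
};

// Splice newNode into execution order immediately after 'tree'.
static void insertAfter(Node* tree, Node* newNode)
{
    newNode->next = tree->next;
    tree->next    = newNode;
}

// Redirect the user's operand from the original node to the new node.
static void replaceUse(Node* user, Node* oldOperand, Node* newOperand)
{
    assert(user->operand == oldOperand);
    user->operand = newOperand;
}
```

After both steps, the consumer sees only the copy/reload node, while the original node still precedes it in execution order, which is the shape the code generator expects.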
#if FEATURE_PARTIAL_SIMD_CALLEE_SAVE
//------------------------------------------------------------------------
// insertUpperVectorSaveAndReload: Insert code to save and restore the upper half of a vector that lives
//                                 in a callee-save register at the point of a kill (the upper half is
//                                 not preserved).
//
//    tree        - This is the node around which we will insert the Save & Reload.
//                  It will be a call or some node that turns into a call.
//    refPosition - The RefTypeUpperVectorSaveDef RefPosition.
void LinearScan::insertUpperVectorSaveAndReload(GenTree* tree, RefPosition* refPosition, BasicBlock* block)
{
    Interval* lclVarInterval = refPosition->getInterval()->relatedInterval;
    assert(lclVarInterval->isLocalVar == true);
    LclVarDsc* varDsc = compiler->lvaTable + lclVarInterval->varNum;
    assert(varTypeNeedsPartialCalleeSave(varDsc->lvType));
    regNumber lclVarReg = lclVarInterval->physReg;
    if (lclVarReg == REG_NA)
    {
        return;
    }

    assert((genRegMask(lclVarReg) & RBM_FLT_CALLEE_SAVED) != RBM_NONE);

    regNumber spillReg   = refPosition->assignedReg();
    bool      spillToMem = refPosition->spillAfter;

    LIR::Range& blockRange = LIR::AsRange(block);

    // First, insert the save before the call.

    GenTree* saveLcl  = compiler->gtNewLclvNode(lclVarInterval->varNum, varDsc->lvType);
    saveLcl->gtRegNum = lclVarReg;
    SetLsraAdded(saveLcl);

    GenTreeSIMD* simdNode =
        new (compiler, GT_SIMD) GenTreeSIMD(LargeVectorSaveType, saveLcl, nullptr, SIMDIntrinsicUpperSave,
                                            varDsc->lvBaseType, genTypeSize(varDsc->lvType));
    SetLsraAdded(simdNode);
    simdNode->gtRegNum = spillReg;
    if (spillToMem)
    {
        simdNode->gtFlags |= GTF_SPILL;
    }

    blockRange.InsertBefore(tree, LIR::SeqTree(compiler, simdNode));

    // Now insert the restore after the call.

    GenTree* restoreLcl  = compiler->gtNewLclvNode(lclVarInterval->varNum, varDsc->lvType);
    restoreLcl->gtRegNum = lclVarReg;
    SetLsraAdded(restoreLcl);

    simdNode = new (compiler, GT_SIMD) GenTreeSIMD(varDsc->lvType, restoreLcl, nullptr, SIMDIntrinsicUpperRestore,
                                                   varDsc->lvBaseType, genTypeSize(varDsc->lvType));
    simdNode->gtRegNum = spillReg;
    SetLsraAdded(simdNode);
    if (spillToMem)
    {
        simdNode->gtFlags |= GTF_SPILLED;
    }

    blockRange.InsertAfter(tree, LIR::SeqTree(compiler, simdNode));
}
#endif // FEATURE_PARTIAL_SIMD_CALLEE_SAVE
//------------------------------------------------------------------------
// initMaxSpill: Initializes the LinearScan members used to track the max number
//               of concurrent spills. This is needed so that we can set the
//               fields in Compiler, so that the code generator, in turn can
//               allocate the right number of spill locations.
//
// This is called before any calls to updateMaxSpill().
//
void LinearScan::initMaxSpill()
{
    needDoubleTmpForFPCall = false;
    needFloatTmpForFPCall  = false;
    for (int i = 0; i < TYP_COUNT; i++)
    {
        maxSpill[i]     = 0;
        currentSpill[i] = 0;
    }
}
//------------------------------------------------------------------------
// recordMaxSpill: Sets the fields in Compiler for the max number of concurrent spills.
//                 (See the comment on initMaxSpill.)
//
// This is called after updateMaxSpill() has been called for all "real"
// RefPositions.
//
void LinearScan::recordMaxSpill()
{
    // Note: due to the temp normalization process (see tmpNormalizeType)
    // only a few types should actually be seen here.
    JITDUMP("Recording the maximum number of concurrent spills:\n");
#ifdef _TARGET_X86_
    var_types returnType = compiler->tmpNormalizeType(compiler->info.compRetType);
    if (needDoubleTmpForFPCall || (returnType == TYP_DOUBLE))
    {
        JITDUMP("Adding a spill temp for moving a double call/return value between xmm reg and x87 stack.\n");
        maxSpill[TYP_DOUBLE] += 1;
    }
    if (needFloatTmpForFPCall || (returnType == TYP_FLOAT))
    {
        JITDUMP("Adding a spill temp for moving a float call/return value between xmm reg and x87 stack.\n");
        maxSpill[TYP_FLOAT] += 1;
    }
#endif // _TARGET_X86_
    for (int i = 0; i < TYP_COUNT; i++)
    {
        if (var_types(i) != compiler->tmpNormalizeType(var_types(i)))
        {
            // Only normalized types should have anything in the maxSpill array.
            // We assume here that if type 'i' does not normalize to itself, then
            // nothing else normalizes to 'i', either.
            assert(maxSpill[i] == 0);
        }
        if (maxSpill[i] != 0)
        {
            JITDUMP("  %s: %d\n", varTypeName(var_types(i)), maxSpill[i]);
            compiler->tmpPreAllocateTemps(var_types(i), maxSpill[i]);
        }
    }
}
//------------------------------------------------------------------------
// updateMaxSpill: Update the maximum number of concurrent spills
//
//    refPosition - the current RefPosition being handled
//
// The RefPosition has an associated interval (getInterval() will
// otherwise assert).
//
// This is called for each "real" RefPosition during the writeback
// phase of LSRA. It keeps track of how many concurrently-live
// spills there are, and the largest number seen so far.
//
void LinearScan::updateMaxSpill(RefPosition* refPosition)
{
    RefType refType = refPosition->refType;

    if (refPosition->spillAfter || refPosition->reload ||
        (refPosition->AllocateIfProfitable() && refPosition->assignedReg() == REG_NA))
    {
        Interval* interval = refPosition->getInterval();
        if (!interval->isLocalVar)
        {
            // The tmp allocation logic 'normalizes' types to a small number of
            // types that need distinct stack locations from each other.
            // Those types are currently gc refs, byrefs, <= 4 byte non-GC items,
            // 8-byte non-GC items, and 16-byte or 32-byte SIMD vectors.
            // LSRA is agnostic to those choices but needs
            // to know what they are here.
            var_types typ;

#if FEATURE_PARTIAL_SIMD_CALLEE_SAVE
            if ((refType == RefTypeUpperVectorSaveDef) || (refType == RefTypeUpperVectorSaveUse))
            {
                typ = LargeVectorSaveType;
            }
            else
#endif // FEATURE_PARTIAL_SIMD_CALLEE_SAVE
            {
                GenTree* treeNode = refPosition->treeNode;
                if (treeNode == nullptr)
                {
                    assert(RefTypeIsUse(refType));
                    treeNode = interval->firstRefPosition->treeNode;
                }
                assert(treeNode != nullptr);

                // In case of multi-reg call nodes, we need to use the type
                // of the return register given by multiRegIdx of the refposition.
                if (treeNode->IsMultiRegCall())
                {
                    ReturnTypeDesc* retTypeDesc = treeNode->AsCall()->GetReturnTypeDesc();
                    typ = retTypeDesc->GetReturnRegType(refPosition->getMultiRegIdx());
                }
#ifdef _TARGET_ARM_
                else if (treeNode->OperIsPutArgSplit())
                {
                    typ = treeNode->AsPutArgSplit()->GetRegType(refPosition->getMultiRegIdx());
                }
                else if (treeNode->OperIsPutArgReg())
                {
                    // For double arg regs, the type is changed to long since they must be passed via `r0-r3`.
                    // However when they get spilled, they should be treated as separate int registers.
                    var_types typNode = treeNode->TypeGet();
                    typ = (typNode == TYP_LONG) ? TYP_INT : typNode;
                }
#endif // _TARGET_ARM_
                else
                {
                    typ = treeNode->TypeGet();
                }
                typ = compiler->tmpNormalizeType(typ);
            }

            if (refPosition->spillAfter && !refPosition->reload)
            {
                currentSpill[typ]++;
                if (currentSpill[typ] > maxSpill[typ])
                {
                    maxSpill[typ] = currentSpill[typ];
                }
            }
            else if (refPosition->reload)
            {
                assert(currentSpill[typ] > 0);
                currentSpill[typ]--;
            }
            else if (refPosition->AllocateIfProfitable() && refPosition->assignedReg() == REG_NA)
            {
                // A spill temp not getting reloaded into a reg because it is
                // marked as allocate if profitable and getting used from its
                // memory location. To properly account max spill for typ we
                // decrement spill count.
                assert(RefTypeIsUse(refType));
                assert(currentSpill[typ] > 0);
                currentSpill[typ]--;
            }
            JITDUMP("  Max spill for %s is %d\n", varTypeName(typ), maxSpill[typ]);
        }
    }
}
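The per-type bookkeeping in updateMaxSpill reduces to a simple high-water-mark counter: a spill without a reload raises the live-spill count, a reload (or a reg-optional use satisfied from memory) lowers it, and the maximum ever seen is what recordMaxSpill hands to the compiler for spill-temp preallocation. The sketch below uses invented names (`SpillTracker`, `onSpill`, `onReload`) and type-name strings in place of `var_types`; it is an illustration, not the JIT's code:

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

// Toy model of the currentSpill[]/maxSpill[] counters, keyed by a normalized
// type name instead of a var_types enum.
struct SpillTracker
{
    std::map<std::string, int> currentSpill;
    std::map<std::string, int> maxSpill;

    // A def that is spilled (and not immediately reloaded) adds one live spill
    // of this type; remember the high-water mark.
    void onSpill(const std::string& type)
    {
        currentSpill[type]++;
        maxSpill[type] = std::max(maxSpill[type], currentSpill[type]);
    }

    // A reload (or a reg-optional use serviced from memory) retires one
    // live spill of this type.
    void onReload(const std::string& type)
    {
        assert(currentSpill[type] > 0);
        currentSpill[type]--;
    }
};
```

Because the count only needs the *concurrent* maximum, two spills whose lifetimes do not overlap can share a single spill temp, which is exactly why a plain total-spill count would over-allocate.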
// This is the final phase of register allocation. It writes the register assignments to
// the tree, and performs resolution across joins and backedges.
//
void LinearScan::resolveRegisters()
{
    // Iterate over the tree and the RefPositions in lockstep
    //  - annotate the tree with register assignments by setting gtRegNum or gtRegPair (for longs)
    //  - track globally-live var locations
    //  - add resolution points at split/merge/critical points as needed

    // Need to use the same traversal order as the one that assigns the location numbers.

    // Dummy RefPositions have been added at any split, join or critical edge, at the
    // point where resolution may be required. These are located:
    //  - for a split, at the top of the non-adjacent block
    //  - for a join, at the bottom of the non-adjacent joining block
    //  - for a critical edge, at the top of the target block of each critical edge.
    // Note that a target block may have multiple incoming critical or split edges
    //
    // These RefPositions record the expected location of the Interval at that point.
    // At each branch, we identify the location of each liveOut interval, and check
    // against the RefPositions at the target.

    LsraLocation currentLocation = MinLocation;

    // Clear register assignments - these will be reestablished as lclVar defs (including RefTypeParamDefs)
    // are encountered.
    if (enregisterLocalVars)
    {
        for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
        {
            RegRecord* physRegRecord    = getRegisterRecord(reg);
            Interval*  assignedInterval = physRegRecord->assignedInterval;
            if (assignedInterval != nullptr)
            {
                assignedInterval->assignedReg = nullptr;
                assignedInterval->physReg     = REG_NA;
            }
            physRegRecord->assignedInterval  = nullptr;
            physRegRecord->recentRefPosition = nullptr;
        }

        // Clear "recentRefPosition" for lclVar intervals
        for (unsigned varIndex = 0; varIndex < compiler->lvaTrackedCount; varIndex++)
        {
            if (localVarIntervals[varIndex] != nullptr)
            {
                localVarIntervals[varIndex]->recentRefPosition = nullptr;
                localVarIntervals[varIndex]->isActive          = false;
            }
            else
            {
                assert(compiler->lvaTable[compiler->lvaTrackedToVarNum[varIndex]].lvLRACandidate == false);
            }
        }
    }
    // handle incoming arguments and special temps
    RefPositionIterator refPosIterator     = refPositions.begin();
    RefPosition*        currentRefPosition = &refPosIterator;

    if (enregisterLocalVars)
    {
        VarToRegMap entryVarToRegMap = inVarToRegMaps[compiler->fgFirstBB->bbNum];
        for (; refPosIterator != refPositions.end() &&
               (currentRefPosition->refType == RefTypeParamDef || currentRefPosition->refType == RefTypeZeroInit);
             ++refPosIterator, currentRefPosition = &refPosIterator)
        {
            Interval* interval = currentRefPosition->getInterval();
            assert(interval != nullptr && interval->isLocalVar);
            resolveLocalRef(nullptr, nullptr, currentRefPosition);
            regNumber reg      = REG_STK;
            int       varIndex = interval->getVarIndex(compiler);

            if (!currentRefPosition->spillAfter && currentRefPosition->registerAssignment != RBM_NONE)
            {
                reg = currentRefPosition->assignedReg();
            }
            else
            {
                reg                = REG_STK;
                interval->isActive = false;
            }
            setVarReg(entryVarToRegMap, varIndex, reg);
        }
    }
    else
    {
        assert(refPosIterator == refPositions.end() ||
               (refPosIterator->refType != RefTypeParamDef && refPosIterator->refType != RefTypeZeroInit));
    }
    BasicBlock* insertionBlock = compiler->fgFirstBB;
    GenTree*    insertionPoint = LIR::AsRange(insertionBlock).FirstNonPhiNode();

    // write back assignments
    for (block = startBlockSequence(); block != nullptr; block = moveToNextBlock())
    {
        assert(curBBNum == block->bbNum);

        if (enregisterLocalVars)
        {
            // Record the var locations at the start of this block.
            // (If it's fgFirstBB, we've already done that above, see entryVarToRegMap)

            curBBStartLocation = currentRefPosition->nodeLocation;
            if (block != compiler->fgFirstBB)
            {
                processBlockStartLocations(block, false);
            }

            // Handle the DummyDefs, updating the incoming var location.
            for (; refPosIterator != refPositions.end() && currentRefPosition->refType == RefTypeDummyDef;
                 ++refPosIterator, currentRefPosition = &refPosIterator)
            {
                assert(currentRefPosition->isIntervalRef());
                // Don't mark dummy defs as reload
                currentRefPosition->reload = false;
                resolveLocalRef(nullptr, nullptr, currentRefPosition);
                regNumber reg;
                if (currentRefPosition->registerAssignment != RBM_NONE)
                {
                    reg = currentRefPosition->assignedReg();
                }
                else
                {
                    reg                                         = REG_STK;
                    currentRefPosition->getInterval()->isActive = false;
                }
                setInVarRegForBB(curBBNum, currentRefPosition->getInterval()->varNum, reg);
            }
        }

        // The next RefPosition should be for the block. Move past it.
        assert(refPosIterator != refPositions.end());
        assert(currentRefPosition->refType == RefTypeBB);
        ++refPosIterator;
        currentRefPosition = &refPosIterator;
        // Handle the RefPositions for the block
        for (; refPosIterator != refPositions.end() && currentRefPosition->refType != RefTypeBB &&
               currentRefPosition->refType != RefTypeDummyDef;
             ++refPosIterator, currentRefPosition = &refPosIterator)
        {
            currentLocation = currentRefPosition->nodeLocation;

            // Ensure that the spill & copy info is valid.
            // First, if it's reload, it must not be copyReg or moveReg
            assert(!currentRefPosition->reload || (!currentRefPosition->copyReg && !currentRefPosition->moveReg));
            // If it's copyReg it must not be moveReg, and vice-versa
            assert(!currentRefPosition->copyReg || !currentRefPosition->moveReg);

            switch (currentRefPosition->refType)
            {
#ifdef FEATURE_SIMD
                case RefTypeUpperVectorSaveUse:
                case RefTypeUpperVectorSaveDef:
#endif // FEATURE_SIMD
                case RefTypeUse:
                case RefTypeDef:
                    // These are the ones we're interested in
                    break;
                case RefTypeKill:
                case RefTypeFixedReg:
                    // These require no handling at resolution time
                    assert(currentRefPosition->referent != nullptr);
                    currentRefPosition->referent->recentRefPosition = currentRefPosition;
                    continue;
                case RefTypeExpUse:
                    // Ignore the ExpUse cases - a RefTypeExpUse would only exist if the
                    // variable is dead at the entry to the next block. So we'll mark
                    // it as in its current location and resolution will take care of any
                    // mismatch.
                    assert(getNextBlock() == nullptr ||
                           !VarSetOps::IsMember(compiler, getNextBlock()->bbLiveIn,
                                                currentRefPosition->getInterval()->getVarIndex(compiler)));
                    currentRefPosition->referent->recentRefPosition = currentRefPosition;
                    continue;
                case RefTypeKillGCRefs:
                    // No action to take at resolution time, and no interval to update recentRefPosition for.
                    continue;
                case RefTypeDummyDef:
                case RefTypeParamDef:
                case RefTypeZeroInit:
                    // Should have handled all of these already
                default:
                    unreached();
                    break;
            }
            updateMaxSpill(currentRefPosition);
            GenTree* treeNode = currentRefPosition->treeNode;

#if FEATURE_PARTIAL_SIMD_CALLEE_SAVE
            if (currentRefPosition->refType == RefTypeUpperVectorSaveDef)
            {
                // The treeNode must be a call, and this must be a RefPosition for a LargeVectorType LocalVar.
                // If the LocalVar is in a callee-save register, we are going to spill its upper half around the call.
                // If we have allocated a register to spill it to, we will use that; otherwise, we will spill it
                // to the stack. We can use as a temp register any non-arg caller-save register.
                noway_assert(treeNode != nullptr);
                currentRefPosition->referent->recentRefPosition = currentRefPosition;
                insertUpperVectorSaveAndReload(treeNode, currentRefPosition, block);
            }
            else if (currentRefPosition->refType == RefTypeUpperVectorSaveUse)
            {
                continue;
            }
#endif // FEATURE_PARTIAL_SIMD_CALLEE_SAVE

            // Most uses won't actually need to be recorded (they're on the def).
            // In those cases, treeNode will be nullptr.
            if (treeNode == nullptr)
            {
                // This is either a use, a dead def, or a field of a struct
                Interval* interval = currentRefPosition->getInterval();
                assert(currentRefPosition->refType == RefTypeUse ||
                       currentRefPosition->registerAssignment == RBM_NONE || interval->isStructField);

                // TODO-Review: Need to handle the case where any of the struct fields
                // are reloaded/spilled at this use
                assert(!interval->isStructField ||
                       (currentRefPosition->reload == false && currentRefPosition->spillAfter == false));

                if (interval->isLocalVar && !interval->isStructField)
                {
                    LclVarDsc* varDsc = interval->getLocalVar(compiler);

                    // This must be a dead definition. We need to mark the lclVar
                    // so that it's not considered a candidate for lvRegister, as
                    // this dead def will have to go to the stack.
                    assert(currentRefPosition->refType == RefTypeDef);
                    varDsc->lvRegNum = REG_STK;
                }
                continue;
            }

            if (currentRefPosition->isIntervalRef() && currentRefPosition->getInterval()->isInternal)
            {
                treeNode->gtRsvdRegs |= currentRefPosition->registerAssignment;
            }
            else
            {
                writeRegisters(currentRefPosition, treeNode);

                if (treeNode->IsLocal() && currentRefPosition->getInterval()->isLocalVar)
                {
                    resolveLocalRef(block, treeNode, currentRefPosition);
                }
6924 // Mark spill locations on temps
6925 // (local vars are handled in resolveLocalRef, above)
6926 // Note that the tree node will be changed from GTF_SPILL to GTF_SPILLED
6927 // in codegen, taking care of the "reload" case for temps
6928 else if (currentRefPosition->spillAfter || (currentRefPosition->nextRefPosition != nullptr &&
6929 currentRefPosition->nextRefPosition->moveReg))
6931 if (treeNode != nullptr && currentRefPosition->isIntervalRef())
6933 if (currentRefPosition->spillAfter)
6935 treeNode->gtFlags |= GTF_SPILL;
6937 // If this is a constant interval that is reusing a pre-existing value, we actually need
6938 // to generate the value at this point in order to spill it.
6939 if (treeNode->IsReuseRegVal())
6941 treeNode->ResetReuseRegVal();
6944 // In case of multi-reg call node, also set spill flag on the
6945 // register specified by multi-reg index of current RefPosition.
6946 // Note that the spill flag on treeNode indicates that one or
6947 // more its allocated registers are in that state.
6948 if (treeNode->IsMultiRegCall())
6950 GenTreeCall* call = treeNode->AsCall();
6951 call->SetRegSpillFlagByIdx(GTF_SPILL, currentRefPosition->getMultiRegIdx());
6954 else if (treeNode->OperIsPutArgSplit())
6956 GenTreePutArgSplit* splitArg = treeNode->AsPutArgSplit();
6957 splitArg->SetRegSpillFlagByIdx(GTF_SPILL, currentRefPosition->getMultiRegIdx());
6959 else if (treeNode->OperIsMultiRegOp())
6961 GenTreeMultiRegOp* multiReg = treeNode->AsMultiRegOp();
6962 multiReg->SetRegSpillFlagByIdx(GTF_SPILL, currentRefPosition->getMultiRegIdx());
6967 // If the value is reloaded or moved to a different register, we need to insert
6968 // a node to hold the register to which it should be reloaded
6969 RefPosition* nextRefPosition = currentRefPosition->nextRefPosition;
6970 assert(nextRefPosition != nullptr);
6971 if (INDEBUG(alwaysInsertReload() ||)
6972 nextRefPosition->assignedReg() != currentRefPosition->assignedReg())
6974 if (nextRefPosition->assignedReg() != REG_NA)
6976 insertCopyOrReload(block, treeNode, currentRefPosition->getMultiRegIdx(),
6981 assert(nextRefPosition->AllocateIfProfitable());
6983 // In the case of tree temps, if the def is spilled and the use didn't
6984 // get a register, set a flag on the tree node so that it is treated as
6985 // contained at the point of its use.
6986 if (currentRefPosition->spillAfter && currentRefPosition->refType == RefTypeDef &&
6987 nextRefPosition->refType == RefTypeUse)
6989 assert(nextRefPosition->treeNode == nullptr);
6990 treeNode->gtFlags |= GTF_NOREG_AT_USE;
6996 // We should never have to "spill after" a temp use, since
6997 // they're single use
7006 if (enregisterLocalVars)
7008 processBlockEndLocations(block);
7012 if (enregisterLocalVars)
7017 printf("-----------------------\n");
7018 printf("RESOLVING BB BOUNDARIES\n");
7019 printf("-----------------------\n");
7021 printf("Resolution Candidates: ");
7022 dumpConvertedVarSet(compiler, resolutionCandidateVars);
7024 printf("Has %sCritical Edges\n\n", hasCriticalEdges ? "" : "No");
7026 printf("Prior to Resolution\n");
7027 foreach_block(compiler, block)
7029 printf("\nBB%02u use def in out\n", block->bbNum);
7030 dumpConvertedVarSet(compiler, block->bbVarUse);
7032 dumpConvertedVarSet(compiler, block->bbVarDef);
7034 dumpConvertedVarSet(compiler, block->bbLiveIn);
7036 dumpConvertedVarSet(compiler, block->bbLiveOut);
7039 dumpInVarToRegMap(block);
7040 dumpOutVarToRegMap(block);
7049 // Verify register assignments on variables
7052 for (lclNum = 0, varDsc = compiler->lvaTable; lclNum < compiler->lvaCount; lclNum++, varDsc++)
7054 if (!isCandidateVar(varDsc))
7056 varDsc->lvRegNum = REG_STK;
7060 Interval* interval = getIntervalForLocalVar(varDsc->lvVarIndex);
7062 // Determine initial position for parameters
7064 if (varDsc->lvIsParam)
7066 regMaskTP initialRegMask = interval->firstRefPosition->registerAssignment;
7067 regNumber initialReg = (initialRegMask == RBM_NONE || interval->firstRefPosition->spillAfter)
7069 : genRegNumFromMask(initialRegMask);
7070 regNumber sourceReg = (varDsc->lvIsRegArg) ? varDsc->lvArgReg : REG_STK;
7073 if (varTypeIsMultiReg(varDsc))
7075 // TODO-ARM-NYI: Map the hi/lo intervals back to lvRegNum and lvOtherReg (these should NYI
7077 assert(!"Multi-reg types not yet supported");
7080 #endif // _TARGET_ARM_
7082 varDsc->lvArgInitReg = initialReg;
7083 JITDUMP(" Set V%02u argument initial register to %s\n", lclNum, getRegName(initialReg));
7086 // Stack args that are part of dependently-promoted structs should never be register candidates (see
7087 // LinearScan::isRegCandidate).
7088 assert(varDsc->lvIsRegArg || !compiler->lvaIsFieldOfDependentlyPromotedStruct(varDsc));
7091 // If lvRegNum is REG_STK, that means that either no register
7092 // was assigned, or (more likely) that the same register was not
7093 // used for all references. In that case, codegen gets the register
7094 // from the tree node.
7095 if (varDsc->lvRegNum == REG_STK || interval->isSpilled || interval->isSplit)
7097 // For codegen purposes, we'll set lvRegNum to whatever register
7098 // it's currently in as we go.
7099 // However, we never mark an interval as lvRegister if it has either been spilled
7101 varDsc->lvRegister = false;
7103 // Skip any dead defs or exposed uses
7104 // (first use exposed will only occur when there is no explicit initialization)
7105 RefPosition* firstRefPosition = interval->firstRefPosition;
7106 while ((firstRefPosition != nullptr) && (firstRefPosition->refType == RefTypeExpUse))
7108 firstRefPosition = firstRefPosition->nextRefPosition;
7110 if (firstRefPosition == nullptr)
7113 varDsc->lvLRACandidate = false;
7114 if (varDsc->lvRefCnt == 0)
7116 varDsc->lvOnFrame = false;
7120 // We may encounter cases where a lclVar actually has no references, but
7121 // a non-zero refCnt. For safety (in case this is some "hidden" lclVar that we're
7122 // not correctly recognizing), we'll mark those as needing a stack location.
7123 // TODO-Cleanup: Make this an assert if/when we correct the refCnt
7125 varDsc->lvOnFrame = true;
7130 // If the interval was not spilled, it doesn't need a stack location.
7131 if (!interval->isSpilled)
7133 varDsc->lvOnFrame = false;
7135 if (firstRefPosition->registerAssignment == RBM_NONE || firstRefPosition->spillAfter)
7137 // Either this RefPosition is spilled, or it is regOptional, or it is not a "real" def or use
7139 firstRefPosition->spillAfter || firstRefPosition->AllocateIfProfitable() ||
7140 (firstRefPosition->refType != RefTypeDef && firstRefPosition->refType != RefTypeUse));
7141 varDsc->lvRegNum = REG_STK;
7145 varDsc->lvRegNum = firstRefPosition->assignedReg();
7152 varDsc->lvRegister = true;
7153 varDsc->lvOnFrame = false;
7156 regMaskTP registerAssignment = genRegMask(varDsc->lvRegNum);
7157 assert(!interval->isSpilled && !interval->isSplit);
7158 RefPosition* refPosition = interval->firstRefPosition;
7159 assert(refPosition != nullptr);
7161 while (refPosition != nullptr)
7163 // All RefPositions must match, except for dead definitions,
7164 // copyReg/moveReg and RefTypeExpUse positions
7165 if (refPosition->registerAssignment != RBM_NONE && !refPosition->copyReg &&
7166 !refPosition->moveReg && refPosition->refType != RefTypeExpUse)
7168 assert(refPosition->registerAssignment == registerAssignment);
7170 refPosition = refPosition->nextRefPosition;
7181 printf("Trees after linear scan register allocator (LSRA)\n");
7182 compiler->fgDispBasicBlocks(true);
7185 verifyFinalAllocation();
7188 compiler->raMarkStkVars();
7191 // TODO-CQ: Review this comment and address as needed.
7192 // Change all unused promoted non-argument struct locals to a non-GC type (in this case TYP_INT)
7193 // so that the gc tracking logic and lvMustInit logic will ignore them.
7194 // Extract the code that does this from raAssignVars, and call it here.
7195 // PRECONDITIONS: Ensure that lvPromoted is set on promoted structs, if and
7196 // only if it is promoted on all paths.
7197 // Call might be something like:
7198 // compiler->BashUnusedStructLocals();
7202 //------------------------------------------------------------------------
7203 // insertMove: Insert a move of a lclVar with the given lclNum into the given block.
7206 // block - the BasicBlock into which the move will be inserted.
7207 // insertionPoint - the instruction before which to insert the move
7208 // lclNum - the lclNum of the var to be moved
7209 // fromReg - the register from which the var is moving
7210 // toReg - the register to which the var is moving
7216 // If insertionPoint is non-NULL, insert before that instruction;
7217 // otherwise, insert "near" the end (prior to the branch, if any).
7218 // If fromReg or toReg is REG_STK, then move from/to memory, respectively.
7220 void LinearScan::insertMove(
7221 BasicBlock* block, GenTree* insertionPoint, unsigned lclNum, regNumber fromReg, regNumber toReg)
7223 LclVarDsc* varDsc = compiler->lvaTable + lclNum;
7224 // the lclVar must be a register candidate
7225 assert(isRegCandidate(varDsc));
7226 // One or both MUST be a register
7227 assert(fromReg != REG_STK || toReg != REG_STK);
7228 // They must not be the same register.
7229 assert(fromReg != toReg);
7231 // This var can't be marked lvRegister now
7232 varDsc->lvRegNum = REG_STK;
7234 GenTree* src = compiler->gtNewLclvNode(lclNum, varDsc->TypeGet());
7237 // There are three cases we need to handle:
7238 // - We are loading a lclVar from the stack.
7239 // - We are storing a lclVar to the stack.
7240 // - We are copying a lclVar between registers.
7242 // In the first and second cases, the lclVar node will be marked with GTF_SPILLED and GTF_SPILL, respectively.
7243 // It is up to the code generator to ensure that any necessary normalization is done when loading or storing the
7246 // In the third case, we generate GT_COPY(GT_LCL_VAR) and type each node with the normalized type of the lclVar.
7247 // This is safe because a lclVar is always normalized once it is in a register.
7250 if (fromReg == REG_STK)
7252 src->gtFlags |= GTF_SPILLED;
7253 src->gtRegNum = toReg;
7255 else if (toReg == REG_STK)
7257 src->gtFlags |= GTF_SPILL;
7258 src->gtRegNum = fromReg;
7262 var_types movType = genActualType(varDsc->TypeGet());
7263 src->gtType = movType;
7265 dst = new (compiler, GT_COPY) GenTreeCopyOrReload(GT_COPY, movType, src);
7266 // This is the new home of the lclVar - indicate that by clearing the GTF_VAR_DEATH flag.
7267 // Note that if src is itself a lastUse, this will have no effect.
7268 dst->gtFlags &= ~(GTF_VAR_DEATH);
7269 src->gtRegNum = fromReg;
7270 dst->gtRegNum = toReg;
7273 dst->SetUnusedValue();
7275 LIR::Range treeRange = LIR::SeqTree(compiler, dst);
7276 LIR::Range& blockRange = LIR::AsRange(block);
7278 if (insertionPoint != nullptr)
7280 blockRange.InsertBefore(insertionPoint, std::move(treeRange));
7284 // Put the copy at the bottom
7285 // If there's a branch, make an embedded statement that executes just prior to the branch
7286 if (block->bbJumpKind == BBJ_COND || block->bbJumpKind == BBJ_SWITCH)
7288 noway_assert(!blockRange.IsEmpty());
7290 GenTree* branch = blockRange.LastNode();
7291 assert(branch->OperIsConditionalJump() || branch->OperGet() == GT_SWITCH_TABLE ||
7292 branch->OperGet() == GT_SWITCH);
7294 blockRange.InsertBefore(branch, std::move(treeRange));
7298 assert(block->bbJumpKind == BBJ_NONE || block->bbJumpKind == BBJ_ALWAYS);
7299 blockRange.InsertAtEnd(std::move(treeRange));
7304 void LinearScan::insertSwap(
7305 BasicBlock* block, GenTree* insertionPoint, unsigned lclNum1, regNumber reg1, unsigned lclNum2, regNumber reg2)
7310 const char* insertionPointString = "top";
7311 if (insertionPoint == nullptr)
7313 insertionPointString = "bottom";
7315 printf(" BB%02u %s: swap V%02u in %s with V%02u in %s\n", block->bbNum, insertionPointString, lclNum1,
7316 getRegName(reg1), lclNum2, getRegName(reg2));
7320 LclVarDsc* varDsc1 = compiler->lvaTable + lclNum1;
7321 LclVarDsc* varDsc2 = compiler->lvaTable + lclNum2;
7322 assert(reg1 != REG_STK && reg1 != REG_NA && reg2 != REG_STK && reg2 != REG_NA);
7324 GenTree* lcl1 = compiler->gtNewLclvNode(lclNum1, varDsc1->TypeGet());
7325 lcl1->gtRegNum = reg1;
7328 GenTree* lcl2 = compiler->gtNewLclvNode(lclNum2, varDsc2->TypeGet());
7329 lcl2->gtRegNum = reg2;
7332 GenTree* swap = compiler->gtNewOperNode(GT_SWAP, TYP_VOID, lcl1, lcl2);
7333 swap->gtRegNum = REG_NA;
7336 lcl1->gtNext = lcl2;
7337 lcl2->gtPrev = lcl1;
7338 lcl2->gtNext = swap;
7339 swap->gtPrev = lcl2;
7341 LIR::Range swapRange = LIR::SeqTree(compiler, swap);
7342 LIR::Range& blockRange = LIR::AsRange(block);
7344 if (insertionPoint != nullptr)
7346 blockRange.InsertBefore(insertionPoint, std::move(swapRange));
7350 // Put the swap at the bottom
7351 // If there's a branch, make an embedded statement that executes just prior to the branch
7352 if (block->bbJumpKind == BBJ_COND || block->bbJumpKind == BBJ_SWITCH)
7354 noway_assert(!blockRange.IsEmpty());
7356 GenTree* branch = blockRange.LastNode();
7357 assert(branch->OperIsConditionalJump() || branch->OperGet() == GT_SWITCH_TABLE ||
7358 branch->OperGet() == GT_SWITCH);
7360 blockRange.InsertBefore(branch, std::move(swapRange));
7364 assert(block->bbJumpKind == BBJ_NONE || block->bbJumpKind == BBJ_ALWAYS);
7365 blockRange.InsertAtEnd(std::move(swapRange));
7370 //------------------------------------------------------------------------
7371 // getTempRegForResolution: Get a free register to use for resolution code.
7374 // fromBlock - The "from" block on the edge being resolved.
7375 toBlock - The "to" block on the edge
7376 // type - the type of register required
7379 // Returns a register that is free on the given edge, or REG_NA if none is available.
7382 // It is up to the caller to check the return value, and to determine whether a register is
7383 // available, and to handle that case appropriately.
7384 // It is also up to the caller to cache the return value, as this is not cheap to compute.
7386 regNumber LinearScan::getTempRegForResolution(BasicBlock* fromBlock, BasicBlock* toBlock, var_types type)
7388 // TODO-Throughput: This would be much more efficient if we add RegToVarMaps instead of VarToRegMaps
7389 // and they would be more space-efficient as well.
7390 VarToRegMap fromVarToRegMap = getOutVarToRegMap(fromBlock->bbNum);
7391 VarToRegMap toVarToRegMap = getInVarToRegMap(toBlock->bbNum);
7395 if (type == TYP_DOUBLE)
7397 // We have to consider all float registers for TYP_DOUBLE
7398 freeRegs = allRegs(TYP_FLOAT);
7402 freeRegs = allRegs(type);
7404 #else // !_TARGET_ARM_
7405 regMaskTP freeRegs = allRegs(type);
7406 #endif // !_TARGET_ARM_
7409 if (getStressLimitRegs() == LSRA_LIMIT_SMALL_SET)
7414 INDEBUG(freeRegs = stressLimitRegs(nullptr, freeRegs));
7416 // We are only interested in the variables that are live-in to the "to" block.
7417 VarSetOps::Iter iter(compiler, toBlock->bbLiveIn);
7418 unsigned varIndex = 0;
7419 while (iter.NextElem(&varIndex) && freeRegs != RBM_NONE)
7421 regNumber fromReg = getVarReg(fromVarToRegMap, varIndex);
7422 regNumber toReg = getVarReg(toVarToRegMap, varIndex);
7423 assert(fromReg != REG_NA && toReg != REG_NA);
7424 if (fromReg != REG_STK)
7426 freeRegs &= ~genRegMask(fromReg, getIntervalForLocalVar(varIndex)->registerType);
7428 if (toReg != REG_STK)
7430 freeRegs &= ~genRegMask(toReg, getIntervalForLocalVar(varIndex)->registerType);
7435 if (type == TYP_DOUBLE)
7437 // Exclude any doubles for which the odd half isn't in freeRegs.
7438 freeRegs = freeRegs & ((freeRegs >> 1) & RBM_ALLDOUBLE);
7442 if (freeRegs == RBM_NONE)
7448 regNumber tempReg = genRegNumFromMask(genFindLowestBit(freeRegs));
7454 //------------------------------------------------------------------------
7455 // addResolutionForDouble: Add resolution move(s) for TYP_DOUBLE interval
7456 // and update location.
7459 // block - the BasicBlock into which the move will be inserted.
7460 // insertionPoint - the instruction before which to insert the move
7461 sourceIntervals - maintains sourceIntervals[reg], the interval with which each 'reg' is associated
7462 // location - maintains location[reg] which is the location of the var that was originally in 'reg'.
7463 // toReg - the register to which the var is moving
7464 // fromReg - the register from which the var is moving
7465 // resolveType - the type of resolution to be performed
7471 // It inserts at least one move and updates incoming parameter 'location'.
7473 void LinearScan::addResolutionForDouble(BasicBlock* block,
7474 GenTree* insertionPoint,
7475 Interval** sourceIntervals,
7476 regNumberSmall* location,
7479 ResolveType resolveType)
7481 regNumber secondHalfTargetReg = REG_NEXT(fromReg);
7482 Interval* intervalToBeMoved1 = sourceIntervals[fromReg];
7483 Interval* intervalToBeMoved2 = sourceIntervals[secondHalfTargetReg];
7485 assert(!(intervalToBeMoved1 == nullptr && intervalToBeMoved2 == nullptr));
7487 if (intervalToBeMoved1 != nullptr)
7489 if (intervalToBeMoved1->registerType == TYP_DOUBLE)
7491 // TYP_DOUBLE interval occupies a double register, i.e. two float registers.
7492 assert(intervalToBeMoved2 == nullptr);
7493 assert(genIsValidDoubleReg(toReg));
7497 // TYP_FLOAT interval occupies 1st half of double register, i.e. 1st float register
7498 assert(genIsValidFloatReg(toReg));
7500 addResolution(block, insertionPoint, intervalToBeMoved1, toReg, fromReg);
7501 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
7502 location[fromReg] = (regNumberSmall)toReg;
7505 if (intervalToBeMoved2 != nullptr)
7507 // TYP_FLOAT interval occupies 2nd half of double register.
7508 assert(intervalToBeMoved2->registerType == TYP_FLOAT);
7509 regNumber secondHalfTempReg = REG_NEXT(toReg);
7511 addResolution(block, insertionPoint, intervalToBeMoved2, secondHalfTempReg, secondHalfTargetReg);
7512 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
7513 location[secondHalfTargetReg] = (regNumberSmall)secondHalfTempReg;
7518 #endif // _TARGET_ARM_
7520 //------------------------------------------------------------------------
7521 // addResolution: Add a resolution move of the given interval
7524 // block - the BasicBlock into which the move will be inserted.
7525 // insertionPoint - the instruction before which to insert the move
7526 // interval - the interval of the var to be moved
7527 // toReg - the register to which the var is moving
7528 // fromReg - the register from which the var is moving
7534 // For joins, we insert at the bottom (indicated by an insertionPoint
7535 // of nullptr), while for splits we insert at the top.
7536 // This is because for joins 'block' is a pred of the join, while for splits it is a succ.
7537 // For critical edges, this function may be called twice - once to move from
7538 // the source (fromReg), if any, to the stack, in which case toReg will be
7539 // REG_STK, and we insert at the bottom (leave insertionPoint as nullptr).
7540 // The next time, we want to move from the stack to the destination (toReg),
7541 // in which case fromReg will be REG_STK, and we insert at the top.
7543 void LinearScan::addResolution(
7544 BasicBlock* block, GenTree* insertionPoint, Interval* interval, regNumber toReg, regNumber fromReg)
7547 const char* insertionPointString = "top";
7549 if (insertionPoint == nullptr)
7552 insertionPointString = "bottom";
7556 JITDUMP(" BB%02u %s: move V%02u from ", block->bbNum, insertionPointString, interval->varNum);
7557 JITDUMP("%s to %s", getRegName(fromReg), getRegName(toReg));
7559 insertMove(block, insertionPoint, interval->varNum, fromReg, toReg);
7560 if (fromReg == REG_STK || toReg == REG_STK)
7562 assert(interval->isSpilled);
7566 // We should have already marked this as spilled or split.
7567 assert((interval->isSpilled) || (interval->isSplit));
7570 INTRACK_STATS(updateLsraStat(LSRA_STAT_RESOLUTION_MOV, block->bbNum));
7573 //------------------------------------------------------------------------
7574 // handleOutgoingCriticalEdges: Performs the necessary resolution on all critical edges that feed out of 'block'
7577 // block - the block with outgoing critical edges.
7583 // For all outgoing critical edges (i.e. any successor of this block which is
7584 // a join edge), if there are any conflicts, split the edge by adding a new block,
7585 // and generate the resolution code into that block.
7587 void LinearScan::handleOutgoingCriticalEdges(BasicBlock* block)
7589 VARSET_TP outResolutionSet(VarSetOps::Intersection(compiler, block->bbLiveOut, resolutionCandidateVars));
7590 if (VarSetOps::IsEmpty(compiler, outResolutionSet))
7594 VARSET_TP sameResolutionSet(VarSetOps::MakeEmpty(compiler));
7595 VARSET_TP sameLivePathsSet(VarSetOps::MakeEmpty(compiler));
7596 VARSET_TP singleTargetSet(VarSetOps::MakeEmpty(compiler));
7597 VARSET_TP diffResolutionSet(VarSetOps::MakeEmpty(compiler));
7599 // Get the outVarToRegMap for this block
7600 VarToRegMap outVarToRegMap = getOutVarToRegMap(block->bbNum);
7601 unsigned succCount = block->NumSucc(compiler);
7602 assert(succCount > 1);
7603 VarToRegMap firstSuccInVarToRegMap = nullptr;
7604 BasicBlock* firstSucc = nullptr;
7606 // First, determine the live regs at the end of this block so that we know what regs are
7607 // available to copy into.
7608 // Note that for this purpose we use the full live-out set, because we must ensure that
7609 // even the registers that remain the same across the edge are preserved correctly.
7610 regMaskTP liveOutRegs = RBM_NONE;
7611 VarSetOps::Iter liveOutIter(compiler, block->bbLiveOut);
7612 unsigned liveOutVarIndex = 0;
7613 while (liveOutIter.NextElem(&liveOutVarIndex))
7615 regNumber fromReg = getVarReg(outVarToRegMap, liveOutVarIndex);
7616 if (fromReg != REG_STK)
7618 liveOutRegs |= genRegMask(fromReg);
7622 // Next, if this block ends with a switch table, we have to make sure not to copy
7623 // into the registers that it uses.
7624 regMaskTP switchRegs = RBM_NONE;
7625 if (block->bbJumpKind == BBJ_SWITCH)
7627 // At this point, Lowering has transformed any non-switch-table blocks into
7629 GenTree* switchTable = LIR::AsRange(block).LastNode();
7630 assert(switchTable != nullptr && switchTable->OperGet() == GT_SWITCH_TABLE);
7632 switchRegs = switchTable->gtRsvdRegs;
7633 GenTree* op1 = switchTable->gtGetOp1();
7634 GenTree* op2 = switchTable->gtGetOp2();
7635 noway_assert(op1 != nullptr && op2 != nullptr);
7636 assert(op1->gtRegNum != REG_NA && op2->gtRegNum != REG_NA);
7637 switchRegs |= genRegMask(op1->gtRegNum);
7638 switchRegs |= genRegMask(op2->gtRegNum);
7641 #ifdef _TARGET_ARM64_
7642 // Next, if this block ends with a JCMP, we have to make sure not to copy
7643 // into the register that it uses, or modify the local variable that it must consume.
7644 LclVarDsc* jcmpLocalVarDsc = nullptr;
7645 if (block->bbJumpKind == BBJ_COND)
7647 GenTree* lastNode = LIR::AsRange(block).LastNode();
7649 if (lastNode->OperIs(GT_JCMP))
7651 GenTree* op1 = lastNode->gtGetOp1();
7652 switchRegs |= genRegMask(op1->gtRegNum);
7656 GenTreeLclVarCommon* lcl = op1->AsLclVarCommon();
7657 jcmpLocalVarDsc = &compiler->lvaTable[lcl->gtLclNum];
7663 VarToRegMap sameVarToRegMap = sharedCriticalVarToRegMap;
7664 regMaskTP sameWriteRegs = RBM_NONE;
7665 regMaskTP diffReadRegs = RBM_NONE;
7667 // For each var that may require resolution, classify them as:
7668 // - in the same register at the end of this block and at each target (no resolution needed)
7669 // - in different registers at different targets (resolve separately):
7670 // diffResolutionSet
7671 // - in the same register at each target at which it's live, but different from the end of
7672 // this block. We may be able to resolve these as if it is "join", but only if they do not
7673 // write to any registers that are read by those in the diffResolutionSet:
7674 // sameResolutionSet
7676 VarSetOps::Iter outResolutionSetIter(compiler, outResolutionSet);
7677 unsigned outResolutionSetVarIndex = 0;
7678 while (outResolutionSetIter.NextElem(&outResolutionSetVarIndex))
7680 regNumber fromReg = getVarReg(outVarToRegMap, outResolutionSetVarIndex);
7681 bool isMatch = true;
7682 bool isSame = false;
7683 bool maybeSingleTarget = false;
7684 bool maybeSameLivePaths = false;
7685 bool liveOnlyAtSplitEdge = true;
7686 regNumber sameToReg = REG_NA;
7687 for (unsigned succIndex = 0; succIndex < succCount; succIndex++)
7689 BasicBlock* succBlock = block->GetSucc(succIndex, compiler);
7690 if (!VarSetOps::IsMember(compiler, succBlock->bbLiveIn, outResolutionSetVarIndex))
7692 maybeSameLivePaths = true;
7695 else if (liveOnlyAtSplitEdge)
7697 // Is the var live only at those target blocks which are connected by a split edge to this block
7698 liveOnlyAtSplitEdge = ((succBlock->bbPreds->flNext == nullptr) && (succBlock != compiler->fgFirstBB));
7701 regNumber toReg = getVarReg(getInVarToRegMap(succBlock->bbNum), outResolutionSetVarIndex);
7702 if (sameToReg == REG_NA)
7707 if (toReg == sameToReg)
7715 // Check for the cases where we can't write to a register.
7716 // We only need to check for these cases if sameToReg is an actual register (not REG_STK).
7717 if (sameToReg != REG_NA && sameToReg != REG_STK)
7719 // If there's a path on which this var isn't live, it may use the original value in sameToReg.
7720 // In this case, sameToReg will be in the liveOutRegs of this block.
7721 // Similarly, if sameToReg is in sameWriteRegs, it has already been used (i.e. for a lclVar that's
7722 // live only at another target), and we can't copy another lclVar into that reg in this block.
7723 regMaskTP sameToRegMask = genRegMask(sameToReg);
7724 if (maybeSameLivePaths &&
7725 (((sameToRegMask & liveOutRegs) != RBM_NONE) || ((sameToRegMask & sameWriteRegs) != RBM_NONE)))
7729 // If this register is used by a switch table at the end of the block, we can't do the copy
7730 // in this block (since we can't insert it after the switch).
7731 if ((sameToRegMask & switchRegs) != RBM_NONE)
7736 #ifdef _TARGET_ARM64_
7737 if (jcmpLocalVarDsc && (jcmpLocalVarDsc->lvVarIndex == outResolutionSetVarIndex))
7743 // If the var is live only at those blocks connected by a split edge and not live-in at some of the
7744 // target blocks, we will resolve it the same way as if it were in diffResolutionSet and resolution
7745 // will be deferred to the handling of split edges, which means the copy will only be at those target(s).
7747 // Another way to achieve similar resolution for vars live only at split edges is by removing them
7748 // from consideration up-front, but it requires that we traverse those edges anyway to account for
7749 // the registers that must not be overwritten.
7750 if (liveOnlyAtSplitEdge && maybeSameLivePaths)
7756 if (sameToReg == REG_NA)
7758 VarSetOps::AddElemD(compiler, diffResolutionSet, outResolutionSetVarIndex);
7759 if (fromReg != REG_STK)
7761 diffReadRegs |= genRegMask(fromReg);
7764 else if (sameToReg != fromReg)
7766 VarSetOps::AddElemD(compiler, sameResolutionSet, outResolutionSetVarIndex);
7767 setVarReg(sameVarToRegMap, outResolutionSetVarIndex, sameToReg);
7768 if (sameToReg != REG_STK)
7770 sameWriteRegs |= genRegMask(sameToReg);
7775 if (!VarSetOps::IsEmpty(compiler, sameResolutionSet))
7777 if ((sameWriteRegs & diffReadRegs) != RBM_NONE)
7779 // We cannot split the "same" and "diff" regs if the "same" set writes registers
7780 // that must be read by the "diff" set. (Note that when these are done as a "batch"
7781 // we carefully order them to ensure all the input regs are read before they are
7783 VarSetOps::UnionD(compiler, diffResolutionSet, sameResolutionSet);
7784 VarSetOps::ClearD(compiler, sameResolutionSet);
7788 // For any vars in the sameResolutionSet, we can simply add the move at the end of "block".
7789 resolveEdge(block, nullptr, ResolveSharedCritical, sameResolutionSet);
7792 if (!VarSetOps::IsEmpty(compiler, diffResolutionSet))
7794 for (unsigned succIndex = 0; succIndex < succCount; succIndex++)
7796 BasicBlock* succBlock = block->GetSucc(succIndex, compiler);
7798 // Any "diffResolutionSet" resolution for a block with no other predecessors will be handled later
7799 // as split resolution.
7800 if ((succBlock->bbPreds->flNext == nullptr) && (succBlock != compiler->fgFirstBB))
7805 // Now collect the resolution set for just this edge, if any.
7806 // Check only the vars in diffResolutionSet that are live-in to this successor.
7807 bool needsResolution = false;
7808 VarToRegMap succInVarToRegMap = getInVarToRegMap(succBlock->bbNum);
7809 VARSET_TP edgeResolutionSet(VarSetOps::Intersection(compiler, diffResolutionSet, succBlock->bbLiveIn));
7810 VarSetOps::Iter iter(compiler, edgeResolutionSet);
7811 unsigned varIndex = 0;
7812 while (iter.NextElem(&varIndex))
7814 regNumber fromReg = getVarReg(outVarToRegMap, varIndex);
7815 regNumber toReg = getVarReg(succInVarToRegMap, varIndex);
7817 if (fromReg == toReg)
7819 VarSetOps::RemoveElemD(compiler, edgeResolutionSet, varIndex);
7822 if (!VarSetOps::IsEmpty(compiler, edgeResolutionSet))
7824 resolveEdge(block, succBlock, ResolveCritical, edgeResolutionSet);
7830 //------------------------------------------------------------------------
7831 // resolveEdges: Perform resolution across basic block edges
7840 // Traverse the basic blocks.
7841 // - If this block has a single predecessor that is not the immediately
7842 // preceding block, perform any needed 'split' resolution at the beginning of this block
7843 // - Otherwise if this block has critical incoming edges, handle them.
7844 // - If this block has a single successor that has multiple predecessors, perform any needed
7845 // 'join' resolution at the end of this block.
7846 // Note that a block may have both 'split' or 'critical' incoming edge(s) and 'join' outgoing
7849 void LinearScan::resolveEdges()
7851 JITDUMP("RESOLVING EDGES\n");
7853 // The resolutionCandidateVars set was initialized with all the lclVars that are live-in to
7854 // any block. We now intersect that set with any lclVars that ever spilled or split.
7855 // If there are no candidates for resolution, simply return.
7857 VarSetOps::IntersectionD(compiler, resolutionCandidateVars, splitOrSpilledVars);
7858 if (VarSetOps::IsEmpty(compiler, resolutionCandidateVars))
7863 BasicBlock *block, *prevBlock = nullptr;
7865 // Handle all the critical edges first.
7866 // We will try to avoid resolution across critical edges in cases where all the critical-edge
7867 // targets of a block have the same home. We will then split the edges only for the
7868 // remaining mismatches. We visit the out-edges, as that allows us to share the moves that are
7869 // common among all the targets.
7871 if (hasCriticalEdges)
7873 foreach_block(compiler, block)
7875 if (block->bbNum > bbNumMaxBeforeResolution)
7877 // This is a new block added during resolution - we don't need to visit these now.
7880 if (blockInfo[block->bbNum].hasCriticalOutEdge)
7882 handleOutgoingCriticalEdges(block);
7888 prevBlock = nullptr;
7889 foreach_block(compiler, block)
7891 if (block->bbNum > bbNumMaxBeforeResolution)
7893 // This is a new block added during resolution - we don't need to visit these now.
7897 unsigned succCount = block->NumSucc(compiler);
7898 flowList* preds = block->bbPreds;
7899 BasicBlock* uniquePredBlock = block->GetUniquePred(compiler);
7901 // First, if this block has a single predecessor,
7902 // we may need resolution at the beginning of this block.
7903 // This may be true even if it's the block we used for starting locations,
7904 // if a variable was spilled.
7905 VARSET_TP inResolutionSet(VarSetOps::Intersection(compiler, block->bbLiveIn, resolutionCandidateVars));
7906 if (!VarSetOps::IsEmpty(compiler, inResolutionSet))
7908 if (uniquePredBlock != nullptr)
7910 // We may have split edges during critical edge resolution, and in the process split
7911 // a non-critical edge as well.
7912 // It is unlikely that we would ever have more than one of these in sequence (indeed,
7913 // I don't think it's possible), but there's no need to assume that it can't.
7914 while (uniquePredBlock->bbNum > bbNumMaxBeforeResolution)
7916 uniquePredBlock = uniquePredBlock->GetUniquePred(compiler);
7917 noway_assert(uniquePredBlock != nullptr);
7919 resolveEdge(uniquePredBlock, block, ResolveSplit, inResolutionSet);
7923 // Finally, if this block has a single successor:
7924 // - and that has at least one other predecessor (otherwise we will do the resolution at the
7925 // top of the successor),
7926 // - and that is not the target of a critical edge (otherwise we've already handled it)
7927 // we may need resolution at the end of this block.
7931 BasicBlock* succBlock = block->GetSucc(0, compiler);
7932 if (succBlock->GetUniquePred(compiler) == nullptr)
7934 VARSET_TP outResolutionSet(
7935 VarSetOps::Intersection(compiler, succBlock->bbLiveIn, resolutionCandidateVars));
7936 if (!VarSetOps::IsEmpty(compiler, outResolutionSet))
7938 resolveEdge(block, succBlock, ResolveJoin, outResolutionSet);
7944 // Now, fixup the mapping for any blocks that were added for edge splitting.
7945 // See the comment prior to the call to fgSplitEdge() in resolveEdge().
7946 // Note that we could fold this loop in with the checking code below, but that
7947 // would only improve the debug case, and would clutter up the code somewhat.
7948 if (compiler->fgBBNumMax > bbNumMaxBeforeResolution)
7950 foreach_block(compiler, block)
7952 if (block->bbNum > bbNumMaxBeforeResolution)
7954 // There may be multiple blocks inserted when we split. But we must always have exactly
7955 // one path (i.e. all blocks must be single-successor and single-predecessor),
7956 // and only one block along the path may be non-empty.
7957 // Note that we may have a newly-inserted block that is empty, but which connects
7958 // two non-resolution blocks. This happens when splitting an edge requires inserting such a block.
7960 BasicBlock* succBlock = block;
7963 succBlock = succBlock->GetUniqueSucc();
7964 noway_assert(succBlock != nullptr);
7965 } while ((succBlock->bbNum > bbNumMaxBeforeResolution) && succBlock->isEmpty());
7967 BasicBlock* predBlock = block;
7970 predBlock = predBlock->GetUniquePred(compiler);
7971 noway_assert(predBlock != nullptr);
7972 } while ((predBlock->bbNum > bbNumMaxBeforeResolution) && predBlock->isEmpty());
7974 unsigned succBBNum = succBlock->bbNum;
7975 unsigned predBBNum = predBlock->bbNum;
7976 if (block->isEmpty())
7978 // For the case of the empty block, find the non-resolution block (succ or pred).
7979 if (predBBNum > bbNumMaxBeforeResolution)
7981 assert(succBBNum <= bbNumMaxBeforeResolution);
7991 assert((succBBNum <= bbNumMaxBeforeResolution) && (predBBNum <= bbNumMaxBeforeResolution));
7993 SplitEdgeInfo info = {predBBNum, succBBNum};
7994 getSplitBBNumToTargetBBNumMap()->Set(block->bbNum, info);
8000 // Make sure the varToRegMaps match up on all edges.
8001 bool foundMismatch = false;
8002 foreach_block(compiler, block)
8004 if (block->isEmpty() && block->bbNum > bbNumMaxBeforeResolution)
8008 VarToRegMap toVarToRegMap = getInVarToRegMap(block->bbNum);
8009 for (flowList* pred = block->bbPreds; pred != nullptr; pred = pred->flNext)
8011 BasicBlock* predBlock = pred->flBlock;
8012 VarToRegMap fromVarToRegMap = getOutVarToRegMap(predBlock->bbNum);
8013 VarSetOps::Iter iter(compiler, block->bbLiveIn);
8014 unsigned varIndex = 0;
8015 while (iter.NextElem(&varIndex))
8017 regNumber fromReg = getVarReg(fromVarToRegMap, varIndex);
8018 regNumber toReg = getVarReg(toVarToRegMap, varIndex);
8019 if (fromReg != toReg)
8023 foundMismatch = true;
8024 printf("Found mismatched var locations after resolution!\n");
8026 unsigned varNum = compiler->lvaTrackedToVarNum[varIndex];
8027 printf(" V%02u: BB%02u to BB%02u: %s to %s\n", varNum, predBlock->bbNum, block->bbNum,
8028 getRegName(fromReg), getRegName(toReg));
8033 assert(!foundMismatch);
8038 //------------------------------------------------------------------------
8039 // resolveEdge: Perform the specified type of resolution between two blocks.
8042 // fromBlock - the block from which the edge originates
8043 // toBlock - the block at which the edge terminates
8044 // resolveType - the type of resolution to be performed
8045 // liveSet - the set of tracked lclVar indices which may require resolution
8051 // The caller must have performed the analysis to determine the type of the edge.
8054 // This method emits the correctly ordered moves necessary to place variables in the
8055 // correct registers across a Split, Join or Critical edge.
8056 // In order to avoid overwriting register values before they have been moved to their
8057 // new home (register/stack), it first does the register-to-stack moves (to free those
8058 // registers), then the register to register moves, ensuring that the target register
8059 // is free before the move, and then finally the stack to register moves.
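// As a sketch of the ordering above (hypothetical variables and registers, not
// taken from any real trace), suppose that across a Join edge:
//   V01 must move from rax to its stack slot,
//   V02 must move from rcx to rax, and
//   V03 must move from its stack slot to rcx.
// The three phases then produce:
//   mov [V01's slot], rax    ; reg-to-stack first, freeing rax
//   mov rax, rcx             ; reg-to-reg, into the now-free rax
//   mov rcx, [V03's slot]    ; stack-to-reg last, into the now-free rcx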
8061 void LinearScan::resolveEdge(BasicBlock* fromBlock,
8062 BasicBlock* toBlock,
8063 ResolveType resolveType,
8064 VARSET_VALARG_TP liveSet)
8066 VarToRegMap fromVarToRegMap = getOutVarToRegMap(fromBlock->bbNum);
8067 VarToRegMap toVarToRegMap;
8068 if (resolveType == ResolveSharedCritical)
8070 toVarToRegMap = sharedCriticalVarToRegMap;
8074 toVarToRegMap = getInVarToRegMap(toBlock->bbNum);
8077 // The block to which we add the resolution moves depends on the resolveType
8079 switch (resolveType)
8082 case ResolveSharedCritical:
8088 case ResolveCritical:
8089 // fgSplitEdge may add one or two BasicBlocks. It returns the block that splits
8090 // the edge between 'fromBlock' and 'toBlock', but if it inserts that block right after
8091 // a block with a fall-through it will have to create another block to handle that edge.
8092 // These new blocks can be mapped to existing blocks in order to correctly handle
8093 // the calls to recordVarLocationsAtStartOfBB() from codegen. That mapping is handled
8094 // in resolveEdges(), after all the edge resolution has been done (by calling this
8095 // method for each edge).
8096 block = compiler->fgSplitEdge(fromBlock, toBlock);
8098 // Split edges are counted against fromBlock.
8099 INTRACK_STATS(updateLsraStat(LSRA_STAT_SPLIT_EDGE, fromBlock->bbNum));
8106 #ifndef _TARGET_XARCH_
8107 // We record tempregs for beginning and end of each block.
8108 // For amd64/x86 we only need a tempReg for float - we'll use xchg for int.
8109 // TODO-Throughput: It would be better to determine the tempRegs on demand, but the code below
8110 // modifies the varToRegMaps so we don't have all the correct registers at the time
8111 // we need to get the tempReg.
8112 regNumber tempRegInt =
8113 (resolveType == ResolveSharedCritical) ? REG_NA : getTempRegForResolution(fromBlock, toBlock, TYP_INT);
8114 #endif // !_TARGET_XARCH_
8115 regNumber tempRegFlt = REG_NA;
8116 regNumber tempRegDbl = REG_NA; // Used only for ARM
8117 if ((compiler->compFloatingPointUsed) && (resolveType != ResolveSharedCritical))
8120 // Try to reserve a double register for TYP_DOUBLE and use it for TYP_FLOAT too if available.
8121 tempRegDbl = getTempRegForResolution(fromBlock, toBlock, TYP_DOUBLE);
8122 if (tempRegDbl != REG_NA)
8124 tempRegFlt = tempRegDbl;
8127 #endif // _TARGET_ARM_
8129 tempRegFlt = getTempRegForResolution(fromBlock, toBlock, TYP_FLOAT);
8133 regMaskTP targetRegsToDo = RBM_NONE;
8134 regMaskTP targetRegsReady = RBM_NONE;
8135 regMaskTP targetRegsFromStack = RBM_NONE;
8137 // The following arrays capture the location of the registers as they are moved:
8138 // - location[reg] gives the current location of the var that was originally in 'reg'.
8139 // (Note that a var may be moved more than once.)
8140 // - source[reg] gives the original location of the var that needs to be moved to 'reg'.
8141 // For example, if a var is in rax and needs to be moved to rsi, then we would start with:
8142 // location[rax] == rax
8143 // source[rsi] == rax -- this doesn't change
8144 // Then, if for some reason we need to move it temporarily to rbx, we would have:
8145 // location[rax] == rbx
8146 // Once we have completed the move, we will have:
8147 // location[rax] == REG_NA
8148 // This indicates that the var originally in rax is now in its target register.
8150 regNumberSmall location[REG_COUNT];
8151 C_ASSERT(sizeof(char) == sizeof(regNumberSmall)); // for memset to work
8152 memset(location, REG_NA, REG_COUNT);
8153 regNumberSmall source[REG_COUNT];
8154 memset(source, REG_NA, REG_COUNT);
8156 // What interval is this register associated with?
8157 // (associated with incoming reg)
8158 Interval* sourceIntervals[REG_COUNT];
8159 memset(&sourceIntervals, 0, sizeof(sourceIntervals));
8161 // Intervals for vars that need to be loaded from the stack
8162 Interval* stackToRegIntervals[REG_COUNT];
8163 memset(&stackToRegIntervals, 0, sizeof(stackToRegIntervals));
8165 // Get the starting insertion point for the "to" resolution
8166 GenTree* insertionPoint = nullptr;
8167 if (resolveType == ResolveSplit || resolveType == ResolveCritical)
8169 insertionPoint = LIR::AsRange(block).FirstNonPhiNode();
8173 // - Perform all moves from reg to stack (no ordering needed on these)
8174 // - For reg to reg moves, record the current location, associating their
8175 // source location with the target register they need to go into
8176 // - For stack to reg moves (done last, no ordering needed between them)
8177 // record the interval associated with the target reg
8178 // TODO-Throughput: We should be looping over the liveIn and liveOut registers, since
8179 // that will scale better than the live variables
8181 VarSetOps::Iter iter(compiler, liveSet);
8182 unsigned varIndex = 0;
8183 while (iter.NextElem(&varIndex))
8185 regNumber fromReg = getVarReg(fromVarToRegMap, varIndex);
8186 regNumber toReg = getVarReg(toVarToRegMap, varIndex);
8187 if (fromReg == toReg)
8192 // For Critical edges, the location will not change on either side of the edge,
8193 // since we'll add a new block to do the move.
8194 if (resolveType == ResolveSplit)
8196 setVarReg(toVarToRegMap, varIndex, fromReg);
8198 else if (resolveType == ResolveJoin || resolveType == ResolveSharedCritical)
8200 setVarReg(fromVarToRegMap, varIndex, toReg);
8203 assert(fromReg < UCHAR_MAX && toReg < UCHAR_MAX);
8205 Interval* interval = getIntervalForLocalVar(varIndex);
8207 if (fromReg == REG_STK)
8209 stackToRegIntervals[toReg] = interval;
8210 targetRegsFromStack |= genRegMask(toReg);
8212 else if (toReg == REG_STK)
8214 // Do the reg to stack moves now
8215 addResolution(block, insertionPoint, interval, REG_STK, fromReg);
8216 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
8220 location[fromReg] = (regNumberSmall)fromReg;
8221 source[toReg] = (regNumberSmall)fromReg;
8222 sourceIntervals[fromReg] = interval;
8223 targetRegsToDo |= genRegMask(toReg);
8227 // REGISTER to REGISTER MOVES
8229 // First, find all the ones that are ready to move now
8230 regMaskTP targetCandidates = targetRegsToDo;
8231 while (targetCandidates != RBM_NONE)
8233 regMaskTP targetRegMask = genFindLowestBit(targetCandidates);
8234 targetCandidates &= ~targetRegMask;
8235 regNumber targetReg = genRegNumFromMask(targetRegMask);
8236 if (location[targetReg] == REG_NA)
8239 regNumber sourceReg = (regNumber)source[targetReg];
8240 Interval* interval = sourceIntervals[sourceReg];
8241 if (interval->registerType == TYP_DOUBLE)
8243 // For ARM32, make sure that both of the float halves of the double register are available.
8244 assert(genIsValidDoubleReg(targetReg));
8245 regNumber anotherHalfRegNum = REG_NEXT(targetReg);
8246 if (location[anotherHalfRegNum] == REG_NA)
8248 targetRegsReady |= targetRegMask;
8252 #endif // _TARGET_ARM_
8254 targetRegsReady |= targetRegMask;
8259 // Perform reg to reg moves
8260 while (targetRegsToDo != RBM_NONE)
8262 while (targetRegsReady != RBM_NONE)
8264 regMaskTP targetRegMask = genFindLowestBit(targetRegsReady);
8265 targetRegsToDo &= ~targetRegMask;
8266 targetRegsReady &= ~targetRegMask;
8267 regNumber targetReg = genRegNumFromMask(targetRegMask);
8268 assert(location[targetReg] != targetReg);
8269 regNumber sourceReg = (regNumber)source[targetReg];
8270 regNumber fromReg = (regNumber)location[sourceReg];
8271 assert(fromReg < UCHAR_MAX && sourceReg < UCHAR_MAX);
8272 Interval* interval = sourceIntervals[sourceReg];
8273 assert(interval != nullptr);
8274 addResolution(block, insertionPoint, interval, targetReg, fromReg);
8275 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
8276 sourceIntervals[sourceReg] = nullptr;
8277 location[sourceReg] = REG_NA;
8279 // Do we have a free targetReg?
8280 if (fromReg == sourceReg)
8282 if (source[fromReg] != REG_NA)
8284 regMaskTP fromRegMask = genRegMask(fromReg);
8285 targetRegsReady |= fromRegMask;
8287 if (genIsValidDoubleReg(fromReg))
8289 // Ensure that either:
8290 // - the Interval targeting fromReg is not double, or
8291 // - the other half of the double is free.
8292 Interval* otherInterval = sourceIntervals[source[fromReg]];
8293 regNumber upperHalfReg = REG_NEXT(fromReg);
8294 if ((otherInterval->registerType == TYP_DOUBLE) && (location[upperHalfReg] != REG_NA))
8296 targetRegsReady &= ~fromRegMask;
8300 else if (genIsValidFloatReg(fromReg) && !genIsValidDoubleReg(fromReg))
8302 // We may have freed up the other half of a double where the lower half
8303 // was already free.
8304 regNumber lowerHalfReg = REG_PREV(fromReg);
8305 regNumber lowerHalfSrcReg = (regNumber)source[lowerHalfReg];
8306 regNumber lowerHalfSrcLoc = (regNumber)location[lowerHalfReg];
8307 // Necessary conditions:
8308 // - There is a source register for this reg (lowerHalfSrcReg != REG_NA)
8309 // - It is currently free (lowerHalfSrcLoc == REG_NA)
8310 // - The source interval isn't yet completed (sourceIntervals[lowerHalfSrcReg] != nullptr)
8311 // - It's not in the ready set ((targetRegsReady & genRegMask(lowerHalfReg)) == RBM_NONE)
8314 if ((lowerHalfSrcReg != REG_NA) && (lowerHalfSrcLoc == REG_NA) &&
8315 (sourceIntervals[lowerHalfSrcReg] != nullptr) &&
8316 ((targetRegsReady & genRegMask(lowerHalfReg)) == RBM_NONE))
8318 // This must be a double interval, otherwise it would be in targetRegsReady, or already completed.
8320 assert(sourceIntervals[lowerHalfSrcReg]->registerType == TYP_DOUBLE);
8321 targetRegsReady |= genRegMask(lowerHalfReg);
8323 #endif // _TARGET_ARM_
8327 if (targetRegsToDo != RBM_NONE)
8329 regMaskTP targetRegMask = genFindLowestBit(targetRegsToDo);
8330 regNumber targetReg = genRegNumFromMask(targetRegMask);
8332 // Is it already there due to other moves?
8333 // If not, move it to the temp reg, OR swap it with another register
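// As a hypothetical example, consider a two-register cycle: the var in rax must
// end up in rcx while the var in rcx must end up in rax. Neither move can be
// emitted first without clobbering the other, so we either swap the two registers
// (xchg on xarch), move one var to a temp register, or, if no temp is free,
// spill one var to the stack and reload it after the other moves (the rare case
// handled below).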
8334 regNumber sourceReg = (regNumber)source[targetReg];
8335 regNumber fromReg = (regNumber)location[sourceReg];
8336 if (targetReg == fromReg)
8338 targetRegsToDo &= ~targetRegMask;
8342 regNumber tempReg = REG_NA;
8343 bool useSwap = false;
8344 if (emitter::isFloatReg(targetReg))
8347 if (sourceIntervals[fromReg]->registerType == TYP_DOUBLE)
8349 // ARM32 requires a double temp register for TYP_DOUBLE.
8350 tempReg = tempRegDbl;
8353 #endif // _TARGET_ARM_
8354 tempReg = tempRegFlt;
8356 #ifdef _TARGET_XARCH_
8361 #else // !_TARGET_XARCH_
8365 tempReg = tempRegInt;
8368 #endif // !_TARGET_XARCH_
8369 if (useSwap || tempReg == REG_NA)
8371 // First, we have to figure out the destination register for what's currently in fromReg,
8372 // so that we can find its sourceInterval.
8373 regNumber otherTargetReg = REG_NA;
8375 // By chance, is fromReg going where it belongs?
8376 if (location[source[fromReg]] == targetReg)
8378 otherTargetReg = fromReg;
8379 // If we can swap, we will be done with otherTargetReg as well.
8380 // Otherwise, we'll spill it to the stack and reload it later.
8383 regMaskTP fromRegMask = genRegMask(fromReg);
8384 targetRegsToDo &= ~fromRegMask;
8389 // Look at the remaining registers from targetRegsToDo (which we expect to be relatively
8390 // small at this point) to find out what's currently in targetReg.
8391 regMaskTP mask = targetRegsToDo;
8392 while (mask != RBM_NONE && otherTargetReg == REG_NA)
8394 regMaskTP nextRegMask = genFindLowestBit(mask);
8395 regNumber nextReg = genRegNumFromMask(nextRegMask);
8396 mask &= ~nextRegMask;
8397 if (location[source[nextReg]] == targetReg)
8399 otherTargetReg = nextReg;
8403 assert(otherTargetReg != REG_NA);
8407 // Generate a "swap" of fromReg and targetReg
8408 insertSwap(block, insertionPoint, sourceIntervals[source[otherTargetReg]]->varNum, targetReg,
8409 sourceIntervals[sourceReg]->varNum, fromReg);
8410 location[sourceReg] = REG_NA;
8411 location[source[otherTargetReg]] = (regNumberSmall)fromReg;
8413 INTRACK_STATS(updateLsraStat(LSRA_STAT_RESOLUTION_MOV, block->bbNum));
8417 // Spill "targetReg" to the stack and add its eventual target (otherTargetReg)
8418 // to "targetRegsFromStack", which will be handled below.
8419 // NOTE: This condition is very rare. Setting COMPlus_JitStressRegs=0x203
8420 // has been known to trigger it in JIT SH.
8422 // First, spill "otherInterval" from targetReg to the stack.
8423 Interval* otherInterval = sourceIntervals[source[otherTargetReg]];
8424 setIntervalAsSpilled(otherInterval);
8425 addResolution(block, insertionPoint, otherInterval, REG_STK, targetReg);
8426 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
8427 location[source[otherTargetReg]] = REG_STK;
8429 // Now, move the interval that is going to targetReg, and add its "fromReg" to
8430 // "targetRegsReady".
8431 addResolution(block, insertionPoint, sourceIntervals[sourceReg], targetReg, fromReg);
8432 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
8433 location[sourceReg] = REG_NA;
8434 targetRegsReady |= genRegMask(fromReg);
8436 targetRegsToDo &= ~targetRegMask;
8440 compiler->codeGen->regSet.rsSetRegsModified(genRegMask(tempReg) DEBUGARG(true));
8442 if (sourceIntervals[fromReg]->registerType == TYP_DOUBLE)
8444 assert(genIsValidDoubleReg(targetReg));
8445 assert(genIsValidDoubleReg(tempReg));
8447 addResolutionForDouble(block, insertionPoint, sourceIntervals, location, tempReg, targetReg,
8451 #endif // _TARGET_ARM_
8453 assert(sourceIntervals[targetReg] != nullptr);
8455 addResolution(block, insertionPoint, sourceIntervals[targetReg], tempReg, targetReg);
8456 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
8457 location[targetReg] = (regNumberSmall)tempReg;
8459 targetRegsReady |= targetRegMask;
8465 // Finally, perform stack to reg moves
8466 // All the target regs will be empty at this point
8467 while (targetRegsFromStack != RBM_NONE)
8469 regMaskTP targetRegMask = genFindLowestBit(targetRegsFromStack);
8470 targetRegsFromStack &= ~targetRegMask;
8471 regNumber targetReg = genRegNumFromMask(targetRegMask);
8473 Interval* interval = stackToRegIntervals[targetReg];
8474 assert(interval != nullptr);
8476 addResolution(block, insertionPoint, interval, targetReg, REG_STK);
8477 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
8481 #if TRACK_LSRA_STATS
8482 // ----------------------------------------------------------
8483 // updateLsraStat: Increment LSRA stat counter.
8486 // stat - LSRA stat enum
8487 bbNum - Basic block to which the LSRA stat needs to be attributed
8490 void LinearScan::updateLsraStat(LsraStat stat, unsigned bbNum)
8492 if (bbNum > bbNumMaxBeforeResolution)
8494 // This is a newly created basic block as part of resolution.
8495 // These blocks contain resolution moves that are already accounted.
8501 case LSRA_STAT_SPILL:
8502 ++(blockInfo[bbNum].spillCount);
8505 case LSRA_STAT_COPY_REG:
8506 ++(blockInfo[bbNum].copyRegCount);
8509 case LSRA_STAT_RESOLUTION_MOV:
8510 ++(blockInfo[bbNum].resolutionMovCount);
8513 case LSRA_STAT_SPLIT_EDGE:
8514 ++(blockInfo[bbNum].splitEdgeCount);
8522 // -----------------------------------------------------------
8523 // dumpLsraStats - dumps LSRA stats to the given file.
8526 // file - file to which stats are to be written.
8528 void LinearScan::dumpLsraStats(FILE* file)
8530 unsigned sumSpillCount = 0;
8531 unsigned sumCopyRegCount = 0;
8532 unsigned sumResolutionMovCount = 0;
8533 unsigned sumSplitEdgeCount = 0;
8534 UINT64 wtdSpillCount = 0;
8535 UINT64 wtdCopyRegCount = 0;
8536 UINT64 wtdResolutionMovCount = 0;
8538 fprintf(file, "----------\n");
8539 fprintf(file, "LSRA Stats");
8543 fprintf(file, " : %s\n", compiler->info.compFullName);
8547 // In verbose mode there is no need to print the full method name
8548 // while printing LSRA stats.
8549 fprintf(file, "\n");
8552 fprintf(file, " : %s\n", compiler->eeGetMethodFullName(compiler->info.compCompHnd));
8555 fprintf(file, "----------\n");
8557 for (BasicBlock* block = compiler->fgFirstBB; block != nullptr; block = block->bbNext)
8559 if (block->bbNum > bbNumMaxBeforeResolution)
8564 unsigned spillCount = blockInfo[block->bbNum].spillCount;
8565 unsigned copyRegCount = blockInfo[block->bbNum].copyRegCount;
8566 unsigned resolutionMovCount = blockInfo[block->bbNum].resolutionMovCount;
8567 unsigned splitEdgeCount = blockInfo[block->bbNum].splitEdgeCount;
8569 if (spillCount != 0 || copyRegCount != 0 || resolutionMovCount != 0 || splitEdgeCount != 0)
8571 fprintf(file, "BB%02u [%8d]: ", block->bbNum, block->bbWeight);
8572 fprintf(file, "SpillCount = %d, ResolutionMovs = %d, SplitEdges = %d, CopyReg = %d\n", spillCount,
8573 resolutionMovCount, splitEdgeCount, copyRegCount);
8576 sumSpillCount += spillCount;
8577 sumCopyRegCount += copyRegCount;
8578 sumResolutionMovCount += resolutionMovCount;
8579 sumSplitEdgeCount += splitEdgeCount;
8581 wtdSpillCount += (UINT64)spillCount * block->bbWeight;
8582 wtdCopyRegCount += (UINT64)copyRegCount * block->bbWeight;
8583 wtdResolutionMovCount += (UINT64)resolutionMovCount * block->bbWeight;
8586 fprintf(file, "Total Tracked Vars: %d\n", compiler->lvaTrackedCount);
8587 fprintf(file, "Total Reg Cand Vars: %d\n", regCandidateVarCount);
8588 fprintf(file, "Total number of Intervals: %d\n", static_cast<unsigned>(intervals.size() - 1));
8589 fprintf(file, "Total number of RefPositions: %d\n", static_cast<unsigned>(refPositions.size() - 1));
8590 fprintf(file, "Total Spill Count: %d Weighted: %I64u\n", sumSpillCount, wtdSpillCount);
8591 fprintf(file, "Total CopyReg Count: %d Weighted: %I64u\n", sumCopyRegCount, wtdCopyRegCount);
8592 fprintf(file, "Total ResolutionMov Count: %d Weighted: %I64u\n", sumResolutionMovCount, wtdResolutionMovCount);
8593 fprintf(file, "Total number of split edges: %d\n", sumSplitEdgeCount);
8595 // compute total number of spill temps created
8596 unsigned numSpillTemps = 0;
8597 for (int i = 0; i < TYP_COUNT; i++)
8599 numSpillTemps += maxSpill[i];
8601 fprintf(file, "Total Number of spill temps created: %d\n\n", numSpillTemps);
8603 #endif // TRACK_LSRA_STATS
8606 void dumpRegMask(regMaskTP regs)
8608 if (regs == RBM_ALLINT)
8612 else if (regs == (RBM_ALLINT & ~RBM_FPBASE))
8614 printf("[allIntButFP]");
8616 else if (regs == RBM_ALLFLOAT)
8618 printf("[allFloat]");
8620 else if (regs == RBM_ALLDOUBLE)
8622 printf("[allDouble]");
8630 static const char* getRefTypeName(RefType refType)
8634 #define DEF_REFTYPE(memberName, memberValue, shortName) \
8637 #include "lsra_reftypes.h"
8644 static const char* getRefTypeShortName(RefType refType)
8648 #define DEF_REFTYPE(memberName, memberValue, shortName) \
8651 #include "lsra_reftypes.h"
8658 void RefPosition::dump()
8660 printf("<RefPosition #%-3u @%-3u", rpNum, nodeLocation);
8662 if (nextRefPosition)
8664 printf(" ->#%-3u", nextRefPosition->rpNum);
8667 printf(" %s ", getRefTypeName(refType));
8669 if (this->isPhysRegRef)
8671 this->getReg()->tinyDump();
8673 else if (getInterval())
8675 this->getInterval()->tinyDump();
8680 printf("%s ", treeNode->OpName(treeNode->OperGet()));
8682 printf("BB%02u ", this->bbNum);
8685 dumpRegMask(registerAssignment);
8687 printf(" minReg=%d", minRegCandidateCount);
8697 if (this->spillAfter)
8699 printf(" spillAfter");
8709 if (this->isFixedRegRef)
8713 if (this->isLocalDefUse)
8717 if (this->delayRegFree)
8721 if (this->outOfOrder)
8723 printf(" outOfOrder");
8726 if (this->AllocateIfProfitable())
8728 printf(" regOptional");
8733 void RegRecord::dump()
8738 void Interval::dump()
8740 printf("Interval %2u:", intervalIndex);
8744 printf(" (V%02u)", varNum);
8748 printf(" (INTERNAL)");
8752 printf(" (SPILLED)");
8760 printf(" (struct)");
8762 if (isPromotedStruct)
8764 printf(" (promoted struct)");
8766 if (hasConflictingDefUse)
8768 printf(" (def-use conflict)");
8770 if (hasInterferingUses)
8772 printf(" (interfering uses)");
8774 if (isSpecialPutArg)
8776 printf(" (specialPutArg)");
8780 printf(" (constant)");
8784 printf(" (multireg)");
8787 printf(" RefPositions {");
8788 for (RefPosition* refPosition = this->firstRefPosition; refPosition != nullptr;
8789 refPosition = refPosition->nextRefPosition)
8791 printf("#%u@%u", refPosition->rpNum, refPosition->nodeLocation);
8792 if (refPosition->nextRefPosition)
8799 // this is not used (yet?)
8800 // printf(" SpillOffset %d", this->spillOffset);
8802 printf(" physReg:%s", getRegName(physReg));
8804 printf(" Preferences=");
8805 dumpRegMask(this->registerPreferences);
8807 if (relatedInterval)
8809 printf(" RelatedInterval ");
8810 relatedInterval->microDump();
8811 printf("[%p]", dspPtr(relatedInterval));
8817 // print out very concise representation
8818 void Interval::tinyDump()
8820 printf("<Ivl:%u", intervalIndex);
8823 printf(" V%02u", varNum);
8827 printf(" internal");
8832 // print out extremely concise representation
8833 void Interval::microDump()
8835 char intervalTypeChar = 'I';
8838 intervalTypeChar = 'T';
8840 else if (isLocalVar)
8842 intervalTypeChar = 'L';
8845 printf("<%c%u>", intervalTypeChar, intervalIndex);
8848 void RegRecord::tinyDump()
8850 printf("<Reg:%-3s> ", getRegName(regNum));
8853 void TreeNodeInfo::dump(LinearScan* lsra)
8855 printf("<TreeNodeInfo %d=%d %di %df", dstCount, srcCount, internalIntCount, internalFloatCount);
8857 dumpRegMask(getSrcCandidates(lsra));
8859 dumpRegMask(getInternalCandidates(lsra));
8861 dumpRegMask(getDstCandidates(lsra));
8878 if (isInternalRegDelayFree)
8885 void LinearScan::dumpDefList()
8887 JITDUMP("DefList: { ");
8889 for (LocationInfoListNode *listNode = defList.Begin(), *end = defList.End(); listNode != end;
8890 listNode = listNode->Next())
8892 GenTree* node = listNode->treeNode;
8893 JITDUMP("%sN%03u.t%d. %s", first ? "" : "; ", node->gtSeqNum, node->gtTreeID, GenTree::OpName(node->OperGet()));
8899 void LinearScan::lsraDumpIntervals(const char* msg)
8901 printf("\nLinear scan intervals %s:\n", msg);
8902 for (Interval& interval : intervals)
8904 // only dump something if it has references
8905 // if (interval->firstRefPosition)
8912 // Dumps a tree node as a destination or source operand, with the style
8913 // of dump dependent on the mode
8914 void LinearScan::lsraGetOperandString(GenTree* tree,
8915 LsraTupleDumpMode mode,
8916 char* operandString,
8917 unsigned operandStringLength)
8919 const char* lastUseChar = "";
8920 if ((tree->gtFlags & GTF_VAR_DEATH) != 0)
8926 case LinearScan::LSRA_DUMP_PRE:
8927 _snprintf_s(operandString, operandStringLength, operandStringLength, "t%d%s", tree->gtTreeID, lastUseChar);
8929 case LinearScan::LSRA_DUMP_REFPOS:
8930 _snprintf_s(operandString, operandStringLength, operandStringLength, "t%d%s", tree->gtTreeID, lastUseChar);
8932 case LinearScan::LSRA_DUMP_POST:
8934 Compiler* compiler = JitTls::GetCompiler();
8936 if (!tree->gtHasReg())
8938 _snprintf_s(operandString, operandStringLength, operandStringLength, "STK%s", lastUseChar);
8942 _snprintf_s(operandString, operandStringLength, operandStringLength, "%s%s",
8943 getRegName(tree->gtRegNum, useFloatReg(tree->TypeGet())), lastUseChar);
8948 printf("ERROR: INVALID TUPLE DUMP MODE\n");
8952 void LinearScan::lsraDispNode(GenTree* tree, LsraTupleDumpMode mode, bool hasDest)
8954 Compiler* compiler = JitTls::GetCompiler();
8955 const unsigned operandStringLength = 16;
8956 char operandString[operandStringLength];
8957 const char* emptyDestOperand = " ";
8958 char spillChar = ' ';
8960 if (mode == LinearScan::LSRA_DUMP_POST)
8962 if ((tree->gtFlags & GTF_SPILL) != 0)
8966 if (!hasDest && tree->gtHasReg())
8968 // A node can define a register, but not produce a value for a parent to consume,
8969 // i.e. in the "localDefUse" case.
8970 // There used to be an assert here that we wouldn't spill such a node.
8971 // However, we can have unused lclVars that wind up being the node at which
8972 // it is spilled. This probably indicates a bug, but we don't really want to
8973 // assert during a dump.
8974 if (spillChar == 'S')
8985 printf("%c N%03u. ", spillChar, tree->gtSeqNum);
8987 LclVarDsc* varDsc = nullptr;
8988 unsigned varNum = UINT_MAX;
8989 if (tree->IsLocal())
8991 varNum = tree->gtLclVarCommon.gtLclNum;
8992 varDsc = &(compiler->lvaTable[varNum]);
8993 if (varDsc->lvLRACandidate)
9000 if (mode == LinearScan::LSRA_DUMP_POST && tree->gtFlags & GTF_SPILLED)
9002 assert(tree->gtHasReg());
9004 lsraGetOperandString(tree, mode, operandString, operandStringLength);
9005 printf("%-15s =", operandString);
9009 printf("%-15s ", emptyDestOperand);
9011 if (varDsc != nullptr)
9013 if (varDsc->lvLRACandidate)
9015 if (mode == LSRA_DUMP_REFPOS)
9017 printf(" V%02u(L%d)", varNum, getIntervalForLocalVar(varDsc->lvVarIndex)->intervalIndex);
9021 lsraGetOperandString(tree, mode, operandString, operandStringLength);
9022 printf(" V%02u(%s)", varNum, operandString);
9023 if (mode == LinearScan::LSRA_DUMP_POST && tree->gtFlags & GTF_SPILLED)
9031 printf(" V%02u MEM", varNum);
9034 else if (tree->OperIsAssignment())
9036 assert(!tree->gtHasReg());
9037 printf(" asg%s ", GenTree::OpName(tree->OperGet()));
9041 compiler->gtDispNodeName(tree);
9042 if (tree->OperKind() & GTK_LEAF)
9044 compiler->gtDispLeaf(tree, nullptr);
9049 //------------------------------------------------------------------------
9050 // DumpOperandDefs: dumps the registers defined by a node.
9053 //    operand - The operand whose defined registers are to be dumped.
9056 //    None.
9058 void LinearScan::DumpOperandDefs(
9059 GenTree* operand, bool& first, LsraTupleDumpMode mode, char* operandString, const unsigned operandStringLength)
9061 assert(operand != nullptr);
9062 assert(operandString != nullptr);
9063 if (!operand->IsLIR())
9068 int dstCount = ComputeOperandDstCount(operand);
9072 // This operand directly produces registers; print it.
9073 for (int i = 0; i < dstCount; i++)
9080 lsraGetOperandString(operand, mode, operandString, operandStringLength);
9081 printf("%s", operandString);
9086 else if (operand->isContained())
9088 // This is a contained node. Dump the defs produced by its operands.
9089 for (GenTree* op : operand->Operands())
9091 DumpOperandDefs(op, first, mode, operandString, operandStringLength);
9096 void LinearScan::TupleStyleDump(LsraTupleDumpMode mode)
9099 LsraLocation currentLoc = 1; // 0 is the entry
9100 const unsigned operandStringLength = 16;
9101 char operandString[operandStringLength];
9103 // currentRefPosition is not used for LSRA_DUMP_PRE
9104 // We keep separate iterators for defs, so that we can print them
9105 // on the lhs of the dump
9106 RefPositionIterator refPosIterator = refPositions.begin();
9107 RefPosition* currentRefPosition = &refPosIterator;
9112 printf("TUPLE STYLE DUMP BEFORE LSRA\n");
9114 case LSRA_DUMP_REFPOS:
9115 printf("TUPLE STYLE DUMP WITH REF POSITIONS\n");
9117 case LSRA_DUMP_POST:
9118 printf("TUPLE STYLE DUMP WITH REGISTER ASSIGNMENTS\n");
9121 printf("ERROR: INVALID TUPLE DUMP MODE\n");
9125 if (mode != LSRA_DUMP_PRE)
9127 printf("Incoming Parameters: ");
9128 for (; refPosIterator != refPositions.end() && currentRefPosition->refType != RefTypeBB;
9129 ++refPosIterator, currentRefPosition = &refPosIterator)
9131 Interval* interval = currentRefPosition->getInterval();
9132 assert(interval != nullptr && interval->isLocalVar);
9133 printf(" V%02d", interval->varNum);
9134 if (mode == LSRA_DUMP_POST)
if (currentRefPosition->registerAssignment == RBM_NONE)
reg = currentRefPosition->assignedReg();
LclVarDsc* varDsc = &(compiler->lvaTable[interval->varNum]);
regNumber assignedReg = varDsc->lvRegNum;
regNumber argReg = (varDsc->lvIsRegArg) ? varDsc->lvArgReg : REG_STK;
assert(reg == assignedReg || varDsc->lvRegister == false);
printf(getRegName(argReg, isFloatRegType(interval->registerType)));
printf("%s)", getRegName(reg, isFloatRegType(interval->registerType)));
for (block = startBlockSequence(); block != nullptr; block = moveToNextBlock())
if (mode == LSRA_DUMP_REFPOS)
bool printedBlockHeader = false;
// We should find the boundary RefPositions in the order of exposed uses, dummy defs, and the blocks
for (; refPosIterator != refPositions.end() &&
(currentRefPosition->refType == RefTypeExpUse || currentRefPosition->refType == RefTypeDummyDef ||
(currentRefPosition->refType == RefTypeBB && !printedBlockHeader));
++refPosIterator, currentRefPosition = &refPosIterator)
Interval* interval = nullptr;
if (currentRefPosition->isIntervalRef())
interval = currentRefPosition->getInterval();
switch (currentRefPosition->refType)
assert(interval != nullptr);
assert(interval->isLocalVar);
printf("  Exposed use of V%02u at #%d\n", interval->varNum, currentRefPosition->rpNum);
case RefTypeDummyDef:
assert(interval != nullptr);
assert(interval->isLocalVar);
printf("  Dummy def of V%02u at #%d\n", interval->varNum, currentRefPosition->rpNum);
block->dspBlockHeader(compiler);
printedBlockHeader = true;
printf("Unexpected RefPosition type at #%d\n", currentRefPosition->rpNum);
block->dspBlockHeader(compiler);
if (enregisterLocalVars && mode == LSRA_DUMP_POST && block != compiler->fgFirstBB &&
block->bbNum <= bbNumMaxBeforeResolution)
printf("Predecessor for variable locations: BB%02u\n", blockInfo[block->bbNum].predBBNum);
dumpInVarToRegMap(block);
if (block->bbNum > bbNumMaxBeforeResolution)
SplitEdgeInfo splitEdgeInfo;
splitBBNumToTargetBBNumMap->Lookup(block->bbNum, &splitEdgeInfo);
assert(splitEdgeInfo.toBBNum <= bbNumMaxBeforeResolution);
assert(splitEdgeInfo.fromBBNum <= bbNumMaxBeforeResolution);
printf("New block introduced for resolution from BB%02u to BB%02u\n", splitEdgeInfo.fromBBNum,
splitEdgeInfo.toBBNum);
for (GenTree* node : LIR::AsRange(block).NonPhiNodes())
GenTree* tree = node;
genTreeOps oper = tree->OperGet();
int produce = tree->IsValue() ? ComputeOperandDstCount(tree) : 0;
int consume = ComputeAvailableSrcCount(tree);
regMaskTP killMask = RBM_NONE;
regMaskTP fixedMask = RBM_NONE;
lsraDispNode(tree, mode, produce != 0 && mode != LSRA_DUMP_REFPOS);
if (mode != LSRA_DUMP_REFPOS)
for (GenTree* operand : tree->Operands())
DumpOperandDefs(operand, first, mode, operandString, operandStringLength);
// Print each RefPosition on a new line, but print all the kills for each node
// on a single line, combining the fixed regs with their associated def or use.
bool killPrinted = false;
RefPosition* lastFixedRegRefPos = nullptr;
for (; refPosIterator != refPositions.end() &&
(currentRefPosition->refType == RefTypeUse || currentRefPosition->refType == RefTypeFixedReg ||
currentRefPosition->refType == RefTypeKill || currentRefPosition->refType == RefTypeDef) &&
(currentRefPosition->nodeLocation == tree->gtSeqNum ||
currentRefPosition->nodeLocation == tree->gtSeqNum + 1);
++refPosIterator, currentRefPosition = &refPosIterator)
Interval* interval = nullptr;
if (currentRefPosition->isIntervalRef())
interval = currentRefPosition->getInterval();
switch (currentRefPosition->refType)
if (currentRefPosition->isPhysRegRef)
printf("\n                               Use:R%d(#%d)",
currentRefPosition->getReg()->regNum, currentRefPosition->rpNum);
assert(interval != nullptr);
interval->microDump();
printf("(#%d)", currentRefPosition->rpNum);
if (currentRefPosition->isFixedRegRef)
assert(genMaxOneBit(currentRefPosition->registerAssignment));
assert(lastFixedRegRefPos != nullptr);
printf(" Fixed:%s(#%d)", getRegName(currentRefPosition->assignedReg(),
isFloatRegType(interval->registerType)),
lastFixedRegRefPos->rpNum);
lastFixedRegRefPos = nullptr;
if (currentRefPosition->isLocalDefUse)
printf(" LocalDefUse");
if (currentRefPosition->lastUse)
// Print each def on a new line
assert(interval != nullptr);
interval->microDump();
printf("(#%d)", currentRefPosition->rpNum);
if (currentRefPosition->isFixedRegRef)
assert(genMaxOneBit(currentRefPosition->registerAssignment));
printf(" %s", getRegName(currentRefPosition->assignedReg(),
isFloatRegType(interval->registerType)));
if (currentRefPosition->isLocalDefUse)
printf(" LocalDefUse");
if (currentRefPosition->lastUse)
if (interval->relatedInterval != nullptr)
interval->relatedInterval->microDump();
printf("\n        Kill: ");
printf(getRegName(currentRefPosition->assignedReg(),
isFloatRegType(currentRefPosition->getReg()->registerType)));
case RefTypeFixedReg:
lastFixedRegRefPos = currentRefPosition;
printf("Unexpected RefPosition type at #%d\n", currentRefPosition->rpNum);
if (enregisterLocalVars && mode == LSRA_DUMP_POST)
dumpOutVarToRegMap(block);
void LinearScan::dumpLsraAllocationEvent(LsraDumpEvent event,
BasicBlock* currentBlock)
if ((interval != nullptr) && (reg != REG_NA) && (reg != REG_STK))
registersToDump |= genRegMask(reg);
dumpRegRecordTitleIfNeeded();
// Conflicting def/use
case LSRA_EVENT_DEFUSE_CONFLICT:
dumpRefPositionShort(activeRefPosition, currentBlock);
printf("DUconflict ");
case LSRA_EVENT_DEFUSE_CASE1:
printf(indentFormat, "  Case #1 use defRegAssignment");
case LSRA_EVENT_DEFUSE_CASE2:
printf(indentFormat, "  Case #2 use useRegAssignment");
case LSRA_EVENT_DEFUSE_CASE3:
printf(indentFormat, "  Case #3 use useRegAssignment");
case LSRA_EVENT_DEFUSE_CASE4:
printf(indentFormat, "  Case #4 use defRegAssignment");
case LSRA_EVENT_DEFUSE_CASE5:
printf(indentFormat, "  Case #5 set def to all regs");
case LSRA_EVENT_DEFUSE_CASE6:
printf(indentFormat, "  Case #6 need a copy");
case LSRA_EVENT_SPILL:
dumpRefPositionShort(activeRefPosition, currentBlock);
assert(interval != nullptr && interval->assignedReg != nullptr);
printf("Spill %-4s ", getRegName(interval->assignedReg->regNum));
// Restoring the previous register
case LSRA_EVENT_RESTORE_PREVIOUS_INTERVAL_AFTER_SPILL:
assert(interval != nullptr);
dumpRefPositionShort(activeRefPosition, currentBlock);
printf("SRstr %-4s ", getRegName(reg));
case LSRA_EVENT_RESTORE_PREVIOUS_INTERVAL:
assert(interval != nullptr);
if (activeRefPosition == nullptr)
printf(emptyRefPositionFormat, "");
dumpRefPositionShort(activeRefPosition, currentBlock);
printf("Restr %-4s ", getRegName(reg));
if (activeRefPosition != nullptr)
printf(emptyRefPositionFormat, "");
// Done with GC Kills
case LSRA_EVENT_DONE_KILL_GC_REFS:
printf(indentFormat, "  DoneKillGC ");
case LSRA_EVENT_START_BB:
assert(currentBlock != nullptr);
dumpRefPositionShort(activeRefPosition, currentBlock);
// Allocation decisions
case LSRA_EVENT_NEEDS_NEW_REG:
dumpRefPositionShort(activeRefPosition, currentBlock);
printf("Free  %-4s ", getRegName(reg));
case LSRA_EVENT_ZERO_REF:
assert(interval != nullptr && interval->isLocalVar);
dumpRefPositionShort(activeRefPosition, currentBlock);
case LSRA_EVENT_FIXED_REG:
case LSRA_EVENT_EXP_USE:
case LSRA_EVENT_KEPT_ALLOCATION:
dumpRefPositionShort(activeRefPosition, currentBlock);
printf("Keep  %-4s ", getRegName(reg));
case LSRA_EVENT_COPY_REG:
assert(interval != nullptr && interval->recentRefPosition != nullptr);
dumpRefPositionShort(activeRefPosition, currentBlock);
printf("Copy  %-4s ", getRegName(reg));
case LSRA_EVENT_MOVE_REG:
assert(interval != nullptr && interval->recentRefPosition != nullptr);
dumpRefPositionShort(activeRefPosition, currentBlock);
printf("Move  %-4s ", getRegName(reg));
case LSRA_EVENT_ALLOC_REG:
dumpRefPositionShort(activeRefPosition, currentBlock);
printf("Alloc %-4s ", getRegName(reg));
case LSRA_EVENT_REUSE_REG:
dumpRefPositionShort(activeRefPosition, currentBlock);
printf("Reuse %-4s ", getRegName(reg));
case LSRA_EVENT_ALLOC_SPILLED_REG:
dumpRefPositionShort(activeRefPosition, currentBlock);
printf("Steal %-4s ", getRegName(reg));
case LSRA_EVENT_NO_ENTRY_REG_ALLOCATED:
assert(interval != nullptr && interval->isLocalVar);
dumpRefPositionShort(activeRefPosition, currentBlock);
case LSRA_EVENT_NO_REG_ALLOCATED:
dumpRefPositionShort(activeRefPosition, currentBlock);
case LSRA_EVENT_RELOAD:
dumpRefPositionShort(activeRefPosition, currentBlock);
printf("ReLod %-4s ", getRegName(reg));
case LSRA_EVENT_SPECIAL_PUTARG:
dumpRefPositionShort(activeRefPosition, currentBlock);
printf("PtArg %-4s ", getRegName(reg));
// We currently don't dump anything for these events.
case LSRA_EVENT_DEFUSE_FIXED_DELAY_USE:
case LSRA_EVENT_SPILL_EXTENDED_LIFETIME:
case LSRA_EVENT_END_BB:
case LSRA_EVENT_FREE_REGS:
case LSRA_EVENT_INCREMENT_RANGE_END:
case LSRA_EVENT_LAST_USE:
case LSRA_EVENT_LAST_USE_DELAYED:
//------------------------------------------------------------------------
// dumpRegRecordHeader: Dump the header for a column-based dump of the register state.
// Reg names fit in 4 characters (minimum width of the columns)
// In order to make the table as dense as possible (for ease of reading the dumps),
// we determine the minimum regColumnWidth width required to represent:
// regs, by name (e.g. eax or xmm0) - this is fixed at 4 characters.
// intervals, as Vnn for lclVar intervals, or as I<num> for other intervals.
// The table is indented by the amount needed for dumpRefPositionShort, which is
// captured in shortRefPositionDumpWidth.
void LinearScan::dumpRegRecordHeader()
printf("The following table has one or more rows for each RefPosition that is handled during allocation.\n"
"The first column provides the basic information about the RefPosition, with its type (e.g. Def,\n"
"Use, Fixd) followed by a '*' if it is a last use, and a 'D' if it is delayRegFree, and then the\n"
"action taken during allocation (e.g. Alloc a new register, or Keep an existing one).\n"
"The subsequent columns show the Interval occupying each register, if any, followed by 'a' if it is\n"
"active, and 'i' if it is inactive. Columns are only printed up to the last modified register, which\n"
"may increase during allocation, in which case additional columns will appear. Registers which are\n"
"not marked modified have ---- in their column.\n\n");
// First, determine the width of each register column (which holds a reg name in the
// header, and an interval name in each subsequent row).
int intervalNumberWidth = (int)log10((double)intervals.size()) + 1;
// The regColumnWidth includes the identifying character (I or V) and an 'i' or 'a' (inactive or active)
regColumnWidth = intervalNumberWidth + 2;
if (regColumnWidth < 4)
sprintf_s(intervalNameFormat, MAX_FORMAT_CHARS, "%%c%%-%dd", regColumnWidth - 2);
sprintf_s(regNameFormat, MAX_FORMAT_CHARS, "%%-%ds", regColumnWidth);
// Next, determine the width of the short RefPosition (see dumpRefPositionShort()).
// This is in the form:
// nnn.#mmm NAME TYPEld
// nnn is the Location, right-justified to the width needed for the highest location.
// mmm is the RefPosition rpNum, left-justified to the width needed for the highest rpNum.
// NAME is dumped by dumpReferentName(), and is "regColumnWidth".
// TYPE is RefTypeNameShort, and is 4 characters
// l is either '*' (if a last use) or ' ' (otherwise)
// d is either 'D' (if a delayed use) or ' ' (otherwise)
maxNodeLocation = (maxNodeLocation == 0)
: maxNodeLocation; // corner case of a method with an infinite loop without any gentree nodes
assert(maxNodeLocation >= 1);
assert(refPositions.size() >= 1);
int nodeLocationWidth = (int)log10((double)maxNodeLocation) + 1;
int refPositionWidth = (int)log10((double)refPositions.size()) + 1;
int refTypeInfoWidth = 4 /*TYPE*/ + 2 /* last-use and delayed */ + 1 /* space */;
int locationAndRPNumWidth = nodeLocationWidth + 2 /* .# */ + refPositionWidth + 1 /* space */;
int shortRefPositionDumpWidth = locationAndRPNumWidth + regColumnWidth + 1 /* space */ + refTypeInfoWidth;
sprintf_s(shortRefPositionFormat, MAX_FORMAT_CHARS, "%%%dd.#%%-%dd ", nodeLocationWidth, refPositionWidth);
sprintf_s(emptyRefPositionFormat, MAX_FORMAT_CHARS, "%%-%ds", shortRefPositionDumpWidth);
// The width of the "allocation info"
// - a 5-character allocation decision
// - a 4-character register
int allocationInfoWidth = 5 + 1 + 4 + 1;
// Next, determine the width of the legend for each row. This includes:
// - a short RefPosition dump (shortRefPositionDumpWidth), which includes a space
// - the allocation info (allocationInfoWidth), which also includes a space
regTableIndent = shortRefPositionDumpWidth + allocationInfoWidth;
// BBnn printed left-justified in the NAME Typeld and allocationInfo space.
int bbDumpWidth = regColumnWidth + 1 + refTypeInfoWidth + allocationInfoWidth;
int bbNumWidth = (int)log10((double)compiler->fgBBNumMax) + 1;
// In the unlikely event that BB numbers overflow the space, we'll simply omit the predBB
int predBBNumDumpSpace = regTableIndent - locationAndRPNumWidth - bbNumWidth - 9; // 'BB' + ' PredBB'
if (predBBNumDumpSpace < bbNumWidth)
sprintf_s(bbRefPosFormat, MAX_LEGEND_FORMAT_CHARS, "BB%%-%dd", shortRefPositionDumpWidth - 2);
sprintf_s(bbRefPosFormat, MAX_LEGEND_FORMAT_CHARS, "BB%%-%dd PredBB%%-%dd", bbNumWidth, predBBNumDumpSpace);
if (compiler->shouldDumpASCIITrees())
columnSeparator = "|";
columnSeparator = "\xe2\x94\x82";
line = "\xe2\x94\x80";
leftBox = "\xe2\x94\x9c";
middleBox = "\xe2\x94\xbc";
rightBox = "\xe2\x94\xa4";
sprintf_s(indentFormat, MAX_FORMAT_CHARS, "%%-%ds", regTableIndent);
// Now, set up the legend format for the RefPosition info
sprintf_s(legendFormat, MAX_LEGEND_FORMAT_CHARS, "%%-%d.%ds%%-%d.%ds%%-%ds%%s", nodeLocationWidth + 1,
nodeLocationWidth + 1, refPositionWidth + 2, refPositionWidth + 2, regColumnWidth + 1);
// Print a "title row" including the legend and the reg names.
lastDumpedRegisters = RBM_NONE;
dumpRegRecordTitleIfNeeded();
void LinearScan::dumpRegRecordTitleIfNeeded()
if ((lastDumpedRegisters != registersToDump) || (rowCountSinceLastTitle > MAX_ROWS_BETWEEN_TITLES))
lastUsedRegNumIndex = 0;
int lastRegNumIndex = compiler->compFloatingPointUsed ? REG_FP_LAST : REG_INT_LAST;
for (int regNumIndex = 0; regNumIndex <= lastRegNumIndex; regNumIndex++)
if ((registersToDump & genRegMask((regNumber)regNumIndex)) != 0)
lastUsedRegNumIndex = regNumIndex;
dumpRegRecordTitle();
lastDumpedRegisters = registersToDump;
void LinearScan::dumpRegRecordTitleLines()
for (int i = 0; i < regTableIndent; i++)
for (int regNumIndex = 0; regNumIndex <= lastUsedRegNumIndex; regNumIndex++)
regNumber regNum = (regNumber)regNumIndex;
if (shouldDumpReg(regNum))
printf("%s", middleBox);
for (int i = 0; i < regColumnWidth; i++)
printf("%s\n", rightBox);
void LinearScan::dumpRegRecordTitle()
dumpRegRecordTitleLines();
// Print out the legend for the RefPosition info
printf(legendFormat, "Loc ", "RP# ", "Name ", "Type Action Reg  ");
// Print out the register name column headers
char columnFormatArray[MAX_FORMAT_CHARS];
sprintf_s(columnFormatArray, MAX_FORMAT_CHARS, "%s%%-%d.%ds", columnSeparator, regColumnWidth, regColumnWidth);
for (int regNumIndex = 0; regNumIndex <= lastUsedRegNumIndex; regNumIndex++)
regNumber regNum = (regNumber)regNumIndex;
if (shouldDumpReg(regNum))
const char* regName = getRegName(regNum);
printf(columnFormatArray, regName);
printf("%s\n", columnSeparator);
rowCountSinceLastTitle = 0;
dumpRegRecordTitleLines();
void LinearScan::dumpRegRecords()
static char columnFormatArray[18];
for (int regNumIndex = 0; regNumIndex <= lastUsedRegNumIndex; regNumIndex++)
if (shouldDumpReg((regNumber)regNumIndex))
printf("%s", columnSeparator);
RegRecord& regRecord = physRegs[regNumIndex];
Interval* interval = regRecord.assignedInterval;
if (interval != nullptr)
dumpIntervalName(interval);
char activeChar = interval->isActive ? 'a' : 'i';
printf("%c", activeChar);
else if (regRecord.isBusyUntilNextKill)
printf(columnFormatArray, "Busy");
sprintf_s(columnFormatArray, MAX_FORMAT_CHARS, "%%-%ds", regColumnWidth);
printf(columnFormatArray, "");
printf("%s\n", columnSeparator);
rowCountSinceLastTitle++;
void LinearScan::dumpIntervalName(Interval* interval)
if (interval->isLocalVar)
printf(intervalNameFormat, 'V', interval->varNum);
else if (interval->isConstant)
printf(intervalNameFormat, 'C', interval->intervalIndex);
printf(intervalNameFormat, 'I', interval->intervalIndex);
void LinearScan::dumpEmptyRefPosition()
printf(emptyRefPositionFormat, "");
// Note that the size of this dump is computed in dumpRegRecordHeader().
void LinearScan::dumpRefPositionShort(RefPosition* refPosition, BasicBlock* currentBlock)
BasicBlock* block = currentBlock;
static RefPosition* lastPrintedRefPosition = nullptr;
if (refPosition == lastPrintedRefPosition)
dumpEmptyRefPosition();
lastPrintedRefPosition = refPosition;
if (refPosition->refType == RefTypeBB)
// Always print a title row before a RefTypeBB (except for the first, because we
// will already have printed it before the parameters)
if (refPosition->refType == RefTypeBB && block != compiler->fgFirstBB && block != nullptr)
dumpRegRecordTitle();
printf(shortRefPositionFormat, refPosition->nodeLocation, refPosition->rpNum);
if (refPosition->refType == RefTypeBB)
if (block == nullptr)
printf(regNameFormat, "END");
printf(regNameFormat, "");
printf(bbRefPosFormat, block->bbNum, block == compiler->fgFirstBB ? 0 : blockInfo[block->bbNum].predBBNum);
else if (refPosition->isIntervalRef())
Interval* interval = refPosition->getInterval();
dumpIntervalName(interval);
char lastUseChar = ' ';
char delayChar = ' ';
if (refPosition->lastUse)
if (refPosition->delayRegFree)
printf(" %s%c%c ", getRefTypeShortName(refPosition->refType), lastUseChar, delayChar);
else if (refPosition->isPhysRegRef)
RegRecord* regRecord = refPosition->getReg();
printf(regNameFormat, getRegName(regRecord->regNum));
printf(" %s   ", getRefTypeShortName(refPosition->refType));
assert(refPosition->refType == RefTypeKillGCRefs);
// There's no interval or reg name associated with this.
printf(regNameFormat, "   ");
printf(" %s   ", getRefTypeShortName(refPosition->refType));
//------------------------------------------------------------------------
// LinearScan::IsResolutionMove:
// Returns true if the given node is a move inserted by LSRA
// node - the node to check.
bool LinearScan::IsResolutionMove(GenTree* node)
if (!IsLsraAdded(node))
switch (node->OperGet())
return node->IsUnusedValue();
//------------------------------------------------------------------------
// LinearScan::IsResolutionNode:
// Returns true if the given node is either a move inserted by LSRA
// resolution or an operand to such a move.
// containingRange - the range that contains the node to check.
// node - the node to check.
bool LinearScan::IsResolutionNode(LIR::Range& containingRange, GenTree* node)
if (IsResolutionMove(node))
if (!IsLsraAdded(node) || (node->OperGet() != GT_LCL_VAR))
bool foundUse = containingRange.TryGetUse(node, &use);
//------------------------------------------------------------------------
// verifyFinalAllocation: Traverse the RefPositions and verify various invariants.
// If verbose is set, this will also dump a table of the final allocations.
void LinearScan::verifyFinalAllocation()
printf("\nFinal allocation\n");
// Clear register assignments.
for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
RegRecord* physRegRecord = getRegisterRecord(reg);
physRegRecord->assignedInterval = nullptr;
for (Interval& interval : intervals)
interval.assignedReg = nullptr;
interval.physReg = REG_NA;
DBEXEC(VERBOSE, dumpRegRecordTitle());
BasicBlock* currentBlock = nullptr;
GenTree* firstBlockEndResolutionNode = nullptr;
regMaskTP regsToFree = RBM_NONE;
regMaskTP delayRegsToFree = RBM_NONE;
LsraLocation currentLocation = MinLocation;
for (RefPosition& refPosition : refPositions)
RefPosition* currentRefPosition = &refPosition;
Interval* interval = nullptr;
RegRecord* regRecord = nullptr;
regNumber regNum = REG_NA;
activeRefPosition = currentRefPosition;
if (currentRefPosition->refType == RefTypeBB)
regsToFree |= delayRegsToFree;
delayRegsToFree = RBM_NONE;
if (currentRefPosition->isPhysRegRef)
regRecord = currentRefPosition->getReg();
regRecord->recentRefPosition = currentRefPosition;
regNum = regRecord->regNum;
else if (currentRefPosition->isIntervalRef())
interval = currentRefPosition->getInterval();
interval->recentRefPosition = currentRefPosition;
if (currentRefPosition->registerAssignment != RBM_NONE)
if (!genMaxOneBit(currentRefPosition->registerAssignment))
assert(currentRefPosition->refType == RefTypeExpUse ||
currentRefPosition->refType == RefTypeDummyDef);
regNum = currentRefPosition->assignedReg();
regRecord = getRegisterRecord(regNum);
LsraLocation newLocation = currentRefPosition->nodeLocation;
if (newLocation > currentLocation)
// We could use the freeRegisters() method, but we'd have to carefully manage the active intervals.
for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
regMaskTP regMask = genRegMask(reg);
if ((regsToFree & regMask) != RBM_NONE)
RegRecord* physRegRecord = getRegisterRecord(reg);
physRegRecord->assignedInterval = nullptr;
regsToFree = delayRegsToFree;
delayRegsToFree = RBM_NONE;
currentLocation = newLocation;
switch (currentRefPosition->refType)
if (currentBlock == nullptr)
currentBlock = startBlockSequence();
// Verify the resolution moves at the end of the previous block.
for (GenTree* node = firstBlockEndResolutionNode; node != nullptr; node = node->gtNext)
assert(enregisterLocalVars);
// Only verify nodes that are actually moves; don't bother with the nodes that are
// operands to moves.
if (IsResolutionMove(node))
verifyResolutionMove(node, currentLocation);
// Validate the locations at the end of the previous block.
if (enregisterLocalVars)
VarToRegMap outVarToRegMap = outVarToRegMaps[currentBlock->bbNum];
VarSetOps::Iter iter(compiler, currentBlock->bbLiveOut);
unsigned varIndex = 0;
while (iter.NextElem(&varIndex))
if (localVarIntervals[varIndex] == nullptr)
assert(!compiler->lvaTable[compiler->lvaTrackedToVarNum[varIndex]].lvLRACandidate);
regNumber regNum = getVarReg(outVarToRegMap, varIndex);
interval = getIntervalForLocalVar(varIndex);
assert(interval->physReg == regNum || (interval->physReg == REG_NA && regNum == REG_STK));
interval->physReg = REG_NA;
interval->assignedReg = nullptr;
interval->isActive = false;
// Clear register assignments.
for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
RegRecord* physRegRecord = getRegisterRecord(reg);
physRegRecord->assignedInterval = nullptr;
// Now, record the locations at the beginning of this block.
currentBlock = moveToNextBlock();
if (currentBlock != nullptr)
if (enregisterLocalVars)
VarToRegMap inVarToRegMap = inVarToRegMaps[currentBlock->bbNum];
VarSetOps::Iter iter(compiler, currentBlock->bbLiveIn);
unsigned varIndex = 0;
while (iter.NextElem(&varIndex))
if (localVarIntervals[varIndex] == nullptr)
assert(!compiler->lvaTable[compiler->lvaTrackedToVarNum[varIndex]].lvLRACandidate);
regNumber regNum = getVarReg(inVarToRegMap, varIndex);
interval = getIntervalForLocalVar(varIndex);
interval->physReg = regNum;
interval->assignedReg = &(physRegs[regNum]);
interval->isActive = true;
physRegs[regNum].assignedInterval = interval;
dumpRefPositionShort(currentRefPosition, currentBlock);
// Finally, handle the resolution moves, if any, at the beginning of the next block.
firstBlockEndResolutionNode = nullptr;
bool foundNonResolutionNode = false;
LIR::Range& currentBlockRange = LIR::AsRange(currentBlock);
for (GenTree* node : currentBlockRange.NonPhiNodes())
if (IsResolutionNode(currentBlockRange, node))
assert(enregisterLocalVars);
if (foundNonResolutionNode)
firstBlockEndResolutionNode = node;
else if (IsResolutionMove(node))
// Only verify nodes that are actually moves; don't bother with the nodes that are
// operands to moves.
verifyResolutionMove(node, currentLocation);
foundNonResolutionNode = true;
assert(regRecord != nullptr);
assert(regRecord->assignedInterval == nullptr);
dumpLsraAllocationEvent(LSRA_EVENT_KEPT_ALLOCATION, nullptr, regRecord->regNum, currentBlock);
case RefTypeFixedReg:
assert(regRecord != nullptr);
dumpLsraAllocationEvent(LSRA_EVENT_KEPT_ALLOCATION, nullptr, regRecord->regNum, currentBlock);
case RefTypeUpperVectorSaveDef:
case RefTypeUpperVectorSaveUse:
case RefTypeParamDef:
case RefTypeZeroInit:
assert(interval != nullptr);
if (interval->isSpecialPutArg)
dumpLsraAllocationEvent(LSRA_EVENT_SPECIAL_PUTARG, interval, regNum);
if (currentRefPosition->reload)
interval->isActive = true;
assert(regNum != REG_NA);
interval->physReg = regNum;
interval->assignedReg = regRecord;
regRecord->assignedInterval = interval;
dumpLsraAllocationEvent(LSRA_EVENT_RELOAD, nullptr, regRecord->regNum, currentBlock);
if (regNum == REG_NA)
dumpLsraAllocationEvent(LSRA_EVENT_NO_REG_ALLOCATED, interval);
else if (RefTypeIsDef(currentRefPosition->refType))
interval->isActive = true;
if (interval->isConstant && (currentRefPosition->treeNode != nullptr) &&
currentRefPosition->treeNode->IsReuseRegVal())
dumpLsraAllocationEvent(LSRA_EVENT_REUSE_REG, nullptr, regRecord->regNum, currentBlock);
dumpLsraAllocationEvent(LSRA_EVENT_ALLOC_REG, nullptr, regRecord->regNum, currentBlock);
else if (currentRefPosition->copyReg)
dumpLsraAllocationEvent(LSRA_EVENT_COPY_REG, interval, regRecord->regNum, currentBlock);
else if (currentRefPosition->moveReg)
assert(interval->assignedReg != nullptr);
interval->assignedReg->assignedInterval = nullptr;
interval->physReg = regNum;
interval->assignedReg = regRecord;
regRecord->assignedInterval = interval;
printf("Move  %-4s ", getRegName(regRecord->regNum));
dumpLsraAllocationEvent(LSRA_EVENT_KEPT_ALLOCATION, nullptr, regRecord->regNum, currentBlock);
if (currentRefPosition->lastUse || currentRefPosition->spillAfter)
interval->isActive = false;
if (regNum != REG_NA)
if (currentRefPosition->spillAfter)
// If refPos is marked as copyReg, then the reg that is spilled
// is the homeReg of the interval, not the reg currently assigned.
regNumber spillReg = regNum;
if (currentRefPosition->copyReg)
assert(interval != nullptr);
spillReg = interval->physReg;
dumpEmptyRefPosition();
printf("Spill %-4s ", getRegName(spillReg));
else if (currentRefPosition->copyReg)
regRecord->assignedInterval = interval;
interval->physReg = regNum;
interval->assignedReg = regRecord;
regRecord->assignedInterval = interval;
case RefTypeKillGCRefs:
// No action to take.
// However, we will assert that, at resolution time, no registers contain GC refs.
DBEXEC(VERBOSE, printf("           "));
regMaskTP candidateRegs = currentRefPosition->registerAssignment;
while (candidateRegs != RBM_NONE)
regMaskTP nextRegBit = genFindLowestBit(candidateRegs);
candidateRegs &= ~nextRegBit;
regNumber nextReg = genRegNumFromMask(nextRegBit);
RegRecord* regRecord = getRegisterRecord(nextReg);
Interval* assignedInterval = regRecord->assignedInterval;
assert(assignedInterval == nullptr || !varTypeIsGC(assignedInterval->registerType));
case RefTypeExpUse:
case RefTypeDummyDef:
// Do nothing; these will be handled by the RefTypeBB.
DBEXEC(VERBOSE, printf("           "));
case RefTypeInvalid:
// For these 'currentRefPosition->refType' values, no action is needed.
if (currentRefPosition->refType != RefTypeBB)
DBEXEC(VERBOSE, dumpRegRecords());
if (interval != nullptr)
if (currentRefPosition->copyReg)
assert(interval->physReg != regNum);
regRecord->assignedInterval = nullptr;
assert(interval->assignedReg != nullptr);
regRecord = interval->assignedReg;
if (currentRefPosition->spillAfter || currentRefPosition->lastUse)
interval->physReg = REG_NA;
interval->assignedReg = nullptr;
// regRecord could be null if the RefPosition does not require a register.
if (regRecord != nullptr)
regRecord->assignedInterval = nullptr;
assert(!currentRefPosition->RequiresRegister());
// Now, verify the resolution blocks.
// Currently these are nearly always at the end of the method, but that may not always be the case.
// So, we'll go through all the BBs looking for blocks whose bbNum is greater than bbNumMaxBeforeResolution.
for (BasicBlock* currentBlock = compiler->fgFirstBB; currentBlock != nullptr; currentBlock = currentBlock->bbNext)
if (currentBlock->bbNum > bbNumMaxBeforeResolution)
// If we haven't enregistered any lclVars, we have no resolution blocks.
assert(enregisterLocalVars);
dumpRegRecordTitle();
printf(shortRefPositionFormat, 0, 0);
assert(currentBlock->bbPreds != nullptr && currentBlock->bbPreds->flBlock != nullptr);
printf(bbRefPosFormat, currentBlock->bbNum, currentBlock->bbPreds->flBlock->bbNum);
// Clear register assignments.
for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
RegRecord* physRegRecord = getRegisterRecord(reg);
physRegRecord->assignedInterval = nullptr;
// Set the incoming register assignments
VarToRegMap inVarToRegMap = getInVarToRegMap(currentBlock->bbNum);
VarSetOps::Iter iter(compiler, currentBlock->bbLiveIn);
unsigned varIndex = 0;
while (iter.NextElem(&varIndex))
if (localVarIntervals[varIndex] == nullptr)
assert(!compiler->lvaTable[compiler->lvaTrackedToVarNum[varIndex]].lvLRACandidate);
regNumber regNum = getVarReg(inVarToRegMap, varIndex);
Interval* interval = getIntervalForLocalVar(varIndex);
interval->physReg = regNum;
interval->assignedReg = &(physRegs[regNum]);
interval->isActive = true;
physRegs[regNum].assignedInterval = interval;
// Verify the moves in this block
LIR::Range& currentBlockRange = LIR::AsRange(currentBlock);
for (GenTree* node : currentBlockRange.NonPhiNodes())
assert(IsResolutionNode(currentBlockRange, node));
if (IsResolutionMove(node))
// Only verify nodes that are actually moves; don't bother with the nodes that are
// operands to moves.
verifyResolutionMove(node, currentLocation);
// Verify the outgoing register assignments
VarToRegMap outVarToRegMap = getOutVarToRegMap(currentBlock->bbNum);
VarSetOps::Iter iter(compiler, currentBlock->bbLiveOut);
unsigned varIndex = 0;
while (iter.NextElem(&varIndex))
if (localVarIntervals[varIndex] == nullptr)
assert(!compiler->lvaTable[compiler->lvaTrackedToVarNum[varIndex]].lvLRACandidate);
regNumber regNum = getVarReg(outVarToRegMap, varIndex);
Interval* interval = getIntervalForLocalVar(varIndex);
assert(interval->physReg == regNum || (interval->physReg == REG_NA && regNum == REG_STK));
interval->physReg = REG_NA;
interval->assignedReg = nullptr;
interval->isActive = false;
DBEXEC(VERBOSE, printf("\n"));
//------------------------------------------------------------------------
// verifyResolutionMove: Verify a resolution statement. Called by verifyFinalAllocation()
// resolutionMove - A GenTree* that must be a resolution move.
// currentLocation - The LsraLocation of the most recent RefPosition that has been verified.
// If verbose is set, this will also dump the moves into the table of final allocations.
void LinearScan::verifyResolutionMove(GenTree* resolutionMove, LsraLocation currentLocation)
GenTree* dst = resolutionMove;
assert(IsResolutionMove(dst));
if (dst->OperGet() == GT_SWAP)
GenTreeLclVarCommon* left = dst->gtGetOp1()->AsLclVarCommon();
GenTreeLclVarCommon* right = dst->gtGetOp2()->AsLclVarCommon();
regNumber leftRegNum = left->gtRegNum;
regNumber rightRegNum = right->gtRegNum;
LclVarDsc* leftVarDsc = compiler->lvaTable + left->gtLclNum;
LclVarDsc* rightVarDsc = compiler->lvaTable + right->gtLclNum;
Interval* leftInterval = getIntervalForLocalVar(leftVarDsc->lvVarIndex);
Interval* rightInterval = getIntervalForLocalVar(rightVarDsc->lvVarIndex);
assert(leftInterval->physReg == leftRegNum && rightInterval->physReg == rightRegNum);
leftInterval->physReg = rightRegNum;
rightInterval->physReg = leftRegNum;
leftInterval->assignedReg = &physRegs[rightRegNum];
rightInterval->assignedReg = &physRegs[leftRegNum];
physRegs[rightRegNum].assignedInterval = leftInterval;
physRegs[leftRegNum].assignedInterval = rightInterval;
printf(shortRefPositionFormat, currentLocation, 0);
dumpIntervalName(leftInterval);
printf("  %-4s ", getRegName(rightRegNum));
printf(shortRefPositionFormat, currentLocation, 0);
dumpIntervalName(rightInterval);
printf("  %-4s ", getRegName(leftRegNum));
regNumber dstRegNum = dst->gtRegNum;
regNumber srcRegNum;
GenTreeLclVarCommon* lcl;
if (dst->OperGet() == GT_COPY)
lcl = dst->gtGetOp1()->AsLclVarCommon();
srcRegNum = lcl->gtRegNum;
lcl = dst->AsLclVarCommon();
if ((lcl->gtFlags & GTF_SPILLED) != 0)
srcRegNum = REG_STK;
assert((lcl->gtFlags & GTF_SPILL) != 0);
10436 srcRegNum = dstRegNum;
10437 dstRegNum = REG_STK;
10441 Interval* interval = getIntervalForLocalVarNode(lcl);
10442 assert(interval->physReg == srcRegNum || (srcRegNum == REG_STK && interval->physReg == REG_NA));
10443 if (srcRegNum != REG_STK)
10445 physRegs[srcRegNum].assignedInterval = nullptr;
10447 if (dstRegNum != REG_STK)
10449 interval->physReg = dstRegNum;
10450 interval->assignedReg = &(physRegs[dstRegNum]);
10451 physRegs[dstRegNum].assignedInterval = interval;
10452 interval->isActive = true;
10456 interval->physReg = REG_NA;
10457 interval->assignedReg = nullptr;
10458 interval->isActive = false;
10462 printf(shortRefPositionFormat, currentLocation, 0);
10463 dumpIntervalName(interval);
10465 printf(" %-4s ", getRegName(dstRegNum));
10471 #endif // !LEGACY_BACKEND