1 // Licensed to the .NET Foundation under one or more agreements.
2 // The .NET Foundation licenses this file to you under the MIT license.
3 // See the LICENSE file in the project root for more information.
6 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
7 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
9 Linear Scan Register Allocation
14 - All register requirements are expressed in the code stream, either as destination
15 registers of tree nodes, or as internal registers. These requirements are
16 expressed in the TreeNodeInfo computed for each node, which includes:
17 - The number of register sources and destinations.
18 - The register restrictions (candidates) of the target register, both from itself,
19 as producer of the value (dstCandidates), and from its consuming node (srcCandidates).
20 Note that the srcCandidates field of TreeNodeInfo refers to the destination register
21 (not any of its sources).
22 - The number (internalCount) of registers required, and their register restrictions (internalCandidates).
23 These are neither inputs nor outputs of the node, but used in the sequence of code generated for the tree.
24 "Internal registers" are registers used during the code sequence generated for the node.
25 The register lifetimes must obey the following lifetime model:
26 - First, any internal registers are defined.
27 - Next, any source registers are used (and are then freed if they are last use and are not identified as
29 - Next, the internal registers are used (and are then freed).
30 - Next, any registers in the kill set for the instruction are killed.
31 - Next, the destination register(s) are defined (multiple destination registers are only supported on ARM)
32 - Finally, any "delayRegFree" source registers are freed.
33 There are several things to note about this order:
34 - The internal registers will never overlap any use, but they may overlap a destination register.
35 - Internal registers are never live beyond the node.
36 - The "delayRegFree" annotation is used for instructions that are only available in a Read-Modify-Write form.
37 That is, the destination register is one of the sources. In this case, we must not use the same register for
38 the non-RMW operand as for the destination.
40 Overview (doLinearScan):
41 - Walk all blocks, building intervals and RefPositions (buildIntervals)
42 - Allocate registers (allocateRegisters)
43 - Annotate nodes with register assignments (resolveRegisters)
44 - Add move nodes as needed to resolve conflicting register
45 assignments across non-adjacent edges. (resolveEdges, called from resolveRegisters)
50 - GenTree::gtRegNum (and gtRegPair for ARM) is annotated with the register
51 assignment for a node. If the node does not require a register, it is
52 annotated as such (for single registers, gtRegNum = REG_NA; for register
53 pair type, gtRegPair = REG_PAIR_NONE). For a variable definition or interior
54 tree node (an "implicit" definition), this is the register to put the result.
55 For an expression use, this is the place to find the value that has previously
57 - In most cases, this register must satisfy the constraints specified by the TreeNodeInfo.
58 - In some cases, this is difficult:
59 - If a lclVar node currently lives in some register, it may not be desirable to move it
60 (i.e. its current location may be desirable for future uses, e.g. if it's a callee save register,
61 but needs to be in a specific arg register for a call).
62 - In other cases there may be conflicts on the restrictions placed by the defining node and the node which
64 - If such a node is constrained to a single fixed register (e.g. an arg register, or a return from a call),
65 then LSRA is free to annotate the node with a different register. The code generator must issue the appropriate
67 - However, if such a node is constrained to a set of registers, and its current location does not satisfy that
68 requirement, LSRA must insert a GT_COPY node between the node and its parent. The gtRegNum on the GT_COPY node
69 must satisfy the register requirement of the parent.
70 - GenTree::gtRsvdRegs has a set of registers used for internal temps.
71 - A tree node is marked GTF_SPILL if the tree node must be spilled by the code generator after it has been
73 - LSRA currently does not set GTF_SPILLED on such nodes, because it caused problems in the old code generator.
74 In the new backend perhaps this should change (see also the note below under CodeGen).
75 - A tree node is marked GTF_SPILLED if it is a lclVar that must be reloaded prior to use.
76 - The register (gtRegNum) on the node indicates the register to which it must be reloaded.
77 - For lclVar nodes, since the uses and defs are distinct tree nodes, it is always possible to annotate the node
78 with the register to which the variable must be reloaded.
79 - For other nodes, since they represent both the def and use, if the value must be reloaded to a different
80 register, LSRA must insert a GT_RELOAD node in order to specify the register to which it should be reloaded.
82 Local variable table (LclVarDsc):
83 - LclVarDsc::lvRegister is set to true if a local variable has the
84 same register assignment for its entire lifetime.
85 - LclVarDsc::lvRegNum / lvOtherReg: these are initialized to their
86 first value at the end of LSRA (it looks like lvOtherReg isn't?
87 This is probably a bug (ARM)). Codegen will set them to their current value
88 as it processes the trees, since a variable can (now) be assigned different
89 registers over its lifetimes.
91 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
92 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
100 #ifndef LEGACY_BACKEND // This file is ONLY used for the RyuJIT backend that uses the linear scan register allocator
105 const char* LinearScan::resolveTypeName[] = {"Split", "Join", "Critical", "SharedCritical"};
108 /*XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
109 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
111 XX Small Helper functions XX
114 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
115 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
118 //--------------------------------------------------------------
119 // lsraAssignRegToTree: Assign the given reg to tree node.
122 // tree - Gentree node
123 // reg - register to be assigned
124 // regIdx - register idx, if tree is a multi-reg call node.
125 // regIdx will be zero for single-reg result producing tree nodes.
130 void lsraAssignRegToTree(GenTree* tree, regNumber reg, unsigned regIdx)
134 tree->gtRegNum = reg;
136 #if defined(_TARGET_ARM_)
137 else if (tree->OperIsMultiRegOp())
140 GenTreeMultiRegOp* mul = tree->AsMultiRegOp();
141 mul->gtOtherReg = reg;
143 else if (tree->OperGet() == GT_COPY)
146 GenTreeCopyOrReload* copy = tree->AsCopyOrReload();
147 copy->gtOtherRegs[0] = (regNumberSmall)reg;
149 else if (tree->OperIsPutArgSplit())
151 GenTreePutArgSplit* putArg = tree->AsPutArgSplit();
152 putArg->SetRegNumByIdx(reg, regIdx);
154 #endif // _TARGET_ARM_
157 assert(tree->IsMultiRegCall());
158 GenTreeCall* call = tree->AsCall();
159 call->SetRegNumByIdx(reg, regIdx);
163 //-------------------------------------------------------------
164 // getWeight: Returns the weight of the RefPosition.
167 // refPos - ref position
170 // Weight of ref position.
171 unsigned LinearScan::getWeight(RefPosition* refPos)
174 GenTree* treeNode = refPos->treeNode;
176 if (treeNode != nullptr)
178 if (isCandidateLocalRef(treeNode))
180 // Tracked locals: use weighted ref cnt as the weight of the
182 GenTreeLclVarCommon* lclCommon = treeNode->AsLclVarCommon();
183 LclVarDsc* varDsc = &(compiler->lvaTable[lclCommon->gtLclNum]);
184 weight = varDsc->lvRefCntWtd;
185 if (refPos->getInterval()->isSpilled)
187 // Decrease the weight if the interval has already been spilled.
188 weight -= BB_UNITY_WEIGHT;
193 // Non-candidate local ref or non-lcl tree node.
194 // These are considered to have two references in the basic block:
195 // a def and a use and hence weighted ref count would be 2 times
196 // the basic block weight in which they appear.
197 // However, it is generally more harmful to spill tree temps, so we
199 const unsigned TREE_TEMP_REF_COUNT = 2;
200 const unsigned TREE_TEMP_BOOST_FACTOR = 2;
201 weight = TREE_TEMP_REF_COUNT * TREE_TEMP_BOOST_FACTOR * blockInfo[refPos->bbNum].weight;
206 // Non-tree node ref positions. These will have a single
207 // reference in the basic block and hence their weighted
208 // refcount is equal to the block weight in which they
210 weight = blockInfo[refPos->bbNum].weight;
216 // allRegs represents a set of registers that can
217 // be used to allocate the specified type in any point
218 // in time (more of a 'bank' of registers).
219 regMaskTP LinearScan::allRegs(RegisterType rt)
223 return availableFloatRegs;
225 else if (rt == TYP_DOUBLE)
227 return availableDoubleRegs;
229 // TODO-Cleanup: Add an RBM_ALLSIMD
231 else if (varTypeIsSIMD(rt))
233 return availableDoubleRegs;
234 #endif // FEATURE_SIMD
238 return availableIntRegs;
242 //--------------------------------------------------------------------------
243 // allMultiRegCallNodeRegs: represents a set of registers that can be used
244 // to allocate a multi-reg call node.
247 // call - Multi-reg call node
250 // Mask representing the set of available registers for multi-reg call
254 // Multi-reg call node available regs = Bitwise-OR(allregs(GetReturnRegType(i)))
255 // for all i=0..RetRegCount-1.
256 regMaskTP LinearScan::allMultiRegCallNodeRegs(GenTreeCall* call)
258 assert(call->HasMultiRegRetVal());
260 ReturnTypeDesc* retTypeDesc = call->GetReturnTypeDesc();
261 regMaskTP resultMask = allRegs(retTypeDesc->GetReturnRegType(0));
263 unsigned count = retTypeDesc->GetReturnRegCount();
264 for (unsigned i = 1; i < count; ++i)
266 resultMask |= allRegs(retTypeDesc->GetReturnRegType(i));
272 //--------------------------------------------------------------------------
273 // allRegs: returns the set of registers that can accomodate the type of
277 // tree - GenTree node
280 // Mask representing the set of available registers for given tree
282 // Note: In case of multi-reg call node, the full set of registers must be
283 // determined by looking at types of individual return register types.
284 // In this case, the registers may include registers from different register
285 // sets and will not be limited to the actual ABI return registers.
286 regMaskTP LinearScan::allRegs(GenTree* tree)
288 regMaskTP resultMask;
290 // In case of multi-reg calls, allRegs is defined as
291 // Bitwise-Or(allRegs(GetReturnRegType(i)) for i=0..ReturnRegCount-1
292 if (tree->IsMultiRegCall())
294 resultMask = allMultiRegCallNodeRegs(tree->AsCall());
298 resultMask = allRegs(tree->TypeGet());
304 regMaskTP LinearScan::allSIMDRegs()
306 return availableFloatRegs;
309 //------------------------------------------------------------------------
310 // internalFloatRegCandidates: Return the set of registers that are appropriate
311 // for use as internal float registers.
314 // The set of registers (as a regMaskTP).
317 // compFloatingPointUsed is only required to be set if it is possible that we
318 // will use floating point callee-save registers.
319 // It is unlikely, if an internal register is the only use of floating point,
320 // that it will select a callee-save register. But to be safe, we restrict
321 // the set of candidates if compFloatingPointUsed is not already set.
323 regMaskTP LinearScan::internalFloatRegCandidates()
325 if (compiler->compFloatingPointUsed)
327 return allRegs(TYP_FLOAT);
331 return RBM_FLT_CALLEE_TRASH;
335 /*****************************************************************************
336 * Inline functions for RegRecord
337 *****************************************************************************/
339 bool RegRecord::isFree()
341 return ((assignedInterval == nullptr || !assignedInterval->isActive) && !isBusyUntilNextKill);
344 /*****************************************************************************
345 * Inline functions for LinearScan
346 *****************************************************************************/
347 RegRecord* LinearScan::getRegisterRecord(regNumber regNum)
349 assert((unsigned)regNum < ArrLen(physRegs));
350 return &physRegs[regNum];
355 //----------------------------------------------------------------------------
356 // getConstrainedRegMask: Returns new regMask which is the intersection of
357 // regMaskActual and regMaskConstraint if the new regMask has at least
358 // minRegCount registers, otherwise returns regMaskActual.
361 // regMaskActual - regMask that needs to be constrained
362 // regMaskConstraint - regMask constraint that needs to be
363 // applied to regMaskActual
364 // minRegCount - Minimum number of regs that should be
365 // be present in new regMask.
368 // New regMask that has minRegCount registers after instersection.
369 // Otherwise returns regMaskActual.
370 regMaskTP LinearScan::getConstrainedRegMask(regMaskTP regMaskActual, regMaskTP regMaskConstraint, unsigned minRegCount)
372 regMaskTP newMask = regMaskActual & regMaskConstraint;
373 if (genCountBits(newMask) >= minRegCount)
378 return regMaskActual;
381 //------------------------------------------------------------------------
382 // stressLimitRegs: Given a set of registers, expressed as a register mask, reduce
383 // them based on the current stress options.
386 // mask - The current mask of register candidates for a node
389 // A possibly-modified mask, based on the value of COMPlus_JitStressRegs.
392 // This is the method used to implement the stress options that limit
393 // the set of registers considered for allocation.
395 regMaskTP LinearScan::stressLimitRegs(RefPosition* refPosition, regMaskTP mask)
397 if (getStressLimitRegs() != LSRA_LIMIT_NONE)
399 // The refPosition could be null, for example when called
400 // by getTempRegForResolution().
401 int minRegCount = (refPosition != nullptr) ? refPosition->minRegCandidateCount : 1;
403 switch (getStressLimitRegs())
405 case LSRA_LIMIT_CALLEE:
406 if (!compiler->opts.compDbgEnC)
408 mask = getConstrainedRegMask(mask, RBM_CALLEE_SAVED, minRegCount);
412 case LSRA_LIMIT_CALLER:
414 mask = getConstrainedRegMask(mask, RBM_CALLEE_TRASH, minRegCount);
418 case LSRA_LIMIT_SMALL_SET:
419 if ((mask & LsraLimitSmallIntSet) != RBM_NONE)
421 mask = getConstrainedRegMask(mask, LsraLimitSmallIntSet, minRegCount);
423 else if ((mask & LsraLimitSmallFPSet) != RBM_NONE)
425 mask = getConstrainedRegMask(mask, LsraLimitSmallFPSet, minRegCount);
433 if (refPosition != nullptr && refPosition->isFixedRegRef)
435 mask |= refPosition->registerAssignment;
443 //------------------------------------------------------------------------
444 // conflictingFixedRegReference: Determine whether the current RegRecord has a
445 // fixed register use that conflicts with 'refPosition'
448 // refPosition - The RefPosition of interest
451 // Returns true iff the given RefPosition is NOT a fixed use of this register,
453 // - there is a RefPosition on this RegRecord at the nodeLocation of the given RefPosition, or
454 // - the given RefPosition has a delayRegFree, and there is a RefPosition on this RegRecord at
455 // the nodeLocation just past the given RefPosition.
458 // 'refPosition is non-null.
460 bool RegRecord::conflictingFixedRegReference(RefPosition* refPosition)
462 // Is this a fixed reference of this register? If so, there is no conflict.
463 if (refPosition->isFixedRefOfRegMask(genRegMask(regNum)))
467 // Otherwise, check for conflicts.
468 // There is a conflict if:
469 // 1. There is a recent RefPosition on this RegRecord that is at this location,
470 // except in the case where it is a special "putarg" that is associated with this interval, OR
471 // 2. There is an upcoming RefPosition at this location, or at the next location
472 // if refPosition is a delayed use (i.e. must be kept live through the next/def location).
474 LsraLocation refLocation = refPosition->nodeLocation;
475 if (recentRefPosition != nullptr && recentRefPosition->refType != RefTypeKill &&
476 recentRefPosition->nodeLocation == refLocation &&
477 (!isBusyUntilNextKill || assignedInterval != refPosition->getInterval()))
481 LsraLocation nextPhysRefLocation = getNextRefLocation();
482 if (nextPhysRefLocation == refLocation || (refPosition->delayRegFree && nextPhysRefLocation == (refLocation + 1)))
489 /*****************************************************************************
490 * Inline functions for Interval
491 *****************************************************************************/
492 RefPosition* Referenceable::getNextRefPosition()
494 if (recentRefPosition == nullptr)
496 return firstRefPosition;
500 return recentRefPosition->nextRefPosition;
504 LsraLocation Referenceable::getNextRefLocation()
506 RefPosition* nextRefPosition = getNextRefPosition();
507 if (nextRefPosition == nullptr)
513 return nextRefPosition->nodeLocation;
517 // Iterate through all the registers of the given type
518 class RegisterIterator
520 friend class Registers;
523 RegisterIterator(RegisterType type) : regType(type)
525 if (useFloatReg(regType))
527 currentRegNum = REG_FP_FIRST;
531 currentRegNum = REG_INT_FIRST;
536 static RegisterIterator Begin(RegisterType regType)
538 return RegisterIterator(regType);
540 static RegisterIterator End(RegisterType regType)
542 RegisterIterator endIter = RegisterIterator(regType);
543 // This assumes only integer and floating point register types
544 // if we target a processor with additional register types,
545 // this would have to change
546 if (useFloatReg(regType))
548 // This just happens to work for both double & float
549 endIter.currentRegNum = REG_NEXT(REG_FP_LAST);
553 endIter.currentRegNum = REG_NEXT(REG_INT_LAST);
559 void operator++(int dummy) // int dummy is c++ for "this is postfix ++"
561 currentRegNum = REG_NEXT(currentRegNum);
563 if (regType == TYP_DOUBLE)
564 currentRegNum = REG_NEXT(currentRegNum);
567 void operator++() // prefix operator++
569 currentRegNum = REG_NEXT(currentRegNum);
571 if (regType == TYP_DOUBLE)
572 currentRegNum = REG_NEXT(currentRegNum);
575 regNumber operator*()
577 return currentRegNum;
579 bool operator!=(const RegisterIterator& other)
581 return other.currentRegNum != currentRegNum;
585 regNumber currentRegNum;
586 RegisterType regType;
592 friend class RegisterIterator;
594 Registers(RegisterType t)
598 RegisterIterator begin()
600 return RegisterIterator::Begin(type);
602 RegisterIterator end()
604 return RegisterIterator::End(type);
609 void LinearScan::dumpVarToRegMap(VarToRegMap map)
611 bool anyPrinted = false;
612 for (unsigned varIndex = 0; varIndex < compiler->lvaTrackedCount; varIndex++)
614 unsigned varNum = compiler->lvaTrackedToVarNum[varIndex];
615 if (map[varIndex] != REG_STK)
617 printf("V%02u=%s ", varNum, getRegName(map[varIndex]));
628 void LinearScan::dumpInVarToRegMap(BasicBlock* block)
630 printf("Var=Reg beg of BB%02u: ", block->bbNum);
631 VarToRegMap map = getInVarToRegMap(block->bbNum);
632 dumpVarToRegMap(map);
635 void LinearScan::dumpOutVarToRegMap(BasicBlock* block)
637 printf("Var=Reg end of BB%02u: ", block->bbNum);
638 VarToRegMap map = getOutVarToRegMap(block->bbNum);
639 dumpVarToRegMap(map);
644 LinearScanInterface* getLinearScanAllocator(Compiler* comp)
646 return new (comp, CMK_LSRA) LinearScan(comp);
649 //------------------------------------------------------------------------
656 // The constructor takes care of initializing the data structures that are used
657 // during Lowering, including (in DEBUG) getting the stress environment variables,
658 // as they may affect the block ordering.
660 LinearScan::LinearScan(Compiler* theCompiler)
661 : compiler(theCompiler)
662 #if MEASURE_MEM_ALLOC
663 , lsraAllocator(nullptr)
664 #endif // MEASURE_MEM_ALLOC
665 , intervals(LinearScanMemoryAllocatorInterval(theCompiler))
666 , refPositions(LinearScanMemoryAllocatorRefPosition(theCompiler))
667 , listNodePool(theCompiler)
671 activeRefPosition = nullptr;
672 specialPutArgCount = 0;
674 // Get the value of the environment variable that controls stress for register allocation
675 lsraStressMask = JitConfig.JitStressRegs();
677 if (lsraStressMask != 0)
679 // The code in this #if can be used to debug JitStressRegs issues according to
680 // method hash. To use, simply set environment variables JitStressRegsHashLo and JitStressRegsHashHi
681 unsigned methHash = compiler->info.compMethodHash();
682 char* lostr = getenv("JitStressRegsHashLo");
683 unsigned methHashLo = 0;
685 if (lostr != nullptr)
687 sscanf_s(lostr, "%x", &methHashLo);
690 char* histr = getenv("JitStressRegsHashHi");
691 unsigned methHashHi = UINT32_MAX;
692 if (histr != nullptr)
694 sscanf_s(histr, "%x", &methHashHi);
697 if (methHash < methHashLo || methHash > methHashHi)
701 else if (dump == true)
703 printf("JitStressRegs = %x for method %s, hash = 0x%x.\n",
704 lsraStressMask, compiler->info.compFullName, compiler->info.compMethodHash());
705 printf(""); // in our logic this causes a flush
711 enregisterLocalVars = ((compiler->opts.compFlags & CLFLG_REGVAR) != 0) && compiler->lvaTrackedCount > 0;
712 #ifdef _TARGET_ARM64_
713 availableIntRegs = (RBM_ALLINT & ~(RBM_PR | RBM_FP | RBM_LR) & ~compiler->codeGen->regSet.rsMaskResvd);
715 availableIntRegs = (RBM_ALLINT & ~compiler->codeGen->regSet.rsMaskResvd);
719 availableIntRegs &= ~RBM_FPBASE;
720 #endif // ETW_EBP_FRAMED
722 availableFloatRegs = RBM_ALLFLOAT;
723 availableDoubleRegs = RBM_ALLDOUBLE;
725 #ifdef _TARGET_AMD64_
726 if (compiler->opts.compDbgEnC)
728 // On x64 when the EnC option is set, we always save exactly RBP, RSI and RDI.
729 // RBP is not available to the register allocator, so RSI and RDI are the only
730 // callee-save registers available.
731 availableIntRegs &= ~RBM_CALLEE_SAVED | RBM_RSI | RBM_RDI;
732 availableFloatRegs &= ~RBM_CALLEE_SAVED;
733 availableDoubleRegs &= ~RBM_CALLEE_SAVED;
735 #endif // _TARGET_AMD64_
736 compiler->rpFrameType = FT_NOT_SET;
737 compiler->rpMustCreateEBPCalled = false;
739 compiler->codeGen->intRegState.rsIsFloat = false;
740 compiler->codeGen->floatRegState.rsIsFloat = true;
742 // Block sequencing (the order in which we schedule).
743 // Note that we don't initialize the bbVisitedSet until we do the first traversal
744 // (currently during Lowering's second phase, where it sets the TreeNodeInfo).
745 // This is so that any blocks that are added during the first phase of Lowering
746 // are accounted for (and we don't have BasicBlockEpoch issues).
747 blockSequencingDone = false;
748 blockSequence = nullptr;
749 blockSequenceWorkList = nullptr;
753 // Information about each block, including predecessor blocks used for variable locations at block entry.
756 // Populate the register mask table.
757 // The first two masks in the table are allint/allfloat
758 // The next N are the masks for each single register.
759 // After that are the dynamically added ones.
760 regMaskTable = new (compiler, CMK_LSRA) regMaskTP[numMasks];
761 regMaskTable[ALLINT_IDX] = allRegs(TYP_INT);
762 regMaskTable[ALLFLOAT_IDX] = allRegs(TYP_DOUBLE);
765 for (reg = REG_FIRST; reg < REG_COUNT; reg = REG_NEXT(reg))
767 regMaskTable[FIRST_SINGLE_REG_IDX + reg - REG_FIRST] = (reg == REG_STK) ? RBM_NONE : genRegMask(reg);
769 nextFreeMask = FIRST_SINGLE_REG_IDX + REG_COUNT;
770 noway_assert(nextFreeMask <= numMasks);
773 // Return the reg mask corresponding to the given index.
774 regMaskTP LinearScan::GetRegMaskForIndex(RegMaskIndex index)
776 assert(index < numMasks);
777 assert(index < nextFreeMask);
778 return regMaskTable[index];
781 // Given a reg mask, return the index it corresponds to. If it is not a 'well known' reg mask,
782 // add it at the end. This method has linear behavior in the worst cases but that is fairly rare.
783 // Most methods never use any but the well-known masks, and when they do use more
784 // it is only one or two more.
785 LinearScan::RegMaskIndex LinearScan::GetIndexForRegMask(regMaskTP mask)
788 if (isSingleRegister(mask))
790 result = genRegNumFromMask(mask) + FIRST_SINGLE_REG_IDX;
792 else if (mask == allRegs(TYP_INT))
796 else if (mask == allRegs(TYP_DOUBLE))
798 result = ALLFLOAT_IDX;
802 for (int i = FIRST_SINGLE_REG_IDX + REG_COUNT; i < nextFreeMask; i++)
804 if (regMaskTable[i] == mask)
810 // We only allocate a fixed number of masks. Since we don't reallocate, we will throw a
811 // noway_assert if we exceed this limit.
812 noway_assert(nextFreeMask < numMasks);
814 regMaskTable[nextFreeMask] = mask;
815 result = nextFreeMask;
818 assert(mask == regMaskTable[result]);
822 // We've decided that we can't use a register during register allocation (probably FPBASE),
823 // but we've already added it to the register masks. Go through the masks and remove it.
824 void LinearScan::RemoveRegisterFromMasks(regNumber reg)
826 JITDUMP("Removing register %s from LSRA register masks\n", getRegName(reg));
828 regMaskTP mask = ~genRegMask(reg);
829 for (int i = 0; i < nextFreeMask; i++)
831 regMaskTable[i] &= mask;
834 JITDUMP("After removing register:\n");
835 DBEXEC(VERBOSE, dspRegisterMaskTable());
839 void LinearScan::dspRegisterMaskTable()
841 printf("LSRA register masks. Total allocated: %d, total used: %d\n", numMasks, nextFreeMask);
842 for (int i = 0; i < nextFreeMask; i++)
845 dspRegMask(regMaskTable[i]);
851 //------------------------------------------------------------------------
852 // getNextCandidateFromWorkList: Get the next candidate for block sequencing
858 // The next block to be placed in the sequence.
861 // This method currently always returns the next block in the list, and relies on having
862 // blocks added to the list only when they are "ready", and on the
863 // addToBlockSequenceWorkList() method to insert them in the proper order.
864 // However, a block may be in the list and already selected, if it was subsequently
865 // encountered as both a flow and layout successor of the most recently selected
868 BasicBlock* LinearScan::getNextCandidateFromWorkList()
870 BasicBlockList* nextWorkList = nullptr;
871 for (BasicBlockList* workList = blockSequenceWorkList; workList != nullptr; workList = nextWorkList)
873 nextWorkList = workList->next;
874 BasicBlock* candBlock = workList->block;
875 removeFromBlockSequenceWorkList(workList, nullptr);
876 if (!isBlockVisited(candBlock))
884 //------------------------------------------------------------------------
885 // setBlockSequence:Determine the block order for register allocation.
894 // On return, the blockSequence array contains the blocks, in the order in which they
895 // will be allocated.
896 // This method clears the bbVisitedSet on LinearScan, and when it returns the set
897 // contains all the bbNums for the block.
898 // This requires a traversal of the BasicBlocks, and could potentially be
899 // combined with the first traversal (currently the one in Lowering that sets the
902 void LinearScan::setBlockSequence()
904 // Reset the "visited" flag on each block.
905 compiler->EnsureBasicBlockEpoch();
906 bbVisitedSet = BlockSetOps::MakeEmpty(compiler);
907 BlockSet readySet(BlockSetOps::MakeEmpty(compiler));
908 BlockSet predSet(BlockSetOps::MakeEmpty(compiler));
910 assert(blockSequence == nullptr && bbSeqCount == 0);
911 blockSequence = new (compiler, CMK_LSRA) BasicBlock*[compiler->fgBBcount];
912 bbNumMaxBeforeResolution = compiler->fgBBNumMax;
913 blockInfo = new (compiler, CMK_LSRA) LsraBlockInfo[bbNumMaxBeforeResolution + 1];
915 assert(blockSequenceWorkList == nullptr);
917 bool addedInternalBlocks = false;
918 verifiedAllBBs = false;
919 hasCriticalEdges = false;
920 BasicBlock* nextBlock;
921 // We use a bbNum of 0 for entry RefPositions.
922 // The other information in blockInfo[0] will never be used.
923 blockInfo[0].weight = BB_UNITY_WEIGHT;
924 for (BasicBlock* block = compiler->fgFirstBB; block != nullptr; block = nextBlock)
926 blockSequence[bbSeqCount] = block;
927 markBlockVisited(block);
931 // Initialize the blockInfo.
932 // predBBNum will be set later. 0 is never used as a bbNum.
933 assert(block->bbNum != 0);
934 blockInfo[block->bbNum].predBBNum = 0;
935 // We check for critical edges below, but initialize to false.
936 blockInfo[block->bbNum].hasCriticalInEdge = false;
937 blockInfo[block->bbNum].hasCriticalOutEdge = false;
938 blockInfo[block->bbNum].weight = block->getBBWeight(compiler);
941 blockInfo[block->bbNum].spillCount = 0;
942 blockInfo[block->bbNum].copyRegCount = 0;
943 blockInfo[block->bbNum].resolutionMovCount = 0;
944 blockInfo[block->bbNum].splitEdgeCount = 0;
945 #endif // TRACK_LSRA_STATS
947 if (block->GetUniquePred(compiler) == nullptr)
949 for (flowList* pred = block->bbPreds; pred != nullptr; pred = pred->flNext)
951 BasicBlock* predBlock = pred->flBlock;
952 if (predBlock->NumSucc(compiler) > 1)
954 blockInfo[block->bbNum].hasCriticalInEdge = true;
955 hasCriticalEdges = true;
958 else if (predBlock->bbJumpKind == BBJ_SWITCH)
960 assert(!"Switch with single successor");
965 // Determine which block to schedule next.
967 // First, update the NORMAL successors of the current block, adding them to the worklist
968 // according to the desired order. We will handle the EH successors below.
969 bool checkForCriticalOutEdge = (block->NumSucc(compiler) > 1);
970 if (!checkForCriticalOutEdge && block->bbJumpKind == BBJ_SWITCH)
972 assert(!"Switch with single successor");
975 const unsigned numSuccs = block->NumSucc(compiler);
976 for (unsigned succIndex = 0; succIndex < numSuccs; succIndex++)
978 BasicBlock* succ = block->GetSucc(succIndex, compiler);
979 if (checkForCriticalOutEdge && succ->GetUniquePred(compiler) == nullptr)
981 blockInfo[block->bbNum].hasCriticalOutEdge = true;
982 hasCriticalEdges = true;
983 // We can stop checking now.
984 checkForCriticalOutEdge = false;
987 if (isTraversalLayoutOrder() || isBlockVisited(succ))
992 // We've now seen a predecessor, so add it to the work list and the "readySet".
993 // It will be inserted in the worklist according to the specified traversal order
994 // (i.e. pred-first or random, since layout order is handled above).
995 if (!BlockSetOps::IsMember(compiler, readySet, succ->bbNum))
997 addToBlockSequenceWorkList(readySet, succ, predSet);
998 BlockSetOps::AddElemD(compiler, readySet, succ->bbNum);
1002 // For layout order, simply use bbNext
1003 if (isTraversalLayoutOrder())
1005 nextBlock = block->bbNext;
1009 while (nextBlock == nullptr)
1011 nextBlock = getNextCandidateFromWorkList();
1013 // TODO-Throughput: We would like to bypass this traversal if we know we've handled all
1014 // the blocks - but fgBBcount does not appear to be updated when blocks are removed.
1015 if (nextBlock == nullptr /* && bbSeqCount != compiler->fgBBcount*/ && !verifiedAllBBs)
1017 // If we don't encounter all blocks by traversing the regular sucessor links, do a full
1018 // traversal of all the blocks, and add them in layout order.
1019 // This may include:
1020 // - internal-only blocks (in the fgAddCodeList) which may not be in the flow graph
1021 // (these are not even in the bbNext links).
1022 // - blocks that have become unreachable due to optimizations, but that are strongly
1023 // connected (these are not removed)
1026 for (Compiler::AddCodeDsc* desc = compiler->fgAddCodeList; desc != nullptr; desc = desc->acdNext)
1028 if (!isBlockVisited(block))
1030 addToBlockSequenceWorkList(readySet, block, predSet);
1031 BlockSetOps::AddElemD(compiler, readySet, block->bbNum);
1035 for (BasicBlock* block = compiler->fgFirstBB; block; block = block->bbNext)
1037 if (!isBlockVisited(block))
1039 addToBlockSequenceWorkList(readySet, block, predSet);
1040 BlockSetOps::AddElemD(compiler, readySet, block->bbNum);
1043 verifiedAllBBs = true;
1051 blockSequencingDone = true;
1054 // Make sure that we've visited all the blocks.
1055 for (BasicBlock* block = compiler->fgFirstBB; block != nullptr; block = block->bbNext)
1057 assert(isBlockVisited(block));
1060 JITDUMP("LSRA Block Sequence: ");
1062 for (BasicBlock *block = startBlockSequence(); block != nullptr; ++i, block = moveToNextBlock())
1064 JITDUMP("BB%02u", block->bbNum);
1066 if (block->isMaxBBWeight())
1072 JITDUMP("(%6s) ", refCntWtd2str(block->getBBWeight(compiler)));
1084 //------------------------------------------------------------------------
1085 // compareBlocksForSequencing: Compare two basic blocks for sequencing order.
1088 // block1 - the first block for comparison
1089 // block2 - the second block for comparison
1090 // useBlockWeights - whether to use block weights for comparison
1093 // -1 if block1 is preferred.
1094 // 0 if the blocks are equivalent.
1095 // 1 if block2 is preferred.
1098 // See addToBlockSequenceWorkList.
1099 int LinearScan::compareBlocksForSequencing(BasicBlock* block1, BasicBlock* block2, bool useBlockWeights)
1101 if (useBlockWeights)
1103 unsigned weight1 = block1->getBBWeight(compiler);
1104 unsigned weight2 = block2->getBBWeight(compiler);
1106 if (weight1 > weight2)
1110 else if (weight1 < weight2)
1116 // If weights are the same prefer LOWER bbnum
1117 if (block1->bbNum < block2->bbNum)
1121 else if (block1->bbNum == block2->bbNum)
1131 //------------------------------------------------------------------------
1132 // addToBlockSequenceWorkList: Add a BasicBlock to the work list for sequencing.
1135 // sequencedBlockSet - the set of blocks that are already sequenced
1136 // block - the new block to be added
1137 // predSet - the buffer to save predecessors set. A block set allocated by the caller used here as a
1138 // temporary block set for constructing a predecessor set. Allocated by the caller to avoid reallocating a new block
1139 // set with every call to this function
1145 // The first block in the list will be the next one to be sequenced, as soon
1146 // as we encounter a block whose successors have all been sequenced, in pred-first
1147 // order, or the very next block if we are traversing in random order (once implemented).
1148 // This method uses a comparison method to determine the order in which to place
1149 // the blocks in the list. This method queries whether all predecessors of the
1150 // block are sequenced at the time it is added to the list and if so uses block weights
1151 // for inserting the block. A block is never inserted ahead of its predecessors.
1152 // A block at the time of insertion may not have all its predecessors sequenced, in
1153 // which case it will be sequenced based on its block number. Once a block is inserted,
1154 // its priority\order will not be changed later once its remaining predecessors are
1155 // sequenced. This would mean that work list may not be sorted entirely based on
1156 // block weights alone.
1158 // Note also that, when random traversal order is implemented, this method
1159 // should insert the blocks into the list in random order, so that we can always
1160 // simply select the first block in the list.
1161 void LinearScan::addToBlockSequenceWorkList(BlockSet sequencedBlockSet, BasicBlock* block, BlockSet& predSet)
1163 // The block that is being added is not already sequenced
1164 assert(!BlockSetOps::IsMember(compiler, sequencedBlockSet, block->bbNum));
1166 // Get predSet of block
1167 BlockSetOps::ClearD(compiler, predSet);
1169 for (pred = block->bbPreds; pred != nullptr; pred = pred->flNext)
1171 BlockSetOps::AddElemD(compiler, predSet, pred->flBlock->bbNum);
1174 // If either a rarely run block or all its preds are already sequenced, use block's weight to sequence
1175 bool useBlockWeight = block->isRunRarely() || BlockSetOps::IsSubset(compiler, sequencedBlockSet, predSet);
1177 BasicBlockList* prevNode = nullptr;
1178 BasicBlockList* nextNode = blockSequenceWorkList;
1180 while (nextNode != nullptr)
1184 if (nextNode->block->isRunRarely())
1186 // If the block that is yet to be sequenced is a rarely run block, always use block weights for sequencing
1187 seqResult = compareBlocksForSequencing(nextNode->block, block, true);
1189 else if (BlockSetOps::IsMember(compiler, predSet, nextNode->block->bbNum))
1191 // always prefer unsequenced pred blocks
1196 seqResult = compareBlocksForSequencing(nextNode->block, block, useBlockWeight);
1204 prevNode = nextNode;
1205 nextNode = nextNode->next;
1208 BasicBlockList* newListNode = new (compiler, CMK_LSRA) BasicBlockList(block, nextNode);
1209 if (prevNode == nullptr)
1211 blockSequenceWorkList = newListNode;
1215 prevNode->next = newListNode;
1219 void LinearScan::removeFromBlockSequenceWorkList(BasicBlockList* listNode, BasicBlockList* prevNode)
1221 if (listNode == blockSequenceWorkList)
1223 assert(prevNode == nullptr);
1224 blockSequenceWorkList = listNode->next;
1228 assert(prevNode != nullptr && prevNode->next == listNode);
1229 prevNode->next = listNode->next;
1231 // TODO-Cleanup: consider merging Compiler::BlockListNode and BasicBlockList
1232 // compiler->FreeBlockListNode(listNode);
1235 // Initialize the block order for allocation (called each time a new traversal begins).
1236 BasicBlock* LinearScan::startBlockSequence()
1238 if (!blockSequencingDone)
1242 BasicBlock* curBB = compiler->fgFirstBB;
1244 curBBNum = curBB->bbNum;
1245 clearVisitedBlocks();
1246 assert(blockSequence[0] == compiler->fgFirstBB);
1247 markBlockVisited(curBB);
1251 //------------------------------------------------------------------------
1252 // moveToNextBlock: Move to the next block in order for allocation or resolution.
1261 // This method is used when the next block is actually going to be handled.
1262 // It changes curBBNum.
1264 BasicBlock* LinearScan::moveToNextBlock()
1266 BasicBlock* nextBlock = getNextBlock();
1268 if (nextBlock != nullptr)
1270 curBBNum = nextBlock->bbNum;
1275 //------------------------------------------------------------------------
1276 // getNextBlock: Get the next block in order for allocation or resolution.
1285 // This method does not actually change the current block - it is used simply
1286 // to determine which block will be next.
1288 BasicBlock* LinearScan::getNextBlock()
1290 assert(blockSequencingDone);
1291 unsigned int nextBBSeqNum = curBBSeqNum + 1;
1292 if (nextBBSeqNum < bbSeqCount)
1294 return blockSequence[nextBBSeqNum];
1299 //------------------------------------------------------------------------
1300 // doLinearScan: The main method for register allocation.
1309 void LinearScan::doLinearScan()
1311 unsigned lsraBlockEpoch = compiler->GetCurBasicBlockEpoch();
1313 splitBBNumToTargetBBNumMap = nullptr;
1315 // This is complicated by the fact that physical registers have refs associated
1316 // with locations where they are killed (e.g. calls), but we don't want to
1317 // count these as being touched.
1319 compiler->codeGen->regSet.rsClearRegsModified();
1323 DBEXEC(VERBOSE, TupleStyleDump(LSRA_DUMP_REFPOS));
1324 compiler->EndPhase(PHASE_LINEAR_SCAN_BUILD);
1326 DBEXEC(VERBOSE, lsraDumpIntervals("after buildIntervals"));
1328 clearVisitedBlocks();
1330 allocateRegisters();
1331 compiler->EndPhase(PHASE_LINEAR_SCAN_ALLOC);
1333 compiler->EndPhase(PHASE_LINEAR_SCAN_RESOLVE);
1335 #if TRACK_LSRA_STATS
1336 if ((JitConfig.DisplayLsraStats() != 0)
1342 dumpLsraStats(jitstdout);
1344 #endif // TRACK_LSRA_STATS
1346 DBEXEC(VERBOSE, TupleStyleDump(LSRA_DUMP_POST));
1348 compiler->compLSRADone = true;
1349 noway_assert(lsraBlockEpoch = compiler->GetCurBasicBlockEpoch());
1352 //------------------------------------------------------------------------
1353 // recordVarLocationsAtStartOfBB: Update live-in LclVarDscs with the appropriate
1354 // register location at the start of a block, during codegen.
1357 // bb - the block for which code is about to be generated.
1363 // CodeGen will take care of updating the reg masks and the current var liveness,
1364 // after calling this method.
1365 // This is because we need to kill off the dead registers before setting the newly live ones.
1367 void LinearScan::recordVarLocationsAtStartOfBB(BasicBlock* bb)
1369 if (!enregisterLocalVars)
1373 JITDUMP("Recording Var Locations at start of BB%02u\n", bb->bbNum);
1374 VarToRegMap map = getInVarToRegMap(bb->bbNum);
1377 VarSetOps::AssignNoCopy(compiler, currentLiveVars,
1378 VarSetOps::Intersection(compiler, registerCandidateVars, bb->bbLiveIn));
1379 VarSetOps::Iter iter(compiler, currentLiveVars);
1380 unsigned varIndex = 0;
1381 while (iter.NextElem(&varIndex))
1383 unsigned varNum = compiler->lvaTrackedToVarNum[varIndex];
1384 LclVarDsc* varDsc = &(compiler->lvaTable[varNum]);
1385 regNumber regNum = getVarReg(map, varIndex);
1387 regNumber oldRegNum = varDsc->lvRegNum;
1388 regNumber newRegNum = regNum;
1390 if (oldRegNum != newRegNum)
1392 JITDUMP(" V%02u(%s->%s)", varNum, compiler->compRegVarName(oldRegNum),
1393 compiler->compRegVarName(newRegNum));
1394 varDsc->lvRegNum = newRegNum;
1397 else if (newRegNum != REG_STK)
1399 JITDUMP(" V%02u(%s)", varNum, compiler->compRegVarName(newRegNum));
1406 JITDUMP(" <none>\n");
1412 void Interval::setLocalNumber(Compiler* compiler, unsigned lclNum, LinearScan* linScan)
1414 LclVarDsc* varDsc = &compiler->lvaTable[lclNum];
1415 assert(varDsc->lvTracked);
1416 assert(varDsc->lvVarIndex < compiler->lvaTrackedCount);
1418 linScan->localVarIntervals[varDsc->lvVarIndex] = this;
1420 assert(linScan->getIntervalForLocalVar(varDsc->lvVarIndex) == this);
1421 this->isLocalVar = true;
1422 this->varNum = lclNum;
1425 // identify the candidates which we are not going to enregister due to
1426 // being used in EH in a way we don't want to deal with
1427 // this logic cloned from fgInterBlockLocalVarLiveness
1428 void LinearScan::identifyCandidatesExceptionDataflow()
1430 VARSET_TP exceptVars(VarSetOps::MakeEmpty(compiler));
1431 VARSET_TP filterVars(VarSetOps::MakeEmpty(compiler));
1432 VARSET_TP finallyVars(VarSetOps::MakeEmpty(compiler));
1435 foreach_block(compiler, block)
1437 if (block->bbCatchTyp != BBCT_NONE)
1439 // live on entry to handler
1440 VarSetOps::UnionD(compiler, exceptVars, block->bbLiveIn);
1443 if (block->bbJumpKind == BBJ_EHFILTERRET)
1445 // live on exit from filter
1446 VarSetOps::UnionD(compiler, filterVars, block->bbLiveOut);
1448 else if (block->bbJumpKind == BBJ_EHFINALLYRET)
1450 // live on exit from finally
1451 VarSetOps::UnionD(compiler, finallyVars, block->bbLiveOut);
1453 #if FEATURE_EH_FUNCLETS
1454 // Funclets are called and returned from, as such we can only count on the frame
1455 // pointer being restored, and thus everything live in or live out must be on the
1457 if (block->bbFlags & BBF_FUNCLET_BEG)
1459 VarSetOps::UnionD(compiler, exceptVars, block->bbLiveIn);
1461 if ((block->bbJumpKind == BBJ_EHFINALLYRET) || (block->bbJumpKind == BBJ_EHFILTERRET) ||
1462 (block->bbJumpKind == BBJ_EHCATCHRET))
1464 VarSetOps::UnionD(compiler, exceptVars, block->bbLiveOut);
1466 #endif // FEATURE_EH_FUNCLETS
1469 // slam them all together (there was really no need to use more than 2 bitvectors here)
1470 VarSetOps::UnionD(compiler, exceptVars, filterVars);
1471 VarSetOps::UnionD(compiler, exceptVars, finallyVars);
1473 /* Mark all pointer variables live on exit from a 'finally'
1474 block as either volatile for non-GC ref types or as
1475 'explicitly initialized' (volatile and must-init) for GC-ref types */
1477 VarSetOps::Iter iter(compiler, exceptVars);
1478 unsigned varIndex = 0;
1479 while (iter.NextElem(&varIndex))
1481 unsigned varNum = compiler->lvaTrackedToVarNum[varIndex];
1482 LclVarDsc* varDsc = compiler->lvaTable + varNum;
1484 compiler->lvaSetVarDoNotEnregister(varNum DEBUGARG(Compiler::DNER_LiveInOutOfHandler));
1486 if (varTypeIsGC(varDsc))
1488 if (VarSetOps::IsMember(compiler, finallyVars, varIndex) && !varDsc->lvIsParam)
1490 varDsc->lvMustInit = true;
1496 bool LinearScan::isRegCandidate(LclVarDsc* varDsc)
1498 // We shouldn't be called if opt settings do not permit register variables.
1499 assert((compiler->opts.compFlags & CLFLG_REGVAR) != 0);
1501 if (!varDsc->lvTracked)
1506 #if !defined(_TARGET_64BIT_)
1507 if (varDsc->lvType == TYP_LONG)
1509 // Long variables should not be register candidates.
1510 // Lowering will have split any candidate lclVars into lo/hi vars.
1513 #endif // !defined(_TARGET_64BIT)
1515 // If we have JMP, reg args must be put on the stack
1517 if (compiler->compJmpOpUsed && varDsc->lvIsRegArg)
1522 // Don't allocate registers for dependently promoted struct fields
1523 if (compiler->lvaIsFieldOfDependentlyPromotedStruct(varDsc))
1530 // Identify locals & compiler temps that are register candidates
1531 // TODO-Cleanup: This was cloned from Compiler::lvaSortByRefCount() in lclvars.cpp in order
1532 // to avoid perturbation, but should be merged.
1534 void LinearScan::identifyCandidates()
1536 if (enregisterLocalVars)
1538 // Initialize the set of lclVars that are candidates for register allocation.
1539 VarSetOps::AssignNoCopy(compiler, registerCandidateVars, VarSetOps::MakeEmpty(compiler));
1541 // Initialize the sets of lclVars that are used to determine whether, and for which lclVars,
1542 // we need to perform resolution across basic blocks.
1543 // Note that we can't do this in the constructor because the number of tracked lclVars may
1544 // change between the constructor and the actual allocation.
1545 VarSetOps::AssignNoCopy(compiler, resolutionCandidateVars, VarSetOps::MakeEmpty(compiler));
1546 VarSetOps::AssignNoCopy(compiler, splitOrSpilledVars, VarSetOps::MakeEmpty(compiler));
1548 // We set enregisterLocalVars to true only if there are tracked lclVars
1549 assert(compiler->lvaCount != 0);
1551 else if (compiler->lvaCount == 0)
1553 // Nothing to do. Note that even if enregisterLocalVars is false, we still need to set the
1554 // lvLRACandidate field on all the lclVars to false if we have any.
1558 if (compiler->compHndBBtabCount > 0)
1560 identifyCandidatesExceptionDataflow();
1566 // While we build intervals for the candidate lclVars, we will determine the floating point
1567 // lclVars, if any, to consider for callee-save register preferencing.
1568 // We maintain two sets of FP vars - those that meet the first threshold of weighted ref Count,
1569 // and those that meet the second.
1570 // The first threshold is used for methods that are heuristically deemed either to have light
1571 // fp usage, or other factors that encourage conservative use of callee-save registers, such
1572 // as multiple exits (where there might be an early exit that woudl be excessively penalized by
1573 // lots of prolog/epilog saves & restores).
1574 // The second threshold is used where there are factors deemed to make it more likely that fp
1575 // fp callee save registers will be needed, such as loops or many fp vars.
1576 // We keep two sets of vars, since we collect some of the information to determine which set to
1577 // use as we iterate over the vars.
1578 // When we are generating AVX code on non-Unix (FEATURE_PARTIAL_SIMD_CALLEE_SAVE), we maintain an
1579 // additional set of LargeVectorType vars, and there is a separate threshold defined for those.
1580 // It is assumed that if we encounter these, that we should consider this a "high use" scenario,
1581 // so we don't maintain two sets of these vars.
1582 // This is defined as thresholdLargeVectorRefCntWtd, as we are likely to use the same mechanism
1583 // for vectors on Arm64, though the actual value may differ.
1585 unsigned int floatVarCount = 0;
1586 unsigned int thresholdFPRefCntWtd = 4 * BB_UNITY_WEIGHT;
1587 unsigned int maybeFPRefCntWtd = 2 * BB_UNITY_WEIGHT;
1588 VARSET_TP fpMaybeCandidateVars(VarSetOps::UninitVal());
1589 #if FEATURE_PARTIAL_SIMD_CALLEE_SAVE
1590 unsigned int largeVectorVarCount = 0;
1591 unsigned int thresholdLargeVectorRefCntWtd = 4 * BB_UNITY_WEIGHT;
1592 #endif // FEATURE_PARTIAL_SIMD_CALLEE_SAVE
1593 if (enregisterLocalVars)
1595 VarSetOps::AssignNoCopy(compiler, fpCalleeSaveCandidateVars, VarSetOps::MakeEmpty(compiler));
1596 VarSetOps::AssignNoCopy(compiler, fpMaybeCandidateVars, VarSetOps::MakeEmpty(compiler));
1597 #if FEATURE_PARTIAL_SIMD_CALLEE_SAVE
1598 VarSetOps::AssignNoCopy(compiler, largeVectorVars, VarSetOps::MakeEmpty(compiler));
1599 VarSetOps::AssignNoCopy(compiler, largeVectorCalleeSaveCandidateVars, VarSetOps::MakeEmpty(compiler));
1600 #endif // FEATURE_PARTIAL_SIMD_CALLEE_SAVE
1603 unsigned refCntStk = 0;
1604 unsigned refCntReg = 0;
1605 unsigned refCntWtdReg = 0;
1606 unsigned refCntStkParam = 0; // sum of ref counts for all stack based parameters
1607 unsigned refCntWtdStkDbl = 0; // sum of wtd ref counts for stack based doubles
1608 doDoubleAlign = false;
1609 bool checkDoubleAlign = true;
1610 if (compiler->codeGen->isFramePointerRequired() || compiler->opts.MinOpts())
1612 checkDoubleAlign = false;
1616 switch (compiler->getCanDoubleAlign())
1618 case MUST_DOUBLE_ALIGN:
1619 doDoubleAlign = true;
1620 checkDoubleAlign = false;
1622 case CAN_DOUBLE_ALIGN:
1624 case CANT_DOUBLE_ALIGN:
1625 doDoubleAlign = false;
1626 checkDoubleAlign = false;
1632 #endif // DOUBLE_ALIGN
1634 // Check whether register variables are permitted.
1635 if (!enregisterLocalVars)
1637 localVarIntervals = nullptr;
1639 else if (compiler->lvaTrackedCount > 0)
1641 // initialize mapping from tracked local to interval
1642 localVarIntervals = new (compiler, CMK_LSRA) Interval*[compiler->lvaTrackedCount];
1645 INTRACK_STATS(regCandidateVarCount = 0);
1646 for (lclNum = 0, varDsc = compiler->lvaTable; lclNum < compiler->lvaCount; lclNum++, varDsc++)
1648 // Initialize all variables to REG_STK
1649 varDsc->lvRegNum = REG_STK;
1650 #ifndef _TARGET_64BIT_
1651 varDsc->lvOtherReg = REG_STK;
1652 #endif // _TARGET_64BIT_
1654 if (!enregisterLocalVars)
1656 varDsc->lvLRACandidate = false;
1661 if (checkDoubleAlign)
1663 if (varDsc->lvIsParam && !varDsc->lvIsRegArg)
1665 refCntStkParam += varDsc->lvRefCnt;
1667 else if (!isRegCandidate(varDsc) || varDsc->lvDoNotEnregister)
1669 refCntStk += varDsc->lvRefCnt;
1670 if ((varDsc->lvType == TYP_DOUBLE) ||
1671 ((varTypeIsStruct(varDsc) && varDsc->lvStructDoubleAlign &&
1672 (compiler->lvaGetPromotionType(varDsc) != Compiler::PROMOTION_TYPE_INDEPENDENT))))
1674 refCntWtdStkDbl += varDsc->lvRefCntWtd;
1679 refCntReg += varDsc->lvRefCnt;
1680 refCntWtdReg += varDsc->lvRefCntWtd;
1683 #endif // DOUBLE_ALIGN
1685 /* Track all locals that can be enregistered */
1687 if (!isRegCandidate(varDsc))
1689 varDsc->lvLRACandidate = 0;
1690 if (varDsc->lvTracked)
1692 localVarIntervals[varDsc->lvVarIndex] = nullptr;
1697 assert(varDsc->lvTracked);
1699 varDsc->lvLRACandidate = 1;
1701 // Start with lvRegister as false - set it true only if the variable gets
1702 // the same register assignment throughout
1703 varDsc->lvRegister = false;
1705 /* If the ref count is zero */
1706 if (varDsc->lvRefCnt == 0)
1708 /* Zero ref count, make this untracked */
1709 varDsc->lvRefCntWtd = 0;
1710 varDsc->lvLRACandidate = 0;
1713 // Variables that are address-exposed are never enregistered, or tracked.
1714 // A struct may be promoted, and a struct that fits in a register may be fully enregistered.
1715 // Pinned variables may not be tracked (a condition of the GCInfo representation)
1716 // or enregistered, on x86 -- it is believed that we can enregister pinned (more properly, "pinning")
1717 // references when using the general GC encoding.
1719 if (varDsc->lvAddrExposed || !varTypeIsEnregisterableStruct(varDsc))
1721 varDsc->lvLRACandidate = 0;
1723 Compiler::DoNotEnregisterReason dner = Compiler::DNER_AddrExposed;
1724 if (!varDsc->lvAddrExposed)
1726 dner = Compiler::DNER_IsStruct;
1729 compiler->lvaSetVarDoNotEnregister(lclNum DEBUGARG(dner));
1731 else if (varDsc->lvPinned)
1733 varDsc->lvTracked = 0;
1734 #ifdef JIT32_GCENCODER
1735 compiler->lvaSetVarDoNotEnregister(lclNum DEBUGARG(Compiler::DNER_PinningRef));
1736 #endif // JIT32_GCENCODER
1739 // Are we not optimizing and we have exception handlers?
1740 // if so mark all args and locals as volatile, so that they
1741 // won't ever get enregistered.
1743 if (compiler->opts.MinOpts() && compiler->compHndBBtabCount > 0)
1745 compiler->lvaSetVarDoNotEnregister(lclNum DEBUGARG(Compiler::DNER_LiveInOutOfHandler));
1748 if (varDsc->lvDoNotEnregister)
1750 varDsc->lvLRACandidate = 0;
1751 localVarIntervals[varDsc->lvVarIndex] = nullptr;
1755 var_types type = genActualType(varDsc->TypeGet());
1759 #if CPU_HAS_FP_SUPPORT
1762 if (compiler->opts.compDbgCode)
1764 varDsc->lvLRACandidate = 0;
1767 if (varDsc->lvIsParam && varDsc->lvIsRegArg)
1769 type = (type == TYP_DOUBLE) ? TYP_LONG : TYP_INT;
1771 #endif // ARM_SOFTFP
1773 #endif // CPU_HAS_FP_SUPPORT
1785 if (varDsc->lvPromoted)
1787 varDsc->lvLRACandidate = 0;
1791 // TODO-1stClassStructs: Move TYP_SIMD8 up with the other SIMD types, after handling the param issue
1792 // (passing & returning as TYP_LONG).
1794 #endif // FEATURE_SIMD
1798 varDsc->lvLRACandidate = 0;
1804 noway_assert(!"lvType not set correctly");
1805 varDsc->lvType = TYP_INT;
1810 varDsc->lvLRACandidate = 0;
1813 if (varDsc->lvLRACandidate)
1815 Interval* newInt = newInterval(type);
1816 newInt->setLocalNumber(compiler, lclNum, this);
1817 VarSetOps::AddElemD(compiler, registerCandidateVars, varDsc->lvVarIndex);
1819 // we will set this later when we have determined liveness
1820 varDsc->lvMustInit = false;
1822 if (varDsc->lvIsStructField)
1824 newInt->isStructField = true;
1827 INTRACK_STATS(regCandidateVarCount++);
1829 // We maintain two sets of FP vars - those that meet the first threshold of weighted ref Count,
1830 // and those that meet the second (see the definitions of thresholdFPRefCntWtd and maybeFPRefCntWtd
1832 CLANG_FORMAT_COMMENT_ANCHOR;
1834 #if FEATURE_PARTIAL_SIMD_CALLEE_SAVE
1835 // Additionally, when we are generating AVX on non-UNIX amd64, we keep a separate set of the LargeVectorType
1837 if (varTypeNeedsPartialCalleeSave(varDsc->lvType))
1839 largeVectorVarCount++;
1840 VarSetOps::AddElemD(compiler, largeVectorVars, varDsc->lvVarIndex);
1841 unsigned refCntWtd = varDsc->lvRefCntWtd;
1842 if (refCntWtd >= thresholdLargeVectorRefCntWtd)
1844 VarSetOps::AddElemD(compiler, largeVectorCalleeSaveCandidateVars, varDsc->lvVarIndex);
1848 #endif // FEATURE_PARTIAL_SIMD_CALLEE_SAVE
1849 if (regType(type) == FloatRegisterType)
1852 unsigned refCntWtd = varDsc->lvRefCntWtd;
1853 if (varDsc->lvIsRegArg)
1855 // Don't count the initial reference for register params. In those cases,
1856 // using a callee-save causes an extra copy.
1857 refCntWtd -= BB_UNITY_WEIGHT;
1859 if (refCntWtd >= thresholdFPRefCntWtd)
1861 VarSetOps::AddElemD(compiler, fpCalleeSaveCandidateVars, varDsc->lvVarIndex);
1863 else if (refCntWtd >= maybeFPRefCntWtd)
1865 VarSetOps::AddElemD(compiler, fpMaybeCandidateVars, varDsc->lvVarIndex);
1871 localVarIntervals[varDsc->lvVarIndex] = nullptr;
1876 if (checkDoubleAlign)
1878 // TODO-CQ: Fine-tune this:
1879 // In the legacy reg predictor, this runs after allocation, and then demotes any lclVars
1880 // allocated to the frame pointer, which is probably the wrong order.
1881 // However, because it runs after allocation, it can determine the impact of demoting
1882 // the lclVars allocated to the frame pointer.
1883 // => Here, estimate of the EBP refCnt and weighted refCnt is a wild guess.
1885 unsigned refCntEBP = refCntReg / 8;
1886 unsigned refCntWtdEBP = refCntWtdReg / 8;
1889 compiler->shouldDoubleAlign(refCntStk, refCntEBP, refCntWtdEBP, refCntStkParam, refCntWtdStkDbl);
1891 #endif // DOUBLE_ALIGN
1893 // The factors we consider to determine which set of fp vars to use as candidates for callee save
1894 // registers current include the number of fp vars, whether there are loops, and whether there are
1895 // multiple exits. These have been selected somewhat empirically, but there is probably room for
1897 CLANG_FORMAT_COMMENT_ANCHOR;
1902 printf("\nFP callee save candidate vars: ");
1903 if (enregisterLocalVars && !VarSetOps::IsEmpty(compiler, fpCalleeSaveCandidateVars))
1905 dumpConvertedVarSet(compiler, fpCalleeSaveCandidateVars);
1915 JITDUMP("floatVarCount = %d; hasLoops = %d, singleExit = %d\n", floatVarCount, compiler->fgHasLoops,
1916 (compiler->fgReturnBlocks == nullptr || compiler->fgReturnBlocks->next == nullptr));
1918 // Determine whether to use the 2nd, more aggressive, threshold for fp callee saves.
1919 if (floatVarCount > 6 && compiler->fgHasLoops &&
1920 (compiler->fgReturnBlocks == nullptr || compiler->fgReturnBlocks->next == nullptr))
1922 assert(enregisterLocalVars);
1926 printf("Adding additional fp callee save candidates: \n");
1927 if (!VarSetOps::IsEmpty(compiler, fpMaybeCandidateVars))
1929 dumpConvertedVarSet(compiler, fpMaybeCandidateVars);
1938 VarSetOps::UnionD(compiler, fpCalleeSaveCandidateVars, fpMaybeCandidateVars);
1945 // Frame layout is only pre-computed for ARM
1946 printf("\nlvaTable after IdentifyCandidates\n");
1947 compiler->lvaTableDump();
1950 #endif // _TARGET_ARM_
1953 // TODO-Throughput: This mapping can surely be more efficiently done
1954 void LinearScan::initVarRegMaps()
1956 if (!enregisterLocalVars)
1958 inVarToRegMaps = nullptr;
1959 outVarToRegMaps = nullptr;
1962 assert(compiler->lvaTrackedFixed); // We should have already set this to prevent us from adding any new tracked
1965 // The compiler memory allocator requires that the allocation be an
1966 // even multiple of int-sized objects
1967 unsigned int varCount = compiler->lvaTrackedCount;
1968 regMapCount = (unsigned int)roundUp(varCount, sizeof(int));
1970 // Not sure why blocks aren't numbered from zero, but they don't appear to be.
1971 // So, if we want to index by bbNum we have to know the maximum value.
1972 unsigned int bbCount = compiler->fgBBNumMax + 1;
1974 inVarToRegMaps = new (compiler, CMK_LSRA) regNumberSmall*[bbCount];
1975 outVarToRegMaps = new (compiler, CMK_LSRA) regNumberSmall*[bbCount];
1979 // This VarToRegMap is used during the resolution of critical edges.
1980 sharedCriticalVarToRegMap = new (compiler, CMK_LSRA) regNumberSmall[regMapCount];
1982 for (unsigned int i = 0; i < bbCount; i++)
1984 VarToRegMap inVarToRegMap = new (compiler, CMK_LSRA) regNumberSmall[regMapCount];
1985 VarToRegMap outVarToRegMap = new (compiler, CMK_LSRA) regNumberSmall[regMapCount];
1987 for (unsigned int j = 0; j < regMapCount; j++)
1989 inVarToRegMap[j] = REG_STK;
1990 outVarToRegMap[j] = REG_STK;
1992 inVarToRegMaps[i] = inVarToRegMap;
1993 outVarToRegMaps[i] = outVarToRegMap;
1998 sharedCriticalVarToRegMap = nullptr;
1999 for (unsigned int i = 0; i < bbCount; i++)
2001 inVarToRegMaps[i] = nullptr;
2002 outVarToRegMaps[i] = nullptr;
2007 void LinearScan::setInVarRegForBB(unsigned int bbNum, unsigned int varNum, regNumber reg)
2009 assert(enregisterLocalVars);
2010 assert(reg < UCHAR_MAX && varNum < compiler->lvaCount);
2011 inVarToRegMaps[bbNum][compiler->lvaTable[varNum].lvVarIndex] = (regNumberSmall)reg;
2014 void LinearScan::setOutVarRegForBB(unsigned int bbNum, unsigned int varNum, regNumber reg)
2016 assert(enregisterLocalVars);
2017 assert(reg < UCHAR_MAX && varNum < compiler->lvaCount);
2018 outVarToRegMaps[bbNum][compiler->lvaTable[varNum].lvVarIndex] = (regNumberSmall)reg;
2021 LinearScan::SplitEdgeInfo LinearScan::getSplitEdgeInfo(unsigned int bbNum)
2023 assert(enregisterLocalVars);
2024 SplitEdgeInfo splitEdgeInfo;
2025 assert(bbNum <= compiler->fgBBNumMax);
2026 assert(bbNum > bbNumMaxBeforeResolution);
2027 assert(splitBBNumToTargetBBNumMap != nullptr);
2028 splitBBNumToTargetBBNumMap->Lookup(bbNum, &splitEdgeInfo);
2029 assert(splitEdgeInfo.toBBNum <= bbNumMaxBeforeResolution);
2030 assert(splitEdgeInfo.fromBBNum <= bbNumMaxBeforeResolution);
2031 return splitEdgeInfo;
2034 VarToRegMap LinearScan::getInVarToRegMap(unsigned int bbNum)
2036 assert(enregisterLocalVars);
2037 assert(bbNum <= compiler->fgBBNumMax);
2038 // For the blocks inserted to split critical edges, the inVarToRegMap is
2039 // equal to the outVarToRegMap at the "from" block.
2040 if (bbNum > bbNumMaxBeforeResolution)
2042 SplitEdgeInfo splitEdgeInfo = getSplitEdgeInfo(bbNum);
2043 unsigned fromBBNum = splitEdgeInfo.fromBBNum;
2046 assert(splitEdgeInfo.toBBNum != 0);
2047 return inVarToRegMaps[splitEdgeInfo.toBBNum];
2051 return outVarToRegMaps[fromBBNum];
2055 return inVarToRegMaps[bbNum];
2058 VarToRegMap LinearScan::getOutVarToRegMap(unsigned int bbNum)
2060 assert(enregisterLocalVars);
2061 assert(bbNum <= compiler->fgBBNumMax);
2062 // For the blocks inserted to split critical edges, the outVarToRegMap is
2063 // equal to the inVarToRegMap at the target.
2064 if (bbNum > bbNumMaxBeforeResolution)
2066 // If this is an empty block, its in and out maps are both the same.
2067 // We identify this case by setting fromBBNum or toBBNum to 0, and using only the other.
2068 SplitEdgeInfo splitEdgeInfo = getSplitEdgeInfo(bbNum);
2069 unsigned toBBNum = splitEdgeInfo.toBBNum;
2072 assert(splitEdgeInfo.fromBBNum != 0);
2073 return outVarToRegMaps[splitEdgeInfo.fromBBNum];
2077 return inVarToRegMaps[toBBNum];
2080 return outVarToRegMaps[bbNum];
2083 //------------------------------------------------------------------------
2084 // setVarReg: Set the register associated with a variable in the given 'bbVarToRegMap'.
2087 // bbVarToRegMap - the map of interest
2088 // trackedVarIndex - the lvVarIndex for the variable
2089 // reg - the register to which it is being mapped
2094 void LinearScan::setVarReg(VarToRegMap bbVarToRegMap, unsigned int trackedVarIndex, regNumber reg)
2096 assert(trackedVarIndex < compiler->lvaTrackedCount);
2097 regNumberSmall regSmall = (regNumberSmall)reg;
2098 assert((regNumber)regSmall == reg);
2099 bbVarToRegMap[trackedVarIndex] = regSmall;
2102 //------------------------------------------------------------------------
2103 // getVarReg: Get the register associated with a variable in the given 'bbVarToRegMap'.
2106 // bbVarToRegMap - the map of interest
2107 // trackedVarIndex - the lvVarIndex for the variable
2110 // The register to which 'trackedVarIndex' is mapped
2112 regNumber LinearScan::getVarReg(VarToRegMap bbVarToRegMap, unsigned int trackedVarIndex)
2114 assert(enregisterLocalVars);
2115 assert(trackedVarIndex < compiler->lvaTrackedCount);
2116 return (regNumber)bbVarToRegMap[trackedVarIndex];
2119 // Initialize the incoming VarToRegMap to the given map values (generally a predecessor of
2121 VarToRegMap LinearScan::setInVarToRegMap(unsigned int bbNum, VarToRegMap srcVarToRegMap)
2123 assert(enregisterLocalVars);
2124 VarToRegMap inVarToRegMap = inVarToRegMaps[bbNum];
2125 memcpy(inVarToRegMap, srcVarToRegMap, (regMapCount * sizeof(regNumber)));
2126 return inVarToRegMap;
2129 //------------------------------------------------------------------------
2130 // checkLastUses: Check correctness of last use flags
2133 // The block for which we are checking last uses.
2136 // This does a backward walk of the RefPositions, starting from the liveOut set.
2137 // This method was previously used to set the last uses, which were computed by
2138 // liveness, but were not create in some cases of multiple lclVar references in the
2139 // same tree. However, now that last uses are computed as RefPositions are created,
2140 // that is no longer necessary, and this method is simply retained as a check.
2141 // The exception to the check-only behavior is when LSRA_EXTEND_LIFETIMES if set via
2142 // COMPlus_JitStressRegs. In that case, this method is required, because even though
2143 // the RefPositions will not be marked lastUse in that case, we still need to correclty
2144 // mark the last uses on the tree nodes, which is done by this method.
2147 void LinearScan::checkLastUses(BasicBlock* block)
2151 JITDUMP("\n\nCHECKING LAST USES for block %u, liveout=", block->bbNum);
2152 dumpConvertedVarSet(compiler, block->bbLiveOut);
2153 JITDUMP("\n==============================\n");
2156 unsigned keepAliveVarNum = BAD_VAR_NUM;
2157 if (compiler->lvaKeepAliveAndReportThis())
2159 keepAliveVarNum = compiler->info.compThisArg;
2160 assert(compiler->info.compIsStatic == false);
2163 // find which uses are lastUses
2165 // Work backwards starting with live out.
2166 // 'computedLive' is updated to include any exposed use (including those in this
2167 // block that we've already seen). When we encounter a use, if it's
2168 // not in that set, then it's a last use.
2170 VARSET_TP computedLive(VarSetOps::MakeCopy(compiler, block->bbLiveOut));
2172 bool foundDiff = false;
2173 RefPositionReverseIterator reverseIterator = refPositions.rbegin();
2174 RefPosition* currentRefPosition;
2175 for (currentRefPosition = &reverseIterator; currentRefPosition->refType != RefTypeBB;
2176 reverseIterator++, currentRefPosition = &reverseIterator)
2178 // We should never see ParamDefs or ZeroInits within a basic block.
2179 assert(currentRefPosition->refType != RefTypeParamDef && currentRefPosition->refType != RefTypeZeroInit);
2180 if (currentRefPosition->isIntervalRef() && currentRefPosition->getInterval()->isLocalVar)
2182 unsigned varNum = currentRefPosition->getInterval()->varNum;
2183 unsigned varIndex = currentRefPosition->getInterval()->getVarIndex(compiler);
2185 LsraLocation loc = currentRefPosition->nodeLocation;
2187 // We should always have a tree node for a localVar, except for the "special" RefPositions.
2188 GenTree* tree = currentRefPosition->treeNode;
2189 assert(tree != nullptr || currentRefPosition->refType == RefTypeExpUse ||
2190 currentRefPosition->refType == RefTypeDummyDef);
2192 if (!VarSetOps::IsMember(compiler, computedLive, varIndex) && varNum != keepAliveVarNum)
2194 // There was no exposed use, so this is a "last use" (and we mark it thus even if it's a def)
2196 if (extendLifetimes())
2198 // NOTE: this is a bit of a hack. When extending lifetimes, the "last use" bit will be clear.
2199 // This bit, however, would normally be used during resolveLocalRef to set the value of
2200 // GTF_VAR_DEATH on the node for a ref position. If this bit is not set correctly even when
2201 // extending lifetimes, the code generator will assert as it expects to have accurate last
2202 // use information. To avoid these asserts, set the GTF_VAR_DEATH bit here.
2203 // Note also that extendLifetimes() is an LSRA stress mode, so it will only be true for
2204 // Checked or Debug builds, for which this method will be executed.
2205 if (tree != nullptr)
2207 tree->gtFlags |= GTF_VAR_DEATH;
2210 else if (!currentRefPosition->lastUse)
2212 JITDUMP("missing expected last use of V%02u @%u\n", compiler->lvaTrackedToVarNum[varIndex], loc);
2215 VarSetOps::AddElemD(compiler, computedLive, varIndex);
2217 else if (currentRefPosition->lastUse)
2219 JITDUMP("unexpected last use of V%02u @%u\n", compiler->lvaTrackedToVarNum[varIndex], loc);
2222 else if (extendLifetimes() && tree != nullptr)
2224 // NOTE: see the comment above re: the extendLifetimes hack.
2225 tree->gtFlags &= ~GTF_VAR_DEATH;
2228 if (currentRefPosition->refType == RefTypeDef || currentRefPosition->refType == RefTypeDummyDef)
2230 VarSetOps::RemoveElemD(compiler, computedLive, varIndex);
2234 assert(reverseIterator != refPositions.rend());
2237 VARSET_TP liveInNotComputedLive(VarSetOps::Diff(compiler, block->bbLiveIn, computedLive));
2239 VarSetOps::Iter liveInNotComputedLiveIter(compiler, liveInNotComputedLive);
2240 unsigned liveInNotComputedLiveIndex = 0;
2241 while (liveInNotComputedLiveIter.NextElem(&liveInNotComputedLiveIndex))
2243 unsigned varNum = compiler->lvaTrackedToVarNum[liveInNotComputedLiveIndex];
2244 if (compiler->lvaTable[varNum].lvLRACandidate)
2246 JITDUMP("BB%02u: V%02u is in LiveIn set, but not computed live.\n", block->bbNum, varNum);
2251 VarSetOps::DiffD(compiler, computedLive, block->bbLiveIn);
2252 const VARSET_TP& computedLiveNotLiveIn(computedLive); // reuse the buffer.
2253 VarSetOps::Iter computedLiveNotLiveInIter(compiler, computedLiveNotLiveIn);
2254 unsigned computedLiveNotLiveInIndex = 0;
2255 while (computedLiveNotLiveInIter.NextElem(&computedLiveNotLiveInIndex))
2257 unsigned varNum = compiler->lvaTrackedToVarNum[computedLiveNotLiveInIndex];
2258 if (compiler->lvaTable[varNum].lvLRACandidate)
2260 JITDUMP("BB%02u: V%02u is computed live, but not in LiveIn set.\n", block->bbNum, varNum);
2269 //------------------------------------------------------------------------
2270 // findPredBlockForLiveIn: Determine which block should be used for the register locations of the live-in variables.
2273 // block - The block for which we're selecting a predecesor.
2274 // prevBlock - The previous block in in allocation order.
2275 // pPredBlockIsAllocated - A debug-only argument that indicates whether any of the predecessors have been seen
2276 // in allocation order.
2279 // The selected predecessor.
2282 // in DEBUG, caller initializes *pPredBlockIsAllocated to false, and it will be set to true if the block
2283 // returned is in fact a predecessor.
2286 // This will select a predecessor based on the heuristics obtained by getLsraBlockBoundaryLocations(), which can be
2288 // LSRA_BLOCK_BOUNDARY_PRED - Use the register locations of a predecessor block (default)
2289 // LSRA_BLOCK_BOUNDARY_LAYOUT - Use the register locations of the previous block in layout order.
2290 // This is the only case where this actually returns a different block.
2291 // LSRA_BLOCK_BOUNDARY_ROTATE - Rotate the register locations from a predecessor.
2292 // For this case, the block returned is the same as for LSRA_BLOCK_BOUNDARY_PRED, but
2293 // the register locations will be "rotated" to stress the resolution and allocation
2296 BasicBlock* LinearScan::findPredBlockForLiveIn(BasicBlock* block,
2297 BasicBlock* prevBlock DEBUGARG(bool* pPredBlockIsAllocated))
2299 BasicBlock* predBlock = nullptr;
2301 assert(*pPredBlockIsAllocated == false);
2302 if (getLsraBlockBoundaryLocations() == LSRA_BLOCK_BOUNDARY_LAYOUT)
2304 if (prevBlock != nullptr)
2306 predBlock = prevBlock;
2311 if (block != compiler->fgFirstBB)
2313 predBlock = block->GetUniquePred(compiler);
2314 if (predBlock != nullptr)
2316 if (isBlockVisited(predBlock))
2318 if (predBlock->bbJumpKind == BBJ_COND)
2320 // Special handling to improve matching on backedges.
2321 BasicBlock* otherBlock = (block == predBlock->bbNext) ? predBlock->bbJumpDest : predBlock->bbNext;
2322 noway_assert(otherBlock != nullptr);
2323 if (isBlockVisited(otherBlock))
2325 // This is the case when we have a conditional branch where one target has already
2326 // been visited. It would be best to use the same incoming regs as that block,
2327 // so that we have less likelihood of having to move registers.
2328 // For example, in determining the block to use for the starting register locations for
2329 // "block" in the following example, we'd like to use the same predecessor for "block"
2330 // as for "otherBlock", so that both successors of predBlock have the same locations, reducing
2331 // the likelihood of needing a split block on a backedge:
2342 for (flowList* pred = otherBlock->bbPreds; pred != nullptr; pred = pred->flNext)
2344 BasicBlock* otherPred = pred->flBlock;
2345 if (otherPred->bbNum == blockInfo[otherBlock->bbNum].predBBNum)
2347 predBlock = otherPred;
2356 predBlock = nullptr;
2361 for (flowList* pred = block->bbPreds; pred != nullptr; pred = pred->flNext)
2363 BasicBlock* candidatePredBlock = pred->flBlock;
2364 if (isBlockVisited(candidatePredBlock))
2366 if (predBlock == nullptr || predBlock->bbWeight < candidatePredBlock->bbWeight)
2368 predBlock = candidatePredBlock;
2369 INDEBUG(*pPredBlockIsAllocated = true;)
2374 if (predBlock == nullptr)
2376 predBlock = prevBlock;
2377 assert(predBlock != nullptr);
2378 JITDUMP("\n\nNo allocated predecessor; ");
2385 void LinearScan::dumpVarRefPositions(const char* title)
2387 if (enregisterLocalVars)
2389 printf("\nVAR REFPOSITIONS %s\n", title);
2391 for (unsigned i = 0; i < compiler->lvaCount; i++)
2393 printf("--- V%02u\n", i);
2395 LclVarDsc* varDsc = compiler->lvaTable + i;
2396 if (varDsc->lvIsRegCandidate())
2398 Interval* interval = getIntervalForLocalVar(varDsc->lvVarIndex);
2399 for (RefPosition* ref = interval->firstRefPosition; ref != nullptr; ref = ref->nextRefPosition)
2411 // Set the default rpFrameType based upon codeGen->isFramePointerRequired()
2412 // This was lifted from the register predictor
2414 void LinearScan::setFrameType()
2416 FrameType frameType = FT_NOT_SET;
2418 compiler->codeGen->setDoubleAlign(false);
2421 frameType = FT_DOUBLE_ALIGN_FRAME;
2422 compiler->codeGen->setDoubleAlign(true);
2425 #endif // DOUBLE_ALIGN
2426 if (compiler->codeGen->isFramePointerRequired())
2428 frameType = FT_EBP_FRAME;
2432 if (compiler->rpMustCreateEBPCalled == false)
2437 compiler->rpMustCreateEBPCalled = true;
2438 if (compiler->rpMustCreateEBPFrame(INDEBUG(&reason)))
2440 JITDUMP("; Decided to create an EBP based frame for ETW stackwalking (%s)\n", reason);
2441 compiler->codeGen->setFrameRequired(true);
2445 if (compiler->codeGen->isFrameRequired())
2447 frameType = FT_EBP_FRAME;
2451 frameType = FT_ESP_FRAME;
2458 noway_assert(!compiler->codeGen->isFramePointerRequired());
2459 noway_assert(!compiler->codeGen->isFrameRequired());
2460 compiler->codeGen->setFramePointerUsed(false);
2463 compiler->codeGen->setFramePointerUsed(true);
2466 case FT_DOUBLE_ALIGN_FRAME:
2467 noway_assert(!compiler->codeGen->isFramePointerRequired());
2468 compiler->codeGen->setFramePointerUsed(false);
2470 #endif // DOUBLE_ALIGN
2472 noway_assert(!"rpFrameType not set correctly!");
2476 // If we are using FPBASE as the frame register, we cannot also use it for
2477 // a local var. Note that we may have already added it to the register masks,
2478 // which are computed when the LinearScan class constructor is created, and
2479 // used during lowering. Luckily, the TreeNodeInfo only stores an index to
2480 // the masks stored in the LinearScan class, so we only need to walk the
2481 // unique masks and remove FPBASE.
2482 if (frameType == FT_EBP_FRAME)
2484 if ((availableIntRegs & RBM_FPBASE) != 0)
2486 RemoveRegisterFromMasks(REG_FPBASE);
2488 // We know that we're already in "read mode" for availableIntRegs. However,
2489 // we need to remove the FPBASE register, so subsequent users (like callers
2490 // to allRegs()) get the right thing. The RemoveRegisterFromMasks() code
2491 // fixes up everything that already took a dependency on the value that was
2492 // previously read, so this completes the picture.
2493 availableIntRegs.OverrideAssign(availableIntRegs & ~RBM_FPBASE);
2497 compiler->rpFrameType = frameType;
2500 //------------------------------------------------------------------------
2501 // copyOrMoveRegInUse: Is 'ref' a copyReg/moveReg that is still busy at the given location?
2504 // ref: The RefPosition of interest
2505 // loc: The LsraLocation at which we're determining whether it's busy.
2508 // true iff 'ref' is active at the given location
2510 bool copyOrMoveRegInUse(RefPosition* ref, LsraLocation loc)
2512 if (!ref->copyReg && !ref->moveReg)
2516 if (ref->getRefEndLocation() >= loc)
2520 Interval* interval = ref->getInterval();
2521 RefPosition* nextRef = interval->getNextRefPosition();
2522 if (nextRef != nullptr && nextRef->treeNode == ref->treeNode && nextRef->getRefEndLocation() >= loc)
2529 // Determine whether the register represented by "physRegRecord" is available at least
2530 // at the "currentLoc", and if so, return the next location at which it is in use in
2531 // "nextRefLocationPtr"
2533 bool LinearScan::registerIsAvailable(RegRecord* physRegRecord,
2534 LsraLocation currentLoc,
2535 LsraLocation* nextRefLocationPtr,
2536 RegisterType regType)
2538 *nextRefLocationPtr = MaxLocation;
2539 LsraLocation nextRefLocation = MaxLocation;
2540 regMaskTP regMask = genRegMask(physRegRecord->regNum);
2541 if (physRegRecord->isBusyUntilNextKill)
2546 RefPosition* nextPhysReference = physRegRecord->getNextRefPosition();
2547 if (nextPhysReference != nullptr)
2549 nextRefLocation = nextPhysReference->nodeLocation;
2550 // if (nextPhysReference->refType == RefTypeFixedReg) nextRefLocation--;
2552 else if (!physRegRecord->isCalleeSave)
2554 nextRefLocation = MaxLocation - 1;
2557 Interval* assignedInterval = physRegRecord->assignedInterval;
2559 if (assignedInterval != nullptr)
2561 RefPosition* recentReference = assignedInterval->recentRefPosition;
2563 // The only case where we have an assignedInterval, but recentReference is null
2564 // is where this interval is live at procedure entry (i.e. an arg register), in which
2565 // case it's still live and its assigned register is not available
2566 // (Note that the ParamDef will be recorded as a recentReference when we encounter
2567 // it, but we will be allocating registers, potentially to other incoming parameters,
2568 // as we process the ParamDefs.)
2570 if (recentReference == nullptr)
2575 // Is this a copyReg/moveReg? It is if the register assignment doesn't match.
2576 // (the recentReference may not be a copyReg/moveReg, because we could have seen another
2577 // reference since the copyReg/moveReg)
2579 if (!assignedInterval->isAssignedTo(physRegRecord->regNum))
2581 // If the recentReference is for a different register, it can be reassigned, but
2582 // otherwise don't reassign it if it's still in use.
2583 // (Note that it is unlikely that we have a recent copy or move to a different register,
2584 // where this physRegRecord is still pointing at an earlier copy or move, but it is possible,
2585 // especially in stress modes.)
2586 if ((recentReference->registerAssignment == regMask) && copyOrMoveRegInUse(recentReference, currentLoc))
2591 else if (!assignedInterval->isActive && assignedInterval->isConstant)
2593 // Treat this as unassigned, i.e. do nothing.
2594 // TODO-CQ: Consider adjusting the heuristics (probably in the caller of this method)
2595 // to avoid reusing these registers.
2597 // If this interval isn't active, it's available if it isn't referenced
2598 // at this location (or the previous location, if the recent RefPosition
2599 // is a delayRegFree).
2600 else if (!assignedInterval->isActive &&
2601 (recentReference->refType == RefTypeExpUse || recentReference->getRefEndLocation() < currentLoc))
2603 // This interval must have a next reference (otherwise it wouldn't be assigned to this register)
2604 RefPosition* nextReference = recentReference->nextRefPosition;
2605 if (nextReference != nullptr)
2607 if (nextReference->nodeLocation < nextRefLocation)
2609 nextRefLocation = nextReference->nodeLocation;
2614 assert(recentReference->copyReg && recentReference->registerAssignment != regMask);
2622 if (nextRefLocation < *nextRefLocationPtr)
2624 *nextRefLocationPtr = nextRefLocation;
2628 if (regType == TYP_DOUBLE)
2630 // Recurse, but check the other half this time (TYP_FLOAT)
2631 if (!registerIsAvailable(findAnotherHalfRegRec(physRegRecord), currentLoc, nextRefLocationPtr, TYP_FLOAT))
2633 nextRefLocation = *nextRefLocationPtr;
2635 #endif // _TARGET_ARM_
2637 return (nextRefLocation >= currentLoc);
2640 //------------------------------------------------------------------------
2641 // getRegisterType: Get the RegisterType to use for the given RefPosition
2644 // currentInterval: The interval for the current allocation
2645 // refPosition: The RefPosition of the current Interval for which a register is being allocated
2648 // The RegisterType that should be allocated for this RefPosition
2651 // This will nearly always be identical to the registerType of the interval, except in the case
2652 // of SIMD types of 8 bytes (currently only Vector2) when they are passed and returned in integer
2653 // registers, or copied to a return temp.
2654 // This method need only be called in situations where we may be dealing with the register requirements
2655 // of a RefTypeUse RefPosition (i.e. not when we are only looking at the type of an interval, nor when
2656 // we are interested in the "defining" type of the interval). This is because the situation of interest
2657 // only happens at the use (where it must be copied to an integer register).
2659 RegisterType LinearScan::getRegisterType(Interval* currentInterval, RefPosition* refPosition)
2661 assert(refPosition->getInterval() == currentInterval);
2662 RegisterType regType = currentInterval->registerType;
2663 regMaskTP candidates = refPosition->registerAssignment;
2665 assert((candidates & allRegs(regType)) != RBM_NONE);
2669 //------------------------------------------------------------------------
2670 // isMatchingConstant: Check to see whether a given register contains the constant referenced
2671 // by the given RefPosition
2674 // physRegRecord: The RegRecord for the register we're interested in.
2675 // refPosition: The RefPosition for a constant interval.
2678 // True iff the register was defined by an identical constant node as the current interval.
2680 bool LinearScan::isMatchingConstant(RegRecord* physRegRecord, RefPosition* refPosition)
2682 if ((physRegRecord->assignedInterval == nullptr) || !physRegRecord->assignedInterval->isConstant)
2686 noway_assert(refPosition->treeNode != nullptr);
2687 GenTree* otherTreeNode = physRegRecord->assignedInterval->firstRefPosition->treeNode;
2688 noway_assert(otherTreeNode != nullptr);
2690 if (refPosition->treeNode->OperGet() == otherTreeNode->OperGet())
2692 switch (otherTreeNode->OperGet())
2695 if ((refPosition->treeNode->AsIntCon()->IconValue() == otherTreeNode->AsIntCon()->IconValue()) &&
2696 (varTypeGCtype(refPosition->treeNode) == varTypeGCtype(otherTreeNode)))
2698 #ifdef _TARGET_64BIT_
2699 // If the constant is negative, only reuse registers of the same type.
2700 // This is because, on a 64-bit system, we do not sign-extend immediates in registers to
2701 // 64-bits unless they are actually longs, as this requires a longer instruction.
2702 // This doesn't apply to a 32-bit system, on which long values occupy multiple registers.
2703 // (We could sign-extend, but we would have to always sign-extend, because if we reuse more
2704 // than once, we won't have access to the instruction that originally defines the constant).
2705 if ((refPosition->treeNode->TypeGet() == otherTreeNode->TypeGet()) ||
2706 (refPosition->treeNode->AsIntCon()->IconValue() >= 0))
2707 #endif // _TARGET_64BIT_
2715 // For floating point constants, the values must be identical, not simply compare
2716 // equal. So we compare the bits.
2717 if (refPosition->treeNode->AsDblCon()->isBitwiseEqual(otherTreeNode->AsDblCon()) &&
2718 (refPosition->treeNode->TypeGet() == otherTreeNode->TypeGet()))
2731 //------------------------------------------------------------------------
2732 // tryAllocateFreeReg: Find a free register that satisfies the requirements for refPosition,
2733 // and takes into account the preferences for the given Interval
2736 // currentInterval: The interval for the current allocation
2737 // refPosition: The RefPosition of the current Interval for which a register is being allocated
2740 // The regNumber, if any, allocated to the RefPositon. Returns REG_NA if no free register is found.
2743 // TODO-CQ: Consider whether we need to use a different order for tree temps than for vars, as
2746 static const regNumber lsraRegOrder[] = {REG_VAR_ORDER};
2747 const unsigned lsraRegOrderSize = ArrLen(lsraRegOrder);
2748 static const regNumber lsraRegOrderFlt[] = {REG_VAR_ORDER_FLT};
2749 const unsigned lsraRegOrderFltSize = ArrLen(lsraRegOrderFlt);
2751 regNumber LinearScan::tryAllocateFreeReg(Interval* currentInterval, RefPosition* refPosition)
2753 regNumber foundReg = REG_NA;
2755 RegisterType regType = getRegisterType(currentInterval, refPosition);
2756 const regNumber* regOrder;
2757 unsigned regOrderSize;
2758 if (useFloatReg(regType))
2760 regOrder = lsraRegOrderFlt;
2761 regOrderSize = lsraRegOrderFltSize;
2765 regOrder = lsraRegOrder;
2766 regOrderSize = lsraRegOrderSize;
2769 LsraLocation currentLocation = refPosition->nodeLocation;
2770 RefPosition* nextRefPos = refPosition->nextRefPosition;
2771 LsraLocation nextLocation = (nextRefPos == nullptr) ? currentLocation : nextRefPos->nodeLocation;
2772 regMaskTP candidates = refPosition->registerAssignment;
2773 regMaskTP preferences = currentInterval->registerPreferences;
2775 if (RefTypeIsDef(refPosition->refType))
2777 if (currentInterval->hasConflictingDefUse)
2779 resolveConflictingDefAndUse(currentInterval, refPosition);
2780 candidates = refPosition->registerAssignment;
2782 // Otherwise, check for the case of a fixed-reg def of a reg that will be killed before the
2783 // use, or interferes at the point of use (which shouldn't happen, but Lower doesn't mark
2784 // the contained nodes as interfering).
2785 // Note that we may have a ParamDef RefPosition that is marked isFixedRegRef, but which
2786 // has had its registerAssignment changed to no longer be a single register.
2787 else if (refPosition->isFixedRegRef && nextRefPos != nullptr && RefTypeIsUse(nextRefPos->refType) &&
2788 !nextRefPos->isFixedRegRef && genMaxOneBit(refPosition->registerAssignment))
2790 regNumber defReg = refPosition->assignedReg();
2791 RegRecord* defRegRecord = getRegisterRecord(defReg);
2793 RefPosition* currFixedRegRefPosition = defRegRecord->recentRefPosition;
2794 assert(currFixedRegRefPosition != nullptr &&
2795 currFixedRegRefPosition->nodeLocation == refPosition->nodeLocation);
2797 // If there is another fixed reference to this register before the use, change the candidates
2798 // on this RefPosition to include that of nextRefPos.
2799 if (currFixedRegRefPosition->nextRefPosition != nullptr &&
2800 currFixedRegRefPosition->nextRefPosition->nodeLocation <= nextRefPos->getRefEndLocation())
2802 candidates |= nextRefPos->registerAssignment;
2803 if (preferences == refPosition->registerAssignment)
2805 preferences = candidates;
2811 preferences &= candidates;
2812 if (preferences == RBM_NONE)
2814 preferences = candidates;
2816 regMaskTP relatedPreferences = RBM_NONE;
2819 candidates = stressLimitRegs(refPosition, candidates);
2821 assert(candidates != RBM_NONE);
2823 // If the related interval has no further references, it is possible that it is a source of the
2824 // node that produces this interval. However, we don't want to use the relatedInterval for preferencing
2825 // if its next reference is not a new definition (as it either is or will become live).
2826 Interval* relatedInterval = currentInterval->relatedInterval;
2827 if (relatedInterval != nullptr)
2829 RefPosition* nextRelatedRefPosition = relatedInterval->getNextRefPosition();
2830 if (nextRelatedRefPosition != nullptr)
2832 // Don't use the relatedInterval for preferencing if its next reference is not a new definition,
2833 // or if it is only related because they are multi-reg targets of the same node.
2834 if (!RefTypeIsDef(nextRelatedRefPosition->refType) ||
2835 isMultiRegRelated(nextRelatedRefPosition, refPosition->nodeLocation))
2837 relatedInterval = nullptr;
2839 // Is the relatedInterval not assigned and simply a copy to another relatedInterval?
2840 else if ((relatedInterval->assignedReg == nullptr) && (relatedInterval->relatedInterval != nullptr) &&
2841 (nextRelatedRefPosition->nextRefPosition != nullptr) &&
2842 (nextRelatedRefPosition->nextRefPosition->nextRefPosition == nullptr) &&
2843 (nextRelatedRefPosition->nextRefPosition->nodeLocation <
2844 relatedInterval->relatedInterval->getNextRefLocation()))
2846 // The current relatedInterval has only two remaining RefPositions, both of which
2847 // occur prior to the next RefPosition for its relatedInterval.
2848 // It is likely a copy.
2849 relatedInterval = relatedInterval->relatedInterval;
2854 if (relatedInterval != nullptr)
2856 // If the related interval already has an assigned register, then use that
2857 // as the related preference. We'll take the related
2858 // interval preferences into account in the loop over all the registers.
2860 if (relatedInterval->assignedReg != nullptr)
2862 relatedPreferences = genRegMask(relatedInterval->assignedReg->regNum);
2866 relatedPreferences = relatedInterval->registerPreferences;
2870 bool preferCalleeSave = currentInterval->preferCalleeSave;
2872 // For floating point, we want to be less aggressive about using callee-save registers.
2873 // So in that case, we just need to ensure that the current RefPosition is covered.
2874 RefPosition* rangeEndRefPosition;
2875 RefPosition* lastRefPosition = currentInterval->lastRefPosition;
2876 if (useFloatReg(currentInterval->registerType))
2878 rangeEndRefPosition = refPosition;
2882 rangeEndRefPosition = currentInterval->lastRefPosition;
2883 // If we have a relatedInterval that is not currently occupying a register,
2884 // and whose lifetime begins after this one ends,
2885 // we want to try to select a register that will cover its lifetime.
2886 if ((relatedInterval != nullptr) && (relatedInterval->assignedReg == nullptr) &&
2887 (relatedInterval->getNextRefLocation() >= rangeEndRefPosition->nodeLocation))
2889 lastRefPosition = relatedInterval->lastRefPosition;
2890 preferCalleeSave = relatedInterval->preferCalleeSave;
2894 // If this has a delayed use (due to being used in a rmw position of a
2895 // non-commutative operator), its endLocation is delayed until the "def"
2896 // position, which is one location past the use (getRefEndLocation() takes care of this).
2897 LsraLocation rangeEndLocation = rangeEndRefPosition->getRefEndLocation();
2898 LsraLocation lastLocation = lastRefPosition->getRefEndLocation();
2899 regNumber prevReg = REG_NA;
2901 if (currentInterval->assignedReg)
2903 bool useAssignedReg = false;
2904 // This was an interval that was previously allocated to the given
2905 // physical register, and we should try to allocate it to that register
2906 // again, if possible and reasonable.
2907 // Use it preemptively (i.e. before checking other available regs)
2908 // only if it is preferred and available.
2910 RegRecord* regRec = currentInterval->assignedReg;
2911 prevReg = regRec->regNum;
2912 regMaskTP prevRegBit = genRegMask(prevReg);
2914 // Is it in the preferred set of regs?
2915 if ((prevRegBit & preferences) != RBM_NONE)
2917 // Is it currently available?
2918 LsraLocation nextPhysRefLoc;
2919 if (registerIsAvailable(regRec, currentLocation, &nextPhysRefLoc, currentInterval->registerType))
2921 // If the register is next referenced at this location, only use it if
2922 // this has a fixed reg requirement (i.e. this is the reference that caused
2923 // the FixedReg ref to be created)
2925 if (!regRec->conflictingFixedRegReference(refPosition))
2927 useAssignedReg = true;
2933 regNumber foundReg = prevReg;
2934 assignPhysReg(regRec, currentInterval);
2935 refPosition->registerAssignment = genRegMask(foundReg);
2940 // Don't keep trying to allocate to this register
2941 currentInterval->assignedReg = nullptr;
2945 //-------------------------------------------------------------------------
2946 // Register Selection
2948 RegRecord* availablePhysRegInterval = nullptr;
2949 bool unassignInterval = false;
2951 // Each register will receive a score which is the sum of the scoring criteria below.
2952 // These were selected on the assumption that they will have an impact on the "goodness"
2953 // of a register selection, and have been tuned to a certain extent by observing the impact
2954 // of the ordering on asmDiffs. However, there is probably much more room for tuning,
2955 // and perhaps additional criteria.
2957 // These are FLAGS (bits) so that we can easily order them and add them together.
2958 // If the scores are equal, but one covers more of the current interval's range,
2959 // then it wins. Otherwise, the one encountered earlier in the regOrder wins.
2963 VALUE_AVAILABLE = 0x40, // It is a constant value that is already in an acceptable register.
2964 COVERS = 0x20, // It is in the interval's preference set and it covers the entire lifetime.
2965 OWN_PREFERENCE = 0x10, // It is in the preference set of this interval.
2966 COVERS_RELATED = 0x08, // It is in the preference set of the related interval and covers the entire lifetime.
2967 RELATED_PREFERENCE = 0x04, // It is in the preference set of the related interval.
2968 CALLER_CALLEE = 0x02, // It is in the right "set" for the interval (caller or callee-save).
2969 UNASSIGNED = 0x01, // It is not currently assigned to an inactive interval.
2974 // Compute the best possible score so we can stop looping early if we find it.
2975 // TODO-Throughput: At some point we may want to short-circuit the computation of each score, but
2976 // probably not until we've tuned the order of these criteria. At that point,
2977 // we'll need to avoid the short-circuit if we've got a stress option to reverse
2979 int bestPossibleScore = COVERS + UNASSIGNED + OWN_PREFERENCE + CALLER_CALLEE;
2980 if (relatedPreferences != RBM_NONE)
2982 bestPossibleScore |= RELATED_PREFERENCE + COVERS_RELATED;
2985 LsraLocation bestLocation = MinLocation;
2987 // In non-debug builds, this will simply get optimized away
2988 bool reverseSelect = false;
2990 reverseSelect = doReverseSelect();
2993 // An optimization for the common case where there is only one candidate -
2994 // avoid looping over all the other registers
2996 regNumber singleReg = REG_NA;
2998 if (genMaxOneBit(candidates))
3001 singleReg = genRegNumFromMask(candidates);
3002 regOrder = &singleReg;
3005 for (unsigned i = 0; i < regOrderSize && (candidates != RBM_NONE); i++)
3007 regNumber regNum = regOrder[i];
3008 regMaskTP candidateBit = genRegMask(regNum);
3010 if (!(candidates & candidateBit))
3015 candidates &= ~candidateBit;
3017 RegRecord* physRegRecord = getRegisterRecord(regNum);
3020 LsraLocation nextPhysRefLocation = MaxLocation;
3022 // By chance, is this register already holding this interval, as a copyReg or having
3023 // been restored as inactive after a kill?
3024 if (physRegRecord->assignedInterval == currentInterval)
3026 availablePhysRegInterval = physRegRecord;
3027 unassignInterval = false;
3031 // Find the next RefPosition of the physical register
3032 if (!registerIsAvailable(physRegRecord, currentLocation, &nextPhysRefLocation, regType))
3037 // If the register is next referenced at this location, only use it if
3038 // this has a fixed reg requirement (i.e. this is the reference that caused
3039 // the FixedReg ref to be created)
3041 if (physRegRecord->conflictingFixedRegReference(refPosition))
3046 // If this is a definition of a constant interval, check to see if its value is already in this register.
3047 if (currentInterval->isConstant && RefTypeIsDef(refPosition->refType) &&
3048 isMatchingConstant(physRegRecord, refPosition))
3050 score |= VALUE_AVAILABLE;
3053 // If the nextPhysRefLocation is a fixedRef for the rangeEndRefPosition, increment it so that
3054 // we don't think it isn't covering the live range.
3055 // This doesn't handle the case where earlier RefPositions for this Interval are also
3056 // FixedRefs of this regNum, but at least those are only interesting in the case where those
3057 // are "local last uses" of the Interval - otherwise the liveRange would interfere with the reg.
3058 if (nextPhysRefLocation == rangeEndLocation && rangeEndRefPosition->isFixedRefOfReg(regNum))
3060 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_INCREMENT_RANGE_END, currentInterval, regNum));
3061 nextPhysRefLocation++;
3064 if ((candidateBit & preferences) != RBM_NONE)
3066 score |= OWN_PREFERENCE;
3067 if (nextPhysRefLocation > rangeEndLocation)
3072 if (relatedInterval != nullptr && (candidateBit & relatedPreferences) != RBM_NONE)
3074 score |= RELATED_PREFERENCE;
3075 if (nextPhysRefLocation > relatedInterval->lastRefPosition->nodeLocation)
3077 score |= COVERS_RELATED;
3081 // If we had a fixed-reg def of a reg that will be killed before the use, prefer it to any other registers
3082 // with the same score. (Note that we haven't changed the original registerAssignment on the RefPosition).
3083 // Overload the RELATED_PREFERENCE value.
3084 else if (candidateBit == refPosition->registerAssignment)
3086 score |= RELATED_PREFERENCE;
3089 if ((preferCalleeSave && physRegRecord->isCalleeSave) || (!preferCalleeSave && !physRegRecord->isCalleeSave))
3091 score |= CALLER_CALLEE;
3094 // The register is considered unassigned if it has no assignedInterval, OR
3095 // if its next reference is beyond the range of this interval.
3096 if (!isAssigned(physRegRecord, lastLocation ARM_ARG(currentInterval->registerType)))
3098 score |= UNASSIGNED;
3101 bool foundBetterCandidate = false;
3103 if (score > bestScore)
3105 foundBetterCandidate = true;
3107 else if (score == bestScore)
3109 // Prefer a register that covers the range.
3110 if (bestLocation <= lastLocation)
3112 if (nextPhysRefLocation > bestLocation)
3114 foundBetterCandidate = true;
3117 // If both cover the range, prefer a register that is killed sooner (leaving the longer range register
3118 // available). If both cover the range and also getting killed at the same location, prefer the one which
3119 // is same as previous assignment.
3120 else if (nextPhysRefLocation > lastLocation)
3122 if (nextPhysRefLocation < bestLocation)
3124 foundBetterCandidate = true;
3126 else if (nextPhysRefLocation == bestLocation && prevReg == regNum)
3128 foundBetterCandidate = true;
3134 if (doReverseSelect() && bestScore != 0)
3136 foundBetterCandidate = !foundBetterCandidate;
3140 if (foundBetterCandidate)
3142 bestLocation = nextPhysRefLocation;
3143 availablePhysRegInterval = physRegRecord;
3144 unassignInterval = true;
3148 // there is no way we can get a better score so break out
3149 if (!reverseSelect && score == bestPossibleScore && bestLocation == rangeEndLocation + 1)
3155 if (availablePhysRegInterval != nullptr)
3157 if (unassignInterval && isAssigned(availablePhysRegInterval ARM_ARG(currentInterval->registerType)))
3159 Interval* const intervalToUnassign = availablePhysRegInterval->assignedInterval;
3160 unassignPhysReg(availablePhysRegInterval ARM_ARG(currentInterval->registerType));
3162 if ((bestScore & VALUE_AVAILABLE) != 0 && intervalToUnassign != nullptr)
3164 assert(intervalToUnassign->isConstant);
3165 refPosition->treeNode->SetReuseRegVal();
3167 // If we considered this "unassigned" because this interval's lifetime ends before
3168 // the next ref, remember it.
3169 else if ((bestScore & UNASSIGNED) != 0 && intervalToUnassign != nullptr)
3171 updatePreviousInterval(availablePhysRegInterval, intervalToUnassign, intervalToUnassign->registerType);
3176 assert((bestScore & VALUE_AVAILABLE) == 0);
3178 assignPhysReg(availablePhysRegInterval, currentInterval);
3179 foundReg = availablePhysRegInterval->regNum;
3180 regMaskTP foundRegMask = genRegMask(foundReg);
3181 refPosition->registerAssignment = foundRegMask;
3182 if (relatedInterval != nullptr)
3184 relatedInterval->updateRegisterPreferences(foundRegMask);
3191 //------------------------------------------------------------------------
3192 // canSpillReg: Determine whether we can spill physRegRecord
3195 // physRegRecord - reg to spill
3196 // refLocation - Location of RefPosition where this register will be spilled
3197 // recentAssignedRefWeight - Weight of recent assigned RefPosition which will be determined in this function
3198 // farthestRefPosWeight - Current farthestRefPosWeight at allocateBusyReg()
3201 // True - if we can spill physRegRecord
3202 // False - otherwise
3204 // Note: This helper is designed to be used only from allocateBusyReg() and canSpillDoubleReg()
3206 bool LinearScan::canSpillReg(RegRecord* physRegRecord, LsraLocation refLocation, unsigned* recentAssignedRefWeight)
3208 assert(physRegRecord->assignedInterval != nullptr);
3209 RefPosition* recentAssignedRef = physRegRecord->assignedInterval->recentRefPosition;
3211 if (recentAssignedRef != nullptr)
3213 if (isRefPositionActive(recentAssignedRef, refLocation))
3215 // We can't spill a register that's active at the current location
3219 // We don't prefer to spill a register if the weight of recentAssignedRef > weight
3220 // of the spill candidate found so far. We would consider spilling a greater weight
3221 // ref position only if the refPosition being allocated must need a reg.
3222 *recentAssignedRefWeight = getWeight(recentAssignedRef);
3228 //------------------------------------------------------------------------
3229 // canSpillDoubleReg: Determine whether we can spill physRegRecord
3232 // physRegRecord - reg to spill (must be a valid double register)
3233 // refLocation - Location of RefPosition where this register will be spilled
3234 // recentAssignedRefWeight - Weight of recent assigned RefPosition which will be determined in this function
3237 // True - if we can spill physRegRecord
3238 // False - otherwise
3241 // This helper is designed to be used only from allocateBusyReg() and canSpillDoubleReg().
3242 // The recentAssignedRefWeight is not updated if either register cannot be spilled.
3244 bool LinearScan::canSpillDoubleReg(RegRecord* physRegRecord,
3245 LsraLocation refLocation,
3246 unsigned* recentAssignedRefWeight)
3248 assert(genIsValidDoubleReg(physRegRecord->regNum));
3250 unsigned weight = BB_ZERO_WEIGHT;
3251 unsigned weight2 = BB_ZERO_WEIGHT;
3253 RegRecord* physRegRecord2 = findAnotherHalfRegRec(physRegRecord);
3255 if ((physRegRecord->assignedInterval != nullptr) && !canSpillReg(physRegRecord, refLocation, &weight))
3259 if (physRegRecord2->assignedInterval != nullptr)
3261 if (!canSpillReg(physRegRecord2, refLocation, &weight2))
3265 if (weight2 > weight)
3270 *recentAssignedRefWeight = weight;
3276 //------------------------------------------------------------------------
3277 // unassignDoublePhysReg: unassign a double register (pair)
3280 // doubleRegRecord - reg to unassign
3283 // The given RegRecord must be a valid (even numbered) double register.
3285 void LinearScan::unassignDoublePhysReg(RegRecord* doubleRegRecord)
3287 assert(genIsValidDoubleReg(doubleRegRecord->regNum));
3289 RegRecord* doubleRegRecordLo = doubleRegRecord;
3290 RegRecord* doubleRegRecordHi = findAnotherHalfRegRec(doubleRegRecordLo);
3291 // For a double register, we has following four cases.
3292 // Case 1: doubleRegRecLo is assigned to TYP_DOUBLE interval
3293 // Case 2: doubleRegRecLo and doubleRegRecHi are assigned to different TYP_FLOAT intervals
3294 // Case 3: doubelRegRecLo is assgined to TYP_FLOAT interval and doubleRegRecHi is nullptr
3295 // Case 4: doubleRegRecordLo is nullptr, and doubleRegRecordHi is assigned to a TYP_FLOAT interval
3296 if (doubleRegRecordLo->assignedInterval != nullptr)
3298 if (doubleRegRecordLo->assignedInterval->registerType == TYP_DOUBLE)
3300 // Case 1: doubleRegRecLo is assigned to TYP_DOUBLE interval
3301 unassignPhysReg(doubleRegRecordLo, doubleRegRecordLo->assignedInterval->recentRefPosition);
3305 // Case 2: doubleRegRecLo and doubleRegRecHi are assigned to different TYP_FLOAT intervals
3306 // Case 3: doubelRegRecLo is assgined to TYP_FLOAT interval and doubleRegRecHi is nullptr
3307 assert(doubleRegRecordLo->assignedInterval->registerType == TYP_FLOAT);
3308 unassignPhysReg(doubleRegRecordLo, doubleRegRecordLo->assignedInterval->recentRefPosition);
3310 if (doubleRegRecordHi != nullptr)
3312 if (doubleRegRecordHi->assignedInterval != nullptr)
3314 assert(doubleRegRecordHi->assignedInterval->registerType == TYP_FLOAT);
3315 unassignPhysReg(doubleRegRecordHi, doubleRegRecordHi->assignedInterval->recentRefPosition);
3322 // Case 4: doubleRegRecordLo is nullptr, and doubleRegRecordHi is assigned to a TYP_FLOAT interval
3323 assert(doubleRegRecordHi->assignedInterval != nullptr);
3324 assert(doubleRegRecordHi->assignedInterval->registerType == TYP_FLOAT);
3325 unassignPhysReg(doubleRegRecordHi, doubleRegRecordHi->assignedInterval->recentRefPosition);
3329 #endif // _TARGET_ARM_
3331 //------------------------------------------------------------------------
3332 // isRefPositionActive: Determine whether a given RefPosition is active at the given location
3335 // refPosition - the RefPosition of interest
3336 // refLocation - the LsraLocation at which we want to know if it is active
3339 // True - if this RefPosition occurs at the given location, OR
3340 // if it occurs at the previous location and is marked delayRegFree.
3341 // False - otherwise
3343 bool LinearScan::isRefPositionActive(RefPosition* refPosition, LsraLocation refLocation)
3345 return (refPosition->nodeLocation == refLocation ||
3346 ((refPosition->nodeLocation + 1 == refLocation) && refPosition->delayRegFree));
3349 //----------------------------------------------------------------------------------------
3350 // isRegInUse: Test whether regRec is being used at the refPosition
3353 // regRec - A register to be tested
3354 // refPosition - RefPosition where regRec is tested
3357 // True - if regRec is being used
3358 // False - otherwise
3361 // This helper is designed to be used only from allocateBusyReg(), where:
3362 // - This register was *not* found when looking for a free register, and
3363 // - The caller must have already checked for the case where 'refPosition' is a fixed ref
3364 // (asserted at the beginning of this method).
3366 bool LinearScan::isRegInUse(RegRecord* regRec, RefPosition* refPosition)
3368 // We shouldn't reach this check if 'refPosition' is a FixedReg of this register.
3369 assert(!refPosition->isFixedRefOfReg(regRec->regNum));
3370 Interval* assignedInterval = regRec->assignedInterval;
3371 if (assignedInterval != nullptr)
3373 if (!assignedInterval->isActive)
3375 // This can only happen if we have a recentRefPosition active at this location that hasn't yet been freed.
3376 CLANG_FORMAT_COMMENT_ANCHOR;
3378 if (isRefPositionActive(assignedInterval->recentRefPosition, refPosition->nodeLocation))
3385 // In the case of TYP_DOUBLE, we may have the case where 'assignedInterval' is inactive,
3386 // but the other half register is active. If so, it must be have an active recentRefPosition,
3388 if (refPosition->getInterval()->registerType == TYP_DOUBLE)
3390 RegRecord* otherHalfRegRec = findAnotherHalfRegRec(regRec);
3391 if (!otherHalfRegRec->assignedInterval->isActive)
3393 if (isRefPositionActive(otherHalfRegRec->assignedInterval->recentRefPosition,
3394 refPosition->nodeLocation))
3400 assert(!"Unexpected inactive assigned interval in isRegInUse");
3408 assert(!"Unexpected inactive assigned interval in isRegInUse");
3413 RefPosition* nextAssignedRef = assignedInterval->getNextRefPosition();
3415 // We should never spill a register that's occupied by an Interval with its next use at the current
3417 // Normally this won't occur (unless we actually had more uses in a single node than there are registers),
3418 // because we'll always find something with a later nextLocation, but it can happen in stress when
3419 // we have LSRA_SELECT_NEAREST.
3420 if ((nextAssignedRef != nullptr) && isRefPositionActive(nextAssignedRef, refPosition->nodeLocation) &&
3421 nextAssignedRef->RequiresRegister())
3429 //------------------------------------------------------------------------
3430 // isSpillCandidate: Determine if a register is a spill candidate for a given RefPosition.
3433 // current The interval for the current allocation
3434 // refPosition The RefPosition of the current Interval for which a register is being allocated
3435 // physRegRecord The RegRecord for the register we're considering for spill
3436 // nextLocation An out (reference) parameter in which the next use location of the
3437 // given RegRecord will be returned.
3440 // True iff the given register can be spilled to accommodate the given RefPosition.
3442 bool LinearScan::isSpillCandidate(Interval* current,
3443 RefPosition* refPosition,
3444 RegRecord* physRegRecord,
3445 LsraLocation& nextLocation)
3447 regMaskTP candidateBit = genRegMask(physRegRecord->regNum);
3448 LsraLocation refLocation = refPosition->nodeLocation;
3449 if (physRegRecord->isBusyUntilNextKill)
3453 Interval* assignedInterval = physRegRecord->assignedInterval;
3454 if (assignedInterval != nullptr)
3456 nextLocation = assignedInterval->getNextRefLocation();
3459 RegRecord* physRegRecord2 = nullptr;
3460 Interval* assignedInterval2 = nullptr;
3462 // For ARM32, a double occupies a consecutive even/odd pair of float registers.
3463 if (current->registerType == TYP_DOUBLE)
3465 assert(genIsValidDoubleReg(physRegRecord->regNum));
3466 physRegRecord2 = findAnotherHalfRegRec(physRegRecord);
3467 if (physRegRecord2->isBusyUntilNextKill)
3471 assignedInterval2 = physRegRecord2->assignedInterval;
3472 if ((assignedInterval2 != nullptr) && (assignedInterval2->getNextRefLocation() > nextLocation))
3474 nextLocation = assignedInterval2->getNextRefLocation();
3479 // If there is a fixed reference at the same location (and it's not due to this reference),
3481 if (physRegRecord->conflictingFixedRegReference(refPosition))
3486 if (refPosition->isFixedRefOfRegMask(candidateBit))
3489 // - there is a fixed reference due to this node, OR
3490 // - or there is a fixed use fed by a def at this node, OR
3491 // - or we have restricted the set of registers for stress.
3492 // In any case, we must use this register as it's the only candidate
3493 // TODO-CQ: At the time we allocate a register to a fixed-reg def, if it's not going
3494 // to remain live until the use, we should set the candidates to allRegs(regType)
3495 // to avoid a spill - codegen can then insert the copy.
3496 // If this is marked as allocateIfProfitable, the caller will compare the weights
3497 // of this RefPosition and the RefPosition to which it is currently assigned.
3498 assert(refPosition->isFixedRegRef ||
3499 (refPosition->nextRefPosition != nullptr && refPosition->nextRefPosition->isFixedRegRef) ||
3500 candidatesAreStressLimited());
3504 // If this register is not assigned to an interval, either
3505 // - it has a FixedReg reference at the current location that is not this reference, OR
3506 // - this is the special case of a fixed loReg, where this interval has a use at the same location
3507 // In either case, we cannot use it
3508 CLANG_FORMAT_COMMENT_ANCHOR;
3511 if (assignedInterval == nullptr && assignedInterval2 == nullptr)
3513 if (assignedInterval == nullptr)
3516 RefPosition* nextPhysRegPosition = physRegRecord->getNextRefPosition();
3517 #ifdef _TARGET_ARM64_
3518 // On ARM64, we may need to actually allocate IP0 and IP1 in some cases, but we don't include it in
3519 // the allocation order for tryAllocateFreeReg.
3520 if ((physRegRecord->regNum != REG_IP0) && (physRegRecord->regNum != REG_IP1))
3521 #endif // _TARGET_ARM64_
3523 assert((nextPhysRegPosition != nullptr) && (nextPhysRegPosition->nodeLocation == refLocation) &&
3524 (candidateBit != refPosition->registerAssignment));
3529 if (isRegInUse(physRegRecord, refPosition))
3535 if (current->registerType == TYP_DOUBLE)
3537 if (isRegInUse(physRegRecord2, refPosition))
3546 //------------------------------------------------------------------------
3547 // allocateBusyReg: Find a busy register that satisfies the requirements for refPosition,
3548 // and that can be spilled.
3551 // current The interval for the current allocation
3552 // refPosition The RefPosition of the current Interval for which a register is being allocated
3553 // allocateIfProfitable If true, a reg may not be allocated if all other ref positions currently
3554 // occupying registers are more important than the 'refPosition'.
3557 // The regNumber allocated to the RefPositon. Returns REG_NA if no free register is found.
3559 // Note: Currently this routine uses weight and farthest distance of next reference
3560 // to select a ref position for spilling.
3561 // a) if allocateIfProfitable = false
3562 // The ref position chosen for spilling will be the lowest weight
3563 // of all and if there is is more than one ref position with the
3564 // same lowest weight, among them choses the one with farthest
3565 // distance to its next reference.
3567 // b) if allocateIfProfitable = true
3568 // The ref position chosen for spilling will not only be lowest weight
3569 // of all but also has a weight lower than 'refPosition'. If there is
3570 // no such ref position, reg will not be allocated.
3572 regNumber LinearScan::allocateBusyReg(Interval* current, RefPosition* refPosition, bool allocateIfProfitable)
3574 regNumber foundReg = REG_NA;
3576 RegisterType regType = getRegisterType(current, refPosition);
3577 regMaskTP candidates = refPosition->registerAssignment;
3578 regMaskTP preferences = (current->registerPreferences & candidates);
3579 if (preferences == RBM_NONE)
3581 preferences = candidates;
3583 if (candidates == RBM_NONE)
3585 // This assumes only integer and floating point register types
3586 // if we target a processor with additional register types,
3587 // this would have to change
3588 candidates = allRegs(regType);
3592 candidates = stressLimitRegs(refPosition, candidates);
3595 // TODO-CQ: Determine whether/how to take preferences into account in addition to
3596 // prefering the one with the furthest ref position when considering
3597 // a candidate to spill
3598 RegRecord* farthestRefPhysRegRecord = nullptr;
3600 RegRecord* farthestRefPhysRegRecord2 = nullptr;
3602 LsraLocation farthestLocation = MinLocation;
3603 LsraLocation refLocation = refPosition->nodeLocation;
3604 unsigned farthestRefPosWeight;
3605 if (allocateIfProfitable)
3607 // If allocating a reg is optional, we will consider those ref positions
3608 // whose weight is less than 'refPosition' for spilling.
3609 farthestRefPosWeight = getWeight(refPosition);
3613 // If allocating a reg is a must, we start off with max weight so
3614 // that the first spill candidate will be selected based on
3615 // farthest distance alone. Since we start off with farthestLocation
3616 // initialized to MinLocation, the first available ref position
3617 // will be selected as spill candidate and its weight as the
3618 // fathestRefPosWeight.
3619 farthestRefPosWeight = BB_MAX_WEIGHT;
3622 for (regNumber regNum : Registers(regType))
3624 regMaskTP candidateBit = genRegMask(regNum);
3625 if (!(candidates & candidateBit))
3629 RegRecord* physRegRecord = getRegisterRecord(regNum);
3630 RegRecord* physRegRecord2 = nullptr; // only used for _TARGET_ARM_
3631 LsraLocation nextLocation = MinLocation;
3632 LsraLocation physRegNextLocation;
3633 if (!isSpillCandidate(current, refPosition, physRegRecord, nextLocation))
3635 assert(candidates != candidateBit);
3639 // We've passed the preliminary checks for a spill candidate.
3640 // Now, if we have a recentAssignedRef, check that it is going to be OK to spill it.
3641 Interval* assignedInterval = physRegRecord->assignedInterval;
3642 unsigned recentAssignedRefWeight = BB_ZERO_WEIGHT;
3643 RefPosition* recentAssignedRef = nullptr;
3644 RefPosition* recentAssignedRef2 = nullptr;
3646 if (current->registerType == TYP_DOUBLE)
3648 recentAssignedRef = (assignedInterval == nullptr) ? nullptr : assignedInterval->recentRefPosition;
3649 physRegRecord2 = findAnotherHalfRegRec(physRegRecord);
3650 Interval* assignedInterval2 = physRegRecord2->assignedInterval;
3651 recentAssignedRef2 = (assignedInterval2 == nullptr) ? nullptr : assignedInterval2->recentRefPosition;
3652 if (!canSpillDoubleReg(physRegRecord, refLocation, &recentAssignedRefWeight))
3660 recentAssignedRef = assignedInterval->recentRefPosition;
3661 if (!canSpillReg(physRegRecord, refLocation, &recentAssignedRefWeight))
3666 if (recentAssignedRefWeight > farthestRefPosWeight)
3671 physRegNextLocation = physRegRecord->getNextRefLocation();
3672 if (nextLocation > physRegNextLocation)
3674 nextLocation = physRegNextLocation;
3677 bool isBetterLocation;
3680 if (doSelectNearest() && farthestRefPhysRegRecord != nullptr)
3682 isBetterLocation = (nextLocation <= farthestLocation);
3686 // This if-stmt is associated with the above else
3687 if (recentAssignedRefWeight < farthestRefPosWeight)
3689 isBetterLocation = true;
3693 // This would mean the weight of spill ref position we found so far is equal
3694 // to the weight of the ref position that is being evaluated. In this case
3695 // we prefer to spill ref position whose distance to its next reference is
3697 assert(recentAssignedRefWeight == farthestRefPosWeight);
3699 // If allocateIfProfitable=true, the first spill candidate selected
3700 // will be based on weight alone. After we have found a spill
3701 // candidate whose weight is less than the 'refPosition', we will
3702 // consider farthest distance when there is a tie in weights.
3703 // This is to ensure that we don't spill a ref position whose
3704 // weight is equal to weight of 'refPosition'.
3705 if (allocateIfProfitable && farthestRefPhysRegRecord == nullptr)
3707 isBetterLocation = false;
3711 isBetterLocation = (nextLocation > farthestLocation);
3713 if (nextLocation > farthestLocation)
3715 isBetterLocation = true;
3717 else if (nextLocation == farthestLocation)
3719 // Both weight and distance are equal.
3720 // Prefer that ref position which is marked both reload and
3721 // allocate if profitable. These ref positions don't need
3722 // need to be spilled as they are already in memory and
3723 // codegen considers them as contained memory operands.
3724 CLANG_FORMAT_COMMENT_ANCHOR;
3726 // TODO-CQ-ARM: Just conservatively "and" two condition. We may implement better condision later.
3727 isBetterLocation = true;
3728 if (recentAssignedRef != nullptr)
3729 isBetterLocation &= (recentAssignedRef->reload && recentAssignedRef->AllocateIfProfitable());
3731 if (recentAssignedRef2 != nullptr)
3732 isBetterLocation &= (recentAssignedRef2->reload && recentAssignedRef2->AllocateIfProfitable());
3734 isBetterLocation = (recentAssignedRef != nullptr) && recentAssignedRef->reload &&
3735 recentAssignedRef->AllocateIfProfitable();
3740 isBetterLocation = false;
3745 if (isBetterLocation)
3747 farthestLocation = nextLocation;
3748 farthestRefPhysRegRecord = physRegRecord;
3750 farthestRefPhysRegRecord2 = physRegRecord2;
3752 farthestRefPosWeight = recentAssignedRefWeight;
3757 if (allocateIfProfitable)
3759 // There may not be a spill candidate or if one is found
3760 // its weight must be less than the weight of 'refPosition'
3761 assert((farthestRefPhysRegRecord == nullptr) || (farthestRefPosWeight < getWeight(refPosition)));
3765 // Must have found a spill candidate.
3766 assert(farthestRefPhysRegRecord != nullptr);
3768 if (farthestLocation == refLocation)
3770 // This must be a RefPosition that is constrained to use a single register, either directly,
3771 // or at the use, or by stress.
3772 bool isConstrained = (refPosition->isFixedRegRef || (refPosition->nextRefPosition != nullptr &&
3773 refPosition->nextRefPosition->isFixedRegRef) ||
3774 candidatesAreStressLimited());
3778 Interval* assignedInterval =
3779 (farthestRefPhysRegRecord == nullptr) ? nullptr : farthestRefPhysRegRecord->assignedInterval;
3780 Interval* assignedInterval2 =
3781 (farthestRefPhysRegRecord2 == nullptr) ? nullptr : farthestRefPhysRegRecord2->assignedInterval;
3782 RefPosition* nextRefPosition =
3783 (assignedInterval == nullptr) ? nullptr : assignedInterval->getNextRefPosition();
3784 RefPosition* nextRefPosition2 =
3785 (assignedInterval2 == nullptr) ? nullptr : assignedInterval2->getNextRefPosition();
3786 if (nextRefPosition != nullptr)
3788 if (nextRefPosition2 != nullptr)
3790 assert(!nextRefPosition->RequiresRegister() || !nextRefPosition2->RequiresRegister());
3794 assert(!nextRefPosition->RequiresRegister());
3799 assert(nextRefPosition2 != nullptr && !nextRefPosition2->RequiresRegister());
3801 #else // !_TARGET_ARM_
3802 Interval* assignedInterval = farthestRefPhysRegRecord->assignedInterval;
3803 RefPosition* nextRefPosition = assignedInterval->getNextRefPosition();
3804 assert(!nextRefPosition->RequiresRegister());
3805 #endif // !_TARGET_ARM_
3810 assert(farthestLocation > refLocation);
3815 if (farthestRefPhysRegRecord != nullptr)
3817 foundReg = farthestRefPhysRegRecord->regNum;
3820 if (current->registerType == TYP_DOUBLE)
3822 assert(genIsValidDoubleReg(foundReg));
3823 unassignDoublePhysReg(farthestRefPhysRegRecord);
3828 unassignPhysReg(farthestRefPhysRegRecord, farthestRefPhysRegRecord->assignedInterval->recentRefPosition);
3831 assignPhysReg(farthestRefPhysRegRecord, current);
3832 refPosition->registerAssignment = genRegMask(foundReg);
3837 refPosition->registerAssignment = RBM_NONE;
3843 // Grab a register to use to copy and then immediately use.
3844 // This is called only for localVar intervals that already have a register
3845 // assignment that is not compatible with the current RefPosition.
3846 // This is not like regular assignment, because we don't want to change
3847 // any preferences or existing register assignments.
3848 // Prefer a free register that's got the earliest next use.
3849 // Otherwise, spill something with the farthest next use
3851 regNumber LinearScan::assignCopyReg(RefPosition* refPosition)
3853 Interval* currentInterval = refPosition->getInterval();
3854 assert(currentInterval != nullptr);
3855 assert(currentInterval->isActive);
3857 bool foundFreeReg = false;
3858 RegRecord* bestPhysReg = nullptr;
3859 LsraLocation bestLocation = MinLocation;
3860 regMaskTP candidates = refPosition->registerAssignment;
3862 // Save the relatedInterval, if any, so that it doesn't get modified during allocation.
3863 Interval* savedRelatedInterval = currentInterval->relatedInterval;
3864 currentInterval->relatedInterval = nullptr;
3866 // We don't want really want to change the default assignment,
3867 // so 1) pretend this isn't active, and 2) remember the old reg
3868 regNumber oldPhysReg = currentInterval->physReg;
3869 RegRecord* oldRegRecord = currentInterval->assignedReg;
3870 assert(oldRegRecord->regNum == oldPhysReg);
3871 currentInterval->isActive = false;
3873 regNumber allocatedReg = tryAllocateFreeReg(currentInterval, refPosition);
3874 if (allocatedReg == REG_NA)
3876 allocatedReg = allocateBusyReg(currentInterval, refPosition, false);
3879 // Now restore the old info
3880 currentInterval->relatedInterval = savedRelatedInterval;
3881 currentInterval->physReg = oldPhysReg;
3882 currentInterval->assignedReg = oldRegRecord;
3883 currentInterval->isActive = true;
3885 refPosition->copyReg = true;
3886 return allocatedReg;
3889 //------------------------------------------------------------------------
3890 // isAssigned: This is the function to check if the given RegRecord has an assignedInterval
3891 // regardless of lastLocation.
3892 // So it would be call isAssigned() with Maxlocation value.
3895 // regRec - The RegRecord to check that it is assigned.
3896 // newRegType - There are elements to judge according to the upcoming register type.
3899 // Returns true if the given RegRecord has an assignedInterval.
3902 // There is the case to check if the RegRecord has an assignedInterval regardless of Lastlocation.
3904 bool LinearScan::isAssigned(RegRecord* regRec ARM_ARG(RegisterType newRegType))
3906 return isAssigned(regRec, MaxLocation ARM_ARG(newRegType));
3909 //------------------------------------------------------------------------
3910 // isAssigned: Check whether the given RegRecord has an assignedInterval
3911 // that has a reference prior to the given location.
3914 // regRec - The RegRecord of interest
3915 // lastLocation - The LsraLocation up to which we want to check
3916 // newRegType - The `RegisterType` of interval we want to check
3917 // (this is for the purposes of checking the other half of a TYP_DOUBLE RegRecord)
3920 // Returns true if the given RegRecord (and its other half, if TYP_DOUBLE) has an assignedInterval
3921 // that is referenced prior to the given location
3924 // The register is not considered to be assigned if it has no assignedInterval, or that Interval's
3925 // next reference is beyond lastLocation
3927 bool LinearScan::isAssigned(RegRecord* regRec, LsraLocation lastLocation ARM_ARG(RegisterType newRegType))
3929 Interval* assignedInterval = regRec->assignedInterval;
3931 if ((assignedInterval == nullptr) || assignedInterval->getNextRefLocation() > lastLocation)
3934 if (newRegType == TYP_DOUBLE)
3936 RegRecord* anotherRegRec = findAnotherHalfRegRec(regRec);
3938 if ((anotherRegRec->assignedInterval == nullptr) ||
3939 (anotherRegRec->assignedInterval->getNextRefLocation() > lastLocation))
3941 // In case the newRegType is a double register,
3942 // the score would be set UNASSIGNED if another register is also not set.
3956 // Check if the interval is already assigned and if it is then unassign the physical record
3957 // then set the assignedInterval to 'interval'
3959 void LinearScan::checkAndAssignInterval(RegRecord* regRec, Interval* interval)
3961 Interval* assignedInterval = regRec->assignedInterval;
3962 if (assignedInterval != nullptr && assignedInterval != interval)
3964 // This is allocated to another interval. Either it is inactive, or it was allocated as a
3965 // copyReg and is therefore not the "assignedReg" of the other interval. In the latter case,
3966 // we simply unassign it - in the former case we need to set the physReg on the interval to
3967 // REG_NA to indicate that it is no longer in that register.
3968 // The lack of checking for this case resulted in an assert in the retail version of System.dll,
3969 // in method SerialStream.GetDcbFlag.
3970 // Note that we can't check for the copyReg case, because we may have seen a more recent
3971 // RefPosition for the Interval that was NOT a copyReg.
3972 if (assignedInterval->assignedReg == regRec)
3974 assert(assignedInterval->isActive == false);
3975 assignedInterval->physReg = REG_NA;
3977 unassignPhysReg(regRec->regNum);
3980 // If 'interval' and 'assignedInterval' were both TYP_DOUBLE, then we have unassigned 'assignedInterval'
3981 // from both halves. Otherwise, if 'interval' is TYP_DOUBLE, we now need to unassign the other half.
3982 if ((interval->registerType == TYP_DOUBLE) &&
3983 ((assignedInterval == nullptr) || (assignedInterval->registerType == TYP_FLOAT)))
3985 RegRecord* otherRegRecord = getSecondHalfRegRec(regRec);
3986 assignedInterval = otherRegRecord->assignedInterval;
3987 if (assignedInterval != nullptr && assignedInterval != interval)
3989 if (assignedInterval->assignedReg == otherRegRecord)
3991 assert(assignedInterval->isActive == false);
3992 assignedInterval->physReg = REG_NA;
3994 unassignPhysReg(otherRegRecord->regNum);
3999 updateAssignedInterval(regRec, interval, interval->registerType);
4002 // Assign the given physical register interval to the given interval
4003 void LinearScan::assignPhysReg(RegRecord* regRec, Interval* interval)
4005 regMaskTP assignedRegMask = genRegMask(regRec->regNum);
4006 compiler->codeGen->regSet.rsSetRegsModified(assignedRegMask DEBUGARG(true));
4008 checkAndAssignInterval(regRec, interval);
4009 interval->assignedReg = regRec;
4011 interval->physReg = regRec->regNum;
4012 interval->isActive = true;
4013 if (interval->isLocalVar)
4015 // Prefer this register for future references
4016 interval->updateRegisterPreferences(assignedRegMask);
4020 //------------------------------------------------------------------------
4021 // setIntervalAsSplit: Set this Interval as being split
4024 // interval - The Interval which is being split
4030 // The given Interval will be marked as split, and it will be added to the
4031 // set of splitOrSpilledVars.
4034 // "interval" must be a lclVar interval, as tree temps are never split.
4035 // This is asserted in the call to getVarIndex().
4037 void LinearScan::setIntervalAsSplit(Interval* interval)
4039 if (interval->isLocalVar)
4041 unsigned varIndex = interval->getVarIndex(compiler);
4042 if (!interval->isSplit)
4044 VarSetOps::AddElemD(compiler, splitOrSpilledVars, varIndex);
4048 assert(VarSetOps::IsMember(compiler, splitOrSpilledVars, varIndex));
4051 interval->isSplit = true;
4054 //------------------------------------------------------------------------
4055 // setIntervalAsSpilled: Set this Interval as being spilled
4058 // interval - The Interval which is being spilled
4064 // The given Interval will be marked as spilled, and it will be added
4065 // to the set of splitOrSpilledVars.
4067 void LinearScan::setIntervalAsSpilled(Interval* interval)
4069 if (interval->isLocalVar)
4071 unsigned varIndex = interval->getVarIndex(compiler);
4072 if (!interval->isSpilled)
4074 VarSetOps::AddElemD(compiler, splitOrSpilledVars, varIndex);
4078 assert(VarSetOps::IsMember(compiler, splitOrSpilledVars, varIndex));
4081 interval->isSpilled = true;
4084 //------------------------------------------------------------------------
4085 // spill: Spill this Interval between "fromRefPosition" and "toRefPosition"
4088 // fromRefPosition - The RefPosition at which the Interval is to be spilled
4089 // toRefPosition - The RefPosition at which it must be reloaded
4095 // fromRefPosition and toRefPosition must not be null
4097 void LinearScan::spillInterval(Interval* interval, RefPosition* fromRefPosition, RefPosition* toRefPosition)
4099 assert(fromRefPosition != nullptr && toRefPosition != nullptr);
4100 assert(fromRefPosition->getInterval() == interval && toRefPosition->getInterval() == interval);
4101 assert(fromRefPosition->nextRefPosition == toRefPosition);
4103 if (!fromRefPosition->lastUse)
4105 // If not allocated a register, Lcl var def/use ref positions even if reg optional
4106 // should be marked as spillAfter.
4107 if (!fromRefPosition->RequiresRegister() && !(interval->isLocalVar && fromRefPosition->IsActualRef()))
4109 fromRefPosition->registerAssignment = RBM_NONE;
4113 fromRefPosition->spillAfter = true;
4116 assert(toRefPosition != nullptr);
4121 dumpLsraAllocationEvent(LSRA_EVENT_SPILL, interval);
4125 INTRACK_STATS(updateLsraStat(LSRA_STAT_SPILL, fromRefPosition->bbNum));
4127 interval->isActive = false;
4128 setIntervalAsSpilled(interval);
4130 // If fromRefPosition occurs before the beginning of this block, mark this as living in the stack
4131 // on entry to this block.
4132 if (fromRefPosition->nodeLocation <= curBBStartLocation)
4134 // This must be a lclVar interval
4135 assert(interval->isLocalVar);
4136 setInVarRegForBB(curBBNum, interval->varNum, REG_STK);
4140 //------------------------------------------------------------------------
4141 // unassignPhysRegNoSpill: Unassign the given physical register record from
4142 // an active interval, without spilling.
4145 // regRec - the RegRecord to be unasssigned
4151 // The assignedInterval must not be null, and must be active.
4154 // This method is used to unassign a register when an interval needs to be moved to a
4155 // different register, but not (yet) spilled.
4157 void LinearScan::unassignPhysRegNoSpill(RegRecord* regRec)
4159 Interval* assignedInterval = regRec->assignedInterval;
4160 assert(assignedInterval != nullptr && assignedInterval->isActive);
4161 assignedInterval->isActive = false;
4162 unassignPhysReg(regRec, nullptr);
4163 assignedInterval->isActive = true;
4166 //------------------------------------------------------------------------
4167 // checkAndClearInterval: Clear the assignedInterval for the given
4168 // physical register record
4171 // regRec - the physical RegRecord to be unasssigned
4172 // spillRefPosition - The RefPosition at which the assignedInterval is to be spilled
4173 // or nullptr if we aren't spilling
4179 // see unassignPhysReg
4181 void LinearScan::checkAndClearInterval(RegRecord* regRec, RefPosition* spillRefPosition)
4183 Interval* assignedInterval = regRec->assignedInterval;
4184 assert(assignedInterval != nullptr);
4185 regNumber thisRegNum = regRec->regNum;
4187 if (spillRefPosition == nullptr)
4189 // Note that we can't assert for the copyReg case
4191 if (assignedInterval->physReg == thisRegNum)
4193 assert(assignedInterval->isActive == false);
4198 assert(spillRefPosition->getInterval() == assignedInterval);
4201 updateAssignedInterval(regRec, nullptr, assignedInterval->registerType);
4204 //------------------------------------------------------------------------
4205 // unassignPhysReg: Unassign the given physical register record, and spill the
4206 // assignedInterval at the given spillRefPosition, if any.
4209 // regRec - The RegRecord to be unasssigned
4210 // newRegType - The RegisterType of interval that would be assigned
4216 // On ARM architecture, Intervals have to be unassigned considering
4217 // with the register type of interval that would be assigned.
4219 void LinearScan::unassignPhysReg(RegRecord* regRec ARM_ARG(RegisterType newRegType))
4221 RegRecord* regRecToUnassign = regRec;
4223 RegRecord* anotherRegRec = nullptr;
4225 if ((regRecToUnassign->assignedInterval != nullptr) &&
4226 (regRecToUnassign->assignedInterval->registerType == TYP_DOUBLE))
4228 // If the register type of interval(being unassigned or new) is TYP_DOUBLE,
4229 // It should have to be valid double register (even register)
4230 if (!genIsValidDoubleReg(regRecToUnassign->regNum))
4232 regRecToUnassign = findAnotherHalfRegRec(regRec);
4237 if (newRegType == TYP_DOUBLE)
4239 anotherRegRec = findAnotherHalfRegRec(regRecToUnassign);
4244 if (regRecToUnassign->assignedInterval != nullptr)
4246 unassignPhysReg(regRecToUnassign, regRecToUnassign->assignedInterval->recentRefPosition);
4249 if ((anotherRegRec != nullptr) && (anotherRegRec->assignedInterval != nullptr))
4251 unassignPhysReg(anotherRegRec, anotherRegRec->assignedInterval->recentRefPosition);
4256 //------------------------------------------------------------------------
4257 // unassignPhysReg: Unassign the given physical register record, and spill the
4258 // assignedInterval at the given spillRefPosition, if any.
4261 // regRec - the RegRecord to be unasssigned
4262 // spillRefPosition - The RefPosition at which the assignedInterval is to be spilled
4268 // The assignedInterval must not be null.
4269 // If spillRefPosition is null, the assignedInterval must be inactive, or not currently
4270 // assigned to this register (e.g. this is a copyReg for that Interval).
4271 // Otherwise, spillRefPosition must be associated with the assignedInterval.
4273 void LinearScan::unassignPhysReg(RegRecord* regRec, RefPosition* spillRefPosition)
4275 Interval* assignedInterval = regRec->assignedInterval;
4276 assert(assignedInterval != nullptr);
4277 regNumber thisRegNum = regRec->regNum;
4279 // Is assignedInterval actually still assigned to this register?
4280 bool intervalIsAssigned = (assignedInterval->physReg == thisRegNum);
4283 RegRecord* anotherRegRec = nullptr;
4285 // Prepare second half RegRecord of a double register for TYP_DOUBLE
4286 if (assignedInterval->registerType == TYP_DOUBLE)
4288 assert(isFloatRegType(regRec->registerType));
4290 anotherRegRec = findAnotherHalfRegRec(regRec);
4292 // Both two RegRecords should have been assigned to the same interval.
4293 assert(assignedInterval == anotherRegRec->assignedInterval);
4294 if (!intervalIsAssigned && (assignedInterval->physReg == anotherRegRec->regNum))
4296 intervalIsAssigned = true;
4299 #endif // _TARGET_ARM_
4301 checkAndClearInterval(regRec, spillRefPosition);
4304 if (assignedInterval->registerType == TYP_DOUBLE)
4306 // Both two RegRecords should have been unassigned together.
4307 assert(regRec->assignedInterval == nullptr);
4308 assert(anotherRegRec->assignedInterval == nullptr);
4310 #endif // _TARGET_ARM_
4312 RefPosition* nextRefPosition = nullptr;
4313 if (spillRefPosition != nullptr)
4315 nextRefPosition = spillRefPosition->nextRefPosition;
4318 if (!intervalIsAssigned && assignedInterval->physReg != REG_NA)
4320 // This must have been a temporary copy reg, but we can't assert that because there
4321 // may have been intervening RefPositions that were not copyRegs.
4323 // reg->assignedInterval has already been set to nullptr by checkAndClearInterval()
4324 assert(regRec->assignedInterval == nullptr);
4328 regNumber victimAssignedReg = assignedInterval->physReg;
4329 assignedInterval->physReg = REG_NA;
4331 bool spill = assignedInterval->isActive && nextRefPosition != nullptr;
4334 // If this is an active interval, it must have a recentRefPosition,
4335 // otherwise it would not be active
4336 assert(spillRefPosition != nullptr);
4339 // TODO-CQ: Enable this and insert an explicit GT_COPY (otherwise there's no way to communicate
4340 // to codegen that we want the copyReg to be the new home location).
4341 // If the last reference was a copyReg, and we're spilling the register
4342 // it was copied from, then make the copyReg the new primary location
4344 if (spillRefPosition->copyReg)
4346 regNumber copyFromRegNum = victimAssignedReg;
4347 regNumber copyRegNum = genRegNumFromMask(spillRefPosition->registerAssignment);
4348 if (copyFromRegNum == thisRegNum &&
4349 getRegisterRecord(copyRegNum)->assignedInterval == assignedInterval)
4351 assert(copyRegNum != thisRegNum);
4352 assignedInterval->physReg = copyRegNum;
4353 assignedInterval->assignedReg = this->getRegisterRecord(copyRegNum);
4359 // With JitStressRegs == 0x80 (LSRA_EXTEND_LIFETIMES), we may have a RefPosition
4360 // that is not marked lastUse even though the treeNode is a lastUse. In that case
4361 // we must not mark it for spill because the register will have been immediately freed
4362 // after use. While we could conceivably add special handling for this case in codegen,
4363 // it would be messy and undesirably cause the "bleeding" of LSRA stress modes outside
4365 if (extendLifetimes() && assignedInterval->isLocalVar && RefTypeIsUse(spillRefPosition->refType) &&
4366 spillRefPosition->treeNode != nullptr && (spillRefPosition->treeNode->gtFlags & GTF_VAR_DEATH) != 0)
4368 dumpLsraAllocationEvent(LSRA_EVENT_SPILL_EXTENDED_LIFETIME, assignedInterval);
4369 assignedInterval->isActive = false;
4371 // If the spillRefPosition occurs before the beginning of this block, it will have
4372 // been marked as living in this register on entry to this block, but we now need
4373 // to mark this as living on the stack.
4374 if (spillRefPosition->nodeLocation <= curBBStartLocation)
4376 setInVarRegForBB(curBBNum, assignedInterval->varNum, REG_STK);
4377 if (spillRefPosition->nextRefPosition != nullptr)
4379 setIntervalAsSpilled(assignedInterval);
4384 // Otherwise, we need to mark spillRefPosition as lastUse, or the interval
4385 // will remain active beyond its allocated range during the resolution phase.
4386 spillRefPosition->lastUse = true;
4392 spillInterval(assignedInterval, spillRefPosition, nextRefPosition);
4395 // Maintain the association with the interval, if it has more references.
4396 // Or, if we "remembered" an interval assigned to this register, restore it.
4397 if (nextRefPosition != nullptr)
4399 assignedInterval->assignedReg = regRec;
4401 else if (canRestorePreviousInterval(regRec, assignedInterval))
4403 regRec->assignedInterval = regRec->previousInterval;
4404 regRec->previousInterval = nullptr;
4408 // We can not use updateAssignedInterval() and updatePreviousInterval() here,
4409 // because regRec may not be a even-numbered float register.
4411 // Update second half RegRecord of a double register for TYP_DOUBLE
4412 if (regRec->assignedInterval->registerType == TYP_DOUBLE)
4414 RegRecord* anotherHalfRegRec = findAnotherHalfRegRec(regRec);
4416 anotherHalfRegRec->assignedInterval = regRec->assignedInterval;
4417 anotherHalfRegRec->previousInterval = nullptr;
4419 #endif // _TARGET_ARM_
4424 dumpLsraAllocationEvent(LSRA_EVENT_RESTORE_PREVIOUS_INTERVAL_AFTER_SPILL, regRec->assignedInterval,
4429 dumpLsraAllocationEvent(LSRA_EVENT_RESTORE_PREVIOUS_INTERVAL, regRec->assignedInterval, thisRegNum);
4435 updateAssignedInterval(regRec, nullptr, assignedInterval->registerType);
4436 updatePreviousInterval(regRec, nullptr, assignedInterval->registerType);
4440 //------------------------------------------------------------------------
4441 // spillGCRefs: Spill any GC-type intervals that are currently in registers.a
4444 // killRefPosition - The RefPosition for the kill
4449 void LinearScan::spillGCRefs(RefPosition* killRefPosition)
4451 // For each physical register that can hold a GC type,
4452 // if it is occupied by an interval of a GC type, spill that interval.
4453 regMaskTP candidateRegs = killRefPosition->registerAssignment;
4454 while (candidateRegs != RBM_NONE)
4456 regMaskTP nextRegBit = genFindLowestBit(candidateRegs);
4457 candidateRegs &= ~nextRegBit;
4458 regNumber nextReg = genRegNumFromMask(nextRegBit);
4459 RegRecord* regRecord = getRegisterRecord(nextReg);
4460 Interval* assignedInterval = regRecord->assignedInterval;
4461 if (assignedInterval == nullptr || (assignedInterval->isActive == false) ||
4462 !varTypeIsGC(assignedInterval->registerType))
4466 unassignPhysReg(regRecord, assignedInterval->recentRefPosition);
4468 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_DONE_KILL_GC_REFS, nullptr, REG_NA, nullptr));
4471 //------------------------------------------------------------------------
4472 // processBlockEndAllocation: Update var locations after 'currentBlock' has been allocated
4475 // currentBlock - the BasicBlock we have just finished allocating registers for
4481 // Calls processBlockEndLocations() to set the outVarToRegMap, then gets the next block,
4482 // and sets the inVarToRegMap appropriately.
4484 void LinearScan::processBlockEndAllocation(BasicBlock* currentBlock)
4486 assert(currentBlock != nullptr);
4487 if (enregisterLocalVars)
4489 processBlockEndLocations(currentBlock);
4491 markBlockVisited(currentBlock);
4493 // Get the next block to allocate.
4494 // When the last block in the method has successors, there will be a final "RefTypeBB" to
4495 // ensure that we get the varToRegMap set appropriately, but in that case we don't need
4496 // to worry about "nextBlock".
4497 BasicBlock* nextBlock = getNextBlock();
4498 if (nextBlock != nullptr)
4500 processBlockStartLocations(nextBlock, true);
4504 //------------------------------------------------------------------------
4505 // rotateBlockStartLocation: When in the LSRA_BLOCK_BOUNDARY_ROTATE stress mode, attempt to
4506 // "rotate" the register assignment for a localVar to the next higher
4507 // register that is available.
4510 // interval - the Interval for the variable whose register is getting rotated
4511 // targetReg - its register assignment from the predecessor block being used for live-in
4512 // availableRegs - registers available for use
4515 // The new register to use.
4518 regNumber LinearScan::rotateBlockStartLocation(Interval* interval, regNumber targetReg, regMaskTP availableRegs)
4520 if (targetReg != REG_STK && getLsraBlockBoundaryLocations() == LSRA_BLOCK_BOUNDARY_ROTATE)
4522 // If we're rotating the register locations at block boundaries, try to use
4523 // the next higher register number of the appropriate register type.
4524 regMaskTP candidateRegs = allRegs(interval->registerType) & availableRegs;
4525 regNumber firstReg = REG_NA;
4526 regNumber newReg = REG_NA;
4527 while (candidateRegs != RBM_NONE)
4529 regMaskTP nextRegBit = genFindLowestBit(candidateRegs);
4530 candidateRegs &= ~nextRegBit;
4531 regNumber nextReg = genRegNumFromMask(nextRegBit);
4532 if (nextReg > targetReg)
4537 else if (firstReg == REG_NA)
4542 if (newReg == REG_NA)
4544 assert(firstReg != REG_NA);
4554 //--------------------------------------------------------------------------------------
4555 // isSecondHalfReg: Test if recRec is second half of double register
4556 // which is assigned to an interval.
4559 // regRec - a register to be tested
4560 // interval - an interval which is assigned to some register
4566 // True only if regRec is second half of assignedReg in interval
4568 bool LinearScan::isSecondHalfReg(RegRecord* regRec, Interval* interval)
4570 RegRecord* assignedReg = interval->assignedReg;
4572 if (assignedReg != nullptr && interval->registerType == TYP_DOUBLE)
4574 // interval should have been allocated to a valid double register
4575 assert(genIsValidDoubleReg(assignedReg->regNum));
4577 // Find a second half RegRecord of double register
4578 regNumber firstRegNum = assignedReg->regNum;
4579 regNumber secondRegNum = REG_NEXT(firstRegNum);
4581 assert(genIsValidFloatReg(secondRegNum) && !genIsValidDoubleReg(secondRegNum));
4583 RegRecord* secondRegRec = getRegisterRecord(secondRegNum);
4585 return secondRegRec == regRec;
4591 //------------------------------------------------------------------------------------------
4592 // getSecondHalfRegRec: Get the second (odd) half of an ARM32 double register
4595 // regRec - A float RegRecord
4598 // regRec must be a valid double register (i.e. even)
4601 // The RegRecord for the second half of the double register
4603 RegRecord* LinearScan::getSecondHalfRegRec(RegRecord* regRec)
4605 regNumber secondHalfRegNum;
4606 RegRecord* secondHalfRegRec;
4608 assert(genIsValidDoubleReg(regRec->regNum));
4610 secondHalfRegNum = REG_NEXT(regRec->regNum);
4611 secondHalfRegRec = getRegisterRecord(secondHalfRegNum);
4613 return secondHalfRegRec;
4615 //------------------------------------------------------------------------------------------
4616 // findAnotherHalfRegRec: Find another half RegRecord which forms same ARM32 double register
4619 // regRec - A float RegRecord
4625 // A RegRecord which forms same double register with regRec
4627 RegRecord* LinearScan::findAnotherHalfRegRec(RegRecord* regRec)
4629 regNumber anotherHalfRegNum;
4630 RegRecord* anotherHalfRegRec;
4632 assert(genIsValidFloatReg(regRec->regNum));
4634 // Find another half register for TYP_DOUBLE interval,
4635 // following same logic in canRestorePreviousInterval().
4636 if (genIsValidDoubleReg(regRec->regNum))
4638 anotherHalfRegNum = REG_NEXT(regRec->regNum);
4639 assert(!genIsValidDoubleReg(anotherHalfRegNum));
4643 anotherHalfRegNum = REG_PREV(regRec->regNum);
4644 assert(genIsValidDoubleReg(anotherHalfRegNum));
4646 anotherHalfRegRec = getRegisterRecord(anotherHalfRegNum);
4648 return anotherHalfRegRec;
4652 //--------------------------------------------------------------------------------------
4653 // canRestorePreviousInterval: Test if we can restore previous interval
4656 // regRec - a register which contains previous interval to be restored
4657 // assignedInterval - an interval just unassigned
4663 // True only if previous interval of regRec can be restored
4665 bool LinearScan::canRestorePreviousInterval(RegRecord* regRec, Interval* assignedInterval)
4668 (regRec->previousInterval != nullptr && regRec->previousInterval != assignedInterval &&
4669 regRec->previousInterval->assignedReg == regRec && regRec->previousInterval->getNextRefPosition() != nullptr);
4672 if (retVal && regRec->previousInterval->registerType == TYP_DOUBLE)
4674 RegRecord* anotherHalfRegRec = findAnotherHalfRegRec(regRec);
4676 retVal = retVal && anotherHalfRegRec->assignedInterval == nullptr;
4683 bool LinearScan::isAssignedToInterval(Interval* interval, RegRecord* regRec)
4685 bool isAssigned = (interval->assignedReg == regRec);
4687 isAssigned |= isSecondHalfReg(regRec, interval);
4692 void LinearScan::unassignIntervalBlockStart(RegRecord* regRecord, VarToRegMap inVarToRegMap)
4694 // Is there another interval currently assigned to this register? If so unassign it.
4695 Interval* assignedInterval = regRecord->assignedInterval;
4696 if (assignedInterval != nullptr)
4698 if (isAssignedToInterval(assignedInterval, regRecord))
4700 // Only localVars or constants should be assigned to registers at block boundaries.
4701 if (!assignedInterval->isLocalVar)
4703 assert(assignedInterval->isConstant);
4704 // Don't need to update the VarToRegMap.
4705 inVarToRegMap = nullptr;
4708 regNumber assignedRegNum = assignedInterval->assignedReg->regNum;
4710 // If the interval is active, it will be set to active when we reach its new
4711 // register assignment (which we must not yet have done, or it wouldn't still be
4712 // assigned to this register).
4713 assignedInterval->isActive = false;
4714 unassignPhysReg(assignedInterval->assignedReg, nullptr);
4715 if ((inVarToRegMap != nullptr) && inVarToRegMap[assignedInterval->getVarIndex(compiler)] == assignedRegNum)
4717 inVarToRegMap[assignedInterval->getVarIndex(compiler)] = REG_STK;
4722 // This interval is no longer assigned to this register.
4723 updateAssignedInterval(regRecord, nullptr, assignedInterval->registerType);
4728 //------------------------------------------------------------------------
4729 // processBlockStartLocations: Update var locations on entry to 'currentBlock' and clear constant
4733 // currentBlock - the BasicBlock we are about to allocate registers for
4734 // allocationPass - true if we are currently allocating registers (versus writing them back)
4740 // During the allocation pass, we use the outVarToRegMap of the selected predecessor to
4741 // determine the lclVar locations for the inVarToRegMap.
4742 // During the resolution (write-back) pass, we only modify the inVarToRegMap in cases where
4743 // a lclVar was spilled after the block had been completed.
4744 void LinearScan::processBlockStartLocations(BasicBlock* currentBlock, bool allocationPass)
4746 // If we have no register candidates we should only call this method during allocation.
4748 assert(enregisterLocalVars || allocationPass);
4750 if (!enregisterLocalVars)
4752 // Just clear any constant registers and return.
4753 for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
4755 RegRecord* physRegRecord = getRegisterRecord(reg);
4756 Interval* assignedInterval = physRegRecord->assignedInterval;
4758 if (assignedInterval != nullptr)
4760 assert(assignedInterval->isConstant);
4761 physRegRecord->assignedInterval = nullptr;
4764 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_START_BB, nullptr, REG_NA, currentBlock));
4768 unsigned predBBNum = blockInfo[currentBlock->bbNum].predBBNum;
4769 VarToRegMap predVarToRegMap = getOutVarToRegMap(predBBNum);
4770 VarToRegMap inVarToRegMap = getInVarToRegMap(currentBlock->bbNum);
4771 bool hasCriticalInEdge = blockInfo[currentBlock->bbNum].hasCriticalInEdge;
4773 VarSetOps::AssignNoCopy(compiler, currentLiveVars,
4774 VarSetOps::Intersection(compiler, registerCandidateVars, currentBlock->bbLiveIn));
4776 if (getLsraExtendLifeTimes())
4778 VarSetOps::AssignNoCopy(compiler, currentLiveVars, registerCandidateVars);
4780 // If we are rotating register assignments at block boundaries, we want to make the
4781 // inactive registers available for the rotation.
4782 regMaskTP inactiveRegs = RBM_NONE;
4784 regMaskTP liveRegs = RBM_NONE;
4785 VarSetOps::Iter iter(compiler, currentLiveVars);
4786 unsigned varIndex = 0;
4787 while (iter.NextElem(&varIndex))
4789 unsigned varNum = compiler->lvaTrackedToVarNum[varIndex];
4790 if (!compiler->lvaTable[varNum].lvLRACandidate)
4794 regNumber targetReg;
4795 Interval* interval = getIntervalForLocalVar(varIndex);
4796 RefPosition* nextRefPosition = interval->getNextRefPosition();
4797 assert(nextRefPosition != nullptr);
4801 targetReg = getVarReg(predVarToRegMap, varIndex);
4803 regNumber newTargetReg = rotateBlockStartLocation(interval, targetReg, (~liveRegs | inactiveRegs));
4804 if (newTargetReg != targetReg)
4806 targetReg = newTargetReg;
4807 setIntervalAsSplit(interval);
4810 setVarReg(inVarToRegMap, varIndex, targetReg);
4812 else // !allocationPass (i.e. resolution/write-back pass)
4814 targetReg = getVarReg(inVarToRegMap, varIndex);
4815 // There are four cases that we need to consider during the resolution pass:
4816 // 1. This variable had a register allocated initially, and it was not spilled in the RefPosition
4817 // that feeds this block. In this case, both targetReg and predVarToRegMap[varIndex] will be targetReg.
4818 // 2. This variable had not been spilled prior to the end of predBB, but was later spilled, so
4819 // predVarToRegMap[varIndex] will be REG_STK, but targetReg is its former allocated value.
4820 // In this case, we will normally change it to REG_STK. We will update its "spilled" status when we
4821 // encounter it in resolveLocalRef().
4822 // 2a. If the next RefPosition is marked as a copyReg, we need to retain the allocated register. This is
4823 // because the copyReg RefPosition will not have recorded the "home" register, yet downstream
4824 // RefPositions rely on the correct "home" register.
4825 // 3. This variable was spilled before we reached the end of predBB. In this case, both targetReg and
4826 // predVarToRegMap[varIndex] will be REG_STK, and the next RefPosition will have been marked
4827 // as reload during allocation time if necessary (note that by the time we actually reach the next
4828 // RefPosition, we may be using a different predecessor, at which it is still in a register).
4829 // 4. This variable was spilled during the allocation of this block, so targetReg is REG_STK
4830 // (because we set inVarToRegMap at the time we spilled it), but predVarToRegMap[varIndex]
4831 // is not REG_STK. We retain the REG_STK value in the inVarToRegMap.
4832 if (targetReg != REG_STK)
4834 if (getVarReg(predVarToRegMap, varIndex) != REG_STK)
4837 assert(getVarReg(predVarToRegMap, varIndex) == targetReg ||
4838 getLsraBlockBoundaryLocations() == LSRA_BLOCK_BOUNDARY_ROTATE);
4840 else if (!nextRefPosition->copyReg)
4843 setVarReg(inVarToRegMap, varIndex, REG_STK);
4844 targetReg = REG_STK;
4846 // Else case 2a. - retain targetReg.
4848 // Else case #3 or #4, we retain targetReg and nothing further to do or assert.
4850 if (interval->physReg == targetReg)
4852 if (interval->isActive)
4854 assert(targetReg != REG_STK);
4855 assert(interval->assignedReg != nullptr && interval->assignedReg->regNum == targetReg &&
4856 interval->assignedReg->assignedInterval == interval);
4857 liveRegs |= genRegMask(targetReg);
4861 else if (interval->physReg != REG_NA)
4863 // This can happen if we are using the locations from a basic block other than the
4864 // immediately preceding one - where the variable was in a different location.
4865 if (targetReg != REG_STK)
4867 // Unassign it from the register (it will get a new register below).
4868 if (interval->assignedReg != nullptr && interval->assignedReg->assignedInterval == interval)
4870 interval->isActive = false;
4871 unassignPhysReg(getRegisterRecord(interval->physReg), nullptr);
4875 // This interval was live in this register the last time we saw a reference to it,
4876 // but has since been displaced.
4877 interval->physReg = REG_NA;
4880 else if (allocationPass)
4882 // Keep the register assignment - if another var has it, it will get unassigned.
4883 // Otherwise, resolution will fix it up later, and it will be more
4884 // likely to match other assignments this way.
4885 interval->isActive = true;
4886 liveRegs |= genRegMask(interval->physReg);
4887 INDEBUG(inactiveRegs |= genRegMask(interval->physReg));
4888 setVarReg(inVarToRegMap, varIndex, interval->physReg);
4892 interval->physReg = REG_NA;
4895 if (targetReg != REG_STK)
4897 RegRecord* targetRegRecord = getRegisterRecord(targetReg);
4898 liveRegs |= genRegMask(targetReg);
4899 if (!interval->isActive)
4901 interval->isActive = true;
4902 interval->physReg = targetReg;
4903 interval->assignedReg = targetRegRecord;
4905 if (targetRegRecord->assignedInterval != interval)
4908 // If this is a TYP_DOUBLE interval, and the assigned interval is either null or is TYP_FLOAT,
4909 // we also need to unassign the other half of the register.
4910 // Note that if the assigned interval is TYP_DOUBLE, it will be unassigned below.
4911 if ((interval->registerType == TYP_DOUBLE) &&
4912 ((targetRegRecord->assignedInterval == nullptr) ||
4913 (targetRegRecord->assignedInterval->registerType == TYP_FLOAT)))
4915 assert(genIsValidDoubleReg(targetReg));
4916 unassignIntervalBlockStart(findAnotherHalfRegRec(targetRegRecord),
4917 allocationPass ? inVarToRegMap : nullptr);
4919 #endif // _TARGET_ARM_
4920 unassignIntervalBlockStart(targetRegRecord, allocationPass ? inVarToRegMap : nullptr);
4921 assignPhysReg(targetRegRecord, interval);
4923 if (interval->recentRefPosition != nullptr && !interval->recentRefPosition->copyReg &&
4924 interval->recentRefPosition->registerAssignment != genRegMask(targetReg))
4926 interval->getNextRefPosition()->outOfOrder = true;
4931 // Unassign any registers that are no longer live.
4932 for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
4934 if ((liveRegs & genRegMask(reg)) == 0)
4936 RegRecord* physRegRecord = getRegisterRecord(reg);
4937 Interval* assignedInterval = physRegRecord->assignedInterval;
4939 if (assignedInterval != nullptr)
4941 assert(assignedInterval->isLocalVar || assignedInterval->isConstant);
4943 if (!assignedInterval->isConstant && assignedInterval->assignedReg == physRegRecord)
4945 assignedInterval->isActive = false;
4946 if (assignedInterval->getNextRefPosition() == nullptr)
4948 unassignPhysReg(physRegRecord, nullptr);
4950 inVarToRegMap[assignedInterval->getVarIndex(compiler)] = REG_STK;
4954 // This interval may still be active, but was in another register in an
4955 // intervening block.
4956 updateAssignedInterval(physRegRecord, nullptr, assignedInterval->registerType);
4960 if (assignedInterval->registerType == TYP_DOUBLE)
4962 // Skip next float register, because we already addressed a double register
4963 assert(genIsValidDoubleReg(reg));
4964 reg = REG_NEXT(reg);
4966 #endif // _TARGET_ARM_
4972 RegRecord* physRegRecord = getRegisterRecord(reg);
4973 Interval* assignedInterval = physRegRecord->assignedInterval;
4975 if (assignedInterval != nullptr && assignedInterval->registerType == TYP_DOUBLE)
4977 // Skip next float register, because we already addressed a double register
4978 assert(genIsValidDoubleReg(reg));
4979 reg = REG_NEXT(reg);
4982 #endif // _TARGET_ARM_
4984 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_START_BB, nullptr, REG_NA, currentBlock));
4987 //------------------------------------------------------------------------
4988 // processBlockEndLocations: Record the variables occupying registers after completing the current block.
4991 // currentBlock - the block we have just completed.
4997 // This must be called both during the allocation and resolution (write-back) phases.
4998 // This is because we need to have the outVarToRegMap locations in order to set the locations
4999 // at successor blocks during allocation time, but if lclVars are spilled after a block has been
5000 // completed, we need to record the REG_STK location for those variables at resolution time.
5002 void LinearScan::processBlockEndLocations(BasicBlock* currentBlock)
5004 assert(currentBlock != nullptr && currentBlock->bbNum == curBBNum);
5005 VarToRegMap outVarToRegMap = getOutVarToRegMap(curBBNum);
5007 VarSetOps::AssignNoCopy(compiler, currentLiveVars,
5008 VarSetOps::Intersection(compiler, registerCandidateVars, currentBlock->bbLiveOut));
5010 if (getLsraExtendLifeTimes())
5012 VarSetOps::Assign(compiler, currentLiveVars, registerCandidateVars);
5015 regMaskTP liveRegs = RBM_NONE;
5016 VarSetOps::Iter iter(compiler, currentLiveVars);
5017 unsigned varIndex = 0;
5018 while (iter.NextElem(&varIndex))
5020 Interval* interval = getIntervalForLocalVar(varIndex);
5021 if (interval->isActive)
5023 assert(interval->physReg != REG_NA && interval->physReg != REG_STK);
5024 setVarReg(outVarToRegMap, varIndex, interval->physReg);
5028 outVarToRegMap[varIndex] = REG_STK;
5031 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_END_BB));
5035 void LinearScan::dumpRefPositions(const char* str)
5037 printf("------------\n");
5038 printf("REFPOSITIONS %s: \n", str);
5039 printf("------------\n");
5040 for (RefPosition& refPos : refPositions)
5047 bool LinearScan::registerIsFree(regNumber regNum, RegisterType regType)
5049 RegRecord* physRegRecord = getRegisterRecord(regNum);
5051 bool isFree = physRegRecord->isFree();
5054 if (isFree && regType == TYP_DOUBLE)
5056 isFree = getSecondHalfRegRec(physRegRecord)->isFree();
5058 #endif // _TARGET_ARM_
5063 // isMultiRegRelated: is this RefPosition defining part of a multi-reg value
5064 // at the given location?
5066 bool LinearScan::isMultiRegRelated(RefPosition* refPosition, LsraLocation location)
5068 #ifdef FEATURE_MULTIREG_ARGS_OR_RET
5069 return ((refPosition->nodeLocation == location) && refPosition->getInterval()->isMultiReg);
5075 //------------------------------------------------------------------------
5076 // LinearScan::freeRegister: Make a register available for use
5079 // physRegRecord - the RegRecord for the register to be freed.
5086 // It may be that the RegRecord has already been freed, e.g. due to a kill,
5087 // in which case this method has no effect.
5090 // If there is currently an Interval assigned to this register, and it has
5091 // more references (i.e. this is a local last-use, but more uses and/or
5092 // defs remain), it will remain assigned to the physRegRecord. However, since
5093 // it is marked inactive, the register will be available, albeit less desirable
5095 void LinearScan::freeRegister(RegRecord* physRegRecord)
5097 Interval* assignedInterval = physRegRecord->assignedInterval;
5098 // It may have already been freed by a "Kill"
5099 if (assignedInterval != nullptr)
5101 assignedInterval->isActive = false;
5102 // If this is a constant node, that we may encounter again (e.g. constant),
5103 // don't unassign it until we need the register.
5104 if (!assignedInterval->isConstant)
5106 RefPosition* nextRefPosition = assignedInterval->getNextRefPosition();
5107 // Unassign the register only if there are no more RefPositions, or the next
5108 // one is a def. Note that the latter condition doesn't actually ensure that
5109 // there aren't subsequent uses that could be reached by a def in the assigned
5110 // register, but is merely a heuristic to avoid tying up the register (or using
5111 // it when it's non-optimal). A better alternative would be to use SSA, so that
5112 // we wouldn't unnecessarily link separate live ranges to the same register.
5113 if (nextRefPosition == nullptr || RefTypeIsDef(nextRefPosition->refType))
5116 assert((assignedInterval->registerType != TYP_DOUBLE) || genIsValidDoubleReg(physRegRecord->regNum));
5117 #endif // _TARGET_ARM_
5118 unassignPhysReg(physRegRecord, nullptr);
5124 void LinearScan::freeRegisters(regMaskTP regsToFree)
5126 if (regsToFree == RBM_NONE)
5131 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_FREE_REGS));
5132 while (regsToFree != RBM_NONE)
5134 regMaskTP nextRegBit = genFindLowestBit(regsToFree);
5135 regsToFree &= ~nextRegBit;
5136 regNumber nextReg = genRegNumFromMask(nextRegBit);
5137 freeRegister(getRegisterRecord(nextReg));
5141 // Actual register allocation, accomplished by iterating over all of the previously
5142 // constructed Intervals
5143 // Loosely based on raAssignVars()
5145 void LinearScan::allocateRegisters()
5147 JITDUMP("*************** In LinearScan::allocateRegisters()\n");
5148 DBEXEC(VERBOSE, lsraDumpIntervals("before allocateRegisters"));
5150 // at start, nothing is active except for register args
5151 for (Interval& interval : intervals)
5153 Interval* currentInterval = &interval;
5154 currentInterval->recentRefPosition = nullptr;
5155 currentInterval->isActive = false;
5156 if (currentInterval->isLocalVar)
5158 LclVarDsc* varDsc = currentInterval->getLocalVar(compiler);
5159 if (varDsc->lvIsRegArg && currentInterval->firstRefPosition != nullptr)
5161 currentInterval->isActive = true;
5166 for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
5168 getRegisterRecord(reg)->recentRefPosition = nullptr;
5169 getRegisterRecord(reg)->isActive = false;
5173 regNumber lastAllocatedReg = REG_NA;
5176 dumpRefPositions("BEFORE ALLOCATION");
5177 dumpVarRefPositions("BEFORE ALLOCATION");
5179 printf("\n\nAllocating Registers\n"
5180 "--------------------\n");
5181 // Start with a small set of commonly used registers, so that we don't keep having to print a new title.
5182 registersToDump = LsraLimitSmallIntSet | LsraLimitSmallFPSet;
5183 dumpRegRecordHeader();
5184 // Now print an empty "RefPosition", since we complete the dump of the regs at the beginning of the loop.
5185 printf(indentFormat, "");
5189 BasicBlock* currentBlock = nullptr;
5191 LsraLocation prevLocation = MinLocation;
5192 regMaskTP regsToFree = RBM_NONE;
5193 regMaskTP delayRegsToFree = RBM_NONE;
5195 // This is the most recent RefPosition for which a register was allocated
5196 // - currently only used for DEBUG but maintained in non-debug, for clarity of code
5197 // (and will be optimized away because in non-debug spillAlways() unconditionally returns false)
5198 RefPosition* lastAllocatedRefPosition = nullptr;
5200 bool handledBlockEnd = false;
5202 for (RefPosition& refPositionIterator : refPositions)
5204 RefPosition* currentRefPosition = &refPositionIterator;
5207 // Set the activeRefPosition to null until we're done with any boundary handling.
5208 activeRefPosition = nullptr;
5211 // We're really dumping the RegRecords "after" the previous RefPosition, but it's more convenient
5212 // to do this here, since there are a number of "continue"s in this loop.
5217 // This is the previousRefPosition of the current Referent, if any
5218 RefPosition* previousRefPosition = nullptr;
5220 Interval* currentInterval = nullptr;
5221 Referenceable* currentReferent = nullptr;
5222 bool isInternalRef = false;
5223 RefType refType = currentRefPosition->refType;
5225 currentReferent = currentRefPosition->referent;
5227 if (spillAlways() && lastAllocatedRefPosition != nullptr && !lastAllocatedRefPosition->isPhysRegRef &&
5228 !lastAllocatedRefPosition->getInterval()->isInternal &&
5229 (RefTypeIsDef(lastAllocatedRefPosition->refType) || lastAllocatedRefPosition->getInterval()->isLocalVar))
5231 assert(lastAllocatedRefPosition->registerAssignment != RBM_NONE);
5232 RegRecord* regRecord = lastAllocatedRefPosition->getInterval()->assignedReg;
5233 unassignPhysReg(regRecord, lastAllocatedRefPosition);
5234 // Now set lastAllocatedRefPosition to null, so that we don't try to spill it again
5235 lastAllocatedRefPosition = nullptr;
5238 // We wait to free any registers until we've completed all the
5239 // uses for the current node.
5240 // This avoids reusing registers too soon.
5241 // We free before the last true def (after all the uses & internal
5242 // registers), and then again at the beginning of the next node.
5243 // This is made easier by assigning two LsraLocations per node - one
5244 // for all the uses, internal registers & all but the last def, and
5245 // another for the final def (if any).
5247 LsraLocation currentLocation = currentRefPosition->nodeLocation;
5249 if ((regsToFree | delayRegsToFree) != RBM_NONE)
5251 // Free at a new location, or at a basic block boundary
5252 if (refType == RefTypeBB)
5254 assert(currentLocation > prevLocation);
5256 if (currentLocation > prevLocation)
5258 freeRegisters(regsToFree);
5259 if ((currentLocation > (prevLocation + 1)) && (delayRegsToFree != RBM_NONE))
5261 // We should never see a delayReg that is delayed until a Location that has no RefPosition
5262 // (that would be the RefPosition that it was supposed to interfere with).
5263 assert(!"Found a delayRegFree associated with Location with no reference");
5264 // However, to be cautious for the Release build case, we will free them.
5265 freeRegisters(delayRegsToFree);
5266 delayRegsToFree = RBM_NONE;
5268 regsToFree = delayRegsToFree;
5269 delayRegsToFree = RBM_NONE;
5272 prevLocation = currentLocation;
5274 // get previous refposition, then current refpos is the new previous
5275 if (currentReferent != nullptr)
5277 previousRefPosition = currentReferent->recentRefPosition;
5278 currentReferent->recentRefPosition = currentRefPosition;
5282 assert((refType == RefTypeBB) || (refType == RefTypeKillGCRefs));
5286 activeRefPosition = currentRefPosition;
5289 // For the purposes of register resolution, we handle the DummyDefs before
5290 // the block boundary - so the RefTypeBB is after all the DummyDefs.
5291 // However, for the purposes of allocation, we want to handle the block
5292 // boundary first, so that we can free any registers occupied by lclVars
5293 // that aren't live in the next block and make them available for the
5296 if (!handledBlockEnd && (refType == RefTypeBB || refType == RefTypeDummyDef))
5298 // Free any delayed regs (now in regsToFree) before processing the block boundary
5299 freeRegisters(regsToFree);
5300 regsToFree = RBM_NONE;
5301 handledBlockEnd = true;
5302 curBBStartLocation = currentRefPosition->nodeLocation;
5303 if (currentBlock == nullptr)
5305 currentBlock = startBlockSequence();
5306 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_START_BB, nullptr, REG_NA, compiler->fgFirstBB));
5310 processBlockEndAllocation(currentBlock);
5311 currentBlock = moveToNextBlock();
5315 if (refType == RefTypeBB)
5317 handledBlockEnd = false;
5321 if (refType == RefTypeKillGCRefs)
5323 spillGCRefs(currentRefPosition);
5327 // If this is a FixedReg, disassociate any inactive constant interval from this register.
5328 // Otherwise, do nothing.
5329 if (refType == RefTypeFixedReg)
5331 RegRecord* regRecord = currentRefPosition->getReg();
5332 Interval* assignedInterval = regRecord->assignedInterval;
5334 if (assignedInterval != nullptr && !assignedInterval->isActive && assignedInterval->isConstant)
5336 regRecord->assignedInterval = nullptr;
5339 // Update overlapping floating point register for TYP_DOUBLE
5340 if (assignedInterval->registerType == TYP_DOUBLE)
5342 regRecord = getSecondHalfRegRec(regRecord);
5343 assignedInterval = regRecord->assignedInterval;
5345 assert(assignedInterval != nullptr && !assignedInterval->isActive && assignedInterval->isConstant);
5346 regRecord->assignedInterval = nullptr;
5350 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_FIXED_REG, nullptr, currentRefPosition->assignedReg()));
5354 // If this is an exposed use, do nothing - this is merely a placeholder to attempt to
5355 // ensure that a register is allocated for the full lifetime. The resolution logic
5356 // will take care of moving to the appropriate register if needed.
5358 if (refType == RefTypeExpUse)
5360 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_EXP_USE));
5364 regNumber assignedRegister = REG_NA;
5366 if (currentRefPosition->isIntervalRef())
5368 currentInterval = currentRefPosition->getInterval();
5369 assignedRegister = currentInterval->physReg;
5371 // Identify the special cases where we decide up-front not to allocate
5372 bool allocate = true;
5373 bool didDump = false;
5375 if (refType == RefTypeParamDef || refType == RefTypeZeroInit)
5377 // For a ParamDef with a weighted refCount less than unity, don't enregister it at entry.
5378 // TODO-CQ: Consider doing this only for stack parameters, since otherwise we may be needlessly
5379 // inserting a store.
5380 LclVarDsc* varDsc = currentInterval->getLocalVar(compiler);
5381 assert(varDsc != nullptr);
5382 if (refType == RefTypeParamDef && varDsc->lvRefCntWtd <= BB_UNITY_WEIGHT)
5384 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_NO_ENTRY_REG_ALLOCATED, currentInterval));
5387 setIntervalAsSpilled(currentInterval);
5389 // If it has no actual references, mark it as "lastUse"; since they're not actually part
5390 // of any flow they won't have been marked during dataflow. Otherwise, if we allocate a
5391 // register we won't unassign it.
5392 else if (currentRefPosition->nextRefPosition == nullptr)
5394 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_ZERO_REF, currentInterval));
5395 currentRefPosition->lastUse = true;
5399 else if (refType == RefTypeUpperVectorSaveDef || refType == RefTypeUpperVectorSaveUse)
5401 Interval* lclVarInterval = currentInterval->relatedInterval;
5402 if (lclVarInterval->physReg == REG_NA)
5407 #endif // FEATURE_SIMD
5409 if (allocate == false)
5411 if (assignedRegister != REG_NA)
5413 unassignPhysReg(getRegisterRecord(assignedRegister), currentRefPosition);
5417 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_NO_REG_ALLOCATED, currentInterval));
5420 currentRefPosition->registerAssignment = RBM_NONE;
5424 if (currentInterval->isSpecialPutArg)
5426 assert(!currentInterval->isLocalVar);
5427 Interval* srcInterval = currentInterval->relatedInterval;
5428 assert(srcInterval->isLocalVar);
5429 if (refType == RefTypeDef)
5431 assert(srcInterval->recentRefPosition->nodeLocation == currentLocation - 1);
5432 RegRecord* physRegRecord = srcInterval->assignedReg;
5434 // For a putarg_reg to be special, its next use location has to be the same
5435 // as fixed reg's next kill location. Otherwise, if source lcl var's next use
5436 // is after the kill of fixed reg but before putarg_reg's next use, fixed reg's
5437 // kill would lead to spill of source but not the putarg_reg if it were treated
5439 if (srcInterval->isActive &&
5440 genRegMask(srcInterval->physReg) == currentRefPosition->registerAssignment &&
5441 currentInterval->getNextRefLocation() == physRegRecord->getNextRefLocation())
5443 assert(physRegRecord->regNum == srcInterval->physReg);
5445 // Special putarg_reg acts as a pass-thru since both source lcl var
5446 // and putarg_reg have the same register allocated. Physical reg
5447 // record of reg continue to point to source lcl var's interval
5448 // instead of to putarg_reg's interval. So if a spill of reg
5449 // allocated to source lcl var happens, to reallocate to another
5450 // tree node, before its use at call node it will lead to spill of
5451 // lcl var instead of putarg_reg since physical reg record is pointing
5452 // to lcl var's interval. As a result, arg reg would get trashed leading
5453 // to bad codegen. The assumption here is that source lcl var of a
5454 // special putarg_reg doesn't get spilled and re-allocated prior to
5455 // its use at the call node. This is ensured by marking physical reg
5456 // record as busy until next kill.
5457 physRegRecord->isBusyUntilNextKill = true;
5461 currentInterval->isSpecialPutArg = false;
5464 // If this is still a SpecialPutArg, continue;
5465 if (currentInterval->isSpecialPutArg)
5467 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_SPECIAL_PUTARG, currentInterval,
5468 currentRefPosition->assignedReg()));
5473 if (assignedRegister == REG_NA && RefTypeIsUse(refType))
5475 currentRefPosition->reload = true;
5476 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_RELOAD, currentInterval, assignedRegister));
5480 regMaskTP assignedRegBit = RBM_NONE;
5481 bool isInRegister = false;
5482 if (assignedRegister != REG_NA)
5484 isInRegister = true;
5485 assignedRegBit = genRegMask(assignedRegister);
5486 if (!currentInterval->isActive)
5488 // If this is a use, it must have started the block on the stack, but the register
5489 // was available for use so we kept the association.
5490 if (RefTypeIsUse(refType))
5492 assert(enregisterLocalVars);
5493 assert(inVarToRegMaps[curBBNum][currentInterval->getVarIndex(compiler)] == REG_STK &&
5494 previousRefPosition->nodeLocation <= curBBStartLocation);
5495 isInRegister = false;
5499 currentInterval->isActive = true;
5502 assert(currentInterval->assignedReg != nullptr &&
5503 currentInterval->assignedReg->regNum == assignedRegister &&
5504 currentInterval->assignedReg->assignedInterval == currentInterval);
5507 // If this is a physical register, we unconditionally assign it to itself!
5508 if (currentRefPosition->isPhysRegRef)
5510 RegRecord* currentReg = currentRefPosition->getReg();
5511 Interval* assignedInterval = currentReg->assignedInterval;
5513 if (assignedInterval != nullptr)
5515 unassignPhysReg(currentReg, assignedInterval->recentRefPosition);
5517 currentReg->isActive = true;
5518 assignedRegister = currentReg->regNum;
5519 assignedRegBit = genRegMask(assignedRegister);
5520 if (refType == RefTypeKill)
5522 currentReg->isBusyUntilNextKill = false;
5525 else if (previousRefPosition != nullptr)
5527 assert(previousRefPosition->nextRefPosition == currentRefPosition);
5528 assert(assignedRegister == REG_NA || assignedRegBit == previousRefPosition->registerAssignment ||
5529 currentRefPosition->outOfOrder || previousRefPosition->copyReg ||
5530 previousRefPosition->refType == RefTypeExpUse || currentRefPosition->refType == RefTypeDummyDef);
5532 else if (assignedRegister != REG_NA)
5534 // Handle the case where this is a preassigned register (i.e. parameter).
5535 // We don't want to actually use the preassigned register if it's not
5536 // going to cover the lifetime - but we had to preallocate it to ensure
5537 // that it remained live.
5538 // TODO-CQ: At some point we may want to refine the analysis here, in case
5539 // it might be beneficial to keep it in this reg for PART of the lifetime
5540 if (currentInterval->isLocalVar)
5542 regMaskTP preferences = currentInterval->registerPreferences;
5543 bool keepAssignment = true;
5544 bool matchesPreferences = (preferences & genRegMask(assignedRegister)) != RBM_NONE;
5546 // Will the assigned register cover the lifetime? If not, does it at least
5547 // meet the preferences for the next RefPosition?
5548 RegRecord* physRegRecord = getRegisterRecord(currentInterval->physReg);
5549 RefPosition* nextPhysRegRefPos = physRegRecord->getNextRefPosition();
5550 if (nextPhysRegRefPos != nullptr &&
5551 nextPhysRegRefPos->nodeLocation <= currentInterval->lastRefPosition->nodeLocation)
5553 // Check to see if the existing assignment matches the preferences (e.g. callee save registers)
5554 // and ensure that the next use of this localVar does not occur after the nextPhysRegRefPos
5555 // There must be a next RefPosition, because we know that the Interval extends beyond the
5556 // nextPhysRegRefPos.
5557 RefPosition* nextLclVarRefPos = currentRefPosition->nextRefPosition;
5558 assert(nextLclVarRefPos != nullptr);
5559 if (!matchesPreferences || nextPhysRegRefPos->nodeLocation < nextLclVarRefPos->nodeLocation ||
5560 physRegRecord->conflictingFixedRegReference(nextLclVarRefPos))
5562 keepAssignment = false;
5565 else if (refType == RefTypeParamDef && !matchesPreferences)
5567 // Don't use the register, even if available, if it doesn't match the preferences.
5568 // Note that this case is only for ParamDefs, for which we haven't yet taken preferences
5569 // into account (we've just automatically got the initial location). In other cases,
5570 // we would already have put it in a preferenced register, if it was available.
5571 // TODO-CQ: Consider expanding this to check availability - that would duplicate
5572 // code here, but otherwise we may wind up in this register anyway.
5573 keepAssignment = false;
5576 if (keepAssignment == false)
5578 currentRefPosition->registerAssignment = allRegs(currentInterval->registerType);
5579 unassignPhysRegNoSpill(physRegRecord);
5581 // If the preferences are currently set to just this register, reset them to allRegs
5582 // of the appropriate type (just as we just reset the registerAssignment for this
5584 // Otherwise, simply remove this register from the preferences, if it's there.
5586 if (currentInterval->registerPreferences == assignedRegBit)
5588 currentInterval->registerPreferences = currentRefPosition->registerAssignment;
5592 currentInterval->registerPreferences &= ~assignedRegBit;
5595 assignedRegister = REG_NA;
5596 assignedRegBit = RBM_NONE;
5601 if (assignedRegister != REG_NA)
5603 // If there is a conflicting fixed reference, insert a copy.
5604 RegRecord* physRegRecord = getRegisterRecord(assignedRegister);
5605 if (physRegRecord->conflictingFixedRegReference(currentRefPosition))
5607 // We may have already reassigned the register to the conflicting reference.
5608 // If not, we need to unassign this interval.
5609 if (physRegRecord->assignedInterval == currentInterval)
5611 unassignPhysRegNoSpill(physRegRecord);
5613 currentRefPosition->moveReg = true;
5614 assignedRegister = REG_NA;
5615 setIntervalAsSplit(currentInterval);
5616 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_MOVE_REG, currentInterval, assignedRegister));
5618 else if ((genRegMask(assignedRegister) & currentRefPosition->registerAssignment) != 0)
5620 currentRefPosition->registerAssignment = assignedRegBit;
5621 if (!currentReferent->isActive)
5623 // If we've got an exposed use at the top of a block, the
5624 // interval might not have been active. Otherwise if it's a use,
5625 // the interval must be active.
5626 if (refType == RefTypeDummyDef)
5628 currentReferent->isActive = true;
5629 assert(getRegisterRecord(assignedRegister)->assignedInterval == currentInterval);
5633 currentRefPosition->reload = true;
5636 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_KEPT_ALLOCATION, currentInterval, assignedRegister));
5640 assert(currentInterval != nullptr);
5642 // It's already in a register, but not one we need.
5643 if (!RefTypeIsDef(currentRefPosition->refType))
5645 regNumber copyReg = assignCopyReg(currentRefPosition);
5646 assert(copyReg != REG_NA);
5647 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_COPY_REG, currentInterval, copyReg));
5648 lastAllocatedRefPosition = currentRefPosition;
5649 if (currentRefPosition->lastUse)
5651 if (currentRefPosition->delayRegFree)
5653 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_LAST_USE_DELAYED, currentInterval,
5655 delayRegsToFree |= (genRegMask(assignedRegister) | currentRefPosition->registerAssignment);
5659 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_LAST_USE, currentInterval, assignedRegister));
5660 regsToFree |= (genRegMask(assignedRegister) | currentRefPosition->registerAssignment);
5663 // If this is a tree temp (non-localVar) interval, we will need an explicit move.
5664 if (!currentInterval->isLocalVar)
5666 currentRefPosition->moveReg = true;
5667 currentRefPosition->copyReg = false;
5673 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_NEEDS_NEW_REG, nullptr, assignedRegister));
5674 regsToFree |= genRegMask(assignedRegister);
5675 // We want a new register, but we don't want this to be considered a spill.
5676 assignedRegister = REG_NA;
5677 if (physRegRecord->assignedInterval == currentInterval)
5679 unassignPhysRegNoSpill(physRegRecord);
5685 if (assignedRegister == REG_NA)
5687 bool allocateReg = true;
5689 if (currentRefPosition->AllocateIfProfitable())
5691 // We can avoid allocating a register if it is a the last use requiring a reload.
5692 if (currentRefPosition->lastUse && currentRefPosition->reload)
5694 allocateReg = false;
5698 // Under stress mode, don't attempt to allocate a reg to
5699 // reg optional ref position.
5700 if (allocateReg && regOptionalNoAlloc())
5702 allocateReg = false;
5709 // Try to allocate a register
5710 assignedRegister = tryAllocateFreeReg(currentInterval, currentRefPosition);
5713 // If no register was found, and if the currentRefPosition must have a register,
5714 // then find a register to spill
5715 if (assignedRegister == REG_NA)
5717 #if FEATURE_PARTIAL_SIMD_CALLEE_SAVE
5718 if (refType == RefTypeUpperVectorSaveDef)
5720 // TODO-CQ: Determine whether copying to two integer callee-save registers would be profitable.
5721 // TODO-ARM64-CQ: Determine whether copying to one integer callee-save registers would be
5724 // SaveDef position occurs after the Use of args and at the same location as Kill/Def
5725 // positions of a call node. But SaveDef position cannot use any of the arg regs as
5726 // they are needed for call node.
5727 currentRefPosition->registerAssignment =
5728 (allRegs(TYP_FLOAT) & RBM_FLT_CALLEE_TRASH & ~RBM_FLTARG_REGS);
5729 assignedRegister = tryAllocateFreeReg(currentInterval, currentRefPosition);
5731 // There MUST be caller-save registers available, because they have all just been killed.
5732 // Amd64 Windows: xmm4-xmm5 are guaranteed to be available as xmm0-xmm3 are used for passing args.
5733 // Amd64 Unix: xmm8-xmm15 are guaranteed to be avilable as xmm0-xmm7 are used for passing args.
5734 // X86 RyuJIT Windows: xmm4-xmm7 are guanrateed to be available.
5735 assert(assignedRegister != REG_NA);
5739 // i) The reason we have to spill is that SaveDef position is allocated after the Kill positions
5740 // of the call node are processed. Since callee-trash registers are killed by call node
5741 // we explicity spill and unassign the register.
5742 // ii) These will look a bit backward in the dump, but it's a pain to dump the alloc before the
5744 unassignPhysReg(getRegisterRecord(assignedRegister), currentRefPosition);
5745 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_ALLOC_REG, currentInterval, assignedRegister));
5747 // Now set assignedRegister to REG_NA again so that we don't re-activate it.
5748 assignedRegister = REG_NA;
5751 #endif // FEATURE_PARTIAL_SIMD_CALLEE_SAVE
5752 if (currentRefPosition->RequiresRegister() || currentRefPosition->AllocateIfProfitable())
5756 assignedRegister = allocateBusyReg(currentInterval, currentRefPosition,
5757 currentRefPosition->AllocateIfProfitable());
5760 if (assignedRegister != REG_NA)
5763 dumpLsraAllocationEvent(LSRA_EVENT_ALLOC_SPILLED_REG, currentInterval, assignedRegister));
5767 // This can happen only for those ref positions that are to be allocated
5768 // only if profitable.
5769 noway_assert(currentRefPosition->AllocateIfProfitable());
5771 currentRefPosition->registerAssignment = RBM_NONE;
5772 currentRefPosition->reload = false;
5773 setIntervalAsSpilled(currentInterval);
5775 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_NO_REG_ALLOCATED, currentInterval));
5780 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_NO_REG_ALLOCATED, currentInterval));
5781 currentRefPosition->registerAssignment = RBM_NONE;
5782 currentInterval->isActive = false;
5783 setIntervalAsSpilled(currentInterval);
5791 if (currentInterval->isConstant && (currentRefPosition->treeNode != nullptr) &&
5792 currentRefPosition->treeNode->IsReuseRegVal())
5794 dumpLsraAllocationEvent(LSRA_EVENT_REUSE_REG, currentInterval, assignedRegister, currentBlock);
5798 dumpLsraAllocationEvent(LSRA_EVENT_ALLOC_REG, currentInterval, assignedRegister, currentBlock);
5804 if (refType == RefTypeDummyDef && assignedRegister != REG_NA)
5806 setInVarRegForBB(curBBNum, currentInterval->varNum, assignedRegister);
5809 // If we allocated a register, and this is a use of a spilled value,
5810 // it should have been marked for reload above.
5811 if (assignedRegister != REG_NA && RefTypeIsUse(refType) && !isInRegister)
5813 assert(currentRefPosition->reload);
5817 // If we allocated a register, record it
5818 if (currentInterval != nullptr && assignedRegister != REG_NA)
5820 assignedRegBit = genRegMask(assignedRegister);
5821 currentRefPosition->registerAssignment = assignedRegBit;
5822 currentInterval->physReg = assignedRegister;
5823 regsToFree &= ~assignedRegBit; // we'll set it again later if it's dead
5825 // If this interval is dead, free the register.
5826 // The interval could be dead if this is a user variable, or if the
5827 // node is being evaluated for side effects, or a call whose result
5828 // is not used, etc.
5829 if (currentRefPosition->lastUse || currentRefPosition->nextRefPosition == nullptr)
5831 assert(currentRefPosition->isIntervalRef());
5833 if (refType != RefTypeExpUse && currentRefPosition->nextRefPosition == nullptr)
5835 if (currentRefPosition->delayRegFree)
5837 delayRegsToFree |= assignedRegBit;
5839 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_LAST_USE_DELAYED));
5843 regsToFree |= assignedRegBit;
5845 INDEBUG(dumpLsraAllocationEvent(LSRA_EVENT_LAST_USE));
5850 currentInterval->isActive = false;
5854 lastAllocatedRefPosition = currentRefPosition;
5858 // Free registers to clear associated intervals for resolution phase
5859 CLANG_FORMAT_COMMENT_ANCHOR;
5862 if (getLsraExtendLifeTimes())
5864 // If we have extended lifetimes, we need to make sure all the registers are freed.
5865 for (int regNumIndex = 0; regNumIndex <= REG_FP_LAST; regNumIndex++)
5867 RegRecord& regRecord = physRegs[regNumIndex];
5868 Interval* interval = regRecord.assignedInterval;
5869 if (interval != nullptr)
5871 interval->isActive = false;
5872 unassignPhysReg(®Record, nullptr);
5879 freeRegisters(regsToFree | delayRegsToFree);
5885 // Dump the RegRecords after the last RefPosition is handled.
5889 dumpRefPositions("AFTER ALLOCATION");
5890 dumpVarRefPositions("AFTER ALLOCATION");
5892 // Dump the intervals that remain active
5893 printf("Active intervals at end of allocation:\n");
5895 // We COULD just reuse the intervalIter from above, but ArrayListIterator doesn't
5896 // provide a Reset function (!) - we'll probably replace this so don't bother
5899 for (Interval& interval : intervals)
5901 if (interval.isActive)
5913 //-----------------------------------------------------------------------------
5914 // updateAssignedInterval: Update assigned interval of register.
5917 // reg - register to be updated
5918 // interval - interval to be assigned
5919 // regType - regsiter type
5925 // For ARM32, when "regType" is TYP_DOUBLE, "reg" should be a even-numbered
5926 // float register, i.e. lower half of double register.
5929 // For ARM32, two float registers consisting a double register are updated
5930 // together when "regType" is TYP_DOUBLE.
5932 void LinearScan::updateAssignedInterval(RegRecord* reg, Interval* interval, RegisterType regType)
5935 // Update overlapping floating point register for TYP_DOUBLE.
5936 Interval* oldAssignedInterval = reg->assignedInterval;
5937 if (regType == TYP_DOUBLE)
5939 RegRecord* anotherHalfReg = findAnotherHalfRegRec(reg);
5941 anotherHalfReg->assignedInterval = interval;
5943 else if ((oldAssignedInterval != nullptr) && (oldAssignedInterval->registerType == TYP_DOUBLE))
5945 RegRecord* anotherHalfReg = findAnotherHalfRegRec(reg);
5947 anotherHalfReg->assignedInterval = nullptr;
5950 reg->assignedInterval = interval;
5953 //-----------------------------------------------------------------------------
5954 // updatePreviousInterval: Update previous interval of register.
5957 // reg - register to be updated
5958 // interval - interval to be assigned
5959 // regType - regsiter type
5965 // For ARM32, when "regType" is TYP_DOUBLE, "reg" should be a even-numbered
5966 // float register, i.e. lower half of double register.
5969 // For ARM32, two float registers consisting a double register are updated
5970 // together when "regType" is TYP_DOUBLE.
5972 void LinearScan::updatePreviousInterval(RegRecord* reg, Interval* interval, RegisterType regType)
5974 reg->previousInterval = interval;
5977 // Update overlapping floating point register for TYP_DOUBLE
5978 if (regType == TYP_DOUBLE)
5980 RegRecord* anotherHalfReg = findAnotherHalfRegRec(reg);
5982 anotherHalfReg->previousInterval = interval;
5987 // LinearScan::resolveLocalRef
5989 // Update the graph for a local reference.
5990 // Also, track the register (if any) that is currently occupied.
5992 // treeNode: The lclVar that's being resolved
5993 // currentRefPosition: the RefPosition associated with the treeNode
5996 // This method is called for each local reference, during the resolveRegisters
5997 // phase of LSRA. It is responsible for keeping the following in sync:
5998 // - varDsc->lvRegNum (and lvOtherReg) contain the unique register location.
5999 // If it is not in the same register through its lifetime, it is set to REG_STK.
6000 // - interval->physReg is set to the assigned register
6001 // (i.e. at the code location which is currently being handled by resolveRegisters())
6002 // - interval->isActive is true iff the interval is live and occupying a register
6003 // - interval->isSpilled should have already been set to true if the interval is EVER spilled
6004 // - interval->isSplit is set to true if the interval does not occupy the same
6005 // register throughout the method
6006 // - RegRecord->assignedInterval points to the interval which currently occupies
6008 // - For each lclVar node:
6009 // - gtRegNum/gtRegPair is set to the currently allocated register(s).
6010 // - GTF_SPILLED is set on a use if it must be reloaded prior to use.
6011 // - GTF_SPILL is set if it must be spilled after use.
6013 // A copyReg is an ugly case where the variable must be in a specific (fixed) register,
6014 // but it currently resides elsewhere. The register allocator must track the use of the
6015 // fixed register, but it marks the lclVar node with the register it currently lives in
6016 // and the code generator does the necessary move.
6018 // Before beginning, the varDsc for each parameter must be set to its initial location.
6020 // NICE: Consider tracking whether an Interval is always in the same location (register/stack)
6021 // in which case it will require no resolution.
6023 void LinearScan::resolveLocalRef(BasicBlock* block, GenTree* treeNode, RefPosition* currentRefPosition)
6025 assert((block == nullptr) == (treeNode == nullptr));
6026 assert(enregisterLocalVars);
6028 // Is this a tracked local? Or just a register allocated for loading
6029 // a non-tracked one?
6030 Interval* interval = currentRefPosition->getInterval();
6031 if (!interval->isLocalVar)
6035 interval->recentRefPosition = currentRefPosition;
6036 LclVarDsc* varDsc = interval->getLocalVar(compiler);
6038 // NOTE: we set the GTF_VAR_DEATH flag here unless we are extending lifetimes, in which case we write
6039 // this bit in checkLastUses. This is a bit of a hack, but is necessary because codegen requires
6040 // accurate last use info that is not reflected in the lastUse bit on ref positions when we are extending
6041 // lifetimes. See also the comments in checkLastUses.
6042 if ((treeNode != nullptr) && !extendLifetimes())
6044 if (currentRefPosition->lastUse)
6046 treeNode->gtFlags |= GTF_VAR_DEATH;
6050 treeNode->gtFlags &= ~GTF_VAR_DEATH;
6054 if (currentRefPosition->registerAssignment == RBM_NONE)
6056 assert(!currentRefPosition->RequiresRegister());
6057 assert(interval->isSpilled);
6059 varDsc->lvRegNum = REG_STK;
6060 if (interval->assignedReg != nullptr && interval->assignedReg->assignedInterval == interval)
6062 updateAssignedInterval(interval->assignedReg, nullptr, interval->registerType);
6064 interval->assignedReg = nullptr;
6065 interval->physReg = REG_NA;
6066 if (treeNode != nullptr)
6068 treeNode->SetContained();
6074 // In most cases, assigned and home registers will be the same
6075 // The exception is the copyReg case, where we've assigned a register
6076 // for a specific purpose, but will be keeping the register assignment
6077 regNumber assignedReg = currentRefPosition->assignedReg();
6078 regNumber homeReg = assignedReg;
6080 // Undo any previous association with a physical register, UNLESS this
6082 if (!currentRefPosition->copyReg)
6084 regNumber oldAssignedReg = interval->physReg;
6085 if (oldAssignedReg != REG_NA && assignedReg != oldAssignedReg)
6087 RegRecord* oldRegRecord = getRegisterRecord(oldAssignedReg);
6088 if (oldRegRecord->assignedInterval == interval)
6090 updateAssignedInterval(oldRegRecord, nullptr, interval->registerType);
6095 if (currentRefPosition->refType == RefTypeUse && !currentRefPosition->reload)
6097 // Was this spilled after our predecessor was scheduled?
6098 if (interval->physReg == REG_NA)
6100 assert(inVarToRegMaps[curBBNum][varDsc->lvVarIndex] == REG_STK);
6101 currentRefPosition->reload = true;
6105 bool reload = currentRefPosition->reload;
6106 bool spillAfter = currentRefPosition->spillAfter;
6108 // In the reload case we either:
6109 // - Set the register to REG_STK if it will be referenced only from the home location, or
6110 // - Set the register to the assigned register and set GTF_SPILLED if it must be loaded into a register.
6113 assert(currentRefPosition->refType != RefTypeDef);
6114 assert(interval->isSpilled);
6115 varDsc->lvRegNum = REG_STK;
6118 interval->physReg = assignedReg;
6121 // If there is no treeNode, this must be a RefTypeExpUse, in
6122 // which case we did the reload already
6123 if (treeNode != nullptr)
6125 treeNode->gtFlags |= GTF_SPILLED;
6128 if (currentRefPosition->AllocateIfProfitable())
6130 // This is a use of lclVar that is flagged as reg-optional
6131 // by lower/codegen and marked for both reload and spillAfter.
6132 // In this case we can avoid unnecessary reload and spill
6133 // by setting reg on lclVar to REG_STK and reg on tree node
6134 // to REG_NA. Codegen will generate the code by considering
6135 // it as a contained memory operand.
6137 // Note that varDsc->lvRegNum is already to REG_STK above.
6138 interval->physReg = REG_NA;
6139 treeNode->gtRegNum = REG_NA;
6140 treeNode->gtFlags &= ~GTF_SPILLED;
6141 treeNode->SetContained();
6145 treeNode->gtFlags |= GTF_SPILL;
6151 assert(currentRefPosition->refType == RefTypeExpUse);
6154 else if (spillAfter && !RefTypeIsUse(currentRefPosition->refType))
6156 // In the case of a pure def, don't bother spilling - just assign it to the
6157 // stack. However, we need to remember that it was spilled.
6159 assert(interval->isSpilled);
6160 varDsc->lvRegNum = REG_STK;
6161 interval->physReg = REG_NA;
6162 if (treeNode != nullptr)
6164 treeNode->gtRegNum = REG_NA;
6169 // Not reload and Not pure-def that's spillAfter
6171 if (currentRefPosition->copyReg || currentRefPosition->moveReg)
6173 // For a copyReg or moveReg, we have two cases:
6174 // - In the first case, we have a fixedReg - i.e. a register which the code
6175 // generator is constrained to use.
6176 // The code generator will generate the appropriate move to meet the requirement.
6177 // - In the second case, we were forced to use a different register because of
6178 // interference (or JitStressRegs).
6179 // In this case, we generate a GT_COPY.
6180 // In either case, we annotate the treeNode with the register in which the value
6181 // currently lives. For moveReg, the homeReg is the new register (as assigned above).
6182 // But for copyReg, the homeReg remains unchanged.
6184 assert(treeNode != nullptr);
6185 treeNode->gtRegNum = interval->physReg;
6187 if (currentRefPosition->copyReg)
6189 homeReg = interval->physReg;
6193 assert(interval->isSplit);
6194 interval->physReg = assignedReg;
6197 if (!currentRefPosition->isFixedRegRef || currentRefPosition->moveReg)
6199 // This is the second case, where we need to generate a copy
6200 insertCopyOrReload(block, treeNode, currentRefPosition->getMultiRegIdx(), currentRefPosition);
6205 interval->physReg = assignedReg;
6207 if (!interval->isSpilled && !interval->isSplit)
6209 if (varDsc->lvRegNum != REG_STK)
6211 // If the register assignments don't match, then this interval is split.
6212 if (varDsc->lvRegNum != assignedReg)
6214 setIntervalAsSplit(interval);
6215 varDsc->lvRegNum = REG_STK;
6220 varDsc->lvRegNum = assignedReg;
6226 if (treeNode != nullptr)
6228 treeNode->gtFlags |= GTF_SPILL;
6230 assert(interval->isSpilled);
6231 interval->physReg = REG_NA;
6232 varDsc->lvRegNum = REG_STK;
6236 // Update the physRegRecord for the register, so that we know what vars are in
6237 // regs at the block boundaries
6238 RegRecord* physRegRecord = getRegisterRecord(homeReg);
6239 if (spillAfter || currentRefPosition->lastUse)
6241 interval->isActive = false;
6242 interval->assignedReg = nullptr;
6243 interval->physReg = REG_NA;
6245 updateAssignedInterval(physRegRecord, nullptr, interval->registerType);
6249 interval->isActive = true;
6250 interval->assignedReg = physRegRecord;
6252 updateAssignedInterval(physRegRecord, interval, interval->registerType);
6256 void LinearScan::writeRegisters(RefPosition* currentRefPosition, GenTree* tree)
6258 lsraAssignRegToTree(tree, currentRefPosition->assignedReg(), currentRefPosition->getMultiRegIdx());
6261 //------------------------------------------------------------------------
6262 // insertCopyOrReload: Insert a copy in the case where a tree node value must be moved
6263 // to a different register at the point of use (GT_COPY), or it is reloaded to a different register
6264 // than the one it was spilled from (GT_RELOAD).
6267 // block - basic block in which GT_COPY/GT_RELOAD is inserted.
6268 // tree - This is the node to copy or reload.
6269 // Insert copy or reload node between this node and its parent.
6270 // multiRegIdx - register position of tree node for which copy or reload is needed.
6271 // refPosition - The RefPosition at which copy or reload will take place.
6274 // The GT_COPY or GT_RELOAD will be inserted in the proper spot in execution order where the reload is to occur.
6276 // For example, for this tree (numbers are execution order, lower is earlier and higher is later):
6278 // +---------+----------+
6280 // +---------+----------+
6285 // +-------------------+ +----------------------+
6286 // | x (1) | "tree" | y (2) |
6287 // +-------------------+ +----------------------+
6289 // generate this tree:
6291 // +---------+----------+
6293 // +---------+----------+
6298 // +-------------------+ +----------------------+
6299 // | GT_RELOAD (3) | | y (2) |
6300 // +-------------------+ +----------------------+
6302 // +-------------------+
6304 // +-------------------+
6306 // Note in particular that the GT_RELOAD node gets inserted in execution order immediately before the parent of "tree",
6307 // which seems a bit weird since normally a node's parent (in this case, the parent of "x", GT_RELOAD in the "after"
6308 // picture) immediately follows all of its children (that is, normally the execution ordering is postorder).
6309 // The ordering must be this weird "out of normal order" way because the "x" node is being spilled, probably
6310 // because the expression in the tree represented above by "y" has high register requirements. We don't want
6311 // to reload immediately, of course. So we put GT_RELOAD where the reload should actually happen.
6313 // Note that GT_RELOAD is required when we reload to a different register than the one we spilled to. It can also be
6314 // used if we reload to the same register. Normally, though, in that case we just mark the node with GTF_SPILLED,
6315 // and the unspilling code automatically reuses the same register, and does the reload when it notices that flag
6316 // when considering a node's operands.
6318 void LinearScan::insertCopyOrReload(BasicBlock* block, GenTree* tree, unsigned multiRegIdx, RefPosition* refPosition)
6320 LIR::Range& blockRange = LIR::AsRange(block);
6323 bool foundUse = blockRange.TryGetUse(tree, &treeUse);
6326 GenTree* parent = treeUse.User();
6329 if (refPosition->reload)
6337 #if TRACK_LSRA_STATS
6338 updateLsraStat(LSRA_STAT_COPY_REG, block->bbNum);
6342 // If the parent is a reload/copy node, then tree must be a multi-reg call node
6343 // that has already had one of its registers spilled. This is Because multi-reg
6344 // call node is the only node whose RefTypeDef positions get independently
6345 // spilled or reloaded. It is possible that one of its RefTypeDef position got
6346 // spilled and the next use of it requires it to be in a different register.
6348 // In this case set the ith position reg of reload/copy node to the reg allocated
6349 // for copy/reload refPosition. Essentially a copy/reload node will have a reg
6350 // for each multi-reg position of its child. If there is a valid reg in ith
6351 // position of GT_COPY or GT_RELOAD node then the corresponding result of its
6352 // child needs to be copied or reloaded to that reg.
6353 if (parent->IsCopyOrReload())
6355 noway_assert(parent->OperGet() == oper);
6356 noway_assert(tree->IsMultiRegCall());
6357 GenTreeCall* call = tree->AsCall();
6358 GenTreeCopyOrReload* copyOrReload = parent->AsCopyOrReload();
6359 noway_assert(copyOrReload->GetRegNumByIdx(multiRegIdx) == REG_NA);
6360 copyOrReload->SetRegNumByIdx(refPosition->assignedReg(), multiRegIdx);
6364 // Create the new node, with "tree" as its only child.
6365 var_types treeType = tree->TypeGet();
6367 GenTreeCopyOrReload* newNode = new (compiler, oper) GenTreeCopyOrReload(oper, treeType, tree);
6368 assert(refPosition->registerAssignment != RBM_NONE);
6369 SetLsraAdded(newNode);
6370 newNode->SetRegNumByIdx(refPosition->assignedReg(), multiRegIdx);
6371 if (refPosition->copyReg)
6373 // This is a TEMPORARY copy
6374 assert(isCandidateLocalRef(tree));
6375 newNode->gtFlags |= GTF_VAR_DEATH;
6378 // Insert the copy/reload after the spilled node and replace the use of the original node with a use
6379 // of the copy/reload.
6380 blockRange.InsertAfter(tree, newNode);
6381 treeUse.ReplaceWith(compiler, newNode);
6385 #if FEATURE_PARTIAL_SIMD_CALLEE_SAVE
6386 //------------------------------------------------------------------------
6387 // insertUpperVectorSaveAndReload: Insert code to save and restore the upper half of a vector that lives
6388 // in a callee-save register at the point of a kill (the upper half is
6392 // tree - This is the node around which we will insert the Save & Reload.
6393 // It will be a call or some node that turns into a call.
6394 // refPosition - The RefTypeUpperVectorSaveDef RefPosition.
6396 void LinearScan::insertUpperVectorSaveAndReload(GenTree* tree, RefPosition* refPosition, BasicBlock* block)
6398 Interval* lclVarInterval = refPosition->getInterval()->relatedInterval;
6399 assert(lclVarInterval->isLocalVar == true);
6400 LclVarDsc* varDsc = compiler->lvaTable + lclVarInterval->varNum;
6401 assert(varTypeNeedsPartialCalleeSave(varDsc->lvType));
6402 regNumber lclVarReg = lclVarInterval->physReg;
6403 if (lclVarReg == REG_NA)
6408 assert((genRegMask(lclVarReg) & RBM_FLT_CALLEE_SAVED) != RBM_NONE);
6410 regNumber spillReg = refPosition->assignedReg();
6411 bool spillToMem = refPosition->spillAfter;
6413 LIR::Range& blockRange = LIR::AsRange(block);
6415 // First, insert the save before the call.
6417 GenTree* saveLcl = compiler->gtNewLclvNode(lclVarInterval->varNum, varDsc->lvType);
6418 saveLcl->gtRegNum = lclVarReg;
6419 SetLsraAdded(saveLcl);
6421 GenTreeSIMD* simdNode =
6422 new (compiler, GT_SIMD) GenTreeSIMD(LargeVectorSaveType, saveLcl, nullptr, SIMDIntrinsicUpperSave,
6423 varDsc->lvBaseType, genTypeSize(varDsc->lvType));
6424 SetLsraAdded(simdNode);
6425 simdNode->gtRegNum = spillReg;
6428 simdNode->gtFlags |= GTF_SPILL;
6431 blockRange.InsertBefore(tree, LIR::SeqTree(compiler, simdNode));
6433 // Now insert the restore after the call.
6435 GenTree* restoreLcl = compiler->gtNewLclvNode(lclVarInterval->varNum, varDsc->lvType);
6436 restoreLcl->gtRegNum = lclVarReg;
6437 SetLsraAdded(restoreLcl);
6439 simdNode = new (compiler, GT_SIMD) GenTreeSIMD(varDsc->lvType, restoreLcl, nullptr, SIMDIntrinsicUpperRestore,
6440 varDsc->lvBaseType, genTypeSize(varDsc->lvType));
6441 simdNode->gtRegNum = spillReg;
6442 SetLsraAdded(simdNode);
6445 simdNode->gtFlags |= GTF_SPILLED;
6448 blockRange.InsertAfter(tree, LIR::SeqTree(compiler, simdNode));
6450 #endif // FEATURE_PARTIAL_SIMD_CALLEE_SAVE
6452 //------------------------------------------------------------------------
6453 // initMaxSpill: Initializes the LinearScan members used to track the max number
6454 // of concurrent spills. This is needed so that we can set the
6455 // fields in Compiler, so that the code generator, in turn can
6456 // allocate the right number of spill locations.
6465 // This is called before any calls to updateMaxSpill().
6467 void LinearScan::initMaxSpill()
6469 needDoubleTmpForFPCall = false;
6470 needFloatTmpForFPCall = false;
6471 for (int i = 0; i < TYP_COUNT; i++)
6474 currentSpill[i] = 0;
6478 //------------------------------------------------------------------------
6479 // recordMaxSpill: Sets the fields in Compiler for the max number of concurrent spills.
6480 // (See the comment on initMaxSpill.)
6489 // This is called after updateMaxSpill() has been called for all "real"
6492 void LinearScan::recordMaxSpill()
6494 // Note: due to the temp normalization process (see tmpNormalizeType)
6495 // only a few types should actually be seen here.
6496 JITDUMP("Recording the maximum number of concurrent spills:\n");
6498 var_types returnType = compiler->tmpNormalizeType(compiler->info.compRetType);
6499 if (needDoubleTmpForFPCall || (returnType == TYP_DOUBLE))
6501 JITDUMP("Adding a spill temp for moving a double call/return value between xmm reg and x87 stack.\n");
6502 maxSpill[TYP_DOUBLE] += 1;
6504 if (needFloatTmpForFPCall || (returnType == TYP_FLOAT))
6506 JITDUMP("Adding a spill temp for moving a float call/return value between xmm reg and x87 stack.\n");
6507 maxSpill[TYP_FLOAT] += 1;
6509 #endif // _TARGET_X86_
6510 for (int i = 0; i < TYP_COUNT; i++)
6512 if (var_types(i) != compiler->tmpNormalizeType(var_types(i)))
6514 // Only normalized types should have anything in the maxSpill array.
6515 // We assume here that if type 'i' does not normalize to itself, then
6516 // nothing else normalizes to 'i', either.
6517 assert(maxSpill[i] == 0);
6519 if (maxSpill[i] != 0)
6521 JITDUMP(" %s: %d\n", varTypeName(var_types(i)), maxSpill[i]);
6522 compiler->tmpPreAllocateTemps(var_types(i), maxSpill[i]);
6528 //------------------------------------------------------------------------
6529 // updateMaxSpill: Update the maximum number of concurrent spills
6532 // refPosition - the current RefPosition being handled
6538 // The RefPosition has an associated interval (getInterval() will
6539 // otherwise assert).
6542 // This is called for each "real" RefPosition during the writeback
6543 // phase of LSRA. It keeps track of how many concurrently-live
6544 // spills there are, and the largest number seen so far.
6546 void LinearScan::updateMaxSpill(RefPosition* refPosition)
6548 RefType refType = refPosition->refType;
6550 if (refPosition->spillAfter || refPosition->reload ||
6551 (refPosition->AllocateIfProfitable() && refPosition->assignedReg() == REG_NA))
6553 Interval* interval = refPosition->getInterval();
6554 if (!interval->isLocalVar)
6556 // The tmp allocation logic 'normalizes' types to a small number of
6557 // types that need distinct stack locations from each other.
6558 // Those types are currently gc refs, byrefs, <= 4 byte non-GC items,
6559 // 8-byte non-GC items, and 16-byte or 32-byte SIMD vectors.
6560 // LSRA is agnostic to those choices but needs
6561 // to know what they are here.
6564 #if FEATURE_PARTIAL_SIMD_CALLEE_SAVE
6565 if ((refType == RefTypeUpperVectorSaveDef) || (refType == RefTypeUpperVectorSaveUse))
6567 typ = LargeVectorSaveType;
6570 #endif // !FEATURE_PARTIAL_SIMD_CALLEE_SAVE
6572 GenTree* treeNode = refPosition->treeNode;
6573 if (treeNode == nullptr)
6575 assert(RefTypeIsUse(refType));
6576 treeNode = interval->firstRefPosition->treeNode;
6578 assert(treeNode != nullptr);
6580 // In case of multi-reg call nodes, we need to use the type
6581 // of the return register given by multiRegIdx of the refposition.
6582 if (treeNode->IsMultiRegCall())
6584 ReturnTypeDesc* retTypeDesc = treeNode->AsCall()->GetReturnTypeDesc();
6585 typ = retTypeDesc->GetReturnRegType(refPosition->getMultiRegIdx());
6588 else if (treeNode->OperIsPutArgSplit())
6590 typ = treeNode->AsPutArgSplit()->GetRegType(refPosition->getMultiRegIdx());
6592 else if (treeNode->OperIsPutArgReg())
6594 // For double arg regs, the type is changed to long since they must be passed via `r0-r3`.
6595 // However when they get spilled, they should be treated as separated int registers.
6596 var_types typNode = treeNode->TypeGet();
6597 typ = (typNode == TYP_LONG) ? TYP_INT : typNode;
6599 #endif // _TARGET_ARM_
6602 typ = treeNode->TypeGet();
6604 typ = compiler->tmpNormalizeType(typ);
6607 if (refPosition->spillAfter && !refPosition->reload)
6609 currentSpill[typ]++;
6610 if (currentSpill[typ] > maxSpill[typ])
6612 maxSpill[typ] = currentSpill[typ];
6615 else if (refPosition->reload)
6617 assert(currentSpill[typ] > 0);
6618 currentSpill[typ]--;
6620 else if (refPosition->AllocateIfProfitable() && refPosition->assignedReg() == REG_NA)
6622 // A spill temp not getting reloaded into a reg because it is
6623 // marked as allocate if profitable and getting used from its
6624 // memory location. To properly account max spill for typ we
6625 // decrement spill count.
6626 assert(RefTypeIsUse(refType));
6627 assert(currentSpill[typ] > 0);
6628 currentSpill[typ]--;
6630 JITDUMP(" Max spill for %s is %d\n", varTypeName(typ), maxSpill[typ]);
6635 // This is the final phase of register allocation. It writes the register assignments to
6636 // the tree, and performs resolution across joins and backedges.
6638 void LinearScan::resolveRegisters()
6640 // Iterate over the tree and the RefPositions in lockstep
6641 // - annotate the tree with register assignments by setting gtRegNum or gtRegPair (for longs)
6643 // - track globally-live var locations
6644 // - add resolution points at split/merge/critical points as needed
6646 // Need to use the same traversal order as the one that assigns the location numbers.
6648 // Dummy RefPositions have been added at any split, join or critical edge, at the
6649 // point where resolution may be required. These are located:
6650 // - for a split, at the top of the non-adjacent block
6651 // - for a join, at the bottom of the non-adjacent joining block
6652 // - for a critical edge, at the top of the target block of each critical
6654 // Note that a target block may have multiple incoming critical or split edges
6656 // These RefPositions record the expected location of the Interval at that point.
6657 // At each branch, we identify the location of each liveOut interval, and check
6658 // against the RefPositions at the target.
6661 LsraLocation currentLocation = MinLocation;
6663 // Clear register assignments - these will be reestablished as lclVar defs (including RefTypeParamDefs)
6665 if (enregisterLocalVars)
6667 for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
6669 RegRecord* physRegRecord = getRegisterRecord(reg);
6670 Interval* assignedInterval = physRegRecord->assignedInterval;
6671 if (assignedInterval != nullptr)
6673 assignedInterval->assignedReg = nullptr;
6674 assignedInterval->physReg = REG_NA;
6676 physRegRecord->assignedInterval = nullptr;
6677 physRegRecord->recentRefPosition = nullptr;
6680 // Clear "recentRefPosition" for lclVar intervals
6681 for (unsigned varIndex = 0; varIndex < compiler->lvaTrackedCount; varIndex++)
6683 if (localVarIntervals[varIndex] != nullptr)
6685 localVarIntervals[varIndex]->recentRefPosition = nullptr;
6686 localVarIntervals[varIndex]->isActive = false;
6690 assert(compiler->lvaTable[compiler->lvaTrackedToVarNum[varIndex]].lvLRACandidate == false);
6695 // handle incoming arguments and special temps
6696 RefPositionIterator refPosIterator = refPositions.begin();
6697 RefPosition* currentRefPosition = &refPosIterator;
6699 if (enregisterLocalVars)
6701 VarToRegMap entryVarToRegMap = inVarToRegMaps[compiler->fgFirstBB->bbNum];
6702 for (; refPosIterator != refPositions.end() &&
6703 (currentRefPosition->refType == RefTypeParamDef || currentRefPosition->refType == RefTypeZeroInit);
6704 ++refPosIterator, currentRefPosition = &refPosIterator)
6706 Interval* interval = currentRefPosition->getInterval();
6707 assert(interval != nullptr && interval->isLocalVar);
6708 resolveLocalRef(nullptr, nullptr, currentRefPosition);
6709 regNumber reg = REG_STK;
6710 int varIndex = interval->getVarIndex(compiler);
6712 if (!currentRefPosition->spillAfter && currentRefPosition->registerAssignment != RBM_NONE)
6714 reg = currentRefPosition->assignedReg();
6719 interval->isActive = false;
6721 setVarReg(entryVarToRegMap, varIndex, reg);
6726 assert(refPosIterator == refPositions.end() ||
6727 (refPosIterator->refType != RefTypeParamDef && refPosIterator->refType != RefTypeZeroInit));
6730 BasicBlock* insertionBlock = compiler->fgFirstBB;
6731 GenTree* insertionPoint = LIR::AsRange(insertionBlock).FirstNonPhiNode();
6733 // write back assignments
6734 for (block = startBlockSequence(); block != nullptr; block = moveToNextBlock())
6736 assert(curBBNum == block->bbNum);
6738 if (enregisterLocalVars)
6740 // Record the var locations at the start of this block.
6741 // (If it's fgFirstBB, we've already done that above, see entryVarToRegMap)
6743 curBBStartLocation = currentRefPosition->nodeLocation;
6744 if (block != compiler->fgFirstBB)
6746 processBlockStartLocations(block, false);
6749 // Handle the DummyDefs, updating the incoming var location.
6750 for (; refPosIterator != refPositions.end() && currentRefPosition->refType == RefTypeDummyDef;
6751 ++refPosIterator, currentRefPosition = &refPosIterator)
6753 assert(currentRefPosition->isIntervalRef());
6754 // Don't mark dummy defs as reload
6755 currentRefPosition->reload = false;
6756 resolveLocalRef(nullptr, nullptr, currentRefPosition);
6758 if (currentRefPosition->registerAssignment != RBM_NONE)
6760 reg = currentRefPosition->assignedReg();
6765 currentRefPosition->getInterval()->isActive = false;
6767 setInVarRegForBB(curBBNum, currentRefPosition->getInterval()->varNum, reg);
6771 // The next RefPosition should be for the block. Move past it.
6772 assert(refPosIterator != refPositions.end());
6773 assert(currentRefPosition->refType == RefTypeBB);
6775 currentRefPosition = &refPosIterator;
6777 // Handle the RefPositions for the block
6778 for (; refPosIterator != refPositions.end() && currentRefPosition->refType != RefTypeBB &&
6779 currentRefPosition->refType != RefTypeDummyDef;
6780 ++refPosIterator, currentRefPosition = &refPosIterator)
6782 currentLocation = currentRefPosition->nodeLocation;
6784 // Ensure that the spill & copy info is valid.
6785 // First, if it's reload, it must not be copyReg or moveReg
6786 assert(!currentRefPosition->reload || (!currentRefPosition->copyReg && !currentRefPosition->moveReg));
6787 // If it's copyReg it must not be moveReg, and vice-versa
6788 assert(!currentRefPosition->copyReg || !currentRefPosition->moveReg);
6790 switch (currentRefPosition->refType)
6793 case RefTypeUpperVectorSaveUse:
6794 case RefTypeUpperVectorSaveDef:
6795 #endif // FEATURE_SIMD
6798 // These are the ones we're interested in
6801 case RefTypeFixedReg:
6802 // These require no handling at resolution time
6803 assert(currentRefPosition->referent != nullptr);
6804 currentRefPosition->referent->recentRefPosition = currentRefPosition;
6807 // Ignore the ExpUse cases - a RefTypeExpUse would only exist if the
6808 // variable is dead at the entry to the next block. So we'll mark
6809 // it as in its current location and resolution will take care of any
6811 assert(getNextBlock() == nullptr ||
6812 !VarSetOps::IsMember(compiler, getNextBlock()->bbLiveIn,
6813 currentRefPosition->getInterval()->getVarIndex(compiler)));
6814 currentRefPosition->referent->recentRefPosition = currentRefPosition;
6816 case RefTypeKillGCRefs:
6817 // No action to take at resolution time, and no interval to update recentRefPosition for.
6819 case RefTypeDummyDef:
6820 case RefTypeParamDef:
6821 case RefTypeZeroInit:
6822 // Should have handled all of these already
6827 updateMaxSpill(currentRefPosition);
6828 GenTree* treeNode = currentRefPosition->treeNode;
6830 #if FEATURE_PARTIAL_SIMD_CALLEE_SAVE
6831 if (currentRefPosition->refType == RefTypeUpperVectorSaveDef)
6833 // The treeNode must be a call, and this must be a RefPosition for a LargeVectorType LocalVar.
6834 // If the LocalVar is in a callee-save register, we are going to spill its upper half around the call.
6835 // If we have allocated a register to spill it to, we will use that; otherwise, we will spill it
6836 // to the stack. We can use as a temp register any non-arg caller-save register.
6837 noway_assert(treeNode != nullptr);
6838 currentRefPosition->referent->recentRefPosition = currentRefPosition;
6839 insertUpperVectorSaveAndReload(treeNode, currentRefPosition, block);
6841 else if (currentRefPosition->refType == RefTypeUpperVectorSaveUse)
6845 #endif // FEATURE_PARTIAL_SIMD_CALLEE_SAVE
6847 // Most uses won't actually need to be recorded (they're on the def).
6848 // In those cases, treeNode will be nullptr.
6849 if (treeNode == nullptr)
6851 // This is either a use, a dead def, or a field of a struct
6852 Interval* interval = currentRefPosition->getInterval();
6853 assert(currentRefPosition->refType == RefTypeUse ||
6854 currentRefPosition->registerAssignment == RBM_NONE || interval->isStructField);
6856 // TODO-Review: Need to handle the case where any of the struct fields
6857 // are reloaded/spilled at this use
6858 assert(!interval->isStructField ||
6859 (currentRefPosition->reload == false && currentRefPosition->spillAfter == false));
6861 if (interval->isLocalVar && !interval->isStructField)
6863 LclVarDsc* varDsc = interval->getLocalVar(compiler);
6865 // This must be a dead definition. We need to mark the lclVar
6866 // so that it's not considered a candidate for lvRegister, as
6867 // this dead def will have to go to the stack.
6868 assert(currentRefPosition->refType == RefTypeDef);
6869 varDsc->lvRegNum = REG_STK;
6874 if (currentRefPosition->isIntervalRef() && currentRefPosition->getInterval()->isInternal)
6876 treeNode->gtRsvdRegs |= currentRefPosition->registerAssignment;
6880 writeRegisters(currentRefPosition, treeNode);
6882 if (treeNode->IsLocal() && currentRefPosition->getInterval()->isLocalVar)
6884 resolveLocalRef(block, treeNode, currentRefPosition);
6887 // Mark spill locations on temps
6888 // (local vars are handled in resolveLocalRef, above)
6889 // Note that the tree node will be changed from GTF_SPILL to GTF_SPILLED
6890 // in codegen, taking care of the "reload" case for temps
6891 else if (currentRefPosition->spillAfter || (currentRefPosition->nextRefPosition != nullptr &&
6892 currentRefPosition->nextRefPosition->moveReg))
6894 if (treeNode != nullptr && currentRefPosition->isIntervalRef())
6896 if (currentRefPosition->spillAfter)
6898 treeNode->gtFlags |= GTF_SPILL;
6900 // If this is a constant interval that is reusing a pre-existing value, we actually need
6901 // to generate the value at this point in order to spill it.
6902 if (treeNode->IsReuseRegVal())
6904 treeNode->ResetReuseRegVal();
6907 // In case of multi-reg call node, also set spill flag on the
6908 // register specified by multi-reg index of current RefPosition.
6909 // Note that the spill flag on treeNode indicates that one or
6910 // more its allocated registers are in that state.
6911 if (treeNode->IsMultiRegCall())
6913 GenTreeCall* call = treeNode->AsCall();
6914 call->SetRegSpillFlagByIdx(GTF_SPILL, currentRefPosition->getMultiRegIdx());
6917 else if (treeNode->OperIsPutArgSplit())
6919 GenTreePutArgSplit* splitArg = treeNode->AsPutArgSplit();
6920 splitArg->SetRegSpillFlagByIdx(GTF_SPILL, currentRefPosition->getMultiRegIdx());
6922 else if (treeNode->OperIsMultiRegOp())
6924 GenTreeMultiRegOp* multiReg = treeNode->AsMultiRegOp();
6925 multiReg->SetRegSpillFlagByIdx(GTF_SPILL, currentRefPosition->getMultiRegIdx());
6930 // If the value is reloaded or moved to a different register, we need to insert
6931 // a node to hold the register to which it should be reloaded
6932 RefPosition* nextRefPosition = currentRefPosition->nextRefPosition;
6933 assert(nextRefPosition != nullptr);
6934 if (INDEBUG(alwaysInsertReload() ||)
6935 nextRefPosition->assignedReg() != currentRefPosition->assignedReg())
6937 if (nextRefPosition->assignedReg() != REG_NA)
6939 insertCopyOrReload(block, treeNode, currentRefPosition->getMultiRegIdx(),
6944 assert(nextRefPosition->AllocateIfProfitable());
6946 // In case of tree temps, if def is spilled and use didn't
6947 // get a register, set a flag on tree node to be treated as
6948 // contained at the point of its use.
6949 if (currentRefPosition->spillAfter && currentRefPosition->refType == RefTypeDef &&
6950 nextRefPosition->refType == RefTypeUse)
6952 assert(nextRefPosition->treeNode == nullptr);
6953 treeNode->gtFlags |= GTF_NOREG_AT_USE;
6959 // We should never have to "spill after" a temp use, since
6960 // they're single use
6969 if (enregisterLocalVars)
6971 processBlockEndLocations(block);
6975 if (enregisterLocalVars)
6980 printf("-----------------------\n");
6981 printf("RESOLVING BB BOUNDARIES\n");
6982 printf("-----------------------\n");
6984 printf("Resolution Candidates: ");
6985 dumpConvertedVarSet(compiler, resolutionCandidateVars);
6987 printf("Has %sCritical Edges\n\n", hasCriticalEdges ? "" : "No");
6989 printf("Prior to Resolution\n");
6990 foreach_block(compiler, block)
6992 printf("\nBB%02u use def in out\n", block->bbNum);
6993 dumpConvertedVarSet(compiler, block->bbVarUse);
6995 dumpConvertedVarSet(compiler, block->bbVarDef);
6997 dumpConvertedVarSet(compiler, block->bbLiveIn);
6999 dumpConvertedVarSet(compiler, block->bbLiveOut);
7002 dumpInVarToRegMap(block);
7003 dumpOutVarToRegMap(block);
7012 // Verify register assignments on variables
7015 for (lclNum = 0, varDsc = compiler->lvaTable; lclNum < compiler->lvaCount; lclNum++, varDsc++)
7017 if (!isCandidateVar(varDsc))
7019 varDsc->lvRegNum = REG_STK;
7023 Interval* interval = getIntervalForLocalVar(varDsc->lvVarIndex);
7025 // Determine initial position for parameters
7027 if (varDsc->lvIsParam)
7029 regMaskTP initialRegMask = interval->firstRefPosition->registerAssignment;
7030 regNumber initialReg = (initialRegMask == RBM_NONE || interval->firstRefPosition->spillAfter)
7032 : genRegNumFromMask(initialRegMask);
7033 regNumber sourceReg = (varDsc->lvIsRegArg) ? varDsc->lvArgReg : REG_STK;
7036 if (varTypeIsMultiReg(varDsc))
7038 // TODO-ARM-NYI: Map the hi/lo intervals back to lvRegNum and lvOtherReg (these should NYI
7040 assert(!"Multi-reg types not yet supported");
7043 #endif // _TARGET_ARM_
7045 varDsc->lvArgInitReg = initialReg;
7046 JITDUMP(" Set V%02u argument initial register to %s\n", lclNum, getRegName(initialReg));
7049 // Stack args that are part of dependently-promoted structs should never be register candidates (see
7050 // LinearScan::isRegCandidate).
7051 assert(varDsc->lvIsRegArg || !compiler->lvaIsFieldOfDependentlyPromotedStruct(varDsc));
7054 // If lvRegNum is REG_STK, that means that either no register
7055 // was assigned, or (more likely) that the same register was not
7056 // used for all references. In that case, codegen gets the register
7057 // from the tree node.
7058 if (varDsc->lvRegNum == REG_STK || interval->isSpilled || interval->isSplit)
7060 // For codegen purposes, we'll set lvRegNum to whatever register
7061 // it's currently in as we go.
7062 // However, we never mark an interval as lvRegister if it has either been spilled
7064 varDsc->lvRegister = false;
7066 // Skip any dead defs or exposed uses
7067 // (first use exposed will only occur when there is no explicit initialization)
7068 RefPosition* firstRefPosition = interval->firstRefPosition;
7069 while ((firstRefPosition != nullptr) && (firstRefPosition->refType == RefTypeExpUse))
7071 firstRefPosition = firstRefPosition->nextRefPosition;
7073 if (firstRefPosition == nullptr)
7076 varDsc->lvLRACandidate = false;
7077 if (varDsc->lvRefCnt == 0)
7079 varDsc->lvOnFrame = false;
7083 // We may encounter cases where a lclVar actually has no references, but
7084 // a non-zero refCnt. For safety (in case this is some "hidden" lclVar that we're
7085 // not correctly recognizing), we'll mark those as needing a stack location.
7086 // TODO-Cleanup: Make this an assert if/when we correct the refCnt
7088 varDsc->lvOnFrame = true;
7093 // If the interval was not spilled, it doesn't need a stack location.
7094 if (!interval->isSpilled)
7096 varDsc->lvOnFrame = false;
7098 if (firstRefPosition->registerAssignment == RBM_NONE || firstRefPosition->spillAfter)
7100 // Either this RefPosition is spilled, or regOptional or it is not a "real" def or use
7102 firstRefPosition->spillAfter || firstRefPosition->AllocateIfProfitable() ||
7103 (firstRefPosition->refType != RefTypeDef && firstRefPosition->refType != RefTypeUse));
7104 varDsc->lvRegNum = REG_STK;
7108 varDsc->lvRegNum = firstRefPosition->assignedReg();
7115 varDsc->lvRegister = true;
7116 varDsc->lvOnFrame = false;
7119 regMaskTP registerAssignment = genRegMask(varDsc->lvRegNum);
7120 assert(!interval->isSpilled && !interval->isSplit);
7121 RefPosition* refPosition = interval->firstRefPosition;
7122 assert(refPosition != nullptr);
7124 while (refPosition != nullptr)
7126 // All RefPositions must match, except for dead definitions,
7127 // copyReg/moveReg and RefTypeExpUse positions
7128 if (refPosition->registerAssignment != RBM_NONE && !refPosition->copyReg &&
7129 !refPosition->moveReg && refPosition->refType != RefTypeExpUse)
7131 assert(refPosition->registerAssignment == registerAssignment);
7133 refPosition = refPosition->nextRefPosition;
7144 printf("Trees after linear scan register allocator (LSRA)\n");
7145 compiler->fgDispBasicBlocks(true);
7148 verifyFinalAllocation();
7151 compiler->raMarkStkVars();
7154 // TODO-CQ: Review this comment and address as needed.
7155 // Change all unused promoted non-argument struct locals to a non-GC type (in this case TYP_INT)
7156 // so that the gc tracking logic and lvMustInit logic will ignore them.
7157 // Extract the code that does this from raAssignVars, and call it here.
7158 // PRECONDITIONS: Ensure that lvPromoted is set on promoted structs, if and
7159 // only if it is promoted on all paths.
7160 // Call might be something like:
7161 // compiler->BashUnusedStructLocals();
7165 //------------------------------------------------------------------------
7166 // insertMove: Insert a move of a lclVar with the given lclNum into the given block.
7169 // block - the BasicBlock into which the move will be inserted.
7170 // insertionPoint - the instruction before which to insert the move
7171 // lclNum - the lclNum of the var to be moved
7172 // fromReg - the register from which the var is moving
7173 // toReg - the register to which the var is moving
7179 // If insertionPoint is non-NULL, insert before that instruction;
7180 // otherwise, insert "near" the end (prior to the branch, if any).
7181 // If fromReg or toReg is REG_STK, then move from/to memory, respectively.
7183 void LinearScan::insertMove(
7184 BasicBlock* block, GenTree* insertionPoint, unsigned lclNum, regNumber fromReg, regNumber toReg)
7186 LclVarDsc* varDsc = compiler->lvaTable + lclNum;
7187 // the lclVar must be a register candidate
7188 assert(isRegCandidate(varDsc));
7189 // One or both MUST be a register
7190 assert(fromReg != REG_STK || toReg != REG_STK);
7191 // They must not be the same register.
7192 assert(fromReg != toReg);
7194 // This var can't be marked lvRegister now
7195 varDsc->lvRegNum = REG_STK;
7197 GenTree* src = compiler->gtNewLclvNode(lclNum, varDsc->TypeGet());
7200 // There are three cases we need to handle:
7201 // - We are loading a lclVar from the stack.
7202 // - We are storing a lclVar to the stack.
7203 // - We are copying a lclVar between registers.
7205 // In the first and second cases, the lclVar node will be marked with GTF_SPILLED and GTF_SPILL, respectively.
7206 // It is up to the code generator to ensure that any necessary normalization is done when loading or storing the
7209 // In the third case, we generate GT_COPY(GT_LCL_VAR) and type each node with the normalized type of the lclVar.
7210 // This is safe because a lclVar is always normalized once it is in a register.
7213 if (fromReg == REG_STK)
7215 src->gtFlags |= GTF_SPILLED;
7216 src->gtRegNum = toReg;
7218 else if (toReg == REG_STK)
7220 src->gtFlags |= GTF_SPILL;
7221 src->gtRegNum = fromReg;
7225 var_types movType = genActualType(varDsc->TypeGet());
7226 src->gtType = movType;
7228 dst = new (compiler, GT_COPY) GenTreeCopyOrReload(GT_COPY, movType, src);
7229 // This is the new home of the lclVar - indicate that by clearing the GTF_VAR_DEATH flag.
7230 // Note that if src is itself a lastUse, this will have no effect.
7231 dst->gtFlags &= ~(GTF_VAR_DEATH);
7232 src->gtRegNum = fromReg;
7233 dst->gtRegNum = toReg;
7236 dst->SetUnusedValue();
7238 LIR::Range treeRange = LIR::SeqTree(compiler, dst);
7239 LIR::Range& blockRange = LIR::AsRange(block);
7241 if (insertionPoint != nullptr)
7243 blockRange.InsertBefore(insertionPoint, std::move(treeRange));
7247 // Put the copy at the bottom
7248 // If there's a branch, make an embedded statement that executes just prior to the branch
7249 if (block->bbJumpKind == BBJ_COND || block->bbJumpKind == BBJ_SWITCH)
7251 noway_assert(!blockRange.IsEmpty());
7253 GenTree* branch = blockRange.LastNode();
7254 assert(branch->OperIsConditionalJump() || branch->OperGet() == GT_SWITCH_TABLE ||
7255 branch->OperGet() == GT_SWITCH);
7257 blockRange.InsertBefore(branch, std::move(treeRange));
7261 assert(block->bbJumpKind == BBJ_NONE || block->bbJumpKind == BBJ_ALWAYS);
7262 blockRange.InsertAtEnd(std::move(treeRange));
7267 void LinearScan::insertSwap(
7268 BasicBlock* block, GenTree* insertionPoint, unsigned lclNum1, regNumber reg1, unsigned lclNum2, regNumber reg2)
7273 const char* insertionPointString = "top";
7274 if (insertionPoint == nullptr)
7276 insertionPointString = "bottom";
7278 printf(" BB%02u %s: swap V%02u in %s with V%02u in %s\n", block->bbNum, insertionPointString, lclNum1,
7279 getRegName(reg1), lclNum2, getRegName(reg2));
7283 LclVarDsc* varDsc1 = compiler->lvaTable + lclNum1;
7284 LclVarDsc* varDsc2 = compiler->lvaTable + lclNum2;
7285 assert(reg1 != REG_STK && reg1 != REG_NA && reg2 != REG_STK && reg2 != REG_NA);
7287 GenTree* lcl1 = compiler->gtNewLclvNode(lclNum1, varDsc1->TypeGet());
7288 lcl1->gtRegNum = reg1;
7291 GenTree* lcl2 = compiler->gtNewLclvNode(lclNum2, varDsc2->TypeGet());
7292 lcl2->gtRegNum = reg2;
7295 GenTree* swap = compiler->gtNewOperNode(GT_SWAP, TYP_VOID, lcl1, lcl2);
7296 swap->gtRegNum = REG_NA;
7299 lcl1->gtNext = lcl2;
7300 lcl2->gtPrev = lcl1;
7301 lcl2->gtNext = swap;
7302 swap->gtPrev = lcl2;
7304 LIR::Range swapRange = LIR::SeqTree(compiler, swap);
7305 LIR::Range& blockRange = LIR::AsRange(block);
7307 if (insertionPoint != nullptr)
7309 blockRange.InsertBefore(insertionPoint, std::move(swapRange));
7313 // Put the copy at the bottom
7314 // If there's a branch, make an embedded statement that executes just prior to the branch
7315 if (block->bbJumpKind == BBJ_COND || block->bbJumpKind == BBJ_SWITCH)
7317 noway_assert(!blockRange.IsEmpty());
7319 GenTree* branch = blockRange.LastNode();
7320 assert(branch->OperIsConditionalJump() || branch->OperGet() == GT_SWITCH_TABLE ||
7321 branch->OperGet() == GT_SWITCH);
7323 blockRange.InsertBefore(branch, std::move(swapRange));
7327 assert(block->bbJumpKind == BBJ_NONE || block->bbJumpKind == BBJ_ALWAYS);
7328 blockRange.InsertAtEnd(std::move(swapRange));
7333 //------------------------------------------------------------------------
7334 // getTempRegForResolution: Get a free register to use for resolution code.
7337 // fromBlock - The "from" block on the edge being resolved.
7338 // toBlock - The "to"block on the edge
7339 // type - the type of register required
7342 // Returns a register that is free on the given edge, or REG_NA if none is available.
7345 // It is up to the caller to check the return value, and to determine whether a register is
7346 // available, and to handle that case appropriately.
7347 // It is also up to the caller to cache the return value, as this is not cheap to compute.
7349 regNumber LinearScan::getTempRegForResolution(BasicBlock* fromBlock, BasicBlock* toBlock, var_types type)
7351 // TODO-Throughput: This would be much more efficient if we add RegToVarMaps instead of VarToRegMaps
7352 // and they would be more space-efficient as well.
7353 VarToRegMap fromVarToRegMap = getOutVarToRegMap(fromBlock->bbNum);
7354 VarToRegMap toVarToRegMap = getInVarToRegMap(toBlock->bbNum);
7358 if (type == TYP_DOUBLE)
7360 // We have to consider all float registers for TYP_DOUBLE
7361 freeRegs = allRegs(TYP_FLOAT);
7365 freeRegs = allRegs(type);
7367 #else // !_TARGET_ARM_
7368 regMaskTP freeRegs = allRegs(type);
7369 #endif // !_TARGET_ARM_
7372 if (getStressLimitRegs() == LSRA_LIMIT_SMALL_SET)
7377 INDEBUG(freeRegs = stressLimitRegs(nullptr, freeRegs));
7379 // We are only interested in the variables that are live-in to the "to" block.
7380 VarSetOps::Iter iter(compiler, toBlock->bbLiveIn);
7381 unsigned varIndex = 0;
7382 while (iter.NextElem(&varIndex) && freeRegs != RBM_NONE)
7384 regNumber fromReg = getVarReg(fromVarToRegMap, varIndex);
7385 regNumber toReg = getVarReg(toVarToRegMap, varIndex);
7386 assert(fromReg != REG_NA && toReg != REG_NA);
7387 if (fromReg != REG_STK)
7389 freeRegs &= ~genRegMask(fromReg, getIntervalForLocalVar(varIndex)->registerType);
7391 if (toReg != REG_STK)
7393 freeRegs &= ~genRegMask(toReg, getIntervalForLocalVar(varIndex)->registerType);
7398 if (type == TYP_DOUBLE)
7400 // Exclude any doubles for which the odd half isn't in freeRegs.
7401 freeRegs = freeRegs & ((freeRegs << 1) & RBM_ALLDOUBLE);
7405 if (freeRegs == RBM_NONE)
7411 regNumber tempReg = genRegNumFromMask(genFindLowestBit(freeRegs));
7417 //------------------------------------------------------------------------
7418 // addResolutionForDouble: Add resolution move(s) for TYP_DOUBLE interval
7419 // and update location.
7422 // block - the BasicBlock into which the move will be inserted.
7423 // insertionPoint - the instruction before which to insert the move
7424 // sourceIntervals - maintains sourceIntervals[reg] which each 'reg' is associated with
7425 // location - maintains location[reg] which is the location of the var that was originally in 'reg'.
7426 // toReg - the register to which the var is moving
7427 // fromReg - the register from which the var is moving
7428 // resolveType - the type of resolution to be performed
7434 // It inserts at least one move and updates incoming parameter 'location'.
7436 void LinearScan::addResolutionForDouble(BasicBlock* block,
7437 GenTree* insertionPoint,
7438 Interval** sourceIntervals,
7439 regNumberSmall* location,
7442 ResolveType resolveType)
7444 regNumber secondHalfTargetReg = REG_NEXT(fromReg);
7445 Interval* intervalToBeMoved1 = sourceIntervals[fromReg];
7446 Interval* intervalToBeMoved2 = sourceIntervals[secondHalfTargetReg];
7448 assert(!(intervalToBeMoved1 == nullptr && intervalToBeMoved2 == nullptr));
7450 if (intervalToBeMoved1 != nullptr)
7452 if (intervalToBeMoved1->registerType == TYP_DOUBLE)
7454 // TYP_DOUBLE interval occupies a double register, i.e. two float registers.
7455 assert(intervalToBeMoved2 == nullptr);
7456 assert(genIsValidDoubleReg(toReg));
7460 // TYP_FLOAT interval occupies 1st half of double register, i.e. 1st float register
7461 assert(genIsValidFloatReg(toReg));
7463 addResolution(block, insertionPoint, intervalToBeMoved1, toReg, fromReg);
7464 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
7465 location[fromReg] = (regNumberSmall)toReg;
7468 if (intervalToBeMoved2 != nullptr)
7470 // TYP_FLOAT interval occupies 2nd half of double register.
7471 assert(intervalToBeMoved2->registerType == TYP_FLOAT);
7472 regNumber secondHalfTempReg = REG_NEXT(toReg);
7474 addResolution(block, insertionPoint, intervalToBeMoved2, secondHalfTempReg, secondHalfTargetReg);
7475 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
7476 location[secondHalfTargetReg] = (regNumberSmall)secondHalfTempReg;
7481 #endif // _TARGET_ARM_
7483 //------------------------------------------------------------------------
7484 // addResolution: Add a resolution move of the given interval
7487 // block - the BasicBlock into which the move will be inserted.
7488 // insertionPoint - the instruction before which to insert the move
7489 // interval - the interval of the var to be moved
7490 // toReg - the register to which the var is moving
7491 // fromReg - the register from which the var is moving
7497 // For joins, we insert at the bottom (indicated by an insertionPoint
7498 // of nullptr), while for splits we insert at the top.
7499 // This is because for joins 'block' is a pred of the join, while for splits it is a succ.
7500 // For critical edges, this function may be called twice - once to move from
7501 // the source (fromReg), if any, to the stack, in which case toReg will be
7502 // REG_STK, and we insert at the bottom (leave insertionPoint as nullptr).
7503 // The next time, we want to move from the stack to the destination (toReg),
7504 // in which case fromReg will be REG_STK, and we insert at the top.
7506 void LinearScan::addResolution(
7507 BasicBlock* block, GenTree* insertionPoint, Interval* interval, regNumber toReg, regNumber fromReg)
7510 const char* insertionPointString = "top";
7512 if (insertionPoint == nullptr)
7515 insertionPointString = "bottom";
7519 JITDUMP(" BB%02u %s: move V%02u from ", block->bbNum, insertionPointString, interval->varNum);
7520 JITDUMP("%s to %s", getRegName(fromReg), getRegName(toReg));
7522 insertMove(block, insertionPoint, interval->varNum, fromReg, toReg);
7523 if (fromReg == REG_STK || toReg == REG_STK)
7525 assert(interval->isSpilled);
7529 // We should have already marked this as spilled or split.
7530 assert((interval->isSpilled) || (interval->isSplit));
7533 INTRACK_STATS(updateLsraStat(LSRA_STAT_RESOLUTION_MOV, block->bbNum));
7536 //------------------------------------------------------------------------
7537 // handleOutgoingCriticalEdges: Performs the necessary resolution on all critical edges that feed out of 'block'
7540 // block - the block with outgoing critical edges.
7546 // For all outgoing critical edges (i.e. any successor of this block which is
7547 // a join edge), if there are any conflicts, split the edge by adding a new block,
7548 // and generate the resolution code into that block.
7550 void LinearScan::handleOutgoingCriticalEdges(BasicBlock* block)
7552 VARSET_TP outResolutionSet(VarSetOps::Intersection(compiler, block->bbLiveOut, resolutionCandidateVars));
7553 if (VarSetOps::IsEmpty(compiler, outResolutionSet))
7557 VARSET_TP sameResolutionSet(VarSetOps::MakeEmpty(compiler));
7558 VARSET_TP sameLivePathsSet(VarSetOps::MakeEmpty(compiler));
7559 VARSET_TP singleTargetSet(VarSetOps::MakeEmpty(compiler));
7560 VARSET_TP diffResolutionSet(VarSetOps::MakeEmpty(compiler));
7562 // Get the outVarToRegMap for this block
7563 VarToRegMap outVarToRegMap = getOutVarToRegMap(block->bbNum);
7564 unsigned succCount = block->NumSucc(compiler);
7565 assert(succCount > 1);
7566 VarToRegMap firstSuccInVarToRegMap = nullptr;
7567 BasicBlock* firstSucc = nullptr;
7569 // First, determine the live regs at the end of this block so that we know what regs are
7570 // available to copy into.
7571 // Note that for this purpose we use the full live-out set, because we must ensure that
7572 // even the registers that remain the same across the edge are preserved correctly.
7573 regMaskTP liveOutRegs = RBM_NONE;
7574 VarSetOps::Iter liveOutIter(compiler, block->bbLiveOut);
7575 unsigned liveOutVarIndex = 0;
7576 while (liveOutIter.NextElem(&liveOutVarIndex))
7578 regNumber fromReg = getVarReg(outVarToRegMap, liveOutVarIndex);
7579 if (fromReg != REG_STK)
7581 liveOutRegs |= genRegMask(fromReg);
7585 // Next, if this blocks ends with a switch table, we have to make sure not to copy
7586 // into the registers that it uses.
7587 regMaskTP switchRegs = RBM_NONE;
7588 if (block->bbJumpKind == BBJ_SWITCH)
7590 // At this point, Lowering has transformed any non-switch-table blocks into
7592 GenTree* switchTable = LIR::AsRange(block).LastNode();
7593 assert(switchTable != nullptr && switchTable->OperGet() == GT_SWITCH_TABLE);
7595 switchRegs = switchTable->gtRsvdRegs;
7596 GenTree* op1 = switchTable->gtGetOp1();
7597 GenTree* op2 = switchTable->gtGetOp2();
7598 noway_assert(op1 != nullptr && op2 != nullptr);
7599 assert(op1->gtRegNum != REG_NA && op2->gtRegNum != REG_NA);
7600 switchRegs |= genRegMask(op1->gtRegNum);
7601 switchRegs |= genRegMask(op2->gtRegNum);
7604 #ifdef _TARGET_ARM64_
7605 // Next, if this blocks ends with a JCMP, we have to make sure not to copy
7606 // into the register that it uses or modify the local variable it must consume
7607 LclVarDsc* jcmpLocalVarDsc = nullptr;
7608 if (block->bbJumpKind == BBJ_COND)
7610 GenTree* lastNode = LIR::AsRange(block).LastNode();
7612 if (lastNode->OperIs(GT_JCMP))
7614 GenTree* op1 = lastNode->gtGetOp1();
7615 switchRegs |= genRegMask(op1->gtRegNum);
7619 GenTreeLclVarCommon* lcl = op1->AsLclVarCommon();
7620 jcmpLocalVarDsc = &compiler->lvaTable[lcl->gtLclNum];
7626 VarToRegMap sameVarToRegMap = sharedCriticalVarToRegMap;
7627 regMaskTP sameWriteRegs = RBM_NONE;
7628 regMaskTP diffReadRegs = RBM_NONE;
7630 // For each var that may require resolution, classify them as:
7631 // - in the same register at the end of this block and at each target (no resolution needed)
7632 // - in different registers at different targets (resolve separately):
7633 // diffResolutionSet
7634 // - in the same register at each target at which it's live, but different from the end of
7635 // this block. We may be able to resolve these as if it is "join", but only if they do not
7636 // write to any registers that are read by those in the diffResolutionSet:
7637 // sameResolutionSet
7639 VarSetOps::Iter outResolutionSetIter(compiler, outResolutionSet);
7640 unsigned outResolutionSetVarIndex = 0;
7641 while (outResolutionSetIter.NextElem(&outResolutionSetVarIndex))
7643 regNumber fromReg = getVarReg(outVarToRegMap, outResolutionSetVarIndex);
7644 bool isMatch = true;
7645 bool isSame = false;
7646 bool maybeSingleTarget = false;
7647 bool maybeSameLivePaths = false;
7648 bool liveOnlyAtSplitEdge = true;
7649 regNumber sameToReg = REG_NA;
7650 for (unsigned succIndex = 0; succIndex < succCount; succIndex++)
7652 BasicBlock* succBlock = block->GetSucc(succIndex, compiler);
7653 if (!VarSetOps::IsMember(compiler, succBlock->bbLiveIn, outResolutionSetVarIndex))
7655 maybeSameLivePaths = true;
7658 else if (liveOnlyAtSplitEdge)
7660 // Is the var live only at those target blocks which are connected by a split edge to this block
7661 liveOnlyAtSplitEdge = ((succBlock->bbPreds->flNext == nullptr) && (succBlock != compiler->fgFirstBB));
7664 regNumber toReg = getVarReg(getInVarToRegMap(succBlock->bbNum), outResolutionSetVarIndex);
7665 if (sameToReg == REG_NA)
7670 if (toReg == sameToReg)
7678 // Check for the cases where we can't write to a register.
7679 // We only need to check for these cases if sameToReg is an actual register (not REG_STK).
7680 if (sameToReg != REG_NA && sameToReg != REG_STK)
7682 // If there's a path on which this var isn't live, it may use the original value in sameToReg.
7683 // In this case, sameToReg will be in the liveOutRegs of this block.
7684 // Similarly, if sameToReg is in sameWriteRegs, it has already been used (i.e. for a lclVar that's
7685 // live only at another target), and we can't copy another lclVar into that reg in this block.
7686 regMaskTP sameToRegMask = genRegMask(sameToReg);
7687 if (maybeSameLivePaths &&
7688 (((sameToRegMask & liveOutRegs) != RBM_NONE) || ((sameToRegMask & sameWriteRegs) != RBM_NONE)))
7692 // If this register is used by a switch table at the end of the block, we can't do the copy
7693 // in this block (since we can't insert it after the switch).
7694 if ((sameToRegMask & switchRegs) != RBM_NONE)
7699 #ifdef _TARGET_ARM64_
7700 if (jcmpLocalVarDsc && (jcmpLocalVarDsc->lvVarIndex == outResolutionSetVarIndex))
7706 // If the var is live only at those blocks connected by a split edge and not live-in at some of the
7707 // target blocks, we will resolve it the same way as if it were in diffResolutionSet and resolution
7708 // will be deferred to the handling of split edges, which means copy will only be at those target(s).
7710 // Another way to achieve similar resolution for vars live only at split edges is by removing them
7711 // from consideration up-front but it requires that we traverse those edges anyway to account for
7712 // the registers that must note be overwritten.
7713 if (liveOnlyAtSplitEdge && maybeSameLivePaths)
7719 if (sameToReg == REG_NA)
7721 VarSetOps::AddElemD(compiler, diffResolutionSet, outResolutionSetVarIndex);
7722 if (fromReg != REG_STK)
7724 diffReadRegs |= genRegMask(fromReg);
7727 else if (sameToReg != fromReg)
7729 VarSetOps::AddElemD(compiler, sameResolutionSet, outResolutionSetVarIndex);
7730 setVarReg(sameVarToRegMap, outResolutionSetVarIndex, sameToReg);
7731 if (sameToReg != REG_STK)
7733 sameWriteRegs |= genRegMask(sameToReg);
7738 if (!VarSetOps::IsEmpty(compiler, sameResolutionSet))
7740 if ((sameWriteRegs & diffReadRegs) != RBM_NONE)
7742 // We cannot split the "same" and "diff" regs if the "same" set writes registers
7743 // that must be read by the "diff" set. (Note that when these are done as a "batch"
7744 // we carefully order them to ensure all the input regs are read before they are
7746 VarSetOps::UnionD(compiler, diffResolutionSet, sameResolutionSet);
7747 VarSetOps::ClearD(compiler, sameResolutionSet);
7751 // For any vars in the sameResolutionSet, we can simply add the move at the end of "block".
7752 resolveEdge(block, nullptr, ResolveSharedCritical, sameResolutionSet);
7755 if (!VarSetOps::IsEmpty(compiler, diffResolutionSet))
7757 for (unsigned succIndex = 0; succIndex < succCount; succIndex++)
7759 BasicBlock* succBlock = block->GetSucc(succIndex, compiler);
7761 // Any "diffResolutionSet" resolution for a block with no other predecessors will be handled later
7762 // as split resolution.
7763 if ((succBlock->bbPreds->flNext == nullptr) && (succBlock != compiler->fgFirstBB))
7768 // Now collect the resolution set for just this edge, if any.
7769 // Check only the vars in diffResolutionSet that are live-in to this successor.
7770 bool needsResolution = false;
7771 VarToRegMap succInVarToRegMap = getInVarToRegMap(succBlock->bbNum);
7772 VARSET_TP edgeResolutionSet(VarSetOps::Intersection(compiler, diffResolutionSet, succBlock->bbLiveIn));
7773 VarSetOps::Iter iter(compiler, edgeResolutionSet);
7774 unsigned varIndex = 0;
7775 while (iter.NextElem(&varIndex))
7777 regNumber fromReg = getVarReg(outVarToRegMap, varIndex);
7778 regNumber toReg = getVarReg(succInVarToRegMap, varIndex);
7780 if (fromReg == toReg)
7782 VarSetOps::RemoveElemD(compiler, edgeResolutionSet, varIndex);
7785 if (!VarSetOps::IsEmpty(compiler, edgeResolutionSet))
7787 resolveEdge(block, succBlock, ResolveCritical, edgeResolutionSet);
7793 //------------------------------------------------------------------------
7794 // resolveEdges: Perform resolution across basic block edges
7803 // Traverse the basic blocks.
7804 // - If this block has a single predecessor that is not the immediately
7805 // preceding block, perform any needed 'split' resolution at the beginning of this block
7806 // - Otherwise if this block has critical incoming edges, handle them.
7807 // - If this block has a single successor that has multiple predecesors, perform any needed
7808 // 'join' resolution at the end of this block.
7809 // Note that a block may have both 'split' or 'critical' incoming edge(s) and 'join' outgoing
7812 void LinearScan::resolveEdges()
7814 JITDUMP("RESOLVING EDGES\n");
7816 // The resolutionCandidateVars set was initialized with all the lclVars that are live-in to
7817 // any block. We now intersect that set with any lclVars that ever spilled or split.
7818 // If there are no candidates for resoultion, simply return.
7820 VarSetOps::IntersectionD(compiler, resolutionCandidateVars, splitOrSpilledVars);
7821 if (VarSetOps::IsEmpty(compiler, resolutionCandidateVars))
7826 BasicBlock *block, *prevBlock = nullptr;
7828 // Handle all the critical edges first.
7829 // We will try to avoid resolution across critical edges in cases where all the critical-edge
7830 // targets of a block have the same home. We will then split the edges only for the
7831 // remaining mismatches. We visit the out-edges, as that allows us to share the moves that are
7832 // common among allt he targets.
7834 if (hasCriticalEdges)
7836 foreach_block(compiler, block)
7838 if (block->bbNum > bbNumMaxBeforeResolution)
7840 // This is a new block added during resolution - we don't need to visit these now.
7843 if (blockInfo[block->bbNum].hasCriticalOutEdge)
7845 handleOutgoingCriticalEdges(block);
7851 prevBlock = nullptr;
7852 foreach_block(compiler, block)
7854 if (block->bbNum > bbNumMaxBeforeResolution)
7856 // This is a new block added during resolution - we don't need to visit these now.
7860 unsigned succCount = block->NumSucc(compiler);
7861 flowList* preds = block->bbPreds;
7862 BasicBlock* uniquePredBlock = block->GetUniquePred(compiler);
7864 // First, if this block has a single predecessor,
7865 // we may need resolution at the beginning of this block.
7866 // This may be true even if it's the block we used for starting locations,
7867 // if a variable was spilled.
7868 VARSET_TP inResolutionSet(VarSetOps::Intersection(compiler, block->bbLiveIn, resolutionCandidateVars));
7869 if (!VarSetOps::IsEmpty(compiler, inResolutionSet))
7871 if (uniquePredBlock != nullptr)
7873 // We may have split edges during critical edge resolution, and in the process split
7874 // a non-critical edge as well.
7875 // It is unlikely that we would ever have more than one of these in sequence (indeed,
7876 // I don't think it's possible), but there's no need to assume that it can't.
7877 while (uniquePredBlock->bbNum > bbNumMaxBeforeResolution)
7879 uniquePredBlock = uniquePredBlock->GetUniquePred(compiler);
7880 noway_assert(uniquePredBlock != nullptr);
7882 resolveEdge(uniquePredBlock, block, ResolveSplit, inResolutionSet);
7886 // Finally, if this block has a single successor:
7887 // - and that has at least one other predecessor (otherwise we will do the resolution at the
7888 // top of the successor),
7889 // - and that is not the target of a critical edge (otherwise we've already handled it)
7890 // we may need resolution at the end of this block.
7894 BasicBlock* succBlock = block->GetSucc(0, compiler);
7895 if (succBlock->GetUniquePred(compiler) == nullptr)
7897 VARSET_TP outResolutionSet(
7898 VarSetOps::Intersection(compiler, succBlock->bbLiveIn, resolutionCandidateVars));
7899 if (!VarSetOps::IsEmpty(compiler, outResolutionSet))
7901 resolveEdge(block, succBlock, ResolveJoin, outResolutionSet);
7907 // Now, fixup the mapping for any blocks that were adding for edge splitting.
7908 // See the comment prior to the call to fgSplitEdge() in resolveEdge().
7909 // Note that we could fold this loop in with the checking code below, but that
7910 // would only improve the debug case, and would clutter up the code somewhat.
7911 if (compiler->fgBBNumMax > bbNumMaxBeforeResolution)
7913 foreach_block(compiler, block)
7915 if (block->bbNum > bbNumMaxBeforeResolution)
7917 // There may be multiple blocks inserted when we split. But we must always have exactly
7918 // one path (i.e. all blocks must be single-successor and single-predecessor),
7919 // and only one block along the path may be non-empty.
7920 // Note that we may have a newly-inserted block that is empty, but which connects
7921 // two non-resolution blocks. This happens when an edge is split that requires it.
7923 BasicBlock* succBlock = block;
7926 succBlock = succBlock->GetUniqueSucc();
7927 noway_assert(succBlock != nullptr);
7928 } while ((succBlock->bbNum > bbNumMaxBeforeResolution) && succBlock->isEmpty());
7930 BasicBlock* predBlock = block;
7933 predBlock = predBlock->GetUniquePred(compiler);
7934 noway_assert(predBlock != nullptr);
7935 } while ((predBlock->bbNum > bbNumMaxBeforeResolution) && predBlock->isEmpty());
7937 unsigned succBBNum = succBlock->bbNum;
7938 unsigned predBBNum = predBlock->bbNum;
7939 if (block->isEmpty())
7941 // For the case of the empty block, find the non-resolution block (succ or pred).
7942 if (predBBNum > bbNumMaxBeforeResolution)
7944 assert(succBBNum <= bbNumMaxBeforeResolution);
7954 assert((succBBNum <= bbNumMaxBeforeResolution) && (predBBNum <= bbNumMaxBeforeResolution));
7956 SplitEdgeInfo info = {predBBNum, succBBNum};
7957 getSplitBBNumToTargetBBNumMap()->Set(block->bbNum, info);
7963 // Make sure the varToRegMaps match up on all edges.
7964 bool foundMismatch = false;
7965 foreach_block(compiler, block)
7967 if (block->isEmpty() && block->bbNum > bbNumMaxBeforeResolution)
7971 VarToRegMap toVarToRegMap = getInVarToRegMap(block->bbNum);
7972 for (flowList* pred = block->bbPreds; pred != nullptr; pred = pred->flNext)
7974 BasicBlock* predBlock = pred->flBlock;
7975 VarToRegMap fromVarToRegMap = getOutVarToRegMap(predBlock->bbNum);
7976 VarSetOps::Iter iter(compiler, block->bbLiveIn);
7977 unsigned varIndex = 0;
7978 while (iter.NextElem(&varIndex))
7980 regNumber fromReg = getVarReg(fromVarToRegMap, varIndex);
7981 regNumber toReg = getVarReg(toVarToRegMap, varIndex);
7982 if (fromReg != toReg)
7986 foundMismatch = true;
7987 printf("Found mismatched var locations after resolution!\n");
7989 unsigned varNum = compiler->lvaTrackedToVarNum[varIndex];
7990 printf(" V%02u: BB%02u to BB%02u: %s to %s\n", varNum, predBlock->bbNum, block->bbNum,
7991 getRegName(fromReg), getRegName(toReg));
7996 assert(!foundMismatch);
8001 //------------------------------------------------------------------------
8002 // resolveEdge: Perform the specified type of resolution between two blocks.
8005 // fromBlock - the block from which the edge originates
8006 // toBlock - the block at which the edge terminates
8007 // resolveType - the type of resolution to be performed
8008 // liveSet - the set of tracked lclVar indices which may require resolution
8014 // The caller must have performed the analysis to determine the type of the edge.
8017 // This method emits the correctly ordered moves necessary to place variables in the
8018 // correct registers across a Split, Join or Critical edge.
8019 // In order to avoid overwriting register values before they have been moved to their
8020 // new home (register/stack), it first does the register-to-stack moves (to free those
8021 // registers), then the register to register moves, ensuring that the target register
8022 // is free before the move, and then finally the stack to register moves.
8024 void LinearScan::resolveEdge(BasicBlock* fromBlock,
8025 BasicBlock* toBlock,
8026 ResolveType resolveType,
8027 VARSET_VALARG_TP liveSet)
8029 VarToRegMap fromVarToRegMap = getOutVarToRegMap(fromBlock->bbNum);
8030 VarToRegMap toVarToRegMap;
8031 if (resolveType == ResolveSharedCritical)
8033 toVarToRegMap = sharedCriticalVarToRegMap;
8037 toVarToRegMap = getInVarToRegMap(toBlock->bbNum);
8040 // The block to which we add the resolution moves depends on the resolveType
8042 switch (resolveType)
8045 case ResolveSharedCritical:
8051 case ResolveCritical:
8052 // fgSplitEdge may add one or two BasicBlocks. It returns the block that splits
8053 // the edge from 'fromBlock' and 'toBlock', but if it inserts that block right after
8054 // a block with a fall-through it will have to create another block to handle that edge.
8055 // These new blocks can be mapped to existing blocks in order to correctly handle
8056 // the calls to recordVarLocationsAtStartOfBB() from codegen. That mapping is handled
8057 // in resolveEdges(), after all the edge resolution has been done (by calling this
8058 // method for each edge).
8059 block = compiler->fgSplitEdge(fromBlock, toBlock);
8061 // Split edges are counted against fromBlock.
8062 INTRACK_STATS(updateLsraStat(LSRA_STAT_SPLIT_EDGE, fromBlock->bbNum));
8069 #ifndef _TARGET_XARCH_
8070 // We record tempregs for beginning and end of each block.
8071 // For amd64/x86 we only need a tempReg for float - we'll use xchg for int.
8072 // TODO-Throughput: It would be better to determine the tempRegs on demand, but the code below
8073 // modifies the varToRegMaps so we don't have all the correct registers at the time
8074 // we need to get the tempReg.
8075 regNumber tempRegInt =
8076 (resolveType == ResolveSharedCritical) ? REG_NA : getTempRegForResolution(fromBlock, toBlock, TYP_INT);
8077 #endif // !_TARGET_XARCH_
8078 regNumber tempRegFlt = REG_NA;
8079 regNumber tempRegDbl = REG_NA; // Used only for ARM
8080 if ((compiler->compFloatingPointUsed) && (resolveType != ResolveSharedCritical))
8083 // Try to reserve a double register for TYP_DOUBLE and use it for TYP_FLOAT too if available.
8084 tempRegDbl = getTempRegForResolution(fromBlock, toBlock, TYP_DOUBLE);
8085 if (tempRegDbl != REG_NA)
8087 tempRegFlt = tempRegDbl;
8090 #endif // _TARGET_ARM_
8092 tempRegFlt = getTempRegForResolution(fromBlock, toBlock, TYP_FLOAT);
8096 regMaskTP targetRegsToDo = RBM_NONE;
8097 regMaskTP targetRegsReady = RBM_NONE;
8098 regMaskTP targetRegsFromStack = RBM_NONE;
8100 // The following arrays capture the location of the registers as they are moved:
8101 // - location[reg] gives the current location of the var that was originally in 'reg'.
8102 // (Note that a var may be moved more than once.)
8103 // - source[reg] gives the original location of the var that needs to be moved to 'reg'.
8104 // For example, if a var is in rax and needs to be moved to rsi, then we would start with:
8105 // location[rax] == rax
8106 // source[rsi] == rax -- this doesn't change
8107 // Then, if for some reason we need to move it temporary to rbx, we would have:
8108 // location[rax] == rbx
8109 // Once we have completed the move, we will have:
8110 // location[rax] == REG_NA
8111 // This indicates that the var originally in rax is now in its target register.
8113 regNumberSmall location[REG_COUNT];
8114 C_ASSERT(sizeof(char) == sizeof(regNumberSmall)); // for memset to work
8115 memset(location, REG_NA, REG_COUNT);
8116 regNumberSmall source[REG_COUNT];
8117 memset(source, REG_NA, REG_COUNT);
8119 // What interval is this register associated with?
8120 // (associated with incoming reg)
8121 Interval* sourceIntervals[REG_COUNT];
8122 memset(&sourceIntervals, 0, sizeof(sourceIntervals));
8124 // Intervals for vars that need to be loaded from the stack
8125 Interval* stackToRegIntervals[REG_COUNT];
8126 memset(&stackToRegIntervals, 0, sizeof(stackToRegIntervals));
8128 // Get the starting insertion point for the "to" resolution
8129 GenTree* insertionPoint = nullptr;
8130 if (resolveType == ResolveSplit || resolveType == ResolveCritical)
8132 insertionPoint = LIR::AsRange(block).FirstNonPhiNode();
8136 // - Perform all moves from reg to stack (no ordering needed on these)
8137 // - For reg to reg moves, record the current location, associating their
8138 // source location with the target register they need to go into
8139 // - For stack to reg moves (done last, no ordering needed between them)
8140 // record the interval associated with the target reg
8141 // TODO-Throughput: We should be looping over the liveIn and liveOut registers, since
8142 // that will scale better than the live variables
8144 VarSetOps::Iter iter(compiler, liveSet);
8145 unsigned varIndex = 0;
8146 while (iter.NextElem(&varIndex))
8148 regNumber fromReg = getVarReg(fromVarToRegMap, varIndex);
8149 regNumber toReg = getVarReg(toVarToRegMap, varIndex);
8150 if (fromReg == toReg)
8155 // For Critical edges, the location will not change on either side of the edge,
8156 // since we'll add a new block to do the move.
8157 if (resolveType == ResolveSplit)
8159 setVarReg(toVarToRegMap, varIndex, fromReg);
8161 else if (resolveType == ResolveJoin || resolveType == ResolveSharedCritical)
8163 setVarReg(fromVarToRegMap, varIndex, toReg);
8166 assert(fromReg < UCHAR_MAX && toReg < UCHAR_MAX);
8168 Interval* interval = getIntervalForLocalVar(varIndex);
8170 if (fromReg == REG_STK)
8172 stackToRegIntervals[toReg] = interval;
8173 targetRegsFromStack |= genRegMask(toReg);
8175 else if (toReg == REG_STK)
8177 // Do the reg to stack moves now
8178 addResolution(block, insertionPoint, interval, REG_STK, fromReg);
8179 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
8183 location[fromReg] = (regNumberSmall)fromReg;
8184 source[toReg] = (regNumberSmall)fromReg;
8185 sourceIntervals[fromReg] = interval;
8186 targetRegsToDo |= genRegMask(toReg);
8190 // REGISTER to REGISTER MOVES
8192 // First, find all the ones that are ready to move now
8193 regMaskTP targetCandidates = targetRegsToDo;
8194 while (targetCandidates != RBM_NONE)
8196 regMaskTP targetRegMask = genFindLowestBit(targetCandidates);
8197 targetCandidates &= ~targetRegMask;
8198 regNumber targetReg = genRegNumFromMask(targetRegMask);
8199 if (location[targetReg] == REG_NA)
8202 regNumber sourceReg = (regNumber)source[targetReg];
8203 Interval* interval = sourceIntervals[sourceReg];
8204 if (interval->registerType == TYP_DOUBLE)
8206 // For ARM32, make sure that both of the float halves of the double register are available.
8207 assert(genIsValidDoubleReg(targetReg));
8208 regNumber anotherHalfRegNum = REG_NEXT(targetReg);
8209 if (location[anotherHalfRegNum] == REG_NA)
8211 targetRegsReady |= targetRegMask;
8215 #endif // _TARGET_ARM_
8217 targetRegsReady |= targetRegMask;
8222 // Perform reg to reg moves
8223 while (targetRegsToDo != RBM_NONE)
8225 while (targetRegsReady != RBM_NONE)
8227 regMaskTP targetRegMask = genFindLowestBit(targetRegsReady);
8228 targetRegsToDo &= ~targetRegMask;
8229 targetRegsReady &= ~targetRegMask;
8230 regNumber targetReg = genRegNumFromMask(targetRegMask);
8231 assert(location[targetReg] != targetReg);
8232 regNumber sourceReg = (regNumber)source[targetReg];
8233 regNumber fromReg = (regNumber)location[sourceReg];
8234 assert(fromReg < UCHAR_MAX && sourceReg < UCHAR_MAX);
8235 Interval* interval = sourceIntervals[sourceReg];
8236 assert(interval != nullptr);
8237 addResolution(block, insertionPoint, interval, targetReg, fromReg);
8238 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
8239 sourceIntervals[sourceReg] = nullptr;
8240 location[sourceReg] = REG_NA;
8242 // Do we have a free targetReg?
8243 if (fromReg == sourceReg && source[fromReg] != REG_NA)
8245 regMaskTP fromRegMask = genRegMask(fromReg);
8246 targetRegsReady |= fromRegMask;
8249 if (targetRegsToDo != RBM_NONE)
8251 regMaskTP targetRegMask = genFindLowestBit(targetRegsToDo);
8252 regNumber targetReg = genRegNumFromMask(targetRegMask);
8254 // Is it already there due to other moves?
8255 // If not, move it to the temp reg, OR swap it with another register
8256 regNumber sourceReg = (regNumber)source[targetReg];
8257 regNumber fromReg = (regNumber)location[sourceReg];
8258 if (targetReg == fromReg)
8260 targetRegsToDo &= ~targetRegMask;
8264 regNumber tempReg = REG_NA;
8265 bool useSwap = false;
8266 if (emitter::isFloatReg(targetReg))
8269 if (sourceIntervals[fromReg]->registerType == TYP_DOUBLE)
8271 // ARM32 requires a double temp register for TYP_DOUBLE.
8272 tempReg = tempRegDbl;
8275 #endif // _TARGET_ARM_
8276 tempReg = tempRegFlt;
8278 #ifdef _TARGET_XARCH_
8283 #else // !_TARGET_XARCH_
8287 tempReg = tempRegInt;
8290 #endif // !_TARGET_XARCH_
8291 if (useSwap || tempReg == REG_NA)
8293 // First, we have to figure out the destination register for what's currently in fromReg,
8294 // so that we can find its sourceInterval.
8295 regNumber otherTargetReg = REG_NA;
8297 // By chance, is fromReg going where it belongs?
8298 if (location[source[fromReg]] == targetReg)
8300 otherTargetReg = fromReg;
8301 // If we can swap, we will be done with otherTargetReg as well.
8302 // Otherwise, we'll spill it to the stack and reload it later.
8305 regMaskTP fromRegMask = genRegMask(fromReg);
8306 targetRegsToDo &= ~fromRegMask;
8311 // Look at the remaining registers from targetRegsToDo (which we expect to be relatively
8312 // small at this point) to find out what's currently in targetReg.
8313 regMaskTP mask = targetRegsToDo;
8314 while (mask != RBM_NONE && otherTargetReg == REG_NA)
8316 regMaskTP nextRegMask = genFindLowestBit(mask);
8317 regNumber nextReg = genRegNumFromMask(nextRegMask);
8318 mask &= ~nextRegMask;
8319 if (location[source[nextReg]] == targetReg)
8321 otherTargetReg = nextReg;
8325 assert(otherTargetReg != REG_NA);
8329 // Generate a "swap" of fromReg and targetReg
8330 insertSwap(block, insertionPoint, sourceIntervals[source[otherTargetReg]]->varNum, targetReg,
8331 sourceIntervals[sourceReg]->varNum, fromReg);
8332 location[sourceReg] = REG_NA;
8333 location[source[otherTargetReg]] = (regNumberSmall)fromReg;
8335 INTRACK_STATS(updateLsraStat(LSRA_STAT_RESOLUTION_MOV, block->bbNum));
8339 // Spill "targetReg" to the stack and add its eventual target (otherTargetReg)
8340 // to "targetRegsFromStack", which will be handled below.
8341 // NOTE: This condition is very rare. Setting COMPlus_JitStressRegs=0x203
8342 // has been known to trigger it in JIT SH.
8344 // First, spill "otherInterval" from targetReg to the stack.
8345 Interval* otherInterval = sourceIntervals[source[otherTargetReg]];
8346 setIntervalAsSpilled(otherInterval);
8347 addResolution(block, insertionPoint, otherInterval, REG_STK, targetReg);
8348 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
8349 location[source[otherTargetReg]] = REG_STK;
8351 // Now, move the interval that is going to targetReg, and add its "fromReg" to
8352 // "targetRegsReady".
8353 addResolution(block, insertionPoint, sourceIntervals[sourceReg], targetReg, fromReg);
8354 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
8355 location[sourceReg] = REG_NA;
8356 targetRegsReady |= genRegMask(fromReg);
8358 targetRegsToDo &= ~targetRegMask;
8362 compiler->codeGen->regSet.rsSetRegsModified(genRegMask(tempReg) DEBUGARG(true));
8364 if (sourceIntervals[fromReg]->registerType == TYP_DOUBLE)
8366 assert(genIsValidDoubleReg(targetReg));
8367 assert(genIsValidDoubleReg(tempReg));
8369 addResolutionForDouble(block, insertionPoint, sourceIntervals, location, tempReg, targetReg,
8373 #endif // _TARGET_ARM_
8375 assert(sourceIntervals[targetReg] != nullptr);
8377 addResolution(block, insertionPoint, sourceIntervals[targetReg], tempReg, targetReg);
8378 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
8379 location[targetReg] = (regNumberSmall)tempReg;
8381 targetRegsReady |= targetRegMask;
8387 // Finally, perform stack to reg moves
8388 // All the target regs will be empty at this point
8389 while (targetRegsFromStack != RBM_NONE)
8391 regMaskTP targetRegMask = genFindLowestBit(targetRegsFromStack);
8392 targetRegsFromStack &= ~targetRegMask;
8393 regNumber targetReg = genRegNumFromMask(targetRegMask);
8395 Interval* interval = stackToRegIntervals[targetReg];
8396 assert(interval != nullptr);
8398 addResolution(block, insertionPoint, interval, targetReg, REG_STK);
8399 JITDUMP(" (%s)\n", resolveTypeName[resolveType]);
8403 #if TRACK_LSRA_STATS
8404 // ----------------------------------------------------------
8405 // updateLsraStat: Increment LSRA stat counter.
8408 // stat - LSRA stat enum
8409 // bbNum - Basic block to which LSRA stat needs to be
8412 void LinearScan::updateLsraStat(LsraStat stat, unsigned bbNum)
8414 if (bbNum > bbNumMaxBeforeResolution)
8416 // This is a newly created basic block as part of resolution.
8417 // These blocks contain resolution moves that are already accounted.
8423 case LSRA_STAT_SPILL:
8424 ++(blockInfo[bbNum].spillCount);
8427 case LSRA_STAT_COPY_REG:
8428 ++(blockInfo[bbNum].copyRegCount);
8431 case LSRA_STAT_RESOLUTION_MOV:
8432 ++(blockInfo[bbNum].resolutionMovCount);
8435 case LSRA_STAT_SPLIT_EDGE:
8436 ++(blockInfo[bbNum].splitEdgeCount);
8444 // -----------------------------------------------------------
8445 // dumpLsraStats - dumps Lsra stats to given file.
8448 // file - file to which stats are to be written.
8450 void LinearScan::dumpLsraStats(FILE* file)
8452 unsigned sumSpillCount = 0;
8453 unsigned sumCopyRegCount = 0;
8454 unsigned sumResolutionMovCount = 0;
8455 unsigned sumSplitEdgeCount = 0;
8456 UINT64 wtdSpillCount = 0;
8457 UINT64 wtdCopyRegCount = 0;
8458 UINT64 wtdResolutionMovCount = 0;
8460 fprintf(file, "----------\n");
8461 fprintf(file, "LSRA Stats");
8465 fprintf(file, " : %s\n", compiler->info.compFullName);
8469 // In verbose mode no need to print full name
8470 // while printing lsra stats.
8471 fprintf(file, "\n");
8474 fprintf(file, " : %s\n", compiler->eeGetMethodFullName(compiler->info.compCompHnd));
8477 fprintf(file, "----------\n");
8479 for (BasicBlock* block = compiler->fgFirstBB; block != nullptr; block = block->bbNext)
8481 if (block->bbNum > bbNumMaxBeforeResolution)
8486 unsigned spillCount = blockInfo[block->bbNum].spillCount;
8487 unsigned copyRegCount = blockInfo[block->bbNum].copyRegCount;
8488 unsigned resolutionMovCount = blockInfo[block->bbNum].resolutionMovCount;
8489 unsigned splitEdgeCount = blockInfo[block->bbNum].splitEdgeCount;
8491 if (spillCount != 0 || copyRegCount != 0 || resolutionMovCount != 0 || splitEdgeCount != 0)
8493 fprintf(file, "BB%02u [%8d]: ", block->bbNum, block->bbWeight);
8494 fprintf(file, "SpillCount = %d, ResolutionMovs = %d, SplitEdges = %d, CopyReg = %d\n", spillCount,
8495 resolutionMovCount, splitEdgeCount, copyRegCount);
8498 sumSpillCount += spillCount;
8499 sumCopyRegCount += copyRegCount;
8500 sumResolutionMovCount += resolutionMovCount;
8501 sumSplitEdgeCount += splitEdgeCount;
8503 wtdSpillCount += (UINT64)spillCount * block->bbWeight;
8504 wtdCopyRegCount += (UINT64)copyRegCount * block->bbWeight;
8505 wtdResolutionMovCount += (UINT64)resolutionMovCount * block->bbWeight;
8508 fprintf(file, "Total Tracked Vars: %d\n", compiler->lvaTrackedCount);
8509 fprintf(file, "Total Reg Cand Vars: %d\n", regCandidateVarCount);
8510 fprintf(file, "Total number of Intervals: %d\n", static_cast<unsigned>(intervals.size() - 1));
8511 fprintf(file, "Total number of RefPositions: %d\n", static_cast<unsigned>(refPositions.size() - 1));
8512 fprintf(file, "Total Spill Count: %d Weighted: %I64u\n", sumSpillCount, wtdSpillCount);
8513 fprintf(file, "Total CopyReg Count: %d Weighted: %I64u\n", sumCopyRegCount, wtdCopyRegCount);
8514 fprintf(file, "Total ResolutionMov Count: %d Weighted: %I64u\n", sumResolutionMovCount, wtdResolutionMovCount);
8515 fprintf(file, "Total number of split edges: %d\n", sumSplitEdgeCount);
8517 // compute total number of spill temps created
8518 unsigned numSpillTemps = 0;
8519 for (int i = 0; i < TYP_COUNT; i++)
8521 numSpillTemps += maxSpill[i];
8523 fprintf(file, "Total Number of spill temps created: %d\n\n", numSpillTemps);
8525 #endif // TRACK_LSRA_STATS
8528 void dumpRegMask(regMaskTP regs)
8530 if (regs == RBM_ALLINT)
8534 else if (regs == (RBM_ALLINT & ~RBM_FPBASE))
8536 printf("[allIntButFP]");
8538 else if (regs == RBM_ALLFLOAT)
8540 printf("[allFloat]");
8542 else if (regs == RBM_ALLDOUBLE)
8544 printf("[allDouble]");
8552 static const char* getRefTypeName(RefType refType)
8556 #define DEF_REFTYPE(memberName, memberValue, shortName) \
8559 #include "lsra_reftypes.h"
8566 static const char* getRefTypeShortName(RefType refType)
8570 #define DEF_REFTYPE(memberName, memberValue, shortName) \
8573 #include "lsra_reftypes.h"
8580 void RefPosition::dump()
8582 printf("<RefPosition #%-3u @%-3u", rpNum, nodeLocation);
8584 if (nextRefPosition)
8586 printf(" ->#%-3u", nextRefPosition->rpNum);
8589 printf(" %s ", getRefTypeName(refType));
8591 if (this->isPhysRegRef)
8593 this->getReg()->tinyDump();
8595 else if (getInterval())
8597 this->getInterval()->tinyDump();
8602 printf("%s ", treeNode->OpName(treeNode->OperGet()));
8604 printf("BB%02u ", this->bbNum);
8607 dumpRegMask(registerAssignment);
8609 printf(" minReg=%d", minRegCandidateCount);
8619 if (this->spillAfter)
8621 printf(" spillAfter");
8631 if (this->isFixedRegRef)
8635 if (this->isLocalDefUse)
8639 if (this->delayRegFree)
8643 if (this->outOfOrder)
8645 printf(" outOfOrder");
8648 if (this->AllocateIfProfitable())
8650 printf(" regOptional");
8655 void RegRecord::dump()
8660 void Interval::dump()
8662 printf("Interval %2u:", intervalIndex);
8666 printf(" (V%02u)", varNum);
8670 printf(" (INTERNAL)");
8674 printf(" (SPILLED)");
8682 printf(" (struct)");
8684 if (isPromotedStruct)
8686 printf(" (promoted struct)");
8688 if (hasConflictingDefUse)
8690 printf(" (def-use conflict)");
8692 if (hasInterferingUses)
8694 printf(" (interfering uses)");
8696 if (isSpecialPutArg)
8698 printf(" (specialPutArg)");
8702 printf(" (constant)");
8706 printf(" (multireg)");
8709 printf(" RefPositions {");
8710 for (RefPosition* refPosition = this->firstRefPosition; refPosition != nullptr;
8711 refPosition = refPosition->nextRefPosition)
8713 printf("#%u@%u", refPosition->rpNum, refPosition->nodeLocation);
8714 if (refPosition->nextRefPosition)
8721 // this is not used (yet?)
8722 // printf(" SpillOffset %d", this->spillOffset);
8724 printf(" physReg:%s", getRegName(physReg));
8726 printf(" Preferences=");
8727 dumpRegMask(this->registerPreferences);
8729 if (relatedInterval)
8731 printf(" RelatedInterval ");
8732 relatedInterval->microDump();
8733 printf("[%p]", dspPtr(relatedInterval));
8739 // print out very concise representation
8740 void Interval::tinyDump()
8742 printf("<Ivl:%u", intervalIndex);
8745 printf(" V%02u", varNum);
8749 printf(" internal");
8754 // print out extremely concise representation
8755 void Interval::microDump()
8757 char intervalTypeChar = 'I';
8760 intervalTypeChar = 'T';
8762 else if (isLocalVar)
8764 intervalTypeChar = 'L';
8767 printf("<%c%u>", intervalTypeChar, intervalIndex);
8770 void RegRecord::tinyDump()
8772 printf("<Reg:%-3s> ", getRegName(regNum));
8775 void TreeNodeInfo::dump(LinearScan* lsra)
8777 printf("<TreeNodeInfo %d=%d %di %df", dstCount, srcCount, internalIntCount, internalFloatCount);
8779 dumpRegMask(getSrcCandidates(lsra));
8781 dumpRegMask(getInternalCandidates(lsra));
8783 dumpRegMask(getDstCandidates(lsra));
8800 if (isInternalRegDelayFree)
8807 void LinearScan::dumpDefList()
8809 JITDUMP("DefList: { ");
8811 for (LocationInfoListNode *listNode = defList.Begin(), *end = defList.End(); listNode != end;
8812 listNode = listNode->Next())
8814 GenTree* node = listNode->treeNode;
8815 JITDUMP("%sN%03u.t%d. %s", first ? "" : "; ", node->gtSeqNum, node->gtTreeID, GenTree::OpName(node->OperGet()));
8821 void LinearScan::lsraDumpIntervals(const char* msg)
8823 printf("\nLinear scan intervals %s:\n", msg);
8824 for (Interval& interval : intervals)
8826 // only dump something if it has references
8827 // if (interval->firstRefPosition)
8834 // Dumps a tree node as a destination or source operand, with the style
8835 // of dump dependent on the mode
8836 void LinearScan::lsraGetOperandString(GenTree* tree,
8837 LsraTupleDumpMode mode,
8838 char* operandString,
8839 unsigned operandStringLength)
8841 const char* lastUseChar = "";
8842 if ((tree->gtFlags & GTF_VAR_DEATH) != 0)
8848 case LinearScan::LSRA_DUMP_PRE:
8849 _snprintf_s(operandString, operandStringLength, operandStringLength, "t%d%s", tree->gtTreeID, lastUseChar);
8851 case LinearScan::LSRA_DUMP_REFPOS:
8852 _snprintf_s(operandString, operandStringLength, operandStringLength, "t%d%s", tree->gtTreeID, lastUseChar);
8854 case LinearScan::LSRA_DUMP_POST:
8856 Compiler* compiler = JitTls::GetCompiler();
8858 if (!tree->gtHasReg())
8860 _snprintf_s(operandString, operandStringLength, operandStringLength, "STK%s", lastUseChar);
8864 _snprintf_s(operandString, operandStringLength, operandStringLength, "%s%s",
8865 getRegName(tree->gtRegNum, useFloatReg(tree->TypeGet())), lastUseChar);
8870 printf("ERROR: INVALID TUPLE DUMP MODE\n");
8874 void LinearScan::lsraDispNode(GenTree* tree, LsraTupleDumpMode mode, bool hasDest)
8876 Compiler* compiler = JitTls::GetCompiler();
8877 const unsigned operandStringLength = 16;
8878 char operandString[operandStringLength];
8879 const char* emptyDestOperand = " ";
8880 char spillChar = ' ';
8882 if (mode == LinearScan::LSRA_DUMP_POST)
8884 if ((tree->gtFlags & GTF_SPILL) != 0)
8888 if (!hasDest && tree->gtHasReg())
8890 // A node can define a register, but not produce a value for a parent to consume,
8891 // i.e. in the "localDefUse" case.
8892 // There used to be an assert here that we wouldn't spill such a node.
8893 // However, we can have unused lclVars that wind up being the node at which
8894 // it is spilled. This probably indicates a bug, but we don't realy want to
8895 // assert during a dump.
8896 if (spillChar == 'S')
8907 printf("%c N%03u. ", spillChar, tree->gtSeqNum);
8909 LclVarDsc* varDsc = nullptr;
8910 unsigned varNum = UINT_MAX;
8911 if (tree->IsLocal())
8913 varNum = tree->gtLclVarCommon.gtLclNum;
8914 varDsc = &(compiler->lvaTable[varNum]);
8915 if (varDsc->lvLRACandidate)
8922 if (mode == LinearScan::LSRA_DUMP_POST && tree->gtFlags & GTF_SPILLED)
8924 assert(tree->gtHasReg());
8926 lsraGetOperandString(tree, mode, operandString, operandStringLength);
8927 printf("%-15s =", operandString);
8931 printf("%-15s ", emptyDestOperand);
8933 if (varDsc != nullptr)
8935 if (varDsc->lvLRACandidate)
8937 if (mode == LSRA_DUMP_REFPOS)
8939 printf(" V%02u(L%d)", varNum, getIntervalForLocalVar(varDsc->lvVarIndex)->intervalIndex);
8943 lsraGetOperandString(tree, mode, operandString, operandStringLength);
8944 printf(" V%02u(%s)", varNum, operandString);
8945 if (mode == LinearScan::LSRA_DUMP_POST && tree->gtFlags & GTF_SPILLED)
8953 printf(" V%02u MEM", varNum);
8956 else if (tree->OperIsAssignment())
8958 assert(!tree->gtHasReg());
8959 printf(" asg%s ", GenTree::OpName(tree->OperGet()));
8963 compiler->gtDispNodeName(tree);
8964 if (tree->OperKind() & GTK_LEAF)
8966 compiler->gtDispLeaf(tree, nullptr);
8971 //------------------------------------------------------------------------
8972 // DumpOperandDefs: dumps the registers defined by a node.
8975 // operand - The operand for which to compute a register count.
8978 // The number of registers defined by `operand`.
8980 void LinearScan::DumpOperandDefs(
8981 GenTree* operand, bool& first, LsraTupleDumpMode mode, char* operandString, const unsigned operandStringLength)
8983 assert(operand != nullptr);
8984 assert(operandString != nullptr);
8985 if (!operand->IsLIR())
8990 int dstCount = ComputeOperandDstCount(operand);
8994 // This operand directly produces registers; print it.
8995 for (int i = 0; i < dstCount; i++)
9002 lsraGetOperandString(operand, mode, operandString, operandStringLength);
9003 printf("%s", operandString);
9008 else if (operand->isContained())
9010 // This is a contained node. Dump the defs produced by its operands.
9011 for (GenTree* op : operand->Operands())
9013 DumpOperandDefs(op, first, mode, operandString, operandStringLength);
9018 void LinearScan::TupleStyleDump(LsraTupleDumpMode mode)
9021 LsraLocation currentLoc = 1; // 0 is the entry
9022 const unsigned operandStringLength = 16;
9023 char operandString[operandStringLength];
9025 // currentRefPosition is not used for LSRA_DUMP_PRE
9026 // We keep separate iterators for defs, so that we can print them
9027 // on the lhs of the dump
9028 RefPositionIterator refPosIterator = refPositions.begin();
9029 RefPosition* currentRefPosition = &refPosIterator;
9034 printf("TUPLE STYLE DUMP BEFORE LSRA\n");
9036 case LSRA_DUMP_REFPOS:
9037 printf("TUPLE STYLE DUMP WITH REF POSITIONS\n");
9039 case LSRA_DUMP_POST:
9040 printf("TUPLE STYLE DUMP WITH REGISTER ASSIGNMENTS\n");
9043 printf("ERROR: INVALID TUPLE DUMP MODE\n");
9047 if (mode != LSRA_DUMP_PRE)
9049 printf("Incoming Parameters: ");
9050 for (; refPosIterator != refPositions.end() && currentRefPosition->refType != RefTypeBB;
9051 ++refPosIterator, currentRefPosition = &refPosIterator)
9053 Interval* interval = currentRefPosition->getInterval();
9054 assert(interval != nullptr && interval->isLocalVar);
9055 printf(" V%02d", interval->varNum);
9056 if (mode == LSRA_DUMP_POST)
9059 if (currentRefPosition->registerAssignment == RBM_NONE)
9065 reg = currentRefPosition->assignedReg();
9067 LclVarDsc* varDsc = &(compiler->lvaTable[interval->varNum]);
9069 regNumber assignedReg = varDsc->lvRegNum;
9070 regNumber argReg = (varDsc->lvIsRegArg) ? varDsc->lvArgReg : REG_STK;
9072 assert(reg == assignedReg || varDsc->lvRegister == false);
9075 printf(getRegName(argReg, isFloatRegType(interval->registerType)));
9078 printf("%s)", getRegName(reg, isFloatRegType(interval->registerType)));
9084 for (block = startBlockSequence(); block != nullptr; block = moveToNextBlock())
9088 if (mode == LSRA_DUMP_REFPOS)
9090 bool printedBlockHeader = false;
9091 // We should find the boundary RefPositions in the order of exposed uses, dummy defs, and the blocks
9092 for (; refPosIterator != refPositions.end() &&
9093 (currentRefPosition->refType == RefTypeExpUse || currentRefPosition->refType == RefTypeDummyDef ||
9094 (currentRefPosition->refType == RefTypeBB && !printedBlockHeader));
9095 ++refPosIterator, currentRefPosition = &refPosIterator)
9097 Interval* interval = nullptr;
9098 if (currentRefPosition->isIntervalRef())
9100 interval = currentRefPosition->getInterval();
9102 switch (currentRefPosition->refType)
9105 assert(interval != nullptr);
9106 assert(interval->isLocalVar);
9107 printf(" Exposed use of V%02u at #%d\n", interval->varNum, currentRefPosition->rpNum);
9109 case RefTypeDummyDef:
9110 assert(interval != nullptr);
9111 assert(interval->isLocalVar);
9112 printf(" Dummy def of V%02u at #%d\n", interval->varNum, currentRefPosition->rpNum);
9115 block->dspBlockHeader(compiler);
9116 printedBlockHeader = true;
9120 printf("Unexpected RefPosition type at #%d\n", currentRefPosition->rpNum);
9127 block->dspBlockHeader(compiler);
9130 if (enregisterLocalVars && mode == LSRA_DUMP_POST && block != compiler->fgFirstBB &&
9131 block->bbNum <= bbNumMaxBeforeResolution)
9133 printf("Predecessor for variable locations: BB%02u\n", blockInfo[block->bbNum].predBBNum);
9134 dumpInVarToRegMap(block);
9136 if (block->bbNum > bbNumMaxBeforeResolution)
9138 SplitEdgeInfo splitEdgeInfo;
9139 splitBBNumToTargetBBNumMap->Lookup(block->bbNum, &splitEdgeInfo);
9140 assert(splitEdgeInfo.toBBNum <= bbNumMaxBeforeResolution);
9141 assert(splitEdgeInfo.fromBBNum <= bbNumMaxBeforeResolution);
9142 printf("New block introduced for resolution from BB%02u to BB%02u\n", splitEdgeInfo.fromBBNum,
9143 splitEdgeInfo.toBBNum);
9146 for (GenTree* node : LIR::AsRange(block).NonPhiNodes())
9148 GenTree* tree = node;
9150 genTreeOps oper = tree->OperGet();
9151 int produce = tree->IsValue() ? ComputeOperandDstCount(tree) : 0;
9152 int consume = ComputeAvailableSrcCount(tree);
9153 regMaskTP killMask = RBM_NONE;
9154 regMaskTP fixedMask = RBM_NONE;
9156 lsraDispNode(tree, mode, produce != 0 && mode != LSRA_DUMP_REFPOS);
9158 if (mode != LSRA_DUMP_REFPOS)
9165 for (GenTree* operand : tree->Operands())
9167 DumpOperandDefs(operand, first, mode, operandString, operandStringLength);
9173 // Print each RefPosition on a new line, but
9174 // printing all the kills for each node on a single line
9175 // and combining the fixed regs with their associated def or use
9176 bool killPrinted = false;
9177 RefPosition* lastFixedRegRefPos = nullptr;
9178 for (; refPosIterator != refPositions.end() &&
9179 (currentRefPosition->refType == RefTypeUse || currentRefPosition->refType == RefTypeFixedReg ||
9180 currentRefPosition->refType == RefTypeKill || currentRefPosition->refType == RefTypeDef) &&
9181 (currentRefPosition->nodeLocation == tree->gtSeqNum ||
9182 currentRefPosition->nodeLocation == tree->gtSeqNum + 1);
9183 ++refPosIterator, currentRefPosition = &refPosIterator)
9185 Interval* interval = nullptr;
9186 if (currentRefPosition->isIntervalRef())
9188 interval = currentRefPosition->getInterval();
9190 switch (currentRefPosition->refType)
9193 if (currentRefPosition->isPhysRegRef)
9195 printf("\n Use:R%d(#%d)",
9196 currentRefPosition->getReg()->regNum, currentRefPosition->rpNum);
9200 assert(interval != nullptr);
9202 interval->microDump();
9203 printf("(#%d)", currentRefPosition->rpNum);
9204 if (currentRefPosition->isFixedRegRef)
9206 assert(genMaxOneBit(currentRefPosition->registerAssignment));
9207 assert(lastFixedRegRefPos != nullptr);
9208 printf(" Fixed:%s(#%d)", getRegName(currentRefPosition->assignedReg(),
9209 isFloatRegType(interval->registerType)),
9210 lastFixedRegRefPos->rpNum);
9211 lastFixedRegRefPos = nullptr;
9213 if (currentRefPosition->isLocalDefUse)
9215 printf(" LocalDefUse");
9217 if (currentRefPosition->lastUse)
9225 // Print each def on a new line
9226 assert(interval != nullptr);
9228 interval->microDump();
9229 printf("(#%d)", currentRefPosition->rpNum);
9230 if (currentRefPosition->isFixedRegRef)
9232 assert(genMaxOneBit(currentRefPosition->registerAssignment));
9233 printf(" %s", getRegName(currentRefPosition->assignedReg(),
9234 isFloatRegType(interval->registerType)));
9236 if (currentRefPosition->isLocalDefUse)
9238 printf(" LocalDefUse");
9240 if (currentRefPosition->lastUse)
9244 if (interval->relatedInterval != nullptr)
9247 interval->relatedInterval->microDump();
9254 printf("\n Kill: ");
9257 printf(getRegName(currentRefPosition->assignedReg(),
9258 isFloatRegType(currentRefPosition->getReg()->registerType)));
9261 case RefTypeFixedReg:
9262 lastFixedRegRefPos = currentRefPosition;
9265 printf("Unexpected RefPosition type at #%d\n", currentRefPosition->rpNum);
9272 if (enregisterLocalVars && mode == LSRA_DUMP_POST)
9274 dumpOutVarToRegMap(block);
9281 void LinearScan::dumpLsraAllocationEvent(LsraDumpEvent event,
9284 BasicBlock* currentBlock)
9290 if ((interval != nullptr) && (reg != REG_NA) && (reg != REG_STK))
9292 registersToDump |= genRegMask(reg);
9293 dumpRegRecordTitleIfNeeded();
9298 // Conflicting def/use
9299 case LSRA_EVENT_DEFUSE_CONFLICT:
9300 dumpRefPositionShort(activeRefPosition, currentBlock);
9301 printf("DUconflict ");
9304 case LSRA_EVENT_DEFUSE_CASE1:
9305 printf(indentFormat, " Case #1 use defRegAssignment");
9308 case LSRA_EVENT_DEFUSE_CASE2:
9309 printf(indentFormat, " Case #2 use useRegAssignment");
9312 case LSRA_EVENT_DEFUSE_CASE3:
9313 printf(indentFormat, " Case #3 use useRegAssignment");
9317 case LSRA_EVENT_DEFUSE_CASE4:
9318 printf(indentFormat, " Case #4 use defRegAssignment");
9321 case LSRA_EVENT_DEFUSE_CASE5:
9322 printf(indentFormat, " Case #5 set def to all regs");
9325 case LSRA_EVENT_DEFUSE_CASE6:
9326 printf(indentFormat, " Case #6 need a copy");
9330 case LSRA_EVENT_SPILL:
9331 dumpRefPositionShort(activeRefPosition, currentBlock);
9332 assert(interval != nullptr && interval->assignedReg != nullptr);
9333 printf("Spill %-4s ", getRegName(interval->assignedReg->regNum));
9337 // Restoring the previous register
9338 case LSRA_EVENT_RESTORE_PREVIOUS_INTERVAL_AFTER_SPILL:
9339 assert(interval != nullptr);
9340 dumpRefPositionShort(activeRefPosition, currentBlock);
9341 printf("SRstr %-4s ", getRegName(reg));
9345 case LSRA_EVENT_RESTORE_PREVIOUS_INTERVAL:
9346 assert(interval != nullptr);
9347 if (activeRefPosition == nullptr)
9349 printf(emptyRefPositionFormat, "");
9353 dumpRefPositionShort(activeRefPosition, currentBlock);
9355 printf("Restr %-4s ", getRegName(reg));
9357 if (activeRefPosition != nullptr)
9359 printf(emptyRefPositionFormat, "");
9363 // Done with GC Kills
9364 case LSRA_EVENT_DONE_KILL_GC_REFS:
9365 printf(indentFormat, " DoneKillGC ");
9369 case LSRA_EVENT_START_BB:
9370 assert(currentBlock != nullptr);
9371 dumpRefPositionShort(activeRefPosition, currentBlock);
9374 // Allocation decisions
9375 case LSRA_EVENT_NEEDS_NEW_REG:
9376 dumpRefPositionShort(activeRefPosition, currentBlock);
9377 printf("Free %-4s ", getRegName(reg));
9381 case LSRA_EVENT_ZERO_REF:
9382 assert(interval != nullptr && interval->isLocalVar);
9383 dumpRefPositionShort(activeRefPosition, currentBlock);
9388 case LSRA_EVENT_FIXED_REG:
9389 case LSRA_EVENT_EXP_USE:
9390 case LSRA_EVENT_KEPT_ALLOCATION:
9391 dumpRefPositionShort(activeRefPosition, currentBlock);
9392 printf("Keep %-4s ", getRegName(reg));
9395 case LSRA_EVENT_COPY_REG:
9396 assert(interval != nullptr && interval->recentRefPosition != nullptr);
9397 dumpRefPositionShort(activeRefPosition, currentBlock);
9398 printf("Copy %-4s ", getRegName(reg));
9401 case LSRA_EVENT_MOVE_REG:
9402 assert(interval != nullptr && interval->recentRefPosition != nullptr);
9403 dumpRefPositionShort(activeRefPosition, currentBlock);
9404 printf("Move %-4s ", getRegName(reg));
9408 case LSRA_EVENT_ALLOC_REG:
9409 dumpRefPositionShort(activeRefPosition, currentBlock);
9410 printf("Alloc %-4s ", getRegName(reg));
9413 case LSRA_EVENT_REUSE_REG:
9414 dumpRefPositionShort(activeRefPosition, currentBlock);
9415 printf("Reuse %-4s ", getRegName(reg));
9418 case LSRA_EVENT_ALLOC_SPILLED_REG:
9419 dumpRefPositionShort(activeRefPosition, currentBlock);
9420 printf("Steal %-4s ", getRegName(reg));
9423 case LSRA_EVENT_NO_ENTRY_REG_ALLOCATED:
9424 assert(interval != nullptr && interval->isLocalVar);
9425 dumpRefPositionShort(activeRefPosition, currentBlock);
9429 case LSRA_EVENT_NO_REG_ALLOCATED:
9430 dumpRefPositionShort(activeRefPosition, currentBlock);
9434 case LSRA_EVENT_RELOAD:
9435 dumpRefPositionShort(activeRefPosition, currentBlock);
9436 printf("ReLod %-4s ", getRegName(reg));
9440 case LSRA_EVENT_SPECIAL_PUTARG:
9441 dumpRefPositionShort(activeRefPosition, currentBlock);
9442 printf("PtArg %-4s ", getRegName(reg));
9445 // We currently don't dump anything for these events.
9446 case LSRA_EVENT_DEFUSE_FIXED_DELAY_USE:
9447 case LSRA_EVENT_SPILL_EXTENDED_LIFETIME:
9448 case LSRA_EVENT_END_BB:
9449 case LSRA_EVENT_FREE_REGS:
9450 case LSRA_EVENT_INCREMENT_RANGE_END:
9451 case LSRA_EVENT_LAST_USE:
9452 case LSRA_EVENT_LAST_USE_DELAYED:
9460 //------------------------------------------------------------------------
9461 // dumpRegRecordHeader: Dump the header for a column-based dump of the register state.
9470 // Reg names fit in 4 characters (minimum width of the columns)
9473 // In order to make the table as dense as possible (for ease of reading the dumps),
9474 // we determine the minimum regColumnWidth width required to represent:
9475 // regs, by name (e.g. eax or xmm0) - this is fixed at 4 characters.
9476 // intervals, as Vnn for lclVar intervals, or as I<num> for other intervals.
9477 // The table is indented by the amount needed for dumpRefPositionShort, which is
9478 // captured in shortRefPositionDumpWidth.
9480 void LinearScan::dumpRegRecordHeader()
9482 printf("The following table has one or more rows for each RefPosition that is handled during allocation.\n"
9483 "The first column provides the basic information about the RefPosition, with its type (e.g. Def,\n"
9484 "Use, Fixd) followed by a '*' if it is a last use, and a 'D' if it is delayRegFree, and then the\n"
9485 "action taken during allocation (e.g. Alloc a new register, or Keep an existing one).\n"
9486 "The subsequent columns show the Interval occupying each register, if any, followed by 'a' if it is\n"
9487 "active, and 'i'if it is inactive. Columns are only printed up to the last modifed register, which\n"
9488 "may increase during allocation, in which case additional columns will appear. Registers which are\n"
9489 "not marked modified have ---- in their column.\n\n");
9491 // First, determine the width of each register column (which holds a reg name in the
9492 // header, and an interval name in each subsequent row).
9493 int intervalNumberWidth = (int)log10((double)intervals.size()) + 1;
9494 // The regColumnWidth includes the identifying character (I or V) and an 'i' or 'a' (inactive or active)
9495 regColumnWidth = intervalNumberWidth + 2;
9496 if (regColumnWidth < 4)
9500 sprintf_s(intervalNameFormat, MAX_FORMAT_CHARS, "%%c%%-%dd", regColumnWidth - 2);
9501 sprintf_s(regNameFormat, MAX_FORMAT_CHARS, "%%-%ds", regColumnWidth);
9503 // Next, determine the width of the short RefPosition (see dumpRefPositionShort()).
9504 // This is in the form:
9505 // nnn.#mmm NAME TYPEld
9507 // nnn is the Location, right-justified to the width needed for the highest location.
9508 // mmm is the RefPosition rpNum, left-justified to the width needed for the highest rpNum.
9509 // NAME is dumped by dumpReferentName(), and is "regColumnWidth".
9510 // TYPE is RefTypeNameShort, and is 4 characters
9511 // l is either '*' (if a last use) or ' ' (otherwise)
9512 // d is either 'D' (if a delayed use) or ' ' (otherwise)
9514 maxNodeLocation = (maxNodeLocation == 0)
9516 : maxNodeLocation; // corner case of a method with an infinite loop without any gentree nodes
9517 assert(maxNodeLocation >= 1);
9518 assert(refPositions.size() >= 1);
9519 int nodeLocationWidth = (int)log10((double)maxNodeLocation) + 1;
9520 int refPositionWidth = (int)log10((double)refPositions.size()) + 1;
9521 int refTypeInfoWidth = 4 /*TYPE*/ + 2 /* last-use and delayed */ + 1 /* space */;
9522 int locationAndRPNumWidth = nodeLocationWidth + 2 /* .# */ + refPositionWidth + 1 /* space */;
9523 int shortRefPositionDumpWidth = locationAndRPNumWidth + regColumnWidth + 1 /* space */ + refTypeInfoWidth;
9524 sprintf_s(shortRefPositionFormat, MAX_FORMAT_CHARS, "%%%dd.#%%-%dd ", nodeLocationWidth, refPositionWidth);
9525 sprintf_s(emptyRefPositionFormat, MAX_FORMAT_CHARS, "%%-%ds", shortRefPositionDumpWidth);
9527 // The width of the "allocation info"
9528 // - a 5-character allocation decision
9530 // - a 4-character register
9532 int allocationInfoWidth = 5 + 1 + 4 + 1;
9534 // Next, determine the width of the legend for each row. This includes:
9535 // - a short RefPosition dump (shortRefPositionDumpWidth), which includes a space
9536 // - the allocation info (allocationInfoWidth), which also includes a space
9538 regTableIndent = shortRefPositionDumpWidth + allocationInfoWidth;
9540 // BBnn printed left-justified in the NAME Typeld and allocationInfo space.
9541 int bbDumpWidth = regColumnWidth + 1 + refTypeInfoWidth + allocationInfoWidth;
9542 int bbNumWidth = (int)log10((double)compiler->fgBBNumMax) + 1;
9543 // In the unlikely event that BB numbers overflow the space, we'll simply omit the predBB
9544 int predBBNumDumpSpace = regTableIndent - locationAndRPNumWidth - bbNumWidth - 9; // 'BB' + ' PredBB'
9545 if (predBBNumDumpSpace < bbNumWidth)
9547 sprintf_s(bbRefPosFormat, MAX_LEGEND_FORMAT_CHARS, "BB%%-%dd", shortRefPositionDumpWidth - 2);
9551 sprintf_s(bbRefPosFormat, MAX_LEGEND_FORMAT_CHARS, "BB%%-%dd PredBB%%-%dd", bbNumWidth, predBBNumDumpSpace);
9554 if (compiler->shouldDumpASCIITrees())
9556 columnSeparator = "|";
9564 columnSeparator = "\xe2\x94\x82";
9565 line = "\xe2\x94\x80";
9566 leftBox = "\xe2\x94\x9c";
9567 middleBox = "\xe2\x94\xbc";
9568 rightBox = "\xe2\x94\xa4";
9570 sprintf_s(indentFormat, MAX_FORMAT_CHARS, "%%-%ds", regTableIndent);
9572 // Now, set up the legend format for the RefPosition info
9573 sprintf_s(legendFormat, MAX_LEGEND_FORMAT_CHARS, "%%-%d.%ds%%-%d.%ds%%-%ds%%s", nodeLocationWidth + 1,
9574 nodeLocationWidth + 1, refPositionWidth + 2, refPositionWidth + 2, regColumnWidth + 1);
9576 // Print a "title row" including the legend and the reg names.
9577 lastDumpedRegisters = RBM_NONE;
9578 dumpRegRecordTitleIfNeeded();
9581 void LinearScan::dumpRegRecordTitleIfNeeded()
9583 if ((lastDumpedRegisters != registersToDump) || (rowCountSinceLastTitle > MAX_ROWS_BETWEEN_TITLES))
9585 lastUsedRegNumIndex = 0;
9586 int lastRegNumIndex = compiler->compFloatingPointUsed ? REG_FP_LAST : REG_INT_LAST;
9587 for (int regNumIndex = 0; regNumIndex <= lastRegNumIndex; regNumIndex++)
9589 if ((registersToDump & genRegMask((regNumber)regNumIndex)) != 0)
9591 lastUsedRegNumIndex = regNumIndex;
9594 dumpRegRecordTitle();
9595 lastDumpedRegisters = registersToDump;
9599 void LinearScan::dumpRegRecordTitleLines()
9601 for (int i = 0; i < regTableIndent; i++)
9605 for (int regNumIndex = 0; regNumIndex <= lastUsedRegNumIndex; regNumIndex++)
9607 regNumber regNum = (regNumber)regNumIndex;
9608 if (shouldDumpReg(regNum))
9610 printf("%s", middleBox);
9611 for (int i = 0; i < regColumnWidth; i++)
9617 printf("%s\n", rightBox);
9619 void LinearScan::dumpRegRecordTitle()
9621 dumpRegRecordTitleLines();
9623 // Print out the legend for the RefPosition info
9624 printf(legendFormat, "Loc ", "RP# ", "Name ", "Type Action Reg ");
9626 // Print out the register name column headers
9627 char columnFormatArray[MAX_FORMAT_CHARS];
9628 sprintf_s(columnFormatArray, MAX_FORMAT_CHARS, "%s%%-%d.%ds", columnSeparator, regColumnWidth, regColumnWidth);
9629 for (int regNumIndex = 0; regNumIndex <= lastUsedRegNumIndex; regNumIndex++)
9631 regNumber regNum = (regNumber)regNumIndex;
9632 if (shouldDumpReg(regNum))
9634 const char* regName = getRegName(regNum);
9635 printf(columnFormatArray, regName);
9638 printf("%s\n", columnSeparator);
9640 rowCountSinceLastTitle = 0;
9642 dumpRegRecordTitleLines();
9645 void LinearScan::dumpRegRecords()
9647 static char columnFormatArray[18];
9649 for (int regNumIndex = 0; regNumIndex <= lastUsedRegNumIndex; regNumIndex++)
9651 if (shouldDumpReg((regNumber)regNumIndex))
9653 printf("%s", columnSeparator);
9654 RegRecord& regRecord = physRegs[regNumIndex];
9655 Interval* interval = regRecord.assignedInterval;
9656 if (interval != nullptr)
9658 dumpIntervalName(interval);
9659 char activeChar = interval->isActive ? 'a' : 'i';
9660 printf("%c", activeChar);
9662 else if (regRecord.isBusyUntilNextKill)
9664 printf(columnFormatArray, "Busy");
9668 sprintf_s(columnFormatArray, MAX_FORMAT_CHARS, "%%-%ds", regColumnWidth);
9669 printf(columnFormatArray, "");
9673 printf("%s\n", columnSeparator);
9674 rowCountSinceLastTitle++;
9677 void LinearScan::dumpIntervalName(Interval* interval)
9679 if (interval->isLocalVar)
9681 printf(intervalNameFormat, 'V', interval->varNum);
9683 else if (interval->isConstant)
9685 printf(intervalNameFormat, 'C', interval->intervalIndex);
9689 printf(intervalNameFormat, 'I', interval->intervalIndex);
9693 void LinearScan::dumpEmptyRefPosition()
9695 printf(emptyRefPositionFormat, "");
9698 // Note that the size of this dump is computed in dumpRegRecordHeader().
9700 void LinearScan::dumpRefPositionShort(RefPosition* refPosition, BasicBlock* currentBlock)
9702 BasicBlock* block = currentBlock;
9703 static RefPosition* lastPrintedRefPosition = nullptr;
9704 if (refPosition == lastPrintedRefPosition)
9706 dumpEmptyRefPosition();
9709 lastPrintedRefPosition = refPosition;
9710 if (refPosition->refType == RefTypeBB)
9712 // Always print a title row before a RefTypeBB (except for the first, because we
9713 // will already have printed it before the parameters)
9714 if (refPosition->refType == RefTypeBB && block != compiler->fgFirstBB && block != nullptr)
9716 dumpRegRecordTitle();
9719 printf(shortRefPositionFormat, refPosition->nodeLocation, refPosition->rpNum);
9720 if (refPosition->refType == RefTypeBB)
9722 if (block == nullptr)
9724 printf(regNameFormat, "END");
9726 printf(regNameFormat, "");
9730 printf(bbRefPosFormat, block->bbNum, block == compiler->fgFirstBB ? 0 : blockInfo[block->bbNum].predBBNum);
9733 else if (refPosition->isIntervalRef())
9735 Interval* interval = refPosition->getInterval();
9736 dumpIntervalName(interval);
9737 char lastUseChar = ' ';
9738 char delayChar = ' ';
9739 if (refPosition->lastUse)
9742 if (refPosition->delayRegFree)
9747 printf(" %s%c%c ", getRefTypeShortName(refPosition->refType), lastUseChar, delayChar);
9749 else if (refPosition->isPhysRegRef)
9751 RegRecord* regRecord = refPosition->getReg();
9752 printf(regNameFormat, getRegName(regRecord->regNum));
9753 printf(" %s ", getRefTypeShortName(refPosition->refType));
9757 assert(refPosition->refType == RefTypeKillGCRefs);
9758 // There's no interval or reg name associated with this.
9759 printf(regNameFormat, " ");
9760 printf(" %s ", getRefTypeShortName(refPosition->refType));
9764 //------------------------------------------------------------------------
9765 // LinearScan::IsResolutionMove:
9766 // Returns true if the given node is a move inserted by LSRA
9770 // node - the node to check.
9772 bool LinearScan::IsResolutionMove(GenTree* node)
9774 if (!IsLsraAdded(node))
9779 switch (node->OperGet())
9783 return node->IsUnusedValue();
9793 //------------------------------------------------------------------------
9794 // LinearScan::IsResolutionNode:
9795 // Returns true if the given node is either a move inserted by LSRA
9796 // resolution or an operand to such a move.
9799 // containingRange - the range that contains the node to check.
9800 // node - the node to check.
9802 bool LinearScan::IsResolutionNode(LIR::Range& containingRange, GenTree* node)
9806 if (IsResolutionMove(node))
9811 if (!IsLsraAdded(node) || (node->OperGet() != GT_LCL_VAR))
9817 bool foundUse = containingRange.TryGetUse(node, &use);
9824 //------------------------------------------------------------------------
9825 // verifyFinalAllocation: Traverse the RefPositions and verify various invariants.
9834 // If verbose is set, this will also dump a table of the final allocations.
9835 void LinearScan::verifyFinalAllocation()
9839 printf("\nFinal allocation\n");
9842 // Clear register assignments.
9843 for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
9845 RegRecord* physRegRecord = getRegisterRecord(reg);
9846 physRegRecord->assignedInterval = nullptr;
9849 for (Interval& interval : intervals)
9851 interval.assignedReg = nullptr;
9852 interval.physReg = REG_NA;
9855 DBEXEC(VERBOSE, dumpRegRecordTitle());
9857 BasicBlock* currentBlock = nullptr;
9858 GenTree* firstBlockEndResolutionNode = nullptr;
9859 regMaskTP regsToFree = RBM_NONE;
9860 regMaskTP delayRegsToFree = RBM_NONE;
9861 LsraLocation currentLocation = MinLocation;
9862 for (RefPosition& refPosition : refPositions)
9864 RefPosition* currentRefPosition = &refPosition;
9865 Interval* interval = nullptr;
9866 RegRecord* regRecord = nullptr;
9867 regNumber regNum = REG_NA;
9868 activeRefPosition = currentRefPosition;
9870 if (currentRefPosition->refType == RefTypeBB)
9872 regsToFree |= delayRegsToFree;
9873 delayRegsToFree = RBM_NONE;
9877 if (currentRefPosition->isPhysRegRef)
9879 regRecord = currentRefPosition->getReg();
9880 regRecord->recentRefPosition = currentRefPosition;
9881 regNum = regRecord->regNum;
9883 else if (currentRefPosition->isIntervalRef())
9885 interval = currentRefPosition->getInterval();
9886 interval->recentRefPosition = currentRefPosition;
9887 if (currentRefPosition->registerAssignment != RBM_NONE)
9889 if (!genMaxOneBit(currentRefPosition->registerAssignment))
9891 assert(currentRefPosition->refType == RefTypeExpUse ||
9892 currentRefPosition->refType == RefTypeDummyDef);
9896 regNum = currentRefPosition->assignedReg();
9897 regRecord = getRegisterRecord(regNum);
9903 LsraLocation newLocation = currentRefPosition->nodeLocation;
9905 if (newLocation > currentLocation)
9908 // We could use the freeRegisters() method, but we'd have to carefully manage the active intervals.
9909 for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
9911 regMaskTP regMask = genRegMask(reg);
9912 if ((regsToFree & regMask) != RBM_NONE)
9914 RegRecord* physRegRecord = getRegisterRecord(reg);
9915 physRegRecord->assignedInterval = nullptr;
9918 regsToFree = delayRegsToFree;
9919 regsToFree = RBM_NONE;
9921 currentLocation = newLocation;
9923 switch (currentRefPosition->refType)
9927 if (currentBlock == nullptr)
9929 currentBlock = startBlockSequence();
9933 // Verify the resolution moves at the end of the previous block.
9934 for (GenTree* node = firstBlockEndResolutionNode; node != nullptr; node = node->gtNext)
9936 assert(enregisterLocalVars);
9937 // Only verify nodes that are actually moves; don't bother with the nodes that are
9938 // operands to moves.
9939 if (IsResolutionMove(node))
9941 verifyResolutionMove(node, currentLocation);
9945 // Validate the locations at the end of the previous block.
9946 if (enregisterLocalVars)
9948 VarToRegMap outVarToRegMap = outVarToRegMaps[currentBlock->bbNum];
9949 VarSetOps::Iter iter(compiler, currentBlock->bbLiveOut);
9950 unsigned varIndex = 0;
9951 while (iter.NextElem(&varIndex))
9953 if (localVarIntervals[varIndex] == nullptr)
9955 assert(!compiler->lvaTable[compiler->lvaTrackedToVarNum[varIndex]].lvLRACandidate);
9958 regNumber regNum = getVarReg(outVarToRegMap, varIndex);
9959 interval = getIntervalForLocalVar(varIndex);
9960 assert(interval->physReg == regNum || (interval->physReg == REG_NA && regNum == REG_STK));
9961 interval->physReg = REG_NA;
9962 interval->assignedReg = nullptr;
9963 interval->isActive = false;
9967 // Clear register assignments.
9968 for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
9970 RegRecord* physRegRecord = getRegisterRecord(reg);
9971 physRegRecord->assignedInterval = nullptr;
9974 // Now, record the locations at the beginning of this block.
9975 currentBlock = moveToNextBlock();
9978 if (currentBlock != nullptr)
9980 if (enregisterLocalVars)
9982 VarToRegMap inVarToRegMap = inVarToRegMaps[currentBlock->bbNum];
9983 VarSetOps::Iter iter(compiler, currentBlock->bbLiveIn);
9984 unsigned varIndex = 0;
9985 while (iter.NextElem(&varIndex))
9987 if (localVarIntervals[varIndex] == nullptr)
9989 assert(!compiler->lvaTable[compiler->lvaTrackedToVarNum[varIndex]].lvLRACandidate);
9992 regNumber regNum = getVarReg(inVarToRegMap, varIndex);
9993 interval = getIntervalForLocalVar(varIndex);
9994 interval->physReg = regNum;
9995 interval->assignedReg = &(physRegs[regNum]);
9996 interval->isActive = true;
9997 physRegs[regNum].assignedInterval = interval;
10003 dumpRefPositionShort(currentRefPosition, currentBlock);
10007 // Finally, handle the resolution moves, if any, at the beginning of the next block.
10008 firstBlockEndResolutionNode = nullptr;
10009 bool foundNonResolutionNode = false;
10011 LIR::Range& currentBlockRange = LIR::AsRange(currentBlock);
10012 for (GenTree* node : currentBlockRange.NonPhiNodes())
10014 if (IsResolutionNode(currentBlockRange, node))
10016 assert(enregisterLocalVars);
10017 if (foundNonResolutionNode)
10019 firstBlockEndResolutionNode = node;
10022 else if (IsResolutionMove(node))
10024 // Only verify nodes that are actually moves; don't bother with the nodes that are
10025 // operands to moves.
10026 verifyResolutionMove(node, currentLocation);
10031 foundNonResolutionNode = true;
10040 assert(regRecord != nullptr);
10041 assert(regRecord->assignedInterval == nullptr);
10042 dumpLsraAllocationEvent(LSRA_EVENT_KEPT_ALLOCATION, nullptr, regRecord->regNum, currentBlock);
10044 case RefTypeFixedReg:
10045 assert(regRecord != nullptr);
10046 dumpLsraAllocationEvent(LSRA_EVENT_KEPT_ALLOCATION, nullptr, regRecord->regNum, currentBlock);
10049 case RefTypeUpperVectorSaveDef:
10050 case RefTypeUpperVectorSaveUse:
10053 case RefTypeParamDef:
10054 case RefTypeZeroInit:
10055 assert(interval != nullptr);
10057 if (interval->isSpecialPutArg)
10059 dumpLsraAllocationEvent(LSRA_EVENT_SPECIAL_PUTARG, interval, regNum);
10062 if (currentRefPosition->reload)
10064 interval->isActive = true;
10065 assert(regNum != REG_NA);
10066 interval->physReg = regNum;
10067 interval->assignedReg = regRecord;
10068 regRecord->assignedInterval = interval;
10069 dumpLsraAllocationEvent(LSRA_EVENT_RELOAD, nullptr, regRecord->regNum, currentBlock);
10071 if (regNum == REG_NA)
10073 dumpLsraAllocationEvent(LSRA_EVENT_NO_REG_ALLOCATED, interval);
10075 else if (RefTypeIsDef(currentRefPosition->refType))
10077 interval->isActive = true;
10080 if (interval->isConstant && (currentRefPosition->treeNode != nullptr) &&
10081 currentRefPosition->treeNode->IsReuseRegVal())
10083 dumpLsraAllocationEvent(LSRA_EVENT_REUSE_REG, nullptr, regRecord->regNum, currentBlock);
10087 dumpLsraAllocationEvent(LSRA_EVENT_ALLOC_REG, nullptr, regRecord->regNum, currentBlock);
10091 else if (currentRefPosition->copyReg)
10093 dumpLsraAllocationEvent(LSRA_EVENT_COPY_REG, interval, regRecord->regNum, currentBlock);
10095 else if (currentRefPosition->moveReg)
10097 assert(interval->assignedReg != nullptr);
10098 interval->assignedReg->assignedInterval = nullptr;
10099 interval->physReg = regNum;
10100 interval->assignedReg = regRecord;
10101 regRecord->assignedInterval = interval;
10104 printf("Move %-4s ", getRegName(regRecord->regNum));
10109 dumpLsraAllocationEvent(LSRA_EVENT_KEPT_ALLOCATION, nullptr, regRecord->regNum, currentBlock);
10111 if (currentRefPosition->lastUse || currentRefPosition->spillAfter)
10113 interval->isActive = false;
10115 if (regNum != REG_NA)
10117 if (currentRefPosition->spillAfter)
10121 // If refPos is marked as copyReg, then the reg that is spilled
10122 // is the homeReg of the interval not the reg currently assigned
10124 regNumber spillReg = regNum;
10125 if (currentRefPosition->copyReg)
10127 assert(interval != nullptr);
10128 spillReg = interval->physReg;
10131 dumpEmptyRefPosition();
10132 printf("Spill %-4s ", getRegName(spillReg));
10135 else if (currentRefPosition->copyReg)
10137 regRecord->assignedInterval = interval;
10141 interval->physReg = regNum;
10142 interval->assignedReg = regRecord;
10143 regRecord->assignedInterval = interval;
10147 case RefTypeKillGCRefs:
10148 // No action to take.
10149 // However, we will assert that, at resolution time, no registers contain GC refs.
10151 DBEXEC(VERBOSE, printf(" "));
10152 regMaskTP candidateRegs = currentRefPosition->registerAssignment;
10153 while (candidateRegs != RBM_NONE)
10155 regMaskTP nextRegBit = genFindLowestBit(candidateRegs);
10156 candidateRegs &= ~nextRegBit;
10157 regNumber nextReg = genRegNumFromMask(nextRegBit);
10158 RegRecord* regRecord = getRegisterRecord(nextReg);
10159 Interval* assignedInterval = regRecord->assignedInterval;
10160 assert(assignedInterval == nullptr || !varTypeIsGC(assignedInterval->registerType));
10165 case RefTypeExpUse:
10166 case RefTypeDummyDef:
10167 // Do nothing; these will be handled by the RefTypeBB.
10168 DBEXEC(VERBOSE, printf(" "));
10171 case RefTypeInvalid:
10172 // for these 'currentRefPosition->refType' values, No action to take
10176 if (currentRefPosition->refType != RefTypeBB)
10178 DBEXEC(VERBOSE, dumpRegRecords());
10179 if (interval != nullptr)
10181 if (currentRefPosition->copyReg)
10183 assert(interval->physReg != regNum);
10184 regRecord->assignedInterval = nullptr;
10185 assert(interval->assignedReg != nullptr);
10186 regRecord = interval->assignedReg;
10188 if (currentRefPosition->spillAfter || currentRefPosition->lastUse)
10190 interval->physReg = REG_NA;
10191 interval->assignedReg = nullptr;
10193 // regRegcord could be null if the RefPosition does not require a register.
10194 if (regRecord != nullptr)
10196 regRecord->assignedInterval = nullptr;
10200 assert(!currentRefPosition->RequiresRegister());
10207 // Now, verify the resolution blocks.
10208 // Currently these are nearly always at the end of the method, but that may not alwyas be the case.
10209 // So, we'll go through all the BBs looking for blocks whose bbNum is greater than bbNumMaxBeforeResolution.
10210 for (BasicBlock* currentBlock = compiler->fgFirstBB; currentBlock != nullptr; currentBlock = currentBlock->bbNext)
10212 if (currentBlock->bbNum > bbNumMaxBeforeResolution)
10214 // If we haven't enregistered an lclVars, we have no resolution blocks.
10215 assert(enregisterLocalVars);
10219 dumpRegRecordTitle();
10220 printf(shortRefPositionFormat, 0, 0);
10221 assert(currentBlock->bbPreds != nullptr && currentBlock->bbPreds->flBlock != nullptr);
10222 printf(bbRefPosFormat, currentBlock->bbNum, currentBlock->bbPreds->flBlock->bbNum);
10226 // Clear register assignments.
10227 for (regNumber reg = REG_FIRST; reg < ACTUAL_REG_COUNT; reg = REG_NEXT(reg))
10229 RegRecord* physRegRecord = getRegisterRecord(reg);
10230 physRegRecord->assignedInterval = nullptr;
10233 // Set the incoming register assignments
10234 VarToRegMap inVarToRegMap = getInVarToRegMap(currentBlock->bbNum);
10235 VarSetOps::Iter iter(compiler, currentBlock->bbLiveIn);
10236 unsigned varIndex = 0;
10237 while (iter.NextElem(&varIndex))
10239 if (localVarIntervals[varIndex] == nullptr)
10241 assert(!compiler->lvaTable[compiler->lvaTrackedToVarNum[varIndex]].lvLRACandidate);
10244 regNumber regNum = getVarReg(inVarToRegMap, varIndex);
10245 Interval* interval = getIntervalForLocalVar(varIndex);
10246 interval->physReg = regNum;
10247 interval->assignedReg = &(physRegs[regNum]);
10248 interval->isActive = true;
10249 physRegs[regNum].assignedInterval = interval;
10252 // Verify the moves in this block
10253 LIR::Range& currentBlockRange = LIR::AsRange(currentBlock);
10254 for (GenTree* node : currentBlockRange.NonPhiNodes())
10256 assert(IsResolutionNode(currentBlockRange, node));
10257 if (IsResolutionMove(node))
10259 // Only verify nodes that are actually moves; don't bother with the nodes that are
10260 // operands to moves.
10261 verifyResolutionMove(node, currentLocation);
10265 // Verify the outgoing register assignments
10267 VarToRegMap outVarToRegMap = getOutVarToRegMap(currentBlock->bbNum);
10268 VarSetOps::Iter iter(compiler, currentBlock->bbLiveOut);
10269 unsigned varIndex = 0;
10270 while (iter.NextElem(&varIndex))
10272 if (localVarIntervals[varIndex] == nullptr)
10274 assert(!compiler->lvaTable[compiler->lvaTrackedToVarNum[varIndex]].lvLRACandidate);
10277 regNumber regNum = getVarReg(outVarToRegMap, varIndex);
10278 Interval* interval = getIntervalForLocalVar(varIndex);
10279 assert(interval->physReg == regNum || (interval->physReg == REG_NA && regNum == REG_STK));
10280 interval->physReg = REG_NA;
10281 interval->assignedReg = nullptr;
10282 interval->isActive = false;
10288 DBEXEC(VERBOSE, printf("\n"));
10291 //------------------------------------------------------------------------
10292 // verifyResolutionMove: Verify a resolution statement. Called by verifyFinalAllocation()
10295 // resolutionMove - A GenTree* that must be a resolution move.
10296 // currentLocation - The LsraLocation of the most recent RefPosition that has been verified.
10302 // If verbose is set, this will also dump the moves into the table of final allocations.
10303 void LinearScan::verifyResolutionMove(GenTree* resolutionMove, LsraLocation currentLocation)
10305 GenTree* dst = resolutionMove;
10306 assert(IsResolutionMove(dst));
10308 if (dst->OperGet() == GT_SWAP)
10310 GenTreeLclVarCommon* left = dst->gtGetOp1()->AsLclVarCommon();
10311 GenTreeLclVarCommon* right = dst->gtGetOp2()->AsLclVarCommon();
10312 regNumber leftRegNum = left->gtRegNum;
10313 regNumber rightRegNum = right->gtRegNum;
10314 LclVarDsc* leftVarDsc = compiler->lvaTable + left->gtLclNum;
10315 LclVarDsc* rightVarDsc = compiler->lvaTable + right->gtLclNum;
10316 Interval* leftInterval = getIntervalForLocalVar(leftVarDsc->lvVarIndex);
10317 Interval* rightInterval = getIntervalForLocalVar(rightVarDsc->lvVarIndex);
10318 assert(leftInterval->physReg == leftRegNum && rightInterval->physReg == rightRegNum);
10319 leftInterval->physReg = rightRegNum;
10320 rightInterval->physReg = leftRegNum;
10321 leftInterval->assignedReg = &physRegs[rightRegNum];
10322 rightInterval->assignedReg = &physRegs[leftRegNum];
10323 physRegs[rightRegNum].assignedInterval = leftInterval;
10324 physRegs[leftRegNum].assignedInterval = rightInterval;
10327 printf(shortRefPositionFormat, currentLocation, 0);
10328 dumpIntervalName(leftInterval);
10330 printf(" %-4s ", getRegName(rightRegNum));
10332 printf(shortRefPositionFormat, currentLocation, 0);
10333 dumpIntervalName(rightInterval);
10335 printf(" %-4s ", getRegName(leftRegNum));
10340 regNumber dstRegNum = dst->gtRegNum;
10341 regNumber srcRegNum;
10342 GenTreeLclVarCommon* lcl;
10343 if (dst->OperGet() == GT_COPY)
10345 lcl = dst->gtGetOp1()->AsLclVarCommon();
10346 srcRegNum = lcl->gtRegNum;
10350 lcl = dst->AsLclVarCommon();
10351 if ((lcl->gtFlags & GTF_SPILLED) != 0)
10353 srcRegNum = REG_STK;
10357 assert((lcl->gtFlags & GTF_SPILL) != 0);
10358 srcRegNum = dstRegNum;
10359 dstRegNum = REG_STK;
10363 Interval* interval = getIntervalForLocalVarNode(lcl);
10364 assert(interval->physReg == srcRegNum || (srcRegNum == REG_STK && interval->physReg == REG_NA));
10365 if (srcRegNum != REG_STK)
10367 physRegs[srcRegNum].assignedInterval = nullptr;
10369 if (dstRegNum != REG_STK)
10371 interval->physReg = dstRegNum;
10372 interval->assignedReg = &(physRegs[dstRegNum]);
10373 physRegs[dstRegNum].assignedInterval = interval;
10374 interval->isActive = true;
10378 interval->physReg = REG_NA;
10379 interval->assignedReg = nullptr;
10380 interval->isActive = false;
10384 printf(shortRefPositionFormat, currentLocation, 0);
10385 dumpIntervalName(interval);
10387 printf(" %-4s ", getRegName(dstRegNum));
10393 #endif // !LEGACY_BACKEND