From 6c28b49647a0d149e16f5324b6273b2b6e17a688 Mon Sep 17 00:00:00 2001
From: Daniel Dragan
Date: Sun, 18 Nov 2012 18:37:18 -0500
Subject: [PATCH] refactor pp_padsv

This commit rearranges the autos in a way to minimize the number of vars
saved across calls. By calculating TARG after EXTEND, TARG does not need
to be saved across a conditional stack grow call. In order to not read
the pad slot and PL_curpad twice (once for TARG, once for save_clearsv),
a pointer to the pad slot is saved in a volatile register. Var padentry
is calculated after the EXTEND call, but before save_clearsv, and is not
saved across save_clearsv.

TARG's scope was not extended to the vivify_ref call, since:

1. SP (for the TOPs assignment) will be referenced anyway after
   vivify_ref and this cannot be removed.
2. All of pp_padsv has only 3 non-vol inter func vars: my_perl, SP and
   op. Adding TARG to remain live past any call (save_clearsv) would add
   a 4th non-vol var to this opcode and increase the total stack frame
   size by 1 pointer, either to save a non-vol reg, or to put TARG
   directly on the stack (platform dependent).

The SPAGAIN and RETURN combo was strange, since we fetch global SP, and
then write it right back. But in the old code, the RETURN, if the if
branch is not taken, would be required for the XPUSHs.

The SAVECLEARSV macro includes the address-of operator (&), which
wouldn't work here, so the func was called directly. Originally I kept
the SAVECLEARSV, but the asm of Visual C showed recalculating of the pad
slot in SAVECLEARSV even though no calls are made in my revision between
EXTEND and SAVECLEARSV, so I forced the compiler to cache the pad slot.
By removing a 2nd PL_curpad dereference, in theory the CPU can toss a
chunk of the pad array from CPU cache away sooner.

TOPs does not inc/dec local SP, so a PUTBACK is not required after
vivify_ref. Brief comments were added to explain code that looks unusual
at first glance.

PL_op was cached to remove a couple of "*(my_perl+4)"s that were PL_op
being reread after the 2 calls.

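The PL_op caching idea can be shown in isolation with a minimal sketch.
Every name below (struct op, cur_op, side_effect, run_uncached,
run_cached) is a hypothetical stand-in and is not taken from perl's
sources. The sketch assumes side_effect lives in another translation
unit, so the compiler cannot prove it leaves the global alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an op node; not perl's real OP struct. */
struct op {
    unsigned flags;
    struct op *next;
};

/* Global "current op" pointer, playing the role PL_op plays above. */
struct op *cur_op;

/* Counts calls so the two variants can be compared. */
static int calls_made;

/* Imagine this is defined elsewhere: the compiler then has to assume
 * that it may write to cur_op. */
void side_effect(void)
{
    calls_made++;
}

/* Uncached: the global may be reloaded from memory after each call. */
struct op *run_uncached(void)
{
    assert(cur_op != NULL);
    if (cur_op->flags & 1u)
        side_effect();
    if (cur_op->flags & 2u)     /* possible reload of cur_op */
        side_effect();
    return cur_op->next;        /* and possibly another one */
}

/* Cached: one load of the global; the const local can then typically
 * stay in a callee-saved (non-volatile) register across the calls. */
struct op *run_cached(void)
{
    struct op * const op = cur_op;

    assert(op != NULL);
    if (op->flags & 1u)
        side_effect();
    if (op->flags & 2u)
        side_effect();
    return op->next;
}
```

Both variants return the same result as long as the callee does not
reassign the global. Whether the cached form is actually smaller
depends on the compiler and on how many callee-saved registers the
target has, which is the trade-off the byte counts below describe.
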
PL_op caching was the last change I did. Without PL_op caching, pp_padsv
has max 2 cross func saved autos but was 0x77 long. With PL_op caching,
pp_padsv now has 3 saved autos, but is only 0x72 since some
"*(my_perl+4)"s were removed. Since Linux/Win32 32 bit x86 has only 3
(without ebp) or 4 (with ebp) non-vol regs, this func post PL_op caching
still uses no C stack vars (but nonvol regs are saved and restored of
course).

This patch is similar in concept to commit fdf4ddd.

The machine size of pp_padsv dropped from 0x85 to 0x72 for me after this
commit.
---
 pp_hot.c | 32 +++++++++++++++++++++-----------
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/pp_hot.c b/pp_hot.c
index aa9fdb6..db6945d 100644
--- a/pp_hot.c
+++ b/pp_hot.c
@@ -377,19 +377,29 @@ PP(pp_padrange)
 
 PP(pp_padsv)
 {
-    dVAR; dSP; dTARGET;
-    XPUSHs(TARG);
-    if (PL_op->op_flags & OPf_MOD) {
-        if (PL_op->op_private & OPpLVAL_INTRO)
-            if (!(PL_op->op_private & OPpPAD_STATE))
-                SAVECLEARSV(PAD_SVl(PL_op->op_targ));
-        if (PL_op->op_private & OPpDEREF) {
-            PUTBACK;
-            TOPs = vivify_ref(TOPs, PL_op->op_private & OPpDEREF);
-            SPAGAIN;
+    dVAR; dSP;
+    EXTEND(SP, 1);
+    {
+        OP * const op = PL_op;
+        /* access PL_curpad once */
+        SV ** const padentry = &(PAD_SVl(op->op_targ));
+        {
+            dTARG;
+            TARG = *padentry;
+            PUSHs(TARG);
+            PUTBACK; /* no pop/push after this, TOPs ok */
         }
+        if (op->op_flags & OPf_MOD) {
+            if (op->op_private & OPpLVAL_INTRO)
+                if (!(op->op_private & OPpPAD_STATE))
+                    save_clearsv(padentry);
+            if (op->op_private & OPpDEREF) {
+                /* TOPs arg is TARG, but TOPs (SP) rmvs a var across save_clearsv */
+                TOPs = vivify_ref(TOPs, op->op_private & OPpDEREF);
+            }
+        }
+        return op->op_next;
     }
-    RETURN;
 }
 
 PP(pp_readline)
-- 
2.7.4