Panu Matilainen [Fri, 14 Sep 2012 11:12:08 +0000 (14:12 +0300)]
Switch fingerprint subDir to pool ids
- <gulp> but no anomalies noted by test-suite, valgrind or
a bit of manual testing. Time will tell...
- Memory is no longer scarce or scary, the strings are simply owned
by the pool in all circumstances. Eliminate scareMemory foobar
from the fingerprinting internals.
Panu Matilainen [Fri, 14 Sep 2012 07:50:53 +0000 (10:50 +0300)]
Switch fingerprint basename to pool ids
- For now we're doing a fair amount of extra work and wasting memory
as data is in different pools so we need to copy to get the strings
and ids into our private pool. This will go away later.
Panu Matilainen [Fri, 14 Sep 2012 07:24:13 +0000 (10:24 +0300)]
Switch fingerprint cache-entry dirnames to pool ids
- This is fairly straightforward (or supposed to be...) especially
now that it's all hidden behind APIs. Note that we no longer bother
precalculating the hash as a pool-id's hash is a no-cost operation,
for strings it was far more expensive.
Panu Matilainen [Fri, 14 Sep 2012 07:01:54 +0000 (10:01 +0300)]
Add a string pool to fingerprint cache (but not yet used)
Panu Matilainen [Fri, 14 Sep 2012 06:28:50 +0000 (09:28 +0300)]
Make fingerprint struct opaque outside fprint.c
- rpmfi cannot know anything about the storage, so rpmfiFpxIndex()
cannot be... change it to rpmfiFps() which only returns the pointer
we got from fpLookupList()
- Change fpCacheGetByFp() to assume it gets passed an array of fps,
and take an additional index argument. Return the fingerprint
pointer on success, NULL on not found to allow further operations
on the fp without knowing its internals.
Panu Matilainen [Fri, 14 Sep 2012 05:41:01 +0000 (08:41 +0300)]
Change fpLookup() to return malloced memory (on first call)
- If the fingerprint pointer passed to it is NULL then allocate space
for a new fingerprint, otherwise reuse the previous space. This should
allow optimizing the case of repeated calls where the directory
doesn't change inside fpc, so callers don't need special-case code
for this. For now, we don't care about optimizations, other than
making it possible later.
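The allocate-or-reuse convention described above could look roughly like the sketch below. It is an illustration only, not the fprint.c code: `fp_t`, its fields and `fp_lookup()` are hypothetical names, and the real fingerprint carries different data.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a fingerprint; not rpm's real struct. */
typedef struct {
    char dir[256];
    char base[256];
} fp_t;

/*
 * Fill *fpp for dir/base. If *fpp is NULL a new fp_t is allocated,
 * otherwise the storage from the previous call is reused.
 * Returns 0 on success, -1 on bad arguments or allocation failure.
 */
int fp_lookup(const char *dir, const char *base, fp_t **fpp)
{
    assert(fpp != NULL);        /* passing no result slot is a caller bug */
    if (dir == NULL || base == NULL)
        return -1;

    if (*fpp == NULL) {
        *fpp = calloc(1, sizeof(**fpp));
        if (*fpp == NULL)
            return -1;
    }

    /* snprintf() truncates overlong input instead of overflowing */
    snprintf((*fpp)->dir, sizeof((*fpp)->dir), "%s", dir);
    snprintf((*fpp)->base, sizeof((*fpp)->base), "%s", base);
    return 0;
}
```

A caller starts with a NULL pointer, calls the function in a loop, and frees the result once at the end.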
Panu Matilainen [Fri, 14 Sep 2012 04:45:44 +0000 (07:45 +0300)]
Make fingerprint cache entry opaque, add some kind of API for the needed bits
- Only disk-space calculations need the actual entry contents, add
getter for dir name and device. We're passing the cache to these
getters too: it's not currently used but will be needed for
directory name pool id->string translation once we get there...
Panu Matilainen [Fri, 14 Sep 2012 04:14:14 +0000 (07:14 +0300)]
Hide away the FP_EQUAL macros
Panu Matilainen [Thu, 13 Sep 2012 21:21:04 +0000 (00:21 +0300)]
Change fpLookupList() to return malloced memory
- Eliminates one place where knowledge about fingerprint internals
has been needed, now rpmfi just gets a (supposedly) opaque blob back.
Panu Matilainen [Thu, 13 Sep 2012 21:10:45 +0000 (00:10 +0300)]
Add internal API for fingerprint lookup-and-compare
- Replace the direct hackery in rpmdb internals with a little less
direct hackery...
Panu Matilainen [Thu, 13 Sep 2012 19:41:41 +0000 (22:41 +0300)]
Bury the fingerprint hash-types into fprint.c, clean up
- fprint.h only needs rpmtypes.h now, remove historical leftovers
- Avoids having to define the hash types multiple times as they're
now buried out of sight
- fpHashFunction() and fpLookupSubdir() can now be made static, do so...
Panu Matilainen [Thu, 13 Sep 2012 19:19:40 +0000 (22:19 +0300)]
Move the entire fingerprint cache population into fprint.c
- Rename addFingerprints() to fpCachePopulate() and move into fprint.c.
This doesn't really belong here as it requires fprint becoming aware
of transactions and all, but at least these are all controlled API
accesses unlike where in transaction.c this was messing with somebody
else's data structures directly.
- Move the by-fingerprint creation to fpCachePopulate() so it gets
lazily done as needed and copy the original hash-size heuristics
back here.
Panu Matilainen [Thu, 13 Sep 2012 18:52:38 +0000 (21:52 +0300)]
Hide by-fingerprint hash into fingerprint cache, add minimal API bits
- For now, always create the by-fingerprint hash although rpmdb usage
doesn't need it. Next steps will fix...
- Add wrapper API for retrieving the records, adjust callers
- No functional changes, at least intended ones... just first steps
towards eliminating the hash-jungle and forcing a single API
through which this stuff gets handled
Panu Matilainen [Thu, 13 Sep 2012 18:13:47 +0000 (21:13 +0300)]
Make fingerprint cache opaque outside fprint.c
- Not that it matters much when everything else is wide open but gotta
start with something...
Panu Matilainen [Thu, 13 Sep 2012 10:12:59 +0000 (13:12 +0300)]
Convert our dependency checking code to use the new rpmdsMatches()
- Instead of adding three more pool-aware versions of the old API's,
convert the main callers to the newer, more flexible API. As a
"minor side-effect" these now use the transaction string-pool as well,
so ALL our pre-transaction dependency sets are now using the global
pool.
Panu Matilainen [Thu, 13 Sep 2012 09:46:41 +0000 (12:46 +0300)]
Unify the three rpmdsFooMatchesDep() functions into one
- These all do more or less the same thing, easily handled with a common
function that takes a couple of more extra parameters. The old variants
become just wrappers to call the pool-aware rpmdsMatches() with suitable
arguments.
Panu Matilainen [Thu, 13 Sep 2012 08:55:52 +0000 (11:55 +0300)]
Use transaction string pool for rpmlib() dependencies too
- This wasn't possible with the former static rpmlib() dependency set
as it would've kept the potentially huge global pool referenced
throughout process lifetime.
Panu Matilainen [Thu, 13 Sep 2012 08:54:54 +0000 (11:54 +0300)]
Add pool-aware version of rpmdsRpmlib()
Panu Matilainen [Thu, 13 Sep 2012 08:45:29 +0000 (11:45 +0300)]
Hang rpmlib() dependency set onto transaction set
- Eliminates the cumbersome static rpmlib ds instance which can never
be freed, as a member of the transaction set it simply gets cleaned
out along with other transaction (dependency) data.
Panu Matilainen [Thu, 13 Sep 2012 08:36:27 +0000 (11:36 +0300)]
Use transaction string pool for ensureOlder() dependency sets
Panu Matilainen [Thu, 13 Sep 2012 08:35:57 +0000 (11:35 +0300)]
Use transaction string pool for findPos() dependency sets
Panu Matilainen [Thu, 13 Sep 2012 08:18:23 +0000 (11:18 +0300)]
Put transaction element "self" dependency set into global pool too
Panu Matilainen [Thu, 13 Sep 2012 07:56:42 +0000 (10:56 +0300)]
Add pool-aware versions of rpmdsThis() and rpmdsSingle()
- Pooh ... I mean pool ... bah. The previous versions become simple
wrappers to the pool-aware ones, using a private pool always.
rpmdsCurrent() doesn't need a pool variant as it inherits its pool
from the parent.
Panu Matilainen [Thu, 13 Sep 2012 07:39:28 +0000 (10:39 +0300)]
Simplify single ds creation
- Eliminate the pre-created pool wtf'ery (what was I thinking?), just
create a ds with zero id's and fill them up once we have the ds.
Panu Matilainen [Thu, 13 Sep 2012 05:48:56 +0000 (08:48 +0300)]
Add a string equality check function to string pool API
- As a special case, two strings (ids) from the same pool can be tested for
equality in constant time (integer comparison). If the pools differ,
a regular string comparison is needed.
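A minimal sketch of that equality check, assuming a pool that stores every distinct string exactly once (otherwise the integer shortcut would not hold). The names `sid_t`, `pool_t`, `pool_str()` and `pool_str_eq()` are hypothetical and do not claim to match the actual string pool API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t sid_t;     /* hypothetical id type; 0 means "no string" */

typedef struct {
    const char **strs;      /* strs[id] is the string for id; strs[0] unused */
    sid_t nstrs;            /* highest valid id */
} pool_t;

/* id -> string, NULL for id 0 or an out-of-range id */
static const char *pool_str(const pool_t *pool, sid_t id)
{
    if (id == 0 || id > pool->nstrs)
        return NULL;
    return pool->strs[id];
}

/* Returns 1 if both ids denote the same string, 0 otherwise. */
int pool_str_eq(const pool_t *pa, sid_t ida, const pool_t *pb, sid_t idb)
{
    assert(pa != NULL && pb != NULL);

    /* Same pool: strings are unique, so comparing ids is enough */
    if (pa == pb)
        return ida == idb;

    /* Different pools: ids are unrelated, compare the actual strings */
    const char *a = pool_str(pa, ida);
    const char *b = pool_str(pb, idb);
    if (a == NULL || b == NULL)
        return a == b;      /* two "not found" ids compare equal */
    return strcmp(a, b) == 0;
}
```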
Panu Matilainen [Wed, 12 Sep 2012 17:11:00 +0000 (20:11 +0300)]
Whoopsie, unbreak checking of installed dependencies
- The dependency sets created from installed headers during rpmtsCheck()
were using a private pool and thus ids not matching with the ones
in the global pool. Oops. Somehow none of our test-suite cases
caught this, looks like we'll need more tests... Also the safe-guard
assert()'s are in all the wrong places for catching this particular
problem. Doh :)
- There's a chicken-and-egg situation involved: in order to do this,
the global pool needs to be in unfrozen state during rpmtsCheck(),
which was not possible before switching rpmal provides (and files)
to pool ids. Now that it *is* using pool id's, move the freeze-point
to rpmtsPrepare() as the fingerprinting has similar issues with
moving strings.
Panu Matilainen [Wed, 12 Sep 2012 16:29:28 +0000 (19:29 +0300)]
Only rehash the pool on insert if the data area actually moved
- realloc() might not need to actually move the data, and when it
doesn't we don't need to do the very expensive rehash either.
Unsurprisingly makes things a whole lot faster.
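The idea can be sketched as below. This is not the pool code itself: `area_t`, `area_grow()` and the `area_rehash()` placeholder are hypothetical. The old address is saved as an integer before the call, since a pointer that realloc() has released should no longer be inspected.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    char *data;         /* contiguous string data area */
    size_t size;        /* allocated bytes */
    unsigned rehashes;  /* how many times a rehash was needed */
} area_t;

/* Placeholder for the expensive step of re-inserting every key. */
static void area_rehash(area_t *a)
{
    a->rehashes++;
}

/* Grow the data area; rehash only when realloc() relocated it. */
int area_grow(area_t *a, size_t newsize)
{
    assert(a != NULL);

    uintptr_t before = (uintptr_t)a->data;  /* remember the old address */
    char *p = realloc(a->data, newsize);
    if (p == NULL)
        return -1;                          /* old area is still valid */

    a->data = p;
    a->size = newsize;
    if ((uintptr_t)p != before)
        area_rehash(a);                     /* key pointers went stale */
    return 0;
}
```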
Panu Matilainen [Wed, 12 Sep 2012 16:20:52 +0000 (19:20 +0300)]
Switch rpmal file hash to use pool id's instead of strings
Panu Matilainen [Wed, 12 Sep 2012 13:51:12 +0000 (16:51 +0300)]
Switch rpmal provide hash to use pool id's instead of strings
Panu Matilainen [Wed, 12 Sep 2012 13:50:41 +0000 (16:50 +0300)]
Add some transition-period asserts to ensure pool-usage sanity
Panu Matilainen [Wed, 12 Sep 2012 13:47:43 +0000 (16:47 +0300)]
Allow keeping hash table around on pool freeze, adjust callers
- Pool id -> string always works with a frozen pool, but in some cases
we'll need to go the other way, allow caller to specify whether
string -> id lookups should be possible on frozen pool.
- On glibc, realloc() to smaller size doesn't move the data but on
other platforms (including valgrind) it can and does move, which
would require a full rehash. For now, just leave all the data
alone unless we're also freeing the hash, the memory savings
aren't much for a global pool (which is where this matters)
Panu Matilainen [Wed, 12 Sep 2012 11:38:34 +0000 (14:38 +0300)]
Add getters for rpmds dependency name and EVR pool ids
Panu Matilainen [Wed, 12 Sep 2012 11:38:08 +0000 (14:38 +0300)]
Add getters for rpmfi base- and directory name pool id's
Panu Matilainen [Wed, 12 Sep 2012 11:33:17 +0000 (14:33 +0300)]
Pass transaction pool to rpmal (but not used yet)
Panu Matilainen [Wed, 12 Sep 2012 10:43:46 +0000 (13:43 +0300)]
Actually enable the global transaction string pool
- With this, practically all transaction member file (base- and dirname)
and dependency strings get now stored in a giant transaction global
string pool. Greenpeace applauds us for all the rain-forests that
will be saved by the memory savings... not. Some memory is saved
but the real wins only start when everything is converted to use
pool id's instead of string pointers. Also we're still creating
a ridiculous number of private pools from various sources, but
we gotta start someplace.
Panu Matilainen [Wed, 12 Sep 2012 10:37:54 +0000 (13:37 +0300)]
Add infrastructure for global transaction set string pool
- Add a pool pointer to ts members struct and a getter function
- Grab the global pool for rpmte dependency- and file info creation,
if it's NULL then the sets will use private pools of their own.
- Add the (currently) required magic voodoo rain-dance to freeze and
unfreeze the pool as necessary wrt new element additions: for
current rpmal and fingerprinting to work, the string pointers
must be immovable.
- This is infrastructure only: nothing creates the global pool yet,
so everything is still using private pools.
Panu Matilainen [Wed, 12 Sep 2012 10:32:31 +0000 (13:32 +0300)]
String pool id 0 equals NULL
- Pool id 0 is special case for "not found". Return an actual NULL
instead of an empty string.
Panu Matilainen [Wed, 12 Sep 2012 10:30:50 +0000 (13:30 +0300)]
Avoid doing anything if pool is already frozen
Panu Matilainen [Wed, 12 Sep 2012 09:53:56 +0000 (12:53 +0300)]
Add getter methods for rpmds and rpmfi string pool handle
Panu Matilainen [Wed, 12 Sep 2012 09:41:07 +0000 (12:41 +0300)]
Delay transaction added packages rpmal creation until required
- We're not using the added rpmal for anything before rpmtsCheck() and/or
rpmtsOrder(), so this shouldn't break anything either. This is probably
a more or less temporary setup to make string pointer -> pool id
transition a bit easier.
Panu Matilainen [Wed, 12 Sep 2012 09:30:44 +0000 (12:30 +0300)]
Add an internal rpmal create+populate helper function, use for erased packages
- We'll need this shortly for added packages too...
Panu Matilainen [Tue, 11 Sep 2012 11:40:01 +0000 (14:40 +0300)]
Oops, only private pool should be frozen on ds create
Panu Matilainen [Tue, 11 Sep 2012 10:53:24 +0000 (13:53 +0300)]
Add an alternative rpmds constructor to allow shared pool usage
- rpmdsNewPool() allows specifying shared/private pool, and rpmdsNew()
is now just a wrapper to always call it with NULL (ie private) pool.
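The constructor-plus-wrapper arrangement might be sketched like this. All names (`pool_t`, `dset_t`, `dset_new_pool()`, `dset_new()`) and the reference counting are assumptions made for illustration; they are not the rpmds implementation.

```c
#include <stdlib.h>

typedef struct { int nrefs; } pool_t;       /* hypothetical refcounted pool */
typedef struct { pool_t *pool; } dset_t;    /* hypothetical dependency set */

static pool_t *pool_link(pool_t *p)
{
    if (p != NULL)
        p->nrefs++;
    return p;
}

pool_t *pool_create(void)
{
    return pool_link(calloc(1, sizeof(pool_t)));
}

pool_t *pool_free(pool_t *p)
{
    if (p != NULL && --p->nrefs == 0)
        free(p);
    return NULL;
}

/* Shared pool if one is passed in, a fresh private pool if pool is NULL. */
dset_t *dset_new_pool(pool_t *pool)
{
    dset_t *ds = calloc(1, sizeof(*ds));
    if (ds == NULL)
        return NULL;
    ds->pool = (pool != NULL) ? pool_link(pool) : pool_create();
    if (ds->pool == NULL) {
        free(ds);
        return NULL;
    }
    return ds;
}

/* The old entry point stays around as a thin wrapper. */
dset_t *dset_new(void)
{
    return dset_new_pool(NULL);
}

dset_t *dset_free(dset_t *ds)
{
    if (ds != NULL) {
        pool_free(ds->pool);
        free(ds);
    }
    return NULL;
}
```

Existing callers of the wrapper keep working unchanged, while new callers can opt into sharing.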
Panu Matilainen [Tue, 11 Sep 2012 10:42:21 +0000 (13:42 +0300)]
Freeze the rpmlib dependency set pool on successful return
- ...to avoid wasting memory on the relatively static data. We could
handle the rpmlib ds singleton behavior here too but it would change
semantics. Ponder about it later...
- Would be nicer to have rpmdsMerge() freeze on return, but that
gets called in loops so we'd be doing a whole lot of huffing and
puffing recreating the pools on each entry.
Panu Matilainen [Tue, 11 Sep 2012 10:14:04 +0000 (13:14 +0300)]
Further split single ds creation into two, sigh
- Allow rpmdsCurrent() to share the pool and id's of its "parent" ds
instead of having to repeatedly create and tear down entire pools
just for a couple of strings. Used by python bindings for rpmds
iteration so we'll want to be reasonably efficient.
- For now, rpmdsSingle() and rpmdsThis() always get a private pool,
wasteful as it might be, but at least now we can freeze them.
Panu Matilainen [Tue, 11 Sep 2012 08:46:19 +0000 (11:46 +0300)]
Unify the common parts of rpmds creation into a helper function
- No functional changes, just sanity-refactoring
Panu Matilainen [Tue, 11 Sep 2012 08:19:23 +0000 (11:19 +0300)]
Eliminate assert()'s from rpmdsMerge()
- These "can't happen" cases where EVR/Flags in source ds are missing
are just as easy to handle as dying, and handling is saner...
Panu Matilainen [Tue, 11 Sep 2012 08:00:24 +0000 (11:00 +0300)]
Eliminate assert()'s from rpmdsDup()
- The "can't happen" case where EVR/Flags are not present is just as
easily handled as dying.
Panu Matilainen [Tue, 11 Sep 2012 07:41:49 +0000 (10:41 +0300)]
Switch dependency sets to use string pool storage for names and evrs
- Always push dependency names and versions into string pool (private
for now). This is terribly wasteful for single ds items, even more
so for rpmdsCurrent() but to keep the initial switch-over changes
to minimum we'll deal with those later.
- While we freeze the pool for ds data from headers, single ds items
are on purpose not frozen for now, due to interactions with
rpmdsCurrent() and rpmds merging.
- Eliminate no longer needed rpmdsDupArgv(), we're now just copying
a bunch of integers around. Sanitize rpmdsMerge() now that we can:
realloc and shift the data instead of recreating all of N, EVR
and Flags.
Panu Matilainen [Tue, 11 Sep 2012 07:03:51 +0000 (10:03 +0300)]
Fix segfault on rpmstrPoolId() on frozen pool
- String -> id lookups need the hash table in place even if we're not
adding. We could do a linear search in such a case but...
Panu Matilainen [Tue, 11 Sep 2012 06:01:49 +0000 (09:01 +0300)]
Make rpmstrPoolUnfreeze() safe to call on unfrozen pool
Panu Matilainen [Tue, 11 Sep 2012 05:12:49 +0000 (08:12 +0300)]
Rename td2pool as rpmtdToPool, export and optimize
- Using rpmtd iteration for this is slow and stupid as we keep
pointlessly re-re-re-re-re-validating the tag type and indexes.
- Change argument order to source -> destination
- Move to rpmtd.c where it belongs and make public with a decent
name. Not sure if this is the kind of an API we really want to make
public but ... at least for now it'll do.
Panu Matilainen [Tue, 11 Sep 2012 05:05:43 +0000 (08:05 +0300)]
Eliminate direct rpmds name (and flags) access on rpmdsNew()
Panu Matilainen [Tue, 11 Sep 2012 04:52:47 +0000 (07:52 +0300)]
Clean up rpmdsSearch() a bit
- Eliminate numerous repeated direct accesses to [o]ds N, EVR and Flags,
instead use getter functions and local variable for ods name which
does not change.
Panu Matilainen [Tue, 11 Sep 2012 04:40:06 +0000 (07:40 +0300)]
Add internal indexed variants of rpmds N, EVR and Flags getters
- We'll need these to eliminate the remaining direct accesses to
N, EVR (and Flags) on random access patterns such as rpmdsSearch().
Panu Matilainen [Tue, 11 Sep 2012 04:25:04 +0000 (07:25 +0300)]
Clean up rpmdsFind() a bit
- Eliminate numerous repeated direct accesses to [o]ds N, EVR and Flags,
instead grab them into local variables through getter functions as
needed: on entry for ods which doesn't change, for ds in the loop
as we're changing ds->i here.
Panu Matilainen [Tue, 11 Sep 2012 04:23:50 +0000 (07:23 +0300)]
Split rpmds EVR comparison into function of its own
- The EVR comparison is a distinct operation of its own: rpmdsCompare()
looks at the other properties, EVR comparison is done if needed.
Doesn't affect speed or functionality, but cuts down on the
big number of local variables and has the nice side-effect of
making the xstrdup() allocations local within rpmdsCompareEVR()
Panu Matilainen [Tue, 11 Sep 2012 03:58:00 +0000 (06:58 +0300)]
Use getter functions for name, evr and flags in rpmdsCurrent()
Panu Matilainen [Tue, 11 Sep 2012 03:54:31 +0000 (06:54 +0300)]
Clean up rpmdsCompare() a bit
- Eliminate numerous repeated direct accesses to ds N, EVR and Flags,
instead grab them into local variables through getter functions
as they are needed. Besides making it easier on the eyes, makes the
function safe(r) wrt illegal iterator values etc.
Panu Matilainen [Tue, 11 Sep 2012 03:34:36 +0000 (06:34 +0300)]
Clean up rpmdsNewDNEVR() a bit
- Eliminate numerous repeated direct accesses to ds N, EVR and Flags,
instead grab them into local variables at entry. This also makes
the function safe against illegal iterator values (ie calling when iteration
not started), previously the bounds were not checked.
Panu Matilainen [Sun, 9 Sep 2012 09:25:56 +0000 (12:25 +0300)]
And now, on to the embarrassing string-pool reimplementation bugs, take I
- String pool offset resize was off by one, oops
- String pool data-area resize requires rehashing all the strings,
as the key pointers change. Ouch. Should be avoidable by extending
rpmhash to allow passing the pool itself around in comparisons as "self"
and using offsets as keys, but for now working counts more than speed.
- The unfreeze-sizehint calculation could be negative. Turn the initial
size into constant and use that as a minimum, otherwise rehashing
uses (more or less arbitrary) heuristics to come up with some number.
Lots of fine-tuning ahead...
Panu Matilainen [Sat, 8 Sep 2012 08:25:16 +0000 (11:25 +0300)]
Switch file info set base- and dirnames storage to string pool
- Always push base and dir names into file info sets string pool,
whether private or shared. For basenames, this can save significant
space even in a private pool, for dirnames private pool is moot
as the names are already unique, shared pool is quite another story.
- Adjust fpLookupList() to take a pool and id's as arguments.
- This introduces a fair amount of overhead, so things will be somewhat
slower until the transition to pool id's is (more) complete. Sometimes
things have to get worse before they get better... Other than that,
this should be entirely invisible to callers.
Panu Matilainen [Sat, 8 Sep 2012 07:44:08 +0000 (10:44 +0300)]
Clean up file info set creation, comment
- Grab and validate the file triplet before placing the data into the
file set. Other than making it more explicit, doesn't matter right
now but we'll need this shortly.
- Refactor the file triplet sanity check into a generic indexed triplet
sanity check (and notice there was an error in the previous index
range checking, duh)
- Apart from the index range fix, shouldn't change any actual functionality
Panu Matilainen [Sat, 8 Sep 2012 06:43:57 +0000 (09:43 +0300)]
Push the flag foo into rpmfiPopulate(), explicit HEADERGET_ALLOC for others
Panu Matilainen [Sat, 8 Sep 2012 06:40:21 +0000 (09:40 +0300)]
Always allocate directory index data
- Just for consistency's sake: now the "core" file triplet data does
not depend on how we got called.
Panu Matilainen [Sat, 8 Sep 2012 06:18:00 +0000 (09:18 +0300)]
Refactor the big rpmfiNewPool() to two separate pieces
- Split file info generation by mandatory/optional data: every file info
set has the file triplet information, but all other data is optional
depending on the create flags. No functional changes.
- Being able to create just the core file triplet and fully populate
later might be useful for checkInstalledFiles(), but we'll see about
that...
Panu Matilainen [Fri, 7 Sep 2012 18:13:21 +0000 (21:13 +0300)]
Split the tagdata -> pool population to another helper function
- For filename triplets we'll need to get and validate the data
before inserting into the pool, so we'll need this shortly.
Panu Matilainen [Fri, 7 Sep 2012 12:48:45 +0000 (15:48 +0300)]
Add an alternative rpmfi constructor to allow shared pool usage
- rpmfiNewPool() allows specifying shared/private pool, and
rpmfiNew() is now just a wrapper to always call it with a private pool.
Panu Matilainen [Fri, 7 Sep 2012 11:09:35 +0000 (14:09 +0300)]
Move string pool typedefs to rpmtypes.h
- I suspect these will be used widely, to avoid having to include
rpmstrpool.h all over in headers...
Panu Matilainen [Fri, 7 Sep 2012 10:19:02 +0000 (13:19 +0300)]
Axe the no longer needed rpmfi string "cache" stuff
Panu Matilainen [Fri, 7 Sep 2012 09:55:28 +0000 (12:55 +0300)]
Use string pool for file set symlinks
- Removes the last use of our former simple, stupid and slow caches
- For now, use a per-fi pool for this just like the previous caching
did. Memory use is slightly increased but it's faster than before,
to reap the full benefits (memory and otherwise) we'll want a
per-transaction pool for these, to be added later.
Panu Matilainen [Fri, 7 Sep 2012 08:31:12 +0000 (11:31 +0300)]
Replace user- and groupname + file lang caches with a global stringpool
- With the string pool we don't have to worry about overflowing the
indexes so we can lump all this relatively static data into one pool.
Because rpmsid's are larger than the previous cache indexes, we'll
lose some of the memory savings, but then the pool is faster on
insertion, and we'll only need one of them so...
- The misc. pool is never freed, flushed or frozen so it'll "waste" memory
throughout the lifetime of a process (similarly to the previous caches)
but it's not huge so ... ignoring that for now.
Panu Matilainen [Fri, 7 Sep 2012 08:14:14 +0000 (11:14 +0300)]
Don't bother with file capability "cache"
- Very few packages have RPMTAG_FILECAPS at all, and the memory saving
for those that do is so marginal it hardly matters at all. At least
for now, don't bother.
Panu Matilainen [Fri, 7 Sep 2012 07:33:22 +0000 (10:33 +0300)]
First cut of a libsolv-style string <-> id pool API
- The pool stores "arbitrary" number of strings in a space-efficient
manner, with near constant (hashed) string -> id lookup/store and
constant time id -> string and id -> string length lookups.
- Credits for the idea go to the Suse developers working on libsolv,
the basic concept is directly lifted from there but details
differ due to using rpm's own hash table implementation etc.
Another minor difference is using size_t for offsets to permit over
4GB total data size on 64bit systems, the total number of id's in
the pool is limited to uint32 max however (like in libsolv).
- Any (re)implementation bugs by yours truly, this is almost certainly
going to need further tuning and tweaking, API and otherwise.
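A toy version of such a pool is sketched below, under loud assumptions: the names (`spool_t`, `spool_id()`, ...) are hypothetical, the capacity is fixed at 512 strings, and a small open-addressing table that stores ids stands in for rpm's own hash table. It only illustrates the offsets-plus-hash layout, with id 0 reserved for "not found".

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HSIZE 1024u             /* fixed table size: sketch only */

typedef uint32_t sid_t;         /* hypothetical id type, 0 = not found */

typedef struct {
    char *data;                 /* all strings back to back, NUL-terminated */
    size_t dused;               /* bytes used in data */
    size_t offs[HSIZE / 2 + 2]; /* offs[id] .. offs[id+1] spans string id */
    sid_t n;                    /* number of ids in use */
    sid_t hash[HSIZE];          /* open addressing, holds ids, 0 = empty */
} spool_t;

static unsigned str_hash(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

spool_t *spool_new(void)
{
    return calloc(1, sizeof(spool_t));
}

void spool_free(spool_t *p)
{
    if (p != NULL) {
        free(p->data);
        free(p);
    }
}

/* id -> string in constant time; NULL for id 0 or unknown ids */
const char *spool_str(const spool_t *p, sid_t id)
{
    return (p && id >= 1 && id <= p->n) ? p->data + p->offs[id] : NULL;
}

/* id -> string length in constant time, from adjacent offsets */
size_t spool_len(const spool_t *p, sid_t id)
{
    return (p && id >= 1 && id <= p->n) ? p->offs[id + 1] - p->offs[id] - 1 : 0;
}

/* string -> id; stores the string if it is missing and create is nonzero */
sid_t spool_id(spool_t *p, const char *s, int create)
{
    if (p == NULL || s == NULL)
        return 0;
    assert(p->n <= HSIZE / 2);  /* table stays at most half full */

    unsigned i = str_hash(s) % HSIZE;
    while (p->hash[i] != 0) {
        if (strcmp(spool_str(p, p->hash[i]), s) == 0)
            return p->hash[i];
        i = (i + 1) % HSIZE;    /* linear probing */
    }
    if (!create || p->n >= HSIZE / 2)
        return 0;

    /* Growing per insert is wasteful; a real pool would grow in chunks */
    size_t len = strlen(s) + 1;
    char *d = realloc(p->data, p->dused + len);
    if (d == NULL)
        return 0;
    p->data = d;
    memcpy(p->data + p->dused, s, len);

    p->n++;
    p->offs[p->n] = p->dused;
    p->dused += len;
    p->offs[p->n + 1] = p->dused;
    p->hash[i] = p->n;          /* table holds ids, so data may move freely */
    return p->n;
}
```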
Panu Matilainen [Thu, 6 Sep 2012 11:49:14 +0000 (14:49 +0300)]
Missing <stdio.h> include for fprintf()
Panu Matilainen [Thu, 6 Sep 2012 09:53:05 +0000 (12:53 +0300)]
Return fingerprint lookups through retval pointer, not struct
- Returning structs by value is a bit icky, pass in a fp pointer
for fpLookup() to fill in instead. This leaves the actual return code
free for handling errors (but ignoring that for now as we always have).
The other option would be always mallocing the return, and we don't
want to do that...
- Shouldn't change any actual functionality.
Panu Matilainen [Thu, 6 Sep 2012 07:48:51 +0000 (10:48 +0300)]
Avoid double iteration on 'rpm -e' now that iterator count works
- While harmless, having to count on one and act on another iteration
gets expensive when there are lots of labels specified. Especially
as the iterator initialization can already load the same headers
multiple times, sigh...
Panu Matilainen [Thu, 6 Sep 2012 07:25:38 +0000 (10:25 +0300)]
Push RPMDBI_LABEL arch parsing down to rpmdb layer to fix stuff
- Partial NEVRA labels cannot be reliably parsed, the various combinations
need to be figured out by trial-and-error. The rpmts layer doesn't stand
a chance of getting it right so move it to rpmdb layer. This doesn't
make the process any less stupid, but at least we get correct results...
- Fixes iterator count when arch is used in a label and more than one
arch variants of a package are installed. Previously iterator count
could be more than one despite actual iteration only hitting one
match, as the arch RE match was added after already initializing
the iterator.
- Also fixes various pathological cases:
- If a legal arch was part of name, version or release (stupid but legal)
we misinterpreted it for arch and failed to find the package.
- If a package with unknown architecture was installed (with --ignorearch)
we could not remove it by its arch as we relied on rpmIsKnownArch()
Panu Matilainen [Wed, 5 Sep 2012 16:40:07 +0000 (19:40 +0300)]
Use existing fingerprint for packages being removed
- Missed opportunity in commit
1a3a4089def9b00790eeebd6f931c99a03a3d44b:
removed packages have already gotten fingerprinted so there's no need
to redo that here.
Panu Matilainen [Wed, 5 Sep 2012 14:47:24 +0000 (17:47 +0300)]
Shut up gcc whine about potentially uninitialized variable
- This is a false positive really, or at least a can't-happen case
Panu Matilainen [Wed, 5 Sep 2012 13:36:16 +0000 (16:36 +0300)]
Avoid rehashing directory name when it doesn't change in rpmal population
- Another modest improvement in redundant rehashing elimination, but in the
big picture, this is really just another fart in the Sahara desert...
Panu Matilainen [Wed, 5 Sep 2012 09:10:51 +0000 (12:10 +0300)]
Prehash dir names on fingerprinting to avoid recalculating when adding
- Speedup depends on transaction and yadda yadda, on my large erasure
transaction testcase this is circa two percent saving on rstrhash()
total costs.
Panu Matilainen [Wed, 5 Sep 2012 08:50:30 +0000 (11:50 +0300)]
Use helper variable to eliminate multiple identical conditionals
Panu Matilainen [Wed, 5 Sep 2012 08:02:14 +0000 (11:02 +0300)]
Prehash dependency strings to avoid recalculating when caching
- Speedup depends on the transaction and is likely to be rather modest
anyway, but can't hurt either...
Panu Matilainen [Wed, 5 Sep 2012 07:41:44 +0000 (10:41 +0300)]
Prehash basenames to avoid recalculation when adding new ones
- Speedup depends on the transaction and is by no means enormous,
but on my testcase of a largish erasure transaction this shaves
off circa four percent of the cycles spent in (re)hashing the
basenames.
Panu Matilainen [Wed, 5 Sep 2012 07:09:16 +0000 (10:09 +0300)]
Add alternative hash key add/get/check methods with prehashed key
- In cases where more than one operation is done with the same key, these
can be used to avoid the relatively expensive rehashing of the key.
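As a rough sketch of the prehashed-key idea, here is a tiny chained hash set with hypothetical names (`hset_t`, `hset_key_hash()`, `hset_has_hkey()`, `hset_add_hkey()`); rpm's generic hash table differs in detail. The check-then-add helper hashes the key once and reuses the value for both operations.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64u

typedef struct node_s {
    const char *key;            /* not copied: caller keeps it alive */
    struct node_s *next;
} node_t;

typedef struct {
    node_t *buckets[NBUCKETS];
} hset_t;

/* Base hash of a key, independent of the table size. */
unsigned hset_key_hash(const char *key)
{
    unsigned h = 5381;
    while (*key)
        h = h * 33 + (unsigned char)*key++;
    return h;
}

/* Membership test using a hash the caller already computed. */
int hset_has_hkey(const hset_t *ht, const char *key, unsigned hash)
{
    for (const node_t *n = ht->buckets[hash % NBUCKETS]; n; n = n->next)
        if (strcmp(n->key, key) == 0)
            return 1;
    return 0;
}

/* Insert using a hash the caller already computed. */
int hset_add_hkey(hset_t *ht, const char *key, unsigned hash)
{
    assert(ht != NULL && key != NULL);
    node_t *n = malloc(sizeof(*n));
    if (n == NULL)
        return -1;
    n->key = key;
    n->next = ht->buckets[hash % NBUCKETS];
    ht->buckets[hash % NBUCKETS] = n;
    return 0;
}

/* Check-then-add with one hash calculation: 1 added, 0 present, -1 error */
int hset_add_unique(hset_t *ht, const char *key)
{
    unsigned hash = hset_key_hash(key);
    if (hset_has_hkey(ht, key, hash))
        return 0;
    return hset_add_hkey(ht, key, hash) == 0 ? 1 : -1;
}

void hset_clear(hset_t *ht)
{
    for (unsigned i = 0; i < NBUCKETS; i++) {
        while (ht->buckets[i] != NULL) {
            node_t *n = ht->buckets[i];
            ht->buckets[i] = n->next;
            free(n);
        }
    }
}
```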
Panu Matilainen [Wed, 5 Sep 2012 07:37:28 +0000 (10:37 +0300)]
Add hash table method for (pre)calculating base hash of a key
Panu Matilainen [Tue, 4 Sep 2012 10:43:43 +0000 (13:43 +0300)]
Minor optimizations to rpmvercmp()
- Avoid calculating string lengths if versions are equal
- Avoid calculating string lengths twice in numeric comparison
Panu Matilainen [Mon, 3 Sep 2012 12:44:53 +0000 (15:44 +0300)]
Minor optimization to rnibble()
- Check for lowercase letters before uppercase. A very minor difference
as such, but our file digests use lowercase hex and this gets
called a lot from rpmfiNew().
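For illustration, a hex-digit helper with that ordering might look like the following. `hex_nibble()` is a hypothetical name and this is a sketch of the idea, not a copy of rnibble(); the result is the same for any check order, only the common lowercase case exits one comparison earlier.

```c
/* Convert one hex digit to its value; anything else yields 0. */
unsigned char hex_nibble(char c)
{
    if (c >= '0' && c <= '9')
        return (unsigned char)(c - '0');
    /* lowercase first: file digests are stored as lowercase hex */
    if (c >= 'a' && c <= 'f')
        return (unsigned char)(c - 'a' + 10);
    if (c >= 'A' && c <= 'F')
        return (unsigned char)(c - 'A' + 10);
    return 0;
}
```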
Panu Matilainen [Mon, 3 Sep 2012 12:03:12 +0000 (15:03 +0300)]
Avoid netshared path matching when netshared path is not set
- We've been calling matchNetsharedpath() for every single file in
the transaction regardless of whether %{_netsharedpath} is set or
not (and almost always it is not), meaning lots of pointless
strlen() calls and in case of erasure, rpmfi iterations performed.
Not exactly a massive speedup but a speedup nevertheless.
Panu Matilainen [Mon, 3 Sep 2012 08:16:13 +0000 (11:16 +0300)]
"Optimize" addFingerprints() a bit
- Avoid repeatedly calling rpmteGetFileStates() for every file processed
- Avoid rpmfi iteration, use a good ole for-loop and index-accessors
- Don't bother looking up symlinks for skipped files
- Eliminate rpmfi from the latter loop, it's not used for anything there
- Not that this is going to show on wall-clock times, the cycles saved here
are going to get very much lost in the noise of more expensive things.
Panu Matilainen [Thu, 30 Aug 2012 12:44:29 +0000 (15:44 +0300)]
Avoid unnecessary calculations on %ghost %config
- If the replacing %config file is a %ghost, then we'll just leave whatever
might be on disk alone.
- OTOH in the opposite case we probably *should* take backups, if the
file exists on disk and differs from the new non-ghost (but
currently we never take backups for %ghosts)
Panu Matilainen [Thu, 30 Aug 2012 10:39:24 +0000 (13:39 +0300)]
Only backup if osuffix is set
- This can happen on %ghost files, fsm->osuffix is never set for them.
Arguably this is a bug in the file disposition calculations but for now...
Panu Matilainen [Thu, 30 Aug 2012 09:06:52 +0000 (12:06 +0300)]
Fix memleak regression in rpmfiDecideFateIndex()
- Similar to commit
80ee39da35544253cab12abd54af8754335ac945: this
started leaking at commit
3f996a588a56141df146c33583a13c0542323977
as rpmfiFNIndex() returns malloced memory. Refactor the lucky 13
return points into one, allowing cleanup at exit.
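The single-exit pattern behind that refactoring can be sketched as follows. `make_path()` and `path_has_suffix()` are hypothetical helpers invented for this example and have nothing to do with the actual fate-decision logic; the point is only that every path leaves through one label where the malloced name is freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper returning malloced memory, like a file name getter. */
static char *make_path(const char *dir, const char *base)
{
    size_t dlen = strlen(dir), blen = strlen(base);
    char *fn = malloc(dlen + blen + 1);
    if (fn != NULL) {
        memcpy(fn, dir, dlen);
        memcpy(fn + dlen, base, blen + 1);  /* includes the NUL */
    }
    return fn;
}

/* Returns 1 if dir+base ends with suffix, 0 if not, -1 on error. */
int path_has_suffix(const char *dir, const char *base, const char *suffix)
{
    int rc = -1;
    size_t flen, slen;
    char *fn;

    assert(dir != NULL && base != NULL && suffix != NULL);

    fn = make_path(dir, base);
    if (fn == NULL)
        goto exit;

    flen = strlen(fn);
    slen = strlen(suffix);
    if (slen > flen) {
        rc = 0;
        goto exit;
    }
    rc = (strcmp(fn + flen - slen, suffix) == 0);

exit:
    free(fn);       /* free(NULL) is a no-op, so one cleanup fits all paths */
    return rc;
}
```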
Panu Matilainen [Thu, 30 Aug 2012 08:02:54 +0000 (11:02 +0300)]
Cache the actual result on rpmdb header verification
- Previously we'd turn all but FAILED results into "OK" after first
check, now we return the real value. And perhaps more importantly,
no longer try re-verifying previously failed headers in vain.
Panu Matilainen [Thu, 30 Aug 2012 07:19:26 +0000 (10:19 +0300)]
Use file info set of removed packages instead of header fetches
- When we get rpmdb hits on files from packages that are to be removed
in the same transaction, we can use its existing file info set
to grab base- and directory names to avoid bunch of headerGet()'s
and consecutive rpmtd manipulation. In theory this should speed up
transactions where lots of packages get removed, in practice not
really - the big cost here is in loading the headers from db in the
first place, despite not being really needed.
Panu Matilainen [Thu, 30 Aug 2012 06:45:49 +0000 (09:45 +0300)]
Use file info set of removed packages instead of new temporary one
- When we get rpmdb hits on files from packages that are to be removed
in the same transaction, we can use the existing file info set
to avoid constructing a new temporary one. Might be measurable
in large updates.
Panu Matilainen [Thu, 30 Aug 2012 05:28:22 +0000 (08:28 +0300)]
Store transaction element pointers in the removedPkgs hash
- Change the hashtype name to something else, it's no longer a plain
int hash. Still needs double definition as it's not contained in
a single source (might want a wrapper similar to rpmal), but
slightly more contained now than the previous intHash definition.
- This opens up some new possibilities, to be taken advantage of
in later commits.
Panu Matilainen [Wed, 29 Aug 2012 14:31:32 +0000 (17:31 +0300)]
Define separate hash for rpmdb header check cache
- This is really private to rpmdb.c (other than being exposed in
the struct definition in backend/dbi.h, sigh), let it live its
own life there.
- No functional changes here, just cleaning up a bit for next steps.
OTOH we could now cache the actual result, not just success... but
leaving that to another time.
Panu Matilainen [Wed, 29 Aug 2012 13:39:33 +0000 (16:39 +0300)]
Oops, undefining wrong name...