perf, x86: P4 PMU -- redesign cache events