[X86] Optimize getImpliedDisabledFeatures & getImpliedEnabledFeatures after D83273
Previously the time complexity was O(|number of paths from the root to an
implied feature| * CPU_FEATURE_MAX) where CPU_FEATURE_MAX is 92.
The number of paths can be large (theoretically exponential).
For an inline asm statement, there is a code path
`clang::Parser::ParseAsmStatement -> clang::Sema::ActOnGCCAsmStmt -> ASTContext::getFunctionFeatureMap`
leading to potentially many calls of getImpliedEnabledFeatures (41 for my -march=native case).
We should improve the performance a bit in case the number of inline asm
statements is large (e.g. Linux kernel builds).
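
To illustrate why the path count matters, here is a minimal sketch that is not
the code from this patch. It assumes a hypothetical 8-feature implication
table (`kNumFeatures`, `ImplicationTable`, `makeDiamondChain`, `impliedNaive`
and `impliedOnce` are all made-up names) and compares a traversal that
recurses along every path with one that expands each feature at most once.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CPU_FEATURE_MAX (92 in the real table).
constexpr std::size_t kNumFeatures = 8;
using FeatureSet = std::bitset<kNumFeatures>;

// T[i] holds the features directly implied by feature i (illustrative only).
using ImplicationTable = std::array<FeatureSet, kNumFeatures>;

// Builds an acyclic chain of "diamonds": 0 -> {1,2} -> 3 -> {4,5} -> 6 -> 7.
// Every diamond doubles the number of distinct paths from feature 0.
ImplicationTable makeDiamondChain() {
  ImplicationTable T{};
  T[0].set(1); T[0].set(2);
  T[1].set(3); T[2].set(3);
  T[3].set(4); T[3].set(5);
  T[4].set(6); T[5].set(6);
  T[6].set(7);
  return T;
}

// Path-based traversal: recurses into an implied feature every time it is
// reached, so the work grows with the number of paths. Requires a DAG.
void impliedNaive(const ImplicationTable &T, std::size_t F, FeatureSet &Out,
                  std::size_t &Visits) {
  for (std::size_t I = 0; I != kNumFeatures; ++I) {
    if (T[F].test(I)) {
      ++Visits;
      Out.set(I);
      impliedNaive(T, I, Out, Visits);
    }
  }
}

// Worklist traversal: a feature is pushed only the first time it is seen,
// so each feature is expanded at most once.
void impliedOnce(const ImplicationTable &T, std::size_t F, FeatureSet &Out,
                 std::size_t &Visits) {
  std::vector<std::size_t> Work{F};
  while (!Work.empty()) {
    std::size_t Cur = Work.back();
    Work.pop_back();
    for (std::size_t I = 0; I != kNumFeatures; ++I) {
      if (T[Cur].test(I) && !Out.test(I)) {
        ++Visits;
        Out.set(I);
        Work.push_back(I);
      }
    }
  }
}
```

On this toy graph both functions compute the same set of 7 implied features
for feature 0, but the path-based version performs 16 visits against 7 for the
worklist version, and the gap doubles with each additional diamond.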
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D85257