This change optimizes operator=() for short strings by directly copying the
raw representation from the source into the current instance. This compiles to
an optimized, inlined memcpy that is more than 2X faster for short string
assignments. With inlining enabled for operator=, performance is up to 6X faster.
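The fast path described above can be sketched as follows. This is a minimal illustration, not the libc++ implementation: `TinyString`, `Rep`, and `kInlineCap` are hypothetical names, and for simplicity the sketch keeps the inline buffer and heap pointer side by side instead of overlapping them as a real short-string layout would.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical string type; names and layout are illustrative, not libc++'s.
class TinyString {
public:
    TinyString() = default;
    TinyString(const char* s) { assign(s, std::strlen(s)); }
    TinyString(const TinyString& other) { assign(other.data(), other.size()); }
    ~TinyString() { delete[] rep_.heap; }

    TinyString& operator=(const TinyString& other) {
        if (this != &other) {
            // Read the source's state once, then test both states together.
            const bool other_is_long = other.is_long();
            if (is_long() || other_is_long) {
                return assign(other.data(), other.size());  // general path
            }
            // Both strings are short: copy the fixed-size representation.
            rep_ = other.rep_;
        }
        return *this;
    }

    TinyString& assign(const char* s, std::size_t n) {
        if (n < kInlineCap) {
            // Stage the bytes first in case s points into our own storage.
            char tmp[kInlineCap] = {};
            std::memcpy(tmp, s, n);
            delete[] rep_.heap;
            rep_.heap = nullptr;
            std::memcpy(rep_.inline_buf, tmp, kInlineCap);
        } else {
            // Allocate and fill the new buffer before releasing the old one.
            char* p = new char[n + 1];
            std::memcpy(p, s, n);
            p[n] = '\0';
            delete[] rep_.heap;
            rep_.heap = p;
        }
        rep_.size = n;
        return *this;
    }

    bool is_long() const { return rep_.heap != nullptr; }
    const char* data() const { return is_long() ? rep_.heap : rep_.inline_buf; }
    std::size_t size() const { return rep_.size; }

private:
    static constexpr std::size_t kInlineCap = 16;  // inline bytes, including the terminator
    struct Rep {
        char inline_buf[kInlineCap] = {};
        char* heap = nullptr;  // non-null only for long strings
        std::size_t size = 0;
    };
    Rep rep_;
};
```

Because the short-to-short copy has a fixed size known at compile time, the compiler is free to lower it to a few plain loads and stores, which is where the speedup for the Empty and Small cases would come from. Any assignment involving a long string still goes through `assign()`.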
Benchmarks 'as is':
name                                  old time/op  new time/op  delta
BM_StringAssignStr_Empty_Opaque       6.05ns ± 2%  3.59ns ± 0%  -40.67%
BM_StringAssignStr_Empty_Transparent  5.15ns ± 0%  3.08ns ± 0%  -40.12%
BM_StringAssignStr_Small_Opaque       7.71ns ± 0%  3.59ns ± 0%  -53.45%
BM_StringAssignStr_Small_Transparent  7.66ns ± 0%  3.09ns ± 0%  -59.66%
BM_StringAssignStr_Large_Opaque       24.1ns ± 0%  24.9ns ± 0%  +3.22%
BM_StringAssignStr_Large_Transparent  22.2ns ± 0%  22.8ns ± 0%  +2.77%
BM_StringAssignStr_Huge_Opaque         315ns ± 6%   320ns ± 5%  ~
BM_StringAssignStr_Huge_Transparent    318ns ± 5%   321ns ± 4%  ~
Benchmarks with partial inlining of operator=():
name                                  old time/op  new time/op  delta
BM_StringAssignStr_Empty_Opaque       5.94ns ± 2%  1.95ns ± 0%  -67.21%
BM_StringAssignStr_Empty_Transparent  5.14ns ± 0%  1.04ns ± 1%  -79.73%
BM_StringAssignStr_Small_Opaque       7.69ns ± 0%  1.96ns ± 0%  -74.48%
BM_StringAssignStr_Small_Transparent  7.65ns ± 0%  1.04ns ± 0%  -86.40%
BM_StringAssignStr_Large_Opaque       24.1ns ± 0%  24.5ns ± 0%  +1.61%
BM_StringAssignStr_Large_Transparent  22.2ns ± 0%  21.1ns ± 0%  -4.70%
BM_StringAssignStr_Huge_Opaque         317ns ± 5%   323ns ± 4%  ~
BM_StringAssignStr_Huge_Transparent    318ns ± 5%   320ns ± 5%  ~
Patch by Martijn Vels (mvels@google.com)
Reviewed as https://reviews.llvm.org/D72704
     if (this != &__str)
     {
         __copy_assign_alloc(__str);
-        return assign(__str.data(), __str.size());
+        const bool __str_is_long = __str.__is_long(); // Force single branch
+        if (__is_long() || __str_is_long) {
+            return assign(__str.data(), __str.size());
+        }
+        __r_.first().__r = __str.__r_.first().__r;
     }
     return *this;
 }