From: Roman Lebedev
Date: Thu, 28 Mar 2019 08:55:01 +0000 (+0000)
Subject: [llvm-exegesis] Introduce a 'naive' clustering algorithm (PR40880)
X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=c2423fe6899aad89fe0ac2aa4b873cb79ec15bd0;p=platform%2Fupstream%2Fllvm.git

[llvm-exegesis] Introduce a 'naive' clustering algorithm (PR40880)

Summary:
This is an alternative to D59539.

Let's suppose we have measured 4 different opcodes, and got: `0.5`, `1.0`, `1.5`, `2.0`.
Let's suppose we are using `-analysis-clustering-epsilon=0.5`.
By default now we will start processing the `0.5` point, find that `1.0` is its neighbor, and add both to a new cluster.
Then we will notice that `1.5` is a neighbor of `1.0` and add it to that same cluster.
Then we will notice that `2.0` is a neighbor of `1.5` and add it to that same cluster.
So all these points end up in the same cluster.
This may or may not be a correct implementation of the DBSCAN clustering algorithm.

But it is rather horribly broken for the purpose of comparing the clusters with the LLVM sched data.
Let's suppose all those opcodes are currently in the same sched cluster.
If I specify `-analysis-inconsistency-epsilon=0.5`, then no matter what the LLVM values are, this cluster will **never** match them, and thus this cluster will **always** be displayed as inconsistent.
The solution is obviously to split off some of these opcodes into a different sched cluster.
But how do I do that? Out of the 4 opcodes displayed in the inconsistency report, which ones are the "bad ones"?
Which ones are the most different from the checked-in data? I'd need to go into the `.yaml` and look it up manually.

The trivial solution is, when creating clusters, to not use the full DBSCAN algorithm, but instead "pick some unclustered point, pick all unclustered points that are its neighbors, put them all into a new cluster, repeat".
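To make the chaining effect described above concrete, here is a minimal standalone C++ sketch. It is not llvm-exegesis code: `isNeighbour1D`, `clusterChained` and `clusterSeedOnly` are hypothetical names, and the sketch assumes 1-D points, no noise handling and no minimum point count.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: two 1-D measurements are neighbours when they differ
// by at most Epsilon.
bool isNeighbour1D(double A, double B, double Epsilon) {
  return std::fabs(A - B) <= Epsilon;
}

// Simplified stand-in for DBSCAN-style expansion: neighbours of neighbours
// are pulled into the same cluster. Returns a cluster id for every point.
std::vector<int> clusterChained(const std::vector<double> &Values,
                                double Epsilon) {
  assert(Epsilon >= 0.0 && "epsilon must be non-negative");
  std::vector<int> ClusterOf(Values.size(), -1); // -1 means "unclustered".
  int NextId = 0;
  for (std::size_t P = 0; P < Values.size(); ++P) {
    if (ClusterOf[P] != -1)
      continue;
    ClusterOf[P] = NextId;
    std::vector<std::size_t> ToProcess = {P};
    while (!ToProcess.empty()) {
      const std::size_t Q = ToProcess.back();
      ToProcess.pop_back();
      for (std::size_t N = 0; N < Values.size(); ++N) {
        if (ClusterOf[N] == -1 &&
            isNeighbour1D(Values[Q], Values[N], Epsilon)) {
          ClusterOf[N] = NextId;
          ToProcess.push_back(N); // This is the "neighbour of a neighbour" step.
        }
      }
    }
    ++NextId;
  }
  return ClusterOf;
}

// The "trivial solution": pick an unclustered point, take only its own
// unclustered neighbours, and never expand through them.
std::vector<int> clusterSeedOnly(const std::vector<double> &Values,
                                 double Epsilon) {
  assert(Epsilon >= 0.0 && "epsilon must be non-negative");
  std::vector<int> ClusterOf(Values.size(), -1); // -1 means "unclustered".
  int NextId = 0;
  for (std::size_t P = 0; P < Values.size(); ++P) {
    if (ClusterOf[P] != -1)
      continue;
    ClusterOf[P] = NextId;
    for (std::size_t N = 0; N < Values.size(); ++N)
      if (ClusterOf[N] == -1 && isNeighbour1D(Values[P], Values[N], Epsilon))
        ClusterOf[N] = NextId;
    ++NextId;
  }
  return ClusterOf;
}
```

For the values `0.5`, `1.0`, `1.5`, `2.0` with an epsilon of `0.5`, `clusterChained` assigns all four points to cluster 0, while `clusterSeedOnly` yields two clusters: `{0.5, 1.0}` and `{1.5, 2.0}`.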
As it happens, we can arrive at that "pick a point plus its unclustered neighbors" algorithm by not performing the "add neighbors of a neighbor to the cluster" step.

But that won't work well once we teach analyze mode to operate in non-1D mode (i.e. on more than a single measurement type at a time), because the clustering would depend on the order of the measurements.

Instead, let's just create a single cluster per opcode, and put all the points of that opcode into said cluster.
At the same time, check that every point in that cluster is a neighbor of every other point in the cluster; if they are not, the cluster (==opcode) is marked unstable.

This is //yet another// step to bring me closer to being able to continue cleanup of the bdver2 sched model.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=40880 | PR40880 ]].

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, jdoerfert, RKSimon, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59820

llvm-svn: 357152
---

diff --git a/llvm/docs/CommandGuide/llvm-exegesis.rst b/llvm/docs/CommandGuide/llvm-exegesis.rst
index 13ca16a..29f2cec 100644
--- a/llvm/docs/CommandGuide/llvm-exegesis.rst
+++ b/llvm/docs/CommandGuide/llvm-exegesis.rst
@@ -214,10 +214,17 @@ OPTIONS
  If non-empty, write inconsistencies found during analysis to this file. `-`
  prints to stdout. By default, this analysis is not run.
 
+.. option:: -analysis-clustering=[dbscan,naive]
+
+ Specify the clustering algorithm to use. By default DBSCAN will be used.
+ Naive clustering algorithm is better for doing further work on the
+ `-analysis-inconsistencies-output-file=` output, it will create one cluster
+ per opcode, and check that the cluster is stable (all points are neighbours).
+
 .. option:: -analysis-numpoints=
 
  Specify the numPoints parameters to be used for DBSCAN clustering
- (`analysis` mode).
+ (`analysis` mode, DBSCAN only).
 
 .. 
option:: -analysis-clustering-epsilon= diff --git a/llvm/test/tools/llvm-exegesis/X86/analysis-clustering-algorithms.test b/llvm/test/tools/llvm-exegesis/X86/analysis-clustering-algorithms.test new file mode 100644 index 0000000..00cb105 --- /dev/null +++ b/llvm/test/tools/llvm-exegesis/X86/analysis-clustering-algorithms.test @@ -0,0 +1,231 @@ +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-clusters-output-file=- -analysis-clustering-epsilon=0.5 -analysis-numpoints=1 -analysis-clustering=dbscan | FileCheck -check-prefixes=CHECK-CLUSTERS-ALL,CHECK-CLUSTERS-DBSCAN-05 %s +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-clusters-output-file=- -analysis-clustering-epsilon=0.49 -analysis-numpoints=1 -analysis-clustering=dbscan | FileCheck -check-prefixes=CHECK-CLUSTERS-ALL,CHECK-CLUSTERS-DBSCAN-049 %s +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-clusters-output-file=- -analysis-clustering-epsilon=0.5 -analysis-numpoints=1 -analysis-clustering=naive | FileCheck -check-prefixes=CHECK-CLUSTERS-ALL,CHECK-CLUSTERS-NAIVE %s +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-clusters-output-file=- -analysis-clustering-epsilon=0.49 -analysis-numpoints=1 -analysis-clustering=naive | FileCheck -check-prefixes=CHECK-CLUSTERS-ALL,CHECK-CLUSTERS-NAIVE %s + +# CHECK-CLUSTERS-ALL: {{^}}cluster_id,opcode_name,config,sched_class,inverse_throughput{{$}} + +# By default with -analysis-clustering-epsilon=0.5 everything ends up in the +# same cluster, because each next point is a neighbour of the previous point. 
+ +# CHECK-CLUSTERS-DBSCAN-05-NEXT: {{^}}0, +# CHECK-CLUSTERS-DBSCAN-05-SAME: ,1.00{{$}} +# CHECK-CLUSTERS-DBSCAN-05-NEXT: {{^}}0, +# CHECK-CLUSTERS-DBSCAN-05-SAME: ,1.50{{$}} +# CHECK-CLUSTERS-DBSCAN-05-NEXT: {{^}}0, +# CHECK-CLUSTERS-DBSCAN-05-SAME: ,2.00{{$}} +# CHECK-CLUSTERS-DBSCAN-05-NEXT: {{^}}0, +# CHECK-CLUSTERS-DBSCAN-05-SAME: ,2.50{{$}} + +# With -analysis-clustering-epsilon=0.49 every point goes into separate cluster. + +# CHECK-CLUSTERS-DBSCAN-049-NEXT: {{^}}0, +# CHECK-CLUSTERS-DBSCAN-049-SAME: ,1.00{{$}} +# CHECK-CLUSTERS-DBSCAN-049: {{^}}1, +# CHECK-CLUSTERS-DBSCAN-049-SAME: ,1.50{{$}} +# CHECK-CLUSTERS-DBSCAN-049: {{^}}2, +# CHECK-CLUSTERS-DBSCAN-049-SAME: ,2.00{{$}} +# CHECK-CLUSTERS-DBSCAN-049: {{^}}3, +# CHECK-CLUSTERS-DBSCAN-049-SAME: ,2.50{{$}} + +# And -analysis-clustering=naive every opcode goes into separate cluster. + +# CHECK-CLUSTERS-NAIVE-049-NEXT: {{^}}0, +# CHECK-CLUSTERS-NAIVE-049-SAME: ,1.50{{$}} +# CHECK-CLUSTERS-NAIVE-049: {{^}}1, +# CHECK-CLUSTERS-NAIVE-049-SAME: ,2.00{{$}} +# CHECK-CLUSTERS-NAIVE-049: {{^}}2, +# CHECK-CLUSTERS-NAIVE-049-SAME: ,2.50{{$}} +# CHECK-CLUSTERS-NAIVE-049: {{^}}3, +# CHECK-CLUSTERS-NAIVE-049-SAME: ,1.00{{$}} + +# The "value" is manually specified, not measured. 
+ +--- +mode: inverse_throughput +key: + instructions: + - 'ROL8ri AH AH i_0x1' + - 'ROL8ri AL AL i_0x1' + - 'ROL8ri BH BH i_0x1' + - 'ROL8ri BL BL i_0x1' + - 'ROL8ri BPL BPL i_0x1' + - 'ROL8ri CH CH i_0x1' + - 'ROL8ri CL CL i_0x1' + - 'ROL8ri DH DH i_0x1' + - 'ROL8ri DIL DIL i_0x1' + - 'ROL8ri DL DL i_0x1' + - 'ROL8ri SIL SIL i_0x1' + - 'ROL8ri R8B R8B i_0x1' + - 'ROL8ri R9B R9B i_0x1' + - 'ROL8ri R10B R10B i_0x1' + - 'ROL8ri R11B R11B i_0x1' + - 'ROL8ri R12B R12B i_0x1' + - 'ROL8ri R13B R13B i_0x1' + - 'ROL8ri R14B R14B i_0x1' + - 'ROL8ri R15B R15B i_0x1' + config: '' + register_initial_values: + - 'AH=0x0' + - 'AL=0x0' + - 'BH=0x0' + - 'BL=0x0' + - 'BPL=0x0' + - 'CH=0x0' + - 'CL=0x0' + - 'DH=0x0' + - 'DIL=0x0' + - 'DL=0x0' + - 'SIL=0x0' + - 'R8B=0x0' + - 'R9B=0x0' + - 'R10B=0x0' + - 'R11B=0x0' + - 'R12B=0x0' + - 'R13B=0x0' + - 'R14B=0x0' + - 'R15B=0x0' +cpu_name: bdver2 +llvm_triple: x86_64-unknown-linux-gnu +num_repetitions: 1000000 +measurements: + - { key: inverse_throughput, value: 1.0000, per_snippet_value: 30.4026 } +error: '' +info: instruction has tied variables, using static renaming. +assembled_snippet: 55415741564155415453B400B000B700B30040B500B500B100B60040B700B20040B60041B00041B10041B20041B30041B40041B50041B60041B700C0C401C0C001C0C701C0C30140C0C501C0C501C0C101C0C60140C0C701C0C20140C0C60141C0C00141C0C10141C0C20141C0C30141C0C40141C0C50141C0C60141C0C7015B415C415D415E415F5DC3 +... 
+--- +mode: inverse_throughput +key: + instructions: + - 'ROL16ri AX AX i_0x1' + - 'ROL16ri BP BP i_0x1' + - 'ROL16ri BX BX i_0x1' + - 'ROL16ri CX CX i_0x1' + - 'ROL16ri DI DI i_0x1' + - 'ROL16ri DX DX i_0x1' + - 'ROL16ri SI SI i_0x1' + - 'ROL16ri R8W R8W i_0x1' + - 'ROL16ri R9W R9W i_0x1' + - 'ROL16ri R10W R10W i_0x1' + - 'ROL16ri R11W R11W i_0x1' + - 'ROL16ri R12W R12W i_0x1' + - 'ROL16ri R13W R13W i_0x1' + - 'ROL16ri R14W R14W i_0x1' + - 'ROL16ri R15W R15W i_0x1' + config: '' + register_initial_values: + - 'AX=0x0' + - 'BP=0x0' + - 'BX=0x0' + - 'CX=0x0' + - 'DI=0x0' + - 'DX=0x0' + - 'SI=0x0' + - 'R8W=0x0' + - 'R9W=0x0' + - 'R10W=0x0' + - 'R11W=0x0' + - 'R12W=0x0' + - 'R13W=0x0' + - 'R14W=0x0' + - 'R15W=0x0' +cpu_name: bdver2 +llvm_triple: x86_64-unknown-linux-gnu +num_repetitions: 1000000 +measurements: + - { key: inverse_throughput, value: 1.5000, per_snippet_value: 30.154 } +error: '' +info: instruction has tied variables, using static renaming. +assembled_snippet: 5541574156415541545366B8000066BD000066BB000066B9000066BF000066BA000066BE00006641B800006641B900006641BA00006641BB00006641BC00006641BD00006641BE00006641BF000066C1C00166C1C50166C1C30166C1C10166C1C70166C1C20166C1C6016641C1C0016641C1C1016641C1C2016641C1C3016641C1C4016641C1C5016641C1C6016641C1C70166C1C0015B415C415D415E415F5DC3 +... 
+--- +mode: inverse_throughput +key: + instructions: + - 'ROL32ri EAX EAX i_0x1' + - 'ROL32ri EBP EBP i_0x1' + - 'ROL32ri EBX EBX i_0x1' + - 'ROL32ri ECX ECX i_0x1' + - 'ROL32ri EDI EDI i_0x1' + - 'ROL32ri EDX EDX i_0x1' + - 'ROL32ri ESI ESI i_0x1' + - 'ROL32ri R8D R8D i_0x1' + - 'ROL32ri R9D R9D i_0x1' + - 'ROL32ri R10D R10D i_0x1' + - 'ROL32ri R11D R11D i_0x1' + - 'ROL32ri R12D R12D i_0x1' + - 'ROL32ri R13D R13D i_0x1' + - 'ROL32ri R14D R14D i_0x1' + - 'ROL32ri R15D R15D i_0x1' + config: '' + register_initial_values: + - 'EAX=0x0' + - 'EBP=0x0' + - 'EBX=0x0' + - 'ECX=0x0' + - 'EDI=0x0' + - 'EDX=0x0' + - 'ESI=0x0' + - 'R8D=0x0' + - 'R9D=0x0' + - 'R10D=0x0' + - 'R11D=0x0' + - 'R12D=0x0' + - 'R13D=0x0' + - 'R14D=0x0' + - 'R15D=0x0' +cpu_name: bdver2 +llvm_triple: x86_64-unknown-linux-gnu +num_repetitions: 1000000 +measurements: + - { key: inverse_throughput, value: 2.0000, per_snippet_value: 23.2762 } +error: '' +info: instruction has tied variables, using static renaming. +assembled_snippet: 55415741564155415453B800000000BD00000000BB00000000B900000000BF00000000BA00000000BE0000000041B80000000041B90000000041BA0000000041BB0000000041BC0000000041BD0000000041BE0000000041BF00000000C1C001C1C501C1C301C1C101C1C701C1C201C1C60141C1C00141C1C10141C1C20141C1C30141C1C40141C1C50141C1C60141C1C701C1C0015B415C415D415E415F5DC3 +... 
+--- +mode: inverse_throughput +key: + instructions: + - 'ROL64ri RAX RAX i_0x1' + - 'ROL64ri RBP RBP i_0x1' + - 'ROL64ri RBX RBX i_0x1' + - 'ROL64ri RCX RCX i_0x1' + - 'ROL64ri RDI RDI i_0x1' + - 'ROL64ri RDX RDX i_0x1' + - 'ROL64ri RSI RSI i_0x1' + - 'ROL64ri R8 R8 i_0x1' + - 'ROL64ri R9 R9 i_0x1' + - 'ROL64ri R10 R10 i_0x1' + - 'ROL64ri R11 R11 i_0x1' + - 'ROL64ri R12 R12 i_0x1' + - 'ROL64ri R13 R13 i_0x1' + - 'ROL64ri R14 R14 i_0x1' + - 'ROL64ri R15 R15 i_0x1' + config: '' + register_initial_values: + - 'RAX=0x0' + - 'RBP=0x0' + - 'RBX=0x0' + - 'RCX=0x0' + - 'RDI=0x0' + - 'RDX=0x0' + - 'RSI=0x0' + - 'R8=0x0' + - 'R9=0x0' + - 'R10=0x0' + - 'R11=0x0' + - 'R12=0x0' + - 'R13=0x0' + - 'R14=0x0' + - 'R15=0x0' +cpu_name: bdver2 +llvm_triple: x86_64-unknown-linux-gnu +num_repetitions: 1000000 +measurements: + - { key: inverse_throughput, value: 2.5000, per_snippet_value: 26.2268 } +error: '' +info: instruction has tied variables, using static renaming. +assembled_snippet: 5541574156415541545348B8000000000000000048BD000000000000000048BB000000000000000048B9000000000000000048BF000000000000000048BA000000000000000048BE000000000000000049B8000000000000000049B9000000000000000049BA000000000000000049BB000000000000000049BC000000000000000049BD000000000000000049BE000000000000000049BF000000000000000048C1C00148C1C50148C1C30148C1C10148C1C70148C1C20148C1C60149C1C00149C1C10149C1C20149C1C30149C1C40149C1C50149C1C60149C1C70148C1C0015B415C415D415E415F5DC3 +... 
diff --git a/llvm/test/tools/llvm-exegesis/X86/analysis-naive-cluster-stabilization.test b/llvm/test/tools/llvm-exegesis/X86/analysis-naive-cluster-stabilization.test new file mode 100644 index 0000000..0ac9bbd --- /dev/null +++ b/llvm/test/tools/llvm-exegesis/X86/analysis-naive-cluster-stabilization.test @@ -0,0 +1,63 @@ +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-clusters-output-file=- -analysis-clustering-epsilon=0.5 -analysis-inconsistency-epsilon=0.5 -analysis-numpoints=1 -analysis-clustering=naive | FileCheck -check-prefixes=CHECK-CLUSTERS-ALL,CHECK-CLUSTERS-05 %s +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-inconsistencies-output-file=- -analysis-clustering-epsilon=0.5 -analysis-inconsistency-epsilon=0.5 -analysis-numpoints=1 -analysis-clustering=naive | FileCheck -check-prefixes=CHECK-INCONSISTENCIES-STABLE-05 %s +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-inconsistencies-output-file=- -analysis-clustering-epsilon=0.5 -analysis-inconsistency-epsilon=0.5 -analysis-display-unstable-clusters -analysis-numpoints=1 -analysis-clustering=naive | FileCheck -check-prefixes=CHECK-INCONSISTENCIES-UNSTABLE-05 %s + +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-clusters-output-file=- -analysis-clustering-epsilon=0.49 -analysis-inconsistency-epsilon=0.5 -analysis-numpoints=1 -analysis-clustering=naive | FileCheck -check-prefixes=CHECK-CLUSTERS-ALL,CHECK-CLUSTERS-049 %s +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-inconsistencies-output-file=- -analysis-clustering-epsilon=0.49 -analysis-inconsistency-epsilon=0.5 -analysis-numpoints=1 -analysis-clustering=naive | FileCheck -check-prefixes=CHECK-INCONSISTENCIES-STABLE-049 %s +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-inconsistencies-output-file=- -analysis-clustering-epsilon=0.49 -analysis-inconsistency-epsilon=0.5 -analysis-display-unstable-clusters -analysis-numpoints=1 
-analysis-clustering=naive | FileCheck -check-prefixes=CHECK-INCONSISTENCIES-UNSTABLE-049 %s + +# CHECK-CLUSTERS-ALL: {{^}}cluster_id,opcode_name,config,sched_class,latency{{$}} + +# CHECK-CLUSTERS-NEXT-05: {{^}}0, +# CHECK-CLUSTERS-SAME-05: ,90.00{{$}} +# CHECK-CLUSTERS-05: {{^}}0, +# CHECK-CLUSTERS-SAME-05: ,90.50{{$}} + +# CHECK-INCONSISTENCIES-STABLE-05: ADD32rr +# CHECK-INCONSISTENCIES-STABLE-05: ADD32rr +# CHECK-INCONSISTENCIES-STABLE-05-NOT: ADD32rr + +# CHECK-INCONSISTENCIES-UNSTABLE-05-NOT: ADD32rr + +# CHECK-INCONSISTENCIES-STABLE-049-NOT: ADD32rr + +# CHECK-INCONSISTENCIES-UNSTABLE-049: ADD32rr +# CHECK-INCONSISTENCIES-UNSTABLE-049: ADD32rr +# CHECK-INCONSISTENCIES-UNSTABLE-049-NOT: ADD32rr + +--- +mode: latency +key: + instructions: + - 'ADD32rr EDX EDX EAX' + config: '' + register_initial_values: + - 'EDX=0x0' + - 'EAX=0x0' +cpu_name: bdver2 +llvm_triple: x86_64-unknown-linux-gnu +num_repetitions: 10000 +measurements: + - { key: latency, value: 90.0000, per_snippet_value: 90.0000 } +error: '' +info: Repeating a single implicitly serial instruction +assembled_snippet: BA00000000B80000000001C201C201C201C201C201C201C201C201C201C201C201C201C201C201C201C2C3 +--- +mode: latency +key: + instructions: + - 'ADD32rr EDX EDX EAX' + config: '' + register_initial_values: + - 'EDX=0x0' + - 'EAX=0x0' +cpu_name: bdver2 +llvm_triple: x86_64-unknown-linux-gnu +num_repetitions: 10000 +measurements: + - { key: latency, value: 90.5000, per_snippet_value: 90.5000 } +error: '' +info: Repeating a single implicitly serial instruction +assembled_snippet: BA00000000B80000000001C201C201C201C201C201C201C201C201C201C201C201C201C201C201C201C2C3 +--- +... 
diff --git a/llvm/test/tools/llvm-exegesis/X86/analysis-naive-clusterization.test b/llvm/test/tools/llvm-exegesis/X86/analysis-naive-clusterization.test new file mode 100644 index 0000000..9f557a5 --- /dev/null +++ b/llvm/test/tools/llvm-exegesis/X86/analysis-naive-clusterization.test @@ -0,0 +1,100 @@ +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-clusters-output-file=- -analysis-clustering-epsilon=0.1 -analysis-inconsistency-epsilon=0.1 -analysis-numpoints=1 -analysis-clustering=naive | FileCheck -check-prefixes=CHECK-CLUSTERS %s +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-inconsistencies-output-file=- -analysis-clustering-epsilon=0.5 -analysis-inconsistency-epsilon=0.5 -analysis-numpoints=1 -analysis-clustering=naive | FileCheck -check-prefixes=CHECK-INCONSISTENCIES-ALL,CHECK-INCONSISTENCIES-STABLE %s +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-inconsistencies-output-file=- -analysis-clustering-epsilon=0.5 -analysis-inconsistency-epsilon=0.5 -analysis-display-unstable-clusters -analysis-numpoints=1 -analysis-clustering=naive | FileCheck -check-prefixes=CHECK-INCONSISTENCIES-ALL,CHECK-INCONSISTENCIES-UNSTABLE %s + +# We have two ADD32rr measurements, and two measurements for SQRTSSr. +# ADD32rr measurements are neighbours. +# But the measurements of SQRTSSr are not neighbours, +# so therefore that cluster is marked as unstable. + +# By default, we do not show such unstable clusters. +# If told to show, we *only* show such unstable clusters. 
+ +# CHECK-CLUSTERS: {{^}}cluster_id,opcode_name,config,sched_class,latency{{$}} +# CHECK-CLUSTERS-NEXT: {{^}}0, +# CHECK-CLUSTERS-SAME: ,90.00{{$}} +# CHECK-CLUSTERS-NEXT: {{^}}0, +# CHECK-CLUSTERS-SAME: ,90.11{{$}} +# CHECK-CLUSTERS: {{^}}1, +# CHECK-CLUSTERS-SAME: ,90.11{{$}} +# CHECK-CLUSTERS-NEXT: {{^}}1, +# CHECK-CLUSTERS-SAME: ,100.00{{$}} + +# CHECK-INCONSISTENCIES-STABLE: ADD32rr +# CHECK-INCONSISTENCIES-STABLE: ADD32rr +# CHECK-INCONSISTENCIES-STABLE-NOT: ADD32rr +# CHECK-INCONSISTENCIES-STABLE-NOT: SQRTSSr + +# CHECK-INCONSISTENCIES-UNSTABLE: SQRTSSr +# CHECK-INCONSISTENCIES-UNSTABLE: SQRTSSr +# CHECK-INCONSISTENCIES-UNSTABLE-NOT: SQRTSSr +# CHECK-INCONSISTENCIES-UNSTABLE-NOT: ADD32rr + +--- +mode: latency +key: + instructions: + - 'ADD32rr EDX EDX EAX' + config: '' + register_initial_values: + - 'EDX=0x0' + - 'EAX=0x0' +cpu_name: bdver2 +llvm_triple: x86_64-unknown-linux-gnu +num_repetitions: 10000 +measurements: + - { key: latency, value: 90.0000, per_snippet_value: 90.0000 } +error: '' +info: Repeating a single implicitly serial instruction +assembled_snippet: BA00000000B80000000001C201C201C201C201C201C201C201C201C201C201C201C201C201C201C201C2C3 +--- +mode: latency +key: + instructions: + - 'ADD32rr EDX EDX EAX' + config: '' + register_initial_values: + - 'EDX=0x0' + - 'EAX=0x0' +cpu_name: bdver2 +llvm_triple: x86_64-unknown-linux-gnu +num_repetitions: 10000 +measurements: + - { key: latency, value: 90.1100, per_snippet_value: 90.1100 } +error: '' +info: Repeating a single implicitly serial instruction +assembled_snippet: BA00000000B80000000001C201C201C201C201C201C201C201C201C201C201C201C201C201C201C201C2C3 +--- +mode: latency +key: + instructions: + - 'SQRTSSr XMM11 XMM11' + config: '' + register_initial_values: + - 'XMM11=0x0' +cpu_name: bdver2 +llvm_triple: x86_64-unknown-linux-gnu +num_repetitions: 10000 +measurements: + - { key: latency, value: 90.1111, per_snippet_value: 90.1111 } +error: '' +info: Repeating a single explicitly serial 
instruction +assembled_snippet: 4883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F1C244883C410F3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBC3 +... +--- +mode: latency +key: + instructions: + - 'SQRTSSr XMM11 XMM11' + config: '' + register_initial_values: + - 'XMM11=0x0' +cpu_name: bdver2 +llvm_triple: x86_64-unknown-linux-gnu +num_repetitions: 10000 +measurements: + - { key: latency, value: 100, per_snippet_value: 100 } +error: '' +info: Repeating a single explicitly serial instruction +assembled_snippet: 4883EC10C7042400000000C744240400000000C744240800000000C744240C00000000C57A6F1C244883C410F3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBF3450F51DBC3 +... diff --git a/llvm/test/tools/llvm-exegesis/X86/analysis-same-cluster-for-ops-in-different-sched-clusters.test b/llvm/test/tools/llvm-exegesis/X86/analysis-same-cluster-for-ops-in-different-sched-clusters.test new file mode 100644 index 0000000..4739d00 --- /dev/null +++ b/llvm/test/tools/llvm-exegesis/X86/analysis-same-cluster-for-ops-in-different-sched-clusters.test @@ -0,0 +1,54 @@ +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-clusters-output-file=- -analysis-clustering-epsilon=10 -analysis-numpoints=1 | FileCheck -check-prefixes=CHECK-CLUSTERS-ALL,CHECK-CLUSTERS-DBSCAN %s +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-clusters-output-file=- -analysis-clustering-epsilon=10 -analysis-numpoints=1 -analysis-clustering=dbscan | FileCheck -check-prefixes=CHECK-CLUSTERS-ALL,CHECK-CLUSTERS-DBSCAN %s +# RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-clusters-output-file=- -analysis-clustering-epsilon=10 -analysis-numpoints=1 -analysis-clustering=naive | FileCheck -check-prefixes=CHECK-CLUSTERS-ALL,CHECK-CLUSTERS-NAIVE %s + 
+# Normally BSR32rr is in WriteBSR and BSF32rr is in WriteBSF sched classes. +# Here we check that if we have dbscan-clustered these two measurements into the +# same cluster, we don't split it per the sched classes into two. + +# CHECK-CLUSTERS-ALL: {{^}}cluster_id,opcode_name,config,sched_class,inverse_throughput{{$}} + +# CHECK-CLUSTERS-DBSCAN-NEXT: {{^}}0, +# CHECK-CLUSTERS-DBSCAN-SAME: ,4.03{{$}} +# CHECK-CLUSTERS-DBSCAN-NEXT: {{^}}0, +# CHECK-CLUSTERS-DBSCAN-SAME: ,3.02{{$}} + +# CHECK-CLUSTERS-NAIVE-NEXT: {{^}}0, +# CHECK-CLUSTERS-NAIVE-SAME: ,3.02{{$}} +# CHECK-CLUSTERS-NAIVE: {{^}}1, +# CHECK-CLUSTERS-NAIVE-SAME: ,4.03{{$}} + +--- +mode: inverse_throughput +key: + instructions: + - 'BSR32rr R11D EDI' + config: '' + register_initial_values: + - 'EDI=0x0' +cpu_name: bdver2 +llvm_triple: x86_64-unknown-linux-gnu +num_repetitions: 1000000 +measurements: + - { key: inverse_throughput, value: 4.03048, per_snippet_value: 4.03048 } +error: '' +info: instruction has no tied variables picking Uses different from defs +assembled_snippet: BF00000000440FBDDF440FBDDF440FBDDF440FBDDF440FBDDF440FBDDF440FBDDF440FBDDF440FBDDF440FBDDF440FBDDF440FBDDF440FBDDF440FBDDF440FBDDF440FBDDFC3 +... +--- +mode: inverse_throughput +key: + instructions: + - 'BSF32rr EAX R14D' + config: '' + register_initial_values: + - 'R14D=0x0' +cpu_name: bdver2 +llvm_triple: x86_64-unknown-linux-gnu +num_repetitions: 1000000 +measurements: + - { key: inverse_throughput, value: 3.02186, per_snippet_value: 3.02186 } +error: '' +info: instruction has no tied variables picking Uses different from defs +assembled_snippet: 415641BE00000000410FBCC6410FBCC6410FBCC6410FBCC6410FBCC6410FBCC6410FBCC6410FBCC6410FBCC6410FBCC6410FBCC6410FBCC6410FBCC6410FBCC6410FBCC6410FBCC6415EC3 +... 
diff --git a/llvm/tools/llvm-exegesis/lib/Analysis.cpp b/llvm/tools/llvm-exegesis/lib/Analysis.cpp index ec964b2..9281744 100644 --- a/llvm/tools/llvm-exegesis/lib/Analysis.cpp +++ b/llvm/tools/llvm-exegesis/lib/Analysis.cpp @@ -333,7 +333,7 @@ void Analysis::printSchedClassClustersHtml( OS << ""; } OS << ""; - for (const auto &Stats : Cluster.getRepresentative()) { + for (const auto &Stats : Cluster.getCentroid().getStats()) { OS << ""; writeMeasurementValue(OS, Stats.avg()); OS << "
["; @@ -437,14 +437,11 @@ void Analysis::SchedClassCluster::addPoint( size_t PointId, const InstructionBenchmarkClustering &Clustering) { PointIds.push_back(PointId); const auto &Point = Clustering.getPoints()[PointId]; - if (ClusterId.isUndef()) { + if (ClusterId.isUndef()) ClusterId = Clustering.getClusterIdForPoint(PointId); - Representative.resize(Point.Measurements.size()); - } - for (size_t I = 0, E = Point.Measurements.size(); I < E; ++I) { - Representative[I].push(Point.Measurements[I]); - } assert(ClusterId == Clustering.getClusterIdForPoint(PointId)); + + Centroid.addPoint(Point.Measurements); } // Returns a ProxResIdx by id or name. @@ -467,6 +464,7 @@ bool Analysis::SchedClassCluster::measurementsMatch( const llvm::MCSubtargetInfo &STI, const ResolvedSchedClass &RSC, const InstructionBenchmarkClustering &Clustering, const double AnalysisInconsistencyEpsilonSquared_) const { + ArrayRef Representative = Centroid.getStats(); const size_t NumMeasurements = Representative.size(); std::vector ClusterCenterPoint(NumMeasurements); std::vector SchedClassPoint(NumMeasurements); diff --git a/llvm/tools/llvm-exegesis/lib/Analysis.h b/llvm/tools/llvm-exegesis/lib/Analysis.h index 1a0859e..e38f460 100644 --- a/llvm/tools/llvm-exegesis/lib/Analysis.h +++ b/llvm/tools/llvm-exegesis/lib/Analysis.h @@ -73,10 +73,11 @@ private: const std::vector &getPointIds() const { return PointIds; } + void addPoint(size_t PointId, + const InstructionBenchmarkClustering &Clustering); + // Return the cluster centroid. - const std::vector &getRepresentative() const { - return Representative; - } + const SchedClassClusterCentroid &getCentroid() const { return Centroid; } // Returns true if the cluster representative measurements match that of SC. 
bool @@ -85,14 +86,11 @@ private: const InstructionBenchmarkClustering &Clustering, const double AnalysisInconsistencyEpsilonSquared_) const; - void addPoint(size_t PointId, - const InstructionBenchmarkClustering &Clustering); - private: InstructionBenchmarkClustering::ClusterId ClusterId; std::vector PointIds; // Measurement stats for the points in the SchedClassCluster. - std::vector Representative; + SchedClassClusterCentroid Centroid; }; void printInstructionRowCsv(size_t PointId, llvm::raw_ostream &OS) const; diff --git a/llvm/tools/llvm-exegesis/lib/Clustering.cpp b/llvm/tools/llvm-exegesis/lib/Clustering.cpp index cc46cb3..2a8cc45 100644 --- a/llvm/tools/llvm-exegesis/lib/Clustering.cpp +++ b/llvm/tools/llvm-exegesis/lib/Clustering.cpp @@ -53,6 +53,37 @@ void InstructionBenchmarkClustering::rangeQuery( } } +// Given a set of points, checks that all the points are neighbours +// up to AnalysisClusteringEpsilon. This is O(2*N). +bool InstructionBenchmarkClustering::areAllNeighbours( + ArrayRef Pts) const { + // First, get the centroid of this group of points. This is O(N). + SchedClassClusterCentroid G; + llvm::for_each(Pts, [this, &G](size_t P) { + assert(P < Points_.size()); + ArrayRef Measurements = Points_[P].Measurements; + if (Measurements.empty()) // Error point. + return; + G.addPoint(Measurements); + }); + const std::vector Centroid = G.getAsPoint(); + + // Since we will be comparing with the centroid, we need to halve the epsilon. + double AnalysisClusteringEpsilonHalvedSquared = + AnalysisClusteringEpsilonSquared_ / 4.0; + + // And now check that every point is a neighbour of the centroid. Also O(N). + return llvm::all_of( + Pts, [this, &Centroid, AnalysisClusteringEpsilonHalvedSquared](size_t P) { + assert(P < Points_.size()); + const auto &PMeasurements = Points_[P].Measurements; + if (PMeasurements.empty()) // Error point. + return true; // Pretend that error point is a neighbour. 
+ return isNeighbour(PMeasurements, Centroid, + AnalysisClusteringEpsilonHalvedSquared); + }); +} + InstructionBenchmarkClustering::InstructionBenchmarkClustering( const std::vector &Points, const double AnalysisClusteringEpsilonSquared) @@ -95,7 +126,7 @@ llvm::Error InstructionBenchmarkClustering::validateAndSetup() { return llvm::Error::success(); } -void InstructionBenchmarkClustering::dbScan(const size_t MinPts) { +void InstructionBenchmarkClustering::clusterizeDbScan(const size_t MinPts) { std::vector Neighbors; // Persistent buffer to avoid allocs. for (size_t P = 0, NumPoints = Points_.size(); P < NumPoints; ++P) { if (!ClusterIdForPoint_[P].isUndef()) @@ -152,6 +183,48 @@ void InstructionBenchmarkClustering::dbScan(const size_t MinPts) { } } +void InstructionBenchmarkClustering::clusterizeNaive(unsigned NumOpcodes) { + // Given an instruction Opcode, which are the benchmarks of this instruction? + std::vector> OpcodeToPoints; + OpcodeToPoints.resize(NumOpcodes); + size_t NumOpcodesSeen = 0; + for (size_t P = 0, NumPoints = Points_.size(); P < NumPoints; ++P) { + const InstructionBenchmark &Point = Points_[P]; + const unsigned Opcode = Point.keyInstruction().getOpcode(); + assert(Opcode < NumOpcodes && "NumOpcodes is incorrect (too small)"); + llvm::SmallVectorImpl &PointsOfOpcode = OpcodeToPoints[Opcode]; + if (PointsOfOpcode.empty()) // If we previously have not seen any points of + ++NumOpcodesSeen; // this opcode, then naturally this is the new opcode. + PointsOfOpcode.emplace_back(P); + } + assert(OpcodeToPoints.size() == NumOpcodes && "sanity check"); + assert(NumOpcodesSeen <= NumOpcodes && + "can't see more opcodes than there are total opcodes"); + assert(NumOpcodesSeen <= Points_.size() && + "can't see more opcodes than there are total points"); + + Clusters_.reserve(NumOpcodesSeen); // One cluster per opcode. 
+ for (ArrayRef PointsOfOpcode : llvm::make_filter_range( + OpcodeToPoints, [](ArrayRef PointsOfOpcode) { + return !PointsOfOpcode.empty(); // Ignore opcodes with no points. + })) { + // Create a new cluster. + Clusters_.emplace_back(ClusterId::makeValid( + Clusters_.size(), /*IsUnstable=*/!areAllNeighbours(PointsOfOpcode))); + Cluster &CurrentCluster = Clusters_.back(); + // Mark points as belonging to the new cluster. + llvm::for_each(PointsOfOpcode, [this, &CurrentCluster](size_t P) { + ClusterIdForPoint_[P] = CurrentCluster.Id; + }); + // And add all the points of this opcode to the new cluster. + CurrentCluster.PointIndices.reserve(PointsOfOpcode.size()); + CurrentCluster.PointIndices.assign(PointsOfOpcode.begin(), + PointsOfOpcode.end()); + assert(CurrentCluster.PointIndices.size() == PointsOfOpcode.size()); + } + assert(Clusters_.size() == NumOpcodesSeen); +} + // Given an instruction Opcode, we can make benchmarks (measurements) of the // instruction characteristics/performance. Then, to facilitate further analysis // we group the benchmarks with *similar* characteristics into clusters. @@ -246,8 +319,8 @@ void InstructionBenchmarkClustering::stabilize(unsigned NumOpcodes) { llvm::Expected InstructionBenchmarkClustering::create( - const std::vector &Points, const size_t MinPts, - const double AnalysisClusteringEpsilon, + const std::vector &Points, const ModeE Mode, + const size_t DbscanMinPts, const double AnalysisClusteringEpsilon, llvm::Optional NumOpcodes) { InstructionBenchmarkClustering Clustering( Points, AnalysisClusteringEpsilon * AnalysisClusteringEpsilon); @@ -258,13 +331,37 @@ InstructionBenchmarkClustering::create( return Clustering; // Nothing to cluster. 
} - Clustering.dbScan(MinPts); + if (Mode == ModeE::Dbscan) { + Clustering.clusterizeDbScan(DbscanMinPts); - if (NumOpcodes.hasValue()) - Clustering.stabilize(NumOpcodes.getValue()); + if (NumOpcodes.hasValue()) + Clustering.stabilize(NumOpcodes.getValue()); + } else /*if(Mode == ModeE::Naive)*/ { + if (!NumOpcodes.hasValue()) + llvm::report_fatal_error( + "'naive' clustering mode requires opcode count to be specified"); + Clustering.clusterizeNaive(NumOpcodes.getValue()); + } return Clustering; } +void SchedClassClusterCentroid::addPoint(ArrayRef Point) { + if (Representative.empty()) + Representative.resize(Point.size()); + assert(Representative.size() == Point.size() && + "All points should have identical dimensions."); + + for (const auto &I : llvm::zip(Representative, Point)) + std::get<0>(I).push(std::get<1>(I)); +} + +std::vector SchedClassClusterCentroid::getAsPoint() const { + std::vector ClusterCenterPoint(Representative.size()); + for (const auto &I : llvm::zip(ClusterCenterPoint, Representative)) + std::get<0>(I).PerInstructionValue = std::get<1>(I).avg(); + return ClusterCenterPoint; +} + } // namespace exegesis } // namespace llvm diff --git a/llvm/tools/llvm-exegesis/lib/Clustering.h b/llvm/tools/llvm-exegesis/lib/Clustering.h index ad4cab3..e57a46d 100644 --- a/llvm/tools/llvm-exegesis/lib/Clustering.h +++ b/llvm/tools/llvm-exegesis/lib/Clustering.h @@ -25,20 +25,24 @@ namespace exegesis { class InstructionBenchmarkClustering { public: + enum ModeE { Dbscan, Naive }; + // Clusters `Points` using DBSCAN with the given parameters. See the cc file // for more explanations on the algorithm. 
   static llvm::Expected
-  create(const std::vector &Points, size_t MinPts,
-         double AnalysisClusteringEpsilon,
+  create(const std::vector &Points, ModeE Mode,
+         size_t DbscanMinPts, double AnalysisClusteringEpsilon,
          llvm::Optional NumOpcodes = llvm::None);

   class ClusterId {
   public:
     static ClusterId noise() { return ClusterId(kNoise); }
     static ClusterId error() { return ClusterId(kError); }
-    static ClusterId makeValid(size_t Id) { return ClusterId(Id); }
+    static ClusterId makeValid(size_t Id, bool IsUnstable = false) {
+      return ClusterId(Id, IsUnstable);
+    }
     static ClusterId makeValidUnstable(size_t Id) {
-      return ClusterId(Id, /*IsUnstable=*/true);
+      return makeValid(Id, /*IsUnstable=*/true);
     }

     ClusterId() : Id_(kUndef), IsUnstable_(false) {}
@@ -120,12 +124,20 @@ private:
       double AnalysisClusteringEpsilonSquared);

   llvm::Error validateAndSetup();
-  void dbScan(size_t MinPts);
+
+  void clusterizeDbScan(size_t MinPts);
+  void clusterizeNaive(unsigned NumOpcodes);
+
+  // Stabilization is only needed if dbscan was used to clusterize.
   void stabilize(unsigned NumOpcodes);
+
   void rangeQuery(size_t Q, std::vector &Scratchpad) const;

+  bool areAllNeighbours(ArrayRef Pts) const;
+
   const std::vector &Points_;
   const double AnalysisClusteringEpsilonSquared_;
+
   int NumDimensions_ = 0;
   // ClusterForPoint_[P] is the cluster id for Points[P].
   std::vector ClusterIdForPoint_;
@@ -134,6 +146,21 @@ private:
   Cluster ErrorCluster_;
 };

+class SchedClassClusterCentroid {
+public:
+  const std::vector &getStats() const {
+    return Representative;
+  }
+
+  std::vector getAsPoint() const;
+
+  void addPoint(ArrayRef Point);
+
+private:
+  // Measurement stats for the points in the SchedClassCluster.
+  std::vector Representative;
+};
+
 } // namespace exegesis
 } // namespace llvm
diff --git a/llvm/tools/llvm-exegesis/llvm-exegesis.cpp b/llvm/tools/llvm-exegesis/llvm-exegesis.cpp
index 81f6d2d..fb2f2ff 100644
--- a/llvm/tools/llvm-exegesis/llvm-exegesis.cpp
+++ b/llvm/tools/llvm-exegesis/llvm-exegesis.cpp
@@ -66,7 +66,7 @@ static cl::opt
     cl::cat(Options), cl::init(""));

 static cl::opt BenchmarkMode(
-    "mode", cl::desc("the mode to run"), cl::cat(BenchmarkOptions),
+    "mode", cl::desc("the mode to run"), cl::cat(Options),
     cl::values(clEnumValN(exegesis::InstructionBenchmark::Latency, "latency",
                           "Instruction Latency"),
                clEnumValN(exegesis::InstructionBenchmark::InverseThroughput,
@@ -89,14 +89,24 @@ static cl::opt IgnoreInvalidSchedClass(
     cl::desc("ignore instructions that do not define a sched class"),
     cl::cat(BenchmarkOptions), cl::init(false));

-static cl::opt AnalysisNumPoints(
+static cl::opt
+    AnalysisClusteringAlgorithm(
+        "analysis-clustering", cl::desc("the clustering algorithm to use"),
+        cl::cat(AnalysisOptions),
+        cl::values(clEnumValN(exegesis::InstructionBenchmarkClustering::Dbscan,
+                              "dbscan", "use DBSCAN/OPTICS algorithm"),
+                   clEnumValN(exegesis::InstructionBenchmarkClustering::Naive,
+                              "naive", "one cluster per opcode")),
+        cl::init(exegesis::InstructionBenchmarkClustering::Dbscan));
+
+static cl::opt AnalysisDbscanNumPoints(
     "analysis-numpoints",
-    cl::desc("minimum number of points in an analysis cluster"),
+    cl::desc("minimum number of points in an analysis cluster (dbscan only)"),
     cl::cat(AnalysisOptions), cl::init(3));

 static cl::opt AnalysisClusteringEpsilon(
     "analysis-clustering-epsilon",
-    cl::desc("dbscan epsilon for benchmark point clustering"),
+    cl::desc("epsilon for benchmark point clustering"),
     cl::cat(AnalysisOptions), cl::init(0.1));

 static cl::opt AnalysisInconsistencyEpsilon(
@@ -460,8 +470,8 @@ static void analysisMain() {
   std::unique_ptr InstrInfo(TheTarget->createMCInstrInfo());

   const auto Clustering =
       ExitOnErr(InstructionBenchmarkClustering::create(
-          Points, AnalysisNumPoints, AnalysisClusteringEpsilon,
-          InstrInfo->getNumOpcodes()));
+          Points, AnalysisClusteringAlgorithm, AnalysisDbscanNumPoints,
+          AnalysisClusteringEpsilon, InstrInfo->getNumOpcodes()));

   const Analysis Analyzer(*TheTarget, std::move(InstrInfo), Clustering,
                           AnalysisInconsistencyEpsilon,
diff --git a/llvm/unittests/tools/llvm-exegesis/ClusteringTest.cpp b/llvm/unittests/tools/llvm-exegesis/ClusteringTest.cpp
index 2833d55..b23938a 100644
--- a/llvm/unittests/tools/llvm-exegesis/ClusteringTest.cpp
+++ b/llvm/unittests/tools/llvm-exegesis/ClusteringTest.cpp
@@ -46,7 +46,8 @@ TEST(ClusteringTest, Clusters3D) {
   // Error cluster: points {2}
   Points[2].Error = "oops";

-  auto Clustering = InstructionBenchmarkClustering::create(Points, 2, 0.25);
+  auto Clustering = InstructionBenchmarkClustering::create(
+      Points, InstructionBenchmarkClustering::ModeE::Dbscan, 2, 0.25);
   ASSERT_TRUE((bool)Clustering);
   EXPECT_THAT(Clustering.get().getValidClusters(),
               UnorderedElementsAre(HasPoints({0, 3}), HasPoints({1, 4})));
@@ -73,7 +74,9 @@ TEST(ClusteringTest, Clusters3D_InvalidSize) {
       {"x", 0.01, 0.0}, {"y", 1.02, 0.0}, {"z", 1.98, 0.0}};
   Points[1].Measurements = {{"y", 1.02, 0.0}, {"z", 1.98, 0.0}};
   auto Error =
-      InstructionBenchmarkClustering::create(Points, 2, 0.25).takeError();
+      InstructionBenchmarkClustering::create(
+          Points, InstructionBenchmarkClustering::ModeE::Dbscan, 2, 0.25)
+          .takeError();
   ASSERT_TRUE((bool)Error);
   consumeError(std::move(Error));
 }
@@ -83,7 +86,9 @@ TEST(ClusteringTest, Clusters3D_InvalidOrder) {
   Points[0].Measurements = {{"x", 0.01, 0.0}, {"y", 1.02, 0.0}};
   Points[1].Measurements = {{"y", 1.02, 0.0}, {"x", 1.98, 0.0}};
   auto Error =
-      InstructionBenchmarkClustering::create(Points, 2, 0.25).takeError();
+      InstructionBenchmarkClustering::create(
+          Points, InstructionBenchmarkClustering::ModeE::Dbscan, 2, 0.25)
+          .takeError();
   ASSERT_TRUE((bool)Error);
   consumeError(std::move(Error));
 }
@@ -112,7 +117,8 @@ TEST(ClusteringTest, Ordering1) {
   Points[2].Measurements = {
       {"x", 2.0, 0.0}};

-  auto Clustering = InstructionBenchmarkClustering::create(Points, 2, 1.1);
+  auto Clustering = InstructionBenchmarkClustering::create(
+      Points, InstructionBenchmarkClustering::ModeE::Dbscan, 2, 1.1);
   ASSERT_TRUE((bool)Clustering);
   EXPECT_THAT(Clustering.get().getValidClusters(),
               UnorderedElementsAre(HasPoints({0, 1, 2})));
@@ -128,7 +134,8 @@ TEST(ClusteringTest, Ordering2) {
   Points[2].Measurements = {
       {"x", 1.0, 0.0}};

-  auto Clustering = InstructionBenchmarkClustering::create(Points, 2, 1.1);
+  auto Clustering = InstructionBenchmarkClustering::create(
+      Points, InstructionBenchmarkClustering::ModeE::Dbscan, 2, 1.1);
   ASSERT_TRUE((bool)Clustering);
   EXPECT_THAT(Clustering.get().getValidClusters(),
               UnorderedElementsAre(HasPoints({0, 1, 2})));
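
Note: the 'naive' mode described in the commit message (one cluster per opcode, flagged unstable unless every point is a neighbour of every other point) can be illustrated outside of LLVM. The following is a minimal standalone sketch, not the code from this patch. `Point`, `Cluster`, `distanceSquared` and the free function `clusterizeNaive` are hypothetical stand-ins built only on the C++ standard library, and treating "neighbour" as "squared distance <= squared epsilon" is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical, simplified stand-ins for the llvm-exegesis types.
struct Point {
  unsigned Opcode;            // Which instruction was benchmarked.
  std::vector<double> Values; // One value per measurement dimension.
};

struct Cluster {
  unsigned Opcode;                       // The opcode this cluster represents.
  std::vector<std::size_t> PointIndices; // Indices into the input points.
  bool IsUnstable;                       // True if some pair is too far apart.
};

// Squared Euclidean distance; assumes both points have the same dimensions.
static double distanceSquared(const Point &A, const Point &B) {
  assert(A.Values.size() == B.Values.size());
  double Sum = 0.0;
  for (std::size_t I = 0; I < A.Values.size(); ++I) {
    const double Diff = A.Values[I] - B.Values[I];
    Sum += Diff * Diff;
  }
  return Sum;
}

// Returns true if every pair of the given points is within epsilon.
// Assumption: "neighbour" means squared distance <= squared epsilon.
static bool areAllNeighbours(const std::vector<Point> &Points,
                             const std::vector<std::size_t> &Indices,
                             double EpsilonSquared) {
  for (std::size_t I = 0; I < Indices.size(); ++I)
    for (std::size_t J = I + 1; J < Indices.size(); ++J)
      if (distanceSquared(Points[Indices[I]], Points[Indices[J]]) >
          EpsilonSquared)
        return false;
  return true;
}

// Builds one cluster per opcode that has at least one point.
std::vector<Cluster> clusterizeNaive(const std::vector<Point> &Points,
                                     double Epsilon) {
  // Group point indices by opcode; std::map keeps opcodes in sorted order.
  std::map<unsigned, std::vector<std::size_t>> OpcodeToPoints;
  for (std::size_t P = 0; P < Points.size(); ++P)
    OpcodeToPoints[Points[P].Opcode].push_back(P);

  std::vector<Cluster> Clusters;
  Clusters.reserve(OpcodeToPoints.size());
  for (const auto &Entry : OpcodeToPoints) {
    const bool Stable =
        areAllNeighbours(Points, Entry.second, Epsilon * Epsilon);
    Clusters.push_back(Cluster{Entry.first, Entry.second, !Stable});
  }
  return Clusters;
}
```

If the values `0.5`, `1.0`, `1.5`, `2.0` from the commit message all belonged to a single opcode, this sketch with epsilon `0.5` would produce one cluster flagged unstable, because `0.5` and `1.5` are further apart than epsilon. Unlike the chained DBSCAN expansion, no point is pulled in merely because it neighbours a neighbour.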