[mlir][ArmSME] Implement tile allocation
This patch adds a pass '-allocate-sme-tiles' to the ArmSME dialect that
implements allocation of SME ZA tiles.
It does this at the 'func.func' op level by replacing
'arm_sme.get_tile_id' ops with 'arith.constant' ops that represent the
tile number. The tiles in use in a given function are tracked by an
integer function attribute 'arm_sme.tiles_in_use' that is a 16-bit tile
mask with a bit for each 128-bit element tile (ZA0.Q-ZA15.Q), the
smallest ZA tile granule. This is initialized on the first
'arm_sme.get_tile_id' rewrite and updated on each subsequent rewrite.
Mixing of different element tile types is supported.
Section B2.3.2 of the SME spec [1] describes how the 128-bit element
tiles overlap with other element tiles.
Depends on D154941
[1] https://developer.arm.com/documentation/ddi0616/aa
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D154955