[TOPI][x86] Injective schedule improvement (#4786)
author Animesh Jain <anijain@umich.edu>
Tue, 4 Feb 2020 23:25:46 +0000 (15:25 -0800)
committer GitHub <noreply@github.com>
Tue, 4 Feb 2020 23:25:46 +0000 (15:25 -0800)
* [TOPI][x86] Injective Schedule Improvement.

* Add tiling.

* Vectorize when there is an axis.
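
The tiling step in this commit message can be pictured as an ordinary loop nest. What follows is a plain-Python sketch of the iteration space that results from splitting an axis by a constant factor. It is not TVM code and not the schedule from this patch. The helper names `split_indices` and `add_one` are hypothetical. The tail guard assumes the axis length need not be a multiple of the factor.

```python
# Plain-Python model of a loop split by a constant factor.
# NOT TVM code: split_indices and add_one are hypothetical helpers
# used only to illustrate the shape of the resulting loop nest.

def split_indices(extent, factor=16):
    """Yield (outer, inner, index) for a loop of `extent` split by `factor`."""
    outer_extent = (extent + factor - 1) // factor  # ceiling division
    for outer in range(outer_extent):
        # The inner loop always runs exactly `factor` iterations. A constant
        # extent like this is what makes the loop a vectorization candidate.
        for inner in range(factor):
            index = outer * factor + inner
            # Guard the tail (assumption: extent may not divide evenly).
            if index < extent:
                yield outer, inner, index


def add_one(data, factor=16):
    """Elementwise (injective) op written against the split loop nest."""
    out = [0] * len(data)
    for _, _, i in split_indices(len(data), factor):
        out[i] = data[i] + 1
    return out
```

For an axis of length 40 and factor 16, the outer loop runs three times and the inner loop keeps a fixed extent of 16, with the guard skipping the last eight inner iterations of the final tile.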

topi/python/topi/x86/injective.py

index 8c97214..d6bb762 100644
@@ -45,6 +45,12 @@ def schedule_injective_from_existing(sch, out):
         sch[out].parallel(fused)
     elif len(sch[out].op.axis) >= 1:
         sch[out].parallel(sch[out].op.axis[0])
+
+    # Vectorize the inner most for loop. Tiling first to get a const extent
+    if len(sch[out].op.axis) >= 1:
+        l = sch[out].op.axis[-1]
+        _, li = sch[out].split(l, factor=16)
+        sch[out].vectorize(li)
     return sch
 
 @generic.schedule_injective.register(["cpu"])