Animesh Jain [Thu, 17 Oct 2019 16:31:58 +0000 (09:31 -0700)]
[TOPI][x86] Cascade lake support. (#4123)
* [TOPI][x86] Cascade lake support.
* Jenkins test debug 1.
* Testing cascade lake alone.
Logan Weber [Thu, 17 Oct 2019 16:27:57 +0000 (09:27 -0700)]
[Relay] Improve build error when no lowered funcs are produced (#4132)
* Improve build error when no lowered funcs
* Switch from fatal to warning
Tianqi Chen [Wed, 16 Oct 2019 22:24:23 +0000 (15:24 -0700)]
[RUNTIME] Refactor object python FFI to new protocol. (#4128)
* [RUNTIME] Refactor object python FFI to new protocol.
This is a pre-req to bring the Node system under object protocol.
Most of the code reflects the current code in the Node system.
- Use new instead of init so subclasses can define their own constructors
- Allow register via name, besides type index
- Introduce necessary runtime C API functions
- Refactored Tensor and Datatype to directly use constructor.
* address review comments
Tianqi Chen [Wed, 16 Oct 2019 17:44:17 +0000 (10:44 -0700)]
Update PULL_REQUEST_TEMPLATE.md
shoubhik [Wed, 16 Oct 2019 16:44:59 +0000 (09:44 -0700)]
Adding support for dequantizing from int32 to float32. (#4130)
Altan Haan [Wed, 16 Oct 2019 14:32:29 +0000 (07:32 -0700)]
[Relay][Training] Add and fix gradients (#4126)
* add and fix gradients
* fix linter issues
Animesh Jain [Wed, 16 Oct 2019 05:17:47 +0000 (22:17 -0700)]
[QNN] Change default rounding to UPWARD. (#4131)
shoubhik [Tue, 15 Oct 2019 21:02:08 +0000 (14:02 -0700)]
Fix infer type of kernel in dense. (#4125)
* Fix infer type of kernel in dense.
* - Moving the check of weight being nullptr up as it is needed in both the branches now.
- Adding test case for validating that data dtype and kernel dtypes can be different.
* - Fix the dtype check for weight. If the weight is not present then we will use the data dtype.
Animesh Jain [Tue, 15 Oct 2019 19:12:01 +0000 (12:12 -0700)]
[Relay][AlterOpLayout] NHWC to NCHWc pad operator. (#4103)
* [Relay][AlterOpLayout] NHWC to NCHWc pad operator.
* Fixing culprit.
* Flaky test 1.
* Flaky test 2.
Sergei Grechanik [Tue, 15 Oct 2019 15:51:06 +0000 (18:51 +0300)]
[ARITH] Fix lowering of floormod(x, y) != 0 (#4127)
Tianqi Chen [Tue, 15 Oct 2019 05:46:35 +0000 (22:46 -0700)]
[RFC][RUNTIME] Introduce new object protocol. (#4115)
* [RUNTIME] Introduce new object protocol.
This PR introduces a new object protocol to unify the node and object.
We also updated the existing runtime::vm code to make use of the new system.
Update to the node will be done in a follow up PR.
Other changes:
- Remove object related code in json serializer as that code logic was not complete
and we have a separate serializer for VM, can revisit later.
* address review comment
* Fix the child slot logic
Animesh Jain [Tue, 15 Oct 2019 04:46:09 +0000 (21:46 -0700)]
[Relay][Topi] Disable conv NHWC pack int8. (#4038)
Tianqi Chen [Mon, 14 Oct 2019 22:51:55 +0000 (15:51 -0700)]
Update task_cpp_unittest.sh
Tianqi Chen [Mon, 14 Oct 2019 21:05:51 +0000 (14:05 -0700)]
Revert ci-cpu due to nnpack issue (#4124)
Animesh Jain [Mon, 14 Oct 2019 19:34:09 +0000 (12:34 -0700)]
[QNN][TFLite] Parsing TFLite quantized models. (#3900)
Tianqi Chen [Mon, 14 Oct 2019 19:13:59 +0000 (12:13 -0700)]
[CI] Update ci-cpu to latest (#4121)
Leo Chen [Sun, 13 Oct 2019 18:38:00 +0000 (02:38 +0800)]
add dependency of compilation with LLVM (#4117)
Ina Dobreva [Sun, 13 Oct 2019 05:06:50 +0000 (06:06 +0100)]
Add parser support for CAST tflite operator (#4096)
This implementation provides cast to the limited number of dtypes
that tflite currently supports for placeholder op. Add INT64 in the
possible dtypes as it appears to be supported according to tflite schema.
Thierry Moreau [Sat, 12 Oct 2019 16:45:51 +0000 (09:45 -0700)]
adding soiferj to the list of reviewers (#4108)
Xingyu Zhou [Fri, 11 Oct 2019 21:25:14 +0000 (14:25 -0700)]
[codegen] Add multiple operands and function support when using fp16 compilation (#4056)
* overload half operators for cuda codegen
* add float16 te test_op_level1
* fix test_op_level1.py
* fix lint
* disable fp16 test if gpu does not support
* disable fp16 test if gpu does not support
* bypass float16 test if gpu does not support float16
Haichen Shen [Fri, 11 Oct 2019 21:23:01 +0000 (14:23 -0700)]
[Fix] Fix a few bugs when dtype is fp16 (#4088)
* Fix layer norm for fp16
* [Fix] Fix arange for fp16
* [Fix] Fix mxnet frontend for fp16
* [Fix] Fix arange for fp16
* remove comments
* x
* fix nnvm
Zhi [Fri, 11 Oct 2019 17:31:00 +0000 (10:31 -0700)]
[tvm][any] broadcast with values other than one (#3967)
* [tvm][any] broadcast with values other than 1
* Add test for incompatible runtime values
* Remove hybrid script compact buffer binding
* retrigger ci
Peter Yeh [Fri, 11 Oct 2019 06:45:12 +0000 (23:45 -0700)]
force code object v2 for amd gpu backend (#4099)
Chien-Yu Lin [Fri, 11 Oct 2019 06:12:22 +0000 (23:12 -0700)]
Tutorial: update Building a Graph Convolutional Network tutorial (#4060)
* update build_gcn.py tutorial
updates
* support bias in GCN layer
* download pretrained gcn model
* verify model accuracy
* use time_evaluator to measure runtime
* fix adding bias in gcn layer
* remove printing output
* fix small bug
* add DGL-PyTorch comparison into the build_gcn tutorial
* add accuracy testing
* adjust import order
* handle different dgl versions
* update number for dgl version checking
Animesh Jain [Fri, 11 Oct 2019 05:57:09 +0000 (22:57 -0700)]
[Relay][AlterOp] NHWC to NCHWc support for Pool, pad, concatenate, sum. (#4059)
Philip Hyunsu Cho [Thu, 10 Oct 2019 21:48:00 +0000 (14:48 -0700)]
[TOPI] FIFO buffer op, to accelerate sequence modeling with dilated convolutions (#4039)
* Add FIFO buffer op to enable explicit computation re-use in convolution
* Add a test
* Add end-to-end test with 1D convolution
* Add a stub in MXNet frontend
* Address reviewer comments
* Add back stub for MXNet frontend
Benjamin Tu [Thu, 10 Oct 2019 20:37:05 +0000 (13:37 -0700)]
[VTA][TSIM] Serial GEMM Application Added (#4082)
* app init push
* fix on readme
* change name, add bit serial explanation
* rm serialLoadMM, change doc
* syntax change for readme
* add parallel test functionality
* fix readme
* add python doc
* syntax
LiangHao [Thu, 10 Oct 2019 19:59:58 +0000 (03:59 +0800)]
Add a python tutorial of deploying tvm module with tvm runtime only (#4094)
Leyuan Wang [Thu, 10 Oct 2019 19:55:59 +0000 (12:55 -0700)]
correct error (#4093)
shoubhik [Thu, 10 Oct 2019 19:52:49 +0000 (12:52 -0700)]
- Adding support for Mxnet flavored dequantization for both default and using MKLDNN. User can choose between the two at runtime. (#3945)
- Added tests for new methods added.
Yida Wang [Thu, 10 Oct 2019 19:24:32 +0000 (12:24 -0700)]
[Fix] Fix the logic of the number of nodes checking in op fusion (#4074)
* move the number of nodes constraint in op fusion up to the dom tree level
* add test case of limiting the max number of ops to be fused
* uncomment other test cases
Aniket Rangrej [Thu, 10 Oct 2019 17:35:33 +0000 (23:05 +0530)]
Fixing tensor not found issue in bitserial operator (#4095)
Marcus Shawcroft [Thu, 10 Oct 2019 16:26:57 +0000 (17:26 +0100)]
[DOCKER] torch install depends on future package (#4098)
The torch package depends on the future package but the torch wheel
does not expose that dependency, resulting in an inconsistent install.
Ideally the wheel should declare all of its dependencies; I'm not sure
why the packagers have chosen not to do this. For now the simple
workaround is to explicitly install the future package.
Change-Id: Ic9f0f4bb4c78ab65706fc1b20c1b4fd287856a9e
Wei Chen [Thu, 10 Oct 2019 00:47:04 +0000 (17:47 -0700)]
[Relay][VM] Fix constant folding issue in VM compiler (#4077)
* [Relay][VM] Fix constant folding issue in VM compiler
1. allow passing params when compiling a module
2. enhance profiler robustness
* remove dead code
* fix lint
* add get_params
* fix test
* don't pass params back
* remove get_params
* docs
* move compile function to api
* compile clashes with builtin name
* fix compilation error
* remove dead code
Leyuan Wang [Wed, 9 Oct 2019 18:05:45 +0000 (11:05 -0700)]
[TOPI] Add valid auto tvm for Intel Graphics (#4078)
* add valid autotune
* fix pylint
Andrew Tulloch [Wed, 9 Oct 2019 16:35:49 +0000 (09:35 -0700)]
[TVM] Rewrite simplification rule to eliminate unnecessary conditionals. (#4076)
The current bounds checking infrastructure inserts checks like:
```
for (i, 0, bounds[n]) {
if (likely(i < bounds[n])) {
...
}
}
```
into the TVM IR, and these checks are currently not removed by the simplification infrastructure.
This is a little unclean, as these are trivially true since for a loop var `i`
with a given min and extent, we are guaranteed that `i >= min` and `i < min +
extent`. Thus, we can insert these checks into the IR and use them to eliminate
trivial bounds checks early on.
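The reasoning above can be illustrated with a small standalone sketch. This is not TVM's simplifier: `LoopRange`, `guard_is_trivially_true` and `count_guard_failures` are hypothetical names made up for the illustration, and the bound is a plain integer rather than an IR expression. The sketch only shows why a guard `i < bound` is redundant whenever `min + extent <= bound`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopRange:
    """Hypothetical stand-in for a loop var's (min, extent) pair."""
    min: int
    extent: int

    @property
    def upper(self) -> int:
        # Exclusive upper bound: i < min + extent holds inside the loop.
        return self.min + self.extent


def guard_is_trivially_true(rng: LoopRange, bound: int) -> bool:
    """Return True if `i < bound` holds for every i in the loop range."""
    if rng.extent <= 0:
        # Empty loop: the body never runs, so the guard is vacuously true.
        return True
    # The largest value i takes is upper - 1, so the guard always
    # holds exactly when upper <= bound.
    return rng.upper <= bound


def count_guard_failures(rng: LoopRange, bound: int) -> int:
    """Brute-force count of iterations where the guard would be false."""
    return sum(1 for i in range(rng.min, rng.upper) if not (i < bound))
```

For a loop `for (i, 0, 8)` guarded by `i < 8`, the rule reports the guard as redundant and the brute-force count agrees that it never fails.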
Zhi [Wed, 9 Oct 2019 16:19:56 +0000 (09:19 -0700)]
[relay] Small refactor for context (#4091)
Animesh Jain [Wed, 9 Oct 2019 15:23:45 +0000 (08:23 -0700)]
[TOPI][X86] Pool operator parallel support. (#4090)
Yizhi Liu [Tue, 8 Oct 2019 21:54:37 +0000 (05:54 +0800)]
[topi] enable fp16 sort for arm (#4084)
Umang Yadav [Tue, 8 Oct 2019 21:17:15 +0000 (17:17 -0400)]
[ARITH] Add floordiv for the deduce bound (#4025)
Use fdiv in the tests for the deduce_bound
Attila Dusnoki [Tue, 8 Oct 2019 19:24:56 +0000 (21:24 +0200)]
Fix wrong n_trial number in autotvm tutorials' progress bar (#4070)
if n_trial is larger than config space.
Hua Jiang [Tue, 8 Oct 2019 19:16:03 +0000 (12:16 -0700)]
[VTA] hotfix for de10-nano driver (#4081)
Issue:
git clone latest TVM/VTA and run VTA on xilinx FPGA board; the application
crashes with a "call stack overflow" caused by an infinitely recursive
function call. This issue happened before and was addressed by PR 3843.
Analysis:
It seems the de10-nano driver PR was based on an old code base, so the logic change
from PR 3843 was lost.
Solution:
Add the logic back.
mbarrett97 [Tue, 8 Oct 2019 18:54:08 +0000 (19:54 +0100)]
[CodeGen] Disable -mfloat-abi hard option for LLVM < 6.0 (#4071)
The -mfloat-abi hard option does not work for LLVM < 6.0 as it is ignored.
This adds a fatal error when using unsupported LLVM versions so that the failure
is not silent.
Animesh Jain [Tue, 8 Oct 2019 18:44:45 +0000 (11:44 -0700)]
[AlterOpLayout][x86] NHWC to NCHWc conv support. (#4080)
Haichen Shen [Tue, 8 Oct 2019 17:19:49 +0000 (10:19 -0700)]
[Fix][VM] Fix VM invoke with set_params (#4079)
* Fix VM invoke with set_params
* add test
* tweak
Wuwei Lin [Tue, 8 Oct 2019 17:17:56 +0000 (13:17 -0400)]
[QNN] Refactor fixed point multiplication in requantize (#4073)
Logan Weber [Mon, 7 Oct 2019 15:20:59 +0000 (08:20 -0700)]
Fix match case in Python-side expr functor (#4037)
ndl [Mon, 7 Oct 2019 15:20:26 +0000 (17:20 +0200)]
Hide symbols from dependent libraries if HIDE_PRIVATE_SYMBOLS is ON. (#4041)
In current implementation HIDE_PRIVATE_SYMBOLS hides symbols from TVM
itself but not from its dependent libraries. This is problematic when
other third-party libraries with the same symbols are linked to the
same executable.
One example is using TVM with Mesa OpenCL drivers: they depend on LLVM
and load its shared libraries with RTLD_GLOBAL flag, which results in
conflicts with LLVM symbols that TVM uses. Arguably this particular
issue belongs to Mesa (here's their tracking bug:
https://gitlab.freedesktop.org/mesa/mesa/issues/236) but in general
that's the right thing to do regardless of this particular bug.
Note that I'm not enabling this functionality for Darwin as in my
earlier tests their linker didn't seem to understand "--exclude-libs"
(but I don't have test platform ATM to double-check).
雾雨魔理沙 [Mon, 7 Oct 2019 14:11:38 +0000 (07:11 -0700)]
Add gradient for log-softmax (#4069)
Bohan Hou [Mon, 7 Oct 2019 04:27:59 +0000 (12:27 +0800)]
[DOC] Fix typos in tutorials (#4066)
fix some typos
Chengji Yao [Mon, 7 Oct 2019 03:02:06 +0000 (11:02 +0800)]
decrease the complexity of CalcDep from exponential to linear (#4053)
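A common way to get this kind of exponential-to-linear improvement on a dependency graph is to visit each shared node once instead of once per path. The sketch below is a generic illustration under that assumption, not the code from this PR; the graph encoding and the names `count_visits_naive`, `count_visits_memo` and `diamond_chain` are hypothetical.

```python
from typing import Dict, List, Optional, Set

Graph = Dict[int, List[int]]


def count_visits_naive(graph: Graph, node: int) -> int:
    """Walk every path from `node`; shared children are revisited."""
    return 1 + sum(count_visits_naive(graph, c) for c in graph[node])


def count_visits_memo(graph: Graph, node: int,
                      seen: Optional[Set[int]] = None) -> int:
    """Walk the graph but skip nodes that were already visited."""
    if seen is None:
        seen = set()
    if node in seen:
        return 0
    seen.add(node)
    return 1 + sum(count_visits_memo(graph, c, seen) for c in graph[node])


def diamond_chain(depth: int) -> Graph:
    """Node k uses node k + 1 twice, so paths double at every level."""
    graph: Graph = {k: [k + 1, k + 1] for k in range(depth)}
    graph[depth] = []
    return graph
```

On `diamond_chain(10)` the naive walk performs 2047 visits while the memoized walk performs 11, one per node.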
Animesh Jain [Sun, 6 Oct 2019 23:46:06 +0000 (16:46 -0700)]
[Relay][AlterOp] Minor refactor. (#4064)
Animesh Jain [Sun, 6 Oct 2019 04:18:58 +0000 (21:18 -0700)]
[Relay][AlterOp] Improving support for broadcast layout alteration. (#4040)
Ina Dobreva [Sun, 6 Oct 2019 00:40:29 +0000 (01:40 +0100)]
Add parser support for zeros_like tflite operator (#4042)
The tensorflow zeros_like operation provided in array_ops.py produces directly a tensor with zeros
without a graph, using only the shape and type of the input. This imposes the use of gen_array_ops.py
that produces both a tensor and a graph so a comparison between tflite and tvm can be done.
Yong Wu [Sun, 6 Oct 2019 00:33:14 +0000 (17:33 -0700)]
[Bugfix][TF] reset graph after getting tag of savedmodel (#4055)
@zhiics @icemelon9
Wei Chen [Sat, 5 Oct 2019 23:08:53 +0000 (16:08 -0700)]
[Relay][VM] Add more passes to VMCompiler (#4058)
* [Relay][VM] Add more passes to VMCompiler
* Check build config
* Add todo
Wei Chen [Sat, 5 Oct 2019 17:02:08 +0000 (10:02 -0700)]
[Relay][VM] Add autotvm context when compile (#4062)
Haichen Shen [Sat, 5 Oct 2019 03:51:01 +0000 (20:51 -0700)]
[Bugfix] Fix target host for vm compiler (#4057)
* fix
* tweak
雾雨魔理沙 [Sat, 5 Oct 2019 00:24:55 +0000 (17:24 -0700)]
[Relay][Training] Add gradient for Crossentropy (#3925)
* save
save
redo max test
save
address comment
fix
* address comment
* increase rtol
* address review comment
Yizhi Liu [Fri, 4 Oct 2019 22:13:38 +0000 (06:13 +0800)]
[llvm] switch to use Align for llvm trunk (#4051)
Jon Soifer [Thu, 3 Oct 2019 18:55:17 +0000 (11:55 -0700)]
[Relay][TopHub] Add switch to disable TopHub download (#4015)
bindog [Thu, 3 Oct 2019 00:01:36 +0000 (08:01 +0800)]
[Relay][Op] Add instance norm op (#4004)
* [Relay][Op] Add instance norm op
* mend
[Relay][Op] Add instance norm op
Animesh Jain [Wed, 2 Oct 2019 22:39:54 +0000 (15:39 -0700)]
[QNN][Relay] Calling Dialect passes from inside Relay Build API. (#3971)
Umang Yadav [Wed, 2 Oct 2019 20:13:10 +0000 (16:13 -0400)]
[RELAY/PASS] Fix the extent for the post_stmt in the loop partition (#3734)
Wei Chen [Wed, 2 Oct 2019 20:11:30 +0000 (13:11 -0700)]
[TF][Op] Op where (#4045)
* [TF][Op] Add TF op Where
* improve tests
* add tests for vm
Cody Hao Yu [Tue, 1 Oct 2019 23:20:29 +0000 (16:20 -0700)]
Fix split's last factor issue (#4044)
Tianqi Chen [Tue, 1 Oct 2019 21:07:49 +0000 (14:07 -0700)]
[COMMUNITY] ajtulloch -> committer (#4043)
Wei Chen [Tue, 1 Oct 2019 20:09:21 +0000 (13:09 -0700)]
[TOPI]Add op argwhere (#3994)
* Add op argwhere
* Move shape func to _algorithm.py
* Add lint rule
* Raise exception if rank is not supported
* move argwhere to transform
* Add argwhere example
* Fix lint
* Add 1-d support
* cleanup
* Add more dtype support
* CR comment
* Improve error message
* Docs
* raise exception
Yizhi Liu [Tue, 1 Oct 2019 15:40:16 +0000 (23:40 +0800)]
[topi] add ARM v8.2 udot (uint8) support (#3978)
* [topi] add ARM v8.2 udot (uint8) support
* fix test case
* fix common conv2d schedule
* add back fp32_time in test
* fix lint
* fix doc, add support for int32_lanes=4, signed int
* fix lint
* add ic_bn % 4 checker in schedule
Tianqi Chen [Mon, 30 Sep 2019 19:14:59 +0000 (12:14 -0700)]
[COMMUNITY] anijain2305 -> reviewer (#4036)
Animesh Jain [Mon, 30 Sep 2019 17:24:36 +0000 (10:24 -0700)]
[QNN] Renaming dense operator. (#4033)
Animesh Jain [Mon, 30 Sep 2019 17:06:35 +0000 (10:06 -0700)]
[Relay][Compile_engine] Int64 shape handling for outputs. (#4031)
ndl [Mon, 30 Sep 2019 16:25:24 +0000 (18:25 +0200)]
Add dmlc-core to the list of installed header directories. (#4035)
There are dependencies on dmlc-core in TVM public API headers
(e.g. some headers include dmlc/logging.h) so it needs to be installed
as part of TVM for TVM headers to be actually usable.
Tianqi Chen [Mon, 30 Sep 2019 05:06:58 +0000 (22:06 -0700)]
[ARITH] migrate indexdiv/mod to floordiv/mod (#4008)
Logan Weber [Sun, 29 Sep 2019 23:48:10 +0000 (16:48 -0700)]
[Relay] Move prelude to text format (#3939)
* Fix parser
* Doc fix
* Add module utility functions necessary for prelude
* Implement prelude in text format
* Remove programmatically constructed prelude defs
* Fix 0-arity type conses in pretty printer and test
* Make prelude loading backwards-compatible
* Fix patterns
* Improve some prelude defs
* Fix `ImportFromStd`
It needs to also follow the "add unchecked, add checked" pattern
* Lint roller
* Woops
* Address feedback
* Fix `test_list_constructor` VM test
* Fix `test_adt.py` failures
egolearner [Sun, 29 Sep 2019 16:21:18 +0000 (00:21 +0800)]
make tvm compilable by gcc 4.9.2 (#4032)
please see https://stackoverflow.com/a/26949099
Neo Chien [Sun, 29 Sep 2019 03:20:34 +0000 (11:20 +0800)]
[AUTOTVM][DOCS] Add a link to the defining network description of auto-tuning tutorial (#4023)
* [AUTOTVM][DOCS] Add a link to autoTVM tutorial to direct the details of building NN with relay
* [AUTOTVM][DOCS] Add a link to autoTVM tutorial to direct the details of building NN with relay
Tianqi Chen [Sat, 28 Sep 2019 21:43:44 +0000 (14:43 -0700)]
[ARITH] cleanup the indexmod/div on python side (#4028)
bindog [Sat, 28 Sep 2019 17:22:01 +0000 (01:22 +0800)]
[Fix] Add more pad_mode support for onnx converter (#4029)
* [Fix] Add more pad_mode support for onnx converter
* robustness fix
Ina Dobreva [Sat, 28 Sep 2019 00:30:11 +0000 (01:30 +0100)]
Add parser support for ReLU tflite operator (#4022)
Alex Gladkov [Sat, 28 Sep 2019 00:27:48 +0000 (17:27 -0700)]
Additional MXNet Convolution and Deconvolution tests (#4026)
Add different batch sizes and channel numbers to
MXNet Convolution and Deconvolution tests.
brett koonce [Fri, 27 Sep 2019 19:39:42 +0000 (14:39 -0500)]
docs: minor spelling tweaks (#4027)
Paddy Horan [Fri, 27 Sep 2019 16:41:31 +0000 (12:41 -0400)]
[Rust] Fix issue with CPP enums. (#4019)
Tianqi Chen [Fri, 27 Sep 2019 16:34:02 +0000 (09:34 -0700)]
[DOCKER] make demo images consistent with ci images when possible. (#4024)
Yida Wang [Fri, 27 Sep 2019 15:59:15 +0000 (08:59 -0700)]
[Fix]use a more intuitive way to limit the #ops in a group (#4018)
* use a more intuitive way to limit the #ops in a group
* format
Tianqi Chen [Fri, 27 Sep 2019 14:45:25 +0000 (07:45 -0700)]
[ARITH] Use explicit div mode in python. (#4014)
Kimish Patel [Fri, 27 Sep 2019 00:20:00 +0000 (17:20 -0700)]
Exposed lowered func to c++ API. (#4012)
So that you can use: `build_mod_.GetFunction("get_lowered_funcs", false);`
to get lowered_funcs.
Haozheng Fan [Thu, 26 Sep 2019 22:00:14 +0000 (06:00 +0800)]
hide psutil (#4013)
Animesh Jain [Thu, 26 Sep 2019 17:19:42 +0000 (10:19 -0700)]
[QNN][Conv2D] Optimize lowering. (#4006)
Jon Soifer [Thu, 26 Sep 2019 05:48:50 +0000 (22:48 -0700)]
[TOPI][x86] Introduce schedule_injective_from_existing and unify external schedules for all targets (#3983)
* Fix extern schedule for x86
* Register x86::schedule_extern
* Fix
* Fix
* Replace extern.py with extern.h
* Introduce new generic function schedule_injective_from_existing
* Fix
* Fix
* Add back to C++
* Fix style
* Injective schedule calls local schedule_injective_from_existing
* Fix
* Remove target arg from schedule_injective_from_existing
* Fix docs
* Try to fix unit test
* Fix test
* Fix other tests
* Fix bug
Yida Wang [Wed, 25 Sep 2019 23:24:14 +0000 (16:24 -0700)]
[RELAY]impose a max op limit to the op fusion pass (#4002)
* impose a max op limit to op fusion
* use cross platform data type
黎明灰烬 [Wed, 25 Sep 2019 23:02:19 +0000 (07:02 +0800)]
[TOPI] Move conv2d spatial pack schedule to dedicated file (#3972)
More schedules are making the conv2d.py file too large, so
we'd like to move the spatial pack schedule to dedicated file
before introducing NHWC schedule. No logic change in this patch.
Tianqi Chen [Wed, 25 Sep 2019 22:43:31 +0000 (15:43 -0700)]
Revert "Added tensorization for avx2 based gemm. (#3982)" (#4007)
This reverts commit 23727eb49ea71609fc29963b996a68a14fddf79c.
Cody Hao Yu [Wed, 25 Sep 2019 20:50:42 +0000 (13:50 -0700)]
remove FLOP computation for 3rd party lib call (#4005)
Tianqi Chen [Wed, 25 Sep 2019 19:47:29 +0000 (12:47 -0700)]
[ARITH] Refactor to use explicit div/mod functions instead of operators. (#4000)
* [ARITH] Use explicit div/mod functions instead of operators.
* fix pooling case
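The division mode needs to be explicit because floor division and truncating division agree for non-negative operands but differ when the signs are mixed. A minimal plain-Python sketch of the difference follows; the four helpers are local illustrations written for this note, not calls into TVM.

```python
def truncdiv(a: int, b: int) -> int:
    """C-style division: round the quotient toward zero."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def truncmod(a: int, b: int) -> int:
    """Remainder paired with truncdiv; takes the sign of the dividend."""
    return a - b * truncdiv(a, b)


def floordiv(a: int, b: int) -> int:
    """Round the quotient toward negative infinity (Python's //)."""
    return a // b


def floormod(a: int, b: int) -> int:
    """Remainder paired with floordiv; takes the sign of the divisor."""
    return a % b
```

For example, `truncdiv(-7, 2)` is `-3` with remainder `-1`, while `floordiv(-7, 2)` is `-4` with remainder `1`. Both pairs satisfy `a == b * div + mod`, so an expression written with a bare `/` or `%` is ambiguous until the mode is chosen.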
Kimish Patel [Wed, 25 Sep 2019 18:22:54 +0000 (11:22 -0700)]
Expose llvm.nearbyint intrinsic. This is a faster alternate to rounding. (#4001)
* Expose llvm.nearbyint intrinsic. This is a faster alternate to rounding.
* Added python binding. Added test.
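For context on how nearbyint relates to rounding: nearbyint rounds according to the current floating-point rounding mode, which by default is round-to-nearest with ties going to the even integer, whereas C's `round` sends ties away from zero. The sketch below shows the two tie-breaking rules in plain Python, assuming the default rounding mode; `round_half_even` and `round_half_away` are hypothetical helper names and nothing here calls into TVM or LLVM.

```python
import math


def round_half_even(x: float) -> float:
    """Ties go to the even integer (Python's built-in round does this)."""
    return float(round(x))


def round_half_away(x: float) -> float:
    """Ties go away from zero, as C's round() does.

    Simple floor-based sketch; it is only meant for exact .5 ties and
    ordinary values, not for inputs within one ulp of a tie.
    """
    return math.copysign(math.floor(abs(x) + 0.5), x)
```

The two rules only disagree on exact ties: `2.5` becomes `2.0` under ties-to-even and `3.0` under ties-away-from-zero, while `1.5` becomes `2.0` under both.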
Philipp Krones [Wed, 25 Sep 2019 17:29:23 +0000 (19:29 +0200)]
Change Vivado install instructions to version 2018.3 (#4003)
Kimish Patel [Wed, 25 Sep 2019 16:52:09 +0000 (09:52 -0700)]
Added tensorization for avx2 based gemm. (#3982)
* Added tensorization for avx2 based gemm.
Summary:
Tensorized the same region as avx512. Namely, produces 16x1 int32 results.
This is done by issuing two sets of AVX2 instructions that perform the
reduction on an 8x4 int8 kernel with 1x4 data.
Test Plan:
on avx2 machine:
python tests/python/contrib/test_gemm_avx2_acc32.py
* Fix lint errors. Removed commented out code.
Tianqi Chen [Wed, 25 Sep 2019 03:13:29 +0000 (20:13 -0700)]
[COMMUNITY] @yongwww-> reviewer (#3997)
Ina Dobreva [Wed, 25 Sep 2019 00:13:21 +0000 (01:13 +0100)]
add parser support for GREATER tflite operator (#3963)
add test for GREATER