Lianmin Zheng [Sat, 25 Jul 2020 12:07:17 +0000 (05:07 -0700)]
[Ansor][AutoTVM v2.0] Phase 1: Access Analyzer (#6103)
* add access analyzer
* add test cases
* move header files and polish comments
* fix lint
* update
* fix lint
* address comments
* fix lint
Cody Yu [Sat, 25 Jul 2020 04:09:06 +0000 (21:09 -0700)]
[TOPI] Fix CUDA Library Tuning (#6132)
Yanming Wang [Fri, 24 Jul 2020 23:00:09 +0000 (16:00 -0700)]
[AutoTVM][BugFix] Fix autotvm on the conv2d_nchw_winograd.mali operator (#6130)
* [AutoTVM] Fix conv2d_nchw_winograd.mali
* Fix pylint error
Co-authored-by: Yanming Wang <yanmwang@amazon.com>
Haichen Shen [Fri, 24 Jul 2020 22:49:45 +0000 (15:49 -0700)]
[Relay][VM] Allow to config allocator type and refactor vm code structure (#6105)
* [Relay][VM] Allow to config allocator type and refactor vm code structure
* fix doc
* fix
* update
* trigger ci
* trigger ci
* trigger ci
* trigger ci
* fix doc warning
Animesh Jain [Fri, 24 Jul 2020 16:40:25 +0000 (09:40 -0700)]
[Flaky] TFLite quantized conv test (#6084)
Zheng Jiang [Fri, 24 Jul 2020 16:09:50 +0000 (00:09 +0800)]
[Relay]Port eliminate_common_subexpr to non-recursive form (#6134)
Co-authored-by: Zheng Jiang <zhejiang@amazon.com>
lixiaoquan [Fri, 24 Jul 2020 14:42:13 +0000 (22:42 +0800)]
[Relay] Keep fixed dim when unifying dynamic shape (#5795)
Alexander Booth [Fri, 24 Jul 2020 14:22:39 +0000 (07:22 -0700)]
Add 'get_num_inputs' to GraphRuntime (#6118)
Jason Knight [Thu, 23 Jul 2020 21:04:30 +0000 (14:04 -0700)]
[Rust] Some rust cleanups (#6116)
* Some rust cleanups
* Turn off default features for bindgen
* Upgrade some deps for smaller total dep tree
* Switch (/complete switch) to thiserror
* Remove unnecessary transmutes
* Fix null pointer assert
* Update wasm32 test
Haozheng Fan [Thu, 23 Jul 2020 21:04:10 +0000 (05:04 +0800)]
[RELAY][Fix] i64 indices (#5235)
* fix
* resolve comments
Siyuan Li [Thu, 23 Jul 2020 20:23:49 +0000 (04:23 +0800)]
Register Shape Func for Some Operators to Handle Dynamic Shapes (#5955)
* Register Shape Func for Floor Operator
Register the shape function for the `floor` operator. Otherwise, a bug occurs when the input of floor is Any.
* Register shape func for log
* add shape function for crop_and_resize
* change import location
* add mirror_pad shape function
* add test cases for crop_and_resize and mirror_pad shape funcs
* support different layout
* fix pylint error
* fix pylint error
* add test for nchw layout
* block nchw test
* test for nchw
* use tvm.testing.assert_allclose instead
Co-authored-by: lisiyuan <lisiyuan@nucflow>
Giuseppe Rossini [Thu, 23 Jul 2020 15:38:56 +0000 (16:38 +0100)]
Improve reduction schedule on arm CPUs (#6110)
* Improve reduction schedule on arm CPUs
Change-Id: I9cd85deac6a57666b82ff7250d827652a4000d82
* Retrigger CI
Change-Id: I5efd99e34268e6bb990904a4b98e1edf2174b26b
Max Willsey [Thu, 23 Jul 2020 08:14:17 +0000 (01:14 -0700)]
[Rust] Clean up conversions between TVM and Rust functions (#6114)
* Replace ToBoxedFn with From
* Compact and improve Typed and ToFunction impls
- Clone one less time
- Don't panic if number of args is wrong, return an error
- Actually drop functions/closures on the rust side
* Retry
Tianqi Chen [Wed, 22 Jul 2020 05:32:18 +0000 (22:32 -0700)]
[DOCS][REFACTOR] Organize Design and Architectures (#6097)
* [DOCS][REFACTOR] Design and Architectures
This PR refactors the design and architecture docs.
Previously this part of the documentation was quite unstructured and lacked a global
view of the overall architecture.
This PR takes a stab at resolving the problem:
- Provide a guided overview of the current TVM's overall design
- Categorize the specific docs into architecture components or How tos.
* Apply suggestions from code review
Co-authored-by: Jared Roesch <roeschinc@gmail.com>
* Apply suggestions from code review
Co-authored-by: Jared Roesch <roeschinc@gmail.com>
* Update per comment
* More updates per feedback
* clarify external codegen
* Update per comments
Co-authored-by: Jared Roesch <roeschinc@gmail.com>
Jared Roesch [Wed, 22 Jul 2020 02:53:45 +0000 (19:53 -0700)]
[Rust][CI] Move CI over to new Rust crates and try to fix flaky test. (#6011)
Chris Hoge [Tue, 21 Jul 2020 23:41:29 +0000 (16:41 -0700)]
Update installation doc with minor improvements (#6104)
Make some minor improvements to the install from source doc
about flags to enable, package managers, and virtual environments.
Chenfan [Tue, 21 Jul 2020 19:58:48 +0000 (03:58 +0800)]
[Ansor][AutoTVM v2.0] Phase 1: Add annotation/compute_at/compute_root/compute_inline steps (#6073)
* Add annotation step
* Add compute_at/compute_root/compute_inline
* Doc update
* Update
* Update
* Update measure record UT
* Update
* Update
* Update
* Move state implementation to step
* Move measure_record implementation to step
* Order update & API update
* Update the order of state api
* Update
Haichen Shen [Tue, 21 Jul 2020 17:10:16 +0000 (10:10 -0700)]
[Relay][VM] Add ReshapeTensor instruction in the VM to replace the reshape op (#6089)
* [VM] Add reshape tensor instruction
* update
* lint
* fix
* fix
lhutton1 [Tue, 21 Jul 2020 15:30:26 +0000 (16:30 +0100)]
[BYOC][Contrib] Arm Compute Library integration (#5915)
* [BYOC][Contrib] Arm Compute Library integration
Arm Compute Library (ACL) integration using the BYOC infrastructure. This will enable offloading select operators from a relay
graph to ACL so we can achieve faster inference times on Arm CPUs due to hand crafted optimized routines. The PR adds initial
support for offloading FP32 conv2d, maxpool2d and reshape to ACL. ACL codegen is used to generate a JSON representation of an
operator or 'ACL layer', the ACL runtime then uses this representation to construct a layer, cache it and create a packed
function for the graph runtime to call into.
RFC here: https://discuss.tvm.ai/t/rfc-byoc-arm-compute-library-integration/7082
Change-Id: If756dcea787ea346b1508e9a191b7eed7bd02b7f
* Refactor ACL integration to support JSON runtime
* Now uses JSON runtime
* Addresses tutorial comments
* Rename acl to arm_compute_lib in user facing api
Change-Id: I3b5ef80607f713e898363e82ab4398fbc2cf267a
* Address comments
Change-Id: I041fda14f3bf9975f3518ba8a4e3ab43ba98403d
* Address comments
* correct mistakes in tutorial
* reshuffle runtime to use fewer macro blocks
* preprocess module using "optimize" functionality
* use new module api
Change-Id: I219488e617e5767edd7489b43b8bfce876cd24b8
* Enable ACL codegen tests in CI
* Skips runtime tests as these are not supported on x86.
Change-Id: I6843c003a2604afe95cfdccf2323d2a336b56fe5
* Fix check for runtime
Change-Id: I3f9eec15c599f01b1105d624fb053b73bfb6ed41
* Address comments
* Add warning to ACL engine creation
* Correct runtime check so it doesn't fail when codegen not present
* Improve testing to check acl partitions are what is expected
* Check results of multiple runs test
Change-Id: I9522950930805b9b601dad03269adcf8ed3138cc
* Address comments
* Multiple style improvements
* Use base class for creating json node for single op
* Move GetSource to base class
* Improve annotation checks
Change-Id: I8219659c4b99e86df887cd914720157cb94c61a0
* Improve tutorial
Change-Id: I8f610bd37af1e3740fd48c2d502bcc4727d9d712
* Initialize conv with nullptr
Change-Id: I6c37f0d75a064001c74e171ff83b9f7a7c3f1918
Nick Hynes [Tue, 21 Jul 2020 14:52:48 +0000 (07:52 -0700)]
Update SGX example Cargo.toml (#6067)
Cody Yu [Tue, 21 Jul 2020 14:52:17 +0000 (07:52 -0700)]
load empty config (#6100)
QingFu Wei [Tue, 21 Jul 2020 14:51:33 +0000 (22:51 +0800)]
delete declaration of unused op_node (#6102)
Haichen Shen [Tue, 21 Jul 2020 04:09:55 +0000 (21:09 -0700)]
[Cmake] Add default value for option USE_DNNL_CODEGEN in the cmake (#6099)
Haibin Lin [Tue, 21 Jul 2020 00:49:59 +0000 (17:49 -0700)]
[DSL/TE] Scalar support for `te.extern` (#6079)
* fix make shape with scalar shapes
* add test
* add test
* remove scalar shape assertion
* fix the data type for overflow problems
* add extra tests
Co-authored-by: Ubuntu <ubuntu@ip-172-31-42-138.ec2.internal>
Animesh Jain [Mon, 20 Jul 2020 22:59:55 +0000 (15:59 -0700)]
MXNet pre-quantized BERT (#6039)
* MXNet pre-quantized BERT
* Comments.
* Trigger.
* Retrigger CI
* Retrigger CI
* Retrigger CI
* Retrigger
Tianqi Chen [Mon, 20 Jul 2020 20:54:06 +0000 (13:54 -0700)]
[DOCS] Cleanup docs build instructions. (#6094)
Chenfan [Mon, 20 Jul 2020 18:13:48 +0000 (02:13 +0800)]
[Ansor][AutoTVM v2.0] Phase 1: Add RPC Runner (#6077)
* Add rpc runner
* Update
* Update
* Add clflush & non-empty ndarray TODO hints
* Update
* UT Update
* Update timeout in UT
ZHANG Hao [Mon, 20 Jul 2020 15:49:42 +0000 (23:49 +0800)]
lint: add opencl .cl file type (#6092)
Yizhi Liu [Sat, 18 Jul 2020 20:28:20 +0000 (13:28 -0700)]
[Docs] improve the doc of release (#6091)
Matthew Brookhart [Fri, 17 Jul 2020 22:39:57 +0000 (15:39 -0700)]
[Relay][Dyn] Add dynamic reshape grad (#6080)
* add dynamic reshape grad
* fix lint
* fix unit tests, warning
Giuseppe Rossini [Fri, 17 Jul 2020 16:14:49 +0000 (17:14 +0100)]
Fixed point multiplication improvements for AArch64 (#5980)
* Fixed point multiplication improvements for AArch64
Change-Id: Ib3c10348d4c0eac11fa92b39cc6e792560e9eba4
* Fix python linting errors
Change-Id: I4cf5ac18aa24b39374b83805dcc8e1663e173909
* Fix doxygen errors
Change-Id: Ie3c861f8ead3f1ea5b30d5e9d7d94e222299d407
* Fix arm_cpu injective tests
Change-Id: I6ad9da61b61e6bd737627f26fba59767418c07cd
* Fix python linting errors - 2
Change-Id: Ic864a235aa5da5786393cbf6146dd815c121df5e
* Fix arm_cpu injective tests - 2
Change-Id: If9ca1cc3d947b1656c836c7f88de90470d92f979
* Redesign: introduce a qmuls (q-multiply and shift) general intrinsic
Change-Id: I1966fef9aee32eab50e4b984bbe81018488c8c02
* Fix python linting errors - 3
Change-Id: Ib87a19a8ee2d532954a7db1eb5793666e7aef366
* Addressing review comments
Change-Id: Ie82e75204e5a421d17660f381f3e31fc325cd26c
* Fixing test failures
Change-Id: I74cc675764cf8d260fe68a41e770b1ec7e84729a
* Renaming qmuls to q_multiply_shift
Change-Id: I5a8ed60ba855208040304fcdf6e1ea28061f06ad
Haichen Shen [Fri, 17 Jul 2020 16:13:19 +0000 (09:13 -0700)]
[Test] Add missing test for fast erf (#6058)
* add missing test for fast erf
* trigger ci
Tristan Konolige [Fri, 17 Jul 2020 14:51:12 +0000 (07:51 -0700)]
Fix LocalBuilder on macos with python 3.8. (#6083)
Python 3.8 changes the default way multiprocessing creates new processes
on macOS from forking to spawning. Spawning requires all objects to be
picklable. Nested functions and lambdas are not picklable, so this
commit fixes the one instance of nested functions in the codebase that
was causing issues.
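To illustrate the pickling constraint described above, here is a minimal sketch in plain Python. It is not the TVM code touched by this commit; the names `module_level_worker`, `make_nested_worker` and `is_picklable` are hypothetical and exist only for this illustration.

```python
import pickle


def module_level_worker(x):
    """Defined at module level, so pickle can store it by its qualified name."""
    return x + 1


def make_nested_worker():
    """Return a nested function, which pickle cannot look up by name."""
    def nested_worker(x):
        return x + 1
    return nested_worker


def is_picklable(obj):
    """Return True if pickle.dumps succeeds on obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print("module-level:", is_picklable(module_level_worker))
    print("nested:", is_picklable(make_nested_worker()))
    print("lambda:", is_picklable(lambda x: x + 1))
```

Under the spawn start method, a worker function has to survive this kind of check, which is why moving a nested function to module level is a common way to resolve the problem.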
Haichen Shen [Fri, 17 Jul 2020 14:48:38 +0000 (07:48 -0700)]
[Fix] Add missing expr visitor for any (#6082)
Krzysztof Parzyszek [Fri, 17 Jul 2020 02:24:06 +0000 (21:24 -0500)]
[TOPI] Fix the filter width parameter in depthwise_conv2d (#6081)
* [TOPI] Fix the filter width parameter in depthwise_conv2d
* Retrigger build
Co-authored-by: Venkat Rasagna Reddy Komatireddy <quic_rasagna@quicinc.com>
lixiaoquan [Thu, 16 Jul 2020 22:22:28 +0000 (06:22 +0800)]
Refine LSTMBlockCell to support dynamic rnn (#5963)
1. Refine conversion of `LSTMBlockCell`
1) Make its output follow the definition in TensorFlow
2) Avoid introducing variables which don't match any placeholder nodes in the TensorFlow graph
2. About the change in test_forward_ptb
State nodes of LSTMBlockCell in this PB file are actually Constant nodes.
TF can feed data to those Constant nodes but Relay can't, so the current conversion of LSTMBlockCell introduces extra variables to work around this.
But this means the Relay IR doesn't match the original TF graph. This PR solves the issue by converting those state nodes into placeholders.
Lianmin Zheng [Thu, 16 Jul 2020 20:18:25 +0000 (13:18 -0700)]
[ARITH] Improve vector simplification for float operands (#6043)
Hua Jiang [Thu, 16 Jul 2020 19:06:43 +0000 (12:06 -0700)]
[VTA] Fix FSIM Compile Error. (#6070)
Issue:
When the VTA target is set to "sim", VTA compilation fails with the
error message "fatal error: vta/driver.h: No such file or directory".
Solution:
Set the VTA_HW include path correctly.
Yanming Wang [Thu, 16 Jul 2020 18:02:06 +0000 (11:02 -0700)]
[AutoTVM][BugFix] Fix variable name conflict with OpenCL keyword (#6048)
Co-authored-by: Yanming Wang <yanmwang@amazon.com>
Zhao Wu [Thu, 16 Jul 2020 17:30:08 +0000 (01:30 +0800)]
Remove unnecessary std::cout (#6072)
* Remove unnecessary std::cout
* Trigger CI
windclarion [Thu, 16 Jul 2020 10:42:08 +0000 (18:42 +0800)]
[RUNTIME][CRT] init TVMPackedFunc's name (#6044)
Otherwise, line 639 of TVMGraphRuntime_Run in src/runtime/crt/graph_runtime/graph_runtime.c
will print garbled text.
Signed-off-by: windclarion <windclarion@gmail.com>
Krzysztof Parzyszek [Thu, 16 Jul 2020 02:22:56 +0000 (21:22 -0500)]
Fix error message in Buffer::vstore, NFC (#6056)
* Fix error message in Buffer::vstore, NFC
* Fix whitespace in comment as well
notoraptor [Thu, 16 Jul 2020 02:21:28 +0000 (22:21 -0400)]
Add operation scatter_add to relay, based on scatter implementation. (#6030)
Haichen Shen [Thu, 16 Jul 2020 01:34:21 +0000 (18:34 -0700)]
[Relay][Pass] Merge two consecutive reshape ops (#6052)
Zhi [Thu, 16 Jul 2020 01:04:37 +0000 (18:04 -0700)]
[BYOC][Optimization] Run accelerator specific optimizations (#6068)
* register and invoke optimization pipeline for external codegen
* add unit test
Zhao Wu [Wed, 15 Jul 2020 22:23:40 +0000 (06:23 +0800)]
[clflush] Enable x86 cpu cache flush (#5914)
Mahesh Ambule [Wed, 15 Jul 2020 20:24:24 +0000 (01:54 +0530)]
[TARGET] ONNX codegen (#5052)
* Relay to ONNX converter
* Relay to ONNX op test cases
* Relay to ONNX end to end model test cases
* Add test cases to jenkins
* CI CD fixes
* ONNX codegen
* ONNX codegen
* ONNX codegen
* onnx testcases
* ONNX codegen
* test onnx
* ONNX codegen
* shape calculation
* move onnx codegen to contrib/target
* review comments
* ONNX target use visitor
* onnx fixes
* lint fixes
* doc string changes
* review comments
* review comment fixes
* review comment
* pytest skip
* rename type to node type
* test
* Fix for constantshape, add exp, fix for metadatamodule
* Fix cpplint
* change error tol values
Zhao Wu [Wed, 15 Jul 2020 17:06:36 +0000 (01:06 +0800)]
[Doc] update frontend tutorials to new model based runtime (#6063)
Chenfan [Wed, 15 Jul 2020 07:24:19 +0000 (15:24 +0800)]
[Ansor][AutoTVM v2.0] Part 1: Rename namespace from auto_schedule to auto_scheduler (#6059)
* Rename namespace auto_schedule to auto_scheduler
* Update
* Lint fix
Zhao Wu [Wed, 15 Jul 2020 03:07:43 +0000 (11:07 +0800)]
[RUNTIME] Support module based interface runtime (#5753)
Andrew Reusch [Wed, 15 Jul 2020 01:11:56 +0000 (18:11 -0700)]
Build crttest and cpptest separately. (#6057)
* Build crttest and cpptest separately.
* Try to fix random CI crashing, likely caused by concurrent cmake execution.
* Revert to -j8
Chenfan [Wed, 15 Jul 2020 00:16:22 +0000 (08:16 +0800)]
[Ansor][AutoTVM v2.0] Part 0: Ansor minimum system for auto schedule generating (#5962)
* Code migration Start (#1)
* Init commit: Code migration Start
* Add loop_state.cc/h
* Add ComputeDAG basic test
* Split transform_step out & Update more UTs (#3)
* Split transform_step out
* Update GetProducers & GetConsumers
* Update UTs
* Add UT for CacheReadWrite & Some bug fix
* Add search_task, measure and serialization (#4)
* Add FollowSplit & FollowFusedSplit tests
* Update dag.InferBound & its UT
* Add search_task, measure and serialization
* Update Serialization UT
* Add MetaTileRewritePolicy (#5)
* Add feature
* Add cost_model, meta_tile_rewrite_policy
* Add MetaTileRewritePolicy basic UT
* Basic Python API for State (#6)
* Add Basic Python API for State
* Add UTs for State
* Add Python API: Measure & Task (#7)
* Update the return value of state operation
* Add task
* Copy measure.py & utils.py
* Fix LocalBuilder
* Fix LocalRunner
* Add ansor.auto_schedule() API; First AutoSchedule working version(#8)
* Add basic Python support for ansor.auto_schedule
* Update AutoSchedule API
* Bug fix for getting the attach point of a fused iter
* Update UT after infer bug fix
* Bug fix & Add python serialization API (#10)
* Delete C++ UT hack since Python is ready
* Add ndarray.non_empty
* Update Serialization python API
* Improve code style, python wrapper and test cases (#11)
* Update c++ code style and unit test
* Update python State wrapper and test cases
* fix unit tests
* Add RPCRunner & OpenCL/CUDA test (#12)
* Add RPCRunner & OpenCL search test
* Add CUDA search test
* Add RPCRunner test
* rebase to upstream/master
* Add Ansor basic tutorial (#13)
* Add basic tutorial
* migrate feature extraction (#14)
* Add XGBModel & RPCRunnerWarpper (#15)
* Add XGBModel & RPCRunnerWarpper
* Revert "Add Parallel Granularity Mutation"
* Migrate workload_registry.py (#16)
* add workload registry
* update
* update
* add task scheduler (#17)
* Add conv2d cuda tutorial with workload registry (#18)
* add tune_test.py (the old tune_wkl.py) (#19)
* add tune_test.py (the old tune_wkl.py)
* update
* fix measure
* fix for gpu
* Code refine for tune_test.py & Add a pre load callback (#20)
* Bug fix for tutorials
* Add PreLoadMeasuredStates
* Add search_callback support for task tuner
* Code refine for tune_test.py
* Update
* Update
* Update
* Update
* Bug fix
* Add python custom sketch rule (#21)
* Add custom sketch rule
* Bug fix
* Ansor Relay Integration (without layout rewrite) (#22)
* relay integration
* Add tune_op_subgraph.py & Some code clean for tune_network.py (#23)
* Add single op tune scripts
* Add tune subgraph support
* Merge all op & all subgraph to one file
* Rename file
* add explicit_unroll_max_extent (#25)
* Add Index simplification & API update (#26)
* Add vectorized cooperative_fetching test
* Update math simplify for vectorized CF
* File rename
* Update tune_network
* API update
* Update PreLoadMeasuredStates & Some bug fix (#27)
* Add a threading wrapper to fix the test bug
* Set default TVM_USE_AUTO_SCHEDULER to false
* Update PreLoadMeasuredStates callback
* Add tensorize step for loop_state (#31)
* Add tensorize step
* State python api update (#33)
* Start to update api
* Add compute_dag to state
* API update
* kernel layout rewrite (#28)
* kernel layout rewrite
* remove some hacks
* add defuse_ops pass and move kernel_layout_rewrite pass after fuse_ops pass
* set TVM_RELAY_DISABLE_BUILD_CACHE for task extraction and prepare_layout_rewrite
* [cache flush] port cache flush to ansor (#32)
* Improve relay integration (#34)
* tmp checkpoint
* Improve relay integration
* Improve relay integration
* Fix xgb error & Simplify dispatcher (#35)
* Rename "MetaTileRewritePolicy" to "SketchPolicy". (#36)
* Rename "MetaTileRewritePolicy" to "SketchPolicy".
* Add a new class for auto_unroll_max_step, storage_offset in StageNode
* fix tune_op_subgraph.py
* rebase
* Migrate all node::make to noderef's construct function (#37)
* Start to move xxxnode::make to noderef()
* Update
* Update
* Finish transform_step
* Finish compute dag & auto schedule
* Update
* Update
* Update
* Update
* Update
* Code refine
* Code refine
* Code refine
* Update
* Update
* Some lint fix & Recover the double constructor of tvm::PrimExpr (#39)
* lint fix
* clang-format-fix
* pylint fix
* Update
* Recover the double constructor of tvm::PrimExpr
* Fix pylint
* pylint fix
* pylint fix
* Add MutateComputeLocation and MutateParallel in evolutionary search (#40)
* Add MutateComputeLocation and MutateParallel in evolutionary search
* fix lint
* Improve loop state python API (stage_tensors -> stage_ops) (#41)
* improve loop state python API (stage_tensors -> stage_ops)
* fix
* ComputeDAG bug fix & Add Custom TensorCore Matmul Example (#42)
* Bug Fix
* Sample example of Custom TensorCore Matmul
* Revert Commits, Start to build minimum Ansor system
* Code clean for minimum Ansor system
* Bug fix & Delete AccessAnalyzer
* Delete attachmap & Code clean
* Doc update
Update statenode::stages from vector to Array
* Headfile update & Python doc update
* clang-format fix
* pylint fix
* Update
* Doc update
* Update
* Bug fix after code merge to the new master
* clang-format fix
* Update
* Update
* Update std::vector to Array; Update verbosity setting; Some comments
addressed
* std::vector->Array & std::string->String
* Add init_state to ComputeDAG
* Update
* Update some unordered_map to Map
* clang-format fix
* Comments addressed
Delete ReplayAndInferBound
Delete ReplaySteps & InferBoundCommon
* Lint fix
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Rename ansor namespace to auto_schedule
* Update
* Rename ThreadPool to ParallelFor
* Add parallel_for
* Remove ThreadPool
* Update python/tvm/auto_schedule/auto_schedule.py
* trigger CI
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
Co-authored-by: Minmin Sun (孙敏敏) <minmin.smm@alibaba-inc.com>
Co-authored-by: Zhao Wu <zhaowu@apache.org>
Lily Orth-Smith [Tue, 14 Jul 2020 22:43:07 +0000 (15:43 -0700)]
[RELAY][DYN] Dynamic broadcast_to, zeros, ones (#6007)
* Dynamic BroadcastTo
* fixed lint!
* add test_one_hot() back
* add one_hot registration back
* Dynamic BroadcastTo
* fixed lint!
* add one_hot registration back
* fixed lint.. again
* fixed lint
* lint
* responding to comments
* skipping cuda in dynamic test
* skipping cuda in dynamic test
* fixed i386 test and GPU test
* lint
* starting ones and zeros
* fixed dynamic ones and zeros, wrote dyn ones and zeros test
* added static version of zeros, ones and added a check for size of types to static BroadCastToRel
* added dynamic to static pass for zeros and ones, dynamic test and dynamic to static test
* removed op_str in dyn to static pass test
* fixed lint
* fix lint hopefully
* removed import const
* removed import that was actually used
* copy all attributes from broadcast_to, ones, zeros, full
* responding to comments
* fixed build error
* finishing rebase
* fix lint
Co-authored-by: Lily Orth-Smith <lorthsmith@Lilys-MacBook-Pro.local>
Krzysztof Parzyszek [Tue, 14 Jul 2020 21:10:44 +0000 (16:10 -0500)]
[Hexagon] Remove use of designated initializers from hexagon_module.cc (#6055)
They are an extension, not yet a part of the C++ standard.
MORITA Kazutaka [Tue, 14 Jul 2020 15:53:56 +0000 (00:53 +0900)]
[BYOC][COREML] Handle one symbol for each runtime (#5989)
* [BYOC][COREML] Handle one symbol for each runtime
* LOG -> DLOG
Jinyu Xie [Tue, 14 Jul 2020 06:51:54 +0000 (02:51 -0400)]
Fix pytorch frontend prim::Constant issue (#6051)
Matthew Brookhart [Tue, 14 Jul 2020 03:42:54 +0000 (20:42 -0700)]
Refactor to expose MakeOp functions to C++ (#6047)
* Initial Refactor
* add templated nn Make* functions
* fix build typo
* inline functions, fix unit tests
Liangfu Chen [Tue, 14 Jul 2020 03:16:49 +0000 (11:16 +0800)]
[IR] Fix a primitive check error (#5991)
* fix primitive check error
* assuming every Op has Type defined
* CHECK_NE -> CHECK
Co-authored-by: Liangfu Chen <liangfc@amazon.com>
Giuseppe Rossini [Tue, 14 Jul 2020 03:15:42 +0000 (04:15 +0100)]
Fix conv2_gemm after target structure update (#6037)
After target structure changed in this RFC:
https://discuss.tvm.ai/t/rfc-tvm-target-specification/6844/42
The conv2d optimizations were broken for the following reasons:
- "target" is now called mtriple (this changes how we test if the
architecture is AArch64)
- when we invoke "clang.create_llvm" we still need to specify the
"--target" option (set to aarch64-linux-gnu)
This submission reverts those changes
Change-Id: I04c597b91ca5800ddf4471255e2a358c60bc048e
Trevor Morris [Tue, 14 Jul 2020 01:12:28 +0000 (18:12 -0700)]
[Frontend][TFLite] Fix fully_connected converter when batch size is not 1 (#6038)
* Fix fully_connected when batched
* Remove unused variable
Dmitriy Smirnov [Mon, 13 Jul 2020 23:08:56 +0000 (00:08 +0100)]
Add support for tflite arg_min and arg_max (#5992)
* [Relay][Frontend][TFLite] Add parser support for arg_min_max
* this implementation supports only the case when the axis is a scalar
* tflite 1.13 removes all dims of size 1, Relay doesn't do this
* WARNING: every newer version of tflite > 1.13 needs keepdims=TRUE
* Migrated to tflite 2.1.0
keepdims set to False and added some checks
Note the unit tests emitted the following warning:
/workspace/src/te/schedule/bound.cc:119: not in feed graph consumer = compute(T_multiply_red_temp, 0x53f5050)
* linter
* Removed quantized argmin
Removed quantized argmin due to inability to provide a proper test case
* added negative ranges
* re-trigger CI
Co-authored-by: Ina_Dobreva <Ina.Dobreva@arm.com>
Yi-Hsiang (Sean) Lai [Mon, 13 Jul 2020 22:39:10 +0000 (18:39 -0400)]
[Relay] Add pass for getting calibration data from a relay module (#5997)
* add simple pass to extract outputs
* complete pass that collects all function inputs/outputs
* add analysis pass for collecting outputs
* reorganize the files
* add the first test
* update test with tuples
* clean up Python code
* merge with upstream
* clean up transform.py
* add comments for cpp files
* fix lint issues
* update submodules
* modify files according to the review
* fix style and typo
* fix lint error
* add checks for repeated function calls
* fix lint error
* merge review comments
* small simplification
* revise the code according to the review comments
* add username in TODO
* use IRModule directly
* use better APIs according to the review
* apply comments from the reviewer
* retrigger ci
Krzysztof Parzyszek [Mon, 13 Jul 2020 22:29:52 +0000 (17:29 -0500)]
[LLVM] Create TBAA information based on the underlying buffer type (#6046)
Currently, the TBAA information is based on the access type, i.e.
the data type from the load or store instruction. When the same
memory area is accessed with different types, the corresponding
load/store instructions may end up not being aliased to each other.
This could lead to incorrect code being generated.
An example of when such a situation can occur is when two different
buffer_decl's are created for the same buffer:
ba = buffer_decl(... dtype = 'int16' ...)
bb = buffer_decl(data = ba.data, dtype = 'int32x32' ...)
Then instructions
ba[x] = 0
... = bb[x]
may be reordered in the final code due to the alias info indicating
that they are not aliased.
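The aliasing hazard can be illustrated outside TVM. The sketch below uses only the Python standard library (no TVM or LLVM API); two typed views over one byte buffer play the roles of `ba` and `bb`. It is an analogy for why the store and the load must stay ordered, not a reproduction of the generated code.

```python
# Analogy only: two typed views of the same memory, like the two buffer_decl's.
storage = bytearray(8)                  # shared underlying memory, zero-filled

ba = memoryview(storage).cast("h")      # view the bytes as 16-bit signed integers
bb = memoryview(storage).cast("i")      # view the same bytes as 32-bit signed integers

before = bb[0]                          # read through the wide view
ba[0] = 0x1234                          # store through the narrow view
after = bb[0]                           # read the same memory through the wide view again

# The store through `ba` is visible through `bb`, so the two accesses alias.
# Swapping the store and the second load would change what the load observes.
print(before, after, before != after)
```

Type-based alias metadata keyed on the access type would treat the 16-bit store and the 32-bit load as unrelated, whereas metadata keyed on the underlying buffer keeps them related.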
Lianmin Zheng [Mon, 13 Jul 2020 17:46:27 +0000 (10:46 -0700)]
[CODEGEN] Fix code generation bugs for C/CUDA & Improve VerifyGPUCode pass (#6041)
Josh Fromm [Sun, 12 Jul 2020 12:26:02 +0000 (05:26 -0700)]
[Relay][Frontend][Onnx] GRU Layer Support (#6020)
* GRU debugging and testing added to onnx frontend.
* All tests working and code formatted.
* Fix lint issues.
* Add a test case and changed RNN argument parsing.
* Small refactor.
Andrew Reusch [Sun, 12 Jul 2020 09:28:31 +0000 (02:28 -0700)]
µTVM CRT modifications for on-device RPC server (#5921)
* Reorganize CRT into parts, public API, and add standalone build.
* Create a make-based build in src/runtime/crt. This is intended to
be built in build/standalone_crt (generated by running ninja
standalone_crt in build/). Its job is to build CRT without
depending on headers not explicitly allowed in CRT.
* Create a "public-facing" CRT API targeted to firmware running
alongside CRT in include/tvm/runtime/crt. Developers who are
integrating the CRT are the target of this API.
* Reorganize CRT internally into common/ and graph_runtime/
pieces. Build each piece as a separate statically-linked library.
* Slim down TVMGraphRuntime public-facing API to just the functions
that are used externally.
* Updates to apps/bundle_deploy to make this work.
* Add TVMFuncRegistry, CRT test infrastructure, and tests.
* Also add error_codes.h, a file containing error codes returned by CRT.
* Add TVMErrorf()
* [API_CHANGE] Integrate func registry into CRT.
* NOTE: This changes the default API for functions exposed under the
CRT by the TVMFuncCall API. `resource_handle` is now always given
as a new 6th parameter.
* `resource_handle` is NULL when invoked on a global function and a
pointer to the module owning the function otherwise.
* Generalize arena-based memory manager.
* lint
* Fix git-clang-format arg parsing
* add apache header
* add mutable func registry tests
* git-clang-format
* fix more lint
* Move memory_test to crttests.
* fix tests
* checkpoint
* checkpoint
* bundle_deploy demo_static works
* rm debug printf
* git-clang-format
* fix lint
* add asf header
* pylint
* update build configs for jenkins
* make regression compiler happy
* fix build errors in regression GCC
* address comments
* git-clang-format
* fix for 32-bit cpp regression
* fix incorrect use of memcpy and tests for 32-bit
* clang-format
Krzysztof Parzyszek [Fri, 10 Jul 2020 22:36:17 +0000 (17:36 -0500)]
[LLVM/CPU] Terminate basic block after "ret" instruction (#6036)
* [LLVM/CPU] Terminate basic block after "ret" instruction
"Ret" is a terminator in LLVM IR and there should be no instructions
in the basic block following it. When generating a "ret", end the
current block and start a new one.
Matthew Brookhart [Fri, 10 Jul 2020 18:18:34 +0000 (11:18 -0700)]
[Relay][Dyn] Dynamic TopK Op (#6008)
* add dynamic topk op
* add topk to dynamic_to_static pass
* fix TF test
* fix pylint
Zhi [Fri, 10 Jul 2020 18:03:23 +0000 (11:03 -0700)]
[REFACTOR][RELAY] Move invoke_tvm_op and shape_func to vm dialect (#5958)
* [REFACTOR][RELAY] Move invoke_tvm_op and shape_func to vm dialect
* address comments
Giuseppe Rossini [Fri, 10 Jul 2020 17:58:22 +0000 (18:58 +0100)]
[Bug fix] Fix in arm_cpu/conv2d_alter_op for NHWC quantized (#6027)
* [Bug fix] Fix in arm_cpu/conv2d_alter_op for NHWC quantized
Few minor typos to be fixed in topi/arm_cpu/conv2d_alter_op.py for the
NHWC quantized route:
- Kernel shape was misread (CO, IC, KH, KW) -> (KH, KW, IC, OC)
- Pad along the K dimension was misspelled: pad_k -> pad_K
- Workload name was wrong: "conv2d_NHWC_int8_without_tranform.arm_cpu"
-> "conv2d_NHWC_quantized_without_transform.arm_cpu"
This submission fixes those errors and adds a further test for conv2d_alter_op.py
Change-Id: I0622df05f1d4d15311946f6e75f1840a34815a5b
* Move -target to -mtriple
Change-Id: Ieff80c774e8ab0fa7f48d83d50a79f3a62e8fe13
* Retrigger tests
Change-Id: I5541bed54eacc5063bf4a4fda725209cc23f621e
Krzysztof Parzyszek [Fri, 10 Jul 2020 17:57:05 +0000 (12:57 -0500)]
Add creation of Hexagon device in RPC client (#6035)
lhutton1 [Fri, 10 Jul 2020 16:43:25 +0000 (17:43 +0100)]
[CI][ACL] Enable ACL installation in ci_cpu docker container (#5916)
This patch adds a cross-compiled ACL build to the ci_cpu dockerfile used for CI.
Change-Id: I66e1521ab553306bc7367b65acc0363e750f0211
Cody Yu [Fri, 10 Jul 2020 11:38:46 +0000 (04:38 -0700)]
[BYOC] JSON Runtime with DNNL End-to-End Flow (#5919)
* json runtime
* json dnnl WIP
* fix ArrayNode usages
* Support composite functions
* DNNL json runtime: conv2d/add/relu/dense/bn
* add a more complex example
* fix bias memory issue
* rebase to upstream
* merge to metadata module, remove the unused driver
* handle constant
* support composite functions
* support DNNL constant
* clean up
* Simplify dnnl user code
* GetDataSize
* fix dense bug
* improve cmake
* zero copy
* add unit test
* move json to contrib/json
* fix cmake
* lint
* max_digits10 for fp serialization
* only keep base getfunction
* fix lint
* zero copy for all data entries
* address comments
* enable ci
* address comment; fix bug
* address comment
Co-authored-by: Zhi Chen <chzhi@amazon.com>
Tianqi Chen [Fri, 10 Jul 2020 05:28:15 +0000 (22:28 -0700)]
[CI] Update ci-cpu to the latest (#6031)
Tianqi Chen [Fri, 10 Jul 2020 05:27:50 +0000 (22:27 -0700)]
[DOCKER] Pin keras version (#6032)
windclarion [Thu, 9 Jul 2020 20:12:42 +0000 (04:12 +0800)]
[TARGET] each option of target str should only contain one '=' (#5988)
src/target/target_id.cc ParseAttrsFromRawString L222:
if ((pos = FindUniqueSubstr(s, "=")) != -1)
requires each option to contain only one '='
Signed-off-by: windclarion <windclarion@gmail.com>
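The single-'=' rule above can be sketched in Python. This is not the C++ code in src/target/target_id.cc; `find_unique_substr` and `parse_option` are hypothetical helpers, the behaviour of returning -1 when the substring is missing or repeated is an assumption inferred from the commit message, and raising an error on failure is a choice made only for this illustration.

```python
def find_unique_substr(s, sub):
    """Return the index of `sub` in `s` if it occurs exactly once, else -1.

    Assumed behaviour, modelled on the FindUniqueSubstr call quoted above.
    """
    first = s.find(sub)
    if first == -1:
        return -1
    if s.find(sub, first + len(sub)) != -1:
        return -1
    return first


def parse_option(option):
    """Split `key=value` into (key, value), rejecting zero or several '='."""
    pos = find_unique_substr(option, "=")
    if pos == -1:
        # Hypothetical error handling; the commit does not say what happens here.
        raise ValueError("option must contain exactly one '=': %r" % option)
    return option[:pos], option[pos + 1:]


if __name__ == "__main__":
    print(parse_option("-mcpu=cortex-a53"))
```

An option such as `-mattr=+neon=on` would be rejected by this sketch because it contains two '=' characters.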
Zheng Jiang [Thu, 9 Jul 2020 04:54:49 +0000 (12:54 +0800)]
fix typos in comments and relay tutorial (#5999)
* [TypoFix]fix typos in comments and relay tutorial
* retrigger
windclarion [Thu, 9 Jul 2020 04:52:24 +0000 (12:52 +0800)]
[RUNTIME] if a param is not in input, we still consume its data (#5990)
so the read pointer of the stream can move forward
Signed-off-by: windclarion <windclarion@gmail.com>
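The idea in the commit above can be sketched in Python as follows. The record layout (a 4-byte little-endian length followed by the payload) and the names write_params and load_params are assumptions made for this illustration; they are not the TVM runtime's parameter format or API.

```python
import io
import struct


def write_params(payloads):
    """Serialize payloads as [4-byte little-endian length][bytes] records."""
    buf = io.BytesIO()
    for payload in payloads:
        buf.write(struct.pack("<I", len(payload)))
        buf.write(payload)
    return buf.getvalue()


def load_params(blob, names, input_names):
    """Read every record in order, keeping only those named in input_names."""
    stream = io.BytesIO(blob)
    loaded = {}
    for name in names:
        (size,) = struct.unpack("<I", stream.read(4))
        # Always consume the payload, even for a param we do not need,
        # so the read position lands on the start of the next record.
        payload = stream.read(size)
        if name in input_names:
            loaded[name] = payload
    return loaded
```

If the payload of an unwanted param were not read, the next length field would be decoded from the middle of that payload, and every later param would be corrupted.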
Siju Samuel [Thu, 9 Jul 2020 01:11:26 +0000 (06:41 +0530)]
[PYTORCH]Gather op support added (#6013)
* [PYTORCH]Gather op support added
* retrigger
HUAN-PING SU [Wed, 8 Jul 2020 20:30:41 +0000 (04:30 +0800)]
Remove duplicate line (#6017)
Jared Roesch [Wed, 8 Jul 2020 20:04:42 +0000 (13:04 -0700)]
[Frontend][Relay] Add Parser 2.0 (#5932)
lhutton1 [Wed, 8 Jul 2020 17:14:58 +0000 (18:14 +0100)]
Option to specify alternate directory to output build to (#6016)
This is useful when you would like to manage 2 separate builds in the same tvm tree. You can specify a build directory when using make by adding OUTDIR=alternate-build-dir.
Change-Id: I3efed1135343f3903007115ce5dd683ef7bd9e8c
Yizhi Liu [Wed, 8 Jul 2020 15:35:28 +0000 (08:35 -0700)]
[TEST][FLAKY] test_arith_solve_linear_inequality.py::test_multi_equal (#6014)
Haibin Lin [Wed, 8 Jul 2020 05:37:10 +0000 (22:37 -0700)]
[Frontend][MXNet] MXNet frontend support for AMP cast op (#5976)
* amp_cast
* fix test
* more tests
* test more ctxs
* fix doc
* fix typo
* address CR comment
* fix lint
* revert doc change
* Revert "revert doc change"
This reverts commit a410dd5569730ac81af67ddb333c3afbe97eddd7.
* fix doc
* Update relay_pass_infra.rst
Co-authored-by: Ubuntu <ubuntu@ip-172-31-42-138.ec2.internal>
Tianqi Chen [Wed, 8 Jul 2020 03:55:44 +0000 (20:55 -0700)]
[VTA] Move compiler related registry items to vta/build_module.py (#6012)
Krzysztof Parzyszek [Wed, 8 Jul 2020 00:34:10 +0000 (19:34 -0500)]
Cache object refs in loop partitioner instead of object pointers (#6004)
* Cache object refs in loop partitioner instead of object pointers
Loop partitioner modifies the IR, which can cause TIR objects to
become dead and be destroyed. To avoid working on junk data, cache
object references instead of object pointers.
* Fix format/lint errors
Krzysztof Parzyszek [Wed, 8 Jul 2020 00:31:59 +0000 (19:31 -0500)]
[LLVM] Auto-convert shuffle with single index to "extract element" (#6006)
* [LLVM] Auto-convert shuffle with single index to "extract element"
Data types with a single lane are treated as scalars in TVM. On the
other hand, in LLVM there is a difference between a scalar type and
a vector type with a single lane. Because of that, a shuffle with
a single index is equivalent to extracting an element in TIR, but
not in the generated LLVM IR. This patch changes the LLVM codegen
for shuffle to auto-convert single-lane vectors to scalars.
* Try another build
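As a rough illustration of the semantics described in the commit above, the Python sketch below models a shuffle over lists. The function shuffle is a hypothetical model, not the TVM or LLVM API: it concatenates the input vectors, picks the requested lanes, and returns a plain scalar when exactly one index is given, mirroring the single-lane-is-scalar convention the message attributes to TVM.

```python
def shuffle(vectors, indices):
    """Pick lanes from the concatenation of `vectors`.

    A single index yields a scalar (an "extract element");
    several indices yield a new vector.
    """
    flat = [lane for vec in vectors for lane in vec]
    picked = [flat[i] for i in indices]
    return picked[0] if len(picked) == 1 else picked
```

For example, shuffle([[10, 20], [30, 40]], [2]) yields the scalar 30, while shuffle([[10, 20], [30, 40]], [3, 0]) yields the vector [40, 10].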
Jared Roesch [Wed, 8 Jul 2020 00:15:13 +0000 (17:15 -0700)]
Fix what looks like a bizarre copy-paste issue (#6010)
Tianqi Chen [Tue, 7 Jul 2020 19:53:51 +0000 (12:53 -0700)]
[DOCKER] Only pass pythonpath for ci images (#6005)
Matthew Brookhart [Tue, 7 Jul 2020 18:50:35 +0000 (11:50 -0700)]
Dynamic Tile Op (#5983)
* first working dynamic tile passes first test
* add dyn tile to dynamic_to_static
* fix cpplint
* respond to review comments. Thanks @siju-samuel
* make dynamic tile compatible with numpy API
Christian Clauss [Tue, 7 Jul 2020 16:10:24 +0000 (18:10 +0200)]
Undefined names: import os for line 324 & import re for line 308 (#6003)
Christian Clauss [Tue, 7 Jul 2020 16:10:10 +0000 (18:10 +0200)]
Update main.yml (#6002)
Lianmin Zheng [Tue, 7 Jul 2020 15:18:31 +0000 (08:18 -0700)]
Fix tune_relay_cuda.py (#6001)
Yizhi Liu [Mon, 6 Jul 2020 17:04:42 +0000 (10:04 -0700)]
[Arith] Inequalities solver (#5618)
Josh Fromm [Mon, 6 Jul 2020 02:38:31 +0000 (19:38 -0700)]
[Relay][Frontend][Onnx] Small bug fix for Conv1D imports. (#5995)
* Fix autopad bug in onnx importer for conv1d.
* Fix output shape in test.
* Undo commented out lines oops.
Junru Shao [Sun, 5 Jul 2020 16:52:08 +0000 (09:52 -0700)]
[Target] Use TargetNode::attrs for Target serialization (#5993)
Animesh Jain [Fri, 3 Jul 2020 01:13:56 +0000 (18:13 -0700)]
[TFLite] QNN support for TFLite 2.1.0 quantized models (#5848)
* [TFLite] TFLite 2.x parser quantization support.
* Address comments. Fix a bug for depthwise conv
* Added tests for relu, conv, quantize. Address comments.
* Using web-data. Minor refactoring.
* Removing TF hub package
* Trigger CI.
* Handle TFLite input layer naming.
* Addressing reviews.
* Retrigger CI.
Krzysztof Parzyszek [Fri, 3 Jul 2020 00:04:22 +0000 (19:04 -0500)]
[LLVM] VectorType::get with two parameters is deprecated in LLVM 11+ (#5984)
In LLVM 11+ the distinction between fixed and scalable vector types
has become more explicit. Before the introduction of scalable vector
types VectorType::get(e,n) created what is now a fixed vector type.
With the addition of scalable types, it is recommended to use
FixedVectorType and ScalableVectorType classes directly. Alternatively,
there is a VectorType::get that accepts a 3rd parameter indicating
whether the type should be fixed or scalable.
Using the older VectorType::get that implicitly assumes the fixed type
is deprecated and LLVM now generates a warning.
Change calls to VectorType::get to FixedVectorType::get to avoid
compilation warnings.
Josh Fromm [Thu, 2 Jul 2020 21:19:41 +0000 (14:19 -0700)]
[Tutorial] Demo showing how to run a pruned 🤗 model. (#5975)
Junru Shao [Thu, 2 Jul 2020 19:23:57 +0000 (12:23 -0700)]
[Target] Migrate data structure of TargetNode (#5960)