ZHANG Hao [Mon, 20 Jul 2020 15:49:42 +0000 (23:49 +0800)]
lint: add opencl .cl file type (#6092)
Yizhi Liu [Sat, 18 Jul 2020 20:28:20 +0000 (13:28 -0700)]
[Docs] improve the doc of release (#6091)
Matthew Brookhart [Fri, 17 Jul 2020 22:39:57 +0000 (15:39 -0700)]
[Relay][Dyn] Add dynamic reshape grad (#6080)
* add dynamic reshape grad
* fix lint
* fix unit tests, warning
Giuseppe Rossini [Fri, 17 Jul 2020 16:14:49 +0000 (17:14 +0100)]
Fixed point multiplication improvements for AArch64 (#5980)
* Fixed point multiplication improvements for AArch64
Change-Id: Ib3c10348d4c0eac11fa92b39cc6e792560e9eba4
* Fix python linting errors
Change-Id: I4cf5ac18aa24b39374b83805dcc8e1663e173909
* Fix doxygen errors
Change-Id: Ie3c861f8ead3f1ea5b30d5e9d7d94e222299d407
* Fix arm_cpu injective tests
Change-Id: I6ad9da61b61e6bd737627f26fba59767418c07cd
* Fix python linting errors - 2
Change-Id: Ic864a235aa5da5786393cbf6146dd815c121df5e
* Fix arm_cpu injective tests - 2
Change-Id: If9ca1cc3d947b1656c836c7f88de90470d92f979
* Redesign: introduce a qmuls (q-multiply and shift) general intrinsic
Change-Id: I1966fef9aee32eab50e4b984bbe81018488c8c02
* Fix python linting errors - 3
Change-Id: Ib87a19a8ee2d532954a7db1eb5793666e7aef366
* Addressing review comments
Change-Id: Ie82e75204e5a421d17660f381f3e31fc325cd26c
* Fixing test failures
Change-Id: I74cc675764cf8d260fe68a41e770b1ec7e84729a
* Renaming qmuls to q_multiply_shift
Change-Id: I5a8ed60ba855208040304fcdf6e1ea28061f06ad
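The "q-multiply and shift" idea named above can be illustrated with plain integers. This is only a sketch of the arithmetic, not the intrinsic added by this commit: the helper name `q_multiply_shift_sketch`, the meaning of `q` (fractional bits of the multiplier), the meaning of `s` (extra right shift), and the round-half-up rule are all assumptions made for the example.

```python
def q_multiply_shift_sketch(x: int, y: int, q: int, s: int) -> int:
    """Return round(x * y * 2**-(q + s)), rounding halves up.

    Hypothetical helper: `y` is a fixed-point multiplier with `q` fractional
    bits and `s` is an extra right shift applied after the multiplication.
    """
    total_shift = q + s
    product = x * y  # Python ints are unbounded, so no overflow handling here
    if total_shift <= 0:
        return product << -total_shift
    rounding = 1 << (total_shift - 1)  # 0.5 in units of the final result
    # >> floors (also for negative values), so adding 0.5 first rounds half up.
    return (product + rounding) >> total_shift


Q = 31
three_quarters = round(0.75 * (1 << Q))  # 0.75 encoded with 31 fractional bits
print(q_multiply_shift_sketch(100, three_quarters, Q, 0))  # 75
print(q_multiply_shift_sketch(100, three_quarters, Q, 1))  # 38 (37.5 rounds up)
```

A real kernel would also need to handle the width of the intermediate product, which this sketch sidesteps by using unbounded integers.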
Haichen Shen [Fri, 17 Jul 2020 16:13:19 +0000 (09:13 -0700)]
[Test] Add missing test for fast erf (#6058)
* add missing test for fast erf
* trigger ci
Tristan Konolige [Fri, 17 Jul 2020 14:51:12 +0000 (07:51 -0700)]
Fix LocalBuilder on macos with python 3.8. (#6083)
Python 3.8 changes the default way multiprocessing creates new processes
on macOS from forking to spawning. Spawning requires all objects to be
picklable. Nested functions and lambdas are not picklable, so this
commit fixes the one instance of nested functions in the codebase that
was causing issues.
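The pickling constraint described above can be checked without starting any processes. The sketch below is not the LocalBuilder code; `make_nested_worker` and `is_picklable` are hypothetical names used only to show why nested functions and lambdas cannot be handed to a spawned worker.

```python
import pickle


def make_nested_worker():
    def nested_worker(x):  # defined inside another function
        return x * 2
    return nested_worker


def is_picklable(obj):
    """Return True if `obj` survives pickle.dumps, which spawn relies on."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


print(is_picklable(len))                   # True: found by name in builtins
print(is_picklable(make_nested_worker()))  # False: a local object
print(is_picklable(lambda x: x * 2))       # False: lambdas have no importable name
```

Functions are pickled by reference to their qualified name, so the usual fix is to move the nested function to module level and pass any captured state as explicit arguments.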
Haichen Shen [Fri, 17 Jul 2020 14:48:38 +0000 (07:48 -0700)]
[Fix] Add missing expr visitor for any (#6082)
Krzysztof Parzyszek [Fri, 17 Jul 2020 02:24:06 +0000 (21:24 -0500)]
[TOPI] Fix the filter width parameter in depthwise_conv2d (#6081)
* [TOPI] Fix the filter width parameter in depthwise_conv2d
* Retrigger build
Co-authored-by: Venkat Rasagna Reddy Komatireddy <quic_rasagna@quicinc.com>
lixiaoquan [Thu, 16 Jul 2020 22:22:28 +0000 (06:22 +0800)]
Refine LSTMBlockCell to support dynamic rnn (#5963)
1. Refine conversion of `LSTMBlockCell`
1) Make its output follow the definition in TensorFlow
2) Avoid introducing variables which don't match any placeholder nodes in the TensorFlow graph
2. About change in test_forward_ptb
The state nodes of LSTMBlockCell in this PB file are actually Constant nodes.
TF can feed data to those Constant nodes but Relay can't, so the current conversion of LSTMBlockCell introduces extra variables to work around this.
However, this means the Relay IR doesn't match the original TF graph. This PR solves the issue by converting those state nodes into placeholders.
Lianmin Zheng [Thu, 16 Jul 2020 20:18:25 +0000 (13:18 -0700)]
[ARITH] Improve vector simplification for float operands (#6043)
Hua Jiang [Thu, 16 Jul 2020 19:06:43 +0000 (12:06 -0700)]
[VTA] Fix FSIM Compile Error. (#6070)
Issue:
When the VTA target is set to "sim", VTA compilation fails with the
error message "fatal error: vta/driver.h: No such file or directory".
Solution:
Set the VTA_HW include path correctly.
Yanming Wang [Thu, 16 Jul 2020 18:02:06 +0000 (11:02 -0700)]
[AutoTVM][BugFix] Fix variable name conflict with OpenCL keyword (#6048)
Co-authored-by: Yanming Wang <yanmwang@amazon.com>
Zhao Wu [Thu, 16 Jul 2020 17:30:08 +0000 (01:30 +0800)]
Remove unnecessary std::cout (#6072)
* Remove unnecessary std::cout
* Trigger CI
windclarion [Thu, 16 Jul 2020 10:42:08 +0000 (18:42 +0800)]
[RUNTIME][CRT] init TVMPackedFunc's name (#6044)
Otherwise, line 639 of src/runtime/crt/graph_runtime/graph_runtime.c
(TVMGraphRuntime_Run) prints garbled text.
Signed-off-by: windclarion <windclarion@gmail.com>
Krzysztof Parzyszek [Thu, 16 Jul 2020 02:22:56 +0000 (21:22 -0500)]
Fix error message in Buffer::vstore, NFC (#6056)
* Fix error message in Buffer::vstore, NFC
* Fix whitespace in comment as well
notoraptor [Thu, 16 Jul 2020 02:21:28 +0000 (22:21 -0400)]
Add operation scatter_add to relay, based on scatter implementation. (#6030)
Haichen Shen [Thu, 16 Jul 2020 01:34:21 +0000 (18:34 -0700)]
[Relay][Pass] Merge two consecutive reshape ops (#6052)
Zhi [Thu, 16 Jul 2020 01:04:37 +0000 (18:04 -0700)]
[BYOC][Optimization] Run accelerator specific optimizations (#6068)
* register and invoke optimization pipeline for external codegen
* add unit test
Zhao Wu [Wed, 15 Jul 2020 22:23:40 +0000 (06:23 +0800)]
[clflush] Enable x86 cpu cache flush (#5914)
Mahesh Ambule [Wed, 15 Jul 2020 20:24:24 +0000 (01:54 +0530)]
[TARGET] ONNX codegen (#5052)
* Relay to ONNX converter
* Relay to ONNX op test cases
* Relay to ONNX end to end model test cases
* Add test cases to jenkins
* CI CD fixes
* ONNX codegen
* ONNX codegen
* ONNX codegen
* onnx testcases
* ONNX codegen
* test onnx
* ONNX codegen
* shape calculation
* move onnx codegen to contrib/target
* review comments
* ONNX target use visitor
* onnx fixes
* lint fixes
* doc string changes
* review comments
* review comment fixes
* review comment
* pytest skip
* rename type to node type
* test
* Fix for constantshape, add exp, fix for metadatamodule
* Fix cpplint
* change error tol values
Zhao Wu [Wed, 15 Jul 2020 17:06:36 +0000 (01:06 +0800)]
[Doc] update frontend tutorials to new model based runtime (#6063)
Chenfan [Wed, 15 Jul 2020 07:24:19 +0000 (15:24 +0800)]
[Ansor][AutoTVM v2.0] Part 1: Rename namespace from auto_schedule to auto_scheduler (#6059)
* Rename namespace auto_schedule to auto_scheduler
* Update
* Lint fix
Zhao Wu [Wed, 15 Jul 2020 03:07:43 +0000 (11:07 +0800)]
[RUNTIME] Support module based interface runtime (#5753)
Andrew Reusch [Wed, 15 Jul 2020 01:11:56 +0000 (18:11 -0700)]
Build crttest and cpptest separately. (#6057)
* Build crttest and cpptest separately.
* Try to fix random CI crashing, likely caused by concurrent cmake execution.
* Revert to -j8
Chenfan [Wed, 15 Jul 2020 00:16:22 +0000 (08:16 +0800)]
[Ansor][AutoTVM v2.0] Part 0: Ansor minimum system for auto schedule generating (#5962)
* Code migration Start (#1)
* Init commit: Code migration Start
* Add loop_state.cc/h
* Add ComputeDAG basic test
* Split transform_step out & Update more UTs (#3)
* Split transform_step out
* Update GetProducers & GetConsumers
* Update UTs
* Add UT for CacheReadWrite & Some bug fix
* Add search_task, measure and serialization (#4)
* Add FollowSplit & FollowFusedSplit tests
* Update dag.InferBound & its UT
* Add search_task, measure and serialization
* Update Serialization UT
* Add MetaTileRewritePolicy (#5)
* Add feature
* Add cost_model, meta_tile_rewrite_policy
* Add MetaTileRewritePolicy basic UT
* Basic Python API for State (#6)
* Add Basic Python API for State
* Add UTs for State
* Add Python API: Measure & Task (#7)
* Update the return value of state operation
* Add task
* Copy measure.py & utils.py
* Fix LocalBuilder
* Fix LocalRunner
* Add ansor.auto_schedule() API; First AutoSchedule working version(#8)
* Add basic Python support for ansor.auto_schedule
* Update AutoSchedule API
* Bug fix for getting the attach point of a fused iter
* Update UT after infer bug fix
* Bug fix & Add python serialization API (#10)
* Delete C++ UT hack since Python is ready
* Add ndarray.non_empty
* Update Serialization python API
* Improve code style, python wrapper and test cases (#11)
* Update c++ code style and unit test
* Update python State wrapper and test cases
* fix unit tests
* Add RPCRunner & OpenCL/CUDA test (#12)
* Add RPCRunner & OpenCL search test
* Add CUDA search test
* Add RPCRunner test
* rebase to upstream/master
* Add Ansor basic tutorial (#13)
* Add basic tutorial
* migrate feature extraction (#14)
* Add XGBModel & RPCRunnerWarpper (#15)
* Add XGBModel & RPCRunnerWarpper
* Revert "Add Parallel Granularity Mutation"
* Migrate workload_registry.py (#16)
* add workload registry
* update
* update
* add task scheduler (#17)
* Add conv2d cuda tutorial with workload registry (#18)
* add tune_test.py (the old tune_wkl.py) (#19)
* add tune_test.py (the old tune_wkl.py)
* update
* fix measure
* fix for gpu
* Code refine for tune_test.py & Add a pre load callback (#20)
* Bug fix for tutorials
* Add PreLoadMeasuredStates
* Add search_callback support for task tuner
* Code refine for tune_test.py
* Update
* Update
* Update
* Update
* Bug fix
* Add python custom sketch rule (#21)
* Add custom sketch rule
* Bug fix
* Ansor Relay Integration (without layout rewrite) (#22)
* relay integration
* Add tune_op_subgraph.py & Some code clean for tune_network.py (#23)
* Add single op tune scripts
* Add tune subgraph support
* Merge all op & all subgraph to one file
* Rename file
* add explicit_unroll_max_extent (#25)
* Add Index simplification & API update (#26)
* Add vectorized cooperative_fetching test
* Update math simplify for vectorized CF
* File rename
* Update tune_network
* API update
* Update PreLoadMeasuredStates & Some bug fix (#27)
* Add a threading wrapper to fix the test bug
* Set default TVM_USE_AUTO_SCHEDULER to false
* Update PreLoadMeasuredStates callback
* Add tensorize step for loop_state (#31)
* Add tensorize step
* State python api update (#33)
* Start to update api
* Add compute_dag to state
* API update
* kernel layout rewrite (#28)
* kernel layout rewrite
* remove some hacks
* add defuse_ops pass and move kernel_layout_rewrite pass after fuse_ops pass
* set TVM_RELAY_DISABLE_BUILD_CACHE for task extraction and prepare_layout_rewrite
* [cache flush] port cache flush to ansor (#32)
* Improve relay integration (#34)
* tmp checkpoint
* Improve relay integration
* Improve relay integration
* Fix xgb error & Simplify dispatcher (#35)
* Rename "MetaTileRewritePolicy" to "SketchPolicy". (#36)
* Rename "MetaTileRewritePolicy" to "SketchPolicy".
* Add a new class for auto_unroll_max_step, storage_offset in StageNode
* fix tune_op_subgraph.py
* rebase
* Migrate all node::make to noderef's construct function (#37)
* Start to move xxxnode::make to noderef()
* Update
* Update
* Finish transform_step
* Finish compute dag & auto schedule
* Update
* Update
* Update
* Update
* Update
* Code refine
* Code refine
* Code refine
* Update
* Update
* Some lint fix & Recover the double constructor of tvm::PrimExpr (#39)
* lint fix
* clang-format-fix
* pylint fix
* Update
* Recover the double constructor of tvm::PrimExpr
* Fix pylint
* pylint fix
* pylint fix
* Add MutateComputeLocation and MutateParallel in evolutionary search (#40)
* Add MutateComputeLocation and MutateParallel in evolutionary search
* fix lint
* Improve loop state python API (stage_tensors -> stage_ops) (#41)
* improve loop state python API (stage_tensors -> stage_ops)
* fix
* ComputeDAG bug fix & Add Custom TensorCore Matmul Example (#42)
* Bug Fix
* Sample example of Custom TensorCore Matmul
* Revert Commits, Start to build minimum Ansor system
* Code clean for minimum Ansor system
* Bug fix & Delete AccessAnalyzer
* Delete attachmap & Code clean
* Doc update
Update statenode::stages from vector to Array
* Headfile update & Python doc update
* clang-format fix
* pylint fix
* Update
* Doc update
* Update
* Bug fix after code merge to the new master
* clang-format fix
* Update
* Update
* Update std::vector to Array; Update verbosity setting; Some comments addressed
* std::vector->Array & std::string->String
* Add init_state to ComputeDAG
* Update
* Update some unordered_map to Map
* clang-format fix
* Comments addressed
Delete ReplayAndInferBound
Delete ReplaySteps & InferBoundCommon
* Lint fix
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Update
* Rename ansor namespace to auto_schedule
* Update
* Rename ThreadPool to ParallelFor
* Add parallel_for
* Remove ThreadPool
* Update python/tvm/auto_schedule/auto_schedule.py
* trigger CI
Co-authored-by: Lianmin Zheng <lianminzheng@gmail.com>
Co-authored-by: Minmin Sun (孙敏敏) <minmin.smm@alibaba-inc.com>
Co-authored-by: Zhao Wu <zhaowu@apache.org>
Lily Orth-Smith [Tue, 14 Jul 2020 22:43:07 +0000 (15:43 -0700)]
[RELAY][DYN] Dynamic broadcast_to, zeros, ones (#6007)
* Dynamic BroadcastTo
* fixed lint!
* add test_one_hot() back
* add one_hot registration back
* Dynamic BroadcastTo
* fixed lint!
* add one_hot registration back
* fixed lint.. again
* fixed lint
* lint
* responding to comments
* skipping cuda in dynamic test
* skipping cuda in dynamic test
* fixed i386 test and GPU test
* lint
* starting ones and zeros
* fixed dynamic ones and zeros, wrote dyn ones and zeros test
* added static version of zeros, ones and added a check for size of types to static BroadCastToRel
* added dynamic to static pass for zeros and ones, dynamic test and dynamic to static test
* removed op_str in dyn to static pass test
* fixed lint
* fix lint hopefully
* removed import const
* removed import that was actually used
* copy all attributes from broadcast_to, ones, zeros, full
* responding to comments
* fixed build error
* finishing rebase
* fix lint
Co-authored-by: Lily Orth-Smith <lorthsmith@Lilys-MacBook-Pro.local>
Krzysztof Parzyszek [Tue, 14 Jul 2020 21:10:44 +0000 (16:10 -0500)]
[Hexagon] Remove use of designated initializers from hexagon_module.cc (#6055)
They are an extension, not yet a part of the C++ standard.
MORITA Kazutaka [Tue, 14 Jul 2020 15:53:56 +0000 (00:53 +0900)]
[BYOC][COREML] Handle one symbol for each runtime (#5989)
* [BYOC][COREML] Handle one symbol for each runtime
* LOG -> DLOG
Jinyu Xie [Tue, 14 Jul 2020 06:51:54 +0000 (02:51 -0400)]
Fix pytorch frontend prim::Constant issue (#6051)
Matthew Brookhart [Tue, 14 Jul 2020 03:42:54 +0000 (20:42 -0700)]
Refactor to expose MakeOp functions to C++ (#6047)
* Initial Refactor
* add templated nn Make* functions
* fix build typo
* inline functions, fix unit tests
Liangfu Chen [Tue, 14 Jul 2020 03:16:49 +0000 (11:16 +0800)]
[IR] Fix a primitive check error (#5991)
* fix primitive check error
* assuming every Op has Type defined
* CHECK_NE -> CHECK
Co-authored-by: Liangfu Chen <liangfc@amazon.com>
Giuseppe Rossini [Tue, 14 Jul 2020 03:15:42 +0000 (04:15 +0100)]
Fix conv2d_gemm after target structure update (#6037)
After target structure changed in this RFC:
https://discuss.tvm.ai/t/rfc-tvm-target-specification/6844/42
The conv2d optimizations were broken for the following reasons:
- "target" is now called mtriple (this changes how we test if the
architecture is AArch64)
- when we invoke "clang.create_llvm" we still need to specify the
"--target" option (set to aarch64-linux-gnu)
This submission reverts those changes
Change-Id: I04c597b91ca5800ddf4471255e2a358c60bc048e
Trevor Morris [Tue, 14 Jul 2020 01:12:28 +0000 (18:12 -0700)]
[Frontend][TFLite] Fix fully_connected converter when batch size is not 1 (#6038)
* Fix fully_connected when batched
* Remove unused variable
Dmitriy Smirnov [Mon, 13 Jul 2020 23:08:56 +0000 (00:08 +0100)]
Add support for tflite arg_min and arg_max (#5992)
* [Relay][Frontend][TFLite] Add parser support for arg_min_max
* this implementation supports only the case when the axis is a scalar
* tflite 1.13 removes all dims of size 1, Relay doesn't do this
* WARNING: every newer version of tflite > 1.13 needs keepdims=TRUE
* Migrated to tflite 2.1.0
keepdims set to False and added some checks
Note: the unit tests emitted the following warning:
/workspace/src/te/schedule/bound.cc:119: not in feed graph consumer = compute(T_multiply_red_temp, 0x53f5050)
* linter
* Removed quantized argmin
Removed quantized argmin due to inability to provide a proper test case
* added negative ranges
* re-trigger CI
Co-authored-by: Ina_Dobreva <Ina.Dobreva@arm.com>
Yi-Hsiang (Sean) Lai [Mon, 13 Jul 2020 22:39:10 +0000 (18:39 -0400)]
[Relay] Add pass for getting calibration data from a relay module (#5997)
* add simple pass to extract outputs
* complete pass that collects all function inputs/outputs
* add analysis pass for collecting outputs
* reorganize the files
* add the first test
* update test with tuples
* clean up Python code
* merge with upstream
* clean up transform.py
* add comments for cpp files
* fix lint issues
* update submodules
* modify files according to the review
* fix style and typo
* fix lint error
* add checks for repeated function calls
* fix lint error
* merge review comments
* small simplification
* revise the code according to the review comments
* add username in TODO
* use IRModule directly
* use better APIs according to the review
* apply comments from the reviewer
* retrigger ci
Krzysztof Parzyszek [Mon, 13 Jul 2020 22:29:52 +0000 (17:29 -0500)]
[LLVM] Create TBAA information based on the underlying buffer type (#6046)
Currently, the TBAA information is based on the access type, i.e.
the data type from the load or store instruction. When the same
memory area is accessed with different types, the corresponding
load/store instructions may end up not being aliased to each other.
This could lead to incorrect code being generated.
An example of when such a situation can occur is when two different
buffer_decl's are created for the same buffer:
ba = buffer_decl(... dtype = 'int16' ...)
bb = buffer_decl(data = ba.data, dtype = 'int32x32' ...)
Then instructions
ba[x] = 0
... = bb[x]
may be reordered in the final code due to the alias info indicating
that they are not aliased.
Lianmin Zheng [Mon, 13 Jul 2020 17:46:27 +0000 (10:46 -0700)]
[CODEGEN] Fix code generation bugs for C/CUDA & Improve VerifyGPUCode pass (#6041)
Josh Fromm [Sun, 12 Jul 2020 12:26:02 +0000 (05:26 -0700)]
[Relay][Frontend][Onnx] GRU Layer Support (#6020)
* GRU debugging and testing added to onnx frontend.
* All tests working and code formatted.
* Fix lint issues.
* Add a test case and changed RNN argument parsing.
* Small refactor.
Andrew Reusch [Sun, 12 Jul 2020 09:28:31 +0000 (02:28 -0700)]
µTVM CRT modifications for on-device RPC server (#5921)
* Reorganize CRT into parts, public API, and add standalone build.
* Create a make-based build in src/runtime/crt. This is intended to
be built in build/standalone_crt (generated by running ninja
standalone_crt in build/). Its job is to build CRT without
depending on headers not explicitly allowed in CRT.
* Create a "public-facing" CRT API targeted to firmware running
alongside CRT in include/tvm/runtime/crt. Developers who are
integrating the CRT are the target of this API.
* Reorganize CRT internally into common/ and graph_runtime/
pieces. Build each piece as a separate statically-linked library.
* Slim down TVMGraphRuntime public-facing API to just the functions
that are used externally.
* Updates to apps/bundle_deploy to make this work.
* Add TVMFuncRegistry, CRT test infrastructure, and tests.
* Also add error_codes.h, a file containing error codes returned by CRT.
* Add TVMErrorf()
* [API_CHANGE] Integrate func registry into CRT.
* NOTE: This changes the default API for functions exposed under the
CRT by the TVMFuncCall API. `resource_handle` is now always given
as a new 6th parameter.
* `resource_handle` is NULL when invoked on a global function and a
pointer to the module owning the function otherwise.
* Generalize arena-based memory manager.
* lint
* Fix git-clang-format arg parsing
* add apache header
* add mutable func registry tests
* git-clang-format
* fix more lint
* Move memory_test to crttests.
* fix tests
* checkpoint
* checkpoint
* bundle_deploy demo_static works
* rm debug printf
* git-clang-format
* fix lint
* add asf header
* pylint
* update build configs for jenkins
* make regression compiler happy
* fix build errors in regression GCC
* address comments
* git-clang-format
* fix for 32-bit cpp regression
* fix incorrect use of memcpy and tests for 32-bit
* clang-format
Krzysztof Parzyszek [Fri, 10 Jul 2020 22:36:17 +0000 (17:36 -0500)]
[LLVM/CPU] Terminate basic block after "ret" instruction (#6036)
* [LLVM/CPU] Terminate basic block after "ret" instruction
"Ret" is a terminator in LLVM IR and there should be no instructions
in the basic block following it. When generating a "ret", end the
current block and start a new one.
Matthew Brookhart [Fri, 10 Jul 2020 18:18:34 +0000 (11:18 -0700)]
[Relay][Dyn] Dynamic TopK Op (#6008)
* add dynamic topk op
* add topk to dynamic_to_static pass
* fix TF test
* fix pylint
Zhi [Fri, 10 Jul 2020 18:03:23 +0000 (11:03 -0700)]
[REFACTOR][RELAY] Move invoke_tvm_op and shape_func to vm dialect (#5958)
* [REFACTOR][RELAY] Move invoke_tvm_op and shape_func to vm dialect
* address comments
Giuseppe Rossini [Fri, 10 Jul 2020 17:58:22 +0000 (18:58 +0100)]
[Bug fix] Fix in arm_cpu/conv2d_alter_op for NHWC quantized (#6027)
* [Bug fix] Fix in arm_cpu/conv2d_alter_op for NHWC quantized
Few minor typos to be fixed in topi/arm_cpu/conv2d_alter_op.py for the
NHWC quantized route:
- Kernel shape was misread (CO, IC, KH, KW) -> (KH, KW, IC, OC)
- Pad along the K dimension was misspelled: pad_k -> pad_K
- Workload name was wrong: "conv2d_NHWC_int8_without_tranform.arm_cpu"
-> "conv2d_NHWC_quantized_without_transform.arm_cpu"
This submission fixes those errors and adds a further test for conv2d_alter_op.py
Change-Id: I0622df05f1d4d15311946f6e75f1840a34815a5b
* Move -target to -mtriple
Change-Id: Ieff80c774e8ab0fa7f48d83d50a79f3a62e8fe13
* Retrigger tests
Change-Id: I5541bed54eacc5063bf4a4fda725209cc23f621e
Krzysztof Parzyszek [Fri, 10 Jul 2020 17:57:05 +0000 (12:57 -0500)]
Add creation of Hexagon device in RPC client (#6035)
lhutton1 [Fri, 10 Jul 2020 16:43:25 +0000 (17:43 +0100)]
[CI][ACL] Enable ACL installation in ci_cpu docker container (#5916)
This patch adds a cross-compiled ACL build to the ci_cpu dockerfile used for CI.
Change-Id: I66e1521ab553306bc7367b65acc0363e750f0211
Cody Yu [Fri, 10 Jul 2020 11:38:46 +0000 (04:38 -0700)]
[BYOC] JSON Runtime with DNNL End-to-End Flow (#5919)
* json runtime
* json dnnl WIP
* fix ArrayNode usages
* Support composite functions
* DNNL json runtime: conv2d/add/relu/dense/bn
* add a more complex example
* fix bias memory issue
* rebase to upstream
* merge to metadata module, remove the unused driver
* handle constant
* support composite functions
* support DNNL constant
* clean up
* Simplify dnnl user code
* GetDataSize
* fix dense bug
* improve cmake
* zero copy
* add unit test
* move json to contrib/json
* fix cmake
* lint
* max_digits10 for fp serialization
* only keep base getfunction
* fix lint
* zero copy for all data entries
* address comments
* enable ci
* address comment; fix bug
* address comment
Co-authored-by: Zhi Chen <chzhi@amazon.com>
Tianqi Chen [Fri, 10 Jul 2020 05:28:15 +0000 (22:28 -0700)]
[CI] Update ci-cpu to the latest (#6031)
Tianqi Chen [Fri, 10 Jul 2020 05:27:50 +0000 (22:27 -0700)]
[DOCKER] Pin keras version (#6032)
windclarion [Thu, 9 Jul 2020 20:12:42 +0000 (04:12 +0800)]
[TARGET] each option of target str should only contain one '=' (#5988)
src/target/target_id.cc ParseAttrsFromRawString L222:
if ((pos = FindUniqueSubstr(s, "=")) != -1)
requires that each option contain only one '='
Signed-off-by: windclarion <windclarion@gmail.com>
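The "only one '='" rule can be sketched in Python. This is not the C++ code in target_id.cc: `find_unique_substr` mirrors only the behaviour the snippet above implies (a position when the substring occurs exactly once, -1 otherwise), and `split_option` is a hypothetical helper.

```python
def find_unique_substr(s: str, sub: str) -> int:
    """Return the index of `sub` in `s` if it occurs exactly once, else -1."""
    first = s.find(sub)
    if first == -1:
        return -1
    if s.find(sub, first + 1) != -1:  # a second occurrence makes it ambiguous
        return -1
    return first


def split_option(option: str):
    """Split "key=value" into (key, value); return None unless '=' is unique."""
    pos = find_unique_substr(option, "=")
    if pos == -1:
        return None
    return option[:pos], option[pos + 1:]


print(split_option("-mcpu=cortex-a53"))  # ('-mcpu', 'cortex-a53')
print(split_option("-attr=a=b"))         # None: two '=' signs
```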
Zheng Jiang [Thu, 9 Jul 2020 04:54:49 +0000 (12:54 +0800)]
fix typos in comments and relay tutorial (#5999)
* [TypoFix]fix typos in comments and relay tutorial
* retrigger
windclarion [Thu, 9 Jul 2020 04:52:24 +0000 (12:52 +0800)]
[RUNTIME] if a param not in input, we still consume its data (#5990)
so the read pointer of stream can move forward
Signed-off-by: windclarion <windclarion@gmail.com>
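Why skipped parameters must still be read can be shown with a toy stream. The record layout here (a 4-byte little-endian length followed by the payload) and the helper names are assumptions for the sketch, not the runtime's actual parameter format.

```python
import io
import struct


def write_params(payloads):
    """Serialize payloads as length-prefixed records (hypothetical format)."""
    buf = io.BytesIO()
    for payload in payloads:
        buf.write(struct.pack("<I", len(payload)))
        buf.write(payload)
    return buf.getvalue()


def load_params(blob, names, wanted):
    """Load only `wanted` names, but read every record to stay aligned."""
    stream = io.BytesIO(blob)
    loaded = {}
    for name in names:
        (size,) = struct.unpack("<I", stream.read(4))
        payload = stream.read(size)  # always consume, even if unused
        if name in wanted:
            loaded[name] = payload
    return loaded


blob = write_params([b"AA", b"BBB", b"C"])
print(load_params(blob, ["a", "b", "c"], {"a", "c"}))  # {'a': b'AA', 'c': b'C'}
```

If the read for "b" were skipped, the loader would interpret the bytes of "b" as the length header of "c".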
Siju Samuel [Thu, 9 Jul 2020 01:11:26 +0000 (06:41 +0530)]
[PYTORCH]Gather op support added (#6013)
* [PYTORCH]Gather op support added
* retrigger
HUAN-PING SU [Wed, 8 Jul 2020 20:30:41 +0000 (04:30 +0800)]
Remove duplicate line (#6017)
Jared Roesch [Wed, 8 Jul 2020 20:04:42 +0000 (13:04 -0700)]
[Frontend][Relay] Add Parser 2.0 (#5932)
lhutton1 [Wed, 8 Jul 2020 17:14:58 +0000 (18:14 +0100)]
Option to specify alternate directory to output build to (#6016)
This is useful when you would like to manage 2 separate builds in the same tvm tree. You can specify a build directory when using make by adding OUTDIR=alternate-build-dir.
Change-Id: I3efed1135343f3903007115ce5dd683ef7bd9e8c
Yizhi Liu [Wed, 8 Jul 2020 15:35:28 +0000 (08:35 -0700)]
[TEST][FLAKY] test_arith_solve_linear_inequality.py::test_multi_equal (#6014)
Haibin Lin [Wed, 8 Jul 2020 05:37:10 +0000 (22:37 -0700)]
[Frontend][MXNet] MXNet frontend support for AMP cast op (#5976)
* amp_cast
* fix test
* more tests
* test more ctxs
* fix doc
* fix typo
* address CR comment
* fix lint
* revert doc change
* Revert "revert doc change"
This reverts commit
a410dd5569730ac81af67ddb333c3afbe97eddd7.
* fix doc
* Update relay_pass_infra.rst
Co-authored-by: Ubuntu <ubuntu@ip-172-31-42-138.ec2.internal>
Tianqi Chen [Wed, 8 Jul 2020 03:55:44 +0000 (20:55 -0700)]
[VTA] Move compiler related registry items to vta/build_module.py (#6012)
Krzysztof Parzyszek [Wed, 8 Jul 2020 00:34:10 +0000 (19:34 -0500)]
Cache object refs in loop partitioner instead of object pointers (#6004)
* Cache object refs in loop partitioner instead of object pointers
Loop partitioner modifies the IR, which can cause TIR objects to
become dead and be destroyed. To avoid working on junk data cache
object references instead of object pointers.
* Fix format/lint errors
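A loose Python analogy for the reference-versus-pointer point above: a cache that holds a strong reference keeps the object alive across a rewrite, while a cache that holds only a non-owning handle does not. `Node` and `survives_rewrite` are hypothetical stand-ins; a weak reference plays the role of the raw pointer and none of this is TVM code.

```python
import gc
import weakref


class Node:
    """Stand-in for an IR node; not a TVM class."""

    def __init__(self, name):
        self.name = name


def survives_rewrite(cache_strong_ref: bool) -> bool:
    """Report whether a cached node is still alive after its owner drops it."""
    node = Node("loop")
    probe = weakref.ref(node)  # non-owning handle, like a raw pointer
    cache = {"loop": node if cache_strong_ref else probe}
    del node      # the rewrite drops the last outside reference
    gc.collect()  # make collection explicit on non-refcounting interpreters
    return probe() is not None


print(survives_rewrite(cache_strong_ref=True))   # True: the cache owns the node
print(survives_rewrite(cache_strong_ref=False))  # False: nothing keeps it alive
```

Unlike a dangling C++ pointer, a dead weak reference can at least be detected, which is where the analogy ends.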
Krzysztof Parzyszek [Wed, 8 Jul 2020 00:31:59 +0000 (19:31 -0500)]
[LLVM] Auto-convert shuffle with single index to "extract element" (#6006)
* [LLVM] Auto-convert shuffle with single index to "extract element"
Data types with a single lane are treated as scalars in TVM. On the
other hand, in LLVM there is a difference between a scalar type and
a vector type with a single lane. Because of that, a shuffle with
a single index is equivalent to extracting an element in TIR, but
not in the generated LLVM IR. This patch changes the LLVM codegen
for shuffle to auto-convert single-lane vectors to scalars.
* Try another build
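The single-index rule in the shuffle commit above can be mimicked on plain lists. This is a sketch of the intended semantics only, with a hypothetical `shuffle` helper; it is not the LLVM codegen.

```python
def shuffle(vector, indices):
    """Pick elements by index; a single index yields a scalar, not a 1-vector."""
    picked = [vector[i] for i in indices]
    return picked[0] if len(picked) == 1 else picked


print(shuffle([10, 20, 30], [2, 0]))  # [30, 10]
print(shuffle([10, 20, 30], [2]))     # 30
```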
Jared Roesch [Wed, 8 Jul 2020 00:15:13 +0000 (17:15 -0700)]
Fix what looks like bizarre copy-paste issue (#6010)
Tianqi Chen [Tue, 7 Jul 2020 19:53:51 +0000 (12:53 -0700)]
[DOCKER] Only pass pythonpath for ci images (#6005)
Matthew Brookhart [Tue, 7 Jul 2020 18:50:35 +0000 (11:50 -0700)]
Dynamic Tile Op (#5983)
* first working dynamic tile passes first test
* add dyn tile to dynamic_to_static
* fix cpplint
* respond to review comments. Thanks @siju-samuel
* make dynamic tile compatible with numpy API
Christian Clauss [Tue, 7 Jul 2020 16:10:24 +0000 (18:10 +0200)]
Undefined names: import os for line 324 & import re for line 308 (#6003)
Christian Clauss [Tue, 7 Jul 2020 16:10:10 +0000 (18:10 +0200)]
Update main.yml (#6002)
Lianmin Zheng [Tue, 7 Jul 2020 15:18:31 +0000 (08:18 -0700)]
Fix tune_relay_cuda.py (#6001)
Yizhi Liu [Mon, 6 Jul 2020 17:04:42 +0000 (10:04 -0700)]
[Arith] Inequalities solver (#5618)
Josh Fromm [Mon, 6 Jul 2020 02:38:31 +0000 (19:38 -0700)]
[Relay][Frontend][Onnx] Small bug fix for Conv1D imports. (#5995)
* Fix autopad bug in onnx importer for conv1d.
* Fix output shape in test.
* Undo commented out lines oops.
Junru Shao [Sun, 5 Jul 2020 16:52:08 +0000 (09:52 -0700)]
[Target] Use TargetNode::attrs for Target serialization (#5993)
Animesh Jain [Fri, 3 Jul 2020 01:13:56 +0000 (18:13 -0700)]
[TFLite] QNN support for TFLite 2.1.0 quantized models (#5848)
* [TFLite] TFLite 2.x parser quantization support.
* Address comments. Fix a bug for depthwise conv
* Added tests for relu, conv, quantize. Address comments.
* Using web-data. Minor refactoring.
* Removing TF hub package
* Trigger CI.
* Handle TFLite input layer naming.
* Addressing reviews.
* Retrigger CI.
Krzysztof Parzyszek [Fri, 3 Jul 2020 00:04:22 +0000 (19:04 -0500)]
[LLVM] VectorType::get with two parameters is deprecated in LLVM 11+ (#5984)
In LLVM 11+ the distinction between fixed and scalable vector types
has become more explicit. Before the introduction of scalable vector
types VectorType::get(e,n) created what is now a fixed vector type.
With the addition of scalable types, it is recommended to use
FixedVectorType and ScalableVectorType classes directly. Alternatively,
there is a VectorType::get that accepts a 3rd parameter indicating
whether the type should be fixed or scalable.
Using the older VectorType::get that implicitly assumes the fixed type
is deprecated and LLVM now generates a warning.
Change calls to VectorType::get to FixedVectorType::get to avoid
compilation warnings.
Josh Fromm [Thu, 2 Jul 2020 21:19:41 +0000 (14:19 -0700)]
[Tutorial] Demo showing how to run a pruned 🤗 model. (#5975)
Junru Shao [Thu, 2 Jul 2020 19:23:57 +0000 (12:23 -0700)]
[Target] Migrate data structure of TargetNode (#5960)
Krzysztof Parzyszek [Thu, 2 Jul 2020 19:19:16 +0000 (14:19 -0500)]
[LLVM] Remove redundant function CreateBufferVecPtr (#5982)
The functions CreateBufferPtr and CreateBufferVecPtr do the exact
same thing, so there is no need for both of them to exist. The
latter is only used in one place, which further suggests that the
distinction is unnecessary.
Leslie-Fang [Thu, 2 Jul 2020 14:58:46 +0000 (22:58 +0800)]
fix tvm relay testing tf.py typo error (#5977)
Lianmin Zheng [Thu, 2 Jul 2020 05:59:21 +0000 (22:59 -0700)]
[TOPI] Fix x86 conv2d template when tuning with unpacked layout (#5938)
* fix x86 conv2d and conv2d_transpose template
* address comments
Trevor Morris [Thu, 2 Jul 2020 02:14:33 +0000 (19:14 -0700)]
[Relay/TOPI][OP] Add meshgrid op in Relay, TOPI, Pytorch frontend (#5961)
* Add meshgrid op with pytorch importer
* Fix c++ lint
* Fix pylint
* Meshgrid: add scalar test for pytorch, add topi python wrapper
* Add indexing mode attr.
* Add MeshgridAttrs python binding
* c++ lint
Lianmin Zheng [Wed, 1 Jul 2020 19:17:54 +0000 (12:17 -0700)]
[RELAY] Add resnet-3d & Update network definitions for NHWC layout (#5945)
Matthew Brookhart [Wed, 1 Jul 2020 18:39:21 +0000 (11:39 -0700)]
[DYNAMIC] Add Dynamic reshape to a dynamic namespace and add DynamicToStatic Pass (#5826)
* Dynamic reshape passing tests
* Add Dynamic to Static Pass
* rename test file to prevent pytest conflicts
* fix clang build
* add nested dynamic shape test
* remove cuda tests until VM supports dynamic shapes
* rename namespace from dynamic to dyn
* fix lint
* fix lint again
* Remove incorrect doc strings
* remove dynamic behavior from standard reshape
* fix some tests
* merge dynamic and static interfaces in python
* fix missing import
* missed a reference to relay.dyn.reshape
* fix vta example
* respond to review comments
Trevor Morris [Wed, 1 Jul 2020 15:04:15 +0000 (08:04 -0700)]
Add MXNet parser for box_decode (#5967)
Andrew Reusch [Wed, 1 Jul 2020 15:03:59 +0000 (08:03 -0700)]
Improve docker/bash.sh to handle git worktrees (#5970)
* improve error code when git ls-files fails
* fix docker/bash to handle git worktrees
Krzysztof Parzyszek [Tue, 30 Jun 2020 22:25:33 +0000 (17:25 -0500)]
Print right number of parentheses for LoadNode (#5965)
Stop printing the unnecessary ')' after each LoadNode that didn't
have a matching '('.
Krzysztof Parzyszek [Tue, 30 Jun 2020 17:58:58 +0000 (12:58 -0500)]
Raise an exception when extern function does not return Stmt (#5964)
The function for tvm.te.extern should return either PrimExpr or Stmt,
however there is no check if it actually does so. If it does not, the
result may be a segmentation fault later on. Catch this case early on,
so an informative message can be shown.
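The early check described above amounts to validating the callback's return type before it is used. In this sketch `PrimExpr`, `Stmt`, and `check_extern_result` are empty placeholders defined locally, not the TVM classes or the actual check added by the commit.

```python
class PrimExpr:
    """Placeholder for an expression node."""


class Stmt:
    """Placeholder for a statement node."""


def check_extern_result(body):
    """Fail early with a readable message instead of crashing later."""
    if not isinstance(body, (PrimExpr, Stmt)):
        raise ValueError(
            "extern function should return PrimExpr or Stmt, "
            f"but got {type(body).__name__}"
        )
    return body


check_extern_result(Stmt())  # accepted
try:
    check_extern_result(None)  # e.g. the callback forgot its return statement
except ValueError as err:
    print(err)
```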
Giuseppe Rossini [Tue, 30 Jun 2020 15:49:46 +0000 (16:49 +0100)]
Fix small typo in nn.conv2d_gemm_weight_transform (#5925)
* Fix small typo in nn.conv2d_gemm_weight_transform
Change-Id: I7844d898ebf82592f78f478982262ef95f83cc3e
* Add TOPI conv2d_gemm unit tests
Change-Id: I9ed82a68acffcf0dd9720781f8be4aada9d8e6e4
Thomas Viehmann [Tue, 30 Jun 2020 15:48:44 +0000 (17:48 +0200)]
Make first order gradient graphs more efficient (#5959)
Previously, nodes were visited as often as they were used, and each time a
derivative was computed. Only at the leaves were the contributions of
everything added. This patch changes this to add at any node that is
used several times.
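To illustrate the accumulation strategy, here is a minimal reverse-mode sketch over a tiny expression DAG. It is not the Relay implementation: the Node class, the var/add/mul helpers and the visit counter are all hypothetical names made up for this example. The point it shows is that walking the graph in reverse topological order lets every node receive the sum of its users' contributions before it is differentiated, so each node is processed once.

```python
from collections import defaultdict


class Node:
    """A node in a tiny expression DAG (hypothetical, for illustration)."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.value = value


def var(v):
    return Node("var", value=v)


def add(a, b):
    return Node("add", (a, b), a.value + b.value)


def mul(a, b):
    return Node("mul", (a, b), a.value * b.value)


def topo_order(root):
    """Return the nodes reachable from root, inputs before users."""
    order, seen = [], set()

    def visit(n):
        if id(n) in seen:
            return
        seen.add(id(n))
        for i in n.inputs:
            visit(i)
        order.append(n)

    visit(root)
    return order


def backward(root):
    """Return (grads, visits); grads maps id(node) to d(root)/d(node)."""
    grads = defaultdict(float)
    grads[id(root)] = 1.0
    visits = 0
    for n in reversed(topo_order(root)):
        visits += 1
        g = grads[id(n)]  # contributions of all users are already summed
        if n.op == "add":
            a, b = n.inputs
            grads[id(a)] += g
            grads[id(b)] += g
        elif n.op == "mul":
            a, b = n.inputs
            grads[id(a)] += g * b.value
            grads[id(b)] += g * a.value
    return grads, visits


# z = (x * x) + (x * x), with the product y shared between both uses.
x = var(3.0)
y = mul(x, x)
z = add(y, y)
grads, visits = backward(z)
# grads[id(x)] is 12.0 and visits is 3: y is differentiated once, not twice.
```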
abergeron [Tue, 30 Jun 2020 07:05:43 +0000 (03:05 -0400)]
Fix the meaning of conv{1,2}d_transpose output_padding parameter. (#5758)
* Add output_padding to generic
* Add output_padding to the reference impl
* Add output_padding to arm_cpu
* Add output_padding to the test
* Add output_padding for cuda
* Add output_padding for x86
* Make use of the new output_padding argument in Relay
* Adjust conv2d_transpose Relay test
* Fix lint errors
* Fix the VTA declaration of conv2d_transpose
* support for output padding in conv2d transpose
* some output padding will break IR pass
* Fix new conv2d_transpose test
* Update tophub
* Fix conv1d output_padding too.
* Fix the conv1d_transpose reference function.
* Fix the cuda impl
* fix the topi test for conv1d
* format
* Add tests for conv1d_transpose output_padding and some check that the values are valid.
* Add check in the implementations
* Add checks to the implementations of conv2d
* Make use of the output_padding argument from topi in relay.
* Fix relay tests asking for invalid output_padding
* Fix line length
* Fix vta tests
* Update tophub references
* Trigger CI
Co-authored-by: Thierry Moreau <tmoreau@octoml.ai>
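For reference, the role of output_padding can be shown with the standard output-size formula for a transposed convolution. The sketch below assumes dilation 1 and symmetric padding; the function name is hypothetical, and the validity check (output_padding smaller than stride) is an assumption about what the checks added in this PR enforce.

```python
def conv_transpose_out_size(in_size, kernel, stride, padding, output_padding=0):
    """Output length along one spatial axis of a transposed convolution.

    Assumes dilation 1 and the same padding on both sides.
    """
    # Assumed validity rule: extra output rows/columns must stay below stride.
    if not 0 <= output_padding < stride:
        raise ValueError("output_padding must satisfy 0 <= output_padding < stride")
    return (in_size - 1) * stride - 2 * padding + kernel + output_padding


# With stride 2, a forward convolution maps inputs of length 7 and 8 to the
# same output length 4, so the transpose needs output_padding to pick one.
size_default = conv_transpose_out_size(4, kernel=3, stride=2, padding=1)
size_padded = conv_transpose_out_size(4, kernel=3, stride=2, padding=1,
                                      output_padding=1)
# size_default is 7, size_padded is 8
```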
Thomas Viehmann [Tue, 30 Jun 2020 03:35:36 +0000 (05:35 +0200)]
Amendments for gradients (#5941)
* Amendments for gradients
- We fix the dtype handling of consts in generated gradients.
- We add a collapse_sum_to instruction mirroring the collapse_sum_like.
While for general definitions (potentially dynamic shapes),
collapse_sum_like is the first choice, when moving to static,
using collapse_sum_to will greatly simplify the graph.
(This simplification is not part of the PR.)
* Fix Broadcast rel description in comment
Thank you, @MarisaKirisame
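The difference between the two collapse operations can be sketched without any library. The version below is an illustration only, working on row-major flat lists; both function signatures are hypothetical and do not match the Relay operators. collapse_sum_to takes the target shape directly, while collapse_sum_like reads it from a second tensor, which is why the former is simpler once shapes are static.

```python
from itertools import product


def collapse_sum_to(data, shape, target_shape):
    """Sum a row-major flat list of `shape` down to `target_shape`.

    target_shape must be broadcastable to shape (right-aligned rule).
    """
    pad = len(shape) - len(target_shape)
    if pad < 0:
        raise ValueError("target_shape has more dimensions than shape")
    padded = (1,) * pad + tuple(target_shape)
    for s, t in zip(shape, padded):
        if t != 1 and t != s:
            raise ValueError("target_shape is not broadcastable to shape")
    # Row-major strides of the target; broadcast axes get stride 0 so that
    # every index along them lands on the same output element.
    strides, size = [], 1
    for t in reversed(padded):
        strides.append(0 if t == 1 else size)
        size *= t
    strides.reverse()
    out = [0] * size
    for flat, idx in enumerate(product(*(range(s) for s in shape))):
        out[sum(i * st for i, st in zip(idx, strides))] += data[flat]
    return out


def collapse_sum_like(data, shape, like_shape):
    """Same reduction, with the target shape taken from another tensor."""
    return collapse_sum_to(data, shape, like_shape)


grad = [1, 2, 3, 4, 5, 6]                     # shape (2, 3)
to_row = collapse_sum_to(grad, (2, 3), (3,))  # [5, 7, 9]
to_col = collapse_sum_to(grad, (2, 3), (2, 1))  # [6, 15]
```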
Thomas Viehmann [Tue, 30 Jun 2020 03:34:20 +0000 (05:34 +0200)]
[RELAY][GRAD] handle Tuple/TupleGetItem in first order gradient (#5946)
* handle Tuple/TupleGetItem in first order gradient
* Unify MultiOnes/MultiZeros.
Leon Wang [Tue, 30 Jun 2020 03:28:30 +0000 (11:28 +0800)]
Fix some typo errors in license header (#5956)
Signed-off-by: leonwanghui <wanghui71leon@gmail.com>
Trevor Morris [Tue, 30 Jun 2020 00:55:22 +0000 (17:55 -0700)]
[OpenCL] Fix OpenCL get_valid_counts errors due to intrinsic atomic_add (#5857)
* [OpenCL] Fix atomic add used by get_valid_counts
* Rename l -> load, add flag to enable atomics
* OpenCL doesn't do data rearrangement
Tianqi Chen [Mon, 29 Jun 2020 15:25:46 +0000 (08:25 -0700)]
[TIR][ANALYSIS] Refine side effect analysis. (#5954)
Yong Wu [Mon, 29 Jun 2020 06:18:38 +0000 (14:18 +0800)]
[Relay] symbolic max_output_size (#5844)
* symbolic max_output_size
* pylint
* fix ci
Yanming Wang [Sun, 28 Jun 2020 23:10:19 +0000 (23:10 +0000)]
[BUGFIX] Add cuda 11 to contrib.nvcc.find_libdevice_path() (#5902)
Tianqi Chen [Sun, 28 Jun 2020 23:02:06 +0000 (16:02 -0700)]
[REFACTOR][TIR][API-Change] Range/IntSet API style consistency. (#5953)
- Range::make_by_min_extent -> Range::FromMinExtent
- Update the APIs in IntSet to use CamelCase
Zhi [Sun, 28 Jun 2020 17:05:50 +0000 (10:05 -0700)]
[RELAY][VM] Add shape_of instruction (#5855)
Meteorix [Sun, 28 Jun 2020 16:28:33 +0000 (00:28 +0800)]
add rm xla attributes in tf docs (#5950)
Meteorix [Sun, 28 Jun 2020 16:28:15 +0000 (00:28 +0800)]
raise right error in tensorflow split op (#5951)
Tianqi Chen [Sun, 28 Jun 2020 16:22:11 +0000 (09:22 -0700)]
[TIR] Improve Let/LetStmt support. (#5949)
Let/LetStmt are useful primitives to create variable bindings.
While let bindings are harmful for simplification and integer analysis,
they are useful in other cases:
- C0: LetStmt is useful to represent a step that has a side effect (e.g. calling a PRNG).
- C1: A Let expression can be used to create deeply nested expressions for complicated functions.
This PR improves the let support in the following ways:
- Enable vectorization support for let
- Change the let simplification strategy to simplify only the most trivial case
while ignoring more complicated cases (to avoid deep nest explosion)
- Enhance the arith module to handle const bound and modular set for let.
The overall recommendation is to use Let only when necessary (C0, C1).
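The simplification strategy can be illustrated with a toy expression tree. This is a sketch under stated assumptions, not the TIR simplifier: expressions are plain tuples, the helper names are hypothetical, "trivial" is assumed to mean that the bound value is a constant or a variable, and every let is assumed to bind a distinct name so that substitution cannot capture variables.

```python
def substitute(expr, name, value):
    """Replace occurrences of variable `name` in expr with `value`.

    Assumes every let binds a distinct name, so no shadowing can occur.
    """
    kind = expr[0]
    if kind == "const":
        return expr
    if kind == "var":
        return value if expr[1] == name else expr
    if kind == "add":
        return ("add", substitute(expr[1], name, value),
                substitute(expr[2], name, value))
    # kind == "let": ("let", bound_name, bound_value, body)
    _, v, bound, body = expr
    return ("let", v, substitute(bound, name, value),
            substitute(body, name, value))


def simplify_let(expr):
    """Inline a let only when its bound value is a constant or a variable."""
    kind = expr[0]
    if kind in ("const", "var"):
        return expr
    if kind == "add":
        return ("add", simplify_let(expr[1]), simplify_let(expr[2]))
    _, v, bound, body = expr
    bound, body = simplify_let(bound), simplify_let(body)
    if bound[0] in ("const", "var"):
        return substitute(body, v, bound)  # trivial: the tree cannot grow
    return ("let", v, bound, body)         # keep it: inlining would duplicate


# let a = 2 in let b = a + a in b + b
expr = ("let", "a", ("const", 2),
        ("let", "b", ("add", ("var", "a"), ("var", "a")),
         ("add", ("var", "b"), ("var", "b"))))
simplified = simplify_let(expr)
# `a` is inlined; `b` stays bound because its value would be copied twice.
```

Inlining b as well would copy its value once per use, and a chain of such bindings doubles the expression at every level, which is the nesting explosion the conservative rule avoids.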
Yizhi Liu [Sun, 28 Jun 2020 08:24:38 +0000 (01:24 -0700)]
[Doc] minor fix for release doc (#5948)
Lianmin Zheng [Sun, 28 Jun 2020 00:09:39 +0000 (17:09 -0700)]
fix string argument mismatch in GraphRuntimeCodegen (#5933)