Tianqi Chen [Wed, 15 Apr 2020 22:32:59 +0000 (15:32 -0700)]
[DOCS] Bring relay docs to the top-level flat view (#5343)
- Changes most of the relay docs to use autosummary.
- Bring relay API docs to the top-level flat view for easier discovery
- Removed a few cases of re-exports.
masahi [Wed, 15 Apr 2020 20:46:28 +0000 (05:46 +0900)]
[Tutorial, QNN] Add tutorial for loading quantized PyTorch model (#5321)
* add pytorch tutorial code and doc stub
* add more docs
* formatting, more docs
* typo fix
* try make sphinx happy
* add performance section
* type and nit fix
* format fix
Trevor Morris [Wed, 15 Apr 2020 20:33:31 +0000 (13:33 -0700)]
[BYOC] Prevent duplicate outputs in subgraph Tuple (#5320)
* Fix duplicate output in partitiongraph
* Add test case
* Fix test_annotated_regions with duplicate compiler_end outputs
* Revert "Fix duplicate output in partitiongraph"
This reverts commit e1f8ef3f4ca5b2aaa31ace6fa968bb50e5e4d1fa.
* Prevent duplicate outputs in Tuple in PartitionGraph
* Fix lint
* Add another test case for when regions are merged, and when TupleGetItem was duplicated
* Pull GetFunctionOutput out of branch, improve description of GetFunctionOutput
* Use std::move for GetFunctionOutput. Fix typo with testcase name
* Use tvm.transform.Sequential
Tianqi Chen [Wed, 15 Apr 2020 18:11:39 +0000 (11:11 -0700)]
[TIR] Remove ProducerConsumer and AllocateNode::new_expr (#5333)
* [TIR] Remove ProducerConsumer and AllocateNode::new_expr
This PR removes two legacy IR parts in TIR that are deprecated.
ProducerConsumer node only serves as a hint markup and may no longer be
informative after extensive transformations in the pass.
If necessary, we can add related info via AttrStmt.
The new_expr field in the AllocateNode is deprecated since it can just be
replaced by a LetStmt.
- Remove dependencies of passes on ProducerConsumer.
- Remove ProducerConsumer from the IR.
- Remove the deprecated fields (new_expr, free_function) from AllocateNode.
* Fix additional testcases
Tianqi Chen [Wed, 15 Apr 2020 18:11:28 +0000 (11:11 -0700)]
[PYTHON] Enhance with_attr API, cleanup MakeAPILegacy in testcases (#5335)
Leyuan Wang [Wed, 15 Apr 2020 15:32:50 +0000 (08:32 -0700)]
[TOPI] Improve get_valid_count and nms performance for CUDA (#5339)
* get_valid_count updated to have correct results
* speedup nms
* update nms
* revert back nms
* recover one test for get_valid_count
Animesh Jain [Wed, 15 Apr 2020 15:31:48 +0000 (08:31 -0700)]
[TOPI] Using x86 schedules for ARM conv2d. (#5334)
Samuel [Wed, 15 Apr 2020 10:18:03 +0000 (15:48 +0530)]
[PYTORCH]Take, Topk op support (#5332)
* [PYTORCH]take, topk op support
* Ci Failure fix
jmorrill [Wed, 15 Apr 2020 08:49:15 +0000 (01:49 -0700)]
Windows Support for cpp_rpc (#4857)
* Windows Support for cpp_rpc
* Add missing patches that fix crashes under Windows
* On Windows, use python to untar vs wsl
* remove some CMakeLists.txt stuff
* more minor CMakeLists.txt changes
* Remove items from CMakeLists.txt
* Minor CMakeLists.txt changes
* More minor CMakeLists.txt changes
* Even more minor CMakeLists.txt changes
* Modify readme
Jared Roesch [Wed, 15 Apr 2020 00:10:00 +0000 (17:10 -0700)]
[Runtime][Relay][Cleanup] Clean up for memory pass to enable heterogenous execution support. (#5324)
* Cleanup type pack and unpack for tuples.
* Clean up the memory_pass using common helpers
* Clean up memory.cc
* Refactor pass
* Add doc strings
* Fix CPPlint
* Fix PyLint
* Fix
* Apply suggestions from code review
Co-Authored-By: Zhi <5145158+zhiics@users.noreply.github.com>
* Fix typo
Co-authored-by: Zhi <5145158+zhiics@users.noreply.github.com>
Leandro Nunes [Wed, 15 Apr 2020 00:03:43 +0000 (01:03 +0100)]
[CI] Fix build.sh to propagate --network=host to the docker build command (#5336)
* when passing --net=host to build.sh it needs to be also
sent as --network=host to "docker build", so that both
build and run will use the same network configuration
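A minimal sketch of the flag translation this commit describes. build.sh itself is a shell script; the helper name `docker_network_args` and the list-based interface are assumptions made purely for illustration.

```python
def docker_network_args(script_args):
    """Return (build_flags, run_flags) for build.sh-style arguments.

    Hypothetical helper for illustration only; not part of build.sh.
    """
    build_flags, run_flags = [], []
    for arg in script_args:
        if arg.startswith("--net="):
            network = arg.split("=", 1)[1]
            # "docker run" receives the option as it was given ...
            run_flags.append(f"--net={network}")
            # ... while "docker build" is given the --network spelling.
            build_flags.append(f"--network={network}")
    return build_flags, run_flags


build_flags, run_flags = docker_network_args(["--net=host", "ci_cpu"])
print(build_flags, run_flags)
```

Deriving both flag lists from the same argument is what keeps the build and run steps on the same network configuration.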
Krzysztof Parzyszek [Wed, 15 Apr 2020 00:03:31 +0000 (19:03 -0500)]
[LLVM] Use llvm::FunctionCallee in IRBuilder::CreateCall with LLVM 11+ (#5338)
The older variants of CreateCall have been deprecated and were recently
removed from LLVM. This caused compilation failures.
Tianqi Chen [Wed, 15 Apr 2020 00:03:15 +0000 (17:03 -0700)]
[RELAY] Remove re-exports of tvm.transform (#5337)
Tianqi Chen [Tue, 14 Apr 2020 14:48:14 +0000 (07:48 -0700)]
[TIR] Refactor MakePackedAPI to target dependent stage. (#5326)
Previously MakePackedAPI was in the target-independent stage,
but it nevertheless requires the device_type information that is only
bound at a later target-dependent stage.
The previous implementation was due to a limitation of LoweredFunc,
which cannot carry buffer_map info (so functions had to be lowered right away).
This is no longer the case after the unified IR refactor.
This PR migrates MakePackedAPI to a target-dependent stage
and removes the unnecessary BindDevice pass.
Samuel [Tue, 14 Apr 2020 09:45:02 +0000 (15:15 +0530)]
[RELAY][PYTORCH]isNan, isinf, isfinite, ceil, clamp, round ops (#5316)
* [RELAY][PYTORCH]isNan, isinf, isfinite, ceil, clamp, round ops
* Review comments
Wuwei Lin [Tue, 14 Apr 2020 06:47:57 +0000 (02:47 -0400)]
[TE][BuildModule] Fix import in dump pass ir (#5327)
Mahesh Ambule [Tue, 14 Apr 2020 06:09:21 +0000 (11:39 +0530)]
[Frontend|MXNet] SwapAxis operator support (#5246)
* MXNet swap axis
* swap axis review comment
LiangLiu [Tue, 14 Apr 2020 03:35:31 +0000 (11:35 +0800)]
[CODEGEN][CUDA] Fix vector load (#5226)
* Fix high-low bit bug in __pack_half2
* Fix vector load
* Add uint8 support for PrintVecElemLoadExpr and BroadcastNode
masahi [Tue, 14 Apr 2020 01:21:31 +0000 (10:21 +0900)]
add memoized expr translator for use by backend codegen (#5325)
Tianqi Chen [Mon, 13 Apr 2020 23:04:32 +0000 (16:04 -0700)]
[COMMUNITY] @mbaret -> Reviewer (#5322)
Zhi [Mon, 13 Apr 2020 21:06:02 +0000 (14:06 -0700)]
[BYOC] Enhance partitioning and external codegen (#5310)
* Remove duplicated output args
* address comment
* fix codegen c
* improve comment
* VisitExprDefault_
* deduce type
Tianqi Chen [Mon, 13 Apr 2020 17:49:48 +0000 (10:49 -0700)]
[RUNTIME][IR] Allow non-nullable ObjectRef, introduce Optional<T>. (#5314)
* [RUNTIME] Allow non-nullable ObjectRef, introduce Optional<T>.
We use ObjectRef and its sub-classes extensively throughout our codebase.
Each of ObjectRef's sub-classes is nullable, which means it can hold nullptr
as its value.
In some places we do need nullptr as an alternative value. However, the implicit support
for nullptr in every ObjectRef creates an additional burden for the developer,
who must explicitly check whether a reference is defined in many places of the codebase.
Moreover, it is unclear from the API's point of view whether
we want a nullable object or a non-null version (in many cases we want the latter).
Borrowing existing wisdom from languages like Rust, we propose to
introduce non-nullable ObjectRef and an Optional<T> container that
represents a nullable variant.
To keep backward compatibility, we will start by allowing most ObjectRef to be nullable.
However, we should start to use Optional<T> as the type in places where
we know nullable is a requirement. Gradually, we will move most of the ObjectRef
to be non-nullable and use Optional<T> in the nullable cases.
Such explicitness in typing can help reduce the potential problems
in our codebase overall.
Changes in this PR:
- Introduce _type_is_nullable attribute to ObjectRef
- Introduce Optional<T>
- Change String to be non-nullable.
- Change the API of function->GetAttr to return Optional<T>
* Address review comments
* Upgrade all compiler flags to c++14
* Update as per review comment
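The nullable/non-nullable split proposed in this commit can be sketched in Python. This is an illustration of the idea only, not the C++ API: the classes `Opt` and `Str`, the function `get_attr`, and the attribute keys are all hypothetical stand-ins.

```python
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class Opt(Generic[T]):
    """Illustrative stand-in for Optional<T>: the one place a null is allowed."""

    def __init__(self, value: Optional[T] = None):
        self._value = value

    def defined(self) -> bool:
        return self._value is not None

    def value(self) -> T:
        if self._value is None:
            raise ValueError("Opt is empty")
        return self._value

    def value_or(self, default: T) -> T:
        return default if self._value is None else self._value


class Str:
    """Illustrative non-nullable reference: construction rejects None."""

    def __init__(self, data: str):
        if data is None:
            raise TypeError("Str is non-nullable")
        self.data = data


def get_attr(attrs: dict, key: str) -> Opt[Str]:
    """Hypothetical lookup whose return type says 'may be missing'."""
    return Opt(attrs.get(key))


attrs = {"global_symbol": Str("main")}
name = get_attr(attrs, "global_symbol")
missing = get_attr(attrs, "Compiler")
print(name.value().data, missing.defined())
```

With this split, code that holds a `Str` never needs a null check, and only callers that receive an `Opt` have to decide what an absent value means.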
Josh Fromm [Mon, 13 Apr 2020 17:49:17 +0000 (10:49 -0700)]
[Topi] Tensorcore support for Conv3D (#5284)
* one weird trick.
* Added schedule knob for different workloads.
* Initial conv3d tensorcore working.
* Added conv3d tensorcore strategy.
* Added layout conversion to tensorcore friendly format for conv2d and conv3d.
* Add target name check.
* Fixed bad names and depthwise check.
* Removed duplicated attribute assignment.
windclarion [Mon, 13 Apr 2020 14:55:33 +0000 (22:55 +0800)]
[REALY][OP] fix typo (#5315)
Signed-off-by: windclarion <windclarion@gmail.com>
Samuel [Mon, 13 Apr 2020 09:50:10 +0000 (15:20 +0530)]
[PYTORCH]Reduce_ops support added (#5308)
* [PYTORCH]Reduce_ops support added
* Review comments updated
* typo bug in qnn test
masahi [Mon, 13 Apr 2020 06:11:57 +0000 (15:11 +0900)]
[Torch] Support Python list, more realistic recurrent networks (#5306)
* use funcs from prelude, pass around convert_map
* get relay input type from user ishape
* handle tuple unpack
* experimenting with static tensor array
* use prelude concat instead of cons + rev
* minor clean up
* fix layer norm conversion bug, unwrap tensor array
* add infer shape on tensor array
* pass around prelude for now
* compile worked but runtime error
* fix tensor array wrapping
* begin list dynamic test
* is_list_dynamic first version
* finish dynamic list test
* a few fix
* use shape_of function if Any is found
* improve size conversion
* working on adding free vars to loop block
* fixed inlined inner loop issue
* clean up free var handling
* add support for tensor array concat
* adding ta concat on last axis
* fix concat, but got runtime error
* disable concat on axis -1 for now
* add lstm tests
* revert unrelated change
* fix stacked bidir test
* minor fix to test
* relax tol a bit, revert dnnl change to avoid conflict
* simplify infer type, use input tensor shape rather than concat shape
* more shape fix
Junru Shao [Sun, 12 Apr 2020 16:32:23 +0000 (09:32 -0700)]
[Intrinsic] Add log1p, ldexp, atan2, hypot, nextafter, copysign (#5312)
* [Intrinsic] Add log1p, ldexp, atan2, hypot, nextafter, copysign
* Lint
Jared Roesch [Sun, 12 Apr 2020 16:30:47 +0000 (09:30 -0700)]
[Rust][CI] Restore Rust CI (#5137)
Zhi [Sun, 12 Apr 2020 16:12:23 +0000 (09:12 -0700)]
Remove PrimExpr from String (#5311)
Animesh Jain [Sun, 12 Apr 2020 06:10:52 +0000 (23:10 -0700)]
[Requantize] Cleanup and Optimize Lowering (#5286)
* Adding Cast back to Int32 in FixedPointMultiply.
* Removing extra clip.
* Fix space.
* Retrigger.
* Retrigger.
Tianqi Chen [Sun, 12 Apr 2020 00:42:42 +0000 (17:42 -0700)]
[IR][TRANSFORM] Enable CopyOnWrite for passes. (#5309)
This PR enables copy-on-write (COW) optimizations in passes:
- Enable COW for IRModule in both TIR and relay passes.
- Enable COW for PrimFunc in TIR passes.
More thought is needed on whether/how to enable COW
for relay::Function, because some function passes depend
on the presence of IRModule for context information,
and the std::move of the related function to nullptr
might affect the related behavior.
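A rough Python sketch of the copy-on-write idea: a pass mutates in place when it holds the only reference, and copies first otherwise. The classes `Node` and `Ref`, the explicit `use_count` field, and `rename_pass` are assumptions for illustration and do not mirror the actual C++ implementation.

```python
class Node:
    """Payload shared by one or more references."""

    def __init__(self, functions):
        self.functions = dict(functions)
        self.use_count = 0  # explicit reference count, maintained by Ref


class Ref:
    """Handle to a Node that tracks how many handles share it."""

    def __init__(self, node):
        self.node = node
        node.use_count += 1

    def share(self):
        return Ref(self.node)

    def copy_on_write(self):
        """Return a node that is safe to mutate.

        If this handle is the only owner, the node is returned as-is;
        otherwise the handle is re-pointed at a private copy.
        """
        if self.node.use_count > 1:
            old = self.node
            old.use_count -= 1
            self.node = Node(old.functions)
            self.node.use_count = 1
        return self.node


def rename_pass(mod, old, new):
    """Toy 'pass' that renames one function in the module."""
    node = mod.copy_on_write()
    node.functions[new] = node.functions.pop(old)
    return mod


shared = Ref(Node({"main": "body"}))
alias = shared.share()
rename_pass(shared, "main", "entry")  # a copy is made; alias is untouched

unique = Ref(Node({"main": "body"}))
original_node = unique.node
rename_pass(unique, "main", "entry")  # sole owner: mutated in place
```

The second call is the case the optimization targets: when the caller hands over its only reference, the pass avoids the copy entirely.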
Samuel [Sat, 11 Apr 2020 05:02:58 +0000 (10:32 +0530)]
[PYTORCH]Abs, Arange, Softplus ops (#5295)
* [PYTHON]Abs, Arange, Softplus ops
* Review comments updated
Krzysztof Parzyszek [Sat, 11 Apr 2020 04:19:03 +0000 (23:19 -0500)]
[LLVM] Fix generation of LLVM intrinsics (#5282)
* [LLVM] Fix generation of LLVM intrinsics
The type list in the call to llvm::Intrinsic::getDeclaration is not
the intrinsic's signature, it's the list of overloaded types. Without
this fix, the updated unit test would cause the following error:
TVMError: LLVM module verification failed with the following errors:
Intrinsic name not mangled correctly for type arguments! Should be:
llvm.ctlz.i32
i32 (i32, i1)* @llvm.ctlz.i32.i1
Special handling for llvm.prefetch, sig matching for overloaded ints only
The prefetch intrinsic returns void in LLVM, while it returns i32 in TVM.
This case needs to be handled specially, because rule-based intrinsic
translation would cause invalid LLVM type to be created.
Do the signature matching only for overloaded intrinsics. It's not needed
for non-overloaded ones, so this can save a bit of compile-time.
* Include intrinsic name in the error message
* Fix number of arguments for llvm.fmuladd and llvm.pow
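To illustrate why passing the full signature produced the wrong name, here is a simplified sketch of suffix-based name mangling. The function `mangle_intrinsic` is hypothetical, and real LLVM mangling covers more cases (vectors, pointers, and so on); only the two names quoted in the error message above are reproduced.

```python
def mangle_intrinsic(base_name, overloaded_types):
    """Append one suffix per overloaded type to the intrinsic's base name."""
    return ".".join([base_name] + list(overloaded_types))


signature = ["i32", "i1"]  # every type in ctlz's signature, per the error above
overloaded = ["i32"]       # only the overloaded type belongs in the name

wrong = mangle_intrinsic("llvm.ctlz", signature)
right = mangle_intrinsic("llvm.ctlz", overloaded)
print(wrong, right)
```

Passing the signature yields the rejected name `llvm.ctlz.i32.i1`, while passing only the overloaded type yields the expected `llvm.ctlz.i32`.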
masahi [Sat, 11 Apr 2020 03:29:20 +0000 (12:29 +0900)]
[BYOC] Add example of Composite + Annotate for DNNL fused op (#5272)
* merge change from dev branch
* fix string issue
* bring comanic's change back
Yao Wang [Sat, 11 Apr 2020 01:43:23 +0000 (18:43 -0700)]
[Frontend][TensorFlow]Improve TensorFlow Static Shape Tensor Array (#5243)
* Support TF Frontend Static TensorArray
* Fix pylint
* Fix lint
* Move get_tensor_array_shape into prelude
* Fix lint
* Fix common
Tianqi Chen [Sat, 11 Apr 2020 00:07:20 +0000 (17:07 -0700)]
[RUNTIME] Introduce RValue reference(move) support to TypedPackedFunc (#5271)
* [RUNTIME] Introduce RValue reference(move) support to TypedPackedFunc
This PR introduces RValue reference support to the PackedFunc calling convention to address the above issue.
Specifically, when an argument is an r-value reference, we assign a different type code (`kObjectRValueRefArg`)
and pass `Object**` (the address of the Object pointer) through the values array instead.
The callee can choose to move out this Object pointer and set the original Object pointer on the caller side to nullptr.
We also add experimental move support to the python side (marked as _move to indicate its developmental nature).
This enhancement will enable copy-on-write optimizations throughout the TVM stack.
* Address review comments
* fix compilation
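The move protocol described in this commit can be mimicked in Python with a one-element list standing in for `Object**`. This is only a sketch of the ownership hand-off; `Obj`, `callee_take`, and `callee_borrow` are hypothetical names, and no type codes are modelled.

```python
class Obj:
    def __init__(self, payload):
        self.payload = payload


def callee_take(slot):
    """Move the object out of the caller's slot, leaving None (nullptr) behind."""
    obj = slot[0]
    slot[0] = None
    return obj


def callee_borrow(slot):
    """Ordinary argument passing: the caller keeps its handle."""
    return slot[0]


caller_slot = [Obj("weights")]  # one-element list stands in for Object**
moved = callee_take(caller_slot)

other_slot = [Obj("bias")]
borrowed = callee_borrow(other_slot)
print(moved.payload, caller_slot[0], borrowed.payload)
```

After `callee_take`, the callee holds the only handle to the object, which is the precondition for mutating it in place rather than copying.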
Huacong Yang [Fri, 10 Apr 2020 21:46:03 +0000 (05:46 +0800)]
[RELAY][FRONTEND][CAFFE2] add Mul and ConvTranspose operator (#5302)
Cody Yu [Fri, 10 Apr 2020 21:32:56 +0000 (14:32 -0700)]
[BYOC] Refine AnnotateTarget and MergeCompilerRegion Passes (#5277)
* add target to region
* refactor annotate_target
* Make all unit test working
* quick fix
* enable BN, unit test failed
* Fix vm test, unit test. Refactor annotate_target a bit.
* quick fix fusion
* revert fusion change
* style fix
* Refactor merge region pass
* format
* minor fix
* Skip e2e test
* lint
* support AnnotateTarget multiple runs
* Add HasAttr and revert DNNL codegen
* address comment
Co-authored-by: Zhi Chen <chzhi@amazon.com>
Tianqi Chen [Fri, 10 Apr 2020 17:58:59 +0000 (10:58 -0700)]
[CI] Fix the hexagon string (#5304)
Yizhi Liu [Fri, 10 Apr 2020 15:11:21 +0000 (08:11 -0700)]
[Arith] linear system and equation solver (#5171)
* [arith] linear system and equation solver
Co-authored-by: Sergei Grechanik <sergei.grechanik+h@gmail.com>
* avoid constructing analyzer every time
* generate random test cases and address comments
Co-authored-by: Sergei Grechanik <sergei.grechanik@gmail.com>
* rename linear_system to int_constraints
* add comments and use random seed
* message for reporting failure with seed
* add SEqualReduce to IntConstraints; allow variables & ranges to be None
Co-authored-by: Sergei Grechanik <sergei.grechanik+h@gmail.com>
Co-authored-by: Sergei Grechanik <sergei.grechanik@gmail.com>
Samuel [Fri, 10 Apr 2020 15:08:56 +0000 (20:38 +0530)]
[PYTORCH]Repeat, Reciprocal & Reshape Op support (#5280)
MORITA Kazutaka [Fri, 10 Apr 2020 14:47:53 +0000 (23:47 +0900)]
[FRONTEND][TENSORFLOW] Fix gather_nd indices (#5279)
* [FRONTEND][TENSORFLOW] Fix gather_nd indices
* retrigger CI
weiliangweiliang [Fri, 10 Apr 2020 14:47:19 +0000 (22:47 +0800)]
Update device_annotation.cc (#5291)
Zhi [Fri, 10 Apr 2020 14:46:23 +0000 (07:46 -0700)]
[REFACTOR][IR] Move to runtime::String (#5276)
* Use runtime::String
* move string to tvm namespace
* add const char* constructor
* implicit cast from std::string
hlu1 [Fri, 10 Apr 2020 14:42:54 +0000 (07:42 -0700)]
[NDArray] Set shape_ in NDArray::FromDLPack (#5301)
Krzysztof Parzyszek [Fri, 10 Apr 2020 13:47:59 +0000 (08:47 -0500)]
[RUNTIME] Initial implementation of Hexagon runtime support (#5252)
* [RUNTIME] Initial implementation of Hexagon runtime support
This is only the TVM runtime. The FastRPC libraries, simulator driver,
etc. will be provided in subsequent commits.
* Fix pylint complaints
* Fix some more pylint complaints
* Add link to the Hexagon SDK website
* Extract VTCM marker into a common variable
* Implement device->device memory copy
* Disable unsigned PDs by default
* Ensure that --hvx_length is present in sim_args if HVX is enabled
* Remove the line about clang from README.md
Apparently things work with libstdc++.
* Mention to set USE_RPC=OFF when building libtvm_runtime.so for Hexagon
* Remember to use codegen_hvx in validate_hvx_length
* Add a line about minimum version of LLVM
Cody Yu [Fri, 10 Apr 2020 07:29:39 +0000 (00:29 -0700)]
[BYOC] Refine DNNL Codegen (#5288)
* Improve DNNL
* Add bind params
* trigger ci
shoubhik [Fri, 10 Apr 2020 05:32:28 +0000 (22:32 -0700)]
Adding support for TFLite QnnSub operator. (#5230)
Tianqi Chen [Fri, 10 Apr 2020 05:04:08 +0000 (22:04 -0700)]
[NODE] General serialzation of leaf objects into bytes. (#5299)
This PR refactors the serialization mechanism to support general
serialization of leaf objects into bytes.
The new feature supersedes the original GetGlobalKey feature for singletons.
Added serialization support for runtime::String.
Animesh Jain [Fri, 10 Apr 2020 04:01:35 +0000 (21:01 -0700)]
Legalize - Use Non-recursive Rewriter. (#5296)
* Legalize - Use Non-recursive Rewriter.
* Cleanup.
Yizhi Liu [Fri, 10 Apr 2020 03:58:43 +0000 (20:58 -0700)]
[Node] Provide guide to user who has difficulty register SEqualReduce (#5300)
Samuel [Fri, 10 Apr 2020 01:56:47 +0000 (07:26 +0530)]
[TENSORFLOW]reduce ops updated (#5180)
yongfeng-nv [Fri, 10 Apr 2020 01:49:37 +0000 (21:49 -0400)]
Create loops according to storage scope and thread hierarchies (#5190)
* Set IterVar index to 0 for local thread bound IterVars.
* Lint fix
* Use rank instead of scope name to predicate. Add tests.
* Handle cases other than local/threadIdx.
* Turn warp to the old behavior.
* Modify test to cover global/blockIdx.
* Fix a typo.
* Update test_te_schedule_ops.py with more testing coverage in test_local_stage_predicate; remove test_schedule_schedule_ops.py which was added by mistake.
Tianqi Chen [Fri, 10 Apr 2020 00:55:20 +0000 (17:55 -0700)]
[CI] Temporary disable CRT test (#5297)
Tianqi Chen [Thu, 9 Apr 2020 19:48:51 +0000 (12:48 -0700)]
[BUGFIX] Fix CRT static test bug (#5293)
* [CI][DOCS] Make sure to refresh the cython part
* [BUGFIX] Fix CRT static test bug
* Fix demo_static
* resolve review comment
Zhi [Wed, 8 Apr 2020 21:20:16 +0000 (14:20 -0700)]
[BUGFIX][IR] Fix String SEqual (#5275)
* fix String SEqual
* retrigger ci
Luis Vega [Wed, 8 Apr 2020 20:46:52 +0000 (13:46 -0700)]
update compiler version in docs (#5281)
Haichen Shen [Wed, 8 Apr 2020 03:55:48 +0000 (20:55 -0700)]
[LINT] Remove scalalint from lint deps (#5269)
Krzysztof Parzyszek [Wed, 8 Apr 2020 03:53:25 +0000 (22:53 -0500)]
[LLVM] Include Support/Host.h for declaration of getDefaultTargetTriple (#5268)
In newer versions of LLVM, this header is no longer included by one of
the already included headers in llvm_common.h, so include it explicitly.
Samuel [Wed, 8 Apr 2020 03:45:41 +0000 (09:15 +0530)]
[PYTORCH]celu, gelu, selu activations (#5263)
mbaret [Wed, 8 Apr 2020 03:12:15 +0000 (04:12 +0100)]
[RELAY][BYOC] Add support for composite functions in BYOC (#5261)
* [RELAY] Add 'check' functions to MergeComposite
Currently, MergeComposite can only perform structural
matches. This patch introduces the ability to specify
a 'check' function alongside the pattern which can include
custom logic to determine whether an extracted pattern
should be merged.
For example, if you only want to merge 'NHWC' convolutions,
you can specify a 'check' function which queries the
data_layout value of the extracted pattern (see the test).
Change-Id: I9337ce39f10997051a286d888be38ed0d410d340
* [RELAY] Reformat merge_composite.cc
Run clang-format on merge_composite.cc
Change-Id: I1736bff798cc6d93e57519b08ab3362869098779
* [RELAY][BYOC] Support composite functions in AnnotateTarget
This patch introduces support to annotate composite functions
in the AnnotateTarget pass. In order for a composite function
to be annotated, you should name it according to the style:
{codegen}.{name}
eg. dnnl.add_relu
Change-Id: I74d6c0b506153d866f6d1feb203b32dad59f2871
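The two conventions in this commit, a `check` callback that can veto a structural match and the `{codegen}.{name}` naming style, can be sketched as follows. The extracted pattern is modelled as a plain dict, and every function name here is an assumption for illustration rather than the Relay API.

```python
def split_composite_name(name):
    """Split '{codegen}.{name}' at the first dot into (codegen, pattern_name)."""
    codegen, _, pattern = name.partition(".")
    if not codegen or not pattern:
        raise ValueError(f"expected '{{codegen}}.{{name}}', got {name!r}")
    return codegen, pattern


def nhwc_only(extracted):
    """Example 'check' callback: accept only NHWC convolutions."""
    return extracted.get("data_layout") == "NHWC"


def should_merge(extracted, check=None):
    """A structural match is merged unless a check function rejects it."""
    return check is None or check(extracted)


print(split_composite_name("dnnl.add_relu"))
print(should_merge({"data_layout": "NCHW"}, nhwc_only))
```

Leaving `check` unset keeps the earlier behaviour, in which every structural match is merged.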
tobe [Tue, 7 Apr 2020 23:59:32 +0000 (07:59 +0800)]
[RUNTIME] Implement TVMDSOOp(TensorFlow custom op) for TVM runtime (#4459)
* Add implementation of TVMDSOOp
* feat: Update cmake script to work with c++11 and in-repo build
* feat: Use libtvm as oplib dependency
* fix: Add missing link dependency to libtvm
* feat: Update tf tvmdso op by review comments
* fix: Update with pr comments
* fix: Fix lint
* feat: Add test script and fix gpu shape
* fix: Conditional build tftvm op for gpu
* fix: Fix pylint of tf_op module.py
* feat: Conditional enable gpu test for tftvm op
* feat: Add tf_tvmdsoop test script as an app test
* fix: Fix gpu/cpu enabled check on tvm in test script
* fix: Make tf tvmdso op test script runnable with pytest
* remove unused test script test_tfop_module.py
* fix: Remove pushd & popd in tfdsoop test script
* fix: Upgrade tftvmop use python3 to find TensorFlow
* fix: Change target_link_options to target_link_libraries
* fix: Add tftvmop build script's c++ option
* fix: Add tvm library path to tf op test library path
* fix: Debug ci build for tftvm dso op
* fix: Fix cmake error and skip tfop test
* fix: Fix typo and indentation issues
* feat: Use TF list input op def
* fix: Fix style and unexpected changes
Co-authored-by: baoxinqi <baoxinqi@4paradigm.com>
Co-authored-by: Chen Dihao <chendihao@4paradigm.com>
Co-authored-by: wrongtest <wrongtest@4paradigm.com>
Krzysztof Parzyszek [Tue, 7 Apr 2020 23:58:33 +0000 (18:58 -0500)]
[LLVM] Do not use x86_vcvtph2ps_256 intrinsic with LLVM 11+ (#5267)
This intrinsic was removed in LLVM 11.
Tianqi Chen [Tue, 7 Apr 2020 23:33:12 +0000 (16:33 -0700)]
[RUNTIME] Quick fix PackedFunc String passing (#5266)
Krzysztof Parzyszek [Tue, 7 Apr 2020 22:49:07 +0000 (17:49 -0500)]
[LLVM] Use llvm::ElementCount with LLVM 11+ when creating vectors (#5265)
LLVM 11 added support for scalable vectors, and now the number of
elements in a vector is represented by a llvm::ElementCount class,
not just a number.
Krzysztof Parzyszek [Tue, 7 Apr 2020 22:48:59 +0000 (17:48 -0500)]
[LLVM] Use llvm::Align with LLVM 11+ to avoid warnings (#5264)
LLVM 11 is introducing a separate class to represent alignment.
The functions in IRBuilder that create aligned loads and stores,
and which accept the alignment as an unsigned value have been
deprecated (and now cause warnings to be emitted).
Liangfu Chen [Tue, 7 Apr 2020 21:33:05 +0000 (05:33 +0800)]
[uTVM][Runtime] Introduce Virtual Memory Allocator to CRT (#5124)
* initial crt_memory and memory leak fix in graph_runtime
Change-Id: I0f79f909a04d1c677aabb80f202f0612c5ce7f2a
* fix memory leak
Change-Id: I37104c09e28112b1974fa2b064c809d0a8d686c3
* clean up
Change-Id: I039b12015a1d56c8f4120867cd5a5292da34f3e3
* implement vrealloc
Change-Id: I35800470bcbfcf96652494f359711cb4c2d34398
* allocate from stack memory for most of the variables
Change-Id: I72071289843fff4031c0df8796868a0b9fbc57ee
* allocate from stack memory for all of the variables
Change-Id: I32dba85ac1660c77f51c2d0d8ab6436ed0c01c74
* lint
Change-Id: If12cd240685d7791fc60bc0cfb66389cdc186b73
* lint
Change-Id: I7c9d90c11b60b8edda2427ebd189ebe535af2100
* facilitate the growth of TVM_CRT_MAX_NDIM
Change-Id: I939fa43027a5c7529c5c7c6bd8d6e6beb91b7581
* extend test coverage of vmalloc
Change-Id: Ie4ff6b64fdfe6810836cf8fd44dace82a20c4581
* lint
Change-Id: Ibf3c06619ef296df5c49f3945cb6428777781d69
* move logging.h to src
* fix an error in macOS
* remove logging.h
* use cflags for gcc
* fix compilation error
Haichen Shen [Tue, 7 Apr 2020 19:05:33 +0000 (12:05 -0700)]
[Relay][OP] Add fast_erf implementation (#5241)
* add fast erf
* doc
* lint
* fix
* fix indent
Tianqi Chen [Tue, 7 Apr 2020 17:24:55 +0000 (10:24 -0700)]
[TIR] Fix perf regression of tir refactor (#5258)
Adrian Muresan [Tue, 7 Apr 2020 15:21:23 +0000 (17:21 +0200)]
Fixed typo and type mismatch (#5259)
Co-authored-by: Adrian Muresan <muresan.adrian.bn@gmail.com>
Samuel [Tue, 7 Apr 2020 10:13:45 +0000 (15:43 +0530)]
[Pytorch]layernorm bug fix and testcase updated (#5257)
Samuel [Tue, 7 Apr 2020 08:14:02 +0000 (13:44 +0530)]
[TFLITE]Hard Swish & MobilnetV3 model testing (#5239)
* [TFLITE]Hard Swish & MobilnetV3 model testing
* CI Failure addressed
Pratik Fegade [Tue, 7 Apr 2020 02:04:20 +0000 (22:04 -0400)]
[TE] Minor bugfix in message_passing.cc (#5254)
Haichen Shen [Mon, 6 Apr 2020 21:30:53 +0000 (14:30 -0700)]
[Topi] Breakdown topi.cc into smaller files (#5253)
* [Topi] Breakdown topi.cc into smaller files
* add missing file
Samuel [Mon, 6 Apr 2020 20:31:19 +0000 (02:01 +0530)]
[PYTORCH]LayerNorm support added (#5249)
Tianqi Chen [Mon, 6 Apr 2020 20:25:00 +0000 (13:25 -0700)]
[RUNTIME] Enable auto conversion from str to runtime::String in PackedFunc, move dtype related handling to data_type.h (#5251)
Tang, Shizhi [Mon, 6 Apr 2020 15:43:38 +0000 (23:43 +0800)]
fix lower_warp_memory (#5247)
chinakook [Mon, 6 Apr 2020 15:42:01 +0000 (23:42 +0800)]
fix to skip node not in graph. (#5238)
Skip nodes that are not in the graph, because some networks cannot be hybridized when some vars are unused.
Haichen Shen [Mon, 6 Apr 2020 06:38:24 +0000 (23:38 -0700)]
[CI] Update MxNet to 1.6.0 with MKL (#5240)
Haichen Shen [Mon, 6 Apr 2020 03:53:59 +0000 (20:53 -0700)]
[Runtime][Contrib] Support cudnn softmax (#5214)
Josh Fromm [Sun, 5 Apr 2020 21:59:38 +0000 (14:59 -0700)]
[Relay][Topi][AutoTVM] Winograd support for Conv3D (#5186)
* Functional conv3d winograd working.
* Formatted python code.
* registered conv3d winograd compute and started adding relay without_weight_transform operator.
* Add topi testing for conv3d winograd.
* Format file.
* small tweak to unrolling to prevent build sticking.
* Refactoring convolution ops in relay.
* Refactored relay convolutions.
* Bug fixes.
* Fixed static bug in convolution.
* Added conv3d alter op layout and related support.
* Bug fixes and testing done.
* Fix a few autotvm bugs.
* Drop silly debug print.
* Removed debug_skip_region.
* Add variant of conv3d_winograd that doesn't transform depth.
* initial infrastructure done for depthless conv.
* Fix no_depth schedule bugs.
* automatic topi switching between depth and depthless winograd.
* Fixed bug in schedule.
* lint fixes.
* Removed indents in convolution.cc
* missed a few indents oops.
* fixed flop count.
* One more small tweak.
* Change kernel pack inner axes order.
* Style changes.
* Comment fixes.
ga [Sun, 5 Apr 2020 20:45:43 +0000 (16:45 -0400)]
[Fix][VM] Fix copy constructor (#5237)
Yao Wang [Sun, 5 Apr 2020 20:42:28 +0000 (13:42 -0700)]
[Relay][ADT]Static Tensor Array (#5103)
* Add other static tensor array ops
* Add tensor array get data
* Minor refactor
* Fix pylint
* Update docstring
* Make get data more generic
* Improve test
* Improve split test
* Improve get data
* Minor fix
* Further improvement for static shape
* Improve shape parsing
* Unify get_static_name
Tianqi Chen [Sun, 5 Apr 2020 00:36:49 +0000 (17:36 -0700)]
[REFACTOR][TIR] Migrate all low-level passes to the Pass Manager. (#5233)
* [REFACTOR][TIR] Migrate all low-level passes to the Pass Manager.
This PR migrates the tvm.lower to return IRModule of PrimFuncs
instead of the LoweredFuncs.
* Remove LoweredFunc.
Samuel [Sat, 4 Apr 2020 09:33:43 +0000 (15:03 +0530)]
[ONNX]Pool3d & upsample3d op support (#5135)
* [ONNX]Pool3d and Upsample3d op updated
* Pool3d and Upsample3d testcase
* Review comments fixed
* Review comments
Yao Wang [Sat, 4 Apr 2020 03:46:32 +0000 (20:46 -0700)]
Fix intel conv2d auto tune (#5200)
* Fix x86 conv2d and depthwise conv2d auto tuning
* Fix depthwise conv2d infer layout
* Use random data instead of empty data for autotvm
* Fix pylint
* Keep empty array for now for autotvm
Tang, Shizhi [Sat, 4 Apr 2020 01:49:56 +0000 (09:49 +0800)]
[TE] Support mixing normal and cross-thread reduction (#5193)
* Support mixing normal and cross-thread reduction
* minor improvements
Tianqi Chen [Fri, 3 Apr 2020 22:50:11 +0000 (15:50 -0700)]
[REFACTOR][TIR] Migrate most of low-level build to use the Pass Manager. (#5225)
* [REFACTOR][TIR] Migrate most of low-level build to use the Pass Manager.
- SplitHostDevice
- ThreadSync
- BindDevice
- LowerThreadAllreduce
- Provide a temp fix for printing IRModule with PrimFunc before the formal text printer.
* Address comments, fix tests.
* Fix relay tests
* Explicit move
Tianqi Chen [Fri, 3 Apr 2020 22:24:19 +0000 (15:24 -0700)]
[PYTHON] Make IntImm more like an integer (#5232)
Matthew Brookhart [Fri, 3 Apr 2020 21:35:55 +0000 (14:35 -0700)]
[RELAY] Non-recursive Graph Vistor and Rewriter (#4886)
* First pass at defining a non-recursive Graph Visitor and Rewriter
autoformat
remove a currently empty test until testing is solidified
* Make CalcDep from Dead Code Elimination non-recursive
* Partially working, not passing all tests yet
passes tests when disabling GetExprRefCount, I think I have a bug in visit counting
fix GetExprRefCount
Fix a subtle bug with nested recursive/non-recursive scopes
* Refactor
* improve comments
* respond to review comments on comments
* Fix a problem with default recursion for dataflow nodes
mark DataflowVisitor methods as override
* implement ScopeMutator
* convert forward_rewrite to ScopeMutator, remove DataflowMutator
* rewrite ExprRewriter and convert fast_math to use it
* switch BiasAddSimplifier to ExprRewriter
fix a clang warning
fix cpp lint
fix doc param error
* respond to review comments
* fix a typo in the iterative looping
* add a regression test for GetExprRefCount issue
* Normalize naming
* fix lint
* respond to review comments
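For readers unfamiliar with the technique, a non-recursive post-order traversal over a dataflow graph can be sketched with an explicit stack. This is a generic illustration in Python, not the C++ visitor added by this commit; `post_order` and the `children` callback are hypothetical names.

```python
def post_order(root, children):
    """Visit each node reachable from root once, children before parents.

    Uses an explicit stack, so the depth of the graph is not limited
    by the call stack.
    """
    order, visited = [], set()
    stack = [(root, False)]
    while stack:
        node, expanded = stack.pop()
        if expanded:
            order.append(node)  # all children have been emitted
            continue
        if node in visited:
            continue  # shared node reached through a second parent
        visited.add(node)
        stack.append((node, True))  # revisit after the children
        for child in reversed(children(node)):
            if child not in visited:
                stack.append((child, False))
    return order


# Diamond-shaped graph: "x" is shared by "a" and "b".
graph = {"out": ["a", "b"], "a": ["x"], "b": ["x"], "x": []}
diamond = post_order("out", graph.__getitem__)

# A chain far deeper than Python's default recursion limit.
depth = 50000
chain = post_order(0, lambda n: [n + 1] if n < depth else [])
print(diamond, len(chain))
```

The diamond case shows that a shared node is visited once, and the chain case shows the motivation: a recursive visitor would exhaust the call stack on very deep graphs.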
Animesh Jain [Fri, 3 Apr 2020 19:11:32 +0000 (12:11 -0700)]
[TOPI x86] Adding unroll_kw config option for depthwise conv2d. (#5197)
mbaret [Fri, 3 Apr 2020 16:33:15 +0000 (17:33 +0100)]
[RELAY][FIX] Fix hang in MergeCompilerRegions (#5227)
For certain network topologies, MCR could hang.
This patch fixes that case.
Change-Id: I3edd8a8a6b452b2b838b777720adea22a3b995b4
Samuel [Fri, 3 Apr 2020 16:04:41 +0000 (21:34 +0530)]
[KERAS]Upsample3d & ZeroPadding3d op (#5125)
* [KERAS]upsampling3d and zeropadding3d op
* [KERAS]upsampling3d and zeropadding3d test case
* Review comments updated
Samuel [Fri, 3 Apr 2020 13:41:28 +0000 (19:11 +0530)]
[DOCSTRING]missing function parameters updated (#5228)
Wei Pan [Fri, 3 Apr 2020 06:57:40 +0000 (23:57 -0700)]
[CodeGen][CUDA] Fix bugs (#5209)
- Support vectorized casts
- It is incorrect to extract elements from int8x4 with
0x000000ff & (x >> i * 8)
as this value is of type int in C/C++. If this expression
is used for sign extensions, the sign bit will be wrong.
Simply use C style casts instead and sign bits will just work.
Signed-off-by: Wei Pan <weip@nvidia.com>
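The sign problem described in this commit can be reproduced with plain integer arithmetic. The sketch below emulates the two extraction strategies in Python; the helper names are hypothetical, and placing lane 0 in the low byte is an assumption of the example.

```python
def pack_int8x4(lanes):
    """Pack four signed 8-bit values into one 32-bit word, lane 0 in the low byte."""
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & 0xFF) << (i * 8)
    return word


def extract_masked(word, i):
    """Shift-and-mask form: the result is in 0..255, so the sign is lost."""
    return 0x000000FF & (word >> (i * 8))


def extract_as_int8(word, i):
    """Emulates casting the selected byte to a signed 8-bit integer."""
    byte = (word >> (i * 8)) & 0xFF
    return byte - 256 if byte >= 128 else byte


word = pack_int8x4([-1, 2, -128, 127])
masked = [extract_masked(word, i) for i in range(4)]
signed = [extract_as_int8(word, i) for i in range(4)]
print(masked, signed)
```

The masked form turns -1 into 255 and -128 into 128, whereas the cast-style form recovers the original signed lanes.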
Tianqi Chen [Thu, 2 Apr 2020 23:56:24 +0000 (16:56 -0700)]
[REFACTOR] tvm.hybrid -> te.hybrid (#5223)
Rationale: The current hybrid module is more aligned with the te part.
We might consider adding a new variant of hybrid script that supports the unified IR later.
This refactor paves the way for potential later changes.
Tianqi Chen [Thu, 2 Apr 2020 23:56:11 +0000 (16:56 -0700)]
[DOCS] Misc docs improvements (#5222)
- Reduce CI docs task log size.
- Update the relation to halide to the latest state.
Samuel [Thu, 2 Apr 2020 22:20:41 +0000 (03:50 +0530)]
[PYTORCH]AvgPool3d, MaxPool3d and Squeeze op support (#5220)
* [PYTORCH]AvgPool3d, MaxPool3d and Squeeze op support
* Testcases added
* review comments
Tianqi Chen [Thu, 2 Apr 2020 20:22:23 +0000 (13:22 -0700)]
[REFACTOR][TIR] Migrate low-level pass functions to Pass Manager, (#5213)
- Migrate LowerTVMBuiltin
- Migrate InferFragment, LowerThreadAllreduce
- Migrate ThreadSync
- Refactor target::Build to directly take IRModule.
- Remove un-used legacy functions.
Tianqi Chen [Thu, 2 Apr 2020 19:36:55 +0000 (12:36 -0700)]
[TIR] Introduce BufferLoad/Store (#5205)
Co-authored-by: Siyuan Feng <hzfengsy@sjtu.edu.cn>
This PR introduces BufferLoad/Store to TIR. The new nodes will replace
Provide and Call with Tensor arguments in the subsequent refactors.