From 6841537933a8099e8ddcdfdd5f4b2f1101989990 Mon Sep 17 00:00:00 2001 From: Edward Yang Date: Sun, 31 Mar 2019 19:08:03 -0700 Subject: [PATCH] Delete duplicated technical content from contribution_guide.rst (#18628) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18628 ghimport-source-id: d94b81a6f303883d97beaae25344fd591e13ce52 Stack from [ghstack](https://github.com/ezyang/ghstack): * #18629 Provide flake8 install instructions. * **#18628 Delete duplicated technical content from contribution_guide.rst** There's useful guide in contributing_guide.rst, but the technical bits were straight up copy-pasted from CONTRIBUTING.md, and I don't think it makes sense to break the CONTRIBUTING.md link. Instead, I deleted the duplicate bits and added a cross reference to the rst document. Signed-off-by: Edward Z. Yang Differential Revision: D14701003 fbshipit-source-id: 3bbb102fae225cbda27628a59138bba769bfa288 --- CONTRIBUTING.md | 6 +- docs/source/community/contribution_guide.rst | 565 --------------------------- 2 files changed, 3 insertions(+), 568 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 594e668..5931345 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -14,9 +14,9 @@ into two categories: Once you finish implementing a feature or bug-fix, please send a Pull Request to https://github.com/pytorch/pytorch -If you are not familiar with creating a Pull Request, here are some guides: -- http://stackoverflow.com/questions/14680711/how-to-do-a-github-pull-request -- https://help.github.com/articles/creating-a-pull-request/ +This document covers some of the more technical aspects of contributing +to PyTorch. For more non-technical guidance about how to contribute to +PyTorch, see the [Contributing Guide](docs/source/community/contribution_guide.rst). ## Developing PyTorch diff --git a/docs/source/community/contribution_guide.rst b/docs/source/community/contribution_guide.rst index 8b95097..75fed3b 100644 --- a/docs/source/community/contribution_guide.rst +++ b/docs/source/community/contribution_guide.rst @@ -350,568 +350,3 @@ Contributing a new Tutorial `PyTorch.org Tutorial Contribution Guide `__ - -Code Style -~~~~~~~~~~ - -**Python style** - -**C++ style** - -Submitting a Pull Request -~~~~~~~~~~~~~~~~~~~~~~~~~ - -PyTorch development happens publicly on our Github repo. - -To have your feature or fix added to PyTorch, please submit a Pull -Request. - -Running Tests -~~~~~~~~~~~~~ - -Show examples for running all tests, just one individual... - -Technical Process ------------------ - -Developing PyTorch -~~~~~~~~~~~~~~~~~~ - -To develop PyTorch on your machine, here are some tips: - -1. Uninstall all existing PyTorch installs: - -:: - - conda uninstall pytorch - pip uninstall torch - pip uninstall torch # run this command twice - -2. Clone a copy of PyTorch from source: - -:: - - git clone https://github.com/pytorch/pytorch - cd pytorch - -3. Install PyTorch in ``build develop`` mode: - -A full set of instructions on installing PyTorch from source is here: -https://github.com/pytorch/pytorch#from-source - -The change you have to make is to replace - -:: - - python setup.py install - -with - -:: - - python setup.py build develop - -This is especially useful if you are only changing Python files. - -This mode will symlink the Python files from the current local source -tree into the Python install. - -Hence, if you modify a Python file, you do not need to reinstall PyTorch -again and again. - -For example: - -- Install local PyTorch in ``build develop`` mode -- modify your Python file ``torch/__init__.py`` (for example) -- test functionality -- modify your Python file ``torch/__init__.py`` -- test functionality -- modify your Python file ``torch/__init__.py`` -- test functionality - -You do not need to repeatedly install after modifying Python files. - -In case you want to reinstall, make sure that you uninstall PyTorch -first by running ``pip uninstall torch`` and ``python setup.py clean``. -Then you can install in ``build develop`` mode again. - -Codebase structure ------------------- - -- `c10 `__ - Core - library files that work everywhere, both server and mobile. We are - slowly moving pieces from - `ATen/core `__ - here. This library is intended only to contain essential - functionality, and appropriate to use in settings where binary size - matters. (But you'll have a lot of missing functionality if you try - to use it directly.) -- `aten `__ - C++ - tensor library for PyTorch (no autograd support) - - - `src `__ - - - `TH `__ - `THC `__ - `THNN `__ - `THCUNN `__ - - Legacy library code from the original Torch. Try not to add - things here; we're slowly porting these to - `native `__. - - - generic - Contains actual implementations of operators, - parametrized over ``scalar_t``. Files here get compiled N - times per supported scalar type in PyTorch. - - - `ATen `__ - - - `core `__ - - Core functionality of ATen. This is migrating to top-level - c10 folder. - - `native `__ - - Modern implementations of operators. If you want to write - a new operator, here is where it should go. Most CPU - operators go in the top level directory, except for - operators which need to be compiled specially; see cpu - below. - - - `cpu `__ - - Not actually CPU implementations of operators, but - specifically implementations which are compiled with - processor-specific instructions, like AVX. See the - `README `__ - for more details. - - `cuda `__ - - CUDA implementations of operators. - - `sparse `__ - - CPU and CUDA implementations of COO sparse tensor - operations - - `mkl `__ - `mkldnn `__ - `miopen `__ - `cudnn `__ - - - implementations of operators which simply bind to some - backend library. - -- `torch `__ - - The actual PyTorch library. Everything that is not in - `csrc `__ - is a Python module, following the PyTorch Python frontend module - structure. - - - `csrc `__ - - C++ files composing the PyTorch library. Files in this directory - tree are a mix of Python binding code, and C++ heavy lifting. - Consult ``setup.py`` for the canonical list of Python binding - files; conventionally, they are often prefixed with ``python_``. - - - `jit `__ - - Compiler and frontend for TorchScript JIT frontend. - - `autograd `__ - - Implementation of reverse-mode automatic differentiation. - - `api `__ - - The PyTorch C++ frontend. - - `distributed `__ - - Distributed training support for PyTorch. - -- `tools `__ - - Code generation scripts for the PyTorch library. See - `README `__ - of this directory for more details. -- `test `__ - - Python unit tests for PyTorch Python frontend. - - - `test\_torch.py `__ - - Basic tests for PyTorch functionality. - - `test\_autograd.py `__ - - Tests for non-NN automatic differentiation support. - - `test\_nn.py `__ - - Tests for NN operators and their automatic differentiation. - - `test\_jit.py `__ - - Tests for the JIT compiler and TorchScript. - - ... - - `cpp `__ - - C++ unit tests for PyTorch C++ frontend. - - `expect `__ - - Automatically generated "expect" files which are used to compare - against expected output. - - `onnx `__ - - Tests for ONNX export functionality, using both PyTorch and - Caffe2. - -- `caffe2 `__ - - The Caffe2 library. - - - `core `__ - - Core files of Caffe2, e.g., tensor, workspace, blobs, etc. - - `operators `__ - - Operators of Caffe2. - - `python `__ - - Python bindings to Caffe2. - - ... - -Unit Testing ------------- - -PyTorch's testing is located under ``test/``. Run the entire test suite -with - -:: - - python test/run_test.py - -or run individual test files, like ``python test/test_nn.py``, for -individual test suites. - -Better local unit tests with pytest -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -We don't officially support ``pytest``, but it works well with our -``unittest`` tests and offers a number of useful features for local -developing. Install it via ``pip install pytest``. - -If you want to just run tests that contain a specific substring, you can -use the ``-k`` flag: - -:: - - pytest test/test_nn.py -k Loss -v - -The above is an example of testing a change to Loss functions: this -command runs tests such as ``TestNN.test_BCELoss``\ and -``TestNN.test_MSELoss`` and can be useful to save keystrokes. - -Writing documentation ---------------------- - -PyTorch uses `Google -style `__ -for formatting docstrings. Length of line inside docstrings block must -be limited to 80 characters to fit into Jupyter documentation popups. - -For C++ documentation (https://pytorch.org/cppdocs), we use -`Doxygen `__ and then convert it to -`Sphinx `__ via -`Breathe `__ -and\ `Exhale `__. Check the `Doxygen -reference `__ -for more information on the documentation syntax. To build the -documentation locally, ``cd`` into ``docs/cpp`` and then ``make html``. - -We run Doxygen in CI (Travis) to verify that you do not use invalid -Doxygen commands. To run this check locally, run ``./check-doxygen.sh`` -from inside ``docs/cpp``. - -Managing multiple build trees ------------------------------ - -One downside to using ``python setup.py develop`` is that your -development version of PyTorch will be installed globally on your -account (e.g., if you run ``import torch`` anywhere else, the -development version will be used. - -If you want to manage multiple builds of PyTorch, you can make use of -`conda environments `__ to -maintain separate Python package environments, each of which can be tied -to a specific build of PyTorch. To set one up: - -:: - - conda create -n pytorch-myfeaturesource activate pytorch-myfeature# if you run python now, torch will NOT be installed - python setup.py build develop - -C++ Development tips --------------------- - -If you are working on the C++ code, there are a few important things -that you will want to keep in mind: - -1. How to rebuild only the code you are working on. -2. How to make rebuilds in the absence of changes go faster. - -Build only what you need. -~~~~~~~~~~~~~~~~~~~~~~~~~ - -``python setup.py build`` will build everything, but since our build -system is not very optimized for incremental rebuilds, this will -actually be very slow. Far better is to only request rebuilds of the -parts of the project you are working on: - -- Working on the Python bindings? Run ``python setup.py develop`` to - rebuild (NB: no ``build`` here!) -- Working on ``torch/csrc`` or ``aten``? Run - ``python setup.py rebuild_libtorch`` to rebuild and avoid having to - rebuild other dependent libraries we depend on. -- Working on one of the other dependent libraries? The other valid - targets are listed in ``dep_libs`` in ``setup.py``. prepend - ``build_`` to get a target, and run as e.g. - ``python setup.py build_gloo``. -- Working on a test binary? Run - ``(cd build && ninja bin/test_binary_name)`` to rebuild only that - test binary (without rerunning cmake). (Replace ``ninja`` with - ``make`` if you don't have ninja installed). - -On the initial build, you can also speed things up with the environment -variables ``DEBUG`` and ``NO_CUDA``. - -- ``DEBUG=1`` will enable debug builds (-g -O0) -- ``REL_WITH_DEB_INFO=1`` will enable debug symbols with optimizations - (-g -O3) -- ``NO_CUDA=1`` will disable compiling CUDA (in case you are developing - on something not CUDA related), to save compile time. - -For example: - -:: - - NO_CUDA=1 DEBUG=1 python setup.py build develop - -Make sure you continue to pass these flags on subsequent builds. - -Code completion and IDE support -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -When using ``python setup.py develop``, PyTorch will generate a -``compile_commands.json`` file that can be used by many editors to -provide command completion and error highlighting for PyTorch's C++ -code. You need to ``pip install ninja`` to generate accurate information -for the code in ``torch/csrc``. More information at: - -- https://sarcasm.github.io/notes/dev/compilation-database.html - -Make no-op build fast. -~~~~~~~~~~~~~~~~~~~~~~ - -Use Ninja -~~~~~~~~~ - -Python ``setuptools`` is pretty dumb, and always rebuilds every C file -in a project. If you install the ninja build system with -``pip install ninja``, then PyTorch will use it to track dependencies -correctly. If PyTorch was already built, you will need to run -``python setup.py clean`` once after installing ninja for builds to -succeed. - -Use CCache -~~~~~~~~~~ - -Even when dependencies are tracked with file modification, there are -many situations where files get rebuilt when a previous compilation was -exactly the same. - -Using ccache in a situation like this is a real time-saver. However, by -default, ccache does not properly support CUDA stuff, so here are the -instructions for installing a custom ccache fork that has CUDA support: - -:: - - # install and export ccacheif ! ls ~/ccache/bin/ccachethen - sudo apt-get update - sudo apt-get install -y automake autoconf - sudo apt-get install -y asciidoc - mkdir -p ~/ccache - pushd /tmp - rm -rf ccache - git clone https://github.com/colesbury/ccache -b ccbin - pushd ccache - ./autogen.sh - ./configure - make install prefix=~/ccache - popdpopd - - mkdir -p ~/ccache/lib - mkdir -p ~/ccache/cuda - ln -s ~/ccache/bin/ccache ~/ccache/lib/cc - ln -s ~/ccache/bin/ccache ~/ccache/lib/c++ - ln -s ~/ccache/bin/ccache ~/ccache/lib/gcc - ln -s ~/ccache/bin/ccache ~/ccache/lib/g++ - ln -s ~/ccache/bin/ccache ~/ccache/cuda/nvcc - - ~/ccache/bin/ccache -M 25Gifiexport PATH=~/ccache/lib:$PATHexport CUDA_NVCC_EXECUTABLE=~/ccache/cuda/nvcc - -CUDA Development tips ---------------------- - -If you are working on the CUDA code, here are some useful CUDA debugging -tips: - -1. ``CUDA_DEVICE_DEBUG=1`` will enable CUDA device function debug - symbols (``-g -G``). This will be particularly helpful in debugging - device code. However, it will slow down the build process for about - 50% (compared to only ``DEBUG=1``), so use wisely. -2. ``cuda-gdb`` and ``cuda-memcheck`` are your best CUDA debugging - friends. Unlike\ ``gdb``, ``cuda-gdb`` can display actual values in a - CUDA tensor (rather than all zeros). - -Hope this helps, and thanks for considering to contribute. - -Windows development tips ------------------------- - -Occasionally, you will write a patch which works on Linux, but fails CI -on Windows. There are a few aspects in which MSVC (the Windows compiler -toolchain we use) is stricter than Linux, which are worth keeping in -mind when fixing these problems. - -1. Symbols are NOT exported by default on Windows; instead, you have to - explicitly mark a symbol as exported/imported in a header file with - ``__declspec(dllexport)`` / ``__declspec(dllimport)``. We have - codified this pattern into a set of macros which follow the - convention ``*_API``, e.g., ``CAFFE2_API`` inside Caffe2 and ATen. - (Every separate shared library needs a unique macro name, because - symbol visibility is on a per shared library basis. See - c10/macros/Macros.h for more details.) The upshot is if you see an - "unresolved external" error in your Windows build, this is probably - because you forgot to mark a function with ``*_API``. However, there - is one important counterexample to this principle: if you want a - *templated* function to be instantiated at the call site, do NOT mark - it with ``*_API`` (if you do mark it, you'll have to explicitly - instantiate all of the specializations used by the call sites.) -2. If you link against a library, this does not make its dependencies - transitively visible. You must explicitly specify a link dependency - against every library whose symbols you use. (This is different from - Linux where in most environments, transitive dependencies can be used - to fulfill unresolved symbols.) -3. If you have a Windows box (we have a few on EC2 which you can request - access to) and you want to run the build, the easiest way is to just - run ``.jenkins/pytorch/win-build.sh``. If you need to rebuild, run - ``REBUILD=1 .jenkins/pytorch/win-build.sh`` (this will avoid blowing - away your Conda environment.) - -Even if you don't know anything about MSVC, you can use cmake to build -simple programs on Windows; this can be helpful if you want to learn -more about some peculiar linking behavior by reproducing it on a small -example. Here's a simple example cmake file that defines two dynamic -libraries, one linking with the other: - -:: - - project(myproject CXX)set(CMAKE_CXX_STANDARD 11)add_library(foo SHARED foo.cpp)add_library(bar SHARED bar.cpp)# NB: don't forget to __declspec(dllexport) at least one symbol from foo,# otherwise foo.lib will not be created.target_link_libraries(bar PUBLIC foo) - -You can build it with: - -:: - - mkdir buildcd build - cmake .. - cmake --build . - -Known MSVC (and MSVC with NVCC) bugs -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The PyTorch codebase sometimes likes to use exciting C++ features, and -these exciting features lead to exciting bugs in Windows compilers. To -add insult to injury, the error messages will often not tell you which -line of code actually induced the erroring template instantiation. We've -found the most effective way to debug these problems is to carefully -read over diffs, keeping in mind known bugs in MSVC/NVCC. Here are a few -well known pitfalls and workarounds: - -- This is not actually a bug per se, but in general, code generated by - MSVC is more sensitive to memory errors; you may have written some - code that does a use-after-free or stack overflows; on Linux the code - might work, but on Windows your program will crash. ASAN may not - catch all of these problems: stay vigilant to the possibility that - your crash is due to a real memory problem. -- (NVCC) ``c10::optional`` does not work when used from device code. - Don't use it from kernels. Upstream issue: - https://github.com/akrzemi1/Optional/issues/58 and our local issue - #10329. -- ``constexpr`` generally works less well on MSVC. - - - The idiom ``static_assert(f() == f())`` to test if ``f`` is - constexpr does not work; you'll get "error C2131: expression did - not evaluate to a constant". Don't use these asserts on Windows. - (Example: ``c10/util/intrusive_ptr.h``) - -- (NVCC) Code you access inside a ``static_assert`` will eagerly be - evaluated as if it were device code, and so you might get an error - that the code is "not accessible". - -:: - - class A { - static A singleton_; - static constexpr inline A* singleton() { - return &singleton_; - } - };static_assert(std::is_same(A*, decltype(A::singleton()))::value, "hmm"); - -- The compiler will run out of heap space if you attempt to compile - files that are too large. Splitting such files into separate files - helps. (Example: ``THTensorMath``, ``THTensorMoreMath``, - ``THTensorEvenMoreMath``.) -- MSVC's preprocessor (but not the standard compiler) has a bug where - it incorrectly tokenizes raw string literals, ending when it sees a - ``"``. This causes preprocessor tokens inside the literal like - an\ ``#endif`` to be incorrectly treated as preprocessor directives. - See https://godbolt.org/z/eVTIJq as an example. - -Running Clang-Tidy -~~~~~~~~~~~~~~~~~~ - -`Clang-Tidy `__ is a -C++ linter and static analysis tool based on the clang compiler. We run -clang-tidy in our CI to make sure that new C++ code is safe, sane and -efficient. See our -`.travis.yml `__ -file for the simple commands we use for this. To run clang-tidy locally, -follow these steps: - -1. Install clang-tidy. First, check if you already have clang-tidy by - simply writing ``clang-tidy`` in your terminal. If you don't yet have - clang-tidy, you should be able to install it easily with your package - manager, e.g. by writing ``apt-get install clang-tidy`` on Ubuntu. - See `https://apt.llvm.org `__ for details on - how to install the latest version. Note that newer versions of - clang-tidy will have more checks than older versions. In our CI, we - run clang-tidy-6.0. -2. Use our driver script to run clang-tidy over any changes relative to - some git revision (you may want to replace ``HEAD~1`` with ``HEAD`` - to pick up uncommitted changes). Changes are picked up based on a - ``git diff`` with the given revision: - -:: - - python tools/clang_tidy.py -d build -p torch/csrc --diff 'HEAD~1' - -Above, it is assumed you are in the PyTorch root folder. -``path/to/build`` should be the path to where you built PyTorch from -source, e.g. ``build`` in the PyTorch root folder if you used -``setup.py build``. You can use ``-c ``\ to change -the clang-tidy this script uses. Make sure you have PyYaml installed, -which is in PyTorch's ``requirements.txt``. - -Pre-commit Tidy/Linting Hook -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -We use clang-tidy and flake8 to perform additional formatting and -semantic checking of code. We provide a pre-commit git hook for -performing these checks, before a commit is created: - -:: - - ln -s ../../tools/git-pre-commit .git/hooks/pre-commit - -Caffe2 notes ------------- - -In 2018, we merged Caffe2 into the PyTorch source repository. While the -steady state aspiration is that Caffe2 and PyTorch share code freely, in -the meantime there will be some separation. If you submit a PR to only -PyTorch or only Caffe2 code, CI will only run for the project you -edited. The logic for this is implemented in -``.jenkins/pytorch/dirty.sh`` and ``.jenkins/caffe2/dirty.sh``; you can -look at this to see what path prefixes constitute changes. This also -means if you ADD a new top-level path, or you start sharing code between -projects, you need to modify these files. There are a few "unusual" -directories which, for historical reasons, are Caffe2/PyTorch specific. -Here they are: - -- ``CMakeLists.txt``, ``Makefile``, ``binaries``, ``cmake``, ``conda``, - ``modules``, ``scripts`` are Caffe2-specific. Don't put PyTorch code - in them without extra coordination. -- ``mypy*``, ``requirements.txt``, ``setup.py``, ``test``, ``tools`` - are PyTorch-specific. Don't put Caffe2 code in them without extra - coordination. -- 2.7.4