Undefined behavior with memset of std::string to 0 (#18703)
author Eli Amesefe <eliamesefe@fb.com>
Tue, 2 Apr 2019 17:07:22 +0000 (10:07 -0700)
committer Facebook Github Bot <facebook-github-bot@users.noreply.github.com>
Tue, 2 Apr 2019 17:10:11 +0000 (10:10 -0700)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18703

`zeroPtr` sometimes points at the data of a `std::string` tensor. `std::string` is not trivially copyable, so overwriting its bytes with `memset` to 0 is undefined behavior.

This might be accidentally safe with `std::string` implementations that use SSO (Small String Optimization), but can crash otherwise.
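
As a rough illustration of the distinction above (not the code in this change), the sketch below zeroes raw storage with `memset` only when the element type is trivially copyable, and otherwise value-initializes each element with placement new. The helpers `zero_fill`, `destroy_all`, and `demo_zero_fill` are hypothetical names invented for this example and are not part of caffe2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <string>
#include <type_traits>

// Hypothetical helper (not a caffe2 API): bring `count` elements of type T
// in suitably aligned raw storage into a well-defined "zero" state.
template <typename T>
T* zero_fill(void* raw, std::size_t count) {
  if (std::is_trivially_copyable<T>::value) {
    // Plain data (int, float, char, ...): zeroing the bytes is well defined.
    std::memset(raw, 0, count * sizeof(T));
    return static_cast<T*>(raw);
  }
  // Composite types such as std::string: run the constructor so the
  // object's internal invariants hold. An empty string is its "zero".
  T* first = static_cast<T*>(raw);
  for (std::size_t i = 0; i < count; ++i) {
    new (first + i) T();
  }
  return first;
}

// Hypothetical helper: end the lifetime of elements created by zero_fill.
template <typename T>
void destroy_all(T* first, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    first[i].~T();
  }
}

// Small self-check that exercises both branches.
bool demo_zero_fill() {
  alignas(int) unsigned char int_buf[3 * sizeof(int)];
  int* ints = zero_fill<int>(int_buf, 3);
  assert(ints[0] == 0 && ints[2] == 0);

  alignas(std::string) unsigned char str_buf[2 * sizeof(std::string)];
  std::string* strs = zero_fill<std::string>(str_buf, 2);
  assert(strs[0].empty() && strs[1].empty());

  // Assigning a long value forces a heap allocation on typical
  // implementations; this is safe because the object was constructed.
  strs[1] = "a value long enough that it will not fit in an inline buffer";
  bool ok = !strs[1].empty();

  destroy_all(strs, 2);
  return ok;
}
```

Calling `demo_zero_fill()` runs both paths. The change itself takes the simpler route of skipping the `memset` when the dtype matches `std::string`, and leaves other composite types as a TODO.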

Reviewed By: zheng-xq

Differential Revision: D14714458

fbshipit-source-id: 012a18464e6514d38ff791509b88ddc3fc55b2b1

caffe2/operators/sequence_ops.cc

index dfb01ad..164443c 100644
@@ -203,7 +203,10 @@ bool PadEmptySamplesOp<CPUContext>::RunOnDevice() {
     Tensor zero{CPU};
     zero.Resize(block_size);
     auto zeroPtr = static_cast<char*>(zero.raw_mutable_data(features.dtype()));
-    memset(zeroPtr, 0, zero.nbytes());
+    // TODO Handle other composite types, such as vector<...>
+    if (!features.dtype().Match<std::string>()) {
+      memset(zeroPtr, 0, zero.nbytes());
+    }
     int start_dest = 0;
     int start_src = 0;
     for (int i = 0; i < lengths.numel(); ++i) {