There is an unnecessary copy of tensor data in the FP16 case.
It seems that when this code was originally developed, the tensor structure was not yet accurately established, so the save path forcibly converted the data to FP16 element by element before writing.
Now getData<_FP16>() already returns the data as _FP16, so copying every element one by one into a temporary array is unnecessary and only slows saving down; see the sketch below.
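For clarity, here is a minimal, self-contained sketch of the before/after pattern. The names `fp16`, `save_with_copy`, and `save_direct` are illustrative stand-ins; in the actual code the buffer comes from `Tensor::getData<_FP16>()` and the write goes through `checkedWrite()`.

```cpp
// Sketch only: fp16 is a placeholder for nntrainer's _FP16 storage type,
// and std::ofstream::write stands in for checkedWrite().
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <vector>

using fp16 = std::uint16_t; // placeholder for _FP16

// Before: an extra O(n) element-wise copy into a temp buffer,
// even though the source buffer is already laid out as FP16.
void save_with_copy(std::ofstream &file, const fp16 *data, std::size_t n) {
  std::vector<fp16> temp(n);
  for (std::size_t i = 0; i < n; ++i)
    temp[i] = data[i]; // redundant copy; data is already fp16
  file.write(reinterpret_cast<const char *>(temp.data()),
             static_cast<std::streamsize>(n * sizeof(fp16)));
}

// After: write the tensor's own buffer directly; no temp allocation.
void save_direct(std::ofstream &file, const fp16 *data, std::size_t n) {
  file.write(reinterpret_cast<const char *>(data),
             static_cast<std::streamsize>(n * sizeof(fp16)));
}
```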
**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped
Signed-off-by: Donghak PARK <donghak.park@samsung.com>
"[Tensor::save] operation failed");
} else if (this->getDataType() == ml::train::TensorDim::DataType::FP16) {
#ifdef ENABLE_FP16
- std::vector<_FP16> temp(size());
- for (unsigned int i = 0; i < size(); ++i) {
- temp[i] = static_cast<_FP16>(getData<_FP16>()[i]);
- }
-
- checkedWrite(file, (char *)temp.data(),
+ checkedWrite(file, (char *)getData<_FP16>(),
static_cast<std::streamsize>(size() * sizeof(_FP16)),
"[Tensor::save] operation failed");
#else