Tensor's `CopyFrom` does not distinguish between the CPU and GPU devices, which leads to redundant lookups of the device context from the global pool.
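
To illustrate the problem, here is a minimal sketch of one way the copy path could branch on the device. It assumes the redundant lookup happens on CPU-to-CPU copies, which the description does not spell out. `Place`, `DeviceContext`, `DeviceContextPool`, `Tensor`, and this `CopyFrom` signature are hypothetical stand-ins written for this example, not the framework's actual types or API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only.
enum class Place { kCPU, kGPU };

struct DeviceContext {};

// Hypothetical global pool that counts how often a context is requested.
class DeviceContextPool {
 public:
  static DeviceContextPool& Instance() {
    static DeviceContextPool pool;
    return pool;
  }
  const DeviceContext& Get(Place /*place*/) {
    ++lookups_;
    return ctx_;
  }
  int lookups() const { return lookups_; }

 private:
  DeviceContext ctx_;
  int lookups_ = 0;
};

struct Tensor {
  Place place = Place::kCPU;
  std::vector<float> data;
};

// Copies src into dst, touching the global pool only when a GPU is involved.
void CopyFrom(const Tensor& src, Place dst_place, Tensor* dst) {
  assert(dst != nullptr);
  dst->place = dst_place;

  if (src.place == Place::kCPU && dst_place == Place::kCPU) {
    // Pure host copy: no device context is required, so skip the lookup.
    dst->data = src.data;
    return;
  }

  // At least one side is a GPU: fetch the context once and use it.
  const DeviceContext& ctx = DeviceContextPool::Instance().Get(Place::kGPU);
  (void)ctx;  // A real implementation would issue the device copy with ctx.
  dst->data = src.data;  // Host copy stands in for the device transfer here.
}
```

With this shape, a CPU-to-CPU `CopyFrom` never queries the pool, while any copy involving the GPU queries it exactly once.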