
[Cpp API Compatibility] add Compat API #78026

Open
Le-soleile wants to merge 42 commits into PaddlePaddle:develop from Le-soleile:217

Conversation

@Le-soleile (Contributor) commented Feb 23, 2026

PR Category

Execute Infrastructure

PR Types

New features

Description

New compatibility APIs added: clamp, clamp_, clamp_max, clamp_max_, clamp_min, clamp_min_, as_strided, as_strided_, as_strided_scatter, std, tensor_data, variable_data, index, index_put_, index_put

Whether this causes precision changes

@paddle-bot (bot) commented Feb 23, 2026

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label Feb 23, 2026
@Le-soleile (Contributor, Author):

/re-run all-failed

1 similar comment
@Le-soleile (Contributor, Author):

/re-run all-failed

paddle::experimental::floor_divide_(
const_cast<PaddleTensor&>(tensor_),
paddle::experimental::full({}, other, other.dtype()));
paddle::experimental::full({}, other, tensor_.dtype()));
Contributor:

This should be other.dtype here.

@codecov-commenter commented Feb 24, 2026

Codecov Report

❌ Patch coverage is 98.53480% with 4 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@acee55a). Learn more about missing BASE report.

Files with missing lines Patch % Lines
...addle/phi/api/include/compat/ATen/ops/as_strided.h 95.55% 2 Missing ⚠️
...ddle/phi/api/include/compat/ATen/core/TensorBody.h 96.15% 1 Missing ⚠️
paddle/phi/api/include/compat/ATen/ops/std.h 98.33% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             develop   #78026   +/-   ##
==========================================
  Coverage           ?   98.53%           
==========================================
  Files              ?        6           
  Lines              ?      273           
  Branches           ?        0           
==========================================
  Hits               ?      269           
  Misses             ?        4           
  Partials           ?        0           

☔ View full report in Codecov by Sentry.

@Le-soleile (Contributor, Author):

/re-run all-failed

@Le-soleile (Contributor, Author):

/re-run all-failed

Copilot AI (Contributor) left a comment:

Pull request overview

This PR adds C++ API compatibility layer implementations for PyTorch/ATen operations to enable easier migration of PyTorch code to PaddlePaddle. It introduces 15 new compatibility APIs including tensor clamping operations, strided tensor views, statistical functions, data access methods, and advanced indexing operations.

Changes:

  • Added compatibility implementations for clamp operations (clamp, clamp_, clamp_max, clamp_max_, clamp_min, clamp_min_) and tensor indexing operator[]
  • Implemented strided tensor view operations (as_strided, as_strided_, as_strided_scatter) for custom memory layouts
  • Added std function wrappers and tensor data access methods (tensor_data, variable_data)
  • Implemented advanced indexing operations (index, index_put_, index_put) with c10::List support
  • Added comprehensive test coverage with 5 new test files

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 14 comments.

Show a summary per file
File Description
test/cpp/compat/CMakeLists.txt Registers 5 new test executables for the compatibility APIs
test/cpp/compat/ATen_clamp_test.cc Tests clamp operations and operator[] with 30+ test cases
test/cpp/compat/ATen_as_strided_test.cc Tests strided tensor view operations
test/cpp/compat/ATen_std_var_test.cc Tests standard deviation and variance calculations
test/cpp/compat/ATen_index_test.cc Tests advanced indexing with optional tensor lists
test/cpp/compat/ATen_tensor_data_test.cc Tests tensor data access and item extraction
paddle/phi/api/include/compat/c10/core/List.h New c10::List wrapper for PyTorch compatibility
paddle/phi/api/include/compat/ATen/ops/std.h Standard deviation implementations using variance calculations
paddle/phi/api/include/compat/ATen/ops/index_put.h Advanced indexing and index_put operations
paddle/phi/api/include/compat/ATen/ops/clamp.h Clamp operations with scalar/tensor bounds and operator[]
paddle/phi/api/include/compat/ATen/core/TensorBody.h Method declarations and inline implementations for new tensor operations
paddle/phi/api/include/compat/ATen/Functions.h Includes new operation headers in main functions header


Comment on lines +212 to +229
inline at::Tensor Tensor::operator[](int64_t index) const {
  // Handle negative index
  int64_t ndim = tensor_.dims().size();
  if (ndim == 0) {
    // Scalar tensor - return as is for any index
    return *this;
  }
  int64_t dim0 = tensor_.dims()[0];
  if (index < 0) {
    index = index + dim0;
  }
  return paddle::experimental::slice(tensor_,
                                     /*axes=*/{0},
                                     /*starts=*/{index},
                                     /*ends=*/{index + 1},
                                     /*infer_flags=*/{1},
                                     /*decrease_axis=*/{0});
}
Copilot AI commented Feb 27, 2026:

The operator[] implementation doesn't validate that the computed index is within bounds after handling negative indices. If the index is still out of bounds after normalization (e.g., index >= dim0 or normalized index < 0), the slice operation may fail or produce unexpected results. Add bounds checking after the negative index handling.

Comment on lines +28 to +95
// Helper function implementations
namespace detail {
inline at::Scalar get_default_min_value(c10::ScalarType dtype) {
  switch (dtype) {
    case c10::ScalarType::Byte:
      return at::Scalar(static_cast<uint8_t>(0));
    case c10::ScalarType::Char:
      return at::Scalar(std::numeric_limits<int8_t>::lowest());
    case c10::ScalarType::Short:
      return at::Scalar(std::numeric_limits<int16_t>::lowest());
    case c10::ScalarType::Int:
      return at::Scalar(std::numeric_limits<int32_t>::lowest());
    case c10::ScalarType::Long:
      return at::Scalar(std::numeric_limits<int64_t>::lowest());
    case c10::ScalarType::UInt16:
      return at::Scalar(static_cast<uint16_t>(0));
    case c10::ScalarType::UInt32:
      return at::Scalar(static_cast<uint32_t>(0));
    case c10::ScalarType::UInt64:
      return at::Scalar(static_cast<uint64_t>(0));
    case c10::ScalarType::Half:
      return at::Scalar(-std::numeric_limits<float>::infinity());
    case c10::ScalarType::Float:
      return at::Scalar(-std::numeric_limits<float>::infinity());
    case c10::ScalarType::Double:
      return at::Scalar(-std::numeric_limits<double>::infinity());
    case c10::ScalarType::BFloat16:
      return at::Scalar(-std::numeric_limits<float>::infinity());
    case c10::ScalarType::Bool:
      return at::Scalar(false);
    default:
      return at::Scalar(-std::numeric_limits<double>::infinity());
  }
}

inline at::Scalar get_default_max_value(c10::ScalarType dtype) {
  switch (dtype) {
    case c10::ScalarType::Byte:
      return at::Scalar(std::numeric_limits<uint8_t>::max());
    case c10::ScalarType::Char:
      return at::Scalar(std::numeric_limits<int8_t>::max());
    case c10::ScalarType::Short:
      return at::Scalar(std::numeric_limits<int16_t>::max());
    case c10::ScalarType::Int:
      return at::Scalar(std::numeric_limits<int32_t>::max());
    case c10::ScalarType::Long:
      return at::Scalar(std::numeric_limits<int64_t>::max());
    case c10::ScalarType::UInt16:
      return at::Scalar(std::numeric_limits<uint16_t>::max());
    case c10::ScalarType::UInt32:
      return at::Scalar(std::numeric_limits<uint32_t>::max());
    case c10::ScalarType::UInt64:
      return at::Scalar(std::numeric_limits<uint64_t>::max());
    case c10::ScalarType::Half:
      return at::Scalar(std::numeric_limits<float>::infinity());
    case c10::ScalarType::Float:
      return at::Scalar(std::numeric_limits<float>::infinity());
    case c10::ScalarType::Double:
      return at::Scalar(std::numeric_limits<double>::infinity());
    case c10::ScalarType::BFloat16:
      return at::Scalar(std::numeric_limits<float>::infinity());
    case c10::ScalarType::Bool:
      return at::Scalar(true);
    default:
      return at::Scalar(std::numeric_limits<double>::infinity());
  }
}
}  // namespace detail
Copilot AI commented Feb 27, 2026:

The detail namespace helper functions get_default_min_value and get_default_max_value are defined but never used in this file. These appear to be dead code that should either be removed or properly integrated if they're intended for future use.

Copilot uses AI. Check for mistakes.
Comment on lines +271 to +284
    const ::std::optional<at::Scalar>& max = ::std::nullopt) const;

at::Tensor& clamp_(const ::std::optional<at::Tensor>& min = {},
                   const ::std::optional<at::Tensor>& max = {}) const;

at::Tensor clamp_max(const at::Scalar& max) const;
at::Tensor clamp_max(const at::Tensor& max) const;
at::Tensor& clamp_max_(const at::Scalar& max) const;
at::Tensor& clamp_max_(const at::Tensor& max) const;

at::Tensor clamp_min(const at::Scalar& min) const;
at::Tensor clamp_min(const at::Tensor& min) const;
at::Tensor& clamp_min_(const at::Scalar& min) const;
at::Tensor& clamp_min_(const at::Tensor& min) const;
Copilot AI commented Feb 27, 2026:

The inplace methods clamp_, clamp_max_, and clamp_min_ are declared as const methods but modify the internal tensor state. These methods should not be const since they perform in-place modifications. The const_cast usage indicates an incorrect API design. Either remove the const qualifier from these methods or implement them correctly as non-const methods.

Suggested change (drop const from the in-place mutators; the non-inplace overloads stay const):
-     const ::std::optional<at::Scalar>& max = ::std::nullopt) const;
- at::Tensor& clamp_(const ::std::optional<at::Tensor>& min = {},
-                    const ::std::optional<at::Tensor>& max = {}) const;
- at::Tensor& clamp_max_(const at::Scalar& max) const;
- at::Tensor& clamp_max_(const at::Tensor& max) const;
- at::Tensor& clamp_min_(const at::Scalar& min) const;
- at::Tensor& clamp_min_(const at::Tensor& min) const;
+     const ::std::optional<at::Scalar>& max = ::std::nullopt);
+ at::Tensor& clamp_(const ::std::optional<at::Tensor>& min = {},
+                    const ::std::optional<at::Tensor>& max = {});
+ at::Tensor& clamp_max_(const at::Scalar& max);
+ at::Tensor& clamp_max_(const at::Tensor& max);
+ at::Tensor& clamp_min_(const at::Scalar& min);
+ at::Tensor& clamp_min_(const at::Tensor& min);
  at::Tensor clamp_max(const at::Scalar& max) const;
  at::Tensor clamp_max(const at::Tensor& max) const;
  at::Tensor clamp_min(const at::Scalar& min) const;
  at::Tensor clamp_min(const at::Tensor& min) const;

Comment on lines +404 to +416
at::Tensor& index_put_(const c10::List<::std::optional<at::Tensor>>& indices,
                       const at::Tensor& values,
                       bool accumulate = false) const;

// index_put_: Set scalar value at specified indices in-place
at::Tensor& index_put_(const c10::List<::std::optional<at::Tensor>>& indices,
                       const at::Scalar& v,
                       bool accumulate = false) const;

// index_put: Non-inplace version of index_put_
at::Tensor index_put(const c10::List<::std::optional<at::Tensor>>& indices,
                     const at::Tensor& values,
                     bool accumulate = false) const;
Copilot AI commented Feb 27, 2026:

The index_put_ methods are declared as const but modify the tensor state in-place. These methods should not be const since they perform in-place modifications. The const_cast pattern used throughout indicates a fundamental const-correctness issue.

Comment on lines +1 to +25
// Copyright (c) 2026 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include <ATen/Functions.h>
#include <ATen/core/TensorBody.h>
#include <ATen/ops/tensor.h>
#include <c10/core/ScalarType.h>
#include <c10/core/TensorOptions.h>
#include <cmath>

#include "ATen/ATen.h"
#include "gtest/gtest.h"
#include "torch/all.h"

Copilot AI commented Feb 27, 2026:

The test files ATen_std_var_test.cc, ATen_as_strided_test.cc, ATen_index_test.cc, and ATen_tensor_data_test.cc don't initialize the tensor operants, but ATen_clamp_test.cc does. This inconsistency may lead to test failures if the other tests also require initialization. Verify whether InitTensorOperants() is needed for all these tests and add it consistently if required.

Comment on lines +316 to +337
const at::Tensor& as_strided_(
    at::IntArrayRef size,
    at::IntArrayRef stride,
    ::std::optional<int64_t> storage_offset = ::std::nullopt) const {
  auto src_impl = tensor_.impl();
  auto* src_tensor =
      std::dynamic_pointer_cast<phi::DenseTensor>(src_impl).get();
  if (!src_tensor) {
    PD_THROW("as_strided_: tensor must be a DenseTensor");
  }
  std::vector<int64_t> size_vec(size.begin(), size.end());
  std::vector<int64_t> stride_vec(stride.begin(), stride.end());
  src_tensor->Resize(common::make_ddim(size_vec));
  src_tensor->set_strides(common::make_ddim(stride_vec));
  int64_t offset = storage_offset.has_value() ? storage_offset.value() : 0;
  if (offset != 0) {
    auto meta = phi::DenseTensorMeta(src_tensor->meta());
    meta.offset = static_cast<size_t>(offset);
    src_tensor->set_meta(meta);
  }
  return *this;
}
Copilot AI commented Feb 27, 2026:

The as_strided_ method is declared as const but modifies the tensor's internal state (size, stride, and metadata). This violates const correctness. The method should either be non-const or should not modify the internal state.

    ::std::optional<int64_t> storage_offset = ::std::nullopt) const {
  at::Tensor strided_view = as_strided(size, stride, storage_offset);
  strided_view.copy_(src);
  return strided_view;
Copilot AI commented Feb 27, 2026:

The as_strided_scatter method calls copy_ on a view, but the result is the view itself, not the original tensor. This means the changes are only reflected in the strided view and won't affect the original tensor that the method was called on. The return value should be a modified copy of the original tensor with the scattered values, not just the view.

Suggested change:
- return strided_view;
+ // Return the original tensor (now containing the scattered values),
+ // rather than the strided view.
+ return *this;

Comment on lines +166 to +202
inline at::Tensor Tensor::clamp_max(const at::Scalar& max) const {
  // Create a tensor with the same shape filled with the max value
  at::Tensor max_tensor = at::full(tensor_.shape(), max, {});
  return clamp_max(max_tensor);
}

inline at::Tensor Tensor::clamp_max(const at::Tensor& max) const {
  return Tensor(paddle::experimental::minimum(tensor_, max._PD_GetInner()));
}

inline at::Tensor& Tensor::clamp_max_(const at::Scalar& max) const {
  // Create a tensor with the same shape filled with the max value
  at::Tensor max_tensor = at::full(tensor_.shape(), max, {});
  return clamp_max_(max_tensor);
}

inline at::Tensor& Tensor::clamp_max_(const at::Tensor& max) const {
  PaddleTensor temp =
      paddle::experimental::minimum(tensor_, max._PD_GetInner());
  const_cast<PaddleTensor&>(tensor_) = temp;
  return const_cast<at::Tensor&>(*this);
}

inline at::Tensor Tensor::clamp_min(const at::Scalar& min) const {
  // Create a tensor with the same shape filled with the min value
  at::Tensor min_tensor = at::full(tensor_.shape(), min, {});
  return clamp_min(min_tensor);
}

inline at::Tensor Tensor::clamp_min(const at::Tensor& min) const {
  return Tensor(paddle::experimental::maximum(tensor_, min._PD_GetInner()));
}

inline at::Tensor& Tensor::clamp_min_(const at::Scalar& min) const {
  // Create a tensor with the same shape filled with the min value
  at::Tensor min_tensor = at::full(tensor_.shape(), min, {});
  return clamp_min_(min_tensor);
Copilot AI commented Feb 27, 2026:

The clamp_max and clamp_min methods with Scalar parameters create a full tensor of the same shape as the input, which is inefficient for large tensors. Consider using element-wise operations with broadcast scalars instead of creating intermediate tensors. This could significantly improve memory usage and performance for large inputs.

Comment on lines +35 to +92
  phi::IntArray dims_int_array(dims_vec);
  paddle::Tensor tensor = self._PD_GetInner();

  paddle::Tensor mean_tensor;
  if (dims_vec.empty()) {
    mean_tensor = paddle::experimental::mean(
        tensor, phi::IntArray(std::vector<int64_t>{}), true);
  } else {
    mean_tensor = paddle::experimental::mean(tensor, dims_int_array, true);
  }

  paddle::Tensor diff = paddle::experimental::subtract(tensor, mean_tensor);
  paddle::Tensor diff_squared = paddle::experimental::multiply(diff, diff);

  paddle::Tensor sum_squared_diff;
  if (dims_vec.empty()) {
    sum_squared_diff =
        paddle::experimental::sum(diff_squared,
                                  phi::IntArray(std::vector<int64_t>{}),
                                  diff_squared.dtype(),
                                  keepdim);
  } else {
    sum_squared_diff = paddle::experimental::sum(
        diff_squared, dims_int_array, diff_squared.dtype(), keepdim);
  }

  int64_t n = tensor.numel();
  if (!dims_vec.empty()) {
    n = 1;
    for (int64_t d : dims_vec) {
      int64_t dim_idx = d < 0 ? d + tensor.dims().size() : d;
      if (dim_idx >= 0 &&
          dim_idx < static_cast<int64_t>(tensor.dims().size())) {
        n *= tensor.dims()[dim_idx];
      }
    }
  }

  double corrected_n = static_cast<double>(n) - correction_value;
  if (corrected_n <= 0.0) {
    corrected_n = static_cast<double>(n);
  }

  std::vector<int64_t> result_shape_vec;
  for (int64_t i = 0; i < sum_squared_diff.dims().size(); ++i) {
    result_shape_vec.push_back(sum_squared_diff.dims()[i]);
  }
  paddle::Tensor correction_scalar =
      paddle::experimental::full(phi::IntArray(result_shape_vec),
                                 phi::Scalar(corrected_n),
                                 sum_squared_diff.dtype(),
                                 sum_squared_diff.place());
  paddle::Tensor variance =
      paddle::experimental::divide(sum_squared_diff, correction_scalar);

  paddle::Tensor result = paddle::experimental::sqrt(variance);

  return Tensor(result);
Copilot AI commented Feb 27, 2026:

The std_impl function has duplicated variance calculation logic that already exists in the var_impl method in TensorBody.h. This code duplication violates the DRY principle and makes maintenance harder. Consider reusing the existing var_impl by calling var() and then taking the sqrt, or extracting the common logic into a shared helper function.

Suggested change (the removed lines duplicate the snippet quoted above; replace them with):
+ // Reuse existing variance implementation to avoid duplicating logic.
+ at::OptionalIntArrayRef dims_opt;
+ if (!dims_vec.empty()) {
+   dims_opt = at::OptionalIntArrayRef(dims_vec);
+ }
+ ::std::optional<at::Scalar> correction_opt;
+ correction_opt.emplace(correction_value);
+ Tensor var_tensor = self.var(dims_opt, correction_opt, keepdim);
+ paddle::Tensor var_inner = var_tensor._PD_GetInner();
+ paddle::Tensor std_inner = paddle::experimental::sqrt(var_inner);
+ return Tensor(std_inner);

} // namespace at

namespace torch {} // namespace torch
Member:

Can this line just be deleted directly?

Comment on lines +212 to +229
inline at::Tensor Tensor::operator[](int64_t index) const {
  // Handle negative index
  int64_t ndim = tensor_.dims().size();
  if (ndim == 0) {
    // Scalar tensor - return as is for any index
    return *this;
  }
  int64_t dim0 = tensor_.dims()[0];
  if (index < 0) {
    index = index + dim0;
  }
  return paddle::experimental::slice(tensor_,
                                     /*axes=*/{0},
                                     /*starts=*/{index},
                                     /*ends=*/{index + 1},
                                     /*infer_flags=*/{1},
                                     /*decrease_axis=*/{0});
}
Member:

Odd, why is operator[] placed in this file? Is there any problem with keeping it in its original file?

Contributor (Author):

I'll move it back then. While writing clamp I noticed a few lines of implementation here and moved them over on the fly; thinking about it now, that was indeed inappropriate.

#include "paddle/phi/common/place.h"

#ifdef PADDLE_WITH_CUDA
#include "paddle/phi/backends/gpu/forwards.h"
Member:

Which function specifically uses this?

Contributor (Author):

[screenshot omitted]

This function uses cudaStream_t&.
forwards.h, line 27: using cudaStream_t = struct CUstream_st *;

Member:

But this record_stream wasn't added by this PR, was it?

Contributor (Author):

Right, it wasn't added in this PR.

Member:

Then why does this PR add this include? If it were a problem, it should have failed back when it was originally added.

Contributor (Author):

Right, this interface wasn't added by me; I'll remove forwards.h for now. Indeed, since nothing failed without the include, it should be fine.

  at::Tensor strided_view = as_strided(size, stride, storage_offset);
  strided_view.copy_(src);
  return strided_view;
}
Member:

Shouldn't the implementations of these as_strided functions also be placed in the ops directory?

Contributor (Author):

Fixed.

@Le-soleile (Contributor, Author):

/re-run all-failed

@Le-soleile (Contributor, Author):

/re-run all-failed

1 similar comment
@Le-soleile (Contributor, Author):

/re-run all-failed

}

TEST(TensorAsStridedTest, AsStridedInplace) {
at::Tensor t = at::arange(12, at::kFloat);
Contributor:

Can this test actually detect the inplace behavior?

Contributor (Author):

Coverage added.

at::Tensor t = at::arange(12, at::kFloat);
at::Tensor result = t.as_strided({2, 3}, {3, 1}, 2);

ASSERT_EQ(result.sizes(), c10::IntArrayRef({2, 3}));
Contributor:

Can you explain the offset test here? It looks like nothing is changed.

Contributor (Author):

Coverage added.

at::Tensor t = at::arange(12, at::kFloat);
t.as_strided_({2, 3}, {3, 1}, 1);

ASSERT_EQ(t.sizes(), c10::IntArrayRef({2, 3}));
Contributor:

What functionality does this verify?

Contributor (Author):

  1. Inplace operation: after calling as_strided_, the tensor itself is modified.
  2. Memory sharing: the data pointer is unchanged (modified in place).
  3. Offset: the data starts at index 1 (not 0).

at::Tensor src = at::full({2, 3}, 99.0f, at::kFloat);
at::Tensor result = t.as_strided_scatter(src, {2, 3}, {3, 1});

ASSERT_EQ(result.sizes(), c10::IntArrayRef({2, 3}));
Contributor:

What functionality does this verify?

Contributor (Author):

  1. Output shape: the returned tensor has shape {2, 3}.
  2. Scatter write: src's data (99) is written into the specified positions of result.


// ======================== index_put_ tests ========================

TEST(TensorIndexPutTest, IndexPutInplaceWithTensor) {
Contributor:

How is the inplace behavior verified?

Contributor (Author):

Added ASSERT_EQ(t.data_ptr(), original_data_ptr) to verify in-place behavior via the data pointer.

@Le-soleile (Contributor, Author):

/re-run all-failed

@Le-soleile (Contributor, Author):

/re-run all-failed


Labels

contributor External developers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants