Skip to content

Conversation

@anandamideShakyan
Copy link
Collaborator

PR Description:

Detect and address input/output errors that can result in undefined behavior. Ensure proper error handling for I/O functions that may fail and leave variables uninitialized. Neglecting to check the status of these functions before using their outputs (e.g., memory buffers, file descriptors, etc.) can lead to undefined program behavior. This update enforces checks on commonly used I/O functions to validate their return values and prevent improper usage.

This is the second PR of the several changes for this refactoring.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 11, 2025
@netlify
Copy link

netlify bot commented Jan 11, 2025

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 3e86148
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/67fe14798368710008b038e5

target_compile_definitions(dbgen PRIVATE DBNAME=dss MAC ORACLE TPCH)
target_include_directories(dbgen PRIVATE include)

target_link_libraries(dbgen PUBLIC Folly::folly)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be INTERFACE instead of PUBLIC. We don;t actually need to link to Folly here. We only want it to get to the header for the macro FOLLY_UNLIKELY.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry that's actually not true, I was unsure and made a little test, see below.

The a folly header is included in a header but also an implementation file so we need to link as PUBLIC to make folly's include dir available both to targets depending on this target (aka anyone using this header) and when building build.cpp.

INTERFACE will not add folly's include dir to the compiler invocation, only to the one of targets:

cmake_minimum_required(VERSION 3.20)

project(test)
set(CMAKE_EXPORT_COMPILE_COMMANDS  ON)
add_library(main a.cpp)
add_library(lib_a INTERFACE)
target_include_directories(lib_a INTERFACE ${CMAKE_CURRENT_LIST_DIR}/include)

target_link_libraries(main INTERFACE lib_a)

INTERFACE

[
{
  "directory": "~/scratch",
  "command": "/usr/bin/c++    -o CMakeFiles/main.dir/a.cpp.o -c ~/scratch/a.cpp",
  "file": "~/scratch/a.cpp",
  "output": "CMakeFiles/main.dir/a.cpp.o"
}
]

PUBLIC

[
{
  "directory": "~/scratch",
  "command": "/usr/bin/c++  -I~/scratch/include  -o CMakeFiles/main.dir/a.cpp.o -c ~/scratch/a.cpp",
  "file": "~/scratch/a.cpp",
  "output": "CMakeFiles/main.dir/a.cpp.o"
}
]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted

#define PR_DATE(tgt, yr, mn, dy) \
do { \
int res = sprintf(tgt, "19%02ld-%02ld-%02ld", yr, mn, dy); \
if(FOLLY_UNLIKELY(res < 0)) { \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come the header for this macro include isn't needed?
It seems to have been needed in the build.cpp file.
How come? Is it not actually needed everywhere because it comes in from some other include?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be coming from some other file, I will include it explicitly.

sprintf(target + 3, "%03d", static_cast<int>(acode));
sprintf(target + 7, "%03d", static_cast<int>(exchg));
sprintf(target + 11, "%04d", static_cast<int>(number));
int res = sprintf(target, "%02d", static_cast<int>(10 + (ind % NATIONS_MAX)));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use fmt::format instead of the complications with sprintf ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, but still we need to convert std::string back to c style right and store the output in target.

#define PR_DATE(tgt, yr, mn, dy) sprintf(tgt, "%02d-%02d-19%02d", mn, dy, yr)
#define PR_DATE(tgt, yr, mn, dy) \
do { \
int res = sprintf(tgt, "19%02ld-%02ld-%02ld", yr, mn, dy); \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fmt::format would be simpler here as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One doubt though, fmt::format returns a cpp string whereas tgt is supposed to be a c style null terminated string(char array) . So do you want me to use fmt::format followed by strncpy as in :

#define PR_DATE(tgt, yr, mn, dy) \
do { \
std::string formatted = fmt::format("{:02d}-{:02d}-19{:02d}", mn, dy, yr); \
std::strncpy(tgt, formatted.c_str(), formatted.size() + 1); \
tgt[formatted.size() + 1] = '\0'
} while (0)

public:
explicit AggregateSpillBenchmarkBase(std::string spillerType)
: spillerType_(spillerType){};
: spillerType_(spillerType) {};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For these changes here and the MultiFragmentTest.
looks like only format changes. Is that because the format-fix was making these while you undid your previous changes?

Copy link
Collaborator Author

@anandamideShakyan anandamideShakyan Feb 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. The format fix is adding this space, what should I be doing retain the old formatting or keep it as it is?

static_cast<int>(exchg),
static_cast<int>(number));
std::strncpy(target, formatted.c_str(), sizeof(target) - 1);
target[sizeof(target) - 1] = '\0';
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strncpy already adds a null-terminator. This is not needed.


if (!bInit) {
sprintf(szFormat, C_NAME_FMT, 9, &HUGE_FORMAT[1]);
auto result = sprintf(szFormat, C_NAME_FMT, 9, &HUGE_FORMAT[1]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we want consistency in the naming of the variable. It is called res everywhere else.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, there is another variable named res just below, so I didn't want to cause any scoping issues.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no scoping issue really. This one is local to this block only.
The compiler will tell you if there are problems.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@anandamideShakyan anandamideShakyan changed the title Refactor: handle in-output errors for string functions (Part 2) refactor: Handle in-output errors for string functions (Part 2) Feb 24, 2025
@anandamideShakyan
Copy link
Collaborator Author

@czentgr waiting for your review on this one.

@libianoss
Copy link
Collaborator

@czentgr can you help to review this PR based on @anandamideShakyan 's change? Thanks.

@czentgr
Copy link
Collaborator

czentgr commented Mar 7, 2025

@anandamideShakyan please review the CI failures. Rebase if necessary.

@anandamideShakyan
Copy link
Collaborator Author

@czentgr resolved the CI failures, waiting for your review.

@anandamideShakyan anandamideShakyan force-pushed the dev4 branch 2 times, most recently from 8c599a5 to 722535e Compare March 21, 2025 04:24
Copy link
Collaborator

@czentgr czentgr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@prestodb-ci
Copy link

@ethanyzhang imported this issue into IBM GitHub Enterprise

@anandamideShakyan anandamideShakyan force-pushed the dev4 branch 3 times, most recently from 4aa27d4 to bae1d95 Compare April 7, 2025 14:29
@majetideepak majetideepak changed the title refactor: Handle in-output errors for string functions (Part 2) refactor(tpch): Handle in-output errors for string functions (Part 2) Apr 7, 2025
@majetideepak majetideepak added the ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall label Apr 7, 2025
#include "dbgen/dsstypes.h" // @manual

#include <math.h>
#include "common/base/Exceptions.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The target still needs fmt as a dependency. It might be able to stay "private".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be "velox/common/base/Exceptions.h" , please fix.

Copy link
Collaborator Author

@anandamideShakyan anandamideShakyan Apr 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. This was added as an automatic import by clion when I used VELOX_CHECK_GT macro, not sure why common/base/Exceptions.h was used in the import path, it is strange.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this line:include_directories(SYSTEM velox) in root CmakeLists.txt allow us to pass just "common/base/Exceptions.h" as file path?

@anandamideShakyan anandamideShakyan force-pushed the dev4 branch 2 times, most recently from ea293ba to a9c6fc2 Compare April 8, 2025 05:16
@kgpai
Copy link
Contributor

kgpai commented Apr 11, 2025

Can you rebase to latest main @anandamideShakyan ?

@facebook-github-bot
Copy link
Contributor

@kgpai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@kgpai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@kgpai merged this pull request in 4ddf2dd.

zhanglistar pushed a commit to bigo-sg/velox that referenced this pull request Apr 22, 2025
…facebookincubator#12064)

Summary:
# PR Description:

Detect and address input/output errors that can result in undefined behavior. Ensure proper error handling for I/O functions that may fail and leave variables uninitialized. Neglecting to check the status of these functions before using their outputs (e.g., memory buffers, file descriptors, etc.) can lead to undefined program behavior. This update enforces checks on commonly used I/O functions to validate their return values and prevent improper usage.

This is the second PR of the several changes for this refactoring.

Pull Request resolved: facebookincubator#12064

Reviewed By: pedroerp

Differential Revision: D73005009

Pulled By: kgpai

fbshipit-source-id: 5cce140545fb9f682b036ce86875bd2c1b1b5a06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged ready-to-merge PR that have been reviewed and are ready for merging. PRs with this tag notify the Velox Meta oncall

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants