[Improve][Connector-v2][File] error handling for file and directory operations in HadoopFileSystemProxy #10433
zhangshenghang wants to merge 1 commit into apache:dev
Issue 1: path.getParent() may return null causing NPE
Location:
Modified code:
    IOException enhanced =
            enhanceMkdirsException(
                    fs, path.getParent(), "create file " + path.getName(), e);
Related context:
Problem description:
    Path parent = path.getParent();
    if (parent != null && !fs.exists(parent)) { // If parent is null, this will be skipped here
        reason.append("Parent directory does not exist: ").append(parent).append(". ");
    } else {
        reason.append("Directory does not exist and creation failed: ")
                .append(path)
                .append(". ");
    }
Although L297 has a null check, the subsequent L306-L316 will still call
Potential risks:
Impact scope:
Severity: MAJOR (medium-high)
Improvement suggestion:
    public FSDataOutputStream getOutputStream(String filePath) throws IOException {
        return execute(
                () -> {
                    Path path = new Path(filePath);
                    FileSystem fs = getFileSystem();
                    try {
                        return fs.create(path, true);
                    } catch (IOException e) {
                        // Fix: getParent() returns null when path is the root, so
                        // enhanceMkdirsException must tolerate a null path
                        Path parent = path.getParent();
                        IOException enhanced =
                                enhanceMkdirsException(
                                        fs, parent, "create file " + path.getName(), e);
                        throw CommonError.fileOperationFailed(
                                "SeaTunnel", "create", filePath, enhanced);
                    }
                });
    }
    // Also modify enhanceMkdirsException:
    private IOException enhanceMkdirsException(
            FileSystem fs, Path path, String operation, IOException cause) throws IOException {
        StringBuilder reason = new StringBuilder();
        // Fix: Handle null path
        if (path == null) {
            reason.append("Path is null. ");
        } else if (!fs.exists(path)) {
            Path parent = path.getParent();
            if (parent != null && !fs.exists(parent)) {
                reason.append("Parent directory does not exist: ").append(parent).append(". ");
            } else if (parent == null) {
                // Hadoop's Path.getParent() returns null when the path is the root
                reason.append("Path has no parent (filesystem root). ");
            } else {
                reason.append("Directory does not exist and creation failed: ")
                        .append(path)
                        .append(". ");
            }
            try {
                fs.getFileStatus(path);
            } catch (IOException e) {
                if (e.getMessage() != null) {
                    if (e.getMessage().contains("Permission denied")) {
                        reason.append("Permission denied. ");
                    } else {
                        reason.append("Hadoop error: ").append(e.getMessage()).append(". ");
                    }
                }
            }
        } else {
            reason.append("Path exists but may be inaccessible: ").append(path).append(". ");
        }
        reason.append("Operation: ")
                .append(operation)
                .append(". ")
                .append("Current working directory: ")
                .append(fs.getWorkingDirectory());
        IOException enhanced = new IOException(reason.toString());
        if (cause != null) {
            enhanced.addSuppressed(cause);
        }
        return enhanced;
    }
Rationale:
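The null-parent case can be reproduced without a Hadoop cluster. The following is a minimal sketch, not the PR's code: it uses java.nio.file.Path as a stand-in for org.apache.hadoop.fs.Path (both can return null from getParent(), although the exact conditions differ between the two classes), and the class and method names are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class NullParentSketch {

    // Hypothetical helper: turns a possibly-null parent into text that is safe to log.
    public static String describeParent(Path path) {
        Path parent = path.getParent(); // may be null when the path has no parent
        return parent != null ? parent.toString() : "(no parent)";
    }

    public static void main(String[] args) {
        // A path with a parent component
        System.out.println(describeParent(Paths.get("out", "file.txt")));
        // A single-name path: java.nio reports no parent here
        System.out.println(describeParent(Paths.get("file.txt")));
    }
}
```

Resolving the null once, before the value is passed to any diagnostic helper, keeps the helper free of repeated null checks.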
Issue 2: enhanceMkdirsException() method lacks JavaDoc
Location:
Modified code:
    private IOException enhanceMkdirsException(
            FileSystem fs, Path path, String operation, IOException cause) throws IOException {
        // 45 lines of code, no JavaDoc
    }
Related context:
Problem description:
Potential risks:
Impact scope:
Severity: MINOR (low)
Improvement suggestion:
    /**
     * Enhances IOException with detailed diagnostic information for directory/file creation failures.
     * <p>
     * This method performs the following diagnostic checks:
     * <ul>
     * <li>Checks if the parent directory exists</li>
     * <li>Detects permission denied errors from Hadoop</li>
     * <li>Captures Hadoop-specific error messages</li>
     * <li>Includes current working directory for context</li>
     * </ul>
     *
     * @param fs the FileSystem instance to perform diagnostic checks on
     * @param path the path that failed to create (may be null, e.g. when the parent of a root path is passed)
     * @param operation the operation being performed (e.g., "create directory", "create file")
     * @param cause the original IOException from FileSystem.mkdirs() or FileSystem.create() (can be null)
     * @return an enhanced IOException with detailed diagnostic information in the message
     * @throws IOException if diagnostic checks (e.g., fs.exists(), fs.getFileStatus()) fail
     *
     * @see CommonError#fileOperationFailed(String, String, String, Throwable)
     */
    private IOException enhanceMkdirsException(
            FileSystem fs, Path path, String operation, IOException cause) throws IOException {
        // ...
    }
Rationale:
Issue 3: Missing unit tests for this modification
Location:
Modified code:
Related context:
Problem description:
Potential risks:
Impact scope:
Severity: MAJOR (medium-high)
Improvement suggestion:
    package org.apache.seatunnel.connectors.seatunnel.file.hadoop;
    import org.apache.seatunnel.common.exception.SeaTunnelRuntimeException;
    import org.apache.seatunnel.connectors.seatunnel.file.config.HadoopConf;
    import org.junit.jupiter.api.AfterEach;
    import org.junit.jupiter.api.BeforeEach;
    import org.junit.jupiter.api.Test;
    import org.junit.jupiter.api.io.TempDir;
    import java.io.IOException;
    import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
    import static org.junit.jupiter.api.Assertions.assertFalse;
    import static org.junit.jupiter.api.Assertions.assertThrows;
    import static org.junit.jupiter.api.Assertions.assertTrue;
    class HadoopFileSystemProxyTest {
        private HadoopFileSystemProxy proxy;
        private HadoopConf hadoopConf;
        @TempDir
        java.nio.file.Path tempDir;
        @BeforeEach
        void setUp() throws IOException {
            hadoopConf = new HadoopConf();
            hadoopConf.setHdfsPath("file://" + tempDir.toString());
            proxy = new HadoopFileSystemProxy(hadoopConf);
        }
        @AfterEach
        void tearDown() throws IOException {
            if (proxy != null) {
                proxy.close();
            }
        }
        @Test
        void testCreateDirWhenAlreadyExists() throws IOException {
            // Given: Directory already exists
            String dirPath = tempDir.resolve("test-dir").toString();
            proxy.createDir(dirPath);
            // When: Create again
            // Then: Should not throw exception
            assertDoesNotThrow(() -> proxy.createDir(dirPath));
        }
        @Test
        void testCreateDirWhenParentNotExists() {
            // Given: Parent directory does not exist
            // Note: FileSystem.mkdirs() also creates missing parents (mkdir -p semantics),
            // so on a local filesystem this call may succeed; a parent that cannot be
            // created (for example, a regular file) may be needed to trigger the failure.
            String dirPath = tempDir.resolve("non-existent-parent/child").toString();
            // When & Then: Should throw exception with detailed information
            SeaTunnelRuntimeException ex = assertThrows(
                    SeaTunnelRuntimeException.class,
                    () -> proxy.createDir(dirPath)
            );
            // Verify error message contains useful diagnostic information
            assertTrue(ex.getMessage().contains("Parent directory does not exist") ||
                    ex.getMessage().contains("Hadoop error"));
        }
        @Test
        void testGetOutputStreamWhenSuccess() throws IOException {
            // Given: Parent directory exists
            String filePath = tempDir.resolve("test-file.txt").toString();
            // When: Create output stream
            // Then: Should not throw exception
            assertDoesNotThrow(() -> proxy.getOutputStream(filePath));
        }
        @Test
        void testGetOutputStreamWhenParentNotExists() {
            // Given: Parent directory does not exist
            // Note: FileSystem.create() typically creates missing parent directories,
            // so this case may also need a parent that cannot be created.
            String filePath = tempDir.resolve("non-existent-parent/file.txt").toString();
            // When & Then: Should throw exception with parent directory information
            SeaTunnelRuntimeException ex = assertThrows(
                    SeaTunnelRuntimeException.class,
                    () -> proxy.getOutputStream(filePath)
            );
            // Verify error message contains useful diagnostic information
            String message = ex.getMessage();
            assertTrue(message.contains("Parent directory") ||
                    message.contains("Hadoop error") ||
                    message.contains("Permission denied"));
        }
        @Test
        void testGetOutputStreamWithRelativePath() throws IOException {
            // Given: Relative path
            String relativePath = "test-relative.txt";
            // When: Create output stream
            // Then: Should not throw NPE
            assertDoesNotThrow(() -> proxy.getOutputStream(relativePath));
        }
        @Test
        void testGetOutputStreamWithRootPath() {
            // Given: Root path (Edge Case)
            String rootPath = "/";
            // When & Then: Should throw exception (but not NPE)
            Exception ex = assertThrows(Exception.class, () -> {
                proxy.getOutputStream(rootPath);
            });
            // Verify it's not NPE
            assertFalse(ex instanceof NullPointerException);
        }
    }
Rationale:
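Hadoop's FileSystem.mkdirs() is documented as creating all non-existent parents, so a test that only leaves the parent missing may not reach the error path. One way to force a creation failure is to place a regular file where a directory is expected. The sketch below illustrates the idea with java.nio.file as a stand-in for the Hadoop API; the class and method names are hypothetical, and the behaviour of HadoopFileSystemProxy in this situation is an assumption that would need to be confirmed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class BlockedParentSketch {

    // Returns true if creating a directory beneath a regular file fails with an IOException.
    public static boolean failsUnderRegularFile(Path workDir) throws IOException {
        // A regular file occupies the place where a parent directory is needed
        Path blocker = Files.createFile(workDir.resolve("blocker"));
        try {
            Files.createDirectories(blocker.resolve("child"));
            return false; // creation unexpectedly succeeded
        } catch (IOException expected) {
            return true; // the parent is a file, so the directory cannot be created
        }
    }

    public static void main(String[] args) throws IOException {
        Path workDir = Files.createTempDirectory("blocked-parent");
        System.out.println(failsUnderRegularFile(workDir));
    }
}
```

A test built this way does not depend on whether the filesystem creates missing parents automatically.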
Issue 4: fs.exists() and fs.getFileStatus() in enhanceMkdirsException() may throw swallowed exceptions
Location:
Modified code:
    if (!fs.exists(path)) {
        Path parent = path.getParent();
        if (parent != null && !fs.exists(parent)) {
            reason.append("Parent directory does not exist: ").append(parent).append(". ");
        } else {
            reason.append("Directory does not exist and creation failed: ")
                    .append(path)
                    .append(". ");
        }
        try {
            fs.getFileStatus(path); // A new IOException may be thrown here
        } catch (IOException e) {
            if (e.getMessage() != null) {
                if (e.getMessage().contains("Permission denied")) {
                    reason.append("Permission denied. ");
                } else {
                    reason.append("Hadoop error: ").append(e.getMessage()).append(". ");
                }
            }
        }
    }
Related context:
Problem description:
Potential risks:
Impact scope:
Severity: MINOR (low-medium)
Improvement suggestion:
    private IOException enhanceMkdirsException(
            FileSystem fs, Path path, String operation, IOException cause) throws IOException {
        StringBuilder reason = new StringBuilder();
        try {
            if (!fs.exists(path)) {
                Path parent = path.getParent();
                if (parent != null && !fs.exists(parent)) {
                    reason.append("Parent directory does not exist: ").append(parent).append(". ");
                } else if (parent == null) {
                    reason.append("Path is in current directory. ");
                } else {
                    reason.append("Directory does not exist and creation failed: ")
                            .append(path)
                            .append(". ");
                }
                // Safely attempt to get detailed error
                try {
                    fs.getFileStatus(path);
                } catch (IOException e) {
                    String errorMsg = e.getMessage();
                    if (errorMsg != null && !errorMsg.isEmpty()) {
                        if (errorMsg.contains("Permission denied")) {
                            reason.append("Permission denied. ");
                        } else {
                            reason.append("Hadoop error: ").append(errorMsg).append(". ");
                        }
                    } else {
                        reason.append("Hadoop error: (no detailed message). ");
                    }
                }
            } else {
                reason.append("Path exists but may be inaccessible: ").append(path).append(". ");
            }
        } catch (IOException diagnosticEx) {
            // If diagnostic checks fail, fall back to basic information
            reason.append("Failed to diagnose path: ")
                    .append(path)
                    .append(". Diagnostic error: ")
                    .append(diagnosticEx.getMessage())
                    .append(". ");
        }
        reason.append("Operation: ")
                .append(operation)
                .append(". ")
                .append("Current working directory: ");
        // Safely get working directory. getWorkingDirectory() declares no checked
        // exception, so catching IOException here would not compile; guard against
        // runtime failures instead.
        try {
            reason.append(fs.getWorkingDirectory());
        } catch (RuntimeException e) {
            reason.append("(unknown: ").append(e.getMessage()).append(")");
        }
        IOException enhanced = new IOException(reason.toString());
        if (cause != null) {
            enhanced.addSuppressed(cause);
        }
        return enhanced;
    }
Rationale:
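The pattern in this suggestion (run best-effort diagnostics, never let them replace the original failure) can be shown in isolation. This is a minimal sketch using only the JDK; the Probe interface and all names are hypothetical stand-ins for the fs.exists() and fs.getFileStatus() calls, not part of the PR.

```java
import java.io.IOException;

public class SafeDiagnosticsSketch {

    // Hypothetical stand-in for a filesystem check that may itself fail.
    public interface Probe {
        String describe() throws IOException;
    }

    // Builds an enhanced exception; a failing probe only changes the message text.
    public static IOException enhance(String operation, IOException cause, Probe probe) {
        StringBuilder reason = new StringBuilder();
        try {
            reason.append(probe.describe());
        } catch (IOException | RuntimeException diagnosticEx) {
            // Fall back to basic information instead of propagating the probe failure
            reason.append("Failed to diagnose: ").append(diagnosticEx.getMessage()).append(". ");
        }
        reason.append("Operation: ").append(operation);
        IOException enhanced = new IOException(reason.toString());
        if (cause != null) {
            enhanced.addSuppressed(cause); // keep the original failure attached
        }
        return enhanced;
    }

    public static void main(String[] args) {
        IOException cause = new IOException("mkdirs failed");
        IOException e = enhance("create directory", cause, () -> {
            throw new IOException("probe broke");
        });
        System.out.println(e.getMessage());
    }
}
```

Because enhance() declares no checked exception, callers always receive the enhanced exception with the original failure attached, whatever happens during diagnosis.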
Issue 5: Log warning level in renameFile() may be inappropriate
Location:
Modified code:
    if (!fileExist(oldPath.toString())) {
        log.warn(
                "rename file:[{}] to [{}] already finished in the last commit, skip. "
                        + "WARNING: In cluster mode with LocalFile without shared storage, "
                        + "the file may not be actually synced successfully, but the status shows success.",
                oldPath,
                newPath);
        return Void.class;
    }
Related context:
Problem description:
Using the
Potential risks:
Impact scope:
Severity: MINOR (low)
Improvement suggestion:
Option 1: Change to INFO level (recommended)
    if (!fileExist(oldPath.toString())) {
        log.info(
                "rename file:[{}] to [{}] already finished in the last commit, skip. "
                        + "INFO: In cluster mode with LocalFile without shared storage, "
                        + "the file may not be actually synced successfully, but the status shows success.",
                oldPath,
                newPath);
        return Void.class;
    }
Option 2: Distinguish log levels based on scenario
    if (!fileExist(oldPath.toString())) {
        // Check if in checkpoint recovery scenario (can be determined from context)
        if (isCheckpointRecovery()) {
            log.debug("File [{}] already renamed to [{}], skipping", oldPath, newPath);
        } else {
            log.warn(
                    "rename file:[{}] to [{}] already finished in the last commit, skip. "
                            + "WARNING: In cluster mode with LocalFile without shared storage, "
                            + "the file may not be actually synced successfully, but the status shows success.",
                    oldPath,
                    newPath);
        }
        return Void.class;
    }
Option 3: Keep WARN, but clearly mark as "expected behavior"
    if (!fileExist(oldPath.toString())) {
        log.warn(
                "rename file:[{}] to [{}] already finished in the last commit, skip. "
                        + "Note: This is expected during checkpoint recovery. "
                        + "However, in cluster mode with LocalFile without shared storage, "
                        + "the file may not be actually synced successfully, but the status shows success.",
                oldPath,
                newPath);
        return Void.class;
    }
Rationale:
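Option 2 depends on isCheckpointRecovery(), which is not shown in this review. As a rough illustration of the level selection only, the sketch below takes the recovery flag as a parameter and uses java.util.logging as a stand-in for the project's logger; every name in it is hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class RenameLogLevelSketch {

    private static final Logger LOG = Logger.getLogger(RenameLogLevelSketch.class.getName());

    // A missing source file is routine during recovery, otherwise worth a warning.
    public static Level levelForMissingSource(boolean checkpointRecovery) {
        return checkpointRecovery ? Level.FINE : Level.WARNING;
    }

    public static void logSkippedRename(String oldPath, String newPath, boolean checkpointRecovery) {
        LOG.log(
                levelForMissingSource(checkpointRecovery),
                "rename file:[{0}] to [{1}] already finished, skip",
                new Object[] {oldPath, newPath});
    }

    public static void main(String[] args) {
        logSkippedRename("/tmp/a.tmp", "/data/a", false); // logged at WARNING
        logSkippedRename("/tmp/b.tmp", "/data/b", true); // logged at FINE
    }
}
```

Keeping the decision in one small method makes the chosen level easy to unit test without capturing log output.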
Purpose of this pull request
Does this PR introduce any user-facing change?
How was this patch tested?
Check list
New License Guide
incompatible-changes.md to describe the incompatibility caused by this PR.