Skip to content
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -260,23 +260,20 @@ public List<GroupedSplitContainer> getGroupedSplits(Configuration conf,
desiredNumSplits = newDesiredNumSplits;
} else if (lengthPerGroup < minLengthPerGroup) {
// splits too small to work. Need to override with size.
int newDesiredNumSplits = (int)(totalLength/minLengthPerGroup) + 1;
/**
* This is a workaround for systems like S3 that pass the same
* fake hostname for all splits.
*/
if (!allSplitsHaveLocalhost) {
int newDesiredNumSplits = (int)(totalLength/minLengthPerGroup) + 1;
LOG.info("Desired splits: " + desiredNumSplits + " too large. " +
" Desired splitLength: " + lengthPerGroup +
" Min splitLength: " + minLengthPerGroup +
" New desired splits: " + newDesiredNumSplits +
" Total length: " + totalLength +
" Original splits: " + originalSplits.size());
desiredNumSplits = newDesiredNumSplits;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about adding an else branch here with some useful log message describing whatever happens (or doesn't happen) when allSplitsHaveLocalhost=true
otherwise LGTM
cc: @rbalamohan

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a log message here c715c0b and also made a small refactoring to gather logging in one place and always log if splitLength bounds are exceeded.

Anyways the most important point in this patch is to be able to see the original desiredNumSplits in every case; especially when it is different from newDesiredNumSplits or not.


LOG.info("Desired splits: " + desiredNumSplits + " too large. " +
" Desired splitLength: " + lengthPerGroup +
" Min splitLength: " + minLengthPerGroup +
" New desired splits: " + newDesiredNumSplits +
" Final desired splits: " + desiredNumSplits +
" All splits have localhost: " + allSplitsHaveLocalhost +
" Total length: " + totalLength +
" Original splits: " + originalSplits.size());
}
}

Expand Down