10 changes: 8 additions & 2 deletions core/src/main/scala/org/apache/spark/network/ManagedBuffer.scala
@@ -19,6 +19,7 @@ package org.apache.spark.network

 import java.io.{FileInputStream, RandomAccessFile, File, InputStream}
 import java.nio.ByteBuffer
+import java.nio.channels.FileChannel
 import java.nio.channels.FileChannel.MapMode

 import com.google.common.io.ByteStreams
@@ -66,8 +67,13 @@ final class FileSegmentManagedBuffer(val file: File, val offset: Long, val lengt
   override def size: Long = length

   override def nioByteBuffer(): ByteBuffer = {
-    val channel = new RandomAccessFile(file, "r").getChannel
-    channel.map(MapMode.READ_ONLY, offset, length)
+    var channel: FileChannel = null
+    try {
Contributor

This part looks good to me.

+      channel = new RandomAccessFile(file, "r").getChannel
+      channel.map(MapMode.READ_ONLY, offset, length)
+    } finally {
+      channel.close()
Member

This would throw an NPE in the finally block if an error occurred in new RandomAccessFile or getChannel, because channel would still be null. I suggest moving new RandomAccessFile(file, "r").getChannel before the try block. Until that call returns, no FileChannel has been successfully opened, so there is nothing that needs closing.

Member Author

Originally I intended to check whether channel is null, but I forgot to do so in the previous PR.
I've fixed it now.
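
A minimal sketch of the pattern suggested in this thread, for illustration only. `MapSegmentSketch` and `mapSegment` are hypothetical names and are not part of the PR. The file is opened before the try block, so a failure while opening leaves nothing to close, and the finally block never sees a null reference. The sketch relies on two documented JDK behaviors: a mapping created by `FileChannel.map` does not depend on the channel staying open, and `RandomAccessFile.close` also closes the associated channel.

```scala
import java.io.{File, RandomAccessFile}
import java.nio.ByteBuffer
import java.nio.channels.FileChannel.MapMode

object MapSegmentSketch {
  // Hypothetical helper (an assumption, not the PR's code): maps a file
  // segment read-only and always releases the file handle.
  def mapSegment(file: File, offset: Long, length: Long): ByteBuffer = {
    // Opened outside the try block: if this throws, there is nothing to close.
    val raf = new RandomAccessFile(file, "r")
    try {
      // The returned mapping stays valid after the channel is closed.
      raf.getChannel.map(MapMode.READ_ONLY, offset, length)
    } finally {
      // Closing the RandomAccessFile also closes its channel.
      raf.close()
    }
  }
}
```

An equivalent alternative is to keep the `var channel: FileChannel = null` form and guard the close with `if (channel != null)`, which is what the author describes above.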

+    }
   }

   override def inputStream(): InputStream = {
@@ -17,13 +17,14 @@

 package org.apache.spark.storage

+import java.io.InputStream
 import java.util.concurrent.LinkedBlockingQueue

 import scala.collection.mutable.ArrayBuffer
 import scala.collection.mutable.HashSet
 import scala.collection.mutable.Queue

-import org.apache.spark.{TaskContext, Logging, SparkException}
+import org.apache.spark.{TaskContext, Logging}
 import org.apache.spark.network.{ManagedBuffer, BlockFetchingListener, BlockTransferService}
 import org.apache.spark.serializer.Serializer
 import org.apache.spark.util.Utils
@@ -111,13 +112,21 @@ final class ShuffleBlockFetcherIterator(
     blockTransferService.fetchBlocks(req.address.host, req.address.port, blockIds,
       new BlockFetchingListener {
         override def onBlockFetchSuccess(blockId: String, data: ManagedBuffer): Unit = {
-          results.put(new FetchResult(BlockId(blockId), sizeMap(blockId),
-            () => serializer.newInstance().deserializeStream(
-              blockManager.wrapForCompression(BlockId(blockId), data.inputStream())).asIterator
-          ))
-          shuffleMetrics.remoteBytesRead += data.size
-          shuffleMetrics.remoteBlocksFetched += 1
-          logDebug("Got remote block " + blockId + " after " + Utils.getUsedTimeMs(startTime))
+          var is: InputStream = null
+          try {
+            is = data.inputStream()
+            results.put(new FetchResult(BlockId(blockId), sizeMap(blockId),
+              () => serializer.newInstance().deserializeStream(
+                blockManager.wrapForCompression(BlockId(blockId), is)).asIterator
+            ))
+            shuffleMetrics.remoteBytesRead += data.size
Member

Can these three lines follow the finally block? I think the stream can be closed at this point.

Member Author

Sorry, what do you mean? I didn't get it.

Member

These three lines do not need to be within the try block, right? I figure it's best to complete the try block and have is be closed before moving on to further operations.

Member Author

Ah, do you mean lines 122-124 should be outside the try block? I agree with that.

+            shuffleMetrics.remoteBlocksFetched += 1
+            logDebug("Got remote block " + blockId + " after " + Utils.getUsedTimeMs(startTime))
+          } finally {
+            if (is != null) {
+              is.close()
Contributor

Doesn't this close the InputStream prematurely? Note that the 3rd argument of the FetchResult put into results is passed in as a closure, so it is lazy.

BTW in my new refactoring of this, there is a place where we should explicitly close the streams:

https://github.com/apache/spark/pull/2330/files#diff-27109eb30a77542d377c936e0d134420R295

Member Author

@rxin Exactly, it doesn't make sense, and I noticed the InputStream is already closed via NextIterator#close.
So I've reverted this part of the change.
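
A small self-contained illustration of the concern raised in this thread, not taken from the PR. `LazyStreamSketch`, `eagerClose`, and `deferredClose` are hypothetical names, and a `BufferedInputStream` over an in-memory array stands in for the fetched block's stream. When a closure that reads from a stream is created inside try/finally, the finally block runs as soon as the closure is created, so the stream is already closed by the time the closure is invoked.

```scala
import java.io.{BufferedInputStream, ByteArrayInputStream, InputStream}

object LazyStreamSketch {
  // Mirrors the pattern questioned above: the closure captures `is`,
  // but `is` is closed before the closure ever runs.
  def eagerClose(bytes: Array[Byte]): () => Int = {
    val is: InputStream = new BufferedInputStream(new ByteArrayInputStream(bytes))
    try {
      () => is.read() // lazy: nothing is read yet
    } finally {
      is.close() // runs immediately, before any read happens
    }
  }

  // Defers the close to the consumer: the stream is closed only after
  // the closure has actually read from it.
  def deferredClose(bytes: Array[Byte]): () => Int = {
    val is: InputStream = new BufferedInputStream(new ByteArrayInputStream(bytes))
    () => try is.read() finally is.close()
  }
}
```

Invoking the closure returned by `eagerClose` throws an `IOException` because `BufferedInputStream` rejects reads after close, while the closure from `deferredClose` returns the first byte. In the PR the deferred close is handled by the consumer side (the author points to NextIterator#close), which is why the eager close was reverted.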

+            }
+          }
         }

         override def onBlockFetchFailure(e: Throwable): Unit = {