Description
Hi. Thanks for the code and the detailed instructions.
I implemented sparse convolution in my encoder:
```python
with tf.variable_scope('featureEncoder'):
    auxiShape = (self.inputShape[0], self.inputShape[1], self.inputShape[2], 7)
    featureShape = (self.inputShape[0], self.inputShape[1], self.inputShape[2], 32)
    blockSize = 8
    blockStride = (8, 8)
    blockOffset = (0, 0)
    blockCount = (self.divup(self.inputShape[1], blockStride[0]),
                  self.divup(self.inputShape[2], blockStride[1]))
    inBlockParams = {"dynamic_bsize": (blockSize, blockSize),
                     "dynamic_boffset": blockOffset,
                     "dynamic_bstride": blockStride}
    outBlockParams = {"dynamic_bsize": (blockSize, blockSize),
                      "dynamic_boffset": blockOffset,
                      "dynamic_bstride": blockStride}
    if not self.training:
        indices = sbnet_module.reduce_mask(self.mask, blockCount, tol=0.1, **inBlockParams)
        # stack active overlapping tiles to batch dimension
        stack = sbnet_module.sparse_gather(
            auxi, indices.bin_counts, indices.active_block_indices,
            transpose=False, **inBlockParams)
    else:
        stack = auxi
    # perform dense convolution on a sparse stack of tiles
    stack = self.conv_layer2(stack, 7, 32, name='1')
    stack = tf.nn.leaky_relu(stack)
    stack = self.conv_layer2(stack, 32, 32, name='2')
    stack = tf.nn.leaky_relu(stack)
    stack = self.conv_layer2(stack, 32, 32, name='3')
    stack = tf.nn.leaky_relu(stack)
    stack = self.conv_layer2(stack, 32, 32, name='4')
    stack = tf.nn.leaky_relu(stack)
    stack = self.conv_layer2(stack, 32, 32, name='5')
    stack = tf.nn.leaky_relu(stack)
    # write/scatter the tiles back on top of original tensor
    # note that the output tensor is reduced by 1 on each side due to 'VALID' convolution
    if not self.training:
        feature = sbnet_module.sparse_scatter(
            stack, indices.bin_counts, indices.active_block_indices,
            self.lastFeature, transpose=False, add=False, atomic=False, **outBlockParams)
        feature.set_shape(featureShape)
    else:
        feature = stack
```

self.training is set to False when training and True when testing. The mask is generated outside the network and fed in via tf.placeholder, as is self.lastFeature.
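For context, `divup` is not shown above; presumably it is the ceiling-division helper used in the sbnet examples to compute how many blocks cover each spatial dimension. A minimal sketch under that assumption:

```python
def divup(self, a, b):
    # ceiling division: number of blocks of size/stride b needed to cover extent a
    return (a + b - 1) // b
```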
I tried to measure the inference time with timeline:
```python
feed_dict = {model.source: src, model.target: tgt,
             model.batch_size: src_hdr.shape[0],
             model.mask: Mask, model.feature: Feature}
denoised_1_bd, Feature = sess.run([model.fake_image, model.feature], feed_dict,
                                  options=run_options, run_metadata=run_metadata)
tl = timeline.Timeline(run_metadata.step_stats)
ctf = tl.generate_chrome_trace_format(show_memory=True)
with open(os.path.join(errorlog_dir, 'timeline.json'), 'w') as wd:
    wd.write(ctf)
```

However, I can't find time records for the layers under 'featureEncoder'. There are also two bars captioned "unknown", the second of which is strangely long. The times of some Pooling and LeakyRelu ops are also odd, costing nearly 2 ms.
I wonder how I can get a proper time measurement. Thanks.
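For reference, here is a minimal sketch of the tracing setup, assuming `run_options`/`run_metadata` are the standard TF 1.x full-trace objects (their definitions are not shown above), with a few warm-up runs added, since the first iterations include cuDNN autotuning and allocator overhead that can distort the trace. The filename `timeline_steady.json` is just illustrative:

```python
import tensorflow as tf
from tensorflow.python.client import timeline

# Standard TF 1.x tracing objects (assumed; not shown in the snippet above).
run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
run_metadata = tf.RunMetadata()

# Warm-up runs: the first few sess.run calls trigger cuDNN autotuning and
# GPU memory allocation, which would otherwise inflate the traced times.
for _ in range(5):
    sess.run(model.fake_image, feed_dict)

# Trace a single steady-state step and dump it in Chrome trace format.
sess.run([model.fake_image, model.feature], feed_dict,
         options=run_options, run_metadata=run_metadata)
tl = timeline.Timeline(run_metadata.step_stats)
with open('timeline_steady.json', 'w') as f:
    f.write(tl.generate_chrome_trace_format(show_memory=True))
```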
My Environment
TensorFlow Version: 1.15.0
Operating System: Ubuntu 16.04
Python Version: 3.6.13
CUDA Version: 10.0
CUDNN Version: 7.6.4
GPU Type: RTX 2080ti
Nvidia Driver Version: 460.67

