Closed
Labels
enhancement (New feature or request)
Description
After running the analysis with the --gvcf option on a 50 GB BAM file containing 4 ONT runs, with the HG19 reference, the resulting tmp output subfolder takes 419 GB, plus 117 GB in the main output folder. It would probably make sense to remove the partial VCF files once they have been concatenated and sorted, and to compress the output. For instance, a 117 GB GVCF file takes only 8.5 GB when bzip2-compressed, and tools such as lbzip2 can decompress it in parallel. Perhaps you want to minimize dependencies, but disk space efficiency also matters when renting servers with fast SSDs.
547M ./tmp/full_alignment_output/candidate_bed
3.6G ./tmp/full_alignment_output
233G ./tmp/gvcf_tmp_output
117G ./tmp/merge_output
18G ./tmp/pileup_output
174M ./tmp/phase_output/phase_vcf
48G ./tmp/phase_output/phase_bam
48G ./tmp/phase_output
419G ./tmp
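To make the suggestion concrete, here is a minimal sketch of the idea, not the tool's actual merge step. It uses only the Python standard library, and the function name merge_and_compress and its parameters are hypothetical. It streams the partial VCF files into a single bzip2-compressed file, keeps header lines only from the first partial, and deletes the partials only after the compressed file has been closed. Sorting is not handled here; the sketch assumes the partials are passed in their final order.

```python
import bz2
from pathlib import Path


def merge_and_compress(partials, out_path, remove_partials=True):
    """Concatenate partial VCF files into one bzip2-compressed file.

    Hypothetical helper: header lines (starting with '#') are taken from
    the first partial only. Partials are deleted only after the compressed
    output has been fully written and closed.
    """
    partials = [Path(p) for p in partials]
    out_path = Path(out_path)

    # Stream line by line in binary mode so memory use stays small
    # and line endings are copied through unchanged.
    with bz2.open(out_path, "wb") as dst:
        for index, partial in enumerate(partials):
            with open(partial, "rb") as src:
                for line in src:
                    if index > 0 and line.startswith(b"#"):
                        continue  # skip repeated headers
                    dst.write(line)

    # Free the disk space only once the merged file is safely on disk.
    if remove_partials:
        for partial in partials:
            partial.unlink()
    return out_path
```

With this approach the peak disk usage is the partial files plus the compressed result, rather than the partial files plus an uncompressed merged copy. The output remains a standard bzip2 stream, so it should be readable with bzip2 or lbzip2.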