This repository was archived by the owner on May 22, 2026. It is now read-only.

The SNP detection result for the amp_WGS project is empty #198

@Bioinfo-dataming

Dear Vardict team, thank you for providing this powerful, ultra-sensitive detection tool.

I am currently working on an amp WGS project ("SpCas9 off-targets"): I amplify ~5 kb target sequences by PCR, randomly fragment them, and sequence them on an Illumina sequencer. I want to detect SNPs at a VAF of 0.2%.

Unfortunately, the output of `vardict -G $REF -f 0.001 -N "$TUMOR_NAME" -b "${TUMOR_BAM}" -c 1 -S 2 -E 3 -R chr19:55113494-55119083 | teststrandbias.R | var2vcf_valid.pl -N "$TUMOR_NAME" -E -f $AF_THR` is empty: it contains only the VarDict help text and the VCF header lines (details are in the attached file).
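Because the final output mixes help text with VCF header lines, it can help to see which stage of the pipe produced what. The following is a minimal sketch, not a confirmed diagnosis: `run_stage` and `debug_pipeline` are hypothetical helper names, and the tool names and flags are copied unchanged from the command above. Each stage runs on its own, with stdout and stderr written to separate files.

```shell
#!/usr/bin/env bash

# Run one pipeline stage with stdin taken from a file, keeping stdout and
# stderr in separate files so help text cannot be mistaken for results.
# Usage: run_stage NAME INPUT_FILE COMMAND [ARGS...]
run_stage() {
    local name=$1 input=$2
    shift 2
    "$@" < "$input" > "${name}.out" 2> "${name}.err"
    local status=$?
    local out_lines err_lines
    out_lines=$(wc -l < "${name}.out" | tr -d ' ')
    err_lines=$(wc -l < "${name}.err" | tr -d ' ')
    echo "${name}: exit=${status} stdout_lines=${out_lines} stderr_lines=${err_lines}"
    return "$status"
}

# Hypothetical driver: the same three stages as in the issue, run one at a
# time. Assumes REF, TUMOR_NAME, TUMOR_BAM and AF_THR are already set.
debug_pipeline() {
    run_stage vardict /dev/null vardict -G "$REF" -f 0.001 -N "$TUMOR_NAME" \
        -b "$TUMOR_BAM" -c 1 -S 2 -E 3 -R chr19:55113494-55119083 || return
    run_stage strandbias vardict.out teststrandbias.R || return
    run_stage var2vcf strandbias.out var2vcf_valid.pl -N "$TUMOR_NAME" -E -f "$AF_THR"
}
```

Calling `debug_pipeline` would leave `vardict.out`, `vardict.err`, and the corresponding files for the later stages in the working directory. If `vardict.out` is already empty or holds the help text, the problem lies in the first command rather than in the strand-bias or VCF conversion steps.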

My preprocessing code is as follows, and the BAM file is generated normally:
```shell
# Clean the paired-end reads with fastp (overlap-based correction, poly-G trimming)
fastp -i $FQ1 -I $FQ2 \
  -o $OUT_DIR/${SAMPLE}_clean_R1.fq.gz -O $OUT_DIR/${SAMPLE}_clean_R2.fq.gz \
  --html $OUT_DIR/${SAMPLE}_fastp.html \
  --json $OUT_DIR/${SAMPLE}_fastp.json \
  --correction --trim_poly_g --thread $THREADS

# Align to the reference with a read group and convert SAM to BAM
bwa mem -t $THREADS -R "@RG\tID:${SAMPLE}\tSM:${SAMPLE}\tPL:ILLUMINA" \
  $REF \
  $OUT_DIR/${SAMPLE}_clean_R1.fq.gz \
  $OUT_DIR/${SAMPLE}_clean_R2.fq.gz | \
  samtools view -Sb - > $OUT_DIR/${SAMPLE}_raw.bam

# Coordinate-sort and index the BAM
samtools sort -@ $THREADS $OUT_DIR/${SAMPLE}_raw.bam -o $OUT_DIR/${SAMPLE}_sorted.bam
samtools index $OUT_DIR/${SAMPLE}_sorted.bam
```

I investigated the following:
① Using samtools tview, I confirmed that a mutation exists in the region "chr19:55113494-55119083".
② The chromosome names in the reference, in the BAM, and in the -R parameter are consistent: all are "chr19".

③ I ran samtools coverage, which reports:
"#rname startpos endpos numreads covbases coverage meandepth meanbaseq meanmapq
chr19 55113494 55119083 46505102 5590 100 927261 39 59.9"
I wonder whether the extremely high depth (a mean depth of about 927,261) is causing VarDict to produce no results.
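To put the depth from ③ in perspective, the arithmetic can be sketched as below. This is only a back-of-the-envelope calculation under the assumption that variant reads scale linearly with depth. The function names and the 50,000 target depth are hypothetical choices for illustration, and whether depth is the cause of the empty output is not established here.

```shell
#!/usr/bin/env bash

# Expected number of variant-supporting reads at a given depth and allele fraction.
expected_alt_reads() {
    awk -v depth="$1" -v vaf="$2" 'BEGIN { printf "%.0f\n", depth * vaf }'
}

# Fraction of reads to keep so that the mean depth drops to a target depth
# (capped at 1 when the data are already shallower than the target).
subsample_fraction() {
    awk -v depth="$1" -v target="$2" \
        'BEGIN { f = target / depth; if (f > 1) f = 1; printf "%.4f\n", f }'
}

expected_alt_reads 927261 0.002   # reads expected at the 0.2% VAF of interest: 1855
expected_alt_reads 927261 0.001   # reads at the -f 0.001 threshold: 927
subsample_fraction 927261 50000   # fraction to keep for ~50,000x: 0.0539

# A downsampled BAM for a quick comparison run could then be made with
# samtools view, where the integer part of -s is the seed and the
# fractional part is the fraction of reads kept (file names are placeholders):
#   samtools view -b -s 42.0539 sample_sorted.bam > sample_sub.bam
```

If the downsampled BAM yields calls while the full one does not, that would point toward depth; if both are empty, the cause is more likely elsewhere in the command.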

Additionally, I tried two workflows: one using the whole genome "hg38_v0/Homo_sapiens_assembly38.fasta" as the reference, and the other using only the target region (5 kb+) as the reference. Both VCF outputs were empty.
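One point worth checking in the second workflow: when the target sequence itself is the reference, positions are counted from the start of that sequence, so a genomic region string may need converting. The sketch below assumes the target FASTA begins exactly at chr19:55113494 and uses a hypothetical record name `target`; neither detail is stated in the issue.

```shell
#!/usr/bin/env bash

# Convert a 1-based genomic interval to coordinates local to a reference that
# contains only the target sequence.
# Usage: to_local_region CONTIG START END TARGET_START
to_local_region() {
    local contig=$1 start=$2 end=$3 target_start=$4
    echo "${contig}:$(( start - target_start + 1 ))-$(( end - target_start + 1 ))"
}

to_local_region target 55113494 55119083 55113494   # prints target:1-5590
```

The resulting length of 5590 matches the `covbases` value reported by samtools coverage for the same interval.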

Any suggestions that could help my project would be greatly appreciated.

test_vcf.txt
