Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
ed87dcc
added fasta parsing
jonas-fuchs Jan 19, 2026
9177fbc
implemented fasta parsing and dimer check
jonas-fuchs Jan 23, 2026
aab3a4c
inluded endoverlap in log
jonas-fuchs Jan 26, 2026
7f3b28b
renamed function
jonas-fuchs Jan 26, 2026
714834a
Merge branch 'master' into scheme_compatibility
jonas-fuchs Jan 26, 2026
5c6de3c
optimized the primer dimer solve
jonas-fuchs Jan 27, 2026
ae2980a
added multi-processing for final dimer solve
jonas-fuchs Jan 27, 2026
e0f94f0
added multi-batch processing for kmer evaluations and amplicon evalau…
jonas-fuchs Jan 27, 2026
8b3a88f
fix when there are no dimers
jonas-fuchs Jan 27, 2026
77cb3a5
fixed bug with compatible primers and qPCR mode
jonas-fuchs Jan 28, 2026
e52f558
moved get_permutations and changed threshold to required
jonas-fuchs Jan 29, 2026
7b1d130
Update varvamp/scripts/primers.py
jonas-fuchs Jan 31, 2026
c6cb54f
Update varvamp/scripts/primers.py
jonas-fuchs Jan 31, 2026
336b8cb
Update varvamp/command.py
jonas-fuchs Jan 31, 2026
2ec7d45
Update varvamp/scripts/qpcr.py
jonas-fuchs Jan 31, 2026
e9679cd
Update varvamp/scripts/qpcr.py
jonas-fuchs Jan 31, 2026
0cb29a7
Update varvamp/scripts/qpcr.py
jonas-fuchs Jan 31, 2026
6c4c75d
moved deduplication to parse function
jonas-fuchs Jan 31, 2026
e5ebc68
changed combination logic
jonas-fuchs Jan 31, 2026
9b8bcec
update
jonas-fuchs Jan 31, 2026
2f3c47f
Merge remote-tracking branch 'origin/scheme_compatibility' into schem…
jonas-fuchs Jan 31, 2026
4765001
fixed outer loop
jonas-fuchs Jan 31, 2026
4fb8226
batch size calc for find primers
jonas-fuchs Jan 31, 2026
549d48b
finalized review from wm75
jonas-fuchs Jan 31, 2026
5fc0795
small stuff
jonas-fuchs Jan 31, 2026
1bad536
added multi-batch processing for deltaG and fixed bug for batch calcu…
jonas-fuchs Feb 3, 2026
2cf54fa
added n_to_test again
jonas-fuchs Feb 3, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 18 additions & 0 deletions docs/how_varvamp_works.md
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,24 @@ To search for the best amplicon, varVAMP uses a graph based approach.
7. Test amplicons for their [minimal free energy](https://en.wikipedia.org/wiki/Gibbs_free_energy) at their lowest primer temperature with [`seqfold`](https://github.com/Lattice-Automation/seqfold) and filter to avoid secondary structures. Amplicons with large potential deletions (>QAMPLICON_DEL_CUTOFF) will be ignored. Smaller deletions will be accepted.
8. Take the best qPCR schemes of overlapping schemes.


#### Multi-processing
varVAMP can use multiple cores at some steps in the workflow. If you have performance issues
at the following steps it might be worth increasing the number of cores.

1. All workflows:
- BLAST search: Each amplicon is searched for off-targets in the BLAST database.
- Primer dimer search against a user-provided list of primers.
- Primer search: Each kmer is evaluated as a potential primer.

2. Tiled workflow:
- Final primer dimer solve.

3. qPCR workflow:
- Probe search: Each kmer is evaluated as a potential probe.
- Amplicon search: Each potential amplicon is checked for its parameters.
- Amplicon deltaG calculation: Each amplicon is checked for potential secondary structures.

#### Penalty calculation

```python
Expand Down
2 changes: 1 addition & 1 deletion docs/output.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ varVAMP produces multiple main output files:
| ALL | per_base_mismatches.pdf | Barplot of the percent mismatches at each nucleotide position of the primer. |
| ALL | primers.bed | A bed file with the primer locations. Includes the primer penalty. The lower, the better. |
| ALL | varvamp_log.txt | Log file. |
| TILED | unsolvable_primer_dimers.tsv | Only produced if there are primer dimers without replacements. Tells which primers form dimers and at which temperature. |
| TILED | unsolvable_primer_dimers.txt | Only produced if there are primer dimers without replacements. Tells which primers form dimers and at which temperature. |
| TILED | primers_pool_0/1.fasta | Primer sequences per pool in fasta format. |
| SINGLE | primers.fasta | Primer sequences in fasta format. |
| TILED/SINGLE | primer_to_amplicon_assignments.tabular | Simple tab separated file, which primers belong together. Useful for bioinformatic workflows that include primer trimming |
Expand Down
125 changes: 92 additions & 33 deletions varvamp/command.py
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ def get_args(sysargs):
QPCR_parser = mode_parser.add_parser(
"qpcr",
help="design qPCR primers",
usage="varvamp qpcr [optional arguments] <alignment> <output dir>"
usage="varvamp qpcr -t <threshold> [optional arguments] <alignment> <output dir>"
)
parser.add_argument(
"input",
Expand All @@ -73,7 +73,7 @@ def get_args(sysargs):
par.add_argument(
"-th",
"--threads",
help="number of threads for BLAST search and deltaG calculations",
help="number of threads",
metavar="1",
type=int,
default=1
Expand All @@ -85,6 +85,14 @@ def get_args(sysargs):
type=str,
default="varVAMP"
)
par.add_argument(
"--compatible-primers",
metavar="None",
type=str,
default=None,
help="FASTA primer file with which new primers should not form dimers. Sequences >40 nt are ignored. Can significantly increase runtime."
)

for par in (SINGLE_parser, TILED_parser):
par.add_argument(
"-t",
Expand Down Expand Up @@ -138,9 +146,7 @@ def get_args(sysargs):
"-t",
"--threshold",
required=True,
metavar="0.9",
type=float,
default=0.9,
help="consensus threshold (0-1) - higher values result in higher specificity at the expense of found primers"
)
QPCR_parser.add_argument(
Expand Down Expand Up @@ -184,9 +190,18 @@ def shared_workflow(args, log_file):
"""
# start varvamp
logging.varvamp_progress(log_file, mode=args.mode)

# read in alignment and preprocess
preprocessed_alignment = alignment.preprocess(args.input[0])
# read in external primer sequences with which new primers should not form dimers
if args.compatible_primers is not None:
compatible_primers = primers.parse_primer_fasta(args.compatible_primers)
if not compatible_primers:
logging.raise_error(
"no valid primers found in the provided primer file.\n",
log_file,
)
else:
compatible_primers = None
# check alignment length and number of gaps and report if its significantly more/less than expected
logging.check_alignment_length(preprocessed_alignment, log_file)
logging.check_gaped_sequences(preprocessed_alignment, log_file)
Expand Down Expand Up @@ -276,7 +291,8 @@ def shared_workflow(args, log_file):
left_primer_candidates, right_primer_candidates = primers.find_primers(
kmers,
ambiguous_consensus,
alignment_cleaned
alignment_cleaned,
args.threads
)
for primer_type, primer_candidates in [("+", left_primer_candidates), ("-", right_primer_candidates)]:
if not primer_candidates:
Expand All @@ -292,7 +308,22 @@ def shared_workflow(args, log_file):
progress_text=f"{len(left_primer_candidates)} fw and {len(right_primer_candidates)} rv potential primers"
)

return alignment_cleaned, majority_consensus, ambiguous_consensus, primer_regions, left_primer_candidates, right_primer_candidates, potential_primer_regions
# filter primers against user-provided list of compatible primers, can use multi-processing
if compatible_primers:
left_primer_candidates = primers.filter_non_dimer_candidates(
left_primer_candidates, compatible_primers, args.threads
)
right_primer_candidates = primers.filter_non_dimer_candidates(
right_primer_candidates, compatible_primers, args.threads
)
logging.varvamp_progress(
log_file,
progress=0.65,
job="Filtering primers against provided primers.",
progress_text=f"{len(left_primer_candidates)} fw and {len(right_primer_candidates)} rv primers after filtering"
)

return alignment_cleaned, majority_consensus, ambiguous_consensus, primer_regions, left_primer_candidates, right_primer_candidates, potential_primer_regions, compatible_primers


def single_and_tiled_shared_workflow(args, left_primer_candidates, right_primer_candidates, potential_primer_regions, data_dir, log_file):
Expand Down Expand Up @@ -323,8 +354,7 @@ def single_and_tiled_shared_workflow(args, left_primer_candidates, right_primer_
)
if not amplicons:
logging.raise_error(
"no amplicons found. Increase the max amplicon length or \
number of ambiguous bases or lower threshold!\n",
"no amplicons found. Increase the max amplicon length or number of ambiguous bases or lower threshold!\n",
log_file,
exit=True
)
Expand Down Expand Up @@ -369,7 +399,7 @@ def single_workflow(args, amplicons, log_file):
return amplicon_scheme


def tiled_workflow(args, amplicons, left_primer_candidates, right_primer_candidates, all_primers, ambiguous_consensus, log_file, results_dir):
def tiled_workflow(args, amplicons, left_primer_candidates, right_primer_candidates, all_primers, ambiguous_consensus, log_file):
"""
part of the workflow specific for the tiled mode
"""
Expand All @@ -383,21 +413,9 @@ def tiled_workflow(args, amplicons, left_primer_candidates, right_primer_candida
amplicon_graph
)

# check for dimers
dimers_not_solved = scheme.check_and_solve_heterodimers(
amplicon_scheme,
left_primer_candidates,
right_primer_candidates,
all_primers)
if dimers_not_solved:
logging.raise_error(
f"varVAMP found {len(dimers_not_solved)} primer dimers without replacements. Check the dimer file and perform the PCR for incomaptible amplicons in a sperate reaction.",
log_file
)
reporting.write_dimers(results_dir, dimers_not_solved)

# evaluate coverage
# ATTENTION: Genome coverage of the scheme might still change slightly through resolution of primer dimers, but this potential, minor inaccuracy is currently accepted.
# ATTENTION: Genome coverage of the scheme might still change slightly through resolution of primer dimers,
# but this potential, minor inaccuracy is currently accepted.
percent_coverage = round(coverage/len(ambiguous_consensus)*100, 2)
logging.varvamp_progress(
log_file,
Expand All @@ -414,10 +432,36 @@ def tiled_workflow(args, amplicons, left_primer_candidates, right_primer_candida
"\t - relax primer settings (not recommended)\n",
log_file
)
return amplicon_scheme

# check for dimers
dimers_not_solved, n_initial_dimers = scheme.check_and_solve_heterodimers(
amplicon_scheme,
left_primer_candidates,
right_primer_candidates,
all_primers,
args.threads
)

# report dimers solve
if n_initial_dimers > 0 and not dimers_not_solved:
logging.varvamp_progress(
log_file,
progress=0.95,
job="Trying to solve primer dimers.",
progress_text=f"all dimers (n={n_initial_dimers}) could be resolved"
)
elif dimers_not_solved:
logging.varvamp_progress(
log_file,
progress=0.95,
job="Trying to solve primer dimers.",
progress_text=f"{len(dimers_not_solved)}/{n_initial_dimers} dimers could not be resolved"
)

return amplicon_scheme, dimers_not_solved


def qpcr_workflow(args, data_dir, alignment_cleaned, ambiguous_consensus, majority_consensus, left_primer_candidates, right_primer_candidates, log_file):
def qpcr_workflow(args, data_dir, alignment_cleaned, ambiguous_consensus, majority_consensus, left_primer_candidates, right_primer_candidates, compatible_primers, log_file):
"""
part of the workflow specific for the tiled mode
"""
Expand All @@ -428,7 +472,7 @@ def qpcr_workflow(args, data_dir, alignment_cleaned, ambiguous_consensus, majori
)
if not probe_regions:
logging.raise_error(
"no regions that fullfill probe criteria! lower threshold or increase number of ambiguous chars in probe\n",
"no regions that fulfill probe criteria! lower threshold or increase number of ambiguous chars in probe\n",
log_file,
exit=True
)
Expand All @@ -440,7 +484,7 @@ def qpcr_workflow(args, data_dir, alignment_cleaned, ambiguous_consensus, majori
config.QPROBE_SIZES
)
# find potential probes
qpcr_probes = qpcr.get_qpcr_probes(probe_kmers, ambiguous_consensus, alignment_cleaned)
qpcr_probes = qpcr.get_qpcr_probes(probe_kmers, ambiguous_consensus, alignment_cleaned, args.threads)
if not qpcr_probes:
logging.raise_error(
"no qpcr probes found\n",
Expand All @@ -454,8 +498,21 @@ def qpcr_workflow(args, data_dir, alignment_cleaned, ambiguous_consensus, majori
progress_text=f"{len(qpcr_probes)} potential qPCR probes"
)

# filter primers against non-dimer sequences if provided
if compatible_primers:
qpcr_probes = primers.filter_non_dimer_candidates(
qpcr_probes, compatible_primers, args.threads)
logging.varvamp_progress(
log_file,
progress=0.75,
job="Filtering probes against provided primers.",
progress_text=f"{len(qpcr_probes)} potential qPCR probes after filtering"
)

# find unique amplicons with a low penalty and an internal probe
qpcr_scheme_candidates = qpcr.find_qcr_schemes(qpcr_probes, left_primer_candidates, right_primer_candidates, majority_consensus, ambiguous_consensus)
qpcr_scheme_candidates = qpcr.find_qcr_schemes(
qpcr_probes, left_primer_candidates, right_primer_candidates, majority_consensus, ambiguous_consensus, args.threads
)
if not qpcr_scheme_candidates:
logging.raise_error(
"no qPCR scheme candidates found. lower threshold or increase number of ambiguous chars in primer and/or probe\n",
Expand Down Expand Up @@ -516,7 +573,7 @@ def main():
blast.check_BLAST_installation(log_file)

# mode unspecific part of the workflow
alignment_cleaned, majority_consensus, ambiguous_consensus, primer_regions, left_primer_candidates, right_primer_candidates, potential_primer_regions = shared_workflow(args, log_file)
alignment_cleaned, majority_consensus, ambiguous_consensus, primer_regions, left_primer_candidates, right_primer_candidates, potential_primer_regions, compatible_primers = shared_workflow(args, log_file)

# write files that are shared in all modes
reporting.write_regions_to_bed(primer_regions, args.name, data_dir)
Expand All @@ -538,6 +595,7 @@ def main():

# SINGLE/TILED mode
if args.mode == "tiled" or args.mode == "single":
dimers_not_solved = None
all_primers, amplicons = single_and_tiled_shared_workflow(
args,
left_primer_candidates,
Expand All @@ -553,15 +611,14 @@ def main():
log_file
)
elif args.mode == "tiled":
amplicon_scheme = tiled_workflow(
amplicon_scheme, dimers_not_solved = tiled_workflow(
args,
amplicons,
left_primer_candidates,
right_primer_candidates,
all_primers,
ambiguous_consensus,
log_file,
results_dir
)

# write files
Expand All @@ -578,7 +635,8 @@ def main():
ambiguous_consensus,
args.name,
args.mode,
log_file
log_file,
dimers_not_solved
)
reporting.varvamp_plot(
results_dir,
Expand All @@ -600,6 +658,7 @@ def main():
majority_consensus,
left_primer_candidates,
right_primer_candidates,
compatible_primers,
log_file
)

Expand Down
8 changes: 5 additions & 3 deletions varvamp/scripts/default_config.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@
'PCR_DNA_CONC', 'PCR_DNTP_CONC', 'PCR_DV_CONC', 'PCR_MV_CONC',
'PRIMER_3_PENALTY', 'PRIMER_GC_END', 'PRIMER_GC_PENALTY',
'PRIMER_GC_RANGE', 'PRIMER_HAIRPIN', 'PRIMER_MAX_BASE_PENALTY',
'PRIMER_MAX_DIMER_TMP', 'PRIMER_MAX_DINUC_REPEATS', 'PRIMER_MAX_POLYX',
'PRIMER_MAX_DIMER_TMP', 'PRIMER_MAX_DIMER_DELTAG', 'PRIMER_MAX_DINUC_REPEATS', 'PRIMER_MAX_POLYX',
'PRIMER_MIN_3_WITHOUT_AMB', 'PRIMER_PERMUTATION_PENALTY',
'PRIMER_SIZES', 'PRIMER_SIZE_PENALTY',
'PRIMER_TMP', 'PRIMER_TM_PENALTY',
Expand All @@ -30,7 +30,10 @@
PRIMER_HAIRPIN = 47 # max melting temp for secondary structure
PRIMER_GC_END = (1, 3) # min/max GCs in the last 5 bases of the 3' end
PRIMER_MIN_3_WITHOUT_AMB = 3 # min len of 3' without ambiguous charaters
PRIMER_MAX_DIMER_TMP = 47 # max melting temp for dimers (homo- or heterodimers)
# primer dimer constraints
PRIMER_MAX_DIMER_TMP = 35 # max melting temp for dimers, lower temperature means more stringent filtering
PRIMER_MAX_DIMER_DELTAG = -9000 # max allowed gibbs free energy for dimer formation, higher values mean more stringent filtering
END_OVERLAP = 5 # maximum allowed nt overlap between ends of primers to be considered a dimer

# QPCR parameters
# basic probe parameters
Expand All @@ -42,7 +45,6 @@
QPRIMER_DIFF = 2 # maximal temperature diff of qPCR primers
QPROBE_TEMP_DIFF = (5, 10) # min/max temp diff between probe and primers
QPROBE_DISTANCE = (4, 15) # min/max distance to the primer on the same strand
END_OVERLAP = 5 # maximum allowed nt overlap between the ends of probe and primer
QAMPLICON_LENGTH = (70, 200) # min/max length of the qPCR amplicon
QAMPLICON_GC = (40, 60) # GC min/max of the qPCR amplicon
QAMPLICON_DEL_CUTOFF = 4 # consider regions of the alignment for deltaG calculation if they have smaller deletions than cutoff
Expand Down
Loading