Releases · PacificBiosciences/paraphase

07 Nov 19:16

xiao-chen-xc

v3.4.0

a72afbc

Version 3.4.0 Latest

Summary of changes:

Fix bug where MM/ML tags are off in supplementary alignments
Fix bug where the last base of a read is sometimes used in phasing
Fix bug in variant calling where in some reads a wrong base is chosen between primary and supplementary alignments
Fix bug where some alleles may include redundant haplotype names
Fix bug that prevented lowering the minimum variant frequency for variant calling in individual target regions. This change affects opn1lw and ikbkg.
Improve large deletion calling in reads. This change may affect all targets, particularly ikbkg and pms2.
Rename haplotypes to label gene1 and gene2 for regions with fusion calling (GBA, CYP2D6, CYP11B1 and CFH/CFHR3)
For smn1, consider more scenarios for adjusting SMN2 copy number
For ncf1, adjust copy number for the scenario where three haplotypes are found and all are present at two copies.
Update two copy haplotypes for CFHclust

20 Aug 17:00

xiao-chen-xc

v3.3.4

ef1f25c

Version 3.3.4

Summary of changes:

Fix an f-string bug that causes failures on Python 3.11 and earlier.
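
The note does not say which f-string was at fault. One common cause of this class of failure, offered here as an assumption and not as the actual bug, is reusing the enclosing quote character inside a replacement field, which only parses on Python 3.12 and later (PEP 701). A minimal sketch of the portable form, using a hypothetical `record` dictionary:

```python
# Hypothetical data; not taken from the Paraphase code base.
record = {"region": "smn1", "total_cn": 4}

# Portable: the inner quotes differ from the quotes that delimit the f-string.
portable = f"{record['region']}: total_cn={record['total_cn']}"

# Also portable: pull the values out first, then format.
region, total_cn = record["region"], record["total_cn"]
also_portable = f"{region}: total_cn={total_cn}"

# Not portable (SyntaxError before Python 3.12), shown only as a comment:
#   f"{record["region"]}: total_cn={record["total_cn"]}"
print(portable)
```

Both portable forms produce the same string on every supported interpreter.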

15 Aug 21:58

xiao-chen-xc

v3.3.3

1c265d7

Version 3.3.3

Summary of changes:

Fix bug where min_variant_frequency could not be set lower than the default value
Fix minor bug in ikbkg that causes the program to error out
Sort reads by name first to remove nondeterminism in haplotype names
Do not write VCF when region is clearly not homozygous but no haplotypes are phased (most likely due to low depth)
Add the phase_region field in JSON output to report the coordinates of the analysis region and the genome build
Minor improvement in indel detection
Minor improvement in handling gene1_cn2, a scenario specified in the config that tells Paraphase to assume a paralog group always has two copies of gene1
Improve documentation
- Update NEB tutorial to clarify the order of TRI1/2/3
- Update README to clarify that the fusions_called field is only reported for four regions
- Update the targeted data tutorial to include more details on PureTarget

28 May 15:54

xiao-chen-xc

v3.3.2

dce99f8

Version 3.3.2

Summary of changes:

Fix rare scenarios where the program errors out in some low-depth regions. No algorithm change.
Update license.

02 May 21:47

xiao-chen-xc

v3.3.1

bc7134f

Version 3.3.1

Bug fix:

Preserve MM/ML tags through minimap2 realignment instead of parsing them. Version 3.3.0 uses pysam to parse base modifications, but we have noticed some cases where pysam v0.23 crashes when parsing them.

26 Apr 17:37

xiao-chen-xc

v3.3.0

16c90e4

Version 3.3.0

Summary of changes:

Improve phasing of haplotypes into alleles, allowing the (n-1)/1 scenario
Do not adjust total_cn from 2 to 4 based on depth when calling fusions
Fix rare bugs in depth-related analysis
Update ALT alleles to . in the VCF for LowQual calls when ALT is equal to REF
Add Ml/Mm tags in Paraphase bam for base modification information
In JSON output, rename hap_links to haplotype_links and rename linked_haplotypes to raw_alleles
For targeted data
- Improve copy number adjustment based on depth
- Add option to assume a paralog group to always have more than one copy of gene1
- Update command line options to use frequency-based parameters: --min-variant-frequency and --min-haplotype-frequency
smn1
- Enable smn1 analysis for CHM13-mapped data (Note that haplogroup assignment is not available with CHM13)
- Fix bug with assigning haplogroups to smn2 haplotypes
- Rename smn2_del78 haplotypes to smn_del78
hba
- Do not consider homology haplotypes during allele phasing
- Better handle hba with targeted data, fixing problems with identifying homology haplotypes
- Update genotype calls based on phased alleles
- Report genome coordinates for 3p7 and 4p2 SVs in the sv_called field
Minor update to strc and ncf1 for copy number adjustment based on depth
Update f8 to reflect SV types in the haplotype name
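
Downstream scripts that read older JSON output may need to account for the two renamed fields in this release. The following is a minimal sketch of a compatibility helper; the function name, the nesting, and the example values are assumptions, and only the four key names come from the notes above.

```python
import json

# Old key -> new key, as listed in the 3.3.0 notes.
RENAMED_KEYS = {
    "hap_links": "haplotype_links",
    "linked_haplotypes": "raw_alleles",
}

def upgrade_keys(node):
    """Recursively rename pre-3.3.0 keys in parsed JSON, at any nesting depth."""
    if isinstance(node, dict):
        return {RENAMED_KEYS.get(k, k): upgrade_keys(v) for k, v in node.items()}
    if isinstance(node, list):
        return [upgrade_keys(item) for item in node]
    return node

# Hypothetical pre-3.3.0 fragment; the structure and values are illustrative only.
old = json.loads(
    '{"smn1": {"hap_links": {}, "linked_haplotypes": [["h1", "h2"]], "total_cn": 4}}'
)
new = upgrade_keys(old)
print(json.dumps(new, indent=2))
```

Walking the whole tree avoids hard-coding where the fields sit, which this sketch does not claim to know.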

10 Feb 19:44

xiao-chen-xc

v3.2.1

ce232a9

Version 3.2.1

Summary of changes:

Fix problem working with CRAM
Other minor changes:
- Add RN tag to output bam. RN stands for region name, indicating reads used to analyze one region (paralog group)
- Sort some output fields in the JSON, making it consistent from run to run
- Clean up reported alleles, filtering out cases where all haplotypes are linked into one allele, or not all haplotypes are included when two alleles are reported
- Fix getting sample ID from bam header when there are blank spaces
- Fix logic for writing homozygous haplotypes in VCF (no haplotypes phased -> no haplotypes phased and no heterozygous variant sites)
- Update two_cp_haplotypes when adjusting total_cn from 2 to 4 based on depth
- Use all reads instead of unique reads for variant calling at edges of clipped haplotypes
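
The sample ID fix above concerns headers whose sample name contains spaces. The notes do not show the implementation; the sketch below only illustrates the general pitfall, assuming the ID comes from the SM field of an @RG line. SAM header fields are tab-delimited, so splitting on tabs keeps a value with spaces intact. The function name and the header text are hypothetical.

```python
def sample_id_from_header(header_text):
    """Return the SM value of the first @RG line, or None if there is none."""
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        # Split on tabs only; a bare split() would cut "SM:sample 01" at the space.
        for field in line.split("\t")[1:]:
            if field.startswith("SM:"):
                return field[3:]
    return None

# Hypothetical header text with a space inside the sample name.
header = "@HD\tVN:1.6\tSO:coordinate\n@RG\tID:run1\tPL:PACBIO\tSM:sample 01\n"
print(sample_id_from_header(header))
```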

25 Jan 21:32

xiao-chen-xc

v3.2.0

933f7d1

Version 3.2.0

Summary of changes:

Updates to better handle targeted data

Filter reads on rq (>=0.99), if rq is present in input bam
Add a --targeted option for targeted data to drop the assumption of uniform coverage across the genome
Add two optional parameters for targeted data
- --min-read-variant: Partially controls the number of supporting reads required for a variant to be used in phasing. The cutoff for variant-supporting reads is min(this number, max(5, depth*0.11)). Default is 20. At standard WGS depth, the default value is overridden by max(5, depth*0.11).
  - Use cases: 1) Set this number low for low-coverage data or to increase sensitivity. 2) For targeted data with high coverage, set this number relatively high to avoid picking up sequencing errors and to reduce run time.
- --min-read-haplotype: Minimum number of unique supporting reads for a haplotype. Default is 4. For targeted data with high coverage, this cutoff can be increased to reduce errors and to reduce run time.
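
The cutoff formula given for --min-read-variant can be worked through with a short sketch. The function name and the example depths are hypothetical, and any rounding the tool applies internally is not described in the notes, so none is applied here.

```python
def variant_read_cutoff(min_read_variant=20, depth=30.0):
    """Supporting-read cutoff: min(min_read_variant, max(5, depth * 0.11))."""
    return min(min_read_variant, max(5, depth * 0.11))

# WGS-like depth: max(5, 3.3) is 5, which is below the default of 20.
wgs_cutoff = variant_read_cutoff(depth=30)

# High-depth targeted data: max(5, 110) exceeds 20, so the default caps it.
targeted_cutoff = variant_read_cutoff(depth=1000)

# Lowering the parameter for low-coverage data or higher sensitivity.
sensitive_cutoff = variant_read_cutoff(min_read_variant=3, depth=30)

print(wgs_cutoff, targeted_cutoff, sensitive_cutoff)
```

The parameter acts as a ceiling: it only takes effect once depth*0.11 rises above it, which is why it matters mainly for high-coverage targeted data.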

Updates to target regions:

Update coordinates of some target regions to include full genes whenever possible: pms2,ikbkg,hba,DDT,MBD3L2,DEFA1,PRY,CHRNA7,DHX40,GOLGA8A,IQCK,NXF2,OTOA,PDPK1,POTEI,RGPD1,RGPD3,RSPH10B,SIK1,TMLHE,CBS,KCNE1,CASTOR2,NBPF4,RGPD5,GOLGA8N,POTEB,ANKRD20A1,NSF
Add TNXB as a region on its own so that the full gene can be genotyped (the RCCX region only includes part of TNXB)

Algorithmic changes

Improve fusion calling in cases of homozygous deletion
Add some homozygous sites to cover target regions evenly during phasing to improve read assignment to haplotypes and variant calling
Update a few gene-specific callers
- hba: Add calling of 4.2 deletion/duplication
- smn1: If homozygous throughout the region, default to CN = 2 instead of 1; Drop carrier call if only one SMN1 haplotype is found but the total CN of SERF1A/B (neighboring locus) is larger than the total CN of SMN1/2
- ikbkg: Improve calling of the 11.7kb deletion; Update the config to genotype the entire gene
- ncf1: Drop carrier call if only one NCF1 haplotype is found but the total CN of GTF2I (neighboring locus) is larger than the total CN of NCF1 family
- rccx: Better handle homozygous deletion cases
- pms2: Update the config to genotype the entire gene

Other changes:

Support cram as input
Standardize haplotype naming across regions: {gene name}_{haplotype name}
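
The naming pattern {gene name}_{haplotype name} can be sketched as follows. The helper names and the example values are hypothetical, and the split assumes the gene name itself contains no underscore, which the notes do not guarantee.

```python
def build_haplotype_name(gene, haplotype):
    """Compose a name following the {gene name}_{haplotype name} pattern."""
    return f"{gene}_{haplotype}"

def split_haplotype_name(full_name):
    """Split at the first underscore (assumes no underscore in the gene name)."""
    gene, _, haplotype = full_name.partition("_")
    return gene, haplotype

# Illustrative values only.
full_name = build_haplotype_name("pms2", "hap1")
print(full_name, split_haplotype_name(full_name))
```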

24 Jan 23:06

xiao-chen-xc

v3.1.2

f4630d2

Version 3.1.2

Summary of changes:

Add --write-nocalls-in-vcf option to write no-call sites in the VCF

18 Apr 18:47

xiao-chen-xc

v3.1.1

8de77bb

Version 3.1.1

Summary of changes:
Minor update. Fix program error in low-depth or no-data regions. Completes analysis even when the input is a small bamlet (result is still a no-call).
