Skip to content

Include barcode integer suffix in index. #8

@pontushojer

Description

@pontushojer

Relates to #6.

As noted in the longranger docs (below) the suffix number can be any integer, not just "-1", as it is mean to allow for merging of different 10X libraries into the same BAM.

The BX tag includes a suffix with a dash separator followed by a number:
AGAATGGTCTGCATCG-1
This number denotes what we call a GEM group, and is used to virtualize barcodes in order to achieve a higher effective barcode diversity when combining samples generated from separate GEM chip channel runs. Normally, this number will be "1" across all barcodes when analyzing a sample generated from a single GEM chip channel. It can either be left in place and treated as part of a unique barcode identifier, or explicitly parsed out to leave only the barcode sequence itself.

I run into this issue when trying to run LRez index bam on a BAM with multiple libraries which resulted in the following error:

determineSequencingTechnology: Unrecognized sequencing technology. Please make sure your barcodes originate from a compatible technology or are reported as nucleotides in the BX:Z tag.

From what I can understand from the code this suffix is currently not include in the index. For LRez to work with BAMs that contain multiple libraries this would need to be fixed.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions